How Private Is It? Privacy Metrics and Preservation Techniques (1G1)

Session Topic: How Private Is It? (1G1)

Convener: Dwight Irving

Notes-taker(s): Leon Thomas

They are trying to get data, but people are bailing out because they want to know, "What are you going to do with my data?" Even friends and family wouldn't trust them with things like phone records. The decision was to let people control the data themselves, with hints on how they are doing. This led to the creation of privacy metrics.

Matrix: everything that could be known about a person runs across the top; people run down the left axis. Within "what could be known" there is info that is publicly known and info that personal contacts would or could know. They created a linear algebraic equation for what can be known. This was really difficult.

Problem: an SSN and a dog's name are not equally weighted; this is where "feelings" come into play. Something relevant came out of social networking: the concept of "weighted networks", where your social graph can be summed to indicate trust/privacy.

Your real privacy (RP) equals Identity + (Persona * Identity), or I(1+Pp), where Persona is what people know about you.

Once you start connecting things up, there is a "star". The "one goat theorem": if there is a certainty, you know what it is. Once someone knows the most important thing about you, the power law says that things can't get much worse. Identity doesn't follow a power law. A name is not terribly identifying.

How does this help users understand how "private" they are? The RP (real privacy) is calculated similarly to a credit score (0-1000 scale); 840 is a good score for the general population.

One of the first records they got was phone records:
 * If you give out all of this, your real privacy goes down from 840 to 420.
 * If you don't give the last 4 digits of the phone number, your score goes up from 420 to 600. If you are in a group of 10000, then people feel better about their privacy.
 * Giving out only the area code boosts it just a bit, from 600 to 650.
 * If you take out the calling city, rate, and plan, the score goes from 650 to 800.

A user may have a shared secret, but it cannot be factored into the calculation for a specific person.
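The weighted matrix and the RP formula in these notes could be sketched roughly as follows. This is only an illustration under stated assumptions, not the presenters' actual calculation: the weights, the names `alice` and `bob`, and the helper names `weighted_exposure` and `real_privacy` are all hypothetical, and the notes do not say how Identity and Persona are measured or how the result is mapped onto the 0-1000 scale.

```python
# Hypothetical weights: the notes only say an SSN and a dog's name are not
# equally weighted. These numbers are invented for illustration.
WEIGHTS = {"ssn": 10.0, "address": 5.0, "phone": 3.0, "dog_name": 0.5}

# Rows are people (left axis); columns are things that could be known about
# them (top axis). 1 means the item is known, 0 means it is not.
KNOWN = {
    "alice": {"ssn": 0, "address": 1, "phone": 1, "dog_name": 1},
    "bob": {"ssn": 1, "address": 0, "phone": 0, "dog_name": 0},
}


def weighted_exposure(row, weights):
    """Weighted sum of the known items in one row of the matrix."""
    return sum(weights[item] * known for item, known in row.items())


def real_privacy(identity, persona):
    """RP = Identity + (Persona * Identity), i.e. I * (1 + P)."""
    return identity * (1 + persona)


for person, row in KNOWN.items():
    print(person, weighted_exposure(row, WEIGHTS))
```

With these made-up weights, one known SSN (10.0) outweighs an address, phone number, and dog's name combined (8.5), which is the kind of unequal weighting the notes describe.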
---

Reducing entropy:
 * Reduce resolution. For a phone number, delete the last 4 digits so that the recipient does not have them.
 * Figure out context. Heartbeat: your heartbeat while running is one thing, but your heartbeat while in a hospital gives far greater insight.
 * Add noise.

How can users show data about themselves? What does someone need to find out about you, and are you providing so much that they can figure out things you didn't authorize?

Monkey wrenching: inserting disinformation about a user. Can this be done for the user's benefit?

Phone number example:
 * Phone number minus the last 4 digits = Score 1
 * Area code only = Score 2

End users will get a cut of the money generated through the sale of their data. Potentially a user could enter their data and then, as the data is sold, choose one of the numbered options 1) to 3) at the end of these notes.

Point.ly is trying to use games to get users involved, with rewards for participation. DataBank could be inserted instead of an ad revenue model, so that users could be compensated for a consumer's behavior on your site. They try to strip out identity and use persona instead; you can, however, still use this info to identify who a user is.
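The "reduce resolution" and "add noise" ideas from these notes could be sketched as below. This is a minimal illustration, not the scoring method discussed in the session: the function names, the sample phone number, and the choice of uniform noise are all assumptions. It also shows one possible reading of the "group of 10000" remark: masking the last 4 digits leaves 10^4 numbers that match what was disclosed.

```python
import random


def reduce_resolution(phone_digits, keep):
    """Keep the first `keep` digits and mask the rest with 'x'."""
    return phone_digits[:keep] + "x" * (len(phone_digits) - keep)


def anonymity_set_size(phone_digits, keep):
    """Count how many distinct numbers are consistent with the masked value."""
    return 10 ** (len(phone_digits) - keep)


def add_noise(value, scale, rng):
    """Perturb a numeric reading with uniform noise in [-scale, +scale]."""
    return value + rng.uniform(-scale, scale)


number = "5551234567"  # made-up 10-digit number, for illustration only

print(reduce_resolution(number, 6))   # 555123xxxx (last 4 digits removed)
print(anonymity_set_size(number, 6))  # 10000
print(reduce_resolution(number, 3))   # 555xxxxxxx (area code only)
print(anonymity_set_size(number, 3))  # 10000000
```

`add_noise` takes an explicit `random.Random` instance so that a caller can seed it for repeatable results, for example `add_noise(70.0, 5.0, random.Random(0))` for a heart-rate reading.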
 * A) Address - factor of one
 * B) Where I start my daily runs - same as address
Phone record fields:
 * Number
 * Time
 * Rate
 * Plan
 * Incoming/Outgoing
 * Calling City
 * Persona risk - who is contacting you/who can contact
 * Can extrapolate extra data from there
As your data is sold, you could:
 * 1) get a check each month
 * 2) have the money donated to a charity of your choice
 * 3) have your phone bill (for example) reduced as a result