Personal Data Stores (PDS)

Monday – Session 4 - O

Convener: Paul Trevithick

Notes-taker(s): Drummond Reed, Stacey Pitsillides

A.	Tags for the session - technology discussed/ideas considered:

B.	Discussion notes, key understandings, outstanding questions, observations, and, if appropriate to this discussion: action items, next steps:

NOTES FROM: Drummond Reed:

A PDS is not necessarily the place where all of a user's data is stored; it is more like a virtual directory and dashboard - a single point of control for sharing personal information. The PDS can get/set (or read/write) data from multiple data sources (e.g., health records from doctors/hospitals, financial records from banks/insurance companies, home records from tax authorities, etc.).
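The "virtual directory" idea above can be sketched as a thin routing layer. This is only an illustration, not the Higgins design: the `DataSource` interface, the `InMemorySource` stand-in, and the `category/key` path convention are all assumptions made for the example.

```python
from typing import Dict, Protocol


class DataSource(Protocol):
    """Anything the PDS can read from and write to (hypothetical interface)."""

    def get(self, key: str) -> str: ...
    def set(self, key: str, value: str) -> None: ...


class InMemorySource:
    """Stand-in for a remote source such as a hospital or bank system."""

    def __init__(self) -> None:
        self._records: Dict[str, str] = {}

    def get(self, key: str) -> str:
        return self._records[key]

    def set(self, key: str, value: str) -> None:
        self._records[key] = value


class PersonalDataStore:
    """Routes 'category/key' paths to whichever source actually holds the data."""

    def __init__(self) -> None:
        self._sources: Dict[str, DataSource] = {}

    def register(self, category: str, source: DataSource) -> None:
        self._sources[category] = source

    def read(self, path: str) -> str:
        category, key = path.split("/", 1)
        return self._sources[category].get(key)

    def write(self, path: str, value: str) -> None:
        category, key = path.split("/", 1)
        self._sources[category].set(key, value)


# The PDS itself stores nothing; each record lives in its own source.
pds = PersonalDataStore()
pds.register("health", InMemorySource())
pds.register("finance", InMemorySource())
pds.write("health/blood_type", "O+")
pds.write("finance/bank", "Example Bank")
print(pds.read("health/blood_type"))
```

The point of the sketch is that the user sees one place to read and write, while the data can stay with the doctor, the bank, or wherever it already lives.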

Paul explained that the Higgins Project is implementing PDS. This idea seems to be catching on quickly now, and some of the key challenges are now ready to be discussed.

LOCATION: PDS can run on any device or be accessed from any device - they are not necessarily based in the cloud, though they may be.

ENCRYPTION: PDS can store data/metadata in the clear, or "blinded" so that it is encrypted. This raises many questions, e.g., how does data from legacy apps get into a blinded PDS? Does the PDS handle synchronization?

PERSONAL DATA MINING: This is much easier (or in some cases only feasible) if the data is not blinded in the PDS. New forms of aggregation are possible in which the aggregation happens on the user side. This is also an area where zero-knowledge-proof (ZKP) technology can be useful.

PERSONAL DATA TERMS/LINK CONTRACTS: This "turns on its head" the normal terms-of-service (TOS) that sites offer to users today, which users really have no choice but to accept. A PDS ecology is potentially a way that users can, acting as a "class", publish terms for personal data sharing that sites will accept.
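A minimal sketch of user-published terms, assuming a site declares its purpose and whether it will cache the data before the PDS releases anything. The `Terms` and `AccessRequest` fields are hypothetical and are not taken from any link contract specification.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class Terms:
    """Terms the user publishes for their data (hypothetical fields)."""
    allowed_purposes: FrozenSet[str]
    may_cache: bool


@dataclass(frozen=True)
class AccessRequest:
    """What a site declares when it asks for data (hypothetical fields)."""
    site: str
    purpose: str
    will_cache: bool


def accepts(terms: Terms, request: AccessRequest) -> bool:
    """True only if the site's declared behaviour fits the user's terms."""
    if request.purpose not in terms.allowed_purposes:
        return False
    if request.will_cache and not terms.may_cache:
        return False
    return True


def share(data: Dict[str, str], terms: Terms, request: AccessRequest) -> Dict[str, str]:
    """Release a copy of the data only when the request satisfies the terms."""
    if not accepts(terms, request):
        raise PermissionError(f"{request.site} does not meet the user's terms")
    return dict(data)


user_terms = Terms(allowed_purposes=frozenset({"shipping"}), may_cache=False)
address = {"city": "Mountain View"}

ok = AccessRequest(site="shop.example", purpose="shipping", will_cache=False)
bad = AccessRequest(site="ads.example", purpose="advertising", will_cache=True)
print(accepts(user_terms, ok), accepts(user_terms, bad))
```

Here the direction of the agreement is reversed: the site has to fit the user's terms rather than the user clicking through the site's.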

PERSONAL DATA BANKING: Once a user has the ability to store and provide access to their data, they can begin to "bank" it or make it available to "exchanges" so that it can realize its latent value, similar to the way a bank earns interest on your money. Exchanges can provide a service to extract data (with permission from the user) and then aggregate it and provide value to multiple parties (e.g., companies, governments, non-profits, and the user) that is either very difficult or impossible to unlock today.

RELATIONSHIP TO INFORMATION CARDS AND RELATIONSHIP CARDS: PDS architecture is a close fit with Information Card architecture when either the cards, or the data accessed by personal cards, are stored in the PDS. This is the Higgins architecture. PDS is an even closer fit with relationship cards (r-cards), where the data relationship can be dynamic and thus can set up a persistent feed of data.

SOCIAL ADDRESS BOOK: This is a classic example of a PDS application -- users can share their own address records with each other. The result is a "Plaxo without the Plaxo", i.e., p2p address book record sharing without any company in the middle.

THE DATA MODEL PROBLEM: One big challenge with PDS is how to design the data model. The schema for all personal data is so large: how can it be modelled? And how can it be mapped to all the various places from which the user is going to want to publish and subscribe data? The Higgins approach is to maintain a mapping at the PDS (that can be shared by a large population of PDS). Paul shared a figure from one industry participant: a set of 15,000 form-fill mappings for websites had a breakage rate of 125 per day. That means that a team of two developers could maintain the mappings for a significant percentage of high-usage websites.
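The breakage figure can be worked through as simple arithmetic. The 15,000 mappings, 125 breaks per day, and two developers come from the notes; the 8-hour working day is an assumption added here only to give a feel for the workload.

```python
mappings = 15_000        # form-fill mappings (from the notes)
breaks_per_day = 125     # mappings that break each day (from the notes)
developers = 2           # team size (from the notes)
working_minutes = 8 * 60 # ASSUMPTION: an 8-hour working day

# Fraction of all mappings that break on a given day.
daily_breakage_rate = breaks_per_day / mappings

# On average, how many days a single mapping lasts before it breaks.
days_between_breaks = mappings / breaks_per_day

# Repairs each developer must complete per day to keep up.
fixes_per_developer_per_day = breaks_per_day / developers

# Time budget per repair under the assumed working day.
minutes_per_fix = working_minutes / fixes_per_developer_per_day

print(f"daily breakage rate: {daily_breakage_rate:.2%}")
print(f"average mapping lifetime: {days_between_breaks:.0f} days")
print(f"fixes per developer per day: {fixes_per_developer_per_day}")
print(f"minutes available per fix: {minutes_per_fix:.2f}")
```

Under these numbers fewer than 1% of mappings break on any given day, and each developer would have a little under eight minutes per repair. Whether that is realistic depends on how long a typical fix takes, which the notes do not state.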

CORRELATION MANAGEMENT: Another advantage of PDS is that it gives a user a place to manage correlation between identifiers.
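As a rough illustration of correlation management, the sketch below lets a user group identifiers under personas and check whether two identifiers have been linked. The `CorrelationManager` class and the persona names are hypothetical and do not come from any existing PDS implementation.

```python
from collections import defaultdict
from typing import DefaultDict, Optional, Set


class CorrelationManager:
    """Tracks which identifiers the user has chosen to link to which persona."""

    def __init__(self) -> None:
        self._personas: DefaultDict[str, Set[str]] = defaultdict(set)

    def link(self, persona: str, identifier: str) -> None:
        """Record that an identifier belongs to a persona."""
        self._personas[persona].add(identifier)

    def persona_of(self, identifier: str) -> Optional[str]:
        """Return the persona an identifier is linked to, if any."""
        for persona, identifiers in self._personas.items():
            if identifier in identifiers:
                return persona
        return None

    def correlated(self, a: str, b: str) -> bool:
        """Two identifiers are correlated only if linked to the same persona."""
        persona = self.persona_of(a)
        return persona is not None and persona == self.persona_of(b)


cm = CorrelationManager()
cm.link("work", "alice@employer.example")
cm.link("work", "alice-on-a-work-site")
cm.link("personal", "alice@home.example")

print(cm.correlated("alice@employer.example", "alice-on-a-work-site"))
print(cm.correlated("alice@employer.example", "alice@home.example"))
```

Keeping this table inside the PDS means the user, rather than each site, decides which accounts are known to belong together.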

SEMANTICS: Yet another advantage is that a PDS (or a PDS ecosystem) can help develop and standardize the semantics used so that the services using PDS can be much smarter dealing with "things" rather than "strings".

PDS APPS: Many apps can be built on top of a PDS. One example demonstrated by Azigo VP Engineering Mike McIntosh was a password manager where all the usernames and passwords are saved in your PDS. This same app can be available on multiple devices, including mobile phones, to automatically authenticate you using your PDS as the "sync".

NOTES FROM: Stacey Pitsillides:

data sources users/service providers - everything about you – shift it to you??
Dashboard – centralizing control, not necessarily geography
Ok to give ‘…’ access to these photos - relying party
Concept of personal data stores
Businesses made possible because of it.. – stock market of data
Personal data banking – take it back??
Financial markets (trust to be your bank) - i.e. Google - business aspects – build it??
Diaspora group
If they don’t interoperate?
Multiple streams – local vs something that follows you
same digital user accounts?

Cloud service talking on your behalf – where does it live? Master copy?
Synchronization service, blinded data store, employees have no access to your data.
Your data is in your devices and in the cloud they can't read it!
Encrypted link (blinding) – co-operation??
How do you share if all the data is encrypted?
Copy which is encrypted, which then unencrypted … theory
Q: protocols?? Speak
Store data there it’s fine
Challenge the assumption that the data needs to be encrypted… prevent data mining – aggregated data, layer above it

Consensual where my data goes!
Just happens … provide a 0 knowledge proof
Analogy to the financial system
User consensually say ‘yes’
Turn Facebook's terms of service upside down
You can have access to my data but you can't cache it
Digital signatures on a legal policy
One way functions (privacy – secrecy?) - no audit, you don’t know what happened to your data, trust frameworks.
Protected by an agreement on what happens to data, audit logs, what we did with our data - prove!
Natural element of an identity management system.

Directory service
Person – set of correlations – lets me say that this character is that character and that account – meta persona
Narrow context – persona – multiple set of interactions – trust (non profit) incentives – different - store data in order to leverage it
Stored value system – same store – hold the key
Everything goes to my store first
Get a copy into my store
Could your identity be stored in a myriad of places - from client to store to world – one data flow ?

(key) pair ? public/private - information card in the cloud ?
problem - it's on one machine, the card itself is a piece of data, card store, active client
relationship card – gesture is here’s some stuff – one time push – include one more claim – the pointer to the stuff – if I trust you builds a rel with what I give to you, when you hand somebody, one shot copy of v-card data pointer back to… delegated
social address book (plaxo, done right??)

How are we gonna figure out the uniqueness of this data storm? How does it work ?
transport problem … fragmented and splintered, central dashboard
Let me consol the policies - common data schema (fail) simple, have wide distribution, doesn’t do much ??
Higgins data model – persona data model – mappings, because the world is how it is build mapping tables and mapping rules.
Common data model, bi-directional mapping rules, hides a multitude of sins – master schema – ‘it can’t be done’ ?
concluded you don’t have to convince the world, massive energy for massive consensus
It can maintain its own mappings, at a cost of having to maintain
Open source – mapping is proprietary – next company builds their rules – collaborate
Internationalized domain names and strings – include
Not indeed unique – redesign – how unique the interaction you engage with ?
If you live on the e-commerce personalize the msg for each one is insane
Can’t care who you are .. intermediates

Meta data – i-phone “from strings to things” – meta data along the data string
A person at a place, at a time
XTI – synchronize info – client libraries – password manager, Higgins pd store - your data is never written down?
Issue of duty
What if you lose the key????? What happens to your data ?
Google pass phrase recovery, password reset no/ coz they don’t know it!!
Demo: save values into personal data store, same account for different devices, it will be synchronized – icons that show it's recognized – 1st time visited – saved an account there – recognized that it's a user password field .. remember – simplest app that we could think of – simple schema