<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>My Place in the Crowd</title>
	
	<link>http://blog.myplaceinthecrowd.org</link>
	<description>The Common Data Project Blog</description>
	<lastBuildDate>Tue, 04 Oct 2011 17:48:10 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/MyPlaceInTheCrowd" /><feedburner:info uri="myplaceinthecrowd" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Open Graph, Silk, etc: Let’s stop calling it a privacy problem</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/ET6d7Tf2KTY/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/10/04/open-graph-silk-etc-lets-stop-calling-it-a-privacy-problem/#comments</comments>
		<pubDate>Tue, 04 Oct 2011 17:48:06 +0000</pubDate>
		<dc:creator>Alex Selkirk</dc:creator>
				<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[The Future of Advertising and Media]]></category>
		<category><![CDATA[Amazon]]></category>
		<category><![CDATA[Facebook]]></category>
		<category><![CDATA[open graph]]></category>
		<category><![CDATA[privacy law]]></category>
		<category><![CDATA[silk]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2840</guid>
		<description><![CDATA[The recent announcements of Facebook’s Open Graph update and Amazon Silk have provoked the usual media reaction about privacy. Maybe it&#8217;s time to give up on trying to fight data collection and data use issues with privacy arguments. Briefly, the new features: Facebook is creating more ways for you to passively track your own activity [...]]]></description>
			<content:encoded><![CDATA[<p>The recent announcements of <a title="After f8 - Resources for Building the Personalized Web " href="http://developers.facebook.com/blog/post/379">Facebook’s Open Graph update</a> and <a title="Amazon: Introducing Amazon Silk" href="http://amazonsilk.wordpress.com/2011/09/28/introducing-amazon-silk/">Amazon Silk</a> have provoked the <a title="NYT: As ‘Like’ Buttons Spread, So Do Facebook’s Tentacles" href="http://bits.blogs.nytimes.com/2011/09/27/as-like-buttons-spread-so-do-facebooks-tentacles/">usual</a> media <a title="NYT: Amazon’s Silk Browser Plays Another Role" href="http://bits.blogs.nytimes.com/2011/09/28/amazons-silk-browser-plays-another-role/">reaction</a> about privacy. Maybe it&#8217;s time to give up on trying to fight data collection and data use issues with privacy arguments.</p>
<p>Briefly, the new features: Facebook is creating more ways for you to passively track your own activity and share it with others. Amazon, in the name of speedier browsing (on their new Kindle device), has launched a service that will capture all of your online browsing activity tied to your identity, and use it to do what sounds like collaborative filtering to predict your browsing patterns and speed them up.  </p>
<div id="attachment_2843" class="wp-caption alignleft" style="width: 280px"><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/10/imadog.jpg"><img class="size-medium wp-image-2843" title="Who cares if they know I'm a dog?" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/10/imadog-270x300.jpg" alt="Who cares if they know I'm a dog?" width="270" height="300" /></a><p class="wp-caption-text">Who cares if they know I&#39;m a dog? (<a href='http://www.sfweekly.com/microsites/2009-vs-1993/'>SF Weekly</a>)</p></div>
<p><a href='http://www.amazon.com/gp/help/customer/display.html/?nodeId=200775270'>Amazon likens what Silk is doing to the role of an Internet Service Provider</a>, which seems reasonable, but since <a href='http://blog.myplaceinthecrowd.org/2011/05/11/kerry-mccain-privacy-bill-what-it-got-right-whats-still-missing/'>regulators are getting wary of how ISPs leverage the data that passes through them</a>, Amazon may not always enjoy that association.</p>
<p>EPIC (<a href='http://epic.org/'>Electronic Privacy Information Center</a>) has sent a <a title="EPIC Facebook Letter 9/29/2011" href="http://epic.org/privacy/facebook/EPIC_Facebook_FTC_letter.pdf">letter</a> to the FTC requesting an investigation of Facebook&#8217;s Open Graph changes and the new Timeline.</p>
<p>I&#8217;m not optimistic about the response. Depending on how the default privacy settings are configured, Open Graph may fall victim to another <a href="http://www.washingtonpost.com/wp-dyn/content/article/2007/11/29/AR2007112902503.html">&#8220;Facebook ruined my diamond ring surprise by advertising it <em>on my behalf</em>&#8221; kerfuffle</a>, which will result in a half-hearted apology from Zuckerberg and some shuffling around of checkboxes and radio buttons. The watchdogs aren’t as used to keeping tabs on Amazon, which has done a better job of meeting expectations around its use of customer data, so Silk may provoke a bit more soul-searching. </p>
<p>But I doubt it. In an <a title="The Chronicle: Why Privacy Matters Even if You Have 'Nothing to Hide'" href="http://chronicle.com/article/Why-Privacy-Matters-Even-if/127461/">excerpt from his book</a> “Nothing to Hide: The False Tradeoff Between Privacy and Security” published in the Chronicle of Higher Education earlier this year, Daniel J. Solove does a great job of explaining why we have trouble protecting individual privacy at the cost of [national] security. In the course of his argument he makes two points which are useful in thinking about protecting privacy on the internet.</p>
<p>He quotes South Carolina law professor Ann Bartow as saying,</p>
<blockquote><p>There are not enough privacy “dead bodies” for privacy to be weighed against other harms.</p></blockquote>
<p>There&#8217;s plenty of media chatter monitoring the decay of personal privacy online, but the conversations have been largely theoretical, the stuff of political and social theory. We have yet to have an event that crystallizes the conversation into a debate of moral rights and wrongs.</p>
<h2>Whatevers, See No Evil, and the OMG!&#8217;s</h2>
<p>At one end of the &#8220;privacy theory&#8221; debate, there are the Whatevers, whose blas&eacute; battle cry of “No one cares about privacy any more,” is bizarrely intended to be reassuring. At the other end are the OMG!&#8217;s, who only speak of data collection and online privacy in terms of degrees of personal violation, which equally bizarrely has the effect of inducing public equanimity in the face of &#8220;fresh violations.&#8221;</p>
<p>However, as per usual, the majority of people exist in the middle, where, so long as they &#8220;See no evil and Hear no evil,&#8221; privacy is a tab in the Settings dialog, not a civil liberties issue. Believe it or not, this attitude hampers both companies trying to get more information out of their users AND civil liberties advocates who desperately want the public to &#8220;wake up&#8221; to what&#8217;s happening. Recently, privacy lost to free speech – but more on that in a minute.</p>
<p>When you look into most of the privacy concerns that are raised about legitimate web sites and software (not viruses, phishing or other malicious efforts), they usually have to do with fairly mundane personal information. Your name or address being disclosed inadvertently. Embarrassing photos. Terms you search for. The web sites you visit. Public records digitized and put on the web.</p>
<p>The most legally harmful examples involve identity theft, which while not unrelated to internet privacy, falls squarely in the well-understood territory of criminal activity. What&#8217;s less clear is what&#8217;s wrong with &#8220;legitimate actors&#8221; such as Google and Facebook and what they&#8217;re doing with our data.</p>
<p>Which brings us to a second point from Solove:</p>
<blockquote><p>“Legal and policy solutions focus too much on the problems under the Orwellian metaphor—those of surveillance—and aren&#8217;t adequately addressing the Kafkaesque problems—those of information processing.”</p></blockquote>
<p>In other words, who cares if the servers at Google &#8220;know&#8221; what I&#8217;m up to? We can&#8217;t as yet really even understand what it means for a computer to &#8220;know&#8221; something about human activity. Instead, the real question is: what is Google (the company, composed of human beings) deciding to do with this data?</p>
<h2>What are People deciding to do with data?</h2>
<p>By and large, the data collection that happens on the internet today is feeding into one flavor or another of “targeted advertising.” Loosely, that means showing you advertisements that are intended for an individual with some of your traits, based on information that has been collected about you. A male. A parent. A music lover. <a title="IF: What f8 Means for Advertisers: The Ability to Target Users Based on Media Consumption" href="http://www.insidefacebook.com/2011/09/22/what-f8-means-for-advertisers-the-ability-to-target-users-based-on-media-consumption/">The changes to Facebook&#8217;s Open Graph will create a targeting field day</a>. Which, on some level, is a perfectly reasonable and predictable extension of age-old advertising and marketing practices.</p>
<blockquote><p>In theory, advertising provides social value in bridging information gaps about useful, valuable products; data-driven services like Facebook, Google and Amazon are simply providing the technical muscle to close that gap.</p></blockquote>
<p>However, Open Graph, Silk and other data-rich services place us at the top of a very long and shallow slide down to a much darker side of information processing, one that is not about the processing itself but about manipulation and the balance of power. And it&#8217;s the very length and gentle slope of that slide that make it almost impossible for us to talk about what&#8217;s really going wrong, and make it even somewhat pleasant to ride down. (Yes, I&#8217;m making a slippery <i>slide</i> argument.)</p>
<h2>At the top of the slide, are issues of values and dehumanization.</h2>
<p>Recently employers have been <a title="NYT: Another Hurdle for the Jobless: Credit Inquiries" href="http://www.nytimes.com/2009/08/07/business/07credit.html">making use of credit checks to screen potential candidates</a>, automatically rejecting applicants with low credit scores. Perhaps this is an ingenious, if crude, way to quickly filter down a flood of job applicants. While its utility remains to be proven, it&#8217;s with good reason that we pause to consider the unintended consequences of such a policy. In many areas, we have often chosen to supplement &#8220;objective,&#8221; statistical evaluations with more humanist, subjective techniques (the college application process being one notable example). We are also a society that likes to believe in second chances.</p>
<h2>A bit further down the slide, there are questions of fairness.</h2>
<p><a title="NYT: What Does Your Credit-Card Company Know About You?" href="http://www.nytimes.com/2009/05/17/magazine/17credit-t.html">Credit card companies have been using purchase histories</a> as a way to decide who to push to pay their debt in full and who to strike a deal with. In other words, they&#8217;re figuring out who will be susceptible to &#8220;being guilted&#8221; and who&#8217;s just going to give them the finger when they call. This is a truly ingenious and effective way to lower the cost and increase the effectiveness of debt collection efforts. But is it fair to debtors that some people &#8220;get a deal&#8221; and others don&#8217;t? Surely, such inequalities have always existed. At the very least, it&#8217;s problematic that such practices are happening behind closed doors with little to no public oversight, all in the name of protecting individual privacy.</p>
<h2>Finally, there are issues of manipulation where information about you is used to get you to do things you don&#8217;t actually want to do.</h2>
<p>The fast food industry has been <a title="BBC: Manufacturing Fast Food Addiction" href="http://www.bbc.co.uk/worldservice/specials/1616_fastfood/page7.shtml">micro-engineering the taste, smell and texture of their food products to induce a very real food addiction</a> in the human brain. Surely, this is where online behavioral data-mining is headed, amplified by the power to deliver custom-tailored experiences to individuals.</p>
<h2>But it&#8217;s just the Same-Old, Same-Old</h2>
<p>This last scenario sounds bad, but isn&#8217;t this simply more of the same old advertising techniques we love to hate? Is there a bright line test we can apply so we know when we&#8217;ve &#8220;crossed the line&#8221; over into manipulation and lies?</p>
<h2>Drawing Lines</h2>
<p>Clearly the ethics of data use and manipulation in advertising is something we have been struggling with for a long time and something we will continue to struggle with, probably forever. However, some lines have been drawn, even if they&#8217;re not very clear.</p>
<p>While the original defining study on <a title="Wikipedia: Subliminal Stimuli" href="http://en.wikipedia.org/wiki/Subliminal_stimuli">subliminal advertising</a> has since been invalidated, when it was first publicized, the idea of messages being delivered subliminally into people’s minds was broadly condemned. In a world of imperfect definitions of “truth in advertising” it was immediately clear to the public that subliminal messaging (if it could be done) crossed the line into pure manipulation, and that was unacceptable. It was quickly banned in the UK and Australia, and by the American networks and the National Association of Broadcasters.</p>
<blockquote><p>Thought Experiment: If we were to impose a &#8220;code of ethics&#8221; on data practitioners, what would it look like?</p></blockquote>
<p>Here&#8217;s a real-world, data-driven scenario:</p>
<ul>
<li><a title="NYT: A Fight Over How Drugs Are Pitched" href="http://www.nytimes.com/2011/04/25/business/25privacy.html">Pharmacies sell customer information to drug companies</a> so that they can identify doctors who will be most &#8220;receptive&#8221; to their marketing efforts.</li>
<li>Drug companies spend <a title="AdWeek: Pharma (Slowly) Goes Digital" href="http://www.adweek.com/news/technology/pharma-slowly-goes-digital-103185">$1 billion a year</a> advertising online to encourage individuals to “ask your doctor about [insert your favorite drug here]” with vague happy-people-in-sunshine imagery.</li>
<li>Drug companies employ <a title="NYT: Gimme an Rx! Cheerleaders Pep Up Drug Sales" href="http://www.nytimes.com/2005/11/28/business/28cheer.html">90,000 salespeople</a> (in 2005) to visit the best target doctors and sway them to their brands.</li>
</ul>
<p>Vermont passed a law outlawing the use of the pharmacy data without patient consent on the grounds of individual privacy. Then, this past June 23<sup>rd</sup>, the Supreme Court decided it was a free-speech problem and <a title="10-779 WILLIAM H. SORRELL, ATTORNEY GENERAL OF VERMONT, ET AL., PETITIONERS v. IMS HEALTH INC. ET AL." href="http://www.supremecourt.gov/opinions/10pdf/10-779.pdf">struck down the Vermont law</a>.</p>
<h2>Privacy as an argument for hemming in questionable data use will probably continue to fail.</h2>
<p>The trouble, again, is that theoretical privacy harms are weak sauce in comparison to the value of data as a way to &#8220;bridge information gaps.&#8221; If we shut down use of this data on the basis of privacy, that also prevents the government from using the same data to prioritize distribution of vaccines to clinics in high-risk areas.</p>
<p>Ah, but here we&#8217;ve stumbled on the real problem…</p>
<h2>Let&#8217;s shift the conversation from Privacy to Access</h2>
<p>Innovative health care cost reduction schemes like <a title="NYT: For Chronic Care, Try Turning to Your Employer" href="http://www.nytimes.com/2010/07/24/business/24patient.html">care management are starved for data</a>. Privacy concerns about broad, timely <a title="NYT: Should Tax Bills Be Public Information?" href="http://www.nytimes.com/2010/02/14/business/yourtaxes/14disclose.html">analysis of tax returns have prevented effective policy evaluation</a>. Municipalities negotiating with corporations <a title="NYT: A study finds benefits forlocalities that offer subsidies to attract companies." href="http://www.nytimes.com/2003/12/11/business/economic-scene-study-finds-benefits-forlocalities-that-offer-subsidies-attract.html">lack data to make difficult economic stimulus decisions</a>. Meanwhile private companies are drowning in data that they are barely scratching the surface of.</p>
<p>At the risk of sounding like a broken record, since we have <a title="CDP: The Common Data Project White Paper v.2" href="http://commondataproject.org/docs/whitepaper.pdf">written volumes</a> about this already:</p>
<ul>
<li>The problem does not lie in the mere fact that data is collected, but in how it is secured and processed and in whose interest it is deployed.</li>
<li>Your activity on the internet, captured in increasingly granular detail, is enormously valuable, and can be mined for a broad range of uses that as a society we may or may not approve of.</li>
<li>Privacy is an ineffective weapon to wield against the dark side of data use and instead, we should focus our efforts on <strong><em>(1)</em></strong> regulations that require companies to be more transparent about <strong><em>how</em></strong> they&#8217;re using data and <strong><em>(2)</em></strong> making personal data into a public resource that is in the hands of many.</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/10/04/open-graph-silk-etc-lets-stop-calling-it-a-privacy-problem/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/10/04/open-graph-silk-etc-lets-stop-calling-it-a-privacy-problem/</feedburner:origLink></item>
		<item>
		<title>Kerry-McCain Privacy Bill: What it got right, what’s still missing.</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/btAP7R-Zai0/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/05/11/kerry-mccain-privacy-bill-what-it-got-right-whats-still-missing/#comments</comments>
		<pubDate>Wed, 11 May 2011 13:15:30 +0000</pubDate>
		<dc:creator>Alex Selkirk</dc:creator>
				<category><![CDATA[Protecting Privacy in Meaningful Ways]]></category>
		<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[Access to Information]]></category>
		<category><![CDATA[Kerry-McCain Privacy Bill]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2709</guid>
		<description><![CDATA[At long last, we have a bill to talk about. Its official name is the &#8220;Commercial Privacy Bill of Rights Act of 2011&#8221; and it was introduced by Senators Kerry and McCain. I was pleasantly surprised by how well many of the concepts and definitions were articulated, especially given some of the vague commentary that [...]]]></description>
			<content:encoded><![CDATA[<p>At long last, we have a bill to talk about. Its official name is the &#8220;<a title="Commercial Privacy Bill of Rights Act of 2011" href="http://kerry.senate.gov/imo/media/doc/Commercial%20Privacy%20Bill%20of%20Rights%20Text.pdf">Commercial Privacy Bill of Rights Act of 2011</a>&#8221; and it was introduced by Senators Kerry and McCain.</p>
<p>I was pleasantly surprised by how well many of the concepts and definitions were articulated, especially given some of the vague commentary that I had read before the bill was officially released.</p>
<div class="simplePullQuote">Perhaps most importantly, the bill acknowledges that de-identification doesn&#8217;t work, even if it doesn&#8217;t make a lot of noise about it. </div>
<p>More generally though, there is a lot that is right about this bill, and it cannot be dismissed as an ill-conceived, knee-jerk reaction to the media hype around privacy issues.</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/privacy_bill.png"><img class="alignleft size-full wp-image-2793" style="margin: 10px; border: 1px solid black;" title="Commercial Privacy Bill of Rights Act of 2011" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/privacy_bill.png" alt="Commercial Privacy Bill of Rights Act of 2011" width="250" /></a>For readers who are interested, I have outlined some of the <a href="#keypoints">key points</a> from the bill that jumped out at me, as well as some <a href="#questions">questions and clarifications</a>. Before getting to that however, I&#8217;d like to make three suggestions for additions to the bill.</p>
<h2>Transparency, Clear Definitions and Public Access</h2>
<p>Lawmakers should legislate more transparency into data collection; they should define what it means to render data &#8220;not personally identifiable;&#8221; and they should push for commercial data to be made available for public use.</p>
<h3>Legislators should look for opportunities to  require more transparency of companies and organizations collecting data by establishing new standards for &#8220;privacy accounting&#8221; practices.</h3>
<p>Doing so will encourage greater responsibility on the part of data collectors and provide regulators with more meaningful tools for oversight. Some examples include:</p>
<ol>
<li>Companies collecting data should be required to identify outside contractors they hire to perform data-related services. Currently in the bill, companies are liable for their contractors when it comes to privacy and security issues. However, we need a more positive carrot to incent companies to keep closer track of who has access to sensitive data and for what purposes. A requirement to publicly account for that information is the best way to encourage more disciplined internal accounting practices.</li>
<li>Data collectors should publicly and specifically state  what data they are collecting in plain English. Most privacy  policies today are far too vague and high-level because companies don&#8217;t want to be limited by their own policies.</li>
</ol>
<p>For example, the following is taken from the <a title="Google Toolbar Privacy Policy" href="http://www.google.com/support/toolbar/bin/answer.py?hl=en&amp;answer=81841&amp;rd=2">Google Toolbar Privacy Policy</a>:</p>
<blockquote><p>&#8220;Toolbar&#8217;s enhanced features, such as PageRank and Sidewiki, operate by sending Google <em>the addresses and other information </em>about sites at the time you visit them.&#8221; (Italics mine.)</p></blockquote>
<p>This raises the question: what exactly is covered by &#8220;other information&#8221;? How long I remain on a page? Whether I scroll down to the bottom of the page? What personalized content shows up? What comments I leave? The passwords I type in? These are all reasonable examples of the level of specificity at which Google could be more transparent about what data they collect. None of these items is too technical for the general user to understand, and at this granularity, I don&#8217;t believe such a list would be terribly onerous to keep up to date. We should be able to find a workable middle ground that gives users of online services a more specific idea of what data is being collected about them without overwhelming them with too much technical detail.</p>
<h3>Legislators Need to Establish Meaningful Standards for Anonymization</h3>
<p>After describing the spirit of the regulations, the bill assigns certain tasks that are either too detailed or too dynamic to &#8220;rulemaking proceedings.&#8221; One such task is defining the requirements for providing adequate data security. I would like to add an additional, critical task to the responsibilities of those proceedings:</p>
<p>They must define what it means to &#8220;render not personally  identifiable&#8221; (Sec 202a5A) or &#8220;anonymise&#8221; (sec 701-4) data.</p>
<p>Without a clear legal standard for anonymization the public will  continue to be misled into believing that anonymous means their data is no longer  linkable to their identity when in fact there can only ever be degrees of anonymity because complete anonymity does not exist. This is a problem <a href="http://blog.myplaceinthecrowd.org/tag/privacy-guarantee/">we have been struggling with as well</a>.</p>
<div class="simplePullQuote">Our best guess at a good way to approach a legal definition would be to build up a framework around acceptable levels of risk and require companies and organizations collecting data to quantify the amount of risk they incur when they share data, which is actually possible with something like differential privacy.</div>
<h3>Legislators Should Push for Public Access</h3>
<div class="simplePullQuote">Entities that collect data from the public should be required to make it publicly available, through something like our proposal for <a href="http://blog.myplaceinthecrowd.org/2011/04/04/whitepaper-2-0-a-moral-and-practical-argument-for-public-access-to-private-data/">the datatrust</a>.</div>
<p>Businesses of all sorts have, with the advent  of technology, become data businesses. They live and die by the  data that they come by, though little of it was given to them for the  purposes it is now used for. That doesn&#8217;t mean we should delete the  data, or stop them from gathering it &#8211; that data is enormously valuable.</p>
<p>It <strong><em>does </em></strong>mean that the public needs a  datastore to compete with the massive private sector data warehouses.  The competitive edge that large datasets provide the entities that have  them is gigantic, and no amount of notice and security can address that  imbalance with the paucity of granular data available in the public  realm.<br />
<a name="keypoints"></a></p>
<p>Now for a more detailed look at the bill.</p>
<h2>Key Points of the Bill</h2>
<ol>
<li>The bill is about protecting Personally Identifiable Information (PII), which it correctly disambiguates to mean both the unique identifying information itself AND any information that is linked to that identifier.</li>
<li>Though much of the related discussion in the media talks about the bill in terms of its impact on tracking individuals on the internet, the bill covers all commercial entities, online or off.</li>
<li>&#8220;Entities&#8221; must give notice to users about collecting or using PII &#8211; this isn&#8217;t particularly shocking, but what may be more complicated will be what constitutes &#8220;notice&#8221;.</li>
<li>Opt-out for individuals is required for use of information that would otherwise be considered an <em>un</em>authorized use. (This is a nice thought, but the list of exceptions to the unauthorized use definition seems to be very comprehensive &#8211; if anyone has a good example of use that would &#8220;otherwise be unauthorized&#8221; and is thus addressed by this point, I would be interested to hear it.)</li>
<li>Opt-out for individuals is also required for the use of an individual&#8217;s covered information by a third-party for behavioral advertising or marketing. (I guess this means that a news site would need to provide an opt-out for users that prevents ad-networks from setting cookies, for example?)</li>
<li>Opt-in for individuals is required for the use or transfer of sensitive PII (a special category of PII that could cause the individual physical or economic harm, in particular medical information or religious affiliations) for uses other than handling a transaction (does serving an ad count as a transaction? &#8211; this is not defined), fighting fraud or preventative security. Opt-in is also required if there is a material change to the previously consented uses and that use creates a risk of economic or physical harm.</li>
<li>Entities need to be accountable for providing adequate security/protection for the PII that they store.</li>
<li>Entities can use the PII that they collect for an enumerated list of purposes, but from my reading, just about any purpose related to their business.</li>
<li>Entities can&#8217;t transfer this data to other entities without explicit user consent. Entities may not combine de-identified data with other data &#8220;in order to&#8221; re-identify it. (It is unclear what happens if they combine it without the intent of re-identification, but with the same effect.)</li>
<li>Entities are liable for the actions of the vendors they contract PII work to.</li>
<li>Individuals must be able to access and update the information entities have about them. (The process of authenticating individuals to ensure they are updating their own information will be a hard nut to crack, and ironically may potentially require additional information be collected about them to do so.)</li>
</ol>
<p>It&#8217;s hard to disagree with the direction of the above points &#8211; all are ideas that seem to be doing the right thing for user privacy. However, there are some hidden issues, some of which may be my misunderstanding, but some of which definitely require clarifying the goal of the bill.<br />
<a name="questions"></a></p>
<h3>Clarifications/Questions</h3>
<p>1. <strong>Practical Enforcement</strong> &#8211; While the bill specifies fines and indicates that various rule making groups will be created to flesh out the practical implications of the bill, it&#8217;s not clear how the new law will actually change the status quo when it comes to enforcement of privacy rules. With no filing or accounting requirements for data collectors to demonstrate that they are actually complying, the FTC will have no way of &#8220;being alerted&#8221; when they break the rules, outside of blatant violations such as completely failing to notify end users about the use of their PII. Instead, the FTC will be operating blindly, wholly dependent on whistle blowers for any view into the reality of day-to-day data collection practices.</p>
<p>2. <strong>Meaningful Notice and Consent</strong> &#8211; While the bill lays out specific scenarios where &#8220;proper notice&#8221; and &#8220;explicit [individual] consent&#8221; will be required, there is no further explication of what &#8220;proper notice&#8221; and &#8220;explicit consent&#8221; should consist of.</p>
<p>Today, &#8220;proper notice&#8221; for online services <a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/fda_nutritional_facts.gif"><img class="alignright size-medium wp-image-2801" title="FDA Nutritional Facts Sample" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/fda_nutritional_facts-266x300.gif" alt="FDA Nutritional Facts Sample" width="266" height="300" /></a>consists of providing a lengthy legal document that is almost never read, and even more rarely fully understood by individuals. In the same vein, &#8220;Explicit consent&#8221; is when those same individuals &#8220;agree&#8221;  to the terms laid out in the lengthy document they didn&#8217;t read.</p>
<div class="simplePullQuote"> We need guidelines that provide formatting and placement requirements for notice and consent, much the way the FDA actually designed <a title="FDA Nutritional Facts" href="http://www.fda.gov/food/labelingnutrition/ConsumerInformation/ucm109832.htm">&#8220;Nutrition Facts&#8221; labels</a> for food packaging.</div>
<p>3. <strong>Regulating Ad Networks</strong> &#8211; In the bill&#8217;s attempt to distinguish between third parties (which require separate notice) and business partners (which do not), it remains unclear which category ad networks belong to.</p>
<p>Ads served up directly by the New York Times on nytimes.com should probably be considered an integral part of the NYT site.</p>
<p>However, should Google AdWords be handled in the same way? Or are they really third party advertisers that should be required to provide users with separate notice before they can set and retrieve cookies?</p>
<p>More disturbingly, the bill seems to imply that online services gain an all-inclusive free pass to track you wherever you go on the web as soon as you &#8220;establish a business relationship,&#8221; what EFF is calling the &#8220;<a href="http://www.eff.org/deeplinks/2011/04/well-meaning-privacy-bill-rights-could-codify">Facebook loophole</a>.&#8221; This means that by signing up for a Gmail account, you are also agreeing to Google AdWords tracking what you read on blogs and what you buy online.</p>
<p>This is, of course, how privacy agreements work today. But the ostensible goal of this bill is to close such loopholes.</p>
<h2>A Step In The Right Direction</h2>
<p>The Kerry-McCain Privacy Bill is undeniable evidence of significant progress in public awareness of privacy issues. However, in the final analysis, the bill in its current form is unlikely to practically change how businesses collect, use and manage sensitive personal data.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/05/11/kerry-mccain-privacy-bill-what-it-got-right-whats-still-missing/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/05/11/kerry-mccain-privacy-bill-what-it-got-right-whats-still-missing/</feedburner:origLink></item>
		<item>
		<title>The CDP Private Map Maker v0.2</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/byv_AvX8s0A/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/04/27/the-cdp-private-map-maker-v0-2/#comments</comments>
		<pubDate>Wed, 27 Apr 2011 13:15:26 +0000</pubDate>
		<dc:creator>Tony Gibbon</dc:creator>
				<category><![CDATA[Building the Datatrust]]></category>
		<category><![CDATA[CDP Announcements]]></category>
		<category><![CDATA[Protecting Privacy in Meaningful Ways]]></category>
		<category><![CDATA[anonymization]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[differential privacy]]></category>
		<category><![CDATA[Maps]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Privacy Guarantee]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2655</guid>
		<description><![CDATA[We&#8217;ve released version 0.2 of the CDP Private Map Maker &#8211; A new way to release sensitive map data! (Requires Silverlight.) Speedy, but is it safe? Today, releasing sensitive data safely on a map is not a trivial task. The common anonymization methods tend to either be manual and time consuming, or create a very [...]]]></description>
			<content:encoded><![CDATA[<p>We&#8217;ve released version 0.2 of the <a href="http://demos.commondataproject.org/MapMaker.html">CDP Private Map Maker</a> &#8211; A new way to release sensitive map data! (Requires <a title="Get Silverlight Plug-in" href="http://www.microsoft.com/getsilverlight/">Silverlight</a>.)</p>
<p><strong><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Map-Maker-Screenshot2.png"><img class="alignnone size-full wp-image-2786" title="Map Maker Screenshot" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Map-Maker-Screenshot2.png" alt="" width="440" /></a><br />
</strong></p>
<h2>Speedy, but is it safe?</h2>
<p>Today, releasing sensitive data safely on a map is not a trivial task.  The common anonymization methods tend to either be manual and time consuming, or create a very low resolution map.</p>
<p>Compared to current manual anonymization methods, which can take months if not years, our map maker leverages differential privacy to generate a map programmatically in much less time. For the sample datasets included, this process took a couple of minutes.</p>
<p>However, speed is not the map maker&#8217;s most important feature. Safety is, and it comes from the ability to quantify privacy risk.</p>
<h2>Accounting for Privacy Risk, Literally and Figuratively</h2>
<p>We&#8217;re still leveraging the same differential privacy principles <a title="Differential Privacy and Statistical Significance" href="http://blog.myplaceinthecrowd.org/2010/05/26/recap-and-proposal-955-the-statistically-insignificant-privacy-guarantee/">we&#8217;ve been working with all along</a>.  Differential privacy not only allows us to (mostly) automate the process of generating the maps, it also allows us to quantitatively balance the accuracy of the map against the privacy risk incurred when releasing the data.  (The purpose of this post is not to discuss whether differential privacy works&#8211;it&#8217;s an area of <a title="Google Scholar Search: &quot;Differential Privacy&quot;" href="http://scholar.google.com/scholar?q=differential+privacy&amp;hl=en&amp;btnG=Search&amp;as_sdt=1%2C33&amp;as_sdtp=on">privacy research that has been around for several years and there are others better equipped to defend its capabilities</a>.)</p>
<p>Think of it as a form of accounting. Rather than buying what <em>appears to be</em> cost-effective and hoping for the best, you can actually see the price of each item (privacy risk) AND know how accurate it will be.</p>
<p>Previous implementations of differential privacy (including our own) have done this accounting in code. The new map maker provides a graphical user interface so you can play with the settings yourself.<br />
More details on how this works below.</p>
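<p>To make the accounting metaphor concrete, here is a minimal sketch of a privacy &#8220;ledger&#8221; in Python. It is not the map maker&#8217;s actual code: the <code>PrivacyLedger</code> class, its method names and the budget figures are hypothetical, and it assumes the simplest accounting rule, where the privacy cost (epsilon) of successive releases from the same dataset simply adds up.</p>

```python
class PrivacyLedger:
    """Hypothetical ledger that tracks cumulative privacy cost (epsilon)."""

    def __init__(self, total_budget):
        # Total epsilon the data releaser is willing to spend on this dataset.
        self.total_budget = total_budget
        self.entries = []  # (label, epsilon) pairs, one per release

    @property
    def spent(self):
        # Assumption: costs of successive releases add up.
        return sum(eps for _, eps in self.entries)

    @property
    def remaining(self):
        return self.total_budget - self.spent

    def charge(self, label, epsilon):
        """Record a release, refusing it if it would exceed the budget."""
        if not epsilon > 0:
            raise ValueError("epsilon must be positive")
        if epsilon > self.remaining:
            raise ValueError("release would exceed the privacy budget")
        self.entries.append((label, epsilon))
        return self.remaining


ledger = PrivacyLedger(total_budget=1.0)
ledger.charge("coarse overview map", 0.25)
ledger.charge("block-level map", 0.5)
print("Privacy budget remaining:", ledger.remaining)
```

<p>Under these assumptions, a ledger with a budget of 1.0 that is charged 0.25 and then 0.5 has 0.25 left, and a further charge of 0.5 would be refused.</p>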
<h2>Compared to v0.1</h2>
<p>Version 0.2 updates our <a title="PINQ Demo" href="http://blog.myplaceinthecrowd.org/2010/05/04/update-pinq-demo-revisited/">first test-drive of differential privacy</a>.  Our first iteration allowed you to query the number of people in an arbitrary region of the map, returning meaningful results about the area as a whole without exposing individuals in the dataset.</p>
<p>The flexibility that application provided as compared to pre-bucketed data is great if you have a specific question, but the workflow of looking at a blank map and choosing an area to query doesn’t align with how people often use maps and data.  We generally like to see the data at a high level, and then dig deeper as needed.</p>
<p>In this round, we&#8217;re aiming for a more intuitive user experience. Our two target users are:</p>
<ol>
<li><strong>Data Releaser</strong> The person releasing the data who wants to make intelligent decisions about how to balance privacy risk and data utility.</li>
<li><strong>Data User</strong> The person trying to make use of the data, who would like to have a general overview of a data set before delving in with more specific questions.</li>
</ol>
<p>As a result, we&#8217;ve flipped our workflow on its head. Rather than providing a blank map for you to query, the map maker now immediately produces populated maps at different levels of accuracy and privacy risk.</p>
<p>We&#8217;ve also added the ability to upload your own datasets and choose your own privacy settings to see how the private map maker works.</p>
<h2>However, please do not upload actually sensitive data to this demo.</h2>
<p>v0.2 is for demonstration purposes only. Our hope is to create a forum where organizations with real data release scenarios can begin to engage with the differential privacy research community. If you&#8217;re interested in a more serious experiment with real data, please <a title="Contact the Common Data Project" href="http://www.commondataproject.org/contact">contact us</a>.</p>
<p>Any data you do upload is available publicly to other users until it is deleted. (You can delete any uploaded dataset through the map maker interface.) The sample data sets provided cannot be deleted, and were synthetically generated &#8211; please do not use the sample data for any purpose other than seeing how the map maker works &#8211; the data is fake.</p>
<p>You can play with the demo <a href="http://demos.commondataproject.org/MapMaker.html" target="_blank">here</a>. (Requires <a title="Get Silverlight Plug-in" href="http://www.microsoft.com/getsilverlight/">Silverlight</a>.)</p>
<p>Finally, a subtle but significant change we should call out: <a href="http://blog.myplaceinthecrowd.org/2010/01/07/pinq-privacy-demo/">our previous map demo</a> leveraged an implementation of differential privacy called <a href="http://research.microsoft.com/en-us/projects/pinq/default.aspx" target="_blank">PINQ, developed at Microsoft Research</a>.  Creating the grids for this map maker required a different workflow, so we wrote our own implementation to add noise to the cell counts, using the same fundamentals of differential privacy.</p>
<h2>More Details on How the Private Map Maker Works</h2>
<h3>How exactly do we generate the maps? One option – Nudge each data point a little</h3>
<p>The key to differential privacy is adding random noise to each answer.  Differential privacy only returns aggregates, so we can’t ask it to ‘make a data point private’, but what if we added noise to each data point by moving it slightly?  The person consuming the map then wouldn’t know exactly where the data point originated, making it private, right?</p>
<p>The problem with this process is that we can’t automate adding this random noise because external factors might cause the noise to be ineffective.  Consider the red data point below.</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Nudge2.png"><img title="Nudge2" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Nudge2.png" alt="" width="440" /></a></p>
<p>If we nudge it randomly, there’s a pretty good chance we’ll nudge it right into the water.  Since there aren’t residences in the middle of Manhasset Bay, this could significantly narrow down the possibilities for the actual origin of the data point.  (One of the more problematic scenarios is pictured above.)  And water isn’t the only issue: if we’re dealing with residences, nudging into a strip mall, school, etc. could cause the same problem.  Because of these external factors, the process is manual and time consuming.   On top of that, unlike differential privacy, there’s no mathematical measure of how much information is being divulged, so you’re relying on the manual review to catch any privacy issues.</p>
<h3>Another Option – Grids</h3>
<p>As a compromise between querying a blank map, and the time consuming (and potentially error prone) process of nudging data points, we decided to generate grid squares based on noisy answers—the darker the grid square, the higher the answer.  The grid is generated simply by running one differential privacy-protected query for each square.  Here’s an example grid from a fake dataset:</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Grid.png"><img class="alignnone size-full wp-image-2657" title="Grid" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Grid.png" alt="" width="440" /></a></p>
<p>“But Tony!” you say, “Weren’t you just telling us how much better arbitrary questions are as compared to the bucketing we often see?”  First, this isn’t meant to necessarily replace the ability to ask arbitrary questions, but instead provides another tool allowing you to see the data first.  And second, compared to the way released data is often currently pre-bucketed, we’re able to offer more granular grids.</p>
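<p>The grid-generation step described above can be sketched roughly as follows. This is an illustrative Python sketch, not the map maker&#8217;s actual implementation: the function names, the <code>seed</code> parameter and the choice of Laplace noise with scale 1/epsilon are all assumptions. It also assumes each person contributes a single point, so adding or removing one person changes exactly one cell count by one.</p>

```python
import math
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with the
    same rate (1 / scale) follows a Laplace distribution with that scale.
    """
    rate = 1.0 / scale
    return rng.expovariate(rate) - rng.expovariate(rate)


def noisy_grid(points, origin, cell_size, rows, cols, epsilon, seed=None):
    """Bucket (x, y) points into square cells, then add noise to each count.

    Hypothetical sketch: every cell is treated as one noisy count query.
    """
    rng = random.Random(seed)
    counts = [[0] * cols for _ in range(rows)]
    x0, y0 = origin
    for x, y in points:
        col = math.floor((x - x0) / cell_size)
        row = math.floor((y - y0) / cell_size)
        # Points that fall outside the grid are ignored.
        if row in range(rows) and col in range(cols):
            counts[row][col] += 1
    scale = 1.0 / epsilon  # assumption: each count has sensitivity 1
    return [
        [count + laplace_noise(scale, rng) for count in row_counts]
        for row_counts in counts
    ]


# Made-up coordinates, purely for illustration.
sample_points = [(0.5, 0.5), (0.6, 0.4), (1.5, 0.5), (1.2, 1.7)]
for row in noisy_grid(sample_points, (0.0, 0.0), 1.0, 2, 2, epsilon=0.5, seed=42):
    print([round(value, 1) for value in row])
```

<p>A smaller <code>epsilon</code> means more noise per cell and less privacy risk, which is the trade-off the gallery of pre-generated maps lets you explore.</p>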
<h3>Choosing a Map</h3>
<p>Now comes the manual part.  There are two variables you can adjust when choosing a map: grid size and margin of error.  While this step is manual, most of the work is done for you, so it&#8217;s much less time-intensive than moving data points around.  For demonstration purposes, we currently generate several options, which you can select from in the gallery view.  You could release any of the pre-generated maps, as they are all protected by differential privacy with the given +/-, but some are not useful and others may be wasting privacy currency.</p>
<p>Grid size is simply the area of each cell.  Since a cell is the smallest area you can compare (with either another cell or 0), you must set it to accommodate the minimum resolution required for your analysis.  For example, using the map to allocate resources at the borough level requires a different resolution than allocating them at the block level.  You also have to consider the density of the dataset.  If your analysis is at the block level, but the dataset is so sparse that there&#8217;s only about one point per block, the noise will protect those individuals, but the map will be uniformly noisy.</p>
<p>Margin of error specifies a range that the noisy answer will likely fall within.  The higher the margin of error, the less the noisy answer tells us about specific data points within the cell.  A cell with answer 20 +/- 3 means the real answer is likely between 17 and 23.  An answer of 20 +/- 50, by contrast, means the real answer is likely between -30 and 70, and thus it’s reasonably likely that there are no data points within that cell at all.</p>
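<p>As a rough illustration of how a margin of error relates to privacy cost, the sketch below assumes Laplace noise on a count that changes by at most one per person, and reads &#8220;likely&#8221; as a 95% chance. Neither assumption is stated by the map maker itself, and the function names are hypothetical. For Laplace noise with scale b, the chance that the noise stays within +/- m is 1 - exp(-m/b), which gives the conversion used here.</p>

```python
import math


def epsilon_for_margin(margin, confidence=0.95):
    """Epsilon whose Laplace noise stays within +/- margin with the given
    probability, assuming a count query with sensitivity 1."""
    if not margin > 0:
        raise ValueError("margin must be positive")
    if not (confidence > 0.0 and 1.0 > confidence):
        raise ValueError("confidence must be strictly between 0 and 1")
    # P(|noise| within margin) = 1 - exp(-margin * epsilon)
    return math.log(1.0 / (1.0 - confidence)) / margin


def likely_range(noisy_count, margin):
    """Range the real count likely falls in, e.g. 20 +/- 3 gives (17, 23)."""
    return (noisy_count - margin, noisy_count + margin)


for margin in (3, 9, 50):
    low, high = likely_range(20, margin)
    eps = epsilon_for_margin(margin)
    print("+/-", margin, "range", (low, high), "epsilon about", round(eps, 3))
```

<p>Under these assumptions, a tighter margin of error always costs more epsilon than a looser one, which is why the more accurate maps spend more privacy currency.</p>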
<p>To select a map, first pan and zoom the map to show the portion  you’re interested in, and then click the target icon for a dataset.</p>
<p style="text-align: center;"><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/target_button_circle.png"><img class="size-full wp-image-2680 aligncenter" title="Map Maker Target Button" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/target_button_circle.png" alt="Map Maker Target Button" width="326" height="98" /></a></p>
<p>When you click the target, a gallery with previews of the nine pre-generated options is displayed.</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Gallery1.png"><img class="alignnone size-full wp-image-2692" title="Gallery1" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Gallery1.png" alt="" /></a></p>
<p>As an example, let&#8217;s imagine that I&#8217;m doing block level analysis, so I&#8217;m only interested in the third column:</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Gallery2.png"><img class="alignnone size-full wp-image-2696" title="Gallery2" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/Gallery2.png" alt="" width="443" height="708" /></a></p>
<p>This sample dataset has a fairly small amount of data, such that in the top cell (+/- 50) and to some extent the middle cell (+/- 9), the noise overwhelms the data.  In this case, we would have to consider tuning down the privacy protection towards the +/- 3 cell in order to have a useful map at that resolution. (For this demo, the noise level is hard-coded.)  The other option is to sacrifice resolution (moving left in the gallery view), so that each square contains more data points and the data won&#8217;t be drowned out by higher noise levels.</p>
<p>Once you have selected a grid, you can pan and zoom the map to the desired scale.  The legend is currently dynamic: it adjusts as necessary to the magnitude of the data in your current view.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/04/27/the-cdp-private-map-maker-v0-2/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/04/27/the-cdp-private-map-maker-v0-2/</feedburner:origLink></item>
		<item>
		<title>Should Pharma have access to doctors’ prescription records?</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/HIeH5kAnhiI/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/04/26/should-pharma-have-access-to-doctors-perscription-records/#comments</comments>
		<pubDate>Tue, 26 Apr 2011 15:50:35 +0000</pubDate>
		<dc:creator>Mimi Yin</dc:creator>
				<category><![CDATA[Interesting Uses of Data]]></category>
		<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[Access to Information]]></category>
		<category><![CDATA[In The News]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2758</guid>
		<description><![CDATA[Maine, New Hampshire and Vermont want to pass laws to prevent pharmacies from selling prescription data to drug companies, who in turn use it for &#8220;targeted marketing to doctors&#8221; or &#8220;tailoring their products to better meet the needs of health practitioners&#8221; (depending on who you talk to). This gets at the heart of the issue [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/notionscapital/3811600610/sizes/m/in/photostream/"><img class="alignright" src="http://farm3.static.flickr.com/2566/3811600610_86c6a7aef5.jpg" alt="" width="173" height="500" /></a> <a href="http://www.nytimes.com/2011/04/25/business/25privacy.html">Maine, New Hampshire and Vermont want to pass laws to prevent pharmacies from selling prescription data to drug companies</a>, who in turn use it for &#8220;targeted marketing to doctors&#8221; or &#8220;tailoring their products to better meet the needs of health practitioners&#8221; (depending on who you talk to).</p>
<p>This gets at the heart of the issue of imbalance between private and public sectors when it comes to access to sensitive information.</p>
<p>From our perspective, it doesn&#8217;t seem like a good idea to limit data usage. If the drug companies are smart, they&#8217;re also using the same data to figure out things like what drugs are being prescribed in combination and how that affects the effectiveness of their products.</p>
<p>Instead, we should be thinking of ways to expand access so that for every drug company buying data for marketing and product development, there is an active community of researchers, public advocates and policymakers who have low-cost or free access to the same data.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/04/26/should-pharma-have-access-to-doctors-perscription-records/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/04/26/should-pharma-have-access-to-doctors-perscription-records/</feedburner:origLink></item>
		<item>
		<title>Comments on Richard Thaler “Show Us the Data. (It’s Ours, After All.)” NYT 4/23/11</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/4USCV0U5v3k/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/04/26/comments-on-richard-thaler-show-us-the-data-it%e2%80%99s-ours-after-all-nyt-42311/#comments</comments>
		<pubDate>Tue, 26 Apr 2011 13:15:00 +0000</pubDate>
		<dc:creator>Alex Selkirk</dc:creator>
				<category><![CDATA[Interesting Uses of Data]]></category>
		<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[Access to Information]]></category>
		<category><![CDATA[Data Collection]]></category>
		<category><![CDATA[data portability]]></category>
		<category><![CDATA[In The News]]></category>
		<category><![CDATA[open data]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Privacy Policies]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2737</guid>
		<description><![CDATA[Professor Richard Thaler of the University of Chicago wrote a piece in the New York Times this weekend with an idea that is dear to CDP&#8217;s mission: making data available to the individuals it was collected from. Particularly because the title of the piece suggests that he is saying exactly what we are [...]]]></description>
			<content:encoded><![CDATA[<p><a title="Professor Richard Thaler of the University of Chicago" href="http://faculty.chicagobooth.edu/richard.thaler/research/">Professor Richard Thaler</a> of the University of Chicago wrote <a title="Richard Thaler &quot;Show Us the Data. (It’s Ours, After All.)&quot;" href="http://www.nytimes.com/2011/04/24/business/24view.html?_r=2&amp;scp=1&amp;sq=%22richard%20h.%20thaler%22&amp;st=cse">a piece in the New York Times</a> this weekend with an idea that is dear to CDP&#8217;s mission: making data available to the individuals it was collected from.</p>
<p>Particularly because the title of the piece suggests that he is saying exactly what we are saying, I wanted to write a few quick comments to clarify how his argument differs from ours.</p>
<p>1.	It’s great that he’s saying loudly and clearly that the payback for data collection should be the data itself &#8211; that’s definitely a key point we’re trying to make with CDP, and not enough people realize how valuable that data is to individuals, and more generally, to the public.</p>
<p>2.	However, what Professor Thaler is pushing for is more along the lines of “<a title="DataPortability.org" href="http://dataportability.org/">data portability</a>”, an idea that we agree with at an ethical and moral level but that has some real practical limitations when we start talking about implementation. In my experience, data structures change so rapidly that companies are unable to keep up with how their data is evolving month-to-month. I find it hard to imagine that entire industries could coordinate a standard that could hold together for very long without undermining the very qualities that make data-driven services powerful and innovative.</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/lego.png"><img class="alignleft size-full wp-image-2751" title="lego" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/lego.png" alt="" width="548" height="229" /></a></p>
<p>3.	I&#8217;m also not sure why Professor Thaler says that the <a href="http://kerry.senate.gov/imo/media/doc/Commercial%20Privacy%20Bill%20of%20Rights%20Text.pdf">Kerry-McCain Commercial Privacy Bill of Rights Act of 2011</a> doesn&#8217;t cover this issue. My reading of the bill is that it’s covered in the general sense of access to your information – Section 202(4) reads:</p>
<blockquote><p>to provide any individual to whom the personally identifiable information that is covered information [covered information is essentially anything that is tied to your identity] pertains, and which the covered entity or its service provider stores, appropriate and reasonable-</p>
<p>(A) access to such information; and</p>
<p>(B) mechanisms to correct such information to improve the accuracy of such information;</p></blockquote>
<p>Perhaps he is simply pointing out the lack of any mention of instituting data standards to enable portability, as opposed to simply instituting standards around data transparency.</p>
<p>I have a long post about the bill that is not quite ready to publish. The bill does have a lot of issues, but I didn&#8217;t think this was one of them.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/04/26/comments-on-richard-thaler-show-us-the-data-it%e2%80%99s-ours-after-all-nyt-42311/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/04/26/comments-on-richard-thaler-show-us-the-data-it%e2%80%99s-ours-after-all-nyt-42311/</feedburner:origLink></item>
		<item>
		<title>Response to: “A New Internet Privacy Law?” (New York Times – Opinion, March 18)</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/3WzyKNqRzCk/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/04/06/response-to-a-new-internet-privacy-law-new-york-times-opinion/#comments</comments>
		<pubDate>Wed, 06 Apr 2011 12:19:26 +0000</pubDate>
		<dc:creator>Alex Selkirk</dc:creator>
				<category><![CDATA[Protecting Privacy in Meaningful Ways]]></category>
		<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[Access to Information]]></category>
		<category><![CDATA[anonymization]]></category>
		<category><![CDATA[In The News]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2644</guid>
		<description><![CDATA[There has been scant detailed coverage of the current discussions in Congress around an online privacy bill. The Wall Street Journal has published several pieces on it in their &#8220;What They Know&#8221; section but I&#8217;ve had a hard time finding anything that actually details the substance of the proposed legislation. There are mentions of Internet [...]]]></description>
			<content:encoded><![CDATA[<p>There has been scant detailed coverage of the current discussions in Congress around an online privacy bill. The Wall Street Journal has <a href="http://blogs.wsj.com/digits/2011/02/11/lawmaker-introduces-new-privacy-bill/">published</a> <a href="http://online.wsj.com/article/SB10001424052748704629104576190911145462284.html">several</a> <a href="http://online.wsj.com/article/SB10001424052748704662604576202971768984598.html">pieces</a> on it in their &#8220;<a title="Wall Street Journal: What They Know" href="http://online.wsj.com/public/page/what-they-know-digital-privacy.html?mod=WSJ_topnav_tech">What They Know</a>&#8221; section but I&#8217;ve had a hard time finding anything that actually details the substance of the proposed legislation. There are mentions of Internet Explorer 9&#8217;s <a href="http://windows.microsoft.com/en-US/internet-explorer/products/ie-9/features/tracking-protection">Tracking Protection Lists</a>, and Firefox&#8217;s <a href="http://www.mozilla.com/en-US/firefox/features/#advancedsecurity">&#8220;Do Not Track&#8221; functionality</a>, but little else.</p>
<p><a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/dam.png"><img class="alignleft size-full wp-image-2651" title="Spring a Leak" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2011/04/dam.png" alt="" width="320" height="273" /></a></p>
<p>Not surprisingly, we&#8217;re generally feeling like legislators are barking up the wrong tree by pushing to limit rather than expand legitimate uses of data in hard-to-enforce ways (e.g. &#8220;Do Not Track,&#8221; data deletion) without actually providing standards and guidance where government regulation could be truly useful and effective (e.g. providing a technical definition of &#8220;anonymous&#8221; for the industry and standardizing &#8220;privacy risk&#8221; accounting methods).</p>
<p>Last but not least, we&#8217;re dismayed that no one seems to be worried about <a href="http://blog.myplaceinthecrowd.org/2011/04/04/whitepaper-2-0-a-moral-and-practical-argument-for-public-access-to-private-data/">the lack of public access to all this data</a>.</p>
<p>We sent the following letter to the editor of the New York Times on March 23, 2011, in response to the first appearance of the issue in their pages &#8211; an opinion piece titled &#8220;<a href="http://www.nytimes.com/2011/03/19/opinion/19sat2.html">A New Internet Privacy Law</a>,&#8221; published on March 18, 2011.</p>
<p>&nbsp;</p>
<blockquote><p>While it is heartening to see Washington finally paying attention to online privacy, the new regulations appear to miss the point.</p>
<p>What&#8217;s needed is more data, more creative re-uses of data and more public access to data.</p>
<p>Instead, current proposals are headed in the direction of unenforceable regulations that hope to limit data collection and use.</p>
<p>So, what *should* regulators care about?</p>
<p>1. Much valuable data analysis can and should be done without identifying individuals. However, there is as yet, no widely accepted technical definition of &#8220;anonymous.&#8221; As a result, data is bought, sold and shared with &#8220;third-parties&#8221; with wildly varying degrees of privacy protection. Regulation can help standardize anonymization techniques which would create a freer, safer market for data-sharing.</p>
<p>2. The data stockpiles being amassed in the private sector have enormous value to the public, yet we have little to no access to it. Lawmakers should explore ways to encourage or require companies to donate data to the public.</p>
<p>The future will be about making better decisions with data, and the public is losing out.</p>
<p>Alex Selkirk<br />
The Common Data Project &#8211; Working towards a public trust of sensitive data<br />
<a title="The Common Data Project" href="http://commondataproject.org">http://commondataproject.org</a></p>
<p>&nbsp;</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/04/06/response-to-a-new-internet-privacy-law-new-york-times-opinion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/04/06/response-to-a-new-internet-privacy-law-new-york-times-opinion/</feedburner:origLink></item>
		<item>
		<title>Whitepaper 2.0: A moral and practical argument for public access to private data.</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/vW16XxBKb58/</link>
		<comments>http://blog.myplaceinthecrowd.org/2011/04/04/whitepaper-2-0-a-moral-and-practical-argument-for-public-access-to-private-data/#comments</comments>
		<pubDate>Mon, 04 Apr 2011 08:45:29 +0000</pubDate>
		<dc:creator>The Common Data Project</dc:creator>
				<category><![CDATA[Best Practices]]></category>
		<category><![CDATA[Building the Datatrust]]></category>
		<category><![CDATA[CDP Announcements]]></category>
		<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[Access to Information]]></category>
		<category><![CDATA[Datatrust]]></category>
		<category><![CDATA[Governance]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2619</guid>
		<description><![CDATA[It&#8217;s here! The Common Data Project&#8217;s White Paper version 2.0. This is our most comprehensive moral and practical argument to date for the creation of a public datatrust that provides public access to today&#8217;s growing store of sensitive personal information. At this point, there can be no doubt that sensitive personal data, in aggregate, is [...]]]></description>
			<content:encoded><![CDATA[<p>It&#8217;s here! <a href="http://commondataproject.org/docs/whitepaper.pdf">The Common Data Project&#8217;s White Paper version 2.0</a>.</p>
<p>This is our most comprehensive moral and practical argument to date for the creation of a public datatrust that provides public access to today&#8217;s growing store of sensitive personal information.</p>
<blockquote><p>At this point, there can be no doubt that sensitive personal data, in aggregate, is and will continue to be an invaluable resource for commerce and society. However, today, the private sector holds a near monopoly on such data. We believe that it is time <strong>We, The People</strong> gain access to our own data; access that will enable researchers, policymakers and NGOs acting in the public interest to make decisions in the same data-informed ways businesses have for decades.</p></blockquote>
<blockquote><p>Access to sensitive personal information will be the next &#8220;Digital Divide&#8221; and our work is perhaps best described as an effort to bridge that gap.</p></blockquote>
<p>Still, we recognize that there are many hurdles to overcome. Currently, highly valuable data, from online behavioral data to personal financial and medical records, is siloed and, in the name of privacy, inaccessible. Valuable data is kept out of the reach of the public and is in many cases unavailable even to the businesses, organizations and government agencies that collect the data in the first place. Many of these data holders have business reasons or public mandates to share the data they have, but either can&#8217;t or can do so only in a severely limited manner and through a time-consuming process.</p>
<p>We believe there are technological and policy solutions that can remedy this situation and our white paper attempts to sketch out these solutions in the form of a &#8220;datatrust.&#8221;</p>
<p>We set out to answer the major questions and open issues that challenge the viability of the datatrust idea.</p>
<ol>
<li>Is public access to sensitive personal information really necessary?</li>
<li>If it is, why isn&#8217;t this already a solved problem?</li>
<li>How can you open up sensitive data to the public without harming the individuals represented in that data?</li>
<li>How can any organization be trusted to hold such sensitive data?</li>
<li>Assuming this is possible and there is public will to pull it off, will such data be useful?</li>
<li>All existing anonymization methodologies degrade the utility of data. How will the datatrust strike a balance between utility and privacy?</li>
<li>How will the data be collated, managed and curated into a usable form?</li>
<li>How will the quality of the data be evaluated and maintained?</li>
<li>Who has a stake in the datatrust?</li>
<li>The datatrust&#8217;s purported mission is to serve the interests of society. Will you and I, as members of society, have a say in how the datatrust is run?</li>
</ol>
<p>You can read <a href="http://commondataproject.org/docs/whitepaper.pdf">the full paper here</a>.</p>
<p>Comments, reactions and feedback are all welcome. You can post your thoughts here or write us directly at <em>info at commondataproject dot org</em>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2011/04/04/whitepaper-2-0-a-moral-and-practical-argument-for-public-access-to-private-data/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2011/04/04/whitepaper-2-0-a-moral-and-practical-argument-for-public-access-to-private-data/</feedburner:origLink></item>
		<item>
		<title>In The Mix…predicting the future; releasing healthcare claims; and $1.5 million awarded to data privacy</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/04mfYGZtt24/</link>
		<comments>http://blog.myplaceinthecrowd.org/2010/11/30/in-the-mixpredicting-the-future-releasing-healthcare-claims-and-1-5-millions-awarded-to-data-privacy/#comments</comments>
		<pubDate>Tue, 30 Nov 2010 23:40:59 +0000</pubDate>
		<dc:creator>Becky Pezely</dc:creator>
				<category><![CDATA[For Nerds and Geeks]]></category>
		<category><![CDATA[Protecting Privacy in Meaningful Ways]]></category>
		<category><![CDATA[Public Policy]]></category>
		<category><![CDATA[anonymization]]></category>
		<category><![CDATA[in the mix]]></category>
		<category><![CDATA[Privacy]]></category>
		<category><![CDATA[Search Engines]]></category>
		<category><![CDATA[Visualizing Data]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2598</guid>
		<description><![CDATA[Some people out there think they can predict the future by scraping content off the web. Does it work simply because web 2.0 technologies are great at creating echo chambers? Is this just another way of amplifying that echo chamber and generating yet more self-fulfilling trend prophecies? See the Future with a Search (MIT Technology [...]]]></description>
			<content:encoded><![CDATA[<p>Some people out there think they can predict the future by scraping content off the web. Does it work simply because web 2.0 technologies are great at creating echo chambers? Is this just another way of amplifying that echo chamber and generating yet more self-fulfilling trend prophecies? <a href="http://www.technologyreview.com/computing/26452/page1/?a=f" target="_blank">See the Future with a Search</a> (<a href="http://www.technologyreview.com/" target="_blank">MIT Technology Review</a>)</p>
<p>The U.S. Office of Personnel Management wants to create a huge database that contains the healthcare claims of millions of people. Many are concerned about how the data will be protected and used. <a href="http://www.computerworld.com/s/article/9195493/More_federal_health_database_details_coming_following_privacy_alarm" target="_blank">More federal health database details coming following privacy alarm</a> (<a href="http://www.computerworld.com/" target="_blank">Computerworld</a>)</p>
<p>Researchers at Purdue were awarded $1.5 million to investigate how well current techniques for anonymizing data are working and whether there&#8217;s a need for better methods. It would be interesting to know what they think of <a href="http://blog.myplaceinthecrowd.org/tag/differential-privacy/" target="_blank">differential privacy</a>. They appear to be actually doing the dirty work of figuring out whether theoretical re-identification is more than just a theory. <a href="http://threatpost.com/en_us/blogs/national-science-foundation-funds-purdue-data-anonymization-project-110210">National Science Foundation Funds Purdue Data-Anonymization Project</a> (<a href="http://threatpost.com/" target="_blank">Threatpost</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2010/11/30/in-the-mixpredicting-the-future-releasing-healthcare-claims-and-1-5-millions-awarded-to-data-privacy/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2010/11/30/in-the-mixpredicting-the-future-releasing-healthcare-claims-and-1-5-millions-awarded-to-data-privacy/</feedburner:origLink></item>
		<item>
		<title>@IAPP Privacy Foo Camp 2010: What Is Anonymous Enough?</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/6L0BbtHx-fM/</link>
		<comments>http://blog.myplaceinthecrowd.org/2010/10/26/iapp-privacy-foo-camp-2010-what-is-anonymous-enough/#comments</comments>
		<pubDate>Tue, 26 Oct 2010 18:40:41 +0000</pubDate>
		<dc:creator>Becky Pezely</dc:creator>
				<category><![CDATA[Best Practices]]></category>
		<category><![CDATA[Protecting Privacy in Meaningful Ways]]></category>
		<category><![CDATA[anonymization]]></category>
		<category><![CDATA[Conference]]></category>
		<category><![CDATA[Data Mining]]></category>
		<category><![CDATA[IAPP]]></category>
		<category><![CDATA[Privacy]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2581</guid>
		<description><![CDATA[Editor’s Note: Becky Pezely is an independent contractor for Shan Gao Ma, a consulting company started by Alex Selkirk, President of the Board of the Common Data Project.  Becky’s work, like Tony&#8217;s, touches on many of the privacy challenges that CDP hopes to address with the datatrust.  We’re happy to have her guest blogging about [...]]]></description>
			<content:encoded><![CDATA[<p><em>Editor’s Note: Becky Pezely is an independent contractor for <a href="http://shangaoma.com/">Shan Gao Ma</a>, a consulting company started by Alex Selkirk, President of the Board of the Common Data Project.  Becky’s work, like <a href="http://blog.myplaceinthecrowd.org/2010/01/07/pinq-privacy-demo/">Tony&#8217;s</a>, touches on many of the privacy challenges that CDP hopes to address with the <a href="http://blog.myplaceinthecrowd.org/2010/06/29/a-big-update-for-the-common-data-project/">datatrust</a>.  We’re happy to have her guest blogging about IAPP Academy 2010 here. </em></p>
<p>Several weeks ago we attended the <a href="https://www.privacyassociation.org/" target="_blank">2010 Global Privacy Summit (IAPP 2010)</a> in Baltimore, Maryland.   </p>
<p>In addition to some engaging high-profile keynotes – including FTC <a href="http://www.ftc.gov/bcp/about.shtm" target="_blank">Bureau of Consumer Protection</a> Director <a href="http://en.wikipedia.org/wiki/David_Vladeck" target="_blank">David Vladeck</a> – we got to participate in the <em>first ever</em> IAPP <a href="http://en.wikipedia.org/wiki/Foo_Camp" target="_blank">Foo Camp</a>. </p>
<p>The Foo Camp comprised four discussion topics aimed at covering the top technology concerns facing a wide range of privacy professionals.</p>
<p>The session we ran was titled “Low Impact Data Mining”.  The intention was to discuss, and better understand, the current challenges in managing data within an organization, with a focus on managing data in a way that is “low impact” on resources while returning “high (positive) impact” on the business.</p>
<p>The individuals in our group represented a vast array of industries including: financial services, insurance, pharmaceutical, law enforcement, online marketing, health care, retail and telecommunications.  It was fascinating that, even across such a wide range of industries, there was such a pervasive set of common privacy challenges.</p>
<p>Starting with:</p>
<h2>What is “anonymous enough”?</h2>
<p><a href="http://www.eff.org/deeplinks/2009/09/what-information-personally-identifiable" target="_blank">If all you need is gender, zip code and birthdate to re-identify someone</a> then what data, when released, is truly “anonymous enough”?  Can a baseline be defined, and enforced, within our organization that ensures customer protection?</p>
<p>It feels safe to say that this was the root challenge from which all the others stemmed.  Today the release of data is mostly controlled, and subsequently managed, by one or more trusted people. These individuals are responsible for “sanitizing” the data that gets released inside or outside the organization.  They are charged with managing the release of data for purposes ranging from understanding business performance to fulfilling business obligations with partners.  Their primary concern is to know how well they are protecting their customers’ information, not only from the perspective of company policy, but also from the perspective of personal morals. They are the gatekeepers who assess the level of protection provided, based on which data they release to whom, and they want some guarantee that what they release is “anonymous enough” to achieve the level of protection they intend.  These gatekeepers want to know when the data they release is “anonymous enough” and how they can employ a definition, or measurement, that guarantees the right level of anonymity for their customers.</p>
<p>This challenge compounds for these individuals, and their organizations, when adding in various other truths of the nature of data today:</p>
<h2>The silos are getting joined.</h2>
<p>The convention used to be that data within an organization sat in a silo – all on its own and protected – such that anyone looking at the data would only see <em>that</em> set of data.  Now, these data sets are increasingly getting joined, and it’s not always known where, when, how or with whom the join originated. Nor is it known where the joined data set is currently stored, since it was modified from its original silo.  Soon that joined data set takes on a life of its own and makes its way around the institution.  Given the likelihood of this occurring, how can the gatekeepers of the data, who are responsible for assessing the level of protection provided to customers, do so with any kind of reliable measurement that guarantees the right level of anonymity?</p>
<h2>And now there’s data in the public market.</h2>
<p>Not only is the data joined with data (from other silos) within the organization, but also with data from outside the organization that is sold in the public market.  This prospect has increased the ability of organizations to produce data that is “high impact” for the business – because they now know WAY MORE about their customers.  But does the benefit outweigh the liability? As the ability to know more about individual customers increases, so do the level of sensitivity and the concern for privacy.  How do organizations successfully navigate mounting privacy concerns as they move from silos, to joined silos, to joined silos combined with public data?</p>
<h3>The line between “data analytics” and looking at “raw data” is blurring.</h3>
<p>Because the data is richer, and more plentiful, the act of data analysis isn’t as benign as it might once have been.  The definition of “data analytics” has evolved from something high-level (to know, for example, how many new customers are using the service this quarter) to something that  looks a lot more like looking at raw data to target specific parts of their business to specific customers (to, for example, sell &lt;these products&gt; to customers that make &lt;this much money&gt;, are females ages 30 – 35 and live in &lt;this neighborhood&gt; and typically spend &lt;this much&gt; on &lt;these types of products&gt;, etc…).</p>
<h3>And the data has different ways of exiting the system.</h3>
<p>The truth is, as scary as this data can be, everyone wants to get their hands on it, because the data leads to awareness that is meaningful and valuable for the business.  Thus, the data is shared everywhere – inside and outside the organization.  With that fact, a whole set of challenges emerges when considering all the ways data might be exiting any given “silo”, such as: Where is all the data going?  How is it getting modified (joined, sanitized, rejoined) and at which point is it no longer the data that needs to be protected by the organization? How much data needs to be released externally to fulfill partner/customer business obligations? Once the data has exited, can the organization’s privacy practices still be enforced?</p>
<h3>Brand affects privacy policy.  Privacy policy affects brand.</h3>
<p>Privacy is a concern of the whole business, not just the resources that manage the data, nor solely the resources that manage liability.  In the event of a “big oopsie” where there is a data/privacy breach, it will be the communication with customers before, during and after the incident that determines the internal and external impact on the brand and the perception of the organization.  And that communication is dictated by both what the privacy policy enforces and what the brand “allows”.  In today’s age of data, how can an organization have an open dialog with customers about their data if the brand does not support having that kind of a conversation?  No surprise that Facebook is the exemplary case for this: Facebook continues to pave a new path, drawing customers to share and disclose more information about themselves.  As a result, they have experienced backlash from customers when they take it too far. The line of communication is very open – customers have a clear way to lash back when Facebook has gone too far, and Facebook has a way of visibly standing behind their decision or admitting their mistake.  Either way, it is now commonplace for Facebook’s customers to expect that there will be more incidents like this and that Facebook has a way (apparently suitable enough to keep most customers) of dealing with it.  Their “policy” allowed them to respond this way, and now it’s become a part of who Facebook is.  And now the policy evolves to support this behavior moving forward.</p>
<p>In the discussion of data and privacy, it seems inherently obvious that the mountain of challenges we face is large, complicated and impacts the core of all our businesses.  Nonetheless, it is still fascinating to have been able to witness first-hand – and to now be able to specifically articulate &#8211; how similar the challenges are across a diverse group of businesses and how similar the concerns are across job-function. </p>
<div class="simplePullQuote">We want to re-thank everyone from IAPP that joined in on the discussions that we had at Foo Camp and throughout the conference.  We look forward to an opportunity to deep dive into these types of problems.</div>
<p><strong>Post Script:</strong> Meanwhile, the challenges, and related questions, around the <a href="http://blog.myplaceinthecrowd.org/tag/anonymization/" target="_blank">anonymization of data</a> with some kind of <a href="http://blog.myplaceinthecrowd.org/tag/privacy-guarantee/" target="_blank">measurable privacy guarantee</a> that came up at Foo Camp are ones that we have been discussing on our blog for quite some time.  These are precisely the sorts of challenges that have motivated us to create a <a href="http://blog.myplaceinthecrowd.org/tag/datatrust/" target="_blank">datatrust</a>.  While we typically envision the datatrust being used in scenarios where there isn’t direct access to data, we walked away with specific examples from our discussions at IAPP Foo Camp where direct access to the data is required – particularly to fulfill business obligations – as a type of collateral (or currency). </p>
<p>The concept of data as the new currency of today’s economy has emerged.  Not only did it come up at the IAPP Foo Camp, it also came up back in August where <a href="http://brianrowe.org/2010/08/18/pii-keynote-marc-davis-of-microsoft/" target="_blank">we heard Marc Davis talk about this at IPP 2010</a>. With all of this in mind, it is interesting to evaluate the possibility of the datatrust being able to act as a special type of data broker in these exchanges.  The idea is that the datatrust is a sanctioned data broker (by the industry, or possibly by the government) that inherently meets federal, local and municipal regulations and protects the consumers of business partners who want to exchange data as “currency,” while relieving businesses and their partners of the headaches of managing data use/reuse.  The “tax” on using the service is that these aggregates are stored and made available to the public to query in the way we imagine (no direct access to the data) for policy-making and research.  This is something that feels compelling to us and will influence our thinking as we continue to move forward with our work.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2010/10/26/iapp-privacy-foo-camp-2010-what-is-anonymous-enough/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2010/10/26/iapp-privacy-foo-camp-2010-what-is-anonymous-enough/</feedburner:origLink></item>
		<item>
		<title>Common Data Project at IAPP Privacy Academy 2010</title>
		<link>http://feedproxy.google.com/~r/MyPlaceInTheCrowd/~3/TF1O82PJt7o/</link>
		<comments>http://blog.myplaceinthecrowd.org/2010/09/13/common-data-project-at-iapp-privacy-academy-2010/#comments</comments>
		<pubDate>Mon, 13 Sep 2010 14:00:02 +0000</pubDate>
		<dc:creator>Alex Selkirk</dc:creator>
				<category><![CDATA[Best Practices]]></category>
		<category><![CDATA[CDP Announcements]]></category>
		<category><![CDATA[Protecting Privacy in Meaningful Ways]]></category>
		<category><![CDATA[IAPP]]></category>

		<guid isPermaLink="false">http://blog.myplaceinthecrowd.org/?p=2541</guid>
		<description><![CDATA[Below is a preview of our slides and handout for the conference. Unlike our previous presentations, we won&#8217;t be talking about CDP and the Datatrust at all. Instead, we&#8217;ll be focused on presenting on how SGM helps companies minimize the privacy impact of their data-mining. More specifically, we&#8217;ll be stepping through the symbiotic documentation system [...]]]></description>
			<content:encoded><![CDATA[<div class="simplePullQuote">We will be giving a Lightning Talk on &#8220;Low-Impact Data-Mining&#8221; and running two breakout sessions at the <a href="https://www.privacyassociation.org/events_and_programs/iapp_privacy_academy/preconference_workshops1/">IT Privacy Foo Camp &#8211; Preconference Session</a>, Wednesday Sept 29.</div>
<p>Below is a preview of our <a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2010/09/IAPP2010-Low-Impact_Data-Mining.pdf">slides</a> and handout for the conference. Unlike our previous presentations, we won&#8217;t be talking about CDP and the Datatrust at all. Instead, we&#8217;ll be focused on presenting on how <a href="http://shangaoma.com/">SGM</a> helps companies minimize the privacy impact of their data-mining.</p>
<p>More specifically, we&#8217;ll be stepping through the symbiotic documentation system we&#8217;ve created between the <strong>product development/data science</strong> folks collecting and making use of the data and the <strong>privacy/legal</strong> folks trying to regulate and monitor compliance with privacy policies. We will be using the <a href="http://shangaoma.com/jobs#ddtool">SGM Data Dictionary</a> as a case study in the breakout sessions.</p>
<p>Still, we expect that many of the issues we&#8217;ve been grappling with from the datatrust perspective (e.g. public perception, trust, ownership of data, meaningful privacy guarantees) will come up, as they are universal issues that are central to any meaningful discussion about privacy today.<br />
<a href="http://blog.myplaceinthecrowd.org/wp-content/uploads/2010/09/IAPP2010-Low-Impact_Data-Mining.pdf"><img class="alignleft size-full wp-image-2556" title="Symbiotic Documentation System" src="http://blog.myplaceinthecrowd.org/wp-content/uploads/2010/09/Slide2.png" alt="" width="600" /></a></p>
<hr />
<h2>Handout</h2>
<h3>What is data science?<br />
</h3>
<p><a href="http://radar.oreilly.com/2010/06/what-is-data-science.html">An introduction to data-mining from O&#8217;Reilly Radar</a> that provides a good explanation of how data-mining is distinct from previous uses of data and provides plenty of examples of how data-mining is changing products and services today.</p>
<h3>The &#8220;Anonymous&#8221; Promise and De-identification</h3>
<ol>
<li>How you can be re-identified: <a href="http://www.eff.org/deeplinks/2009/09/what-information-personally-identifiable">Zip code + Birth date + Gender = Identity<br />
</a></li>
<li>Promising new technologies for anonymization: <a href="http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1450006"><em><strong>Broken Promises of Privacy: Responding to the Surprising Failure of Anonymization</strong></em><strong> </strong> by Paul Ohm.</a></li>
</ol>
<h3>Differential Privacy: A Programmatic Way to Enforce Your Privacy Guarantee?</h3>
<ol>
<li>A Microsoft Research Implementation: <a href="http://research.microsoft.com/en-us/projects/pinq/default.aspx">PINQ</a></li>
<li><a href="http://blog.myplaceinthecrowd.org/2009/08/28/pinq-programmatic-privacy/">CDP&#8217;s write-up about PINQ.<br />
</a></li>
<li>A deeper look at <a href="http://blog.myplaceinthecrowd.org/2010/05/26/recap-and-proposal-955-the-statistically-insignificant-privacy-guarantee/">how differential privacy&#8217;s mathematical guarantee might translate into layman&#8217;s terms</a>.</li>
</ol>
<h3>Paradigms of Data Ownership: Individuals vs Companies</h3>
<ol>
<li><a href="http://archive.nyu.edu/handle/2451/14257"><strong><em>Markets and Privacy</em></strong> by Kenneth C. Laudon<br />
</a></li>
<li><a href="http://www.englishdiscourse.org/lessig.html"><strong><em>Privacy as Property</em></strong> by Lawrence Lessig</a></li>
<li>CDP explores the advantages and challenges of a <a href="http://commondataproject.org/paper-licenses-intro">&#8220;Creative Commons-style&#8221; model for licensing personal information</a>.</li>
<li><a href="http://commondataproject.org/paper-policies-intro">CDP&#8217;s Guide to How to Read a Privacy Policy</a></li>
</ol>
]]></content:encoded>
			<wfw:commentRss>http://blog.myplaceinthecrowd.org/2010/09/13/common-data-project-at-iapp-privacy-academy-2010/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.myplaceinthecrowd.org/2010/09/13/common-data-project-at-iapp-privacy-academy-2010/</feedburner:origLink></item>
	</channel>
</rss>

