<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:on="http://www.oreillynet.com/csrss/" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
<title>O'Reilly Radar - Insight, analysis, and research about emerging technologies.</title>
<link rel="alternate" type="text/html" href="http://radar.oreilly.com/" />

<id>tag:radar.oreilly.com,2010-08-31://57</id>
<updated>2012-02-10T00:20:20Z</updated>
<subtitle>http://radar.oreilly.com/</subtitle>
<generator uri="http://www.sixapart.com/movabletype/">Movable Type Pro 4.21-en</generator>

<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/oreilly/radar/atom" /><feedburner:info uri="oreilly/radar/atom" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><geo:lat>38.393314</geo:lat><geo:long>-122.836667</geo:long><feedburner:emailServiceId>oreilly/radar/atom</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><entry>
<title>Jury to Eolas: Nobody owns the interactive web</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/QwONzjQNDPw/eolas-interactive-web-patent.html" />
<id>tag:radar.oreilly.com,2012://57.47810</id>

<published>2012-02-10T00:20:20Z</published>
<updated>2012-02-10T00:20:20Z</updated>

<summary type="html">A Texas jury has struck down a company's claim to ownership of the interactive web.  Eolas, which has been suing technology companies for more than a decade, now faces the prospect of losing the patents.</summary>
<author>
<name>Alex Howard</name>
<uri>http://radar.oreilly.com/alexh</uri>
</author>

<category term="Programming" scheme="http://www.sixapart.com/ns/types#category" />

<category term="Web 2.0" scheme="http://www.sixapart.com/ns/types#category" />

<category term="eolas" label="Eolas" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="intellectualproperty" label="intellectual property" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="patents" label="patents" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="webbrowser" label="Web browser" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="worldwideweb" label="World Wide Web" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;As Joe Mullin reported at Wired earlier tonight, a &lt;a href="http://www.wired.com/threatlevel/2012/02/interactive-web-patent/"&gt;Texas jury has struck down a company's claim to own the interactive web&lt;/a&gt;.  The decision in this case comes after more than a decade of legal wrangling that has drawn in some of the biggest technology companies and retailers in the world.  As Timothy Lee &lt;a href="http://arstechnica.com/tech-policy/news/2012/02/jury-rules-that-eolass-interactive-web-patent-is-invalid.ars"&gt;observed&lt;/a&gt; at Ars Technica, Eolas, "a patent troll that has been shaking down technology companies for the better part of a decade, now faces the prospect of losing the patent."&lt;/p&gt;

&lt;p&gt;It's a rare reversal of two software patents (&lt;a href="http://www.google.com/patents?id=-gnJAAAAEBAJ"&gt;7,599,985&lt;/a&gt; and &lt;a href="http://www.google.com/patents?id=kKAZAAAAEBAJ"&gt;5,838,906&lt;/a&gt;) that shouldn't have been granted in the first place. It's also an important victory for the open Internet.&lt;/p&gt;

&lt;p align="center"&gt;
&lt;blockquote class="twitter-tweet"&gt;&lt;p&gt;Big news for open web. Eolas patents '906 and '985 ruled invalid today by a jury in Tyler TX. Early web dev (viola) presented as prior art.&lt;/p&gt;&amp;mdash; Dale Dougherty (@dalepd) &lt;a href="https://twitter.com/dalepd/status/167730358471770112" data-datetime="2012-02-09T22:03:28+00:00"&gt;February 9, 2012&lt;/a&gt;&lt;/blockquote&gt;
&lt;script src="//platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;As a result of the decision, the eight companies that were resisting the patent lawsuits won't have to pay anything to Eolas. If Google, YouTube, Yahoo, Amazon, Adobe, JC Penney, CDW Corp., and Staples had lost the patent infringement suit, they would have been subject to more than $600 million in damages. &lt;/p&gt;

&lt;p&gt;The Eolas patent case represents one of the most infamous claims to ownership of the commons that grew up in universities, garages and labs in the early 1990s. &lt;/p&gt;

&lt;p&gt;Here's a quick summary of the history: the '906 patent was applied for in 1994 and granted to Eolas in &lt;a href="http://www.google.com/patents?id=kKAZAAAAEBAJ"&gt;1998&lt;/a&gt;. Eolas &lt;a href="http://www.techdirt.com/search.php?cx=partner-pub-4050006937094082%3Acx0qff-dnm1&amp;cof=FORID%3A9&amp;ie=ISO-8859-1&amp;q=eolas"&gt;sued Microsoft&lt;/a&gt; in 1999. Microsoft lost that trial and settled with Eolas. The World Wide Web Consortium (&lt;a href="http://www.w3.org/"&gt;W3C&lt;/a&gt;) and Microsoft both petitioned the U.S. Patent Office to reconsider the patent. The Patent Office upheld it both times.&lt;/p&gt;

&lt;p&gt;The Eolas patent covers an "embedded application" in a browser, a broad description of a function that was typical of client-server systems of the time. Eolas founder Michael Doyle then used the patent to make a broad claim about the invention of interactivity on the web, based upon a medical imaging application that enabled a user to manipulate images in a web browser with computation occurring in the background on a server.&lt;/p&gt;

&lt;p&gt;The case appears to have turned on the demonstration of prior art by the defense. A computer science student at the University of California at Berkeley, Pei-Yuan Wei, testified during the trial that he had conceived of making &lt;a href="http://www.wired.com/threatlevel/2012/02/tim-berners-lee-patent/"&gt;interactive web features&lt;/a&gt; as early as 1991, including the creation of the &lt;a href="http://en.wikipedia.org/wiki/ViolaWWW"&gt;Viola Web browser&lt;/a&gt;. Viola, first released in April 1992, was the first web browser with inline graphics, scripting, tables and a stylesheet.  The web browser was in &lt;a href="http://radar.oreilly.com/2007/07/microsoft-reaches-settlement-o.html"&gt;development&lt;/a&gt; at O'Reilly in 1992-1994. Another UC Berkeley student, Scott Silvey, testified that he had demonstrated such features to engineers at Sun Microsystems in 1993. &lt;/p&gt;

&lt;p&gt;That testimony, combined with that of web pioneers like Eric Bina, the co-founder of Netscape; Dave Raggett, who invented the HTML "embed" tag; and Tim Berners-Lee, the inventor of the World Wide Web, was enough to convince this jury.&lt;/p&gt;

&lt;p&gt; "It was ahead of its time," &lt;a href="http://www.wired.com/threatlevel/2012/02/tim-berners-lee-patent/"&gt;testified Berners-Lee&lt;/a&gt;. "The things Pei was doing would later be done in Java." &lt;/p&gt;

&lt;p&gt;One interesting detail that emerged in the case was that the U.S. Patent Office didn't have access to the Internet in 1994 and was apparently forbidden from going on the Internet in 1997, which would have made research into prior art in cyberspace something of a challenge.&lt;/p&gt;

&lt;p&gt;Patent trolls continue to be a major issue for software companies and the technology industry as a whole in 2012, as an episode of "This American Life" on &lt;a href="http://www.thisamericanlife.org/radio-archives/episode/441/when-patents-attack"&gt;when patents attack&lt;/a&gt; effectively communicated. &lt;/p&gt;

&lt;p&gt;As Mike Masnick points out at TechDirt, while today was &lt;a href="http://www.techdirt.com/articles/20120209/15395117718/web-is-saved-east-texas-jury-says-eolas-patents-are-invalid.shtml"&gt;an important victory&lt;/a&gt; for the networked commons and civil society, Eolas still has a lot of settlement money in hand to pursue an appeal.&lt;/p&gt;

&lt;p&gt;That said, the jury's decision to invalidate Eolas' claims of ownership regarding the basic technology that enables access to the interactive web means the company won't be suing anyone for a while. &lt;/p&gt;

&lt;p align="center"&gt;
&lt;blockquote class="twitter-tweet"&gt;&lt;p&gt;Texas jury agreed Eolas 906 patent invalid. Good thing too!&lt;/p&gt;&amp;mdash; Tim Berners-Lee (@timberners_lee) &lt;a href="https://twitter.com/timberners_lee/status/167724524299759616" data-datetime="2012-02-09T21:40:17+00:00"&gt;February 9, 2012&lt;/a&gt;&lt;/blockquote&gt;
&lt;script src="//platform.twitter.com/widgets.js" charset="utf-8"&gt;&lt;/script&gt;&lt;/p&gt;

&lt;p&gt;Here's to the Open Web.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;a href="http://www.oreillynet.com/onlamp/blog/2003/11/pto_director_orders_reexam_for.html"&gt;PTO Director Orders Re-Exam for '906 Patent&lt;/a&gt; (2003)&lt;/li&gt;
&lt;li&gt; &lt;a href="http://www.oreillynet.com/onlamp/blog/2004/06/butting_heads_over_the_906_reb.html"&gt;Butting Heads Over the '906 Rebuttal&lt;/a&gt; (2004)&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2007/07/microsoft-reaches-settlement-o.html"&gt;Microsoft Reaches Settlement on EOLAS Patent&lt;/a&gt; (2007)&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=QwONzjQNDPw:zvxq3dC2eIc:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=QwONzjQNDPw:zvxq3dC2eIc:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=QwONzjQNDPw:zvxq3dC2eIc:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=QwONzjQNDPw:zvxq3dC2eIc:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=QwONzjQNDPw:zvxq3dC2eIc:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=QwONzjQNDPw:zvxq3dC2eIc:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=QwONzjQNDPw:zvxq3dC2eIc:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/QwONzjQNDPw" height="1" width="1"/&gt;</content>
<dc:source>http://www.oreillynet.com/pub/au/4520</dc:source>
<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/eolas-interactive-web-patent.html</feedburner:origLink></entry>

<entry>
<title>O'Reilly ebooks now optimized for Kindle Fire</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/0DAcEF7Nxc8/kindle-fire-mobi-kf8-oreilly-ebook.html" />
<id>tag:radar.oreilly.com,2012://57.47804</id>

<published>2012-02-09T17:00:00Z</published>
<updated>2012-02-09T17:00:00Z</updated>

<summary type="html">If your O'Reilly ebook bundle includes a Mobi file, you can now download a KF8-compliant file. These updated files take advantage of the Kindle Fire's functionality.</summary>
<author>
<name>Adam Witwer</name>
<uri>http://radar.oreilly.com/adamw</uri>
</author>

<category term="Publishing" scheme="http://www.sixapart.com/ns/types#category" />

<category term="ebookdesign" label="ebook design" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ebookupdates" label="ebook updates" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="kindlefire" label="kindle fire" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="kindleformat8" label="Kindle Format 8" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="publishingecosystem" label="publishing ecosystem" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="publishingformats" label="publishing formats" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="publishingtools" label="publishing tools" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;Earlier this week, we at O'Reilly regenerated all of our ebook-bundle Mobi files, upgrading them to meet the specifications for Amazon's latest ebook format, &lt;a href="http://www.amazon.com/gp/feature.html?docId=1000729511"&gt;KF8&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;These files are now available for download in &lt;a href="https://members.oreilly.com/account/emedia/index"&gt;your account&lt;/a&gt; on oreilly.com. If your ebook bundle includes a Mobi file (and more than 90% of bundles do), you can download the updated, KF8-compliant file now. (Note: All O'Reilly Media files are now available in KF8. Files from partner publishers will follow soon.)&lt;/p&gt;

&lt;p&gt;As always, our ebook bundles are DRM-free. See &lt;a href="http://oreilly.com/ebooks/mobi/"&gt;this page&lt;/a&gt; for instructions on loading O'Reilly Mobi files onto your Kindle.&lt;/p&gt;

&lt;p&gt;We've optimized our Mobi files for Kindle Fire by taking advantage of KF8's support of &lt;a href="http://www.w3.org/TR/css3-mediaqueries/"&gt;@media queries&lt;/a&gt;. While @media queries have been commonplace on the web for some time, they are just now making their way to ebook ecosystems. KF8's support of @media queries allows you to create an ebook that looks and potentially behaves differently based on your reading device. &lt;/p&gt;
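
&lt;p&gt;As a rough sketch of what device-specific styling of this kind might look like, the stylesheet fragment below assumes the &lt;code&gt;amzn-mobi&lt;/code&gt; and &lt;code&gt;amzn-kf8&lt;/code&gt; media types described in Amazon's Kindle publishing guidelines. The class names are hypothetical and are not taken from O'Reilly's actual stylesheets, and the "Ubuntu Mono" rule assumes the font has been embedded elsewhere with an @font-face declaration:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;/* Default rule, read by every renderer. */
pre.example-code { font-family: monospace; }

/* Assumed media type: applied only by the older Mobi renderer. */
@media amzn-mobi {
  pre.example-code { font-size: small; }
  .kf8-only { display: none; }   /* hypothetical class: hide KF8-only content */
}

/* Assumed media type: applied only by KF8 renderers such as Kindle Fire. */
@media amzn-kf8 {
  pre.example-code { font-family: "Ubuntu Mono", monospace; color: #333333; }
  .mobi-only { display: none; }  /* hypothetical class: hide the Mobi fallback */
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With rules along these lines, a single file can carry both a plain fallback and a richer presentation, and each device applies only the block it recognizes.&lt;/p&gt;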

&lt;p&gt;For an example of @media queries in action, see the image below, which shows how the same Mobi file appears on a traditional Kindle (left) versus the new Kindle Fire (right):&lt;/p&gt;

&lt;p class="image-box-580"&gt;&lt;a href="http://radar.oreilly.com/assets_c/2012/02/fireveink.html"&gt;&lt;img src="http://radar.oreilly.com/2012/02/09/1-0212-fireveink-580.png" border="0" alt="Comparison of a Mobi file on a traditional Kindle and the Kindle fire" style="margin-bottom:15px;" width="580" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://radar.oreilly.com/assets_c/2012/02/fireveink.html"&gt;Click to enlarge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Amazon's support for @media queries makes this possible, and O'Reilly is among the first publishers to employ this feature across all of its Kindle content. Here are some of the new features that you can expect to see on your Kindle Fire (enhancements vary by book):&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt; Color images&lt;/li&gt;
  &lt;li&gt; Syntax-highlighted code&lt;/li&gt;
  &lt;li&gt; Improved layout and design with CSS3&lt;/li&gt;
  &lt;li&gt; Embedded &lt;a href="http://font.ubuntu.com/#charset-mono-regular"&gt;code font&lt;/a&gt; for better legibility and glyph support&lt;/li&gt;  
&lt;/ul&gt;

&lt;p&gt;Here are some screenshots from our newly optimized Mobis:&lt;/p&gt;

&lt;p class="image-box-580"&gt;&lt;a href="http://radar.oreilly.com/assets_c/2012/02/make_electronics.html"&gt;&lt;img src="http://radar.oreilly.com/2012/02/09/2-0212-make-electronics-580.png" border="0" alt="Optimized Mobi file from Make Electronics" style="margin-bottom:15px;" width="580" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://radar.oreilly.com/assets_c/2012/02/make_electronics.html"&gt;Click to enlarge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p class="image-box-580"&gt;&lt;a href="http://radar.oreilly.com/assets_c/2012/02/js_tdg_6e.html"&gt;&lt;img src="http://radar.oreilly.com/2012/02/09/3-0212-js-tdg-6e-580.png" border="0" alt="Optimized Mobi file from JavaScript: The Definitive Guide" style="margin-bottom:15px;" width="580" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;em&gt;&lt;a href="http://radar.oreilly.com/assets_c/2012/02/js_tdg_6e.html"&gt;Click to enlarge&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Starting this week, our books will begin to be available in KF8 format through Amazon's Kindle Store. However, an unfortunate limitation of buying from Amazon is that it doesn't normally provide customers with publisher updates. By contrast, buying direct from O'Reilly gives you access to &lt;a href="http://shop.oreilly.com/category/ebooks.do"&gt;lifetime, DRM-free updates in all standard ebook formats&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/10/ipad-amazon-kindle-fire.html"&gt;iPad vs. Kindle Fire: Early impressions and a few predictions&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2012/01/kindle-fire-three-pros-five-co.html"&gt;Kindle Fire: Three pros, five cons&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=0DAcEF7Nxc8:Zibm1WmWbq8:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=0DAcEF7Nxc8:Zibm1WmWbq8:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=0DAcEF7Nxc8:Zibm1WmWbq8:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=0DAcEF7Nxc8:Zibm1WmWbq8:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=0DAcEF7Nxc8:Zibm1WmWbq8:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=0DAcEF7Nxc8:Zibm1WmWbq8:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=0DAcEF7Nxc8:Zibm1WmWbq8:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/0DAcEF7Nxc8" height="1" width="1"/&gt;</content>

<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/kindle-fire-mobi-kf8-oreilly-ebook.html</feedburner:origLink></entry>

<entry>
<title>Strata Week: Your personal automated data scientist</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/NVO_t-FkGos/wolfram-alpha-pro-crisis-data-dating-data.html" />
<id>tag:radar.oreilly.com,2012://57.47806</id>

<published>2012-02-09T16:00:00Z</published>
<updated>2012-02-09T16:00:00Z</updated>

<summary type="html">Wolfram|Alpha launches a pro version of its computational knowledge engine, guidelines emerge for protecting the data of people in crisis, and researchers cast doubt on dating sites' matchmaking algorithms.</summary>
<author>
<name>Audrey Watters</name>
<uri>http://radar.oreilly.com/audreyw</uri>
</author>

<category term="Data" scheme="http://www.sixapart.com/ns/types#category" />

<category term="Web 2.0" scheme="http://www.sixapart.com/ns/types#category" />

<category term="crisisresponse" label="crisis response" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="dataprotection" label="data protection" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="datascience" label="data science" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="datascientists" label="data scientists" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="dating" label="dating" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="personaldata" label="personal data" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="strataweek" label="strataweek" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="wolframalpha" label="wolfram alpha" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;Here are a few of the data stories that caught my attention this week:&lt;/p&gt;

&lt;h2 id="wolfram-alpha"&gt;Wolfram|Alpha Pro: An on-call data scientist&lt;/h2&gt;

&lt;p&gt;The computational knowledge engine &lt;a href="http://wolframalpha.com"&gt;Wolfram|Alpha&lt;/a&gt; unveiled a &lt;a href="http://blog.wolframalpha.com/2012/02/08/announcing-wolframalpha-pro/"&gt;pro&lt;/a&gt; version this week. For $4.99 per month ($2.99 for students), Wolfram|Alpha Pro offers access to more of the computational power "under the hood" of the site, in part by allowing users to upload their own datasets, which Wolfram|Alpha will in turn analyze.&lt;/p&gt;

&lt;p&gt;This includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Text files &amp;mdash; Wolfram|Alpha will report the character and word counts, estimate how long the text would take to read aloud, and reveal the most common word, average sentence length, and more.&lt;/li&gt;
&lt;li&gt; Spreadsheets &amp;mdash; It will crunch the numbers and return a variety of statistics and graphs.&lt;/li&gt;
&lt;li&gt; Image files &amp;mdash; It will analyze the image's dimensions, size, and colors, and let you apply several different filters.&lt;/li&gt;
&lt;/ul&gt;

&lt;p class="image-box-580"&gt;&lt;img src="http://radar.oreilly.com/2012/02/09/0212-wofram-pro-example.png" border="0" alt="Wolfram Alpha Pro example" width="580" style="margin-bottom: 15px;" /&gt;&lt;br /&gt;&lt;em&gt;Wolfram|Alpha Pro subscribers can upload and analyze their own datasets.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;There's also a new extended keyboard that contains the Greek alphabet and other special characters for manually entering data. Data and analysis from these entries and any queries can also be downloaded.&lt;/p&gt;

&lt;p&gt;"In a sense," &lt;a href="http://blog.wolframalpha.com/2012/02/08/announcing-wolframalpha-pro/"&gt;writes&lt;/a&gt; Wolfram's founder Stephen Wolfram, "the concept is to imagine what a good data scientist would do if confronted with your data, then just immediately and automatically do that &amp;mdash; and show you the results."&lt;/p&gt;

&lt;div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"&gt;&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-strataweek-020912"&gt;&lt;img style="float: left; border: none; padding-right: 10px;" src="http://radar.oreilly.com/2011-strata-ca-promo.png" /&gt;&lt;/a&gt;&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-strataweek-020912"&gt;&lt;strong&gt;Strata 2012&lt;/strong&gt;&lt;/a&gt; &amp;mdash;  The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.&lt;br /&gt;
 &lt;br /&gt;
&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-strataweek-020912"&gt;&lt;strong&gt;Save 20% on registration with the code RADAR20&lt;/strong&gt;&lt;/a&gt;&lt;/div&gt;

&lt;h2 id="crisis"&gt;Crisis-mapping and data protection standards&lt;/h2&gt;

&lt;p&gt;&lt;a href="http://ushahidi.com/"&gt;Ushahidi&lt;/a&gt;'s Patrick Meier &lt;a href="http://irevolution.net/2012/02/05/iom-data-protection/"&gt;takes a look&lt;/a&gt; at the recently released &lt;a href="http://www.iom.int/jahia/Jahia/media/press-briefing-notes/pbnEU/cache/offonce/lang/en?entryId=31191"&gt;Data Protection Manual&lt;/a&gt; issued by the International Organization for Migration (&lt;a href="http://www.iom.int/"&gt;IOM&lt;/a&gt;).  According to the IOM, the manual is meant to serve as a guide to help:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;" ... protect the personal data of the migrants in its care. It follows concerns about the general increase in data theft and loss and the recognition that hackers are finding ever more sophisticated ways of breaking into personal files.  The IOM Data Protection Manual aims to protect the integrity and confidentiality of personal data and to prevent inappropriate disclosure."&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Meier describes the manual as "required reading" but notes that there is no mention of social media in the 150-page document. "This is perfectly understandable given IOM's work," he writes, "but there is no denying that disaster-affected communities are becoming more digitally-enabled &amp;mdash; and thus, increasingly the source of important, user-generated information."&lt;/p&gt;

&lt;p&gt;Meier moves through the &lt;a href="http://irevolution.net/2012/02/05/iom-data-protection/"&gt;Data Protection Manual's principles&lt;/a&gt;, highlighting the ones that may be challenged when it comes to user-generated, crowdsourced data and raising important questions about consent, privacy, and security.&lt;/p&gt;

&lt;h2 id="dating"&gt;Doubting the dating industry's algorithms&lt;/h2&gt;

&lt;p&gt;Many online dating websites claim that their algorithms are able to help match singles with their perfect mate.  But a forthcoming article in "&lt;a href="http://www.psychologicalscience.org/index.php/news/releases/grading-the-online-dating-industry.html"&gt;Psychological Science in the Public Interest&lt;/a&gt;," a journal of the Association for Psychological Science, casts some doubt on the data science of dating.&lt;/p&gt;

&lt;p&gt;According to the article's lead author Eli Finkel, associate professor of social psychology at Northwestern University, "there is no compelling evidence that any online dating matching algorithm actually works."  Finkel argues that dating sites' algorithms do not "adhere to the standards of science," and adds that "it is unlikely that their algorithms can work, even in principle, given the limitations of the sorts of matching procedures that these sites use."&lt;/p&gt;

&lt;p&gt;It's "relationship science" versus the in-take questions that most dating sites ask in order to help users create their profiles and suggest matches.  Finkel and his coauthors note that some of the strongest predictors for good relationships &amp;mdash; such as how couples interact under pressure &amp;mdash; aren't assessed by dating sites.&lt;/p&gt;

&lt;p&gt;The paper calls for the creation of a panel to grade the scientific credibility of each online dating site.&lt;/p&gt;

&lt;h2&gt;Got data news?&lt;/h2&gt;

&lt;p&gt;Feel free to &lt;a href="mailto:dataweek@oreilly.com"&gt;email me&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2012/02/megaupload-user-data-bloomberg-pentaho.html#megaupload"&gt;Megaupload's seizure and questions about controlling user data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/03/social-media-red-cross-lafd.html"&gt;Social media in a time of need&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/06/dating-data-okcupid-oktrends.html"&gt;Dating with data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://blogs.oreilly.com/cgi-bin/mt/mt-search.cgi?blog_id=57&amp;tag=strataweek&amp;limit=20&amp;IncludeBlogs=57"&gt;More Strata Week coverage&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=NVO_t-FkGos:-m_b_V1eIH8:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=NVO_t-FkGos:-m_b_V1eIH8:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=NVO_t-FkGos:-m_b_V1eIH8:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=NVO_t-FkGos:-m_b_V1eIH8:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=NVO_t-FkGos:-m_b_V1eIH8:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=NVO_t-FkGos:-m_b_V1eIH8:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=NVO_t-FkGos:-m_b_V1eIH8:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/NVO_t-FkGos" height="1" width="1"/&gt;</content>

<dc:type>text</dc:type>
<on:image>http://radar.oreilly.com/strata-week.png</on:image>
<feedburner:origLink>http://radar.oreilly.com/2012/02/wolfram-alpha-pro-crisis-data-dating-data.html</feedburner:origLink></entry>

<entry>
<title>It's time for a unified ebook format and the end of DRM</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/3SVAKS_h-1Y/unified-ebook-format-end-drm.html" />
<id>tag:radar.oreilly.com,2012://57.47799</id>

<published>2012-02-09T14:00:00Z</published>
<updated>2012-02-09T14:00:00Z</updated>

<summary type="html">The music industry has shown that you need to offer consumers a universal format and content without rights restrictions. So when will publishers pay attention?</summary>
<author>
<name>Joe Wikert</name>
<uri>http://radar.oreilly.com/joew</uri>
</author>

<category term="Publishing" scheme="http://www.sixapart.com/ns/types#category" />

<category term="consumers" label="consumers" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="devices" label="devices" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="drm" label="drm" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ebook" label="ebook" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="epub3" label="EPUB 3" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="format" label="format" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="publishers" label="publishers" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="readers" label="readers" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;&lt;em&gt;This post originally appeared on &lt;a href="http://www.publishersweekly.com/pw/by-topic/digital/content-and-e-books/article/50484-the-toc-perspective-a-call-for-a-unified-e-book-market.html"&gt;Publishers Weekly&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src="http://radar.oreilly.com/2012/02/07/0212-unified-ebook-format.png" border="0" alt="Ereaders" width="300" style="float: right; margin: 3px 0 10px 10px;" /&gt;Imagine buying a car that locks you into one brand of fuel. A new BMW, for example, that only runs on BMW gas. There are plenty of BMW gas stations around, even a few in your neighborhood, so convenience isn't an issue. But if one of those other gas stations offers a discount, a membership program, or some other attractive marketing campaign, you can't participate. You're locked in with the BMW gas stations.&lt;/p&gt;

&lt;p&gt;This could never happen, right? Consumers are too smart to buy into something like this. Or are they? After all, isn't that exactly what's happening in the ebook world? You buy a dedicated ebook reader like a Kindle or a NOOK and you're locked in to that company's content. Part of this problem has to do with ebook formats (e.g., EPUB or Mobipocket) while another part of it stems from publisher insistence on the use of digital rights management (DRM). Let's look at these issues individually.&lt;/p&gt;

&lt;h2&gt;Platform lock-in&lt;/h2&gt;

&lt;p&gt;I've often referred to it as Amazon's not-so-secret formula: Every time I buy another ebook for my Kindle, I'm building a library that makes me that much more loyal to Amazon's platform. If I've invested hundreds or even thousands of dollars in Kindle-formatted content, how could I possibly afford to switch to another reading platform?&lt;/p&gt;

&lt;p&gt;It would be too inconvenient to have part of my library in Amazon's Mobipocket format and the rest in EPUB. Even though I could read both on a tablet (e.g., the iPad), I'd be forced to switch between two different apps. The user interface between any two reading apps is similar but not identical, and searching across your entire library becomes a two-step process since there's no way to access all of your content within one app.&lt;/p&gt;

&lt;p&gt;This situation isn't unique to Amazon. The same issue exists for all the other dedicated ereader hardware platforms (e.g., Kobo and NOOK). Google Books initially seemed like a solution to this problem, but &lt;a href="http://books.google.com/help/ebooks/ereader.html"&gt;it still doesn't offer the Mobi format for the Kindle&lt;/a&gt;, so it's selling content for every format under the sun &amp;mdash; except the one with the largest market share.&lt;/p&gt;

&lt;p&gt;EPUB would seem to be the answer. It's a popular format based on web standards, and it's developed and maintained by &lt;a href="http://idpf.org/"&gt;an organization&lt;/a&gt; that's focused on openness and broad industry adoption. It also happens to be the format used by seemingly every ebook vendor except the largest one: Amazon.&lt;/p&gt;

&lt;p&gt;Even if we could get Amazon to adopt EPUB, though, we'd still have that other pesky issue to deal with: DRM.&lt;/p&gt;

&lt;h2&gt;The myth of DRM&lt;/h2&gt;

&lt;p&gt;I often blame Napster for the typical book publisher's fear of piracy. Publishers saw what happened in the music industry and figured the only way they'd make their book content available digitally was to tightly wrap it with DRM. The irony of this is that some of the most highly pirated books were never released as ebooks. Thanks to the magic of high-speed scanner technology, any print book can easily be converted to an ebook and distributed illegally.&lt;/p&gt;

&lt;p&gt;Some publishers don't want to hear this, but the truth is that DRM can be hacked. It does not eliminate piracy. It not only fails as a piracy deterrent, but it also introduces restrictions that make ebooks less attractive than print books. We've all read a print book and passed it along to a friend. Good luck doing that with a DRM'd ebook! What publishers don't seem to understand is that DRM implies a lack of trust. All customers are considered thieves and must be treated accordingly.&lt;/p&gt;

&lt;p&gt;The evil of DRM doesn't end there, though. Author Charlie Stross recently wrote a terrific blog post entitled "&lt;a href="http://www.antipope.org/charlie/blog-static/2011/11/cutting-their-own-throats.html"&gt;Cutting Their Own Throats&lt;/a&gt;." It's all about how publisher fear has enabled a big ebook player like Amazon to further reinforce its market position, often at the expense of publishers and authors. It's an unintended consequence of DRM that's impacting our entire industry.&lt;/p&gt;

&lt;p&gt;Given all these issues, why not eliminate DRM and trust your customers? Even the music industry, the original casualty of the Napster phenomenon, has seen the light and moved on from DRM.&lt;/p&gt;

&lt;div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"&gt;&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-unified-ebook-format-drm-joe-pw"&gt;&lt;img style="float: left; border: none; padding-right: 10px;" src="http://radar.oreilly.com/toc11-148.png" /&gt;&lt;/a&gt;&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-unified-ebook-format-drm-joe-pw"&gt;&lt;strong&gt;TOC NY 2012&lt;/strong&gt;&lt;/a&gt; &amp;mdash;  O'Reilly's TOC Conference, being held Feb. 13-15, 2012, in New York, is where the publishing and tech industries converge. Practitioners and executives from both camps will share what they've learned and join together to navigate publishing's ongoing transformation.&lt;br /&gt;
 &lt;br /&gt;
&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-unified-ebook-format-drm-joe-pw"&gt;&lt;strong&gt;Register to attend TOC 2012&lt;/strong&gt;&lt;/a&gt;&lt;/div&gt;

&lt;h2&gt;Lessons from the music industry&lt;/h2&gt;

&lt;p&gt;Several years ago, Steve Jobs posted a letter to the music industry pleading for them to abandon DRM. The letter no longer appears on Apple's website, &lt;a href="http://www.engadget.com/2007/02/06/a-letter-from-steve-jobs-on-drm-lets-get-rid-of-it/"&gt;but community commentary about it lives on&lt;/a&gt;. My favorite part of that letter is where Jobs asks why the music industry would allow DRM to go away. The answer is that, "DRMs haven't worked, and may never work, to halt music piracy." In fact, &lt;a href="http://www.engadget.com/2007/02/06/a-letter-from-steve-jobs-on-drm-lets-get-rid-of-it/"&gt;a study last year by Rice University and Duke University&lt;/a&gt; contends that removing DRM can actually &lt;em&gt;decrease&lt;/em&gt; piracy. Yes, you read that right.&lt;/p&gt;

&lt;p&gt;I recently had an experience with my digital music collection that drove this point home for me. I had just switched from an iPhone to an Android phone and wanted to get my music from the old device onto the new one. All I had to do was drag and drop the folder containing my music in iTunes to the SD card in my new phone. It worked perfectly because the music file formats are universal and there was no DRM involved.&lt;/p&gt;

&lt;p&gt;Imagine trying to do that with your ebook collection. Try dragging your Kindle ebooks onto your new NOOK, for example. Incompatible file formats and DRM prevent that from happening ... today. At some point in the not-too-distant future, though, I'm optimistic the book publishing industry will get to the same stage as the music industry and offer a universal, DRM-free format for all ebooks. Then customers will be free to use whatever e-reader they prefer without fear of lock-in and incompatibilities.&lt;/p&gt; 

&lt;p&gt;The music industry made the transition. Why can't we?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2012/01/on-pirates-and-piracy.html"&gt;On pirates and piracy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/01/book-piracy-drm-data.html"&gt;Book piracy: Less DRM, more data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2008/06/putting-ebook-piracy-into-pers.html"&gt;Putting Ebook Piracy into Perspective&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2008/10/the-analog-hole-in-digital-boo.html"&gt;The Analog Hole: Another Argument Against DRM&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=3SVAKS_h-1Y:w69jrBoHfzY:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=3SVAKS_h-1Y:w69jrBoHfzY:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=3SVAKS_h-1Y:w69jrBoHfzY:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=3SVAKS_h-1Y:w69jrBoHfzY:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=3SVAKS_h-1Y:w69jrBoHfzY:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=3SVAKS_h-1Y:w69jrBoHfzY:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=3SVAKS_h-1Y:w69jrBoHfzY:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/3SVAKS_h-1Y" height="1" width="1"/&gt;</content>

<dc:type>text</dc:type>
<on:image>http://radar.oreilly.com/2012/02/07/0212-unified-ebook-format-slider.png</on:image>
<feedburner:origLink>http://radar.oreilly.com/2012/02/unified-ebook-format-end-drm.html</feedburner:origLink></entry>

<entry>
<title>Now available: Best of TOC 2012 anthology</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/s-C6OtVh9uI/best-of-toc-2012-free-ebook-anthology.html" />
<id>tag:radar.oreilly.com,2012://57.47797</id>

<published>2012-02-09T13:30:00Z</published>
<updated>2012-02-09T13:30:00Z</updated>

<summary type="html">"Best of TOC 2012" explores the ideas that are shaping the content world, including: the adaptation of publishing, digital's legal issues, new tech and tools, and thoughts from the edge of publishing.</summary>
<author>
<name>Mac Slocum</name>
<uri>http://radar.oreilly.com/mslocum</uri>
</author>

<category term="Publishing" scheme="http://www.sixapart.com/ns/types#category" />

<category term="anthology" label="anthology" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="bestoftoc2012" label="Best of TOC 2012" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ebook" label="ebook" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="toc" label="TOC" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;&lt;a href="http://shop.oreilly.com/product/0636920025290.do?cmp=il-radar-ebooks-best-of-toc-2012-announcement"&gt;&lt;img src="http://radar.oreilly.com/2012/02/07/0212-best-of-toc-12-cover.png" border="0" alt="Best of TOC 2012" style="float: right; margin: 3px 0 10px 10px;" width="191" /&gt;&lt;/a&gt;We just released "&lt;a href="http://shop.oreilly.com/product/0636920025290.do?cmp=il-radar-ebooks-best-of-toc-2012-announcement"&gt;Best of TOC 2012&lt;/a&gt;,"  a free anthology that brings together key interviews and analysis from Radar's &lt;a href="http://radar.oreilly.com/publishing"&gt;publishing&lt;/a&gt; area. &lt;/p&gt;

&lt;p&gt;The material in Best of TOC falls into four sections:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The adaptation of publishing&lt;/strong&gt; &amp;mdash; The disruption in publishing is just getting started. Journalists are &lt;a href="http://radar.oreilly.com/2011/12/marc-herman-kindle-single-journalism.html"&gt;experimenting with ebook options&lt;/a&gt; over traditional outlets, readers are wrapping their heads around the concept of &lt;a href="http://radar.oreilly.com/2011/11/the-paperless-book.html"&gt;paperless books&lt;/a&gt;, and authors are &lt;a href="http://radar.oreilly.com/2011/03/future-of-publishers.html"&gt;wondering if they even need publishers&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Digital publishing and the legal landscape&lt;/strong&gt; &amp;mdash; The emerging global market for books is stirring up all sorts of legal questions concerning &lt;a href="http://radar.oreilly.com/2011/03/golan-copyright-international.html"&gt;copyright&lt;/a&gt;, public domain and &lt;a href="http://radar.oreilly.com/2011/10/digital-rights-complexity.html"&gt;digital publishing rights&lt;/a&gt; for authors and publishers. Existing laws are slowly adapting to &lt;a href="http://radar.oreilly.com/2011/05/libel-twitter-facebook-blogs.html"&gt;new media platforms&lt;/a&gt; as well.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Publishing tech and tools&lt;/strong&gt; &amp;mdash; Digital publishing is requiring &lt;a href="http://radar.oreilly.com/2011/12/html5-for-publishers-canvas-geo-formats.html"&gt;tech education&lt;/a&gt; for everyone, from publishers to authors to readers. In addition, the rise of mobile is driving the development of &lt;a href="http://radar.oreilly.com/2011/10/content-on-an-infinite-canvas.html"&gt;publishing's next toolset&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The edge of publishing&lt;/strong&gt; &amp;mdash; Adaptation to a new publishing landscape starts with a &lt;a href="http://radar.oreilly.com/2010/11/open-ended-publishing.html"&gt;change in thinking&lt;/a&gt; &amp;mdash; not only in how we think about technology and &lt;a href="http://radar.oreilly.com/2011/02/future-of-the-book.html"&gt;books as objects&lt;/a&gt;, but in how we define our various roles and how we &lt;a href="http://radar.oreilly.com/2010/11/publishing-needs-a-social-stra.html"&gt;choose to collaborate&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You can &lt;a href="http://shop.oreilly.com/product/0636920025290.do?cmp=il-radar-ebooks-best-of-toc-2012-announcement"&gt;download a free copy of "Best of TOC 2012" here&lt;/a&gt; (available in EPUB, Mobi and PDF formats).&lt;/p&gt;

&lt;div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"&gt;&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-best-of-toc-2012-announcement"&gt;&lt;img style="float: left; border: none; padding-right: 10px;" src="http://radar.oreilly.com/toc11-148.png" /&gt;&lt;/a&gt;&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-best-of-toc-2012-announcement"&gt;&lt;strong&gt;TOC NY 2012&lt;/strong&gt;&lt;/a&gt; &amp;mdash;  O'Reilly's TOC Conference, being held Feb. 13-15, 2012, in New York, is where the publishing and tech industries converge. Practitioners and executives from both camps will share what they've learned and join together to navigate publishing's ongoing transformation.&lt;br /&gt;
 &lt;br /&gt;
&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-best-of-toc-2012-announcement"&gt;&lt;strong&gt;Register to attend TOC 2012&lt;/strong&gt;&lt;/a&gt;&lt;/div&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=s-C6OtVh9uI:kglwg1dyBwk:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=s-C6OtVh9uI:kglwg1dyBwk:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=s-C6OtVh9uI:kglwg1dyBwk:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=s-C6OtVh9uI:kglwg1dyBwk:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=s-C6OtVh9uI:kglwg1dyBwk:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=s-C6OtVh9uI:kglwg1dyBwk:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=s-C6OtVh9uI:kglwg1dyBwk:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/s-C6OtVh9uI" height="1" width="1"/&gt;</content>
<dc:source>http://www.oreillynet.com/pub/au/3515</dc:source>
<dc:type>text</dc:type>
<on:image>http://radar.oreilly.com/2012/02/07/0212-best-of-toc-12-slider.png</on:image>
<feedburner:origLink>http://radar.oreilly.com/2012/02/best-of-toc-2012-free-ebook-anthology.html</feedburner:origLink></entry>

<entry>
<title>Four short links: 9 February 2012</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/_m2_6YEiTdE/four-short-links-9-february-20-2.html" />
<id>tag:radar.oreilly.com,2012://57.47803</id>

<published>2012-02-09T11:00:00Z</published>
<updated>2012-02-09T11:00:00Z</updated>

<summary type="html"> Weave -- web-based visualization platform designed to enable visualization of any available data by anyone for any purpose. GPL and MPL-licensed. (via Flowing Data) Flotr2 -- MIT-licensed Javascript library for drawing HTML5 charts and graphs. It is a branch of flotr which removes the Prototype dependency and includes many improvements. (via Javascript Weekly) What Silicon Valley Gets Wrong About...</summary>
<author>
<name>Nat Torkington</name>
<uri>http://radar.oreilly.com/nat/</uri>
</author>

<category term="charting" label="charting" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="danmeyer" label="Dan Meyer" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="education" label="education" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="javascript" label="javascript" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="opensource" label="open source" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="programming" label="programming" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="security" label="security" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="startups" label="startups" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="visualisation" label="visualisation" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="web" label="web" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;ol&gt;
&lt;li&gt;&lt;a href="http://www.oicweave.org/"&gt;Weave&lt;/a&gt; -- &lt;i&gt;web-based visualization platform designed to enable visualization of any available data by anyone for any purpose&lt;/i&gt;. GPL and MPL-licensed. (via &lt;a href="http://flowingdata.com/2012/02/07/weave-for-visualization-development/"&gt;Flowing Data&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://humblesoftware.com/flotr2/"&gt;Flotr2&lt;/a&gt; -- MIT-licensed Javascript &lt;i&gt;library for drawing HTML5 charts and graphs. It is a branch of flotr which removes the Prototype dependency and includes many improvements.&lt;/i&gt; (via &lt;a href="http://javascriptweekly.com"&gt;Javascript Weekly&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://blog.mrmeyer.com/?p=12782"&gt;What Silicon Valley Gets Wrong About Math Education Again And Again&lt;/a&gt; (Dan Meyer) -- nicely said: it's hard to test true understanding, easy to automate only part of the testing and assessment support for learners.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://mitmproxy.org/"&gt;mitmproxy&lt;/a&gt; -- GPLv3-licensed SSL-aware HTTP proxy which lets you snoop on the traffic being sent back to the mothership from apps.&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=_m2_6YEiTdE:4mahLvDnes4:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=_m2_6YEiTdE:4mahLvDnes4:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=_m2_6YEiTdE:4mahLvDnes4:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=_m2_6YEiTdE:4mahLvDnes4:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=_m2_6YEiTdE:4mahLvDnes4:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=_m2_6YEiTdE:4mahLvDnes4:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=_m2_6YEiTdE:4mahLvDnes4:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/_m2_6YEiTdE" height="1" width="1"/&gt;</content>
<dc:source>http://www.oreillynet.com/pub/au/149</dc:source>
<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/four-short-links-9-february-20-2.html</feedburner:origLink></entry>

<entry>
<title>Tip for B&amp;N: Don't just follow Amazon</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/Ks-66BFFkOA/amazon-bn-market-competition-toc-podcast.html" />
<id>tag:radar.oreilly.com,2012://57.47794</id>

<published>2012-02-08T17:00:00Z</published>
<updated>2012-02-08T17:00:00Z</updated>

<summary type="html">Amazon is the clear market leader, but that doesn't mean everyone else should throw in the towel. In this podcast, Joseph Esposito, president of Portable CEO consulting, discusses the current publishing market and how B&amp;amp;N can best compete.</summary>
<author>
<name>Joe Wikert</name>
<uri>http://radar.oreilly.com/joew</uri>
</author>

<category term="Publishing" scheme="http://www.sixapart.com/ns/types#category" />

<category term="amazon" label="Amazon" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="barnesnoble" label="barnes &amp; noble" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="businessmodels" label="business models" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="competition" label="competition" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="tocpodcast" label="TOC Podcast" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;This post is part of the &lt;a href="http://blogs.oreilly.com/cgi-bin/mt/mt-search.cgi?blog_id=57&amp;tag=TOC%20Podcast&amp;limit=20&amp;IncludeBlogs=57"&gt;TOC podcast series&lt;/a&gt;. You can also subscribe to the free &lt;a href="http://itunes.apple.com/us/podcast/tools-change-for-publishing/id465091714"&gt;TOC podcast through iTunes&lt;/a&gt;.&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;I follow dozens of publishing blogs and tweet streams, but there's one that always rises above the rest for me. Any time I see something from Joseph Esposito (&lt;a href="https://twitter.com/#!/josephjesposito"&gt;@JosephJEsposito&lt;/a&gt;), president of Portable CEO consulting, I make sure I read it. He's a frequent contributor to the &lt;a href="http://scholarlykitchen.sspnet.org/"&gt;Scholarly Kitchen blog&lt;/a&gt;, and &lt;a href="http://scholarlykitchen.sspnet.org/2012/01/10/how-barnes-noble-can-take-a-bite-out-of-amazon/"&gt;one of his recent articles&lt;/a&gt; there got me thinking about the need for better competition in the publishing industry. I sat down with Joe to discuss Amazon's dominance, what B&amp;amp;N should do to improve its position and much more. &lt;/p&gt;

&lt;p&gt;Key points from the full video interview (&lt;a href="#interview"&gt;below&lt;/a&gt;) include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;"B&amp;amp;N needs an 'MCI solution'"&lt;/strong&gt; &amp;mdash; Amazon is the clear market leader and, as #2, B&amp;amp;N must avoid just following Amazon's lead and come up with a completely new and different product and content model. What B&amp;amp;N is doing with in-store Nook merchandising is great, but they've got to go much further. [Discussed at the &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=1m00s"&gt;1:00 mark&lt;/a&gt;.]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Can B&amp;amp;N do anything to disrupt Amazon Prime?&lt;/strong&gt; &amp;mdash; Amazon and anyone else creating a Prime-like service will start to run into the same challenges Netflix has encountered. [Discussed at &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=4m07s"&gt;4:07&lt;/a&gt;.]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Broad content repositories vs. narrow, vertical ones&lt;/strong&gt; &amp;mdash; Specific genres lend themselves more to this sort of offering, and each one could have a different pricing model. &lt;a href="http://safaribooksonline.com/Corporate/Index/"&gt;Safari Books Online&lt;/a&gt; is a great example. [Discussed at &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=5m52s"&gt;5:52&lt;/a&gt;.]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Pay-for-performance is the only option&lt;/strong&gt; &amp;mdash; Amazon has publicly stated that the Kindle Owners' Lending Library program pays most publishers a flat fee. I strongly believe that's the wrong model, and Joe talks about why the flat fee probably won't be a viable long-term option. [Discussed at &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=6m45s"&gt;6:45&lt;/a&gt;.]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Apps vs. HTML5/EPUB&lt;/strong&gt; &amp;mdash; Publishers are starting to figure out that platform-specific investments often aren't wise. Development costs for a single platform, even if that's iOS, are still high, so the future leads to more open, portable solutions. [Discussed at &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=8m26s"&gt;8:26&lt;/a&gt;.]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;DRM&lt;/strong&gt; &amp;mdash; Joe makes an excellent point when he notes that, "the pro-DRM stance that many publishers have is not really getting them anywhere." [Discussed at &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=11m05s"&gt;11:05&lt;/a&gt;.]&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Discoverability &amp;amp; recommendations&lt;/strong&gt; &amp;mdash; Discoverability will continue to get worse before it improves, but better integration with the social graph can provide a way forward. [Discussed at &lt;a href="http://www.youtube.com/watch?v=IX1m-lgWZMs#t=15m06s"&gt;15:06&lt;/a&gt;.]&lt;/li&gt;
&lt;/ul&gt;

&lt;p id="interview"&gt;You can view the entire interview in the following video.&lt;/p&gt;

&lt;p&gt;&lt;iframe width="600" height="305" src="http://www.youtube.com/embed/IX1m-lgWZMs" frameborder="0" allowfullscreen&gt;&lt;/iframe&gt; &lt;/p&gt;

&lt;div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"&gt;&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-joseph-esposito-toc-podcast"&gt;&lt;img style="float: left; border: none; padding-right: 10px;" src="http://radar.oreilly.com/toc11-148.png" /&gt;&lt;/a&gt;&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-joseph-esposito-toc-podcast"&gt;&lt;strong&gt;TOC NY 2012&lt;/strong&gt;&lt;/a&gt; &amp;mdash;  O'Reilly's TOC Conference, being held Feb. 13-15, 2012, in New York City, is where the publishing and tech industries converge. Practitioners and executives from both camps will share what they've learned and join together to navigate publishing's ongoing transformation.&lt;br /&gt;
 &lt;br /&gt;
&lt;a href="http://www.toccon.com/toc2012?cmp=il-radar-tc12-joseph-esposito-toc-podcast"&gt;&lt;strong&gt;Register to attend TOC 2012&lt;/strong&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2012/01/amazon-jason-calacanis-toc.html"&gt;Coming soon to a location near you: The Amazon Store?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://jwikert.typepad.com/the_average_joe/2012/01/barnes-noble-its-time-to-disrupt-the-industry.html"&gt;Barnes &amp;amp; Noble: It's Time to Disrupt the Industry!&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/01/data-markets-resellers-gnip.html"&gt;Hating Amazon is not a strategy&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/12/drm-amazon-and-publisher-throa.html"&gt;Open Question: Is it realistic for publishers to cut Amazon out of the equation?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://blogs.oreilly.com/cgi-bin/mt/mt-search.cgi?blog_id=57&amp;tag=TOC%20Podcast&amp;limit=20&amp;IncludeBlogs=57"&gt;More TOC Podcasts&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="feedflare"&gt;
&lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=Ks-66BFFkOA:toc52x1idR0:V_sGLiPBpWU"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=Ks-66BFFkOA:toc52x1idR0:V_sGLiPBpWU" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=Ks-66BFFkOA:toc52x1idR0:yIl2AUoC8zA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=yIl2AUoC8zA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=Ks-66BFFkOA:toc52x1idR0:JEwB19i1-c4"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?i=Ks-66BFFkOA:toc52x1idR0:JEwB19i1-c4" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=Ks-66BFFkOA:toc52x1idR0:7Q72WNTAKBA"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=7Q72WNTAKBA" border="0"&gt;&lt;/img&gt;&lt;/a&gt; &lt;a href="http://feeds.feedburner.com/~ff/oreilly/radar/atom?a=Ks-66BFFkOA:toc52x1idR0:qj6IDK7rITs"&gt;&lt;img src="http://feeds.feedburner.com/~ff/oreilly/radar/atom?d=qj6IDK7rITs" border="0"&gt;&lt;/img&gt;&lt;/a&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/oreilly/radar/atom/~4/Ks-66BFFkOA" height="1" width="1"/&gt;</content>

<dc:type>text</dc:type>
<on:image>http://radar.oreilly.com/toc-podcast-slider.png</on:image>
<feedburner:origLink>http://radar.oreilly.com/2012/02/amazon-bn-market-competition-toc-podcast.html</feedburner:origLink></entry>

<entry>
<title>The NoSQL movement</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/QxTWbjEQrG8/nosql-non-relational-database.html" />
<id>tag:radar.oreilly.com,2012://57.47780</id>

<published>2012-02-08T14:00:00Z</published>
<updated>2012-02-08T14:00:00Z</updated>

<summary type="html">A relational database is no longer the default choice. Mike Loukides charts the rise of the NoSQL movement and explains how to choose the right database for your application.</summary>
<author>
<name>Mike Loukides</name>
<uri>http://radar.oreilly.com/mikel</uri>
</author>

<category term="Data" scheme="http://www.sixapart.com/ns/types#category" />

<category term="databases" label="databases" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="nonrelationaldatabase" label="non-relational database" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="nosql" label="nosql" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="planningforbigdata" label="Planning for Big Data" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;&lt;img src="http://radar.oreilly.com/2012/02/08/0212-nosql-logos.png" border="0" width="300" alt="NoSQL logos" style="float: right; margin: 3px 0 10px 10px;" /&gt;In a conversation last year, Justin Sheehy, CTO of &lt;a href="http://www.basho.com/"&gt;Basho&lt;/a&gt;, described NoSQL as a movement,
rather than a technology. This description immediately felt right;
I've never been comfortable talking about NoSQL, which, when taken
literally, extends from the minimalist Berkeley DB (commercialized
by &lt;a href="http://en.wikipedia.org/wiki/Sleepycat_Software"&gt;Sleepycat&lt;/a&gt;,
now owned by Oracle) to the big-iron &lt;a href="http://hbase.apache.org/"&gt;HBase&lt;/a&gt;, with detours into software as
fundamentally different as &lt;a href="http://neo4j.org/"&gt;Neo4j&lt;/a&gt; (a
graph database) and &lt;a href="http://www.fluidinfo.com/"&gt;FluidDB&lt;/a&gt;
(which defies description).&lt;/p&gt;

&lt;p&gt;But what does it mean to say that NoSQL is a movement rather
than a technology? We certainly don't see picketers outside
Oracle's headquarters. Justin said succinctly that NoSQL is a
movement for choice in database architecture. There is no single
overarching technical theme; a single technology would belie the
principles of the movement.&lt;/p&gt;

&lt;p&gt;Think of the last 15 years of software development. We've gotten
very good at building large, database-backed applications. Many of
them are web applications, but even more of them aren't. "Software
architect" is a valid job description; it's a position to which
many aspire. But what do software architects do? They specify the
high-level design of applications: the front end, the APIs, the
middleware, the business logic &amp;mdash; the back end? Well, maybe not.&lt;/p&gt;

&lt;p&gt;
Since the '80s, the dominant back end of business systems has been a
relational database, whether Oracle, SQL Server or DB2. That's not
much of an architectural choice. Those are all great products, but
they're essentially similar, as are all the other relational
databases. And it's remarkable that we've explored many architectural
variations in the design of clients, front ends, and middleware, on a
multitude of platforms and frameworks, but haven't until recently
questioned the architecture of the back end. Relational databases have
been a given.&lt;/p&gt;

&lt;p&gt;Many things have changed since the advent of relational
databases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We're dealing with much more data. Although advances in storage
capacity and CPU speed have allowed the databases to keep pace,
we're in a new era where size itself is an important part of the
problem, and any significant database needs to be distributed.&lt;/li&gt;

&lt;li&gt;We require sub-second responses to queries. In the '80s, most
database queries could run overnight as batch jobs. That's no
longer acceptable. While some analytic functions can still run as
overnight batch jobs, we've seen the web evolve from static files
to complex database-backed sites, and that requires sub-second
response times for most queries.&lt;/li&gt;

&lt;li&gt;We want applications to be up 24/7. Setting up redundant
servers for static HTML files is one thing, but database replication
in a complex database-backed application is another.&lt;/li&gt;

&lt;li&gt;We're seeing many applications in which the database has to
soak up data as fast as (or even much faster than) it processes
queries: in a logging application, or a distributed sensor
application, writes can be much more frequent than reads.
Batch-oriented ETL (extract, transform, and load) hasn't
disappeared, and won't, but capturing high-speed data flows is
increasingly important.&lt;/li&gt;

&lt;li&gt;We're frequently dealing with changing data or with
unstructured data. The data we collect, and how we use it, grows
over time in unpredictable ways. Unstructured data isn't a
particularly new feature of the data landscape, since unstructured
data has always existed, but we're increasingly unwilling to force
a structure on data a priori.&lt;/li&gt;

&lt;li&gt;We're willing to sacrifice our sacred cows. We know that
consistency and isolation and other properties are very valuable,
of course. But so are some other things, like latency and
availability and not losing data even if our primary server goes
down. The challenges of modern applications make us realize that
sometimes we might need to weaken one of these constraints in order
to achieve another.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These changing requirements lead us to different tradeoffs and
compromises when designing software. They require us to rethink
what we require of a database, and to come up with answers aside
from the relational databases that have served us well over the
years. So let's look at these requirements in somewhat more
detail.&lt;/p&gt;


&lt;h2&gt;Size, response, availability&lt;/h2&gt;

&lt;p&gt;It's a given that any modern application is going to be
distributed. The size of modern datasets is only one reason for
distribution, and not the most important. Modern applications
(particularly web applications) have many concurrent users who
demand reasonably snappy response. In their 2009 &lt;a href="http://velocityconf.com/"&gt;Velocity Conference&lt;/a&gt; talk, &lt;a href="http://www.youtube.com/watch?v=bQSE51-gr2s"&gt;Performance Related
Changes and their User Impact&lt;/a&gt;, Eric Schurman and Jake Brutlag
showed results from independent research projects at Google and
Microsoft. Both projects demonstrated that imperceptibly small increases
in response time cause users to move to another site; if response
time is over a second, you're losing a very measurable percentage
of your traffic.&lt;/p&gt;

&lt;p&gt;If you're not building a web application &amp;mdash; say you're doing
business analytics, with complex, time-consuming queries  &amp;mdash; the
world has changed, and users now expect business analytics to run
in something like real time. Maybe not the sub-second latency
required for web users, but queries that run overnight are no
longer acceptable. Queries that run while you go out for coffee are
marginal. It's not just a matter of convenience; the ability to run
dozens or hundreds of queries per day changes the nature of the
work you do. You can be more experimental: you can follow through
on hunches and hints based on earlier queries. That kind of
spontaneity was impossible when research went through the DBA at
the data warehouse.&lt;/p&gt;

&lt;p&gt;Whether you're building a customer-facing application or doing
internal analytics, scalability is a big issue. Vertical
scalability (buy a bigger, faster machine) always runs into limits.
Now that the laws of physics have stalled Intel-architecture clock
speeds in the 3.5GHz range, those limits are more apparent than
ever. Horizontal scalability (build a distributed system with more
nodes) is the only way to scale indefinitely. You're scaling
horizontally even if you're only buying single boxes: it's been a
long time since I've seen a server (or even a high-end desktop)
that doesn't sport at least four cores. Horizontal scalability is
tougher when you're scaling across racks of servers at a colocation
facility, but don't be deceived: that's how scalability works in
the 21st century, even on your laptop. Even in your cell phone. We
need database technologies that aren't just fast on single servers:
they must also scale across multiple servers.&lt;/p&gt;

&lt;p&gt;Modern applications also need to be highly available. That goes
without saying, but think about how the meaning of "availability"
has changed over the years. Not much more than a decade ago, a web
application would have a single HTTP server that handed out static
files. These applications might be data-driven; but "data driven"
meant that a batch job rebuilt the web site overnight, and user
transactions were queued into a batch processing system, again for
processing overnight. Keeping such a system running isn't terribly
difficult. High availability doesn't impact the database. If the
database is only engaged in batched rebuilds or transaction
processing, the database can crash without damage. That's the world
for which relational databases were designed. In the '80s, if your
mainframe ran out of steam, you got a bigger one. If it crashed,
you were down. But when databases became a living, breathing part
of the application, availability became an issue. There is no way
to make a single system highly available; as soon as any component
fails, you're toast. Highly available systems are, by nature,
distributed systems.&lt;/p&gt;

&lt;p&gt;If a distributed database is a given, the next question is how
much work a distributed system will require. There are
fundamentally two options: databases that have to be distributed
manually, via sharding; and databases that are inherently
distributed. Relational databases are split between multiple hosts
by manual sharding, or determining how to partition the datasets
based on some properties of the data itself: for example, first
names starting with A-K on one server, L-Z on another. A lot of
thought goes into designing a sharding and replication strategy
that doesn't impair performance, while keeping the data relatively
balanced between servers. There's a third option that is
essentially a hybrid: databases that are not inherently
distributed, but that are designed so they can be partitioned
easily. &lt;a href="http://www.mongodb.org/"&gt;MongoDB&lt;/a&gt; is an example
of a database that can be sharded easily (or even automatically);
&lt;a href="http://hbase.apache.org/"&gt;HBase&lt;/a&gt;, &lt;a href="http://www.basho.com/"&gt;Riak&lt;/a&gt;, and &lt;a href="http://cassandra.apache.org/"&gt;Cassandra&lt;/a&gt; are all inherently
distributed, with options to control how replication and
distribution work.&lt;/p&gt;
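
&lt;p&gt;To make the manual option concrete, here is a minimal sketch of range-based sharding on the first letter of a name, following the A-K / L-Z example above. The host names and the &lt;code&gt;shard_for&lt;/code&gt; helper are hypothetical, and a real deployment would also have to handle replication and rebalancing.&lt;/p&gt;

```python
# Hypothetical shard map: each entry is (last letter in the range, host name).
SHARDS = [
    ("K", "db-a-k.example.com"),
    ("Z", "db-l-z.example.com"),
]


def shard_for(first_name):
    """Return the host that stores records for this first name."""
    initial = first_name.strip().upper()[:1]
    if not initial.isalpha():
        raise ValueError("cannot shard on name: %r" % first_name)
    for upper_bound, host in SHARDS:
        # Comparing single letters as strings gives alphabetical order.
        if upper_bound >= initial:
            return host
    raise ValueError("no shard covers initial %r" % initial)


if __name__ == "__main__":
    for name in ("Alice", "Karl", "Lena", "Zoe"):
        print(name, "is stored on", shard_for(name))
```

&lt;p&gt;The weakness of a scheme like this is visible in the map itself: if most names fall in one range, one server does most of the work, which is why balancing takes so much thought.&lt;/p&gt;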

&lt;p&gt;What database choices are viable when you need good interactive
response? There are two separate issues: read latency and write
latency. For reasonably simple queries on a database with
well-designed indexes, almost any modern database can give decent
read latency, even at reasonably large scale. Similarly, just about
all modern databases claim to be able to keep up with writes at
high speed. Most of these databases, including HBase, Cassandra,
Riak, and &lt;a href="http://couchdb.apache.org/"&gt;CouchDB&lt;/a&gt;, write
data immediately to an append-only file, which is an extremely
efficient operation. As a result, writes are often significantly
faster than reads.&lt;/p&gt;
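
&lt;p&gt;A toy illustration of why appending is cheap: in the sketch below, a write only adds one record to the end of a file, while a read has to find the newest record for a key. This is not how any of the databases named above implement storage; the class name and the one-JSON-object-per-line format are assumptions made for the example.&lt;/p&gt;

```python
import json
import os
import tempfile


class AppendOnlyStore:
    """Toy key-value store: every write appends one JSON line to a file."""

    def __init__(self, path):
        self.path = path

    def put(self, key, value):
        # A write never seeks or rewrites existing data; it only appends.
        with open(self.path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"key": key, "value": value}) + "\n")

    def get(self, key, default=None):
        # A read scans the whole log; the last record for a key wins.
        result = default
        if not os.path.exists(self.path):
            return result
        with open(self.path, encoding="utf-8") as log:
            for line in log:
                record = json.loads(line)
                if record["key"] == key:
                    result = record["value"]
        return result


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "data.log")
    store = AppendOnlyStore(path)
    store.put("user:1", "alice")
    store.put("user:1", "alicia")  # an update is just another append
    print(store.get("user:1"))
```

&lt;p&gt;The full scan in &lt;code&gt;get&lt;/code&gt; is the price of the cheap write; production systems typically add indexes and periodic compaction to keep reads fast.&lt;/p&gt;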

&lt;p&gt;Whether any particular database can deliver the performance you
need depends on the nature of the application, and whether you've
designed the application in a way that uses the database
efficiently: in particular, the structure of queries, more than the
structure of the data itself. &lt;a href="http://redis.io/"&gt;Redis&lt;/a&gt;
is an in-memory database with extremely fast response, for both
read and write operations; but there are a number of tradeoffs. By
default, data isn't saved to disk, and is lost if the system
crashes. You can configure Redis for durability, but at the cost of
some performance. Redis is also limited in scalability; there's
some replication capability, but support for clusters is still
coming. But if you want raw speed, and have a dataset that can fit
into memory, Redis is a great choice.&lt;/p&gt;

&lt;p&gt;It would be nice if there were some benchmarks to cover database
performance in a meaningful sense, but as the saying goes, "there
are lies, damned lies, and benchmarks." In particular, no small
benchmark can properly duplicate a real test-case for an
application that might reasonably involve dozens (or hundreds) of
servers.&lt;/p&gt;

&lt;h2&gt;Changing data and cheap lunches&lt;/h2&gt;

&lt;p&gt;NoSQL databases are frequently called "schemaless" because they
don't have the formal schema associated with relational databases.
The lack of a formal schema, which typically has to be designed
before any code is written, means that schemaless databases are a
better fit for current software development practices, such as
agile development. Starting from the simplest thing that could
possibly work and iterating quickly in response to customer input
doesn't fit well with designing an all-encompassing data schema at
the start of the project. It's impossible to predict how data will
be used, or what additional data you'll need as the project
unfolds. For example, many applications are now annotating their
data with geographic information: latitudes and longitudes,
addresses. That almost certainly wasn't part of the initial data
design.&lt;/p&gt;

&lt;p&gt;How will the data we collect change in the future? Will we
be collecting biometric information along with tweets and
Foursquare checkins? Will music sites such as Last.FM and Spotify
incorporate factors like blood pressure into their music selection
algorithms? If you think these scenarios are futuristic, think
about Twitter. When it started out, it just collected bare-bones
information with each tweet: the tweet itself, the Twitter handle,
a timestamp, and a few other bits. Over its five-year history,
though, lots of metadata has been added. A tweet may be 140
characters at most, but a couple KB is actually sent to the server,
and all of this is saved in the database. Up-front schema design is
a poor fit in a world where data requirements are fluid.&lt;/p&gt;

&lt;p&gt;In addition, modern applications frequently deal with
unstructured data: blog posts, web pages, voice transcripts, and
other data objects that are essentially text. O'Reilly maintains a
substantial database of job listings for some internal research
projects. The job descriptions are chunks of text in natural
languages. They're not unstructured because they don't fit into a
schema: you can easily create a JOBDESCRIPTION column in a table,
and stuff text strings into it. They're unstructured because knowing
the data type and
where it fits in the overall structure doesn't help. What are the
questions you're likely to ask? Do you want to know about skills,
certifications, the employer's address, the employer's industry?
Those are all valid columns for a table, but you don't know what
you care about in advance; you won't find equivalent information in
each job description; and the only way to get from the text to the
data is through various forms of pattern matching and
classification. Doing the classification up front, so you could
break a job listing down into skills, certifications, etc., is a
huge effort that would largely be wasted. The guys who work with
this data recently had fits disambiguating "Apple Computer" from
"apple orchard." Would you even know this was a problem outside of
a concrete research project based on a concrete question? If you're
just pre-populating an INDUSTRY column from raw data, would you
notice that lots of computer industry jobs were leaking into fruit
farming? A JOBDESCRIPTION column doesn't hurt, but doesn't help
much either, and going further, by trying to design a schema around
the data that you'll find in the unstructured text, that definitely
hurts. The kinds of questions you're likely to ask have everything
to do with the data itself, and little to do with that data's
relations to other data.&lt;/p&gt;

&lt;p&gt;However, it's really a mistake to say that NoSQL databases have
no schema. In a document database, such as CouchDB or MongoDB,
documents are collections of key-value pairs. While you can add documents with
differing sets of keys (missing keys or extra keys), or even add
keys to documents over time, applications still must know that
certain keys are present to query the database; indexes have to be
set up to make searches efficient. The same thing applies to
column-oriented databases, such as HBase and Cassandra. While any
row may have as many columns as needed, some up-front thought has
to go into what columns are needed to organize the data. In most
applications, a NoSQL database will require less up-front planning,
and offer more flexibility as the application evolves. As we'll see,
data design revolves more around the queries you want to ask than
the domain objects that the data represents. It's not a free lunch;
possibly a cheap lunch, but not free.&lt;/p&gt;

&lt;p&gt;What kinds of storage models do the more common NoSQL databases
support? Redis is a relatively simple key-value store, but with a
twist: values can be data structures (lists and sets), not just
strings. It supplies operations for working directly with sets and
lists (for example, union and intersection).&lt;/p&gt;
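
&lt;p&gt;The idea of a key-value store whose values are sets can be mimicked in a few lines. The class below is an in-memory stand-in, not a Redis client; its method names are borrowed from the Redis commands SADD, SINTER, and SUNION, and everything else is an assumption for illustration.&lt;/p&gt;

```python
from collections import defaultdict


class TinySetStore:
    """In-memory stand-in for a key-value store whose values are sets."""

    def __init__(self):
        self._sets = defaultdict(set)

    def sadd(self, key, *members):
        # Add one or more members to the set stored at key.
        self._sets[key].update(members)

    def sinter(self, *keys):
        # Members present in every named set; a missing key is an empty set.
        groups = [self._sets.get(k, set()) for k in keys]
        return set.intersection(*groups) if groups else set()

    def sunion(self, *keys):
        # Members present in at least one named set.
        return set().union(*(self._sets.get(k, set()) for k in keys))


if __name__ == "__main__":
    store = TinySetStore()
    store.sadd("likes:jazz", "ann", "bob")
    store.sadd("likes:rock", "bob", "cy")
    print(sorted(store.sinter("likes:jazz", "likes:rock")))
    print(sorted(store.sunion("likes:jazz", "likes:rock")))
```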

&lt;p&gt;CouchDB and MongoDB both store documents in JSON format, where
JSON is a format originally designed for representing JavaScript
objects, but now available in many languages. So on one hand, you
can think of CouchDB and MongoDB as object databases; but you could
also think of a JSON document as a list of key-value pairs. Any
document can contain any set of keys, and any key can be associated
with an arbitrarily complex value that is itself a JSON document.
CouchDB queries are views, which are themselves documents in the
database that specify searches. Views can be very complex, and can
use a built-in MapReduce facility to process and summarize results.
Similarly, MongoDB queries are &lt;a href="http://www.json.org/"&gt;JSON&lt;/a&gt; documents, specifying fields and
values to match, and query results can be processed by a built-in
MapReduce. To use either database effectively, you start by
designing your views: what do you want to query, and how. Once you
do that, it will become clear what keys are needed in your
documents.&lt;/p&gt;
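
&lt;p&gt;The notion of a query that is itself a document of fields and values to match can be sketched without any database at all. The &lt;code&gt;matches&lt;/code&gt; and &lt;code&gt;find&lt;/code&gt; helpers below are hypothetical and handle only exact equality; real document databases support far richer queries and rely on indexes rather than a linear scan.&lt;/p&gt;

```python
def matches(document, query):
    """True if every field in the query has an equal value in the document."""
    return all(
        key in document and document[key] == value
        for key, value in query.items()
    )


def find(collection, query):
    """Return the documents in a list that match the query document."""
    return [doc for doc in collection if matches(doc, query)]


if __name__ == "__main__":
    # Documents in one collection need not share the same set of keys.
    jobs = [
        {"title": "DBA", "city": "Boston", "skills": ["SQL"]},
        {"title": "Developer", "city": "Boston"},
        {"title": "DBA", "city": "Austin", "remote": True},
    ]
    print(find(jobs, {"title": "DBA", "city": "Boston"}))
```

&lt;p&gt;Note that the query only works because the application knows that &lt;code&gt;title&lt;/code&gt; and &lt;code&gt;city&lt;/code&gt; exist: that knowledge is the schema.&lt;/p&gt;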

&lt;p&gt;Riak can also be viewed as a document database, though with more
flexibility about document types. It natively handles JSON, XML,
and plain text, and a plug-in architecture allows you to add
support for other document types. Searches "know about" the
structure of JSON and XML documents. Like CouchDB, Riak
incorporates MapReduce to perform complex queries efficiently.&lt;/p&gt;

&lt;p&gt;Cassandra and HBase are usually called column-oriented
databases, though a better term is a "sparse row store." In these
databases, the equivalent to a relational "table" is a set of rows,
identified by a key. Each row consists of an unlimited number of
columns; columns are essentially keys that let you look up values
in the row. Columns can be added at any time, and columns that are
unused in a given row don't occupy any storage. NULLs don't exist.
And since columns are stored contiguously, and tend to have similar
data, compression can be very efficient, and searches along a
column are likewise efficient. HBase describes itself as a database
that can store billions of rows with millions of columns.&lt;/p&gt;

&lt;p&gt;How do you design a schema for a database like this? As with the
document databases, your starting point should be the queries
you'll want to make. There are some radically different
possibilities. Consider storing logs from a web server. You may
want to look up the IP addresses that accessed each URL you serve.
The URLs can be the primary key; each IP address can be a column.
This approach will quickly generate thousands of unique columns,
but that's not a problem &amp;mdash; and a single query, with no joins, gets
you all the IP addresses that accessed a single URL. If some URLs
are visited by many addresses, and some are only visited by a few,
that's no problem: remember that NULLs don't exist. This design
isn't even conceivable in a relational database. You can't have a
table that doesn't have a fixed number of columns.&lt;/p&gt;
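
&lt;p&gt;Here is a rough model of that web log design, using a dictionary of dictionaries in place of a real column store. The URL is the row key and each IP address is a column name; the function names and the choice to store a hit count as the column value are assumptions for the sketch.&lt;/p&gt;

```python
from collections import defaultdict

# Row key: URL. Column name: IP address. Column value: hit count.
# Only the columns actually written for a row take up any space.
access_log = defaultdict(dict)


def record_hit(url, ip):
    """Add a column for this IP to the URL's row, or bump its count."""
    row = access_log[url]
    row[ip] = row.get(ip, 0) + 1


def addresses_for(url):
    """One lookup by row key returns every IP that accessed the URL."""
    return sorted(access_log.get(url, {}))


if __name__ == "__main__":
    record_hit("/index.html", "10.0.0.1")
    record_hit("/index.html", "10.0.0.2")
    record_hit("/index.html", "10.0.0.1")
    record_hit("/about.html", "10.0.0.9")
    print(addresses_for("/index.html"))
```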

&lt;p&gt;Now, let's make it more complex: you're writing an ecommerce
application, and you'd like to access all the purchases that a
given customer has made. The solution is similar. The column family
is organized by customer ID (primary key), you have columns for
first name, last name, address, and all the normal customer
information, plus as many columns as are needed for each purchase. In
a relational database, this would probably involve several tables
and joins. In the NoSQL databases, it's a single lookup. Schema
design doesn't go away, but it changes: you think about the queries
you'd like to execute, and how you can perform those
efficiently.&lt;/p&gt;
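
&lt;p&gt;The difference between the two designs can be shown side by side with plain Python data. All names and sample records below are invented for illustration: the first layout keeps customers and purchases apart and combines them at query time, while the second keeps everything about a customer in one wide row.&lt;/p&gt;

```python
# Relational-style layout: two "tables" combined at query time.
customers = {42: {"first": "Ada", "last": "Lovelace"}}
purchases = [
    {"customer_id": 42, "order": "A100", "item": "book"},
    {"customer_id": 7, "order": "A101", "item": "lamp"},
    {"customer_id": 42, "order": "A102", "item": "pen"},
]


def purchases_by_join(customer_id):
    """Scan the purchases table for rows that belong to the customer."""
    return [p["item"] for p in purchases if p["customer_id"] == customer_id]


# Wide-row layout: one row per customer, fixed columns plus one per purchase.
customer_rows = {
    42: {
        "first": "Ada",
        "last": "Lovelace",
        "purchase:A100": "book",
        "purchase:A102": "pen",
    }
}


def purchases_by_lookup(customer_id):
    """Fetch a single row by key and read the purchase columns from it."""
    row = customer_rows.get(customer_id, {})
    return [v for k, v in sorted(row.items()) if k.startswith("purchase:")]


if __name__ == "__main__":
    print(purchases_by_join(42))
    print(purchases_by_lookup(42))
```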

&lt;p&gt;This isn't to say that there's no value to normalization, just
that data design starts from a different place. With a relational
database, you start with the domain objects, and represent them in
a way that guarantees that virtually any query can be expressed.
But when you need to optimize performance, you look at the queries
you actually perform, then merge tables to create longer rows, and
do away with joins wherever possible. With the schemaless
databases, whether we're talking about data structure servers,
document databases, or column stores, you go in the other
direction: you start with the query, and use that to define your
data objects.&lt;/p&gt;

&lt;h2&gt;The sacred cows&lt;/h2&gt;

&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/ACID"&gt;ACID&lt;/a&gt;
properties (atomicity, consistency, isolation, durability) have
been drilled into our heads. But even these come into question as we
start thinking seriously about database architecture. When a
database is distributed, for instance, it becomes much more
difficult to achieve the same kind of consistency or isolation that
you can on a single machine. And the problem isn't just that it's
"difficult" but rather that achieving them ends up in direct
conflict with some of the reasons to go distributed. It's not that
properties like these aren't very important &amp;mdash; they certainly are &amp;mdash; but today's software architects are discovering that they
require the freedom to choose when it might be worth a
compromise.&lt;/p&gt;

&lt;p&gt;What about transactions, &lt;a href="http://en.wikipedia.org/wiki/Two_phase_commit"&gt;two-phase&lt;/a&gt;
commit, and other mechanisms inherited from big iron legacy
databases? If you've read almost any discussion of concurrent or
distributed systems, you've heard that banking systems care a lot
about consistency. What if you and your spouse withdraw money from
the same account at the same time? Could you overdraw the account?
That's what ACID is supposed to prevent. But a few months ago, I
was talking to someone who builds banking software, and he said "If
you really waited for each transaction to be properly committed on
a world-wide network of ATMs, transactions would take so long to
complete that customers would walk away in frustration. What
happens if you and your spouse withdraw money at the same time and
overdraw the account? You both get the money; we fix it up later."&lt;/p&gt;

&lt;p&gt;
This isn't to say that bankers have discarded transactions,
two-phase commit and other database techniques; they're just
smarter about it. In particular, they're distinguishing between
local consistency and absolutely global consistency. Gregor Hohpe's
classic article &lt;a href="http://www.eaipatterns.com/ramblings/18_starbucks.html"&gt;Starbucks
Does Not Use Two-Phase Commit&lt;/a&gt; makes a similar point: in an
asynchronous world, we have many strategies for dealing with
transactional errors, including write-offs. None of these
strategies are anything like two-phase commit. They don't force the
world into inflexible, serialized patterns.&lt;/p&gt;
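
&lt;p&gt;The "we fix it up later" strategy from the banking example can be sketched as follows. This is only a toy model under stated assumptions, not a description of real banking software: each hypothetical &lt;code&gt;Atm&lt;/code&gt; approves withdrawals against its own, possibly stale, copy of the balance, and a later &lt;code&gt;reconcile&lt;/code&gt; step applies everything to the account and reports any overdraft.&lt;/p&gt;

```python
class Atm:
    """Approves withdrawals against its own, possibly stale, balance."""

    def __init__(self, known_balance):
        self.known_balance = known_balance
        self.pending = []  # withdrawals not yet applied to the account

    def withdraw(self, amount):
        # Decide locally, without waiting for any other ATM.
        if amount > self.known_balance:
            return False
        self.known_balance -= amount
        self.pending.append(amount)
        return True


def reconcile(balance, atms):
    """Apply pending withdrawals later; report the balance and any overdraft."""
    for atm in atms:
        balance -= sum(atm.pending)
        atm.pending.clear()
    for atm in atms:
        atm.known_balance = balance  # every ATM now sees the same balance
    overdraft = max(0, -balance)
    return balance, overdraft


if __name__ == "__main__":
    here, there = Atm(100), Atm(100)
    print(here.withdraw(80), there.withdraw(80))  # both approved locally
    print(reconcile(100, [here, there]))
```

&lt;p&gt;Neither customer waits on the other's machine; the inconsistency is detected afterwards and handled as a business problem rather than prevented by a global lock.&lt;/p&gt;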

&lt;p&gt;The &lt;a href="http://en.wikipedia.org/wiki/CAP_theorem"&gt;CAP
theorem&lt;/a&gt; is more than a sacred cow; it's a law of the database
universe that can be expressed as "Consistency, Availability,
Partition Tolerance: choose two." But let's rethink relational
databases in light of this theorem. Databases have stressed
consistency. The CAP theorem is really about distributed
systems, and as we've seen, relational databases were developed
when distributed systems were rare and exotic at best. If you
needed more power, you bought a bigger mainframe. Availability
isn't an issue on a single server: if it's up, it's up, if it's
down, it's down. And partition tolerance is meaningless when
there's nothing to partition. As we saw at the beginning of
this article, distributed systems are a given for modern
applications; you won't be able to scale to the size and
performance you need on a single box. So the CAP theorem is
historically irrelevant to relational databases: they're good at
providing consistency, and they have been adapted to provide high
availability with some success, but they are hard to partition
without extreme effort or extreme cost.&lt;/p&gt;

&lt;p&gt;Since partition tolerance is a fundamental requirement for
distributed applications, it becomes a question of what to
sacrifice: consistency or availability. There have been two
approaches: Riak and Cassandra stress availability, while HBase has
stressed consistency. With Cassandra and Riak, the tradeoff between
consistency and availability is tunable. CouchDB and MongoDB are
essentially single-headed databases, and from that standpoint,
availability is a function of how long you can keep the hardware
running. However, both have add-ons that can be used to build
clusters. In a cluster, CouchDB and MongoDB are eventually
consistent (like Riak and Cassandra); availability depends on what
you do with the tools they provide. You need to set up sharding and
replication, and use what's essentially a proxy server to present a
single interface to the cluster's clients. &lt;a href="https://github.com/cloudant/bigcouch"&gt;BigCouch&lt;/a&gt; is an
interesting effort to integrate clustering into CouchDB, making it
more like Riak. Now that &lt;a href="https://cloudant.com/"&gt;Cloudant&lt;/a&gt; has announced that it is
&lt;a href="http://servicesangle.com/blog/2012/01/06/cloudant-plans-to-wind-down-bigcouch-development-merge-features-into-apache-couchdb/"&gt;merging BigCouch and CouchDB&lt;/a&gt;, we can expect to see clustering become part of the CouchDB core.&lt;/p&gt;
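
&lt;p&gt;One common way to express a tunable tradeoff in replicated stores is with three numbers: N copies of each item, W acknowledgements required for a write, and R replies required for a read. When R plus W exceeds N, every read set shares at least one replica with every write set. The helper names below are hypothetical, and the sketch ignores real-world complications such as failures during a write; it only shows the arithmetic.&lt;/p&gt;

```python
def quorums_overlap(n, r, w):
    """With n replicas, any r readers and any w writers share a replica when r + w > n."""
    if not (n >= 1 and n >= r >= 1 and n >= w >= 1):
        raise ValueError("r and w must each be between 1 and n")
    return r + w > n


def tolerated_failures(n, needed):
    """Replicas that can be down while an operation needing `needed` replies still succeeds."""
    return n - needed


if __name__ == "__main__":
    n = 3
    for r, w in [(1, 1), (2, 2), (1, 3), (3, 1)]:
        print(
            "R=%d W=%d overlap=%s read survives %d down, write survives %d down"
            % (r, w, quorums_overlap(n, r, w),
               tolerated_failures(n, r), tolerated_failures(n, w))
        )
```

&lt;p&gt;Small R and W favor availability and latency; larger values favor consistency, at the cost of needing more replicas to be reachable for each operation.&lt;/p&gt;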

&lt;p&gt;We've seen that absolute consistency isn't a hard requirement
for banks, nor is it the way we behave in our real-world
interactions. Should we expect it of our software? Or do we care
more about availability?&lt;/p&gt;

&lt;p&gt;It depends. The consistency requirements of many social
applications are very soft. You don't need to get the correct
number of Twitter or Facebook followers every time you log in. If
you search, you probably don't care if the results don't contain
the comments that were posted a few seconds ago. And if you're
willing to accept less-than-perfect consistency, you can make huge
improvements in performance. In the world of big-data-backed web
applications, with databases spread across hundreds (or potentially
thousands) of nodes, the performance penalty of locking down a
database while you add or modify a row is huge; if your application
has frequent writes, you're effectively serializing all the writes
and losing the advantage of the distributed database. In practice,
in an "eventually consistent" database, changes typically propagate
to the nodes in tenths of a second; we're not talking minutes or
hours before the database arrives in a consistent state.&lt;/p&gt;

&lt;p&gt;Given that we have all been battered with talk about "five
nines" reliability, and given that it is a big problem for any
significant site to be down, it seems clear that we should
prioritize availability over consistency, right? The architectural
decision isn't so easy, though. There are many applications in
which inconsistency must eventually be dealt with. If consistency
isn't guaranteed by the database, it becomes a problem that the
application has to manage. When you choose availability over
consistency, you're potentially making your application more
complex. With proper replication and failover strategies, a
database designed for consistency (such as HBase) can probably
deliver the availability you require; but this is another design
tradeoff. Regardless of the database you're using, more stringent
reliability requirements will drive you toward exotic engineering.
Only you can decide the right balance for your application. The
point isn't that any given decision is right or wrong, but that you
can (and have to) choose, and that's a good thing.&lt;/p&gt;

&lt;h2&gt;Other features&lt;/h2&gt;

&lt;p&gt;I've completed a survey of the major tradeoffs you need to think
about in selecting a database for a modern big data application.
But the major tradeoffs aren't the only story. There are many
database projects with interesting features. Here are some of the
ideas and projects I find most interesting:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; Scripting: Relational databases all come with some variation of
the SQL language, which can be seen as a scripting language for
data. In the non-relational world, a number of scripting languages
are available. CouchDB and Riak support JavaScript, as does
MongoDB. The &lt;a href="http://hadoop.apache.org/"&gt;Hadoop&lt;/a&gt; project
has spawned several data scripting languages that are usable
with HBase, including &lt;a href="http://pig.apache.org/"&gt;Pig&lt;/a&gt; and
&lt;a href="http://hive.apache.org/"&gt;Hive&lt;/a&gt;. The Redis project is
experimenting with integrating the &lt;a href="http://www.lua.org/"&gt;Lua&lt;/a&gt; scripting language.&lt;/li&gt;

&lt;li&gt; RESTful interfaces: CouchDB and Riak are unique in offering
RESTful interfaces. These are interfaces based on HTTP and the architectural
style elaborated in Roy Fielding's &lt;a href="http://www.ics.uci.edu/~fielding/pubs/dissertation/top.htm"&gt;doctoral
dissertation&lt;/a&gt; and &lt;a href="http://oreilly.com/catalog/9780596529260/"&gt;RESTful Web
Services&lt;/a&gt;. CouchDB goes so far as to serve as a web application
framework. Riak also offers a more traditional protocol buffer
interface, which is a better fit if you expect a high volume of
small requests.&lt;/li&gt;

&lt;li&gt; Graphs: &lt;a href="http://neo4j.org/"&gt;Neo4J&lt;/a&gt; is a special
purpose database designed for maintaining large graphs: data where
the data items are nodes, with edges representing the connections
between the nodes. Because graphs are extremely flexible data
structures, a graph database can emulate any other kind of
database.&lt;/li&gt;

&lt;li&gt; SQL: I've been discussing the NoSQL movement, but SQL is a
familiar language, and is always just around the corner. A couple
of startups are working on adding SQL to Hadoop-based datastores:
&lt;a href="http://www.drawntoscalehq.com/"&gt;DrawnToScale&lt;/a&gt; (which
focuses on low-latency, high-volume web applications) and
&lt;a href="http://www.hadapt.com/"&gt;Hadapt&lt;/a&gt; (which focuses on analytics and
bringing data warehousing into the 20-teens). In a few years, will
we be looking at hybrid databases that take advantage of both
relational and non-relational models? Quite possibly.&lt;/li&gt;

&lt;li&gt; Scientific data: Yet another direction comes from &lt;a href="http://www.scidb.org/"&gt;SciDB&lt;/a&gt;, a database project aimed at the
largest scientific applications (particularly the &lt;a href="http://www.lsst.org/lsst"&gt;Large Synoptic Survey Telescope&lt;/a&gt;).
The storage model is based on multi-dimensional arrays. It is
designed to scale to hundreds of petabytes of storage, collecting
tens of terabytes per night. It's still in the relatively early
stages.&lt;/li&gt;

&lt;li&gt; Hybrid architectures: NoSQL is really about architectural
choice. And perhaps the biggest expression of architectural choice
is a hybrid architecture: rather than using a single database
technology, mixing and matching technologies to play to their
strengths. I've seen a number of applications that use traditional
relational databases for the portion of the data for which the
relational model works well, and a non-relational database for the
rest. For example, customer data could go into a relational
database, linked to a non-relational database for unstructured data
such as product reviews and recommendations. It's all about
flexibility. A hybrid architecture may be the best way to integrate
"social" features into more traditional ecommerce sites.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These are only a few of the interesting ideas and projects that are
floating around out there. Roughly a year ago, I counted a couple
dozen non-relational database projects; I'm sure there are several
times that number today. Don't hesitate to add notes about your own
projects in the comments.&lt;/p&gt;

&lt;h2&gt;In the end&lt;/h2&gt;

&lt;p&gt;In a conversation with Eben Hewitt, author of &lt;a href="http://shop.oreilly.com/product/0636920010852.do"&gt;Cassandra: The
Definitive Guide&lt;/a&gt;, Eben summarized what you need to think about
when architecting the back end of a data-driven system. They're the
same issues software architects have been dealing with for years:
you need to think about the whole ecosystem in which the
application works; you need to consider your goals (Do you require
high availability? Fault tolerance?); you need to consider support
options; you need to isolate what will change over the life of the
application, and separate that from what remains the same. The big
difference is that now there are options; you don't have to choose
the relational model. There are other options for building large
databases that scale horizontally, are highly available, and can
deliver great performance to users. And these options, the
databases that make up the NoSQL movement, can often achieve these
goals with greater flexibility and lower cost.&lt;/p&gt;

&lt;p&gt;It used to be said that nobody got fired for buying IBM. Then
nobody got fired for buying Microsoft. Now, I suppose, nobody gets
fired for buying Oracle. But just as the landscape changed for IBM
and Microsoft, it's shifting again, and even &lt;a href="http://www.oracle.com/technetwork/database/nosqldb/overview/index.html"&gt;
Oracle has a NoSQL solution&lt;/a&gt;. Rather than relational databases
being the default, we're moving into a world where developers are
considering their architectural options, and deciding which
products fit their application: how the databases fit into their
programming model, whether they can scale in ways that make sense
for the application, whether they have strong or relatively weak
consistency requirements.&lt;/p&gt;

&lt;p&gt;For years, the relational default has
kept developers from understanding their real back-end
requirements. The NoSQL movement has given us the opportunity to
explore what we really require from our databases, and to find out
what we already knew: there is no one-size-fits-all solution.&lt;/p&gt;

&lt;div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"&gt;&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-nosql-movement"&gt;&lt;img style="float: left; border: none; padding-right: 10px;" src="http://radar.oreilly.com/2011-strata-ca-promo.png" /&gt;&lt;/a&gt;&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-nosql-movement"&gt;&lt;strong&gt;Strata 2012&lt;/strong&gt;&lt;/a&gt; &amp;mdash;  The 2012 Strata Conference, being held Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work.&lt;br /&gt; &lt;br /&gt;
&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-nosql-movement"&gt;&lt;strong&gt;Save 20% on registration with the code RADAR20&lt;/strong&gt;&lt;/a&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/10/oracle-nosql-database-architecture.html"&gt;Oracle's NoSQL: A product and an acknowledgement&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html"&gt;What is Apache Hadoop?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>

<dc:type>text</dc:type>
<on:image>http://radar.oreilly.com/2012/02/08/0212-nosql-logos-slider.png</on:image>
<feedburner:origLink>http://radar.oreilly.com/2012/02/nosql-non-relational-database.html</feedburner:origLink></entry>

<entry>
<title>Four short links: 8 February 2012</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/J6CXEJoWC0E/four-short-links-8-february-20-2.html" />
<id>tag:radar.oreilly.com,2012://57.47792</id>

<published>2012-02-08T11:00:00Z</published>
<updated>2012-02-08T11:00:00Z</updated>

<summary type="html"> Mavuno -- an open source, modular, scalable text mining toolkit built upon Hadoop. (Apache-licensed) Cow Clicker -- Wired profile of Cow Clicker creator Ian Bogost. I was impressed by Cow Clickers [...] have turned what was intended to be a vapid experience into a source of camaraderie and creativity. People create communities around social activities, even when they are antisocial....</summary>
<author>
<name>Nat Torkington</name>
<uri>http://radar.oreilly.com/nat/</uri>
</author>

<category term="bigdata" label="big data" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="datamining" label="data mining" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="fun" label="fun" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="hadoop" label="hadoop" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="publishing" label="publishing" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="science" label="science" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="socialsoftware" label="social software" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="textanalysis" label="text analysis" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="unicode" label="unicode" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="http://mavuno.isi.edu/mavuno/main.html"&gt;Mavuno&lt;/a&gt; -- &lt;i&gt;an open source, modular, scalable text mining toolkit built upon Hadoop.&lt;/i&gt; (Apache-licensed)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.wired.com/magazine/2011/12/ff_cowclicker/all/1"&gt;Cow Clicker&lt;/a&gt; -- Wired profile of Cow Clicker creator Ian Bogost. I was impressed by &lt;i&gt;Cow Clickers [...] have turned what was intended to be a vapid experience into a source of camaraderie and creativity&lt;/i&gt;. People create communities around social activities, even when they are antisocial. (via &lt;a href="http://boingboing.net/2012/02/02/ian-bogost-the-sarcastic-game.html"&gt;BoingBoing&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://boingboing.net/2012/02/03/unicodes-pile-of-poo-cha.html"&gt;Unicode Has a Pile of Poo Character&lt;/a&gt; (BoingBoing) -- this is perfect.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://cameronneylon.net/blog/the-research-works-act-and-the-breakdown-of-mutual-incomprehension/"&gt;The Research Works Act and the Breakdown of Mutual Incomprehension&lt;/a&gt; (Cameron Neylon) -- an excellent summary of how researchers and publishers view each other and their place in the world.&lt;/li&gt;
&lt;/ol&gt;&lt;/p&gt;

</content>
<dc:source>http://www.oreillynet.com/pub/au/149</dc:source>
<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/four-short-links-8-february-20-2.html</feedburner:origLink></entry>

<entry>
<title>Unstructured data is worth the effort when you've got the right tools</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/GtrLtZOHGN8/unstructured-data-analysis-tools.html" />
<id>tag:radar.oreilly.com,2012://57.47719</id>

<published>2012-02-07T14:00:00Z</published>
<updated>2012-02-07T14:00:00Z</updated>

<summary type="html">Alyona Medelyan and Anna Divoli are inventing tools to help companies contend with vast quantities of fuzzy data. They discuss their work and what lies ahead for big data in this interview.</summary>
<author>
<name>Suzanne Axtell</name>
<uri>http://radar.oreilly.com/suzannea</uri>
</author>

<category term="Data" scheme="http://www.sixapart.com/ns/types#category" />

<category term="algorithms" label="algorithms" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="alyonamedelyan" label="Alyona Medelyan" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="annadivoli" label="Anna Divoli" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="apis" label="APIs" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="bigdata" label="big data" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="dataanalysis" label="data analysis" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="datamining" label="data mining" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="entityrelationextraction" label="entity relation extraction" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="naturallanguageprocessing" label="natural language processing" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="pingar" label="Pingar" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="unstructureddata" label="unstructured data" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;It's dawning on companies that data analysis can yield insights and inform business decisions. As data-driven benefits grow, so do our demands about what more data can tell us and what other types we can mine.&lt;/p&gt;

&lt;p&gt;During her PhD studies, Alyona Medelyan (&lt;a href="https://twitter.com/#!/zelandiya"&gt;@zelandiya&lt;/a&gt;) developed &lt;a href="http://code.google.com/p/maui-indexer/"&gt;Maui&lt;/a&gt;, an open source tool that performs as well as professional librarians in identifying main topics in documents. Medelyan now leads the research and development of API-based products at &lt;a href="http://pingar.com/"&gt;Pingar&lt;/a&gt;.&lt;/p&gt; 

&lt;p&gt;Pingar senior software researcher Anna Divoli (&lt;a href="https://twitter.com/#!/annadivoli"&gt;@annadivoli&lt;/a&gt;) studied sentence extraction for semi-automatic annotation of biological databases. Her current research focuses on developing methodologies for acquiring knowledge from textual data.&lt;/p&gt;

&lt;p&gt;"Big data is important in many diverse areas, such as science, social media, and enterprise," observes Divoli. "Our big data niche is analysis of unstructured text." In the interview below, Medelyan and Divoli describe their work and what they see on the horizon for unstructured data analysis.&lt;/p&gt;

&lt;h2&gt;How did you get started in big data?&lt;/h2&gt;
 
&lt;p&gt;&lt;strong&gt;Anna Divoli:&lt;/strong&gt; I began working with big data as it relates to science during my PhD. I worked with bioinformaticians who mined &lt;a href="http://en.wikipedia.org/wiki/Proteomics"&gt;proteomics&lt;/a&gt; data. My research was on mining information from the biomedical literature that could serve as annotation in a database of protein families.&lt;/p&gt;
 
&lt;p&gt;&lt;strong&gt;Alyona Medelyan:&lt;/strong&gt; Like Anna, I mainly focus on unstructured data and how it can be managed using clever algorithms. During my PhD in natural language processing and data mining, I started applying such algorithms to large datasets to investigate how time-consuming data analysis and processing tasks can be automated.&lt;/p&gt;
 
&lt;h2&gt;What projects are you working on now?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Alyona Medelyan:&lt;/strong&gt; For the past two years at Pingar, I've been developing solutions for enterprise customers who accumulate unstructured data and want to search, analyze, and explore this data efficiently. We develop entity extraction, text summarization, and other text analytics solutions to help scrub and interpret unstructured data in an organization.&lt;/p&gt;
 
&lt;p&gt;&lt;strong&gt;Anna Divoli:&lt;/strong&gt; We're focusing on several verticals that struggle with too much textual data, such as bioscience, legal, and government. We also strive to develop language-independent solutions. &lt;/p&gt;
 
 
&lt;h2&gt;What are the trends and challenges you're seeing in the big data space?&lt;/h2&gt;
 
&lt;p&gt;&lt;strong&gt;Anna Divoli:&lt;/strong&gt; There are plenty of trends that span various aspects of big data, such as making the data accessible from mobile devices, cloud solutions, addressing security and privacy issues, and analyzing social data.&lt;/p&gt;

&lt;p&gt;One trend that is pertinent to us is the increasing popularity of APIs. Plenty of APIs exist that give access to large datasets, but there are also powerful APIs that manage big data efficiently, such as text analytics, entity extraction, and data mining APIs.&lt;/p&gt;
 
&lt;p&gt;&lt;strong&gt;Alyona Medelyan:&lt;/strong&gt; The great thing about APIs is that they can be integrated into existing applications used inside an organization.&lt;/p&gt;
 
&lt;p&gt;With regard to the challenges, enterprise data is very messy, inconsistent, and spread out across multiple internal systems and applications. APIs like the ones we're working on can bring consistency and structure to a company's legacy data.&lt;/p&gt;
 
&lt;h2&gt;The presentation you'll be giving at the &lt;a href="http://strataconf.com/strata2012?cmp=il-radar-st12-medelyan-divoli-interview"&gt;Strata Conference&lt;/a&gt; will focus on &lt;a href="http://strataconf.com/strata2012/public/schedule/detail/22499?cmp=il-radar-st12-medelyan-divoli-interview"&gt;practical applications of mining unstructured data&lt;/a&gt;. Why is this an important topic to address?&lt;/h2&gt;
 
&lt;p&gt;&lt;strong&gt;Anna Divoli:&lt;/strong&gt; Every single organization in every vertical deals with unstructured data. Tons of text is produced daily &amp;mdash; emails, reports, proposals, patents, literature, etc. This data needs to be mined to allow fast searching, easy processing, and quick decision making.&lt;/p&gt;
 
&lt;p&gt;&lt;strong&gt;Alyona Medelyan:&lt;/strong&gt; Big data often stands for structured data that is collected into a well-defined database &amp;mdash; who bought which book in an online bookstore, for example. Such databases are relatively easy to mine because they have a consistent form. At the same time, there is plenty of unstructured data that is just as valuable, but it's extremely difficult to analyze because it lacks structure. In our presentation, we will show how to detect structure using APIs, natural language processing, and text mining, and demonstrate how this creates immediate value for business users.&lt;/p&gt;
 
&lt;h2&gt;Are there important new tools or projects on the horizon for big data?&lt;/h2&gt;
 
&lt;p&gt;&lt;strong&gt;Alyona Medelyan:&lt;/strong&gt; Text analytics tools are very hot right now, and they improve daily as scientists come up with new ways of making algorithms understand written text more accurately. It is amazing that an algorithm can detect names of people, organizations, and locations within seconds simply by analyzing the context in which words are used. The trend for such tools is to move toward recognition of further useful entities, such as product names, brands, events, and skills.&lt;/p&gt;
 
&lt;p&gt;&lt;strong&gt;Anna Divoli:&lt;/strong&gt; Also, entity relation extraction is an important trend. A relation that consistently connects two entities in many documents is important information in science and enterprise alike. Entity relation extraction helps detect new knowledge in big data.&lt;/p&gt;
 
&lt;p&gt;Other trends include detecting sentiment in social data, integrating multiple languages, and applying text analytics to audio and video transcripts. The number of videos grows at a constant rate, and transcripts are even more unstructured than written text because there is no punctuation. That's another exciting area on the horizon!&lt;/p&gt;
 
&lt;h2&gt;Who do you follow in the big data community?&lt;/h2&gt;
 
&lt;p&gt;&lt;strong&gt;Alyona Medelyan:&lt;/strong&gt; We tend to follow researchers in areas that are used for dealing with big data, such as natural language processing, visualization, user experience, human-computer information retrieval, and the semantic web. Two of them are also speaking at Strata this year: &lt;a href="http://strataconf.com/strata2012/public/schedule/speaker/122562?cmp=il-radar-st12-medelyan-divoli-interview"&gt;Daniel Tunkelang&lt;/a&gt; and &lt;a href="http://strataconf.com/strata2012/public/schedule/speaker/66363?cmp=il-radar-st12-medelyan-divoli-interview"&gt;Marti Hearst&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This interview was edited and condensed.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2012/01/unstructured-data-chaos.html"&gt;Embracing the chaos of data&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/03/simon-rogers-guardian-wikileaks.html"&gt;Before you interrogate data, you must tame it&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/11/data-mining-reputation.html"&gt;If your data practices were made public, would you be nervous?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/08/site-search-analytics-data.html"&gt;When was the last time you mined your site's search data?&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2010/08/watson-and-turing.html"&gt;Watson, Turing, and extreme machine learning&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


</content>

<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/unstructured-data-analysis-tools.html</feedburner:origLink></entry>

<entry>
<title>Four short links: 7 February 2012</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/5r0zwrD6Zyk/four-short-links-7-february-20-1.html" />
<id>tag:radar.oreilly.com,2012://57.47791</id>

<published>2012-02-07T11:00:00Z</published>
<updated>2012-02-07T11:00:00Z</updated>

<summary type="html"> Integrated Content Editor (GitHub) -- a track changes implementation, built in javascript, for anything that is contenteditable on the web, written by the NY Times team and open sourced. Data Tables -- featureful jQuery plugin for tables of data. (via Javascript Weekly) Creating a Developer Community (Slideshare) -- treat the problem like a channel conversion funnel: turn visitors into...</summary>
<author>
<name>Nat Torkington</name>
<uri>http://radar.oreilly.com/nat/</uri>
</author>

<category term="bigdata" label="big data" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="data" label="data" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="developerrelations" label="developer relations" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="javascript" label="javascript" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="opensource" label="open source" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="realitymining" label="reality mining" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="socialgraph" label="social graph" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ui" label="ui" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="versioncontrol" label="version control" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="web" label="web" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="https://github.com/NYTimes/ice/"&gt;Integrated Content Editor&lt;/a&gt; (GitHub) -- &lt;i&gt;a track changes implementation, built in javascript, for anything that is contenteditable on the web&lt;/i&gt;, written by the NY Times team and open sourced.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://datatables.net/"&gt;Data Tables&lt;/a&gt; -- featureful jQuery plugin for tables of data. (via &lt;a href="http://javascriptweekly.com"&gt;Javascript Weekly&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.slideshare.net/kohsuke/building-developer-community"&gt;Creating a Developer Community&lt;/a&gt; (Slideshare) -- treat the problem like a channel conversion funnel: turn visitors into downloaders, downloaders into users, users into contributors. His screenshots of shitty conversions are great! (via &lt;a href="https://twitter.com/#!/kohsukekawa/status/164756957641707520"&gt;Kohsuke Kawaguchi&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://arxiv.org/pdf/1201.5722v1.pdf"&gt;Sex Differences in Intimate Relationships&lt;/a&gt; (PDF) -- Albert-Laszlo Barabasi and others use social graph analysis to analyze communications patterns in relationships.  &lt;i&gt;Notice that not only does the preference for an opposite-sex &amp;#8220;best friend&amp;#8221; kick in significantly earlier for females than for males (~18 years vs mid-20s, respectively), but females maintain a higher plateau value for much longer.&lt;/i&gt; More &lt;a href="http://reality.media.mit.edu/"&gt;reality mining&lt;/a&gt; to understand ourselves. (via &lt;a href="https://twitter.com/#!/sgourley/status/164619524249894912"&gt;Sean Gourley&lt;/a&gt;)&lt;/li&gt;
&lt;/ol&gt;&lt;/p&gt;

</content>
<dc:source>http://www.oreillynet.com/pub/au/149</dc:source>
<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/four-short-links-7-february-20-1.html</feedburner:origLink></entry>

<entry>
<title>Small Massachusetts HIT conference returns to big issues in health care</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/k522CgRlGu0/small-massachusetts-hit-confer.html" />
<id>tag:radar.oreilly.com,2012://57.47793</id>

<published>2012-02-06T19:46:10Z</published>
<updated>2012-02-06T19:46:10Z</updated>

<summary type="html">The real reason hospitals haven't joined health information exchanges, and other reports from the Massachusetts Health Data Consortium's annual conference.</summary>
<author>
<name>Andy Oram</name>
<uri>http://www.praxagora.com/andyo/</uri>
</author>

<category term="Gov 2.0" scheme="http://www.sixapart.com/ns/types#category" />

<category term="ashishjha" label="Ashish Jha" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ehrs" label="EHRs" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="electronichealthrecords" label="electronic health records" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="healthcare" label="health care" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="healthinformationexchange" label="health information exchange" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="healthit" label="health IT" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="hie" label="HIE" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="massachusettsheathdataconsortium" label="Massachusetts Heath Data Consortium" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="medical" label="medical" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="mhdc" label="MHDC" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;

&lt;p&gt;I've come to look forward to the Massachusetts Health Data Consortium's annual &lt;a href="http://mahealthdata.org/HIT12/agenda"&gt;HIT conference&lt;/a&gt; because--although speakers tout the very real and impressive progress made by Massachusetts health providers--you can also hear acerbic and ruthlessly candid critiques of policy and the status quo. Two notable take-aways from last year's conference (which I &lt;a href="http://radar.oreilly.com/2011/02/report-from-massachusetts-heal.html"&gt;wrote up&lt;/a&gt; at the time) were the equivalence of old "managed care" to new "accountable care organizations" and the complaint that electronic health records were "too expensive, too hard to use, and too disruptive to workflow." I'll return to these claims later.&lt;/p&gt;
&lt;h3&gt;The sticking point: health information exchange&lt;/h3&gt;

&lt;p&gt;

&lt;p&gt;This year, the spears were lobbed by &lt;a href="http://www.hsph.harvard.edu/faculty/ashish-jha/"&gt;Ashish Jha&lt;/a&gt; of Harvard Medical School, who laid out a broad overview of progress since the release of meaningful use criteria and then accused health care providers of undermining one of the main goals of meaningful use: the exchange of data between different providers who care for the same patient. Through quantitative research (publication &lt;a href="http://www.ncbi.nlm.nih.gov/pubmed/22084896"&gt;in progress&lt;/a&gt;), Jha's researchers showed a correlation between fear of competition and low adoption of HIEs. Hospitals with a larger, more secure position in their markets, or in more concentrated markets, were more likely to join an HIE.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;The research bolsters Jha's claim that the commonly cited barriers to using HIEs (technical challenges, cost, and privacy concerns) are surmountable, and that the real problem is a refusal to join because a provider fears that patients would migrate to other providers. It seems to me that the government and public can demand better from providers, but simply cracking the whip may be ineffective. Nor should it be necessary. An urgent shortage of medical care exists everywhere in the country, except perhaps a few posh neighborhoods. There's plenty for all providers. Once insurance is provided to all the people in need, no institution should need to fear a lack of business, unless its performance record is dismal.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;Jha also put up some research showing a strong trend toward adopting electronic health records, although the small offices that give half the treatment in the United States are still left behind. He warned that to see big benefits, we need to bring in health care institutions that are currently given little attention by the government--nursing homes, rehab facilities, and so forth--and give them incentives to digitize. He wrapped up by quoting David Blumenthal, former head of the ONC, on the subject of HIEs. Blumenthal predicted that we'd see EHRs in most providers over the next few years, and that the real battle would be getting them to adopt health information exchange.&lt;/p&gt;

&lt;p&gt;Meanwhile, meaningful use could trigger a &lt;a href="http://www.fiercehealthit.com/story/health-it-sales-growth-predicted-rise-only-slightly-2012/2012-02-02"&gt;shake-out in the EHR industry&lt;/a&gt;, as vendors who have spent years building siloed projects fail to meet the Stage 2 requirements, which fulfill the highest aspirations of the HITECH Act that defined meaningful use, including health information exchange. At the same time, a small but steadily increasing number of open source projects have achieved meaningful use certification. So we'll see more advances in the adoption of both EHRs and HIEs.&lt;/p&gt;

&lt;h3&gt;Low-hanging fruit signals a new path for cost savings&lt;/h3&gt;

&lt;p&gt;

&lt;p&gt;The big achievement in Massachusetts, going into the conference today, was a recent &lt;a href="http://articles.boston.com/2011-10-06/business/30253045_1_health-care-providers-primary-care-physicians"&gt;agreement&lt;/a&gt; between the state's major insurer, Blue Cross Blue Shield, and the 800-pound gorilla of the state's health care market, Partners HealthCare System. The pact significantly slows the skyrocketing costs that we've all become accustomed to in the United States, through the adoption of global payments (that is, fixed reimbursements for treating patients in certain categories). That two institutions of such weight can relinquish the old, imprisoning system of fee-for-service is news indeed.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;Note that the Blue Cross/Partners agreement doesn't even involve the formation of an Accountable Care Organization. Presumably, Partners believes it can pick some low-hanging fruit through modest advances in efficiency. Cost savings you can really count will come from ACOs, where total care of the patient is streamlined through better transfers of care and intensive communication. Patient-centered medical homes can do even more. So an ACO is actually much smarter than old managed care. But it depends on collecting good data and using it right.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;The current deal is an important affirmation of the path Massachusetts took long before the rest of the country in aiming for universal health coverage. We all knew at the time that the Massachusetts bill was not addressing costs and that these would have to be tackled eventually. And at first, of course, health premiums went up because a huge number of new people were added to the rolls, and many of them were either sick or part of high-risk populations.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;The cost problem is now being addressed through administrative pressure (at one point, Governor Deval Patrick flatly denied a large increase requested by insurers), proposed laws, and sincere efforts at the private level such as the Blue Cross/Partners deal. I asked a member of the Patrick administration whether the problem could be solved without a new law, and he expressed the opinion that there's a good chance it could be. Steven Fox of Blue Cross Blue Shield said that 70% of their HMO members go to physicians in their Alternative Quality Network, which features global payments. And he said these members have better outcomes at lower costs.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;ACOs have a paradoxical effect on health information exchange. Jha predicted that ACOs--while greatly streamlining the exchanges between their member organizations, because these save money--will resist exchanging data with outside providers because keeping patients is even more important for ACOs than for traditional hospitals and clinics. Only by keeping a patient can the ACO reap the benefits of the investments it makes in long-term patient health.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;As Doris Mitchell received an award for her work with the MHDC, executive director Ray Campbell mentioned the rapid growth and new responsibilities of her agency, the Group Insurance Commission, which negotiates all health insurance coverage for state employees. Cities and towns have been transferring their municipal employees to it. A &lt;a href="http://www.mma.org/labor-and-personnel/5590-senate-passes-municipal-health-insurance-reform-plan"&gt;highly contentious bill&lt;/a&gt; last year that allowed the municipalities to transfer their workers to the GIC was widely interpreted as a blow against unionized workers, when it was actually just a bid to save money through the familiar gambit of combining the insured into a larger pool. I &lt;a href="http://radar.oreilly.com/2011/04/summary-of-health-care-outcome.html"&gt;covered this controversy&lt;/a&gt; at the time.&lt;/p&gt;

&lt;h3&gt;A low-key conference&lt;/h3&gt;

&lt;p&gt;

&lt;p&gt;Attendance was down at this year's conference, with about half as many attendees and vendors as last year. Lowered interest seemed to be reflected in the fact that none of the three CEOs receiving awards turned up to represent their institutions (the two institutions mentioned earlier for their historic cost-cutting deal--Blue Cross Blue Shield and Partners HealthCare--along with Steward Health Care).&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;The morning started with a thoughtful look at the requirements for ACOs by Frank Ingari of Essence Healthcare, who predicted a big rise in investment by health care institutions in their IT departments. Later speakers echoed this theme, saying that hospitals should invest less in state-of-the-art equipment that leads to immediately billable activities, and more in the underlying IT that will allow them to collect research data and cut waste. Some of the benefits available through this research were covered in a &lt;a href="http://www.oscon.com/oscon2010/public/schedule/detail/15242"&gt;talk at the Open Source convention&lt;/a&gt; a couple of years ago.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;Another intriguing session covered technologies available today that could be more widely adopted to improve health care. Videos of robots always draw an enthusiastic response, but a more significant innovation ultimately may be a database McKesson is developing that lets doctors evaluate genetic tests and decide when such tests are worth the money and trouble.&lt;/p&gt;

&lt;p&gt;

&lt;p&gt;The dozen vendors were joined by a non-profit, &lt;a href="http://www.sustainablehealthcareforhaiti.com"&gt;Sustainable Healthcare for Haiti&lt;/a&gt;. Their first project is one of the most basic health interventions one can make: providing wells for drinkable water. They have a local sponsor who can manage their relationship with the government, and an ambitious mission that includes job development, an outpatient clinic, and an acute care children's hospital.&lt;/p&gt;
</content>
<dc:source>http://www.oreillynet.com/pub/au/36</dc:source>
<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/small-massachusetts-hit-confer.html</feedburner:origLink></entry>

<entry>
<title>Business-government ties complicate cyber security</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/XOYgyBK8wCY/cyber-security-business-government.html" />
<id>tag:radar.oreilly.com,2012://57.47767</id>

<published>2012-02-06T14:00:00Z</published>
<updated>2012-02-06T14:00:00Z</updated>

<summary type="html">Is an attack on a U.S. business' network an attack on the U.S. itself? "Inside Cyber Warfare" author Jeffrey Carr discusses the intermingling of corporate and government interests in this interview.</summary>
<author>
<name>Howard Wen</name>
<uri>http://radar.oreilly.com/howardw</uri>
</author>

<category term="Gov 2.0" scheme="http://www.sixapart.com/ns/types#category" />

<category term="Web 2.0" scheme="http://www.sixapart.com/ns/types#category" />

<category term="Web Ops &amp; Performance" scheme="http://www.sixapart.com/ns/types#category" />

<category term="attack" label="attack" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="cloud" label="cloud" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="cyberwarfare" label="cyber warfare" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="cybersecurity" label="cybersecurity" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="cyberwarfare" label="cyberwarfare" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="infrastructure" label="infrastructure" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="security" label="security" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;From time to time, we like to check in with "&lt;a href="http://shop.oreilly.com/product/0636920021490.do?cmp=il-radar-books-jeff-carr-cyber-war-2nd-edition-interview"&gt;Inside Cyber Warfare&lt;/a&gt;" author Jeffrey Carr to get his thoughts on the digital security landscape. These conversations often address &lt;a href="http://radar.oreilly.com/2011/02/cybersecurity-gov-hackers.html"&gt;specific&lt;/a&gt; threats, but with the recent release of the second edition of Carr's book, we decided to explore some of the larger concepts shaping this space.&lt;/p&gt;

&lt;h2&gt;Are corporate and government interests in the U.S. becoming one and the same? That is, might an attack on an American business's network be regarded as an assault on the country itself?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jeffrey Carr:&lt;/strong&gt; Due to the dependence of the U.S. government upon private contractors, the insecurity of one impacts the security of the other. The fact is that there are an unlimited number of ways that an attacker can compromise a person, organization or government agency due to the interdependencies and connectedness that exist between them.&lt;/p&gt;

&lt;h2&gt;Are national network security and media piracy becoming interrelated and confused?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jeffrey Carr:&lt;/strong&gt; It has definitely become confused to the point where the Department of Homeland Security (DHS) is now the enforcement arm of the Recording Industry Association of America (RIAA), which I find utterly disgraceful. It's due entirely to the money and power that entertainment industry lobbyists have to wave in front of members of Congress. It has absolutely nothing to do with improving the security of our critical infrastructure or reducing the attack platform used by bad actors.&lt;/p&gt;

&lt;h2&gt;Flipping this around, how much of a cyber threat does the U.S. pose to other countries?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jeffrey Carr:&lt;/strong&gt; The U.S. is probably as capable at conducting cyber operations as any of the other nation states that engage in them, if not more so. It's not a question of &amp;quot;they do it to us, but we don't do it to them.&amp;quot; It's a question of how to defend your critical assets in light of the fact that everyone is doing it.&lt;/p&gt;

&lt;h2&gt;What recent technologies concern you the most?&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Jeffrey Carr:&lt;/strong&gt; We are racing to adopt &lt;a href="http://radar.oreilly.com/2011/12/cloud-service-security-attack.html"&gt;cloud computing&lt;/a&gt; without regard to security. In fact, many customers wrongly assume that the cloud provider is responsible for their data's security when the reverse is true. Not only is security a major problem, but there's no telling where in the world your data may reside since most large cloud providers have server farms scattered around the world. That, in turn, makes the data susceptible to foreign governments that have cause to request legal access to data sitting on servers inside their borders.&lt;/p&gt;

&lt;div style="float: left; border-top: thin gray solid; border-bottom: thin gray solid; padding: 20px; margin: 20px 2px; clear: both;"&gt;&lt;a href="http://shop.oreilly.com/product/0636920021490.do?cmp=il-radar-books-jeff-carr-cyber-war-2nd-edition-interview"&gt;&lt;img style="float: left; border: none; padding-right: 10px;" src="http://radar.oreilly.com/2011/12/02/1111-insider-cyber-war-2nd-cover.png" /&gt;&lt;/a&gt;&lt;a href="http://shop.oreilly.com/product/0636920021490.do?cmp=il-radar-books-jeff-carr-cyber-war-2nd-edition-interview"&gt;&lt;strong&gt;Inside Cyber Warfare, 2nd Edition&lt;/strong&gt;&lt;/a&gt; &amp;mdash; Jeffrey Carr's second edition of "Inside Cyber Warfare" goes beyond the headlines of attention-grabbing DDoS attacks and takes a deep look inside recent cyber-conflicts, including the use of Stuxnet.&lt;/div&gt;

&lt;p&gt;&lt;em&gt;This interview was edited and condensed.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Related:&lt;/strong&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/12/cloud-service-security-attack.html"&gt;Why cloud services are a tempting target for attackers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2010/02/cyber-warfare-dont-inflate-it.html"&gt;Cyber warfare: don't inflate it, don't underestimate it&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://radar.oreilly.com/2011/02/cybersecurity-gov-hackers.html"&gt;Trend to watch: Formal relationships between governments and hackers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt; &lt;a href="http://answers.oreilly.com/topic/1386-how-to-prepare-for-a-cyber-attack/"&gt;How to prepare for a cyber attack&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>

<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/cyber-security-business-government.html</feedburner:origLink></entry>

<entry>
<title>Four short links: 6 February 2012</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/rB8nwI1SedY/four-short-links-6-february-20.html" />
<id>tag:radar.oreilly.com,2012://57.47790</id>

<published>2012-02-06T11:00:00Z</published>
<updated>2012-02-06T11:00:00Z</updated>

<summary type="html">Jirafe -- open source e-commerce analytics for the Magento platform. iModela -- a $1000 3D milling machine. (via BoingBoing) It's Too Late to Save The Common Web (Robert Scoble) -- paraphrased: "Four years ago, I told you all that Google and Facebook were evil. You did nothing, which is why I must now use Google and Facebook." His list of...</summary>
<author>
<name>Nat Torkington</name>
<uri>http://radar.oreilly.com/nat/</uri>
</author>

<category term="analytics" label="analytics" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="apple" label="apple" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="bigdata" label="big data" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ebooks" label="ebooks" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="ecommerce" label="ecommerce" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="futureofmanufacturing" label="future of manufacturing" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="socialgraph" label="social graph" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="socialsoftware" label="social software" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;&lt;ol&gt;
&lt;li&gt;&lt;a href="http://jirafe.com/"&gt;Jirafe&lt;/a&gt; -- open source e-commerce analytics for the Magento platform.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://icreate.rolanddg.com/iModela/Global/English/index.html/"&gt;iModela&lt;/a&gt; -- a $1000 3D milling machine. (via &lt;a href="http://boingboing.net/2012/02/02/imodela-rolands-1000-hobby.html"&gt;BoingBoing&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;&lt;a href="http://scobleizer.com/2012/02/04/its-too-late-for-dave-winer-and-john-battelle-to-save-the-common-web/"&gt;It's Too Late to Save The Common Web&lt;/a&gt; (Robert Scoble) -- paraphrased: "Four years ago, I told you all that Google and Facebook were evil. You did nothing, which is why I must now use Google and Facebook."  His list of reasons that Facebook beats the Open Web gives new shallows to the phrase "vanity metrics". Yes, the open web does not go out of its way to give you an inflated sense of popularity and importance. On the other hand, the things you do put there are in your control and will stay as long as you want them to. But that's obviously not a killer feature compared to a bottle of Astroglide and an autorefreshing page showing your Klout score and the number of Google+ circles you're in.&lt;/li&gt;
&lt;li&gt;&lt;a href="http://www.macobserver.com/tmo/article/apple_clarifies_ibooks_author_eula_excludes_claim_on_content/"&gt;iBooks Author EULA Clarified&lt;/a&gt; (MacObserver) -- important to note that it doesn't say you can't use the content you've written, only that you can't sell .ibook files through anyone but Apple. Less obnoxious than the "we own all your stuff, dude" interpretation, but still a bit crap. I wonder how anticompetitive this will be seen as. Apple's vertical integration is ripe for Justice Department investigation.&lt;/li&gt;
&lt;/ol&gt;&lt;/p&gt;

</content>
<dc:source>http://www.oreillynet.com/pub/au/149</dc:source>
<dc:type>text</dc:type>
<on:image />
<feedburner:origLink>http://radar.oreilly.com/2012/02/four-short-links-6-february-20.html</feedburner:origLink></entry>

<entry>
<title>Top stories: January 30-February 3, 2012</title>
<link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/oreilly/radar/atom/~3/EppO5bHHZaY/hadoop-unstructured-data-moneyball.html" />
<id>tag:radar.oreilly.com,2012://57.47787</id>

<published>2012-02-03T20:00:00Z</published>
<updated>2012-02-03T20:00:00Z</updated>

<summary type="html">This week on O'Reilly: Edd Dumbill examined the components and functions of the Hadoop ecosystem, Pete Warden gave a big thumbs-up to unstructured data, and Jonathan Alexander looked at how a Moneyball approach could help software teams. </summary>
<author>
<name>Mac Slocum</name>
<uri>http://radar.oreilly.com/mslocum</uri>
</author>

<category term="apple" label="apple" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="datatool" label="data tool" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="hadoop" label="hadoop" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="iphone" label="iphone" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="mobilepayment" label="mobile payment" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="moneyball" label="moneyball" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="nfc" label="nfc" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="software" label="software" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="softwareteam" label="software team" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="topstories" label="top stories" scheme="http://www.sixapart.com/ns/types#tag" />
<category term="unstructureddata" label="unstructured data" scheme="http://www.sixapart.com/ns/types#tag" />

<content type="html" xml:lang="en" xml:base="http://radar.oreilly.com/">
&lt;p&gt;Here's a look at the top stories published across O'Reilly sites this week.&lt;/p&gt;

&lt;p style="width: 100%; height: 20px; margin: 0; clear: both;"  /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html"&gt;&lt;img src="http://radar.oreilly.com/2012/01/20/0112-hadoop-slider2.png" border="0" width="148" style="float: left; margin: 3px 10px 10px 0;" /&gt;&lt;/a&gt;&lt;a href="http://radar.oreilly.com/2012/02/what-is-apache-hadoop.html"&gt;&lt;strong&gt;What is Apache Hadoop?&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt;Apache Hadoop has been the driving force behind the growth of the big data industry. But what does it do, and why do you need all its strangely-named friends? (Related: &lt;a href="http://radar.oreilly.com/2012/02/hadoop-doug-cutting-apache-data-processing.html"&gt;Hadoop creator Doug Cutting on why Hadoop caught on&lt;/a&gt;.)&lt;/p&gt;

&lt;p style="width: 100%; height: 20px; margin: 0; clear: both;"  /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://radar.oreilly.com/2012/01/unstructured-data-chaos.html"&gt;&lt;img src="http://radar.oreilly.com/2012/02/03/0112-data-chaos-slider2.png" border="0" width="148" style="float: left; margin: 3px 10px 10px 0;" /&gt;&lt;/a&gt;&lt;a href="http://radar.oreilly.com/2012/01/unstructured-data-chaos.html"&gt;&lt;strong&gt;Embracing the chaos of data&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt;
Data scientists, it's time to welcome errors and uncertainty into your data projects. In this interview, Jetpac CTO Pete Warden discusses the advantages of unstructured data.&lt;/p&gt;

&lt;p style="width: 100%; height: 20px; margin: 0; clear: both;"  /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://radar.oreilly.com/2012/01/moneyball-software-engineers-teams-data.html"&gt;&lt;img src="http://radar.oreilly.com/2012/01/27/0112-scoreboard-slider.jpg" border="0" width="148" style="float: left; margin: 3px 10px 10px 0;" /&gt;&lt;/a&gt;&lt;a href="http://radar.oreilly.com/2012/01/moneyball-software-engineers-teams-data.html"&gt;&lt;strong&gt;Moneyball for software engineering, part 2&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt;
A look at the "Moneyball"-style metrics and techniques managers can employ to get the most out of their software teams.&lt;/p&gt;

&lt;p style="width: 100%; height: 20px; margin: 0; clear: both;"  /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://radar.oreilly.com/2012/01/with-govuk-british-government.html"&gt;&lt;img src="http://radar.oreilly.com/2012/02/03/0212-govuk-slider.png" border="0" width="148" style="float: left; margin: 3px 10px 10px 0;" /&gt;&lt;/a&gt;&lt;a href="http://radar.oreilly.com/2012/01/with-govuk-british-government.html"&gt;&lt;strong&gt;With GOV.UK, British government redefines the online government platform&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt;A new beta .gov website in Britain is open source, mobile friendly, platform agnostic, and open for feedback. &lt;/p&gt;

&lt;p&gt;&lt;br /&gt;
&lt;p style="width: 100%; height: 20px; margin: 0; clear: both;"  /&gt;&lt;/p&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://radar.oreilly.com/2012/02/iphone5-nfc-paypal-homedepot-square-politics.html#apple"&gt;&lt;img src="http://radar.oreilly.com/2011/08/25/0811-apple-logo-slider.png" border="0" width="148" style="float: left; margin: 3px 10px 10px 0;" /&gt;&lt;/a&gt;&lt;a href="http://radar.oreilly.com/2012/02/iphone5-nfc-paypal-homedepot-square-politics.html#apple"&gt;&lt;strong&gt;When will Apple mainstream mobile payments?&lt;/strong&gt;&lt;/a&gt;&lt;br /&gt;David Sims parses the latest iPhone / near-field-communication rumors and considers the impact of Apple's (theoretical) entrance into the mobile payment space.&lt;/p&gt;

&lt;p&gt;&lt;br /&gt;
&lt;p style="width: 100%; height: 20px; margin: 0; clear: both;"  /&gt;&lt;/p&gt;&lt;/p&gt;

&lt;hr&gt;

&lt;p&gt;
&lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-top-stories-020312"&gt;&lt;strong&gt;Strata 2012&lt;/strong&gt;&lt;/a&gt;, Feb. 28-March 1 in Santa Clara, Calif., will offer three full days of hands-on data training and information-rich sessions. Strata brings together the people, tools, and technologies you need to make data work. &lt;a href="https://en.oreilly.com/strata2012/public/regwith/radar20?cmp=il-radar-st12-top-stories-020312"&gt;&lt;strong&gt;Save 20% on Strata registration with the code RADAR20&lt;/strong&gt;&lt;/a&gt;.&lt;/p&gt;

</content>
<dc:source>http://www.oreillynet.com/pub/au/3515</dc:source>
<dc:type>text</dc:type>
<on:image>http://radar.oreilly.com/radar-topstories-slider.png</on:image>
<feedburner:origLink>http://radar.oreilly.com/2012/02/hadoop-unstructured-data-moneyball.html</feedburner:origLink></entry>

</feed>

