<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Infochimps Blog</title>
	
	<link>http://blog.infochimps.com</link>
	<description>Big data insights, news and tips straight from the Data Mine</description>
	<lastBuildDate>Wed, 23 May 2012 21:33:36 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/infochimps-blog" /><feedburner:info uri="infochimps-blog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId>infochimps-blog</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Texas Has Chest Congestion</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/CNzOj2P3pqU/</link>
		<comments>http://blog.infochimps.com/2012/05/23/texas-has-chest-congestion/#comments</comments>
		<pubDate>Wed, 23 May 2012 21:33:36 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Pop Data]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1627</guid>
		<description><![CDATA[Here&#8217;s a great example of how one company takes Big Data and makes it fun.  Help is a drug company that strives to simplify the pharmaceutical choices for customers.  Their website now features a map highlighting sales data from Target and Walgreens called &#8220;What&#8217;s &#8230; <a href="http://blog.infochimps.com/2012/05/23/texas-has-chest-congestion/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.infochimps.com/wp-content/uploads/2012/05/whatswrongwithtexas.png"><img class="alignleft size-full wp-image-1628" title="whatswrongwithtexas" src="http://blog.infochimps.com/wp-content/uploads/2012/05/whatswrongwithtexas.png" alt="" width="640" /></a></p>
<p>Here&#8217;s a great example of how one company takes Big Data and makes it fun.  <a title="Help" href="http://www.helpineedhelp.com/" target="_blank">Help</a> is a drug company that strives to simplify the pharmaceutical choices for customers.  Their website now features a map highlighting sales data from Target and Walgreens called <a title="What's wrong U.S.?" href="http://helpineedhelp.com/whatswrongus/" target="_blank">&#8220;What&#8217;s wrong U.S.?&#8221;</a>.  A bar chart for each state shows how many people are buying products for particular ailments versus the national average; you can also click on the state and get region-by-region details.  For example, Central Texas, home to our own Austin, TX, has higher-than-average sales of blister products.  Maybe it&#8217;s all the <a href="http://www.shape.com/lifestyle/fit-getaways/fittest-cities-10-austin-texas">running and biking</a> we do!</p>
<p>(via <a href="http://flowingdata.com/2012/05/17/montana-cant-sleep/">Flowing Data</a>)</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/23/texas-has-chest-congestion/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/23/texas-has-chest-congestion/</feedburner:origLink></item>
		<item>
		<title>S3Chimp: Information Science in Action</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/DKBZVYqb2mE/</link>
		<comments>http://blog.infochimps.com/2012/05/21/s3chimp-information-science-in-action/#comments</comments>
		<pubDate>Mon, 21 May 2012 16:33:36 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Data Mine]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1625</guid>
		<description><![CDATA[I’m Selene, Infochimps’ new Analyst. Prior to my new position, I was an Infochimps intern. I recently graduated from the School of Information at the University of Texas with a Master of Science in Information Studies. As part of my &#8230; <a href="http://blog.infochimps.com/2012/05/21/s3chimp-information-science-in-action/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p dir="ltr"><img class="alignleft" title="Selene Arrazolo" src="http://www.infochimps.com/images/team/selene_arrazolo.jpg" alt="" width="130" height="130" />I’m Selene, Infochimps’ new Analyst. Prior to my new position, I was an Infochimps intern. I recently graduated from the <a href="http://www.ischool.utexas.edu">School of Information at the University of Texas</a> with a Master of Science in Information Studies. As part of my MSIS degree plan, I completed a semester-long project entitled “Developing and Integrating a Lightweight Metadata System into a Data Ingestion Workflow” here at Infochimps, Inc.</p>
<p>The main ingredients of the project were Ruby on Rails, MongoDB, and everyone’s favorite, Amazon Web Services. The result is an alpha stage of the tentatively named S3Chimp. It is an addition to <a href="http://blog.infochimps.com/2012/04/09/dashpot/">Dashpot</a>, our Analytics &amp; Operations Dashboard for the Infochimps Platform. Dashpot boasts an easy-to-use analytics and operations dashboard that provides business metrics and visualization, cluster management capabilities, and system monitoring on top of the Infochimps Platform. Integrating a lightweight metadata system into the workflow makes it possible for Dashpot to also track and organize distributed massive-scale data assets. What was once time-consuming (according to us as well as various people in the industry) can now be a dynamic part of an organization’s internal analytics.</p>
<p dir="ltr">Before I could begin making S3Chimp, organizing the Infochimps Amazon S3 Buckets was key. Perhaps a company that boasts about its command of data should have a beautifully organized set of buckets? Perhaps&#8230;.  But let’s pretend that is not the case. And let us imagine that a young and excited Information Studies graduate student decides to tackle the S3 clutter. The essential steps in such a scenario include designing a well-thought-out schema guideline tailored to the company’s needs and data types, and incessantly enforcing those guidelines.</p>
<p dir="ltr">Next on the list was learning Ruby on Rails, over several weeks. It was a baptism by fire. I learned the very basics of Ruby on Rails and how to love the MVC trinity. Ruby on Rails is a smart and fun web app framework and it was an enjoyable experience, relative to PHP. Relative to a Saturday afternoon at Barton Springs? Not so much.</p>
<p dir="ltr">With a snazzy script written in the enchanted Infochimps Data Mine, I was able to take the most exciting leap: taking metadata from the now beautifully organized S3 buckets and injecting it into MongoDB, a NoSQL database. The result is the S3Chimp genesis. S3Chimp is a system that tells you what data, and how much of it, is in AWS, all from your analytics dashboard. Future plans for this product include making a tool to capture provenance metadata, and other goodies.</p>
<p dir="ltr"><img class="alignleft" title="MongoDB" src="http://www.thebuzzmedia.com/wp-content/uploads/2010/07/mongo-db-huge-logo.png" alt="" width="270" height="90" />You can find me at the <a href="http://www.10gen.com/events/mongo-nyc">upcoming MongoDB NYC conference</a>, if you’d like to ask me about our awesome new Ironfan Platform, Dashpot, or my capstone project.</p>
<p dir="ltr">I’d like to thank my Field Supervisor, Flip Kromer, as well as my Faculty Adviser, Dr. Melanie Feinberg.</p>
<p dir="ltr"><em>Keep an eye out for my next blog post where I will be chronicling my personal Ruby on Rails adventure that is near and dear to my librarian heart. Travis Dempsey and I will make an in-house database of our office library’s catalog. The Bufkin Repository’s catalog is currently housed in <a href="https://www.librarything.com/catalog/Bufkin_Repository">LibraryThing</a>.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/21/s3chimp-information-science-in-action/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/21/s3chimp-information-science-in-action/</feedburner:origLink></item>
		<item>
		<title>How We Do It</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/J_4qnGtTc50/</link>
		<comments>http://blog.infochimps.com/2012/05/18/how-we-do-it/#comments</comments>
		<pubDate>Fri, 18 May 2012 12:00:18 +0000</pubDate>
		<dc:creator>Nathaniel Eliot</dc:creator>
				<category><![CDATA[Monkey Business]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1619</guid>
		<description><![CDATA[Infochimps uses many cutting-edge tools (Chef, Amazon Web Services, Hadoop, HBase, ElasticSearch, Flume, MongoDB, Phantom.js, etc. ad nauseam), and we&#8217;ve written a number of custom tools to help corral these sometimes wild horses into a working team. Ironfan, our &#8230; <a href="http://blog.infochimps.com/2012/05/18/how-we-do-it/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" title="This is how we do it" src="http://www.funzine.hu/wp-content/uploads/2011/09/this-is-how-we-do-it.jpg" alt="" width="300" />Infochimps uses many cutting-edge tools (Chef, Amazon Web Services, Hadoop, HBase, ElasticSearch, Flume, MongoDB, Phantom.js, etc. ad nauseam), and we&#8217;ve written a number of custom tools to help corral these sometimes wild horses into a working team. Ironfan, our Chef specialization for big data in the cloud, coordinates the installation and configuration of the many necessary components. Wukong is our Ruby library for Hadoop, combining the flexibility of JRuby with the raw power of MapReduce. Wonderdog is our Hadoop interface to ElasticSearch, allowing us to deliver large amounts of data quickly into a stable and searchable NoSQL data store. Swineherd, the workflow engine for Hadoop jobs, helps tie all of this together into a coherent framework for running multi-stage data ingestions.</p>
<p>To crib a DevOps aphorism, however, it&#8217;s not the technology that makes Infochimps work: it&#8217;s the culture. Specifically, it’s the culture that keeps the challenges from all that novel technology manageable.</p>
<p><span id="more-1619"></span></p>
<p><a href="http://blog.infochimps.com/wp-content/uploads/2012/05/418048_10150734896246407_77218166406_10940682_1448237063_n.jpg"><img class="aligncenter size-full wp-image-1620" title="418048_10150734896246407_77218166406_10940682_1448237063_n" src="http://blog.infochimps.com/wp-content/uploads/2012/05/418048_10150734896246407_77218166406_10940682_1448237063_n.jpg" alt="" width="619" height="413" /></a><br />
Our hiring process is a big part of building and maintaining that culture. We have multiple interview passes, to efficiently separate the few who will fit from the large mass of potential hires. The first pass is with our office manager Holly, a sweet lady who weeds out obvious mismatches in personality, interest, or resume. Next is a phone interview with Adam, our technical team lead, to help weed out those with obviously insufficient skill sets. After that comes a team interview at the office, to do a finer test on the cultural fit, and start sniffing out where the candidate&#8217;s skills and interests are. The last hurdle before an offer is a short initial contract job (a week or two long, paid on completion); nothing demonstrates work ethic and development style more clearly than actual development work. Although there are two technical passes, in all cases the focus is less on existing experience, and more on attitude and potential: even the most experienced candidate will lack experience with most of our tools, so adaptability and initiative are important traits in a successful hire.</p>
<p>Our management style relies heavily on what we have won in the hiring process: a work environment full of capable, intelligent, and self-motivated people. Management structure is very flat; I regularly consult with the C-level folks, and everyone else, as a particular task requires. Well-defined (but flexible) roles help keep communication open, as it’s usually obvious who should be included in a discussion. The technical leadership is focused on setting and tying together goals, leaving most choices about the implementation to those doing the work, but always available to clarify what choices align best with the bigger picture. Shared language for common pains and frustrations (e.g. spending currency as an analogy for causing developer frustration, or our various terms for types of technical debt) helps encourage empathy, and a shared focus on troubleshooting over blame-assignment. Above all, management strives to avoid mandatory overhead for development (i.e. regular status reports and meetings), instead relying on each employee&#8217;s good judgment and occasional casual check-ins to decide how they communicate status and needs.</p>
<p>Beyond core management style, Infochimps goes to great lengths to support its employees. Developers aren&#8217;t always the best at remembering to eat in the middle of a deep code delve, so lunches (and gentle reminders from Holly) are supplied, in addition to a fully stocked kitchen. There&#8217;s an employee joy fund, for which employees propose and vote on uses: past choices have ranged from &#8220;a new coffee maker that doesn&#8217;t suck&#8221; to bimonthly yoga classes. There are company outings, both formal and impromptu, and some fun and games around the office too (including the occasional Magic: the Gathering free-for-all).</p>
<p>Employee career development is another big key to employee joy. An employee&#8217;s focus is largely self-directed, with interest trumping experience in all but the most time- or stability-sensitive projects. To paraphrase Flip, one of our founders, our aim is to make employees awesomely valuable to the open job market, and totally uninterested in it.</p>
<p>Our development culture is heavily agile, embracing elements from Scrum, Kanban, and DevOps without slavish adherence to any of them. Though core technology choices often come from the C-level, good ideas can and do come from anywhere, from the newest hire to the office manager. In a similar way, although operations are my core responsibility, they are not mine alone: we are closer to the ideal of DevOps (or perhaps NoOps+1 or AllOps), in that ultimately everyone shares the goal (and some of the load) of keeping everything operational. Repeatability is key to many of our core products, but we balance it with an understanding that automation is best done to address boredom or terror, not just inefficiency; a task must either be too predictable to be interesting, or too complex to be feasible, to be a good reason to add further infrastructure. We are also consciously risk-taking, preferring failure from audacity to failure from inaction, and failing forward instead of rolling back wherever possible.</p>
<p>Our infrastructure choices are made with similar goals in mind: developer experience and ergonomics are important criteria for tool choice. Resources are open by default (at least in cultural assumption, even where security concerns prevent it in actuality), so that developers may easily get to what they need. Components should ideally be small, decoupled, and late-binding wherever possible; reducing the interdependence of the system improves both how manageable it is, and how flexible your architecture can be in the face of changing business needs. Making infrastructure repeatable (by making it from code, via Ironfan) means that building anew is an attractive option, which can free you from some of the worst of legacy code upgrade cycles. Archiving unused code and data from production systems, as opposed to supporting everything without question, makes the resultant systems easier to understand and trust.</p>
<p>So now that we&#8217;ve got this great workplace, what&#8217;s next? We foresee (and are even starting to experience) some growing pains as we shift into our enterprise-focused work. How do we handle the impedance mismatch between our model and our clients&#8217; models? What do we do as the company grows beyond the size of <a href="http://en.wikipedia.org/wiki/Dunbar's_number">the monkeysphere</a>? How should we tackle user segmentation and security as we build our Platform out?</p>
<p>Ultimately, the answers boil down to the same thing we have been doing: find the best teammates we can, then tear down any barriers between them and being awesome.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/18/how-we-do-it/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/18/how-we-do-it/</feedburner:origLink></item>
		<item>
		<title>Why Geeks Win</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/O8Nn1b6oAs4/</link>
		<comments>http://blog.infochimps.com/2012/05/16/why-geeks-win/#comments</comments>
		<pubDate>Wed, 16 May 2012 15:00:37 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Pop Data]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1622</guid>
		<description><![CDATA[We found this great little chart on Chart Porn today and thought it was an excellent representation of the foundations of our company.  Yay, geeks!]]></description>
			<content:encoded><![CDATA[<p>We found this great little chart on <a href="http://chartporn.org/">Chart Porn</a> today and thought it was an excellent representation of the foundations of our company.  Yay, geeks!</p>
<p><img class="alignnone" title="Geeks and Repetitive Tasks" src="http://chartporn.org/wp-content/uploads/2012/05/image13.png" alt="" width="513" height="367" /></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/16/why-geeks-win/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/16/why-geeks-win/</feedburner:origLink></item>
		<item>
		<title>Why Real-Time Analytics? [Free White Paper]</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/FPRsrNIJJC4/</link>
		<comments>http://blog.infochimps.com/2012/05/09/why-real-time-analytics/#comments</comments>
		<pubDate>Wed, 09 May 2012 13:00:29 +0000</pubDate>
		<dc:creator>Tim Gasper</dc:creator>
				<category><![CDATA[Data Mine]]></category>
		<category><![CDATA[Products & Features]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1614</guid>
		<description><![CDATA[When you think Big Data, the first words that come to mind are often Hadoop and NoSQL, but what do these technologies actually mean for your business?  Different Big Data technologies have different use cases where they work best.  For &#8230; <a href="http://blog.infochimps.com/2012/05/09/why-real-time-analytics/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.infochimps.com/wp-content/uploads/2012/05/realtime-analytics.png"><img class="alignleft size-full wp-image-1616" src="http://blog.infochimps.com/wp-content/uploads/2012/05/realtime-analytics.png" alt="" width="640" /></a></p>
<p>When you think Big Data, the first words that come to mind are often Hadoop and NoSQL, but what do these technologies actually mean for your business?  Different Big Data technologies have different use cases where they work best.  For your real-time Big Data challenges, a very different class of tools must often be implemented.</p>
<p>In this <a title="Why Real-Time Analytics - Big Data White Paper" href="http://www.infochimps.com/_kanzi/assets/whitepapers/why-real-time-analytics.pdf?utm_source=white%2Bpaper&amp;utm_medium=social&amp;utm_campaign=real-time-analytics">free white paper</a>, we&#8217;ll explore:</p>
<ul>
<li>How to create a flexible architecture that allows you to use the best Big Data tools and technologies for the job at hand</li>
<li>Where Hadoop analysis and NoSQL databases work and where they can fall short</li>
<li>How Hadoop differs from real-time analytics and stream processing approaches</li>
<li>Visual representations of how real-time analytics works and real world use cases</li>
<li>How to leverage the Infochimps Platform to perform real-time analytics</li>
</ul>
<div><strong><span style="font-size: large"><span style="line-height: 55px"><a title="Why Real-Time Analytics - Big Data White Paper" href="http://www.infochimps.com/_kanzi/assets/whitepapers/why-real-time-analytics.pdf?utm_source=white%2Bpaper&amp;utm_medium=social&amp;utm_campaign=real-time-analytics">Download the white paper here</a></span></span></strong></div>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/09/why-real-time-analytics/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/09/why-real-time-analytics/</feedburner:origLink></item>
		<item>
		<title>The Era of Big Data and What It Means For You</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/G7nB9erhkE4/</link>
		<comments>http://blog.infochimps.com/2012/05/07/the-era-of-big-data-and-what-it-means-for-you/#comments</comments>
		<pubDate>Mon, 07 May 2012 21:11:39 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Big Data News]]></category>
		<category><![CDATA[Community]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1612</guid>
		<description><![CDATA[When it comes to predicting the future, your best resource (short of a soothsayer) is historical data.  As data collection, storage and processing have become more sophisticated, the volume of data has exploded. A recent article in the McKinsey Quarterly states &#8230; <a href="http://blog.infochimps.com/2012/05/07/the-era-of-big-data-and-what-it-means-for-you/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignleft" title="Future" src="http://www.bostonglobe.com/rf/image_r/Boston/2011-2020/2011/12/29/BostonGlobe.com/ReceivedContent/Images/cookie2.r.jpg" alt="" width="420" /></p>
<p>When it comes to predicting the future, your best resource (short of a soothsayer) is historical data.  As data collection, storage and processing have become more sophisticated, the volume of data has exploded. A recent article in the <a href="http://www.mckinseyquarterly.com/Are_you_ready_for_the_era_of_big_data_2864#">McKinsey Quarterly</a> states that in the US, across most business sectors, companies with more than 1000 employees store, on average, over 235 terabytes of data &#8211; more data than is contained in the entirety of the US Library of Congress.</p>
<p>What does this mean?  It means that companies are sitting on a goldmine of insights for competitive advantage.  The McKinsey Quarterly article mentions this example:</p>
<blockquote><p>The top marketing executive at a sizable US retailer recently found herself perplexed by the sales reports she was getting. A major competitor was steadily gaining market share across a range of profitable segments. Despite a counterpunch that combined online promotions with merchandizing improvements, her company kept losing ground.</p>
<p>When the executive convened a group of senior leaders to dig into the competitor’s practices, they found that the challenge ran deeper than they had imagined. The competitor had made massive investments in its ability to collect, integrate, and analyze data from each store and every sales unit and had used this ability to run myriad real-world experiments. At the same time, it had linked this information to suppliers’ databases, making it possible to adjust prices in real time, to reorder hot-selling items automatically, and to shift items from store to store easily. By constantly testing, bundling, synthesizing, and making information instantly available across the organization—from the store floor to the CFO’s office—the rival company had become a different, far nimbler type of business.</p></blockquote>
<p>The amount of data we produce is staggering and the underlying possibilities are incredible, but that doesn&#8217;t necessarily mean companies have the ability to extract true value from their data.</p>
<p>Looking to understand how Big Data can revolutionize how your organization does business?  Sign up for a <strong><a href="http://www.infochimps.com/free-big-data-consultation">free Big Data consultation</a></strong> with some of our leading data scientists to get started today!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/07/the-era-of-big-data-and-what-it-means-for-you/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/07/the-era-of-big-data-and-what-it-means-for-you/</feedburner:origLink></item>
		<item>
		<title>Milking Big Data in Pursuit of… More Milk</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/RLG411B_s7Y/</link>
		<comments>http://blog.infochimps.com/2012/05/02/milking-big-data-in-pursuit-of-more-milk/#comments</comments>
		<pubDate>Wed, 02 May 2012 17:15:33 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Big Data News]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1611</guid>
		<description><![CDATA[A recent article from The Atlantic explores how Big Data has revolutionized the dairy industry.  In the past sixty years, through innovations in dairy science, milk production from an individual dairy cow has gone up from an average 5,000 pounds of &#8230; <a href="http://blog.infochimps.com/2012/05/02/milking-big-data-in-pursuit-of-more-milk/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><img class="alignnone" title="Cows" src="http://4.bp.blogspot.com/-MgL_65DABso/TxH9hYzHVCI/AAAAAAAACRc/U9bZqX1d6LA/s1600/organic-dairy-cows.jpg" alt="" width="640" /></p>
<p>A recent article from <a href="http://www.theatlantic.com/technology/archive/2012/05/the-perfect-milk-machine-how-big-data-transformed-the-dairy-industry/256423/">The Atlantic</a> explores how Big Data has revolutionized the dairy industry.  In the past sixty years, through innovations in dairy science, milk production from an individual dairy cow has gone up from an average 5,000 pounds of milk in a lifetime to 21,000 pounds of milk.  This astonishing increase has largely been fueled by data-driven predictions that allow dairy breeders to optimize their herds.</p>
<blockquote><p>Dairy breeding is perfect for quantitative analysis. <a href="http://www.holsteinusa.com/pdf/print_material/read_pedigrees.pdf">Pedigree records</a> have been assiduously kept; <a href="http://www.docstoc.com/docs/115946014/Artificial-Insemination">relatively easy artificial insemination</a> has helped centralized genetic information in a <a href="http://www.holsteinusa.com/genetic_evaluations/ss_pedanal.html?printable=true">small number of key bulls</a> since the 1960s; there are a relatively <a href="http://www.wcds.ca/proc/1997/ch01-97.htm">small and easily measurable number of traits</a> &#8212; milk production, fat in the milk, protein in the milk, longevity, udder quality &#8212; that breeders want to optimize; each cow works for three or four years, which means that farmers <a href="http://www.theatlantic.com/technology/archive/2012/05/the-perfect-milk-machine-how-big-data-transformed-the-dairy-industry/256423/www.ksre.ksu.edu/library/agec2/mf272.pdf">invest thousands of dollars</a> into each animal, so it&#8217;s worth it to get the best semen money can buy. The economics push breeders to use the genetics.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/05/02/milking-big-data-in-pursuit-of-more-milk/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/05/02/milking-big-data-in-pursuit-of-more-milk/</feedburner:origLink></item>
		<item>
		<title>52 Billion Chickens</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/slyg03DN5CQ/</link>
		<comments>http://blog.infochimps.com/2012/04/27/52-billion-chickens/#comments</comments>
		<pubDate>Fri, 27 Apr 2012 22:10:47 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Pop Data]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1609</guid>
		<description><![CDATA[As you enter your weekend, consider this: human beings are outnumbered by lots of creatures in this world, including ants, which Harvard biologist and ant expert Edward O. Wilson claims outnumber us one million to one. I&#8217;d personally suspect we &#8230; <a href="http://blog.infochimps.com/2012/04/27/52-billion-chickens/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>As you enter your weekend, consider this: human beings are outnumbered by lots of creatures in this world, including ants, which Harvard biologist and ant expert Edward O. Wilson claims outnumber us one million to one. I&#8217;d personally suspect we are also greatly outnumbered by numerous varieties of insects, arachnids, and, in Austin, <a href="http://www.allaboutbirds.org/guide/Common_grackle/id">grackles</a>.</p>
<p>Somewhat unsurprisingly, we are also outnumbered by chickens.  In 2009, we killed 52 billion chickens for food (to say nothing of the ones we kept alive).  Kind of makes you thankful they aren&#8217;t <a href="http://wordinfo.info/words/images/chicken-attack.gif">fighting back</a>.</p>
<p>Happy Friday!</p>
<div class="visually_embed" data-category="Food"><img class="visually_embed_infographic" src="http://visually.visually.netdna-cdn.com/FoodforThought_4e09178d45006_w640.jpg" alt="" />
<div class="visually_embed_bar"><span>by </span> <a href="http://www.nationalgeographic.com/" target="_blank">NatGeo</a>. <span class="visually_embed_cycle">Browse more <a href="http://visual.ly">data visualizations</a>.</span></div>
</div>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/04/27/52-billion-chickens/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/04/27/52-billion-chickens/</feedburner:origLink></item>
		<item>
		<title>Finding Real Neighborhoods</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/kn0E4q7czPs/</link>
		<comments>http://blog.infochimps.com/2012/04/23/finding-real-neighborhoods/#comments</comments>
		<pubDate>Mon, 23 Apr 2012 17:01:37 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Pop Data]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1604</guid>
		<description><![CDATA[The boundaries of a neighborhood can be a topic of hot contention. Look to a tourist guidebook, a real estate agent, and a local and you&#8217;ll get four opinions about whether or not north of 14th Street still counts as &#8220;The &#8230; <a href="http://blog.infochimps.com/2012/04/23/finding-real-neighborhoods/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.infochimps.com/wp-content/uploads/2012/04/eastvillage.jpg"><img class="alignnone size-large wp-image-1605" title="eastvillage" src="http://blog.infochimps.com/wp-content/uploads/2012/04/eastvillage-1024x683.jpg" alt="" width="640" height="426" /></a></p>
<p>The boundaries of a neighborhood can be a topic of hot contention. Look to a tourist guidebook, a real estate agent, and a local and you&#8217;ll get four opinions about whether or not north of 14th Street still counts as &#8220;The Village&#8221; in NYC.  <a href="http://livehoods.org/">Livehoods</a>, a project by the <a href="http://www.cs.cmu.edu/">School of Computer Science</a> at <a href="http://www.cmu.edu/">Carnegie Mellon University</a>, takes a social spin on answering these questions and uncovers some truly insightful data on neighborhood boundaries, relationships, activity levels, character, and more.</p>
<blockquote><p>Livehoods offer a new way to conceptualize the dynamics, structure, and character of a city by analyzing the social media its residents generate. By looking at people&#8217;s checkin patterns at places across the city, we create a mapping of the different dynamic areas that comprise it. Each Livehood tells a different story of the people and places that shape it.</p></blockquote>
<p><a href="http://blog.infochimps.com/wp-content/uploads/2012/04/newjersey.jpg"><img class="alignnone size-full wp-image-1606" title="newjersey" src="http://blog.infochimps.com/wp-content/uploads/2012/04/newjersey.jpg" alt="" width="640" /></a></p>
<p>One thing I found particularly fascinating, though not wholly unexpected, about the New York City map was the clustering of neighborhoods in New Jersey.  In NYC, with the relative proximity of&#8230; everything to everything, it&#8217;s not surprising to find that neighborhoods are small areas made up of tightly clustered businesses and homes.  In New Jersey, the &#8220;neighborhoods&#8221; span a half dozen suburban towns in the same county.</p>
<p>Interested in experimenting with some Foursquare data yourself?  Check out our <a href="http://www.infochimps.com/datasets/foursquare-places">Foursquare Places API</a>!</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/04/23/finding-real-neighborhoods/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/04/23/finding-real-neighborhoods/</feedburner:origLink></item>
		<item>
		<title>Drought Tracking and Texas’ Extreme Weather</title>
		<link>http://feedproxy.google.com/~r/infochimps-blog/~3/NTiIF5nopx0/</link>
		<comments>http://blog.infochimps.com/2012/04/18/drought-tracking-and-texas-extreme-weather/#comments</comments>
		<pubDate>Wed, 18 Apr 2012 22:10:41 +0000</pubDate>
		<dc:creator>Winnie Hsia</dc:creator>
				<category><![CDATA[Pop Data]]></category>

		<guid isPermaLink="false">http://blog.infochimps.com/?p=1602</guid>
		<description><![CDATA[Living in Austin, TX, I found it pretty obvious last year, with its record number of 100+ degree days without rain, thousands of square miles burned in wildfires, and billions lost in agriculture, that we were in the middle of &#8230; <a href="http://blog.infochimps.com/2012/04/18/drought-tracking-and-texas-extreme-weather/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.infochimps.com/wp-content/uploads/2012/04/Drought-map.png"><img class="alignnone size-full wp-image-1603" title="Drought-map" src="http://blog.infochimps.com/wp-content/uploads/2012/04/Drought-map.png" alt="" width="625" height="364" /></a></p>
<p>Living in Austin, TX, I found it pretty obvious last year, with its record number of 100+ degree days without rain, thousands of square miles burned in wildfires, and billions lost in agriculture, that we were in the middle of a serious drought. The staggering impact across the state and throughout much of the South since October 2010 is laid out in this simple flipbook-style map from <a href="http://stateimpact.npr.org/texas/drought/">NPR</a>.</p>
<p>The potential solutions to the problem are outlined in the <a href="http://stateimpact.npr.org/texas/2012/02/01/five-ways-to-find-water-for-a-thirsty-texas/">Water Plan</a>. It will be interesting to see how a continued drought affects job growth, home prices, population, and more throughout the state in the coming years.</p>
<blockquote><p>Various plans for dealing with future droughts and growing demand for water in Texas exist, but most comprehensive — and accepted — is the state Water Plan. It offers a frank assessment of the current landscape, saying Texas “does not and will not have enough water to meet the needs of its people, its businesses, and its agricultural enterprises.” It predicts that “if a drought affected the entire state like it did in the 1950s,” Texas could lose around $116 billion, over a million jobs, and the growing state&#8217;s population could actually shrink by 1.4 million people.</p></blockquote>
]]></content:encoded>
			<wfw:commentRss>http://blog.infochimps.com/2012/04/18/drought-tracking-and-texas-extreme-weather/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.infochimps.com/2012/04/18/drought-tracking-and-texas-extreme-weather/</feedburner:origLink></item>
	</channel>
</rss>

