<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>the scottbot irregular</title>
	
	<link>http://www.scottbot.net/HIAL</link>
	<description>data are everywhen</description>
	<lastBuildDate>Mon, 28 May 2012 14:40:25 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.2</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/TheScottbotIrregular" /><feedburner:info uri="thescottbotirregular" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>The Myth of Text Analytics and Unobtrusive Measurement</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/SF2-VXczqig/</link>
		<comments>http://www.scottbot.net/HIAL/?p=16713#comments</comments>
		<pubDate>Sun, 06 May 2012 14:16:34 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[methodologies]]></category>
		<category><![CDATA[social science]]></category>
		<category><![CDATA[statistics]]></category>
		<category><![CDATA[text analysis]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=16713</guid>
		<description><![CDATA[Just realized Klout is the perfect metaphor for media in the modern era. It assumes you&#8217;re an expert in anything you talk a lot about. — Dan Munz (@dan_munz) May 5, 2012 Text analytics are often used in the social sciences as a way of unobtrusively observing people and their interactions. Humanists tend to approach <a href='http://www.scottbot.net/HIAL/?p=16713' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=The+Myth+of+Text+Analytics+and+Unobtrusive+Measurement&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-05-06&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D16713&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><blockquote class="twitter-tweet"><p>Just realized Klout is the perfect metaphor for media in the modern era. It assumes you&#8217;re an expert in anything you talk a lot about.</p>
<p>— Dan Munz (@dan_munz) <a href="https://twitter.com/dan_munz/status/198924429324075009" data-datetime="2012-05-05T23:57:34+00:00">May 5, 2012</a></p></blockquote>
<p>Text analytics are often used in the social sciences as a way of unobtrusively observing people and their interactions. Humanists tend to approach the supporting algorithms with skepticism, and with good reason. This post is about the difficulties of using words or counts as a proxy for some secondary or deeper meaning. Although I offer no solutions here, readers of the blog will know I am hopeful of the promise of these sorts of measurements <em>if used appropriately</em>, and right now, we&#8217;re still too close to the cutting edge to know exactly what that means. There are, however, copious examples of text analytics used well in the humanities (most recently, for example, <a href="http://www.jstor.org/discover/10.1086/663350?uid=3739256&amp;sid=56050796273">Joanna Guldi&#8217;s publication on the history of walking</a>).</p>
<h1>The Confusion</h1>
<p><a href="http://klout.com/">Klout</a> is a web service which ranks your social influence based on your internet activity. I don&#8217;t know how Klout&#8217;s algorithm works (and I doubt they&#8217;d be terribly forthcoming if I asked), but one of the products of that algorithm is a list of topics about which you are influential. For instance, Klout believes me to be quite influential with regard to <strong>Money</strong> (really? I don&#8217;t even have any of that.) and <strong>Journalism</strong> (uhmm.. no.), somewhat influential in <strong>Juggling</strong> (spot on.), <strong>Pizza</strong> (I guess I <em>am</em> from New York&#8230;), <strong>Scholarship</strong> (Sure!), and <strong>iPads</strong> (I&#8217;ve never touched an iPad.), and vaguely influential on the topic of <strong>Cars</strong> (nope) and <strong>Mining</strong> (do they mean text mining?).</p>
<p><div id="attachment_16746" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/05/2008-04-25_Netpliance_i-Opener_pizza_key.jpg"><img class="size-medium wp-image-16746 " title="By Ildar Sagdejev (Specious) (Own work) [GFDL (www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (www.creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/05/2008-04-25_Netpliance_i-Opener_pizza_key-300x199.jpg" alt="By Ildar Sagdejev (Specious) (Own work) [GFDL (www.gnu.org/copyleft/fdl.html) or CC-BY-SA-3.0-2.5-2.0-1.0 (www.creativecommons.org/licenses/by-sa/3.0)], via Wikimedia Commons" width="300" height="199" /></a><p class="wp-caption-text">My pizza expertise is clear.</p></div>Thankfully careers don&#8217;t ride on this measurement (we have <a href="http://en.wikipedia.org/wiki/H-index">other metrics</a> for that), but the danger is still fairly clear: the confusion of vocabulary and syntax for semantics and pragmatics. There are clear layers between the written word and its intended meaning, and those layers often depend on context and prior knowledge. Further, regardless of the intended meaning of the author, how her words are interpreted in the larger world can vary wildly. She may talk about money and pizza until she is blue in the face, but if the whole world disagrees with her, that is no measurement of expertise nor influence (even if angry pizza-lovers frequently shout at her about her pizza opinions).</p>
<p>We see very simple examples of this in <a href="http://en.wikipedia.org/wiki/Sentiment_analysis">sentiment analysis</a>, a way to extract the attitude of a writer toward whatever he has written. An old friend who recently dipped his fingers in sentiment analysis wrote this:</p>
<blockquote class="twitter-tweet" data-in-reply-to="197391138792017922"><p>@<a href="https://twitter.com/scott_bot">scott_bot</a> My sentiment engine just spit out this gem: &#8220;2.33 sarah jessica parker is a very handsome man&#8221; (scale is -5 to +5)</p>
<p>— Warren Moore (@warrenm) <a href="https://twitter.com/warrenm/status/197398507420794880" data-datetime="2012-05-01T18:54:06+00:00">May 1, 2012</a></p></blockquote>
<p>According to his algorithm, that sentence was a positive one. Unless I seriously misunderstand my social cues (which I suppose wouldn&#8217;t be <em>too</em> unlikely), I very much doubt the intended positivity of the author. However, most decent algorithms would pick up that this was a tweet from somebody who was positive about Sarah Jessica Parker.</p>
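<p>To see how a word-level approach can land on a positive score for a sentence like that, here is a toy scorer. It is only a sketch under stated assumptions: the lexicon and its values are invented for illustration, the function name is hypothetical, and it is not Warren&#8217;s engine or any real library.</p>

```python
# Hypothetical word-score lexicon on a -5..+5 scale. These values are
# invented for illustration and are not taken from any real engine.
TOY_LEXICON = {"handsome": 3, "good": 2, "bad": -2, "ugly": -3}


def naive_sentiment(text: str) -> float:
    """Average the lexicon scores of recognized words; 0.0 if none match."""
    hits = [TOY_LEXICON[w] for w in text.lower().split() if w in TOY_LEXICON]
    return sum(hits) / len(hits) if hits else 0.0


# The scorer only sees that "handsome" is a positive word. It knows nothing
# about who the sentence describes or why the author chose that word.
score = naive_sentiment("sarah jessica parker is a very handsome man")
print(score)
```

<p>The score comes out positive because the only word the toy lexicon recognizes is a pleasant one. Vocabulary stands in for intent, which is exactly the confusion described above.</p>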
<h1>Unobtrusive Measurements</h1>
<p>This particular approach to understanding humans belongs to the larger methodological class of <a href="http://www.socialresearchmethods.net/kb/unobtrus.php">unobtrusive measurements</a>. Generally speaking, this topic is discussed in the context of the social sciences and is contrasted with more &#8216;obtrusive&#8217; measurements along the lines of interviews or sticking people in labs. Historians generally don&#8217;t need to talk about unobtrusive measurements because, hey, the only way we could be obtrusive to our subjects would require exhuming bodies. It&#8217;s the idea that you can cleverly infer things about people from a distance, <em>without them knowing that they are being studied</em>.</p>
<p>Notice the disconnect between what I just said, and the word itself. &#8216;Unobtrusive&#8217; against &#8220;without them knowing that they are being studied.&#8221; These are clearly not the same thing, and that distinction between definition and word is fairly important &#8211; and not merely in the context of this discussion. One classic example (<a href="http://www.tandfonline.com/doi/abs/10.1080/00224545.1968.9933615">Doob and Gross, 1968</a>) asks how somebody&#8217;s social status determines whether someone might take aggressive action against them. The researchers specifically measure a driver&#8217;s likelihood to honk his horn in frustration based on the perceived social status of the driver in front of him. Using a new luxury car and an old rusty station wagon, the researchers would stop at traffic lights that had turned green and would wait to see whether the car behind them honked. In the end, significantly more people honked at the low-status car. More succinctly: status affects decisions of aggression. Honking and the perceived worth of the car were used as proxies for aggression and perceptions of status, much like vocabulary is used as a proxy for meaning.</p>
<p>In no world would this be considered unobtrusive from the subject&#8217;s point of view. The experimenters intruded on their world, and their actions and lives changed because of it. All it says is that the subjects won&#8217;t change their behavior based on the knowledge that they are being studied. However, when an unobtrusive experiment becomes large enough, even one as innocuous as counting words, even <em>that</em> advantage no longer holds. Take, for example, citation analysis and the <a href="http://en.wikipedia.org/wiki/H-index">h-index</a>. Citation analysis was initially construed as an unobtrusive measurement; we can say things about scholars and scholarly communication by looking at their citation patterns rather than interviewing them directly. However, now that entire nations (like Australia or the UK) use quantitative analysis to distribute funding to scholarship, the measurements are no longer unobtrusive. Scholars know how the new scholarly economy works, and have no problem changing their practices to get tenure, funding, etc.</p>
<h1>The Measurement and The Effect: Untested Proxies</h1>
<p>A paper was recently published (<a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1744-6570.2011.01239.x/full">O&#8217;Boyle Jr. and Aguinis, 2012</a>) on the non-normality of individual performance. The idea is that we assume that people&#8217;s performance (for example, that of students in a classroom) is normally distributed along a bell curve. A few kids get really good grades, a few kids get really bad grades, but most are &#8216;C&#8217; students. The authors challenge this view, suggesting performance takes on more of a power-law distribution, where very few people perform <em>very well</em>, and the majority perform very poorly, with 80% of people performing worse than the statistical average. If that&#8217;s hard to imagine, it&#8217;s because people are trained to think of averages on a bell curve, where 50% are greater than average and 50% are worse than average. Instead, imagine one person gets a score of 100, and another five people get scores of 10. The average is (100 + (10 * 5)) / 6 = 25, which means five out of the six people performed worse than average.</p>
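<p>The arithmetic in that toy example can be checked in a few lines of Python. This is a minimal sketch using only the six scores given above; the variable names are illustrative and nothing here comes from the paper itself.</p>

```python
from statistics import mean, median

# Scores from the example: one person scores 100, five people score 10.
scores = [100, 10, 10, 10, 10, 10]

average = mean(scores)
# Count how many people fall strictly below the mean.
below_average = sum(1 for s in scores if s < average)

print("mean:", average)
print("median:", median(scores))
print(below_average, "of", len(scores), "scored below the mean")
```

<p>The single high score drags the mean well above the median, so most of the group ends up below average. On a symmetric bell curve the mean and median coincide, which is why the intuition of a 50/50 split fails here.</p>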
<p>It&#8217;s an interesting hypothesis, and (in my opinion) probably a correct one, but their paper does not do a great job showing that. The reason is (you guessed it) they use scores as a proxy for performance. For example, they look at the number of published papers individuals have in top-tier journals, and show that some authors are very productive whereas most are not. However, it&#8217;s a fairly widely known phenomenon that in science, famous names are more likely to be published than obscure ones (there are many anecdotes about anonymous papers being rejected until the original, famous author is revealed, at which point the paper is magically accepted). The number of accepted papers may be as much a proxy for fame as it is for performance, so the results do not support their hypothesis. The authors then look at awards given to actors and writers; however, those awards suffer the same issues: the more well-known an actor, the more likely they&#8217;ll be used in good movies, the more likely they&#8217;ll be visible to award-givers, etc. Again, awards are not a proxy for the quality of a performance. The paper then goes on to measure elected officials based on votes in elections. I don&#8217;t think I need to go on about how votes might not map one-to-one onto the performance and prowess of an elected official.</p>
<p>I blogged a review of the <a href="http://www.scottbot.net/HIAL/?p=12364">most recent culturomics paper</a>, which used <a href="http://books.google.com/ngrams">Google Ngrams</a> to look at the frequency of recurring natural disasters (earthquakes, floods, etc.) vs. the frequency of recurring social events (war, unemployment, etc.). The paper concludes that, <em>because of differences in the frequency of word-use for words like &#8216;war&#8217; or &#8216;earthquake&#8217;,</em> the <em>phenomena themselves</em> are subject to different laws. The authors use word frequency as a proxy for the frequency of the events themselves, much in the same way that Klout seems to measure influence based on word-usage and counting. The problem, of course, is that the processes which govern what people decide to write down do not enjoy a one-to-one relationship to what people experience. Using words as proxies for events is just as problematic as using them as proxies for expertise, influence, or performance. The underlying processes are simply far more complicated than these algorithms give them credit for.</p>
<p>It should be noted, however, that the counts are not <em>meaningless</em>; they just don&#8217;t necessarily work as proxies for what these ngram scholars are trying to measure. Further, although the underlying processes are quite complex, the effect size of social or political pressure on word-use may be negligible to the point that their hypothesis is actually correct. The point isn&#8217;t that one <em>cannot</em> use one measurement as a proxy for something else; rather, the effectiveness of that proxy is assumed rather than actually explored or tested in any way. We need to do a better job, <em>especially</em> as humanists, of figuring out exactly how certain measurements map onto effects we seek.</p>
<p>A <a href="http://arxiv.org/abs/1003.6087">beautiful case study</a> that exemplifies this point was written by famous statistician Andrew Gelman, and it aims to use unobtrusive and indirect measurements to find alien attacks and zombie outbreaks. He uses <a href="http://www.google.com/trends/">Google Trends</a> to show that the number of zombies in the world is growing at a frightening rate.</p>
<div id="attachment_16750" class="wp-caption aligncenter" style="width: 615px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/05/Fullscreen-capture-562012-100204-AM.bmp.jpg"><img class="size-full wp-image-16750" title="Google Trends Zombies" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/05/Fullscreen-capture-562012-100204-AM.bmp.jpg" alt="" width="605" height="300" /></a><p class="wp-caption-text">Zombies will soon take over!</p></div>
]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=16713</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=16713</feedburner:origLink></item>
		<item>
		<title>ORBIS: The next step in Digital Humanities</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/VdqzJY8w5S0/</link>
		<comments>http://www.scottbot.net/HIAL/?p=15585#comments</comments>
		<pubDate>Wed, 02 May 2012 15:04:17 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[GIS]]></category>
		<category><![CDATA[network analysis]]></category>
		<category><![CDATA[review]]></category>
		<category><![CDATA[scholarly communication]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=15585</guid>
		<description><![CDATA[Every once in a while, a new project comes around bearing a message loud and clear: this is a sign of things to come. ORBIS, the Stanford Geospatial Network Model of the Roman World, is one such project. ORBIS was created by Walter Scheidel, Elijah Meeks, and a host of others. At the very beginning, <a href='http://www.scottbot.net/HIAL/?p=15585' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=ORBIS%3A+The+next+step+in+Digital+Humanities&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-05-02&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D15585&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>Every once in a while, a new project comes around bearing a message loud and clear: this is a sign of things to come. <a href="http://orbis.stanford.edu/#">ORBIS, the Stanford Geospatial Network Model of the Roman World</a>, is one such project.</p>
<p>ORBIS was created by Walter Scheidel, Elijah Meeks, and a host of others. At the very beginning, I should point out <em>I am not a classicist</em>. The review below addresses the nature rather than the content of ORBIS as a scholarly product.</p>
<div id="attachment_15593" class="wp-caption aligncenter" style="width: 460px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/medi-sea_dark-and-abstract450w.png"><img class="size-full wp-image-15593" title="Roman World" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/medi-sea_dark-and-abstract450w.png" alt="" width="450" height="291" /></a><p class="wp-caption-text">Roman Travel Network</p></div>
<p>ORBIS is many things but, most simply, it is an interface allowing researchers to experience the geography of the Roman world from an ancient perspective. The executive summary: given any two cities in the ancient world, it returns the fastest, cheapest, or shortest route between them, given the month, the mode of transportation, and various other options. It&#8217;s Google Maps for the ancient world, complete with the &#8220;Avoid Highways&#8221; feature.</p>
<p>I was among the lucky few to see an early version of the tool, and after sending back an informal review, Elijah Meeks invited me to review the site publicly via my blog. The first section explains what I feel is the most important contribution of ORBIS to the Digital Humanities; it is a reflexive tool that allows the humanist to engage with the process as well as the product. I then highlight some of the cool features, and finally list some rough edges and desiderata for future iterations or similar projects.</p>
<h1>Tool As Argument</h1>
<p>ORBIS is an exceptionally well-made and useful tool, but it is not the tool itself that makes it stand out. Walter Scheidel and Elijah Meeks could have posted the automated map portion of the site by itself, and it would have garnered deserved praise, but they went well beyond that goal; they made a <em>reflexive</em> tool.</p>
<p>ORBIS is among the first digital scholarly tools for the humanities (that I have encountered) that really lives up to the name &#8220;digital scholarly tool for the humanities.&#8221; Beyond being a simple tool, ORBIS is an explicit and transparent argument, a way of presenting research that also happens to allow, by its very existence, further research to be done. <strong>It is a map that allows the user to engage in the process of map-making</strong>, and a presentation of a process that allows the user to make and explore in ways the initial creators could not have foreseen. Of course, as with any project there are a few rough edges and desired features, which I&#8217;ll get into further down below.</p>
<div id="attachment_15606" class="wp-caption aligncenter" style="width: 705px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/figbit7.png"><img class="size-large wp-image-15606" title="Rome Elevation Data" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/figbit7-1024x514.png" alt="" width="695" height="348" /></a><p class="wp-caption-text">Elevation data to help model the difficulty in getting from one place to another.</p></div>
<p>Along with the map, the Makers of this project (by which I mean authors, developers, data gatherers, &#8230;) present a fairly interactive documentary of the map-making process, including historical accounts, data sources, algorithmic explanations, visual aids, downloadable data, and a forthcoming API. They built an explicit model of the ancient world, taking into account roads and rivers, oceans and coastlines, weather and geographic features, various modes of transportation for civilian and military purposes, and put it all together so any researcher can sit down and figure out how long it would have taken, or how expensive it would have been, to travel between 751 locations in the ancient Roman world. Rather than asking us to trust that their data are accurate, the makers revealed their model &#8211; their underlying argument &#8211; for critique and extension.</p>
<h1>Exploring the Ancient World</h1>
<p>The ORBIS model includes 751 sites covering about 4 million square miles of ancient space, including over 50,000 miles of road or desert tracks, nearly 20,000 miles of navigable rivers and canals, and almost 1,000 sea routes between sea ports. As I mentioned earlier, the model works like Google Maps; given two locations, it tells you the cheapest, shortest, or fastest route between them. These calculations take into account the time-of-year and usual weather, elevation changes between sites, fourteen modes of travel (ox cart, foot, army on march, camel caravan, etc.), river travel (including extra difficulty moving upstream), etc.</p>
<div id="attachment_15609" class="wp-caption aligncenter" style="width: 979px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/orbis.png"><img class="size-full wp-image-15609" title="ORBIS" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/orbis.png" alt="" width="969" height="702" /></a><p class="wp-caption-text">The ORBIS Interface</p></div>
<p>Another exciting feature on ORBIS is the distance cartogram. This visualization reveals the impact of travel speed and transport prices on overall connectivity; it allows the researcher to see how far other cities were with respect to a certain core city (for instance Constantinople) from the perspective of cost and travel time rather than mere geographical distance. This feature brings the researcher closer to the actual ancient Roman experience. A larger insight is revealed when taking a &#8220;distant reading&#8221; approach to the cartogram: &#8220;Distance cartograms show that due to massive cost differences between aquatic and terrestrial modes of transport, peripheries were far more remote from the center in terms of price than in terms of time.&#8221;</p>
<div id="attachment_15614" class="wp-caption aligncenter" style="width: 1272px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/dynamic_cartogram.png"><img class="size-full wp-image-15614" title="Dynamic Cartogram" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/04/dynamic_cartogram.png" alt="" width="1262" height="735" /></a><p class="wp-caption-text">Constantinople Cartogram</p></div>
<h1>Desiderata</h1>
<p>ORBIS is a big step forward in designing digital scholarly objects for the digital humanities. It is a tool that is both useful and reflexive, offering engagement with both process and product. It also exemplifies an increasingly popular mode of scholarly communication: the published online object. Because the mode is still (even after decades of online DH projects) not quite solidified, ORBIS lacks a few of the basic features of common scholarly communication, and by straddling both the new and the old, ORBIS doesn&#8217;t <em>quite</em> live up to the best qualities of either digital or analog publication.</p>
<p>First of all, although their team sent a preliminary version of the site out to many people, it never went through any formal review process. Readers of this blog will know that I am no advocate of traditional publication systems or the antiquated marriage of publication and peer-review, but at this point it is worth noting that ORBIS (to my knowledge) has only been reviewed informally, by sympathetic reviewers like myself. Perhaps this means that adoption of the tool should be approached with greater caution until it is more formally reviewed by a post-publication periodical like the <a href="http://journalofdigitalhumanities.org/">Journal of Digital Humanities</a>.</p>
<p>That being said, the site does try to remain true to humanistic and traditional publication roots. A paper version is in the works, and it is written such that we researchers can engage in the process of the tool. Unfortunately, it perhaps stays a bit <em>too</em> true to the paper model. The site is designed to read top-to-bottom, left-to-right, and none of the internal references to other sections include links to aid in navigation. Further, if the intent is to simultaneously allow exploration of the tool and its creation, the design does not realize this goal. The map appears at &#8220;the end&#8221; of the site, all the way on the right, and because of the layout, it is impossible to view it alongside the text describing it without opening a new window. There is quite a bit of white space to the right of the text on my wide-screen monitor &#8211; perhaps a smaller version of the tool can be embedded in that space.</p>
<p>One of the strengths of the project is the explicit nature of its creation. Data can be downloaded, and the sources, provenance, algorithms, and technologies are clearly stated. The model as an argument is, in short, visible and comprehensible even to those with little prior knowledge of these technologies. What this does is bridge the gap between code and humanistic inquiry, adding levels of model explication and tool-use between them. ORBIS is by no means the first project to make the creation of a tool explicit, but usually that explication is simply a public posting of the code and some limited comments or descriptions of how that code works. Unfortunately, although ORBIS does include a better bridge to explicate its argument, <em>it does not offer the code</em>. It&#8217;s a bit like David Copperfield explaining how he made the Statue of Liberty disappear; the explanation would certainly be helpful, but if he really wanted other people to be able to create similar illusions, he&#8217;d offer up the materials as well. (Alright, the metaphor doesn&#8217;t completely work, but stick with it.) The digital humanities seems finally to be getting into code sharing, and this is a <em>good thing</em>. Sharing code is essentially free (although there&#8217;s a much greater price for sharing good code &#8211; all the extra time spent marking it up and making it pretty), and the benefits should go without saying: More things like ORBIS, much faster. Better tools built collectively and suiting all our individual needs.</p>
<p>The last, most important, and most difficult of my desires deals with uncertainty. There&#8217;s been a lot of talk about data uncertainty in the humanities lately, not least from Stanford, the home university of ORBIS. It&#8217;s a difficult problem to solve, but presented as it is, the ORBIS project lends itself to the kinds of critique common in the work of Johanna Drucker and others. How do you know that these were the shortest routes? What about missing information? What about the fact that every bit of travel was its own experience, with different human and environmental factors playing in, perhaps delays for sick relatives or mutinous seamen? These questions are swept under the rug when ORBIS presents one route and one set of numbers per query: here, <em>this</em> is the fastest route, <em>these</em> are the cities, <em>this</em> is how much it would cost. The visualization and end-products create an illusion of certainty in the data, although in the text, the makers are quick to point out that a researcher should not take it as certain. One solution, and this extends to <em>all</em> data-driven DH projects, is to model uncertainty in the data from the ground up. How much more certain is one route than another? How certain are you of the weather in one location compared to the weather elsewhere? This sort of information flows naturally into <a href="http://www.scottbot.net/HIAL/?p=8237">models of Bayesian data analysis</a>, and would allow ORBIS to deliver a list of <em>credible</em> routes, revealing which parts of those routes are more or less certain, and including other information like the probability of a ship being lost at sea on a particular route. Of course, data uncertainty is only part of the problem, and this would only be a partial solution.</p>
<p>This isn&#8217;t the place to detail exactly how uncertainty should be modeled in the data, or exactly what ought to be done with it. The fact is that the model and the available data <em>already</em> contain rich knowledge about the uncertainty of travel, but that information disappears as soon as it is presented in the map interface. If ORBIS represents the next step in humanities tool production, it doesn&#8217;t quite (yet) live up to the promise of humanities data analysis, impressive as their analysis is. There is not yet a clear enough representation of uncertainty and interpretation to reach that goal. To be fair, I&#8217;ve yet to see a single project living up to that promise at anything close to large-scale; the tools just haven&#8217;t been developed yet. Perhaps that promise is impossible at large scale, although I certainly hope that is not the case.</p>
<h1>The View From Here</h1>
<p>Despite my long list of rough edges and desiderata, I still stand by my statement that this tool is an exemplar of a shift in digital humanities projects. The tool itself is profoundly impressive and will prove useful for a variety of research, but what stands out from the humanities standpoint is the explicit nature of the ORBIS underbelly. It blurs the line between tool and argument. There are other profoundly impressive and useful tools out there (topic modeling comes to mind). However, with topic modeling, the assumptions are still obscure to the unfamiliar, despite <a href="http://www.scottbot.net/HIAL/?p=221">my own best efforts</a> and the <a href="http://tedunderwood.wordpress.com/2012/04/07/topic-modeling-made-just-simple-enough/">even better efforts of others</a>. This is because the software topic modeling is packaged with, the software we use to run the analyses, does not simultaneously engage in the process of its own creation in the way that ORBIS does. Going forward, I predict the most used (or at least the most useful) digital tools for humanists will include that engagement, rather than existing as black boxes out of which results spring forth, fully armed and ready to battle as Athena from Zeus&#8217;s forehead. ORBIS is by no means the first to attempt such a feat but, I think, it is as-yet the most successful.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=15585</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=15585</feedburner:origLink></item>
		<item>
		<title>Science Systems Engineering</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/oJhHelL6ycM/</link>
		<comments>http://www.scottbot.net/HIAL/?p=13199#comments</comments>
		<pubDate>Wed, 07 Mar 2012 15:48:30 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[complexity]]></category>
		<category><![CDATA[history of science]]></category>
		<category><![CDATA[human dynamics]]></category>
		<category><![CDATA[scholarly communication]]></category>
		<category><![CDATA[scientonomy]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=13199</guid>
		<description><![CDATA[Warning: This post is potentially evil, and definitely normative. While I am unsure whether what I describe below should be done, I&#8217;m becoming increasingly certain that it could be. Read with caution. Complex Adaptive Systems Science is a complex adaptive system. It is a constantly evolving network of people and ideas and artifacts which interact with and feed back on each <a href='http://www.scottbot.net/HIAL/?p=13199' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=Science+Systems+Engineering&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-03-07&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D13199&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p><strong>Warning</strong>: This post is potentially <span style="color: #ff0000;">evil</span>, and definitely <span style="text-decoration: underline;">normative</span>. While I am unsure whether what I describe below <em>should </em>be done<em>, </em>I&#8217;m becoming increasingly certain that it <em>could</em> be. Read with caution.</p>
<h1>Complex Adaptive Systems</h1>
<p><a href="http://en.wikipedia.org/wiki/Wissenschaft">Science</a> is a <a href="http://en.wikipedia.org/wiki/Complex_adaptive_system">complex adaptive system</a>. It is a constantly evolving network of people and ideas and artifacts which interact with and feed back on each other to produce this amorphous socio-intellectual entity we call science. Science is also a bunch of nested complex adaptive systems, some overlapping, and is itself part of many other systems besides.</p>
<p>The study of complex interactions is enjoying a boom period due to the facilitating power of the &#8220;information age.&#8221; Because any complex system, whether it be a social group or a pool of chemicals, can exist in almost innumerable states while comprising the same constituent parts, it requires massive computational power to comprehend all the many states a system might find itself in. From the other side, it takes a massive amount of data observation and collection to figure out what states systems eventually <em>do</em> find themselves in, and that knowledge of how complex systems play out in the real world relies on collective and automated data gathering. From seeing how complex systems work in reality, we can infer properties of their underlying mechanisms; by modeling those mechanisms and computing the many possibilities they might allow, we can learn more about ourselves and our place in the larger multisystem. <a class="simple-footnote" title="I&#8217;m coining the term &#8220;multisystem&#8221; because ecosystem is insufficient, and I don&#8217;t know something better. By multisystem, I mean any system of systems; specifically here, the universe and how it evolves. If you&#8217;ve got a better term that invokes that concept, I&#8217;m all for using it. Cosmos comes to mind, but it no longer represents &#8220;order,&#8221; a series of interlocking systems, in the way it once did." id="return-note-13199-1" href="#note-13199-1"><sup>1</sup></a></p>
<p>One of the surprising results of complexity theory is that seemingly isolated changes can produce rippling, massive effects throughout a system. Only a decade after the removal of big herbivores like giraffes and elephants from an African savanna, a generally positive relationship between bugs and plants turned into an antagonistic one. Because the herbivores no longer grazed on certain trees, those trees began producing less nectar and fewer thorns, which in turn caused cascading repercussions throughout the ecosystem. Ultimately, the trees&#8217; mortality rate doubled, and a variety of species were worse off than they had been. <a class="simple-footnote" title="Palmer, Todd M, Maureen L Stanton, Truman P Young, Jacob R Goheen, Robert M Pringle, and Richard Karban. 2008. “Breakdown of an Ant-Plant Mutualism Follows the Loss of Large Herbivores from an African Savanna.” Science 319 (5860) (January 11): 192–195. doi:10.1126/science.1151579." id="return-note-13199-2" href="#note-13199-2"><sup>2</sup></a> Similarly, the introduction of an invasive species can cause untold damage to an ecosystem, as has become abundantly clear in Florida <a class="simple-footnote" title="Gordon, Doria R. 1998. “Effects of Invasive, Non-Indigenous Plant Species on Ecosystem Processes: Lessons From Florida.” Ecological Applications 8 (4): 975–989. doi:10.1890/1051-0761(1998)008[0975:EOINIP]2.0.CO;2." id="return-note-13199-3" href="#note-13199-3"><sup>3</sup></a> and around the world (the extinction of flightless birds in New Zealand springs to mind).</p>
<div id="attachment_13318" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/3202569865_43e34ec54f_o.jpg"><img class="size-medium wp-image-13318" title="Giraffes at sunrise" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/3202569865_43e34ec54f_o-300x199.jpg" alt="" width="300" height="199" /></a><p class="wp-caption-text">http://www.flickr.com/photos/arnolouise/3202569865/</p></div>
<p>Both evolutionary and complexity theories show that self-organizing systems evolve in such a way that they are self-sustaining and self-perpetuating. Often, within a given context or environment, the systems which are most resistant to attack, or the most adaptable to change, are the most likely to persist and grow. Because the entire environment evolves concurrently, small changes in one subsystem tend to propagate as small changes in many others. However, when the constraints of the environment change rapidly (like with the introduction of an asteroid and a cloud of sun-cloaking dust), when a new and sufficiently foreign system is introduced (land predators to New Zealand), or when an important subsystem is changed or removed (the loss of megafauna in Africa), devastating changes ripple outward.</p>
<p>An environmental ecosystem is one in which many smaller overlapping systems exist, and changes in the parts may change the whole; society can be described similarly. Students of history know that the effects of one event (a sinking ship, an assassination, a terrorist attack) can propagate through society for years or centuries to come. However, a system is not merely a slave to these single occurrences which cause Big Changes. The structure and history of a system implies certain stable, low energy states. We often anthropomorphize the tendency of systems to come to a stable mean, for example &#8220;nature <em>abhors</em> a vacuum.&#8221; This is just the manifestation of the second law of thermodynamics: entropy always increases; systems naturally tend toward low energy states.</p>
<p>The systems of society are historically structured and constrained in such a way that certain changes would require very little energy (an assassination leading to war in a world already on the brink), whereas others would require quite a great deal (say, an attempt to cause war between Canada and the U.S.). It is a combination of the current structural state of a system and the interactions of the constituent parts that leads that system in one direction or another. Put simply, <strong>a society, its people, and its environment are responsible for its future</strong>. Not terribly surprising, I know, but the formal framework of complexity theory is a useful one for what is described below.</p>
<div id="attachment_13317" class="wp-caption aligncenter" style="width: 364px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/354px-Meta-stability.svg_.png"><img class="size-full wp-image-13317" title="metastability" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/354px-Meta-stability.svg_.png" alt="" width="354" height="206" /></a><p class="wp-caption-text">metastability</p></div>
<p>The above picture, from the Wikipedia article on <a href="http://en.wikipedia.org/wiki/Metastability">metastability</a>, provides an example of what&#8217;s described above. The ball is resting in a valley, a low energy state, and a small change may temporarily excite the system, but the ball eventually finds its way into the same, or another, low energy state. When the environment is stable, its subsystems tend to find comfortably stable niches as well. Of course, I&#8217;m not sure anyone would call society wholly stable&#8230;</p>
<h1>Science as a System</h1>
<p>Science (by which I mean <em>wissenschaft, </em>any systematic research) is part of society, and itself includes many constituent and overlapping parts. I <a href="http://www.scottbot.net/HIAL/?p=12050">recently argued</a>, not without precedent, that the correspondence network between early modern Europeans facilitated the rapid growth of knowledge we like to call the Scientific Revolution. Further, that network was an inevitable outcome of socio/political/technological factors, including shrinking transportation costs, increasing political unrest leading to scholarly displacement, and, very simply, an increased interest in communicating once communication proved so fruitful. The state of the system affected the parts, the parts in turn affected the system, and a growing feedback loop led to the co-causal development of a massive communication network and a period of massively fruitful scholarly work.</p>
<div class="wp-caption aligncenter" style="width: 623px"><img title="Scientific Correspondence Network" src="http://www.scottbot.net/HIAL/wp-content/uploads/2011/11/Grotius.png" alt="" width="613" height="576" /><p class="wp-caption-text">Scientific Correspondence Network</p></div>
<p>Today and in the past, science is embedded in, and occasionally embodied by, the various organizational and communicative hierarchies its practitioners find themselves in. The people, ideas, and products of science feed back on one another. Scientists are perhaps more affected by their labs, by the process of publication, by the realities of funding, than they might admit. In return, the knowledge and ideas produced by science, the <em>message</em>, shape and constrain the medium in which they are propagated. I&#8217;ve often heard and read two opposing views: that knowledge is True and Right and unaffected by the various social goings-on of those who produce it, and that knowledge is Constructed and Meaningless outside of the social and linguistic system it resides in. The truth, I&#8217;m sure, is a complex tangle somewhere between the two, and affected by both.</p>
<p>In either case, science does not take place in a vacuum. We do our work through various media and with various funds, in departments and networks and (sometimes) lab-coats, using a slew of carefully designed tools and a language that was not, in general, made for this purpose. In short, we and our work exist in a complex system.</p>
<h1>Engineering the Academy</h1>
<p>That system is changing. Michael Nielsen&#8217;s recent book <a class="simple-footnote" title="Nielsen, Michael. Reinventing Discovery: The New Era of Networked Science. Princeton University Press, 2011." id="return-note-13199-4" href="#note-13199-4"><sup>4</sup></a> talks about the rise of citizen science, augmented intelligence, and collaborative systems not merely as ways to do what we&#8217;ve already done faster, but as <em>new methods of discovery</em>. The ability to coordinate on such a scale, and in such new ways, changes the game of science. It changes the <em>system</em>.</p>
<p>While many of these changes are happening automatically, in a self-organized sort of way, Nielsen suggests that we can learn from our past and learn from other successful collective ventures in order to make a &#8220;design science of collaboration.&#8221; That is, using what we know of how people work together best, of what spurs on the most inspired research and the most interesting results, we can <em>design</em> systems to facilitate collaboration and scientific research. In Nielsen&#8217;s case, he&#8217;s talking mostly about computer systems; how can we design a website or an algorithm or a technological artifact that will aid in scientific discovery, using the massive distributed power of the information age? One way Nielsen points out is &#8220;designed serendipity,&#8221; creating an environment where scientists are more likely to experience serendipitous occurrences, and thus more likely to come up with innovative and unexpected ideas.</p>
<div id="attachment_13320" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/4818952324_a2cce9be1b_b.jpg"><img class="size-medium wp-image-13320" title="Engineering" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/4818952324_a2cce9be1b_b-300x234.jpg" alt="" width="300" height="234" /></a><p class="wp-caption-text">Can we engineer science? http://www.flickr.com/photos/seattlemunicipalarchives/4818952324</p></div>
<p>In complexity terms, this idea is restructuring the system in such a way that the constituent parts or subsystems will be or do &#8220;better,&#8221; however we feel like defining better in this situation. It&#8217;s definitely not the first time an idea like this has been used. For example, science policy makers, government agencies, and funding bodies have long known that science will often go where the money is. If there is a lot of money available to research some particular problem, then that problem will tend to get researched. If a major funder requires the research it funds to become open access, by and large that will happen (<a href="http://publicaccess.nih.gov/">NIH&#8217;s PubMed requirements</a>).</p>
<p>There are innumerable ways to affect the system in a top-down way in order to shape its future. Terrence Deacon writes about how it is the <em>constraints</em> on a system which tend it toward some equilibrium state <a class="simple-footnote" title="Deacon, Terrence W. “Emergence: The Hole at the Wheel’s Hub.” In The Re-Emergence of Emergence: The Emergentist Hypothesis from Science to Religion, edited by Philip Clayton and Paul Davies. Oxford University Press, USA, 2006." id="return-note-13199-5" href="#note-13199-5"><sup>5</sup></a>; by shaping the structure of the scientific system, we can <em>predictably</em> shape its direction. That is, we can artificially create a low energy state (say, open access due to policy and funding changes), and let the constituent parts find their way into that low energy state eventually, reaching equilibrium. I talked a bit more about this idea of constraints leading a system in a <a href="http://www.scottbot.net/HIAL/?p=11621">recent post</a>.</p>
<p>As may be recalled from the discussion above, however, this is <em>not</em> the only way to affect a complex system. External structural changes are only part of the story of how a system grows and shifts. Because of the series of interconnected feedback loops that embody a system&#8217;s complexity, small changes can (and often do) propagate up and change the system as a whole. Liu, Slotine, and Barabási recently began writing about the &#8220;controllability of complex networks <a class="simple-footnote" title="Liu, Yang-Yu, Jean-Jacques Slotine, and Albert-László Barabási. “Controllability of Complex Networks.” Nature 473, no. 7346 (May 12, 2011): 167–173." id="return-note-13199-6" href="#note-13199-6"><sup>6</sup></a>,&#8221; suggesting ways in which changing or controlling constituent parts of a complex system can reliably and predictably change the entire system, perhaps leading it toward a new preferred low energy state. In this case, they were talking about the importance of well-connected hubs in a network; adding or removing them in certain areas can deeply affect the evolution of that network, no matter the constraints. Watts recounts a great example of how a small power outage rippled into a national disaster because just the right connections were overloaded and removed <a class="simple-footnote" title="Watts, Duncan J. Six Degrees: The Science of a Connected Age. 1st ed. W. W. Norton &amp; Company, 2003." id="return-note-13199-7" href="#note-13199-7"><sup>7</sup></a>. The strategic introduction or removal of certain specific links in the scientific system may go far toward changing the system itself.</p>
<p>Not only is science a complex adaptive system, it is a system which is becoming increasingly well-understood. A century of various science studies combined with the recent appearance of giant swaths of data about science and scientists themselves is beginning to allow us to learn the structure and mechanisms of the scientific system. We do not, and will never, know the most intricate details of that system; however, in many cases and for many changes, we only need to know general properties of a system in order to change it in predictable ways. If society feels a certain state of science is better than others, either for the purpose of improved productivity or simply more control, we are beginning to see which levers we need to pull in order to enact those changes.</p>
<p>This is dangerous. We may be able to predict first order changes, but as they feed back onto second order, third order, and further-down-the-line changes, the system becomes more unpredictable. Changing one thing positively may affect other aspects in massively negative (and massively unpredictable) ways.</p>
<p>However, generally if humans <em>can</em> do something, we will. I predict the coming years will bring a more formal Science Systems Engineering, a specialty apart from science policy which will attempt to engineer the direction of scientific research from whatever angle possible. My first post on this blog concerned a concept I dubbed <a href="http://www.scottbot.net/HIAL/?p=47">scientonomy</a>, which was just yet another attempt at unifying everybody who studies science in a meta sort of way. In that vocabulary, then, this science systems engineering would be an <strong>applied scientonomy</strong>. We have countless experts in all aspects of how science works on a day-to-day basis from every angle; that expertise may soon become much more prominent in application.</p>
<p>It is my hope and belief that a more formalized way of discussing and engineering scientific endeavors, either on the large scale or the small, can lead to benefits to humankind in the long run. I share the optimism of Michael Nielsen in thinking that we can design ways to help the academy run more smoothly and to lead it toward a more thorough, nuanced, and interesting understanding of whatever it is being studied. However, I&#8217;m also aware of the dangers of this sort of approach, first and foremost being disagreement on what is &#8220;better&#8221; for science or society.</p>
<p>At this point, I&#8217;m just putting this idea out there to hear the thoughts of my readers. In my meatspace day-to-day interactions, I tend to be around experimental scientists and quantitative social scientists who in general love the above ideas, but at my heart and on my blog I feel like a humanist, and these ideas worry me for all the obvious reasons (and even some of the more obscure ones). I&#8217;d love to get some input, especially from those who are terrified that somebody could even think this is possible.</p>
<div class="simple-footnotes"><p class="notes">Notes:</p><ol><li id="note-13199-1">I&#8217;m coining the term &#8220;multisystem&#8221; because ecosystem is insufficient, and I don&#8217;t know something better. By multisystem, I mean any system of systems; specifically here, the universe and how it evolves. If you&#8217;ve got a better term that invokes that concept, I&#8217;m all for using it. Cosmos comes to mind, but it no longer represents &#8220;<a href="http://www.etymonline.com/index.php?term=cosmos">order</a>,&#8221; a series of interlocking systems, in the way it once did. <a href="#return-note-13199-1">&#8617;</a></li><li id="note-13199-2">Palmer, Todd M, Maureen L Stanton, Truman P Young, Jacob R Goheen, Robert M Pringle, and Richard Karban. 2008. “Breakdown of an Ant-Plant Mutualism Follows the Loss of Large Herbivores from an African Savanna.” <em>Science</em> 319 (5860) (January 11): 192–195. doi:10.1126/science.1151579. <a href="#return-note-13199-2">&#8617;</a></li><li id="note-13199-3">Gordon, Doria R. 1998. “Effects of Invasive, Non-Indigenous Plant Species on Ecosystem Processes: Lessons From Florida.” <em>Ecological Applications</em> 8 (4): 975–989. doi:10.1890/1051-0761(1998)008[0975:EOINIP]2.0.CO;2. <a href="#return-note-13199-3">&#8617;</a></li><li id="note-13199-4">Nielsen, Michael. <em>Reinventing Discovery: The New Era of Networked Science</em>. Princeton University Press, 2011. <a href="#return-note-13199-4">&#8617;</a></li><li id="note-13199-5">Deacon, Terrence W. “Emergence: The Hole at the Wheel’s Hub.” In <em>The Re-Emergence of Emergence: The Emergentist Hypothesis from Science to Religion</em>, edited by Philip Clayton and Paul Davies. Oxford University Press, USA, 2006. <a href="#return-note-13199-5">&#8617;</a></li><li id="note-13199-6">Liu, Yang-Yu, Jean-Jacques Slotine, and Albert-László Barabási. “Controllability of Complex Networks.” <em>Nature</em> 473, no. 7346 (May 12, 2011): 167–173. 
<a href="#return-note-13199-6">&#8617;</a></li><li id="note-13199-7">Watts, Duncan J. <em>Six Degrees: The Science of a Connected Age</em>. 1st ed. W. W. Norton &amp; Company, 2003. <a href="#return-note-13199-7">&#8617;</a></li></ol></div>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=13199</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=13199</feedburner:origLink></item>
		<item>
		<title>Halting Conditions</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/FeVybJmDTXI/</link>
		<comments>http://www.scottbot.net/HIAL/?p=12736#comments</comments>
		<pubDate>Fri, 02 Mar 2012 14:46:47 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[methodologies]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=12736</guid>
		<description><![CDATA[Occasionally, in computer science, the term &#8220;halting condition&#8221; is thrown around as the point at which the program should stop running. Say I&#8217;ve got a robot that watches my roommate and I play scrabble, and I want it to count how many scrabble pieces we use, and tell us who won and what the highest <a href='http://www.scottbot.net/HIAL/?p=12736' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=Halting+Conditions&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-03-02&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D12736&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>Occasionally, in computer science, the term &#8220;halting condition&#8221; is thrown around as the point at which the program should stop running.</p>
<p>Say I&#8217;ve got a robot that watches my roommate and me play Scrabble, and I want it to count how many Scrabble pieces we use, and tell us who won and what the highest scoring word was. Unfortunately, let&#8217;s say, I&#8217;m also Superman, so our Scrabble games frequently end early when I hear cries for help and run off to the nearest phone booth. Our robot has to decide what conditions mean the game is over so it can give us the winner report; in this case, it is either when one player runs out of pieces, or, because games often end early, when nobody plays a piece for a significant amount of time. Those are our halting conditions.</p>
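<p>For the programmers in the audience, those two halting conditions can be sketched in a few lines of Python. This is only an illustration under some assumptions: the names <code>GameState</code>, <code>should_halt</code>, and <code>IDLE_LIMIT_SECONDS</code> are made up for this example, and the ten-minute idle limit is an arbitrary stand-in for &#8220;a significant amount of time.&#8221;</p>

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical threshold: how long counts as "a significant amount of time".
IDLE_LIMIT_SECONDS = 600.0


@dataclass
class GameState:
    # Pieces still held by each player, keyed by player name.
    pieces_left: Dict[str, int]
    # Seconds elapsed since anybody last played a piece.
    seconds_since_last_play: float


def should_halt(state: GameState,
                idle_limit: float = IDLE_LIMIT_SECONDS) -> Optional[str]:
    """Return the reason the game is over, or None if it should keep going."""
    # Halting condition 1: one player has run out of pieces.
    if any(count == 0 for count in state.pieces_left.values()):
        return "a player ran out of pieces"
    # Halting condition 2: nobody has played for too long
    # (presumably somebody ran off to a phone booth).
    if state.seconds_since_last_play >= idle_limit:
        return "nobody has played for a while"
    return None
```

<p>The robot would call <code>should_halt</code> after every observation and print the winner report as soon as it returns a reason rather than <code>None</code>. Note that the second condition is a judgment call baked into a number, which is the point of the rest of this post.</p>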
<div id="attachment_12826" class="wp-caption aligncenter" style="width: 510px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/scrabblebot.jpg"><img class="size-full wp-image-12826" title="Scrabblebot" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/scrabblebot.jpg" alt="" width="500" height="360" /></a><p class="wp-caption-text">Scrabble Robot from http://www.flickr.com/photos/ittybittiesforyou/3350135154/</p></div>
<p>When it comes to data collection, humanists have no halting conditions. We don&#8217;t even have decent halting <em><a href="http://en.wikipedia.org/wiki/Heuristic">heuristics</a></em>. Lisa Rhody just blogged a <a href="http://lisa.therhodys.net/2012/03/chasing-the-great-data-whale/">fantastically important piece</a> about the difficulties of data collection in the humanities, and her points are worth stressing. &#8220;You need to know,&#8221; Rhody writes, &#8220;when it&#8217;s time to cut the rope and release what <em>might</em> be done.&#8221; She points out that humanists need to be discerning in what data we <em>do</em> collect, and we need to be comfortable with analyzing and releasing imperfect data. &#8220;The decision <em>not</em> to be perfect is the right choice, but it isn&#8217;t an easy one.&#8221;</p>
<h1>Research Design</h1>
<p>Many (but not all!) of the natural sciences have it easy. You design an experiment, you get the data you planned to get, then you analyze and release it. The halting conditions, when to stop collecting and cleaning data, are usually fairly easily pre-determined and stuck to. Psychology and the social sciences are usually similar; they often either use data that already exists, or else collect it themselves under pre-specified conditions.</p>
<p>The humanities, well&#8230; we&#8217;re used to a tradition that involves very deep and particular reading. The tiniest stones of our studied objects do not go unturned. The idea that a first pass, <em>an incomplete pass</em>, can lead to anything at all, let alone analysis and release, is almost anathema to the traditional humanistic mindset.</p>
<p>Herein lies the problem of humanities big data. We&#8217;re trying to measure the length of a coastline by sitting on the beach with a ruler, rather than flying over with a helicopter and a camera. And humanists know that, like the sandy coastline shifting with the tides, our data are constantly changing with each new context or interpretation. Cartographers are aware of this problem, too, but they&#8217;re still able to make fairly accurate maps.</p>
<div id="attachment_12828" class="wp-caption aligncenter" style="width: 510px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/4442066126_febc20557a.jpg"><img class="size-full wp-image-12828" title="Coastline" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/4442066126_febc20557a.jpg" alt="" width="500" height="333" /></a><p class="wp-caption-text">Measuring the Coast</p></div>
<p>While I won&#8217;t suggest that humanists should take a more natural-scientific approach to research, beginning with a specific hypothesis and pre-specified data that could either confirm or deny it, we <em>should</em> look to them for inspiration on how to plan research. Thinking about what sort of specific analyses you&#8217;d like to perform with the data at the end can reasonably constrain what you try to collect from the beginning. Think about what bits of data are redundant, or would yield diminishing returns on your time and money investment of data collection.</p>
<h1>Being Comfortable With Imperfection</h1>
<p>In her blog post, Lisa wrote about her experience at <a href="http://mith.umd.edu/">MITH</a>. She had a four month fellowship to research 4,500 poems; she could easily have spent the whole time collecting increasingly minute data about each poem. In the end, she settled on only collecting the gender of the poet and whether the poem pertained to a work of art, opting not to include information like when each poem was published, what work of art it referred to, etc. She would then go in later and use other large-scale analytic tools (like text analysis), augmenting those results with the tags she entered about each poem.</p>
<p>A lot of valuable, rich information was lost in this data collection, but the important thing is that Lisa was still able to go in with a specific question, and collect only that which she needed most to explore it. The data may not have been perfect, and they may not have described everything, but they were sufficient and useful.</p>
<p>Her story reminded me a lot of my undergraduate years. I spent <em>all of them</em> collecting data on early modern letters for my old advisor. Letters, of course, generally have various locations and dates attached to them, and this presented us with no end of problems. Sometimes the places mentioned were cities, or houses, or states; granularities differed. Over the course of two hundred years, cities would change names, move, or wink out of or into existence entirely. Sometimes they would be subsumed into new or different empires. Computers, unfortunately, need fairly regularized data to perform comparative analyses, so we had to make a lot of editorial decisions when entering locations that would make answering our questions easier, but would lose some of the nuance otherwise available.</p>
<p>Similarly, my colleague <a href="http://jeanajorgensen.com/wordpress/">Jeana Jorgensen</a> recently spent several months painstakingly hand-collecting data about the usage of body parts in fairy tales for her dissertation. Of particular interest in her case was the overtly interpretive layer she added to the collection; for example, did a reference somehow embody the &#8220;grotesque?&#8221; By allowing herself the freedom to use interpretive frameworks, she embraced the subjective nature of data collection, and was able to analyze her data accordingly.</p>
<p>Of course, by allowing this sort of humanistic nuance, the amount of data one could collect for any single sentence is effectively infinite, and so Jeana had to constrain herself to only collecting for that which she could eventually use. It nevertheless took her months of daily collection, but if she tried to make her data perfect or complete, it would have taken her over a lifetime. She still managed to produce really interesting and thoughtful results for her dissertation.</p>
<p><em>Perfect or complete data are impossible in the humanities</em>. The best we can do is not as much as we can, but as much as we need. There is a point of diminishing returns for data collection: the point at which you can&#8217;t measure the coastline fast enough before the tides change it. We as humanists have to become comfortable with incompleteness and imperfection, and trust that in aggregate those data can still tell us <em>something</em>, even if they can&#8217;t reveal <em>everything</em>.</p>
<div id="attachment_12829" class="wp-caption aligncenter" style="width: 510px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/4657547742_8522e34a67.jpg"><img class="size-full wp-image-12829" title="Missing Pieces" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/4657547742_8522e34a67.jpg" alt="" width="500" height="333" /></a><p class="wp-caption-text">We can still see the landscape, even though not every piece is in place. http://www.flickr.com/photos/carmyarmyofme/4657547742/</p></div>
<p>The trick and art is knowing the right halting conditions. How much is too much? What data will actually be useful? These are not easy questions, and their answers differ for every project. The important thing to remember is to <strong><em>just do it</em></strong>. Too many projects get hung up because they just haven&#8217;t quite collected enough yet, or because they believe that a few more months of cleaning will make their data so much better. There will never be a point when your data are perfect. Do your analysis now, release it, and be comfortable with the fact that you&#8217;ve fairly accurately mapped the coastline, even if you haven&#8217;t quite worked out the jitters of the tides.</p>
<img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/FeVybJmDTXI" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=12736</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=12736</feedburner:origLink></item>
		<item>
		<title>The Internet Listens</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/DhViXkqcMgs/</link>
		<comments>http://www.scottbot.net/HIAL/?p=12665#comments</comments>
		<pubDate>Thu, 01 Mar 2012 13:43:07 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[scholarly communication]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=12665</guid>
		<description><![CDATA[The public science blogosphere has recently been buzzing about an online edited book review called Download The Universe. The twist is that the editors only review online-only science books, and their definition of &#8220;book&#8221; is broadly construed: [W]e define ebooks broadly. They may be self-published pdf manuscripts. They may be Kindle Singles about science. They can <a href='http://www.scottbot.net/HIAL/?p=12665' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=The+Internet+Listens&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-03-01&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D12665&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>The public science blogosphere has recently been buzzing about an online edited book review called <a href="http://www.downloadtheuniverse.com/">Download The Universe</a>. The twist is that the editors only review <em>online-only</em> science books, and their definition of &#8220;book&#8221; is broadly construed:</p>
<blockquote><p>[W]e define ebooks broadly. They may be self-published pdf manuscripts. They may be Kindle Singles about science. They can even be apps that have games embedded in them. We hope that we will eventually review new kinds of ebooks that we can&#8217;t even imagine yet. And we hope that you will find Download the Universe a useful doorway into that future.</p></blockquote>
<p>The site aims to fill the publicity gap that prevents interesting and good science ebooks from finding their way into the hands of receptive readers. Traditional reviews and blogs tend not to cover this new media, the editors say. In the spirit of the fast-paced nature of the internet, the entire project was conceived last month at <a href="http://scienceonline2012.com">Science Online</a> (<a href="https://twitter.com/#!/search/realtime/%23scio12">#scio12</a>) and already features 8 posts and an editing staff of 16.</p>
<div id="attachment_12666" class="wp-caption aligncenter" style="width: 956px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/dtu.png"><img class="size-full wp-image-12666" title="Download The Universe" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/03/dtu.png" alt="" width="946" height="248" /></a><p class="wp-caption-text">Download The Universe</p></div>
<p>My initial excitement about this project was tempered somewhat when I found that their news feed offered exceptionally tiny snippets of their ebook reviews. That&#8217;s no good! I&#8217;m subscribed to 361 feeds in Google Reader, with nearly 500 posts a day, and if I don&#8217;t have a few paragraphs to see whether an article is interesting, it is unlikely that I&#8217;d ever click through to the actual page to investigate further. (By the way, if you&#8217;re interested in the best of what I read, you can <a href="http://www.google.com/reader/shared/user/18160470639609463806/state/com.google/starred">subscribe to my favorite feed items here</a>, where I read through 361 blogs so you don&#8217;t have to.) Unfortunately, snippet news feeds are becoming increasingly frequent, as blogs and sites attempt to entice you to their pages where they can get usage statistics and ad-views in ways they could not through a simple RSS feed.</p>
<p><strong>Apparently, when you talk, the internet listens</strong>. My disappointment was such that I sent an email to the coordinating editor, science writer <a href="http://carlzimmer.com/">Carl Zimmer</a>, explaining my problem. He immediately sent a reply telling me he would look into the feedburner settings, and in short order, the RSS became a full, no-snippet news feed. Whoa! A big (and public) thank you to Carl Zimmer, and the entire crew at <a href="http://www.downloadtheuniverse.com/">Download The Universe</a>, for putting together a wonderful and important new site and for being so receptive to their readers. Bravo!</p>
<img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/DhViXkqcMgs" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=12665</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=12665</feedburner:origLink></item>
		<item>
		<title>More heavy-handed culturomics</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/LRN-URn7wwc/</link>
		<comments>http://www.scottbot.net/HIAL/?p=12364#comments</comments>
		<pubDate>Mon, 27 Feb 2012 17:08:37 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[culturomics]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[human dynamics]]></category>
		<category><![CDATA[text analysis]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=12364</guid>
		<description><![CDATA[A few days ago, Gao, Hu, Mao, and Perc posted a preprint of their forthcoming article comparing social and natural phenomena. The authors, apparently all engineers and physicists, use the google ngrams data to come to the conclusion that &#8220;social and natural phenomena are governed by fundamentally different processes.&#8221; The take-home message is that words describing <a href='http://www.scottbot.net/HIAL/?p=12364' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=More+heavy-handed+culturomics+&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-02-27&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D12364&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>A few days ago, Gao, Hu, Mao, and Perc <a href="http://arxiv.org/abs/1202.5299">posted a preprint</a> of their forthcoming <a href="http://rsif.royalsocietypublishing.org/content/early/2012/02/13/rsif.2011.0846.short">article comparing social and natural phenomena</a>. The authors, apparently all engineers and physicists, use the <a href="http://books.google.com/ngrams">google ngrams data</a> to come to the conclusion that &#8220;social and natural phenomena are governed by fundamentally different processes.&#8221; The take-home message is that words describing natural phenomena increase in frequency at regular, predictable rates, whereas the use of certain socially-oriented words changes in unpredictable ways. Unfortunately, the paper doesn&#8217;t necessarily differentiate between words and what they describe.</p>
<p>Specifically, the authors invoke random fractal theory (sort of a descendant of chaos theory) to find regular patterns in 1-grams. A 1-gram is just a single word, and this study looks at how the frequency of certain words grows or shrinks over time. A &#8220;<a href="http://en.wikipedia.org/wiki/Hurst_exponent">Hurst parameter</a>&#8221; is found for 24 words, a dozen pertaining to nature (earthquake, fire, etc.), and another dozen &#8220;social&#8221; words (war, unemployment, etc.). The Hurst parameter (<em>H</em>) is a number which, essentially, reveals whether or not a time series of data is correlated with itself. That is, given a set of observations over the last hundred years, autocorrelated data means the observation for this year will very likely follow a predictable trend from the past.</p>
<p>If <em>H</em> is between 0.5 and 1, that means the dataset has &#8220;long-term positive correlation,&#8221; which is roughly equivalent to saying that data quite some time in the past will still positively and noticeably affect data today. If <em>H</em> is under 0.5, data are negatively correlated with their past, suggesting that a high value in the past implies a low value in the future, and if <em>H</em> = 0.5, the data likely describe Brownian motion (they are random). <em>H</em> can exceed 1 as well, a point which I&#8217;ll get to momentarily.</p>
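<p>To make the idea of <em>H</em> concrete, here is a minimal sketch of one classical way to estimate it, rescaled-range (R/S) analysis, in plain Python. It is an illustration only, and not necessarily the estimator Gao et al. used; the function names <code>rescaled_range</code> and <code>hurst_exponent</code>, the chunk sizes, and the synthetic test series are all assumptions of mine. The function expects a noise-like series (for example, year-to-year changes) rather than a cumulative one.</p>

```python
import math
import random


def rescaled_range(chunk):
    """Return the R/S statistic for one chunk, or None if the chunk is flat."""
    mean = sum(chunk) / len(chunk)
    # Running sum of deviations from the chunk mean.
    cumulative, total = [], 0.0
    for value in chunk:
        total += value - mean
        cumulative.append(total)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in chunk) / len(chunk))
    return spread / std if std else None


def hurst_exponent(series, min_size=8):
    """Estimate H as the slope of log(R/S) against log(chunk size)."""
    log_sizes, log_rs = [], []
    size = min_size
    # Double the chunk size while at least two full chunks still fit.
    while len(series) // size >= 2:
        starts = range(0, len(series) - size + 1, size)
        chunks = [series[i:i + size] for i in starts]
        values = [rs for rs in map(rescaled_range, chunks) if rs]
        if values:
            log_sizes.append(math.log(size))
            log_rs.append(math.log(sum(values) / len(values)))
        size *= 2
    # Ordinary least-squares slope through the (log size, log R/S) points.
    n = len(log_sizes)
    mean_x = sum(log_sizes) / n
    mean_y = sum(log_rs) / n
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_sizes, log_rs))
    den = sum((x - mean_x) ** 2 for x in log_sizes)
    return num / den


# Synthetic series (hypothetical data, not the ngrams corpus).
rng = random.Random(0)
noise = [rng.gauss(0, 1) for _ in range(4096)]    # independent values
walk = []                                         # running sum: persistent
level = 0.0
for step in noise:
    level += step
    walk.append(level)
flips = [b - a for a, b in zip(noise, noise[1:])]  # differences: anti-persistent

for name, data in [("noise", noise), ("walk", walk), ("flips", flips)]:
    print(name, round(hurst_exponent(data), 2))
```

<p>Under these assumptions, the independent noise should come out in the neighborhood of 0.5 (R/S estimates tend to run somewhat high on short series), the running sum well above it, and the differenced noise well below it. The ordering, more than the exact numbers, is what illustrates the ranges of <em>H</em> described above.</p>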
<p>The authors first looked at the frequency of 12 words describing natural phenomena between 1770 and 2007. In each case, <em>H</em> was between 0.5 and 1, suggesting a long-term positive trend in the use of the terms. That is, the use of the term &#8220;earthquake&#8221; does not fluctuate terribly wildly from year to year; looking at how frequently it was used in the past can reasonably predict how frequently it will be used in the future. The data have a long &#8220;memory.&#8221;</p>
<div id="attachment_12431" class="wp-caption aligncenter" style="width: 293px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/Natural1grams.png"><img class="size-full wp-image-12431" title="Natural 1-grams" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/Natural1grams.png" alt="" width="283" height="455" /></a><p class="wp-caption-text">Natural 1-grams from Gao et al. (2012)</p></div>
<p>The paper then analyzed 12 words describing social phenomena, with very different results. According to the authors, &#8220;social phenomena, apart from rare exceptions, cannot be classified solely as processes with persistent-long range correlations.&#8221; For example, the use of the word &#8220;war&#8221; bursts around World War I and World War II; these are <em>unpredictable</em> moments in the discussion of social phenomena. The way &#8220;war&#8221; was used in the past was not a good predictor of how &#8220;war&#8221; would be used around 1915 and 1940, for obvious reasons.</p>
<div id="attachment_12432" class="wp-caption aligncenter" style="width: 322px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/Social1grams.png"><img class="size-full wp-image-12432" title="Social 1-grams" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/Social1grams.png" alt="" width="312" height="456" /></a><p class="wp-caption-text">Social 1-grams from Gao et al. (2012)</p></div>
<p>You may notice that, for many of the social terms, <em>H</em> is actually greater than 1, &#8220;which indicates that social phenomena are most likely to be either nonstationary, on-off intermittent, or Levy walk-like process.&#8221; Basically, the <em>H</em> parameter alone is not sufficient to describe what&#8217;s going on with the data. Nonstationary processes are, essentially, unpredictable. A stationary process can be random, but at least certain statistical properties of that randomness remain persistent. Nonstationary processes don&#8217;t have those persistent statistical properties. The authors point out that not all social phenomena will have <em>H</em> &gt; 1, citing famine, because it might relate to natural phenomena. They also point out that &#8220;the more the social phenomena can be considered recent (unemployment, recession, democracy), the higher their Hurst parameter is likely to be.&#8221;</p>
<p>In sum, they found that &#8220;<strong>The prevalence of long-term memory in natural phenomena [compels them] to conjecture that the long-range correlations in the usage frequency of the corresponding terms is predominantly driven by occurrences in nature of those phenomena</strong>,&#8221; whereas &#8220;<strong>it is clear that all these processes [describing social phenomena] are fundamentally different from those describing natural phenomena</strong>.&#8221; That the social phenomena follow different laws is not unexpected, they say, because they themselves are more complex; they rely on political, economic, and social forces, as well as natural phenomena.</p>
<p>While this paper is exceptionally interesting, and shows a very clever use of fairly basic data (24 one-dimensional variables, just looking at word use per year), it lacks the same sort of nuance that was missing from the <a href="http://www.sciencemag.org/content/331/6014/176.short">original culturomics paper</a>. Namely, in this case, it lacks the awareness that social and natural phenomena are not directly coupled with the words used to describe them, nor the frequency with which those words are used. The paper suggests that natural and social phenomena are governed by different scaling laws when, realistically, it is <em>the way they are discussed, and how those discussions are published</em> which are governed by the varying scaling laws. Further, although they used words exemplifying the difference between &#8220;nature&#8221; and &#8220;society,&#8221; the two are not always so easily disentangled, either in language or the underlying phenomena.</p>
<p>Perhaps the sort of words used to describe social events change differently than the sort used to describe natural events. Perhaps, because natural phenomena are often immediately felt across vast distances, whereas news of social phenomena can take some time to diffuse, how rapidly some words are discussed may take very different forms. Discussions and word-usage are always embedded in a larger network. Also needing to be taken into account is <em>who</em> is discussing social vs. natural phenomena, and which is more likely to get published and preserved to eventually be scanned by Google Books.</p>
<p>Without a doubt the authors have noticed a very interesting trend, but rather than matching the phenomena directly to words, as they did, we should be using this sort of study to look at how language changes, how people change, and ultimately what relationship people have with the things they discuss and publish. At this point, the engineers and physicists still have a greater comfort with the statistical tools needed to fully utilize the google books corpus, but there are some humanists out there already doing <a href="http://tedunderwood.wordpress.com/">absolutely fantastic quantitative work</a> with <a href="http://sappingattention.blogspot.com/">similar data</a>.</p>
<p>This paper, while impressive, is further proof that the quantitative study of culture should not be left to those with (apparently) little background in the subject. While it is not unlikely that different factors do, in fact, determine the course of natural disasters versus that of human interaction, this paper does not convincingly tease those apart. It may very well be that the language use is indicative of differences in underlying factors in the phenomena described; however, no study is cited suggesting this to be the case. Claims like &#8220;social and natural phenomena are governed by fundamentally different processes,&#8221; given the above <em>language</em> data, could easily have been avoided, I think, with a short discussion between the authors and a humanist.</p>
<img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/LRN-URn7wwc" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=12364</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=12364</feedburner:origLink></item>
		<item>
		<title>The Networked Structure of Scientific Growth</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/yDmWQrdQSL8/</link>
		<comments>http://www.scottbot.net/HIAL/?p=12050#comments</comments>
		<pubDate>Wed, 22 Feb 2012 01:11:37 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[big data]]></category>
		<category><![CDATA[complexity]]></category>
		<category><![CDATA[data analysis]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[history of science]]></category>
		<category><![CDATA[human dynamics]]></category>
		<category><![CDATA[methodologies]]></category>
		<category><![CDATA[network analysis]]></category>
		<category><![CDATA[open access]]></category>
		<category><![CDATA[republic of letters]]></category>
		<category><![CDATA[scholarly communication]]></category>
		<category><![CDATA[scientonomy]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=12050</guid>
		<description><![CDATA[Well, it looks like Digital Humanities Now scooped me on posting my own article. As some of you may have read, I recently did not submit a paper on the Republic of Letters, opting instead to hold off until I could submit it to a journal which allowed authorial preprint distribution. Preprints are a vital <a href='http://www.scottbot.net/HIAL/?p=12050' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=The+Networked+Structure+of+Scientific+Growth&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-02-22&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D12050&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>Well, it looks like <a href="http://digitalhumanitiesnow.org/">Digital Humanities Now</a> scooped me on posting my own article. As some of you may have read, I recently <a href="http://www.scottbot.net/HIAL/?p=11755">did not submit a paper on the Republic of Letters</a>, opting instead to hold off until I could submit it to a journal which allowed authorial preprint distribution. Preprints are a vital part of rapid knowledge exchange in our ever-quickening world, and while some disciplines have embraced the preprint culture, many others have yet to. I&#8217;d love the humanities to embrace that practice, and in the spirit of being the change you want to see in the world, I&#8217;ve decided to post a preprint of my Republic of Letters paper, which I will be submitting to another journal in the near future. <strong>You can read the full first draft <a href="http://scottbot.net/uploads/weingartNetworks.pdf">here</a>.</strong></p>
<p>The paper, briefly, is an attempt to contextualize the <a href="http://en.wikipedia.org/wiki/Republic_of_Letters">Republic of Letters</a> and the <a href="http://en.wikipedia.org/wiki/Scientific_revolution">Scientific Revolution</a> using modern computational methodologies. It draws from secondary sources on the Republic of Letters itself, especially from my old mentor <a href="http://www.clas.ufl.edu/users/ufhatch/pages/">R.A. Hatch</a>, some network analysis from sociology and statistical physics, modeling, <a href="http://en.wikipedia.org/wiki/Human_dynamics">human dynamics</a>, and complexity theory. All of this is combined through datasets graciously donated by the Dutch <a href="http://ckcc.huygens.knaw.nl/">Circulation of Knowledge</a> group and Oxford&#8217;s <a href="http://www.history.ox.ac.uk/cofk/">Cultures of Knowledge</a> project, totaling about 100,000 letters&#8217; worth of metadata. Because it favors large scale quantitative analysis over an equally important close and qualitative analysis, the paper is a contribution to historiographic methodology rather than historical narrative; that is, it doesn&#8217;t say anything particularly novel about history, but it does offer a (fairly) new way of looking at and contextualizing it.</p>
<div id="attachment_12053" class="wp-caption aligncenter" style="width: 1081px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/cengephi1.png"><img class="size-full wp-image-12053" title="Circulation of Knowledge" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/cengephi1.png" alt="" width="1071" height="924" /></a><p class="wp-caption-text">A visualization of the Dutch Republic of Letters using Sci2 &amp; Gephi</p></div>
<p>At its core, the paper suggests that by looking at how scholarly networks naturally grow and connect, we as historians can have new ways to tease out what was contingent upon the period and situation. It turns out that social networks of a certain topology are <a href="http://www.scholarpedia.org/article/Basin_of_attraction">basins of attraction</a> similar to those I discussed in <a href="http://www.scottbot.net/HIAL/?p=11621">Flow and Empty Space</a>. With enough time and any of a variety of facilitating social conditions and technologies, a network similar in shape and influence to the Republic of Letters will almost inevitably form. Armed with this knowledge, we as historians can move back to the microhistories and individuated primary materials to find exactly <em>what</em> those facilitating factors were, who played the key roles in the network, how the network may differ from what was expected, and so forth. Essentially, this method is one <a href="http://www.scottbot.net/HIAL/?p=1942">base map</a> we can use to navigate and situate historical narrative.</p>
<p>Of course, I make no claims of this being the <em>right</em> way to look at history, or the only quantitative base map we can use. The important point is that it raises new kinds of questions and is one mechanism to facilitate the re-integration of the individual and the <em><a href="http://en.wikipedia.org/wiki/Longue_dur%C3%A9e">longue durée</a></em>, the close and the distant reading.</p>
<p>The project casts a necessarily wide net. I do not yet, and probably could not ever, have mastery over each and every disciplinary pool I draw from. With that in mind, I welcome comments, suggestions, and criticisms from historians, network analysts, modelers, sociologists, and whoever else cares to weigh in. Whoever helps will get a gracious acknowledgement in the final version, good scholarly karma, and a cookie if we ever meet in person. The draft will be edited and submitted in the coming months, and if you have ideas, please post them in the comment section below. Also, if you use ideas from the paper, please cite it as an unpublished manuscript or, if it gets published, cite that version instead.</p>
<img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/yDmWQrdQSL8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=12050</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=12050</feedburner:origLink></item>
		<item>
		<title>On Keeping Pledges</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/_cl2Nm9pZVk/</link>
		<comments>http://www.scottbot.net/HIAL/?p=11755#comments</comments>
		<pubDate>Mon, 20 Feb 2012 16:55:56 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[open access]]></category>
		<category><![CDATA[republic of letters]]></category>
		<category><![CDATA[scholarly communication]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=11755</guid>
		<description><![CDATA[A few months back, I posted a series of pledges about being a good scholarly citizen. Among other things, I pledged to keep my data and code open whenever possible, and to fight to retain the right to distribute materials pending and following their publication. I also signed the Open Access Pledge. Since then, a <a href='http://www.scottbot.net/HIAL/?p=11755' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=On+Keeping+Pledges&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-02-20&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D11755&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>A few months back, I posted a <a href="http://www.scottbot.net/HIAL/?page_id=3086">series of pledges</a> about being a good scholarly citizen. Among other things, I pledged to keep my data and code open whenever possible, and to fight to retain the right to distribute materials pending and following their publication. I also signed the <a href="http://www.openaccesspledge.com/">Open Access Pledge</a>. Since then, a <a href="http://thecostofknowledge.com/">petition boycotting Elsevier</a> cropped up with <a href="http://gowers.files.wordpress.com/2012/02/elsevierstatementfinal.pdf">very similar goals</a>, and as of this writing has nearly 7,000 signatures.</p>
<p>As a young scholar with as yet no single-authored publications (although one is pending in the forward-thinking <a href="http://digitalhumanitiesnow.org/the-journal-of-digital-humanities/">Journal of Digital Humanities</a>, which you should all go and peer review), I had to think very carefully in making these pledges. It&#8217;s a dangerous world out there for people who aren&#8217;t free to publish in whatever journal they like; reducing my publication options is not likely to win me anything but good karma.</p>
<p>With that in mind, I actually was careful never to pledge <em>explicitly</em> that I would not publish in closed access venues; rather, I pledged to &#8220;Freely distribute all published material for which I have the right, and to fight to retain those rights in situations where that is not the case.&#8221; The pressure of the eventual job market prevented me from saying anything stronger.</p>
<p>Today, my resolve was tested. A <a href="http://www.zetabooks.com/cfp-jems-2012-fall-issue-shaping-the-republic-of-letters.html">recent CFP</a> solicited papers about &#8220;Shaping the Republic of Letters: Communication, Correspondence and Networks in Early Modern Europe.&#8221; This is, essentially, the exact topic that I&#8217;ve been studying and analyzing for the past several years, and I recently finished a draft of a paper on precisely this topic. The paper utilizes methodologies not yet prevalent in the humanities, and I&#8217;d like the opportunity to spread the technique as quickly and widely as possible, in the hopes that some might find it useful or at least interesting. I also feel strongly that the early and open dissemination of scholarly production is paramount to a healthy research community.</p>
<div class="mceTemp mceIEcenter">
<dl id="attachment_11756" class="wp-caption aligncenter" style="width: 570px;">
<dt class="wp-caption-dt"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/800px-open_access_plos-svg.png"><img class=" wp-image-11756 " title="Open Access" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/800px-open_access_plos-svg.png" alt="" width="560" height="224" /></a></dt>
</dl>
</div>
<p>I e-mailed the editor asking about access rights, and he sent a very kind reply, saying that, unfortunately, any article in the journal must be unpublished (even on the internet), and cannot be republished for two years following its publication. The journal itself is part of a small press, and as such is probably trying to get itself established and sold to libraries, so their reticence is (perhaps) understandable. However, I was faced with a dilemma: submit my article to them, going against the spirit &#8211; though not the letter &#8211; of my pledge, or risk losing a golden opportunity to submit my first single-authored article to a journal where it would actually fit.</p>
<p>In the end, it was actually the object of my study itself &#8211; the Republic of Letters &#8211; that convinced me to make a stand and not submit my article. The Republic, a self-titled community of 17th-century scholars communicating widely by post, was embodied by the ideal of universal citizenship and the free flow of knowledge. While they did not live up to this ideal, in large part because of the technologies of the time, we now are closer to being able to do so. I need to do my part in bringing about this ideal by taking a stand on the issues of open access and dissemination.</p>
<p>The below was my e-mail to the editor:</p>
<blockquote>
<div>
<p>Many thanks for your fast reply.</p>
<p>Unfortunately, I cannot submit my article unless those conditions are changed. I fear they represent a policy at odds with the past ideals and present realities of scholarly dissemination. The ideals of the Republic of Letters, regarding the free flow of information and universal citizenship, are finally becoming attainable (at least in some parts of the world) with nigh-ubiquitous web access. In a world as rapidly changing as our own, immediate access to the materials of scholarly production is becoming an essential element not just of <em>science</em>, in the English sense of the word, but <em>wissenschaft</em> at large. <a href="http://www.plosone.org/article/info%3Adoi%2F10.1371%2Fjournal.pone.0013636" target="_blank">Numerous</a> <a href="http://www.istl.org/10-winter/article2.html" target="_blank">studies</a> <a href="http://opcit.eprints.org/oacitation-biblio.html" target="_blank">have</a> <a href="http://onlinelibrary.wiley.com/doi/10.1002/asi.20898/full" target="_blank">shown</a> that the open availability of electronic prints for an article increases readership and citations (both to the author and to the journal), reduces the time to the adoption of new ideas, and facilitates a more rapidly innovating and evolving literature in the scholarly world. While I empathize that you represent a fairly small press and may be worried that the availability of pre-prints would affect <a class="simple-footnote" title="Big thanks to Andrew Simpson for pointing out the error of my ways!" id="return-note-11755-1" href="#note-11755-1"><sup>1</sup></a> sales, I have seen no studies showing this to be the case, although I would of course be open to reading such research if you know of some. In either case, it has been shown that pre-prints at worst do not affect scholarly use and dissemination in the least, and at best increase readership, citation, and impact by up to 250%.</p>
<p>Good luck with your journal, and I look forward to reading the upcoming issue when it becomes available.</p>
</div>
</blockquote>
<p>It&#8217;s a frightening world out there. I considered not posting about this interaction, for fear of the possibility of angering or being blacklisted by the editorial or advisory board of the press, some of whom are respected names in my intended field of study. However, fear is the enemy of change, and the support of <a href="http://nowviskie.org/">Bethany Nowviskie</a> and a <a href="https://twitter.com/#!/scott_bot/status/171595262287020032">host of tweeters</a> convinced me that this was the right thing to do.</p>
<p>With that in mind, I herewith post a draft of my article analyzing the Republic of Letters, currently titled <a href="http://scottbot.net/uploads/weingartNetworks.pdf">The Networked Structure of Scientific Growth</a>. Please feel free to share it for non-commercial use, citing it if you use it (but making sure to cite the published version if it eventually becomes so), and I&#8217;d love your comments if you have any. I&#8217;ll dedicate a separate post to this release later, but I figured you all deserved this after reading the whole post.</p>
<div class="simple-footnotes"><p class="notes">Notes:</p><ol><li id="note-11755-1">Big thanks to Andrew Simpson for pointing out the error of my ways! <a href="#return-note-11755-1">&#8617;</a></li></ol></div><img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/_cl2Nm9pZVk" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=11755</wfw:commentRss>
		<slash:comments>7</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=11755</feedburner:origLink></item>
		<item>
		<title>Flow and Empty Space</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/B0d7wgEmc6k/</link>
		<comments>http://www.scottbot.net/HIAL/?p=11621#comments</comments>
		<pubDate>Sun, 19 Feb 2012 17:43:48 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[complexity]]></category>
		<category><![CDATA[history]]></category>
		<category><![CDATA[human dynamics]]></category>
		<category><![CDATA[juggling]]></category>
		<category><![CDATA[modeling]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=11621</guid>
		<description><![CDATA[Thirty spokes unite in one nave and on that which is non-existent [on the hole in the nave] depends the wheel&#8217;s utility. Clay is moulded into a vessel and on that which is non-existent [on its hollowness] depends the vessel&#8217;s utility. By cutting out doors and windows we build a house and on that which <a href='http://www.scottbot.net/HIAL/?p=11621' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=Flow+and+Empty+Space&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-02-19&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D11621&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><blockquote><p>Thirty spokes unite in one nave and on that which is non-existent [on the hole in the nave] depends the wheel&#8217;s utility. Clay is moulded into a vessel and on that which is non-existent [on its hollowness] depends the vessel&#8217;s utility. By cutting out doors and windows we build a house and on that which is non-existent [on the empty space within] depends the house&#8217;s utility. Therefore, existence renders actual but non-existence renders useful.</p>
<p>-Laozi, Tao Te Ching, Suzuki Translation</p></blockquote>
<p>(NOTE 1: Although it may not seem it from the introduction, this post is actually about humanities research, eventually. Stick with it and it may pay off!)</p>
<p>(NOTE 2: I&#8217;ve warned in the past about invoking concepts you know little about; let me be the first to say I know next to nothing about Eastern philosophy or <em>t&#8217;ai chi ch&#8217;uan</em>, though I do know a bit about emergence and a bit about juggling. This post uses the above concepts as helpful metaphors, fully apologizing to those who know a bit more about the concepts for the butchering of them that will likely ensue.)</p>
<p>The astute reader may have noticed that, besides being a sometimes-historian and a sometimes-data-scientist, the third role I often take on is that of a circus artist. Juggling and prop manipulation have been part of my life for over a decade now, and though I don&#8217;t perform as much as I used to, the feeling I get from practicing is still fairly essential in keeping me sane. What juggling provides me that I cannot get elsewhere is what prop manipulators generally call a state of &#8220;flow.&#8221;</p>
<div id="attachment_11640" class="wp-caption aligncenter" style="width: 310px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/P1010240.jpg"><img class="size-medium wp-image-11640" title="Scott Juggles" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/P1010240-300x224.jpg" alt="" width="300" height="224" /></a><p class="wp-caption-text">Look! It&#39;s me in a candy store!</p></div>
<p>The concept <a href="http://en.wikipedia.org/wiki/Flow_(psychology)">draws from a positive psychology term</a> developed by Mihály Csíkszentmihályi, and is roughly equivalent to being in &#8220;the zone.&#8221; Although I haven&#8217;t quite experienced it, this feeling apparently comes to programmers working late at night trying to solve a problem. It&#8217;s also been described by dancers, puzzle solvers, and pretty much anyone else who gets so into something they feel, if only for a short time, they have totally lost themselves in their activity. A fellow contact juggler, Richard Hartnell, recently filmed a <a href="http://www.youtube.com/watch?v=BKizrvTeO1A">fantastic video</a> describing what flow means to him as a performer. I make no claims here to any meaning behind the flow state. The human brain is complex beyond my understanding, and though I do not ascribe any mystical properties to the experience, having felt &#8220;flow&#8221; so deeply, I can certainly see why some do treat it as a religious experience.</p>
<p>The most important contribution to my ability to experience this state while juggling was, oddly enough, a <em>t&#8217;ai chi ch&#8217;uan</em> course. Really, it was <em>one concept</em> from the course, called <em>song kua</em>, &#8220;relax the hips,&#8221; that truly opened up flow for me. It&#8217;s a complex concept, but the part I&#8217;d like to highlight here is the relationship between exertion and relaxation, between a push and a pull. When you move your body, that movement generally starts with an <em>intention</em>. I want my hand to move to the right, so I move it to the right. There is, however, another way to move parts of the body, and this is via <em>relaxation</em>. If I&#8217;m standing in a certain way, and I relax my hip in one direction, my body will naturally shift in the opposite direction. My body naturally gets <em>pulled</em> one way, rather than me <em>pushing</em> it to go there. In the circus arts, I can now quickly reach a flow state by creating a system between myself and whatever prop I&#8217;m using, and allowing the state of that system to <em>pull</em> me to the next state, rather than intentionally <em>pushing</em> myself and my prop in some particular way. It was, for me, a mind-blowing shift in perspective, and one that had absolutely nothing to do with my academic pursuits until last night, on a short plane ride back from <a href="http://www.apaonline.org/">Chicago APA</a>.</p>
<p>In the past two weeks, I&#8217;ve been finishing up the first draft of a humanities paper that uses concepts from complex systems and network analysis. In it, I argue (among other things) that there are statistical regularities in human behavior, and that we as historians can use that backdrop as a context against which we can study history, finding actions and events which deviate from the norm. Much recent research has gone into showing that people, <em>on average, </em>behave in certain ways, generally due to constraints placed on us by physics, biology, and society. This is not to say humans are inherently <em>predictable</em> - merely that there are boundaries beyond which certain actions are unlikely or even impossible given the constraints of our system. In the paper, I further go on to suggest that the way we develop our social networks also exhibits regularities across history, and the differences against those regularities, and the mechanisms by which they occur, are historically interesting.</p>
<p>Fast-forward to last night: on the plane ride home, I&#8217;m reading a fantastic essay by anthropologist Terrence W. Deacon about the emergence of self-organizing biological systems. <a class="simple-footnote" title="in The Re-Emergence of Emergence, 2009, edited by Philip Clayton &amp; Paul Davies." id="return-note-11621-1" href="#note-11621-1"><sup>1</sup></a> In the essay, Deacon attempts to explain why entropy seems to decrease enough to allow, well, Life, The Universe, and Everything, given the <a href="http://en.wikipedia.org/wiki/Second_law_of_thermodynamics">second law of thermodynamics</a>. His answer is that there are <a href="http://en.wikipedia.org/wiki/Attractor">basins of attraction</a> in the dynamics of most processes which inherently and inevitably produce order. That is, as a chaotic system interacts with itself, there are dynamical states which the system can inhabit which are <em>inherently self-sustaining</em>. After a chaotic system shuffles around for long enough, it will eventually and randomly reach a state that &#8220;attracts&#8221; toward a self-sustaining dynamical state, and once it falls into that basin of attraction, the system will feed back on itself, remaining in its state, creating apparent order from chaos for a sustained period of time.</p>
<p>Deacon invokes a section of the <em>Tao Te Ching</em> similar to the one quoted above, suggesting that empty or negative space, if constrained properly and possessing the correct qualities, acts as a kind of <em>potential energy</em>. The walls of a clay pot are what allow it to be a clay pot, but its function rests in the constrained negative space bounded by those walls. In the universe, Deacon suggests, constraints are implicit and temporally sensitive; if only a few state structures are self-sustaining, those states, if reached, will naturally persist. This is similar to a basic tenet of natural selection: <em>that which can persist tends to</em>.</p>
<p>The example Deacon first uses is that of a whirlpool forming in the empty space behind a rock in a flowing river.</p>
<blockquote><p>Consider a whirlpool, stably spinning behind a boulder in a stream. As moving water enters this location it is compensated for by a corresponding outflow. The presence of an obstruction imparts a lateral momentum to the molecules in the flow. The previous momentum is replaced by introducing a reverse momentum imparted to the water as it flows past the obstruction and rushes to fill the comparatively vacated region behind the rock. So not only must excess water move out of the local vicinity at a constant rate; these vectors of perturbed momentum must also be dissipated locally so that energy and water doesn&#8217;t build up. The spontaneous instabilities that result when an obstruction is introduced will effectively induce irregular patterns of build-up and dissipation of flow that &#8216;explore&#8217; new possibilities, and the resulting dynamics tends toward the minimization of the constantly building instabilities. This &#8216;exploration&#8217; is essentially the result of chaotic dynamics that are constantly self-undermining. To the extent that characteristics of component interactions or boundary conditions allow any degree of regularity to develop (e.g. circulation within a trailing eddy), these will come to dominate, because there are only a few causal architectures that are not self-undermining. This is also the case for semi-regular patterns (e.g. patterns of eddies that repeatedly form and disappear over time), which are just less self-undermining than other configurations.</p>
<p>&#8230;</p>
<p><em>The flow is not forced to form a whirlpool. This dynamical geometry is not &#8216;pushed&#8217; into existence, so to speak, by specially designed barriers and guides to the flow. Rather, the system as a whole will tend to spend more time in this semi-regular behaviour because the dynamical geometry of the whirlpool affords one of the few ways that the constant instabilities can most consistently compensate for one another. </em>[Deacon, 2009, emphasis added]</p></blockquote>
<div id="attachment_11675" class="wp-caption aligncenter" style="width: 705px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/3164577339_8fd4446cb7_o.jpg"><img class="size-large wp-image-11675" title="Whirlpool" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/3164577339_8fd4446cb7_o-1024x635.jpg" alt="" width="695" height="430" /></a><p class="wp-caption-text">Self-Organizing System (http://www.flickr.com/photos/lapstrake/3164577339/)</p></div>
<p>Essentially, when lots of things interact at random, there are some <em>self-organized</em> constraints to their interactions which allow order to arise from chaos. This order may be fleeting or persistent. Rather than using the designed constraint of a clay pot, walls of a room, or spokes around a hub, the constraints to the system arise from the potential in the context of the interactions, and in the properties of the interacting objects themselves.</p>
<p>So what in the world does this have to do with the humanities?</p>
<p>My argument in the above paper was that people naturally interact in certain ways; there are certain <em>basins of attraction</em>, properties of societies that tend to self-organize and persist. These are stochastic regularities; people do not always interact in the same way, and societies do not come to the same end, nor meet their ends in the same fashion. However, there are properties which make social organization more likely, and <em>knowing</em> how societies tend to form, historians can use that knowledge to frame questions and focus studies.</p>
<p>Explicit, data-driven models of the various mechanisms of human development and interaction will allow a more nuanced backdrop against which the actualities of the historical narrative can be studied. <a href="https://dhs.stanford.edu/spatial-humanities/models-as-product-process-and-publication/">Elijah Meeks recently posted</a>, about models,</p>
<blockquote><p>[T]he beauty of a model is that all of these [historical] assumptions are formalized and embedded in the larger argument&#8230;  That formalization can be challenged, extended, enhanced and amended [by more historical research]&#8230; Rather than a linear text narrative, the model itself is an argument.</p></blockquote>
<p>It is striking how seemingly unrelated strands of my life came together last night: the pull and flow of juggling, the bounded ordering of emergent behaviors, and the regularities in human activities. Perhaps this is indicative of the <a href="http://en.wikipedia.org/wiki/Consilience">consilience</a> of human endeavors; perhaps it is simply the overactive pattern-recognition circuits in my brain doing what they do best. In any case, even if the relationships are merely loose metaphors, it seems clear that a richer understanding of complexity theory, modeling, and data-driven humanities leading to a more nuanced, humanistic understanding of <a href="http://en.wikipedia.org/wiki/Human_dynamics">human dynamics</a> would benefit all. This understanding can help ground the study of history in the <a href="http://chnm.gmu.edu/essays-on-history-new-media/essays/?essayid=6">Age of Abundance</a>. A balance can be drawn between the uniquely human and individual, on one side, and the statistically regular ordering of systems, on the other; each side needs to be framed in terms of the other. Unfortunately, the dialogue on this topic in the public eye has thus far been dominated by applied mathematicians and statistical physicists who tend not to take into account the insights gained from centuries of qualitative humanistic inquiry. That probably means it&#8217;s our job to learn from them, because it seems unlikely that they will try to learn from us.</p>
<div class="simple-footnotes"><p class="notes">Notes:</p><ol><li id="note-11621-1">in <em>The Re-Emergence of Emergence</em>, 2009, edited by Philip Clayton &amp; Paul Davies. <a href="#return-note-11621-1">&#8617;</a></li></ol></div><img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/B0d7wgEmc6k" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=11621</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=11621</feedburner:origLink></item>
		<item>
		<title>Citing ODH’s Summer Institutes</title>
		<link>http://feedproxy.google.com/~r/TheScottbotIrregular/~3/i6HSBH72QUY/</link>
		<comments>http://www.scottbot.net/HIAL/?p=11012#comments</comments>
		<pubDate>Fri, 10 Feb 2012 15:41:10 +0000</pubDate>
		<dc:creator>Scott Weingart</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[citations]]></category>
		<category><![CDATA[digital humanities]]></category>
		<category><![CDATA[scholarly communication]]></category>
		<category><![CDATA[zotero]]></category>

		<guid isPermaLink="false">http://www.scottbot.net/HIAL/?p=11012</guid>
		<description><![CDATA[While I generally like to reserve posts for a wider audience, this is the second time I&#8217;ve come across this particular issue, and I&#8217;d like help from the masses. Every summer, the NEH&#8217;s Office for Digital Humanities funds a series of Institutes for Advanced Topics in the Digital Humanities. I&#8217;ve had the great fortune of <a href='http://www.scottbot.net/HIAL/?p=11012' class='excerpt-more'>[...]</a>]]></description>
			<content:encoded><![CDATA[<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc&amp;rfr_id=info%3Asid%2Focoins.info%3Agenerator&amp;rft.type=&amp;rft.format=text&amp;rft.title=Citing+ODH%27s+Summer+Institutes&amp;rft.source=the+scottbot+irregular&amp;rft.date=2012-02-10&amp;rft.identifier=http%3A%2F%2Fwww.scottbot.net%2FHIAL%2F%3Fp%3D11012&amp;rft.language=English&amp;rft.subject=Uncategorized&amp;rft.aulast=Weingart&amp;rft.aufirst=Scott"></span><p>While I generally like to reserve posts for a wider audience, this is the second time I&#8217;ve come across this particular issue, and I&#8217;d like help from the masses. Every summer, the NEH&#8217;s Office for Digital Humanities funds a series of <a href="http://www.neh.gov/grants/guidelines/IATDH.html">Institutes for Advanced Topics in the Digital Humanities</a>. I&#8217;ve had the great fortune of attending one on <a href="http://complexity.uncc.edu/nehinstitute">computer simulations in the humanities</a>, and teaching at one on <a href="http://www.ipam.ucla.edu/programs/hum2010/">network analysis for the humanities</a>. I often find myself wishing I could cite one, as a whole, because of all the valuable experience and knowledge I received there. Unfortunately I have found no standard format to cite whole conferences, workshops, or summer institutes.</p>
<div id="attachment_11028" class="wp-caption aligncenter" style="width: 138px"><a href="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/NEH_Logo3.jpg"><img class="size-full wp-image-11028" title="NEH Logo" src="http://www.scottbot.net/HIAL/wp-content/uploads/2012/02/NEH_Logo3.jpg" alt="" width="128" height="130" /></a><p class="wp-caption-text">Our Great and Glorious Funders</p></div>
<p>I asked Brett Bobley, the ODH director, if he had any suggestions, but unfortunately he was as much at a loss as I was. His reply: &#8220;Good question! I&#8217;d cite the URL (ex: <a title="http://is.gd/QnFs11" href="http://t.co/cked5IEm" rel="nofollow" target="_blank" data-expanded-url="http://is.gd/QnFs11" data-ultimate-url="https://securegrants.neh.gov/PublicQuery/main.aspx?f=1&amp;gn=HT-50016-09" data-display-url="is.gd/QnFs11">http://is.gd/QnFs11</a> ). But we don&#8217;t have a format. Want to choose one &amp; we&#8217;ll anoint it?&#8221; I&#8217;m not terribly familiar with citation styles, but I figured I&#8217;d try one out and see if the DH Hive Mind had any better ideas. If so, please post in the comments. Ideally, the citation should include the URL of the grant, the PI(s), the date, the location, and the grant number (<strong>this is very important for tracking the impact of these summer institutes</strong>). The PI is important, but because the cited ideas come from the entire institute rather than from the PI alone, I have chosen to place the institute name first.</p>
<p>&#8220;Network Analysis for the Humanities.&#8221; August 15-27, 2010. <em>ODH Institute for Advanced Topics in the Digital Humanities</em>: HT-50016-09. Tim Tangherlini, PI. <a href="https://securegrants.neh.gov/PublicQuery/main.aspx?f=1&amp;gn=HT-50016-09">https://securegrants.neh.gov/PublicQuery/main.aspx?f=1&amp;gn=HT-50016-09</a>.</p>
<p>&#8220;Computer Simulations in the Humanities.&#8221; June 1-17, 2011. <em>ODH Institute for Advanced Topics in the Digital Humanities</em>: HT-50030-10. Marvin J. Croy, PI. <a href="https://securegrants.neh.gov/PublicQuery/main.aspx?f=1&amp;gn=HT-50030-10">https://securegrants.neh.gov/PublicQuery/main.aspx?f=1&amp;gn=HT-50030-10</a>.</p>
<p>What thoughts?</p>
<img src="http://feeds.feedburner.com/~r/TheScottbotIrregular/~4/i6HSBH72QUY" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.scottbot.net/HIAL/?feed=rss2&amp;p=11012</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.scottbot.net/HIAL/?p=11012</feedburner:origLink></item>
	</channel>
</rss>

