<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Ryan Robitaille</title>
	<atom:link href="http://ryrobes.com/feed/" rel="self" type="application/rss+xml" />
	<link>http://ryrobes.com</link>
	<description>Business Technologist, Data Artist, Problem Solver</description>
	<lastBuildDate>Fri, 29 Sep 2017 05:34:27 +0000</lastBuildDate>
	<language>en-US</language>
		<sy:updatePeriod>hourly</sy:updatePeriod>
		<sy:updateFrequency>1</sy:updateFrequency>
	<generator>https://wordpress.org/?v=3.8.41</generator>
	<item>
		<title>Connecting Tableau to ElasticSearch (READ: How to query ElasticSearch with Hive SQL and Hadoop)</title>
		<link>http://ryrobes.com/systems/connecting-tableau-to-elasticsearch-read-how-to-query-elasticsearch-with-hive-sql-and-hadoop/</link>
		<comments>http://ryrobes.com/systems/connecting-tableau-to-elasticsearch-read-how-to-query-elasticsearch-with-hive-sql-and-hadoop/#comments</comments>
		<pubDate>Tue, 17 Dec 2013 07:15:56 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[systems]]></category>
		<category><![CDATA[elasticsearch]]></category>
		<category><![CDATA[hadoop]]></category>
		<category><![CDATA[hive]]></category>
		<category><![CDATA[operational]]></category>
		<category><![CDATA[Tableau]]></category>
		<category><![CDATA[ubuntu]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=43334</guid>
		<description><![CDATA[I've been a big fan of ElasticSearch since last spring. Put simply, it's a search server based on Apache Lucene. But in all honesty, it's really a massively scalable, auto-balancing, redundant, NoSQL data-store plus a full search and analytics server.

But, if I'm storing a ton of data in ES, I certainly can't use Tableau, since ES querying is strictly RESTful... or can I?]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/12/ku-xlarge-2ggg.jpg" width="240" />
		</p><p><span class="dropcap">I</span><!--/.dropcap-->'ve been a big fan of <a href="http://www.elasticsearch.org/" target="_blank">ElasticSearch</a> since last spring - using it on my <a href="http://riffbank.com/" target="_blank">RiffBank project</a> as well as various other "data collection" experiments. Put (very) simply, it's a badass search server based on Apache Lucene. But in all honesty, to me, it's really <strong>a very scalable, auto-balancing, redundant, NoSQL data-store with all the benefits of a full search and analytics server</strong>. </p>
<h3>#helladistributed</h3>
<h4>Also... it's fast. <em>Really. Fucking. Fast.</em></h4>
<p>Generally speaking, that is (can you <strong>use</strong> "generally speaking" after dropping an f-bomb?).</p>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/12/gggku-xlarge-4.jpg" alt="gggku-xlarge-(4)" width="500" height="280" class="aligncenter size-full wp-image-43399" /><br />
<em>(yes, I said "Hadoop" - don't be afraid, little Ricky)</em></p>
<p>My current love affair (romantic fling?) with ES notwithstanding - it isn't usually used as a general-purpose data-store (even though it's <strong>more</strong> than capable), but with the impending release of v1.0 I can see this changing and its role expanding. At the very <em>least</em>, why would you bother using something like MongoDB (or a similar "hot" NoSQL sink) when ES is all that and more (plus a joy to work with and scale out)? </p>
<h4>But therein lies a problem - if I'm storing a shit-ton of data in ES, I certainly can't use my go-to visual analytics tool Tableau on it, since ES querying is strictly RESTful...  or can I? </h4>
<p>Enter <a href="http://www.elasticsearch.org/overview/hadoop/">ES's Hadoop plug-in</a>. While primarily created to get Hadoop data INTO ES (presumably), we can also use it to create an external "table" (more like a 'data structure template') in Hive pointing to an ES index, SQL our little goat hearts out on it, and use a pretty generic Hive driver in Tableau to connect to it.</p>
<h4>So: Tableau -> Hive SQL -> Hadoop -> Map/Reduce -> ES</h4>
<p>So, yeah, there is some serious overhead and translation / abstraction involved (obviously) and queries will be much slower than native - but the only other (direct) alternative is... <strong>well, there ISN'T ONE</strong>. You'd have to build custom ETL to load data from ES to another DB and query that directly (or query the initial source, if possible). </p>
<p><em>Maybe some day Tableau will allow us to write data-source access plug-ins... </em></p>
<p>Granted, if you already run a Hadoop cluster you should be able to leverage it to scale the MapReduce jobs better. But I'm going to assume that you don't use Hadoop at all, and we will set up an instance JUST for making ElasticSearch queries via Hive SQL. </p>
<div class="woo-sc-box normal   ">
<strong>Note</strong>: This step-by-step was created <a href="http://digitalocean.com/" target="_blank">on a DigitalOcean 32-bit VM</a> using a fresh <strong>Ubuntu 12.04 LTS</strong> install (64-bit would work fine as well albeit with small library differences - I was using small instances, hence the 32-bit).<br />
</div>
<p>First things first - this assumes that your ES cluster is already running on the network, somewhere accessible from the box we will be configuring. In my case it's a group of similar Ubuntu 12.04 LTS DigitalOcean VMs as shown below. I'm not going to cover setting up ES, but trust me - <a href="http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/setup.html" target="_blank">it's dead easy and very flexible</a>. </p>
<p>For my sample dataset I'm using <a href="http://en.wikipedia.org/wiki/Wikipedia:Database_download" target="_blank">a Wikipedia English Page dump</a> (imported via <a href="https://github.com/elasticsearch/elasticsearch-river-wikipedia" target="_blank">the ES wikipedia-river plugin</a>).</p>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/12/es-head-shot.jpg" alt="es-head-shot" width="500" height="350" class="aligncenter size-full wp-image-43340" /><br />
(showing my test ES cluster* / index setup - using <a href="http://mobz.github.io/elasticsearch-head/" target="_blank">the excellent "head" plugin</a> - great for watching shards re-balance and overall a great front-end tool for most 'Elastic' needs)</p>
<h3>*Lovecraftian node names optional</h3>
<h4>"They had come from the stars, and had brought Their images with Them..."</h4>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/12/ku-xlarge4.jpg" alt="ku-xlarge4" width="500" height="265" class="aligncenter size-full wp-image-43390" /></p>
<h3>Let's get started...</h3>
<p>Ok, first log in to a fresh 12.04 box... let's create a 'dumbo' user and get him/her all configured correctly...</p>
<p><i>I'm going to assume that you are logged in as root.</i></p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">addgroup hadoop<br />
adduser <span style="color: #660033;">--ingroup</span> hadoop dumbo</div></div>
<p>Now to make sure that we can invoke sudo as dumbo.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">visudo<br />
<br />
<span style="color: #666666; font-style: italic;">#add this at the bottom and save</span><br />
dumbo <span style="color: #007800;">ALL</span>=<span style="color: #7a0874; font-weight: bold;">&#40;</span>ALL<span style="color: #7a0874; font-weight: bold;">&#41;</span> ALL</div></div>
<p>Many posts on setting up Hadoop recommend disabling ipv6, so we will just take care of that in order to eliminate any possible cluster 'lookup' issues later.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>sysctl.conf<br />
<br />
<span style="color: #666666; font-style: italic;">#add this</span><br />
net.ipv6.conf.all.disable_ipv6 = <span style="color: #000000;">1</span><br />
net.ipv6.conf.default.disable_ipv6 = <span style="color: #000000;">1</span><br />
net.ipv6.conf.lo.disable_ipv6 = <span style="color: #000000;">1</span></div></div>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">sysctl <span style="color: #660033;">-p</span></div></div>
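<p>If you want to double-check that the IPv6 settings actually took, here is a minimal sketch that reads the kernel flags back. The <code>ipv6_status</code> helper is my own name (not part of any tool), and it assumes the usual <code>/proc/sys/net/ipv6/conf/</code> layout on Linux; if a flag file isn't readable it just reports "unknown".</p>

```shell
# Report whether IPv6 is disabled for a given scope (all, default, lo).
# Hypothetical helper: reads the same kernel flags that sysctl.conf sets.
# An optional second argument overrides the flag file path.
ipv6_status() {
    local scope="$1"
    local flag_file="${2:-/proc/sys/net/ipv6/conf/${scope}/disable_ipv6}"
    if [ ! -r "$flag_file" ]; then
        echo "unknown"
        return 0
    fi
    if [ "$(cat "$flag_file")" = "1" ]; then
        echo "disabled"
    else
        echo "enabled"
    fi
}

# Check the three scopes configured in /etc/sysctl.conf
for scope in all default lo; do
    echo "$scope: $(ipv6_status "$scope")"
done
```

<p>All three scopes should come back as "disabled" once <code>sysctl -p</code> has run. If not, re-check the lines you added to <code>/etc/sysctl.conf</code>.</p>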
<p>Ok, enough root. Now feel free to disconnect and log back in as our 'dumbo' user (or su to it, or reboot, whatever you like).</p>
<p>Let's set some environment settings for later (and fun, laziness)</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> ~<span style="color: #000000; font-weight: bold;">/</span>.bashrc<br />
<br />
<span style="color: #666666; font-style: italic;">###### put at end of .bashrc #########</span><br />
<span style="color: #007800;">PS1</span>=<span style="color: #ff0000;">'${debian_chroot:+($debian_chroot)}\[\033[01;32m\]\u@\h\[\033[00m\]:\[\033[01;34m\]\w\[\033[00m\]\$ '</span><br />
<br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">free</span>=<span style="color: #ff0000;">&quot;free -m&quot;</span><br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">update</span>=<span style="color: #ff0000;">&quot;sudo aptitude update&quot;</span><br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">install</span>=<span style="color: #ff0000;">&quot;sudo aptitude install&quot;</span><br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">upgrade</span>=<span style="color: #ff0000;">&quot;sudo aptitude safe-upgrade&quot;</span><br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">remove</span>=<span style="color: #ff0000;">&quot;sudo aptitude remove&quot;</span><br />
<br />
<span style="color: #666666; font-style: italic;"># Set Hadoop/Hive-related environment variables</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">HADOOP_HOME</span>=<span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">HIVE_HOME</span>=<span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">HADOOP_MAPRED_HOME</span>=<span style="color: #007800;">$HADOOP_HOME</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">HADOOP_COMMON_HOME</span>=<span style="color: #007800;">$HADOOP_HOME</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">HADOOP_HDFS_HOME</span>=<span style="color: #007800;">$HADOOP_HOME</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">YARN_HOME</span>=<span style="color: #007800;">$HADOOP_HOME</span><br />
<br />
<br />
<span style="color: #666666; font-style: italic;"># Set JAVA_HOME (we will also configure JAVA_HOME directly for Hadoop later on)</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">JAVA_HOME</span>=<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>lib<span style="color: #000000; font-weight: bold;">/</span>jvm<span style="color: #000000; font-weight: bold;">/</span>java-<span style="color: #000000;">7</span>-oracle<br />
<br />
<span style="color: #666666; font-style: italic;"># Some convenient aliases and functions for running Hadoop-related commands</span><br />
<span style="color: #7a0874; font-weight: bold;">unalias</span> fs <span style="color: #000000; font-weight: bold;">&amp;&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null<br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">fs</span>=<span style="color: #ff0000;">&quot;hadoop fs&quot;</span><br />
<span style="color: #7a0874; font-weight: bold;">unalias</span> hls <span style="color: #000000; font-weight: bold;">&amp;&gt;</span> <span style="color: #000000; font-weight: bold;">/</span>dev<span style="color: #000000; font-weight: bold;">/</span>null<br />
<span style="color: #7a0874; font-weight: bold;">alias</span> <span style="color: #007800;">hls</span>=<span style="color: #ff0000;">&quot;fs -ls&quot;</span><br />
<br />
lzohead <span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #7a0874; font-weight: bold;">&#41;</span> <span style="color: #7a0874; font-weight: bold;">&#123;</span><br />
&nbsp; &nbsp; hadoop fs <span style="color: #660033;">-cat</span> <span style="color: #007800;">$1</span> <span style="color: #000000; font-weight: bold;">|</span> lzop <span style="color: #660033;">-dc</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">head</span> <span style="color: #660033;">-1000</span> <span style="color: #000000; font-weight: bold;">|</span> <span style="color: #c20cb9; font-weight: bold;">less</span><br />
<span style="color: #7a0874; font-weight: bold;">&#125;</span><br />
<br />
<span style="color: #666666; font-style: italic;"># Add Hadoop bin/ directory to PATH</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">PATH</span>=<span style="color: #007800;">$PATH</span>:<span style="color: #007800;">$HADOOP_HOME</span><span style="color: #000000; font-weight: bold;">/</span>bin:<span style="color: #007800;">$HADOOP_HOME</span><span style="color: #000000; font-weight: bold;">/</span>sbin:<span style="color: #007800;">$HIVE_HOME</span><span style="color: #000000; font-weight: bold;">/</span>bin<br />
<span style="color: #666666; font-style: italic;">######end .bashrc#########</span></div></div>
<p>Let's re-up our .bashrc to reflect the changes</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #7a0874; font-weight: bold;">source</span> ~<span style="color: #000000; font-weight: bold;">/</span>.bashrc</div></div>
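<p>A quick sanity check that the PATH changes landed can save some head-scratching later. This is only a sketch: <code>path_contains</code> is a made-up helper name, and the fallback directories are simply the ones from the .bashrc above.</p>

```shell
# Succeeds if directory $1 is an entry in the colon-separated list $2
# (defaults to the current PATH). Hypothetical helper for this walkthrough.
path_contains() {
    local dir="$1"
    local path_list="${2:-$PATH}"
    case ":${path_list}:" in
        *":${dir}:"*) return 0 ;;
        *) return 1 ;;
    esac
}

# Paths taken from the .bashrc above; the fallbacks only keep the check
# from breaking if the variables are not set yet.
hadoop_home="${HADOOP_HOME:-/home/dumbo/hadoop}"
hive_home="${HIVE_HOME:-/home/dumbo/hive}"

for dir in "$hadoop_home/bin" "$hadoop_home/sbin" "$hive_home/bin"; do
    if path_contains "$dir"; then
        echo "on PATH:  $dir"
    else
        echo "MISSING:  $dir"
    fi
done
```

<p>Anything reported as MISSING usually means the .bashrc edit was not saved or has not been sourced in the current shell.</p>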
<p>Ok, let's add some packages and update our install (using some of the aliases we just created)</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">update<br />
<span style="color: #c20cb9; font-weight: bold;">install</span> htop <span style="color: #c20cb9; font-weight: bold;">zip</span> python build-essential python-dev python-setuptools <span style="color: #c20cb9; font-weight: bold;">locate</span> python-software-properties lzop<br />
<span style="color: #c20cb9; font-weight: bold;">sudo</span> add-apt-repository ppa:webupd8team<span style="color: #000000; font-weight: bold;">/</span>java<br />
update<br />
<span style="color: #c20cb9; font-weight: bold;">install</span> oracle-java7-installer</div></div>
<p>Ok, now we can start installing shit without fear! You scared? Don't be scared. Be brave!</p>
<h3>First up, Hadoop.</h3>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #7a0874; font-weight: bold;">cd</span> ~<br />
<span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>mirror.metrocast.net<span style="color: #000000; font-weight: bold;">/</span>apache<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>core<span style="color: #000000; font-weight: bold;">/</span>hadoop-2.2.0<span style="color: #000000; font-weight: bold;">/</span>hadoop-2.2.0.tar.gz<br />
<span style="color: #c20cb9; font-weight: bold;">tar</span> <span style="color: #660033;">-zxvf</span> .<span style="color: #000000; font-weight: bold;">/</span>hadoop-2.2.0.tar.gz<br />
<span style="color: #c20cb9; font-weight: bold;">mv</span> ~<span style="color: #000000; font-weight: bold;">/</span>hadoop-2.2.0 ~<span style="color: #000000; font-weight: bold;">/</span>hadoop</div></div>
<p>Ok, now we have to edit a bunch of config files. Bear with me here...</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>hadoop-env.sh<br />
<br />
<span style="color: #666666; font-style: italic;">#add or modify</span><br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">JAVA_HOME</span>=<span style="color: #000000; font-weight: bold;">/</span>usr<span style="color: #000000; font-weight: bold;">/</span>lib<span style="color: #000000; font-weight: bold;">/</span>jvm<span style="color: #000000; font-weight: bold;">/</span>java-<span style="color: #000000;">7</span>-oracle<br />
<span style="color: #7a0874; font-weight: bold;">export</span> <span style="color: #007800;">HADOOP_CONF_DIR</span>=<span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop</div></div>
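<p>Since Hadoop won't start without a working JAVA_HOME, it may be worth confirming the path before going any further. A rough sketch, assuming the Oracle Java 7 location used above (<code>is_java_home</code> is just an illustrative helper name):</p>

```shell
# Succeeds only if $1 looks like a usable Java home,
# i.e. it contains an executable bin/java. Hypothetical helper.
is_java_home() {
    [ -n "$1" ] && [ -x "$1/bin/java" ]
}

# Fall back to the path used in this walkthrough if JAVA_HOME is not set
java_dir="${JAVA_HOME:-/usr/lib/jvm/java-7-oracle}"

if is_java_home "$java_dir"; then
    echo "JAVA_HOME looks good: $java_dir"
    # 'hadoop version' prints the installed release if the tarball
    # unpacked correctly and its bin/ directory is on the PATH
    if command -v hadoop; then
        hadoop version
    fi
else
    echo "No executable bin/java under $java_dir - check the Java install"
fi
```

<p>If the second message shows up, look at what actually got installed under <code>/usr/lib/jvm/</code> and adjust JAVA_HOME in both .bashrc and hadoop-env.sh to match.</p>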
<div class="woo-sc-hr"></div>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>core-site.xml</div></div>
<div class="codecolorer-container xml blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="xml codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>fs.default.name<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>hdfs://localhost:9000<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></div></div>
<div class="woo-sc-hr"></div>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>hdfs-site.xml</div></div>
<div class="codecolorer-container xml blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="xml codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>dfs.replication<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>1<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>dfs.name.dir<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/home/dumbo/dfs/name<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>dfs.data.dir<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/home/dumbo/dfs/data<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>dfs.tmp.dir<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/home/dumbo/dfs/tmp<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></div></div>
<div class="woo-sc-hr"></div>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">mv</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>mapred-site.xml.template <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>mapred-site.xml<br />
<span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>mapred-site.xml</div></div>
<div class="codecolorer-container xml blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="xml codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>mapred.job.tracker<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>localhost:9001<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>mapred.system.dir<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/home/dumbo/mapred/system<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>true<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/final<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></div></div>
<div class="woo-sc-hr"></div>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>yarn-site.xml</div></div>
<div class="codecolorer-container xml blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="xml codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>mapreduce.framework.name<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>yarn<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>yarn.nodemanager.aux-services<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>mapreduce_shuffle<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>yarn.nodemanager.aux-services.mapreduce_shuffle.class<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>org.apache.hadoop.mapred.ShuffleHandler<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></div></div>
<p>OK, now we need to make sure that Hadoop can spawn new 'dumbo' shell sessions via ssh without a password (that's how its scripts operate)...</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">ssh-keygen</span> <span style="color: #660033;">-t</span> dsa <span style="color: #660033;">-P</span> <span style="color: #ff0000;">''</span> <span style="color: #660033;">-f</span> ~<span style="color: #000000; font-weight: bold;">/</span>.ssh<span style="color: #000000; font-weight: bold;">/</span>id_dsa <br />
<span style="color: #c20cb9; font-weight: bold;">cat</span> ~<span style="color: #000000; font-weight: bold;">/</span>.ssh<span style="color: #000000; font-weight: bold;">/</span>id_dsa.pub <span style="color: #000000; font-weight: bold;">&gt;&gt;</span> ~<span style="color: #000000; font-weight: bold;">/</span>.ssh<span style="color: #000000; font-weight: bold;">/</span>authorized_keys</div></div>
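<p>To confirm the key is actually being picked up, a quick check along these lines can help. This is only a sketch: <code>check_passwordless_ssh</code> and the <code>SSH_BIN</code> override are names made up for this example (they are not part of Hadoop), and it assumes an sshd is listening on localhost. Also note that newer OpenSSH releases disable DSA keys by default, so if the login still prompts you may need to generate an RSA key (<code>-t rsa</code>) instead.</p>

```shell
# Hypothetical helper: succeed only if ssh can log in without prompting.
# SSH_BIN is an assumed override so the check can be dry-run with another command.
check_passwordless_ssh() {
    host="${1:-localhost}"
    ssh_bin="${SSH_BIN:-ssh}"
    # BatchMode=yes makes ssh fail right away instead of asking for a password.
    if "$ssh_bin" -o BatchMode=yes -o ConnectTimeout=5 "$host" true 2>/dev/null; then
        echo "ok: passwordless ssh to $host works"
    else
        echo "FAIL: ssh to $host wants a password (or sshd is not reachable)"
        return 1
    fi
}
```

<p>Run <code>check_passwordless_ssh</code> as the 'dumbo' user. If localhost is not in your known_hosts yet, do one plain <code>ssh localhost</code> first to accept the host key, since batch mode will not ask.</p>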
<p>We need to make sure that, in /etc/hosts, 127.0.0.1 maps only to localhost and not to your literal hostname (otherwise you can get loopback connection issues).</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">sudo</span> <span style="color: #c20cb9; font-weight: bold;">nano</span> <span style="color: #000000; font-weight: bold;">/</span>etc<span style="color: #000000; font-weight: bold;">/</span>hosts<br />
<br />
<span style="color: #666666; font-style: italic;">#for example</span><br />
xxx.xxx.xx.1xx &nbsp;innsmouth <span style="color: #666666; font-style: italic;">#my ip / my hostname</span><br />
127.0.0.1 &nbsp; &nbsp; &nbsp; localhost</div></div>
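<p>If you'd rather not eyeball the file, something like the following can flag the problem. It's a rough sketch, and <code>loopback_maps_hostname</code> is a name invented for this example. It treats any 127.x.x.x line as loopback, on the assumption that your distro may have added a 127.0.1.1 entry for the hostname (Ubuntu installers commonly do).</p>

```shell
# Hypothetical helper: exit 0 if HOSTS_FILE maps a 127.x.x.x address to HOSTNAME.
# Usage: loopback_maps_hostname HOSTS_FILE HOSTNAME
loopback_maps_hostname() {
    hosts_file="$1"
    name="$2"
    # Drop comments, then scan the names listed on every loopback line.
    sed 's/#.*//' "$hosts_file" | awk -v name="$name" '
        $1 ~ /^127\./ { for (i = NF; i > 1; i--) if ($i == name) found = 1 }
        END { exit (found ? 0 : 1) }'
}
```

<p>For example, <code>loopback_maps_hostname /etc/hosts "$(hostname)"</code> returns 0 when the hostname still sits on a loopback line, which is the case you want to fix by hand.</p>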
<p>Time to "format" our hadoop HDFS 'filesystem'...</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">~<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>bin<span style="color: #000000; font-weight: bold;">/</span>hadoop namenode <span style="color: #660033;">-format</span></div></div>
<p>Let's start it all up and hope for the best...</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">~<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>sbin<span style="color: #000000; font-weight: bold;">/</span>hadoop-daemon.sh start namenode<br />
~<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>sbin<span style="color: #000000; font-weight: bold;">/</span>hadoop-daemon.sh start datanode<br />
~<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>sbin<span style="color: #000000; font-weight: bold;">/</span>yarn-daemon.sh start resourcemanager<br />
~<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>sbin<span style="color: #000000; font-weight: bold;">/</span>yarn-daemon.sh start nodemanager<br />
~<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>sbin<span style="color: #000000; font-weight: bold;">/</span>mr-jobhistory-daemon.sh start historyserver</div></div>
<p>If all is good, you shouldn't get any errors, and you should see all 5+ procs running when you run a 'jps' command.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">jps</div></div>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">1239 JobHistoryServer<br />
1493 Jps<br />
993 DataNode<br />
934 NameNode<br />
1167 NodeManager<br />
1106 ResourceManager<br />
...etc</div></div>
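<p>To avoid comparing that list by eye, a small filter like this can report what's missing. It's a sketch only: <code>missing_daemons</code> and <code>EXPECTED_DAEMONS</code> are names made up here, and the expected list simply mirrors the five start commands above.</p>

```shell
# Daemons started by the five commands above (assumed list; adjust to your setup).
EXPECTED_DAEMONS="NameNode DataNode ResourceManager NodeManager JobHistoryServer"

# Hypothetical helper: read jps-style output ("PID Name" per line) on stdin
# and print the name of every expected daemon that is not in it.
missing_daemons() {
    running=$(awk '{ print $2 }')
    for d in $EXPECTED_DAEMONS; do
        echo "$running" | grep -qx "$d" || echo "$d"
    done
}
```

<p>Usage would be <code>jps | missing_daemons</code>. No output means everything in the list is up; any name printed is a daemon whose log under ~/hadoop/logs is worth a look.</p>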
<h3>Let's move on to Hive</h3>
<p>Don't worry, way less to do here.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #7a0874; font-weight: bold;">cd</span> ~<br />
<span style="color: #c20cb9; font-weight: bold;">wget</span> http:<span style="color: #000000; font-weight: bold;">//</span>apache.osuosl.org<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>hive-0.12.0<span style="color: #000000; font-weight: bold;">/</span>hive-0.12.0-bin.tar.gz<br />
<span style="color: #c20cb9; font-weight: bold;">tar</span> <span style="color: #660033;">-zxvf</span> hive-0.12.0-bin.tar.gz<br />
<span style="color: #c20cb9; font-weight: bold;">mv</span> ~<span style="color: #000000; font-weight: bold;">/</span>hive-0.12.0-bin ~<span style="color: #000000; font-weight: bold;">/</span>hive</div></div>
<p>Only one config file to set up / create <em>(to load the lib that we will get in the next step)</em></p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #c20cb9; font-weight: bold;">nano</span> ~<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>conf<span style="color: #000000; font-weight: bold;">/</span>hive-site.xml</div></div>
<div class="codecolorer-container xml blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="xml codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;?xml</span> <span style="color: #000066;">version</span>=<span style="color: #ff0000;">&quot;1.0&quot;</span><span style="color: #000000; font-weight: bold;">?&gt;</span></span><br />
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;?xml-stylesheet</span> <span style="color: #000066;">type</span>=<span style="color: #ff0000;">&quot;text/xsl&quot;</span> <span style="color: #000066;">href</span>=<span style="color: #ff0000;">&quot;configuration.xsl&quot;</span><span style="color: #000000; font-weight: bold;">?&gt;</span></span><br />
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>hive.aux.jars.path<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/name<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span>/aux_lib/<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/value<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/property<span style="color: #000000; font-weight: bold;">&gt;</span></span></span><br />
<span style="color: #009900;"><span style="color: #000000; font-weight: bold;">&lt;/configuration<span style="color: #000000; font-weight: bold;">&gt;</span></span></span></div></div>
<p>Create some dirs in HDFS for Hive...</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">hadoop fs <span style="color: #660033;">-mkdir</span> &nbsp; &nbsp; &nbsp; <span style="color: #000000; font-weight: bold;">/</span>tmp<br />
hadoop fs <span style="color: #660033;">-mkdir</span> <span style="color: #660033;">-p</span> &nbsp; &nbsp;<span style="color: #000000; font-weight: bold;">/</span>user<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>warehouse<br />
hadoop fs <span style="color: #660033;">-chmod</span> g+<span style="color: #c20cb9; font-weight: bold;">w</span> &nbsp; <span style="color: #000000; font-weight: bold;">/</span>tmp<br />
hadoop fs <span style="color: #660033;">-chmod</span> g+<span style="color: #c20cb9; font-weight: bold;">w</span> &nbsp; <span style="color: #000000; font-weight: bold;">/</span>user<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>warehouse</div></div>
<p>You should now be able to run 'hive' and not see any errors on launch (then just type 'quit;' to exit).</p>
<h3>The ES "plug-in" part</h3>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #7a0874; font-weight: bold;">cd</span> ~<br />
<span style="color: #c20cb9; font-weight: bold;">wget</span> https:<span style="color: #000000; font-weight: bold;">//</span>download.elasticsearch.org<span style="color: #000000; font-weight: bold;">/</span>hadoop<span style="color: #000000; font-weight: bold;">/</span>hadoop-latest.zip<br />
<span style="color: #c20cb9; font-weight: bold;">unzip</span> .<span style="color: #000000; font-weight: bold;">/</span>hadoop-latest.zip<br />
<br />
<span style="color: #c20cb9; font-weight: bold;">mkdir</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>aux_lib<span style="color: #000000; font-weight: bold;">/</span> <span style="color: #666666; font-style: italic;"># creating a folder for our new jar</span><br />
<span style="color: #c20cb9; font-weight: bold;">cp</span> ~<span style="color: #000000; font-weight: bold;">/</span>elasticsearch-hadoop<span style="color: #000000; font-weight: bold;">/</span>dist<span style="color: #000000; font-weight: bold;">/</span>elasticsearch-hadoop-1.3.0.M1-yarn.jar <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>aux_lib<span style="color: #000000; font-weight: bold;">/</span></div></div>
<p>Bit of a curveball here: time to copy the lib to an HDFS folder so remote connections can find it as well as local ones.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">hadoop fs <span style="color: #660033;">-mkdir</span> <span style="color: #000000; font-weight: bold;">/</span>aux_lib<br />
hadoop fs <span style="color: #660033;">-copyFromLocal</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>aux_lib<span style="color: #000000; font-weight: bold;">/*</span> hdfs:<span style="color: #000000; font-weight: bold;">///</span>aux_lib<br />
hadoop fs <span style="color: #660033;">-ls</span> <span style="color: #000000; font-weight: bold;">/</span>aux_lib <span style="color: #666666; font-style: italic;"># to check that the jar made it</span></div></div>
<p>Booya. We <strong>should</strong> be all configured to make the magic happen now. Let's start up an interactive HIVE SQL shell and check it out.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">hive <span style="color: #660033;">--auxpath</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>aux_lib<span style="color: #000000; font-weight: bold;">/</span> <span style="color: #660033;">--config</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>conf<span style="color: #000000; font-weight: bold;">/</span></div></div>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">13/12/17 04:52:01 INFO Configuration.deprecation: mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive<br />
13/12/17 04:52:01 INFO Configuration.deprecation: mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize<br />
13/12/17 04:52:01 INFO Configuration.deprecation: mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize<br />
13/12/17 04:52:01 INFO Configuration.deprecation: mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack<br />
13/12/17 04:52:01 INFO Configuration.deprecation: mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node<br />
13/12/17 04:52:01 INFO Configuration.deprecation: mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces<br />
13/12/17 04:52:01 INFO Configuration.deprecation: mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative<br />
<br />
Logging initialized using configuration in jar:file:/home/dumbo/hive/lib/hive-common-0.12.0.jar!/hive-log4j.properties<br />
SLF4J: Class path contains multiple SLF4J bindings.<br />
SLF4J: Found binding in [jar:file:/home/dumbo/hadoop/share/hadoop/common/lib/slf4j-log4j12-1.7.5.jar!/org/slf4j/impl/StaticLoggerBinder.class]<br />
SLF4J: Found binding in [jar:file:/home/dumbo/hive/lib/slf4j-log4j12-1.6.1.jar!/org/slf4j/impl/StaticLoggerBinder.class]<br />
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.<br />
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]<br />
hive&gt;</div></div>
<p><strong>Fuckin' A right!</strong> If you see the above, you are master of all you survey - or something like that. Anyways, with the ES plugin we are creating an external table def that maps directly to our ES index / ES query.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #666666; font-style: italic;">-- typically you'd need to run this first</span><br />
ADD JAR <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>aux_lib<span style="color: #000000; font-weight: bold;">/</span>elasticsearch-hadoop-1.3.0.M1-yarn.jar;<br />
<span style="color: #666666; font-style: italic;">-- but it should automatically be loaded via our config (here for ref)</span><br />
<br />
<br />
<span style="color: #666666; font-style: italic;">-- anyways</span><br />
CREATE EXTERNAL TABLE wikitable <span style="color: #7a0874; font-weight: bold;">&#40;</span><br />
&nbsp; &nbsp; title string,<br />
&nbsp; &nbsp; redirect_page string <span style="color: #7a0874; font-weight: bold;">&#41;</span><br />
STORED BY <span style="color: #ff0000;">'org.elasticsearch.hadoop.hive.ESStorageHandler'</span><br />
TBLPROPERTIES<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #ff0000;">'es.resource'</span> = <span style="color: #ff0000;">'wikipedia_river/page/_search?q=*'</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0000;">'es.host'</span> = <span style="color: #ff0000;">'localhost'</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0000;">'es.port'</span> = <span style="color: #ff0000;">'9200'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>;</div></div>
<p>Notice the ES vars. In this case I'm running a simple ES client node on the Hadoop server (no data, no master) and connecting to that - but it could be anywhere on your network; it shouldn't matter.</p>
<p>You should see a response like this.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">OK<br />
Time taken: <span style="color: #000000;">6.11</span> seconds</div></div>
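<p>If the table gets created but queries later hang or come back empty, it can be worth hitting the same ES resource directly to rule out Hive. A minimal sketch, assuming curl is installed and the ES node speaks plain HTTP; <code>es_url</code> is a helper name invented for this example, and it just glues the three table properties into the URL that Hive's storage handler would be pointed at.</p>

```shell
# Hypothetical helper: build the HTTP URL matching the es.host / es.port /
# es.resource values used in TBLPROPERTIES.
# Usage: es_url HOST PORT RESOURCE
es_url() {
    host="$1"
    port="$2"
    resource="$3"
    printf 'http://%s:%s/%s\n' "$host" "$port" "$resource"
}

# Example (needs curl and a reachable ES node, so it is left commented out):
#   curl -s "$(es_url localhost 9200 'wikipedia_river/page/_search?q=*')"
```

<p>If that curl call returns JSON hits, ES is fine and any remaining trouble is on the Hive / Hadoop side.</p>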
<p>Let's give it some work to do.</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #000000; font-weight: bold;">select</span> count<span style="color: #7a0874; font-weight: bold;">&#40;</span>distinct title<span style="color: #7a0874; font-weight: bold;">&#41;</span> from wikitable;</div></div>
<p>...and the MapReduce train gets rollin'.</p>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">Total MapReduce jobs = 1<br />
Launching Job 1 out of 1<br />
Number of reduce tasks determined at compile time: 1<br />
In order to change the average load for a reducer (in bytes):<br />
&nbsp; set hive.exec.reducers.bytes.per.reducer=&lt;number&gt;<br />
In order to limit the maximum number of reducers:<br />
&nbsp; set hive.exec.reducers.max=&lt;number&gt;<br />
In order to set a constant number of reducers:<br />
&nbsp; set mapred.reduce.tasks=&lt;number&gt;<br />
Starting Job = job_1387231099006_0004, Tracking URL = http://innsmouth:8088/proxy/application_1387231099006_0004/<br />
Kill Command = /home/dumbo/hadoop/bin/hadoop job &nbsp;-kill job_1387231099006_0004<br />
Hadoop job information for Stage-1: number of mappers: 5; number of reducers: 1<br />
2013-12-17 05:03:33,378 Stage-1 map = 0%, &nbsp;reduce = 0%<br />
2013-12-17 05:04:16,198 Stage-1 map = 1%, &nbsp;reduce = 0%, Cumulative CPU 47.58 sec<br />
2013-12-17 05:04:17,231 Stage-1 map = 1%, &nbsp;reduce = 0%, Cumulative CPU 47.58 sec<br />
2013-12-17 05:04:18,274 Stage-1 map = 1%, &nbsp;reduce = 0%, Cumulative CPU 49.59 sec<br />
2013-12-17 05:04:19,354 Stage-1 map = 1%, &nbsp;reduce = 0%, Cumulative CPU 52.76 sec<br />
2013-12-17 05:04:20,398 Stage-1 map = 1%, &nbsp;reduce = 0%, Cumulative CPU 52.76 sec<br />
...</div></div>
<p>It's a bit of an odd (and slow) example (esp on my small VM set up / example data), since in pure ES you'd just run a faceted open query on title - but it shows that we can talk to ES using Hive SQL. Also the plugin is getting better all the time and should optimize the query better in the future.</p>
<p>Now we have a Hive "wikitable" object that can be treated just like any other hive table (via Tableau or other). However, did you notice the ElasticSearch URL query string during table creation (<strong>wikipedia_river/page/_search?q=*</strong>)?</p>
<p>This essentially allows us to run a query / filter / etc in native ES BEFORE it gets abstracted and interpreted by Hive. For example...</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">CREATE EXTERNAL TABLE wikitallica <span style="color: #7a0874; font-weight: bold;">&#40;</span> &nbsp;title string, redirect_page string <span style="color: #7a0874; font-weight: bold;">&#41;</span><br />
STORED BY <span style="color: #ff0000;">'org.elasticsearch.hadoop.hive.ESStorageHandler'</span><br />
TBLPROPERTIES<span style="color: #7a0874; font-weight: bold;">&#40;</span><span style="color: #ff0000;">'es.resource'</span> = <span style="color: #ff0000;">'wikipedia_river/page/_search?q=metallica'</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0000;">'es.host'</span> = <span style="color: #ff0000;">'localhost'</span>,<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff0000;">'es.port'</span> = <span style="color: #ff0000;">'9200'</span><span style="color: #7a0874; font-weight: bold;">&#41;</span>;</div></div>
<p>See <strong>q=metallica</strong>? It's going to do a regular full-text search on all fields for 'metallica' before passing the result set to Hive SQL.</p>
<p>Powerful leverage indeed... now if only we could pass parameters from query to table def...</p>
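<p>Until something like that exists, one crude workaround is to generate the table def from the shell, one table per search term. This is not real parameter passing, just a sketch that prints the same DDL as above with the term swapped in. <code>es_table_ddl</code> is a made-up helper name, and it assumes a simple one-word term with no quotes or spaces (nothing is escaped).</p>

```shell
# Hypothetical helper: print a CREATE EXTERNAL TABLE statement whose ES query
# is the given search term. Index, columns, host and port mirror the example above.
# Usage: es_table_ddl TABLE_NAME SEARCH_TERM
es_table_ddl() {
    table="$1"
    term="$2"
    printf "CREATE EXTERNAL TABLE %s ( title string, redirect_page string )\n" "$table"
    printf "STORED BY 'org.elasticsearch.hadoop.hive.ESStorageHandler'\n"
    printf "TBLPROPERTIES('es.resource' = 'wikipedia_river/page/_search?q=%s',\n" "$term"
    printf "              'es.host' = 'localhost',\n"
    printf "              'es.port' = '9200');\n"
}
```

<p>For example, <code>es_table_ddl wiki_slayer slayer &gt; /tmp/wiki_slayer.hql</code> writes the statement to a file (the table name and path are just placeholders), which you can then feed to Hive with <code>hive --auxpath /home/dumbo/hive/aux_lib/ --config /home/dumbo/hive/conf/ -f /tmp/wiki_slayer.hql</code>.</p>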
<h3>Tableau time</h3>
<p>First, make sure that Hive is launched as an external service (defaults to port 10000)</p>
<div class="codecolorer-container bash blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="bash codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">hive <span style="color: #660033;">--auxpath</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>aux_lib<span style="color: #000000; font-weight: bold;">/</span> <span style="color: #660033;">--config</span> <span style="color: #000000; font-weight: bold;">/</span>home<span style="color: #000000; font-weight: bold;">/</span>dumbo<span style="color: #000000; font-weight: bold;">/</span>hive<span style="color: #000000; font-weight: bold;">/</span>conf<span style="color: #000000; font-weight: bold;">/</span> <span style="color: #660033;">--service</span> hiveserver</div></div>
<p>Easy. Install the <a href="http://doc.mapr.com/display/MapR/Hive+ODBC+Connector" target="_blank">MapR Hive ODBC Connector</a> and fire up Tableau.</p>
<p>And you are "off to the races" as they say...</p>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/12/tabdriv.jpg" alt="tabdriv" width="500" height="450" class="aligncenter size-full wp-image-43411" /></p>
<h3>Errors? Issues? Let me know in the comments.</h3>
<p>As with any EPIC nuts n' bolts type How-To post, there are bound to be errors and typos in v1 - and there are a lot of moving parts in this one...</p>
<p><em>[ Drawings part of <a href="http://johnkenn.blogspot.com/" target="_blank">Kenn Mortensen</a>'s amazing <em>Post-It Monstres </em>collection ]</em></p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/systems/connecting-tableau-to-elasticsearch-read-how-to-query-elasticsearch-with-hive-sql-and-hadoop/feed/</wfw:commentRss>
		<slash:comments>25</slash:comments>
		</item>
		<item>
		<title>Enterprise IT: Why does &#8220;having a job&#8221; and good software have to be mutually exclusive?</title>
		<link>http://ryrobes.com/random/enterprise-it-why-does-having-a-job-and-good-software-have-to-be-mutually-exclusive/</link>
		<comments>http://ryrobes.com/random/enterprise-it-why-does-having-a-job-and-good-software-have-to-be-mutually-exclusive/#comments</comments>
		<pubDate>Fri, 30 Aug 2013 20:14:01 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Random]]></category>
		<category><![CDATA[enterprise]]></category>
		<category><![CDATA[IT]]></category>
		<category><![CDATA[rant]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=43293</guid>
		<description><![CDATA[We've all heard that 'witty' interjection - usually after explaining (or complaining about) a very complicated software procedure (typically administrative / operational in nature) someone will always say something like "Hey, that's why we have jobs!" (or, if consulting, "that's why you're here", etc)..... Why can't we have both?]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/jounalistfrustration-853204.jpeg" width="240" />
		</p><p><span class="dropcap">W</span><!--/.dropcap-->e've all heard that 'witty' interjection - usually after explaining (or complaining about) a very complicated software procedure (typically administrative / operational in nature) someone will always say something like <strong>"Hey, that's why we have jobs!"</strong> (or, if consulting, <strong>"that's why you're here"</strong>, etc). This is sometimes followed by everyone making their <a href="http://3.bp.blogspot.com/-GJsMw4r4QuM/Tz_Y7x2B9DI/AAAAAAAAACQ/3JN5fracnPQ/s1600/LOL+FACE.jpg">best 4chan LOL face</a> before returning to the day's tedium.</p>
<h4 align="center">Most Enterprise Software == Added 'Technical Debt'.</h4>
<p>All packaged up in a big shiny box, and by "box" I mean <strong>"poorly formatted auto-bro email"</strong>. I haven't seen a product box or printed manual in years, those cheap pricks.</p>
<h2 align="center">It's 2013 for fuck's sake.</h2>
<h4>I never imagined that the state of enterprise corporate software would *still* be so damn bad.</h4>
<ul>
<li>Layers upon layers of <a href="http://en.wikipedia.org/wiki/Cruft" target="_blank">antiquated cruft</a> masquerading as "robustness" or "a solid foundation"</li>
<li>complicated software stacks with multiple dependencies for the sake of a single (lazily designed) application</li>
<li>purposely obfuscated database back-end schema (so you have to buy their "reporting modules" of course)</li>
<li>monolithic per-user licensing plans on <em>self-hosted</em> software</li>
<li>...the list goes on...</li>
</ul>
<p>Hell, it's spawned this complacent "business as usual" worldview among an entire generation of IT workers and managers. And people wonder why we drink so much...</p>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/dilbert-usability.jpg" alt="dilbert-usability" width="500" height="168" class="aligncenter size-full wp-image-43316" style="border:0px;" /></p>
<p>Does anyone ever look around and say <strong>"Ye gods! We need to stop this ghastly madness!"</strong>. Not really. And I'm not telling you to stand up and be that heretic, but I am asking you to think about it.</p>
<p>Why is it that many large well-known companies seem to write bad software on purpose? How does this continue for so long? Teams are too large? Bound to their own legacy technical debt? They don't want to rock the boat? The more (over) complicated it is, the more money they can charge for it? </p>
<p>I'm sure it's all of these things and more - but that doesn't make it right.<br />
The people at these big software companies are supposed to be smarter than us, right? So we can get a better solution quicker, by opening our wallets and standing on the empathetic shoulders of software giants, right? I'm not so sure.</p>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/Blog06_Dilbert_SoftwareBugs.gif" alt="Blog06_Dilbert_SoftwareBugs" width="520" class="aligncenter size-full wp-image-43318" style="border:0px;" /></p>
<p>Fresh interfaces, fresh ideas, forward-thinking technology? Nope. Not here. Enterprise IT these days is about as "backward thinking" as it can get away with. </p>
<p>Every new decade brings the same old song, often using <a href="http://en.wikipedia.org/wiki/Moore's_law" target="_blank">Moore's Law</a> and cheap labor as a crutch.</p>
<h4>"Bring on more of the same! We can always mitigate it with more computing power and more Dilbert-esque application-specific workers!"</h4>
<p>I jest, but look around - it forms the very basis of many IT support organizations.</p>
<p>IT is kind of funny this way. It's pretty much seen as a "necessary evil" in most traditional (READ: BigCorp) industries. They have to have it and they don't like paying for it - because it doesn't really make them money directly. So when I complain about software decisions that turn a team into a bunch of technological ditch diggers, they would say, "So? That's your job". But it isn't. IT's job is to support the "business people" so that they can make more money and everyone can high-five each other. </p>
<p>Instead, most IT jobs these days are truly "meta". They exist to support the tools they use to support the business (many times with even more layers of recursion than that)...</p>
<p>Don't take me as an ungrateful whiner - maybe they're right, maybe we "wouldn't have jobs" if software was (mostly) excellent... <strong>but I seriously doubt it</strong>. Instead we'd be able to spend more time making things *that* much better and more profitable.</p>
<p>You know, as opposed to swimming upstream toward the next exposed tree root because someone didn't think to build a boat instead of buying a bottomless Yacht.</p>
<h4>We ALL can do better. Better software up top - better purchasing decisions down low.</h4>
<h1>Vote with your dollars.</h1>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/random/enterprise-it-why-does-having-a-job-and-good-software-have-to-be-mutually-exclusive/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>RiffBank &#8211; Parsing arbitrary Text-based Guitar Tab into an Indexable and Queryable &#8220;RiffCode&#8221; for ElasticSearch</title>
		<link>http://ryrobes.com/elasticsearch/riffbank-parsing-arbitrary-text-based-guitar-tab-into-an-indexable-queryable-riffcode-for-elasticsearch/</link>
		<comments>http://ryrobes.com/elasticsearch/riffbank-parsing-arbitrary-text-based-guitar-tab-into-an-indexable-queryable-riffcode-for-elasticsearch/#comments</comments>
		<pubDate>Thu, 29 Aug 2013 23:13:09 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[ElasticSearch]]></category>
		<category><![CDATA[elasticsearch]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[riffbank]]></category>
		<category><![CDATA[tablature]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=43249</guid>
		<description><![CDATA[Guitar tablature is meant for human readability...not for machine consumption. Granted it's "procedural" and "linear" already, but it's also column-based AND row-based at the same time (readers read down a short row and then over) - you are dealing with text chunks that are easily understandable by a human, but require a lot of "context" [&#8230;]]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/riffcode-pic.jpg" width="240" />
		</p><div class="woo-sc-box normal  rounded full"><b>Quick Links</b>: This post is on the approach taken to get guitar tab into a normalized and data-structured form. See previous posts for more context.</p>
<ul>
<li><a href="http://ryrobes.com/elasticsearch/riffbank-searching-guitar-tab-data-as-a-pseudo-language-with-elasticsearch/" title="RiffBank – Indexing “Guitar Tab Data” as a pseudo-language with ElasticSearch" target="_blank"><b>Part 0</b>: Why Guitar Tab?</a></li>
<li><b>Part 1</b>: Text Tab to "RiffCode" (this post) </li>
<li><b>Part 2</b>: Riff Storage and Querying in ElasticSearch</li>
<li><b>Part 3</b>: Simple UI display with PHP-FatFree and Twitter Bootstrap</li>
</ul>
</div>
<p><a href="http://riffbank.com"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/riffbank-square.jpg" alt="riffbank-square" width="453" height="329" class="aligncenter size-full wp-image-43202" align="center" /></a></p>
<h4>Guitar tablature is meant for human readability...<br/>not for machine consumption.</h4>
<p>Granted it's "procedural" and "linear" already, but it's also column-based AND row-based at the same time (readers read down a short row and then over) - you are dealing with text chunks that are <strong>easily understandable by a human</strong>, but <strong>require a lot of "context" and rules for a machine to decipher</strong>. Not to mention the fact that it's hand-written by humans, which is another error waiting to happen.</p>
<h4>Aside from the "how to do this" aspect, I also had to create a system to 'normalize' tab into a consistent format that lent itself to being queried properly. </h4>
<h3>The solution?  "RiffCode"</h3>
<p><strong>My initial implementation goes like this:</strong></p>
<ul>
<li>encode single notes and chords into "pseudo-words" </li>
<li>turn those riff sections into "sentences"</li>
<li>capture note meta-data when possible (palm-muting, etc)</li>
</ul>
<p>By storing the data this way I can use full-text search technology to glean results from it (which, with <a href="http://elasticsearch.com/" target="_blank">ElasticSearch</a>, worked quite well).</p>
<p>In practice it looks / works like this:</p>
<pre>input</pre>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">E-|----------------------------------------------------------------|<br />
B-|----------------------------------------------------------------|<br />
G-|*--------------------------------------------------------------*|<br />
D-|*-----------------------------------------------5--------------*|<br />
A-|---7-7-5-7---------7-7-5-----------7-7-5-7------5---------------|<br />
E-|-0---------------0-------7-6-5---0--------------3---6-5-0-3-5---|</div></div>
<pre>output</pre>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">&nbsp;6a 5h 5h 5f 5h 6a 5h 5h 5f 6h 6g 6f 6a 5h 5h 5f 5h 6d5f4f 6g 6f 6a 6d 6f</div></div>
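<p>The concept is to pivot each "column line" of tab and keep it as "lossless" as possible (including extraneous spaces), using a letter for the fret number (a = open string, f = 5th fret, h = 7th fret) and a number for the string (6 = low E, 1 = high E). Notes that share a column are joined into a single word, so the chord in the example above becomes "6d5f4f". The system has some shortcomings (26 letters only cover frets 0 through 25), but is adequate (if not damn good) for 95% of tab.</p>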
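<p>As a rough illustration of that pivot, here is a minimal Python 3 sketch. It is not the parser used for RiffBank (that one follows further down). It assumes six already-identified string lines with the high E first, maps fret 0 to 'a' and so on, and treats any run of digits as one fret number. The names <em>fret_to_letter</em> and <em>tab_to_riffcode</em> are made up for this example, and unlike the real output it does not try to preserve extra spacing.</p>

```python
import re
from collections import defaultdict


def fret_to_letter(fret):
    """Map fret 0 -> 'a', 1 -> 'b', ... 25 -> 'z'."""
    if fret not in range(26):
        raise ValueError("fret %d does not fit in a single letter" % fret)
    return chr(ord("a") + fret)


def tab_to_riffcode(lines):
    """Pivot six tab rows (high E first) into a RiffCode-style sentence."""
    columns = defaultdict(list)  # column index -> [(string number, letter)]
    for string_number, line in enumerate(lines, start=1):
        # assumption: any run of digits is a single fret number
        for match in re.finditer(r"\d+", line):
            letter = fret_to_letter(int(match.group()))
            columns[match.start()].append((string_number, letter))

    words = []
    for col in sorted(columns):
        # lowest-pitched string (6) first, matching the sample output
        notes = sorted(columns[col], reverse=True)
        words.append("".join("%d%s" % (s, l) for s, l in notes))
    return " ".join(words)


TAB = [
    "E-|---------|",
    "B-|---------|",
    "G-|---------|",
    "D-|-----5---|",
    "A-|---7-5---|",
    "E-|-0---3---|",
]

print(tab_to_riffcode(TAB))  # prints: 6a 5h 6d5f4f
```

<p>Treating every digit run as one fret is a simplification: hand-written tab sometimes jams two single-digit frets together, and a sketch like this cannot tell "12" from "1" followed by "2".</p>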
<p>Ok, now how do we do this a million times over? - Python.</p>
<p>Since I wasn't even sure this was going to work - I wrote fast and carelessly. So what we have is a very inelegant solution that iterates over the text <strong>several</strong> times and creates several dictionaries - and then re-constructs it at the end. </p>
<p><strong>It's one cluster fuck of a text-parsing function, but it works, and it's fast enough.</strong></p>
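<p>Before wading into the full function, it may help to see its first pass in isolation: each line is labelled by its most common character. What follows is a condensed Python 3 paraphrase of that idea, not a drop-in replacement. The function names and the returned labels ("string", "meta", "header", "divider", "other") are invented for this sketch.</p>

```python
from collections import Counter


def most_common_char(line):
    """Most frequent character, ignoring a few accent symbols and spaces."""
    cleaned = line.replace("~", " ").replace("\\", " ").replace("^", " ")
    ranked = [ch for ch, _ in Counter(cleaned).most_common(2)]
    if ranked[0] == " " and len(ranked) > 1:
        return ranked[1]  # fall back to the runner-up when spaces win
    return ranked[0]


def classify_line(line):
    """Guess what kind of tab line this is."""
    line = line.rstrip("\r\n")
    if not line:
        return "divider"  # blank lines separate riffs
    common = most_common_char(line)
    # mostly dashes (and no 'P' for palm mute), or mostly digits with dashes
    if (common == "-" and "P" not in line) or (common.isdigit() and line.find("-") > 0):
        return "string"
    if line.find("P") > 2:
        return "meta"  # probable palm-mute line
    if len(line) < 30 and common.isalpha():
        return "header"  # probable riff name
    return "other"


for sample in ["A-|---7-7-5-7---|", "   PM------|", "Verse", ""]:
    print(repr(sample), classify_line(sample))
```

<p>The thresholds (a 'P' past the third character, headers under 30 characters) are lifted from the function below. They are heuristics, so expect the odd misfire on unusual tab.</p>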
<h2>"No time for love, Dr. Jones!"</h2>
<div class="codecolorer-container python blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">collections</span> <span style="color: #808080; font-style: italic;"># needed for collections.Counter below</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">def</span> GenerateRiffCodeFromText<span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span>: <span style="color: #808080; font-style: italic;"># x being the giant text string...</span><br />
<br />
&nbsp; &nbsp; riff_number <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; commonchar <span style="color: #66cc66;">=</span> <span style="color: #008000;">None</span><br />
&nbsp; &nbsp; string_line <span style="color: #66cc66;">=</span> <span style="color: #008000;">None</span><br />
&nbsp; &nbsp; lineresults <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span> <span style="color: #808080; font-style: italic;"># or dict()</span><br />
<br />
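&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> lineno<span style="color: #66cc66;">,</span> linestr <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">enumerate</span><span style="color: black;">&#40;</span>x.<span style="color: black;">splitlines</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>: <span style="color: #808080; font-style: italic;"># iterate over lines, not characters</span><br />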
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; linestr <span style="color: #66cc66;">=</span> linestr.<span style="color: black;">rstrip</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\r</span><span style="color: #000099; font-weight: bold;">\n</span>'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#.strip()</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; first3char <span style="color: #66cc66;">=</span> linestr<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span>:<span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>linestr<span style="color: black;">&#41;</span> <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">0</span>: &nbsp; <span style="color: #808080; font-style: italic;"># not counting some common accent symbols in case tab author was crazy w solo accents</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; prev_commonchar <span style="color: #66cc66;">=</span> commonchar<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; mod_linestr <span style="color: #66cc66;">=</span> linestr.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'~'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">' '</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\\</span>'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">' '</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'^'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">' '</span> <span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commonchar <span style="color: #66cc66;">=</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">collections</span>.<span style="color: black;">Counter</span><span style="color: black;">&#40;</span>mod_linestr<span style="color: black;">&#41;</span>.<span style="color: black;">most_common</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> <span style="color: #808080; font-style: italic;"># [0] is digit, [1] is freq</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> commonchar <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">' '</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commonchar <span style="color: #66cc66;">=</span> <span style="color: black;">&#40;</span><span style="color: #dc143c;">collections</span>.<span style="color: black;">Counter</span><span style="color: black;">&#40;</span>mod_linestr<span style="color: black;">&#41;</span>.<span style="color: black;">most_common</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> <span style="color: #808080; font-style: italic;"># [0] is digit, [1] is freq</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span> <span style="color: #808080; font-style: italic;"># ? not sure see 2x4</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; prev_commonchar <span style="color: #66cc66;">=</span> commonchar<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; commonchar <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'DIVIDER'</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># find (probable) string lines and label them</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span>commonchar <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'-'</span> <span style="color: #ff7700;font-weight:bold;">and</span> linestr.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'P'</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: black;">&#40;</span>commonchar.<span style="color: black;">isdigit</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> linestr.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span>: <span style="color: #808080; font-style: italic;"># so we don't grab the Palm Mute line...</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> string_line <span style="color: #66cc66;">==</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; prev_string_line <span style="color: #66cc66;">=</span> string_line<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; string_line <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; prev_string_line <span style="color: #66cc66;">=</span> string_line<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; string_line <span style="color: #66cc66;">=</span> string_line + <span style="color: #ff4500;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#lineresults[lineno] = string_line</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; prev_string_line <span style="color: #66cc66;">=</span> string_line<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; string_line <span style="color: #66cc66;">=</span> <span style="color: #008000;">None</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># find (probable) meta / PM line ?</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> linestr.<span style="color: black;">find</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'P'</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">2</span> <span style="color: #ff7700;font-weight:bold;">and</span> string_line <span style="color: #66cc66;">==</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; prev_string_line <span style="color: #66cc66;">=</span> string_line<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; string_line <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># find (possible) section headers</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>linestr<span style="color: black;">&#41;</span><span style="color: #66cc66;">&lt;</span><span style="color: #ff4500;">30</span> <span style="color: #ff7700;font-weight:bold;">and</span> commonchar.<span style="color: black;">isalpha</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> string_line <span style="color: #66cc66;">==</span> <span style="color: #008000;">None</span> <span style="color: #ff7700;font-weight:bold;">and</span> commonchar <span style="color: #66cc66;">&lt;&gt;</span> <span style="color: #483d8b;">'DIVIDER'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; riff_name <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'riff name?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; riff_name <span style="color: #66cc66;">=</span> <span style="color: #008000;">None</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print &quot;line number: &quot; + str(lineno) + &quot;: &quot; + linestr.rstrip()+' ',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print '{' + commonchar + ' ' + str(string_line) + ' ' + str(riff_name) +'} Rnum' + str(riff_number) #+ 'prevst'+str(prev_string_line)</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> prev_string_line <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">6</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; riff_number <span style="color: #66cc66;">=</span> riff_number + <span style="color: #ff4500;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> <span style="color: black;">&#40;</span>commonchar <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'DIVIDER'</span> <span style="color: #ff7700;font-weight:bold;">and</span> prev_string_line <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">6</span> <span style="color: #ff7700;font-weight:bold;">and</span> prev_string_line <span style="color: #66cc66;">&gt;</span> <span style="color: #008000;">None</span><span style="color: black;">&#41;</span>: <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; riff_number <span style="color: #66cc66;">=</span> riff_number + <span style="color: #ff4500;">1</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># add all this shit to a dick(t)</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #ff7700;font-weight:bold;">not</span> lineresults.<span style="color: black;">has_key</span><span style="color: black;">&#40;</span>riff_number<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; lineresults<span style="color: black;">&#91;</span>riff_number<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; lineresults<span style="color: black;">&#91;</span>riff_number<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>string_line<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: #483d8b;">'linestr'</span>:linestr.<span style="color: black;">rstrip</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'commonchar'</span>:commonchar<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'string_line'</span>:string_line<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'riff_name'</span>:riff_name<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'riff_number'</span>:riff_number<span style="color: black;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#lineresults[riff_number] = {'linestr':linestr.rstrip(), 'lineno':lineno, 'string_line':string_line, 'riff_name':riff_name, 'riff_number':riff_number}</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print 'done: ' + str(len(lineresults)) + ' lines'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#pp.pprint(lineresults)</span><br />
<br />
<br />
&nbsp; &nbsp; linelengths <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># get the shortest string line per riff (999 is a sentinel)</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> rnum <span style="color: #ff7700;font-weight:bold;">in</span> lineresults:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; linelengths<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">999</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> ln <span style="color: #ff7700;font-weight:bold;">in</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> ln <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">0</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>ln<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #66cc66;">&lt;</span> linelengths<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; linelengths<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>ln<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> linelengths<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">999</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; linelengths<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
<br />
&nbsp; &nbsp; result_dict <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span><br />
<br />
<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> rnum <span style="color: #ff7700;font-weight:bold;">in</span> lineresults:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result_dict<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #008000;">None</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; riff_name <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #008000;">None</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; raw_lines +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result_dict<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'raw_lines'</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> raw_lines<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># alphabetize fret nums</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> AlphabatizeFrets<span style="color: black;">&#40;</span>lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> column <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span>linelengths<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># first attempt at simple PM recording..</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][6]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #483d8b;">' '</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'#'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'0'</span> <span style="color: #808080; font-style: italic;"># record nothing?</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'0'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][6]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">6</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' '</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][5]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' '</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][4]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' '</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][3]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' '</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][2]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' '</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print lineresults[rnum][1]['linestr'][column],</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> lineresults<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'linestr_a'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span>column<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">' '</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">###sz = lineresults[rnum][1]['linestr_a'][column].replace(' ','-')</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">##print '?',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print ',',</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s +<span style="color: #66cc66;">=</span> <span style="color: #483d8b;">' '</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print s,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span> <span style="color: #808080; font-style: italic;">#test</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#changing some chars for a test indexing run</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># constructing the &quot;LONG CODE&quot;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print s # original s code</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># check first &quot;note&quot; for bar notes</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;##&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;B&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;E&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;A&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;?&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;||&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;::&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> <span style="color: #483d8b;">&quot;-&quot;</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s<span style="color: black;">&#91;</span><span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;##&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;B&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;E&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;A&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;?&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;||&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;::&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> <span style="color: #483d8b;">&quot;-&quot;</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s<span style="color: black;">&#91;</span><span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># check late &quot;note&quot; for bar notes</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;##&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;E&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;A&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;?&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;||&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> <span style="color: #483d8b;">&quot;-&quot;</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;##&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;E&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;A&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;?&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span> <span style="color: #ff7700;font-weight:bold;">or</span> <span style="color: #483d8b;">&quot;||&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">and</span> <span style="color: #483d8b;">&quot;-&quot;</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">8</span>:<span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">8</span><span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#s = s.replace('0EADGBE','').replace('0||||||','').replace('0::::::','')</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#s = s.replace('|||','')</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0||||||'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># riff bar separators</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#||||||'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># riff bar separators</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'?'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'-'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># temp - work on it later (mark out missing tabbed strings)</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;| &quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> s<span style="color: black;">&#91;</span>:<span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; s <span style="color: #66cc66;">=</span> s<span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span>:<span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># now to change LONG CODE to SHORT CODE</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; long_code <span style="color: #66cc66;">=</span> s<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; long_code_list <span style="color: #66cc66;">=</span> long_code.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; short_code <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> w <span style="color: #ff7700;font-weight:bold;">in</span> long_code_list:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print w</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> w <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'|'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; w <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'|'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ww <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'|'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> w <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'0------'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; w <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'.'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ww <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'.'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wd <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span><span style="color: black;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cnt <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cntf<span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> l <span style="color: #ff7700;font-weight:bold;">in</span> w:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'#'</span> <span style="color: #ff7700;font-weight:bold;">or</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'0'</span> <span style="color: #ff7700;font-weight:bold;">or</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'-'</span> <span style="color: #ff7700;font-weight:bold;">or</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'/'</span> <span style="color: #ff7700;font-weight:bold;">or</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'('</span> <span style="color: #ff7700;font-weight:bold;">or</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">')'</span> <span style="color: #ff7700;font-weight:bold;">or</span> l <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'<span style="color: #000099; font-weight: bold;">\\</span>'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wd<span style="color: black;">&#91;</span>cnt<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>l<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wd<span style="color: black;">&#91;</span>cnt<span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">7</span>-cnt<span style="color: black;">&#41;</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>l<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; cnt <span style="color: #66cc66;">=</span> cnt+<span style="color: #ff4500;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#elif w[0:1] in [a-z]:</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># &nbsp; print 'tt'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#pp.pprint(wd)</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ww <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: #66cc66;">,</span><span style="color: #ff4500;">7</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>: <span style="color: #808080; font-style: italic;"># ? changed last minute</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; ww +<span style="color: #66cc66;">=</span> wd<span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#short_code += str(w)+' '</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>ww<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0-----'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0----'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0---'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0--'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0-'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'0'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">''</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#-----'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'*'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#----'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'*'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#---'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'*'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#--'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'*'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#-'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'*'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'#'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'*'</span><span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> wdee<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">5</span>:<span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'-----'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> wdee<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">4</span>:<span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'----'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> wdee<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">3</span>:<span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'---'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> wdee<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">2</span>:<span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'--'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> wdee<span style="color: black;">&#91;</span>-<span style="color: #ff4500;">1</span>:<span style="color: black;">&#93;</span> <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'-'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> wdee<span style="color: black;">&#91;</span>:-<span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #483d8b;">&quot;H&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> wdee:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'&gt;'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #483d8b;">&quot;P&quot;</span> <span style="color: #ff7700;font-weight:bold;">in</span> wdee:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; wdee <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'&lt;'</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; short_code +<span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>wdee<span style="color: black;">&#41;</span>+<span style="color: #483d8b;">' '</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result_dict<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'short_code'</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> short_code<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result_dict<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'long_code'</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> long_code<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; result_dict<span style="color: black;">&#91;</span>rnum<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'riff_name'</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> riff_name<br />
<br />
<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#return short_code</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> result_dict</div></div>
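<p>To make the LONG CODE to SHORT CODE step at the end of the function above easier to follow, here is a minimal standalone sketch of what it does to a single column. It is a simplified re-implementation, not the code from the module itself: the name encode_column is hypothetical, and it assumes each whitespace-separated token is a 7-character column (one marker character, '0' or '#', followed by one character per guitar string, low E first). Reading 'H' and 'P' as hammer-on and pull-off is also an assumption.</p>

```python
# Characters that are copied through as-is; anything else is treated as a
# fret/technique character and gets prefixed with its string number.
PASSTHROUGH = set('#0-/()\\')


def encode_column(w):
    """Encode one long-code token (hypothetical helper, see assumptions above)."""
    if w == '|':          # bar separator stays a bar
        return '|'
    if w == '0------':    # empty column: nothing on any string
        return '.'

    parts = []
    for pos, ch in enumerate(w[:7]):
        if ch in PASSTHROUGH:
            parts.append(ch)
        else:
            # position 1 maps to string 6 (low E) ... position 6 to string 1
            parts.append(str(7 - pos) + ch)
    code = ''.join(parts)

    # Drop the '0' marker plus the dashes that follow it, longest run first;
    # a '#' marker plus its dashes collapses to '*'.
    for n in range(5, -1, -1):
        code = code.replace('0' + '-' * n, '')
    for n in range(5, -1, -1):
        code = code.replace('#' + '-' * n, '*')

    # Trim one trailing run of up to five dashes.
    for n in range(5, 0, -1):
        if code.endswith('-' * n):
            code = code[:-n]
            break

    # 'H' and 'P' (assumed hammer-on / pull-off) replace the whole column
    # with a single symbol.
    if 'H' in code:
        code = '>'
    if 'P' in code:
        code = '<'
    return code
```

<p>With those assumptions, encode_column('0--2---') returns '42' (string 4, fret 2), and joining the results for the line '0--2--- 0------ | 0-22---' with spaces gives '42 . | 5242'.</p>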
<h4>Text parsing from hell!</h4>
<p>Anyway, that function is part of the module I use to turn a tab file into JSON that I can then insert into my "riff" ElasticSearch index. That happens, of course, AFTER I pull the raw tab out of my "scraping" index. :)</p>
<p>Note: I'm using <a href="http://docs.python-requests.org/en/latest/" target="_blank">the amazing Requests module</a> to talk to ElasticSearch over plain HTTP instead of <a href="http://www.elasticsearch.org/guide/clients/" target="_blank">a dedicated ES Python client</a>.</p>
<div class="codecolorer-container python blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;"># ryan robitaille</span><br />
<span style="color: #808080; font-style: italic;"># prototype [riffbank / riffwords / riffml / riffql / riffjson] &quot;encoder&quot; script</span><br />
<span style="color: #808080; font-style: italic;"># 7/3/2013</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">import</span> requests<span style="color: #66cc66;">,</span> <span style="color: #dc143c;">pprint</span><span style="color: #66cc66;">,</span> json<span style="color: #66cc66;">,</span> <span style="color: #dc143c;">urllib</span><br />
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">time</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">os</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">string</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">sys</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">collections</span><span style="color: #66cc66;">,</span> hashlib<br />
<span style="color: #ff7700;font-weight:bold;">import</span> riff_coder <span style="color: #808080; font-style: italic;"># custom</span><br />
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">random</span> <span style="color: #ff7700;font-weight:bold;">import</span> choice<br />
<br />
pp <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">pprint</span>.<span style="color: black;">PrettyPrinter</span><span style="color: black;">&#40;</span>indent<span style="color: #66cc66;">=</span><span style="color: #ff4500;">3</span><span style="color: black;">&#41;</span><br />
<br />
<span style="color: #808080; font-style: italic;"># get a random ES box each time</span><br />
es_boxes <span style="color: #66cc66;">=</span> <span style="color: black;">&#91;</span><span style="color: #483d8b;">'192.168.xxx.234'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'192.168.xxx.115'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'192.168.xxx.47'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'192.168.xxx.241'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'192.168.xxx.191'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'localhost'</span><span style="color: black;">&#93;</span><br />
<br />
payload <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'query'</span>: <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'bool'</span>: <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'must'</span>: <span style="color: black;">&#91;</span> <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'match_all'</span>: <span style="color: black;">&#123;</span> <span style="color: black;">&#125;</span> <span style="color: black;">&#125;</span> <span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'must_not'</span>: <span style="color: black;">&#91;</span> <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'term'</span>: <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'incoming.riff_indexed'</span>: <span style="color: #ff4500;">3</span> <span style="color: black;">&#125;</span> <span style="color: black;">&#125;</span> <span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'should'</span>: <span style="color: black;">&#91;</span> <span style="color: black;">&#93;</span> <span style="color: black;">&#125;</span> <span style="color: black;">&#125;</span> <span style="color: black;">&#125;</span><br />
rrr <span style="color: #66cc66;">=</span> requests.<span style="color: black;">get</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;http://localhost:9200/scraper/incoming/_search?from=0&amp;size=5000&quot;</span><span style="color: #66cc66;">,</span> data<span style="color: #66cc66;">=</span>json.<span style="color: black;">dumps</span><span style="color: black;">&#40;</span>payload<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
resp <span style="color: #66cc66;">=</span> rrr.<span style="color: black;">json</span><br />
rr <span style="color: #66cc66;">=</span> resp<span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
<br />
<br />
<span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> rr<span style="color: black;">&#91;</span><span style="color: #483d8b;">'hits'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'hits'</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_id'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#pp.pprint(i)</span><br />
<br />
&nbsp; &nbsp; es_box <span style="color: #66cc66;">=</span> choice<span style="color: black;">&#40;</span>es_boxes<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; x <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'raw_text'</span><span style="color: black;">&#93;</span>.<span style="color: black;">splitlines</span><span style="color: black;">&#40;</span><span style="color: #008000;">True</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; riff_coder_dict <span style="color: #66cc66;">=</span> riff_coder.<span style="color: black;">GenerateRiffCodeFromText</span><span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># for each entry in riff_coder_dict, insert ALL above fields plus rnum, raw_riff, short_code (no need for long code)</span><br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># then update scraper record as riff_indexed</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'----------------'</span><span style="color: #66cc66;">,</span>i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'artist_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">' - '</span><span style="color: #66cc66;">,</span>i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'song_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'---------------- '</span><span style="color: #66cc66;">,</span>es_box<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#pp.pprint(riff_coder_dict)</span><br />
<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_album_released <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'spotify_album_released'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_album_href <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'spotify_album_href'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_album_name <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'spotify_album_name'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_track_href <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'spotify_track_href'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_track_popularity <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'spotify_track_popularity'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span> KeyError:<br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_album_released <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_album_href <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_album_name <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_track_href <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; spotify_track_popularity <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tabversion <span style="color: #66cc66;">=</span> i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'tab_version'</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span> KeyError:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tabversion <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> r <span style="color: #ff7700;font-weight:bold;">in</span> riff_coder_dict:<br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; url_hash <span style="color: #66cc66;">=</span> hashlib.<span style="color: black;">sha1</span><span style="color: black;">&#40;</span>i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'source_url'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>.<span style="color: black;">hexdigest</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>+<span style="color: #483d8b;">'__'</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>r<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># add domain of URL?</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; payload <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span> <span style="color: #483d8b;">'artist_id'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'artist_id'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'artist_name'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'artist_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'artist_terms'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'artist_terms'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'audio_summary'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'audio_summary'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'images'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'images'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'location'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'location'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'lookup_name'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'lookup_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'lookup_song_name'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'lookup_song_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'riff_num'</span>:r<span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'riff_name'</span>:riff_coder_dict<span style="color: black;">&#91;</span>r<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'riff_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'riff_code'</span>:riff_coder_dict<span style="color: black;">&#91;</span>r<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'short_code'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'riff_text'</span>:riff_coder_dict<span style="color: black;">&#91;</span>r<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'raw_lines'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'long_code'</span>:riff_coder_dict<span style="color: black;">&#91;</span>r<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'long_code'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'similar_artists'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'similar_artists'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'song_id'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'song_id'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'song_name'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'song_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'source_url'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'source_url'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'tab_version'</span>:tabversion<span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'text_hash'</span>:hashlib.<span style="color: black;">sha1</span><span style="color: black;">&#40;</span>i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'raw_text'</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>.<span style="color: black;">hexdigest</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> \<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'track_data_7digital'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'track_data_7digital'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'track_data_spotify'</span>:i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'track_data_spotify'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> &nbsp;\<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'spotify_data'</span>: <span style="color: black;">&#123;</span> &nbsp; <span style="color: #483d8b;">'album_released'</span>:spotify_album_released<span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'album_href'</span>:spotify_album_href<span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'album_name'</span>:spotify_album_name<span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'track_href'</span>:spotify_track_href<span style="color: #66cc66;">,</span> \<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #483d8b;">'track_popularity'</span>:spotify_track_popularity <span style="color: black;">&#125;</span> &nbsp;<span style="color: black;">&#125;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; inr <span style="color: #66cc66;">=</span> requests.<span style="color: black;">put</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;http://&quot;</span>+es_box+<span style="color: #483d8b;">&quot;:9200/riffs/single/&quot;</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>url_hash<span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> data<span style="color: #66cc66;">=</span>json.<span style="color: black;">dumps</span><span style="color: black;">&#40;</span>payload<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># no need for response</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> inr.<span style="color: black;">text</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'----------------'</span><span style="color: #66cc66;">,</span>i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'artist_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">' - '</span><span style="color: #66cc66;">,</span>i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_source'</span><span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #483d8b;">'song_name'</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'----------------'</span><br />
<br />
<br />
&nbsp; &nbsp; updpayload <span style="color: #66cc66;">=</span> <span style="color: black;">&#123;</span>&nbsp; <span style="color: #483d8b;">'script'</span>: <span style="color: black;">&#123;</span> <span style="color: #483d8b;">&quot;script&quot;</span> : <span style="color: #483d8b;">&quot;ctx._source.riff_indexed = 3&quot;</span> <span style="color: black;">&#125;</span> &nbsp;<span style="color: black;">&#125;</span><br />
&nbsp; &nbsp; upd <span style="color: #66cc66;">=</span> requests.<span style="color: black;">post</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;http://&quot;</span>+es_box+<span style="color: #483d8b;">&quot;:9200/scraper/incoming/&quot;</span>+i<span style="color: black;">&#91;</span><span style="color: #483d8b;">'_id'</span><span style="color: black;">&#93;</span>+<span style="color: #483d8b;">&quot;/_update&quot;</span><span style="color: #66cc66;">,</span> data<span style="color: #66cc66;">=</span>json.<span style="color: black;">dumps</span><span style="color: black;">&#40;</span>updpayload<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> upd.<span style="color: black;">text</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">''</span></div></div>
<p>So now I've got each tab file (song) split into many "riff-based" JSON documents in my ElasticSearch system... (with a lot of extra meta-data picked up along the way - I'll write another post on searching Spotify and Echonest)</p>
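<p>One detail worth calling out from the script above: the document ID is just the SHA-1 of the tab's source URL plus the riff number, so re-running the script PUTs to the same IDs and overwrites the existing riff documents instead of piling up duplicates. A minimal Python 3 sketch of that ID scheme follows (the script above is Python 2 and builds the ID inline). The helper name <code>riff_doc_id</code> and the example URL are made up for illustration.</p>

```python
import hashlib


def riff_doc_id(source_url, riff_num):
    """Build a deterministic document ID for one riff.

    Mirrors the inline expression in the script above:
    sha1(source_url) + '__' + riff number.
    riff_doc_id is a hypothetical helper name, not part of the original script.
    """
    # Python 3 hashes bytes; the Python 2 original hashed the str directly.
    url_hash = hashlib.sha1(source_url.encode("utf-8")).hexdigest()
    return url_hash + "__" + str(riff_num)


# Hypothetical URL, for illustration only.
example_url = "http://example.com/tabs/some-song.txt"
first = riff_doc_id(example_url, 0)
again = riff_doc_id(example_url, 0)
other = riff_doc_id(example_url, 1)

print(first == again)  # True: same tab and riff number give the same ID
print(first == other)  # False: a different riff number gives a different ID
```

<p>Because the ID depends only on the URL and the riff number, two different sites hosting the same tab would still produce separate documents, which is what the "add domain of URL?" comment in the script is hinting at.</p>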
<h4>What's next? Getting it OUT in a meaningful way...</h4>
<p><a href="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/json-shot.jpg"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/json-shot.jpg" alt="json-shot" width="867" height="834" class="aligncenter size-full wp-image-43279" /></a></p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/elasticsearch/riffbank-parsing-arbitrary-text-based-guitar-tab-into-an-indexable-queryable-riffcode-for-elasticsearch/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>RiffBank &#8211; Indexing &#8220;Guitar Tab Data&#8221; as a pseudo-language with ElasticSearch</title>
		<link>http://ryrobes.com/elasticsearch/riffbank-searching-guitar-tab-data-as-a-pseudo-language-with-elasticsearch/</link>
		<comments>http://ryrobes.com/elasticsearch/riffbank-searching-guitar-tab-data-as-a-pseudo-language-with-elasticsearch/#comments</comments>
		<pubDate>Wed, 21 Aug 2013 22:32:31 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[ElasticSearch]]></category>
		<category><![CDATA[elasticsearch]]></category>
		<category><![CDATA[guitar]]></category>
		<category><![CDATA[project]]></category>
		<category><![CDATA[riff]]></category>
		<category><![CDATA[riffbank]]></category>
		<category><![CDATA[tablature]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=43191</guid>
		<description><![CDATA[What is it? A fun data/hack project that I've been working on for the past few weeks. Basically put... Riffbank is a "reverse search engine" for guitar tabs. Give it a simple section of guitar tab (a "riff"), and it will tell you what songs it could be (plus any other metadata it knows). Give [&#8230;]]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/riffbank-square-150x150.jpg" width="240" />
		</p><div class="woo-sc-box normal  rounded full"><b>Quick Links</b>: This post is the <b>"Why"</b> and background on my motivation for this project. Separate posts break down the implementation. Links coming as published.</p>
<ul>
<li><b>Part 0</b>: Why Guitar Tab? (this post, genius)</li>
<li><a href="http://ryrobes.com/elasticsearch/riffbank-parsing-arbitrary-text-based-guitar-tab-into-an-indexable-queryable-riffcode-for-elasticsearch/" title="RiffBank – Parsing arbitrary Text-based Guitar Tab into an Indexable and Queryable “RiffCode for ElasticSearch"><b>Part 1</b>: Text Tab to "RiffCode"</a></li>
<li><a href="http://ryrobes.com/elasticsearch/riffbank-parsing-arbitrary-text-based-guitar-tab-into-an-indexable-queryable-riffcode-for-elasticsearch/" title="RiffBank – Parsing arbitrary Text-based Guitar Tab into an Indexable and Queryable “RiffCode for ElasticSearch"><b>Part 2</b>: Riff Storage and Querying in ElasticSearch</a></li>
<li><b>Part 3</b>: Simple UI display with PHP-FatFree and Twitter Bootstrap</li>
</ul>
</div>
<h1>What is it?</h1>
<p>A fun data/hack project that I've been working on for the past few weeks. Basically put...</p>
<h2><a href="http://riffbank.com" target="_blank">Riffbank is a "reverse search engine" for guitar tabs.</a></h2>
<p>Give it a simple section of guitar tab (a "riff"), and it will tell you what songs it could be (plus any other metadata it knows). Give it a try and let me know!</p>
<p><a href="http://riffbank.com" target="_blank"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/riffbank-small500px.jpg" alt="riffbank-small500px" width="500" height="756" class="aligncenter size-full wp-image-43203" /></a><br />
<small> <b>[ still a work in progress, being a small side-project ]</b> </small></p>
<h2>Why?</h2>
<p>Anyone who has seen my crazy shit on this blog has probably <a href="http://ryrobes.com/visual-analytics-and-data-porn/enter-tableauman-creating-useful-datavisualizations-with-old-school-guitar-tab-and-metallica/" target="_blank">stumbled upon my 'Enter Sandman Tableau 7 viz'</a>. Basically, I set out to see if I could process guitar tab in such a way that it could be re-displayed in an analytical visualization engine like Tableau and become a bit of a "tab learning app".</p>
<p>Obviously this was a proof of concept and just a cool hack to use Tableau for - but I never completely stopped thinking about the concept of using guitar tab ASCII files as a "source of data".</p>
<p><a href="http://riffbank.com" target="_blank"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/riffjson.jpg" alt="riffjson" width="500" height="483" class="aligncenter size-full wp-image-43207" /></a><br />
<small> <b>[ some riffJSON action ]</b> </small></p>
<p>If you've learned guitar at some point in life and been into tech and computers - you've probably used text-file based guitar tablature before. Hell, I can remember using it back in the early '90s, for god's sake. I would print it out on a dot-matrix printer and keep a mess of it in my guitar case...</p>
<p>Ah, memories. Anyways, it's been around for a long time - and there is NO shortage of it on the internet these days. While writing the sandman viz, I remember thinking "could I do this with a whole shitload of tab files - and if so, what insights (or not) could I gain from that kind of 'data' repository?".</p>
<p><a href="http://riffbank.com" target="_blank"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/08/es-head.jpg" alt="es-head" width="500" height="233" class="aligncenter size-full wp-image-43208" /></a><br />
<small> <b> [the riffbank elasticsearch cluster]</b> </small></p>
<p>More to come...</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/elasticsearch/riffbank-searching-guitar-tab-data-as-a-pseudo-language-with-elasticsearch/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Facts, Stories, Conveyance: Postmortem &#8220;Data Viz Advice&#8221; from a famous American street photographer</title>
		<link>http://ryrobes.com/data-storytelling/facts-stories-conveyance-postmortem-data-viz-advice-from-a-famous-american-street-photographer/</link>
		<comments>http://ryrobes.com/data-storytelling/facts-stories-conveyance-postmortem-data-viz-advice-from-a-famous-american-street-photographer/#comments</comments>
		<pubDate>Mon, 22 Apr 2013 19:47:28 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Data Storytelling]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=43064</guid>
		<description><![CDATA[Just got back from SF - one of the things we did was visit the Museum of Modern Art - the building is a wonder in itself, but obviously the museum hosts countless photographs and art installations from many famous (and not so famous) artists. I had no idea I'd find inspiration that can be DIRECTLY applied to modern data visualization practices. Advice that doesn't get followed as much as it should...
]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/winogr-150x150.jpg" width="240" />
		</p><p><a href="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/902958_10151427976077763_1704206675_o.jpg"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/902958_10151427976077763_1704206675_o-150x150.jpg" alt="902958_10151427976077763_1704206675_o" width="150" height="150" class="alignleft size-thumbnail wp-image-43113" /></a><span class="dropcap">I</span><!--/.dropcap--> just got back from spending the last week in the beautiful <a href="http://instagram.com/p/YLeUSkSjFo/" target="_blank">San Francisco</a> Bay area. If you've never been, seriously, go there. As an obnoxious East Coast / New England guy, I thought I had seen it all. Nope.</p>
<p>Anyways, one of the things we did was visit the <a href="http://www.sfmoma.org/" target="_blank">SF Museum of Modern Art</a> - the building is a wonder in itself, but obviously the museum hosts countless photographs and art installations from many famous <em>(and not so famous)</em> artists. Honestly, I had no expectations. I was on vacation after all - just doing a fun "touristy" activity between <a href="http://instagram.com/p/YRDPzpSjJW/" target="_blank">rounds of Grey Goose oyster shooters</a>, <a href="http://instagram.com/p/YaxbGqyjBl/" target="_blank">eating sea creatures</a> and pints of Anchor Steam.</p>
<h4>I had no idea I'd find inspiration that can be DIRECTLY applied to modern data visualization practices. Advice that doesn't get followed as much as it should...</h4>
<p>The exhibit was a <a href="http://www.sfmoma.org/about/press/press_exhibitions/releases/920" target="_blank">complete retrospective of the working life and career of Garry Winogrand</a> (1928 - 1984), including many photos that were never published <em>(he left behind 6.5k rolls of undeveloped film when he passed at the age of 56)</em>.</p>
<p>Garry was a <a href="http://en.wikipedia.org/wiki/Garry_Winogrand" target="_blank">famous street-documentary photographer</a> mostly known for his NYC shots from the 50s to the early 80s. His many striking photos told a story through the faces and bodies of their living subjects. Short on hard "data" - but miles long on human interest, wonder, and conveyance. Stories of frozen emotions and reactions, but without a full sense of context (aka interpretative art; mystery).</p>
<p>I came upon a particular photo in the "Down in the Bronx" section of the exhibit that caught my eye. It was an assault scene from the 50s. Murder, possibly. There was a well-dressed man on the ground; you could see only his well-made shoes and an upturned hat, near a large pool of blood. You could see people rushing around him, a nun leaning in and praying, and a little boy standing in the background looking on with eyes agape. Clearly "something" had happened just moments earlier.</p>
<p>On the display card that accompanied the photo is where I found my connection. It described how Garry always gravitated toward this type of photo journalism instead of blatantly trying to capture the <i>What</i> or <i>Why</i> for the sake of written narration. This quote from him was below it.</p>
<div align="center">
<h2>"There is nothing as mysterious<br/>as a fact clearly described"</h2>
<h4> - Garry Winogrand</h4>
</div>
<p><a href="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/winogr.jpg"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/winogr-150x150.jpg" alt="winogr" width="150" height="150" class="alignright size-thumbnail wp-image-43160" /></a><b>Boom</b>. I immediately tapped this quote out on my phone. </p>
<p>It seems that in all the academic and ideological viz wars we've seen lately - people sometimes forget the importance of storytelling and conveyance. Showing the results without the story is next to useless. </p>
<p>Our industry is about action and communication, not mystery. </p>
<h4>The facts mean nothing if the reader can't understand them, conceptualize them, or (even worse) be intrigued enough to <strong>even give a shit</strong> about them.</h4>
<p>The reader might be a literal "reader" of a magazine, newspaper, or online periodical - but really in our case the "reader" is YOUR FUCKING BOSS and YOUR CUSTOMERS. It's essential that they not only know the <strong>facts</strong>, but the <strong>breadcrumbs</strong> that led to such facts. <strong>Seeing is understanding</strong>. You can't make things <b>actionable</b> if they can't be understood (even more important are self-explanatory interactive visualizations).</p>
<h4>In any useful data viz - calculations &#038; insightful analysis are good, facts are good, in-depth data is good, but...</h4>
<div align="center">
<h2 style="font-size:64px;">CONTEXT IS GOD.</h2>
</div>
<p>It's not only about WHAT you are showing, but about the why, how, and the elusive 'what do we do next'. Data is good, geeky math is good, but if it can't be framed in a way that biz or end-user peeps (or the public) can <b>understand</b> it's just shred fodder that you can use in little Suzi's hamster cage.</p>
<p>Getting and interpreting the data is one part - CONVEYING it is another altogether.<br />
(again - not to get into a <strong>Pure Math / Science vs Functional "Art"</strong> war - but it merits discussion)</p>
<p>What do you guys think?</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/data-storytelling/facts-stories-conveyance-postmortem-data-viz-advice-from-a-famous-american-street-photographer/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Going Yard: Using MLB data and Tableau 8 to &#8220;re-imagine&#8221; Homeruns in PetCo Park &#8211; &#8220;The Business Case&#8221;</title>
		<link>http://ryrobes.com/tableau/going-yard-using-mlb-data-and-tableau-8-to-re-imagine-homeruns-in-petco-park-the-business-case/</link>
		<comments>http://ryrobes.com/tableau/going-yard-using-mlb-data-and-tableau-8-to-re-imagine-homeruns-in-petco-park-the-business-case/#comments</comments>
		<pubDate>Tue, 02 Apr 2013 01:10:00 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[Tableau]]></category>
		<category><![CDATA[Digging for Data]]></category>
		<category><![CDATA[gameday]]></category>
		<category><![CDATA[MLB]]></category>
		<category><![CDATA[PetCo Park]]></category>
		<category><![CDATA[San Diego Padres]]></category>
		<category><![CDATA[tableau 8]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=42935</guid>
		<description><![CDATA[Answering life's Big Questions with Data Viz: The San Diego Padres are changing the wall dimensions for their home stadium, PetCo Park, for the 2013 MLB season. In a notoriously large "pitchers park", how might this change affect the amount of Homeruns, the Padres record, and (possibly) ticket sales?]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/padres-hat-tableau2.jpg" width="240" />
		</p><p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/padres-hat-tableau2-150x150.jpg" alt="padres-hat-tableau2" width="150" height="150" class="alignleft size-thumbnail wp-image-43039" />Greetings, it's been a while since I published anything - my apologies! I've been hella busy this year, but <i>finally</i> have dedicated some time to a mini-project I've been working on for some time. This time, it's a baseball viz / park issue that I can hopefully shed some (theoretical) light on. In an effort to stop writing massive blog posts, consider this post an intro - more like the "Business Case"; the next one will be the "Technical HOW, WHY, WHAT".</p>
<p>Anyways, on with the premise...</p>
<div class="woo-sc-box normal  rounded full">
<h2>THE QUESTION:</h2>
<p>The <a href="http://sandiego.padres.mlb.com/index.jsp?c_id=sd" target="_blank">San Diego Padres</a> are changing the wall dimensions for their home stadium, <a href="http://www.petcoparkevents.com/" target="_blank">PetCo Park</a>, for the 2013 MLB season. In a notoriously large "pitcher's park", how might this change affect the number of Homeruns, the Padres' record, and (possibly) ticket sales?<br />
</div>
<div class="woo-sc-box normal  rounded full">
<h2>MY APPROACH:</h2>
<p>Using freely available data from the MLB Gameday app (via <a href="http://gd2.mlb.com/components/game/mlb/year_2013/month_04/day_01/gid_2013_04_01_bosmlb_nyamlb_1/" target="_blank">their awesome XML datastore</a>), can we map out the hitcharts for the past 3-5 years (by mapping Gameday x,y coordinates to geocoded coordinates) and see what effect a closer (and lower) wall would have had on THOSE seasons, changing Outs and Base Hits into HRs (with associated RBIs, if men are on base)? If we can try to understand what MIGHT have happened, we can shed some light on what this season might bring.</div>
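<p>To make the idea concrete, here is a rough sketch of the "would it have cleared the wall?" check. The home plate pixel position and the feet-per-pixel scale are hypothetical placeholders (not values from the Gameday feed), and the sketch ignores wall height entirely, so treat it as an illustration of the approach rather than the calculation behind the viz.</p>

```python
import math

# Hypothetical calibration values -- NOT taken from the Gameday feed.
HOME_X, HOME_Y = 125.0, 200.0   # assumed pixel position of home plate on the hit chart
FEET_PER_PIXEL = 2.5            # assumed scale of the hit chart


def hit_distance_feet(x, y):
    """Straight-line distance from home plate to the charted hit location, in feet."""
    return math.hypot(x - HOME_X, y - HOME_Y) * FEET_PER_PIXEL


def would_clear(x, y, wall_distance_feet):
    """True if the charted hit location is at or beyond the given wall distance."""
    return hit_distance_feet(x, y) >= wall_distance_feet
```

<p>Running every Out and Base Hit from past seasons through a check like this, once with the old wall distance and once with the new one, is the gist of the re-classification described above.</p>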
<h4>Tools Used:</h4>
<div class="shortcode-unorderedlist bullet">
<ul>
<li><b>Python</b> = Data Harvesting (MLB, Weather, Salaries)</li>
<li><b>SQL</b> = Data Storage (SQL Server 2012 Azure cloud DB)</li>
<li><b>Tableau</b> = Data Presentation / Visualization / Analysis</li>
<li><b>Photoshop</b> = Designing Layout (mock-up wireframes) + 'Design' Sections of the Viz</li>
<li><b>Total Time Taken</b> = <i>Lots</i>. :)</li>
</ul>
<p></div>

<div class="woo-sc-box normal  rounded full">
<h2>THE RESULTS:</h2>
<p><b>More Homeruns?</b> Yes, as many as 50-100 MORE per year.<br />
<b>More Padres Wins?</b> No (the visiting team will probably gain MORE HRs).<br />
<b>More Ticket Sales?</b> Possibly (HRs make for a more 'exciting' game to watch, and may be a bigger draw for fans than wins - <a href="http://umresearchboard.org/resources/davis/Baseball_Attendance_Winning.pdf" target="_blank">but the jury</a> <a href="http://www.hardballtimes.com/main/article/how-are-wins-attendance-and-payroll-all-related/" target="_blank">is out on that</a>).</p>
<p><b>Also, notice that attendance seems to be declining regardless of wins AND plummeting average ticket prices...</b></p>
<p><i>Take this with a grain of salt, Baseball is a game that can change on a dime. I'm only making estimations based on the data available.</i></p>
<p>But don't take my word for it - <a href="http://ryrobes.com/visual-analytics-and-data-porn/mapping-san-diego-padres-hits-petco-park-changes-tableau-8/" title="Mapping San Diego Padres Hits &#038; PetCo Park Changes | Tableau 8">check out the viz in all its glory by clicking on the image below</a>.</div>
<div id="attachment_42966" style="width: 510px" class="wp-caption aligncenter"><a href="http://ryrobes.com/visual-analytics-and-data-porn/mapping-san-diego-padres-hits-petco-park-changes-tableau-8/"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2013/04/padres-long-image.jpg" alt="Taming PetCo Park" width="500" height="1142" class="size-full wp-image-42966" /></a><p class="wp-caption-text"><b>"Taming PetCo Park"</b>. Click for full interactive version.</p></div>
<div class="woo-sc-divider"></div>
<p>Gird your loins. Next time, I'll have all the gory details - Python, SQL scripts and all....</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/tableau/going-yard-using-mlb-data-and-tableau-8-to-re-imagine-homeruns-in-petco-park-the-business-case/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>The Anatomy of a Tableau 8 Dashboard (or a &#8220;floating tile grid autopsy&#8221;)</title>
		<link>http://ryrobes.com/tableau/the-anatomy-of-a-tableau-8-dashboard-or-a-floating-tile-grid-autopsy/</link>
		<comments>http://ryrobes.com/tableau/the-anatomy-of-a-tableau-8-dashboard-or-a-floating-tile-grid-autopsy/#comments</comments>
		<pubDate>Fri, 28 Dec 2012 00:58:21 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Tableau]]></category>
		<category><![CDATA[dashboard]]></category>
		<category><![CDATA[information design]]></category>
		<category><![CDATA[kraken]]></category>
		<category><![CDATA[tableau public]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=42796</guid>
		<description><![CDATA[With the release of Tableau 8 just around the corner, and the Tableau 8 Public servers in a (very) limited release beta – I figured that it was as good a time as any to see what I could come up with using a couple of the new features. Specifically the “floating tile” option for dashboards...]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/yeaser1.jpg" width="240" />
		</p><a href="http://ryrobes.com/ryrobes_ga.twbx" class="woo-sc-button  silver" ><span class="woo-download">TCC 2013 Workbook Link</span></a>
<p>With the release of Tableau 8 just around the corner, and the Tableau 8 Public servers in a (very) limited release beta - I figured that it was as good a time as any to see what I could come up with using a couple of the new features. Specifically the "floating tile" option for dashboard creation (one that I'm particularly jazzed about, as you'll see)...</p>
<h3>"Floating Tiles" = A Game Changer</h3>
<p>Previously in 7 (and earlier) you were restricted to putting dashboard "parts" (for lack of a better term - think worksheets, text, and images) into an editable "grid" pattern. While this allowed some cool designs if you wanted to get creative - you could really only go so far. Now that this restriction has been lifted (and overlapping dashboard parts are allowed too), we are going to see some really cool designs.</p>
<div align="center"><a href="http://publicbeta.tableausoftware.com/views/ryrobesgoogleanalyticsdash/googleanalyticsdash?:embed=y&amp;:display_count=y" target="_blank"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/yeaser.jpg" alt="yeaser" width="450" height="381" class="aligncenter size-full wp-image-42806" style="border:0px;" alt="Click here to open the viz in a new window"  title="Click here to open the viz in a new window" /></a>
</div>
<p><br/>I wanted to visually illustrate how the new dashboard layout rules radically change the dash construction and design in Tableau 8 - so I started up Photoshop to give this particular viz a "grid autopsy". Anyone who is familiar with Tableau in any way should be able to see quickly what we've got going on here.</p>
<p>This dashboard viz is literally made up of 21 separate Tableau worksheets, plus several static images and static text boxes. </p>
<div align="center">
<a href="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/viz-anatomy.jpg"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/viz-anatomy.jpg" alt="viz-anatomy" width="500" height="490" class="aligncenter size-medium wp-image-42804"  style="border:0px;" /></a>
</div>
<p><br/></p>
<ul>
<li><span class="shortcode-typography" style="font-family: Arial, Helvetica, sans-serif; font-size: 18px; color: blue;">Blue - Tableau Worksheet</span></li>
<li><span class="shortcode-typography" style="font-family: Arial, Helvetica, sans-serif; font-size: 18px; color: red;">Red - Static Image</span></li>
<li><span class="shortcode-typography" style="font-family: Arial, Helvetica, sans-serif; font-size: 18px; color: green;">Green - Static Text</span></li>
</ul>
<p>Also, the ability to move tiles pixel by pixel and size them pixel by pixel makes this possible without contemplating murder.</p>
<p><em>For the record, SOME of this would have been somewhat possible with 7, but it would have been a total trauma to keep things aligned properly - and generally not recommended for anyone who wants continued grip on their sanity.</em></p>
<div class="woo-sc-box info  rounded full">NOTE: Yes, I know that the "trending" green up triangle should be a worksheet with a shape calculation - but I got really lazy</div>
<div class="woo-sc-box tick  rounded full">UPDATE: I'm also quite fond of the "date range" text worksheet. It's not groundbreaking or anything, but having a calculated field dynamically show some absolute filter ranges means that when filtering you can have text updated to "re-represent" what the user is looking at WITHOUT counting on the filter dialogs themselves...</p>
<p><strong>Ex.</strong></p>
<p><a href="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/date-range-wksht-string.jpg"><img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/date-range-wksht-string.jpg" alt="date-range-wksht-string" width="440" height="305" class="aligncenter size-medium wp-image-42854" /></a></p>
</div>
<p>To be continued... :) Really looking forward to what everyone else comes up with too!</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/tableau/the-anatomy-of-a-tableau-8-dashboard-or-a-floating-tile-grid-autopsy/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
		<item>
		<title>Quick &amp; Dirty Address Geocoding and Formatting with Google Maps API</title>
		<link>http://ryrobes.com/python/quick-dirty-address-geocoding-and-formatting-with-google-maps-api/</link>
		<comments>http://ryrobes.com/python/quick-dirty-address-geocoding-and-formatting-with-google-maps-api/#comments</comments>
		<pubDate>Tue, 11 Dec 2012 00:19:32 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Cut-n-Paste Code]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[geocode]]></category>
		<category><![CDATA[geocoding]]></category>
		<category><![CDATA[google maps]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=42739</guid>
		<description><![CDATA[Howdy. Cheers to all y'all down there in internet land. I got into a conversation earlier today regarding geocoding addresses in data-sets - it's a pretty common thing, and I've done it numerous times for a WIDE variety of data-points (Bigfoot, Sex Offenders, Concert Venues, etc.), so I figured hell, I'll clean it up and offer it to the Google gods. Maybe someone will find it useful...]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/Full4.jpg" width="240" />
		</p><p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/Vacation.jpg" alt="Why aren't we flying? Because getting there is half the fun. You know that." title="Why aren't we flying? Because getting there is half the fun. You know that." width="500" align="center" /></p>
<p>Howdy. Cheers to all y'all down there in internet land. I got into a conversation earlier today regarding geocoding addresses in data-sets - it's a pretty common thing, and I've done it numerous times for a WIDE variety of data-points (Bigfoot, Sex Offenders, Concert Venues, etc.), so I figured hell, I'll clean it up and offer it to the Google gods. Maybe someone will find it useful.</p>
<h3>"Why aren't we flying? Because getting there is half the fun. You know that."</h3>
<p>The script is pretty simple, but it usually gets the job done with minimal modification. I tend to use a small dimension table for geo lat / long data - because generally the coordinates of an address won't change, so there is no reason to RE-Geocode an address (in most cases).</p>
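<p>As a rough illustration of that "geocode once, look it up forever" idea, here is a minimal sketch using SQLite from the standard library. The <code>geo_cache</code> table, the <code>get_coords</code> helper and the <code>geocode_fn</code> callback are all made-up names for this example; they are not part of the script below or of any geocoding library.</p>

```python
import sqlite3


def get_coords(conn, address, geocode_fn):
    """Return (lat, lng) for an address, calling geocode_fn only on a cache miss."""
    row = conn.execute(
        "SELECT lat, lng FROM geo_cache WHERE address = ?", (address,)
    ).fetchone()
    if row is not None:
        return row  # already geocoded, no API call needed

    lat, lng = geocode_fn(address)  # hypothetical callback wrapping the real geocoder
    conn.execute(
        "INSERT INTO geo_cache (address, lat, lng) VALUES (?, ?, ?)",
        (address, lat, lng),
    )
    conn.commit()
    return (lat, lng)


# Small dimension table: one row per address, keyed on the address string.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE geo_cache (address TEXT PRIMARY KEY, lat REAL, lng REAL)")
```

<p>The <code>?</code> placeholders also keep quotes in an address from breaking the SQL, which is why no manual quote-stripping is needed here.</p>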
<p>In this example, I'm using a SQL Server connection - but the queries are so basic that any other database could be used just by loading a different module and changing the connection syntax a bit (SQLite, Oracle, MySQL, etc).</p>
<p>You'll need to download and <a href="https://bitbucket.org/shelldweller/python-geocoder" target="_blank">install this simple Google Maps API wrapper first</a>.</p>
<div class="codecolorer-container python blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #ff7700;font-weight:bold;">from</span> geocode.<span style="color: black;">google</span> <span style="color: #ff7700;font-weight:bold;">import</span> GoogleGeocoderClient<br />
<span style="color: #ff7700;font-weight:bold;">import</span> pymssql <span style="color: #808080; font-style: italic;"># or MySQLdb or etc.</span><br />
<br />
<span style="color: #808080; font-style: italic;"># sql server example</span><br />
db <span style="color: #66cc66;">=</span> pymssql.<span style="color: black;">connect</span><span style="color: black;">&#40;</span>host<span style="color: #66cc66;">=</span><span style="color: #483d8b;">'localhost'</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">user</span><span style="color: #66cc66;">=</span><span style="color: #483d8b;">'sa'</span><span style="color: #66cc66;">,</span> password<span style="color: #66cc66;">=</span><span style="color: #483d8b;">'trolololo'</span><span style="color: black;">&#41;</span> <br />
cursor <span style="color: #66cc66;">=</span> db.<span style="color: black;">cursor</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
<br />
geocoder <span style="color: #66cc66;">=</span> GoogleGeocoderClient<span style="color: black;">&#40;</span><span style="color: #008000;">False</span><span style="color: black;">&#41;</span><br />
<br />
<br />
<span style="color: #808080; font-style: italic;"># your source SQL - in this example I'm expecting at least 2 fields: the first a unique id for that address (for updating), the next a string of the address</span><br />
cursor.<span style="color: black;">execute</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;select 12341 as myid, '101 s presidents st baltimore MD USA' as address &quot;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># the hotel I wrote this in</span><br />
addy_queue <span style="color: #66cc66;">=</span> cursor.<span style="color: black;">fetchall</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">for</span> curraddy <span style="color: #ff7700;font-weight:bold;">in</span> addy_queue:<br />
<br />
&nbsp; &nbsp; addy_id <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>curraddy<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># your unique id</span><br />
&nbsp; &nbsp; addy <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>curraddy<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># your location</span><br />
<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'****************************ADDY IN*********************************'</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' ('</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>addy_id<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">') '</span> + addy<br />
<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:&nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; result <span style="color: #66cc66;">=</span> geocoder.<span style="color: black;">geocode</span><span style="color: black;">&#40;</span>addy<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># result.is_success() # nice little boolean if you need it</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; googaddy <span style="color: #66cc66;">=</span> result.<span style="color: black;">get_formatted_address</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;'&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; coord <span style="color: #66cc66;">=</span> result.<span style="color: black;">get_location</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; latitude <span style="color: #66cc66;">=</span> coord<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; longitude <span style="color: #66cc66;">=</span> coord<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; location_type <span style="color: #66cc66;">=</span> result.<span style="color: black;">get_location_type</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># just for fun</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'****************************ADDY OUT*********************************'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>latitude<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">', '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>longitude<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' ('</span> + location_type + <span style="color: #483d8b;">')'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> googaddy <span style="color: #808080; font-style: italic;"># Google Maps formatted address</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' '</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># this is where you're updating the geo coords BACK to your data, or inserting them in a sep table</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; cursor.<span style="color: black;">execute</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;delete from homes.dbo.my_geo where id = '&quot;</span> +<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>addy_id<span style="color: black;">&#41;</span>+<span style="color: #483d8b;">&quot;' &quot;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; cursor.<span style="color: black;">execute</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;insert into homes.dbo.my_geo (lat, long, id, clean_address) values ('&quot;</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>latitude<span style="color: black;">&#41;</span>+<span style="color: #483d8b;">&quot;', '&quot;</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>longitude<span style="color: black;">&#41;</span>+<span style="color: #483d8b;">&quot;', '&quot;</span> +<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>addy_id<span style="color: black;">&#41;</span>+<span style="color: #483d8b;">&quot;', '&quot;</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>googaddy<span style="color: black;">&#41;</span>+<span style="color: #483d8b;">&quot;' ) &quot;</span><span style="color: black;">&#41;</span> <br />
&nbsp; &nbsp; &nbsp; &nbsp; db.<span style="color: black;">commit</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># in case of error: do something (optional) I'm sure you're going to have some bad addys, just address them later, or flag them here</span><br />
<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#time.sleep(.2) # just in case you dont want to hammer Google's servers, but I say - fuck 'em</span></div></div>
<div class="woo-sc-box info   ">FYI - Last I heard, the Google API limits geocoding to 2,500 requests within 24 hours, after that you'll just get error responses.</div>
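<p>If you would rather stay under that limit than run into it, something like the sketch below could wrap the address queue. The <code>throttled</code> helper is a made-up name for this example; the 2,500 cap and the 0.2 second pause simply mirror the numbers mentioned above, so adjust them to whatever the current quota actually is.</p>

```python
import time
from itertools import islice


def throttled(items, daily_limit=2500, delay_seconds=0.2, sleep=time.sleep):
    """Yield at most daily_limit items, pausing delay_seconds between them."""
    for i, item in enumerate(islice(items, daily_limit)):
        if i:
            sleep(delay_seconds)  # no pause needed before the very first request
        yield item
```

<p>In the script above that would mean looping with <code>for curraddy in throttled(addy_queue):</code> and leaving anything past the cap for the next day's run.</p>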
<p>Feel free to give me a shout!</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/python/quick-dirty-address-geocoding-and-formatting-with-google-maps-api/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>Build Tableau Data Extracts out of CSV Files? More Python TDE API madness!</title>
		<link>http://ryrobes.com/python/build-tableau-data-extracts-out-of-csv-files-more-python-tde-api-madness/</link>
		<comments>http://ryrobes.com/python/build-tableau-data-extracts-out-of-csv-files-more-python-tde-api-madness/#comments</comments>
		<pubDate>Fri, 07 Dec 2012 19:10:17 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Python]]></category>
		<category><![CDATA[Tableau]]></category>
		<category><![CDATA[csv]]></category>
		<category><![CDATA[tableau 8]]></category>
		<category><![CDATA[tableau data extracts]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=42693</guid>
		<description><![CDATA[So here we have the 3rd in a series about using Tableau 8’s ‘Data Extract’ API to automatically create TDE files from various data sources without using the desktop client. This time we’re focusing on good ole’ Comma Separated files....

Get yo CSV on!]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/LOG-ren-and-stimpy-1552749-1280-1024.jpg" width="240" />
		</p><p>So here we have the 3rd in a <a href="http://ryrobes.com/category/tableau/">series about using Tableau 8's 'Data Extract' API</a> to automatically create TDE files from various data sources without using the desktop client. This time we're focusing on good ole' Comma Separated files.</p>
<p><img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/powdered-toast-man.jpg" alt="" title="powdered-toast-man" width="200" height="171" class="alignleft size-full wp-image-42718" /> Much like <a href="http://ryrobes.com/python/sql-server-query-to-tableau-data-extract-more-tde-api-fun-with-python-tableau-8/" title="SQL Server Query to Tableau Data Extract LIKE A BOSS – Some more TDE API fun with Python &#038; Tableau 8">my SQL Server script</a>, this one tries to guess field names and field data types in order to produce as good of an extract as possible - I've also added a little command line progress bar for your <strong>FILE WATCHING ENJOYMENT</strong>. </p>
<p>You know, in case you're into shit like that. </p>
<p>Plus, I report on file sizes and row counts! OMG! You ALSO have the ability to run MULTIPLE files in a directory - in case you have a bunch that need to be crunched into their own TDE files - that's right, Christmas come early! (You're welcome)</p>
<p>I used some census data files for testing <em>(as well as Metallica, Bigfoot, and UFO data - but that's a given)</em> as you can see below. I gotta say, processing CSV files is decently speedy - even though I haven't tried to process any HUGE files yet (2GB+).</p>
<p>Bit of explanation about the script parameters:</p>
<ul>
<li><strong>cvsfilenamemask = '.csv'</strong> It's basically an "ends with" string - so you can use something like '.csv' to process ALL CSV files in a directory, or simply change it to something like 'input.csv' to ONLY process that file from that dir.</li>
<li><strong>sourcedir = 'C:\\Python27\\'</strong> Fairly obvious - this is the folder that the script will look in for any files matching 'cvsfilenamemask' above. You can also use UNC file paths like '\\\\ryrobesxps\\d$\\'. Remember to double up your slashes accordingly!</li>
<li><strong>targetdir = 'C:\\Python27\\'</strong> Same as above, except this is the target dir that the resulting TDE file will be written to. </li>
<li><strong>csvdelimiter = ','</strong> You should know this one.</li>
<li><strong>csvquotechar = '"'</strong> AND this one, but who knows.</li>
<li><strong>rowoutput = False</strong> Set to 'True' to have all the row and column info echoed into the command line window - it's good for debugging data and script errors but it slows things down like 10X (literally). Who knew that printing out thousands of things would be so expensive! ;)</li>
</ul>
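<p>To show what an "ends with" mask means in practice, here is a tiny sketch of that kind of matching. The <code>matching_files</code> helper is a made-up name for illustration and is not lifted from the script itself.</p>

```python
import os


def matching_files(sourcedir, mask):
    """Return the file names in sourcedir that end with mask, sorted by name."""
    return sorted(name for name in os.listdir(sourcedir) if name.endswith(mask))
```

<p>With a mask of '.csv' every CSV file in the folder matches, while 'input.csv' only matches files whose names end in exactly that (so 'input.csv' itself, but also something like 'old_input.csv').</p>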
<h4>Here's a sample output from some tests I did earlier</h4>
<div class="codecolorer-container text blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="text codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap">C:\Python27&gt;python CSV_to_TDE.py<br />
<br />
###########################################################################<br />
&nbsp; Now working on cbp10co.csv (2,155,390) -&gt; cbp10co.tde (59,871 rows per =)<br />
###########################################################################<br />
[ = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = ]<br />
(2,155,390 rows)<br />
<br />
###########################################################################<br />
&nbsp; Now working on cbp10st.csv (456,410) -&gt; cbp10st.tde (12,678 rows per =)<br />
###########################################################################<br />
[ = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = ]<br />
(456,410 rows)<br />
<br />
###########################################################################<br />
&nbsp; Now working on PlayerStats.csv (23) -&gt; PlayerStats.tde (10 rows per =)<br />
###########################################################################<br />
[ = = ]<br />
(23 rows)<br />
<br />
###########################################################################<br />
&nbsp; Now working on simple.csv (32) -&gt; simple.tde (10 rows per =)<br />
###########################################################################<br />
[ = = = ]<br />
(31 rows)<br />
<br />
cbp10co.csv &nbsp; 383 seconds &nbsp; 2,155,390 rows. TDE file is 172.9MB (source was 42.2MB)<br />
cbp10st.csv &nbsp; 431 seconds &nbsp; 456,410 rows. TDE file is 114.6MB (source was 43.5MB)<br />
PlayerStats.csv 4 seconds &nbsp; 23 rows. TDE file is 3.5KB (source was 57.4KB)<br />
simple.csv &nbsp; &nbsp; &nbsp;1 seconds &nbsp; 31 rows. TDE file is 659.0bytes (source was 34.4KB)<br />
<br />
TOTAL RUN &nbsp; &nbsp; 820 seconds &nbsp; 2,611,854 rows - 287.5MB of text into 85.8MB of data sex!<br />
<br />
C:\Python27&gt;</div></div>
<p>Sexy, right? Indeed. Ok, here it is, it's a whopper, so get that Ctrl-C hand ready...</p>
<h3>Wh3r3's th3 b33f? Oh, nevermind.</h3>
<div class="woo-sc-box normal   "><strong>Note:</strong> if you have fewer than 20 rows or so, my "fancy" header checking could get confused. Why you would have a file like that, I have no idea - just sayin'.</div>
<div class="codecolorer-container python blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;"># ryan robitaille (12/6/2012)</span><br />
<span style="color: #808080; font-style: italic;"># creating Tableau Data Extracts via CSV files</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">csv</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">os</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">time</span><br />
<span style="color: #ff7700;font-weight:bold;">from</span> <span style="color: #dc143c;">datetime</span> <span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">datetime</span><br />
<span style="color: #ff7700;font-weight:bold;">import</span> dataextract <span style="color: #ff7700;font-weight:bold;">as</span> tde <span style="color: #808080; font-style: italic;">#saves some typing, cause i'm a lazy fucker</span><br />
<br />
<span style="color: #808080; font-style: italic;">################ PARAMETERS FOR YOU, CODE MONKEY! ##########################</span><br />
cvsfilenamemask <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'s.csv'</span> <span style="color: #808080; font-style: italic;"># can be explicit 'thisfile.csv' for one file - or open '.csv' for all that match</span><br />
sourcedir <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'C:<span style="color: #000099; font-weight: bold;">\\</span>Python27<span style="color: #000099; font-weight: bold;">\\</span>'</span> <span style="color: #808080; font-style: italic;"># need to double up the \\s | windows shares use like this '\\\\ryrobesxps\d$\' etc</span><br />
targetdir <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'C:<span style="color: #000099; font-weight: bold;">\\</span>Python27<span style="color: #000099; font-weight: bold;">\\</span>'</span> <span style="color: #808080; font-style: italic;"># can't be a share or UNC path</span><br />
csvdelimiter <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">','</span> <span style="color: #808080; font-style: italic;"># obvious!</span><br />
csvquotechar <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'&quot;'</span> <span style="color: #808080; font-style: italic;"># obvious!</span><br />
rowoutput <span style="color: #66cc66;">=</span> <span style="color: #008000;">False</span> <span style="color: #808080; font-style: italic;"># useful for debugging data errors / slows shit down a lot however</span><br />
<span style="color: #808080; font-style: italic;">################ PARAMETERS FOR YOU, CODE MONKEY! ##########################</span><br />
<br />
<span style="color: #808080; font-style: italic;"># Note: if you have less than a few thousand rows, the progress bar will be a bit fucked looking.</span><br />
<br />
fileperf <span style="color: #66cc66;">=</span> <span style="color: #008000;">dict</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># for saving each files execution times</span><br />
<br />
<span style="color: #808080; font-style: italic;"># since the CSV module imports all fields as strings regardless of what they are..</span><br />
<span style="color: #ff7700;font-weight:bold;">def</span> datatyper<span style="color: black;">&#40;</span>n<span style="color: black;">&#41;</span>: &nbsp; &nbsp;<span style="color: #808080; font-style: italic;"># force some data types to figure shit out</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>: &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># kind of lame.... BUT IT WORKS</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">int</span><span style="color: black;">&#40;</span>n<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">float</span><span style="color: black;">&#40;</span>n<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; date_object <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">datetime</span>.<span style="color: black;">strptime</span><span style="color: black;">&#40;</span>n<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'%m/%d/%Y'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> date_object<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; date_object <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">datetime</span>.<span style="color: black;">strptime</span><span style="color: black;">&#40;</span>n<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'%Y-%m-%d'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> date_object<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> n <span style="color: #ff7700;font-weight:bold;">is</span> <span style="color: #008000;">None</span> <span style="color: #ff7700;font-weight:bold;">or</span> n <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'NULL'</span>: <span style="color: #808080; font-style: italic;"># missing field (short row) or a literal NULL, don't want either in there</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">None</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">elif</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>n<span style="color: black;">&#41;</span> <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">0</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>n<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>: <span style="color: #808080; font-style: italic;"># no need to return an empty string, let's NULL that shit out</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #008000;">None</span><br />
<span style="color: #808080; font-style: italic;"># end ugly data types function</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">def</span> showhumanfilesize<span style="color: black;">&#40;</span>num<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> x <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: black;">&#91;</span><span style="color: #483d8b;">'bytes'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'KB'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'MB'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'GB'</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> num <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">1024.0</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #483d8b;">&quot;%3.1f%s&quot;</span> % <span style="color: black;">&#40;</span>num<span style="color: #66cc66;">,</span> x<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; num /<span style="color: #66cc66;">=</span> <span style="color: #ff4500;">1024.0</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #483d8b;">&quot;%3.1f%s&quot;</span> % <span style="color: black;">&#40;</span>num<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'TB'</span><span style="color: black;">&#41;</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">def</span> intWithCommas<span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">type</span><span style="color: black;">&#40;</span>x<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">not</span> <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: black;">&#91;</span><span style="color: #008000;">type</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> <span style="color: #008000;">type</span><span style="color: black;">&#40;</span>0L<span style="color: black;">&#41;</span><span style="color: black;">&#93;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">raise</span> <span style="color: #008000;">TypeError</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;Parameter must be an integer.&quot;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> x <span style="color: #66cc66;">&lt;</span> <span style="color: #ff4500;">0</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #483d8b;">'-'</span> + intWithCommas<span style="color: black;">&#40;</span>-x<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; result <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">while</span> x <span style="color: #66cc66;">&gt;=</span> <span style="color: #ff4500;">1000</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; x<span style="color: #66cc66;">,</span> r <span style="color: #66cc66;">=</span> <span style="color: #008000;">divmod</span><span style="color: black;">&#40;</span>x<span style="color: #66cc66;">,</span> <span style="color: #ff4500;">1000</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; result <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">&quot;,%03d%s&quot;</span> % <span style="color: black;">&#40;</span>r<span style="color: #66cc66;">,</span> result<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> <span style="color: #483d8b;">&quot;%d%s&quot;</span> % <span style="color: black;">&#40;</span>x<span style="color: #66cc66;">,</span> result<span style="color: black;">&#41;</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">def</span> file_lines<span style="color: black;">&#40;</span>fname<span style="color: black;">&#41;</span>:<br />
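&nbsp; &nbsp; i <span style="color: #66cc66;">=</span> -<span style="color: #ff4500;">1</span> <span style="color: #808080; font-style: italic;"># so an empty file returns 0 instead of raising a NameError</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">with</span> <span style="color: #008000;">open</span><span style="color: black;">&#40;</span>fname<span style="color: black;">&#41;</span> <span style="color: #ff7700;font-weight:bold;">as</span> f:<br />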
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> i<span style="color: #66cc66;">,</span> l <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">enumerate</span><span style="color: black;">&#40;</span>f<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">pass</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">return</span> i + <span style="color: #ff4500;">1</span><br />
<br />
<span style="color: #808080; font-style: italic;">#print ' '</span><br />
<span style="color: #808080; font-style: italic;">#print '[ Note: Each . = ' +str(dotsevery)+ ' rows processed ]'</span><br />
<br />
<span style="color: #dc143c;">os</span>.<span style="color: black;">chdir</span><span style="color: black;">&#40;</span>sourcedir<span style="color: black;">&#41;</span><br />
<span style="color: #ff7700;font-weight:bold;">for</span> csvfilename <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #dc143c;">os</span>.<span style="color: black;">listdir</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;.&quot;</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> csvfilename.<span style="color: black;">endswith</span><span style="color: black;">&#40;</span>cvsfilenamemask<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tdefilename <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">splitext</span><span style="color: black;">&#40;</span>csvfilename<span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span>+<span style="color: #483d8b;">'.tde'</span> <span style="color: #808080; font-style: italic;"># only strip the last extension, so dotted file names survive</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; linez <span style="color: #66cc66;">=</span> file_lines<span style="color: black;">&#40;</span>sourcedir + csvfilename<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> linez <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">36</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dotsevery <span style="color: #66cc66;">=</span> linez/<span style="color: #ff4500;">36</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dotsevery <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">10</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' '</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'###########################################################################'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' &nbsp;Now working on '</span> + csvfilename + <span style="color: #483d8b;">' ('</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span>linez<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>+<span style="color: #483d8b;">') -&gt; '</span> + tdefilename + <span style="color: #483d8b;">' ('</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span>dotsevery<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' rows per =)'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'###########################################################################'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print dotsevery</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #dc143c;">time</span>.<span style="color: black;">sleep</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># so you can read it.</span><br />
<br />
<span style="color: #808080; font-style: italic;"># BEGIN MULTI FILE LOOP</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; start_time <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># simple timing for test purposes</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># taking a sample of the file</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; csvfile <span style="color: #66cc66;">=</span> <span style="color: #008000;">open</span><span style="color: black;">&#40;</span>csvfilename<span style="color: #66cc66;">,</span> <span style="color: #483d8b;">'rb'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; toplines <span style="color: #66cc66;">=</span> csvfile.<span style="color: black;">readlines</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; filebuffer <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">''</span> <span style="color: #808080; font-style: italic;"># empty string</span><br />
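&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #008000;">min</span><span style="color: black;">&#40;</span>dotsevery<span style="color: #66cc66;">,</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>toplines<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>: <span style="color: #808080; font-style: italic;"># don't run past the end of a short file</span><br />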
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; filebuffer <span style="color: #66cc66;">=</span> filebuffer + toplines<span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; hasheader <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">csv</span>.<span style="color: black;">Sniffer</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>.<span style="color: black;">has_header</span><span style="color: black;">&#40;</span>filebuffer<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># csvfile.read() &nbsp;/ &nbsp;filebuffer</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; hasheader <span style="color: #66cc66;">=</span> <span style="color: #008000;">False</span><br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># ok lets go</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; csvfile.<span style="color: black;">seek</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># YOU WILL DO, WHAT I SAY, WHEN I SAY! BACK TO THE FRONT!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; csvreader <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">csv</span>.<span style="color: black;">DictReader</span><span style="color: black;">&#40;</span>csvfile<span style="color: #66cc66;">,</span> delimiter<span style="color: #66cc66;">=</span>csvdelimiter<span style="color: #66cc66;">,</span> quotechar<span style="color: #66cc66;">=</span>csvquotechar<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; dfields <span style="color: #66cc66;">=</span> <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; dtypes <span style="color: #66cc66;">=</span> <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> hasheader <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> csvreader.<span style="color: black;">fieldnames</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dfields.<span style="color: black;">append</span><span style="color: black;">&#40;</span>f<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>: <span style="color: #808080; font-style: italic;"># WTF? No header? JERK.</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldnum <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print 'If you don\'t have a header, how the fuck will you recognize the fields in Tableau?'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> f <span style="color: #ff7700;font-weight:bold;">in</span> csvreader.<span style="color: black;">fieldnames</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dfields.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'field'</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>fieldnum<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldnum <span style="color: #66cc66;">=</span> fieldnum + <span style="color: #ff4500;">1</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; csvreader <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">csv</span>.<span style="color: black;">DictReader</span><span style="color: black;">&#40;</span>csvfile<span style="color: #66cc66;">,</span> delimiter<span style="color: #66cc66;">=</span>csvdelimiter<span style="color: #66cc66;">,</span> quotechar<span style="color: #66cc66;">=</span>csvquotechar<span style="color: #66cc66;">,</span> fieldnames<span style="color: #66cc66;">=</span>dfields<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># we have to make our own field names</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> row <span style="color: #ff7700;font-weight:bold;">in</span> csvreader:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> i <span style="color: #ff7700;font-weight:bold;">in</span> dfields:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; dtypes.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#40;</span><span style="color: #008000;">type</span><span style="color: black;">&#40;</span>datatyper<span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>i<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">break</span> <span style="color: #808080; font-style: italic;"># got shit, we're out</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; csvfile.<span style="color: black;">seek</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># BACK TO THE FRONT! (AGAIN!)</span><br />
<br />
<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #dc143c;">os</span>.<span style="color: black;">chdir</span><span style="color: black;">&#40;</span>targetdir<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>: &nbsp;<span style="color: #808080; font-style: italic;"># Just in case the file exists already, we don't want to bomb out</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tdefile <span style="color: #66cc66;">=</span> tde.<span style="color: black;">Extract</span><span style="color: black;">&#40;</span>tdefilename<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># in CWD</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>: <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #dc143c;">os</span>.<span style="color: black;">system</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'del '</span> + targetdir + tdefilename<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #dc143c;">os</span>.<span style="color: black;">system</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'del DataExtract.log'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#might as well erase this bitch too</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tdefile <span style="color: #66cc66;">=</span> tde.<span style="color: black;">Extract</span><span style="color: black;">&#40;</span>targetdir + tdefilename<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># ok lets build the table definition in TDE with our list of names and types first</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># replacing literals with TDE datatype integers, etc</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; tableDef <span style="color: #66cc66;">=</span> tde.<span style="color: black;">TableDefinition</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#create a new table def</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; numfields <span style="color: #66cc66;">=</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>dfields<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;">#print numfields</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'*** field names list ***'</span> <span style="color: #808080; font-style: italic;"># debug </span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> t <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span>numfields<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldtypeo <span style="color: #66cc66;">=</span> dtypes<span style="color: black;">&#91;</span>t<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;type '&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;'&gt;&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;class '&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'NoneType'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'str'</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'uuid.UUID'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'str'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldname <span style="color: #66cc66;">=</span> dfields<span style="color: black;">&#91;</span>t<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldtype <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>fieldtypeo<span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;str&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;15&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;datetime.datetime&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;13&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;int&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;7&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;decimal.Decimal&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;10&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;float&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;10&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;uuid.UUID&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;15&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;bool&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;11&quot;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> fieldname + <span style="color: #483d8b;">' &nbsp;(looks like '</span> + fieldtypeo +<span style="color: #483d8b;">', TDE datatype '</span> + fieldtype + <span style="color: #483d8b;">')'</span> &nbsp;<span style="color: #808080; font-style: italic;"># debug </span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tableDef.<span style="color: black;">addColumn</span><span style="color: black;">&#40;</span>fieldname<span style="color: #66cc66;">,</span> <span style="color: #008000;">int</span><span style="color: black;">&#40;</span>fieldtype<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># if we pass a non-int to fieldtype, it'll fail</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tableDef.<span style="color: black;">addColumn</span><span style="color: black;">&#40;</span>fieldname<span style="color: #66cc66;">,</span> <span style="color: #ff4500;">15</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># if we get a weird type we don't recognize, just make it a string</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'***'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #dc143c;">time</span>.<span style="color: black;">sleep</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># wait 5 seconds so you can actually read shit!</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># ok, lets print out the table def we just made, for shits and giggles</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'################## TDE table definition created ######################'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> c <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: #66cc66;">,</span>tableDef.<span style="color: black;">getColumnCount</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Column: '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>tableDef.<span style="color: black;">getColumnName</span><span style="color: black;">&#40;</span>c<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' Type: '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>tableDef.<span style="color: black;">getColumnType</span><span style="color: black;">&#40;</span>c<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #dc143c;">time</span>.<span style="color: black;">sleep</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># wait 5 seconds so you can actually read shit!</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># ok lets add the new def as a table in the extract</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; tabletran <span style="color: #66cc66;">=</span> tdefile.<span style="color: black;">addTable</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;Extract&quot;</span><span style="color: #66cc66;">,</span>tableDef<span style="color: black;">&#41;</span> <br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># time to start pumping rows!</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; rowsinserted <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">1</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># if we have a header, we don't want to try and process it</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> hasheader <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; csvreader.<span style="color: black;">next</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'['</span><span style="color: #66cc66;">,</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> row <span style="color: #ff7700;font-weight:bold;">in</span> csvreader:<br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>: <span style="color: #808080; font-style: italic;"># row deets, else just '.'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'************** INSERTING ROW NUMBER: '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>rowsinserted<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">'**************'</span> <span style="color: #808080; font-style: italic;"># debug output</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>: <span style="color: #808080; font-style: italic;"># only print dot every 50 records</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span>rowsinserted%dotsevery<span style="color: black;">&#41;</span> <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">0</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'='</span><span style="color: #66cc66;">,</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; columnposition <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow <span style="color: #66cc66;">=</span> tde.<span style="color: black;">Row</span><span style="color: black;">&#40;</span>tableDef<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> t <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span>numfields<span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldtype <span style="color: #66cc66;">=</span> dtypes<span style="color: black;">&#91;</span>t<span style="color: black;">&#93;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;type '&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;'&gt;&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;class '&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'NoneType'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'str'</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'uuid.UUID'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'str'</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldname <span style="color: #66cc66;">=</span> dfields<span style="color: black;">&#91;</span>t<span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>: <span style="color: #808080; font-style: italic;"># column deets</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' '</span> + fieldname + <span style="color: #483d8b;">': &nbsp; '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' ('</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>fieldtype<span style="color: black;">&#41;</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'.'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> + <span style="color: #483d8b;">')'</span> <span style="color: #808080; font-style: italic;"># debug output</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'str'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>: <span style="color: #808080; font-style: italic;"># we don't want no None!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setCharString</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># ok, put that None here</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'int'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setInteger</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> datatyper<span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'bool'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setBoolean</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> datatyper<span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'decimal.Decimal'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setDouble</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> datatyper<span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'datetime.datetime'</span>: <span style="color: #808080; font-style: italic;"># sexy datetime splitting</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; strippeddate <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>datatyper<span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'.'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> <span style="color: #808080; font-style: italic;"># just in case we get microseconds (not all datetime uses them)</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; timechunks <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">time</span>.<span style="color: black;">strptime</span><span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#40;</span>strippeddate<span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> <span style="color: #483d8b;">&quot;%Y-%m-%d %H:%M:%S&quot;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># chunky style!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setDateTime</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> <span style="color: #ff4500;">0000</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; columnposition <span style="color: #66cc66;">=</span> columnposition + <span style="color: #ff4500;">1</span> <span style="color: #808080; font-style: italic;"># we gots to know what column number we're working on!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; tabletran.<span style="color: black;">insert</span><span style="color: black;">&#40;</span>newrow<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># finally insert buffered row into TDE 'table'</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; rowsinserted <span style="color: #66cc66;">=</span> rowsinserted + <span style="color: #ff4500;">1</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># ok let's write out that file and get back to making dinner</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; tdefile.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; csvfile.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'] '</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'('</span>+<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span>rowsinserted<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>+<span style="color: #483d8b;">' rows)'</span> <span style="color: #808080; font-style: italic;"># to clear out the row on command line</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; plist <span style="color: #66cc66;">=</span> <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># timing purposes for debugging / optimizing / FUN! This is FUN, Lars.</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; timetaken <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> - start_time<br />
&nbsp; &nbsp; &nbsp; &nbsp; plist.<span style="color: black;">append</span><span style="color: black;">&#40;</span>timetaken<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; plist.<span style="color: black;">append</span><span style="color: black;">&#40;</span>rowsinserted<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; plist.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">getsize</span><span style="color: black;">&#40;</span>sourcedir + csvfilename<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; plist.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #dc143c;">os</span>.<span style="color: black;">path</span>.<span style="color: black;">getsize</span><span style="color: black;">&#40;</span>sourcedir + tdefilename<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; fileperf<span style="color: black;">&#91;</span><span style="color: #008000;">str</span><span style="color: black;">&#40;</span>csvfilename<span style="color: black;">&#41;</span><span style="color: black;">&#93;</span> <span style="color: #66cc66;">=</span> plist<br />
<br />
<span style="color: #808080; font-style: italic;"># just for our &quot;result time&quot;</span><br />
totaltime <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
totalrecords <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
totalpresize <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
totalpostsize <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' '</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">for</span> p <span style="color: #ff7700;font-weight:bold;">in</span> fileperf:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> p + <span style="color: #483d8b;">' &nbsp; &nbsp; '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span><span style="color: #008000;">int</span><span style="color: black;">&#40;</span>fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' seconds &nbsp; &nbsp; processed '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span>fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' records. 
Resulting TDE file is '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>showhumanfilesize<span style="color: black;">&#40;</span>fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' (source was '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>showhumanfilesize<span style="color: black;">&#40;</span>fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">')'</span><br />
&nbsp; &nbsp; totaltime <span style="color: #66cc66;">=</span> fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> + totaltime<br />
&nbsp; &nbsp; totalrecords <span style="color: #66cc66;">=</span> fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span> + totalrecords<br />
&nbsp; &nbsp; totalpresize <span style="color: #66cc66;">=</span> fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span> + totalpresize<br />
&nbsp; &nbsp; totalpostsize <span style="color: #66cc66;">=</span> fileperf<span style="color: black;">&#91;</span>p<span style="color: black;">&#93;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span> + totalpostsize<br />
<br />
<span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>fileperf<span style="color: black;">&#41;</span> <span style="color: #66cc66;">&gt;</span> <span style="color: #ff4500;">1</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' '</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'TOTAL RUN&nbsp; &nbsp; &nbsp; &nbsp; '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span><span style="color: #008000;">int</span><span style="color: black;">&#40;</span>totaltime<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' seconds&nbsp; &nbsp; &nbsp; processed '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>intWithCommas<span style="color: black;">&#40;</span>totalrecords<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' records - crunched '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>showhumanfilesize<span style="color: black;">&#40;</span>totalpresize<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' of text into '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>showhumanfilesize<span style="color: black;">&#40;</span>totalpostsize<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' of binary sexiness'</span></div></div>
<p>Issues? Let me know below and I can fix them! Maybe.</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/python/build-tableau-data-extracts-out-of-csv-files-more-python-tde-api-madness/feed/</wfw:commentRss>
		<slash:comments>14</slash:comments>
		</item>
		<item>
		<title>SQL Server Query to Tableau Data Extract LIKE A BOSS &#8211; Some more TDE API fun with Python &amp; Tableau 8</title>
		<link>http://ryrobes.com/python/sql-server-query-to-tableau-data-extract-more-tde-api-fun-with-python-tableau-8/</link>
		<comments>http://ryrobes.com/python/sql-server-query-to-tableau-data-extract-more-tde-api-fun-with-python-tableau-8/#comments</comments>
		<pubDate>Wed, 05 Dec 2012 01:20:30 +0000</pubDate>
		<dc:creator><![CDATA[Ry]]></dc:creator>
				<category><![CDATA[Cut-n-Paste Code]]></category>
		<category><![CDATA[Microsoft SQL Server]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[SQL]]></category>
		<category><![CDATA[Tableau]]></category>
		<category><![CDATA[microsoft sql server. pymssql]]></category>
		<category><![CDATA[mssql]]></category>
		<category><![CDATA[sql server]]></category>
		<category><![CDATA[tableau data extracts]]></category>

		<guid isPermaLink="false">http://ryrobes.com/?p=42576</guid>
		<description><![CDATA[Coming off the excitement of my last post about writing a simple bare-bones python usage of Tableau’s brand-new Data Extracts API from version 8.0 – I figured that it was time to build on that. Let’s take a step forward and get a little more complicated – AND a little more useful.... Let's do this.]]></description>
				<content:encoded><![CDATA[<p style="float:right; margin:0 0 10px 15px; width:240px;">
		<img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/he_man_picture.jpg" width="240" />
		</p><p>Coming off the <em>excitement</em> of <a href="http://ryrobes.com/python/building-tableau-data-extract-files-with-python-in-tableau-8-sample-usage/" title="Building Tableau Data Extract files with Python in Tableau 8 – Sample Usage">my last post about writing a simple bare-bones python usage of Tableau's brand-new Data Extracts API</a> from version 8.0 - I figured that it was time to build on that. Let's take a step forward and get a little more complicated - AND a little more useful.<img src="http://fixxer.ryrobes.com/wp-content/uploads/2012/12/heman-case.jpg" alt="I'm encapsulated in a proprietary binary data file OF EMOTION!" title="I'm encapsulated in a proprietary binary data file OF EMOTION!" width="207" height="158" class="alignleft size-full wp-image-42626" /></p>
<h2>Oh, noes!</h2>
<h4>I'm encapsulated in a proprietary binary data file OF EMOTION!</h4>
<h2>Save me, Cringer!</h2>
<p>Wait. This could be MORE useful than my last post?<br />
<h4>You mean your business doesn't run on a billion rows of dummy shit data? Weird.</h4>
<p>Anyways, this time we're going to extract data from SQL Server with a single SQL query and populate a new TDE file. </p>
<p><strong>This script has 3 main features:</strong></p>
<div class="shortcode-unorderedlist tick">
<ul>
<li>Can run any query you want (even 'select * from' queries)</li>
<li>You don't have to pre-define the field names</li>
<li>You don't have to pre-define the data types</li>
</ul>
</div>

<h2>I got you, shawty. I GOT YOU.</h2>
<p>All of which we had to explicitly state in the last script (even as a 'proof of concept'), and as you can imagine - with a wide result set, defining every little thing and then inserting every little thing can be a huge pain in the ass (not to mention ugly). With my approach, it's literally plug and play - err, paste and play. You can even run queries like "select * from" - it's not even necessary to spell out the field names (assuming they are all unique, that is - more on that later) - and yes, I mentioned that twice <strong>for <em>EMPHASIS</em></strong>!</p>
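<p>If you want to check the unique-field-names caveat before kicking off a big extract, something like this would do it. It's only a sketch: it assumes you already have the column names in a list (for example, the first element of each entry in a DB-API cursor's description), and find_duplicate_fields is a made-up name, not part of the script in this post.</p>

```python
from collections import Counter

def find_duplicate_fields(fieldnames):
    # hypothetical helper: return the column names that show up more than once
    counts = Counter(fieldnames)
    return sorted(name for name, seen in counts.items() if seen != 1)

# e.g. a "select *" across a join that brings back two id columns
dupes = find_duplicate_fields(['id', 'name', 'id', 'created'])
print(dupes)
```

<p>An empty list means every name is unique. Anything else is a column you'd probably want to alias in the query.</p>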
<h4>Ye gods! No scripting out my lame_ass_field_names? Huzzah!</h4>
<p>Basically, the script works like this:</p>
<ul>
<li>Connect to SQL Server</li>
<li>Execute your SQL query</li>
<li>Look at the field names and first row data</li>
<li>Have Python try to guess what data types they are (based on that first row)</li>
<li>Map those data types and field names to a TDE schema definition (TableDefinition())</li>
<li>Loop through the entire result set, calling each field by name and type so the correct insert method is used (setDouble, setInteger, setNull, setDateTime, setBoolean, etc)</li>
<li>Do a little headbanging as you watch the text <em>fly</em> by!</li>
<li>Close the file (write the TDE to disk)</li>
<li>Close the SQL connection</li>
<li>Wipe hands on pants</li>
</ul>
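<p>The "guess the types from the first row" step can be sketched on its own, without a database or the DataExtracts module. The type codes (7, 10, 11, 13, 15) are the ones the script below passes to <code>addColumn</code>; the function name and the sample row are assumptions made up for this illustration.</p>

```python
import datetime
import decimal

# Type codes as used in the script: 7 = integer, 10 = double, 11 = boolean,
# 13 = datetime, 15 = character string.
TDE_TYPE_CODES = {
    bool: 11,
    int: 7,
    float: 10,
    decimal.Decimal: 10,
    datetime.datetime: 13,
    str: 15,
}


def guess_schema(first_row):
    """Return (field name, type code) pairs guessed from one row dict."""
    schema = []
    for name in sorted(first_row):
        value = first_row[name]
        # None and any unrecognized type fall back to a string column.
        code = TDE_TYPE_CODES.get(type(value), 15)
        schema.append((name, code))
    return schema


# Hypothetical first row of a result set.
sample_row = {
    "sighted_at": datetime.datetime(2012, 12, 6, 21, 30, 0),
    "shape": "disc",
    "duration_sec": 45,
    "latitude": decimal.Decimal("47.61"),
    "confirmed": False,
    "notes": None,
}

for name, code in guess_schema(sample_row):
    print(name, code)
```

<p>Looking types up by exact type (rather than <code>isinstance</code>) keeps booleans from being mistaken for integers. The obvious weakness is the same one the script has: a NULL in the first row makes that column a string.</p>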
<h4>Booya. Done.</h4>
<p> And you didn't have to do a damn thing except give it a server connection and a SQL query. Now you can open that fresh TDE file and have yourself a little data party. { Pants optional }</p>
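<p>The per-row insert step boils down to "pick the setter that matches the column type, or setNull for a NULL". Here is a rough sketch of that dispatch. The setter names are the ones the script calls; <code>RecordingRow</code> is a hypothetical stand-in that only records calls so the sketch runs without the DataExtracts module, and <code>fill_row</code> is likewise a made-up name.</p>

```python
import datetime
import decimal


class RecordingRow(object):
    """Hypothetical stand-in for a TDE row: records every setter call."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, method_name):
        # Only reached for attributes that do not exist, i.e. the setters.
        def record_call(*args):
            self.calls.append((method_name,) + args)
        return record_call


def fill_row(tde_row, schema, record):
    """Call the setter matching each column's type code, by position."""
    for position, (name, code) in enumerate(schema):
        value = record[name]
        if value is None:
            tde_row.setNull(position)
        elif code == 7:
            tde_row.setInteger(position, int(value))
        elif code == 10:
            tde_row.setDouble(position, float(value))
        elif code == 11:
            tde_row.setBoolean(position, bool(value))
        elif code == 13:
            # Fractional seconds are dropped, as in the script.
            tde_row.setDateTime(position, value.year, value.month, value.day,
                                value.hour, value.minute, value.second, 0)
        else:
            tde_row.setCharString(position, str(value))


schema = [("confirmed", 11), ("duration_sec", 7), ("latitude", 10),
          ("notes", 15), ("sighted_at", 13)]
record = {
    "confirmed": False,
    "duration_sec": 45,
    "latitude": decimal.Decimal("47.61"),
    "notes": None,
    "sighted_at": datetime.datetime(2012, 12, 6, 21, 30, 0),
}

row = RecordingRow()
fill_row(row, schema, record)
for call in row.calls:
    print(call)
```

<p>Reading the datetime parts straight off the object avoids the string round trip, and converting Decimal values with <code>float()</code> keeps the double setter happy either way.</p>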
<div class="woo-sc-box tick  rounded full"><strong>Requires:</strong> <a href="http://python.org/download/" target="_blank">Python 2.7.X</a>+, the DataExtracts module, and <a href="http://code.google.com/p/pymssql/" target="_blank">pymssql module</a> (<a href="http://www.lfd.uci.edu/~gohlke/pythonlibs/#pymssql" target="_blank">pre-compiled binaries here</a>). Oh, and it would help to have a SQL Server and Tableau Desktop as well. ;)</div>
<p>Here it is! Cut, paste, and give it a try, gosh darnit!</p>
<div class="woo-sc-box info  rounded full"><strong>UPDATE:</strong> Added var <strong>(rowoutput = True / False)</strong> for turning off the debug row / column output - good for looking into errors but slowed the script down like 10x</div>
<div class="codecolorer-container python blackboard" style="overflow:auto;white-space:nowrap;border:1px solid #9F9F9F;"><div class="python codecolorer" style="padding:5px;font:normal 12px/1.4em Monaco, Lucida Console, monospace;white-space:nowrap"><span style="color: #808080; font-style: italic;"># ryan robitaille 12/6/2012</span><br />
<span style="color: #808080; font-style: italic;"># simple Tableau Data Extract creation from a single microsoft sql server sql statement</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">import</span> dataextract <span style="color: #ff7700;font-weight:bold;">as</span> tde <span style="color: #808080; font-style: italic;"># saves some typing, cause i'm a lazy fucker</span><br />
<span style="color: #ff7700;font-weight:bold;">import</span> <span style="color: #dc143c;">os</span><span style="color: #66cc66;">,</span> <span style="color: #dc143c;">time</span><span style="color: #66cc66;">,</span> pymssql <span style="color: #808080; font-style: italic;"># for file manipulation, script timing (not necc), database access!</span><br />
<br />
<span style="color: #808080; font-style: italic;">###################### FOR YOUR PARAMETERS, SON! ######################</span><br />
tdefilename <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'ufo_datas.tde'</span><br />
sql <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">&quot;select * from UFOdata.dbo.Sightings&quot;</span> <span style="color: #808080; font-style: italic;"># whatever</span><br />
sqlserverhost <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'localhost'</span><br />
sqlusername <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'sa'</span><br />
sqlpassword <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'passy'</span><br />
sqldatabase <span style="color: #66cc66;">=</span> <span style="color: #483d8b;">'UFOdata'</span><br />
rowoutput <span style="color: #66cc66;">=</span> <span style="color: #008000;">False</span> <span style="color: #808080; font-style: italic;"># for DEBUGGING data errors / slows shit down 10X however</span><br />
<span style="color: #808080; font-style: italic;">###################### FOR YOUR PARAMETERS, SON! ######################</span><br />
<br />
dotsevery <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">75</span><br />
<br />
start_time <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># simple timing for test purposes</span><br />
<br />
mssql_db <span style="color: #66cc66;">=</span> pymssql.<span style="color: black;">connect</span><span style="color: black;">&#40;</span>host<span style="color: #66cc66;">=</span>sqlserverhost<span style="color: #66cc66;">,</span> <span style="color: #dc143c;">user</span><span style="color: #66cc66;">=</span>sqlusername<span style="color: #66cc66;">,</span> password<span style="color: #66cc66;">=</span>sqlpassword<span style="color: #66cc66;">,</span> database<span style="color: #66cc66;">=</span>sqldatabase<span style="color: #66cc66;">,</span> as_dict<span style="color: #66cc66;">=</span><span style="color: #008000;">True</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># as_dict very important</span><br />
mssql_cursor <span style="color: #66cc66;">=</span> mssql_db.<span style="color: black;">cursor</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
mssql_cursor.<span style="color: black;">execute</span><span style="color: black;">&#40;</span>sql<span style="color: black;">&#41;</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' '</span><br />
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'[ Note: Each . = '</span> +<span style="color: #008000;">str</span><span style="color: black;">&#40;</span>dotsevery<span style="color: black;">&#41;</span>+ <span style="color: #483d8b;">' rows processed ]'</span><br />
<br />
fieldnameslist <span style="color: #66cc66;">=</span> <span style="color: black;">&#91;</span><span style="color: black;">&#93;</span> <span style="color: #808080; font-style: italic;"># define our empty list</span><br />
<br />
<span style="color: #808080; font-style: italic;">#go through the first row to TRY to set fieldnames and datatypes</span><br />
<span style="color: #ff7700;font-weight:bold;">for</span> row <span style="color: #ff7700;font-weight:bold;">in</span> mssql_cursor:<br />
&nbsp; &nbsp; &nbsp; &nbsp; itemz <span style="color: #66cc66;">=</span> <span style="color: #008000;">len</span><span style="color: black;">&#40;</span>row.<span style="color: black;">keys</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>/<span style="color: #ff4500;">2</span> <span style="color: #808080; font-style: italic;"># because the dict rowset includes BOTH number keys and fieldname keys</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> k <span style="color: #ff7700;font-weight:bold;">in</span> row.<span style="color: black;">keys</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; fieldnameslist.<span style="color: black;">append</span><span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#40;</span>k<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">'|'</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span><span style="color: #008000;">type</span><span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>k<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;type '&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;'&gt;&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;&lt;class '&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'NoneType'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'str'</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'uuid.UUID'</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">'str'</span><span style="color: black;">&#41;</span> <span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">break</span> <span style="color: #808080; font-style: italic;"># after the first row, we SHOULD have a decent idea of the datatypes</span><br />
<span style="color: #808080; font-style: italic;"># ^ a bit inelegant, but it gets the job done</span><br />
<br />
fieldnameslist.<span style="color: black;">sort</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># sort them out so the integer keys are first (we're gonna whack em)</span><br />
<span style="color: #ff7700;font-weight:bold;">del</span> fieldnameslist<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span>:itemz<span style="color: black;">&#93;</span> <span style="color: #808080; font-style: italic;"># remove the first itemz entries (the integer keys), leaving only the field name keys</span><br />
<br />
<br />
<span style="color: #ff7700;font-weight:bold;">try</span>: &nbsp;<span style="color: #808080; font-style: italic;"># Just in case the file exists already, we don't want to bomb out</span><br />
&nbsp; &nbsp; tdefile <span style="color: #66cc66;">=</span> tde.<span style="color: black;">Extract</span><span style="color: black;">&#40;</span>tdefilename<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># in CWD</span><br />
<span style="color: #ff7700;font-weight:bold;">except</span>: <br />
&nbsp; &nbsp; <span style="color: #dc143c;">os</span>.<span style="color: black;">system</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'del '</span>+tdefilename<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #dc143c;">os</span>.<span style="color: black;">system</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'del DataExtract.log'</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#might as well erase this bitch too</span><br />
&nbsp; &nbsp; tdefile <span style="color: #66cc66;">=</span> tde.<span style="color: black;">Extract</span><span style="color: black;">&#40;</span>tdefilename<span style="color: black;">&#41;</span><br />
<br />
<span style="color: #808080; font-style: italic;"># ok lets build the table definition in TDE with our list of names and types first</span><br />
<span style="color: #808080; font-style: italic;"># replacing literals with TDE datatype integers, etc</span><br />
tableDef <span style="color: #66cc66;">=</span> tde.<span style="color: black;">TableDefinition</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;">#create a new table def</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'*** field names list ***'</span> <span style="color: #808080; font-style: italic;"># debug </span><br />
<span style="color: #ff7700;font-weight:bold;">for</span> t <span style="color: #ff7700;font-weight:bold;">in</span> fieldnameslist:<br />
&nbsp; &nbsp; fieldtype <span style="color: #66cc66;">=</span> t.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; fieldname <span style="color: #66cc66;">=</span> t.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; fieldtype <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>fieldtype<span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;str&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;15&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;datetime.datetime&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;13&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;int&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;7&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;decimal.Decimal&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;10&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;float&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;10&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;uuid.UUID&quot;</span><span style="color: #66cc66;">,</span><span style="color: #483d8b;">&quot;15&quot;</span><span style="color: black;">&#41;</span>.<span style="color: black;">replace</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;bool&quot;</span><span style="color: #66cc66;">,</span><span 
style="color: #483d8b;">&quot;11&quot;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> fieldname + <span style="color: #483d8b;">' &nbsp;(looks like '</span> + t.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span> +<span style="color: #483d8b;">', TDE datatype '</span> + fieldtype + <span style="color: #483d8b;">')'</span> &nbsp;<span style="color: #808080; font-style: italic;"># debug </span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">try</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tableDef.<span style="color: black;">addColumn</span><span style="color: black;">&#40;</span>fieldname<span style="color: #66cc66;">,</span> <span style="color: #008000;">int</span><span style="color: black;">&#40;</span>fieldtype<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># if we pass a non-int to fieldtype, it'll fail</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">except</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; tableDef.<span style="color: black;">addColumn</span><span style="color: black;">&#40;</span>fieldname<span style="color: #66cc66;">,</span> <span style="color: #ff4500;">15</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># if we get a weird type we don't recognize, just make it a string</span><br />
<span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'***'</span><br />
&nbsp; &nbsp; <span style="color: #dc143c;">time</span>.<span style="color: black;">sleep</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># wait 5 seconds so you can actually read shit!</span><br />
<br />
<span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; <span style="color: #808080; font-style: italic;"># ok, lets print out the table def we just made, for shits and giggles</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'################## TDE table definition created ######################'</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> c <span style="color: #ff7700;font-weight:bold;">in</span> <span style="color: #008000;">range</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">0</span><span style="color: #66cc66;">,</span>tableDef.<span style="color: black;">getColumnCount</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'Column: '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>tableDef.<span style="color: black;">getColumnName</span><span style="color: black;">&#40;</span>c<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' Type: '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>tableDef.<span style="color: black;">getColumnType</span><span style="color: black;">&#40;</span>c<span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #dc143c;">time</span>.<span style="color: black;">sleep</span><span style="color: black;">&#40;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># wait 5 seconds so you can actually read shit!</span><br />
<br />
<span style="color: #808080; font-style: italic;"># ok lets add the new def as a table in the extract</span><br />
tabletran <span style="color: #66cc66;">=</span> tdefile.<span style="color: black;">addTable</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">&quot;Extract&quot;</span><span style="color: #66cc66;">,</span>tableDef<span style="color: black;">&#41;</span> <br />
<span style="color: #808080; font-style: italic;"># why table NEEDS to be called 'Extract' is beyond me</span><br />
<br />
rowsinserted <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">1</span> <span style="color: #808080; font-style: italic;"># we need to count stuff, dude! Robots start at 0, I START AT 1!</span><br />
<br />
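<span style="color: #808080; font-style: italic;"># the type-guessing loop above already consumed the first row, so re-run the query to start again from row one</span><br />
mssql_cursor.<span style="color: black;">execute</span><span style="color: black;">&#40;</span>sql<span style="color: black;">&#41;</span><br />
<span style="color: #808080; font-style: italic;"># ok, for each row in the result set, we iterate through all the fields and insert based on datatype</span><br />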
<span style="color: #ff7700;font-weight:bold;">for</span> row <span style="color: #ff7700;font-weight:bold;">in</span> mssql_cursor:<br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'************** INSERTING ROW NUMBER: '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>rowsinserted<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">'**************'</span> <span style="color: #808080; font-style: italic;"># debug output</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>: <span style="color: #808080; font-style: italic;"># only print dot every 50 records</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> <span style="color: black;">&#40;</span>rowsinserted%dotsevery<span style="color: black;">&#41;</span> <span style="color: #66cc66;">==</span> <span style="color: #ff4500;">0</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">'.'</span><span style="color: #66cc66;">,</span><br />
<br />
&nbsp; &nbsp; columnposition <span style="color: #66cc66;">=</span> <span style="color: #ff4500;">0</span><br />
&nbsp; &nbsp; newrow <span style="color: #66cc66;">=</span> tde.<span style="color: black;">Row</span><span style="color: black;">&#40;</span>tableDef<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">for</span> t <span style="color: #ff7700;font-weight:bold;">in</span> fieldnameslist:<br />
&nbsp; &nbsp; &nbsp; &nbsp; fieldtype <span style="color: #66cc66;">=</span> t.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; fieldname <span style="color: #66cc66;">=</span> t.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'|'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> rowoutput <span style="color: #66cc66;">==</span> <span style="color: #008000;">True</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' '</span> + fieldname + <span style="color: #483d8b;">': &nbsp; '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' ('</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>fieldtype<span style="color: black;">&#41;</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'.'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> + <span style="color: #483d8b;">')'</span> <span style="color: #808080; font-style: italic;"># debug output</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'str'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>: <span style="color: #808080; font-style: italic;"># we don't want no None!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setCharString</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># ok, put that None here</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'int'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setInteger</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'bool'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setBoolean</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'decimal.Decimal'</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setDouble</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
<br />
&nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> fieldtype <span style="color: #66cc66;">==</span> <span style="color: #483d8b;">'datetime.datetime'</span>: <span style="color: #808080; font-style: italic;"># sexy datetime splitting</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">if</span> row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span> <span style="color: #66cc66;">!=</span> <span style="color: #008000;">None</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; strippeddate <span style="color: #66cc66;">=</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>row<span style="color: black;">&#91;</span>fieldname<span style="color: black;">&#93;</span><span style="color: black;">&#41;</span>.<span style="color: black;">split</span><span style="color: black;">&#40;</span><span style="color: #483d8b;">'.'</span><span style="color: black;">&#41;</span><span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span> <span style="color: #808080; font-style: italic;"># just in case we get microseconds (not all datetime uses them)</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; timechunks <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">time</span>.<span style="color: black;">strptime</span><span style="color: black;">&#40;</span><span style="color: #008000;">str</span><span style="color: black;">&#40;</span>strippeddate<span style="color: black;">&#41;</span><span style="color: #66cc66;">,</span> <span style="color: #483d8b;">&quot;%Y-%m-%d %H:%M:%S&quot;</span><span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># chunky style!</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setDateTime</span><span style="color: black;">&#40;</span>columnposition<span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">0</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">1</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">2</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">3</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">4</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> timechunks<span style="color: black;">&#91;</span><span style="color: #ff4500;">5</span><span style="color: black;">&#93;</span><span style="color: #66cc66;">,</span> <span style="color: #ff4500;">0000</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; <span style="color: #ff7700;font-weight:bold;">else</span>:<br />
&nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; newrow.<span style="color: black;">setNull</span><span style="color: black;">&#40;</span>columnposition<span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; <br />
&nbsp; &nbsp; &nbsp; &nbsp; columnposition <span style="color: #66cc66;">=</span> columnposition + <span style="color: #ff4500;">1</span> <span style="color: #808080; font-style: italic;"># we gots to know what column number we're working on!</span><br />
&nbsp; &nbsp; tabletran.<span style="color: black;">insert</span><span style="color: black;">&#40;</span>newrow<span style="color: black;">&#41;</span> <span style="color: #808080; font-style: italic;"># finally insert buffered row into TDE 'table'</span><br />
&nbsp; &nbsp; newrow.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
&nbsp; &nbsp; rowsinserted <span style="color: #66cc66;">=</span> rowsinserted + <span style="color: #ff4500;">1</span><br />
<br />
<span style="color: #808080; font-style: italic;"># ok let's write out that file and get back to making dinner</span><br />
tdefile.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
mssql_db.<span style="color: black;">close</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span><br />
<br />
<span style="color: #808080; font-style: italic;"># timing purposes for debugging / optimizing / FUN! This is FUN, Lars.</span><br />
timetaken <span style="color: #66cc66;">=</span> <span style="color: #dc143c;">time</span>.<span style="color: #dc143c;">time</span><span style="color: black;">&#40;</span><span style="color: black;">&#41;</span> - start_time<br />
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>rowsinserted<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' rows inserted in '</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>timetaken<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' seconds'</span><br />
<span style="color: #ff7700;font-weight:bold;">print</span> <span style="color: #483d8b;">' &nbsp; &nbsp;('</span> + <span style="color: #008000;">str</span><span style="color: black;">&#40;</span>rowsinserted/timetaken<span style="color: black;">&#41;</span> + <span style="color: #483d8b;">' rows per second)'</span><br />
<span style="color: #808080; font-style: italic;"># woo, let's have a drink!</span></div></div>
<p>I threw this together in a few hours, and I'll probably be adding to it and fixing it down the line. It has very little exception handling and is only around 130 lines, including my asinine comments and whitespace lines. Parts of it are a bit 'inelegant', with lots of list building and string mangling, but it works. Quite well, actually.</p>
<p>I'd love to get some feedback so I can make it better. I tested it on many different tables I had lying around, including that crazy MS AdventureWorks2008R2 Data Warehouse DB, which uses a lot of strange data types and calculated fields. All worked splendidly.</p>
<div class="woo-sc-box normal  rounded full">
<p><strong>Current Known Issues: </strong></p>
<p>1) I'm only looking at the first row for data type guessing. I really should be taking a much larger sample of the data. </p>
<p>2) If you do a bunch of joins without explicitly naming the fields in the query, and the result set ends up with duplicate field names, the file will miss some fields or even bomb out completely.</p>
<p>i.e., don't do this: </p>
<pre>select a.*, b.*, c.* from table1 a, table2 b, table3 c
where a.id = b.id and a.id = c.id
</pre>
<p>But you wouldn't do that fucking nonsense anyway, right? ;)</p>
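<p>If you do end up with duplicate names, one possible workaround is to rename the repeats before the TDE table definition gets built. This is a rough sketch, not something the script above does today: the helper name <code>dedupe_fieldnames</code> and the <code>_2</code>, <code>_3</code> suffix scheme are assumptions made up for illustration.</p>

```python
def dedupe_fieldnames(fieldnames):
    """Return a copy of fieldnames where repeated names get a numeric suffix.

    Hypothetical helper, not part of the original script.
    """
    taken = set()   # lowercased names already handed out
    result = []
    for name in fieldnames:
        candidate = name
        suffix = 1
        # Compare case-insensitively to be safe (an assumption), and keep
        # bumping the suffix until the candidate collides with nothing.
        while candidate.lower() in taken:
            suffix += 1
            candidate = '%s_%d' % (name, suffix)
        taken.add(candidate.lower())
        result.append(candidate)
    return result
```

<p>This only helps if the rows are fetched as plain tuples and the names are read from the cursor's <code>description</code>. If the rows come back as dictionaries keyed by field name, the duplicate columns are already gone by the time you could rename them.</p>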
</div>
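<p>For the first known issue (guessing data types from only the first row), here is a minimal sketch of what sampling more rows could look like. It assumes the rows are available as a list of dictionaries keyed by field name. The function name <code>guess_column_types</code>, the <code>sample_size</code> default, and the fallback rules are all assumptions for illustration, and it returns plain Python types, leaving the mapping to TDE column types to the existing code.</p>

```python
import decimal


def guess_column_types(rows, fieldnames, sample_size=1000):
    """Guess one Python type per field from up to sample_size rows.

    rows is assumed to be a list of dicts keyed by field name.
    Hypothetical helper, not part of the original script.
    """
    seen = dict((name, set()) for name in fieldnames)
    for row in rows[:sample_size]:
        for name in fieldnames:
            value = row.get(name)
            if value is not None:  # NULLs tell us nothing about the type
                seen[name].add(type(value))

    numeric = set([int, float, decimal.Decimal])
    guesses = {}
    for name in fieldnames:
        types = seen[name]
        if len(types) == 1:
            guesses[name] = next(iter(types))  # every sampled value agreed
        elif types and types.issubset(numeric):
            guesses[name] = float  # mixed numerics: widen to float
        else:
            guesses[name] = str  # all NULL or mixed types: fall back to string
    return guesses
```

<p>Falling back to a string for all-NULL or mixed columns is just one conservative choice. The point is that a NULL in the first row no longer decides the column type on its own.</p>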
<p>I'm also thinking of making a version of this with command line parameters instead of file editing, i.e.:</p>
<pre>python TDE_from_mssql.py \
    --file=belly.tde \
    --server=servername \
    --username=sa \
    --database=db45 \
    --sql="select * from bellybutton"
</pre>
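<p>A command line version could start from something like the sketch below, which uses Python's standard <code>argparse</code> module. The option names are the ones floated above. The function name <code>parse_args</code>, the help strings, and making every option required are assumptions, and nothing here is wired into the extract code yet.</p>

```python
import argparse


def parse_args(argv=None):
    """Parse the command line options for the extract script.

    Hypothetical sketch; argv=None means "read sys.argv" as usual.
    """
    parser = argparse.ArgumentParser(
        description='Dump a SQL Server query into a Tableau Data Extract file.')
    parser.add_argument('--file', required=True, help='output .tde file name')
    parser.add_argument('--server', required=True, help='SQL Server host name')
    parser.add_argument('--username', required=True, help='database login')
    parser.add_argument('--database', required=True, help='database to query')
    parser.add_argument('--sql', required=True,
                        help='query whose result set fills the extract')
    return parser.parse_args(argv)


# Example call with an explicit argument list instead of the real command line
args = parse_args(['--file=belly.tde', '--server=servername', '--username=sa',
                   '--database=db45', '--sql=select * from bellybutton'])
```

<p>The values then show up as <code>args.file</code>, <code>args.server</code> and so on, and could replace the variables that currently get edited by hand at the top of the script.</p>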
<p>What's next for TDE file population posts? Web scraping, MySQL, Oracle, flat files? Might as well hit 'em all.</p>
]]></content:encoded>
			<wfw:commentRss>http://ryrobes.com/python/sql-server-query-to-tableau-data-extract-more-tde-api-fun-with-python-tableau-8/feed/</wfw:commentRss>
		<slash:comments>16</slash:comments>
		</item>
	</channel>
</rss>
