<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Jeethu's Blog</title>
	
	<link>http://jeethurao.com/blog</link>
	<description>Life, Code et al.</description>
	<lastBuildDate>Tue, 25 Dec 2012 08:37:46 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/JeethusBlog" /><feedburner:info uri="jeethusblog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Setting up Scala with IntelliJ IDEA on Mountain Lion</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/8PpRfKYns84/</link>
		<comments>http://jeethurao.com/blog/?p=217#comments</comments>
		<pubDate>Tue, 25 Dec 2012 08:37:09 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[intellij]]></category>
		<category><![CDATA[scala]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=217</guid>
		<description><![CDATA[I&#8217;d recently finished Martin Odersky&#8217;s Functional Programming Principles in Scala class on Coursera. I&#8217;d briefly played with OCaml for a couple of months back in 2002, when I was on the verge of dropping out of college and had so much time on my hands that I didn&#8217;t know what to do with it. Subsequently, KS [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;d recently finished Martin Odersky&#8217;s <a title="Scala Class" href="https://www.coursera.org/course/progfun">Functional Programming Principles in Scala</a> class on Coursera. I&#8217;d briefly played with OCaml for a couple of months back in 2002, when I was on the verge of dropping out of college and had so much time on my hands that I didn&#8217;t know what to do with it. Subsequently, <a href="http://kssreeram.org">KS</a> introduced me to Scheme and <a title="Structure and Interpretation of Computer Programs" href="http://mitpress.mit.edu/sicp/">SICP</a>. Given these precedents, despite joining about a week late, the class was a breeze and I thoroughly enjoyed it.</p>
<p>The class used the Typesafe Scala IDE, which is built on Eclipse. I&#8217;ve been using JetBrains PyCharm for most of my Python work, and when they had IntelliJ 12 Ultimate Edition on sale at a 75% discount last week, I bought it. Last night, I spent a bit of time setting up IntelliJ to work with Scala on my Retina MBP. I&#8217;m posting this for future reference.</p>
<p>To start with, we need to install Scala and sbt using <a title="Homebrew" href="http://mxcl.github.com/homebrew/">Homebrew</a>.</p>
<pre class="brush: bash; title: ; notranslate">
brew install scala --with-docs
brew install sbt
</pre>
<p>I also install <a title="drip" href="https://github.com/flatland/drip">drip</a>, a JVM launcher which preloads and keeps a warm JVM instance to make things appear faster. This is optional, but highly recommended.</p>
<pre class="brush: bash; title: ; notranslate">
brew install drip
</pre>
<p>We also need to add the following lines to <em>~/.bash_profile</em>.</p>
<pre class="brush: bash; title: ; notranslate">

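# Resolve the default JDK installed on this Mac.
export JAVA_HOME=$(/usr/libexec/java_home)
# Homebrew's Scala 2.9.2 install; adjust the version to match yours.
export SCALA_HOME=/usr/local/Cellar/scala/2.9.2/libexec
# Launch the JVM through drip (skip these two lines if drip isn't installed).
export JAVACMD=drip
export DRIP_SHUTDOWN=30
# JVM options for sbt.
export SBT_OPTS=&quot;-XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:PermSize=128M -XX:MaxPermSize=512M&quot;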

</pre>
<p>Finally, get JDK docs and sources from Apple. Download and install &#8220;Java for OS X 2012-006 Developer Package&#8221; from <a title="Downloads for Apple Developers" href="https://developer.apple.com/downloads?q=java%20developer">here</a>.</p>
<p>Now, to configure IDEA 12. Start by installing the Scala plugin from the plugin repository and restart IDEA.</p>
<p>Next, start a new project in IDEA and select &#8220;Scala Module&#8221;. Enter the project name, click on the New button for Project SDK. The $JAVA_HOME directory should already be selected in the dialog which appears, just click on Choose to proceed.</p>
<p>In Scala settings, select the &#8220;Set Scala Home&#8221; radio button and enter &#8220;/usr/local/Cellar/scala/2.9.2/libexec&#8221; in the text box below it. It&#8217;ll show a warning about not being able to find the /doc/scala-devel-docs/api directory. We&#8217;ll set it later. Click on &#8220;Finish&#8221; to create the new project.</p>
<p>Now to set up the documentation directories. Select &#8220;File -&gt; Project Structure&#8221;. Select &#8220;Libraries&#8221; under &#8220;Project Settings&#8221; in the leftmost pane. Select &#8220;scala-library&#8221;. The JavaDocs path will be set to &#8220;/usr/local/Cellar/scala/2.9.2/libexec/doc/scala-devel-docs/api&#8221;, which is incorrect. Remove that entry and add a new JavaDoc entry pointing to &#8220;/usr/local/Cellar/scala/2.9.2/share/doc/scala&#8221;. This is where Homebrew installs the Scala docs if Scala is installed with the &#8220;--with-docs&#8221; option. Click on &#8220;Facets&#8221; under &#8220;Project Settings&#8221; and ensure that the &#8220;Language Level&#8221; is set to &#8220;Scala 2.9&#8221;.</p>
<p>It&#8217;s also a good idea to ensure that the locations for Java docs are set properly. Select &#8220;SDKs&#8221; under &#8220;Platform Settings&#8221;. On the &#8220;Sourcepath&#8221; tab, ensure that there&#8217;s an entry pointing to &#8220;/Library/Java/JavaVirtualMachines/1.6.0_37-b06-434.jdk/Contents/Home/src.jar!/src&#8221;. On the &#8220;Documentation Path&#8221; tab, ensure there are two entries pointing to &#8220;/Library/Java/JavaVirtualMachines/1.6.0_37-b06-434.jdk/Contents/Home/docs.jar!/docs/api&#8221; and &#8220;/Library/Java/JavaVirtualMachines/1.6.0_37-b06-434.jdk/Contents/Home/appledocs.jar!/appledoc/api&#8221;.</p>
<p>And that&#8217;s it. You could probably start by adding a Scala Worksheet to the project and try playing with the REPL. The default shortcut for inline documentation in IDEA is &#8220;Ctrl-J&#8221;.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=217</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=217</feedburner:origLink></item>
		<item>
		<title>Installing Rosetta on Snow Leopard</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/SJoOXvGWVwU/</link>
		<comments>http://jeethurao.com/blog/?p=211#comments</comments>
		<pubDate>Thu, 07 Jan 2010 10:46:13 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[osx]]></category>
		<category><![CDATA[rosetta]]></category>
		<category><![CDATA[snow leopard]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=211</guid>
		<description><![CDATA[I was trying to use the flare actionscript decompiler, which I hadn&#8217;t used since I upgraded to Snow Leopard about a month ago. The OS X version of flare is a PPC binary and needs Rosetta to run. Unfortunately, since flare is a command line application, it just spits out a message (&#8220;You need the Rosetta [...]]]></description>
				<content:encoded><![CDATA[<p>I was trying to use the <a href="http://www.nowrap.de/flare.html">flare</a> actionscript decompiler, which I hadn&#8217;t used since I upgraded to Snow Leopard about a month ago. The OS X version of flare is a PPC binary and needs <a href="http://www.apple.com/rosetta/">Rosetta</a> to run. Unfortunately, since flare is a command line application, it just spits out a message (&#8220;<em>You need the Rosetta software to run flare. The Rosetta installer is in Optional Installs on your Mac OS X installation disc.</em>&#8221;) and exits. GUI applications trigger a dialog prompting you to download and install Rosetta. My Snow Leopard upgrade disk is at home and I don&#8217;t want to wait until evening to install Rosetta off the disk. So, I tried searching for a link to the Rosetta installer on the web, without any success. The only other recourse was to find a PPC-built application to install and run (and eventually uninstall after Rosetta was installed) to trigger the installer dialog. After a little bit of searching, I finally managed to find PPC binaries from the <a href="http://folding.stanford.edu/English/DownloadOld">Folding@home</a> project. I installed one, and got the dialog to install Rosetta, which incidentally was only a 2MB download. I wish there was an easier way to download the Rosetta installer straight from the Apple website.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=211</wfw:commentRss>
		<slash:comments>8</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=211</feedburner:origLink></item>
		<item>
		<title>Browser toolbars and privacy</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/DtNc_HkuzDo/</link>
		<comments>http://jeethurao.com/blog/?p=186#comments</comments>
		<pubDate>Mon, 22 Jun 2009 13:10:27 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=186</guid>
		<description><![CDATA[Some 6 months ago, out of curiosity, I tried using Wireshark to monitor the HTTP traffic my browser was generating while casually browsing the web on Firefox with the StumbleUpon toolbar. The resulting traffic dump was pretty interesting. The StumbleUpon toolbar was calling home for every single URL I visited. The extension was making an HTTP [...]]]></description>
				<content:encoded><![CDATA[<p>Some 6 months ago, out of curiosity, I tried using <a href="http://www.wireshark.org/">Wireshark</a> to monitor the HTTP traffic my browser was generating while casually browsing the web on Firefox with the <a href="http://www.stumbleupon.com/">StumbleUpon</a> toolbar. The resulting traffic dump was pretty interesting. The StumbleUpon toolbar was calling home for every single URL I visited.</p>
<p>The extension was making an HTTP POST request for every URL I visited to http://74.201.117.232/getmeta.php?username=&lt;my_user_id&gt;.</p>
<p><img class="size-full wp-image-193" title="Stumbleupon screenshot" src="http://jeethurao.com/blog/wp-content/uploads/2009/06/stumbleupon_dump-Wireshark-1.jpg" alt="Stumbleupon screenshot" width="849" height="401" /></p>
<p>The use case for this seems to be checking whether I&#8217;ve rated the URL and, if so, what my rating was. But then, why not cache the data instead of calling home every time I open a page (even in the same browsing session)? I freaked out and uninstalled the toolbar that very moment, and that was pretty much it. I may have mentioned it in passing to a couple of friends. Most of them didn&#8217;t really care, since they don&#8217;t use StumbleUpon, and those who did thought it was par for the course with all the social bookmarking/news toolbars, which is true. Almost every other toolbar also does this.</p>
<p>I thought about this for a while and concluded that toolbars should hash the URLs before sending them to the servers, and use <a href="http://jeethurao.com/blog/?p=164">bloom filters</a> to reduce the number of times the client has to call home to check whether a user has rated a URL. And that was pretty much it, until last week.</p>
<p><a href="http://lankerisms.blogspot.com/">kamathln</a> told me that someone wants to write a Jetpack extension for <a href="http://tagz.in/">Tagz</a>. That someone turned out to be Yathi, an old mutual friend of ours. He wanted to write a relatively simple Jetpack extension and wanted some server side support to get info on any URL (comments, points, number of saves etc). I quickly added the requisite server side support and he quickly hacked together a nice little Jetpack script. It turned out to be one of the first few Jetpack extensions on <a href="http://userscripts.org/jetpacks/">userscripts.org</a>, a couple of people started using it and all was fine. Until I started looking at the server logs, and it felt like deja vu all over again. Our script was leaking our users&#8217; browsing history, quite like I&#8217;d observed with the StumbleUpon toolbar 6 months ago.</p>
<p>Eventually, I decided that sending URLs in plain text is a bad idea. Also, the lookups should be cached at least for a short while. An extended form of the idea involved using bloom filters, but that&#8217;d have been too much work. So, we now normalize the URL to a standard representation, hash it with SHA-256, and send the hash to the server. Although this is not a completely bulletproof solution (the server can still hash any URL it already knows and compare), it&#8217;s certainly better than sending the URL to the server every time.</p>
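<p>Here&#8217;s a minimal Python sketch of the normalize-then-hash idea. This is not the code from the Jetpack script; the normalization rules and the names <em>normalize_url</em> and <em>url_digest</em> are assumptions made up for illustration, and a real implementation would likely need more rules.</p>

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit

# Default ports that can be dropped without changing the meaning of a URL.
DEFAULT_PORTS = {"http": 80, "https": 443}


def normalize_url(url):
    """Reduce a URL to one canonical form (illustrative rules only)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    # hostname is already lowercased and excludes any user:password part.
    host = parts.hostname or ""
    # Keep the port only when it is not the default for the scheme.
    if parts.port is not None and parts.port != DEFAULT_PORTS.get(scheme):
        host = "%s:%d" % (host, parts.port)
    path = parts.path or "/"
    # Drop the fragment: it is never sent to the web server anyway.
    return urlunsplit((scheme, host, path, parts.query, ""))


def url_digest(url):
    """Hex SHA-256 digest of the normalized URL, which is all the client sends."""
    return hashlib.sha256(normalize_url(url).encode("utf-8")).hexdigest()
```

<p>With rules like these, different spellings of the same URL (say, with an upper-case host or an explicit default port) map to the same digest, so the server can match them without ever seeing the URL itself.</p>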
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=186</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=186</feedburner:origLink></item>
		<item>
		<title>Walled Gardens?</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/bVl2vYwVALg/</link>
		<comments>http://jeethurao.com/blog/?p=144#comments</comments>
		<pubDate>Tue, 16 Jun 2009 18:00:17 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[feature]]></category>
		<category><![CDATA[social bookmarking]]></category>
		<category><![CDATA[tagz]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=144</guid>
		<description><![CDATA[This probably is the most inappropriately titled post on my blog. Maybe this should have been titled &#8220;Why I think I waste so much time on proggit and hacker news&#8221; or &#8220;PR for my new feature on tagz&#8221;. For a long time I thought that people flock to social news sites to find new links pertinent to their [...]]]></description>
				<content:encoded><![CDATA[<p><span style="font-family: Times;"> </span></p>
<div style="background-image: initial; background-repeat: initial; background-color: #ffffff; margin: 0px;">
<p style="margin: 0px;">This probably is the most inappropriately titled post on my blog. Maybe this should have been titled &#8220;Why I think I waste so much time on proggit and hacker news&#8221; or &#8220;PR for my new feature on tagz&#8221;.</p>
<p style="margin: 0px;">For a long time I thought that people flock to social news sites to find new links pertinent to their interests (programming, compsci, math, economics &#8230; in my case). And going by this model, I thought the utility of the comments on these sites was only marginal compared to the utility provided by the inflow of new and interesting links. Lately I realized that this couldn&#8217;t be further from the truth. I hadn&#8217;t realized that I tend to spend more time reading comments than reading the linked content. Seems like I&#8217;m more interested in what people have to say about the links than the links themselves.</p>
<p style="margin: 0px;">This reminded me of a pain point I&#8217;ve always had with social news sites. Social news sites are kinda like walled gardens. When I&#8217;m reading the comments on one of them, I&#8217;m missing out on a lot of interesting comments on other sites. And there&#8217;s no easy way (with a few clicks) to find discussions on all sites.</p>
<p style="margin: 0px;">Since <a style="color: #551a8b;" href="http://tagz.in/" target="_blank">Tagz</a> was written with the intent of solving things which nagged me the most with social news and bookmarking sites, I decided to annotate all posts on Tagz with links to the comments pages on Delicious, Digg, Hacker News, Reddit and Twitter. Now, there are a few little inconsistencies. Delicious doesn&#8217;t have anything like a comments page, so I link to the URL info page. And due to the use of URL shorteners, there isn&#8217;t a way to directly search Twitter for links, so I link to the Backtweets search results page for the URL.</p>
<p style="margin: 0px;">Initially, I didn&#8217;t want to add more clutter to the main page, so I kept these links only on every post&#8217;s comments and history page. But yesterday, I thought it&#8217;d be a better idea to just include those links on the main page. One of my more marketing oriented friends recommended against having them on the main page, since that&#8217;d likely increase the <a style="color: #551a8b;" href="http://en.wikipedia.org/wiki/Bounce_Rate" target="_blank">bounce rate</a>. Honestly, I don&#8217;t really care about that, and I&#8217;ve always thought convenience takes precedence over everything else, so I just added them anyway.</p>
</div>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=144</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=144</feedburner:origLink></item>
		<item>
		<title>Counting bloom filters in python and javascript</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/8ZD63fo2kOY/</link>
		<comments>http://jeethurao.com/blog/?p=164#comments</comments>
		<pubDate>Tue, 21 Apr 2009 19:13:39 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[datastructure]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=164</guid>
		<description><![CDATA[Continuing the theme of implementing simple datastructures in python and javascript, here&#8217;s a simple counting bloom filter implementation in python and javascript which I&#8217;d written for Tagz. I&#8217;d almost forgotten about this, until a thread today on compsci.reddit reminded me of it. With this implementation, you can build a bloom filter in python and add/remove/lookup elements. You [...]]]></description>
				<content:encoded><![CDATA[<p>Continuing the <a href="http://jeethurao.com/blog/?p=146">theme</a> of implementing simple datastructures in python and javascript, here&#8217;s a simple counting <a href="http://en.wikipedia.org/wiki/Bloom_filter">bloom filter</a> implementation in python and javascript which I&#8217;d written for <a href="http://tagz.in">Tagz</a>. I&#8217;d almost forgotten about this, until a <a href="http://www.reddit.com/r/compsci/comments/8dx9z/ask_compsci_what_is_the_cleverest_data_structure/">thread</a> today on compsci.reddit reminded me of it. With this implementation, you can build a bloom filter in python and add/remove/lookup elements. You can also serialize it to JSON, send it over the wire to a javascript client and use the same filter from javascript code, which can come in handy at times.</p>
<p>Bloom filters are some of the simplest and yet the coolest of all probabilistic datastructures. Basically, they are a set-like datastructure which you train on your dataset and later use to quickly check whether it was trained on a certain value. The cool part is that, unlike hashtables or trees, a bloom filter trained on N elements (say 10,000) doesn&#8217;t actually store any of those elements, so it&#8217;s like a compressed representation of your dataset which can be used to do quick hit testing. But this comes at a price. Bloom filters might lie at times <img src='http://jeethurao.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />  . But they&#8217;re still useful, nonetheless. Every time we query a bloom filter, it&#8217;s like asking it the question &#8220;Were you trained with the value X?&#8221;, to which it replies with a yes or a no (a boolean result). So, there are 3 possibilities.</p>
<ol>
<li>It tells the truth, which is the most likely case and all is fine.</li>
<li>It might say it was trained on X, while it really wasn&#8217;t. (A false positive)</li>
<li>It might say it wasn&#8217;t trained on X, while it actually was trained on it. (A false negative)</li>
</ol>
<p>Ok, so the property which makes bloom filters really useful is that they never give false negatives, so #3 never happens. So, a filter mostly tells the truth, but in the unlikely event that it lies, it always errs on the conservative side. This, combined with the property that they don&#8217;t have to store the actual elements they were trained on, makes bloom filters incredibly useful in a lot of circumstances.</p>
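<p>To make the &#8220;no false negatives&#8221; property concrete, here&#8217;s a minimal sketch of a plain (add-only) bloom filter in Python. It is not the implementation from my repo; the class name <em>TinyBloom</em>, the default sizes and the salted SHA-256 hashing scheme are all assumptions picked to keep the example short.</p>

```python
import hashlib


class TinyBloom:
    """A plain (add-only) Bloom filter over a fixed-size bit array."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            salted = ("%d:%s" % (i, value)).encode("utf-8")
            digest = hashlib.sha256(salted).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        # Training on a value only ever switches bits on.
        for pos in self._positions(value):
            self.bits[pos] = True

    def __contains__(self, value):
        # True may be a false positive; False is always correct, because
        # add() sets every position this check looks at.
        return all(self.bits[pos] for pos in self._positions(value))
```

<p>Since adding a value sets every bit that a later lookup for it will test, a lookup for a trained value can never come back negative. A value that was never added can still find all of its bits set by other values, which is where the false positives come from.</p>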
<p>One place I used them in <a href="http://tagz.in">Tagz</a> is to check if a user has voted on a particular link. If the filter replies affirmatively then we can go query the backend (DB, Memcached, whatever) to get the actual score for the vote. Assuming that the bloom filter lies 5% of the time, we still eliminate about 95% of the backend queries for links the user hasn&#8217;t voted on. And the best part is that the rate at which it lies can be tuned. It&#8217;s really a compromise between the size of the filter and the rate at which it lies. Simply put, the bigger a bloom filter, the less it lies. So, the constructor for the <a href="http://bitbucket.org/woadwarrior/bloom/src/tip/bloom.py#cl-191">SimpleBloomFilter</a> class in my implementation takes 2 parameters, capacity and err.</p>
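<p>To get a feel for the size versus error rate trade-off, here&#8217;s a small sketch using the standard textbook sizing formulas for a plain bloom filter. I&#8217;m not claiming this is exactly what SimpleBloomFilter computes internally; the function name <em>bloom_parameters</em> is made up for this example.</p>

```python
import math


def bloom_parameters(capacity, err):
    """Textbook sizing for a plain Bloom filter.

    capacity: number of elements the filter is expected to hold (n)
    err: target false positive rate (p), strictly between 0 and 1
    Returns (num_bits, num_hashes).
    """
    # m = -n * ln(p) / (ln 2)^2
    num_bits = math.ceil(-capacity * math.log(err) / (math.log(2) ** 2))
    # k = (m / n) * ln 2, rounded to a whole number of hash functions
    num_hashes = max(1, round(num_bits / capacity * math.log(2)))
    return num_bits, num_hashes
```

<p>For 10,000 elements at a 5% error rate this works out to roughly 62,000 bits (under 8 KB) and 4 hash functions. Asking for a lower error rate grows the filter; a counting filter needs more space again, since each slot holds a counter instead of a single bit.</p>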
<p>Another application for bloom filters is to synchronize stuff. Let&#8217;s say you have to update a lot of items with certain keys between two machines A &amp; B over a slow connection, where the two machines share quite a few items between them. So A, instead of sending all the keys it has to B, can build a compact bloom filter out of the keys it has and send that to B. B can then test its item set against the bloom filter and send only the items it doesn&#8217;t find in it. The elaborate setup I just described is called a Bloom Join.</p>
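<p>The exchange between A and B can be sketched roughly like this. Everything here (the function names, the filter size, and representing the filter as a set of bit positions) is an assumption made to keep the example self-contained; it&#8217;s not code from the repo.</p>

```python
import hashlib

NUM_BITS = 4096    # size of the filter A sends over the wire (arbitrary)
NUM_HASHES = 4


def positions(key):
    # Bit positions for a key, derived from index-salted SHA-256 digests.
    for i in range(NUM_HASHES):
        digest = hashlib.sha256(("%d:%s" % (i, key)).encode("utf-8")).digest()
        yield int.from_bytes(digest[:8], "big") % NUM_BITS


def build_filter(keys):
    """Machine A: summarize its keys as the set of bit positions that are on."""
    bits = set()
    for key in keys:
        bits.update(positions(key))
    return bits


def might_contain(bits, key):
    return all(pos in bits for pos in positions(key))


def items_to_send(remote_filter, local_items):
    """Machine B: keep only the items whose keys A's filter does not contain."""
    return {key: value for key, value in local_items.items()
            if not might_contain(remote_filter, key)}
```

<p>One caveat with this scheme: a false positive means B wrongly assumes A already has an item and skips it, so the error rate has to be low enough for the application, or a follow-up pass has to catch the stragglers.</p>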
<p>One more thing, this is a variant of a bloom filter called a counting bloom filter. Plain old bloom filters are add-only structures. You can train one on an element, but you can&#8217;t untrain it on something which it&#8217;s already been trained on. Counting bloom filters gain the ability to be untrained on elements by consuming a little more space. Also, I noticed that the data in the filter tends to be pretty sparse, so I do some simple RLE compression on it before serializing it to JSON, which makes it a lot more compact.</p>
<p>That concludes my watered down explanation of a counting bloom filter. The Wikipedia <a href="http://en.wikipedia.org/wiki/Bloom_filter">link</a> I&#8217;d mentioned in the beginning has a much better explanation of bloom filters. As usual, the code can be found on <a href="http://bitbucket.org/woadwarrior/bloom/">bitbucket</a>. All the code is MIT Licensed except for the 3rd party <a href="http://bitbucket.org/woadwarrior/bloom/src/tip/js/sha256.js">sha256.js</a> javascript module from <a href="http://www.bichlmeier.info/sha256.html">Christoph Bichlmeier</a>.</p>
<p>PS: What&#8217;s next? <a href="http://en.wikipedia.org/wiki/BK-tree">BK-Trees</a>, anyone?</p>
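<p>Here&#8217;s a rough sketch of both ideas: counters instead of bits, and run-length encoding of the sparse counter array. The names <em>TinyCountingBloom</em> and <em>rle_encode</em>, as well as the [value, run length] pair format, are assumptions for illustration and not necessarily what the code in the repo does.</p>

```python
import hashlib


class TinyCountingBloom:
    """Counting Bloom filter: every slot is a small counter, not a single bit."""

    def __init__(self, num_slots=1024, num_hashes=4):
        self.num_slots = num_slots
        self.num_hashes = num_hashes
        self.counts = [0] * num_slots

    def _positions(self, value):
        for i in range(self.num_hashes):
            salted = ("%d:%s" % (i, value)).encode("utf-8")
            digest = hashlib.sha256(salted).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_slots

    def add(self, value):
        for pos in self._positions(value):
            self.counts[pos] += 1

    def remove(self, value):
        # Refuse to untrain something the filter says it has never seen.
        if value not in self:
            raise KeyError(value)
        for pos in self._positions(value):
            self.counts[pos] -= 1

    def __contains__(self, value):
        # A slot counts as "on" while its counter is non-zero.
        return all(self.counts[pos] for pos in self._positions(value))


def rle_encode(counts):
    """Run-length encode a list of counters as [value, run_length] pairs."""
    runs = []
    for count in counts:
        if runs and runs[-1][0] == count:
            runs[-1][1] += 1
        else:
            runs.append([count, 1])
    return runs
```

<p>A freshly created filter is all zeros, so it encodes to a single run, and a sparsely populated one stays short because the long stretches of zeros between the used slots collapse into one pair each. The list of pairs can be passed straight to json.dumps.</p>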
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=164</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=164</feedburner:origLink></item>
		<item>
		<title>A Mochikit style Dombuilder for YUI</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/UokrylqeL6A/</link>
		<comments>http://jeethurao.com/blog/?p=156#comments</comments>
		<pubDate>Mon, 13 Apr 2009 15:58:43 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=156</guid>
		<description><![CDATA[Before moving to YUI about a year ago, I was using Mochikit as my primary JS library. As advertised, Mochikit happens to be one of the most pythonic javascript libraries ever. One of the sweetest parts of Mochikit IMO has been Mochikit.DOM. This is something which I&#8217;ve always missed with YUI. innerHTML is fast, but icky [...]]]></description>
				<content:encoded><![CDATA[<p>Before moving to <a href="http://developer.yahoo.com/yui/">YUI</a> about a year ago, I was using <a href="http://mochikit.com/">Mochikit</a> as my primary JS library. As advertised, Mochikit happens to be one of the most pythonic javascript libraries ever. One of the sweetest parts of Mochikit IMO has been <a href="http://mochikit.com/doc/html/MochiKit/DOM.html">Mochikit.DOM</a>. This is something which I&#8217;ve always missed with YUI. <a href="https://developer.mozilla.org/En/DOM/Element.innerHTML">innerHTML</a> is fast, but icky and it feels a little inelegant. So, I ended up writing something like Mochikit.DOM for YUI while writing <a href="http://tagz.in">Tagz</a>. Thought it might be useful to others as well. So, here&#8217;s the <a href="http://bitbucket.org/woadwarrior/dombuilder/">mercurial repo</a> with the code.</p>
<p>The <a href="http://bitbucket.org/woadwarrior/dombuilder/src/tip/utils.js">utils.js</a> file contains some utility functions like forEach, map, filter, partial. The only function from utils.js used in dombuilder.js is partial, so you might want to add it to dombuilder.js to remove the dependence on utils.js.</p>
<p>Here&#8217;s an obligatory trivial example (included in the repo).<br />
<script src="http://gist.github.com/94499.js"></script></p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=156</wfw:commentRss>
		<slash:comments>6</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=156</feedburner:origLink></item>
		<item>
		<title>Tries and Ternary Search Trees in Python and Javascript</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/aN8kWAWKCAE/</link>
		<comments>http://jeethurao.com/blog/?p=146#comments</comments>
		<pubDate>Sat, 11 Apr 2009 09:07:50 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Javascript]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[datastructure]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=146</guid>
		<description><![CDATA[There are a couple of places in Tagz where I needed efficient prefix matching. The most obvious way to do this is to use a Trie or a Ternary Search Tree. So, I ended up implementing both in Python. I&#8217;ve had this stuff lying around in my mercurial repo for Tagz for quite some [...]]]></description>
				<content:encoded><![CDATA[<p>There are a couple of places in <a href="http://tagz.in/">Tagz</a> where I needed efficient prefix matching. The most obvious way to do this is to use a <a href="http://en.wikipedia.org/wiki/Trie">Trie</a> or a <a href="http://en.wikipedia.org/wiki/Ternary_search_tries">Ternary Search Tree</a>. So, I ended up implementing both in Python. I&#8217;ve had this stuff lying around in my mercurial repo for Tagz for quite some time now. I just thought of releasing it today.</p>
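<p>For anyone who hasn&#8217;t seen one, prefix matching with a trie looks roughly like this. This is a bare-bones sketch and not the module from the repo; the class layout and the method names <em>insert</em> and <em>with_prefix</em> are assumptions made for this example.</p>

```python
class Trie:
    """A minimal dict-based trie supporting insertion and prefix lookup."""

    def __init__(self):
        self.children = {}    # maps a character to the child Trie node
        self.is_word = False  # True if a stored word ends at this node

    def insert(self, word):
        node = self
        for ch in word:
            node = node.children.setdefault(ch, Trie())
        node.is_word = True

    def with_prefix(self, prefix):
        """Return all stored words starting with prefix, sorted."""
        # Walk down to the node for the prefix, if there is one.
        node = self
        for ch in prefix:
            node = node.children.get(ch)
            if node is None:
                return []
        # Collect every word in the subtree below that node.
        results = []
        stack = [(node, prefix)]
        while stack:
            current, text = stack.pop()
            if current.is_word:
                results.append(text)
            for ch, child in current.children.items():
                stack.append((child, text + ch))
        return sorted(results)
```

<p>Finding the subtree for a prefix takes time proportional to the length of the prefix, regardless of how many words are stored, which is what makes tries a good fit for things like tag autocompletion.</p>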
<p>Interestingly, some crude benchmarks indicate that the trie implementation is more efficient than the ternary search tree implementation in terms of both speed and space (look for test.py in the repo). I also ported the trie module to javascript, and it&#8217;s included in the repo.</p>
<p>Here are two GraphViz graphs visualizing both the structures, generated using the excellent <a href="http://software.inl.fr/trac/wiki/GvGen">GvGen</a> library.
<a href='http://jeethurao.com/blog/?attachment_id=147' title='Trie'><img width="150" height="150" src="http://jeethurao.com/blog/wp-content/uploads/2009/04/trie-150x150.png" class="attachment-thumbnail" alt="Trie" /></a>
<a href='http://jeethurao.com/blog/?attachment_id=148' title='Ternary Search Tree'><img width="150" height="150" src="http://jeethurao.com/blog/wp-content/uploads/2009/04/tst-150x150.png" class="attachment-thumbnail" alt="Ternary Search Tree" /></a>
</p>
<p>In the Ternary Search Tree graph, Blue vertices = left, Green vertices = middle, Red vertices = right.</p>
<p>The code can be found <a href="http://bitbucket.org/woadwarrior/trie/">here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=146</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=146</feedburner:origLink></item>
		<item>
		<title>Using redis</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/kMWtbqFo4RI/</link>
		<comments>http://jeethurao.com/blog/?p=142#comments</comments>
		<pubDate>Mon, 06 Apr 2009 17:04:24 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[programming]]></category>
		<category><![CDATA[Python]]></category>
		<category><![CDATA[redis]]></category>
		<category><![CDATA[tagz]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=142</guid>
		<description><![CDATA[I&#8217;ve been using memcached for all the caching on Tagz. Redis is a relatively new key value database which covers a superset of memcached&#8217;s functionality. One of the biggest problems I&#8217;ve had with memcached (actually it has nothing to do with memcached) is that whenever I store a large datastructure on memcached, deserializing (unpickling) it takes [...]]]></description>
				<content:encoded><![CDATA[<p>I&#8217;ve been using <a href="http://www.danga.com/memcached/">memcached</a> for all the caching on Tagz. <a href="http://code.google.com/p/redis/">Redis</a> is a relatively new key value database which covers a superset of memcached&#8217;s functionality. One of the biggest problems I&#8217;ve had with memcached (actually it has nothing to do with memcached) is that whenever I store a large datastructure on memcached, deserializing (unpickling) it takes quite a while (only a couple of milliseconds, but it still counts).</p>
<p>This happens whenever I end up storing large lists or dictionaries on memcached. Redis solves this problem by providing list and set primitives besides plain old strings, so such data doesn&#8217;t have to be pickled and unpickled as one big blob. Also, the list primitive supports atomic push/pop commands, which could be used to implement efficient queues. This, along with the on-disk persistence feature, solves another problem I have with <a href="http://xph.us/software/beanstalkd/">beanstalkd</a>, which is the lack of persistence.</p>
<p>Overall, this seems like a great solution to quite a few tiny little problems I have with performance on Tagz. I&#8217;m planning on playing with redis a little more tonight and if all goes well, I&#8217;ll shift from memcached to redis.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=142</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=142</feedburner:origLink></item>
		<item>
		<title>A change in direction for Tagz</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/csBq-YEBZg8/</link>
		<comments>http://jeethurao.com/blog/?p=136#comments</comments>
		<pubDate>Wed, 01 Apr 2009 11:24:59 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[change]]></category>
		<category><![CDATA[direction]]></category>
		<category><![CDATA[tagz]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=136</guid>
		<description><![CDATA[Tagz has come a long way since I launched it last September. Something which began as a clean room django application has been accumulating a lot of cruft. One patch at a time, it&#8217;s turned itself into an unmaintainable mess of a codebase. In retrospect, I feel Python and Postgres weren&#8217;t really the best choices [...]]]></description>
				<content:encoded><![CDATA[<p>Tagz has come a long way since I launched it last September. Something which began as a clean room django application has been accumulating a lot of cruft. One patch at a time, it&#8217;s turned itself into an unmaintainable mess of a codebase.</p>
<p>In retrospect, I feel Python and Postgres weren&#8217;t really the best choices I made for writing Tagz. I believe Tagz would be better written in PHP with MySQL as the DB. I&#8217;ve come to learn the hard way that Django with Postgresql can&#8217;t quite match the blazing speeds possible using raw PHP with MySQL (and MyISAM DBs).</p>
<p>Starting today, I&#8217;ve decided on stopping all development on the current code base of Tagz. I&#8217;ve begun a rewrite of Tagz in PHP. The current <strong>users may rest assured</strong>, since backwards compatibility is an important goal for this rewrite. I&#8217;m hoping to finish the rewrite in less than a month. I&#8217;m expecting the transition to be a smooth one.</p>
<p>Finally, thanks to all the current users (no thanks to all the spammers) for all the encouragement and the feature requests, without which Tagz would never have come close to what it is today.</p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=136</wfw:commentRss>
		<slash:comments>5</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=136</feedburner:origLink></item>
		<item>
		<title>Does HTML validation really matter ?</title>
		<link>http://feedproxy.google.com/~r/JeethusBlog/~3/KgG37HmT5rg/</link>
		<comments>http://jeethurao.com/blog/?p=133#comments</comments>
		<pubDate>Sat, 07 Mar 2009 04:05:43 +0000</pubDate>
		<dc:creator>Jeethu</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[html]]></category>
		<category><![CDATA[validation]]></category>

		<guid isPermaLink="false">http://jeethurao.com/blog/?p=133</guid>
		<description><![CDATA[Over at Codinghorror, Jeff Atwood ponders if making your pages W3C compliant is really worth all the effort. I&#8217;m sure just about everyone who has written more than a couple of html pages has thought about this. As a programmer, I find writing html and css to be a real chore. My productivity drops very [...]]]></description>
				<content:encoded><![CDATA[<p>Over at Codinghorror, Jeff Atwood <a href="http://www.codinghorror.com/blog/archives/001234.html">ponders</a> if making your pages W3C compliant is really worth all the effort. I&#8217;m sure just about everyone who has written more than a couple of html pages has thought about this. As a programmer, I find writing html and css to be a real chore. My productivity drops very close to zero whenever I&#8217;ve got to write html. But when you get down to doing something as menial (from a lazy developer&#8217;s perspective) as writing html, how much harder is it to make it validate ?</p>
<p>But then, no fallacious argument is complete without a strawman or two. Start with more than a dozen sites (including two of his own) which don&#8217;t validate. Brilliant, capitalize on the <a href="http://en.wikipedia.org/wiki/Bandwagon_effect">bandwagon effect</a>. Which is kinda like saying &#8220;Those big name sites made it big without having valid (x)html. Ipso facto, your site stands a better chance of making it big by not writing valid (x)html&#8221;.  The second one&#8217;s a little better. With HTML 4.01 Strict, you can&#8217;t have a target attribute for an anchor tag. So, something like:</p>
<pre>&lt;a href="http://www.example.com/" target="_blank"&gt;foo&lt;/a&gt;</pre>
<p>Doesn&#8217;t validate. Now, I&#8217;m sure everybody these days uses some JS library (for those who don&#8217;t, there&#8217;s always getElementsByClassName). Apparently, stackoverflow uses jQuery. So, here&#8217;s a solution. Start with</p>
<pre>&lt;a href="http://www.example.com/" class="external"&gt;foo&lt;/a&gt;</pre>
<p>Or something like it. Then do something like</p>
<pre>$("a.external").attr("target","_blank");</pre>
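<p>For those who don&#8217;t use a library, roughly the same thing can be sketched with plain DOM calls. This is only a sketch: the function name is made up, the <code>external</code> class is the one from the example above, and it assumes a browser that supports getElementsByClassName.</p>

```javascript
// Sketch: plain-DOM equivalent of the jQuery one-liner above.
// The function name is an illustrative assumption, not a library API.
function openExternalLinksInNewWindow(root) {
    // getElementsByClassName returns an array-like collection of elements
    var links = root.getElementsByClassName("external");
    Array.prototype.forEach.call(links, function (link) {
        // Only anchors should get a target attribute
        if (String(link.tagName).toLowerCase() === "a") {
            link.setAttribute("target", "_blank");
        }
    });
    // Number of elements carrying the class (anchors or not)
    return links.length;
}

// Only runs in a browser, where `document` exists
if (typeof document !== "undefined") {
    openExternalLinksInNewWindow(document);
}
```

<p>The markup that ships to the validator stays free of the target attribute; it only gets added once the script runs in the browser.</p>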
<p>Yeah, yeah, I know it&#8217;s a dirty hack, but heck, it buys you compliance. For what price ? One extra line of Javascript. And the whole point of writing valid markup is mostly to make your pages easier to parse for all the programs and bots which crawl/fetch them and, AFAIK, most of them don&#8217;t grok Javascript.  How hard is it to write</p>
<pre>&lt;td style="width:80px"&gt;</pre>
<p>as opposed to</p>
<pre>&lt;td width=80&gt;</pre>
<p>Fortunately, you don&#8217;t have to compile html, so you don&#8217;t have a compiler which barks at you and also, the browsers are lax enough to consume almost any kludge that you throw at them.</p>
<p>The bottom line: hell yeah, it&#8217;s worth the effort, and how hard is it, really ? Just because your <a href="http://stackoverflow.com/">web site</a> doesn&#8217;t validate, it doesn&#8217;t mean that you can brush it off. All I care about is that mine <a href="http://validator.w3.org/check?uri=http%3A%2F%2Ftagz.in%2F">does</a> and everyone&#8217;s should, at least in theory <img src='http://jeethurao.com/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>PS:</p>
<p>1: In all honesty, my blog isn&#8217;t w3c validator compliant, thanks to the wordpress <a href="http://wordpress.org/extend/plugins/syntaxhighlighter/">plugin</a> that I use for syntax highlighting source code.</p>
<p>2: Accessibility matters as well. We web devs keep bitching about having to support IE6 (I do it as well). How many of our web 2.0 apps work with Javascript and css disabled or on <a href="http://en.wikipedia.org/wiki/Lynx_(web_browser)">lynx</a> or <a href="http://links.sourceforge.net/">links</a> ?</p>
]]></content:encoded>
			<wfw:commentRss>http://jeethurao.com/blog/?feed=rss2&amp;p=133</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://jeethurao.com/blog/?p=133</feedburner:origLink></item>
	</channel>
</rss>
