<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" xml:lang="en">
 <id>http://bionicspirit.com/</id>
 <title>Bionic Spirit</title>
 
 <link href="http://bionicspirit.com/" />
 <updated>2012-03-26T11:56:00+03:00</updated>

 <author>
   <name>Bionic Spirit</name>
   <email>contact@bionicspirit.com</email>
   <uri>http://bionicspirit.com</uri>
 </author>

 
 
 <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/bionicspirit" /><feedburner:info uri="bionicspirit" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><logo>http://bionicspirit.com/public/images/android-avatar-200.jpg</logo><feedburner:emailServiceId>bionicspirit</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><entry>
   <title>How To Build a Naive Bayes Classifier</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/1keilCGHjPs/howto-build-naive-bayes-classifier.html" />
   <updated>2012-02-09T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2012/02/09/howto-build-naive-bayes-classifier</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2012 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Algorithms" label="Algorithms" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Programming" label="Programming" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Mining" label="Mining" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Ruby" label="Ruby" />
   

   <content type="html">&lt;p&gt;Some use-cases for building a classifier:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Spam detection, for example you could build your own
&lt;a href="http://akismet.com/"&gt;Akismet&lt;/a&gt; API&lt;/li&gt;
&lt;li&gt;Automatic assignment of categories to a set of items&lt;/li&gt;
&lt;li&gt;Automatic detection of the primary language (e.g. Google Translate)&lt;/li&gt;
&lt;li&gt;Sentiment analysis, which in simple terms means discovering whether
an opinion about a certain topic is positive or negative&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In general you can do a lot better with more specialized techniques.
However, the Naive Bayes classifier is general-purpose, simple to
implement and good enough for most applications. And while other
algorithms give better accuracy, I have found that better data
combined with an algorithm you can tweak gives better results
for less effort.&lt;/p&gt;

&lt;p&gt;In this article I'm describing the math behind it. Don't fear the
math, as it is simple enough for a high-schooler to understand. And
even though there are a lot of libraries out there that already do
this, you're far better off understanding the concept behind it,
otherwise you won't be able to tweak the implementation in response to
your needs.&lt;/p&gt;

&lt;h2&gt;0. The Source Code&lt;/h2&gt;

&lt;p&gt;I published the associated source code at
&lt;a href="https://github.com/alexandru/stuff-classifier"&gt;github.com/alexandru/stuff-classifier&lt;/a&gt;. The
implementation itself is at
&lt;a href="https://github.com/alexandru/stuff-classifier/blob/master/lib/stuff-classifier/bayes.rb"&gt;lib/stuff-classifier/bayes.rb&lt;/a&gt;,
with the corresponding tests at
&lt;a href="https://github.com/alexandru/stuff-classifier/blob/master/test/test_002_naive_bayes.rb"&gt;test/test_002_naive_bayes.rb&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;1. Introduction to Probabilities&lt;/h2&gt;

&lt;p&gt;Let's start by refreshing forgotten knowledge. Again, this is very
basic stuff, but if you can't follow the theory here, you can always
go to the
&lt;a href="http://www.khanacademy.org/#probability"&gt;probabilities section on khanacademy.org&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;1.1. Events and Event Types&lt;/h3&gt;

&lt;p&gt;An "event" is a set of outcomes (a subset of all possible outcomes)
with a probability attached. So when flipping a coin, we can have one
of these 2 events happening: tail or head. Each of them has a
probability of 50%. Using a Venn diagram, this would look like this:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/coin-flip.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;p&gt;And another example which clearly shows the &lt;em&gt;dependence&lt;/em&gt; between
"rain" and "cloud formation", as raining can only happen if there are
clouds:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/inclusive.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;p&gt;The relationship between events is very important, as you'll see next:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;2 events are &lt;strong&gt;disjoint (exclusive)&lt;/strong&gt; if they can't happen at the same
time (a single coin flip cannot yield a tail and a head at the same
time). For Bayes classification, we are not concerned with disjoint
events.&lt;/li&gt;
&lt;li&gt;2 events are &lt;strong&gt;independent&lt;/strong&gt; when they can happen at the same time,
but the occurrence of one event does not make the occurrence of
another more or less probable. For example the second coin-flip you
make is not affected by the outcome of the first coin-flip.&lt;/li&gt;
&lt;li&gt;2 events are &lt;strong&gt;dependent&lt;/strong&gt; if the outcome of one affects the other. In
the example above, clearly it cannot rain without a cloud
formation. Also, in a horse race, some horses have better
performance on rainy days.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;What we are concerned with here is the difference between dependent and
independent events, because calculating the intersection (both
happening at the same time) depends on it. For independent events,
calculating the intersection is easy:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/independent-events-intersection.png"&gt;&lt;/p&gt;

&lt;p&gt;Some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;if you have 2 hard-drives, each of them having a 0.3 (30%)
probability of failure within the next year, that means there's a
0.09 (9%) probability of both of them failing within the next year&lt;/li&gt;
&lt;li&gt;if you flip a coin 4 times, there's a 0.0625 probability of getting
a tail 4 times in a row (0.5 ^ 4)&lt;/li&gt;
&lt;/ul&gt;
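
&lt;p&gt;As a quick sanity check, the two examples above can be reproduced in a
few lines of Ruby. This is only an illustrative sketch; the helper name
&lt;code&gt;joint_probability&lt;/code&gt; is made up for this example and is not part of
stuff-classifier:&lt;/p&gt;

```ruby
# Hypothetical helper: probability that all of the given
# independent events happen together (product of the probabilities).
def joint_probability(probabilities)
  probabilities.inject(1.0) { |product, prob| product * prob }
end

both_drives_fail = joint_probability([0.3, 0.3])
four_tails       = joint_probability([0.5] * 4)

puts both_drives_fail.round(4)  # 0.09
puts four_tails                 # 0.0625
```

&lt;p&gt;Keep in mind that multiplying like this is only valid when the events
really are independent.&lt;/p&gt;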


&lt;p&gt;Things are not so simple for dependent events, which is where the
Bayes Theorem comes into play.&lt;/p&gt;

&lt;h3&gt;1.2. Conditional Probabilities and The Bayes Theorem&lt;/h3&gt;

&lt;p&gt;Let's take one example. So we have the following stats:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;30 emails out of a total of 74 are spam messages&lt;/li&gt;
&lt;li&gt;51 emails out of those 74 contain the word "penis"&lt;/li&gt;
&lt;li&gt;20 emails containing the word "penis" have been marked as spam&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So the question is: what is the probability that the latest received
email is a spam message, given that it contains the word "penis"?&lt;/p&gt;

&lt;p&gt;So these 2 events are clearly dependent, which is why you must use the
simple form of the Bayes Theorem:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/conditional-prob.png"&gt;&lt;/p&gt;

&lt;p&gt;With the solution being:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/spam-simple-bayes.png"&gt;&lt;/p&gt;

&lt;p&gt;This was a simple one: you could read the result (20 out of 51)
directly from the counts, without bothering with the Bayes formula.&lt;/p&gt;
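
&lt;p&gt;For those who prefer code to formulas, here is a small sketch of the
same calculation. It assumes the usual form of the theorem, P(spam | word) =
P(word | spam) * P(spam) / P(word); the variable names are mine, chosen only
for this illustration:&lt;/p&gt;

```ruby
# Counts taken from the example above.
total_emails     = 74
spam_emails      = 30
emails_with_word = 51  # emails containing the word
spam_with_word   = 20  # emails containing the word that are also spam

# Rational keeps the arithmetic exact.
p_spam            = Rational(spam_emails, total_emails)
p_word            = Rational(emails_with_word, total_emails)
p_word_given_spam = Rational(spam_with_word, spam_emails)

# Bayes: P(spam | word) = P(word | spam) * P(spam) / P(word)
p_spam_given_word = p_word_given_spam * p_spam / p_word

puts p_spam_given_word                # 20/51
puts p_spam_given_word.to_f.round(3)  # 0.392
```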

&lt;h3&gt;1.3. The Naive Bayes Approach&lt;/h3&gt;

&lt;p&gt;Let us complicate the problem above by adding to it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;25 emails out of the total contain the word "viagra"&lt;/li&gt;
&lt;li&gt;24 emails out of those have been marked as spam&lt;/li&gt;
&lt;li&gt;so what's the probability that an email is spam, given that it
contains both "viagra" and "penis"?&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Shit just got more complicated, because now the formula is this one:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/spam-multiple-bayes.png"&gt;&lt;/p&gt;

&lt;p&gt;And you definitely don't want to bother with it if we keep adding
words. But what if we simplified our assumptions and just said that the
occurrence of &lt;em&gt;penis&lt;/em&gt; is totally independent from the occurrence of
&lt;em&gt;viagra&lt;/em&gt;? Then the formula becomes much simpler:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/spam-multiple-bayes-naive.png"&gt;&lt;/p&gt;

&lt;p&gt;To classify an email as spam, you'll have to calculate the conditional
probability by taking hints from the words contained. And the Naive
Bayes approach is exactly what I described above: we make the
assumption that the occurrence of one word is totally unrelated to the
occurrence of another, to simplify the processing and complexity
involved.&lt;/p&gt;

&lt;p&gt;This does highlight the flaw of this method of classification, because
clearly those 2 events we've picked (viagra and penis) are correlated
and our assumption is wrong. But this just means our results will be
less accurate.&lt;/p&gt;
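
&lt;p&gt;To make the naive estimate concrete, here is a rough sketch using the
counts from the example. It assumes the simplified formula has the form
P(spam | w1, w2) = P(w1 | spam) * P(w2 | spam) * P(spam) / (P(w1) * P(w2));
that reading of the formula and the variable names are my assumptions for
this illustration:&lt;/p&gt;

```ruby
total = 74
spam  = 30

# One pair per word: [emails containing the word, of which marked as spam]
word_counts = [
  [51, 20],  # "penis"
  [25, 24]   # "viagra"
]

numerator   = Rational(spam, total)  # starts as P(spam)
denominator = Rational(1)

word_counts.each do |containing, containing_and_spam|
  numerator   *= Rational(containing_and_spam, spam)  # P(word | spam)
  denominator *= Rational(containing, total)          # P(word)
end

estimate = numerator / denominator

puts estimate                # 1184/1275
puts estimate.to_f.round(3)  # 0.929
```

&lt;p&gt;Because the independence assumption does not really hold, treat the
number as a score rather than an exact probability.&lt;/p&gt;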

&lt;h2&gt;2. Implementation&lt;/h2&gt;

&lt;p&gt;Again, you can take a look at the source code published
at
&lt;a href="https://github.com/alexandru/stuff-classifier/"&gt;github.com/alexandru/stuff-classifier&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;2.1. General Algorithm&lt;/h3&gt;

&lt;p&gt;You simply get the probability for a text to belong to each of the
categories you test against. The category with the highest probability
for the given text wins:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/bayes-classifier-formula.png"&gt;&lt;/p&gt;

&lt;p&gt;Do note that above I also eliminated the &lt;em&gt;denominator&lt;/em&gt; (called the
&lt;em&gt;evidence&lt;/em&gt;) from our original formula. It is the same for every
category, so it does not change which category wins.&lt;/p&gt;
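
&lt;p&gt;A minimal sketch of this scoring rule follows. The probabilities are
hand-picked, hypothetical numbers, and the &lt;code&gt;score&lt;/code&gt; and
&lt;code&gt;best_category&lt;/code&gt; helpers are made up for this illustration; the
real implementation in stuff-classifier derives its probabilities from
training data:&lt;/p&gt;

```ruby
# Hypothetical, hand-picked probabilities for illustration only.
priors = { spam: 0.4, ham: 0.6 }
likelihoods = {
  spam: { viagra: 0.30, meeting: 0.02 },
  ham:  { viagra: 0.01, meeting: 0.20 }
}

# Score = P(category) * product of P(word | category).
# The shared denominator (the evidence) is left out on purpose.
def score(category, words, priors, likelihoods)
  words.inject(priors.fetch(category)) do |acc, word|
    acc * likelihoods.fetch(category).fetch(word)
  end
end

# The category with the highest score wins.
def best_category(words, priors, likelihoods)
  priors.keys.max_by { |category| score(category, words, priors, likelihoods) }
end

puts best_category([:viagra], priors, likelihoods)             # spam
puts best_category([:meeting, :meeting], priors, likelihoods)  # ham
```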

&lt;h3&gt;2.2. Avoiding Floating Point Underflow (UPDATE Feb 27, 2012)&lt;/h3&gt;

&lt;p&gt;Because of the underlying limits of floating-point numbers, if you're
working with big documents (not the case in this example), you have to
make one important change to the above formula:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;instead of the probabilities of each word, you store the (natural)
logarithms of those probabilities&lt;/li&gt;
&lt;li&gt;instead of multiplying the probabilities, you add their logarithms&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So instead of the above formula, if you need this optimization, then
use this one:&lt;/p&gt;

&lt;p&gt;&lt;img src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/bayes-logarithms.png"&gt;&lt;/p&gt;

&lt;h3&gt;2.3. Training&lt;/h3&gt;

&lt;p&gt;Your implementation must have a training method. Here's what mine looks like:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;train&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;each_word&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;increment_word&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;increment_cat&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;category&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;And its usage:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt; &lt;span class="ss"&gt;:spam&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Grow your penis to 20 inches in just 1 week&amp;quot;&lt;/span&gt;
&lt;span class="n"&gt;classifier&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;train&lt;/span&gt; &lt;span class="ss"&gt;:ham&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  &lt;span class="s2"&gt;&amp;quot;I&amp;#39;m hungry, no I don&amp;#39;t want your penis&amp;quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;For the full implementation, take a look at
&lt;a href="https://github.com/alexandru/stuff-classifier/blob/master/lib/stuff-classifier/base.rb"&gt;base.rb&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;2.4. Getting Rid of Stop Words / Stemming&lt;/h3&gt;

&lt;p&gt;First of all, you must get rid of the junk. Every language has words
that are so commonly used that they are meaningless for any kind of
classification you may want to do. For instance in English you have
words such as "the", "to", "you", "he", "only", "if", "it" that you
can safely strip out from the text.&lt;/p&gt;

&lt;p&gt;I've compiled a list of such words in this file:
&lt;a href="https://github.com/alexandru/stuff-classifier/blob/master/lib/stuff-classifier/stop_words.rb"&gt;stop_words.rb&lt;/a&gt;. You
can compile such a list by yourself if you're not using English for
example. Head over to &lt;a href="http://www.gutenberg.org/"&gt;Project Gutenberg&lt;/a&gt;,
download some books in the language you want, count the words in them,
sort by popularity in descending order and keep the top words as words
that you can safely ignore.&lt;/p&gt;
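
&lt;p&gt;The counting step could be sketched like this. The
&lt;code&gt;top_words&lt;/code&gt; helper and the sample sentence are made up for this
illustration; in practice you would feed it the text of several whole books
and keep a much longer list:&lt;/p&gt;

```ruby
# Hypothetical helper: returns the `limit` most frequent words in `text`,
# most frequent first (ties are broken alphabetically).
def top_words(text, limit)
  counts = Hash.new(0)
  # [[:alpha:]] matches letters, including non-English ones.
  text.downcase.scan(/[[:alpha:]']+/) { |word| counts[word] += 1 }
  counts.sort_by { |word, count| [-count, word] }
        .first(limit)
        .map { |word, _count| word }
end

sample = "The cat and the dog and the bird"
p top_words(sample, 2)  # ["the", "and"]
```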

&lt;p&gt;Also, our classifier is really dumb in the sense that it does not care
about the meaning or context of a word. So there's a problem: consider
the word "running". What you want is to treat this just as "run",
which is the morphological root of the word. You also want to treat
"parenting" and "parents" as "parent".&lt;/p&gt;

&lt;p&gt;This process is called &lt;em&gt;stemming&lt;/em&gt; and there are lots of libraries for
it. I think currently the most up-to-date and comprehensive library
for stemming is Snowball. It's a C library with lots of bindings
available, including for Ruby and Python and it even has support for
my native language (Romanian).&lt;/p&gt;

&lt;p&gt;Take a look at what I'm doing in
&lt;a href="https://github.com/alexandru/stuff-classifier/blob/master/lib/stuff-classifier/tokenizer.rb"&gt;tokenizer.rb&lt;/a&gt;,
where I'm getting rid of stop words and stemming what remains.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="n"&gt;each_word&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Hello world! How are you?&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# =&amp;gt; [&amp;quot;hello&amp;quot;, &amp;quot;world&amp;quot;]&lt;/span&gt;

&lt;span class="n"&gt;each_word&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Lots of dogs, lots of cats! &lt;/span&gt;
&lt;span class="s1"&gt;  This is the information highway&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# =&amp;gt; [&amp;quot;lot&amp;quot;, &amp;quot;dog&amp;quot;, &amp;quot;lot&amp;quot;, &amp;quot;cat&amp;quot;, &amp;quot;inform&amp;quot;, &amp;quot;highwai&amp;quot;]&lt;/span&gt;

&lt;span class="n"&gt;each_word&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;I don&amp;#39;t really get what you want to&lt;/span&gt;
&lt;span class="s2"&gt;  accomplish. There is a class TestEval2, you can do test_eval2 =&lt;/span&gt;
&lt;span class="s2"&gt;  TestEval2.new afterwards. And: class A ... end always yields nil, so&lt;/span&gt;
&lt;span class="s2"&gt;  your output is ok I guess ;-)&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c1"&gt;# =&amp;gt; [&amp;quot;really&amp;quot;, &amp;quot;want&amp;quot;, &amp;quot;accomplish&amp;quot;, &amp;quot;class&amp;quot;,&lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;testeval&amp;quot;, &amp;quot;test&amp;quot;, &amp;quot;eval&amp;quot;, &amp;quot;testeval&amp;quot;, &amp;quot;new&amp;quot;, &lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;class&amp;quot;, &amp;quot;end&amp;quot;, &amp;quot;yields&amp;quot;, &amp;quot;nil&amp;quot;, &amp;quot;output&amp;quot;, &lt;/span&gt;
&lt;span class="c1"&gt;#     &amp;quot;ok&amp;quot;, &amp;quot;guess&amp;quot;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;&lt;strong&gt;NOTE:&lt;/strong&gt; depending on the size of your training data, this may not be
a good idea. Stemming is useful in the beginning when you don't have a
lot of data. Otherwise consider "&lt;em&gt;house&lt;/em&gt;" and "&lt;em&gt;housing&lt;/em&gt;" ... the
former is used less frequently in a spammy context than the latter.&lt;/p&gt;

&lt;h3&gt;2.5. Implementation Guidelines&lt;/h3&gt;

&lt;p&gt;When classifying emails for spam, it is a good idea to be sure that a
certain message is a spam message, otherwise users may get pissed by
too many false positives.&lt;/p&gt;

&lt;p&gt;Therefore it is a good idea to have &lt;em&gt;thresholds&lt;/em&gt;. This is what my
implementation looks like:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;classify&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="kp"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="c1"&gt;# Find the category with the highest probability&lt;/span&gt;

  &lt;span class="n"&gt;max_prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="n"&gt;best&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kp"&gt;nil&lt;/span&gt;
  
  &lt;span class="n"&gt;scores&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cat_scores&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="n"&gt;cat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_prob&lt;/span&gt;
      &lt;span class="n"&gt;max_prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt;
      &lt;span class="n"&gt;best&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;cat&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="c1"&gt;# Return the default category in case the threshold condition was&lt;/span&gt;
  &lt;span class="c1"&gt;# not met. For example, if the threshold for :spam is 1.2&lt;/span&gt;
  &lt;span class="c1"&gt;#&lt;/span&gt;
  &lt;span class="c1"&gt;#    :spam =&amp;gt; 0.73, :ham =&amp;gt; 0.40  (OK)&lt;/span&gt;
  &lt;span class="c1"&gt;#    :spam =&amp;gt; 0.80, :ham =&amp;gt; 0.70  (Fail, :ham is too close)&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;
  &lt;span class="n"&gt;threshold&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@thresholds&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;best&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;

  &lt;span class="n"&gt;scores&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;each&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;score&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="n"&gt;cat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;score&lt;/span&gt;
    &lt;span class="k"&gt;next&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cat&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;default&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;prob&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;threshold&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;max_prob&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;best&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h2&gt;Final Words&lt;/h2&gt;

&lt;p&gt;My example involved spam classification, but this is not how
modern spam classifiers work. Because the independence assumptions
are often inaccurate, this type of classifier can be gamed by spammers
to trigger a lot of false positives, which will eventually make the
user turn the feature off.&lt;/p&gt;

&lt;p&gt;But it is general purpose, being good enough not only for spam
detection, but also for lots of other use-cases and it's enough to get
you started.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/bionicspirit/~4/1keilCGHjPs" height="1" width="1"/&gt;</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2012/02/09/howto-build-naive-bayes-classifier.html</feedburner:origLink></entry>
 
 
 
 <entry>
   <title>Data Mining: Finding Similar Items and Users</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/didmgf6Paj8/cosine-similarity-euclidean-distance.html" />
   <updated>2012-01-16T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2012/01/16/cosine-similarity-euclidean-distance</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2012 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Algorithms" label="Algorithms" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Programming" label="Programming" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Mining" label="Mining" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Ruby" label="Ruby" />
   

   <content type="html">&lt;p&gt;&lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/similarity-graphic-small.png" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;&lt;/p&gt;

&lt;p&gt;Because we want to give kick-ass product recommendations.&lt;/p&gt;

&lt;p&gt;I'm showing you how to find related items based on a really simple
formula. If you pay attention, this technique is used all over the web
(like on Amazon) to personalize the user experience and increase
conversion rates.&lt;/p&gt;

&lt;p&gt;To get one question out of the way: there are already many available
libraries that do this, but as you'll see there are multiple ways of
skinning the cat and you won't be able to pick the right one without
understanding the process, at least intuitively.&lt;/p&gt;

&lt;h2&gt;Defining the Problem&lt;/h2&gt;

&lt;p&gt;&lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/amazon.png" title="Amazon gives kick-ass suggestions to their customers"  style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;&lt;/p&gt;

&lt;p&gt;To find items similar to a certain item, you've got to first define
what it means for 2 items to be similar, and this depends on the
problem you're trying to solve:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;on a blog, you may want to suggest similar articles that share the
same tags, or that have been viewed by the same people viewing the
item you want to compare with&lt;/li&gt;
&lt;li&gt;Amazon has this section called "&lt;em&gt;customers that bought this item also
bought&lt;/em&gt;", which is self-explanatory&lt;/li&gt;
&lt;li&gt;a service like IMDB, based on your ratings, could find users similar
to you, users that liked or hated approximately the same movies you did,
thus giving you suggestions on movies you'd like to watch in the future&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In each case you need a way to classify these items you're comparing,
whether it is tags, or items purchased, or movies reviewed. We'll be
using tags, as it is simpler, but the formula holds for more
complicated instances.&lt;/p&gt;

&lt;h2&gt;Redefining the Problem in Terms of Geometry&lt;/h2&gt;

&lt;p&gt;We'll be using my blog as a sample. Let's take some tags:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;API&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Algorithms&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Amazon&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Android&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Books&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Browser&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;That's 6 tags. Well, what if we considered these tags as dimensions in
a 6-dimensional
&lt;a href="http://en.wikipedia.org/wiki/Euclidean_space"&gt;Euclidean space&lt;/a&gt;? Then
each item you want to sort or compare becomes a point in this space,
in which a coordinate (representing a tag) is either one (tagged) or
zero (not tagged).&lt;/p&gt;

&lt;p&gt;So let's say we've got one article tagged with &lt;em&gt;API&lt;/em&gt; and
&lt;em&gt;Browser&lt;/em&gt;. Then its associated point will be:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Now these coordinates could represent something else. For instance
they could represent users. If say you've got a total of 6 users in
your system, 2 of them rating an item with 3 and 5 stars respectively,
you could have for the article in question this associated point
(do note the order is very important):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;So now you can go ahead and calculate distances between these
points. For instance you could calculate the angle between the
associated vectors, or the actual Euclidean distance between the 2
points. For a 2-dimensional Euclidean space, here's what it would look
like:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/similarity-graphic.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;h2&gt;Euclidean Distance&lt;/h2&gt;

&lt;p&gt;The mathematical formula for the Euclidean distance is really
simple. Considering 2 points, A and B, with their associated
coordinates, the distance is defined as:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/euclidean-distance.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;p&gt;The smaller the distance between 2 points, the higher the
similarity. Here's some Ruby code:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="c1"&gt;# Returns the Euclidean distance between 2 points&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Params:&lt;/span&gt;
&lt;span class="c1"&gt;#  - a, b: list of coordinates (float or integer)&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;euclidean_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;sq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="no"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sq&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# Returns the associated point of our tags_set, relative to our&lt;/span&gt;
&lt;span class="c1"&gt;# tags_space.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Params:&lt;/span&gt;
&lt;span class="c1"&gt;#  - tags_set: list of tags&lt;/span&gt;
&lt;span class="c1"&gt;#  - tags_space: _ordered_ list of tags&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;tags_to_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags_set&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tags_space&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;tags_space&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;tags_set&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;member?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# Returns items sorted by similarity to by_these_tags&lt;/span&gt;
&lt;span class="c1"&gt;# (most relevant are first in the returned list)&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Params:&lt;/span&gt;
&lt;span class="c1"&gt;#  - items: list of hashes that have [:tags]&lt;/span&gt;
&lt;span class="c1"&gt;#  - by_these_tags: list of tags to compare with&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;sort_by_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;by_these_tags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;tags_space&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;by_these_tags&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:tags&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;  
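  &lt;span class="c1"&gt;# non-bang calls: flatten! and uniq! return nil when they change nothing&lt;/span&gt;
  &lt;span class="n"&gt;tags_space&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tags_space&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;flatten&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uniq&lt;/span&gt;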

  &lt;span class="n"&gt;this_point&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tags_to_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;by_these_tags&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tags_space&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;other_points&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;items&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; 
    &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tags_to_point&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:tags&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tags_space&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;

  &lt;span class="n"&gt;similarities&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;other_points&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;that_point&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;
    &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;item&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;euclidean_distance&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;this_point&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;that_point&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  
  &lt;span class="n"&gt;sorted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;similarities&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sort&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;sorted&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;point&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;point&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;And here is a test you could run. You can copy the script above and
the one below into a single file and run it directly:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="c1"&gt;# SAMPLE DATA&lt;/span&gt;

&lt;span class="n"&gt;all_articles&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;
  &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="ss"&gt;:article&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Data Mining: Finding Similar Items&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
   &lt;span class="ss"&gt;:tags&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Algorithms&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Programming&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Mining&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
     &lt;span class="s2"&gt;&amp;quot;Python&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Ruby&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; 
  &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="ss"&gt;:article&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Blogging Platform for Hackers&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;  
   &lt;span class="ss"&gt;:tags&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Publishing&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Server&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Cloud&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Heroku&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
     &lt;span class="s2"&gt;&amp;quot;Jekyll&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;GAE&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; 
  &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="ss"&gt;:article&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;UX Tip: Don&amp;#39;t Hurt Me On Sign-Up&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
   &lt;span class="ss"&gt;:tags&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Web&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Design&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;UX&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; 
  &lt;span class="p"&gt;{&lt;/span&gt;
   &lt;span class="ss"&gt;:article&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Crawling the Android Marketplace&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
   &lt;span class="ss"&gt;:tags&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Python&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Android&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;Mining&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
     &lt;span class="s2"&gt;&amp;quot;Web&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;API&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="o"&gt;]&lt;/span&gt;

&lt;span class="c1"&gt;# SORTING these articles by similarity with an article &lt;/span&gt;
&lt;span class="c1"&gt;# tagged with Publishing + Web + API&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# The list is returned in this order; the number is the Euclidean&lt;/span&gt;
&lt;span class="c1"&gt;# distance, so lower means more similar. Array#sort is not stable,&lt;/span&gt;
&lt;span class="c1"&gt;# so the two articles tied at 2.0 may come out in either order.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# 1. article: Crawling the Android Marketplace&lt;/span&gt;
&lt;span class="c1"&gt;#    distance: 2.0&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# 2. article: &amp;quot;UX Tip: Don&amp;#39;t Hurt Me On Sign-Up&amp;quot;&lt;/span&gt;
&lt;span class="c1"&gt;#    distance: 2.0&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# 3. article: Blogging Platform for Hackers&lt;/span&gt;
&lt;span class="c1"&gt;#    distance: 2.645751&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# 4. article: &amp;quot;Data Mining: Finding Similar Items&amp;quot;&lt;/span&gt;
&lt;span class="c1"&gt;#    distance: 2.828427&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;

&lt;span class="n"&gt;sorted&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;sort_by_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;all_articles&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Publishing&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;Web&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;API&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yaml&amp;#39;&lt;/span&gt;
&lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="no"&gt;YAML&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dump&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h3&gt;The Problem (or Strength) of Euclidean Distance&lt;/h3&gt;

&lt;p&gt;Can you see one flaw with it for our chosen data-set and intention? I
think you can - the first 2 articles have the same Euclidean distance
to ["Publishing", "Web", "API"], even though the first article shares
2 tags with our chosen item, while the second shares just 1.&lt;/p&gt;

&lt;p&gt;To visualize why, look at the points used in calculating the distance
for the first article:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;So 4 coordinates are different. Now look at the points used for the
second article:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Again, 4 coordinates are different. So here's the deal with Euclidean
distance: it measures &lt;em&gt;dissimilarity&lt;/em&gt;. Coordinates that are the
same contribute nothing to the result; only the coordinates that differ
count. For my purpose here, this is not good - because articles with more
(or fewer) tags than the average are going to be disadvantaged.&lt;/p&gt;
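&lt;p&gt;To make the tie concrete, here is a minimal sketch. It assumes
coordinates are only zeros and ones, in which case the squared Euclidean
distance is simply the number of tags that appear in exactly one of the
2 lists. The helper name &lt;code&gt;squared_tag_distance&lt;/code&gt; is made up
for this illustration and is not part of the code above:&lt;/p&gt;

```ruby
# Hypothetical helper (not from the article's code): for 0/1 tag vectors,
# the squared Euclidean distance equals the number of tags present in
# exactly one of the two lists: |A| + |B| - 2 * |shared|.
def squared_tag_distance(tags_a, tags_b)
  a = tags_a.uniq
  b = tags_b.uniq
  shared = a.count { |t| b.include?(t) }
  a.length + b.length - 2 * shared
end

query    = ["Publishing", "Web", "API"]
crawling = ["Python", "Android", "Mining", "Web", "API"]
ux_tip   = ["Web", "Design", "UX"]

puts squared_tag_distance(query, crawling)  # 3 + 5 - 2 * 2 = 4
puts squared_tag_distance(query, ux_tip)    # 3 + 3 - 2 * 1 = 4
```

&lt;p&gt;The extra shared tag of the first article is cancelled out by its 2
extra unrelated tags, so both distances come out as the square root of 4.&lt;/p&gt;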

&lt;h2&gt;Cosine Similarity&lt;/h2&gt;

&lt;p&gt;This method is very similar to the one above, but does tend to give
slightly different results, because this one actually measures
similarity instead of dissimilarity. Here's the formula:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/cosine-similarity.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;p&gt;If you look at the visual with the 2 axes and 2 points, we need the
cosine of the angle &lt;em&gt;theta&lt;/em&gt; that's between the vectors associated with
our 2 points. And for our sample it does give better results.&lt;/p&gt;

&lt;p&gt;The values range between -1 and 1. -1 means that the 2 items are
total opposites (the vectors point in opposite directions), 0 means
that the 2 items are independent of each other (the vectors are
orthogonal) and 1 means that the 2 items are very similar (the vectors
point in the same direction). Because we are only using zeros and ones
for coordinates here, this score will never be negative for our sample.&lt;/p&gt;

&lt;p&gt;Here's the Ruby code (leaving out the wiring to our sample data, do
that as an exercise):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;dot_product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;products&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;products&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="nb"&gt;p&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nb"&gt;p&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;magnitude&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;point&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;squares&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;point&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;map&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="no"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;squares&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;

&lt;span class="c1"&gt;# Returns the cosine of the angle between the vectors&lt;/span&gt;
&lt;span class="c1"&gt;# associated with 2 points&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Params:&lt;/span&gt;
&lt;span class="c1"&gt;#  - a, b: list of coordinates (float or integer)&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;cosine_similarity&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;dot_product&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;magnitude&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;magnitude&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
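&lt;p&gt;For reference, one possible way to do the wiring left as an exercise
is sketched below. It is self-contained, so it restates the helpers in
compact form. The name &lt;code&gt;rank_by_cosine&lt;/code&gt; and the choice to
score an empty tag list as 0.0 are assumptions made for this sketch:&lt;/p&gt;

```ruby
# Compact restatement of the article's helper so the sketch runs on its own.
def tags_to_point(tags_set, tags_space)
  tags_space.map { |t| tags_set.member?(t) ? 1 : 0 }
end

def cosine_similarity(a, b)
  dot = a.zip(b).inject(0) { |s, (x, y)| s + x * y }
  mag = lambda { |v| Math.sqrt(v.inject(0) { |s, x| s + x ** 2 }) }
  denom = mag.call(a) * mag.call(b)
  # Assumption: a vector with no tags at all gets a score of 0.0
  denom.zero? ? 0.0 : dot / denom
end

# Hypothetical wiring: returns [item, score] pairs, most similar first.
def rank_by_cosine(items, by_these_tags)
  tags_space = (by_these_tags + items.map { |x| x[:tags] }).flatten.uniq.sort
  this_point = tags_to_point(by_these_tags, tags_space)
  scored = items.map do |i|
    [i, cosine_similarity(this_point, tags_to_point(i[:tags], tags_space))]
  end
  scored.sort_by { |_, score| -score } # higher cosine means more similar
end

articles = [
  { :article => "Data Mining: Finding Similar Items",
    :tags => ["Algorithms", "Programming", "Mining", "Python", "Ruby"] },
  { :article => "Blogging Platform for Hackers",
    :tags => ["Publishing", "Server", "Cloud", "Heroku", "Jekyll", "GAE"] },
  { :article => "UX Tip: Don't Hurt Me On Sign-Up",
    :tags => ["Web", "Design", "UX"] },
  { :article => "Crawling the Android Marketplace",
    :tags => ["Python", "Android", "Mining", "Web", "API"] }
]

rank_by_cosine(articles, ["Publishing", "Web", "API"]).each do |item, score|
  puts "#{item[:article]}: #{score.round(4)}"
end
```

&lt;p&gt;Note that the sort is descending here, the reverse of the Euclidean
version, because a higher cosine means a closer match.&lt;/p&gt;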


&lt;p&gt;Also, sorting the articles in the above sample gives me the following:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="yaml"&gt;&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;article&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;Crawling the Android Marketplace&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;similarity&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;0.5163977794943222&lt;/span&gt;

&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;article&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;UX&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Tip:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Don&amp;#39;t&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Hurt&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Me&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;On&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Sign-Up&amp;quot;&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;similarity&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;0.33333333333333337&lt;/span&gt;

&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;article&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;Blogging Platform for Hackers&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;similarity&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;0.23570226039551587&lt;/span&gt;

&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;article&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;Data&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Mining:&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Finding&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Similar&lt;/span&gt;&lt;span class="nv"&gt; &lt;/span&gt;&lt;span class="s"&gt;Items&amp;quot;&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;similarity&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;0.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Right, so much better for this chosen sample and usage. Ain't this
fun? BUT, you guessed it, there's a problem with this too ...&lt;/p&gt;

&lt;h3&gt;The Problem with Our Sample; The Tf-Idf Weight&lt;/h3&gt;

&lt;p&gt;Our data sample is so simple that we could have simply counted the
number of common tags and used that as a metric. The ranking would be
nearly the same without getting fancy with Cosine Similarity :-)&lt;/p&gt;

&lt;p&gt;Clearly a tag such as "Heroku" is more specific than a general-purpose
tag such as "Web". Also, just because Jekyll was mentioned in an
article, that doesn't make the article about Jekyll. And an article
tagged with "Android" may be twice as Android-related as another
article also tagged with "Android".&lt;/p&gt;

&lt;p&gt;So here's a solution to this: the
&lt;strong&gt;&lt;a href="http://en.wikipedia.org/wiki/Tf%E2%80%93idf"&gt;Tf-Idf weight&lt;/a&gt;&lt;/strong&gt;, &lt;em&gt;a
statistical measure used to evaluate how important a word is to a
document in a collection or corpus&lt;/em&gt;. With it you can give values to
your coordinates that are much more specific than simple ones and
zeros. But I'll leave that for another day.&lt;/p&gt;
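&lt;p&gt;As a rough taste of the idea only, the sketch below computes just the
inverse document frequency half of the weight, assuming the common
definition idf = log(N / number of items carrying the tag). The helper
name &lt;code&gt;tag_idf&lt;/code&gt; is hypothetical, and a real Tf-Idf setup would
also need per-article term frequencies, which plain tag lists don't have:&lt;/p&gt;

```ruby
# Hypothetical helper: inverse document frequency for every tag,
# idf(tag) = log(N / number of items that carry the tag).
# Rare tags get a high weight, common tags a low one.
def tag_idf(items)
  n = items.length.to_f
  counts = Hash.new(0)
  items.each { |i| i[:tags].uniq.each { |t| counts[t] += 1 } }
  weights = {}
  counts.each { |tag, df| weights[tag] = Math.log(n / df) }
  weights
end

# Tag lists taken from the sample data used earlier in the article.
sample = [
  { :tags => ["Algorithms", "Programming", "Mining", "Python", "Ruby"] },
  { :tags => ["Publishing", "Server", "Cloud", "Heroku", "Jekyll", "GAE"] },
  { :tags => ["Web", "Design", "UX"] },
  { :tags => ["Python", "Android", "Mining", "Web", "API"] }
]

idf = tag_idf(sample)
puts idf["Heroku"]  # log(4 / 1): appears in one article only
puts idf["Web"]     # log(4 / 2): appears in two articles
```

&lt;p&gt;These weights could then replace the ones in the coordinates, so that
sharing "Heroku" counts for more than sharing "Web".&lt;/p&gt;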

&lt;p&gt;Also, related to our simple data-set here, perhaps an even simpler
metric, like the
&lt;a href="http://en.wikipedia.org/wiki/Jaccard_index"&gt;Jaccard index&lt;/a&gt; would be
better.&lt;/p&gt;
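&lt;p&gt;A minimal sketch of the Jaccard index, which is the number of shared
tags divided by the number of distinct tags in both lists combined. The
function name and the choice to return 0.0 for two empty lists are
assumptions of this sketch:&lt;/p&gt;

```ruby
# Jaccard index: size of the intersection divided by size of the union.
# Assumption: two empty tag lists score 0.0 instead of dividing by zero.
def jaccard_index(tags_a, tags_b)
  a = tags_a.uniq
  b = tags_b.uniq
  shared = a.count { |t| b.include?(t) }
  union = a.length + b.length - shared
  union.zero? ? 0.0 : shared.to_f / union
end

query = ["Publishing", "Web", "API"]
puts jaccard_index(query, ["Python", "Android", "Mining", "Web", "API"])  # 2 shared of 6
puts jaccard_index(query, ["Web", "Design", "UX"])                        # 1 shared of 5
```

&lt;p&gt;It works directly on the tag lists, so there is no need to build the
tags space or the points first.&lt;/p&gt;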

&lt;h2&gt;Pearson Correlation Coefficient&lt;/h2&gt;

&lt;p&gt;The
&lt;a href="http://en.wikipedia.org/wiki/Pearson_product-moment_correlation_coefficient"&gt;Pearson Correlation Coefficient&lt;/a&gt;
for finding the similarity of 2 items is slightly more sophisticated
and doesn't really apply to my chosen data-set. This coefficient
measures how well two samples are linearly related.&lt;/p&gt;

&lt;p&gt;For example, on IMDB we may have 2 users. One of them, let's call him
John, has given the following ratings to 5 movies:
[1, 2, 3, 4, 5]. The other one, Mary, has given the following ratings
to the same 5 movies: [4, 5, 6, 7, 8]. The 2 users are very similar,
as there is a perfect linear correlation between them, since Mary just
gives the same ratings as John plus 3. The formula itself or the
theory is not very intuitive though. But it is simple to calculate:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/pearson.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;p&gt;Here's the code:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;pearson_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;length&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;

  &lt;span class="c1"&gt;# summing the preferences&lt;/span&gt;
  &lt;span class="n"&gt;sum1&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;sum2&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;# summing up the squares&lt;/span&gt;
  &lt;span class="n"&gt;sum1_sq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="n"&gt;sum2_sq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="c1"&gt;# summing up the product&lt;/span&gt;
  &lt;span class="n"&gt;prod_sum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;zip&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;inject&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ab&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;sum&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;ab&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;ab&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
  
  &lt;span class="c1"&gt;# calculating the Pearson score&lt;/span&gt;
  &lt;span class="c1"&gt;# n.to_f forces floating-point division; integer division would truncate&lt;/span&gt;
  &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;prod_sum&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum1&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;sum2&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="n"&gt;den&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Math&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sqrt&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;sum1_sq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum1&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_f&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum2_sq&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sum2&lt;/span&gt; &lt;span class="o"&gt;**&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_f&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;den&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;num&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;den&lt;/span&gt;  
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="n"&gt;pearson_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; 1.0&lt;/span&gt;
&lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="n"&gt;pearson_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; approximately 0.5077&lt;/span&gt;
&lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="n"&gt;pearson_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; approximately 0.4391&lt;/span&gt;
&lt;span class="nb"&gt;puts&lt;/span&gt; &lt;span class="n"&gt;pearson_score&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;6&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c1"&gt;# =&amp;gt; -1.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h2&gt;Manhattan Distance&lt;/h2&gt;

&lt;p&gt;There is no one-size-fits-all formula; the one you choose
depends on your data and on what you want out of it.&lt;/p&gt;

&lt;p&gt;For instance, the
&lt;a href="http://en.wikipedia.org/wiki/Taxicab_geometry"&gt;Manhattan Distance&lt;/a&gt;
computes the distance that would be traveled to get from one data
point to the other if a grid-like path is followed. I like this
graphic from Wikipedia, which illustrates the difference from
Euclidean distance:&lt;/p&gt;

&lt;p&gt;&lt;img class="center" src="//d2fo8u6if66r8r.cloudfront.net/assets/graphics/manhattan.png" style="float: none; display: block; margin: auto;"&gt;&lt;/p&gt;

&lt;p&gt;The red, yellow and blue lines all have the same length, and that
length is greater than that of the green diagonal, which is the
ordinary Euclidean distance.&lt;/p&gt;

&lt;p&gt;Personally I haven't found a use for it, as it is more related to
path-finding algorithms, but it's good to keep in mind that it
exists and may prove useful. Since it measures how many changes you
have to make to your origin location to reach your destination while
being limited to taking small steps in a grid-like system, it is very
similar in spirit to the
&lt;a href="http://en.wikipedia.org/wiki/Levenshtein_distance"&gt;Levenshtein distance&lt;/a&gt;,
which measures the minimum number of single-character edits required
to transform one text into another.&lt;/p&gt;
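
&lt;p&gt;As a rough sketch, here is how the two distances could be
computed side by side in Ruby. The method names
manhattan_distance and euclidean_distance are made up for this
illustration, and both methods assume the two arrays have the same
length:&lt;/p&gt;

```ruby
# Manhattan (taxicab) distance: the sum of the absolute differences
# on each coordinate. Hypothetical helper, assumes equal-length arrays.
def manhattan_distance(a, b)
  a.zip(b).inject(0) { |sum, (x, y)| sum + (x - y).abs }
end

# Euclidean distance, for comparison: the length of the straight line.
def euclidean_distance(a, b)
  Math.sqrt(a.zip(b).inject(0) { |sum, (x, y)| sum + (x - y)**2 })
end

# Going from one corner of a 6 by 6 grid to the opposite corner
puts manhattan_distance([0, 0], [6, 6])   # prints 12
puts euclidean_distance([0, 0], [6, 6])   # roughly 8.49
```

&lt;p&gt;Whichever grid-like path you pick between the two corners, the
Manhattan distance stays the same, while the straight diagonal is
shorter.&lt;/p&gt;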
</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2012/01/16/cosine-similarity-euclidean-distance.html</feedburner:origLink></entry>
 
 
 
 
 
 <entry>
   <title>Blogging Platform for Hackers</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/IH4Zs6kWe5Y/blogging-for-hackers.html" />
   <updated>2012-01-05T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2012/01/05/blogging-for-hackers</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2012 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Publishing" label="Publishing" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Server" label="Server" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Cloud" label="Cloud" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Heroku" label="Heroku" />
   
   <category scheme="http://bionicspirit.com/tags/" term="GAE" label="GAE" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Ruby" label="Ruby" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Jekyll" label="Jekyll" />
   

   <content type="html">&lt;p&gt;&lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/heroku.png" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;&lt;/p&gt;

&lt;p&gt;I'm showing you how to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;host your own static website on Heroku's free plan;&lt;/li&gt;
&lt;li&gt;use Google's App Engine as a CDN, for better responsiveness;&lt;/li&gt;
&lt;li&gt;keep Heroku's free dyno alive, by using a GAE cron job;&lt;/li&gt;
&lt;li&gt;have a very responsive, scalable and secure blog, with ultimate
control and simplicity, for zero bucks per month;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;You could just skip this article and browse the source code of my
blog:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/alexandru/bionicspirit.com"&gt;github.com/alexandru/bionicspirit.com&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Forget about WordPress or Blogger. Hacking your own stuff is much more
fun. Also, make sure to read
&lt;a href="http://tom.preston-werner.com/2008/11/17/blogging-like-a-hacker.html"&gt;Blogging Like a Hacker&lt;/a&gt;,
by Tom Preston-Werner, GitHub's cofounder and the author of Jekyll.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;UPDATE: article was changed three times to better express
rationale and in response to user feedback.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Jekyll and Heroku, Sitting in a Tree&lt;/h2&gt;

&lt;p&gt;I love &lt;a href="https://github.com/mojombo/jekyll"&gt;Jekyll&lt;/a&gt;, the static
website generator. It is pure awesomeness for me:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;all content is hosted in a Git repository, the best CMS ever
invented&lt;/li&gt;
&lt;li&gt;my articles are written in Markdown, with Emacs, the most potent
text editor ever created - think Textmate-snippets, macros, syntax
highlighting, keyboard-driven navigation and spelling corrections&lt;/li&gt;
&lt;li&gt;static content scales like crazy, without any special gimmicks. A
small VPS can serve thousands of requests per second without a
sweat&lt;/li&gt;
&lt;li&gt;static content is also secure by default, no constant upgrades
required, no SQL injections&lt;/li&gt;
&lt;li&gt;I always make little tweaks to my design, I'm never satisfied, which
is why it makes sense to make my own, but check out
&lt;a href="http://octopress.org/"&gt;Octopress&lt;/a&gt; in case you want a reasonable
default&lt;/li&gt;
&lt;li&gt;I've lost an entire blog when my hosting account got blocked in the
past. Never again, as my content is right now saved in 2 Git
repositories and on my local machine&lt;/li&gt;
&lt;li&gt;by working with my own domain, making my own shit, Google will never
make me cry ;-)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The first hosting option you should consider for Jekyll is
&lt;a href="http://pages.github.com/"&gt;GitHub Pages&lt;/a&gt;. However, you will need &lt;em&gt;some&lt;/em&gt;
dynamic behavior, like having configurable redirects. If you don't,
then ignore this post and just read
&lt;a href="https://github.com/mojombo/jekyll/wiki/usage"&gt;Jekyll's tutorial&lt;/a&gt;;
you can come back to this post when the limits of GitHub Pages start bothering you.&lt;/p&gt;

&lt;p&gt;Heroku's free plan is awesome, in spite of what
&lt;a href="/blog/2011/10/23/why-i-find-heroku-suboptimal.html"&gt;I said previously&lt;/a&gt;.
It's great for prototyping and for quickly seeing your website
online. Instant gratification is awesome. Well, it does have some
problems and to tell you the truth, for hosting my blog I would have
rather used Google's &lt;a href="http://code.google.com/appengine/"&gt;App Engine&lt;/a&gt;,
if only they allowed me to have naked domains. I like my domains to be
naked.&lt;/p&gt;

&lt;p&gt;One note regarding the scalability of static content I mentioned
above: on Heroku, the Bamboo stack features a Varnish frontend. If you
set proper expiry headers on your content, subsequent requests are
served from the cache and will not hit the Ruby server.&lt;/p&gt;
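
&lt;p&gt;To give an idea of what setting expiry headers means in practice,
here is a minimal sketch of a Rack-style middleware that adds a
Cache-Control header to successful responses. The class name
ExpiryHeaders and the one hour max-age are assumptions made for this
illustration, not the configuration I use on this blog:&lt;/p&gt;

```ruby
# Hypothetical middleware: marks successful responses as cacheable,
# so that a caching frontend can serve repeat requests by itself.
class ExpiryHeaders
  def initialize(app, max_age)
    @app = app
    @max_age = max_age   # cache lifetime in seconds (assumed value)
  end

  def call(env)
    status, headers, body = @app.call(env)
    # only cache successful responses, leave errors and redirects alone
    headers = headers.merge('Cache-Control' => "public, max-age=#{@max_age}") if status == 200
    [status, headers, body]
  end
end

# Stand-in for the real application, only here to make the sketch runnable
app = lambda { |_env| [200, { 'Content-Type' => 'text/html' }, ['hello']] }

status, headers, _body = ExpiryHeaders.new(app, 3600).call({})
puts headers['Cache-Control']   # prints public, max-age=3600
```

&lt;p&gt;In a real &lt;em&gt;config.ru&lt;/em&gt; such a class would be plugged in with
Rack's "use" directive, in front of whatever serves the files.&lt;/p&gt;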

&lt;h2&gt;Hosting Static Content on Heroku&lt;/h2&gt;

&lt;p&gt;So this tutorial is about hosting a Jekyll website, which is why I'm
going to make some assumptions about your directory structure. However
you can modify these instructions for any static website, not just
Jekyll-generated stuff.&lt;/p&gt;

&lt;p&gt;First, the setup:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="bash"&gt;&lt;span class="c"&gt;# install the heroku command-line utility&lt;/span&gt;
gem install heroku

&lt;span class="c"&gt;# change to your website directory&lt;/span&gt;
&lt;span class="nb"&gt;cd &lt;/span&gt;website/

&lt;span class="c"&gt;# initialize a git repo, if you haven&amp;#39;t done so&lt;/span&gt;
git init
&lt;span class="c"&gt;# ... and commit everything to it&lt;/span&gt;
git add .
git commit -m &lt;span class="s1"&gt;&amp;#39;initial commit&amp;#39;&lt;/span&gt;

&lt;span class="c"&gt;# create the heroku app&lt;/span&gt;
heroku create
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;OK, now we need a Rack-powered application to serve our
content. We'll need a &lt;em&gt;./Gemfile&lt;/em&gt; ...&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="n"&gt;source&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;http://rubygems.org&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;rack&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;mime-types&amp;#39;&lt;/span&gt;

&lt;span class="n"&gt;group&lt;/span&gt; &lt;span class="ss"&gt;:development&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
  &lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;jekyll&amp;#39;&lt;/span&gt;
  &lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;rdiscount&amp;#39;&lt;/span&gt;
  &lt;span class="n"&gt;gem&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;hpricot&amp;#39;&lt;/span&gt;  
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Then install these gems with:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="bash"&gt;bundle install
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;You also need a Rack configuration file, &lt;em&gt;./config.ru&lt;/em&gt;. What follows is the
configuration that I am using. You can go simpler, a lot simpler than
this actually, but I like flexibility, and Heroku also does something
funny with files served through Rack::File, so I refrained from using
it ...&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="c1"&gt;# Rack configuration file for serving a Jekyll-generated static&lt;/span&gt;
&lt;span class="c1"&gt;# website from Heroku, with some nice additions:&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# * knows how to do redirects, with settings taken from ./_config.yml&lt;/span&gt;
&lt;span class="c1"&gt;# * sets the cache expiry for HTML files differently from other static&lt;/span&gt;
&lt;span class="c1"&gt;#   assets, with settings taken from ./_config.yml&lt;/span&gt;

&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;yaml&amp;#39;&lt;/span&gt;
&lt;span class="nb"&gt;require&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;mime/types&amp;#39;&lt;/span&gt;

&lt;span class="c1"&gt;# main configuration file, also used by Jekyll&lt;/span&gt;
&lt;span class="no"&gt;CONFIG&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;YAML&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="n"&gt;load_file&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;__FILE__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_config.yml&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# points to our generated website directory&lt;/span&gt;
&lt;span class="no"&gt;PUBLIC&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;expand_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dirname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;__FILE__&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; 
                          &lt;span class="no"&gt;CONFIG&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;destination&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;_site&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

&lt;span class="c1"&gt;# For cutting down on the boilerplate&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;BaseMiddleware&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;initialize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="vi"&gt;@app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

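  &lt;span class="c1"&gt;# Lets a middleware return itself as an empty response body&lt;/span&gt;
  &lt;span class="c1"&gt;# (a Rack body only has to respond to each)&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;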
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="c1"&gt;# Rack middleware for correcting paths:&lt;/span&gt;
&lt;span class="c1"&gt;#  &lt;/span&gt;
&lt;span class="c1"&gt;# 1. redirects from the www. version to the naked domain version&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# 2. converts directory/paths/ to directory/paths/index.html (most&lt;/span&gt;
&lt;span class="c1"&gt;#    importantly / to /index.html)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PathCorrections&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;BaseMiddleware&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;PATH_INFO&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;index.html&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;PATH_INFO&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="n"&gt;end_with?&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;
    &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Rack&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;host&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_with?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;www.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;301&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;Location&amp;quot;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;//www.&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;//&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)},&lt;/span&gt; &lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="vi"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;    
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="c1"&gt;# Middleware that enables configurable redirects. The configuration is&lt;/span&gt;
&lt;span class="c1"&gt;# done in the standard Jekyll _config.yml file.&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# Sample configuration in _config.yml:&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;#   redirects:&lt;/span&gt;
&lt;span class="c1"&gt;#     - from: /docs/some-document.html&lt;/span&gt;
&lt;span class="c1"&gt;#       to: /archive/some-document.html&lt;/span&gt;
&lt;span class="c1"&gt;#       type: 301&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# The sample above will do a permanent redirect from ((*/docs/some-document.html*))&lt;/span&gt;
&lt;span class="c1"&gt;# to ((*/archive/some-document.html*))&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Redirects&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;BaseMiddleware&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Rack&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path_info&lt;/span&gt;
    &lt;span class="n"&gt;ext&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;end_with?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;/&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;redirect&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;CONFIG&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;redirects&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="o"&gt;[]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;find&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt;&lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;x&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;from&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
      &lt;span class="n"&gt;new_location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;redirect&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;to&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
      &lt;span class="n"&gt;new_location&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;base_url&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;new_location&lt;/span&gt; &lt;span class="p"&gt;\&lt;/span&gt;
        &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="n"&gt;new_location&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;start_with?&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;http&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;redirect&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;type&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="mi"&gt;302&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Location&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;new_location&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="nb"&gt;self&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;
      &lt;span class="vi"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="c1"&gt;# The 404 Not Found message should be a simple one in case the&lt;/span&gt;
&lt;span class="c1"&gt;# mimetype of a file is not HTML (like the message returned by&lt;/span&gt;
&lt;span class="c1"&gt;# Rack::File). However, in case of HTML files, then we should display&lt;/span&gt;
&lt;span class="c1"&gt;# a custom 404 message&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Fancy404NotFound&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;BaseMiddleware&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="vi"&gt;@app&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt; 
      &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;extname&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;PATH_INFO&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;=~&lt;/span&gt; &lt;span class="sr"&gt;/html?$/&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;ext&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;&amp;#39;&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ext&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;text/html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt; &lt;span class="no"&gt;PUBLIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;pages&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;404.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="c1"&gt;# Mimicking Rack::File&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# I couldn&amp;#39;t work with Rack::File directly, because for some reason&lt;/span&gt;
&lt;span class="c1"&gt;# Heroku prevents me from overriding the Cache-Control header, setting&lt;/span&gt;
&lt;span class="c1"&gt;# it to 12 hours. But 12 hours is not suitable for HTML content that&lt;/span&gt;
&lt;span class="c1"&gt;# may receive fixes and other assets should have an expiry in the far &lt;/span&gt;
&lt;span class="c1"&gt;# future, with 12 hours not being enough. &lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Application&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;BaseMiddleware&lt;/span&gt;
  &lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Http404&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;StandardError&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;guess_mimetype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;MIME&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Types&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;of&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="kp"&gt;nil&lt;/span&gt;
    &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_s&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kp"&gt;nil&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;request&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;Rack&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;env&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;path_info&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;request&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path_info&lt;/span&gt;

    &lt;span class="c1"&gt;# a /ping request always hits the Ruby Rack server - useful in&lt;/span&gt;
    &lt;span class="c1"&gt;# case you want to setup a cron to check if the server is still&lt;/span&gt;
    &lt;span class="c1"&gt;# online or bring it back to life in case it sleeps&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;path_info&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;/ping&amp;quot;&lt;/span&gt;
      &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
          &lt;span class="s1"&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;text/plain&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
          &lt;span class="s1"&gt;&amp;#39;Cache-Control&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;no-cache&amp;#39;&lt;/span&gt;
      &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="no"&gt;DateTime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;to_s&lt;/span&gt;&lt;span class="o"&gt;]]&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
    
    &lt;span class="n"&gt;headers&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mimetype&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;guess_mimetype&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path_info&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;mimetype&lt;/span&gt;
      &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;mimetype&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;text/html&amp;#39;&lt;/span&gt;
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Content-Language&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;en&amp;#39;&lt;/span&gt; 
        &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;; charset=utf-8&amp;quot;&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
    
    &lt;span class="k"&gt;begin&lt;/span&gt;
      &lt;span class="c1"&gt;# basic validation of the path provided&lt;/span&gt;
      &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="no"&gt;Http404&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;path_info&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;include?&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;..&amp;#39;&lt;/span&gt;
      &lt;span class="n"&gt;abs_path&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;PUBLIC&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path_info&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;.&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
      &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="no"&gt;Http404&lt;/span&gt; &lt;span class="k"&gt;unless&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exist?&lt;/span&gt; &lt;span class="n"&gt;abs_path&lt;/span&gt;

      &lt;span class="c1"&gt;# setting Cache-Control expiry headers&lt;/span&gt;
      &lt;span class="n"&gt;type&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;path_info&lt;/span&gt; &lt;span class="o"&gt;=~&lt;/span&gt; &lt;span class="sr"&gt;/\.html?$/&lt;/span&gt; &lt;span class="p"&gt;?&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;html&amp;#39;&lt;/span&gt; &lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;assets&amp;#39;&lt;/span&gt;
      &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Cache-Control&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;public, max-age=&amp;quot;&lt;/span&gt;
      &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;Cache-Control&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="no"&gt;CONFIG&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;expires&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;][&lt;/span&gt;&lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="o"&gt;].&lt;/span&gt;&lt;span class="n"&gt;to_s&lt;/span&gt;

      &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;File&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;abs_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;r&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;rescue&lt;/span&gt; &lt;span class="no"&gt;Http404&lt;/span&gt;
      &lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;404 Not Found: &lt;/span&gt;&lt;span class="si"&gt;#{&lt;/span&gt;&lt;span class="n"&gt;path_info&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;

    &lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="n"&gt;status&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;


&lt;span class="c1"&gt;#&lt;/span&gt;
&lt;span class="c1"&gt;# the actual Rack configuration, using &lt;/span&gt;
&lt;span class="c1"&gt;# the middleware defined above&lt;/span&gt;
&lt;span class="c1"&gt;#&lt;/span&gt;

&lt;span class="n"&gt;use&lt;/span&gt; &lt;span class="no"&gt;Redirects&lt;/span&gt;
&lt;span class="n"&gt;use&lt;/span&gt; &lt;span class="no"&gt;PathCorrections&lt;/span&gt;
&lt;span class="n"&gt;use&lt;/span&gt; &lt;span class="no"&gt;Fancy404NotFound&lt;/span&gt;

&lt;span class="n"&gt;run&lt;/span&gt; &lt;span class="no"&gt;Application&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;new&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;PUBLIC&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
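
&lt;p&gt;The &lt;em&gt;Application&lt;/em&gt; middleware above does only a basic check on the
path, rejecting anything that contains '..'. As an illustration, here is a
stricter variant of the same idea, sketched in Python rather than taken from
the Rack configuration: it resolves the requested path and verifies that
the result still sits under the public directory. The function name
&lt;em&gt;resolve_public_path&lt;/em&gt; and the sample paths are made up for this
example.&lt;/p&gt;

```python
import os
import tempfile


def resolve_public_path(public_root, path_info):
    """Return the absolute path for path_info, or None if it escapes public_root."""
    root = os.path.realpath(public_root)
    # Strip the leading slash so the request path is joined relative to root
    candidate = os.path.realpath(os.path.join(root, path_info.lstrip("/")))
    # commonpath equals root only when candidate is root itself or lives below it
    if os.path.commonpath([root, candidate]) != root:
        return None
    return candidate


# Hypothetical usage against a throwaway directory
public_root = tempfile.mkdtemp()
safe = resolve_public_path(public_root, "/blog/index.html")
unsafe = resolve_public_path(public_root, "/../etc/passwd")
print(safe)
print(unsafe)
```

&lt;p&gt;The second call returns None, because the resolved path leaves the
public directory. The caller would answer such a request with a 404.&lt;/p&gt;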


&lt;p&gt;This Rack configuration uses settings defined in the standard Jekyll
&lt;em&gt;_config.yml&lt;/em&gt; file. Here are some settings needed for it to work as
intended:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="yaml"&gt;&lt;span class="l-Scalar-Plain"&gt;destination&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;./_site&lt;/span&gt;

&lt;span class="l-Scalar-Plain"&gt;expires&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;html&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;3600&lt;/span&gt; &lt;span class="c1"&gt;# one hour&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;assets&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;31536000&lt;/span&gt; &lt;span class="c1"&gt;# one year&lt;/span&gt;

&lt;span class="l-Scalar-Plain"&gt;redirects&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt;
  &lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;from&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;/rss/&lt;/span&gt;
    &lt;span class="l-Scalar-Plain"&gt;to&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;http://feeds.feedburner.com/bionicspirit&lt;/span&gt;
    &lt;span class="l-Scalar-Plain"&gt;type&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;302&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
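
&lt;p&gt;For reference, the values under &lt;em&gt;expires&lt;/em&gt; are in seconds, the
unit that &lt;em&gt;max-age&lt;/em&gt; expects: one hour is 3,600 seconds and a 365-day
year is 31,536,000 seconds. The following Python sketch shows how such values
turn into a Cache-Control header, mirroring what the Rack application does.
The names &lt;em&gt;EXPIRES&lt;/em&gt; and &lt;em&gt;cache_control&lt;/em&gt; are hypothetical and
exist only for this illustration.&lt;/p&gt;

```python
from datetime import timedelta

# Hypothetical mirror of the "expires" settings, computed instead of typed by hand
EXPIRES = {
    "html": int(timedelta(hours=1).total_seconds()),
    "assets": int(timedelta(days=365).total_seconds()),
}


def cache_control(path):
    """Build the Cache-Control value for a request path."""
    # HTML pages get the short expiry, everything else counts as an asset
    kind = "html" if path.endswith((".html", ".htm")) else "assets"
    return "public, max-age=%d" % EXPIRES[kind]


print(cache_control("/index.html"))       # public, max-age=3600
print(cache_control("/assets/logo.png"))  # public, max-age=31536000
```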


&lt;p&gt;OK, so once done, test this configuration:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="bash"&gt;&lt;span class="c"&gt;# generating the website&lt;/span&gt;
jekyll

&lt;span class="c"&gt;# starting the server&lt;/span&gt;
rackup
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Deployment is as easy as pie:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="bash"&gt;git push heroku master
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;One note: Heroku could be configured to automatically generate the
website for you. However, you either have to use the Cedar stack or
generate the pages on the fly, and with the Cedar stack you lose
Varnish. Just keep your generated files in Git; it's easier.&lt;/p&gt;

&lt;h2&gt;Commenting with Disqus, Facebook or Roll Your Own&lt;/h2&gt;

&lt;p&gt;For commenting, &lt;a href="http://disqus.com"&gt;Disqus&lt;/a&gt; is a really good
service. In case you have a very popular website amongst normal
people, it may be even better to integrate Facebook's commenting
widget.&lt;/p&gt;

&lt;p&gt;Well, I had some fun a while ago and created my own:
&lt;a href="https://github.com/alexandru/TheBuzzEngine"&gt;TheBuzzEngine&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Unfortunately it doesn't have all the features I want, but it does get
the job done and it isn't bloated. One of these days I'll probably get
around to adding some stuff to it, like threaded comments and email
subscriptions. This is what happens when working for fun on stuff:
once you're over a certain threshold, the return on investment is too
low to bother with extra development.&lt;/p&gt;

&lt;p&gt;I recommend Disqus, although rolling your own is fun and keeps you in
control (which is the reason I'm using Jekyll in the first place).&lt;/p&gt;

&lt;h2&gt;Using Google App Engine as Your CDN or Cron Manager&lt;/h2&gt;

&lt;p&gt;When using Heroku's free plan, I feel a little uncomfortable,
because relying on one dyno can get you in trouble. Having Varnish in
front is great, but Varnish is only a cache. For instance, if you
push a new version of your latest article to Heroku, the Varnish
cache gets cleared and the Ruby server can be exposed to a lot of
requests, while one dyno on Heroku can only serve one request at a
time.&lt;/p&gt;

&lt;p&gt;So why not push all our static assets, except HTML files, to a CDN?
It's best practice anyway, as it should make your website more
responsive. If you have an Amazon AWS account, then CloudFront + S3
are great.&lt;/p&gt;

&lt;p&gt;However, I started with the goal of hosting this for zero bucks (it's
fun, so why not?). Therefore I'm going to teach you how to push your
files to Google's &lt;a href="http://code.google.com/appengine/"&gt;App Engine&lt;/a&gt;. I
don't really know how GAE works as a CDN for static files, but it
seems that it does have the
&lt;a href="http://blog.sallarp.com/google-app-engine-cdn/"&gt;properties of a CDN&lt;/a&gt;
(i.e. serving content to users from servers closer to their location).&lt;/p&gt;

&lt;p&gt;Another problem with Heroku's free plan is that the free dyno goes to
sleep, to save resources. While I advise you to just pay up for an
extra dyno, you can get around this restriction by just configuring
GAE to send a periodic &lt;em&gt;ping&lt;/em&gt; to your website.&lt;/p&gt;

&lt;p&gt;Here's my GAE configuration file, &lt;em&gt;app.yaml&lt;/em&gt;, which should sit in your
project root (assuming &lt;em&gt;./assets&lt;/em&gt; is the directory you want to serve from GAE):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="yaml"&gt;&lt;span class="l-Scalar-Plain"&gt;application&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;assets-bionicspirit&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;version&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;1&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;runtime&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;python27&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;api_version&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;1&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;threadsafe&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;true&lt;/span&gt;

&lt;span class="l-Scalar-Plain"&gt;handlers&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt;

&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;url&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;/assets&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;static_dir&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;assets&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;expiration&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;365d&amp;quot;&lt;/span&gt;

&lt;span class="c1"&gt;# next item is for our cron job, described below, but you can ignore&lt;/span&gt;
&lt;span class="c1"&gt;# it if you don&amp;#39;t want a cron job ...&lt;/span&gt;

&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;url&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;/tasks/ping&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;script&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;ping.app&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;As you can see, I'm setting the expiry of my static assets to a
whopping 1 year.&lt;/p&gt;

&lt;p&gt;I also have a real handler configured at &lt;em&gt;/tasks/ping&lt;/em&gt;. This will be
our cron job that sends a ping to our Heroku app every X
minutes. Here's the code for &lt;em&gt;ping.py&lt;/em&gt; ...&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="python"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;webapp2&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;google.appengine.api&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;urlfetch&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;PingService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;webapp2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RequestHandler&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
      &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;headers&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;Content-Type&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;text/plain&amp;#39;&lt;/span&gt;

      &lt;span class="n"&gt;url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;http://bionicspirit.com/ping&amp;quot;&lt;/span&gt;
      &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;urlfetch&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deadline&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;HTTP &lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s"&gt; - &lt;/span&gt;&lt;span class="si"&gt;%s&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt; 
              &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;status_code&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;content&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
      &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="ne"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
          &lt;span class="bp"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;out&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;ERROR: no response&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;webapp2&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WSGIApplication&lt;/span&gt;&lt;span class="p"&gt;([(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;/tasks/ping&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;PingService&lt;/span&gt;&lt;span class="p"&gt;)],&lt;/span&gt; &lt;span class="n"&gt;debug&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;False&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;But we are not done. To configure &lt;em&gt;/tasks/ping&lt;/em&gt; to run every X
minutes, you also need a &lt;em&gt;cron.yaml&lt;/em&gt; file ...&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="yaml"&gt;&lt;span class="l-Scalar-Plain"&gt;cron&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt;
&lt;span class="p-Indicator"&gt;-&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;description&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;ping bionicspirit.com to wake it up&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;url&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;/tasks/ping&lt;/span&gt;
  &lt;span class="l-Scalar-Plain"&gt;schedule&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;every 4 minutes&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Assuming you already have the
&lt;a href="http://code.google.com/appengine/docs/python/gettingstarted/devenvironment.html"&gt;GAE SDK installed&lt;/a&gt;,
run this command:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="bash"&gt;appcfg.py update .
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;To see it working on this blog, here are the requests:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Heroku URL getting requested: &lt;a href="http://bionicspirit.com/ping"&gt;http://bionicspirit.com/ping&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;GAE Cron Job getting executed: &lt;a href="http://assets-bionicspirit.appspot.com/tasks/ping"&gt;http://assets-bionicspirit.appspot.com/tasks/ping&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Extra Tip - CloudFlare&lt;/h2&gt;

&lt;p&gt;Luigi Montanez kindly pointed out in the comments below the availability
of &lt;a href="https://www.cloudflare.com/"&gt;CloudFlare&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;CloudFlare is a proxy that sits between your website and your
users. It allegedly prevents DDoS attacks on your website, but it also
caches static content, which helps because apparently it also has the
properties of a CDN.&lt;/p&gt;

&lt;p&gt;I activated it to see how it works. The main reason is that GAE has a
1 GB daily limit on outgoing bandwidth, and this article generated ~ 10,000
visits in only one day, which consumed ~ 700 MB of bandwidth on GAE.
That was for a couple of small images; I don't want to imagine what would
happen for an image-rich post. So I placed CloudFlare in front of GAE
and my Heroku instance, which should save some bandwidth for me.&lt;/p&gt;
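
&lt;p&gt;A back-of-the-envelope calculation from these numbers shows how little
headroom that quota leaves. This is only a rough sketch: it assumes
1 GB = 1024 MB, treats the approximate figures as exact, and the variable
names are made up for the example.&lt;/p&gt;

```python
# Rough figures quoted in the text
daily_limit_mb = 1024  # assumption: 1 GB counted as 1024 MB
visits = 10000
used_mb = 700

# Average bandwidth per visit, in KB
per_visit_kb = used_mb * 1024 / visits

# Number of visits per day that would exhaust the quota at that rate
visits_until_limit = daily_limit_mb / (used_mb / visits)

print("about %.0f KB per visit" % per_visit_kb)
print("quota exhausted after about %d visits" % visits_until_limit)
```

&lt;p&gt;That works out to roughly 72 KB per visit, so the daily quota would run
out at around 14,600 visits.&lt;/p&gt;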

&lt;p&gt;I don't have a conclusion on CloudFlare. If it works as advertised,
then it is &lt;em&gt;awesome&lt;/em&gt;. Be careful, though, as I've seen
reports on the Internet that it may in fact add latency to your
website, instead of decreasing it.&lt;/p&gt;

&lt;p&gt;For my website however, everything seems to be fine. I am monitoring
my website with &lt;a href="http://pingdom.com"&gt;Pingdom.com&lt;/a&gt;, a service which
also reports the average responsiveness of the website, calculated by
doing requests from multiple locations. The homepage, which is not
cached by CloudFlare or served by GAE, has an average load time of
300ms, while cached static resources from GAE and proxied through
CloudFlare are doing much better.&lt;/p&gt;

&lt;p&gt;So we'll see.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The result is a really responsive, scalable and kick-ass blog, for
zero bucks spent on hosting.&lt;/p&gt;

&lt;p&gt;This very blog is hosted using the method described above. Well, I'll
probably return to my trustworthy VPS instance as I'm paying for it
anyway, but this was fun.&lt;/p&gt;

&lt;p&gt;Enjoy ~&lt;/p&gt;
</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2012/01/05/blogging-for-hackers.html</feedburner:origLink></entry>
 
 
 
 
 
 <entry>
   <title>Crawling the Android Marketplace</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/O_j6uqrF1g4/crawling-the-android-marketplace-155200-apps.html" />
   <updated>2011-12-15T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2011/12/15/crawling-the-android-marketplace-155200-apps</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Python" label="Python" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Android" label="Android" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Mining" label="Mining" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Stats" label="Stats" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Web" label="Web" />
   
   <category scheme="http://bionicspirit.com/tags/" term="API" label="API" />
   

   <content type="html">&lt;p&gt;I had a very specific need for fetching the details for some apps in
the marketplace, in an automated manner. And so I found
&lt;a href="https://github.com/jberkel/supermarket"&gt;the supermarket gem&lt;/a&gt;, a
wrapper for the
&lt;a href="http://code.google.com/p/android-market-api/"&gt;Android Market API&lt;/a&gt;
Java implementation. However, it gives unpredictable results (it
wouldn't return the details of our in-house apps or of many other
examples I tried) and Google is placing hard limits on the number of
requests you can make per minute. This is an internal API, probably
used by the marketplace client and the implementation mentioned above
was created through reverse-engineering.&lt;/p&gt;

&lt;p&gt;This really pissed me off, this is Google, they should grok APIs. But
this info is already available from their website and so I went ahead
and crawled it.&lt;/p&gt;

&lt;p&gt;The script and the data collected are available. Read below.&lt;/p&gt;

&lt;!-- more --&gt;


&lt;h2&gt;How To Do it By Yourself&lt;/h2&gt;

&lt;p&gt;&lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/wolfspider150.gif" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;&lt;/p&gt;

&lt;p&gt;The actual script that I created can be found in the
&lt;a href="https://github.com/alexandru/AndroidMarketCrawler"&gt;AndroidMarketCrawler&lt;/a&gt;
GitHub Repository, with the relevant files being:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://github.com/alexandru/AndroidMarketCrawler/blob/master/crawler.py"&gt;crawler.py&lt;/a&gt; - source code with lots of comments, it's really not complicated, you should go read it&lt;/li&gt;
&lt;li&gt;marketplace_database.json_lines.bz2 - compressed file
containing the details of the crawled apps, one per each line; this
is not a proper JSON file, you use it by reading it line by line,
where each line represents a JSON object (personal preference, as
otherwise the file is pretty big and you can run out of
memory)&lt;/li&gt;
&lt;/ul&gt;
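
&lt;p&gt;Reading the data file line by line can be sketched as follows. This
is a minimal example, assuming Python 3 (for &lt;code&gt;bz2.open&lt;/code&gt;) and a
local copy of the archive; &lt;code&gt;iter_apps&lt;/code&gt; is a hypothetical helper
name, and no particular field names inside the records are assumed.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="python"&gt;import bz2
import json


def iter_apps(path):
    """Yield one dict per non-empty line of a bz2-compressed JSON-lines file."""
    # Text mode decompresses on the fly, so the whole file is never in memory.
    with bz2.open(path, "rt", encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;For example, &lt;code&gt;sum(1 for _ in iter_apps("marketplace_database.json_lines.bz2"))&lt;/code&gt;
would count the records without holding them all in memory.&lt;/p&gt;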


&lt;p&gt;&lt;strong&gt;UPDATE:&lt;/strong&gt; The Android Marketplace apparently bans crawling
explicitly. This crawler and the associated data serve educational
purposes only. Don't abuse them.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="python"&gt;&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;app&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;AndroidMarketCrawler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;concurrency&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="c"&gt;# app is at this point a dictionary with the details needed, like&lt;/span&gt;
    &lt;span class="c"&gt;#  id, name, developer name, number of installs, etc...&lt;/span&gt;
    &lt;span class="n"&gt;fh&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;dumps&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;app&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;I used the Python programming language, along with
&lt;a href="http://eventlet.net"&gt;Eventlet&lt;/a&gt; for fetching URLs in parallel (async I/O with
epoll/libevent, providing you with coroutines support and green
threads) and &lt;a href="http://packages.python.org/pyquery/"&gt;PyQuery&lt;/a&gt; for
selecting DOM elements using CSS3 selectors (instead of XPath or
BeautifulSoup). If you fancy Ruby instead, you could use rough
equivalents such as
&lt;a href="https://github.com/igrigorik/em-http-request"&gt;em-http-request&lt;/a&gt; and
&lt;a href="http://nokogiri.org/"&gt;Nokogiri&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;You start fetching content from a root page and add application links
to a queue as you encounter them. A (green) thread pool then starts a
fetching job for each link in the queue, and each fetched page can add
more links, so the process is recursive. The results are pushed to
another queue, ready to be consumed by the client.&lt;/p&gt;
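
&lt;p&gt;The queue-and-pool loop described above can be sketched roughly like
this. It is not the code from crawler.py: it uses the standard library's
thread pool in place of Eventlet's green threads and a generator in place
of the results queue, and &lt;code&gt;fetch&lt;/code&gt; is a hypothetical function
supplied by the caller.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="python"&gt;from concurrent.futures import FIRST_COMPLETED, ThreadPoolExecutor, wait


def crawl(root, fetch, concurrency=10):
    """Yield the result for every page reachable from root.

    fetch(url) is a hypothetical caller-supplied function that returns
    (result, links): the parsed page and the links found on it.
    """
    seen = {root}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        pending = {pool.submit(fetch, root)}
        while pending:
            # Block until at least one fetching job has finished.
            done, pending = wait(pending, return_when=FIRST_COMPLETED)
            for job in done:
                result, links = job.result()
                for link in links:
                    if link not in seen:  # schedule each link only once
                        seen.add(link)
                        pending.add(pool.submit(fetch, link))
                yield result
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;seen&lt;/code&gt; set is what keeps the recursion from looping
forever when pages link back to each other.&lt;/p&gt;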

&lt;p&gt;Be careful though, don't abuse this, as it will generate a ton of
traffic and your IP may end up being banned by Google. It also takes a
lot of time; with good bandwidth and a VPS located in California, it
still took me 5 hours for the script to finish. Don't abuse the
concurrency settings either, 10 is enough.&lt;/p&gt;

&lt;h2&gt;155,200 Apps Available From the US&lt;/h2&gt;

&lt;p&gt;You have to realize that this number is only approximate. Apps are
going strong in other countries, such as South Korea, and Google does
Geo-IP filtering, which means some of the apps were unavailable to me,
depending on restrictions set by their developers.&lt;/p&gt;

&lt;p&gt;The numbers published by
&lt;a href="http://www.readwriteweb.com/mobile/2011/10/android-market-hits-500000-suc.php"&gt;Research2Guidance in October&lt;/a&gt;
tell the story of 500,000 apps published on the Marketplace. But this
gets weird, as I took the number of downloads from those 155,200 apps
and it &lt;em&gt;matches&lt;/em&gt; the number of downloads
&lt;a href="http://android-developers.blogspot.com/2011/12/10-billion-android-market-downloads-and.html"&gt;published by Google this month&lt;/a&gt;. See
below.&lt;/p&gt;

&lt;h3&gt;An Average of 13.63 Billion Downloads&lt;/h3&gt;

&lt;p&gt;So there have been between 5,514,202,281 and 21,545,335,515 downloads
for &lt;em&gt;free apps&lt;/em&gt;, making the average 13,529,768,898 downloads.&lt;/p&gt;

&lt;p&gt;More interesting however is that according to my data for paid apps,
the number of downloads is between 42,576,311 and 164,116,615. This
number seems rather low to me, making it clear that Android
distribution is freemium based.&lt;/p&gt;
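
&lt;p&gt;The averages quoted here are midpoints between the lower and upper
bounds. As a quick check of the arithmetic (an assumption about how the
numbers combine, not something stated explicitly: the 13.63 billion in
the heading appears to be the free and paid midpoints added together):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="python"&gt;# Lower and upper bounds on downloads, as quoted in the text.
free_low, free_high = 5514202281, 21545335515
paid_low, paid_high = 42576311, 164116615

# Midpoint of each range (both sums are even, so integer division is exact).
free_mid = (free_low + free_high) // 2  # 13,529,768,898
paid_mid = (paid_low + paid_high) // 2  # 103,346,463

# Free plus paid: 13,633,115,361, i.e. roughly 13.63 billion.
total_mid = free_mid + paid_mid
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;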

&lt;p&gt;&lt;em&gt;UPDATE: initially here there were some charts showing the popularity
of free/paid apps per category. I deleted them as I don't like the
flames this kind of chart generates!&lt;/em&gt;&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/bionicspirit/~4/O_j6uqrF1g4" height="1" width="1"/&gt;</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/12/15/crawling-the-android-marketplace-155200-apps.html</feedburner:origLink></entry>
 
 
 
 
 
 <entry>
   <title>Android Learning Resources</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/G9klySIw2Tk/android-learning-resources.html" />
   <updated>2011-12-12T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2011/12/12/android-learning-resources</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Books" label="Books" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Android" label="Android" />
   

   <content type="html">&lt;p&gt;Starting out learning Android development may be intimidating at
first, as with any new platform of reasonable complexity you'll have
a lot to learn. However the learning process is fun. So here are
some learning resources that I'm currently following.&lt;/p&gt;

&lt;!-- more --&gt;


&lt;p&gt;DISCLAIMER: the Amazon links in this article contain my affiliate code
and I get a commission should you choose to buy from Amazon in the
next 24 hours. However I'm including these links primarily because of
the awesome reviews included, but you should buy straight from the
publisher (publisher links also included, without an affiliate ID).&lt;/p&gt;

&lt;h2&gt;Books for Learning Android&lt;/h2&gt;

&lt;p&gt;My problem with books is that technical books get obsolete really
fast. Books from 2010, while still useful, are already insufficient
now with the release of Ice Cream Sandwich. The upside is that there
are a lot of books out there.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://commonsware.com"&gt;&lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/books-commonsware.png" title=""  style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For ~ $40 you can buy a 1-year subscription for
&lt;a href="http://commonsware.com"&gt;CommonsWare.com&lt;/a&gt;. The author, Mark Murphy, is
a very proficient Android developer, trainer and consultant, with a
huge
&lt;a href="http://stackoverflow.com/users/115145/commonsware"&gt;StackOverflow reputation&lt;/a&gt;
:) More seriously - for $40 you get 3 books that are continuously
updated, which is great.&lt;/p&gt;

&lt;p&gt;Here are the books (but don't buy them from Amazon, as you won't get
the 1-year subscription, which is the main reason I'm recommending
these):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0981678009/ref=as_li_ss_tl?ie=UTF8&amp;tag=bionicspirit-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0981678009"&gt;The Busy Coder's Guide to Android Development&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/098167805X/ref=as_li_ss_tl?ie=UTF8&amp;tag=bionicspirit-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=098167805X"&gt;The Busy Coder's Guide to Advanced Android Development&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0981678041/ref=as_li_ss_tl?ie=UTF8&amp;tag=bionicspirit-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=0981678041"&gt;Android Programming Tutorials&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;a href="http://www.amazon.com/gp/product/1934356565/ref=as_li_ss_tl?ie=UTF8&amp;amp;tag=bionicspirit-20&amp;amp;linkCode=as2&amp;amp;camp=1789&amp;amp;creative=390957&amp;amp;creativeASIN=1934356565"&gt;&lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/prag_hello_android.jpg" width="150" title="Hello Android (3rd edition)"  style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Again, don't buy these items from Amazon. Buy them with the 1-year
subscription from the &lt;a href="http://commonsware.com"&gt;Author's Website&lt;/a&gt; (I
did so myself and btw, I have no affiliation with the author). The
subscription is useful because you'll get upgrades for new versions of
Android and bug-fixes, for a whole year.&lt;/p&gt;

&lt;p&gt;Another book I've been reading is
&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/1934356565/ref=as_li_ss_tl?ie=UTF8&amp;tag=bionicspirit-20&amp;linkCode=as2&amp;camp=1789&amp;creative=390957&amp;creativeASIN=1934356565"&gt;Hello Android&lt;/a&gt;
by &lt;a href="http://www.zdnet.com/blog/burnette"&gt;Ed Burnette&lt;/a&gt;, published by the
&lt;a href="http://pragprog.com/book/eband3/hello-android"&gt;Pragmatic Programmers&lt;/a&gt;.
It's pretty good, but it is more of an introduction (truly a Hello
World).&lt;/p&gt;

&lt;p&gt;Other books I have not tried, so my list stops here, but updates will
follow.&lt;/p&gt;

&lt;h2&gt;Free Stuff Available Online&lt;/h2&gt;

&lt;p&gt;I'm the kind of developer that prefers to rely on freely available
stuff, because I learn by doing and technical books on APIs are
boring. Plus, I like free stuff. However, if you plan on getting
serious about it, then a small investment is worth it and will keep
you focused (nothing will keep you more focused than spending some
money).&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://developer.android.com/resources/browser.html?tag=tutorial"&gt;Tutorials, by Android Developers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://developer.android.com/resources/browser.html?tag=article"&gt;Articles, by Android Developers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The official beginner tutorials are not so comprehensive, however
you'll get a lot of value from reading the Articles. Lots of value
in there, which works best if used in conjunction with the samples
provided:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://developer.android.com/resources/browser.html?tag=sample"&gt;Samples, by Android Developers&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Seriously, you won't find a faster learning path than reading and
understanding the source-code of real apps. However, do not attempt
this without first going through some of the tutorials linked
above.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://developer.android.com/videos/index.html#v=twmuBbC_oB8"&gt;Videos from Google I/O&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://developer.android.com/guide/developing/index.html"&gt;The Official Dev Guide&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The videos from Google I/O are a gold mine, providing insight you
won't find anywhere else. Highly recommended. All in all, the
official documentation is good, although lacking beginner
friendliness and a structured clear path. Plus I noticed it has
holes in it, but you'll be fine.&lt;/p&gt;

&lt;p&gt;For asking questions, or browsing around for insightful gems, there's
nothing better than:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="http://stackoverflow.com/questions/tagged/android"&gt;The Android tag on StackOverflow.com&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;At the time of this writing, I don't have many Android-related
blogs to share with you and the ones I know about are of poor
quality - apparently not many people blog about their Android
experience. Hopefully this will change for the better.&lt;/p&gt;

&lt;p&gt;Please help me out in identifying other resources! Thanks!&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/bionicspirit/~4/G9klySIw2Tk" height="1" width="1"/&gt;</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/12/12/android-learning-resources.html</feedburner:origLink></entry>
 
 
 
 
 
 <entry>
   <title>Earning Money as an Amazon Affiliate</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/ab_wbpgfJd4/earning-money-as-an-amazon-affiliate.html" />
   <updated>2011-11-29T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2011/11/29/earning-money-as-an-amazon-affiliate</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Story" label="Story" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Publishing" label="Publishing" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Income" label="Income" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Amazon" label="Amazon" />
   

   <content type="html">&lt;p&gt;
  I published an article that I've meant to publish for a long
  time. I'm usually too lazy to bother writing many articles, however
  this time I also thought about doing an experiment - you see I'm (1)
  on a tight budget and (2) a cheap bastard - so what if I could get
  enough money to pay for my monthly hosting on Linode, while
  satisfying my urge to write from time to time? 
&lt;/p&gt;

&lt;p&gt;
  As such I included Amazon Affiliate links in that post to see what
  happens.
&lt;/p&gt;

&lt;h2&gt;UPDATE: Moral Considerations Against Amazon's Associates&lt;/h2&gt;

&lt;p&gt;
  The discussion on &lt;a
  href="http://news.ycombinator.com/item?id=3291167"
  target="_blank"&gt;Hacker News&lt;/a&gt; took an interesting turn. A couple
  of comments are a little unbalanced, however there's &lt;a
  href="http://news.ycombinator.com/item?id=3292508" target="_blank"&gt;a
  comment&lt;/a&gt; that I like and with which I am in agreement:
&lt;/p&gt;

&lt;p class="dialog"&gt;
  &lt;i&gt;
    Having affiliate links creates incentives that may not align with
    faithfully serving your readers. It does not automatically bias your
    writing, but it can certainly create the appearance of bias.
  &lt;/i&gt;
  &lt;br /&gt; &lt;br /&gt;
  &lt;i&gt;
    A concrete example would be writing an especially glowing review
    of the new Kindle because you have a vested stake in people buying
    them. Or, perhaps, NOT writing a glowing review because you fear
    it will be perceived as shilling for affiliate cash.
  &lt;/i&gt;
  &lt;br /&gt; &lt;br /&gt;
  &lt;i&gt;
    In fact On The Media recently did a story about the Washington
    Post struggling with whether or not to include Amazon affiliate
    links in its book reviews. I think it presents both sides of the
    argument: &lt;a
    href="http://www.onthemedia.org/2011/nov/11/web-links-money-makers/transcript/"
    target="_blank"&gt;www.onthemedia.org/2011/nov/11/web-links...&lt;/a&gt;
  &lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;
  This comment is striking, as when I started fantasizing about where
  I should go from here, the one thing that crossed my mind was that I
  could write a review for the Kindle (I own one) - and oops, 
  unfortunately in that instance my actual opinion would have been
  &lt;i&gt;unfaithful&lt;/i&gt; - &lt;a href="http://www.elidickinson.com/"
  target='_blank'&gt;@esd&lt;/a&gt; nailed it.
&lt;/p&gt;

&lt;p&gt;
  Also, &lt;a href="http://news.ycombinator.com/item?id=3292777"&gt;a
  reply&lt;/a&gt; puts the above in balance:
&lt;/p&gt;

&lt;p class="dialog"&gt;
  &lt;i&gt;
    Newspapers likely disallow such practices in order to maintain
    journalistic integrity, but a blog author who is writing posts on
    purpose to sell things is probably not interested in maintaining
    journalistic integrity. The blog author is just interested in
    selling stuff. Maybe the blog posts are well-written and
    interesting, or maybe they are not. If they are not, then readers
    who care principally about content will likely avoid the blog on the
    lack of merit of the content itself.
  &lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;
  (the quality of Hacker News still amazes me)
&lt;/p&gt;

&lt;p&gt;
  In my opinion, putting affiliate links is not bad or evil per se -
  for instance you could say that customers of Apple or Google have an
  inherent bias because they feel the need to protect their monetary
  and/or emotional investment. And we aren't professional journalists,
  trained to watch out for such things - but a mistake is a
  mistake. So from now on, no more Amazon-related reviews coming from
  me (maybe I'll try clearly delimited boxes or something).
&lt;/p&gt;

&lt;h2&gt;Some Facts and Stats&lt;/h2&gt;

&lt;p&gt;
  The article I'm talking about:
&lt;/p&gt;

&lt;p&gt;
  &amp;nbsp;&amp;nbsp;&amp;nbsp;&amp;nbsp;&lt;a href="/blog/2011/11/25/4-books-for-learning-to-design-the-hard-way.html"&gt;4 Books for Learning to Design, the Hard Way&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  I published this article on the 25th, &lt;i&gt;The Black Friday&lt;/i&gt; - so
  it had perfect timing. Then I pushed this link to Hacker News and
  Reddit. I hope you will forgive me, since this was a little
  self-promotion and I don't deny it, however I hope you found the
  content therein to be worth it, as it was published with my best
  intentions.
&lt;/p&gt;

&lt;p&gt;
  The resulting traffic and the fee that Amazon gives for the orders
  generated took me by surprise. Here's what my traffic looks like
  (unique visitors):
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Nov 24: 45&lt;/li&gt;
  &lt;li&gt;Nov 25: 10,263&lt;/li&gt;
  &lt;li&gt;Nov 26: 6,939&lt;/li&gt;
  &lt;li&gt;Nov 27: 1,713&lt;/li&gt;
  &lt;li&gt;Nov 28: 1,299&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  Wow. I never generated this much traffic with my blog. Also, here are
  the Amazon stats (updated for Nov 28th):
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Ordered items: 333&lt;/li&gt;
  &lt;li&gt;Clicks: 6,259&lt;/li&gt;
  &lt;li&gt;Conversion: 5.32%&lt;/li&gt;
  &lt;li&gt;Total items shipped: 294&lt;/li&gt;
  &lt;li&gt;Total earnings: $367.10&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  So considering that not all items have been shipped and this ain't
  over yet (I'm still getting traffic), we're looking at:
&lt;/p&gt;

&lt;p class="dialog"&gt;
  a half-an-hour effort for a single article, earning &lt;b&gt;$400&lt;/b&gt;, in
  4 days, on a very low-traffic personal blog
&lt;/p&gt;

&lt;p&gt;
  My goal was achieved too - I now have enough money for ~20 months of
  hosting on Linode. Thank you dear readers, I am in your debt.
&lt;/p&gt;

&lt;h2&gt;Amazon Associates versus Google AdSense&lt;/h2&gt;

&lt;p&gt;
  &lt;i&gt;Google AdSense&lt;/i&gt; rewards are &lt;i&gt;per-click&lt;/i&gt; and it is the first
  option of many webmasters, because it does generate more
  money. However I feel that the overall user experience suffers a lot
  - the links served may be context-dependent, but the quality is poor.
&lt;/p&gt;

&lt;p&gt;
  &lt;i&gt;Amazon Associates&lt;/i&gt; rewards are instead &lt;i&gt;per-action&lt;/i&gt;. When
  the user buys something, you get a referral fee. This can work very
  well because the items displayed are hand-picked by you and the
  links add value to your content - whenever I search for reviews
  written by individuals (which I trust more than those of experts),
  my first stop is Amazon.
&lt;/p&gt;

&lt;h2&gt;Why A Single Flower Doesn't Bring Spring&lt;/h2&gt;

&lt;p&gt;
  So whenever anybody does this successfully, the appetite only grows
  bigger. After all, this kind of revenue has the potential of being
  passive. However don't get your hopes up:
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    in this instance, the timing was perfect and so the circumstances
    can't be reproduced every month
  &lt;/li&gt;
  &lt;li&gt;
    to have constant conversions you need a big and loyal following
    that trusts your recommendations
  &lt;/li&gt;
  &lt;li&gt;
    to have a big and loyal following, you also need lots of traffic
    driven by search engines 
  &lt;/li&gt;
  &lt;li&gt;
    to score well in search engines, you need lots of articles with
    lots of keywords and good ranking
  &lt;/li&gt;
  &lt;li&gt;
    to have loyal readers and a good ranking on Google, those articles
    must be high-quality
  &lt;/li&gt;
  &lt;li&gt;
    it takes lots and lots of work for the above, probably 1 or 2
    years, assuming you already have a talent for writing content,
    or the money to hire people to do it
  &lt;/li&gt;
  &lt;li&gt;
    if you can't recommend anything else other than books, then you're
    screwed, as you don't have the time to read so many
  &lt;/li&gt;
  &lt;li&gt;
    if the products getting recommended are not relevant to your
    audience, then they won't convert
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  So there you have it. My experiment with Amazon's Associates, while
  probably a short one, has brought joy to my heart :)
&lt;/p&gt;

&lt;p&gt;
  Thank you,
&lt;/p&gt;

&lt;img src="http://feeds.feedburner.com/~r/bionicspirit/~4/ab_wbpgfJd4" height="1" width="1"/&gt;</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/11/29/earning-money-as-an-amazon-affiliate.html</feedburner:origLink></entry>
 
 
 
 <entry>
   <title>4 Books For Learning to Design, The Hard Way</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/yaJYExceb4U/4-books-for-learning-to-design-the-hard-way.html" />
   <updated>2011-11-25T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2011/11/25/4-books-for-learning-to-design-the-hard-way</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Books" label="Books" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Design" label="Design" />
   
   <category scheme="http://bionicspirit.com/tags/" term="UX" label="UX" />
   

   <content type="html">&lt;p&gt;
  This is the path I'm taking to &lt;i&gt;not suck&lt;/i&gt; at design anymore, as
  frankly, I'm getting tired of sucking. I've read the first 3 books
  here and I'm making progress on the last one. I highly recommend all
  4.
&lt;/p&gt;

&lt;p&gt;
  (&lt;b&gt;UPDATE - DISCLAIMER:&lt;/b&gt; this article contains Amazon affiliate links, as part of
  an experiment which I'm &lt;a href="/blog/2011/11/29/earning-money-as-an-amazon-affiliate.html"&gt;describing
  here&lt;/a&gt;, however the article expresses my genuine view)
&lt;/p&gt;

&lt;h2&gt;The Design of Everyday Things (&lt;a href="http://www.amazon.com/gp/product/0465067107/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0465067107" rel="nofollow" title="The Design of Everyday Things"&gt;link&lt;/a&gt;)&lt;/h2&gt;

&lt;a href="http://www.amazon.com/gp/product/0465067107/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0465067107" title="The Design of Everyday Things" rel="nofollow"&gt;
  &lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/book-design-1.jpg" width="104" height="160" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;
&lt;/a&gt;

&lt;p&gt;
  It explains the design of common household objects that you may use
  daily and it is one of the best books you can read on user
  experience.
&lt;/p&gt;

&lt;p&gt;
  To me this was an eye opener that completely changed the way I think
  about interfaces. It explains how people interact with the objects
  around them and how they learn. It explains the importance of a
  user's mental model of how your product works. It gives you a good
  feeling on what it means and why it matters to &lt;i&gt;design for
  errors&lt;/i&gt;. It helps you to prevent a lot of design errors that a
  lot of products have.
&lt;/p&gt;

&lt;p style="clear:both;"&gt;
  The book itself was written in 1990, so it does have here and there
  some references to products that are outdated, but the analysis
  itself will never be outdated or obsolete. Quite the contrary - it's
  fascinating how user-centric design guidelines &lt;i&gt;stay the same&lt;/i&gt;,
  even though there are a lot of people out there that repeat the same
  mistakes over and over again.
&lt;/p&gt;

&lt;p&gt;
  When it comes to software, a complementary article on the subject is &lt;a
  href="http://www.joelonsoftware.com/uibook/fog0000000249.html"&gt;User
  Interface Design For Programmers&lt;/a&gt;, by Joel Spolsky.
&lt;/p&gt;

&lt;h2&gt;Non-Designer's Design Book (&lt;a href="http://www.amazon.com/gp/product/0321534042/ref=as_li_ss_il?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0321534042" rel="nofollow" title="Non-Designer's Design Book"&gt;link&lt;/a&gt;)&lt;/h2&gt;

&lt;a href="http://www.amazon.com/gp/product/0321534042/ref=as_li_ss_il?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0321534042" title="Non-Designer's Design Book" rel="nofollow"&gt;
  &lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/book-design-2.jpg" width="112" height="160" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;
&lt;/a&gt;

&lt;p&gt;
  Robin Williams does an excellent job introducing you to the basic
  concepts of designing visuals, with clearly explained principles and
  techniques.
&lt;/p&gt;

&lt;p&gt;
  It's pretty hard for us developers to design anything pleasing to
  the eye. And that's not the only problem you're facing - the visuals
  of a site have to give hints to the user about their next actions,
  so you've got many constraints to worry about. Sometimes you get
  lucky by just copying and combining other designs you
  like. Sometimes you have a good idea about what you want, but one
  day you like the result, then the next it looks like an abomination.
&lt;/p&gt;

&lt;p&gt;
  The book goes into some detail about how designers think. It has
  plenty of visual examples, it gives you many examples of what &lt;i&gt;not
  to do&lt;/i&gt;, it explains how to work around those problems, it is
  concise and doesn't bore you to tears - while not that useful for
  someone with design experience, for a developer that sucks at this
  game, this is a really, really good design manual.
&lt;/p&gt;

&lt;h2&gt;Color: A Course in Mastering the Art of Mixing Colors (&lt;a href="http://www.amazon.com/gp/product/1585422193/ref=as_li_ss_il?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=1585422193" rel="nofollow" title="Color: A Course in Mastering the Art of Mixing Colors"&gt;link&lt;/a&gt;)&lt;/h2&gt;

&lt;a href="http://www.amazon.com/gp/product/1585422193/ref=as_li_ss_il?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=1585422193" rel="nofollow" title="Color: A Course in Mastering the Art of Mixing Colors"&gt;
  &lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/book-design-3.jpg" width="129" height="160" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;
&lt;/a&gt;

&lt;p&gt;
  One of the most surprisingly difficult problems when creating
  designs is picking the color palette. This is so goddamn difficult
  sometimes. We do know that certain color combinations work better
  than others, but how do you pick them? How can you achieve
  &lt;i&gt;harmony&lt;/i&gt; as to not make your users' eyes bleed?

  And not only that, but you also want to emphasise certain portions
  of your pages in a way to attract attention - did you know that
  according to statistics, &lt;i&gt;more car accidents involve red cars than
  any other color?&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;
  There are web services out there, like &lt;a
  href="http://kuler.adobe.com/"&gt;Adobe Kuler&lt;/a&gt; or &lt;a
  href="http://www.colourlovers.com/"&gt;COLOURlovers.com&lt;/a&gt; which allow you to choose color
  palettes created by other people - the PROBLEM being that you'll
  never know what makes a palette work and so you'll make good changes
  to it only by luck or with expensive A/B testing.
&lt;/p&gt;

&lt;p&gt;
  This book by Betty Edwards is not your only choice for learning
  Color Theory. It isn't even related to web design in any way, having
  a whole section on mixing oil paint. 
&lt;/p&gt;

&lt;p&gt;
  However, I believe that color theory can only be learned from people
  that have real experience in mixing colors. For this reason,
  articles or books about color theory that aren't written by painters
  are quite shallow - and the theory in this book transcends the tools
  used.
&lt;/p&gt;

&lt;p&gt;
  It explains notions on color harmony, on the importance of contrast
  and gives you valuable insight and advice on how to mix and match
  colors. It's a fun read too, because of the quotes from famous
  people it contains - however, it's not a light read because you do
  have to execute the exercises within for best results. But I think
  it is worth it.
&lt;/p&gt;

&lt;p&gt;
  (I'm now my wife's advisor on colors, although our bedroom ended up
  looking awful, but there's no substitute for experience earned by
  making mistakes and at least I know where I went wrong ;))
&lt;/p&gt;

&lt;h2&gt;Drawing on the Right Side of the Brain (&lt;a href="http://www.amazon.com/gp/product/0874774195/ref=as_li_ss_il?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0874774195"  rel="nofollow" title="The New Drawing on the Right Side of the Brain"&gt;link&lt;/a&gt;)&lt;/h2&gt;

&lt;a href="http://www.amazon.com/gp/product/0874774195/ref=as_li_ss_il?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0874774195" rel="nofollow" title="The New Drawing on the Right Side of the Brain"&gt;
  &lt;img class="right" src="//d2fo8u6if66r8r.cloudfront.net/assets/photos/book-design-4.jpg" width="131" height="160" style="float: right; margin-left: 10px; margin-bottom: 10px;" align="right"&gt;
&lt;/a&gt;

&lt;p&gt;
  I've picked up drawing as my hobby. I'm only a beginner, but it's
  a fun hobby to have - it's impressive, silent and inexpensive. It also
  boosts your creativity like nothing else, as the only real limit you
  have is your imagination (much like software development ;))
&lt;/p&gt;

&lt;p&gt;
  Ever tried drawing anything lately? You should. The results will be
  awful. But do you know why? It's not your hand, it's not from a lack
  of talent, it's your eyes that are deceiving you - &lt;i&gt;in order to
  learn how to draw, you have to relearn how to see&lt;/i&gt;. That's
  because everything you see right now is filtered and transformed by
  your brain - as a cheap/fast exercise, look in a mirror at an arm's
  length and take a guess if your mirrored head is of the same size as
  your actual head (then use your hands to measure). The picture
  you're getting through your mind's eye is deceiving and for drawing
  skill and creativity to emerge you have to silence your brain.
&lt;/p&gt;

&lt;p&gt;
  This book by Betty Edwards has an awesome technique that works for
  everybody. Or so she says, but as far as I'm concerned I'm
  progressing in leaps and bounds and can already do drawings that I
  couldn't have hoped to do before I started reading this book ... I
  also promise to publish some drawings, but only after I get decent
  (I only started doing this 2 months ago, a frog doesn't transform
  into a prince overnight you know).
&lt;/p&gt;

&lt;p&gt;
  How does drawing help you in web design? Well, do you really have to
  ask?
&lt;/p&gt;

&lt;h2&gt;Update: Suggestions Received from Readers&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0201362988/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0201362988"&gt;The Design of Design&lt;/a&gt;, by Frederick P. Brooks&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/1592537413/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=1592537413"&gt;Visual Language for Designers&lt;/a&gt;, by Connie Malamed&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0300115954/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0300115954"&gt;Interaction of Color&lt;/a&gt;, by Josef Albers (&lt;a href="http://www.handprint.com/HP/WCL/book3.html#albers" rel="nofollow"&gt;review&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0891343377/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0891343377"&gt;Keys to Drawing&lt;/a&gt;, by Bert Dodson&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/1119998956/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=1119998956"&gt;Design for Hackers&lt;/a&gt;, by David Kadavy&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0442240392/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0442240392"&gt;Design and Form&lt;/a&gt;, by Johannes Itten&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/041238390X/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399373&amp;creativeASIN=041238390X"&gt;Elements of Color&lt;/a&gt;, by Johannes Itten&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0471285528/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0471285528"&gt;Principles of Form and Design&lt;/a&gt;, by Wucius Wong&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0898150523/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0898150523"&gt;Thinking with a Pencil&lt;/a&gt;, by Henning Nelms&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0262691914/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0262691914"&gt;The Sciences of the Artificial&lt;/a&gt;, by Herbert A. Simon&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/3764384840/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=3764384840"&gt;Designerly Ways of Knowing&lt;/a&gt;, by Nigel Cross&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0133033899/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0133033899"&gt;Designing Visual Interfaces&lt;/a&gt;, by Kevin Mullet and Darrell Sano&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/1568984650/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=1568984650"&gt;Grid Systems: Principles of Organizing Type&lt;/a&gt;, by Kimberly Elam (&lt;a href="http://filtercake.tumblr.com/post/7999571798/design-is-not-decoration" rel="nofollow"&gt;presentation&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a rel="nofollow" href="http://www.amazon.com/gp/product/0262062666/ref=as_li_ss_tl?ie=UTF8&amp;tag=alexanedel-20&amp;linkCode=as2&amp;camp=217145&amp;creative=399369&amp;creativeASIN=0262062666"&gt;101 Things I Learned in Architecture School&lt;/a&gt;, by Matthew Frederick&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  Thanks people, wow, I now have my hands full :) ... keep 'em coming!
&lt;/p&gt;

</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/11/25/4-books-for-learning-to-design-the-hard-way.html</feedburner:origLink></entry>
 
 
 
 <entry>
   <title>How I Use Flickr: For Backup</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/CAbnUJaGIr0/how-i-use-flickr.html" />
   <updated>2011-10-29T00:00:00+03:00</updated>
   <id>http://bionicspirit.com/blog/2011/10/29/how-i-use-flickr</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Cloud" label="Cloud" />
   
   <category scheme="http://bionicspirit.com/tags/" term="API" label="API" />
   

   <content type="html">&lt;p&gt;
  My collection of personal pictures has been growing since 2003,
  when I got my first digital camera, a shitty Sanyo that still works
  and that I still use whenever I forget about my Nikon.
&lt;/p&gt;

&lt;p&gt;
  But here's the thing with digital pictures - &lt;i&gt;&lt;b&gt;they are cheap to
  make, but also easy to lose&lt;/b&gt;&lt;/i&gt;. Digital storage is not as
  reliable as glossy paper. Pictures printed on paper can easily last
  for 100 years. That's not the case with any digital storage medium
  and we will suffer for it.
&lt;/p&gt;

&lt;h2&gt;Storing My Pictures In The Cloud&lt;/h2&gt;

&lt;p&gt;
  Pro accounts on Flickr have unlimited storage and can upload and
  access full-resolution pictures. This is great, although be careful
  about believing in "unlimited plans", as nothing is really unlimited
  and by abusing Flickr you may find yourself locked out of your
  account.
&lt;/p&gt;

&lt;p&gt;
  Unfortunately the tools for uploading really suck and I haven't yet
  encountered a graphical interface that does what I need. So for
  synchronizing, I've built my own script in Ruby using the excellent
  &lt;a href="http://hanklords.github.com/flickraw/"&gt;Flickraw gem&lt;/a&gt; and
  &lt;a href="http://exifr.rubyforge.org/"&gt;exifr&lt;/a&gt;, another Ruby gem
  that reads Exif headers from JPEG files.
&lt;/p&gt;

&lt;p&gt;
  One common problem is that you ALWAYS have duplicates. And you don't
  want to upload duplicates. What you really want is an "&lt;i&gt;rsync&lt;/i&gt;"
  command for Flickr. But how do you know if a picture was already
  uploaded?
&lt;/p&gt;

&lt;p&gt;
  The approach I'm using is to add a &lt;a
  href="http://www.flickr.com/groups/api/discuss/72157594497877875/"
  target="_blank"&gt;machine tag&lt;/a&gt; to my pictures, which is set like a
  tag, but has the format "namespace:key=value". This machine tag
  holds the MD5 hash of the picture, so if you want to see whether a
  certain photo was already uploaded to Flickr, you can always &lt;a
  href="http://www.flickr.com/services/api/flickr.photos.search.html"
  target="_blank"&gt;search for it&lt;/a&gt;. Here's how it looks on one of my
  pictures:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="bash"&gt;checksum:md5&lt;span class="o"&gt;=&lt;/span&gt;5b2fa91c38a7f878088e1420b924e6d9
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
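
&lt;p&gt;
  As a rough sketch of how such a tag could be computed in Ruby (this
  is not the actual sync script; the helper name
  &lt;i&gt;md5_machine_tag&lt;/i&gt; and the file path in the comment are made up
  for illustration), the standard library's &lt;i&gt;Digest::MD5&lt;/i&gt; is
  enough:
&lt;/p&gt;

```ruby
require "digest/md5"

# Builds the machine tag for a photo's raw bytes, using the
# "namespace:key=value" format described above.
# The helper name is hypothetical, not part of any gem.
def md5_machine_tag(bytes)
  "checksum:md5=#{Digest::MD5.hexdigest(bytes)}"
end

# For a real file you would hash its binary content, e.g.:
#   md5_machine_tag(File.binread("photo.jpg"))  # placeholder path
puts md5_machine_tag("hello")
# prints checksum:md5=5d41402abc4b2a76b9719d911017c592
```

&lt;p&gt;
  The resulting string is what gets attached as a tag at upload time,
  and the same string is what you would search for later to check
  whether a file is already on Flickr.
&lt;/p&gt;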


&lt;p&gt;
  Besides this, I have a problem with some of the older photos
  taken by my Sanyo, where the taken date is totally fucked, and for
  personal photos the taken date is maybe more important than the
  actual photo quality. I use the excellent &lt;a
  href="http://www.sno.phy.queensu.ca/~phil/exiftool/"&gt;ExifTool&lt;/a&gt; to
  correct those photos. It's nice building on the hard work of other
  people ;)
&lt;/p&gt;

&lt;p&gt;
  So, currently I have 3545 pictures uploaded on Flickr in full
  resolution and the number will more than triple as soon as I make an
  inventory of my pictures stored on old hardware I've got lying
  around.
&lt;/p&gt;

&lt;p&gt;
  It is fun being a developer. I can make shit happen.
&lt;/p&gt;

&lt;h2&gt;Flickr is Not A Reliable Backup&lt;/h2&gt;

&lt;p&gt;
  Flickr is an online service that isn't meant to be a backup. I
  share only a fraction of what I upload, everything else is &lt;i&gt;family
  only&lt;/i&gt;. They may terminate your account at any time for whatever
  reason. They may also go out of business. Yahoo may sell it, etc,
  etc... I do think Flickr is awesome btw and one reason that I store
  my photos on Flickr is to be able to always have the whole archive
  with me. But for backup alone, that's not enough.
&lt;/p&gt;

&lt;p&gt;
  What you really need to do is:
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    your main repository should be stored locally and properly
    maintained - I do that on my main computer currently, but multi-TB
    external hard-drives are cheap
  &lt;/li&gt;
  &lt;li&gt;
    in case of cloud backup, you always need a secondary service for
    redundancy
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  Google's Picasa is a good option because you can explicitly buy
  storage. This means that if you pay for 80 GB of storage nobody is
  going to get upset that you uploaded 80 GB of private photos
  ... also, if your photo collection matters to you, don't put your
  trust in their free Google+ offering (photos of up to 2048x2048
  pixels do not count towards your free quota). That's because that
  offer is not meant for you. Just pay up.
&lt;/p&gt;

&lt;p&gt;
  So on Google's Picasa, I'm currently working on integrating with
  their API too. The desktop app is nice, but too limited for me.
&lt;/p&gt;

</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/10/29/how-i-use-flickr.html</feedburner:origLink></entry>
 
 
 
 <entry>
   <title>Testing Different Browsers: It's a Pain in the Ass</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/75kcc9Sovis/testing-different-browsers.html" />
   <updated>2011-10-25T00:00:00+03:00</updated>
   <id>http://bionicspirit.com/blog/2011/10/25/testing-different-browsers</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Javascript" label="Javascript" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Browser" label="Browser" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Web" label="Web" />
   

   <content type="html">&lt;p&gt;
  I got a notice that my &lt;a
  href="https://github.com/alexandru/crossdomain-requests-js"
  target="_blank"&gt;crossdomain-requests-js script&lt;/a&gt; (described
  at length &lt;a
  href="/blog/2011/03/24/cross-domain-requests.html"&gt;here&lt;/a&gt;)
  does not work on IExplorer 9.
&lt;/p&gt;

&lt;p&gt;
  God, how much I hate dealing with browsers. Initially when I wrote
  that script I tested on IExplorer 6 and IExplorer 8. I already had
  to deal with issues regarding the IExplorer 8 compatibility mode. I
  had already decided to completely ignore IExplorer 7, as
  that was a fucked up release. Either way, it is possible that I
  completely wrecked IExplorer support with later changes that I
  failed to retest.
&lt;/p&gt;

&lt;p&gt;
  And all of this for a 233-line script (including comments and
  whitespace). How fucked up is that?
&lt;/p&gt;

&lt;p&gt;
  This is also why I love jQuery so much. Unfortunately if you want to
  publish a &lt;i&gt;reusable&lt;/i&gt; library it's extremely annoying to bring a
  dependency such as jQuery with it. Even though jQuery is popular,
  the latest release is a whopping 90K in size. Do you know how
  painful that is on mobile browsers? NO, infrastructure code has to
  be done for the lowest common denominator: that's why people still
  do complex shit in C. C is portable and you can link to C from any
  other language.
&lt;/p&gt;

&lt;p&gt;
  Either way, I might do a rewrite in CoffeeScript: at least that's
  going to save me the pain of dealing with syntax differences, as
  I've been having those kinds of problems too ;-)
&lt;/p&gt;

&lt;h2&gt;So what to do?&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;
    prefer CoffeeScript for development, not because it is cool but
    because it is sane
  &lt;/li&gt;
  &lt;li&gt;
    Microsoft was kind enough to release virtual-machine images for
    testing your websites in various IExplorer versions - I'm
    installing them in VirtualBox right now using this script here: &lt;a
    target="_blank"
    href="https://github.com/xdissent/ievms"&gt;https://github.com/xdissent/ievms&lt;/a&gt;
    (many thanks to the author, I'll let you know how it goes)
  &lt;/li&gt;
  &lt;li&gt;
    Always test with the 6 biggies: Firefox, Chrome, Opera, IExplorer
    6, IExplorer 8 (watch out for the compatibility mode) and
    IExplorer 9
  &lt;/li&gt;
  &lt;li&gt;
    There are few differences between Chrome and Safari: both are
    based on WebKit, and V8 is pretty compatible with Safari's JS
    engine, so I don't bother testing Safari. But it doesn't hurt if
    you have it around.
  &lt;/li&gt;
  &lt;li&gt;
    Mobile Safari and Android's browser are &lt;i&gt;different&lt;/i&gt; from
    desktop Safari and Chrome. Do not assume that your shit will
    automatically work on mobiles.
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Do me a small favor ...&lt;/h2&gt;

&lt;p&gt;
  Open this page: &lt;a href="/projects/crossdomain-requests-js/"
  target="_blank"&gt;http://bionicspirit.com/projects/crossdomain-requests-js/&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
  Then tell me if it works or not in your browser, including the
  browser name and version you are using. You can use the comments
  section below, or &lt;a href="http://bit.ly/sA07lI"
  target="_blank"&gt;this open issue on GitHub&lt;/a&gt;.
&lt;/p&gt;

</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/10/25/testing-different-browsers.html</feedburner:origLink></entry>
 
 
 
 <entry>
   <title>Why I Find Heroku Suboptimal</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/IcloJvrGIYo/why-i-find-heroku-suboptimal.html" />
   <updated>2011-10-23T00:00:00+03:00</updated>
   <id>http://bionicspirit.com/blog/2011/10/23/why-i-find-heroku-suboptimal</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Heroku" label="Heroku" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Server" label="Server" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Cloud" label="Cloud" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Nginx" label="Nginx" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Varnish" label="Varnish" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Linode" label="Linode" />
   

   <content type="html">&lt;p&gt;
  I love freebies. I often find myself compelled to search for the
  best price / convenience ratio, and from this perspective you cannot
  really argue against something offered for free. And yet, here I am
  bitching and moaning about Heroku.
&lt;/p&gt;

&lt;p&gt;
  Heroku provides a free quota that's a LOT more reasonable than all
  the shitty PHP hosting offerings out there. And when the time comes
  to scale, it lets you scale nicely for a price.
&lt;/p&gt;

&lt;p&gt;
  Normally you develop your app on your localhost (which is like this
  warm and cozy place for all developers, &lt;i&gt;no place like
  127.0.0.1&lt;/i&gt; and all that), but then you want to deploy. You have
  to get out of your comfort zone and face the jungle and it's a true
  jungle out there, filled with shitty / underpowered and expensive
  hosting offerings. If going for a normal VPS, you'll have to
  configure your application server, your database server, your
  webserver that sits on top, maybe a reverse proxy cache, a memcached
  instance or two, a load balancer, a firewall, an email server and it
  goes on and on. And if going for a classic shared-hosting
  environment, then God help you.
&lt;/p&gt;

&lt;p&gt;
  There's a reason children with happy childhoods don't want to grow
  up - the world is an ugly and scary place.
&lt;/p&gt;

&lt;h2&gt;git push heroku master&lt;/h2&gt;

&lt;p&gt;
  Heroku is great. It basically allows you to avoid growing up. The
  deployment itself couldn't be simpler, and when browsing their web
  interface for available add-ons, I feel like a child in a
  candy-store.
&lt;/p&gt;

&lt;p&gt;
  Basically you start with a free worker, to which you can add other
  "free" services, like a 5MB PostgreSQL database and a 5MB Memcached
  instance, allowing you to prototype stuff. They even have plugins
  from third parties that give you freebies, like a 250MB CouchDB, or
  a 240MB MongoDB. Then as you grow, you start adding more and more
  resources as needed. This has been labeled as &lt;i&gt;platform as a
  service&lt;/i&gt; and it's what the cool kids are talking about these
  days.
&lt;/p&gt;

&lt;p&gt;
  Heck, there are people who are living within that free quota
  without problems. One such example that I know of is &lt;a
  href="http://tzigla.com" target="_blank"&gt;http://tzigla.com&lt;/a&gt;
  ... or it was last time I talked to the authors, both acquaintances
  of mine. Cristi described how he ended up doing lots of
  workarounds to get around limitations and he was really excited
  about how everything fell into place.
&lt;/p&gt;

&lt;p&gt;
  But as I was sitting there admiring their determination and skill, I
  started wondering why the hell they hadn't rented a normal VPS.
&lt;/p&gt;

&lt;p&gt;
  I mean really, if you end up pulling all kinds of crap to get around
  limitations, wouldn't it be better to just pay up? And if you're
  short on cash or you're the kind of entrepreneur that likes to spend
  frugally, then wouldn't you be better off just renting a normal VPS? I
  asked him just that of course, and his reply was basically:
&lt;/p&gt;

&lt;p class="dialog"&gt;
  &lt;i&gt;I hate to do sys-admin stuff, installing and upgrading packages and all that&lt;/i&gt;
&lt;/p&gt;

&lt;p&gt;
  But it doesn't have to be that way. It's really not that hard. The
  reason I feel this way is the Ubuntu I have had installed on my
  primary laptop for 5 years already. Once you work with Ubuntu, or
  your favorite Linux distribution, every day, configuring a
  web server for starters is something like a half-hour chore. Or
  let's say 1 hour, and then it's done. And you don't have to worry
  about it again.
&lt;/p&gt;

&lt;p&gt;
  &lt;b&gt;And there are disadvantages to Heroku&lt;/b&gt;, lots of them: that's
  because you lose control and end up on top of a platform that's
  designed as a common denominator to appeal to all needs in an
  equally substandard manner.
&lt;/p&gt;

&lt;h2&gt;Example 1: Nginx&lt;/h2&gt;

&lt;p&gt;
  Nginx is a freakishly fast web server that consumes very few
  resources. Its main appeal is in serving static files and you do
  have static files to serve. When you grow you may want to move those
  static files to a CDN, like CloudFront, which serves content from
  locations closer to the actual users, but for serving css/javascript
  and small images - a properly configured Nginx is all you need. And
  you can't really move any files served from your main domain to a
  CDN (like HTML content).
&lt;/p&gt;

&lt;p&gt;
  You can also be smart about semi-static pages in Rails - you can
  cache the output inside the &lt;i&gt;public/&lt;/i&gt; directory to be served by
  Nginx. And if you still want to hit your controller on every
  request, like when doing A/B Testing on a page, you can send an
  &lt;i&gt;X-Accel-Redirect&lt;/i&gt; header in your response to Nginx and let
  Nginx do the actual content streaming for you. You can also instruct
  Nginx to serve files from different locations, based on certain
  variables like the domain name, thus avoiding hitting the Rails
  application server on every request.
&lt;/p&gt;
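
&lt;p&gt;
  Here's a minimal sketch of the &lt;i&gt;X-Accel-Redirect&lt;/i&gt; idea, written
  as a bare Rack-style lambda rather than a real Rails controller. The
  &lt;i&gt;/protected/&lt;/i&gt; path, the file names and the &lt;i&gt;variant&lt;/i&gt; query
  parameter are all assumptions made for the example, and it presumes
  a matching Nginx location marked &lt;i&gt;internal&lt;/i&gt; that maps that path
  to files on disk:
&lt;/p&gt;

```ruby
# Rack-style endpoint: it returns [status, headers, body] like any
# Rack app, but leaves the body empty and names an internal path
# that Nginx is expected to stream instead.
# "/protected/" is a hypothetical Nginx location marked `internal`.
ACCEL_APP = lambda do |env|
  # Pick a page variant per request (e.g. for A/B testing).
  variant = env["QUERY_STRING"] == "variant=b" ? "b" : "a"
  headers = {
    "Content-Type" => "text/html",
    "X-Accel-Redirect" => "/protected/landing-#{variant}.html"
  }
  [200, headers, []]
end

status, headers, _body = ACCEL_APP.call("QUERY_STRING" => "variant=b")
puts status                       # prints 200
puts headers["X-Accel-Redirect"]  # prints /protected/landing-b.html
```

&lt;p&gt;
  The application still decides which file to serve on every request,
  but the bytes themselves never pass through the Ruby process.
&lt;/p&gt;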

&lt;p&gt;
  There's a lot you can do with Nginx if you're on a budget, and yet
  this is not possible within Heroku ... even though it may use
  Nginx as an HTTP reverse proxy, it certainly doesn't use it for
  serving static files. All files are thus served by hitting the Rails
  server, unless Varnish is involved.
&lt;/p&gt;

&lt;h2&gt;Example 2: Varnish&lt;/h2&gt;

&lt;p&gt;
  &lt;a href="https://www.varnish-cache.org/"&gt;Varnish&lt;/a&gt; is described as
  being a &lt;i&gt;web application accelerator&lt;/i&gt; and the things it can do
  are truly mind-blowing.
&lt;/p&gt;

&lt;p&gt;
  Varnish sits in front of your application servers. It can do
  &lt;i&gt;load-balancing&lt;/i&gt; for you with extreme efficiency, although
  that's not its main purpose. Its main purpose is to cache content.
&lt;/p&gt;

&lt;p&gt;
  When caching content you have extreme freedom in specifying the key
  for fetching cached items. You can use anything when instructing
  Varnish on what and how to cache, like cookies or the user's IP or
  any HTTP header. Do you want to also cache content for logged-in
  users, even though that content is slightly different from user to
  user? Not a problem. The configuration language is also extremely
  flexible, allowing you to tap into the request pipeline with any
  custom behavior you want. The performance of Varnish coupled with
  this extreme flexibility is what makes it great. It also has this
  uncanny ability to reload its configuration without restarting or
  dropping active connections.
&lt;/p&gt;

&lt;p&gt;
  Heroku has Varnish in its stable stack, called Bamboo. But you
  cannot configure it. The configuration is the same for everybody
  ... you basically set expiry headers on your response, Varnish
  caches it for you and the cache gets invalidated on every new
  deployment.
&lt;/p&gt;
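
&lt;p&gt;
  To make "set expiry headers" a bit more concrete, here's a hedged
  sketch of a Rack-style endpoint that marks its response as cacheable
  by a shared cache sitting in front of it. The five-minute lifetime
  is arbitrary, and whether a given Varnish setup honors the header
  depends on how it was configured:
&lt;/p&gt;

```ruby
# Rack-style endpoint whose response may be stored by a shared cache
# (such as Varnish) for five minutes before the app is hit again.
CACHEABLE_APP = lambda do |_env|
  headers = {
    "Content-Type" => "text/html",
    "Cache-Control" => "public, max-age=300"
  }
  [200, headers, ["cached for a while"]]
end

_status, headers, _body = CACHEABLE_APP.call({})
puts headers["Cache-Control"]  # prints public, max-age=300
```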

&lt;p&gt;
  This is actually good and has given rise to the famous Heroku
  use-case: hosting mostly static websites on it. But Varnish can be
  much more than that, otherwise it kind of gets in your way, and
  surprise - Heroku is pulling Varnish out of the configuration,
  starting with the new Celadon Cedar stack. This is because Varnish
  gets in the way of their ambitious plans: to make Heroku
  platform-agnostic, thus adding support for Node.js and long-polling.
&lt;/p&gt;

&lt;p&gt;
  The now recommended alternative for serving cached static content is
  to use Rack::Cache in combination with their Memcached add-on. But
  this sucks because (1) it hits the Rails server on every request and
  in the free plan you only have a single process to serve those
  requests, and (2) the free plan for Memcached is only 5MB.
&lt;/p&gt;

&lt;h2&gt;Example 3: asynchronous jobs&lt;/h2&gt;

&lt;p&gt;
  One common-sense approach to not having a sluggish web interface is
  to get slow code out of your HTTP process. Lots of libraries and
  plugins are available for all web frameworks, like
  &lt;i&gt;delayed_job&lt;/i&gt; for Rails or &lt;i&gt;Celery&lt;/i&gt; for Django. And you
  can just write your own half-baked jobs queue and shove it in your
  cron.
&lt;/p&gt;

&lt;p&gt;
  You cannot have asynchronous jobs using Heroku's free plan. You must
  get an extra dyno for that.
&lt;/p&gt;

&lt;h2&gt;Price comparison with Linode&lt;/h2&gt;

&lt;p&gt;
  The cheapest &lt;a
  href="http://www.linode.com/?r=c7376c22b7853329bfb629a54dc9a843be935c36"&gt;Linode
  instance&lt;/a&gt; is &lt;b&gt;$20&lt;/b&gt; per month, and for starters you can have ...
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;1 Nginx server&lt;/li&gt;
  &lt;li&gt;2 Passenger/Rails processes&lt;/li&gt;
  &lt;li&gt;
    1 worker for processing asynchronous jobs, it can even be a plain
    cron-job ; you do have complete flexibility in configuring
    cron-jobs
  &lt;/li&gt;
  &lt;li&gt;
    1 PostgreSQL database, configured for 256MB RAM usage, with 18 GB
    of storage. It's not much, but it isn't &lt;i&gt;shared&lt;/i&gt; either and
    does just fine, trust me ... btw, the &lt;a
    href="http://pgmag.org/"&gt;PostgreSQL magazine&lt;/a&gt; (first issue) has
    an article about configuring/optimizing PostgreSQL's memory usage
  &lt;/li&gt;
  &lt;li&gt;
    1 Postfix email server, for bug reports + sending all the spam
    you want (Linode lets you configure reverse DNS lookup, so you can
    have a cheap email server that doesn't trigger spam alerts)
  &lt;/li&gt;
  &lt;li&gt;
    ability to serve for any domain you want, including wildcard subdomains
  &lt;/li&gt;
  &lt;li&gt;
    your own SSL certificate, for free depending on provider
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  The equivalent Heroku configuration would cost a minimum of &lt;b&gt;$114
  per month&lt;/b&gt;.
&lt;/p&gt;

&lt;p&gt;
  So let's say that you're growing and you want to add Ronin, Heroku's
  plan for a database of 1.7 GB &lt;i&gt;hot data set&lt;/i&gt; (whatever the fuck
  that means). That will cost you a whopping &lt;b&gt;$200 per month&lt;/b&gt;
  extra, versus &lt;b&gt;$80&lt;/b&gt; for a Linode instance with 2GB of RAM, or
  even better, $160 for one with 4GB of RAM.
&lt;/p&gt;

&lt;h2&gt;Linode sucks too, but that's beside the point&lt;/h2&gt;

&lt;p&gt;
  With a VPS you lose the ability to add dynos in response to traffic
  surges. On the other hand you'll be amazed at how much you can
  squeeze out of your rented hardware and if a properly configured
  setup fails to serve, then the problems you have probably can't be
  solved by just adding extra web servers.
&lt;/p&gt;

&lt;p&gt;
  Really, do some reading on why Reddit is down so often. Do some
  reading on why Amazon's EBS is completely unreliable for databases
  (btw, Heroku does use EBS and they've also had their share of
  downtime due to AWS experiencing problems).
&lt;/p&gt;

&lt;p&gt;
  Stop fearing the penguin and start configuring your own damn
  servers. As with everything that's actually worth it in life (like
  having children of your own), it's hard at first but the return on
  investment will be tenfold.
&lt;/p&gt;

&lt;p&gt;
  &lt;b&gt;PS:&lt;/b&gt; I'm obviously advertising &lt;a
  href="http://www.linode.com/?r=c7376c22b7853329bfb629a54dc9a843be935c36"&gt;Linode&lt;/a&gt;
  here. Links to it contain my affiliate tracking code, and if you
  become a customer you'll give me $20 worth of credit, which helps me
  pay for this blog's hosting (what can I say, I'm a cheap
  bastard). On the other hand this does express my genuine view of
  these services.
&lt;/p&gt;

</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/10/23/why-i-find-heroku-suboptimal.html</feedburner:origLink></entry>
 
 
 
 <entry>
   <title>Cross-Domain, Cross-Browser AJAX Requests</title>
   <link href="http://feedproxy.google.com/~r/bionicspirit/~3/gaEBNCIqF4w/cross-domain-requests.html" />
   <updated>2011-03-24T00:00:00+02:00</updated>
   <id>http://bionicspirit.com/blog/2011/03/24/cross-domain-requests</id>

   <author>
     <name>Bionic Spirit</name>
     <email>contact@bionicspirit.com</email>
     <uri>http://bionicspirit.com</uri>
   </author>

   <rights type="text">
     Copyright 2011 Alexandru Nedelcu.
     Some rights reserved (CC BY-NC 3.0)
     License: http://creativecommons.org/licenses/by-nc/3.0/
   </rights>

   
   <category scheme="http://bionicspirit.com/tags/" term="Javascript" label="Javascript" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Browser" label="Browser" />
   
   <category scheme="http://bionicspirit.com/tags/" term="Web" label="Web" />
   

   <content type="html">&lt;p&gt;
  This article describes how to make cross-domain requests, in all
  browsers (including &lt;u&gt;IExplorer 6&lt;/u&gt;), without using a proxy or JSONP
  (which is limited and awkward) -- as long as you control the
  destination server, or if the destination server allows it.
&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;I'm explaining this file: &lt;a
  href="https://github.com/alexandru/crossdomain-requests-js/blob/gh-pages/public/crossdomain-ajax.js"&gt;crossdomain-ajax.js&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;Simple usage example: &lt;a
  href="/projects/crossdomain-requests-js/"&gt;http://bionicspirit.com/projects/crossdomain-requests-js/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;
  For a more serious example that works, check out the Comments section
  getting loaded at the bottom of this page.
&lt;/p&gt;

&lt;h3&gt;UPDATED Oct 27, 2011&lt;/h3&gt;

&lt;p&gt;
  Added usage restrictions and removed functionality that doesn't
  work on IExplorer. So in case this doesn't work for you, please see
  this page: &lt;a
  href="https://github.com/alexandru/crossdomain-requests-js/wiki/Troubleshooting"&gt;Troubleshooting&lt;/a&gt;
&lt;/p&gt;

&lt;h2&gt;In Modern Browsers - Meet Cross-Origin Resource Sharing&lt;/h2&gt;

&lt;p&gt;
  Or &lt;a href="http://www.w3.org/TR/cors/"&gt;CORS&lt;/a&gt; for short, or &lt;a
  href="https://developer.mozilla.org/en/http_access_control"
  target="_blank"&gt;HTTP Access Control&lt;/a&gt;, available in recent
  browsers, allows you to make cross-domain HTTP requests; the only
  requirement is that you must have control over the
  server-side implementation of the domain targeted in your
  XMLHttpRequest calls.
&lt;/p&gt;

&lt;p&gt;
  This little piece of technology has been available since Firefox 3.5 /
  IExplorer 8, and yet when searching for answers on websites like
  StackOverflow, it rarely comes up.
&lt;/p&gt;

&lt;p&gt;
  For the purposes of this tutorial, we'll assume you want to make a
  request from website &lt;u&gt;&lt;i&gt;http://source.com&lt;/i&gt;&lt;/u&gt; to
  &lt;u&gt;&lt;i&gt;http://destination.org&lt;/i&gt;&lt;/u&gt;, and that you control the
  implementation of both.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;XMLHttpRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="c1"&gt;// NOPE, it doesn&amp;#39;t work, yet&lt;/span&gt;
&lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;POST&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;http://destination.org&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h2&gt;Response of &lt;i&gt;destination.org&lt;/i&gt;&lt;/h2&gt;

&lt;p&gt;
  It's pretty simple really: all you need to do is return these
  headers in your response:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="yaml"&gt;&lt;span class="l-Scalar-Plain"&gt;Access-Control-Allow-Methods&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;GET, POST, OPTIONS&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;Access-Control-Allow-Credentials&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;true&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;Access-Control-Allow-Origin&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;http://source.com&lt;/span&gt;
&lt;span class="l-Scalar-Plain"&gt;Access-Control-Allow-Headers&lt;/span&gt;&lt;span class="p-Indicator"&gt;:&lt;/span&gt; &lt;span class="l-Scalar-Plain"&gt;Content-Type, *&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
  You can find a description of them on &lt;a
  href="https://developer.mozilla.org/en/http_access_control#The_HTTP_request_headers"&gt;Mozilla
  Doc Center&lt;/a&gt;, but the most important one is
  &lt;i&gt;Access-Control-Allow-Origin&lt;/i&gt;, which indicates the Origin(s)
  allowed to make such a request.
&lt;/p&gt;

&lt;p style="text-decoration: line-through"&gt;
  &lt;b&gt;Note:&lt;/b&gt; these options allow for wildcards (like you can say
  that you allow for any Origin by putting a "*" in that header), but
  it is better to be explicit about what's allowed, otherwise your
  request won't work very well cross-browser.
&lt;/p&gt;

&lt;p&gt;
  &lt;b&gt;(New) Note:&lt;/b&gt; With regard to Access-Control-Allow-Origin,
  IExplorer DOES NOT support wildcards. See &lt;a
  href="https://github.com/alexandru/crossdomain-requests-js/wiki/Troubleshooting"&gt;Troubleshooting&lt;/a&gt; for details.
&lt;/p&gt;

&lt;h2&gt;Client-side Implementation of Ajax Request for CORS&lt;/h2&gt;

&lt;p&gt;
  On browsers where &lt;u&gt;XMLHttpRequest&lt;/u&gt; is valid, support for CORS
  can be validated by checking for the availability of the
  &lt;u&gt;withCredentials&lt;/u&gt; property.
&lt;/p&gt;

&lt;p&gt;
  So we've got a tiny issue: &lt;u&gt;IExplorer's&lt;/u&gt; implementation
  differs from that of Firefox and the rest of the browsers
  (naturally). Instead of using the same &lt;u&gt;XMLHttpRequest&lt;/u&gt; object,
  IExplorer 8 adds an &lt;a
  href="http://msdn.microsoft.com/en-us/library/cc288060(v=vs.85).aspx"&gt;XDomainRequest&lt;/a&gt;
  object.
&lt;/p&gt;

&lt;p&gt;
  So, to initialize an async request that will work on IExplorer
  8, Firefox, Chrome and the other browsers supporting CORS:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;XMLHttpRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{}&lt;/span&gt;	

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;withCredentials&amp;quot;&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;XDomainRequest&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;undefined&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;XDomainRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
  But we aren't done yet: the callbacks used by these request objects
  behave differently on IExplorer. So let's say we've got 2
  callbacks that we want to register, one for success and one for errors,
  with the following signatures (similar to jQuery's):
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;responseText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="p"&gt;...&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
  To have correct behavior cross-browser:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="c1"&gt;// &lt;/span&gt;
&lt;span class="c1"&gt;// combines the success/error handlers into one &lt;/span&gt;
&lt;span class="c1"&gt;// higher-order function (getting a little fancy for code-reuse)&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;handle_load&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c1"&gt;//&lt;/span&gt;
        &lt;span class="c1"&gt;// stupid IExplorer won&amp;#39;t receive any param on callbacks!!!&lt;/span&gt;
        &lt;span class="c1"&gt;// thus the object used is the initial `xhr` object&lt;/span&gt;
        &lt;span class="c1"&gt;// (bound to this function because it&amp;#39;s a closure)&lt;/span&gt;
        &lt;span class="c1"&gt;//&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;XHRobj&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;is_iexplorer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

        &lt;span class="c1"&gt;//&lt;/span&gt;
        &lt;span class="c1"&gt;// IExplorer&amp;#39;s XDomainRequest also has no readyState&lt;/span&gt;
        &lt;span class="c1"&gt;// Also, success/error is decided by the `event_type` used at the call-site&lt;/span&gt;
        &lt;span class="c1"&gt;// &lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;load&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;is_iexplorer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;readyState&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;success&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nx"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;responseText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;

&lt;span class="k"&gt;try&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// IExplorer throws an exception on this one&lt;/span&gt;
    &lt;span class="c1"&gt;//&lt;/span&gt;
    &lt;span class="c1"&gt;// Setting this to `true` asks the browser to send the request with Cookies attached.&lt;/span&gt;
    &lt;span class="c1"&gt;// BUT -- it&amp;#39;s pretty useless, as IExplorer doesn&amp;#39;t support sending Cookies.&lt;/span&gt;
    &lt;span class="c1"&gt;//&lt;/span&gt;
    &lt;span class="c1"&gt;// Also, trying to set cookies from the response is not really possible directly &lt;/span&gt;
    &lt;span class="c1"&gt;// (workarounds are available though -- you can return anything in the response&amp;#39;s &lt;/span&gt;
    &lt;span class="c1"&gt;//  body and use local javascript for persistence/propagation on next request)&lt;/span&gt;
    &lt;span class="c1"&gt;//&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;withCredentials&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;catch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="c1"&gt;// `onload` + `onerror` are actually new additions to these browsers.&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="c1"&gt;// IExplorer doesn&amp;#39;t actually push params on calling these callbacks.&lt;/span&gt;
&lt;span class="c1"&gt;// For every other browser, the XHRobj we want is in `e.target`, &lt;/span&gt;
&lt;span class="c1"&gt;// where `e` is an event object.&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onload&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;handle_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;load&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="nx"&gt;is_iexplorer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onerror&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;handle_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;error&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)(&lt;/span&gt;&lt;span class="nx"&gt;is_iexplorer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;e&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;target&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
&lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
  Also of note, here's how to check if the browser is IExplorer:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;is_iexplorer&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; 
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;navigator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;userAgent&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;indexOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;MSIE&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!=-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; 
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
  Well, that's it, unless you want to support the rest of the desktop
  browsers in use.
&lt;/p&gt;

&lt;h2&gt;Fallback for Older Browsers&lt;/h2&gt;

&lt;p&gt;
  &lt;u&gt;Opera 10&lt;/u&gt; doesn't have this feature, and neither do IExplorer
  versions before 8 or Firefox versions before 3.5 -- and I don't
  really know when Chrome/Safari added it.
&lt;/p&gt;

&lt;p&gt;
  Fortunately there's a workaround -- Flash can do whatever you want
  and runs the same on ~90% of desktop browsers out there, AND it can
  interact with Javascript.
&lt;/p&gt;

&lt;p&gt;
  To avoid reinventing the wheel, here's a cool plugin: &lt;a
  href="http://flxhr.flensed.com/"&gt;flensed.flXHR&lt;/a&gt;.
&lt;/p&gt;

&lt;h3&gt;Why bother with CORS?&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;Flash is not available on the iPhone&lt;/li&gt;
  &lt;li&gt;Flash loads slower than Javascript&lt;/li&gt;
  &lt;li&gt;Flash SWF files come with a lot of junk that your browser has to download&lt;/li&gt;
  &lt;li&gt;The whole experience using flXHR will be visibly slower than with CORS&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;flXHR Usage&lt;/h3&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="c1"&gt;// Does a request using flXHR (the JS-Flash alternative &lt;/span&gt;
&lt;span class="c1"&gt;// implementation for XMLHttpRequest)&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;_ajax_with_flxhr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;     &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;url&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;type&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;type&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;GET&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;success&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;success&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;error&lt;/span&gt;   &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;error&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt;    &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;data&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;

    &lt;span class="c1"&gt;//&lt;/span&gt;
    &lt;span class="c1"&gt;// handles callbacks, just as above&lt;/span&gt;
    &lt;span class="c1"&gt;//&lt;/span&gt;
    &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;handle_load&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;readyState&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;status&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;success&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="nx"&gt;success&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;responseText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
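            &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;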
                &lt;span class="nx"&gt;error&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;XHRobj&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;flproxy&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;flensed&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;flXHR&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; 
        &lt;span class="nx"&gt;autoUpdatePlayer&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="nx"&gt;instanceId&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="s2"&gt;&amp;quot;myproxy1&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="nx"&gt;xmlResponseText&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; 
        &lt;span class="nx"&gt;onreadystatechange&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;&lt;span class="nx"&gt;handle_load&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nx"&gt;flproxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;flproxy&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;
  We are NOT done. The destination server also needs a file called
  &lt;u&gt;crossdomain.xml&lt;/u&gt;, which follows the &lt;a target="_blank" style="white-space: nowrap"
  href="http://www.adobe.com/devnet/articles/crossdomain_policy_file_spec.html"&gt;Crossdomain
  Policy File Spec&lt;/a&gt;. As a requirement, this file has to be placed
  in the domain's root,
  i.e. &lt;u&gt;http://destination.org/crossdomain.xml&lt;/u&gt; ...
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="cp"&gt;&amp;lt;?xml version=&amp;quot;1.0&amp;quot;?&amp;gt;&lt;/span&gt;
&lt;span class="cp"&gt;&amp;lt;!DOCTYPE cross-domain-policy SYSTEM &amp;quot;http://www.macromedia.com/xml/dtds/cross-domain-policy.dtd&amp;quot;&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;cross-domain-policy&amp;gt;&lt;/span&gt;
  &lt;span class="c"&gt;&amp;lt;!-- wildcard means &amp;#39;allow all&amp;#39; --&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;allow-access-from&lt;/span&gt; &lt;span class="na"&gt;domain=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;*&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;allow-http-request-headers-from&lt;/span&gt; &lt;span class="na"&gt;domain=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;*&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;headers=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;*&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/cross-domain-policy&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;



&lt;h2&gt;Not Loading the Junk when Not Needed&lt;/h2&gt;

&lt;p&gt;
  Javascript can load scripts &lt;u&gt;asynchronously&lt;/u&gt;, and we should take
  advantage of that by not loading &lt;u&gt;flensed.flXHR&lt;/u&gt; unless it's
  needed, and then only at the last moment (no need to load it until we
  want to make a request).
&lt;/p&gt;

&lt;p&gt;
  We need a method for asynchronously loading a Javascript file and
  executing a callback once it has loaded. And since we may be calling
  this function multiple times at once, we need to take care of
  race-conditions. First things first:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="c1"&gt;// keeps track of files already included&lt;/span&gt;

&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;FILES_INCLUDED&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

&lt;span class="c1"&gt;// keeps track of files in the process of getting loaded&lt;/span&gt;
&lt;span class="c1"&gt;// for avoiding race conditions &lt;/span&gt;

&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;FILES_LOADING&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt; 

&lt;span class="c1"&gt;// stacks of registered callbacks, that will get executed once&lt;/span&gt;
&lt;span class="c1"&gt;// a file loads -- this to deal with multiple file inclusions at once,&lt;/span&gt;
&lt;span class="c1"&gt;// and not ignoring anything&lt;/span&gt;

&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;REGISTERED_CALLBACKS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{};&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;register_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;REGISTERED_CALLBACKS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
        &lt;span class="nx"&gt;REGISTERED_CALLBACKS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Array&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;REGISTERED_CALLBACKS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;execute_callbacks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;REGISTERED_CALLBACKS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;REGISTERED_CALLBACKS&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;pop&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
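
&lt;p&gt;
  One thing to note about &lt;u&gt;execute_callbacks&lt;/u&gt;: because it uses
  &lt;u&gt;pop()&lt;/u&gt;, callbacks run in reverse order of registration. If the
  order matters to you, a first-in-first-out variant is a small
  change. A sketch, with hypothetical names chosen so they don't clash
  with the functions above:
&lt;/p&gt;

```javascript
// Hypothetical FIFO variant of the callback registry above.
var CALLBACK_QUEUES = {};

function enqueue_callback(file, callback) {
    if (!CALLBACK_QUEUES[file])
        CALLBACK_QUEUES[file] = [];
    CALLBACK_QUEUES[file].push(callback);
}

function run_callbacks(file) {
    // tolerate files that never had a callback registered
    var queue = CALLBACK_QUEUES[file] || [];
    while (queue.length !== 0) {
        // shift() takes the oldest callback first
        var callback = queue.shift();
        if (callback) callback();
    }
}
```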


&lt;p&gt;
  To asynchronously load a Javascript file, with onload callback, behold:
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="c1"&gt;// Loads a Javascript file asynchronously, executing a `callback`&lt;/span&gt;
&lt;span class="c1"&gt;// if/when file gets loaded.&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="c1"&gt;// Returns `true` if callback got executed immediately, `false` otherwise.&lt;/span&gt;
&lt;span class="c1"&gt;//&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nx"&gt;async_load_javascript&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// stores callback in the stack&lt;/span&gt;
    &lt;span class="nx"&gt;register_callback&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;callback&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// dealing with race conditions&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;FILES_INCLUDED&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;execute_callbacks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;FILES_LOADING&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; 
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="nx"&gt;FILES_LOADING&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// dynamically adds a &amp;lt;script&amp;gt; tag to the document&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;html_doc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;getElementsByTagName&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;head&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;js&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;createElement&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;script&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;setAttribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;type&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;text/javascript&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;setAttribute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;src&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;html_doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;appendChild&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

    &lt;span class="c1"&gt;// onload, then go through the stack of callbacks, &lt;/span&gt;
    &lt;span class="c1"&gt;// and execute all of them&lt;/span&gt;
    &lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onreadystatechange&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;readyState&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;complete&amp;#39;&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;readyState&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;loaded&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nx"&gt;FILES_INCLUDED&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="nx"&gt;FILES_INCLUDED&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
                &lt;span class="nx"&gt;execute_callbacks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="c1"&gt;// same as above, for browsers that fire onload instead of onreadystatechange&lt;/span&gt;
    &lt;span class="nx"&gt;js&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;onload&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt; &lt;span class="nx"&gt;FILES_INCLUDED&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="nx"&gt;FILES_INCLUDED&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="nx"&gt;execute_callbacks&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
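
&lt;p&gt;
  The behaviour worth checking is that a file requested several times
  is injected only once, while every registered callback still runs.
  Below is a minimal sketch of that idea with the DOM work replaced by
  an injected function, so it can run outside a browser. The names
  &lt;code&gt;make_loader&lt;/code&gt; and &lt;code&gt;inject_script&lt;/code&gt; are
  hypothetical and are not part of the library.
&lt;/p&gt;

```javascript
// Sketch only: make_loader and inject_script are hypothetical names.
// inject_script(file, done) stands in for adding a script tag and
// calling done() when the browser reports that the file has loaded.
function make_loader(inject_script) {
    var included = {};  // file -> true once the script has loaded
    var loading = {};   // file -> true while a request is in flight
    var callbacks = {}; // file -> array of callbacks not yet executed

    // runs and clears every pending callback for a file
    function flush(file) {
        var pending = callbacks[file] || [];
        callbacks[file] = [];
        for (var i = 0; i !== pending.length; i++) pending[i]();
    }

    return function load(file, callback) {
        if (!callbacks[file]) callbacks[file] = [];
        if (callback) callbacks[file].push(callback);

        // already loaded: run the callback right away
        if (included[file]) { flush(file); return true; }
        // request in flight: the callback stays queued
        if (loading[file]) return false;

        loading[file] = true;
        inject_script(file, function on_loaded() {
            if (included[file]) return; // ignore duplicate load events
            included[file] = true;
            flush(file);
        });
        return false;
    };
}

// Usage with a fake injector that records requests instead of touching the DOM
var requested = [];
var finish = {};
var load = make_loader(function (file, done) {
    requested.push(file);
    finish[file] = done;
});

var log = [];
var first = load("a.js", function () { log.push("first"); });
var second = load("a.js", function () { log.push("second"); });
finish["a.js"](); // simulate the load event
finish["a.js"](); // a second event for the same file is ignored
var third = load("a.js", function () { log.push("third"); });
```

&lt;p&gt;
  In this sketch the first two calls return &lt;code&gt;false&lt;/code&gt; and
  share a single request, while the third call returns
  &lt;code&gt;true&lt;/code&gt; because the file is already marked as included.
&lt;/p&gt;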


&lt;h2&gt;Almost there&lt;/h2&gt;

&lt;p&gt;
  To bind it all together, we need to plug this into our main logic: if
  the browser does not support CORS, it falls back to this implementation.
&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="c1"&gt;// to recapitulate&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;withCredentials&amp;quot;&lt;/span&gt; &lt;span class="k"&gt;in&lt;/span&gt; &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;typeof&lt;/span&gt; &lt;span class="nx"&gt;XDomainRequest&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;undefined&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;){&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nx"&gt;XDomainRequest&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;url&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="nx"&gt;xhr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// CORS not supported, so fall back to flXHR&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;xhr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; 
    &lt;span class="nx"&gt;async_load_javascript&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;CROSSDOMAINJS_PATH&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s2"&gt;&amp;quot;flXHR/flXHR.js&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;_ajax_with_flxhr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
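
&lt;p&gt;
  The order of the checks matters: native CORS first, then
  &lt;code&gt;XDomainRequest&lt;/code&gt;, and flXHR only as a last resort. As a
  rough sketch, the same decision can be written as a function that
  receives the available constructors as an argument, which makes it
  easy to try out without a browser. The helper
  &lt;code&gt;pick_transport&lt;/code&gt; and the stub constructors are
  assumptions made for illustration; they are not part of the final code.
&lt;/p&gt;

```javascript
// Hypothetical helper: decides which transport to use, given the
// constructors available in the environment passed in as `env`.
function pick_transport(env) {
    if (env.XMLHttpRequest) {
        var probe = new env.XMLHttpRequest();
        // same feature test as in the article's code
        if ("withCredentials" in probe) return "cors";
    }
    if (typeof env.XDomainRequest != "undefined") return "xdomainrequest";
    return "flxhr";
}

// Stub constructors standing in for what different browsers expose
function ModernXHR() { this.withCredentials = false; }
function LegacyXHR() {}
function XDR() {}

var modern = pick_transport({ XMLHttpRequest: ModernXHR });
var ie8 = pick_transport({ XMLHttpRequest: LegacyXHR, XDomainRequest: XDR });
var old = pick_transport({ XMLHttpRequest: LegacyXHR });
```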


&lt;p&gt;
  To see the final code, go here: &lt;a
  href="https://github.com/alexandru/crossdomain-requests-js/blob/gh-pages/public/crossdomain-ajax.js"&gt;crossdomain-ajax.js&lt;/a&gt;. Or
  to see it working, go here: &lt;a
  href="http://bionicspirit.com/projects/crossdomain-requests-js/"&gt;bionicspirit.com/projects/crossdomain-requests-js/&lt;/a&gt;.
  &lt;br /&gt;
  (or just leave me a comment below ;))
&lt;/p&gt;
</content>
 <feedburner:origLink>http://bionicspirit.com/blog/2011/03/24/cross-domain-requests.html</feedburner:origLink></entry>
 
 
 
 
 
 
 
 
</feed>

