<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US">
  <title type="text">Virendra Rajput's Blog</title>

  <updated>2014-01-29T00:22:23+05:30</updated>

  <link rel="alternate" type="text/html" href="http://virendra.me" />
  <id>http://virendra.me/</id>
  <link rel="self" type="application/atom+xml" href="http://virendra.me/atom.xml" />

  <author>
    <name>Virendra Rajput</name>
    <uri>http://virendra.me/</uri>
  </author>

  
  
  <entry>
    <title>Exploiting Intelltest bug to get answers [in Online exams]</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/intelltest-hack/"/>
    <id>http://virendra.me/intelltest-hack</id>
    <updated>2014-01-24T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Exploiting Intelltest bug to get answers [in Online exams]. Intelltest is a platform which provides continuous evaluations and online examination hosting]]></summary>
    <content type="html" xml:base="http://virendra.me/intelltest-hack/"><![CDATA[<p><em>
<a href="http://www.intelltest.com/">Intelltest</a> is a platform that provides continuous evaluations and online examination hosting. It is widely adopted for conducting online examinations by several prestigious institutes, including <a href="http://unipune.ac.in/">Pune University</a>, Maharashtra Institute of Technology, Sinhagad College of Engineering and many more.
</em></p>

<p>This exploit requires the Network inspector under Developer Tools (in Google Chrome or Chromium) or Firebug (if you are on Firefox). You can exploit it to view the correct options for <abbr title="Multiple Choice Questions">MCQs</abbr>.</p>

<p>Here is a demo (answered 3 questions):</p>

<iframe width="640" height="360" src="http://www.youtube.com/embed/DM_GIfPE5tg"></iframe>

<p>And here is the result for the test (scored 3/50 for the 3 questions attempted):</p>

<p><img height="360" width="640" src="/img/TestResults.png" alt="Test Results"></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Tool to unsubscribe from Facebook App requests and activity</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/unsubscribe-fb-apps-bookmarklet/"/>
    <id>http://virendra.me/unsubscribe-fb-apps-bookmarklet</id>
    <updated>2013-12-07T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Are you tired of receiving notifications from Facebook apps? And wish you could just unsubscribe from all of them in one go? Well then, it's your lucky day! I built a bookmarklet that does exactly that (since I was too lazy to unsubscribe manually).]]></summary>
    <content type="html" xml:base="http://virendra.me/unsubscribe-fb-apps-bookmarklet/"><![CDATA[<p>Are you tired of receiving notifications from Facebook apps?</p>

<p>And wish you could just unsubscribe from all of them in one go?</p>

<p>Well then, it&#39;s your lucky day! I built a bookmarklet that does exactly that (since I was too lazy to unsubscribe manually).</p>

<hr>

<p>To use it, follow these steps:</p>

<p><span><strong>1</strong></span>. First of all, add the bookmarklet to your Bookmarks bar by dragging this button -</p>

<p><a style="text-decoration: underline; padding: 2px 6px; color: #fff; background: #32a0eb; -webkit-border-radius: 4px; -moz-border-radius: 4px; border-radius: 4px;" href="javascript:(function()%7Bfunction%20happyFn(e)%7Bif(halt%7C%7C!e%7C%7C!e.length)%7Bdocument.getElementById(%22happyStatus%22).innerHTML%3D%22Done!%22%3Breturn%7De%5B0%5D.querySelector(%22%5Btype%3Dcheckbox%5D%22).click()%3Bdocument.getElementById(%22count_apps%22).innerHTML%3Dj-e.length%2B1%3Bwindow.setTimeout(function()%7BhappyFn(e.splice(1))%7D%2C1e3)%7Dfunction%20haltFn()%7Bhalt%3Dtrue%3Breturn%20false%7Dvar%20nodes%3Ddocument.getElementsByClassName(%22notif%22)%2Ci%2Chalt%3Dfalse%2Csub_apps%3D%5B%5D%3Bfor(var%20i%3D0%3Bi%3Cnodes.length%3Bi%2B%2B)%7Bvar%20node%3Dnodes%5Bi%5D%3Bif(node.querySelector(%22%5Btype%3Dcheckbox%5D%22).checked%3D%3Dtrue)%7Bsub_apps.push(node)%7D%7Dvar%20j%3Dsub_apps.length%3Bvar%20happyDiv%3Ddocument.createElement(%22div%22)%3BhappyDiv.innerHTML%3D%22%3Cdiv%20id%3D'happy'%20style%3D'background-color%3A%23ddd%3Bfont-size%3A16px%3Btext-align%3Acenter%3Bposition%3Afixed%3Btop%3A40px%3Bright%3A40px%3Bwidth%3A200px%3Bheight%3A100px%3Bborder%3A4px%20solid%20black%3Bz-index%3A9999%3Bpadding-top%3A15px%3B'%3E%3Cspan%20id%3D'count_apps'%3E0%3C%2Fspan%3E%20of%20%22%2Bsub_apps.length%2B%22%20apps%20unsubscribed.%3Cdiv%20id%3D'happyStatus'%20style%3D'margin-top%3A30px%3B'%3E%3Ca%20onclick%3D'haltFn()'%20id%3D'happyButton'%20href%3D'%23'%20style%3D'display%3Ablock%3B'%3EStop%20it.%3C%2Fa%3E%3C%2Fdiv%3E%3C%2Fdiv%3E%22%3Bdocument.getElementsByTagName(%22body%22)%5B0%5D.appendChild(happyDiv)%3BhappyFn(sub_apps)%7D)()">Unsubscribe me</a> &lt;--  drag me to your bookmarks bar</p>

<p><span><strong>2</strong></span>. Go to the Facebook Notifications settings <a href="https://www.facebook.com/settings?tab=notifications">here</a> (you need to be logged into Facebook).</p>

<p><span><strong>3</strong></span>. Now click on the Edit link in the &quot;App requests and activity&quot; row, as shown below:</p>

<p><img src="/img/click_edit.png"></p>

<p>It will expand, and all the apps you are subscribed to will be listed.</p>

<p><span><strong>4</strong></span>. Now click on the &quot;Unsubscribe Me&quot; bookmarklet you just added. And that&#39;s all.
It will automatically start the unsubscribing process.
The progress is shown in a box at the top right.
It will show a &quot;Done&quot; message when it has finished unsubscribing from all the apps.</p>

<p>Enjoy! :-)</p>

<hr>

<p>The <a href="https://github.com/bkvirendra/fb-unsubscribe-bookmarklet">source code</a> is available on Github, if you are interested in contributing.</p>

<script type="text/javascript">
function externalLinks() {
  for(var c = document.getElementsByTagName("a"), a = 0;a < c.length;a++) {
    var b = c[a];
    b.getAttribute("href") && b.hostname !== location.hostname && (b.target = "_blank")
  }
}
externalLinks();
</script>
]]></content>
  </entry>
  
  
  <entry>
    <title>Photography in Lonavala, India</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/photography-in-lonavala/"/>
    <id>http://virendra.me/photography-in-lonavala</id>
    <updated>2013-09-04T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Some of my random clicks taken in Lonavala, India.]]></summary>
    <content type="html" xml:base="http://virendra.me/photography-in-lonavala/"><![CDATA[<p>These were some of my random clicks in Lonavala, India:</p>

<div class="fbutils-album clearfix" style="width: 100%"> 
    <div class="fbutils-photos">

        <a class="fbutils-photo" href="/img/photos/original/flower.jpg" title="a flower">
            <img class="fbutils-thumb" src="/img/photos/thumbs/flower.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo1_1280.jpg" title="View from the hill top">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo1_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo2_1280.jpg" title="a leaf">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo2_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo3_1280.jpg" title="the lake view">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo3_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo4_1280.jpg" title="the lake view (2)">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo4_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo5_1280.jpg" title="the sky">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo5_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo6_1280.jpg" title="the lake view (3)">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo6_1280.jpg">
        </a>
       <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo8_1280.jpg" title="the sky view at evening">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo8_1280.jpg">
        </a>
       <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo9_1280.jpg" title="the crab in hand">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo9_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo3_400.jpg" title="a random shot">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo3_400.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo4_1280.jpg" title="a random shot (2)">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo4_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo5_1280.jpg" title="view from the mountain top">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo5_1280.jpg">
        </a>

    </div>
</div>
]]></content>
  </entry>
  
  
  <entry>
    <title>Scraping IMDB with Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/scraping-imdb-with-python/"/>
    <id>http://virendra.me/scraping-imdb-with-python</id>
    <updated>2013-06-19T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Scraping is fun, whether you’re doing it just for fun or profit. IMDB does not have an API for accessing information on movies or TV series, so I had to write a scraper to fetch their information on movies.]]></summary>
    <content type="html" xml:base="http://virendra.me/scraping-imdb-with-python/"><![CDATA[<p>Scraping is fun, whether you’re doing it just for fun or profit. I have already created a couple of <code>scrapers</code> for <a href="https://github.com/bkvirendra/iTunes-Charts-WebCrawler">iTunes</a>, <a href="https://github.com/bkvirendra/paintbottle-downloader">Paintbottle</a> (deleted as requested by the site admin), <a href="https://github.com/bkvirendra/scrapy-talk-samples">Cricinfo</a>, <a href="https://github.com/bkvirendra/didyoumean">Google’s Did you Mean?</a> and more. Check them out on <a href="https://github.com/bkvirendra?tab=repositories">Github</a>.</p>

<p><a href="http://imdb.com/">IMDB</a> does not have an API for accessing information on movies or TV series, so I had to write a <code>scraper</code> to fetch their information on movies.</p>

<p>I knew about a couple of other unofficial APIs (including <a href="http://www.omdbapi.com/">omdb</a>), but creating your own solution is always fun :)</p>

<p>If you don’t want to go much into the technical details but are just looking to use it, it is hosted at <a href="http://getimdb.herokuapp.com">http://getimdb.herokuapp.com</a>.</p>

<p>The <code>scraper</code> is written in <code>Python</code> and uses <a href="http://lxml.de/">lxml</a> for parsing the webpages. I’m using <code>XPath</code> for selecting elements from the DOM.</p>
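<p>As a minimal, self-contained illustration of that pattern (not part of the original script), here is the same <code>document_fromstring</code> + XPath combination applied to an inline snippet of markup:</p>

```python
import lxml.html

# Parse a small piece of markup and select a value with XPath --
# the same pattern the scraper below applies to IMDB's pages.
html = '<div id="overview-top"><h1><span>Furious 6</span></h1></div>'
doc = lxml.html.document_fromstring(html)
title = doc.xpath('//*[@id="overview-top"]/h1/span[1]/text()')[0].strip()
print(title)  # Furious 6
```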

<p>Following are the dependencies, which can be installed using <code>pip</code>:</p>

<div class="highlight"><pre><code class="python"><span class="n">requests</span><span class="o">==</span><span class="mf">1.2</span><span class="o">.</span><span class="mi">3</span>
<span class="n">lxml</span><span class="o">==</span><span class="mf">3.2</span><span class="o">.</span><span class="mi">1</span>
</code></pre></div>

<p>The code:</p>

<div class="highlight"><pre><code class="python"><span class="c">#!/usr/bin/env python</span>

<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">lxml.html</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="nb">id</span><span class="p">):</span>
    <span class="n">hxs</span> <span class="o">=</span> <span class="n">lxml</span><span class="o">.</span><span class="n">html</span><span class="o">.</span><span class="n">document_fromstring</span><span class="p">(</span><span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">&quot;http://www.imdb.com/title/&quot;</span> <span class="o">+</span> <span class="nb">id</span><span class="p">)</span><span class="o">.</span><span class="n">content</span><span class="p">)</span>
    <span class="n">movie</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;title&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/h1/span[1]/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;title&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;year&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/h1/span[2]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;year&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/h1/span[3]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
        <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;year&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;certification&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/span[1]/@title&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;certification&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;running_time&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/time/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;running_time&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;genre&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/a/span/text()&#39;</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;genre&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;release_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/span[3]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;release_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/span[4]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
        <span class="k">except</span> <span class="ne">Exception</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;release_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;rating&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[3]/div[3]/strong/span/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;rating&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;metascore&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[3]/div[3]/a[2]/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&#39;/&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;metascore&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;description&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/p[2]/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;description&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;director&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[4]/a/span/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;director&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;stars&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[6]/a/span/text()&#39;</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;stars&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;poster&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;img_primary&quot;]/div/a/img/@src&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;poster&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;gallery&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;combined-photos&quot;]/div/a/img/@src&#39;</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;gallery&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;storyline&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;titleStoryLine&quot;]/div[1]/p/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;storyline&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;votes&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[3]/div[3]/a[1]/span/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;votes&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">return</span> <span class="n">movie</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="k">print</span> <span class="n">main</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div>
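<p>The repeated <code>try/except IndexError</code> blocks above could also be condensed with a small helper along these lines (a sketch, not part of the original script; <code>first</code> is a hypothetical name):</p>

```python
def first(results, default=""):
    """Return the first XPath result, stripped, or a default when empty."""
    try:
        return results[0].strip()
    except IndexError:
        return default

# e.g. movie['title'] = first(hxs.xpath('//*[@id="overview-top"]/h1/span[1]/text()'))
print(first(["  Furious 6  "]))      # Furious 6
print(first([], default="N/A"))      # N/A
```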

<p>You can use it by passing any valid <code>imdb id</code> as an argument:</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">$ python imdb.py tt1905041
</code></pre></div>
<p>And the output will be returned as follows:</p>

<div class="highlight"><pre><code class="json"><span class="p">{</span>
  <span class="nt">&quot;certification&quot;</span><span class="p">:</span> <span class="s2">&quot;PG-13&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;description&quot;</span><span class="p">:</span> <span class="s2">&quot;Hobbs has Dom and Brian reassemble their crew in order</span>
<span class="s2">   to take down a mastermind who commands an organization of mercenary drivers across 12 countries. Payment? Full pardons for them all.&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;director&quot;</span><span class="p">:</span> <span class="s2">&quot;Justin Lin&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;gallery&quot;</span><span class="p">:</span> <span class="p">[</span>
    <span class="s2">&quot;http://ia.media-imdb.com/images/G/01/imdb/images/nopicture/small/unknown-1394846836._V379391227_.png&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;http://ia.media-imdb.com/images/G/01/imdb/images/nopicture/small/unknown-1394846836._V379391227_.png&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;http://ia.media-imdb.com/images/G/01/imdb/images/nopicture/small/unknown-1394846836._V379391227_.png&quot;</span>
  <span class="p">],</span> 
  <span class="nt">&quot;genre&quot;</span><span class="p">:</span> <span class="p">[</span>
    <span class="s2">&quot;Action&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Crime&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Thriller&quot;</span>
  <span class="p">],</span> 
  <span class="nt">&quot;metascore&quot;</span><span class="p">:</span> <span class="s2">&quot;61&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;poster&quot;</span><span class="p">:</span> <span class="s2">&quot;http://ia.media-imdb.com/images/M/MV5BMTM3NTg2NDQzOF5BMl5BanBnXkFtZTcwNjc2NzQzOQ@@._V1_SX214_.jpg&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;rating&quot;</span><span class="p">:</span> <span class="s2">&quot;7.2&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;release_date&quot;</span><span class="p">:</span> <span class="s2">&quot;24 May 2013&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;running_time&quot;</span><span class="p">:</span> <span class="s2">&quot;130 min&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;stars&quot;</span><span class="p">:</span> <span class="p">[</span>
    <span class="s2">&quot;Vin Diesel&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Paul Walker&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Dwayne Johnson&quot;</span>
  <span class="p">],</span> 
  <span class="nt">&quot;storyline&quot;</span><span class="p">:</span> <span class="s2">&quot;Since Dom (Diesel) and Brian&#39;s (Walker) Rio heist toppled a kingpin&#39;s empire and left their crew with $100 million, our heroes have scattered across the globe. But their inability to return home and living forever on the lam have left their lives incomplete. Meanwhile, Hobbs (Johnson) has been tracking an organization of lethally skilled mercenary drivers across 12 countries, whose mastermind (Evans) is aided by a ruthless second-in-command revealed to be the love Dom thought was dead, Letty (Rodriguez). The only way to stop the criminal outfit is to outmatch them at street level, so Hobbs asks Dom to assemble his elite team in London. Payment? Full pardons for all of them so they can return home and make their families whole again.&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;title&quot;</span><span class="p">:</span> <span class="s2">&quot;Furious 6&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;votes&quot;</span><span class="p">:</span> <span class="s2">&quot;154,139&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;year&quot;</span><span class="p">:</span> <span class="s2">&quot;2013&quot;</span>
<span class="p">}</span>
</code></pre></div>

<p>This will return a <code>JSON</code> object containing the data for the movie.
You can fork the code on <a href="https://github.com/bkvirendra/imdb-scraper">Github</a>.
You can try it out at <a href="http://getimdb.herokuapp.com">http://getimdb.herokuapp.com/</a>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Scraping iTunes Charts Using Scrapy Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/scraping-itunes-charts-using-scrapy-python/"/>
    <id>http://virendra.me/scraping-itunes-charts-using-scrapy-python</id>
    <updated>2013-06-13T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Hey guys, recently I published a guest post on David Walsh’s Blog. It’s a getting-started tutorial for Scrapy in which I build a simple web spider that crawls the iTunes charts and extracts the list of Top free apps.]]></summary>
    <content type="html" xml:base="http://virendra.me/scraping-itunes-charts-using-scrapy-python/"><![CDATA[<p>Hey guys, recently I published a guest post on <a href="http://davidwalsh.name">David Walsh’s Blog</a>. </p>

<p>It’s a getting-started tutorial for <a href="http://www.scrapy.org">Scrapy</a> in which I build a simple web spider using <code>Scrapy</code> that crawls the <a href="http://www.apple.com/itunes/charts/free-apps/">iTunes</a> charts and extracts the list of <a href="http://www.apple.com/itunes/charts/free-apps/">Top free apps</a>.</p>

<p>The complete post is available <a href="http://davidwalsh.name/python-scrape">here</a>.</p>

<p>The code for the scraper in the post is available <a href="https://github.com/bkvirendra/iTunes-Charts-WebCrawler">here</a>. </p>
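<p>At its core, the spider is just a selector query over the chart markup. Here’s a rough, stdlib-only sketch of that extraction step (the markup below is a simplified, hypothetical stand-in for the real iTunes chart page; the actual Scrapy spider is in the post):</p>

```python
from html.parser import HTMLParser

# Simplified, hypothetical stand-in for the iTunes chart markup;
# the real spider selects these nodes with Scrapy selectors instead.
SAMPLE = """
<section class="section apps">
  <ul><li><h3><a href="#">Candy Crush Saga</a></h3></li>
      <li><h3><a href="#">Spotify</a></h3></li></ul>
</section>
"""

class ChartParser(HTMLParser):
    """Collects the text of every link found inside an <h3> (the app names)."""
    def __init__(self):
        super().__init__()
        self.in_h3 = False
        self.apps = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False

    def handle_data(self, data):
        if self.in_h3 and data.strip():
            self.apps.append(data.strip())

parser = ChartParser()
parser.feed(SAMPLE)
print(parser.apps)  # → ['Candy Crush Saga', 'Spotify']
```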
]]></content>
  </entry>
  
  
  <entry>
    <title>Using Dropbox as your database backup space</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/using-dropbox-as-your-database-backup-space/"/>
    <id>http://virendra.me/using-dropbox-as-your-database-backup-space</id>
    <updated>2013-05-24T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Recently, due to loss of hardware on AWS, we (markitty.com) lost access to our EC2 instance, and Amazon couldn’t do anything to help us out. And since we already keep everything backed up, it wasn’t much of a big deal to spawn a new instance and get back online. With that, Markitty had its first downtime since launch. But it taught me a lesson: we can’t really trust AWS infrastructure with our customers’ data. We already use Dropbox at Markitty, so I decided to trust it with DB backups. We use PostgreSQL as our main database on our Django stack. So I hacked this Python script that takes a regular backup of our main database and uploads it to one of our Dropbox folders.]]></summary>
    <content type="html" xml:base="http://virendra.me/using-dropbox-as-your-database-backup-space/"><![CDATA[<p>Recently, due to loss of hardware on <a href="http://aws.amazon.com">AWS</a>, we (<a href="http://markitty.com/">markitty.com</a>) lost access to our EC2 instance, and <a href="http://amazon.com/">Amazon</a> couldn’t do anything to help us out. </p>

<p>And since we already keep everything backed up, it wasn’t much of a big deal to spawn a new instance and get back online. With that, <a href="http://markitty.com/">Markitty</a> had its first downtime since launch.</p>

<p>But it taught me a lesson: we can’t really trust <a href="http://aws.amazon.com">AWS</a> infrastructure with our customers’ data.</p>

<p>We already use <a href="https://www.dropbox.com">Dropbox</a> at <a href="http://markitty.com/">Markitty</a>, so I decided to trust it with DB backups. We use <a href="http://www.postgresql.org">PostgreSQL</a> as our main database on our <a href="http://djangoproject.com">Django</a> stack.</p>

<p>So I hacked this Python script that takes a regular backup of our main database and uploads it to one of our <a href="https://www.dropbox.com">Dropbox</a> folders.</p>

<p>It includes 3 main files:</p>

<p><code>db_backup.sh</code> is the shell script that makes use of pg_dump to get the compressed backup of the database.</p>

<p><code>uploader.py</code> is the Python script that uploads the database to the Dropbox folder.</p>

<p><code>client_secrets.json</code> stores the credentials including <code>app_key</code>, <code>app_secret</code>, <code>access_key</code> and <code>access_secret</code>.</p>
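<p>For reference, <code>client_secrets.json</code> is just a flat JSON object with those four keys (the values here are obviously placeholders):</p>

```json
{
  "app_key": "YOUR_APP_KEY",
  "app_secret": "YOUR_APP_SECRET",
  "access_key": "YOUR_ACCESS_KEY",
  "access_secret": "YOUR_ACCESS_SECRET"
}
```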

<p>You need to provide the <code>DB_Username</code> and <code>DB_Name</code> in <code>db_backup.sh</code>.</p>
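<p>Under the hood, <code>db_backup.sh</code> boils down to a single <code>pg_dump</code> call. As a hedged sketch of the same step in Python (the flags and naming scheme here are illustrative, not the exact ones from the gist):</p>

```python
import datetime
import subprocess

def build_dump_command(db_user, db_name):
    """Build a pg_dump invocation that writes a compressed, custom-format dump."""
    out_path = "%s_%s.dump" % (db_name, datetime.date.today().isoformat())
    # -Fc: custom (compressed) format; -U: database user; -f: output file
    return ["pg_dump", "-Fc", "-U", db_user, "-f", out_path, db_name], out_path

# Hypothetical user/database names for illustration
cmd, path = build_dump_command("markitty", "markitty_db")
# subprocess.call(cmd)  # run only on a machine with the PostgreSQL client tools
print(" ".join(cmd))
```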

<script src="https://gist.github.com/bkvirendra/5559219.js"></script>

<p>Follow these steps to setup the Dropbox app:</p>

<ol>
<li><p>You will need to create a Dropbox app to get the <code>app_key</code> and <code>app_secret</code>. You can create it <a href="https://www.dropbox.com/developers/apps/create">here</a> (select the App Type as <code>Core</code> and select the Permission type as <code>Full Dropbox</code>).</p></li>
<li><p>Once the app is successfully created, Dropbox will provide you the <code>app_key</code> and <code>app_secret</code>. Provide these in <code>client_secrets.json</code> (please do not share your <code>app_key</code> and <code>app_secret</code> publicly).</p></li>
<li><p>Then execute <code>uploader.py</code>; it will generate an authentication link, which you will need to open in your web browser. Press the Allow button, then hit Enter in the shell.</p></li>
<li><p>It will then print the <code>access_key</code> and the <code>access_secret</code>, which you will need to provide in <code>client_secrets.json</code>. And you are done with the Dropbox setup. </p></li>
</ol>

<p>After that, you can set up a cron job that executes <code>db_backup.sh</code> every day and puts your database backup in your Dropbox folder.
The script’s all yours under a Creative Commons license :-)</p>
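<p>The cron entry itself is a one-liner; for example, to run the backup at 2:00 AM every day (the paths here are illustrative):</p>

```text
# m  h  dom mon dow  command
  0  2  *   *   *    /home/ubuntu/db_backup.sh && python /home/ubuntu/uploader.py
```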

<p>You can fork it on Github <a href="https://github.com/bkvirendra/Dropbox_db_backup">here</a>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Featuring in Indian Express</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/featuring-in-indian-express/"/>
    <id>http://virendra.me/featuring-in-indian-express</id>
    <updated>2013-05-10T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Recently I got published by the Indian Express.

The article is available online here.]]></summary>
    <content type="html" xml:base="http://virendra.me/featuring-in-indian-express/"><![CDATA[<p>Recently I got published by the <a href="http://www.indianexpress.com/">Indian Express</a>.</p>

<p>The article is available online <a href="http://www.indianexpress.com/news/geek-diaries/1106144/">here</a>.</p>

<div>
    <a href="/img/photos/original/tumblr_mml4w0vM6y1ryuegwo1_1280.png" title="Featured in the Indian Express">
        <img class="full_sized" src="/img/photos/original/tumblr_mml4w0vM6y1ryuegwo1_1280.png">
    </a>
</div>
]]></content>
  </entry>
  
  
  <entry>
    <title>Things I learned from my first Hackathon</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/things-i-learned-from-my-first-hackathon/"/>
    <id>http://virendra.me/things-i-learned-from-my-first-hackathon</id>
    <updated>2013-02-25T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[This was my first real Hackathon experience. Real because it was my first full-fledged hackathon — before this I did participate in a few programming events at college where we had to create an app in a couple of hours. It was organized by Google Developers Group. As usual, it was a 24-hour hackathon. So here’s what I learned.]]></summary>
    <content type="html" xml:base="http://virendra.me/things-i-learned-from-my-first-hackathon/"><![CDATA[<div>
    <img class="centered_photos" src="/img/tumblr_inline_mis66bsGJ21qz4rgp.jpg" alt="Planning at the Google Developers Hackathon">
</div>

<p>This was my first real Hackathon experience. Real because it was my first full-fledged hackathon — before this I did participate in a few programming events at college where we had to create an app in a couple of hours. It was organized by <a href="http://www.meetup.com/Pune-GDG/events/88636722/">Google Developers Group</a>. As usual, it was a 24-hour hackathon.</p>

<p>So here’s what I learned -</p>

<ol>
<li>Prepare in advance:</li>
</ol>

<p>While you are planning to attend a hackathon, you might want to shortlist a few ideas days before the event. Brainstorming with a friend is even better. It also helps if you can define the scope of the idea that you want to work on. Try to answer a few questions — How big will the project be? Can you build it as a solo developer? How familiar are you with the technologies that you will be using? Are the organizers putting any constraints on which technologies you can or cannot use? <em>(Come on, you seriously don’t need to be aware of anything apart from the language itself! In fact, you can just figure out everything else)</em></p>

<p>I began thinking of ideas as soon as the hackathon dates were put up. My <a href="http://narenonit.blogspot.in/" title="Narendra Rajput&#39;s Blog">brother</a> (also a hacker, and my partner at the hackathon) and I used to brainstorm, discussing whether an idea was realistic and whether we had enough skills to execute it. Would the thing we build be of any use to people? Were we solving a real problem? We short-listed a few ideas that answered the above questions positively.</p>

<ol start="2">
<li>Deciding on the idea:</li>
</ol>

<p>One question that you would like to ask yourself while deciding on the idea would be — is it realistic? This is a major decision, because you’ll have a very limited amount of time and limited resources. The idea actually has to be well planned before the execution really begins. Even though we had thought of a few ideas in advance, it was quite confusing to settle on one when the hackathon started.</p>

<p>At that point the most important thing that you have to consider is whether or not you can execute it well in the allotted time. Once you make the decision, you have to break the execution down into stages and allocate enough time to each stage. This really helps you analyze whether what you’re trying to achieve is doable, and helps you keep track of time later.</p>

<p>In the beginning, it seemed quite challenging for us. The idea we decided on had a much bigger scope, and with only 24 hours in our hands it seemed too tough, but we weren’t going to back down.</p>

<ol start="3">
<li>Have awesome people in your team:</li>
</ol>

<p>Well, the idea you’ll be working on often requires skills, and the better the skills you have, the more efficiently you can execute your idea. This depends on how well you can convince someone else (a developer, probably) to work on your idea. The more convincing you are, the better the talent you’ll attract.</p>

<p>We were pretty lucky on this one — we had one of the best front-end developers (I’ve met yet) <a href="http://jquer.in/" title="Jay Kanakiya&#39;s Blog">Jay Kanakiya</a>, along with <a href="http://narenonit.blogspot.in/" title="Narendra Rajput&#39;s Blog">Narendra Rajput</a>, a hardcore Ruby developer, and myself (a Pythonista). So we weren’t really short of skills, at least. I am a Python/Django developer, but it won’t be cool to praise myself :-)</p>

<ol start="4">
<li>Get stuff done:</li>
</ol>

<div>
    <img class="centered_photos" src="/img/tumblr_inline_mis6ff29Hj1qz4rgp.jpg" alt="Omkar #hacking">
</div>

<p>Hackathons are all about getting stuff done. At the end of the day, what really matters is how well you executed your idea. And many times it’s not easy: you get stuck at something (everyone does), and you might not succeed even after spending precious time figuring it out (it just happens).</p>

<p>Well, this is the prime time where you are really tested on how well you make decisions. You have to come up with a plan B (no one really has a plan B), an alternative. It’s all about how you actually <a href="https://twitter.com/search?q=hack" title="Twitter hashtag">#hack</a> the problem (come up with a crazy solution that never existed before).</p>

<p>We did face enough problems, but the way we hacked through them was the fun part.</p>

<div>
    <img class="centered_photos" src="/img/tumblr_inline_mis6gqhud31qz4rgp.jpg" alt="The Winners!">
</div>

<p><strong>And did I tell you we won the second prize?</strong></p>

<p>We are told we will be featured on <a href="https://developers.google.com/" title="Google Developers">developers.google.com</a>, so hold your breath!</p>

<p>The app that we built at the hackathon is available in beta <a href="https://github.com/bkvirendra/socialtvapp">here</a>.</p>

<p>We are working hard to launch it to the public soon, so share your feedback (good and bad). </p>

<p>Special thanks to <a href="https://twitter.com/nileshbhojani">Nilesh Bhojani</a> for helping me with the blog post.</p>

<p>You can view the hackathon album <a href="https://plus.google.com/photos/117099251731858299644/albums/5838915643456031841">here</a>.</p>

<p><strong>Do let me know if you are going for a hackathon and need a hand! Would be really happy to help :)</strong></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Directory Downloader in Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/directory-downloader-python/"/>
    <id>http://virendra.me/directory-downloader-python</id>
    <updated>2012-11-30T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[And then, I came across a few websites that allowed directory browsing. So I started saving the images manually from the index. And well those directories had thousands of images, and downloading them manually would suck (being a #hacker you always want everything to be automated). So I started hacking a script, that would carry out this task for me. And in just 15 minutes, I cracked it. And had fun, downloading entire webserver directories in minutes. You can use this Python script to download entire directories (if the webserver has indexes open).]]></summary>
    <content type="html" xml:base="http://virendra.me/directory-downloader-python/"><![CDATA[<p>Recently, while browsing for some <a href="http://www.google.co.in/search?num=10&amp;hl=en&amp;site=imghp&amp;tbm=isch&amp;source=hp&amp;biw=1280&amp;bih=679&amp;q=facebook+timeline+cover&amp;oq=facebook+time&amp;gs_l=img.3.0.0l10.2062.3857.0.4660.13.11.0.2.2.0.222.1327.5j4j2.11.0...0.0...1ac.1.09Nz1Gvk57k" title="Google image search">Facebook Timeline covers</a> that I wanted for my <a href="http://www.facebook.com/TheVirendraRajput" title="Virendra Rajput&#39;s Facebook Profile">Facebook Profile</a>, I came across hundreds of covers that I would love to have on my hard disk as my Timeline Cover collection (yeah, I have a Timeline Cover collection).</p>

<p>And then, I came across a few websites that allowed <a href="http://wiki.apache.org/httpd/DirectoryListings">directory browsing</a>. So I started saving the images manually from the <code>index</code>. And well those directories had thousands of images, and downloading them manually would suck (being a #hacker you always want everything to be automated).</p>

<p>So I started hacking a script that would carry out this task for me. In just 15 minutes, I cracked it, and had fun downloading entire <code>webserver</code> directories in minutes.</p>

<p>You can use this <code>Python</code> script to download entire directories (if the webserver has <code>indexes</code> open).</p>

<p>This script also makes use of <code>Beautifulsoup</code>, you can install it, by using the following command:</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">pip install beautifulsoup4   # if you have pip installed
easy_install BeautifulSoup4  # if you have easy_install
</code></pre></div>
<p>To use the script, you need to pass the <code>directory url</code> as a command-line argument, e.g.: </p>

<p>For downloading the directory at <a href="http://www.namecovers.com/_asset/_thumb/">http://www.namecovers.com/_asset/_thumb/</a>:</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">$ python downloader.py http://www.namecovers.com/_asset/_thumb/
</code></pre></div>
<p>The code:</p>

<div class="highlight"><pre><code class="python"><span class="kn">import</span> <span class="nn">urllib2</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>

<span class="kn">from</span> <span class="nn">bs4</span> <span class="kn">import</span> <span class="n">BeautifulSoup</span>
<span class="kn">from</span> <span class="nn">urlparse</span> <span class="kn">import</span> <span class="n">urlparse</span>

<span class="k">def</span> <span class="nf">downloader</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="n">grab_url</span><span class="p">,</span> <span class="n">foldername</span><span class="p">):</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="n">foldername</span><span class="p">):</span>
        <span class="k">print</span> <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s">&quot;</span><span class="o">+</span> <span class="n">foldername</span> <span class="o">+</span> <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s"> does not exist!&quot;</span>
        <span class="n">os</span><span class="o">.</span><span class="n">makedirs</span><span class="p">(</span><span class="n">foldername</span><span class="p">)</span>
        <span class="k">print</span> <span class="s">&quot;Creating </span><span class="se">\&quot;</span><span class="s">&quot;</span> <span class="o">+</span> <span class="n">foldername</span> <span class="o">+</span> <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s">...&quot;</span> 
    <span class="k">for</span> <span class="n">cover</span> <span class="ow">in</span> <span class="n">urls</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="k">print</span> <span class="s">&quot;Downloading item &quot;</span> <span class="o">+</span> <span class="n">cover</span> <span class="o">+</span> <span class="s">&quot;...&quot;</span>
            <span class="k">print</span> <span class="n">grab_url</span> <span class="o">+</span> <span class="n">cover</span>
            <span class="n">img</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">grab_url</span> <span class="o">+</span> <span class="n">cover</span><span class="p">)</span>
            <span class="n">output</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">foldername</span> <span class="o">+</span> <span class="s">&quot;/&quot;</span> <span class="o">+</span> <span class="n">cover</span><span class="p">,</span> <span class="s">&#39;wb&#39;</span><span class="p">)</span>
            <span class="n">output</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
            <span class="n">output</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
            <span class="k">print</span> <span class="n">cover</span> <span class="o">+</span> <span class="s">&quot;... downloaded!!&quot;</span>
        <span class="k">except</span> <span class="ne">Exception</span><span class="p">,</span> <span class="n">e</span><span class="p">:</span>
            <span class="k">pass</span>
    <span class="k">return</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="n">url</span><span class="p">):</span>
    <span class="n">urls</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">print</span> <span class="s">&quot;Fetching the page...&quot;</span>
    <span class="n">page</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">url</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">print</span> <span class="s">&quot;Fetching completed!&quot;</span>
    <span class="n">soup</span> <span class="o">=</span> <span class="n">BeautifulSoup</span><span class="p">(</span><span class="n">page</span><span class="p">)</span>
    <span class="k">print</span> <span class="s">&quot;Grabbing the objects of the page...&quot;</span>
    <span class="n">lis</span> <span class="o">=</span> <span class="n">soup</span><span class="o">.</span><span class="n">find_all</span><span class="p">(</span><span class="s">&quot;li&quot;</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">lis</span><span class="p">:</span>
        <span class="n">urls</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">item</span><span class="o">.</span><span class="n">a</span><span class="p">[</span><span class="s">&quot;href&quot;</span><span class="p">])</span>
    <span class="n">domain</span> <span class="o">=</span> <span class="n">urlparse</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="n">downloader</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="n">url</span><span class="p">,</span> <span class="n">domain</span><span class="o">.</span><span class="n">netloc</span><span class="p">)</span>
    <span class="k">print</span> <span class="s">&quot;All files have been successfully downloaded!&quot;</span>
    <span class="k">return</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="n">main</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div>

<p>You can also fork it on Github <a href="https://github.com/bkvirendra/directory-downloader">here</a>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>How I keep my Facebook Page active using Cron Jobs!</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/how-i-keep-my-facebook-page-active-using-cron-jobs/"/>
    <id>http://virendra.me/how-i-keep-my-facebook-page-active-using-cron-jobs</id>
    <updated>2012-10-01T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[I have this habit of forgetting important tasks lately. Multitasking is never an easy job, and you tend to forget a lot. Being a regular member of the Brahma Kumaris community, I was looking forward to contributing to the community in any way possible. Hence, I was thinking of creating an online presence for the community. After a while, I created a Facebook Page for Brahma Kumaris, where I used to share daily Muralis along with videos of classes that I had to fetch from various sources online. The Murali was available in plain text and the videos were being posted at the Brahma Kumaris Youtube Channel. The Murali was available in 2 languages, viz. English and Hindi, at Brahma Kumaris Jewels.]]></summary>
    <content type="html" xml:base="http://virendra.me/how-i-keep-my-facebook-page-active-using-cron-jobs/"><![CDATA[<p>I have this habit of forgetting important tasks lately. Multitasking is never an easy job, and you tend to forget a lot. Being a regular member of the <a href="http://www.brahmakumaris.com/" title="Brahma Kumaris Official Website">Brahma Kumaris</a> community, I was looking forward to contributing to the community in any way possible.</p>

<p>Hence, I was thinking of creating an online presence for the community. After a while, I created a Facebook Page for <a href="http://www.facebook.com/TheBrahmaKumaris">Brahma Kumaris</a>, where I used to share daily Muralis along with videos of classes that I had to fetch from various sources online. </p>

<p>The Murali was available in plain text and the videos were being posted at the <a href="http://www.youtube.com/user/brahmakumariz">Brahma Kumaris Youtube Channel</a>. The Murali was available in 2 languages, viz. English and <a href="http://en.wikipedia.org/wiki/Hindi">Hindi</a>, at <a href="http://jewels.brahmakumaris.org/">Brahma Kumaris Jewels</a>.</p>

<p>I did the task of updating the <a href="http://www.facebook.com/TheBrahmaKumaris">Page</a> manually for a while, but it wasn’t possible for a lazy guy like me to do it for long, considering all the lazy habits I have. And it was important to update the Page early in the morning, so that it would reach members all around the globe.</p>

<p>I added a few fellow members of the community as admins of the Page so that they could help me update it regularly. But we still weren’t able to keep it updated.</p>

<p>So I decided to solve the problem the <strong>HACKER WaY</strong>, and finally came up with a script that I developed in approximately 2 hours. I had challenged my bro that I would get it done in an hour, but it took me an hour more.</p>

<p>After testing it thoroughly and adding support for the <a href="http://en.wikipedia.org/wiki/Hindi">Hindi</a> language, I created a cron job that runs exactly at 9:30 AM every day.</p>

<p>The script would fetch the <a href="http://jewels.brahmakumaris.org/">Jewels Brahma Kumaris page</a>, parse it to get the Murali text, and then filter it to remove any special characters.</p>

<p>Then it would make a POST request to the <a href="http://developers.facebook.com/docs/reference/api">Facebook Graph API</a> with all the required credentials and get the Murali posted on the Page. Since the Murali was available in 2 languages, I had 2 different scripts, one for each language.</p>
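<p>The Graph API call itself is just an HTTP <code>POST</code> to the page’s <code>/feed</code> edge. A minimal sketch of building that request in Python (the page ID and access token below are placeholders; my actual scripts do this in PHP):</p>

```python
from urllib.parse import urlencode  # 2012-era Python 2 would use urllib instead

PAGE_ID = "TheBrahmaKumaris"         # placeholder page identifier
ACCESS_TOKEN = "PAGE_ACCESS_TOKEN"   # placeholder page access token

def build_feed_post(message):
    """Return (url, form_body) for posting a message to the page's feed."""
    url = "https://graph.facebook.com/%s/feed" % PAGE_ID
    body = urlencode({"message": message, "access_token": ACCESS_TOKEN})
    return url, body

url, body = build_feed_post("Good Morning! Today is a new Murali day.")
print(url)  # → https://graph.facebook.com/TheBrahmaKumaris/feed
```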

<p>Then I had a script for fetching the Class video from the <a href="http://www.youtube.com/user/brahmakumariz">Brahma Kumaris Youtube Channel</a>. This script would get today’s Murali class video, and would post it on the <a href="http://www.facebook.com/TheBrahmaKumaris">page</a>.</p>

<p>The scripts have really been useful for keeping the <a href="http://www.facebook.com/TheBrahmaKumaris">Page</a> updated regularly, and have contributed to increasing the Page Reach and Likes as well. Currently, the Page has 5,990 likes, with around 350+ people talking about it. Now, I rarely have to check the <a href="http://www.facebook.com/TheBrahmaKumaris">page</a>, since the <code>Cron</code> manages it all.</p>

<p>Well, I mostly prefer <code>Python</code>, but I used PHP to create these scripts since <code>Python</code> has a lot of hosting issues. I was considering hosting it on <code>Heroku</code>, but <code>Heroku</code> doesn’t provide <code>Cron</code> for free. Hence PHP was the better option, again.</p>

<p>You can fork the script on Github <a href="https://github.com/bkvirendra/BkMuralis-Cron-script">here</a>.</p>

<p><strong>I would like to hear from you, about any such similar situations that you have come across, and how you managed to solve it!</strong></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>#Cracking the Multunus @Twitter Challenge</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/cracking-the-multunus-twitter-challenge/"/>
    <id>http://virendra.me/cracking-the-multunus-twitter-challenge</id>
    <updated>2012-09-19T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[I like to keep #hacking interesting problems that I come across. And just yesterday, I came across this new Twitter #Challenge, created by Multunus Softwares. Well, actually the contest had ended; the deadline was 12th August, almost a month ago. But the Puzzle was really interesting. It was actually a web app which accepts your Twitter handle (i.e. username) and generates a cloud of numbers, which is unique for every Twitter handle. The number cloud was really strange at first, but after playing with it (regenerating the number cloud again and again) for a while, I got to know the pattern of numbers being generated. So, the problem was to understand the number pattern being generated in the cloud, and to build an app similar to that.]]></summary>
    <content type="html" xml:base="http://virendra.me/cracking-the-multunus-twitter-challenge/"><![CDATA[<p>I like to keep #hacking interesting problems that I come across.
And just yesterday, I came across this new <a href="http://twitterpuzzle.herokuapp.com/">Twitter #Challenge</a>, created by <a href="http://multunus.com/">Multunus Softwares</a>. Well, actually the contest had ended; the deadline was 12th August, almost a month ago.</p>

<p>But the <a href="http://twitterpuzzle.herokuapp.com/">Puzzle</a> was really interesting. It was actually a web app which accepts your Twitter handle (i.e. username) and generates a <a href="http://crackedpuzzle.herokuapp.com/?handle=multunus">cloud of numbers</a>, which is unique for every Twitter handle.</p>

<p>The number cloud was really strange at first, but after playing with it (regenerating the number cloud again and again) for a while, I got to know the pattern of numbers that was being generated.</p>

<p>So, the problem was to understand the number pattern being generated in the cloud, and to build an app similar to that.</p>

<p>After I understood the logic behind the generated cloud, it was just some Python #code that I had to put together, and the app (i.e. the solution) was up and running.</p>

<p>I didn’t actually win anything, but it was really fun solving another interesting puzzle. </p>

<p><strong>The <a href="http://crackedpuzzle.herokuapp.com/">Solution</a> is hosted on Heroku, and the source code is available on <a href="https://github.com/bkvirendra/Multunus-Twitter-Puzzle-Solution">Github</a>.</strong></p>

<p><strong>The Problem Statement is available <a href="http://twitterpuzzle.herokuapp.com/">here</a>.</strong></p>

<p><strong>Would like to hear your experiences about solving some really interesting problems that you have come across!</strong></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Google's &quot;Did You Mean&quot; Hack in Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/googles-did-you-mean-hack-in-python/"/>
    <id>http://virendra.me/googles-did-you-mean-hack-in-python</id>
    <updated>2012-09-03T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[So I had this problem with one of my apps, Nearme, where people weren't actually querying correctly (there were a lot of misspelled words in the queries). Since these queries were proper nouns, there is no specific dictionary or source that I could use to correct them. So I thought of using Google’s “Did You Mean”, since it corrects all types of words (including proper nouns that aren't included in any dictionary). So here’s a hack that I wrote to solve this problem of fixing the spelling mistakes that users made while querying my app (it's not the BEST solution to the problem, but well, it works). The code is in Python, and makes use of one of my favorite modules, BeautifulSoup.]]></summary>
    <content type="html" xml:base="http://virendra.me/googles-did-you-mean-hack-in-python/"><![CDATA[<p>I&#39;ve always been pretty fond of the <a href="https://www.google.com/">Google</a> Search Engine (<em>well, everyone is</em>). <a href="https://www.google.com/">Google</a> has some really handy features that help when you are searching for something you can&#39;t actually <strong>spell correctly</strong> (there are a lot of things that aren&#39;t easy to spell, unless you are an <em>English Professor</em> or maybe an <em>expert in Literature</em>).</p>

<p>So I had this problem with one of my apps, <a href="http://bkvirendra.github.com/Nearme/">Nearme</a>: people weren&#39;t actually querying correctly (there were a lot of misspelled words in the queries). Since these queries were proper nouns, there is no specific dictionary or source that I could use to correct them. So I thought of using <a href="http://stackoverflow.com/questions/307291/how-does-the-google-did-you-mean-algorithm-work">Google’s “Did You Mean”</a>, since it corrects all types of words (including proper nouns that aren&#39;t included in any dictionary).</p>

<p>So here’s a hack that I wrote to solve this problem of fixing the spelling mistakes users made while querying my app
(it&#39;s not the BEST solution to the problem, but <em>well, it works</em>).</p>

<p>The code is in Python, and makes use of one of my favorite modules, <code>BeautifulSoup</code>.</p>

<p>The <code>getPage</code> function retrieves pages <code>gzip</code>-compressed, which reduces bandwidth usage while fetching the page.</p>

<p><code>didYouMean</code> is the main function: call it with a <code>word</code> argument and it will return the corrected word (if it is misspelled), or else it will simply return <code>1</code>, which means the word needs no correction.</p>

<p>The <code>code</code> for the script:</p>

<div class="highlight"><pre><code class="python"><span class="kn">import</span> <span class="nn">urllib2</span>
<span class="kn">import</span> <span class="nn">gzip</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">urllib</span>
<span class="kn">import</span> <span class="nn">re</span>

<span class="kn">from</span> <span class="nn">bs4</span> <span class="kn">import</span> <span class="n">BeautifulSoup</span>
<span class="kn">from</span> <span class="nn">StringIO</span> <span class="kn">import</span> <span class="n">StringIO</span>

<span class="k">def</span> <span class="nf">getPage</span><span class="p">(</span><span class="n">url</span><span class="p">):</span>
    <span class="n">request</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">Request</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s">&#39;Accept-encoding&#39;</span><span class="p">,</span> <span class="s">&#39;gzip&#39;</span><span class="p">)</span>
    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s">&#39;User-Agent&#39;</span><span class="p">,</span><span class="s">&#39;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.20 (KHTML, like Gecko) Chrome/19.0.1036.7 Safari/535.20&#39;</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">request</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">info</span><span class="p">()</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">&#39;Content-Encoding&#39;</span><span class="p">)</span> <span class="o">==</span> <span class="s">&#39;gzip&#39;</span><span class="p">:</span>
        <span class="n">buf</span> <span class="o">=</span> <span class="n">StringIO</span><span class="p">(</span> <span class="n">response</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
        <span class="n">f</span> <span class="o">=</span> <span class="n">gzip</span><span class="o">.</span><span class="n">GzipFile</span><span class="p">(</span><span class="n">fileobj</span><span class="o">=</span><span class="n">buf</span><span class="p">)</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">return</span> <span class="n">data</span>

<span class="k">def</span> <span class="nf">didYouMean</span><span class="p">(</span><span class="n">q</span><span class="p">):</span>
    <span class="n">q</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="nb">str</span><span class="o">.</span><span class="n">lower</span><span class="p">(</span><span class="n">q</span><span class="p">))</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="n">url</span> <span class="o">=</span> <span class="s">&quot;http://www.google.com/search?q=&quot;</span> <span class="o">+</span> <span class="n">urllib</span><span class="o">.</span><span class="n">quote</span><span class="p">(</span><span class="n">q</span><span class="p">)</span>
    <span class="n">html</span> <span class="o">=</span> <span class="n">getPage</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="n">soup</span> <span class="o">=</span> <span class="n">BeautifulSoup</span><span class="p">(</span><span class="n">html</span><span class="p">)</span>
    <span class="n">ans</span> <span class="o">=</span> <span class="n">soup</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s">&#39;a&#39;</span><span class="p">,</span> <span class="n">attrs</span><span class="o">=</span><span class="p">{</span><span class="s">&#39;class&#39;</span> <span class="p">:</span> <span class="s">&#39;spell&#39;</span><span class="p">})</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">result</span> <span class="o">=</span> <span class="nb">repr</span><span class="p">(</span><span class="n">ans</span><span class="o">.</span><span class="n">contents</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;u&#39;&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;/&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;&lt;b&gt;&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;&lt;i&gt;&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s">&#39;[^A-Za-z0-9\s]+&#39;</span><span class="p">,</span> <span class="s">&#39;&#39;</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s">&#39; +&#39;</span><span class="p">,</span><span class="s">&#39; &#39;</span><span class="p">,</span><span class="n">result</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">AttributeError</span><span class="p">:</span>
        <span class="n">result</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">result</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">didYouMean</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="k">print</span> <span class="n">response</span>
</code></pre></div>

<p>You can even fork it on Github <a href="https://github.com/bkvirendra/didyoumean">here</a>.</p>
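<p>As a side note, here is how a caller might handle that <code>1</code> sentinel (a minimal sketch; <code>fake_did_you_mean</code> below is a hypothetical stand-in that follows the same return convention as the script above):</p>

```python
def normalize(query, did_you_mean):
    """Return a corrected query, falling back to the original.

    did_you_mean follows the script's convention: it returns the
    corrected string, or the integer 1 when no correction is needed.
    """
    result = did_you_mean(query)
    return query if result == 1 else result

# Hypothetical stand-in for the real didYouMean, for illustration only.
corrections = {"pythn": "python"}
fake_did_you_mean = lambda q: corrections.get(q, 1)

print(normalize("pythn", fake_did_you_mean))
print(normalize("python", fake_did_you_mean))
```

<p>This way the app can always use the return value as the query, whether or not a correction was found.</p>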
]]></content>
  </entry>
  
  
  <entry>
    <title>Find places around you, using Nearme on SMS</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/find-places-around-you-using-nearme-on-sms/"/>
    <id>http://virendra.me/find-places-around-you-using-nearme-on-sms</id>
    <updated>2012-08-19T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Well, for me Find Near Me is a pretty handy app when I'm looking for places around me. It's really easy to use, and comes in handy when I'm in a new town and don't know much about the vicinity. And since I live in India, you just cannot expect the Internet connection to be working all the time. So it's obvious that you cannot depend on apps which require an Internet connection to function. So I came up with my solution: a Nearme app on SMS, which works regardless of whether I'm connected to the Internet. Txtweb is a platform which provides the Internet over SMS, so developers can create apps that can be used via SMS. It is really easy to get started with the Txtweb platform, since they have some excellent documentation.]]></summary>
    <content type="html" xml:base="http://virendra.me/find-places-around-you-using-nearme-on-sms/"><![CDATA[<p>Well, for me <a href="http://itunes.apple.com/us/app/find-near-me/id353369769?mt=8">Find Near Me</a> is a pretty handy app when I&#39;m looking for places around me. It&#39;s really easy to use, and comes in handy when I&#39;m in a new town and don&#39;t know much about the vicinity.</p>

<p>And since I live in <a href="http://articles.timesofindia.indiatimes.com/2012-05-02/internet/31537373_1_256kbps-speed-internet-speed-akamai">India</a>, you just cannot expect the Internet connection to be working all the time. So it&#39;s obvious that you cannot depend on apps which require an Internet connection to function. So I came up with my solution: a Nearme app on SMS, which works regardless of whether I&#39;m connected to the Internet.</p>

<p><a href="http://developer.txtweb.com/">Txtweb</a> is a platform which provides the Internet over SMS, so developers can create apps that can be used via SMS. It is really easy to get started with the <a href="http://developer.txtweb.com/">Txtweb platform</a>, since they have some excellent documentation.</p>

<p>I decided to use the <a href="https://developers.google.com/maps/documentation/">Google Places API</a> to locate places around you.</p>
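<p>For the curious, a nearby search request to the Places API boils down to building a URL like the one below (a rough sketch in Python 3; <code>YOUR_API_KEY</code> is a placeholder, and the coordinates are just an example):</p>

```python
from urllib.parse import urlencode

def nearby_search_url(lat, lng, keyword, radius=1000, key="YOUR_API_KEY"):
    # Build the request URL for a Places API "nearby search" call.
    base = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
    params = urlencode({
        "location": "%s,%s" % (lat, lng),
        "radius": radius,
        "keyword": keyword,
        "key": key,
    })
    return "%s?%s" % (base, params)

print(nearby_search_url(18.5204, 73.8567, "coffee"))
```

<p>The JSON response then gets reformatted into a plain-text list of places that fits in an SMS reply.</p>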

<p>You can find how to use the app, along with some how to’s at <a href="http://bkvirendra.github.com/Nearme/">App homepage</a>.</p>

<p><strong><a href="http://developer.txtweb.com/apps/nearme">Try it out</a> on the Txtweb emulator for free!</strong></p>

<p><em>The source code is available on <a href="https://github.com/bkvirendra/Nearme">Github</a></em>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Launching an API based App | Scrapit - Extract keywords from webpages</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/launching-an-api-based-app-scrapit-extract-keywords/"/>
    <id>http://virendra.me/launching-an-api-based-app-scrapit-extract-keywords</id>
    <updated>2012-08-06T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[I was recently trying to find a good alternative web scraper for one of my apps, which had to scrape links and extract keywords from webpages. It had to be fairly robust, handling broken HTML and heaps of text. I browsed around the web looking for some existing libraries that could help me out, since I don’t like to create things from scratch. Well, I couldn’t find exactly what I wanted, but I found some things that could help me build it. Python has some really good text processing modules along with HTML processing libraries.]]></summary>
    <content type="html" xml:base="http://virendra.me/launching-an-api-based-app-scrapit-extract-keywords/"><![CDATA[<p>I was recently trying to find a good alternative web scraper for one of my apps, which had to scrape links and extract keywords from webpages. It had to be fairly robust, handling broken HTML and heaps of text.</p>

<p>I browsed around the web looking for some existing libraries that could help me out, since I don’t like to create things from scratch. Well, I couldn’t find exactly what I wanted, but I found some things that could help me build it.</p>

<p>Python has some really good <a href="http://stackoverflow.com/questions/6030291/python-or-java-for-text-processing-text-mining-information-retrieval-natural">text processing modules</a> along with <a href="http://pypi.python.org/pypi?%3Aaction=search&amp;term=html+parser&amp;submit=search">HTML processing libraries</a>. So I ended up using:</p>

<p><a href="http://pypi.python.org/pypi/topia.termextract/"><code>Topia.termextract</code></a> for text processing</p>

<p><a href="http://lxml.de/"><code>lxml</code></a> for html parsing</p>

<p>After writing the code, I tested it pretty thoroughly, and once the module was complete I thought I would launch it as an API service. Deployment was not an issue, since <a href="http://heroku.com/">Heroku</a> is my go-to option. So I got the API deployed on <a href="http://neilmiddleton.com/why-heroku-is-a-game-changer/">Heroku</a>, with some modifications, running on the gunicorn server.</p>

<p><strong>What is <a href="http://scrapit.herokuapp.com/">Scrapit</a>?</strong></p>

<p><a href="http://scrapit.herokuapp.com/">Scrapit</a> is an API for scraping webpages for keywords. Using Scrapit you can extract important keywords from a webpage that are relevant to the page.</p>

<p><strong>Using Scrapit</strong>:</p>

<p>You need to make calls to</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">http://scrapit.herokuapp.com/q/?q={url} 
</code></pre></div>
<p>Parameters:</p>

<p><code>q</code> : (required) url to be fetched</p>

<p><code>occurs</code> : (optional) Will only return the words that are repeated more than once on the webpage. Set to <code>1</code> to enable it.</p>

<p><code>pretty</code> : (optional) Used for pretty-printing the response. Set to <code>1</code> to enable it.</p>

<p>Example Usage:</p>

<p><a href="http://scrapit.herokuapp.com/q/?q=http://imdb.com">http://scrapit.herokuapp.com/q/?q=http://imdb.com</a></p>

<p><a href="http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1">http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1</a></p>

<p><a href="http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1&amp;occurs=1">http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1&amp;occurs=1</a></p>
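<p>The example calls above can also be scripted; here is a minimal sketch in Python 3 (note that <code>urlencode</code> will percent-encode the target URL, which the server decodes on its end):</p>

```python
from urllib.parse import urlencode

SCRAPIT = "http://scrapit.herokuapp.com/q/"

def scrapit_url(page, pretty=False, occurs=False):
    # q is required; pretty and occurs are the optional flags described above.
    params = {"q": page}
    if pretty:
        params["pretty"] = 1
    if occurs:
        params["occurs"] = 1
    return SCRAPIT + "?" + urlencode(params)

print(scrapit_url("http://imdb.com", pretty=True))
```

<p>Fetching the resulting URL with any HTTP client returns the extracted keywords.</p>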

<p>Well, I&#39;m going to keep trying to fix the bugs in the API.</p>

<p><em>So if you have any suggestions that would make <a href="http://scrapit.herokuapp.com/">Scrapit</a> any better, they are welcome here :)</em></p>
]]></content>
  </entry>
  

</feed>