<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en-US">
  <title type="text">Virendra Rajput's Blog</title>

  <updated>2014-01-29T00:22:23+05:30</updated>

  <link rel="alternate" type="text/html" href="http://virendra.me" />
  <id>http://virendra.me/</id>
  <link rel="self" type="application/atom+xml" href="http://virendra.me/atom.xml" />

  <author>
    <name>Virendra Rajput</name>
    <uri>http://virendra.me/</uri>
  </author>

  
  
  <entry>
    <title>Exploiting Intelltest bug to get answers [in Online exams]</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/intelltest-hack/"/>
    <id>http://virendra.me/intelltest-hack</id>
    <updated>2014-01-24T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Exploiting Intelltest bug to get answers [in Online exams]. Intelltest is a platform which provides continuous evaluations and online examination hosting]]></summary>
    <content type="html" xml:base="http://virendra.me/intelltest-hack/"><![CDATA[<p><em>
<a href="http://www.intelltest.com/">Intelltest</a> is a platform that provides continuous evaluations and online examination hosting. It is widely adopted for conducting online examinations by several prestigious institutes, including <a href="http://unipune.ac.in/">Pune University</a>, Maharashtra Institute of Technology, Sinhagad College of Engineering and many more.
</em></p>

<p>This exploit requires the Network inspector under Developer Tools (in Google Chrome or Chromium) or Firebug (if you are on Firefox). You can exploit it to view the correct options for <abbr title="Multiple Choice Questions">MCQs</abbr>.</p>

<p>Here is a demo (answered 3 questions):</p>

<iframe width="640" height="360" src="http://www.youtube.com/embed/DM_GIfPE5tg"></iframe>

<p>And here is the result for the test (scored 3/50 for the 3 questions attempted):</p>

<p><img height="360" width="640" src="/img/TestResults.png" alt="Test Results"></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Tool to unsubscribe from Facebook App requests and activity</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/unsubscribe-fb-apps-bookmarklet/"/>
    <id>http://virendra.me/unsubscribe-fb-apps-bookmarklet</id>
    <updated>2013-12-07T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Are you tired of receiving notifications from Facebook apps? And wish you could just unsubscribe from all of them in one go? Well then, it's your lucky day! I built a bookmarklet that does exactly that (since I was too lazy to unsubscribe manually).]]></summary>
    <content type="html" xml:base="http://virendra.me/unsubscribe-fb-apps-bookmarklet/"><![CDATA[<p>Are you tired of receiving notifications from Facebook apps?</p>

<p>And wish you could just unsubscribe from all of them in one go?</p>

<p>Well then, it&#39;s your lucky day! I built a bookmarklet that does exactly that (since I was too lazy to unsubscribe manually).</p>

<hr>

<p>To use it, follow these steps:</p>

<p><span><strong>1</strong></span>. First of all, add the bookmarklet to your Bookmarks bar by dragging this button -</p>

<p><a style="text-decoration: underline; padding: 2px 6px; color: #fff; background: #32a0eb; -webkit-border-radius: 4px; -moz-border-radius: 4px; border-radius: 4px;" href="javascript:(function()%7Bfunction%20happyFn(e)%7Bif(halt%7C%7C!e%7C%7C!e.length)%7Bdocument.getElementById(%22happyStatus%22).innerHTML%3D%22Done!%22%3Breturn%7De%5B0%5D.querySelector(%22%5Btype%3Dcheckbox%5D%22).click()%3Bdocument.getElementById(%22count_apps%22).innerHTML%3Dj-e.length%2B1%3Bwindow.setTimeout(function()%7BhappyFn(e.splice(1))%7D%2C1e3)%7Dfunction%20haltFn()%7Bhalt%3Dtrue%3Breturn%20false%7Dvar%20nodes%3Ddocument.getElementsByClassName(%22notif%22)%2Ci%2Chalt%3Dfalse%2Csub_apps%3D%5B%5D%3Bfor(var%20i%3D0%3Bi%3Cnodes.length%3Bi%2B%2B)%7Bvar%20node%3Dnodes%5Bi%5D%3Bif(node.querySelector(%22%5Btype%3Dcheckbox%5D%22).checked%3D%3Dtrue)%7Bsub_apps.push(node)%7D%7Dvar%20j%3Dsub_apps.length%3Bvar%20happyDiv%3Ddocument.createElement(%22div%22)%3BhappyDiv.innerHTML%3D%22%3Cdiv%20id%3D'happy'%20style%3D'background-color%3A%23ddd%3Bfont-size%3A16px%3Btext-align%3Acenter%3Bposition%3Afixed%3Btop%3A40px%3Bright%3A40px%3Bwidth%3A200px%3Bheight%3A100px%3Bborder%3A4px%20solid%20black%3Bz-index%3A9999%3Bpadding-top%3A15px%3B'%3E%3Cspan%20id%3D'count_apps'%3E0%3C%2Fspan%3E%20of%20%22%2Bsub_apps.length%2B%22%20apps%20unsubscribed.%3Cdiv%20id%3D'happyStatus'%20style%3D'margin-top%3A30px%3B'%3E%3Ca%20onclick%3D'haltFn()'%20id%3D'happyButton'%20href%3D'%23'%20style%3D'display%3Ablock%3B'%3EStop%20it.%3C%2Fa%3E%3C%2Fdiv%3E%3C%2Fdiv%3E%22%3Bdocument.getElementsByTagName(%22body%22)%5B0%5D.appendChild(happyDiv)%3BhappyFn(sub_apps)%7D)()">Unsubscribe me</a> &lt;--  drag me to your bookmarks bar</p>

<p><span><strong>2</strong></span>. Go to the Facebook Notifications settings <a href="https://www.facebook.com/settings?tab=notifications">here</a> (you need to be logged into Facebook).</p>

<p><span><strong>3</strong></span>. Now click on the Edit link in the &quot;App requests and activity&quot; row, as shown below:</p>

<p><img src="/img/click_edit.png"></p>

<p>It will expand, and all the apps you are subscribed to will be listed.</p>

<p><span><strong>4</strong></span>. Now click on the &quot;Unsubscribe Me&quot; bookmarklet you just added. And that&#39;s all.
It will automatically start the unsubscribing process.
The progress is shown in a box at the top right.
It will show a &quot;Done&quot; message when it has finished unsubscribing from all the apps.</p>

<p>Enjoy! :-)</p>

<hr>

<p>The <a href="https://github.com/bkvirendra/fb-unsubscribe-bookmarklet">source code</a> is available on Github, if you are interested in contributing.</p>

<script type="text/javascript">
function externalLinks() {
  for(var c = document.getElementsByTagName("a"), a = 0;a < c.length;a++) {
    var b = c[a];
    b.getAttribute("href") && b.hostname !== location.hostname && (b.target = "_blank")
  }
}
externalLinks();
</script>
]]></content>
  </entry>
  
  
  <entry>
    <title>Photography in Lonavala, India</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/photography-in-lonavala/"/>
    <id>http://virendra.me/photography-in-lonavala</id>
    <updated>2013-09-04T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Some of my random clicks taken in Lonavala, India.]]></summary>
    <content type="html" xml:base="http://virendra.me/photography-in-lonavala/"><![CDATA[<p>These were some of my random clicks in Lonavala, India:</p>

<div class="fbutils-album clearfix" style="width: 100%"> 
    <div class="fbutils-photos">

        <a class="fbutils-photo" href="/img/photos/original/flower.jpg" title="a flower">
            <img class="fbutils-thumb" src="/img/photos/thumbs/flower.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo1_1280.jpg" title="View from the hill top">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo1_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo2_1280.jpg" title="a leaf">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo2_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo3_1280.jpg" title="the lake view">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo3_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo4_1280.jpg" title="the lake view (2)">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo4_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo5_1280.jpg" title="the sky">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo5_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo6_1280.jpg" title="the lake view (3)">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo6_1280.jpg">
        </a>
       <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo8_1280.jpg" title="the sky view at evening">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo8_1280.jpg">
        </a>
       <a class="fbutils-photo" href="/img/photos/original/tumblr_mslpglTxSH1ryuegwo9_1280.jpg" title="the crab in hand">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_mslpglTxSH1ryuegwo9_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo3_400.jpg" title="a random shot">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo3_400.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo4_1280.jpg" title="a random shot (2)">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo4_1280.jpg">
        </a>
        <a class="fbutils-photo" href="/img/photos/original/tumblr_msefs3V2J21ryuegwo5_1280.jpg" title="view from the mountain top">
            <img class="fbutils-thumb" src="/img/photos/thumbs/tumblr_msefs3V2J21ryuegwo5_1280.jpg">
        </a>

    </div>
</div>
]]></content>
  </entry>
  
  
  <entry>
    <title>Scraping IMDB with Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/scraping-imdb-with-python/"/>
    <id>http://virendra.me/scraping-imdb-with-python</id>
    <updated>2013-06-19T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Scraping is fun, whether you’re doing it just for fun or profit. IMDB does not have an API for accessing information on movies or TV series, so I had to write a scraper to fetch their information on movies.]]></summary>
    <content type="html" xml:base="http://virendra.me/scraping-imdb-with-python/"><![CDATA[<p>Scraping is fun, whether you’re doing it just for fun or profit. I have already created a couple of <code>scrapers</code> for <a href="https://github.com/bkvirendra/iTunes-Charts-WebCrawler">iTunes</a>, <a href="https://github.com/bkvirendra/paintbottle-downloader">Paintbottle</a> (deleted as requested by the site admin), <a href="https://github.com/bkvirendra/scrapy-talk-samples">Cricinfo</a>, <a href="https://github.com/bkvirendra/didyoumean">Google’s Did you Mean?</a> and more. Check them out on <a href="https://github.com/bkvirendra?tab=repositories">Github</a>.</p>

<p><a href="http://imdb.com/">IMDB</a> does not have an API for accessing information on movies or TV series, so I had to write a <code>scraper</code> to fetch their information on movies.</p>

<p>I knew about a couple of other unofficial APIs (including <a href="http://www.omdbapi.com/">omdb</a>), but creating your own solution is always fun :)</p>

<p>If you don’t want to go much into the technical details but are just looking to use it, it is hosted at <a href="http://getimdb.herokuapp.com">http://getimdb.herokuapp.com</a>.</p>

<p>The <code>scraper</code> is written in <code>Python</code> and uses <a href="http://lxml.de/">lxml</a> for parsing the webpages. I’m using <code>XPath</code> for selecting elements from the DOM.</p>
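<p>As a minimal, self-contained illustration of that pattern (not part of the original script), here is the same <code>document_fromstring</code> + XPath combination applied to an inline snippet of markup:</p>

```python
import lxml.html

# Parse a small piece of markup and select a value with XPath --
# the same pattern the scraper below applies to IMDB's pages.
html = '<div id="overview-top"><h1><span>Furious 6</span></h1></div>'
doc = lxml.html.document_fromstring(html)
title = doc.xpath('//*[@id="overview-top"]/h1/span[1]/text()')[0].strip()
print(title)  # Furious 6
```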

<p>Following are the dependencies, which can be installed using <code>pip</code>:</p>

<div class="highlight"><pre><code class="python"><span class="n">requests</span><span class="o">==</span><span class="mf">1.2</span><span class="o">.</span><span class="mi">3</span>
<span class="n">lxml</span><span class="o">==</span><span class="mf">3.2</span><span class="o">.</span><span class="mi">1</span>
</code></pre></div>

<p>The code:</p>

<div class="highlight"><pre><code class="python"><span class="c">#!/usr/bin/env python</span>

<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">requests</span>
<span class="kn">import</span> <span class="nn">lxml.html</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="nb">id</span><span class="p">):</span>
    <span class="n">hxs</span> <span class="o">=</span> <span class="n">lxml</span><span class="o">.</span><span class="n">html</span><span class="o">.</span><span class="n">document_fromstring</span><span class="p">(</span><span class="n">requests</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">&quot;http://www.imdb.com/title/&quot;</span> <span class="o">+</span> <span class="nb">id</span><span class="p">)</span><span class="o">.</span><span class="n">content</span><span class="p">)</span>
    <span class="n">movie</span> <span class="o">=</span> <span class="p">{}</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;title&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/h1/span[1]/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;title&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;year&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/h1/span[2]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;year&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/h1/span[3]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
        <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;year&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;certification&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/span[1]/@title&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;certification&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;running_time&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/time/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;running_time&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;genre&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/a/span/text()&#39;</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;genre&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;release_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/span[3]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;release_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[2]/span[4]/a/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
        <span class="k">except</span> <span class="ne">Exception</span><span class="p">:</span>
            <span class="n">movie</span><span class="p">[</span><span class="s">&#39;release_date&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;rating&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[3]/div[3]/strong/span/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;rating&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;metascore&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[3]/div[3]/a[2]/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span><span class="o">.</span><span class="n">split</span><span class="p">(</span><span class="s">&#39;/&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;metascore&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="mi">0</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;description&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/p[2]/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;description&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;director&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[4]/a/span/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;director&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;stars&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[6]/a/span/text()&#39;</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;stars&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;poster&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;img_primary&quot;]/div/a/img/@src&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;poster&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;gallery&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;combined-photos&quot;]/div/a/img/@src&#39;</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;gallery&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;storyline&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;titleStoryLine&quot;]/div[1]/p/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;storyline&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;votes&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="n">hxs</span><span class="o">.</span><span class="n">xpath</span><span class="p">(</span><span class="s">&#39;//*[@id=&quot;overview-top&quot;]/div[3]/div[3]/a[1]/span/text()&#39;</span><span class="p">)[</span><span class="mi">0</span><span class="p">]</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="k">except</span> <span class="ne">IndexError</span><span class="p">:</span>
        <span class="n">movie</span><span class="p">[</span><span class="s">&#39;votes&#39;</span><span class="p">]</span> <span class="o">=</span> <span class="s">&quot;&quot;</span>
    <span class="k">return</span> <span class="n">movie</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="k">print</span> <span class="n">main</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div>
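<p>The repeated <code>try/except IndexError</code> blocks above could also be condensed with a small helper along these lines (a sketch, not part of the original script; <code>first</code> is a hypothetical name):</p>

```python
def first(results, default=""):
    """Return the first XPath result, stripped, or a default when empty."""
    try:
        return results[0].strip()
    except IndexError:
        return default

# e.g. movie['title'] = first(hxs.xpath('//*[@id="overview-top"]/h1/span[1]/text()'))
print(first(["  Furious 6  "]))      # Furious 6
print(first([], default="N/A"))      # N/A
```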

<p>You can use it by passing any valid <code>imdb id</code> as an argument:</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">$ python imdb.py tt1905041
</code></pre></div>
<p>And the output will be returned as follows:</p>

<div class="highlight"><pre><code class="json"><span class="p">{</span>
  <span class="nt">&quot;certification&quot;</span><span class="p">:</span> <span class="s2">&quot;PG-13&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;description&quot;</span><span class="p">:</span> <span class="s2">&quot;Hobbs has Dom and Brian reassemble their crew in order</span>
<span class="s2">   to take down a mastermind who commands an organization of mercenary drivers across 12 countries. Payment? Full pardons for them all.&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;director&quot;</span><span class="p">:</span> <span class="s2">&quot;Justin Lin&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;gallery&quot;</span><span class="p">:</span> <span class="p">[</span>
    <span class="s2">&quot;http://ia.media-imdb.com/images/G/01/imdb/images/nopicture/small/unknown-1394846836._V379391227_.png&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;http://ia.media-imdb.com/images/G/01/imdb/images/nopicture/small/unknown-1394846836._V379391227_.png&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;http://ia.media-imdb.com/images/G/01/imdb/images/nopicture/small/unknown-1394846836._V379391227_.png&quot;</span>
  <span class="p">],</span> 
  <span class="nt">&quot;genre&quot;</span><span class="p">:</span> <span class="p">[</span>
    <span class="s2">&quot;Action&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Crime&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Thriller&quot;</span>
  <span class="p">],</span> 
  <span class="nt">&quot;metascore&quot;</span><span class="p">:</span> <span class="s2">&quot;61&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;poster&quot;</span><span class="p">:</span> <span class="s2">&quot;http://ia.media-imdb.com/images/M/MV5BMTM3NTg2NDQzOF5BMl5BanBnXkFtZTcwNjc2NzQzOQ@@._V1_SX214_.jpg&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;rating&quot;</span><span class="p">:</span> <span class="s2">&quot;7.2&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;release_date&quot;</span><span class="p">:</span> <span class="s2">&quot;24 May 2013&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;running_time&quot;</span><span class="p">:</span> <span class="s2">&quot;130 min&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;stars&quot;</span><span class="p">:</span> <span class="p">[</span>
    <span class="s2">&quot;Vin Diesel&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Paul Walker&quot;</span><span class="p">,</span> 
    <span class="s2">&quot;Dwayne Johnson&quot;</span>
  <span class="p">],</span> 
  <span class="nt">&quot;storyline&quot;</span><span class="p">:</span> <span class="s2">&quot;Since Dom (Diesel) and Brian&#39;s (Walker) Rio heist toppled a kingpin&#39;s empire and left their crew with $100 million, our heroes have scattered across the globe. But their inability to return home and living forever on the lam have left their lives incomplete. Meanwhile, Hobbs (Johnson) has been tracking an organization of lethally skilled mercenary drivers across 12 countries, whose mastermind (Evans) is aided by a ruthless second-in-command revealed to be the love Dom thought was dead, Letty (Rodriguez). The only way to stop the criminal outfit is to outmatch them at street level, so Hobbs asks Dom to assemble his elite team in London. Payment? Full pardons for all of them so they can return home and make their families whole again.&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;title&quot;</span><span class="p">:</span> <span class="s2">&quot;Furious 6&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;votes&quot;</span><span class="p">:</span> <span class="s2">&quot;154,139&quot;</span><span class="p">,</span> 
  <span class="nt">&quot;year&quot;</span><span class="p">:</span> <span class="s2">&quot;2013&quot;</span>
<span class="p">}</span>
</code></pre></div>

<p>This will return a <code>JSON</code> object containing the data for the movie.
You can fork the code on <a href="https://github.com/bkvirendra/imdb-scraper">Github</a>.
You can try it out at <a href="http://getimdb.herokuapp.com">http://getimdb.herokuapp.com/</a>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Scraping iTunes Charts Using Scrapy Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/scraping-itunes-charts-using-scrapy-python/"/>
    <id>http://virendra.me/scraping-itunes-charts-using-scrapy-python</id>
    <updated>2013-06-13T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Hey guys, recently I published a guest post on David Walsh’s Blog. It’s a getting-started tutorial for Scrapy in which I build a simple web spider that crawls the iTunes charts and extracts the list of Top free apps.]]></summary>
    <content type="html" xml:base="http://virendra.me/scraping-itunes-charts-using-scrapy-python/"><![CDATA[<p>Hey guys, recently I published a guest post on <a href="http://davidwalsh.name">David Walsh’s Blog</a>. </p>

<p>It’s a getting-started tutorial for <a href="http://www.scrapy.org">Scrapy</a> in which I build a simple web spider using <code>Scrapy</code> that crawls the <a href="http://www.apple.com/itunes/charts/free-apps/">iTunes</a> charts and extracts the list of <a href="http://www.apple.com/itunes/charts/free-apps/">Top free apps</a>.</p>

<p>The complete post is available <a href="http://davidwalsh.name/python-scrape">here</a>.</p>

<p>The code for the scraper in the post is available <a href="https://github.com/bkvirendra/iTunes-Charts-WebCrawler">here</a>. </p>
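<p>At its core, the spider is just a selector query over the chart markup. Here’s a rough, stdlib-only sketch of that extraction step (the markup below is a simplified, hypothetical stand-in for the real iTunes chart page; the actual Scrapy spider is in the post):</p>

```python
from html.parser import HTMLParser

# Simplified, hypothetical stand-in for the iTunes chart markup;
# the real spider selects these nodes with Scrapy selectors instead.
SAMPLE = """
<section class="section apps">
  <ul><li><h3><a href="#">Candy Crush Saga</a></h3></li>
      <li><h3><a href="#">Spotify</a></h3></li></ul>
</section>
"""

class ChartParser(HTMLParser):
    """Collects the text of every link found inside an <h3> (the app names)."""
    def __init__(self):
        super().__init__()
        self.in_h3 = False
        self.apps = []

    def handle_starttag(self, tag, attrs):
        if tag == "h3":
            self.in_h3 = True

    def handle_endtag(self, tag):
        if tag == "h3":
            self.in_h3 = False

    def handle_data(self, data):
        if self.in_h3 and data.strip():
            self.apps.append(data.strip())

parser = ChartParser()
parser.feed(SAMPLE)
print(parser.apps)  # → ['Candy Crush Saga', 'Spotify']
```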
]]></content>
  </entry>
  
  
  <entry>
    <title>Using Dropbox as your database backup space</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/using-dropbox-as-your-database-backup-space/"/>
    <id>http://virendra.me/using-dropbox-as-your-database-backup-space</id>
    <updated>2013-05-24T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Recently, due to loss of hardware on AWS, we (markitty.com) lost access to our EC2 instance, and Amazon couldn’t do anything to help us out. And since we already keep everything backed up, it wasn’t much of a big deal to spawn a new instance and get back online. With that, Markitty had its first downtime since launch. But it taught me a lesson: we can’t really trust AWS infrastructure with our customers’ data. We already use Dropbox at Markitty, so I decided to trust it with DB backups. We use PostgreSQL as our main database on our Django stack. So I hacked this Python script that takes a regular backup of our main database and uploads it to one of our Dropbox folders.]]></summary>
    <content type="html" xml:base="http://virendra.me/using-dropbox-as-your-database-backup-space/"><![CDATA[<p>Recently, due to loss of hardware on <a href="http://aws.amazon.com">AWS</a>, we (<a href="http://markitty.com/">markitty.com</a>) lost access to our EC2 instance, and <a href="http://amazon.com/">Amazon</a> couldn’t do anything to help us out. </p>

<p>And since we already keep everything backed up, it wasn’t much of a big deal to spawn a new instance and get back online. With that, <a href="http://markitty.com/">Markitty</a> had its first downtime since launch.</p>

<p>But it taught me a lesson: we can’t really trust <a href="http://aws.amazon.com">AWS</a> infrastructure with our customers’ data.</p>

<p>We already use <a href="https://www.dropbox.com">Dropbox</a> at <a href="http://markitty.com/">Markitty</a>, so I decided to trust it with DB backups. We use <a href="http://www.postgresql.org">PostgreSQL</a> as our main database on our <a href="http://djangoproject.com">Django</a> stack.</p>

<p>So I hacked this Python script that takes a regular backup of our main database and uploads it to one of our <a href="https://www.dropbox.com">Dropbox</a> folders.</p>

<p>It includes 3 main files:</p>

<p><code>db_backup.sh</code> is the shell script that makes use of pg_dump to get the compressed backup of the database.</p>

<p><code>uploader.py</code> is the Python script that uploads the database to the Dropbox folder.</p>

<p><code>client_secrets.json</code> stores the credentials including <code>app_key</code>, <code>app_secret</code>, <code>access_key</code> and <code>access_secret</code>.</p>
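<p>For reference, <code>client_secrets.json</code> is just a flat JSON object with those four keys (the values here are obviously placeholders):</p>

```json
{
  "app_key": "YOUR_APP_KEY",
  "app_secret": "YOUR_APP_SECRET",
  "access_key": "YOUR_ACCESS_KEY",
  "access_secret": "YOUR_ACCESS_SECRET"
}
```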

<p>You need to provide the <code>DB_Username</code> and <code>DB_Name</code> in <code>db_backup.sh</code>.</p>
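<p>Under the hood, <code>db_backup.sh</code> boils down to a single <code>pg_dump</code> call. As a hedged sketch of the same step in Python (the flags and naming scheme here are illustrative, not the exact ones from the gist):</p>

```python
import datetime
import subprocess

def build_dump_command(db_user, db_name):
    """Build a pg_dump invocation that writes a compressed, custom-format dump."""
    out_path = "%s_%s.dump" % (db_name, datetime.date.today().isoformat())
    # -Fc: custom (compressed) format; -U: database user; -f: output file
    return ["pg_dump", "-Fc", "-U", db_user, "-f", out_path, db_name], out_path

# Hypothetical user/database names for illustration
cmd, path = build_dump_command("markitty", "markitty_db")
# subprocess.call(cmd)  # run only on a machine with the PostgreSQL client tools
print(" ".join(cmd))
```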

<script src="https://gist.github.com/bkvirendra/5559219.js"></script>

<p>Follow these steps to setup the Dropbox app:</p>

<ol>
<li><p>You will need to create a Dropbox app to get the <code>app_key</code> and <code>app_secret</code>. You can create it <a href="https://www.dropbox.com/developers/apps/create">here</a> (select the App Type as <code>Core</code> and select the Permission type as <code>Full Dropbox</code>).</p></li>
<li><p>Once the app is successfully created, Dropbox will provide you the <code>app_key</code> and <code>app_secret</code>. Provide these in <code>client_secrets.json</code> (please do not share your <code>app_key</code> and <code>app_secret</code> publicly).</p></li>
<li><p>Then execute <code>uploader.py</code>; it will generate an authentication link, which you will need to open in your web browser. Press the Allow button, then hit Enter in the shell.</p></li>
<li><p>It will then print the <code>access_key</code> and the <code>access_secret</code>, which you will need to provide in <code>client_secrets.json</code>. And you are done with the Dropbox setup. </p></li>
</ol>

<p>After that, you can set up a cron job that executes <code>db_backup.sh</code> every day and puts your database backup in your Dropbox folder.
The script’s all yours under a Creative Commons license :-)</p>
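<p>The cron entry itself is a one-liner; for example, to run the backup at 2:00 AM every day (the paths here are illustrative):</p>

```text
# m  h  dom mon dow  command
  0  2  *   *   *    /home/ubuntu/db_backup.sh && python /home/ubuntu/uploader.py
```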

<p>You can fork it on Github <a href="https://github.com/bkvirendra/Dropbox_db_backup">here</a>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Featuring in Indian Express</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/featuring-in-indian-express/"/>
    <id>http://virendra.me/featuring-in-indian-express</id>
    <updated>2013-05-10T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Recently I got published by the Indian Express.

The article is available online here.]]></summary>
    <content type="html" xml:base="http://virendra.me/featuring-in-indian-express/"><![CDATA[<p>Recently I got published by the <a href="http://www.indianexpress.com/">Indian Express</a>.</p>

<p>The article is available online <a href="http://www.indianexpress.com/news/geek-diaries/1106144/">here</a>.</p>

<div>
    <a href="/img/photos/original/tumblr_mml4w0vM6y1ryuegwo1_1280.png" title="Featured in the Indian Express">
        <img class="full_sized" src="/img/photos/original/tumblr_mml4w0vM6y1ryuegwo1_1280.png">
    </a>
</div>
]]></content>
  </entry>
  
  
  <entry>
    <title>Things I learned from my first Hackathon</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/things-i-learned-from-my-first-hackathon/"/>
    <id>http://virendra.me/things-i-learned-from-my-first-hackathon</id>
    <updated>2013-02-25T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[This was my first real Hackathon experience. Real because it was my first full-fledged hackathon — before this I did participate in a few programming events at college where we had to create an app in a couple of hours. It was organized by Google Developers Group. As usual, it was a 24-hour hackathon. So here’s what I learned.]]></summary>
    <content type="html" xml:base="http://virendra.me/things-i-learned-from-my-first-hackathon/"><![CDATA[<div>
    <img class="centered_photos" src="/img/tumblr_inline_mis66bsGJ21qz4rgp.jpg" alt="Planning at the Google Developers Hackathon">
</div>

<p>This was my first real Hackathon experience. Real because it was my first full-fledged hackathon — before this I did participate in a few programming events at college where we had to create an app in a couple of hours. It was organized by <a href="http://www.meetup.com/Pune-GDG/events/88636722/">Google Developers Group</a>. As usual, it was a 24-hour hackathon.</p>

<p>So here’s what I learned -</p>

<ol>
<li>Prepare in advance:</li>
</ol>

<p>While you are planning to attend a hackathon, you might want to shortlist a few ideas days before the event. Brainstorming with a friend is even better. It also helps if you can define the scope of the idea that you want to work on. Try to answer a few questions — How big will the project be? Can you build it as a solo developer? How familiar are you with the technologies that you will be using? Are the organizers putting any constraints on which technologies you can or cannot use? <em>(Come on, you seriously don’t need to be aware of anything apart from the language itself! In fact, you can just figure out everything else)</em></p>

<p>I began thinking of ideas as soon as the hackathon dates were put up. My <a href="http://narenonit.blogspot.in/" title="Narendra Rajput&#39;s Blog">brother</a> (also a hacker, and my partner at the hackathon) and I used to brainstorm, discussing whether an idea was realistic and whether we had enough skills to execute it. Would the thing we build be of any use to people? Were we solving a real problem? We short-listed a few ideas that answered the above questions positively.</p>

<ol start="2">
<li>Deciding on the idea:</li>
</ol>

<p>One question that you would like to ask yourself while deciding on the idea would be — is it realistic? This is a major decision, because you’ll have a very limited amount of time and limited resources. The idea actually has to be well planned before the execution really begins. Even though we had thought of a few ideas in advance, it was quite confusing to settle on one when the hackathon started.</p>

<p>At that point the most important thing that you have to consider is whether or not you can execute it well in the allotted time. Once you make the decision, you have to break the execution down into stages and allocate enough time to each stage. This really helps you analyze whether what you’re trying to achieve is doable, and helps you keep track of time later.</p>

<p>In the beginning, it seemed quite challenging for us. The idea we decided on had a much bigger scope, and with only 24 hours in our hands it seemed too tough, but we weren’t going to back down.</p>

<ol start="3">
<li>Have awesome people in your team:</li>
</ol>

<p>Well, the idea you’ll be working on often requires skills, and the better the skills you have, the more efficiently you can execute your idea. This depends on how well you can convince someone else (a developer, probably) to work on your idea. The more convincing you are, the better the talent you’ll attract.</p>

<p>We were pretty lucky on this one — we had one of the best front-end developers (I’ve met yet) <a href="http://jquer.in/" title="Jay Kanakiya&#39;s Blog">Jay Kanakiya</a>, along with <a href="http://narenonit.blogspot.in/" title="Narendra Rajput&#39;s Blog">Narendra Rajput</a>, a hardcore Ruby developer, and myself (a Pythonista). So we weren’t really short of skills, at least. I am a Python/Django developer, but it won’t be cool to praise myself :-)</p>

<ol start="4">
<li>Get stuff done:</li>
</ol>

<div>
    <img class="centered_photos" src="/img/tumblr_inline_mis6ff29Hj1qz4rgp.jpg" alt="Omkar #hacking">
</div>

<p>Hackathons are all about getting stuff done. At the end of the day, what really matters is how well you executed your idea. And many times it’s not easy: you get stuck at something (everyone does), and you might not succeed even after spending precious time figuring it out (it just happens).</p>

<p>Well, this is the prime time where you are really tested on how well you make decisions. You have to come up with a plan B (no one really has a plan B), an alternative. It’s all about how you actually <a href="https://twitter.com/search?q=hack" title="Twitter hashtag">#hack</a> the problem (come up with a crazy solution that never existed before).</p>

<p>We did face enough problems, but the way we hacked through them was the fun part.</p>

<div>
    <img class="centered_photos" src="/img/tumblr_inline_mis6gqhud31qz4rgp.jpg" alt="The Winners!">
</div>

<p><strong>And did I tell you we won the second prize?</strong></p>

<p>We are told we will be featured on <a href="https://developers.google.com/" title="Google Developers">developers.google.com</a>, so hold your breath!</p>

<p>The app that we built at the hackathon is available in beta <a href="https://github.com/bkvirendra/socialtvapp">here</a>.</p>

<p>We are working hard to launch it to the public soon, so share your feedback (good and bad). </p>

<p>Special thanks to <a href="https://twitter.com/nileshbhojani">Nilesh Bhojani</a> for helping me with the blog post.</p>

<p>You can view the hackathon album <a href="https://plus.google.com/photos/117099251731858299644/albums/5838915643456031841">here</a>.</p>

<p><strong>Do let me know if you are going for a hackathon and need a hand! Would be really happy to help :)</strong></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Directory Downloader in Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/directory-downloader-python/"/>
    <id>http://virendra.me/directory-downloader-python</id>
    <updated>2012-11-30T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[And then, I came across a few websites that allowed directory browsing. So I started saving the images manually from the index. And well those directories had thousands of images, and downloading them manually would suck (being a #hacker you always want everything to be automated). So I started hacking a script, that would carry out this task for me. And in just 15 minutes, I cracked it. And had fun, downloading entire webserver directories in minutes. You can use this Python script to download entire directories (if the webserver has indexes open).]]></summary>
    <content type="html" xml:base="http://virendra.me/directory-downloader-python/"><![CDATA[<p>Recently, while browsing for some <a href="http://www.google.co.in/search?num=10&amp;hl=en&amp;site=imghp&amp;tbm=isch&amp;source=hp&amp;biw=1280&amp;bih=679&amp;q=facebook+timeline+cover&amp;oq=facebook+time&amp;gs_l=img.3.0.0l10.2062.3857.0.4660.13.11.0.2.2.0.222.1327.5j4j2.11.0...0.0...1ac.1.09Nz1Gvk57k" title="Google image search">Facebook Timeline covers</a> that I wanted for my <a href="http://www.facebook.com/TheVirendraRajput" title="Virendra Rajput&#39;s Facebook Profile">Facebook Profile</a>, I came across hundreds of covers that I would love to have on my hard disk as my Timeline Cover collection (yeah, I have a Timeline Cover collection).</p>

<p>And then, I came across a few websites that allowed <a href="http://wiki.apache.org/httpd/DirectoryListings">directory browsing</a>. So I started saving the images manually from the <code>index</code>. And well those directories had thousands of images, and downloading them manually would suck (being a #hacker you always want everything to be automated).</p>

<p>So I started hacking a script that would carry out this task for me. In just 15 minutes, I cracked it, and had fun downloading entire <code>webserver</code> directories in minutes.</p>

<p>You can use this <code>Python</code> script to download entire directories (if the webserver has <code>indexes</code> open).</p>

<p>This script also makes use of <code>Beautifulsoup</code>, you can install it, by using the following command:</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">pip install beautifulsoup4   # if you have pip installed
easy_install BeautifulSoup4  # if you have easy_install
</code></pre></div>
<p>To use the script, you need to pass the <code>directory url</code> as a command-line argument, e.g.: </p>

<p>For downloading the directory at <a href="http://www.namecovers.com/_asset/_thumb/">http://www.namecovers.com/_asset/_thumb/</a>:</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">$ python downloader.py http://www.namecovers.com/_asset/_thumb/
</code></pre></div>
<p>The code:</p>

<div class="highlight"><pre><code class="python"><span class="kn">import</span> <span class="nn">urllib2</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">os</span>

<span class="kn">from</span> <span class="nn">bs4</span> <span class="kn">import</span> <span class="n">BeautifulSoup</span>
<span class="kn">from</span> <span class="nn">urlparse</span> <span class="kn">import</span> <span class="n">urlparse</span>

<span class="k">def</span> <span class="nf">downloader</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="n">grab_url</span><span class="p">,</span> <span class="n">foldername</span><span class="p">):</span>
    <span class="k">if</span> <span class="ow">not</span> <span class="n">os</span><span class="o">.</span><span class="n">path</span><span class="o">.</span><span class="n">exists</span><span class="p">(</span><span class="n">foldername</span><span class="p">):</span>
        <span class="k">print</span> <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s">&quot;</span><span class="o">+</span> <span class="n">foldername</span> <span class="o">+</span> <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s"> does not exist!&quot;</span>
        <span class="n">os</span><span class="o">.</span><span class="n">makedirs</span><span class="p">(</span><span class="n">foldername</span><span class="p">)</span>
        <span class="k">print</span> <span class="s">&quot;Creating </span><span class="se">\&quot;</span><span class="s">&quot;</span> <span class="o">+</span> <span class="n">foldername</span> <span class="o">+</span> <span class="s">&quot;</span><span class="se">\&quot;</span><span class="s">...&quot;</span> 
    <span class="k">for</span> <span class="n">cover</span> <span class="ow">in</span> <span class="n">urls</span><span class="p">:</span>
        <span class="k">try</span><span class="p">:</span>
            <span class="k">print</span> <span class="s">&quot;Downloading item &quot;</span> <span class="o">+</span> <span class="n">cover</span> <span class="o">+</span> <span class="s">&quot;...&quot;</span>
            <span class="k">print</span> <span class="n">grab_url</span> <span class="o">+</span> <span class="n">cover</span>
            <span class="n">img</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">grab_url</span> <span class="o">+</span> <span class="n">cover</span><span class="p">)</span>
            <span class="n">output</span> <span class="o">=</span> <span class="nb">open</span><span class="p">(</span><span class="n">foldername</span> <span class="o">+</span> <span class="s">&quot;/&quot;</span> <span class="o">+</span> <span class="n">cover</span><span class="p">,</span> <span class="s">&#39;wb&#39;</span><span class="p">)</span>
            <span class="n">output</span><span class="o">.</span><span class="n">write</span><span class="p">(</span><span class="n">img</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
            <span class="n">output</span><span class="o">.</span><span class="n">close</span><span class="p">()</span>
            <span class="k">print</span> <span class="n">cover</span> <span class="o">+</span> <span class="s">&quot;... downloaded!!&quot;</span>
        <span class="k">except</span> <span class="ne">Exception</span><span class="p">,</span> <span class="n">e</span><span class="p">:</span>
            <span class="k">pass</span>
    <span class="k">return</span>

<span class="k">def</span> <span class="nf">main</span><span class="p">(</span><span class="n">url</span><span class="p">):</span>
    <span class="n">urls</span> <span class="o">=</span> <span class="p">[]</span>
    <span class="k">print</span> <span class="s">&quot;Fetching the page...&quot;</span>
    <span class="n">page</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">url</span><span class="p">)</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">print</span> <span class="s">&quot;Fetching completed!&quot;</span>
    <span class="n">soup</span> <span class="o">=</span> <span class="n">BeautifulSoup</span><span class="p">(</span><span class="n">page</span><span class="p">)</span>
    <span class="k">print</span> <span class="s">&quot;Grabbing the objects of the page...&quot;</span>
    <span class="n">lis</span> <span class="o">=</span> <span class="n">soup</span><span class="o">.</span><span class="n">find_all</span><span class="p">(</span><span class="s">&quot;li&quot;</span><span class="p">)</span>
    <span class="k">for</span> <span class="n">item</span> <span class="ow">in</span> <span class="n">lis</span><span class="p">:</span>
        <span class="n">urls</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">item</span><span class="o">.</span><span class="n">a</span><span class="p">[</span><span class="s">&quot;href&quot;</span><span class="p">])</span>
    <span class="n">domain</span> <span class="o">=</span> <span class="n">urlparse</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="n">downloader</span><span class="p">(</span><span class="n">urls</span><span class="p">,</span> <span class="n">url</span><span class="p">,</span> <span class="n">domain</span><span class="o">.</span><span class="n">netloc</span><span class="p">)</span>
    <span class="k">print</span> <span class="s">&quot;All files have been successfully downloaded!&quot;</span>
    <span class="k">return</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="n">main</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
</code></pre></div>

<p>You can also fork it on Github <a href="https://github.com/bkvirendra/directory-downloader">here</a>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>How I keep my Facebook Page active using Cron Jobs!</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/how-i-keep-my-facebook-page-active-using-cron-jobs/"/>
    <id>http://virendra.me/how-i-keep-my-facebook-page-active-using-cron-jobs</id>
    <updated>2012-10-01T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[I have this habit of forgetting important tasks lately. Multitasking is never an easy job, and you tend to forget a lot. Being a regular member of the Brahma Kumaris community, I was looking forward to contributing to the community in any way possible. Hence, I was thinking of creating an online presence for the community. After a while, I created a Facebook Page for Brahma Kumaris, where I used to share daily Muralis along with videos of classes that I had to fetch from various sources online. The Murali was available in plain text and the videos were being posted at the Brahma Kumaris Youtube Channel. The Murali was available in 2 languages, viz. English and Hindi, at Brahma Kumaris Jewels.]]></summary>
    <content type="html" xml:base="http://virendra.me/how-i-keep-my-facebook-page-active-using-cron-jobs/"><![CDATA[<p>I have this habit of forgetting important tasks lately. Multitasking is never an easy job, and you tend to forget a lot. Being a regular member of the <a href="http://www.brahmakumaris.com/" title="Brahma Kumaris Official Website">Brahma Kumaris</a> community, I was looking forward to contributing to the community in any way possible.</p>

<p>Hence, I was thinking of creating an online presence for the community. After a while, I created a Facebook Page for <a href="http://www.facebook.com/TheBrahmaKumaris">Brahma Kumaris</a>, where I used to share daily Muralis along with videos of classes that I had to fetch from various sources online. </p>

<p>The Murali was available in plain text and the videos were being posted at the <a href="http://www.youtube.com/user/brahmakumariz">Brahma Kumaris Youtube Channel</a>. The Murali was available in 2 languages, viz. English and <a href="http://en.wikipedia.org/wiki/Hindi">Hindi</a>, at <a href="http://jewels.brahmakumaris.org/">Brahma Kumaris Jewels</a>.</p>

<p>I did the task of updating the <a href="http://www.facebook.com/TheBrahmaKumaris">Page</a> manually for a while, but it wasn’t possible for a lazy guy like me to do it for long, considering all the lazy habits I have. And it was important to update the Page early in the morning, so that it would reach members all around the globe.</p>

<p>I added a few fellow members of the community as admins of the Page so that they could help me update it regularly. But we still weren’t able to keep it updated.</p>

<p>So I decided to solve the problem the <strong>HACKER WaY</strong>, and finally came up with a script that I developed in approximately 2 hours. I had challenged my bro that I would get it done in an hour, but it took me an hour more.</p>

<p>After testing it thoroughly and adding support for the <a href="http://en.wikipedia.org/wiki/Hindi">Hindi</a> language, I created a cron job that runs exactly at 9:30 AM every day.</p>

<p>The script would fetch the <a href="http://jewels.brahmakumaris.org/">Jewels Brahma Kumaris page</a>, parse it to get the Murali text, and then filter it to remove any special characters.</p>

<p>Then it would make a POST request to the <a href="http://developers.facebook.com/docs/reference/api">Facebook Graph API</a> with all the required credentials and get the Murali posted on the Page. Since the Murali was available in 2 languages, I had 2 different scripts, one for each language.</p>
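<p>The Graph API call itself is just an HTTP <code>POST</code> to the page’s <code>/feed</code> edge. A minimal sketch of building that request in Python (the page ID and access token below are placeholders; my actual scripts do this in PHP):</p>

```python
from urllib.parse import urlencode  # 2012-era Python 2 would use urllib instead

PAGE_ID = "TheBrahmaKumaris"         # placeholder page identifier
ACCESS_TOKEN = "PAGE_ACCESS_TOKEN"   # placeholder page access token

def build_feed_post(message):
    """Return (url, form_body) for posting a message to the page's feed."""
    url = "https://graph.facebook.com/%s/feed" % PAGE_ID
    body = urlencode({"message": message, "access_token": ACCESS_TOKEN})
    return url, body

url, body = build_feed_post("Good Morning! Today is a new Murali day.")
print(url)  # → https://graph.facebook.com/TheBrahmaKumaris/feed
```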

<p>Then I had a script for fetching the Class video from the <a href="http://www.youtube.com/user/brahmakumariz">Brahma Kumaris Youtube Channel</a>. This script would get today’s Murali class video, and would post it on the <a href="http://www.facebook.com/TheBrahmaKumaris">page</a>.</p>

<p>The scripts have really been useful for keeping the <a href="http://www.facebook.com/TheBrahmaKumaris">Page</a> updated regularly, and have contributed to increasing the Page Reach and Likes as well. Currently, the Page has 5,990 likes, with around 350+ people talking about it. Now, I rarely have to check the <a href="http://www.facebook.com/TheBrahmaKumaris">page</a>, since the <code>Cron</code> manages it all.</p>

<p>Well, I mostly prefer <code>Python</code>, but I used PHP to create these scripts since <code>Python</code> has a lot of hosting issues. I was considering hosting it on <code>Heroku</code>, but <code>Heroku</code> doesn’t provide <code>Cron</code> for free. Hence PHP was the better option, again.</p>

<p>You can fork the script on Github <a href="https://github.com/bkvirendra/BkMuralis-Cron-script">here</a>.</p>

<p><strong>I would like to hear from you, about any such similar situations that you have come across, and how you managed to solve it!</strong></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>#Cracking the Multunus @Twitter Challenge</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/cracking-the-multunus-twitter-challenge/"/>
    <id>http://virendra.me/cracking-the-multunus-twitter-challenge</id>
    <updated>2012-09-19T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[I like to keep #hacking interesting problems that I come across. And just yesterday, I came across this new Twitter #Challenge, created by Multunus Softwares. Well, actually the contest had ended; the deadline was 12th August, almost a month ago. But the Puzzle was really interesting. It was actually a web app which accepts your Twitter handle (i.e. username) and generates a cloud of numbers, which is unique for every Twitter handle. The number cloud was really strange at first, but after playing with it (regenerating the number cloud again and again) for a while, I got to know the pattern of numbers being generated. So, the problem was to understand the number pattern being generated in the cloud, and to build an app similar to that.]]></summary>
    <content type="html" xml:base="http://virendra.me/cracking-the-multunus-twitter-challenge/"><![CDATA[<p>I like to keep #hacking interesting problems that I come across.
And just yesterday, I came across this new <a href="http://twitterpuzzle.herokuapp.com/">Twitter #Challenge</a>, created by <a href="http://multunus.com/">Multunus Softwares</a>. Well, actually the contest had ended; the deadline was 12th August, almost a month ago.</p>

<p>But the <a href="http://twitterpuzzle.herokuapp.com/">Puzzle</a> was really interesting. It was actually a web app which accepts your Twitter handle (i.e. username) and generates a <a href="http://crackedpuzzle.herokuapp.com/?handle=multunus">cloud of numbers</a>, which is unique for every Twitter handle.</p>

<p>The number cloud was really strange at first, but after playing with it (regenerating the number cloud again and again) for a while, I got to know the pattern of numbers that was being generated.</p>

<p>So, the problem was to understand the number pattern being generated in the cloud, and to build an app similar to that.</p>

<p>After I understood the logic behind the generated cloud, it was just some Python #code that I had to put together, and the app (i.e. the solution) was up and running.</p>

<p>I didn’t actually win anything, but it was really fun solving another interesting puzzle. </p>

<p><strong>The <a href="http://crackedpuzzle.herokuapp.com/">Solution</a> is hosted on Heroku, and the source code is available on <a href="https://github.com/bkvirendra/Multunus-Twitter-Puzzle-Solution">Github</a>.</strong></p>

<p><strong>The Problem Statement is available <a href="http://twitterpuzzle.herokuapp.com/">here</a>.</strong></p>

<p><strong>Would like to hear your experiences about solving some really interesting problems that you have come across!</strong></p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Google's &quot;Did You Mean&quot; Hack in Python</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/googles-did-you-mean-hack-in-python/"/>
    <id>http://virendra.me/googles-did-you-mean-hack-in-python</id>
    <updated>2012-09-03T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[So I had this problem with one of my apps, Nearme, where people weren't actually querying correctly (there were a lot of misspelled words in the queries). Since these queries were proper nouns, there is no specific dictionary or source that I could use to correct them. So I thought of using Google’s “Did You Mean”, since it corrects all types of words (including proper nouns that aren't included in any dictionary). So here’s a hack that I wrote to solve this problem of fixing the spelling mistakes that users made while querying my app (it's not the BEST solution to the problem, but well, it works). The code is in Python, and makes use of one of my favorite modules, BeautifulSoup.]]></summary>
    <content type="html" xml:base="http://virendra.me/googles-did-you-mean-hack-in-python/"><![CDATA[<p>I&#39;ve always been pretty fond of the <a href="https://www.google.com/">Google</a> Search Engine (<em>well, everyone is</em>). <a href="https://www.google.com/">Google</a> has some really handy features that help when you are searching for something you can&#39;t actually <strong>spell correctly</strong> (there are a lot of things that aren&#39;t easy to spell, unless you are an <em>English Professor</em> or maybe an <em>expert in Literature</em>).</p>

<p>So I had this problem with one of my apps, <a href="http://bkvirendra.github.com/Nearme/">Nearme</a>: people weren&#39;t actually querying correctly (there were a lot of misspelled words in the queries). Since these queries were proper nouns, there is no specific dictionary or source that I could use to correct them. So I thought of using <a href="http://stackoverflow.com/questions/307291/how-does-the-google-did-you-mean-algorithm-work">Google’s “Did You Mean”</a>, since it corrects all types of words (including proper nouns that aren&#39;t included in any dictionary).</p>

<p>So here’s a hack that I wrote to solve this problem of fixing the spelling mistakes users made while querying my app
(it&#39;s not the BEST solution to the problem, but <em>well, it works</em>).</p>

<p>The code is in Python, and makes use of one of my favorite modules, <code>BeautifulSoup</code>.</p>

<p>The <code>getPage</code> function retrieves pages <code>gzip</code>-compressed, which reduces bandwidth usage while fetching the page.</p>

<p><code>didYouMean</code> is the main function: call it with a <code>word</code> argument and it will return the corrected word (if it is misspelled), or else it will simply return <code>1</code>, which means the word needs no correction.</p>

<p>The <code>code</code> for the script:</p>

<div class="highlight"><pre><code class="python"><span class="kn">import</span> <span class="nn">urllib2</span>
<span class="kn">import</span> <span class="nn">gzip</span>
<span class="kn">import</span> <span class="nn">sys</span>
<span class="kn">import</span> <span class="nn">urllib</span>
<span class="kn">import</span> <span class="nn">re</span>

<span class="kn">from</span> <span class="nn">bs4</span> <span class="kn">import</span> <span class="n">BeautifulSoup</span>
<span class="kn">from</span> <span class="nn">StringIO</span> <span class="kn">import</span> <span class="n">StringIO</span>

<span class="k">def</span> <span class="nf">getPage</span><span class="p">(</span><span class="n">url</span><span class="p">):</span>
    <span class="n">request</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">Request</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s">&#39;Accept-encoding&#39;</span><span class="p">,</span> <span class="s">&#39;gzip&#39;</span><span class="p">)</span>
    <span class="n">request</span><span class="o">.</span><span class="n">add_header</span><span class="p">(</span><span class="s">&#39;User-Agent&#39;</span><span class="p">,</span><span class="s">&#39;Mozilla/5.0 (Macintosh; Intel Mac OS X 10_7_3) AppleWebKit/535.20 (KHTML, like Gecko) Chrome/19.0.1036.7 Safari/535.20&#39;</span><span class="p">)</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">urllib2</span><span class="o">.</span><span class="n">urlopen</span><span class="p">(</span><span class="n">request</span><span class="p">)</span>
    <span class="k">if</span> <span class="n">response</span><span class="o">.</span><span class="n">info</span><span class="p">()</span><span class="o">.</span><span class="n">get</span><span class="p">(</span><span class="s">&#39;Content-Encoding&#39;</span><span class="p">)</span> <span class="o">==</span> <span class="s">&#39;gzip&#39;</span><span class="p">:</span>
        <span class="n">buf</span> <span class="o">=</span> <span class="n">StringIO</span><span class="p">(</span> <span class="n">response</span><span class="o">.</span><span class="n">read</span><span class="p">())</span>
        <span class="n">f</span> <span class="o">=</span> <span class="n">gzip</span><span class="o">.</span><span class="n">GzipFile</span><span class="p">(</span><span class="n">fileobj</span><span class="o">=</span><span class="n">buf</span><span class="p">)</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">f</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">else</span><span class="p">:</span>
        <span class="n">data</span> <span class="o">=</span> <span class="n">response</span><span class="o">.</span><span class="n">read</span><span class="p">()</span>
    <span class="k">return</span> <span class="n">data</span>

<span class="k">def</span> <span class="nf">didYouMean</span><span class="p">(</span><span class="n">q</span><span class="p">):</span>
    <span class="n">q</span> <span class="o">=</span> <span class="nb">str</span><span class="p">(</span><span class="nb">str</span><span class="o">.</span><span class="n">lower</span><span class="p">(</span><span class="n">q</span><span class="p">))</span><span class="o">.</span><span class="n">strip</span><span class="p">()</span>
    <span class="n">url</span> <span class="o">=</span> <span class="s">&quot;http://www.google.com/search?q=&quot;</span> <span class="o">+</span> <span class="n">urllib</span><span class="o">.</span><span class="n">quote</span><span class="p">(</span><span class="n">q</span><span class="p">)</span>
    <span class="n">html</span> <span class="o">=</span> <span class="n">getPage</span><span class="p">(</span><span class="n">url</span><span class="p">)</span>
    <span class="n">soup</span> <span class="o">=</span> <span class="n">BeautifulSoup</span><span class="p">(</span><span class="n">html</span><span class="p">)</span>
    <span class="n">ans</span> <span class="o">=</span> <span class="n">soup</span><span class="o">.</span><span class="n">find</span><span class="p">(</span><span class="s">&#39;a&#39;</span><span class="p">,</span> <span class="n">attrs</span><span class="o">=</span><span class="p">{</span><span class="s">&#39;class&#39;</span> <span class="p">:</span> <span class="s">&#39;spell&#39;</span><span class="p">})</span>
    <span class="k">try</span><span class="p">:</span>
        <span class="n">result</span> <span class="o">=</span> <span class="nb">repr</span><span class="p">(</span><span class="n">ans</span><span class="o">.</span><span class="n">contents</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;u&#39;&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;/&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;&lt;b&gt;&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">result</span><span class="o">.</span><span class="n">replace</span><span class="p">(</span><span class="s">&quot;&lt;i&gt;&quot;</span><span class="p">,</span><span class="s">&quot;&quot;</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s">&#39;[^A-Za-z0-9\s]+&#39;</span><span class="p">,</span> <span class="s">&#39;&#39;</span><span class="p">,</span> <span class="n">result</span><span class="p">)</span>
        <span class="n">result</span> <span class="o">=</span> <span class="n">re</span><span class="o">.</span><span class="n">sub</span><span class="p">(</span><span class="s">&#39; +&#39;</span><span class="p">,</span><span class="s">&#39; &#39;</span><span class="p">,</span><span class="n">result</span><span class="p">)</span>
    <span class="k">except</span> <span class="ne">AttributeError</span><span class="p">:</span>
        <span class="n">result</span> <span class="o">=</span> <span class="mi">1</span>
    <span class="k">return</span> <span class="n">result</span>

<span class="k">if</span> <span class="n">__name__</span> <span class="o">==</span> <span class="s">&quot;__main__&quot;</span><span class="p">:</span>
    <span class="n">response</span> <span class="o">=</span> <span class="n">didYouMean</span><span class="p">(</span><span class="n">sys</span><span class="o">.</span><span class="n">argv</span><span class="p">[</span><span class="mi">1</span><span class="p">])</span>
    <span class="k">print</span> <span class="n">response</span>
</code></pre></div>

<p>You can even fork it on Github <a href="https://github.com/bkvirendra/didyoumean">here</a>.</p>
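<p>As a side note, here is how a caller might handle that <code>1</code> sentinel (a minimal sketch; <code>fake_did_you_mean</code> below is a hypothetical stand-in that follows the same return convention as the script above):</p>

```python
def normalize(query, did_you_mean):
    """Return a corrected query, falling back to the original.

    did_you_mean follows the script's convention: it returns the
    corrected string, or the integer 1 when no correction is needed.
    """
    result = did_you_mean(query)
    return query if result == 1 else result

# Hypothetical stand-in for the real didYouMean, for illustration only.
corrections = {"pythn": "python"}
fake_did_you_mean = lambda q: corrections.get(q, 1)

print(normalize("pythn", fake_did_you_mean))
print(normalize("python", fake_did_you_mean))
```

<p>This way the app can always use the return value as the query, whether or not a correction was found.</p>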
]]></content>
  </entry>
  
  
  <entry>
    <title>Find places around you, using Nearme on SMS</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/find-places-around-you-using-nearme-on-sms/"/>
    <id>http://virendra.me/find-places-around-you-using-nearme-on-sms</id>
    <updated>2012-08-19T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[Well, for me Find Near Me is a pretty handy app when I'm looking for places around me. It's really easy to use, and comes in handy when I'm in a new town and don't know much about the vicinity. And since I live in India, you just cannot expect the Internet connection to be working all the time. So it's obvious that you cannot depend on apps which require an Internet connection to function. So I came up with my solution: a Nearme app on SMS, which works regardless of whether I'm connected to the Internet. Txtweb is a platform which provides the Internet over SMS, so developers can create apps that can be used via SMS. It is really easy to get started with the Txtweb platform, since they have some excellent documentation.]]></summary>
    <content type="html" xml:base="http://virendra.me/find-places-around-you-using-nearme-on-sms/"><![CDATA[<p>Well, for me <a href="http://itunes.apple.com/us/app/find-near-me/id353369769?mt=8">Find Near Me</a> is a pretty handy app when I&#39;m looking for places around me. It&#39;s really easy to use, and comes in handy when I&#39;m in a new town and don&#39;t know much about the vicinity.</p>

<p>And since I live in <a href="http://articles.timesofindia.indiatimes.com/2012-05-02/internet/31537373_1_256kbps-speed-internet-speed-akamai">India</a>, you just cannot expect the Internet connection to be working all the time. So it&#39;s obvious that you cannot depend on apps which require an Internet connection to function. So I came up with my solution: a Nearme app on SMS, which works regardless of whether I&#39;m connected to the Internet.</p>

<p><a href="http://developer.txtweb.com/">Txtweb</a> is a platform which provides the Internet over SMS, so developers can create apps that can be used via SMS. It is really easy to get started with the <a href="http://developer.txtweb.com/">Txtweb platform</a>, since they have some excellent documentation.</p>

<p>I decided to use the <a href="https://developers.google.com/maps/documentation/">Google Places API</a> to locate places around you.</p>
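<p>For the curious, a nearby search request to the Places API boils down to building a URL like the one below (a rough sketch in Python 3; <code>YOUR_API_KEY</code> is a placeholder, and the coordinates are just an example):</p>

```python
from urllib.parse import urlencode

def nearby_search_url(lat, lng, keyword, radius=1000, key="YOUR_API_KEY"):
    # Build the request URL for a Places API "nearby search" call.
    base = "https://maps.googleapis.com/maps/api/place/nearbysearch/json"
    params = urlencode({
        "location": "%s,%s" % (lat, lng),
        "radius": radius,
        "keyword": keyword,
        "key": key,
    })
    return "%s?%s" % (base, params)

print(nearby_search_url(18.5204, 73.8567, "coffee"))
```

<p>The JSON response then gets reformatted into a plain-text list of places that fits in an SMS reply.</p>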

<p>You can find how to use the app, along with some how to’s at <a href="http://bkvirendra.github.com/Nearme/">App homepage</a>.</p>

<p><strong><a href="http://developer.txtweb.com/apps/nearme">Try it out</a> on the Txtweb emulator for free!</strong></p>

<p><em>The source code is available on <a href="https://github.com/bkvirendra/Nearme">Github</a></em>.</p>
]]></content>
  </entry>
  
  
  <entry>
    <title>Launching an API based App | Scrapit - Extract keywords from webpages</title>
    <author>
      <name>Virendra Rajput</name>
      <uri>http://virendra.me/</uri>
    </author>
    <link rel="alternate" type="text/html" href="http://virendra.me/launching-an-api-based-app-scrapit-extract-keywords/"/>
    <id>http://virendra.me/launching-an-api-based-app-scrapit-extract-keywords</id>
    <updated>2012-08-06T00:00:00+05:30</updated>
    <summary type="html"><![CDATA[I was recently trying to find a good alternative web scraper for one of my apps, which had to scrape links and extract keywords from webpages. It had to be fairly robust, handling broken HTML and heaps of text. I browsed around the web looking for some existing libraries that could help me out, since I don’t like to create things from scratch. Well, I couldn’t find exactly what I wanted, but I found some things that could help me build it. Python has some really good text processing modules along with HTML processing libraries.]]></summary>
    <content type="html" xml:base="http://virendra.me/launching-an-api-based-app-scrapit-extract-keywords/"><![CDATA[<p>I was recently trying to find a good alternative web scraper for one of my apps, which had to scrape links and extract keywords from webpages. It had to be fairly robust, handling broken HTML and heaps of text.</p>

<p>I browsed around the web looking for some existing libraries that could help me out, since I don’t like to create things from scratch. Well, I couldn’t find exactly what I wanted, but I found some things that could help me build it.</p>

<p>Python has some really good <a href="http://stackoverflow.com/questions/6030291/python-or-java-for-text-processing-text-mining-information-retrieval-natural">text processing modules</a> along with <a href="http://pypi.python.org/pypi?%3Aaction=search&amp;term=html+parser&amp;submit=search">HTML processing libraries</a>. So I ended up using:</p>

<p><a href="http://pypi.python.org/pypi/topia.termextract/"><code>Topia.termextract</code></a> for text processing</p>

<p><a href="http://lxml.de/"><code>lxml</code></a> for html parsing</p>

<p>After writing the code, I tested it pretty thoroughly, and once the module was complete I thought I would launch it as an API service. Deployment was not an issue, since <a href="http://heroku.com/">Heroku</a> is my go-to option. So I got the API deployed on <a href="http://neilmiddleton.com/why-heroku-is-a-game-changer/">Heroku</a>, with some modifications, running on the gunicorn server.</p>

<p><strong>What is <a href="http://scrapit.herokuapp.com/">Scrapit</a>?</strong></p>

<p><a href="http://scrapit.herokuapp.com/">Scrapit</a> is an API for scraping webpages for keywords. Using Scrapit you can extract important keywords from a webpage that are relevant to the page.</p>

<p><strong>Using Scrapit</strong>:</p>

<p>You need to make calls to</p>
<div class="highlight"><pre><code class="text language-text" data-lang="text">http://scrapit.herokuapp.com/q/?q={url} 
</code></pre></div>
<p>Parameters:</p>

<p><code>q</code> : (required) url to be fetched</p>

<p><code>occurs</code> : (optional) Will only return the words that are repeated more than once on the webpage. Set to <code>1</code> to enable it.</p>

<p><code>pretty</code> : (optional) Used for pretty-printing the response. Set to <code>1</code> to enable it.</p>

<p>Example Usage:</p>

<p><a href="http://scrapit.herokuapp.com/q/?q=http://imdb.com">http://scrapit.herokuapp.com/q/?q=http://imdb.com</a></p>

<p><a href="http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1">http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1</a></p>

<p><a href="http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1&amp;occurs=1">http://scrapit.herokuapp.com/q/?q=http://imdb.com&amp;pretty=1&amp;occurs=1</a></p>
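<p>The example calls above can also be scripted; here is a minimal sketch in Python 3 (note that <code>urlencode</code> will percent-encode the target URL, which the server decodes on its end):</p>

```python
from urllib.parse import urlencode

SCRAPIT = "http://scrapit.herokuapp.com/q/"

def scrapit_url(page, pretty=False, occurs=False):
    # q is required; pretty and occurs are the optional flags described above.
    params = {"q": page}
    if pretty:
        params["pretty"] = 1
    if occurs:
        params["occurs"] = 1
    return SCRAPIT + "?" + urlencode(params)

print(scrapit_url("http://imdb.com", pretty=True))
```

<p>Fetching the resulting URL with any HTTP client returns the extracted keywords.</p>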

<p>Well, I&#39;m going to keep trying to fix the bugs in the API.</p>

<p><em>So if you have any suggestions that would make <a href="http://scrapit.herokuapp.com/">Scrapit</a> any better, they are welcome here :)</em></p>
]]></content>
  </entry>
  

</feed>