ht4

Your first HipChat bot

Wed, 11 May 2011 00:00:00 GMT

A few weeks ago I stumbled upon a promising new startup while searching for a hosted group chat solution.

Besides being feature-rich and easy to use, HipChat allows you to write your own client by exposing an XMPP entry-point.

If you have never heard of XMPP, it is a common chat protocol used by services such as Google Talk and Facebook Chat.

wobot

wobot is my attempt at abstracting the XMPP protocol in Node.js and offering a simple API to write your own bot.

Installing wobot

To get started you will need a few things:

Node.js (http://nodejs.org/#download)
npm (http://npmjs.org)

Additionally, wobot depends on the node-xmpp module which requires the following:

libexpat1-dev: apt-get install libexpat1-dev
libicu-dev: apt-get install libicu-dev

Once you have installed the build dependencies, install wobot in your working directory using npm:

mkdir ~/mybot
cd ~/mybot
npm install wobot

Configurations

Once wobot is installed, you will need to add a new member to your HipChat group and log in as that member.

Under "My Account" > "XMPP/Jabber Info" you will find the following:

Instantiate the wobot.Bot class as follows:

var wobot = require('wobot');
var bot = new wobot.Bot({
  jid: '1234_12345@chat.hipchat.com/bot',
  password: 'yourpassword',
  name: 'XMPP Bot'
});
bot.connect();

After running the above script, you should see your bot in HipChat's Lobby.
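The jid value follows the standard XMPP address form local@domain/resource. As a small sketch (parseJid is a hypothetical helper of mine, not part of wobot), here is how you could split it apart to sanity-check what you copied from the account page:

```javascript
// Split an XMPP address of the form local@domain/resource.
// parseJid is an illustrative helper, not a wobot API.
function parseJid(jid) {
  const match = /^(?:([^@\/]+)@)?([^@\/]+)(?:\/(.*))?$/.exec(jid);
  if (!match) {
    throw new Error('Invalid JID: ' + jid);
  }
  return {
    local: match[1] || null,    // the part before "@"
    domain: match[2],           // the chat server
    resource: match[3] || null  // the part after "/"
  };
}

const parts = parseJid('1234_12345@chat.hipchat.com/bot');
console.log(parts.local, parts.domain, parts.resource);
```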

Auto-joining on connect

Once the bot is connected you will likely want it to join one or many channels. This can be done as follows:

bot.on('connect', function() {
  this.join('1234_bot_testing@conf.hipchat.com');
});
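If you want the bot in several rooms, the same connect handler can loop over a list. This is only a sketch: joinRooms is a helper name I made up, and the room JIDs in the usage comment are placeholders. It relies on nothing beyond the on and join calls shown above.

```javascript
// Join every room in the list once the bot reports that it is connected.
// joinRooms is an illustrative helper; "bot" is anything exposing the
// on('connect', ...) and join(room) calls used earlier in this post.
function joinRooms(bot, rooms) {
  bot.on('connect', function () {
    rooms.forEach(function (room) {
      bot.join(room);
    });
  });
}

// Usage (room names are placeholders):
// joinRooms(bot, [
//   '1234_bot_testing@conf.hipchat.com',
//   '1234_another_room@conf.hipchat.com'
// ]);
```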

Reacting to a message

Whenever a message is sent to a channel your bot is in, the message event will be emitted.

Here is a simple example of an echo bot:

bot.on('message', function(channel, from, msg) {
  if (from === this.name) return false;
  this.message(channel, '@' + from + ' you just said: ' + msg);
});
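Going one step further, you can keep the reply logic in a plain function so it is easy to test without a live connection. The sketch below assumes two made-up commands, !echo and !time; replyFor and attachCommands are my own names, and only on, message and name come from the examples above.

```javascript
// Decide what to answer for one chat line, or return null to stay silent.
// The "!echo" and "!time" commands are invented for this example.
function replyFor(botName, from, msg) {
  if (from === botName) return null;          // never answer ourselves
  const match = /^!(\w+)\s*(.*)$/.exec(msg);  // "!command rest of line"
  if (!match) return null;
  const command = match[1];
  const args = match[2];
  if (command === 'echo') return '@' + from + ' ' + args;
  if (command === 'time') return '@' + from + ' ' + new Date().toISOString();
  return null;                                // unknown command
}

// Wire the pure function into the message event shown above.
function attachCommands(bot) {
  bot.on('message', function (channel, from, msg) {
    const reply = replyFor(bot.name, from, msg);
    if (reply !== null) bot.message(channel, reply);
  });
}
```

Calling attachCommands(bot) before bot.connect() would be enough to register the handler.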

Conclusion

The goal of this article was to briefly introduce wobot.

There are many more examples and detailed documentation on GitHub.

-Christian Joudrey

Grooveshark: Behind the scenes

Wed, 30 Mar 2011 00:00:00 GMT

Since its revamp in 2010, most of the magic in Grooveshark happens on the client-side. I will not be talking about these technologies as they were already covered by Jerod Santo's post: The Tech Behind the New Grooveshark.

What I'd like to address is the data exchanges between the client and the server when you hit the play button and the search button.

I do not work for Grooveshark. Some of this information may be incorrect.

When you first load Grooveshark a session id is set by PHP:

Set-Cookie: PHPSESSID=1c3b5c7d906f60cab128b1a2b4c30201; expires=Wed, 06-Apr-2011 23:11:20 GMT; path=/; domain=.grooveshark.com

This session id is required throughout the API and is used to compute a communication token.

The SWF proxy

Most of the communication between the client-side and the server is handled by the Flash object JSQueue.swf.

For those of you who are interested, Grooveshark logs all traffic between the SWF proxy and the server to the browser's console:

Most calls are POST requests to https://listen.grooveshark.com/more.php? with a JSON formatted string as the data:

{
  "parameters": ,
  "method": ,
  "header": {
    "uuid": ,
    "clientRevision": <"20101222.5" | "20101222">,
    "country": ,
    "privacy": 0,
    "session": ,
    "client": <"jsqueue" | "htmlshark">,
    "token": 
  }
}

Depending on the method called, client and clientRevision will differ. For instance, searching for songs must be done with htmlshark, while downloading songs requires jsqueue.
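To make the envelope concrete, here is a rough sketch of assembling that JSON body in Node.js. buildRequestBody and the shape of its context argument are my own invention. Every header value has to come from your own observation of the traffic, and I am not claiming which clientRevision pairs with which client.

```javascript
// Assemble the JSON body described above. All header values are passed in
// by the caller; nothing here is an official Grooveshark client API.
function buildRequestBody(method, parameters, context) {
  return JSON.stringify({
    parameters: parameters,
    method: method,
    header: {
      uuid: context.uuid,
      clientRevision: context.clientRevision,
      country: context.country,
      privacy: 0,
      session: context.sessionId,
      client: context.client,   // "jsqueue" or "htmlshark"
      token: context.token
    }
  });
}
```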

The API responds with a JSON formatted string with the following structure:

{
  "header": {
    "session": ,
    "serviceVersion": "20100903",
    "prefetchEnabled": true
  },
  "result": 
}

The communication token

When the app first loads, an API call is made to getCommunicationToken with the following parameters:

{ "secretKey": "" }

The secret key is actually just an MD5 hash of your session id:

var n=hex_md5(GS.service.sessionID);req=new a("getCommunicationToken",{secretKey:n},t,w,{},true)}
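Outside the browser, the same hex MD5 digest can be produced with Node's built-in crypto module. This is a sketch of the equivalent computation, not Grooveshark's code; secretKeyFor is a name I picked.

```javascript
const crypto = require('crypto');

// Hex-encoded MD5 of the session id, equivalent to hex_md5(sessionID).
function secretKeyFor(sessionId) {
  return crypto.createHash('md5').update(sessionId).digest('hex');
}

// The session id from the Set-Cookie header shown earlier.
const secretKey = secretKeyFor('1c3b5c7d906f60cab128b1a2b4c30201');
console.log(secretKey.length); // an MD5 hex digest is always 32 characters
```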

The result of this call is a 13 hex character string which they call the communication token.

Hashing the communication token

The communication token is actually used by the Flash proxy to compute another token that the server validates your request with.

This final token is composed of two parts: the randomizer and the hash.

The randomizer consists of 6 random hex characters that are regenerated by the Flash proxy before each API request.
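For illustration, 6 random hex characters are simply 3 random bytes written in hexadecimal. This sketch uses Node's crypto module; I have no idea how the Flash proxy actually generates its value, and makeRandomizer is my own name.

```javascript
const crypto = require('crypto');

// 3 random bytes encode to exactly 6 hex characters.
function makeRandomizer() {
  return crypto.randomBytes(3).toString('hex');
}

console.log(makeRandomizer());
```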

The hash is computed as follows:

Searching for songs

When you search for a song using the main search box multiple requests are made to the API:

A search request for Songs
A search request for Playlists
A search request for Users
A search request for Artists
A search request for Albums

The data returned by the API is used to build the various elements of the results view:

What is interesting about these requests is that the pagination is done on the client-side. The API appears to always return a maximum of 200 results.
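Client-side pagination over an already downloaded result list boils down to slicing an array. Here is a minimal sketch; the page size of 30 in the usage note is arbitrary and paginate is a hypothetical helper, not something taken from Grooveshark's code.

```javascript
// Return one page of an in-memory result list. Pages are numbered from 1.
function paginate(results, page, pageSize) {
  const start = (page - 1) * pageSize;
  return {
    page: page,
    pageCount: Math.ceil(results.length / pageSize),
    items: results.slice(start, start + pageSize)
  };
}
```

With 200 results and 30 per page, paginate(results, 7, 30) returns the last, shorter page of 20 items.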

All the queries are sent to the same method getSearchResultsEx with the following parameters:

{
    "query": ,
    "type": <"Songs" | "Playlists" | "Users" | "Artists" | "Albums">,
    "guts": 0,
    "ppOverride": false
}

From my tests, it appears that this method requires the client value of the JSON header to be set to htmlshark.

Grooveshark has a lot of interesting data. Searching by "type": "Songs" returns many interesting fields such as:

AlbumID
ArtistID
SongName
AlbumName
ArtistName
CoverArtFilename
EstimateDuration (in seconds)
IsLowBitrateAvailable
Popularity
ArtistPopularity
SongPlays
ArtistPlays
SphinxWeight
Score
Rank (used to sort the results, I think)

Sample song entry:

{
  SongID: '7507736',
  AlbumID: '1123311',
  ArtistID: '490',
  GenreID: '16',
  Name: 'Cry for Help',
  SongName: 'Cry for Help',
  AlbumName: 'Greatest Hits',
  ArtistName: 'Rick Astley',
  Year: '',
  TrackNum: '16',
  CoverArtFilename: '1123311.jpg',
  TSAdded: '1209773471',
  AvgRating: 0,
  AvgDuration: 246,
  EstimateDuration: 244,
  Flags: 0,
  IsLowBitrateAvailable: '1',
  IsSponsored: '0',
  IsVerified: '1',
  SongVerified: '1',
  AlbumVerified: 1,
  ArtistVerified: 1,
  Popularity: 1100400047,
  AlbumPopularity: 0,
  ArtistPopularity: 1100400084,
  SongPlays: 87,
  ArtistPlays: 2882,
  SphinxWeight: 350700,
  Score: 41414.48648202,
  Rank: 0.99154757716654
}

Downloading cover art

Cover art is available in 3 sizes:

Small: 100x100
Medium: 170x170
Large: 240x240

They can be accessed through: http://beta.grooveshark.com/static/amazonart/.

For example, http://beta.grooveshark.com/static/amazonart/s1123311.jpg:

Obtaining a song

About a year ago I was looking into how Grooveshark worked. At the time, when you played a song, the API would return a link to an MP3 file hosted on an Akamai server. The link would have an expiry time of about 7 minutes.

Since then, they have rolled out a new way of serving the media. They seem to be running a bunch of lighttpd servers that stream the media.

The first step is to obtain the stream server's address and a stream key for a given song.

This is done by calling getStreamKeyFromSongIDEx with the following parameters:

{
  "prefetch": ,
  "mobile": ,
  "country": ,
  "songID": 
}

The API response contains:

uSecs: The length of the song in microseconds.
ip: Hostname of the stream server.
streamKey: Key used to obtain the song from the stream server.
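Since uSecs is in microseconds, turning it into a readable duration takes a division by one million. A small sketch (formatDuration is a hypothetical helper, and I am assuming the value may arrive as a string, like several fields in the song entry above):

```javascript
// Convert a microsecond count (number or numeric string) to "m:ss".
function formatDuration(uSecs) {
  const totalSeconds = Math.round(Number(uSecs) / 1e6);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return minutes + ':' + (seconds < 10 ? '0' : '') + seconds;
}

console.log(formatDuration(244000000)); // 244 seconds, printed as 4:04
```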

The next step is quite straightforward.

Perform a POST request to http:///stream.php with the following URL-encoded parameter and the MP3 will be returned:

streamKey=

Interesting facts about the media files

When mobile=false songs are sent as they were uploaded. This includes the original ID3 tags and the original bitrate.

When mobile=true songs are converted to mono and a bitrate of 64 kbps. In addition, all ID3 tags have been stripped.

Closing remarks

If you want to use Grooveshark's API, you might be better off (legally) using their HTTP API: ApiShark.com.

Although this explains how to download songs from Grooveshark, it's not an excuse to do so. Grooveshark's terms of service explicitly disallow any storage of data from their service, among other things. You should read them very carefully and take notice.

If someone from Grooveshark happens to stumble across this, let me first say that I am a great fan of your service. If you are unhappy with this blog post for some reason, feel free to contact me.

-Christian Joudrey

Dissecting Google thumbnails

Wed, 23 Mar 2011 00:00:00 GMT

Today I had an itch to look into how Google generates the preview thumbnails when you do a search.

Google results page

The first thing I discovered is that the thumbnails are only loaded the first time you click a magnifying glass. In the case where you access a previous query, the thumbnails are loaded from the cache once the results are rendered to the screen.

JSONP /webpagethumbnail request

After your first click on the magnifying glass, 10 JSONP calls (1 per search result) are made to http://clients1.google.com/webpagethumbnail.

An example request for my search query site:reddit.com programming:

c=11
r=2
f=2
s=300:585
hl=en
gl=ca

query=programming
d=http://www.reddit.com/programming
b=1
j=google.vs.r
a=IFs

A few values are hardcoded in the page's HTML (before the search results are even loaded), namely the thumbnail size s and locale values hl and gl:

"kfeUrlPrefix":"/webpagethumbnail?c=11&r=2&f=2&s=300:585&query=&hl=en&gl=ca"

The next values are what interest me though:

query only contains the keyword I searched and not my entire query site:reddit.com programming. I find this particularly interesting as this "slicing" logic seems to be done client-side.
d contains the full URL of the given result item.
j contains the JSONP callback function.
a contains a 3 character checksum, obtained with the results HTML (in this case IFs), which I concluded is there to prevent 3rd party requests.
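Putting the parameters together, the request URL can be rebuilt with the standard URL class. This is only a sketch of how the query string is assembled from the values listed above; buildThumbnailUrl is my own helper name, and the a checksum is only valid for the result page it was served with.

```javascript
// Rebuild the /webpagethumbnail URL from a plain object of parameters.
function buildThumbnailUrl(params) {
  const url = new URL('http://clients1.google.com/webpagethumbnail');
  Object.keys(params).forEach(function (key) {
    url.searchParams.append(key, String(params[key]));
  });
  return url.toString();
}

// Values copied from the example request above.
const thumbnailUrl = buildThumbnailUrl({
  c: 11, r: 2, f: 2, s: '300:585', hl: 'en', gl: 'ca',
  query: 'programming',
  d: 'http://www.reddit.com/programming',
  b: 1, j: 'google.vs.r', a: 'IFs'
});
console.log(thumbnailUrl);
```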

JSONP /webpagethumbnail response

The thumbnails are sent with an expiry time of 1 day from a server running snapshot_btfe (likely the codename of their thumbnail server). No surprise there.

The structure of the returned JSON goes as follows:

{
  "s": "b",
  "b": 1,
  "dim": [302, 585],
  "ssegs": [ "data:image/jpeg...", "data:image/jpeg..." ],
  "ssegs-heights": [405, 180],
  "tbts": [ ... ],
  "url": "http://www.reddit.com/r/programming"
}

dim contains the total width and height of the thumbnails
ssegs contains an array of strings each composed of a data uri with a segment of the thumbnail
ssegs-heights contains the height of each segment
tbts contains an array of text that will be overlaid on top of the thumbnails
url contains the url of the requested page

At this time I am unsure what s and b are used for.

Building the thumbnail

The thumbnail appears to be split into segments when the height is greater than 405 pixels. I'm guessing this is either for performance reasons or compatibility (IE8 supports max 32KB data URIs)?

Both segments are simply appended one after the other in the preview bubble.
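Stacking the segments amounts to keeping a running offset over ssegs-heights. Here is a sketch under the assumption that segments are laid out top to bottom in array order; layoutSegments is my own helper, not Google's code. With the response above, the heights 405 and 180 add up to 585, which matches the second value of dim.

```javascript
// Compute where each segment starts when the segments are stacked vertically.
function layoutSegments(thumbnail) {
  const heights = thumbnail['ssegs-heights'];
  let top = 0;
  const layout = thumbnail.ssegs.map(function (src, i) {
    const entry = { src: src, top: top, height: heights[i] };
    top += heights[i];
    return entry;
  });
  return { layout: layout, totalHeight: top };
}

const stacked = layoutSegments({
  dim: [302, 585],
  ssegs: ['data:image/jpeg...', 'data:image/jpeg...'],
  'ssegs-heights': [405, 180]
});
console.log(stacked.totalHeight); // 405 + 180 = 585
```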

Building the overlay text

As I previously explained the overlay text data is contained within the JSON in the tbts array.

Each text overlay has an entry in the array with the following values:

box contains the dimension and position (top, left) of the thumbnail highlight
txt contains the HTML text that is displayed in the overlay
txtBox contains the dimensions and position of the text box

For example:

{
  "box": {
    "h": 10,
    "l": 211,
    "t": 71,
    "w": 74
  },
  "txt": "A reddit for discussion and news about computer programming ...",
  "txtBox": {
    "h": 42,
    "l": 0,
    "t": 25,
    "w": 300
  }
}
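As a rough idea of how such an entry could map onto absolutely positioned elements, the sketch below turns a box into an inline style string. I am assuming h, l, t and w mean height, left, top and width in pixels; boxToStyle is a made-up helper.

```javascript
// Turn a {h, l, t, w} box into an inline CSS string.
// Assumes the keys stand for height, left, top and width in pixels.
function boxToStyle(box) {
  return 'position:absolute;' +
    'left:' + box.l + 'px;' +
    'top:' + box.t + 'px;' +
    'width:' + box.w + 'px;' +
    'height:' + box.h + 'px';
}

console.log(boxToStyle({ h: 10, l: 211, t: 71, w: 74 }));
```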

A div is then appended for each box and textbox into the preview bubble which gives the end result:

Unanswered questions

Unfortunately there are many unanswered questions. I would really be grateful if someone at Google made an official post about how the thumbnails work.

Specifically:

How are the thumbnail images generated? I'm guessing they are using a headless version of Chrome?
How are the positions of the boxes calculated?
What kind of infrastructure is behind the thumbnail service?
What is the ratio of pages that are currently thumbnailed?
Will there ever be an open API?

-Christian Joudrey

Fiddling with netcat - intercepting proxy

Wed, 02 Dec 2009 00:00:00 GMT

A couple of days ago Chris needed a way to see how a particular client was interacting with a server. Obviously there are numerous ways to do this, but I was curious how easy it would be to implement something similar with a quick netcat command. Sure enough after a little bit of fiddling I was able to produce exactly what he needed.

nc -l -p 12345 < pipe | tee outgoing.log | nc server 12345 | tee pipe incoming.log

Now this may seem a little cryptic so I'm going to dissect each portion to explain how it works. Keep in mind the pipe references an actual pipe. You can make a FIFO pipe by running mkfifo pipe or mknod pipe p - the former is the most usual way. If you're not familiar with named pipes I recommend reading up on them before continuing with this post as you may get a little confused.

nc -l -p 12345 < pipe

This portion simply has netcat listen on port 12345 and send anything from the pipe to the connected client. If you're not familiar with pipes, think of it as a simple file with the word hello in it: if someone were to successfully connect to the netcat instance, it would send hello to the client.

| tee outgoing.log

If you're not familiar with tee this may seem a bit obscure. Tee prints the things piped to it to stdout as well as writing it to a file. In this instance any traffic from the connected client will get printed to stdout and to the file outgoing.log. An example of how this would work is if I connected to the netcat instance and simply typed hello it would print it out to the screen and log it to the outgoing.log file.

| nc server 12345

This is the server that you would normally want to connect to. Remember the goal is to make a quick intercepting proxy to see how the client reacts to the server. This is the server.

| tee pipe incoming.log

Here is where the magic happens. This completes the relay so the client and server can communicate across the proxy. It takes the network traffic from the server and, using tee, prints it to stdout while also writing it to our pipe and to the incoming.log file.

Now all of this may make sense individually, though how they work together might be slightly confusing.

If you recall, the first command sends all data from our pipe to the client, and at the end we pipe all data from the server to the pipe. See now? We're simply taking all data the server sends and sending it to the client, completing the relay and allowing for normal operation.
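For comparison, the same relay can be sketched in a few lines of Node.js using the built-in net and fs modules. This is not what we used, just an illustration of the two directions of traffic and the two logs; createLoggingProxy is a name made up for this post.

```javascript
const net = require('net');
const fs = require('fs');

// Relay bytes between a local client and an upstream server, appending each
// direction to its own log file (the roles of outgoing.log and incoming.log).
function createLoggingProxy(upstreamHost, upstreamPort, outgoingPath, incomingPath) {
  return net.createServer(function (client) {
    const outgoing = fs.createWriteStream(outgoingPath, { flags: 'a' });
    const incoming = fs.createWriteStream(incomingPath, { flags: 'a' });
    const server = net.connect(upstreamPort, upstreamHost);

    // client -> server, with a copy written to the outgoing log
    client.pipe(server);
    client.pipe(outgoing);

    // server -> client, with a copy written to the incoming log
    server.pipe(client);
    server.pipe(incoming);

    // If either side fails, tear down the other so nothing is left hanging.
    client.on('error', function () { server.destroy(); });
    server.on('error', function () { client.destroy(); });
  });
}

// Usage, mirroring the netcat one-liner:
// createLoggingProxy('server', 12345, 'outgoing.log', 'incoming.log').listen(12345);
```

Unlike the netcat version, this one needs no named pipe, because each socket can be read and written from the same process.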

Now in this case Chris needed this for a quick look at how a normal IRC client interacts with the server since the RFC is lacking - so here is a real world example of where this was used (though there are likely infinitely better ways to do it):

nc -l -p 12345 < pipe | tee outgoing.log | nc irc.freenode.net 6667 | tee pipe incoming.log

You'll notice when you execute the above command you'll start seeing some traffic from the server instantly:

NOTICE AUTH :*** Looking up your hostname...
NOTICE AUTH :*** Checking ident
NOTICE AUTH :*** No identd (auth) response
NOTICE AUTH :*** Couldn't look up your hostname

Now we connect to the netcat server - in this case localhost on port 12345 - and if everything goes as planned it should connect like normal to Freenode. If you take a peek at the netcat server you'll see a bunch of activity!

The cool part is the logs - we can see exactly how this particular IRC client (IRSSI) and server (Freenode) interact.

Once again this isn't the best way to do this - tcpdump, wireshark and infinite other choices are available. That being said it's fun to fiddle and learn.

-Cody Robertson