<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:blogger="http://schemas.google.com/blogger/2008" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" gd:etag="W/&quot;DUIMQnw8fSp7ImA9WhBVGUU.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301</id><updated>2013-04-26T07:46:23.275-07:00</updated><category term="rate beer" /><category term="worldbank" /><category term="ICASSP" /><category term="linking open data" /><category term="dafx2009" /><category term="beer" /><category term="on the fly search" /><category term="spotify" /><category term="live blog" /><category term="SciPy" /><category term="trolls" /><category term="stuff" /><category term="torrents" /><category term="dbtune.org" /><category term="gNat" /><category term="mopy" /><category term="ISMIR 2009" /><category term="musical data" /><category term="PhD" /><category term="echonest" /><category term="soundcloud" /><category term="build relationships with algorithms" /><category term="ByK" /><category term="SuiteSparse" /><category term="dj" /><category term="rant" /><category term="compile tutorial" /><category term="CA-12" /><category term="pie" /><category term="UMFpack" /><category term="hype hype hype" /><category term="aesthetics" /><category term="sweatsedos" /><category term="Larry Lessig" /><category term="stockholm" /><category term="similarity" /><category term="recsys2010" /><category term="moustaki" /><category term="ISMIR" /><category term="Music Ontology" /><category term="NumPy" /><category term="continuous mixing" /><category term="IEEE-THEMES" /><category term="music informatics" /><category 
term="music information retrieval" /><category term="iTunes" /><category term="MIR" /><category term="muxtape" /><category term="change congress" /><category term="things" /><category term="motools" /><category term="Mphil" /><category term="remix" /><category term="site redesign" /><category term="ICMC 2008" /><category term="playlisting" /><category term="thesis" /><category term="trust" /><category term="womrad" /><category term="mashed" /><category term="Myspace" /><category term="piracy" /><category term="the echo nest" /><category term="BBC /programmes" /><category term="Draft Lessig" /><category term="ISMIR tutorials" /><category term="BASS" /><category term="ben=dj" /><category term="musicmetric" /><category term="flamebait" /><category term="marsyas" /><category term="python" /><category term="mashed08" /><category term="openhacklondon" /><category term="10.6" /><category term="Myspace Music" /><category term="AMD" /><category term="background" /><category term="playlists" /><category term="beer advocate" /><category term="ignite london" /><category term="flamewar" /><category term="muxtape_v2" /><category term="update" /><category term="Mac OSX" /><category term="NLP" /><category term="mypyspace" /><category term="ShamelessSelfPromotion" /><category term="autoDJ" /><category term="music recommendation" /><category term="ISMIR 2008" /><category term="afternoon" /><category term="large collection analysis" /><category term="web crawling" /><category term="plamere" /><category term="API" /><category term="musichackday" /><category term="meta" /><category term="SMC" /><category term="legalize it" /><category term="social networking tools" /><category term="twitter" /><category term="kurtisrandom" /><category term="Jackie Speier" /><category term="easy2stalk" /><category term="not_DAFx" /><title>Stuff.  Also, things.</title><subtitle type="html">Stuff.  Also, things.  -- A blog wherein I babble about all kinds of, well, stuff.  
Primarily the discussion is focused around things related to my research interests.  These topics are principally content based music information retrieval and ways to inform that process using various cultural sources via data mining of the internet in general and social networks in particular.</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://stuffalsothings.blogspot.com/" /><link rel="next" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default?start-index=26&amp;max-results=25&amp;redirect=false&amp;v=2" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>47</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/StuffAlsoThings" /><feedburner:info uri="stuffalsothings" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;C08NRnYzeip7ImA9WhJTGUQ.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-3418828891354503735</id><published>2012-06-29T10:07:00.001-07:00</published><updated>2012-06-29T10:44:57.882-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2012-06-29T10:44:57.882-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="torrents" /><category scheme="http://www.blogger.com/atom/ns#" term="MIR" /><category 
scheme="http://www.blogger.com/atom/ns#" term="the echo nest" /><category scheme="http://www.blogger.com/atom/ns#" term="legalize it" /><category scheme="http://www.blogger.com/atom/ns#" term="musicmetric" /><category scheme="http://www.blogger.com/atom/ns#" term="piracy" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><category scheme="http://www.blogger.com/atom/ns#" term="spotify" /><title>Licensed listening based on the habits of pirates or lessons from sloppy item resolution</title><content type="html">&lt;table cellpadding="0" cellspacing="0" class="tr-caption-container" style="float: left; margin-right: 1em; text-align: left;"&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://farm5.staticflickr.com/4080/4942642065_267d6a45d9.jpg" imageanchor="1" style="clear: left; margin-bottom: 1em; margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="320" src="http://farm5.staticflickr.com/4080/4942642065_267d6a45d9.jpg" width="251" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;from flickr user &lt;a href="http://www.flickr.com/photos/tschaut"&gt;nozoomii&lt;/a&gt;, &lt;a href="http://creativecommons.org/licenses/by-nc-sa/2.0/"&gt;CC&amp;nbsp;by-nc-sa&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
So I spent Thursday and Friday a couple weeks ago at the &lt;a href="http://bcn.musichackday.org/2012/"&gt;Barcelona Music Hackday&lt;/a&gt;, part of the &lt;a href="http://sonar.es/es/2012/"&gt;Sónar Music Festival&lt;/a&gt;.  There were loads of excellent hacks (&lt;a href="http://wiki.musichackday.org/index.php?title=Barcelona_Hacks_2012"&gt;full list&lt;/a&gt;), including my own, &lt;a href="http://wiki.musichackday.org/index.php?title=Legalize_It!"&gt;Legalize It!&lt;/a&gt;. In this post I'm going to go into a bit more depth about the hack, lessons learned and teasers for things I might do.&lt;br /&gt;
&lt;br /&gt;
&lt;h3&gt;Motivation&lt;/h3&gt;
&lt;div&gt;
The core idea is a simple one – straightforward listening to things that are popular on Bittorrent (note that &lt;i&gt;popular on Bittorrent&lt;/i&gt;&amp;nbsp;is a slightly fuzzy concept, since Bittorrent is a protocol for ad-hoc distribution, but we'll get back to that in a bit), without all the nastiness (and &lt;a href="http://www.bbc.co.uk/news/technology-18518777"&gt;DNS blocking&lt;/a&gt;!) of looking at, say, &lt;a href="https://piratereverse.info/top/101"&gt;The Pirate Bay's top music torrents&lt;/a&gt;&amp;nbsp;(that's a proxy of tpb btw). &amp;nbsp;And of course this removes any legal trouble that would be associated with gathering and listening to music via those torrent charts.&lt;br /&gt;
&lt;br /&gt;
&lt;h3&gt;How it all works&lt;/h3&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;span style="font-size: x-small;"&gt;tl;dr - It's a torrent chart metadata-based content resolver, written in python and JS,&lt;a href="https://github.com/gearmonkey/legalize"&gt; you can fork the code&lt;/a&gt;.&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Legalize It! has two parts, client and server. The &lt;a href="https://github.com/gearmonkey/legalize/blob/master/main.py"&gt;server&lt;/a&gt; is a fairly simple &lt;a href="http://docs.pylonsproject.org/projects/pyramid/en/1.3-branch/"&gt;pyramid&lt;/a&gt; webserver with two main tasks (it's deployed on &lt;a href="http://www.heroku.com/"&gt;heroku&lt;/a&gt;). The first involves fetching the torrent charts and resolving torrent release groups to legal streaming albums (&lt;a href="http://www.spotify.com/"&gt;Spotify&lt;/a&gt;, currently). &amp;nbsp;This is simply a matter of fetching the daily torrent releasegroup chart from the &lt;a href="http://developer.musicmetric.com/"&gt;Musicmetric API&lt;/a&gt; (full disclosure, they're my employer and I wrote most of the chart endpoints...) then walking through the top N items that look like albums and matching them to spotify albums. The matching is done through a fantastically naive string title + artist search on spotify's metadata API via the very useful &lt;a href="https://bitbucket.org/runeh/spotimeta/"&gt;Spotimeta&lt;/a&gt;&amp;nbsp;python wrapper. This album resolution process has a &lt;a href="http://legalize-it.herokuapp.com/top/25"&gt;simple web interface&lt;/a&gt;&amp;nbsp;(if it returns an error try a refresh, heroku workers on the free tier sleep a bit too much), that I mostly built for testing, but can be quite useful without the commitment of installing the Spotify App. In addition to the human readable page, you can get the &lt;a href="http://legalize-it.herokuapp.com/top/25.json"&gt;response back as JSON&lt;/a&gt;, which is handy on the client side.&lt;br /&gt;
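&lt;div&gt;
To give a feel for what a "fantastically naive" title + artist match amounts to, here is a minimal Python sketch. It does not call Spotify or Spotimeta and is not the code from the repository: the resolve_album helper, the dictionary layout and the example URIs are all invented for illustration, and the candidate list simply stands in for whatever a metadata search returns.&lt;/div&gt;

```python
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces and collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9 ]+", " ", text.lower())
    return " ".join(cleaned.split())


def resolve_album(artist, title, candidates):
    """Return the URI of the first candidate whose normalized artist and
    title both equal the chart entry, or None if nothing matches.

    `candidates` is a hypothetical list of dicts with "artist", "title"
    and "uri" keys, standing in for search results.
    """
    wanted = (normalize(artist), normalize(title))
    for candidate in candidates:
        found = (normalize(candidate["artist"]), normalize(candidate["title"]))
        if found == wanted:
            return candidate["uri"]
    return None


# Made-up search results, for illustration only.
results = [
    {"artist": "Some Band", "title": "Live Album", "uri": "spotify:album:EXAMPLE1"},
    {"artist": "Some Band", "title": "Debut Album!", "uri": "spotify:album:EXAMPLE2"},
]
print(resolve_album("some band", "Debut  Album", results))
```

&lt;div&gt;
Exact matching on normalized strings like this misses deluxe editions, accented names and typos in torrent titles, which is the kind of sloppiness the post title is alluding to.&lt;/div&gt;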
&lt;br /&gt;
The server's second job is to help select which songs from the top albums we'll be listening to. The final Spotify app will only select one song per album, so users can get a taste of every album in the top N without having to listen to N complete albums. &amp;nbsp;But this should be done with care and grace to enhance the listening experience. &amp;nbsp;Thankfully, with the&amp;nbsp;assistance&amp;nbsp;of &lt;a href="http://developer.echonest.com/"&gt;The Echo Nest&lt;/a&gt;'s searchable audio summary song-level features this is quite easy. &amp;nbsp;Using their API we can search for a song by title and artist name (see a pattern of terrible id matching forming? What can I say, it's a hack.) and get back a set of descriptors that looks like this:&lt;br /&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
&lt;script src="https://gist.github.com/3018478.js?file=karmaplice.json"&gt;
&lt;/script&gt;&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
Once we have the set of song-level descriptors for every song in the&amp;nbsp;neighbouring&amp;nbsp;albums, it's simply a matter of minimizing the step size. &amp;nbsp;I've done a bit of work on playlists before, and this seemed like a reasonable approach. &amp;nbsp;While there are a number of approaches to step-size minimization, for this particular application we're doing a greedy sort of optimization that goes something like:&lt;br /&gt;
&lt;br /&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="background-color: white;"&gt;&amp;nbsp;Select the song &lt;i&gt;a&lt;/i&gt; in album &lt;i&gt;A&lt;/i&gt; and the song &lt;i&gt;b&lt;/i&gt; in album &lt;i&gt;B, &lt;/i&gt;such that for a given audio descriptor (we'll use danceability by default, but it could just as easily be a different measure, e.g. tempo, loudness) the absolute value of the descriptor of &lt;i&gt;a&lt;/i&gt;&amp;nbsp;less the descriptor of &lt;i&gt;b&lt;/i&gt;&amp;nbsp;is minimized. &amp;nbsp;That is to say we're looking for the song-pair from these two albums that is &lt;i&gt;closest&lt;/i&gt;&amp;nbsp;in terms of whatever descriptor is being used.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;Select song &lt;i&gt;c&lt;/i&gt;&amp;nbsp;from album &lt;i&gt;C&lt;/i&gt;&amp;nbsp;such that the absolute&amp;nbsp;difference from its audio descriptor to song &lt;i&gt;b&lt;/i&gt;'s &amp;nbsp;is similarly minimized.&lt;/li&gt;
&lt;li&gt;Repeat (2) for remaining albums, using the last chosen song against the next album.&lt;/li&gt;
&lt;/ol&gt;
&lt;div&gt;
It's worth noting that this is not the globally optimal &lt;a href="http://en.wikipedia.org/wiki/Shortest_path"&gt;shortest path&lt;/a&gt;&amp;nbsp;from the first album to the last album, but it comes with a tremendous advantage -- the first two songs are selected without any need to deal with the rest of the playlist, which can be worked on over time. &amp;nbsp;This allows for pseudo real time playlist creation, since we just need to know the next song before the current one is done playing.&lt;/div&gt;
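&lt;div&gt;
The three steps above can be sketched in a few lines of Python. This is an illustration under assumptions, not the server code: each album is assumed to be a non-empty list of song dicts carrying the chosen descriptor, the function names are hypothetical, and the descriptor values below are invented.&lt;/div&gt;

```python
def closest_pair(album_a, album_b, feature="danceability"):
    """Step 1: the (a, b) pair with the smallest absolute difference."""
    return min(
        ((a, b) for a in album_a for b in album_b),
        key=lambda pair: abs(pair[0][feature] - pair[1][feature]),
    )


def closest_song(song, album, feature="danceability"):
    """Step 2: the song in `album` closest to the previously chosen song."""
    return min(album, key=lambda s: abs(s[feature] - song[feature]))


def greedy_playlist(albums, feature="danceability"):
    """Pick one song per album, in chart order, following steps 1 to 3."""
    if not albums:
        return []
    if len(albums) == 1:
        # Nothing to compare against, so just take the first song.
        return [albums[0][0]]
    first, second = closest_pair(albums[0], albums[1], feature)
    playlist = [first, second]
    for album in albums[2:]:
        # Step 3: repeat step 2 against the last chosen song.
        playlist.append(closest_song(playlist[-1], album, feature))
    return playlist


# Invented descriptor values, just to exercise the functions.
albums = [
    [{"title": "A1", "danceability": 0.20}, {"title": "A2", "danceability": 0.70}],
    [{"title": "B1", "danceability": 0.65}, {"title": "B2", "danceability": 0.95}],
    [{"title": "C1", "danceability": 0.10}, {"title": "C2", "danceability": 0.60}],
]
print([song["title"] for song in greedy_playlist(albums)])
```

&lt;div&gt;
Because each choice depends only on the previous song and the next album, a client can start playing after the first call and fill in the rest of the list while the music runs.&lt;/div&gt;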
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
To facilitate this algorithm the server has a hook that performs step (1) or (2) on either a pair of albums or a song and an album (specified as Spotify URIs). It can be accessed via a URL that looks like&amp;nbsp;&lt;span style="background-color: white;"&gt;&amp;nbsp;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
&lt;span style="background-color: white;"&gt;&lt;span style="font-family: 'Courier New', Courier, monospace; font-size: x-small;"&gt;http://legalize-it.herokuapp.com/paired/[spotify URI of track or album]/[spotify URI of album]&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;
For example,&amp;nbsp;&lt;a href="http://legalize-it.herokuapp.com/paired/spotify:album:6LBiuhK7PZKjVXyMfPxPoh/spotify:album:5nVUqrdkEMlWTm9sqjrYBt" style="background-color: white;"&gt;http://legalize-it.herokuapp.com/paired/spotify:album:6LBiuhK7PZKjVXyMfPxPoh/spotify:album:5nVUqrdkEMlWTm9sqjrYBt&lt;/a&gt;&amp;nbsp;returns a bit of JSON showing the closest (by danceability) song pair between&amp;nbsp;My Beautiful Dark Twisted Fantasy by Kanye West and&amp;nbsp;Up All Night by One Direction. &amp;nbsp;This method also supports using any other audio summary feature as the distance measure, by adding its name to the end of the URL. &amp;nbsp;For example&amp;nbsp;&lt;a href="http://legalize-it.herokuapp.com/paired/spotify:album:6LBiuhK7PZKjVXyMfPxPoh/spotify:album:5nVUqrdkEMlWTm9sqjrYBt/tempo" style="background-color: white;"&gt;http://legalize-it.herokuapp.com/paired/spotify:album:6LBiuhK7PZKjVXyMfPxPoh/spotify:album:5nVUqrdkEMlWTm9sqjrYBt/tempo&lt;/a&gt;&amp;nbsp;finds the closest song pair from the same two records, but by tempo rather than by danceability.&lt;/div&gt;
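&lt;div&gt;
If you want to build these URLs from a script, something like the following Python sketch would do. The paired_url helper is a made-up name; it only assembles the string in the shape shown above and does not fetch anything, so it makes no assumptions about the JSON that comes back.&lt;/div&gt;

```python
BASE_URL = "http://legalize-it.herokuapp.com/paired"


def paired_url(first_uri, album_uri, feature=None):
    """Build a /paired/ URL from two Spotify URIs and an optional feature.

    `first_uri` may be a track or an album URI; `album_uri` is an album URI.
    When `feature` is omitted no trailing segment is added, which leaves
    the choice of descriptor to the server default.
    """
    parts = [BASE_URL, first_uri, album_uri]
    if feature:
        parts.append(feature)
    return "/".join(parts)


print(paired_url(
    "spotify:album:6LBiuhK7PZKjVXyMfPxPoh",
    "spotify:album:5nVUqrdkEMlWTm9sqjrYBt",
    "tempo",
))
```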
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
These two features are laced together in the &lt;a href="https://github.com/gearmonkey/legalize/tree/master/spotify_app"&gt;client&lt;/a&gt;, which is a simple spotify app, that's basically just a small bit of jQuery that grabs the list of the top 25 albums then&amp;nbsp;asynchronously&amp;nbsp;gets the selected track for each album updating the playlist as each track is selected.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;table align="center" cellpadding="0" cellspacing="0" class="tr-caption-container" style="margin-left: auto; margin-right: auto; text-align: center;"&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td style="text-align: center;"&gt;&lt;a href="http://img.skitch.com/20120615-wsay5abr8cqkxkdd1iwykmb3m.medium.jpg" imageanchor="1" style="margin-left: auto; margin-right: auto;"&gt;&lt;img border="0" height="302" src="http://img.skitch.com/20120615-wsay5abr8cqkxkdd1iwykmb3m.medium.jpg" width="400" /&gt;&lt;/a&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class="tr-caption" style="text-align: center;"&gt;The Legalize It! app, as demo'd at the Music Hack Day&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;
&lt;div&gt;
If you'd like to install the App, here are some&amp;nbsp;&lt;a href="https://github.com/gearmonkey/legalize/blob/master/readme.md"&gt;instructions for installation in developer mode&lt;/a&gt;. &amp;nbsp;I'll be cleaning up the UI to conform to Spotify's app guidelines and submitting the app to their App finder thing, but that will take a while.&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;h3&gt;Going Forward&lt;/h3&gt;
&lt;div&gt;
While this is mostly complete for what it does there are a number of feature adds that I'll be slowly dealing with as time goes on. Highest among these is taking the same idea and applying it to different resolvers (maybe use &lt;a href="http://toma.hk/"&gt;tomahawk&lt;/a&gt;&amp;nbsp;rather than picking one...). There are also some smaller feature adds to the app, things like autoplay once the first track has loaded and switching between different echonest summary features or musicmetric charts (eg. P2P release groups by acceleration).&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;
Also, thanks very much to Spotify, who awarded me a prize for my hack!&lt;br /&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So what do you think? &amp;nbsp;Should we care about the taste of a bunch of peers on Bittorrent? Or am I doing it wrong?&lt;/div&gt;
&lt;br /&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/J8HCS62tyMk" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/3418828891354503735/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=3418828891354503735" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3418828891354503735?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3418828891354503735?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/J8HCS62tyMk/licensed-listening-based-on-habits-of.html" title="Licensed listening based on the habits of pirates or lessons from sloppy item resolution" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2012/06/licensed-listening-based-on-habits-of.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUYGQ3gyfSp7ImA9WhVUF0w.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-4263887627734630836</id><published>2012-05-22T11:58:00.001-07:00</published><updated>2012-05-22T11:58:42.695-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2012-05-22T11:58:42.695-07:00</app:edited><title>Some brief thoughts on the London homebrew scene</title><content type="html">Not that &lt;a href="http://en.wikipedia.org/wiki/Homebrew_(video_games)"&gt;homebrew&lt;/a&gt;, the &lt;a href="http://en.wikipedia.org/wiki/Homebrewing"&gt;beer one&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
A few weeks back I was asked by &lt;a href="http://www.strongroombar.com/"&gt;The Strongroom bar&lt;/a&gt; to write up a few words on homebrewing in London for the lit at their &lt;a href="http://www.strongroombar.com/events/2012-04-20/"&gt;London Beer Festival&lt;/a&gt;.  Unfortunately, it had to be cut to make room for some write-ups from breweries added at the last minute, but it occurs to me that it might be interesting to others, so I'm throwing it up here.  Without further ado:&lt;br /&gt;
&lt;br /&gt;
---&lt;br /&gt;
The Rise and Rise of Homebrewing&lt;br /&gt;
&lt;br /&gt;
&lt;b&gt;Homebrewing&lt;/b&gt; in London is enjoying something of a comeback.  Over the last few years, along with the rise (some might say return) of craft and artisanal beer in London, amateur and hobby brewing has been growing mightily.  Like the current trends in professional craft brewing, these new hobbyists are focusing on quality and experimentation over matters of the bottom line.  In London, this DIY brewing movement is focused in two clubs, each on different sides of the city and each with a different emphasis.&lt;br /&gt;
  &lt;br /&gt;
Meeting in East London is the &lt;a href="http://londonamateurbrewers.co.uk"&gt;London Amateur Brewers (LAB)&lt;/a&gt;, of which I am a member. Until its recent closure, LAB met monthly at The Wenlock Arms in Hackney (for current meeting locations, consult the website).  LAB has an open structure, operating without formal dues and with a very small set of officers.  The meetings consist of a short technical talk on some aspect of the brewing process or an overview of a particular style of beer, followed by tastings of members’ beers.  A typical meeting will involve tasting 8-12 beers, over an hour to an hour and a half.  These beers are typically very diverse in style, and at a single meeting you can encounter everything from a Best bitter to a new world IPA to a Belgian Saison.  In addition to its regular meetings, LAB holds homebrewing competitions and festivals, &lt;a href="http://londonandsoutheast.brewcompetition.com/"&gt;the last of which was on 12 November 2011 in Wimbledon&lt;/a&gt;.&lt;br /&gt;
&lt;br /&gt;
Across London in Durden Park is the eponymous &lt;a href="http://www.durdenparkbeer.org.uk/"&gt;Durden Park Beer Circle&lt;/a&gt;.  The Beer Circle is a more formal group than LAB, having both a formal membership process and thematic meetings, where the homebrew tastings all keep to a particular style, which changes from month to month.  Additionally this group has something of a focus on understanding and preserving the historical beers of Britain.  Over the years they have sought, archived, and tested many accurate recreations of styles of beer long out of fashion.  These recipes have been gathered into a book that the group puts out, “Old British Beers and How To Make Them.”  This book is an excellent resource in any homebrewer’s library.  In fact, you can taste (slightly modified versions of) some of the recipes in this book in action at some of London’s fine craft brewers, where it has served as inspiration for novel interpretations of classical local styles, most especially Porters and Stouts.  &lt;br /&gt;
&lt;br /&gt;
Think you might want to give homebrewing a try?  It’s easier than you might think.  Come to a meeting (if you aren’t in London, &lt;a href="http://www.craftbrewing.org.uk/"&gt;the Craft Brewers Association&lt;/a&gt; can point you in the right direction) or just simply give it a try in your kitchen.  Aside from the previously linked websites, information to get you started can be found at &lt;a href="http://howtobrew.com"&gt;How to Brew&lt;/a&gt;, &lt;a href="http://homebrew.stackexchange.com/"&gt;Homebrewing Stackexchange&lt;/a&gt;, &lt;a href="http://www.homebrewersassociation.org/"&gt;The Homebrewer's Association&lt;/a&gt;, and &lt;a href="http://www.jimsbeerkit.co.uk/forum/index.php"&gt;Jim's Beer Kit&lt;/a&gt;, among others. Good luck and happy brewing!&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/9Y61djZ2V4M" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/4263887627734630836/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=4263887627734630836" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4263887627734630836?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4263887627734630836?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/9Y61djZ2V4M/some-brief-thoughts-on-london-homebrew.html" title="Some brief thoughts on the London homebrew scene" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2012/05/some-brief-thoughts-on-london-homebrew.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0QMRns9cSp7ImA9WhdbGEo.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-2499243168127038662</id><published>2011-10-17T02:37:00.001-07:00</published><updated>2011-10-17T10:09:47.569-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-10-17T10:09:47.569-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="beer" /><category scheme="http://www.blogger.com/atom/ns#" term="musicmetric" /><category scheme="http://www.blogger.com/atom/ns#" term="easy2stalk" /><category scheme="http://www.blogger.com/atom/ns#" term="API" /><category 
scheme="http://www.blogger.com/atom/ns#" term="ShamelessSelfPromotion" /><category scheme="http://www.blogger.com/atom/ns#" term="ignite london" /><category scheme="http://www.blogger.com/atom/ns#" term="womrad" /><category scheme="http://www.blogger.com/atom/ns#" term="ISMIR" /><title>Upcoming talks and travels</title><content type="html">I'm going to be up to a number of things that may be of interest to the readers of this (rather sparse) blog.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;b&gt;&lt;span class="Apple-style-span"&gt;&lt;h3&gt;Quick summary&lt;/h3&gt;&lt;/span&gt;&lt;/b&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;Co-chairing &lt;a href="http://womrad.org/2011/index.html"&gt;The Workshop on Music Recommendation and Discovery (WOMRAD)&lt;/a&gt; Oct 23rd, Chicago&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;Presenting some new Musicmetric API features during &lt;a href="http://ismir2011.ismir.net/"&gt;ISMIR&lt;/a&gt;, Oct 24-28 (demo on the 28th), Miami&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span class="Apple-style-span"&gt;Giving a talk about beer and critical tasting at &lt;a href="http://ignitelondon.net/"&gt;Ignite London 5&lt;/a&gt;, Nov 8th, London&lt;/span&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;a href="http://womrad.org/2011/index.html"&gt;&lt;h3&gt;&lt;b&gt;WOMRAD&lt;/b&gt;&lt;/h3&gt;&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;Our workshop (other co-chairs are &lt;a href="http://twitter.com/utstikkar"&gt;Amélie Anglade&lt;/a&gt;, &lt;a href="http://twitter.com/ocelma"&gt;Òscar Celma&lt;/a&gt;, &lt;a href="http://twitter.com/plamere"&gt;Paul Lamere&lt;/a&gt;, and &lt;a href="http://twitter.com/functiontelechy"&gt;Brian McFee&lt;/a&gt;) will cover a diverse array of approaches and angles for music 
recommendation and discovery.  The workshop runs the full day and is part of &lt;a href="http://recsys.acm.org/2011/index.shtml"&gt;RecSys 2011&lt;/a&gt;, though I sadly can't stay for most of the conference aside from our workshop (see below).  It should prove to be an interesting day of research.  Are you planning on attending? &lt;a href="http://lanyrd.com/2011/womrad/"&gt; Let us know.&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;h3&gt;&lt;a href="http://ismir2011.ismir.net/"&gt;&lt;b&gt;ISMIR&lt;/b&gt;&lt;/a&gt;&lt;/h3&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;I'll be dashing off from Chicago to Miami to attend ISMIR, and to present some new (not-yet-released) &lt;a href="http://developer.musicmetric.com/"&gt;API features from Musicmetric&lt;/a&gt;.  While things aren't quite live yet, I can say that in addition to our artist-based endpoints, we'll be offering track-based endpoints soon as well, and aligning them with an I-bet-you-can-guess-which large public audio feature test set. More detail on this one to come.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;h3&gt;&lt;a href="http://ignitelondon.net/"&gt;&lt;b&gt;Ignite London 5&lt;/b&gt;&lt;/a&gt;&lt;/h3&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;Ignite is a series of lightning talks that have taken place in cities all over the world, unified not by a common theme, but a common format: all talks last five minutes, contain 20 slides, and the slides are automatically advanced every 15 seconds.  The matching ethos of this structure is perhaps best seen in the Ignite slogan, "Enlighten us, but make it quick."  
I'll be speaking about beer, style and critical tasting in a talk titled "Ale or Lager and Other False Choices."  Here's a brief description:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;blockquote&gt;In a word, my talk is about beer.  In a few more words, the driving narrative behind the talk is a crash course in beer styles and, more generally, critical tasting.  After an extraordinarily brief description of beer, broad ideas of style and the critical tasting process, the core of the talk will be made up of live lightning tastes of commercial examples of various styles of beer (one slide per style, 12 styles covered with one commercial example each).  For coherence these tasting slides will be grouped into broader styles, with an aim toward breadth rather than depth of coverage.  The styles will be approximately based on those from the &lt;a href="http://www.bjcp.org/2008styles/catdex.php"&gt;BJCP&lt;/a&gt; and the &lt;a href="http://www.worldbeercup.org/beer_styles_menu.html"&gt;Brewers Association&lt;/a&gt;.&lt;/blockquote&gt;I still haven't sorted out the exact spread of beers or how they will be grouped, though I'm leaning toward something simple and obviously ingredient-tied (something like -- Lagers, ales:yeast driven, ales:malt forward, ales:hop heavy, with 3 beers each from a different recognized style in each). If anyone has any thoughts about style divisions or specific examples do let me know.  If you'd like to go to Ignite (and you know you would) the tickets will be &lt;a href="http://ignitelondon5.eventbrite.com/"&gt;available over this way&lt;/a&gt; later this week.&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;So, lots of things going on. 
Plus &lt;a href="http://www.musicmetric.com/beta/"&gt;there's this other thing&lt;/a&gt; I've been working on.  &lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;Right, back to it.&lt;/span&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/cL36RDiXzWM" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/2499243168127038662/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=2499243168127038662" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/2499243168127038662?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/2499243168127038662?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/cL36RDiXzWM/upcoming-talks-and-travels.html" title="Upcoming talks and travels" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2011/10/upcoming-talks-and-travels.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DUcFQH45fip7ImA9WhdQFE4.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-6532523798362684796</id><published>2011-08-15T12:09:00.000-07:00</published><updated>2011-08-15T12:36:51.026-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-08-15T12:36:51.026-07:00</app:edited><title>A SXSW panel proposal - The Wisdom of Thieves: 
Meaning in P2P Behavior</title><content type="html">So I've submitted a proposal for SXSW interactive 2012 entitled "The Wisdom of Thieves: Meaning in P2P Behavior".  If you're the sort of person that might be interested in that sort of thing &lt;a href="http://panelpicker.sxsw.com/ideas/view/13522"&gt;you can comment and/or vote on it over here&lt;/a&gt;.  The talk will basically be a tour of all the fun and exciting ways you can use BitTorrent data to make better (mostly music, but also TV, film, and app-store type things) applications, with data sources &lt;a href="http://developer.musicmetric.com/movie.html"&gt;like this&lt;/a&gt;.  Here's the abstract and questions:&lt;div&gt;&lt;blockquote&gt;
&lt;br /&gt;The act of piracy is typically viewed as devaluing content - the track that wasn’t streamed, the video game that wasn’t purchased. However, peer-to-peer networks of piracy are rich descriptions of fans who are interested enough to find content. By observing these descriptions, artists can better understand their fan base; recommendation and discovery can be better tuned. In this talk we’ll explore the similarities between BitTorrent downloads and a number of other means of online interaction, such as likes, mentions, and scrobbles. We’ll show how interactions vary between popular artists and works versus those found in the long tail, whether they’re emerging artists or niche films. Our audience will leave with a utility belt of tools to leverage data about and around peer-to-peer sharing of music and video. This talk will use data available via the Semetric API and open source Python scripts, freely available for download prior to the talk.
&lt;br /&gt;&lt;/blockquote&gt;Questions Answered:
&lt;br /&gt;&lt;blockquote&gt;&lt;ol&gt;&lt;li&gt;How is peer-to-peer activity different from communities on Facebook, Twitter or Spotify?&lt;/li&gt;&lt;li&gt;Can you use location data and a torrent network to optimize a tour schedule?&lt;/li&gt;&lt;li&gt;Which countries should I syndicate my TV show in?&lt;/li&gt;&lt;li&gt;How can you use co-occurrence in piracy to recommend content?&lt;/li&gt;&lt;li&gt;Why should I consider the behavior of roving bands of thieves?&lt;/li&gt;&lt;/ol&gt;&lt;/blockquote&gt;Also, my colleagues have panel proposals for SXSW music and film as well, go check them out &lt;a href="http://panelpicker.sxsw.com/ideas/view/13571"&gt;here&lt;/a&gt; and &lt;a href="http://panelpicker.sxsw.com/ideas/view/13557"&gt;here&lt;/a&gt;.&lt;/div&gt;
&lt;br /&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/Shc0qY2yiZI" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/6532523798362684796/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=6532523798362684796" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/6532523798362684796?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/6532523798362684796?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/Shc0qY2yiZI/sxsw-panel-proposal-wisdom-of-thieves.html" title="A SXSW panel proposal - The Wisdom of Thieves: Meaning in P2P Behavior" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2011/08/sxsw-panel-proposal-wisdom-of-thieves.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DU4DQ3g9fCp7ImA9WhZWFUk.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-5320450678060965780</id><published>2011-05-16T04:50:00.000-07:00</published><updated>2011-05-16T05:12:52.664-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-05-16T05:12:52.664-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="musicmetric" /><category scheme="http://www.blogger.com/atom/ns#" term="trolls" /><category scheme="http://www.blogger.com/atom/ns#" term="python" /><category scheme="http://www.blogger.com/atom/ns#" term="flamewar" /><category 
scheme="http://www.blogger.com/atom/ns#" term="NLP" /><title>Doing ridiculous things with natural language processing</title><content type="html">Over the weekend before last, while I jealously followed from afar the &lt;a href="http://sf.musichackday.org/"&gt;SF musichackday&lt;/a&gt; that I was unable to attend as I'm awaiting the result of a visa application, I started mucking about with the &lt;a href="http://musicmetric.com/sf-api"&gt;beta musicmetric api&lt;/a&gt; (full disclosure, they are my employer), in particular the sentiment analyzer.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So the first thing I put together was a bit of Python to fetch the content of a tweet and use the mm_api to determine its tone.  This can be done quite simply (&lt;a href="https://github.com/musicmetric/musicmetric_api/blob/master/sentitweet.py"&gt;full source&lt;/a&gt;):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;script src="https://gist.github.com/964289.js?file=sentiment_tweets.py"&gt;&lt;/script&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;This gives a number between 1 and 5, with 1 indicating the text is 'very negative' and 5 indicating the text is 'very positive' (&lt;a href="http://www.musicmetric.com/2010/01/musicmetrics-sentiment-analysis-v1-0-beta/"&gt;gory details of the sentiment analyzer&lt;/a&gt;).  While the sentiment analyzer is trained on larger chunks of text (500 word album/movie reviews and that sort of thing) it in fact does fairly well with tweets (though sarcasm is its downfall).  
So I thought I'd do something a bit silly and build a 'flamewar detector and troll finder' for conversations on twitter.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I've called the initial command line tool firealarm.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;To gather the conversations, I'm just piggybacking on &lt;a href="http://twitter.com/#!/jwheare"&gt;@jwheare&lt;/a&gt;'s great tool &lt;a href="http://exquisitetweets.com/"&gt;exquisite tweets&lt;/a&gt;.  Once a conversation is archived over on exquisite tweets, the cli can be pointed to it via the conversation's url.  Each tweet in the conversation is pushed through the sentiment analyzer; the simple mean (µ) of all the sentiment scores is then dubiously used to determine if the conversation is a flamewar.  If the sentiment is generally negative (µ &amp;lt; 3) it's a flamewar, if it's generally positive (µ &amp;gt; 3) it's not a flamewar, and if it is exactly neutral (µ == 3) it's declared a tossup.  The troll finder is equally straightforward (and equally dubious!).  Across the sequence of tweets, the author of the tweet with the largest-magnitude negative change in sentiment relative to the tweet preceding it is considered the troll.  In the case of a tie the first occurrence wins.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Here's an example (note that the linked-to example is a nasty, nasty flame war. If you are easily offended, you might want to skip it.  
Also, obviously, views expressed are not mine, etc.):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;$  python firealarm.py &lt;a href="http://www.exquisitetweets.com/collection/RodBegbie/402"&gt;http://www.exquisitetweets.com/collection/RodBegbie/402&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This generates the following output:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;script src="https://gist.github.com/970914.js?file=firealarm.out"&gt;&lt;/script&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A plot of the sentiments, with the maximum negative delta in red, looks like this:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://3.bp.blogspot.com/-ul5sf-tJJ1M/Tc1pO73szmI/AAAAAAAAADs/Y8CAQge9_-8/s1600/senti-plot.png" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 197px;" src="http://3.bp.blogspot.com/-ul5sf-tJJ1M/Tc1pO73szmI/AAAAAAAAADs/Y8CAQge9_-8/s320/senti-plot.png" border="0" alt="" id="BLOGGER_PHOTO_ID_5606252816456535650" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A quick read of &lt;a href="http://www.exquisitetweets.com/collection/RodBegbie/402"&gt;the tweets&lt;/a&gt; shows that their actual sentiment is a bit more negative overall than the analyzer output suggests, but this is good enough for binary classification and a fairly reasonable troll ID mechanism, albeit fairly naive.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The code is &lt;a href="https://github.com/gearmonkey/flamesense"&gt;over at github&lt;/a&gt; if you want to have a look or run it yourself.  
If you want to run it, you'll need a musicmetric api key, which takes about 30 seconds to get (&lt;a href="https://secure.semetric.com/sf-api-signup"&gt;apply here&lt;/a&gt;).  Eventually I'm going to turn this into a web app, and when that happens I'll let everybody know.  Also, if you happen to find any really bad mislabels, &lt;a href="mailto:api@musicmetric.com"&gt;let us know&lt;/a&gt;, as it helps us tune up our process.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Have fun being algorithmically judgemental!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/zKSzIf9tOFM" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/5320450678060965780/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=5320450678060965780" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/5320450678060965780?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/5320450678060965780?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/zKSzIf9tOFM/doing-ridiculous-things-with-natural.html" title="Doing ridiculous things with natural language processing" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/-ul5sf-tJJ1M/Tc1pO73szmI/AAAAAAAAADs/Y8CAQge9_-8/s72-c/senti-plot.png" height="72" width="72" 
/><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2011/05/doing-ridiculous-things-with-natural.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DUUAQXw8eyp7ImA9WhZREE8.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-5864237007847639749</id><published>2011-04-05T16:42:00.000-07:00</published><updated>2011-04-05T11:14:00.273-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2011-04-05T11:14:00.273-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="beer advocate" /><category scheme="http://www.blogger.com/atom/ns#" term="beer" /><category scheme="http://www.blogger.com/atom/ns#" term="API" /><category scheme="http://www.blogger.com/atom/ns#" term="twitter" /><category scheme="http://www.blogger.com/atom/ns#" term="linking open data" /><category scheme="http://www.blogger.com/atom/ns#" term="rate beer" /><title>Free Beer: A Plea for Open Data [About Beer]</title><content type="html">&lt;div style="text-align: left;"&gt;(I wrote most of this right after my viva, but got a bit sidetracked...)&lt;/div&gt;&lt;a href="http://3.bp.blogspot.com/-s91nZxz6uoY/TWubs0u2IsI/AAAAAAAAADk/YUGM-r2zEvg/s1600/mistertim_24_02_2011_5.jpg" onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}"&gt;&lt;/a&gt;&lt;div style="text-align: left;"&gt;Hey look, my first blog post about beer (or at least beer metadata).&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So &lt;s&gt;yesterday&lt;/s&gt; a few weeks ago,  Tim Cowlishaw (&lt;a href="http://twitter.com/#!/mistertim"&gt;@mistertim&lt;/a&gt; ) stated &lt;a href="https://twitter.com/mistertim/status/40869809142894592"&gt;this&lt;/a&gt; on twitter:&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://3.bp.blogspot.com/-B2HjUGswua4/TWeEOxAm16I/AAAAAAAAAC0/qzXu4D38DXE/s320/mistertim_24_02_2011_1.jpg" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; 
cursor:hand;width: 400px; height: 84px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5577572052730566562" /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;To which I replied with &lt;a href="https://twitter.com/alsothings/status/40885092108865536"&gt;this&lt;/a&gt; :&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;img src="http://3.bp.blogspot.com/-EGd5n4lPDBw/TWeHvePQyAI/AAAAAAAAAC8/STv-t5dMJbI/s400/alsothings_24_02_2011_1.jpg" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 84px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5577575913162328066" /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span"&gt;(&lt;a href="http://en.wikipedia.org/wiki/Web_scraping"&gt;Scraping&lt;/a&gt; is a way to get the info a human reads, say on a website into a format a computer program can read. More on why I'd want to do that in a minute...)&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Which was followed by what I thought was a &lt;a href="https://twitter.com/mistertim/status/40870402053902337"&gt;reasonable request&lt;/a&gt;:&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;img src="http://2.bp.blogspot.com/-MN0iaYY4tpw/TWeIdaFD6UI/AAAAAAAAADE/sqjO5YJ9MwQ/s400/mistertim_24_02_2011_2.jpg" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 81px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5577576702319782210" /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;div style="text-align: left;"&gt;Now at this point I had figured that was the end of it.  
Both &lt;a href="http://ratebeer.com/"&gt;rate beer&lt;/a&gt; and &lt;a href="http://beeradvocate.com/"&gt;beer advocate&lt;/a&gt; ignored my requests for data a year or so ago, so I was expecting the same this time. However, beer advocate responded via &lt;a href="https://twitter.com/beeradvocate/status/40901067273142273"&gt;twitter&lt;/a&gt;:&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); -webkit-text-decorations-in-effect: underline; "&gt;&lt;img src="http://2.bp.blogspot.com/-CtTlFki5TVg/TWuQyXRjMfI/AAAAAAAAADM/7LZ3mRBFD20/s400/beeradvocate_24_02_2011_1.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5578711758343975410" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 311px; height: 77px; " /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;(Note that the link to the tweet no longer resolves, because beeradvocate decided to delete this tweet a couple hours later.  
The screen capture was taken from my twitter client just after the deletion...)&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;Now, I hadn't been prepared for such knee-jerk nastiness regarding seemingly reasonable data requests, and neither had Tim, as he quickly pushed out &lt;a href="https://twitter.com/mistertim/status/40908797878738944"&gt;this&lt;/a&gt; &lt;a href="https://twitter.com/mistertim/status/40909876838273025"&gt;series&lt;/a&gt; of &lt;a href="https://twitter.com/mistertim/status/40910607347761152"&gt;messages&lt;/a&gt;:&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;div style="text-align: left;"&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="font-size: 16px; "&gt;&lt;a href="http://3.bp.blogspot.com/-s91nZxz6uoY/TWubs0u2IsI/AAAAAAAAADk/YUGM-r2zEvg/s1600/mistertim_24_02_2011_5.jpg"&gt;&lt;/a&gt;&lt;a href="http://2.bp.blogspot.com/-j_QRBoz_RK0/TWubsofKr5I/AAAAAAAAADc/kf8DjVYbyDI/s1600/mistertim_24_02_2011_4.jpg"&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 0); -webkit-text-decorations-in-effect: none; "&gt;&lt;/span&gt;&lt;/a&gt;&lt;div&gt;&lt;a href="http://2.bp.blogspot.com/-j_QRBoz_RK0/TWubsofKr5I/AAAAAAAAADc/kf8DjVYbyDI/s1600/mistertim_24_02_2011_4.jpg"&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="font-size: 16px; "&gt;&lt;/span&gt;&lt;/span&gt;&lt;/a&gt;&lt;span class="Apple-style-span"&gt;&lt;a href="http://3.bp.blogspot.com/-q0xX0hEgPVA/TWubsnOzY0I/AAAAAAAAADU/KPByAyPJivo/s1600/mistertim_24_02_2011_3.jpg"&gt;&lt;img src="http://3.bp.blogspot.com/-q0xX0hEgPVA/TWubsnOzY0I/AAAAAAAAADU/KPByAyPJivo/s400/mistertim_24_02_2011_3.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5578723754176111426" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 53px; " 
/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); font-size: 16px; -webkit-text-decorations-in-effect: underline; "&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/span&gt;&lt;img src="http://2.bp.blogspot.com/-j_QRBoz_RK0/TWubsofKr5I/AAAAAAAAADc/kf8DjVYbyDI/s400/mistertim_24_02_2011_4.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5578723754513182610" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 71px; " /&gt;&lt;a href="http://3.bp.blogspot.com/-q0xX0hEgPVA/TWubsnOzY0I/AAAAAAAAADU/KPByAyPJivo/s1600/mistertim_24_02_2011_3.jpg"&gt;&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;span class="Apple-style-span" style="color: rgb(0, 0, 238); font-size: 16px; -webkit-text-decorations-in-effect: underline; "&gt;&lt;img src="http://3.bp.blogspot.com/-s91nZxz6uoY/TWubs0u2IsI/AAAAAAAAADk/YUGM-r2zEvg/s400/mistertim_24_02_2011_5.jpg" border="0" alt="" id="BLOGGER_PHOTO_ID_5578723757800170178" style="display: block; margin-top: 0px; margin-right: auto; margin-bottom: 10px; margin-left: auto; text-align: center; cursor: pointer; width: 400px; height: 82px; " /&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;While I and others pushed some similar responses, Tim's summarize things really well: boo, disdain, technical critique. 
(After this, both Tim and I appeared to be blocked from following beeradvocate...)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The crux of all this is that I (and it would appear others, but from here I speak only for myself) would love to have access to structured data (as a &lt;a href="http://en.wikipedia.org/wiki/Web_data_services"&gt;service&lt;/a&gt; or, better yet, as &lt;a href="http://tomheath.com/blog/2009/03/linked-data-web-of-data-semantic-web-wtf/"&gt;documents&lt;/a&gt;) about beer and the people who drink it.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;I'd love to build browser-based applications that do cross-domain recommendation of, say, beer and music.  But in order to do that I'd need data about people's taste in beer and music.  &lt;a href="http://www.programmableweb.com/apis/directory/1?apicat=Music"&gt;Lots of options to work with in the music domain&lt;/a&gt;. But beer?  Machine readable beer data is harder to find.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Both &lt;a href="http://ratebeer.com/"&gt;ratebeer&lt;/a&gt; and &lt;a href="http://beeradvocate.com/"&gt;beer advocate&lt;/a&gt; have a great deal of this data; it's just not (openly) machine readable.  In ratebeer's case this is entirely &lt;a href="http://en.wikipedia.org/wiki/Crowdsourcing"&gt;crowdsourced&lt;/a&gt; and for beer advocate this is true for their community pages.  There's a compelling case that crowdsourced data should be as open as possible, given that the data itself comes from the public at large.  But beyond the moral case, opening your data means that the wide world of evening and weekend software developers/architects/designers/whatevers (many have the same job during the day) will expand what is possible with a site's data in a way that will benefit said site (like my half-baked idea above).  
This, in essence, is the commercial argument for supporting open data and has been shown to be extremely effective in other domains (say, to pick one at random, &lt;a href="http://musichackday.org/"&gt;music&lt;/a&gt;).  And there is a simply massive spread of open data apis (again, both &lt;a href="http://www.programmableweb.com/apis"&gt;service&lt;/a&gt; and &lt;a href="http://richard.cyganiak.de/2007/10/lod/imagemap.html"&gt;document&lt;/a&gt;) but barely any covering data about my favourite topic that isn't music: beer.  So what do you say, ratebeer or beeradvocate?  How about some nice structured data?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;i&gt;note: I should mention that there are a couple of sites that are beer related and open: &lt;a href="http://untappd.com/"&gt;untappd&lt;/a&gt; and &lt;a href="http://beerspotr.org/"&gt;beerspotr&lt;/a&gt;.  Both are good sites, though neither is quite to the point of hitting critical mass in terms of data coverage and usefulness just yet.  
Either might at some point in the future, but ratebeer and beeradvocate already have; the data just isn't accessible.&lt;/i&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/G3-dQ8KUk-w" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/5864237007847639749/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=5864237007847639749" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/5864237007847639749?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/5864237007847639749?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/G3-dQ8KUk-w/free-beer-plea-for-open-data-about-beer.html" title="Free Beer: A Plea for Open Data [About Beer]" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/-B2HjUGswua4/TWeEOxAm16I/AAAAAAAAAC0/qzXu4D38DXE/s72-c/mistertim_24_02_2011_1.jpg" height="72" width="72" /><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2011/02/free-beer-plea-for-open-data-about-beer.html</feedburner:origLink></entry><entry 
gd:etag="W/&quot;DUIMQnwyfyp7ImA9WhBVGUU.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-3673728334318087794</id><published>2011-04-01T10:57:00.000-07:00</published><updated>2013-04-26T07:46:23.297-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2013-04-26T07:46:23.297-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="similarity" /><category scheme="http://www.blogger.com/atom/ns#" term="thesis" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><category scheme="http://www.blogger.com/atom/ns#" term="PhD" /><title>Viva passed, corrections approved, blog barely updated...</title><content type="html">The last couple months have proven me to be a terrible blogger, as I haven't posted at all.&lt;br /&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Anyway, that aside, I'm pleased to announce that I have passed my viva with minor corrections (back on March 2nd) and, as of about an hour ago, had my submitted corrections approved, which means I'm totally done!&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
Hoorah!&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
So before I run off for a bit of celebratory drinking, I thought I'd post the soft copy to the series of tubes (&lt;a href="http://benfields.net/bfields_thesis.pdf"&gt;here's the full pdf&lt;/a&gt;) and here is a brief chapter-by-chapter summary:&lt;/div&gt;
&lt;div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;b&gt;Chapter 1: Introduction.&lt;/b&gt; We present the set of problems this thesis will address, through a discussion of relevant contexts, including changing patterns in music consumption and listening. The core terms are defined. Constraints imposed on this work are laid out along with our aims.  Finally, we provide this outline to expose the structure of the document itself.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Chapter 2: Playlists and Program Direction.&lt;/b&gt; We survey the state of the art in playlist tools and playlist generation. A framework for types of playlists is presented. We then give a brief history of playlist creation. This is followed by a discussion of music similarity, the current state of the art and how playlist generation depends on music similarity. The remainder of the chapter covers a representative survey of all things playlist. This includes commercially available tools to make and manage playlists, research into playlist generation and analysis of playlists from a selection of available playlist generators. Having reviewed existing tools and generation methods, we aim to demonstrate that a better understanding of song-to-song relationships than currently exists is a necessary underpinning for a robust playlist generation system, and this motivates much of the work in this thesis.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Chapter 3: Multimodal Social Network Analysis.&lt;/b&gt; We present an extensive analysis of a sample of a social network of musicians. First we analyse the network sample using standard complex network techniques to verify that it has similar properties to other web-derived complex networks. We then compute content-based pairwise dissimilarity values using the musical data associated with the network sample, and the relationship between those content-based distances and distances from network theory is explored. Following this exploration, hybrid graphs and distance measures are constructed and used to examine the community structure of the artist network. We close the chapter by presenting the results of these investigations and consider the recommendation and discovery applications these hybrid measures improve.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Chapter 4: Steerable Optimizing Self-Organized Radio.&lt;/b&gt; Using request radio shows as a base interactive model, we present the Steerable Optimizing Self-Organized Radio system as a prototypical music recommender system alongside robust automatic playlist generation. This work builds directly on the hybrid models of similarity described in Chapter 3 through the creation of a web-based radio system that interacts with current listeners through the selection of periodic request songs from a pool of nominees. We describe the interactive model behind the request system. The system itself is then described in detail. We detail the evaluation process, though note that the inability to rigorously compare playlists creates some difficulty for a complete study.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Chapter 5: A Method to Describe and Compare Playlists.&lt;/b&gt; In this chapter we survey current means of evaluating playlists. We present a means of comparing playlists in a reduced dimensional space through the use of aggregated tag clouds and topic models. To evaluate the fitness of this measure, we perform prototypical retrieval tasks on playlists taken from radio station logs gathered from Radio Paradise and Yes.com, using tags from Last.fm with the result showing better than random performance when using the query playlist’s station as ground truth, while failing to do so when using time of day as ground truth. We then discuss possible applications for this measurement technique as well as ways it might be improved.&lt;/li&gt;
&lt;li&gt;&lt;b&gt;Chapter 6: Conclusions.&lt;/b&gt; We discuss the findings of this thesis in their totality. After summarizing the conclusions we discuss possible future work and directions implied by these findings.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;div&gt;
Enjoy!&lt;/div&gt;
&lt;div&gt;
&lt;br /&gt;&lt;/div&gt;
&lt;div&gt;
(Also, if you find any deep hiding typos, I'd love to know about them.  Not sending it to the printer/binder till Monday...)&lt;/div&gt;
&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/3taZXaqxbak" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/3673728334318087794/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=3673728334318087794" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3673728334318087794?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3673728334318087794?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/3taZXaqxbak/viva-passed-corrections-approved-blog.html" title="Viva passed, corrections approved, blog barely updated..." /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2011/04/viva-passed-corrections-approved-blog.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkcFQHk7eip7ImA9Wx9RGUo.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-1406908392129655455</id><published>2010-12-21T16:29:00.000-08:00</published><updated>2010-12-21T16:33:31.702-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-12-21T16:33:31.702-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="similarity" /><category scheme="http://www.blogger.com/atom/ns#" term="music informatics" /><category scheme="http://www.blogger.com/atom/ns#" term="thesis" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><title>woooo!</title><content 
type="html">I finally submitted my phd thesis.  I'll post some bits of it over the next few weeks, and the whole thing after my viva.  To start, here's the abstract:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;---&lt;br /&gt;&lt;div&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;div&gt;It is not hyperbole to note that a revolution has occurred in the way that we as a society distribute data and information. This revolution has come about through the confluence of Web-related technologies and the approaching-universal adoption of internet connectivity. Add to this mix the normalised use of lossy compression in digital music and the uptick in digital music download and streaming services; the result is an environment where nearly anyone can listen to nearly any piece of music nearly anywhere. This is in many respects the pinnacle in music access and availability. Yet, a listener is now faced with a dilemma of choice. Without being familiar with the ever-expanding millions of songs available, how does a listener know what to listen to? If a near-complete collection of recorded music is available, what does one listen to next? While the world of music distribution underwent a revolution, the ubiquitous access and availability it created brought new problems in recommendation and discovery.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In this thesis, a solution to these problems of recommendation and discovery is presented. We begin with an introduction to the core concepts around the playlist (i.e. sequential ordering of musical works). Next, we examine the history of the playlist as a recommendation technique, starting from before the invention of audio recording and moving through to modern automatic methods. This leads to an awareness that the creation of suitable playlists requires a high degree of knowledge of the relation between songs in a collection (e.g. song similarity). 
To better inform our base of knowledge of the relationships between songs we explore the use of social network analysis in combination with content-based music information retrieval. In an effort to show the promise of this more complex relational space, a fully automatic interactive radio system is proposed, using audio-content and social network data as a backbone. The implementation of the system is detailed. The creation of this system presents another problem in the area of evaluation. To that end, a novel distance metric between playlists is specified and tested. This distance method is then applied as a means of evaluation to our interactive radio system. We then conclude with a discussion of what has been shown and what future work remains.&lt;/div&gt;&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/cLHEP5kP2s4" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/1406908392129655455/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=1406908392129655455" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/1406908392129655455?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/1406908392129655455?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/cLHEP5kP2s4/woooo.html" title="woooo!" 
/><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/12/woooo.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkECRnk7eCp7ImA9Wx5WFUw.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-3202189432480074637</id><published>2010-09-26T08:42:00.000-07:00</published><updated>2010-09-26T09:31:07.700-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-09-26T09:31:07.700-07:00</app:edited><title>womrad live blog part the last</title><content type="html">last session: Long tail stuff:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/9.pdf"&gt;Music Recommendation in the Personal Long Tail: Using a Social-based Analysis of a User's Long-Tailed Listening Behavior&lt;/a&gt;. &lt;/div&gt;&lt;div&gt;Kibeom Lee (presenting), Woon Seung Yeo and Kyogu Lee&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;focusing on popularity bias - referencing oscar's thesis work (Help! I'm stuck in the head)&lt;/li&gt;&lt;li&gt;Goal: keep the awesome of collaborative filtering but sort out popularity bias&lt;/li&gt;&lt;li&gt;the mystery of unpopular but 'loved' songs on last.fm -- shouldn't loved songs be played frequently... 
perhaps an area of music the user likes but doesn't venture very far into&lt;/li&gt;&lt;li&gt;'My tail is your head' - find the users who have a 'head' that overlaps with your 'tail' to draw recs from&lt;/li&gt;&lt;li&gt;personal story about how this idea came about -- one person's popularity bias is another person's novel rec.&lt;/li&gt;&lt;li&gt;refs oscar and paul's ISMIR 07 rec tutorial - this system is geared toward the top half of the user type pyramid&lt;/li&gt;&lt;li&gt;scraped last.fm to get more tracks per user (API gives 50/user scrape gives 500)&lt;/li&gt;&lt;li&gt;lots of tracks (about 9million)&lt;/li&gt;&lt;li&gt;eval by asking users how things worked out comparing recs from proposed algor v. trad model rate; used a 1-5 rating scale&lt;/li&gt;&lt;li&gt;promo'd the website in various ways, but not too much response&lt;/li&gt;&lt;li&gt;but, the limited response did show some improvement over traditional approach&lt;/li&gt;&lt;li&gt;overall - some improvement, much potential&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Q how many users?: see above&lt;/div&gt;&lt;div&gt;Q so were your recs in the global head?: &lt;/div&gt;&lt;div&gt;sorta, mostly in the midsection&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/1.pdf"&gt;Music Recommendation and the Long Tail. &lt;/a&gt;&lt;/div&gt;&lt;div&gt;Mark Levy (presenting) and Klaas Bosteels&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;an overview of lit showing various rec bias especially the idea of positive feedback reinforcing the head (&lt;a href="http://en.wikipedia.org/wiki/Bias_tape"&gt;not this kind of bias though&lt;/a&gt;)&lt;/li&gt;&lt;li&gt;this work looks at 7 billion scrobbles all scrobbles from Jan - Mar this year (holy crap, that's some scale)&lt;/li&gt;&lt;li&gt;recs just from the last.fm radio&lt;/li&gt;&lt;li&gt;how do you define the long tail? 
use a fixed ref of overall artist ranks (number of listeners from last) + a fit model ~50-60k artists in the 'head'&lt;/li&gt;&lt;li&gt;looked at rec radio, non-rec radio, all music&lt;/li&gt;&lt;li&gt;the last.fm radio has less head bias than general listening, but only just&lt;/li&gt;&lt;li&gt;used an experimental cohort of listeners: new, active, but not insane spamming amounts of scrobbling. two subsets : radio users and not so much&lt;/li&gt;&lt;li&gt;this shows very little difference in the non-radio long tail listening among those who use last.fm radio v. those who don't&lt;/li&gt;&lt;li&gt;but: perhaps there's some demographic trouble&lt;/li&gt;&lt;li&gt;so split radio users into high users and low users&lt;/li&gt;&lt;li&gt;still no tail bias to speak of&lt;/li&gt;&lt;li&gt;perhaps from the fact that real systems only rec new tracks, mitigating reinforcement&lt;/li&gt;&lt;li&gt;so: built a simple item-based rec which limited candidates to the 'play direct-from-artist' scheme, not allowed to give artists with more than 10000 fans&lt;/li&gt;&lt;li&gt;deployed on playground.last.fm&lt;/li&gt;&lt;li&gt;eval based on a sample of the last.fm user traffic&lt;/li&gt;&lt;li&gt;effectively pushes curve out another order of magnitude&lt;/li&gt;&lt;li&gt;&lt;a href="http://playground.last.fm/demo/directrecs"&gt;try online&lt;/a&gt;&lt;/li&gt;&lt;li&gt;[me: this is great!]&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Q Do you see a problem, in terms of scholarship, with the fact that in practice you have access to all this data and the public does not?&lt;/div&gt;&lt;/div&gt;&lt;div&gt;well, hrm. how about being an intern&lt;/div&gt;&lt;div&gt;Q Does this make better recs?&lt;/div&gt;&lt;div&gt;Better, eh, interesting sure.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;And WOMRAD done. 
&lt;a href="http://womrad.org/2010/organizingcommitte.html"&gt;feedback is elicited&lt;/a&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/DRvyrhcQBuw" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/3202189432480074637/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=3202189432480074637" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3202189432480074637?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3202189432480074637?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/DRvyrhcQBuw/womrad-live-blog-part-last.html" title="womrad live blog part the last" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/09/womrad-live-blog-part-last.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUYGQHY6cCp7ImA9Wx5WFUw.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-7944522318530975180</id><published>2010-09-26T07:13:00.000-07:00</published><updated>2010-09-26T07:58:41.818-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-09-26T07:58:41.818-07:00</app:edited><title>afternoon papers</title><content type="html">&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;content-based stuff now:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/13.pdf"&gt;Content-based music 
recommendation based on user preference examples. &lt;/a&gt;&lt;/div&gt;&lt;div&gt;Dmitry Bogdanov, Martín Haro, Ferdinand Fuhrmann, Emilia Gómez and Perfecto Herrera &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Dmitry presenting&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;Sim is not rec.  need similarity&lt;/li&gt;&lt;li&gt;can we improve content based rec by merging pref data?&lt;/li&gt;&lt;li&gt;gmm + pref model&lt;/li&gt;&lt;li&gt;process:&lt;/li&gt;&lt;/ul&gt;&lt;ol&gt;&lt;li&gt;ask user for small set of tracks that specify the user's preference by example&lt;/li&gt;&lt;li&gt;get bag of frames on these&lt;/li&gt;&lt;li&gt;SVMs to get semantics (probabilistic)&lt;/li&gt;&lt;li&gt;in this semantic space, search for tracks&lt;/li&gt;&lt;/ol&gt;&lt;ul&gt;&lt;li&gt;can search in a variety of ways (use of Pearson's correlation is taken from prev work)&lt;/li&gt;&lt;li&gt;for eval compare our method to a bunch of existing methods, content-based, contextual, random&lt;/li&gt;&lt;li&gt;some users did a test get pref set (varies from 19 to 178 tracks for a user) this takes a long time&lt;/li&gt;&lt;li&gt;get lots of tracks from all the methods, shuffle, stick in front of user ask lots of Qs per track&lt;/li&gt;&lt;li&gt;created three categories based on the evals: Hits, trusts, fails&lt;/li&gt;&lt;/ul&gt;&lt;ol&gt;&lt;li&gt;Hits -user likes, is new&lt;/li&gt;&lt;li&gt;trusts - user likes, is not new&lt;/li&gt;&lt;li&gt;fail - no to all&lt;/li&gt;&lt;li&gt;unclear - the rest (18%)&lt;/li&gt;&lt;/ol&gt;&lt;ul&gt;&lt;li&gt;A good system should provide many hits and some trusts avoiding fails&lt;/li&gt;&lt;li&gt;in the results, last.fm (via api) is very good for hits and trusts&lt;/li&gt;&lt;li&gt;everyone else was bad at trusts&lt;/li&gt;&lt;li&gt;the new method was best for non-last.fm with hits, but last.fm is different drawing set of music so they're better&lt;/li&gt;&lt;li&gt;proposed semantics offer an improvement over pure timbral 
features&lt;/li&gt;&lt;li&gt;but still inferior to industrial approaches, though this proposed work improves considerably, a good way to cold start perhaps&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Q (oscar) I don't understand the last.fm? why didn't you use for sim?&lt;/div&gt;&lt;/div&gt;&lt;div&gt;we tried, couldn't get enough info&lt;/div&gt;&lt;div&gt;(oscar follow up) low trust on the content, do you think it's tied to a lack of transparency?&lt;/div&gt;&lt;div&gt;maybe, but our definition of trust just meant user likes and knows.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q() was the SEM-ALL about finding songs that are close to any or all?&lt;/div&gt;&lt;div&gt;any&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;UPDATE (~5pm):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/21.pdf"&gt;Applying Constrained Clustering for Active Exploration of Music Collections.&lt;/a&gt;&lt;/div&gt;&lt;div&gt;Pedro Mercado and Hanna Lukashevich&lt;/div&gt;&lt;div&gt;Hanna is presenting&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;clustering can help you swim in the sea of data&lt;/li&gt;&lt;li&gt;users can fix incorrect clusters, positive feedback&lt;/li&gt;&lt;li&gt;system diagram:&lt;img src="http://4.bp.blogspot.com/_PJFQpgIoi0g/TJ9b25Kzq-I/AAAAAAAAACc/hU8SD_Dtayc/s320/womrad_21.jpg" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 210px; height: 320px;" border="0" alt="" id="BLOGGER_PHOTO_ID_5521232666796731362" /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;similarity can be considered as a graph, then you can do random walks, calc eigenvalues etc.&lt;/li&gt;&lt;li&gt;but, what if this user doesn't care about some things? 
User pref based feature selection.&lt;/li&gt;&lt;li&gt;in the given space, you can then find distance (paper uses Pearson's but other dist could be used)&lt;/li&gt;&lt;li&gt;constrain the space (tricky math, see paper...)&lt;/li&gt;&lt;li&gt;eval: used the MIREX 04 content description data&lt;/li&gt;&lt;li&gt;constraints from genre labels&lt;/li&gt;&lt;li&gt;using test train as an example: what's in constraint space, what isn't&lt;/li&gt;&lt;li&gt;mutual information, something else I didn't catch&lt;/li&gt;&lt;li&gt;some graphs showing that there's more awesome with presented method&lt;/li&gt;&lt;li&gt;when looking at outliers, things are less clear but still seem positive&lt;/li&gt;&lt;li&gt;[graphs are page 6 of the pdf, have a look for details]&lt;/li&gt;&lt;li&gt;to wrap up: ML approaches can improve recs at least with our simulated user...&lt;/li&gt;&lt;li&gt;our clustering methods are speedy, though scale is tricky but since our matrix is sparse should be doable&lt;/li&gt;&lt;li&gt;Way better than random constraints&lt;/li&gt;&lt;li&gt;future work: stick constraints in feature selector, we did this, to appear in ICML, gives significant improvement, but causes some trouble, read paper for detail [excellent ICML tease...]&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;-- coffee and demos now...&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/Uqn6hvWxQU0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/7944522318530975180/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=7944522318530975180" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/7944522318530975180?v=2" /><link rel="self" type="application/atom+xml" 
href="http://www.blogger.com/feeds/6331230179506650301/posts/default/7944522318530975180?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/Uqn6hvWxQU0/afternoon-papers.html" title="afternoon papers" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://4.bp.blogspot.com/_PJFQpgIoi0g/TJ9b25Kzq-I/AAAAAAAAACc/hU8SD_Dtayc/s72-c/womrad_21.jpg" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/09/afternoon-papers.html</feedburner:origLink></entry><entry gd:etag="W/&quot;D0UBRHg7fCp7ImA9Wx5WFUw.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-8770580542890381511</id><published>2010-09-26T06:10:00.000-07:00</published><updated>2010-09-26T08:34:15.604-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-09-26T08:34:15.604-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="afternoon" /><category scheme="http://www.blogger.com/atom/ns#" term="recsys2010" /><category scheme="http://www.blogger.com/atom/ns#" term="live blog" /><category scheme="http://www.blogger.com/atom/ns#" term="womrad" /><title>WOMRAD Afternoon live blog</title><content type="html">Afternoon live blog for WOMRAD.  
Intro in first post.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Afternoon session.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;1500: Industrial panel&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;panelists:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;/li&gt;&lt;li&gt;Òscar Celma (BMAT, Spain), moderator&lt;/li&gt;&lt;li&gt;Tom Butcher (Microsoft Bing, US)&lt;/li&gt;&lt;li&gt;Mark Levy (last.fm, UK)&lt;/li&gt;&lt;li&gt;&lt;/li&gt;&lt;li&gt;Michael S. Papish (Rovi Corporation) subbing for Henri-Pierre Mousset (Yozik, France)&lt;/li&gt;&lt;li&gt;Gilad Shlang (Meemix, Israel)&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q (asked by OC): Do we need recommenders anymore? Are they relevant? (SFMusicTech quote about music rec only needed for people w/o friends)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;TB: still valid, but personally more interested in discovery now...&lt;/div&gt;&lt;div&gt;ML: don't sell things, but tremendous effort in this direction, important to users, builds trust. Plus we complement, not replace, social connections&lt;/div&gt;&lt;div&gt;MP: no need to draw lines, reinforces complementary service idea.  Many users may not need them, but perhaps that's not who these systems are for&lt;/div&gt;&lt;div&gt;GS: What's wrong with not having friends? Also, a tight group of friends may not have discovery, as a group. The removal of place opens more possibility to access long tail or different parts of the head.  Perhaps more personalization than rec, but this is a fine line&lt;/div&gt;&lt;div&gt;MP: the opposition of the individual v. group. If you listen to music without a community you lose the social experience.&lt;/div&gt;&lt;div&gt;GS: fair enough. some points about individual optimization of education.  Group important, but also personal growth.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q (asked by OC): Netflix prize. What is a good recommendation (in music)?  
How do we evaluate (in music)?&lt;/div&gt;&lt;div&gt;GS: a good rec will get people interested.  wow factor.  acknowledge you and surprise you at same time. music is short, which makes it easier to tune a rec profile. sharing implies liking, that's useful. tagging; more tags=more popular&lt;/div&gt;&lt;div&gt;(...small aside...)&lt;/div&gt;&lt;div&gt;ML: we run controlled experiments. quietly divide users to test different methods. Netflix fails in that it evaluates with data that's already been seen not new data.&lt;/div&gt;&lt;div&gt;MP: good rec means different things at different times. gives an example of a good rec that is not interesting: an artist you've previously bought releases a new album. not interesting but good. this would be bad for radio.&lt;/div&gt;&lt;div&gt;TB: in industry there are many ways to test. more purchase is different than more enjoyment.&lt;/div&gt;&lt;div&gt;ML: we at last would love for some theory to be developed for rec based on user logs.&lt;/div&gt;&lt;div&gt;OC: more data please&lt;/div&gt;&lt;div&gt;ML: you can always ask...&lt;/div&gt;&lt;div&gt;(this goes back and forth)&lt;/div&gt;&lt;div&gt;MP: yet there will never be good data. sparse data is hard, but makes you a better human (eat your greens)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q (OC) discuss user interfaces, user experience etc. :&lt;/div&gt;&lt;div&gt;TB: pandora is a winner, don't ignore the interaction&lt;/div&gt;&lt;div&gt;ML: thesixtyone is great. interface v. playful good long tail, would love for a last.fm interaction, but discussing issues&lt;/div&gt;&lt;div&gt;MP: name checks Paul Lamere, who he cites saying thesixtyone is an exotic rec, but MP thinks this is the way people normally use music, we should work to have systems that act like this. Need a toolbelt not one ubiquitous tool&lt;/div&gt;&lt;div&gt;GS: we tackle similar things. in B2B you need systems that complete clients' existing systems. 
If you over rec, you can scare people off, social dynamics&lt;/div&gt;&lt;div&gt;MP: think about the inverse rec - what should you not rec?  Also from a UX stand point, to build trust, change recs over time. General to personal as a user interacts with a system&lt;/div&gt;&lt;div&gt;GS: different recs can be very personal&lt;/div&gt;&lt;div&gt;ML: last.fm takes a sort of opposite &lt;/div&gt;&lt;div&gt;-crosstalk-&lt;/div&gt;&lt;div&gt;MP: this is possible from last.fm's transparency &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q (OC) What do you want solved?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;TB: we're hiring. Also, algorithms must scale or they're scalable&lt;/div&gt;&lt;div&gt;ML: how to merge data sources? how to use human-to-human recs&lt;/div&gt;&lt;div&gt;MP: see my keynote, exploit user psych, What are good Qs to ask users to build profiles&lt;/div&gt;&lt;div&gt;GS: more info for recs, params in audiofiles. map user params to extractable params. Moving techniques to non-western musics. What about china and india? We should serve them.&lt;/div&gt;&lt;div&gt;MP: is the sonic data really the key? I don't think so, too much effort in this direction sim is different than rec&lt;/div&gt;&lt;div&gt;GS: but sim is a good start&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q(OC) Do you use content-based features (y/n please)? ISMIR fun, glass ceiling, do you follow this work in academics, do you think it's solved?&lt;/div&gt;&lt;div&gt;GS: Yes. see last discussion core to our business, vital to start a relationship, move to social and such over time.&lt;/div&gt;&lt;div&gt;OC: what sorts of things do you use&lt;/div&gt;&lt;div&gt;GS: of things. 10 (does not list). aggressiveness very important.&lt;/div&gt;&lt;div&gt;ML: I come from the MIR community. we do content-based ID, have tried to intro content-based stuff and it's never been successful. But our hearts are in it. 
We have enough users that cold start doesn't matter. auto-tagging would be sweet though for the holes in our social data (musicological tags for instance). maybe youtube&lt;/div&gt;&lt;div&gt;TB: yes for the most part to ML's comments. content-based is too costly, tags and metadata are super effective&lt;/div&gt;&lt;div&gt;GS:  what about the new company, that doesn't have lots of data. if you're just getting into the game. these people need results. can't tell them go gather data for a year and we'll sort it out&lt;/div&gt;&lt;div&gt;MP: item to item is very different than personalized&lt;/div&gt;&lt;div&gt;ML: check that P2P paper from ismir&lt;/div&gt;&lt;div&gt;Hanna from Fraunhofer: we have clients (like film producers) no data, what then?&lt;/div&gt;&lt;div&gt;MP: exactly.&lt;/div&gt;&lt;div&gt;(Eugenio Tacchini): GS is the DNA all of it? really?&lt;/div&gt;&lt;div&gt;GS: no, not really. music DNA for rich space, but still need personalized info&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q (OC) if you were to hire a researcher (aside: researchers cannot program) what kind of skills do you want (not resume skills, fancy skills)?&lt;/div&gt;&lt;div&gt;TB: domain experience, audio music, computer vision, breadth better than depth. production coding skill in some language&lt;/div&gt;&lt;div&gt;ML: we're hiring as well. If you don't want to code, probably won't work. CS skills really important. Big database skills. hadoop win. strong C++ and research also python. data and viz as well.&lt;/div&gt;&lt;div&gt;MP: we are hiring as well. growing r &amp;amp; d group. we have offices all over. we like building things. though we have room for research. we like solving problems. again broad. can you pivot. don't need a PhD to be useful.&lt;/div&gt;&lt;div&gt;GS: we're also hiring (that's 4 for 4) we're a start up. data analysis and mining. 
core CS + creative skills, willing to sweat.&lt;/div&gt;&lt;div&gt;OC: perhaps also adaptability&lt;/div&gt;&lt;div&gt;GS: yes. you're there to invent. plus we're in Tel Aviv and that's sweet&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Q (claudio): What is the relationship your company has with musicians are they just a commodity?&lt;/div&gt;&lt;div&gt;OC: our missing speaker (Henri) does this&lt;/div&gt;&lt;div&gt;GS: I spoke with him, he thinks: for young musicians it's hard to reach your audience. &lt;/div&gt;&lt;div&gt;OC: BMAT does this with jamendo. when they type in 'Michael Jackson' what do you do?&lt;/div&gt;&lt;div&gt;MP: but don't sell recsys as a way to push new artists only. In a certain context, ie. neg search, but careful. Don't exploit users or artists&lt;/div&gt;&lt;div&gt;GS: but the state of the art pushes new bits&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;(from audience) What about piracy?&lt;/div&gt;&lt;div&gt;GS: it's not good.&lt;/div&gt;&lt;div&gt;TB: there are 2 view points: piracy increases consumption. 
other side: do we know that?&lt;/div&gt;&lt;div&gt;OC: now we're over time sorry.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;---&lt;/div&gt;&lt;div&gt;in light of the near transcript I just typed, I'm starting a new post for the afternoon talks.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Updated (5:33) : corrected questioner ID&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/XsIC-mvU8CU" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/8770580542890381511/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=8770580542890381511" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/8770580542890381511?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/8770580542890381511?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/XsIC-mvU8CU/womrad-afternoon-live-blog.html" title="WOMRAD Afternoon live blog" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/09/womrad-afternoon-live-blog.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DUEGR3oycSp7ImA9Wx5WFks.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-2908577219110785202</id><published>2010-09-26T00:07:00.000-07:00</published><updated>2010-09-28T02:53:46.499-07:00</updated><app:edited 
xmlns:app="http://www.w3.org/2007/app">2010-09-28T02:53:46.499-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="recsys2010" /><category scheme="http://www.blogger.com/atom/ns#" term="live blog" /><category scheme="http://www.blogger.com/atom/ns#" term="womrad" /><title>A womrad live blog</title><content type="html">&lt;div&gt;I'm in Barcelona today for the Workshop on Music Recommendation and Discovery (&lt;a href="http://womrad.org/"&gt;WOMRAD&lt;/a&gt;).  The theme is 'Is Music Recommendation Broken? How Can We Fix It?'&lt;/div&gt;&lt;div&gt;I'm giving a &lt;a href="http://womrad.org/2010/papers/12.pdf"&gt;talk&lt;/a&gt; at 11am ( in about 2 hours ) and I'll be doing some (mostly) live updates about the program...&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Update (10:05am):&lt;/div&gt;&lt;div&gt;Keynote: "&lt;a href="http://womrad.org/2010/keynote.html"&gt;The Dark Art: Is Music Recommendation Science a Science?&lt;/a&gt;"&lt;/div&gt;&lt;div&gt;UPDATE ( 28 Sept 2010, 11:52am): Michael has &lt;a href="http://www.slideshare.net/mpapish/the-dark-art-is-music-recommendation-science-a-science"&gt;posted the slides to his talk&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt; The view from outside, as his industrial has used and observed recs&lt;/li&gt;&lt;li&gt;Been there since the beginning (which appears to be about 2000)&lt;/li&gt;&lt;li&gt;Recommenders must combine humans and machines&lt;/li&gt;&lt;li&gt;understand both content and listeners, transparency, embrace emosocio aspects, optimize trust&lt;/li&gt;&lt;li&gt;What is science? Must be falsifiable (Popper) or Solvable, reproducible puzzles (gah, missed name)&lt;/li&gt;&lt;li&gt;Puzzle - understand the listeners preferences -- foundations (ISMIR 2001 resolution) - testable reusable&lt;/li&gt;&lt;li&gt;Lots of metrics though (too many?) 
(do we need a metric for metrics?)&lt;/li&gt;&lt;li&gt;MIREX (summary of AMS task) (haha it's automated, tell that to andy and mert) - very acoustically focused, not exactly recommendation similarity != recommendation&lt;/li&gt;&lt;li&gt;use of statistical measures across datasets e.g. Netflix prize -- but what about discovery? -- Netflix produces better numbers but does it produce better recommendations?&lt;/li&gt;&lt;li&gt;More holistic measures -- survey users about trust and satisfaction (Swearingen &amp;amp; Sinha) -- may miss UI issues -- practical 'business' metrics -- bottomline measurements -- does this remove the science?&lt;/li&gt;&lt;li&gt;appreciated history of MIR (from a rec POV) will stick pic here -- currently hitting 'Wall of Good Recs' since recs don't suck it's now harder to test&lt;/li&gt;&lt;li&gt;easy to test for bad recs -- hard to test for good recs&lt;/li&gt;&lt;li&gt;What if the emerging problems (like UI and trust) are no longer measurable&lt;/li&gt;&lt;li&gt;Is user preference too variable and unstable to be useful?&lt;/li&gt;&lt;li&gt;from science to art?&lt;/li&gt;&lt;li&gt;2 options:  &lt;/li&gt;&lt;li&gt;1: focus on unsolved MIR: better encoding of preference (more socio-cultural research)&lt;/li&gt;&lt;li&gt;What are the limits of the avg listener (hey it's our &lt;a href="http://musicmachinery.com/2010/06/18/the-playlist-survey/"&gt;playlist survey!&lt;/a&gt;)--playlist turing tests, understand artist v. album v. tracks -- can we build tools/games to expand this&lt;/li&gt;&lt;li&gt;listener profile -- can you quantify the sonic v. social preference -- add relevance layers to search and retrieval&lt;/li&gt;&lt;li&gt;2: adjourn to the Beach&lt;/li&gt;&lt;li&gt;Questions:&lt;/li&gt;&lt;li&gt;Mark Levy: Do you think you're too embarrassed about good engineering? What about controlled experiments by people like google/last?  
-- Move from science to engineering (this confuses me slightly ISMIR has always been Engineering not Pure Science) It is fruitful but is it science.&lt;/li&gt;&lt;li&gt;Claudio: Can you speak a bit about your experience combining human knowledge vs. algorithms --- yes. what do you do with human knowledge? it's tricky. look for the ideal rec experience - sit around with your friends and play records: how do you scale that in a system? It's not about classification - humans are good at putting things together - train people to be qualitative assessors&lt;/li&gt;&lt;li&gt;Oscar: Since you used to be in college radio, how do you think this experience could inform playlist? Do you use playlisters? Well only a 1.5yr experience, but made me think about the groups of listeners.  Name checks John Peel. What about presentation - In terms of what rovi does:  Minimally we can stop making bad playlists: gives example then breaks - v. hard to differentiate btwn good - v. good - excellent&lt;/li&gt;&lt;li&gt;Me: what about bypassing order by selecting good sets:&lt;/li&gt;&lt;li&gt;(Eugenio Tacchini): how much is the expert transparency necessary? yes give justification but need to avoid the feeling of stereotyping, weird vague directions, not just look at this user but look at this part of this user.&lt;/li&gt;&lt;li&gt;tom butcher: Is music rec really a unique snowflake? - Every domain is unique. -- One thing: a bad user rec in music costs 2 minutes, a bad film rec costs you 2hrs; music has a lower penalty cost for bad recs.  Also diff in features will sonic features get you to pref, prob not in music [I think this is a thing which may improve...] 
&lt;/li&gt;&lt;/ul&gt;&lt;/div&gt;&lt;div&gt;(update 2 10:31am)&lt;/div&gt;&lt;div&gt;session 1:&lt;/div&gt;&lt;div&gt;Time Dependency&lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/16.pdf"&gt;Rocking around the clock eight days a week: an exploration of temporal patterns of music listening&lt;/a&gt; --  [Perfecto Herrera], Zuriñe Resa and Mohamed Sordo&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;personal ex. showing diff between early day v. late night playlist&lt;/li&gt;&lt;li&gt;trying to link 2 concepts - Day- hour - (weather?) and Music track selection&lt;/li&gt;&lt;li&gt;few papers on this idea  -- take things from Human Dynamics -- trying to enable playing music 'at the right moment'  -- explore circular stats&lt;/li&gt;&lt;li&gt;Circular stats (eqs in paper at link) basically transform raw data by a periodicity (days, weeks)&lt;/li&gt;&lt;li&gt;Circular stats have analogous tools to trad stats - hyp tests for instance&lt;/li&gt;&lt;li&gt;Data for eval is full listening history of 992 unique last.fm users with artist/title + time of day (ToD) also got genre via track.getTopTags, keeping genre -- discarded users w/o enough data&lt;/li&gt;&lt;li&gt;scraped about half the data&lt;/li&gt;&lt;li&gt;attempt to make predictions - use two years of data to predict the ToD of play in next year&lt;/li&gt;&lt;li&gt;results: by day about 2.5x better than chance, by hour about 3-5x better than chance (move from half hour to hour tolerance doubles data)&lt;/li&gt;&lt;li&gt;note that the figures are overall, some users are v. predictable in this way, some are not.&lt;/li&gt;&lt;li&gt;Concl: temporal patterns can be predicted - not just what but when. plugs the &lt;a href="http://playground.last.fm/demo/clock"&gt;last.fm clocks&lt;/a&gt;&lt;/li&gt;&lt;li&gt;Q (dunno who asked): what about user to user offsets (eg. if a user gets up at 6 v 8am 8 am means something different)? Currently can't do this, need sensor data.  
Would be sweet if we could, though note that the predictions are per-user, so this is to some extent already dealt with&lt;/li&gt;&lt;li&gt;Q (again, people say who you are): Method issue - when comp day v. hour there's a percentile diff in the err tolerance? Sure this could work look at baseline compare...&lt;/li&gt;&lt;li&gt;Q (Eugenio Tacchini): I tried this awhile ago, aggregated data, didn't find much spread do you think aggregation is the issue?  yeah, must be specific to the user, right time + right user not just right user&lt;/li&gt;&lt;li&gt;Q (Klaas):do you think it would work with less data (can't wait 2 years)? Probably.  This was a very conservative methodology, could probably get by with maybe three months. For this work we wanted lots of data to make things clear&lt;/li&gt;&lt;li&gt;Q (seriously ID guys): did you use a popularity filter? No. tested if pref for a genre is different than the average for that genre&lt;/li&gt;&lt;/ul&gt;Break time then my talk. no notes for my talk as I'm talking...&lt;br /&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Update (12:16): I was without my machine for the social tag session, not just my talk.  I'll get my hand notes in another post but for now here are the papers:&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://womrad.org/2010/papers/12.pdf"&gt;Using Song Social Tags and Topic Models to Describe and Compare Playlists. Benjamin Fields, Christophe Rhodes and Mark d'Inverno&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://womrad.org/2010/papers/2.pdf"&gt;Piloted Search and Recommendation with Social Tag Cloud-Based Navigation. Cédric Mesnage and Mark Carman&lt;/a&gt;&lt;/li&gt;&lt;li&gt;&lt;a href="http://womrad.org/2010/papers/18.pdf"&gt;A Method for Obtaining Semantic Facets of Music Tags. 
Mohamed Sordo, Fabien Gouyon and Luís Sarmento&lt;/a&gt;&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;Update (  ):&lt;/div&gt;&lt;/div&gt;&lt;div&gt;next paper is being skipped since the author was unable to attend due to illness:&lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/17.pdf"&gt;Survey of Music Recommendation Aids. Pirkka Åman and Lassi Liikkanen&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Now joining the presentation already in progress by Audrey Laplante: &lt;/div&gt;&lt;div&gt;&lt;a href="http://womrad.org/2010/papers/3.pdf"&gt;The Role People Play in Adolescents' Music Information Acquisition.&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;ul&gt;&lt;li&gt;qualitative study of adolescents&lt;/li&gt;&lt;li&gt;'Did your music taste change significantly in the last three years?' Yes, whys:  New boyfriend, New school therefore new friends, important discussion topic&lt;/li&gt;&lt;li&gt;"who in your 'gang' or group exert the most influence on others in terms of music?" -- 3 self-identified. Characteristics: highly invested in music, good comm, willing to share info.  
People who are opinion leaders want to &lt;b&gt;stay&lt;/b&gt; opinion leaders, will invest heavily in effort&lt;/li&gt;&lt;li&gt;in other domains work shows that &lt;b&gt;weak ties&lt;/b&gt; are more important than &lt;b&gt;strong ties&lt;/b&gt; in finding &lt;b&gt;new information &lt;/b&gt;works almost all the time -- for 2 participants weak ties important -- for others strong ties &lt;b&gt;with significantly different social network&lt;/b&gt; are important -- music as vehicle for social interaction&lt;/li&gt;&lt;li&gt;strong ties have different roles -- not important for discovery, but critical for  &lt;b&gt;legitimization of musical taste&lt;/b&gt; &lt;/li&gt;&lt;li&gt;similar and reliable social connections are critical &lt;/li&gt;&lt;li&gt;social network maps (pic forthcoming...)&lt;/li&gt;&lt;li&gt;unknown how common these results are (same survey) as yet unknown exact implications for recommenders&lt;/li&gt;&lt;li&gt;Q (unknown)- Weak ties v. Strong ties -- how do you define the difference?: not really about newness, but it's entirely possible with new detail&lt;/li&gt;&lt;li&gt;Q (claudio) - What kinds of systems are implied with this work? Not necessarily a different system for adolescents. tight connectivity is critical, perhaps the difference is that strong ties may become more critical&lt;/li&gt;&lt;li&gt;(claudio) - does the notion that music describes you change as you get older?: not really actually, adolescents are interested in individual  uniqueness&lt;/li&gt;&lt;li&gt;Q (Mark L) are social networks online somewhat different?: yes and no. in facebook you can find relatives, but noise is a big problem. But trust is not known&lt;/li&gt;&lt;li&gt;I asked about using graph difference.  Answer could work, also other automatic methods...&lt;/li&gt;&lt;/ul&gt;&lt;div&gt;lunch now. 
I'll make a new post for the afternoon session.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;updated again (5:14pm) Eugenio Tacchini is Italian not Finnish (oops)&lt;/div&gt; &lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/vQ__jTpAmJ8" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/2908577219110785202/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=2908577219110785202" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/2908577219110785202?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/2908577219110785202?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/vQ__jTpAmJ8/womrad-live-blog.html" title="A womrad live blog" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>4</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/09/womrad-live-blog.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DkQARHY5fCp7ImA9Wx5XEU4.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-4591309452659296402</id><published>2010-09-09T03:27:00.000-07:00</published><updated>2010-09-10T08:59:05.824-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-09-10T08:59:05.824-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="on the fly search" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><category 
scheme="http://www.blogger.com/atom/ns#" term="musichackday" /><title>Roomba Recon - A musichackday brain dump</title><content type="html">So this past weekend I attended the 2nd (annual?) &lt;a href="http://london.musichackday.org/2010"&gt;London Music Hackday&lt;/a&gt; at the Guardian's offices at King's Cross.  For the hack I created an algorithm that generates playlists between arbitrary start and end songs on soundcloud.  It does this with almost no pre-indexing, allowing for playlists to cover the entire network and always use an up-to-date graph.  It's (mostly) &lt;a href="http://doc.gold.ac.uk/~map01bf/recon/playlist"&gt;running live if you'd like to play with it&lt;/a&gt;.&lt;div&gt;&lt;br /&gt;&lt;div&gt;Briefly, it performs a sort of bastardized &lt;a href="http://en.wikipedia.org/wiki/A*"&gt;A* search&lt;/a&gt;, bilaterally from both the start and the end song to form the playlist.  There's a parameter to limit the length of the two playlist segments; by default this is 4, so the max playlist length is 10 (2*4+2 for the end points).&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;The search algorithm collects social links of the artist corresponding to the given song.  
For each of these connections (you know, 'friends' or in soundcloud jargon 'followings') a determination of the cost of adding that song is calculated in the following way (for the half  built from the start song):&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;img src="http://1.bp.blogspot.com/_PJFQpgIoi0g/TIpHVPi4n9I/AAAAAAAAACM/8ohHEtyRMMc/s200/latex-image-6.png" style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 100px; height: 50px;border-style:none;" border="0" alt="" id="BLOGGER_PHOTO_ID_5515299123944267730" /&gt;&lt;/div&gt;&lt;div&gt;where &lt;img src="http://3.bp.blogspot.com/_PJFQpgIoi0g/TIpBSVGlieI/AAAAAAAAAB0/rpMCpPK4-18/s320/latex-image-3.png" style="cursor:pointer; cursor:hand;width: 33px; height: 16px;border-style:none;" border="0" alt="" id="BLOGGER_PHOTO_ID_5515292476826814946" /&gt;  is the cost to add song &lt;i&gt;m&lt;/i&gt; to list after song &lt;i&gt;n&lt;/i&gt;, &lt;img src="http://3.bp.blogspot.com/_PJFQpgIoi0g/TIpBhIJSwhI/AAAAAAAAAB8/u_k8LyXBtoc/s200/latex-image-4.png" style="cursor:pointer; cursor:hand;width: 28px; height: 16px;border-style:none;" border="0" alt="" id="BLOGGER_PHOTO_ID_5515292731046543890" /&gt; is some measure of distance from song&lt;i&gt; n&lt;/i&gt; to song &lt;i&gt;m&lt;/i&gt; and &lt;img src="http://2.bp.blogspot.com/_PJFQpgIoi0g/TIpHoLqZ5NI/AAAAAAAAACU/Ouog5wjBmtQ/s200/latex-image-7.png" style="cursor:pointer; cursor:hand;width: 26px; height: 16px;border-style:none;" border="0" alt="" id="BLOGGER_PHOTO_ID_5515299449319580882" /&gt; is the same measure of distance from song &lt;i&gt;m &lt;/i&gt;to song &lt;i&gt;e. &lt;/i&gt; Song &lt;i&gt;e&lt;/i&gt; is the end song for the whole playlist.  So basically the idea is that the cost of moving to a node is a ratio of how far away it is from where you were to how far it is to where you're trying to get.  The whole thing is reversed for the other half, so the cost function makes it cheap to move toward the start song.  
If you simply want to randomly traverse social links the cost can be set to an arbitrary equal value (I used 1) for all links.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;This leaves the matter of distance.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Starting with what I know best, I decided to try a content-based distance first.  I should say that from the outset I figured this would be insanely slow, but nonetheless, I gave it a go.  I implemented (&lt;a href="http://doc.gold.ac.uk/~map01bf/recon"&gt;available directly as well&lt;/a&gt;) a little object that will &lt;a href="http://developer.echonest.com/docs/v4/track.html#analyze"&gt;grab the echonest timbre features&lt;/a&gt; for any two soundcloud songs, summarize the features into a single multidimensional gaussian (mean and std) then take the &lt;a href="http://en.wikipedia.org/wiki/Cosine_distance"&gt;cosine distance&lt;/a&gt; between the two tracks (&lt;a href="http://docs.scipy.org/doc/scipy/reference/spatial.distance.html"&gt;other distance metrics could be computed as well&lt;/a&gt;, but cosine seemed reasonable).  That takes something on the order of 45 seconds to do for every pair of tracks.  When using it in the above playlister the whole thing would take maybe 4 hours (I think, I never actually let it complete).  Clearly way too slow.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So taking inspiration from my &lt;a href="http://womrad.org/2010/program.html"&gt;about to be published work&lt;/a&gt; at &lt;a href="http://womrad.org/"&gt;WOMRAD&lt;/a&gt;, I thought some &lt;a href="http://en.wikipedia.org/wiki/Natural_language_processing"&gt;NLP&lt;/a&gt; could save the day.  So the other distance measure I implemented (no direct access yet) is based on a track's tags and comments.  
First I &lt;a href="http://en.wikipedia.org/wiki/Tokenize#Tokenizer"&gt;tokenize&lt;/a&gt; the comments and combine them with the tags to create a &lt;a href="http://en.wikipedia.org/wiki/Vector_space_model"&gt;vector space model&lt;/a&gt; of a track's descriptive text.  I then weighted everything using &lt;a href="http://en.wikipedia.org/wiki/Tfidf"&gt;tfidf&lt;/a&gt; (the idf was populated with a random sample of tracks from across soundcloud that I gathered over the weekend, about 41,000 tracks in total.  This is the only indexing that is done in advance).  From the tfidf weighted terms in a vector space, I took the cosine distance.  This is both quite quick and gives pretty good results.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Everything was built in python, the app is running in &lt;a href="http://www.cherrypy.org/"&gt;cherrypy&lt;/a&gt;, using &lt;a href="http://numpy.scipy.org/"&gt;numpy&lt;/a&gt; and &lt;a href="http://scipy.org/"&gt;scipy&lt;/a&gt; for the data handling and &lt;a href="http://nlp.fi.muni.cz/projekty/gensim/"&gt;gensim&lt;/a&gt; for the tfidf related bits. &lt;a href="http://github.com/soundcloud/python-api-wrapper"&gt;Soundcloud&lt;/a&gt; and &lt;a href="http://code.google.com/p/pyechonest/"&gt;echonest&lt;/a&gt; interaction is all via their respective python wrappers.  Also there's a more terse write up over at &lt;a href="http://wiki.musichackday.org/index.php?title=Roomba_Recon"&gt;the musichackday wiki&lt;/a&gt;.  I'll stick the code on my repository on &lt;a href="http://github.com/gearmonkey/"&gt;github&lt;/a&gt; once it's cleaned up a bit (though that might be a little while as I seem to be rather busy with something at the moment...)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Right.  
Back to writing my thesis.&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/DSAUGmVQIm0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/4591309452659296402/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=4591309452659296402" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4591309452659296402?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4591309452659296402?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/DSAUGmVQIm0/roomba-recon-musichackday-brain-dump.html" title="Roomba Recon - A musichackday brain dump" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://1.bp.blogspot.com/_PJFQpgIoi0g/TIpHVPi4n9I/AAAAAAAAACM/8ohHEtyRMMc/s72-c/latex-image-6.png" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/09/roomba-recon-musichackday-brain-dump.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkUDRH0-fCp7ImA9WxFWEk4.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-615420466115789509</id><published>2010-05-29T09:40:00.000-07:00</published><updated>2010-05-30T09:51:15.354-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-05-30T09:51:15.354-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="ShamelessSelfPromotion" /><category 
scheme="http://www.blogger.com/atom/ns#" term="playlists" /><category scheme="http://www.blogger.com/atom/ns#" term="meta" /><category scheme="http://www.blogger.com/atom/ns#" term="ISMIR" /><title>publications on playlists in ISMIR</title><content type="html">So for this year's ISMIR I'll be doing another tutorial.  This one is entitled "&lt;a href="http://ismir2010.ismir.net/program/tutorials/"&gt;Finding A Path Through The Jukebox – The Playlist Tutorial&lt;/a&gt;" and I'll be presenting it with &lt;a href="http://musicmachinery.com/"&gt;Paul Lamere&lt;/a&gt;.  As you may have guessed by the title it's all about playlists.  So to frame some of my background work I thought I'd poke around the ISMIR proceedings to get a more complete idea of all of the papers that dealt with the topic across the 10 years of proceedings (plus the just announced titles from this year).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;First I did a simple title search using the tool at &lt;a href="http://www.ismir.net/proceedings/index.php?function=show_search_form&amp;amp;table_name=papers"&gt;ismir.net&lt;/a&gt;.  This shows that from 2000 - 2009 there were 14 papers with 'playlist' occurring somewhere in the title.  Here they are over time:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://benfields.net/images/raw_paper_counts.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 200px;" src="http://benfields.net/images/raw_paper_counts.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;Well, that doesn't show very much, just some interest, no trends or anything.  So from there I took a look at the results of the full-text search available from &lt;a href="http://rainer.typke.org/ismir_fulltext.html"&gt;Rainer Typke's website&lt;/a&gt;.  
The full text search found some 123 papers mentioning playlist, certainly a few more than the title search.  From there I wanted to see what the distribution of these papers was over time (as above), though this took a bit more work, as I couldn't sort out a means to export the search results...  Anyway after a bit of counting I got this:&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://benfields.net/images/raw_mentions.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 200px;" src="http://benfields.net/images/raw_mentions.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div&gt;Well, now we're getting somewhere!  Clearly there's an increasing number of papers discussing playlists at ISMIR.  But wait, you say, this doesn't take into account the considerable expansion of the size of the conference over its existence.  So we can normalize to the number of papers per year that are known to the Cumulative ISMIR proceedings ( [35, 43, 62, 56, 108, 119, 99, 131, 111, 148] from 2000 - 2009 if anyone is interested).  Below you can see both the title only and full paper search results normalized to the total number of papers:&lt;/div&gt;&lt;div&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://benfields.net/images/noramlized_counts.jpg"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 320px; height: 200px;" src="http://benfields.net/images/noramlized_counts.jpg" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;The normalization didn't seem to change the trend much.  But this leaves me wondering, what can be drawn from the massive (and growing) disparity between title mentions and fulltext mentions?  Obviously one would expect a higher number of hits, but a tenfold increase seems very large.  
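&lt;/div&gt;&lt;div&gt;As a rough check on that "tenfold" figure, here is a minimal Python sketch that uses only the numbers quoted in this post.  It assumes the 14 title hits and the 123 full-text hits cover the same 2000 - 2009 proceedings as the per-year totals, which may not hold exactly for the full-text index.  The variable names are made up for illustration.&lt;/div&gt;&lt;pre&gt;
```python
# Papers per year, 2000 to 2009, as listed in the post.
papers_per_year = [35, 43, 62, 56, 108, 119, 99, 131, 111, 148]

# Aggregate hit counts quoted in the post.
# Assumption: both counts refer to the same ten years of proceedings.
title_hits = 14       # 'playlist' somewhere in the title
fulltext_hits = 123   # 'playlist' somewhere in the full text

total_papers = sum(papers_per_year)

# Share of all papers that each search picks up.
title_share = title_hits / total_papers
fulltext_share = fulltext_hits / total_papers

# How many times more papers the full-text search finds.
ratio = fulltext_hits / title_hits

print(f"total papers: {total_papers}")
print(f"title share: {title_share:.1%}")
print(f"full-text share: {fulltext_share:.1%}")
print(f"full-text to title ratio: {ratio:.1f}x")
```
&lt;/pre&gt;&lt;div&gt;On these aggregate figures (912 papers in total) the gap works out to a little under ninefold rather than a full tenfold, with roughly one paper in seven mentioning playlists somewhere in its text.  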
My first suspicion is that a great deal of this disparity comes from the fact that many papers at ISMIR that mention playlists are actually about something else (music similarity for instance) and then throw on a playlist as something of an afterthought.  Perhaps this is an implicit acknowledgment of the great human-factor power of the playlist (as discussed in for instance &lt;a href="http://ismir2006.ismir.net/PAPERS/ISMIR0685_Paper.pdf"&gt;this paper&lt;/a&gt;) or perhaps it's something else entirely.  &lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Regardless of these finer points, it's clearly fair to say that there is a great deal of interest in playlist generation and analysis.  If you're interested in these things, why not sign up for &lt;a href="http://ismir2010.ismir.net/program/tutorials/"&gt;our tutorial&lt;/a&gt;?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/bWmdkVWe8iQ" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/615420466115789509/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=615420466115789509" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/615420466115789509?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/615420466115789509?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/bWmdkVWe8iQ/publications-on-playlists-in-ismir.html" title="publications on playlists in ISMIR" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" 
src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/05/publications-on-playlists-in-ismir.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0cCQHc9cSp7ImA9WxBaEkw.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-1945127764694260506</id><published>2010-03-21T15:30:00.000-07:00</published><updated>2010-03-21T15:37:41.969-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-03-21T15:37:41.969-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="site redesign" /><category scheme="http://www.blogger.com/atom/ns#" term="meta" /><title>facelift</title><content type="html">Hi blog readers.  You may notice a slight change of scenery and the addition of some links just above the main text body.  The links go to the other parts of my homepage and the color scheme shift is to keep everything consistent.  I'm not much of a designer so I'll happily take any critique of the color scheme and such...&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/wYLWoIITETk" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/1945127764694260506/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=1945127764694260506" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/1945127764694260506?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/1945127764694260506?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/wYLWoIITETk/facelift.html" title="facelift" 
/><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/03/facelift.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CUAASHY5eyp7ImA9WxBUFEg.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-5292522014947011814</id><published>2010-03-01T05:03:00.000-08:00</published><updated>2010-03-01T06:29:09.823-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-03-01T06:29:09.823-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Myspace" /><category scheme="http://www.blogger.com/atom/ns#" term="marsyas" /><category scheme="http://www.blogger.com/atom/ns#" term="hype hype hype" /><category scheme="http://www.blogger.com/atom/ns#" term="social networking tools" /><category scheme="http://www.blogger.com/atom/ns#" term="music informatics" /><category scheme="http://www.blogger.com/atom/ns#" term="ShamelessSelfPromotion" /><category scheme="http://www.blogger.com/atom/ns#" term="ICASSP" /><category scheme="http://www.blogger.com/atom/ns#" term="IEEE-THEMES" /><title>IEEE-THEMES --shameless self promotion--</title><content type="html">I'm going to be presenting work at &lt;a href="http://ieee-themes.org/"&gt;IEEE-THEMES&lt;/a&gt;, a workshop collocated with &lt;a href="http://www.icassp2010.com/"&gt;ICASSP&lt;/a&gt;, on March 15th in Dallas, TX.  The talk is associated with an article to be published in the august issue of Select Topics in Signal Processing, which is a special issue on signal processing and social networks.  Here's the title/abstract (note: link is to a preprint, camera-ready isn't due till after the talk so paper may well change a touch...) 
:&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;&lt;a href="http://doc.gold.ac.uk/~map01bf/papers/bfields_themes.pdf"&gt;Analysis and Exploitation of Musician Social Networks for Recommendation and Discovery&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Abstract—This paper presents an extensive analysis of a sample of a social network of musicians. The network sample is first analyzed using standard complex network techniques to verify that it has similar properties to other web-derived complex networks. Content-based pairwise dissimilarity values between the musical data associated with the network sample are computed, and the relationship between those content-based distances and distances from network theory explored. Following this exploration, hybrid graphs and distance measures are constructed, and used to examine the community structure of the artist network. Finally, results of these investigations are presented and considered in the light of recommendation and discovery applications with these hybrid measures as their basis.&lt;/div&gt;&lt;/blockquote&gt;&lt;div&gt;The paper mostly covers content that has been discussed elsewhere (much of it with &lt;a href="http://kurtisrandom.blogspot.com/"&gt;Kurt Jacobson&lt;/a&gt;) refactored for a broader audience and with wider narratives in mind.  That said there are some notable new findings in the paper as well.  
We have run another acoustic dissimilarity measure across the entire set (the&lt;a href="http://music-ir.org/mirex/2009/results/abs/GTfinal.pdf"&gt; 2009 MIREX entry&lt;/a&gt; in &lt;a href="http://www.music-ir.org/mirex/2009/index.php/Audio_Music_Similarity_and_Retrieval_Results"&gt;audio music similarity&lt;/a&gt; using &lt;a href="http://marsyas.info/"&gt;marsyas&lt;/a&gt;) which for the most part confirms our earlier findings (that acoustic similarity and social similarity [mostly] aren't linearly correlated and that community genre labeling becomes more homogeneous [again, mostly] when using the audio sim as a weight).  Additionally, we have broadened our comparison metrics to include an examination of the mutual information between the different dissimilarity sets.  This also basically confirms our earlier findings, though mutual information provides a very satisfying level of nuance that is not possible from simply testing (using Pearson's) for linear correlation, especially given that our data is quite far from a normal distribution.  So, if you're planning to be at ICASSP, I'd highly recommend IEEE-THEMES (the rest of the program looks to be very interesting as well...) and if you aren't going to be in Dallas, there are a few options for you.  &lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;If you're in London right now, you can come to &lt;a href="http://maps.google.co.uk/maps?q=Goldsmiths+college&amp;amp;hl=en&amp;amp;cd=1&amp;amp;ei=rM6LS-S1JoyGOKGw1bsG&amp;amp;sig2=epQBWTYMwNgX9_dZVo6Yig&amp;amp;sll=51.474634,-0.035233&amp;amp;sspn=0.006295,0.006295&amp;amp;ie=UTF8&amp;amp;view=map&amp;amp;cid=1969408599269406229&amp;amp;ved=0CEkQpQY&amp;amp;hq=Goldsmiths+college&amp;amp;hnear=&amp;amp;ll=51.47448,-0.035384&amp;amp;spn=0.005747,0.012048&amp;amp;z=17&amp;amp;iwloc=A"&gt;Goldsmiths&lt;/a&gt; today at 4pm to rm 144 in the main building, where I'll be giving a trial run of the talk.  
&lt;/li&gt;&lt;li&gt;Slides (and perhaps some video) will be made available at some point (probably just after the talk is given). &lt;/li&gt;&lt;li&gt;IEEE is running a pay-to-watch &lt;a href="http://ieee-themes.org/index_files/Page1077.htm"&gt;live stream of THEMES&lt;/a&gt;, so there's that as well.&lt;/li&gt;&lt;/ol&gt;Generally, if you're going to be in Dallas from March 15-19, much discussion can happen in person.  Also, between now and then I'll be doing some traveling (tomorrow till 6 March I'll be at UIUC, then from there till the 14th of March I'll be in San Diego) so if any readers are interested in some in-person discussion and our locations overlap, let me know and perhaps something can be arranged. &lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/5NjO-x8GjfU" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/5292522014947011814/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=5292522014947011814" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/5292522014947011814?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/5292522014947011814?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/5NjO-x8GjfU/ieee-themes-shameless-self-promotion.html" title="IEEE-THEMES --shameless self promotion--" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" 
/></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/03/ieee-themes-shameless-self-promotion.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DkYNRHk6cSp7ImA9WxBWGU0.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-3012377734400862670</id><published>2010-02-11T07:41:00.000-08:00</published><updated>2010-02-11T08:03:15.719-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-11T08:03:15.719-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="10.6" /><category scheme="http://www.blogger.com/atom/ns#" term="Mac OSX" /><category scheme="http://www.blogger.com/atom/ns#" term="compile tutorial" /><category scheme="http://www.blogger.com/atom/ns#" term="python" /><category scheme="http://www.blogger.com/atom/ns#" term="update" /><category scheme="http://www.blogger.com/atom/ns#" term="SciPy" /><category scheme="http://www.blogger.com/atom/ns#" term="NumPy" /><title>scipy and numpy from source, revisted</title><content type="html">A while back I &lt;a href="http://stuffalsothings.blogspot.com/2009/08/compiling-bleeding-edge-scipy-on-mac-os.html"&gt;posted&lt;/a&gt; some instructions for getting scipy and numpy mostly up and running from current svn checkouts under python 2.6 with mac os x 10.5.8.  I  updated to 10.6 sometime back and have been using the preinstalled version of numpy (1.2) for my array needs without any scipy with solid results.  However, I needed to get at some scipy functionality (doing some mutual information analysis via &lt;a href="http://code.google.com/p/pyentropy/"&gt;pyentropy&lt;/a&gt;) so I thought I'd give the process a go with the newer OS version.  I'm pleased to report that everything works and was relatively easy to install/build.  
Basically the old instructions still hold, with a couple of additions.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;It is necessary to update to a newer version of numpy, compiled with the same fortran compiler you'll use with scipy.&lt;/li&gt;&lt;li&gt;If you're using the build of macpython that comes with 10.6 (which is py2.6) you'll need to add the option --install-lib=/Library/Python/2.6/site-packages/ to any commands using distutils to install (e.g. setup.py install)&lt;/li&gt;&lt;/ol&gt;And that's about it.  I used fresh checkouts of scipy (r6233, v0.8.0.dev) and numpy (r8106, v1.5.0.dev), but the same versions I've had for a while of SuiteSparse and gFortran (the details of which are in the old post).  As a bonus this seems to result in fewer unit test failures in scipy (now only 10!) for whatever that's worth.&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/-ipU18cIMUg" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/3012377734400862670/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=3012377734400862670" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3012377734400862670?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3012377734400862670?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/-ipU18cIMUg/scipy-and-numpy-from-source-revisted.html" title="scipy and numpy from source, revisted" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" 
src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>2</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/02/scipy-and-numpy-from-source-revisted.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEYMQHk7eCp7ImA9WxBXFk0.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-7147490125369822710</id><published>2010-01-27T07:19:00.000-08:00</published><updated>2010-01-27T07:29:41.700-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-27T07:29:41.700-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="sweatsedos" /><category scheme="http://www.blogger.com/atom/ns#" term="worldbank" /><category scheme="http://www.blogger.com/atom/ns#" term="stockholm" /><category scheme="http://www.blogger.com/atom/ns#" term="musichackday" /><category scheme="http://www.blogger.com/atom/ns#" term="echonest" /><title>MusicHackday: Stockholm</title><content type="html">So in a touch more than 48hrs I'll be hopping on a plane to go to the &lt;a href="http://stockholm.musichackday.org/"&gt;Stockholm MusicHackday&lt;/a&gt;.  It should be excellent, if the last &lt;a href="http://london.musichackday.org/"&gt;one I went to&lt;/a&gt; is any guide.  I'll be joined by fellow &lt;a href="http://doc.gold.ac.uk/isms/"&gt;ISMS&lt;/a&gt; member &lt;a href="http://www.mikesroom.org/"&gt;Mike Jewell&lt;/a&gt;.  The hack is still being formulated, but may involve &lt;a href="http://developer.worldbank.org/"&gt;The World Bank's API&lt;/a&gt; and some yet to be determined sources of listener statistics.  Also, somehow the &lt;a href="http://developer.echonest.com/"&gt;echonest's API&lt;/a&gt; will be involved because I need to leave Stockholm with one of &lt;a href="http://musicmachinery.com/2010/01/26/best-ever-echo-nest-prize-at-the-stockholm-music-hackday/"&gt;these&lt;/a&gt;.  
We may need some further assistance to get something done in 24hrs, so if you're going to be at the hack and are looking for some folk to hack with drop a line in the comments...&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/1w_HofZtJS0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/7147490125369822710/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=7147490125369822710" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/7147490125369822710?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/7147490125369822710?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/1w_HofZtJS0/musichackday-stockholm.html" title="MusicHackday: Stockholm" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/01/musichackday-stockholm.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DkYDQHw4fip7ImA9WxBXFk0.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-851454081612410948</id><published>2010-01-27T06:00:00.000-08:00</published><updated>2010-01-27T06:56:11.236-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-27T06:56:11.236-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="music informatics" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><category scheme="http://www.blogger.com/atom/ns#" 
term="background" /><title>A bit about playlists and similarity</title><content type="html">Sorry about the general radio silence of late.  Many things going on, most of them interesting.&lt;div&gt;Lately I've been spending quite a bit of time considering various aspects of playlist generation and how they all fit together.  Here are some of my lines of thought:&lt;div&gt;&lt;ol&gt;&lt;li&gt;Evaluation of a playlist.  How?  Along which dimension? (Good v. Bad, Appropriate v. Offensive, Interesting v. Boring)&lt;/li&gt;&lt;li&gt;How do people in various functions create playlists?  How do this process and its output compare to common (or state-of-the-art) methods employed in automatic playlist construction?  This is to say, are we doing it right?  Are the correct questions even being asked?&lt;/li&gt;&lt;li&gt;What is the relationship between notions of music similarity (or pairwise relationship in the generic) and playlist construction?&lt;/li&gt;&lt;/ol&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;While all these ideas are interrelated, for now I'm going to pick at point (3) a bit.  I'm coming to believe this is central to understanding the other two points as well, at least to an extent.  There are many ways to consider how two songs are related.  In &lt;a href="http://www.ismir.net/"&gt;music informatics&lt;/a&gt; this similarity is almost always content-based, even if it isn't content-derived.  This can include methods based on timbral or harmonic features or most tags or similar labels (though these sometimes get away from content descriptors).  This paints some kind of picture but leaves out something that can be critical to manual playlist construction as it is commonly understood (e.g. in radio or the creation of a 'mixtape'): socio-cultural context.  In order to have the widest array of possible playlist constructions, it is necessary to have as complete an understanding as possible of the relationships among all member songs (not just neighbors...).  
Put another way, the complexity of your playlist is bounded above by the complexity of your similarity measure.  &lt;/div&gt;&lt;div style="text-align: center;"&gt;M &amp;le; &lt;i&gt;C&lt;/i&gt;s&lt;/div&gt;&lt;div style="text-align: left;"&gt;where M is some not yet existent measure of the possible semantic complexity of a playlist and s is a similar measure of the semantic complexity of the similarity measure used in the construction of that playlist. &lt;i&gt;C&lt;/i&gt; is our fudge factor constant.  Now, obviously there are plenty of situations where complex structure isn't required.  But if the goal is to make playlists for a wide range of functions and settings, it will be required sometimes.&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;In practice what this means is that you can make a bag of songs from a bag of features.  However, imparting long-form structure is at a minimum dependent on a much more complex understanding of the relationships (e.g. similarity) between songs (say from social networks or radio logs...)&lt;/div&gt;&lt;div style="text-align: left;"&gt;&lt;br /&gt;&lt;/div&gt;&lt;div style="text-align: left;"&gt;Anyway, this is all a bit vague right now.  I'm working on some better formalization; we'll see how that goes.  
Anyone have any thoughts?&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/DCvi5XYPuio" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/851454081612410948/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=851454081612410948" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/851454081612410948?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/851454081612410948?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/DCvi5XYPuio/bit-about-playlists-and-similarity.html" title="A bit about playlists and similarity" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>3</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2010/01/bit-about-playlists-and-similarity.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkEMQXwyeSp7ImA9WxNTFkQ.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-8338816767421663729</id><published>2009-08-19T07:03:00.000-07:00</published><updated>2009-08-19T09:18:00.291-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-08-19T09:18:00.291-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="UMFpack" /><category scheme="http://www.blogger.com/atom/ns#" term="Mac OSX" /><category scheme="http://www.blogger.com/atom/ns#" term="compile tutorial" /><category 
scheme="http://www.blogger.com/atom/ns#" term="AMD" /><category scheme="http://www.blogger.com/atom/ns#" term="SuiteSparse" /><category scheme="http://www.blogger.com/atom/ns#" term="SciPy" /><category scheme="http://www.blogger.com/atom/ns#" term="NumPy" /><title>Compiling bleeding edge SciPy on Mac OS X</title><content type="html">I do most of my number-crunching tasks with &lt;a href="http://scipy.org/"&gt;SciPy&lt;/a&gt; these days, having basically kicked the matlab habit with the brief exception of occasional use of legacy libraries.  SciPy is a joy to work with, but is a huge pain to build from source, in light of nasty dependencies (fortran things mostly) and some system-specific hardware acceleration trickiness.  Thankfully most users can download one of many prebuilt packages, perhaps the best being &lt;a href="http://www.enthought.com/products/epd.php"&gt;enthought's&lt;/a&gt;.  If you've ever wanted to see what SciPy is all about, this is the easiest way to do so.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;That said, one of the great things about using SciPy instead of matlab is that it's python.  Except all the prebuilt binaries (to my knowledge anyway) use python 2.5 at the newest.  Again, probably not a problem for most, but I use the nice socket library (amongst other things) that's been improved significantly in python 2.6.  So for a while I had my SciPy python and my everything else python and every so often I would make another attempt at building SciPy for py2.6 on my mac to integrate the two and every time it would defeat me.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Until yesterday.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So I'm going to attempt to fully document the process as I've now done it on 2 similar machines (home &amp;amp; lab) and now that I've figured out the tricky bits it seems fairly easy to reproduce.  These instructions were followed on one- to two-year-old intel-based macs running 10.5.8.  
(Note these instructions don't touch on installing the other pieces of standard SciPy setup, ipython and pylab/matplotlib as I've never had much trouble getting these to build.  I believe the easy_install process works for both, mostly)&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;(optional) If you don't want to build universal python modules remove "&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;-arch ppc -arch i386&lt;/span&gt;" from the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;BASECFLAGS&lt;/span&gt; and the &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;LDFLAGS&lt;/span&gt; in the python library Makefile, which should live somewhere around here: &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;&lt;span class="Apple-style-span"  style="font-size:small;"&gt;/Library/Frameworks/Python.framework/Versions/Current/lib/python2.6/config/Makefile&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;&lt;li&gt;If you don't already have xCode 3.1.3 and the associated developer tools you need to get it for apple's custom build of gcc 4.2 (it's not the version that comes with most box copies of 10.5).  Download and install a fresh copy of the &lt;a href="http://developer.apple.com/technology/xcode.html"&gt;Apple Developer Tools&lt;/a&gt;.  You can get SciPy to compile with other variants of gcc 4.2 or greater (from &lt;a href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; for instance) but they don't support apple specific options, which are very helpful in other situations.&lt;/li&gt;&lt;li&gt;Download and install&lt;a href="http://r.research.att.com/tools/"&gt; gFortran as a patch&lt;/a&gt; to the apple gcc from att research.  Why apple doesn't leave gfortran in gcc I don't know, but they don't and we need it.  
It's critical you use this fortran compiler as other variants of gfortran or g77 seem to cause errors.&lt;/li&gt;&lt;li&gt;Download and install UMFPACK and AMD from SuiteSparse.  The easiest way I've gotten through this is to &lt;a href="http://www.cise.ufl.edu/research/sparse/SuiteSparse/current/"&gt;download the entire SuiteSparse&lt;/a&gt; and then do the following:&lt;ol style="list-style-type:lower-alpha"&gt;&lt;li&gt;Modify the package-wide config makefile found at &lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;SuiteSparse/UFconfig/UFconfig.mk&lt;/span&gt; by uncommenting the Macintosh options (currently lines 299 - 303)&lt;/li&gt;&lt;li&gt;In order to only compile the 2 packages we also need to modify the top-level makefile (&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;SuiteSparse/Makefile&lt;/span&gt;) by commenting out the references to the other packages under the default target (currently lines 10, 12-17, 19-24).&lt;/li&gt;&lt;li&gt;Run make while in the SuiteSparse dir&lt;/li&gt;&lt;li&gt;Because it would be too easy if SuiteSparse actually had an install routine, we have to install the just compiled libs ourselves.  
This is how I did it, though you can stick all these bits wherever you like as long as the compiler and linker used for the python build will see them:&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;$ sudo install UMFPACK/Include/* /usr/local/include/&lt;br /&gt;$ sudo install AMD/Include/* /usr/local/include/&lt;br /&gt;$ sudo install UMFPACK/Lib/* /usr/local/lib/&lt;br /&gt;$ sudo install AMD/Lib/* /usr/local/lib/&lt;br /&gt;$ sudo install UFconfig/UFconfig.h /usr/local/include/&lt;/span&gt;&lt;/li&gt;&lt;/ol&gt;&lt;/li&gt;&lt;li&gt;Grab a bleeding edge copy of SciPy and NumPy via their subversion repositories:&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;$ svn co http://svn.scipy.org/svn/numpy/trunk numpy-from-svn&lt;br /&gt;$ svn co http://svn.scipy.org/svn/scipy/trunk scipy-from-svn&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Build and install NumPy:&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;$ cd numpy-from-svn&lt;br /&gt;$ sudo python setup.py build --fcompiler=gnu95 install&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Test NumPy to make sure it's not broken (note that the tests need to be run from outside the build directory):&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;$ cd ..&lt;br /&gt;$ python -c "import numpy;numpy.test()"&lt;/span&gt;&lt;br /&gt;Make sure numpy doesn't fail any of the tests (known fails and skips are okay) or the next bit may not work.&lt;/li&gt;&lt;li&gt;Similar to step 6, build and install SciPy:&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;$ cd scipy-from-svn&lt;br /&gt;$ sudo python setup.py config_fc --fcompiler=gfortran install&lt;/span&gt;&lt;/li&gt;&lt;li&gt;Similar to step 7, move out of the build directory and run the built-in tests:&lt;br /&gt;&lt;span class="Apple-style-span"  style="font-family:'courier new';"&gt;$ cd ..&lt;br /&gt;$ python -c "import scipy;scipy.test()"&lt;/span&gt;&lt;br 
/&gt;You're going to get some fails and maybe some errors.  You're going to have to use your own judgement as to whether these errors and fails are substantial.  Most of the troubles I've encountered are trivial, things like a type being dtype('int32') instead of 'int32', which are actually equivalent; the tests just need to be updated to reflect newer numpy.&lt;/li&gt;&lt;/ol&gt;And now you have a nice SciPy build for whatever flavor of python you're working with on your Mac.  Note that I have no idea how well this will work in anything other than python 2.6 on Mac OSX 10.5.8, though it will probably mostly work with other variants.  Also, for completeness, I most recently compiled these versions: NumPy-r7303, SciPy-r5893.  At some point I'm going to give it a go with python 3.x but that will be a whole new kind of pain I suspect.  Anyway, if anyone uses these instructions and they don't quite work or you don't understand part of it, please let me know and I'll try to clarify or help as best I can.  
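&lt;br /&gt;&lt;br /&gt;If you end up with more than one python on the machine, as I did, it can help to confirm which NumPy and SciPy a given interpreter actually picks up before running the test suites.  Below is a minimal sketch of such a check, not part of the build steps above.  It assumes a current python rather than the 2.6 used in this post, and describe_module is a helper name made up for this example.
&lt;pre&gt;
```python
import importlib


def describe_module(name):
    """Return (version, file path) for an importable module, else None.

    Modules that expose no __version__ attribute report "unknown".
    describe_module is a hypothetical helper written for this example.
    """
    try:
        module = importlib.import_module(name)
    except ImportError:
        return None
    version = getattr(module, "__version__", "unknown")
    path = getattr(module, "__file__", "built-in")
    return version, path


# Report what this interpreter sees for the two packages built above.
for name in ("numpy", "scipy"):
    info = describe_module(name)
    if info is None:
        print(name, "is not importable from this interpreter")
    else:
        print(name, info[0], "loaded from", info[1])
```
&lt;/pre&gt;
If the reported path is not under the site-packages directory you installed into, the interpreter is most likely finding a different copy.&lt;br /&gt;&lt;br /&gt;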
I'd really love to build a definitive set of instructions for building SciPy on a Mac, but I can only verify these instructions on my machines.&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/GD6IbEkbtF0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/8338816767421663729/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=8338816767421663729" title="5 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/8338816767421663729?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/8338816767421663729?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/GD6IbEkbtF0/compiling-bleeding-edge-scipy-on-mac-os.html" title="Compiling bleeding edge SciPy on Mac OS X" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>5</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2009/08/compiling-bleeding-edge-scipy-on-mac-os.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0UEQ389fSp7ImA9WxJUFkw.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-756837322595532483</id><published>2009-07-14T07:18:00.000-07:00</published><updated>2009-07-14T15:40:02.165-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-07-14T15:40:02.165-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="soundcloud" /><category scheme="http://www.blogger.com/atom/ns#" term="kurtisrandom" /><category 
scheme="http://www.blogger.com/atom/ns#" term="music recommendation" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><category scheme="http://www.blogger.com/atom/ns#" term="musichackday" /><title>musichackday</title><content type="html">So I spent the weekend holed up in the Guardian offices at the &lt;a href="http://www.musichackday.org/"&gt;musichackday&lt;/a&gt;.  I went in with some perhaps overly ambitious plans to generate playlists across the &lt;a href="http://soundcloud.com/"&gt;SoundCloud&lt;/a&gt; user graph, with song selection optimization done with features via &lt;a href="http://theechonest.com/"&gt;theechonest&lt;/a&gt;.  This &lt;i&gt;might&lt;/i&gt; have been barely possible if I had been working with a couple of other people of similar background, but circumstances led to me hacking mostly solo at this particular event.&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;In the end I spent a substantial amount of time beating the SoundCloud python wrapper into being more helpful for what I wanted it to do (which is perhaps not what its envisioned use was, but hey, that's what hacks are for), namely walking the user (artist) space and creating a &lt;a href="http://en.wikipedia.org/wiki/Complex_network"&gt;Complex Network&lt;/a&gt; so I can move the playlist generation tools we've &lt;a href="http://stuffalsothings.blogspot.com/2008/10/mypyspace-status-update.html"&gt;created around myspace&lt;/a&gt; crawls over to SoundCloud.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;So, to that end, I've created some bits of python that walk through the user graph on SoundCloud and build a graph using &lt;a href="http://igraph.sourceforge.net/"&gt;iGraph&lt;/a&gt;.  This code base is living over at a new github repository I've created called &lt;a href="https://github.com/gearmonkey/pySomethingClever/tree"&gt;pySomethingClever&lt;/a&gt;.  
Included over there are diff files documenting the changes I made to &lt;b&gt;official&lt;/b&gt; &lt;a href="http://github.com/soundcloud/python-api-wrapper/tree"&gt;SoundCloud-api-wrapper&lt;/a&gt;, which will enable any willing victims to grab and run the hacky bits of code I have up.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Once I got the api wrapper in a place where it could do a bit of what I wanted I fired off a crawl.  I got through about 4,000 users (of a complete user network of about 170k nodes for ~2.3% of the network) in  SoundCloud's network before the presentations started on Sunday.  To clarify slightly, the network contains all the users of SoundCloud, but only the outlinks (users a given user follows) from 4,000 nodes.  This is to say I had a (mostly) complete vertex list and a very incomplete edge list.  With the super great help of &lt;a href="http://kurtisrandom.blogspot.com/"&gt;kurtj&lt;/a&gt;&lt;a href="http://twitter.com/kurtjx"&gt;x&lt;/a&gt; this sampled network was pushed through the &lt;a href="http://xavier.informatics.indiana.edu/lanet-vi/"&gt;lanet&lt;/a&gt; k-core decomposition visualization to draw out some of the community structure and related forms of the sample graph.  Here's that graph:  &lt;/div&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://doc.gold.ac.uk/~map01bf/scratch/bensgraph.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://doc.gold.ac.uk/~map01bf/scratch/bensgraph.png" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;div&gt;The size of each node is tied to the number of links (either direction) touching that node.  
The color and placement have to do with how critical the node is to the rest of the network maintaining its current state of connectedness.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;div&gt;Since the hack I've continued gathering edges toward a complete representation of SoundCloud.  I currently have the outlink edges from more than 17,000 SoundCloud users (about 10% of the user base) and should have a full capture in the next few days.  Below you can see the same visualization with the edges from 16,000 users (the crawler writes the graph out every 2,000 users):&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;a href="http://doc.gold.ac.uk/~map01bf/scratch/16knoCliques.png"&gt;&lt;img style="display:block; margin:0px auto 10px; text-align:center;cursor:pointer; cursor:hand;width: 400px; height: 300px;" src="http://doc.gold.ac.uk/~map01bf/scratch/16knoCliques.png" border="0" alt="" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;As the crawl continues, my guess is the middle bits will continue to fill in, which would be expected if SoundCloud behaves in the usual &lt;a href="http://en.wikipedia.org/wiki/Power_law"&gt;Power Law&lt;/a&gt; fashion (as most of The Internet's networks, social or otherwise, tend to).&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;It should be noted that these visualizations, while very interesting, are just the beginning of what is possible once the whole user network is captured. I'm going to be building some playlist generators and recommenders around this in the coming weeks.  If things look good (and from here I'm quite excited) I'll push some of it to the &lt;a href="http://ismir2009.ismir.net/"&gt;ISMIR late breaking demos&lt;/a&gt; and possibly to &lt;a href="http://www.cp.jku.at/conferences/admire2009/"&gt;AdMIRe&lt;/a&gt;. 
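&lt;br /&gt;&lt;br /&gt;For anyone curious what a partial crawl like this looks like in code, here is a minimal sketch in plain python.  It is not the pySomethingClever code: the get_following callable is a hypothetical stand-in for the patched SoundCloud wrapper, and the toy follow graph is invented for illustration.  It walks outlinks breadth-first and stops after a fixed number of users, which leaves more vertices seen than users whose outlinks have been gathered, as in the captures described in this post.
&lt;pre&gt;
```python
from collections import deque


def crawl_outlinks(seed, get_following, max_users):
    """Breadth-first walk over a follow graph, stopping early.

    get_following is a hypothetical callable (user -> list of followed
    users) standing in for a real API wrapper.  Returns three values:
    the set of every user seen, the list of (follower, followed) edges,
    and the set of users whose outlinks were actually gathered.
    """
    vertices = {seed}
    edges = []
    crawled = set()
    queue = deque([seed])
    # Stop when the frontier is empty or the user budget is spent.
    while queue and max_users > len(crawled):
        user = queue.popleft()
        crawled.add(user)
        for followed in get_following(user):
            edges.append((user, followed))
            if followed not in vertices:
                vertices.add(followed)
                queue.append(followed)
    return vertices, edges, crawled


# Toy follow graph, made up for this example.
TOY_GRAPH = {
    "a": ["b", "c"],
    "b": ["c", "d"],
    "c": ["a"],
    "d": ["e"],
    "e": [],
}

vertices, edges, crawled = crawl_outlinks(
    "a", lambda user: TOY_GRAPH.get(user, []), max_users=2
)
print(len(vertices), "vertices seen;", len(crawled), "users crawled;", len(edges), "edges")
```
&lt;/pre&gt;
The edge list could be handed to iGraph once the crawl finishes; that step is left out here so the sketch has no dependencies.&lt;br /&gt;&lt;br /&gt;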
More to come!&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/LZSwJaq6qM0" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/756837322595532483/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=756837322595532483" title="6 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/756837322595532483?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/756837322595532483?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/LZSwJaq6qM0/musichackday.html" title="musichackday" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>6</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2009/07/musichackday.html</feedburner:origLink></entry><entry gd:etag="W/&quot;CEYBQnY5fyp7ImA9WxJWFEk.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-7671723062624001631</id><published>2009-06-19T13:09:00.001-07:00</published><updated>2009-06-19T13:09:13.827-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-06-19T13:09:13.827-07:00</app:edited><title>The Semantic Web and Why You Should Care - a highly informal slidebook</title><content type="html">As you may have noticed, my posting has been super light (even for me) for the last few months.  Sorry about that.  
Anyway, the reader may be interested in the presentation I gave at the IEEE/ACM Joint Conference on Digital Libraries, Workshop On Integrating Digital Library Content with Computational Tools and Services this morning (about 4 hours ago actually).  These slides really require the presentation audio to make sense and if it was recorded (unsure if it was...) I'll edit the slide share to have the audio track as well.  Regardless, here's the slidebook.  Have fun.&lt;div style="width:425px;text-align:left" id="__ss_1610212"&gt;&lt;a style="font:14px Helvetica,Arial,Sans-serif;display:block;margin:12px 0 3px 0;text-decoration:underline;" href="http://www.slideshare.net/BenFields/the-semantic-web-and-why-you-should-care?type=presentation" title="The Semantic Web and Why You Should Care."&gt;The Semantic Web and Why You Should Care.&lt;/a&gt;&lt;object style="margin:0px" width="425" height="355"&gt;&lt;param name="movie" value="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=jcdlworkshopondigitalcontent-090619142139-phpapp01&amp;stripped_title=the-semantic-web-and-why-you-should-care" /&gt;&lt;param name="allowFullScreen" value="true"/&gt;&lt;param name="allowScriptAccess" value="always"/&gt;&lt;embed src="http://static.slidesharecdn.com/swf/ssplayer2.swf?doc=jcdlworkshopondigitalcontent-090619142139-phpapp01&amp;stripped_title=the-semantic-web-and-why-you-should-care" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="355"&gt;&lt;/embed&gt;&lt;/object&gt;&lt;div style="font-size:11px;font-family:tahoma,arial;height:26px;padding-top:2px;"&gt;View more &lt;a style="text-decoration:underline;" href="http://www.slideshare.net/"&gt;Microsoft Word documents&lt;/a&gt; from &lt;a style="text-decoration:underline;" href="http://www.slideshare.net/BenFields"&gt;Ben Fields&lt;/a&gt;.&lt;/div&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/6fHqG097Kp0" height="1" width="1"/&gt;</content><link 
rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/7671723062624001631/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=7671723062624001631" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/7671723062624001631?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/7671723062624001631?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/6fHqG097Kp0/semantic-web-and-why-you-should-care.html" title="The Semantic Web and Why You Should Care - a highly informal slidebook" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2009/06/semantic-web-and-why-you-should-care.html</feedburner:origLink></entry><entry gd:etag="W/&quot;AkQAQHg6eSp7ImA9WxJTFUU.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-4053206227225844692</id><published>2009-04-24T08:33:00.001-07:00</published><updated>2009-04-24T08:52:21.611-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-04-24T08:52:21.611-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="music informatics" /><category scheme="http://www.blogger.com/atom/ns#" term="playlisting" /><category scheme="http://www.blogger.com/atom/ns#" term="Mphil" /><category scheme="http://www.blogger.com/atom/ns#" term="openhacklondon" /><title>back to playlisting</title><content type="html">&lt;div&gt;After a brief tangent into the wide world of mixing (of which I'll 
post some more from my proposed &lt;a href="http://smc2009.smcnetwork.org/"&gt;SMC&lt;/a&gt; paper in a bit) I'm back into playlist generation and related topics.  Along those lines it occurs to me that readers of my little blog may be interested in perusing my recently completed (Dec 2008 actually) M.Phil to PhD upgrade document.  For those of you unfamiliar with the British PhD system, PhD students start as Master of Philosophy by research students, then after about 24 months of independent research go through a process of summarizing and defending their work so far and what they intend to accomplish in the coming 18 - 24 months.  The outcome of this process is one of three things:&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;ol&gt;&lt;li&gt;Your work is deemed interesting, rigorous and sufficiently likely to succeed in the next couple of years.  As a result, you upgrade to a PhD student and continue on with your research (this is what happened in my case)&lt;/li&gt;&lt;li&gt;You graduate at that point with an M.Phil.&lt;/li&gt;&lt;li&gt;You completely fail your upgrade process.&lt;/li&gt;&lt;/ol&gt;So that happened back in mid-December.  Here's the abstract from my upgrade:&lt;br /&gt;&lt;/div&gt;&lt;div&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;blockquote&gt;&lt;/blockquote&gt;&lt;div&gt;&lt;/div&gt;&lt;blockquote&gt;&lt;div&gt;A framework is described to consider various real world playlist use cases.  Automatic playlist generation is introduced as a means to improve music recommendation.  Literature in related topics is discussed.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;A sample of the Myspace artist network is examined to investigate the relationship between social connectivity and audio-based similarity.  Audio data from the Myspace artist pages is analyzed using well-established signal-based music information retrieval techniques.  
In addition to showing that the Myspace artist network exhibits many of the properties common to social networks, it is seen that there is an ambiguous relationship between audio-based similarity and the social connectivity. Further the Myspace sample is examined with the pairwise relational connectivity measure Minimum cut/Maximum flow.  These values are then compared to a pairwise acoustic Earth Mover's Distance measure and the relationship is discussed.  A means of constructing playlists using the maximum flow value to exploit both the social and acoustic distances is realized.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;Two playlist generation methods are proposed for development and experimentation.  The first is a direct extension of the myspace dataset analysis into a robust playlist system for interactive internet radio broadcast.  The second is content based system which uses expert constructed playlists to construct transition models which can then be used on new material.  This is followed by a discussion of evaluation needs and strategies. &lt;/blockquote&gt; If you're interested in reading the whole thing (comments welcome and encouraged) &lt;a href="http://doc.gold.ac.uk/~map01bf/papers/Upgrade.pdf"&gt;download the pdf&lt;/a&gt;.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;On an unrelated note, I'll be at &lt;a href="http://openhacklondon.pbwiki.com/"&gt;Yahoo Open Hackday 09&lt;/a&gt; in Covent Garden in a couple weekends.  It's free and I believe there are still some tickets if anyone is interested.  
It should be rad.&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/U0ITkrc-grc" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/4053206227225844692/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=4053206227225844692" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4053206227225844692?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4053206227225844692?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/U0ITkrc-grc/back-to-playlisting.html" title="back to playlisting" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>1</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2009/04/back-to-playlisting.html</feedburner:origLink></entry><entry gd:etag="W/&quot;DEYFQnw-eip7ImA9WxVaGEU.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-4507800690703607707</id><published>2009-04-16T05:23:00.000-07:00</published><updated>2009-04-16T05:48:33.252-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-04-16T05:48:33.252-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="not_DAFx" /><category scheme="http://www.blogger.com/atom/ns#" term="autoDJ" /><category scheme="http://www.blogger.com/atom/ns#" term="SMC" /><title>Conference alterations...</title><content type="html">So this paper I'm trying to put together on mixing algorithms just didn't quite come together in time 
for DAFx.  Or more exactly, the actual research wasn't quite done till like yesterday.  But it's not all bad.  I'm submitting to &lt;a href="http://smc2009.smcnetwork.org/"&gt;Sound and Music Computing 2009&lt;/a&gt; instead, which looks like it'll be an interesting conference.  This also gives me a couple more much-needed days to hash out a few more thoughts on things. &lt;br /&gt;Here's a teaser from the paper:&lt;br /&gt;&lt;br /&gt;For the purpose of this discussion we will be dealing with various types of song transitions in order of temporal complexity, from the simplest to describe in time to the most complex.&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;Song to song transition types&lt;br&gt;&lt;br /&gt;&lt;li&gt;Arbitrary Length Fixed Time Crossfade&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Phrase-Aligned Start No Tempo Adjust&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Phrase-Aligned Start, Running Beat Alignment&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Phrase-Aligned Start, Phrase-Aligned Finish&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt; Once I've got this draft done I'll post some more bits.  
In the meantime, anyone have any further ideas on core subdivisions of song to song transitions, from the perspective of time alignment?&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/wwg6tthnmNA" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/4507800690703607707/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=4507800690703607707" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4507800690703607707?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/4507800690703607707?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/wwg6tthnmNA/conference-alterations.html" title="Conference alterations..." 
/><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" /></author><thr:total>0</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2009/04/conference-alterations.html</feedburner:origLink></entry><entry gd:etag="W/&quot;C0cMRng_fSp7ImA9WxVbEUo.&quot;"><id>tag:blogger.com,1999:blog-6331230179506650301.post-3873747398657485538</id><published>2009-03-27T08:54:00.000-07:00</published><updated>2009-03-27T09:24:47.645-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-03-27T09:24:47.645-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="continuous mixing" /><category scheme="http://www.blogger.com/atom/ns#" term="dafx2009" /><category scheme="http://www.blogger.com/atom/ns#" term="remix" /><title>working toward a rigorous definition of 'continuous mixing'</title><content type="html">So I have been thinking about this idea of continuous mixing, or generally speaking, more content-aware automatic song to song transitions for some time now.  To that end I'm aiming to put something together for &lt;a href="http://dafx09.como.polimi.it/"&gt;DAFx&lt;/a&gt; on the topic.  As the deadline for submission is fast approaching, my writing is generally focused in that direction. In the meantime however, I'll throw out a vague English definition of continuous mixing so that you the reader can get an idea of where I'm coming from on this topic, with some more rigorous definitions to follow.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;continuous mixing: A song to song mixing technique where the goal is to obfuscate the transition between the two songs such that a casual listener cannot immediately pinpoint when the transition occurred.  
This can involve the use of beat matching/alignment, phrase matching/alignment, content aware equalization and other technical elements as well as sensible song selection with regard to harmony (i.e. key changes that make musical sense).&lt;/blockquote&gt;&lt;br /&gt;&lt;br /&gt;So that's my starting point.  Anyone else have an opinion?  What does continuous mixing mean to you?  Bonus question!  When is continuous mixing an appropriate transition technique in playlist presentation?  I'm starting with modern electronic dance music, but these techniques can certainly apply to other musics (speed metal? maybe. free jazz?  probably not...)&lt;img src="http://feeds.feedburner.com/~r/StuffAlsoThings/~4/E132_fGrUHs" height="1" width="1"/&gt;</content><link rel="replies" type="application/atom+xml" href="http://stuffalsothings.blogspot.com/feeds/3873747398657485538/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="http://www.blogger.com/comment.g?blogID=6331230179506650301&amp;postID=3873747398657485538" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3873747398657485538?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6331230179506650301/posts/default/3873747398657485538?v=2" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/StuffAlsoThings/~3/E132_fGrUHs/working-toward-rigorous-definition-of.html" title="working toward a rigorous definition of 'continuous mixing'" /><author><name>ben</name><uri>http://www.blogger.com/profile/00577690418643247192</uri><email>noreply@blogger.com</email><gd:image rel="http://schemas.google.com/g/2005#thumbnail" width="16" height="16" src="http://img2.blogblog.com/img/b16-rounded.gif" 
/></author><thr:total>2</thr:total><feedburner:origLink>http://stuffalsothings.blogspot.com/2009/03/working-toward-rigorous-definition-of.html</feedburner:origLink></entry></feed>
