<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:posterous="http://posterous.com/help/rss/1.0" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
  <channel>
    <title>Ben Prew's posterous</title>
    <link>http://benprew.posterous.com</link>
    <description>Most recent posts at Ben Prew's posterous</description>
    <generator>posterous.com</generator>
    <link xmlns="http://www.w3.org/2005/Atom" href="http://posterous.com/api/sup_update#b117bf4b4" type="application/json" rel="http://api.friendfeed.com/2008/03#sup" />
    
    
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/BenPrewsPosterous" /><feedburner:info uri="benprewsposterous" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://posterous.superfeedr.com/" /><feedburner:browserFriendly></feedburner:browserFriendly><item>
      <pubDate>Wed, 27 Jul 2011 17:05:00 -0700</pubDate>
      <title>Why I use median, and when you should too</title>
      <link>http://benprew.posterous.com/why-i-use-median-and-when-you-should-too</link>
      <guid>http://benprew.posterous.com/why-i-use-median-and-when-you-should-too</guid>
      <description>
        <![CDATA[<p>
	<p>One of the issues I struggled with on <a href="http://mtg.throwingbones.com">MTG Card Prices</a> was how to price a card based on historical sales. <p /> My initial solution was to take the average of the sale prices over a rolling window of the last 4 weeks. Prices for individual cards always fluctuated a little, so smoothing out bumps worked well. Also, it was easy (MySQL has an avg() function). <p /> While it worked for most of the cards, it would occasionally lead to strange results. An example of this is the Beta Sol Ring, which had sales like this: <p /> $25.12, $1725.98, $80.98, $69.99, $62, $73.32, $109.54, $110.25 <p /> The $1,725.98 price is a huge outlier. The auction itself included other Beta cards, but the auction title only mentioned the Sol Ring, so of course the matcher matched it to a Beta Sol Ring. <p /> So, with the above auctions, using average, the price is $282.15, which no one will pay. <p /> Knowing that any matching system will occasionally mismatch an auction like this, the question becomes, "How can I get a reasonable price given the occasional outlier?" <p /> Fortunately, there is a solution to this problem, and I came across it recently while reading <a href="http://www.amazon.com/Data-Analysis-Open-Source-Tools/dp/0596802358/ref=sr_1_1?ie=UTF8&amp;qid=1311808579&amp;sr=8-1">"Data Analysis with Open Source Tools"</a>. The author, Philipp K. Janert, mentions that when you have a set that is not evenly distributed, you should use median, not average. <p /> Using our set of prices above, the median comes to $77.15 (the midpoint of the two middle sales, $73.32 and $80.98), which is close to what someone would actually pay for a Beta Sol Ring.<p /> This is a much better solution, and it's a shame that MySQL doesn't have a built-in median function. I ended up using a <a href="http://rpbouman.blogspot.com/2007/12/calculating-financial-median-in-mysql.html">substring_index solution</a> that seemed like the least painful, and didn't require any self-joins.</p>
<p>So, the rule of thumb is: "If you know your data is evenly distributed, use average, otherwise you probably want median".</p>
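<p>To make the difference concrete, here is a minimal Ruby sketch that computes both statistics for the eight Sol Ring sales listed above. The helper names (mean, median) are illustrative and are not part of the site's code, which does this work in MySQL.</p>

```ruby
# Sales copied from the post; the helper names below are illustrative.
PRICES = [25.12, 1725.98, 80.98, 69.99, 62.00, 73.32, 109.54, 110.25].freeze

# Arithmetic mean: every value, including an outlier, pulls on the result.
def mean(values)
  values.sum / values.size
end

# Median: the middle value of the sorted list, or the midpoint of the
# two middle values when the list has an even number of entries.
def median(values)
  sorted = values.sort
  mid = sorted.size / 2
  sorted.size.odd? ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2.0
end

puts format('mean:   $%.2f', mean(PRICES))
puts format('median: $%.2f', median(PRICES))
```

<p>The single mismatched auction drags the mean far above every genuine sale, while the median stays inside the cluster of real prices. Both helpers assume a non-empty list.</p>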
	
</p>

<p><a href="http://benprew.posterous.com/why-i-use-median-and-when-you-should-too">Permalink</a> 

	| <a href="http://benprew.posterous.com/why-i-use-median-and-when-you-should-too#comment">Leave a comment&nbsp;&nbsp;&raquo;</a>

</p>]]>
      </description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/531127/ben.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/5AAUEsb7vAHf</posterous:profileUrl>
        <posterous:firstName>Ben</posterous:firstName>
        <posterous:lastName>Prew</posterous:lastName>
        <posterous:nickName>benprew</posterous:nickName>
        <posterous:displayName>Ben Prew</posterous:displayName>
      </posterous:author>
    </item>
    <item>
      <pubDate>Sat, 25 Jun 2011 22:53:00 -0700</pubDate>
      <title>The price effect of banning Jace and Stoneforge Mystic in Standard</title>
      <link>http://benprew.posterous.com/the-price-effect-of-banning-jace-and-stonefor</link>
      <guid>http://benprew.posterous.com/the-price-effect-of-banning-jace-and-stonefor</guid>
      <description>
        <![CDATA[<p>
	<p>A few days ago, <a href="http://mtg.throwingbones.com/card/33477">Jace, the Mind Sculptor</a> and <a href="http://mtg.throwingbones.com/card/33287">Stoneforge Mystic</a> were banned in Standard by Wizards. What is unexpected is that the price for a <a href="http://mtg.throwingbones.com/card/33372">Foil Jace</a> is still high, close to $200. This may be because there are not many for sale, so their value remains high, or it may be a delayed effect of the banning; we will have to watch and see.</p>

<p>Jace is used in other formats, so his price may stay stable, but you
can see that both the <a href="http://mtg.throwingbones.com/card/33287">Stoneforge Mystic</a> and <a href="http://mtg.throwingbones.com/card/33432">Foil Stoneforge Mystic</a> are down since June 19th (46% and 30%, respectively).</p>
	
</p>

<p><a href="http://benprew.posterous.com/the-price-effect-of-banning-jace-and-stonefor">Permalink</a> 

	| <a href="http://benprew.posterous.com/the-price-effect-of-banning-jace-and-stonefor#comment">Leave a comment&nbsp;&nbsp;&raquo;</a>

</p>]]>
      </description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/531127/ben.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/5AAUEsb7vAHf</posterous:profileUrl>
        <posterous:firstName>Ben</posterous:firstName>
        <posterous:lastName>Prew</posterous:lastName>
        <posterous:nickName>benprew</posterous:nickName>
        <posterous:displayName>Ben Prew</posterous:displayName>
      </posterous:author>
    </item>
    <item>
      <pubDate>Fri, 03 Jun 2011 17:30:00 -0700</pubDate>
      <title>Matching items to a known corpus</title>
      <link>http://benprew.posterous.com/matching-items-to-a-known-corpus</link>
      <guid>http://benprew.posterous.com/matching-items-to-a-known-corpus</guid>
      <description>
        <![CDATA[<p>
	<p>As part of <a href="http://mtg.throwingbones.com">MTG Card Prices</a>, we have to be able to identify which cards are part of an auction. If you could only list a single card on eBay, this would be relatively simple, but it is complicated by the fact that you can list: <p /> </p>
<ul>
<li>Multiple different cards together</li>
<li>Different editions of the card (10th ed, M11, M10, etc.)</li>
<li>Foil versions of some cards</li>
<li>Foreign editions</li>
<li>Altered cards</li>
</ul>
<p>And so on. For some of the more difficult cases (multiple cards together, altered cards, foreign cards), I am just excluding them. If a person is buying 2 Jaces and a Tezzeret, it's hard to get an idea of how that person is valuing those cards individually. <p /> For the other cases, I have a database of cards I've created using the Gatherer search engine and some Internet digging. That database is used as my "corpus" and auctions have to be matched to it. <p /> Every day, about 10k auctions are listed that we need to try to match to our corpus. Also, as the matching logic evolves or new cards become available, we need to be able to rematch days or even weeks of historical auctions. <p /> I started by breaking up the card and cardset name into their keywords, so a card like "Tezzeret, Agent of Bolas" from "Mirrodin Besieged" would become "tezzeret agent bolas mirrodin besieged". These words were then put into a hash table, mapping each keyword to the card #s that contain it. Something like this:</p>
<p>{<br />&nbsp; &nbsp; mirrodin =&gt; [ 1, 2, 3 ],<br />&nbsp; &nbsp; besieged =&gt; [ 1, 2, 3 ],<br />&nbsp; &nbsp; tezzeret =&gt; [ 1 ],<br />&nbsp; &nbsp; agent =&gt; [ 1 ],<br />&nbsp; &nbsp; bolas =&gt; [ 1 ],<br />}</p>
<p>Then, you would break the auction into keywords and add a point to each card that has that keyword. (For example, the auction "Tezzeret from Mirrodin, like NEW!!!!" would break into "tezzeret from mirrodin like new", and when you add up the keywords, card 1 would have 2 points, while cards 2 and 3 would only have 1 point each.) <p /> Then, you would order the cards by score, and if the top score met some threshold, you'd match the auction to the card with the most points. <p /> This was a fine solution to the problem and generated good "guesses" if a card couldn't be matched. <p /> Unfortunately, it was too slow for my needs (5 auctions/sec), requiring n hash lookups for each auction (where n is the # of keywords in an auction), and it took up a fair amount of memory. <p /> Doing a little research, I came across a data structure called a trie that I thought would work well. (Yes, it's spelled "trie".) And since I'm using Ruby, there's a gem for it. <p /> After doing a little testing, I was achieving similar auto-matching %'s, but had more than doubled my performance (12 auctions/sec). <p /> Unfortunately, the trie structure was even more memory intensive than the previous solution. But it had pointed me in the right direction. Instead of storing the trie a character at a time, I built a trie that stores an entire word at a time. I also used hashes to store the data, instead of creating a trie node each time. <p /> With a little monkey-patching of the Hash class, the result was much improved memory usage and incredibly fast matching (170+ auctions/sec)!</p>
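<p>The first approach (keyword hash plus point scoring) can be sketched roughly as follows. This is a minimal illustration, not the site's actual matcher: the three-card corpus, the stopword list, the helper names, and the threshold of 2 are all assumptions made for the example.</p>

```ruby
# Hypothetical mini-corpus: card id => card name plus set name.
CARDS = {
  1 => 'Tezzeret, Agent of Bolas Mirrodin Besieged',
  2 => 'Inkmoth Nexus Mirrodin Besieged',
  3 => 'Hero of Bladehold Mirrodin Besieged'
}.freeze

# Assumption: small words such as "of" are dropped, as in the example above.
STOPWORDS = %w[of the a].freeze

# Lowercase the text, split it into words, and drop duplicates and stopwords.
def keywords(text)
  text.downcase.scan(/[a-z0-9]+/).uniq - STOPWORDS
end

# Build the hash of keyword => list of card ids containing that keyword.
def build_index(cards)
  index = Hash.new { |hash, word| hash[word] = [] }
  cards.each do |id, name|
    keywords(name).each { |word| index[word].push(id) }
  end
  index
end

# Give each card one point per shared keyword and return the id of the
# top-scoring card, or nil when the best score is below the threshold.
def best_match(index, auction_title, threshold: 2)
  scores = Hash.new(0)
  keywords(auction_title).each do |word|
    index.fetch(word, []).each { |id| scores[id] += 1 }
  end
  id, score = scores.max_by { |_id, points| points }
  score && score >= threshold ? id : nil
end

index = build_index(CARDS)
# Card 1 scores 2 points (tezzeret, mirrodin); cards 2 and 3 score 1 each.
p best_match(index, 'Tezzeret from Mirrodin, like NEW!!!!')
```

<p>Each auction costs one hash lookup per keyword plus one counter update per candidate card, which is the per-auction work described above.</p>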
<p>&nbsp;</p>
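<p>A word-at-a-time trie stored in plain nested hashes might look something like the sketch below. This is only one possible reading of the idea: the real implementation monkey-patches Hash and is not shown in this post, so the function names, the :card_id marker, and the longest-match rule here are all assumptions.</p>

```ruby
# Marker key for "a card name ends here". It is a symbol, so it cannot
# collide with the string keywords used as child keys.
TERMINAL = :card_id

# Insert a card's keyword sequence as a path of nested hashes.
def trie_insert(trie, words, card_id)
  node = words.reduce(trie) { |current, word| current[word] ||= {} }
  node[TERMINAL] = card_id
  trie
end

# Walk the auction's words from every starting position and return the
# card id of the longest complete path found, or nil if there is none.
def trie_match(trie, words)
  best = nil
  best_length = 0
  words.each_index do |start|
    node = trie
    words.drop(start).each_with_index do |word, offset|
      node = node[word]
      break unless node
      if node.key?(TERMINAL) && offset + 1 > best_length
        best = node[TERMINAL]
        best_length = offset + 1
      end
    end
  end
  best
end

trie = {}
trie_insert(trie, %w[tezzeret agent bolas], 1)
trie_insert(trie, %w[inkmoth nexus], 2)
p trie_match(trie, %w[foil tezzeret agent bolas nm])
```

<p>Each step down the trie is a single hash lookup on a whole word, and a walk stops as soon as a word has no child node. Note that this sketch only matches exact word sequences, so it gives up the partial "guesses" of the point-scoring approach.</p>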
	
</p>

<p><a href="http://benprew.posterous.com/matching-items-to-a-known-corpus">Permalink</a> 

	| <a href="http://benprew.posterous.com/matching-items-to-a-known-corpus#comment">Leave a comment&nbsp;&nbsp;&raquo;</a>

</p>]]>
      </description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/531127/ben.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/5AAUEsb7vAHf</posterous:profileUrl>
        <posterous:firstName>Ben</posterous:firstName>
        <posterous:lastName>Prew</posterous:lastName>
        <posterous:nickName>benprew</posterous:nickName>
        <posterous:displayName>Ben Prew</posterous:displayName>
      </posterous:author>
    </item>
    <item>
      <pubDate>Tue, 16 Nov 2010 11:49:53 -0800</pubDate>
      <title>Command line scripts with Sinatra</title>
      <link>http://benprew.posterous.com/command-line-scripts-with-sinatra</link>
      <guid>http://benprew.posterous.com/command-line-scripts-with-sinatra</guid>
      <description>
        <![CDATA[<p>
	There are times when I want to do batch processing or perform other command line activities that don't fit well into the web framework. The trouble is, my db connection info is based on what "environment" I'm working in. I could re-create the hooks to set that environment, but I'd rather leverage the existing capabilities in Sinatra to do so. That way, if they change how it works in the future, I don't have to work through all that again. <p /> In Sinatra, there's a Delegator module that handles passing requests for methods like 'production?' or 'set' off to the App. With that in mind, I can just include the Delegator and let it handle everything from there. <p /> Here's a sample of my code: <p /> require 'rubygems' <br />require 'optparse' <br />require 'sinatra/base' <br />include Sinatra::Delegator <p /> options = {} <p /> OptionParser.new do |op| <br /> op.on('-e env') { |val| set :environment, val.to_sym } <br />end.parse! <p /> You can see this on my GitHub page as well: <p /> <a href="https://github.com/benprew/mtg/blob/master/bin/match_xtns.rb">https://github.com/benprew/mtg/blob/master/bin/match_xtns.rb</a>
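<p>To show just the option-parsing half in isolation, here is a plain-Ruby sketch that uses only the standard library. It is a stand-in, not the script linked above: instead of calling Sinatra's set, it stores the parsed environment in a local variable, and the DB_CONFIG table and parse_environment helper are hypothetical.</p>

```ruby
require 'optparse'

# Hypothetical per-environment connection strings; not taken from the post.
DB_CONFIG = {
  development: 'sqlite://dev.db',
  test: 'sqlite://test.db',
  production: 'mysql://localhost/mtg'
}.freeze

# Parse "-e ENV" out of argv and return the chosen environment as a symbol.
# In the real script this is where `set :environment, val.to_sym` runs.
def parse_environment(argv, default: :development)
  environment = default
  OptionParser.new do |op|
    op.on('-e env') { |val| environment = val.to_sym }
  end.parse(argv)
  environment
end

env = parse_environment(%w[-e production])
puts DB_CONFIG.fetch(env)
```

<p>With the Delegator included, the same block can hand the value to Sinatra, so the script and the web app read their environment from one place.</p>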
	
</p>

<p><a href="http://benprew.posterous.com/command-line-scripts-with-sinatra">Permalink</a> 

	| <a href="http://benprew.posterous.com/command-line-scripts-with-sinatra#comment">Leave a comment&nbsp;&nbsp;&raquo;</a>

</p>]]>
      </description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/531127/ben.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/5AAUEsb7vAHf</posterous:profileUrl>
        <posterous:firstName>Ben</posterous:firstName>
        <posterous:lastName>Prew</posterous:lastName>
        <posterous:nickName>benprew</posterous:nickName>
        <posterous:displayName>Ben Prew</posterous:displayName>
      </posterous:author>
    </item>
    <item>
      <pubDate>Mon, 10 May 2010 12:07:22 -0700</pubDate>
      <title>Building a Login Process with Sinatra and OpenID</title>
      <link>http://benprew.posterous.com/building-a-login-process-with-sinatra-and-ope</link>
      <guid>http://benprew.posterous.com/building-a-login-process-with-sinatra-and-ope</guid>
      <description>
        <![CDATA[<p>
	<div class='p_embed p_file_embed'>
<a href="http://benprew.posterous.com/building-a-login-process-with-sinatra-and-ope"><img alt="" src="http://posterous.com/images/filetypes/pdf.png" /></a>
<div class='p_embed_description'>
<strong>Building_a_Logon_Process.pdf</strong>
<a href="http://posterous.com/getfile/files.posterous.com/benprew/DZHqqP5cnxcSXh6X8kePRLmvmrUleTJwo1Sacs0oZOMd4e0RHXnOeGum27er/Building_a_Logon_Process.pdf">Download this file</a>
</div>
</div>
<p>Slides used in a talk I gave to the Portland Ruby Brigade in May. <p /> </p>
	
</p>

<p><a href="http://benprew.posterous.com/building-a-login-process-with-sinatra-and-ope">Permalink</a> 

	| <a href="http://benprew.posterous.com/building-a-login-process-with-sinatra-and-ope#comment">Leave a comment&nbsp;&nbsp;&raquo;</a>

</p>]]>
      </description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/531127/ben.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/5AAUEsb7vAHf</posterous:profileUrl>
        <posterous:firstName>Ben</posterous:firstName>
        <posterous:lastName>Prew</posterous:lastName>
        <posterous:nickName>benprew</posterous:nickName>
        <posterous:displayName>Ben Prew</posterous:displayName>
      </posterous:author>
    </item>
    <item>
      <pubDate>Fri, 23 Apr 2010 00:04:00 -0700</pubDate>
      <title>Testing sessions with Sinatra</title>
      <link>http://benprew.posterous.com/testing-sessions-with-sinatra</link>
      <guid>http://benprew.posterous.com/testing-sessions-with-sinatra</guid>
      <description>
        <![CDATA[<p>
	<p>I've got a Sinatra-based app that relies on sessions, and I need to test them. After doing a little digging, here is the solution I came up with:</p>
<p />
<div>1. Make sure that you're setting your RACK_ENV to 'test' (I do this in my Rakefile).</div>
<div>2. Disable sessions in your app when test? is true.</div>
<div>3. When you make a request that requires a session, pass 'rack.session' =&gt; {} in the environment hash.</div>
<p />
<div>See an example of this in my app: <a href="http://github.com/benprew/picklespears">http://github.com/benprew/picklespears</a><br />
<div> </div>
<div>Here's the breakdown of the code:</div>
</div>
<p />
<div>In the Rakefile:</div>
<p />
<div><span style="font-family: Bitstream Vera Sans Mono, Courier, monospace; font-size: 12px; line-height: 17px;">  <span class="no" style="line-height: 1.4em; color: #008080; padding: 0px; margin: 0px;">ENV</span><span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">[</span><span class="s1" style="line-height: 1.4em; color: #dd1144; padding: 0px; margin: 0px;">'RACK_ENV'</span><span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">]</span> <span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">=</span> <span class="s1" style="line-height: 1.4em; color: #dd1144; padding: 0px; margin: 0px;">'test'</span></span></div>
<p />
<div>In the app (picklespears.rb):</div>
<p />
<div>
<div class="line" style="padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 1em; line-height: 1.4em; background-color: #ffffcc; margin: 0px;">
<span class="k" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">if</span> <span class="nb" style="line-height: 1.4em; color: #0086b3; padding: 0px; margin: 0px;">test</span><span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">?</span>
</div><div class="line" style="padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 1em; line-height: 1.4em; background-color: transparent; margin: 0px;">  <span class="n" style="line-height: 1.4em; padding: 0px; margin: 0px;">set</span> <span class="ss" style="line-height: 1.4em; color: #990073; padding: 0px; margin: 0px;">:sessions</span><span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">,</span> <span class="kp" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">false</span>
</div><div class="line" style="padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 1em; line-height: 1.4em; background-color: transparent; margin: 0px;"><span class="k" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">else</span></div><div class="line" style="padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 1em; line-height: 1.4em; background-color: transparent; margin: 0px;">  <span class="n" style="line-height: 1.4em; padding: 0px; margin: 0px;">set</span> <span class="ss" style="line-height: 1.4em; color: #990073; padding: 0px; margin: 0px;">:sessions</span><span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">,</span> <span class="kp" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">true</span>
</div><div class="line" style="padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 1em; line-height: 1.4em; background-color: transparent; margin: 0px;"><span class="k" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">end</span></div>
</div>
<div></div>
<div>And finally, in the test itself (test/test_player.rb):</div>
<div></div>
<div>
<div class="line" style="padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 1em; line-height: 1.4em; background-color: #ffffcc; margin: 0px;">    <span class="n" style="line-height: 1.4em; padding: 0px; margin: 0px;">post</span> <span class="s1" style="line-height: 1.4em; color: #dd1144; padding: 0px; margin: 0px;">'/player/update'</span><span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">,</span> <span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">{</span> <span class="ss" style="line-height: 1.4em; color: #990073; padding: 0px; margin: 0px;">:name</span> <span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">=&gt;</span> <span class="s1" style="line-height: 1.4em; color: #dd1144; padding: 0px; margin: 0px;">'new_name'</span> <span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">},</span> <span class="s1" style="line-height: 1.4em; color: #dd1144; padding: 0px; margin: 0px;">'rack.session'</span> <span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">=&gt;</span> <span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">{</span> <span class="ss" style="line-height: 1.4em; color: #990073; padding: 0px; margin: 0px;">:player_id</span> <span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">=&gt;</span> <span class="n" style="line-height: 1.4em; padding: 0px; margin: 0px;">player</span><span class="o" style="line-height: 1.4em; font-weight: bold; padding: 0px; margin: 0px;">.</span><span class="n" style="line-height: 1.4em; padding: 0px; margin: 0px;">id</span> <span class="p" style="line-height: 1.4em; padding: 0px; margin: 0px;">}</span>
</div><p /><p />
</div>
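<p>Why this works can be shown without Sinatra at all. The sketch below uses a bare Rack-style callable as a stand-in for the app: with no session middleware in front of it, the app sees whatever the test placed under 'rack.session'. The APP lambda, the post_with_session helper, and the player id 42 are hypothetical. Disabling sessions in test presumably matters because session middleware would otherwise put its own session object in that slot.</p>

```ruby
# Stand-in for the app: a Rack-style callable that takes the environment
# hash and returns [status, headers, body]. It is not Sinatra itself.
APP = lambda do |env|
  session = env['rack.session'] || {}
  if session[:player_id]
    [200, { 'content-type' => 'text/plain' }, ["updated player #{session[:player_id]}"]]
  else
    [302, { 'location' => '/login' }, []]
  end
end

# Hypothetical test helper: build a minimal env and inject the session hash.
def post_with_session(path, session)
  env = {
    'REQUEST_METHOD' => 'POST',
    'PATH_INFO' => path,
    'rack.session' => session
  }
  APP.call(env)
end

status, _headers, body = post_with_session('/player/update', { player_id: 42 })
puts status
puts body.first

# Without a player in the session, the stand-in app redirects to the login page.
anonymous_status, anonymous_headers, = post_with_session('/player/update', {})
puts anonymous_status
```

<p>Rack::Test's post helper takes the same kind of environment hash as its third argument, which is what the test line above relies on.</p>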

	
</p>

<p><a href="http://benprew.posterous.com/testing-sessions-with-sinatra">Permalink</a> 

	| <a href="http://benprew.posterous.com/testing-sessions-with-sinatra#comment">Leave a comment&nbsp;&nbsp;&raquo;</a>

</p>]]>
      </description>
      <posterous:author>
        <posterous:userImage>http://files.posterous.com/user_profile_pics/531127/ben.jpg</posterous:userImage>
        <posterous:profileUrl>http://posterous.com/users/5AAUEsb7vAHf</posterous:profileUrl>
        <posterous:firstName>Ben</posterous:firstName>
        <posterous:lastName>Prew</posterous:lastName>
        <posterous:nickName>benprew</posterous:nickName>
        <posterous:displayName>Ben Prew</posterous:displayName>
      </posterous:author>
    </item>
  </channel>
</rss>

