<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>pokeit method</title>
	
	<link>http://pokeitmethod.com</link>
	<description>poker + econometrics</description>
	<lastBuildDate>Tue, 20 Apr 2010 02:12:03 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/pokeit_method" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="pokeit_method" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>Pokeit Video Demos!</title>
		<link>http://pokeitmethod.com/2010/03/pokeit-video-demos/</link>
		<comments>http://pokeitmethod.com/2010/03/pokeit-video-demos/#comments</comments>
		<pubDate>Sat, 06 Mar 2010 19:05:55 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=299</guid>
		<description><![CDATA[Just finished recording a demo of the Pokeit pre-flop equity model.
The first part gives an overview of the Pokeit equity model and a brief discussion of how it can be used to estimate your opponent&#8217;s hand range and your win probability against that range:

In part 2 we use Pokeit to size up a tight [...]]]></description>
			<content:encoded><![CDATA[<p>Just finished recording a demo of the Pokeit pre-flop equity model.</p>
<p>The first part gives an overview of the Pokeit equity model and a brief discussion of how it can be used to estimate your opponent&#8217;s hand range and your win probability against that range:</p>
<p><object style="width: 629px; height: 518px;" classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="629" height="518" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://www.youtube.com/v/Zq4mh42PfLo" /><embed style="width: 629px; height: 518px;" type="application/x-shockwave-flash" width="629" height="518" src="http://www.youtube.com/v/Zq4mh42PfLo"></embed></object></p>
<p><span id="more-299"></span>In part 2 we use Pokeit to size up a tight situation where we have to judge whether we are ahead in a hand or behind.</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="629" height="518" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="src" value="http://www.youtube.com/v/YwPyDAdHXNM" /><embed type="application/x-shockwave-flash" width="629" height="518" src="http://www.youtube.com/v/YwPyDAdHXNM"></embed></object></p>
<p>Enjoy.</p>
<p>-chaz</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2010/03/pokeit-video-demos/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Pokeit Logos</title>
		<link>http://pokeitmethod.com/2010/02/pokeit-logos/</link>
		<comments>http://pokeitmethod.com/2010/02/pokeit-logos/#comments</comments>
		<pubDate>Wed, 03 Feb 2010 19:29:20 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=281</guid>
		<description><![CDATA[Blue for the product

White for the blog.

Both Bryon Zandt exclusives. What a baller.
]]></description>
			<content:encoded><![CDATA[<p style="text-align: left;">Blue for the product</p>
<p style="text-align: left;"><a href="http://pokeitmethod.com/wp-content/uploads/2010/02/pokeit-blue1.jpg"><img class="aligncenter size-full wp-image-284" title="pokeit blue" src="http://pokeitmethod.com/wp-content/uploads/2010/02/pokeit-blue1.jpg" alt="" width="632" height="305" /></a></p>
<p style="text-align: left;"><span id="more-281"></span>White for the blog.</p>
<p style="text-align: center;"><a href="http://pokeitmethod.com/wp-content/uploads/2010/02/pokeit-white.jpg"><img class="alignnone size-full wp-image-285" title="pokeit white" src="http://pokeitmethod.com/wp-content/uploads/2010/02/pokeit-white.jpg" alt="" width="571" height="338" /></a></p>
<p style="text-align: left;">Both Bryon Zandt exclusives. What a baller.</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2010/02/pokeit-logos/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pokeit Letter #4 – What is the Probability of Getting Head on First Try</title>
		<link>http://pokeitmethod.com/2009/12/pokeit-letter-4-%e2%80%93-what-is-the-probability-of-getting-head-on-first-try/</link>
		<comments>http://pokeitmethod.com/2009/12/pokeit-letter-4-%e2%80%93-what-is-the-probability-of-getting-head-on-first-try/#comments</comments>
		<pubDate>Sun, 20 Dec 2009 22:41:46 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=211</guid>
		<description><![CDATA[
This question, posed to a class full of undergraduates, wasn’t interpreted quite the way Mustafa had intended. At the time, I was taking Stat 10 to satisfy a pre-requisite for my hastily put together plan to switch majors and apply to UNC’s business school. See, it turned out that physics just wasn’t my calling. Perhaps [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><img class="size-full wp-image-214 aligncenter" title="Mustafa_1" src="http://pokeitmethod.com/wp-content/uploads/2009/12/Mustafa_1.JPG" alt="Mustafa_1" width="464" height="145" /></p>
<p>This question, posed to a class full of undergraduates, wasn’t interpreted quite the way Mustafa had intended. At the time, I was taking Stat 10 to satisfy a pre-requisite for my hastily put together plan to switch majors and apply to UNC’s business school. See, it turned out that physics just wasn’t my calling. Perhaps it was Dr. Yu Wu’s prickly demeanor in Physics 26, or the fact that Dr. Hernandez in Physics 27 was an asshole. Whatever it was, my grades in physics had not been compelling. By the time the second Physics 27 midterm rolled around, I knew I had to stop grinding glass into my eyes. I spoke with my academic advisor, and by the end of the day, I had signed up for STAT 10 and ECON 10 in the Spring. Next semester, Mustafa Tural introduced me to statistics.</p>
<p><span id="more-211"></span></p>
<p>You’d be hard pressed to find a more apathetic group of people just going through the motions than the one taking STAT 10 in the Spring of 2005. Let’s be honest, no one is there because they’ve got a passion for Bayes. You’re in STAT 10 because you want to get into the B-School. You want to get into the B-School because you want a well-paying job out of college. You want a well-paying job out of college because, well, you want to make the big bucks, and you don’t care enough about school to get a graduate degree (and no, an MBA doesn’t count).</p>
<p>I think I may have eked out a B in the class &#8211; maybe. I know of at least 4 times when I was more than 20 minutes late to that class because I was finishing my homework in the hallway. While I was reviewing a particularly shitty test with Mustafa during office hours, he asked me a pointed question:</p>
<p>“Do you care at all?”</p>
<p>I think my reply was “What?”</p>
<p>So as penance for not paying attention in STAT 10, I’ve decided to do my part and teach it to all of you. So let’s get to it. Consider this post a crash course in STAT 10 for poker players.</p>
<p><strong>Descriptive statistics<a href="#_ftn1"><strong>[1]</strong></a></strong></p>
<p>First things first, we need to distinguish between two concepts: the <em>population</em> is the entire group that we wish to study, while the <em>sample</em> is the subset of the population for which we have information. Ideally the <em>sample</em> is an unbiased representation of the population drawn at random, but <a href="http://pokeitmethod.com/2009/12/pokeit-letter-3-%E2%80%93-cum-hoc-ergo-propter-hoc/" target="_blank">as we have already touched on with regards to observed hands</a>, we can’t assume this to be the case.</p>
<p>When faced with a bunch of numbers, in either a population or a sample, we often look for simple ways to summarize the data. For example, Group A contains chip counts at an arbitrary table on Day 1 of the WSOP Main Event while Group B contains chip counts at that same table on Day 7 (chip counts are in thousands of dollars):</p>
<p style="padding-left: 30px;"><strong>Group A &#8211; Day 1: </strong>$10.5, $7.8, $17.0, $11.0, $8.9, $9.5, $10.2, $23.4, $25.8<br />
<strong>Group B &#8211; Day 7: </strong>$1,400, $503, $2,500, $5,230, $980, $1,900, $7,201, $3,290, $1,309</p>
<p>We can identify two general differences between these two groups. Group B tends to have much higher chip counts, and Group A tends to be clustered closely together while Group B is spread out. These two statements capture the concepts of “central tendency”, and “spread”.</p>
<p>Measurements of “central tendency” express whether the numbers tend to be high or low. The most common of these are:</p>
<p style="padding-left: 30px;"><strong>Mean: </strong>The average value<br />
<strong>Median: </strong>the middle value<br />
<strong>Mode: </strong>the most common value (In practice the mode is useless)</p>
<p>The mean and the median of a population will be different if the distribution is “skewed”, meaning that there are larger (or smaller) gaps between values at the high end than at the low end. For example, the distribution of income is very skewed: the income of the wealthiest people differs by billions of dollars, while the income of the poorest people differs by pennies. Because of this, “mean income” might be a slightly misleading indicator, since a few wealthy people can pull the average up, so that most people actually have income below average. The median addresses this issue by reporting the income of the person right in the middle of the distribution. In the March 2005 Current Population Survey, mean household income was $61,905. However, 63% of households earned less than the average. The median income was $46,400.</p>
<p>The second characteristic, “spread”, captures whether observations are clustered closely together or spread apart. The descriptive statistics most often used to describe “spread” are the <strong>variance</strong> and the <strong>standard deviation</strong>. The standard deviation of a group measures the typical distance between each observation and the mean (formally, the square root of the average squared distance from the mean), and the variance is just the standard deviation, squared.</p>
<p>A third concept, <strong>skewness</strong>, refers to whether the gaps at the top of the distribution are larger or smaller than those at the bottom. Skewness, however, is not synonymous with “biased”.</p>
<p>The <strong>maximum </strong>and <strong>minimum </strong>of a sample or population should be self-evident. Finally the Xth <strong>percentile</strong> refers to the value that X% of the group lies below. For example the median is exactly the same thing as the 50<sup>th</sup> percentile.</p>
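<p>To make these summary statistics concrete, here is a minimal Python sketch that computes them for the Group A and Group B chip counts above. It treats each table as a complete population (so it uses the population formulas rather than the sample ones), which is an assumption on my part; the function name <em>summarize</em> is just for illustration.</p>

```python
import statistics

# Chip counts (in thousands) copied from the Group A / Group B example.
group_a = [10.5, 7.8, 17.0, 11.0, 8.9, 9.5, 10.2, 23.4, 25.8]
group_b = [1400, 503, 2500, 5230, 980, 1900, 7201, 3290, 1309]


def summarize(values):
    """Return the descriptive statistics discussed above as a dict."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        # Population standard deviation: the square root of the average
        # squared distance between each observation and the mean.
        "std_dev": statistics.pstdev(values),
        # Population variance: the standard deviation, squared.
        "variance": statistics.pvariance(values),
        "min": min(values),
        "max": max(values),
    }


summary_a = summarize(group_a)
summary_b = summarize(group_b)

for name, summary in (("Group A", summary_a), ("Group B", summary_b)):
    print(name)
    for stat, value in summary.items():
        print(f"  {stat}: {value:,.2f}")
```

<p>In both groups the mean comes out above the median, which is the signature of a few big stacks pulling the average up.</p>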
<p><strong>Probability</strong></p>
<p>In probability, an <strong>event</strong> is something determined by chance that either does or does not happen. An event can be described as <strong>simple</strong>, meaning that there is only one way to achieve the outcome, or <strong>complex</strong>, meaning that there are a number of simple events that would satisfy the condition.</p>
<p>Let’s use a standard 52 card deck to illustrate this principle. There are four suits in a deck: spades (<img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" />), hearts (<img class="alignnone size-full wp-image-229" title="heart" src="http://pokeitmethod.com/wp-content/uploads/2009/12/heart.jpg" alt="heart" width="13" height="13" />), diamonds (<img class="alignnone size-full wp-image-230" title="diamond" src="http://pokeitmethod.com/wp-content/uploads/2009/12/diamond.jpg" alt="diamond" width="11" height="13" />), and clubs (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />), and there are 13 unique card ranks in each suit: 2, 3, 4, 5, 6, 7, 8, 9, 10, J, Q, K, A.</p>
<p>An outcome in this context is any single card. If you were to deal one card from the deck, a simple event might be dealing the K <img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />, since there is only one of them. A complex event would be dealing any club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />), since there are 13 different cards that would satisfy this condition.</p>
<p>Formally, let S denote the space of all possible outcomes. Any event is a subset of S. We use letters like <em>A</em> and <em>B</em> to denote generic events while <em>¬A</em> and <em>¬B</em> will denote the <strong>complement</strong> of <em>A</em> or <em>B</em>. The complement contains all the things in S that are not part of the event and it can be thought of as the opposite of any event. If <em>A</em> is the event that the dealt card’s suit is a “club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />)” then <em>¬A</em> is the event that the card’s suit is “not a club (<img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" />,<img class="alignnone size-full wp-image-229" title="heart" src="http://pokeitmethod.com/wp-content/uploads/2009/12/heart.jpg" alt="heart" width="13" height="13" />,<img class="alignnone size-full wp-image-230" title="diamond" src="http://pokeitmethod.com/wp-content/uploads/2009/12/diamond.jpg" alt="diamond" width="11" height="13" />)”. The <strong>union</strong> of two events, <em>A</em><img class="alignnone size-full wp-image-227" title="union" src="http://pokeitmethod.com/wp-content/uploads/2009/12/union.jpg" alt="union" width="14" height="10" /><em>B</em>, consists of all outcomes that satisfy one event or the other (or both), while the <strong>intersection</strong>, <em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B</em>, consists of all outcomes satisfying both conditions. For practical purposes you can read them as:</p>
<p style="padding-left: 30px;">The <strong>union: </strong>(­<em>A</em><img class="alignnone size-full wp-image-227" title="union" src="http://pokeitmethod.com/wp-content/uploads/2009/12/union.jpg" alt="union" width="14" height="10" /><em>B) </em>as “<em>A</em> or <em>B”</em>. This is the more inclusive set of events containing any event that is either A or B<br />
The <strong>intersection</strong>: (­<em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B)</em> as “<em>A </em>and <em>B”</em>. This is the more exclusive set of events containing only events that are both A and B</p>
<p>For example, let’s deal one card from a 52 card deck. Here, <em>A</em> is the event that the dealt card is a “club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />)” and <em>B</em> is the event that the dealt card is a “King (K)”. We can then write the following:</p>
<p><img class="aligncenter size-full wp-image-212" title="union_table" src="http://pokeitmethod.com/wp-content/uploads/2009/12/union_table.jpg" alt="union_table" width="578" height="281" /></p>
<p>A <strong>probability measure</strong> is a function P[A] that tells us the fraction of times that an event occurs. The probability measure must satisfy three properties:</p>
<p style="padding-left: 30px;">1) 0 ≤ P[A] ≤ 1<br />
2) P[S] = 1<br />
3) P[¬<em>A</em>] = 1 – P[A]</p>
<p>The first property states that the probability cannot be negative, and it cannot exceed one. The second property requires that <em>something </em>in the space of all possible outcomes will occur, and the third property states that if the chance that <em>A</em> happens is <em>X</em>, then the chance that <em>A</em> doesn’t happen is 1-<em>X</em>.</p>
<p>The chance of any complex event  occurring can be found by adding up the probabilities of all the simple events contained in the complex event. There is a 1/52 chance that a particular card is dealt. The probability that the card is a “club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />)” thus is P[A] = P[2<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />] + P[3<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />] + … + P[A<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />] = 13/52 = 1/4.</p>
<p>If we already know the probability that some complex events occur, and we want to calculate the chance that their union occurs, we cannot simply add the probabilities together. For example, there is a 13/52 chance that the dealt card is a “club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />)”, and there is a 4/52 chance that the dealt card is a “King (K)”. The chance that the card is “a club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />) or a King (K)” is <em>not</em> 13/52 + 4/52 = 17/52. If we take a look at our deck of cards in S, 16 of the 52 outcomes are either a club or a king. This should be the chance that <em>A</em><img class="alignnone size-full wp-image-227" title="union" src="http://pokeitmethod.com/wp-content/uploads/2009/12/union.jpg" alt="union" width="14" height="10" /><em>B</em> occurs. By simply adding P[A] to P[B], we have double counted the one outcome that is both a club (<img class="alignnone size-full wp-image-231" title="club" src="http://pokeitmethod.com/wp-content/uploads/2009/12/club.jpg" alt="club" width="13" height="13" />) and a King (K). The correct calculation of the probability is:</p>
<p style="padding-left: 30px;">P[­<em>A</em><img class="alignnone size-full wp-image-227" title="union" src="http://pokeitmethod.com/wp-content/uploads/2009/12/union.jpg" alt="union" width="14" height="10" /><em>B</em>] = P[A] + P[B] – P[­­<em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B</em>]</p>
<p>When two events do not intersect, we say that the events are disjoint or mutually exclusive. For example being dealt a King and being dealt an Ace are mutually exclusive events, since there is no outcome satisfying both conditions. In this special case, the probability that one or the other occurs is simply their sum:</p>
<p style="padding-left: 30px;">P[­<em>A</em><img class="alignnone size-full wp-image-227" title="union" src="http://pokeitmethod.com/wp-content/uploads/2009/12/union.jpg" alt="union" width="14" height="10" /><em>B</em>] = P[A] + P[B] if <em>A</em> and <em>B</em> are mutually exclusive events</p>
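<p>If you would rather count cards than trust the algebra, the rules above can be checked by brute force. The following is a rough Python sketch, not anything from the Pokeit model itself: the deck representation and the helper <em>prob</em> are my own illustrative choices, and it assumes every card is equally likely to be dealt.</p>

```python
from fractions import Fraction
from itertools import product

RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["spades", "hearts", "diamonds", "clubs"]

# S: the space of all possible outcomes when one card is dealt.
S = set(product(RANKS, SUITS))


def prob(event):
    """P[event], assuming every card in S is equally likely."""
    return Fraction(len(event), len(S))


A = {card for card in S if card[1] == "clubs"}  # event: the card is a club
B = {card for card in S if card[0] == "K"}      # event: the card is a King
not_A = S - A                                   # complement of A

# Union via direct counting versus the formula P[A] + P[B] - P[A and B].
p_union = prob(A | B)
p_by_formula = prob(A) + prob(B) - prob(A & B)

# Kings and Aces are mutually exclusive, so their probabilities simply add.
aces = {card for card in S if card[0] == "A"}
p_king_or_ace = prob(B | aces)

print("P[club]         =", prob(A))
print("P[not club]     =", prob(not_A))
print("P[club or King] =", p_union)
print("P[King or Ace]  =", p_king_or_ace)
```

<p>Using exact fractions instead of decimals keeps the results directly comparable to the hand calculations in the text.</p>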
<p><strong>Conditional probability</strong></p>
<p>Suppose you are playing a game of Hold’em. You are in the big blind and everyone folds around to the player in the small blind. The player in the small blind reaches for his chips, but in the process he accidentally flips over one of his cards, exposing the A<img class="alignnone size-full wp-image-229" title="heart" src="http://pokeitmethod.com/wp-content/uploads/2009/12/heart.jpg" alt="heart" width="13" height="13" />. Now, given that he was dealt one Ace, you want to know the probability that he’s holding pocket aces. Stated another way, you&#8217;re interested in the <strong>conditional probability</strong> that your opponent is holding two aces <em>given</em> that the card you have seen is an ace.</p>
<p>If we know that event B has occurred (opponent dealt one ace), we can use this information to revise our expectations about A (opponent holds pocket aces). The probability of “A conditional on B” or the probability of A given B is always calculated as:</p>
<p style="padding-left: 30px;">P[<em>A</em>|<em>B</em>] = P[­­<em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B</em>]/P[<em>B</em>]</p>
<p>Here, P[<em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B</em>] is the probability that both the first card and the second card are aces. We consult the internet and find out that the probability of being dealt pocket aces is 12/2652. (The total number of possible hands pre-flop is 52 × 51 = 2652 when the order of the two cards matters; this is commonly reduced to 1326, since the order in which the two cards are dealt is not important.) P[<em>B</em>] is just the probability of pulling an ace out of a 52 card deck, so P[<em>B</em>] = 4/52. Therefore, given that the first card is an ace, the probability of our opponent holding pocket aces is P[<em>A</em>|<em>B</em>] = (12/2652)/(4/52) = 3/51. This is in fact a nice, intuitive result: once one ace has been dealt to our opponent, there are 3 left in the deck of 51 remaining cards.</p>
<p>We say that two events are <strong>independent</strong> if P[<em>A|B</em>] = P[A]; in other words, knowing <em>B</em> does not help us revise our probabilities that <em>A</em> occurred. In this example, “first card ace” and “second card ace” are <em>not</em> independent, since P[second card ace] = 4/52, while P[second card ace | first card ace] = 3/51.</p>
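<p>The pocket aces numbers can also be verified by listing every ordered two-card deal and counting. This is only a sketch under the assumption that all 2652 ordered deals are equally likely; the card encoding (rank character followed by suit character) is an arbitrary choice of mine.</p>

```python
from fractions import Fraction
from itertools import permutations

RANKS = "23456789TJQKA"
SUITS = "shdc"
DECK = [rank + suit for rank in RANKS for suit in SUITS]

# Every ordered (first card, second card) deal: 52 * 51 = 2652 outcomes.
deals = list(permutations(DECK, 2))

first_is_ace = [d for d in deals if d[0][0] == "A"]
second_is_ace = [d for d in deals if d[1][0] == "A"]
both_aces = [d for d in deals if d[0][0] == "A" and d[1][0] == "A"]

p_first = Fraction(len(first_is_ace), len(deals))    # P[first card ace]
p_second = Fraction(len(second_is_ace), len(deals))  # P[second card ace]
p_both = Fraction(len(both_aces), len(deals))        # P[pocket aces]

# Conditional probability: P[both aces | first card ace].
p_both_given_first = p_both / p_first

print("ordered deals:", len(deals))
print("P[pocket aces] =", p_both)
print("P[pocket aces | first card ace] =", p_both_given_first)
# Not independent: seeing an ace first changes the odds for the second card.
print("independent?", p_both_given_first == p_second)
```

<p>Note that Python reduces the fractions automatically, so 3/51 is shown in lowest terms.</p>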
<p>If we want to calculate the probability that an intersection occurs (that both <em>A</em> and <em>B</em> happen), we rearrange the formula for conditional probability so that:</p>
<p style="padding-left: 30px;">P[­­<em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B</em>] = P[<em>A</em>|<em>B</em>] × P[<em>B</em>], in general; and<br />
P[­­<em>A</em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>B</em>] = P[<em>A</em>] × P[<em>B</em>], if the events are independent.</p>
<p>The last rule is incredibly useful. Bill Chen &amp; Jerrod Ankenman put it to work in their Theory of Doubling Up, which is outlined in their epic tome, “The Mathematics of Poker”. The theory, which is used to estimate the probability of winning a tournament, goes something like this:</p>
<p>Consider a winner-take-all tournament. Setting aside skill considerations, a player’s equity in the tournament is proportional to his chip stack. If we make a further assumption that the chance of a player doubling his chip stack is constant P[C] = 50% throughout the tournament, then the probability of him winning the tournament is P[C]<em><sup>N</sup></em> where <em>N</em> is the number of times the player must double up. In a four person tournament, N=2 (a player must double his stack twice to have all the chips in play). Thus the probability of winning the tournament is</p>
<p style="padding-left: 30px;">P[<em>C<sub>1</sub></em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>C<sub>2</sub></em>] = P[<em>C<sub>1</sub></em>] × P[<em>C<sub>2</sub></em>] = (1/2) × (1/2) = 1/4</p>
<p>Likewise, the probability of winning a 128 person tournament where one must double up 7 times (1-2-4-8-16-32-64-128) is equal to</p>
<p style="padding-left: 30px;">P[­­<em>C<sub>1</sub></em><em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" />C<sub>2</sub></em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>C<sub>3</sub></em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>C<sub>4</sub></em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>C<sub>5</sub></em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>C<sub>6</sub></em><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /><em>C<sub>7</sub></em>] = (1/2) <em><sup> 7</sup></em> = 1/128</p>
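<p>The doubling-up arithmetic is easy to wrap in a few lines of Python. Treat this as a toy sketch of the simplified version described above, not as Chen and Ankenman's full treatment: the function name <em>win_probability</em> is made up, and it assumes equal starting stacks, a field size that is a power of two, and independent double-ups with a constant probability.</p>

```python
def win_probability(num_players, p_double=0.5):
    """Chance of winning a winner-take-all tournament under the
    constant-doubling assumption.

    Assumes equal starting stacks and a power-of-two field, so the number
    of double-ups needed is log2(num_players).
    """
    if num_players < 1:
        raise ValueError("need at least one player")
    # For a power of two, bit_length() - 1 is exactly log2(num_players).
    doubles_needed = num_players.bit_length() - 1
    if 2 ** doubles_needed != num_players:
        raise ValueError("this sketch only handles power-of-two field sizes")
    # Independent double-ups: multiply the probabilities together.
    return p_double ** doubles_needed


for field in (4, 128):
    print(f"{field} players: {win_probability(field)}")
```

<p>With a 50% chance per double-up, the answer is just one over the number of players, which is what you would expect if skill is ignored.</p>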
<p>When we have repeated, independent trials, this rule is convenient for calculating the probability that <em>A</em> and <em>B</em> occur. If we want to know instead the chance that <em>A</em> or <em>B</em> occurs, we have to combine several of our rules. Let’s say we’re holding a spade flush draw on the flop, and we want to know the probability that either the turn or the river brings a fifth spade. Thus:</p>
<p style="padding-left: 30px;">P[(­­turn = <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>) or (river = <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)]<br />
= 1 – P[(¬{(­­turn = <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>) or (river = <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)}]<br />
= 1 – P[(turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>) and (river ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)]<br />
= 1 – P[(turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)] × P[(river ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)|(turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)]<br />
= 1 – (38/47) × (37/46)<br />
= <a href="http://www.texasholdem-poker.com/examples/9">0.3497</a></p>
<p style="padding-left: 30px;">QED…</p>
<p>What this is saying is that the probability that a spade comes on either the turn or the river is the same as 1 minus the probability that a spade doesn&#8217;t come on either the turn or the river. Since the probability that the river is not a spade is dependent on whether or not the turn is a spade, the intersection is calculated using the formula P[­­A<img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" />B] = P[A|B] × P[B]:</p>
<p style="text-align: left; padding-left: 30px;">P[(­­turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)<img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" />(river ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)] = P[(river ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)|(­­turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)] × P[(­­turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)]</p>
<p style="text-align: left;">P[(­­turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)] is simply the number of non-spades in the deck (47 cards in deck &#8211; 9 spades in deck = 38) divided by the number of cards in the deck (47) which equals <strong>38/47</strong>. P[(river ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)|(­­turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)] is the conditional probability that the river is not a spade given that the turn was not a spade. If the turn comes out non-spade, there are still 9 spades left in the deck, so the number of non spades left in the deck is the number of remaining cards (46) minus the number of spades (9) which is equal to 37. Divide 37 by the number of remaining cards and you get the probability that the river is a non-spade given that the turn was not a spade: <strong>37/46</strong>.</p>
<p style="padding-left: 30px;">P[(­­turn ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)<a href="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg"><img class="alignnone size-full wp-image-226" title="intersection" src="http://pokeitmethod.com/wp-content/uploads/2009/12/intersection1.jpg" alt="intersection" width="14" height="10" /></a>(river ≠ <a href="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg"><img class="alignnone size-full wp-image-222" title="spade" src="http://pokeitmethod.com/wp-content/uploads/2009/12/spade.jpg" alt="spade" width="12" height="13" /></a>)] = (38/47) × (37/46) = 0.6503<br />
1 – 0.6503 = <a href="http://www.texasholdem-poker.com/examples/9">0.3497</a></p>
<p style="text-align: left; padding-left: 30px;">QED again&#8230;</p>
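<p>The arithmetic above can be double-checked with a few lines of Python. This is only a sketch of the same calculation; the variable names are my own labels, and it assumes (as in the example) 47 unseen cards of which 9 are spades.</p>

```python
from fractions import Fraction

# Assumed setup from the worked example: 47 unseen cards, 9 of them spades.
unseen_cards = 47
spades_left = 9

# P[turn is not a spade] = 38/47
p_turn_miss = Fraction(unseen_cards - spades_left, unseen_cards)

# P[river is not a spade | turn is not a spade] = 37/46
p_river_miss = Fraction(unseen_cards - 1 - spades_left, unseen_cards - 1)

# Multiplication rule for the intersection, then the complement.
p_miss_both = p_turn_miss * p_river_miss
p_hit_flush = 1 - p_miss_both

print(round(float(p_miss_both), 4))
print(round(float(p_hit_flush), 4))
```

<p>Using exact fractions keeps the rounding until the very end, so the two printed values line up with the 0.6503 and 0.3497 worked out by hand.</p>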
<p><strong>In closing</strong></p>
<p>Seeing as many of you spent your intro to statistics class playing donkaments on PartyPoker, I thought I’d preach the truth to you in a language you’d understand. Be proud of yourselves. If you’ve made it this far, you’re now equipped with the most elementary, yet essential, knowledge of statistics. But let us not get ahead of ourselves: there is still much to learn. Bayes is looming in the background, and we have yet to resolve the little problem of biased observed hands. Keeping these challenges in mind, we shall end this lesson by remembering the solemn words of the great Barry Greenstein.</p>
<p><a href="http://www.blogcdn.com/www.cardsquad.com/media/2006/01/ace_river.jpg">Math is idiotic</a></p>
<p>-chaz</p>
<hr size="1" /><a href="#_ftnref1">[1]</a> Adapted from: Lich-Tyler, Stephen. &#8220;A Primer in Probability&#8221;. (2008)</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2009/12/pokeit-letter-4-%e2%80%93-what-is-the-probability-of-getting-head-on-first-try/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pokeit Paper #1 – Quantifying the Bias in Observed Hands</title>
		<link>http://pokeitmethod.com/2009/12/pokeit-paper-1-quantifying-the-bias-in-observed-hands/</link>
		<comments>http://pokeitmethod.com/2009/12/pokeit-paper-1-quantifying-the-bias-in-observed-hands/#comments</comments>
		<pubDate>Sun, 13 Dec 2009 22:38:57 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[papers]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=150</guid>
		<description><![CDATA[We are all familiar with the idea that the distribution of hands shown at showdown is different from the distribution of all hands dealt. The specter of this bias has confounded the poker botting community and led many to eschew estimating hand ranges altogether. While my scouring of the pokerai.org archives was not totally exhaustive, [...]]]></description>
			<content:encoded><![CDATA[<p>We are all familiar with the idea that the distribution of hands shown at showdown is different from the distribution of all hands dealt. The specter of this bias has confounded the poker botting community and led many to eschew estimating hand ranges altogether. While my scouring of the pokerai.org archives was not totally exhaustive, I don’t believe that anyone else has tried to quantify the bias in showed hands in a systematic way. In this analysis we will use two econometric models, one created from a dataset revealing every hand, and the other from a dataset limited to hands shown at showdown, to predict a player’s hand range distribution in several different game states. By comparing the showdown equity of the match-up between an arbitrary hand and the &#8216;all hands&#8217; and &#8216;showed&#8217; hand range distributions, we can estimate the bias in terms of its effect on showdown equity. In a meta-analysis of more than 45,000 game states, equity estimates derived from the dataset limited to hands shown at showdown were 1.34% ± 2.1% lower on average than those derived from the full dataset &#8211; indicating a slight upward bias in the strength of observed hands.</p>
<p><span id="more-150"></span></p>
<p><strong>A naïve look at the bias</strong></p>
<p>Observed hands are subject to selection. Players fold the majority of hands before they actually reach a showdown, and even then, the loser can muck his hand if he’s beaten. In a sample of 1,225,010 games of low-limit NL Hold’em 6-max, only 69,187 hands (5.68%) were actually observed at showdown. Because players get to choose which hands they want to play, on average they will tend to play hands with a higher expected value over those with a lower expected value. If hands with higher expected values are more likely to be played, then they are also more likely to be shown at showdown. Because this selection is non-random, it introduces a bias.</p>
<p>To refine what we mean by better or worse hands, we will use the <a href="http://www.tightpoker.com/images/books/david-sklansky.jpg" target="_blank">Sklansky Hand groups</a> to order all of the starting hands in Texas Hold’em into 9 ranked categories. While the Sklansky groups aren’t perfect, they have the admirable attribute of being well known:</p>
<p align="center"><img class="aligncenter size-full wp-image-151" title="sklansky_table" src="http://pokeitmethod.com/wp-content/uploads/2009/12/sklansky_table.jpg" alt="sklansky_table" width="491" height="173" /></p>
<p>Using my buddy Joe’s personal database, we can identify the size of the selection bias by comparing the frequency of observing each hand group in the full dataset of 219,680 hands to the frequency of observing each hand group in the limited dataset of 8,329 hands shown at showdown. Plotted out, the frequencies for each category give us the hand range distributions for showed hands vs. all hands:</p>
<p><img class="aligncenter size-full wp-image-152" title="chart_1" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_1.jpg" alt="chart_1" width="572" height="361" /></p>
<p><strong><em>Figure 1:</em></strong><em> A plot comparing the frequency of observing a hand in each of the 9 Sklansky Hand Groupings for all hands dealt vs. only those showed at a showdown. The probability distribution for all hands is indicated by the black line, while the probability distribution for shown hands is indicated by the blue line. The yellow bars show the net difference between the frequencies of each group.</em></p>
<p>The observed hand selection bias appears to be quite large. Crappy group 9 hands represent 61% of all hands dealt, but only 20% of hands that are observed at showdown. Likewise, you are more than five times as likely to observe a group 1 hand like Aces or Kings in a sample of observed hands than you are in a population with all hands revealed. The showed hands distribution is roughly flat before dipping around groups 6-8 and rising up for group 9. Meanwhile, the revealed hands distribution is <a href="http://en.wikipedia.org/wiki/Skewness" target="_blank">heavily left skewed</a>, with the majority of hands coming from the garbage group 9 category.</p>
<p>Have these results just shot a massive hole through our plan of modeling opponent hand range distributions? In a word, no.</p>
<p>The above chart is a bit misleading. While our revealed hand dataset includes hands that are showed and hands that are folded somewhere along the way, it also includes hands that are immediately folded pre-flop. When trying to model an opponent’s hand range from their actions &amp; the game state, we’re not really concerned about the hand range of a player who just folded. Slimming down our dataset to only those hands that are played is the first step towards looking specifically at the bias in the <em>conditional probability</em> generated by using only showed hands.</p>
<p><strong>Conditional probability bias</strong></p>
<p>Defining a hand range is all about conditional probability. For example, if we hold Kings in the cut-off, we may want to know the conditional probability that the guy who just raised x3bb under the gun has Aces. If A is the event that the player has Aces, and B is our set of 2 game state variables &#8211; his bet size of x3bb, and his position, UTG &#8211; then P[A|B] is the conditional probability that our opponent has Aces, given that he raised x3bb UTG.</p>
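<p>As a rough illustration of what a conditional probability like this means in data terms, here is a minimal Python sketch that estimates it by plain counting. To be clear, this is not the econometric model used in this paper, just a frequency count swapped in for illustration; the hand log, the game state labels and the function name are all made up.</p>

```python
from collections import Counter

# Hypothetical hand log: (game_state, sklansky_group) pairs. Made-up data.
observed = [
    ("UTG_raise_3bb", 1), ("UTG_raise_3bb", 1), ("UTG_raise_3bb", 2),
    ("UTG_raise_3bb", 6), ("CO_limp", 9), ("CO_limp", 7),
    ("CO_limp", 9), ("UTG_raise_3bb", 3),
]

def conditional_probability(records, state, group):
    """Estimate P[group | state] as count(group and state) / count(state)."""
    state_counts = Counter(s for s, _ in records)
    joint_counts = Counter(records)
    if state_counts[state] == 0:
        raise ValueError("game state never observed")
    return joint_counts[(state, group)] / state_counts[state]

# Share of the logged UTG x3bb raises that turned out to be group 1 hands.
print(conditional_probability(observed, "UTG_raise_3bb", 1))
```

<p>With real data the counts for rare game states get thin very quickly, which is one reason to reach for a model rather than raw frequencies.</p>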
<p>The set of B game state factors is whatever you decide to put in your model. It’s limited only by your creativity, grasp of the game, available data, and computing power. My simple pre-flop model (described in detail <a href="http://pokeitmethod.com/2009/11/pokeit-pre-flop-prototype-is-available-for-download/" target="_blank">here</a> and <a href="http://pokerai.org/pf3/viewtopic.php?f=3&amp;t=2736" target="_blank">here</a>) uses the following inputs to model the hand range:</p>
<p style="padding-left: 30px;">♦ Player action on each pre-flop ‘round’ of betting (call/check or raise)<br />
♦ Action behind the player (call, raise, 3-bet, etc.)<br />
♦ A variable that combines position with # of players at the table<br />
♦ Amount bet on each pre-flop ‘round’<br />
♦ An interaction between player action and action behind<br />
♦ An interaction between player action and position/# players<br />
♦ An interaction between player action and amount bet</p>
<p>The output variable is a categorical variable (1-9) corresponding to the groups in the Sklansky starting hand rankings. Since the model is based off of the hands of one particular player, Joe, no player type variables are needed.</p>
<p>A database of 219,680 revealed hands was used to produce a dataset of 45,039 hands which were not folded pre-flop, and a dataset of 8,329 hands which were shown at showdown. Comparing the results produced by the ‘all hands’ dataset to the results of the ‘showed’ dataset should show us just how biased showed hands are.</p>
<p><strong>Example situations</strong></p>
<p>We will examine this bias by modeling the hand ranges of a few example game states of 6-max No-Limit Hold’em:</p>
<p style="padding-left: 30px;">♦ CO limps<br />
♦ CO calls MP’s x3bb raise<br />
♦ UTG raises x3bb<br />
♦ Button 3-bets MP’s x3bb raise<br />
♦ A weighted average of all game states</p>
<p>The hand ranges derived from the ‘all hands’ and ‘showed’ datasets will be plotted for each game state with the net bias being the displacement between the two distributions.</p>
<p>Estimating an opponent’s hand range is only the first step towards quantifying the bias in showed hands. What ultimately matters when we’re at the table is how our hand holds up against our opponent’s range &#8211; on average. Multiplying the probability of our opponent holding each Sklansky hand group by the probability of our hand winning against that group, and summing across the groups, gives us our equity against our opponent’s hand range. From now on we’ll call this the range-equity. We can quantify the selection bias for any given game state by comparing the range-equity produced by the ‘all hands’ model to the range-equity produced by the ‘showed’ model.</p>
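<p>In code, the range-equity is just a probability-weighted sum over the nine groups. The Python sketch below shows the calculation under made-up numbers: the two hand ranges and the equity vector are hypothetical placeholders, not values from the models in this paper, and the function name is my own.</p>

```python
import math

# Hypothetical numbers for illustration only: one probability and one equity
# per Sklansky group (1 through 9). They are not taken from the paper's data.
all_hands_range = [0.02, 0.03, 0.05, 0.08, 0.10, 0.12, 0.21, 0.05, 0.34]
showed_range = [0.05, 0.05, 0.08, 0.10, 0.14, 0.14, 0.16, 0.08, 0.20]
equity_vs_group = [0.19, 0.30, 0.45, 0.48, 0.50, 0.52, 0.53, 0.55, 0.63]

def range_equity(hand_range, equities):
    """Sum over groups of P[opponent holds group] * P[our hand wins vs. group]."""
    if not math.isclose(sum(hand_range), 1.0, abs_tol=1e-9):
        raise ValueError("hand range probabilities must sum to 1")
    return sum(p * e for p, e in zip(hand_range, equities))

# Bias = 'showed' range-equity minus 'all hands' range-equity.
bias = range_equity(showed_range, equity_vs_group) - range_equity(
    all_hands_range, equity_vs_group
)
print(f"{bias:+.4f}")
```

<p>Because the placeholder ‘showed’ range puts more weight on the strong groups, where our assumed equity is low, the bias comes out negative, which is the same direction as the results reported below.</p>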
<p><strong>CO limps</strong></p>
<p style="text-align: center;"><img title="chart_2" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_2.jpg" alt="chart_2" width="574" height="361" /></p>
<p>Here the game state is CO limping after 2 folds behind. This is a fairly weak play as evidenced by the majority of hands coming from the lower ranked hand groups. The ‘all hands’ distribution has peaks at group 9 for 34% and at group 7 for 20.9%. The ‘showed’ model is also left skewed, but group 9 is lower by 14% and the middle peak is spread across groups 5-7. This is the first evidence that the conditional probability is biased, and it confirms our suspicion that the bias is shifted towards stronger hands. We can obtain range-equity estimates for our ‘all hands’ and ‘showed’ models by multiplying the hand-range distribution by the equity a particular hand has against each Sklansky group.</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-154" title="chart_2_table" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_2_table.jpg" alt="chart_2_table" width="583" height="56" /></p>
<p>To illustrate the breadth of the bias, we matched the distributions with the hand that generated the greatest bias in range-equity (max), the hand that generated the smallest bias (min), and an average of every hand’s equity against each Sklansky group (average). For CO limps, the range-equity of the showed dataset would be biased by a max of -1.89% if you held 44. That the bias is negative indicates that the ‘showed’ model predicts a stronger hand for the CO and a correspondingly lower equity for your hand. Take a look at the equities and it’s not hard to see why 32o generates the lowest bias. If you held 32o, your equity against each group is uniformly bad, so biases in the hand range between the groups change the range-equity very little. Meanwhile, 44 has a greater equity against group 9 (63%) than against any other group. Since the greatest bias (14%) is found in group 9, this has a large impact on the range-equity estimates. The average bias, or the range-equity bias you would get if you took an average of every hand&#8217;s equity against each Sklansky group, is -1.07%.</p>
<p><strong>CO calls MP’s x3bb raise</strong></p>
<p><img class="aligncenter size-full wp-image-155" title="chart_3" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_3.jpg" alt="chart_3" width="580" height="414" /></p>
<p>The most likely holdings for an opponent cold-calling a raise in this situation are drawing hands and playable hands that aren’t good enough for a 3-bet. Groups 3&amp;4 contain mid-pairs (99, 88), big unsuited aces (AQ, AJ) and some nice suited overs (JTs, QJs, KJs, T9s, QTs, 98s, J9s, KTs). The smaller peak around groups 6&amp;7 also contains drawing hands like the pairs 66-22, dangerous overs (AT, KT, QT) and suited connectors (86s, 76s, 54s). If we compare the distribution of the CO cold-call to the distribution of the CO limp, we see that the hump around groups 6&amp;7 is present in both but that our opponent is now more likely to be playing groups 3&amp;4 over the chaff in category 9.</p>
<p>The shift of the ‘showed’ hands distribution towards stronger hands is also quite clear for this game state. It’s almost as if the blue ‘showed’ distribution was nudged one group over from the black ‘all hands’ distribution. The hand match-up resulting in the largest bias is Q9o with -1.95%, the lowest is 32o with -0.03%, and the average is -0.87%.</p>
<p><strong>UTG raises x3bb</strong></p>
<p><strong><img class="aligncenter size-full wp-image-156" title="chart_4" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_4.jpg" alt="chart_4" width="581" height="417" /></strong></p>
<p>Here we have the classic UTG raise x3bb. This is a play that represents strength and our hand range estimates show that. The ‘all hands’ model indicates that 51% of our opponent’s likely hands are in the top 3 groups while the ‘showed’ model tells us that 61% of our opponent’s likely hands are in the top 3 groups. There is a second hump around groups 6-7 showing that our opponent may also be mixing it up with weaker cards when he raises x3bb UTG. The maximum bias of -3.89% comes from holding tens in this spot. This is to be expected since the bias is greatest in group 1, and the over pairs in that group pose the biggest problem for tens. The lowest bias of -0.47% comes from holding aces, and the average bias across all hands is -1.96%.</p>
<p><strong>Button 3-bets MP’s x3bb raise</strong></p>
<p><img class="aligncenter size-full wp-image-157" title="chart_5" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_5.jpg" alt="chart_5" width="581" height="417" /></p>
<p>The Button 3-bet is essentially a more right skewed version of the x3bb UTG open. The ‘all hands’ model tells us there&#8217;s a 65% chance that the Button has a hand from groups 1-3 and that percentage rises to 71% if we use the ‘showed’ model. The max range-equity bias of -2.33% comes from holding tens, the minimum of -0.38% comes from holding Aces and the average across all hands is -1.04%.</p>
<p><strong>Weighted average of all game states</strong></p>
<p>Our primary goal here is to estimate the overall bias created by the selection in showed hands. The bias in the range-equity of showed hands is both a function of the game state and the hole cards used in the match-up. Game states occur with varying frequencies. For instance, you might see a ton of opponents limp, but only a few 5-bets pre-flop. In order to get a picture of the overall bias in showed hands, we can take a weighted average of the hand ranges generated by all the game states we observed.</p>
<p style="text-align: center;"><img class="aligncenter" title="chart_6" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_6.jpg" alt="chart_6" width="581" height="417" /></p>
<p>The above chart shows the average hand range distribution of the 45,039 game states derived using the ‘all hands’ model and the ‘showed’ model. Multiplying each distribution by the equity of the max (TT), min (AA) and average equity of all hands gives us the range-equity for both models.<a href="#_ftn1">[1]</a></p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-159" title="chart_7" src="http://pokeitmethod.com/wp-content/uploads/2009/12/chart_7.jpg" alt="chart_7" width="575" height="335" /></p>
<p>The average range-equity bias of pocket tens is highest among hands at -2.38% ± 3.62%, while the lowest bias is for Aces at -0.26% ± 0.68% (using the 95% confidence interval, μ ± 2σ). The average equity of all hands matched with the weighted average of all the game states gives us range-equity estimates of 41.57% for ‘all hands’ and 40.83% for ‘showed’.  The resulting range-equity bias for the average hand is -1.34% ± 2.1% and the distribution is displayed in the above chart.</p>
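<p>The averaging step can be sketched in Python as follows: compute the bias for every game state first, then take the mean and a μ ± 2σ band. The list of per-state biases here is a handful of made-up values standing in for the 45,039 real ones, the function name is my own, and using the sample standard deviation is an assumption on my part.</p>

```python
import statistics

# Hypothetical per-game-state biases in percentage points
# ('showed' range-equity minus 'all hands' range-equity). Made-up values.
per_state_bias = [-1.9, -0.9, -2.0, -1.0, 0.3, -2.4, -1.1, -0.6]

def summarize_bias(biases):
    """Return (mean, mean - 2*sd, mean + 2*sd) using the sample standard deviation."""
    mu = statistics.mean(biases)
    sigma = statistics.stdev(biases)
    return mu, mu - 2 * sigma, mu + 2 * sigma

mean_bias, low, high = summarize_bias(per_state_bias)
print(f"{mean_bias:.2f}% (band {low:.2f}% to {high:.2f}%)")
```

<p>Note that a band this wide can straddle zero even when the mean is clearly negative, which is what the ± figures above are telling us.</p>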
<p><strong>Conclusion</strong></p>
<p>What’s the takeaway, you ask? Let’s begin with the caveats. This analysis was done on the hands of 1 player, and the hand range estimates derived speak only to his play in a particular situation. As of yet, it’s not clear how different strategies affect the bias of observed hands. Also, we do not know how the simplification made in assigning outcome hands to the Sklansky groups rather than to each of the 169 pre-flop hands affects our estimates. Since the model used to predict hand ranges is fairly simple in its construction, further refinement may reduce the estimated bias. These questions require further research.</p>
<p>Assuming that these caveats don’t severely alter our results, our findings indicate that the bias in observed hands is not disastrously large. Running the model on a weighted average of game states and showdown equities produced a bias of -1.34% ± 2.1%. The small size of the bias can be explained in a few ways. First, the bias in range-equity is a function of both the hand range distribution and the showdown equity of the hand matched up against the distribution. If the variation in the showdown equity is relatively low between hand groups, the bias in the hand range estimates has little effect on the resulting range-equity. Hands that come to mind are Aces, which do well against all other hands, and 32o, which is uniformly bad. Second, while a good hand is more likely to go to showdown than a bad hand, a good hand is also played differently than a bad hand. It’s one thing to say players show down better hands. It&#8217;s another thing to say that the hand range of a TAG who raises x3bb UTG against a loose table is biased since he is more likely to show down one part of his distribution over another AND a random hand’s equity against the biased parts of his distribution is sufficiently large to change a decision from being +EV to –EV. The implications of the first statement are pretty intuitive. The implications of the second statement are not.</p>
<p>In the coming weeks, I’ll be working to make the model more robust which includes estimating for all 169 hand categories rather than the nine Sklansky groups, as well as layering on additional input variables. I’ve also got a few statistical tricks in my back pocket which may help correct for a large part of this bias.</p>
<p>Stay tuned…</p>
<hr size="1" /><a href="#_ftnref1">[1]</a> When we were only looking at one game state, we could subtract the ‘all-hands’ range-equity from the ‘showed’ range-equity to derive the net bias. Since our hand ranges are the product of an average of 45,039 game states, we calculate the bias for each of the 45,039 game states first and then take the mean to get the average bias derived from the model.</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2009/12/pokeit-paper-1-quantifying-the-bias-in-observed-hands/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pokeit Letter #3 – Cum hoc ergo propter hoc!</title>
		<link>http://pokeitmethod.com/2009/12/pokeit-letter-3-%e2%80%93-cum-hoc-ergo-propter-hoc/</link>
		<comments>http://pokeitmethod.com/2009/12/pokeit-letter-3-%e2%80%93-cum-hoc-ergo-propter-hoc/#comments</comments>
		<pubDate>Thu, 03 Dec 2009 04:46:37 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=97</guid>
		<description><![CDATA[
There’s this great story one of my economics professors used to tell in order to illustrate that “correlation does not imply causation”. It’s most certainly false but it goes like this. The setting is 16th century Russia, during the latter half of Ivan the Terrible’s reign. In general, this was not a great time to [...]]]></description>
			<content:encoded><![CDATA[<p style="text-align: center;"><img class="size-full wp-image-98 aligncenter" title="xkcd" src="http://pokeitmethod.com/wp-content/uploads/2009/12/xkcd.JPG" alt="xkcd" width="461" height="187" /></p>
<p>There’s this great story one of my economics professors used to tell in order to illustrate that “correlation does not imply causation”. It’s most certainly false but it goes like this. The setting is 16<sup>th</sup> century Russia, during the latter half of Ivan the Terrible’s reign. In general, this was not a great time to be living in Russia. A combination of drought, famine, Polish-Lithuanian raids, Tatar invasions, and the sea-trading blockade carried out by the Swedes, Poles and the Hanseatic League had devastated the country. On top of that, a particularly nasty epidemic of the plague was killing between 600 and 1000 people every day. It was not known at the time how the plague spread, so efforts to fight it were subject to wild theories. Ivan, mentally unstable and physically disabled, suspected treachery. To prove it, he had his advisors gather statistics on the number of doctors and the number of dead throughout his kingdom. Once it was discovered that regions with more doctors also had more deaths, Ivan rounded up all of the doctors and had them executed for treason.</p>
<p><span id="more-97"></span>There’s also this great chart I found on Wikipedia showing the relationship between U.S. highway fatalities and fresh lemon imports:</p>
<p><img class="aligncenter size-full wp-image-99" title="Fresh LEMONS" src="http://pokeitmethod.com/wp-content/uploads/2009/12/Fresh-LEMONS.JPG" alt="Fresh LEMONS" width="500" height="325" /></p>
<p><em>(Just an aside: I work for a firm that does a lot of professional services contracting with the federal government. The most common products of these contracts are PowerPoint presentations with dozens of pretty charts and tables. There’s a running joke that many of the <a href="http://www.wallstats.com/blog/down-the-rabbit-hole-of-the-pentagon-graphics-machine/">ridiculous PowerPoints that come out of the Federal government</a> are actually produced by some entry-level grunt working for our firm. Basically, what I’m trying to say is that the odds are 50:50 that someone down the hall from me is convinced that we need to increase fresh lemon imports. Better make it from Mexico just to be sure.)</em></p>
<p>The phrase that comes to mind when people talk about murderous Russian Doctors and the unique benefits of fresh lemon imports is <strong>correlation does not imply causation</strong>. The fallacy it warns against, also known as <strong>cum hoc ergo propter hoc</strong> (Latin for &#8220;with this, therefore because of this&#8221;), is actually a combination of several different limitations. Chief among them are the problems of <strong>reverse causality</strong>, <strong>unobserved heterogeneity</strong>, and <strong>selection</strong>.</p>
<p><strong>Reverse Causality<a href="#_ftn1"><strong>[1]</strong></a></strong></p>
<p>Interpreting relationships in the social sciences is hard (the math in the <a href="http://pokeitmethod.com/2009/11/pokeit-letter-2-nate-silvers-crystal-ball/" target="_blank">last letter</a>, that’s the easy part). The problem boils down to the fact that statisticians and economists commonly lack the laboratory ideal of a randomized trial. That is, we aren’t always able to introduce an ‘X’ variable randomly into a population and measure its average effect on an outcome ‘Y’ (although a group of <a href="http://www.povertyactionlab.org/" target="_blank">clever development economists</a> are doing just that). For instance, it would be infeasible and immoral to mandate that a ‘treatment’ group of children must go to school, while preventing a ‘control’ group from enrolling in order to measure the effect of schooling on future wages. Likewise, you can’t set up a ‘Cold War’ experiment where a random selection of nations enacts communist policies while another cohort pursues free-market capitalism just to find out if Ayn Rand was on to something.</p>
<p style="text-align: left;">In the real world, we do not get to control the X variables (schooling, systems of government, lemon imports). Often they are determined by other things – possibly including Y itself. Regression analysis captures the correlation between X and Y, but nothing else. Let’s go back to imperial Russia and try to find the effect of Doctors on Plague. We run the regression:</p>
<p style="text-align: left;"><img class="aligncenter size-full wp-image-101" title="regression_1" src="http://pokeitmethod.com/wp-content/uploads/2009/12/regression_1.jpg" alt="regression_1" width="219" height="30" /></p>
<p>And we find that b is large and positive – let’s say that each additional doctor is associated with 10 infections in an arbitrary population. Can we take this as foolproof evidence that doctors were spreading the plague, if not from the fleas nesting in their fur, then by intentional infection? Well, not necessarily. The more sensible explanation is that more doctors were attracted to areas where there was a higher incidence of plague. While this definitely makes more sense, it is important to remember that it is still an interpretation and not a product of the data itself. All the data can tell you is that there is a correlation.</p>
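<p>To see how a regression can hand you a large positive b even when doctors do nothing at all, here is a small Python simulation. It is a toy world built entirely on assumptions of mine: plague cases arise on their own, doctors then move to wherever the plague is, and the slope is computed by hand as cov(x, y) / var(x). None of the numbers come from real data.</p>

```python
import random

def ols_slope(xs, ys):
    """Slope b from regressing y on x: cov(x, y) / var(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x

rng = random.Random(16)  # fixed seed so the sketch is repeatable

# Assumed toy world: plague cases per region arise on their own...
plague = [rng.uniform(0, 1000) for _ in range(500)]
# ...and doctors follow the plague. Doctors have NO effect on plague here.
doctors = [0.05 * cases + rng.gauss(0, 5) for cases in plague]

# Regress Plague on Doctors, as Ivan's advisors would have.
b = ols_slope(doctors, plague)
print(round(b, 1))
```

<p>The slope comes out large and positive even though, by construction, the causation runs entirely from plague to doctors. The regression has no way of telling the two stories apart.</p>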
<p>A more challenging interpretation can arise from trying to estimate the effect of the psychological condition known as ‘tilt’ on win/loss rates in poker. Tilt is a poker term for a state of mental frustration caused by bad beats, challenging interpersonal situations, and/or a losing session. Tilting may cause a player to play overly aggressive or loose, and it often has a negative effect on profitability. Let’s say we want to measure the effect of being on tilt on average profitability. Setting aside for now the exact specification of these two variables, we set up the equation:</p>
<p align="center"><img class="aligncenter size-full wp-image-102" title="regression_2" src="http://pokeitmethod.com/wp-content/uploads/2009/12/regression_2.jpg" alt="regression_2" width="209" height="34" /></p>
<p>After running a regression we find a strong and negative correlation between being on tilt and profitability. This result is somewhat ambiguous though. Is it that being on tilt causes players to lose money, or does losing money cause players to go on tilt? As any poker player can tell you, it’s very likely that the causation goes both ways.</p>
<p><strong>Unobserved Heterogeneity</strong></p>
<p>A second type of complication is unobserved heterogeneity. This is a problem when people with different values of X are different in other ways that also affect Y. If some unobserved factor is correlated with both X and Y, our estimates of b will be biased.</p>
<p>The most common example of this is the “ability bias” in estimating the returns to schooling. Suppose a person’s wages are a function of their education level and their ‘ability’. Here ability can mean intelligence, job skills, savvy, taste for office politics, whatever.</p>
<p><img class="aligncenter size-full wp-image-103" title="regression_3" src="http://pokeitmethod.com/wp-content/uploads/2009/12/regression_3.jpg" alt="regression_3" width="353" height="35" /></p>
<p>Even if you devised a set of really neat statistics to measure <em>Ability</em>, they’d almost surely be imperfect in some way, and you often don’t have access to all the necessary data. Suffice it to say, the <em>Ability</em> variable is omitted and absorbed into the error term. The problem now is that <em>Ability</em> affects earnings, but it also affects how much education you get. If we are unable to measure <em>Ability</em>, then we will mistakenly attribute its effect to education.</p>
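<p>The ability bias is easy to reproduce in a simulation. In the Python sketch below, every coefficient is an assumption chosen for illustration: the true return to a year of schooling is set to 1.0, ability raises both education and wages, and we then regress wages on education alone, with ability left out.</p>

```python
import random

def ols_slope(xs, ys):
    """Slope b from regressing y on x: cov(x, y) / var(x)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    return cov_xy / var_x

rng = random.Random(2009)  # fixed seed so the sketch is repeatable

TRUE_RETURN_TO_SCHOOLING = 1.0  # assumed true effect of one year of education

# Assumed data-generating process: ability drives both education and wages.
ability = [rng.gauss(0, 1) for _ in range(5000)]
education = [12 + 2 * a + rng.gauss(0, 1) for a in ability]
wages = [
    10 + TRUE_RETURN_TO_SCHOOLING * e + 3 * a + rng.gauss(0, 1)
    for e, a in zip(education, ability)
]

# Ability is unobserved, so we can only regress wages on education.
naive_b = ols_slope(education, wages)
print(round(naive_b, 2))
```

<p>With these made-up coefficients the naive slope should land near 2.2, more than double the true return of 1.0, because education is picking up credit for the ability we could not measure.</p>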
<p>Player names in online poker provide a more pure example of unobserved heterogeneity. Before you can give your credit card number to an offshore, semi-legal poker website, you have to create a player name. My friend Joseph Crowley has observed that players with “Mike” or “Mikey” in their name tend to be <a href="http://www.urbandictionary.com/define.php?term=poker%20donkey" target="_blank">huge donkeys</a> at the table. Let’s test his hypothesis, that is, whether or not having some variant of Mike in your name is associated with lower win rates in dollars per hand:</p>
<p align="center"><img class="aligncenter size-full wp-image-104" title="regression_4" src="http://pokeitmethod.com/wp-content/uploads/2009/12/regression_4.jpg" alt="regression_4" /></p>
<p>Rather than just hypothesizing the relationship between these two variables, let’s test it empirically using data collected from the hand histories of real money poker players. I have a database on my work-issued Lenovo Thinkpad with approximately 800,000 games of $0.50 &#8211; $1 NL Hold’em providing statistics on 22,420 players. Using this data, I flag whether a player has “Mike” in their name with the dummy variable <em>MIKEY</em> and record the win rate of every player in dollars won per hand with the variable <em>USD_hand</em>. Running the regression of <em>USD_hand</em> on <em>MIKEY</em> in the statistical package Stata produces the following regression table:</p>
<p><img class="aligncenter size-full wp-image-105" title="regression_table" src="http://pokeitmethod.com/wp-content/uploads/2009/12/regression_table.jpg" alt="regression_table" width="569" height="170" /></p>
<p>Joe’s intuition seems to have been spot on, as the b coefficient for the variable <em>MIKEY </em>is large, negative, and significant at the 95% confidence level. Having “Mike” in your player name is associated with win rates that are $1.25 a hand below average (as shown in the Coef. column). Also, the P&gt;|t| value highlighted in yellow is less than 0.05 &#8211; which is why we say it is significant at the 95% confidence level. What this means is that if the null hypothesis (that there is no relationship between having &#8216;Mike&#8217; in your name and your win rate) were true, we would observe a result at least as extreme as this $1.25 below average less than 5% of the time. Now here’s where the whole thing gets Freakonomics on us. Does this result mean that if I changed my username to ‘MikeOrangelloNutz’, my average win rate per hand would drop by $1.25? Likely not.</p>
<p>Unless your player name makes people think you are a pro (like if it was Phil Ivey or something) it should have <strong>no direct effect</strong> on win rates. However the data shows a positive correlation between the two. This isn’t a case of reverse causality, either (win rates don’t cause players to change their player name). What’s happening here is that both win rate and player name are cause by the same unobserved factor -  player skill. And for whatever reason, Mikey tends to suck at poker.</p>
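<p>To see how an unobserved factor can produce this kind of result, here is a minimal simulation sketch. Everything in it is made up for illustration: the skill distribution, the assumption that weaker players pick “Mike” more often (30% versus 10%), and the sample size. The variable names <em>mikey</em> and <em>usd_hand</em> only mirror the regression above; none of the numbers come from the real database.</p>

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

mikey, usd_hand = [], []
for _ in range(20000):
    skill = random.gauss(0, 1)              # unobserved by the econometrician
    p_mike = 0.1 if skill > 0 else 0.3      # assumed: weaker players pick "Mike" more often
    name_has_mike = 1 if p_mike > random.random() else 0
    # Win rate depends on skill and luck only; the name has no direct effect.
    win_rate = skill + random.gauss(0, 1)
    mikey.append(name_has_mike)
    usd_hand.append(win_rate)


def mean(values):
    return sum(values) / len(values)


def ols_slope(x, y):
    """Least-squares slope of y on x."""
    x_bar, y_bar = mean(x), mean(y)
    numerator = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    denominator = sum((xi - x_bar) ** 2 for xi in x)
    return numerator / denominator


b = ols_slope(mikey, usd_hand)
print(f"estimated MIKEY coefficient: {b:.2f}")
```

<p>The estimated coefficient comes out clearly negative even though the name never enters the win rate equation. The regression is picking up skill, which it cannot see.</p>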
<p><strong>Selection and self-selection</strong></p>
<p>When we estimate b we want to interpret it as “the average effect of X on Y.” However, people are different, and each person has a different internal relationship between X and Y. For example, the relationship (b) between minutes spent watching Gossip Girl (X) and jollies derived from watching Gossip Girl (Y) is likely stronger for me than it is for you. Because of this, I watch Gossip Girl every Monday night, read <a href="http://nymag.com/tv/gossip-girl/" target="_blank">Daily Intel</a> on Tuesday at work, and afterwards, I email my friends &#8220;+50 for Chuck Bass saying &#8216;Because I&#8217;m Chuck Bass&#8217;&#8221;. My 15 year old sister calls me about it on Wednesday after watching it online (because she&#8217;s not allowed to watch tv on weeknights!) and we giggle about it together like two 15 year old girls. Now, if I were to look at the data on Gossip Girl viewership to try to estimate the joy the average dude would get from watching it, I&#8217;d find the average <em>only</em> for the people who actively choose to watch Gossip Girl. Since these are precisely the same people who get more jollies out of it than most, our estimate of the effect of X on Y would be biased upward, causing us to overestimate the effect of Gossip Girl on happiness. This problem is called “self-selection”, or simply selection in its general form.</p>
<p>Switching gears to <a href="http://cdn.videogum.com/img/thumbnails/photos/gossip_girl_3_6/poker.jpg" target="_blank">poker</a>, suppose there was a training program that could increase a player’s winnings. Here are five people considering taking the class:</p>
<p><img class="aligncenter size-full wp-image-106" title="self_selection_table_1" src="http://pokeitmethod.com/wp-content/uploads/2009/12/self_selection_table_1.jpg" alt="self_selection_table_1" width="591" height="190" /></p>
<p>Across the population, the average return is <strong>$2,800</strong>. But let’s assume that the class costs <strong>$4,000</strong>. Who would then take the class? If our subjects are rational economic agents (you know, like most gamblers), they will only take the class if the benefit in increased winnings outweighs the cost. Performing the cost-benefit analysis, we find that Andrew and Stu would be the only ones to take the class:</p>
<p><img class="aligncenter size-full wp-image-107" title="self_selection_table_2" src="http://pokeitmethod.com/wp-content/uploads/2009/12/self_selection_table_2.jpg" alt="self_selection_table_2" width="279" height="191" /></p>
<p>If we were to then estimate the effect of the class on winnings using just those who attended, we would estimate the average effect to be<strong> $5,500</strong> for a net effect after costs of <strong>+$1,500</strong> per player. While the estimate is accurate for this particular subset, it overstates the effect the class would have on the population as a whole. In fact, after fees, the average effect of the class is <strong>-$1,200</strong>.</p>
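<p>The arithmetic behind those averages can be sketched in a few lines. The individual returns below are hypothetical: I picked them only so that they reproduce the averages quoted above ($2,800 for everyone, $5,500 for the two who enroll), and the labels “Player C” through “Player E” are placeholders rather than the names in the table.</p>

```python
COST = 4000  # class fee from the text

# Hypothetical individual returns, chosen only to match the averages quoted above.
returns = {
    "Andrew": 6000,
    "Stu": 5000,
    "Player C": 1500,
    "Player D": 1000,
    "Player E": 500,
}


def average(values):
    values = list(values)
    return sum(values) / len(values)


# Rational agents enroll only when the gain exceeds the fee.
enrolled = {name: gain for name, gain in returns.items() if gain > COST}

population_avg = average(returns.values())  # everyone
enrolled_avg = average(enrolled.values())   # self-selected subset

print("population:", population_avg, "net:", population_avg - COST)
print("enrolled:  ", enrolled_avg, "net:", enrolled_avg - COST)
```

<p>Measuring the class only on the people who chose to take it flips the sign of the net effect, which is the whole problem with self-selected samples.</p>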
<p>This is actually a perfect example of what multilevel marketers like Amway do in an effort to recruit new members. After-hours seminars are set up in a non-descript office park, the few success stories are paraded around, and afterwards, an “Amway Business Owner” offers to sell you several Robert Kiyosaki books such as the notorious “Rich Dad, Poor Dad” and the lesser known “Cashflow Quadrant” (I wouldn’t be surprised if a study came out in a few years showing that the housing bubble was actually caused by hordes of Amway automatons pumped up by Kiyosaki’s real estate happy talk). Conned into giving up their money and their dignity, most of the people that get involved in these schemes never make back their initial investment.</p>
<p>Observed hands are also subject to selection. Players fold the majority of hands before they actually reach a showdown, and even then, the loser can muck his hand if he&#8217;s beaten. In a sample of 174,305 hands of low limit online poker, only 6,551 hands (3.8%) were actually observed at showdown. Since only the best hands (and some bluffs) are not folded, we can expect that observed hands are, on average, far stronger than unobserved hands. This certainly complicates our efforts to model situations in online poker, but by how much?</p>
<p>To refine what we mean by better or worse hands, we will use the patented <span style="color: #ff0000;"><a href="http://www.tightpoker.com/images/books/david-sklansky.jpg" target="_blank">Sklansky Hand groups</a></span> to order all of the starting hands in Texas Hold’em into 9 ranked categories:</p>
<p align="center"><img class="aligncenter size-full wp-image-108" title="sklansky_group_table" src="http://pokeitmethod.com/wp-content/uploads/2009/12/sklansky_group_table.jpg" alt="sklansky_group_table" width="532" height="189" /></p>
<p>Using a personal database which reveals all of the hands dealt to a player, we can identify the size of the selection bias by comparing the frequency of observing each hand group in the full dataset of 174,305 hands to the frequency of observing each hand group in the shown dataset of just 6,551 hands. Plotted out, the frequencies for each category give us the hand range distributions for shown hands vs. all hands:</p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-148" title="sklansky_chart_1" src="http://pokeitmethod.com/wp-content/uploads/2009/12/sklansky_chart_1.jpg" alt="sklansky_chart_1" width="573" height="364" /></p>
<p><em><strong>Figure 1:</strong> A plot comparing the frequency of observing a hand in each of the 9 Sklansky Hand Groupings for all hands dealt vs. only those shown at a showdown. The probability distribution for all hands is indicated by the black line, while the probability distribution for shown hands is indicated by the blue line. The yellow bars show the net difference between the frequencies of each group.</em></p>
<p>The observed hand selection bias is, in fact, quite large. Crappy group 9 hands represent 61% of all hands dealt, but only 20% of hands that are observed at showdown. Likewise you are more than five times as likely to observe a tier 1 hand like Aces or Kings in a sample of observed hands than you are in a population with all hands revealed. The shown hands distribution is roughly flat before dipping around groups 6 – 8 and rising up for group 9. Meanwhile, the all hands distribution is <a href="http://en.wikipedia.org/wiki/Skewness" target="_blank">heavily left skewed</a> with the majority of hands coming from the garbage group 9 category.</p>
<p>Have these results just shot a massive hole through our plan of modeling opponent hand range distributions? Do other options exist besides using public datasets of shown hands? While we could rely entirely on personal datasets with all hands revealed (like the one used to generate Figure 1), our sample of player databases would also be subject to selection bias since only certain types of players actively track their hand histories. And before you get any ideas, datasets revealing every player&#8217;s hand do not exist outside the poker sites’ offshore facilities and Russ Hamilton’s hard drive. <img class="alignnone size-full wp-image-111" title="sadface" src="http://pokeitmethod.com/wp-content/uploads/2009/12/sadface1.jpg" alt="sadface" width="16" height="16" /></p>
<p>Defeat appears imminent, but is all truly lost for our hero statistician? Tune in next time to find out if this series ends at #4!</p>
<p>-chaz</p>
<hr size="1" /><a href="#_ftnref1">[1]</a> Adapted from: Lich-Tyler, Stephen. &#8220;Supplemental Notes from Econ 570: Econometrics.&#8221; (2008)</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2009/12/pokeit-letter-3-%e2%80%93-cum-hoc-ergo-propter-hoc/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Pokeit Letter #2 – Nate Silver’s Crystal Ball</title>
		<link>http://pokeitmethod.com/2009/11/pokeit-letter-2-nate-silvers-crystal-ball/</link>
		<comments>http://pokeitmethod.com/2009/11/pokeit-letter-2-nate-silvers-crystal-ball/#comments</comments>
		<pubDate>Fri, 20 Nov 2009 20:18:08 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=47</guid>
		<description><![CDATA[At 9:46 p.m., blogging on his site FiveThirtyEight.com, Nate Silver called the presidential election for Barack Obama. The television networks followed suit about an hour and 15 minutes later after most polls in Western states closed.
 
Of course, Mr. Silver had a head start: he had forecast that Senator Obama would beat Senator John McCain [...]]]></description>
			<content:encoded><![CDATA[<p><em>At 9:46 p.m., blogging on his site FiveThirtyEight.com, Nate Silver called the presidential election for Barack Obama. The television networks followed suit about an hour and 15 minutes later after most polls in Western states closed.</em></p>
<p><em> </em></p>
<p><em>Of course, Mr. Silver had a head start: he had forecast that Senator Obama would beat Senator John McCain back in March.</em></p>
<p><em> </em></p>
<p align="right"><em>From the New York Times &#8211; November 10, 2008</em></p>
<p>Nate Silver, the prodigy behind the PECOTA system for predicting the performance of baseball players, and former economic consultant for KPMG, had developed a new statistical framework for analyzing elections. Silver had already proven its scary accuracy during the Democratic primaries in May. While every other commentator was celebrating Hillary Clinton’s resurgent momentum, Silver was skeptical of the new polls showing she would win by five in Indiana and had closed the gap to 8 in North Carolina. The fresh polls didn’t make sense when compared against the relatively stable demographic data. Blogging under the handle Poblano, he broke down the numbers in a different way – Clinton by just two in Indiana, and a seventeen point whuppin in the Tar Heel State. On May 6<sup>th</sup>, the night of the Democratic primaries, Clinton won Indiana by one and lost North Carolina by fifteen.</p>
<p><em> </em></p>
<p><span id="more-47"></span>By the end of election night on November 4<sup>th</sup>,  Nate’s model had predicted the popular vote within one percentage point, correctly predicted the results of 49 of the 50 States, and accurately forecasted all of the resolved Senate races.<em> </em></p>
<p>My good friend John from Uganda said it best:</p>
<p><em>“That was so exciting moment for the young man to beat <span style="text-decoration: underline;">a big man</span> pants down”</em></p>
<p>Indeed it was. But how exactly did he do it? The most common understanding goes like this: math genius creates a complicated statistical black box able to conjure up crystal ball-like predictions from the polls. World finds out, world goes apeshit, and it’s not long before Silver’s being chased around by a <a href="http://bit.ly/cma58">Hasidic Jewish sect</a>, and agents of a <a href="http://pokeitmethod.com/wp-content/uploads/2009/11/marcy_dawson.jpg" target="_blank">nefarious Wall Street firm</a>[1]:</p>
<p><em>11:15, restate my assumptions</em></p>
<p style="text-align: center;"><a href="http://pokeitmethod.com/wp-content/uploads/2009/11/calling-indiana.jpg"><img class="alignnone size-full wp-image-49" title="calling indiana" src="http://pokeitmethod.com/wp-content/uploads/2009/11/calling-indiana.jpg" alt="calling indiana" width="576" height="543" /></a></p>
<p><em><strong>Figure 1:</strong> To arrive at a win percentage on election day, Silver combined the reliability-weighted average of all the polls, adjusted for more recent polls from other states, averaged in predictions from demographic and economic indicators, created a projection to account for the historical trends of undecided voters and ran a Monte Carlo simulation to translate polling predictions and polling error into a probabilistic statement of the likely outcome (i.e. Obama has a 49% chance of winning Indiana).<a href="#_ftn2">[2]</a></em></p>
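<p>The last step in that caption, turning a projected margin and its error into a win probability, can be sketched with a tiny Monte Carlo simulation. This is not Silver’s code. The normally distributed polling error, the 4-point standard deviation, and the 0.1-point deficit are all assumptions of mine, picked so the answer lands close to the 49% mentioned in the caption.</p>

```python
import random


def win_probability(poll_margin, poll_error_sd, trials=100_000, seed=538):
    """Share of simulated elections in which the candidate's margin is positive.

    poll_margin: projected lead in percentage points (negative = trailing).
    poll_error_sd: assumed standard deviation of the polling error, in points.
    """
    rng = random.Random(seed)
    wins = sum(1 for _ in range(trials) if rng.gauss(poll_margin, poll_error_sd) > 0)
    return wins / trials


# Hypothetical inputs: trailing by 0.1 points with a 4-point polling error.
p = win_probability(-0.1, 4.0)
print(f"simulated win probability: {p:.3f}")
```

<p>The point of simulating rather than just reporting the margin is that a candidate who trails by a hair still wins nearly half of the simulated elections.</p>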
<p>Sensationalism aside, the actual story of how Nate Silver derived his electoral predictions is as fascinating and impressive as they come.</p>
<p>The starting point for Nate was a growing dissatisfaction with the way commentators were botching the analysis of the race’s primary data source – the polls. Poll aggregators like RealClearPolitics.com gave every poll the same weight. Good polls were mixed with polls having small sample sizes, polls from unreliable pollsters, old polls and polls from pollsters with a known political bias. Silver wanted to create an aggregate of the polls, but with a weight towards the best ones.</p>
<p>In order to determine the best polls, he examined all of the old polls, took the average ‘miss’ for each pollster across each contest they polled, and compared it to the average miss of other pollsters in the same contest. The methodology for calculating the rankings and subsequent weights based on effective sample size are exhaustively documented <a href="http://www.fivethirtyeight.com/search/label/pollster%20ratings">@fivethirtyeight.com</a>.</p>
<p>Individual outlier polls, and states with little or no recent polling, posed another problem. For example, in February of 2008, the single Kentucky poll showed Obama trailing McCain by 29 points, whereas the only Tennessee poll had Obama trailing by 9 points. Since Tennessee and Kentucky are fairly similar, it was unlikely that there was in fact a 20-point gap between the two states.<a href="#_ftn3">[3]</a> This realization led to another insight. The fairly stable relationship between the demographics of Tennessee and Kentucky and their expected electoral outcomes is generalizable to elections in every other state. If you were to look back through the data of election outcomes, you would find a statistically significant relationship between a state’s demographics and a candidate’s, or party’s, expected two-way share of the vote (two-way as in excluding 3<sup>rd</sup> party candidates).</p>
<p>One of the more notable cases where the polls began to diverge from the historical relationship was in North Carolina during the weeks leading up to the Democratic Primary. The pollster Insider Advantage released a poll 6 days before the election showing that Hillary Clinton had pulled ahead by 2 points. Other polls also pointed to a narrowing of the gap. Silver smelled a rat. He suspected that pollsters were significantly underestimating Obama’s margin of victory in Southern states with substantial black populations. In addition, early voting data in North Carolina suggested that pollsters may also be significantly underestimating the proportion of African-Americans in the voting population.<a href="#_ftn4">[4]</a> While the polls showed tightening, Silver’s model correctly forecasted a double-digit victory for Barack Obama.</p>
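<p>A rough sketch of that rating step, under simplifying assumptions: the pollsters, contests, and misses below are invented, and each contest’s benchmark here is the plain average miss of every pollster in that contest (including the one being rated), which may differ from what FiveThirtyEight actually did.</p>

```python
from collections import defaultdict

# Hypothetical records: (contest, pollster, absolute miss in points).
records = [
    ("IN", "Pollster A", 2.0),
    ("IN", "Pollster B", 6.0),
    ("NC", "Pollster A", 3.0),
    ("NC", "Pollster B", 9.0),
    ("NC", "Pollster C", 6.0),
]

# Average miss within each contest, the benchmark every pollster is compared to.
by_contest = defaultdict(list)
for contest, _, miss in records:
    by_contest[contest].append(miss)
contest_avg = {c: sum(m) / len(m) for c, m in by_contest.items()}

# Each pollster's miss relative to that benchmark, averaged over its contests.
relative = defaultdict(list)
for contest, pollster, miss in records:
    relative[pollster].append(miss - contest_avg[contest])
rating = {p: sum(r) / len(r) for p, r in relative.items()}

# Lower is better: a negative score means the pollster beat the field.
for pollster, score in sorted(rating.items(), key=lambda item: item[1]):
    print(pollster, round(score, 2))
```

<p>Comparing within a contest matters because some races are simply harder to poll than others. A pollster shouldn’t be punished for showing up to the hard ones.</p>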
<p>The assumption was that voters in North Carolina would behave like demographically-aligned voters in other states. More explicitly, the model looked at a set of independent demographic variables and then tried to estimate their effect on one dependent variable: Obama’s two-way vote share. The statistical technique used to parse out this relationship is known as multiple regression analysis. Let’s jump right in.</p>
<p><strong>Regression</strong></p>
<p>Many empirical questions can be posed as “what is the effect of X on Y?” In our example, we’re looking at the effect of a set of state demographic statistics, X, on Obama’s share of the vote total, Y. The mathematical relationship of this statement is <a href="#_ftn5"><strong><strong>[5]</strong></strong></a></p>
<p style="text-align: center;"><img class="aligncenter size-full wp-image-52" title="regression 1" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-1.jpg" alt="regression 1" width="185" height="26" /></p>
<p>Where a is the intercept or constant, and b is a measure of how much Y changes (on average) when X changes. In a simple world (think Pre-Algebra), with no other confounding factors to consider, b would simply be the slope, the change in y divided by the change in x. In the real world, things are far more complicated. Obama’s two-way vote share is determined by a lot of things we aren’t able to include in the model. For example, the Reverend Wright scandal was going on at the time, and its effects were rather unpredictable – certainly not by an econometric model using only recent polling history. Econometricians call these other factors “error” or “unobservables” and they use the letter e to represent them.</p>
<p>Since we can’t just read the slope off as Y/X, how do we guess the value of b? Well, we want our estimate of b to be the best estimate of b possible, i.e. we want to maximize the accuracy of our estimate of b. In this case, maximizing the accuracy of b is also the same thing as minimizing the size of the error term, or what we can’t explain. And to be more precise, we don’t want to minimize e in the sense of making it as negative as possible, we want to normalize e in a way to make sure we’re measuring its absolute size. We <em>could</em> do this by minimizing the absolute value of e – but that’s quite difficult. A far simpler exercise would be to take the square of each element of e and then try to minimize the sum of those squares. I’m going to walk through the derivation of b using this method of minimizing the size of the square of our e vector. This technique is called “(ordinary) least-squares regression”. Replacing ‘other factors’ with e, the true relationship between Y and X is</p>
<p align="center"><a href="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-2.jpg"><img class="alignnone size-full wp-image-54" title="regression 2" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-2.jpg" alt="regression 2" width="112" height="23" /></a></p>
<p>We will also introduce an equation for the average value of Y</p>
<p align="center"><img class="aligncenter size-full wp-image-55" title="regression 3" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-3.jpg" alt="regression 3" width="92" height="30" /></p>
<p>Let’s perform a mathematical parlor trick and subtract the average value of Y from both sides of the true relationship equation</p>
<p align="center"><img class="aligncenter size-full wp-image-56" title="regression 4" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-4.jpg" alt="regression 4" width="216" height="27" /></p>
<p>The a constants cancel out and this can be rewritten as</p>
<p align="center"><img class="aligncenter size-full wp-image-57" title="regression 5" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-5.jpg" alt="regression 5" width="150" height="27" /></p>
<p>Rearranging the terms so we isolate e</p>
<p align="center"><img class="aligncenter size-full wp-image-58" title="regression 6" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-6.jpg" alt="regression 6" width="158" height="31" /></p>
<p>Recall that our best guess of b is the value that minimizes the sum of the squared errors. Therefore, we want to pick b to minimize</p>
<p align="center"><img class="aligncenter size-full wp-image-59" title="regression 7" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-7.jpg" alt="regression 7" width="383" height="31" /></p>
<p>Think back to calculus now and recall how to minimize this function: take a derivative with respect to b, and set it equal to zero.</p>
<p align="center"><img class="aligncenter size-full wp-image-60" title="regression 8" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-8.jpg" alt="regression 8" width="289" height="36" /></p>
<p>A little rearranging</p>
<p align="center"><img class="aligncenter size-full wp-image-61" title="regression 9" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-9.jpg" alt="regression 9" width="252" height="36" /></p>
<p>Then we divide through to isolate b, and we have</p>
<p align="center"><img class="aligncenter size-full wp-image-62" title="regression 10" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-10.jpg" alt="regression 10" width="169" height="62" /></p>
<p>This is our formula for the best guess of b. For those familiar with statistics, this can be viewed as the covariance between X &amp; Y divided by the variance in X</p>
<p align="center"><img class="aligncenter size-full wp-image-63" title="regression 11" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-11.jpg" alt="regression 11" width="149" height="34" /></p>
<p align="center"><a href="http://www.youtube.com/watch?v=WVXGC896Jdw">Most Outstanding</a></p>
<p>To be sure, the purpose of this exercise was less about learning a statistical derivation, and more about examining the logic of these techniques.  The ordinary least squares estimator is just one of many estimation techniques used to identify the relationship between variables, and with God’s help, causality. While each technique is different in its execution, they all attempt to either minimize the error, or maximize the likelihood that the estimated value of b* is equal to the true value of b.</p>
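<p>For anyone who would rather check the result numerically than trust the algebra, here is a small sketch. The five data points are hypothetical, made up to lie near the line y = 1 + 2x. The code computes b as the sum of cross-deviations over the sum of squared X-deviations (the covariance-over-variance formula above), and then confirms that nudging b in either direction only increases the sum of squared errors.</p>

```python
# Hypothetical data lying near the line y = 1 + 2x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [3.1, 4.9, 7.2, 8.8, 11.0]

n = len(x)
x_bar = sum(x) / n
y_bar = sum(y) / n

# Slope: sum of cross-deviations divided by sum of squared x-deviations.
s_xy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
s_xx = sum((xi - x_bar) ** 2 for xi in x)
b = s_xy / s_xx
a = y_bar - b * x_bar  # intercept recovered from the two averages


def sse(slope):
    """Sum of squared errors after demeaning, as in the derivation."""
    return sum(((yi - y_bar) - slope * (xi - x_bar)) ** 2 for xi, yi in zip(x, y))


# Nudging the slope in either direction should only make the fit worse.
print(round(b, 3), round(a, 3))
print(sse(b) == min(sse(b - 0.1), sse(b), sse(b + 0.1)))
```

<p>The scaling factors in the covariance and the variance cancel, which is why the plain sums are enough.</p>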
<p><strong> </strong></p>
<p><strong>The Democratic Primary Model</strong></p>
<p>As Silver is surely well aware, <a href="http://www.stat.columbia.edu/%7Ecook/movabletype/archives/2009/08/dumpin_the_data.html">models don’t come fully formed from the .raw file</a>. After testing and retesting dozens of hypothetical correlations, Silver found 9 demographic variables that had a statistically significant effect on the Obama-Clinton vote share. Factors included in the model were: <a href="#_ftn6">[6]</a></p>
<p style="padding-left: 30px;">1. Caucus versus Primary<br />
2. African-American population<br />
3. Percentage of 18-29 voters<br />
4. Percentage of adults with college degrees<br />
5. Fundraising<br />
6. Percentage of Southern Baptists<br />
7. John Kerry vote share, 2004<br />
8. Percentage of Democratic voters who self-identify as Liberal<br />
9. Percentage of naturalized citizens, e.g. immigrants</p>
<p>An econometrician might write the model like this:</p>
<p><img class="aligncenter size-full wp-image-64" title="regression 12" src="http://pokeitmethod.com/wp-content/uploads/2009/11/regression-12.jpg" alt="regression 12" width="262" height="31" /></p>
<p>Where <img title="yi" src="../wp-content/uploads/2009/11/yi1.jpg" alt="yi" width="15" height="13" /> is Obama’s vote share, <img class="alignnone size-full wp-image-70" title="xik" src="http://pokeitmethod.com/wp-content/uploads/2009/11/xik1.jpg" alt="xik" width="18" height="13" /> is the value of explanatory variables 1-9, <img class="alignnone size-full wp-image-71" title="bk" src="http://pokeitmethod.com/wp-content/uploads/2009/11/bk1.jpg" alt="bk" width="18" height="22" /> is the estimated marginal average effect of <img class="alignnone size-full wp-image-72" title="xk" src="http://pokeitmethod.com/wp-content/uploads/2009/11/xk1.jpg" alt="xk" width="17" height="13" /> on <img class="alignnone size-full wp-image-73" title="y" src="http://pokeitmethod.com/wp-content/uploads/2009/11/y.jpg" alt="y" width="13" height="12" />, and <img class="alignnone size-full wp-image-74" title="b0" src="http://pokeitmethod.com/wp-content/uploads/2009/11/b01.jpg" alt="b0" width="17" height="22" /> is the estimated constant. Taken together, these variables explained more than 95% of the voting breakdown in states that had already voted.</p>
<p>The first thing you may notice is that without even knowing the size and direction of the <img class="alignnone size-full wp-image-71" title="bk" src="http://pokeitmethod.com/wp-content/uploads/2009/11/bk1.jpg" alt="bk" width="18" height="22" /> coefficients for each variable, you probably have an idea about which factors favored Obama and which factors favored Hillary. As you might expect, Obama performed well with African-Americans, the youth vote, the better educated and the more liberal.</p>
<p>Several of the more creative variables require some explanation. Caucuses, which benefited from lots of on the ground organizing, gave Obama an advantage while Hillary did better in wide open primaries. Fundraising power measured the dollars raised by the candidate divided by the number of votes Kerry netted in 2004. This serves to quantify which candidate has the state party apparatus in their back pocket. Percentage of Southern Baptists was used as a proxy for a State’s ‘Southerness’ so as not to resort to messy geographic definitions of the South. And surprisingly, Nate did not find that Obama performed worse in states with large Latino populations once all the other factors were controlled for. He did find though, that Obama did slightly worse among recent immigrants relative to Hispanics born in the United States. This is the ‘Percentage of naturalized citizens, e.g. immigrants’ variable. Silver hypothesizes that this might be on account of Bill Clinton being the President when they came to this country or became citizens. The final variable, John Kerry vote share, favored Hillary.</p>
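<p>With more than one explanatory variable, the same least-squares logic applies, only with a small system of equations in place of a single division. The sketch below uses just two of the nine variables (a caucus dummy and African-American population share) on six invented states. None of these numbers are Silver’s data or his coefficients, and the little <em>solve</em> helper is my own stand-in for what a statistics package would do.</p>

```python
# Hypothetical states: (caucus dummy, African-American share in %, Obama two-way share in %).
states = [
    (0, 5, 43.0),
    (1, 5, 52.0),
    (0, 20, 49.5),
    (1, 10, 55.5),
    (0, 30, 55.5),
    (0, 10, 44.5),
]

# Design matrix with a leading 1 for the constant b0.
X = [[1.0, caucus, black] for caucus, black, _ in states]
y = [share for _, _, share in states]
k = len(X[0])

# Normal equations (X'X) b = X'y: the multi-variable version of least squares.
xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(k)]


def solve(a, rhs):
    """Solve a linear system by Gauss-Jordan elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [value] for row, value in zip(a, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


b = solve(xtx, xty)
fitted = [sum(bj * xj for bj, xj in zip(b, row)) for row in X]
y_bar = sum(y) / len(y)
sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
sst = sum((yi - y_bar) ** 2 for yi in y)
r_squared = 1 - sse / sst  # share of the variation in y that the model explains

print([round(v, 2) for v in b], round(r_squared, 3))
```

<p>The R-squared at the end is the same kind of “share of the breakdown explained” figure quoted above, computed here on toy data.</p>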
<p><strong>Conclusion</strong></p>
<p>Perhaps you’re wondering why we took this long detour through political forecasting as opposed to going headlong into the poker. Without mincing words, the story of FiveThirtyEight.com is one of the most righteous examples of applied statistical modeling that I’ve ever seen. Nate’s intricate process for synthesizing win percentages from raw polling data and demographic regressions is a piece of engineering at least as complex and twice as clever as what’s done at any hedge fund out there. Each challenge had a carefully chosen solution and each solution built on the next to finally arrive at something unique and undeniably useful – a measure of a candidate’s probability of winning an election. We’re going to try to adopt a similar thought process as we begin building our expected value model for no-limit hold’em poker.</p>
<p>Looking back at what went into the FiveThirtyEight.com model really reaffirms my belief that econometrics is more of an art than a science.<a href="#_ftn7">[7]</a> The math and the stats are necessary and yes they can be complex, tedious, and dull at times. However, while you need to know the math to communicate with the data, it’s how well you understand the questions you’re asking that matters most. The real heavy lifting of model building is done when you are forced to order and codify concepts that you had only previously considered in rules of thumb and rough approximations.</p>
<p>In the next installment, we’ll be turning our attention back to poker as we examine the challenge of determining causality.</p>
<p>-chaz</p>
<hr size="1" /><a href="#_ftnref1">[1]</a> <em>“[Silver] had been flown to New York at the invitation of a hedge fund to give a talk. They just said, ‘Why don’t you come in, talk about your models’” </em>(New York Magazine Oct 12, 2008)</p>
<p><a href="#_ftnref2">[2]</a> Sternbergh, Adam. &#8220;The Spreadsheet Psychic: How Nate Silver Went from Forecasting Baseball Games to Forecasting Elections.&#8221; New York Magazine 12 Oct 2008: 2.</p>
<p><a href="#_ftnref3">[3]</a> Silver, Nate. &#8220;General Election Projections, Beta Version.&#8221; <em>Daily Kos</em> 26 Feb 2008 Web.25 Aug 2009. &lt; http://www.dailykos.com/storyonly/2008/2/26/183555/011/136/464643&gt;.</p>
<p><a href="#_ftnref4">[4]</a> Silver, Nate. “North   Carolina Prediction: Obama by Double Digits.” <em>FiveThirtyEight.com 5</em> May 2008 Web.25 Aug 2009. &lt; http://www.fivethirtyeight.com/2008/05/north-carolina-prediction-obama-by.html&gt;.</p>
<p><a href="#_ftnref5">[5]</a> Adapted from: Lich-Tyler, Stephen. &#8220;Supplemental Notes from Econ 570: Econometrics.&#8221; (2008)</p>
<p><a href="#_ftnref6">[6]</a> Silver, Nate. &#8220;What’s an Obama State? With February predications.&#8221; <em>Daily Kos</em> 9 Feb 2008 Web.22 Aug 2009. &lt; http://www.dailykos.com/storyonly/2008/2/9/13227/22519/239/453361 &gt;.</p>
<p><a href="#_ftnref7">[7]</a> Of course, I did not originate this belief.</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2009/11/pokeit-letter-2-nate-silvers-crystal-ball/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pokeit Pre-Flop Prototype is Available for Download</title>
		<link>http://pokeitmethod.com/2009/11/pokeit-pre-flop-prototype-is-available-for-download/</link>
		<comments>http://pokeitmethod.com/2009/11/pokeit-pre-flop-prototype-is-available-for-download/#comments</comments>
		<pubDate>Sun, 15 Nov 2009 17:53:20 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[pokeit_models]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=27</guid>
		<description><![CDATA[The Pokeit pre-flop prototype is available for download. You input the game state, your hole cards, and the action during the hand, and the Excel based tool gives you a 2-way equity estimate of your hand against each opponent’s hand range.
Hand range estimates are derived using a multinomial logit regression model. The general idea is [...]]]></description>
			<content:encoded><![CDATA[<p>The Pokeit pre-flop prototype is available for download. You input the game state, your hole cards, and the action during the hand, and the Excel based tool gives you a 2-way equity estimate of your hand against each opponent’s hand range.</p>
<p>Hand range estimates are derived using a <span style="color: #3366ff;"><a href="http://en.wikipedia.org/wiki/Multinomial_logit">multinomial logit</a></span> regression model. The general idea is that the model takes the input variables of game state and opponent action and then tells you the conditional probability that your opponent holds each of the 169 possible hands pre-flop. Each component of your opponent’s hand range distribution has an associated showdown equity value against your hand. To derive the estimated equity of your hand against the distribution, we multiply the probability of your opponent holding each hand, by the showdown equity, and sum the whole thing together.</p>
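<p>Here is a stripped-down sketch of that last calculation. It is not the code inside the workbook: the three hands, their logit scores, and the showdown equities are hypothetical placeholders, and the real model covers all 169 hands with scores that depend on the input variables listed below.</p>

```python
import math


def softmax(scores):
    """Turn multinomial-logit scores into probabilities that sum to 1."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {hand: math.exp(s - top) for hand, s in scores.items()}
    total = sum(exps.values())
    return {hand: e / total for hand, e in exps.items()}


# Hypothetical logit scores for three of the 169 hands after an opponent raise.
scores = {"AA": 1.2, "KQs": 0.4, "72o": -2.0}
# Hypothetical showdown equity of hero's hand against each opponent hand.
showdown_equity = {"AA": 0.18, "KQs": 0.54, "72o": 0.70}

hand_range = softmax(scores)  # conditional probability of each opponent hand

# Probability-weighted sum of showdown equities = equity against the range.
equity = sum(hand_range[hand] * showdown_equity[hand] for hand in hand_range)
print(round(equity, 3))
```

<p>Because the probabilities sum to one, the result is always a weighted average that sits between the worst and best individual matchups.</p>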
<p><strong>The input variables of the model include:</strong></p>
<p>♦ Player action on each pre-flop ‘round’ of betting (call/check or raise)<br />
♦ Action behind the player (call, raise, 3-bet, etc.)<br />
♦ A variable that combines position with # of players at the table<br />
♦ Amount bet on each pre-flop ‘round’<br />
♦ An interaction between player action and action behind<br />
♦ An interaction between player action and position/# players<br />
♦ An interaction between player action and amount bet</p>
<p>The data used to produce these estimates comes from 181,007 hands’ worth of 6-max NL-Hold’em on PokerStars with limits ranging from $0.50-$1.00 NL to $3.00-$6.00 NL. The analysis is completely derived from the revealed hands of my friend Joe, who was kind enough to ship me his PokerTracker database. As you’ll notice when you test it out, Joe is pretty TAG. Some of his stats are vpip = 0.18, pf_raise = 0.12, wwsf = 0.43, and total_af = 2.77. Because the statistical model is based off of just one player, the prototype is somewhat impractical for in-game use. It should however give you an idea of what we have in store.</p>
<p><strong>Just a few notes about the functionality of the Excel workbook:</strong></p>
<p><img class="aligncenter size-full wp-image-43" title="main5" src="http://pokeitmethod.com/wp-content/uploads/2009/11/main5.jpg" alt="main5" width="639" height="360" /><br />
<span id="more-27"></span><br />
♦ Top left of the ‘main’ sheet shows the position of each player at the table. This is where you edit the player names and create empty seats if there are fewer than 6 players. One player must be named ‘hero’</p>
<p>♦ The gray cells throughout the workbook are your input cells. Variables with discrete values, such as hero’s position, hero’s hole cards, and player action, have pull down menus. Variables with continuous values, such as amount bet, require typing</p>
<p>♦ To input the action for the hand, first select one of the 3 player actions in the ‘action’ column: F – Fold, C – Call, R – Raise</p>
<p>♦ Next in the ‘total bet’ column, type in the total amount the player has bet up to that point in the hand. For example, if someone raises to $10 and you call in the BB, type in $10. If an opponent 3-bets and you 4-bet, type in the total value of your 4-bet</p>
<p>♦ Estimated 2-way equity is displayed in the ‘equity’ column. If the opponent has already acted in the hand, the conditional equity is shown. If action has yet to reach a player, equity against a random hand is displayed</p>
<p>♦ The ‘New Hand’ button resets the action</p>
<p><img class="aligncenter size-full wp-image-45" title="equity5" src="http://pokeitmethod.com/wp-content/uploads/2009/11/equity5.jpg" alt="equity5" width="642" height="429" /></p>
<p>♦ The ‘equity’ tab provides a graphical view of your hand’s equity against each of the 169 pre-flop hand categories. The generalized categories 1-13 of each hand are shown to the right, and you can highlight a particular category using the ‘category’ pull down menu</p>
<p>♦ Using the ‘view’ pull down menu, you can switch between viewing actual equity derived from ~50,000 Monte Carlo trials, the average equity of each generalized category, and a third view that shows the difference between the two</p>
<p>♦ Both worksheets have a table and chart combination which displays the hand range and category equities of each opponent. If ‘automatic’ is selected (cell J2 on the main sheet), it shows data for the player whose action it’s on</p>
<p>♦ The remaining worksheets house the innards of the model. Feel free to snoop around if you’d like</p>
<p style="text-align: center;">
<p style="text-align: center;"><a title="Download Pokeit Pre-Flop Model" href="http://pokeitmethod.com/wp-content/uploads/2010/04/Pokeit-Pre-Flop-Equity-Model-v-0.5.xls.zip" target="_blank"><span style="color: #0000ff;">Download Pokeit Pre-Flop Model (3.70 MB) &#8212; Updated as of 4/07/10</span><br />
</a></p>
<p style="text-align: left;">- chaz</p>
<p style="text-align: center;">
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2009/11/pokeit-pre-flop-prototype-is-available-for-download/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Pokeit Letter #1 – Poker Through the Lens of Expected Value</title>
		<link>http://pokeitmethod.com/2009/11/poker-through-the-lens-of-expected-value/</link>
		<comments>http://pokeitmethod.com/2009/11/poker-through-the-lens-of-expected-value/#comments</comments>
		<pubDate>Fri, 13 Nov 2009 05:07:23 +0000</pubDate>
		<dc:creator>chaz</dc:creator>
				<category><![CDATA[letters]]></category>

		<guid isPermaLink="false">http://pokeitmethod.com/?p=9</guid>
		<description><![CDATA[The basic problem in poker is that unlike craps, roulette, blackjack, or any of the various table games you might find yourself blowing cash on at Casino Royal, the expected value of any bet in poker is uncertain. Calculating expected value in roulette is pretty straightforward in comparison. Take for example a $100 bet on [...]]]></description>
			<content:encoded><![CDATA[<p>The basic problem in poker is that unlike craps, roulette, blackjack, or any of the various table games you might find yourself blowing cash on at Casino Royal, the expected value of any bet in poker is uncertain. Calculating expected value in roulette is pretty straightforward in comparison. Take for example a $100 bet on black. The equation for the expected value, or average profit per bet would be:</p>
<p>&lt;X, black&gt; = p(black)*(payout if black) &#8211; $100</p>
<p>That is, the probability that black comes up, multiplied by the payout if black comes up, minus your original bet. On a standard American table, the payout for black is 1:1 and the probability of hitting black is a little less than even, since 18 of the 38 pockets are black:</p>
<p>&lt;X, black&gt; = (18/38)*$200 &#8211; $100<br />
&lt;X, black&gt; = $94.74 &#8211; $100 = <strong>-$5.26</strong></p>
<p>Almost every bet in roulette will give you the same expected value, <strong>-5.26%</strong>, with the only difference being the variance between individual outcomes. The same concept is true for other table games. The expected value of betting the Pass line in craps is <strong>-1.41%</strong> per bet and, likewise, the expected value of betting &#8216;Hard Eight&#8217; is <strong>-9.09%</strong> per bet. This is of course how the casinos make their money. Poker is different from the table games primarily because the expected value of a bet changes from one situation to the next, and estimates of expected value are always uncertain.</p>
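<p>The table-game numbers above can be checked with a few lines of Python. This is a small sketch, not part of the original letter: the function name is made up, and the Hard Eight line assumes the common 9:1 payout. Writing the formula as winnings minus losses is algebraically the same as the probability times the total payout, minus the original bet:</p>

```python
def expected_value(p_win, payout_to_one, stake):
    """Average profit per bet: win payout_to_one * stake with p_win, else lose the stake."""
    return p_win * payout_to_one * stake - (1 - p_win) * stake

# $100 on black at an American table: 18 black pockets out of 38, paid 1:1.
roulette_black = expected_value(18 / 38, 1, 100)

# 'Hard Eight' in craps, assuming the common 9:1 payout: one winning roll (4-4)
# against ten losing rolls (six ways to make 7, four ways to make an easy 8).
hard_eight = expected_value(1 / 11, 9, 100)

print(round(roulette_black, 2))  # -5.26
print(round(hard_eight, 2))      # -9.09
```

<p>With a $100 stake, the dollar result doubles as the percentage per bet.</p>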
<p><span id="more-9"></span>Calculating the expected value of even the simplest bet in poker can be an elaborate process of brute mathematical force and artful inductive reasoning. As Dan Harrington outlines in the introduction to <em>Harrington on Hold&#8217;em: Volume 1</em>, there are four factors a player must consider when making or calling bets in poker:</p>
<p style="padding-left: 30px;">1. The likelihood that their hand will improve as more cards are dealt, which is pretty much a straight mathematical exercise.</p>
<p style="padding-left: 30px;">2. An estimate of the hand their opponent may hold, which is an exercise in inductive reasoning, based on hands he has held in the past, his general style of play, and the bets he has made thus far.</p>
<p style="padding-left: 30px;">3. The likelihood their opponent&#8217;s hand will improve, another mathematical exercise but complicated by the fact that their opponent&#8217;s hand is not known for sure.</p>
<p style="padding-left: 30px;">4. The money odds being offered by the pot.</p>
<p>Let&#8217;s try to map these factors onto our expected value formula with a simple example from No-limit Hold’em:</p>
<p>You, the Hero, are facing a $100 bet into a $400 pot on the flop. Ignoring for now &#8216;implied odds&#8217; of winning extra bets on future streets, the expected value of calling this bet is given by the formula:</p>
<p>&lt;Hero, call&gt; = p(win)*(new pot size, including the call) &#8211; (cost of call)<br />
&lt;Hero, call&gt; = p(win)*($600) &#8211; ($100) = ?</p>
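<p>As a rough sketch of the pot-odds side of this calculation, the snippet below counts what the hero stands to win (the $400 pot plus the $100 bet) against what the hero stands to lose (the $100 call). That is the same as p(win) times the $600 pot after the call, minus the $100 call. The function names are made up for illustration, and the 25% win probability is just a placeholder:</p>

```python
def call_ev(p_win, pot_before_bet, bet):
    """EV of calling: win the pot plus the bet with p_win, otherwise lose the call."""
    return p_win * (pot_before_bet + bet) - (1 - p_win) * bet

def break_even_probability(pot_before_bet, bet):
    """Win probability at which calling has zero expected value."""
    return bet / (pot_before_bet + 2 * bet)

needed = break_even_probability(400, 100)
print(round(needed, 4))                   # 0.1667
print(round(call_ev(0.25, 400, 100), 2))  # 50.0
```

<p>In other words, the call only needs to win about one time in six to break even. Everything else hinges on p(win).</p>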
<p>As Action Dan alluded to, breaking down the p(win) is the tricky part. All you know for sure are 2 things:</p>
<p style="padding-left: 30px;">1. Your own hand</p>
<p style="padding-left: 30px;">2. The board (in this case the flop)</p>
<p>Factors 1 &amp; 3, the odds of your hand improving and the odds of your opponent improving, are jointly determined. That is, in order to estimate your ESPN-style showdown win percentage, you need to know your hand, the board, <strong>and </strong>your opponent’s hand. But unless there is some sort of malfunction, you won’t know your opponent’s hole cards until the showdown. No hole cards, no p(win); no p(win), no estimate of expected value.</p>
<p>In lieu of facts about our opponent’s hole cards, we make estimates of what he is holding and then calculate the odds of winning against those hands. Taken to its logical conclusion, just knowing your own hand and the board gives you the ability to calculate the probability of your hand winning at showdown against every possible hand he could be holding. On the flop, 47 cards remain unseen, so there are 1081 possible hole card combinations our opponent could be holding (n!/(r!(n-r)!) with n = 47 and r = 2). We can use a 1081-dimensional vector <strong>p</strong> = (p<sub>1</sub>, p<sub>2</sub>, &#8230;, p<sub>1081</sub>) to represent this collection of win probabilities (<a href="http://www.unc.edu/%7Eswlt/metricsmatrix.pdf" target="_blank">click here for a nice primer on matrix algebra</a>). Ignore for a second the impracticality of running 1081 calculations in a live game and assume you’ve rigged up one of the many available <a href="http://www.codingthewheel.com/archives/poker-hand-evaluator-roundup" target="_blank">poker hand evaluators</a> to do the hard work for you.</p>
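<p>The 1081 figure is easy to verify by enumerating the opponent’s possible holdings directly. The hero hand and flop below are arbitrary examples chosen only for illustration, and the card notation is just one convenient convention:</p>

```python
from itertools import combinations
from math import comb

RANKS = "23456789TJQKA"
SUITS = "cdhs"
DECK = [rank + suit for rank in RANKS for suit in SUITS]

# Hypothetical hero hand and flop, chosen only for illustration.
hero = ["As", "Kd"]
flop = ["Qh", "7c", "2s"]

# Every card not in the hero's hand or on the board could be in the opponent's hand.
unseen = [card for card in DECK if card not in hero + flop]
villain_hands = list(combinations(unseen, 2))

print(len(unseen))         # 47
print(len(villain_hands))  # 1081
print(comb(47, 2))         # 1081
```

<p>Each entry of <code>villain_hands</code> corresponds to one component of the win probability vector.</p>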
<p>Provided we can get our evaluator to spit out thousand-dimensional vectors, we&#8217;re left with only one remaining item on Mr. Harrington’s list: our opponent&#8217;s hand. Of course, the problem still remains that we don&#8217;t know what our opponent&#8217;s hand is unless it is revealed at the showdown. However, if we think of it in a different way, a probabilistic way, until our opponent&#8217;s hand is revealed, it really isn&#8217;t one particular hand. Rather, it’s a probability distribution over every possible hand, weighted towards the most likely hands. We can define <strong>y </strong>= (y<sub>1</sub>, y<sub>2</sub>, &#8230;, y<sub>1081</sub>) as the vector of probabilities of our opponent holding each possible hand on the flop. Here, each individual piece of vector <strong>y </strong>matches up with a piece of vector <strong>p</strong>. For example, there is a probability <strong>y<sub>j</sub></strong> that our opponent has aces and there is a probability <strong>p<sub>j</sub></strong> that our hand will win against the aces at showdown. Multiplying each <strong>p<sub>j</sub></strong> by its respective <strong>y<sub>j</sub></strong> and summing the products gives the dot product of the vectors <strong>p</strong> and <strong>y</strong>:</p>
<p><strong>y · p = Σ<sub>j</sub> y<sub>j</sub>p<sub>j</sub> = y<sub>1</sub>p<sub>1</sub> + y<sub>2</sub>p<sub>2</sub> + &#8230; + y<sub>1081</sub>p<sub>1081</sub></strong></p>
<p>This linear function of <strong>y</strong> and <strong>p </strong>can be thought of as the probability of the hero’s hand winning against our opponent’s hand distribution. This is in fact the <strong>p(win)</strong> that we were looking for. Provided we know the product of the vectors <strong>y </strong>and <strong>p</strong> and the pot odds, we would be able to identify the expected value of any bet in poker.</p>
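<p>To make the dot product concrete, here is a toy version with a three-hand range standing in for the full 1081-entry vectors. The range weights and equities are invented numbers, and the function name is only for illustration:</p>

```python
def win_probability(hand_probs, showdown_equity):
    """Dot product y . p over the opponent's possible hands."""
    return sum(hand_probs[hand] * showdown_equity[hand] for hand in hand_probs)

# Toy three-hand range standing in for the full 1081-entry vectors.
# Both the range weights (y) and the equities (p) are made-up numbers.
y = {"AA": 0.2, "KQs": 0.5, "76s": 0.3}
p = {"AA": 0.10, "KQs": 0.60, "76s": 0.80}

p_win = win_probability(y, p)
print(round(p_win, 2))  # 0.56
```

<p>Shifting weight in <strong>y</strong> toward the hands that beat us pulls p(win) down, even though the individual equities in <strong>p</strong> never change.</p>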
<p>Everything in the expected value function can be calculated in a rather straightforward mathematical way, save for the hand distribution vector <strong>y</strong>. In practice, we arrive at an estimate of <strong>y </strong>by incorporating observable &#8216;tells&#8217; such as betting patterns, general style of play, and the frequency of being dealt certain hands, to place an opponent on a range of likely hands. Let&#8217;s define these observable tells as the set of variables in vector <strong>x</strong> = (x<sub>1</sub>, x<sub>2</sub>, &#8230;, x<sub>n</sub>). In calculating the expected probability of an opponent having a particular hand, we are implicitly constructing a linear function <strong>y<sub>j</sub> = Bx<sub>j</sub></strong>, whereby <strong>x<sub>j</sub></strong> are the things we observe, <strong>y<sub>j</sub></strong> is the probability of having a particular hand in our distribution, and <strong>B </strong>is our estimate of the relationship between <strong>y<sub>j</sub></strong> and <strong>x<sub>j</sub></strong>.</p>
<p>Done at the poker table, it is called hand reading. Done with reams of data and a statistical package, it is called econometrics&#8230;</p>
<p>-chaz</p>
]]></content:encoded>
			<wfw:commentRss>http://pokeitmethod.com/2009/11/poker-through-the-lens-of-expected-value/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
