<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" gd:etag="W/&quot;D0EGSXk_cSp7ImA9WxBVGEU.&quot;"><id>tag:blogger.com,1999:blog-21831384</id><updated>2010-02-22T16:40:28.749-08:00</updated><title>Bzst...</title><subtitle type="html">Musings on Business and Statistics (oops, data analysis).</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://blog.bzst.com/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/" /><link rel="next" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default?start-index=26&amp;max-results=25&amp;redirect=false&amp;v=2" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>103</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/bzstblog" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="bzstblog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry gd:etag="W/&quot;D0EGSXk9fSp7ImA9WxBVGEU.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-4830702490776260207</id><published>2010-02-22T16:30:00.000-08:00</published><updated>2010-02-22T16:40:28.765-08:00</updated><app:edited 
xmlns:app="http://www.w3.org/2007/app">2010-02-22T16:40:28.765-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="data collection" /><category scheme="http://www.blogger.com/atom/ns#" term="data mining" /><title>Online data collection</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/4830702490776260207/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=4830702490776260207" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4830702490776260207?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4830702490776260207?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2010/02/online-data-collection.html" title="Online data collection" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">Online data are a huge resource for research as well as for practice. Although it is often tempting to "scrape everything" using technologies like web-crawling, it is extremely important to keep the...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/OfX4AjSVppUmIylZdZfG08fzGro/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/OfX4AjSVppUmIylZdZfG08fzGro/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/OfX4AjSVppUmIylZdZfG08fzGro/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/OfX4AjSVppUmIylZdZfG08fzGro/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;D0YAR30_eip7ImA9WxBWGUk.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3154502324267187549</id><published>2010-02-11T19:06:00.000-08:00</published><updated>2010-02-11T19:25:46.342-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-02-11T19:25:46.342-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="overfitting" /><category scheme="http://www.blogger.com/atom/ns#" term="Explaining vs. Predicting" /><title>Over-fitting analogies</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3154502324267187549/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3154502324267187549" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3154502324267187549?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3154502324267187549?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2010/02/over-fitting-analogies.html" title="Over-fitting analogies" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">To explain the danger of model over-fitting in prediction to data mining newcomers, I often use the following analogy: Say you are at the tailor's, who will be sewing an expensive suit (or dress) for...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/G_b3RlZ-VzG0IyRM2YOy4vHKqRk/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/G_b3RlZ-VzG0IyRM2YOy4vHKqRk/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/G_b3RlZ-VzG0IyRM2YOy4vHKqRk/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/G_b3RlZ-VzG0IyRM2YOy4vHKqRk/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;A0ADRH07fCp7ImA9WxBXFUg.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-1334870741576516465</id><published>2010-01-26T17:14:00.000-08:00</published><updated>2010-01-26T18:36:15.304-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-26T18:36:15.304-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="software" /><category scheme="http://www.blogger.com/atom/ns#" term="data mining" /><title>Drag-and-drop data mining software for the classroom</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/1334870741576516465/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=1334870741576516465" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/1334870741576516465?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/1334870741576516465?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2010/01/drag-and-drop-data-mining-software-for.html" title="Drag-and-drop data mining software for the classroom" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">The drag-and-drop (D&amp;amp;D) concept in data mining tools is very neat. 
You "drag" icons (aka "nodes") that perform different operations, and "connect" them to create a data mining process. This is also...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/uEsNI7Cz2dM2ERrlV1INNpn8Uxw/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/uEsNI7Cz2dM2ERrlV1INNpn8Uxw/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/uEsNI7Cz2dM2ERrlV1INNpn8Uxw/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/uEsNI7Cz2dM2ERrlV1INNpn8Uxw/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;Ck8ERn44eCp7ImA9WxBRF0Q.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-4778384385520814474</id><published>2010-01-06T06:00:00.001-08:00</published><updated>2010-01-06T07:13:27.030-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2010-01-06T07:13:27.030-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="spotfire" /><category scheme="http://www.blogger.com/atom/ns#" term="data visualization" /><title>Creating map charts</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/4778384385520814474/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=4778384385520814474" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4778384385520814474?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4778384385520814474?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2010/01/creating-map-charts.html" title="Creating map charts" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://4.bp.blogspot.com/_x7QjiZypFj0/S0SXlz1iLHI/AAAAAAAAAHg/5tQijF7_G64/s72-c/MapHappiness.png" height="72" width="72" /><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">With the growing amount of available geographical data, it is useful to be 
able to visualize one's data on top of a map. Visualizing numeric and/or categorical information on top of a map is called a...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/VMb0tD45tz6aUGJjfk4apbb6Ihg/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/VMb0tD45tz6aUGJjfk4apbb6Ihg/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/VMb0tD45tz6aUGJjfk4apbb6Ihg/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/VMb0tD45tz6aUGJjfk4apbb6Ihg/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;DE8ASXo6cSp7ImA9WxBSFU4.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3785008126161862855</id><published>2009-12-22T18:48:00.001-08:00</published><updated>2009-12-22T18:54:08.419-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-12-22T18:54:08.419-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="data mining" /><category scheme="http://www.blogger.com/atom/ns#" term="Course" /><title>My newest batch of graduating data mining MBAs</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3785008126161862855/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3785008126161862855" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3785008126161862855?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3785008126161862855?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/12/newest-batch-of-data-mining-mbas.html" title="My newest batch of graduating data mining MBAs" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://4.bp.blogspot.com/_x7QjiZypFj0/SzGFo7Q9A4I/AAAAAAAAAHY/NFUTM-SRmOs/s72-c/BUDT733_Fall2009.JPG" height="72" width="72" /><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content 
type="html">Congratulations to our Smith School's Fall 2009 "Data Mining for Business" students. I look forward to hearing about your future endeavors -- use data mining to do good!
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/TVf1YTrgPcjA79N-DQl6UPimDFc/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/TVf1YTrgPcjA79N-DQl6UPimDFc/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/TVf1YTrgPcjA79N-DQl6UPimDFc/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/TVf1YTrgPcjA79N-DQl6UPimDFc/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;C08MSH07fCp7ImA9WxBTFk4.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-72163441335795630</id><published>2009-12-12T07:28:00.001-08:00</published><updated>2009-12-12T07:31:29.304-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-12-12T07:31:29.304-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="predictive accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="surveys" /><category scheme="http://www.blogger.com/atom/ns#" term="data mining" /><category scheme="http://www.blogger.com/atom/ns#" term="sampling" /><title>Stratified sampling: why and how?</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/72163441335795630/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=72163441335795630" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/72163441335795630?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/72163441335795630?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/12/stratified-sampling-why-and-how.html" title="Stratified sampling: why and how?" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">In surveys and polls it is common to use stratified sampling. 
Stratified sampling is also used in data mining, when drawing a sample from a database (for the purpose of model building). This post...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/Z3rqTAHZB8dCVKjnymc149paCog/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/Z3rqTAHZB8dCVKjnymc149paCog/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/Z3rqTAHZB8dCVKjnymc149paCog/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/Z3rqTAHZB8dCVKjnymc149paCog/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;CUEFQHYyfCp7ImA9WxNUFUs.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-4340229192979962594</id><published>2009-11-06T19:09:00.001-08:00</published><updated>2009-11-06T19:13:31.894-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-11-06T19:13:31.894-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="p-value" /><title>The value of p-values: Science magazine asks</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/4340229192979962594/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=4340229192979962594" title="12 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4340229192979962594?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4340229192979962594?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/11/value-of-p-values-science-magazine-asks.html" title="The value of p-values: Science magazine asks" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">12</thr:total><content type="html">My students know how I cringe when I am forced to teach them p-values. I have always felt that their meaning is hard to grasp, and hence they are mostly abused when used by non-statisticians. This is...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/x1rUgK_7m9QyXXyoNU7oxj1_qVk/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/x1rUgK_7m9QyXXyoNU7oxj1_qVk/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/x1rUgK_7m9QyXXyoNU7oxj1_qVk/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/x1rUgK_7m9QyXXyoNU7oxj1_qVk/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;CkAHQn06cSp7ImA9WxNVFkw.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-7962718699899857814</id><published>2009-10-26T19:32:00.001-07:00</published><updated>2009-10-26T19:32:13.319-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-26T19:32:13.319-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="software" /><category scheme="http://www.blogger.com/atom/ns#" term="p-value" /><category scheme="http://www.blogger.com/atom/ns#" term="regression" /><title>Testing directional hypotheses: p-values can bite</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/7962718699899857814/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=7962718699899857814" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/7962718699899857814?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/7962718699899857814?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/10/testing-directional-hypotheses-p-values.html" title="Testing directional hypotheses: p-values can bite" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">3</thr:total><content type="html">I've recently had interesting discussions with colleagues in Information Systems regarding testing directional hypotheses. 
Following their request, I'm posting about this apparently elusive issue....
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/wYVfBimZMYwbrwWURkne5fDY39A/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/wYVfBimZMYwbrwWURkne5fDY39A/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/wYVfBimZMYwbrwWURkne5fDY39A/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/wYVfBimZMYwbrwWURkne5fDY39A/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;Dk4MRnk5cCp7ImA9WxNWEE4.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-2760992945888915997</id><published>2009-10-08T13:49:00.001-07:00</published><updated>2009-10-08T13:49:47.728-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-08T13:49:47.728-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="data" /><category scheme="http://www.blogger.com/atom/ns#" term="education" /><category scheme="http://www.blogger.com/atom/ns#" term="teaching business data mining" /><category scheme="http://www.blogger.com/atom/ns#" term="software" /><category scheme="http://www.blogger.com/atom/ns#" term="data mining" /><title>SAS On Demand: Enterprise Miner -- Update</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/2760992945888915997/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=2760992945888915997" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/2760992945888915997?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/2760992945888915997?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/10/sas-on-demand-enterprise-miner-update.html" title="SAS On Demand: Enterprise Miner -- Update" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total 
xmlns:thr="http://purl.org/syndication/thread/1.0">3</thr:total><content type="html">Following up on my previous posting about using SAS Enterprise Miner via the On Demand platform: From continued communication with experts at SAS, it turns out that with EM version 5.3, which is...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/ucBQ2e_Y-vZBWQeHIFGK_H67WXE/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/ucBQ2e_Y-vZBWQeHIFGK_H67WXE/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/ucBQ2e_Y-vZBWQeHIFGK_H67WXE/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/ucBQ2e_Y-vZBWQeHIFGK_H67WXE/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;CUAHR3YzeSp7ImA9WxNXFU8.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-4549710266311080653</id><published>2009-10-02T15:48:00.001-07:00</published><updated>2009-10-02T15:48:56.881-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-10-02T15:48:56.881-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="data" /><category scheme="http://www.blogger.com/atom/ns#" term="teaching business data mining" /><category scheme="http://www.blogger.com/atom/ns#" term="software" /><category scheme="http://www.blogger.com/atom/ns#" term="data mining" /><title>SAS On Demand: Enterprise Miner</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/4549710266311080653/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=4549710266311080653" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4549710266311080653?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/4549710266311080653?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/10/sas-on-demand-enterprise-miner.html" title="SAS On Demand: Enterprise Miner" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">3</thr:total><content type="html">I am in the process of trying out SAS Enterprise Miner via the (relatively new) 
SAS On Demand for Academics. In our MBA data mining course at Smith, we introduce SAS EM. In the early days, we'd get...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/aWgxDk8yqPizfqvRFgStWOIHJ0w/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/aWgxDk8yqPizfqvRFgStWOIHJ0w/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/aWgxDk8yqPizfqvRFgStWOIHJ0w/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/aWgxDk8yqPizfqvRFgStWOIHJ0w/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;AkcNQ3c4eip7ImA9WxNRGUo.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-7516459931197767019</id><published>2009-09-14T18:28:00.001-07:00</published><updated>2009-09-14T18:28:12.932-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-09-14T18:28:12.932-07:00</app:edited><title>Interpreting log-transformed variables in linear regression</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/7516459931197767019/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=7516459931197767019" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/7516459931197767019?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/7516459931197767019?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/09/interpreting-log-transformed-variables.html" title="Interpreting log-transformed variables in linear regression" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">Statisticians love variable transformations. Log-em, square-em, square-root-em, or even use the all-encompassing Box-Cox transformation, and voilà: you get variables that are "better behaved". Good...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/iA8-tOdQVmpYruRRVw-5C1jFMT8/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/iA8-tOdQVmpYruRRVw-5C1jFMT8/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/iA8-tOdQVmpYruRRVw-5C1jFMT8/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/iA8-tOdQVmpYruRRVw-5C1jFMT8/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;AkIGSXY6eip7ImA9WxNSFkU.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3305516861181936701</id><published>2009-08-30T20:04:00.001-07:00</published><updated>2009-08-30T20:15:28.812-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-08-30T20:15:28.812-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="scatterplot" /><category scheme="http://www.blogger.com/atom/ns#" term="graphs" /><category scheme="http://www.blogger.com/atom/ns#" term="Excel" /><category scheme="http://www.blogger.com/atom/ns#" term="software" /><title>Creating color-coded scatterplots in Excel: a nightmare</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3305516861181936701/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3305516861181936701" title="15 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3305516861181936701?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3305516861181936701?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/08/creating-color-coded-scatterplots-in.html" title="Creating color-coded scatterplots in Excel: a nightmare" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="http://1.bp.blogspot.com/_x7QjiZypFj0/Sps-j_Yw-BI/AAAAAAAAAHM/3TYp9rbgyuY/s72-c/ScatterPlot.png" height="72" width="72" /><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">15</thr:total><content type="html">Scatterplots are extremely popular and useful graphical displays for examining the relationship between two numeric variables. They get even better when we add the use of color/hue and shape to...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/zGwNup5_P05R0An3ZlqOK480GCE/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/zGwNup5_P05R0An3ZlqOK480GCE/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/zGwNup5_P05R0An3ZlqOK480GCE/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/zGwNup5_P05R0An3ZlqOK480GCE/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;CkQCQXk5cCp7ImA9WxNTF0Q.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-9185200001507488699</id><published>2009-08-20T10:23:00.001-07:00</published><updated>2009-08-20T10:46:00.728-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-08-20T10:46:00.728-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="data" /><category scheme="http://www.blogger.com/atom/ns#" term="competition" /><category scheme="http://www.blogger.com/atom/ns#" term="graphs" /><category scheme="http://www.blogger.com/atom/ns#" term="software" /><title>Data Exploration Celebration: The ENBIS 2009 Challenge</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/9185200001507488699/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=9185200001507488699" title="5 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/9185200001507488699?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/9185200001507488699?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/08/data-exploration-celebration-enbis-2009.html" title="Data Exploration Celebration: The ENBIS 2009 Challenge" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">5</thr:total><content type="html">The European  Network for Business and Industrial 
Statistics (ENBIS) has released the 2009 ENBIS Challenge. The challenge this time is to use an exploratory data analysis (EDA) tool to answer a bunch...
&lt;p&gt;&lt;a href="http://feedads.g.doubleclick.net/~a/otSIerdanYDXe9AoUiaJcfknB1I/0/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/otSIerdanYDXe9AoUiaJcfknB1I/0/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;br/&gt;
&lt;a href="http://feedads.g.doubleclick.net/~a/otSIerdanYDXe9AoUiaJcfknB1I/1/da"&gt;&lt;img src="http://feedads.g.doubleclick.net/~a/otSIerdanYDXe9AoUiaJcfknB1I/1/di" border="0" ismap="true"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;</content></entry><entry gd:etag="W/&quot;CUUGQXg7fCp7ImA9WxJXGEQ.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-7575798772698042924</id><published>2009-06-13T04:40:00.001-07:00</published><updated>2009-06-13T04:40:20.604-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-06-13T04:40:20.604-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="pivot table" /><category scheme="http://www.blogger.com/atom/ns#" term="graphs" /><category scheme="http://www.blogger.com/atom/ns#" term="Excel" /><category scheme="http://www.blogger.com/atom/ns#" term="software" /><title>Histograms in Excel</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/7575798772698042924/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=7575798772698042924" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/7575798772698042924?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/7575798772698042924?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/06/histograms-in-excel.html" title="Histograms in Excel" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">2</thr:total><content type="html">Histograms are very useful charts for displaying the distribution of a numerical measurement. 
The idea is to bucket the numerical measurement into intervals, and then to display the frequency (or...
</content></entry><entry gd:etag="W/&quot;DUYERH04cSp7ImA9WxJTFEo.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3434465982321509601</id><published>2009-04-23T01:58:00.001-07:00</published><updated>2009-04-23T01:58:25.339-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-04-23T01:58:25.339-07:00</app:edited><title>Fragmented</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3434465982321509601/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3434465982321509601" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3434465982321509601?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3434465982321509601?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/04/fragmented.html" title="Fragmented" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">1</thr:total><content type="html">In the process of planning the syllabus for my next PhD course on "Scientific Data-Collection", to be offered for the third time in Spring 2009, I have realized how fragmented the education of...
</content></entry><entry gd:etag="W/&quot;A0IFQXk_eyp7ImA9WxJTEEw.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-2980102374163836075</id><published>2009-04-17T18:51:00.001-07:00</published><updated>2009-04-17T18:51:50.743-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-04-17T18:51:50.743-07:00</app:edited><title>Collecting online data (for research)</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/2980102374163836075/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=2980102374163836075" title="2 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/2980102374163836075?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/2980102374163836075?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/04/collecting-online-data-for-research.html" title="Collecting online data (for research)" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">2</thr:total><content type="html">In the new era of large amounts of publicly available data, an issue that is sometimes overlooked is ethical data collection. Whereas for experimental studies involving humans we have clear...
</content></entry><entry gd:etag="W/&quot;DUENQnw-cSp7ImA9WxVbEEo.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-202753945100546472</id><published>2009-03-25T23:27:00.000-07:00</published><updated>2009-03-26T07:28:13.259-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-03-26T07:28:13.259-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="PCA" /><category scheme="http://www.blogger.com/atom/ns#" term="Explaining vs. Predicting" /><category scheme="http://www.blogger.com/atom/ns#" term="Factor Analysis" /><category scheme="http://www.blogger.com/atom/ns#" term="data compression" /><title>Principal Components Analysis vs. Factor Analysis</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/202753945100546472/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=202753945100546472" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/202753945100546472?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/202753945100546472?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/03/principal-components-analysis-vs-factor.html" title="Principal Components Analysis vs. 
Factor Analysis" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">Here is an interesting example of how similar mechanics lead to two very different statistical tools. Principal Components Analysis (PCA) is a powerful method for data compression, in the sense of...
</content></entry><entry gd:etag="W/&quot;CkcGRHwyfCp7ImA9WxVUGUo.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-6831796612325699204</id><published>2009-03-24T23:23:00.000-07:00</published><updated>2009-03-25T01:33:45.294-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-03-25T01:33:45.294-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="predictive accuracy" /><category scheme="http://www.blogger.com/atom/ns#" term="data collection" /><category scheme="http://www.blogger.com/atom/ns#" term="observational data" /><category scheme="http://www.blogger.com/atom/ns#" term="Netflix" /><category scheme="http://www.blogger.com/atom/ns#" term="Explaining vs. Predicting" /><category scheme="http://www.blogger.com/atom/ns#" term="experiment" /><title>Are experiments always better?</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/6831796612325699204/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=6831796612325699204" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/6831796612325699204?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/6831796612325699204?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/03/are-experiments-always-better.html" title="Are experiments always better?" 
/><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">1</thr:total><content type="html">This continues my "To Explain or To Predict?" argument (in brief: statistical models aimed at causal explanation will not necessarily be good predictors). And now, I move to a very early stage in the...
</content></entry><entry gd:etag="W/&quot;AkUMQno4eCp7ImA9WxVVF04.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-6314338218383723527</id><published>2009-03-10T08:01:00.000-07:00</published><updated>2009-03-10T19:24:43.430-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-03-10T19:24:43.430-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="performance metrics" /><category scheme="http://www.blogger.com/atom/ns#" term="R-squared" /><category scheme="http://www.blogger.com/atom/ns#" term="goodness-of-fit" /><title>What R-squared is (and is not)</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/6314338218383723527/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=6314338218383723527" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/6314338218383723527?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/6314338218383723527?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/03/what-r-squared-is-and-is-not.html" title="What R-squared is (and is not)" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">R-squared (aka "coefficient of determination", or for short, R2) is a popular measure used in linear regression to assess the strength of the linear 
relationship between the inputs and the output. In...
</content></entry><entry gd:etag="W/&quot;CkUAQnY8fSp7ImA9WxVVFkw.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3132252569687049492</id><published>2009-03-09T07:00:00.000-07:00</published><updated>2009-03-09T07:50:43.875-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-03-09T07:50:43.875-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="education" /><category scheme="http://www.blogger.com/atom/ns#" term="textbook" /><category scheme="http://www.blogger.com/atom/ns#" term="regression" /><title>Start the Revolution</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3132252569687049492/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3132252569687049492" title="5 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3132252569687049492?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3132252569687049492?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/03/start-revolution.html" title="Start the Revolution" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">5</thr:total><content type="html">Variability is a key concept in statistics. The Greek letter Sigma has such importance, that it is probably associated more closely with statistics than with Greek. Yet, if you have a chance to...
</content></entry><entry gd:etag="W/&quot;CE8NQnY4cSp7ImA9WxVSFUk.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3076296928172582503</id><published>2009-01-09T13:53:00.000-08:00</published><updated>2009-01-09T15:14:53.839-08:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2009-01-09T15:14:53.839-08:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="causality" /><category scheme="http://www.blogger.com/atom/ns#" term="newspaper" /><category scheme="http://www.blogger.com/atom/ns#" term="Explaining vs. Predicting" /><title>Beer and ... crime</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3076296928172582503/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3076296928172582503" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3076296928172582503?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3076296928172582503?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2009/01/beer-and-crime.html" title="Beer and ... crime" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">1</thr:total><content type="html">I often glimpse the local newspapers while visiting a foreign country (as long as it is in a language I can read). 
Yesterday, the Australian Herald Sun had the article "Drop in light beer sales...
</content></entry><entry gd:etag="W/&quot;CkEDSXkzcCp7ImA9WxRXE00.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-8533375832311372468</id><published>2008-10-17T20:08:00.000-07:00</published><updated>2008-10-17T20:31:18.788-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2008-10-17T20:31:18.788-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="Excel" /><category scheme="http://www.blogger.com/atom/ns#" term="hidden fields" /><category scheme="http://www.blogger.com/atom/ns#" term="security" /><title>Microsoft and the financial downfall</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/8533375832311372468/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=8533375832311372468" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/8533375832311372468?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/8533375832311372468?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2008/10/microsoft-and-financial-downfall.html" title="Microsoft and the financial downfall" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">1</thr:total><content type="html">One of the misleading features of Microsoft Office software is that it gives the user the illusion that they are in control of what's visible and what's 
hidden to readers of the files. One example is...
</content></entry><entry gd:etag="W/&quot;D04BQng7fyp7ImA9WxRQE0Q.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-2018010884984678180</id><published>2008-10-07T08:07:00.000-07:00</published><updated>2008-10-07T09:12:33.607-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2008-10-07T09:12:33.607-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="performance metrics" /><category scheme="http://www.blogger.com/atom/ns#" term="classification" /><title>Sensitivity, specificity, false positive and false negative rates</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/2018010884984678180/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=2018010884984678180" title="0 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/2018010884984678180?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/2018010884984678180?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2008/10/sensitivity-specificity-false-positive.html" title="Sensitivity, specificity, false positive and false negative rates" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://3.bp.blogspot.com/_x7QjiZypFj0/SOuDREAxrZI/AAAAAAAAAFw/bCPXuq-BXKY/s72-c/ClassMat.gif" height="72" width="72" /><thr:total 
xmlns:thr="http://purl.org/syndication/thread/1.0">0</thr:total><content type="html">I recently had an interesting discussion with a few colleagues in Korea regarding the definition of false positive and false negative rates and their relation to sensitivity and specificity....
</content></entry><entry gd:etag="W/&quot;CkQHRHY8fSp7ImA9WxRREU0.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-1736450479056064942</id><published>2008-09-21T21:08:00.000-07:00</published><updated>2008-09-22T09:18:55.875-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2008-09-22T09:18:55.875-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="data" /><category scheme="http://www.blogger.com/atom/ns#" term="econometric model" /><category scheme="http://www.blogger.com/atom/ns#" term="Explaining vs. Predicting" /><title>Dr. Doom and data mining</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/1736450479056064942/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=1736450479056064942" title="1 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/1736450479056064942?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/1736450479056064942?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2008/09/dr-doom-and-data-mining.html" title="Dr. Doom and data mining" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">1</thr:total><content type="html">Last month The New York Times featured an article about Dr. 
Doom: Economics professor "Roubini, a respected but formerly obscure academic, has become a major figure in the public debate about the...
</content></entry><entry gd:etag="W/&quot;D0QGRnY4fCp7ImA9WxRTFEg.&quot;"><id>tag:blogger.com,1999:blog-21831384.post-3486360619085549169</id><published>2008-09-03T08:17:00.000-07:00</published><updated>2008-09-03T08:22:07.834-07:00</updated><app:edited xmlns:app="http://www.w3.org/2007/app">2008-09-03T08:22:07.834-07:00</app:edited><category scheme="http://www.blogger.com/atom/ns#" term="bhutan" /><category scheme="http://www.blogger.com/atom/ns#" term="software" /><title>Data conversion and open-source software</title><link rel="replies" type="application/atom+xml" href="http://blog.bzst.com/feeds/3486360619085549169/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=21831384&amp;postID=3486360619085549169" title="7 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3486360619085549169?v=2" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/21831384/posts/default/3486360619085549169?v=2" /><link rel="alternate" type="text/html" href="http://blog.bzst.com/2008/09/data-conversion-and-open-source.html" title="Data conversion and open-source software" /><author><name>Galit Shmueli</name><uri>http://www.blogger.com/profile/06119270323184007583</uri><email>noreply@blogger.com</email><gd:extendedProperty name="OpenSocialUserId" value="07473654713881826051" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">7</thr:total><content type="html">Recently I was trying to open a data file that was created in the statistical software SPSS. SPSS is widely used in the social sciences (a competitor to SAS), and appears to have some ground here in...
</content></entry></feed>
