<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
            <title>Featured Blog Posts - AnalyticBridge</title>
            
            <updated>2013-05-22T05:54:38Z</updated>
                        <id>http://www.analyticbridge.com/profiles/blog/feed?promoted=1&amp;xn_auth=no</id>
                            <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/FeaturedBlogPosts-Analyticbridge" /><feedburner:info uri="featuredblogposts-analyticbridge" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
                    <title>When to use a view and when a table: the fuss about it for analysis</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/2xt2Wb6o95k/2004291:BlogPost:246960" />
                                        <id>tag:www.analyticbridge.com,2013-05-20:2004291:BlogPost:246960</id>
                                        <updated>2013-05-20T10:50:53.000Z</updated>
                    
                                            <author>
                            <name>Jeffrey Ng</name>
                            <uri>http://www.analyticbridge.com/profile/JeffreyNg</uri>
                        </author>
                    
                    <summary type="html">
                        When should you use a view, and when a table?&lt;br /&gt;
Is it an art? I doubt it. A choice is an art only when you cannot explain it and must rely on feeling.&lt;br /&gt;
Recently we have been revising our code, and the question of what guidance to give on when to use a table and when a view came to me. I considered the following questions:&lt;br /&gt;
1. When is a table or view justified?&lt;br /&gt;
Having worked in a team setting on modeling and reporting, the key for me is readability. Drawing on what I learned from object-oriented programming, every object exists for a…                    </summary>

                    <content type="html">
When should you use a view, and when a table?&lt;br /&gt;
Is it an art? I doubt it. A choice is an art only when you cannot explain it and must rely on feeling.&lt;br /&gt;
Recently we have been revising our code, and the question of what guidance to give on when to use a table and when a view came to me. I considered the following questions:&lt;br /&gt;
1. When is a table or view justified?&lt;br /&gt;
Having worked in a team setting on modeling and reporting, the key for me is readability. Drawing on what I learned from object-oriented programming, every object exists for a purpose, with a set of inputs, preconditions, processes and outputs. When I feel that the same processing can be reused for other reports or analyses, I create a view or table for it. For those of you from database administration: set aside your commitment to normalization for a moment and listen. The cost of a flawed model or analysis that leads to a wrong business decision outweighs the cost of disk space.&lt;br /&gt;
2. View or table?&lt;br /&gt;
Between the two, a view is preferred by default, as it takes less disk space and needs one less program to create the table (now we are talking admin :)). I use a table only when the same data is queried very often or is very large, so that query speed matters.&lt;br /&gt;
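As a minimal sketch of this trade-off (not the code we actually run), the example below uses Python's built-in sqlite3 module. The names sales, region, amount, sales_by_region_v and sales_by_region_t are hypothetical, chosen only for illustration. The view re-evaluates its query on every read, while the table keeps a stored snapshot that a separate program would have to refresh.&lt;br /&gt;

```python
import sqlite3

# Hypothetical schema for illustration: a small in-memory "sales" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 50.0), ("west", 70.0)],
)

summary_sql = "SELECT region, SUM(amount) AS total FROM sales GROUP BY region"

# A view stores only the query; it is re-evaluated each time it is read.
conn.execute("CREATE VIEW sales_by_region_v AS " + summary_sql)

# A table stores the result; reads are cheap, but it must be rebuilt to stay current.
conn.execute("CREATE TABLE sales_by_region_t AS " + summary_sql)

# New data arrives after both objects were created.
conn.execute("INSERT INTO sales VALUES ('west', 30.0)")

view_rows = dict(conn.execute("SELECT region, total FROM sales_by_region_v"))
table_rows = dict(conn.execute("SELECT region, total FROM sales_by_region_t"))

print(view_rows)   # reflects the new row
print(table_rows)  # still the snapshot taken at creation time
```

In this sketch the view picks up the late row and the table does not, which is the extra maintenance cost that a table carries in exchange for faster reads.&lt;br /&gt;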
I believe our task is to define our programming style for data analysis as clearly as possible, even though the programming literature currently says little about it.&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/2xt2Wb6o95k" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:246960</feedburner:origLink></entry>
                            <entry>
                    <title>R (Web Server) Solutions - Amplifying Artichokes</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/uWKjURWsuMs/2004291:BlogPost:246906" />
                                        <id>tag:www.analyticbridge.com,2013-05-20:2004291:BlogPost:246906</id>
                                        <updated>2013-05-20T04:52:38.000Z</updated>
                    
                                            <author>
                            <name>Dr. Pradeep Mavuluri</name>
                            <uri>http://www.analyticbridge.com/profile/DrPradeepMavuluri</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;Every month I see one or more new R-based web server solutions come onto the market. After looking at some of them, I thought I would share one of my old architecture maps, presented to a client back in early 2009. (It is good to see a scalable, customizable open-source statistical computing tool spreading so quickly in the market.)…&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;Every month I see one or more new R-based web server solutions come onto the market. After looking at some of them, I thought I would share one of my old architecture maps, presented to a client back in early 2009. (It is good to see a scalable, customizable open-source statistical computing tool spreading so quickly in the market.)&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;a target="_self" href="http://api.ning.com:80/files/imbe9KBD0dOBLdckLSqiSoamClJIImLQWEOivt7qsfvtYBoA-zmFqNoz1Wu9K7lwptuiPirQTB-ilMv7jkIJ50O-zhJZss59/R_Web_Server_Solutions.PNG"&gt;&lt;img class="align-full" src="http://api.ning.com:80/files/imbe9KBD0dOBLdckLSqiSoamClJIImLQWEOivt7qsfvtYBoA-zmFqNoz1Wu9K7lwptuiPirQTB-ilMv7jkIJ50O-zhJZss59/R_Web_Server_Solutions.PNG?width=750" width="750"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/uWKjURWsuMs" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:246906</feedburner:origLink></entry>
                            <entry>
                    <title>Weekly Digest - May 20</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/04c9TRpxrNM/2004291:BlogPost:246569" />
                                        <id>tag:www.analyticbridge.com,2013-05-17:2004291:BlogPost:246569</id>
                                        <updated>2013-05-17T17:30:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;Selected articles, blog posts and forum questions from DSC, AnalyticBridge, BigDataNews&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/extreme-data-science" target="_self"&gt;Extreme Data Science…&lt;/a&gt;&lt;br&gt;&lt;/br&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Selected articles, blog posts and forum questions from DSC, AnalyticBridge, BigDataNews&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/extreme-data-science" target="_self"&gt;Extreme Data Science&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/how-to-choose-an-analytic-tool" target="_blank"&gt;27 criteria to choose analytic tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists" target="_blank"&gt;80 job interview questions for data scientists&lt;/a&gt; (questions 78-80 added this week) &lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/analyticjobs/forum/topics/career-alert-may-17" target="_self"&gt;Career Alert, May 17&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/how-to-detect-a-pattern-problem-and-solution" target="_blank"&gt;Test your analytical intuition&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/big-data-analytics-in-retail" target="_blank"&gt;Big data analytics in retail&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-application-of-propensity-score-matching"&gt;The application of Propensity Score Matching&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/strategy-for-building-a-good-predictive-model"&gt;Strategy for building a “good” predictive model&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-end-of-theory-the-data-deluge-makes-the-scientific-method-obs"&gt;The End of Theory: The Data Deluge Makes the Scientific Method Obsolete | Wired&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/source-code-to-compute-all-permutations-of-n-elements"&gt;Source code to compute all permutations of n elements&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/book-delivering-business-analytics-practical-guidelines-for-best-"&gt;New Book: Delivering Business Analytics - Practical Guidelines for Best Practice&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/two-new-data-science-books-from-crc-press" target="_blank"&gt;Two new data science books from CRC Press&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/apriori-prediction-in-r" target="_blank"&gt;Apriori prediction in R&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;&lt;span class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/weekly-digest-may-13" target="_self"&gt;Previous digest&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/group/analyticjobs/forum/topics/career-alert-may-17" target="_self"&gt;Recent jobs&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/page/links"&gt;Top Links&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/group/data-science/forum/topics/data-science-e-book-first-draft-available-for-download"&gt;Data Science eBook&lt;/a&gt; &lt;/span&gt;&lt;/center&gt;
&lt;center&gt;&lt;span class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/group/data-science-apprenticeship/forum/topics/update-about-our-data-science-apprenticeship"&gt;Apprenticeship&lt;/a&gt; | &lt;a href="http://www.datasciencecentral.com/profiles/blogs/dsc-monthly-contest" target="_blank"&gt;Contest&lt;/a&gt; | &lt;a href="http://www.datasciencecentral.com/events/event/listUpcoming" target="_blank"&gt;Events&lt;/a&gt; | &lt;a href="http://www.bigdatanews.com/group/bdn-daily-press-releases" target="_blank"&gt;Press Releases&lt;/a&gt;&lt;/span&gt;&lt;/center&gt;
&lt;p&gt;&lt;/p&gt;
&lt;center&gt;&lt;div class="xg_module_body xg_user_generated"&gt;&lt;p&gt;&lt;span class="font-size-2"&gt;&lt;a href="http://www.linkedin.com/groups/Advanced-Business-Analytics-Data-Mining-35222"&gt;&lt;img src="http://datashaping.com/favico_linkedin.png" border="0"/&gt;&lt;/a&gt; &lt;a href="https://www.facebook.com/pages/AnalyticBridge/80509530789"&gt;&lt;img src="http://datashaping.com/favico_facebook.png"/&gt;&lt;/a&gt; &lt;a href="https://twitter.com/analyticbridge"&gt;&lt;img src="http://datashaping.com/favico_twitter.png" border="0"/&gt;&lt;/a&gt; &lt;a href="https://plus.google.com/u/0/communities/107156514183161811383/"&gt;&lt;img src="http://datashaping.com/favico_google.png" border="0"/&gt;&lt;/a&gt; &lt;a href="http://www.quora.com/Vincent-Granville"&gt;&lt;img src="http://datashaping.com/favico_quora.png" border="0"/&gt;&lt;/a&gt; &lt;a href="http://www.datasciencecentral.com/page/news-feeds"&gt;&lt;img src="http://datashaping.com/rss-favicon.png" border="0"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/center&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/04c9TRpxrNM" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:246569</feedburner:origLink></entry>
                            <entry>
                    <title>The application of Propensity Score Matching</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/dab4lM6nXtg/2004291:BlogPost:246178" />
                                        <id>tag:www.analyticbridge.com,2013-05-16:2004291:BlogPost:246178</id>
                                        <updated>2013-05-16T09:00:39.000Z</updated>
                    
                                            <author>
                            <name>Ian Morton</name>
                            <uri>http://www.analyticbridge.com/profile/IanMorton</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Propensity_score_matching"&gt;Propensity Score Matching&lt;/a&gt; is a statistical matching technique that attempts to estimate the effect of a treatment, policy or other intervention by accounting for the covariates that predict receiving the treatment. It helps to reduce bias due to confounding and can be used to estimate the counterfactual outcome.&lt;/p&gt;
&lt;p&gt;For example, many of you will have been to a particular university or school and achieved a certain…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;a href="http://en.wikipedia.org/wiki/Propensity_score_matching"&gt;Propensity Score Matching&lt;/a&gt; is a statistical matching technique that attempts to estimate the effect of a treatment, policy or other intervention by accounting for the covariates that predict receiving the treatment. It helps to reduce bias due to confounding and can be used to estimate the counterfactual outcome.&lt;/p&gt;
&lt;p&gt;For example, many of you will have been to a particular university or school and achieved a certain result. But have you ever wondered what could have been the result if you had attended somewhere else (the counterfactual outcome) ? To determine this you would need to account for the covariates using information on people like yourself who studied the same course. Then, you could estimate this counterfactual outcome using Propensity Score Matching.&lt;/p&gt;
&lt;p&gt;I have put various resources (including SAS code) on my blog. These have allowed me to do Propensity Score Matching - See blog post here: &lt;a href="http://www.analysisandstatistics.blogspot.co.uk/2013/05/what-could-propensity-score-matching-do.html"&gt;What could propensity score matching do for you ? (with examples from justice, medicine, education and finance)&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;Ian Morton has built propensity scoring models for the financial services sector, for a utility company, and for the public sector. He has given a number of presentations on the technique of propensity score matching, and has also co-authored a forthcoming peer-reviewed journal article.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/dab4lM6nXtg" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:246178</feedburner:origLink></entry>
                            <entry>
                    <title>Strategy for building a “good” predictive model</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/E9XZW-0Nh7E/2004291:BlogPost:246166" />
                                        <id>tag:www.analyticbridge.com,2013-05-16:2004291:BlogPost:246166</id>
                                        <updated>2013-05-16T04:00:00.000Z</updated>
                    
                                            <author>
                            <name>Mirko Krivanek</name>
                            <uri>http://www.analyticbridge.com/profile/MirkoKrivanek</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;By &lt;span style="font-size: 10pt;"&gt;Ian Morton. Ian worked in credit risk for big banks for a number of years. He learnt about how to (and how not to) build “good” statistical models in the form of scorecards using the SAS Language.&lt;br&gt;&lt;/br&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class="font-size-2" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Read original post and similar articles…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;By &lt;span style="font-size: 10pt;"&gt;Ian Morton. Ian worked in credit risk for big banks for a number of years. He learnt about how to (and how not to) build “good” statistical models in the form of scorecards using the SAS Language.&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Read original post and similar articles &lt;a href="http://www.analysisandstatistics.blogspot.co.uk/2013/05/my-suggested-strategy-for-building-good.html" target="_blank"&gt;here&lt;/a&gt;. I thing Ian's list below is a good starting point. I would add a few steps such as deployment, maintenance at the end, and gathering requirements, understanding goal and success metrics at the top.&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Initial investigations&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;1. Look at the data dictionary to see which data is available&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;2. What is the outcome ? is it yes / no ? is it continuous ?&lt;br/&gt; 3. Decide upon the model required (logistic ! for yes / no outcome)&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Getting the data ready&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;4 cross tabulations on categorical variables to understand the coding and volumes&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;5. summary statistics to understand the distribution of the continuous variables&lt;br/&gt; 6. Ask questions about data quality:&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;remove these variables from any potential models ? or,&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;think about imputation ? or,&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;obtain accurate data ?&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;7. Convert continuous variables into categorical variables&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Modelling&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;8. Check for multi-colinearity / correlation between variables (variance inflation factors), or correlation tests&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;9. Check for interactions&lt;br/&gt; 10. Choose type of logistic approach (e.g. forward, backward, stepwise)&lt;br/&gt; 11. Choose the baseline attribute for each categorical variable&lt;br/&gt; 12. Create a random variable – mustn’t step into the model - something is wrong if it does step into the model&lt;br/&gt; 13. Split the dataset into two parts (ratio 80%/20%)&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;using random selection without replacement&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;the larger sample is the &lt;i&gt;build&lt;/i&gt; dataset&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt; the smaller sample is the &lt;i&gt;test&lt;/i&gt; dataset&lt;/span&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt; &lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
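&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Steps 12 and 13 above can be sketched as follows. This is only an illustration in plain Python, not Ian's SAS code; the function names, the seeds and the example records are assumptions made for the sketch.&lt;/span&gt;&lt;/p&gt;

```python
import random

def split_build_test(rows, build_fraction=0.8, seed=42):
    """Randomly split rows into build and test lists without replacement."""
    rng = random.Random(seed)
    shuffled = list(rows)
    rng.shuffle(shuffled)  # each row ends up in exactly one of the two parts
    cut = int(round(len(shuffled) * build_fraction))
    return shuffled[:cut], shuffled[cut:]

def add_random_variable(rows, seed=7):
    """Attach a pure-noise column (step 12); it should never enter the model."""
    rng = random.Random(seed)
    return [dict(row, random_var=rng.random()) for row in rows]

# Hypothetical records: an id and a yes / no outcome coded as 1 / 0.
records = [{"id": i, "outcome": i % 2} for i in range(100)]
records = add_random_variable(records)
build, test = split_build_test(records)
print(len(build), len(test))
```

&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Fixing the seed keeps the split reproducible, so the same build and test datasets can be reused when the model is refitted.&lt;/span&gt;&lt;/p&gt;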
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;14. Put all variables from the &lt;i&gt;build&lt;/i&gt; dataset (including interactions and the random variable) into the model and run it&lt;/span&gt;&lt;/div&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Check odds ratios – do they make sense ?, and&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Check the coefficients – do they make sense ?&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Check the model&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;15. Do diagnostic checks and plots of the fit (e.g. Somers D, residuals etc., etc.)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;16. Put all variables from the &lt;i&gt;test&lt;/i&gt; dataset (including interactions and the random variable) into a new model and run it&lt;br/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Are the coefficients the same as the model it was built on ? and&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Are the odds ratios the same as the model it was built on ?&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;strong style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Start again&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt; &lt;/span&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;17. Back to the start, fine tune the grouping of the data, put variables in or take variables out.&lt;/span&gt;&lt;/p&gt;
&lt;div&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Related articles&lt;/strong&gt;&lt;/span&gt;&lt;/div&gt;
&lt;div&gt;&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/great-statistical-analysis-forecasting-meteorite-hits" target="_blank"&gt;Great statistical analysis: forecasting meteorite hits&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/2004291:BlogPost:223153"&gt;Data Science Dictionary&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/four-ways-to-solve-a-data-science-problem-case-study" target="_blank"&gt;Four different ways to solve a data science problem - case study&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/building-a-good-predictive-model-for-credit-risk"&gt;Building a good predictive model for credit risk&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/data-science/forum/topics/data-science-e-book-first-draft-available-for-download" target="_blank"&gt;Data Science eBook&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/group/data-science-apprenticeship" target="_blank"&gt;Data Science Apprenticeship&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists" target="_blank"&gt;66 job interview questions for data scientists&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/E9XZW-0Nh7E" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:246166</feedburner:origLink></entry>
                            <entry>
                    <title>Analytical Skills Development Spending.</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/6P1VEm3uycw/2004291:BlogPost:245742" />
                                        <id>tag:www.analyticbridge.com,2013-05-13:2004291:BlogPost:245742</id>
                                        <updated>2013-05-13T08:04:00.000Z</updated>
                    
                                            <author>
                            <name>Dr. Pradeep Mavuluri</name>
                            <uri>http://www.analyticbridge.com/profile/DrPradeepMavuluri</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;div style="text-align: justify;"&gt;&lt;span style="color: black; font-family: 'Calibri';"&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Challenged with the acquisitiveness for adaptability and agility,&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="color: black; font-family: 'Calibri';"&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;analytical service organizations are turning to real-world work &amp;amp; emergent…&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;                    </summary>

                    <content type="html">
&lt;div style="text-align: justify;"&gt;&lt;span style="color: black; font-family: 'Calibri';"&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;Challenged with the acquisitiveness for adaptability and agility,&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="color: black; font-family: 'Calibri';"&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;analytical service organizations are turning to real-world work &amp;amp; emergent technologies/methods to develop next-generation analytical skills (talent). However, corporate bosses seems to playing it cautious when coming to their spending on analytical skills development.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;
&lt;p&gt;&lt;span style="color: black; font-family: 'Calibri';"&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;&lt;br/&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt; &lt;span style="color: black; font-family: 'Calibri';"&gt;&lt;span style="font-size: large;"&gt;&lt;span style="font-family: Times, 'Times New Roman', serif;"&gt;According to our survey conducted by us across a good number of analytical services organizations in India, most expect to increase their current spending on analytical skills development (43.8%) in next six months. Around 27.2% organizations expect to spend aggressively in next six months. Only 6.7% expect spending to decrease in next six months. While 22.3% of organizations spending on analytical skills development continue to be frozen in next six months.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;a target="_self" href="http://api.ning.com:80/files/ToGhWan8nztCoyYOgHq0EMhuuFET2IekJK0qFzc5sShnWwdCGqb-8iGi-S85TE13YcpPZ1DH0UyDy9JDkJO8b5W2xbCwaTM3/AnalyticalSkillDevelopmentSpending.PNG"&gt;&lt;img class="align-full" src="http://api.ning.com:80/files/ToGhWan8nztCoyYOgHq0EMhuuFET2IekJK0qFzc5sShnWwdCGqb-8iGi-S85TE13YcpPZ1DH0UyDy9JDkJO8b5W2xbCwaTM3/AnalyticalSkillDevelopmentSpending.PNG" width="689"/&gt;&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/6P1VEm3uycw" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:245742</feedburner:origLink></entry>
                            <entry>
                    <title>The End of Theory: The Data Deluge Makes the Scientific Method Obsolete | Wired</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/wDy0LEXogNU/2004291:BlogPost:245632" />
                                        <id>tag:www.analyticbridge.com,2013-05-13:2004291:BlogPost:245632</id>
                                        <updated>2013-05-13T01:30:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;Here's my rebuttal to this article &lt;a href="http://www.wired.com/science/discoveries/magazine/16-07/pb_theory" target="_blank"&gt;published in Wired&lt;/a&gt; in 2008.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;strong&gt;Vincent's rebuttal&lt;/strong&gt;:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;A lot can be done with black-box pattern…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Here's my rebuttal to this article &lt;a href="http://www.wired.com/science/discoveries/magazine/16-07/pb_theory" target="_blank"&gt;published in Wired&lt;/a&gt; in 2008.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Vincent's rebuttal&lt;/strong&gt;:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;A lot can be done with black-box pattern detection, where patterns are found but not understood. For many applications (e.g. high frequency trading) it's fine as long as your algorithm works. But in other contexts (e.g. root cause analysis for cancer eradication), deeper investigation is needed for higher success. And in all contexts, identifying and weighting true factors that explain the cause usually allows for better forecasts, especially if good model selection, model fitting and cross-validation are performed. But if advanced modeling requires paying a high salary to a statistician for 12 months, maybe the ROI becomes negative and black-box brute force performs better, ROI-wise. In both cases, whether caring about cause or not, it is still science. Indeed it is actually data science - and it includes an analysis to figure out when/whether deeper statistical science is really required. And in all cases, it always involves cross-validation and design of experiment. Only the statistical theoretical modeling aspect can be ignored. Other aspects, such as scalability and speed, must be considered, and this is science too: data and computer science.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;img src="http://www.wired.com/images/article/magazine/1607/pb_theory_f.jpg"/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Related article&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/correlation-vs-causation" target="_blank"&gt;Correlation vs. causation&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/from-chaos-to-clusters-statistical-modeling-without-models" target="_blank"&gt;From chaos to clusters - statistical modeling without models&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/correlation-or-causation-bloomberg-businessweek" target="_blank"&gt;Causation vs. Correlation&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/the-practitioners-dilemma-observational-versus-causal-inference" target="_blank"&gt;The Practitioners' Dilemma: Observational versus Causal Inference in Research&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/wDy0LEXogNU" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:245632</feedburner:origLink></entry>
                            <entry>
                    <title>Use PRESS, not R squared to judge predictive power of regression</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/ZiZMvWYCMto/2004291:BlogPost:245306" />
                                        <id>tag:www.analyticbridge.com,2013-05-12:2004291:BlogPost:245306</id>
                                        <updated>2013-05-12T15:00:00.000Z</updated>
                    
                                            <author>
                            <name>Theophano Mitsa</name>
                            <uri>http://www.analyticbridge.com/profile/TheophanoMitsa</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;R squared, also known as coefficient of determination, is a popular measure of quality of fit in regression. However, it does not offer any significant insights into how well our regression model can predict future values. Instead, the PRESS statistic (the predicted residual sum of squares) can be used as a measure of predictive power. The PRESS statistic can be computed in the leave-one-out cross validation…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;R squared, also known as the coefficient of determination, is a popular measure of quality of fit in regression. However, it does not offer any significant insight into how well our regression model can predict future values. Instead, the PRESS statistic (the predicted residual sum of squares) can be used as a measure of predictive power. The PRESS statistic can be computed in the leave-one-out cross validation process, by summing the squared residuals for the cases that are left out. As a reminder, in leave-one-out cross validation, one case of the data set is used as the testing set and the remaining cases are used as the training set. We iterate this process until all cases have served as the testing set. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Here is an example implemented in R, on the gala dataset in the faraway package:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;&amp;gt; gala[1:3,]&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;          Species Endemics  Area Elevation Nearest Scruz Adjacent&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Baltra         58       23 25.09       346     0.6   0.6     1.84&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Bartolome      31       21  1.24       109     0.6  26.3   572.33&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Caldwell        3        3  0.21       114     2.8  58.7     0.78&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Model1:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;&amp;gt; model1&amp;lt;-lm(Species~Endemics+Area+Elevation, data=gala)&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;&amp;gt;summary(model1)&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;....&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Residual standard error: 27.29 on 26 degrees of freedom&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Multiple R-squared&lt;strong&gt;: 0.9492, &lt;/strong&gt;   Adjusted R-squared: 0.9433&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;F-statistic: 161.8 on 3 and 26 DF,  p-value: &amp;lt; 2.2e-16 &lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Model2:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;&amp;gt; model2&amp;lt;-lm(Species~I(Endemics^2), data=gala)&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;&amp;gt; summary(model2)&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;...&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Residual standard error: 27.1 on 28 degrees of freedom&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Multiple R-squared: &lt;strong&gt;0.946&lt;/strong&gt;,     Adjusted R-squared: 0.9441&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;F-statistic:   491 on 1 and 28 DF,  p-value: &amp;lt; 2.2e-16 &lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Model3:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt; &lt;em&gt;model3&amp;lt;-lm(Species~Endemics+I(Endemics^2), data=gala)&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;&amp;gt; summary(model3)&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;.....&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Residual standard error: 22.94 on 27 degrees of freedom&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Multiple R-squared: &lt;strong&gt;0.9627&lt;/strong&gt;,    Adjusted R-squared: 0.9599&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;F-statistic: 348.5 on 2 and 27 DF,  p-value: &amp;lt; 2.2e-16 &lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Here are the AIC (Akaike information criterion), BIC (Bayesian information criterion), and PRESS statistic of the three models:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Model 1:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt;AIC(model1)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;289.243&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt; BIC(model1)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;296.249&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;PRESS(model1)=259520.5&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Model 2:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt; AIC(model2)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;287.0325&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt; BIC(model2)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;291.2361&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;PRESS(model2)=&lt;span style="font-size: 13px;"&gt;26382.22&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Model 3:&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt; AIC(model3)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt; 277.9558&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&amp;gt; BIC(model3)&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt; 283.5606&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;PRESS(model3)=&lt;span style="font-size: 13px;"&gt;22567.03&lt;/span&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: 13px; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;As we can see, the PRESS statistic is roughly ten times smaller (better) for models 2 and 3 than for model 1, while R squared improves only trivially for model 3 (0.9627 versus 0.9492 for model 1). So, according to PRESS, model 3 has the highest predictive power. It is interesting to note that the AIC and BIC also get their best (lowest) values for model 3.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
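&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;As a rough illustration of the leave-one-out procedure described above (a minimal sketch, not necessarily the code used for the numbers in this post), PRESS could be computed in R along the following lines. The function names press_loo and press_hat are hypothetical, and the built-in mtcars data set stands in for gala so that the snippet runs without extra packages. The second function relies on a standard result for ordinary least squares: the leave-one-out residual of case i equals its ordinary residual divided by (1 - h_ii), where h_ii is the leverage of case i, so no refitting is needed.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```r
# PRESS by explicit leave-one-out refits:
# drop case i, refit the model, predict case i, accumulate the squared error.
press_loo = function(formula, data) {
  n = nrow(data)
  squared_errors = numeric(n)
  for (i in seq_len(n)) {
    held_out = data[i, , drop = FALSE]
    fit = lm(formula, data = data[-i, , drop = FALSE])
    predicted = predict(fit, newdata = held_out)
    observed = model.response(model.frame(formula, held_out))
    squared_errors[i] = (observed - predicted)^2
  }
  sum(squared_errors)
}

# Shortcut for ordinary least squares (assumption: model is a plain lm fit):
# leave-one-out residual = e_i / (1 - h_ii), with h_ii the leverage of case i.
press_hat = function(model) {
  sum((residuals(model) / (1 - hatvalues(model)))^2)
}

# Demo on mtcars (ships with base R), standing in for the gala data.
demo_formula = mpg ~ wt + I(wt^2)
demo_model = lm(demo_formula, data = mtcars)
press_by_refit = press_loo(demo_formula, mtcars)
press_by_leverage = press_hat(demo_model)

# The two routes should agree up to floating-point error.
print(c(refit = press_by_refit, leverage = press_by_leverage))
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Assuming the three models above have been fit as shown, press_hat(model1), press_hat(model2) and press_hat(model3) would be one way to obtain their PRESS values. The explicit loop is slower but makes the cross-validation step visible, and it carries over to models for which the leverage shortcut does not hold.&lt;/span&gt;&lt;/p&gt;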
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;If you are interested in how I computed the PRESS statistic doing cross-validation in R, please check my next blog post.&lt;/span&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/ZiZMvWYCMto" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:245306</feedburner:origLink></entry>
                            <entry>
                    <title>Using Big Data to Improve Customer Experience</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/-zCZtCwzjnk/2004291:BlogPost:244824" />
                                        <id>tag:www.analyticbridge.com,2013-05-09:2004291:BlogPost:244824</id>
                                        <updated>2013-05-09T10:24:41.000Z</updated>
                    
                                            <author>
                            <name>Jigsaw Academy</name>
                            <uri>http://www.analyticbridge.com/profile/JigsawAcademy</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;i&gt;This week we are happy to introduce and welcome our guest blogger Alastair Kane. Alastair is a freelance writer and has provided this article on behalf of Acxiom, a company providing&lt;/i&gt; &lt;a href="http://bigdata.acxiom.co.uk/"&gt;big data analytics&lt;/a&gt; &lt;i&gt;services.&lt;/i&gt; &lt;i&gt;In this post he talks about how Big Data can enhance Customer Experience.&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;u&gt; &lt;/u&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;What is Big Data, and why is it different from other data&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As the world becomes…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;i&gt;This week we are happy to introduce and welcome our guest blogger Alastair Kane. Alastair is a freelance writer and has provided this article on behalf of Acxiom, a company providing&lt;/i&gt; &lt;a href="http://bigdata.acxiom.co.uk/"&gt;big data analytics&lt;/a&gt; &lt;i&gt;services.&lt;/i&gt; &lt;i&gt;In this post he talks about how Big Data can enhance Customer Experience.&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;u&gt; &lt;/u&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;What is Big Data, and why is it different from other data&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;As the world becomes increasingly digitised, access to reams of information about customers, including data detailing buying behaviours, preferences and dislikes, is growing. These useful but separate bits of information arrive in a variety of formats and from an equally varied number of sources: in addition to traditional CRM information, data is now being captured at an exponentially growing rate from social posts, blogs, smartphones and other diverse digital sources. Formed of small pockets of brand-useful information, plus an explosion of other data 'noise', this voluminous and varied information is what's known as 'big data'.&lt;/p&gt;
&lt;p&gt;&lt;br/&gt; &lt;em&gt;What Makes ‘Big Data’ Different?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;‘Big Data’ is different from traditional data as it is unstructured and scattered, making it impossible for traditional SQL databases to make sense of it. And of course, there’s just so much more of it! That’s why new ways of analysing big data (big data analytics) are required - to assess and monetise huge pools of information collected from customers - from the web, call centers, in fact from just about anywhere you can think of. &lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://analyticstraining.com/2013/using-big-data-to-improve-customer-experience/" target="_blank"&gt;Read more&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/-zCZtCwzjnk" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244824</feedburner:origLink></entry>
                            <entry>
                    <title>Building a good predictive model for credit risk</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/hbpbPUK2LtY/2004291:BlogPost:244930" />
                                        <id>tag:www.analyticbridge.com,2013-05-09:2004291:BlogPost:244930</id>
                                        <updated>2013-05-09T16:00:00.000Z</updated>
                    
                                            <author>
                            <name>Ian Morton</name>
                            <uri>http://www.analyticbridge.com/profile/IanMorton</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;A colleague of mine wanted to understand how to build predictive models, and asked if I had a strategy for building them. I thought it would be useful to share this. For more details about each stage see my personal blog (&lt;a href="http://bit.ly/10uyAVu" target="_self"&gt;My suggested strategy for building a “good” predictive model&lt;/a&gt;).…&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;A colleague of mine wanted to understand how to build predictive models, and asked if I had a strategy for building them. I thought it would be useful to share this. For more details about each stage see my personal blog (&lt;a href="http://bit.ly/10uyAVu" target="_self"&gt;My suggested strategy for building a “good” predictive model&lt;/a&gt;).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Stage 1 – Perform initial investigations&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Stage 2 - Getting the data ready&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Stage 3 - Modelling&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Stage 4 - Check the model&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Stage 5 - Start again&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt; &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;The bottom line: It’s an iterative process and it might take some time to get a model that’s acceptable in terms of fit, and acceptable to business users. Always, always, always and at each stage consult with the business to check on ethical issues, applicability of the model, and that the model can be implemented.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;b&gt;Ian Morton worked in credit risk for big banks for a number of years. He learnt about how to (and how not to) build “good” statistical models in the form of scorecards using the SAS Language.&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;&lt;b&gt;George E. P. Box “Essentially, all models are wrong, but some are useful”&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/hbpbPUK2LtY" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244930</feedburner:origLink></entry>
                            <entry>
                    <title>Wiley's list of leading and interesting blogs to follow</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/kTlRGEE8bPE/2004291:BlogPost:244797" />
                                        <id>tag:www.analyticbridge.com,2013-05-08:2004291:BlogPost:244797</id>
                                        <updated>2013-05-08T16:00:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;Here are the top 10, in alphabetical order. Wiley's &lt;a href="http://stats.cwslive.wiley.com/details/tools/13ae5fb1aa6/A-list-of-leading-and-interesting-blogs-to-follow.html" target="_blank"&gt;full list&lt;/a&gt; mentions many interesting statistical blogs.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;a href="http://www.theanalysisfactor.com/"&gt;The Analysis…&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Here are the top 10, in alphabetical order. Wiley's &lt;a href="http://stats.cwslive.wiley.com/details/tools/13ae5fb1aa6/A-list-of-leading-and-interesting-blogs-to-follow.html" target="_blank"&gt;full list&lt;/a&gt; mentions many interesting statistical blogs.&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.theanalysisfactor.com/"&gt;The Analysis Factor&lt;/a&gt; - Offers statistical consulting, resources, and training that researchers need to conduct quality work.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/"&gt;Analytic Bridge&lt;/a&gt; - A group blog written by data scientists about statistics and data analysis.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.thejuliagroup.com/blog/"&gt;AnnMaria’s Blog&lt;/a&gt; - The blog of Dr. AnnMaria De Mars, President of the online statistics education company The Julia Group.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.badscience.net/"&gt;Bad Science&lt;/a&gt; - The blog of Dr. Ben Goldacre, an epidemiologist who uses statistics to debunk bad science.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.beyondtheboxscore.com/"&gt;Beyond the Box Score&lt;/a&gt; - A blog using statistics to analyse the game of baseball.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://blogstats.wordpress.com/"&gt;Blog About Stats&lt;/a&gt; - Blogstats creates a network for dissemination professionals mainly of statistical institutions. It is a meeting point where colleagues from Statistical Organizations share their experiences, successes and failures, focus attention on (new) developments and stay informed in the vast domain of disseminating (statistical) information.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://blogs.sas.com/content/corneroffice/author/johnsall/"&gt;bLog-Normal Distribution&lt;/a&gt; - 'The Corner Office blog', where SAS executives post their thoughts on global business, analytics and technology&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://blow.blogs.nytimes.com/"&gt;By the Numbers&lt;/a&gt; - The blog of New York Times visual Op-Ed columnist Charles Blow conducts discussions about all things statistical — from the environment to entertainment — and their visual expressions.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bytemining.com/"&gt;Byte Mining&lt;/a&gt; - A blog written by Ryan Rosario, an ex-Statistician, now turned Computer Scientist, even though he is a Ph.D. candidate in Statistics at UCLA.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://cooldata.wordpress.com/"&gt;CoolData blog&lt;/a&gt; - A blog written by Kevin McDonell, Business Analyst for the Office of External Relations at Dalhousie University in Halifax, Nova Scotia where he concentrates on data mining and predictive modelling for many Advancement functions.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Read full list at &lt;a href="http://stats.cwslive.wiley.com/details/tools/13ae5fb1aa6/A-list-of-leading-and-interesting-blogs-to-follow.html"&gt;http://stats.cwslive.wiley.com/details/tools/13ae5fb1aa6/A-list-of-leading-and-interesting-blogs-to-follow.html&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Related articles&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/what-are-the-best-blogs-about-data" target="_blank" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;What are the best blogs about data?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/100-savvy-sites-on-statistics-and-quantitative-analysis" target="_blank"&gt;100 Savvy Sites on Statistics and Quantitative Analysis&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-8-worst-predictive-modeling-techniques"&gt;The 8 worst predictive modeling techniques&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-top-ten-worst-graphs" target="_blank"&gt;The top 10 worst graphs&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/4-open-source-data-mining"&gt;4 open source data mining tools (with GUI)&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-top-20-data-visualisation-tools"&gt;The top 20 data visualisation tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/14-questions-about-data-visualization-tools" target="_blank"&gt;14 questions about data visualization tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/10-great-metrics-and-strategies-for-email-campaign-optimization" target="_blank"&gt;10+ Great Metrics and Strategies for Email Campaign Optimization&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/top-analytic-blogs-and-websites-with-trending-information" target="_blank"&gt;Top analytics websites with trending information&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/who-are-the-wealthiest-data-scientists"&gt;Who are the wealthiest data scientists?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/kTlRGEE8bPE" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244797</feedburner:origLink></entry>
                            <entry>
                    <title>What Does an Agile Desktop BI Tool Really Mean?</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/0JucSOf5vQk/2004291:BlogPost:245036" />
                                        <id>tag:www.analyticbridge.com,2013-05-08:2004291:BlogPost:245036</id>
                                        <updated>2013-05-08T08:15:02.000Z</updated>
                    
                                            <author>
                            <name>Jim King</name>
                            <uri>http://www.analyticbridge.com/profile/JimKing</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial,helvetica,sans-serif;"&gt;BI (Business Intelligence) refers to the intelligence and ability to enhance the enterprises competitiveness, involving report presentation, reporting result calculation, OLAP analysis, business data calculation, data mining and predication. Among these, there are both the technician-oriented high-level systems, and the business user-oriented agile desktop BI tools. In this post, we only talk about agile desktop BI…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;BI (Business Intelligence) refers to the intelligence and ability to enhance the enterprises competitiveness, involving report presentation, reporting result calculation, OLAP analysis, business data calculation, data mining and predication. Among these, there are both the technician-oriented high-level systems, and the business user-oriented agile desktop BI tools. In this post, we only talk about agile desktop BI tools. But what is agile desktop BI tool and what are their standards?&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;Agile desktop BI software should have the following features:&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;strong&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-3"&gt;Common BI features support:&lt;/span&gt;&lt;/strong&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;&lt;b&gt;The agile &lt;a href="http://www.raqsoft.com/" target="_blank"&gt;desktop BI software&lt;/a&gt; should fit for those business personnel to prepare the static report by themselves&lt;/b&gt;&lt;b&gt;, even&lt;/b&gt; &lt;b&gt;if they are inexperienced in IT.&lt;/b&gt; The friendly reporting design interface is necessary. It also supports the rapid reporting, the business personnel-oriented data processing, and high fidelity report preparation. 
In other words, it is capable to keep reports consistent in the stages of design, preview, pagination &amp;amp; printing, and export.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;For example, make a product sales situation report for specific products to present the monthly sales of 3 products, and their link relative ratio, and monthly year-on-year growth&lt;/span&gt;.&lt;a href="http://www.raqsoft.com/help/blog/agile_desktop_BI-1.png" target="_blank"&gt;&lt;img class="align-full" src="http://www.raqsoft.com/help/blog/agile_desktop_BI-1.png"/&gt;&lt;/a&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;&lt;b&gt;The agile desktop BI software supports the calculation on the result of common reporting tools&lt;/b&gt;. Excel and the plain txt are the export formats supported by most reporting tools. It can import and calculate them directly; supports the data pasting from the report result directly on the clipboard; provides the calculator-style operation for business personnel; capable to conduct any process on data and the calculation between steps can be transited smoothly. &lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-3"&gt;&lt;span class="font-size-2"&gt;For example, the reporting tools generate the below report result&lt;/span&gt;:&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.raqsoft.com/help/blog/agile_desktop_BI-2.jpg" target="_blank"&gt;&lt;img class="align-full" src="http://www.raqsoft.com/help/blog/agile_desktop_BI-2.jpg"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;Find the big clients who account for 60% of the total sales of the company based on the above data. The result is as follows:&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;a href="http://www.raqsoft.com/help/blog/agile_desktop_BI-3.png" target="_blank"&gt;&lt;img class="align-full" src="http://www.raqsoft.com/help/blog/agile_desktop_BI-3.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;&lt;b&gt;The &lt;a href="http://www.raqsoft.com/real-olap-tool-for-agile-business-intelligence" target="_blank"&gt;agile desktop BI tools&lt;/a&gt; support the true OLAP analysis&lt;/b&gt;. It is capable to perform the interactive analysis arbitrarily and intuitively, decompose and simplify the obscure analysis goal. It provides the basic analysis methods that are both simple and easy to use. Then, lots of advanced analyses can be implemented through the free combination. esCalc is such kind of desktop BI software. With an Excel-style, esCalc becomes relatively easier to understand and learn. Moreover, esCalc also provides a range of powerful advanced functions to solve the complex problems regarding OLAP.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;strong&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-3"&gt;Agile installation deployment:&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;The size of the agile BI tool should be very small and easy to install and uninstall. For example, esCalc installer is only dozens of MB and only requires a few clicks to install and run, which is the typical example of agile desktop BI software. Agile desktop BI tools are capable to run on most desktop computers independently, not having to deploy the additional server:&lt;br/&gt;&lt;br/&gt;&lt;img class="align-full" src="http://www.raqsoft.com/help/blog/agile_desktop_BI-4.png"/&gt;&lt;/span&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;Agile desktop BI software supports various mainstream databases, like esCalc which can manipulate data from different databases, including MSSQL, Oracle, Access, MySQL, DB2, and Sybase. This desktop BI software also supports the local data files, for example, Txt, Log, tab, and other text files; Excel 97, Excel 2010, and the Excel of other versions. It also supports the interactive calculation between various data sources, such as the calculation between Oracle and Excel.&lt;br/&gt;&lt;br/&gt;&lt;/span&gt;&lt;strong&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-3"&gt;Agile formula functions:&lt;/span&gt;&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;Agile BI software also provides the agile formulas and functions, so that the business personnel can easily represent the relatively complex calculations, such as comparison on year-on-year basis, link relative ratio, set operations, ranking and row number calculations.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;For instance, calculate the players whose rankings are among the top 5 in every game. We take esCalc, the agile desktop BI software as an example. The data available is as follows&lt;/span&gt;:&lt;br/&gt;&lt;br/&gt;&lt;img class="align-full" src="http://www.raqsoft.com/help/blog/agile_desktop_BI-5.png"/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;Simply input "&lt;b&gt;={A3}&lt;/b&gt;" in E2, and the top 5 players of each game will be calculated out in E2, E8, and E14. Input "&lt;b&gt;={E2}.isect ()&lt;/b&gt;" in E1, then the players whose rankings are among the top 5 in every game will be calculated automatically. In which, the function &lt;b&gt;"isect"&lt;/b&gt; is to calculate the intersection of sets, and "&lt;b&gt;{E2}"&lt;/b&gt; is a set to indicate "cells shares the same meaning with E2 regarding business” (homocell by name), that is, &lt;b&gt;E2, E8, and E14&lt;/b&gt;.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;It also has the similar function like:&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;&lt;b&gt;diff()&lt;/b&gt;: Calculate the difference set of a group of data; for example, calculate the employee who made a full attendance in this quarter. 
You can calculate through the formulas like &lt;b&gt;[set of employees, employee who ever absent in the 1st month, employee who ever absent in the 2nd month, employee who ever absent in the 3rd month].diff()&lt;/b&gt; to calculate.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;There are also other advanced functions available, such as the &lt;b&gt;sum(~*~)&lt;/b&gt; to calculate the sum of squares of a certain group of data; &lt;b&gt;cumulate()&lt;/b&gt; to calculate the cumulative value of a group of data, &lt;b&gt;ord()&lt;/b&gt; to calculate the relative row number in the calculation hierarchy, and &lt;b&gt;ranki()&lt;/b&gt; to calculate the ranking of a certain number in a group of numbers.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span style="font-family: arial,helvetica,sans-serif;" class="font-size-2"&gt;All in all, agile desktop BI software supports the common BI functions, and is business personnel-oriented in the respect of installation deployment, and formula function.&lt;/span&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/0JucSOf5vQk" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:245036</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~5/ykTMNt1C3R0/agile_desktop_BI-1.png" length="0" type="image/png" /><feedburner:origEnclosureLink>http://www.raqsoft.com/help/blog/agile_desktop_BI-1.png</feedburner:origEnclosureLink></entry>
                            <entry>
                    <title>Predictive Analytics in Campaign Management</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/ThiA8yyLdPc/2004291:BlogPost:244727" />
                                        <id>tag:www.analyticbridge.com,2013-05-08:2004291:BlogPost:244727</id>
                                        <updated>2013-05-08T01:00:00.000Z</updated>
                    
                                            <author>
                            <name>Jozo Kovac</name>
                            <uri>http://www.analyticbridge.com/profile/JozoKovac</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;One of the most popular applications of predictive analytics is the optimization of marketing campaign management. I've implemented it several times for clients in four different industries, and here is my solution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Introduction to campaign management&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Marketing departments are usually responsible for managing marketing campaigns. Campaigns always have a trigger: they may be planned, or triggered by customer events and various alerts. Campaigns are run for a…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;One of the most popular applications of predictive analytics is the optimization of marketing campaign management. I've implemented it several times for clients in four different industries, and here is my solution.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Introduction to campaign management&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Marketing departments are usually responsible for managing marketing campaigns. Campaigns always have a trigger: they may be planned, or triggered by customer events and various alerts. Campaigns are run for a purpose and usually carry a strong call-to-action message. The most common calls to action are acquisition, cross-sell, retention, or win-back. Campaigns are delivered through various channels. There are channels for Above the Line (ATL) and Below the Line (BTL) communication.&lt;/p&gt;
&lt;div align="center"&gt;&lt;br/&gt; &lt;a href="http://www.7segments.com/wp-content/uploads/2013/04/Slide2.jpg"&gt;&lt;img class=" wp-image-302" title="Customer Management Solution" alt="Predictive Analytics has important role in Campaign Management" src="http://www.7segments.com/wp-content/uploads/2013/04/Slide2.jpg" width="600" height="450"/&gt;&lt;/a&gt;&lt;/div&gt;
&lt;div align="center"&gt;Customer Management Solution with Predictive Components&lt;/div&gt;
&lt;p&gt;&lt;br/&gt; The feedback loop is very important. In my solution, it consists of four components: response collection &amp;amp; reporting, evaluation across customer segments, prediction of what happens next, and ROI analysis / projection.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Role of predictive analytics&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Predictive analytics is used for campaign optimization. It can identify the best customers for a campaign with a specific goal. This approach either saves most of the budget or leads to better results with the same budget. For example, with 20% of the budget you can achieve 80% of all possible sales in a campaign. Or you can identify customers at risk of churn and run very precise campaigns with a "stay with us" call to action.&lt;/p&gt;
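&lt;p&gt;To make the budget argument concrete, here is a minimal Python sketch of how one might check what share of sales the top-scored customers deliver. It is an illustration under stated assumptions, not the solution described in this post: the function name, the scores, and the sales figures are hypothetical, and it assumes that every contact costs the same, so a budget fraction translates directly into a fraction of customers contacted.&lt;/p&gt;

```python
def captured_sales_share(customers, budget_fraction=0.2):
    """Share of total sales obtained by contacting only the top-scored customers.

    customers: list of (predicted_score, actual_sales) pairs.
    budget_fraction: fraction of customers the budget allows us to contact.
    """
    # Rank customers from highest to lowest predicted score.
    ranked = sorted(customers, key=lambda pair: pair[0], reverse=True)
    # Contact at least one customer, otherwise the chosen fraction of the list.
    n_contacted = max(1, round(len(ranked) * budget_fraction))
    total_sales = sum(sales for _, sales in ranked)
    if total_sales == 0:
        return 0.0
    contacted_sales = sum(sales for _, sales in ranked[:n_contacted])
    return contacted_sales / total_sales


# Hypothetical scored customers: (model score, sales if contacted).
customers = [
    (0.95, 40), (0.90, 35), (0.70, 10), (0.60, 5), (0.50, 4),
    (0.40, 3), (0.30, 2), (0.20, 1), (0.10, 0), (0.05, 0),
]
print(captured_sales_share(customers))  # 0.75
```

&lt;p&gt;In this made-up example, contacting the top 2 of 10 customers captures 75 of the 100 units of sales. How close a real campaign gets to such a ratio depends entirely on how well the model ranks customers.&lt;/p&gt;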
&lt;p&gt;Customer segmentation provides better insight. One campaign can work well in one segment and fail completely in another. Segmentation reveals such failures, so you never repeat the same failure in a given segment. And finally we get to profitability. It's always great to measure it in retrospect and see what happened. And it's even better to simulate what will happen next! Within this loop you have all the data required for projections. Up to a certain point, customer behavior is quite predictable, so you can make a well-informed go / no-go decision.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Conclusion&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Predictive analytics is optional, but for companies that have mastered it, it delivers great value. The majority of companies are not using it at all. Some companies use predictive analytics but don't run any campaigns; they leave the results in a drawer, for various reasons. The most common reason is the lack of a good infrastructure that makes the process manageable for business people without hard skills. That is the topic for my next post. Stay tuned.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Joseph Kovac&lt;/p&gt;
&lt;p&gt;CEO of 7SEGMENTS&lt;/p&gt;
&lt;p&gt;&lt;a href="http://www.7segments.com"&gt;http://www.7segments.com&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/ThiA8yyLdPc" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244727</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~5/24X3PYTz6qs/Slide2.jpg" length="0" type="image/jpeg" /><feedburner:origEnclosureLink>http://www.7segments.com/wp-content/uploads/2013/04/Slide2.jpg</feedburner:origEnclosureLink></entry>
                            <entry>
                    <title>Choosing a BI Vendor - Making the Short List</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/QVGc-lMhI-o/2004291:BlogPost:244368" />
                                        <id>tag:www.analyticbridge.com,2013-05-05:2004291:BlogPost:244368</id>
                                        <updated>2013-05-05T13:17:19.000Z</updated>
                    
                                            <author>
                            <name>Galia Nedvedovich</name>
                            <uri>http://www.analyticbridge.com/profile/GaliaNedvedovich</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;div&gt;There is no shortage of business intelligence vendors out there. They all claim to be powerful, easy-to-use, flexible and affordable. So how do you pick the one that is right for you?&lt;/div&gt;
&lt;p&gt; &lt;/p&gt;
&lt;div&gt;In order to be able to choose the right BI vendor from the abundance out there, the best way is to follow high-level, yet restrictive, criteria and only then compare them on a feature-by-feature basis. Here are a few tips that will help you do that, as well as avoid common mistakes…&lt;/div&gt;                    </summary>

                    <content type="html">
&lt;div&gt;There is no shortage of business intelligence vendors out there. They all claim to be powerful, easy-to-use, flexible and affordable. So how do you pick the one that is right for you?&lt;/div&gt;
&lt;p&gt; &lt;/p&gt;
&lt;div&gt;To choose the right BI vendor from the abundance out there, the best way is to apply high-level, yet restrictive, criteria first and only then compare vendors on a feature-by-feature basis. Here are a few tips that will help you do that, as well as avoid common mistakes typically made when choosing a BI solution. This is the 21st century, and BI solutions are completely different from what you may be used to. If you follow these tips, you’ll end up with a very short list of vendors, and then it’ll just be a matter of choosing the one you feel most comfortable with in terms of specific features, pricing, support, etc.:&lt;/div&gt;
&lt;p&gt; &lt;/p&gt;
&lt;div class="separator"&gt;&lt;/div&gt;
&lt;div&gt;&lt;b&gt;Find a Complete Solution, Not Just Pretty Visualization.&lt;/b&gt;&lt;/div&gt;
&lt;div&gt;The visualization of data is important, of course, but the biggest mistake you can make is to judge a BI vendor based on the pretty dashboard samples they show you on their website or during a demo. Every BI vendor can do that because visualization software components are a dime a dozen. The real challenge is customizing these dashboards to your own needs and having them show your own data. This part usually takes most vendors months, and costs you bundles. If the BI vendor cannot get your own data to show the way you like it within just a few days, you could probably find a better one.&lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt;&lt;b&gt;Beware of the Data Warehouse.&lt;/b&gt;&lt;/div&gt;
&lt;div&gt;A data warehouse is a centralized database filled with all the business’s data, and for years it’s been making a ton of money for BI vendors and bringing nothing but grief to customers. Today’s BI technology does not require a data warehouse, even when there are multiple data sources involved, large amounts of data, or multiple users querying the data. There are very specific scenarios where a data warehouse is a good idea, but they are most likely not relevant to you. If the vendor requires a data warehouse to proceed with implementation, you should most likely keep looking.&lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt;&lt;b&gt;Beware of the OLAP Cube.&lt;/b&gt;&lt;/div&gt;
&lt;div&gt;OLAP, which stands for Online Analytical Processing, is a 20-year-old technology designed to improve query performance over medium to large datasets. OLAP is also very lengthy and costly to implement, and there is really no need for it anymore. Today’s BI technology can handle even huge amounts of data without OLAP, at a fraction of the time and cost. If the BI vendor requires OLAP to assure you acceptable query performance, you should probably move on.&lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt;&lt;b&gt;Refuse to Make Significant Upfront Investments.&lt;/b&gt;&lt;/div&gt;
&lt;div&gt;Many BI vendors will promise you the world, but will demand a significant upfront investment in preparation projects, hardware, and software before you even get to run a single report on your actual data. Do not agree to this, and demand to have at least one solid report or dashboard running over your own data before you commit to anything significant in advance. If the vendor is not willing to do so, it’s probably because they would have to spend weeks on development before they can reach that point. That typically means this vendor is either using very old technology or is simply trying to pull one over on you.&lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt;&lt;b&gt;Be Wary of Vendors Whose Business Is Professional Services.&lt;/b&gt;&lt;/div&gt;
&lt;div&gt;Vendors who sell real home-grown BI software products (in contrast to OEMing someone else's software) do not like engaging in long professional services projects because it hurts their margins. That is why they prefer to create software that is easy enough to be used directly by the customer or through a third party (which usually lives off these professional services contracts). If you choose a BI vendor that makes most of its money from professional services (as opposed to software sales), you can pretty much be sure that they will take their time building your solution. These types of BI vendors also live off ongoing maintenance services, so what you initially pay for the solution is actually only the beginning. Whenever possible, try to choose a BI vendor that focuses on selling BI software to the end customer, not to the professional services community.&lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt;&lt;b&gt;Make the Vendor Prove it To You.&lt;/b&gt;&lt;/div&gt;
&lt;div&gt;The most important thing is to make the vendor prove what they claim before you invest too much money upfront. This proof must take the form of reports, dashboards, or analytics in real-life scenarios, running on real data, used by the actual end users, and delivered within a reasonable amount of time. If a vendor is not willing to accommodate this simple request, you really should find one that is. Many vendors provide free trial versions and use technology that speeds up implementation tremendously. If the one you're in contact with now doesn't, they shouldn't make your short list.&lt;/div&gt;
&lt;div&gt; &lt;/div&gt;
&lt;div&gt;See more at &lt;a href="http://www.sisense.com"&gt;www.sisense.com&lt;/a&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/QVGc-lMhI-o" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244368</feedburner:origLink></entry>
                            <entry>
                    <title>Weekly Digest - May 6</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/WTy6ENA-Xtc/2004291:BlogPost:244187" />
                                        <id>tag:www.analyticbridge.com,2013-05-03:2004291:BlogPost:244187</id>
                                        <updated>2013-05-03T17:00:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;Announcements and articles from the last couple of days:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/dsc-monthly-contest"&gt;Viral program for data science authors and bloggers&lt;/a&gt; - Read updated version.…&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Announcements and articles from the last couple of days:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/dsc-monthly-contest"&gt;Viral program for data science authors and bloggers&lt;/a&gt; - Read updated version.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/r-code-for-model-free-data-driven-confidence-intervals"&gt;R code for model-free, data-driven confidence intervals&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/choosing-a-bi-vendor-making-the-short-list"&gt;Choosing a BI Vendor - Making the Short List&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/how-do-i-become-a-data-scientist-1"&gt;How do I become a data scientist?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/what-are-the-best-blogs-about-data"&gt;What are the best blogs about data?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/my-o-reilly-books"&gt;My O'Reilly book collection&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/i-could-not-resist-to-adding-this-book"&gt;I could not resist to adding this book&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/group/data-science-apprenticeship/forum/topics/update-about-our-data-science-apprenticeship" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Data Science Apprenticeship update&lt;/a&gt;&lt;span style="font-size: 10pt;"&gt; - New study material added: check out the starred items.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/shooting-stars" target="_self" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Shooting stars: new R code to produce new, great videos&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/when-data-flows-faster-than-it-can-be-processed" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;When data flows faster than it can be processed&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/big-data-big-relevance" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Big Data, Big Relevance&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.linkedin.com/groups?gid=4989164" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;New LinkedIn group about Data Science Training&lt;/a&gt;&lt;span style="font-size: 10pt;"&gt; - 3,000 members in the first 24 hours.&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/predicting-the-reality-show-winner-using-big-data" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Predicting the reality show winner using big data&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/how-to-find-first-order-effects-second-order-effects-and-their" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;How to find First and Second Order effects and their significance/causality in Data Analysis&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/can-we-automate-data-mining" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Can we automate data mining?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/how-to-install-mapr-m3-on-ubuntu-through-ubuntu-partner-archive" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;How to install MapR M3 on Ubuntu through Ubuntu Partner Archive.&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/i-did-it-jigsaw-academy-student-rajesh-ramamurthy-tells-us-how-he" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Rajesh Ramamurthy tells us how he made a career shift into analytics&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/6-advantages-of-analytics-driven-hiring-technology" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;6 Advantages of Analytics-driven Hiring Technology&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/group/bdn-daily-press-releases" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;New press releases&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/new-articles-posted-since-last-friday" target="_self"&gt;Previous digest&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/group/analyticjobs/forum/topics/career-alert-april-26"&gt;Recent jobs&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/page/links"&gt;Top Links&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/group/data-science/forum/topics/data-science-e-book-first-draft-available-for-download"&gt;Data Science eBook&lt;/a&gt; | &lt;a href="http://www.datasciencecentral.com/group/data-science-apprenticeship/forum/topics/update-about-our-data-science-apprenticeship"&gt;Apprenticeship&lt;/a&gt;&lt;/span&gt;&lt;/center&gt;
&lt;center&gt;&lt;div class="xg_module_body xg_user_generated"&gt;&lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;a href="http://www.linkedin.com/groups/Advanced-Business-Analytics-Data-Mining-35222"&gt;&lt;img src="http://datashaping.com/favico_linkedin.png" border="0"/&gt;&lt;/a&gt; &lt;a href="https://www.facebook.com/pages/AnalyticBridge/80509530789"&gt;&lt;img src="http://datashaping.com/favico_facebook.png"/&gt;&lt;/a&gt; &lt;a href="https://twitter.com/analyticbridge"&gt;&lt;img src="http://datashaping.com/favico_twitter.png" border="0"/&gt;&lt;/a&gt; &lt;a href="https://plus.google.com/u/0/communities/107156514183161811383/"&gt;&lt;img src="http://datashaping.com/favico_google.png" border="0"/&gt;&lt;/a&gt; &lt;a href="http://www.quora.com/Vincent-Granville"&gt;&lt;img src="http://datashaping.com/favico_quora.png" border="0"/&gt;&lt;/a&gt; &lt;a href="http://www.datasciencecentral.com/page/news-feeds"&gt;&lt;img src="http://datashaping.com/rss-favicon.png" border="0"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/center&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/WTy6ENA-Xtc" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244187</feedburner:origLink></entry>
                            <entry>
                    <title>6 Advantages of Analytics-driven Hiring Technology</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/4wAGd8sCj7U/2004291:BlogPost:244068" />
                                        <id>tag:www.analyticbridge.com,2013-05-02:2004291:BlogPost:244068</id>
                                        <updated>2013-05-02T16:30:00.000Z</updated>
                    
                                            <author>
                            <name>Mike Kennedy</name>
                            <uri>http://www.analyticbridge.com/profile/MikeKennedy</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p dir="ltr"&gt;By now you are probably familiar with the business value analytics can provide in a myriad of business contexts. If not, you’ll want to read &lt;a href="http://www.amazon.com/Value-Business-Analytics-Identifying-Profitability/dp/1118012399" target="_blank" title="The Value of Business Analytics by Evan Stubbs"&gt;&lt;em&gt;The Value of Business Analytics&lt;/em&gt;&lt;/a&gt; by Evan Stubbs as it’s one of my favorite analytics books for the business community.&lt;/p&gt;
&lt;p dir="ltr"&gt;One opportunity that cuts…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p dir="ltr"&gt;By now you are probably familiar with the business value analytics can provide in a myriad of business contexts. If not, you’ll want to read &lt;a title="The Value of Business Analytics by Evan Stubbs" href="http://www.amazon.com/Value-Business-Analytics-Identifying-Profitability/dp/1118012399" target="_blank"&gt;&lt;em&gt;The Value of Business Analytics&lt;/em&gt;&lt;/a&gt; by Evan Stubbs as it’s one of my favorite analytics books for the business community.&lt;/p&gt;
&lt;p dir="ltr"&gt;One opportunity that cuts across all industries, functions and levels, and where analytics-driven technology can provide a tremendous advantage, is improving how we hire. Hiring is a critical decision for any analytics manager because it has arguably the most direct impact on business performance – the people doing the work. At a high level, improved hiring decisions are what analytics-driven hiring technology offers.&lt;/p&gt;
&lt;p dir="ltr"&gt;Yet, according to the well-publicized &lt;a href="http://www.mckinsey.com/insights/business_technology/big_data_the_next_frontier_for_innovation" target="_blank"&gt;expected analytic talent shortage&lt;/a&gt;, many firms are finding it increasingly difficult to attract and retain top talent.&lt;/p&gt;
&lt;p dir="ltr"&gt;If we drill in a bit further, there may be some ways to address analytic talent hiring challenges by using analytics-driven hiring technology. What follows are some key advantages that make analytics-driven hiring technology a sound strategic investment for innovative hiring managers responsible for analytics teams, with a focus on optimizing both short and long term business performance.&lt;/p&gt;
&lt;p dir="ltr"&gt;&lt;strong&gt;Analytics-driven Hiring Technology Advantages&lt;/strong&gt;&lt;/p&gt;
&lt;p dir="ltr"&gt;&lt;strong&gt;1. Improve hiring decision effectiveness by reducing risk.&lt;/strong&gt; Bad hires are an expensive, yet pervasive and well-documented business problem. While no one can claim hiring perfection, analytics-driven technology that can reduce the risk of a bad hire is an invaluable ally for any hiring manager.&lt;/p&gt;
&lt;p dir="ltr"&gt;&lt;strong&gt;2. Reduce ramp-up time and business disruption.&lt;/strong&gt; No one enjoys a disruptive and time-consuming hiring process, whether it’s you as the hiring manager, your team, your candidates or the business. Analytics-driven hiring technology shortens disruption and ramp-up time by speeding up collaboration through &lt;a href="http://www.talentanalytics.com/blog/100-of-people-prefer-pre-boarding-to-on-boarding/" target="_blank"&gt;pre-boarding with Team Playbooks&lt;/a&gt;. The new hire hits the ground running and knows how to best work together with their team.&lt;/p&gt;
&lt;p dir="ltr"&gt;&lt;strong&gt;3. Google for your candidate pool.&lt;/strong&gt; The pool for the role you are hiring for may hold dozens or even hundreds of candidates, making it time-consuming to find those who most closely match what the role requires. Just like a search engine uses algorithms behind the scenes to instantly shortlist search results for you, analytics-driven technology enables you to benchmark the specific role and let the analytics engine shortlist job candidates who most closely match its pre-defined requirements in the context of the culture of your team and organization.&lt;/p&gt;
&lt;p dir="ltr"&gt;&lt;strong&gt;4. Optimize interview time.&lt;/strong&gt; Analytics-driven technology enables innovative hiring managers to conduct candidate gap analysis prior to the interview. This allows interview time to be spent better understanding specifically how the candidate will perform in the role, giving managers valuable data to support their decision that they would otherwise learn only after the decision has been made.&lt;/p&gt;
&lt;p dir="ltr"&gt;&lt;strong&gt;5. Reduce interview bias.&lt;/strong&gt; Since two different interviewers may have two completely opposite opinions of how a candidate did in the interview, analytics-driven technology provides a consensus on key areas to focus on. This is an easy way to add more objectivity to the candidate rating process by taking the interviewer out of it.&lt;/p&gt;
&lt;p&gt;&lt;b id="docs-internal-guid-4a08fb1f-5c7e-2581-c174-9344bb02d948"&gt;6. Improve employee/manager relationship. &lt;/b&gt;If the common saying “employees join companies and leave managers” is true, it underscores the importance of getting off on the right foot. Analytics-driven technology allows you as hiring manager to quickly aggregate the communication preferences of your direct reports from day one (and vice versa) by quickly understanding opportunities for synergies and conflict. This lets both you as manager and your employees customize communication and develop stronger rapport, which in turn yields greater engagement and productivity.&lt;/p&gt;
&lt;p&gt;This is just a quick look at a few opportunities analytics-driven hiring technology provides.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;&lt;strong&gt;What other advantages can analytics-driven hiring technology provide? Feel free to comment below!&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/4wAGd8sCj7U" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:244068</feedburner:origLink></entry>
                            <entry>
                    <title>I did it! Jigsaw Academy student Rajesh Ramamurthy tells us how he made a career shift into analytics.</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/6TUASZ8k8r0/2004291:BlogPost:243882" />
                                        <id>tag:www.analyticbridge.com,2013-05-02:2004291:BlogPost:243882</id>
                                        <updated>2013-05-02T08:33:55.000Z</updated>
                    
                                            <author>
                            <name>Jigsaw Academy</name>
                            <uri>http://www.analyticbridge.com/profile/JigsawAcademy</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;Many of Jigsaw’s students join our various courses because they can hear the rising crescendo within the analytics industry. They yearn to be a part of this challenging and exciting career, one that is fast paced, has rich monetary returns and can fast track your career path. For those of you out there, who also have such dreams, are you skeptical and nervous because you have no previous experience in analytics? Well, be reassured, many have walked a similar path. Having no previous…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;Many of Jigsaw’s students join our various courses because they can hear the rising crescendo within the analytics industry. They yearn to be a part of this challenging and exciting career, one that is fast paced, has rich monetary returns and can fast track your career path. For those of you out there, who also have such dreams, are you skeptical and nervous because you have no previous experience in analytics? Well, be reassured, many have walked a similar path. Having no previous analytics experience, they took that first step and contacted Jigsaw, registered for one of our analytics courses, worked hard at their assignments and upon completion of the course, were able to find a job in the analytics sector.&lt;/p&gt;
&lt;p&gt;This week we would like to showcase one of our bright and talented students who after successfully completing one of our courses, has been able to break into the analytics arena.&lt;/p&gt;
&lt;p&gt;Read more at &lt;a href="http://analyticstraining.com/2013/i-did-it-jigsaw-academy-student-rajesh-ramamurthy-tells-us-how-he-made-a-career-shift-into-analytics/" target="_blank"&gt;analyticstraining.com&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/6TUASZ8k8r0" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:243882</feedburner:origLink></entry>
                            <entry>
                    <title>Shooting stars</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/DRymNQcqbBc/2004291:BlogPost:243602" />
                                        <id>tag:www.analyticbridge.com,2013-04-30:2004291:BlogPost:243602</id>
                                        <updated>2013-04-30T01:00:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;This is a follow up to our video series &lt;em&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/from-chaos-to-clusters-statistical-modeling-without-models" target="_blank"&gt;From chaos to clusters&lt;/a&gt;,&lt;/em&gt; made with data points moving over time to form clusters, and produced with open source and home-made data science algorithms.…&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;This is a follow up to our video series &lt;em&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/from-chaos-to-clusters-statistical-modeling-without-models" target="_blank"&gt;From chaos to clusters&lt;/a&gt;,&lt;/em&gt; made with data points moving over time to form clusters, and produced with open source and home-made data science algorithms.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;See below two frames from the new video, now featuring line segments connecting a current point to its location in the previous frame. These line segments are overwritten and change constantly from iteration to iteration, creating a "shooting stars" visual effect when you watch the video.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://api.ning.com:80/files/qTBoDPGnvtyLcMzG7ncyrcRMxC-K6kVBf5sv4jFUgRFzrG6MnOuZkyDGbaCUh3d9psxUvgqm*okrt0-*M3rN5c35vMeTl0pS/shot1.png" target="_self"&gt;&lt;img src="http://api.ning.com:80/files/qTBoDPGnvtyLcMzG7ncyrcRMxC-K6kVBf5sv4jFUgRFzrG6MnOuZkyDGbaCUh3d9psxUvgqm*okrt0-*M3rN5c35vMeTl0pS/shot1.png" width="435" class="align-center"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://api.ning.com:80/files/qTBoDPGnvtyR4qBBJ*8-3HvIb2aPa1oo*sVlG1YeoYw83EddSTG-Xg-BDzXDxXlIGPiJdKy0Ggl*NWXuOj1nTuNj50xXaWx7/shot2.png" target="_self"&gt;&lt;img src="http://api.ning.com:80/files/qTBoDPGnvtyR4qBBJ*8-3HvIb2aPa1oo*sVlG1YeoYw83EddSTG-Xg-BDzXDxXlIGPiJdKy0Ggl*NWXuOj1nTuNj50xXaWx7/shot2.png" width="435" class="align-center"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-size: 10pt; font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Towards the end of the video, the clusters are well formed (though they are also moving, especially the one at the bottom-right corner) and points coming from outside are progressively attracted to the nearest cluster: you can see them quickly getting close and then getting absorbed. &lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Here are the two new videos&lt;/strong&gt;:&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/video/from-chaos-to-cluster-part-4" target="_blank"&gt;Video #4&lt;/a&gt; (orange, more visually pleasing | &lt;a href="http://www.youtube.com/watch?v=u1oZdJ0ph_g" target="_blank"&gt;YouTube version&lt;/a&gt;)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/video/from-chaos-to-cluster-part-5" target="_blank"&gt;Video #5&lt;/a&gt; (blue, better &lt;em&gt;shooting star&lt;/em&gt; effect | &lt;a href="http://www.youtube.com/watch?v=fb0yevGSAxY" target="_blank"&gt;YouTube version&lt;/a&gt;)&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datashaping.com/rfile3.txt" target="_blank"&gt;Download the data file rfile3.txt&lt;/a&gt; used to produce these videos (also available in &lt;a href="http://www.datashaping.com/rfile3.txt.gz" target="_blank"&gt;compressed format&lt;/a&gt;). These videos are based on the following R script (a more complex version of &lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/simple-solutions-to-make-videos-with-r" target="_blank"&gt;our initial R script&lt;/a&gt;):&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;R Source code&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;vv&amp;lt;-read.table("c:/vincentg/rfile3.txt",header=TRUE);&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;iter&amp;lt;-vv$iter;&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;for (n in 1:199) {&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  x&amp;lt;-vv$x[iter == n];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  y&amp;lt;-vv$y[iter == n];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  z&amp;lt;-vv$new[iter == n];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  u&amp;lt;-vv$d2init[iter == n];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  v&amp;lt;-vv$d2last[iter == n];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  p&amp;lt;-vv$x[iter == n-1];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  q&amp;lt;-vv$y[iter == n-1];&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  u[u&amp;gt;1]&amp;lt;-1;&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  v[v&amp;gt;0.10]&amp;lt;-0.10;&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  s=1/sqrt(1+n);&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  if (n==1) {&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;    
plot(p,q,xlim=c(-0.08,1.08),ylim=c(-0.08,1.09),pch=20,cex=0,col=rgb(1,1,0),xlab="",ylab="",axes=TRUE  );&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  }&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  points(p,q,col=rgb(1-s,1-s,1-s),pch=20,cex=1);&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  segments(p,q,x,y,col=rgb(0,0,1));&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  points(x,y,col=rgb(z,0,0),pch=20,cex=1);&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  Sys.sleep(5*s);&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;  segments(p,q,x,y,col=rgb(1,1,1));&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;}&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;segments(p,q,x,y,col=rgb(0,0,1)); # redraw the final frame's segments in blue&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: 'courier new', courier;" class="font-size-1"&gt;points(x,y,col=rgb(z,0,0),pch=20,cex=1);&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Related articles&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/from-chaos-to-clusters-statistical-modeling-without-models" target="_blank"&gt;From chaos to clusters - statistical modeling without models&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/simple-solutions-to-make-videos-with-r" target="_blank"&gt;Simple solutions to make videos with R&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/data-science-the-end-of-statistics" target="_blank"&gt;Data Science: The End of Statistics?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/fast-clustering-algorithms-for-massive-datasets" target="_blank"&gt;Fast clustering algorithms for massive datasets&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets" target="_blank"&gt;Other useful pieces of code (Perl, Python, R etc.)&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/internet-topology-massive-and-amazing-graphs" target="_blank"&gt;Internet Topology - Massive and Amazing Graphs&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/3-d-visualizations-for-small-and-big-data" target="_blank"&gt;3-D Visualizations with rotating charts, for small and big data&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/great-graphic-diagrams" target="_blank"&gt;Great graphic diagrams&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/two-more-beautiful-graphs" target="_blank"&gt;Two more interesting graphs&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/a-new-way-to-define-centrality" target="_blank"&gt;A new way to define centrality&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/14-questions-about-data-visualization-tools" target="_blank"&gt;14 questions about data visualization tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-top-20-data-visualisation-tools"&gt;The top 20 data visualisation tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/another-cute-graph" target="_blank"&gt;Another cute graph&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/5-books-on-data-visua-ization" target="_blank"&gt;5 books on data visualization&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/registered-meteorites-that-has-impacted-on-earth-visualized" target="_blank"&gt;Registered meteorites that has impacted on Earth visualized&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-curse-of-big-data"&gt;The curse of big data&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/how-to-detect-a-pattern-problem-and-solution" target="_blank"&gt;How to detect a pattern? Problem and solution&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/DRymNQcqbBc" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:243602</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~5/zdi2kVgCavU/shot1.png" length="0" type="image/png" /><feedburner:origEnclosureLink>http://api.ning.com:80/files/qTBoDPGnvtyLcMzG7ncyrcRMxC-K6kVBf5sv4jFUgRFzrG6MnOuZkyDGbaCUh3d9psxUvgqm*okrt0-*M3rN5c35vMeTl0pS/shot1.png</feedburner:origEnclosureLink></entry>
                            <entry>
                    <title>Predicting the reality show winner using big data</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/avkvmMDxjw0/2004291:BlogPost:243585" />
                                        <id>tag:www.analyticbridge.com,2013-04-29:2004291:BlogPost:243585</id>
                                        <updated>2013-04-29T15:25:27.000Z</updated>
                    
                                            <author>
                            <name>Eka Aulia</name>
                            <uri>http://www.analyticbridge.com/profile/EkaAulia</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;I wonder: using big data and predictive analytics, can we predict the winner of X Factor or American Idol from their audition performance alone? I think we might have a good chance of predicting the winner right away.&lt;/p&gt;
&lt;p&gt;If we had only the information from their first performance, which variables should the predictive model use? Here’s what I could think of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span&gt;The voice: quantified timbre, energy and rhythm&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;Song…&lt;/li&gt;
&lt;/ul&gt;                    </summary>

                    <content type="html">
&lt;p&gt;I wonder: using big data and predictive analytics, can we predict the winner of X Factor or American Idol from their audition performance alone? I think we might have a good chance of predicting the winner right away.&lt;/p&gt;
&lt;p&gt;If we had only the information from their first performance, which variables should the predictive model use? Here’s what I could think of:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span&gt;The voice: quantified timbre, energy and rhythm&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;Song selection: how popular the song was, type of music (pop/jazz/country)&lt;/li&gt;
&lt;li&gt;The singer's appearance: body mass, hair color, skin color, clothing, type of shoes, color contrast, etc. (some of these variables might not be legal to use)&lt;/li&gt;
&lt;li&gt;The early response from the panel of judges: number of yes/no votes&lt;/li&gt;
&lt;li&gt;Wisdom of the crowds: mentions on Twitter, number of good vs. bad sentiments, number of videos uploaded to YouTube, number of downloads from iTunes, etc.&lt;/li&gt;
&lt;li&gt;Audience applause: loudness of the audience's clapping in decibels&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Once we have all of these variables, we might be able to predict the winner of the coming season of X Factor/American Idol/The Voice.&lt;/p&gt;
&lt;p&gt;However, if the model is able to predict the winner, who would benefit from this algorithm? The producer of the show could be one. Once he/she knows who is likely to win, he/she can play with the TV viewers’ emotions by altering some of the significant variables. The audience could become more attached if their favorite singer is about to lose and needs more support. With more viewers getting more attached, the show can earn higher ratings and higher advertising income. Hear the KA-CHING?&lt;/p&gt;
&lt;p&gt;Can we do this project just for fun? :)&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href="http://haveasanookday.wordpress.com/2013/04/29/predicting-the-reality-show-winner-using-big-data-x-factorbritain-got-talentamerican-idols/" target="_blank"&gt;Predicting the reality show winner using big data&lt;/a&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/avkvmMDxjw0" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:243585</feedburner:origLink></entry>
                            <entry>
                    <title>Hadoop Herd : When to use What...</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/KMr0JYvL0io/2004291:BlogPost:243347" />
                                        <id>tag:www.analyticbridge.com,2013-04-26:2004291:BlogPost:243347</id>
                                        <updated>2013-04-26T00:55:43.000Z</updated>
                    
                                            <author>
                            <name>Mohammad Tariq Iqbal</name>
                            <uri>http://www.analyticbridge.com/profile/MohammadTariqIqbal</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;a href="http://api.ning.com:80/files/PD*OEYumS6UTNxTgjoZK-YB7FgHvOcrBM9f5XOz69rLxIzyGOkrUxqNZJq3yr*tkGFE*W2EUftokNFFnPFPmuSZ7XTFo2UD2/96bbea98986e4646bfb17a148e4a13ddwallpaper.jpg" target="_self"&gt;&lt;img class="align-full" src="http://api.ning.com:80/files/PD*OEYumS6UTNxTgjoZK-YB7FgHvOcrBM9f5XOz69rLxIzyGOkrUxqNZJq3yr*tkGFE*W2EUftokNFFnPFPmuSZ7XTFo2UD2/96bbea98986e4646bfb17a148e4a13ddwallpaper.jpg?width=750" width="750"&gt;&lt;/img&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;8 years ago not even Doug Cutting would have thought that the tool which he's naming after the name of his kid's soft toy would so soon become a rage and change the way people and organizations look at their data. Today Hadoop and BigData have almost become synonyms to…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;a href="http://api.ning.com:80/files/PD*OEYumS6UTNxTgjoZK-YB7FgHvOcrBM9f5XOz69rLxIzyGOkrUxqNZJq3yr*tkGFE*W2EUftokNFFnPFPmuSZ7XTFo2UD2/96bbea98986e4646bfb17a148e4a13ddwallpaper.jpg" target="_self"&gt;&lt;img src="http://api.ning.com:80/files/PD*OEYumS6UTNxTgjoZK-YB7FgHvOcrBM9f5XOz69rLxIzyGOkrUxqNZJq3yr*tkGFE*W2EUftokNFFnPFPmuSZ7XTFo2UD2/96bbea98986e4646bfb17a148e4a13ddwallpaper.jpg?width=750" width="750" class="align-full"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Eight years ago not even Doug Cutting would have thought that the tool he was naming after his kid's soft toy would so soon become a rage and change the way people and organizations look at their data. Today Hadoop and BigData have almost become synonyms. But Hadoop is not just Hadoop now. Over time it has evolved into one big herd of various tools, each meant to serve a different purpose. But glued together they give you a powerpacked combo.&lt;br/&gt;&lt;br/&gt;Having said that, one must be careful while choosing these tools for a specific use case, as one size doesn't fit all. What works for someone else might not be that productive for you. So, here I am trying to show you which tool should be picked in which scenario. It's not a big comparative study but a short intro to some very useful tools. And I am really not an expert or an authority, so there is always scope for suggestions. Please feel free to comment or suggest if you have any. I would love to hear from you. Let's get started:&lt;br/&gt;&lt;br/&gt;1-&lt;i&gt; &lt;/i&gt;&lt;i&gt;&lt;a href="http://hadoop.apache.org/" target="_blank"&gt;Hadoop&lt;/a&gt; &lt;/i&gt;: Hadoop is basically 2 things: a distributed file system (HDFS), which constitutes Hadoop's storage layer, and a distributed computation framework (MapReduce), which constitutes the processing layer. You should go for Hadoop if your data is very large and you have offline, batch processing needs. Hadoop is not suitable for real time stuff. You set up a Hadoop cluster on a group of commodity machines connected over a network. You then store huge amounts of data in HDFS and process this data by writing MapReduce programs (or jobs). 
Being distributed, HDFS is spread across all the machines in a cluster and MapReduce processes this scattered data locally by going to each machine, so that you don't have to relocate this gigantic amount of data.&lt;br/&gt;&lt;br/&gt;2- &lt;a href="http://hbase.apache.org/" target="_blank"&gt;HBase&lt;/a&gt;&lt;i&gt; &lt;/i&gt;: HBase is a distributed, scalable, big data store, modelled after Google's BigTable. It stores data as key/value pairs. It's basically a database, a NoSQL database, and like any other database its biggest advantage is that it provides you random read/write capabilities. As I have mentioned earlier, Hadoop is not very good for your real time needs, so you can use HBase to serve that purpose. If you have some data which you want to access in real time, you could store it in HBase. HBase has its own set of very good APIs which can be used to push/pull the data. Not only this, HBase can be seamlessly integrated with MapReduce so that you can do bulk operations, like indexing, analytics etc.&lt;br/&gt;&lt;br/&gt;Tip : You could use Hadoop as the repository for your static data and HBase as the datastore which will hold data that is probably going to change over time after some processing.&lt;br/&gt;&lt;br/&gt;3- &lt;a href="http://hive.apache.org/" target="_blank"&gt;Hive&lt;/a&gt; : Originally developed by Facebook, Hive is basically a data warehouse. It sits on top of your Hadoop cluster and provides you an SQL-like interface to the data stored in your Hadoop cluster. You can then write SQLish queries using Hive's query language, called HiveQL, and perform operations like store, select, join, and much more. It makes processing a lot easier as you don't have to do lengthy, tedious coding. Write simple Hive queries and get the results. Isn't that cool? RDBMS folks will definitely love it. Simply map HDFS files to Hive tables and start querying the data. 
Not only this, you could map HBase tables as well, and operate on that data.&lt;br/&gt;&lt;br/&gt;Tip : Use Hive when you have warehousing needs and you are good at SQL and don't want to write MapReduce jobs. One important point though: Hive queries get converted into corresponding MapReduce jobs under the hood, which run on your cluster and give you the result. Hive does the trick for you. But not every problem can be solved using HiveQL. Sometimes, if you need really fine grained and complex processing, you might have to fall back on MapReduce.&lt;br/&gt;&lt;br/&gt;4- &lt;i&gt;&lt;a href="http://pig.apache.org/" target="_blank"&gt;Pig&lt;/a&gt;&lt;/i&gt; : Pig is a dataflow language that allows you to process enormous amounts of data very easily and quickly by repeatedly transforming it in steps. It basically has 2 parts, the PigInterpreter and the language, PigLatin. Pig was originally developed at Yahoo and they use it extensively. Like Hive queries, PigLatin queries also get converted into MapReduce jobs and give you the result. You can use Pig for data stored both in HDFS and HBase very conveniently. Just like Hive, Pig is also really efficient at what it is meant to do. It saves a lot of your effort and time by letting you skip writing MapReduce programs and do the operation through straightforward Pig queries.&lt;br/&gt;&lt;br/&gt;Tip : Use Pig when you want to do a lot of transformations on your data and don't want to take the pain of writing MapReduce jobs.&lt;br/&gt;&lt;br/&gt;5- &lt;i&gt;&lt;a href="http://sqoop.apache.org/" target="_blank"&gt;Sqoop&lt;/a&gt;&lt;/i&gt; : Sqoop is a tool that allows you to transfer data between relational databases and Hadoop. It supports incremental loads of a single table or a free form SQL query as well as saved jobs which can be run multiple times to import updates made to a database since the last import. Not only this, imports can also be used to populate tables in Hive or HBase. 
Along with this, Sqoop also allows you to export the data back into the relational database from the cluster.&lt;br/&gt;&lt;br/&gt;Tip : Use Sqoop when you have lots of legacy data and you want it to be stored and processed over your Hadoop cluster or when you want to incrementally add the data to your existing storage.&lt;br/&gt;&lt;br/&gt;6- &lt;i&gt;&lt;a href="http://oozie.apache.org/" target="_blank"&gt;Oozie&lt;/a&gt;&lt;/i&gt; : Now you have everything in place and want to do the processing but find it crazy to start the jobs and manage the workflow manually all the time, especially in cases when it is required to chain multiple MapReduce jobs together to achieve a goal. You would like to have some way to automate all this. No worries, Oozie comes to the rescue. It is a scalable, reliable and extensible workflow scheduler system. You just define your workflows (which are Directed Acyclic Graphs) once and the rest is taken care of by Oozie. You can schedule MapReduce jobs, Pig jobs, Hive jobs, Sqoop imports and even your Java programs using Oozie.&lt;br/&gt;&lt;br/&gt;Tip : Use Oozie when you have a lot of jobs to run and want some efficient way to automate everything based on time (frequency) and data availability.&lt;br/&gt;&lt;br/&gt;7- &lt;i&gt;&lt;a href="http://flume.apache.org/" target="_blank"&gt;Flume&lt;/a&gt;&lt;/i&gt;/&lt;i&gt;&lt;a href="http://incubator.apache.org/chukwa/" target="_blank"&gt;Chukwa&lt;/a&gt;&lt;/i&gt; : Both Flume and Chukwa are data aggregation tools and allow you to aggregate data in an efficient, reliable and distributed manner. You can pick data from some place and dump it into your cluster. Since you are handling BigData, it makes more sense to do it in a distributed and parallel fashion, which both these tools are very good at. 
You just have to define your flows and feed them to these tools, and the rest will be done automatically by them.&lt;br/&gt;&lt;br/&gt;Tip : Go for Flume/Chukwa when you have to aggregate huge amounts of data into your Hadoop environment in a distributed and parallel manner.&lt;br/&gt;&lt;br/&gt;8- &lt;a href="http://avro.apache.org/" target="_blank"&gt;Avro&lt;/a&gt;&lt;i&gt; &lt;/i&gt;: Avro is a data serialization system. It provides functionality similar to systems like Protocol Buffers, Thrift etc. In addition to that it provides some other significant features like rich data structures, a compact, fast, binary data format, a container file to store persistent data, an RPC mechanism and pretty simple integration with dynamic languages. And the best part is that Avro can easily be used with MapReduce, Hive and Pig. Avro uses JSON for defining data types.&lt;br/&gt;&lt;br/&gt;Tip : Use Avro when you want to serialize your BigData with good flexibility.&lt;br/&gt;&lt;br/&gt;&lt;br/&gt;The list is actually pretty big, but I have covered only the most significant tools. Over time, if I feel that something else should be mentioned here, I will definitely add it. Comments and suggestions are welcome.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/KMr0JYvL0io" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:243347</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~5/1w_tuDjEYE8/96bbea98986e4646bfb17a148e4a13ddwallpaper.jpg" length="0" type="image/jpeg" /><feedburner:origEnclosureLink>http://api.ning.com:80/files/PD*OEYumS6UTNxTgjoZK-YB7FgHvOcrBM9f5XOz69rLxIzyGOkrUxqNZJq3yr*tkGFE*W2EUftokNFFnPFPmuSZ7XTFo2UD2/96bbea98986e4646bfb17a148e4a13ddwallpaper.jpg</feedburner:origEnclosureLink></entry>
                            <entry>
                    <title>From chaos to clusters - statistical modeling without models</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/kvuaiYEyRJA/2004291:BlogPost:243240" />
                                        <id>tag:www.analyticbridge.com,2013-04-25:2004291:BlogPost:243240</id>
                                        <updated>2013-04-25T05:00:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p style="text-align: left;"&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;Here I provide the mathematics, explanations and source code to produce the data and moving clusters in the &lt;em&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/simple-solutions-to-make-videos-with-r" target="_blank"&gt;From chaos to clusters&lt;/a&gt;&lt;/em&gt; video series.…&lt;/span&gt;&lt;br&gt;&lt;/br&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p style="text-align: left;"&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Here I provide the mathematics, explanations and source code to produce the data and moving clusters in the &lt;em&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/simple-solutions-to-make-videos-with-r" target="_blank"&gt;From chaos to clusters&lt;/a&gt;&lt;/em&gt; video series.&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;small&gt;&lt;a href="http://api.ning.com:80/files/VzgFwBdBjDVE*0PU5gZc08*i3BTeqn49pw4PPrebyLU2GJoG98lnPkV**VQJowva4eN-Q9Ut6BiKcwUNMGhY4mGo6K-u8Fdi/pki.png" target="_self"&gt;&lt;img src="http://api.ning.com:80/files/VzgFwBdBjDVE*0PU5gZc08*i3BTeqn49pw4PPrebyLU2GJoG98lnPkV**VQJowva4eN-Q9Ut6BiKcwUNMGhY4mGo6K-u8Fdi/pki.png" width="369" class="align-center"/&gt;&lt;/a&gt;&lt;a href="http://www.analyticbridge.com/video/video"&gt;&lt;br/&gt;&lt;/a&gt;&lt;/small&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;A little bit of history on how the project started&lt;/strong&gt;:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Interest in astronomy, visualization and &lt;a href="http://www.analyticbridge.com/profiles/blogs/stat-models-to-solve-astronomical-mysteries" target="_blank"&gt;how physics models apply to business problems&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Research on &lt;a href="http://www.datasciencecentral.com/forum/topics/predicting-urban-growth-using-physics-models" target="_blank"&gt;how urban growth could be modeled by the gravitational law&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Interest in systems that produce clusters (as well as birth and death processes) and in &lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/simple-solutions-to-make-videos-with-r" target="_blank"&gt;visualizing cluster formation with videos rather than charts&lt;/a&gt;&lt;br/&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Creating art: videos with sound and images synchronized and both generated using data (coming soon). Maybe I'll be able to turn your business data into a movie (either artistic or insightful or both)! I'm already at the point where I can produce the video frames faster than they are delivered on the streaming device. I called it FRT for &lt;em&gt;faster than real time&lt;/em&gt;.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;What is a statistical model without a model?&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;There's actually a generic mathematical model behind the algorithm. But nobody cares about the model: the algorithm was created first, without having a mathematical model in mind. Initially, I had a gravitational model in mind, but I eventually abandoned it as it was not producing what I expected.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;This illustrates a new trend in data science: we care less and less about modeling, but more and more about results. My algorithm has a bunch of parameters and features that can be fine-tuned to produce anything you want - be it a simulation of a Neyman-Scott cluster process, or a simulation of some no-name stochastic process.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;It's a bit similar to how modern rock climbing has evolved: focusing on big names such as Everest in the past, to exploring deeper wilderness and climbing no-name peaks today (with their own challenges), to rock climbing on Mars in the future.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;You can fine tune the parameters to&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Achieve the best fit between simulated data and real business (or other) data, using traditional goodness-of-fit testing and sensitivity analysis. Note that the simulated data represents a realization (an &lt;em&gt;instance&lt;/em&gt; for object-oriented people) of a spatio-temporal stochastic process.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Once the parameters are calibrated, perform predictions (if you speak statistician language) or extrapolations (if you speak mathematician language).&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;So how does the algorithm work?&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;It starts with a random distribution of m &lt;em&gt;mobile&lt;/em&gt; points in the [0,1] x [0,1] square window. The points get attracted to each other (attraction is stronger to closest neighbors) and thus over time, they group into clusters.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;The algorithm has the following components:&lt;/span&gt;&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Creation of n random &lt;em&gt;fixed&lt;/em&gt; points (n=100) on [-0.5, 1.5] x [-0.5, 1.5]. This window is 4 times bigger than the one containing the &lt;em&gt;mobile&lt;/em&gt; points, to eliminate edge effects impacting the mobile points. These fixed points (they never move) also act as some sort of dark matter: they are invisible, they are not represented in the video, but they are the glue that prevents the whole system from collapsing onto itself and converging to a single point.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Creation of m random &lt;em&gt;mobile&lt;/em&gt; points (m=500) on [0,1] x [0,1].&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Main loop (200 iterations). At each iteration, we compute the distance d between each mobile point (x,y) and each of its m-1 mobile neighbors and n fixed neighbors. A weight w is computed as a function of d, with a special weight for the point (x,y) itself. Then the updated (x,y) is the weighted sum aggregated over all points, and we do that for each point (x,y) at each iteration. The weights are such that their sum over all points is always 1. In other words, we replace each point with a &lt;a href="http://en.wikipedia.org/wiki/Convex_combination" target="_blank"&gt;convex linear combination&lt;/a&gt; of all points.&lt;/span&gt;&lt;/li&gt;
&lt;/ol&gt;
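&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;The three components above can be sketched in Python as follows. This is only an illustration, not the original source code: the weight kernel, the self-weight value, and the choice to update all points simultaneously are assumptions made for the example, and the function names are hypothetical.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import math
import random


def weight(d):
    # Hypothetical kernel (an assumption): closer neighbors get a larger raw weight.
    return 1.0 / (0.01 + d * d)


def step(mobile, fixed, self_weight=5.0):
    """One iteration: replace each mobile point by a convex combination of all points."""
    updated = []
    for k, (x, y) in enumerate(mobile):
        raw = []  # (raw weight, neighbor) pairs for the point being updated
        for l, (p, q) in enumerate(mobile):
            if l == k:
                # Special weight for the point itself (value is an assumption).
                raw.append((self_weight, (x, y)))
            else:
                raw.append((weight(math.hypot(x - p, y - q)), (p, q)))
        for (p, q) in fixed:
            raw.append((weight(math.hypot(x - p, y - q)), (p, q)))
        total = sum(w for w, _ in raw)  # dividing by total makes the weights sum to 1
        new_x = sum(w * p for w, (p, _) in raw) / total
        new_y = sum(w * q for w, (_, q) in raw) / total
        updated.append((new_x, new_y))
    return updated


def simulate(m=500, n=100, iterations=200, seed=0):
    rng = random.Random(seed)
    # Fixed points live on the larger window [-0.5, 1.5] x [-0.5, 1.5].
    fixed = [(rng.uniform(-0.5, 1.5), rng.uniform(-0.5, 1.5)) for _ in range(n)]
    # Mobile points start on [0, 1] x [0, 1].
    mobile = [(rng.random(), rng.random()) for _ in range(m)]
    for _ in range(iterations):
        mobile = step(mobile, fixed)
    return mobile, fixed
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;With positive weights only, every updated point stays inside the window of the fixed points. Negative weights and births, discussed next, are deliberately left out of this sketch.&lt;/span&gt;&lt;/p&gt;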
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Special features&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;If the weight for (x,y) [the point being updated] is very high at a given iteration, then (x,y) will barely move.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;We have tested negative weights (especially for the point being updated) and we liked the results better. A delicate amount of negative weights also further prevents the system from collapsing and introduces a bit of chaos.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Occasionally, one point is replaced by a brand new, random point, rather than updated using the weighted sum of neighbors. We call this event a "birth". It happens for less than 1% of all point updates, and it happens more frequently at the beginning. Of course, you can play with these parameters.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;In the source code, the &lt;strong&gt;birth process&lt;/strong&gt;  (for point $k) is simply encoded as:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;if (rand()&amp;lt;0.1/(1+$iteration)) { # birth and death&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;  $tmp_x[$k]=rand();&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;  $tmp_y[$k]=rand();&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;  $rebirth[$k]=1;&lt;/span&gt;&lt;br/&gt; &lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;}&lt;/span&gt;&lt;/p&gt;
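&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;For readers who prefer Python, here is a rough equivalent of the birth rule, together with a small helper that averages the birth probability over a run. The function names are hypothetical, and counting iterations from 0 is an assumption.&lt;/span&gt;&lt;/p&gt;
&lt;pre&gt;
```python
import random


def birth_probability(iteration):
    # Same schedule as the snippet above: 0.1 / (1 + iteration).
    return 0.1 / (1 + iteration)


def maybe_rebirth(point, iteration, rng):
    """Return (new_point, reborn): a fresh uniform point on a birth, else the point unchanged."""
    if birth_probability(iteration) > rng.random():
        return (rng.random(), rng.random()), True
    return point, False


def average_birth_rate(iterations=200):
    # Mean birth probability per point update over the whole run.
    return sum(birth_probability(i) for i in range(iterations)) / iterations
```
&lt;/pre&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Under these assumptions, the probability starts at 10% and decays quickly, and its average over 200 iterations is roughly 0.3% per point update, which is consistent with births being rare overall but more frequent at the beginning.&lt;/span&gt;&lt;/p&gt;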
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;In the source code, in the inner loop over $k, the point ($x, $y) to be updated is referenced as point $k, that is, ($x, $y) = ($moving_x[$k], $moving_y[$k]). Also, in a loop over $l, one level deeper, ($p, $q), referenced as point $l, represents a neighboring point when computing the weighted average formula used to update ($x, $y). The function &lt;em&gt;distance&lt;/em&gt; accepts four arguments ($x, $y, $p, $q), computes the distance d, and returns $weight, the weight w.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/from-chaos-to-clusters-simulation-of-stochastic-processes" target="_blank"&gt;Click here to view source code&lt;/a&gt;.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Related articles&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/shooting-stars" target="_blank"&gt;New videos added&lt;/a&gt; (&lt;em&gt;shooting stars&lt;/em&gt; series with new R script)&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets/forum/topics/simple-solutions-to-make-videos-with-r" target="_blank" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Simple solutions to make videos with R&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/data-science-the-end-of-statistics" target="_blank" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Data Science: The End of Statistics?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/fast-clustering-algorithms-for-massive-datasets" target="_blank" style="font-size: 10pt; font-family: arial, helvetica, sans-serif;"&gt;Fast clustering algorithms for massive datasets&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/codesnippets" target="_blank" style="font-size: 10pt;"&gt;Other useful pieces of code (Perl, Python, R etc.)&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/internet-topology-massive-and-amazing-graphs" target="_blank" style="font-size: 10pt;"&gt;Internet Topology - Massive and Amazing Graphs&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/3-d-visualizations-for-small-and-big-data" target="_blank" style="font-size: 10pt;"&gt;3-D Visualizations with rotating charts, for small and big data&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/great-graphic-diagrams" target="_blank" style="font-size: 10pt;"&gt;Great graphic diagrams&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/two-more-beautiful-graphs" target="_blank" style="font-size: 10pt;"&gt;Two more interesting graphs&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/a-new-way-to-define-centrality" target="_blank" style="font-size: 10pt;"&gt;A new way to define centrality&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/14-questions-about-data-visualization-tools" target="_blank" style="font-size: 10pt;"&gt;14 questions about data visualization tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-top-20-data-visualisation-tools" style="font-size: 10pt;"&gt;The top 20 data visualisation tools&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/another-cute-graph" target="_blank" style="font-size: 10pt;"&gt;Another cute graph&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/5-books-on-data-visua-ization" target="_blank" style="font-size: 10pt;"&gt;5 books on data visualization&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/registered-meteorites-that-has-impacted-on-earth-visualized" target="_blank" style="font-size: 10pt;"&gt;Registered meteorites that has impacted on Earth visualized&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-curse-of-big-data" style="font-size: 10pt;"&gt;The curse of big data&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/how-to-detect-a-pattern-problem-and-solution" target="_blank" style="font-size: 10pt;"&gt;How to detect a pattern? Problem and solution&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/kvuaiYEyRJA" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:243240</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~5/4vcuvtMp8VY/pki.png" length="0" type="image/png" /><feedburner:origEnclosureLink>http://api.ning.com:80/files/VzgFwBdBjDVE*0PU5gZc08*i3BTeqn49pw4PPrebyLU2GJoG98lnPkV**VQJowva4eN-Q9Ut6BiKcwUNMGhY4mGo6K-u8Fdi/pki.png</feedburner:origEnclosureLink></entry>
                            <entry>
                    <title>Finance Analysis by Empowering Spreadsheet with SQL Ability</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/E_WxaBKZBcM/2004291:BlogPost:242963" />
                                        <id>tag:www.analyticbridge.com,2013-04-24:2004291:BlogPost:242963</id>
                                        <updated>2013-04-24T01:29:04.000Z</updated>
                    
                                            <author>
                            <name>Jim King</name>
                            <uri>http://www.analyticbridge.com/profile/JimKing</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;The &lt;a href="http://www.raqsoft.com/product-escalc"&gt;spreadsheet&lt;/a&gt; can implement the visualized calculation to some extent, and the nontechnical people can perform some rather complex calculations without having to learn the SQL. However, as the core of SQL, the relational query is unable to be implemented through spreadsheet, which adds complexity to the apparently simple problems of multi-table join.&lt;br&gt;&lt;/br&gt;&lt;br&gt;&lt;/br&gt;For example, the Finance department needs to calculate the salary, and the…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;A &lt;a href="http://www.raqsoft.com/product-escalc"&gt;spreadsheet&lt;/a&gt; can perform visual calculations to some extent, and nontechnical people can carry out rather complex calculations without having to learn SQL. However, the relational query, which is the core of SQL, cannot be implemented in a spreadsheet, and this complicates the apparently simple problem of a multi-table join.&lt;br/&gt;&lt;br/&gt;For example, the Finance department needs to calculate salaries, and the relevant data is stored in the “standard sheet”, the “Absence sheet”, and the “Performance sheet”, as shown in the figure below:&lt;br/&gt;&lt;br/&gt;&lt;a href="http://3.bp.blogspot.com/-PLK2djaPWO0/UXYhQC2Oe2I/AAAAAAAAAx8/t23QuXd2P5w/s1600/esCalc+visulization+1.png" target="_blank"&gt;&lt;img class="align-full" src="http://3.bp.blogspot.com/-PLK2djaPWO0/UXYhQC2Oe2I/AAAAAAAAAx8/t23QuXd2P5w/s1600/esCalc+visulization+1.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;If these three sheets can be joined, then you can easily compute the salary as standardWages*(1+Evaluation-Absence/40)+Bonus, as shown below:&lt;br/&gt;&lt;br/&gt;&lt;a href="http://2.bp.blogspot.com/-DZZ8X3wlbwQ/UXYhQGkhriI/AAAAAAAAAxw/Xie4JUuG9Zc/s1600/esCalc+visulization+2.png" target="_blank"&gt;&lt;img class="align-full" src="http://2.bp.blogspot.com/-DZZ8X3wlbwQ/UXYhQGkhriI/AAAAAAAAAxw/Xie4JUuG9Zc/s1600/esCalc+visulization+2.png"/&gt;&lt;/a&gt;&lt;/p&gt;
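&lt;p&gt;To make the join and the salary formula concrete, here is a minimal Python sketch. The employee IDs and figures are invented sample data, the function and field names are hypothetical, and treating an employee missing from the Absence or Performance sheet as 0 is an assumption of the sketch.&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical sample data keyed by employee ID (not taken from the figures).
standard = {"E01": 4000.0, "E02": 5000.0, "E03": 4500.0}  # standard wages
absence = {"E02": 8.0}  # absence amount (the formula divides it by 40)
performance = {"E01": (0.10, 200.0), "E03": (0.05, 0.0)}  # (evaluation, bonus)


def join_sheets(standard, absence, performance):
    """Left-join absence and performance onto the standard sheet, defaulting to 0."""
    rows = []
    for emp, wages in standard.items():
        amount = absence.get(emp, 0.0)
        evaluation, bonus = performance.get(emp, (0.0, 0.0))
        # Salary formula from the text: wages * (1 + evaluation - absence / 40) + bonus
        salary = wages * (1 + evaluation - amount / 40) + bonus
        rows.append({"id": emp, "wages": wages, "absence": amount,
                     "evaluation": evaluation, "bonus": bonus, "salary": salary})
    return rows


for row in join_sheets(standard, absence, performance):
    print(row["id"], round(row["salary"], 2))
```
&lt;/pre&gt;
&lt;p&gt;Keying every sheet by employee ID is what turns the join into a simple lookup; the difficulty described in this article is that a plain spreadsheet offers no such keyed lookup out of the box.&lt;/p&gt;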
&lt;p&gt;However, common &lt;a href="http://www.raqsoft.com/download.html"&gt;business spreadsheet software&lt;/a&gt; such as Excel is usually quite inconvenient for such union and join operations. Copying data manually is error-prone, and it becomes even more exhausting when the data volume is large. Given this, composing formulas is one workable method. For example, enter the following three formulas in D2, E2, and F2 respectively:&lt;br/&gt;&lt;br/&gt;=IFERROR(INDIRECT("'Absence'!"&amp;amp;ADDRESS(MATCH(A2,'Absence'!$A:$A,0),2)),0)&lt;/p&gt;
&lt;p&gt;=IFERROR(INDIRECT("'Performance'!"&amp;amp;ADDRESS(MATCH(A2,'Performance'!$A:$A,0),2)),0)&lt;/p&gt;
&lt;p&gt;=IFERROR(INDIRECT("'Performance'!"&amp;amp;ADDRESS(MATCH(A2,'Performance'!$A:$A,0),3)),0)&lt;br/&gt;&lt;br/&gt;Formulas like these require strong technical skill and considerable experience with spreadsheets. In fact, people with that level of skill would rather import the data into a database and solve the problem with a simple relational query, because the formulas are hard to understand and error-prone.&lt;br/&gt;&lt;br/&gt;Is there a better method for non-technical users? There is: esCalc, an innovative desktop BI tool that supports relational queries. To join the Absence sheet with the Standard sheet, simply use the Join function, as shown in the figure below:&lt;br/&gt;&lt;br/&gt;&lt;a href="http://4.bp.blogspot.com/-Xae7fYFum6A/UXYhQRtvRqI/AAAAAAAAAx0/t_Xhe4KI2pw/s1600/esCalc+visulization+3.png" target="_blank"&gt;&lt;img class="align-full" src="http://4.bp.blogspot.com/-Xae7fYFum6A/UXYhQRtvRqI/AAAAAAAAAx0/t_Xhe4KI2pw/s1600/esCalc+visulization+3.png"/&gt;&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Similarly, you only need to perform the Join action once for the Performance sheet. The final result is just what we expected:&lt;br/&gt;&lt;br/&gt;&lt;a href="http://1.bp.blogspot.com/-kWp-gRaEq7M/UXYhQ1DTI7I/AAAAAAAAAyA/ON8a0lgnTOM/s1600/esCalc+visulization+4.png" target="_blank"&gt;&lt;img class="align-full" src="http://1.bp.blogspot.com/-kWp-gRaEq7M/UXYhQ1DTI7I/AAAAAAAAAyA/ON8a0lgnTOM/s1600/esCalc+visulization+4.png"/&gt;&lt;/a&gt;&lt;/p&gt;
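&lt;p&gt;For readers who think in code, the two Join steps and the salary formula can be sketched roughly as follows. This is only an illustrative Python sketch and says nothing about how esCalc works internally. The employee IDs, the figures, and the names join_sheets and salary are all hypothetical. It also assumes that the Performance sheet holds the evaluation and the bonus, and that a missing Absence or Performance row defaults to 0, as the IFERROR(..., 0) formulas do.&lt;/p&gt;
&lt;pre&gt;
```python
# Hypothetical sample data keyed by employee ID; every value is made up.
standard = {"E01": 4000.0, "E02": 5000.0}                 # standardWages
absence = {"E01": 4.0}                                     # hours absent
performance = {"E01": (0.10, 200.0), "E02": (0.05, 0.0)}   # (Evaluation, Bonus)


def join_sheets(standard, absence, performance):
    """Left-join Absence and Performance onto the Standard sheet by employee ID."""
    rows = []
    for emp_id, wages in standard.items():
        hours_absent = absence.get(emp_id, 0.0)            # missing row defaults to 0
        evaluation, bonus = performance.get(emp_id, (0.0, 0.0))
        rows.append({
            "id": emp_id,
            "standardWages": wages,
            "Absence": hours_absent,
            "Evaluation": evaluation,
            "Bonus": bonus,
        })
    return rows


def salary(row):
    """The article's formula: standardWages*(1+Evaluation-Absence/40)+Bonus."""
    return (row["standardWages"]
            * (1 + row["Evaluation"] - row["Absence"] / 40)
            + row["Bonus"])


joined = join_sheets(standard, absence, performance)
salaries = {row["id"]: salary(row) for row in joined}
```
&lt;/pre&gt;
&lt;p&gt;With the made-up figures above, E01 comes out at about 4200 and E02 at about 5250. The point is simply that the join is written once and the formula is written once, and both then apply to every row.&lt;/p&gt;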
&lt;p&gt;This “do it once” principle also covers the formula that calculates the salary. Enter the formula once in G2 and it is automatically copied to G3, G4, and the other cells that share the same business meaning. We call such cells homocells.&lt;br/&gt;&lt;br/&gt;The Join action relies on homocells to some degree. The advantage of a multi-level grouped table is that data is joined correctly even when it sits at different levels. Likewise, a formula entered in a multi-level grouped table is copied to its homocells. For example, a formula in one summary section is copied to the other summary sections, while the data in the detail sections is not affected. As a result, much of the manual adjustment that ordinary business spreadsheet software requires is automated in esCalc, the &lt;a href="http://www.raqsoft.com/real-olap-tool-for-agile-business-intelligence"&gt;smarter desktop BI tool&lt;/a&gt;.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/E_WxaBKZBcM" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242963</feedburner:origLink><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="enclosure" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~5/RtFlefdYq-A/esCalc+visulization+1.png" length="0" type="image/png" /><feedburner:origEnclosureLink>http://3.bp.blogspot.com/-PLK2djaPWO0/UXYhQC2Oe2I/AAAAAAAAAx8/t23QuXd2P5w/s1600/esCalc+visulization+1.png</feedburner:origEnclosureLink></entry>
                            <entry>
                    <title>The Cost of BI In Perspective</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/E3GUVmLMCf8/2004291:BlogPost:242410" />
                                        <id>tag:www.analyticbridge.com,2013-04-21:2004291:BlogPost:242410</id>
                                        <updated>2013-04-21T13:55:10.000Z</updated>
                    
                                            <author>
                            <name>Galia Nedvedovich</name>
                            <uri>http://www.analyticbridge.com/profile/GaliaNedvedovich</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-4"&gt;&lt;sub&gt;The Cost of Starting Too Big… or Too Small&lt;/sub&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;After talking to business managers, executives and other stakeholders, you’ve determined that this BI solution you’re considering has the potential of serving 100 users. How would you then go about calculating your project costs? This is where things get tricky, and where most BI buyers fail to protect their wallets. Making the wrong decision here is far more significant than any decision…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span class="font-size-4"&gt;&lt;sub&gt;The Cost of Starting Too Big… or Too Small&lt;/sub&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;After talking to business managers, executives and other stakeholders, you’ve determined that this BI solution you’re considering has the potential of serving 100 users. How would you then go about calculating your project costs? This is where things get tricky, and where most BI buyers fail to protect their wallets. Making the wrong decision here is far more significant than any decision you make on software licenses or even hardware.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Even if the development stage of your BI project goes without a hitch, getting a hundred users to use any kind of software, in any company, is a challenge that is not at all easier than any technical challenge you will encounter during the various stages of the project. You could easily find yourself spending tons of money on the development and deployment of a complicated 100 user solution, only to find that only 15 of them are actually using it.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;So instead of your total cost per user being reduced due to the ‘volume-pricing’ model, you actually paid much more – because each one of these 15 users absorbs the cost of the 85 others who find it utterly useless, too difficult to use or completely misaligned with their business objectives. You'd be surprised how often this happens.&lt;/p&gt;
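&lt;p&gt;To put rough numbers on this, here is a small Python sketch. The total cost of 500,000 is purely hypothetical (it is not a figure from any real project), and cost_per_active_user is just an illustrative helper. Only the 100 planned users and the 15 active users come from the example above.&lt;/p&gt;
&lt;pre&gt;
```python
def cost_per_active_user(total_cost, active_users):
    """Spread the whole project cost over the people who actually use the system."""
    if active_users == 0:
        raise ValueError("no active users to carry the cost")
    return total_cost / active_users


total_cost = 500_000.0      # hypothetical project cost, not a figure from the article
planned = cost_per_active_user(total_cost, 100)   # what the volume pricing promised
actual = cost_per_active_user(total_cost, 15)     # what adoption delivered
markup = actual / planned   # how many times more each real user costs
```
&lt;/pre&gt;
&lt;p&gt;Whatever total you plug in, the ratio is the same: each active user carries 100/15, or roughly 6.7 times, the cost you planned for.&lt;/p&gt;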
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;The obvious way of dealing with this common problem is to start off small (10-20 users), and expand as usage of the system grows (assuming it will). But when it comes to traditional business intelligence solutions, there’s a catch - deploying a solution for 10-20 users and deploying a solution for 100 users are utterly different tasks and require significant changes in solution architecture.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Following this path will save you some cost on the software licenses you did not purchase straight off. However, if demand for the solution grows inside the business, you will have to re-design your solution – which would probably end up costing more than it would have initially.&lt;/p&gt;
&lt;p&gt;&lt;b&gt;&lt;sub&gt; &lt;/sub&gt;&lt;/b&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Tip: The correct way of dealing with this challenge is to seek a solution that scales without having to re-architect the solution as usage grows. Buying more software and upgrading hardware when the time comes is relatively easy and inexpensive, while rebuilding the entire solution from scratch every year or two costs way more.&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;Find more at &lt;a href="http://www.sisense.com/prism"&gt;http://www.sisense.com/prism&lt;/a&gt;&lt;/i&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/E3GUVmLMCf8" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242410</feedburner:origLink></entry>
                            <entry>
                    <title>Hadoop+Ubuntu : The Big Fat Wedding.</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/XYBcM5dtf18/2004291:BlogPost:242311" />
                                        <id>tag:www.analyticbridge.com,2013-04-21:2004291:BlogPost:242311</id>
                                        <updated>2013-04-21T02:10:40.000Z</updated>
                    
                                            <author>
                            <name>Mohammad Tariq Iqbal</name>
                            <uri>http://www.analyticbridge.com/profile/MohammadTariqIqbal</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span&gt;Now, here is a treat for all you Hadoop and Ubuntu lovers. Last month, &lt;/span&gt;&lt;a href="http://www.canonical.com/" target="_blank"&gt;&lt;span&gt;Canonical&lt;/span&gt;&lt;/a&gt;&lt;span&gt;, the organization behind the Ubuntu operating system, partnered with &lt;/span&gt;&lt;b&gt;&lt;i&gt;&lt;a href="http://www.mapr.com/" target="_blank"&gt;&lt;span&gt;MapR&lt;/span&gt;&lt;/a&gt;&lt;/i&gt;&lt;/b&gt;&lt;span&gt;, one of the Hadoop heavyweights, in an effort to make Hadoop available as an integrated part of Ubuntu through its repositories. The partnership announced that…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span&gt;Now, here is a treat for all you Hadoop and Ubuntu lovers. Last month, &lt;/span&gt;&lt;a href="http://www.canonical.com/" target="_blank"&gt;&lt;span&gt;Canonical&lt;/span&gt;&lt;/a&gt;&lt;span&gt;, the organization behind the Ubuntu operating system, partnered with &lt;/span&gt;&lt;b&gt;&lt;i&gt;&lt;a href="http://www.mapr.com/" target="_blank"&gt;&lt;span&gt;MapR&lt;/span&gt;&lt;/a&gt;&lt;/i&gt;&lt;/b&gt;&lt;span&gt;, one of the Hadoop heavyweights, in an effort to make Hadoop available as an integrated part of Ubuntu through its repositories. The partnership announced that MapR's M3 Edition for Apache Hadoop will be packaged and made available for download as an integrated part of the Ubuntu operating system. Canonical and MapR are also working to develop a &lt;/span&gt;&lt;b&gt;&lt;i&gt;&lt;a href="https://juju.ubuntu.com/docs/charm-store.html" target="_blank"&gt;&lt;span&gt;Juju Charm&lt;/span&gt;&lt;/a&gt;&lt;/i&gt;&lt;/b&gt;&lt;span&gt; that can be used by OpenStack and other customers to easily deploy MapR into their environments.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span&gt;The free MapR M3 Edition includes HBase, Pig, Hive, Mahout, Cascading, Sqoop, Flume and other Hadoop-related components for unlimited production use. MapR M3 will be bundled with Ubuntu 12.04 LTS and 12.10 via the Ubuntu Partner Archive. MapR also announced that the source code for the component packages of the MapR Distribution for Apache Hadoop is now publicly available on GitHub.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span&gt;MapR is the only distribution that enables Linux applications and commands to access data directly in the cluster via the NFS interface that is available with all MapR Editions. 
The MapR M5 and M7 Editions for Apache Hadoop, which provide enterprise-grade features for HBase and Hadoop such as mirroring, snapshots, NFS HA and data placement control, will also be certified for Ubuntu.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span&gt;Now that Hadoop is integrated natively with Ubuntu, it's a lot easier to install it and go. No more unnecessary downloads and wacky configuration steps. And the best part is the NFS interface available with MapR's distribution, which enables other Linux commands and applications to access the cluster data directly. The Ubuntu/MapR package will be available through the Ubuntu Partner Archive for the 12.04 LTS and 12.10 releases of Ubuntu on the official &lt;/span&gt;&lt;b&gt;&lt;i&gt;&lt;a href="http://www.ubuntu.com/" target="_blank"&gt;&lt;span&gt;website&lt;/span&gt;&lt;/a&gt;&lt;/i&gt;&lt;/b&gt;&lt;span&gt; starting from April 25, 2013.&lt;/span&gt;&lt;br/&gt;&lt;br/&gt;&lt;span&gt;For more info you can get the &lt;/span&gt;&lt;b&gt;Ubuntu and Hadoop: the perfect match&lt;/b&gt;&lt;span&gt; white paper from &lt;/span&gt;&lt;b&gt;&lt;i&gt;&lt;a href="http://www.canonical.com/sites/www.canonical.com/files/active/WP_Hadoop_WEB.pdf" target="_blank"&gt;&lt;span&gt;here&lt;/span&gt;&lt;/a&gt;&lt;/i&gt;&lt;/b&gt;&lt;span&gt;.&lt;/span&gt;&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/XYBcM5dtf18" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242311</feedburner:origLink></entry>
                            <entry>
                    <title>Take-aways from IE's  first Predictive Analytics Summit in the Asia-Pacific</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/LktXjO3seYw/2004291:BlogPost:242502" />
                                        <id>tag:www.analyticbridge.com,2013-04-21:2004291:BlogPost:242502</id>
                                        <updated>2013-04-21T01:00:00.000Z</updated>
                    
                                            <author>
                            <name>Jeffrey Ng</name>
                            <uri>http://www.analyticbridge.com/profile/JeffreyNg</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;Having obtained special budget approval to attend, I naturally took in everything the event offered during two short but fruitful days in Hong Kong, 18 to 19 April 2013. My overall feedback is positive, and I would recommend that companies spend their analytics training budget on sending people here rather than keeping them in a classroom to learn Stat 101. This event is meant for practitioners.&lt;/p&gt;
&lt;p&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;The Analytics market in…&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;Having obtained special budget approval to attend, I naturally took in everything the event offered during two short but fruitful days in Hong Kong, 18 to 19 April 2013. My overall feedback is positive, and I would recommend that companies spend their analytics training budget on sending people here rather than keeping them in a classroom to learn Stat 101. This event is meant for practitioners.&lt;/p&gt;
&lt;p&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;The Analytics market in Asia-Pacific&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;The market is divided into:&lt;/p&gt;
&lt;p&gt;i. Data collectors: those with various fancy technologies or incentives to capture your footprint online, on social networks, or on the touchscreen in the back seat of a cab. They are moving fast in the retail sector, but nothing has been done in the B2B market.&lt;/p&gt;
&lt;p&gt;ii. Data integrators: those who link information from various in-house databases and from social networks. Some media companies are doing all of this but have trouble identifying their customers. The financial industry, by contrast, struggles more with managing its internal data, as there is a sea of high-quality data inside the organization.&lt;/p&gt;
&lt;p&gt;iii. Analysts of big data patterns: algorithms were shown, but the key point is that they focus on capturing two trends: the mass trend and the outliers. In Asia, outliers or emerging trends are a key topic for risk analytics, but not so for the marketing or operations side.&lt;/p&gt;
&lt;p&gt;iv. Implementers of the findings: application areas include risk (fraud, credit risk), B2C marketing (identifying profitable customers with the right resources), finance (forecasting materials for an online shop or pricing correctly for the auto industry), and operational efficiency via workforce planning (quantifying or finding the characteristics of key performers).&lt;/p&gt;
&lt;p&gt;v. Speeding up the overall cycle or spreading out the processes: IT teams are working hard to deliver real-time output and to bring iii. and iv. to as many people as possible.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;What are the participants thinking?&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;From what I observed, there are three types: 1. start-up analytics teams looking for or developing ideas and knowledge; 2. the successful camp, who think they are the best and want to stay that way; 3. the salespeople and consultants who came to pitch their products and services. Having a mix of these people keeps the industry alive and the conversation healthy. Unfortunately, I do not feel that the third group, the products and services of predictive analytics, is doing well in the region. You can call this a gap or an opportunity.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="text-decoration: underline;"&gt;&lt;strong&gt;What am I looking forward to?&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;Predictive analytics is underdeveloped in the B2B market. If Asia-Pacific has more potential for development, and if what GDP said is right, that even in the US over 50% of business revenue is generated in the B2B market, why is there so little development there?&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;I have not been to events in other regions, but I feel it would be interesting to see more exchanges of this kind for professionals.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;Disclaimer: as my sample is biased towards participants who came to the Hong Kong event, those centered around Singapore or Shanghai may not be within my reach.&lt;/p&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/LktXjO3seYw" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242502</feedburner:origLink></entry>
                            <entry>
                    <title>Weekly Digest - April 22</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/iRzQ71LOhVs/2004291:BlogPost:242396" />
                                        <id>tag:www.analyticbridge.com,2013-04-21:2004291:BlogPost:242396</id>
                                        <updated>2013-04-21T00:00:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;b&gt;Sponsored Announcements&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;&lt;a href="http://gopivotal.com/?utm_source=DSC&amp;amp;utm_medium=email&amp;amp;utm_content=42413&amp;amp;utm_campaign=gopivotal"&gt;Invitation to the Pivotal launch&lt;/a&gt; - this Wednesday. A new platform for a new era.…&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;b&gt;Sponsored Announcements&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://gopivotal.com/?utm_source=DSC&amp;amp;utm_medium=email&amp;amp;utm_content=42413&amp;amp;utm_campaign=gopivotal"&gt;Invitation to the Pivotal launch&lt;/a&gt; - this Wednesday. A new platform for a new era.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://info.hortonworks.com/FY13Q2WebinarSeries.html?source=DSC" target="_self"&gt;Enterprise Apache Hadoop: Best Practices and Tools&lt;/a&gt; - Hortonworks Webinar&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Calling all travel, hospitality and tourism technology professionals! &lt;a href="http://events.eyefortravel.com/smart-technology-show/conference-agenda.php?utm_source=datasciencecentral&amp;amp;utm_medium=bannernewslettertextad&amp;amp;utm_campaign=2383dscnewsletterbannertextad"&gt;Join us at The Smart Travel Technology Show&lt;/a&gt; this July 18-19 in Boston to meet 150-200 of the best IT and Tech minds in the space. We’re discussing data, cloud, wifi, payment efficiency, mobile, strategic IT concerns for travel and more. Plus! Meet leading travel brands such as Travelocity, Expedia, Boston Logan Airport, Los Angeles World Airports, Target Vacations, Visit California, Atlas Travel, Roomkey, Louvre Hotels and more.&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.predictiveanalyticsworld.com/chicago/2013/?src=DSC&amp;amp;media=WeeklyDigest&amp;amp;date=041913"&gt;Predictive Analytics World Chicago&lt;/a&gt; - June 2013. Hear how top practitioners deploy predictive modeling, and what kind of business impact it delivers. Will feature over 30 sessions with case studies across 2 tracks: 1) All Audiences and 2) Expert/Practitioner.&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;b&gt;Staff Postings&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/selected-articles-from-top-news-outlets"&gt;33 selected articles from top news outlets - April 20&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/new-fast-excel-to-process-billions-of-rows-via-the-cloud"&gt;New, fast Excel to process billions of rows via the cloud&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-amateur-data-scientist-and-her-projects"&gt;The amateur data scientist and her projects&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/books/forum/topics/two-new-books-on-data-analysis-with-open-source"&gt;Two new books on data analysis with open source&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/forum/topics/predicting-urban-growth-using-physics-models"&gt;Predicting urban growth using physics models&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/bit-ly-for-competitive-intelligence"&gt;Bit.ly for competitive intelligence&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;b&gt;Selected Postings from Members&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/should-data-science-become-a-profession"&gt;Should Data Science Become a Profession?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/education-business-and-tech"&gt;Should business and engineering schools develop joint programs?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/is-open-source-really-free"&gt;Is Open Source Really Free?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/how-do-you-solve-your-pre-etl-source-to-target-mapping-problems"&gt;How do you solve your "Pre-Etl" Source to target mapping problems?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/unleashing-intelligence-through-natural-language-part-2"&gt;Unleashing Intelligence through natural language&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/text-analytics-software-events-and-training"&gt;Text Analytics Software, Events and Training&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/the-hidden-biases-in-big-data?xg_source=activity"&gt;The Hidden Biases in Big Data&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;b&gt;Forum Questions&lt;/b&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/handling-imbalanced-data-when-building-regression-models"&gt;Handling Imbalanced data when building regression models&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/cltv-model-using-spss-modeler"&gt;customer lifetime value model using SPSS Modeler?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/has-link-analysis-rendered-association-rules-redundant"&gt;Has link analysis rendered association rules redundant?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;center&gt;&lt;br/&gt; &lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/weekly-digest-april-15"&gt;Previous digest&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/group/analyticjobs/forum/topics/career-alert-april-10"&gt;Recent jobs&lt;/a&gt; | &lt;a href="http://www.analyticbridge.com/page/links"&gt;Top Links&lt;/a&gt;&lt;/span&gt;&lt;div class="xg_module_body xg_user_generated"&gt;&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.linkedin.com/groups/Advanced-Business-Analytics-Data-Mining-35222"&gt;&lt;img src="http://datashaping.com/favico_linkedin.png" border="0"/&gt;&lt;/a&gt; &lt;a href="https://www.facebook.com/pages/AnalyticBridge/80509530789"&gt;&lt;img src="http://datashaping.com/favico_facebook.png"/&gt;&lt;/a&gt; &lt;a href="https://twitter.com/analyticbridge"&gt;&lt;img src="http://datashaping.com/favico_twitter.png" border="0"/&gt;&lt;/a&gt; &lt;a href="https://plus.google.com/u/0/communities/107156514183161811383/"&gt;&lt;img src="http://datashaping.com/favico_google.png" border="0"/&gt;&lt;/a&gt; &lt;a href="http://www.quora.com/Vincent-Granville"&gt;&lt;img src="http://datashaping.com/favico_quora.png" border="0"/&gt;&lt;/a&gt; &lt;a href="http://www.datasciencecentral.com/page/news-feeds"&gt;&lt;img src="http://datashaping.com/rss-favicon.png" border="0"/&gt;&lt;/a&gt;&lt;/span&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;/center&gt;&lt;img src="http://feeds.feedburner.com/~r/FeaturedBlogPosts-Analyticbridge/~4/iRzQ71LOhVs" height="1" width="1"/&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242396</feedburner:origLink></entry>
                            <entry>
                    <title>Should business and engineering schools develop joint programs?</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/FcBbdfCFPwQ/2004291:BlogPost:242244" />
                                        <id>tag:www.analyticbridge.com,2013-04-19:2004291:BlogPost:242244</id>
                                        <updated>2013-04-19T16:30:00.000Z</updated>
                    
                                            <author>
                            <name>Srinivasan Krishnamurthy</name>
                            <uri>http://www.analyticbridge.com/profile/SrinivasanKrishnamurthy</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;Businesses are increasingly using data-driven methods to make business decisions. Hence, there is a need for people with &lt;span style="text-decoration: underline;"&gt;both&lt;/span&gt; good business skills and programming/quant skills. Finance/Accounting PhDs and other business PhDs do have such skills, but they are few in number and costly to hire, and the majority prefer academia anyway. This limits businesses mainly to hiring bachelor's- or master's-level candidates.&lt;/p&gt;
&lt;p&gt;However, a majority of the…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;Businesses are increasingly using data-driven methods to make business decisions. Hence, there is a need for people with &lt;span style="text-decoration: underline;"&gt;both&lt;/span&gt; good business skills and programming/quant skills. Finance/Accounting PhDs and other business PhDs do have such skills, but they are few in number and costly to hire, and the majority prefer academia anyway. This limits businesses mainly to hiring bachelor's- or master's-level candidates.&lt;/p&gt;
&lt;p&gt;However, a majority of the business students (MBAs) do not have sufficient quantitative and programming skills to be able to adequately do the quant part of the job. A single course teaching introductory statistics using basic Excel for analysis is nowhere near adequate, as far as quant skills go. It is difficult for a quant neophyte to do the quant analysis, understand the results from the analysis and use it effectively to make business decisions.&lt;/p&gt;
&lt;p&gt;On the other hand, most folks with programming experience do not have the expertise to adequately do the business part of the job. While they may do a great job of programming, they often find it difficult to understand the business side of things: which business decision is being analyzed, which analysis would be appropriate to address it, and how to leverage the results of the quant analysis to make good business decisions.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;To address this gap, should business and engineering schools come together and develop a joint program, where the graduates have both sets of skills? Some schools may already be doing this.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242244</feedburner:origLink></entry>
                            <entry>
                    <title>The amateur data scientist and her projects</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/-_KO-5Vt8Mg/2004291:BlogPost:242023" />
                                        <id>tag:www.analyticbridge.com,2013-04-17:2004291:BlogPost:242023</id>
                                        <updated>2013-04-17T23:00:00.000Z</updated>
                    
                                            <author>
                            <name>Vincent Granville</name>
                            <uri>http://www.analyticbridge.com/profile/VincentGranville</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;With so much data available for free everywhere, and so many open tools, I would expect to see the emergence of a new kind of analytics practitioner: the amateur data scientist.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span class="font-size-2" style="font-family: arial, helvetica, sans-serif;"&gt;Just like the amateur astronomer, the amateur data scientist will significantly contribute to the art and science, and will eventually solve…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;With so much data available for free everywhere, and so many open tools, I would expect to see the emergence of a new kind of analytics practitioner: the amateur data scientist.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Just like the amateur astronomer, the amateur data scientist will significantly contribute to the art and science, and will eventually solve mysteries. Could the Boston bomber be found thanks to thousands of amateurs analyzing publicly available data (images, videos, tweets etc.) with open source tools? After all, amateur astronomers have been able to detect exoplanets and much more.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Also, just as the amateur astronomer only needs one expensive tool (a good telescope with data recording capabilities), the amateur data scientist only needs one expensive tool (a good laptop and possibly a subscription to some cloud storage/computing services).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Amateur data scientists might earn money from winning Kaggle contests, or from working on problems such as identifying a botnet, explaining the stock market flash crash, defeating Google page ranking algorithms (contact me regarding this upcoming &lt;a href="http://www.analyticbridge.com/profiles/blogs/google-search-three-bugs-to-fix-with-better-data-science" target="_blank"&gt;paid project&lt;/a&gt;), helping find new complex molecules to fight cancer (analytical chemistry), or predicting solar flares and their intensity. Interested in becoming an amateur data scientist? Here's a first project for you, to get started:&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;First project: do large meteors cause multiple small craters or a big one?&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;If meteors usually break up into multiple fragments, or approach the solar system already broken down into several pieces, they might be less dangerous than if they hit with a single, huge punch. That's the idea, although I'm not sure if this assumption is correct. Even if the opposite is true, it is still worth asking the question about the frequency of binary impact craters.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;img src="http://cdn.zmescience.com/wp-content/uploads/2013/02/binary-asteroid.jpg" class="align-center"/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: center;"&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;About to hit Earth&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Eventually, knowing whether meteorites arrive in pieces rather than intact could change government policies and priorities, and maybe stop the spending on projects to detect and blow up meteors (or the other way around).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;So how would you go about estimating the chance that a large meteor (hitting Earth) creates multiple small impacts? And how many impacts on average: 2, or 3? One idea is to look at Moon craters and check how many of them are aligned. Yet what causes meteors to explode before hitting (and thus create multiple craters) is Earth's thick atmosphere. Thus the Moon would not provide good data. But Earth's crust is so geologically active that all crater traces disappear after a few million years. Maybe Venus would be a good source of data? Nope, even worse than Earth. Maybe Mars? Nope, just like the Moon. Maybe some moons of Jupiter or Saturn would be great candidates.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;img src="http://ramblingsage.com/wp-content/uploads/Clear-Water-Lakes-Binary-Impact-Craters-297x300.jpg" class="align-center"/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p style="text-align: center;"&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;em&gt;Double impact, seen millions of years later&lt;/em&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Once a data source is identified and the questions answered, deeper questions can be asked, such as: when we see a binary crater (two craters, same meteor), what is the average distance between the two craters? This will also help better assess population risks, and how many billions of dollars NASA should spend on meteor tracking programs.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;In any case, as a starter, I did a bit of research and found the &lt;a href="http://www.stecf.org/~ralbrech/amico/intabs/koeberlc.html" target="_blank"&gt;following data&lt;/a&gt;, with a map showing impact craters on Earth.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;img src="http://www.stecf.org/~ralbrech/amico/intabs/koeberlc.jpeg" class="align-center"/&gt;&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;Visually, with the naked eye, it looks like multiple impacts (e.g. binary craters) and crater alignment are the norm, not the exception. But &lt;a href="http://www.analyticbridge.com/profiles/blogs/how-to-detect-a-pattern-problem-and-solution" target="_blank"&gt;the brain can be very lousy at detecting probabilities&lt;/a&gt;. So a statistical analysis is needed. Note that the first step consists of processing the image to detect craters and extract coordinates, using some software or writing your own code. But this is still something a good amateur data scientist could do; I'm sure you can find the right tools on the Internet.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;A better idea might be to use &lt;a href="http://www.analyticbridge.com/profiles/blogs/great-statistical-analysis-forecasting-meteorite-hits" target="_blank"&gt;some public data I published a while back&lt;/a&gt;: it is more comprehensive, already in Excel format, and also has dates attached to craters, so that binary craters created in the same year are likely to be true twins (that is, coming from a single piece of space rock that broke into two fragments sometime in the past, before hitting Earth).&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.google.com/search?source=ig&amp;amp;rlz=&amp;amp;q=binary+craters+on+earth" target="_blank"&gt;Google 'binary craters on Earth'&lt;/a&gt; for additional info.&lt;/span&gt;&lt;/p&gt;
&lt;p&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;strong&gt;Related articles&lt;/strong&gt;&lt;/span&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/sample-proposal-for-a-fraud-detection-project" target="_blank" style="font-size: 10pt;"&gt;Example of proposal for a data science project&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/webanalytics/forum/topics/web-analytics-example-of" target="_blank" style="font-size: 10pt;"&gt;Another example of proposal&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/registered-meteorites-that-has-impacted-on-earth-visualized" target="_blank" style="font-size: 10pt;"&gt;Map of registered meteorites&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/debunking-the-story-about-the-russian-meteor-event" target="_blank" style="font-size: 10pt;"&gt;Debunking the story about the Russian meteor event&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/forum/topics/can-we-use-data-science-to-measure-distances-to-stars" target="_blank" style="font-size: 10pt;"&gt;Can we use data science to measure distances to stars?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/questions-about-astronomy-and-clustering" target="_blank" style="font-size: 10pt;"&gt;Questions about astronomy and clustering&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.bigdatanews.com/profiles/blogs/fast-clustering-algorithms-for-massive-datasets" target="_blank" style="font-size: 10pt;"&gt;Fast clustering algorithms for massive datasets&lt;/a&gt;&lt;span style="font-size: 10pt;"&gt; (with links to large datasets)&lt;/span&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/page/links" target="_blank" style="font-size: 10pt;"&gt;Additional resources&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/how-to-better-compete-with-other-data-scientists" target="_blank" style="font-size: 10pt;"&gt;How to better compete with other data scientists&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/vertical-vs-horizontal-data-scientists" target="_blank" style="font-size: 10pt;"&gt;Horizontal vs. Vertical Data Scientists&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/66-job-interview-questions-for-data-scientists" target="_blank" style="font-size: 10pt;"&gt;66 job interview questions for data scientists&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/are-data-scientists-overpaid" target="_blank" style="font-size: 10pt;"&gt;Are data scientists overpaid?&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.datasciencecentral.com/profiles/blogs/the-face-of-the-new-university" target="_blank" style="font-size: 10pt;"&gt;The Face of the New University&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/profiles/blogs/fake-data-science" target="_blank" style="font-size: 10pt;"&gt;Fake data science&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;li&gt;&lt;span style="font-family: arial, helvetica, sans-serif;" class="font-size-2"&gt;&lt;a href="http://www.analyticbridge.com/group/analyticscourses/forum/topics/free-courses-from-top-universities-coursera-com" style="font-size: 10pt;"&gt;Free courses from top universities&lt;/a&gt;&lt;/span&gt;&lt;/li&gt;
&lt;/ul&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242023</feedburner:origLink></entry>
                            <entry>
                    <title>Is Open Source Really Free?</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/kzI4F-V_z94/2004291:BlogPost:242100" />
                                        <id>tag:www.analyticbridge.com,2013-04-17:2004291:BlogPost:242100</id>
                                        <updated>2013-04-17T08:00:00.000Z</updated>
                    
                                            <author>
                            <name>Galia Nedvedovich</name>
                            <uri>http://www.analyticbridge.com/profile/GaliaNedvedovich</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;Open Source and the Cost of Software Licenses&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;People often think that the answer to this question lies in software costs, but in fact software costs are usually a red herring in the process of business intelligence costing.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;It is obvious that the more users your solution has, the more software licenses are going to cost. Therefore, you might be tempted to choose a vendor that sells software for 30% less than another vendor – but basing a…&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;Open Source and the Cost of Software Licenses&lt;/p&gt;
&lt;p&gt; &lt;/p&gt;
&lt;p&gt;People often think that the answer to this question lies in software costs, but in fact software costs are usually a red herring in the process of business intelligence costing.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;It is obvious that the more users your solution has, the more software licenses are going to cost. Therefore, you might be tempted to choose a vendor that sells software for 30% less than another vendor – but basing a decision solely on this is a big mistake, as license costs have little bearing on the total cost of a BI solution, and hardly any impact on ROI.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Some evidence for this can be found in open source. Open source BI provides (by definition) free software, and there is no shortage of open source BI tools/platforms. However, none of them are doing as well as the established non-open source vendors, even though they have been around since the beginning of the century. They’re having trouble acquiring customers, at least compared to commercial vendors. It is reasonable to assume that if software costs were significant inhibitors in the BI space, open source solutions would be much more prominent than they actually are.&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;
&lt;p&gt;Another hint at this can be found in the ‘commercial’ (non-open source) world, where BI vendors do charge for licenses but will usually provide significant discounts on large-volume license purchases. BI vendors do it for reasons that go beyond the obvious attempt to motivate potential buyers to expand their purchase orders. They do it because they realize the total cost of the solution – to the customer – grows significantly as the number of users grows, regardless of license costs (preparation projects, IT personnel assignment, etc.). They need to take this into account when they price their software.&lt;/p&gt;
&lt;p&gt;&lt;/p&gt;
&lt;p&gt;&lt;i&gt;For more info, visit &lt;a href="http://www.sisense.com/product"&gt;http://www.sisense.com/product&lt;/a&gt;&lt;/i&gt;&lt;/p&gt;
&lt;p&gt;&lt;b&gt; &lt;/b&gt;&lt;/p&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242100</feedburner:origLink></entry>
                            <entry>
                    <title>How do you solve your "pre-ETL" source-to-target mapping problems?</title>
                    <link rel="alternate" href="http://feedproxy.google.com/~r/FeaturedBlogPosts-Analyticbridge/~3/FvDW5c4FOM4/2004291:BlogPost:242060" />
                                        <id>tag:www.analyticbridge.com,2013-04-15:2004291:BlogPost:242060</id>
                                        <updated>2013-04-15T20:30:00.000Z</updated>
                    
                                            <author>
                            <name>Mohammad Azad</name>
                            <uri>http://www.analyticbridge.com/profile/MohammadAzad</uri>
                        </author>
                    
                    <summary type="html">
                        &lt;p&gt;&lt;span&gt;It's one of the integration problems that most of the big players in the industry have pretty much left untouched. Anyone working in the data integration / data warehousing industry understands that when you build a data warehouse, you have to create these complex pre-ETL source mappings before the ETL developers start work. The way most organizations do this is with spreadsheets. Every organization has an exorbitant number of spreadsheets that they use to document this stuff. Once…&lt;/span&gt;&lt;/p&gt;                    </summary>

                    <content type="html">
&lt;p&gt;&lt;span&gt;It's one of the integration problems that most of the big players in the industry have pretty much left untouched. Anyone working in the data integration / data warehousing industry understands that when you build a data warehouse, you have to create these complex pre-ETL source mappings before the ETL developers start work. The way most organizations do this is with spreadsheets. Every organization has an exorbitant number of spreadsheets that they use to document this stuff. Once they've handed them off to the ETL developers, the spreadsheets are never maintained. &lt;/span&gt;&lt;br/&gt; &lt;br/&gt; &lt;span&gt;Spreadsheets are inherently non-authoritative: source mappings change as you design and test your ETL jobs. A spreadsheet that once functioned as the single or authoritative catalog of all source mappings might not get updated -- or (just as likely) might get updated with incorrect or incomplete information -- as the ETL design process evolves.&lt;/span&gt;&lt;/p&gt;</content>
<category term="United States" />

                                    <feedburner:origLink>http://www.analyticbridge.com/xn/detail/2004291:BlogPost:242060</feedburner:origLink></entry>
                    </feed>
