<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Quantitative thoughts</title>
	
	<link>http://www.investuotojas.eu</link>
	<description>Quantitative investment strategies</description>
	<lastBuildDate>Thu, 16 Feb 2012 13:54:53 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/investuotojas" /><feedburner:info uri="investuotojas" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>VilniusR – R users group in Lithuania</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/w_79gBsMOjI/</link>
		<comments>http://www.investuotojas.eu/2012/02/16/vilniusr-r-users-group-in-lithuania/#comments</comments>
		<pubDate>Thu, 16 Feb 2012 13:54:53 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[Lithuania]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=747</guid>
		<description><![CDATA[Today is Lithuania&#8217;s independence day, and I have created an R user group in Lithuania &#8211; VilniusR. If you are nearby, please follow the link and sign up; I hope that we will have a meeting soon.]]></description>
			<content:encoded><![CDATA[<p>Today is <a href="http://en.wikipedia.org/wiki/Lithuania" target="_blank">Lithuania&#8217;s</a> independence day, and I have created an R user group in Lithuania &#8211; <a href="http://www.vilniusr.org" target="_blank">VilniusR</a>. If you are nearby, please follow <a href="http://www.vilniusr.org" target="_blank">the link</a> and sign up; I hope that we will have <a href="http://www.vilniusr.org/2012/02/16/vilniusr/" target="_blank">a meeting soon</a>.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2012/02/16/vilniusr-r-users-group-in-lithuania/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2012/02/16/vilniusr-r-users-group-in-lithuania/</feedburner:origLink></item>
		<item>
		<title>Vectorized R vs Rcpp</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/h9hMxfqWSIc/</link>
		<comments>http://www.investuotojas.eu/2012/02/01/vectorized-r-vs-rcpp/#comments</comments>
		<pubDate>Wed, 01 Feb 2012 20:03:09 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[c++]]></category>
		<category><![CDATA[quantitative]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[rcpp]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=730</guid>
		<description><![CDATA[In my previous post, I tried to show that Rcpp is 1000 times faster than pure R, and that caused a fuss in the comments. Being lazy, I didn&#8217;t vectorize the R code, so in the end I was comparing apples to oranges. To fix that problem, I built a new script, where I&#8217;m trying to compare [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.investuotojas.eu/2012/01/30/the-power-of-rcpp/"  target="_blank">In my previous post</a>, I tried to show that Rcpp is 1000 times faster than pure R, and that caused a fuss in the comments. Being lazy, I didn&#8217;t vectorize the R code, so in the end I was comparing apples to oranges.</p>
<p>To fix that problem, I built a new script that tries to compare apples with apples. The first piece of code, named &#8220;ifelse R&#8221;, uses R&#8217;s &#8220;ifelse&#8221; function to vectorize the code. The second is fully vectorized code written in R, the third is pure C++ code, and the last one is C++ that uses the Rcpp &#8220;ifelse&#8221; function.</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=performance.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/performance.png" alt="Photobucket" border="0" /></a></p>
<p>&nbsp;</p>
<table border="0">
<tbody>
<tr>
<th>name</th>
<th>seconds</th>
</tr>
<tr>
<td align="right">ifelse R</td>
<td align="right">27.50</td>
</tr>
<tr>
<td align="right">vectorized R</td>
<td align="right">10.40</td>
</tr>
<tr>
<td align="right">pure C++</td>
<td align="right">0.44</td>
</tr>
<tr>
<td align="right">vectorized C++</td>
<td align="right">2.24</td>
</tr>
</tbody>
</table>
<p>Here we go &#8211; vectorization truly helps, but pure C++ code is still about 23 times faster than vectorized R (10.40 vs 0.44 seconds). Of course, you pay the price when writing it in C++.<br />
I found it a bit strange that the vectorized C++ code doesn&#8217;t perform that well&#8230;</p>
<p>You can get the code from <a href="https://github.com/kafka399/Rproject/blob/master/performance/performance.R"  target="_blank">github</a> or review it below:</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p730code2'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p7302"><td class="code" id="p730code2"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;">#Author Dzidorius Martinaitis</span>
<span style="color: #228B22;">#Date 2012-02-01</span>
<span style="color: #228B22;">#Description http://www.investuotojas.eu/2012/02/01/vectorized-r-vs-rcpp</span>
&nbsp;
bid <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">50000000</span>,<span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">9</span><span style="color: #080;">&#41;</span>
ask <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">50000000</span>,<span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">9</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/close.html"><span style="color: #0000FF; font-weight: bold;">close</span></a> <span style="color: #080;">=</span> <span style="color: #0000FF; font-weight: bold;">runif</span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">50000000</span>,<span style="color: #ff0000;">5</span>,<span style="color: #ff0000;">9</span><span style="color: #080;">&#41;</span>
&nbsp;
x<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/data.frame.html"><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span></a><span style="color: #080;">&#40;</span>bid<span style="color: #080;">=</span>bid,ask<span style="color: #080;">=</span>ask,last_price<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/close.html"><span style="color: #0000FF; font-weight: bold;">close</span></a><span style="color: #080;">&#41;</span>
rez<span style="color: #080;">=</span><span style="color: #ff0000;">0</span>
&nbsp;
<span style="color: #228B22;">###########    ifelse R  #################</span>
answ<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.vector.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">vector</span></span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/system.time.html"><span style="color: #0000FF; font-weight: bold;">system.<span style="">time</span></span></a><span style="color: #080;">&#40;</span>
<span style="color: #080;">&#123;</span>
rez <span style="color: #080;">=</span> <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/ifelse.html"><span style="color: #0000FF; font-weight: bold;">ifelse</span></a><span style="color: #080;">&#40;</span>x$last_price<span style="color: #080;">&gt;</span><span style="color: #ff0000;">0</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/ifelse.html"><span style="color: #0000FF; font-weight: bold;">ifelse</span></a><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;bid&quot;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&gt;</span> x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;last_price&quot;</span><span style="color: #080;">&#93;</span>, x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;bid&quot;</span><span style="color: #080;">&#93;</span>, <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/ifelse.html"><span style="color: #0000FF; font-weight: bold;">ifelse</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;ask&quot;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span> <span style="color: #080;">&amp;</span> <span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;ask&quot;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">&lt;</span> x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;last_price&quot;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>, x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;ask&quot;</span><span style="color: #080;">&#93;</span>, x<span 
style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;last_price&quot;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>, <span style="color: #ff0000;">0.5</span><span style="color: #080;">*</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>, <span style="color: #ff0000;">&quot;ask&quot;</span><span style="color: #080;">&#93;</span> <span style="color: #080;">+</span> x<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">&quot;bid&quot;</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;">###########   end ifelse R  #################</span>
&nbsp;
<span style="color: #228B22;">###########    vectorized R  #################</span>
&nbsp;
answ<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/append.html"><span style="color: #0000FF; font-weight: bold;">append</span></a><span style="color: #080;">&#40;</span>answ,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/system.time.html"><span style="color: #0000FF; font-weight: bold;">system.<span style="">time</span></span></a><span style="color: #080;">&#40;</span>
<span style="color: #080;">&#123;</span>
lgt0 <span style="color: #080;">=</span> x$last_price <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">0</span>
bgtl <span style="color: #080;">=</span> x$bid <span style="color: #080;">&gt;</span> x$last_price
agt0 <span style="color: #080;">=</span> x$ask <span style="color: #080;">&gt;</span> <span style="color: #ff0000;">0</span>
altl <span style="color: #080;">=</span> x$ask <span style="color: #080;">&lt;</span> x$last_price
rez <span style="color: #080;">=</span> x$last_price
rez<span style="color: #080;">&#91;</span>lgt0 <span style="color: #080;">&amp;</span> agt0 <span style="color: #080;">&amp;</span> altl<span style="color: #080;">&#93;</span> <span style="color: #080;">=</span> x$ask<span style="color: #080;">&#91;</span>lgt0 <span style="color: #080;">&amp;</span> agt0 <span style="color: #080;">&amp;</span> altl<span style="color: #080;">&#93;</span>
rez<span style="color: #080;">&#91;</span>lgt0 <span style="color: #080;">&amp;</span> bgtl<span style="color: #080;">&#93;</span> <span style="color: #080;">=</span> x$bid<span style="color: #080;">&#91;</span>lgt0 <span style="color: #080;">&amp;</span> bgtl<span style="color: #080;">&#93;</span>
rez<span style="color: #080;">&#91;</span><span style="color: #080;">!</span>lgt0<span style="color: #080;">&#93;</span> <span style="color: #080;">=</span> <span style="color: #080;">&#40;</span>x$ask<span style="color: #080;">&#91;</span><span style="color: #080;">!</span>lgt0<span style="color: #080;">&#93;</span><span style="color: #080;">+</span>x$bid<span style="color: #080;">&#91;</span><span style="color: #080;">!</span>lgt0<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span>
<span style="color: #080;">&#125;</span>
<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
<span style="color: #228B22;">###########   end vectorized R  #################</span>
&nbsp;
<span style="color: #228B22;">#C++ code starts here</span>
&nbsp;
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/library.html"><span style="color: #0000FF; font-weight: bold;">library</span></a><span style="color: #080;">&#40;</span>inline<span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/library.html"><span style="color: #0000FF; font-weight: bold;">library</span></a><span style="color: #080;">&#40;</span>Rcpp<span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">###########    pure C++  #################</span>
&nbsp;
code<span style="color: #080;">=</span><span style="color: #ff0000;">'
NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);
int bid_size = bid.size();
NumericVector ret(bid_size);
for(int i =0;i&lt;bid_size;i++)
{
  if(close[i]&gt;0)
  {
    if(bid[i]&gt;close[i])
    {
      ret[i] = bid[i]; 
    }
    else if(ask[i]&gt;0 &amp;&amp; ask[i]&lt;close[i])
    {
      ret[i] = ask[i];//
    }
    else
    {
      ret[i] = close[i];//
    }
  }
  else
  {
    ret[i]=(bid[i]+ask[i])/2;
  }
&nbsp;
}
return ret;
'</span>
getLastPrice <span style="color: #080;">&lt;-</span> cxxfunction<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/base/html/Log.html"><span style="color: #0000FF; font-weight: bold;">signature</span></a><span style="color: #080;">&#40;</span> bid_ <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;numeric&quot;</span>,ask_ <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;numeric&quot;</span>,close_<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;numeric&quot;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/body.html"><span style="color: #0000FF; font-weight: bold;">body</span></a><span style="color: #080;">=</span>code,plugin<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Rcpp&quot;</span><span style="color: #080;">&#41;</span>
rez<span style="color: #080;">=</span><span style="color: #ff0000;">0</span>
answ<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/append.html"><span style="color: #0000FF; font-weight: bold;">append</span></a><span style="color: #080;">&#40;</span>answ,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/system.time.html"><span style="color: #0000FF; font-weight: bold;">system.<span style="">time</span></span></a><span style="color: #080;">&#40;</span>
  <span style="color: #080;">&#123;</span>
    rez<span style="color: #080;">=</span>getLastPrice<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$bid<span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$ask<span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$last_price<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
  <span style="color: #080;">&#125;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">###########   end pure C++  #################</span>
&nbsp;
<span style="color: #228B22;">#summary(rez)</span>
&nbsp;
&nbsp;
<span style="color: #228B22;">###########    vectorized C++  #################</span>
code<span style="color: #080;">=</span><span style="color: #ff0000;">'
NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);
int bid_size = bid.size();
NumericVector ret=ifelse(close&gt;0,ifelse(bid &gt;close, bid, ifelse(ask &gt; 0,ifelse(ask &lt; close,ask, close),close)), 0.5*(ask + bid));
return ret;
'</span>
getLastPrice <span style="color: #080;">&lt;-</span> cxxfunction<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/base/html/Log.html"><span style="color: #0000FF; font-weight: bold;">signature</span></a><span style="color: #080;">&#40;</span> bid_ <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;numeric&quot;</span>,ask_ <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;numeric&quot;</span>,close_<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;numeric&quot;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/body.html"><span style="color: #0000FF; font-weight: bold;">body</span></a><span style="color: #080;">=</span>code,plugin<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Rcpp&quot;</span><span style="color: #080;">&#41;</span>
rez<span style="color: #080;">=</span><span style="color: #ff0000;">0</span>
answ<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/append.html"><span style="color: #0000FF; font-weight: bold;">append</span></a><span style="color: #080;">&#40;</span>answ,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/system.time.html"><span style="color: #0000FF; font-weight: bold;">system.<span style="">time</span></span></a><span style="color: #080;">&#40;</span>
<span style="color: #080;">&#123;</span>
  rez<span style="color: #080;">=</span>getLastPrice<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$bid<span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$ask<span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$last_price<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">###########   end vectorized C++  #################</span>
&nbsp;
<span style="color: #228B22;">#summary(rez)</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/names.html"><span style="color: #0000FF; font-weight: bold;">names</span></a><span style="color: #080;">&#40;</span>answ<span style="color: #080;">&#41;</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'ifelse R'</span>,<span style="color: #ff0000;">'vectorized R'</span>,<span style="color: #ff0000;">'pure C++'</span>,<span style="color: #ff0000;">'vectorized C++'</span><span style="color: #080;">&#41;</span>
&nbsp;
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/library.html"><span style="color: #0000FF; font-weight: bold;">library</span></a><span style="color: #080;">&#40;</span>ggplot2<span style="color: #080;">&#41;</span>
a<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/data.frame.html"><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span></a><span style="color: #080;">&#40;</span>ind<span style="color: #080;">=</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">4</span>,val<span style="color: #080;">=</span>answ<span style="color: #080;">&#41;</span>
ggplot<span style="color: #080;">&#40;</span>a,aes<span style="color: #080;">&#40;</span>ind,val<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>geom_point<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/legend.html"><span style="color: #0000FF; font-weight: bold;">legend</span></a><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/F.html"><span style="color: #0000FF; font-weight: bold;">F</span></a><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>geom_text<span style="color: #080;">&#40;</span>aes<span style="color: #080;">&#40;</span>label<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/names.html"><span style="color: #0000FF; font-weight: bold;">names</span></a><span style="color: #080;">&#40;</span>answ<span style="color: #080;">&#41;</span>,hjust<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #080;">-</span><span style="color: #ff0000;">0.2</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">0.2</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">0.2</span>,<span style="color: #ff0000;">0.8</span><span style="color: #080;">&#41;</span>,vjust<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">0</span>,<span style="color: #ff0000;">0</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: 
#080;">&#41;</span>,size<span style="color: #080;">=</span><span style="color: #ff0000;">4</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>
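<p>All four variants above are meant to apply the same rule, so one way to sanity-check any of them is to compare it against a tiny reference on a few hand-picked quotes. Below is a minimal standalone sketch in plain C++ (no Rcpp). The names pickPrice and pickPrices are made up for illustration and are not part of the script above.</p>
<pre class="cpp" style="font-family:monospace;">#include &lt;cstddef&gt;
#include &lt;vector&gt;

// Hypothetical helper (not from the original script): applies the rule to one quote.
double pickPrice(double bid, double ask, double last) {
    if (last &gt; 0) {
        if (bid &gt; last) return bid;                  // bid is above the last price
        if (ask &gt; 0 &amp;&amp; ask &lt; last) return ask;   // ask is below the last price
        return last;                                 // last price sits inside the spread
    }
    return 0.5 * (bid + ask);                        // no last price: use the mid price
}

// Applies pickPrice element-wise; assumes the three vectors have the same length.
std::vector&lt;double&gt; pickPrices(const std::vector&lt;double&gt;&amp; bid,
                               const std::vector&lt;double&gt;&amp; ask,
                               const std::vector&lt;double&gt;&amp; last) {
    std::vector&lt;double&gt; out(bid.size());
    for (std::size_t i = 0; i &lt; bid.size(); ++i) {
        out[i] = pickPrice(bid[i], ask[i], last[i]);
    }
    return out;
}</pre>
<p>For example, pickPrice(8, 9, 7) gives 8 (bid above the last price), pickPrice(6, 6.5, 7) gives 6.5 (ask below the last price), pickPrice(6, 8, 7) gives 7, and pickPrice(6, 8, 0) gives the mid price 7.</p>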


]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2012/02/01/vectorized-r-vs-rcpp/feed/</wfw:commentRss>
		<slash:comments>9</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2012/02/01/vectorized-r-vs-rcpp/</feedburner:origLink></item>
		<item>
		<title>The power of Rcpp</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/fzEt94ihsRs/</link>
		<comments>http://www.investuotojas.eu/2012/01/30/the-power-of-rcpp/#comments</comments>
		<pubDate>Mon, 30 Jan 2012 21:51:38 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[quantitative]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[rcpp]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=703</guid>
		<description><![CDATA[A while ago I built two R scripts to track the OMX Baltic Benchmark Fund against the index. One script returns the deviation of the fund from the index, and it works fast enough. The second calculates the value of the fund every minute, and it used to take a while. For example, it spent 2 minutes or [...]]]></description>
			<content:encoded><![CDATA[<p>A while ago I built two R scripts to track the <a href="http://markets.ft.com/research/Markets/Tearsheets/Financials?s=OAMOBBF1L:VLX" target="_blank">OMX Baltic Benchmark Fund</a> against the index. One script returns the deviation of the fund from the index, and it works fast enough. The second calculates the value of the fund every minute, and it used to take a while. For example, it spent 2 minutes or more to get the values for one day. Here is an example of the result:</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=ind.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/ind.png" alt="Photobucket" border="0" /></a></p>
<p>The following piece of code was the culprit:</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p703code5'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p7035"><td class="code" id="p703code5"><pre class="rsplus" style="font-family:monospace;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/for.html"><span style="color: #0000FF; font-weight: bold;">for</span></a><span style="color: #080;">&#40;</span>y <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/NROW.html"><span style="color: #0000FF; font-weight: bold;">NROW</span></a><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
 <span style="color: #080;">&#123;</span>
    z<span style="color: #080;">=</span>x<span style="color: #080;">&#91;</span>y,<span style="color: #080;">&#93;</span>
    <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/if.html"><span style="color: #0000FF; font-weight: bold;">if</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>z$last_price<span style="color: #080;">&gt;</span><span style="color: #ff0000;">0</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
    <span style="color: #080;">&#123;</span>
      <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/if.html"><span style="color: #0000FF; font-weight: bold;">if</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>z$bid<span style="color: #080;">&gt;</span>z$last_price<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>rez<span style="color: #080;">&#91;</span>y<span style="color: #080;">&#93;</span><span style="color: #080;">=</span>z$bid
      <span style="color: #0000FF; font-weight: bold;">else</span> <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/if.html"><span style="color: #0000FF; font-weight: bold;">if</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>z$ask<span style="color: #080;">&#41;</span><span style="color: #080;">&gt;</span><span style="color: #ff0000;">0</span> <span style="color: #080;">&amp;</span> <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>z$ask<span style="color: #080;">&#41;</span><span style="color: #080;">&lt;</span>z$last_price<span style="color: #080;">&#41;</span>rez<span style="color: #080;">&#91;</span>y<span style="color: #080;">&#93;</span><span style="color: #080;">=</span>z$ask
      <span style="color: #0000FF; font-weight: bold;">else</span> rez<span style="color: #080;">&#91;</span>y<span style="color: #080;">&#93;</span><span style="color: #080;">=</span>z$last_price
    <span style="color: #080;">&#125;</span>
    <span style="color: #0000FF; font-weight: bold;">else</span>
    <span style="color: #080;">&#123;</span>
      rez<span style="color: #080;">&#91;</span>y<span style="color: #080;">&#93;</span><span style="color: #080;">=</span><span style="color: #080;">&#40;</span>z$ask<span style="color: #080;">+</span>z$bid<span style="color: #080;">&#41;</span><span style="color: #080;">/</span><span style="color: #ff0000;">2</span>
    <span style="color: #080;">&#125;</span>
 <span style="color: #080;">&#125;</span></pre></td></tr></table></div>

<p>The code above loops over the time series and, based on a set of rules, decides which price (bid, ask or the previous one) to use for the calculations. The pure R script took 100 seconds to derive the prices.</p>
<p>During the weekend I found time to watch a very interesting <a href="http://goo.gl/zzq0B" target="_blank">Rcpp presentation</a>. To my surprise, there are numerous ways to seamlessly integrate C++ into R code. So I decided to rewrite the code above in C++ (using the Rcpp and inline packages).</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p703code6'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p7036"><td class="code" id="p703code6"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;">#c++ code embed in code value</span>
code<span style="color: #080;">=</span><span style="color: #ff0000;">'
NumericVector bid(bid_);NumericVector ask(ask_);NumericVector close(close_);NumericVector ret(bid.size());
int bid_size = bid.size();
for(int i =0;i&lt;bid_size;i++)
{
  if(close[i]&gt;0)
  {
    if(bid[i]&gt;close[i])
    {
      ret[i] = bid[i];
    }
    else if(ask[i]&gt;0 &amp;&amp; ask[i]&lt;close[i])
    {
      ret[i] = ask[i];
    }
    else
    {
      ret[i] = close[i];
    }
  }
  else
  {
    ret[i]=(bid[i]+ask[i])/2;
  }
&nbsp;
}
return ret;
'</span>
<span style="color: #228B22;">#a glue function between C++ and R</span>
getLastPrice <span style="color: #080;">=</span> cxxfunction<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/base/html/Log.html"><span style="color: #0000FF; font-weight: bold;">signature</span></a><span style="color: #080;">&#40;</span> bid_ <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;numeric&quot;</span>,ask_ <span style="color: #080;">=</span> <span style="color: #ff0000;">&quot;numeric&quot;</span>,close_<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;numeric&quot;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/body.html"><span style="color: #0000FF; font-weight: bold;">body</span></a><span style="color: #080;">=</span>code,plugin<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Rcpp&quot;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">#and the call of the function</span>
getLastPrice<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$bid<span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$ask<span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>x$last_price<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>What did I get in return? Well, 0.1 seconds instead of 100 seconds, a roughly thousandfold speed-up!</p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2012/01/30/the-power-of-rcpp/feed/</wfw:commentRss>
		<slash:comments>10</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2012/01/30/the-power-of-rcpp/</feedburner:origLink></item>
		<item>
		<title>ai-class.com vs ml-class.com</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/lxXj_H2Njzw/</link>
		<comments>http://www.investuotojas.eu/2011/12/16/ai-class-com-vs-ml-class-com/#comments</comments>
		<pubDate>Fri, 16 Dec 2011 22:52:04 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[books]]></category>
		<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=690</guid>
		<description><![CDATA[For those who did not know, Stanford University offered 3 courses free of charge at the beginning of the autumn. It is kind of shocking &#8211; a US-based institution offers education for free! Take any socialism-oriented country and one of the promises is free education. But it seems that the argument is losing its power &#8211; Stanford, [...]]]></description>
			<content:encoded><![CDATA[<p>For those who did not know, Stanford University offered 3 courses free of charge at the beginning of the autumn. It is kind of shocking &#8211; a US-based institution offers education for free! Take any socialism-oriented country and one of the promises is free education. But it seems that the argument is losing its power &#8211; Stanford, <a href="http://www.khanacademy.org/" target="_blank">khanacademy</a> and a <a href="http://ocw.mit.edu/index.htm" target="_blank">bunch of others</a> offer high-quality learning for everyone.</p>
<p><a href="http://jan2012.ml-class.org" target="_blank">In January</a> (scroll down to get the full list), Stanford will provide more than 15 courses for free, so I thought I would share my own opinion about the two courses I took.</p>
<p><a href="http://www.ml-class.com" target="_blank">ml-class.com</a> This course was a perfect fit for my personality and I loved it. Every week there were video lessons on topics such as machine learning, data mining and statistical pattern recognition, followed by review questions and programming exercises, which had to be completed in Octave/Matlab. The quality of the videos was superb, the lessons were 8-14 minutes long and their format was great as well (Prof. Andrew Ng switched seamlessly between the whiteboard and talking).<br />
This course inspired me to build an anomaly detection system at work, where we have already spotted a few anomalies. Now I&#8217;m working on a kind of &#8220;spam filter implementation&#8221; for text analysis.<br />
For me, the practical part of the course is like water for a fish &#8211; without it the theoretical part is empty and will be forgotten within hours.</p>
<p><a href="https://www.ai-class.com/">ai-class.com</a> This course gave me a broad view of artificial intelligence: machine learning, robotics, natural language processing, computer vision, search algorithms, etc. I suppose that, because the topics are so different, the course leaned towards theory &#8211; otherwise the practical parts would take forever. However, in the last part there was an optional exercise &#8211; to encrypt two texts, which I loved!<br />
The instructors, namely <a href="http://www.linkedin.com/pub/sebastian-thrun/17/713/88">Sebastian Thrun</a> and <a href="http://www.linkedin.com/in/pnorvig">Peter Norvig</a>, recommend this book: <a href="http://www.amazon.com/gp/product/0136042597/ref=as_li_qf_sp_asin_tl?ie=UTF8&amp;tag=quantitativ0e-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0136042597">Artificial Intelligence: A Modern Approach</a><img style="border: none !important; margin: 0px !important;" src="http://www.assoc-amazon.com/e/ir?t=quantitativ0e-20&amp;l=as2&amp;o=1&amp;a=0136042597" border="0" alt="" width="1" height="1" />. I should say that the book was very helpful during the course, but I won&#8217;t use it outside the course.<br />
The courses have different evaluation systems. AI class scores your homework and exams, and the top 1% will be awarded a special paper and maybe a <a href="http://pastebin.com/JiczaBxb" target="_blank">job offer</a>, while ML class is inclined towards delivering knowledge &#8211; almost anyone who works hard can get a 100% score without a penalty. I think that these environments gave rise to different communities: the <a href="http://www.aiqus.com/" target="_blank">aiqus.com</a> forum is very harsh to any question, with answers that start with something like &#8220;I know the answer, but hey, I can&#8217;t tell you anything, because the honor code doesn&#8217;t allow it and I&#8217;m the smartest guy on Earth&#8221;, while the <a href="http://www.ml-class.org/course/qna/index" target="_blank">ml-class forum</a> is more open-minded &#8211; if you can&#8217;t crack the problem, other students will help you.<br />
I was in light shock when I first saw the format of the AI lectures &#8211; the instructors used a real whiteboard, namely paper and pencil, and it took me a while to get used to it.</p>
<p>But overall, I really enjoyed both courses, and special thanks go to the Stanford professors, concretely Andrew Ng, Sebastian Thrun and Peter Norvig!</p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/12/16/ai-class-com-vs-ml-class-com/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/12/16/ai-class-com-vs-ml-class-com/</feedburner:origLink></item>
		<item>
		<title>C++ is dead. Long live C++</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/wqx1N55hiZo/</link>
		<comments>http://www.investuotojas.eu/2011/12/01/c-is-dead-long-live-c/#comments</comments>
		<pubDate>Thu, 01 Dec 2011 20:39:22 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[books]]></category>
		<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[coding]]></category>
		<category><![CDATA[java]]></category>
		<category><![CDATA[quantitative]]></category>
		<category><![CDATA[R]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=667</guid>
		<description><![CDATA[During the summer I was contacted by a hedge fund from the Bahamas. The fund was looking for someone with R language skills on-site and insisted on a phone interview. Besides the obvious questions about finance, statistics, coding and how many tennis balls can fit in a Boeing 747 (ok, this question was omitted), they wanted to know if [...]]]></description>
			<content:encoded><![CDATA[<p>During the summer I was contacted by a hedge fund from the Bahamas. The fund was looking for someone with R language skills on-site and insisted on a phone interview. Besides the obvious questions about finance, statistics, coding and how many tennis balls can fit in a Boeing 747 (ok, this question was omitted), they wanted to know if I code in C++. So I told them the truth &#8211; the last time I wrote a line of C++ was more or less 10 years ago. Long story short &#8211; it made me think about the existence of C++.</p>
<p>10 years ago I was told that C++ was going to disappear soon and that Java was the king. At that time neither HN nor stackoverflow existed (meaning that I had to rely on limited sources), so I took it for granted, and here I am.</p>
<p>What do we have 10 years later? C++ is not dead, and Java is not sexy anymore. Actually, it is the opposite &#8211; if you use Java, then you are a clumsy programmer with a lack of imagination. Does that sound offensive? Then read, for example, the comments on the <a href="http://news.ycombinator.com/item?id=3293392">Scala vs Java</a> article and you will get the same feeling. The replacement of Sun with Oracle does not help either.</p>
<p>But let&#8217;s go back to C++. <a href="http://www.google.com/trends?q=c%2B%2B" target="_blank">Google trends</a> says that C++ enjoys either maturity or decline. However, if you concentrate on a specific industry, you will get a different picture. Kernel development (C, not C++), the game industry, number crunching, data mining, finance &#8211; that is where C++ matters. I know, I know, you can write magic code in Ruby or Python and it will perform almost as well as C++. And I saw a video where some guys claimed that they tuned Java &#8220;a bit&#8221; and now it is able to deal with more than 1 million requests a second. The <em>only</em> thing they did was eliminate garbage collection. The question is: is it really worth doing it that way?</p>
<p>The next thing is to check what the demand for C++ is. From time to time I scroll through the HN <a href="http://news.ycombinator.com/item?id=2949787">to be millionaires list</a> and, strangely enough, C++ is in demand for back-end systems where performance or data volume is an issue. However, if you are targeting the finance industry exclusively, then you may find this <a href="http://quant.stackexchange.com/questions/306/what-programming-languages-are-most-commonly-used-in-quantitative-finance" target="_blank">discussion</a> interesting. Basically, it says that there is stable demand for C++. It is worth saying, nonetheless, that C++ is mostly used by front and middle offices (where quants live) and that the demand diminishes in the back office.</p>
<p>With such a subjective study in mind I purchased <a href="http://www.amazon.com/gp/product/0201721481/ref=as_li_tf_tl?ie=UTF8&amp;tag=quantitativ0e-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=0201721481">C++ Primer</a><img style="border: none !important; margin: 0px !important;" src="http://www.assoc-amazon.com/e/ir?t=quantitativ0e-20&amp;l=as2&amp;o=1&amp;a=0201721481" border="0" alt="" width="1" height="1" /> by Stanley B. Lippman, which I recommend to beginners or disillusioned users like me. So far I have built 2 small projects, in one of which I parse 1.5 million terms to get a list of the most used terms. R does that in minutes, C++ in seconds.</p>
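<p>For illustration only, here is a minimal sketch of how such a "most used terms" count might look in C++. It is not the code from my project: the function name <em>mostUsedTerms</em> is hypothetical, and it assumes the terms have already been split into a vector of strings.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Count how often each term occurs and return the topN most frequent ones.
// Ties are broken alphabetically so the output is deterministic.
// (Hypothetical helper for illustration; assumes pre-tokenised input.)
std::vector<std::pair<std::string, std::size_t>>
mostUsedTerms(const std::vector<std::string>& terms, std::size_t topN)
{
    std::unordered_map<std::string, std::size_t> counts;
    for (const std::string& term : terms) {
        ++counts[term];  // operator[] value-initialises missing keys to 0
    }

    std::vector<std::pair<std::string, std::size_t>> ranked(counts.begin(), counts.end());
    topN = std::min(topN, ranked.size());

    // Only the first topN entries need to end up in sorted order.
    std::partial_sort(ranked.begin(), ranked.begin() + topN, ranked.end(),
        [](const auto& a, const auto& b) {
            if (a.second != b.second) return a.second > b.second;
            return a.first < b.first;
        });
    ranked.resize(topN);
    return ranked;
}
```

<p>The hash map keeps the counting pass linear in the number of terms, and the partial sort avoids ordering the whole vocabulary when only a short top list is needed.</p>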
<p>Welcome back, C++. It seems that I missed you.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/12/01/c-is-dead-long-live-c/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/12/01/c-is-dead-long-live-c/</feedburner:origLink></item>
		<item>
		<title>Trading volume forecast for an illiquid stock</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/wxnRIUIW978/</link>
		<comments>http://www.investuotojas.eu/2011/08/08/trading-volume-forecast-for-an-illiquid-stock/#comments</comments>
		<pubDate>Mon, 08 Aug 2011 10:14:54 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[TCA]]></category>
		<category><![CDATA[correlation]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[quantitative]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[trading]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=658</guid>
		<description><![CDATA[When dealing with transaction cost analysis, a stock&#8217;s volume is assumed to be stable or foreseeable.  However, the picture is different when we are dealing with an illiquid stock. It is relatively easy to forecast the volume of a liquid stock, because trading volume has high autocorrelation &#8211; the volumes at t and t+1 are correlated. For [...]]]></description>
			<content:encoded><![CDATA[<p>When dealing with transaction cost analysis, a stock&#8217;s volume is assumed to be stable or foreseeable.  However, the picture is different when we are dealing with an illiquid stock.</p>
<p>It is relatively easy to forecast the volume of a liquid stock, because trading volume has high autocorrelation &#8211; the volumes at <em>t</em> and <em>t+1</em> are correlated. For example, let&#8217;s take a look at the autocorrelation figure for BNP Paribas stock:</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=bnpautocorrelation.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/bnpautocorrelation.png" border="0" alt="Photobucket" /></a></p>
<p>The X axis shows the lag in days and the Y axis shows the correlation in percent. For BNP Paribas stock we have 60 % correlation between t and t+1 and 30 % between t and t+2.</p>
<p>Now, let&#8217;s look at the autocorrelation of an illiquid stock:</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=vlpautocorrelation.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/vlpautocorrelation.png" border="0" alt="Photobucket" /></a></p>
<p>The figure above shows that the autocorrelation for this emerging market stock is zero. That means that we can&#8217;t forecast tomorrow&#8217;s volume based on today&#8217;s volume.</p>
<p>Imagine that a portfolio manager has to liquidate 100 000 shares of an illiquid stock where the median daily volume is 4000 and the participation rate must be at most 20 %. How many days will it take to liquidate the position?</p>
<p>Because the daily volume of the stock is very volatile, we need more randomness in our forecast. To get it, we can use the bootstrap &#8211; let&#8217;s take the last 250 data points of the volume and generate 10 000 or 100 000 new time series from them. Once we have a bunch of new time series, let&#8217;s check how many days it would take to liquidate a position of 100 000 shares in each. Finally, we collect the numbers of days needed for liquidation into a new vector. The result is the following:</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=volumeHist.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/volumeHist.png" border="0" alt="Photobucket" /></a></p>
<p>The histogram above shows that, with 95% confidence, it will take at most 80 days to liquidate 100 000 shares of the illiquid stock.</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p658code8'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p6588"><td class="code" id="p658code8"><pre class="rsplus" style="font-family:monospace;"><span style="color: #228B22;">#volume - vector of the stock's volume</span>
<span style="color: #228B22;">#shareNumber - number of the shares to liquidate</span>
<span style="color: #228B22;">#loop - number of the time-series to be created</span>
<span style="color: #228B22;">#participationRate - what is going to be participation rate </span>
simNumberOfDays<span style="color: #080;">&lt;-</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/function.html"><span style="color: #0000FF; font-weight: bold;">function</span></a><span style="color: #080;">&#40;</span>volume,shareNumber,loop<span style="color: #080;">=</span><span style="color: #ff0000;">10000</span>,participationRate<span style="color: #080;">=</span><span style="color: #ff0000;">0.05</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
  rez<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/matrix.html"><span style="color: #0000FF; font-weight: bold;">matrix</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/nrow.html"><span style="color: #0000FF; font-weight: bold;">nrow</span></a><span style="color: #080;">=</span>loop<span style="color: #080;">&#41;</span>
  <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/for.html"><span style="color: #0000FF; font-weight: bold;">for</span></a><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span>loop<span style="color: #080;">&#41;</span>
  <span style="color: #080;">&#123;</span>
    x<span style="color: #080;">=</span>volume<span style="color: #080;">&#91;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sample.html"><span style="color: #0000FF; font-weight: bold;">sample</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/NROW.html"><span style="color: #0000FF; font-weight: bold;">NROW</span></a><span style="color: #080;">&#40;</span>volume<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#93;</span>
    y<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/cumsum.html"><span style="color: #0000FF; font-weight: bold;">cumsum</span></a><span style="color: #080;">&#40;</span>x<span style="color: #080;">*</span>participationRate<span style="color: #080;">&#41;</span>
    rez<span style="color: #080;">&#91;</span>i,<span style="color: #080;">&#93;</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/which.html"><span style="color: #0000FF; font-weight: bold;">which</span></a><span style="color: #080;">&#40;</span>y<span style="color: #080;">&gt;</span>shareNumber<span style="color: #080;">&#41;</span><span style="color: #080;">&#91;</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span>
  <span style="color: #080;">&#125;</span>
  <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/return.html"><span style="color: #0000FF; font-weight: bold;">return</span></a><span style="color: #080;">&#40;</span>rez<span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span>
rez<span style="color: #080;">=</span>simNumberOfDays<span style="color: #080;">&#40;</span>illiquidStock,<span style="color: #ff0000;">100000</span>,<span style="color: #ff0000;">10000</span>,<span style="color: #ff0000;">0.2</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>
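<p>The same idea can be sketched in C++, including the step from the simulated day counts to the 95% figure read off the histogram. This is a rough sketch under my own assumptions, not a port of the R function: it draws daily volumes with replacement (the usual meaning of bootstrap, whereas the R code reshuffles the sample), it counts the position as liquidated once the cumulative sold amount reaches the target, and the names <em>daysToLiquidate</em>, <em>simulateDays</em> and <em>quantileDays</em>, as well as the default seed and the cap on the number of days, are hypothetical.</p>

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <random>
#include <vector>

// Simulate one liquidation path: draw daily volumes at random (with
// replacement) from the history and sell participationRate of each day's
// volume until `shares` are sold. Returns the number of days needed,
// or -1 if the position is still open after maxDays.
int daysToLiquidate(const std::vector<double>& volume, double shares,
                    double participationRate, std::mt19937& rng, int maxDays)
{
    if (volume.empty()) return -1;
    std::uniform_int_distribution<std::size_t> pick(0, volume.size() - 1);
    double sold = 0.0;
    for (int day = 1; day <= maxDays; ++day) {
        sold += volume[pick(rng)] * participationRate;
        if (sold >= shares) return day;
    }
    return -1;
}

// Repeat the simulation `paths` times and collect the day counts.
// Paths that never finish within maxDays are dropped (an assumption
// of this sketch; it biases the result if that happens often).
std::vector<int> simulateDays(const std::vector<double>& volume, double shares,
                              double participationRate, int paths,
                              unsigned seed = 42, int maxDays = 10000)
{
    std::mt19937 rng(seed);
    std::vector<int> days;
    if (paths > 0) days.reserve(static_cast<std::size_t>(paths));
    for (int i = 0; i < paths; ++i) {
        int d = daysToLiquidate(volume, shares, participationRate, rng, maxDays);
        if (d > 0) days.push_back(d);
    }
    return days;
}

// Empirical quantile (nearest rank): the smallest simulated value such
// that at least a fraction q of the paths finished in that many days.
int quantileDays(std::vector<int> days, double q)
{
    if (days.empty()) return -1;
    std::sort(days.begin(), days.end());
    std::size_t rank = static_cast<std::size_t>(std::ceil(q * days.size()));
    if (rank == 0) rank = 1;
    if (rank > days.size()) rank = days.size();
    return days[rank - 1];
}
```

<p>With the last 250 daily volumes in a vector, a call such as <em>quantileDays(simulateDays(volume, 100000, 0.2, 10000), 0.95)</em> would give the number of days within which 95% of the simulated paths were fully liquidated.</p>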


]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/08/08/trading-volume-forecast-for-an-illiquid-stock/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/08/08/trading-volume-forecast-for-an-illiquid-stock/</feedburner:origLink></item>
		<item>
		<title>How big block trades affect stock market prices?</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/xxNtLrNG4kk/</link>
		<comments>http://www.investuotojas.eu/2011/07/27/how_big_block_trades_affect_stock_market_prices/#comments</comments>
		<pubDate>Wed, 27 Jul 2011 20:52:53 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[TCA]]></category>
		<category><![CDATA[ggplot2]]></category>
		<category><![CDATA[Lithuania]]></category>
		<category><![CDATA[quantitative]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[trading]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=644</guid>
		<description><![CDATA[I will be giving a presentation on &#8220;Optimal transaction cost&#8221; in Vilnius on  16  August. While preparing the presentation and looking for an optimal execution solution, a natural question arises: does the size of the trade affect stock market price? I&#8217;m sure, you would say 100 % yes. Well, you would be right, but what is [...]]]></description>
			<content:encoded><![CDATA[<p>I will be giving a presentation on &#8220;Optimal transaction cost&#8221; in <a href="http://maps.google.com/maps?q=vilnius,Maironio+g.+11&amp;hl=en&amp;sll=54.689386,25.280024&amp;sspn=0.599292,1.226349&amp;z=16" target="_blank">Vilnius</a> on  16  August. While preparing the presentation and looking for an optimal execution solution, a natural question arises: does the size of the trade affect stock market price? I&#8217;m sure, you would say 100 % yes. Well, you would be right, but what is the scale of such effect? Is it possible to profit from execution of the big block trades?</p>
<p>Such a test is not trivial, and to conduct it you need high-frequency data, which is messy in most cases. For testing purposes I chose <a href="http://finance.yahoo.com/q?s=BNP.PA&amp;ql=0" target="_blank">BNP Paribas</a> stock from February 2011 to May 2011. Initially, I had more than 460k trades and more than 320k quotes. However, I filtered the data down to buyer-initiated trades. To find buyer-initiated trades, I used the <em>Lee-Ready rule</em> &#8211; a short description can be found <a href="http://goo.gl/RWSqa" target="_blank">here</a> on page 2. I found out about the Lee-Ready rule while reading <a href="http://www.maxdama.com/?p=477" target="_blank">Maxdama</a>&#8217;s latest post and a damn good <a href="http://dl.dropbox.com/u/39904/maxdama.pdf" target="_blank">summary</a> (check page 42).</p>
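<p>For readers who have not met the rule before, here is a simplified sketch of the classification step. It is not the code I used: the <em>Trade</em> struct, the <em>Side</em> enum and the function <em>classifyTrades</em> are hypothetical names, each trade is assumed to already carry the bid and ask prevailing when it happened, and the question of how quotes are matched to trades in time is ignored.</p>

```cpp
#include <cstddef>
#include <vector>

enum class Side { Buy, Sell, Unknown };

struct Trade {
    double price;  // execution price
    double bid;    // best bid prevailing at the time of the trade
    double ask;    // best ask prevailing at the time of the trade
};

// Classify each trade as buyer- or seller-initiated.
// Quote rule: a price above the bid/ask midpoint is a buy, below it a sell.
// Tick test (used at the midpoint): compare with the most recent trade price
// that differs from the current one; an uptick is a buy, a downtick a sell.
// A midpoint trade with no earlier, different price stays Unknown.
std::vector<Side> classifyTrades(const std::vector<Trade>& trades)
{
    std::vector<Side> sides;
    sides.reserve(trades.size());
    for (std::size_t i = 0; i < trades.size(); ++i) {
        const Trade& t = trades[i];
        const double mid = (t.bid + t.ask) / 2.0;
        Side side = Side::Unknown;
        if (t.price > mid) {
            side = Side::Buy;
        } else if (t.price < mid) {
            side = Side::Sell;
        } else {
            // Walk back to the last trade with a different price.
            for (std::size_t j = i; j-- > 0;) {
                if (trades[j].price != t.price) {
                    side = (t.price > trades[j].price) ? Side::Buy : Side::Sell;
                    break;
                }
            }
        }
        sides.push_back(side);
    }
    return sides;
}
```

<p>Keeping only the trades classified as <em>Side::Buy</em> then gives the buyer-initiated subset used in the rest of the test.</p>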
<p>The first chart below shows the average return one trade later (within seconds in most cases) after a big or a small trade. The X axis represents the price difference between the trade and the following trade, the Y axis represents the trade size, and the dot size represents the number of trades within that volume cluster. As you can see, small trades add 0.0004% to the price, while big ones (more than 980 shares) increase the price by 0.0007% on average.</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=bnpNextTrade.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/bnpNextTrade.png" border="0" alt="Photobucket" /></a></p>
<p>The next figure shows the average return one minute later. This time the difference between small trades and big ones is almost threefold!</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=bnpMinuteLater.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/bnpMinuteLater.png" border="0" alt="Photobucket" /></a></p>
<p>While we can see that stock market prices are affected by big blocks, there&#8217;s no easy way to profit from it. You have to take the bid/ask spread into account, plus you become a liquidity demander at a moment when liquidity is dry. On the other hand, this test shows the cost for each volume cluster, and this cost can be used when choosing an optimal strategy for portfolio/stock liquidation.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/07/27/how_big_block_trades_affect_stock_market_prices/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/07/27/how_big_block_trades_affect_stock_market_prices/</feedburner:origLink></item>
		<item>
		<title>Plotting git statistics</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/rvP7bBzxreg/</link>
		<comments>http://www.investuotojas.eu/2011/07/13/plotting-git-statistics/#comments</comments>
		<pubDate>Wed, 13 Jul 2011 16:28:08 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[R-language]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=524</guid>
		<description><![CDATA[Here&#8217;s a funny story &#8211; a friend of mine, an avid gamer at that time, was going downhill on a bicycle when a wonderful idea flashed through his mind: I need to save the current status&#8230; Just in case I crash, I will start again from the top of the hill. If you are a developer (quantitative or [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a funny story &#8211; a friend of mine, an avid gamer at that time, was going downhill on a bicycle when a wonderful idea flashed through his mind: I need to save the current status&#8230; Just in case I crash, I will start again from the top of the hill.</p>
<p>If you are a developer (quantitative or software), then you can use such a marvelous feature. I use <a href="https://github.com/" target="_blank">GitHub</a> for my software and data mining or quantitative projects. Yesterday I came up with the idea of checking the statistics of my git commits. You can easily find ready-to-use software for that, but I was eager to extend my knowledge of git features and to keep my machine clean.</p>
<p>I built two scripts &#8211; one is a Linux shell script that gets the data and the other plots the data in R.<br />
<strong>getstats.sh:</strong></p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p524code11'); return false;">View Code</a> BASH</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p52411"><td class="code" id="p524code11"><pre class="bash" style="font-family:monospace;"><span style="color: #c20cb9; font-weight: bold;">git</span> log master <span style="color: #660033;">--shortstat</span> <span style="color: #660033;">--pretty</span>=<span style="color: #ff0000;">&quot;format: %ai&quot;</span><span style="color: #000000; font-weight: bold;">|</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #660033;">-e</span> <span style="color: #ff0000;">'s/\+[0-9]*/,/g'</span><span style="color: #000000; font-weight: bold;">|</span><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">':a;N;$!ba;s/ ,\n/,/g'</span><span style="color: #000000; font-weight: bold;">|</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/ files changed//g'</span><span style="color: #000000; font-weight: bold;">|</span><span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/ insertions(,)//g'</span><span style="color: #000000; font-weight: bold;">|</span>
<span style="color: #c20cb9; font-weight: bold;">sed</span> <span style="color: #ff0000;">'s/ deletions(-)//g'</span> <span style="color: #000000; font-weight: bold;">&gt;</span>gitstats.csv</pre></td></tr></table></div>

<p>This part of the code: <em>git log master --shortstat --pretty=&quot;format: %ai&quot;</em> dumps all the necessary data (<em>%ai</em> prints the author date in an ISO 8601-like format), and the rest of the code makes it ready for R consumption. I found this <a href="ftp://ftp.kernel.org/pub/software/scm/git-core/docs/pretty-formats.txt" target="_blank">page</a> helpful when I was formatting the dump.</p>
<p><strong>gitStats.R:</strong></p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p524code12'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p52412"><td class="code" id="p524code12"><pre class="rsplus" style="font-family:monospace;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/require.html"><span style="color: #0000FF; font-weight: bold;">require</span></a><span style="color: #080;">&#40;</span>ggplot2<span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/require.html"><span style="color: #0000FF; font-weight: bold;">require</span></a><span style="color: #080;">&#40;</span>xts<span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/setwd.html"><span style="color: #0000FF; font-weight: bold;">setwd</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'/home/git/Rproject/gitStats/'</span><span style="color: #080;">&#41;</span> 
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/Sys.setenv.html"><span style="color: #0000FF; font-weight: bold;">Sys.<span style="">setenv</span></span></a><span style="color: #080;">&#40;</span>TZ<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;GMT&quot;</span><span style="color: #080;">&#41;</span>
tmp<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.matrix.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">matrix</span></span></a><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">read.<span style="">table</span></span><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'gitstats.csv'</span>,sep<span style="color: #080;">=</span><span style="color: #ff0000;">','</span>,header<span style="color: #080;">=</span>FALSE<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
commits<span style="color: #080;">=</span>xts<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/cbind.html"><span style="color: #0000FF; font-weight: bold;">cbind</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.double.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">double</span></span></a><span style="color: #080;">&#40;</span>tmp<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.double.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">double</span></span></a><span style="color: #080;">&#40;</span>tmp<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">3</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.double.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">double</span></span></a><span style="color: #080;">&#40;</span>tmp<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">4</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,order.<span style="">by</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.POSIXct.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">POSIXct</span></span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/strptime.html"><span style="color: #0000FF; font-weight: bold;">strptime</span></a><span style="color: #080;">&#40;</span>tmp<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: 
#080;">&#93;</span>,<span style="color: #ff0000;">'%Y-%m-%d %H:%M:%S'</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/colnames.html"><span style="color: #0000FF; font-weight: bold;">colnames</span></a><span style="color: #080;">&#40;</span>commits<span style="color: #080;">&#41;</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Changes'</span>,<span style="color: #ff0000;">'Insertion'</span>,<span style="color: #ff0000;">'Deletion'</span><span style="color: #080;">&#41;</span>
tmp<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/data.frame.html"><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span></a><span style="color: #080;">&#40;</span>Date<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.Date.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">Date</span></span></a><span style="color: #080;">&#40;</span>index<span style="color: #080;">&#40;</span>commits<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,Changes<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>commits$Changes<span style="color: #080;">&#41;</span>,Insertion<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>commits$Insertion<span style="color: #080;">&#41;</span>,Deletion<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>commits$Deletion<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
tmp<span style="color: #080;">=</span>melt<span style="color: #080;">&#40;</span>tmp,id.<span style="">vars</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Date'</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">png</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'gitStats.png'</span>,width<span style="color: #080;">=</span><span style="color: #ff0000;">500</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/print.html"><span style="color: #0000FF; font-weight: bold;">print</span></a><span style="color: #080;">&#40;</span>ggplot<span style="color: #080;">&#40;</span>tmp,aes<span style="color: #080;">&#40;</span>Date,value,color<span style="color: #080;">=</span>variable<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>geom_jitter<span style="color: #080;">&#40;</span>alpha<span style="color: #080;">=</span>.65,size<span style="color: #080;">=</span><span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">dev.<span style="">off</span></span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">#############daily aggregated data##############</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/factor.html"><span style="color: #0000FF; font-weight: bold;">factor</span></a><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.factor.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">factor</span></span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/format.html"><span style="color: #0000FF; font-weight: bold;">format</span></a><span style="color: #080;">&#40;</span>index<span style="color: #080;">&#40;</span>commits<span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">'%Y%m%d'</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
tmp<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/cbind.html"><span style="color: #0000FF; font-weight: bold;">cbind</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">aggregate</span><span style="color: #080;">&#40;</span>commits$Changes,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/factor.html"><span style="color: #0000FF; font-weight: bold;">factor</span></a>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sum.html"><span style="color: #0000FF; font-weight: bold;">sum</span></a><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">aggregate</span><span style="color: #080;">&#40;</span>commits$Insertion,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/factor.html"><span style="color: #0000FF; font-weight: bold;">factor</span></a>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sum.html"><span style="color: #0000FF; font-weight: bold;">sum</span></a><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">aggregate</span><span style="color: #080;">&#40;</span>commits$Deletion,<a 
href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/factor.html"><span style="color: #0000FF; font-weight: bold;">factor</span></a>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sum.html"><span style="color: #0000FF; font-weight: bold;">sum</span></a><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
tmp<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/data.frame.html"><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/unique.html"><span style="color: #0000FF; font-weight: bold;">unique</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.Date.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">Date</span></span></a><span style="color: #080;">&#40;</span>index<span style="color: #080;">&#40;</span>commits<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,tmp<span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/colnames.html"><span style="color: #0000FF; font-weight: bold;">colnames</span></a><span style="color: #080;">&#40;</span>tmp<span style="color: #080;">&#41;</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Date'</span>,<span style="color: #ff0000;">'Changes'</span>,<span style="color: #ff0000;">'Insertion'</span>,<span style="color: #ff0000;">'Deletion'</span><span style="color: #080;">&#41;</span>
tmp<span style="color: #080;">=</span>melt<span style="color: #080;">&#40;</span>tmp,id.<span style="">vars</span><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/c.html"><span style="color: #0000FF; font-weight: bold;">c</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'Date'</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">png</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'gitStats2.png'</span>,width<span style="color: #080;">=</span><span style="color: #ff0000;">500</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/print.html"><span style="color: #0000FF; font-weight: bold;">print</span></a><span style="color: #080;">&#40;</span>ggplot<span style="color: #080;">&#40;</span>tmp,aes<span style="color: #080;">&#40;</span>Date,value,color<span style="color: #080;">=</span>variable<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>geom_jitter<span style="color: #080;">&#40;</span>alpha<span style="color: #080;">=</span>.65,size<span style="color: #080;">=</span><span style="color: #ff0000;">3</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">dev.<span style="">off</span></span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>The R script generates the plot below:</p>
<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=gitStats.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/gitStats.png" border="0" alt="Photobucket" /></a></p>
<p>What does it show? It shows my activity in the master branch. There are two projects &#8211; one was suspended in March and the other is under heavy development. As you can see, there were a lot of insertions when the latter project was first committed, and since then the number of insertions has declined. I will come back to this when I have generated more data.<br />
Do you track your git activity?</p>
<p><a href="https://github.com/kafka399/Rproject/tree/master/gitStats" target="_blank">Source code</a></p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/07/13/plotting-git-statistics/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/07/13/plotting-git-statistics/</feedburner:origLink></item>
		<item>
		<title>Artificial intelligence in trading: k-means clustering</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/z5NKw-tpm48/</link>
		<comments>http://www.investuotojas.eu/2011/07/06/artificial-intelligence-in-trading-k-means-clustering/#comments</comments>
		<pubDate>Wed, 06 Jul 2011 09:53:29 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[stocks]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=500</guid>
		<description><![CDATA[There are many flavors of artificial intelligence (AI); here I want to show a practical example of cluster analysis. It is very applicable in finance. For example, one of the stylized facts of volatility is that it moves in clusters, meaning that today&#8217;s volatility is likely to resemble yesterday&#8217;s volatility. To gauge these moves you [...]]]></description>
			<content:encoded><![CDATA[<p>There are many flavors of artificial intelligence (AI); here I want to show a practical example of cluster analysis. It is very applicable in finance. For example, one of the stylized facts of volatility is that it moves in clusters, meaning that today&#8217;s volatility is likely to resemble yesterday&#8217;s volatility. To gauge these moves you can use a hidden Markov model (a complicated method) or k-means (probably too simplistic). The GARCH model, however, successfully exploits this stylized fact to predict tomorrow&#8217;s volatility (it takes into account another fact as well &#8211; volatility is a mean-reverting process).</p>
<p>K-means is an unsupervised learning method &#8211; you supply the data and k-means decides how to group it. The idea is to split the data into clusters, each defined by a cluster center, and to assign each point to the nearest center. There is a drawback to this approach &#8211; the algorithm has to establish the cluster centers from the initial data set. If the data is very noisy and the centers are not stable, then every run can give you different results.</p>
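<p>The idea can be sketched in a few lines of base R: assign each point to the nearest center, then move each center to the mean of its points. This is only an illustration on made-up numbers, not the code used in this post; the data, the starting centers and the helper <em>nearest</em> are assumptions.</p>

```r
# Made-up one-dimensional data with two obvious groups (illustration only)
x = c(-2.1, -1.9, -2.0, 1.9, 2.0, 2.1)
# Hypothetical starting centers; kmeans() would pick them at random
centers = c(-1, 1)

# Assignment step: for every point, the index of the nearest center
nearest = function(x, centers) apply(abs(outer(x, centers, "-")), 1, which.min)
cl = nearest(x, centers)

# Update step: every center moves to the mean of the points assigned to it
centers = as.numeric(tapply(x, cl, mean))

# With algorithm="Lloyd" the built-in kmeans() repeats these two steps;
# nstart=10 tries 10 random starts and keeps the best one
set.seed(1)
fit = kmeans(x, centers=2, nstart=10, algorithm="Lloyd")
```

<p>Because the starting centers are random, it is usually safer to set <em>nstart</em> above 1, so that a single unlucky start does not decide the result.</p>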
<p>As you probably know, the distribution of financial data is very unstable. How do we tackle this problem? We should look at daily returns instead of prices. The figure below shows the distribution of the daily returns of SPY.</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p500code16'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p50016"><td class="code" id="p500code16"><pre class="rsplus" style="font-family:monospace;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/setwd.html"><span style="color: #0000FF; font-weight: bold;">setwd</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'/home/git/Rproject/kmeans/'</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/require.html"><span style="color: #0000FF; font-weight: bold;">require</span></a><span style="color: #080;">&#40;</span>quantmod<span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/require.html"><span style="color: #0000FF; font-weight: bold;">require</span></a><span style="color: #080;">&#40;</span>ggplot2<span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/Sys.setenv.html"><span style="color: #0000FF; font-weight: bold;">Sys.<span style="">setenv</span></span></a><span style="color: #080;">&#40;</span>TZ<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;GMT&quot;</span><span style="color: #080;">&#41;</span>
getSymbols<span style="color: #080;">&#40;</span><span style="color: #ff0000;">'SPY'</span>,from<span style="color: #080;">=</span><span style="color: #ff0000;">'2000-01-01'</span><span style="color: #080;">&#41;</span>
&nbsp;
x<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/data.frame.html"><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span></a><span style="color: #080;">&#40;</span>d<span style="color: #080;">=</span>index<span style="color: #080;">&#40;</span>Cl<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/return.html"><span style="color: #0000FF; font-weight: bold;">return</span></a><span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>Delt<span style="color: #080;">&#40;</span>Cl<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">png</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'daily_density.png'</span>,width<span style="color: #080;">=</span><span style="color: #ff0000;">500</span><span style="color: #080;">&#41;</span>
ggplot<span style="color: #080;">&#40;</span>x,aes<span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/return.html"><span style="color: #0000FF; font-weight: bold;">return</span></a><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>stat_density<span style="color: #080;">&#40;</span>colour<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;steelblue&quot;</span>, size<span style="color: #080;">=</span><span style="color: #ff0000;">2</span>, fill<span style="color: #080;">=</span>NA<span style="color: #080;">&#41;</span><span style="color: #080;">+</span>xlab<span style="color: #080;">&#40;</span>label<span style="color: #080;">=</span><span style="color: #ff0000;">'Daily returns'</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">dev.<span style="">off</span></span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=daily_density.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/daily_density.png" border="0" alt="Photobucket" /></a></p>
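<p>For reference, <em>Delt</em> applied to a single series computes the simple one-period return: (today&#8217;s close - yesterday&#8217;s close) / yesterday&#8217;s close. A minimal base R sketch of the same arithmetic, using made-up prices rather than real SPY data:</p>

```r
# Hypothetical closing prices, for illustration only
close = c(100, 102, 99.96)

# Simple return: (today - yesterday) / yesterday
ret = diff(close) / head(close, -1)
print(ret)
```

<p>Unlike prices, which can drift to any level, returns fluctuate around zero, so values from different years are comparable.</p>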
<p>I was ready to show another trick &#8211; how to neutralize the long tails by replacing the existing distribution with a uniform distribution &#8211; but a quick test revealed that this leads to uninterpretable results.</p>
<p>OK, let&#8217;s move on &#8211; how many clusters should we have? Can AI give us a clue? Of course, but keep in mind that your future decisions will then be <a href="http://www.sciencedaily.com/articles/a/anchoring.htm" target="_blank">anchored</a>.</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p500code17'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p50017"><td class="code" id="p500code17"><pre class="rsplus" style="font-family:monospace;">nasa<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">tail</span><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/cbind.html"><span style="color: #0000FF; font-weight: bold;">cbind</span></a><span style="color: #080;">&#40;</span>Delt<span style="color: #080;">&#40;</span>Op<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span>,Hi<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,Delt<span style="color: #080;">&#40;</span>Op<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span>,Lo<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,Delt<span style="color: #080;">&#40;</span>Op<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span>,Cl<span style="color: #080;">&#40;</span>SPY<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
&nbsp;
<span style="color: #228B22;">#optimal number of clusters</span>
wss <span style="color: #080;">=</span> <span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/nrow.html"><span style="color: #0000FF; font-weight: bold;">nrow</span></a><span style="color: #080;">&#40;</span>nasa<span style="color: #080;">&#41;</span><span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">*</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sum.html"><span style="color: #0000FF; font-weight: bold;">sum</span></a><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/apply.html"><span style="color: #0000FF; font-weight: bold;">apply</span></a><span style="color: #080;">&#40;</span>nasa,<span style="color: #ff0000;">2</span>,<span style="color: #0000FF; font-weight: bold;">var</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/for.html"><span style="color: #0000FF; font-weight: bold;">for</span></a> <span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">2</span><span style="color: #080;">:</span><span style="color: #ff0000;">15</span><span style="color: #080;">&#41;</span> wss<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span> <span style="color: #080;">=</span> <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sum.html"><span style="color: #0000FF; font-weight: bold;">sum</span></a><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">kmeans</span><span style="color: #080;">&#40;</span>nasa, centers<span style="color: #080;">=</span>i<span style="color: #080;">&#41;</span>$withinss<span style="color: #080;">&#41;</span>
wss<span style="color: #080;">=</span><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/data.frame.html"><span style="color: #0000FF; font-weight: bold;">data.<span style="">frame</span></span></a><span style="color: #080;">&#40;</span>number<span style="color: #080;">=</span><span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">15</span>,value<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/as.numeric.html"><span style="color: #0000FF; font-weight: bold;">as.<span style="">numeric</span></span></a><span style="color: #080;">&#40;</span>wss<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">png</span></a><span style="color: #080;">&#40;</span><span style="color: #ff0000;">'numberOfClusters.png'</span>,width<span style="color: #080;">=</span><span style="color: #ff0000;">500</span><span style="color: #080;">&#41;</span>
ggplot<span style="color: #080;">&#40;</span>wss,aes<span style="color: #080;">&#40;</span>number,value<span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>geom_point<span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>
  xlab<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Number of Clusters&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>ylab<span style="color: #080;">&#40;</span><span style="color: #ff0000;">&quot;Within groups sum of squares&quot;</span><span style="color: #080;">&#41;</span><span style="color: #080;">+</span>geom_smooth<span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/stats/html/summary.lm.html"><span style="color: #0000FF; font-weight: bold;">dev.<span style="">off</span></span></a><span style="color: #080;">&#40;</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p><a href="http://s176.photobucket.com/albums/w180/investuotojas/?action=view&amp;current=numberOfClusters.png" target="_blank"><img src="http://i176.photobucket.com/albums/w180/investuotojas/numberOfClusters.png" border="0" alt="Photobucket" /></a></p>
<p>The figure above implies that we should have more than 15 clusters for financial data. Well, for the sake of simplicity and for educational purposes, let&#8217;s use only 5.</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p500code18'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p50018"><td class="code" id="p500code18"><pre class="rsplus" style="font-family:monospace;">kmeanObject<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">kmeans</span><span style="color: #080;">&#40;</span>nasa,<span style="color: #ff0000;">5</span>,iter.<span style="">max</span><span style="color: #080;">=</span><span style="color: #ff0000;">10</span><span style="color: #080;">&#41;</span>
kmeanObject$centers
autocorrelation<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">head</span><span style="color: #080;">&#40;</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/cbind.html"><span style="color: #0000FF; font-weight: bold;">cbind</span></a><span style="color: #080;">&#40;</span>kmeanObject$cluster,<span style="color: #0000FF; font-weight: bold;">lag</span><span style="color: #080;">&#40;</span>as.<span style="">xts</span><span style="color: #080;">&#40;</span>kmeanObject$cluster<span style="color: #080;">&#41;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<span style="color: #080;">-</span><span style="color: #ff0000;">1</span><span style="color: #080;">&#41;</span>
<span style="color: #0000FF; font-weight: bold;">xtabs</span><span style="color: #080;">&#40;</span>~autocorrelation<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">+</span><span style="color: #080;">&#40;</span>autocorrelation<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
y<span style="color: #080;">=</span><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/apply.html"><span style="color: #0000FF; font-weight: bold;">apply</span></a><span style="color: #080;">&#40;</span><span style="color: #0000FF; font-weight: bold;">xtabs</span><span style="color: #080;">&#40;</span>~autocorrelation<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">+</span><span style="color: #080;">&#40;</span>autocorrelation<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>,<span style="color: #ff0000;">1</span>,<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/sum.html"><span style="color: #0000FF; font-weight: bold;">sum</span></a><span style="color: #080;">&#41;</span>
x<span style="color: #080;">=</span><span style="color: #0000FF; font-weight: bold;">xtabs</span><span style="color: #080;">&#40;</span>~autocorrelation<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">1</span><span style="color: #080;">&#93;</span><span style="color: #080;">+</span><span style="color: #080;">&#40;</span>autocorrelation<span style="color: #080;">&#91;</span>,<span style="color: #ff0000;">2</span><span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span><span style="color: #080;">&#41;</span>
&nbsp;
z<span style="color: #080;">=</span>x
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/for.html"><span style="color: #0000FF; font-weight: bold;">for</span></a><span style="color: #080;">&#40;</span>i <span style="color: #0000FF; font-weight: bold;">in</span> <span style="color: #ff0000;">1</span><span style="color: #080;">:</span><span style="color: #ff0000;">5</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#123;</span>
  z<span style="color: #080;">&#91;</span>i,<span style="color: #080;">&#93;</span><span style="color: #080;">=</span><span style="color: #080;">&#40;</span>x<span style="color: #080;">&#91;</span>i,<span style="color: #080;">&#93;</span><span style="color: #080;">/</span>y<span style="color: #080;">&#91;</span>i<span style="color: #080;">&#93;</span><span style="color: #080;">&#41;</span>
<span style="color: #080;">&#125;</span></pre></td></tr></table></div>

<p>The code above shows how to run k-means clustering in R. The first line runs the clustering and the second prints the clusters&#8217; centroids:</p>
<table border="1" cellspacing="0">
<caption class="captiondataframe"> </caption>
<tbody>
<tr>
<td>
<table class="dataframe" border="0">
<tbody>
<tr class="firstline">
<th></th>
<th>High</th>
<th>Low</th>
<th>Close</th>
</tr>
<tr>
<td class="firstcolumn">1</td>
<td class="cellinside">0.0388</td>
<td class="cellinside">-0.0094</td>
<td class="cellinside">0.0313</td>
</tr>
<tr>
<td class="firstcolumn">2</td>
<td class="cellinside">0.0049</td>
<td class="cellinside">-0.0050</td>
<td class="cellinside">0.0006</td>
</tr>
<tr>
<td class="firstcolumn">3</td>
<td class="cellinside">0.0143</td>
<td class="cellinside">-0.0038</td>
<td class="cellinside">0.0106</td>
</tr>
<tr>
<td class="firstcolumn">4</td>
<td class="cellinside">0.0038</td>
<td class="cellinside">-0.0148</td>
<td class="cellinside">-0.0103</td>
</tr>
<tr>
<td class="firstcolumn">5</td>
<td class="cellinside">0.0053</td>
<td class="cellinside">-0.0348</td>
<td class="cellinside">-0.0280</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<p>So, we have 5 clusters: 1 is an extremely positive day, 2 is a flat day, 3 is a positive day, and 4 and 5 are days with a negative outcome.<br />
The third and fourth lines of the code above compute and print the autocorrelation between today (N0, rows) and tomorrow (N1, columns), as counts of day-to-day cluster transitions:</p>
<table border="1" cellspacing="0">
<caption class="captiondataframe"> </caption>
<tbody>
<tr>
<td>
<table class="dataframe" border="0">
<tbody>
<tr class="firstline">
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
</tr>
<tr>
<td class="firstcolumn"><strong>1</strong></td>
<td class="cellinside">11</td>
<td class="cellinside">24</td>
<td class="cellinside">29</td>
<td class="cellinside">21</td>
<td class="cellinside">12</td>
</tr>
<tr>
<td class="firstcolumn"><strong>2</strong></td>
<td class="cellinside">16</td>
<td class="cellinside">991</td>
<td class="cellinside">288</td>
<td class="cellinside">351</td>
<td class="cellinside">42</td>
</tr>
<tr>
<td class="firstcolumn"><strong>3</strong></td>
<td class="cellinside">17</td>
<td class="cellinside">338</td>
<td class="cellinside">144</td>
<td class="cellinside">168</td>
<td class="cellinside">28</td>
</tr>
<tr>
<td class="firstcolumn"><strong>4</strong></td>
<td class="cellinside">27</td>
<td class="cellinside">310</td>
<td class="cellinside">202</td>
<td class="cellinside">207</td>
<td class="cellinside">32</td>
</tr>
<tr>
<td class="firstcolumn"><strong>5</strong></td>
<td class="cellinside">26</td>
<td class="cellinside">24</td>
<td class="cellinside">33</td>
<td class="cellinside">31</td>
<td class="cellinside">23</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<p>If you prefer proportions to raw counts, the following table divides each row by its total:</p>
<table border="1" cellspacing="0">
<caption class="captiondataframe"> </caption>
<tbody>
<tr>
<td>
<table class="dataframe" border="0">
<tbody>
<tr class="firstline">
<th></th>
<th>1</th>
<th>2</th>
<th>3</th>
<th>4</th>
<th>5</th>
</tr>
<tr>
<td class="firstcolumn"><strong>1</strong></td>
<td class="cellinside">0.11</td>
<td class="cellinside">0.25</td>
<td class="cellinside">0.30</td>
<td class="cellinside">0.22</td>
<td class="cellinside">0.12</td>
</tr>
<tr>
<td class="firstcolumn"><strong>2</strong></td>
<td class="cellinside">0.01</td>
<td class="cellinside"><strong>0.59</strong></td>
<td class="cellinside">0.17</td>
<td class="cellinside">0.21</td>
<td class="cellinside">0.02</td>
</tr>
<tr>
<td class="firstcolumn"><strong>3</strong></td>
<td class="cellinside">0.02</td>
<td class="cellinside"><strong>0.49</strong></td>
<td class="cellinside">0.21</td>
<td class="cellinside">0.24</td>
<td class="cellinside">0.04</td>
</tr>
<tr>
<td class="firstcolumn"><strong>4</strong></td>
<td class="cellinside">0.03</td>
<td class="cellinside">0.40</td>
<td class="cellinside">0.26</td>
<td class="cellinside">0.27</td>
<td class="cellinside">0.04</td>
</tr>
<tr>
<td class="firstcolumn"><strong>5</strong></td>
<td class="cellinside">0.19</td>
<td class="cellinside">0.18</td>
<td class="cellinside">0.24</td>
<td class="cellinside">0.23</td>
<td class="cellinside">0.17</td>
</tr>
</tbody>
</table>
</td>
</tr>
</tbody>
</table>
<p>How do you read these tables? Take row 2 as an example. The first table says that the centers of this cluster are 0.0049; -0.0050; 0.0006, meaning that on such a day the price of the asset moves in a very narrow range. Tables 2 and 3 show the chances for the next day (N1). There is only a 1% to 2% chance that the next day will be extremely positive or extremely negative (columns 1 and 5), a 59% chance that it will be flat like today (N0), and a 17% to 21% chance of mild volatility with a positive or negative outcome (columns 3 and 4). In short: if volatility is very low today, it will most likely be low tomorrow as well.</p>
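<p>As a small illustration of how tables 2 and 3 relate, here is a minimal sketch on a made-up sequence of cluster labels (the vector <code>labels</code> is hypothetical toy data, not the series clustered above). It pairs each day with the next one, counts the transitions and divides each row by its total; <code>prop.table</code> is used here in place of the loop from the listing above and should give the same row-wise result.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Toy cluster labels for 10 consecutive days (hypothetical data)
labels = c(2, 2, 3, 2, 4, 2, 2, 5, 1, 2)
n = length(labels)
today = labels[-n]      # N0: every day except the last
tomorrow = labels[-1]   # N1: the same series shifted by one day
# Transition counts: rows are today's cluster, columns are tomorrow's
counts = table(today, tomorrow)
# Row-wise proportions: each row sums to 1
probs = prop.table(counts, 1)
round(probs, 2)</pre></td></tr></table></div>

<p>Each row of <code>probs</code> can then be read the same way as a row of table 3: given today&#8217;s cluster, how often each cluster followed it.</p>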
<p>For further research I would advise increasing the number of clusters and checking how the results change. In the same vein, IntelligentTradingTech published a <a href="http://intelligenttradingtech.blogspot.com/2010/06/quantitative-candlestick-pattern.html" target="_blank">post</a> a while back.</p>
<p>The source code can be found <a href="https://github.com/kafka399/Rproject/blob/master/kmeans/kmeans.R" target="_blank">here</a>.</p>

]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/07/06/artificial-intelligence-in-trading-k-means-clustering/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/07/06/artificial-intelligence-in-trading-k-means-clustering/</feedburner:origLink></item>
		<item>
		<title>timezone issue in R</title>
		<link>http://feedproxy.google.com/~r/investuotojas/~3/bsvrxq1cxGc/</link>
		<comments>http://www.investuotojas.eu/2011/05/14/timezone-issue-in-r/#comments</comments>
		<pubDate>Sat, 14 May 2011 09:51:36 +0000</pubDate>
		<dc:creator>Dzidorius Martinaitis</dc:creator>
				<category><![CDATA[EN]]></category>
		<category><![CDATA[forex]]></category>
		<category><![CDATA[quant]]></category>
		<category><![CDATA[R-language]]></category>
		<category><![CDATA[Strategy]]></category>
		<category><![CDATA[quantitative]]></category>
		<category><![CDATA[R]]></category>
		<category><![CDATA[trading]]></category>

		<guid isPermaLink="false">http://www.investuotojas.eu/?p=493</guid>
		<description><![CDATA[While investigating the paper Intraday patterns in FX returns and order flow, I ran into a problem with timezones. I had 3 data sources with different timezones (GMT, CET, CEST). The most confusing thing was that I didn&#8217;t know how to deal with summer time. But why did I have data with summer time in the first place? [...]]]></description>
			<content:encoded><![CDATA[<p>While investigating the paper <a href="http://www.snb.ch/n/mmr/reference/working_paper_2011_04/source" target="_blank">Intraday patterns in FX returns and order flow</a>, I ran into a problem with timezones. I had 3 data sources with different timezones (GMT, CET, CEST). The most confusing thing was that I didn&#8217;t know how to deal with summer time.<br />
But why did I have data with summer time in the first place? Well, I use the IBrokers package to get the latest Forex data, and it turns out that it <a href="http://r.789695.n4.nabble.com/IBrokers-and-timezone-td3458674.html" target="_blank">is impossible to specify a timezone parameter</a> within the reqHistoricalData function. Once the data is received, it is assigned R&#8217;s default timezone (GMT in my case), but the timestamps themselves come in the OS timezone (CET/CEST in my case). It took me a while to realize that, but the real challenge was how to convert CET/CEST to GMT.</p>
<p>The answer: don&#8217;t use CET/CEST; use a location-based name such as &#8216;Europe/Paris&#8217; or &#8216;Europe/Berlin&#8217; instead. This approach takes summer time into account, so you don&#8217;t need to worry about it. Once you have the data in the &#8216;Europe/Paris&#8217; or &#8216;Europe/Berlin&#8217; timezone, you can easily convert it:</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p493code21'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p49321"><td class="code" id="p493code21"><pre class="rsplus" style="font-family:monospace;"><a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/Sys.setenv.html"><span style="color: #0000FF; font-weight: bold;">Sys.<span style="">setenv</span></span></a><span style="color: #080;">&#40;</span>TZ<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;Europe/Paris&quot;</span><span style="color: #080;">&#41;</span>
eur.<span style="">usd</span><span style="color: #080;">=</span>reqHistoricalData<span style="color: #080;">&#40;</span>tws,currency,whatToShow<span style="color: #080;">=</span><span style="color: #ff0000;">'MIDPOINT'</span>,barSize<span style="color: #080;">=</span><span style="color: #ff0000;">'1 hour'</span><span style="color: #080;">&#41;</span>
<a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/Sys.setenv.html"><span style="color: #0000FF; font-weight: bold;">Sys.<span style="">setenv</span></span></a><span style="color: #080;">&#40;</span>TZ<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;GMT&quot;</span><span style="color: #080;">&#41;</span>
eur.<span style="">usd</span><span style="color: #080;">=</span>xts<span style="color: #080;">&#40;</span>Op<span style="color: #080;">&#40;</span>eur.<span style="">usd</span><span style="color: #080;">&#41;</span>,tz<span style="color: #080;">=</span><span style="color: #ff0000;">'GMT'</span><span style="color: #080;">&#41;</span></pre></td></tr></table></div>

<p>or you can change the index:</p>

<div class="wp_codebox_msgheader wp_codebox_hide"><span class="right"><sup><a href="http://www.ericbess.com/ericblog/2008/03/03/wp-codebox/#examples" target="_blank" title="WP-CodeBox HowTo?"><span style="color: #99cc00">?</span></a></sup></span><span class="left"><a href="javascript:;" onclick="javascript:showCodeTxt('p493code22'); return false;">View Code</a> RSPLUS</span><div class="codebox_clear"></div></div><div class="wp_codebox"><table><tr id="p49322"><td class="code" id="p493code22"><pre class="rsplus" style="font-family:monospace;"> <a href="http://astrostatistics.psu.edu/su07/R/html/graphics/html/format.html"><span style="color: #0000FF; font-weight: bold;">format</span></a><span style="color: #080;">&#40;</span>eur.<span style="">usd</span>, tz<span style="color: #080;">=</span><span style="color: #ff0000;">&quot;GMT&quot;</span>,usetz<span style="color: #080;">=</span>TRUE<span style="color: #080;">&#41;</span></pre></td></tr></table></div>
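
<p>To see why a location-based zone name helps, here is a minimal base R sketch (the two timestamps are made-up examples, not data from IBrokers, and it assumes the system time zone database knows <code>Europe/Paris</code>). The same wall-clock hour in Paris maps to a different GMT hour in winter and in summer, and R works out the offset from the date.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="rsplus" style="font-family:monospace;"># Two made-up timestamps with the same wall-clock hour in Paris
winter = as.POSIXct("2011-01-14 12:00:00", tz = "Europe/Paris")  # CET, GMT+1
summer = as.POSIXct("2011-07-14 12:00:00", tz = "Europe/Paris")  # CEST, GMT+2
# Print the same instants in GMT; only the displayed zone changes
format(winter, tz = "GMT", usetz = TRUE)  # "2011-01-14 11:00:00 GMT"
format(summer, tz = "GMT", usetz = TRUE)  # "2011-07-14 10:00:00 GMT"</pre></td></tr></table></div>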


]]></content:encoded>
			<wfw:commentRss>http://www.investuotojas.eu/2011/05/14/timezone-issue-in-r/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://www.investuotojas.eu/2011/05/14/timezone-issue-in-r/</feedburner:origLink></item>
	</channel>
</rss>

