<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/rss2full.xsl" type="text/xsl" media="screen"?><?xml-stylesheet href="http://feeds.feedburner.com/~d/styles/itemcontent.css" type="text/css" media="screen"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>Software Ramblings</title>
	
	<link>http://softwareramblings.com</link>
	<description>Stephen Doyle's Ramblings on the Art of Software Engineering</description>
	<pubDate>Thu, 04 Sep 2008 15:55:25 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.6.1</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/SoftwareRamblings" type="application/rss+xml" /><item>
		<title>C++ Regular Expression Performance</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/379922916/c-regular-expression-performance.html</link>
		<comments>http://softwareramblings.com/2008/08/c-regular-expression-performance.html#comments</comments>
		<pubDate>Sun, 31 Aug 2008 21:10:03 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[C++]]></category>

		<category><![CDATA[Regular Expressions]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=34</guid>
		<description><![CDATA[
A previous post gave some examples to illustrate how to use regular expressions as defined by the upcoming C++0x standard. The new regex features are certainly easy to use but what about the performance of some of the current implementations?
To find out, I compared the performance of the Microsoft Visual Studio 2008 implementation against the regex implementation from [...]]]></description>
			<content:encoded><![CDATA[<div class="mceTemp">
<p>A <a href="http://softwareramblings.com/2008/07/regular-expressions-in-c.html">previous post</a> gave some examples to illustrate how to use regular expressions as defined by the upcoming C++0x standard. The new regex features are certainly easy to use but what about the performance of some of the current implementations?</p>
<p>To find out, I compared the performance of the Microsoft Visual Studio 2008 implementation against the regex implementation from the <a href="http://www.boost.org/">Boost library</a> v1.36, and also against the <a href="http://www.pcre.org/">PCRE v7.7</a> library. Microsoft introduced C++0x regex support in Visual Studio 2008 feature pack #1 and subsequently <a href="http://blogs.msdn.com/vcblog/archive/2008/08/11/tr1-fixes-in-vc9-sp1.aspx">improved regex performance</a> in Visual Studio 2008 service pack #1. Both the fp1 and sp1 implementations were included in the comparison. The Boost library was chosen because it was the original C++ regex implementation that formed the basis of the proposal for inclusion in C++0x. PCRE was chosen because it is very popular and in many respects the de facto regular expression library for C/C++ applications.</p>
<p><strong>Methodology<br />
</strong>The regular expression patterns and data strings used for the comparison were taken from <a href="http://www.tusker.org/regex/20050421.html">here</a>. Three patterns of varying complexity were selected and each was run against a set of five content strings. Each pattern/string search combination was run for 10,000 iterations and the elapsed time was measured for each combination. The code used to perform the test is available <a href="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench.cpp">here</a>. Multiple runs were conducted to confirm that the results were consistent from run to run. Once all the data was gathered, the results were normalized against libPCRE, so every graph shows libPCRE at a value of 1. This normalization makes it easier to compare the relative performance of each implementation.</p>
<table style="width: 450px; border-collapse: collapse; height: 141px;" border="0" cellspacing="0" cellpadding="0" width="450">
<colgroup span="1"><col style="width: 15pt; mso-width-source: userset; mso-width-alt: 731;" span="1" width="20"></col><col style="width: 177pt; mso-width-source: userset; mso-width-alt: 8630;" span="1" width="236"></col><col style="width: 210pt; mso-width-source: userset; mso-width-alt: 10240;" span="1" width="280"></col></colgroup>
<tbody>
<tr style="height: 15pt;" height="20">
<td class="xl66" style="width: 15pt; height: 15pt; background-color: #4f81bd; border: windowtext 0.5pt solid;" width="20" height="20"><span style="font-size: small; color: #ffffff; font-family: Calibri;">ID</span></td>
<td class="xl66" style="border-right: windowtext 0.5pt solid; border-top: windowtext 0.5pt solid; border-left: windowtext; width: 177pt; border-bottom: windowtext 0.5pt solid; background-color: #4f81bd;" width="236"><span style="font-size: small; color: #ffffff; font-family: Calibri;">Patterns</span></td>
<td class="xl66" style="border-right: windowtext 0.5pt solid; border-top: windowtext 0.5pt solid; border-left: windowtext; width: 210pt; border-bottom: windowtext 0.5pt solid; background-color: #4f81bd;" width="280"><span style="font-size: small; color: #ffffff; font-family: Calibri;">Content Strings</span></td>
</tr>
<tr style="height: 15pt;" height="20">
<td class="xl67" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext 0.5pt solid; border-bottom: windowtext 0.5pt solid; height: 15pt; background-color: transparent;" height="20"><span style="font-size: small; font-family: Calibri;">1</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">^(([^:]+)://)?([^:/]+)(:([0-9]+))?(/.*)</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">http://www.linux.com/</span></td>
</tr>
<tr style="height: 15pt;" height="20">
<td class="xl67" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext 0.5pt solid; border-bottom: windowtext 0.5pt solid; height: 15pt; background-color: transparent;" height="20"><span style="font-size: small; font-family: Calibri;">2</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">usd [+-]?[0-9]+.[0-9][0-9]</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">http://www.thelinuxshow.com/main.php3</span></td>
</tr>
<tr style="height: 15pt;" height="20">
<td class="xl67" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext 0.5pt solid; border-bottom: windowtext 0.5pt solid; height: 15pt; background-color: transparent;" height="20"><span style="font-size: small; font-family: Calibri;">3</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">\b(\w+)(\s+\1)+\b</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">usd 1234.00</span></td>
</tr>
<tr style="height: 15pt;" height="20">
<td class="xl67" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext 0.5pt solid; border-bottom: windowtext 0.5pt solid; height: 15pt; background-color: transparent;" height="20"><span style="font-size: small; font-family: Calibri;">4</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;"> </span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">he said she said he said no</span></td>
</tr>
<tr style="height: 15pt;" height="20">
<td class="xl67" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext 0.5pt solid; border-bottom: windowtext 0.5pt solid; height: 15pt; background-color: transparent;" height="20"><span style="font-size: small; font-family: Calibri;">5</span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;"> </span></td>
<td class="xl65" style="border-right: windowtext 0.5pt solid; border-top: windowtext; border-left: windowtext; border-bottom: windowtext 0.5pt solid; background-color: transparent;"><span style="font-size: small; font-family: Calibri;">same same same</span></td>
</tr>
</tbody>
</table>
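<p>The normalization step described in the methodology can be sketched as follows. This is only an illustration in Python 3, not the benchmark code linked above: the <code>normalize</code> helper, the implementation labels and the timing numbers are all made up for the example.</p>

```python
def normalize(timings, baseline="libPCRE"):
    """Divide every elapsed time by the baseline's elapsed time."""
    base = timings[baseline]
    return {name: elapsed / base for name, elapsed in timings.items()}

# Hypothetical elapsed times in milliseconds for one pattern/string pair.
# These are invented numbers, not measurements from this article.
example = {"libPCRE": 20.0, "boost": 30.0, "vc9-fp1": 50.0, "vc9-sp1": 40.0}

normalized = normalize(example)
print(normalized)
```

<p>After normalization the baseline is always 1, and every other value expresses that implementation's result as a multiple of the baseline.</p>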
<p><strong>Results<br />
</strong>The results for each of the three patterns are shown in the graphs below, one graph per pattern. The numbers along the x-axis of each graph correspond to the content strings numbered 1-5 in the table above.</div>
<div id="attachment_51" class="wp-caption alignnone" style="width: 310px"><a href="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench_pattern1.png"><img class="size-medium wp-image-51" title="regex_bench_pattern1" src="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench_pattern1-300x182.png" alt="RegEx Benchmark - Pattern #1" width="300" height="182" /></a><p class="wp-caption-text">RegEx Benchmark - Pattern #1</p></div>
<div id="attachment_52" class="wp-caption alignnone" style="width: 310px"><a href="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench_pattern2.png"><img class="size-medium wp-image-52" title="regex_bench_pattern2" src="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench_pattern2-300x181.png" alt="RegEx Benchmark - Pattern #2" width="300" height="181" /></a><p class="wp-caption-text">RegEx Benchmark - Pattern #2</p></div>
<div id="attachment_53" class="wp-caption alignnone" style="width: 310px"><a href="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench_pattern3.png"><img class="size-medium wp-image-53" title="regex_bench_pattern3" src="http://softwareramblings.com/wp-content/uploads/2008/08/regex_bench_pattern3-300x181.png" alt="RegEx Benchmark - Pattern #3" width="300" height="181" /></a><p class="wp-caption-text">RegEx Benchmark - Pattern #3</p></div>
<p>Based on the data shown in the graphs, a number of observations can be made:</p>
<ol>
<li>libPCRE v7.7 has the best performance for all pattern / string combinations.</li>
<li>Performance varies significantly for different patterns. For example, for pattern #1, the Boost implementation was faster than either of the Visual Studio 2008 (vc9) versions, but was slower for patterns #2 and #3. This seems to suggest that the Visual Studio 2008 implementations show relatively poor performance for patterns that contain alternations and a number of groups and nested groups.</li>
<li>Based on patterns #1 and #2, the performance of the Visual C++ regex implementation has <a href="http://blogs.msdn.com/vcblog/archive/2008/08/11/tr1-fixes-in-vc9-sp1.aspx">improved as claimed</a> between feature pack 1 (fp1) and service pack 1 (sp1). However, the pattern #3 results show that the improvement does not hold for every pattern.</li>
</ol>
<p><strong>Conclusion</strong><br />
While the C++0x regex library makes it easier to use regular expressions in C++ code, there are still some performance gaps between the Visual Studio 2008 implementation and the performance of existing regular expression libraries such as libPCRE.</p>
<p><em>Note: The data shown in this article is based on a small sample of regular expression patterns and content strings and so the results should be taken in that context.</em></p>
]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/08/c-regular-expression-performance.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/08/c-regular-expression-performance.html</feedburner:origLink></item>
		<item>
		<title>Multi-thread scaling issues with Python and Ruby</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/335458534/multi-thread-scaling-issues-with-python-and-ruby.html</link>
		<comments>http://softwareramblings.com/2008/07/multi-thread-scaling-issues-with-python-and-ruby.html#comments</comments>
		<pubDate>Mon, 14 Jul 2008 21:44:49 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[Concurrency]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[Ruby]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=33</guid>
		<description><![CDATA[With the advent of multi-core processors, CPU bound applications need to use multi-threading in order to be able to scale their performance beyond that offered by a single core. This provides many challenges, but an interesting aspect of this problem is to consider how the threading modules in modern programming languages such as Python and [...]]]></description>
			<content:encoded><![CDATA[<p>With the advent of multi-core processors, CPU bound applications need to use multi-threading in order to scale their performance beyond that offered by a single core. This presents many challenges, but an interesting aspect of the problem is how the threading modules in modern programming languages such as <a href="http://www.python.org/">Python</a> and <a href="http://www.ruby-lang.org/en/">Ruby</a> can either help or hinder this scalability. Yes, there are plenty of other programming languages in use today, but Python and especially Ruby are rapidly <a href="http://www.tiobe.com/index.php/content/paperinfo/tpci/index.html">rising in popularity</a> and there are some surprising limitations to be aware of when using their threading packages.</p>
<p><strong>RUBY<br />
</strong>The standard C implementation of Ruby 1.8 implements threading as <a href="http://en.wikipedia.org/wiki/Green_threads">green threading</a>, where all threads are serviced by a single OS level thread and the Ruby runtime has full control over the thread life cycle. As described in the <a href="http://spec.ruby-doc.org/wiki/Ruby_Threading">Ruby wiki</a>, Ruby&#8217;s thread scheduler is a simple cooperative timeslicing scheduler, with control switching to another thread when certain well defined keywords or events are encountered. There is also a 10ms timeout period to prevent too many context switches occurring (i.e. in general a maximum of one context switch every 10ms).</p>
<p>This use of green threads imposes severe scaling restrictions for Ruby applications that are CPU bound since the use of a single native OS thread limits the Ruby application to run on a single CPU core. IO bound Ruby applications can employ threading to a certain extent to parallelize waiting on IO operations but even this is limited by the 10ms minimum context switch time which has the effect of limiting the number of threads that can run within a Ruby application. Due to this limitation, scalability of Ruby applications appears to be solved today by splitting the application and <a href="http://spec.ruby-doc.org/wiki/Ruby_Threading">running it in multiple processes</a> which can then be run on different cores.</p>
<p>There is some hope in store though: using native OS threads instead of green threads is being considered for Ruby 2.0, and some implementations of Ruby such as <a href="http://spec.ruby-doc.org/wiki/JRuby_Threading">JRuby</a> already implement Ruby threads using native OS threads (in JRuby&#8217;s case, via Java threads).</p>
<p><strong>PYTHON<br />
</strong>In contrast to Ruby, Python threads are implemented using native OS threads and so it is possible for different Python threads within a single application to run on different cores on a multi-core processor under the control of the OS scheduler. However, Python threading has a serious limitation in the form of the <a href="http://docs.python.org/api/threads.html">Global Interpreter Lock</a> (GIL). This is a global lock that must be held by the current thread before it can safely access Python objects; only the thread that has acquired the global interpreter lock may operate on Python objects or call Python/C API functions. In order to support multi-threaded Python programs, the interpreter regularly releases and reacquires the lock - by default, every 100 bytecode instructions. C extensions can release and reacquire the lock using the Python API and so this offers some relief, but the lock must be acquired before the state of any Python object is accessed.</p>
<p>Similar to Ruby, this GIL effectively limits the performance of CPU bound Python applications to that of a single CPU core (since only one Python thread can run at a time). Scalability is available for IO bound applications as these can easily scale across cores and the &#8220;one at a time&#8221; model of the GIL does not significantly restrict the performance of threads that are highly IO bound. Some relief is available by being able to implement performance and lock optimized C extensions but this is very restrictive and cumbersome - certainly a lot harder than writing some Python code.</p>
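<p>A rough way to observe this effect is to time the same CPU bound work run serially and then spread across threads. The sketch below assumes Python 3 and its standard <code>concurrent.futures</code> module; <code>busy_sum</code> and <code>WORK</code> are hypothetical names chosen for the example. On an interpreter build that has a GIL the two timings are usually similar, but the exact numbers depend on the machine and the interpreter version, so none are quoted here.</p>

```python
import time
from concurrent.futures import ThreadPoolExecutor

def busy_sum(n):
    # Pure-Python, CPU bound loop; on a GIL build only one thread
    # can execute this bytecode at any given moment.
    total = 0
    for i in range(n):
        total += i
    return total

WORK = [200_000] * 4  # hypothetical workload: four equal CPU bound tasks

# Run the four tasks one after another on the main thread.
start = time.perf_counter()
serial = [busy_sum(n) for n in WORK]
serial_time = time.perf_counter() - start

# Run the same four tasks on four native OS threads.
start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded = list(pool.map(busy_sum, WORK))
threaded_time = time.perf_counter() - start

print(f"serial: {serial_time:.3f}s, 4 threads: {threaded_time:.3f}s")
```

<p>Both runs compute identical results; only the elapsed times are of interest. Replacing the loop with a blocking IO call is a simple way to see the IO bound case behave differently.</p>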
<p>Given this serious restriction of the Python threading model, you would expect it to be possible to replace the GIL with more fine grained locking, but it has been tried and there are some reasons <a href="http://www.python.org/doc/faq/library/#can-t-we-get-rid-of-the-global-interpreter-lock">why we can&#8217;t get rid of the global interpreter lock</a>. When fine grained locking was tried as a patch to Python 1.5, a 2x slowdown was observed. The slowdown was attributed to the overhead of acquiring and releasing the OS locks. This patch hasn&#8217;t been maintained for subsequent versions of Python. Another patch that is gaining popularity and is actively being maintained is <a href="https://launchpad.net/python-safethread">python-safethread</a>. This is a set of Python extensions that is &#8220;intended to provide safe, easy, and scalable concurrency mechanisms. It focuses on local concurrency, not distributed or parallel programs.&#8221; While it is not yet part of the Python mainline, it is certainly a promising solution to the GIL issue.</p>
<p>Update: Thanks to Adam Olsen for pointing me towards <a href="https://launchpad.net/python-safethread">python-safethread</a> as a possible solution to the GIL.</p>
]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/07/multi-thread-scaling-issues-with-python-and-ruby.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/07/multi-thread-scaling-issues-with-python-and-ruby.html</feedburner:origLink></item>
		<item>
		<title>Regular Expressions in C++</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/333141068/regular-expressions-in-c.html</link>
		<comments>http://softwareramblings.com/2008/07/regular-expressions-in-c.html#comments</comments>
		<pubDate>Sat, 12 Jul 2008 00:34:10 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[C++]]></category>

		<category><![CDATA[Regular Expressions]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=31</guid>
		<description><![CDATA[Up to now, regular expression support in C/C++ programs was achieved using third party or open source regular expression libraries such as the PCRE library. With the addition of regex support to the C++ standard library as part of the C++0x standard update, using regular expressions in C++ programs has become much simpler. This feature is [...]]]></description>
			<content:encoded><![CDATA[<p>Up to now, regular expression support in C/C++ programs was achieved using third party or open source regular expression libraries such as the <a href="http://www.pcre.org/">PCRE library</a>. With the addition of regex support to the C++ standard library as part of the C++0x standard update, using regular expressions in C++ programs has become much simpler. This feature is included in the <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2005/n1836.pdf">TR1 draft report</a>, which has already been implemented in some popular compilers such as gcc and Visual Studio 2008 (as part of service pack 1).</p>
<p>Six regular expression grammars will be supported in C++0x. The default is based upon the ECMAScript grammar specified in ECMA-262. This syntax closely resembles the Perl-style syntax used by PCRE and by languages such as Perl, Python and Ruby, which also provide built in regular expression support. The other supported grammars are the POSIX basic and extended regex syntaxes, and the syntaxes used in tools such as awk, grep and egrep.</p>
<p>Here are some examples that illustrate how to perform some basic tasks with the new C++ regex component of the standard library.</p>
<p><strong>Header files and namespaces</strong>:</p>
<pre class="syntax-highlight:c++">#include &lt;regex&gt;

using namespace std::tr1;</pre>
<p><strong>Finding a match:<br />
</strong>
<pre class="syntax-highlight:c++">regex rgx(&quot;ello&quot;);
assert(regex_search(&quot;Hello World&quot;, rgx));</pre>
<p>The above example illustrates the construction of a <code>regex</code> object, with the regex pattern being passed as a parameter to the regex constructor. The <code>regex</code> type is a specialization of the <code>basic_regex</code> template for working with regular expressions that are provided as sequences of <code>char</code>s. The <code>regex_search()</code> function template is then used to see if the &#8220;Hello World&#8221; string contains the &#8220;ello&#8221; pattern. This function returns true as soon as the first matching substring is found. The <code>regex_search()</code> function is also overloaded to provide versions that take sequence iterators as parameters (instead of a full string) and versions that provide additional information on the match results.</p>
<p><em>Note: assert() is used in the examples to highlight the &#8220;contract&#8221; provided by the API - e.g. to show that a function can be used in a conditional expression and whether it should return true or false for the particular example.</em></p>
<p><strong>Finding an exact match:</strong><br />
The <code>regex_match()</code> function template is an alternative to <code>regex_search()</code> and is used when the target sequence must exactly match the regular expression.</p>
<pre class="syntax-highlight:c++">regex rgx(&quot;ello&quot;);
assert(regex_match(&quot;Hello World&quot;, rgx) == false);
assert(regex_match(&quot;ello&quot;, rgx));</pre>
<p><strong>Finding the position of a match</strong>:<br />
The <code>match_results</code> template is used to receive search results from <code>regex_search()</code>; each entry it holds is a <code>sub_match</code>. When searching <code>char</code> data, the library provides a ready made specialization of <code>match_results</code> called <code>cmatch</code>.</p>
<pre class="syntax-highlight:c++">regex rgx(&quot;llo&quot;);
cmatch result;
regex_search(&quot;Hello World&quot;, result, rgx);
cout &lt;&lt; &quot;Matched \&quot;&quot; &lt;&lt; result.str()
    &lt;&lt; &quot;\&quot; after \&quot;&quot; &lt;&lt; result.prefix()
    &lt;&lt; &quot;\&quot; at offset: &quot; &lt;&lt; result.position()
    &lt;&lt; &quot; with length: &quot; &lt;&lt; result.length()
    &lt;&lt; endl;</pre>
<p><strong>Working with capture groups</strong>:<br />
Capture groups provide a means for capturing matched regions within a regular expression. Each captured region is represented by a <code>sub_match</code> template object. The <code>smatch</code> specialization of <code>match_results</code> is provided by the library for working with sequences of string sub-matches.</p>
<pre class="syntax-highlight:c++">string seq = &quot;foo@helloworld.com&quot;;
regex rgx(&quot;(.*)@(.*)&quot;);
smatch result;
regex_search(seq, result, rgx);
for(size_t i=0; i&lt;result.size(); ++i)
{
    cout &lt;&lt; result[i] &lt;&lt; endl;
}
</pre>
<p><strong>Case insensitive searches</strong>:</p>
<pre class="syntax-highlight:c++">regex rgx(&quot;ello&quot;, regex_constants::icase);
assert(regex_search(&quot;HELLO WORLD&quot;, rgx));</pre>
]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/07/regular-expressions-in-c.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/07/regular-expressions-in-c.html</feedburner:origLink></item>
		<item>
		<title>Running Functions as Threads in Python</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/321683656/running-functions-as-threads-in-python.html</link>
		<comments>http://softwareramblings.com/2008/06/running-functions-as-threads-in-python.html#comments</comments>
		<pubDate>Sat, 28 Jun 2008 00:05:56 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[Concurrency]]></category>

		<category><![CDATA[Python]]></category>

		<category><![CDATA[multi-threading]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=28</guid>
		<description><![CDATA[Suppose you have a function in some Python code that you want to run as a thread. How do you do it? The simplest way is via the thread module and its start_new_thread() method. This is illustrated in the following example.
import thread

def someFunc():
    print &#34;someFunc was called&#34;

thread.start_new_thread(someFunc, ())

This approach has a limitation in that once [...]]]></description>
			<content:encoded><![CDATA[<p>Suppose you have a function in some Python code that you want to run as a thread. How do you do it? The simplest way is via the <a href="http://docs.python.org/lib/module-thread.html">thread</a> module and its start_new_thread() function. This is illustrated in the following example.</p>
<pre class="syntax-highlight:python">import thread

def someFunc():
    print &quot;someFunc was called&quot;

thread.start_new_thread(someFunc, ())
</pre>
<p>This approach has a limitation in that once the start_new_thread() function is called, it is not possible to find out when the thread has finished or to wait for completion of the thread. This may be acceptable for many applications but may be too restrictive for others. Note that function parameters can also be passed in a tuple (use an empty tuple if there are no parameters) as the second argument to start_new_thread().</p>
<p>Python also provides the <a href="http://docs.python.org/lib/module-threading.html">threading</a> module, which implements a layer on top of the thread module. The <a href="http://docs.python.org/lib/module-threading.html">threading</a> module provides, among other things, a <a href="http://docs.python.org/lib/thread-objects.html">Thread</a> class which contains a run() method. Typical usage is to subclass the <a href="http://docs.python.org/lib/thread-objects.html">Thread</a> class and override the run() method in the subclass to implement the desired functionality. The <a href="http://docs.python.org/lib/thread-objects.html">Thread</a> class also provides start() and join() methods to control the starting of a thread and to provide a mechanism for waiting until the thread has finished execution (i.e. the end of the run() method is reached).</p>
<p>Without sub-classing, it is possible to pass a function or other callable object to the <a href="http://docs.python.org/lib/thread-objects.html">Thread</a> class constructor to specify the target that the run() method will call. This is illustrated below.</p>
<pre class="syntax-highlight:python">import threading

t1 = threading.Thread(target=someFunc)
t1.start()
t1.join()
</pre>
<p>This approach works well for providing a mechanism for waiting for the thread to complete. Arguments for the target function can also be supplied through the constructor&#8217;s <code>args</code> (a tuple) and <code>kwargs</code> (a dictionary) parameters.</p>
<p>An alternative, which is useful when the thread needs to carry extra state or behaviour, is to provide a subclass of <a href="http://docs.python.org/lib/thread-objects.html">Thread</a>. The function to execute and its arguments can be passed to the subclass constructor. This is illustrated below:</p>
<pre class="syntax-highlight:python">import threading

class FuncThread(threading.Thread):
    def __init__(self, target, *args):
        # Initialise the base class first so that it cannot
        # overwrite the attributes set below.
        threading.Thread.__init__(self)
        self._target = target
        self._args = args

    def run(self):
        self._target(*self._args)

# Example usage
def someOtherFunc(data, key):
    print &quot;someOtherFunc was called : data=%s; key=%s&quot; % (str(data), str(key))

t1 = FuncThread(someOtherFunc, [1,2], 6)
t1.start()
t1.join()
</pre>
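<p>For comparison, here is a minimal sketch of passing arguments without a subclass, using the <code>args</code> and <code>kwargs</code> parameters of the <code>threading.Thread</code> constructor. It assumes Python 3 syntax, unlike the listings above. The names <code>some_other_func</code>, <code>results</code> and <code>label</code> are hypothetical and exist only for this example.</p>

```python
import threading

results = []
results_lock = threading.Lock()

def some_other_func(data, key, label="demo"):
    # Record the arguments so that the caller can inspect them after join().
    with results_lock:
        results.append((label, data, key))

# Positional arguments go in the args tuple, keyword arguments in kwargs.
t1 = threading.Thread(target=some_other_func, args=([1, 2], 6), kwargs={"label": "t1"})
t1.start()
t1.join()

print(results)
```

<p>The lock is not strictly needed with a single worker thread; it is included because a shared list would need protecting as soon as more than one thread performed a compound update on it.</p>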
]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/06/running-functions-as-threads-in-python.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/06/running-functions-as-threads-in-python.html</feedburner:origLink></item>
		<item>
		<title>Demystifying the volatile keyword</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/289741385/demystifying-the-volatile-keyword.html</link>
		<comments>http://softwareramblings.com/2008/05/demystifying-the-volatile-keyword.html#comments</comments>
		<pubDate>Tue, 13 May 2008 22:10:30 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[C++]]></category>

		<category><![CDATA[code snippet]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=25</guid>
		<description><![CDATA[Following on from my earlier post about the restrict keyword, I&#8217;d like to try and dispel the myths around the volatile keyword. The meaning of volatile is a popular interview question, particularly for embedded development jobs and I&#8217;ve heard some programmers describe the properties of this keyword as if it had super powers.
The volatile keyword [...]]]></description>
			<content:encoded><![CDATA[<p>Following on from my <a href="http://softwareramblings.com/2008/05/c99-restrict-keyword.html">earlier post</a> about the restrict keyword, I&#8217;d like to try and dispel the myths around the volatile keyword. The meaning of volatile is a popular interview question, particularly for embedded development jobs and I&#8217;ve heard some programmers describe the properties of this keyword as if it had super powers.</p>
<p>The volatile keyword does a very simple job. When a variable is marked as volatile, the programmer is instructing the compiler not to cache this variable in a register but instead to read the value of the variable from memory each and every time the variable is used. That&#8217;s it - simple isn&#8217;t it?</p>
<p>To illustrate the use of the keyword, consider the following example:
<pre class="syntax-highlight:c">volatile int* vp = SOME_REGISTER_ADDRESS;
for(int i=0; i&lt;100; i++)
    foo(*vp);</pre>
<p>In this simple example, the pointer vp points to a volatile int. The value of this int is read from memory for each loop iteration. If volatile were not specified, the compiler would likely generate optimized code that reads the value of the int once, keeps it in a register and then uses the register copy during each iteration.</p>
<p>Examples of where volatile is often used:</p>
<ul>
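<li>When accessing hardware registers via pointers. It is necessary for the generated code to always access the hardware register&#8217;s value and never a potentially out-of-date copy.</li>
<li>When accessing a variable that is shared between two or more threads or between a thread and an ISR. Note that volatile on its own provides neither atomicity nor memory ordering, so locks or atomic operations are still needed to synchronize the threads.</li>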
</ul>
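<p>As a rough illustration of the second case, the sketch below uses a standard C signal handler as a stand-in for an ISR. The names <code>got_signal</code>, <code>on_signal</code> and <code>wait_for_flag</code> are invented for this example, and it assumes a hosted C environment. It is only a sketch, not real interrupt-handling code:</p>
<pre class="syntax-highlight:c">#include &lt;signal.h&gt;

/* Written by the handler, read by the normal program flow.
   volatile forces a real load on every test of the flag;
   sig_atomic_t keeps each access indivisible with respect to the handler. */
static volatile sig_atomic_t got_signal = 0;

/* Hypothetical handler: plays the role of the ISR. */
static void on_signal(int signo)
{
    (void)signo;
    got_signal = 1;
}

/* Hypothetical helper: returns 1 once the handler has run, -1 on error. */
int wait_for_flag(void)
{
    got_signal = 0;
    if (signal(SIGINT, on_signal) == SIG_ERR)
        return -1;
    raise(SIGINT);        /* stands in for the asynchronous interrupt */
    while (!got_signal)
        ;                 /* without volatile, this load could be hoisted out of the loop */
    return (int)got_signal;
}</pre>
<p>Calling <code>wait_for_flag()</code> from <code>main</code> should return 1 once the handler has set the flag.</p>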
<p>Common myths about the volatile keyword:</p>
<ul>
<li>A volatile variable will never reside in cache memory - e.g. within the L2 cache of a processor. This is not true. When a volatile variable is shared between two software threads, it is highly likely that the variable will exist in cached memory, and the cache coherency protocol will ensure that the threads see the correct, up-to-date value even if they are running on separate cores that don&#8217;t share the same cache. When the volatile variable refers to a hardware register, it is likely that the memory map of the system will be set up such that this register is not cacheable in the processor cache, but this is set up and managed by the appropriate device driver and is not something that is provided by the volatile keyword.</li>
</ul>
<p>That&#8217;s it. So the next time you hear somebody waxing lyrical about the super powers of the volatile keyword, please feel empowered to enlighten them!</p>
<img src="http://feeds.feedburner.com/~r/SoftwareRamblings/~4/289741385" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/05/demystifying-the-volatile-keyword.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/05/demystifying-the-volatile-keyword.html</feedburner:origLink></item>
		<item>
		<title>C99 restrict Keyword</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/289735451/c99-restrict-keyword.html</link>
		<comments>http://softwareramblings.com/2008/05/c99-restrict-keyword.html#comments</comments>
		<pubDate>Tue, 13 May 2008 21:44:56 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[C++]]></category>

		<category><![CDATA[code snippet]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=24</guid>
		<description><![CDATA[I was pleasantly surprised to encounter a C keyword that I had never heard of before. The keyword in question is the restrict keyword which is a type qualifier for pointers and is a formal part of the C99 standard. This keyword allows programmers to declare that pointers which share the same type do not [...]]]></description>
			<content:encoded><![CDATA[<p>I was pleasantly surprised to encounter a C keyword that I had never heard of before. The keyword in question is the <code>restrict</code> keyword, which is a type qualifier for pointers and is a formal part of the C99 standard. This keyword allows programmers to declare that pointers which share the same type do not alias each other. This information can then be used by the compiler to make optimizations when using the pointers. If the data is in fact aliased, the results are undefined.</p>
<p>Consider the following example:</p>
<pre class="syntax-highlight:c">void *memcpy(void * restrict dst, const void * restrict src, size_t size);</pre>
<p>This tells the compiler that the memory regions referenced by the <code>dst</code> and <code>src</code> pointer parameters do not overlap, so the compiler is free to apply any optimizations - including optimizations that may result in out-of-order reads/writes.</p>
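<p>To make this more concrete, here is a minimal sketch of <code>restrict</code> applied to the pointer parameters of a user-defined function. The function name <code>add_arrays</code> is made up for illustration, and the sketch assumes a compiler running in C99 mode or later:</p>
<pre class="syntax-highlight:c">#include &lt;stddef.h&gt;

/* Hypothetical helper: out[i] = a[i] + b[i] for i in [0, n).
   The restrict qualifiers are a promise by the caller that the three
   arrays do not overlap, so the compiler need not reload a[i] or b[i]
   after each store to out[i]. Passing overlapping arrays is undefined. */
void add_arrays(size_t n, float * restrict out,
                const float * restrict a, const float * restrict b)
{
    for (size_t i = 0; i &lt; n; i++)
        out[i] = a[i] + b[i];
}</pre>
<p>The qualifier changes nothing about how the function is called; it only places an obligation on the caller to pass non-overlapping buffers.</p>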
<p>Mainstream compilers have varying support for this feature.</p>
<ul>
<li>GCC supports it in C99 mode - specified via the &#8220;-std=c99&#8221; option - or, for non-C99 code, via the <code>__restrict</code> spelling, which GCC provides as an extension.</li>
<li>Microsoft&#8217;s Visual Studio .NET 2005/2008 compiler doesn&#8217;t support this feature as specified in the C99 standard but does provide similar support using the <code>__restrict</code> specifier. Microsoft also allows this keyword to be specified for both C and C++ code. See the <a href="http://msdn.microsoft.com/en-us/library/5ft82fed(VS.80).aspx">MSDN documentation</a> for more details on Microsoft&#8217;s implementation and the differences between its support and the C99 specification of <code>restrict</code>.</li>
</ul>
<p>Finally, it should be noted that this keyword is specific to C and is not specified in the 1998 C++ specification, nor is it currently planned for inclusion in the forthcoming C++ specification update.</p>
<p>References:</p>
<ul>
<li style="text-align: left;"><a href="http://msdn.microsoft.com/en-us/library/5ft82fed(VS.80).aspx">MSDN documentation on restrict keyword</a></li>
<li style="text-align: left;"><a href="http://www.cellperformance.com/mike_acton/2006/05/demystifying_the_restrict_keyw.html">Demystifying The Restrict Keyword</a></li>
</ul>
<img src="http://feeds.feedburner.com/~r/SoftwareRamblings/~4/289735451" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/05/c99-restrict-keyword.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/05/c99-restrict-keyword.html</feedburner:origLink></item>
		<item>
		<title>This blog has a new home</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/282985107/blog-has-a-new-home.html</link>
		<comments>http://softwareramblings.com/2008/05/blog-has-a-new-home.html#comments</comments>
		<pubDate>Sat, 03 May 2008 23:08:33 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/?p=23</guid>
		<description><![CDATA[This blog has moved from its old home at http://stephendoyle.blogspot.com to its own domain at http://softwareramblings.com. The move should be transparent to subscribers of the feedburner feed - at most there will be a few duplicate entries in the feed as the old posts are imported into the new setup.
So why the move? The main reason [...]]]></description>
			<content:encoded><![CDATA[<p>This blog has moved from its old home at <a href="http://stephendoyle.blogspot.com">http://stephendoyle.blogspot.com</a> to its own domain at <a href="http://softwareramblings.com">http://softwareramblings.com</a>. The move should be transparent to subscribers of the <a href="http://feeds.feedburner.com/SoftwareRamblings">feedburner feed</a> - at most there will be a few duplicate entries in the feed as the old posts are imported into the new setup.</p>
<p>So why the move? The main reason was that I found the Blogger interface a bit cumbersome for posts that involved source code. I was underwhelmed by the presentation of source code that resulted from &lt;code&gt; and &lt;pre&gt; tags. It was also a pain to have to remember to properly handle the angle brackets and ampersands in the code. I decided to move from <a href="http://blogspot.com">Blogger</a> to <a href="http://wordpress.com">WordPress</a> so that I could take advantage of its <a href="http://faq.wordpress.com/2007/09/03/how-do-i-post-source-code/">syntax highlighting plugin</a>, which is based on the <a href="http://code.google.com/p/syntaxhighlighter/">syntaxhighlighter Google Code project by Alex Gorbatchev</a>. Check out the following snippet - cool or what?</p>
<pre class="syntax-highlight:cpp">#include &lt;iostream&gt;

using namespace std;

int main()
{
    cout &lt;&lt; &quot;Hello World&quot; &lt;&lt; endl;
    return 0;
}</pre>
<p>Rather than moving the blog from stephendoyle.blogspot.com to something like stephendoyle.wordpress.com or softwareramblings.wordpress.com, I decided to move the blog to its own domain. Since the blog was entitled &#8220;Software Ramblings&#8221; and the domain of the same name was miraculously free, my mind was made up.</p>
<img src="http://feeds.feedburner.com/~r/SoftwareRamblings/~4/282985107" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/05/blog-has-a-new-home.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/05/blog-has-a-new-home.html</feedburner:origLink></item>
		<item>
		<title>Thread Affinity on OS X</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/282648647/thread-affinity-on-os-x.html</link>
		<comments>http://softwareramblings.com/2008/04/thread-affinity-on-os-x.html#comments</comments>
		<pubDate>Tue, 29 Apr 2008 21:59:00 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[C++]]></category>

		<category><![CDATA[OS X]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/2008/04/thread-affinity-on-os-x.html</guid>
		<description><![CDATA[My first experience in developing software to run on OS X has been a disappointing one. Previously, I have written software to run on Windows and Linux and in general I have found library or system API calls that provide me with the ability to access the services that I expect to be provided by [...]]]></description>
			<content:encoded><![CDATA[<p>My first experience in developing software to run on OS X has been a disappointing one. Previously, I have written software to run on Windows and Linux and in general I have found library or system API calls that provide me with the ability to access the services that I expect to be provided by the operating system. One such service is thread affinity - i.e. the ability to tie a particular thread to a given core. This is achieved using <code>SetThreadAffinityMask()</code> on Windows and <code>sched_setaffinity()</code> on Linux.</p>
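<p>For reference, here is a minimal sketch of what pinning looks like on Linux. The helper name <code>pin_to_core</code> is invented for this example, and the sketch assumes glibc, where <code>_GNU_SOURCE</code> has to be defined before <code>sched.h</code> is included in order to expose the <code>CPU_ZERO</code> and <code>CPU_SET</code> macros:</p>
<pre class="syntax-highlight:c">#define _GNU_SOURCE   /* assumption: glibc, needed for the CPU_* macros */
#include &lt;sched.h&gt;

/* Hypothetical helper: restrict the calling thread to a single core.
   Returns 0 on success and -1 on failure (errno is set by the call). */
int pin_to_core(int core)
{
    cpu_set_t set;

    CPU_ZERO(&amp;set);       /* start with an empty CPU mask */
    CPU_SET(core, &amp;set);  /* allow only the requested core */

    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &amp;set);
}</pre>
<p>A benchmark would call something like <code>pin_to_core(0)</code> before starting its timed section; the call fails if the requested core is not available to the process.</p>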
<p>However, I was very surprised to find that it does not appear to be possible to do this on OS X! </p>
<p>OS X Leopard introduced a <a href="http://developer.apple.com/releasenotes/Performance/RN-AffinityAPI/index.html">thread affinity API</a> which lets an application give affinity hints to the scheduler in order to improve data locality in caches. Unfortunately, this API does not provide the ability to tie a thread to a specific core. This is a major gap, especially now that all new PCs and laptops are multi-core.</p>
<p>So why do I want to be able to tie a thread to a core? I want to benchmark a piece of code and one of the data points of interest is to benchmark this code on a single core. With the current Mac OS X thread affinity API, this is not possible!</p>
<p>I&#8217;m finding it hard to believe that this service does not exist on OS X - I mean surely somebody must have wanted to run a single core benchmark on OS X running on a multi-core system. But after a couple of hours of googling and reading through the OS X APIs I have been unable to find out how to do this.</p>
<img src="http://feeds.feedburner.com/~r/SoftwareRamblings/~4/282648647" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/04/thread-affinity-on-os-x.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/04/thread-affinity-on-os-x.html</feedburner:origLink></item>
		<item>
		<title>Coding The Architecture</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/282648648/coding-the-architecture.html</link>
		<comments>http://softwareramblings.com/2008/04/coding-the-architecture.html#comments</comments>
		<pubDate>Tue, 22 Apr 2008 22:47:00 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[architecture]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/2008/04/coding-the-architecture.html</guid>
		<description><![CDATA[I stumbled upon Coding the Architecture and found it to be a very refreshing and pragmatic view of software architecture and of the role of a software architect. I generally find that sites, blogs and articles on software architecture tend to focus on presenting a specific architectural process, modeling tool or on the latest trend [...]]]></description>
			<content:encoded><![CDATA[<p>I stumbled upon <a href="http://www.codingthearchitecture.com/">Coding the Architecture</a> and found it to be a very refreshing and pragmatic view of software architecture and of the role of a software architect. I generally find that sites, blogs and articles on software architecture tend to focus on presenting a specific architectural process, modeling tool or on the latest trend in software architecture. Trying to find down-to-earth, practical advice on how to set oneself up for success as a software architect has proven difficult - until now.</p>
<p>After browsing through some of the <a href="http://www.codingthearchitecture.com/">blog</a> entries on the site and from one of its <a href="http://www.codingthearchitecture.com/pages/londonusergroup.html">user groups</a>, I am reminded of the resonant style of the <a href="http://www.pragprog.com/the-pragmatic-programmer">Pragmatic Programmer</a> book. It has the same practical and pragmatic feel. </p>
<p>Some entries / articles that I found interesting:</p>
<ul>
<li><a href="http://www.codingthearchitecture.com/2008/04/10/what_is_a_software_architect.html">What is a Software Architect?</a></li>
<li><a href="http://www.codingthearchitecture.com/2007/07/31/role_profile_for_software_architects.html">Role profile for software architects</a></li>
<li><a href="http://www.codingthearchitecture.com/files/presentations/20080408-sharing-architectures.pdf">Sharing Architectures</a></li>
<li><a href="http://www.codingthearchitecture.com/2008/03/18/software_architecture_document_guidelines.html">Software Architecture Document Guidelines</a></li>
</ul>
<img src="http://feeds.feedburner.com/~r/SoftwareRamblings/~4/282648648" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/04/coding-the-architecture.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/04/coding-the-architecture.html</feedburner:origLink></item>
		<item>
		<title>ToDo Lists</title>
		<link>http://feeds.feedburner.com/~r/SoftwareRamblings/~3/282648649/todo-lists.html</link>
		<comments>http://softwareramblings.com/2008/04/todo-lists.html#comments</comments>
		<pubDate>Fri, 18 Apr 2008 23:36:00 +0000</pubDate>
		<dc:creator>Stephen Doyle</dc:creator>
		
		<category><![CDATA[tools]]></category>

		<guid isPermaLink="false">http://softwareramblings.com/2008/04/todo-lists.html</guid>
		<description><![CDATA[Where would we be without ToDo lists? They have found their use in everything from keeping track of simple errands to keeping track of outstanding tasks in large scale projects. I find them an integral part of my day to day work and so over the years, I&#8217;ve tried out various ToDo list applications to [...]]]></description>
			<content:encoded><![CDATA[<p>Where would we be without ToDo lists? They have found their use in everything from keeping track of simple errands to keeping track of outstanding tasks in large-scale projects. I find them an integral part of my day-to-day work and so, over the years, I&#8217;ve tried out various ToDo list applications to try and find the one that is the right fit for me.</p>
<p>I am constantly amazed by the vast number of different applications for managing ToDo lists that have emerged. They range from very simple single list tools, to complex tools that are more suited towards project management than they are for keeping track of what to get during the next visit to the shops.</p>
<p>ToDo list applications need to feel right when you are using them. I have all too often had the experience of trying out a particular ToDo list application only to find that it constrains my usage model in some way. I then spend a couple of days or weeks trying to adapt my usage model to the constraints and model offered by the application. I eventually get too frustrated and give up. I am convinced at this stage that different people are tuned to different ToDo list application features and that there is no one &#8220;ToDo list model&#8221; that fits all. In spite of this, I believe that I have found the <a href="http://www.codeproject.com/KB/applications/todolist2.aspx">magic ToDo list application</a> that seems to fit most usage models - or at least it doesn&#8217;t impose any constraints on how you manage your ToDo lists, so you have the freedom to manage them whatever way you like.</p>
<p>After trying 20+ different applications over the past 10 or so years, I constantly return to <a href="http://www.codeproject.com/KB/applications/todolist2.aspx">Abstract Spoon&#8217;s ToDoList application</a>. This is without doubt one of the most flexible ToDo list managers out there and it acquires new features at a fast pace. This application really defines the meaning of feature rich. I have used it successfully for managing simple lists, project scheduling, time tracking, <a href="http://www.scrumalliance.org/">SCRUM</a> backlogs, burndown charts and even online SCRUM boards. It is difficult to describe its feature list in a simple blog post so I encourage you to <a href="http://www.codeproject.com/KB/applications/todolist2.aspx">download</a> it and give it a twirl.</p>
<p>The only downside that I have encountered with this particular application is that it is Windows specific. I&#8217;m sure that ports of this application to different OSes will occur with time but for the moment the cost of its excellent flexibility is only being able to use it on your Windows box(es).</p>
<img src="http://feeds.feedburner.com/~r/SoftwareRamblings/~4/282648649" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://softwareramblings.com/2008/04/todo-lists.html/feed</wfw:commentRss>
		<feedburner:origLink>http://softwareramblings.com/2008/04/todo-lists.html</feedburner:origLink></item>
	</channel>
</rss>
