<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss1full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:admin="http://webns.net/mvcb/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns="http://purl.org/rss/1.0/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">

<channel rdf:about="http://www.biais.org/blog/index.php/">
  <title>biais.org</title>
  <description><![CDATA[Python, art and chicken pie.]]></description>
  <link>http://www.biais.org/blog/index.php/</link>
  <dc:language>en</dc:language>
  <dc:creator />
  <dc:rights />
  <dc:date>2009-04-02T05:36:08+02:00</dc:date>
  <admin:generatorAgent rdf:resource="http://www.dotclear.net/" />
  
  <sy:updatePeriod>daily</sy:updatePeriod>
  <sy:updateFrequency>1</sy:updateFrequency>
  <sy:updateBase>2009-04-02T05:36:08+02:00</sy:updateBase>
  
  <items>
  <rdf:Seq>
    <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2009/04/02/73-where-the-world-sees-junk-africa-recycles-maker-faire-africa" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2009/01/05/72-genetic-algorithm-in-python-to-generate-file-converters" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/12/06/71-opencalais-semantic-analysis-web-service" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/12/05/70-dbpedia-32-including-dbpedia-ontology" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/09/17/69-ruby-for-a-python-programmer" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/08/29/68-evol-ution-darwin-s-graffiti" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/06/13/67-russian-word-stress-dictionary" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/05/26/66-screened-emacs-launcher" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/04/05/64-machine-translation-techniques-and-open-source" />
  <rdf:li rdf:resource="http://www.biais.org/blog/index.php/2008/03/24/63-ack-a-better-grep-for-programmers" />
  </rdf:Seq>
  </items>
<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/biais" type="application/rss+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /></channel>

<item rdf:about="http://www.biais.org/blog/index.php/2009/04/02/73-where-the-world-sees-junk-africa-recycles-maker-faire-africa">
  <title>Where the World Sees Junk, Africa Recycles: Maker Faire Africa</title>
  <link>http://feedproxy.google.com/~r/biais/~3/cbN9m1saMj0/73-where-the-world-sees-junk-africa-recycles-maker-faire-africa</link>
  <dc:date>2009-04-02T05:36:08+02:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Misc</dc:subject>
  <description>Maker Faire Africa (MFA), a celebration of African ingenuity, innovation and invention, will take place August 13-15 at the Ghana-India Kofi Annan Centre of Excellence in ICT in Ghana's capital, Accra.


You should also check out the Afrigadget website:

dedicated to showcasing African ingenuity....</description>
  <content:encoded><![CDATA[ <p><a href="http://makerfaireafrica.com/" hreflang="en"><img src="http://makerfaireafrica.com/wp-content/uploads/2009/04/mfa-banner-3j.jpg" alt="" /></a></p>


<p><a href="http://makerfaireafrica.com/" hreflang="en">Maker Faire Africa (MFA)</a>, a celebration of African ingenuity, innovation and invention, will take place August 13-15 at the Ghana-India Kofi Annan Centre of Excellence in ICT in Ghana's capital, Accra.</p>


<p>You should also check out the <a href="http://www.afrigadget.com/" hreflang="en">Afrigadget</a> website:</p>

<blockquote><p>dedicated to showcasing African ingenuity. A team of bloggers and readers contribute their pictures, videos and stories from around the continent. The stories of innovation are inspiring. It is a testament to Africans bending the little they have to their will, using creativity to overcome life’s challenges.</p></blockquote>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2009/04/02/73-where-the-world-sees-junk-africa-recycles-maker-faire-africa</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2009/01/05/72-genetic-algorithm-in-python-to-generate-file-converters">
  <title>Genetic Algorithm in Python to Generate File Converters</title>
  <link>http://feedproxy.google.com/~r/biais/~3/IUsiAbU6gaM/72-genetic-algorithm-in-python-to-generate-file-converters</link>
  <dc:date>2009-01-05T16:51:25+01:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Python</dc:subject>
  <description>In an old post I wrote about a metaheuristic: particle swarm optimization (PSO). It was the simplest code that demonstrates what PSO looks like. Today I'm writing about genetic algorithms, another metaheuristic, inspired by the evolution of species. You can read the description of the algorithm on the...</description>
  <content:encoded><![CDATA[ <p><img src="/blog/images/chromosome2.jpg" alt="" /></p>


<p>In an old post I wrote about a metaheuristic: <a href="http://www.biais.org/blog/index.php/2007/01/14/13-metaheuristic-particle-swarm-optimization-pso-in-python">particle swarm optimization (PSO)</a>. It was the simplest code that demonstrates what PSO looks like. Today I'm writing about genetic algorithms, another metaheuristic, inspired by the evolution of species. You can read the description of the algorithm in the <a href="http://en.wikipedia.org/wiki/Genetic_algorithm">genetic algorithm article</a> on Wikipedia.</p>


<p>I applied the algorithm to a problem that is not really one: helping lazy programmers write file converters. I had to write file converters to unify (more or less) formatted input files into one kind of <a href="http://en.wikipedia.org/wiki/Comma-separated_values">CSV file</a>. Each of these converters is built from the right combination of filtering, regexp matching and line splitting.</p>


<p>So I wrote a program based on a genetic algorithm where an individual is composed of 3 "genes":</p>
<ul>
<li>a list of filters (remove consecutive spaces, replace tabs by spaces, ...)</li>
<li>a list of basic regexps (int, float, date, line separator, ...)</li>
<li>a list of cleaning functions (remove empty cell, merge cells, ...)</li>
</ul>

<p>The fitness of an individual is calculated by running a sample file through its genes (filters / regexps / cleaning functions). The result of this process is a CSV file that can be compared to the expected result for the sample (which I wrote manually). The comparison uses the SequenceMatcher class of Python's standard difflib module and returns the fitness as a float. A fitness close to 1.0 means the result is close to the expected CSV output. When the fitness is exactly 1.0, the converter works perfectly for the sample.</p>
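<p>A minimal sketch of that comparison, assuming the fitness is simply the ratio reported by <code>difflib.SequenceMatcher</code> on the produced and expected CSV text. The function name <code>fitness</code> and the sample strings are illustrative; ga.py may compute the score differently.</p>

```python
import difflib


def fitness(produced: str, expected: str) -> float:
    """Similarity between the produced CSV text and the expected CSV text.

    Assumption: the score is SequenceMatcher.ratio(), a float in [0.0, 1.0].
    """
    return difflib.SequenceMatcher(None, produced, expected).ratio()


# One line of the expected output, taken from the sample further down.
expected = '"08/12","something else","","-12,96"\n'

# A converter that reproduces the expected text exactly.
perfect = fitness(expected, expected)

# A converter that only cleaned up the spaces: similar, but not identical.
partial = fitness('08/12 something else - 12,96\n', expected)
```

<p>Only an exact match reaches 1.0; every other candidate gets a score strictly between 0.0 and 1.0 that the algorithm can rank.</p>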



<p>Here is a sample file I wanted to convert:</p>
<pre>
08/12   	 Hello world 06/12
Paris 121231231 	  	- 22,29
08/12 	Something something ... 04/12
1111 1111 1111 1111 	  	- 14,35
08/12 	something else
	  	- 12,96
26/11 	Vir AAAAA
AAAAAA 2008	  	264,51
</pre>


<p>into this CSV file:</p>
<pre>
&quot;08/12&quot;,&quot;Hello world 06/12&quot;,&quot;Paris 121231231&quot;,&quot;-22,29&quot;
&quot;08/12&quot;,&quot;Something something ... 04/12&quot;,&quot;1111 1111 1111 1111&quot;,&quot;-14,35&quot;
&quot;08/12&quot;,&quot;something else&quot;,&quot;&quot;,&quot;-12,96&quot;
&quot;26/11&quot;,&quot;Vir AAAAA&quot;,&quot;AAAAAA 2008&quot;,&quot;264,51&quot;
</pre>



<p>For each generation, the program prints the 5 best individuals (I'm using elitism, so the best individual is always kept from one generation to the next). A sample run:</p>
<pre>
$ python ga.py tests/sample1.txt tests/sample1.expected result1.pickled
Generation: 0 (mutation rate=10)
0.53772
0.42507
0.30406
0.28598
0.26389
Generation: 1 (mutation rate=10)
0.53772
0.42507
0.32684
0.30406
0.26799

...

Generation: 271 (mutation rate=15)
1.0
0.96025
0.95107
0.91779
0.87886
&quot;08/12&quot;,&quot;Hello world 06/12&quot;,&quot;Paris 121231231&quot;,&quot;-22,29&quot;
&quot;08/12&quot;,&quot;Something something ... 04/12&quot;,&quot;1111 1111 1111 1111&quot;,&quot;-14,35&quot;
&quot;08/12&quot;,&quot;something else&quot;,&quot;&quot;,&quot;-12,96&quot;
&quot;26/11&quot;,&quot;Vir AAAAA&quot;,&quot;AAAAAA 2008&quot;,&quot;264,51&quot;

filters: str_remove_consecutive_spaces, str_remove_somespaces, str_strip, str_remove_consecutive_spaces
regex: ([0-9]{2}/[0-9]{2})(.+?)(
)(.+?)(-?[0-9]+[.,]+[0-9]+)
cleaners: clean_strip, _in, _in, _in
</pre>
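<p>The elitism mentioned above can be sketched as follows. This is not the code from ga.py: the function <code>next_generation</code>, the choice to breed from the better half of the population, and the toy bit-list problem are all assumptions made for illustration, and the mutation here always flips one bit instead of using a mutation rate.</p>

```python
import random


def next_generation(population, fitness, crossover, mutate, rng):
    """Return a new population of the same size that always keeps the best individual."""
    ranked = sorted(population, key=fitness, reverse=True)
    elite = ranked[0]  # elitism: the best individual survives unchanged
    parents = ranked[:max(2, len(ranked) // 2)]  # assumption: breed from the better half
    children = [elite]
    while len(children) < len(population):
        a, b = rng.sample(parents, 2)
        children.append(mutate(crossover(a, b, rng), rng))
    return children


# Toy problem (not the file-converter one): maximise the share of 1s in a bit list.
def toy_fitness(bits):
    return sum(bits) / len(bits)


def toy_crossover(a, b, rng):
    cut = rng.randrange(1, len(a))  # single-point crossover
    return a[:cut] + b[cut:]


def toy_mutate(bits, rng):
    i = rng.randrange(len(bits))  # flip exactly one randomly chosen bit
    flipped = 1 - bits[i]
    return bits[:i] + [flipped] + bits[i + 1:]


rng = random.Random(0)
population = [[rng.randint(0, 1) for _ in range(8)] for _ in range(6)]
best_scores = []
for _ in range(20):
    best_scores.append(max(toy_fitness(ind) for ind in population))
    population = next_generation(population, toy_fitness, toy_crossover, toy_mutate, rng)
```

<p>Because the elite is copied over untouched, the best fitness recorded in <code>best_scores</code> can never decrease from one generation to the next, which matches the first line of each generation in the run above.</p>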



<h5>Conclusion</h5>


<p>The good:</p>
<ul>
<li>It's fun to see your program evolve ;)</li>
<li>If you are lazy, you don't have to write _many_ file converters of this kind</li>
<li>With a good interface, and if the basic functions and gene sizes are well defined, a non-programmer can create their own parser (I think this is the only argument for using it).</li>
</ul>

<p>The bad:</p>
<ul>
<li>The resulting parsers are not optimal</li>
</ul>

<p>The ugly:</p>
<ul>
<li>Of course it's quite easy to write such genes by hand</li>
<li>It may never converge to 1.0, and generated parsers with fitness != 1.0 are useless (this may not be the case for other kinds of GA applications)</li>
<li>You need many well-chosen primitives to cover a wide set of solutions</li>
</ul>

<h5>Download</h5>

<p><a href="http://www.biais.org/blog/data/gaparser-0.1.tar.bz2">gaparser-0.1.tar.bz2</a>: source code and examples.</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2009/01/05/72-genetic-algorithm-in-python-to-generate-file-converters</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/12/06/71-opencalais-semantic-analysis-web-service">
  <title>OpenCalais: Semantic Analysis Web Service</title>
  <link>http://feedproxy.google.com/~r/biais/~3/7ClmSH6iOoA/71-opencalais-semantic-analysis-web-service</link>
  <dc:date>2008-12-06T02:40:27+01:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>NLP</dc:subject>
  <description>OpenCalais is a free web service that can perform semantic analysis on any English text. It processes the text sent in your request and responds with extracted concepts and relationships. It's a great tool if you want to play with semantics and if you want to add some nice features to your website...</description>
  <content:encoded><![CDATA[ <p><img src="http://www.opencalais.com/files/calais_logo.png" alt="" /></p>


<p><a href="http://www.opencalais.com/" hreflang="en">OpenCalais</a> is a free web service that can perform semantic analysis on any English text. It processes the text sent in your request and responds with extracted concepts and relationships. It's a great tool if you want to play with semantics and if you want to add some nice features to your website / blog.</p>


<p>As an example, I sent the text of <a href="http://www.biais.org/blog/index.php/2008/09/17/69-ruby-for-a-python-programmer" hreflang="en">this small article about Ruby and Python</a>. Note: for readability, I kept only the interesting data from the response:</p>
<pre>[xml]
&lt;!-- 
Relations: 
ProgrammingLanguage: Python, Ruby
--&gt; 
&lt;rdf:RDF xmlns:rdf=&quot;http://www.w3.org/1999/02/22-rdf-syntax-ns#&quot; xmlns:c=&quot;http://s.opencalais.com/1/pred/&quot;&gt;
  &lt;rdf:Description rdf:about=&quot;...&quot;&gt;
    &lt;!-- ProgrammingLanguage: Python; --&gt; 
    &lt;c:detection&gt;[similarities and differences between Ruby and ]Python[ but I didn't find any idioms list in Ruby, so if]&lt;/c:detection&gt; 
    &lt;c:prefix&gt;similarities and differences between Ruby and&lt;/c:prefix&gt; 
    &lt;c:exact&gt;Python&lt;/c:exact&gt; 
    &lt;c:suffix&gt;but I didn't find any idioms list in Ruby, so if&lt;/c:suffix&gt; 
    &lt;c:relevance&gt;0.543&lt;/c:relevance&gt; 
  &lt;/rdf:Description&gt;

  &lt;rdf:Description rdf:about=&quot;...&quot;&gt;
    &lt;!-- ProgrammingLanguage: Ruby; --&gt; 
    &lt;c:detection&gt;[ list in Ruby, so if you know one or if you are a ]Ruby[ programmer, please post a]&lt;/c:detection&gt; 
    &lt;c:prefix&gt;list in Ruby, so if you know one or if you are a&lt;/c:prefix&gt; 
    &lt;c:exact&gt;Ruby&lt;/c:exact&gt; 
    &lt;c:suffix&gt;programmer, please post a&lt;/c:suffix&gt; 
    &lt;c:relevance&gt;0.386&lt;/c:relevance&gt; 
  &lt;/rdf:Description&gt;
&lt;/rdf:RDF&gt;
</pre>
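<p>A short sketch of how the detected entities could be read from a response like the trimmed one above, using Python's standard <code>xml.etree.ElementTree</code>. The helper name <code>extract_entities</code> is hypothetical, and the embedded XML is only the reduced excerpt shown in this post (with <code>rdf:about</code> left elided), not a complete OpenCalais response.</p>

```python
import xml.etree.ElementTree as ET

# Reduced excerpt of the response shown above; a real response contains more data.
RESPONSE = """\
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#" xmlns:c="http://s.opencalais.com/1/pred/">
  <rdf:Description rdf:about="...">
    <c:exact>Python</c:exact>
    <c:relevance>0.543</c:relevance>
  </rdf:Description>
  <rdf:Description rdf:about="...">
    <c:exact>Ruby</c:exact>
    <c:relevance>0.386</c:relevance>
  </rdf:Description>
</rdf:RDF>
"""

# Prefix-to-URI mapping copied from the xmlns declarations of the response.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "c": "http://s.opencalais.com/1/pred/",
}


def extract_entities(xml_text):
    """Return (exact text, relevance) pairs for every rdf:Description that has both."""
    root = ET.fromstring(xml_text)
    entities = []
    for description in root.findall("rdf:Description", NS):
        exact = description.findtext("c:exact", namespaces=NS)
        relevance = description.findtext("c:relevance", namespaces=NS)
        if exact is not None and relevance is not None:
            entities.append((exact, float(relevance)))
    return entities


entities = extract_entities(RESPONSE)
```

<p>With the excerpt above, <code>entities</code> holds one pair per detected language, each with its relevance score.</p>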


<p>The analyzed text is quite small but the results seem OK: 2 programming languages detected here, no animal, no gemstone...</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/12/06/71-opencalais-semantic-analysis-web-service</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/12/05/70-dbpedia-32-including-dbpedia-ontology">
  <title>DBPedia 3.2 Including DBpedia Ontology</title>
  <link>http://feedproxy.google.com/~r/biais/~3/l-U5i7Ja1X0/70-dbpedia-32-including-dbpedia-ontology</link>
  <dc:date>2008-12-05T00:07:46+01:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>NLP</dc:subject>
  <description>If you like semantics or if you work on NLP projects, you should already know DBPedia: a database plus a set of tools that allow you to run sophisticated queries against Wikipedia. A few days ago, DBPedia 3.2 was released, and it now includes the DBpedia Ontology, a manually created cross-domain...</description>
  <content:encoded><![CDATA[ <p><img src="http://wiki.dbpedia.org/images/dbpedia_logo.png" alt="" /></p>


<p>If you like semantics or if you work on NLP projects, you should already know <a href="http://dbpedia.org" hreflang="en">DBPedia</a>: a database plus a set of tools that allow you to run sophisticated queries against Wikipedia. A few days ago, DBPedia 3.2 was released, and it now includes the <a href="http://wiki.dbpedia.org/Ontology" hreflang="en">DBpedia Ontology</a>, a manually created cross-domain ontology based on the most commonly used infoboxes within Wikipedia.</p>


<p>Read more in the <a href="http://blog.dbpedia.org/2008/11/17/dbpedia-version-32-released-including-the-new-dbpedia-ontology/" hreflang="en">official announcement</a>.</p>


<p>Note: I'm a bit late with this news; I will try to update the blog more often.</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/12/05/70-dbpedia-32-including-dbpedia-ontology</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/09/17/69-ruby-for-a-python-programmer">
  <title>Ruby For a Python Programmer</title>
  <link>http://feedproxy.google.com/~r/biais/~3/kQSejDQ2va0/69-ruby-for-a-python-programmer</link>
  <dc:date>2008-09-17T14:54:40+02:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Python</dc:subject>
  <description>I'm looking for a website comparing Ruby and Python idioms. I'm a Python programmer and I always use the same idioms to write programs (list and dict comprehension, loops on slices, ...). I found a good resource that describes some similarities and differences between Ruby and Python but I didn't...</description>
  <content:encoded><![CDATA[ <p>I'm looking for a website comparing Ruby and Python idioms. I'm a Python programmer and I always use the same idioms to write programs (list and dict comprehension, loops on slices, ...). I found a good resource that describes some <a href="http://www.ruby-lang.org/en/documentation/ruby-from-other-languages/to-ruby-from-python/">similarities and differences between Ruby and Python</a> but I didn't find any idioms list in Ruby, so if you know one or if you are a Ruby programmer, please post a comment.</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/09/17/69-ruby-for-a-python-programmer</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/08/29/68-evol-ution-darwin-s-graffiti">
  <title>EVOL-ution : Darwin's graffiti</title>
  <link>http://feedproxy.google.com/~r/biais/~3/sbJi4YfU-Uo/68-evol-ution-darwin-s-graffiti</link>
  <dc:date>2008-08-29T11:45:03+02:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Art</dc:subject>
  <description>Brilliant idea, thanks to Kriebel...</description>
  <content:encoded><![CDATA[ <p><img src="/blog/images/darwin_love.jpg" alt="" /></p>


<p>Brilliant idea, thanks to <a href="http://www.flickr.com/groups/734787@N20/" hreflang="en">Kriebel</a></p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/08/29/68-evol-ution-darwin-s-graffiti</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/06/13/67-russian-word-stress-dictionary">
  <title>Russian Word Stress Dictionary</title>
  <link>http://feedproxy.google.com/~r/biais/~3/2YQXpgDjFIQ/67-russian-word-stress-dictionary</link>
  <dc:date>2008-06-13T00:59:00+02:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Misc</dc:subject>
  <description>I have been trying to learn Russian for a few weeks. I was looking on the Internet for a Russian-English dictionary with stress marked on Russian words, because this is the only tool I needed to learn spoken Russian by myself. I found the eSpeak Project, which worked on a Russian dictionary with stress...</description>
  <content:encoded><![CDATA[ <p><a href="http://www.biais.org/russian-stress/" hreflang="en"><img src="/blog/images/russian_stress_screen.png" alt="" /></a></p>


<p>I have been trying to learn Russian for a few weeks. I was looking on the Internet for a Russian-English dictionary with <a href="http://en.wikipedia.org/wiki/Stress_(linguistics)" hreflang="en">stress</a> marked on Russian words, because this is the only tool I needed to learn spoken Russian by myself. I found the <a href="http://espeak.sourceforge.net/" hreflang="en">eSpeak Project</a>, which worked on a <a href="http://espeak.sourceforge.net/data/russian_data.zip" hreflang="en">Russian dictionary</a> with the stress associated with each word. It's great, but it's annoying to look for a word in a big text file... That's why I wrote a small django+jquery frontend to query the dictionary easily. Also, I don't have a Russian keyboard, so I added a small transliteration tool to the interface.</p>


<p>You can access it here: <a href="http://www.biais.org/russian-stress/" hreflang="en">http://www.biais.org/russian-stress/</a></p>


<p><strong>EDIT: It doesn't work in IE, <a href="http://www.getfirefox.com" hreflang="en">Get Firefox</a></strong></p>


<p><strong>EDIT2: It's now working in IE, anyhow <a href="http://www.getfirefox.com" hreflang="en">Get Firefox</a></strong></p>


<p>Note: The dictionary is not perfect but it contains about 220000 entries.</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/06/13/67-russian-word-stress-dictionary</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/05/26/66-screened-emacs-launcher">
  <title>Screened Emacs Launcher</title>
  <link>http://feedproxy.google.com/~r/biais/~3/l43p7fikVKc/66-screened-emacs-launcher</link>
  <dc:date>2008-05-26T21:26:59+02:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Misc</dc:subject>
  <description>I'm used to running emacs from my shell, and my mind is not able to switch from the command emacs to emacsclient when I already have an open window. This is why I wrote this simple shell script that:

run emacs (and force server-start) in detached screen with a particular id (emax) if this screen doesn't...</description>
  <content:encoded><![CDATA[ <p>I'm used to running <code>emacs</code> from my shell, and my mind is not able to switch from the command <code>emacs</code> to <code>emacsclient</code> when I already have an open window. This is why I wrote this simple shell script that:</p>
<ul>
<li>runs <code>emacs</code> (and forces server-start) in a detached <code>screen</code> with a particular id (<code>emax</code>) if this <code>screen</code> doesn't already exist</li>
<li>otherwise runs <code>emacsclient</code> (with the -n option: don't wait for the server to return)</li>
</ul>
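<pre>[shell]
#!/bin/bash

# Look for an existing screen session named &quot;emax&quot;
if ! screen -list | grep -q emax; then
    # No session yet: start emacs (with its server) in a detached screen
    echo &quot;screening -- emacs $*&quot;
    screen -S emax -d -m emacs -f 'server-start' &quot;$@&quot;
else
    # Session found: open the files in the running server without waiting
    echo &quot;connect to emacs server and detach -- emacs $*&quot;
    emacsclient -n &quot;$@&quot;
fi
</pre>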


<p>I prefer to get a separate emacs instance when I'm writing mail because I can focus on it. If you want a special case like this, use this script instead:</p>

<pre>[shell]
#!/bin/bash

# Special case for mutt mail editing: keep emacs attached to the terminal
if [[ &quot;$1&quot; =~ &quot;/tmp/mutt&quot; ]]; then
    echo &quot;attached&quot;
    detach=0
else
    echo &quot;detached&quot;
    detach=1
fi

if ! screen -list | grep -q emax; then
    # No emacs server is running in a screen session yet
    if [ &quot;$detach&quot; -eq 1 ]; then
        echo &quot;screening -- emacs $*&quot;
        screen -S emax -d -m emacs -f 'server-start' &quot;$@&quot;
    else
        echo &quot;normal mode -- emacs $*&quot;
        emacs -f 'mail-mode' &quot;$@&quot;
    fi
else
    # An emacs server is already running: connect to it
    if [ &quot;$detach&quot; -eq 1 ]; then
        echo &quot;connect to emacs server and detach -- emacs $*&quot;
        emacsclient -n &quot;$@&quot;
    else
        echo &quot;connect to emacs server -- emacs $*&quot;
        emacsclient &quot;$@&quot;
    fi
fi
</pre>
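<p>For readers who want to check the branching without starting emacs, here is the same decision table restated as a small pure Python function. It is only a sketch: the name <code>choose_command</code> is hypothetical, and it takes the text printed by <code>screen -list</code> as an argument instead of running it.</p>

```python
def choose_command(screen_list_output, args, session="emax"):
    """Return the argv the launcher script would run for the given situation.

    screen_list_output: text printed by `screen -list` (passed in, not executed here).
    args: the file arguments given to the launcher.
    """
    # Same test as the script: the first argument contains "/tmp/mutt".
    editing_mail = bool(args) and "/tmp/mutt" in args[0]
    # Same test as `grep emax`: a plain substring match on the session list.
    server_running = session in screen_list_output

    if not server_running:
        if editing_mail:
            return ["emacs", "-f", "mail-mode", *args]
        return ["screen", "-S", session, "-d", "-m",
                "emacs", "-f", "server-start", *args]
    if editing_mail:
        return ["emacsclient", *args]
    return ["emacsclient", "-n", *args]


first_launch = choose_command("No Sockets found.", ["notes.txt"])
later_launch = choose_command("1234.emax (Detached)", ["notes.txt"])
```

<p>Here <code>first_launch</code> is the <code>screen</code> command line that starts the server, and <code>later_launch</code> is the <code>emacsclient -n</code> call.</p>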


<p>Note: I also set a zsh alias from <code>emacs</code> to this script.</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/05/26/66-screened-emacs-launcher</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/04/05/64-machine-translation-techniques-and-open-source">
  <title>Machine Translation Techniques and Open Source</title>
  <link>http://feedproxy.google.com/~r/biais/~3/06PLT51rKQ8/64-machine-translation-techniques-and-open-source</link>
  <dc:date>2008-04-05T20:24:08+02:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>NLP</dc:subject>
  <description>Today there are two main approaches to Machine Translation (MT):


Rule-based MT (used by a number of companies working in the domain: Systran, Reverso, etc.). The only open source software I know that works with this approach is Apertium.


Statistical MT (used by Google and Language Weaver)....</description>
  <content:encoded><![CDATA[ <p>Today there are two main approaches to Machine Translation (MT):</p>

<ul>
<li>Rule-based MT (used by a number of companies working in the domain: Systran, <a href="http://www.reverso.net">Reverso</a>, etc.). The only open source software I know that works with this approach is <a href="http://xixona.dlsi.ua.es/apertium-www/">Apertium</a>.</li>
<li>Statistical MT (used by Google and <a href="http://www.languageweaver.com/">Language Weaver</a>). <a href="http://www.statmt.org/moses/">Moses</a> is an open source implementation of this approach. The learning process also relies on other open source tools (for example, <a href="http://code.google.com/p/giza-pp/">giza++</a> is an open source word aligner that Moses needs to prepare the corpus).</li>
</ul>

<h5>Pros and cons of rule-based machine translation</h5>

<ul>
<li>It needs rules, dictionaries (general and contextual) and people with the know-how (linguists) to write these rules and fill the dictionaries</li>
<li>Translation costs (CPU and memory) are fairly low</li>
</ul>

<h5>Pros and cons of statistical machine translation</h5>

<ul>
<li>It needs a big bilingual corpus and computing resources to run the learning process</li>
<li>The bilingual corpus has to be clean (automatic preprocessing and human checking)</li>
<li>Translation costs are heavy</li>
<li>You can translate between any pair of languages for which you have a corpus</li>
</ul>

<h5>Resources:</h5>
<ul>
<li><a href="http://xixona.dlsi.ua.es/wiki/index.php/Main_Page">Apertium wiki</a>, a great wiki about Apertium and also about other open source tools (word aligners, ...)</li>
<li><a href="http://urd.let.rug.nl/tiedeman/OPUS/">OPUS</a>, an open source parallel corpus (mixing different sources)</li>
<li><a href="http://del.icio.us/maxme/mt">My del.icio.us bookmarks on machine translation</a></li>
</ul>

<p>Notes: there are other, less used techniques: word-to-word substitution (<a href="http://linguaphile.sourceforge.net/">Linguaphile</a>) and example-based translation (I didn't find an open source implementation of this one). Of course, you can also imagine mixed techniques.</p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/04/05/64-machine-translation-techniques-and-open-source</feedburner:origLink></item>
<item rdf:about="http://www.biais.org/blog/index.php/2008/03/24/63-ack-a-better-grep-for-programmers">
  <title>ack: a better grep for programmers</title>
  <link>http://feedproxy.google.com/~r/biais/~3/vCCs33Q3hkc/63-ack-a-better-grep-for-programmers</link>
  <dc:date>2008-03-24T17:56:06+01:00</dc:date>
  <dc:language>en</dc:language>
  <dc:creator>Maxime Biais</dc:creator>
  <dc:subject>Misc</dc:subject>
  <description>ack is a grep-like tool for programmers. I used to run grep -R and find ... -exec grep to search for something in my code or in other people's code. But since I found ack, I have definitely switched to it when I code. ack website.


My favourite features:

Color highlighting of search results
Searches recursively...</description>
  <content:encoded><![CDATA[ <p><a href="http://petdance.com/ack/">ack</a> is a grep-like tool for programmers. I used to run <code>grep -R</code> and <code>find ... -exec grep</code> to search for something in my code or in other people's code. But since I found ack, I have definitely switched to it when I code. <a href="http://petdance.com/ack/">ack website</a>.</p>


<p>My favourite features:</p>
<ul>
<li>Color highlighting of search results</li>
<li>Searches recursively through directories by default, while ignoring .svn, CVS and other VCS directories</li>
<li>Many command-line switches are the same as in GNU grep, so the transition is effortless</li>
</ul>

<p><a href="http://perlbuzz.com/mechanix/2008/03/ack-178-is-out.html" hreflang="en">ack 1.78 is out</a></p>]]></content:encoded>
<feedburner:origLink>http://www.biais.org/blog/index.php/2008/03/24/63-ack-a-better-grep-for-programmers</feedburner:origLink></item>

</rdf:RDF>
