<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>The Artificial Intelligence Cookbook</title>
	
	<link>http://blog.aicookbook.com</link>
	<description>Having some fun with Artificial Intelligence</description>
	<lastBuildDate>Mon, 20 Feb 2012 14:42:18 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/TheArtificialIntelligenceCookbook" /><feedburner:info uri="theartificialintelligencecookbook" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
		<title>All efforts to our new StrongSteam AI/Data mining service</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/qfI2KcO_S90/</link>
		<comments>http://blog.aicookbook.com/2012/02/all-efforts-to-our-new-strongsteam-aidata-mining-service/#comments</comments>
		<pubDate>Mon, 20 Feb 2012 14:42:18 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=147</guid>
		<description><![CDATA[Clearly there have been no new posts here for quite a while &#8211; I&#8217;ve switched my focus on to another AI project which includes financial backing. As of late 2011 I started work on StrongSteam with my co-founder of ShowMeDo (Kyran Dale). We&#8217;re building a web based API that makes it easy to find things [...]]]></description>
			<content:encoded><![CDATA[<p>Clearly there have been no new posts here for quite a while &#8211; I&#8217;ve switched my focus on to another AI project which includes financial backing.</p>
<p>As of late 2011 I started work on <a href="http://strongsteam.com/" onclick="pageTracker._trackPageview('/outgoing/strongsteam.com/?referer=');">StrongSteam</a> with my co-founder of <a href="http://showmedo.com/" onclick="pageTracker._trackPageview('/outgoing/showmedo.com/?referer=');">ShowMeDo</a> (Kyran Dale). We&#8217;re building a web based API that makes it easy to find things in images &#8211; things like text, objects and people. Our first APIs will focus on text and object matching.</p>
<p>All going well I&#8217;ll be demonstrating some of these ideas at PyCon 2012 in Santa Clara and our first iPhone app is in production, slated for release in April. The iPhone app will read Latin plant labels at botanical gardens and give you information from WikiPedia, GeoSpecies and BBC:Wildlife along with pictures and maybe video. We want to expand this app to work at museums too.</p>
<p>I&#8217;ll do another post here once we&#8217;re live at StrongSteam and then this Cookbook will enter maintenance mode.</p>
<img src="http://feeds.feedburner.com/~r/TheArtificialIntelligenceCookbook/~4/qfI2KcO_S90" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2012/02/all-efforts-to-our-new-strongsteam-aidata-mining-service/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2012/02/all-efforts-to-our-new-strongsteam-aidata-mining-service/</feedburner:origLink></item>
		<item>
		<title>Review for Python Text Processing with NLTK 2.0 Cookbook (Packt, 2010)</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/HSmzTa34pMo/</link>
		<comments>http://blog.aicookbook.com/2011/01/review-for-python-text-processing-with-nltk-2-0-cookbook-packt-2010/#comments</comments>
		<pubDate>Sun, 30 Jan 2011 19:38:55 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=143</guid>
		<description><![CDATA[On my personal site I have a full review for Packt&#8217;s new Python Text Processing with NLTK 2.0 Cookbook (Packt, 2010). NLTK is the excellent Natural Language Toolkit. The book is a cookbook with a huge set of recipes for NLTK. It makes for a great companion for the original O&#8217;Reilly NLTK book. Rather than [...]]]></description>
			<content:encoded><![CDATA[<p>On my personal site I have a full review for Packt&#8217;s new <a href="http://ianozsvald.com/2011/01/30/review-for-python-text-processing-with-nltk-2-0-cookbook-packt-2010/" onclick="pageTracker._trackPageview('/outgoing/ianozsvald.com/2011/01/30/review-for-python-text-processing-with-nltk-2-0-cookbook-packt-2010/?referer=');">Python Text Processing with NLTK 2.0 Cookbook</a> (Packt, 2010). <a href="http://www.nltk.org/" onclick="pageTracker._trackPageview('/outgoing/www.nltk.org/?referer=');">NLTK</a> is the excellent Natural Language Toolkit.</p>
<p>The book is a cookbook with a huge set of recipes for NLTK. It makes for a great companion for the original O&#8217;Reilly NLTK book. Rather than repost it I&#8217;ll simply direct you back to my blog &#8211; my post has links for all the major topics and tools.</p>
<img src="http://feeds.feedburner.com/~r/TheArtificialIntelligenceCookbook/~4/HSmzTa34pMo" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2011/01/review-for-python-text-processing-with-nltk-2-0-cookbook-packt-2010/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2011/01/review-for-python-text-processing-with-nltk-2-0-cookbook-packt-2010/</feedburner:origLink></item>
		<item>
		<title>Installing openCV at WebFaction with Python2.5</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/qA50y8UiUEw/</link>
		<comments>http://blog.aicookbook.com/2010/12/installing-opencv-at-webfaction-with-python2-5/#comments</comments>
		<pubDate>Mon, 13 Dec 2010 22:50:14 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=138</guid>
		<description><![CDATA[Just to follow up on my earlier post about building openCV on a Mac (Leopard), here are some notes for installing openCV 2.1 at WebFaction to default to using Python 2.5 (rather than 2.7). The wiki guide to use cmake didn&#8217;t quite work, I went ahead with ccmake (the command-line GUI equivalent): ccmake -D CMAKE_BUILD_TYPE=RELEASE [...]]]></description>
			<content:encoded><![CDATA[<p>Just to follow up on my earlier post about building openCV on a <a href="http://blog.aicookbook.com/2010/06/installing-opencv-python-leopard-mac/">Mac (Leopard)</a>, here are some notes for installing openCV 2.1 at <a href="http://www.webfaction.com/signup?affiliate=ianozsvald" onclick="pageTracker._trackPageview('/outgoing/www.webfaction.com/signup?affiliate=ianozsvald&amp;referer=');">WebFaction</a> to default to using Python 2.5 (rather than 2.7).</p>
<p>The <a href="http://opencv.willowgarage.com/wiki/InstallGuide" onclick="pageTracker._trackPageview('/outgoing/opencv.willowgarage.com/wiki/InstallGuide?referer=');">wiki guide</a> to using cmake didn&#8217;t quite work, so I went ahead with ccmake (the command-line GUI equivalent):</p>
<blockquote><p>ccmake -D CMAKE_BUILD_TYPE=RELEASE -D CMAKE_INSTALL_PREFIX=/home/ianozsvald -D BUILD_PYTHON_SUPPORT=ON ..</p></blockquote>
<p>Next I moved through the menu to set:</p>
<ul>
<li>BUILD_EXAMPLES ON</li>
<li>INSTALL_C_EXAMPLES ON</li>
<li>INSTALL_PYTHON_EXAMPLES ON</li>
</ul>
<p>Next I used &#8216;t&#8217; to turn on advanced options and configured:</p>
<ul>
<li> PYTHON_EXECUTABLE=/usr/local/bin/python2.5</li>
<li> PYTHON_INCLUDE_DIR=/usr/local/include/python2.5</li>
<li> PYTHON_LIBRARY=/usr/local/lib/libpython2.5.so</li>
</ul>
<p>To finish I used &#8216;c&#8217; to create settings and &#8216;g&#8217; to generate setup files. Now &#8216;make&#8217; and &#8216;make install&#8217; did their job (it took 10 minutes to compile). In &#8216;~/bin&#8217; I saw &#8216;opencv_createsamples&#8217; which shows that openCV built ok.</p>
<p>Finally I copied the build directory&#8217;s lib/cv.so to &#8216;~/lib/&#8217; and then in python2.5 I tried &#8216;import cv&#8217; and it worked. In the build directory&#8217;s &#8216;~/bin&#8217; I ran &#8216;./cxcoretest&#8217; and all tests passed.</p>
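<p>As a quick check after copying cv.so, a short script can confirm that the interpreter you care about really sees the bindings. This is only a sketch: module_available is a hypothetical helper name, and it uses the built-in __import__ so that it also runs on older interpreters such as Python 2.5.</p>

```python
def module_available(name):
    """Return True if the named module can be imported, False otherwise."""
    try:
        __import__(name)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # 'cv' is the module name imported in the post above
    for name in ("cv",):
        status = "ok" if module_available(name) else "missing"
        print("%s: %s" % (name, status))
```

<p>Run it with the same python2.5 binary that ccmake was pointed at; if it reports &#8216;missing&#8217;, the directory holding cv.so is probably not on that interpreter&#8217;s module search path.</p>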
<img src="http://feeds.feedburner.com/~r/TheArtificialIntelligenceCookbook/~4/qA50y8UiUEw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/12/installing-opencv-at-webfaction-with-python2-5/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/12/installing-opencv-at-webfaction-with-python2-5/</feedburner:origLink></item>
		<item>
		<title>Automatic plaque transcription (py+tesseract) average error down to 33.4</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/2udYepgT3Uw/</link>
		<comments>http://blog.aicookbook.com/2010/08/automatic-plaque-transcription-pytesseract-average-error-down-to-33-4/#comments</comments>
		<pubDate>Wed, 25 Aug 2010 12:26:35 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=134</guid>
		<description><![CDATA[A month back I posted about my progress with the automatic English Heritage plaque transcription project using Optical Character Recognition (using tesseract) and Python for OpenPlaques. The post mentions a monthly cash prize for progress towards a solution&#8230; A few days back Jonathan Street announced his entry in the challenge&#8217;s thread &#8211; he&#8217;d beaten my [...]]]></description>
			<content:encoded><![CDATA[<p>A month back I posted about <a href="http://blog.aicookbook.com/2010/07/automatic-plaque-transcription-using-python-work-in-progress/">my progress</a> with the automatic English Heritage plaque transcription project using Optical Character Recognition (using tesseract) and Python for <a href="http://openplaques.org/" onclick="pageTracker._trackPageview('/outgoing/openplaques.org/?referer=');">OpenPlaques</a>. The post mentions a monthly cash prize for progress towards a solution&#8230;</p>
<p>A few days back <a href="http://jonathanstreet.com" onclick="pageTracker._trackPageview('/outgoing/jonathanstreet.com?referer=');">Jonathan Street</a> announced his entry in the <a href="http://groups.google.com/group/aicookbook/browse_thread/thread/4853cefff5d70231" onclick="pageTracker._trackPageview('/outgoing/groups.google.com/group/aicookbook/browse_thread/thread/4853cefff5d70231?referer=');">challenge&#8217;s thread</a> &#8211; he&#8217;d beaten my initial average error of 709 (it was in part designed to be easy to beat!) and quickly brought it down to 33.4. Jonathan becomes the winner of the A.I.Cookbook&#8217;s first challenge. The challenge now rolls on to this month and the same prize is offered.</p>
<p>I&#8217;ll be presenting the results at the <a href="http://blog.openplaques.org/2010/08/registration-open-for-open-day" onclick="pageTracker._trackPageview('/outgoing/blog.openplaques.org/2010/08/registration-open-for-open-day?referer=');">open day</a> for the Open Plaques project (sponsored by the Royal Society of the Arts) on 25th September and I hope to be able to demonstrate that, for a few plaques at least, we can automatically get a good transcription.</p>
<p>In Jonathan&#8217;s <a href="http://jonathanstreet.com/blog/ai-cookbook-competition" onclick="pageTracker._trackPageview('/outgoing/jonathanstreet.com/blog/ai-cookbook-competition?referer=');">write-up</a> he describes the main steps and includes full Python source:</p>
<ul>
<li>image pre-processing to find the blue regions</li>
<li>restricting tesseract&#8217;s character set</li>
<li>spell checking</li>
<li>word clean-up (to fix things like dates)</li>
</ul>
<p>He&#8217;s taken some of the ideas I listed in the <a href="http://aicookbook.com/wiki/Automatic_plaque_transcription" onclick="pageTracker._trackPageview('/outgoing/aicookbook.com/wiki/Automatic_plaque_transcription?referer=');">wiki</a> and taken them further &#8211; I&#8217;m particularly happy with the blue region detection as that felt like an obvious first step that I hadn&#8217;t attempted.</p>
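<p>To give a feel for what blue region detection involves, here is a minimal sketch in plain Python. It is not Jonathan&#8217;s code (see his write-up for that). The names is_blue and blue_bounding_box and the margin of 40 are assumptions chosen for illustration, and a real plaque photo would need tuning and probably an imaging library.</p>

```python
def is_blue(r, g, b, margin=40):
    """True when the blue channel dominates red and green by 'margin'."""
    return b - max(r, g) >= margin


def blue_bounding_box(rows):
    """Return (left, top, right, bottom) of the blue pixels, or None.

    'rows' is a list of image rows, each row a list of (r, g, b) tuples.
    """
    xs = []
    ys = []
    for y, row in enumerate(rows):
        for x, (r, g, b) in enumerate(row):
            if is_blue(r, g, b):
                xs.append(x)
                ys.append(y)
    if not xs:
        return None  # no blue pixels found at all
    return (min(xs), min(ys), max(xs), max(ys))
```

<p>Cropping to the returned box before calling tesseract removes most of the brickwork and sky, leaving the OCR step with far less clutter to misread.</p>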
<p>In the <a href="http://groups.google.com/group/aicookbook/browse_thread/thread/4853cefff5d70231" onclick="pageTracker._trackPageview('/outgoing/groups.google.com/group/aicookbook/browse_thread/thread/4853cefff5d70231?referer=');">thread</a> there&#8217;s also a note by Andrew Elwell &#8211; he&#8217;s <a href="http://blog.elwell.org.uk/2009/11/twittering-ocr.html" onclick="pageTracker._trackPageview('/outgoing/blog.elwell.org.uk/2009/11/twittering-ocr.html?referer=');">using OCR</a> to update <a href="http://twitter.com/lhcstatus" onclick="pageTracker._trackPageview('/outgoing/twitter.com/lhcstatus?referer=');">@lhcstatus</a> (for the Large Hadron Collider &#8211; with &gt;1,000 followers!) by screen scraping their graphical update screen.</p>
<p><a href="http://www.thecyberiad.net/" onclick="pageTracker._trackPageview('/outgoing/www.thecyberiad.net/?referer=');">David Rawlinson</a> also posted in the thread about some ideas taken from his experience with automatic number plate recognition (ANPR) so we can correct mis-recognised characters (e.g. 0O0 and 1lLiI are easily mis-recognised by OCR!).</p>
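<p>One simple way to apply that idea to plaque dates is to rewrite look-alike letters only inside tokens that otherwise consist of digits. The sketch below is an illustration under that assumption, not code from the thread; LOOKALIKES and fix_numeric_token are hypothetical names and the mapping is far from exhaustive.</p>

```python
# Letters that OCR commonly confuses with digits (illustrative, not exhaustive)
LOOKALIKES = {"O": "0", "o": "0", "l": "1", "I": "1", "i": "1", "L": "1"}


def fix_numeric_token(token):
    """Swap look-alike letters for digits when a token is otherwise numeric."""
    digits = sum(1 for ch in token if ch.isdigit())
    lookalikes = sum(1 for ch in token if ch in LOOKALIKES)
    # only rewrite tokens made up entirely of digits and look-alikes,
    # and containing at least one real digit
    if digits == 0 or digits + lookalikes != len(token):
        return token
    return "".join(LOOKALIKES.get(ch, ch) for ch in token)


def fix_line(line):
    """Apply fix_numeric_token to every whitespace-separated token."""
    return " ".join(fix_numeric_token(token) for token in line.split())
```

<p>Requiring at least one real digit keeps ordinary words such as &#8216;Oil&#8217; untouched, at the cost of missing a year in which every digit was mis-read.</p>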
<p>The competition runs on: the new deadline is Thursday September 23rd so I can present our progress on the 25th at the OpenPlaques event. If nobody beats Jonathan&#8217;s result by then, he becomes the winner by default. I&#8217;ll be adding more ideas for improving the result into the main <a href="http://aicookbook.com/wiki/Automatic_plaque_transcription" onclick="pageTracker._trackPageview('/outgoing/aicookbook.com/wiki/Automatic_plaque_transcription?referer=');">wiki page</a>. Join the <a href="http://groups.google.com/group/aicookbook" onclick="pageTracker._trackPageview('/outgoing/groups.google.com/group/aicookbook?referer=');">Google Group</a> if you&#8217;d like to offer ideas and get involved.</p>
<img src="http://feeds.feedburner.com/~r/TheArtificialIntelligenceCookbook/~4/2udYepgT3Uw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/08/automatic-plaque-transcription-pytesseract-average-error-down-to-33-4/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/08/automatic-plaque-transcription-pytesseract-average-error-down-to-33-4/</feedburner:origLink></item>
		<item>
		<title>Automatic Plaque Transcription using Python (work in progress)</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/V_kev9K-0hQ/</link>
		<comments>http://blog.aicookbook.com/2010/07/automatic-plaque-transcription-using-python-work-in-progress/#comments</comments>
		<pubDate>Sun, 04 Jul 2010 20:25:24 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=119</guid>
		<description><![CDATA[I&#8217;m working with the OpenPlaques folk to create a system that automatically &#8216;reads&#8217; images of English Heritage plaques and extracts a transcript of the plaque&#8217;s text. This is a classic optical character recognition project. Here&#8217;s a simple example (thanks Fiery Fred): The text is very easy for a human to read but very hard for [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;m working with the <a href="http://openplaques.org/" onclick="pageTracker._trackPageview('/outgoing/openplaques.org/?referer=');">OpenPlaques</a> folk to create a system that automatically &#8216;reads&#8217; images of English Heritage plaques and extracts a transcript of the plaque&#8217;s text. This is a classic optical character recognition project. Here&#8217;s a simple example (thanks <a href="http://www.flickr.com/photos/35118587@N06/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/35118587_N06/?referer=');">Fiery Fred</a>):</p>
<p><a href="http://www.flickr.com/photos/35118587@N06/4757423214/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/35118587_N06/4757423214/?referer=');"><img class="aligncenter" title="Mary Royce plaque" src="http://farm5.static.flickr.com/4115/4757423214_7ddff9550b_m.jpg" alt="" width="240" height="240" /></a>The text is very easy for a human to read but very hard for a computer to extract. Thankfully there&#8217;s a great open-source OCR system (<a href="http://code.google.com/p/tesseract-ocr" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/tesseract-ocr?referer=');">tesseract</a>) that was released by HP several years back.</p>
<p>The goal of this project is to automatically transcribe the text from several thousand plaques (taken using different cameras and phones in varying lighting conditions at various angles and distances) so that the human sysops don&#8217;t have to do the transcription work by hand!</p>
<p>Previously I&#8217;ve posted about my <a href="http://blog.aicookbook.com/2010/06/optical-character-recognition-webservice-work-in-progress/">work in progress</a> with a manual process. Now I&#8217;m building towards an automated solution; progress is outlined in the wiki as <a href="http://aicookbook.com/wiki/Automatic_plaque_transcription" onclick="pageTracker._trackPageview('/outgoing/aicookbook.com/wiki/Automatic_plaque_transcription?referer=');">Automatic Plaque Recognition</a>.</p>
<p>Currently there&#8217;s a Python demo file which retrieves three example plaques from flickr, passes them through tesseract and then uses the <a href="http://en.wikipedia.org/wiki/Levenshtein_distance" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Levenshtein_distance?referer=');">Levenshtein distance</a> as an error metric against a manually transcribed string.</p>
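<p>For readers who haven&#8217;t met the metric, here is a small self-contained sketch of the Levenshtein distance and of averaging it over several plaques. It is not the demo file itself; the names levenshtein and average_error are assumptions, and the demo may well use a library implementation instead.</p>

```python
def levenshtein(a, b):
    """Number of single-character edits needed to turn string a into b."""
    # previous[j] holds the distance between the processed prefix of a
    # and the first j characters of b
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        current = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def average_error(pairs):
    """Mean Levenshtein distance over (ocr_text, true_text) pairs."""
    distances = [levenshtein(ocr, truth) for ocr, truth in pairs]
    return sum(distances) / float(len(distances))
```

<p>A perfect transcription scores 0, and every wrong, missing or extra character adds 1, so a lower average means a better recognition system.</p>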
<p>Once the OpenPlaques team puts together a larger test and validation set I&#8217;ll set up a monthly challenge with a cash prize. The goal is to encourage entrants to write better recognition systems each month, up until we can run an automatic algorithm against the entire OpenPlaques corpus.</p>
<p>If you&#8217;re interested in getting involved please join the <a href="http://groups.google.com/group/aicookbook" onclick="pageTracker._trackPageview('/outgoing/groups.google.com/group/aicookbook?referer=');">A.I. Cookbook Google Group</a>.</p>
<p><strong>Next step:</strong></p>
<p>The following text will give you more detail on OCR techniques.</p>
<table>
<tr>
<td>
<iframe src="http://rcm-uk.amazon.co.uk/e/cm?lt1=_blank&#038;bc1=000000&#038;IS2=1&#038;bg1=FFFFFF&#038;fc1=000000&#038;lc1=0000FF&#038;t=entrepreneuri-21&#038;o=2&#038;p=8&#038;l=as1&#038;m=amazon&#038;f=ifr&#038;md=0M5A6TN3AXP2JHJBWT02&#038;asins=0615155111" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>
</td>
</tr>
</table>
<img src="http://feeds.feedburner.com/~r/TheArtificialIntelligenceCookbook/~4/V_kev9K-0hQ" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/07/automatic-plaque-transcription-using-python-work-in-progress/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/07/automatic-plaque-transcription-using-python-work-in-progress/</feedburner:origLink></item>
		<item>
		<title>Building a face-tracking robot (Headroid1) with Python in an afternoon</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/SBHWLQuZ8t4/</link>
		<comments>http://blog.aicookbook.com/2010/06/building-a-face-tracking-robot-headroid1-with-python-in-an-afternoon/#comments</comments>
		<pubDate>Sun, 27 Jun 2010 16:48:56 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=79</guid>
		<description><![CDATA[Here we&#8217;ll look at building Headroid1 in a few hours &#8211; a face tracking 2-axis robot head controlled by Python and open source modules. UPDATE see Headroid featured in The Gadget Show in the 3 minute video made at MakerFaire UK 2011. This is what the finished system will look like: An earlier demo was [...]]]></description>
			<content:encoded><![CDATA[<p>Here we&#8217;ll look at building Headroid1 in a few hours &#8211; a face tracking 2-axis robot head controlled by Python and open source modules. <strong>UPDATE</strong> see Headroid featured in <a href="http://fwd.channel5.com/gadget-show/videos/feature/maker-faire" onclick="pageTracker._trackPageview('/outgoing/fwd.channel5.com/gadget-show/videos/feature/maker-faire?referer=');">The Gadget Show</a> in the 3 minute video made at MakerFaire UK 2011.</p>
<p>This is what the finished system will look like:</p>
<p><a href="http://www.flickr.com/photos/54145418@N00/4738357670/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/54145418_N00/4738357670/?referer=');"><img class="aligncenter" title="Headroid1 finished system" src="http://farm5.static.flickr.com/4097/4738357670_d34f04a460_m.jpg" alt="" width="240" height="180" /></a></p>
<p>An earlier demo was presented on my blog as <a href="http://ianozsvald.com/2010/05/21/headroid1-a-face-tracking-robot-head/" onclick="pageTracker._trackPageview('/outgoing/ianozsvald.com/2010/05/21/headroid1-a-face-tracking-robot-head/?referer=');">Headroid1 &#8211; A Face Tracking Robot</a>, here&#8217;s a video demo:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="576" height="462" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/_9DXecQdJEY&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="576" height="462" src="http://www.youtube.com/v/_9DXecQdJEY&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p><strong>Requirements:</strong></p>
<ul>
<li>An afternoon with some tools and Python</li>
<li><a href="http://pyserial.sourceforge.net/" onclick="pageTracker._trackPageview('/outgoing/pyserial.sourceforge.net/?referer=');">pySerial</a>, <a href="http://opencv.willowgarage.com/" onclick="pageTracker._trackPageview('/outgoing/opencv.willowgarage.com/?referer=');">openCV</a> with Python wrappers</li>
<li>Webcam</li>
<li>2 <a href="http://www.botbuilder.co.uk/store/index.php?main_page=product_info&amp;cPath=3_24&amp;products_id=41" onclick="pageTracker._trackPageview('/outgoing/www.botbuilder.co.uk/store/index.php?main_page=product_info_amp_cPath=3_24_amp_products_id=41&amp;referer=');">servos</a> (if you want the head to move) and some brackets</li>
<li><a href="http://www.botbuilder.co.uk/ssb.html" onclick="pageTracker._trackPageview('/outgoing/www.botbuilder.co.uk/ssb.html?referer=');">Serial Servo Controller</a> or Arduino (if you want to control your servos)</li>
</ul>
<p><strong>First &#8211; let Python see faces using OpenCV:</strong></p>
<p>My earlier <a href="http://blog.aicookbook.com/2010/06/pyopencv-facedetect-py-demo-face-detection-with-python/">facedetect.py post</a> shows you how well facial detection works and includes links to get <a href="http://opencv.willowgarage.com/" onclick="pageTracker._trackPageview('/outgoing/opencv.willowgarage.com/?referer=');">OpenCV</a> (which includes the Python bindings). It&#8217;ll take 30 minutes to download and compile OpenCV. To get facial detection working just plug in a webcam and run:</p>
<pre class="brush: plain; title: ; notranslate">
cd OpenCV-2.1.0/samples/python
python facedetect.py 0 # pass in id for webcam - 0 is first webcam
</pre>
<p>Now you&#8217;ll have a red rectangle around your face as long as you&#8217;re looking roughly towards the webcam. First step complete!</p>
<p><strong>Second &#8211; figure out how far the face is from the centre of the screen</strong></p>
<p>Having found a face we now need to determine how far it is from the centre of the image. We edit detect_and_draw(&#8230;) in facedetect.py to add the following lines:</p>
<pre class="brush: python; title: ; notranslate">
centre = None
...
if faces:
    for ((x, y, w, h), n) in faces:
        # the input to cv.HaarDetectObjects was resized, so scale the
        # bounding box of each face and convert it to two CvPoints
        pt1 = (int(x * image_scale), int(y * image_scale))
        pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
        cv.Rectangle(img, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)
        # ADD THESE LINES BELOW
        # get the xy corner co-ords, calc the centre location
        x1 = pt1[0]
        x2 = pt2[0]
        y1 = pt1[1]
        y2 = pt2[1]
        centrex = x1+((x2-x1)/2)
        centrey = y1+((y2-y1)/2)
        centre = (centrex, centrey)
...
return centre
</pre>
<p>So as long as we find a face, we know where the centre of the facial rectangle is inside the webcam image. We&#8217;ll also know how big the webcam&#8217;s image is, so we know which quadrant of the webcam the face is in &#8211; and from here we&#8217;ll know which direction to move.</p>
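<p>The quadrant test can be sketched in a few lines. This is an illustration only: face_offset and quadrant are hypothetical names, and the frame size is passed in as a (width, height) pair rather than read from the webcam.</p>

```python
def face_offset(centre, frame_size):
    """Return (dx, dy): face centre minus frame centre, in pixels."""
    (cx, cy) = centre
    (width, height) = frame_size
    return (cx - width / 2.0, cy - height / 2.0)


def quadrant(centre, frame_size):
    """Name the quadrant of the frame that holds the face centre."""
    dx, dy = face_offset(centre, frame_size)
    horizontal = "right" if dx > 0 else "left"
    # image rows grow downwards, so a positive dy is the lower half
    vertical = "bottom" if dy > 0 else "top"
    return vertical + "-" + horizontal
```

<p>For a 640x480 frame, a face centred at (500, 100) sits in the top-right quadrant, so the head needs to turn towards that corner to bring the face back to the middle.</p>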
<p><strong>Third &#8211; how far should we move the servos?</strong></p>
<p>Next we want to know how far to move the x and y servo locations to bring the face closer to the centre of the webcam image. We can&#8217;t just change the position in one big jump &#8211; the whole assembly will rock due to the sudden movement and then the webcam&#8217;s image rocks creating odd oscillations. Instead we move in short, stable steps.</p>
<p>We&#8217;ll call the following routine twice, once for the x axis and once for the y axis. We&#8217;ll allow the x axis to move up to 4 degrees each iteration, the y axis can only move a maximum of 1 degree per iteration (mechanically one degree on the y axis is lots, 4 degrees on the x axis with my webcam isn&#8217;t much). These figures will need tuning depending on your setup.</p>
<pre class="brush: python; title: ; notranslate">
def get_delta(loc, span, max_delta, centre_tolerance):
    &quot;&quot;&quot;How far do we move on this axis to get the webcam
       centred on the face?
       loc is the face's centre for this axis
       span is the width or height for this axis
       max_delta is the max nbr of degrees to move on this axis
       centre_tolerance is the centre region where we don't allow movement
       &quot;&quot;&quot;
    framecentre = span/2
    delta = framecentre - loc
    if abs(delta) &lt; centre_tolerance: # within X pixels of the centre
        delta = 0 # so don't move - else we get weird oscillations
    else:
        # the x-axis is reversed so we must remember the sign
        is_neg = delta &lt;= 0
        to_get_near_centre = abs(delta) - centre_tolerance
        if to_get_near_centre &gt; 35:
            delta = max_delta # big movement allowed if we're far away
        else:
            delta = 1 # small movement if we're close to the centre
        if is_neg:
            delta = delta * -1
    return delta
</pre>
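<p>Once a delta has been calculated for each axis, it still has to be added to the current servo angle without driving the servo past its end stops. The sketch below shows one way to do that; apply_delta and step_head are hypothetical names, and the 0 to 180 degree range is an assumption that you should replace with the safe range for your own servos and brackets.</p>

```python
def apply_delta(angle, delta, min_angle=0, max_angle=180):
    """Return angle + delta, clamped to the servo's assumed travel range."""
    return max(min_angle, min(max_angle, angle + delta))


def step_head(angles, deltas):
    """Apply one (dx, dy) step to an (x_angle, y_angle) pair."""
    return (apply_delta(angles[0], deltas[0]),
            apply_delta(angles[1], deltas[1]))
```

<p>Calling step_head once per captured frame, with the deltas for the x and y axes, gives the short, stable steps described above.</p>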
<p><strong>Fourth &#8211; move the servos to re-centre the webcam using pySerial</strong></p>
<p>Finally we need to control our servos so they respond to the deltas we&#8217;ve calculated. I&#8217;m using the <a href="http://pyserial.sourceforge.net/" onclick="pageTracker._trackPageview('/outgoing/pyserial.sourceforge.net/?referer=');">pySerial</a> module and <a href="http://www.botbuilder.co.uk" onclick="pageTracker._trackPageview('/outgoing/www.botbuilder.co.uk?referer=');">BotBuilder</a>&#8216;s <a href="http://www.botbuilder.co.uk/ssb.html" onclick="pageTracker._trackPageview('/outgoing/www.botbuilder.co.uk/ssb.html?referer=');">Serial Servo Board</a>. The servo board is based on an Arduino &#8211; if you have an Arduino then these <a href="http://www.arduino.cc/playground/ComponentLib/Servo" onclick="pageTracker._trackPageview('/outgoing/www.arduino.cc/playground/ComponentLib/Servo?referer=');">servo</a> <a href="http://principialabs.com/arduino-serial-servo-control/" onclick="pageTracker._trackPageview('/outgoing/principialabs.com/arduino-serial-servo-control/?referer=');">links</a> will give you an easy equivalent (I&#8217;d love to see new code if you have a working Arduino solution!).</p>
<p>You&#8217;ll also need some brackets to mount your servos, see the end for purchase details to get an assembly like this (a USB-&gt;Serial cable is also shown to drive the serial board):</p>
<p><a href="http://www.flickr.com/photos/54145418@N00/4738186409/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/54145418_N00/4738186409/?referer=');"><img class="aligncenter" title="Two servos, brackets, Serial Servo Board" src="http://farm5.static.flickr.com/4139/4738186409_1a3a831571_m.jpg" alt="" width="240" height="180" /></a></p>
<p>This assembly doesn&#8217;t show the webcam &#8211; we&#8217;ll add it back shortly. To control the servo board we open a connection using:</p>
<pre class="brush: python; title: ; notranslate">
import serial
# /dev/cu.usbserial is the serial port on a Mac, it'll be COMx on Windows
# The Serial Servo Board uses 19200 baud
ser=serial.Serial(port='/dev/cu.usbserial',baudrate=19200,timeout=0)
ser.write('r') # send reset command
ser.read(100) # receive 'ready' string back
</pre>
<p>and to move the servos we simply specify the angle for the servo, e.g.:</p>
<pre class="brush: python; title: ; notranslate">
ser.write('20a') # send servo on connection A to 20 degrees
ser.write('40a40b') # move servos on connections A and B to 40 degrees
</pre>
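<p>If you are driving both axes it can be handy to build the command string from a dictionary of angles. This is a small sketch that assumes only the &#8216;angle followed by channel letter&#8217; format shown above; servo_command is a hypothetical helper, not part of pySerial or the board&#8217;s documentation.</p>

```python
def servo_command(angles):
    """Build a command string such as '40a40b' from {'a': 40, 'b': 40}."""
    parts = []
    for channel in sorted(angles):
        # the examples above use whole degrees, so round to an integer
        angle = int(round(angles[channel]))
        parts.append("%d%s" % (angle, channel))
    return "".join(parts)
```

<p>The result can be passed straight to ser.write(...). On Python 3, pySerial expects bytes, so send servo_command(...).encode('ascii') instead.</p>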
<p>To get an idea of how quickly the servos move watch this 30 second video:<br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="576" height="462" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/kcs41o48mg8&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="576" height="462" src="http://www.youtube.com/v/kcs41o48mg8&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Now we add the webcam using a bracket to complete the hardware:</p>
<p style="text-align: center;"><a href="http://www.flickr.com/photos/54145418@N00/4738397685/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/54145418_N00/4738397685/?referer=');"><img class=" aligncenter" title="Headroid's webcam and bracket" src="http://farm5.static.flickr.com/4093/4738397685_10b43ac827_m.jpg" alt="" width="240" height="180" /></a></p>
<p>Here&#8217;s the final head assembly:</p>
<p><a href="http://www.flickr.com/photos/54145418@N00/4737722683/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/54145418_N00/4737722683/?referer=');"><img class="aligncenter" title="Headroid1" src="http://farm5.static.flickr.com/4074/4737722683_9e7b32fe57_m.jpg" alt="" width="180" height="240" /></a></p>
<p>Down below you&#8217;ll find the complete source code.</p>
<p><strong>Questions?</strong> Join the <a href="http://groups.google.com/group/aicookbook" onclick="pageTracker._trackPageview('/outgoing/groups.google.com/group/aicookbook?referer=');">A.I. Cookbook&#8217;s Google Group</a> and see more details in the <a href="http://aicookbook.com/wiki/Headroid1" onclick="pageTracker._trackPageview('/outgoing/aicookbook.com/wiki/Headroid1?referer=');">Cookbook wiki</a>.</p>
<p><strong>Purchase?</strong> If you&#8217;re interested in buying a hardware kit then email: kits AT aicookbook.com. We don&#8217;t have kits yet but if there&#8217;s interest, we&#8217;ll put them together via <a href="http://botbuilder.co.uk/store/" onclick="pageTracker._trackPageview('/outgoing/botbuilder.co.uk/store/?referer=');">botbuilder.co.uk</a>.</p>
<p><strong>Moving forwards</strong></p>
<p>The OpenCV book is rather excellent. Everything is in C++, but the Python API is easy to figure out and the reference text makes it all clear.</p>
<p><strong>Next?</strong> I&#8217;m glad you asked &#8211; here&#8217;s Headroid2 with a smiley-inspired emoticon interface from a recent <a href="http://flashbrighton.org/?p=698" onclick="pageTracker._trackPageview('/outgoing/flashbrighton.org/?p=698&amp;referer=');">A.I. talk I gave</a>. Instructions for adding an Arduino+<a href="http://jimmieprodgers.com/kits/lolshield/" onclick="pageTracker._trackPageview('/outgoing/jimmieprodgers.com/kits/lolshield/?referer=');">LOLShield</a> for emotional feedback will follow.</p>
<p><a href="http://www.flickr.com/photos/lilspikey/4725075065/" onclick="pageTracker._trackPageview('/outgoing/www.flickr.com/photos/lilspikey/4725075065/?referer=');"><img class="aligncenter" title="Headroid2 with LOLShield emoticon interface" src="http://farm2.static.flickr.com/1384/4725075065_4279547ba9.jpg" alt="" width="375" height="500" /></a></p>
<p><strong>Full source:</strong></p>
<pre class="brush: python; title: ; notranslate">
#!/usr/bin/python
&quot;&quot;&quot;
This program is a demonstration of face and object detection using Haar-like features.
The program finds faces in a camera image or video stream and displays a red box around them,
then centres the webcam via two servos so the face is at the centre of the screen
Based on facedetect.py in the OpenCV samples directory
&quot;&quot;&quot;
import sys
from optparse import OptionParser
import time
import math
import datetime
import serial
import cv

# Parameters for haar detection
# From the API:
# The default parameters (scale_factor=2, min_neighbors=3, flags=0) are tuned
# for accurate yet slow object detection. For a faster operation on real video
# images the settings are:
# scale_factor=1.2, min_neighbors=2, flags=CV_HAAR_DO_CANNY_PRUNING,
# min_size=&lt;minimum possible face size&gt;

min_size = (20, 20)
image_scale = 2
haar_scale = 1.2
min_neighbors = 2
haar_flags = 0

def detect_and_draw(img, cascade):
    gray = cv.CreateImage((img.width,img.height), 8, 1)
    small_img = cv.CreateImage((cv.Round(img.width / image_scale),
                                cv.Round(img.height / image_scale)), 8, 1)

    # convert color input image to grayscale
    cv.CvtColor(img, gray, cv.CV_BGR2GRAY)

    # scale input image for faster processing
    cv.Resize(gray, small_img, cv.CV_INTER_LINEAR)

    cv.EqualizeHist(small_img, small_img)

    centre = None

    if(cascade):
        t = cv.GetTickCount()
        # HaarDetectObjects takes 0.02s
        faces = cv.HaarDetectObjects(small_img, cascade, cv.CreateMemStorage(0),
                                     haar_scale, min_neighbors, haar_flags, min_size)
        t = cv.GetTickCount() - t
        if faces:
            for ((x, y, w, h), n) in faces:
                # the input to cv.HaarDetectObjects was resized, so scale the
                # bounding box of each face and convert it to two CvPoints
                pt1 = (int(x * image_scale), int(y * image_scale))
                pt2 = (int((x + w) * image_scale), int((y + h) * image_scale))
                cv.Rectangle(img, pt1, pt2, cv.RGB(255, 0, 0), 3, 8, 0)
                # get the xy corner co-ords, calc the centre location
                x1 = pt1[0]
                x2 = pt2[0]
                y1 = pt1[1]
                y2 = pt2[1]
                centrex = x1+((x2-x1)/2)
                centrey = y1+((y2-y1)/2)
                centre = (centrex, centrey)

    cv.ShowImage(&quot;result&quot;, img)
    return centre

def move_servos(xygo):
    position = '%da%db' % (xygo[0], xygo[1])
    ser.write(position)

def get_delta(loc, span, max_delta, centre_tolerance):
    &quot;&quot;&quot;How far do we move on this axis to get the webcam
       centred on the face?
       loc is the face's centre for this axis
       span is the width or height for this axis
       max_delta is the max number of degrees to move on this axis
       (currently unused: the step sizes below are fixed)
       centre_tolerance is the centre region where we don't allow movement
       &quot;&quot;&quot;
    framecentre = span/2
    delta = framecentre - loc
    if abs(delta) &lt; centre_tolerance: # within X pixels of the centre
        delta = 0 # so don't move - else we get weird oscillations
    else:
        is_neg = delta &lt;= 0
        to_get_near_centre = abs(delta) - centre_tolerance
        if to_get_near_centre &gt; 35:
            delta = 4
        else:
            # move slower if we're closer to centre
            if to_get_near_centre &gt; 25:
                delta = 3
            else:
                # move real slow if we're very near centre
                delta = 1
        if is_neg:
            delta = delta * -1
    return delta

if __name__ == '__main__':
    # open a serial port
    ser=serial.Serial(port='/dev/cu.usbserial',baudrate=19200,timeout=0)
    ser.write('r')
    xygo = (90,90)
    move_servos(xygo)

    # parse cmd line options, setup Haar classifier
    parser = OptionParser(usage = &quot;usage: %prog [options] [camera_index]&quot;)
    parser.add_option(&quot;-c&quot;, &quot;--cascade&quot;, action=&quot;store&quot;, dest=&quot;cascade&quot;, type=&quot;str&quot;, help=&quot;Haar cascade file, default %default&quot;, default = &quot;/Users/ian/Documents/OpenCV-2.1.0//data/haarcascades/haarcascade_frontalface_alt.xml&quot;)
    (options, args) = parser.parse_args()

    cascade = cv.Load(options.cascade)

    if len(args) != 1:
        parser.print_help()
        sys.exit(1)

    input_name = args[0]
    if input_name.isdigit():
        capture = cv.CreateCameraCapture(int(input_name))
    else:
        print &quot;We need a camera input! Specify camera index e.g. 0&quot;
        sys.exit(1)

    cv.NamedWindow(&quot;result&quot;, 1)

    if capture:
        frame_copy = None

        while True:
            frame = cv.QueryFrame(capture)
            if not frame:
                cv.WaitKey(0)
                break
            if not frame_copy:
                frame_copy = cv.CreateImage((frame.width,frame.height),
                                            cv.IPL_DEPTH_8U, frame.nChannels)
            if frame.origin == cv.IPL_ORIGIN_TL:
                cv.Copy(frame, frame_copy)
            else:
                cv.Flip(frame, frame_copy, 0)

            centre = detect_and_draw(frame_copy, cascade)

            if centre is not None:
                cx = centre[0]
                cy = centre[1]

                # modify the *-1 if your x or y directions are reversed!
                xdelta = get_delta(cx, frame_copy.width, 6, 15) * -1
                ydelta = get_delta(cy, frame_copy.height, 1, 25) * -1

                # on my camera I introduce a delay after movements
                # else my assembly wobbles and the webcam transmits
                # a non-centred image, so weird oscillations can occur
                total_delta = abs(xdelta)+abs(ydelta)
                if total_delta &gt; 0:
                    xygo = (xygo[0]+xdelta,xygo[1]+ydelta)

                    sleep_for = 1/10.0*min(total_delta, 10)
                    sleep_for = min(sleep_for, 0.4)

                    move_servos(xygo)
                    time.sleep(sleep_for) # let the assembly settle before the next frame
                else:
                    sleep_for = 0

            if cv.WaitKey(10) &gt;= 0: # 10ms delay
                break

    cv.DestroyWindow(&quot;result&quot;)
</pre>
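<p>One thing the listing does not do is limit <code>xygo</code>, so a face that stays off to one side keeps pushing the requested angle further out. Here is a minimal Python 3 sketch of the same stepped logic with a clamp added. The thresholds are copied from <code>get_delta</code> above, while the names <code>axis_step</code> and <code>next_angle</code> and the 0 to 180 degree range are assumptions for illustration.</p>

```python
def axis_step(loc, span, centre_tolerance):
    """Return a signed step in degrees for one axis, using the thresholds from get_delta."""
    offset = span // 2 - loc
    if abs(offset) < centre_tolerance:
        return 0  # inside the dead band, so stay still and avoid oscillation
    distance = abs(offset) - centre_tolerance
    if distance > 35:
        step = 4
    elif distance > 25:
        step = 3  # move slower when closer to the centre
    else:
        step = 1  # move very slowly when almost centred
    return step if offset > 0 else -step


def next_angle(angle, step, lo=0, hi=180):
    """Apply a step but keep the result inside an assumed servo range."""
    return max(lo, min(hi, angle + step))


pan = 90
# The listing multiplies the step by -1 to match the author's servo direction.
pan = next_angle(pan, -axis_step(100, 640, 15))
print(pan)  # 86
```

<p>The dead band (<code>centre_tolerance</code>) is what stops the head hunting back and forth once the face is roughly centred.</p>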
]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/06/building-a-face-tracking-robot-headroid1-with-python-in-an-afternoon/feed/</wfw:commentRss>
		<slash:comments>33</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/06/building-a-face-tracking-robot-headroid1-with-python-in-an-afternoon/</feedburner:origLink></item>
		<item>
		<title>pyOpenCV facedetect.py demo (face detection with Python)</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/fYxH0Bx0cO4/</link>
		<comments>http://blog.aicookbook.com/2010/06/pyopencv-facedetect-py-demo-face-detection-with-python/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 17:58:40 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=62</guid>
		<description><![CDATA[When I first played with openCV I had no idea how good the facial detection would be, or how fast it might run on my MacBook. I&#8217;m recording this demo so you&#8217;ll know what to expect&#8230; pyOpenCV is the Python binding to the open source openCV (originally created by Intel for vision research). It comes [...]]]></description>
			<content:encoded><![CDATA[<p>When I first played with openCV I had no idea how good the facial detection would be, or how fast it might run on my MacBook. I&#8217;m recording this demo so you&#8217;ll know what to expect&#8230;</p>
<p><a href="http://opencv.willowgarage.com/wiki/PythonInterface" onclick="pageTracker._trackPageview('/outgoing/opencv.willowgarage.com/wiki/PythonInterface?referer=');">pyOpenCV</a> is the Python binding to the open source <a href="http://opencv.willowgarage.com/" onclick="pageTracker._trackPageview('/outgoing/opencv.willowgarage.com/?referer=');">openCV</a> (originally created by Intel for vision research). It comes as a Windows exe and builds on Linux just fine, and I posted some <a href="http://blog.aicookbook.com/2010/06/installing-opencv-python-leopard-mac/">Mac openCV build notes</a> a few days back.</p>
<p>facedetect.py is the demo program for <a href="http://opencv.willowgarage.com/wiki/FaceDetection" onclick="pageTracker._trackPageview('/outgoing/opencv.willowgarage.com/wiki/FaceDetection?referer=');">facial detection</a>. The goal of the video below is to show you that:</p>
<ul>
<li>facedetect.py is pretty good at facial detection on a standard laptop</li>
<li>light sources really degrade the performance</li>
</ul>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/bTrInlHAcLQ&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/bTrInlHAcLQ&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p><strong>Books:</strong></p>
<p>The OpenCV book is rather excellent. Everything is in C++, but the Python API is easy to figure out and the reference text makes it all clear.</p>
<table>
<tr>
<td>
<iframe src="http://rcm-uk.amazon.co.uk/e/cm?lt1=_blank&#038;bc1=000000&#038;IS2=1&#038;bg1=FFFFFF&#038;fc1=000000&#038;lc1=0000FF&#038;t=entrepreneuri-21&#038;o=2&#038;p=8&#038;l=as1&#038;m=amazon&#038;f=ifr&#038;md=0M5A6TN3AXP2JHJBWT02&#038;asins=0596516134" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>
</td>
<td>
<iframe src="http://rcm-uk.amazon.co.uk/e/cm?lt1=_blank&#038;bc1=000000&#038;IS2=1&#038;bg1=FFFFFF&#038;fc1=000000&#038;lc1=0000FF&#038;t=entrepreneuri-21&#038;o=2&#038;p=8&#038;l=as1&#038;m=amazon&#038;f=ifr&#038;md=0M5A6TN3AXP2JHJBWT02&#038;asins=0471140562" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>
</td>
</tr>
</table>
]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/06/pyopencv-facedetect-py-demo-face-detection-with-python/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/06/pyopencv-facedetect-py-demo-face-detection-with-python/</feedburner:origLink></item>
		<item>
		<title>Optical Character Recognition webservice work-in-progress</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/GUTM7m9zdnY/</link>
		<comments>http://blog.aicookbook.com/2010/06/optical-character-recognition-webservice-work-in-progress/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 13:29:08 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=54</guid>
		<description><![CDATA[This is a quick progress report on my webservice for optical character recognition using the open source Tesseract engine. This builds on my post a month back &#8216;Tesseract OCR to read plaques&#8216;. The immediate goal is to let the OpenPlaques folk have an automatic service which machine-reads English Heritage Plaques (blue plaques &#8211; very common [...]]]></description>
			<content:encoded><![CDATA[<p>This is a quick progress report on my webservice for <a href="http://en.wikipedia.org/wiki/Optical_character_recognition" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Optical_character_recognition?referer=');">optical character recognition</a> using the open source <a href="http://code.google.com/p/tesseract-ocr/" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/tesseract-ocr/?referer=');">Tesseract</a> engine. This builds on my post a month back &#8216;<a href="http://ianozsvald.com/2010/04/04/tesseract-optical-character-recognition-to-read-plaques/" onclick="pageTracker._trackPageview('/outgoing/ianozsvald.com/2010/04/04/tesseract-optical-character-recognition-to-read-plaques/?referer=');">Tesseract OCR to read plaques</a>&#8216;.</p>
<p>The immediate goal is to let the <a href="http://openplaques.org/" onclick="pageTracker._trackPageview('/outgoing/openplaques.org/?referer=');">OpenPlaques</a> folk have an automatic service which machine-reads English Heritage Plaques (blue plaques &#8211; very common at historic sites in the UK) from their Flickr photos and then outputs the English text. Currently volunteers are transcribing the text by hand.</p>
<p>Below you&#8217;ll see a quick demo. I&#8217;ve used the <a href="http://bottle.paws.de/" onclick="pageTracker._trackPageview('/outgoing/bottle.paws.de/?referer=');">bottle.py</a> microframework to run my webservice: it takes a URL to an image, converts it to a <a href="http://en.wikipedia.org/wiki/Tagged_Image_File_Format" onclick="pageTracker._trackPageview('/outgoing/en.wikipedia.org/wiki/Tagged_Image_File_Format?referer=');">TIFF</a> image, passes it into Tesseract and presents the recognised text as text output.</p>
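<p>The Tesseract step of that pipeline can be sketched with nothing but the standard library. This is not the webservice code itself: it assumes a <code>tesseract</code> binary on your PATH and an image that has already been saved as a TIFF, it leaves out the download and conversion steps, and the names <code>tesseract_command</code> and <code>ocr_tiff</code> are made up for illustration. It is written for Python 3.</p>

```python
import os
import subprocess
import tempfile


def tesseract_command(tiff_path, output_base):
    """Tesseract's CLI takes an input image and an output base name,
    and writes the recognised text to output_base + '.txt'."""
    return ['tesseract', tiff_path, output_base]


def ocr_tiff(tiff_path):
    """Run Tesseract on a TIFF file and return the recognised text."""
    with tempfile.TemporaryDirectory() as workdir:
        output_base = os.path.join(workdir, 'result')
        subprocess.check_call(tesseract_command(tiff_path, output_base))
        with open(output_base + '.txt', encoding='utf-8') as handle:
            return handle.read()
```

<p>A web handler would then only need to fetch the image, convert it to TIFF and return whatever <code>ocr_tiff()</code> produces.</p>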
<p>This isn&#8217;t live on the web yet (it needs a bit more work) but shortly it&#8217;ll be up for public use.</p>
<p><strong>Update</strong> &#8211; following this <a href="http://www.howtoforge.com/ocr_with_tesseract_on_ubuntu704" onclick="pageTracker._trackPageview('/outgoing/www.howtoforge.com/ocr_with_tesseract_on_ubuntu704?referer=');">tesseract image clean-up advice</a> (isolate text region, threshold, convert to b&amp;w) I can extract very clean text &#8211; contrast these results with what you see in the video.</p>
<blockquote><p>IN<br />
THIS HOUSE<br />
LIVED<br />
RALPH ELLIS<br />
1885 &#8211; 1963<br />
ARTIST<br />
PAINTER &amp; DESIGNER<br />
OF<br />
INN SIGNS <em>(Note &#8211; I extracted the inner circle so Sussex isn&#8217;t shown)</em></p></blockquote>
<blockquote><p>THIS WALKWAY<br />
WAS DONATED BY<br />
JEAN &amp; BRIAN CROSSLEY<br />
OP BROCKHAM<em> (Note &#8211; 1 typo here with OP)</em><br />
TO CELEBRATE<br />
JEAN’S 80th BIRTHDAY<br />
DECEMBER 27<br />
2007</p></blockquote>
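<p>The clean-up steps mentioned in the update (isolate the text region, threshold, convert to b&amp;w) are simple to picture on a greyscale image held as a list of rows. The sketch below is plain Python 3 for illustration only: a real image would come from an imaging library, and the helper names and the cutoff of 128 are assumptions rather than the values I used.</p>

```python
def crop(rows, left, top, right, bottom):
    """Keep only the rectangle that contains the text."""
    return [row[left:right] for row in rows[top:bottom]]


def to_black_and_white(rows, cutoff=128):
    """Map every greyscale pixel (0-255) to pure black (0) or white (255)."""
    return [[255 if pixel >= cutoff else 0 for pixel in row] for row in rows]


image = [
    [30, 40, 200, 210],
    [35, 220, 90, 205],
    [20, 25, 30, 35],
]
text_region = crop(image, 1, 0, 4, 2)
print(to_black_and_white(text_region))
# [[0, 255, 255], [255, 0, 255]]
```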
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/xtC3Qrh7ULw&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/xtC3Qrh7ULw&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/06/optical-character-recognition-webservice-work-in-progress/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/06/optical-character-recognition-webservice-work-in-progress/</feedburner:origLink></item>
		<item>
		<title>Combined face tracking and speech recognition (Intel research)</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/o5QjEJb-S4g/</link>
		<comments>http://blog.aicookbook.com/2010/06/combined-face-tracking-and-speech-recognition-intel-research/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 10:59:48 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[General]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=50</guid>
		<description><![CDATA[Here is a rather neat demo of the advantage of tracking a face whilst performing speech recognition &#8211; if the user is looking at the computer then the computer knows to listen. This is common sense to a human but for a computer with just a microphone input it has to listen to everything, not [...]]]></description>
			<content:encoded><![CDATA[<p>Here is a rather neat demo of the advantage of tracking a face whilst performing speech recognition &#8211; if the user is looking at the computer then the computer knows to listen. This is common sense to a human but for a computer with just a microphone input it has to listen to <em>everything</em>, not just the things that are directed towards it.</p>
<p>The face tracking technology (<a href="http://www.seeingmachines.com/" onclick="pageTracker._trackPageview('/outgoing/www.seeingmachines.com/?referer=');">SeeingMachines</a> <a href="http://www.seeingmachines.com/product/faceapi/" onclick="pageTracker._trackPageview('/outgoing/www.seeingmachines.com/product/faceapi/?referer=');">faceAPI</a>) is also very nice &#8211; it tracks faces when they turn a long way away from the camera (at 0:30). The speaker states that the speech recognition tool is Microsoft&#8217;s default for Win Vista/7.</p>
<p>The query template appears to be &#8220;Search X for Y please&#8221;; the Search, For and Please all need to be present for the search to trigger.</p>
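<p>To make that template concrete, here is how a recognised sentence could be matched against it once the speech recogniser has produced text. This is a guess at the shape of the template based on the video, not Intel&#8217;s implementation; the pattern and the name <code>parse_query</code> are assumptions, and the code is Python 3.</p>

```python
import re

# All three keywords (search, for, please) must be present for a match.
QUERY = re.compile(
    r'^search\s+(?P<site>.+?)\s+for\s+(?P<terms>.+?)\s+please$',
    re.IGNORECASE,
)


def parse_query(utterance):
    """Return (site, terms) if the utterance fits the template, else None."""
    match = QUERY.match(utterance.strip())
    if match is None:
        return None
    return match.group('site'), match.group('terms')


print(parse_query('Search Wikipedia for robot heads please'))
# ('Wikipedia', 'robot heads')
print(parse_query('Search Wikipedia for robot heads'))
# None
```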
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="640" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/JrzfgAUxTI0&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="640" height="385" src="http://www.youtube.com/v/JrzfgAUxTI0&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>A goal of the research seems to be to sell more Intel many-core CPUs &#8211; the point is made that face tracking and speech recognition are parallelisable tasks which work well across several cores.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/06/combined-face-tracking-and-speech-recognition-intel-research/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/06/combined-face-tracking-and-speech-recognition-intel-research/</feedburner:origLink></item>
		<item>
		<title>Open Allure DS conversational interface (using Python)</title>
		<link>http://feedproxy.google.com/~r/TheArtificialIntelligenceCookbook/~3/vbYK6TiuJ3M/</link>
		<comments>http://blog.aicookbook.com/2010/06/open-allure-conversational-interface-python/#comments</comments>
		<pubDate>Fri, 04 Jun 2010 10:16:29 +0000</pubDate>
		<dc:creator>Ian Ozsvald</dc:creator>
				<category><![CDATA[General]]></category>
		<category><![CDATA[Python]]></category>

		<guid isPermaLink="false">http://blog.aicookbook.com/?p=26</guid>
		<description><![CDATA[Back at Christmas I was speaking to John Graves about his Open Allure DS PhD project &#8211; a conversational interface written in Python. The project has moved wonderfully forward over the past few months, I&#8217;ll summarise some of the features of Open Allure here. Sidenote &#8211; if you prefer podcasts then John was recently interviewed [...]]]></description>
			<content:encoded><![CDATA[<p>Back at Christmas I was speaking to <a href="http://nz.linkedin.com/pub/john-graves/7/b05/160" onclick="pageTracker._trackPageview('/outgoing/nz.linkedin.com/pub/john-graves/7/b05/160?referer=');">John Graves</a> about his <a href="http://code.google.com/p/open-allure-ds/" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/open-allure-ds/?referer=');">Open Allure DS</a> PhD project &#8211; a conversational interface written in Python. The project has moved wonderfully forward over the past few months, I&#8217;ll summarise some of the features of Open Allure here.</p>
<p>Sidenote &#8211; if you prefer podcasts then John was <a href="http://wiki.github.com/jg1141/Open-Allure-DS/python411-interview-with-john-graves" onclick="pageTracker._trackPageview('/outgoing/wiki.github.com/jg1141/Open-Allure-DS/python411-interview-with-john-graves?referer=');">recently interviewed</a> by Ron Stephens on Python411 (with a lovely plug for <a href="http://showmedo.com" onclick="pageTracker._trackPageview('/outgoing/showmedo.com?referer=');">ShowMeDo</a> &#8211; cheers John!). The interview runs for an hour and is completely about Open Allure.</p>
<p>The project aims to build a speech and gesture recognising interface, mostly using Python, to provide us with new ways of interacting with the computer. As John states:</p>
<blockquote><p>Open Allure is a project aimed at  developing new ways to share what we know with one another by permitting  the collaborative creation and experience of interactive dialogs. </p>
<p>These  <em>verbal</em> exchanges give your interaction with the computer a very  different quality and permit immediate feedback to help reinforce or  reorganize your thinking. </p>
<p>Because voice recognition is still imperfect,  the interface also supports making choices by <em>gesture</em>: your  webcam watches you as you raise your hand. </p></blockquote>
<p>Here&#8217;s a good overview video:</p>
<p><object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/uwY0KB5hNPw&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/uwY0KB5hNPw&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>And here&#8217;s a demo showing speech and gesture recognition:<br />
<object classid="clsid:d27cdb6e-ae6d-11cf-96b8-444553540000" width="480" height="385" codebase="http://download.macromedia.com/pub/shockwave/cabs/flash/swflash.cab#version=6,0,40,0"><param name="allowFullScreen" value="true" /><param name="allowscriptaccess" value="always" /><param name="src" value="http://www.youtube.com/v/tfKTk6rgWsA&amp;hl=en_US&amp;fs=1&amp;" /><param name="allowfullscreen" value="true" /><embed type="application/x-shockwave-flash" width="480" height="385" src="http://www.youtube.com/v/tfKTk6rgWsA&amp;hl=en_US&amp;fs=1&amp;" allowscriptaccess="always" allowfullscreen="true"></embed></object></p>
<p>Here John shows the start of a chatbot interface &#8211; you can verbally ask the computer to solve math problems and to look-up word definitions:</p>
<p><object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/_M8kllSJZpg&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/_M8kllSJZpg&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>I particularly like the direction of the latest (0.1d7) release &#8211; John has integrated a persona (shown as photos of John in the top-right corner) to visually show the state of the machine. The persona indicates whether the machine is listening or responding; this kind of feedback is very common in human-to-human interaction:</p>
<p><object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/Dqjli3dSYqE&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Dqjli3dSYqE&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>Gesture recognition uses a &#8216;green screen&#8217; approach. The system records a set of images to generate an average background, then subtracts each new image from the average (working on grey scale images) and applies a threshold. Movement shows up as regions of activity; if these regions occupy a region of interest then an event is generated:</p>
<p><object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/q2pF7Z4eXuU&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/q2pF7Z4eXuU&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
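<p>The background-subtraction idea is easy to try on toy data. The sketch below is plain Python 3 on lists of grey scale rows. It is not Open Allure&#8217;s code (which uses pyGame for vision), and the function names, the threshold of 30 and the 50% coverage rule are all assumptions chosen for illustration.</p>

```python
def average_background(frames):
    """Average a list of equally sized grey scale frames (lists of rows)."""
    count = len(frames)
    height, width = len(frames[0]), len(frames[0][0])
    return [[sum(f[y][x] for f in frames) / count for x in range(width)]
            for y in range(height)]


def motion_mask(frame, background, threshold=30):
    """True wherever the new frame differs enough from the background."""
    return [[abs(p - b) > threshold for p, b in zip(row, bg_row)]
            for row, bg_row in zip(frame, background)]


def region_active(mask, region, min_fraction=0.5):
    """Fire when enough of the (left, top, right, bottom) region is moving."""
    left, top, right, bottom = region
    cells = [mask[y][x] for y in range(top, bottom) for x in range(left, right)]
    return sum(cells) / len(cells) >= min_fraction


# Two quiet 4x4 frames build the background; a "hand" then enters the top right.
quiet_a = [[10] * 4 for _ in range(4)]
quiet_b = [[12] * 4 for _ in range(4)]
background = average_background([quiet_a, quiet_b])
frame = [row[:] for row in quiet_a]
for y in (0, 1):
    for x in (2, 3):
        frame[y][x] = 200

mask = motion_mask(frame, background)
print(region_active(mask, (2, 0, 4, 2)))  # True
print(region_active(mask, (0, 2, 2, 4)))  # False
```

<p>Averaging several frames for the background makes the mask less sensitive to camera noise in any single frame.</p>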
<p>Currently John has the best support for Open Allure on Windows using the Microsoft Speech API (support for Dragon Naturally Speaking is just starting). On the Mac and Linux there is no speech recognition yet. pyGame is used for the vision components. This <a href="http://spreadsheets.google.com/ccc?key=0AqJtCBHzLJcXdEdaM0tBSDl0T0NhNUxONXRqblpMdnc&#038;hl=en#gid=0" onclick="pageTracker._trackPageview('/outgoing/spreadsheets.google.com/ccc?key=0AqJtCBHzLJcXdEdaM0tBSDl0T0NhNUxONXRqblpMdnc_038_hl=en_gid=0&amp;referer=');">Google Sheet</a> lists the status of the components for each platform.</p>
<p>John has tested the open source <a href="http://julius.sourceforge.jp/en_index.php" onclick="pageTracker._trackPageview('/outgoing/julius.sourceforge.jp/en_index.php?referer=');">Julius</a> continuous speech recogniser on Linux (shown below) but it isn&#8217;t yet integrated into Open Allure:</p>
<p><object width="480" height="385"><param name="movie" value="http://www.youtube.com/v/s1srNOk2ISI&#038;hl=en_US&#038;fs=1&#038;"></param><param name="allowFullScreen" value="true"></param><param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/s1srNOk2ISI&#038;hl=en_US&#038;fs=1&#038;" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="480" height="385"></embed></object></p>
<p>Do you want to try it yourself? If you have Windows then you should be all set; Linux and Mac have partial support. To see what you need, visit the <a href="http://pypi.python.org/pypi/openallure/" onclick="pageTracker._trackPageview('/outgoing/pypi.python.org/pypi/openallure/?referer=');">PyPI page</a>.</p>
<p>If you want to contribute then see this <a href="http://code.google.com/p/open-allure-ds/wiki/LittleEasyImprovement?ts=1272862130&#038;updated=LittleEasyImprovement" onclick="pageTracker._trackPageview('/outgoing/code.google.com/p/open-allure-ds/wiki/LittleEasyImprovement?ts=1272862130_038_updated=LittleEasyImprovement&amp;referer=');">list of easy improvements</a> that need to be made and join the project&#8217;s <a href="http://groups.google.com/group/open-allure-ds" onclick="pageTracker._trackPageview('/outgoing/groups.google.com/group/open-allure-ds?referer=');">Google Group</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://blog.aicookbook.com/2010/06/open-allure-conversational-interface-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		<feedburner:origLink>http://blog.aicookbook.com/2010/06/open-allure-conversational-interface-python/</feedburner:origLink></item>
	</channel>
</rss>
