<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:atom="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:georss="http://www.georss.org/georss" xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr="http://purl.org/syndication/thread/1.0" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0"><channel><atom:id>tag:blogger.com,1999:blog-3813942136986409295</atom:id><lastBuildDate>Sat, 04 Feb 2012 08:01:45 +0000</lastBuildDate><category>Summarization</category><category>Python</category><category>Automatic Translation</category><category>Statistics</category><category>Semantic Analysis</category><category>Semantic Web</category><category>Social Networks</category><category>Natural Language Generation</category><category>Parsing</category><category>Chatbots</category><category>Artificial Intelligence</category><category>Speech Recognition</category><category>Web Search</category><category>SRL</category><category>Syntax</category><category>Stanford</category><category>Conferences</category><category>Language</category><category>NLP Tools</category><category>Natural Language</category><category>Journals</category><category>Lectures</category><category>Sentiment Analysis</category><category>Disambiguation</category><category>News</category><category>Books</category><title>Natural Language Processing World</title><description>This blog presents posts about Natural Language Processing, Linguistics, Artificial Intelligence, Cognitive Science and Sentiment Analysis. 
I want to share my discoveries, news and research about these topics.</description><link>http://nlpb.blogspot.com/</link><managingEditor>noreply@blogger.com (Pedro Paulo Balage)</managingEditor><generator>Blogger</generator><openSearch:totalResults>35</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/NaturalLanguageProcessingWorld" /><feedburner:info uri="naturallanguageprocessingworld" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-7147511320400344702</guid><pubDate>Thu, 12 May 2011 07:32:00 +0000</pubDate><atom:updated>2011-05-13T17:37:19.968-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Natural Language</category><category domain="http://www.blogger.com/atom/ns#">NLP Tools</category><title>Wolfram Mathematica 8 comes with a Free-Form Linguistic input</title><description>Wolfram Mathematica, one of the best tools for scientific computing and mathematics learning, announced an impressive feature in its new software version. According to Mathematica's website (&lt;a href="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/"&gt;http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/&lt;/a&gt;), version 8 comes with a free-form linguistic input where you can use natural language to define queries.&lt;br /&gt;
&lt;br /&gt;
According to the website, with the new tool it is possible to:&lt;br /&gt;
&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;Learn Mathematica syntax by entering a query in English and then viewing the free-form queries as precise Mathematica commands.&lt;/li&gt;
&lt;li&gt;Easily specify plot frames, styles, gridlines, and more in natural language.&lt;/li&gt;
&lt;li&gt;Perform many specialist operations in fields as diverse as image processing and finance using free-form instructions.&lt;/li&gt;
&lt;li&gt;Access Wolfram|Alpha's vast collection of data and integrate it with your Mathematica session.&lt;/li&gt;
&lt;li&gt;Transparently refer to the results of previous computations or session variables in free-form commands.&lt;/li&gt;
&lt;/ul&gt;&lt;br /&gt;
The computational knowledge engine Wolfram Alpha (&lt;a href="http://www.wolframalpha.com/"&gt;http://www.wolframalpha.com/&lt;/a&gt;) makes use of this new version of Mathematica software and also allows the user to input natural language queries. The&amp;nbsp;strategy&amp;nbsp;of Wolfram labs and many other companies seems to be to fuse free-form natural language input with computational tools like web search, mathematics software and&amp;nbsp;knowledge&amp;nbsp;retrieval.&lt;br /&gt;
&lt;br /&gt;
Some examples of commands supported by the tool are:&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;table class="inputTable" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: 13px; line-height: 19px; margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;In[5]:=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;div class="showHideInputCont"&gt;&lt;div class="showHideInput" style="cursor: pointer; position: relative;"&gt;&lt;img alt="Click for copyable input" border="0" height="88" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/basic-examples/In_5.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="454" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;table class="outputTable" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: 13px; line-height: 19px; margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;Out[5]=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;img border="0" height="30" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/basic-examples/O_5.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="175" /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: x-small;"&gt;&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; line-height: 19px;"&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: x-small;"&gt;&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; line-height: 19px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 13px;"&gt;&lt;table class="inputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;In[10]:=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;div class="showHideInputCont"&gt;&lt;div class="showHideInput" style="cursor: pointer; position: relative;"&gt;&lt;img alt="Click for copyable input" border="0" height="48" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/basic-examples/In_10.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="454" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;table class="outputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;Out[10]=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;img border="0" height="324" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/basic-examples/O_10.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="322" /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: x-small;"&gt;&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 2px; -webkit-border-vertical-spacing: 2px; line-height: 19px;"&gt;&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 0px; -webkit-border-vertical-spacing: 0px; font-size: 13px;"&gt;&lt;table class="inputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;In[9]:=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;div class="showHideInputCont"&gt;&lt;div class="showHideInput" style="cursor: pointer; position: relative;"&gt;&lt;img alt="Click for copyable input" border="0" height="51" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/probability-and-statistics/In_67.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="454" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;table class="outputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;Out[9]=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;img border="0" height="37" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/probability-and-statistics/O_62.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="124" /&gt;&lt;br /&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: 13px; line-height: 19px;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;table class="inputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;In[3]:=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;div class="showHideInputCont"&gt;&lt;div class="showHideInput" style="cursor: pointer; position: relative;"&gt;&lt;img alt="Click for copyable input" border="0" height="60" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/advanced-mathematics/In_84.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="508" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;table class="outputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;Out[3]=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;img border="0" height="110" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/advanced-mathematics/O_79.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="180" /&gt;&lt;br /&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: 13px; line-height: 19px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;table class="inputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;In[4]:=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;div class="showHideInputCont"&gt;&lt;div class="showHideInput" style="cursor: pointer; position: relative;"&gt;&lt;img alt="Click for copyable input" border="0" height="51" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/linguistics/In_110.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="454" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;table class="outputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;Out[4]=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;img border="0" height="69" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/linguistics/O_104.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="473" /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span class="Apple-style-span" style="color: #222222; font-family: Arial, Verdana, Geneva, sans-serif; font-size: 13px; line-height: 19px;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;table class="inputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;In[5]:=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;div class="showHideInputCont"&gt;&lt;div class="showHideInput" style="cursor: pointer; position: relative;"&gt;&lt;img alt="Click for copyable input" border="0" height="108" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/linguistics/In_111.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="454" /&gt;&lt;/div&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;table class="outputTable" style="margin-top: 20px;"&gt;&lt;tbody&gt;
&lt;tr style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif;"&gt;&lt;td class="number" style="color: #222222; font-size: 12px; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top; width: 50px;"&gt;&lt;span&gt;Out[5]=&lt;/span&gt;&lt;/td&gt;&lt;td style="color: #222222; font: normal normal normal 13px/19px Arial, Verdana, Geneva, sans-serif; vertical-align: top;"&gt;&lt;img border="0" height="14" src="http://www.wolfram.com/mathematica/new-in-8/free-form-linguistic-input/HTMLImages.en/linguistics/O_105.png" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px;" width="7" /&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
&lt;/div&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/fSDZBX0vePA/wolfram-mathematica-8-comes-with-free.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2011/05/wolfram-mathematica-8-comes-with-free.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-4583640975575113225</guid><pubDate>Sat, 26 Feb 2011 19:12:00 +0000</pubDate><atom:updated>2011-02-26T16:12:31.281-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Python</category><category domain="http://www.blogger.com/atom/ns#">Parsing</category><category domain="http://www.blogger.com/atom/ns#">NLP Tools</category><title>Memory-Based Shallow Parser for Python</title><description>From the author's website:&lt;br /&gt;
&lt;br /&gt;
&lt;blockquote&gt;MBSP is a text analysis system based on the TiMBL and MBT memory based learning applications developed at CLiPS and ILK. It provides tools for Tokenization and Sentence Splitting, Part of Speech Tagging, Chunking, Lemmatization, Relation Finding and Prepositional Phrase Attachment.&lt;/blockquote&gt;&lt;br /&gt;
&lt;div style="text-align: center;"&gt;&lt;a href="http://www.clips.ua.ac.be/pages/MBSP"&gt;http://www.clips.ua.ac.be/pages/MBSP&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
MBSP stands for "Memory-Based Shallow Parser" and provides a nice Python interface to the software's functionality. It runs as a service, so once started it returns new parses very quickly. An example of the system's output is shown below.&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://2.bp.blogspot.com/-D6qaYemzg1E/TWlPVaRphGI/AAAAAAAARgM/r2pu8kQafE4/s1600/MBSP_schema.gif" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="131" src="http://2.bp.blogspot.com/-D6qaYemzg1E/TWlPVaRphGI/AAAAAAAARgM/r2pu8kQafE4/s400/MBSP_schema.gif" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
&lt;pre&gt;&amp;gt;&amp;gt;&amp;gt; print MBSP.parse('I ate pizza with a fork.')

I/PRP/I-NP/O/NP-SBJ-1/O/i 
ate/VBD/I-VP/O/VP-1/A1/eat 
pizza/NN/I-NP/O/NP-OBJ-1/O/pizza 
with/IN/I-PP/B-PNP/O/P1/with 
a/DT/I-NP/I-PNP/O/P1/a 
fork/NN/I-NP/I-PNP/O/P1/fork 
././O/O/O/O/.
&lt;/pre&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/LJ35Va7WIMY/memory-based-shallow-parser-for-python.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://2.bp.blogspot.com/-D6qaYemzg1E/TWlPVaRphGI/AAAAAAAARgM/r2pu8kQafE4/s72-c/MBSP_schema.gif" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2011/02/memory-based-shallow-parser-for-python.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-1031128321975252966</guid><pubDate>Sat, 26 Feb 2011 18:49:00 +0000</pubDate><atom:updated>2011-03-03T06:21:44.252-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">SRL</category><category domain="http://www.blogger.com/atom/ns#">NLP Tools</category><title>SENNA: A Fast Semantic Role Labeling (SRL) Tool</title><description>Over the last few days I played with a very fast Semantic Role Labeling (SRL) tool. The software is quite new and written entirely in pure C for efficiency. &lt;br /&gt;
&lt;br /&gt;
The software outputs a host of Natural Language Processing (NLP) predictions: part-of-speech (POS) tags, chunking (CHK), named entity recognition (NER) and semantic role labeling (SRL).&lt;br /&gt;
&lt;br /&gt;
&lt;div style="text-align: center;"&gt;&lt;a href="http://ml.nec-labs.com/senna/"&gt;http://ml.nec-labs.com/senna/&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
A friend was using ASSERT (http://cemantix.org/assert.html) to provide the SRL tags, but the software crashed every time. Also, because ASSERT is a pipeline of other tools, the runtime was enormous. The solution: switch to SENNA.&lt;br /&gt;
&lt;br /&gt;
I did some tests with the software and I read a &lt;a href="http://arxiv.org/abs/1103.0398"&gt;report&lt;/a&gt; about it. The software is very robust and offers an easy tool for those who need to use SRL in their projects. The software is distributed as binaries (Linux32, Linux64, Mac, Windows), or it can be compiled on your machine. My friend's experiment runtime dropped from 4 days to 4 hours using SENNA. &lt;br /&gt;
&lt;br /&gt;
The performance on each task is close to that of the best systems. Below is the system output.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;Google announced a new product yesterday.
   Google       NNP     S-NP     S-ORG           -      S-A0
 announced      VBD     S-VP         O   announced       S-V
         a      DT      B-NP         O           -      B-A1
       new      JJ      I-NP         O           -      I-A1
   product      NN      E-NP         O           -      E-A1
 yesterday      NN      S-NP         O           -  S-AM-TMP
         .       .         O         O           -         O
&lt;/pre&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/5ljNfFoeyRM/senna-fast-semantic-role-labeling-srl.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2011/02/senna-fast-semantic-role-labeling-srl.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-1944886696012601293</guid><pubDate>Tue, 18 Jan 2011 13:47:00 +0000</pubDate><atom:updated>2011-01-18T11:47:39.583-02:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Natural Language Generation</category><category domain="http://www.blogger.com/atom/ns#">Statistics</category><title>Experiments with Statistical Language Generation</title><description>Some time ago I tried to imagine how our brain is able to generate language so amazingly. This is a very hard problem and it isn't fully understood by science. Since I wanted to do some practical experiments, I used what I had at hand: statistics.&lt;br /&gt;
&lt;br /&gt;
Simply put, there are some methods for generating text in the &lt;a href="http://www.nltk.org/"&gt;Natural Language Toolkit&lt;/a&gt; for Python (NLTK). I studied them and created some curious examples, which I post here.&lt;br /&gt;
&lt;br /&gt;
First, let's write code to learn an &lt;a href="http://en.wikipedia.org/wiki/N-gram"&gt;N-gram&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/Language_model"&gt;language model&lt;/a&gt; in which each gram is a word from the text. I use two corpora: the Book of Genesis in English and the Reuters trade category. Both are available in NLTK. The code is below.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;# Import the corpus and functions used from nltk library
from nltk.corpus import reuters
from nltk.corpus import genesis
from nltk.probability import LidstoneProbDist
from nltk.model import NgramModel

# Tokens contains the words for Genesis and Reuters Trade
tokens = list(genesis.words('english-kjv.txt'))
tokens.extend(list(reuters.words(categories = 'trade')))

# estimator for smoothing the N-gram model
estimator = lambda fdist, bins: LidstoneProbDist(fdist, 0.2)

# N-gram language model with 3-grams
model = NgramModel(3, tokens, estimator)

# Apply the language model to generate 50 words in sequence
text_words = model.generate(50)

# Join the generated words into one string, separated by spaces.
text = ' '.join(text_words)

# Print the text (this form works in both Python 2 and Python 3)
print(text)
&lt;/pre&gt;&lt;br /&gt;
On line 12 (the estimator definition) I use &lt;a href="http://en.wikipedia.org/wiki/Additive_smoothing"&gt;Lidstone smoothing&lt;/a&gt; for the &lt;a href="http://en.wikipedia.org/wiki/N-gram"&gt;n-gram&lt;/a&gt; &lt;a href="http://en.wikipedia.org/wiki/Language_model"&gt;language model&lt;/a&gt;. For more information about smoothing techniques in n-gram language modelling, consult Wikipedia or a good statistical natural language processing book. I am generating 50 words in the example using a 3-gram model. This means that the model takes the previous 2 words, looks in the text for the words that came after them, and picks one of them according to the estimated probabilities. For each pair of words the model selects a third, and it repeats this for all 50 words.&lt;br /&gt;
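To make the estimator concrete, here is a minimal pure-Python sketch of the same idea. It is not NLTK's implementation: the function names train_ngram_counts, lidstone_prob and generate, the toy sentence, and the use of the vocabulary size as the number of bins are all assumptions made for illustration. Each probability is computed as (count + gamma) / (total + gamma * V), with gamma = 0.2 as in the post.

```python
import random
from collections import Counter, defaultdict


def train_ngram_counts(tokens, n):
    """Map each (n-1)-word context to a Counter of the words that follow it."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts


def lidstone_prob(counts, context, word, vocab_size, gamma=0.2):
    """Lidstone estimate: (count + gamma) / (total + gamma * vocab_size)."""
    followers = counts.get(context, Counter())
    total = sum(followers.values())
    return (followers[word] + gamma) / (total + gamma * vocab_size)


def generate(counts, vocab, start, length, gamma=0.2, seed=0):
    """Sample words one at a time, conditioning on the previous n-1 words."""
    rng = random.Random(seed)
    words = list(start)
    order = len(start)
    for _ in range(length):
        context = tuple(words[-order:])
        weights = [lidstone_prob(counts, context, w, len(vocab), gamma)
                   for w in vocab]
        words.append(rng.choices(vocab, weights=weights, k=1)[0])
    return words


# Toy corpus (hypothetical); the post uses Genesis plus Reuters trade instead.
tokens = "in the beginning god created the heaven and the earth".split()
vocab = sorted(set(tokens))
counts = train_ngram_counts(tokens, 3)
sample = generate(counts, vocab, start=("in", "the"), length=5)
print(" ".join(sample))
```

Because of the smoothing, every word in the vocabulary keeps a small non-zero probability after any context, which is why the generated text can leave the training text even when a context was seen only once.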
&lt;br /&gt;
Two examples of text generated by the 3-gram language model (note that each run of the algorithm generates a different random example):&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;In the fiscal year , with tabret , and menservants , and sat in the presence of the heaven ; and darkness was upon all that he may waive any retaliation if it accelerated too fast and the one affected by the hand of Es for I also here looked

In the speech , the brother of Japheth the elder , even with Isaac ' s steel exports , the statistics department said first quarter . There were indications of improvement in Britain ' s deputy minister of Economy and Communication . Speaking in a coffin in Egypt , Jacob
&lt;/pre&gt;&lt;br /&gt;
You can see that the algorithm mixed the texts: sometimes it speaks about heaven, Jacob and Egypt, and other times about improvements in Britain, retaliation, and the minister of Economy and Communication. Most importantly, you can say that, at a high level, the text respects English grammar and that the sequence of words is not totally random. This is the result of the n-gram model. The higher the n-gram order of your model, the more coherent the text will be. For example, let's generate with 5-grams:&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;# N-gram language model with 5-grams
model = NgramModel(5, tokens, estimator)
&lt;/pre&gt;&lt;br /&gt;
&lt;pre&gt;In the beginning God created the heaven and the earth . And when Joseph saw Benjamin with them , he said . Such an approach would finance the trade deficit but allow for its gradual resolution over time . YEUTTER SAYS U . S ., Company spokesmen told Reuters .
&lt;/pre&gt;&lt;br /&gt;
In this example, the generated text starts exactly like the Bible, but in the middle it switches to some random text from Reuters. In my examples I am not moving the start point, so the random text will always start with "In the beginning God ..." (for 5-grams). What we can verify is that the text is much more coherent (at a high level, again) than with the previous 3-gram model. It keeps the basic English syntax even if the meaning (semantics) is not coherent.&lt;br /&gt;
&lt;br /&gt;
OK, if text from a higher-order n-gram model is more coherent, why don't we use an even higher n for modelling? First, because of time and space constraints: the higher the order, the more processing time and memory we require (it grows exponentially). In this example I am using a text of nearly 200,000 words, but imagine using a corpus of more than 1 billion words, which is usual for language modelling. Second, because of the probability estimates themselves. The number of different words that we can find after the same previous 3 words is much bigger than the number we can find after the previous 6 or 7 words. So, for a high-order n-gram, the probability function does not perform well because of the insufficient amount of data. Again, more data means more time; it is a complex trade-off. What is usual in computational linguistics is 3- to 5-gram models; this is what you will find in the papers.&lt;br /&gt;
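The sparsity argument can be checked with a small counting sketch. This is only an illustration under assumptions: the helper name context_stats and the toy sentence are hypothetical, and a real corpus would be far larger. For each order n it counts how many distinct (n-1)-word contexts exist and how many of them were seen with exactly one following word, in which case the model has nothing to choose from and can only repeat the training text.

```python
from collections import defaultdict


def context_stats(tokens, n):
    """Return (distinct contexts, contexts seen with exactly one follower)."""
    followers = defaultdict(set)
    for i in range(len(tokens) - n + 1):
        followers[tuple(tokens[i:i + n - 1])].add(tokens[i + n - 1])
    singletons = sum(1 for f in followers.values() if len(f) == 1)
    return len(followers), singletons


# Toy corpus (hypothetical); swap in the Genesis and Reuters tokens to test it.
tokens = ("the cat sat on the mat and the dog sat on the rug "
          "and the cat saw the dog").split()
for n in (2, 3, 4):
    contexts, singletons = context_stats(tokens, n)
    print(n, contexts, singletons)
```

On a large corpus the number of distinct contexts tends to grow quickly with n, so the table of counts gets bigger while each individual context is backed by less evidence.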
&lt;br /&gt;
Just for fun, let's see how crazy the random words chosen by a 2-gram language model are.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;# N-gram language model with 2-grams
model = NgramModel(2, tokens, estimator)
&lt;/pre&gt;&lt;br /&gt;
&lt;pre&gt;In the rods of the land of her pitcher from 27 of an agreement aims to impose permanent quotas on Japan hopes of the latest monthly figure excluding Dutch - import quotas on its currency and 24 . Another possible to rebuild our goal set it said " We cannot
&lt;/pre&gt;&lt;br /&gt;
To complement my example, I will now demonstrate that you can even use letters as the units of your language model. The code is posted below.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;from nltk.corpus import reuters
from nltk.corpus import genesis
from nltk.probability import LidstoneProbDist
from nltk.model import NgramModel

# Tokens contains the words for English Genesis and Reuters Trade
tokens = list(genesis.words('english-kjv.txt'))
tokens.extend(list(reuters.words(categories = 'trade')))

# Generate a list of letters for each word in tokens separated by a space
letters = list(' '.join(tokens))

# estimator for smoothing the N-gram model
estimator = lambda fdist, bins: LidstoneProbDist(fdist, 0.2)

# N-gram language model with 4-grams
model = NgramModel(4, letters, estimator)

# Apply the language model to generate 200 letters in sequence
text = model.generate(200)
text_output = ''.join(text)
print text_output
&lt;/pre&gt;&lt;br /&gt;
I chose 4-grams, i.e., the model looks at the 3 previous letters and selects the fourth. The text generated for 200 letters is below.&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;In to On to serity seriod on the 62 bid to Wash , and he U . " the rawing up , sinese ther it he sharah fedeji Fulforce againted import becassed the Asensurplus in July alway to Tariffections in 1 . w
&lt;/pre&gt;&lt;br /&gt;
Of course, the readability is worse than when picking words, since some of the generated words do not even exist. The punctuation marks also do not work well, and the syntax, well, it is just some letters. You can increase or decrease n to see more results.&lt;br /&gt;
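&lt;br /&gt;
If you want to experiment with the value of n without NLTK, the idea can be sketched with the standard library only. This is a simplified stand-in for the model above, not its implementation: it has no smoothing (the Lidstone estimator is left out), it samples the next letter in proportion to how often it followed the context, and the names build_model and generate as well as the sample sentence are my own assumptions. It is written for Python 3 and needs n to be at least 2.&lt;br /&gt;
&lt;br /&gt;

```python
import random
from collections import defaultdict

def build_model(symbols, n):
    """Map each context of n-1 symbols to the list of symbols that followed it."""
    size = n - 1
    model = defaultdict(list)
    for i in range(len(symbols) - size):
        context = tuple(symbols[i:i + size])
        model[context].append(symbols[i + size])
    return model

def generate(model, start, length, seed=0):
    """Extend start one symbol at a time, sampling by observed frequency."""
    rng = random.Random(seed)
    size = len(start)
    output = list(start)
    for _ in range(length - size):
        followers = model.get(tuple(output[-size:]))
        if not followers:
            break  # unseen context: this sketch has no smoothing to fall back on
        output.append(rng.choice(followers))
    return output

# Sample text (an assumption for illustration, not the Genesis/Reuters corpus)
text = "in the beginning god created the heaven and the earth"
letters = list(text)

n = 4
model = build_model(letters, n)
print(''.join(generate(model, letters[:n - 1], 40)))
```

Changing n in this sketch lets you compare letter models of different orders on your own text; the start context is always the first n-1 letters, as in my examples above.&lt;br /&gt;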
&lt;br /&gt;
The most amazing thing in these examples is that, yes, we could generate text, and in some examples we could generate text that people would say was written by other people (some crazy people, maybe). And what are we using? Just a corpus and statistics! Natural Language Processing is amazing!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-1944886696012601293?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/2f8PZYVftGg" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/2f8PZYVftGg/experiments-with-statistical-language.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2011/01/experiments-with-statistical-language.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-7124197755358076214</guid><pubDate>Sun, 16 Jan 2011 19:23:00 +0000</pubDate><atom:updated>2011-01-16T17:23:00.449-02:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Natural Language</category><category domain="http://www.blogger.com/atom/ns#">Python</category><title>Spell Checker</title><description>One of the most important tasks in Natural Language Processing is spell checking. This post explains how to approach it. I am going to use the Python language as an example.&lt;br /&gt;
&lt;br /&gt;
First of all, let's not reinvent the wheel. There is a good tutorial by Peter Norvig about writing a spelling corrector at: &lt;a href="http://norvig.com/spell-correct.html"&gt;http://norvig.com/spell-correct.html&lt;/a&gt;. Let's use the same code in our example.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;import re, collections

def words(text): return re.findall('[a-z]+', text.lower()) 

def train(features):
    model = collections.defaultdict(lambda: 1)
    for f in features:
        model[f] += 1
    return model

NWORDS = train(words(file('big.txt').read()))

alphabet = 'abcdefghijklmnopqrstuvwxyz'

def edits1(word):
   splits     = [(word[:i], word[i:]) for i in range(len(word) + 1)]
   deletes    = [a + b[1:] for a, b in splits if b]
   transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b)&amp;gt;1]
   replaces   = [a + c + b[1:] for a, b in splits for c in alphabet if b]
   inserts    = [a + c + b     for a, b in splits for c in alphabet]
   return set(deletes + transposes + replaces + inserts)

def known_edits2(word):
    return set(e2 for e1 in edits1(word) for e2 in edits1(e1) if e2 in NWORDS)

def known(words): return set(w for w in words if w in NWORDS)

def correct(word):
    candidates = known([word]) or known(edits1(word)) or known_edits2(word) or [word]
    return max(candidates, key=NWORDS.get)
&lt;/pre&gt;&lt;div style="text-align: right;"&gt;Spelling Corrector code from: &lt;a href="http://norvig.com/spell-correct.html"&gt;http://norvig.com/spell-correct.html&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;
To use this code you have to download the file &lt;a href="http://norvig.com/big.txt"&gt;big.txt&lt;/a&gt; into the directory you are working in. As Peter Norvig explains: "the file is a concatenation of several public domain books from Project Gutenberg and lists of most frequent words from Wiktionary and the British National Corpus". So, we expect it to cover the most common words.&lt;br /&gt;
&lt;br /&gt;
To use this code you have to call the function correct(word). In the example below I am using a Python interpreter to show the results.&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;&amp;gt;&amp;gt;&amp;gt; correct('see')
'see'
&amp;gt;&amp;gt;&amp;gt; correct('tee')
'the'
&amp;gt;&amp;gt;&amp;gt; correct('twelfe')
'twelve'
&amp;gt;&amp;gt;&amp;gt; correct('speling')
'spelling'
&amp;gt;&amp;gt;&amp;gt; correct('korrecter')
'corrector'
&lt;/pre&gt;&lt;br /&gt;
As you can see, the function always returns a suggestion. If the word is present in the file, the suggestion is the word itself. If not, the algorithm generates the strings that are a small edit distance away from your word and keeps the ones that appear in the file. It is most probable that you mistyped one or two letters, so the known words closest to your input are the most likely corrections. The algorithm first looks for known words at edit distance 1 and then for known words at edit distance 2, and among the candidates it returns the one that is most frequent in the file. If it cannot find any valid word in either step, it returns the input word unchanged.&lt;br /&gt;
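&lt;br /&gt;
To get a feeling for how many candidates the edit-distance step produces, here is a small sketch that reuses the list comprehensions from edits1 above but only counts the results. The function name edit_counts is hypothetical, and the code is written for Python 3. For a word of length n there are n deletions, n-1 transpositions, 26n replacements and 26(n+1) insertions, that is 54n+25 strings before duplicates are removed.&lt;br /&gt;
&lt;br /&gt;

```python
import string

def edit_counts(word, alphabet=string.ascii_lowercase):
    """Count the strings of each edit type at distance 1 from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    # b[1:] is non-empty only when b has at least two letters to swap
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if b[1:]]
    replaces = [a + c + b[1:] for a, b in splits for c in alphabet if b]
    inserts = [a + c + b for a, b in splits for c in alphabet]
    return {
        "deletes": len(deletes),
        "transposes": len(transposes),
        "replaces": len(replaces),
        "inserts": len(inserts),
        # the set removes duplicates, as edits1 does
        "distinct": len(set(deletes + transposes + replaces + inserts)),
    }

counts = edit_counts("speling")
print(counts)
```

Applying the same edits a second time, as known_edits2 does, multiplies this number again, which is why the code filters the distance-2 strings against the known words instead of keeping all of them.&lt;br /&gt;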
&lt;br /&gt;
One modification I suggest is to use a real spell checker dictionary instead of the corpus. You can, for example, dump the Linux aspell dictionary with the following command line:&lt;br /&gt;
&lt;br /&gt;
&lt;pre name="code"&gt;aspell --lang en dump master &amp;gt; big.txt
&lt;/pre&gt;&lt;br /&gt;
The aspell program can be found in the major Linux distributions, including Ubuntu.&lt;br /&gt;
The generated file ('big.txt') now has 138622 words. This file represents the English language as formalized by a dictionary, including the variations found for each word (e.g. volunteer, volunteering, volunteerism, volunteered, volunteer's, volunteers). The same set of tests now gives the following result:&lt;br /&gt;
&lt;br /&gt;
&lt;pre class="python" name="code"&gt;&amp;gt;&amp;gt;&amp;gt; correct('see')
'see'
&amp;gt;&amp;gt;&amp;gt; correct('tee')
'tee'
&amp;gt;&amp;gt;&amp;gt; correct('twelfe')
'twelve'
&amp;gt;&amp;gt;&amp;gt; correct('speling')
'spelling'
&amp;gt;&amp;gt;&amp;gt; correct('korrecter')
'correcter'
&lt;/pre&gt;&lt;br /&gt;
Using this new file brings some peculiarities. For example, the words tee and correcter sound strange, but they are totally valid in English (&lt;a href="http://www.google.com/dictionary?langpair=en|en&amp;amp;q=tee&amp;amp;hl=en&amp;amp;aq=f"&gt;tee&lt;/a&gt;, &lt;a href="http://www.google.com/dictionary?langpair=en|en&amp;amp;q=correcter&amp;amp;hl=en&amp;amp;aq=f"&gt;correcter&lt;/a&gt;).&lt;br /&gt;
&lt;br /&gt;
So, it is a matter of knowing the purpose of your spell corrector and using the appropriate file to generate the valid words.&lt;br /&gt;
&lt;br /&gt;
For a full explanation of Norvig's code you can read his page (&lt;a href="http://norvig.com/spell-correct.html"&gt;http://norvig.com/spell-correct.html&lt;/a&gt;). There you can also find the same spell checker code in languages other than Python.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-7124197755358076214?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/nAOGesnXN5M" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/nAOGesnXN5M/spell-checker.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2011/01/spell-checker.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-5539792135072292792</guid><pubDate>Sat, 08 Jan 2011 19:05:00 +0000</pubDate><atom:updated>2011-01-08T17:05:00.110-02:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Sentiment Analysis</category><title>Twitter Sentiment Analysis</title><description>&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://1.bp.blogspot.com/_PqNL_88ei8E/TSi0qmYx8YI/AAAAAAAARbA/Xve11cBsYnY/s1600/Twitter2.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"&gt;&lt;img border="0" height="150" src="http://1.bp.blogspot.com/_PqNL_88ei8E/TSi0qmYx8YI/AAAAAAAARbA/Xve11cBsYnY/s200/Twitter2.jpg" width="200" /&gt;&lt;/a&gt;&lt;/div&gt;I have had so much work in the last months; that is why I have been away from posting since then. During this time I did some interesting projects that I will start posting about here.&lt;br /&gt;
&lt;br /&gt;
I am conducting research in Sentiment Analysis and I found a very interesting database for Twitter sentiment analysis. I am not particularly interested in Twitter, but since there is a lot of research on sentiment analysis for this tool, I found it useful to promote it here.&lt;br /&gt;
&lt;br /&gt;
The survey is conducted mainly by the people behind &lt;a href="http://twittersentiment.appspot.com/"&gt;http://twittersentiment.appspot.com/&lt;/a&gt;, a great app for Twitter sentiment analysis. Many others can be found in the survey doc:&lt;br /&gt;
&lt;br /&gt;
&lt;a href="https://spreadsheets0.google.com/ccc?key=tVfLhBxao70Fk9bAtebp_xQ"&gt;https://spreadsheets0.google.com/ccc?key=tVfLhBxao70Fk9bAtebp_xQ&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-5539792135072292792?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/eoM9RjGP7BM" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/eoM9RjGP7BM/twitter-sentiment-analysis.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://1.bp.blogspot.com/_PqNL_88ei8E/TSi0qmYx8YI/AAAAAAAARbA/Xve11cBsYnY/s72-c/Twitter2.jpg" height="72" width="72" /><thr:total>3</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2011/01/twitter-sentiment-analysis.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-6598535314939349868</guid><pubDate>Thu, 21 Oct 2010 10:15:00 +0000</pubDate><atom:updated>2010-10-21T08:15:37.515-02:00</atom:updated><title>Comic Book Markup Language</title><description>Yes, it exists. I read about it on the Corpora list.&lt;br /&gt;
&lt;br /&gt;
From:&amp;nbsp;&lt;a href="http://www.cbml.org/"&gt;http://www.cbml.org/&lt;/a&gt;&amp;nbsp;:&lt;br /&gt;
&lt;table align="center" border="0" cellpadding="0" cellspacing="0"&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td style="font-family: verdana, tahoma, helvetica, sans-serif; font-size: 10pt;"&gt;&lt;br /&gt;
&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class="pageTitle" style="color: black; font-family: 'comic sans ms'; font-size: small; font-variant: small-caps; font-weight: bold;"&gt;What is CBML?&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td style="font-family: verdana, tahoma, helvetica, sans-serif; font-size: 10pt;"&gt;&lt;div style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px;"&gt;CBML, or Comic Book Markup Language, is a&amp;nbsp;&lt;a href="http://www.tei-c.org/" style="color: #990000; text-decoration: none;"&gt;TEI&lt;/a&gt;-based XML vocabulary (with DTD and schema representations) designed to accommodate the XML encoding of comic books and graphic novels.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_PqNL_88ei8E/TMASDzTP8cI/AAAAAAAAQbY/ploc_uEgOrU/s1600/cbmlCollage.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="400" src="http://4.bp.blogspot.com/_PqNL_88ei8E/TMASDzTP8cI/AAAAAAAAQbY/ploc_uEgOrU/s400/cbmlCollage.jpg" width="255" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-6598535314939349868?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/dG58xVhRGQQ" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/dG58xVhRGQQ/comic-book-markup-language.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://4.bp.blogspot.com/_PqNL_88ei8E/TMASDzTP8cI/AAAAAAAAQbY/ploc_uEgOrU/s72-c/cbmlCollage.jpg" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/10/comic-book-markup-language.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-6298420072853337463</guid><pubDate>Sun, 29 Aug 2010 15:25:00 +0000</pubDate><atom:updated>2010-08-29T12:25:44.508-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Summarization</category><category domain="http://www.blogger.com/atom/ns#">NLP Tools</category><title>Understanding a Celebrity with His Salient Events</title><description>How amazing NLP is! I was reading today's &lt;a href="http://nlp.hivefire.com/articles/16228/understanding-a-celebrity-with-his-salient-events/"&gt;NLP News&lt;/a&gt; and I came across the following paper, published in the Springer Lecture Notes in Computer Science, 2010. 
I would never have imagined such an application for NLP, but it exists, and it works!&lt;br /&gt;
&lt;blockquote&gt;Internet has become a resourceful platform for people to collect information. Specially, it becomes one of the main ways to understand a celebrity. However, the huge volume of information makes troubles for people to get what they really want. How to filter out needless information through numerous data and form a brief review of a celebrity become necessary for people to understand the person. In this paper, we propose a novel solution for understanding a celebrity by summarizing his most salient historical events, and a framework is outlined. The framework contains three main components: attention tracking, event mining from News, and event summarization. First, with the comparison of users’ attention and media attention on a celebrity, News corpus is proved to be able to represent the users’ attention. Second, keywords are extracted from the News according to different time periods for choosing summary sentences. Third, a final event description of the celebrity will be given. Finally, we will show the user interface of our system. Our experimental results show that the proposed solution can effectively process the news corpus and provide us with accurate description of the celebrity.&lt;/blockquote&gt;Link:&amp;nbsp;&lt;a href="http://www.springerlink.com/content/4x333830nq8w80p7/"&gt;http://www.springerlink.com/content/4x333830nq8w80p7/&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
The NLP field proves more and more important each day. A lot of NLP applications are yet to be discovered! I love studying NLP!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-6298420072853337463?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/8e6T6iiI2j8" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/8e6T6iiI2j8/understanding-celebrity-with-his.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/08/understanding-celebrity-with-his.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-2947623908151230555</guid><pubDate>Thu, 26 Aug 2010 15:45:00 +0000</pubDate><atom:updated>2010-08-26T12:45:37.545-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">News</category><title>The Top Three hottest new majors for a career in technology</title><description>From: &lt;a href="http://microsoftjobsblog.com/blog/top-three-new-tech-majors/"&gt;Microsoft Jobs blog&lt;/a&gt;:&lt;br /&gt;
&lt;br /&gt;
&lt;span class="Apple-style-span" style="font-family: 'Segoe UI', Tahoma, Arial, Helvetica, sans-serif; font-size: 13px;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;div class="posthead" style="border-bottom-color: rgb(223, 223, 223); border-left-color: rgb(223, 223, 223); border-right-color: rgb(223, 223, 223); border-top-color: rgb(223, 223, 223); color: #767676; font-size: 1em; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Posted Monday, August 23 2010 by The JobsBloggers&lt;/div&gt;&lt;div class="postcontent" style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Have you ever wondered what fields of study are hot right now in the world of technology?&amp;nbsp; Or maybe you’re starting to think about declaring your major and you’re looking for some real world guidance?&lt;/div&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;It is worth thinking beyond a traditional Computer Science degree or even an Electrical Engineering &amp;amp; Computer Science (EECS) program. 
Microsoft is hiring people with unique backgrounds, some that are new with the inception of the Cloud, web services and the amazing scale at which the industry is operating (&lt;a href="http://en.wikipedia.org/wiki/Exabyte" style="color: #6aa646; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" target="_blank"&gt;Exabyte&lt;/a&gt;&amp;nbsp;anyone?).&lt;/div&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;The following is my list of the Top Three hottest academic areas for a&amp;nbsp;future&amp;nbsp;career in tech:&lt;/div&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;strong style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Data Mining/Machine Learning/AI/Natural Language Processing&lt;/strong&gt;&amp;nbsp;&lt;br style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" /&gt;All of these fields help us sift through and organize huge amounts of information or data. When you apply your knowledge in these areas to a challenging problem in the online space, you know that you are working at a scale that is just immense.&amp;nbsp; It’s much easier said than done.&amp;nbsp; If you have a passion for this area and have a technical background there are a multitude of open positions that might hold a long-term career for you.&amp;nbsp; With the move to the cloud and the sheer amount of information on the web, this area of expertise will continue to be in great demand. 
Microsoft has a great need for both people interested in the research space and the applied space which is very refreshing.&lt;/div&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;strong style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Business Intelligence/Competitive Intelligence&lt;/strong&gt;&lt;br style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" /&gt;The ability to see trends, make sense of data to a business audience and help to&amp;nbsp;understand your customers requires a special person.&amp;nbsp;Someone with a mix of engineering, BI/CI experience and a business mindset can take this field to the next level.&amp;nbsp;You will help increase any employer’s bottom line and be able to provide organized data that is extremely valuable to any business.&amp;nbsp;You can help drive business decisions and help your internal audience understand what the data is telling or showing you.&amp;nbsp;&lt;/div&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;&lt;strong style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;Analytics/Statistics – specifically Web Analytics, A/B Testing and statistical analysis&lt;/strong&gt;&lt;br style="margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" /&gt;All of these subjects are offshoots of traditional degrees in CS and mathematics.&amp;nbsp;They all apply to 
the online world we live in and will also be in great demand as we continue to monetize the web. Retailers, web services, and advertisers will need people in these fields as they try to get the most for their advertising money. As we continue to see the dollar amounts spent for online advertising worldwide, these fields will be hot and we will see online advertising change over time as a result of these positions.&amp;nbsp;&lt;/div&gt;&lt;div style="margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;"&gt;If these fields interest you and you want to find out what some of these jobs really entail, visit our&amp;nbsp;&lt;a href="http://careers.microsoft.com/" style="color: #6aa646; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px;" target="_blank"&gt;website&lt;/a&gt;&amp;nbsp;and search on the terms above to get a more detailed look at the positions. 
These fields are very HOT and looking long term, the demand will be just that much greater in these areas.&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-2947623908151230555?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/BKZ5Pch6JhM" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/BKZ5Pch6JhM/top-three-hottest-new-majors-for-career.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/08/top-three-hottest-new-majors-for-career.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-8300349061828067901</guid><pubDate>Mon, 16 Aug 2010 14:03:00 +0000</pubDate><atom:updated>2010-08-16T11:03:19.956-03:00</atom:updated><title>The AGI Summer School 2009</title><description>&lt;div&gt;An excellent post from Joel Pitt in the &lt;a href="http://blog.opencog.org/2010/08/15/the-agi-summer-school-2009/"&gt;OpenCog blog&lt;/a&gt;. Excellent videos!&lt;/div&gt;&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;a href="http://blog.opencog.org/2010/08/15/the-agi-summer-school-2009/"&gt;The AGI Summer School 2009&lt;/a&gt;: "&lt;p&gt;In the middle of last year, Xiamen University hosted the first international summer school on Artificial General Intelligence. While several of the core OpenCog developers, and Ben Goertzel were there to teach, it passed by somewhat quietly on our blog as we waited for the videos and presentations to be put together by independent film-maker Raj Dye.&lt;/p&gt;&lt;p&gt;Raj actually completed these early 2010, but due to various myself and others being very busy at the time, we didn’t post them here. 
However, we continue to get many requests for an easy introduction and a tutorial to OpenCog. Fortunately one of the videos is an introduction to the software framework:&lt;/p&gt;&lt;p&gt;&lt;a href="http://agi-school.org/2009/dr-joel-pitt-with-dr-ben-goertzel-opencog-software-framework"&gt;The OpenCog Software Framework&lt;/a&gt; – presented by Joel Pitt (me) and Ben Goertzel.&lt;/p&gt;&lt;p&gt;Of course, this is a the high level overview, and the other videos available on the &lt;a href="http://agi-school.org/"&gt;AGI Summer School site&lt;/a&gt; focus on more specific aspects of OpenCog, such as:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;&lt;a href="http://agi-school.org/2009/dr-nil-geissweiller-automated-program-learning-the-moses-algorithm"&gt;Automated Program Learning: the MOSES Algorithm&lt;/a&gt; presented by Nil Geisweiller.&lt;/li&gt;&lt;li&gt;&lt;a href="http://agi-school.org/2009/dr-joel-pitt-probabilistic-logical-networks"&gt;An introduction to PLN&lt;/a&gt; by myself.&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;…among many others!&lt;/p&gt;"&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-8300349061828067901?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/Byg-_Y176cE" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/Byg-_Y176cE/agi-summer-school-2009.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/08/agi-summer-school-2009.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-9176002623744070737</guid><pubDate>Wed, 11 Aug 2010 13:22:00 +0000</pubDate><atom:updated>2010-08-11T10:24:34.942-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">NLP Tools</category><title>Announcing Python 
NLTK Demos</title><description>Below is a post from the &lt;a href="http://streamhacker.com/2010/08/02/announcing-python-nltk-demos/"&gt;StreamHacker.com&lt;/a&gt; blog presenting a demo of some features of the NLTK toolkit.&lt;br /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;a href="http://feedproxy.google.com/~r/StreamHacker/~3/RS_GWtsAto8/"&gt;Announcing Python NLTK Demos&lt;/a&gt;: &lt;br /&gt;
&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;"&lt;br /&gt;
If you want to see what NLTK can do, but don't want to go thru the effort of installation and learning how to use it, then check out my &lt;a href="http://text-processing.com/demo/"&gt;Python NLTK demos&lt;/a&gt;.&lt;br /&gt;
It currently demonstrates the following functionality:&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;&lt;a href="http://text-processing.com/demo/tag/" title="Part of Speech Tagging with Python NLTK"&gt;part-of-speech tagging&lt;/a&gt; with the default NLTK pos tagger&lt;/li&gt;
&lt;li&gt;&lt;a href="http://text-processing.com/demo/tag/" title="Chunk Extraction and Named Entity Recognition with Python NLTK"&gt;chunking and named entity recognition&lt;/a&gt; with the default NLTK chunker&lt;/li&gt;
&lt;li&gt;&lt;a href="http://text-processing.com/demo/sentiment/" title="Sentiment Analysis with Python NLTK"&gt;sentiment analysis&lt;/a&gt; with a combination of a &lt;a href="http://streamhacker.com/2010/05/10/text-classification-sentiment-analysis-naive-bayes-classifier/#utm_source=feed&amp;amp;utm_medium=feed&amp;amp;utm_campaign=feed" title="Sentiment Analysis with NaiveBayesClassifier from Python NLTK"&gt;naive bayes classifier&lt;/a&gt; and a &lt;em&gt;maximum entropy classifier&lt;/em&gt;, both trained on the movie reviews corpus&lt;/li&gt;
&lt;/ul&gt;If you like it, &lt;strong&gt;&lt;a href="http://bit.ly/http://text-processing.com/demo/" title="Share on Bitly"&gt;please share it&lt;/a&gt;&lt;/strong&gt;. If you want to see more, leave a comment below. And if you are interested in a service that could apply these processes to your own data, please fill out this &lt;a href="http://streamhacker.com/nltk-services/#utm_source=feed&amp;amp;utm_medium=feed&amp;amp;utm_campaign=feed"&gt;NLTK services survey&lt;/a&gt;.&lt;br /&gt;
&lt;h2&gt;Other Natural Language Processing Demos&lt;/h2&gt;Here's a list of similar resources on the web:&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;A demo of the &lt;a href="http://nlp.stanford.edu/software/lex-parser.shtml"&gt;Stanford Parser&lt;/a&gt; with a javascript API: &lt;a href="http://nlp.naturalparsing.com/browserparser/parse"&gt;Natural-language Parsing For The Web&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;A demo of the &lt;a href="http://www.lsi.upc.edu/~nlp/freeling/"&gt;FreeLing&lt;/a&gt; language analysis suite: &lt;a href="http://garraf.epsevg.upc.es/freeling/demo.php"&gt;FreeLing Demo&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Emotional identification from text: &lt;a href="http://dtminredis.housing.salle.url.edu:8080/EmoLib/"&gt;EmoLib&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;&lt;img height="1" src="http://feeds.feedburner.com/~r/StreamHacker/~4/RS_GWtsAto8" width="1" /&gt;"&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-9176002623744070737?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/64jYuxXJfGM" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/64jYuxXJfGM/announcing-python-nltk-demos_11.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/08/announcing-python-nltk-demos_11.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-8867870543287002691</guid><pubDate>Sun, 25 Jul 2010 00:06:00 +0000</pubDate><atom:updated>2010-07-24T21:06:25.696-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Language</category><title>Do the languages we speak shape the way we think?</title><description>&lt;div style="text-align: center;"&gt;&lt;b&gt;Lost in Translation&lt;/b&gt;&lt;/div&gt;&lt;i&gt;New cognitive research suggests that language profoundly influences the way people see the world; a different sense of blame in Japanese and Spanish&lt;/i&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;div class="separator" style="clear: both; text-align: center;"&gt;&lt;a href="http://4.bp.blogspot.com/_PqNL_88ei8E/TEt-yT5pyiI/AAAAAAAAPVU/gvGwyrn5cus/s1600/babel.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"&gt;&lt;img border="0" height="265" src="http://4.bp.blogspot.com/_PqNL_88ei8E/TEt-yT5pyiI/AAAAAAAAPVU/gvGwyrn5cus/s400/babel.jpg" width="400" /&gt;&lt;/a&gt;&lt;/div&gt;&lt;div style="text-align: right;"&gt;&lt;i&gt;'The Tower of Babel' by Pieter Brueghel the Elder, 1563.&lt;/i&gt;&lt;/div&gt;&lt;br /&gt;
Do the languages we speak shape the way we think? Do they merely express thoughts, or do the structures in languages (without our knowledge or consent) shape the very thoughts we wish to express?&lt;br /&gt;
&lt;br /&gt;
Take "Humpty Dumpty sat on a..." Even this snippet of a nursery rhyme reveals how much languages can differ from one another. In English, we have to mark the verb for tense; in this case, we say "sat" rather than "sit." In Indonesian you need not (in fact, you can't) change the verb to mark tense.&lt;br /&gt;
&lt;br /&gt;
In Russian, you would have to mark tense and also gender, changing the verb if Mrs. Dumpty did the sitting. You would also have to decide if the sitting event was completed or not. If our ovoid hero sat on the wall for the entire time he was meant to, it would be a different form of the verb than if, say, he had a great fall.&lt;br /&gt;
&lt;br /&gt;
In Turkish, you would have to include in the verb how you acquired this information. For example, if you saw the chubby fellow on the wall with your own eyes, you'd use one form of the verb, but if you had simply read or heard about it, you'd use a different form.&lt;br /&gt;
&lt;br /&gt;
Do English, Indonesian, Russian and Turkish speakers end up attending to, understanding, and remembering their experiences differently simply because they speak different languages?&lt;br /&gt;
&lt;br /&gt;
These questions touch on all the major controversies in the study of mind, with important implications for politics, law and religion. Yet very little empirical work had been done on these questions until recently. The idea that language might shape thought was for a long time considered untestable at best and more often simply crazy and wrong. Now, a flurry of new cognitive science research is showing that in fact, language does profoundly influence how we see the world.&lt;br /&gt;
&lt;br /&gt;
The question of whether languages shape the way we think goes back centuries; Charlemagne proclaimed that "to have a second language is to have a second soul." But the idea went out of favor with scientists when Noam Chomsky's theories of language gained popularity in the 1960s and '70s. Dr. Chomsky proposed that there is a universal grammar for all human languages—essentially, that languages don't really differ from one another in significant ways. And because languages didn't differ from one another, the theory went, it made no sense to ask whether linguistic differences led to differences in thinking.&lt;br /&gt;
&lt;br /&gt;
The search for linguistic universals yielded interesting data on languages, but after decades of work, not a single proposed universal has withstood scrutiny. Instead, as linguists probed deeper into the world's languages (7,000 or so, only a fraction of them analyzed), innumerable unpredictable differences emerged.&lt;br /&gt;
&lt;br /&gt;
Of course, just because people talk differently doesn't necessarily mean they think differently. In the past decade, cognitive scientists have begun to measure not just how people talk, but also how they think, asking whether our understanding of even such fundamental domains of experience as space, time and causality could be constructed by language.&lt;br /&gt;
&lt;br /&gt;
For example, in Pormpuraaw, a remote Aboriginal community in Australia, the indigenous languages don't use terms like "left" and "right." Instead, everything is talked about in terms of absolute cardinal directions (north, south, east, west), which means you say things like, "There's an ant on your southwest leg." To say hello in Pormpuraaw, one asks, "Where are you going?", and an appropriate response might be, "A long way to the south-southwest. How about you?" If you don't know which way is which, you literally can't get past hello.&lt;br /&gt;
&lt;br /&gt;
About a third of the world's languages (spoken in all kinds of physical environments) rely on absolute directions for space. As a result of this constant linguistic training, speakers of such languages are remarkably good at staying oriented and keeping track of where they are, even in unfamiliar landscapes. They perform navigational feats scientists once thought were beyond human capabilities. This is a big difference, a fundamentally different way of conceptualizing space, trained by language.&lt;br /&gt;
&lt;br /&gt;
Differences in how people think about space don't end there. People rely on their spatial knowledge to build many other more complex or abstract representations including time, number, musical pitch, kinship relations, morality and emotions. So if Pormpuraawans think differently about space, do they also think differently about other things, like time?&lt;br /&gt;
&lt;br /&gt;
To find out, my colleague Alice Gaby and I traveled to Australia and gave Pormpuraawans sets of pictures that showed temporal progressions (for example, pictures of a man at different ages, or a crocodile growing, or a banana being eaten). Their job was to arrange the shuffled photos on the ground to show the correct temporal order. We tested each person in two separate sittings, each time facing in a different cardinal direction. When asked to do this, English speakers arrange time from left to right. Hebrew speakers do it from right to left (because Hebrew is written from right to left).&lt;br /&gt;
&lt;br /&gt;
Pormpuraawans, we found, arranged time from east to west. That is, seated facing south, time went left to right. When facing north, right to left. When facing east, toward the body, and so on. Of course, we never told any of our participants which direction they faced. The Pormpuraawans not only knew that already, but they also spontaneously used this spatial orientation to construct their representations of time. And many other ways to organize time exist in the world's languages. In Mandarin, the future can be below and the past above. In Aymara, spoken in South America, the future is behind and the past in front.&lt;br /&gt;
&lt;br /&gt;
In addition to space and time, languages also shape how we understand causality. For example, English likes to describe events in terms of agents doing things. English speakers tend to say things like "John broke the vase" even for accidents. Speakers of Spanish or Japanese would be more likely to say "the vase broke itself." Such differences between languages have profound consequences for how their speakers understand events, construct notions of causality and agency, what they remember as eyewitnesses and how much they blame and punish others.&lt;br /&gt;
&lt;br /&gt;
In studies conducted by Caitlin Fausey at Stanford, speakers of English, Spanish and Japanese watched videos of two people popping balloons, breaking eggs and spilling drinks either intentionally or accidentally. Later everyone got a surprise memory test: For each event, can you remember who did it? She discovered a striking cross-linguistic difference in eyewitness memory. Spanish and Japanese speakers did not remember the agents of accidental events as well as did English speakers. Mind you, they remembered the agents of intentional events (for which their language would mention the agent) just fine. But for accidental events, when one wouldn't normally mention the agent in Spanish or Japanese, they didn't encode or remember the agent as well.&lt;br /&gt;
&lt;br /&gt;
In another study, English speakers watched the video of Janet Jackson's infamous "wardrobe malfunction" (a wonderful nonagentive coinage introduced into the English language by Justin Timberlake), accompanied by one of two written reports. The reports were identical except in the last sentence where one used the agentive phrase "ripped the costume" while the other said "the costume ripped." Even though everyone watched the same video and witnessed the ripping with their own eyes, language mattered. Not only did people who read "ripped the costume" blame Justin Timberlake more, they also levied a whopping 53% more in fines.&lt;br /&gt;
&lt;br /&gt;
Beyond space, time and causality, patterns in language have been shown to shape many other domains of thought. Russian speakers, who make an extra distinction between light and dark blues in their language, are better able to visually discriminate shades of blue. The Pirahã, a tribe in the Amazon in Brazil, whose language eschews number words in favor of terms like few and many, are not able to keep track of exact quantities. And Shakespeare, it turns out, was wrong about roses: Roses by many other names (as told to blindfolded subjects) do not smell as sweet.&lt;br /&gt;
&lt;br /&gt;
Patterns in language offer a window on a culture's dispositions and priorities. For example, English sentence structures focus on agents, and in our criminal-justice system, justice has been done when we've found the transgressor and punished him or her accordingly (rather than finding the victims and restituting appropriately, an alternative approach to justice). So does the language shape cultural values, or does the influence go the other way, or both?&lt;br /&gt;
&lt;br /&gt;
Languages, of course, are human creations, tools we invent and hone to suit our needs. Simply showing that speakers of different languages think differently doesn't tell us whether it's language that shapes thought or the other way around. To demonstrate the causal role of language, what's needed are studies that directly manipulate language and look for effects in cognition.&lt;br /&gt;
&lt;br /&gt;
One of the key advances in recent years has been the demonstration of precisely this causal link. It turns out that if you change how people talk, that changes how they think. If people learn another language, they inadvertently also learn a new way of looking at the world. When bilingual people switch from one language to another, they start thinking differently, too. And if you take away people's ability to use language in what should be a simple nonlinguistic task, their performance can change dramatically, sometimes making them look no smarter than rats or infants. (For example, in recent studies, MIT students were shown dots on a screen and asked to say how many there were. If they were allowed to count normally, they did great. If they simultaneously did a nonlinguistic task—like banging out rhythms—they still did great. But if they did a verbal task when shown the dots—like repeating the words spoken in a news report—their counting fell apart. In other words, they needed their language skills to count.)&lt;br /&gt;
&lt;br /&gt;
All this new research shows us that the languages we speak not only reflect or express our thoughts, but also shape the very thoughts we wish to express. The structures that exist in our languages profoundly shape how we construct reality, and help make us as smart and sophisticated as we are.&lt;br /&gt;
&lt;br /&gt;
Language is a uniquely human gift. When we study language, we are uncovering in part what makes us human, getting a peek at the very nature of human nature. As we uncover how languages and their speakers differ from one another, we discover that human natures too can differ dramatically, depending on the languages we speak. The next steps are to understand the mechanisms through which languages help us construct the incredibly complex knowledge systems we have. Understanding how knowledge is built will allow us to create ideas that go beyond the currently thinkable. This research cuts right to the fundamental questions we all ask about ourselves. How do we come to be the way we are? Why do we think the way we do? An important part of the answer, it turns out, is in the languages we speak.&lt;br /&gt;
&lt;br /&gt;
&lt;i&gt;—Lera Boroditsky is a professor of psychology at Stanford University and editor in chief of Frontiers in Cultural Psychology.&lt;/i&gt;&lt;br /&gt;
&lt;br /&gt;
Source: &lt;a href="http://online.wsj.com/article/SB10001424052748703467304575383131592767868.html?mod=WSJ_LifeStyle_Lifestyle_5"&gt;The Wall Street Journal&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-8867870543287002691?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/zGNA34_A--A" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/zGNA34_A--A/do-languages-we-speak-shape-way-we.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://4.bp.blogspot.com/_PqNL_88ei8E/TEt-yT5pyiI/AAAAAAAAPVU/gvGwyrn5cus/s72-c/babel.jpg" height="72" width="72" /><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/07/do-languages-we-speak-shape-way-we.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-2231186159036856075</guid><pubDate>Thu, 22 Jul 2010 18:19:00 +0000</pubDate><atom:updated>2010-07-22T15:19:57.854-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Natural Language</category><title>Tip: Use Natural Language to Search your Windows 7 System</title><description>Windows Search supports some pretty complex search capabilities. The set of rules that Windows Search follows when interpreting what you type in a search box are referred to as Advanced Query Syntax (AQS). You can filter by file type, use Boolean operators and Boolean properties, specify ranges, and more. Detailed documentation about AQS is available in the Windows Developer Center. &lt;br /&gt;
&lt;br /&gt;
But did you know Windows Search supports natural language? If you don’t fancy Boolean formulations, you may want to try the natural-language approach to searching. &lt;br /&gt;
&lt;br /&gt;
So, instead of typing “kind:email from:(Carl OR Ed) received:this week”, you can enter “email from Carl or Ed received this week”. The system looks for key words (like “email”), filters out prepositions (such as “from”), handles conjunctions without making you capitalize them, and assumes the rest of what you type consists of property values that it should try to match. &lt;br /&gt;
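The interpretation steps described above can be sketched as a toy Python function. This is only an illustration under stated assumptions, not how Windows Search is actually implemented: the word lists and the function name interpret are hypothetical, and real AQS handling (for example, recognizing “received” as a property) is more involved.&lt;br /&gt;

```python
# Toy illustration only; this is NOT the Windows Search implementation.
# All three word lists below are hypothetical examples.
KIND_WORDS = {"email", "music", "document"}
PREPOSITIONS = {"from", "to", "in", "on", "by"}
CONJUNCTIONS = {"or", "and", "not"}


def interpret(query):
    """Split a natural-language query into kind keywords and remaining values."""
    kinds, values = [], []
    for word in query.split():
        lower = word.lower()
        if lower in KIND_WORDS:
            kinds.append(lower)           # key word such as "email"
        elif lower in PREPOSITIONS:
            continue                      # prepositions are filtered out
        elif lower in CONJUNCTIONS:
            values.append(lower.upper())  # accept lowercase, store the Boolean form
        else:
            values.append(word)           # everything else is a candidate value
    return {"kinds": kinds, "values": values}


result = interpret("email from Carl or Ed received this week")
print(result)
```

In this sketch the lowercase “or” is normalized to the capitalized operator, which mirrors the point that natural-language mode does not make you capitalize conjunctions.&lt;br /&gt;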
&lt;br /&gt;
But first you need to turn on the natural language searching capabilities. To do this, open Windows Explorer, choose Organize, and select Folder And Search Options. In the Folder Options dialog, click the Search tab. On the Search tab, select Use Natural Language Search. &lt;br /&gt;
&lt;br /&gt;
Source: &lt;a href="http://technet.microsoft.com/en-us/magazine/ee851676.aspx"&gt;Technet&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-2231186159036856075?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/GS7izzn26JU" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/GS7izzn26JU/tip-use-natural-language-to-search-your.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/07/tip-use-natural-language-to-search-your.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-8404056301523110913</guid><pubDate>Sun, 27 Jun 2010 03:25:00 +0000</pubDate><atom:updated>2010-06-27T00:25:12.776-03:00</atom:updated><title>Google Translates “Call Us For Free” To “Skype” In Italian</title><description>&lt;a href="http://nlp.hivefire.com/articles/15204/google-translates-call-us-for-free-to-skype-in-ita/"&gt;Google Translates “Call Us For Free” To “Skype” In Italian&lt;/a&gt;: "Google Translates “Call Us For Free” To “Skype” In Italian. TechCrunch (blog). Update: Google has confirmed that this was a machine translation error and not the result of a crowdsourced effort to change a translation. 
..."&lt;div&gt;&lt;br /&gt;&lt;/div&gt;&lt;div&gt;Sometimes statistical machine translation does not work properly :)&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-8404056301523110913?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/srS8nW26nXo" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/srS8nW26nXo/google-translates-call-us-for-free-to.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/06/google-translates-call-us-for-free-to.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-3488494867807004806</guid><pubDate>Fri, 04 Jun 2010 01:58:00 +0000</pubDate><atom:updated>2010-06-03T22:58:07.134-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">News</category><category domain="http://www.blogger.com/atom/ns#">Syntax</category><category domain="http://www.blogger.com/atom/ns#">Language</category><title>Syntax found in the brain -- not in Broca's area</title><description>&lt;span class="Apple-style-span" style="color: #333333; font-family: Georgia, serif; font-size: 13px; line-height: 20px;"&gt;Syntactic processing and Broca's area have been cozy bedfellows ever since work in the 1970s showed that patients with Broca's aphasia had difficulty comprehending syntactically complex sentences. Despite the fact that further lesion-based evidence severely weakened the relationship (e.g., Broca's aphasics are pretty good at grammaticality judgments), subsequent PET and fMRI studies prolonged the marriage by showing that the comprehension of complex sentences activates Broca's area more than simple sentences. 
These days the literature on Broca's area and sentence processing is full of controversy, with proposals ranging from the region supporting syntactic movement processing (Grodzinsky), hierarchical structure and phrase structure building (Friederici), and "linearization" (Bornkessel-Schlesewsky) to several more domain-general functions such as cognitive control (Novick), "unification" (Hagoort) and working memory (Caplan; Rogalsky/Hickok).&lt;br /&gt;
&lt;br /&gt;
Meanwhile, evidence has been accumulating that the anterior temporal lobe may house a network that behaves much more like a syntactic computation system in that it seems to be highly correlated with the presence or absence of syntactic information in a sentence (Dronkers et al., 2004; Friederici et al., 2000; Humphries et al., 2005, 2006; Mazoyer et al., 1993; Rogalsky &amp;amp; Hickok, 2008; Stowe et al., 1998; Vandenberghe, Nobre, &amp;amp; Price, 2002).&lt;br /&gt;
&lt;br /&gt;
A new study in&amp;nbsp;&lt;span style="font-style: italic;"&gt;Brain and Language&lt;/span&gt;&amp;nbsp;weighs in on the issue using the psycholinguists' favorite work of literature,&amp;nbsp;&lt;span style="font-style: italic;"&gt;Alice in Wonderland&lt;/span&gt;. Unlike most psycholinguistic nods to Lewis Carroll (a.k.a.&amp;nbsp;&lt;a href="http://en.wikipedia.org/wiki/Lewis_Carroll" style="color: #5588aa; text-decoration: none;"&gt;Charles Lutwidge Dodgson&lt;/a&gt;),&amp;nbsp;&lt;a href="http://www.sciencedirect.com/science?_ob=ArticleURL&amp;amp;_udi=B6WC0-50338MV-1&amp;amp;_user=4422&amp;amp;_coverDate=05%2F15%2F2010&amp;amp;_rdoc=1&amp;amp;_fmt=high&amp;amp;_orig=search&amp;amp;_sort=d&amp;amp;_docanchor=&amp;amp;view=c&amp;amp;_acct=C000059600&amp;amp;_version=1&amp;amp;_urlVersion=0&amp;amp;_userid=4422&amp;amp;md5=f0cfa6f2802f62c680b63f15bcba4d71" style="color: #5588aa; text-decoration: none;"&gt;the study by Jonathan Brennan and colleagues&lt;/a&gt;&amp;nbsp;did not use Jabberwocky sentences, but instead had subjects listen to text from&amp;nbsp;&lt;span style="font-style: italic;"&gt;Alice&lt;/span&gt;&amp;nbsp;while chillin' in a giant magnetic donut. The authors calculated word-by-word the amount of syntactic structure that was involved in integrating each word (basically a syntactic tree node counting analysis). These values were then correlated with the fMRI signal.&lt;br /&gt;
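To make the idea of a word-by-word node count concrete, here is a minimal Python sketch. It shows one possible reading of "syntactic tree node counting" and assumes a bracketed parse string as input; the function name nodes_closed_per_word and the example parse are hypothetical, and the study's actual procedure may differ.&lt;br /&gt;

```python
import re


def nodes_closed_per_word(bracketed):
    """Pair each word with the number of tree nodes that close right after it."""
    counts = []
    # A word is a run of non-space, non-parenthesis characters that is
    # directly followed by one or more closing parentheses. Node labels
    # such as S or NP are followed by a space, so they do not match.
    for match in re.finditer(r"([^\s()]+)(\)+)", bracketed):
        word, closers = match.group(1), match.group(2)
        counts.append((word, len(closers)))
    return counts


# Hypothetical example parse, not taken from the study.
parse = "(S (NP (DT the) (NN cat)) (VP (VBD sat)))"
counts = nodes_closed_per_word(parse)
print(counts)
```

Each count includes the word's own part-of-speech node, so larger values mark words that complete more phrases. A per-word series like this is the kind of value that could then be correlated with a recorded signal.&lt;br /&gt;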
&lt;br /&gt;
So what brain region correlated with syntactic structure? You guessed it: the anterior temporal lobe.&amp;nbsp;&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="color: #333333; font-family: Georgia, serif; font-size: 13px; line-height: 20px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="color: #333333; font-family: Georgia, serif; font-size: 13px; line-height: 20px;"&gt;Source: &lt;a href="http://www.talkingbrains.org/2010/05/syntax-found-in-brain-not-in-brocas.html"&gt;Talking Brains Blog&lt;/a&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-3488494867807004806?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/gKmYSF0bYE8" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/gKmYSF0bYE8/syntax-found-in-brain-not-in-brocas.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/06/syntax-found-in-brain-not-in-brocas.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-2813829551965986326</guid><pubDate>Tue, 01 Jun 2010 17:38:00 +0000</pubDate><atom:updated>2010-06-01T14:38:54.959-03:00</atom:updated><title>Microsoft Research Conversation Corpus</title><description>&lt;a href="http://research.microsoft.com/en-us/downloads/8f8d5323-0732-4ba0-8c6d-a5304967cc3f/default.aspx"&gt;Microsoft Research Conversation Corpus&lt;/a&gt;: "The Microsoft Research Conversation Corpus is a collection of approximately 1.3 million conversations gathered from Twitter.com. These conversations were collected using the Twitter Public API in 2009, between July 1 and August 27. Many of the tweets on Twitter are one-way, “broadcast” tweets, but this corpus is focused on two-way interactions between Twitter users, public conversations that could be used for research, on dialog modeling. Conversations in the corpus range from 2 to 243 posts in length, though the majority of interactions are short; those of length 2 account for 69 percent of the data. 
A more complete description of this effort can be found in the paper Ritter, Alan, Colin Cherry, and Bill Dolan “Unsupervised Modeling of Twitter Conversations,” NAACL 2010. We would appreciate your citing this work if you publish research exploiting this corpus. Given the conversational nature of Twitter, the data contain content that may be offensive, harmful, inaccurate, inappropriate, or unsuitable for minors. Microsoft did not create, does not endorse, and is not responsible for the content of the data. By using the data, you agree to supervise usage by minors whom you permit to use the data. Under no circumstances will Microsoft be liable in any way for the content of the data."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-2813829551965986326?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/5ZQMagAwfxs" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/5ZQMagAwfxs/microsoft-research-conversation-corpus.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>2</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/06/microsoft-research-conversation-corpus.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-5141269674776814656</guid><pubDate>Sat, 15 May 2010 15:47:00 +0000</pubDate><atom:updated>2010-05-15T12:47:17.986-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Natural Language Generation</category><category domain="http://www.blogger.com/atom/ns#">Semantic Web</category><title>The Semantic Web as a Linguistic Resource: Opportunities for Natural Language Generation</title><description>&amp;nbsp;&lt;span class="Apple-style-span" style="color: #000025; font-family: Verdana; font-size: 15px;"&gt;The Semantic Web as 
a&amp;nbsp;&lt;/span&gt;&lt;span class="Apple-style-span" style="color: #000025; font-family: Verdana; font-size: 15px;"&gt;&lt;i&gt;Linguistic&lt;/i&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="color: #000025; font-family: Verdana; font-size: 15px;"&gt;&amp;nbsp;Resource: Opportunities for Natural Language Generation&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="color: #000025; font-family: Verdana; font-size: 15px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="-webkit-border-horizontal-spacing: 4px; -webkit-border-vertical-spacing: 4px; color: #000025; font-family: Verdana; font-size: 14px;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;table border="1" cellpadding="2px" style="border-bottom-width: 0px; border-color: initial; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-style: initial; border-top-width: 0px; font-size: 1em; width: auto;"&gt;&lt;tbody&gt;
&lt;tr style="font-size: 1em;"&gt;&lt;td colspan="1" rowspan="1" style="font-size: 1em; vertical-align: top;"&gt;Research and Development in Intelligent Systems XXII&lt;br /&gt;
Proceedings of AI-2005, the Twenty-fifth SGAI International Conference on Innovative Techniques and Applications of Artificial Intelligence, Cambridge, UK, December 2005&lt;/td&gt;&lt;/tr&gt;
&lt;tr style="font-size: 1em;"&gt;&lt;td colspan="1" rowspan="1" style="font-size: 1em; vertical-align: top;"&gt;10.1007/978-1-84628-226-3_7&lt;/td&gt;&lt;/tr&gt;
&lt;tr style="font-size: 1em;"&gt;&lt;td colspan="1" rowspan="1" style="font-size: 1em; vertical-align: top;"&gt;Max&amp;nbsp;Bramer, Frans&amp;nbsp;Coenen and Tony&amp;nbsp;Allen&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
&lt;div class="AuthorGroup" style="font-size: 1em; font-weight: bold;"&gt;Chris&amp;nbsp;Mellish&lt;sup style="vertical-align: super;"&gt;1&lt;/sup&gt;&amp;nbsp;and Xiantang&amp;nbsp;Sun&lt;sup style="vertical-align: super;"&gt;1&lt;/sup&gt;&lt;/div&gt;&lt;table style="border-bottom-width: 0px; border-color: initial; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-style: initial; border-top-width: 0px; font-size: 1em; width: auto;"&gt;&lt;tbody&gt;
&lt;tr style="font-size: 1em;" valign="top"&gt;&lt;td style="font-size: 1em; vertical-align: top;"&gt;&lt;span class="Affiliation" style="font-family: Arial, Helvetica, sans-serif; font-size: 10pt;"&gt;&lt;a href="" name="Aff4" style="font-size: 1em; text-decoration: none;"&gt;&lt;/a&gt;(1)&amp;nbsp;&lt;/span&gt;&lt;/td&gt;&lt;td style="font-size: 1em; vertical-align: top;"&gt;&lt;span class="Affiliation" style="font-family: Arial, Helvetica, sans-serif; font-size: 10pt;"&gt;Department of Computing Science, University of Aberdeen, Aberdeen, AB24 3UE, UK&lt;/span&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;div style="font-size: 1em;"&gt;&lt;a href="" name="Abs1" style="font-size: 1em; text-decoration: none;"&gt;&lt;/a&gt;&lt;/div&gt;&lt;div class="Heading3" style="font-size: 1em;"&gt;Abstract&lt;/div&gt;&lt;div class="Abstract" lang="en" style="font-size: 1em; margin-top: 1em;"&gt;&lt;div class="normal" style="font-size: 1em;"&gt;This paper argues that, because the documents of the semantic web are created by human beings, they are actually much more like natural language documents than theory would have us believe. We present evidence that natural language words are used extensively and in complex ways in current ontologies. This leads to a number of dangers for the semantic web, but also opens up interesting new challenges for natural language processing. This is illustrated by our own work using natural language generation to present parts of ontologies.&lt;/div&gt;&lt;div class="normal" style="font-size: 1em;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="normal" style="font-size: 1em;"&gt;&lt;span class="Apple-style-span" style="font-size: 12px;"&gt;&lt;a href="http://www.springerlink.com/content/v5648601l0473896/fulltext.pdf"&gt;PDF (1.2 MB)&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-5141269674776814656?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/P61-hSizzGM" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/P61-hSizzGM/semantic-web-as-linguistic-resource.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/05/semantic-web-as-linguistic-resource.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-106584012140842559</guid><pubDate>Fri, 07 May 2010 15:17:00 +0000</pubDate><atom:updated>2010-05-07T12:17:01.971-03:00</atom:updated><title>Semantic network methods to disambiguate natural language meaning</title><description>&lt;a href="http://nlp.hivefire.com/articles/14460/semantic-network-methods-to-disambiguate-natural-l/"&gt;Semantic network methods to disambiguate natural language meaning&lt;/a&gt;: "A computer implemented data processor system automatically disambiguates a contextual meaning of natural language symbols to enable precise meanings to be stored for later retrieval from a natural language database, so that natural language database design is automatic, to enable flexible and efficient natural language interfaces to computers, household appliances and hand-held devices."&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-106584012140842559?l=nlpb.blogspot.com' 
alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/s52jgy8ek9E" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/s52jgy8ek9E/semantic-network-methods-to.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/05/semantic-network-methods-to.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-4906859143102220481</guid><pubDate>Wed, 28 Apr 2010 18:06:00 +0000</pubDate><atom:updated>2010-04-28T15:06:03.639-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Social Networks</category><title>What's in a Tweet?</title><description>&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 13px;"&gt;&lt;span class="Apple-style-span" style="color: #333333; font-family: Verdana, Arial, Helvetica, sans-serif; line-height: 18px;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial; font-size: small;"&gt;&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;span class="Apple-style-span" style="font-family: Arial; font-size: small;"&gt;&lt;div style="line-height: 18px;"&gt;Researchers at the&amp;nbsp;&lt;a href="http://www.parc.com/" style="color: #252571; text-decoration: underline;" target="_blank"&gt;Palo Alto Research Center&lt;/a&gt;&amp;nbsp;(PARC) are developing new ways to deal with the torrent of information flowing from social media sites like Twitter. They have developed a Twitter "topic browser" that extracts meaning from the posts in a user's timeline. This could help users scan through thousands of tweets quickly, and the underlying technology could also offer novel ways of mining Twitter for information or for creating targeted advertising.&lt;/div&gt;&lt;br /&gt;
The researchers' idea was to provide a way for users to deal with a large number of Twitter messages quickly. They found that many users wanted to be able to quickly catch up on what's been going on, without having to go through every single tweet in their timeline.&lt;br /&gt;
&lt;br /&gt;
&lt;div style="line-height: 18px;"&gt;&lt;a href="http://www.parc.com/about/people/35/ed-h-chi.html" style="color: #252571; text-decoration: underline;" target="_blank"&gt;Ed Chi&lt;/a&gt;, area manager and principal scientist for the&amp;nbsp;&lt;a href="http://asc-parc.blogspot.com/" style="color: #252571; text-decoration: underline;" target="_blank"&gt;Augmented Social Cognition Research Group&lt;/a&gt;&amp;nbsp;at PARC, says that the information coming through Twitter resembles a stream--users will dip into it from time to time, but they don't want to consume it all at once. His group's work is called the "Eddi Project" in reference to the idea of eddies in a stream.&lt;br /&gt;
&lt;br /&gt;
&lt;/div&gt;&lt;div style="line-height: 18px;"&gt;The researchers developed two main ways of filtering Twitter content. The first, presented recently at the&amp;nbsp;&lt;a href="http://www.chi2010.org/" style="color: #252571; text-decoration: underline;" target="_blank"&gt;ACM Conference on Human Factors in Computing Systems&lt;/a&gt;&amp;nbsp;in Atlanta, is a&amp;nbsp;&lt;a href="http://asc-parc.blogspot.com/2010/04/short-and-tweet-experiments-on.html" style="color: #252571; text-decoration: underline;" target="_blank"&gt;recommendation&lt;/a&gt;&amp;nbsp;system that ranks which posts in a Twitter stream a user is likely to find most interesting, based on factors such as the contents of posts as well as his interactions with other Twitter users. The second tool, the Twitter topic browser, summarizes the contents of a user's timeline so that the user can quickly survey what information has come through Twitter without having to read through every post.&lt;/div&gt;&lt;/span&gt;&lt;/span&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;&lt;br /&gt;
Read the full story in &lt;/span&gt;&lt;a href="http://www.technologyreview.com/web/25189/page1/"&gt;&lt;span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"&gt;http://www.technologyreview.com/web/25189/page1/&lt;/span&gt;&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-4906859143102220481?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/0_CztfGy0IA" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/0_CztfGy0IA/whats-in-tweet.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/04/whats-in-tweet.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-8656850245250377344</guid><pubDate>Wed, 28 Apr 2010 17:50:00 +0000</pubDate><atom:updated>2010-04-28T14:50:29.800-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Journals</category><title>The Biological Nature of Human Language</title><description>&lt;span class="Apple-style-span" style="font-family: Arial; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 13px;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif;"&gt;&lt;a href="http://www.biolinguistics.eu/index.php/biolinguistics/article/view/110/144" style="color: #2244bb;" target="_blank"&gt;&lt;em&gt;&lt;span style="color: #000099;"&gt;&lt;strong&gt;The Biological Nature of Human Language&lt;/strong&gt;&lt;/span&gt;&lt;/em&gt;&lt;/a&gt;&lt;br /&gt;
&lt;strong&gt;Anna Maria Di Sciullo, Massimo Piattelli-Palmarini, Kenneth Wexler, Robert C. Berwick&lt;/strong&gt;&amp;nbsp;&lt;em&gt;et al&lt;/em&gt;.&lt;br /&gt;
Biolinguistics 4.1: 004–034, 2010&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 13px;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Arial; font-size: small;"&gt;&lt;span class="Apple-style-span" style="font-size: 13px;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif;"&gt;&lt;div&gt;Biolinguistics aims to shed light on the specifically biological nature of human language, focusing on five foundational questions: (1) What are the properties of the language phenotype? (2) How does language ability grow and mature in individuals? (3) How is language put to use? (4) How is language implemented in the brain? (5) What evolutionary processes led to the emergence of language? These foundational questions are used here to frame a discussion of important issues in the study of language, exploring whether our linguistic capacity is the result of direct selective pressure or due to developmental or biophysical constraints, and assessing whether the neural/computational components entering into language are unique to human language or shared with other cognitive systems, leading to a discussion of advances in theoretical linguistics, psycholinguistics, comparative animal behavior and psychology, genetics/genomics, disciplines that can now place these longstanding questions in a new light, while raising challenges for future research.&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;For more reading on this subject, try the Biolinguistics journal:&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div style="text-align: center;"&gt;&lt;a href="http://www.biolinguistics.eu/index.php/biolinguistics/"&gt;http://www.biolinguistics.eu/index.php/biolinguistics/&lt;/a&gt;&lt;/div&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-8656850245250377344?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/_u7kxpMJuR8" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/_u7kxpMJuR8/biological-nature-of-human-language.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/04/biological-nature-of-human-language.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-2604955757458714278</guid><pubDate>Sat, 17 Apr 2010 23:55:00 +0000</pubDate><atom:updated>2010-04-17T20:55:18.979-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Lectures</category><category domain="http://www.blogger.com/atom/ns#">Conferences</category><title>CICLing 2010 Conference Recordings Available</title><description>&lt;span class="Apple-style-span" style="border-collapse: collapse; color: #333333; font-family: arial, sans-serif; line-height: 19px;"&gt;There are some recordings available for the 11th International Conference on&amp;nbsp;Intelligent Text Processing and Computational Linguistics (CICLing 2010).&amp;nbsp;&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="border-collapse: collapse; color: #333333; font-family: arial, sans-serif; line-height: 19px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;div style="text-align: center;"&gt;&lt;span class="Apple-style-span" style="border-collapse: collapse; color: #333333; font-family: arial, sans-serif; line-height: 19px;"&gt;&lt;a href="http://profs.info.uaic.ro/~cicling2010/"&gt;http://profs.info.uaic.ro/~cicling2010/&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="border-collapse: collapse; color: #333333; font-family: arial, sans-serif; line-height: 19px;"&gt;The quality of the recordings is not very good :(&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="border-collapse: collapse; color: #333333; font-family: arial, sans-serif; line-height: 19px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-2604955757458714278?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/Xe0mfVLw3Yc" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/Xe0mfVLw3Yc/cicling-2010-conference-recordings.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/04/cicling-2010-conference-recordings.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-9220350512045892512</guid><pubDate>Thu, 15 Apr 2010 14:32:00 +0000</pubDate><atom:updated>2010-04-15T11:41:28.838-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Lectures</category><title>JHU Summer School 2009 available on video</title><description>&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif; font-size: 13px;"&gt;I received an e-mail from NAACL with the following message.&amp;nbsp;I took a look, and there are interesting NLP topics presented by good speakers.&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif; font-size: 13px;"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="border-collapse: collapse; font-family: arial, sans-serif; font-size: 13px;"&gt;JHU has recently released videos from its 2009 two-week summer&amp;nbsp;school on NLP and Human Language Technology.&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://videolectures.net/clspss09_baltimore/" style="color: #074d8f;" target="_blank"&gt;http://videolectures.net/&lt;wbr&gt;&lt;/wbr&gt;clspss09_baltimore/&lt;/a&gt;&lt;br /&gt;
&lt;br /&gt;
You can find quite a bit of information about the 2009 summer school&amp;nbsp;here, which may help orient you to the videos:&lt;br /&gt;
&lt;br /&gt;
&lt;a href="http://www.clsp.jhu.edu/workshops/ws09/" style="color: #074d8f;" target="_blank"&gt;http://www.clsp.jhu.edu/&lt;wbr&gt;&lt;/wbr&gt;workshops/ws09/&lt;/a&gt;&lt;/span&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-9220350512045892512?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/GVx9AkXtayY" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/GVx9AkXtayY/jhu-summer-school-2009-available-on.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/04/jhu-summer-school-2009-available-on.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-8380047848633935797</guid><pubDate>Sat, 20 Feb 2010 21:33:00 +0000</pubDate><atom:updated>2010-02-21T16:49:19.548-03:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Stanford</category><category domain="http://www.blogger.com/atom/ns#">News</category><category domain="http://www.blogger.com/atom/ns#">Conferences</category><title>Stanford software is gaining the sophistication to comprehend what humans write</title><description>Below is a very good &lt;a href="http://news.stanford.edu/news/2010/february15/manning-aaas-computers-021910.html"&gt;news article&lt;/a&gt;&amp;nbsp;featuring Chris Manning, about how Stanford is working on NLP problems and what we can expect from future NLP capabilities.&lt;br /&gt;
&lt;br /&gt;
----&lt;br /&gt;
&lt;br /&gt;
&lt;i&gt;Computer science and linguistics professor says software is steadily gaining the sophistication needed to comprehend what humans write and to help us sort through the chatter of information overload.&lt;/i&gt;&lt;br /&gt;
&lt;br /&gt;
&lt;span class="Apple-style-span" style="font-size: 12px; line-height: 15px;"&gt;BY DAVID ORENSTEIN&lt;/span&gt;&lt;br /&gt;
&lt;div class="photobug" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; float: left; font-size: 12px; font-style: inherit; font-weight: inherit; margin-bottom: 0px; margin-left: 0px; margin-right: 1em; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; width: 120px;"&gt;&lt;img alt="AAAS logo" border="0" src="http://news.stanford.edu/news/2010/february15/gifs/aaas_120_col.gif" style="border-bottom-style: none; border-color: initial; border-color: initial; border-color: initial; border-left-style: none; border-right-style: none; border-style: initial; border-top-style: none; border-width: initial; border-width: initial; font-size: 12px; font-style: inherit; font-weight: inherit; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;" /&gt;&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;For people who despair that there is too much information online, Chris Manning has a response: Technology is not the problem. 
In fact, technology may understand what you're trying to say.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;At the annual conference of the American Association for the Advancement of Science (AAAS) in San Diego, the Stanford associate professor of computer science and linguistics will talk about enabling computers to process human language well enough to use the information it conveys.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;&lt;object height="190" style="clear: right; float: right;" width="300"&gt;  &lt;param name="movie" value="http://www.youtube.com/v/3KdKiE-_bRQ&amp;hl=en&amp;fs=1&amp;autoplay=1"&gt;&lt;/param&gt;&lt;param name="allowFullScreen" value="true"&gt;&lt;/param&gt;&lt;param name="allowscriptaccess" value="always"&gt;&lt;/param&gt;&lt;embed src="http://www.youtube.com/v/3KdKiE-_bRQ&amp;hl=en&amp;fs=1&amp;autoplay=0&amp;showinfo=0" type="application/x-shockwave-flash" allowfullscreen="true" allowscriptaccess="always" width="300" height="190"&gt;  &lt;/embed&gt;  &lt;/object&gt;"The problem of the age is information overload," said Manning, who'll speak 
Friday, Feb. 19, at 4:10 p.m. in Room 2 of the San Diego Convention Center. "The fundamental challenge I'm going to talk about is how we can get computers to actually understand at least a reasonable amount of what they read."&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;As computers make more sense of what's online, they will deliver more relevant search results and will help summarize, structure and act on information that individuals care about, much like a personal assistant.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;A smartphone email program that understands the difference between "We need the Q4 figures" and "We found the Q4 figures" could prove invaluable to a busy executive.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 
0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;Computers also could help researchers extract key facts from a sea of articles to create and update databases. In fact, Manning already has developed software that mines biology research papers for basic data.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;&lt;strong&gt;State of the art&lt;/strong&gt;&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;Manning readily acknowledges that the field of natural language understanding has a long way to go to catch up with popular imagination.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: 
baseline;"&gt;"The state of the art is still highly incomplete," he says. "We're just not at the level of what we see in science fiction movies. But human language technology has been making enormous advances."&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;In his AAAS talk, Manning will describe work on three emerging technologies at Stanford's Natural Language Processing (NLP) Group.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;Working with linguistics Associate Professor Dan Jurafsky, Manning has been developing a fundamental set of tools to help computers do what a pupil does in grammar school: Parse sentences. 
As with humans, computers begin to understand sentences by recognizing parts of speech and how the sentence is structured.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;The underlying technology is a branch of artificial intelligence called probabilistic machine learning. Essentially, computers are programmed to read a large number of sentences and then analyze their structure and elements, compiling statistics about verbs and nouns, and keeping track of what the subject of the sentence is doing.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;Based on those statistics, for instance, a computer might conclude that "horse" is likely to be the subject of a sentence and that "hay" is something that horses might eat. 
Technical demonstrations of Manning and Jurafsky's language parsing software are&amp;nbsp;&lt;a href="http://nlp.stanford.edu/" style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #663333; font-size: 12px; font-style: inherit; font-weight: inherit; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; text-decoration: underline; vertical-align: baseline;"&gt;available&lt;/a&gt;&amp;nbsp;on the NLP Group website.&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;Building on that level of understanding, Manning's group has created software to sort out ambiguities in language by taking whole sentences into account when deciding what each word means. For instance, "make up" can have at least three meanings: to reconcile after a spat, to concoct a story, or to apply cosmetics. The technical solution, called "joint inference," is to look for other words in the sentence that are statistically shown to be relevant. 
If the word "argument" is there, the computer will lean toward "to reconcile."&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;Finally, Manning will talk about a technology called robust textual inference, which can read a passage of text and determine whether a conclusion about it is supported. That reading comprehension task is important because it's similar to what people sometimes expect search engines to do; they'll type in a conclusion ("hotels with free Wi-Fi") and hope the engine leads them to text that supports it ("Free Wireless High-Speed Internet access in all rooms").&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;With tremendous volumes of information appearing online every day in social networks, Manning says the need to train computers to understand human language, rather than meticulously structured data, is only increasing. 
The next research frontier may therefore be getting a computer to understand "C U soon, QT."&lt;/div&gt;&lt;div style="border-bottom-width: 0px; border-color: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-size: 12px; font-style: inherit; font-weight: inherit; line-height: 1.25em; margin-bottom: 1em; margin-left: 0px; margin-right: 0px; margin-top: 0px; outline-color: initial; outline-style: initial; outline-width: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"&gt;&lt;em&gt;David Orenstein is&lt;em&gt;&amp;nbsp;associate director of communications at the Stanford School of Engineering&lt;/em&gt;.&lt;/em&gt;&lt;br /&gt;
&lt;em&gt;&lt;br /&gt;
&lt;/em&gt;&lt;br /&gt;
&lt;span class="Apple-style-span" style="font-size: medium; line-height: normal;"&gt;Source: &lt;a href="http://news.stanford.edu/news/2010/february15/manning-aaas-computers-021910.html"&gt;Stanford News&lt;/a&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-8380047848633935797?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/_tBCvzvE_t4" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/_tBCvzvE_t4/stanford-software-is-gaining.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/02/stanford-software-is-gaining.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-8757231572884974066</guid><pubDate>Fri, 19 Feb 2010 13:13:00 +0000</pubDate><atom:updated>2010-02-19T11:13:14.034-02:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Semantic Analysis</category><category domain="http://www.blogger.com/atom/ns#">Web Search</category><title>Cognition’s Semantic Technology Contributes to Microsoft’s Bing</title><description>Below is a &lt;a href="http://www.dbusinessnews.com/shownews.php?newsid=201628&amp;amp;type_news=past"&gt;news item&lt;/a&gt; reporting that Microsoft is licensing semantic technology from the Cognition platform&amp;nbsp;(from &lt;a href="http://www.cognition.com/"&gt;Cognition.com&lt;/a&gt;) to&amp;nbsp;enhance&amp;nbsp;its Bing web search. A good video about the Cognition platform can be found at&amp;nbsp;&lt;a href="http://www.cognition.com/info/videodemo.html"&gt;http://www.cognition.com/info/videodemo.html&lt;/a&gt;&lt;div&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div&gt;-------&lt;/div&gt;&lt;div&gt;&lt;span class="Apple-style-span" style="font-family: Verdana, Arial, Helvetica, sans-serif; font-size: 11px;"&gt;&lt;/span&gt;&lt;br /&gt;
&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;&lt;b&gt;&lt;span style="font-size: 11pt;"&gt;LOS ANGELES, CA&amp;nbsp;– February 17, 2010&lt;/span&gt;&lt;/b&gt;&lt;span style="font-size: 11pt;"&gt;&amp;nbsp;—&amp;nbsp;Cognition Technologies, the creator of the most advanced and complete semantic Natural Language Processing (NLP) technology on the market, today announced that Microsoft Corp. has licensed some of its proprietary semantic technologies and will be using them to enhance Bing and other applications within Microsoft. Specifically, Microsoft will incorporate Cognition’s comprehensive and robust Semantic Map of the English language.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;The non-exclusive licensing arrangement enables Microsoft to embed elements of Cognition’s semantic technologies into any Microsoft application which would benefit from an “understanding” of the English language. Initially, it will be used to enhance the user experience in Bing, Microsoft’s online decision engine.&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;“&lt;/span&gt;&lt;span style="font-family: 'Times New Roman';"&gt;Cognition’s comprehensive Semantic Map will help us continue to improve the search experiences we can offer to consumers,” said Ron Kaplan, chief scientist in the Powerset division of Bing at&amp;nbsp;&lt;span style="font-weight: normal;"&gt;Microsoft&lt;/span&gt;. “After several months of evaluation working closely with the technical team at Cognition, we believe that Cognition’s semantic technologies can help us provide better results for Bing customers.”&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;Dr. Kathleen Dahlgren, Cognition’s founder and CTO, added, “Cognition’s Semantic Map will contribute conceptual reasoning and precise question-answering that will mesh well with Microsoft’s existing search capabilities.”&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;The scope of Cognition’s Semantic Map is more than double the size of any other computational linguistic dictionary for English, and includes more than ten million semantic connections that are comprised of semantic contexts, meaning representations, taxonomy and word meaning distinctions. The Map encompasses over 540,000 word senses (word and phrase meanings); 75,000 concept classes (or synonym classes of word meanings); 8,000 nodes in the technology’s ontology or classification scheme; and 510,000 word stems (roots of words) for the English language. Cognition’s lexical resources encode a wealth of semantic, morphological and syntactic information about the words contained within documents and their relationships to each other. These resources were created, codified and reviewed by lexicographers and linguists over a span of more than 25 years.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;a href="" name="x7_j26"&gt;&lt;/a&gt;&lt;a href="" name="x7_j33"&gt;&lt;/a&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;“We are obviously very pleased that Microsoft has recognized the significant value of Cognition’s semantic technologies,” said Scott Jarus, Cognition’s CEO. “Microsoft joins a list of companies in the legal litigation support, publishing and life sciences industries who have also recognized Cognition’s ability to bring meaning and understanding to vast amounts of information.”&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;More information about Cognition’s Semantic NLP™ technology is available on its Website at&amp;nbsp;&lt;/span&gt;&lt;a href="http://www.cognition.com/"&gt;&lt;span style="color: purple; font-family: 'Times New Roman';"&gt;www.cognition.com&lt;/span&gt;&lt;/a&gt;&lt;span style="font-family: 'Times New Roman';"&gt;.&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;br /&gt;
&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt;&lt;span style="font-family: 'Times New Roman';"&gt;&lt;/span&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;a href="" name="x7_j61"&gt;&lt;/a&gt;&lt;span style="font-family: 'Times New Roman';"&gt;&lt;b&gt;&lt;span style="font-size: 11pt;"&gt;About Cognition:&lt;/span&gt;&lt;/b&gt;&lt;span style="font-size: 11pt;"&gt;&lt;o:p&gt;&lt;/o:p&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;a href="" name="x7_j65"&gt;&lt;/a&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;Cognition Technologies, based in&amp;nbsp;&lt;st1:city w:st="on"&gt;&lt;st1:place w:st="on"&gt;Los Angeles&lt;/st1:place&gt;&lt;/st1:city&gt;, has developed a revolutionary Semantic Natural Language Processing (NLP) technology which adds word and phrase meaning and “understanding” to computer applications, enabling them to be more human-like in their processing of information.&lt;span&gt;&amp;nbsp;&amp;nbsp;&lt;/span&gt;Cognition's Semantic Map, the underlying technology developed over the past 24 years, is the largest and most extensive in existence. Applications and technologies which utilize Cognition's Semantic NLP™ technology are positioned to take full advantage of Web 3.0 (the Semantic Web).&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;&lt;br /&gt;
&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;div class="MsoNormal" style="margin-bottom: 0pt; margin-left: 0in; margin-right: 0in; margin-top: 0in; text-align: justify;"&gt;&lt;span style="font-size: 11pt;"&gt;&lt;span style="font-family: 'Times New Roman';"&gt;Source:&amp;nbsp;&lt;a href="http://www.dbusinessnews.com/shownews.php?newsid=201628&amp;amp;type_news=past"&gt;http://www.dbusinessnews.com/shownews.php?newsid=201628&amp;amp;type_news=past&lt;/a&gt;&lt;/span&gt;&lt;/span&gt;&lt;/div&gt;&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-8757231572884974066?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/uOlihKwBGf4" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/uOlihKwBGf4/cognitions-semantic-technology.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/02/cognitions-semantic-technology.html</feedburner:origLink></item><item><guid isPermaLink="false">tag:blogger.com,1999:blog-3813942136986409295.post-974703454947928638</guid><pubDate>Wed, 10 Feb 2010 13:04:00 +0000</pubDate><atom:updated>2010-02-10T11:04:52.320-02:00</atom:updated><category domain="http://www.blogger.com/atom/ns#">Language</category><title>Numbers from 1 to 10 in Over 5000 Languages</title><description>The website&amp;nbsp;&lt;a href="http://www.zompist.com/numbers.htm"&gt;http://www.zompist.com/numbers.htm&lt;/a&gt;&amp;nbsp;shows the numbers from 1 to 10 in over 5000 languages, including some historical forms. Below is how the numbers changed from Old English through Middle English to modern English.&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;table&gt;&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;Old English+&lt;/td&gt;&lt;td&gt;án&lt;/td&gt;&lt;td&gt;twá&lt;/td&gt;&lt;td&gt;þrí&lt;/td&gt;&lt;td&gt;féower&lt;/td&gt;&lt;td&gt;fíf&lt;/td&gt;&lt;td&gt;sex&lt;/td&gt;&lt;td&gt;seofon&lt;/td&gt;&lt;td&gt;eahta&lt;/td&gt;&lt;td&gt;nighon&lt;/td&gt;&lt;td&gt;tíen&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Middle English+&lt;/td&gt;&lt;td&gt;an&lt;/td&gt;&lt;td&gt;two&lt;/td&gt;&lt;td&gt;three&lt;/td&gt;&lt;td&gt;four&lt;/td&gt;&lt;td&gt;fif&lt;/td&gt;&lt;td&gt;six&lt;/td&gt;&lt;td&gt;seven&lt;/td&gt;&lt;td&gt;eihte&lt;/td&gt;&lt;td&gt;nien&lt;/td&gt;&lt;td&gt;ten&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;&lt;b&gt;English&lt;/b&gt;&lt;/td&gt;&lt;td&gt;one&lt;/td&gt;&lt;td&gt;two&lt;/td&gt;&lt;td&gt;three&lt;/td&gt;&lt;td&gt;four&lt;/td&gt;&lt;td&gt;five&lt;/td&gt;&lt;td&gt;six&lt;/td&gt;&lt;td&gt;seven&lt;/td&gt;&lt;td&gt;eight&lt;/td&gt;&lt;td&gt;nine&lt;/td&gt;&lt;td&gt;ten&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;&lt;/table&gt;&lt;br /&gt;
More at:&amp;nbsp;&lt;a href="http://www.zompist.com/numbers.shtml"&gt;http://www.zompist.com/numbers.shtml&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/3813942136986409295-974703454947928638?l=nlpb.blogspot.com' alt='' /&gt;&lt;/div&gt;&lt;img src="http://feeds.feedburner.com/~r/NaturalLanguageProcessingWorld/~4/NS8_XWvmb8g" height="1" width="1"/&gt;</description><link>http://feedproxy.google.com/~r/NaturalLanguageProcessingWorld/~3/NS8_XWvmb8g/numbers-from-1-to-10-in-over-5000.html</link><author>noreply@blogger.com (Pedro Paulo Balage)</author><thr:total>0</thr:total><feedburner:origLink>http://nlpb.blogspot.com/2010/02/numbers-from-1-to-10-in-over-5000.html</feedburner:origLink></item></channel></rss>

