<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-4127035516184942767</id><updated>2024-10-09T13:00:54.766+11:00</updated><category term="announcement"/><category term="carabao"/><category term="release"/><category term="machine translation"/><category term="programming"/><category term="statistical machine translation"/><category term="SMT"/><category term="data"/><category term="distribution"/><category term="vista"/><category term="BLEU"/><category term="COM"/><category term="NLP"/><category term="SxS"/><category term="alta"/><category term="blackberry"/><category term="business"/><category term="development"/><category term="dictionary"/><category term="event"/><category term="freeze"/><category term="iis"/><category term="it"/><category term="linguasys"/><category term="localization"/><category term="mobile"/><category term="multilingual"/><category term="opinion"/><category term="predictions; language"/><category term="press"/><category term="publication"/><category term="publicity"/><category term="rant"/><category term="regfree"/><category term="sentiment"/><category term="support"/><category term="text analytics"/><category term="tip"/><category term="website"/><category term="weird"/><title type='text'>Digital Sonata&#39;s Blog</title><subtitle type='html'>A blog of &lt;a href=&quot;http://www.digitalsonata.com&quot;&gt;Digital Sonata&lt;/a&gt;, the home of Carabao Language Kit</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://digitalsonata.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/4127035516184942767/posts/default?alt=atom'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default?alt=atom&amp;start-index=26&amp;max-results=25'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>39</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-6396852638912407174</id><published>2012-04-27T18:58:00.000+10:00</published><updated>2012-04-27T19:01:51.452+10:00</updated><title type='text'>Workaround for conversion of Unicode Vietnamese to ANSI</title><content type='html'>A few months ago, we had an interesting project which involved Vietnamese. As a part of the project, we have hit a minor snag. &lt;br /&gt;
&lt;br /&gt;
It looks like the first support for Vietnamese appeared in the 1990s, which means the ANSI codepage was built in a hurry, when Unicode was either in the works or already out. The tonal nature of the language demands complex combinations of diacritics, not just the regular grave accents, umlauts and such. Simply put, a single-byte codepage did not have enough slots for all the new characters, so the creators of the Vietnamese codepage stuffed them in wherever and however they could.&lt;br /&gt;
&lt;br /&gt;
Today very few people use ANSI; however, it is still needed for several reasons: legacy being one (people still work with mainframes, you know), and compliance another. Of course, there is the tried and true function &lt;i&gt;WideCharToMultiByte&lt;/i&gt;, which works like a Swiss chronometer. That is, for most languages - except Vietnamese. There are posts by Microsoft folks stating that &quot;&lt;a href=&quot;http://blogs.msdn.com/b/michkap/archive/2005/08/27/457224.aspx&quot; target=&quot;_blank&quot;&gt;Vietnamese is a complex language on Windows&lt;/a&gt;&quot; (duh!), but not really telling how to fix it. I asked around, and no one replied, as expected (I love &lt;a href=&quot;http://stackoverflow.com/questions/7703114/using-widechartomultibyte-with-codepage-1258-vietnamese&quot; target=&quot;_blank&quot;&gt;how Stackoverflow people react&lt;/a&gt; when they can&#39;t answer the question :-) ).&lt;br /&gt;
&lt;br /&gt;
After scrutinising the result, I saw what the problem was. It seems that the decomposition routine used during the conversion is unable to handle some combinations of Unicode characters. Manually decomposing the affected characters into their equivalents worked for me.&lt;br /&gt;
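&lt;br /&gt;
To illustrate the idea outside Clarion and C++, here is a minimal Python sketch. It is not the code linked at the end of this post, and it rests on assumptions that may not match your setup: it uses Python&#39;s built-in &lt;i&gt;cp1258&lt;/i&gt; codec rather than &lt;i&gt;WideCharToMultiByte&lt;/i&gt;, it assumes the five combining tone marks listed in TONE_MARKS are the ones the codepage can store on their own, and the names &lt;i&gt;to_cp1258&lt;/i&gt; and TONE_MARKS are made up for this example. Characters the codepage has no slot for are split into a base letter plus a separate tone mark:&lt;br /&gt;

```python
import unicodedata

# Assumption: the combining tone marks that code page 1258 can store
# as separate bytes (grave, acute, tilde, hook above, dot below).
TONE_MARKS = {"\u0300", "\u0301", "\u0303", "\u0309", "\u0323"}


def to_cp1258(text):
    """Encode text to cp1258 bytes, peeling tone marks off letters
    that have no precomposed slot in the code page.

    Hypothetical helper for illustration only.
    """
    pieces = []
    for ch in unicodedata.normalize("NFC", text):
        try:
            # Keep the character as-is if the code page has a slot for it.
            ch.encode("cp1258")
            pieces.append(ch)
            continue
        except UnicodeEncodeError:
            pass
        # Fully decompose, then separate the tone marks from the rest.
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if c not in TONE_MARKS)
        tones = "".join(c for c in decomposed if c in TONE_MARKS)
        # Recompose the base (e.g. e + circumflex) and append the tones.
        pieces.append(unicodedata.normalize("NFC", base) + tones)
    # Still raises UnicodeEncodeError for characters outside the code page.
    return "".join(pieces).encode("cp1258")


if __name__ == "__main__":
    print(to_cp1258("Vi\u1ec7t"))  # prints b'Vi\xea\xf2t'
```

On the sample word, the letter carrying both a circumflex and a dot below comes out as two bytes: the circumflex letter followed by the combining dot. Anything the codepage cannot represent at all still raises UnicodeEncodeError.&lt;br /&gt;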
&lt;br /&gt;
I use an esoteric language called &lt;a href=&quot;http://www.softvelocity.com/&quot; target=&quot;_blank&quot;&gt;Clarion&lt;/a&gt; to design our tools and some components, so my original code is in Clarion. A few days ago, &lt;a href=&quot;http://www.jacobsm.com/&quot; target=&quot;_blank&quot;&gt;Mark Jacobs&lt;/a&gt; from Critical Research contacted me to request help with the same issue, and kindly converted my Clarion source code to C++, which is more familiar to the rest of the world. Thanks, Mark!&lt;br /&gt;
&lt;br /&gt;
Get Clarion source code &lt;a href=&quot;http://www.digitalsonata.com/FixVietnamese.clw&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt; and Mark&#39;s C++ &lt;a href=&quot;http://www.digitalsonata.com/fixvietnamese.cpp&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/6396852638912407174' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/6396852638912407174'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/6396852638912407174'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2012/04/workaround-for-conversion-of-unicode.html' title='Workaround for conversion of Unicode Vietnamese to ANSI'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-101136933041934212</id><published>2010-12-10T16:54:00.005+11:00</published><updated>2010-12-10T17:28:41.638+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="linguasys"/><category scheme="http://www.blogger.com/atom/ns#" term="press"/><title type='text'>LinguaSys - USA Today</title><content type='html'>An article about an Android app built by our partners in &lt;a href=&quot;http://www.linguasys.com&quot; 
class=&quot;dynalink&quot; target=&quot;_blank&quot;&gt;LinguaSys&lt;/a&gt; using Carabao as a backend (among others), is published in &lt;a href=&quot;http://content.usatoday.com/communities/technologylive/post/2010/12/tgphoto-app-translates-signs-in-foreign-languages/1&quot; class=&quot;dynalink&quot; target=&quot;_blank&quot;&gt;USA Today&lt;/a&gt; and &lt;a href=&quot;http://ivr.tmcnet.com/topics/two-way-sms/articles/125196-linguasys-unveils-instant-translation-application-android-devices.htm&quot; class=&quot;dynalink&quot; target=&quot;_blank&quot;&gt;other news sources&lt;/a&gt;.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/101136933041934212' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/101136933041934212'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/101136933041934212'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/12/linguasys-usa-today.html' title='LinguaSys - USA Today'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-8861737545331813226</id><published>2010-12-06T14:03:00.005+11:00</published><updated>2010-12-10T17:29:04.464+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="alta"/><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" 
term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="event"/><title type='text'>ALTA 2010</title><content type='html'>Digital Sonata was invited to present at the &lt;a href=&quot;http://www.alta.asn.au/events/alta2010/alta-2010-program.html&quot; class=&quot;dynalink&quot; target=&quot;_blank&quot;&gt;Australian Language Technology Association 2010 workshop&lt;/a&gt; on Thursday, December 9, 2010. Overview and directions are &lt;a href=&quot;http://alta.asn.au/events/alta2010/&quot; class=&quot;dynalink&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Vadim Berman will be speaking some time between 3:30pm and 5:30pm.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/8861737545331813226' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8861737545331813226'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8861737545331813226'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/12/alta-2010.html' title='ALTA 2010'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-5615375532353222615</id><published>2010-06-07T10:24:00.003+10:00</published><updated>2010-06-08T10:33:29.143+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" 
term="carabao"/><title type='text'>Digital Sonata Signs Long Term Exclusive Agreement with LinguaSys™ For Use Of Carabao For Machine Translation</title><content type='html'>&lt;p&gt;Digital Sonata signed a long term deal with &lt;a href=&quot;http://www.linguasys.net&quot; target=&quot;_blank&quot; class=&quot;dynalink&quot;&gt;LinguaSys™&lt;/a&gt; for the exclusive use of Carabao in machine translation (MT) solutions on March 24, 2010.&lt;/p&gt;&lt;p&gt;Carabao is a hybrid language translation system using both statistical and rules-based methodologies.  Vadim Berman, CEO of Digital Sonata in Australia, and Chief Technology Officer and a co-founder of LinguaSys, is the author of Carabao.  Berman has a wealth of dedicated experience in the field of MT and text analysis. &lt;/p&gt;&lt;p&gt;Brian Garr, CEO of LinguaSys said, “We are very excited that this incredible technology from Digital Sonata will help us create the next generation of language translation solutions.”&lt;/p&gt;&lt;p&gt;LinguaSys is a new next generation machine translation company.  LinguaSys’ Carabao language middleware uses language processing methodologies offering excellent comprehension in the least amount of time at low cost.  LinguaSys enables enterprises to translate volumes of information, including text chat, e-mail, web pages and documents, quickly, accurately and automatically.  
LinguaSys provides the creation of new MT languages, customized lexical services, ease of use, compatibility with existing natural language software, security behind the firewall, availability, integration and lower memory requirements.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/5615375532353222615' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/5615375532353222615'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/5615375532353222615'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/06/digital-sonata-signs-long-term.html' title='Digital Sonata Signs Long Term Exclusive Agreement with LinguaSys™ For Use Of Carabao For Machine Translation'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-7489859308698665950</id><published>2010-05-03T12:58:00.003+10:00</published><updated>2010-05-03T13:04:11.243+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Carabao Language Kit 1.7.0.0 released</title><content type='html'>&lt;p&gt;The version 1.7.0.0 is now available for download. 
&lt;/p&gt;&lt;p&gt;Fixed:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Handling of control priority greater than 2, when some of the members have no feasible agreement graph. The result was, that some parts of the sequence worked, and some didn&#39;t.&lt;/li&gt;&lt;li&gt;Truncation of very long sentences&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Added:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;A utility to validate and correct rule unit values&lt;br /&gt;&lt;/li&gt;&lt;li&gt;A generic support for formatted processing, e.g. HTML, XML, SGML including embedded formatting elements in the text flow&lt;/li&gt;&lt;li&gt;GUI to test formatted processing in Carabao Test Console&lt;/li&gt;&lt;li&gt;Automatic conversion of double-byte space characters into standard single-byte&lt;/li&gt;&lt;/ul&gt;&lt;p&gt;Improved:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;Regular expressions for segmentation into character classes for double-byte languages&lt;/li&gt;&lt;li&gt;Perl-compatible regular expressions have been introduced for unknown heuristics&lt;/li&gt;&lt;li&gt;Frequency-based backtracking added to the tokenization algorithm&lt;/li&gt;&lt;li&gt;Unicode clipboard support in Carabao desktop suites is now bidirectional: when leaving the application and when coming back to the application&lt;/li&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/7489859308698665950' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7489859308698665950'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7489859308698665950'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/05/carabao-language-kit-1700-released.html' title='Carabao Language Kit 1.7.0.0 released'/><author><name>Vadim 
Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-4074788663899154835</id><published>2010-04-19T20:59:00.003+10:00</published><updated>2010-04-19T21:01:26.255+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="multilingual"/><category scheme="http://www.blogger.com/atom/ns#" term="publication"/><title type='text'>Publication in Multilingual</title><content type='html'>&lt;p&gt;My article on evaluation of emerging language technologies was published in &lt;a href=&quot;http://www.multilingual.com&quot; target=&quot;_blank&quot;&gt;Multilingual&lt;/a&gt;:&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://multilingual.texterity.com/multilingual/20100405?pg=59#pg64&quot; target=&quot;_blank&quot;&gt;http://multilingual.texterity.com/multilingual/20100405?pg=59#pg64&lt;/a&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/4074788663899154835' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/4074788663899154835'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/4074788663899154835'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/04/publication-in-multilingual.html' title='Publication in Multilingual'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image 
rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-1123561727416349314</id><published>2010-01-27T09:40:00.005+11:00</published><updated>2010-02-02T18:33:20.056+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="dictionary"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>English - Swedish OLIF dictionary released</title><content type='html'>&lt;p&gt;An English - Swedish OLIF dictionary has been added to the list of OLIF lexicons distributed by Digital Sonata. The dictionary is available for download from &lt;a href=&quot;http://www.digitalsonata.com/download.aspx?type=linguisticData&quot; class=&quot;dynalink&quot; target=&quot;_blank&quot;&gt;http://www.digitalsonata.com/download.aspx?type=linguisticData&lt;/a&gt;.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/1123561727416349314' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/1123561727416349314'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/1123561727416349314'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/01/english-swedish-olif-dictionary-added.html' title='English - Swedish OLIF dictionary released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' 
height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-866885016079548546</id><published>2010-01-10T13:16:00.008+11:00</published><updated>2010-01-11T08:40:59.021+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="data"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Bilingual OLIF dictionaries released</title><content type='html'>&lt;p&gt;Digital Sonata released a set of low-cost royalty-free bilingual dictionaries in &lt;a href=&quot;http://www.olif.net&quot; target=&quot;_blank&quot; class=&quot;dynalink&quot;&gt;OLIF format&lt;/a&gt;, optimized for use in NLP and content management applications. Translation, part of the speech, and a thesaurus article is included. The dictionaries are available at &lt;a href=&quot;http://www.digitalsonata.com/download.aspx?type=linguisticData&quot; target=&quot;_blank&quot; class=&quot;dynalink&quot;&gt;http://www.digitalsonata.com/download.aspx?type=linguisticData&lt;/a&gt;. 
Currently the following dictionaries are available:&lt;/p&gt;&lt;ul&gt;&lt;br /&gt; &lt;li&gt;English -&gt; Finnish&lt;/li&gt;&lt;br /&gt; &lt;li&gt;English -&gt; French&lt;/li&gt;&lt;br /&gt; &lt;li&gt;English -&gt; German&lt;/li&gt;&lt;br /&gt; &lt;li&gt;English -&gt; Japanese&lt;/li&gt;&lt;br /&gt; &lt;li&gt;English -&gt; Korean&lt;/li&gt;&lt;br /&gt; &lt;li&gt;English -&gt; Russian&lt;/li&gt;&lt;br /&gt; &lt;li&gt;English -&gt; Spanish&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/866885016079548546' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/866885016079548546'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/866885016079548546'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/01/bilingual-olif-dictionaries-released.html' title='Bilingual OLIF dictionaries released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-4663936982725370381</id><published>2010-01-05T19:38:00.004+11:00</published><updated>2010-01-19T15:37:26.373+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Carabao Language Kit 1.6.2.1 released</title><content 
type='html'>&lt;p&gt;The version 1.6.2.1 is now available for download. &lt;/p&gt;&lt;p&gt;Fixed:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Transliteration to empty string&lt;/li&gt;&lt;li&gt;Partial transliteration&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Added:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;Change log which allows distributed collaboration&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Improved:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;Processing speed&lt;/li&gt;&lt;li&gt;Entry matching accuracy&lt;/li&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/4663936982725370381' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/4663936982725370381'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/4663936982725370381'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2010/01/carabao-language-kit-1621-released.html' title='Carabao Language Kit 1.6.2.1 released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-2719484270395416129</id><published>2009-07-22T13:21:00.003+10:00</published><updated>2009-07-22T13:56:29.620+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="iis"/><category scheme="http://www.blogger.com/atom/ns#" term="programming"/><category scheme="http://www.blogger.com/atom/ns#" 
term="vista"/><title type='text'>Running IIS7 on Windows Vista Home</title><content type='html'>Many people get stuck when trying to use their Windows Vista Home machines for web development involving IIS deployment. Usually the tricky part is the error message saying that ASPX pages are not handled. As the management console is not installed on some Vista editions, the workaround is to edit the configuration file directly.&lt;br /&gt;&lt;br /&gt;The configuration file is &lt;b&gt;C:\Windows\System32\inetsrv\config\applicationHost.config&lt;/b&gt;. Simply overwrite the &lt;b&gt;&amp;lt;handlers&amp;gt;&lt;/b&gt; section with the following:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;          &amp;lt;handlers accessPolicy=&quot;Script, Read&quot;&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;TraceHandler-Integrated&quot; path=&quot;trace.axd&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; type=&quot;System.Web.Handlers.TraceHandler&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;WebAdminHandler-Integrated&quot; path=&quot;WebAdmin.axd&quot; verb=&quot;GET,DEBUG&quot; type=&quot;System.Web.Handlers.WebAdminHandler&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;AssemblyResourceLoader-Integrated&quot; path=&quot;WebResource.axd&quot; verb=&quot;GET,DEBUG&quot; type=&quot;System.Web.Handlers.AssemblyResourceLoader&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;PageHandlerFactory-Integrated&quot; path=&quot;*.aspx&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; type=&quot;System.Web.UI.PageHandlerFactory&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;SimpleHandlerFactory-Integrated&quot; path=&quot;*.ashx&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; type=&quot;System.Web.UI.SimpleHandlerFactory&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            
&amp;lt;add name=&quot;WebServiceHandlerFactory-Integrated&quot; path=&quot;*.asmx&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; type=&quot;System.Web.Services.Protocols.WebServiceHandlerFactory, System.Web.Services, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b03f5f7f11d50a3a&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;HttpRemotingHandlerFactory-rem-Integrated&quot; path=&quot;*.rem&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; type=&quot;System.Runtime.Remoting.Channels.Http.HttpRemotingHandlerFactory, System.Runtime.Remoting, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;HttpRemotingHandlerFactory-soap-Integrated&quot; path=&quot;*.soap&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; type=&quot;System.Runtime.Remoting.Channels.Http.HttpRemotingHandlerFactory, System.Runtime.Remoting, Version=2.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089&quot; preCondition=&quot;integratedMode&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;AXD-ISAPI-2.0&quot; path=&quot;*.axd&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll&quot; preCondition=&quot;classicMode,runtimeVersionv2.0,bitness32&quot; responseBufferLimit=&quot;0&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;PageHandlerFactory-ISAPI-2.0&quot; path=&quot;*.aspx&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll&quot; preCondition=&quot;classicMode,runtimeVersionv2.0,bitness32&quot; responseBufferLimit=&quot;0&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;SimpleHandlerFactory-ISAPI-2.0&quot; path=&quot;*.ashx&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; modules=&quot;IsapiModule&quot; 
scriptProcessor=&quot;%windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll&quot; preCondition=&quot;classicMode,runtimeVersionv2.0,bitness32&quot; responseBufferLimit=&quot;0&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;WebServiceHandlerFactory-ISAPI-2.0&quot; path=&quot;*.asmx&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll&quot; preCondition=&quot;classicMode,runtimeVersionv2.0,bitness32&quot; responseBufferLimit=&quot;0&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;HttpRemotingHandlerFactory-rem-ISAPI-2.0&quot; path=&quot;*.rem&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll&quot; preCondition=&quot;classicMode,runtimeVersionv2.0,bitness32&quot; responseBufferLimit=&quot;0&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;HttpRemotingHandlerFactory-soap-ISAPI-2.0&quot; path=&quot;*.soap&quot; verb=&quot;GET,HEAD,POST,DEBUG&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\Microsoft.NET\Framework\v2.0.50727\aspnet_isapi.dll&quot; preCondition=&quot;classicMode,runtimeVersionv2.0,bitness32&quot; responseBufferLimit=&quot;0&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;ASPClassic&quot; path=&quot;*.asp&quot; verb=&quot;GET,HEAD,POST&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\system32\inetsrv\asp.dll&quot; resourceType=&quot;File&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;SecurityCertificate&quot; path=&quot;*.cer&quot; verb=&quot;GET,HEAD,POST&quot; modules=&quot;IsapiModule&quot; scriptProcessor=&quot;%windir%\system32\inetsrv\asp.dll&quot; resourceType=&quot;File&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;SSINC-stm&quot; path=&quot;*.stm&quot; verb=&quot;GET,POST&quot; 
modules=&quot;ServerSideIncludeModule&quot; resourceType=&quot;File&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;SSINC-shtm&quot; path=&quot;*.shtm&quot; verb=&quot;GET,POST&quot; modules=&quot;ServerSideIncludeModule&quot; resourceType=&quot;File&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;SSINC-shtml&quot; path=&quot;*.shtml&quot; verb=&quot;GET,POST&quot; modules=&quot;ServerSideIncludeModule&quot; resourceType=&quot;File&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;ISAPI-dll&quot; path=&quot;*.dll&quot; verb=&quot;*&quot; modules=&quot;IsapiModule&quot; resourceType=&quot;File&quot; requireAccess=&quot;Execute&quot; allowPathInfo=&quot;true&quot; /&amp;gt;&lt;br /&gt;            &amp;lt;add name=&quot;CGI-exe&quot; path=&quot;*.exe&quot; verb=&quot;*&quot; modules=&quot;CgiModule&quot; resourceType=&quot;File&quot; requireAccess=&quot;Execute&quot; allowPathInfo=&quot;true&quot; /&amp;gt;&lt;br /&gt;            &lt;br /&gt;          &amp;lt;/handlers&amp;gt;&lt;br /&gt;&lt;br /&gt;Done.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/2719484270395416129' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/2719484270395416129'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/2719484270395416129'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2009/07/running-iis7-on-windows-vista-home.html' title='Running IIS7 on Windows Vista Home'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-7259029644505417163</id><published>2009-04-28T15:25:00.010+10:00</published><updated>2009-06-08T15:38:12.321+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="blackberry"/><category scheme="http://www.blogger.com/atom/ns#" term="mobile"/><category scheme="http://www.blogger.com/atom/ns#" term="programming"/><category scheme="http://www.blogger.com/atom/ns#" term="tip"/><title type='text'>BlackBerry connection tip</title><content type='html'>&lt;p style=&quot;text-align:justify;&quot;&gt;We developed a small client application for BlackBerry phones that connects to &lt;a href=&quot;http://www.mymiamia.com/&quot; target=&quot;_blank&quot;&gt;MiaMia&lt;/a&gt; servers. A lot can be said about BlackBerry&#39;s adherence to standards, but I don&#39;t want to use cuss words here. The main problem is that these phones were initially devised to serve as &quot;corporate walkie-talkies&quot;, and only recently was their design stretched to accommodate the needs of the general public. In a nutshell, this is the hardware equivalent of Lotus Notes: a combination of a lack of respect for standards and a kung fu grip on corporate customers.&lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;As a result, once in a while developers run into what can only be described as a &quot;chastity belt&quot; when it comes to communicating with the outside world. 
Search Google for &lt;a href=&quot;http://www.google.com/search?hl=en&amp;amp;safe=off&amp;amp;rls=en&amp;amp;hs=Gy0&amp;amp;num=100&amp;amp;q=blackberry+connection+problem&amp;amp;btnG=Search&quot; target=&quot;_blank&quot;&gt;blackberry connection problems&lt;/a&gt; or read &lt;a href=&quot;http://www.paxmodept.com/telesto/blogitem.htm?id=617&quot; target=&quot;_blank&quot;&gt;this frustrated rant that sums it all up&lt;/a&gt;. And if, as a BlackBerry user, you have had problems with MSN Messenger, IRC, or anything else, the reason is the same.&lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;By default, whenever a proxy-based connection (MDS, BES, BIS) exists, BlackBerry seems to re-route all requests through it. It is not at all certain that your tiny app will be allowed to go any further. Chances are, it will be stopped, and it may hang. However, there are two options that should work in most cases:&lt;/p&gt;&lt;br /&gt;&lt;ul&gt;&lt;li&gt;direct TCP/IP via the GPRS network&lt;/li&gt;&lt;li&gt;Wi-Fi (WLAN) connection&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;&lt;p&gt;To cut a long story short, the easiest way to provide maximum coverage is to create a tiny function like this:&lt;/p&gt;&lt;br /&gt;&lt;br /&gt;&lt;pre name=&quot;code&quot; class=&quot;java&quot;&gt;protected String getDeviceSpecificUrlParameters()&lt;br /&gt;{&lt;br /&gt;    try {&lt;br /&gt;        // Prefer Wi-Fi when the device reports an active WLAN connection&lt;br /&gt;        if (net.rim.device.api.system.WLANInfo.getWLANState() == net.rim.device.api.system.WLANInfo.WLAN_STATE_CONNECTED)&lt;br /&gt;        {&lt;br /&gt;            return &quot;;deviceside=true;interface=wifi;ConnectionSetup=delayed;retrynocontext=true&quot;;&lt;br /&gt;        }&lt;br /&gt;    } catch (Exception e) {} // doesn&#39;t matter, just fall through to the default below&lt;br /&gt;    // Otherwise ask for a direct, device-side connection instead of the proxy&lt;br /&gt;    return &quot;;deviceside=true&quot;;&lt;br /&gt;}&lt;/pre&gt;&lt;br /&gt;Then call it whenever you open a connection, e.g.:&lt;br /&gt;&lt;pre name=&quot;code&quot; class=&quot;java&quot;&gt; ...&lt;br /&gt;Connector.open(url + 
getDeviceSpecificUrlParameters());&lt;br /&gt;...&lt;/pre&gt;&lt;br /&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;If your application is designed to work across different platforms (like ours), you can re-implement &lt;i&gt;getDeviceSpecificUrlParameters()&lt;/i&gt; for whatever platform you need. &lt;/p&gt;&lt;br /&gt;&lt;p&gt;A word of warning: if the phone is brand new and GPRS access is not configured, you may need to explain to your users how to set up their APN parameters. The carrier-specific parameters can be found &lt;a href=&quot;http://www.unlocks.co.uk/gprs_settings.php&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;, &lt;a href=&quot;http://www.pinstack.com/carrier_settings_apn_gateway.html&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;, and &lt;a href=&quot;http://www.blackberryfaq.com/index.php/Carrier_specific_APN/TCP_settings&quot; target=&quot;_blank&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/7259029644505417163' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7259029644505417163'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7259029644505417163'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2009/04/blackberry-connection-tip.html' title='BlackBerry connection tip'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-5836995962087352844</id><published>2009-04-14T17:33:00.009+10:00</published><updated>2009-04-28T16:12:49.914+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="freeze"/><category scheme="http://www.blogger.com/atom/ns#" term="it"/><category scheme="http://www.blogger.com/atom/ns#" term="support"/><category scheme="http://www.blogger.com/atom/ns#" term="vista"/><title type='text'>Making peace with Vista</title><content type='html'>&lt;p style=&quot;text-align:justify;&quot;&gt;This is not directly related to the subject of the blog, but is worth sharing, since so many people are having problems with this.&lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;My development machine runs Windows Vista. I&#39;m not a fan of it (probably, like the majority), but I have to eat the same food my users do. Many people experience the same phenomenon, where Vista suddenly freezes without any provocation. It gets progressively worse with time. In my case, a few days ago it came to a point where the freezing occurred every half an hour. &lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;I tried different things: switching off visual styles, removing the useless Windows Defender, etc. The performance improved, of course, but the freeze attacks still remained.&lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;Finally, I found &lt;a href=&quot;http://www.winvistaclub.com/t24.html&quot; target=&quot;_blank&quot;&gt;this tip&lt;/a&gt;. 
(In Vista Home Basic, it is a bit different: you need to navigate to &lt;i&gt;Control Panel&lt;/i&gt; -&gt; &lt;i&gt;System&lt;/i&gt; -&gt; &lt;i&gt;Performance&lt;/i&gt; -&gt; &lt;i&gt;Adjust indexing options&lt;/i&gt;.) It all makes sense: in databases, functional indices are usually paramount, and while the system builds them, everything else is secondary. The most important parallel: if an index is corrupt, all kinds of bad things are likely to happen. But why would we need them in a desktop OS? It turns out that the reason is the ultra-annoying search that is part of the Start menu. I suppose the reason for this nuisance was to show that it&#39;s no worse than OS X. Well, this is dumb. The old search was fine, and those who need better capabilities generally use something else. As a result, a background process in Vista indexes nearly every file in the system. Why not make the indexing a scheduled task, and fall back to un-indexed search when necessary?&lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;After I reduced the set of files to index to the contents of the Start menu only, Vista suddenly became the most stable, high-performance system I have ever seen - I kid you not! On the other hand, if you do need and like that new bloated search, or need the thumbnails, simply go and rebuild the index. While you are at it, review which file extensions you want indexed. &lt;/p&gt;&lt;p style=&quot;text-align:justify;&quot;&gt;Oh, and of course, I assume no responsibility if you do something silly to your machine, so don&#39;t just click everything without thinking. 
&lt;/p&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/5836995962087352844' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/5836995962087352844'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/5836995962087352844'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2009/04/making-peace-with-vista.html' title='Making peace with Vista'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-976016933337299204</id><published>2009-03-04T16:59:00.003+11:00</published><updated>2009-03-04T17:01:20.075+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><title type='text'>Guide to sequences uploaded</title><content type='html'>We uploaded a short guide to building and debugging the sequences. 
It is available at our &lt;a href=&quot;http://www.digitalsonata.com/download.aspx?type=whitepaper&quot; target=&quot;_blank&quot;&gt;whitepaper download page&lt;/a&gt;.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/976016933337299204' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/976016933337299204'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/976016933337299204'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2009/03/guide-to-sequences-uploaded.html' title='Guide to sequences uploaded'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-1888586328208533688</id><published>2009-02-20T18:11:00.003+11:00</published><updated>2009-02-20T18:20:00.539+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Carabao Language Kit 1.5.0.1 released</title><content type='html'>&lt;p&gt;The version 1.5.0.1 is now available for download. 
Lots of changes and enhancements thanks to ongoing development of Chinese (not in the default database though).&lt;/p&gt;&lt;p&gt;Fixed:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Regression: &quot;phantom capitalization&quot; of re-used words&lt;/li&gt;&lt;li&gt;Regression: sequence style forcing / avoiding&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Repositioning errors in sentences with attached tokens&lt;/li&gt;&lt;li&gt;Sequence processing in languages not using white spaces&lt;/li&gt;&lt;li&gt;Regression: single member sequence processing&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Added:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;Lattice-based processing for speech recognition and OCR application usage&lt;/li&gt;&lt;li&gt;Optional and unmapped members in sequences&lt;/li&gt;&lt;li&gt;Members in sequences which are validated but not mapped&lt;/li&gt;&lt;li&gt;Possibility to get a crosslingual representation (components only: DeepAnalyzer and Translation Server)&lt;/li&gt;&lt;li&gt;Possibility to load content from a disambiguated crosslingual representation&lt;/li&gt;&lt;li&gt;GUI in Translation Console to enable lattice-based processing&lt;/li&gt;&lt;li&gt;GUI in Translation Console to enable loading crosslingual representation&lt;/li&gt;&lt;li&gt;GUI in Translation Console to hint the system about the expected domains in the text&lt;/li&gt;&lt;li&gt;Analysis mode in Translation Console, when the source and target languages are the same and no styles are enforced / avoided&lt;/li&gt;&lt;li&gt;Capability of using the white space as a delimiter in languages that don&#39;t have white spaces&lt;/li&gt;&lt;li&gt;Smart quotes and other delimiters&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Improved:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;Dictionary GUI - presents thesaurus from another language, if missing in the current language&lt;/li&gt;&lt;li&gt;Sequence builder GUI - color coding of members which are not mapped, or contain conditions producing empty 
sets&lt;/li&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/1888586328208533688' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/1888586328208533688'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/1888586328208533688'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2009/02/carabao-language-kit-1501-released.html' title='Carabao Language Kit 1.5.0.1 released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-8620962280855167811</id><published>2008-12-08T09:00:00.003+11:00</published><updated>2008-12-08T09:49:33.664+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Carabao Language Kit 1.2.3.0 released</title><content type='html'>&lt;p&gt;The version 1.2.3.0 is now available for download.&lt;/p&gt;&lt;p&gt;Fixed:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Handling of single quotes as syntax delimiters in English&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Added:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;A segmentation mode more effectively handling languages that don&#39;t use white spaces (e.g. Chinese, Japanese, Korean, Thai). 
In this mode, runs of different character classes are split into separate tokens (e.g. Chinese followed immediately by English). The remaining unidentified part is run through the unknown heuristic identifier.&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Automatic conversion of Unicode clipboard data into the currently active encoding in the tokens table&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Better warning when attempting to overwrite the current token&lt;/li&gt;&lt;br /&gt;&lt;li&gt;A utility to rebuild the semantic links cache&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Improved:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;On some systems, every update of the tokens table added a new set of system icons (minimize, restore, maximize) to the MDI frame window. The maximize option now resizes the window to roughly the full client area without switching it to maximized mode&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/8620962280855167811' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8620962280855167811'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8620962280855167811'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/12/version-1.html' title='Carabao Language Kit 1.2.3.0 released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-5837876348684031034</id><published>2008-09-08T16:28:00.002+10:00</published><updated>2008-09-08T16:30:57.001+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><title type='text'>Free source code section</title><content type='html'>We added a small source code section on our &lt;i&gt;&lt;a href=&quot;http://www.digitalsonata.com/download.aspx?type=code&quot; target=&quot;_blank&quot;&gt;Download&lt;/a&gt;&lt;/i&gt; page, where we will post freebies for developers.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/5837876348684031034' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/5837876348684031034'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/5837876348684031034'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/09/free-source-code-section.html' title='Free source code section'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-7044924034625925898</id><published>2008-09-08T08:16:00.003+10:00</published><updated>2008-09-08T08:16:00.658+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Carabao Language Kit 1.2.0.0 released</title><content type='html'>&lt;p&gt;Version 1.2.0.0 is now available for download.&lt;/p&gt;&lt;p&gt;Fixed:&lt;/p&gt;&lt;ul&gt;&lt;li&gt;Unknown patterns were translated as hypernyms&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Regression: certain category-based sequences were omitted on second execution because of a malfunctioning guess scan caching mechanism&lt;/li&gt;&lt;br /&gt;&lt;li&gt;In analytical mode (Carabao DeepAnalyzer), there was a mismatch between the word index number and the idiom member index in sentences with attached tokens such as &#39;em, &#39;m&lt;/li&gt;&lt;br /&gt;&lt;li&gt;When copying a token with 1 rule unit or fewer, the text is always reset to the original&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Added:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;Capability to match numbers as patterns&lt;/li&gt;&lt;br /&gt;&lt;li&gt;When a translation is not found, the engine tries to fall back to a matching hypernym instead&lt;/li&gt;&lt;br /&gt;&lt;li&gt;New methods in Carabao DeepAnalyzer that enable accessing the members of the detected idioms&lt;/li&gt;&lt;br /&gt;&lt;li&gt;New methods in Carabao CDA that enable accessing the unknown heuristics table&lt;/li&gt;&lt;br /&gt;&lt;li&gt;New sequences&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Russian morphological 
exceptions&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;p&gt;Improved:&lt;/p&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;li&gt;If an &quot;unknown pattern&quot; is forced to match a known word, it will not create a new guess if a guess with the same hypernym already exists. For example, if you force a check of whether a known word can be a city, a new record will not be created if there is already a guess with a known city&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Automatic input language switching in locator fields&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Locator fields are pre-filled with the list of all existing languages in the database, eliminating the need to jump to the next language&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/7044924034625925898' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7044924034625925898'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7044924034625925898'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/09/carabao-language-kit-1200-released.html' title='Carabao Language Kit 1.2.0.0 released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-4252211756604070368</id><published>2008-06-05T09:48:00.002+10:00</published><updated>2008-06-05T10:06:38.959+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" 
term="development"/><category scheme="http://www.blogger.com/atom/ns#" term="rant"/><title type='text'>The surreal world of VoIP</title><content type='html'>We&#39;re currently working on a massive project involving telephony and mobile technologies, and I had to look for VoIP vendors to cater for my client&#39;s relatively simple needs. I have to say that while the needs are simple, the traffic is extremely high, so in monetary terms, it could be a nice deal for the VoIP vendors.&lt;br /&gt;&lt;br /&gt;But... is VoIP a strange industry or what? I don&#39;t know who runs all these companies - but:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;if you don&#39;t know what IVR, DID, PSTN, or all their other &quot;secret handshakes&quot; mean - they won&#39;t even talk to you. Forget about the forums, they are even less helpful than Usenet.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;the responses usually come after weeks, and they are of the type &quot;I &lt;i&gt;just&lt;/i&gt; transferred your inquiry to our sales representative&quot;. Obviously, unless you kick and scream, the sales rep won&#39;t get back to you&lt;/li&gt;&lt;li&gt;you have to re-tell the story over and over again&lt;/li&gt;&lt;/ul&gt;&lt;br /&gt;Ah yes, but the internet is full of dotcomish optimism about mashups and other kewl stuff. 
Awesome, dude.&lt;br /&gt;&lt;br /&gt;Of course, there are also people who do need paying customers, and those who are able to concentrate, and - surprise surprise - they were the ones who got the job eventually.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/4252211756604070368' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/4252211756604070368'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/4252211756604070368'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/06/surreal-world-of-voip.html' title='The surreal world of VoIP'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-8442021847725424635</id><published>2008-05-01T10:02:00.005+10:00</published><updated>2008-05-02T14:50:27.548+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="machine translation"/><category scheme="http://www.blogger.com/atom/ns#" term="publicity"/><title type='text'>Published in MultiLingual</title><content type='html'>My &lt;a href=&quot;http://www.multilingual.com/articleDetail.php?id=1415&quot; target=&quot;_blank&quot;&gt;article about real-world applications of machine translation&lt;/a&gt; has been published in MultiLingual Computing, the leading industry magazine for  globalization, international software development and language technology.&lt;br 
/&gt;&lt;br /&gt;This part is for subscribers only though.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/8442021847725424635' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8442021847725424635'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8442021847725424635'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/05/published-in-multilingual.html' title='Published in MultiLingual'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-8318292647010062780</id><published>2008-04-30T15:28:00.004+10:00</published><updated>2008-04-30T15:32:05.312+10:00</updated><title type='text'>Localization can be a matter of life and death</title><content type='html'>This story (&lt;a href=&quot;http://gizmodo.com/382026/a-cellphones-missing-dot-kills-two-people-puts-three-more-in-jail&quot; target=&quot;_blank&quot;&gt;A Cellphone&#39;s Missing Dot Kills Two People, Puts Three More in Jail&lt;/a&gt;) reads like it was taken from a Tarantino or Coen brothers&#39; movie.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/8318292647010062780' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8318292647010062780'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/8318292647010062780'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/04/localization-can-be-matter-of-life-and.html' title='Localization can be a matter of life and death'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-7705674533665807068</id><published>2008-04-23T09:06:00.005+10:00</published><updated>2008-05-01T10:02:08.104+10:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="data"/><category scheme="http://www.blogger.com/atom/ns#" term="distribution"/><title type='text'>We are published at ELRA</title><content type='html'>After a few months of evaluations, agreements, and inspections, our linguistic data has been published on the European Language Resources Association&#39;s website. 
The &lt;a href=&quot;http://catalog.elra.info/product_info.php?products_id=1058&quot; target=&quot;_blank&quot;&gt;Russian - English OLIF dictionary&lt;/a&gt; is sold at quite a price, while the Swahili, Czech and Cebuano dictionaries are distributed for free (although ELRA charges for postage and media).&lt;br /&gt;&lt;br /&gt;It is important to mention that all this data can be created from (usually free) ASCII dictionaries on the net using Carabao Linguist Edition.&lt;br /&gt;&lt;br /&gt;&lt;b&gt;Clarification&lt;/b&gt;: OLIF is the Open Lexicon Interchange Format, backed by SAP and created specifically for NLP-oriented lexica. The official website is &lt;a href=&quot;http://www.olif.net&quot; target=&quot;_blank&quot;&gt;www.olif.net&lt;/a&gt;.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/7705674533665807068' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7705674533665807068'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/7705674533665807068'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/04/we-are-published-in-elra.html' title='We are published at ELRA'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-6391229787718070747</id><published>2008-04-01T16:45:00.003+11:00</published><updated>2008-04-01T16:50:06.570+11:00</updated><category 
scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="website"/><title type='text'>Server transition</title><content type='html'>We just moved to a new server. Much better performance, but there might be some minor technical glitches in the next few days. Thank you for your patience.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/6391229787718070747' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/6391229787718070747'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/6391229787718070747'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/04/server-transition.html' title='Server transition'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-422935940638079366</id><published>2008-03-14T01:01:00.003+11:00</published><updated>2008-03-30T12:32:09.694+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="machine translation"/><category scheme="http://www.blogger.com/atom/ns#" term="SMT"/><category scheme="http://www.blogger.com/atom/ns#" term="statistical machine translation"/><title type='text'>Hype, hype, hype</title><content type='html'>Got &lt;a 
href=&quot;http://www.translationautomation.com/joomla/index.php?option=com_content&amp;view=article&amp;catid=45:news_archive&amp;id=155:-and-a-new-competitor-revs-up-in-thailand&amp;Itemid=46&quot; mce_href=&quot;http://www.translationautomation.com/joomla/index.php?option=com_content&amp;view=article&amp;catid=45:news_archive&amp;id=155:-and-a-new-competitor-revs-up-in-thailand&amp;Itemid=46&quot; target=&quot;_blank&quot;&gt;this article&lt;/a&gt; from TAUS today.&lt;br /&gt;&lt;br /&gt;Man, is this annoying or what. With claims like these, the MT industry is going to move from Web 2.0 to Dot Com Bubble Burst 2.0.&lt;br /&gt;&lt;br /&gt;I do hold high respect for Asia Online&#39;s team. Philipp Koehn co-authored some papers with Franz Och, the guy behind Language Weaver and Google Translate. Dion Wiggins has some super-duper credentials as an IT businessman, although with no experience at all in MT or NLP (and this industry is &lt;i&gt;very&lt;/i&gt; different).&lt;br /&gt;&lt;br /&gt;From the technological point of view, Koehn did the right thing (IMHO) going in the hybrid direction instead of looking for the philosopher&#39;s stone of a pure ideal engine that builds itself and with some kind of divination deduces stuff that hand-crafted rules can&#39;t. It is also refreshing to see that they decided to use a high quality corpus.&lt;br /&gt;&lt;br /&gt;But - c&#39;mon, people!&lt;br /&gt;&lt;br /&gt;The article contains odd bits such as:&lt;br /&gt;&lt;br /&gt;&lt;i&gt;...adding a syntax component to &lt;strong&gt;a SMT system&lt;/strong&gt; &quot;seriously degrades throughput performance from 5,000 words a minute to only around 300&quot; on a machine with 4 high speed CPUs&lt;/i&gt;&lt;br /&gt;&lt;br /&gt;Did they mean &lt;strong&gt;the &lt;/strong&gt;SMT system? Because how can you measure the speed of all SMT systems?&lt;br /&gt;&lt;br /&gt;If this is &lt;strong&gt;the&lt;/strong&gt; SMT system (i.e. 
they already built the kernel), this probably means they are going to use Moses. Which calls for more questions. Koehn created Pharaoh and Moses; both are open-source; both have been around for a while; yet I never heard of a commercial or even semi-commercial application that uses them. And I know at least one huge translation agency that launched a project to create their own MTs based on Moses.&lt;br /&gt;&lt;br /&gt;There is also EuroMatrix, an all-you-can-spend research project, in which Koehn also took part, and which, just like all the other euro-science-charity projects, produced nothing (yeah, OK, there is a Czech English lexicon, which is a huge deal, right?).&lt;br /&gt;&lt;br /&gt;The website of &lt;a href=&quot;http://www.asiaonline.net/&quot; mce_href=&quot;http://www.asiaonline.net&quot; target=&quot;_blank&quot;&gt;Asia Online&lt;/a&gt; looks nice. Obviously, lots of Wikipedia articles, &lt;a href=&quot;http://www.asiaonline.net/corporate/technology.aspx&quot; mce_href=&quot;http://www.asiaonline.net/corporate/technology.aspx&quot; target=&quot;_blank&quot;&gt;SMT for dummies complete with scientific-looking formulas&lt;/a&gt;, &lt;a href=&quot;http://www.asiaonline.net/corporate/news.aspx&quot; mce_href=&quot;http://www.asiaonline.net/corporate/news.aspx&quot; target=&quot;_blank&quot;&gt;media buzz&lt;/a&gt;. Not a word about the actual engine, no screenshots, no demos.&lt;br /&gt;&lt;br /&gt;One thing that particularly captured my attention was the &lt;a href=&quot;http://www.asiaonline.net/corporate/careers.aspx&quot; mce_href=&quot;http://www.asiaonline.net/corporate/careers.aspx&quot; target=&quot;_blank&quot;&gt;Careers &lt;/a&gt;  page. 
As of now, they are looking for:&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;country office managers in every Asian country, including Thailand&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;&quot;content procurement&quot; managers that read content and make sure it is well-translated (good luck with that) before feeding it to the lexicon builders&lt;/li&gt;&lt;br /&gt;&lt;li&gt;the coolest thing: programmers in Thailand: C++ - probably to fix bugs in Moses, and C# to hack the front-end. Now this last part is kind of OK, except that one of the requirements is &quot;&lt;i&gt;Have database skills in &lt;b&gt;MS SQL Server, Oracle Database and etc&lt;/b&gt;&lt;/i&gt;&quot;. Data and stuff. In other words, at this stage they have not yet decided what backend they are going to use. Which means there are no design specs.  &lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;So they have no people responsible for the content. They have no design specs. They have no solid plan. They have no people who worked and produced anything of this class.&lt;br /&gt;&lt;br /&gt;They did not even research their markets properly. If they had even looked at Wikipedia articles about the Philippines, or talked to one or two Filipinos, they&#39;d have learned that English, not Tagalog, is the lingua franca of the Philippines when it comes to written language. 
90% of the major newspapers, all the official correspondence, all the commercial documents in the Philippines are in English.&lt;br /&gt;&lt;br /&gt;All they have is the prototype of a system that was never shown to work in a real-world environment, and a bunch of British guys who rented an office in &quot;beautiful downtown Bangkok&quot; and announced that they will conquer the world in a year.&lt;br /&gt;&lt;br /&gt;And, of course, the usual reservations about statistical MT apply.</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/422935940638079366' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/422935940638079366'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/422935940638079366'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/03/hype-hype-hype.html' title='Hype, hype, hype'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-2152677968697904305</id><published>2008-03-11T01:10:00.001+11:00</published><updated>2008-03-30T12:31:58.271+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="announcement"/><category scheme="http://www.blogger.com/atom/ns#" term="carabao"/><category scheme="http://www.blogger.com/atom/ns#" term="release"/><title type='text'>Carabao Language Kit 1.1.0.1 released</title><content type='html'>The version 
1.1.0.1 is now available for download - mostly to fix the regressions reported in 1.1.0.0.&lt;br /&gt;&lt;p&gt;Fixed:&lt;/p&gt;&lt;br /&gt;&lt;ul&gt;&lt;br /&gt;&lt;li&gt;Crash when using sequence extraction option (regression from 1.1.0.0)&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;p&gt;Added:&lt;/p&gt;&lt;br /&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;br /&gt;&lt;li&gt;Capability to import sequences by data entry directly from the Sequence Sheet&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Capability to manually set sequence descriptions&lt;br /&gt;&lt;li&gt;Some sequences for multi-word entity extraction&lt;br /&gt;&lt;li&gt;More morphological exceptions for Russian&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;/ul&gt;&lt;br /&gt;&lt;p&gt;Improved:&lt;/p&gt;&lt;br /&gt;&lt;ul class=&quot;featureList&quot;&gt;&lt;br /&gt;&lt;li&gt;Processing speed and memory consumption - further boost&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Token Sheet (words &amp;amp; sequences) GUI&lt;br /&gt;&lt;br /&gt;&lt;/ul&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/2152677968697904305' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/2152677968697904305'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/2152677968697904305'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/03/carabao-language-kit-1101-released.html' title='Carabao Language Kit 1.1.0.1 released'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-4127035516184942767.post-1188762031922409395</id><published>2008-03-03T00:37:00.001+11:00</published><updated>2008-03-30T12:31:45.997+11:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="predictions; language"/><title type='text'>What may happen in the next 100 years - predictions from 1900</title><content type='html'>A scan of an interesting article from December 1900&#39;s &lt;I&gt;Ladies&#39; Home Journal&lt;/I&gt;: &lt;A href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguWppG3WMExSn5hcvkAHeQzJcOi5dAFcJHnJsncrpA9sE1VHtVCnLobvHvsEV_FImSA381PGd1K-6ttG7hn8BpXicCZ7SG6jA9s1pOsJ5D4yOssRsnsEL_lZvKE4WjY7CH_-4wQi4g94Rn/s1600-h/Ladies+Home+Journal+Dec+1900+paleofuture+paleo-future.jpg&quot; mce_href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguWppG3WMExSn5hcvkAHeQzJcOi5dAFcJHnJsncrpA9sE1VHtVCnLobvHvsEV_FImSA381PGd1K-6ttG7hn8BpXicCZ7SG6jA9s1pOsJ5D4yOssRsnsEL_lZvKE4WjY7CH_-4wQi4g94Rn/s1600-h/Ladies+Home+Journal+Dec+1900+paleofuture+paleo-future.jpg&quot; target=&quot;_blank&quot;&gt;What May Happen in the Next 100 Years&lt;/A&gt;  &lt;BR&gt;&lt;BR&gt;Some of the predictions are astonishingly accurate. Some are quite funny. &lt;BR&gt;There&#39;s a prediction about language: &lt;br /&gt;&lt;P&gt;&lt;I&gt;&lt;B&gt;There will be No C, X or Q&lt;/B&gt; in our every-day alphabet. They will be abandoned because unnecessary. Spelling by sound will have been adopted, first by the newspapers. English will be a language of condensed words expressing condensed ideas, and will be more extensively spoken than any other. 
Russian will rank second.&lt;/I&gt;&lt;/P&gt;&lt;br /&gt;  Of course, it was before two world wars, which introduced corrections into the statistics. &lt;BR&gt;&lt;BR&gt;Well done with the &quot;condensed words&quot; and &quot;condensed ideas&quot;, and the propagation of English (definitely not a given in 1900).&lt;BR&gt;&lt;BR&gt;</content><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/4127035516184942767/1188762031922409395' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/1188762031922409395'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/4127035516184942767/posts/default/1188762031922409395'/><link rel='alternate' type='text/html' href='http://digitalsonata.blogspot.com/2008/03/what-may-happen-in-next-100-years.html' title='What may happen in the next 100 years - predictions from 1900'/><author><name>Vadim Berman</name><uri>http://www.blogger.com/profile/05259798410409116215</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='32' height='22' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgvdMyv8Xih0FS_rWUbnw6V2yLVhzY-joHhmat9qnNWMJvetXNTNAusebWU_0ivWGA6USfCvsbHRubAlL5DcwNYUgx3aJVW891zrmHkQv2OCkWAxjLa06WAJBIaHl1GaQ/s220/vad.jpg'/></author><thr:total>0</thr:total></entry></feed>