<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:openSearch="http://a9.com/-/spec/opensearchrss/1.0/" xmlns:georss="http://www.georss.org/georss" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0"><id>tag:blogger.com,1999:blog-6252700852068053040</id><updated>2009-11-10T06:46:19.629-07:00</updated><title type="text">Mountain Goat Programmer</title><subtitle type="html">Adventures in code with Jacob R Rideout</subtitle><link rel="http://schemas.google.com/g/2005#feed" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/posts/default" /><link rel="alternate" type="text/html" href="http://blog.jacobrideout.net/" /><link rel="hub" href="http://pubsubhubbub.appspot.com/" /><link rel="next" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default?start-index=26&amp;max-results=25" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email></author><generator version="7.00" uri="http://www.blogger.com">Blogger</generator><openSearch:totalResults>26</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><geo:lat>40.020885</geo:lat><geo:long>-105.296733</geo:long><link rel="license" type="text/html" href="http://creativecommons.org/licenses/by-nc-sa/2.0/" /><link rel="self" href="http://feeds.feedburner.com/MountainGoatProgrammer" type="application/atom+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" 
/><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-9196840615642482635</id><published>2007-02-10T02:09:00.000-07:00</published><updated>2007-02-10T04:09:37.032-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="grammar" /><category scheme="http://www.blogger.com/atom/ns#" term="data structures" /><title type="text">Graph Library Suggestions</title><content type="html">I have an idea for an algorithm that requires a &lt;a href="http://en.wikipedia.org/wiki/Graph_%28data_structure%29"&gt;graph&lt;/a&gt;. Any suggestions for a good C++ library for the data structure would be welcome, so that I don't have to reinvent the wheel. There is one special requirement: there can be (but needn't be) two edges per pair of connected nodes. Each edge has a direction and a weight. If the edge is implemented as a template class, then it can store a tuple or pair with the differing weights for each direction. I've taken a look at &lt;a href="http://www.boost.org/libs/graph/doc/index.html"&gt;Boost&lt;/a&gt;, but it seems like overkill. Below is an example of what I need.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_xf-7z0XFitg/Rc2MQ2P1KEI/AAAAAAAAAAo/TbmVox-sG2o/s1600-h/graph.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp3.blogger.com/_xf-7z0XFitg/Rc2MQ2P1KEI/AAAAAAAAAAo/TbmVox-sG2o/s400/graph.png" alt="" id="BLOGGER_PHOTO_ID_5029830579910420546" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;I created the image above in Kivio. This is the first time I've had an opportunity to use it. Great work, KOffice team!&lt;br /&gt;&lt;br /&gt;EDIT: I'm going to go with Boost for now. 
Additional suggestions are still welcome.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-9196840615642482635?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/9196840615642482635/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=9196840615642482635" title="19 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/9196840615642482635" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/9196840615642482635" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/wqW-1PwmrtM/graph-library-suggestions.html" title="Graph Library Suggestions" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://bp3.blogger.com/_xf-7z0XFitg/Rc2MQ2P1KEI/AAAAAAAAAAo/TbmVox-sG2o/s72-c/graph.png" height="72" width="72" /><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">19</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/02/graph-library-suggestions.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-6274727278739346550</id><published>2007-02-09T21:32:00.000-07:00</published><updated>2007-02-09T22:12:37.552-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Sonnet 
Updates</title><content type="html">I've been fairly busy the past two weeks and haven't put as much work into Sonnet as I would like. However, there are several recent developments to mention.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Language Detection&lt;/span&gt;&lt;br /&gt;We now have preliminary support for distinguishing between pt_PT and pt_BR as well as en_US and en_GB. Portuguese seems to be a special case that most NLP programs explicitly acknowledge, which I now understand. I'm not sure what should be done to additionally distinguish en_ZA or en_AU. I've a few ideas and will let everyone know my thoughts after more testing is done. I really didn't want to start messing around with dialects, but the response to that position has been massively against me; so into the fray I go.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Elixir&lt;/span&gt;&lt;br /&gt;The engine and documentation for Elixir are now ready for public scrutiny and comment. It has been interesting for me to write, since I've decided to only use C++ and the standard libraries so that there wouldn't be dependencies. Qt has really spoiled me :) The work has so far been done in my personal Subversion repository. It will be made public as soon as &lt;a href="https://bugs.freedesktop.org/show_bug.cgi?id=9775"&gt;Bug #9775&lt;/a&gt; has been fixed and I have a working freedesktop.org CVS account.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Documentation&lt;/span&gt;&lt;br /&gt;Aseigo mentioned the need for &lt;a href="http://aseigo.blogspot.com/2007/02/speaking-wikiing.html"&gt;documentation&lt;/a&gt; today:&lt;br /&gt;&lt;blockquote&gt;"sonnet might be cool, for instance, but unless there's a tutorial that lets people start using it in their application code quickly it'll almost certainly end up under-utilized and/or take many more revision releases of kde4 to find its potential realized."&lt;/blockquote&gt;So true. 
The public interfaces for Sonnet have only just settled down, and some are still on my computer and yet to be committed. So, this weekend I'll update most of the changes littering my working directory and start outlining some tutorials to be put in the wiki. I've been fairly good about providing apidox so far, but those need some improvements as well. Of course, KDE4PORTING.html needs to be updated too.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Merging&lt;/span&gt;&lt;br /&gt;I've been hesitant to merge into trunk while the interfaces were rapidly changing, but now that isn't much of a concern. A list of programs and libraries in kdepimlibs, kdebase and several other specific cases has been compiled, and I'll be able to modify them when merging to ensure they build. For those projects that I personally won't migrate, the tutorial should enable their developers to migrate seamlessly.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-6274727278739346550?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/6274727278739346550/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=6274727278739346550" title="11 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6274727278739346550" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6274727278739346550" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/aXjUefTiOis/sonnet-updates.html" title="Sonnet Updates" /><author><name>Jacob 
Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">11</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/02/sonnet-updates.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-6707069287899107771</id><published>2007-02-07T14:59:00.000-07:00</published><updated>2007-02-07T18:20:19.441-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Sonnet In The Press</title><content type="html">Nathan Sanders of OSTG interviewed me for an article last week. The result has just been published on Linux.com in an article entitled, "&lt;a href="http://applications.linux.com/article.pl?sid=07/02/01/1935238"&gt;KDE 4's Sonnet will turbocharge language processing&lt;/a&gt;."&lt;br /&gt;&lt;br /&gt;Overall, I'm pleased with the coverage, but I do have a few misgivings, although any minor errors are likely my fault for providing limited explanations in the interview. The scope of my concern is largely limited to grandiose statements I did not intend. For example, "[I]mproved multilingual support is the "most requested change" from KDE 3..." I didn't really mean this; it was a context-sensitive statement. I meant something such as, "Excluding technical issues like 'KSpell doesn't work for me,' the most requested features for KSpell that I know of (from end users) involve improving its multilingual support." But requesting extra qualifiers for my statements is more likely an exercise in vanity than in promoting greater truth.&lt;br /&gt;&lt;br /&gt;I will mention that the article doesn't note the work of David Sweet. 
I'm not familiar with what exactly he did, but my understanding is that he wrote much of the original KSpell and thus deserves some credit.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-6707069287899107771?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/6707069287899107771/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=6707069287899107771" title="10 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6707069287899107771" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6707069287899107771" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/o92Y6yJaDRY/sonnet-in-press.html" title="Sonnet In The Press" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">10</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/02/sonnet-in-press.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-4516791894649854594</id><published>2007-01-30T00:16:00.000-07:00</published><updated>2007-01-30T00:30:05.766-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="grammar" /><category scheme="http://www.blogger.com/atom/ns#" term="linguistics" /><category scheme="http://www.blogger.com/atom/ns#" term="writing" /><title type="text">Instant Messaging</title><content 
type="html">A question came up on the kde-devel mailing list regarding the term "instant messaging." I am no expert in English grammar, but I do have some idea of what might be going on. The term has taken hold in the popular lexicon. American Heritage Dictionary defines instant messaging as:&lt;br /&gt;&lt;blockquote&gt;n. The transmission of an electronic message over a computer network using software that immediately displays the message in a window on the screen of the recipient. &lt;/blockquote&gt;In English, if a given lexeme has a verb form, it usually also has a gerund, which is often identical to the progressive (often called participle) form of the verb. There are many cases where the infinitive also has a noun form that describes the action of performing the verb, unlike the gerund, which describes the act itself. Gerunds have special rules in English and can be clefted, unlike the verbal forms they often act as substitutes for.&lt;br /&gt;&lt;br /&gt;For example:&lt;br /&gt;verb: &lt;span style="font-weight: bold;"&gt;to run&lt;/span&gt; (infinitive)&lt;br /&gt;noun: &lt;span style="font-weight: bold;"&gt;run&lt;/span&gt; = the act of running&lt;br /&gt;verb: &lt;span style="font-weight: bold;"&gt;running&lt;/span&gt; (past progressive) (i.e. she was running) = an inflected form of the verb&lt;br /&gt;gerund: &lt;span style="font-weight: bold;"&gt;running&lt;/span&gt; (noun) = the action of the verb to run&lt;br /&gt;&lt;br /&gt;Of course, there are many lexemes in English where the past progressive can also be used as an adjective, further confusing the matter.&lt;br /&gt;&lt;br /&gt;So, "instant messaging" can be parsed several ways. For each of the several possible forms "messaging" could take, there also exists a complementary form for "instant", but I can take a stab at the intended meaning via the following examples:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Ian enjoys instant messaging his friends. 
("messaging" is a gerund)&lt;/li&gt;&lt;li&gt;Ian is instant messaging his friends. ("messaging" is a participle verb)&lt;/li&gt;&lt;/ul&gt;Note that in the first sentence, the clause, "instant messaging his friends" is actually acting as a single noun.&lt;br /&gt;&lt;br /&gt;There seems to be no problem using "instant messaging" as a noun in English.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-4516791894649854594?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/4516791894649854594/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=4516791894649854594" title="8 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4516791894649854594" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4516791894649854594" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/cWgB6dPWQZk/instant-messaging.html" title="Instant Messaging" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">8</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/instant-messaging.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-1136236233688130522</id><published>2007-01-25T04:32:00.000-07:00</published><updated>2007-01-25T04:57:44.061-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" 
term="screenshots" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Language Detection Works!</title><content type="html">I've finally been able to put some of Sonnet's many pieces together. Initial integration of  language detection into the spell-check highlighter class has just been committed.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://bp3.blogger.com/_xf-7z0XFitg/RbiVbctc_pI/AAAAAAAAAAU/fYFcuHSj02g/s1600-h/spellcheckscreenshot.png"&gt;&lt;img style="margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;" src="http://bp3.blogger.com/_xf-7z0XFitg/RbiVbctc_pI/AAAAAAAAAAU/fYFcuHSj02g/s320/spellcheckscreenshot.png" alt="" id="BLOGGER_PHOTO_ID_5023929683127631506" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;The screen shot shows text that I copied from the Hebrew, German and English homepages of  &lt;a href="http://www.wikipedia.org/"&gt;Wikipedia&lt;/a&gt; in konqueror. 
Upon pasting them in the simple test application Sonnet detected the languages in a background thread and then proceeded to spell-check the paragraphs, also in a background thread.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-1136236233688130522?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/1136236233688130522/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=1136236233688130522" title="138 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1136236233688130522" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1136236233688130522" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/EkwJ3A-6lkA/language-detection-works.html" title="Language Detection Works!" 
/><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="http://bp3.blogger.com/_xf-7z0XFitg/RbiVbctc_pI/AAAAAAAAAAU/fYFcuHSj02g/s72-c/spellcheckscreenshot.png" height="72" width="72" /><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">138</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/language-detection-works.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-1333907487464667396</id><published>2007-01-21T23:43:00.000-07:00</published><updated>2007-01-22T11:00:24.732-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="linguistics" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="writing" /><title type="text">to boldly justify my perscriptivism</title><content type="html">&lt;span style="font-style: italic;"&gt;The following is a retort to an email &lt;/span&gt;&lt;span style="font-style: italic;"&gt;attacking&lt;/span&gt;&lt;span style="font-style: italic;"&gt; Sonnet and spell checking in general. I had initially written a vindictive reply, which I decided not to send and instead rewrote for a more general audience without, I hope, the angry overtones.&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Languages are living, changing, amorphous things. Whilst it is possible to categorize them for certain uses, such classification seems to fail in other contexts. If we were to honestly demarcate a category for every language on Earth, there would be one for every human being on the planet. We all have our lexicons, grammars and even orthography. Of course, a personal language would be near useless if there was not a group to comprehend it. 
Therefore, we often classify languages into groupings of mutual intelligibility and hierarchies by degree thereof.&lt;br /&gt;&lt;br /&gt;Spelling and grammar rules exist to convey our information in a manner that can be understood by others who also understand these rules. Style guides (should) provide hints on presenting information in a less ambiguous way. There is great tension among scholarly and armchair linguists alike in characterizing certain aspects of language as &lt;span style="font-style: italic;"&gt;correct&lt;/span&gt;. I've certainly felt this tension internally while working on Sonnet. Yet the need to communicate dictates &lt;span style="font-style: italic;"&gt;a priori&lt;/span&gt; the necessity of common protocols and widely used conventions.&lt;br /&gt;&lt;br /&gt;So, to the angry woman (who thankfully was not a contributor to KDE, although she was a bitter user) who deemed it worth her time to write and send a tirade outlining the alleged hypocrisy of descriptive linguists creating prescriptive software, I suggest that she consider my true purpose.&lt;br /&gt;&lt;br /&gt;The need to communicate clearly outweighs the disadvantage of using a language in a “non-standard” manner. But how are “non-standard” uses defined? Should we follow decree, common convention or the prescription of articulate but deadline-driven and uninformed writers? I certainly won't (and don't) posit myself as the arbiter of linguistic rule, although I have my opinions of English usage. I simply wish to improve upon the existing technology that enables better communication. Providing &lt;span style="font-style: italic;"&gt;optional&lt;/span&gt; language aids is but one part of this desire. Furthermore, seeking to empower users of minority tongues does not force them to abandon their unique and valuable linguistic traditions. 
Merely providing users with tools, allows those that wish to ensure their writing is understood by others to verify that it is.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-1333907487464667396?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/1333907487464667396/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=1333907487464667396" title="34 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1333907487464667396" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1333907487464667396" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/f1fsoh3Uv20/to-boldly-justify-my-perscriptivism.html" title="to boldly justify my perscriptivism" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">34</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/to-boldly-justify-my-perscriptivism.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-2727472320122341593</id><published>2007-01-21T22:10:00.000-07:00</published><updated>2007-01-22T01:26:58.140-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="linguistics" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title 
type="text">Another Reason ODF Rocks</title><content type="html">Bill Poser writes at the &lt;a href="http://itre.cis.upenn.edu/%7Emyl/languagelog/archives/004065.html"&gt;Language Log&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt; Now, you might be wondering what this all has to do with linguistics. Well, one of the things that document metadata specify is the language of the document. The Open Document standard does this correctly. It uses (p. 61) the three-letter language codes of &lt;a href="http://www.sil.org/iso639-3"&gt;ISO-639&lt;/a&gt;, followed by a two-letter country code following &lt;a href="http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html"&gt;ISO 3166&lt;/a&gt;. This allows for the specification of any of the world's languages. A three letter code allows for as many as 17,576 languages. ISO-639-3 in fact already encodes most of the world's approximately 6,700 languages. Open XML, on the other hand, does not follow ISO-639-3. Instead (section 2.18.52), it requires that languages be specified by means of two hexadecimal digits, e.g. 0x09 for English. That means that no more than 256 languages can be accomodated. The list of languages available is in the document referenced above on pp. 2531-2537 but for the two-letter hex codes you'll have to look elsewhere because Microsoft doesn't list them together with the languages. For some reason it gives a completely different set of non-hexadecimal codes ranging from 1025 to 58,380. The hex codes can be found in the fourth column of &lt;a href="http://unicode.org/onlinedat/languages.html"&gt;this table&lt;/a&gt;, the one labelled "Win Code".&lt;/blockquote&gt;Three cheers for standards and simplicity. I'd love to use the same language tagging standard as ODF in KDE, but there are a few current limitations, depending on context. Most spell checkers require 639-1 codes for the language part. 
Sonnet uses these as well for languages that have them, and 639-3 codes for those that don't.&lt;br /&gt;&lt;br /&gt;Treating &lt;a href="http://en.wikipedia.org/wiki/ISO_639_macrolanguage"&gt;macrolanguages&lt;/a&gt; and separate &lt;a href="http://blog.jacobrideout.net/2007/01/queen-and-country.html"&gt;dialects&lt;/a&gt; as distinct (when relevant) has caused end users quite some concern. Most of the problems I've been alerted to are being addressed. I've found the worldwide KDE community to be quite helpful and informative. Time that could have been spent programming has instead been consumed researching the differences between the Norwegian [nor] languages, Bokmål [nob] and Nynorsk [nno], or tracking down some obscure variation of Cornish, only existing in one town, which I won't yet mention since I might publish something about it in the upcoming year.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-2727472320122341593?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/2727472320122341593/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=2727472320122341593" title="60 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/2727472320122341593" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/2727472320122341593" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/_EXa3cmNtxQ/another-reason-odf-rocks.html" title="Another Reason ODF Rocks" /><author><name>Jacob 
Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">60</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/another-reason-odf-rocks.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-6223876670591301207</id><published>2007-01-17T10:46:00.000-07:00</published><updated>2007-01-17T12:36:52.625-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Queen and Country</title><content type="html">Can Sonnet detect the difference between en_US and en_GB? No, and yes. I've been asked that quite a bit, so I'll clarify. There are several key requirements, as I see them, for adding language detection to KDE.&lt;br /&gt;&lt;ol&gt;&lt;li&gt;It must distinguish between different languages. While all supported languages could be detected, it is more likely that a user will only use a few languages in most of his sessions or that the program will be used in a setting by people speaking a specific range of languages. For example, a speaker of French and German, or a school computer lab with speakers of Xhosa, Zulu and Afrikaans.&lt;/li&gt;&lt;li&gt;It must be fast. The detection must occur in real time; otherwise, the user might as well select their language manually.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt; I've optimized the language detection for the above criteria. The models used are limited so that they are fast. This makes detection less accurate, but this is overcome by taking into account other factors such as default settings or the language detected for surrounding text segments.&lt;br /&gt;&lt;br /&gt;Are users of multiple dialects less important? 
No, but if I value priorities (1) and (2) above, then the detection of the dialect will be highly inaccurate due to the nature of the algorithms used. However, detecting a sample of text written in a dialect as the common language is currently robust. It must be: the detection would be useless for our purposes if it were so brittle as to fail on minor spelling differences, which are likely candidates for errors. Additionally, the statistical differences between dialects are much smaller than those between different languages. In most cases, text in a different dialect just isn't contrastive enough in the greater scheme of things.&lt;br /&gt;&lt;br /&gt;Now, there are ways to distinguish between American and British spellings. Doing it at the statistical comparison stage isn't reliable with the current heuristics, but it could be done if there was enough user demand. However, any application that utilizes the language guessing class should be a bit smarter. The convenience classes Sonnet will provide will distinguish between countries by checking user settings. If the user's locale is de_BE and French is detected, then the spell checker would default to checking with fr_BE; it would fall back to fr or fr_FR if a dictionary for fr_BE was not found.&lt;br /&gt;&lt;br /&gt;In summary, the language detection class will not distinguish between cases where the same language has a different orthography in a different country. Yet tools built using this class will distinguish between them using alternative means. But I could be wrong in my assumptions. If enough complaints surface, it's early enough to change the behavior.&lt;br /&gt;&lt;br /&gt;For all those who emailed me questions similar to ris's:&lt;br /&gt;&lt;blockquote&gt;[08:28] &amp;lt;ris&amp;gt; rideout: re: sonnet test sentences - would it be useful to put a british english test sentence in? 
would sonnet be able to distinguish between the two if you stuck a few 'realise', 'centre' and 'programme's in?&lt;/blockquote&gt;It is very reasonable to test that British English is detected as English. Empirically, Sonnet is not encumbered by the mild differences that exist between British and American orthography.&lt;br /&gt;&lt;br /&gt;EDIT: I meant to ask this originally, but became carried away and forgot. Are there languages for which I must distinguish between dialects? I can think of examples like Chinese where the dialects are essentially different languages. But in this case they all share a common orthography. Are there those with different orthographies?&lt;br /&gt;&lt;br /&gt;NOTE: In this post, I did conflate the notion of dialects and countries with different standards of orthography. To the pedantic, please don't shoot. To the curious, I try to limit the use of technical linguistic terms in this blog and to use the imprecise and often confused vernacular; this has led to some confusion in the past.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-6223876670591301207?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/6223876670591301207/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=6223876670591301207" title="12 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6223876670591301207" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6223876670591301207" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/7w_PoarULLk/queen-and-country.html" title="Queen and Country" 
/><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">12</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/queen-and-country.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-116056303134341494</id><published>2007-01-17T05:34:00.000-07:00</published><updated>2007-01-17T05:46:31.826-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">New Sonnet Mailing List</title><content type="html">Sonnet now has a &lt;a href="https://mail.kde.org/mailman/listinfo/kde-sonnet"&gt;mailing list&lt;/a&gt;. When possible, please post both questions and suggestions for Sonnet to this list rather than on this blog or in a personal email. 
Everyone interested in how Sonnet will end up should subscribe.&lt;br /&gt;&lt;br /&gt;Thanks for all the great help so far.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-116056303134341494?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/116056303134341494/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=116056303134341494" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/116056303134341494" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/116056303134341494" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/RKqw6et_eI0/new-sonnet-mailing-list.html" title="New Sonnet Mailing List" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">3</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/new-sonnet-mailing-list.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-4769748147277826447</id><published>2007-01-16T15:23:00.000-07:00</published><updated>2007-01-26T04:26:06.811-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="corpora" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Can your language be detected?</title><content 
type="html">Today, I've added a few more languages to Sonnet::GuessLanguage. I've also improved the speed considerably, removing a linear search. Below are the &lt;a href="http://en.wikipedia.org/wiki/List_of_ISO_639-1_codes"&gt;ISO 639 codes&lt;/a&gt; that should currently be detected.&lt;br /&gt;&lt;br /&gt;af, ar, az, bg, bn, bo, ca, ceb, cs, cy, da, de, el, en, es, et, eu, fa, fi, fr, gu, ha, haw, he, hi, hr, hu, hy, id, is, it, ja, ka, kk, km, kn, ko, ky, la, lo, lt, lv, mk, ml, mn, my, nb, ne, nl, nr, nso, or, pa, pl, ps, pt, ro, ru, si, sk, sl, so, sq, sr, ss, st, sv, sw, ta, te, th, tl, tlh, tn, tr, ts, vi, uk, ur, uz, ve, xh, zh, zu&lt;br /&gt;&lt;br /&gt;Please let me know if your language isn't supported and you would like to help.&lt;br /&gt;&lt;br /&gt;Each of these languages should have a unit test to ensure it is actually detected. Take a look at the list of &lt;a href="http://websvn.kde.org/branches/work/sonnet-refactoring/tests/guesslanguagetest.cpp?rev=627380&amp;amp;view=markup"&gt;tests&lt;/a&gt; and see if your language is listed. If not, please send me an example sentence.&lt;br /&gt;&lt;br /&gt;EDIT: Thanks to everyone who has helped to make corrections so far. If the test is bad (i.e. 
contains more proper names than suitable for a test) then please post or send a sentence that is more representative of your language.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-4769748147277826447?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/4769748147277826447/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=4769748147277826447" title="94 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4769748147277826447" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4769748147277826447" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/tZrvzH8cBRg/can-your-language-be-detected.html" title="Can your language be detected?" 
/><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">94</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/can-your-language-be-detected.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-4114646851083145711</id><published>2007-01-11T19:09:00.000-07:00</published><updated>2007-01-11T19:17:52.107-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="wikipedia" /><category scheme="http://www.blogger.com/atom/ns#" term="corpora" /><title type="text">Wikipedia, A Source Of Error Corpora</title><content type="html">The guys from &lt;a href="http://morfologik.blogspot.com/2006/05/about-project.html"&gt;Morfologik&lt;/a&gt; have created a neat method for gathering error corpora.&lt;br /&gt;&lt;br /&gt;They've succinctly described why such data sources are needed:&lt;br /&gt;&lt;blockquote&gt;Background. The developers of grammar checkers, and autocorrect lists, have hard times with finding relevant corpora. Revision history is an excellent source about native speakers perception of linguistic norms. Frequently revised typos are perceived as errors that need to be corrected, so using these typos on autocorrect lists is justified. The same goes for style, grammar and usage errors.&lt;/blockquote&gt;Well, where is the biggest source of revision history on the planet? Wikipedia. 
You can read the whole post on the &lt;a href="http://morfologik.blogspot.com/2007/01/wikipedia-history-diff-as-revision.html"&gt;Morfologik blog.&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-4114646851083145711?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/4114646851083145711/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=4114646851083145711" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4114646851083145711" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4114646851083145711" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/T-yZJj2ECdo/wikipedia-source-of-error-corpora.html" title="Wikipedia, A Source Of Error Corpora" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">4</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/wikipedia-source-of-error-corpora.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-2666392606408966179</id><published>2007-01-11T12:28:00.000-07:00</published><updated>2007-01-11T19:20:34.652-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="metrics" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Open Source Development Metrics</title><content 
type="html">I find statistics and charts on the development of open source quite fascinating. In the past I've used tools such as &lt;a href="http://cia.navi.cx/"&gt;CIA&lt;/a&gt;, which tracks commits to projects like KDE. It is then possible to track all manner of details.&lt;br /&gt;&lt;br /&gt;Today I came across &lt;a href="http://ohloh.net/"&gt;Ohloh&lt;/a&gt;. It describes itself as, "Mapping the open source world by collecting objective information on open source projects." Check out my &lt;a href="http://ohloh.net/projects/272/contributors/21350"&gt;statistics&lt;/a&gt;. Ohloh also tracks statistics for KDE overall.&lt;br /&gt;&lt;br /&gt;&lt;a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://ohloh.net/projects/272"&gt;&lt;img style="margin: 0px auto; display: block; text-align: center; cursor: pointer;" src="http://ohloh.net/projects/272.gif;badge" alt="KDE Statistics" border="0" /&gt;&lt;/a&gt;&lt;br /&gt;This data is collected solely from kdelibs and kdebase in trunk. It makes an assumption about the number of "person years" it would take to write the lines of code we have, and then multiplies that by an estimated $55,000 (USD) per year, per developer, to produce a rough estimate of what it would cost to develop KDE in the absence of volunteer work. 
Of course this neglects all the other non-programming work, as well as the work on KDE 1, 2 &amp;amp; 3.&lt;br /&gt;&lt;br /&gt;Naturally the &lt;a href="http://www.englishbreakfastnetwork.org/"&gt;English Breakfast Network&lt;/a&gt; (EBN) provides our own statistics on both &lt;a href="http://www.englishbreakfastnetwork.org/thisweek.php"&gt;project activity&lt;/a&gt; and potential errors in &lt;a href="http://www.englishbreakfastnetwork.org/krazy/"&gt;code&lt;/a&gt; or &lt;a href="http://www.englishbreakfastnetwork.org/apidocs/"&gt;documentation&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Danny Allen's &lt;a href="http://commit-digest.org/"&gt;Commit Digest&lt;/a&gt; has been getting better as well. Each week we are treated to some beautiful charts dissecting all the (programming) work that has gone into KDE.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-2666392606408966179?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/2666392606408966179/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=2666392606408966179" title="13 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/2666392606408966179" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/2666392606408966179" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/l286QB-LBkE/open-source-development-metrics.html" title="Open Source Development Metrics" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" 
name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">13</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/open-source-development-metrics.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-8294688801391701792</id><published>2007-01-10T13:50:00.001-07:00</published><updated>2007-01-10T14:00:31.111-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Spellcheck and Usability</title><content type="html">Lately, I've been thinking about the user interface KDE uses for spellcheck. There are two primary modes used:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Dialog checking: A dialog appears, and iterates over found misspellings. Suggestion and ignore features are available in the dialog.&lt;/li&gt;&lt;li&gt;Inline checking: Misspellings are highlighted or underlined red. Context menus provide correction options.&lt;/li&gt;&lt;/ol&gt;So I'll open the floodgates for suggestions in KDE4. What could be done to improve these interfaces in general? Are there alternate interfaces? 
What could be improved in KDE's GUI as it exists?&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-8294688801391701792?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/8294688801391701792/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=8294688801391701792" title="61 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/8294688801391701792" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/8294688801391701792" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/Sj8rb_GZDBA/spellcheck-and-usability.html" title="Spellcheck and Usability" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">61</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/spellcheck-and-usability.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-6371858941148943812</id><published>2007-01-10T10:38:00.000-07:00</published><updated>2007-01-11T18:46:13.354-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="enchant" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">A Web Service For PWLs?</title><content type="html">Dwayne wrote the following in the comments of a previous post:&lt;br 
/&gt;&lt;blockquote&gt;Glad to see your rolling this into a spec. I'm so tired of OOo and Mozilla and everything else using different spell checkers and differrent PWLs. I am also very interested in spell checking and related themes. At &lt;a href="http://translate.org.za/"&gt;Translate.org.za&lt;/a&gt; we've developed spell checkers of varying quality :) for the 11 official languages of South Africa. I'd like to know what we need to supply to get languages guessing working in Sonnet for those languages. Can you write a blog entry on what you need? One thing I would love is that users can submit their personaly dictionaries for possible inclusion into the formal dictionary. Every time someone adds a word to a personal dictionary there is a chance that it should go into the official one. We could create a poweful network of dictionary improvers.&lt;/blockquote&gt;Let me explain exactly what is happening. The &lt;a href="http://freedesktop.org/wiki/Standards_2fdesktop_2dlanguage_2dchecking_2dspec"&gt;spec&lt;/a&gt; on freedesktop.org (which has yet to be written) defines common interfaces for spell checking engines. The various spelling engines need to provide an interface conforming to the spec or, since they don't, we provide a wrapper.&lt;br /&gt;&lt;br /&gt;The spec doesn't create a new spelling engine. However, it seems the dictionaries Translate.org.za provides are in MySpell format. If installed, the MySpell plugin for Enchant would use them. It does this transparently: calling Enchant (or some spell checker that uses it) with "af_ZA" just works. Of course, Sonnet can detect Afrikaans. So in KDE4 you won't even need to set the language; the document will just start using the relevant dictionary.&lt;br /&gt;&lt;br /&gt;Now Dwayne's second question is much more interesting. There should be some way to harness the collective power of the personal word lists of all KDE users (or an even broader category of users) and aggregate them. 
This is a great project for someone to pick up. I might even get the itch if I could finish all the projects I've started first.&lt;br /&gt;&lt;br /&gt;I envision a website with some standard interface that would allow you to upload wordlists for a particular language. A client for KDE could be made that queries Sonnet and retrieves your PWLs. You could then download aggregations of words based on some criteria. For example, "Get every word with more than 5 instances in fr_FR." This could then be merged with your own list or be kept separate. The client could do this all seamlessly.&lt;br /&gt;&lt;br /&gt;This would provide valuable data to those who study the addition of new words into a language. This could also be used to create dictionaries for languages which currently don't have them, provided, of course, that the language is sufficiently &lt;a href="http://en.wikipedia.org/wiki/Analytic_language"&gt;analytic&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;EDIT: I just came across &lt;a href="http://joukahainen.lokalisointi.org/"&gt;Joukahainen&lt;/a&gt;, a Finnish web application that is quite similar to what I describe. 
It's GPL so someone might be able to use it to create my vision.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-6371858941148943812?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/6371858941148943812/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=6371858941148943812" title="27 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6371858941148943812" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6371858941148943812" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/ACfn54C5eVo/dwayne-wrote-following-in-comments-of.html" title="A Web Service For PWLs?" 
/><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">27</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/dwayne-wrote-following-in-comments-of.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-9141561990742440876</id><published>2007-01-09T15:27:00.000-07:00</published><updated>2007-01-09T15:33:00.258-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="freedesktop.org" /><category scheme="http://www.blogger.com/atom/ns#" term="enchant" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="elixir" /><title type="text">Freedesktop.org Spec for Language Checking</title><content type="html">I've added a new spec to the Freedesktop.org wiki for &lt;a href="http://freedesktop.org/wiki/Standards_2fdesktop_2dlanguage_2dchecking_2dspec#preview"&gt;Desktop Language Checking&lt;/a&gt;. 
It will be used to coordinate efforts between Gnome, KDE and others on spelling/grammar/style/diction checking.&lt;br /&gt;&lt;br /&gt;Also check out this Gnome bug report: &lt;a href="http://bugzilla.gnome.org/show_bug.cgi?id=383706"&gt;Bug 383706 – Adding support for spellcheckers into the Gtk+ stack&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-9141561990742440876?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/9141561990742440876/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=9141561990742440876" title="10 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/9141561990742440876" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/9141561990742440876" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/od0U_deiwx0/freedesktoporg-spec-for-language.html" title="Freedesktop.org Spec for Language Checking" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">10</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/freedesktoporg-spec-for-language.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-7319527643669686110</id><published>2007-01-09T15:23:00.000-07:00</published><updated>2007-01-09T15:27:42.549-07:00</updated><category 
scheme="http://www.blogger.com/atom/ns#" term="meta" /><title type="text">New Domain Name</title><content type="html">My blog can now be read at &lt;a href="http://blog.jacobrideout.net/"&gt;blog.jacobrideout.net&lt;/a&gt;. The old site on blogger can still be used. In fact, it is the same site. I've just added a CNAME DNS entry via Blogger's new &lt;a href="http://help.blogger.com/bin/answer.py?answer=55373"&gt;custom domain&lt;/a&gt; feature.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-7319527643669686110?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/7319527643669686110/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=7319527643669686110" title="38 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/7319527643669686110" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/7319527643669686110" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/2BDJnzyYsUE/new-domain-name.html" title="New Domain Name" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total 
xmlns:thr="http://purl.org/syndication/thread/1.0">38</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/new-domain-name.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-5435810697723094864</id><published>2007-01-03T00:50:00.000-07:00</published><updated>2007-01-08T06:52:50.724-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="enchant" /><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Pourquoi enchanter le dragon ?</title><content type="html">I keep getting asked why Enchant will be used in Sonnet. There seems to be a fear that the language of the questioner will no longer be supported. This is not the case. Enchant supports all the current spell checking engines in KDE - without &lt;span style="font-style: italic;"&gt;extra&lt;/span&gt; layers of indirection. Enchant does the exact &lt;span style="font-style: italic;"&gt;same&lt;/span&gt; thing as the old KSpell plugins. But there are some additional features that Enchant supports and KSpell does not:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;More languages are supported. Enchant has plugins for many more languages than KDE currently has.&lt;/li&gt;&lt;li&gt;Per-language engine preferences. If you write documents in multiple languages, you can choose the best checker for each language.&lt;/li&gt;&lt;li&gt;Enchant supports emulating session and personal dictionaries for checkers that don't support them.&lt;/li&gt;&lt;li&gt;Persistent settings across both Gnome and KDE&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;We could add these features to what is currently available in KSpell2, but why? 
With Enchant we share the burden of maintenance and support.&lt;br /&gt;&lt;br /&gt;Take a look at &lt;a href="http://www.abisource.com/enchant/"&gt;Enchant's website&lt;/a&gt; for more information.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-5435810697723094864?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/5435810697723094864/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=5435810697723094864" title="80 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/5435810697723094864" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/5435810697723094864" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/yu7_qAGWZzk/pourquoi-enchantez-le-dragon.html" title="Pourquoi enchanter le dragon ?" 
/><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">80</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2007/01/pourquoi-enchantez-le-dragon.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-1172285002635971885</id><published>2006-12-31T03:13:00.000-07:00</published><updated>2007-01-02T12:58:10.758-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><category scheme="http://www.blogger.com/atom/ns#" term="kdelibs" /><title type="text">How Is Sonnet Stacking Up?</title><content type="html">The Sonnet stack is quickly taking shape. We are going to have some cool capabilities in KDE4 that make writing much easier. The stack, as it is planned, is shown below. The brackets show an estimate of the work that has been completed thus far.&lt;br /&gt;&lt;br /&gt;EDIT: Based on a discussion on kde-core-devel, all Sonnet classes will be in the Sonnet namespace. So you can prepend Sonnet:: to the class names below.&lt;br /&gt;&lt;h2&gt;Foundations&lt;/h2&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;QString&lt;/span&gt; &amp; other Qt classes - Provides 16-bit strings that store Unicode characters.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;UnicodeData&lt;/span&gt; [90%] Provides means to query information from UCD files provided by the Unicode Consortium. A tool named parseucd is provided to convert UCD files into a data format optimized for fast lookups and low memory usage. 
This also allows users to regenerate any relevant data files in order to modify behavior in the rest of the stack.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Parsing (NLP)&lt;br /&gt;&lt;/h2&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;GuessLanguage&lt;/span&gt; [50%] This class provides a statistical guess as to which language a given sample might be written in. It is based on a simple N-gram model and currently uses trigrams as well as other heuristics to determine a language. The class will be tuned to provide fast prediction for paragraph-length text. [Currently based on &lt;a href="http://languid.cantbedone.org/"&gt;Languid&lt;/a&gt;]&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;TextBreaks&lt;/span&gt; [90%] Provides a list of relevant breaks in a given string. The default implementation will use the suggestions provided by the Unicode Consortium. This should provide adequate partitioning for word and sentence boundaries in most of the world's languages (where such concepts have meaning in orthography).&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;AbstractFilter/DefaultFilter&lt;/span&gt; [85%] Provides a customizable filter for determining words and sentences. This class determines text break locations and then determines if each segmented part of speech is relevant to the target of the query. This can be customized through interpretable rules set by the user.&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Correctness Testing (Spell/Grammar/Style Checking)&lt;/h2&gt;&lt;ul&gt;&lt;li&gt;Currently KSpell2 uses a plugin framework for accessing spellchecker engines. AbiWord uses a framework they developed called Enchant, which performs almost exactly the same task as KSpell and has a very similar interface for plugins. This is no coincidence, since most spellcheckers implement an API designed for compatibility with ispell. 
In fact, KSpell has an Enchant plugin.&lt;/li&gt;&lt;li&gt;Sonnet will utilize Enchant as the interface to spellchecking and no longer support old plugins. This allows us to use the same spelling engines and rules as the growing number of applications supporting Enchant. This also makes Sonnet more maintainable and less bug-prone, with more plugins available for more languages.&lt;/li&gt;&lt;li&gt;Grammar checking and style checking are highly requested features and will be available via Elixir. Rather than write a KDE-specific framework for interfacing with grammar checkers, we are working with the developers of Enchant to provide a general library similar to Enchant but tailored to the needs of these types of tools.&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Enchant&lt;/span&gt; [98%]&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Elixir&lt;/span&gt; [5%]&lt;/li&gt;&lt;/ul&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Spell&lt;/span&gt; [99%] An interface to &lt;span&gt;Enchant.&lt;/span&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Grammar&lt;/span&gt; [50%] An interface to Elixir.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;Background Checking&lt;/h2&gt;&lt;ul&gt;&lt;li&gt;The parsing and analysis of language is time-intensive. Sonnet will replace the old KSpell2 background checking (based on QThread subclassing) with a ThreadWeaver-based implementation that will support both KSpell and KGrammar. [10%]&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;h2&gt;GUI&lt;/h2&gt;&lt;blockquote&gt;No work has started on the GUI layer. Usability review requests have been made and I'm awaiting feedback. 
Until then, as can be seen, there is plenty of lower-level work to keep busy with.&lt;/blockquote&gt;&lt;ul&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Configuration&lt;/span&gt; - Implement features to embed configuration of Enchant and Elixir in applications.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Standard Checking Dialogs &amp; Widgets&lt;/span&gt; - This includes the dialog that appears when checking text and allows you to iterate through errors.&lt;/li&gt;&lt;li&gt;&lt;span style="font-weight: bold;"&gt;Highlighting&lt;/span&gt; - Automatic highlighting of misspelled words, etc.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;&lt;blockquote&gt;Beyond the usability of the GUI, some thought is now going into the proper behavior of the actions associated with checking a document. For example, should "ignore" permanently ignore that word in the application? Systemwide for all applications? Or just for the session in which the application is used?&lt;br /&gt;&lt;h2&gt;&lt;/h2&gt;&lt;/blockquote&gt;&lt;h2&gt;Auxiliary Code&lt;br /&gt;&lt;/h2&gt;&lt;blockquote&gt;Sonnet will have a number of helpful classes and code snippets that can be incorporated into applications, including, but not limited to:&lt;/blockquote&gt;&lt;ul&gt;&lt;li&gt;Automatic detection of language and setting the spellchecker to use the correct dictionary.&lt;/li&gt;&lt;li&gt;Advanced statistics - word/sentence/other counts, &lt;span style="font-weight: bold;"&gt;readability&lt;/span&gt; scores (Kincaid, ARI, Fog, etc.)&lt;/li&gt;&lt;li&gt;Advanced layout hints - Example: should text containing 70% Hebrew be right-aligned?&lt;/li&gt;&lt;li&gt;Tools to define and configure autocorrection.&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Some of these classes might not be appropriate for inclusion in kdelibs and may be placed elsewhere.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/6252700852068053040-1172285002635971885?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/1172285002635971885/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=1172285002635971885" title="101 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1172285002635971885" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1172285002635971885" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/f8dDgT6cR_s/how-is-sonnet-stacking-up.html" title="How Is Sonnet Stacking Up?" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">101</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/how-is-sonnet-stacking-up.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-6700281637429899988</id><published>2006-12-29T08:17:00.000-07:00</published><updated>2006-12-29T09:13:47.275-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="phrasis" /><category scheme="http://www.blogger.com/atom/ns#" term="writing" /><title type="text">New Conclusions</title><content type="html">I've been collecting feedback on Phrasis over the past two weeks now. However, the holidays slowed my progress quite a bit. 
Today, the suggestions started to swirl around and then coalesce into something much more coherent.&lt;br /&gt;&lt;br /&gt;My recent conclusions:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Writers don't want grammar checking. They want style checking and this may or may not include grammar checking.&lt;/li&gt;&lt;li&gt;Writers say they want workflow management. But they don't. When they do get it, most ignore or misuse it. I consider this analogous to programmers and writing documentation. Yet, like code documentation, managing workflow is something good writers do. (But they refer to it by different names and do it in different ways.) So, how do you provide useful workflow to a writer?&lt;/li&gt;&lt;li&gt;There is a demand for limited dictionaries. Rather than having every valid spelling in the English language, some writers would like a subset suitable for a less literate or less technical mass audience. Undefined words would then be highlighted and the choice of their use would be deliberate.&lt;br /&gt;By the same token, there are several well-known algorithms available to analyze the 'readability' of text. They output scores that roughly correspond to grade level.&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Some writers would like some kind of tagging for their text. This would be similar to a more general system of annotations. You could tag a paragraph "find sources" or "needs work" and then have some system to query the tag database in the document.&lt;/li&gt;&lt;/ul&gt;I'll have more ideas posted later. 
I should have a full requirements / feature plan document for version 1.0 up around 3 January.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-6700281637429899988?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/6700281637429899988/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=6700281637429899988" title="7 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6700281637429899988" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6700281637429899988" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/CSGe2DqyFUI/new-conclusions.html" title="New Conclusions" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">7</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/new-conclusions.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-1774863274485098429</id><published>2006-12-29T03:56:00.000-07:00</published><updated>2006-12-29T03:57:05.544-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="phrasis" /><title type="text">Inkthinker on Phrasis</title><content type="html">Kristen King, a top blogger in the writing blogosphere, made a post just before Christmas soliciting suggestions for Phrasis.&lt;br /&gt;&lt;br /&gt;Read: &lt;a 
href="http://inkthinker.blogspot.com/2006/12/open-source-text-editor-for-writers.html"&gt;Open-source text editor for writers&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-1774863274485098429?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/1774863274485098429/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=1774863274485098429" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1774863274485098429" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/1774863274485098429" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/fV5u4CrFXqM/inkthinker-on-phrasis.html" title="Inkthinker on Phrasis" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">4</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/inkthinker-on-phrasis.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-6961382170371662509</id><published>2006-12-28T21:07:00.000-07:00</published><updated>2006-12-28T21:15:02.656-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="phrasis" /><title type="text">Phasis is gaining momentum</title><content type="html">I'm now in communication with a writer from &lt;a href="http://www.linux.com/"&gt;Linux.com&lt;/a&gt; who may be writing a 
story on &lt;a href="http://code.google.com/p/phrasis/"&gt;Phrasis&lt;/a&gt;. This is great news since I need *much* more feedback on how Phrasis should end up. I'll post an update when I learn more.&lt;br /&gt;&lt;br /&gt;Phrasis has two more ways to connect with users. On IRC you can now go to #phrasis, a new channel on freenode. There is a new public &lt;a href="http://scratchpad.wikia.com/wiki/Phrasis"&gt;wiki&lt;/a&gt; as well.&lt;br /&gt;&lt;br /&gt;Plus, all the work I'm doing in Sonnet for KDE will be used in Phrasis once I get the chance. This is great news for internationalization support and for platform integration.&lt;br /&gt;&lt;br /&gt;Cheers!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-6961382170371662509?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/6961382170371662509/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=6961382170371662509" title="7 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6961382170371662509" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/6961382170371662509" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/WRMA3UaZ6Nw/phasis-is-gaining-momentum.html" title="Phasis is gaining momentum" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total 
xmlns:thr="http://purl.org/syndication/thread/1.0">7</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/phasis-is-gaining-momentum.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-4853617858341857240</id><published>2006-12-28T05:00:00.000-07:00</published><updated>2006-12-28T05:10:42.683-07:00</updated><title type="text">The latest SVN commit feed</title><content type="html">Several people have asked me about the SVN commit log that is shown at the bottom of my blog. The code to generate it is very simple.&lt;br /&gt;&lt;br /&gt;I use two services: &lt;a href="http://feed2js.org/"&gt;feed2js&lt;/a&gt; and &lt;a href="http://cia.navi.cx"&gt;CIA&lt;/a&gt;. CIA is an interesting service that tracks repositories for open source projects. I take the feed it provides and use feed2js to generate JavaScript that converts the latest RSS entry to HTML.&lt;br /&gt;&lt;br /&gt;&lt;blockquote&gt;&amp;lt;script language="JavaScript" src="http://feed2js.org/feed2js.php?src=http%3A%2F%2Fcia.navi.cx%2Fstats%2Fauthor%2Fjrideout%2F.rss&amp;amp;amp;num=1&amp;amp;amp;tz=-7&amp;amp;amp;html=a" type="text/javascript"&amp;gt;&amp;lt;/script&amp;gt;&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-4853617858341857240?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/4853617858341857240/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=4853617858341857240" title="10 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4853617858341857240" /><link rel="self" type="application/atom+xml" 
href="http://www.blogger.com/feeds/6252700852068053040/posts/default/4853617858341857240" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/L5LfOqMTq7w/latest-svn-commit-feed.html" title="The latest SVN commit feed" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">10</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/latest-svn-commit-feed.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-5413962833822441355</id><published>2006-12-27T23:55:00.000-07:00</published><updated>2006-12-28T00:13:34.609-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><category scheme="http://www.blogger.com/atom/ns#" term="kdelibs" /><title type="text">KAutoSaveFile</title><content type="html">Thiago Macieira made a post last month, detailing the &lt;a href="http://www.kdedevelopers.org/node/2540"&gt;opportunity&lt;/a&gt; for a new KDE developer to hack on kdelibs. At that time I wanted to jump on it, but was bogged down with schoolwork. Then, this past week, after I received my Subversion access to KDE, I looked to see if anyone had taken Thiago up on the offer. No one had, so I spent a few hours and wrote the implementation of KAutoSaveFile.&lt;br /&gt;&lt;br /&gt;With KAutoSaveFile you can easily create a temporary file to write unsaved data to. If the application fails, you can recover any lost documents. KOffice will be changing its own implementation of this feature to KAutoSaveFile shortly.&lt;br /&gt;&lt;br /&gt;On a more general note, Aaron Seigo and I have been conversing about the overlapping file classes and methods in KDE. 
He has just &lt;a href="http://aseigo.blogspot.com/2006/12/how-not-to-save-temporary-data.html"&gt;outlined&lt;/a&gt; what should now be used in his blog. All this simplifying is great.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-5413962833822441355?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/5413962833822441355/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=5413962833822441355" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/5413962833822441355" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/5413962833822441355" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/W_W_Dzp6iB0/kautosavefile.html" title="KAutoSaveFile" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">3</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/kautosavefile.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-7199291660906211708</id><published>2006-12-24T21:23:00.000-07:00</published><updated>2006-12-26T06:37:32.267-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">Zach, on Sonnet</title><content type="html">Here is Zach &lt;a 
href="http://zrusin.blogspot.com/2006/05/improving-reality.html"&gt;introducing&lt;/a&gt; Sonnet. It is a bit old (May 2006) yet still relevant. I plan on slowly taking over some of the maintainer responsibilities from Zach.&lt;br /&gt;&lt;blockquote&gt;Talking about spell checking I played a bit with Sonnet over the weekend. I've been handling KSpell, then KSpell2 for a while and then I just grew tired of it last summer (for various reasons not really related to the code itself). I've been toying with the idea of full linguistic framework for a while. Besides spell checking we're talking about grammar checker, dictionary, thesaurus and translator. Sonnet is just that - full linguistic framework. I'd like to have all those functions available to all KDE applications. Being able to take a step back, look at all the problems I've seen and complains I got over the years from both users and developers and just sit down and rework the whole framework to fix them is great. Linguistics is fascinating and for some reasons there's not a whole lot of people who'd want to deal with it, at least not as far as its desktop usage goes.&lt;/blockquote&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-7199291660906211708?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/7199291660906211708/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=7199291660906211708" title="3 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/7199291660906211708" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/7199291660906211708" /><link rel="alternate" type="text/html" 
href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/p6nLYRAFvpM/zach-on-sonnet.html" title="Zach, on Sonnet" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">3</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/zach-on-sonnet.html</feedburner:origLink></entry><entry><id>tag:blogger.com,1999:blog-6252700852068053040.post-8940488218026348512</id><published>2006-12-24T20:56:00.000-07:00</published><updated>2006-12-24T21:12:07.805-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="sonnet" /><category scheme="http://www.blogger.com/atom/ns#" term="kde" /><title type="text">KDE Digest</title><content type="html">I'm in this week's KDE &lt;a href="http://commit-digest.org/issues/2006-12-24/"&gt;Digest&lt;/a&gt;! I've been working on Sonnet. Sonnet (also known as KSpell2) will be the spelling &amp; grammar checker for KDE4.&lt;br /&gt;&lt;br /&gt;I'm working on a &lt;a href="http://unicode.org/reports/tr29/"&gt;Unicode&lt;/a&gt;-compliant parser for word and sentence boundaries to replace the regex hack we have now. While the regex worked fine for English and most European languages, it didn't work at all for other scripts, like Hebrew or Devanagari. The new parser does (well, mostly; we have a few bugs).&lt;br /&gt;&lt;br /&gt;Sonnet works by having plugins for various spelling and grammar engines. 
Once &lt;a href="http://zrusin.blogspot.com/"&gt;Zach&lt;/a&gt; (the official maintainer of Sonnet) commits the grammar interface he promised (it's somewhere on his computer, he tells me), I can convert my &lt;a href="http://www.link.cs.cmu.edu/link/"&gt;link-grammar&lt;/a&gt; engine interface to a Sonnet plugin.&lt;br /&gt;&lt;br /&gt;Sonnet also has a few UI elements that need some usability love. This should follow in a few weeks.&lt;br /&gt;&lt;br /&gt;KDE4 is going to rock!&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/6252700852068053040-8940488218026348512?l=blog.jacobrideout.net'/&gt;&lt;/div&gt;</content><link rel="replies" type="application/atom+xml" href="http://blog.jacobrideout.net/feeds/8940488218026348512/comments/default" title="Post Comments" /><link rel="replies" type="text/html" href="https://www.blogger.com/comment.g?blogID=6252700852068053040&amp;postID=8940488218026348512" title="4 Comments" /><link rel="edit" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/8940488218026348512" /><link rel="self" type="application/atom+xml" href="http://www.blogger.com/feeds/6252700852068053040/posts/default/8940488218026348512" /><link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/MountainGoatProgrammer/~3/2WmMROvnlsE/kde-digest.html" title="KDE Digest" /><author><name>Jacob Rideout</name><uri>http://www.blogger.com/profile/10121984287633320596</uri><email>noreply@blogger.com</email><gd:extendedProperty xmlns:gd="http://schemas.google.com/g/2005" name="OpenSocialUserId" value="04006731725119955092" /></author><thr:total xmlns:thr="http://purl.org/syndication/thread/1.0">4</thr:total><feedburner:origLink>http://blog.jacobrideout.net/2006/12/kde-digest.html</feedburner:origLink></entry></feed>
