<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:blogger='http://schemas.google.com/blogger/2008' xmlns:georss='http://www.georss.org/georss' xmlns:gd="http://schemas.google.com/g/2005" xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-7071496506298372327</id><updated>2026-04-11T11:33:41.816-07:00</updated><category term="list loading"/><category term="features"/><category term="markmail"/><category term="analytics"/><category term="search results"/><category term="architecture"/><category term="google"/><category term="interview"/><category term="marklogic"/><category term="markmail search architecture"/><category term="perl"/><category term="shirts"/><category term="xml"/><category term="apache"/><category term="apachecon"/><category term="appfuse"/><category term="cargo"/><category term="castor"/><category term="codehaus"/><category term="css"/><category term="drools"/><category term="eclipse"/><category term="failover"/><category term="freebsd"/><category term="gadget"/><category term="gnome"/><category term="googlegroups"/><category term="grails"/><category term="groovy"/><category term="java.net"/><category term="jdom"/><category term="jruby"/><category term="kde"/><category term="launch"/><category term="mailman"/><category term="markmail anniversary"/><category term="markmail conference"/><category term="markmail mlug"/><category term="markmail rubyonrails ruby"/><category term="mojo"/><category term="mozilla"/><category term="mysql"/><category term="nanog"/><category term="netbeans"/><category term="netcoolusers"/><category term="openejb"/><category term="openmoko"/><category term="openoffice"/><category term="operations"/><category term="pear"/><category term="perforce"/><category term="perl screencast"/><category term="php"/><category 
term="picocontainer"/><category term="plexus"/><category term="postgresql"/><category term="procmail"/><category term="python"/><category term="redhat"/><category term="saxon"/><category term="squid"/><category term="stemming"/><category term="talks"/><category term="w3c"/><category term="wso2"/><category term="xen"/><category term="xquery"/><category term="xslt"/><category term="xstream"/><category term="xwiki"/><title type='text'>The Making of MarkMail</title><subtitle type='html'>Welcome to the MarkMail team blog.  Here we discuss MarkMail enhancements, new mailing list archives, and (perhaps most important) discuss the challenges of building an Internet service for searching large, million-message mailing list archives using XML, XQuery, and MarkLogic Server.</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default?redirect=false'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><link rel='next' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default?start-index=26&amp;max-results=25&amp;redirect=false'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><generator version='7.00' 
uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>58</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>25</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-299452598506386478</id><published>2010-01-15T19:36:00.001-08:00</published><updated>2010-10-18T12:14:39.787-07:00</updated><title type='text'>Welcome community plumbers (aka Drupal)</title><content type='html'>&lt;div style=&quot;float: right;&quot;&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglhllTRLwHLYFAJ1UXxU9A3Luu3jAfLMijF6nYXYjLlVa88E7tuxl3ACTqNkHPLz1MzKJLaR62osemBXjW0qXqglJjctj8TxF_KIbrV0HsuZzLhv2HRp1C1FaTb6zFlu3TJ5uTwyC5Xw4/s1600-h/Screen+shot+2010-01-27+at+3.12.13+PM.png&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 176px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglhllTRLwHLYFAJ1UXxU9A3Luu3jAfLMijF6nYXYjLlVa88E7tuxl3ACTqNkHPLz1MzKJLaR62osemBXjW0qXqglJjctj8TxF_KIbrV0HsuZzLhv2HRp1C1FaTb6zFlu3TJ5uTwyC5Xw4/s400/Screen+shot+2010-01-27+at+3.12.13+PM.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5431561614112517970&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;/div&gt;Towards the end of last year we loaded up the &lt;a href=&quot;http://drupal.org/&quot;&gt;Drupal&lt;/a&gt; archives and subscribed to their active lists.  Drupal is a pretty successful project with lots of real-world use and we figured they might be reasonably chatty online.&lt;br /&gt;&lt;br /&gt;As preparation for this post, I checked out the Drupal front page.  
I love a good analogy and their HTML &lt;tt&gt;&amp;lt;title&amp;gt;&lt;/tt&gt; put a good smile on my face:&lt;pre&gt;    &amp;lt;title&amp;gt;drupal.org | Community plumbing&amp;lt;/title&amp;gt;&lt;/pre&gt;It made me think we might reconsider the MarkMail title, given what we do.  MarkMail helps &lt;span style=&quot;font-style: italic;&quot;&gt;organize&lt;/span&gt; development communities and so my first thought was:&lt;br /&gt;&lt;pre&gt;    &amp;lt;title&amp;gt;markmail.org | Community organization&amp;lt;/title&amp;gt;&lt;/pre&gt;More specifically, we help with project &lt;span style=&quot;font-style: italic;&quot;&gt;histories&lt;/span&gt; so I changed it to:  &lt;pre&gt;    &amp;lt;title&amp;gt;markmail.org | Community histories&amp;lt;/title&amp;gt;&lt;/pre&gt;And then, perhaps best of all, I landed on:&lt;pre&gt;    &amp;lt;title&amp;gt;markmail.org | Community libraries&amp;lt;/title&amp;gt; &lt;/pre&gt;And as the chart above shows, these folks are pretty chatty and have been so since about 2004.&lt;br /&gt;&lt;br /&gt;So here&#39;s a hearty welcome from us at MarkMail&lt;br /&gt;&lt;br /&gt;&lt;div style=&quot;text-align: center;&quot;&gt; &lt;img alt=&quot;Librarian&quot; src=&quot;http://upload.wikimedia.org/wikipedia/commons/c/cb/Librarian_at_the_card_files_at_a_senior_high_school_in_New_Ulm%2C_Minnesota.jpg&quot; height=&quot;150&quot; width=&quot;220&quot; /&gt;&lt;/div&gt;&lt;br /&gt;to the busy folks over at Drupal&lt;br /&gt;&lt;br /&gt;&lt;div style=&quot;text-align: center;&quot;&gt;&lt;img alt=&quot;Plumber&quot; 
src=&quot;http://www.plumberspreston.co.uk/plumber001-%28no-bkgd%29.gif&quot; height=&quot;200&quot; width=&quot;150&quot; /&gt;&lt;br /&gt;&lt;div style=&quot;text-align: left;&quot;&gt;&lt;br /&gt;&lt;/div&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/299452598506386478/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/299452598506386478' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/299452598506386478'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/299452598506386478'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2010/01/welcome-community-plumbers-aka-drupal.html' title='Welcome community plumbers (aka Drupal)'/><author><name>Eric Bloch</name><uri>http://www.blogger.com/profile/02699687256217967826</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEglhllTRLwHLYFAJ1UXxU9A3Luu3jAfLMijF6nYXYjLlVa88E7tuxl3ACTqNkHPLz1MzKJLaR62osemBXjW0qXqglJjctj8TxF_KIbrV0HsuZzLhv2HRp1C1FaTb6zFlu3TJ5uTwyC5Xw4/s72-c/Screen+shot+2010-01-27+at+3.12.13+PM.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-4567502038041435075</id><published>2010-01-13T10:48:00.000-08:00</published><updated>2010-01-14T13:20:18.905-08:00</updated><title type='text'>The wide world of Ubuntu: 1.9 million messages</title><content type='html'>&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; 
href=&quot;http://www.ubuntu.com/sites/all/themes/ubuntu09/logo.png&quot;&gt;&lt;img style=&quot;margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 202px; height: 55px;&quot; src=&quot;http://www.ubuntu.com/sites/all/themes/ubuntu09/logo.png&quot; alt=&quot;&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;A few months ago, we loaded up the publicly available &lt;a href=&quot;http://www.ubuntu.com/&quot;&gt;Ubuntu&lt;/a&gt; archives and subscribed to their active lists.  MarkMail now searches &lt;strong&gt;313 Ubuntu lists&lt;/strong&gt; and &lt;strong&gt;1,917,153 messages&lt;/strong&gt;. The first Ubuntu list started in &lt;strong&gt;July 2004&lt;/strong&gt; and there are currently &lt;strong&gt;200 active  lists&lt;/strong&gt;, recently accumulating &lt;strong&gt;4,061  messages per day&lt;/strong&gt;.&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxSGcdqnJ3C4hjoC5aV7RD9oMGwy0-l42nYSqnDVSj1ToEeg8tIT99_aloiHQU6B31wo_dQgxtM23xtEBxZ3tEQyLmPUHz5kDbQ2FaRVW1pwhOeg_2sLci_tOQaci381n8_7z2XjQAquk/s1600-h/Screen+shot+2010-01-13+at+10.50.01+AM.png&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 147px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxSGcdqnJ3C4hjoC5aV7RD9oMGwy0-l42nYSqnDVSj1ToEeg8tIT99_aloiHQU6B31wo_dQgxtM23xtEBxZ3tEQyLmPUHz5kDbQ2FaRVW1pwhOeg_2sLci_tOQaci381n8_7z2XjQAquk/s400/Screen+shot+2010-01-13+at+10.50.01+AM.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5426299473969519586&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;  Ubuntu is very much a world-wide project.  
Below is a recent snapshot of the message counts for Ubuntu lists associated with specific countries:&lt;br /&gt;&lt;table&gt;&lt;tbody&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-ar&quot;&gt;Argentina&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;26,923&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-au&quot;&gt;Australia&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;5,597&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-br&quot;&gt;Brazil&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;67,940&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-co&quot;&gt;Colombia&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;20,758&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-de&quot;&gt;Germany&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;19,438&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-es&quot;&gt;Spain&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;41,489&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-it&quot;&gt;Italy&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;43,618&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-ni&quot;&gt;Nicaragua&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;11,761&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-ru&quot;&gt;Russia&lt;/a&gt;&lt;/td&gt;&lt;td align=&quot;right&quot;&gt;19,321&lt;/td&gt;&lt;/tr&gt;&lt;tr&gt;&lt;td&gt;&lt;a href=&quot;http://markmail.org/browse/com.ubuntu.lists.ubuntu-uk&quot;&gt;United Kingdom&lt;/a&gt;&lt;/td&gt;&lt;td 
align=&quot;right&quot;&gt;21,995&lt;/td&gt;&lt;/tr&gt;&lt;/tbody&gt;&lt;/table&gt;</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/4567502038041435075/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/4567502038041435075' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/4567502038041435075'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/4567502038041435075'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2010/01/wide-world-of-ubuntu-19-million.html' title='The wide world of Ubuntu: 1.9 million messages'/><author><name>Eric Bloch</name><uri>http://www.blogger.com/profile/02699687256217967826</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjxSGcdqnJ3C4hjoC5aV7RD9oMGwy0-l42nYSqnDVSj1ToEeg8tIT99_aloiHQU6B31wo_dQgxtM23xtEBxZ3tEQyLmPUHz5kDbQ2FaRVW1pwhOeg_2sLci_tOQaci381n8_7z2XjQAquk/s72-c/Screen+shot+2010-01-13+at+10.50.01+AM.png" height="72" width="72"/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-1186233995286800932</id><published>2009-11-08T19:35:00.000-08:00</published><updated>2009-11-18T17:24:42.433-08:00</updated><title type='text'>Magic Eraser Policy</title><content type='html'>If you&#39;ve been around email long enough, you know there&#39;s a reason that gmail has that magic &lt;a href=&quot;http://gmailblog.blogspot.com/2009/03/new-in-labs-undo-send.html&quot;&gt;undo send&lt;/a&gt; feature. 
(Well, ok, once you see what it is, it&#39;s not that magic.)&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://www.naa.gov.au/Images/B4717_BRADFORD.FREDERICK%20ANDREW_A97%20700%20wide_tcm2-10026.jpg&quot;&gt;&lt;img style=&quot;margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 120px; height: 150px;&quot; src=&quot;http://www.naa.gov.au/Images/B4717_BRADFORD.FREDERICK%20ANDREW_A97%20700%20wide_tcm2-10026.jpg&quot; alt=&quot;Your permanent record&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;Because of the nature of digital information, it&#39;s &lt;span style=&quot;font-style: italic;&quot;&gt;too&lt;/span&gt; &lt;span style=&quot;font-style: italic;&quot;&gt;easy&lt;/span&gt; to have that email end up &lt;a href=&quot;http://images.salon.com/comics/tomo/2002/12/02/tomo/story.jpg&quot;&gt;somewhere&lt;/a&gt; you&#39;d rather not see it. Once you hit send, you don&#39;t have much control over the copying process. That email immediately becomes part of your &lt;span style=&quot;font-style: italic; font-weight: bold;&quot;&gt;permanent record&lt;/span&gt;.  And, well, you&#39;ve probably been there yourself at one time or another, wishing you hadn&#39;t sent it.&lt;br /&gt;&lt;br /&gt;Even though there are a ton of &lt;a href=&quot;http://www.amazon.com/s/ref=nb_ss?url=search-alias%3Dstripbooks&amp;amp;field-keywords=email+etiquette&amp;amp;x=5&amp;amp;y=18&quot;&gt;sources&lt;/a&gt; on email etiquette, it&#39;s not too surprising that folks send email they wish they hadn&#39;t.  We get weekly evidence of that.    Like &lt;a href=&quot;http://www.google.com/support/webmasters/bin/answer.py?hl=en&amp;amp;answer=156412&quot;&gt;search engines&lt;/a&gt; and other popular archives, we get our share of requests to remove content.  Sometimes folks ask for individual emails to be removed.  
Other times, we get more open-ended requests like, &quot;please remove my name from your site&quot;.&lt;br /&gt;&lt;p&gt;&lt;/p&gt;The funny thing is that mailing list tools aren&#39;t really set up for redaction.  They don&#39;t make it easy to remove individual emails, let alone parts of emails, from their archives.   So, what does MarkMail do with this issue?&lt;br /&gt;&lt;div style=&quot;float: left;&quot;&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://z.about.com/d/chemistry/1/0/L/P/eraser1.jpg&quot;&gt;&lt;img style=&quot;margin: 0pt 10px 10px 0pt; cursor: pointer; width: 147px; height: 125px;&quot; src=&quot;http://z.about.com/d/chemistry/1/0/L/P/eraser1.jpg&quot; alt=&quot;Magic Eraser&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;MarkLogic Server to the rescue!&lt;br /&gt;&lt;br /&gt;One of the cool things for us is that MarkMail actually has a magic eraser.  With little pain, thanks to the real-time index provided by our MarkLogic Server, we can remove emails from our index at a moment&#39;s notice.  With a single line of XQuery code, we can move the document representing that email into a hidden collection and in nanoseconds, it&#39;s gone from all further queries to MarkMail.  Yes, that is neato.&lt;br /&gt;&lt;br /&gt;So we have the technology.&lt;br /&gt;&lt;br /&gt;Still, we don&#39;t take message removal lightly.  As an authoritative record of public history, removals are treated as the exceptional case.  MarkMail provides content that it receives from publicly available sources. Everything we serve, we received from another source.  List administrators control their own lists and set policies on archives and we respect that.  By posting on their list, you follow their rules, and we do too.   
So, if you want something removed from MarkMail, you&#39;ll usually have to get it removed from the original source first.&lt;br /&gt;&lt;br /&gt;We&#39;ve recently added our &lt;a href=&quot;http://markmail.org/docs/removal-policy.xqy&quot;&gt;removal policy&lt;/a&gt; to the site, which has full details on how we deal with removal requests.  The  mechanism starts at our &lt;a href=&quot;http://markmail.org/docs/feedback.xqy&quot;&gt;feedback page&lt;/a&gt;.  In a nutshell, we will remove emails under two specific cases:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;Content in Violation of the &lt;a href=&quot;http://markmail.org/docs/content-policy.xqy&quot;&gt;MarkMail Content Policy&lt;/a&gt; (e.g. spam, porn, virus, fraud, illegal activities, copyright violations)&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Content Removed from Official Archive&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;When content clearly violates our content policy, we may simply remove it.  No surprise there.  That&#39;s bad stuff.   Any other material needs to be removed from the original source before we&#39;ll act.  In all cases, we prefer requests from the archive owner and/or requests that come with evidence that the content has been removed from the originating source.&lt;br /&gt;&lt;br /&gt;This policy attempts to balance the rights of many parties, including those who have posted content to public lists and forums, those who own and administer the public lists and forums, and those who link to and reference the MarkMail archives as an information source and accurate record of history.  
We hope you find it reasonable.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/1186233995286800932/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/1186233995286800932' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/1186233995286800932'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/1186233995286800932'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2009/11/magic-eraser-policy.html' title='Magic Eraser Policy'/><author><name>Eric Bloch</name><uri>http://www.blogger.com/profile/02699687256217967826</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-6570297384751966421</id><published>2009-11-04T09:32:00.000-08:00</published><updated>2009-11-04T16:19:45.672-08:00</updated><title type='text'>Easy Change</title><content type='html'>As we look to make MarkMail pay for itself, it&#39;s pretty darn obvious that the &lt;a href=&quot;http://www.alexa.com/siteinfo/markmail.org&quot;&gt;traffic&lt;/a&gt; the site gets is sufficient to generate some revenue from advertising. A good number of well-known developer-centric sites display ads (e.g., &lt;a href=&quot;http://sourceforge.net/softwaremap/&quot;&gt;sourceforge.net&lt;/a&gt;, &lt;a href=&quot;http://www.xml.com/&quot;&gt;www.xml.com&lt;/a&gt;, &lt;a href=&quot;http://www.vim.org/&quot;&gt;vim.org&lt;/a&gt;, &lt;a href=&quot;http://linux.org/&quot;&gt;linux.org&lt;/a&gt;, many others) with varying degrees of success. 
It even looks like &lt;a href=&quot;http://msdn.com/&quot;&gt;msdn.com&lt;/a&gt; displays ads (albeit for other MS properties). And there are also sites like Experts Exchange that charge users a premium to search their archives, while also displaying ads.&lt;br /&gt;&lt;br /&gt;The plan is to make minor adjustments to the site layout and provide relevant ads that are meaningful to a software developer.   &lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://www.myinvestmentanalysis.com/wp-content/uploads/2009/07/penny-stocks.jpg&quot;&gt;&lt;img style=&quot;margin: 10pt 10px 10px 10pt; float: right; cursor: pointer; width: 136px; height: 172px;&quot; src=&quot;http://www.myinvestmentanalysis.com/wp-content/uploads/2009/07/penny-stocks.jpg&quot; alt=&quot;&quot; border=&quot;0&quot; /&gt;&lt;/a&gt; Over time, we&#39;ll be looking to bring in ad content that is&lt;br /&gt;&lt;ul&gt;&lt;li&gt;targeted to the list archive the developer is reading&lt;/li&gt;&lt;li&gt;suitable for a library (quiet, text-only, unobtrusive)&lt;/li&gt;&lt;li&gt;designed for and targeted to a developer audience&lt;/li&gt;&lt;/ul&gt; We&#39;ll be enabling &lt;a href=&quot;http://google.com/adsense&quot;&gt;Google AdSense&lt;/a&gt; text-based ads soon, as they appear to be best-of-breed and simple to implement.  But we&#39;ll also be looking to sell advertising space ourselves.&lt;br /&gt;&lt;br /&gt;In the meantime, before we&#39;re fully set up, if you&#39;re reading this and you have something worth communicating to some of the millions of folks that end up at MarkMail, give us a holler.&lt;br /&gt;&lt;br /&gt;PS.  
If you sign up for a MarkMail account, you&#39;ll have access to a switch that will enable you to opt out of the AdSense ads.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/6570297384751966421/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/6570297384751966421' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6570297384751966421'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6570297384751966421'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2009/11/easy-change.html' title='Easy Change'/><author><name>Eric Bloch</name><uri>http://www.blogger.com/profile/02699687256217967826</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-8187833970376813898</id><published>2009-10-08T16:17:00.000-07:00</published><updated>2009-10-14T10:27:05.894-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="marklogic"/><category scheme="http://www.blogger.com/atom/ns#" term="markmail"/><title type='text'>The New Guy</title><content type='html'>Hi Folks,&lt;br /&gt;&lt;br /&gt;I&#39;m the &lt;span style=&quot;font-style: italic;&quot;&gt;new guy&lt;/span&gt; and this is my inaugural post.&lt;br /&gt;&lt;br /&gt;I first ran into MarkMail a few years ago, during my tenure at &lt;a href=&quot;http://www.clearwellsystems.com/&quot;&gt;Clearwell Systems&lt;/a&gt;.  Back then, both Clearwell and MarkMail were building &lt;span style=&quot;font-style: italic;&quot;&gt;&quot;search engines for email&quot;&lt;/span&gt;.  
If you didn&#39;t look deeply, you&#39;d have thought we were competitors.  But we weren&#39;t.  And we aren&#39;t, still.&lt;br /&gt;&lt;br /&gt;At Clearwell, we knew we were breaking ground.  Like many start-ups, we really didn&#39;t preconceive which applications our efforts would enable and which markets would find our solutions valuable.  We just believed, with all our hearts and minds, that there was something there and we could get it done.  I&#39;ve since learned the history isn&#39;t too different at MarkMail.   To make a long story short, Clearwell went one way, focusing on enterprise email environments, like Microsoft Exchange, PST Files, and Lotus Notes, eventually cracking the nut on common &lt;a href=&quot;http://www.clearwellsystems.com/e-discovery-blog/&quot;&gt;Electronic Discovery&lt;/a&gt; use cases.   We built a really sweet e-discovery product that continues to get rave reviews and save customers $$$. &lt;br /&gt;&lt;br /&gt;And MarkMail went another, supporting public, open-source communities and mailing lists (&lt;a href=&quot;http://www.gnu.org/software/mailman/index.html&quot;&gt;Mailman&lt;/a&gt;, &lt;a href=&quot;http://www.ezmlm.org/&quot;&gt;Ezmlm&lt;/a&gt;, &lt;a href=&quot;http://groups.google.com/&quot;&gt;Google Groups&lt;/a&gt;, and others).  Through these efforts, MarkMail has become a large-scale, highly respected, and high-traffic service for software developers.  Ironically enough, as a software developer, it was not uncommon for me at Clearwell to ultimately use MarkMail.  So I&#39;ve been a fan for quite some time.&lt;br /&gt;&lt;br /&gt;And now, thanks to the hard work of good folks at Mark Logic (and &lt;a href=&quot;http://www.servlets.com/jason/&quot;&gt;Jason Hunter&lt;/a&gt; in particular), I&#39;m here to help further the MarkMail mission.   
I bring to MarkMail deep experience in software development practice, high-performance computing, and user interface engineering (&lt;a href=&quot;http://openlaszlo.org/&quot;&gt;OpenLaszlo&lt;/a&gt;, &lt;a href=&quot;http://laszlomail.com/moved/&quot;&gt;LaszloMail&lt;/a&gt;).  For a good percentage of this time, I&#39;ve been in and around all sorts of email, communication, and collaboration tools, especially those used by developers.    And in my most recent gigs, I&#39;ve been focused on making sure my engineering efforts are part of a broader business success.  And I plan to do the same for MarkMail.&lt;br /&gt;&lt;br /&gt;So... to take a line from Bette Midler, &quot;Enough about me.&quot;  What am I going to do for MarkMail?  Well, a lot I hope.   In the immediate term, I&#39;ve got a few obvious directives, like keeping the site going and growing.  I’ll also be focusing on the changes needed to make the site responsible (aka pay) for its own operations, while keeping a focus on the general communities that it serves.   To that end, you can expect another post (or two) about upcoming changes.&lt;br /&gt;&lt;br /&gt;Sounds like fun, eh?  Well, you&#39;ll get to hear it all, assuming I continue to be able to make time for blog posts like this one.  Thanks for listening and please holler at me with advice and comments.   And, of course, if you have feedback or ideas related to how MarkMail might help you, please holler.&lt;br /&gt;&lt;br /&gt;Oh... and one more thing.  
If you&#39;re in the Bay Area and you want to see me in person, you can also find me occasionally playing out with the &lt;a href=&quot;http://tribalbluesband.com/&quot;&gt;Tribal Blues Band&lt;/a&gt; (I&#39;m the one in the blue shirt).</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/8187833970376813898/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/8187833970376813898' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8187833970376813898'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8187833970376813898'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2009/10/new-guy.html' title='The New Guy'/><author><name>Eric Bloch</name><uri>http://www.blogger.com/profile/02699687256217967826</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-5488725411140444386</id><published>2009-06-23T14:09:00.001-07:00</published><updated>2009-06-23T14:25:45.529-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="markmail mlug"/><title type='text'>MarkMail at the first MarkLogic User Group</title><content type='html'>Last week I spoke at the inaugural &lt;a href=&quot;http://novajug.wordpress.com/2009/06/16/june-16-markmail-and-mark-logic-server-by-jason-hunter/&quot;&gt;Mark Logic User Group meeting in Reston, VA&lt;/a&gt; (near where a lot of our government customers are based).  
The topic was MarkMail: where the idea came from, how we built it on the cheap, how Mark Logic began using it internally, and some lessons we learned as we scaled out the public high-traffic site.  It&#39;s a similar talk to the one I gave at the &lt;a href=&quot;http://www.twazzup.com/search?q=%23mluc09&amp;amp;l=all&quot;&gt;Mark Logic User Conference in San Francisco&lt;/a&gt; last month.&lt;br /&gt;&lt;br /&gt;For those interested, the &lt;a href=&quot;http://markmail.org/collateral/mlug2009.mov&quot;&gt;slides are available&lt;/a&gt; as a downloadable MOV file.  Click to advance.&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://markmail.org/collateral/mlug2009.mov&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer; width: 400px; height: 303px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgXCxzIBWIf2TMDfTh2tjlbBDNBPKs3sPMU4j4I_hpNRRf434S4mg2JagxH1KhhXwLSxLchpluwNt46mHiGv5cgze2N6XKUGrBuAkwAHWGfE4A7mJv_FKz4JEL7-Z3fOrni62Hk9UA4KxyN/s400/Picture+2.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5350637084196234178&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;The slides are fairly simple.  Most of the fun of the talk (well, at least for me) is in the stories I tell, usually relating to the quotes in italics at the bottom of slides.  
I suppose you&#39;ll just have to use your imagination.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/5488725411140444386/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/5488725411140444386' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5488725411140444386'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5488725411140444386'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2009/06/markmail-at-first-marklogic-user-group.html' title='MarkMail at the first MarkLogic User Group'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgXCxzIBWIf2TMDfTh2tjlbBDNBPKs3sPMU4j4I_hpNRRf434S4mg2JagxH1KhhXwLSxLchpluwNt46mHiGv5cgze2N6XKUGrBuAkwAHWGfE4A7mJv_FKz4JEL7-Z3fOrni62Hk9UA4KxyN/s72-c/Picture+2.png" height="72" width="72"/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-6255236527565714166</id><published>2009-05-05T17:53:00.000-07:00</published><updated>2009-05-05T18:17:07.984-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="markmail conference"/><title type='text'>MarkMail at the Mark Logic User Conference</title><content type='html'>The &lt;a 
href=&quot;http://www.marklogic.com/UserConference2009/&quot;&gt;Mark Logic User Conference&lt;/a&gt; is coming up next week.  If you&#39;re coming to the show, I encourage you to attend the talk on  MarkMail I&#39;ll be giving on Wednesday. I&#39;ll tell the story of MarkMail as it progressed from my first idea to a night project built with Ryan Grimm to the robust web site you see now at markmail.org (and even to the other web sites you don&#39;t see, because they&#39;re running behind people&#39;s firewalls).  It&#39;s in the conference&#39;s technical track so there&#39;ll be a lot of focus on the core tech.&lt;br /&gt;&lt;br /&gt;If you&#39;re not coming to the show, why the heck not?  :)  It&#39;s not too late to register.&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9WF5VVWR4e9uF-qPftWaiYXVcBfAuKw7Q8G0okDn7aM9Ed1lgwwNzbcmxz3HuMvQ00sQNoOuiVX_Zc-jvZ5Vf5BrEIQzcUjg2AvWfJtQa_qTDoITURkE_t1IwyxcEPwRGO3kgFN5OIm-7/s1600-h/Picture+2.png&quot;&gt;&lt;img style=&quot;margin: 0pt 0pt 10px 10px; float: right; cursor: pointer; width: 355px; height: 269px;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9WF5VVWR4e9uF-qPftWaiYXVcBfAuKw7Q8G0okDn7aM9Ed1lgwwNzbcmxz3HuMvQ00sQNoOuiVX_Zc-jvZ5Vf5BrEIQzcUjg2AvWfJtQa_qTDoITURkE_t1IwyxcEPwRGO3kgFN5OIm-7/s400/Picture+2.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5332513098814687474&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/6255236527565714166/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/6255236527565714166' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6255236527565714166'/><link rel='self' 
type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6255236527565714166'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2009/05/markmail-at-mark-logic-user-conference.html' title='MarkMail at the Mark Logic User Conference'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEi9WF5VVWR4e9uF-qPftWaiYXVcBfAuKw7Q8G0okDn7aM9Ed1lgwwNzbcmxz3HuMvQ00sQNoOuiVX_Zc-jvZ5Vf5BrEIQzcUjg2AvWfJtQa_qTDoITURkE_t1IwyxcEPwRGO3kgFN5OIm-7/s72-c/Picture+2.png" height="72" width="72"/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-3041492970893725680</id><published>2008-12-09T16:27:00.001-08:00</published><updated>2008-12-09T17:09:44.457-08:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="markmail anniversary"/><title type='text'>MarkMail at One Year: Looking Back</title><content type='html'>It&#39;s now been a little over a year since we launched MarkMail.  We&#39;ve sure come a long way!&lt;br /&gt;&lt;br /&gt;We&#39;re now seeing well over a &lt;span style=&quot;font-weight: bold;&quot;&gt;million unique visitors every month&lt;/span&gt; and more than &lt;span style=&quot;font-weight: bold;&quot;&gt;5 million page views&lt;/span&gt;.&lt;br /&gt;&lt;br /&gt;The Googlebot crawler (whose activity isn&#39;t included in the above statistics) has also been active.  
It now &lt;span style=&quot;font-weight: bold;&quot;&gt;crawls between 1.0 and 1.3 million pages every day&lt;/span&gt; to keep its index fresh.  That&#39;s up to about 15 page hits every second -- or 15 Hertz, enough to make a nice low background rumble noise.  It&#39;s really enjoyable to get so much Google attention because it wasn&#39;t that long ago that we were just trying to &lt;a href=&quot;http://markmail.blogspot.com/2008/01/stuffing-six-million-pages-down-googles.html&quot;&gt;get Google to index more than a million&lt;/a&gt; of our pages, never mind crawl that many in a day.&lt;br /&gt;&lt;br /&gt;Our content has grown too, from 4 million messages at launch, covering just the Apache Software Foundation archives, to 34 million messages today, spanning &lt;a href=&quot;http://markmail.org/docs/projects.xqy&quot;&gt;all sorts of communities&lt;/a&gt;.  Growing so big so fast has been possible only because of the community support we&#39;ve received.  A long list of community members have worked with us to accumulate and load their list archives.  We&#39;d like to thank all those folks, as well as the people who placed a MarkMail search box or other MarkMail link on their site or helped spread the word in blogs and emails and tweets.&lt;br /&gt;&lt;br /&gt;Looking forward, where do we go from here?  We have some big plans.  
I&#39;ll get into details with a later post.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/3041492970893725680/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/3041492970893725680' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/3041492970893725680'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/3041492970893725680'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/12/markmail-at-one-year-looking-back.html' title='MarkMail at One Year: Looking Back'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-7796233115420417472</id><published>2008-10-09T00:14:00.001-07:00</published><updated>2008-10-09T13:22:47.983-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="gadget"/><category scheme="http://www.blogger.com/atom/ns#" term="google"/><category scheme="http://www.blogger.com/atom/ns#" term="googlegroups"/><title type='text'>Google Code Adds Gadgets: MarkMail Helps</title><content type='html'>Google today &lt;a href=&quot;http://google-code-updates.blogspot.com/2008/10/gadgets-and-google-code.html&quot;&gt;announced&lt;/a&gt; new support for embeddable &quot;gadgets&quot; on Google Code project pages.  
Particularly exciting to us, they introduced MarkMail as the recommended gadget for viewing and searching Google Code project list archives.&lt;br /&gt;&lt;br /&gt;For those who haven&#39;t encountered one in the wild, a &lt;a href=&quot;http://www.google.com/webmasters/gadgets/&quot;&gt;Google Gadget&lt;/a&gt; is an embeddable web object that puts a bit of third-party dynamic content into the middle of a web page.  Gadgets are the things you place on your iGoogle home page or your Google Desktop, but you can also add them to your own web page with one line of JavaScript, or to anyone else&#39;s page if it supports the OpenSocial APIs.&lt;br /&gt;&lt;br /&gt;We&#39;ve coordinated with the Google Code team over the last several months to load about &lt;a href=&quot;http://markmail.org/browse/?q=list%3Agooglegroups&quot;&gt;500 GoogleGroups lists&lt;/a&gt; (3.8 million emails) and build a new &lt;a href=&quot;http://markmail.org/gadgets/builder/&quot;&gt;MarkMail Gadget&lt;/a&gt; (launching today!) to let Google Code developers search and analyze their lists using MarkMail.&lt;br /&gt;&lt;br /&gt;The new MarkMail gadget lets you view messages, threads, attachments, senders, and a traffic chart (wouldn&#39;t be MarkMail without it!) for any set of messages you want to follow. The messages you track with the gadget can come from a single list, a set of lists, or a single person, can contain a particular term or phrase, or can match any combination of these. In fact, anything you can use in a search on &lt;a href=&quot;http://markmail.org/&quot;&gt;MarkMail&lt;/a&gt; can be used as input to the gadget view.  
The new gadget offers two features not yet available on MarkMail.org: a daily traffic chart (MarkMail.org only does monthly traffic charts) and a view that coalesces threads.&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfCzkMphY8wCaST44kbggnGTUm69F3Z05qspAZX_rOPxMZHoLhO4CSAI4yxYKH4UG1W9H1NCHY1wgmhghvbqVgza0u38hlYiwNrjf94tma1rSNqC6-fzadApmkLe1FKGMmimyFNSI2NTep/s1600-h/78f73770519091c42c2f759f7116f674.png&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfCzkMphY8wCaST44kbggnGTUm69F3Z05qspAZX_rOPxMZHoLhO4CSAI4yxYKH4UG1W9H1NCHY1wgmhghvbqVgza0u38hlYiwNrjf94tma1rSNqC6-fzadApmkLe1FKGMmimyFNSI2NTep/s400/78f73770519091c42c2f759f7116f674.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5255052303972527186&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;So what does this mean for you?  If you&#39;re a project leader (either on Google Code or somewhere else) it&#39;s now easier than ever to embed a MarkMail traffic chart and recent message list inside any of your project pages.  If you&#39;re just a lurker, you can personalize your view on MarkMail traffic and embed that view into iGoogle or Google Desktop, or any other page.&lt;br /&gt;&lt;br /&gt;To help you set up the right links, we created a &lt;a href=&quot;http://markmail.org/gadgets/builder/&quot;&gt;Gadget Embedding Wizard&lt;/a&gt; that guides you through the process of embedding.  
You can also find &lt;a href=&quot;http://www.google.com/ig/directory?synd=open&amp;amp;url=http://markmail.org/gadgets/markmailmini.xqy&quot;&gt;our gadget in the Google Directory&lt;/a&gt; where they have additional embedding instructions.&lt;br /&gt;&lt;br /&gt;Tim O&#39;Reilly in describing Web 2.0 &lt;a href=&quot;http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html?page=2&quot;&gt;says&lt;/a&gt;, &lt;span style=&quot;font-style: italic;&quot;&gt;A platform beats an application every time.&lt;/span&gt;  We agree.  We think you should be able to access mailing list archives whenever and wherever you want, be it at MarkMail.org or on another page that&#39;s been MarkMail-enabled via a gadget.  So have fun, and &lt;a href=&quot;http://markmail.org/docs/feedback.xqy&quot;&gt;let us know&lt;/a&gt; how this works for you!</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/7796233115420417472/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/7796233115420417472' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/7796233115420417472'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/7796233115420417472'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/10/google-code-adds-gadgets-markmail-helps.html' title='Google Code Adds Gadgets: MarkMail Helps'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhfCzkMphY8wCaST44kbggnGTUm69F3Z05qspAZX_rOPxMZHoLhO4CSAI4yxYKH4UG1W9H1NCHY1wgmhghvbqVgza0u38hlYiwNrjf94tma1rSNqC6-fzadApmkLe1FKGMmimyFNSI2NTep/s72-c/78f73770519091c42c2f759f7116f674.png" height="72" width="72"/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-1676842439172175662</id><published>2008-10-02T18:07:00.000-07:00</published><updated>2008-10-02T23:33:13.140-07:00</updated><title type='text'>1.4% of Emails Mention Google</title><content type='html'>As Google celebrates its 10 year anniversary we thought it&#39;d be fun to use our archive of 30 million mailing list messages to see how Google&#39;s popularity has grown over time across the &lt;a href=&quot;http://markmail.org/&quot;&gt;list-o-sphere&lt;/a&gt;.  
Boy has it grown!&lt;br /&gt;&lt;br /&gt;In 2008 (so far) the word &quot;Google&quot; appears in 1.4% of emails in our archive, up from 1.15% last year and 0.75% five years ago.&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvl2v7u-UTBV5ZYmAFut2R2oVu1mbzc6d8pCwP0tSjdvLNhdP1lGQOmMWmCpw8GomJgla-GPCtFbNGdRxwub4wT3xtH9MyPO-U-er8nDOTbSFt8E3wgtYQdnvs7tQuN-iccqHqCj92szgc/s1600-h/Picture+1.png&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvl2v7u-UTBV5ZYmAFut2R2oVu1mbzc6d8pCwP0tSjdvLNhdP1lGQOmMWmCpw8GomJgla-GPCtFbNGdRxwub4wT3xtH9MyPO-U-er8nDOTbSFt8E3wgtYQdnvs7tQuN-iccqHqCj92szgc/s400/Picture+1.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5252807250823733090&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;While shockingly high, that 1.4% number is actually calculated with some conservative restrictions.  We&#39;re excluding all mentions that occur inside quote blocks (where someone replies to another who said the word).   It&#39;d be 2% if we didn&#39;t have that rule.  We&#39;re also excluding from our calculations all the Google Groups lists we follow, where Google is often the topic of discussion.  With those lists added in?  It&#39;s 13%.&lt;br /&gt;&lt;br /&gt;You can explore this yourself with our public interface.  You&#39;ll want to query for &quot;google&quot;, use the &quot;opt:noquote&quot; flag, and set &quot;-list:googlegroups&quot; to exclude those lists.  Then you can add date constraints either by typing &quot;date:2008&quot; in the search or dragging on the chart.  
Track the numbers as a result of your searches, do a little division, and you get your percentages.&lt;br /&gt;&lt;br /&gt;You&#39;ll see that so far in 2008 there were &lt;a href=&quot;http://markmail.org/search/?q=google+opt%3Anoquote+-list%3Agooglegroups+date:2008&quot;&gt;50,826&lt;/a&gt; emails saying &quot;google&quot; across &lt;a href=&quot;http://markmail.org/search/?q=-list%3Agooglegroups+date:2008&quot;&gt;3,607,973&lt;/a&gt; emails total.  That&#39;s 1.4%.  For 2003 it&#39;s &lt;a href=&quot;http://markmail.org/search/?q=google+opt%3Anoquote+-list%3Agooglegroups+date%3A2003&quot;&gt;21,165&lt;/a&gt; emails out of &lt;a href=&quot;http://markmail.org/search/?q=-list%3Agooglegroups+date%3A2003&quot;&gt;2,770,480&lt;/a&gt; total, or 0.75%.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/1676842439172175662/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/1676842439172175662' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/1676842439172175662'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/1676842439172175662'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/10/14-of-emails-mention-google.html' title='1.4% of Emails Mention Google'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhvl2v7u-UTBV5ZYmAFut2R2oVu1mbzc6d8pCwP0tSjdvLNhdP1lGQOmMWmCpw8GomJgla-GPCtFbNGdRxwub4wT3xtH9MyPO-U-er8nDOTbSFt8E3wgtYQdnvs7tQuN-iccqHqCj92szgc/s72-c/Picture+1.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-1877999967233507852</id><published>2008-09-22T15:46:00.000-07:00</published><updated>2008-09-22T16:04:54.296-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="markmail rubyonrails ruby"/><title type='text'>Ruby on Rails on MarkMail: 200,000 Emails</title><content type='html'>Interested in Ruby on Rails?  If so, you&#39;ll be happy to learn we&#39;ve loaded the &lt;a href=&quot;http://rubyonrails.markmail.org/&quot;&gt;full RoR mailing list archive&lt;/a&gt;.  It holds about 200,000 emails and includes both the original Mailman lists from 2004-2006 and the GoogleGroups lists from 2006 onward.&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://rubyonrails.markmail.org/&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguLG5ZvzI1fiDK3yQon4rDKGXbAb1K4r-g3ob4GlTPOoN_6bThncv-c37gW-RlPZH2KKVymYnRSK243aOCBtTuDK6PZn4b5ZomOel3e9Rsp9BQakgbNrTuGIb9YlA5QRrt_LnpKeYaQVrg/s400/Picture+6.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5248977896427910434&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;Fun facts:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;&lt;a href=&quot;http://workingwithrails.com/person/5478-frederick-cheung&quot;&gt;Frederick Cheung&lt;/a&gt; is the #1 most frequent poster&lt;/li&gt;&lt;li&gt;&lt;a href=&quot;http://workingwithrails.com/person/5246-david-heinemeier-hansson&quot;&gt;DHH&lt;/a&gt; is #22&lt;/li&gt;&lt;li&gt;The traffic never fully recovered after it transitioned from rubyonrails.org to 
GoogleGroups.  You can compare the &lt;a href=&quot;http://rubyonrails.markmail.org/search/?q=list%3Aorg.rubyonrails&quot;&gt;two&lt;/a&gt; &lt;a href=&quot;http://rubyonrails.markmail.org/search/?q=list%3Acom.googlegroups&quot;&gt;charts&lt;/a&gt; (keep an eye on the y-axis).&lt;/li&gt;&lt;li&gt;Maybe it&#39;s because &lt;a href=&quot;http://rubyonrails.markmail.org/search/?q=from%3Ahansson&quot;&gt;DHH didn&#39;t make the move to GG&lt;/a&gt;?&lt;br /&gt;&lt;/li&gt;&lt;/ul&gt;Don&#39;t forget, we have the regular &lt;a href=&quot;http://ruby.markmail.org/&quot;&gt;Ruby lists&lt;/a&gt; too.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/1877999967233507852/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/1877999967233507852' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/1877999967233507852'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/1877999967233507852'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/09/ruby-on-rails-on-markmail-200000-emails.html' title='Ruby on Rails on MarkMail: 200,000 Emails'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" 
url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEguLG5ZvzI1fiDK3yQon4rDKGXbAb1K4r-g3ob4GlTPOoN_6bThncv-c37gW-RlPZH2KKVymYnRSK243aOCBtTuDK6PZn4b5ZomOel3e9Rsp9BQakgbNrTuGIb9YlA5QRrt_LnpKeYaQVrg/s72-c/Picture+6.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-5390975301304334337</id><published>2008-09-11T16:17:00.000-07:00</published><updated>2008-09-11T17:26:54.838-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="freebsd"/><category scheme="http://www.blogger.com/atom/ns#" term="list loading"/><title type='text'>FreeBSD, the Unknown Giant</title><content type='html'>My &lt;a href=&quot;http://markmail.blogspot.com/2008/09/announcing-netbeans-and-openofficeorg.html&quot;&gt;last entry about NetBeans and OpenOffice.org&lt;/a&gt; and their million messages reminded me that I never announced here that we loaded the &lt;a href=&quot;http://freebsd.markmail.org/&quot;&gt;FreeBSD archives&lt;/a&gt;, an even larger and older community.  They have more than 2.5 million messages, stretching back to 1994.&lt;br /&gt;&lt;br /&gt;FreeBSD doesn&#39;t get as much attention as Linux but is a great operating system.  Here&#39;s a description from an &lt;a href=&quot;http://www.ibm.com/developerworks/opensource/library/os-freebsd/&quot;&gt;IBM developerWorks article&lt;/a&gt;:&lt;br /&gt;&lt;blockquote&gt;&quot;The FreeBSD operating system is the unknown giant among free operating systems. Starting out from the 386BSD project, it is an extremely fast UNIX®-like operating system mostly for the Intel® chip and its clones. In many ways, FreeBSD has always been the operating system that GNU/Linux®-based operating systems should have been. 
It runs on out-of-date Intel machines and 64-bit AMD chips, and it serves terabytes of files a day on some of the largest file servers on earth.&quot;&lt;/blockquote&gt;Here&#39;s the historic traffic chart (excluding automated bug and check-in messages):&lt;a href=&quot;http://en.wikipedia.org/wiki/FreeBSD&quot;&gt;&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://markmail.org/search/?q=list%3Afreebsd+-type%3Abugs+-type%3Acheckins&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcjHUeKXe2nkpHIfJTU4aVj1qRxFui6qTg0WP8llzJbgSJT-oDSswQyTSkcIb0_nCQRjBHL_xCAUcuDFWCSUhKUAxCu8R-IgKDTUeOF34MM1hmB5x3gmwvXWU7uAFIN3_5K6tPkw7Nqhjq/s400/Picture+4.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5244907298964697986&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;Looks like it&#39;s a giant in traffic as well.  The &lt;a href=&quot;http://markmail.org/list/org.freebsd.freebsd-questions&quot;&gt;freebsd-questions&lt;/a&gt; list alone gets a couple thousand emails a month, half a million in its history.  Got a FreeBSD problem?  
I bet the answer&#39;s in there.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/5390975301304334337/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/5390975301304334337' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5390975301304334337'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5390975301304334337'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/09/my-last-entry-about-netbeans-and.html' title='FreeBSD, the Unknown Giant'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhcjHUeKXe2nkpHIfJTU4aVj1qRxFui6qTg0WP8llzJbgSJT-oDSswQyTSkcIb0_nCQRjBHL_xCAUcuDFWCSUhKUAxCu8R-IgKDTUeOF34MM1hmB5x3gmwvXWU7uAFIN3_5K6tPkw7Nqhjq/s72-c/Picture+4.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-415030212363820313</id><published>2008-09-11T15:50:00.000-07:00</published><updated>2008-09-11T16:10:35.962-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="list loading"/><category scheme="http://www.blogger.com/atom/ns#" term="netbeans"/><category scheme="http://www.blogger.com/atom/ns#" term="openoffice"/><title type='text'>Announcing NetBeans and 
OpenOffice.org</title><content type='html'>Last week we finished adding the &lt;a href=&quot;http://netbeans.markmail.org/&quot;&gt;NetBeans&lt;/a&gt; and &lt;a href=&quot;http://openoffice.markmail.org/&quot;&gt;OpenOffice.org&lt;/a&gt; mailing lists to the MarkMail archive.  Each community hosts more than a million messages.  Here&#39;s the NetBeans activity graph (with automated bug and check-in messages removed):&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://markmail.org/search/?q=list%3Anetbeans+-type%3Acheckins+-type%3Abugs&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYdvH9q7WlGK_Q1j6PRe-anTqeYOB3ZdTfJzxhbFP5W9UMeO2OeGHCudoBCsXe0ILHQR3UJZo1Nm_xl9rpfY10nc7RdtQjk6ZIPYcSl1ef6dMzldNu_bjVz0sXAbDSi-P6ACN4sdvsuR-x/s400/Picture+1.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5244901597159608802&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;Looks like they&#39;ve seen a resurgence, with activity rising over the last 4 years.  They have more list activity than &lt;a href=&quot;http://markmail.org/search/?q=list%3Aeclipse+-type%3Acheckins+-type%3Abugs&quot;&gt;Eclipse&lt;/a&gt;, too.  
(Eclipse directs user questions to web forums that aren&#39;t included in our stats.)&lt;br /&gt;&lt;br /&gt;Here&#39;s the OpenOffice.org traffic (same automated message removals):&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://markmail.org/search/?q=list%3Aopenoffice+-type%3Acheckins+-type%3Abugs&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhWfBLmTuU5ZHbkib8GOgzqt6a3cLSacB4rBkSFQoqoxZLgjGvK2TgHC0qJ1vS874SkFv7JjUJC0TfgmV7kt3T7SC1wfoqCXsHhr7r7gyagV7N7ZPt1rybGap24fH6sWNcym97XjjgrdGoi/s400/Picture+2.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5244901705992585810&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;The folks at &lt;a href=&quot;http://collab.net/&quot;&gt;CollabNet&lt;/a&gt; worked with us to transfer the massive archives, and yesterday we issued a &lt;a href=&quot;http://marklogic.com/news-and-events/press-releases/markmail-partners-with-collabnet-to-bolster-mail-archives-of-popular-sun-microsystems-open-source-projects.html&quot;&gt;joint press release&lt;/a&gt; announcing the new list availability.  We also boasted passing 27.5 million emails in total.  That was yesterday.  Today we&#39;re passing &lt;a href=&quot;http://markmail.org/browse/&quot;&gt;28 million&lt;/a&gt;.  
Chugga, chugga!</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/415030212363820313/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/415030212363820313' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/415030212363820313'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/415030212363820313'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/09/announcing-netbeans-and-openofficeorg.html' title='Announcing NetBeans and OpenOffice.org'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgYdvH9q7WlGK_Q1j6PRe-anTqeYOB3ZdTfJzxhbFP5W9UMeO2OeGHCudoBCsXe0ILHQR3UJZo1Nm_xl9rpfY10nc7RdtQjk6ZIPYcSl1ef6dMzldNu_bjVz0sXAbDSi-P6ACN4sdvsuR-x/s72-c/Picture+1.png" height="72" width="72"/><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-8520081047321263027</id><published>2008-09-02T13:40:00.001-07:00</published><updated>2008-09-02T14:28:37.620-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="architecture"/><category scheme="http://www.blogger.com/atom/ns#" term="markmail search architecture"/><category scheme="http://www.blogger.com/atom/ns#" term="talks"/><title type='text'>A Tale of Two Search 
Engines, Revisited</title><content type='html'>As Jason &lt;a href=&quot;http://markmail.blogspot.com/2008/07/tale-of-two-search-engines.html&quot;&gt;announced previously&lt;/a&gt;, last Wednesday night I delivered&lt;br /&gt;&lt;a href=&quot;http://www.sdforum.org/index.cfm?fuseaction=Calendar.eventDetail&amp;amp;eventId=13137&amp;amp;nodeID=1&quot;&gt;A Tale of Two Search Engines&lt;/a&gt; — a presentation for the &lt;a href=&quot;http://www.sdforum.org/index.cfm?fuseaction=Page.ViewPage&amp;amp;PageID=642&quot;&gt;Software Architecture and Modeling SIG&lt;/a&gt; of &lt;a href=&quot;http://sdforum.org/&quot;&gt;SDForum&lt;/a&gt; — about building and running the &lt;a href=&quot;http://krugle.org/&quot;&gt;Krugle&lt;/a&gt; and &lt;a href=&quot;http://markmail.org/&quot;&gt;MarkMail&lt;/a&gt; vertical search engines for code and email, respectively.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://markmail.org/collateral/jdm/SDForum20080827-ATaleOfTwoSearchEngines.pdf&quot; title=&quot;A Tale of Two Search Engines&quot;&gt;Here&lt;/a&gt; are my tidied-up &lt;a href=&quot;http://markmail.org/collateral/jdm/SDForum20080827-ATaleOfTwoSearchEngines.pdf&quot; title=&quot;A Tale of Two Search Engines&quot;&gt;slides&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;Note&lt;/strong&gt; carefully that my presentation style is a very &lt;a href=&quot;http://www.presentationzen.com/&quot;&gt;visual, story-telling approach&lt;/a&gt; for live, interactive audiences -- i.e., the slide deck is &lt;strong&gt;quite large&lt;/strong&gt; and &lt;strong&gt;not&lt;/strong&gt; geared towards a reading-at-home audience.  Heck, I only broke down and used bullet points on 4 slides right at the end. :-)&lt;br /&gt;&lt;br /&gt;That said, I&#39;ll start blogging some of the stories, go deeper on various technical details, and/or get into any of the &quot;fun topics&quot; that people are interested in.  
Feel free to leave comments here about any that you particularly want to hear about.&lt;br /&gt;&lt;br /&gt;Special thanks to Ron Lichty for dragging me into giving this presentation and the wonderful SAMSIG audience for making it so much fun.&lt;br /&gt;&lt;br /&gt;Enjoy,&lt;br /&gt;&lt;br /&gt;John</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/8520081047321263027/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/8520081047321263027' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8520081047321263027'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8520081047321263027'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/09/tale-of-two-search-engines-revisited.html' title='A Tale of Two Search Engines, Revisited'/><author><name>Unknown</name><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='https://img1.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-6023683100594002355</id><published>2008-08-25T15:14:00.000-07:00</published><updated>2008-08-25T16:23:22.511-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="list loading"/><category scheme="http://www.blogger.com/atom/ns#" term="redhat"/><title type='text'>Loaded Red Hat: A Thousand Emails a Day</title><content type='html'>How did we celebrate the success of our &lt;a href=&quot;http://markmail.blogspot.com/2008/08/how-to-shutdown-all-your-machines.html&quot;&gt;memory swap ballet&lt;/a&gt; last week?  We loaded 1.7 million &lt;a href=&quot;http://redhat.markmail.org/&quot;&gt;Red Hat emails&lt;/a&gt;.  
It&#39;s geeky, but so are the lists!  We&#39;ve now got the complete set archived.&lt;br /&gt;&lt;br /&gt;The first Red Hat messages start back in May 1996.  At that time there were just a few hundred emails each month.  The chatter has grown a lot since, with recent numbers topping 30,000 messages a month.  That&#39;s 1,000 per day.&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://redhat.markmail.org/&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZBzQ7VW5mlMfFtOuwBtSX8kv7IuhywlAgmHtujfZuRMHWfJAPJncfKvM1TVDxwfS_uwi1ZiwrmlVzHR2iRwAIxqhfx0R_vq7wCKq5yJ7YPEhCkFs2yW71ksrjPvtxb6aQvrOpwS6T9xXE/s400/Picture+2.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5238587547886367602&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;It&#39;s interesting to see the #1 most common file attachment is of type &lt;span style=&quot;font-weight: bold;&quot;&gt;patch&lt;/span&gt;.  That makes sense as these are mostly developer lists.&lt;br /&gt;&lt;br /&gt;But can anyone explain why on a Linux list the #2 most common attachment is the Outlook-generated &lt;span style=&quot;font-weight: bold;&quot;&gt;winmail.dat&lt;/span&gt;!?  
Is that a good sign or bad sign?</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/6023683100594002355/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/6023683100594002355' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6023683100594002355'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6023683100594002355'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/08/loaded-red-hat-thousand-emails-day.html' title='Loaded Red Hat: A Thousand Emails a Day'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjZBzQ7VW5mlMfFtOuwBtSX8kv7IuhywlAgmHtujfZuRMHWfJAPJncfKvM1TVDxwfS_uwi1ZiwrmlVzHR2iRwAIxqhfx0R_vq7wCKq5yJ7YPEhCkFs2yW71ksrjPvtxb6aQvrOpwS6T9xXE/s72-c/Picture+2.png" height="72" width="72"/><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-6245576650760482060</id><published>2008-08-19T17:34:00.001-07:00</published><updated>2008-08-20T00:58:10.420-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="architecture"/><category scheme="http://www.blogger.com/atom/ns#" term="failover"/><category scheme="http://www.blogger.com/atom/ns#" term="markmail"/><category 
scheme="http://www.blogger.com/atom/ns#" term="operations"/><title type='text'>How to Shutdown All Your Machines Without Anyone Noticing</title><content type='html'>Last week we discovered we had to replace some bad memory chips in 2 of the 3 machines we use to run the MarkMail service.  This blog post tells the story of how we managed to replace these memory chips without (almost) any of our visitors noticing.&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-weight: bold;font-size:130%;&quot; &gt;Architecture&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;First, a word about our architecture.  The three machines I&#39;m talking about here all run MarkLogic Server.  We have some other machines in the overall MarkMail system that do things like queue and handle incoming mail, but they&#39;re not directly involved in the web site experience.  I&#39;m talking here about the three MarkLogic machines that work together in a cluster and that you interact with when you hit &lt;a href=&quot;http://markmail.org/&quot;&gt;http://markmail.org&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The MarkLogic machines have specialized roles.  One machine (I like to picture it up front) listens for client connections.  It&#39;s responsible for running XQuery script code, gathering answers from the other two machines, and formatting responses.  The other two machines manage the stored content, about half on each.  They support the front machine by actually executing queries and returning content.&lt;br /&gt;&lt;br /&gt;I&#39;ll refer to the front machine as E1, which stands for evaluator #1.  We don&#39;t have an E2 yet but we&#39;re planning for that when user load requires.  The back-end machines are D1 and D2, which stands for data manager #1 and #2.&lt;br /&gt;&lt;br /&gt;The bad memory was on E1 and D1.&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-weight: bold;font-size:130%;&quot; &gt;We&#39;ll Fix E1 First&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;We decided to fix E1 first because it&#39;s easiest.  
We gathered the MarkMail team and started at 5pm.  That&#39;s the time period with our lowest traffic.  It&#39;s a little counter-intuitive but since we&#39;re a global site we&#39;re as busy at 2am (Pacific) as we are at 2pm.  The time around 5pm Pacific still sees a lot of traffic, but relatively less.  Why?  We theorize it&#39;s because we get the most traffic during each visitor&#39;s local business hours, and the 5pm to 8pm Pacific time slot puts the local business hours out in the middle of the Pacific.&lt;br /&gt;&lt;br /&gt;The E1 server is important because it catches all requests.  Our plan was to place a new host, essentially E2, into the cluster and route all traffic through it instead of E1. There&#39;s no state held by the front-end machines, so this is an easy change.  We borrowed a machine, added it to the MarkLogic cluster, told it to join the &quot;group&quot; that would make it act like E1, and had our reverse proxy start routing traffic to it instead.  We did all this with the MarkLogic web-based administration.  It was far too easy, frankly.&lt;br /&gt;&lt;br /&gt;We immediately saw the E1 access logs go silent and we knew our patient was, in effect, on a heart-lung bypass machine.  We told our sysadmin in the colo to proceed.&lt;br /&gt;&lt;br /&gt;That&#39;s when he told us that on more careful inspection the memory problems were on D1 and D2. The E1 server was just fine. Hmm...&lt;br /&gt;&lt;br /&gt;We decided to call the maneuver good practice for later and put things back like we found them.&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-weight: bold;font-size:130%;&quot; &gt;OK, We&#39;ll Fix D1 First&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Performing maintenance on a machine like D1 requires more consideration because it&#39;s managing content.  If we were to just unplug it, the content on the site would appear to be reduced by half. 
It&#39;d be like winding the clock back to April, with our home page saying we just passed the 10 million message mark.&lt;br /&gt;&lt;br /&gt;All email messages go into MarkLogic data structures called &quot;forests&quot;.  (Get it? Forests are collections of trees, each document being an XML tree.) Our D1 server manages forests MarkMail 1 and MarkMail 2, the oldest two.  They&#39;re now effectively read-only because we&#39;re loading into higher-numbered forests on D2.&lt;br /&gt;&lt;br /&gt;Turns out that&#39;s a highly convenient fact.  It meant we could back up the content from D1 and put it on our spare box, now acting like a D3.  Then with a single transactional call to MarkLogic we could enable the two backup forests on D3 and disable the two original forests on D1. No one on the outside would see a difference. Zero downtime.&lt;br /&gt;&lt;br /&gt;It worked great!  It took a few hours to copy things because it&#39;s hundreds of gigs of messages, but like a chef on TV we knew what we were going to need for showtime and prepared things in advance.&lt;br /&gt;&lt;br /&gt;With the new memory chip placed in D1 we did a transactional switch-back, put the two original forests back into service and had the spare box unused again, ready to help with D2.&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-size:130%;&quot;&gt;&lt;span style=&quot;font-weight: bold;&quot;&gt;We Need an Alternate Approach for D2&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Had we planned in advance to work on D2 we probably would have followed the same &quot;use a backup forest&quot; approach we used to work on D1 because it allowed for zero downtime. It would have required pushing ingestion activities to another machine like D1 so the forests could settle down and be read-only, but that&#39;s done easily enough.  We didn&#39;t do this, however, because we were too impatient to wait for the data to copy between machines.  
Instead we decided to leave the data in place and do a SAN mount switch.&lt;br /&gt;&lt;br /&gt;We host all our forest content on a big SAN (a storage area network, basically a bunch of drives working together to act like a big fast disk).  All the data managing machines (D1, D2, and the spare acting as D3) have access to it.  Usually we partition things into individual mount points so they can&#39;t step on each other&#39;s toes and corrupt things.  You never want two databases to operate against the same data!  Here we decided to remove the isolation.  We&#39;d have D2 &quot;detach&quot; the MarkMail 3 and MarkMail 4 forests and have our spare machine (acting like D3) quickly &quot;attach&quot; them.  We would essentially transfer a few hundred gigs in seconds.&lt;br /&gt;&lt;br /&gt;This system change couldn&#39;t be made transactionally, so we had a decision to make: Is it better to turn off the MarkMail site for a short time or let the world see a MarkMail with only half its content? We decided to just turn off the site.  Our total downtime for the switch was 43 seconds going over, just over a minute coming back after the memory change.&lt;br /&gt;&lt;br /&gt;We think we could do it faster next time with some optimizations in the MarkLogic configuration  -- turning off things like index compatibility checks, which we know we don&#39;t need.  Maybe 20 seconds, or even 15.&lt;br /&gt;&lt;br /&gt;&lt;span style=&quot;font-weight: bold;font-size:130%;&quot; &gt;The Moral&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;Looking back, we&#39;re happy that we could cycle through disabling every machine in our MarkLogic cluster yet not have any substantial downtime.  Looking forward, we expect operations like this will get easier.  If and when we add a permanent E2 machine to the cluster it means we won&#39;t have to do anything special to take one of them out of commission.  Our load balancer will just automatically route around any unresponsive front-end servers.  
We were also happy to see that our configuration for SAN-based manual failover works. We proved that as long as another machine can access the SAN, we&#39;ll be able to bring the content back online should a back-end machine fail.&lt;br /&gt;&lt;br /&gt;Everyone on the MarkMail team works at Mark Logic, the company that makes the core technology that powers our site.  In fact, in years past some of us have been directly involved in building the technology.  But despite our familiarity, we were still delighted to take the production MarkLogic cluster out for a walk and get it to do tricks. It did the right thing time after time with every disconnect and reconnect and reconfiguration, and we couldn&#39;t help but feel a point of pride.  This is some fun software!  If you&#39;re a Mark Logic customer, we trust you know what we mean.&lt;br /&gt;&lt;br /&gt;A non-techie friend once asked why managing a high-uptime web site was hard.  I said, &quot;It&#39;s like we&#39;re driving from California to New York and we&#39;re not allowed to stop the car.  We have to fill the gas tank, change the tires, wash the windows, and tune the engine but never reduce our speed.  And really, because we&#39;re trying to add new features and load new content as we go, we need to leave California driving a Mini Cooper S and arrive in New York with a Mercedes ML320.&quot;&lt;br /&gt;&lt;br /&gt;So far so good!  
Here&#39;s to the long roads ahead...</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/6245576650760482060/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/6245576650760482060' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6245576650760482060'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6245576650760482060'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/08/how-to-shutdown-all-your-machines.html' title='How to Shutdown All Your Machines Without Anyone Noticing'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-7506921382548439286</id><published>2008-08-17T17:06:00.000-07:00</published><updated>2008-08-17T21:10:38.242-07:00</updated><title type='text'>Pillow Talk</title><content type='html'>We have a bit of a tradition at MarkMail where we give away T-shirts at the conferences we attend.  Printed on the front of the T-shirts we put the mailing list traffic chart generated by the community whose conference we&#39;re attending.  
Last year we did it at &lt;a href=&quot;http://markmail.blogspot.com/2007/11/apachecon-report.html&quot;&gt;ApacheCon&lt;/a&gt; in November, then again at &lt;a href=&quot;http://markmail.blogspot.com/2007/12/keynoting-xml-2007.html&quot;&gt;XML 2007&lt;/a&gt; in December.  We did it at JavaOne too.  They&#39;re fun because they&#39;re personalized, and to the recipients the long bars often bring back memories of fast growth, new product releases, and raging flamewars.&lt;br /&gt;&lt;br /&gt;One of the recipients of a T-shirt at XML 2007 was &lt;a href=&quot;http://www.mulberrytech.com/people/usdin/&quot;&gt;B Tommie Usdin&lt;/a&gt;.  Tommie doesn&#39;t like to wear T-shirts.  No, she likes to make T-pillows out of them instead.  Recently she emailed us a picture of her handiwork:&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://markmail.org/list/org.xml.lists.xml-dev&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4SuSm-b98vX3v3WkvFVyck6_7NJR9lJp5nppojrcOw6WHJh-s74Sp7sTHf2byM86iPMB1-EUa7py7Ujh0vgvU1dRaSAo6ps2KP1qLxlWx018yYV0aD52u_Yp9hajs9cLO391D4edraoWT/s400/xml-dev-pillow-med.jpg&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5235642558405496226&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;We just had to share.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/7506921382548439286/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/7506921382548439286' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/7506921382548439286'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/7071496506298372327/posts/default/7506921382548439286'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/08/pillow-talk.html' title='Pillow Talk'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEg4SuSm-b98vX3v3WkvFVyck6_7NJR9lJp5nppojrcOw6WHJh-s74Sp7sTHf2byM86iPMB1-EUa7py7Ujh0vgvU1dRaSAo6ps2KP1qLxlWx018yYV0aD52u_Yp9hajs9cLO391D4edraoWT/s72-c/xml-dev-pillow-med.jpg" height="72" width="72"/><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-6496646101439426208</id><published>2008-08-16T08:20:00.000-07:00</published><updated>2008-08-16T11:52:55.162-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="list loading"/><category scheme="http://www.blogger.com/atom/ns#" term="markmail"/><category scheme="http://www.blogger.com/atom/ns#" term="procmail"/><title type='text'>MarkMail has Procmail</title><content type='html'>This last week we loaded the &lt;a href=&quot;http://procmail.markmail.org/&quot;&gt;Procmail list archives&lt;/a&gt; into MarkMail and I wanted to pause and mention it here because it&#39;s the kind of thing that readers of our blog would probably appreciate.&lt;br /&gt;&lt;br /&gt;&lt;a href=&quot;http://www.procmail.org&quot;&gt;Procmail&lt;/a&gt;, for those who don&#39;t know, is a tool for filtering email.  It lets you define complex rules for email processing.  
You can file messages into folders, quarantine spam, block viruses, and more.  First released back in 1990, it&#39;s an oldie but goodie for people who want to do advanced things with email and aren&#39;t afraid to do some rule file hacking.&lt;br /&gt;&lt;br /&gt;Of course not all is great with Procmail.  It&#39;s arcane and fickle, with a rule syntax that confuses new users.  It hasn&#39;t had a new release in a long while, nor have its official docs been kept up to date.  Answers to common questions aren&#39;t on the web site.  As a result, every time I&#39;ve wanted to do something non-trivial with Procmail, I&#39;ve had to spend a fair amount of time Googling for answers and hunting for samples.&lt;br /&gt;&lt;br /&gt;I think that can change.  With this list load MarkMail lets you search 25,000 emails spanning the last 8 years where people have been doing Q&amp;amp;A for each other.  I expect those emails will give me some good A&#39;s for my Q&#39;s.  Hopefully they&#39;ll do the same for you.&lt;br /&gt;&lt;br /&gt;P.S.  
Of course there&#39;s almost as many &lt;a href=&quot;http://markmail.org/search/?q=procmail+-list%3Aprocmail&quot;&gt;emails about Procmail outside the Procmail list&lt;/a&gt; as there are emails inside it.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/6496646101439426208/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/6496646101439426208' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6496646101439426208'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/6496646101439426208'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/08/markmail-has-procmail.html' title='MarkMail has Procmail'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-3553588578497489664</id><published>2008-08-07T06:33:00.000-07:00</published><updated>2008-08-07T06:33:14.490-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="interview"/><category scheme="http://www.blogger.com/atom/ns#" term="kde"/><title type='text'>Interview with KDE</title><content type='html'>Earlier this week KDE News published an &lt;a href=&quot;http://dot.kde.org/1217966578/&quot;&gt;interview with us about our recent loading of the KDE community list archives&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;Our 
interviewer, Jos Poortvliet, asked some interesting questions on topics we haven&#39;t spoken much about before: how we select which lists to load, and what technical challenges we hit in gathering and loading 2.7 million emails.  If you&#39;re curious about how things work at MarkMail on the loading side, &lt;a href=&quot;http://dot.kde.org/1217966578/&quot;&gt;check it out&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://dot.kde.org/1217966578/&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq1FmJcXk2yi8tmNrl6Mgq7twuvSJlJq1OF7T294csEHvUiI162Ta68IQj_hMheIJPffJa5be50CgFwbvlk-qC-BhoezoyUf3KT13dYvlagk5H7FjIqOTHLCsnwvzuIy45cEvF1blxtQ08/s400/Picture+10.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5231557069249530018&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/3553588578497489664/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/3553588578497489664' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/3553588578497489664'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/3553588578497489664'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/08/interview-with-kde.html' title='Interview with KDE'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhq1FmJcXk2yi8tmNrl6Mgq7twuvSJlJq1OF7T294csEHvUiI162Ta68IQj_hMheIJPffJa5be50CgFwbvlk-qC-BhoezoyUf3KT13dYvlagk5H7FjIqOTHLCsnwvzuIy45cEvF1blxtQ08/s72-c/Picture+10.png" height="72" width="72"/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-5305346886789343444</id><published>2008-08-06T15:11:00.000-07:00</published><updated>2008-08-06T17:01:53.856-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="markmail"/><title type='text'>Blogger Names us a &quot;Blog of Note&quot;</title><content type='html'>Earlier today we received a &lt;a href=&quot;http://markmail.blogspot.com/2008/07/here-comes-sun.html?showComment=1218037500000#c1541569377709945498&quot;&gt;comment&lt;/a&gt; on one of our blog posts that said, &quot;Congratulations on being named a Blog of Note this week!&quot;.  It seemed like a perfect comment-spam ploy: Say something nice so the blog owner won&#39;t delete your comment.  Yet something about the post smelled &lt;span style=&quot;font-style: italic;&quot;&gt;non-fishy&lt;/span&gt;.  The comment didn&#39;t have any sketchy links like most spams do.  I thought maybe it was real.  
To my surprise and happiness, it was!&lt;br /&gt;&lt;br /&gt;We were listed by the Blogger Team as a &quot;&lt;a href=&quot;http://blogsofnote.blogspot.com/&quot;&gt;Blog of Note&lt;/a&gt;&quot; for &lt;a href=&quot;http://blogsofnote.blogspot.com/2008/08/making-of-markmail.html&quot;&gt;August 4th&lt;/a&gt;:&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://blogsofnote.blogspot.com/2008/08/making-of-markmail.html&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj8KQljGkho-s0Mi_p3gnx37iCOLFqS6_FWD09Bn-uLCp6Lq2KzN5xHRQ5Z3ApiJe_A4v5Zr9zSAHTfCiVLQANHDaz02Z3kHOaNMTRSbiQNL8G6gkBDLX5IZcI2mL8iO1c5lplZlW2hBwVV/s400/Picture+9.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5231536752763279426&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;br /&gt;Thanks, Blogger!</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/5305346886789343444/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/5305346886789343444' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5305346886789343444'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5305346886789343444'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/08/earlier-today-we-received-comment-on.html' title='Blogger Names us a &quot;Blog of Note&quot;'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' 
src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj8KQljGkho-s0Mi_p3gnx37iCOLFqS6_FWD09Bn-uLCp6Lq2KzN5xHRQ5Z3ApiJe_A4v5Zr9zSAHTfCiVLQANHDaz02Z3kHOaNMTRSbiQNL8G6gkBDLX5IZcI2mL8iO1c5lplZlW2hBwVV/s72-c/Picture+9.png" height="72" width="72"/><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-5872379011698298307</id><published>2008-07-22T11:38:00.000-07:00</published><updated>2008-07-23T22:25:29.391-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="java.net"/><category scheme="http://www.blogger.com/atom/ns#" term="list loading"/><title type='text'>Here Comes the Sun</title><content type='html'>Last week, in collaboration with &lt;a href=&quot;http://sun.com/&quot;&gt;Sun&lt;/a&gt; and &lt;a href=&quot;http://collab.net/&quot;&gt;CollabNet&lt;/a&gt;, we loaded the &lt;a href=&quot;http://markmail.org/search/?q=net.java&quot;&gt;mail archive histories for java.net&lt;/a&gt;, Sun&#39;s open source developer playground for Java projects and home to projects like &lt;a href=&quot;http://glassfish.markmail.org/&quot;&gt;GlassFish&lt;/a&gt;, &lt;a href=&quot;http://ajax.markmail.org/&quot;&gt;jMaki&lt;/a&gt;, &lt;a href=&quot;http://appfuse.markmail.org/&quot;&gt;AppFuse&lt;/a&gt;, &lt;a href=&quot;http://grizzly.markmail.org/&quot;&gt;Grizzly&lt;/a&gt;, &lt;a href=&quot;http://hudson.markmail.org/&quot;&gt;Hudson&lt;/a&gt; and &lt;a href=&quot;http://webwork.markmail.org/&quot;&gt;WebWork&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The load includes more than 1,000 mailing lists and roughly 1,000,000 messages.  
Their growth curve is fantastic (the last month is partial):&lt;br /&gt;&lt;br /&gt;&lt;div style=&quot;text-align: center;&quot;&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://markmail.org/list/net.java.*&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgg0KjOs40jP8sHMljyEmtTFYYiaZMB66wzF4QUMM4PNR4gWtO-Vs_hssbKx-vvTYD8g_CNbWkA7h8HaJDWLssw_9IYRFbalKZbL4H7FTXn6mwXAxwGHlB_BxdIFlIG1iSwXCVSSv1twXcn/s400/java-net.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5225919651617584882&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;/div&gt;&lt;br /&gt;Just about half the java.net mails are auto-generated as a result of checkins or bugs.  If we remove those, &lt;a href=&quot;http://markmail.org/search/?q=list%3Anet.java+-type%3Acheckins+-type%3Abugs&quot;&gt;the curve is still beautiful&lt;/a&gt;.  Looks like people are writing more than 15,000 human-to-human emails every month on java.net.&lt;br /&gt;&lt;br /&gt;With such a large community, it&#39;s fun to look at community-wide analytics.  Here&#39;s a little-known feature: go to our browse page, add an arbitrary query to the URL, and it&#39;ll show you list-by-list numbers for all messages matching that query.  For example, you can view &lt;a href=&quot;http://markmail.org/browse/?q=list:net.java&quot;&gt;the total number of messages per list throughout time&lt;/a&gt;, or the &lt;a href=&quot;http://markmail.org/browse/?q=list:net.java+date:lastweek&quot;&gt;counts for just last week&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;You can &lt;a href=&quot;http://markmail.org/browse/?q=list:net.java+from:sun.com&quot;&gt;browse the lists where people from &quot;sun.com&quot; have written the most&lt;/a&gt;. 
If you want to see the top lists, do it as a &lt;a href=&quot;http://markmail.org/search/?q=list:net.java+from:sun.com&quot;&gt;regular search&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;There&#39;s been a lot of coverage about this, from &lt;a href=&quot;http://blogs.sun.com/marla/entry/markmail_for_java_net&quot;&gt;Marla Parker&lt;/a&gt;, &lt;a href=&quot;http://blogs.sun.com/theaquarium/entry/using_markmail_for_glassfish_related&quot;&gt;Eduardo Pelegri-Llopart&lt;/a&gt;, &lt;a href=&quot;http://clarkrichey.blogspot.com/2008/07/markmail-meets-java.html&quot;&gt; Clark Richey&lt;/a&gt;, and &lt;a href=&quot;http://javahispano.org/contenidos/es/markmail_para_java_net_buscador_de_listas_de_correo/&quot;&gt;javaHispano&lt;/a&gt;.  Plus we issued our own &lt;a href=&quot;http://marklogic.com/news-and-events/press-releases/mark-logic-enables-devlopers-to-tap-into-java-brain-trust.html&quot;&gt;press release&lt;/a&gt;!&lt;br /&gt;&lt;br /&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;http://marklogic.com/news-and-events/press-releases/mark-logic-enables-devlopers-to-tap-into-java-brain-trust.html&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEieazXI_OjcNFjNDy3hPzgjbW2ijEqRqDkaweyfhzM6MtVozOCmjFrq99Y4bc_YdSXN5MOw_Yq9VoyZJaXUEm-Ap1x0y0mPqyMKuQlbyugNx6mnhL84lnpCqTuUc0EZzv7QLa6lqFqnAibK/s400/Picture+9.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5225946293447135938&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/5872379011698298307/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/5872379011698298307' title='4 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5872379011698298307'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5872379011698298307'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/07/here-comes-sun.html' title='Here Comes the Sun'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgg0KjOs40jP8sHMljyEmtTFYYiaZMB66wzF4QUMM4PNR4gWtO-Vs_hssbKx-vvTYD8g_CNbWkA7h8HaJDWLssw_9IYRFbalKZbL4H7FTXn6mwXAxwGHlB_BxdIFlIG1iSwXCVSSv1twXcn/s72-c/java-net.png" height="72" width="72"/><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-8164103657027399911</id><published>2008-07-18T00:44:00.000-07:00</published><updated>2008-07-18T01:03:36.703-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="markmail search architecture"/><title type='text'>A Tale of Two Search Engines</title><content type='html'>If you&#39;re local to the Bay Area, you may be interested in attending an upcoming talk from the SDForum &lt;span style=&quot;font-style: italic;&quot;&gt;Software Architecture &amp;amp; Modeling&lt;/span&gt; SIG on August 27th.  
It&#39;s titled &lt;a href=&quot;http://sdforum.com/index.cfm?fuseaction=Calendar.eventDetail&amp;amp;eventID=13137&quot;&gt;&lt;span style=&quot;font-style: italic;&quot;&gt;A Tale of Two Search Engines&lt;/span&gt;&lt;/a&gt; and will be given by our own John Mitchell, one of the developers on &lt;a href=&quot;http://markmail.org/&quot;&gt;MarkMail&lt;/a&gt;.  Here&#39;s his abstract:&lt;br /&gt;&lt;blockquote&gt;Betwixt the rigid structure of relational databases and the unbridled chaos of random content lies the world of search engines. Search engines shine in the middle ground where the messy complexity of reality makes everything harder than we imagine.&lt;br /&gt;&lt;br /&gt;While the soap operas of general-purpose search engines dominate the news, specialized search engines are coming to dominate their vertical niches. Special-purpose search engines can aggressively leverage domain-specific intelligence to return highly relevant results.&lt;br /&gt;&lt;br /&gt;This talk will present the architecture, implementation, and stories behind the creation of two specialized search engines for code and email: Krugle and MarkMail.&lt;br /&gt;&lt;/blockquote&gt;If you&#39;re interested in MarkMail, I think you&#39;ll enjoy it.  
(And no, John doesn&#39;t really use words like &quot;betwixt&quot; in daily conversations.)</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/8164103657027399911/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/8164103657027399911' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8164103657027399911'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8164103657027399911'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/07/tale-of-two-search-engines.html' title='A Tale of Two Search Engines'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><thr:total>3</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-8408636030187458122</id><published>2008-07-02T18:15:00.000-07:00</published><updated>2008-07-02T18:24:11.909-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" term="perl screencast"/><title type='text'>The Perl Review: Now with Video</title><content type='html'>The folks at &lt;a href=&quot;http://theperlreview.com/&quot;&gt;The Perl Review&lt;/a&gt; recently enhanced the &lt;a 
href=&quot;http://www.theperlreview.com/Interviews/jason-hunter-markmail-200805.html&quot;&gt;interview&lt;/a&gt; I mentioned here Monday with a &lt;a href=&quot;http://vimeo.com/1226043&quot;&gt;new screencast video showing MarkMail in action&lt;/a&gt;.  The intro is terrific.  There&#39;s a guy hitting his Mac with a hammer!&lt;br /&gt;&lt;br /&gt;It&#39;s a strange (happy) feeling to have others produce advertising videos for you.  Thanks, &lt;a href=&quot;http://use.perl.org/articles/08/07/02/193221.shtml&quot;&gt;brian d foy&lt;/a&gt;!</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/8408636030187458122/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/8408636030187458122' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8408636030187458122'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8408636030187458122'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/07/perl-review-now-with-video.html' title='The Perl Review: Now with Video'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-5144199799441006407</id><published>2008-06-30T12:20:00.000-07:00</published><updated>2008-08-06T17:02:44.142-07:00</updated><category scheme="http://www.blogger.com/atom/ns#" 
term="interview"/><category scheme="http://www.blogger.com/atom/ns#" term="perl"/><title type='text'>Interview with The Perl Review</title><content type='html'>&lt;a href=&quot;http://theperlreview.com/&quot;&gt;The Perl Review&lt;/a&gt;, a quarterly newsletter about all things Perl, recently published an &lt;a href=&quot;http://www.theperlreview.com/Interviews/jason-hunter-markmail-200805.html&quot;&gt;interview with us&lt;/a&gt; where we discuss several topics relating to &lt;a href=&quot;http://markmail.org/&quot;&gt;MarkMail&lt;/a&gt;:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;How we load mail&lt;/li&gt;&lt;li&gt;Our choice between Java and Perl&lt;br /&gt;&lt;/li&gt;&lt;li&gt;Our model of permalinking&lt;/li&gt;&lt;li&gt;Comparative community sizes&lt;br /&gt;&lt;/li&gt;&lt;li&gt;What&#39;s in store for the future&lt;/li&gt;&lt;li&gt;How this is different from Google&lt;/li&gt;&lt;/ul&gt;It&#39;s a more technical interview than some of the ones we&#39;ve done previously with &lt;a href=&quot;http://feathercast.org/?p=60&quot;&gt;Apache&lt;/a&gt;, &lt;a href=&quot;http://www.thecontentwrangler.com/people/forget_listserv_digests_youve_got_markmail_intervew_with_jason_hunter_mark/&quot;&gt;The Content Wrangler&lt;/a&gt;, and &lt;a href=&quot;http://www.infoq.com/news/2008/01/markmail&quot;&gt;InfoQ&lt;/a&gt;.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/5144199799441006407/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/5144199799441006407' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5144199799441006407'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/5144199799441006407'/><link rel='alternate' type='text/html' 
href='http://markmail.blogspot.com/2008/06/interview-with-perl-review.html' title='Interview with The Perl Review'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-7071496506298372327.post-8966705990846081142</id><published>2008-06-10T13:59:00.000-07:00</published><updated>2008-06-10T14:59:52.058-07:00</updated><title type='text'>Diacritics, or should I say dịẫçritícs</title><content type='html'>We changed our indexing this week regarding how we handle diacritics -- those accent marks you see on vowels and some consonants in many languages.&lt;br /&gt;&lt;br /&gt;Previously we resolved all queries in a &lt;span style=&quot;font-style: italic;&quot;&gt;diacritic insensitive&lt;/span&gt; manner.  That meant that a search for &quot;francois&quot; would match both &quot;francois&quot; and &quot;&lt;strong style=&quot;font-weight: normal;&quot;&gt;françois&lt;/strong&gt;&quot;, and a search for &quot;&lt;strong style=&quot;font-weight: normal;&quot;&gt;françois&lt;/strong&gt;&quot; would do the same.  Basically we specified in our MarkLogic Server configuration that the c versus &lt;strong style=&quot;font-weight: normal;&quot;&gt;ç difference be ignored.&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;&lt;strong style=&quot;font-weight: normal;&quot;&gt;Now we&#39;ve changed the configuration so the diacritic sensitivity choice depends on the search term.  
A term containing diacritics will trigger a diacritic &lt;span style=&quot;font-style: italic;&quot;&gt;sensitive&lt;/span&gt; match, while a term without diacritics will remain diacritic &lt;span style=&quot;font-style: italic;&quot;&gt;insensitive&lt;/span&gt;.  That means a search for &quot;francois&quot; will match with and without diacritics (the same as before), but a search for &lt;/strong&gt;&quot;&lt;strong style=&quot;font-weight: normal;&quot;&gt;françois&lt;/strong&gt;&quot; will respect the &lt;strong style=&quot;font-weight: normal;&quot;&gt;ç character constraint and won&#39;t match &quot;francois&quot; anymore.&lt;br /&gt;&lt;blockquote&gt;To summarize: If you care enough to type a diacritic, we&#39;ll care enough to match it for you!&lt;/blockquote&gt;&lt;/strong&gt;This is a particularly helpful change as we&#39;ve expanded from English-only content into lists written in &lt;a href=&quot;http://markmail.org/search/?q=list%3Aja&quot;&gt;Japanese&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Avi&quot;&gt;Vietnamese&lt;/a&gt;,   &lt;a href=&quot;http://markmail.org/search/?q=list%3Aes&quot;&gt;Spanish&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Ade&quot;&gt;German&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Ait&quot;&gt;Italian&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Anl&quot;&gt;Dutch&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Apt&quot;&gt;Portuguese&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Ask&quot;&gt;Slovak&lt;/a&gt;, &lt;a href=&quot;http://markmail.org/search/?q=list%3Apl&quot;&gt;Polish&lt;/a&gt;, and &lt;a href=&quot;http://markmail.org/search/?q=list%3Afa&quot;&gt;Farsi&lt;/a&gt;.  We even have one mail in &lt;a href=&quot;http://markmail.org/search/?q=list%3Aorg.kde.kde-i18n-fry&quot;&gt;Frisian&lt;/a&gt;.  
Who knew!&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;&lt;strong style=&quot;font-weight: normal;&quot;&gt;&lt;a onblur=&quot;try {parent.deselectBloggerImageGracefully();} catch(e) {}&quot; href=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjil1xMfS2rXy0gOfVGmVF3hT5trL5_yGj3HEFZGqAyv4RLAyfYU38ws09wPJ57k1nbIPeh7NN_7jtY-U7dD2854WWQ8fiAcA8wtw4UG0HjeBPQ5XSdq0KrB4_P_sVDrOWqLOhnaXhyphenhyphensryK/s1600-h/Picture+1.png&quot;&gt;&lt;img style=&quot;margin: 0px auto 10px; display: block; text-align: center; cursor: pointer;&quot; src=&quot;https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjil1xMfS2rXy0gOfVGmVF3hT5trL5_yGj3HEFZGqAyv4RLAyfYU38ws09wPJ57k1nbIPeh7NN_7jtY-U7dD2854WWQ8fiAcA8wtw4UG0HjeBPQ5XSdq0KrB4_P_sVDrOWqLOhnaXhyphenhyphensryK/s400/Picture+1.png&quot; alt=&quot;&quot; id=&quot;BLOGGER_PHOTO_ID_5210373028584357266&quot; border=&quot;0&quot; /&gt;&lt;/a&gt;&lt;/strong&gt;&lt;/strong&gt;&lt;br /&gt;On a per-message basis, we get more traffic from these lists than from our English lists. Perhaps they&#39;re underserved by other email archive systems? Maybe the other systems have issues hosting messages with non-ASCII characters. We&#39;ve definitely had trouble finding &quot;clean&quot; historical archive records for non-English lists, ones where the diacritics were reliably preserved. 
Luckily for us, being built on XML, we have native support for all Unicode characters.&lt;br /&gt;&lt;br /&gt;We hope you find the new indexing logic helpful.</content><link rel='replies' type='application/atom+xml' href='http://markmail.blogspot.com/feeds/8966705990846081142/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment/fullpage/post/7071496506298372327/8966705990846081142' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8966705990846081142'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/7071496506298372327/posts/default/8966705990846081142'/><link rel='alternate' type='text/html' href='http://markmail.blogspot.com/2008/06/diacritics-or-should-i-say-dritcs.html' title='Diacritics, or should I say dịẫçritícs'/><author><name>Jason Hunter</name><uri>http://www.blogger.com/profile/00854855078730758915</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='21' height='32' src='//blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhrteipepdg4SaBB39vk-Fwxd7U1F0GD1nbKXurmB12fhGKm3YOrZKnH8SSqqKoLylR3t8ejxnXIc3C3e0t4uRVkwzqmkJVw20hNNts4h7gbSAQA4Zh3Ti0zw9QQUz9epk/s220/CRW_1397_small.jpg'/></author><media:thumbnail xmlns:media="http://search.yahoo.com/mrss/" url="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjil1xMfS2rXy0gOfVGmVF3hT5trL5_yGj3HEFZGqAyv4RLAyfYU38ws09wPJ57k1nbIPeh7NN_7jtY-U7dD2854WWQ8fiAcA8wtw4UG0HjeBPQ5XSdq0KrB4_P_sVDrOWqLOhnaXhyphenhyphensryK/s72-c/Picture+1.png" height="72" width="72"/><thr:total>0</thr:total></entry></feed>