<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:media="http://search.yahoo.com/mrss/" xmlns:yt="http://gdata.youtube.com/schemas/2007" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0">
   <channel>
      <title>pyo-friends</title>
      <description>Pythonic blogs I read</description>
      <link>http://pipes.yahoo.com/pipes/pipe.info?_id=d8bd08d1cdd623259a1d4c879facccb1</link>
      <atom:link rel="next" href="http://pipes.yahoo.com/pipes/pipe.run?_id=d8bd08d1cdd623259a1d4c879facccb1&amp;_render=rss&amp;page=2" />
      <pubDate>Sun, 27 May 2012 10:29:39 +0000</pubDate>
      <generator>http://pipes.yahoo.com/pipes/</generator>
      <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/pyobject/friends" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="pyobject/friends" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><item>
         <title>Python.org Redesign Request For Proposals</title>
         <link>http://feedproxy.google.com/~r/Jessenollercom/~3/CJ1qZlQS29E/</link>
         <description>Well, it’s official — a labor of love from myself and many others — with special thanks to Andrew Kuchling for getting it over the finish line. The Python Software Foundation has officially announced a call for proposals for the redesign of the Python.org site and properties. You can see the RFP here: http://pythonorg-redesign.readthedocs.org/en/latest/ It’s taken [...]</description>
         <guid isPermaLink="false">http://jessenoller.com/?p=1141</guid>
         <pubDate>Wed, 23 May 2012 12:42:35 +0000</pubDate>
         <content:encoded><![CDATA[<p>Well, it’s official — a labor of love from myself and many others — with special thanks to Andrew Kuchling for getting it over the finish line. The Python Software Foundation has officially announced a call for proposals for the redesign of the Python.org site and properties.</p>
<p>You can see the RFP here: <a rel="nofollow" target="_blank" href="http://pythonorg-redesign.readthedocs.org/en/latest/">http://pythonorg-redesign.readthedocs.org/en/latest/</a></p>
<p>It’s taken me several years of false starts, other attempts (including skunkworks attempts), political and social discussions, and the hard work of many to make this come to fruition. Now, we can only sit back and hope that we see some amazing proposals from the community and others.</p>
<p>I sincerely hope this will be successful, and that we will see a modern, well-designed Python.org that showcases not only the language but also the vibrant, open, welcoming and active community we are all part of.</p>
]]></content:encoded>
      </item>
      <item>
         <title>Children's Fairy Tales</title>
         <link>http://feedproxy.google.com/~r/app-engine/~3/qK-NxCJtAsc/alltales</link>
         <description>&lt;p&gt;Every new post could open with the same stock paragraph explaining that what is happening now is a consequence of our daughter's birth. The sheer scale of that event changed how we see the world and, as I have said before, our sense of time. On top of that, our motivation took a definite shape, and the plan that was born is gradually producing results, which is what the recent posts are about.&lt;/p&gt;

&lt;h2&gt;Starting conditions&lt;/h2&gt;

&lt;p&gt;So, Alisa's birth made us think about what exactly we want to give her. What set of skills would make her successful and happy in this world and give her a foundation for dreams and for making them come true? Without unpacking the word "success":&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Clearly it is not money. Money is by and large a big fiction. There are still masses of people who believe it has some value, but the most it is good for is steering everyone who believes in it, like a carrot. Money is a special resource that sits close to trust. The right way of thinking makes it possible to manage this resource and, if needed, to multiply it.&lt;/li&gt;
&lt;li&gt;Clearly it is not any specific body of knowledge. It is not the suitcase of pseudoscientific rot that gets poured into children's heads at school. This applies especially to the pseudoscientific branch known as the school history course.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;We would probably like to give Alisa a set of abilities that will improve her chances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multilingualism. For example, even just knowing English and Chinese can give an enterprising person the means to provide for themselves and their interests.&lt;/li&gt;
&lt;li&gt;The ability to learn. I think I understood this only recently, so badly did school and, in part, my parents mess with my head. In their defense, though, they were teaching me to succeed in the system they themselves lived by. I adored programming, but I was forced to study some crap.&lt;/li&gt;
&lt;li&gt;The ability to connect with people. Something that is simply not available in schools, kindergartens, or universities.&lt;/li&gt;
&lt;li&gt;The ability to work. The school system destroys this at the root.&lt;/li&gt;
&lt;li&gt;Sexual openness. Some will read this as promiscuity, but that is their problem. People who are able to manage their sexuality become leaders. Inhibition and hang-ups make people unhappy.&lt;/li&gt;
&lt;li&gt;General development. Raising the core skills by even a few notional points makes it possible to move into a different class of people.&lt;/li&gt;
&lt;li&gt;Courage. Or rather an 'absence of fears', the fears that inevitably get pumped into children's heads to make them easier to manipulate.&lt;/li&gt;
&lt;li&gt;The ability to be happy.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Without the last item all the previous ones would lose their meaning entirely. Our daughter is not our property; she has her own plans and wishes, and we only widen her opportunities: whatever gives her more ways to achieve what matters most, to do what she enjoys, to not depend on outside forces, and to become who she wants to be.&lt;/p&gt;

&lt;p&gt;This is far from a complete list. And it is not a result; it is just the first product of our thinking.&lt;/p&gt;

&lt;p&gt;We understand that all people are different and that children and adults are not equal. That we too wander in a fog of deception and do not know which of our convictions are true and which are false. What we were taught in our childhood, and what society imposes on us, can be a dangerous evil for particular people. By the same token, fighting social systems and values can eat up too many resources. We do not want to spend our lives on "making the world a better place" by fighting the very people it is being done for. There is no gratitude and there never will be. If we are headed the same way, welcome! No? Then you are "not my customer".&lt;/p&gt;

&lt;h2&gt;Surveying the field&lt;/h2&gt;

&lt;p&gt;Having realized that something had to be done, I concluded that I needed to study the question. I will not go into much detail: yes, I have already talked to many people and drawn some conclusions. I have read books and keep reading them. I would like to talk with other people, but I understand that for now it is not the kind of conversation where things can be discussed at a concrete, businesslike level. In general, children's education is a long-term matter.&lt;/p&gt;

&lt;p&gt;Perhaps there are people who look at raising their children the same way, who see that the world is changing a great deal and that even more change is coming soon. Perhaps they would like to join forces, or at least entrust their children's education to someone. If I met them, something might come of it, possibly even something commercial.&lt;/p&gt;

&lt;p&gt;But in order to raise money for any action, one first has to understand what exactly those actions could be. And a vision alone is not enough.&lt;/p&gt;

&lt;p&gt;We have certain plans for Alisa. I may write about them separately if there is demand.&lt;/p&gt;

&lt;h2&gt;Children's fairy tales&lt;/h2&gt;

&lt;p&gt;In the end, studying the question made it clear that I had to start doing something. Everything above can be considered an introduction, and several different introductions could have been written. Maybe I picked the most pompous one this time. It does not matter.&lt;/p&gt;

&lt;p&gt;While Alisa is growing up, I decided not to waste time and to begin some preparatory work. From my memories and from thinking it over, I somehow concluded that for a dad who is always busy somewhere at work, telling stories is one of the best ways to start a dialogue with a child.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Our dad often told us stories. My sister and I were ready to take the long way just to hear about the latest adventures of Pirx the Pilot, cowboys, and other heroes from films and books. Dad also made up many tales of his own: about numbers, about his work, his studies, his teaching, and plenty of things we can no longer remember. By the way, in retirement he began writing some of these stories down; I do not know the details, but I think someone has even agreed to publish them.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;So I understood that I had to start with fairy tales. And I ran into an absurd situation. It turned out that the tales that are supposed to teach, educate, and pass on knowledge are, for me as an adult, not at all what I had counted on. There are some classic plots that one simply has to know, because a huge number of cultural references point to them ("Little Red Riding Hood", for example). But most plots prepare children for things they will never even know about. Where, tell me, will children ever see a deep, dark forest? Some plots are outright products of clinical mental disorders.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;The tale of the thumb-sized boy (Malchik-s-Palchik) exists in several versions. The Russian one begins with the old woman &lt;strong&gt;chopping off her own finger&lt;/strong&gt;, which comes to life. Some time later the old man sells the child to a landowner. In Perrault's version the parents &lt;strong&gt;lead their children into the forest&lt;/strong&gt; to get rid of them; the children then end up with an ogre who wants to devour the brothers but, because of a switch, eats his own daughters instead.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Instead of a magical world (which is not unambiguously positive either), children are frightened in order to prepare them for a world they will never have to face and, above all, to bring them under the control of adults.&lt;/p&gt;

&lt;p&gt;Yes, fairy tales are not a universal answer to raising children. But I personally would like to build a dialogue with my child in such a way that, at the very least, I can understand it from my side too.&lt;/p&gt;

&lt;p&gt;The online fairy-tale collections that surface at the top of search results seem to be made by people who have never sat down to read them to a child. Tiny fonts, no typography, no illustrations, and so on. Most of them make your eyes bleed.&lt;/p&gt;

&lt;h2&gt;The &lt;a rel="nofollow" target="_blank" href="http://www.alltal.es/"&gt;alltal.es&lt;/a&gt; site&lt;/h2&gt;

&lt;p&gt;As a test we built &lt;a rel="nofollow" target="_blank" href="http://www.alltal.es/"&gt;Alltal.es, a site of children's fairy tales&lt;/a&gt;. It is a deep alpha; I rolled out the front page only yesterday. I hired an editor who for now is only filling the database, without proofreading, error checking, or selecting the tales worth keeping. A lot of work still lies ahead:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Improving the typography&lt;/li&gt;
&lt;li&gt;Adding illustrations&lt;/li&gt;
&lt;li&gt;Checking for errors&lt;/li&gt;
&lt;li&gt;Filling the glossary with explanations of obscure and archaic words&lt;/li&gt;
&lt;li&gt;Adding articles about the authors and about the distinctive features of different peoples' tales&lt;/li&gt;
&lt;li&gt;Improving the interface&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;For now we are filling the site with tales to show that we are ready to do this work.&lt;/p&gt;

&lt;p&gt;There are 1001 "buts" standing in the way of continuing this work. And yet these efforts may eventually produce a platform for contemporary authors as well; I have already met some of them, and they write tales that are not commissioned by "the party". That would broaden the site's content and make it the place parents go to for tales for their children.&lt;/p&gt;

&lt;h2&gt;Partnership&lt;/h2&gt;

&lt;p&gt;I am open to partnering with anyone who is ready to give us traffic or content.&lt;/p&gt;

&lt;p&gt;If you know artists who have already drawn illustrations for fairy tales, and the licenses on their work allow it to be used &lt;strong&gt;with attribution and a link&lt;/strong&gt;, we would gladly add their illustrations to the tales.&lt;/p&gt;

&lt;p&gt;We are not thinking about monetization yet. That question may come up someday, but we are against turning the site into a dump, so we are not ready to reciprocate with everyone.&lt;/p&gt;

&lt;p&gt;The site has no traffic yet. It appeared literally yesterday and has few tales so far. Half of the pages are empty and not all of the functionality works. As a template we took the design of &lt;a rel="nofollow" target="_blank" href="http://sib.fm"&gt;Sib.fm&lt;/a&gt;, whose design makes people read. Our own design is lagging far behind. If there are designers who could also help on a small budget, get in touch.&lt;/p&gt;

&lt;h2&gt;Warning&lt;/h2&gt;

&lt;p&gt;The site is not ready yet; we have not finished the layout of the tales. If you were hoping to visit &lt;a rel="nofollow" target="_blank" href="http://www.alltal.es/"&gt;Alltal.es&lt;/a&gt; today and find a tale to read, do not. You will not be comfortable. Give us more time to put everything in order. We are working on making it better.&lt;/p&gt;

&lt;p&gt;Treat this post as the announcement of an alpha version of the product. Leave comments with ideas and bug reports.&lt;/p&gt;
</description>
         <guid isPermaLink="false">hhttp://www.vurt.ru/2012/05/alltales</guid>
         <pubDate>Sat, 19 May 2012 07:00:00 +0000</pubDate>
      </item>
      <item>
         <title>My (very shallow) thoughts on Dart</title>
         <link>http://feedproxy.google.com/~r/CoderWhoSaysPy/~3/p0VhUiNzBpI/my-very-shallow-thoughts-on-dart.html</link>
         <description>Being the language nerd that I am, I actually find it fun to learn new programming languages. Now typically this is nothing more than me reading all of the official documentation and writing some toy examples that give me a very shallow, quick-and-dirty feel for a language. Since I have been involved in language design for nearly a decade (started participating on python-dev in June 2002) and have done toy examples now in &lt;a rel="nofollow" target="_blank" href="http://code.google.com/p/bcannon/source/browse/#hg%2Flanguages"&gt;18 languages&lt;/a&gt; (17 actually still run; I have never bothered to get Forth to work again after a gforth change broke my code),&amp;nbsp;this is actually usually enough for me to grasp the inspirations for a language and thus understand its essence.&lt;br /&gt;
&lt;br /&gt;
At work I have been doing some JavaScript work for an internal Chrome extension and dashboard and so that led me to want to look into what &lt;a rel="nofollow" target="_blank" href="http://www.dartlang.org/"&gt;Dart&lt;/a&gt; had to offer over JavaScript. I know the language is only at version 0.09 (and still changing weekly), but the fundamentals are there so I wanted to see what the general feel of the language is (and will continue to be).&lt;br /&gt;
&lt;br /&gt;
I also know Dart is &lt;a rel="nofollow" target="_blank" href="https://news.ycombinator.com/item?id=2982949"&gt;somewhat controversial&lt;/a&gt; for some people. Personally, I fall on the &lt;a rel="nofollow" target="_blank" href="http://www.dartlang.org/support/faq.html#does-dart-divert-effort"&gt;"competition is good" side&lt;/a&gt; of the argument, not the "OMG fragmentation" side. I want ECMAScript Harmony to still happen and give me a cleaner, tighter, more functional JavaScript, but that doesn't mean Dart doesn't have a place in the world as a cleaner OO language for the web. Besides, me thinking otherwise would make me a massive hypocrite as I began working on Python before it was cool (I feel like I need a hipster meme for that statement, but I digress) and I have worked hard to convert people to Python from other languages. Hell, I have tried to foster competition between the Python VMs to get them to push each other to perform better and be ever more interoperable. IOW I don't totally buy this fragmentation argument.&lt;br /&gt;
&lt;br /&gt;
Going into learning Dart I knew who was involved with the language which is what will inherently define how a language feels. I knew &lt;a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Lars_Bak_(computer_programmer)"&gt;Lars Bak&lt;/a&gt; of &lt;a rel="nofollow" target="_blank" href="http://code.google.com/p/v8/"&gt;V8&lt;/a&gt; helped design the language, which meant it would have some design restrictions put on it to make it have a damn fast VM. &lt;a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Joshua_Bloch"&gt;Josh Bloch&lt;/a&gt; has been helping to design Dart's library which meant some JDK feel to it. I also know Jim Hugunin is involved which should also help with the VM speed. So fast with an API designed like the JDK.&lt;br /&gt;
&lt;br /&gt;
What did I find? A language with a damn fast VM and a standard library that felt like the JDK. =) Take OO as a Python programmer would expect (e.g. pure OO where everything is an object, not dogmatic OO like Java where everything has to be in a class definition), make types entirely optional for testing and tooling purposes but with enough support to use interfaces and generics, then toss in abilities based on what JavaScript allows, and you have a good idea of what Dart offers.&lt;br /&gt;
&lt;br /&gt;
So, Dart has optional typing. In case you have not heard, Dart does not use type information at runtime for performance, and it does not throw any form of fit when a type doesn't match what is specified unless you run in &lt;i&gt;checked&lt;/i&gt;&amp;nbsp;mode. If you do that then you get warnings about possible type issues. But &lt;a rel="nofollow" target="_blank" href="http://www.dartlang.org/support/faq.html#why-types-unsound"&gt;Dart's type system is unsound&lt;/a&gt;, so don't expect typing to catch every error that a stricter type system might, even when you run in checked mode. Dart views types as helpful documentation and a way to help tools assist with things, period. I actually find it rather refreshing to have a language that treats types as just documentation since that is really what they are for the programmer (VMs can use them for performance, but they aren't required for good performance, and type safety only saves you from a minor set of bugs, which every Python programmer probably realizes eventually =).&lt;br /&gt;
&lt;br /&gt;
But that's even if you bother with types! You can write all of your code without types and everything will run without issue. Even generics are optional, so you can declare that a function accepts a plain &lt;span style="font-family:Courier, monospace;"&gt;List&lt;/span&gt; or a &lt;span style="font-family:Courier, monospace;"&gt;List&lt;/span&gt; with a type parameter; Dart doesn't care either way, and it alleviates covariance/contravariance headaches by not caring if you don't care either. It's actually rather nice to have non-library code be written quickly using dynamic typing and only add in the type information for library code where you care about what interface is expected. IOW I think Dart strikes a nice balance with how it does typing and I actually feel fine using types when I know what I expect to accept in my own code that I don't expect anyone else to rely upon.&lt;br /&gt;
&lt;br /&gt;
Dart is OO, not prototype-based like JavaScript. It's single-inheritance, which I'm fine with. It does have interfaces as one would expect in a statically typed language, but it softens their expense by allowing one to define a default implementation of an interface. What this means is that the Map interface will also give you a HashMap instance if you call &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;new Map()&lt;/span&gt;&lt;span style="background-color:white;font-family:inherit;"&gt;. I suspect they snagged the idea from Scala, where you have the Map class which hides HashMap from the user if you simply don't care about what Map implementation you use.&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;It does have a modicum of privacy by using a leading underscore to signal that something is private, much like Python. But something is either private to its library or public, period. Every field automatically has a getter and setter defined for it, so there is no way to force a private field (which I think is a good thing since I find strict privacy bloody annoying). I also like that getters and setters are directly supported by the language with automatic generation, so you don't ever have to see a setSomething()/getSomething() function call just to read/write a field, but you can do something like Python's properties very easily.&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;The standard libraries are fine and just feel like the JDK. Things are very much LBYL rather than EAFP. I am willing to bet (although I have not tested this) that exceptions are a little expensive in Dart (since exceptions are hard to optimize) and so they would rather go the LBYL way. But they still went a little overboard in my opinion on some things (e.g. the &lt;a rel="nofollow" target="_blank" href="http://api.dartlang.org/dart_core/List.html"&gt;list interface&lt;/a&gt; has a last() method instead of supporting negative indexes). But there is nothing there that is making me run away screaming.&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;One place I do think Dart could use some improvement is simplifying their constructor rules. Up front, Dart has some nice syntactic sugar for a constructor where you directly specify how the constructor's arguments map to instance fields, avoiding having to declare the constructor parameters and then also write an assignment. OK, I like that.&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;Dart also has initializer lists which let you initialize final fields. OK, that's cool and a nice idea taken from C++.&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="background-color:white;font-family:inherit;"&gt;Constructors are not inherited. OK, that's fine since you probably want to be explicit about how you tweak stuff. But there is an exception: the default, no-argument constructor calls the superclass' no-argument constructor. So while not technically inherited, it might as well be in that single instance. And all defined constructors will automatically call the superclass' default constructor; if one isn't defined, you must explicitly call a constructor somehow (probably in the initializer list of your constructor). Um, OK...&lt;/span&gt;&lt;br /&gt;
&lt;br /&gt;
And you have named constructors. This gets you around the lack of type-based method overloading for constructors. OK, I can go with that.&lt;br /&gt;
&lt;br /&gt;
You also have constant constructors, since fields can only be initialized to compile-time constant values. Fine, that's for performance and determinism in instance creation, so I can grasp the desire for that.&lt;br /&gt;
&lt;br /&gt;
And then you have factory constructors. OK, this is where I go "WTF people". This is so that you can have a constructor that doesn't actually create a new instance but can instead return something other than a new instance (think of Python's __new__() or any of Java's static factory methods). But this lets you use the &lt;span style="font-family:Courier, monospace;"&gt;new&lt;/span&gt; keyword on a factory constructor instead of using a static method. And that to me seems unneeded.&lt;br /&gt;
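&lt;br /&gt;
To make the Python comparison concrete, here is a minimal sketch of __new__() behaving like a factory: calling the class can hand back an object that already exists rather than a fresh one. This is Python, not Dart, and the Logger class and its cache are hypothetical names chosen only for illustration.&lt;br /&gt;
&lt;pre&gt;
```python
class Logger:
    """Hypothetical class that keeps one shared instance per name."""

    _cache = {}  # name -> Logger that was already created

    def __new__(cls, name):
        # Hand back the cached object when there is one; only otherwise
        # create a new instance. This is the "factory" behaviour.
        if name in cls._cache:
            return cls._cache[name]
        instance = super().__new__(cls)
        instance.name = name
        cls._cache[name] = instance
        return instance


a = Logger("web")
b = Logger("web")
c = Logger("db")
print(a is b)  # True: the second call returned the cached object
print(a is c)  # False: a different name gets its own instance
```
&lt;/pre&gt;
The caller still writes what looks like an ordinary constructor call, which is roughly the effect the post attributes to factory constructors.&lt;br /&gt;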
&lt;br /&gt;
So let's recap what constructor options we have. We have regular constructors, default and defined, which support initializer lists. You have named constructors. There are constant constructors. And you also have factory constructors. If you don't count the default constructor as special that means Dart has four types of constructors. WTF!?! I realize that Java's FactoryFactoryOfFactories crap has probably spooked the crap out of the Dart designers, all the while having Java influences making them think they need the &lt;span style="font-family:Courier, monospace;"&gt;new&lt;/span&gt; keyword for anything that would return an instance of a class, but this seems a bit much. Dart's function definitions are rich enough to allow for optional arguments, etc., which would suggest that the typical constructor can do the job of named constructors, with static methods picking up the slack, where absolutely necessary, in the places factory constructors are used. Maybe I'm missing something here, but I think they tried to design for everything that is bad about Java's constructor mess without stopping to think what their function definitions already buy them, all while making sure the &lt;span style="font-family:Courier, monospace;"&gt;new&lt;/span&gt; keyword was used.&lt;br /&gt;
&lt;br /&gt;
Luckily that is the only bit of Dart that I found poorly designed. Everything else is reasonable and something any JavaScript programmer will be somewhat familiar with or quickly grasp.&lt;br /&gt;
&lt;br /&gt;
Now as I said, I only did &lt;a rel="nofollow" target="_blank" href="http://code.google.com/p/bcannon/source/browse/#hg%2Flanguages%2FDart"&gt;toy examples in Dart&lt;/a&gt; beyond reading the docs from beginning to end. If I had had more time this weekend I might have done one more coding example that was more involved, but I ran out of time. But based on what I have read and what I learned, I am happy with Dart and would be content using it for programming for the Internet. I would also be totally happy being asked to use it in a situation where others wanted to use types (e.g. I would be fine ditching Java for Dart if people really felt the need to hold on to their types).</description>
         <author>Brett Cannon</author>
         <guid isPermaLink="false">tag:blogger.com,1999:blog-20144447.post-2969136173953826690</guid>
         <pubDate>Sun, 13 May 2012 21:11:00 +0000</pubDate>
      </item>
      <item>
         <title>Thoughts on using function signatures as a DSL for CLI parsers</title>
         <link>http://feedproxy.google.com/~r/CoderWhoSaysPy/~3/GQTu3iJmOAE/thoughts-on-using-function-signatures.html</link>
         <description>I have no idea why, but this morning I thought about a decorator for delineating what function should be treated as the main function (e.g. using a decorator instead of the traditional &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;if __name__ == '__main__'&lt;/span&gt; idiom). Now I solved it in my head on the spot, and then immediately realized someone had to have solved this already. Turns out various people have done things as nuts as examine stack levels to detect the &lt;span style="font-family:Courier, monospace;"&gt;__main__&lt;/span&gt; name, but the most straightforward &lt;a rel="nofollow" target="_blank" href="http://code.activestate.com/recipes/577791/"&gt;solution&lt;/a&gt; I found doesn't do anything nearly as nuts or CPython-specific and is basically what I came up with. There was a red herring, though, in everyone's solution where they claim the decorator has to be on the last function in your module. While technically true when using the decorator as a decorator only, you can also just as easily not decorate the function and instead, at the end of your module, do something like &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;main(func)&lt;/span&gt; since that is the same as decorating &lt;span style="font-family:Courier, monospace;"&gt;func&lt;/span&gt; with &lt;span style="font-family:Courier, monospace;"&gt;main&lt;/span&gt;.&lt;br /&gt;
&lt;br /&gt;
A really simple expansion of this idea of helping out with defining what function is the main function is to pass in &lt;span style="font-family:Courier, monospace;"&gt;sys.argv&lt;/span&gt; and to return a value to signify exit status: &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;sys.exit(func(sys.argv[1:]))&lt;/span&gt;. So now you have made the decorator more useful than just replacing the old __name__ idiom.&lt;br /&gt;
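&lt;br /&gt;
A minimal sketch of such a decorator might look like the following. It is one possible reading of the idea, not necessarily the linked recipe: it assumes that checking the function's __module__ attribute is an acceptable way to detect that the defining file is being run as a script.&lt;br /&gt;
&lt;pre&gt;
```python
import sys


def main(func):
    """Run func as the entry point when its module is executed as a script."""
    # func.__module__ is "__main__" only when the file that defines func is
    # being run directly, not when that file is imported.
    if func.__module__ == "__main__":
        # Pass the command-line arguments in and use the return value
        # as the process exit status.
        sys.exit(func(sys.argv[1:]))
    return func


# Usage in a hypothetical script (kept as a comment so that merely loading
# this sketch does not exit the interpreter):
#
# @main
# def run(args):
#     print(args)
#     return 0
```
&lt;/pre&gt;
Because the function runs at decoration time, anything it needs has to be defined above it, which is where the "last function in the module" advice comes from; calling main(run) at the bottom of the file avoids that constraint.&lt;br /&gt;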
&lt;br /&gt;
But while that is nice and helps deal with the very common case, I wanted more. Why can't you introspect on the arguments the function takes and use that to automatically generate a command-line parser? I did a search and the best I could find is &lt;a rel="nofollow" target="_blank" href="http://pypi.python.org/pypi/entrypoint"&gt;entrypoint&lt;/a&gt;, but it doesn't go far enough for me. What I want is to use the full expressiveness of function parameters in Python to express as much about what should/could be given on the command-line along with passing in as little as possible to the decorator in order to replicate the common case of command-line parsing; think just as easy as &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/getopt.html#module-getopt"&gt;getopt&lt;/a&gt; but more powerful by using as much of &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/argparse.html#module-argparse"&gt;argparse&lt;/a&gt; as you can without coming up with complicated rules about how things should work (since once you pass a certain complexity threshold you should just build the argument parser using argparse's API directly and stop trying to optimize for it like I'm suggesting).&lt;br /&gt;
&lt;br /&gt;
So what do we have at our disposal to build such a decorator? We have positional arguments so we know how many arguments are required without some specific qualifier. We have variable positional arguments (e.g. &lt;span style="font-family:Courier, monospace;"&gt;*args&lt;/span&gt;) to take an optional number of extra arguments at the end of the command-line. We have keyword arguments which are optional flags that one can specify. You could even have variable keyword arguments for major flexibility, but that just seems like a total lack of structure that CLIs don't typically provide. With all of that you can reproduce getopt without any issue for long-form names. For short names, I would say you need to pass in a mapping of short names to long names into the decorator. The same goes for a mapping of long names to help strings (you can use the function's docstring for the main help for the app itself).&lt;br /&gt;
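&lt;br /&gt;
The mapping described above can be sketched with the standard inspect and argparse modules. The helper name build_parser, its two keyword parameters, and the example copy function are all hypothetical; this is one way the idea could be wired up, not an existing library's API.&lt;br /&gt;
&lt;pre&gt;
```python
import argparse
import inspect


def build_parser(func, short_names=None, help_strings=None):
    """Derive an argparse parser from func's signature (hypothetical helper)."""
    # short_names maps short name -> long name, e.g. {"m": "mode"}.
    short_for = {long: short for short, long in (short_names or {}).items()}
    help_strings = help_strings or {}  # long name -> help text
    # The function's docstring doubles as the main help for the program.
    parser = argparse.ArgumentParser(description=inspect.getdoc(func))
    for name, param in inspect.signature(func).parameters.items():
        text = help_strings.get(name)
        if param.kind is inspect.Parameter.VAR_POSITIONAL:
            # *args: any number of trailing command-line arguments.
            parser.add_argument(name, nargs="*", help=text)
        elif param.kind is inspect.Parameter.VAR_KEYWORD:
            raise TypeError("**kwargs has no structured command-line form")
        elif param.default is inspect.Parameter.empty:
            # No default value: a required positional argument.
            parser.add_argument(name, help=text)
        else:
            # A default value: an optional --flag, plus a short form if given.
            flags = ["--" + name]
            if name in short_for:
                flags.insert(0, "-" + short_for[name])
            parser.add_argument(*flags, default=param.default, help=text)
    return parser


def copy(source, *extras, mode="fast"):
    """Copy things around."""


parser = build_parser(copy, short_names={"m": "mode"})
args = parser.parse_args(["a.txt", "b.txt", "c.txt", "-m", "slow"])
print(args.source, args.extras, args.mode)
```
&lt;/pre&gt;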
&lt;br /&gt;
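A minimal sketch of that mapping, assuming Python 3's inspect.signature() is available. The names build_parser and main are hypothetical, made up for illustration, and short names and help strings are left out:&lt;br /&gt;
&lt;br /&gt;
```python
import argparse
import inspect


def build_parser(func):
    """Build an argparse parser from func's signature (illustrative only)."""
    parser = argparse.ArgumentParser(description=inspect.getdoc(func))
    for name, param in inspect.signature(func).parameters.items():
        if param.kind is param.VAR_POSITIONAL:
            # *args: any number of trailing command-line arguments.
            parser.add_argument(name, nargs='*')
        elif param.kind is param.VAR_KEYWORD:
            # **kwargs has no obvious CLI counterpart, so it is skipped.
            continue
        elif param.default is param.empty:
            # No default value: a required positional argument.
            parser.add_argument(name)
        else:
            # A default value turns the parameter into an optional flag.
            parser.add_argument('--' + name, default=param.default)
    return parser


def main(source, *extras, limit='10'):
    """Hypothetical entry point used only to exercise build_parser()."""


args = build_parser(main).parse_args(['in.txt', 'a', 'b', '--limit', '5'])
print(args.source, args.extras, args.limit)
```
&lt;br /&gt;
The function's docstring becomes the parser description, which matches the idea of using it as the main help text.&lt;br /&gt;
&lt;br /&gt;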
But where things get really interesting is when you take into consideration function annotations. That opens up the possibility of going beyond getopt and potentially supporting argparse's &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/argparse.html#action"&gt;action&lt;/a&gt;, &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/argparse.html#nargs"&gt;nargs&lt;/a&gt;, and &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/argparse.html#type"&gt;type&lt;/a&gt; options. Take the type option as an example. You could say &lt;span style="background-color:#eeeeee;"&gt;&lt;span style="font-family:Courier, monospace;"&gt;limit:int=10&lt;/span&gt;&lt;/span&gt; to have a command-line option called &lt;span style="font-family:Courier, monospace;"&gt;--limit&lt;/span&gt; which only accepts an integer and defaults to 10. This obviously could also work with float or any other type where you can just pass in a string to the constructor to get back an instance of the type. So you have a general case which can be useful, but can you potentially special-case some things to get enhanced functionality where it doesn't make sense to simply take in a string?&lt;br /&gt;
&lt;br /&gt;
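As a rough sketch of the general case, assuming the annotation is any callable that accepts a string, it can be handed straight to argparse's type option. The helper add_typed_option and the function main are hypothetical names:&lt;br /&gt;
&lt;br /&gt;
```python
import argparse
import inspect


def add_typed_option(parser, param):
    """Add --name, converting its value with the annotation when callable."""
    kwargs = {'default': param.default}
    annotation = param.annotation
    if annotation is not param.empty and callable(annotation):
        # argparse calls this with the raw command-line string.
        kwargs['type'] = annotation
    parser.add_argument('--' + param.name, **kwargs)


def main(limit: int = 10, ratio: float = 0.5):
    """Hypothetical function whose keyword arguments become options."""


parser = argparse.ArgumentParser()
for param in inspect.signature(main).parameters.values():
    add_typed_option(parser, param)

args = parser.parse_args(['--limit', '3'])
print(args.limit, args.ratio)
```
&lt;br /&gt;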
Lists pose an interesting option as argparse provides both nargs for specifying the number of arguments to a single option and the append action for accepting multiple instances of the same option and accumulating them. In my mind both can be expressed in a way that I think makes sense but some might view as too magical. If you specify &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;names:list=[]&lt;/span&gt;, then that supports the append action, e.g. &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;--names Brett --names Andrea&lt;/span&gt; leads to names being set to &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;['Brett', 'Andrea']&lt;/span&gt;. But if you were to do &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;names:['+']=[]&lt;/span&gt;, then that would get the same result from &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;--names Brett Andrea&lt;/span&gt;. In other words, the list type specifies the append action while a list instance specifies using the nargs option, with the single item in the list acting as the value to set nargs to.&lt;br /&gt;
&lt;br /&gt;
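A small sketch of how those two spellings might be told apart, under the assumption that annotations are plain objects at runtime. The helper names add_list_option and parse, and the functions by_append and by_nargs, are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
import argparse
import inspect


def add_list_option(parser, param):
    """Map a list annotation onto argparse's append action or nargs."""
    flag = '--' + param.name
    if param.annotation is list:
        # The list type itself: repeat the option and accumulate values.
        parser.add_argument(flag, action='append', default=param.default)
    elif isinstance(param.annotation, list):
        # A list instance: its single item is the value given to nargs.
        (nargs,) = param.annotation
        parser.add_argument(flag, nargs=nargs, default=param.default)


def by_append(names: list = []):
    """Hypothetical function using the append spelling."""


def by_nargs(names: ['+'] = []):
    """Hypothetical function using the nargs spelling."""


def parse(func, argv):
    parser = argparse.ArgumentParser()
    for param in inspect.signature(func).parameters.values():
        add_list_option(parser, param)
    return parser.parse_args(argv)


appended = parse(by_append, ['--names', 'Brett', '--names', 'Andrea'])
gathered = parse(by_nargs, ['--names', 'Brett', 'Andrea'])
print(appended.names, gathered.names)
```
&lt;br /&gt;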
For booleans, I would want the use of the bool type to mean use either the store_true or store_false action based on what the default value is. So &lt;span style="background-color:#eeeeee;font-family:Courier, monospace;"&gt;turn_on:bool=True&lt;/span&gt; would use the store_false action since the argument is meant to be a boolean and its default value is True, meaning that if the option was specified it represents the reverse.&lt;br /&gt;
&lt;br /&gt;
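Sketched out, the rule is a one-line choice of action. The helper add_bool_option and the function main below are hypothetical names used only for illustration:&lt;br /&gt;
&lt;br /&gt;
```python
import argparse
import inspect


def add_bool_option(parser, param):
    """Add a flag whose presence flips the parameter's default value."""
    # A True default means the flag switches it off, and vice versa.
    action = 'store_false' if param.default else 'store_true'
    parser.add_argument('--' + param.name, action=action)


def main(turn_on: bool = True, verbose: bool = False):
    """Hypothetical function with one flag of each polarity."""


parser = argparse.ArgumentParser()
for param in inspect.signature(main).parameters.values():
    if param.annotation is bool:
        add_bool_option(parser, param)

defaults = parser.parse_args([])
flipped = parser.parse_args(['--turn_on', '--verbose'])
print(defaults.turn_on, defaults.verbose, flipped.turn_on, flipped.verbose)
```
&lt;br /&gt;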
Finally, the tricky bit is files, since a file is a common command-line argument and you might as well open and close it for the function. The solution argparse uses is a specific &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/argparse.html#filetype-objects"&gt;FileType class&lt;/a&gt; where you can pass specific arguments to use when opening the file. The problem is that it doesn't support everything open() does, e.g. encoding. So what I would want to do instead is provide a partial function that took everything &lt;b&gt;but&lt;/b&gt; the file path and then, when it came time to call the main function, passed in the file path to the partial function, passed the returned file to &lt;a rel="nofollow" target="_blank" href="http://docs.python.org/py3k/library/contextlib.html#contextlib.closing"&gt;contextlib.closing()&lt;/a&gt;, and then passed it on to the main function. You could even generalize a lot of this and simply say that whatever is specified as the function annotation, if it isn't a special case like lists, gets called with what came from the command-line, and if the result provides a context manager it is used before calling the main function.&lt;br /&gt;
&lt;br /&gt;
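Here is one way the generalized version could look. It is only a sketch: it swaps in contextlib.ExitStack (instead of contextlib.closing()) to manage whatever context managers the annotations return, and the names call_main, utf8_reader and main are hypothetical:&lt;br /&gt;
&lt;br /&gt;
```python
import contextlib
import functools
import inspect
import os
import tempfile


def call_main(func, raw):
    """Call func with raw command-line strings converted by annotations."""
    with contextlib.ExitStack() as stack:
        kwargs = {}
        for name, param in inspect.signature(func).parameters.items():
            value = raw[name]
            if param.annotation is not param.empty:
                value = param.annotation(value)
                if hasattr(value, '__enter__'):
                    # Context managers (e.g. files) are exited after func.
                    value = stack.enter_context(value)
            kwargs[name] = value
        return func(**kwargs)


# Everything open() needs except the path, including the encoding.
utf8_reader = functools.partial(open, mode='r', encoding='utf-8')


def main(source: utf8_reader, limit: int):
    """Hypothetical main function: return the first limit characters."""
    return source.read()[:limit]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo.txt')
    with open(path, 'w', encoding='utf-8') as handle:
        handle.write('hello world')
    result = call_main(main, {'source': path, 'limit': '5'})
print(result)
```
&lt;br /&gt;
The raw dictionary stands in for the namespace argparse would produce, so the sketch stays independent of how the parser itself is built.&lt;br /&gt;
&lt;br /&gt;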
So those are my thoughts on using function parameters as a DSL for getopt++/argparse-- functionality on a Saturday morning. Honestly the most complicated bit would be constructing the arguments to pass to the main function in the right order; otherwise it's just introspecting on a function's parameters and making the proper calls to argparse. But then again the real question is whether anyone thinks this sounds at all reasonable enough to code up.</description>
         <author>Brett Cannon</author>
         <guid isPermaLink="false">tag:blogger.com,1999:blog-20144447.post-1179193750413927939</guid>
         <pubDate>Sat, 12 May 2012 13:02:00 +0000</pubDate>
      </item>
      <item>
         <title>A Month in Thailand</title>
         <link>http://feedproxy.google.com/~r/app-engine/~3/4Mc3WZD9b5w/month-in-thailand</link>
         <description>&lt;p&gt;A month of our life in Thailand has already passed, and many friends have asked for some kind of report on living here. It is also worth writing down for the historical record.&lt;/p&gt;

&lt;p&gt;The web is full of stories about how to leave, but most of them come from solo travelers. We, on the other hand, are parents with an infant, and we planned a long stay in a country we had never visited before. Perhaps that nuance gives this post some value of its own.&lt;/p&gt;

&lt;h2&gt;Preparation&lt;/h2&gt;

&lt;p&gt;First of all, every preparation was aimed at keeping us safe in another country, both during the first days and for the whole stay. Safety is made up of different parts: it can be the long-term question of money, or it can be staying focused during the flights.&lt;/p&gt;

&lt;p&gt;The best way to calm fears is to make a plan, study the possible risks, and prepare to deal with them. Our preparation started almost a year ago.&lt;/p&gt;

&lt;h2&gt;The child&lt;/h2&gt;

&lt;p&gt;&lt;a rel="nofollow" target="_blank" href="http://www.juliphoto.com/"&gt;Yulia&lt;/a&gt; and I had long wanted to go and live in Asia for an extended time. But it always seemed there was still time ahead. Pregnancy, childbirth, and the chores that followed put those plans on pause, but a little person with her own rhythm of growing up and living became a constant reminder that time flies.&lt;/p&gt;

&lt;p&gt;Some people travel from Russia and Ukraine to give birth in other countries, but we had no experience of living abroad for long. Besides, Odessa has a comparatively good climate and, by all accounts, some of the best prenatal care available. At least everything went smoothly for us, and we were happy with the doctors.&lt;/p&gt;

&lt;p&gt;Half a year later we started looking into whether we could travel with an infant. Komarovsky's answer was that a child is ready to fly two weeks after birth, provided everything is fine with it. That opinion put our fears in order, and we realized that the success and safety of long trips depend on our preparation.&lt;/p&gt;

&lt;p&gt;I won't retell other guides for traveling parents. Our main priorities were preparing for the two phases of the journey: the flights there and back, and the stay itself. While the infant is breastfed, a flight turns out to be a fairly simple procedure. During the most dangerous periods, 20 minutes of takeoff and 15 of landing, the child is kept busy at the mother's breast, and painful ear problems can be avoided.&lt;/p&gt;

&lt;p&gt;The stay itself raised many questions, so we decided to hedge by first going to a purpose-built place with good infrastructure and medical care. We chose Turkey, since it is very close to Odessa and is, by and large, considered a European country.&lt;/p&gt;

&lt;p&gt;I won't go into the details of preparing for that trip; here are a few scattered facts about how it went:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Alisa was 10 months old at the time of the trip&lt;/li&gt;
&lt;li&gt;We ended up in hospital with a suspected intestinal infection, but it turned out to be just food poisoning of the nursing mother. The conditions were very good. For the opposite impressions and a comparison with the situation in Ukraine, see the post "&lt;a rel="nofollow" target="_blank" href="http://www.vurt.ru/2012/02/relocate"&gt;Why I Want to Leave&lt;/a&gt;"&lt;/li&gt;
&lt;li&gt;It turned out that milk-free infant cereals are very hard to find in Turkey, but luckily our allergy had passed and we were able to switch to the mixes sold there&lt;/li&gt;
&lt;li&gt;Interacting with the friendly Turks, who played with Alisa at every opportunity, did her a lot of good. She became more open and sociable, which helped her development&lt;/li&gt;
&lt;li&gt;The trip turned out to be not that hard. We realized that we can travel together&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;For people as timid and overcautious as I am, this kind of trip can probably be recommended for the first time. The reward is the daily joy of bathing your child in the sea. Plus, of course, a break from the many household problems that a mother has to deal with.&lt;/p&gt;

&lt;h2&gt;Preparing for Thailand&lt;/h2&gt;

&lt;p&gt;Obviously, two weeks is less than "an indefinite time, until we feel like going back". A stay at an all-inclusive hotel is more of a training exercise. You don't spend resources, you accumulate them (apart from the upfront costs of leaving, including the budget for excursions and souvenirs). Here, by contrast, you have to become self-sufficient, without the help of the connections you have grown into at home. Then again, after all my moves since school, I am used to the idea that connections are an asset you build up.&lt;/p&gt;

&lt;p&gt;The main groups of tasks to take care of, in no particular order, because each one contains some critically important item:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Documents&lt;/li&gt;
&lt;li&gt;Money&lt;/li&gt;
&lt;li&gt;Insurance&lt;/li&gt;
&lt;li&gt;Transport&lt;/li&gt;
&lt;li&gt;Knowledge&lt;/li&gt;
&lt;li&gt;Mindset&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Documents&lt;/h3&gt;

&lt;p&gt;This part is fairly simple and has been described many times. You need to get or renew your international passports; in my case I needed a new one. That turned into a very painful process, because in the feudal pseudo-state that Ukraine is, people have no right to their own name. My dad was probably the most upset when he learned that some dim chauvinist scum at the OVIR passport office has more right to name me than he does. Since time and energy were limited, I now hold an international passport issued in the name Mykhailo, which is not mine.&lt;/p&gt;

&lt;p&gt;The second stage is getting a visa. It is somewhat harder for Ukrainian citizens than for Russian citizens. On the whole, though, the process is well documented, and a pleasant attitude toward the foreign country starts to form as early as your dealings with the consul. A short summary for those who are considering going to Thailand:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A visa can be issued for 60 days and costs 40 dollars (an infant needs one too).&lt;/li&gt;
&lt;li&gt;You don't need intermediaries: the list of all documents is on the consulate's website. Open it, read it, and if something is unclear, ask by phone. The polite and pleasant staff will answer everything. Proof of employment is not required, and you don't have to buy a return ticket either; a reservation is enough.&lt;/li&gt;
&lt;li&gt;The documents can be printed and sent by courier together with the money (we used FedEx).&lt;/li&gt;
&lt;li&gt;In Kyiv, picking up the finished documents takes 3 minutes, with no queue, at a dedicated window.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;After that everything is simple. In Thailand the visa is extended for another 30 days at a designated office. Then comes the so-called visa run: you cross into neighboring Malaysia, get the same kind of visa again, and it can be extended in the same way. In total that gives half a year of stay (two visas of 60 + 30 days each). If you like it, you can get a special student visa for another year. Russian citizens simply do visa runs once a month.&lt;/p&gt;

&lt;p&gt;Since tens of thousands of people need visa runs, many places have offices of companies that organize them.&lt;/p&gt;

&lt;p&gt;I hope this eases any worries about whether such a trip is possible.&lt;/p&gt;

&lt;p&gt;As for additional documents, it makes sense to bring an international driving permit (the vehicle registration slips, or whatever those extra papers are called, are not required). If you fly with a child under a certain age, the birth certificate and the presence of both parents are mandatory. In short, check the details.&lt;/p&gt;

&lt;h3&gt;Money&lt;/h3&gt;

&lt;p&gt;This is harder and more individual. The most sensible approach is to switch to remote work. I registered on &lt;a rel="nofollow" target="_blank" href="http://odesk.com"&gt;Odesk.com&lt;/a&gt;, passed the tests, and submitted as many documents as I could to make my profile look trustworthy. I ordered a Mastercard to which payments can be transferred if needed. I haven't used that option yet; it is extra protection just in case.&lt;/p&gt;

&lt;p&gt;I won't advise you on where to get a reserve of money. But it is best not to carry it in cash and to put it in a bank account instead. I got myself a Visa Gold, because Gold status comes with some extra services. For example, a personal manager who helps answer some questions and gives advice. And a small insurance policy, which is very handy for short trips.&lt;/p&gt;

&lt;p&gt;I won't advertise UkrSibbank. My impression of it is good, but it has one annoying drawback that matters a lot in this context: it sends SMS notifications only to +380 numbers. So you either leave the SIM in a switched-on phone with someone in Ukraine, or set up forwarding with special software on some device that also has to be left powered on and connected.&lt;/p&gt;

&lt;p&gt;I had an extra set of cards issued in case of problems with ATMs.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Important: &lt;strong&gt;notify your bank that you are leaving Ukraine&lt;/strong&gt;, so that sudden spending from an unexpected location doesn't trigger a block.&lt;/p&gt;&lt;/blockquote&gt;

&lt;h4&gt;Expenses&lt;/h4&gt;

&lt;p&gt;One of the most frequent questions is "how much does a month of living cost". I will give approximate expenses for a comfortable level; the upper bounds often have no limit.&lt;/p&gt;

&lt;p&gt;Our figures are not optimal: in many cases we did not look for the best possible option, and the samples come from asking people around us.&lt;/p&gt;

&lt;p&gt;We did not set out to economize hard, but we don't throw money around either. Let's say material things don't matter much to us as long as that doesn't cause discomfort for the child. And she doesn't have many needs yet either.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Home&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A night in a Bangkok hotel can cost from 40 to 100-150 USD. We booked in advance but arrived at an awful hotel. We walked the nearby streets and found a more decent one for slightly more money.&lt;/p&gt;

&lt;p&gt;Bungalows on Koh Samui reportedly start at 300 baht; we lived in a small house for 600 baht a day ($19), 10 meters from the sea. Some time later, friends found one nearby for 800 baht with slightly better conditions. At the same time there are ones for $100-150 a night (almost the price of villas, though villas are not by the sea).&lt;/p&gt;

&lt;p&gt;Living by the sea is exotic and interesting for the first few days, but then you start to remember other comforts. We now live in a house for 11,000 baht ($353) a month with a pool, conveniently close to the sea, the market, and shops. Cleaning and a linen change once a week are included. On top of that we pay 500 baht ($15) for internet, plus electricity by the meter. Judging by what others say, we got a fairly cheap house.&lt;/p&gt;

&lt;p&gt;Prices vary greatly depending on the agreed length of stay. For example, if you decide to stay for a year, the price can drop by half. You can also negotiate extra services, for example a separate dedicated internet line.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Transport&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Getting here is expensive, but for orientation here are the prices we found. We searched online and went to a ticket office only to print the reservation; for the very same flights they quoted us prices 250-300 USD higher. You can catch tickets for 400 dollars one way, and from Moscow for 300 or less.&lt;/p&gt;

&lt;p&gt;Local costs depend on your needs. In Bangkok a Sky Train ride costs about 30 baht ($1), depending on distance. Taxis vary: some use meters (taxi meter), and their prices are very low, roughly 100 baht for the distance of a couple of metro stations. Others will ask 600 for the same distance.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;One nuance: the city has toll roads, which really are fast and comfortable, and passengers pay for them separately.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Taxi prices on Koh Samui are completely different. Meters are unheard of here, so on an island whose ring road is only 60 km long, taxi drivers may ask 400-600 baht (almost $20). There are shared route taxis, which are cheaper, but you still have to negotiate the price of every ride.&lt;/p&gt;

&lt;p&gt;Many people rent mopeds: about $100 a month or $30 a week, and you can also rent by the day. Fuel (gasoline) is paid separately. People say it comes to up to another 1000 baht a month ($30).&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Warning: read up on moped rental scams aimed at farangs (foreigners).&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Renting a car is fairly expensive: $600 a month. Few places rent by the day, but the service exists if you look for it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Food&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Thai food is tasty and spicy, but at some point you will want something more familiar. To start with, though, it makes sense to visit the food courts in shopping malls. One of our most vivid first impressions was the food court, or rather a whole food town, in the Siam World mall. Almost every dish you ever wanted to try after watching anime is here!&lt;/p&gt;

&lt;p&gt;The peculiarity is that you are expected to get a special card and load it with some amount of money. You then use it to pay for dishes and drinks right at the counters. After the visit you return the card and get the remaining balance back. Perhaps this is done for accounting, or perhaps so that the cooks don't handle money more than necessary. Many food court stalls won't serve you without a card, but if you are a couple of dozen baht short, you can add the difference in cash.&lt;/p&gt;

&lt;p&gt;One dish costs up to 100 baht ($3) on average, drinks 30-60 ($1-2). A bowl of soup or noodles is enough to fill you up.&lt;/p&gt;

&lt;p&gt;You can also eat properly on the street all over the country. Street food carts with small counters stand everywhere and cook everything imaginable: salads (40); corn (15-20); skewers with various kinds of meat, seafood, and offal (5-10-15); pancakes, soups, fish, and fruit cut on the spot. And, of course, rice (10 baht).&lt;/p&gt;

&lt;p&gt;During our first week on the island we ate almost nothing but this food. One meal was rice plus meat, fish, or seafood. It came to around 50 baht, and it was a full meal.&lt;/p&gt;

&lt;p&gt;There are big supermarkets. I won't list prices, but it feels like filling the trunk of a car with food cost us 100-150 USD more in Ukraine than it does here. On the other hand, some goods count as delicacies here and cost more, cheese for example. Shrimp, however, cost 180 baht per kg (~$6) here. In Ukraine the same ones cost 300 hryvnias (~$37). So far we don't miss anything familiar. Some products, such as buckwheat, are not available here, but I can live without them with great pleasure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Other expenses&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Laundry is done at laundromats, 30-40 baht per kg of clothes.&lt;/p&gt;

&lt;p&gt;Excursion prices vary. So far I have seen $30-50 per person.&lt;/p&gt;

&lt;p&gt;Apple hardware is cheaper than in Ukraine (and certainly cheaper than at the wretched Comfy stores).&lt;/p&gt;

&lt;p&gt;A phone costs about $30 a month, including 3G for messaging and sending photos (social network traffic).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Total&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So that the main figure doesn't get lost in the details: if you want to economize, there are people who manage to survive on $400 a month. But a comfortable amount is roughly $800-1000, and it doesn't grow much if you come with a family rather than alone.&lt;/p&gt;

&lt;h3&gt;Insurance&lt;/h3&gt;

&lt;p&gt;In a word, this is very important. I spent a lot of time studying the question. The sad fact is that Ukraine has no decent insurance companies for Asia. The happier one is that you can get insured with foreign ones.&lt;/p&gt;

&lt;p&gt;Very briefly, the scheme looks like this. On your side is the insurance company, which takes your money and issues the policy, but it is not a medical institution and does not even have contacts with any. There are companies that manage and sort out cases with medical institutions; they are called assistance companies ("assist"). They handle the financial and paper work with the medical providers, but you don't contact them directly. Yet it is these companies that determine which doctor you end up with. When an insured event happens, you contact your insurance company, or rather its local representative or, as a last resort, the call center in Moscow/Tver/Kyiv. They pass your case to the assistance company, which then tells the hospital "yes, go ahead and help". So the key factor is the quality and competence of the assistance company.&lt;/p&gt;

&lt;p&gt;According to numerous reviews, the best assistance company in Thailand and on Koh Samui is International SOS, but not a single insurance company in Ukraine works with it. The best you can count on is Mondial, and there are a huge number of complaints about it.&lt;/p&gt;

&lt;p&gt;The best terms go to megacorporations and oil companies, but we are neither their employees nor the children of their management. They even have assistance companies of their own. Rumor has it that International SOS is gradually moving to that model too, cutting the number of its contracts along the way. In Russia there seems to be only one insurance company left that still has a contract with it: SK "Soglasie".&lt;/p&gt;

&lt;p&gt;We agreed on the contract by phone and sent scans of the required documents. Then a person we trust stopped by the office in Moscow and paid for the service. Scans of the policies were sent to us by email; that is sufficient.&lt;/p&gt;

&lt;p&gt;The cost calculation has its quirks. We needed a long term, but the longer the term, the higher the coefficient. So we took insurance for 180 days and will take the next one 180 days later. A small child is charged a double rate. For a family of two adults and one child (who counts double) it came to $560. Divide by 4 to get the price for one person: $140.&lt;/p&gt;

&lt;h3&gt;Transport&lt;/h3&gt;

&lt;p&gt;Planning transport is the foundation of preparing for any independent trip. All the previous kinds of preparation come with experience, but transport has to be planned every single time before you leave the house.&lt;/p&gt;

&lt;p&gt;If Google Maps covers the place you are going to, consider 50% of the problems solved. In Bangkok, point-to-point routing works great for public transport as well as for cars. Google Maps even shows arrival and departure times of BTS trains.&lt;/p&gt;

&lt;p&gt;In Odessa I can't imagine living without 2gis for iPhone; I haven't found an English-language equivalent for Bangkok. But there are apps like Yandex.Metro with a map of the metro and the BTS.&lt;/p&gt;

&lt;p&gt;Transport in Bangkok really deserves a separate post, if there is interest. Since this post is about preparation, it is enough to note that the preparation took time, and as a result, armed with an iPad and an iPhone, I could find my way around better than the well-meaning Thai helpers. That said, it didn't save me from taking a wrong turn on the first day :)&lt;/p&gt;

&lt;h3&gt;Knowledge&lt;/h3&gt;

&lt;p&gt;There is now a huge number of blogs, posts by decent tourists and by boorish ones, advertising splogs, and garbage-dump forums where you can find information on almost any step. The main goal of studying it all is to reduce anxiety and steer clear of unnecessary problems. We spent our evenings studying everything we could. Of course, that won't protect you completely; worse, reading too much can leave you feeling that you have already experienced all the emotions you were hoping for. Fortunately, things may turn out to be even more interesting on the spot.&lt;/p&gt;

&lt;p&gt;I think another important part of acquiring knowledge is finding people you could turn to. There are many people here who have been in a similar situation themselves and are ready to help with advice and action. We are very grateful to &lt;a rel="nofollow" target="_blank" href="https://twitter.com/nafigator"&gt;Arseniy Kamyshev&lt;/a&gt;, who helped us find a bungalow and with other things.&lt;/p&gt;

&lt;h2&gt;Mindset&lt;/h2&gt;

&lt;blockquote&gt;&lt;p&gt;Before leaving, I met a good friend and told him we were going to move to Thailand. We sat and chatted in a cafe, he offered me a ride home, and on the way he picked up his mom after some classes. As a joke, my friend asked her how she would feel about him going to live in Thailand. After half a minute of digesting this she exploded and started explaining that a flood would soon come and wash away the whole planet, and only Ukraine would, for some damn reason, be spared. WTF?! What is in these people's heads?&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;For many, this is probably the hardest part of the preparation. It is hard to say what exactly will keep you from leaving. Both too much fear and too much self-confidence can do harm. So can dependence on other people's opinions, or shutting yourself off from contact.&lt;/p&gt;

&lt;p&gt;I think the main thing is to try, and thereby gradually get rid of all the rot that settles in your head from sitting in your own country for too long. For example, women's suitcases shrink very quickly: after a couple of trips to a hot climate they realize that on an evening stroll, in full makeup and an evening dress, they look spectacularly foolish. Things get damaged and broken on the road, and they can be stolen. Every extra suitcase is a safety problem if you misjudged the transport and ended up on an unfamiliar street with an extra kilometer to walk to the hotel.&lt;/p&gt;

&lt;p&gt;Lecturing the locals on how to live, and behaving like slave owners among lesser beings, is not a good idea either. Compared with the rest of the planet, Ukraine and Russia are the third world. We consume goods from Asia and culture and progress from the USA, while only a few percent of the country have ever seen the Bolshoi Theatre. When it comes to the ability to live and enjoy what is happening around you, these "countries for the sad" are deep in the Middle Ages and sinking into them ever more actively.&lt;/p&gt;

&lt;p&gt;It is better to tuck your ego away. And likewise don't show it to the Russian speakers who have come here; they may have left precisely to get away from that.&lt;/p&gt;

&lt;h2&gt;What's next&lt;/h2&gt;

&lt;p&gt;This story has turned out quite long as it is. So I won't describe how to set up your work here. Even without photos it is already a lot. I will try to add them next time.&lt;/p&gt;
</description>
         <guid isPermaLink="false">hhttp://www.vurt.ru/2012/05/month-in-thailand</guid>
         <pubDate>Fri, 11 May 2012 07:00:00 +0000</pubDate>
      </item>
      <item>
         <title>Automatic filtering in SQLAlchemy: motivation</title>
         <link>http://otkds.blogspot.com/2012/05/automatic-filtering-in-sqlalchemy.html</link>
         <description>&lt;div&gt;Server-side code of a web project usually has three layers:&lt;br /&gt;
&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;data classes mapped to relational database,&lt;/li&gt;
&lt;li&gt;request handlers for each URL pattern,&lt;/li&gt;
&lt;li&gt;templates used to render pages.&lt;/li&gt;
&lt;/ul&gt;&lt;br /&gt;
Simple request handlers contain code like the following:&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;code class="python"&gt;item = session.query(Entry).get(item_id)&lt;/code&gt;&lt;/pre&gt;or&lt;br /&gt;
&lt;pre&gt;&lt;code class="python"&gt;items = session.query(Entry)[:limit]&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;
When the &lt;code&gt;Entry&lt;/code&gt; class has a &lt;code&gt;public&lt;/code&gt; attribute and objects should be shown only when &lt;code&gt;Entry.public&lt;/code&gt; is &lt;code&gt;True&lt;/code&gt; (the simplest example of a publicity condition; in real life it might be composite and even involve related tables), we have to include this condition in queries:&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;code class="python"&gt;item = session.query(Entry).filter_by(public=True, id=item_id).scalar()&lt;/code&gt;&lt;/pre&gt;or&lt;br /&gt;
&lt;pre&gt;&lt;code class="python"&gt;items = session.query(Entry).filter_by(public=True)[:limit]&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;
Note that we already violate the DRY principle (the same condition has to be used every time we query Entry), but that is not yet a problem. Now let’s add a relation to some Child class that has a similar publicity condition. If we pass only item or items to the template, we have to be careful when using their data:&lt;br /&gt;
&lt;br /&gt;
&lt;pre&gt;&lt;code class="jinja"&gt;{% for child in item.children %}…{% endfor %}&lt;/code&gt;&lt;/pre&gt;must be replaced with&lt;br /&gt;
&lt;pre&gt;&lt;code class="jinja"&gt;{% for child in item.children %}
{% if child.public %}…{% endif %}
{% endfor %}&lt;/code&gt;&lt;/pre&gt;&lt;br /&gt;
In real life it becomes even more complex: a simple test for an empty list is no longer so simple. Do we have other options? Yes, we can pass each relation as a separate variable and move the filtering to the code. This prevents a mess in templates, but it won’t prevent us from using relations directly by mistake. Do you think this shouldn’t happen? We are lazy, and I doubt anybody will define a separate variable for a relation that doesn’t have a publicity condition (yet). But life changes, and eventually we might need this condition. Now one developer adds a new field to the database and changes all related request handlers and (if he is a responsible person) even the templates. Simultaneously (or even later, since people remember code patterns they have often used) another person adds unfiltered usage of this relation in some other place, and we have unpublished data leaked to the public. International scandal, World War III begins (joke).&lt;br /&gt;
&lt;/div&gt;&lt;br /&gt;
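&lt;div&gt;To make the risk concrete, here is a tiny sketch that uses plain Python objects as stand-ins for mapped classes (no SQLAlchemy involved). The helper public_only and the sample data are hypothetical. The filtered list is passed as a separate variable, yet the unfiltered relation stays reachable through item:&lt;br /&gt;
&lt;br /&gt;
```python
from types import SimpleNamespace

# Stand-ins for mapped objects; a real handler would load them via the ORM.
item = SimpleNamespace(title='Entry', children=[
    SimpleNamespace(title='visible', public=True),
    SimpleNamespace(title='draft', public=False),
])


def public_only(objects):
    """Keep only the objects whose public attribute is true."""
    return [obj for obj in objects if obj.public]


# The handler passes the filtered relation as a separate variable...
context = {'item': item, 'children': public_only(item.children)}

# ...but the unfiltered relation is still reachable through item.
print([child.title for child in context['children']])
print([child.title for child in context['item'].children])
```
&lt;/div&gt;&lt;br /&gt;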
&lt;div&gt;In fact, template developers shouldn’t have to care about the publicity of data. Unpublished data must not reach templates.&lt;br /&gt;
Constructing data structures specially for templates leads to verbose request handler code instead of a concise single line:&lt;br /&gt;
&lt;pre&gt;&lt;code class="python"&gt;item = session.query(Entry).filter_by(public=True, id=item_id).scalar()
# Copy only the fields the template needs from the (public) entry.
data = {'id': item.id,
        'title': item.title,
        'date': item.date,
        'body': item.body}
data['children'] = children = []
for child in item.children:
    # Skip unpublished children so they never reach the template.
    if not child.public:
        continue
    child_data = {'id': child.id,
                  'title': child.title,
                  'data': child.data,
                  'body': child.body}
    # Related objects need the same publicity check at every level.
    if child.author and child.author.public:
        child_data['author'] = author = {'id': child.author.id,
                                         'name': child.author.name}
        if child.author.company and child.author.company.public:
            author['company'] = {'id': child.author.company.id,
                                 'title': child.author.company.title}&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Here are some statistics from one big project that I’m involved in developing. The numbers below cover the public segment only (internal services like the editor interface are not included).&lt;br /&gt;
&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;458 templates&lt;/li&gt;
&lt;li&gt;6 databases with 210 tables&lt;/li&gt;
&lt;li&gt;135 mapped classes, 5 of them are bases for inheritance trees&lt;/li&gt;
&lt;li&gt;Data for 63 mapped classes must not go public unless some condition is met (for 15 of them, indirectly through inheritance). These are only the conditions that can’t be applied when replicating data from the internal segment to the public one without a significant impact on performance (changing the state field of a parent object would trigger publication or deletion of a huge list of children; using a publication time in the future requires a scheduler to trigger publication); the rest is filtered out before it reaches the databases for the public sites.&lt;/li&gt;
&lt;/ul&gt;&lt;br /&gt;
&lt;/div&gt;&lt;br /&gt;
&lt;div&gt;Given that we can’t change the behavior of relations in the request handler (this would break the ORM’s rule of a single object for each identity), I see the following two ways to solve the problem:&lt;br /&gt;
&lt;br /&gt;
&lt;ul&gt;&lt;li&gt;define separate mapped classes for public site,&lt;/li&gt;
&lt;li&gt;instruct session to filter all ORM queries.&lt;/li&gt;
&lt;/ul&gt;&lt;br /&gt;
Both ways have problems and require separate analysis.&lt;br /&gt;
&lt;/div&gt;</description>
         <author>Denis Otkidach</author>
         <guid isPermaLink="false">tag:blogger.com,1999:blog-5235863953075762128.post-1592014513354506085</guid>
         <pubDate>Thu, 03 May 2012 19:33:00 +0000</pubDate>
      </item>
      <item>
         <title>The Two-Way Conference (MozCamp and more)</title>
         <link>http://www.blueskyonmars.com/2012/04/30/the-two-way-conference-mozcamp-and-more/</link>
         <description>A week ago, I had the good fortune of attending and speaking at MozCamp Latin America in Buenos Aires, Argentina. I really enjoyed meeting a whole bunch of new people and appreciated the chance to talk about the Firefox developer tools shipping today and in the near future. The organizers clearly put a lot of [...]</description>
         <guid isPermaLink="false">http://www.blueskyonmars.com/?p=2906</guid>
         <pubDate>Mon, 30 Apr 2012 19:34:50 +0000</pubDate>
         <content:encoded><![CDATA[<p><a rel="nofollow" target="_blank" href="http://www.blueskyonmars.com/images/2012/04/2012-04-21-08.55.36.jpg"><img class="aligncenter size-large wp-image-2910" title="2012-04-21 08.55.36" src="http://www.blueskyonmars.com/images/2012/04/2012-04-21-08.55.36-1024x768.jpg" alt="" width="500" height="375"/></a><br />
A week ago, I had the good fortune of attending and speaking at <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/MozCampLATAM2012">MozCamp Latin America</a> in Buenos Aires, Argentina. I really enjoyed meeting a whole bunch of new people and appreciated the chance to talk about the Firefox developer tools shipping today and in the near future. The organizers clearly put a lot of effort into getting this conference together (thanks!)</p>
<p>This MozCamp was filled with excitement about <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/B2G">B2G </a>and the <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/Kilimanjaro">many other initiatives</a> Mozilla has going on. Beyond the product building we&#8217;re doing, there was a lot of energy and enthusiasm for growing the Mozilla community and building on the ideals that <a rel="nofollow" target="_blank" href="http://www.mozilla.org/about/manifesto.html">Mozilla stands for.</a></p>
<p>During MozCamp, I spoke with a few people about conferences in general. I think there&#8217;s a lot of room to make MozCamp and other conferences better than what we&#8217;re used to. These ideas are not new and didn&#8217;t originate with me, but they&#8217;re worth repeating.</p>
<h2>The Format Today</h2>
<p>MozCamp was structured like most other conferences that I&#8217;ve been to: a packed schedule with multiple tracks of presentations. <strong>You get interesting people presenting on useful topics.</strong> That&#8217;s not a bad thing, but I think it can be better.</p>
<p>If I&#8217;m going to deal with the hassle of air travel and spend days away from my family, I&#8217;d like to get the most I can out of that time.</p>
<p>A typical presentation slot is 30, 45 or 60 minutes. Of that, there&#8217;s maybe 10 minutes of questions and the rest is an &#8220;eyes-forward&#8221; presentation. I don&#8217;t think this is the best use of time. The unique thing about MozCamp (or any conference, for that matter), is that <strong>I&#8217;m physically there with the other people</strong>. The communications bandwidth is <em>much</em> higher. To use that bandwidth for one-way communication seems suboptimal.</p>
<p>There were other issues that I noticed as well:</p>
<ol>
<li><strong>Attendees at MozCamp had varying levels of English proficiency</strong>. This can make it hard for some to keep up with eyes-forward presentations from native English speakers. Plus, a <em>whole day</em> of constantly translating in your mind can get tiring&#8230; by the time the second day rolls around (after a possibly late night <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/MozCampLATAM2012/Play_Futbol_with_the_CEO">followed by soccer</a> in the morning!), I would imagine that keeping focused would be difficult.</li>
<li><strong>No slack time for checking email and having hallway conversations</strong>. The schedule was packed, leaving mealtime and snack time as the only times to talk (short of skipping sessions). Some of the evening activities suffered from <a rel="nofollow" target="_blank" href="http://www.kevindangoor.com/2012/04/its-not-the-booze-its-the-noise/">high noise levels</a> as well, eliminating that chance to talk easily.</li>
<li><strong>No slack time also causes issues with schedule slip</strong>. On Sunday, the Q&amp;A session threw the rest of the day off by 30 minutes. That&#8217;s not surprising, since it was an interesting two-way communication sort of session&#8230; but it meant that the rest of the schedule needed to be pushed by half an hour and never got back on track (causing some sessions at the end of the day to be dropped).</li>
</ol>
<h2>The Two-Way Conference</h2>
<p>I think conferences, including MozCamp, should try to become more &#8220;two-way&#8221;. One formula for a session could go something like this:</p>
<ul>
<li>&#8220;speakers&#8221; are more like &#8220;hosts&#8221; or &#8220;invited experts&#8221;.</li>
<li>The expert prepares a <strong>page with links to background material and probably a presentation</strong>. This page should be available a couple weeks <strong><em>in advance</em></strong> of the event</li>
<li>That page can also include some <strong>suggestions for areas that could benefit from discussion</strong>.</li>
<li>That page could also have an etherpad or wiki associated with it to collect more ideas in advance (as attendees view the material).</li>
<li>At the beginning of the session, the <strong>expert provides a lightning talk-sized intro</strong> and, possibly with the help of a facilitator, <strong>gets people organized</strong> to usefully talk about things or work on something</li>
</ul>
<p>A parallel is the recent talk of <a rel="nofollow" target="_blank" href="http://www.economist.com/node/21529062">&#8220;flipping the classroom&#8221;</a>. The students <a rel="nofollow" target="_blank" href="http://www.khanacademy.org/">watch a video</a> outside of class and then use class time to work together or get help from the teacher.</p>
<p>Wouldn&#8217;t it be awesome if, instead of 50 minutes of eyes-forward presentation followed by questions, we had 5-10 minutes of organizing, level-setting and topic choosing, followed by 50 minutes of two-way communication?</p>
<p>Some <a rel="nofollow" target="_blank" href="http://en.wikipedia.org/wiki/Unconference">unconferences</a> go so far as to not even have predefined topics and time slots. I&#8217;m not going that far. I think that a little bit of structure with some constraints on time can help make the most of the time. Additionally, I have heard from some conference organizers that you can&#8217;t even get some companies to sponsor sending people to a conference without presentations from industry experts.</p>
<p><a rel="nofollow" target="_blank" href="http://jboriss.wordpress.com/">Boriss</a> had a user experience workshop that followed a good format: she did a few minutes of intro followed by demonstrations of applying her UX research suggestions to projects that people were working on. <a rel="nofollow" target="_blank" href="http://dailycavalier.com/">William Reynolds</a> told me that he also ran a workshop session on the topic of getting people involved with Mozilla. Individuals can take matters into their own hands this way, and I wish I had done so myself (there&#8217;s always a next time!). I&#8217;d like to see conferences that encourage and support this even more.</p>
<p>How can this format help with the problems I talked about?</p>
<ol>
<li>the sessions spend <strong>most of their time in a two-way exchange</strong> between the &#8220;expert&#8221; and the participants, thus making better use of the bandwidth</li>
<li>when communication is two-way, there&#8217;s <strong>more opportunity to overcome language issues</strong> than when you have a presenter following a predefined outline. In fact, sessions that involve group discussion could possibly take place in the group&#8217;s native language. (Of course, if <em>everyone</em> involved speaks the same language, then there&#8217;s no problem. Some of the MozCamp sessions were held in Spanish&#8230; of course, that left me, and some of the Brazilians, out.)</li>
<li>More of the stuff that might get discussed on the &#8220;hallway track&#8221; can now get discussed by more people during sessions</li>
<li>Ideally, there would be a bit more buffer time to handle schedule slippage and sessions that are so good that people just don&#8217;t want to stop</li>
</ol>
<p>I still find conferences valuable and will continue to attend them, but I think we can do better.</p>]]></content:encoded>
         <category>Mozilla</category>
      </item>
      <item>
         <title>Playing with the Ninja build system</title>
         <link>http://feedproxy.google.com/~r/CoderWhoSaysPy/~3/bkE9DKXsA6Q/playing-with-ninja-build-system.html</link>
         <description>Whenever I learn a new programming language I end up writing some&lt;a rel="nofollow" target="_blank" href="http://code.google.com/p/bcannon/source/browse/#hg%2Flanguages"&gt; toy examples&lt;/a&gt; to try to get a feel for what the language is about. This leads to the need to build code using many different compilers with their own flags, quirks, etc. Up until today I had used SCons for my build setup. But honestly, it always seemed like overkill to me. Because I only had about 5 programs to build per language with at most two files used to produce the program, a full-blown build system was never really needed. Add to the fact that I am building for languages that no build system would have built-in support for, it led me to always have a wandering eye for another build system I could use.&lt;br /&gt;
&lt;br /&gt;
This past week someone on Google+ &amp;nbsp;shared a &lt;a rel="nofollow" target="_blank" href="https://plus.google.com/108996039294665965197/posts/SfhrFAhRyyd"&gt;post comparing configure+make, cmake+make, and cmake+ninja&lt;/a&gt;. I had never heard of &lt;a rel="nofollow" target="_blank" href="http://martine.github.com/ninja/"&gt;Ninja&lt;/a&gt;, so I decided to have a look. It turns out someone had written a build tool whose only explicit job was to take a DAG, figure out what needed to be built, and then execute the commands for the build. No crazy metadata checks like Make, or fanciful features, just bare-bones building. Ninja was actually designed to be a target for other higher-level build systems like cmake which can do the pre-computation of what the DAG should be, leaving it to Ninja to drive the needed compilation.&lt;br /&gt;
&lt;br /&gt;
What attracted me to it was that it was fast and the syntax was simple. I have code examples for 16 languages, of which 10 have build rules (one happens to be Python 2.7 as I pre-compile the .pyo files). Turned out to be a pretty straight-forward process to take my custom SCons commands and just translate them to the subsequent shell commands that Ninja would execute for me. They are a tad verbose in order to make sure that the &lt;span style="font-family:Courier, monospace;"&gt;ninja -t clean &lt;/span&gt;&lt;span style="font-family:Times, serif;"&gt;command would clean up all intermediary files (I'm looking at you OCaml, Haskell, Java, and Scala). But as I said, I typically never have more than 5 programs to build per language, so it wasn't that much of a burden. And if I really cared I could have written a Python script to auto-generate the Ninja files for me, but I decided the effort of writing the code would be just as much as writing the build files by hand.&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family:Times, serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family:Times, serif;"&gt;I realize I could have used Make, but I honestly am not&amp;nbsp;enamoured&amp;nbsp;with that tool; requiring tabs just rubs me the wrong way. Plus it's rather slow in the common case of only changing a file or two compared to a complete build from scratch.&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family:Times, serif;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;br /&gt;
&lt;span style="font-family:Times, serif;"&gt;Overall, for my weird case Ninja worked out. For something more complex, though, I will consider looking at cmake+ninja as a build solution.&lt;/span&gt;</description>
         <author>Brett Cannon</author>
         <guid isPermaLink="false">tag:blogger.com,1999:blog-20144447.post-8878728228228327079</guid>
         <pubDate>Sat, 28 Apr 2012 20:21:00 +0000</pubDate>
      </item>
      <item>
         <title>Why I don't *really* practice open science</title>
         <link>http://ivory.idyll.org/blog/apr-12/blog-practicing-open-science</link>
         <description>&lt;div class="document"&gt;
&lt;p&gt;I'm a pretty big advocate of anything open -- open source, open
access, and open science, in particular.  I always have been.  And now
that I'm a professor, I've been trying to figure out how to actually
&lt;em&gt;practice&lt;/em&gt; open science effectively.&lt;/p&gt;
&lt;p&gt;What is open science?  Well, I think of it as talking regularly about
my unpublished research on the Internet, generally in my blog or on
some other persistent, explicitly public forum.  It should be done
regularly, and it should be done with a certain amount of depth or
self-reflection.  (See, for example, the wunnerful &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.nature.com/news/2011/110809/full/news.2011.469.html"&gt;Rosie Redfield&lt;/a&gt;
and &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.nature.com/news/2011/110809/full/news.2011.469.html"&gt;Nature's commentary&lt;/a&gt;
on her blogging of the arsenic debacle &amp;amp; tests thereof.)&lt;/p&gt;
&lt;p&gt;Most of my cool, sexy bloggable work is in bioinformatics; I do have a
wet lab, and we're starting to get some neat stuff out of that
(incl. both some ascidian evo-devo and some chick transcriptomics) but
that's not as mature as the computational stuff I'm doing.  And, as
you know if you've seen any of my recent posts on this, I'm pretty
bullish about the computational work we've been doing: the de novo
assembly sequence foo is, frankly, super awesome and seems to solve
most of the scaling problems we face in short-read assembly.  And it
provides a path to solving the problems that it doesn't outright
&lt;em&gt;solve&lt;/em&gt;.  (I'm talking about &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/kmer-percolation-posted.html"&gt;partitioning&lt;/a&gt;
and &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/diginorm-paper-posted.html"&gt;digital normalization&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;While I think we're doing awesome work, I've been uncharacteristically
(for me) shy about proselytizing it prior to having papers ready.  I
occasionally reference it on mailing lists, or in blog posts, or on
twitter, but the most I've talked about the details has been in talks
-- and I've rarely posted those talks online.  When I have, I don't
point out the nifty awesomeness in the talks, either, which of course
means it goes mostly unnoticed.  This seems to be at odds with my
oft-loudly stated position that open-everything is the way to go.
What's going on??  That's what this blog post is about.  I think it
sheds some interesting light on how science is actually practiced, and
why completely open science might waste a lot of people's time.&lt;/p&gt;
&lt;p&gt;I'd like to dedicate this blog post to &lt;a rel="nofollow" class="reference" target="_blank" href="http://third-bit.com/"&gt;Greg Wilson&lt;/a&gt;.  He and I chat irregularly about research,
and he's always seemed interested in what I'm doing but is stymied
because I don't talk about it much in public venues.  And he's been a
bit curious about why.  Which made me curious about why.  Which led to
this blog post, explaining why I think why.  (I've had it written for
a few months, but was waiting until I posted diginorm.)&lt;/p&gt;
&lt;hr class="docutils"/&gt;
&lt;p&gt;For the past two years or so, I've been unusually focused on the
problem of putting together vast amounts of data -- the problem of de
novo assembly of short-read sequences.  This is because I work on
unusual critters -- soil microbes &amp;amp; non-model animals -- that nobody
has sequenced before, and so we can't make use of prior work.  We're
working in two fields primarily, metagenomics (sampling populations of
wild microbes) and mRNAseq (quantitative sequencing of transcriptomes,
mostly from non-model organisms).&lt;/p&gt;
&lt;p&gt;The problems in this area are manifold, but basically boil down to two
linked issues: vast underlying diversity, and dealing with the even
vaster amounts of sequence necessary to thoroughly sample this
diversity.  There's lots of biology motivating this, but the
computational issues are, to first order, dominant: we can generate
more sequence than we can assemble.  This is the problem that
we've basically solved.&lt;/p&gt;
&lt;p&gt;A rough timeline of our work is:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;mid/late 2009: Likit, a graduate student in my lab, points out that
we're getting way better gene models from assembly of chick mRNAseq
than from reference-based approaches.  Motivates interest in assembly.&lt;/li&gt;
&lt;li&gt;mid/late 2009: our lamprey collaborators deliver vast amounts of lamprey
mRNAseq to us.  Reference genome sucks.  Motivates interest in assembly.&lt;/li&gt;
&lt;li&gt;mid/late 2009: the JGI starts delivering a ridiculous amount of soil
sequencing data to us (specifically, Adina).  We do everything
possible to avoid assembly.&lt;/li&gt;
&lt;li&gt;early 2010: we realize that the least insane approach to analyzing
soil sequencing data relies on assembly.&lt;/li&gt;
&lt;li&gt;early 2010: Qingpeng, a graduate student, convinces me that
existing software for counting k-mers (tallymer, specifically)
doesn't scale to samples with 20 or 30 billion unique k-mers.  (He
does this by repeatedly crashing our lab servers.)&lt;/li&gt;
&lt;li&gt;mid-2010: a computational cabal within the lab (Jason, Adina, Rose)
figures out how to count k-mers really efficiently, using a
CountMin Sketch data structure (which we reinvent, BTW, but
eventually figure out isn't novel.  o well).  We implement this in
khmer.  (see &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/jul-10/kmer-filtering"&gt;k-mer filtering&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;mid-2010: We use khmer to figure out just how much Illumina
sequence sucks.  (see &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/jul-10/illumina-read-phenomenology"&gt;Illumina read phenomenology&lt;/a&gt;)&lt;/li&gt;
&lt;li&gt;mid-2010: Arend joins our computational cabal, bringing detailed
and random knowledge of graph theory with him.  We invent an
&lt;em&gt;actually&lt;/em&gt; novel use of Bloom filters for storing de Bruijn graphs.
(&lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/kmer-percolation-posted.html"&gt;blog post&lt;/a&gt;)
The idea of partitioning large metagenomic data sets into
(disconnected) components is born.  (Not novel, as it turns out --
see &lt;a rel="nofollow" class="reference" target="_blank" href="http://metavelvet.dna.bio.keio.ac.jp/"&gt;MetaVelvet&lt;/a&gt; and &lt;a rel="nofollow" class="reference" target="_blank" href="http://bioinformatics.oxfordjournals.org/content/27/13/i94.abstract"&gt;MetaIDBA&lt;/a&gt;.)&lt;/li&gt;
&lt;li&gt;late 2010: Adina and Rose figure out that Illumina suckage prevents
us from actually getting this to work.&lt;/li&gt;
&lt;li&gt;first half of 2011: Spent figuring out capacity of de Bruijn graph
representation (Jason/Arend) and the right parameters to actually
de-suckify large Illumina data sets (Adina).  We slowly progress
towards actually being able to partition large metagenomic data
sets reliably.  A friend browbeats me into applying the same
technique to his ugly genomic data set, which magically seems to
solve his assembly problems.&lt;/li&gt;
&lt;li&gt;fall 2011: the idea of digital normalization is born: throwing away
redundant data FTW. Early results are very promising (we throw away
95% of data, get identical assembly) but it doesn't scale assembly
as well as I'd hoped.&lt;/li&gt;
&lt;li&gt;October 2011: JGI talk at the &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.youtube.com/watch?v=0Oon5viKMmA&amp;amp;list=PL29441D81BD645568&amp;amp;index=8&amp;amp;feature=plpp_video"&gt;metagenome informatics workshop - SLYT&lt;/a&gt;, where
we present our ideas of partitioning and digital normalization,
together, for the first time.  We point out that this potentially
solves all the scaling problems.&lt;/li&gt;
&lt;li&gt;November 2011: We figure out the right parameters for digital
normalization, turning up the awesomeness level dramatically.&lt;/li&gt;
&lt;li&gt;through present: focus on actually writing this stuff up.  See:
&lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/kmer-percolation-posted.html"&gt;de Bruijn graph preprint&lt;/a&gt;; &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/diginorm-paper-posted.html"&gt;digital normalization preprint&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
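&lt;p&gt;For readers who haven't met the CountMin Sketch mentioned in the timeline, here is a minimal illustrative sketch of counting k-mers with one. It is not khmer's implementation: the class name, the table sizes, and the use of salted BLAKE2 hashes are all assumptions made for this example.&lt;/p&gt;

```python
import hashlib


class CountMinSketch:
    """Approximate counter: fixed memory, may overcount, never undercounts."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        # One row of counters per hash function.
        self.tables = [[0] * width for _ in range(depth)]

    def _slots(self, item):
        # Derive one column per row by salting the hash with the row number.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                item.encode(), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, item):
        for row, col in self._slots(item):
            self.tables[row][col] += 1

    def count(self, item):
        # Collisions can only inflate a counter, so the minimum across rows
        # is the tightest available estimate of the true count.
        return min(self.tables[row][col] for row, col in self._slots(item))


def kmers(seq, k):
    """Yield every overlapping substring of length k."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


sketch = CountMinSketch()
for kmer in kmers("ACGTACGTAC", 4):
    sketch.add(kmer)
print(sketch.count("ACGT"))  # an estimate, never below the true count
```

&lt;p&gt;Memory use is fixed by width and depth no matter how many distinct k-mers are seen; the price is that counts can be overestimated when k-mers collide.&lt;/p&gt;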
&lt;hr class="docutils"/&gt;
&lt;p&gt;If you read this timeline (yeah, I know it's long, just skim) and look
at the dates of &amp;quot;public disclosure&amp;quot;, there's a 12-14 month gap between
talking about k-mer counting (July 2010) and partitioning/etc (Oct
2011, metagenome informatics talk).  And then there's another
several-month gap before I really talk about digital normalization as
a good solution (basically, mid/late January 2012).&lt;/p&gt;
&lt;p&gt;Why??&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;I was really freakin' busy actually getting the stuff to work, not
to mention teaching, traveling, and every now and then actually
being at home.&lt;/li&gt;
&lt;li&gt;I was definitely worried about &amp;quot;theft&amp;quot; of ideas.  Looking back,
this seems a mite ridiculous, but: I'm junior faculty in a
fast-moving field.  Eeek!  I also have a duty to my grads and
postdocs to get &lt;em&gt;them&lt;/em&gt; published, which wouldn't be helped by being
&amp;quot;scooped&amp;quot;.&lt;/li&gt;
&lt;li&gt;We kept on coming up with new solutions and approaches!  Digital
normalization didn't exist until August 2011, for example;
appropriate de-suckifying of Illumina data took until April or May
of 2011; and proving that it all worked was, frankly, quite tough
and took until October.  (More on this below.)&lt;/li&gt;
&lt;li&gt;The code wasn't ready to use, and we hadn't worked out all the
right parameters, and I wasn't ready to do the support necessary to
address lots of people using the software.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;All of these things meant I didn't talk about things openly on my blog.
Is this me falling short of &amp;quot;open science&amp;quot; ideals??&lt;/p&gt;
&lt;p&gt;In my defense, on the &amp;quot;open science&amp;quot; side:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;I gave plenty of invited talks in this period, including a few (one
at JGI and one at UMD CBCB) attended by experts who certainly
understood everything I was saying, probably better than me.&lt;/li&gt;
&lt;li&gt;I posted some of these talks on &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.slideshare.net/c.titus.brown/"&gt;slideshare&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;all of our software development has been done on github, under
github.com/ctb/khmer/.  It's all open source, available, etc.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;...but these are sad excuses for open science.  None of these
activities really disseminated my research openly.  Why?&lt;/p&gt;
&lt;p&gt;Well, invited talks by junior faculty like me are largely attended out
of curiosity and habit, rather than out of a burning desire to
understand what they're doing; odds are, the faculty in question
hasn't done anything particularly neat, because if they had, they'd be
well known/senior, right?  And who the heck goes
through other people's random presentations on slideshare?  So that's
not really dissemination, especially when the talks are given to an in
group already.&lt;/p&gt;
&lt;p&gt;What about the source code?  The &amp;quot;but all my source code is available&amp;quot;
dodge is particularly pernicious.  Nobody, but nobody, looks at other
people's source code in science, unless it's (a) been released, (b)
been documented, and (c) claims to solve YOUR EXACT ACTUAL PROBLEM
RIGHT NOW RIGHT NOW.  The idea that someone is going to come along and
swoop your awesome solution out of your repository seems to me to be
ridiculous; &lt;strong&gt;you'd be lucky to be that relevant, frankly.&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;So I don't think any of that is a good way to disseminate what you've
done.  It's necessary for science, but it's not at all sufficient.&lt;/p&gt;
&lt;p&gt;--&lt;/p&gt;
&lt;p&gt;What do I think &lt;em&gt;is&lt;/em&gt; sufficient for dissemination?  In my case, how do
you build solutions and write software that &lt;em&gt;actually has an impact&lt;/em&gt;,
either on the way people think or (even better) on actual practice?
And is it compatible with open science?&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Write effective solutions to common problems.  The code doesn't
have to be pretty or even work all that well, but it needs to work
well enough to run and solve a common problem.&lt;/li&gt;
&lt;li&gt;Make your software available.  Duh.  It doesn't have to be open
source, as far as I can tell; I think it should be, but plenty
of people have restrictive licenses on their code and software,
and it gets used.&lt;/li&gt;
&lt;li&gt;Write about it in an open setting.  Blogs and mailing lists are ok;
SeqAnswers is probably a good place for my field; but honestly,
you've got to write it all down in a nice, coherent, well-thought
out body of text.  And if you're doing that?  You might as well
publish it.  Here is where Open Access really helps, because The
Google will make it possible for people to find it, read it, and
then go out and find your software.&lt;/li&gt;
&lt;/ol&gt;
&lt;p&gt;The interesting thing about this list is that in addition to all the
less-than-salutary reasons (given above) for not blogging more
regularly about our stuff, I had one &lt;em&gt;very&lt;/em&gt; good reason for not doing
so.&lt;/p&gt;
&lt;p&gt;It's a combination of #1 and #3.&lt;/p&gt;
&lt;p&gt;You see, &lt;strong&gt;until near to the metagenome informatics meeting, I didn't
know if partitioning or digital normalization really worked&lt;/strong&gt;.  We had
really good indications that partitioning worked, but it was never
solid enough for me to push it strongly as an &lt;em&gt;actual&lt;/em&gt; solution to big
data problems.  And digital normalization made so much sense that it
almost &lt;em&gt;had&lt;/em&gt; to work, but, um, proving it was a different problem.
Only in October did we do a bunch of cross-validation that basically
convinced me that partitioning worked &lt;em&gt;really&lt;/em&gt; well, and only in
November did we figure out how awesome digital normalization was.&lt;/p&gt;
&lt;p&gt;So we thought we had solutions, but we weren't sure they were
effective, and we sure didn't have it neatly wrapped in a bow for
other people to use.  So #1 wasn't satisfied.&lt;/p&gt;
&lt;p&gt;And, once we did have it working, we started to put a lot of energy
into demonstrating that it worked and writing it up for publication --
#3 -- which took a few months.&lt;/p&gt;
&lt;p&gt;In fact, I would actually argue that before October 2011, we could
have wasted people's time by pushing our solutions out for general use
when we basically didn't know if they worked well.  Again, we
&lt;em&gt;thought&lt;/em&gt; they did, but we didn't really know.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;This is a conundrum for open science: how do you know that someone
else's work is worth your attention?&lt;/strong&gt; Research is really hard, and it
may take months or years to nail down all the details; do you really
want to invest significant time or effort in someone else's research
before that's done?  And when they are done -- well, that's when they
submit it for publication, so you might as well just read that first!&lt;/p&gt;
&lt;p&gt;--&lt;/p&gt;
&lt;p&gt;This is basically the format for open science I'm evolving.  I'll blog
as I see fit, I'll post code and interact with people that I know who
need solutions, but I will wait until we have written a paper to
really open up about what we're doing.  A big part of that is trying
to only push solid science, such that I don't waste other people's
time, energy, and attention.&lt;/p&gt;
&lt;p&gt;So: I'm planning to continue to post all my senior-author papers to
arXiv just before their first submission.  The papers will come with
open source and the full set of data necessary to recapitulate our
results.  And I'll blog about the papers, and the code, and the work,
and try to convince people that it's nifty and awesome and solves some
useful problems, or addresses cool science.  But I don't see much
point in broadly discussing my stuff before a preprint is available.&lt;/p&gt;
&lt;p&gt;Is this open science?  I don't really think so.  I'd really like to
talk more openly about our actual research, but for all the reasons
above, it doesn't seem like a good idea.  So I'll stick to trying to
give presentations on our stuff at conferences, and maybe posting the
presentations to slideshare when I think of it, and interacting with
people privately where I can understand what problems they're running
into.&lt;/p&gt;
&lt;p&gt;What I'm doing is more about &lt;em&gt;open access&lt;/em&gt; than open science: people
won't find out details of our work until I think it's ready for
publication, but they also won't have to wait for the review process
to finish.  While I'm not a huge fan of the way peer review is done, I
accept it as a necessary evil for getting my papers into a journal.
By the time I submit a paper, I'll be prepared to argue, confidently
and with actual evidence, that the approach is sound.  If the
reviewers disagree with me and find an actual mistake, I'll fix the
paper and apologize profusely &amp;amp; publicly; if reviewers just want more
experiments done to round out the story, I'll do 'em, but it's easy to
argue that additional experiments generally don't &lt;em&gt;detract&lt;/em&gt; from the
paper unless they discover flaws (see above, re &amp;quot;apologize&amp;quot;).  The
main thing reviewers seem to care about is softening grandiose claims,
anyway; this can be dealt with by (a) not making them and (b) sending
to impact-oblivious journals like PLoS One.  I see no problem with
posting the paper, in any of these circumstances.&lt;/p&gt;
&lt;p&gt;Maybe I'm wrong; experience will tell if this is a good idea.  It'll
be interesting to see where I am once we get these papers out... which
may take a year or two, given all the stuff we are writing up.&lt;/p&gt;
&lt;p&gt;I've also come to realize that most people don't have the time or
(mental) energy to spare to really come to grips with other people's
research.  We were doing some pretty weird stuff (sketch graph
representations? streaming sketch algorithms for throwing away data?),
and I don't have a prior body of work in this area; most people
probably wouldn't be able to guess at whether I was a quack without
really reading through my code and presentations, and understanding it
in depth.  That takes a &lt;em&gt;lot&lt;/em&gt; of effort.  And most people
don't really understand the underlying issues anyway; those who do
probably care about them sufficiently to have their own research ideas
and are pursuing them instead, and don't have time to understand mine.
The rest just want a solution that runs and isn't obviously wrong.&lt;/p&gt;
&lt;p&gt;In the medium term, the best I can hope for is that preprints and blog
posts will spur people to either use our software and approaches, or
that -- even better -- they will come up with nifty &lt;em&gt;new&lt;/em&gt; approaches
that solve the problems in some new way that I'd never have thought
of.  And then I can read &lt;em&gt;their&lt;/em&gt; work and build on &lt;em&gt;their&lt;/em&gt; ideas.
&lt;strong&gt;This is what we should strive for in science: the shortest
round trip between solid scientific inspiration in different labs.&lt;/strong&gt;
This does not necessarily mean open notebooks.&lt;/p&gt;
&lt;p&gt;Overall, it's been an interesting personal journey from &amp;quot;blind
optimism&amp;quot; about openness to a more, ahem, &amp;quot;nuanced&amp;quot; set of thoughts
(i.e., I was wrong before :).  I'd be interested to hear what other
people have to say... drop me a note or make a comment.&lt;/p&gt;
&lt;p&gt;--titus&lt;/p&gt;
&lt;p&gt;p.s. I recognize that it's too early to really defend the claim that
our stuff provides a broad set of solutions.  That's not up to me to
say, for one thing.  For another, it'll take years to prove out.  So
I'm really talking about the hypothetical situation where it &lt;em&gt;is&lt;/em&gt;
widely useful in practice, and how that intersects with open science
goals &amp;amp; practice.&lt;/p&gt;
&lt;/div&gt;</description>
         <guid isPermaLink="false">http://ivory.idyll.org/blog/2012/04/07/blog-practicing-open-science</guid>
         <pubDate>Sun, 08 Apr 2012 00:07:59 +0000</pubDate>
      </item>
      <item>
         <title>Big Data Biology - why efficiency matters</title>
         <link>http://ivory.idyll.org/blog/apr-12/big-data-biology-2</link>
         <description>&lt;div class="document"&gt;
&lt;p&gt;I'm going to pick on Mick Watson today.  (It's OK.  He's just a foil for
this discussion, and I hope he doesn't take it too personally.)&lt;/p&gt;
&lt;p&gt;Mick made the following comment on my earlier &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/big-data-biology"&gt;Big Data Biology blog post&lt;/a&gt;:&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;I do wonder whether there is just a bit too much hand wringing
about &amp;quot;big data&amp;quot;.&lt;/p&gt;
&lt;p&gt;For e.g., the rumen metagenomic data you mentioned above, I can
assemble using MetaVelvet on our server in less than a day
(admittedly it has 512Gb of RAM, but doesn't everyone?).  I can
count the 17mers in it using Jellyfish in a few hours.&lt;/p&gt;
&lt;p&gt;So I just set the processes running, two days later, I have my
analysis.  What's the problem?  Does it matter that you can do it
quicker?&lt;/p&gt;
&lt;p&gt;Big data doesn't really worry me.&lt;/p&gt;
&lt;p&gt;...&lt;/p&gt;
&lt;p&gt;I know I am being flippant, but really to me the challenge isn't
the data, it's the biology.  I don't care if it takes 2 hours, 2
days or 2 weeks to process the data.&lt;/p&gt;
&lt;p&gt;Improve your computing efficiency by 100x, I don't care; improve
your ability to extract biological information by 100x, then I'm
interested :)&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;He makes one very, very, very good point -- who cares if you can run
an analysis (whatever it is) and it doesn't provide any value?  The
end goal of &lt;em&gt;my&lt;/em&gt; sequencing analysis is to provide insight into
biological processes; I might as well just delete the data (an O(1)
&amp;quot;analysis&amp;quot; operation, if one with a big constant in front of it..)  if
the analysis isn't going to yield useful information.&lt;/p&gt;
&lt;p&gt;But he also seems to think that the speed and efficiency of analyses
don't matter for science.  And I don't just &lt;em&gt;think&lt;/em&gt; he's dead wrong,
I &lt;em&gt;know&lt;/em&gt; he's dead wrong.&lt;/p&gt;
&lt;p&gt;This is both an academic point and a practical point.  And, in fact,
an algorithmic point.&lt;/p&gt;
&lt;div class="section"&gt;
&lt;h1&gt;&lt;a rel="nofollow" id="the-academic-reason-why-efficient-computation-is-good-for-science" name="the-academic-reason-why-efficient-computation-is-good-for-science"&gt;The academic reason why efficient computation is good for science&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;The academic point is simple: our ability to do thorough exploratory
analysis of a large sequencing data set is limited by at least four
things.  These four things are:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ol class="arabic"&gt;
&lt;li&gt;&lt;p class="first"&gt;Our ability to do initial processing on the data - error trimming
and correction, and data summary (mapping and assembly, basically).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;The information available for cross-reference.  Most (99.9%) of
our bioinformatic analyses rely on homology (for inference of
function) and annotation.&lt;/p&gt;
&lt;p&gt;(This is why Open Access of data is so freakin' important to us
bioinformaticians.  If you hide your database from us, it might
as well not exist for all we care.)&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;Statistics.  We do a lot of sensitive signal analysis and
multiple testing, and we are really quite bad at computing FDRs
and other statistical properties.  Each statistical advance is
greeted with joy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p class="first"&gt;The ability to complete computations on (1), (2), and (3).&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/blockquote&gt;
&lt;p&gt;Every 100gb data set takes a day to process.  Mapping and assembly can
take hours to days to weeks.  Each database search costs time and
effort (in part because the annotations are all in different formats).
Each MCMC simulation or background calculation takes significant time,
even if it's automated.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Inefficient computation thus translates to an economic penalty on
science (in time, effort, and attention span).&lt;/strong&gt; This, in turn, leads
directly to science that is not as good as it could be (as do &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/jun-11/ngs-2011"&gt;poor
computational science skills&lt;/a&gt;, &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/jan-12/top-ten-things-i-hate-about-bioinfo-software"&gt;badly written
software&lt;/a&gt;,
&lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/data-intensive-science-and-workflows"&gt;inflexible workflows&lt;/a&gt;,
&lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/four-reasons-i-wont-use-your-data-analysis-pipeline"&gt;opaque pipelines&lt;/a&gt;,
and &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/is-discovery-science-really-bogus"&gt;too quick a rush to hypotheses&lt;/a&gt;
-- hey, look, a central theme to my blog posts!)&lt;/p&gt;
&lt;p&gt;Anecdote: someone recently e-mailed us to tell us about how they could
assemble a comparable soil data set to ours in a mere week and 3 TB of
memory.  Our internal estimates suggest that for full sensitivity, we
need to do 5-10 assemblies of that data set (each with different
parameters) followed by a similarly expensive post-assembly merging --
so, minimally, 6 weeks of compute requiring 3 TB of memory, full-time,
on as many cores as possible.  You've gotta imagine that there's going
to be a lot of internal pressure to get results in less time (surely
we can get away with only 1 assembly?) with less parameter searching
(what, you think we can tell you which parameters are going to work?)
and this pressure is going to translate to doing less in the way of
data set exploration.  (Never mind the &lt;em&gt;actual&lt;/em&gt; economics -- since
this data set would take about 1 week of sequencer time, and $10,000
or so, to generate today, I think they don't make sense either.)&lt;/p&gt;
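&lt;p&gt;The 6-week figure above can be reconstructed with a little arithmetic.
This is only a sketch: it assumes each assembly takes the one week quoted in
the e-mail and that the post-assembly merge costs about as much as one more
assembly.  The names below are made up for illustration.&lt;/p&gt;
&lt;pre&gt;
```python
WEEKS_PER_ASSEMBLY = 1  # the "mere week" quoted for a single assembly
MERGE_WEEKS = 1         # assumption: merging costs about one more assembly


def total_weeks(n_assemblies):
    """Compute time for n assemblies plus one post-assembly merge."""
    return n_assemblies * WEEKS_PER_ASSEMBLY + MERGE_WEEKS


# 5-10 assemblies, each needing the 3 TB machine full-time
low, high = total_weeks(5), total_weeks(10)
print(low, high)
```
&lt;/pre&gt;
&lt;p&gt;Under those assumptions the low end is the 6 weeks quoted above, and the
high end is nearly double that.&lt;/p&gt;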
&lt;p&gt;I can point you to at least three big metagenome Illumina assembly
papers where I &lt;em&gt;know&lt;/em&gt; these computational limitations truncated their
exploration of the data set.  (Wait, you say there are only three?
Well, I'm not going to tell you which three they are.)&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section"&gt;
&lt;h1&gt;&lt;a rel="nofollow" id="the-practical-reason-why-efficient-computation-is-good-for-science" name="the-practical-reason-why-efficient-computation-is-good-for-science"&gt;The practical reason why efficient computation is good for science&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;This one's a bit more obvious, but, interestingly, Mick &lt;em&gt;also&lt;/em&gt; treads
all over it.  He says &amp;quot;...I can assemble using MetaVelvet on our server
in less than a day (admittedly it has 512Gb of RAM, but doesn't everyone?)&amp;quot;&lt;/p&gt;
&lt;p&gt;Well, no, they don't.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;We&lt;/em&gt; didn't have access to such a big server until recently.  We had plenty
of offers for occasional access, but when we explained that we needed them
for a few weeks of dedicated compute (for parameter exploration -- see above)
and also that no, we weren't willing to sign copyright or license for our
software over to a national lab for that access, somewhat oddly a lot of
the offers came to naught.&lt;/p&gt;
&lt;p&gt;It turns out &lt;em&gt;most&lt;/em&gt; people don't have access to such bigmem computers, or
even big compute clusters; and when they do, those computers and clusters
aren't configured for biologists to use.&lt;/p&gt;
&lt;p&gt;Democratization of sequencing should mean democratization of analysis,
too.  Every year &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/ngs-course-where-next.html"&gt;our next-gen sequence analysis course&lt;/a&gt;
gets tons of applicants from small colleges and universities where the
compute infrastructure is small and what does exist is overwhelmed by
Monte Carlo calculations.  Our course explicitly teaches them to use
Amazon to do their compute -- with that, they can take that knowledge
home, and spend small amounts of money to buy IaaS, or apply for an
AWS education grant to do their analysis.  We feel for them because
we were &lt;em&gt;in&lt;/em&gt; their situation until recently.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Expensive compute translates to a penalty on the very ability of
many scientists and teachers to access computational science.&lt;/strong&gt;
&lt;em&gt;(Insert snide comment on similar limitations in practical access to
US education, health care, and justice).&lt;/em&gt;&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section"&gt;
&lt;h1&gt;&lt;a rel="nofollow" id="the-algorithmic-reason-why-efficient-computation-is-good-for-science" name="the-algorithmic-reason-why-efficient-computation-is-good-for-science"&gt;The algorithmic reason why efficient computation is good for science&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Assemblers kinda suck.  Everyone knows it, and recent contests &amp;amp;
papers have done a pretty good job of highlighting the limitations
(see &lt;a rel="nofollow" class="reference" target="_blank" href="http://gage.cbcb.umd.edu/index.html"&gt;GAGE&lt;/a&gt; and &lt;a rel="nofollow" class="reference" target="_blank" href="http://assemblathon.org/"&gt;Assemblathon&lt;/a&gt;).  This is not because the field is full
of stupid people, but rather because assembly is a really, really hard
problem (see &lt;a rel="nofollow" class="reference" target="_blank" href="http://trinity.engr.uconn.edu/~vamsik/Fragment%20Assembly/NagarajanPopJCB09.pdf"&gt;Nagarajan &amp;amp; Pop&lt;/a&gt;)
-- so hard that really smart people have worked for decades on it.
(In many ways, the fact that it works at all is a tribute to their
brilliance.)&lt;/p&gt;
&lt;p&gt;Advances in assembly algorithms have led to our current crop of
assemblers, but assemblers are still relatively slow and relatively
memory consumptive.  Our &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/apr-12/what-is-diginorm.html"&gt;diginorm paper&lt;/a&gt; benchmarks Trinity as
requiring 38 hours in 42gb of RAM for 100m mouse mRNAseq reads; genome
and metagenome assemblers require similarly sized resources, although the
variance depends on the sample, of course.  &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.ncbi.nlm.nih.gov/pubmed?term=22156294"&gt;SGA&lt;/a&gt; and &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.ncbi.nlm.nih.gov/pubmed?term=22231483"&gt;Cortex&lt;/a&gt; seem
unreasonably memory efficient to me :), but I understand that they perform
less well on things other than single genomes (like, say, metagenomic
data) -- in part because the underlying data structures are
targeted at specific features of their data.&lt;/p&gt;
&lt;p&gt;What's the plan for the future, in which we will be applying next-gen
sequencing to non-model organisms, evolutionary experiments, and
entire populations of novel critters?  These sequencing data sets will
have different features from the ones we are used to tackling with
current tech -- including higher heterozygosity and strong GC-rich
biases.&lt;/p&gt;
&lt;p&gt;I personally think the next big advances in assembly will come through
the systematic application of sample- or sub-sample specific,
compute-expensive algorithms like &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.ncbi.nlm.nih.gov/pubmed?term=21595876"&gt;EMIRGE&lt;/a&gt; to our data
sets.  While perfect assembly may be a pipe dream, significant and
useful incremental advances seem very achievable, especially if the
practical cost of current assembly algorithms drops.&lt;/p&gt;
&lt;p&gt;Not so parenthetically, this is one of the reasons I'm so excited
about &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/apr-12/what-is-diginorm.html"&gt;digital normalization&lt;/a&gt; (the general concept, not only our
implementation) --&lt;/p&gt;
&lt;p&gt;I bet more algorithmically expensive solutions would be investigated,
implemented, and applied if memory and time requirements dropped,
don't you?&lt;/p&gt;
&lt;p&gt;Or if the data could be made less error-prone and simpler?&lt;/p&gt;
&lt;p&gt;Or if the volume of data could be reduced without losing much
information?&lt;/p&gt;
&lt;p&gt;I will take one side of that bet...&lt;/p&gt;
&lt;p&gt;---&lt;/p&gt;
&lt;p&gt;Of course, I'm more than a wee bit biased on this whole topic.  My
group has spent much of the last three years fighting
the trend of &amp;quot;just use a bigger computer and it will all be OK&amp;quot;.
&lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/diginorm-paper-posted.html"&gt;Diginorm&lt;/a&gt; and &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/dec-11/kmer-percolation-posted.html"&gt;partitioning&lt;/a&gt; are two of the results, and a
few more will be emerging soon.  I happen to think it's incredibly
important; I would have done something else with my time, energy,
and money if not.  Hopefully you can agree that it's important, even
if you're interested in other things.&lt;/p&gt;
&lt;p&gt;So: yes, computational efficiency is not the only thing.  And it's a
surprisingly convenient moving target; frequently, you yourself can
just wait a few months or buy a bigger computer, and achieve similar
results.  But sometimes that attitude masks the fact that efficient
computation can bring better, cheaper, and broader science.  We need
to pay attention to that, too.&lt;/p&gt;
&lt;p&gt;And, Mick?  I don't think I can improve your ability to extract
biological information by 100x.  On metagenomes, would 2-10x be a good
enough start?&lt;/p&gt;
&lt;p&gt;--titus&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;</description>
         <guid isPermaLink="false">http://ivory.idyll.org/blog/2012/04/06/big-data-biology-2</guid>
         <pubDate>Fri, 06 Apr 2012 13:36:38 +0000</pubDate>
      </item>
      <item>
         <title>What is digital normalization, anyway?</title>
         <link>http://ivory.idyll.org/blog/apr-12/what-is-diginorm</link>
         <description>&lt;div class="document"&gt;
&lt;p&gt;I'm out at a &lt;a rel="nofollow" class="reference" target="_blank" href="http://chem.colorado.edu/knightgroup/index.php?option=com_flexicontent&amp;amp;view=items&amp;amp;id=254:cloud-computing-for-the-microbiome-workshop-"&gt;Cloud Computing for the Human Microbiome Workshop&lt;/a&gt; and I've been trying to convince people of the importance of digital normalization.  When &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/diginorm-paper-posted"&gt;I posted the paper&lt;/a&gt; the reaction was reasonably positive, but I haven't had much luck explaining why it's so awesome.&lt;/p&gt;
&lt;p&gt;At the workshop, people were still confused.  So I
tried something new.&lt;/p&gt;
&lt;p&gt;I first made a simulated metagenome by taking three genomes' worth of
data from the Chitsaz et al. (2011) paper (see
&lt;a rel="nofollow" class="reference" target="_blank" href="http://bix.ucsd.edu/projects/singlecell/"&gt;http://bix.ucsd.edu/projects/singlecell/&lt;/a&gt;) and shuffling them together.
I combined the sequences in a ratio of 10:25:50 for the E. coli
sequences, the Staph sequences, and the SAR sequences, respectively;
the latter two were single-cell MDA genomic DNA.  I took the first 10m
reads of this mix and then estimated the coverage.&lt;/p&gt;
&lt;p&gt;You can see the coverage of these genomic data sets estimated by using
the known reference sequences in the first figure.  E. coli looks nice
and Gaussian; Staph is smeared from here to heck; and much of the SAR
sequence is low coverage.  This reflects the realities of single cell
sequencing: you get really weird copy number biases out of multiple
displacement amplification.&lt;/p&gt;
&lt;p&gt;Then I applied three-pass digital normalization (see &lt;a rel="nofollow" class="reference" target="_blank" href="http://ivory.idyll.org/blog/mar-12/diginorm-paper-posted.html"&gt;the paper&lt;/a&gt;) and
plotted the new abundances.  As a reminder, &lt;strong&gt;this operates without
knowing the reference in advance&lt;/strong&gt;; we're just using the known reference
here to check the effects.&lt;/p&gt;
&lt;div class="figure"&gt;
&lt;img alt="http://ivory.idyll.org/permanent/raw-coverage.png" src="http://ivory.idyll.org/permanent/raw-coverage.png" style="width:400px;"/&gt;
&lt;p class="caption"&gt;Coverage of genome read mix, calculated by mapping the mixed reads
onto the known reference genomes.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="figure"&gt;
&lt;img alt="http://ivory.idyll.org/permanent/norm-coverage.png" src="http://ivory.idyll.org/permanent/norm-coverage.png" style="width:400px;"/&gt;
&lt;p class="caption"&gt;Coverage post-digital-normalization, again calculated by mapping
the mixed reads onto the known reference genomes.&lt;/p&gt;
&lt;/div&gt;
&lt;p&gt;As you can see, digital normalization literally &amp;quot;normalizes&amp;quot; the data
to the best of its ability.  That is, it cannot create higher coverage
where high coverage doesn't exist (for the SAR), but it can convert
the existing high coverage into nice, Gaussian distributions centered
around a much lower number.  You also discard quite a bit of data (look
at the X axes -- about 85% of the reads were discarded in downsampling
the coverage like this).&lt;/p&gt;
&lt;p&gt;When you assemble this, you get as good or better results than
assembling the unnormalized data, despite having discarded so much
data.  This is because no low-coverage data is discarded, so you still
retain as many covered bases overall -- just in fewer reads.  To boot,
it works pretty generically for single genomes, MDA genomes,
transcriptomes, and metagenomes.&lt;/p&gt;
&lt;p&gt;And, as a reminder? Digital normalization does this in fixed, low
memory; in a single pass; and without any reference sequence needed.&lt;/p&gt;
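&lt;p&gt;For the curious, here is a minimal sketch of the idea.  This post doesn't
spell out the algorithm, so the sketch assumes the rule described in the
paper: estimate a read's coverage as the median count of its k-mers among the
reads kept so far, and keep the read only if that estimate is below a cutoff.
The function names and the values of k and cutoff are illustrative, and a
plain Python Counter stands in for the fixed-memory counting structure, so
this toy version is single-pass and reference-free but &lt;em&gt;not&lt;/em&gt; low
memory.&lt;/p&gt;
&lt;pre&gt;
```python
from collections import Counter
from statistics import median


def kmers(read, k):
    """Return every overlapping k-mer in a read."""
    return [read[i:i + k] for i in range(len(read) - k + 1)]


def normalize(reads, k=20, cutoff=20):
    """Keep a read only while its estimated coverage is below `cutoff`.

    Coverage is estimated as the median count of the read's k-mers among
    the reads kept so far, so no reference sequence is consulted.
    """
    counts = Counter()  # exact counts; a stand-in for a fixed-size sketch
    kept = []
    for read in reads:  # a single pass over the input
        words = kmers(read, k)
        if not words:  # read shorter than k: nothing to estimate from
            continue
        if cutoff > median(counts[w] for w in words):
            kept.append(read)
            counts.update(words)
    return kept


# Toy input: one sequence seen 50 times, another seen only twice.
reads = ["ACGTTGCA"] * 50 + ["GGATCCAT"] * 2
kept = normalize(reads, k=4, cutoff=5)
print(len(kept), "of", len(reads), "reads kept")
```
&lt;/pre&gt;
&lt;p&gt;In this toy run the high-coverage sequence is cut down to the cutoff while
both copies of the low-coverage one survive, which is the behavior shown in
the figures above.&lt;/p&gt;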
&lt;p&gt;Pretty neat.&lt;/p&gt;
&lt;p&gt;--titus&lt;/p&gt;
&lt;/div&gt;</description>
         <guid isPermaLink="false">http://ivory.idyll.org/blog/2012/04/06/what-is-diginorm</guid>
         <pubDate>Fri, 06 Apr 2012 11:17:51 +0000</pubDate>
      </item>
      <item>
         <title>When to Build Performance Measurement Tools for Firefox</title>
         <link>http://www.blueskyonmars.com/2012/04/02/when-to-build-performance-measurement-tools-for-firefox/</link>
         <description>We&amp;#8217;re well on our way to having a full-featured set of tools for web developers that ship with every release of Firefox, in addition to the already great Firebug add-on. In our roadmap, I talk about building &amp;#8220;bundled tools for the most common tasks&amp;#8221;. Lately, people have been asking me about tools to help web [...]</description>
         <guid isPermaLink="false">http://www.blueskyonmars.com/?p=2893</guid>
         <pubDate>Mon, 02 Apr 2012 15:22:52 +0000</pubDate>
         <content:encoded><![CDATA[<p>We&#8217;re well on our way to having a full-featured set of tools for web developers that ship with every release of Firefox, in addition to the already great Firebug add-on. In our <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/DevTools/RoadmapDec2011">roadmap</a>, I talk about building &#8220;bundled tools for the most common tasks&#8221;. Lately, people have been asking me about tools to help web developers improve the performance of their applications.</p>
<p><strong>Firefox is very fast.</strong> In fact, Firefox and its competitors are so fast that most web developers only care about one aspect of web application performance: network access. <strong>Latency and the amount of data transferred</strong> are the biggest issues for most web developers. We&#8217;ll be working on <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/DevTools/Features/NetworkView">providing insight into network access</a> soon in Firefox.</p>
<p>Developers working on three sorts of web applications in particular are asking for deeper insight into what the platform is doing:</p>
<ul>
<li>games</li>
<li>complex layouts involving large amounts of data</li>
<li>applications that have features you&#8217;d traditionally associate with &#8220;desktop applications&#8221;</li>
</ul>
<p>Each browser has different performance characteristics, and these developers need tools that give them hints on how to make their apps responsive on each browser. They care about things like garbage collection pauses, repaints and reflows and hot spots in their JavaScript code where the just-in-time compilers aren&#8217;t able to make the JS zoom.</p>
<p><strong>Most web developers aren&#8217;t working on these kinds of apps</strong>, and we&#8217;re focused on building tools that are useful for the &#8220;most common tasks&#8221;. However, we want these kinds of applications to run well on Firefox. Firefox developer tools really serve two groups: the web developers who use the tools directly, and the hundreds of millions of Firefox users who are looking to experience the web in the best way possible.</p>
<p>I think that our focus needs to remain on building the best tools for the most common tasks. But <strong>we also need to accommodate these sophisticated developers</strong>. Fortunately, we have more options than just &#8220;build it&#8221; or &#8220;don&#8217;t&#8221;.</p>
<p>For a feature to ship in Firefox, it goes through <em>a lot</em> of work to ensure that the feature is of a quality that is ready to ship to many millions of people and in many languages. The developers building these performance intensive apps do not number in the millions, and they are capable of installing add-ons. Some are even willing to produce their own custom builds of Firefox, if that&#8217;s what it takes to get the performance data they want.</p>
<p>In my opinion, that&#8217;s the planning lever we need to pull here. <strong>We can try to get these developers the data they need, albeit in a rough form, in add-ons as soon as possible.</strong> Along those lines, Brian Hackett has made his <a rel="nofollow" target="_blank" href="https://addons.mozilla.org/en-US/firefox/addon/jit-inspector/">JIT Inspector</a> tool available as an add-on.</p>
<p>If you need help figuring out performance issues with your application in Firefox, <a rel="nofollow" target="_blank" href="https://lists.mozilla.org/listinfo/dev-apps-firefox">get in touch</a>.</p>]]></content:encoded>
         <category>Mozilla</category>
      </item>
      <item>
         <title>Our approach to replication in computational science</title>
         <link>http://ivory.idyll.org/blog/apr-12/replication-i</link>
         <description>&lt;div class="document"&gt;
&lt;p&gt;I'm pretty proud of our most recently posted paper, which is on a
sequence analysis concept we call &lt;a rel="nofollow" class="reference" target="_blank" href="http://ged.msu.edu/papers/2012-diginorm/"&gt;digital normalization&lt;/a&gt;.  I think the paper is
pretty kick-ass, but so is the way in which we're approaching
replication.  This blog post is about the latter.&lt;/p&gt;
&lt;p&gt;(Quick note re &amp;quot;replication&amp;quot; vs &amp;quot;reproduction&amp;quot;: The distinction
between replicability and reproducibility is, from what I understand,
that &amp;quot;replicable&amp;quot; means &amp;quot;other people get exactly the same results
when doing exactly the same thing&amp;quot;, while &amp;quot;reproducible&amp;quot; means
&amp;quot;something similar happens in other people's hands&amp;quot;.  The latter is
far stronger, in general, because it indicates that your results are
not merely some quirk of your setup and may actually be right.)&lt;/p&gt;
&lt;p&gt;So what did we do to make this paper extra super replicable?&lt;/p&gt;
&lt;p&gt;If you go to the &lt;a rel="nofollow" class="reference" target="_blank" href="http://ged.msu.edu/papers/2012-diginorm/"&gt;paper Web site&lt;/a&gt;, you'll find:&lt;/p&gt;
&lt;blockquote&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;a link to the paper itself, in preprint form, stored at the arXiv
site;&lt;/li&gt;
&lt;li&gt;a tutorial for running the software on a Linux machine hosted in
the Amazon cloud;&lt;/li&gt;
&lt;li&gt;a git repository for the software itself (hosted on github);&lt;/li&gt;
&lt;li&gt;a git repository for the LaTeX paper and analysis scripts (also
hosted on github), including an ipython notebook for generating the
figures (more about &lt;em&gt;that&lt;/em&gt; in my next blog post);&lt;/li&gt;
&lt;li&gt;instructions on how to start up an EC2 cloud instance, install the
software and paper pipeline, and build most of the analyses and all
of the figures from scratch;&lt;/li&gt;
&lt;li&gt;the data necessary to run the pipeline;&lt;/li&gt;
&lt;li&gt;some of the output data discussed in the paper.&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;
&lt;p&gt;(Whew, it makes me a little tired just to type all that...)&lt;/p&gt;
&lt;p&gt;What this means is that you can regenerate substantial amounts (but
not all) of the data and analyses underlying the paper from scratch,
all on your own, on a machine that you can rent for something like 50
cents an hour.  (It'll cost you about $4 -- 8 hours of CPU -- to
re-run everything, plus some incidental costs for things like downloads.)&lt;/p&gt;
&lt;p&gt;Not only &lt;em&gt;can&lt;/em&gt; you do this, but if you try it, it will actually &lt;em&gt;work&lt;/em&gt;.
I've done my best to make sure the darn thing works, and this is the
actual pipeline we ourselves ran to produce the figures in the paper.
All the data is there, and all of the code used to process the data,
analyze the results, and produce the figures is &lt;em&gt;also&lt;/em&gt; there. In
version control.&lt;/p&gt;
&lt;p&gt;When you combine that with the ability to run this on a specific EC2
instance -- a combination of a frozen virtual machine installation and
a specific set of hardware -- I feel pretty confident that at least
&lt;em&gt;this&lt;/em&gt; component of our paper is something that can be replicated.&lt;/p&gt;
&lt;div class="section"&gt;
&lt;h1&gt;&lt;a rel="nofollow" id="a-few-thoughts-on-replicability-and-effort" name="a-few-thoughts-on-replicability-and-effort"&gt;A few thoughts on replicability, and effort&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Why did I go to all this trouble??&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Wasn't it a lot of work?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Well, interestingly enough, it wasn't &lt;em&gt;that&lt;/em&gt; much work.  I already
use version control for everything, including paper text; posting it
all to github was a matter of about three commands.&lt;/p&gt;
&lt;p&gt;Writing the code, analysis scripts, and paper was an immense amount of
work.  But I had to do that anyway.&lt;/p&gt;
&lt;p&gt;The most extra effort I put in was making sure that the big data files
were available.  I didn't want to add the 2gb E. coli resequencing
data set to git, for example.  So I ended up tarballing those files
and sticking them on S3.&lt;/p&gt;
&lt;p&gt;The Makefile and analysis scripts are ugly, but suffice to remake
everything from scratch; they were already needed to make the paper,
so in order to post them all I had to do was put in a teensy bit of
effort to remove some unintentional dependencies.&lt;/p&gt;
&lt;p&gt;The ipython notebook used to generate the figures (again -- next blog
post) was probably the most effort, because I had to learn how to use
it, which took about 20 minutes.  But it was one of the smoothest
transitions into using a new tool I've ever experienced in my ~25 years
of coding.&lt;/p&gt;
&lt;p&gt;Overall, it wasn't that much extra effort on my part.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why bother in the first place??&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The first and shortest answer is, because I could, and because I
believe in replication and reproducibility, and wanted to see how
tough it was to actually do something like this.  (It's a good deal
above and beyond what most bioinformaticians do.)&lt;/p&gt;
&lt;p&gt;Perhaps the strongest reason is that our group has been bitten a lot
in recent months by irreplicable results.  I won't name names, but
several Science and PNAS and PLoS One papers of interest to us turned
out to be basically impossible for us to replicate.  And, since we are
engaged in developing new computational methods that must be compared
to previous work, an inability to
regenerate &lt;em&gt;exactly&lt;/em&gt; the results in those other papers meant we had to
work harder than we should have, simply to reproduce what they'd done.&lt;/p&gt;
&lt;p&gt;A number of these problems came from people discarding large data sets
after publishing, under the mistaken belief that their submission to
the Short Read Archive could be used to regenerate their results.
(Often SRA submissions are unfiltered, and no one keeps the filtering
parameters around...right?)  In some cases, I got the right data sets
from the authors and could replicate (kudos to Brian Haas of Trinity
for this!), but in most cases, ixnay on the eplicationre.&lt;/p&gt;
&lt;p&gt;Then there were the cases where authors clearly were simply being bad
computational scientists.  My favorite example is a very high profile
paper (coauthored by someone I admire greatly), in which the script
they sent to us -- a script necessary for the initial analyses -- had
a &lt;em&gt;syntax error&lt;/em&gt; in it.  In that case, we were fairly sure that the
authors weren't sending us the script they'd actually used...  (It was
Perl, so admittedly it's hard to tell a syntax error from legitimate
code, but even the Perl interpreter was choking on this.)&lt;/p&gt;
&lt;p&gt;(A few replication problems came from people using closed or
unpublished software, or being hand-wavy about the parameters they
used, or using version X of some Web-hosted pipeline for which only
version Y was now available.  Clearly these are long-term issues that
need to be discussed with respect to replication in comp. bio., but
that's another topic.)&lt;/p&gt;
&lt;p&gt;Thus, my group has wasted a lot of time replicating other people's
work.  I wanted to avoid making other people go through that.&lt;/p&gt;
&lt;p&gt;A third reason is that I really, really, really want to make it easy
for people to pick up this tool and &lt;em&gt;use&lt;/em&gt; it.  Digital normalization
is super ultra awesome and I want as little as possible to stand in
the way of others using it.  So there's a strong element of
self-interest in doing things this way, and I hope it makes diginorm
more useful.  (I know about a dozen people that have already tried it
out in the week or so since I made the paper available, which is
pretty cool.  But citations will tell.)&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section"&gt;
&lt;h1&gt;&lt;a rel="nofollow" id="what-use-is-replication" name="what-use-is-replication"&gt;What use is replication?&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;Way back when, &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.scimatic.com/node/361"&gt;Jim Graham politely schooled me&lt;/a&gt; in the true meaning of
reproducibility, as opposed to replication.  He was about 2/3 right,
but then he went a bit too far and said&lt;/p&gt;
&lt;blockquote&gt;
But let's drop the idea that I'm going to take your data and your
code and &amp;quot;reproduce&amp;quot; your result. I'm not. First, I've got my own
work to do. More importantly, the odds are that nobody will be any
wiser when I'm done.&lt;/blockquote&gt;
&lt;p&gt;Well, let's take a look at that concern, shall we?&lt;/p&gt;
&lt;p&gt;With the benefit of about two years of further practice, I can tell
you this is a dangerously wrong way to think, at least in the field of
bioinformatics.  My objections hinge on a few points:&lt;/p&gt;
&lt;p&gt;First, based on our experiences so far, I'd be surprised if the
authors themselves could replicate their own computational results --
too many files and parameters are missing.  We call that &amp;quot;bad
science&amp;quot;.&lt;/p&gt;
&lt;p&gt;Second, odds are, the senior professor has little or no detailed
understanding of what bioinformatic steps were taken in processing the
data, and moreover is uninterested in the details; that's why they're
not in the Methods.  Why is that a problem?  Because the odds are
quite good that many biological analyses &lt;em&gt;hinge critically&lt;/em&gt; on such
points.  So the peer reviewers and the community at large need to be
able to evaluate them (see &lt;a rel="nofollow" class="reference" target="_blank" href="http://seqanswers.com/forums/showthread.php?t=18501"&gt;this RNA editing kerfuffle&lt;/a&gt; for an
excellent example of reviewer fail).  Yet most bioinformatic pipelines
are so terribly described that even with some WAG I can't figure out
what, roughly speaking, is going on.  I certainly couldn't replicate
it, and generating specific critiques is quite difficult in that kind
of circumstance.&lt;/p&gt;
&lt;p&gt;Parenthetically, Graham does refer to the climate sciences' &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.realclimate.org/index.php/archives/2009/02/on-replication/langswitch_lang/in/"&gt;struggles
with reproducibility and replication&lt;/a&gt;.  If only they put the same effort into replication and
data archiving they did into arguing with climate change deniers...&lt;/p&gt;
&lt;p&gt;Third, Graham may be guilty of physics chauvinism (just like I'm
almost certainly guilty of bioinformatics chauvinism...) Physics and
biology are quite different: in physics, you often have a theoretical
framework to go by, and results should at least roughly adhere to that
or else they are considered guilty until proven innocent.  In biology,
we usually have no good idea of what we're expecting to see, and often
we're looking at a system for the very first time.  In that
environment, I think it's important to make the underlying computation
WAY more solid than you would demand in physics (see RNA editing above).&lt;/p&gt;
&lt;p&gt;As Narayan Desai pointed out to me (following which I then put it in
my &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.slideshare.net/c.titus.brown/pycon-2011-talk-ngram-assembly-with-bloom-filters"&gt;PyCon talk (slide 5)&lt;/a&gt;),
physics and biology are quite different in the way data is generated
and analyzed.  There are fewer sources of data generation in physics,
there's more of a computational culture, and there's more theory.
Having worked with physicists for much of my scientific life (and
having published a number of papers with physicists) I can tell you
that replication is certainly a big problem over there, but the
&lt;em&gt;consequences&lt;/em&gt; don't seem as big -- eventually the differences between
theory and computation will be worked out, because they're far more
noticeable when you &lt;em&gt;have&lt;/em&gt; theory, like in physics.  Not so in biology.&lt;/p&gt;
&lt;p&gt;Fourth, a renewed emphasis on computational methods (and therefore on
replicability of computational results) is a natural part of the
transition to &lt;a rel="nofollow" class="reference"&gt;Big Data biology&lt;/a&gt;.  The quality of
analysis methods matters A LOT when you are dealing with massive
data sets with weak signals and many systematic biases.  (I'll write
about this more later.)&lt;/p&gt;
&lt;p&gt;Fifth, and probably most significant from a practical perspective,
Graham misses the point of &lt;em&gt;reuse&lt;/em&gt;.  In bioinformatics, it behooves us
to reuse proven (aka published) tools -- at least we know they worked
for &lt;em&gt;someone&lt;/em&gt;, at least once, which is not usually the case for newly
written software.  I don't pretend that it's the responsibility of
people to write awesome reusable tools for every paper, but sure as
heck I should expect to be able to &lt;em&gt;run&lt;/em&gt; them on &lt;em&gt;some&lt;/em&gt; combination of
hardware and software.  Often that's not the case, which means I get
to reinvent the wheel (yay...) even when I'm doing the same stupid
thing the last five pubs did.&lt;/p&gt;
&lt;p&gt;For our paper, khmer and screed should be quite reusable.  The
analysis pipeline for the paper?  It's not that great.  But at least
you can run it, and potentially steal code from it, too.&lt;/p&gt;
&lt;p&gt;When I was talking to a colleague about the diginorm paper, he said
something jokingly: &amp;quot;wow, you're making it way too easy for people!&amp;quot;
-- presumably he meant it would be way too easy for people to criticize
or otherwise complain about the specific way we're doing things.
Then, a day or two later he said, &amp;quot;hmm, but now that I think of it, no
one ever uses the software we publish, and you seem to have had better
luck with that...&amp;quot;  -- recognizing that if you are barely able to run
your own software, perhaps others might find it even more difficult.&lt;/p&gt;
&lt;p&gt;Heck, the diginorm paper itself would have been far harder to write
without the data sets from the &lt;a rel="nofollow" class="reference" target="_blank" href="http://www.ncbi.nlm.nih.gov/pubmed?term=21572440"&gt;Trinity paper&lt;/a&gt; and the
&lt;a rel="nofollow" class="reference" target="_blank" href="http://www.ncbi.nlm.nih.gov/pubmed?term=21926975"&gt;Velvet-SC paper&lt;/a&gt;.  Having those
nice, fresh, well-analyzed data sets already at hand was &lt;em&gt;fantastic&lt;/em&gt;.
Being able to &lt;em&gt;run Trinity&lt;/em&gt; and reproduce their results was &lt;em&gt;wonderful&lt;/em&gt;.&lt;/p&gt;
&lt;p&gt;There's a saying in software engineering: &amp;quot;one of the main people you
should be programming for is yourself, in 6 months.&amp;quot;  That's also true
in science -- I'm sure I won't remember the finer details of the
diginorm paper analysis in 2 years -- but I can always go look into
version control.  More importantly, new graduate students can go look
and really see what's going on.  (And I can use it for teaching, too.)
And so can other people working with me.  So there's a lot of utility
in simply nailing everything down and making it runnable.&lt;/p&gt;
&lt;p&gt;Replication is by no means sufficient for good science.  But I'll be
more impressed by the argument that &amp;quot;replication isn't all that
important&amp;quot; when I see lack of replication as the exception rather than
the rule.  Replication is essential, and good, and useful.  I long for
the day when it's not &lt;em&gt;interesting&lt;/em&gt;, because it's so standard.  In
the meantime I would argue that it certainly doesn't do any harm to
emphasize it.&lt;/p&gt;
&lt;p&gt;(Note that I really appreciate Jim Graham's commentary, as I think he
is at worst &lt;em&gt;usefully&lt;/em&gt; wrong on these points, and substantially
correct in many ways.  I'm just picking on him because he wrote it all
down in one place for me to link to, and chose to use the word 'sic'
when reproducing my spelling mistake.  Low blow ;)&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section"&gt;
&lt;h1&gt;&lt;a rel="nofollow" id="the-future" name="the-future"&gt;The future&lt;/a&gt;&lt;/h1&gt;
&lt;p&gt;I don't pretend to have all, or even many, of the answers; I just like
to think about what form they might take.&lt;/p&gt;
&lt;p&gt;I don't want to argue that this approach is a panacea or a
high-quality template for others to use, inside or out of
bioinformatics.  For one thing, I haven't automated some of the
analyses in the paper; it's just too much work for too little benefit
at this point.  (Trust me, they're easy to reproduce... :).  For
another, our paper used a fairly small amount of data overall; only a
few dozen gigabytes all told.  This makes it easy to post the data for
others to use later on.  Several of our next few papers will involve
over a half terabyte of raw data, plus several hundred gb of ancillary
and intermediate results; no idea what we'll do for them.&lt;/p&gt;
&lt;p&gt;Diginorm is also a somewhat strange bioinformatics paper.  We just
analyzed other people's data sets (an approach which for some reason
isn't in favor in high impact bioinformatics, probably because high
impact journal subs are primarily reviewed by biologists who want to
see cool new data that we don't understand, not boring old data that
we don't understand).  There's no way we can or should argue that
biological replicates done in a different lab should &lt;em&gt;replicate&lt;/em&gt; the
results; that's where reproducibility becomes important.&lt;/p&gt;
&lt;p&gt;But I would like it if people &lt;em&gt;considered&lt;/em&gt; this approach (or some
other approach) to making their analyses replicable.  I don't mind
people rejecting good approaches because they don't fit; to each their
own.  But this kind of limited enabling of replication isn't that
difficult, frankly, and even if it were, it has plenty of upsides.
It's definitely not irrelevant to the practice of science -- I would
challenge anyone to try to make &lt;em&gt;that&lt;/em&gt; claim in good faith.&lt;/p&gt;
&lt;p&gt;--titus&lt;/p&gt;
&lt;p&gt;p.s. I think I have to refer to this &lt;a rel="nofollow" class="reference" target="_blank" href="http://news.yahoo.com/cancer-science-many-discoveries-dont-hold-174216262.html"&gt;cancer results not reproducible&lt;/a&gt; paper somewhere.  Done.&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;</description>
         <guid isPermaLink="false">http://ivory.idyll.org/blog/2012/04/02/replication-i</guid>
         <pubDate>Mon, 02 Apr 2012 14:29:39 +0000</pubDate>
      </item>
      <item>
         <title>Thinking About the Developer Experience for the Web</title>
         <link>http://www.blueskyonmars.com/2012/03/29/thinking-about-the-developer-experience/</link>
         <description>I&amp;#8217;ve been working on developer tools for a while now, and I&amp;#8217;m really proud of what we are shipping in Firefox today and the new features that are right around the corner. Browser tools are one of the most important parts of a web developer&amp;#8217;s toolbox. But, there&amp;#8217;s a lot more that goes into web [...]</description>
         <guid isPermaLink="false">http://www.blueskyonmars.com/?p=2889</guid>
         <pubDate>Thu, 29 Mar 2012 15:35:05 +0000</pubDate>
         <content:encoded><![CDATA[<p></p>
<p>I&#8217;ve been working on developer tools for a while now, and I&#8217;m really proud of what we are shipping in Firefox today and the new features that are right around the corner. Browser tools are one of the most important parts of a web developer&#8217;s toolbox.</p>
<p>But, there&#8217;s a lot more that goes into web development than the browser tools. The video above and the text that follows are some thoughts on the whole of the web developer&#8217;s experience.</p>
<p>Web development is great because the platform is so high level and dynamic. That makes it easy to get started. There&#8217;s a massive collection of libraries, tools, books, tutorials and more to help web developers get things done once they&#8217;ve moved beyond the first steps. In fact, there&#8217;s so much out there that it can be hard for someone getting going to decide how to go from idea to done. The riches of the web ecosystem are both a blessing and a curse. It&#8217;s more blessing than curse, but that doesn&#8217;t make it any easier for newcomers and, in some instances, for experienced developers that are moving into a new area or applying a new technology.</p>
<p>Mozilla&#8217;s non-profit mission is to protect openness and innovation on the web. We want to make the web better for everyone, and I think we&#8217;re in a good position to help guide developers from idea to published app. Doing so is especially critical for our <a rel="nofollow" target="_blank" href="https://www.mozilla.org/en-US/apps/">Apps initiative</a>.</p>
<p>To that end, Daniel Buchner and I will be looking beyond developer tools in our product plans to include the whole of the <a rel="nofollow" target="_blank" href="https://wiki.mozilla.org/DeveloperExperience">developer experience</a>. This will first show up in an Apps context, but we&#8217;re going to look for ways to apply what we do more broadly.</p>]]></content:encoded>
      </item>
      <item>
         <title>Stop Writing Classes?</title>
         <link>http://otkds.blogspot.com/2012/03/jack-diederich-pycon-us-2012-stop.html</link>
         <description>&lt;div dir="ltr" style="text-align:left;"&gt;
Jack Diederich gave a wonderful talk at PyCon US 2012, &lt;a rel="nofollow" target="_blank" href="http://pyvideo.org/video/880/stop-writing-classes"&gt;Stop Writing Classes&lt;/a&gt;, and some kind people have even &lt;a rel="nofollow" target="_blank" href="http://habrahabr.ru/post/140581/"&gt;translated it into Russian&lt;/a&gt;. The topic is spot on, but this talk (like any other provocative statement, for that matter) badly needs a well-known Japanese proverb as its epigraph:&lt;br /&gt;
&lt;blockquote class="tr_bq"&gt;
&lt;span style="font-size:small;"&gt;&lt;span style="color:#454545;"&gt;&lt;span style="font-family:Times, serif;"&gt;If you believe everything you read, better not read.&lt;/span&gt;&lt;/span&gt;&lt;/span&gt;&lt;/blockquote&gt;
What jars most is the advice to give up custom exceptions. In that case, callers have to catch the standard generic exceptions, which can be raised for a whole host of reasons. If an exception always signals an abnormal situation (and is never caught), that's fine. But what if it doesn't? Then it is quite likely that we think we are handling a runtime error when in fact a logic error has crept into the code somewhere and shows up only on certain data, and we end up wasting time solving riddles in the debugger.&lt;br /&gt;
&lt;br /&gt;
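A rough sketch of what I mean. The names UserNotFound, email_generic and so on are made up for illustration and are not from the talk; the misspelled key stands in for any logic error:&lt;br /&gt;
&lt;pre&gt;
```python
class UserNotFound(Exception):
    """Hypothetical domain-specific error: the requested user does not exist."""


USERS = {"alice": {"email": "alice@example.com"}}


def email_generic(name):
    # Relies on the built-in KeyError to signal "no such user".
    return USERS[name]["e-mail"]  # logic bug: the key is spelled "email"


def email_custom(name):
    if name not in USERS:
        raise UserNotFound(name)  # the only error callers are expected to handle
    return USERS[name]["e-mail"]  # same logic bug as above


def safe_email_generic(name):
    try:
        return email_generic(name)
    except KeyError:  # meant for "no such user", but also hides the typo
        return None


def safe_email_custom(name):
    try:
        return email_custom(name)
    except UserNotFound:  # catches only the expected condition
        return None
```
&lt;/pre&gt;
With the generic handler, safe_email_generic("alice") quietly returns None even though alice exists, so the typo looks like a missing user. With the custom exception, safe_email_custom("alice") lets the KeyError escape and the bug shows up right away.&lt;br /&gt;
&lt;br /&gt;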
Now let's look at the example with the API class. Getting rid of the bulky class is good. But now the configuration parameter API_KEY has become a global variable that the function uses implicitly. Implicit is better than explicit? If all of this lives in my own small script, everything is great. But what if the request code is in a third-party library and API_KEY has to be read from a configuration file?&lt;br /&gt;
&lt;br /&gt;
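Here is a minimal sketch of the difference. Everything in it is an assumption made for illustration (the function names, the JSON config format and its "api_key" field); it is not the code from the talk:&lt;br /&gt;
&lt;pre&gt;
```python
import json

API_KEY = "demo-key"  # module-level global, as in the simplified version


def build_request_implicit(path):
    # Reads the global behind the caller's back; the key cannot be swapped
    # without reaching into this module.
    return {"path": path, "key": API_KEY}


def build_request_explicit(path, api_key):
    # The key is an ordinary argument, so it can come from anywhere.
    return {"path": path, "key": api_key}


def load_api_key(config_text):
    """Read the key from JSON config text (the "api_key" field is an assumption)."""
    return json.loads(config_text)["api_key"]


key = load_api_key('{"api_key": "from-config"}')
request = build_request_explicit("/items", key)
```
&lt;/pre&gt;
The explicit version costs one extra argument, but it keeps working when the function lives in someone else's library and the key comes from a file.&lt;br /&gt;
&lt;br /&gt;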
And one could go on like this through all the other examples. Simplification, including getting rid of unnecessary classes, is good, but you have to check how well it applies in each particular case.&lt;br /&gt;
&lt;div style="background-color:transparent;background-position:!important;display:none;margin-bottom:0px !important;margin-left:0px !important;margin-right:0px !important;margin-top:0px !important;padding-bottom:0px !important;padding-left:0px !important;padding-right:0px !important;padding-top:0px !important;text-align:left;"&gt;
&lt;div style="background-color:#363636;border-color:#000000;border-width:0px !important;color:#fafafa;font-size:16px !important;max-width:300px !important;overflow:visible;padding:8px !important;text-align:left;"&gt;
&lt;div class="translate"&gt;
&lt;/div&gt;
&lt;div class="additional"&gt;
&lt;/div&gt;
&lt;/div&gt;
&lt;img src="http://www.google.com/uds/css/small-logo.png" style="cursor:pointer;margin:0 !important;padding:3px 5px 0 !important;"/&gt;&lt;/div&gt;
&lt;/div&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/5235863953075762128-5210067075777557637?l=otkds.blogspot.com' alt=''/&gt;&lt;/div&gt;</description>
         <author>Denis Otkidach</author>
         <guid isPermaLink="false">tag:blogger.com,1999:blog-5235863953075762128.post-5210067075777557637</guid>
         <pubDate>Mon, 26 Mar 2012 19:08:00 +0000</pubDate>
      </item>
   </channel>
</rss>

