<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">

  <title />
  
  <link href="http://pilif.github.com/" />
  <updated>2012-01-13T15:35:05+01:00</updated>
  <id>http://pilif.github.com/</id>
  <author>
    <name>Philip Hofstetter</name>
    
      <email>blog@pilif.me</email>
    
  </author>

  
  <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/gnegg" /><feedburner:info uri="gnegg" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><geo:lat>47.3525</geo:lat><geo:long>8.581942</geo:long><link rel="license" type="text/html" href="http://creativecommons.org/licenses/by/2.0/" /><feedburner:browserFriendly>This is an XML content feed. It is intended to be viewed in a newsreader or syndicated to another site.</feedburner:browserFriendly><entry>
    <title>My worst mistakes in programming</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/0xVc2xYj2QY/my-worst-mistakes" />
    <updated>2012-01-13T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2012/01/my-worst-mistakes</id>
    <content type="html">&lt;p&gt;I'm in the middle of refactoring a big infrastructure piece in our
product &lt;a href="http://www.popscan.com"&gt;PopScan&lt;/a&gt;. It's very early code, rarely
touched since its inception in 2004, so I'm dealing mainly with my sins
of the past.&lt;/p&gt;

&lt;p&gt;This time like no time before, I'm feeling the two biggest mistakes I
have ever made in designing a program, so I thought I'd write this post
to help others not fall into the same trap.&lt;/p&gt;

&lt;p&gt;Remember this: Once you are no longer alone working on your project,
the code you have written sets an example. Mistakes you have made are
copied - either verbatim or in spirit. The design you have chosen
lives on in the code that others write (rightfully so - you should
strive to keep code consistent).&lt;/p&gt;

&lt;p&gt;This makes it even more important not to screw up.&lt;/p&gt;

&lt;p&gt;Back in 2004, I failed badly in two places.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;I chose a completely wrong abstraction in class design, mixing two
things that should be separate.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I chose - in a foolhardy wish to save on CPU time - to create a ton
of internal state instead of fetching the data when it's needed (I
could still have cached it then, but I missed that).&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So here's the story.&lt;/p&gt;

&lt;h2&gt;One is the architectural issue.&lt;/h2&gt;

&lt;p&gt;Let me tell you, dear reader, should you &lt;em&gt;ever&lt;/em&gt; be in the position of
having to do anything even remotely related to an ecommerce solution
dealing with products and orders, repeat after me:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Product lists are not the same thing as orders. Orders are not the
same thing as baskets.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;and even more importantly:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;A product and a line item are two completely different things.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;A line item describes how a specific product is placed in a list, so
at best, a product is contained in a line item. A product doesn't have
a quantity. A product doesn't have a total price.&lt;/p&gt;

&lt;p&gt;A line item does.&lt;/p&gt;

&lt;p&gt;And while we are at it: «quantity» is not a number. It is the entity
that describes the number of times the product is contained within the
line item. As such, a quantity usually consists of an amount and a
unit. If you change the unit, you change the quantity. If you change
the amount, you change the quantity.&lt;/p&gt;
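
&lt;p&gt;To make the distinction concrete, here is a minimal sketch in Python (not PopScan's actual code). All class and field names are made up for illustration, and it assumes that &lt;code&gt;unit_price&lt;/code&gt; is the price per unit of the quantity, ignoring unit conversion entirely:&lt;/p&gt;

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Product:
    # A product knows what it is, not how often it was ordered.
    sku: str
    name: str
    unit_price: Decimal  # assumption: price per unit of the quantity


@dataclass(frozen=True)
class Quantity:
    # Amount and unit together form the quantity. Change either one
    # and you have a different quantity.
    amount: Decimal
    unit: str


@dataclass(frozen=True)
class LineItem:
    # A line item places a product in a list with a quantity.
    product: Product
    quantity: Quantity

    def total_price(self):
        # The total belongs to the line item, never to the product.
        return self.product.unit_price * self.quantity.amount


pen = Product("P-1", "Pen", Decimal("1.50"))
item = LineItem(pen, Quantity(Decimal("4"), "piece"))
```

&lt;p&gt;With this split, &lt;code&gt;item.total_price()&lt;/code&gt; is where the total lives, and two quantities with the same amount but different units compare as different values.&lt;/p&gt;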

&lt;p&gt;Anyway - sitting down and thinking about the entities in the feature
that you are implementing is an essential part of the work that you
do. Even if it seems "kinda right" at the time, even if it works
"right" for years - once you make a mistake at a bad place, you are
stuck with it.&lt;/p&gt;

&lt;p&gt;PopScan is about products and ordering them. My missing the
distinction between a product and a line item back in 2004 worked fine
until now, but as this is a core component of PopScan, it has grown
the most over the years, intertwining product and line item
functionality more and more, to the point where it's too late to fix
this now - or at least it would require countless hours of work.&lt;/p&gt;

&lt;p&gt;Work that will have to be done sooner rather than later. Work that
deeply affects a core component of the product. Work that will change
the API greatly and as such can only be tested for correctness in
integration tests. Unit tests become useless as the units that are
being tested won't exist any more in the future.&lt;/p&gt;

&lt;p&gt;Painful work.&lt;/p&gt;

&lt;p&gt;If only I had had more time and experience 8 years ago.&lt;/p&gt;

&lt;h2&gt;The other issue is about state&lt;/h2&gt;

&lt;p&gt;Let's say you have a class &lt;code&gt;FooBar&lt;/code&gt; with a property &lt;code&gt;Foo&lt;/code&gt; that is
exposed as part of the public API via a &lt;code&gt;getFoo&lt;/code&gt; method.&lt;/p&gt;

&lt;p&gt;That &lt;code&gt;Foo&lt;/code&gt; relies on some external data - let's call it &lt;code&gt;foodata&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now you have two options of dealing with that &lt;code&gt;foodata&lt;/code&gt;:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;You could read &lt;code&gt;foodata&lt;/code&gt; into an internal &lt;code&gt;foo&lt;/code&gt; field at
construction time. Then, whenever your &lt;code&gt;getFoo()&lt;/code&gt; is called, you
return the value you stored in &lt;code&gt;foo&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Or you could read nothing until &lt;code&gt;getFoo()&lt;/code&gt; is called and then read
&lt;code&gt;foodata&lt;/code&gt; and return that (optionally caching it for the next call to
&lt;code&gt;getFoo()&lt;/code&gt;)&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;Choosing the first design for most of the models back in 2004 was the
second biggest coding mistake I have ever made in my life.&lt;/p&gt;
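
&lt;p&gt;For contrast, the second option could look roughly like the following Python sketch. The &lt;code&gt;load_foodata&lt;/code&gt; callable and the field names are hypothetical stand-ins for whatever actually fetches &lt;code&gt;foodata&lt;/code&gt;:&lt;/p&gt;

```python
class FooBar:
    def __init__(self, load_foodata):
        # load_foodata is a hypothetical callable that fetches foodata.
        # Nothing is read at construction time.
        self._load_foodata = load_foodata
        self._foo_cache = None
        self._foo_loaded = False

    def get_foo(self):
        # Read foodata on first use only, then serve the cached value.
        if not self._foo_loaded:
            self._foo_cache = self._load_foodata()
            self._foo_loaded = True
        return self._foo_cache


calls = []


def load_foodata():
    # Record every read so the caching behaviour is visible.
    calls.append("read")
    return "foo from foodata"


foobar = FooBar(load_foodata)
```

&lt;p&gt;Constructing &lt;code&gt;foobar&lt;/code&gt; reads nothing, and calling &lt;code&gt;get_foo()&lt;/code&gt; any number of times triggers a single read.&lt;/p&gt;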

&lt;p&gt;Aside from the fact that constructing one of these &lt;code&gt;FooBar&lt;/code&gt; objects
becomes more and more expensive the more stuff you preload (likely
never to be used for the lifetime of the object), you have also
added a huge amount of internal state to the object.&lt;/p&gt;

&lt;p&gt;The temptation to write a &lt;code&gt;getBar()&lt;/code&gt; method that has a side effect of
also altering the internal foo field is just too big. And now you end
up with a &lt;code&gt;getBar()&lt;/code&gt; that suddenly also depends on the internal state
of &lt;code&gt;foo&lt;/code&gt; which suddenly is disconnected from the initial &lt;code&gt;foodata&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Worse, calling code will suddenly see different results depending on
whether it calls &lt;code&gt;getBar()&lt;/code&gt; before it calls &lt;code&gt;getFoo()&lt;/code&gt;. Which will
of course lead to code depending on that fact, so fixing it becomes
very hard (but at least caught by unit tests).&lt;/p&gt;

&lt;p&gt;Having the internal fields also leads to &lt;code&gt;FooBar&lt;/code&gt;'s implementation
preferring these fields over the public methods, which is totally
fine, as long as &lt;code&gt;FooBar&lt;/code&gt; stands alone.&lt;/p&gt;

&lt;p&gt;But the moment there's a &lt;code&gt;FooFooBar&lt;/code&gt; which inherits from &lt;code&gt;FooBar&lt;/code&gt;, you
lose all the advantages of polymorphism. &lt;code&gt;FooBar&lt;/code&gt;'s implementation will
always only use its own private fields. It's impossible for &lt;code&gt;FooFooBar&lt;/code&gt;
to affect &lt;code&gt;FooBar&lt;/code&gt;'s implementation, causing the need to override many
more methods than what would have been needed if &lt;code&gt;FooBar&lt;/code&gt; used its own
public API.&lt;/p&gt;
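
&lt;p&gt;A small Python sketch of that effect, with made-up method names. Note that Python has no truly private fields, so &lt;code&gt;_foo&lt;/code&gt; is private by convention only here:&lt;/p&gt;

```python
class FooBar:
    def __init__(self):
        self._foo = "foo"  # internal field, private by convention

    def get_foo(self):
        return self._foo

    def label_from_field(self):
        # Bypasses the public API: overriding get_foo() changes nothing here.
        return "label:" + self._foo

    def label_from_api(self):
        # Goes through get_foo(), so an override is picked up.
        return "label:" + self.get_foo()


class FooFooBar(FooBar):
    def get_foo(self):
        return "foofoo"


child = FooFooBar()
```

&lt;p&gt;Here &lt;code&gt;child.label_from_api()&lt;/code&gt; follows the override, while &lt;code&gt;child.label_from_field()&lt;/code&gt; still reports the base value, so &lt;code&gt;FooFooBar&lt;/code&gt; would have to override that method as well.&lt;/p&gt;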

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;These two mistakes cost us hours and hours of working around our
inability to do what we want. They cost us hours of debugging and they
cause new features to come out much more clunky than they need to be.&lt;/p&gt;

&lt;p&gt;I have done so many bad things in my professional life. A &lt;code&gt;shutdown -h&lt;/code&gt;
instead of -r on a remote server. A &lt;code&gt;mem=512&lt;/code&gt; boot parameter (yes.
That number is/was interpreted as bytes. And yes. Linux needs more
than 512 bytes of RAM to boot), an &lt;code&gt;update&lt;/code&gt; without &lt;code&gt;where&lt;/code&gt; clause -
I've screwed up so badly in my life.&lt;/p&gt;

&lt;p&gt;But all of this is &lt;em&gt;nothing&lt;/em&gt; compared to these two mistakes.&lt;/p&gt;

&lt;p&gt;These are not just inconveniencing me. These are inconveniencing
my coworkers and our customers (because we need more time to implement
features).&lt;/p&gt;

&lt;p&gt;Shutting down a server by accident means 30 minutes of downtime at
worst (none since we heavily use VMWare). Screwing up a class design
twice is the gift that keeps on giving.&lt;/p&gt;

&lt;p&gt;I'm so sorry for you guys having to put up with &lt;code&gt;OrderSet&lt;/code&gt; of doom.&lt;/p&gt;

&lt;p&gt;Sorry guys.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2012/01/my-worst-mistakes</feedburner:origLink></entry>
  
  <entry>
    <title>Abusing LiveConnect for fun and profit</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/19flltjay4U/grave-digging" />
    <updated>2011-12-22T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/12/grave-digging</id>
    <content type="html">&lt;p&gt;On December 20th, I gave a talk at the JSZurich user group meeting in Zürich.
The talk is about a decade-old technology which can be abused to get full,
unrestricted access to a client machine from JavaScript and HTML.&lt;/p&gt;

&lt;p&gt;I was showing how you would script a Java Applet (which is completely hidden
from the user) to do the dirty work for you while you are creating a very nice
user interface using JavaScript and HTML.&lt;/p&gt;

&lt;iframe class="youtube-player" type="text/html" width="640" height="385" src="http://www.youtube.com/embed/zOhyjaTkjI4" frameborder="0"&gt;
&lt;/iframe&gt;


&lt;p&gt;The slides are &lt;a href="http://bit.ly/vUmkZH"&gt;available in PDF format&lt;/a&gt; too.&lt;/p&gt;

&lt;p&gt;While it's a very cool tech demo, it's IMHO also a very bad security issue
which browser vendors and Oracle need to have a look at. The user sees nothing
but a dialog like this:&lt;/p&gt;

&lt;p&gt;&lt;img src="http://www.pilif.ch/java-prompt.png" width="616" height="301"&gt;
and once they click OK, they are completely owned.&lt;/p&gt;

&lt;p&gt;Even worse, while this dialog is showing the case of a valid certificate, the
dialog in case of an invalid (self-signed or expired) certificate isn't much
different, so users can easily be tricked into clicking allow.&lt;/p&gt;

&lt;p&gt;The source code of the demo application is on &lt;a href="https://github.com/pilif/gravedigging"&gt;GitHub&lt;/a&gt;
and I've already written about this on this blog &lt;a href="/2009/04/javascript-and-applet-interaction/"&gt;here&lt;/a&gt;,
but back then I was mainly interested in getting it to work.&lt;/p&gt;

&lt;p&gt;By now though, I'm really concerned about putting an end to this, or at least
raising the hurdle the end-user has to clear before this goes off -
maybe force them to click a visible Applet. Or just remove the &lt;a
href="http://en.wikipedia.org/wiki/LiveConnect"&gt;LiveConnect&lt;/a&gt; feature
altogether from browsers, thus forcing applets to be visible.&lt;/p&gt;

&lt;p&gt;But aside from the security issues, I still think that this is a very
interesting case of long forgotten technology. If you are interested, do have
a look at the talk and travel back in time to when stuff like this was only
half as scary as it is now.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/12/grave-digging</feedburner:origLink></entry>
  
  <entry>
    <title>updated sacy - now with external tools</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/eF3AYOoDIv8/updated-sacy-again" />
    <updated>2011-11-09T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/11/updated-sacy-again</id>
    <content type="html">&lt;p&gt;I've just updated the &lt;a href="https://github.com/pilif/sacy"&gt;sacy repository&lt;/a&gt; again and tagged a v0.3-beta1 release.&lt;/p&gt;

&lt;p&gt;The main feature since yesterday is support for the official compilers and
tools if you can provide them on the target machine.&lt;/p&gt;

&lt;p&gt;The drawback is that these things come with hefty dependencies at times (I
don't think you'd find a shared hoster willing to install node.js or Ruby for
you), but if you can provide the tools, you can get some really nice
advantages over the PHP ports of the various compilers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;the PHP port of sass has &lt;a href="http://code.google.com/p/phamlp/issues/detail?id=116"&gt;an issue&lt;/a&gt; that prevents
@import from working. sacy's build script does patch that, but the way they
were parsing the file names doesn't inspire confidence in the library. You
might get a more robust solution by using the official tool.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;uglifier-js is a bit faster than JSMin, produces significantly smaller
output and comes with a better license (JSMin isn't strictly free software
as it has this "do no evil" clause)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;coffee script is under very heavy development, so I'd much rather use the
upstream source than some experimental fun project. So far I haven't seen
issues with coffeescript-php, but then I haven't been using it much yet.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Absent from the list you'll find less and css minification:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;the PHP native &lt;a href="http://code.google.com/p/cssmin/"&gt;CSSMin&lt;/a&gt; is really good and
there's no single official external tool out there that's demonstrably better (maybe
the YUI compressor, but I'm not going to support something that requires me
to deal with Java)&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="http://leafo.net/lessphp/"&gt;lessphp&lt;/a&gt; is very lightweight and yet very full
featured and very actively developed. It also has a nice advantage over the
native solution in that the currently released native compiler does not
support reading its input from STDIN, so if you want to use the official
less, you have to go with the git HEAD.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Feel free to try this out (and/or send me a patch)!&lt;/p&gt;

&lt;p&gt;Oh and by the way: If you want to use uglifier or the original coffee script
and you need node but can't install it, have a look at the
&lt;a href="http://pilif.github.com/2011/11/node-to-go/"&gt;static binary&lt;/a&gt; I created.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/11/updated-sacy-again</feedburner:origLink></entry>
  
  <entry>
    <title>updated sacy - now with more coffee</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/vzYAAXjJIGQ/updated-sacy" />
    <updated>2011-11-08T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/11/updated-sacy</id>
    <content type="html">&lt;p&gt;I've just updated the &lt;a href="https://github.com/pilif/sacy"&gt;sacy repository&lt;/a&gt;
to now also provide support for compiling Coffee Script.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;{asset_compile}
&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/coffeescript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/file1.coffee&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/javascript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/file2.js&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
{/asset_compile}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;will now compile file1.coffee into JS before creating and linking one big chunk of minified JavaScript.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/javascript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/assetcache/file2-deadbeef1234.js&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;As always, the support is seamless - this is all you have to do.&lt;/p&gt;

&lt;p&gt;Again, in order to keep deployment simple, I decided to go with a pure PHP solution (&lt;a href="https://github.com/alxlit/coffeescript-php"&gt;coffeescript-php&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;I do see some advantages in the native solutions though (performance, better output), so I'm actively looking into a solution to detect the availability of native converters that I could shell out to without having to hit the file system on every request.&lt;/p&gt;

&lt;p&gt;Also, when adding the coffee support, I noticed that the architecture of sacy isn't perfect for doing this transformation stuff. Too much code had to be duplicated between CSS and JavaScript, so I will do a bit of refactoring there.&lt;/p&gt;

&lt;p&gt;Once both the support for external tools and the refactoring of the transformation is completed, I'm going to release v0.3, but if you want/need coffee support right now, go ahead and clone
&lt;a href="https://github.com/pilif/sacy"&gt;the repository&lt;/a&gt;.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/11/updated-sacy</feedburner:origLink></entry>
  
  <entry>
    <title>node to go</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/f_fbayTlpzQ/node-to-go" />
    <updated>2011-11-07T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/11/node-to-go</id>
    <content type="html">&lt;p&gt;Having node.js around on your machine can be very useful - not just if you are
&lt;a href="/tags/tempalias/"&gt;building your new fun project&lt;/a&gt;, but also for
quite real world applications.&lt;/p&gt;

&lt;p&gt;For me it was &lt;a href="http://jashkenas.github.com/coffee-script/"&gt;coffee script&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;After reading some incredibly beautiful coffee code by &lt;a href="https://twitter.com/brainlock"&gt;@brainlock&lt;/a&gt;
(work related, so I can't link the code), I decided that I wanted to use
coffee in PopScan and as such I need coffee support in sacy which handles
asset compilation for us.&lt;/p&gt;

&lt;p&gt;This means that I need node.js on the server (sacy allows us a very cool
checkout-and-forget deployment without any build-scripts, so I'd like to keep
it that way).&lt;/p&gt;

&lt;p&gt;On servers we manage, this isn't an issue, but some customers insist on
hosting PopScan within their DMZ and provide a pre-configured Linux machine
running OS versions that weren't quite current a decade ago.&lt;/p&gt;

&lt;p&gt;Have fun compiling node.js for these: There are so many dependencies to meet
(a recent python for example) to build it - if you even manage to get it to
compile on these ancient C compilers available for these ancient systems.&lt;/p&gt;

&lt;p&gt;But I really wanted coffee.&lt;/p&gt;

&lt;p&gt;So here you go: Here's a statically linked (this required a bit of trickery)
binary of node.js v0.4.7 compiled for 32bit Linux. This runs even on an
ancient RedHat Enterprise 3 installation, so I'm quite confident that it runs
everywhere running at least Linux 2.2:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.pilif.ch/node-x86-v0.4.7.bz2" checksum="sha256:142085682187a57f312d095499e7d8b2b7677815c783b3a6751a846f102ac7b9"&gt;node-x86-v0.4.7.bz2&lt;/a&gt;
(SHA256:&amp;nbsp;142085682187a57f312d095499e7d8b2b7677815c783b3a6751a846f102ac7b9)&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;pilif@miscweb ~ % file node-x86-v0.4.7 
node-x86: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), for GNU/Linux 2.2.5, statically linked, for GNU/Linux 2.2.5, not stripped
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The binary can be placed wherever you want and executed from there - node
doesn't require any external files (which is very cool).&lt;/p&gt;

&lt;p&gt;I'll update the file from time to time and provide an updated post. 0.4.7 is good enough to run coffee script though.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/11/node-to-go</feedburner:origLink></entry>
  
  <entry>
    <title>protecting siri</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/onQUgaIrfic/protecting-siri" />
    <updated>2011-10-31T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/10/protecting-siri</id>
    <content type="html">&lt;p&gt;Over the last weekend, 9to5mac.com &lt;a href="http://9to5mac.com/2011/10/29/siri-hacked-to-fully-run-on-the-iphone-4-and-ipod-touch-iphone-4s-vs-iphone-4-siri-showdown-video-interview/"&gt;posted about a hack&lt;/a&gt; which shows that it's possible to run Siri on an iPhone 4 and
an iPod Touch 4g and possibly even older devices - considering how much of Siri
is running on Apple's servers.&lt;/p&gt;

&lt;p&gt;We've always suspected that the decision to restrict Siri to the 4S is
basically a marketing decision and I don't really care about this either.
Nobody is forcing you to use Siri and thus nobody is forcing you to update to
anything.&lt;/p&gt;

&lt;p&gt;Siri is Apple's product and so are the various iPhones. It's their decision
whom they want to sell what to.&lt;/p&gt;

&lt;p&gt;What I find more interesting is that it was even possible to have a hacked
Siri on a non-4S phone talk to Apple's servers. If I were in Apple's shoes, I
would have made that (practically) impossible.&lt;/p&gt;

&lt;p&gt;And here's how:&lt;/p&gt;

&lt;p&gt;Having a device that you put into users' hands and trusting it is always a very
hard, if not impossible, thing to do as the device can (more or less) easily be
tampered with.&lt;/p&gt;

&lt;p&gt;So to solve this problem, we need some component that we know reasonably well
to be safe from the user's tampering and we need to find a way for that
component to prove to the server that indeed the component is available and
healthy.&lt;/p&gt;

&lt;p&gt;I would do that using public key crypto and specialized hardware that works
like a TPM. So that would be a chip that contains a private key embedded in
hardware, likely not updatable. Also, that private key will never leave that
device. There is no API to read it.&lt;/p&gt;

&lt;p&gt;The only API the chip provides is either a relatively high-level API to sign
an arbitrary binary blob or, more likely, a lower level one to encrypt some
small input (a SHA1 hash for example) with the private key.&lt;/p&gt;

&lt;p&gt;OK. Now we have that device (also, it's likely that the iPhone already has
something like that for its secured boot process). What's next?&lt;/p&gt;

&lt;p&gt;Next you make sure that the initial handshake with your servers requires that
device. Have the server post a challenge to the phone. Have the phone solve it
and have the response signed by that crypto device.&lt;/p&gt;

&lt;p&gt;On your server, you will have the matching public key. If the signature checks
out, you talk to the device. If not, you don't.&lt;/p&gt;
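
&lt;p&gt;The handshake described above can be sketched in a few lines of Python. This is a toy under loud assumptions: it uses textbook RSA with tiny, completely insecure numbers and no padding, SHA-256 instead of SHA1, and the function names are made up. A real chip would do the private key operation in hardware:&lt;/p&gt;

```python
import hashlib
import secrets

# Toy RSA numbers, far too small to be secure. For illustration only.
N, E = 3233, 17     # public key, known to the server
D = 2753            # private key, would never leave the chip


def digest(challenge):
    # Reduce a SHA-256 hash of the challenge to a number below N.
    return int.from_bytes(hashlib.sha256(challenge).digest(), "big") % N


def chip_sign(challenge):
    # Stands in for the chip: "encrypt" the hash with the private key.
    return pow(digest(challenge), D, N)


def server_verify(challenge, signature):
    # Server side: undo the operation with the public key and compare.
    return pow(signature, E, N) == digest(challenge)


challenge = secrets.token_bytes(16)   # server posts a fresh challenge
signature = chip_sign(challenge)      # phone answers via the chip
```

&lt;p&gt;The server would only continue the conversation if &lt;code&gt;server_verify(challenge, signature)&lt;/code&gt; holds. Using a fresh random challenge each time keeps a recorded response from being replayed.&lt;/p&gt;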

&lt;p&gt;Now, it is possible using very expensive hardware to extract that key from the
hardware (by opening the chip's casing and using a microscope and a lot of
skills). If you are really concerned about this, give each device a unique
private key. If a key gets compromised, blacklist it.&lt;/p&gt;

&lt;p&gt;This greatly complicates the manufacturing process of course, so you might go
ahead with just one private key per hardware type and hope that cracking the
key will take longer than the lifetime of the hardware (which is very likely).&lt;/p&gt;

&lt;p&gt;This isn't at all specific to Siri of course. Whenever you have to trust a
device that you put into consumers' hands, this is the way to go and I'm sure
we'll be seeing more of this in the future (imagine the uses for copy
protection - let's hope we don't end up there).&lt;/p&gt;

&lt;p&gt;I'm not particularly happy that this is possible, but I'd rather talk about it
than to hope that it's never going to happen - it will and &lt;a href="/2011/09/asking-for-permission/"&gt;I'll be pissed&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;For now I'm just wondering why Apple wasn't doing it to protect Siri.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/10/protecting-siri</feedburner:origLink></entry>
  
  <entry>
    <title>A new fun project</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/yhbKdV6jPk0/new-fun-project" />
    <updated>2011-10-12T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/10/new-fun-project</id>
    <content type="html">&lt;p&gt;Like &lt;a href="http://blip.tv/jsconfeu/by-philip-hofstetter-node-js-in-production-use-tempalias-com-4258344"&gt;back in 2010&lt;/a&gt;, I went to JSConf.eu again this year.&lt;/p&gt;

&lt;p&gt;One of the many impressive facts about JSConf is the quality of their Wifi
connection. It's not just free and stable, it's also fast. Not only that, this
time around, they had a very cool feature: You authenticated via Twitter.&lt;/p&gt;

&lt;p&gt;As most of the JS community seems to have Twitter accounts anyway, this
was probably the most convenient solution for everyone: You didn't have to
deal with creating an account or asking someone for a password and on the
other hand, the organizers could make sure that, if abuse should happen,
they'd know whom to notify.&lt;/p&gt;

&lt;p&gt;On a related note: This was in stark contrast to the WiFi I had in the hotel
which was unstable, slow and cost a ton of money to use and it didn't use
Twitter either :-)&lt;/p&gt;

&lt;p&gt;In fact, the Twitter thing was so cool to see in practice that I want to use
it for myself too.&lt;/p&gt;

&lt;p&gt;Since the days of WEP-only Nintendo DS, I'm running two WiFi networks at home:
One is WPA protected and for my own use, the other is open, but it runs over
a different interface on &lt;a href="/2006/07/computers-under-my-command-issue-1-shion/"&gt;shion&lt;/a&gt;
which has no access to any other machine in my network. This is even more
important as &lt;a href="/2005/05/lots-of-fun-with-openvpn/"&gt;I have a permanent OpenVPN connection&lt;/a&gt;
to my office and I definitely don't want to give the world access to that.&lt;/p&gt;

&lt;p&gt;So now the plan is to change that open network so that it redirects to a
captive portal until the user has authenticated with Twitter (I might add
other providers later on - LinkedIn would be &lt;em&gt;awesome&lt;/em&gt; for the office for
example).&lt;/p&gt;

&lt;p&gt;In order for me to actually get the thing going, I'm doing a tempalias on this
one too and keeping a diary of my work.&lt;/p&gt;

&lt;p&gt;So here we go. I really think that every year I should do some fun-project
that's programming related, can be done on my own and is at least of some use.
&lt;a href="/tags/tempalias/"&gt;Last time it was tempalias&lt;/a&gt;, this time, it'll be
&lt;em&gt;Jocotoco&lt;/em&gt; (more about the name in the next installment).&lt;/p&gt;

&lt;p&gt;But before we take off, let me give, again, huge thanks to the JSConf crew for
the amazing conference they manage to organize year after year. If I could,
I'd already preorder the tickets for next year :p&lt;/p&gt;

&lt;p&gt;Attending a JSConf feels like a two-day drug-trip that lasts for at least two
weeks.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/10/new-fun-project</feedburner:origLink></entry>
  
  <entry>
    <title>E_NOTICE stays off.</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/e04X1mUVtjE/e-notice-stays-off" />
    <updated>2011-10-06T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/10/e-notice-stays-off</id>
    <content type="html">&lt;p&gt;I'm sure you've used this idiom a lot when writing JavaScript code&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;foobar&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;It's short, it's concise and it's clear what it does. In Ruby, you can be
even more concise:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:a&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;||=&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;foobar&amp;#39;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;So you can imagine that I was happy with PHP 5.3's new ?: operator:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="php"&gt;&lt;span class="cp"&gt;&amp;lt;?&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;foobar&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;In all three cases, the syntax is concise and readable, though the
PHP one could arguably read a bit better. Still, ?: is better than writing the full
ternary expression, spelling out &lt;code&gt;$options['a']&lt;/code&gt; three times.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.popscan.com"&gt;PopScan&lt;/a&gt;, since forever (forever being
2004) runs with E_NOTICE turned off. Back then, I felt it was
just baggage and I just wanted (had) to get things done quickly.&lt;/p&gt;

&lt;p&gt;This, of course, led to people not taking enough care with the code and
recently, I had one too many cases of a bug caused by accessing a variable that
was undefined in a specific code path.&lt;/p&gt;

&lt;p&gt;I decided that I'm willing to spend the effort on cleaning all of this up and
making sure that there are no undeclared fields and variables in all of
PopScan's codebase.&lt;/p&gt;

&lt;p&gt;Which turned out to be quite a bit of work as a lot of code is apparently
happily relying on the default &lt;code&gt;null&lt;/code&gt; that you can read out of undefined
variables. Those instances might be ugly, but they are by no means bugs.&lt;/p&gt;

&lt;p&gt;Cases where the &lt;code&gt;null&lt;/code&gt; wouldn't be expected are the ones I care about, but I
don't even want to go and discern the two - I'll just fix all of the instances
(embarrassingly many, most of them, thankfully, not mine).&lt;/p&gt;

&lt;p&gt;Of course, if I put hours into a cleanup project like this, I want to be sure
that nobody destroys my work again over time.&lt;/p&gt;

&lt;p&gt;Which is why I was looking into running PHP with &lt;code&gt;E_NOTICE&lt;/code&gt; in development
mode at least.&lt;/p&gt;

&lt;p&gt;Which brings us back to the introduction.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="php"&gt;&lt;span class="cp"&gt;&amp;lt;?&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;foobar&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;is wrong code. Accessing an undefined index of an array always raises a
notice. It's not like Python where you can choose (accessing a dictionary using
[] will throw a KeyError, but there's get() which just returns None). No. You
don't get to choose. You only get to add boilerplate:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="php"&gt;&lt;span class="cp"&gt;&amp;lt;?&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;isset&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="nv"&gt;$options&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;a&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;foobar&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; 
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;See how I'm now spelling &lt;code&gt;$options['a']&lt;/code&gt; three times again? &lt;code&gt;?:&lt;/code&gt; just got a
whole lot less useful.&lt;/p&gt;
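
&lt;p&gt;For comparison, here is a minimal sketch of the Python behaviour mentioned above. The &lt;code&gt;options&lt;/code&gt; dictionary and the variable names are made up for illustration; the point is only that the caller picks between the strict and the lenient lookup.&lt;/p&gt;

```python
# Hypothetical options dictionary; the key 'a' is deliberately missing.
options = {}

# Subscript access on a missing key raises KeyError (the strict variant).
try:
    options['a']
    strict_result = 'no error'
except KeyError:
    strict_result = 'KeyError'

# dict.get() returns None for a missing key instead of raising.
lenient_result = options.get('a')

# Closest match to PHP's ?: operator: fall back on any falsy value.
with_default = options.get('a') or 'foobar'

# setdefault() stores the default only when the key is absent.
options.setdefault('a', 'foobar')

print(strict_result, lenient_result, with_default, options)
```

&lt;p&gt;Note that &lt;code&gt;options.get('a', 'foobar')&lt;/code&gt; would only cover the missing-key case, while the &lt;code&gt;or&lt;/code&gt; form also replaces empty values, which is what ?: does in PHP.&lt;/p&gt;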

&lt;p&gt;But not only that. Let's say you have code like this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="php"&gt;&lt;span class="cp"&gt;&amp;lt;?&lt;/span&gt;
&lt;span class="k"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$def&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="nv"&gt;$port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$port&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="mi"&gt;11211&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;IMHO very readable and clear what it does: It extracts a host and a port and
sets the port to 11211 if there's none in the initial string.&lt;/p&gt;

&lt;p&gt;This of course won't work with E_NOTICE enabled. You either lose the very
concise list() syntax, or you do - &lt;em&gt;ugh&lt;/em&gt; - this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="php"&gt;&lt;span class="cp"&gt;&amp;lt;?&lt;/span&gt;
&lt;span class="k"&gt;list&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$host&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$port&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;explode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;:&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;trim&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$def&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="k"&gt;array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;null&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nv"&gt;$port&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$port&lt;/span&gt; &lt;span class="o"&gt;?:&lt;/span&gt; &lt;span class="mi"&gt;11211&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Which looks ugly as hell. And no, you can't write a wrapper to explode() which
always returns an array big enough, because you don't know what's big enough.
You would have to pass the number of nulls you want into the call too. That
would look nicer than the above hack, but it still doesn't even come close in
conciseness to the solution which throws a notice.&lt;/p&gt;
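
&lt;p&gt;To show what I'm comparing against, here is a rough Python sketch of the same host/port parsing. The function name and the example host names are invented for this illustration; it only handles a single colon, just like the two-element list() above.&lt;/p&gt;

```python
DEFAULT_PORT = 11211  # same default as in the PHP snippet above

def parse_host_port(definition):
    """Split 'host:port' into (host, port), falling back to DEFAULT_PORT."""
    # str.partition always returns exactly three parts, so there is no
    # "undefined index" case to guard against when the colon is missing.
    host, _, port = definition.strip().partition(':')
    return host, (int(port) if port else DEFAULT_PORT)

print(parse_host_port('cache1.example.com:11212'))
print(parse_host_port(' cache2.example.com '))
```

&lt;p&gt;No padding with nulls and no notice: the missing port simply comes back as an empty string, which the fallback then replaces.&lt;/p&gt;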

&lt;p&gt;So. In the end, I'm just complaining about syntax, you might think? I thought so
too and I wanted to add the syntax I liked, so I did a bit of experimenting.
Here's a little something I've come up with:&lt;/p&gt;

&lt;script src="https://gist.github.com/1267568.js?file=e_notice_stays_off.php"&gt;&lt;/script&gt;


&lt;p&gt;The wrapped array solution looks really compelling syntax-wise and I could
totally see myself using this and even forcing everybody else to go there. But
of course, I didn't trust PHP's interpreter and thus benchmarked the thing.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;pilif@tali ~ % php e_notice_stays_off.php 
Notices off. Array 100000 iterations took 0.118751s
Notices off. Inline. Array 100000 iterations took 0.044247s
Notices off. Var. Array 100000 iterations took 0.118603s
Wrapped array. 100000 iterations took 0.962119s
Parameter call. 100000 iterations took 0.406003s
Undefined var. 100000 iterations took 0.194525s
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;So. Using nice syntactic sugar costs roughly 8 times the performance. The second best
solution? Still more than 3 times. Out of the question. Yes. It could be seen as a
micro-optimization, but 100'000 iterations, while a lot, is not &lt;em&gt;that many&lt;/em&gt;.
Waiting nearly a second instead of 0.1 second is crazy, especially for a
common operation like this.&lt;/p&gt;

&lt;p&gt;Interestingly, the most bloated code (that checks with isset()) is twice as
fast as the most readable (just assign). Likely, the notice gets fired
regardless of error_reporting() and then just ignored later on.&lt;/p&gt;

&lt;p&gt;What really pisses me off about this is the fact that everywhere else PHP
doesn't give a damn. '0' is equal to 0. Heck, even 'abc' is equal to 0. It
even fails silently many times.&lt;/p&gt;

&lt;p&gt;But in a case like this, where there is even newly added nice and concise
syntax, it has to be anal and bitchy. And there's no way to get to the needed
solution but to either write too expensive wrappers or ugly boilerplate.&lt;/p&gt;

&lt;p&gt;Dynamic languages give us a very useful tool to be dynamic in the APIs we
write. We can create functions that take a dictionary (an array in PHP) of
options. We can extend our objects at runtime by just adding a property. And
with PHP's (way too) lenient data conversion rules, we can even do math with
user supplied string data.&lt;/p&gt;

&lt;p&gt;But can we read data from $_GET without boilerplate? No. Not in PHP. Can we
use a dictionary of optional parameters? Not in PHP. PHP would require
boilerplate.&lt;/p&gt;

&lt;p&gt;If a language basically mandates retyping the same expression three times,
then, IMHO, something is broken. And if all the workarounds are either crappy
to read or have very bad runtime properties, then something is terribly
broken.&lt;/p&gt;

&lt;p&gt;So, I decided to just fix the problem (undefined variable access) but leave
E_NOTICE where it is (off). There's always &lt;code&gt;git blame&lt;/code&gt; and I'll make sure I
will get a beer every time somebody lets another undefined variable slip in
:-)&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/10/e-notice-stays-off</feedburner:origLink></entry>
  
  <entry>
    <title>Asking for permission</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/_6zQx4b0EGg/asking-for-permission" />
    <updated>2011-09-22T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/09/asking-for-permission</id>
    <content type="html">&lt;p&gt;Only just last year, I told &lt;a href="https://twitter.com/brainlock"&gt;@brainlock&lt;/a&gt;
(in real life, so I can't link) that the coolest thing about our industry was that
you don't have to ask for permission to do anything.&lt;/p&gt;

&lt;p&gt;Want to start the next big web project? Just start it. Want to write about
your opinions? Just write about them. Want to get famous? It's still a lot of
work and marketing, but nothing (aside from a lack of talent) is stopping you.&lt;/p&gt;

&lt;p&gt;Whenever you have a good idea for a project, you start working on it, you see
how it turns out and you decide whether to continue working on it or whether
to scrap it. Aside from a bit of cash for hosting, you don't need anything else.&lt;/p&gt;

&lt;p&gt;This is very cool because it empowers "normal people". Heck, I probably
wouldn't be where I currently am if it wasn't for this. Back in 1996 I had no
money, I wasn't known, I had no past experience. What I had though was
enthusiasm.&lt;/p&gt;

&lt;p&gt;Which is all that's needed.&lt;/p&gt;

&lt;p&gt;Only a year later though, I'm sad to see that we are on the verge of losing
all of this. Piece by piece.&lt;/p&gt;

&lt;p&gt;First was Apple with their iPhone. Even with all the enthusiasm in the world,
you are not going to write an app that other people can run on the phone. No.
First you will have to ask Apple for permission.&lt;/p&gt;

&lt;p&gt;Want to access some third-party hardware from that iPhone app? Sure. But now
you have to not only ask Apple, but also the third party vendor for
permission.&lt;/p&gt;

&lt;p&gt;The explanation we were given is that a malicious app could easily bring down
the mobile network. Thus they needed to be careful what we could run on our
phones.&lt;/p&gt;

&lt;p&gt;But then, we got the iPad with the exact same restrictions even though not all
of them even have mobile network access.&lt;/p&gt;

&lt;p&gt;The explanation this time? Security.&lt;/p&gt;

&lt;p&gt;As nobody wants their machine to be insecure, everybody just accepts it.&lt;/p&gt;

&lt;p&gt;Next came Microsoft: In the Windows Mobile days before the release of 7, you
didn't have to ask anybody for permission. You bought (or pirated if you
didn't have money) Visual Studio, you wrote your app, you published it.&lt;/p&gt;

&lt;p&gt;All of this is lost now. Now you ask for permission. Now you hope for the
powers that be to allow you to write your software.&lt;/p&gt;

&lt;p&gt;Finally, &lt;a href="http://mjg59.dreamwidth.org/5552.html"&gt;you can't even do what you want with your PC&lt;/a&gt; - all because of security.&lt;/p&gt;

&lt;p&gt;So there's still the web you think? I wish I could be positive about that, but
as we are running out of IP-addresses and the adoption of IPv6 is slow as
ever, I believe that public IP addresses are becoming a scarce good at which
point, again, you will be asking for permission.&lt;/p&gt;

&lt;p&gt;In some countries, even today, it's not possible to just write a blog post
because the government is afraid of "unrest" (read: losing even more
credibility). That's not just countries we always perceived as "not free" -
heck, &lt;s&gt;even in Italy you must register with the government if you want to have
a blog&lt;/s&gt; (it turns out that law didn't come to pass - let's hope no other country
has the same bright idea). In Germany, if you read the law by the letter, you
can't blog at all without getting every post approved - you could write
something that a minor might see.&lt;/p&gt;

&lt;p&gt;«But permission will be granted anyways», you might say. Are you sure though?
What if you are a minor wanting to create an application for your first
client? Back in my day, I could just do it. Are you sure that whatever entity
is going to have to give permission wants to do business with minors? You &lt;em&gt;do&lt;/em&gt;
know that you can't have a Gmail account if you are younger than 13 years, don't
you? So age barriers exist.&lt;/p&gt;

&lt;p&gt;What if your project competes with whatever entity has to give permission?
Remember the &lt;a href="http://www.google.com/search?ie=UTF-8&amp;amp;q=google+voice+iphone+rejection"&gt;story about the Google Voice app&lt;/a&gt;?
Once we are out of IP addresses, the big provider and media companies who still
have addresses might see your little startup web project as competition in some
way. Are you sure you will still get permission?&lt;/p&gt;

&lt;p&gt;Back in 1996 when I started my company in high school, all you needed to earn
your living was enthusiasm and a PC (yes - I started doing web programming
without having access to the internet).&lt;/p&gt;

&lt;p&gt;Now you need signed contracts, signed NDAs, lobbying, developer program
memberships, cash - the barriers to entry are infinitely higher at this point.&lt;/p&gt;

&lt;p&gt;I'm afraid though, that this is just the beginning. If we don't stand up now,
if we continue to let big companies and governments take away our freedom of
expression piece by piece, if we give up more and more of our freedom because
of the false promise of security, then, at one point, all of what we had will
be lost.&lt;/p&gt;

&lt;p&gt;We won't be able to just start our projects. We won't be able to create - only
to work on other people's projects. We will lose all that makes our profession
interesting.&lt;/p&gt;

&lt;p&gt;Let's not go there.&lt;/p&gt;

&lt;p&gt;Please.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://news.ycombinator.com/item?id=3025245"&gt;Discussion on HackerNews&lt;/a&gt;&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/09/asking-for-permission</feedburner:origLink></entry>
  
  <entry>
    <title>Lion Server authentication issues</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/XtSmTI-8_6I/lion-password-server" />
    <updated>2011-09-19T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/09/lion-password-server</id>
    <content type="html">&lt;p&gt;Lately I was having an issue with a Lion Server that refused logins of users stored in OpenDirectory. A quick check of &lt;code&gt;/var/log/opendirectoryd.log&lt;/code&gt; revealed an issue with the «Password Server»:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Module: AppleODClient - unable to send command to Password Server - sendmsg() on socket fd 16 failed: Broken pipe (5205)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As this message apparently doesn't appear on Google yet, here's my contribution to solving this.&lt;/p&gt;

&lt;p&gt;The fix was to kill the Kerberos authentication daemon:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sudo killall kpasswdd
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;which in fact didn't help (sometimes &lt;a href="http://xkcd.com/149/"&gt;even sudo isn't enough&lt;/a&gt;), so I had to be more persuasive to get rid of the apparently badly hanging process:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sudo killall -9 kpasswdd
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This time the process was really killed and subsequently instantly restarted by launchd.&lt;/p&gt;

&lt;p&gt;After that, the problem went away.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/09/lion-password-server</feedburner:origLink></entry>
  
  <entry>
    <title>serialize() output is binary data!</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/3h7gHBpMT2w/serialize-mistake" />
    <updated>2011-09-15T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/09/serialize-mistake</id>
    <content type="html">&lt;p&gt;When you call &lt;a href="http://www.php.net/serialize"&gt;serialize()&lt;/a&gt; in PHP, to serialize a value into something that you store for later use with &lt;a href="http://www.php.net/unserialize"&gt;unserialize()&lt;/a&gt;, then be very careful what you are doing with that data.&lt;/p&gt;

&lt;p&gt;When you look at the output, you'd be tempted to assume that it's text data:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;php &amp;gt; $a = array('foo' =&amp;gt; 'bar');
php &amp;gt; echo serialize($a);
a:1:{s:3:"foo";s:3:"bar";}
php &amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and as such, you'd be tempted to treat this as text data (i.e. store it in a TEXT column in your database).&lt;/p&gt;

&lt;p&gt;But what looks like text on first glance isn't text data at all. Assume that my terminal is in ISO-8859-1 encoding:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;php &amp;gt; echo serialize(array('foo' =&amp;gt; 'bär'));
a:1:{s:3:"foo";s:3:"bär";}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;and now assume it's in UTF-8 encoding:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;php &amp;gt; echo serialize(array('foo' =&amp;gt; 'bär'));
a:1:{s:3:"foo";s:4:"bär";}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You will notice that the format encodes the string's length together with the string. And because PHP is inherently not Unicode capable, it's not encoding the string's character length, but its &lt;em&gt;byte-length&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;unserialize() checks whether the encoded length matches the actual delimited string's length. This means that if you treat the serialized output as text and your database's encoding changes along the way, the retrieved string can't be unserialized any more.&lt;/p&gt;

&lt;p&gt;I just learned that the hard way (even though it's obvious in hindsight) while migrating &lt;a href="http://www.popscan.ch"&gt;PopScan&lt;/a&gt; from ISO-8859-1 to UTF-8:&lt;/p&gt;

&lt;p&gt;The databases of existing systems now contain a lot of serialize() output that was generated from ISO-8859-1 strings. Now that the client encoding in the database client is set to UTF-8, the data is retrieved as UTF-8, and because the serialize() output was stored in a TEXT column, it happily gets converted to UTF-8 along with everything else.&lt;/p&gt;

&lt;p&gt;If we remove the database from the picture and express the problem in code, this is what's going on:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;unserialize(utf8_encode(serialize('data with 8bit chàracters')));
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;i.e. the data gets altered after serializing, and it gets altered in a way that unserialize() can't deal with any more.&lt;/p&gt;
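
&lt;p&gt;The length mismatch can be reproduced outside of PHP. The following is only a rough Python sketch: &lt;code&gt;serialize_str&lt;/code&gt; and &lt;code&gt;length_matches&lt;/code&gt; are made-up helpers that mimic the string part of the format shown above; they are not PHP's actual implementation.&lt;/p&gt;

```python
def serialize_str(raw):
    """Mimic PHP's s:LENGTH:"VALUE"; layout for one byte string."""
    return b's:' + str(len(raw)).encode('ascii') + b':"' + raw + b'";'

def length_matches(blob):
    """Check the declared byte length against the actual payload."""
    header, _, rest = blob.partition(b':"')
    declared = int(header[2:])      # digits after the leading s:
    payload = rest[:-2]             # strip the trailing ";
    return declared == len(payload)

# 'b\u00e4r' is the word with an a-umlaut; 3 bytes in ISO-8859-1.
stored = serialize_str('b\u00e4r'.encode('latin-1'))

# What the TEXT column does once the client encoding is UTF-8.
recoded = stored.decode('latin-1').encode('utf-8')

print(stored, length_matches(stored))
print(recoded, length_matches(recoded))
```

&lt;p&gt;After re-encoding, the declared length is still 3 but the payload has grown to 4 bytes, which is exactly the inconsistency unserialize() refuses to accept.&lt;/p&gt;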

&lt;p&gt;So, for everybody else not yet in this dead end:&lt;/p&gt;

&lt;p&gt;The output of serialize() is &lt;em&gt;binary data&lt;/em&gt;. It looks like textual data, but it isn't. Treat it as binary. If you store it somewhere, make sure that the medium you store it to treats the data as binary. No transformation whatsoever must ever be made on it.&lt;/p&gt;

&lt;p&gt;Of course, that leaves you with a problem later on if you switch character sets and you have to unserialize, but at least you get to unserialize then. I have to go to great lengths now to salvage the old data.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/09/serialize-mistake</feedburner:origLink></entry>
  
  <entry>
    <title>Another platform change</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/uFV4GBw7g9E/another-platform-change" />
    <updated>2011-08-03T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/08/another-platform-change</id>
    <content type="html">&lt;p&gt;If you can read this, then it has happened - this blog moved again.&lt;/p&gt;

&lt;p&gt;Eons ago in internet time, &lt;a href="/2002/11/welcome/"&gt;this project started&lt;/a&gt;, which lasted for 4 years, at which point I got spammed so badly that I had to move away.&lt;/p&gt;

&lt;p&gt;So, still ages ago, &lt;a href="/2006/06/new-face-new-engine-new-everything/"&gt;I moved to Serendipity&lt;/a&gt; which fixed the spam issue for me.&lt;/p&gt;

&lt;p&gt;This lasted only two years before &lt;a href="/2008/03/another-new-look/"&gt;I moved again&lt;/a&gt;, to WordPress this time - the nicer admin tool and the richer theme selection pushed me over the edge.&lt;/p&gt;

&lt;p&gt;While I'm still happy with WordPress in general, over time I learned a few things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;While running your own server is fun, having it compromised is not. Using
any well-known blogging engine that relies on server-side generation of
content is ultimately a way to get compromised unless you constantly patch
security issues, taking up a lot of time in the process.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;While the old name of this blog (gnegg) was a cool pun for people who knew
me, it didn't at all convey my identity on the internet. Me? I'm
&lt;a href="http://pilif.me"&gt;pilif&lt;/a&gt;, so this should be conveyed at least by the URL.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Most of my posts were relying heavily on custom markup, making the WP
WYSIWYG editor more annoying than useful.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;So when &lt;a href="https://twitter.com/rmurphey"&gt;@rmurphey&lt;/a&gt; &lt;a href="http://rmurphey.com/blog/2011/07/25/switching-to-octopress/"&gt;blogged about octopress&lt;/a&gt; I immediately recognized the huge opportunity &lt;a href="http://octopress.org"&gt;Octopress&lt;/a&gt; provides:&lt;/p&gt;

&lt;p&gt;I can host static files on a server I don't own and thus don't have to worry about it being compromised, I can blog using my favorite tools (any text editor and git) and I still get an acceptable layout.&lt;/p&gt;

&lt;p&gt;So here we are - at the end of yet another conversion.&lt;/p&gt;

&lt;p&gt;While the old URLs already 301 redirect, pictures are still missing and I'll work on getting them back. I'll try to port the comments over too, the moment I see how Disqus handles my WordPress export.&lt;/p&gt;

&lt;p&gt;The gnegg branding is gone and has been replaced by something that doesn't look like a name but isn't while still being a fun pun for people who know me. The tagline, of course, stays the same.&lt;/p&gt;

&lt;p&gt;So.&lt;/p&gt;

&lt;p&gt;Welcome to my new home and let's hope this lasts as long as the previous instances!&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/08/another-platform-change</feedburner:origLink></entry>
  
  <entry>
    <title>AJAX, Architecture, Frameworks and Hacks</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/2dPEjkp5mV4/ajax-architecture-frameworks-and-hacks" />
    <updated>2011-04-13T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/04/ajax-architecture-frameworks-and-hacks</id>
    <content type="html">&lt;p&gt;Today I was talking with &lt;a href="http://twitter.com/brainlock"&gt;@brainlock&lt;/a&gt; about JavaScript, AJAX and Frameworks and about two paradigms that are in use today:&lt;/p&gt;

&lt;p&gt;The first is the "traditional" paradigm where your JS code is just glorified view code. This is how AJAX worked in the early days and how people are still using it. Your JS code intercepts a click somewhere, sends an AJAX request to the server and gets back either more JS code which just gets evaluated (thus giving the server kind of indirect access to the client DOM) or an HTML fragment which gets inserted at the appropriate spot.&lt;/p&gt;

&lt;p&gt;This means that &lt;em&gt;your JS code will be ugly&lt;/em&gt; (especially the code coming from the server), but it has the advantage that all your view code is right there where all your controllers and your models are: on the server. You see this pattern in use on the 37signals pages or in the &lt;a href="http://github.com"&gt;github&lt;/a&gt; file browser for example.&lt;/p&gt;

&lt;p&gt;Keep the file browser in mind as I'm going to use that for an example later on.&lt;/p&gt;

&lt;p&gt;The other paradigm is to go the other way around and promote JS to a first-class language. Now you build a framework on the client end and transmit only data (XML or JSON, but mostly JSON these days) from the server to the client. The server just provides a REST API for the data plus serves static HTML files. All the view logic lives only on the client side.&lt;/p&gt;

&lt;p&gt;The advantages are that you can organize your client side code much better, for example using &lt;a href="http://documentcloud.github.com/backbone/"&gt;backbone&lt;/a&gt;, that there's no expensive view rendering on the server side and that you basically get your third party API for free because the API is the only thing the server provides.&lt;/p&gt;

&lt;p&gt;This paradigm is used for the new twitter webpage or in my very own &lt;a href="http://tempalias.com"&gt;tempalias.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now &lt;a href="http://twitter.com/brainlock"&gt;@brainlock&lt;/a&gt; is a heavy proponent of the second paradigm. After being enlightened by the great Crockford, we both love JS and we both worked on huge messes of client-side JS code which has grown over the years and lacks structure and feels like copy pasta sometimes. In our defense: Tons of that code was written in the pre-enlightened age (2004).&lt;/p&gt;

&lt;p&gt;I, on the other hand, see some justification for the first pattern as well and I wouldn't throw it away so quickly.&lt;/p&gt;

&lt;p&gt;The main reasons: It's more pragmatic, it's more DRY once you need graceful degradation and arguably, you can reach your goal a bit faster.&lt;/p&gt;

&lt;p&gt;Let me explain by looking at the github file browser:&lt;/p&gt;

&lt;p&gt;If you have a browser that supports the HTML5 history API, then a click on a directory will reload the file list via AJAX and at the same time the URL will be updated using pushState (so that the current view keeps its absolute URL which is valid even after you open it in a new browser).&lt;/p&gt;

&lt;p&gt;If a browser doesn't support pushState, it will gracefully degrade by just using the traditional link (and reloading the full page).&lt;/p&gt;

&lt;p&gt;Let's map this functionality to the two paradigms.&lt;/p&gt;

&lt;p&gt;First the hacky one:&lt;/p&gt;

&lt;ol&gt;
    &lt;li&gt;You render the full page with the file list using a server-side template&lt;/li&gt;
    &lt;li&gt;You intercept clicks to the file list. If it's a folder:&lt;/li&gt;
    &lt;li&gt;you request the new file list&lt;/li&gt;
    &lt;li&gt;the server now renders the file list partial (in rails terms - basically just the file list part) without the rest of the site&lt;/li&gt;
    &lt;li&gt;the client gets that HTML code and inserts it in place of the current file list&lt;/li&gt;
    &lt;li&gt;You patch up the URL using push state&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;done. The view code is only on the server. Whether the file list is requested using the AJAX call or the traditional full page load doesn't matter. The code path is exactly the same. The only difference is that the rest of the page isn't rendered in case of an AJAX call. You get graceful degradation and no additional work.&lt;/p&gt;

&lt;p&gt;Now assuming you want to keep graceful degradation possible and you want to go the JS framework route:&lt;/p&gt;

&lt;ol&gt;
    &lt;li&gt;You render the full page with the file list using a server-side template&lt;/li&gt;
    &lt;li&gt;You intercept the click to the folder in the file list&lt;/li&gt;
    &lt;li&gt;You request the JSON representation of the target folder&lt;/li&gt;
    &lt;li&gt;You use that JSON representation to fill a client-side template which is a copy of the server side partial&lt;/li&gt;
    &lt;li&gt;You insert that HTML at the place where the file list is&lt;/li&gt;
    &lt;li&gt;You patch up the URL using push state&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;The number of steps is the same, but the amount of work isn't: If you want graceful degradation, then you write the file list template twice: Once as a server-side template, once as a client-side template. Both are quite similar but usually you'll be forced to use slightly different syntax. If you update one, you have to update the other or the experience will differ depending on whether you click on a link or open the URL directly.&lt;/p&gt;

&lt;p&gt;Also you are duplicating the code which fills that template: On the server side, you use ActiveRecord or whatever other ORM. On the client side, you'd probably use Backbone to do the same thing but now your backend isn't the database but the JSON response. Now, Backbone is really cool and a huge timesaver, but it's still more work than not doing it at all.&lt;/p&gt;

&lt;p&gt;OK. Then let's skip graceful degradation and make this a JS only client app (&lt;a href="http://www.google.com/search?ie=UTF-8&amp;amp;q=gawker+redesign"&gt;good luck trying to get away with that&lt;/a&gt;). Now the view code on the server goes away and you are just left with the model on the server to retrieve the data, with the model on the client (Backbone helps a lot here, but there's still a substantial amount of code that needs to be written that otherwise wouldn't) and with the view code on the client.&lt;/p&gt;

&lt;p&gt;Now don't get me wrong.&lt;/p&gt;

&lt;p&gt;I &lt;strong&gt;love&lt;/strong&gt; the idea of promoting JS to a first class language. I &lt;strong&gt;love&lt;/strong&gt; JS frameworks for big JS only applications. I &lt;strong&gt;love&lt;/strong&gt; having a "free", dogfooded-by-design REST API. I &lt;strong&gt;love&lt;/strong&gt; building cool architectures.&lt;/p&gt;

&lt;p&gt;I'm just thinking that at this point it's so much work doing it right, that the old ways do have their advantages and that we should not condemn them for being hacky. True. They are. But they are also &lt;em&gt;pragmatic&lt;/em&gt;.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/04/ajax-architecture-frameworks-and-hacks</feedburner:origLink></entry>
  
  <entry>
    <title>DNSSEC to fix the SSL mess?</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/5Bz1QxlzQRo/dnssec-to-clean-the-ssl-mess" />
    <updated>2011-04-07T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2011/04/dnssec-to-clean-the-ssl-mess</id>
    <content type="html">&lt;p&gt;After &lt;a href="http://codebutler.com/firesheep"&gt;Firesheep&lt;/a&gt; it has become clear that there's no way around SSL.&lt;/p&gt;

&lt;p&gt;But still many people (and I'm including myself) are unhappy with the fact that to roll out SSL, you basically have to pay a sometimes significant premium for the certificate. And that's not all: You have to pay the same fee every n years (and while you could say that the CA does some work the first time, every following year, it's plain sucking money from you) and you have to remember to actually do it unless you want &lt;a href="http://forum.skype.com/index.php?showtopic=784971"&gt;embarrassing warnings&lt;/a&gt; to pop up for your users.&lt;/p&gt;

&lt;p&gt;The usual suggestion is to make browsers accept self-signed certificates without complaining, but that doesn't really work to prevent a Firesheep style attack and is arguably even worse as it would allow not only your session id, but also your password to leak from sites that use the traditional SSL-for-login-HTTP-afterwards mechanism.&lt;/p&gt;

&lt;p&gt;See &lt;a href="http://news.ycombinator.com/item?id=2348836"&gt;my comment on HackerNews&lt;/a&gt; for more details.&lt;/p&gt;

&lt;p&gt;To make matters worse, last week news about a CA being compromised and issuing fraudulent (but still trusted) certificates made the rounds, so now even with the current CA based security mechanism, we still can't completely trust the infrastructure.&lt;/p&gt;

&lt;p&gt;Thinking about this, I had an idea.&lt;/p&gt;

&lt;p&gt;Let's assume that one day, one glorious day, DNSSEC will actually be deployed.&lt;/p&gt;

&lt;p&gt;If that's the case, then if I was the owner of gnegg.ch, I could just publish the certificate (or its fingerprint or a link to the certificate over SSL) in the DNS as a TXT record. DNSSEC would ensure that it was the owner of the domain who created the TXT entry and that the domain is the real one and not a faked one.&lt;/p&gt;

&lt;p&gt;So if that entry says that gnegg.ch is supposed to serve a certificate with the fingerprint 0xdeadbeef, then a connecting browser would be sure that if the site is serving that certificate (and has the matching private key), then the connection would be secure and not man-in-the-middle'd.&lt;/p&gt;
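&lt;p&gt;To make the check concrete, here is a minimal Python 3 sketch of what a client could do once it holds a DNSSEC-validated TXT record. The record format (a bare hex fingerprint), the choice of SHA-256 and the function names are assumptions made for this illustration only; the DNSSEC validation itself is assumed to have happened in the resolver, and the certificate bytes are a stand-in for what the TLS handshake would deliver.&lt;/p&gt;

```python
import hashlib
import hmac


def fingerprint(der_bytes):
    """Return the SHA-256 fingerprint of a DER-encoded certificate as hex."""
    return hashlib.sha256(der_bytes).hexdigest()


def certificate_matches_record(der_bytes, txt_record):
    """Compare the served certificate against the fingerprint from DNS.

    txt_record is assumed to hold a hex fingerprint, optionally written
    with colons and in mixed case.
    """
    published = txt_record.replace(":", "").strip().lower()
    return hmac.compare_digest(fingerprint(der_bytes), published)


# Stand-in bytes; a real client would use the certificate from the handshake.
served_cert = b"example certificate bytes"
record = fingerprint(served_cert).upper()

print(certificate_matches_record(served_cert, record))                # True
print(certificate_matches_record(b"some other certificate", record))  # False
```

&lt;p&gt;Replacing a lost key then amounts to publishing a new fingerprint: the old certificate simply stops matching.&lt;/p&gt;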

&lt;p&gt;Even better: If I lose the private key of gnegg.ch, I would just update the TXT record, making the old key useless. No non-working CRL or OCSP. Just one additional DNS query.&lt;/p&gt;

&lt;p&gt;And you know what? It would put CAs out of business for signing of site certificates as a self-signed certificate would be as good as an official one (they would still be needed to sign your DNSSEC zone file of course, but that could be done by the TLD owners).&lt;/p&gt;

&lt;p&gt;Oh and by the way: I could create my certificate with an incredibly long (if any) expiration time: If I want the certificate to be invalid, I remove or change the TXT record and I'm done. As simple as that. No more embarrassing warnings. No more fear of missing the deadline.&lt;/p&gt;

&lt;p&gt;Now, this feels so incredibly simple that there &lt;strong&gt;must&lt;/strong&gt; be something I'm missing. What is it? Is it just that politics is preventing DNSSEC from ever being real? Is there an error in my thinking?&lt;/p&gt;

</content>
  <feedburner:origLink>http://pilif.github.com/2011/04/dnssec-to-clean-the-ssl-mess</feedburner:origLink></entry>
  
  <entry>
    <title>rails, PostgreSQL and the native uuid type</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/mHB3gWT6Y00/rails-postgresql-and-the-native-uuid-type" />
    <updated>2011-03-07T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/03/rails-postgresql-and-the-native-uuid-type</id>
    <content type="html">&lt;p&gt;UUID have the very handy property that they are uniqe and there are quite many of them for you to use. Also they are difficult to guess and knowing the UUID of one object, it's very hard to guess a valid UUID of another object.&lt;/p&gt;

&lt;p&gt;This makes UUIDs perfect for identifying things in web applications:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;Even if you shard across multiple machines, each machine can independently generate primary keys without (realistic) fear of overlapping.&lt;/li&gt;
    &lt;li&gt;You can generate them without using any kind of locks.&lt;/li&gt;
    &lt;li&gt;Sometimes, you have to expose such keys to the user. If possible, you will of course do authorization checks, but it still makes sense not to let users know about neighboring keys. This gets even more important when you are not able to do authorization checks because the resource you are referring to is public (like a &lt;a href="http://tempalias.com"&gt;mail alias&lt;/a&gt;) but it should still not be possible to find other items if you know one.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Knowing that &lt;a href="http://www.codinghorror.com/blog/2007/03/primary-keys-ids-versus-guids.html"&gt;UUIDs are a good thing&lt;/a&gt;, you might want to use them in your application (or you just have to in the last case above).&lt;/p&gt;

&lt;p&gt;There are multiple recipes out there that show how to do it in a rails application (&lt;a href="http://stackoverflow.com/questions/2487837/uuids-in-rails3"&gt;this one for example&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;All of these recipes store UUIDs as varchars in your database. In general, that's fine and also the only thing you can do, as most databases don't have a native data type for UUIDs.&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.postgresql.org"&gt;PostgreSQL&lt;/a&gt;, on the other hand, has a native 128-bit &lt;code&gt;uuid&lt;/code&gt; type to store UUIDs.&lt;/p&gt;

&lt;p&gt;This is more space-efficient than storing the UUID in string form (36 characters, i.e. 288 bits) and it might be a tad faster when doing comparisons in the database: comparing two fixed-size 128-bit values takes a constant number of operations, whereas comparing two string UUIDs is a string comparison whose cost depends on the length of the strings and of their matching parts.&lt;/p&gt;
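&lt;p&gt;The size difference is easy to verify. The following Python 3 snippet is only an illustration using the standard &lt;code&gt;uuid&lt;/code&gt; module (it has nothing to do with Rails or PostgreSQL itself) and assumes one byte per character for the string form:&lt;/p&gt;

```python
import uuid

u = uuid.uuid4()

text_form = str(u)     # canonical form: 32 hex digits plus 4 hyphens
binary_form = u.bytes  # the raw 128-bit value

print(len(text_form) * 8)    # 288 bits as a string
print(len(binary_form) * 8)  # 128 bits as a native value

# Both forms describe the same UUID.
print(uuid.UUID(bytes=binary_form) == uuid.UUID(text_form))  # True
```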

&lt;p&gt;So maybe for the (minuscule) speed increase or for the purpose of correct semantics or just for interoperability with other applications, you might want to use native PostgreSQL UUIDs from your Rails (or other, but without the abstraction of a "Migration", just using UUID is trivial) applications.&lt;/p&gt;

&lt;p&gt;This already works quite nicely if you generate the columns as strings in your migrations and then manually send an &lt;code&gt;alter table&lt;/code&gt; (whenever you restore the schema from scratch).&lt;/p&gt;

&lt;p&gt;But if you want to create the column with the correct type directly from the migration and you want the column to be created correctly when using &lt;code&gt;rake db:schema:load&lt;/code&gt;, then you need a bit of additional magic, especially if you want to still support other databases.&lt;/p&gt;

&lt;p&gt;In my case, I was using PostgreSQL in production (&lt;a href="http://www.gnegg.ch/2004/06/all-time-favourite-tools/"&gt;what&lt;/a&gt; &lt;a href="http://www.gnegg.ch/2009/02/all-time-favourite-tools-update/"&gt;else&lt;/a&gt;?), but on my local machine, for the purpose of getting started quickly, I wanted to still be able to use SQLite for development.&lt;/p&gt;

&lt;p&gt;In the end, everything boils down to monkey patching the &lt;code&gt;ActiveRecord::ConnectionAdapters::*Adapter&lt;/code&gt; classes and &lt;code&gt;PostgreSQLColumn&lt;/code&gt; of the same module. So here's what I've added to &lt;code&gt;config/initializers/uuuid_support.rb&lt;/code&gt; (Rails 3.0.*):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;ActiveRecord&lt;/span&gt;
  &lt;span class="k"&gt;module&lt;/span&gt; &lt;span class="nn"&gt;ConnectionAdapters&lt;/span&gt;
    &lt;span class="no"&gt;SQLiteAdapter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;class_eval&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
      &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;native_database_types_with_uuid_support&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;native_database_types_without_uuid_support&lt;/span&gt;
        &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:uuid&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:name&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;varchar&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:limit&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;36&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
      &lt;span class="n"&gt;alias_method_chain&lt;/span&gt; &lt;span class="ss"&gt;:native_database_types&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:uuid_support&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;adapter_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;SQLite&amp;#39;&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Base&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;adapter_name&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;PostgreSQL&amp;#39;&lt;/span&gt;
      &lt;span class="no"&gt;PostgreSQLAdapter&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;class_eval&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;native_database_types_with_uuid_support&lt;/span&gt;
          &lt;span class="n"&gt;a&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;native_database_types_without_uuid_support&lt;/span&gt;
          &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="ss"&gt;:uuid&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="ss"&gt;:name&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;uuid&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
          &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
        &lt;span class="n"&gt;alias_method_chain&lt;/span&gt; &lt;span class="ss"&gt;:native_database_types&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:uuid_support&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;

      &lt;span class="no"&gt;PostgreSQLColumn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;class_eval&lt;/span&gt; &lt;span class="k"&gt;do&lt;/span&gt;
        &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;simplified_type_with_uuid_support&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;field_type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;uuid&amp;#39;&lt;/span&gt;
            &lt;span class="ss"&gt;:uuid&lt;/span&gt;
          &lt;span class="k"&gt;else&lt;/span&gt;
            &lt;span class="n"&gt;simplified_type_without_uuid_support&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;field_type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
          &lt;span class="k"&gt;end&lt;/span&gt;
        &lt;span class="k"&gt;end&lt;/span&gt;
        &lt;span class="n"&gt;alias_method_chain&lt;/span&gt; &lt;span class="ss"&gt;:simplified_type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:uuid_support&lt;/span&gt;
      &lt;span class="k"&gt;end&lt;/span&gt;
    &lt;span class="k"&gt;end&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;In your migrations you can then use the :uuid type. In my sample case, this was it:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="ruby"&gt;&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;AddGuuidToSites&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;ActiveRecord&lt;/span&gt;&lt;span class="o"&gt;::&lt;/span&gt;&lt;span class="no"&gt;Migration&lt;/span&gt;
  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;up&lt;/span&gt;
    &lt;span class="n"&gt;add_column&lt;/span&gt; &lt;span class="ss"&gt;:sites&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:guuid&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:uuid&lt;/span&gt;
    &lt;span class="n"&gt;add_index&lt;/span&gt; &lt;span class="ss"&gt;:sites&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:guuid&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;

  &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nc"&gt;self&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;down&lt;/span&gt;
    &lt;span class="n"&gt;remove_column&lt;/span&gt; &lt;span class="ss"&gt;:sites&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="ss"&gt;:guuid&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt;
&lt;span class="k"&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Maybe with a bit more Ruby knowledge than I have, it would be possible to just monkey-patch the parent &lt;code&gt;AbstractAdapter&lt;/code&gt; while still calling the method of the current subclass. This would not require a separate patch for each adapter in use.&lt;/p&gt;

&lt;p&gt;For my case which was just support for SQLite and PostgreSQL, the above initializer was fine though.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/03/rails-postgresql-and-the-native-uuid-type</feedburner:origLink></entry>
  
  <entry>
    <title>How I back up gmail</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/F-KQAPYPsrA/how-i-back-up-gmail" />
    <updated>2011-02-28T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/02/how-i-back-up-gmail</id>
    <content type="html">&lt;p&gt;There was a &lt;a href="http://news.ycombinator.com/item?id=2269346"&gt;discussion on HackerNews&lt;/a&gt; about Gmail having lost the email in some accounts. One sentiment in the comments was clear:&lt;/p&gt;

&lt;p&gt;It's totally the user's problem if they don't back up their cloud-based email.&lt;/p&gt;

&lt;p&gt;Personally, I think I would have to agree:&lt;/p&gt;

&lt;p&gt;Google is a provider like every other ISP or basically any other service. There's no reason to believe that your data is safer with Google than it is anywhere else. Now granted, they are not exactly known for losing data, but there are other things that can happen.&lt;/p&gt;

&lt;p&gt;Like your account being closed because whatever automated system believed your usage patterns were consistent with those of a spammer.&lt;/p&gt;

&lt;p&gt;So the question is: What would happen if your Google account wasn't reachable at some point in the future?&lt;/p&gt;

&lt;p&gt;For my company (using commercial Google Apps accounts), I would start up that IMAP server which serves all mail ever sent to and from Gmail. People would use the already existing webmail client or their traditional IMAP clients. They would lose some productivity, but not a single byte of data.&lt;/p&gt;

&lt;p&gt;This was my condition for migrating email over to Google. I needed to have a backup copy of that data. Otherwise, I would not have agreed to switch to a cloud-based provider.&lt;/p&gt;

&lt;p&gt;The process is completely automated too. There's not even a backup script running somewhere. Heck, &lt;strong&gt;not even the Google Account passwords have to be stored anywhere for this to work&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So. How does it work then?&lt;/p&gt;

&lt;p&gt;Before you read on, here are the drawbacks of the solution:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;I'm a die-hard &lt;a href="http://exim.org/"&gt;Exim&lt;/a&gt; fan (long story. It served me very well once - up to saving-my-ass level of well), so the configuration I'm outlining here is for Exim as the mail relay.&lt;/li&gt;
    &lt;li&gt;Also, this &lt;strong&gt;only works with paid Google accounts&lt;/strong&gt;. You can get somewhere using the free ones, but you don't get the full solution (i.e. having a backup of all sent email)&lt;/li&gt;
    &lt;li&gt;This requires you to have full control over the MX machine(s) of your domain.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;If you can live with this, here's how you do it:&lt;/p&gt;

&lt;p&gt;First, you set up your Google domain as normal. Add all the users you want and do everything else just as you would do it in a traditional set up.&lt;/p&gt;

&lt;p&gt;Next, we'll have to configure Google Apps for &lt;a href="http://www.gnegg.ch/2010/06/google-apps-provisioning-two-legged-oauth/"&gt;two-legged OAuth access&lt;/a&gt; to our accounts. I've written about this &lt;a href="http://www.gnegg.ch/2010/06/google-apps-provisioning-two-legged-oauth/"&gt;before&lt;/a&gt;. We are doing this so we don't need to know our users' passwords. Also, we need to enable the provisioning API to get access to the list of users and groups.&lt;/p&gt;

&lt;p&gt;Next, our mail relay will have to know about what users (and groups) are listed in our Google account. Here's what I quickly hacked together in Python (my first Python script ever - be polite while flaming) using the GData library:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="python"&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;gdata.apps.service&lt;/span&gt;

&lt;span class="n"&gt;consumer_key&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;yourdomain.com&amp;#39;&lt;/span&gt;
&lt;span class="n"&gt;consumer_secret&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;2-legged-consumer-secret&amp;#39;&lt;/span&gt; &lt;span class="c"&gt;#see above&lt;/span&gt;
&lt;span class="n"&gt;sig_method&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gdata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;auth&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;OAuthSignatureMethod&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HMAC_SHA1&lt;/span&gt;

&lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gdata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;AppsService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;consumer_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetOAuthInputParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sig_method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;consumer_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;\
  &lt;span class="n"&gt;consumer_secret&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;consumer_secret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;two_legged_oauth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RetrieveAllUsers&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;login&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_name&lt;/span&gt;

&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;gdata.apps.groups.service&lt;/span&gt;

&lt;span class="n"&gt;service&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gdata&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;apps&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groups&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;GroupsService&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;domain&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;consumer_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SetOAuthInputParameters&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sig_method&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;consumer_key&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;\
  &lt;span class="n"&gt;consumer_secret&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;consumer_secret&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;two_legged_oauth&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;service&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RetrieveAllGroups&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;res&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;entry&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;groupName&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Place this script somewhere on your mail relay and run it in a cron job. In my case, I'm having its output redirected to &lt;code&gt;/etc/exim4/gmail_accounts&lt;/code&gt;. The script will emit one user (and group) name per line.&lt;/p&gt;

&lt;p&gt;Next, we'll deal with incoming email:&lt;/p&gt;

&lt;p&gt;In the Exim configuration of your mail relay, add the following routers:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="text"&gt;yourdomain_gmail_users:
  driver = accept
  domains = yourdomain.com
  local_parts = lsearch;/etc/exim4/gmail_accounts
  transport_home_directory = /var/mail/yourdomain/${lc:$local_part}
  router_home_directory = /var/mail/yourdomain/${lc:$local_part}
  transport = gmail_local_delivery
  unseen

yourdomain_gmail_remote:
  driver = accept
  domains = yourdomain.com
  local_parts = lsearch;/etc/exim4/gmail_accounts
  transport = gmail_t
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;yourdomain_gmail_users is what creates the local copy. It accepts all mail sent to yourdomain.com, if the local part (the stuff in front of the @) is listed in that gmail_accounts file. Then it sets up some paths for the local transport (see below) and marks the mail as unseen so the next router gets a chance too.&lt;/p&gt;

&lt;p&gt;Which is yourdomain_gmail_remote. This one is again checking domain and the local part and if they match, it's just delegating to the gmail_t remote transport (which will then send the email to Google).&lt;/p&gt;

&lt;p&gt;The transports look like this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="text"&gt;gmail_t:
  driver = smtp
  hosts = aspmx.l.google.com:alt1.aspmx.l.google.com:\
    alt2.aspmx.l.google.com:aspmx5.googlemail.com:\
    aspmx2.googlemail.com:aspmx3.googlemail.com:\
    aspmx4.googlemail.com
  gethostbyname

gmail_local_delivery:
  driver = appendfile
  check_string =
  delivery_date_add
  envelope_to_add
  group=mail
  maildir_format
  directory = MAILDIR/yourdomain/${lc:$local_part}
  maildir_tag = ,S=$message_size
  message_prefix =
  message_suffix =
  return_path_add
  user = Debian-exim
  create_file = anywhere
  create_directory
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;The gmail_t transport is simple. For the local one, you might have to adjust the user and group plus the location where you want to write the mail to.&lt;/p&gt;

&lt;p&gt;Now we are ready to reconfigure Google as this is all that's needed to get a copy of every inbound mail into a local maildir on the mail relay.&lt;/p&gt;

&lt;p&gt;Here's what you do:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;You change the MX of your domain to point to this relay of yours&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The next two steps are the reason you need a paid account: These controls are not available for the free accounts:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;In your Google Administration panel, you visit the Email settings and configure the outbound gateway. Set it to your relay.&lt;/li&gt;
    &lt;li&gt;Then you configure your inbound gateway and set it to your relay too (and to your backup MX if you have one).&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;This screenshot will help you:&lt;/p&gt;

&lt;p&gt;&lt;a href="http://www.gnegg.ch/wp-content/uploads/2011/02/gmail-config.png"&gt;&lt;img class="aligncenter size-medium wp-image-800" title="gmail-config" src="http://www.gnegg.ch/wp-content/uploads/2011/02/gmail-config-300x102.png" alt="" width="300" height="102" /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;All email sent to your MX will now be relayed to Gmail (over the gmail_t transport we have configured above) and accepted there.&lt;/p&gt;

&lt;p&gt;Also, Gmail will now send all outgoing email to your relay, which needs to be configured to accept (and relay) email from Google. This pretty much depends on your otherwise existing Exim configuration, but here's what I added (which will work with the default ACL):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="text"&gt;hostlist   google_relays = 216.239.32.0/19:64.233.160.0/19:66.249.80.0/20:\
    72.14.192.0/18:209.85.128.0/17:66.102.0.0/20:\
    74.125.0.0/16:64.18.0.0/20:207.126.144.0/20
hostlist   relay_from_hosts = 127.0.0.1:+google_relays
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;And lastly, the tricky part: Storing a copy of all mail that is being sent through Gmail (we are already correctly sending the mail. What we want is a copy):&lt;/p&gt;

&lt;p&gt;Here is the exim router we need:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="text"&gt;gmail_outgoing:
  driver = accept
  condition = &amp;quot;${if and{\
    { eq{$sender_address_domain}{yourdomain.com} }\
    {=={${lookup{$sender_address_local_part}lsearch{/etc/exim4/gmail_accounts}{1}}}{1}}} {1}{0}}&amp;quot;
  transport = store_outgoing_copy
  unseen
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;(did I mention that I severely dislike RPN?)&lt;/p&gt;

&lt;p&gt;and here's the transport:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="text"&gt;store_outgoing_copy:
  driver = appendfile
  check_string =
  delivery_date_add
  envelope_to_add
  group=mail
  maildir_format
  directory = MAILDIR/yourdomain/${lc:$sender_address_local_part}/.Sent/
  maildir_tag = ,S=$message_size
  message_prefix =
  message_suffix =
  return_path_add
  user = Debian-exim
  create_file = anywhere
  create_directory
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;The maildir I've chosen is the correct one if the IMAP-server you want to use is Courier IMAPd. Other servers use different methods.&lt;/p&gt;

&lt;p&gt;One little thing: When you CC or BCC other people in your domain, Google will send out multiple copies of the same message. This will yield some message duplication in the sent directory (one per recipient), but as they say: Better to back up too much than too little.&lt;/p&gt;
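&lt;p&gt;If the duplicates ever become a nuisance, they can be collapsed when reading the backup. The following Python 3 sketch uses the standard &lt;code&gt;mailbox&lt;/code&gt; module and treats the Message-ID header as the identity of a mail. That choice, the helper names and the made-up addresses and IDs are assumptions for this example; the demo builds a throwaway maildir instead of touching a real one.&lt;/p&gt;

```python
import mailbox
import os
import tempfile
from email.message import Message


def unique_messages(maildir_path):
    """Return one message per Message-ID stored in the given maildir."""
    unique = {}
    box = mailbox.Maildir(maildir_path, create=False)
    for key, message in box.iteritems():
        # Fall back to the maildir key so messages without an ID are kept.
        message_id = message.get("Message-ID") or key
        unique.setdefault(message_id, message)
    return list(unique.values())


def make_message(message_id, recipient):
    """Build a tiny demo message (addresses and IDs are made up)."""
    message = Message()
    message["From"] = "me@yourdomain.com"
    message["To"] = recipient
    message["Message-ID"] = message_id
    message.set_payload("hello")
    return message


with tempfile.TemporaryDirectory() as tmp:
    sent_path = os.path.join(tmp, "Sent")
    sent = mailbox.Maildir(sent_path, create=True)
    # Two copies of the same mail (one per recipient) plus one other mail.
    sent.add(make_message("id-1@yourdomain.com", "a@yourdomain.com"))
    sent.add(make_message("id-1@yourdomain.com", "b@yourdomain.com"))
    sent.add(make_message("id-2@yourdomain.com", "a@yourdomain.com"))
    count = len(unique_messages(sent_path))

print(count)  # 2
```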

&lt;p&gt;Now if something happens to your Google account, just start up an IMAP server and have it serve mail from these maildir directories.&lt;/p&gt;

&lt;p&gt;And remember to back them up too, but you can just use rsync or rsnapshot or whatever other technology you might have in use. They are just directories containing one file per email.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/02/how-i-back-up-gmail</feedburner:origLink></entry>
  
  <entry>
    <title>sacy 0.2 - now with less, sass and scss</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/okry8b8IDy4/sacy-0-2-now-with-less-sass-and-scss" />
    <updated>2011-02-16T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/02/sacy-0-2-now-with-less-sass-and-scss</id>
    <content type="html">&lt;p&gt;To fresh up your memory (&lt;a href="/2009/09/introducing-sacy-the-smarty-asset-compiler/"&gt;it has been a while&lt;/a&gt;): &lt;a href="http://github.com/pilif/sacy"&gt;sacy&lt;/a&gt; is a &lt;a href="http://www.smarty.net"&gt;Smarty&lt;/a&gt; (both 2 and 3) plugin that turns&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;{asset_compile}
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/css&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file1.css&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/css&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file2.css&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/css&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file3.css&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/css&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file4.css&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/javascript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/jslib/file1.js&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/javascript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/jslib/file2.js&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/javascript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/jslib/file3.js&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
{/asset_compile}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;into&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/css&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/assets/files-1234abc.css&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/javascript&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/assets/files-abc123.js&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;It does this without you ever having to manually run a compiler, without serving all your assets through some script (thus saving RAM) and without worries about stale copies being served. In fact, you can serve all static files generated with sacy with cache headers telling browsers to never revisit them!&lt;/p&gt;

&lt;p&gt;All of this, using two lines of code (wrap as much content as you want in {asset_compile}...{/asset_compile})&lt;/p&gt;
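
&lt;p&gt;To illustrate why such files can be cached forever, here is a minimal sketch of the general idea of naming a bundle after a hash of its input. This is not sacy's code: &lt;code&gt;bundleName&lt;/code&gt; is a made-up helper, and whether sacy hashes file contents, modification times or something else is an assumption left open here.&lt;/p&gt;

```javascript
// Hypothetical sketch, not sacy's implementation: name a bundle after a
// hash of its sources so the URL changes whenever any source changes.
const crypto = require('crypto');

function bundleName(sources, extension) {
    // Concatenate all sources, then hash the combined text.
    const combined = sources.join('\n');
    const digest = crypto.createHash('sha256').update(combined).digest('hex');
    // A short prefix of the digest is enough for a readable file name.
    return '/assets/files-' + digest.slice(0, 7) + '.' + extension;
}

const before = bundleName(['a { color: red }', 'b { color: blue }'], 'css');
const after = bundleName(['a { color: red }', 'b { color: green }'], 'css');
console.log(before);
console.log(after); // differs from the line above because one source changed
```

&lt;p&gt;Because a changed source produces a new URL, a browser holding the old file in its cache simply never asks for it again.&lt;/p&gt;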

&lt;p&gt;Sacy has been around for a bit more than a year now and has been in production use in &lt;a href="http://www.popscan.com"&gt;PopScan&lt;/a&gt; ever since. During this time, not a single bug has been found in Sacy, so I would say that it's pretty usable.&lt;/p&gt;

&lt;p&gt;Coworkers have bugged me enough about how much better &lt;a href="http://lesscss.org/"&gt;less&lt;/a&gt; or &lt;a href="http://sass-lang.com/"&gt;sass&lt;/a&gt; would be compared to pure CSS that I finally decided to update &lt;a href="http://github.com/pilif/sacy"&gt;sacy&lt;/a&gt; to allow us to use less in PopScan.&lt;/p&gt;

&lt;p&gt;Aside from consolidating and minifying CSS and JavaScript, sacy can now also transform less and sass (or scss) files using the exact same method as before, just with a different MIME type:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/x-less&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file1.less&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/x-sass&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file2.sass&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;link&lt;/span&gt; &lt;span class="na"&gt;type=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;text/x-scss&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;rel=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;stylesheet&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;/styles/file3.scss&amp;quot;&lt;/span&gt; &lt;span class="nt"&gt;/&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Like before, you don't concern yourself with manual compilation or anything. Just use the links as is and sacy will do the magic for you.&lt;/p&gt;

&lt;p&gt;Interested? Read the (by now huge) &lt;a href="https://github.com/pilif/sacy/blob/v0.2/README.markdown"&gt;documentation&lt;/a&gt; on &lt;a href="http://github.com/pilif"&gt;my github page&lt;/a&gt;!&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/02/sacy-0-2-now-with-less-sass-and-scss</feedburner:origLink></entry>
  
  <entry>
    <title>Find relation sizes in PostgreSQL</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/L_teP_8nQkI/find-relation-sizes-in-postgresql" />
    <updated>2011-02-07T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2011/02/find-relation-sizes-in-postgresql</id>
    <content type="html">&lt;p&gt;Like so many times before, today I was yet again in the situation where I wanted to know which tables/indexes take the most disk space in a particular PostgreSQL database.&lt;/p&gt;

&lt;p&gt;My usual procedure in this case was to &lt;code&gt;\dt+&lt;/code&gt; in psql and scan the sizes by eye (this being on my development machine, trying to find out the biggest tables I could clean out to make room).&lt;/p&gt;

&lt;p&gt;But after doing that a few times, and considering that &lt;code&gt;\dt+&lt;/code&gt; does nothing but query some PostgreSQL internal tables, I wanted an easier and less error-prone way. In the end I just wanted the output of &lt;code&gt;\dt+&lt;/code&gt; sorted by size.&lt;/p&gt;

&lt;p&gt;This led to some digging in the source code of psql itself (&lt;code&gt;src/bin/psql&lt;/code&gt;) where I quickly found the function that builds the query (&lt;code&gt;listTables&lt;/code&gt; in &lt;code&gt;describe.c&lt;/code&gt;), so from now on, this is what I'm using when I need an overview of all relation sizes in descending order:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="sql"&gt;&lt;span class="k"&gt;select&lt;/span&gt;
  &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;nspname&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;Schema&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relname&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;Name&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relkind&lt;/span&gt;
     &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;r&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;table&amp;#39;&lt;/span&gt;
     &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;v&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;view&amp;#39;&lt;/span&gt;
     &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;i&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;index&amp;#39;&lt;/span&gt;
     &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;S&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;sequence&amp;#39;&lt;/span&gt;
     &lt;span class="k"&gt;when&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;s&amp;#39;&lt;/span&gt; &lt;span class="k"&gt;then&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;special&amp;#39;&lt;/span&gt;
  &lt;span class="k"&gt;end&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;Type&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_get_userbyid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relowner&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;Owner&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_size_pretty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;oid&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="ss"&gt;&amp;quot;Size&amp;quot;&lt;/span&gt;
&lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_class&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;
 &lt;span class="k"&gt;left&lt;/span&gt; &lt;span class="k"&gt;join&lt;/span&gt; &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_namespace&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="k"&gt;on&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;oid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relnamespace&lt;/span&gt;
&lt;span class="k"&gt;where&lt;/span&gt; &lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;relkind&lt;/span&gt; &lt;span class="k"&gt;IN&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;r&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;v&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;&amp;#39;i&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;order&lt;/span&gt; &lt;span class="k"&gt;by&lt;/span&gt; &lt;span class="n"&gt;pg_catalog&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;pg_relation_size&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;c&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;oid&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;desc&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Of course I could have come up with this without source code digging, but honestly, I didn't know about the relkind values, or about pg_size_pretty and pg_relation_size (I would have thought the size to be stored in some system view), so figuring all of this out would have taken much more time than just reading the source code.&lt;/p&gt;

&lt;p&gt;Now it's here so I remember it next time I need it.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2011/02/find-relation-sizes-in-postgresql</feedburner:origLink></entry>
  
  <entry>
    <title>overpriced data roaming</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/aXgwfFTU7oo/overpriced-data-roaming" />
    <updated>2010-11-04T00:00:00+01:00</updated>
    <id>http://pilif.github.com/2010/11/overpriced-data-roaming</id>
    <content type="html">&lt;p&gt;You shouldn't complain if something gets cheaper. But if something just gets 7 times cheaper from one day to the next, then that leaves you thinking whether the price offered so far might have been a tad bit too high.&lt;/p&gt;

&lt;p&gt;I'm talking about Swisscom's data roaming charges.&lt;/p&gt;

&lt;p&gt;Up to now, you paid CHF 50 per 5 MB (CHF 10 per MB) when roaming in the EU. Yes. That's around $10 and EUR 6.60 per &lt;strong&gt;Megabyte&lt;/strong&gt;. Yes. Megabyte. Not Gigabyte. And you people complain about getting limited to 5 GB for your $30.&lt;/p&gt;

&lt;p&gt;Just now I got a &lt;a href="http://www.swisscom.ch/NR/exeres/FC2C644E-DFB7-49E4-8326-93C03D317BAF,frameless.htm?lang=de"&gt;press release&lt;/a&gt; from Swisscom saying that they are changing their roaming charges to CHF 7 per 5 MB. That's CHF 1.40 per MB, which is &lt;strong&gt;7 times cheaper.&lt;/strong&gt;&lt;/p&gt;
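
&lt;p&gt;Spelled out, the arithmetic behind that factor looks like this (the variable names are made up for illustration):&lt;/p&gt;

```javascript
// Per-megabyte prices derived from the two 5 MB packages.
const oldPerMb = 50 / 5; // CHF 10.00 per MB
const newPerMb = 7 / 5;  // CHF 1.40 per MB

// How many times cheaper the new rate is.
const factor = oldPerMb / newPerMb;
console.log(factor.toFixed(2)); // prints "7.14"
```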

&lt;p&gt;If you can make a product of yours 7 times cheaper from one day to the other, the rates you charged before that were clearly way too high.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2010/11/overpriced-data-roaming</feedburner:origLink></entry>
  
  <entry>
    <title>How to kill IE performance</title>
    <link href="http://feedproxy.google.com/~r/gnegg/~3/L0vcXiBGmwE/how-to-kill-ie-performance" />
    <updated>2010-10-07T00:00:00+02:00</updated>
    <id>http://pilif.github.com/2010/10/how-to-kill-ie-performance</id>
    <content type="html">&lt;p&gt;While working on my day job, we are often dealing with huge data tables in HTML augmented with some JavaScript to do calculations with that data.&lt;/p&gt;

&lt;p&gt;Think huge shopping cart: You change the quantity of a line item and the line total as well as the order total will change.&lt;/p&gt;

&lt;p&gt;This leads to the same data (line items) having three representations:&lt;/p&gt;

&lt;ol&gt;
    &lt;li&gt;The model on the server&lt;/li&gt;
    &lt;li&gt;The HTML UI that is shown to the user&lt;/li&gt;
    &lt;li&gt;The model that's seen by JavaScript to do the calculations on the client side (and then updating the UI)&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;You might think that the JavaScript running in the browser would somehow be able to work with the data from 2) so that the third model wouldn't be needed, but due to various localization issues (think number formatting) and data that's not displayed but affects the calculations, that's not possible.&lt;/p&gt;

&lt;p&gt;So the question is: Considering we have some HTML templating language to build 2), how do we get to 3)?&lt;/p&gt;

&lt;p&gt;Back in 2004 when I initially designed that system (using AJAX before it was even widely called AJAX), I hadn't seen &lt;a href="http://video.yahoo.com/watch/111593/1710507"&gt;Crockford's lecture&lt;/a&gt;s yet, so I still lived in the "JS sucks" world, where I did something like this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;tr&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 1 &lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;addSet(1234 /*prodid*/, 1 /*quantity*/, 10 /*price*/, /* and, later, more, stuff, so, really, ugly */)&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 2&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 3&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;(Yeah - as I said: 2004. No object literals, global functions. We had a lot to learn back then, but so did you, so don't be too angry at me - we improved)&lt;/p&gt;

&lt;p&gt;Obviously, this doesn't scale: As the line items got more complicated, that parameter list grew and grew. The HTML code got uglier and uglier and of course, cluttering the window object is a big no-no too. So we went ahead and built a beautiful design:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;tr&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;lineitem&amp;quot;&lt;/span&gt; &lt;span class="na"&gt;data-ps-lineitem=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;{&amp;quot;prodid&amp;quot;: 1234, &amp;quot;quantity&amp;quot;: 1, &amp;quot;price&amp;quot;: 10, &amp;quot;foo&amp;quot;: &amp;quot;bar&amp;quot;, &amp;quot;blah&amp;quot;: &amp;quot;blah&amp;quot;}&amp;#39;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 1&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 2&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 3&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;The first iteration was then parsing that JSON every time we needed to access any of the associated data (and serializing again whenever it changed). Of course this didn't go that well performance-wise, so we began caching and did something like this (using jQuery):&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(){&lt;/span&gt;
    &lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.lineitem&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;each&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;function&lt;/span&gt;&lt;span class="p"&gt;(){&lt;/span&gt;
        &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;ps_data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;parseJSON&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;attr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;data-ps-lineitem&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Now each DOM element representing one of these &amp;lt;tr&amp;gt;s had a ps_data member which allowed for quick access. The JSON had to be parsed only once and then the data was available. If it changed, writing it back didn't require a re-serialization either - you just changed that property directly.&lt;/p&gt;

&lt;p&gt;This design is reasonably clean (still not as DRY as the initial attempt which had the data only in that JSON string) while still providing enough performance.&lt;/p&gt;

&lt;p&gt;Until you begin to amass datasets. That is.&lt;/p&gt;

&lt;p&gt;Well. Until you do so and expect this to work in IE.&lt;/p&gt;

&lt;p&gt;800 rows like this made IE lock up its UI thread for 40 seconds.&lt;/p&gt;

&lt;p&gt;So more optimization was in order.&lt;/p&gt;

&lt;p&gt;First,&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;.lineitem&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;will kill IE. Remember: IE (still) doesn't have getElementsByClassName, so in IE, jQuery has to iterate over the whole DOM and check whether each element's class attribute contains "lineitem". Considering that IE's DOM isn't really fast to start with, this is a HUGE no-no.&lt;/p&gt;

&lt;p&gt;So.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;tr.lineitem&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Nope. Nearly as bad considering there are still at least 800 tr's to iterate over.&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="javascript"&gt;&lt;span class="nx"&gt;$&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;&amp;#39;#whatever tr.lineitem&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Would help if it weren't 800 tr's that match. Using &lt;a href="http://ajax.dynatrace.com/pages/"&gt;dynaTrace AJAX&lt;/a&gt; (highly recommended tool, by the way) we found out that just selecting the elements alone (without the iteration) took more than 10 seconds.&lt;/p&gt;

&lt;p&gt;So the general take-away is: Selecting lots of elements in IE is painfully slow. Don't do that.&lt;/p&gt;

&lt;p&gt;But back to our little problem here. Unserializing that JSON at DOM ready time is not feasible in IE, because no matter what we do to that selector, once there are enough elements to handle, it's just going to be slow.&lt;/p&gt;

&lt;p&gt;Now by chunking up the amount of work to do and using setTimeout() to launch various deserialization jobs we could fix the locking up, but the total run time before all data is deserialized will still be the same (or slightly worse).&lt;/p&gt;
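
&lt;p&gt;For reference, the chunking idea can be sketched roughly like this. &lt;code&gt;processInChunks&lt;/code&gt; and its parameters are hypothetical names, not part of jQuery or of our code, and the chunk size of 100 is an arbitrary assumption:&lt;/p&gt;

```javascript
// Hypothetical sketch of chunked processing with setTimeout.
// The optional "schedule" argument lets callers swap the scheduler.
function processInChunks(items, chunkSize, handleItem, onDone, schedule) {
    const defer = schedule || function (fn) { setTimeout(fn, 0); };
    let index = 0;

    function runChunk() {
        // Handle at most chunkSize items in this turn.
        const end = Math.min(index + chunkSize, items.length);
        while (index !== end) {
            handleItem(items[index]);
            index += 1;
        }
        if (index === items.length) {
            if (onDone) { onDone(); }
        } else {
            defer(runChunk); // yield to the UI thread before the next chunk
        }
    }

    runChunk();
}

// Usage: deserialize 800 JSON strings, 100 per chunk.
const rawRows = Array.from({ length: 800 }, function (unused, i) {
    return JSON.stringify({ prodid: i, quantity: 1, price: 10 });
});
const parsedRows = [];
processInChunks(rawRows, 100, function (json) {
    parsedRows.push(JSON.parse(json));
}, function () {
    console.log('parsed', parsedRows.length, 'rows');
});
```

&lt;p&gt;The UI stays responsive between chunks, but every row still gets parsed, so the total amount of work doesn't shrink.&lt;/p&gt;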

&lt;p&gt;So what we did in 2004, ugly as it was, was way more feasible in IE.&lt;/p&gt;

&lt;p&gt;Which is why we went back to the initial design with some improvements:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;tr&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;lineitem&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 1 &lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;PopScan.LineItems.add({&amp;quot;prodid&amp;quot;: 1234, &amp;quot;quantity&amp;quot;: 1, &amp;quot;price&amp;quot;: 10, &amp;quot;foo&amp;quot;: &amp;quot;bar&amp;quot;, &amp;quot;blah&amp;quot;: &amp;quot;blah&amp;quot;});&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 2&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 3&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;&lt;em&gt;phew&lt;/em&gt; crisis averted.&lt;/p&gt;

&lt;p&gt;Loading time went back to where it was in the 2004 design. It was still bad though. With those 800 rows, IE was still taking more than 10 seconds for the rendering task. dynaTrace revealed that this time, the time was apparently spent rendering.&lt;/p&gt;

&lt;p&gt;The initial feeling was that there's not much to do at that point.&lt;/p&gt;

&lt;p&gt;Until we began suspecting the script tags.&lt;/p&gt;

&lt;p&gt;Doing this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;tr&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;lineitem&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 1&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 2&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 3&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;The page loaded instantly.&lt;/p&gt;

&lt;p&gt;Doing this&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;tr&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;lineitem&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 1 &lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;1===1;&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 2&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 3&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;it took 10 seconds again.&lt;/p&gt;

&lt;p&gt;Considering that IE's JavaScript engine runs as a COM component, this isn't actually that surprising: Whenever IE hits a script tag, it stops whatever it's doing, sends that script over to the COM component (first doing all the marshaling of the data), waits for that to execute, marshals the result back (depending on where the DOM lives and whether the script accesses it, possibly crossing that COM boundary many, many times in between) and then finally resumes page loading.&lt;/p&gt;

&lt;p&gt;It has to wait for each script because, potentially, that JavaScript could call document.open() / document.write() at which point the document could completely change.&lt;/p&gt;

&lt;p&gt;So the final solution was to loop through the server-side model twice and do something like this:&lt;/p&gt;

&lt;div class="highlight"&gt;&lt;pre&gt;&lt;code class="xml"&gt;&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;tr&lt;/span&gt; &lt;span class="na"&gt;class=&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;lineitem&amp;quot;&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 1 &lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 2&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
    &lt;span class="nt"&gt;&amp;lt;td&amp;gt;&lt;/span&gt;Column 3&lt;span class="nt"&gt;&amp;lt;/td&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/tr&amp;gt;&lt;/span&gt;
&lt;span class="c"&gt;&amp;lt;!-- lots of TRs --&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/table&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;
PopScan.LineItems.add({prodid: 1234, quantity: 1, price: 10, foo: &amp;quot;bar&amp;quot;, blah: &amp;quot;blah&amp;quot;});
// 800 more of these
&lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Problem solved. Not too ugly design. Certainly no 2004 design any more.&lt;/p&gt;
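
&lt;p&gt;A minimal sketch of what such a namespaced registry might look like follows. It is not the actual PopScan code: the members &lt;code&gt;get&lt;/code&gt;, &lt;code&gt;lineTotal&lt;/code&gt; and &lt;code&gt;orderTotal&lt;/code&gt;, and the choice of keying items by &lt;code&gt;prodid&lt;/code&gt;, are assumptions made for illustration.&lt;/p&gt;

```javascript
// Hypothetical sketch: one global namespace object instead of many
// global functions, holding the client-side model of the line items.
const PopScan = { LineItems: (function () {
    const items = {}; // line items keyed by prodid (an assumption)

    return {
        add: function (item) { items[item.prodid] = item; },
        get: function (prodid) { return items[prodid]; },
        // Total for a single line: quantity times price.
        lineTotal: function (prodid) {
            const item = items[prodid];
            return item.quantity * item.price;
        },
        // Sum of all line totals.
        orderTotal: function () {
            return Object.keys(items).reduce(function (sum, id) {
                return sum + items[id].quantity * items[id].price;
            }, 0);
        }
    };
}()) };

PopScan.LineItems.add({prodid: 1234, quantity: 1, price: 10});
PopScan.LineItems.add({prodid: 5678, quantity: 3, price: 2.5});

// The user changes a quantity: update the model, then recompute totals.
PopScan.LineItems.get(1234).quantity = 2;
console.log(PopScan.LineItems.lineTotal(1234), PopScan.LineItems.orderTotal());
```

&lt;p&gt;Since the data lives in plain JavaScript objects, changing a quantity needs neither a DOM lookup nor any JSON parsing.&lt;/p&gt;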

&lt;p&gt;And in closing, let me give you a couple of things you can do if you want to bring IE's performance to its knees:&lt;/p&gt;

&lt;ul&gt;
    &lt;li&gt;Use broad jQuery selectors. &lt;code&gt;$('.someclass')&lt;/code&gt; will cause jQuery to loop through &lt;em&gt;all&lt;/em&gt; elements on the page.&lt;/li&gt;
    &lt;li&gt;Even if you try not to be broad, you can still kill performance: &lt;code&gt;$('div.someclass')&lt;/code&gt;. The most help jQuery can expect from IE is getElementsByTagName, so while it's better than iterating over &lt;em&gt;all&lt;/em&gt; elements, it's still going over all divs on your page. Once there are more than 200 of them, performance degrades very quickly (probably some O(n^2) thing going on somewhere).&lt;/li&gt;
    &lt;li&gt;Use a lot of &amp;lt;script&amp;gt;-tags. Every one of these will force IE to marshal data to the scripting engine COM component and to wait for the result.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Next time, we'll have a look at how to use jQuery's delegate() to handle common cases with huge selectors.&lt;/p&gt;
</content>
  <feedburner:origLink>http://pilif.github.com/2010/10/how-to-kill-ie-performance</feedburner:origLink></entry>
  
</feed>

