<?xml version="1.0" encoding="iso-8859-1"?>
<rss version="2.0"><channel><title>Articles by Dustin Boswell</title><link>http://dustwell.com/</link><description>Everything that I get excited about and want to share with the world.</description><lastBuildDate>Tue, 14 May 2019 07:55:36 GMT</lastBuildDate><generator>PyRSS2Gen-1.0.0</generator><docs>http://blogs.law.harvard.edu/tech/rss</docs><item><title>How X Over SSH really works</title><link>http://dustwell.com/how-x-over-ssh-really-works.html</link><description>


Imagine you are sitting in front of a machine named
"&lt;span style="font-weight: bold; color: rgb(0, 153, 0);font-family:courier new;" &gt;home&lt;/span&gt;"
 that has a keyboard, mouse, display, and is running an
&lt;span style="font-style: italic;"&gt;X server&lt;/span&gt;.   Now you open a terminal and 
&lt;span style="font-style: italic;"&gt;ssh&lt;/span&gt; to a machine called 
"&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;"
 (which doesn't need to have an X server running), and run a program like "firefox" which pops up a window on your screen.    How does this all work?

&lt;br /&gt; &lt;br /&gt;
&lt;img src="images/x-over-ssh.png" /&gt;
&lt;br /&gt;

First, let's clear up the client/server terminology confusion.  When talking about X:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;
&lt;span style="font-weight: bold; font-style: italic;"&gt;X client&lt;/span&gt;:
 a process (like firefox or xemacs) which uses the X client API to display things and receive mouse/keyboard events.&lt;/li&gt;
&lt;li&gt;&lt;span style="font-weight: bold; font-style: italic;"&gt;X server&lt;/span&gt;:
 a process (usually just "X") which X clients connect to.  At times (as we'll see below) other processes can act as X servers.&lt;br /&gt;&lt;/li&gt;
&lt;/ul&gt;

Whenever an X client starts up, it reads the local 
&lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; environment variable,
whose value looks like: &lt;span style="font-weight: bold;font-family:courier new;" &gt;[hostname]:display_number[.screen_number]&lt;/span&gt;.
The X client immediately opens a connection to that X server.  If it can't, it fails:
&lt;br /&gt;&lt;br /&gt;
&lt;div style="border: 1px solid black; padding: 10px;"&gt;
&lt;span style="font-weight: bold;font-family:courier new;" &gt;user&lt;span style="color: rgb(0, 153, 0);"&gt;@home&lt;/span&gt;:
 echo $DISPLAY
&lt;br /&gt;    :0  # hostname is "localhost" by default&lt;br /&gt;user&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;@home&lt;/span&gt;&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: xcalc # pops up a calculator on my screen&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;@home&lt;/span&gt;&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: DISPLAY="nosuchhost:99"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;@home&lt;/span&gt;&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: xcalc&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;      Error: Can't open display: nosuchhost:99&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Now let's see what happens when you ssh to another machine:&lt;br /&gt;&lt;br /&gt;&lt;div style="border: 1px solid black; padding: 10px;"&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;@&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;: ps aux | grep X     # yep, X server running @&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;    root ...    
/usr/bin/X :0  ...&lt;br /&gt;&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;@&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;: ssh -X user@remote  # -X enables X-forwarding&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user@&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: ps aux | grep X   # nope, no X server here&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user@&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: echo $DISPLAY&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;    :11.0&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user@&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: xcalc             # pops up screen @&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;Wait a minute, the &lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; variable is pointing to the localhost ("&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;").  
Natural questions to ask at this point:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;There is no X server @&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;, so why didn't xcalc just fail on startup?&lt;/li&gt;&lt;li&gt;Why was the &lt;span style="font-style: italic;"&gt;display number&lt;/span&gt; "11"?&lt;/li&gt;&lt;li&gt;How did xcalc show something @&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt;?&lt;/li&gt;&lt;/ul&gt;The answer has to do with the &lt;span style="font-style: italic;"&gt;ssh-daemon&lt;/span&gt; running @&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;:&lt;br /&gt;&lt;br /&gt;&lt;div style="border: 1px solid black; padding: 10px;"&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user@&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: ps aux | grep user&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;      root  ...  
sshd:user@pts/11&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;What's happening is that there's an &lt;span style="font-style: italic;"&gt;"X emulator"&lt;/span&gt; running @&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt; that was set up just for your ssh session, and it is listening on display 11.&lt;br /&gt;&lt;br /&gt;To review, here's a play-by-play:&lt;br /&gt;&lt;ol&gt;&lt;li&gt;You type "&lt;span style="font-weight: bold;font-family:courier new;" &gt;ssh -X user@remote&lt;/span&gt;" in your terminal.&lt;/li&gt;&lt;li&gt;The &lt;span style="font-style: italic;"&gt;ssh&lt;/span&gt; process connects to the &lt;span style="font-style: italic;"&gt;sshd&lt;/span&gt; server @&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;.&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;sshd&lt;/span&gt; spawns a new process that is an X-server-emulator listening on some display number, e.g. "11".&lt;br /&gt;&lt;/li&gt;&lt;li&gt;&lt;span style="font-style: italic;"&gt;sshd&lt;/span&gt; sets the &lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; to point to that local "X-server" (e.g. ":11").&lt;/li&gt;&lt;li&gt;xcalc reads this &lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; and connects to this X-server.  
xcalc thinks it's displaying to the local machine.&lt;/li&gt;&lt;li&gt;The X-server-emulator simply forwards the X commands from xcalc through the ssh connection, to the original ssh process.&lt;/li&gt;&lt;li&gt;The ssh process @&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt; now acts as a normal X-client and sends those commands to the X-server @&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt;.&lt;br /&gt;&lt;/li&gt;&lt;/ol&gt;&lt;a href="http://1.bp.blogspot.com/_E3Va1433EXQ/SN6jXcdzzII/AAAAAAAABd8/B-IlTHDjIWc/s1600-h/xoverssh.png"&gt;

&lt;img src="images/x-over-ssh.png" /&gt;
&lt;br&gt;

&lt;/a&gt;Some of you might be wondering: &lt;span style="font-style: italic;"&gt;wasn't the X protocol designed to go over the network?  Can't you do all this without ssh?&lt;/span&gt;  You might be tempted to try something like:&lt;br /&gt;&lt;br /&gt;&lt;div style="border: 1px solid black; padding: 10px;"&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user@&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: DISPLAY="home:0"&lt;/span&gt;&lt;br /&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user@&lt;/span&gt;&lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;: xcalc&lt;/span&gt;&lt;br /&gt;&lt;/div&gt;&lt;br /&gt;This doesn't work because the X server @&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt; won't let other hosts connect to it.  To change this (not that you should -- see below) you can do:&lt;br /&gt;&lt;br /&gt;&lt;div style="border: 1px solid black; padding: 10px;"&gt;&lt;span style="font-weight: bold;font-family:courier new;" &gt;user&lt;span style="color: rgb(0, 153, 0);"&gt;@home&lt;/span&gt;: xhost +remote&lt;/span&gt;&lt;span style="font-style: italic;"&gt;&lt;br /&gt;&lt;/span&gt;&lt;/div&gt;&lt;br /&gt;&lt;span style="font-style: italic;"&gt;xhost&lt;/span&gt; is a command which says "that host can connect to our X-server".  However, everybody uses ssh X forwarding instead.  
Here are some &lt;span style="color: rgb(255, 0, 0); font-weight: bold;"&gt;security reasons&lt;/span&gt; why:&lt;br /&gt;&lt;ul&gt;&lt;li&gt;Normally, X-traffic (like your keystrokes) is sent unencrypted from X-client to X-server.&lt;/li&gt;&lt;li&gt;Ssh nicely sends that data through an encrypted channel, so it doesn't go over the internet in the clear.&lt;/li&gt;&lt;li&gt;"&lt;span style="font-weight: bold;font-family:courier new;" &gt;xhost +remote&lt;/span&gt;" is putting a lot of trust in 'remote' being a nice guy.  If remote ever gets hacked, it could connect to the X-server @&lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt; and listen to all its keystrokes.&lt;/li&gt;&lt;/ul&gt;There's also an issue with firewalls: by doing "&lt;span style="font-weight: bold;font-family:courier new;" &gt;DISPLAY=home:0&lt;/span&gt;", you're assuming that a connection can be established from &lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt; -&gt; &lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt;.   But this isn't always possible --  &lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt; might be sitting behind a firewall (like your Netgear router).  
Since the ssh-connection was set up from &lt;span style="font-weight: bold;font-family:courier new;" &gt;&lt;span style="color: rgb(0, 153, 0);"&gt;home&lt;/span&gt;&lt;/span&gt; -&gt; &lt;span style="font-weight: bold; color: rgb(0, 0, 153);font-family:courier new;" &gt;remote&lt;/span&gt;, X forwarding takes advantage of this already-established connection.&lt;br /&gt;&lt;br /&gt;Other notes for the curious:&lt;br /&gt;- if &lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; is set to "&lt;span style="font-weight: bold;font-family:courier new;" &gt;localhost:0&lt;/span&gt;" (or any explicitly named host) the X client uses TCP/IP to send the X-traffic, even when that host is the local machine.&lt;br /&gt;- if &lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; is just "&lt;span style="font-weight: bold;font-family:courier new;" &gt;:0&lt;/span&gt;" the X client uses a local Unix domain socket, which is more efficient than TCP/IP.
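&lt;br /&gt;&lt;br /&gt;To make the &lt;span style="font-weight: bold;font-family:courier new;" &gt;$DISPLAY&lt;/span&gt; format concrete, here is a small Python sketch that splits a display string into its parts and says how an X client would usually reach it.  The function names are made up for illustration, and the socket path and the "6000 + display number" port rule are the usual X11 conventions as I understand them, not something your particular setup is guaranteed to follow.

```python
def parse_display(value):
    """Split an X display string into (host, display_number, screen_number)."""
    host, sep, rest = value.rpartition(":")
    if not sep:
        raise ValueError("no ':' in display string: %r" % value)
    display, _, screen = rest.partition(".")
    return host, int(display), int(screen) if screen else 0


def describe(value):
    """Say how an X client would typically reach the given display."""
    host, display, _ = parse_display(value)
    if host == "":
        # No hostname: a local Unix domain socket, conventionally at this path.
        return "unix:/tmp/.X11-unix/X%d" % display
    # Explicit hostname: TCP, conventionally on port 6000 + display number.
    return "tcp:%s:%d" % (host, 6000 + display)


for example in (":0", "localhost:0", "localhost:11.0", "home:0"):
    print(example, "maps to", describe(example))
```

Under those conventions, a forwarded display such as "localhost:11.0" corresponds to TCP port 6011 on the remote machine, which is where the per-session "X emulator" would be listening.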

&lt;script&gt;
var disqus_url = "http://thoughts.dustwell.com/2008/09/how-x-over-ssh-really-works.html"
&lt;/script&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/how-x-over-ssh-really-works.html</guid><pubDate>Sat, 27 Sep 2008 00:00:00 GMT</pubDate></item><item><title>How to hash passwords securely.</title><link>http://dustwell.com/how-to-handle-passwords-old.html</link><description>

Let's say you're making a website where users can log in with a password.
How do you handle passwords in a way that is secure and doesn't risk exposing the user's password to the world?

Here's a simple recipe I use:
&lt;ol&gt;
&lt;li&gt;
&lt;b&gt;Always one-way-hash (with a salt) the user's password on the client&lt;/b&gt; (using JavaScript most likely).
 So when your user types
&lt;tt&gt;"my_password"&lt;/tt&gt; into the password field and hits "Log In", the browser will send something like
&lt;tt&gt;"0x22cd3f2e3f2e56f7ecf5..."&lt;/tt&gt; instead.  There's never any need to "decrypt" this.  The rest of the 
system behaves exactly the same as if &lt;tt&gt;"0x22cd3f2e3f2e56f7ecf5..."&lt;/tt&gt; was their actual password.
&lt;/li&gt;

&lt;li&gt;
&lt;b&gt;Store a random string (a salt) for each user in your database.&lt;/b&gt;
&lt;/li&gt;

&lt;li&gt;
Instead of storing the user's password, store &lt;b&gt;&lt;tt&gt;hash(salt+password)&lt;/tt&gt;&lt;/b&gt;.
Here's an example of what this looks like:
&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;username&lt;/th&gt; &lt;th&gt;salt&lt;/th&gt; &lt;th&gt;hash(salt+password)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;BillyBob&lt;/td&gt; &lt;td&gt;0xd029d0f092c09a09b&lt;/td&gt; &lt;td&gt;0xa0947cf7abd520&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;BettySue&lt;/td&gt; &lt;td&gt;0x9017d09082ceaa0fc&lt;/td&gt; &lt;td&gt;0x0014bcfd8be781&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;
&lt;/li&gt;

&lt;li&gt;
To &lt;b&gt;verify if a given password is correct&lt;/b&gt;, look up the salt for that user, and compute
&lt;tt&gt;hash(salt+password)&lt;/tt&gt; to see if that matches.
&lt;/li&gt;

&lt;li&gt;
If a &lt;b&gt;user forgets their password&lt;/b&gt;, send them an email with a link to a one-time URL
where they can enter a &lt;b&gt;new&lt;/b&gt; password.  Compute &lt;tt&gt;hash(salt+new_password)&lt;/tt&gt;
and store that in the database.
&lt;/li&gt;
&lt;/ol&gt;
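&lt;p&gt;
Here's a minimal Python sketch of the server side of this recipe (steps 2 through 5).  It is only meant to show the shape of the idea: SHA-256, the in-memory &lt;tt&gt;users&lt;/tt&gt; dictionary standing in for the database table, and all the function names are assumptions I picked for illustration.  The "password" here is whatever the server receives, which in this recipe is the client-side hash from step 1.
&lt;/p&gt;

```python
import hashlib
import hmac
import secrets

users = {}  # hypothetical in-memory stand-in for the database table


def hash_password(salt, password):
    """Return hash(salt + password) as a hex string (SHA-256 is an assumption)."""
    return hashlib.sha256((salt + password).encode("utf-8")).hexdigest()


def set_password(username, password):
    """Steps 2, 3 and 5: pick a fresh random salt and store only the hash."""
    salt = secrets.token_hex(32)
    users[username] = {"salt": salt, "hash": hash_password(salt, password)}


def verify_password(username, password):
    """Step 4: look up the salt, recompute the hash, and compare."""
    row = users.get(username)
    if row is None:
        return False
    candidate = hash_password(row["salt"], password)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(candidate, row["hash"])


set_password("BillyBob", "Password123")
print(verify_password("BillyBob", "Password123"))
print(verify_password("BillyBob", "password123"))
```

&lt;p&gt;
The first check prints &lt;tt&gt;True&lt;/tt&gt; and the second prints &lt;tt&gt;False&lt;/tt&gt;.  A password reset is just another call to &lt;tt&gt;set_password&lt;/tt&gt; with the new password.
&lt;/p&gt;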


&lt;h3&gt;Why storing plain text passwords is bad&lt;/h3&gt;
Let's say that you just store all the passwords in plain text.  The problem is that
some nefarious hacker might gain access to your database:
&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;username&lt;/th&gt; &lt;th&gt;email&lt;/th&gt; &lt;th&gt;password (plain text!)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;BillyBob&lt;/td&gt; &lt;td&gt;billybob23@gmail.com&lt;/td&gt; &lt;td&gt;Password123&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;BettySue&lt;/td&gt; &lt;td&gt;bettysue@yahoo.com&lt;/td&gt; &lt;td&gt;iLoveChocolate&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;

&lt;p&gt;
The bigger problem is that &lt;b&gt;most people re-use passwords across multiple sites&lt;/b&gt;.
So a smart hacker could try to log in to &lt;tt&gt;billybob23@gmail.com&lt;/tt&gt;'s email account
using the password &lt;tt&gt;"Password123"&lt;/tt&gt;. Think about all the information someone
can get out of an email account (bank records for instance).
&lt;/p&gt;

&lt;p&gt;
If that password doesn't work for BillyBob's email, the hacker could then try using
that username/email/password for a number of popular banks, like bofa.com and chase.com.
&lt;/p&gt;

&lt;p&gt;
And if that doesn't work, the hacker will move on to BettySue's information.
The hacker will probably be successful for a good fraction of your users.
So do your users a favor, and don't store passwords in plain text.
&lt;/p&gt;

&lt;h3&gt;Why you need salts in your database&lt;/h3&gt;
&lt;p&gt;
Why isn't it good enough to store &lt;tt&gt;hash(password)&lt;/tt&gt; in the database?
Hashes are one-way, so it's impossible to "undo" a hash, right? 
&lt;/p&gt;

&lt;p&gt;
Well, yes and no.  The problem is that hackers have computed
&lt;a href="http://en.wikipedia.org/wiki/Rainbow_table"&gt;Rainbow Tables&lt;/a&gt; for many well-known hashes
(like &lt;tt&gt;md5&lt;/tt&gt;, &lt;tt&gt;sha1&lt;/tt&gt;, &lt;tt&gt;sha256&lt;/tt&gt;, etc... -- all the hashes that you might be using).
In a nutshell, a Rainbow Table is
a giant list (like hundreds of millions) of common passwords along with their hash: 

&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;password&lt;/th&gt; &lt;th&gt;hash(password)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;a&lt;/td&gt; &lt;td&gt;0x2d08d232d9823d&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;b&lt;/td&gt; &lt;td&gt;0xfe8f8c8a8e8d82&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Password123&lt;/td&gt; &lt;td&gt;0xc0c27d8f9dee475c&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td style="color:blue"&gt;iLoveChocolate&lt;/td&gt; &lt;td style="color:red"&gt;0x32c243e37333489e&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt; &lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;

(A Rainbow Table is actually a little smarter/more space efficient than this, but I won't go into the details.)
This table took a long time to generate, but once it's built, the hacker community can share it forever.

&lt;p&gt;
Now it's easy to "reverse" the simple &lt;tt&gt;hash(password)&lt;/tt&gt; in your database -- they just take the hash like
&lt;tt style="color: red"&gt;0x32c243e37333489e&lt;/tt&gt;, look it up in the Rainbow Table, and voilà! The password is &lt;tt style="color: blue"&gt;iLoveChocolate&lt;/tt&gt;.
&lt;/p&gt;

&lt;p&gt;
This is why you need to add a &lt;b&gt;salt&lt;/b&gt; to the input of your hash.  So instead of storing &lt;tt&gt;hash("iLoveChocolate")&lt;/tt&gt;,
you would be storing &lt;tt&gt;hash("this is your random long salt." + "iLoveChocolate")&lt;/tt&gt;. This makes the Rainbow Table
ineffective because &lt;tt&gt;"this is your random long salt.iLoveChocolate"&lt;/tt&gt; is unlikely to be in the table.
The actual salt you use should be a good long (and random) string, perhaps 100 characters or greater.
&lt;/p&gt;
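&lt;p&gt;
To see the lookup attack and the effect of a salt side by side, here's a toy Python sketch.  It uses a plain dictionary as the "table" (a real Rainbow Table is more compact than this), a four-entry password list, and SHA-256 as the hash; all three are simplifying assumptions for illustration.
&lt;/p&gt;

```python
import hashlib


def h(text):
    """Hex digest of the text (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


# The attacker's precomputed table: hash(password) mapped back to password.
common_passwords = ["a", "b", "Password123", "iLoveChocolate"]
lookup = {h(password): password for password in common_passwords}

# What the attacker finds in a stolen database, without and with a salt.
stolen_unsalted = h("iLoveChocolate")
stolen_salted = h("this is your random long salt." + "iLoveChocolate")

print(lookup.get(stolen_unsalted))  # the unsalted hash is found in the table
print(lookup.get(stolen_salted))    # the salted hash is not in the table
```

&lt;p&gt;
The first lookup prints &lt;tt&gt;iLoveChocolate&lt;/tt&gt; and the second prints &lt;tt&gt;None&lt;/tt&gt;, because the table was built from unsalted inputs.
&lt;/p&gt;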

&lt;h3&gt;Why you need per-user salts&lt;/h3&gt;
You might be tempted to just use a single salt value for all users, and not have to deal with storing each user's
salt in the database.  But this is less secure.

&lt;p&gt;
For one, the hacker could start to build his own Rainbow Table, where &lt;tt&gt;"this is your single fixed salt value"&lt;/tt&gt;
is prepended to each password first.  It would take a while, since the hacker would have to compute hundreds of millions
of hashes.  But eventually he might be able to match one of the hashes in your database to one of the
entries in his newly-built Rainbow Table.
&lt;/p&gt;

&lt;p&gt;
Also, by having a single salt for all users, the hacker would be able to know if 2 users happen to use the same password.
If he sees 2 users with the same hash(fixed_salt+password), it must be because they have the same password.
Thankfully, he doesn't know &lt;b&gt;what&lt;/b&gt; that password is (yet).  But a password that is used by two users is likely to be
very weak, and so he could focus his attack on those 2 users.
&lt;/p&gt;

&lt;p&gt;
So it's a good idea to have a per-user salt.  This makes the above two attacks much less likely.
&lt;/p&gt;
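&lt;p&gt;
The "same hash means same password" leak is easy to demonstrate.  In this Python sketch the third user (JoeBob), the SHA-256 hash, and the helper names are all invented for illustration; the point is only that a fixed salt groups users who share a password, while per-user random salts do not.
&lt;/p&gt;

```python
import hashlib
import secrets
from collections import defaultdict


def h(text):
    """Hex digest of the text (SHA-256 chosen only for illustration)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def shared_password_groups(rows):
    """Group usernames by stored hash; keep groups with more than one user."""
    by_hash = defaultdict(list)
    for username, stored_hash in rows:
        by_hash[stored_hash].append(username)
    return [names for names in by_hash.values() if len(names) != 1]


passwords = {
    "BillyBob": "Password123",
    "BettySue": "iLoveChocolate",
    "JoeBob": "Password123",  # hypothetical user who re-uses BillyBob's password
}

fixed_salt = "this is your single fixed salt value"
fixed_rows = [(user, h(fixed_salt + pw)) for user, pw in passwords.items()]
per_user_rows = [(user, h(secrets.token_hex(16) + pw)) for user, pw in passwords.items()]

print(shared_password_groups(fixed_rows))
print(shared_password_groups(per_user_rows))
```

&lt;p&gt;
With the fixed salt, BillyBob and JoeBob show up as a group; with per-user salts the list of groups is empty, so the hacker learns nothing about who shares a password.
&lt;/p&gt;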

&lt;h3&gt;Why you should hash the password on the client also.&lt;/h3&gt;
When the user types their password into your &lt;tt&gt;&amp;lt;input type=password&amp;gt;&lt;/tt&gt; and submits that form,
that password value is sent across the internet in plain text, and given to your web server, which might
even log all the parameters for all the requests it gets.
&lt;p&gt;
That's a lot of opportunity for the password &lt;tt&gt;"iLoveChocolate"&lt;/tt&gt; to get stolen.  That's why it's better
to hash that password in the browser first, &lt;b&gt;so their password never leaves their computer.&lt;/b&gt;
Note that a hacker could still sniff the &lt;i&gt;hashed password&lt;/i&gt; going over the network, and use that hash later
to send to the server and impersonate you.  But at least the hacker can't use your real password for other
purposes.
&lt;/p&gt;

&lt;p&gt;
For the same reasons as above, it's better to salt this hash too.  (In general, it's always more secure
to add a salt before hashing.  The saltier, the better.)
&lt;/p&gt;

&lt;h3&gt;Shouldn't you use https/ssl instead?&lt;/h3&gt;
&lt;p&gt;
Yes, you should use it, but I believe &lt;b&gt;it's not enough&lt;/b&gt;.  You should still use a client-side hash because:
&lt;ul&gt;
  &lt;li&gt;Servers often log incoming requests (and their POST parameters) to a file.  This would mean the user's plain-text passwords are just sitting there in a file somewhere.
  &lt;/li&gt;
  &lt;li&gt;One day someone might accidentally turn &lt;tt&gt;https&lt;/tt&gt; off, or accidentally use &lt;tt&gt;http://&lt;/tt&gt; in one of the places where a password is sent.
  &lt;/li&gt;
  &lt;li&gt;SSL might be unavailable for a particular client or server (for example, some embedded device).&lt;/li&gt;
&lt;/ul&gt;
&lt;/p&gt;

&lt;h3&gt;Example client-side code&lt;/h3&gt;
Here's some example html:
&lt;pre class="code"&gt;&lt;code&gt;&amp;lt;div id="fake_form"&amp;gt;
  &amp;lt;input type="text" id="username" /&amp;gt;
  &amp;lt;input type="password" id="raw_password" /&amp;gt;
  &amp;lt;input type="button" value="Log In" onclick="login()" /&amp;gt;
&amp;lt;/div&amp;gt;&lt;/code&gt;&lt;/pre&gt;

Note: I use a &lt;tt&gt;div&lt;/tt&gt; instead of a &lt;tt&gt;form&lt;/tt&gt; to prevent it from accidentally getting submitted (thereby sending the &lt;tt&gt;raw_password&lt;/tt&gt;).  Here's the corresponding JavaScript:
(I haven't tested this - I'm just giving you an idea of how it works.)

&lt;pre class="code"&gt;&lt;code&gt;var login = function() {
  // Assumes jQuery is loaded; .val() reads the current value of an input.
  var username = $('#username').val();
  var raw_password = $('#raw_password').val();

  // Choose ONE of the versions below depending on your needs and delete the others.
  // hash() is a placeholder for a secure hash function (like sha-256) that you supply.

  // Version 1: good
  var hashed_password = hash(raw_password);

  // Version 2: better (salt the hash with your domain)
  var hashed_password = hash("example.com" + raw_password);

  // Version 3: best
  var hashed_password = hash(username + "example.com" + raw_password);

  // Now login to server by sending (username, hashed_password) ...
}&lt;/code&gt;&lt;/pre&gt;

Version 3 is technically the most secure, but this only works if &lt;tt&gt;username&lt;/tt&gt;
will never change (probably a dangerous assumption). And if your user can log in with either a &lt;i&gt;username&lt;/i&gt; or &lt;i&gt;email&lt;/i&gt;, then it won't work either.  So unfortunately, version 3 isn't doable for most sites.

&lt;script&gt;
var disqus_url = "http://dustwell.com/how-to-handle-passwords.html";
&lt;/script&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/how-to-handle-passwords-old.html</guid><pubDate>Tue, 09 Feb 2010 00:00:00 GMT</pubDate></item><item><title>How to use screen to pair program.</title><link>http://dustwell.com/screen_command_to_pair_program.html</link><description>

&lt;p&gt;
&lt;a href="http://en.wikipedia.org/wiki/Pair_programming"&gt;Pair programming&lt;/a&gt; is a great way for 2 people to work together on the same code.
Typically, it's done with the programmers sitting right next to each other. But what if they are in separate places,
or they just don't want to accidentally touch elbows (ewww...)?
&lt;/p&gt;

&lt;p&gt;
&lt;div style="text-align: center"&gt;
&lt;img src="images/programmer2.jpg" style="width: 200px; padding: 20px;" /&gt;
&lt;img src="images/programmer.jpg" style="width: 200px; padding: 20px;" /&gt;
&lt;/div&gt;
&lt;/p&gt;

&lt;p&gt;The UNIX &lt;a href="http://en.wikipedia.org/wiki/GNU_Screen"&gt;screen&lt;/a&gt; command is typically used to run multiple terminal programs
inside a single &lt;tt&gt;ssh&lt;/tt&gt; session, and be able to disconnect/re-connect to the session without
the programs noticing. It's an awesome utility, but you can also use it to let multiple people
interact with the same terminal screen, and hence, allow multiple people to use the same editor
at the same time.
&lt;/p&gt;

&lt;p&gt;First, I'm assuming both programmers have a user account on the same machine, and are already
logged in (or ssh'd in) to the machine.  (If the second programmer doesn't have an account, he
can use the first programmer's account, and the steps below are the same.)
&lt;/p&gt;

&lt;p&gt;
&lt;h3&gt;Enabling multi-user with &lt;tt&gt;screen&lt;/tt&gt;&lt;/h3&gt;
There are two ways to do this.  One way is to do
&lt;pre class=code&gt;
chmod u+s /usr/bin/screen
&lt;/pre&gt;

first, and then make sure everyone's &lt;tt&gt;~/.screenrc&lt;/tt&gt; file contains:
&lt;pre class=code&gt;
multiuser on
acladd &lt;i&gt;second_programmer_username&lt;/i&gt;
&lt;/pre&gt;

The other way is to just have
&lt;pre class=code&gt;
multiuser on
acladd root
&lt;/pre&gt;

But then the second programmer will need to do &lt;tt&gt;sudo screen&lt;/tt&gt; instead of just &lt;tt&gt;screen&lt;/tt&gt; in the steps below.
(There are also &lt;a href="http://aperiodic.net/screen/multiuser"&gt;more advanced security options&lt;/a&gt;.)
&lt;/p&gt;


&lt;p&gt;
&lt;h3&gt;First Programmer: run &lt;tt&gt;screen&lt;/tt&gt;&lt;/h3&gt;
The first programmer starts his day by doing:
&lt;pre class=code&gt;
screen
&lt;/pre&gt;
Hit &lt;tt&gt;ENTER&lt;/tt&gt; to dismiss the screen startup message.
Then, go about your normal activities, such as running &lt;tt&gt;vim&lt;/tt&gt;, or &lt;tt&gt;grep&lt;/tt&gt;, or whatever.
&lt;/p&gt;


&lt;p&gt;
&lt;h3&gt;Second Programmer: attach to his screen&lt;/h3&gt;
The second user does the following:
&lt;pre class=code&gt;
[sudo] screen -rx &lt;i&gt;first_programmer_username&lt;/i&gt;/
&lt;/pre&gt;
This attaches to the other user's active screen.
&lt;/p&gt;

&lt;p&gt;
That's it! Now you should both be seeing the same window, and you can both type into it.
&lt;/p&gt;

&lt;p&gt;
&lt;h3&gt;Problems with the Delete Button?&lt;/h3&gt;
When ssh'ing from a terminal in Mac OS X I noticed that my &lt;tt&gt;delete&lt;/tt&gt; button no longer worked.
(ssh'ing from PuTTY in Windows didn't have this problem.)  To fix this, do:
&lt;pre class=code&gt;
TERM=screen; screen
&lt;/pre&gt;
wherever you would normally do &lt;tt&gt;screen&lt;/tt&gt;.  Or, it's probably easier to just put
&lt;pre class=code&gt;
alias screen='TERM=screen; screen'
&lt;/pre&gt;
in your &lt;tt&gt;~/.bashrc&lt;/tt&gt; file.
&lt;/p&gt;

&lt;p&gt;
&lt;h3&gt;Essential commands while inside &lt;tt&gt;screen&lt;/tt&gt;&lt;/h3&gt;
&lt;pre class=code&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;Command&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ctrl-a ?&lt;/td&gt;&lt;td&gt;Show the &lt;tt&gt;screen&lt;/tt&gt; help menu&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ctrl-a d&lt;/td&gt;&lt;td&gt;&lt;b&gt;D&lt;/b&gt;etach from the screen (without killing it)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ctrl-a c&lt;/td&gt;&lt;td&gt;&lt;b&gt;C&lt;/b&gt;reate another window inside the screen.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ctrl-a &amp;lt;space&amp;gt;&amp;nbsp;&amp;nbsp;&lt;/td&gt;&lt;td&gt;Cycle to the next window.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ctrl-a Ctrl-a&lt;/td&gt;&lt;td&gt;&lt;b&gt;A&lt;/b&gt;lternate to the previous window.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Ctrl-a [&lt;/td&gt;&lt;td&gt;Enter "scroll mode" (use Up/Down arrows, then ESC to exit).&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;
&lt;/p&gt;

&lt;p&gt;
&lt;h3&gt;Other &lt;tt&gt;screen&lt;/tt&gt; flags&lt;/h3&gt;
&lt;pre class=code&gt;
&lt;table&gt;
&lt;tr&gt;&lt;th&gt;Command-line&lt;/th&gt;&lt;th&gt;Purpose&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;screen -ls&lt;/td&gt;&lt;td&gt;List all the active screens I have on this machine.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;screen&lt;/td&gt;&lt;td&gt;Start a new screen session.&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;screen -x&lt;/td&gt;&lt;td&gt;Reattach to my pre-existing screen session.&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;
&lt;/p&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/screen_command_to_pair_program.html</guid><pubDate>Fri, 02 Apr 2010 00:00:00 GMT</pubDate></item><item><title>How iTunes works under the (file) covers</title><link>http://dustwell.com/understanding-itunes-files.html</link><description>

If you're like me, you have a bunch of mp3s (that you legitimately obtained from your CDs - ahem) in a neatly organized directory tree on your hard drive.  And it's a breeze to navigate, play, backup, and share.&lt;br /&gt;

&lt;div style="text-align: center;"&gt; &lt;img src="images/itunes-icon.jpg"&gt; &lt;/div&gt;

Now you've got an iPod and/or a Mac, and you want to take the plunge and use iTunes - but your puny little file-based UNIX-loving brain can't handle the transition.  Well, neither could I at first, but now I've finally unraveled how the "file system" of iTunes works.&lt;br /&gt;&lt;br /&gt;Here are the juicy nuggets that will help you understand:&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;About authorization:&lt;/span&gt;&lt;br /&gt;1) Instead of .&lt;span style="font-weight: bold;"&gt;mp3&lt;/span&gt; files, songs purchased from iTunes are .&lt;span style="font-weight: bold;"&gt;m4p&lt;/span&gt; files.  They are similar to mp3s but they are essentially encrypted with your &lt;span style="font-style: italic;"&gt;AppleID&lt;/span&gt; as a key.  (In fact, if you open up the .m4p file in a text editor, you can see your email address!)&lt;br /&gt;&lt;br /&gt;2) Apple allows you to &lt;span style="font-weight: bold;"&gt;copy&lt;/span&gt; those .m4p files to as many computers/iPods as you like.&lt;br /&gt;&lt;br /&gt;3) But you can't &lt;span style="font-weight: bold;"&gt;play&lt;/span&gt; an .m4p file unless that computer is "&lt;span style="font-weight: bold;"&gt;authorized&lt;/span&gt;" by your AppleID.  You can have up to 5 computers authorized at one time.  The "authorize" and "deauthorize" options in iTunes do the trick.  (Note: authorizing doesn't touch your library or files.)&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Where is all the data?&lt;/span&gt;&lt;br /&gt;1) When iTunes starts up, it looks at the &lt;span style="font-family:courier new;"&gt;iTunes/&lt;/span&gt; directory (inside &lt;span style="font-family:courier new;"&gt;~/Music&lt;/span&gt; in OS X, somewhere in your &lt;span style="font-family:courier new;"&gt;Documents &amp;amp; Settings&lt;/span&gt; in Windows).  
This contains a few important files:&lt;br /&gt;- &lt;span style="font-family:courier new;"&gt;iTunes Music Library.xml&lt;/span&gt; - this single file has all your &lt;span style="font-weight: bold;"&gt;playlists&lt;/span&gt; and &lt;span style="font-weight: bold;"&gt;playcounts&lt;/span&gt;, and has the file paths of your actual music files.&lt;br /&gt;- There is a similar file (without the .xml) which is a binary version of the .xml file.  From my understanding, the xml file is for reading convenience, but this file is what iTunes really uses.  The two files are kept in sync by iTunes.&lt;br /&gt;- The &lt;span style="font-family:courier new;"&gt;Album Artwork/&lt;/span&gt; directory is where the album image files are stored.&lt;br /&gt;&lt;br /&gt;2) In &lt;span style="font-weight: bold;"&gt;iTunes-&gt;Preferences-&gt;Advanced&lt;/span&gt; there is an option for the &lt;span style="font-weight: bold;"&gt;"iTunes Music Folder"&lt;/span&gt;.  This directory contains all the .m4p and other raw music files.  The song title and other text metadata are stored inside the .m4p file.&lt;br /&gt;&lt;br /&gt;Note that this directory has an &lt;span style="font-family:courier new;"&gt;Artist/Album/Song&lt;/span&gt; directory structure inside it.  (This is what the &lt;span style="font-weight: bold;"&gt;iTunes-&gt;Preferences-&gt;Advanced-&gt;"Keep iTunes Music Folder Organized"&lt;/span&gt; option enforces.)  You're not meant to touch this directory yourself - the iTunes program manages these subdirectories for you.&lt;br /&gt;&lt;br /&gt;3) When you import music via &lt;span style="font-weight: bold;"&gt;iTunes-&gt;"Add to Library"&lt;/span&gt; (or when you purchase music, for that matter) then it:&lt;br /&gt;- Adds those raw music files to your iTunes Music Folder.  
(If you have &lt;span style="font-weight: bold;"&gt;iTunes-&gt;Preferences-&gt;Advanced-&gt;"Copy files to iTunes Music folder when adding to library" &lt;/span&gt;unchecked, it will skip this step.)&lt;br /&gt;- Adds the filepath of those files to the xml database mentioned above.&lt;br /&gt;&lt;br /&gt;&lt;span style="font-weight: bold;"&gt;Implications&lt;/span&gt;&lt;br /&gt;1) If you copy a raw music file into your iTunes Music Folder, that doesn't do anything - your iTunes library (the xml database) doesn't know about it.&lt;br /&gt;2) If you move your iTunes Music Folder, your library will be broken, since that xml-database is pointing to file locations that no longer exist.  You'll have to move the files back, or re-import them.&lt;br /&gt;3) The &lt;span style="font-weight: bold;"&gt;"Consolidate Library..."&lt;/span&gt; action copies all the music files pointed to by your library (wherever they are) to your current iTunes Music Folder.&lt;br /&gt;&lt;br /&gt;Well, I hope that helps!  Let me know if there are any important parts I've left out.&lt;br /&gt;&lt;br /&gt;Oh, and if you're curious here's a writeup on &lt;a href="http://davidjduran.com/2008/04/16/how-to-share-a-single-itunes-library-between-you-and-your-wife/"&gt;how to share an iTunes library across multiple users on the same computer&lt;/a&gt;
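&lt;p&gt;If you want to see those file paths for yourself, here is a minimal Python sketch that reads the xml database and lists where each track lives.  It assumes the file is a property list with a top-level "Tracks" dictionary whose entries have "Name" and "Location" (a file:// URL) keys.  The function name track_paths and the sample data are made up for illustration.&lt;/p&gt;

```python
import plistlib
from urllib.parse import unquote, urlparse

def track_paths(library_bytes):
    """Return {track name: file path} from the bytes of an iTunes library XML file."""
    library = plistlib.loads(library_bytes)
    paths = {}
    for track in library.get("Tracks", {}).values():
        location = track.get("Location")  # a file:// URL; may be missing
        if location:
            paths[track.get("Name", "(untitled)")] = unquote(urlparse(location).path)
    return paths

# Tiny stand-in library (hypothetical data) so the sketch runs without iTunes.
sample = plistlib.dumps({
    "Tracks": {
        "101": {
            "Name": "Example Song",
            "Location": "file://localhost/Users/me/Music/iTunes/iTunes%20Music/Artist/Album/01%20Example%20Song.mp3",
        },
    }
})
print(track_paths(sample))
```

&lt;p&gt;To try it on a real library, pass in the bytes of your own iTunes Music Library.xml instead of the sample.&lt;/p&gt;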
&lt;div style='clear: both;'&gt;&lt;/div&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/understanding-itunes-files.html</guid><pubDate>Thu, 02 Apr 2009 00:00:00 GMT</pubDate></item><item><title>Good Politician != Good Decider</title><link>http://dustwell.com/proposal-for-new-type-of-government.html</link><description>


I've just realized what's wrong with our government: it's run by politicians.  Seriously though, hear me out.  The qualities that make a good politician aren't the qualities that make an ideal government official.

&lt;h3&gt;Here are the traits that make a successful politician:&lt;/h3&gt;

&lt;h4&gt;&amp;rarr; Electability&lt;/h4&gt;
Because of our media-based election system, you can't get into office unless you have a good on-camera personality, have the gift of gab, and are generally "likable."  These are nice qualities for someone to have, and somewhat correlated with leadership, but less so with decision-making ability.

&lt;h4&gt;&amp;rarr; Leadership/Charisma&lt;/h4&gt;
The ability to get others to join in your cause, to spend their efforts on your purpose.  I believe people are hard-wired to want to follow leaders - it's part of our tribal ancestry.

&lt;h4&gt;&amp;rarr; Networking/Schmoozing/Politicizing&lt;/h4&gt;
If you're able to rub elbows with the elite, "network", and get into "scratch my back, I'll scratch yours" exchanges with other politicians, you're going to be more effective.


&lt;h3&gt;But really, the most important trait we need in a government leader is:&lt;/h3&gt;

&lt;h4&gt;&amp;rarr; Intelligence &amp;amp; decision-making&lt;/h4&gt;
We need officials who can decide questions like:
&lt;ul&gt;
&lt;li&gt;"Should we go to war?"&lt;/li&gt;
&lt;li&gt;"Should we impose an embargo against Cuba?"&lt;/li&gt;
&lt;li&gt;"Should we increase military spending?"&lt;/li&gt;
&lt;/ul&gt;
by analyzing the facts and making good judgement calls.

&lt;br&gt;
&lt;br&gt;
Unfortunately, I think the first 3 traits are 
&lt;span style="font-weight: bold;"&gt;anti-correlated with good decision-making and analysis&lt;/span&gt;.
The people who are good at dealing with other people typically aren't the type to study the facts and research a topic.&lt;br /&gt;&lt;br /&gt;

&lt;img src="images/branches-government.jpg" /&gt;

&lt;br /&gt;&lt;br /&gt;&lt;span style="font-style: italic;font-size:130%;" &gt;A new form of government.&lt;/span&gt;&lt;br /&gt;If it were up to me, we would separate the "&lt;span style="font-weight: bold;"&gt;politician&lt;/span&gt;" and "&lt;span style="font-weight: bold;"&gt;decision-maker&lt;/span&gt;" roles.&lt;br /&gt;&lt;br /&gt;We would have a large body of decision-makers - folks whose only job is to research issues and make informed decisions.   These decision-makers would essentially act as a "&lt;span style="font-weight: bold;"&gt;brain trust&lt;/span&gt;" and be isolated from the day-to-day activities a normal politician does.  That is, they would be spared from photo-ops, dealing with the media, filibusters, and other wastes of time.  You know, so they'd have time to &lt;a href="http://www.downsizedc.org/page/read_the_laws"&gt;actually read the bills they're voting on&lt;/a&gt;.&lt;br /&gt;&lt;br /&gt;The politicians would be required to bring decisions to the decision-makers and accept their decision.  But note that the politicians still have the power of picking which decisions to present.  So the politicians can still choose the issues to focus on.&lt;br /&gt;&lt;br /&gt;For example, if a politician thinks that we need to reduce carbon emissions, she might propose a cap-and-trade system for emissions.  The decision-makers might review this proposal and reject it because it wouldn't be effective.  The politician then might propose a simple carbon tax bill instead.  The decision-makers might approve this proposal, and then the approval could be turned into a bill.&lt;br /&gt;&lt;br /&gt;The key is to have a decision-maker group that is highly intelligent and unbiased.   You might think "you'll have all the same problems of electing decision-makers as you do for electing politicians."  I don't think you would, for a number of reasons:&lt;br /&gt;&lt;br /&gt;A decision-maker has a much more "boring" job.  
They are handed proposals, and they get to evaluate them.  They can't choose which proposals they are given, so it would be much more difficult for special-interests to penetrate this group.  Also, there should be some sort of impartiality-judgement (like they do for jury selection) when decision-makers are selected.&lt;br /&gt;&lt;br /&gt;The group would be large (over 1000 people), so it would be more robust to rogue decision-makers.  Also, any one decision-maker would have far less power, so it wouldn't attract the power-hungry politician type.&lt;br /&gt;&lt;br /&gt;

&lt;img src="images/constitution-artwork.jpg" /&gt;

&lt;br /&gt;&lt;br /&gt;What do you think?
&lt;div style='clear: both;'&gt;&lt;/div&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/proposal-for-new-type-of-government.html</guid><pubDate>Fri, 03 Apr 2009 00:00:00 GMT</pubDate></item><item><title>Maximum Likelihood vs. Expected Value</title><link>http://dustwell.com/maximum-likelihood-vs-expected-value.html</link><description>

&lt;script type="text/x-mathjax-config"&gt;
  MathJax.Hub.Config({tex2jax: {inlineMath: [['$','$'], ['\\(','\\)']]}});
&lt;/script&gt;
&lt;script type="text/javascript"
  ssssrc="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML"
  src="http://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS_HTML"&gt;
&lt;/script&gt;

&lt;p&gt;
&lt;img style="float: right; width: 200px" src="images/slot_machine_cartoon.gif" /&gt;
Suppose you pulled a slot machine 5 times, and won 1 time. 
What's your best estimate for the true underlying payout probability $p$?
&lt;/p&gt;

&lt;p&gt;
This seemingly innocent question is actually quite involved.
You might think the answer is $\frac{1}{5}$ but it's actually $\frac{2}{7}$.
&lt;/p&gt;

&lt;p&gt;
(Note: we're making the usual assumption of a &lt;i&gt;uniform prior&lt;/i&gt; for $p$, which just means
"before we saw any data, we thought any value of $p$ was equally likely.")
&lt;/p&gt;

&lt;h2&gt;Maximum Likelihood Estimate&lt;/h2&gt;
&lt;p&gt;
&lt;img style="float: right; width: 400px" src="/images/beta_2_5.png" /&gt;
The &lt;a href="http://mathworld.wolfram.com/MaximumLikelihood.html"&gt;maximum likelihood estimate&lt;/a&gt; (MLE) of a parameter is
the value whose likelihood is highest.  If you're looking at the
&lt;a href="http://en.wikipedia.org/wiki/Probability_density_function"&gt;probability density function&lt;/a&gt; (PDF) of that parameter,
the MLE is simply the &lt;b&gt;highest point on the curve&lt;/b&gt; (i.e. the &lt;i&gt;mode&lt;/i&gt;).
&lt;/p&gt;

&lt;p&gt;
For a binomial, the MLE of $p$ is indeed
$\frac{k}{n}$ (where you saw $k$ "successes" out of $n$ attempts), which is $\frac{1}{5}$ in this case.
&lt;/p&gt;

&lt;h2&gt;Expected Value&lt;/h2&gt;
&lt;p&gt;
However, the MLE is different from the &lt;a href="http://mathworld.wolfram.com/ExpectationValue.html"&gt;expected value&lt;/a&gt; of $p$.
If you're looking at the PDF of a parameter, the expected value is the &lt;i&gt;mean&lt;/i&gt; of that curve.
For a binomial (with the uniform prior mentioned above), the expected value turns out to be
$\frac{k+1}{n+2}$.  (This is also known as
&lt;a href="http://en.wikipedia.org/wiki/Rule_of_succession"&gt;Laplace's Rule of Succession&lt;/a&gt;.)
&lt;/p&gt;

&lt;p&gt;
For the data above ($k=1$, $n=5$) you get $\frac{2}{7}$.  Yes, this seems strange,
but on average this will be a better estimate of $p$ than $\frac{1}{5}$.
&lt;/p&gt;

&lt;h2&gt;Nitty Gritty Math Details&lt;/h2&gt;
&lt;p&gt;
The posterior distribution for a binomial parameter $p$ takes the shape of a
&lt;a href="http://en.wikipedia.org/wiki/Beta_distribution"&gt;beta distribution&lt;/a&gt;, which is a function $Beta(x; \alpha, \beta)$.
When the data is $k$ successes out of $n$ attempts, the parameters to the function are $\alpha=k+1$ and $\beta=(n-k)+1$.
&lt;/p&gt;

&lt;p&gt;
The function $Beta(x; \alpha, \beta)$ has the following properties:
$$mean = \frac{\alpha}{\alpha + \beta} = \frac{k+1}{n+2}$$
and
$$mode = \frac{\alpha - 1}{\alpha + \beta - 2} = \frac{k}{n}$$
&lt;/p&gt;

&lt;p&gt;
Lo and behold, when you plug in $k=1$ and $n=5$ you get $\frac{2}{7}$ for the mean, and $\frac{1}{5}$ for the mode.
&lt;/p&gt;
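&lt;p&gt;
If you'd rather check the arithmetic than trust me, here is a small Python sketch that plugs $k$ and $n$ into the standard Beta mean $\frac{\alpha}{\alpha+\beta}$ and mode $\frac{\alpha-1}{\alpha+\beta-2}$, using exact fractions.
The function name beta_mean_and_mode is just for illustration, and it assumes at least one attempt.
&lt;/p&gt;

```python
from fractions import Fraction

def beta_mean_and_mode(k, n):
    """Mean and mode of the posterior for k successes in n attempts (uniform prior)."""
    alpha = k + 1
    beta = (n - k) + 1
    mean = Fraction(alpha, alpha + beta)          # simplifies to (k+1)/(n+2)
    mode = Fraction(alpha - 1, alpha + beta - 2)  # simplifies to k/n
    return mean, mode

mean, mode = beta_mean_and_mode(1, 5)
print("mean:", mean)  # mean: 2/7
print("mode:", mode)  # mode: 1/5
```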

&lt;h2&gt;Head-to-head Matchup&lt;/h2&gt;
But don't take my word for it.  You can simulate it yourself with the Python code below:

&lt;pre class=code&gt;&lt;code&gt;import random

def flip_coin(p_heads):
    # return 'H' with probability p_heads, otherwise 'T'
    return 'H' if random.random() &lt; p_heads else 'T'

def compute_errors():
    # pick a true underlying 'p' uniformly, and generate data
    p_heads = random.random()
    NUM_COINS = 5
    coins = [flip_coin(p_heads) for _ in range(NUM_COINS)]

    # come up with estimates of 'p'
    mle_p = coins.count('H') / NUM_COINS
    laplace_p = (coins.count('H') + 1) / (NUM_COINS + 2)

    # return their errors
    return (abs(mle_p - p_heads), abs(laplace_p - p_heads))

win_count = {'mle': 0, 'laplace': 0}
mle_errors = []
laplace_errors = []
for _ in range(1000000):
    (mle_error, laplace_error) = compute_errors()
    # ties count as a win for the MLE
    if mle_error &lt;= laplace_error:
        win_count["mle"] += 1
    else:
        win_count["laplace"] += 1

    mle_errors.append(mle_error)
    laplace_errors.append(laplace_error)

print("win_counts:", win_count)
print("average mle_error:", sum(mle_errors) / len(mle_errors))
print("average laplace_error:", sum(laplace_errors) / len(laplace_errors))
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
When I run this, I get
&lt;/p&gt;

&lt;pre class=code&gt;&lt;code&gt;win_counts: {'laplace': 569967, 'mle': 430033}
average mle_error: 0.142
average laplace_error: 0.123
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
Which means the formula $\frac{k+1}{n+2}$ was closer 57% of the time,
and had a smaller average error than $\frac{k }{n}$ does.
&lt;/p&gt;

&lt;h2&gt;WTF?!&lt;/h2&gt;
&lt;p&gt;
If you're having a hard time coming to grips with this reality, let
me try to explain some intuition behind why $\frac{k}{n}$ is a bad
estimator.
&lt;/p&gt;

&lt;p&gt;
The basic problem is that $\frac{k}{n}$ tends to "believe" extreme data too easily.
If we see 0 wins out of 5 attempts, $\frac{k}{n}$ will conclude that the exact value
of $p$ must be 0.  Of course, this is extremely unlikely.  It's more likely that
the true value of $p$ is $&gt; 0$, and that 0-out-of-5 was just
an unlucky streak. (Similarly for when the data is 5-out-of-5 wins.)
&lt;/p&gt;
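&lt;p&gt;
To see how the two estimators treat every possible outcome of 5 pulls, this little Python sketch prints both estimates side by side.
(The function name estimates is made up for illustration.)
&lt;/p&gt;

```python
from fractions import Fraction

def estimates(n):
    """Return (k, k/n, (k+1)/(n+2)) for every possible number of wins k."""
    return [(k, Fraction(k, n), Fraction(k + 1, n + 2)) for k in range(n + 1)]

for k, mle, laplace in estimates(5):
    print(k, "wins: MLE =", mle, " Laplace =", laplace)
```

&lt;p&gt;
Notice that the Laplace estimate never reaches 0 or 1: with 0 wins it says $\frac{1}{7}$, and with 5 wins it says $\frac{6}{7}$.
&lt;/p&gt;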


</description><author>Dustin Boswell</author><guid>http://dustwell.com/maximum-likelihood-vs-expected-value.html</guid><pubDate>Tue, 05 Apr 2011 00:00:00 GMT</pubDate></item><item><title>Using Rsync to do local snapshotting/backups</title><link>http://dustwell.com/rsync-for-local-snapshot.html</link><description>

&lt;a href="http://www.codinghorror.com/blog/archives/001315.html"&gt;Jeff Atwood's recent data-loss story&lt;/a&gt;
is a good reminder of why you should do off-site backups.
But what if you just accidentally &lt;tt&gt;rm&lt;/tt&gt;'ed a file, or saved over its contents with bad data? It's a lot of work to
do a full backup recovery of your entire system &lt;i&gt;just to get back one file&lt;/i&gt;.  Isn't there an easier way?

&lt;h3&gt;Local Snapshotting: not perfect, but super useful&lt;/h3&gt;
Here's what I do: run &lt;a href="http://en.wikipedia.org/wiki/Rsync"&gt;rsync&lt;/a&gt;
with &lt;a href="http://en.wikipedia.org/wiki/Cron"&gt;cron&lt;/a&gt;
to copy all your important files to a local &lt;tt&gt;/backup&lt;/tt&gt; directory.
&lt;br&gt;
First, create a new file called &lt;tt&gt;crontab.txt&lt;/tt&gt; with contents like:
&lt;pre class=code&gt;
@hourly  rsync -a --inplace --max-size=1MB /www /etc /backup/hourly/
@daily   rsync -a --inplace --max-size=1MB /www /etc /backup/daily/
@weekly  rsync -a --inplace --max-size=1MB /www /etc /backup/weekly/
@monthly rsync -a --inplace /www /etc /backup/monthly/
&lt;/pre&gt;

&lt;h3&gt;Let me explain:&lt;/h3&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;tt&gt;@hourly&lt;/tt&gt; (and the others) is a special interval understood by cron&lt;/li&gt;
  &lt;li&gt;&lt;tt&gt;/www&lt;/tt&gt; and &lt;tt&gt;/etc&lt;/tt&gt; are specific directories I want to keep snapshotted. You might want to include &lt;tt&gt;/home&lt;/tt&gt; as well&lt;/li&gt;
  &lt;li&gt;The &lt;tt&gt;-a&lt;/tt&gt; option to rsync tells it to do "archive" mode (preserving permissions, etc...)&lt;/li&gt;
  &lt;li&gt;The &lt;tt&gt;--inplace&lt;/tt&gt; option is just an optimization so that files in &lt;tt&gt;/backup&lt;/tt&gt; are overwritten in place,
      as opposed to rsync creating an intermediate temp file.&lt;/li&gt;
  &lt;li&gt;The &lt;tt&gt;--max-size=1MB&lt;/tt&gt; tells rsync to ignore files greater than 1MB in size. I do this so that I don't bother making lots of copies of big log files and videos and other stuff that isn't that important and doesn't change that often.&lt;/li&gt;
&lt;/ul&gt;
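&lt;p&gt;If you keep a longer list of directories, you might prefer to generate those four crontab lines rather than type them.  Here is a rough Python sketch.  The function name build_crontab and its arguments are hypothetical, and it only uses the rsync options already shown above.&lt;/p&gt;

```python
def build_crontab(sources, backup_root="/backup", max_size="1MB"):
    """Return crontab text with one rsync line per snapshot interval."""
    lines = []
    for interval in ["hourly", "daily", "weekly", "monthly"]:
        options = ["-a", "--inplace"]
        # The monthly snapshot is the full one, so it gets no size limit.
        if interval != "monthly":
            options.append("--max-size=" + max_size)
        command = ["rsync"] + options + sources + [backup_root + "/" + interval + "/"]
        lines.append("@" + interval + " " + " ".join(command))
    return "\n".join(lines) + "\n"

print(build_crontab(["/www", "/etc"]), end="")
```

&lt;p&gt;This sketch does no quoting, so it assumes none of your directory names contain spaces.&lt;/p&gt;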

&lt;h3&gt;Now you install this crontab by doing:&lt;/h3&gt;
&lt;pre class=code&gt;
crontab crontab.txt
&lt;/pre&gt;

I do this as the &lt;tt&gt;root&lt;/tt&gt; user, but you can do this as any user
that can read all the directories you need to back up, and can write to
&lt;tt&gt;/backup&lt;/tt&gt;.
(Warning: the above command will overwrite any other crontabs you already have installed.  Do a &lt;tt&gt;crontab -l&lt;/tt&gt; first, to see what's installed.)

Now you can just sit back and relax -- your local files are being copied every hour, day, week, and month to the &lt;tt&gt;/backup&lt;/tt&gt; directory. If you want to make sure you have a full (monthly) backup right away, then you should execute:
&lt;pre class=code&gt;
rsync -a --inplace /www /etc /backup/monthly/
&lt;/pre&gt;
right now.

&lt;h3&gt;How does this help me?&lt;/h3&gt;
Let's say you just accidentally removed a local file
&lt;pre class=code&gt;
rm /etc/lighttpd/lighttpd.conf
# Oh shit, I didn't mean to do that!
&lt;/pre&gt;
Not to fear, you can recover it by doing
&lt;pre class=code&gt;
cp /backup/hourly/etc/lighttpd/lighttpd.conf /etc/lighttpd/lighttpd.conf
&lt;/pre&gt;

&lt;h3&gt;Why do hourly, daily, etc...?&lt;/h3&gt;
For mistakes that you catch right away, the &lt;tt&gt;/backup/hourly&lt;/tt&gt;
directory is what you'll go to most often to get a recent version.
But sometimes you don't realize something is wrong until days (or weeks) later.  In that case, the hourly backup is of no use, since it has already mirrored the mistake, but the daily, weekly, or monthly copy may still hold the good version.
If you were really paranoid, you could add a &lt;tt&gt;@yearly&lt;/tt&gt; line to the crontab above.
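&lt;p&gt;When you're not sure which snapshot still has a good copy, a few lines of Python can list every copy along with its modification time (which &lt;tt&gt;rsync -a&lt;/tt&gt; preserves from the original file).  This is only a sketch: the function name snapshot_copies is made up, and it assumes the &lt;tt&gt;/backup&lt;/tt&gt; layout created by the crontab above.&lt;/p&gt;

```python
import os
import time

INTERVALS = ["hourly", "daily", "weekly", "monthly"]

def snapshot_copies(path, backup_root="/backup"):
    """Return (interval, snapshot_path, mtime) for each snapshot that holds `path`."""
    copies = []
    for interval in INTERVALS:
        # rsync recreates the source path under each snapshot directory,
        # e.g. /etc/foo.conf ends up at /backup/hourly/etc/foo.conf
        candidate = os.path.join(backup_root, interval, path.lstrip("/"))
        if os.path.exists(candidate):
            copies.append((interval, candidate, os.path.getmtime(candidate)))
    return copies

for interval, copy, mtime in snapshot_copies("/etc/lighttpd/lighttpd.conf"):
    print(interval, copy, time.ctime(mtime))
```

&lt;p&gt;A copy whose modification time is older than your mistake is the one to restore from.&lt;/p&gt;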
</description><author>Dustin Boswell</author><guid>http://dustwell.com/rsync-for-local-snapshot.html</guid><pubDate>Sat, 09 Jan 2010 00:00:00 GMT</pubDate></item><item><title>Source code for putting "a" or "an" in front of a word.</title><link>http://dustwell.com/a-or-an-in-front-of-a-word.html</link><description>

&lt;p&gt;
English is a funny language - you say "&lt;b&gt;a&lt;/b&gt; usual person", but "&lt;b&gt;an&lt;/b&gt; unusual person".
There is no simple spelling-based rule for deciding this: it all depends on whether the
next word starts with a vowel sound, not a vowel letter.
&lt;/p&gt;

&lt;p&gt;
Instead, I think the "right" solution is just to have a list of the words that should be prefixed with "an".
&lt;pre class=code&gt;
aardvark
able
&lt;i&gt;... words that should be prefixed with "an" ...&lt;/i&gt;
&lt;/pre&gt;
I searched far and wide, but couldn't find one, so I &lt;b&gt;created a list of words that should be prefixed with "an"&lt;/b&gt;
myself, based on which words in the &lt;a href="http://www.speech.cs.cmu.edu/cgi-bin/cmudict"&gt;CMU pronunciation dictionary&lt;/a&gt;
 started with one of the vowel-like phonemes:
&lt;pre class=code&gt;
["AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY", "IH", "IY", "OW", "OY", "UH", "UW"]&lt;/pre&gt;
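&lt;p&gt;
The extraction could be sketched roughly like this in Python (an illustration, not the original script).
It assumes each dictionary line is a word followed by its phonemes, that vowel phonemes carry a trailing stress digit (like "AW1"), that comment lines start with ";;;", and that alternate pronunciations are marked like "WORD(1)".
The function name words_needing_an and the sample entries are for illustration only.
&lt;/p&gt;

```python
VOWEL_PHONEMES = {"AA", "AE", "AH", "AO", "AW", "AY", "EH", "ER", "EY",
                  "IH", "IY", "OW", "OY", "UH", "UW"}

def words_needing_an(dictionary_lines):
    """Return the set of words whose first phoneme is vowel-like."""
    result = set()
    for line in dictionary_lines:
        if not line.strip() or line.startswith(";;;"):
            continue  # skip blank lines and comments
        word, *phonemes = line.split()
        if not phonemes:
            continue
        first = phonemes[0].rstrip("012")  # drop the stress digit
        if first in VOWEL_PHONEMES:
            result.add(word.split("(")[0].lower())  # "HOUR(1)" becomes "hour"
    return result

# Illustrative entries in the assumed format.
sample = [
    ";;; comment line",
    "CAT  K AE1 T",
    "HOUR  AW1 ER0",
    "UNUSUAL  AH0 N Y UW1 ZH AH0 W AH0 L",
    "USUAL  Y UW1 ZH AH0 W AH0 L",
]
print(sorted(words_needing_an(sample)))  # ['hour', 'unusual']
```

&lt;p&gt;
Note that this sketch adds a word if any of its listed pronunciations starts with a vowel sound.
&lt;/p&gt;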

&lt;h3&gt;Files for you to Download&lt;/h3&gt;
I am now sharing this list with the rest of the world (it's public domain, so use it however you like).
&lt;ul&gt;
  &lt;li&gt;&lt;a href="preceed_with_an.txt"&gt;preceed_with_an.txt&lt;/a&gt; [146KB in size, 16,831 words]&lt;/li&gt;
  &lt;li&gt;&lt;a href="preceed_with_an.py"&gt;preceed_with_an.py&lt;/a&gt; [147KB in size, it embeds the list above]&lt;/li&gt;
&lt;/ul&gt;

That Python file contains a function &lt;tt&gt;should_preceed_with_an("phrase...")&lt;/tt&gt; that returns &lt;tt&gt;True&lt;/tt&gt; or &lt;tt&gt;False&lt;/tt&gt;.
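&lt;p&gt;
To give a feel for how such a lookup can work, here is a stripped-down sketch.
It is not the contents of &lt;tt&gt;preceed_with_an.py&lt;/tt&gt;: the three-word set stands in for the full list, the function name sketch_should_precede_with_an is hypothetical, and it assumes only the first word of the phrase (up to any hyphen or space) matters.
&lt;/p&gt;

```python
import re

AN_WORDS = {"hour", "unusual", "apple"}  # tiny stand-in for the full word list

def sketch_should_precede_with_an(phrase):
    """True if the first word of `phrase` is in the "an" list."""
    match = re.match(r"[a-z']+", phrase.strip().lower())
    return bool(match) and match.group(0) in AN_WORDS

for phrase in ["cat", "hour-glass", "unusual person", "usual person"]:
    article = "an" if sketch_should_precede_with_an(phrase) else "a"
    print("I am looking for", article, phrase + ".")
```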

&lt;h3&gt;A Django Template Filter&lt;/h3&gt;
I now use this code in my
&lt;a href="http://docs.djangoproject.com/en/dev/howto/custom-template-tags/"&gt;Django templates&lt;/a&gt; by doing:
&lt;pre class=code&gt;I am looking for {{ phrase|a_or_an }} {{ phrase }}. &lt;/pre&gt;
which will produce:
&lt;pre class=code&gt;
I am looking for a cat.
I am looking for an hour-glass.
I am looking for an unusual person.
I am looking for a usual person.
&lt;/pre&gt;

To make use of &lt;tt&gt;a_or_an&lt;/tt&gt; you have to define it in one of your
&lt;tt&gt;app/templatetags/&lt;/tt&gt; files:

&lt;pre class=code&gt;
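from django import template
from preceed_with_an import should_preceed_with_an

# Django looks for a module-level variable named "register"
register = template.Library()

@register.filter
def a_or_an(phrase):
  if should_preceed_with_an(phrase): return "an"
  else: return "a"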
&lt;/pre&gt;

</description><author>Dustin Boswell</author><guid>http://dustwell.com/a-or-an-in-front-of-a-word.html</guid><pubDate>Thu, 11 Feb 2010 00:00:00 GMT</pubDate></item><item><title>Gas Mileage for my 2007 Audi A4</title><link>http://dustwell.com/audi-a4-2007-gas-mileage.html</link><description>

&lt;p&gt;
I've collected meticulous notes on how much I fill up each time at the gas pump, and below is the data.
(Don't ask why - I'm an information packrat.)
&lt;/p&gt;
&lt;img src="/images/audi-a4-2007.jpg" width=400px &gt;

&lt;img src="/images/audi-gas-mileage.png" &gt;

&lt;p&gt;About the Data&lt;/p&gt;
This is over a 2-year period, driving a good mix of city traffic and longer freeway trips.
I'm a moderate driver in terms of gassing it.  (I never put it in "Sport" mode though.)

&lt;p&gt;I figured I had this data, so I might as well share it.  Let me know if your mileage is much different.&lt;/p&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/audi-a4-2007-gas-mileage.html</guid><pubDate>Sun, 06 Jun 2010 00:00:00 GMT</pubDate></item><item><title>How to Diagnose your Flaky Internet Connection</title><link>http://dustwell.com/diagnose-your-flaky-internet.html</link><description>

I have Verizon Avenue DSL, the worst ISP I've ever had, so I've gotten to learn a few tricks on how to troubleshoot my internet problems.

Here are the steps:

&lt;h3&gt;Can you ping the outside world?&lt;/h3&gt;

Try pinging a well-known IP address like &lt;tt&gt;4.2.2.2&lt;/tt&gt;
&lt;ul&gt;
  &lt;li&gt;In Windows: Click Start -&gt; Run... -&gt; cmd -&gt; type &lt;tt&gt;"ping -n 50 4.2.2.2"&lt;/tt&gt;&lt;/li&gt;
  &lt;li&gt;In Linux/Mac: Open a Terminal and type &lt;tt&gt;"ping -c 50 4.2.2.2"&lt;/tt&gt;&lt;/li&gt;
&lt;/ul&gt;

You should see successful results (0% packet loss) like that below:

&lt;img src="images/windows-ping.jpg" /&gt;
&lt;br&gt;
If you get messages like "no route to host", or get 100% packet loss, you've got much bigger problems.  (If so, try doing &lt;tt&gt;"ping 192.168.0.1"&lt;/tt&gt; - if that doesn't even work, then you probably aren't even connected to your router.)
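&lt;p&gt;If you end up running this test a lot, you can script it.  Below is a rough Python sketch that runs the same ping commands as above and pulls the packet-loss percentage out of the summary.  The function names are made up, the sample summary lines are illustrative, and the parsing assumes the summary contains text like "0% packet loss" (Linux/Mac) or "(0% loss)" (Windows).&lt;/p&gt;

```python
import platform
import re
import subprocess

def parse_packet_loss(ping_output):
    """Return the packet loss percentage from ping's summary, or None if absent."""
    match = re.search(r"(\d+(?:\.\d+)?)% (?:packet )?loss", ping_output)
    return float(match.group(1)) if match else None

def ping_packet_loss(host="4.2.2.2", count=50):
    """Run ping (-n on Windows, -c elsewhere) and return the packet loss percentage."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    result = subprocess.run(["ping", count_flag, str(count), host],
                            capture_output=True, text=True)
    return parse_packet_loss(result.stdout)

# Illustrative summary lines, so the parser can be tried without a network.
print(parse_packet_loss("50 packets transmitted, 50 received, 0% packet loss, time 49073ms"))
print(parse_packet_loss("Packets: Sent = 50, Received = 47, Lost = 3 (6% loss),"))
```

&lt;p&gt;Calling ping_packet_loss() with no arguments runs the 50-ping test against &lt;tt&gt;4.2.2.2&lt;/tt&gt;; pass &lt;tt&gt;"192.168.0.1"&lt;/tt&gt; to test the link to your router instead.&lt;/p&gt;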

&lt;h3&gt;Does resetting just the router help things?&lt;/h3&gt;
Try unplugging (waiting 20 seconds) and re-plugging the power to your router.  Does that help things?  If so, you might have a crappy/old/broken router.  I've had 3 different Netgear/DLink routers where resetting helped things. (In fairness, 1 of those was my fault: I plugged a 12v power supply into a router that wanted 7.5v -- the plug fit, the router got really hot, and periodically reset itself.)

&lt;h3&gt;Does resetting the modem and router help?&lt;/h3&gt;
Try unplugging (waiting 30 seconds) and re-plugging the power to your dsl/cable modem, and also to your wireless router.  Occasionally, your modem can get stuck with a bad IP address, and this will force it to get a new one.  This really shouldn't happen if you have a good ISP, but it can.  Even so, it should only happen every few months or so, not every day.  If doing this helps all the time, you probably have a different problem.

&lt;h3&gt;Is it your DNS?&lt;/h3&gt;
If you are getting a lot of "host/server not found" errors in your browser, and/or the "looking up domain.com ..." message in the status-bar at the bottom takes a long time, the problem might be a bad DNS server.

&lt;br&gt; &lt;br&gt;
&lt;h4&gt;Background on DNS:&lt;/h4&gt;
When you plug your wireless router into your cable/dsl modem, the router is given an IP address, as well as the IP address of where to do DNS lookups.  (These DNS servers are hosted by your ISP, and are often flaky/overloaded.)  When you plug your computer into the router (or connect over wireless), the router tells your computer to use 192.168.0.1 (the IP address of the router) as the DNS server.   Your computer thinks that your router is the DNS server, but really, your router just turns around and does the DNS lookup for you.

&lt;br&gt; &lt;br&gt;
&lt;h4&gt;How to fix your DNS:&lt;/h4&gt;
One thing you can easily try is to tell your computer to use a different DNS server.  Go to &lt;a href="http://opendns.org"&gt;opendns.org&lt;/a&gt; -- they have instructions on how to do this for your particular computer.  Their DNS Server IP addresses are &lt;tt&gt;208.67.222.222&lt;/tt&gt; and &lt;tt&gt;208.67.220.220&lt;/tt&gt;.&lt;br /&gt;(Or you can use Level 3's public DNS servers of &lt;tt&gt;4.2.2.2&lt;/tt&gt; and &lt;tt&gt;4.2.2.3&lt;/tt&gt;, or (&lt;font color=red&gt;new&lt;/font&gt;) Google's DNS servers of &lt;tt&gt;8.8.8.8&lt;/tt&gt;)

&lt;br&gt;
Alternatively, you can change the settings on your router to use these IP addresses.  (It's hard to explain how to do this - you have to visit
&lt;a href="http://192.168.0.1"&gt;http://192.168.0.1&lt;/a&gt;
from a computer that is plugged directly into your router's special port.)  This way, all the computers in your home will benefit from having these new DNS servers.

&lt;br /&gt;&lt;br /&gt;
If using these DNS servers fixes your internet woes, then you've found your problem (and your solution).
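&lt;p&gt;To put a number on "takes a long time", you can time a lookup before and after changing DNS servers.  This Python sketch uses whatever DNS server your computer is currently configured with.  The function name time_lookup is made up, and repeat lookups may be answered from a cache, so try a few different hostnames.&lt;/p&gt;

```python
import socket
import time

def time_lookup(hostname):
    """Return how many seconds the system resolver took, or None if the lookup failed."""
    start = time.perf_counter()
    try:
        socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        return None
    return time.perf_counter() - start

# "localhost" keeps the demo offline; swap in a real site name to test your DNS.
elapsed = time_lookup("localhost")
print("lookup failed" if elapsed is None else "lookup took %.3f seconds" % elapsed)
```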

&lt;h3&gt;Is your wireless connection flaky?&lt;/h3&gt;
Try pinging &lt;tt&gt;192.168.0.1&lt;/tt&gt; from your laptop that's connected wirelessly, and see what the packet loss is.  (You really need to do 50 or 100 pings to get a fair estimate.)  Ideally, the packet loss should be 0% -- if you do a ping from a computer that is plugged directly into the router that is what you'll get.

&lt;br /&gt;In my house, I would get packet losses of at least 3%, sometimes as high as 8% or even higher.  The symptom was that the internet seemed very flaky.  Sometimes web pages wouldn't load, or would take extremely long to load.  Sometimes my ssh connections would lock up.  If your wireless is the true cause, then all of these should be symptoms that you &lt;b&gt;don't&lt;/b&gt; have when plugged directly into your router (all ethernet, no wireless).&lt;br /&gt;&lt;br /&gt;

&lt;h4&gt;How to fix your wireless connection:&lt;/h4&gt;
I don't have a great solution here, since the problem might be that your house is just a "dead zone" as far as wireless goes.  Or there might be too many other routers/microwaves/cordless phones/other interference right around you.&lt;br /&gt;&lt;br /&gt;

But here are some ideas to try:
&lt;ul&gt;
&lt;li&gt;try changing the "channel" of your wireless router (it's a number from 1-11) to something very different from what it was before.&lt;/li&gt;
&lt;li&gt;try moving your router to a different place in the room (away from bookcases for example)&lt;/li&gt;
&lt;li&gt;try upgrading the firmware of your router (a pain, I know)&lt;/li&gt;
&lt;li&gt;buy a fancy new router&lt;/li&gt;
&lt;/ul&gt;

&lt;b&gt;Update:&lt;/b&gt; I bought the
&lt;a href="http://www.amazon.com/gp/product/B001UE8LRY?ie=UTF8&amp;tag=dussbradum-20&amp;linkCode=as2&amp;camp=1789&amp;creative=9325&amp;creativeASIN=B001UE8LRY"
&gt;Apple MB763LL/A AirPort Extreme Dual-band Base Station&lt;/a&gt;
and so far everything is great - 0% packet loss, great range.  It's a bit pricey, but I've decided I'm not going to skimp on productivity tools that I use every day.

</description><author>Dustin Boswell</author><guid>http://dustwell.com/diagnose-your-flaky-internet.html</guid><pubDate>Sun, 12 Jul 2009 00:00:00 GMT</pubDate></item><item><title>Is Drinking Distilled Water Dangerous?</title><link>http://dustwell.com/distilled-water.html</link><description>


&lt;img src="images/water-filter.jpg"&gt;

&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;

&lt;br /&gt;
&lt;span style="font-style: italic;"&gt;Note: I'm not a doctor or a zealot, and I'm not trying to sell anything - I'm just a regular guy summarizing all the research I did on this topic.
&lt;/span&gt;
&lt;br /&gt;There seems to be a lot of controversy about what kind of water people should be drinking and cooking their food with:
&lt;br /&gt;
&lt;ul&gt;
  &lt;li&gt;tap water &lt;/li&gt;
  &lt;li&gt;natural "spring" or "rain" water &lt;/li&gt;
  &lt;li&gt;filtered, reverse-osmosis, or other cleansed/purified/low-mineral water &lt;/li&gt;
  &lt;li&gt;distilled water &lt;/li&gt;
&lt;/ul&gt;The controversy is about the good &amp;amp; bad components in water.  There are a number of good minerals found in "mineralized" water sources including:
&lt;br /&gt;
&lt;ul&gt;
  &lt;li&gt;calcium &lt;/li&gt;
  &lt;li&gt;magnesium &lt;/li&gt;
  &lt;li&gt;potassium &lt;/li&gt;
  &lt;li&gt;sodium &lt;/li&gt;
&lt;/ul&gt;There are also a number of bad components that might be found in water (
&lt;a href="http://www.medicalnewstoday.com/articles/125610.php"&gt;whether it's bottled or not
&lt;/a&gt;)
&lt;br /&gt;
&lt;ul&gt;
  &lt;li&gt;
  &lt;a href="http://www.nytimes.com/2008/01/29/health/29real.html?em&amp;amp;ex=1201928400&amp;amp;en=453d5a482ccc6236&amp;amp;ei=5087%0A"&gt;lead&lt;/a&gt;
  , mercury, and other unhealthy heavy metals
  &lt;/li&gt;
  &lt;li&gt;bacteria &amp;amp; other live matter
  &lt;/li&gt;
  &lt;li&gt;man-made chemicals like plastics, 
  &lt;a href="http://www.cnn.com/2008/HEALTH/03/10/pharma.water1/index.html"&gt;medicines
  &lt;/a&gt;
  &lt;/li&gt;
&lt;/ul&gt;There is a wide range of waters, in terms of purity.  On the one extreme is distilled water, which should contain absolutely nothing but H2O.  On the other extreme is "hard water" which contains a large amount of minerals (and possibly chemicals).  There are many types of water in between, some of which have only small amounts, or just a subset, of the minerals in question.
&lt;br /&gt;
&lt;br /&gt;The "total dissolved solids" (TDS) is a measure of how much stuff (typically good minerals) is in your water.  Highly-purified water (distilled, reverse-osmosis, other highly-filtered) has a TDS well below 50mg/liter.  Distilled water ought to have TDS=0.  "Hard water" or mineralized water often has TDS &gt; 200mg/liter.  The debate is about which is healthier: low-TDS water, or high-TDS water.  Below are the common arguments that come up, and my take on them.
&lt;br /&gt;
&lt;span style="font-size:130%;"&gt;
  &lt;br /&gt;
  &lt;span style="font-weight: bold;"&gt;"Distilled/highly-purified water is missing essential minerals that your body needs."
    &lt;br /&gt;
  &lt;/span&gt;
&lt;/span&gt;  The amount of minerals in normal water is very small compared to the amount found in food (less than 10%).  A humorous thought experiment mentioned 
&lt;a href="http://www.cyber-nook.com/water/distilledwater.htm"&gt;here
&lt;/a&gt; was to imagine a blender with a day's worth of food and consider the tiny difference in minerals between adding distilled water to this, or regular water.  However, some argue that the calcium/magnesium/etc... in water is more easily absorbed by your body than from food or supplements.
&lt;br /&gt;I think this may be a more important issue if you are already low on these minerals, and don't get enough of them from other sources (e.g. eating nutritious food, cooking with tap water, drinking other liquids like orange juice, etc...)
&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size:130%;"&gt;
  &lt;span style="font-weight: bold;"&gt;"Purified water can "leach" metals and other bad chemicals from the pipes and containers."
  &lt;/span&gt;
&lt;/span&gt;
&lt;br /&gt;Apparently, pure/low-mineral water is chemically "unstable" and tends to dissolve the materials around it.  If you store your (pure) water for a long time, or get it through pipes that weren't set up to carry it, this is something you should think about.
&lt;br /&gt;Overall, this is really a contamination issue though, not about how distilled water affects your body.   I just wanted to mention it for completeness.
&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;font-size:130%;" &gt;"Purified water leaches vital minerals &amp;amp; ions from your body."
&lt;/span&gt;
&lt;br /&gt;It stands to reason that if you drink distilled water and urinate out any of these vital minerals, there is a net loss.  I get the impression that the dissolving/extracting power of distilled water is very high, and that drinking it will draw out many good chemicals (in addition to the bad impurities) from your body.  Perhaps distilled water is less dangerous when drunk with a meal?  I've read claims that distilled water is particularly bad during exercise (presumably because that is when your body needs those electrolytes the most).
&lt;br /&gt;Frustratingly, there doesn't seem to be much research on this.  Would it be that hard to measure the amount of these chemicals in the urine of distilled water drinkers compared to normal?
&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size:130%;"&gt;
  &lt;span style="font-weight: bold;"&gt;Then why do people drink distilled water?
  &lt;/span&gt;
&lt;/span&gt;
&lt;br /&gt;
&lt;ul&gt;
  &lt;li&gt;to avoid all the bad chemicals that might be found in tap water
  &lt;/li&gt;
  &lt;li&gt;they like the taste (as I do)
  &lt;/li&gt;
  &lt;li&gt;to extract and remove toxic substances from your body
  &lt;/li&gt;
&lt;/ul&gt;I understand the inclination to avoid tap water (that's a whole controversy of its own) - but good bottled water can achieve this goal.   As for taste, this is a personal matter, but presumably everyone could find a non-distilled alternative that they like (as I am going to do now) for drinking on a daily basis.   If you have a particular need to remove impurities from your body, I think distilled water is safe &amp;amp; effective in doing so, as long as you understand that it may remove good materials from your body as well.  I've also read that activated charcoal has been used for the same purpose.
&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-weight: bold;font-size:130%;" &gt;So, what water should I be drinking?
&lt;/span&gt;
&lt;br /&gt;I have no idea :)  How ironic is it that modern man cannot answer such a simple question, when drinking water is something that every life form on Earth was born from?
&lt;br /&gt;If you believe all the evidence in the references below, you should be drinking water with a high TDS (lots of calcium, magnesium, and other good stuff), that doesn't have toxic metals or chemicals.
&lt;br /&gt;There doesn't seem to be any specific health benefit from distilled water except for avoiding bad chemicals.  From what I've read, the only "danger" with drinking reasonable amounts of distilled water is the long-term mineral-deficiency it might cause in your body.  But maybe our Western diets &amp;amp; lifestyles are already so deficient in these minerals that distilled water exacerbates it?
&lt;br /&gt;There is also a lot of controversy about whether alkaline water (pH &gt; 7.0) is generally better for you because it helps your body be less acidic (which is the source of most disease according to alkaline-diet proponents).  I hope to research this more and post later about it.  But one point worth noting is that supposedly distilled waters are actually slightly acidic because they readily absorb CO2 from the air (carbonic acid).
&lt;br /&gt;
&lt;br /&gt;As I find out more about various bottled waters, I'll post them here.  For starters, you might consider:
&lt;br /&gt;
&lt;ul&gt;
  &lt;li&gt;
  &lt;a href="http://www.fijiwater.com/"&gt;Fiji Water
  &lt;/a&gt; - TDS &gt; 200, pH 7.5.  Fiji took a lot of flak when people 
  &lt;a href="http://www.treehugger.com/files/2007/02/pablo_calculate.php"&gt;calculated how much waste
  &lt;/a&gt; goes into a single bottle flown from around the world, but the 
  &lt;a href="http://www.fijigreen.com/CarbonNegative.html"&gt;company now aims to be carbon negative
  &lt;/a&gt;.
  &lt;/li&gt;
&lt;/ul&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;img src="images/fiji-bottle.jpg" /&gt;
&lt;br /&gt;
&lt;br /&gt;If you have a brand of water that you swear by, please comment on this post.
&lt;br /&gt;
&lt;br /&gt;
&lt;span style="font-size:130%;"&gt;
  &lt;span style="font-weight: bold;"&gt;Further reading:
  &lt;/span&gt;
&lt;/span&gt;
&lt;br /&gt;
&lt;a href="http://www.who.int/water_sanitation_health/dwq/nutdemineralized.pdf"&gt;http://www.who.int/water_sanitation_health/dwq/nutdemineralized.pdf
&lt;/a&gt; - a great report by the World Health Organization citing a lot of research on why water-without-minerals is unhealthy
&lt;br /&gt;
&lt;a href="http://www.cyber-nook.com/water/distilledwater.htm"&gt;http://www.cyber-nook.com/water/distilledwater.htm
&lt;/a&gt; - a page with lots of information on both sides of the issue
&lt;br /&gt;
&lt;a href="http://www.mgwater.com/calcium.shtml"&gt;http://www.mgwater.com/calcium.shtml
&lt;/a&gt; - a research paper showing that "hard water" was correlated with lower rates of cardiovascular death
&lt;br /&gt;
&lt;br /&gt;
&lt;img src="images/water-fall.jpg" /&gt;

&lt;div style='clear: both;'&gt;
&lt;/div&gt;

&lt;script type="text/javascript"&gt;&lt;!--
google_ad_client = "ca-pub-5621221933612064";
/* Ads#1 */
google_ad_slot = "8483206685";
google_ad_width = 728;
google_ad_height = 15;
//--&gt;
&lt;/script&gt;
&lt;script type="text/javascript"
src="http://pagead2.googlesyndication.com/pagead/show_ads.js"&gt;
&lt;/script&gt;


&lt;script&gt;
var disqus_url = "http://thoughts.dustwell.com/2008/11/is-drinking-distilled-water-dangerous.html";

&lt;/script&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/distilled-water.html</guid><pubDate>Sun, 09 Nov 2008 00:00:00 GMT</pubDate></item><item><title>SSH keys in 2 easy steps</title><link>http://dustwell.com/ssh-without-typing-passwords.html</link><description>

These are simple instructions that will let you ssh from one Linux machine to another without needing to type your password.

&lt;h2&gt;Step 1) Generate your key pair&lt;/h2&gt;
On your local machine (where you are ssh-ing &lt;i&gt;from&lt;/i&gt;) type:
&lt;pre class="code"&gt;
ssh-keygen
&lt;/pre&gt;
(Then hit ENTER to accept the default key file of &lt;tt&gt;~/.ssh/id_rsa&lt;/tt&gt;, and hit ENTER twice more if you're lazy and want to use a blank passphrase. The public key is written next to it as &lt;tt&gt;~/.ssh/id_rsa.pub&lt;/tt&gt;.)
Note that you only have to generate a key &lt;b&gt;once&lt;/b&gt; per client machine - the same public key will be used to access all servers.

&lt;h2&gt;Step 2) Copy your public key to the server&lt;/h2&gt;
Again, from your local machine, type:
&lt;pre class="code"&gt;
cat ~/.ssh/id_rsa.pub | ssh remote_user@remote.example.com "cat &amp;gt;&amp;gt; ~/.ssh/authorized_keys"
&lt;/pre&gt;
(but replace &lt;tt&gt;remote_user@remote.example.com&lt;/tt&gt; with your actual user and server.)

&lt;p&gt;This fancy shell command &lt;b&gt;appends&lt;/b&gt; the contents of your public key file to the end of the &lt;tt&gt;~/.ssh/authorized_keys&lt;/tt&gt; file on the server.
(If you did a simple &lt;tt&gt;scp&lt;/tt&gt; it would overwrite any previous authorized keys you've stored.)&lt;/p&gt;

&lt;h2&gt;You're done!&lt;/h2&gt;
Next time you ssh into the server
&lt;pre class=code&gt;
ssh remote_user@remote.example.com
&lt;/pre&gt;
It should do this without prompting for any passwords.
</description><author>Dustin Boswell</author><guid>http://dustwell.com/ssh-without-typing-passwords.html</guid><pubDate>Sat, 13 Feb 2010 00:00:00 GMT</pubDate></item><item><title>How to fix certain macbook wireless problems.</title><link>http://dustwell.com/fix-macbook-wireless-problems.html</link><description>

&lt;p&gt;
On a number of occasions, I've opened my MacBook Pro (OS X 10.5.8) quickly and tried to use the internet before AirPort could connect to my usual network.
Sometimes this causes the AirPort to get "stuck" in a weird state, where you can no longer use that network.
Even rebooting the MacBook doesn't help.
&lt;/p&gt;

&lt;p&gt;
Looking at System Preferences &amp;rarr; Network &amp;rarr; Advanced &amp;rarr; TCP/IP, I could see that the IPv4 address was something
like &lt;tt&gt;169.254.x.x&lt;/tt&gt; and the subnet mask was &lt;tt&gt;255.255.0.0&lt;/tt&gt;. This is the self-assigned (link-local) address that a Mac gives itself
when it fails to get an IP address from the DHCP server. (You might also try turning off your firewall.
&lt;a href="http://www.motherboardpoint.com/leopard-keeps-picking-up-spoofed-ip-t241385.html"&gt;Here's more info.&lt;/a&gt;)
&lt;/p&gt;

&lt;p&gt;
Here's what I've done to fix it:
&lt;ol&gt;
  &lt;li&gt;Turn the AirPort off (under System Preferences &amp;rarr; Network)&lt;/li&gt;
  &lt;li&gt;Open the Terminal App.&lt;/li&gt;
  &lt;li&gt;type &lt;tt style="font-size: 12px"&gt;sudo rm /Library/Preferences/SystemConfiguration/com.apple.network.identification.plist&lt;/tt&gt; and hit ENTER. (You will need to type your password.)&lt;/li&gt;
  &lt;li&gt;type &lt;tt style="font-size: 12px"&gt;sudo rm /Library/Preferences/SystemConfiguration/com.apple.airport.preferences.plist&lt;/tt&gt; and hit ENTER&lt;/li&gt;
  &lt;li&gt;restart your MacBook&lt;/li&gt;
  &lt;li&gt;Turn the AirPort on.&lt;/li&gt;
&lt;/ol&gt;
&lt;/p&gt;

&lt;p&gt;
Those plist files will get regenerated, and your wireless passwords are even remembered!
&lt;/p&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/fix-macbook-wireless-problems.html</guid><pubDate>Tue, 09 Aug 2011 00:00:00 GMT</pubDate></item><item><title>My Vim Cheat Sheet</title><link>http://dustwell.com/vim-cheat-sheet.html</link><description>

&lt;p&gt;
A friend of mine decided to make the switch from Emacs to Vim, and to help him out,
I gave him a copy of the cheat sheet I made to help me learn. Here it is - enjoy!
&lt;/p&gt;

&lt;a href="/images/ViVisualCueCard.pdf"&gt;
&lt;img style="border: 2px solid black" src="/images/ViVisualCueCard.png"&gt;
&lt;/a&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/vim-cheat-sheet.html</guid><pubDate>Tue, 07 Aug 2012 00:00:00 GMT</pubDate></item><item><title>How to Play Yes/No Proposition Bets (and "Lodden Thinks")</title><link>http://dustwell.com/yes_no_proposition_bets_and_lodden_thinks.html</link><description>

&lt;p&gt;
If you're a gambling man (like I am), and fancy yourself a good estimator,
here are some fun games you can play with your friends...
&lt;/p&gt;

&lt;h2&gt;Game 1: Over/Under betting, auction-style&lt;/h2&gt;

&lt;p&gt;
This is an even-money bet (say for $10), where the first person starts by making a claim like:

&lt;pre class=code&gt;I bet $10 that the Statue of Liberty is at least 50 feet tall.&lt;/pre&gt;

The second person has 2 choices:
&lt;ol&gt;
  &lt;li&gt;&lt;b&gt;Accept the bet&lt;/b&gt; (second person wins if Statue of Liberty is actually under 50 feet tall.)&lt;/li&gt;
  &lt;li&gt;&lt;b&gt;Make a bolder claim&lt;/b&gt; by increasing the value-in-question.&lt;/li&gt;
&lt;/ol&gt;
The bolder claim might be:
&lt;pre class=code&gt;I bet $10 that the Statue of Liberty is at least &lt;b&gt;200 feet tall&lt;/b&gt;.&lt;/pre&gt;
This keeps going back and forth until a bet is accepted. At that point, you have to go look up the fact
in question, and resolve the bet.
&lt;/p&gt;

&lt;p&gt;The value that the bet settles at is the &lt;a href="http://en.wikipedia.org/wiki/Over-under"&gt;Over-under&lt;/a&gt;
that combines the two people's estimates.  This is assuming both players play rationally, and neither
accidentally increases the value too much in one step.
&lt;/p&gt;

&lt;p&gt;Usually, there is a gentlemen's agreement that each person has to increase the value-in-question
by a certain fraction.  But in practice, this doesn't come up much, because if one person
were to be a sissy and increase the value from &lt;b&gt;200 feet&lt;/b&gt; to &lt;b&gt;200.01 feet&lt;/b&gt;, the second person
usually just leapfrogs this to a new reasonable value.
&lt;/p&gt;


&lt;h2&gt;Game 2: Lodden Thinks&lt;/h2&gt;
&lt;p&gt;
The show &lt;a href="http://www.pokerafterdark.com/"&gt;Poker After Dark&lt;/a&gt; has made the "Lodden Thinks"
version of this game popular. The "Lodden" is &lt;a href="http://en.wikipedia.org/wiki/Johnny_Lodden"&gt;Johnny Lodden&lt;/a&gt;, a famous poker player.
&lt;/p&gt;

&lt;p&gt;
This game is exactly the same as Game 1, except that the &lt;i&gt;value-in-question&lt;/i&gt; is something that
a &lt;b&gt;third&lt;/b&gt; person knows and will keep secret until the bet needs to be resolved.
&lt;/p&gt;

&lt;p&gt;
For example, in one episode of Poker After Dark, two players bet on the age at which
&lt;a href="http://en.wikipedia.org/wiki/Daniel_Negreanu"&gt;Daniel Negreanu&lt;/a&gt; lost his virginity.
&lt;/p&gt;

&lt;p&gt;
What's interesting about this game is that the &lt;i&gt;value-in-question&lt;/i&gt; doesn't have to be
something the third person &lt;i&gt;knows&lt;/i&gt; - it can be that third person's
&lt;b&gt;estimate&lt;/b&gt; of some unknown value.
For instance, in one episode, two players were betting on a third player's estimate of
&lt;a href="http://en.wikipedia.org/wiki/Hugh_Hefner"&gt;Hugh Hefner's&lt;/a&gt; age.

Before the betting started, the third player was asked to come up with an estimate,
and remember it, without telling anyone.  Now the other two players bet on this estimate.
&lt;/p&gt;

&lt;p&gt;It doesn't matter if the estimate is good or not, what you're really betting on is what
Lodden (or whoever the third person is) is thinking.
It's an interesting game because you're basically betting on who can do a better job of getting into the mind of another person (which is why poker players like this game).
&lt;/p&gt;

&lt;p&gt;
The other benefit of this game is that it doesn't require access to a computer to look up
the facts - you can play this game in the car (or at a poker table), for instance.
&lt;/p&gt;


&lt;h2&gt;Game 3: Odds-making on Yes-No Propositions&lt;/h2&gt;
&lt;p&gt;
Both of the previous games require betting on a &lt;i&gt;value-in-question&lt;/i&gt; that is a number.
But you can play a version where you bet on a yes-no proposition, such as
"Will the Lakers win the next championship?"
&lt;/p&gt;

&lt;p&gt;
To play this game, you have to decide on a fixed prize-pool (say $10) and each claim
is a fraction of that pool that is being bet against the remaining portion.
&lt;/p&gt;

&lt;p&gt;
Here's an example: the first person starts by saying
&lt;pre class=code&gt;I bet &lt;b&gt;$0.10&lt;/b&gt; (vs. your $9.90) that the Lakers will win.&lt;/pre&gt;
The first claim should always be chosen to be an extremely good bet.
In this case, the bet breaks even if the Lakers win 1 time in 100 ($0.10 out of the $10 pool), and their chances are probably better than that,
so the bet above is a good bet (for the first person).
&lt;/p&gt;
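
&lt;p&gt;
To see why a small opening claim is safe, it can help to work out the break-even probability of a claim.
The sketch below is only an illustration: the function names and the default $10 pool are assumptions made for this example, not part of the game's rules.
It shows that staking some amount against the rest of the pool breaks even when the event happens with probability stake / pool.
&lt;/p&gt;

```python
def break_even_probability(stake, pool=10.0):
    """Chance of winning at which staking `stake` against (pool - stake) breaks even."""
    # Win: gain (pool - stake) with probability p. Lose: lose stake with probability 1 - p.
    # Setting p * (pool - stake) - (1 - p) * stake = 0 gives p = stake / pool.
    return stake / pool


def expected_profit(stake, win_probability, pool=10.0):
    """Expected profit for the person making the claim."""
    return win_probability * (pool - stake) - (1 - win_probability) * stake


for stake in (0.10, 0.50, 5.00):
    print("stake $%.2f breaks even at p = %.3f" % (stake, break_even_probability(stake)))
```

&lt;p&gt;
Each raise pushes the break-even probability up, so the bet gets accepted roughly where the two players' estimates of the true probability meet.
&lt;/p&gt;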

&lt;p&gt;
The second person obviously never takes this first bet, but instead makes a bolder claim, like:
&lt;pre class=code&gt;I bet &lt;b&gt;$0.50&lt;/b&gt; (vs. your &lt;b&gt;$9.50&lt;/b&gt;) that the Lakers will win.&lt;/pre&gt;
This goes back and forth, raising the value each time, until someone accepts.
&lt;/p&gt;

&lt;p&gt;My friends and I are software engineers, so we tend to bet on weird things like
"Is the domain name &lt;a href="http://greenmonkeybutt.com"&gt;GreenMonkeyButt.com&lt;/a&gt; available?"
&lt;/p&gt;

&lt;p&gt;
Note that you always have to &lt;b&gt;phrase the question in the way that is MOST LIKELY&lt;/b&gt;,
so that the small initial bet is a good bet. For example, if you wanted to bet on
whether the Clippers will win the NBA championship, you'd want to start the betting as:
&lt;pre class=code&gt;I bet &lt;b&gt;$0.10&lt;/b&gt; (vs. your $9.90) that the Clippers will &lt;b&gt;lose&lt;/b&gt;.&lt;/pre&gt;
so that the bet amount can increase from there.  Otherwise, if you start the betting as:
&lt;pre class=code&gt;I bet &lt;b&gt;$0.10&lt;/b&gt; (vs. your $9.90) that the Clippers will &lt;b&gt;win&lt;/b&gt;.&lt;/pre&gt;
Then the second player might just stop right there and accept your bet (oops).
&lt;/p&gt;


</description><author>Dustin Boswell</author><guid>http://dustwell.com/yes_no_proposition_bets_and_lodden_thinks.html</guid><pubDate>Fri, 18 Feb 2011 00:00:00 GMT</pubDate></item><item><title>Storing User Passwords Securely: hashing, salting, and Bcrypt</title><link>http://dustwell.com/how-to-handle-passwords-bcrypt.html</link><description>

&lt;p&gt;
In this article, I'll explain the theory behind storing user passwords securely,
and show some example code in Python using a Bcrypt library.
&lt;/p&gt;

&lt;h2&gt;Bad Solution #1: plain text password&lt;/h2&gt;
It would be very insecure to store each user's "plain text" password in your database:

&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;user account&lt;/th&gt; &lt;th&gt;plain text password&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;john@hotmail.com&lt;/td&gt; &lt;td&gt;password&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;betty@gmail.com&lt;/td&gt; &lt;td&gt;password123&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt;&lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;

&lt;p&gt;
This is insecure because if a hacker gains access to your database,
they'll be able to use that password to log in as that user on your system.
Or even worse, if that user uses the same password for other sites on the
internet, the hacker can now log in there as well. Your users will be very unhappy.
&lt;/p&gt;

&lt;p&gt;
(Oh, and if you think no one would ever store passwords this way,
&lt;a href="http://www.informationweek.com/news/security/attacks/229900111"&gt;Sony did just this in 2011&lt;/a&gt;.)
&lt;/p&gt;


&lt;h2&gt;Bad Solution #2: sha1(password)&lt;/h2&gt;
A better solution is to store a "one-way hash" of the password,
typically using a function like &lt;a href="http://en.wikipedia.org/wiki/MD5"&gt;md5()&lt;/a&gt;
or &lt;a href="http://en.wikipedia.org/wiki/SHA-1"&gt;sha1()&lt;/a&gt;:

&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;user account&lt;/th&gt; &lt;th&gt;sha1(password)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;john@hotmail.com&lt;/td&gt; &lt;td&gt;5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;betty@gmail.com&lt;/td&gt; &lt;td&gt;cbfdac6008f9cab4083784cbd1874f76618d2a97&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt;&lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;

Even though the server doesn't store the plain text password anywhere,
it can still authenticate the user:

&lt;pre class="code"&gt;&lt;code&gt;def is_password_correct(user, password_attempt):
    # sha1() stands for a helper that returns the hex SHA-1 digest of a string
    return sha1(password_attempt) == user["sha1_password"]
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
This solution is more secure than storing the plain text password, because in theory
it should be impossible to "undo" a one-way hash function and find an input string that
outputs the same hash value.  Unfortunately, hackers have found ways around this.
&lt;/p&gt;

&lt;p&gt;
One problem is that some hash functions (including md5() and sha1()) have turned out to be weaker than intended
(practical collision attacks exist for both), and security experts suggest that these functions not be used anymore for security applications.
(Instead, you should use better hash functions like &lt;a href="http://en.wikipedia.org/wiki/SHA-2"&gt;sha256()&lt;/a&gt;
which don't have any known vulnerabilities so far.)
&lt;/p&gt;

&lt;p&gt;
But there's a bigger problem: hackers don't need to "undo" the hash function at all;
they can just keep guessing input passwords until they find a match.  This is similar
to trying all the combinations of a combination lock.  Here's what the code would look
like:
&lt;/p&gt;

&lt;pre class="code"&gt;&lt;code&gt;database_table = {
  "5baa61e4c9b93f3f0682250b6cf8331b7ee68fd8": "john@hotmail.com",
  "cbfdac6008f9cab4083784cbd1874f76618d2a97": "betty@gmail.com",
  ...}

for password in LIST_OF_COMMON_PASSWORDS:
    if sha1(password) in database_table:
        print("Hacker wins! I guessed a password!")
&lt;/code&gt;&lt;/pre&gt;
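
&lt;p&gt;
For readers who want to try this, here is a minimal runnable sketch of the same loop using Python's standard &lt;tt&gt;hashlib&lt;/tt&gt; module.
The helper names (&lt;tt&gt;sha1_hex&lt;/tt&gt;, &lt;tt&gt;crack&lt;/tt&gt;) and the four-entry password list are assumptions made for illustration; a real attacker would use a list with millions of entries.
&lt;/p&gt;

```python
import hashlib


def sha1_hex(text):
    """Return the hex SHA-1 digest of a string (UTF-8 encoded)."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Hypothetical leaked table built from the two example accounts: hash to account.
database_table = {
    sha1_hex("password"): "john@hotmail.com",
    sha1_hex("password123"): "betty@gmail.com",
}

# A tiny stand-in for a real list of common passwords.
LIST_OF_COMMON_PASSWORDS = ["123456", "password", "qwerty", "password123"]


def crack(table, candidates):
    """Return {account: password} for every candidate whose hash is in the table."""
    cracked = {}
    for password in candidates:
        account = table.get(sha1_hex(password))
        if account is not None:
            cracked[account] = password
    return cracked


print(crack(database_table, LIST_OF_COMMON_PASSWORDS))
```

&lt;p&gt;
Note that each candidate is hashed once and then checked against the whole table with a single dictionary lookup, which is why the number of users doesn't slow the attack down.
&lt;/p&gt;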

&lt;p&gt;
You might think that there are too many possible passwords for this technique to be
feasible.  But there are far fewer common passwords than you'd think.  Most people
use passwords that are based on dictionary words (possibly with a few extra numbers or
letters thrown in).

And most hash functions like sha1() can be executed very quickly -- one computer can
literally try &lt;a href="http://www.codinghorror.com/blog/2012/04/speed-hashing.html"
&gt;billions of combinations each second&lt;/a&gt;.  That means &lt;b&gt;most passwords can be figured
out in under 1 cpu-hour.&lt;/b&gt;  Programs like 
&lt;a href="http://en.wikipedia.org/wiki/John_the_Ripper"&gt;John The Ripper&lt;/a&gt; are able
to do just this.
&lt;/p&gt;

&lt;p&gt;
Aside: years ago, computers weren't this fast, so the hacker community created
&lt;a href="http://en.wikipedia.org/wiki/Rainbow_table"&gt;rainbow tables&lt;/a&gt;
that have pre-computed a large set of these hashes ahead of time. Today,
nobody uses rainbow tables anymore because computers are fast enough without them.
&lt;/p&gt;

&lt;p&gt;
So the bad news is that any user with a simple password like
&lt;tt&gt;"password"&lt;/tt&gt; or &lt;tt&gt;"password123"&lt;/tt&gt; or any of the billion most-likely
passwords will have their password guessed.
If you have an extremely complicated password
(over 16 random numbers and letters) you are probably safe.
&lt;/p&gt;

&lt;p&gt;
Also notice that the code above is effectively &lt;b&gt;attacking all of the passwords at the same
time.&lt;/b&gt; It doesn't matter if there are 10 users in your database, or 10 million, it
doesn't take the hacker any longer to guess a matching password. All that matters is how fast
the hacker can iterate through potential passwords.  (And in fact, having lots of users
actually &lt;b&gt;helps&lt;/b&gt; the hacker, because it's more likely that &lt;b&gt;someone&lt;/b&gt; in the
system was using the password &lt;tt&gt;"password123"&lt;/tt&gt;.)
&lt;/p&gt;

&lt;p&gt;sha1(password) is what &lt;a href="http://linkedin.com"&gt;LinkedIn&lt;/a&gt; used to store
its passwords. And in 2012,
a &lt;a href="http://securitywatch.pcmag.com/security/298795-linkedin-resets-affected-accounts-takes-member-security-seriously"
&gt;large set of those password hashes were leaked&lt;/a&gt;.  Over time, hackers were able to
figure out the plain text passwords for &lt;b&gt;most&lt;/b&gt; of these hashes.
&lt;/p&gt;

&lt;p&gt;
Summary: storing a simple hash (with no salt) is not secure -- if a hacker gains
access to your database, they'll be able to figure out the majority of the passwords
of the users.
&lt;/p&gt;


&lt;h2&gt;Bad Solution #3: sha1(FIXED_SALT + password)&lt;/h2&gt;
One attempt to make things more secure is to "salt" the password before hashing it:

&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;user account&lt;/th&gt; &lt;th&gt;sha1("salt123456789" + password)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;john@hotmail.com&lt;/td&gt; &lt;td&gt;b467b644150eb350bbc1c8b44b21b08af99268aa&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;betty@gmail.com&lt;/td&gt; &lt;td&gt;31aa70fd38fee6f1f8b3142942ba9613920dfea0&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt;&lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;

&lt;p&gt;
The salt is supposed to be a long random string of bytes.
If the hacker gains access to these new password hashes (but not the salt), it will
make it much more difficult for the hacker to guess the passwords because they would
also need to know the salt.  However, if the hacker has broken into your server, they
probably have access to your source code as well, so they'll learn the salt too.  That's
why security designers just assume the worst, and don't rely on the salt being secret.
&lt;/p&gt;

&lt;p&gt;But even if the salt is not a secret, it still makes it harder to use
those old-school &lt;b&gt;rainbow tables&lt;/b&gt; I mentioned before.
(Those rainbow tables are built assuming there is no salt, so salted hashes stop them.)
However, since no one uses rainbow tables anymore, adding a fixed salt doesn't help much.
The hacker can still execute the same basic for-loop from above:

&lt;pre class="code"&gt;&lt;code&gt;for password in LIST_OF_COMMON_PASSWORDS:
    if sha1(SALT + password) in database_table:
        print("Hacker wins! I guessed a password!", password)
&lt;/code&gt;&lt;/pre&gt;

Summary: adding a fixed salt still isn't secure enough.


&lt;h2&gt;Bad Solution #4: sha1(PER_USER_SALT + password)&lt;/h2&gt;
The next step up in security is to create a new column in the database and store
a different salt for each user.  The salt is randomly created when the user account
is first created (or when the user changes their password).

&lt;pre&gt;
&lt;table border=2px cellpadding=5px&gt;
&lt;tr&gt;&lt;th&gt;user account&lt;/th&gt; &lt;th&gt;salt&lt;/th&gt; &lt;th&gt;sha1(salt + password)&lt;/th&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;john@hotmail.com&lt;/td&gt; &lt;td&gt;2dc7fcc...&lt;/td&gt; &lt;td&gt;1a74404cb136dd60041dbf694e5c2ec0e7d15b42&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;betty@gmail.com&lt;/td&gt; &lt;td&gt;afadb2f...&lt;/td&gt; &lt;td&gt;e33ab75f29a9cf3f70d3fd14a7f47cd752e9c550&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;...&lt;/td&gt;&lt;td&gt;...&lt;/td&gt;&lt;td&gt;...&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;

Authenticating the user isn't much harder than before:

&lt;pre class="code"&gt;&lt;code&gt;def is_password_correct(user, password_attempt):
    return sha1(user["salt"] + password_attempt) == user["password_hash"]
&lt;/code&gt;&lt;/pre&gt;
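
&lt;p&gt;
As a rough, self-contained sketch of this scheme (still one of the "bad" solutions, shown only to make the mechanics concrete), the code below creates a random salt per user with &lt;tt&gt;os.urandom&lt;/tt&gt; and checks a login attempt.
The record layout, the 16-byte salt length, and the helper names are assumptions for this example.
&lt;/p&gt;

```python
import hashlib
import hmac
import os


def salted_sha1(salt_hex, password):
    """Hex SHA-1 of salt + password, mirroring sha1(salt + password) in the table."""
    return hashlib.sha1((salt_hex + password).encode("utf-8")).hexdigest()


def create_user(account, password):
    """Build a user record with a fresh random salt (16 bytes, hex-encoded)."""
    salt_hex = os.urandom(16).hex()
    return {"account": account, "salt": salt_hex,
            "password_hash": salted_sha1(salt_hex, password)}


def is_password_correct(user, password_attempt):
    attempt_hash = salted_sha1(user["salt"], password_attempt)
    # compare_digest avoids leaking information through comparison timing
    return hmac.compare_digest(attempt_hash, user["password_hash"])


john = create_user("john@hotmail.com", "password")
betty = create_user("betty@gmail.com", "password")
# Same password, different salts, so the stored hashes differ.
print(john["password_hash"] != betty["password_hash"])
print(is_password_correct(john, "password"), is_password_correct(john, "letmein"))
```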

By having a per-user-salt, we get one huge benefit: &lt;b&gt;the hacker can't attack all of your users'
passwords at the same time.&lt;/b&gt;  Instead, the attack code has to try each user one by one:

&lt;pre class="code"&gt;&lt;code&gt;for user in users:
    PER_USER_SALT = user["salt"]

    for password in LIST_OF_COMMON_PASSWORDS:
        # A hash salted for this user can only match this user's stored hash
        if sha1(PER_USER_SALT + password) == user["password_hash"]:
            print("Hacker wins! I guessed a password!", password)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
So basically, if you have 1 million users, having a per-user-salt makes it 1 million times
harder to figure out the passwords of &lt;i&gt;all&lt;/i&gt; your users.  But this still isn't impossible for a hacker to do.
Instead of 1 cpu-hour, now they need 1 million cpu-hours, which can easily be
&lt;a href="http://aws.amazon.com/ec2/pricing/"&gt;rented from Amazon for about $40,000.&lt;/a&gt;
&lt;/p&gt;

&lt;p&gt;
The real problem with all the systems we've discussed so far is that hash functions
like sha1() (or even sha256()) can be executed on passwords at a rate of 100M+/second
(or even faster, by &lt;a href="http://www.codinghorror.com/blog/2012/04/speed-hashing.html"&gt;using the GPU&lt;/a&gt;).
Even though these hash functions were designed with security in mind, they were also
designed so they would be fast when executed on longer inputs like entire files.
&lt;b&gt;Bottom line: these hash functions were not designed to be used for password storage.&lt;/b&gt;
&lt;/p&gt;


&lt;h2&gt;Good Solution: bcrypt(password)&lt;/h2&gt;
&lt;p&gt;
Instead, there is a set of hash functions that &lt;i&gt;were&lt;/i&gt; specifically designed for passwords.
In addition to being secure "one-way" hash functions, they were also &lt;b&gt;designed to be slow&lt;/b&gt;.
&lt;/p&gt;

&lt;p&gt;One example is &lt;a href="http://en.wikipedia.org/wiki/Bcrypt"&gt;Bcrypt&lt;/a&gt;.
bcrypt() takes about 100ms to compute, which is about 10,000x slower than sha1().
100ms is fast enough that the user won't notice when they log in, but slow enough that
it becomes less feasible to execute against a long list of likely passwords.
For instance, if a hacker wants to compute bcrypt() against a list of a billion likely passwords,
it will take about 30,000 cpu-hours (about $1200) -- and that's for a single user's password.
Certainly not impossible, but way more work than most hackers are willing to do.
&lt;/p&gt;
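
&lt;p&gt;
The arithmetic behind that estimate can be sketched in a few lines.
The price of $0.04 per cpu-hour is an assumption, taken from the earlier figure of about $40,000 for 1 million cpu-hours; the exact result is a little under 28,000 cpu-hours, which rounds to the 30,000 quoted above.
&lt;/p&gt;

```python
def cpu_hours(num_guesses, seconds_per_guess):
    """CPU time, in hours, needed to hash every guess once."""
    return num_guesses * seconds_per_guess / 3600.0


# Assumption: implied by the earlier "$40,000 for 1 million cpu-hours" figure.
PRICE_PER_CPU_HOUR = 0.04

hours = cpu_hours(1e9, 0.100)  # a billion guesses at 100ms per bcrypt() call
print("%.0f cpu-hours, about $%.0f" % (hours, hours * PRICE_PER_CPU_HOUR))
```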

&lt;p&gt;
If you're wondering how Bcrypt works, here's &lt;a href="http://www.openbsd.org/papers/bcrypt-paper.ps"&gt;the paper&lt;/a&gt;.
Basically the "trick" is that it executes an internal encryption/hash function many times in a loop.
(There are other alternatives to Bcrypt, such as &lt;a href="http://en.wikipedia.org/wiki/PBKDF2"&gt;PBKDF2&lt;/a&gt; that use the same trick.)
&lt;/p&gt;
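
&lt;p&gt;
To make the "many times in a loop" idea concrete without installing anything, here is a small sketch using PBKDF2 from Python's standard library (&lt;tt&gt;hashlib.pbkdf2_hmac&lt;/tt&gt;).
This is not Bcrypt, just the same trick; the iteration count, the 16-byte salt, and the helper names are assumptions you would tune for your own system.
&lt;/p&gt;

```python
import hashlib
import hmac
import os

# Assumption: pick a count that makes one hash take a noticeable fraction of a second.
ITERATIONS = 200000


def hash_password(password, salt=None, iterations=ITERATIONS):
    """Return (salt, derived_key) using PBKDF2-HMAC-SHA256."""
    if salt is None:
        salt = os.urandom(16)  # fresh random salt for each new password
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)
    return salt, key


def verify_password(password, salt, expected_key, iterations=ITERATIONS):
    """Re-derive the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt, iterations)
    return hmac.compare_digest(key, expected_key)


salt, key = hash_password("billy was a turtle for halloween")
print(verify_password("billy was a turtle for halloween", salt, key))
print(verify_password("billy123", salt, key))
```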

&lt;p&gt;
Also, Bcrypt is configurable, with a &lt;tt&gt;log_rounds&lt;/tt&gt; parameter that tells it how many times to
execute that internal hash function.  If all of a sudden, Intel comes out with a new computer
that is 1000 times faster than the state of the art today, you can reconfigure your system
to use a log_rounds that is 10 more than before (log_rounds is logarithmic), which will cancel
out the 1000x faster computer.
&lt;/p&gt;
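
&lt;p&gt;
Since each extra unit of &lt;tt&gt;log_rounds&lt;/tt&gt; doubles the work, the adjustment for a given hardware speedup is just a base-2 logarithm rounded up.
A tiny sketch (the function name is made up for this example):
&lt;/p&gt;

```python
import math


def extra_log_rounds(speedup):
    """Extra log_rounds needed to cancel out hardware that is `speedup` times faster."""
    # Each +1 in log_rounds doubles the work, so 2 ** extra must reach the speedup.
    return math.ceil(math.log2(speedup))


print(extra_log_rounds(1000))  # 10, since 2 ** 10 = 1024
```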

&lt;p&gt;
Because bcrypt() is so slow, it makes the idea of rainbow tables attractive again, so a per-user-salt
is built into the Bcrypt system.  In fact, libraries like &lt;a href="https://pypi.python.org/pypi/bcrypt/2.0.0"&gt;bcrypt on pypi&lt;/a&gt;
store the salt in the same string as the password hash, so you won't even have to create a separate database column for the salt.
&lt;/p&gt;

&lt;p&gt;Let's see the code in action.  First, let's install it:
&lt;/p&gt;

&lt;pre class="code"&gt;&lt;code class="language-bash"&gt;sudo apt-get install libffi-dev libssl-dev
sudo pip install bcrypt
python -c "import bcrypt"   # did it work?
&lt;/code&gt;&lt;/pre&gt;

Now that it's installed, here's the Python code you'd run when creating a new user account (or resetting their password):
&lt;pre class="code"&gt;&lt;code class="language-python"&gt;from bcrypt import hashpw, gensalt
# Note: recent versions of the bcrypt package expect bytes (e.g. b"secret"), not str
hashed = hashpw(plaintext_password, gensalt())
print(hashed)    # save this value to the database for this user
# e.g. '$2a$12$8vxYfAWCXe0Hm4gNX8nzwuqWNukOkcMJ1a9G2tD71ipotEZ9f80Vu'
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Let's dissect that output string a little:&lt;/p&gt;

&lt;img style="border: 2px solid black" src="/images/bcrypt_output_string2.png" /&gt;

&lt;p&gt;As you can see, it stores both the salt, and the hashed output in the string.
It also stores the &lt;tt&gt;log_rounds&lt;/tt&gt; parameter that was used to generate the password, which controls how
much work (i.e. how slow) it is to compute.  If you want the hash to be slower, you pass a larger value to &lt;tt&gt;gensalt()&lt;/tt&gt;:
&lt;/p&gt;

&lt;pre class="code"&gt;&lt;code class="language-python"&gt;# log_rounds is passed positionally because the keyword name differs between bcrypt libraries
hashed = hashpw(plaintext_password, gensalt(13))
print(hashed)
# e.g. '$2a$13$ZyprE5MRw2Q3WpNOGZWGbeG7ADUre1Q8QO.uUUtcbqloU0yvzavOm'
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;
Notice that there is now a &lt;tt&gt;13&lt;/tt&gt; where there was a &lt;tt&gt;12&lt;/tt&gt; before.  In any case, you store this string in the database,
and when that same user attempts to log in, you retrieve that same &lt;tt&gt;hashed&lt;/tt&gt; value and do this:
&lt;/p&gt;

&lt;pre class="code"&gt;&lt;code&gt;if hashpw(password_attempt, hashed) == hashed:
    print("It matches")
else:
    print("It does not match")
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;You might be wondering why you pass in &lt;tt&gt;hashed&lt;/tt&gt; as the salt
argument to &lt;tt&gt;hashpw()&lt;/tt&gt;.  The reason this works is that the
hashpw() function is smart, and can extract the salt from that
&lt;tt&gt;$2a$12$...&lt;/tt&gt; string.
This is great, because it means you never have to store, parse, or handle
any salt values yourself -- the only value you need to deal with is that
single &lt;tt&gt;hashed&lt;/tt&gt; string which contains everything you need.
&lt;/p&gt;
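&lt;p&gt;To make the "everything is in one string" point concrete, here is a rough sketch that splits the string by hand.  It assumes the usual bcrypt layout (version, cost, then a 22-character salt followed by a 31-character hash).  The helper &lt;tt&gt;parse_bcrypt_string()&lt;/tt&gt; is a name I made up for illustration; it is not part of the bcrypt library, and you never need to do this yourself.&lt;/p&gt;

```python
# Hypothetical helper (not part of the bcrypt library): split a bcrypt
# string into its parts, assuming the layout $version$cost$salt+hash.
def parse_bcrypt_string(hashed):
    # The leading "$" produces an empty first field, which we discard.
    _, version, cost, salt_and_hash = hashed.split("$")
    return {
        "version": version,            # e.g. "2a"
        "log_rounds": int(cost),       # work factor; the work doubles per step
        "salt": salt_and_hash[:22],    # first 22 characters encode the salt
        "hash": salt_and_hash[22:],    # remaining 31 characters are the hash
    }

parts = parse_bcrypt_string(
    "$2a$12$8vxYfAWCXe0Hm4gNX8nzwuqWNukOkcMJ1a9G2tD71ipotEZ9f80Vu")
print(parts["version"], parts["log_rounds"])   # 2a 12
print(parts["salt"])                           # 8vxYfAWCXe0Hm4gNX8nzwu
print(parts["hash"])                           # qWNukOkcMJ1a9G2tD71ipotEZ9f80Vu
```

&lt;p&gt;This is roughly the bookkeeping that &lt;tt&gt;hashpw()&lt;/tt&gt; does for you when you hand it a full hash as the salt argument.&lt;/p&gt;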


&lt;h2&gt;Final Thoughts: choosing a good password&lt;/h2&gt;
&lt;p&gt;
If your user has the password &lt;tt&gt;"password"&lt;/tt&gt;, then no amount of hashing/salting/bcrypt/etc. is going
to protect that user.  The hacker will always try simpler passwords first, so if your password is toward the top
of the list of likely passwords, the hacker will probably guess it.
&lt;/p&gt;

&lt;p&gt;
The best way to prevent your password from being guessed is to create a
password that is as far down the list of likely passwords as possible.
Any password based on a dictionary word (even if it has simple mutations
like a letter/number at the end) is going to be on the list of the first
few million password guesses.
&lt;/p&gt;

&lt;p&gt;
Unfortunately, difficult-to-guess passwords are also difficult to remember.
If that weren't an issue, I would suggest picking a password that is a 16-character random
sequence of numbers and letters.  Other people have suggested
&lt;a href="http://www.codinghorror.com/blog/2005/07/passwords-vs-pass-phrases.html"&gt;using passphrases&lt;/a&gt;
instead, like &lt;tt&gt;"billy was a turtle for halloween"&lt;/tt&gt;.  If your system allows long
passwords with spaces, then this is definitely better than a password like &lt;tt&gt;"billy123"&lt;/tt&gt;.
(But I actually suspect the entropy of most users' passphrases will end up being about the same as a
password of 8 random alphanumeric characters.)
&lt;/p&gt;
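&lt;p&gt;For a sense of scale, here is a back-of-the-envelope entropy calculation.  It assumes every character is picked independently and uniformly at random from upper-case letters, lower-case letters, and digits (62 symbols); real user-chosen passwords and passphrases are not random like this, so treat the numbers as upper bounds.  The function name is just for illustration.&lt;/p&gt;

```python
import math

def random_password_bits(length, alphabet_size):
    # Each independently, uniformly chosen character contributes
    # log2(alphabet_size) bits of entropy.
    return length * math.log2(alphabet_size)

ALPHANUMERIC = 26 + 26 + 10   # a-z, A-Z, 0-9

print(round(random_password_bits(16, ALPHANUMERIC), 1))   # 95.3
print(round(random_password_bits(8, ALPHANUMERIC), 1))    # 47.6
```

&lt;p&gt;So under these assumptions, the 16-character random password has twice as many bits as the 8-character one, which means the attacker's guess list has to be about 2&lt;sup&gt;48&lt;/sup&gt; times longer.&lt;/p&gt;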
</description><author>Dustin Boswell</author><guid>http://dustwell.com/how-to-handle-passwords-bcrypt.html</guid><pubDate>Mon, 18 Jun 2012 00:00:00 GMT</pubDate></item><item><title>div and span: display = 'block', 'inline', or 'inline-block' ?</title><link>http://dustwell.com/div-span-inline-block.html</link><description>

&lt;h2&gt;Background: the difference between &lt;tt&gt;div&lt;/tt&gt; and &lt;tt&gt;span&lt;/tt&gt;&lt;/h2&gt;

&lt;div style="width: 750px;"&gt;
&lt;div style="width: 380px; display: inline-block"&gt;
&lt;h2&gt;&lt;tt&gt;&amp;lt;div&amp;gt;&lt;/tt&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;A "block-level element"&lt;/li&gt;
&lt;li&gt;can contain all other elements!&lt;/li&gt;
&lt;li&gt;can only be inside other block-level elements&lt;/li&gt;
&lt;li&gt;defines a rectangular region on the page&lt;/li&gt;
&lt;li&gt;tries to be as wide as possible&lt;/li&gt;
&lt;li&gt;begins on a "new line", and has a "carriage return" at the end, like a &amp;lt;p&amp;gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;

&lt;div style="width: 350px; display: inline-block; vertical-align: top"&gt;
&lt;h2&gt;&lt;tt&gt;&amp;lt;span&amp;gt;&lt;/tt&gt;&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;An "inline element"&lt;/li&gt;
&lt;li&gt;cannot contain block-level elements!!&lt;/li&gt;
&lt;li&gt;can be inside any other element&lt;/li&gt;
&lt;li&gt;defines a "snake" on the page&lt;/li&gt;
&lt;li&gt;tries to be as small as possible&lt;/li&gt;
&lt;li&gt;doesn't create any new lines.&lt;/li&gt;
&lt;/ul&gt;
&lt;/div&gt;
&lt;/div&gt;

&lt;fieldset style="width: 300px"&gt;
  &lt;legend&gt;Simple &lt;tt&gt;&amp;lt;div&amp;gt;&lt;/tt&gt;s and &lt;tt&gt;&amp;lt;span&amp;gt;&lt;/tt&gt;s&lt;/legend&gt;
  &lt;span style="background-color: red;"&gt;span&lt;/span&gt;
  &lt;span style="background-color: red;"&gt;span . . . . . . . . . . . . . . . . . . . . . . . . . . .&lt;/span&gt;
  &lt;span style="background-color: red;"&gt;span&lt;/span&gt;
  &lt;div style="background-color: green;"&gt;div&lt;/div&gt;
  &lt;span style="background-color: red;"&gt;span&lt;/span&gt;
  &lt;div style="background-color: green;"&gt;div&lt;/div&gt;
&lt;/fieldset&gt;

&lt;p&gt;
From a rendering point of view,&lt;br&gt;
&lt;blockquote&gt;
 &lt;b&gt;&lt;tt&gt;&amp;lt;span&amp;gt;&lt;/tt&gt; == &lt;tt&gt;&amp;lt;div style="display: inline"&amp;gt;&lt;/tt&gt;&lt;/b&gt;
&lt;/blockquote&gt;
and
&lt;blockquote&gt;
&lt;b&gt;&lt;tt&gt;&amp;lt;div&amp;gt;&lt;/tt&gt; == &lt;tt&gt;&amp;lt;span style="display: block"&amp;gt;&lt;/tt&gt;&lt;/b&gt;.
&lt;/blockquote&gt;
As for HTML syntax, however, a &lt;tt&gt;div&lt;/tt&gt; cannot be nested inside an inline element, whereas a &lt;tt&gt;span&lt;/tt&gt; cannot contain block-level elements.
&lt;/p&gt;

&lt;p&gt;
But there is also the mysterious &lt;tt&gt;"display: inline-block"&lt;/tt&gt;.  What is it...?

&lt;h2&gt;&lt;tt&gt;block&lt;/tt&gt; vs &lt;tt&gt;inline&lt;/tt&gt; vs &lt;tt&gt;inline-block&lt;/tt&gt;&lt;/h2&gt;
Below are a bunch of &lt;tt&gt;&amp;lt;div style="width: 60px"...&amp;gt;&lt;/tt&gt; elements with different &lt;tt&gt;display:&lt;/tt&gt; settings.
&lt;br&gt;

&lt;div style="width: 700px; margin: 10px;"&gt;
&lt;fieldset style="width: 150px; display:inline-block"&gt;
  &lt;legend&gt;&lt;b&gt;&lt;tt&gt;display: block&lt;/tt&gt;&lt;/b&gt;&lt;/legend&gt;
  &lt;div style="width: 60px; display: block; background-color: red; border: 3px solid black;"&gt;display: block&lt;/div&gt;
  &lt;div style="width: 60px; display: block; background-color: red; border: 3px solid black;"&gt;display: block&lt;/div&gt;
  &lt;div style="width: 60px; display: block; background-color: red; border: 3px solid black;"&gt;display: block&lt;/div&gt;
&lt;/fieldset&gt;

&lt;fieldset style="width: 190px; display:inline-block; vertical-align: top"&gt;
  &lt;legend&gt;&lt;b&gt;&lt;tt&gt;display: inline&lt;/tt&gt;&lt;/b&gt;&lt;/legend&gt;
  &lt;div style="width: 60px; display: inline; background-color: red; border: 3px solid black;"&gt;display: inline&lt;/div&gt;
  &lt;div style="width: 60px; display: inline; background-color: red; border: 3px solid black;"&gt;display: inline&lt;/div&gt;
  &lt;div style="width: 60px; display: inline; background-color: red; border: 3px solid black;"&gt;display: inline&lt;/div&gt;
&lt;/fieldset&gt;

&lt;fieldset style="width: 230px; display:inline-block; vertical-align: top"&gt;
  &lt;legend&gt;&lt;b&gt;&lt;tt&gt;display: inline-block&lt;/tt&gt;&lt;/b&gt;&lt;/legend&gt;
  &lt;div style="width: 60px; display: inline-block; background-color: red; border: 3px solid black;"&gt;display: inline-block&lt;/div&gt;
  &lt;div style="width: 60px; display: inline-block; background-color: red; border: 3px solid black;"&gt;display: inline-block&lt;/div&gt;
  &lt;div style="width: 60px; display: inline-block; background-color: red; border: 3px solid black;"&gt;display: inline-block&lt;/div&gt;
&lt;/fieldset&gt;
&lt;/div&gt;
 
As you can see, &lt;tt&gt;inline-block&lt;/tt&gt; is a hybrid that:
&lt;ul&gt;
&lt;li&gt;Creates a rectangular region (a block)&lt;/li&gt;
&lt;li&gt;Doesn't create any new lines (hence "in line")&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;For more information&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href="http://www.quirksmode.org/css/display.html"&gt;quirksmode.org - what CSS "display:" does&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="http://stackoverflow.com/questions/1142104/is-div-different-from-span-styledisplayblock"&gt;
stackoverflow.com - is "div" different than "span display: block"&lt;/a&gt; &lt;/li&gt;
&lt;/ul&gt;

</description><author>Dustin Boswell</author><guid>http://dustwell.com/div-span-inline-block.html</guid><pubDate>Tue, 26 Oct 2010 00:00:00 GMT</pubDate></item><item><title>ETF's leak money</title><link>http://dustwell.com/etfs-leak-money.html</link><description>

&lt;p&gt;
&lt;img style="float: right" src="/images/oil.jpg" /&gt;
An ETF (Exchange Traded Fund) is a fund whose shares trade like a stock (similar to a mutual fund) and that lets
ordinary investors invest in commodities and assets like oil and gold.
&lt;/p&gt;

&lt;p&gt;
Without an ETF, it would be difficult to invest in the price of oil.
You could invest in an oil company, like
Exxon Mobil (&lt;a href="http://www.google.com/finance?q=NYSE%3AXOM"&gt;XOM&lt;/a&gt;)
or BP (&lt;a href="http://www.google.com/finance?q=NYSE:BP"&gt;BP&lt;/a&gt;),
but the prices of these stocks only loosely correlate with the price of oil.
You could theoretically rent out a tanker to physically store the oil,
but only big firms like JP Morgan can afford &lt;a href="http://www.reuters.com/article/2009/06/03/energy-products-storage-idUSL365078320090603"&gt;to do this&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;So instead you buy an ETF like &lt;a href="http://www.google.com/finance?q=olo"&gt;OLO&lt;/a&gt;,
whose daily price changes are designed to match the daily changes in oil as closely as possible.
If oil goes up by 5% one day, this ETF should go up by about 5%, too.
They even have &lt;i&gt;short&lt;/i&gt; ETFs (like &lt;a href="http://www.google.com/finance?q=NYSE%3ASZO"&gt;SZO&lt;/a&gt;), whose price
should go &lt;i&gt;down&lt;/i&gt; by 5% if oil goes up 5%.
&lt;/p&gt;

&lt;p&gt;
The way these ETFs work is by buying and selling short-term oil futures, and
"rolling" these contracts from month-to-month, buying new ones to replace expiring ones.
They don't actually have to own any barrels of oil.
&lt;/p&gt;

&lt;h2&gt;ETF's leak money&lt;/h2&gt;
&lt;img style="width: 200px; float: right" src="/images/oil_leak.jpg"&gt;
&lt;p&gt;
What most people don't realize is that this process &lt;b&gt;leaks money&lt;/b&gt;.
It takes money to run the ETF - buying and selling futures has its transaction costs,
plus you have to pay guys in fancy suits to watch over the whole operation.
&lt;/p&gt;

&lt;p&gt;
By "leaks money", I mean that the price of the ETF slowly goes down over time.
It's like a boat with a small hole in it - the waves (oil prices) may bounce 
the boat up and down, but ultimately the boat will sink.
&lt;/p&gt;

&lt;p&gt;So how big is the hole? How fast does it leak money?
It's hard to measure exactly, but here is some interesting data:
the table below shows prices for OLO, SZO, and &lt;a href="http://quotes.post1.org/historical-crude-oil-price-chart/"&gt;WTI crude futures&lt;/a&gt;,
on 2 dates about a year apart:
&lt;/p&gt;

&lt;div style="clear: both"&gt;&lt;/div&gt;

&lt;pre&gt;
&lt;table style="border: 2px solid black; text-align: center;" border="2px"&gt;
&lt;tr&gt;
  &lt;th&gt;Date&lt;/th&gt;&lt;th&gt;WTI crude futures&lt;/th&gt;&lt;th&gt;OLO (long oil)&lt;/th&gt;&lt;th&gt;SZO (short oil)&lt;/th&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td&gt;Jan 8, 2010&lt;/td&gt;
  &lt;td&gt;$83.25&lt;/td&gt;
  &lt;td&gt;$14.06&lt;/td&gt;
  &lt;td&gt;$46.15&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
  &lt;td&gt;Dec 31, 2010&lt;/td&gt;
  &lt;td&gt;$89.68*&lt;/td&gt;
  &lt;td&gt;$14.00&lt;/td&gt;
  &lt;td&gt;$44.31&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style="border-top: 4px solid black"&gt;
  &lt;td&gt;Change:&lt;/td&gt;
  &lt;td style="color: green"&gt;+8%&lt;/td&gt;
  &lt;td&gt;+0%&lt;/td&gt;
  &lt;td style="color: red"&gt;-4%&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;
&lt;/pre&gt;
  
&lt;p&gt;
[* Dec 30, 2010 price; no data was available for Dec 31.  The price was $91.58 on Jan 3, 2011.]&lt;br /&gt;
As you can see, the price of oil went up 8% over that time, yet OLO remained at the same price.
(Now, there might be some intricacies I'm missing, in the way futures prices are calculated at the end of the day,
or there might be other weird near-expiration effects happening.)
&lt;/p&gt;

&lt;p&gt;But there's another way to measure the leak: if there were no leak, the price of
&lt;tt&gt;$100-of-OLO + $100-of-SZO&lt;/tt&gt; should remain constant.  That is, buying equally-weighted shares of each
puts you in a combined &lt;i&gt;neutral&lt;/i&gt; position where the price increase of one stock should cancel out the price
decrease of the other.
However, the combined &lt;tt&gt;$100-of-OLO + $100-of-SZO&lt;/tt&gt; went down about 2% during that time.
(I should really calculate this leakage on a month-to-month basis, and average over a number of years.)
&lt;/p&gt;
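&lt;p&gt;Here is the arithmetic behind those numbers as a small sketch.  The prices are the ones from the table above; the function names are just illustrative, and the calculation ignores fees, dividends, and bid/ask spreads.&lt;/p&gt;

```python
def pct_change(start, end):
    # Percentage change from start to end.
    return (end - start) / start * 100.0

# (price on Jan 8, 2010, price at end of Dec 2010), taken from the table
prices = {
    "WTI": (83.25, 89.68),
    "OLO": (14.06, 14.00),
    "SZO": (46.15, 44.31),
}

changes = {name: pct_change(start, end) for name, (start, end) in prices.items()}

def neutral_position_change(stake=100.0):
    # Put the same dollar amount into OLO and SZO, then see what the
    # combined position is worth at the end of the period.
    pairs = (prices["OLO"], prices["SZO"])
    end_value = sum(stake * end / start for start, end in pairs)
    return pct_change(2 * stake, end_value)

for name in ("WTI", "OLO", "SZO"):
    print(name, round(changes[name], 1))       # 7.7, -0.4, -4.0
print(round(neutral_position_change(), 1))     # -2.2
```

&lt;p&gt;The combined position ends up worth about $195.59 of the original $200, which is where the "about 2%" comes from.&lt;/p&gt;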

&lt;h2&gt;Don't buy-and-hold ETFs&lt;/h2&gt;
&lt;p&gt;
Some people believe that oil will run out in the next 20 or 30 years.
And they might be right, but buying-and-holding an ETF like OLO isn't a good way to make money
in the long term. During those 20 or 30 years, OLO is sinking by a few percent each year.
&lt;/p&gt;
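&lt;p&gt;To see what "a few percent each year" adds up to, here is a rough compounding sketch.  It assumes the leak is a constant 2% or 3% per year and that the price of oil itself stays flat, neither of which is true in practice, so this only illustrates the size of the effect.&lt;/p&gt;

```python
def remaining_fraction(annual_leak, years):
    # Fraction of the starting value left after compounding a fixed
    # yearly loss (assumed constant, which is a simplification).
    return (1.0 - annual_leak) ** years

for annual_leak in (0.02, 0.03):
    for years in (20, 30):
        left = remaining_fraction(annual_leak, years)
        print(annual_leak, years, round(left, 2))
# 0.02 20 0.67
# 0.02 30 0.55
# 0.03 20 0.54
# 0.03 30 0.4
```

&lt;p&gt;In other words, under these assumptions a 3% yearly leak eats roughly half of your position over 20 years before oil has moved at all.&lt;/p&gt;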

&lt;h2&gt;ETFs: the ultimate bookie&lt;/h2&gt;
&lt;p&gt;
On a side note, I'd like to point out how awesome it must be for the ETF companies, like
&lt;a href="http://www.invescopowershares.com/"&gt;PowerShares&lt;/a&gt; (the guys who run OLO and SZO).
You're basically just a middle-man between someone betting &lt;i&gt;for&lt;/i&gt; the price of oil, and someone else betting &lt;i&gt;against&lt;/i&gt; the price of oil.
Effectively, the ETF company is just a &lt;a href="http://en.wikipedia.org/wiki/Bookmaker"&gt;bookie&lt;/a&gt;,
and doesn't care which way the price of oil goes.
There's very little risk, and they are happy to silently take 3% of your money each year.
&lt;/p&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/etfs-leak-money.html</guid><pubDate>Sat, 02 Apr 2011 00:00:00 GMT</pubDate></item><item><title>Most useful tools for diagnosing UNIX system performance.</title><link>http://dustwell.com/unix-system-performance-tools.html</link><description>

&lt;h2&gt;Summary:&lt;/h2&gt;
&lt;table border="1px" style="border-collapse: collapse;"&gt;
  &lt;tr&gt;&lt;th&gt;Command&lt;/th&gt;&lt;th&gt;Example&lt;/th&gt;&lt;th&gt;Install&lt;/th&gt;&lt;th&gt;What it does&lt;/th&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;top&lt;/td&gt;&lt;td&gt;top&lt;/td&gt;&lt;td&gt;(built-in)&lt;/td&gt;&lt;td&gt;Interactive overview of machine&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;vmstat&lt;/td&gt;&lt;td&gt;vmstat 2&lt;/td&gt;&lt;td&gt;(built-in)&lt;/td&gt;&lt;td&gt;Overview of memory, swap, cpu, disk&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;iostat&lt;/td&gt;&lt;td&gt;iostat -dmx 2&lt;/td&gt;&lt;td&gt;sudo apt-get install sysstat&lt;/td&gt;&lt;td&gt;Disk utilization, throughput&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;iftop&lt;/td&gt;&lt;td&gt;iftop&lt;/td&gt;&lt;td&gt;sudo apt-get install iftop&lt;/td&gt;&lt;td&gt;Interactive overview of network traffic&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;tcpflow&lt;/td&gt;&lt;td&gt;sudo tcpflow -i any -C -e port 80&lt;/td&gt;&lt;td&gt;sudo apt-get install tcpflow&lt;/td&gt;&lt;td&gt;Sniff live network traffic&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;lsof&lt;/td&gt;&lt;td&gt;sudo lsof -i TCP&lt;/td&gt;&lt;td&gt;(built-in)&lt;/td&gt;&lt;td&gt;Show processes with open files/sockets&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;

&lt;h2&gt;&lt;tt&gt;top&lt;/tt&gt; explained&lt;/h2&gt;
&lt;p&gt;
Ignore the "load". It just confuses people, and a 'bad' value depends on how many cores you have. Instead, just look at the CPU usage numbers ("user", "system", "nice", "idle", "iowait"), which are percentages averaged over all cpus.  As long as the "idle" value is above 0%, your system probably isn't overloaded (at least, not in a way that having more CPU would help).  The "iowait" value is confusing, so I would ignore it (if it's high, that means that a faster disk would improve throughput of your system, but if it's near-0, then either you don't have much io waiting, or that io wait time was used by some other cpu-busy process).
&lt;/p&gt;
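&lt;p&gt;If you're curious where those percentages come from: on Linux, the kernel exposes cumulative CPU tick counters in &lt;tt&gt;/proc/stat&lt;/tt&gt;, and the percentages are the change in each counter between two samples.  The sketch below assumes the standard field order of the aggregate "cpu" line; the helper names and the two sample lines are made up for illustration, not real measurements.&lt;/p&gt;

```python
# Field order of the aggregate "cpu" line in Linux's /proc/stat.
CPU_FIELDS = ("user", "nice", "system", "idle", "iowait",
              "irq", "softirq", "steal")

def parse_cpu_line(line):
    # "cpu 1000 0 500 ..." becomes {"user": 1000, "nice": 0, ...}
    values = [int(v) for v in line.split()[1:1 + len(CPU_FIELDS)]]
    return dict(zip(CPU_FIELDS, values))

def cpu_percentages(before, after):
    # The counters only ever grow, so usage over an interval is the
    # difference between two samples, normalized by the total ticks.
    deltas = {name: after[name] - before[name] for name in before}
    total = sum(deltas.values()) or 1
    return {name: 100.0 * delta / total for name, delta in deltas.items()}

# Two hypothetical samples taken a couple of seconds apart.
before = parse_cpu_line("cpu 1000 0 500 8000 500 0 0 0")
after = parse_cpu_line("cpu 1300 0 600 8500 600 0 0 0")

usage = cpu_percentages(before, after)
print(usage["user"], usage["system"], usage["idle"], usage["iowait"])
# 30.0 10.0 50.0 10.0
```

&lt;p&gt;On a real machine you would read the first line of &lt;tt&gt;/proc/stat&lt;/tt&gt; twice, a second or two apart, instead of using the hard-coded strings.&lt;/p&gt;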

&lt;p&gt;I usually hit 'M' to sort by memory usage -- the biggest processes are usually the most interesting ones.  The "virtual" memory size is misleading -- this is how much memory the process &lt;i&gt;would&lt;/i&gt; use if it touched all the memory it was given. The "resident (RES)" memory usage is the one that matters.  The "shared" column is mostly useless, so ignore it (for instance, it doesn't take into account the memory "shared" between forked processes whose pages haven't been copied-on-write yet).&lt;/p&gt;

&lt;h2&gt;&lt;tt&gt;vmstat&lt;/tt&gt; explained&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The runnable processes ("--procs-- &gt; r") give you a sense of how many processes are using (or want) CPU at the moment.&lt;/li&gt;
&lt;li&gt;The idle percent ("--cpu-- &gt; id") lets you know how often the CPU has nothing to do. If this is 0, then your system is CPU-limited at the moment.&lt;/li&gt;
&lt;li&gt;The swap amounts ("--swap-- &gt; si so") show how much is being "swapped in" and "swapped out". If these are constantly above 0, then your system is swapping to disk a lot, which is probably bad.&lt;/li&gt;
&lt;li&gt;The memory amounts ("--memory-- &gt; free buf cache") show how much memory is free, being used for buffers, or being used for file cache. If these numbers are low (less than about 1000 KB each), then your system probably doesn't have enough memory for what it wants.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;&lt;tt&gt;iostat&lt;/tt&gt; explained&lt;/h2&gt;
Ignore the first line of output (those are summary stats since bootup, which is rarely useful). Focus on the last column (utilization).  If it's near 100%, then your system has a disk bottleneck at the moment.

</description><author>Dustin Boswell</author><guid>http://dustwell.com/unix-system-performance-tools.html</guid><pubDate>Sun, 24 Mar 2013 00:00:00 GMT</pubDate></item><item><title>Installing djb-dns on a Linux machine.</title><link>http://dustwell.com/how-to-install-djbdns.html</link><description>

&lt;p&gt;
Below is a script you can use to install
&lt;a href="http://cr.yp.to/djbdns/tools.html"&gt;djb-dns&lt;/a&gt; on a Linux system (like Ubuntu).
&lt;/p&gt;

&lt;p&gt;
Specifically, it will install &lt;tt&gt;dnscache&lt;/tt&gt; (a &lt;i&gt;local caching nameserver&lt;/i&gt;) which
resolves any domain name into an IP address.
This is much like &lt;a href="http://code.google.com/speed/public-dns/"&gt;Google's public
8.8.8.8 DNS server&lt;/a&gt;.
&lt;/p&gt;

&lt;h2&gt;Background on DNS lookups&lt;/h2&gt;
&lt;p&gt;To be clear: &lt;tt&gt;dnscache&lt;/tt&gt; is &lt;b&gt;not&lt;/b&gt; an &lt;i&gt;"authoritative" dns server&lt;/i&gt;.
A &lt;i&gt;dns cache&lt;/i&gt; is simply a middle-man that executes global dns lookups on behalf of an incoming query,
and caches the result for subsequent queries. See this &lt;a href="http://cr.yp.to/djbdns/separation.html"&gt;clarification&lt;/a&gt;.
&lt;/p&gt;

&lt;p&gt;
When a program does a dns lookup (turning a domain name into an IP, or vice versa) it uses
a dns client library (e.g. calling the UNIX function
&lt;a href="http://beej.us/guide/bgnet/output/html/multipage/gethostbynameman.html"&gt;gethostbyname()&lt;/a&gt;) to
connect to a &lt;b&gt;("recursive") domain name server&lt;/b&gt;.  That server (typically hosted by your ISP)
does all the dirty work of first talking to the
&lt;a href="http://en.wikipedia.org/wiki/Root_nameserver"&gt;root-name-servers&lt;/a&gt; and going
down the tree of DNS lookups until the full domain name is completely resolved.
&lt;/p&gt;
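&lt;p&gt;From the program's side, all of that machinery hides behind a single call.  Here is a minimal sketch using Python's standard &lt;tt&gt;socket.gethostbyname()&lt;/tt&gt;, which goes through the system resolver; the wrapper name &lt;tt&gt;lookup()&lt;/tt&gt; is my own, and I use &lt;tt&gt;localhost&lt;/tt&gt; so the example doesn't depend on any particular name server.&lt;/p&gt;

```python
import socket

def lookup(hostname):
    # Ask the system resolver for an IPv4 address.  For names that are
    # not answered locally, the resolver forwards the query to the
    # name server(s) the system is configured to use.
    try:
        return socket.gethostbyname(hostname)
    except OSError:
        # Resolution failures are reported as socket.gaierror,
        # which is a subclass of OSError.
        return None

address = lookup("localhost")
print(address)
```

&lt;p&gt;Swap in any real domain name to see a lookup that actually goes out to your configured name server.&lt;/p&gt;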

&lt;p&gt;
The file &lt;tt&gt;/etc/resolv.conf&lt;/tt&gt; contains the IP address(es) of the domain name server(s) your system
is using.  It is a small file that typically looks something like:
&lt;pre class=code&gt;
nameserver a.b.c.d
nameserver e.f.g.h
&lt;/pre&gt;
&lt;/p&gt;
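&lt;p&gt;If you want to check from a script which name servers a machine is using, the file is easy to parse.  This is only a sketch: &lt;tt&gt;parse_nameservers()&lt;/tt&gt; is a hypothetical helper, the sample text is made up, and it only looks at &lt;tt&gt;nameserver&lt;/tt&gt; lines (skipping comments and other directives).&lt;/p&gt;

```python
def parse_nameservers(resolv_conf_text):
    # Collect the address from every "nameserver ADDRESS" line.
    servers = []
    for line in resolv_conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith(("#", ";")):
            continue                      # comment line
        fields = stripped.split()
        if fields[:1] == ["nameserver"] and len(fields) == 2:
            servers.append(fields[1])
    return servers

# Hypothetical file contents, for illustration only.
sample = """\
# generated by the network manager
nameserver 127.0.0.1
nameserver 8.8.8.8
search example.com
"""

print(parse_nameservers(sample))   # ['127.0.0.1', '8.8.8.8']
```

&lt;p&gt;On a real system you would pass in &lt;tt&gt;open("/etc/resolv.conf").read()&lt;/tt&gt; instead of the sample string.&lt;/p&gt;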


&lt;h2&gt;Why do I need to run my own dns cache?&lt;/h2&gt;
&lt;p&gt;
&lt;b&gt;The dns cache servers that your ISP is hosting typically aren't very good.&lt;/b&gt;
Those servers are overloaded,
not well maintained, etc...  If you are doing a high volume of dns-lookups they won't keep up.
For instance, you might be running a web crawler, or doing reverse-lookups on all the IP addresses that visit your site.
Your ISP's servers will introduce latency and flakiness. I've personally dealt with 3 ISPs whose servers
started returning errors because my volume was too high.
&lt;/p&gt;

&lt;p&gt;I've even run my own dns cache on my home Linux desktop because my home ISP's was so bad.
(Nowadays I just use &lt;a href="http://code.google.com/speed/public-dns/"&gt;8.8.8.8&lt;/a&gt; for my home networks.)
&lt;/p&gt;


&lt;h2&gt;What's so special about djb-dns?&lt;/h2&gt;
&lt;p&gt;
It's rock-solid.  It's written by this crazy-smart guy who knows his shit, and even has an
&lt;a href="http://cr.yp.to/djbdns/guarantee.html"&gt;unclaimed $1000 prize&lt;/a&gt; to find a security bug.
&lt;/p&gt;

&lt;p&gt;I've used it multiple times and haven't had any problems. The only downside is it's a pain-in-the-ass
to install. Thankfully, I've gone through the headache for you.&lt;/p&gt;


&lt;h2&gt;The Install Script&lt;/h2&gt;

&lt;pre class="code"&gt;
# Must be run as root
# Also see http://hydra.geht.net/tino/howto/linux/djbdns/

#Create a /package directory:
mkdir -p /package
chmod 1755 /package

cd /package
wget http://cr.yp.to/daemontools/daemontools-0.76.tar.gz
gunzip daemontools-0.76.tar.gz
tar -xpf daemontools-0.76.tar
rm daemontools-0.76.tar
cd admin/daemontools-0.76
# Apply dumb patch to make things compile
cd src; echo gcc -O2 -include /usr/include/errno.h &gt; conf-cc; cd ..
./package/install

cd /package
wget http://cr.yp.to/ucspi-tcp/ucspi-tcp-0.88.tar.gz
rm -rf ucspi-tcp-0.88
tar xfz ucspi-tcp-0.88.tar.gz
cd ucspi-tcp-0.88
# Apply dumb patch to make things compile
echo gcc -O2 -include /usr/include/errno.h &gt; conf-cc
make
make setup check

cd /package
wget http://cr.yp.to/djbdns/djbdns-1.05.tar.gz
gunzip djbdns-1.05.tar.gz
tar -xf djbdns-1.05.tar
cd djbdns-1.05
# Apply dumb patch to make things compile
echo gcc -O2 -include /usr/include/errno.h &gt; conf-cc
# Allow more simultaneous dns requests
sed -i -e "s/MAXUDP 200/MAXUDP 600/g" dnscache.c
make
make setup check

########## Install Users and Service directories ###########
groupadd dnscache
useradd -g dnscache dnscache
useradd -g dnscache dnslog
/usr/local/bin/dnscache-conf dnscache dnslog /var/dnscache
ln -s /var/dnscache /service

# Fix the nameservers to point to current ICANN structure 
# This assumes you have dig installed 
# Patch in the current list of root servers  
for a in a b c d e f g h i j k l m
do
  dig +short $a.root-servers.net.
done &gt; /var/dnscache/root/servers/\@

# Increase the cache to 100MB
echo 100000000 &gt; /service/dnscache/env/CACHESIZE
echo 104857600 &gt; /service/dnscache/env/DATALIMIT

# Change multilog to keep more logs
echo "#!/bin/sh" &gt; /service/dnscache/log/run
echo "exec setuidgid dnslog multilog t s10000000 ./main" &gt;&gt; /service/dnscache/log/run
&lt;/pre&gt;

Now all the tools and binaries are installed.  To verify that the tools were installed you can do:

&lt;pre class="code"&gt;
dnsip www.google.com
&lt;/pre&gt;

Now you just have to kick-off the dnscache server and update &lt;tt&gt;/etc/resolv.conf&lt;/tt&gt;.
You will want to run the following script at system startup (if you don't, the file
&lt;tt&gt;/etc/resolv.conf&lt;/tt&gt; might get over-written by your system):

&lt;pre class="code"&gt;
# Must be run as root
rm -rf /etc/resolv.conf.prev
mv /etc/resolv.conf /etc/resolv.conf.prev
echo "nameserver 127.0.0.1" &gt; /etc/resolv.conf

## init q  # (is this needed?)
/command/svscanboot &amp;
sleep 5
svc -u /service/dnscache   # FYI: -t sends TERM, which restarts the service
svstat /service/dnscache
svc -t /service/dnscache/log
&lt;/pre&gt;

Enjoy!
</description><author>Dustin Boswell</author><guid>http://dustwell.com/how-to-install-djbdns.html</guid><pubDate>Wed, 23 Feb 2011 00:00:00 GMT</pubDate></item><item><title>How to Fix the Sharp Edge on your MacBook Pro</title><link>http://dustwell.com/macbook-pro-sharp-edge.html</link><description>

&lt;h2&gt;Problem:&lt;/h2&gt;

&lt;span style="float: left; width: 300px"&gt;
The MacBook Pro has &lt;strong&gt;ridiculously sharp edges&lt;/strong&gt;.
After using it for an hour my wrists had annoying (and painful) indentation lines on them.  It doesn't bother everyone, but
&lt;a title="Apple Discussion Page on Sharp Edge"
   href="http://discussions.apple.com/thread.jspa?threadID=1861071&amp;amp;start=0&amp;amp;tstart=0" target="_blank"&gt;I'm not the only one&lt;/a&gt;.
&lt;/span&gt;

&lt;img title="MacBook Pro Wrists Slit" src="images/macbook-slit-wrists.jpg" alt="MacBook Pro Wrists Slit" style="float: left"/&gt;
&lt;img title="MacBook Pro" src="images/macbook-pro.jpg" alt="MacBook Pro" style="float: left"/&gt;
&lt;div style="clear: both;"&gt;&amp;nbsp;&lt;/div&gt;

&lt;h2&gt;Solution:&lt;/h2&gt;

&lt;span style="float: left; width: 300px"&gt;
It's pretty easy to shave that edge down with some tools.  It only took 10 minutes, and it looks as good as if it was manufactured that way. I wish I had done it sooner.
&lt;br&gt; &lt;br&gt;
I was a little hesitant to take a power tool to my new $1700 computer, but in the end it was really easy.  The worst case is probably that you'd just scratch the case a little.  But in my case it came out smooth and flawless.
&lt;br&gt; &lt;br&gt;
And I assume you know how to be safe with power tools - wearing goggles and all that ...  If you had a lot of time, and a few emery boards, I suppose you could do it without a Dremel, but I got impatient :)

&lt;/span&gt;

&lt;img title="Dremel" src="images/dremel.jpg" alt="Dremel" style="float: left"/&gt;
&lt;img title="Emery Board" src="images/emery-board.jpg" alt="Emery Board" style="float: left"/&gt;
&lt;div style="clear: both;"&gt;&amp;nbsp;&lt;/div&gt;

&lt;h2&gt;Step 1: Position your laptop&lt;/h2&gt;

&lt;span style="float: left; width: 300px"&gt;
Cover your open MacBook with a T-shirt to stop the rest of your laptop from getting any aluminum dust in it.  Only the edge facing you needs to be exposed.
&lt;br&gt;&lt;br&gt;
Then put your laptop on a table so that edge can hang off.  You probably want your computer turned off during this time.
&lt;/span&gt;

&lt;img title="MacBook in a T-shirt" src="images/macbook-in-tshirt.jpg" alt="MacBook in a T-shirt" style="float: left"/&gt;
&lt;div style="clear: both;"&gt;&amp;nbsp;&lt;/div&gt;

&lt;h2&gt;Step 2: Dremel&lt;/h2&gt;
&lt;p&gt;I used various Dremel tips (mostly the stone tips), and never found the perfect one.  I'm not sure it really matters though.  I turned on the drill on a medium speed and went across the whole length of the edge (with the bit angled so that it would "pull" you across the length of the edge).  I only shaved across the front edge - the side edges have the external connection slots, so it seemed more dangerous, plus my wrists aren't bothered by those edges.&lt;/p&gt;

&lt;p&gt;At first I was worried that I would shave away too much and cut into the motherboard, but it's not worth worrying about.  There's plenty of aluminum between you and the inside (judging by that notch where you open it, at least 1/8th of an inch, maybe more).  And you're only looking to shave away 1/32 of an inch or so (depending on your preferences).  So as long as you keep moving the drill back and forth across the edge, you'll get a uniform shave that only takes a little off.&lt;/p&gt;

&lt;p&gt;It's okay if there are grooves or it looks rough right now.  But be careful to only apply the drill to the aluminum right on the very edge - you don't want to leave scratches elsewhere.  Also, there will be a lot of aluminum dust, so you'll probably want to wipe that off with a wet napkin every once in a while.&lt;/p&gt;

&lt;h2&gt;Step 3: Polishing &lt;/h2&gt;
&lt;p&gt;Now it's time for the &lt;b&gt;emery board&lt;/b&gt;.  The finer-grit the better.  I don't think an ordinary nail file (or sandpaper for that matter) will do as nice a job.&lt;/p&gt;
&lt;p&gt;I moved the emery board across the whole length, back and forth, again and again, quickly scrubbing it down.  Tilt the board all the way up, and all the way down &amp;#8211; you want to get a nice rounded finished look.&lt;/p&gt;
&lt;p&gt;(Again, this step will generate lots of aluminum dust, so you'll need to stop every once in a while to clean it off with a wet napkin.)  You can't overdo this step, so be sure to spend at least 5 minutes with the emery board.&lt;/p&gt;

&lt;h2&gt;You're Done&lt;/h2&gt;
&lt;p&gt;If you did it right, you'll have a rounded edge that looks natural and is very smooth to the touch &amp;#8211; like it was made that way.&lt;/p&gt;

&lt;img title="MacBook Pro Edge After Emery Board" src="images/macbook-closeup-edge1.jpg" alt="MacBook Pro Edge After Emery Board" /&gt;
&lt;img title="MacBook Pro After Dremel" src="images/macbook-closeup-edge2.jpg" alt="MacBook Pro After Dremel" /&gt;
</description><author>Dustin Boswell</author><guid>http://dustwell.com/macbook-pro-sharp-edge.html</guid><pubDate>Tue, 01 Dec 2009 00:00:00 GMT</pubDate></item></channel></rss>