<?xml version="1.0" encoding="UTF-8"?>
<feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
    <title>Project Notes</title>
    <id>tag:teichman.org,2007-10-08:/blog//1</id>
    <link rel="alternate" type="text/html" href="http://teichman.org/blog/" />
    
    <updated>2012-05-05T21:49:33-04:00</updated>

    
    <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/teichman/blog" /><feedburner:info uri="teichman/blog" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
        <title>Cobe 2.0: Or, it is what it is what it is what it is what it is what it is what it is.</title>
        <id>tag:teichman.org,2011-09-19:/2011/09/cobe-2.0.html</id>

        <published>2011-09-19T08:10:02-04:00</published>

        
          
        

        <updated>2011-09-19T08:10:02-04:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/G5m9CJfPJH8/cobe-2.0.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/4202629105/"&gt;&lt;img class="pull-right" width="240" height="240" src="http://farm5.staticflickr.com/4006/4202629105_26c635c101_m.jpg" alt="Step 3: Braze a luglet to a tiny head tube." /&gt;
  &lt;p style="max-width:240px"&gt;Step 3: Braze a luglet to a tiny head tube.&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;I&amp;rsquo;m releasing Cobe 2.0, with a major conceptual change in its data
model and a more flexible scoring system. To explain the changes,
let&amp;rsquo;s get a little deeper into Cobe 1.x.&lt;/p&gt;

&lt;h3&gt;Cobe 1.x&lt;/h3&gt;

&lt;p&gt;When Cobe 1.x learns an input phrase, say &amp;ldquo;this is a test&amp;rdquo;, it does
the following:&lt;/p&gt;

&lt;p&gt;First, it splits the phrase into tokens. Each series of alphanumeric
characters becomes its own token, and each series of characters
between those becomes another.&lt;/p&gt;

&lt;p&gt;So &amp;ldquo;&lt;em&gt;this is a test&lt;/em&gt;&amp;rdquo; becomes seven tokens:&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="272" height="69" src="http://teichman.org/blog/static/2011/tokens.png" alt="seven tokens" /&gt;&lt;/p&gt;

&lt;p&gt;Cobe 1.x groups these tokens by the Markov Order &lt;em&gt;m&lt;/em&gt; (default five)
and adds special END tokens that signal the beginning and end of a
phrase:&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="232" height="245" src="http://teichman.org/blog/static/2011/groups.png" alt="grouped tokens" /&gt;&lt;/p&gt;

&lt;p&gt;And then each group is annotated with the next token that follows.&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="337" height="211" src="http://teichman.org/blog/static/2011/groups-next.png" alt="grouped tokens, with next" /&gt;&lt;/p&gt;

&lt;p&gt;And the previous token:&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="337" height="211" src="http://teichman.org/blog/static/2011/groups-prev.png" alt="grouped tokens, with previous" /&gt;&lt;/p&gt;

&lt;p&gt;As more phrases are learned, Cobe keeps track of all the words that
have ever followed these contexts. After a while, one might look like:&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="397" height="155" src="http://teichman.org/blog/static/2011/groups-more.png" alt="groups with edge probabilities" /&gt;&lt;/p&gt;

&lt;p&gt;Here you can also see each link annotated with its probability in the
learned text. For phrases beginning &amp;ldquo;this is &amp;rdquo;, &amp;ldquo;this is a&amp;rdquo; occurred
75% of the time, &amp;ldquo;this is another&amp;rdquo; occurred 17% of the time, and &amp;ldquo;this
is third&amp;rdquo; occurred 8%.&lt;/p&gt;

&lt;p&gt;When generating a reply, cobe 1.x picks a random next word, shifts it
into the context, and looks for the words that follow that new
context.&lt;/p&gt;

&lt;p&gt;In functional terms, this works exceedingly well. But there&amp;rsquo;s no
denying that it&amp;rsquo;s complex. Each context is clearly related to all of
the others, but they&amp;rsquo;re not explicitly connected to each other&amp;mdash;the
connections are to the next word.&lt;/p&gt;

&lt;p&gt;Maintaining the forward and backward chains independently means it&amp;rsquo;s
difficult to visualize the brain as a whole, and since spaces are
separate tokens, half of the useful context is taken up tracking
whitespace.&lt;/p&gt;

&lt;h3&gt;Cobe 2.0&lt;/h3&gt;

&lt;p&gt;Cobe 2.0 employs two major changes to brain storage that reduce
complexity, ease visualization, enhance processing speed, and leave us
able to use a world of existing general-purpose algorithms.&lt;/p&gt;

&lt;p&gt;My first change was to connect contexts directly to each other. Since
each is already stored in the database, this is just a matter of
creating a table that links the previous context to its next context.&lt;/p&gt;

&lt;p&gt;Astute readers will note that this creates a &lt;em&gt;graph&lt;/em&gt;. Each context is
a graph node, and the graph&amp;rsquo;s edges are the links between them. Using
a directed graph, Cobe ends up learning &amp;ldquo;this is a test&amp;rdquo; like this:&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="232" height="325" src="http://teichman.org/blog/static/2011/groups-digraph.png" alt="simple graph" /&gt;&lt;/p&gt;

&lt;p&gt;Take note: since the edges of the graph are directed, this same data
structure can be used to walk either forward or backward. No more
training two chains!&lt;/p&gt;

&lt;p&gt;At this point, we&amp;rsquo;re still looking at the equivalent of a 5th order
Markov chain. I wanted to reduce the Markov order by eliminating the
whitespace that&amp;rsquo;s being stored in the contexts.&lt;/p&gt;

&lt;p&gt;The other Markov-based chatbots have taken different strategies
here. They may group spaces and punctuation together as the tokens
that separate words, like Cobe 1.x. They may tokenize punctuation and
spaces separately, but still consider spaces as normal tokens. They
may attach the spaces as leading or trailing properties of words and
reconstruct the spaces when joining the words back together.&lt;/p&gt;

&lt;p&gt;Cobe 2.0 takes a different strategy. It eliminates all spaces from the
nodes and makes them a property of the graph &lt;em&gt;edges&lt;/em&gt;. Combined with
the graph layout, this is a major breakthrough. It keeps the database
size small and the nodes much more dense with information.&lt;/p&gt;

&lt;p&gt;Below is a dump of a 2nd order Cobe 2.0 brain that has been trained
with four phrases:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;this is a test&lt;/li&gt;
&lt;li&gt;this is another test&lt;/li&gt;
&lt;li&gt;another test would help&lt;/li&gt;
&lt;li&gt;hal hal hal&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;The empty node at the top is a special node which signals the end of
the graph. It serves the same purpose as the special END tokens in
Cobe 1.x. Open arrowheads show nodes that were learned with spaces
between them.&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="455" height="397" src="http://teichman.org/blog/static/2011/brain.png" alt="a full brain graph" /&gt;&lt;/p&gt;

&lt;p&gt;Looking at this, you can easily see how a reply might be
generated. Cobe follows a random arrow from every node, building a
list of reply words as it goes.&lt;/p&gt;

&lt;p&gt;When generating a reply with 2.0, all of the database queries are on
the edges table alone. It doesn&amp;rsquo;t even matter what Markov order the
contexts are anymore. Indeed, different contexts could have &lt;em&gt;different
numbers of tokens&lt;/em&gt;. &amp;ldquo;Raining cats and dogs&amp;rdquo; could be identified as a
single concept (or
&lt;a href="http://en.wikipedia.org/wiki/Collocation"&gt;collocation&lt;/a&gt;), and it could
be treated as a single word for replies.&lt;/p&gt;

&lt;p&gt;If you follow the graph above, it&amp;rsquo;s clear that you could get a reply
like &amp;ldquo;this is another test would help&amp;rdquo; or &amp;ldquo;hal hal hal hal hal hal hal
hal hal hal hal&amp;rdquo;.&lt;/p&gt;

&lt;p&gt;Being able to say quantitatively that the first is a &lt;em&gt;bad&lt;/em&gt; reply and
the second is a &lt;em&gt;great&lt;/em&gt; reply is a question for reply scoring, and I&amp;rsquo;ve
made the groundwork for some changes there too.&lt;/p&gt;

&lt;h3&gt;Scoring changes&lt;/h3&gt;

&lt;p&gt;Cobe (both old and new) will generate as many replies as possible
during half a second. These are evaluated by a scoring routine, which
determines which of the candidates is best. At the end of the half
second, the best candidate so far becomes the result.&lt;/p&gt;

&lt;p&gt;Cobe 1.x used a scorer based on MegaHAL&amp;rsquo;s evaluate_reply(), which adds
together the
&lt;a href="http://en.wikipedia.org/wiki/Self-information"&gt;self-information&lt;/a&gt; of
all reply tokens that also appeared in the input. This means that more
surprising replies score higher.&lt;/p&gt;

&lt;p&gt;MegaHAL&amp;rsquo;s scorer then applies a couple of dividers to discourage
overly long replies.&lt;/p&gt;

&lt;p&gt;Recently I&amp;rsquo;ve been looking more closely at the candidate replies. Many
of them are subjectively better than the result, and I&amp;rsquo;d like to
explore scoring for the next few releases.&lt;/p&gt;

&lt;p&gt;For Cobe 2.x, I have introduced reply scoring based on several
signals. The information content of a reply is one good metric, but
there are a lot of other things that might contribute to a good
reply. Some examples:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;discouraging overly lengthy replies (as exhibited by MegaHAL&amp;rsquo;s scorer)&lt;/li&gt;
&lt;li&gt;discouraging replies that are too similar to the input&lt;/li&gt;
&lt;li&gt;discouraging replies that are too similar to the last N responses&lt;/li&gt;
&lt;li&gt;encouraging excited or shouting replies&lt;/li&gt;
&lt;li&gt;encouraging &lt;a href="http://www.cs.washington.edu/homes/brun/pubs/pubs/Kiddon11.pdf"&gt;double entendre&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;encourage (discourage?) replies that loop back on themselves&lt;/li&gt;
&lt;li&gt;preserve word &lt;a href="http://en.wikipedia.org/wiki/Word-sense_disambiguation"&gt;sense&lt;/a&gt; as found in the input&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;There&amp;rsquo;s a lot of room for playing around with scoring.&lt;/p&gt;

&lt;h3&gt;Performance&lt;/h3&gt;

&lt;p&gt;Cobe 2 is quite a bit faster than 1.x, and a Cobe 2 brain is roughly
half the size of the same 1.x brain.&lt;/p&gt;

&lt;p&gt;Learning the &lt;a href="http://en.wikipedia.org/wiki/Brown_Corpus"&gt;Brown
Corpus&lt;/a&gt;, 500 samples of
English language text, Cobe 2.0 is about five times faster than
1.2. I adjusted the Cobe 1.2 code to use the same database
parameters as 2.0, to get a more accurate comparison of the
core learning algorithm:&lt;/p&gt;

&lt;pre class="highlight"&gt;
Cobe 1.2 (adjusted) - 224M brain file
real 18m25.908s
user 5m6.115s
sys  0m36.146s
&lt;/pre&gt;




&lt;pre class="highlight"&gt;
Cobe 2.0 - 102M brain file
real 3m24.447s
user 2m35.319s
sys  0m19.854s
&lt;/pre&gt;


&lt;p&gt;Reply speed is difficult to measure, as it&amp;rsquo;s dependent on random
factors, but I&amp;rsquo;m seeing at least double the number of candidate
replies in the new code.&lt;/p&gt;

&lt;p&gt;My largest brain in production has learned about 230,000 phrases, for
1.8 million edges between 1.4 million nodes. This is better than Cobe
1.x could manage, but there&amp;rsquo;s still room for improvement. Learning one
million phrases is my next scaling goal.&lt;/p&gt;

&lt;h3&gt;Next steps&lt;/h3&gt;

&lt;p&gt;Where to go from here?&lt;/p&gt;

&lt;p&gt;First, explore new scorers. There are a few listed above, plus general
purpose ideas like a Bayesian classifier scorer. That would allow you
to train Cobe based on a list of good and bad replies.&lt;/p&gt;

&lt;p&gt;The scorers are sometimes at odds with each other&amp;mdash;you might want to
encourage reply length in one case, brevity in another. It might make
sense to build reply profiles out of a set of weighted scorers and
score each set of candidate replies based on a randomly chosen
profile. (e.g. &amp;ldquo;SHORT AND SHOUTY&amp;rdquo; or &amp;ldquo;long and repeating and repeating
and repeating and repeating and repeating&amp;rdquo;)&lt;/p&gt;

&lt;p&gt;Second, continue improving the speed of candidate replies. Assuming
the scorers are going to select the best response, Cobe needs to make
candidates as quickly as possible.&lt;/p&gt;

&lt;p&gt;I have two strategies planned for faster replies. Currently, Cobe
performs a random walk from scratch for every candidate. This could be
made a lot faster by turning the walk into a generator, searching the
tree for the END token and yielding a reply every time it&amp;rsquo;s found.&lt;/p&gt;

&lt;p&gt;The second strategy is to start generating replies in parallel. Cobe
is easily distributed across a large number of computers.&lt;/p&gt;

&lt;p&gt;Have fun out there.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/G5m9CJfPJH8" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2011/09/cobe-2.0.html</feedburner:origLink></entry>
    
    <entry>
        <title>Cobe learns English: Or, a different sort of singularity.</title>
        <id>tag:teichman.org,2011-05-26:/2011/05/singularity.html</id>

        <published>2011-05-26T22:36:41-04:00</published>

        
          
        

        <updated>2011-05-26T22:36:41-04:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/rbhXDb0CM04/singularity.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/5093326068/"&gt;&lt;img class="pull-right" width="240" height="240" src="http://farm5.staticflickr.com/4131/5093326068_b406e541b8_m.jpg" alt="8:47p" /&gt;
  &lt;p style="max-width:240px"&gt;8:47p&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;A fresh &lt;a href="/blog/2011/02/cobe.html"&gt;cobe&lt;/a&gt; brain starts
completely empty. Eventually it will seem to know about vocabulary and
sentence structure, but that&amp;rsquo;s all part of the essential excellence of
the halgorithm.&lt;/p&gt;

&lt;p&gt;This means you can interact with a fresh brain in any language. As
long as it can split the input into words, it will generate sentences
that mirror the structure of its input. I&amp;rsquo;m sure this isn&amp;rsquo;t a
generalization that works for every written language, but it does a
decent job in a lot of them.&lt;/p&gt;

&lt;p&gt;In fact, cobe does somewhat better than MegaHAL, which only supported
8-bit ASCII. Cobe uses Python&amp;rsquo;s Unicode support and character
tables to detect words, and in practice this works well.&lt;/p&gt;

&lt;p&gt;This is one of its key features &amp;mdash; access to surreal robot chatter is
a basic human right &amp;mdash; and I have been reluctant to introduce any
features that require knowledge of any specific language.&lt;/p&gt;

&lt;p&gt;However, I&amp;rsquo;m not above making a few tweaks here and there.&lt;/p&gt;

&lt;p&gt;A little review: for a single generated reply, cobe:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Identifies all of the words in its input.&lt;/li&gt;
&lt;li&gt;Produces a set of possible reply words by removing any words it
hasn&amp;rsquo;t seen before.&lt;/li&gt;
&lt;li&gt;Picks one of the known words at random.&lt;/li&gt;
&lt;li&gt;Selects a Markov context containing that word.&lt;/li&gt;
&lt;li&gt;Performs a random walk on the two Markov chains from that context,
keeping track of each new word along the way.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;There&amp;rsquo;s an edge case where the input contains no known words, but in
general this guarantees that one word in the input will be found in
the generated reply.&lt;/p&gt;

&lt;p&gt;This works, but it would be nice if cobe responded more naturally. I
want it to reply to the concepts in the input, but not necessarily
using the exact same words.&lt;/p&gt;

&lt;p&gt;In cobe 1.2.0, I&amp;rsquo;m introducing an optional feature that gives more
natural responses at the expense of some of its language
independence.&lt;/p&gt;

&lt;h3&gt;Conflation&lt;/h3&gt;

&lt;p&gt;Take a look at &lt;em&gt;2.&lt;/em&gt; above. Until now, this has been a stage where cobe
restricts the input words to the ones it has already seen. But if it
expands that set to include similar but related words, we&amp;rsquo;ll get
replies that can be about the same things without using the exact
words.&lt;/p&gt;

&lt;p&gt;The clever bit is that this conflation can choose any words, &lt;em&gt;as long
as they have already been learned&lt;/em&gt;. Since cobe follows the Markov
chains using the usual random walk, the replies will only ever include
phrases that came from the learned text.&lt;/p&gt;

&lt;p&gt;There are a few ways to generate related words. I&amp;rsquo;ve chosen a strategy
called &lt;a href="http://en.wikipedia.org/wiki/Stemming"&gt;stemming&lt;/a&gt;, which has
gotten a lot of use in text-based search.&lt;/p&gt;

&lt;p&gt;Stemming reduces words to their base form. See this excerpt from the
Wikipedia page:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;A stemmer for English, for example, should identify the string
&amp;ldquo;cats&amp;rdquo; (and possibly &amp;ldquo;catlike&amp;rdquo;, &amp;ldquo;catty&amp;rdquo; etc.) as based on the root
&amp;ldquo;cat&amp;rdquo;, and &amp;ldquo;stemmer&amp;rdquo;, &amp;ldquo;stemming&amp;rdquo;, &amp;ldquo;stemmed&amp;rdquo; as based on &amp;ldquo;stem&amp;rdquo;. A
stemming algorithm reduces the words &amp;ldquo;fishing&amp;rdquo;, &amp;ldquo;fished&amp;rdquo;, &amp;ldquo;fish&amp;rdquo;,
and &amp;ldquo;fisher&amp;rdquo; to the root word, &amp;ldquo;fish&amp;rdquo;.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Cobe will take the set of words in &lt;em&gt;2.&lt;/em&gt; and add any words that &lt;em&gt;have
the same stems&lt;/em&gt;. Any words that have the same stems are considered
synonyms.&lt;/p&gt;

&lt;p&gt;An example: I&amp;rsquo;ll use the input text &amp;ldquo;This is a test of stemming.&amp;rdquo; on a
cobe brain with some experience.&lt;/p&gt;

&lt;p&gt;Without a stemmer, this yields these possible words:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;This a is of stemming test&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;And with an English stemmer:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;A IS Is OF Of Stem TEST THIS Test Tested This a is of ofs stem
stemmed stemming stems test tested testes testing tests this&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;After conflation, there are many more options. Every word is one cobe
has already learned, so they&amp;rsquo;re all in Markov contexts that can be
followed to generate replies.&lt;/p&gt;

&lt;p&gt;You&amp;rsquo;ll note that some of these change between upper and lower
case. I&amp;rsquo;ve chosen to store all the stems in lower case, so conflation
can hop between cases. In theory, this can improve replies. In
practice, it can be &lt;em&gt;awesome&lt;/em&gt;.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;08:27 -!- cobe has quit [Ping timeout]
09:03 &amp;lt;steve&amp;gt; nothing like a power outage
09:04 -!- cobe has joined #zzz
09:04 &amp;lt;steve&amp;gt; cobe: welcome back
09:04 &amp;lt;cobe&amp;gt; steve: KILLING ME WON'T BRING BACK YOUR GODDAMNED HONEY
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here cobe has taken the input &amp;ldquo;welcome back&amp;rdquo;, conflated that to &amp;ldquo;BACK
Back Backs WELCOME Welcome back backed backing backs welcome welcomed
welcoming&amp;rdquo;, chosen BACK (randomly), and generated a reply confusing a
normal power outage with malicious intent.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;ve chosen to use the
&lt;a href="http://pypi.python.org/pypi/PyStemmer/"&gt;PyStemmer&lt;/a&gt; package, a port of
the Snowball stemmer to Python, because it implements stemmers for a
several languages: Danish, Dutch, English, Finnish, French, German,
Hungarian, Italian, Norwegian, Portuguese, Romanian, Russian, Spanish,
Swedish, and Turkish.&lt;/p&gt;

&lt;p&gt;To configure a stemmer on an existing brain, close all clients that
might be accessing its database and run:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ cobe set-stemmer &amp;lt;language&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Where &amp;lt;language&amp;gt; is any of the languages listed above. Use lower case,
since that&amp;rsquo;s what PyStemmer expects.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;$ cobe set-stemmer english
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Adding a stemmer has no effect on the text structure cobe has learned,
so it doesn&amp;rsquo;t require any relearning. The set-stemmer command is
idempotent, so you can run it more than once without affecting the
list of stems.&lt;/p&gt;

&lt;p&gt;Next steps might include using latent semantic analysis to generate
lists of related words independent of language.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/rbhXDb0CM04" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2011/05/singularity.html</feedburner:origLink></entry>
    
    <entry>
        <title>Hammond organ simulation: Or, roto.</title>
        <id>tag:teichman.org,2011-05-22:/2011/05/roto.html</id>

        <published>2011-05-22T21:06:04-04:00</published>

        
          
        

        <updated>2011-05-22T21:06:04-04:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/KEW1Zpc4Coo/roto.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/3853795073/"&gt;&lt;img class="pull-right" width="240" height="240" src="http://farm3.staticflickr.com/2661/3853795073_2a8b0cba8e_m.jpg" alt="Teatime 2" /&gt;
  &lt;p style="max-width:240px"&gt;Teatime 2&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;If you&amp;rsquo;re like most people, you want to know how to simulate a Hammond
B-3 organ using fixed-point arithmetic on an 8-bit microcontroller.&lt;/p&gt;

&lt;p&gt;First, you&amp;rsquo;re going to need tone generators. In a Hammond, this is a
set of 91 tonewheels. Each is positioned adjacent to an
electromagnetic pickup, inducing a sinusoidal current at a specific
frequency. On a microcontroller, we can simulate the tonewheel
assembly with a set of 91 sine oscillators. Take a look at
&lt;a href="/blog/2011/02/new-noises-from-midivox.html"&gt;bleep bloop&lt;/a&gt;
for direct digital synthesis, one method of sine wave
generation. We&amp;rsquo;ll be using it below.&lt;/p&gt;

&lt;p&gt;A B-3 has two keyboards, called &lt;em&gt;manuals&lt;/em&gt; on an organ. When you press
a key, nine of these oscillators connect to the audio output. Each
corresponds to a note that&amp;rsquo;s harmonically related to the key you
pressed (including that key&amp;rsquo;s note, the &lt;em&gt;fundamental&lt;/em&gt;). They&amp;rsquo;re played
at volumes set by nine drawbars.&lt;/p&gt;

&lt;p&gt;&lt;table&gt;
  &lt;tr&gt;&lt;th&gt;Drawbar&lt;/th&gt;&lt;th&gt;Harmonic&lt;/th&gt;&lt;th&gt;Interval&lt;/th&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;1&lt;/td&gt;&lt;td&gt;Sub-Fundamental&lt;/td&gt;&lt;td&gt;Sub-Octave&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;2&lt;/td&gt;&lt;td&gt;Sub-Third&lt;/td&gt;&lt;td&gt;Sub-Third&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;3&lt;/td&gt;&lt;td&gt;Fundamental&lt;/td&gt;&lt;td&gt;Unison&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;4&lt;/td&gt;&lt;td&gt;2nd&lt;/td&gt;&lt;td&gt;Octave&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;5&lt;/td&gt;&lt;td&gt;3rd&lt;/td&gt;&lt;td&gt;12th&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;6&lt;/td&gt;&lt;td&gt;4th&lt;/td&gt;&lt;td&gt;15th&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;7&lt;/td&gt;&lt;td&gt;5th&lt;/td&gt;&lt;td&gt;17th&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;8&lt;/td&gt;&lt;td&gt;6th&lt;/td&gt;&lt;td&gt;19th&lt;/td&gt;&lt;/tr&gt;
  &lt;tr&gt;&lt;td&gt;9&lt;/td&gt;&lt;td&gt;8th&lt;/td&gt;&lt;td&gt;22nd&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;&lt;/p&gt;


&lt;p&gt;So to simulate a Hammond organ, all you need is 91 sine wave
oscillators, harmonic ratios to calculate which oscillators apply to
each key, and a mixer that can add oscillator outputs together based
on the drawbar settings.&lt;/p&gt;

&lt;h3&gt;It&amp;rsquo;s that simple!&lt;/h3&gt;

&lt;p&gt;It&amp;rsquo;s not that simple.&lt;/p&gt;

&lt;p&gt;First, you need to account for &lt;em&gt;harmonic foldback&lt;/em&gt;. There aren&amp;rsquo;t
enough tonewheels to play the nine harmonics of every key, and on
either end of the manual some tonewheels sound on multiple
drawbars. This is no big deal, you just need to handle it in your
key-&gt;tonewheels mapping.&lt;/p&gt;

&lt;p&gt;Second, there&amp;rsquo;s a phenomenon called &lt;em&gt;key click&lt;/em&gt;. On a Hammond, this is
a discontinuity in organ output as tonewheels switch in and out of the
mix. We get something very similar if we aren&amp;rsquo;t careful about
discontinuities in our DAC output. Great! The lazy programmer wins.&lt;/p&gt;

&lt;p&gt;Third, a few of the tonewheels on a Hammond don&amp;rsquo;t make sine waves &amp;mdash;
the lowest 12 make more of a square wave tone. Luckily for us, it is
even easier to generate square waves. It&amp;rsquo;s the same frequency
calculation, but we can use the highest bit of our accumulator instead
of the sine lookup table. These tonewheels are only active on the
Hammond&amp;rsquo;s bass pedal board, so it&amp;rsquo;s not a big deal if you&amp;rsquo;re not
planning to emulate that.&lt;/p&gt;

&lt;p&gt;Fourth, there&amp;rsquo;s a feature on the Hammond called &lt;em&gt;percussion&lt;/em&gt;. This
disables one of the drawbars (the 2nd or 3rd harmonic) and plays the
associated tonewheel briefly at the beginning of a note. In modern
synth terms, this is a single triggered effect &amp;mdash; it is only activated
when you depress the first key from an all-keys-up state. For
percussion, play that tonewheel at slightly increased volume with a
quick decay to nothing.&lt;/p&gt;

&lt;p&gt;Fifth, many Hammond organs have two built-in effects: chorus and
vibrato. Each is available at three intensities, though always at a
fixed rate of 7 Hz. We may get into this in the future.&lt;/p&gt;

&lt;p&gt;Sixth, you have &lt;em&gt;harmonic leakage&lt;/em&gt;. The tonewheels on a Hammond are
organized in pairs, and each wheel can induce a small current in its
twin. They&amp;rsquo;re organized so each pair has two tonewheels exactly four
octaves apart, so the harmonic impact of leakage is low, but it&amp;rsquo;s an
important part of the Hammond sound.&lt;/p&gt;

&lt;p&gt;Seventh, the tonewheels don&amp;rsquo;t output at the same level. High frequency
tones sound louder to us than low frequencies, and the Hammond
attempts to keep perceived volumes the same by attenuating the high
frequencies more.&lt;/p&gt;

&lt;p&gt;Eighth and most critical is the Leslie rotating speaker cabinet. A
rotating speaker creates a very complex effect, with a sinusoidal
volume shift (tremolo) trailing a sinusoidal pitch shift
(vibrato). Meanwhile, the rotating speaker is throwing the sound all
over the speaker cabinet and the room, creating a reverb effect that
changes character at the rotation rate.&lt;/p&gt;

&lt;p&gt;That&amp;rsquo;s one rotating speaker. A Leslie cabinet has two, a rotating
speaker for treble and a rotating baffle around a bass speaker. They
run at different speeds. Also, you can run the cabinet in either
&lt;em&gt;fast&lt;/em&gt; or &lt;em&gt;slow&lt;/em&gt; mode. With the two speakers adjusting to mode changes
at different rates of acceleration. Did I mention the effect of the
room it&amp;rsquo;s in?&lt;/p&gt;

&lt;h3&gt;I digress&lt;/h3&gt;

&lt;p&gt;I think that simulating a Leslie on a microcontroller is possible. One
speaker at one speed doesn&amp;rsquo;t seem so tough &amp;mdash; I think you can adjust
the main timer interrupt on a sine wave to effect the vibrato. The
tremolo is a volume adjustment 90 degrees out of phase. Use a 90
degree offset into your sine table or use a recursive sine/cosine
generator to calculate them at the same time.&lt;/p&gt;

&lt;p&gt;Adding in the Leslie&amp;rsquo;s second speaker makes things more
difficult. Since they run independently from each other, you can&amp;rsquo;t
just tweak the timer interrupt to get vibrato. At least we already
know the exact frequency of each tonewheel, so separating the bass and
treble frequencies is just a matter of handling them in two groups.&lt;/p&gt;

&lt;p&gt;You&amp;rsquo;ll need phase-adjustable all-pass filters for the two vibrato
effects, and a model of speaker acceleration when you&amp;rsquo;re switching
speeds. Full Leslie simulation is probably not a task for the same
microcontroller that&amp;rsquo;s simulating the organ, but my gut feeling is
that one might manage a two-speaker simulation alone.&lt;/p&gt;

&lt;h3&gt;Roto&lt;/h3&gt;

&lt;p&gt;Time to get down to
business. &lt;a href="https://github.com/pteichman/roto"&gt;roto&lt;/a&gt; is a project to
simulate one manual of a Hammond B-3. It runs on an Arduino board with
the MidiVox shield.&lt;/p&gt;

&lt;p&gt;The Arduino uses an ATmega328 as its microcontroller, running at
16MHz. This is only enough power to run and mix about a third of the
available tonewheels. Depending on your drawbar settings, it can play
anywhere from three to 30 keys at the same time.&lt;/p&gt;

&lt;p&gt;Here are a few highlights of the development process:&lt;/p&gt;

&lt;h4&gt;Choosing a sample rate&lt;/h4&gt;

&lt;p&gt;I knew from &lt;a href="http://www.goodeveca.net/RotorOrgan/ToneWheelSpec.html"&gt;tonewheel tables&lt;/a&gt;
that I would need to produce tones up to about 6000 Hz. Using this as
my &lt;a href="http://en.wikipedia.org/wiki/Nyquist_frequency"&gt;Nyquist Frequency&lt;/a&gt;
left me needing to use a sample clock of at least 12000 Hz. I already
had the code from
&lt;a href="/blog/2011/02/new-noises-from-midivox.html"&gt;bleep bloop&lt;/a&gt;
that used a 15625 Hz sample clock, so I reused that. Tweaking that
clock for a slightly longer loop is one place I could gain a few
cycles.&lt;/p&gt;

&lt;p&gt;Including the time it takes to update the DAC, the sample interrupt
handler is running about 9% of the time:&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="941" height="253" src="/blog/static/2011/roto-tick.png" /&gt;&lt;/p&gt;

&lt;p&gt;This leaves a 58 &amp;micro;s window to advance oscillators, mix them
together according to the drawbars, and produce a sample.&lt;/p&gt;

&lt;p&gt;Let&amp;rsquo;s see how long it takes to update the 91 oscillators, to know how
much time remains for handling MIDI changes and mixing outputs
together.&lt;/p&gt;

&lt;p&gt;&lt;img class="inline-image" width="946" height="249" src="/blog/static/2011/roto-accumulate.png" /&gt;&lt;/p&gt;

&lt;p&gt;Oh.&lt;/p&gt;

&lt;p&gt;This is advancing the oscillators, not figuring out which are active
or scaling them by the drawbar volumes. Advancing 91 oscillators takes
130 &amp;micro;s, just over twice as long as our time slice. I&amp;rsquo;m going
to have to get a little more clever.&lt;/p&gt;

&lt;p&gt;Roto does a couple of things to work around the problem. Clearly the
most important task is to have the next sample ready by the time its
interrupt fires. If that doesn&amp;rsquo;t happen, you can&amp;rsquo;t update the DAC in
time and you&amp;rsquo;ll have unintended frequency shifts.&lt;/p&gt;

&lt;p&gt;In fact, having the next sample available is even more important than
responding to the user. I don&amp;rsquo;t have to handle MIDI changes on every
interrupt tick. I could handle them every 512 ticks and still have
only 30 ms latency. That&amp;rsquo;s still instantaneous, as far as our brains
are concerned. Roto keeps two 256-sample buffers and fills each after
it is emptied, keeping the latency for MIDI changes around 15 ms.&lt;/p&gt;

&lt;p&gt;Being able to keep the same list of &amp;ldquo;on&amp;rdquo; oscillators for multiple
samples is a huge benefit. I can calculate each oscillator&amp;rsquo;s
contribution to all the samples in the buffer in one shot, keeping the
oscillator&amp;rsquo;s 16-bit position in two registers. That only needs to be
saved to memory after calculating the full buffer.&lt;/p&gt;

&lt;p&gt;I haven&amp;rsquo;t mentioned the impact of memory access on these calculations
&amp;mdash; a load or save of an oscillator&amp;rsquo;s 16-bit position is four cycles. A
single oscillator update is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;load 16-bit oscillator position (4 cycles)&lt;/li&gt;
&lt;li&gt;load 16-bit oscillator rate (4 cycles)&lt;/li&gt;
&lt;li&gt;add them together (2 cycles)&lt;/li&gt;
&lt;li&gt;shift the oscillator position to get an 8-bit index (1 cycle)&lt;/li&gt;
&lt;li&gt;look up that value in the sine table (2 cycles)&lt;/li&gt;
&lt;li&gt;add that value to the current 16-bit sample (2 cycles)&lt;/li&gt;
&lt;li&gt;save 16-bit oscillator position (4 cycles)&lt;/li&gt;
&lt;li&gt;save 16-bit current sample (4 cycles)&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;23 cycles, or 1.4 &amp;micro;s at 16 MHz.&lt;/p&gt;

&lt;p&gt;Calculating the oscillator&amp;rsquo;s impact on the buffer at once, I avoid the
four cycles in &lt;em&gt;1.&lt;/em&gt;, &lt;em&gt;2.&lt;/em&gt; and &lt;em&gt;7.&lt;/em&gt; for all the mid-buffer samples. If
I calculate 256 samples at a time, this is a savings of 3060 cycles
every time the buffer fills. 52% of the total, worth saving for sure.&lt;/p&gt;

&lt;h4&gt;Volume&lt;/h4&gt;

&lt;p&gt;So far I&amp;rsquo;ve neglected the impact of the drawbar settings. On a
Hammond, each click of the drawbar is a 3 dB gain in volume, up to
eight clicks. 6 dB of gain is a doubling of volume, so we need an
extra binary bit of scale factor for every two clicks available. So
the drawbars map to a 4-bit multiplier. In the current code, I&amp;rsquo;m using
a &lt;a href="http://en.wikipedia.org/wiki/Q_(number_format"&gt;Q4.3&lt;/a&gt;) format number
to maximize bit usage and get close to that 16x multiplier (Q4.3 gives
us 15.875x).&lt;/p&gt;

&lt;p&gt;I also need to think about the overall scale. So far I&amp;rsquo;m doing the
same thing as the previous post &amp;mdash; accumulating 16-bit samples and
shifting them down to 12 bits before output. With an 8-bit sine table,
we have enough headroom to play 256 (2&lt;sup&gt;16&lt;/sup&gt; / 2&lt;sup&gt;8)&lt;/sup&gt; oscillators at
once. Excellent.&lt;/p&gt;

&lt;p&gt;If each drawbar is pulled out all the way, the oscillators are scaled
by a factor of 16 (the Q4.3 multiplier above). I have enough volume to
play (2&lt;sup&gt;16&lt;/sup&gt; / 2&lt;sup&gt;8&lt;/sup&gt; / 16) oscillators at once. That&amp;rsquo;s only 16, not enough
to play two keys. Uh oh.&lt;/p&gt;

&lt;p&gt;Once I run out of headroom on volume, some samples overflow to
meaningless values. This screws with the output waveform and creates
an unpleasant sort of distortion.&lt;/p&gt;

&lt;p&gt;This is a problem I haven&amp;rsquo;t solved yet.&lt;/p&gt;

&lt;p&gt;It would be easy to give the samples more bits, either 24 or 32, but
time is tight already. I could add a curve for dynamic range
compression, but so far I haven&amp;rsquo;t thought of a good way to do that
quickly enough. The Hammond has a behavior called &amp;ldquo;loudness robbing&amp;rdquo;
that results in compression as a single tonewheel is played via
multiple keys. Simulating that could help.&lt;/p&gt;

&lt;p&gt;The oscillators can run in the background as long as they&amp;rsquo;re
calculated in time for the sample clock. What if I offload them to
additional microcontrollers?&lt;/p&gt;

&lt;h3&gt;Tonewheel coprocessors&lt;/h3&gt;

&lt;p&gt;A possible next move is to move oscillator generation onto separate
chips. If I split the 91 oscillators into three groups and run the
coprocessors at 20MHz, I should have time to spare.&lt;/p&gt;

&lt;p&gt;The main microcontroller controls the system, handling MIDI messages
and the final mix. Partitioning the oscillators carefully would leave
the bass and treble sounds separated for two-speaker Leslie
simulation.&lt;/p&gt;

&lt;p&gt;Plus, I get to say tonewheel coprocessors. Tonewheel coprocessors.&lt;/p&gt;

&lt;p&gt;I&amp;rsquo;m sure it would be productive to try a more powerful
microcontroller. A 16-bit PIC or 32-bit ARM would probably suffice,
running at a higher clock rate, and the ARM in particular would
provide plenty of volume headroom.&lt;/p&gt;

&lt;h3&gt;Status&lt;/h3&gt;

&lt;p&gt;Where is roto now? It&amp;rsquo;s a decent sounding simulation of a Hammond:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;It runs all 91 tonewheels, at rates calculated from the Hammond
B-3&amp;rsquo;s original gear ratios.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It supports foldback properly at either end of the manual.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;All nine drawbars are modeled accurately.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It has clicks on key presses that sound a lot like Hammond key
click.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;It supports the &amp;ldquo;complex tonewheels&amp;rdquo; of the bass pedals (though
not the pedals' separate drawbars).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;I have support for percussion on a branch, but I&amp;rsquo;m not going to
merge that until I hash out the bigger issues.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;h4&gt;The bigger issues&lt;/h4&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Each drawbar you pull out at all drastically reduces the number of
notes you can play at once. At the moment that&amp;rsquo;s due to distortion
when overflowing the 16-bit samples, but computation speed becomes
limiting as soon as that&amp;rsquo;s solved.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No attenuation of higher frequency tonewheels.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No built-in effects, i.e. chorus and vibrato.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;No Leslie!&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/KEW1Zpc4Coo" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2011/05/roto.html</feedburner:origLink></entry>
    
    <entry>
        <title>New noises from MidiVox: Or, bleep bloop.</title>
        <id>tag:teichman.org,2011-02-20:/2011/02/new-noises-from-midivox.html</id>

        <published>2011-02-20T12:16:46-05:00</published>

        
          
        

        <updated>2011-02-20T12:16:46-05:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/4gVvzcjIZxs/new-noises-from-midivox.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/5314265115/"&gt;&lt;img class="pull-right" width="240" height="240" src="http://farm6.staticflickr.com/5130/5314265115_ce981b6d64_m.jpg" alt="Skyline" /&gt;
  &lt;p style="max-width:240px"&gt;Skyline&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;I don&amp;rsquo;t like lugging my laptop up to my MIDI keyboard every time I
want to play music, so I have been writing a MIDI synth that uses an
&lt;a href="http://arduino.cc/"&gt;Arduino&lt;/a&gt; and Collin Cunningham&amp;rsquo;s &lt;a href="http://www.narbotic.com/projects/midivox/"&gt;MidiVox
Shield&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The MidiVox shield comes with software called Healer, a synthesizer
with several waveforms, a resonant filter, and traditional envelope
controls. Healer is nice, but it&amp;rsquo;s monophonic: it only plays one note
at a time. I want a polyphonic synth, so I can play crazy chords.&lt;/p&gt;

&lt;p&gt;Here are some notes on making a polyphonic MIDI synth from scratch.&lt;/p&gt;

&lt;h3&gt;Step 1: beeeeeeeeeeeeeeeeeeee&lt;/h3&gt;

&lt;p&gt;First step? Make a noise. There are a number of ways to get sound
output from an Arduino, and this shield is an excellent option. It has
a high quality, 12-bit digital to analog converter on board: the
&lt;a href="http://www.microchip.com/wwwproducts/Devices.aspx?dDocName=en020398"&gt;MCP4921&lt;/a&gt;.
The DAC is controlled by the Arduino&amp;rsquo;s microcontroller via the SPI
bus, which can update it often enough to produce quality audio. It
drives a headphone jack, so I can make secret noises that don&amp;rsquo;t annoy
my wife.&lt;/p&gt;

&lt;p&gt;This is the same DAC used in Adafruit&amp;rsquo;s &lt;a href="http://www.ladyada.net/make/waveshield/"&gt;Wave
Shield&lt;/a&gt;. That board uses a
different pin (PD2, you&amp;rsquo;ll see PB1 used below) as a chip select, so
this code won&amp;rsquo;t work on one without modification. I intend to control
my synth via MIDI, so the Wave Shield isn&amp;rsquo;t a single solution anyway.&lt;/p&gt;

&lt;p&gt;I drive the DAC from a timer interrupt on the microcontroller. Human
hearing is very sensitive to small changes in frequency, so we need
the DAC to be updated on as stable a clock as possible. This code sets
up a 15.6kHz timer and alternates the output of the DAC on each
tick. This makes a square wave tone that is high pitched (half the
clock rate, 7.8kHz).&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
uint16_t sample = 0;

void setup() {
    cli();

    /* Enable interrupt on timer2 == 127, with clk/8 prescaler. At 16MHz,
       this gives a timer interrupt at 15625Hz. */
    TIMSK2 = (1 &amp;lt;&amp;lt; OCIE2A);
    OCR2A = 127;

    /* clear/reset timer on match */
    TCCR2A = 1&amp;lt;&amp;lt;WGM21 | 0&amp;lt;&amp;lt;WGM20; /* CTC mode, reset on match */
    TCCR2B = 0&amp;lt;&amp;lt;CS22 | 1&amp;lt;&amp;lt;CS21 | 0&amp;lt;&amp;lt;CS20; /* clk, /8 prescaler */

    SPCR = 0x50;
    SPSR = 0x01;
    DDRB |= 0x2E;
    PORTB |= (1&amp;lt;&amp;lt;1);

    sei();
}

ISR(TIMER2_COMPA_vect) {
    /* OCR2A has been cleared, per TCCR2A above */
    OCR2A = 127;

    /* set the lower 12 bytes of sample */
    sample = ~sample &amp;amp; 0xFFF;

    PORTB &amp;amp;= ~(1&amp;lt;&amp;lt;1);

    /* buffered, 1x gain, active mode */
    SPDR = highByte(sample) | 0x70;
    while (!(SPSR &amp;amp; (1&amp;lt;&amp;lt;SPIF)));

    SPDR = lowByte(sample);
    while (!(SPSR &amp;amp; (1&amp;lt;&amp;lt;SPIF)));

    PORTB |= (1&amp;lt;&amp;lt;1);
}

void loop() {
}
&lt;/pre&gt;


&lt;p&gt;&lt;audio src="http://teichman.org/blog/static/2011/beeeeeeeeeeeeeeeeeeee.mp3" controls="controls"&gt;&lt;a href="http://teichman.org/blog/static/2011/beeeeeeeeeeeeeeeeeeee.mp3"&gt;beeeeeeeeeeeeeeeeeeee&lt;/a&gt;&lt;/audio&gt;&lt;/p&gt;

&lt;h3&gt;Step 2: boooooooooooooooooooo&lt;/h3&gt;

&lt;p&gt;Ok, I managed to make my microcontroller whine like a TV. Next I tried
to generate a more complex waveform. I dropped a table into the code
with 8 bits worth of sine wave data, a perfect tone.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
uint16_t sample = 0;
uint8_t pos = 0;

const uint8_t sine[] = {
    0x80, 0x83, 0x86, 0x89, 0x8C, 0x8F, 0x92, 0x95, 0x98, 0x9B, 0x9E, 0xA2,
    0xA5, 0xA7, 0xAA, 0xAD, 0xB0, 0xB3, 0xB6, 0xB9, 0xBC, 0xBE, 0xC1, 0xC4,
    0xC6, 0xC9, 0xCB, 0xCE, 0xD0, 0xD3, 0xD5, 0xD7, 0xDA, 0xDC, 0xDE, 0xE0,
    0xE2, 0xE4, 0xE6, 0xE8, 0xEA, 0xEB, 0xED, 0xEE, 0xF0, 0xF1, 0xF3, 0xF4,
    0xF5, 0xF6, 0xF8, 0xF9, 0xFA, 0xFA, 0xFB, 0xFC, 0xFD, 0xFD, 0xFE, 0xFE,
    0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0xFE, 0xFE, 0xFD,
    0xFD, 0xFC, 0xFB, 0xFA, 0xFA, 0xF9, 0xF8, 0xF6, 0xF5, 0xF4, 0xF3, 0xF1,
    0xF0, 0xEE, 0xED, 0xEB, 0xEA, 0xE8, 0xE6, 0xE4, 0xE2, 0xE0, 0xDE, 0xDC,
    0xDA, 0xD7, 0xD5, 0xD3, 0xD0, 0xCE, 0xCB, 0xC9, 0xC6, 0xC4, 0xC1, 0xBE,
    0xBC, 0xB9, 0xB6, 0xB3, 0xB0, 0xAD, 0xAA, 0xA7, 0xA5, 0xA2, 0x9E, 0x9B,
    0x98, 0x95, 0x92, 0x8F, 0x8C, 0x89, 0x86, 0x83, 0x80, 0x7D, 0x7A, 0x77,
    0x74, 0x71, 0x6E, 0x6B, 0x68, 0x65, 0x62, 0x5E, 0x5B, 0x59, 0x56, 0x53,
    0x50, 0x4D, 0x4A, 0x47, 0x44, 0x42, 0x3F, 0x3C, 0x3A, 0x37, 0x35, 0x32,
    0x30, 0x2D, 0x2B, 0x29, 0x26, 0x24, 0x22, 0x20, 0x1E, 0x1C, 0x1A, 0x18,
    0x16, 0x15, 0x13, 0x12, 0x10, 0x0F, 0x0D, 0x0C, 0x0B, 0x0A, 0x08, 0x07,
    0x06, 0x06, 0x05, 0x04, 0x03, 0x03, 0x02, 0x02, 0x02, 0x01, 0x01, 0x01,
    0x01, 0x01, 0x01, 0x01, 0x02, 0x02, 0x02, 0x03, 0x03, 0x04, 0x05, 0x06,
    0x06, 0x07, 0x08, 0x0A, 0x0B, 0x0C, 0x0D, 0x0F, 0x10, 0x12, 0x13, 0x15,
    0x16, 0x18, 0x1A, 0x1C, 0x1E, 0x20, 0x22, 0x24, 0x26, 0x29, 0x2B, 0x2D,
    0x30, 0x32, 0x35, 0x37, 0x3A, 0x3C, 0x3F, 0x42, 0x44, 0x47, 0x4A, 0x4D,
    0x50, 0x53, 0x56, 0x59, 0x5B, 0x5E, 0x62, 0x65, 0x68, 0x6B, 0x6E, 0x71,
    0x74, 0x77, 0x7A, 0x7D
};

void setup() {
    cli();

    /* Enable interrupt on timer2 == 127, with clk/8 prescaler. At 16MHz,
       this gives a timer interrupt at 15625Hz. */
    TIMSK2 = (1 &amp;lt;&amp;lt; OCIE2A);
    OCR2A = 127;

    /* clear/reset timer on match */
    TCCR2A = 1&amp;lt;&amp;lt;WGM21 | 0&amp;lt;&amp;lt;WGM20; /* CTC mode, reset on match */
    TCCR2B = 0&amp;lt;&amp;lt;CS22 | 1&amp;lt;&amp;lt;CS21 | 0&amp;lt;&amp;lt;CS20; /* clk, /8 prescaler */

    SPCR = 0x50;
    SPSR = 0x01;
    DDRB |= 0x2E;
    PORTB |= (1&amp;lt;&amp;lt;1);

    sei();
}

ISR(TIMER2_COMPA_vect) {
    /* OCR2A has been cleared, per TCCR2A above */
    OCR2A = 127;

    /* shift left for volume */
    sample = sine[pos++] &amp;lt;&amp;lt; 3;

    PORTB &amp;amp;= ~(1&amp;lt;&amp;lt;1);

    /* buffered, 1x gain, active mode */
    SPDR = highByte(sample) | 0x70;
    while (!(SPSR &amp;amp; (1&amp;lt;&amp;lt;SPIF)));

    SPDR = lowByte(sample);
    while (!(SPSR &amp;amp; (1&amp;lt;&amp;lt;SPIF)));

    PORTB |= (1&amp;lt;&amp;lt;1);
}

void loop() {
}
&lt;/pre&gt;


&lt;p&gt;When &lt;em&gt;pos&lt;/em&gt; rolls over, the sine wave starts its cycle again. This
gives us a mellow tone at 15625Hz / 256 = 61Hz (quite low pitched).&lt;/p&gt;

&lt;p&gt;&lt;audio src="http://teichman.org/blog/static/2011/boooooooooooooooooooo.mp3" controls="controls"&gt;&lt;a href="http://teichman.org/blog/static/2011/boooooooooooooooooooo.mp3"&gt;boooooooooooooooooooo&lt;/a&gt;&lt;/audio&gt;&lt;/p&gt;

&lt;h3&gt;Step 3: buhhhhhhhhhhhhhhhhhhh&lt;/h3&gt;

&lt;p&gt;How about making a tone at any frequency? I have tied the DAC updates
to a 15625Hz clock. I&amp;rsquo;ve seen that changing the index into the sine
table on every tick will produce a 61Hz tone. What I need is an index
that moves faster or slower than that, according to the frequency of
tone I want to generate.&lt;/p&gt;

&lt;p&gt;There is an efficient way to generate an index that increments by
frequency, and that is &lt;a href="http://en.wikipedia.org/wiki/Direct_digital_synthesizer"&gt;Direct Digital
Synthesis&lt;/a&gt;. I
can update the DAC with the right value at each sample tick, and its
output with mimic the waveform of a running sine wave oscillator.&lt;/p&gt;

&lt;p&gt;First, I calculate the amount the output wave&amp;rsquo;s phase should shift in
each clock cycle. Then, I track the current phase by adding that value
to an accumulator in each tick.&lt;/p&gt;

&lt;p&gt;Calculating the value that needs to be sent to the DAC becomes easy: I
use the accumulator as the index into the sine table. I can even use a
16-bit accumulator, choose the phase increment accordingly, and the
sine index becomes the top 8 bits of the accumulator.&lt;/p&gt;

&lt;p&gt;Let me rephrase, because direct synthesis phase digital accumulator
increment blah blah blah.&lt;/p&gt;

&lt;p&gt;To create a sine wave at a frequency &lt;em&gt;freq&lt;/em&gt; Hz, sampling at 15625Hz:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Track the position of the oscillator in a 16-bit unsigned int.&lt;/li&gt;
&lt;li&gt;Increment it on each sample clock tick by &lt;em&gt;freq&lt;/em&gt; * (2&lt;sup&gt;16&lt;/sup&gt; / 15625)&lt;/li&gt;
&lt;li&gt;Use the top 8 bits of the oscillator position as the index into the sine table.&lt;/li&gt;
&lt;/ol&gt;


&lt;pre class="prettyprint"&gt;
uint16_t sample = 0;

/* incr = freq * (2^16 / 15625) 
 * So for 440Hz, incr = 1845 */
uint16_t incr = 1845;

/* oscillator position */
uint16_t pos = 0;

const uint8_t sine[] = {
    0x80, 0x83, 0x86, 0x89, 0x8C, 0x8F, 0x92, 0x95, 0x98, 0x9B, 0x9E, 0xA2,
    0xA5, 0xA7, 0xAA, 0xAD, 0xB0, 0xB3, 0xB6, 0xB9, 0xBC, 0xBE, 0xC1, 0xC4,
    0xC6, 0xC9, 0xCB, 0xCE, 0xD0, 0xD3, 0xD5, 0xD7, 0xDA, 0xDC, 0xDE, 0xE0,
    0xE2, 0xE4, 0xE6, 0xE8, 0xEA, 0xEB, 0xED, 0xEE, 0xF0, 0xF1, 0xF3, 0xF4,
    0xF5, 0xF6, 0xF8, 0xF9, 0xFA, 0xFA, 0xFB, 0xFC, 0xFD, 0xFD, 0xFE, 0xFE,
    0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0xFE, 0xFE, 0xFD,
    0xFD, 0xFC, 0xFB, 0xFA, 0xFA, 0xF9, 0xF8, 0xF6, 0xF5, 0xF4, 0xF3, 0xF1,
    0xF0, 0xEE, 0xED, 0xEB, 0xEA, 0xE8, 0xE6, 0xE4, 0xE2, 0xE0, 0xDE, 0xDC,
    0xDA, 0xD7, 0xD5, 0xD3, 0xD0, 0xCE, 0xCB, 0xC9, 0xC6, 0xC4, 0xC1, 0xBE,
    0xBC, 0xB9, 0xB6, 0xB3, 0xB0, 0xAD, 0xAA, 0xA7, 0xA5, 0xA2, 0x9E, 0x9B,
    0x98, 0x95, 0x92, 0x8F, 0x8C, 0x89, 0x86, 0x83, 0x80, 0x7D, 0x7A, 0x77,
    0x74, 0x71, 0x6E, 0x6B, 0x68, 0x65, 0x62, 0x5E, 0x5B, 0x59, 0x56, 0x53,
    0x50, 0x4D, 0x4A, 0x47, 0x44, 0x42, 0x3F, 0x3C, 0x3A, 0x37, 0x35, 0x32,
    0x30, 0x2D, 0x2B, 0x29, 0x26, 0x24, 0x22, 0x20, 0x1E, 0x1C, 0x1A, 0x18,
    0x16, 0x15, 0x13, 0x12, 0x10, 0x0F, 0x0D, 0x0C, 0x0B, 0x0A, 0x08, 0x07,
    0x06, 0x06, 0x05, 0x04, 0x03, 0x03, 0x02, 0x02, 0x02, 0x01, 0x01, 0x01,
    0x01, 0x01, 0x01, 0x01, 0x02, 0x02, 0x02, 0x03, 0x03, 0x04, 0x05, 0x06,
    0x06, 0x07, 0x08, 0x0A, 0x0B, 0x0C, 0x0D, 0x0F, 0x10, 0x12, 0x13, 0x15,
    0x16, 0x18, 0x1A, 0x1C, 0x1E, 0x20, 0x22, 0x24, 0x26, 0x29, 0x2B, 0x2D,
    0x30, 0x32, 0x35, 0x37, 0x3A, 0x3C, 0x3F, 0x42, 0x44, 0x47, 0x4A, 0x4D,
    0x50, 0x53, 0x56, 0x59, 0x5B, 0x5E, 0x62, 0x65, 0x68, 0x6B, 0x6E, 0x71,
    0x74, 0x77, 0x7A, 0x7D
};

void setup() {
    cli();

    /* Enable interrupt on timer2 == 127, with clk/8 prescaler. At 16MHz,
       this gives a timer interrupt at 15625Hz. */
    TIMSK2 = (1 &amp;lt;&amp;lt; OCIE2A);
    OCR2A = 127;

    /* clear/reset timer on match */
    TCCR2A = 1&amp;lt;&amp;lt;WGM21 | 0&amp;lt;&amp;lt;WGM20; /* CTC mode, reset on match */
    TCCR2B = 0&amp;lt;&amp;lt;CS22 | 1&amp;lt;&amp;lt;CS21 | 0&amp;lt;&amp;lt;CS20; /* clk, /8 prescaler */

    SPCR = 0x50;
    SPSR = 0x01;
    DDRB |= 0x2E;
    PORTB |= (1&amp;lt;&amp;lt;1);

    sei();
}

ISR(TIMER2_COMPA_vect) {
    /* OCR2A has been cleared, per TCCR2A above */
    OCR2A = 127;

    pos += incr;

    /* shift left a couple of bits for more volume */
    sample = sine[highByte(pos)] &amp;lt;&amp;lt; 2;

    PORTB &amp;= ~(1&amp;lt;&amp;lt;1);

    /* buffered, 1x gain, active mode */
    SPDR = highByte(sample) | 0x70;
    while (!(SPSR &amp; (1&amp;lt;&amp;lt;SPIF)));

    SPDR = lowByte(sample);
    while (!(SPSR &amp; (1&amp;lt;&amp;lt;SPIF)));

    PORTB |= (1&amp;lt;&amp;lt;1);
}

void loop() {
}
&lt;/pre&gt;


&lt;p&gt;&lt;audio src="http://teichman.org/blog/static/2011/buhhhhhhhhhhhhhhhhhhh.mp3" controls="controls"&gt;&lt;a href="http://teichman.org/blog/static/2011/buhhhhhhhhhhhhhhhhhhh.mp3"&gt;buhhhhhhhhhhhhhhhhhhh&lt;/a&gt;&lt;/audio&gt;&lt;/p&gt;

&lt;p&gt;There are two great things about Direct Digital Synthesis.&lt;/p&gt;

&lt;p&gt;First, it&amp;rsquo;s pretty fast even on the Arduino&amp;rsquo;s ATmega328
microcontroller. It&amp;rsquo;s all integer math, one 16-bit addition and an
array index. GCC even optimizes away the mess with the high 8 bits&amp;mdash;it
generates code that uses the register containing &lt;em&gt;pos&lt;/em&gt;&amp;rsquo;s high byte, so
the shift and mask aren&amp;rsquo;t executed.&lt;/p&gt;

&lt;p&gt;Most of the microcontroller cycles are spent accessing memory: copying
&lt;em&gt;pos&lt;/em&gt; to and from RAM while adding &lt;em&gt;incr&lt;/em&gt;, and making the index into
the sine table.&lt;/p&gt;

&lt;p&gt;Speed isn&amp;rsquo;t an issue for one oscillator, but with a 15625Hz clock each
sample has only 64 microseconds to be calculated. (SPOILER ALERT: I
ran out of time trying to advance and mix about 30 oscillators).&lt;/p&gt;

&lt;p&gt;Second, DDS allows any waveform to be pushed through the DAC. I&amp;rsquo;m
using a sine wave, but that table could be created from any data. For
a little more processor time, I could even calculate waveform values
on the fly.&lt;/p&gt;

&lt;h3&gt;Step 4: hummmmmmmmmmmmmmmmmmm&lt;/h3&gt;

&lt;p&gt;Let&amp;rsquo;s wrap this thing up with a chord. Three sine oscillators, C
major.&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
incr1 = 261.63 * 65536 / 15625 = 1097; /* C4 */
incr2 = 329.63 * 65536 / 15625 = 1383; /* E4 */
incr3 = 392.00 * 65536 / 15625 = 1644; /* G4 */
&lt;/pre&gt;


&lt;p&gt;So the final code is:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
uint16_t sample = 0;

/* incr = freq * 2^16 / 15625 */

uint16_t incr1 = 1097; /* C4 */
uint16_t incr2 = 1383; /* E4 */
uint16_t incr3 = 1644; /* G4 */

uint16_t pos1 = 0;
uint16_t pos2 = 0;
uint16_t pos3 = 0;

const uint8_t sine[] = {
    0x80, 0x83, 0x86, 0x89, 0x8C, 0x8F, 0x92, 0x95, 0x98, 0x9B, 0x9E, 0xA2,
    0xA5, 0xA7, 0xAA, 0xAD, 0xB0, 0xB3, 0xB6, 0xB9, 0xBC, 0xBE, 0xC1, 0xC4,
    0xC6, 0xC9, 0xCB, 0xCE, 0xD0, 0xD3, 0xD5, 0xD7, 0xDA, 0xDC, 0xDE, 0xE0,
    0xE2, 0xE4, 0xE6, 0xE8, 0xEA, 0xEB, 0xED, 0xEE, 0xF0, 0xF1, 0xF3, 0xF4,
    0xF5, 0xF6, 0xF8, 0xF9, 0xFA, 0xFA, 0xFB, 0xFC, 0xFD, 0xFD, 0xFE, 0xFE,
    0xFE, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFF, 0xFE, 0xFE, 0xFE, 0xFD,
    0xFD, 0xFC, 0xFB, 0xFA, 0xFA, 0xF9, 0xF8, 0xF6, 0xF5, 0xF4, 0xF3, 0xF1,
    0xF0, 0xEE, 0xED, 0xEB, 0xEA, 0xE8, 0xE6, 0xE4, 0xE2, 0xE0, 0xDE, 0xDC,
    0xDA, 0xD7, 0xD5, 0xD3, 0xD0, 0xCE, 0xCB, 0xC9, 0xC6, 0xC4, 0xC1, 0xBE,
    0xBC, 0xB9, 0xB6, 0xB3, 0xB0, 0xAD, 0xAA, 0xA7, 0xA5, 0xA2, 0x9E, 0x9B,
    0x98, 0x95, 0x92, 0x8F, 0x8C, 0x89, 0x86, 0x83, 0x80, 0x7D, 0x7A, 0x77,
    0x74, 0x71, 0x6E, 0x6B, 0x68, 0x65, 0x62, 0x5E, 0x5B, 0x59, 0x56, 0x53,
    0x50, 0x4D, 0x4A, 0x47, 0x44, 0x42, 0x3F, 0x3C, 0x3A, 0x37, 0x35, 0x32,
    0x30, 0x2D, 0x2B, 0x29, 0x26, 0x24, 0x22, 0x20, 0x1E, 0x1C, 0x1A, 0x18,
    0x16, 0x15, 0x13, 0x12, 0x10, 0x0F, 0x0D, 0x0C, 0x0B, 0x0A, 0x08, 0x07,
    0x06, 0x06, 0x05, 0x04, 0x03, 0x03, 0x02, 0x02, 0x02, 0x01, 0x01, 0x01,
    0x01, 0x01, 0x01, 0x01, 0x02, 0x02, 0x02, 0x03, 0x03, 0x04, 0x05, 0x06,
    0x06, 0x07, 0x08, 0x0A, 0x0B, 0x0C, 0x0D, 0x0F, 0x10, 0x12, 0x13, 0x15,
    0x16, 0x18, 0x1A, 0x1C, 0x1E, 0x20, 0x22, 0x24, 0x26, 0x29, 0x2B, 0x2D,
    0x30, 0x32, 0x35, 0x37, 0x3A, 0x3C, 0x3F, 0x42, 0x44, 0x47, 0x4A, 0x4D,
    0x50, 0x53, 0x56, 0x59, 0x5B, 0x5E, 0x62, 0x65, 0x68, 0x6B, 0x6E, 0x71,
    0x74, 0x77, 0x7A, 0x7D
};

void setup() {
    cli();

    /* Enable interrupt on timer2 == 127, with clk/8 prescaler. At 16MHz,
       this gives a timer interrupt at 15625Hz. */
    TIMSK2 = (1 &amp;lt;&amp;lt; OCIE2A);
    OCR2A = 127;

    /* clear/reset timer on match */
    TCCR2A = 1&amp;lt;&amp;lt;WGM21 | 0&amp;lt;&amp;lt;WGM20; /* CTC mode, reset on match */
    TCCR2B = 0&amp;lt;&amp;lt;CS22 | 1&amp;lt;&amp;lt;CS21 | 0&amp;lt;&amp;lt;CS20; /* clk, /8 prescaler */

    SPCR = 0x50;
    SPSR = 0x01;
    DDRB |= 0x2E;
    PORTB |= (1&amp;lt;&amp;lt;1);

    sei();
}

ISR(TIMER2_COMPA_vect) {
    /* OCR2A has been cleared, per TCCR2A above */
    OCR2A = 127;

    pos1 += incr1;
    pos2 += incr2;
    pos3 += incr3;

    sample = sine[highByte(pos1)] + sine[highByte(pos2)] + sine[highByte(pos3)];

    PORTB &amp;= ~(1&amp;lt;&amp;lt;1);

    /* buffered, 1x gain, active mode */
    SPDR = highByte(sample) | 0x70;
    while (!(SPSR &amp;amp; (1&amp;lt;&amp;lt;SPIF)));

    SPDR = lowByte(sample);
    while (!(SPSR &amp;amp; (1&amp;lt;&amp;lt;SPIF)));

    PORTB |= (1&amp;lt;&amp;lt;1);
}

void loop() {
}
&lt;/pre&gt;


&lt;p&gt;&lt;audio src="http://teichman.org/blog/static/2011/hummmmmmmmmmmmmmmmmmm.mp3" controls="controls"&gt;&lt;a href="http://teichman.org/blog/static/2011/hummmmmmmmmmmmmmmmmmm.mp3"&gt;hummmmmmmmmmmmmmmmmmm&lt;/a&gt;&lt;/audio&gt;&lt;/p&gt;

&lt;p&gt;Listen to this chord long enough, and you might start to hear the
siren call of the Hammond organ. I already knew that they&amp;rsquo;re
essentially a large number of sine oscillators mixed together. From
here, my synth gained MIDI control and I started to simulate the tones
of a Hammond B series.&lt;/p&gt;

&lt;p&gt;You can find the current status of that project at
&lt;a href="https://github.com/pteichman/roto"&gt;roto&lt;/a&gt;.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/4gVvzcjIZxs" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2011/02/new-noises-from-midivox.html</feedburner:origLink></entry>
    
    <entry>
        <title>Profiling with histograms: Or, you never should have registered with instatrace.</title>
        <id>tag:teichman.org,2011-02-15:/2011/02/profiling-with-histograms.html</id>

        <published>2011-02-15T21:34:11-05:00</published>

        
          
        

        <updated>2011-02-15T21:34:11-05:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/HpEkdt6_SRo/profiling-with-histograms.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/5449263449/"&gt;&lt;img class="pull-right" width="240" height="240" src="http://farm6.staticflickr.com/5012/5449263449_72632a9eee_m.jpg" alt="Logjammin' 2" /&gt;
  &lt;p style="max-width:240px"&gt;Logjammin' 2&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;When the time came to improve
&lt;a href="http://teichman.org/blog/2011/02/cobe.html"&gt;cobe&lt;/a&gt;&amp;rsquo;s performance, it
was easy to measure how well it was learning new text&amp;mdash;the learning
process is database intensive, but deterministic depending on the
input. I could use Python&amp;rsquo;s built-in profiler, find the problem spots,
and optimize as usual.&lt;/p&gt;

&lt;p&gt;When I tried to use the same tools on cobe&amp;rsquo;s reply routines, I started
to have trouble. A reply is a random walk across its Markov chains,
repeated as many times as possible within a half-second time slice. I
could use the same input data on the same brain of learned text and
get wildly different results, depending on which input word was
(randomly) chosen to follow and the state of the chains from that
point.&lt;/p&gt;

&lt;p&gt;And since each reply always runs for half a second, the CPU usage of
the reply routines would always expand to fill the same amount of time
regardless of the actual performance.&lt;/p&gt;

&lt;p&gt;If I could assume that the reply scoring algorithm was successfully
picking the best candidate reply, generating more candidate replies in
0.5s would increase the chance of getting a better reply.&lt;/p&gt;

&lt;p&gt;So maximizing the candidate reply count would improve reply
scores. This gave me a place to focus.&lt;/p&gt;

&lt;p&gt;But the reply counts would vary widely, and not just with different
input. I might see 2000 candidates created in one reply, but the next
reply (to the same input!) might yield 10. This typically depended on
the word in the input chosen to start the random walk. Some portions
of the Markov chain tended to produce candidates very quickly, and
some tended to take much longer.&lt;/p&gt;

&lt;p&gt;I wanted to focus on optimizing the average case, but needed a way to
separate that out from the very fast and very slow paths.&lt;/p&gt;

&lt;p&gt;At the time, &lt;a href="http://litl.com"&gt;litl&lt;/a&gt; had me working a bit in
&lt;a href="http://www.chromium.org/Home"&gt;Chromium&lt;/a&gt;. Chromium measures the
performance of hundreds of different operations as it&amp;rsquo;s running&amp;mdash;some
of the statistics gained are the ones sent if you&amp;rsquo;re enabled
&amp;ldquo;Automatically send usage statistics and crash reports to Google&amp;rdquo; in
Google Chrome.&lt;/p&gt;

&lt;p&gt;To see these, visit &lt;strong&gt;about:histograms&lt;/strong&gt; in Chrome. Here&amp;rsquo;s an example:&lt;/p&gt;

&lt;pre class="highlight"&gt;
Histogram: DNS.PrefetchQueue recorded 699 samples, average = 2.1, standard deviation = 15.9 (flags = 0x1)
0    ------------------------------------------------------------------------O (619 = 88.6%)
1    -O                                                                        (6 = 0.9%) {88.6%}
2    -O                                                                        (8 = 1.1%) {89.4%}
3    -O                                                                        (6 = 0.9%) {90.6%}
4    O                                                                         (1 = 0.1%) {91.4%}
5    -O                                                                        (5 = 0.7%) {91.6%}
6    -O                                                                        (5 = 0.7%) {92.3%}
7    O                                                                         (2 = 0.3%) {93.0%}
8    -O                                                                        (13 = 1.9%) {93.3%}
10   -O                                                                        (14 = 2.0%) {95.1%}
12   O                                                                         (1 = 0.1%) {97.1%}
14   O                                                                         (5 = 0.7%) {97.3%}
17   O                                                                         (2 = 0.3%) {98.0%}
20   O                                                                         (4 = 0.6%) {98.3%}
24   O                                                                         (0 = 0.0%) {98.9%}
29   O                                                                         (1 = 0.1%) {98.9%}
34   O                                                                         (0 = 0.0%) {99.0%}
40   O                                                                         (1 = 0.1%) {99.0%}
48   O                                                                         (3 = 0.4%) {99.1%}
57   ...
81   O                                                                         (1 = 0.1%) {99.6%}
96   ...
268  O                                                                         (2 = 0.3%) {99.7%}
318  ...
&lt;/pre&gt;


&lt;p&gt;The top line shows the name of the statistic (DNS.PrefetchQueue) and
some aggregate statistics. This looks like it&amp;rsquo;s the number of DNS
entries awaiting &lt;a href="http://dev.chromium.org/developers/design-documents/dns-prefetching"&gt;prefetch
resolution&lt;/a&gt;,
probably sampled on a schedule or in response to some event. Maybe
it&amp;rsquo;s sampled when a new name is enqueued for resolution.&lt;/p&gt;

&lt;p&gt;The rest of the lines in the histogram are buckets the sampled values
fall into. In this case, the prefetch queue size is almost always
zero&amp;mdash;619 of the 699 samples (88.6%) are in that bucket. Six times,
the sampled value was 1. Five times, the value was between 14 and 17
(exclusive).&lt;/p&gt;

&lt;p&gt;But on two occasions, the queue length was huge&amp;mdash;between 268 and
318. Maybe that&amp;rsquo;s not a big deal, since those account for 0.3% of the
samples. But if we had been looking at the average queue length (2.1),
we might not have guessed that it would ever get that long. Maybe
those two samples signal a more fundamental issue that could be fixed.&lt;/p&gt;

&lt;p&gt;So these histograms can be very useful for visualizing the
distribution of sampled data.&lt;/p&gt;

&lt;p&gt;I decided to log some similar data in cobe, and wrote
&lt;a href="https://github.com/pteichman/instatrace"&gt;instatrace&lt;/a&gt; to help produce
the histograms. I wanted to minimize the burden on the programmer, so
instatrace reads text data from log files. All you need to do is log
lines that look like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Brain.reply_count 50
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This is a single sample of the &lt;em&gt;Brain.reply_count&lt;/em&gt; statistic, with a
value of 50. The sample must have an integer value. Any text that is
found after the value is ignored for histogram use, but can be very
valuable debugging information when tracking down the causes of
outlying values.&lt;/p&gt;

&lt;p&gt;Optionally, instatrace can strip sample data from lines that contain
the string INSTATRACE followed by that format. Maybe the only way your
program can output data involves a log format you can&amp;rsquo;t change, or
maybe you&amp;rsquo;re mixing these samples with a data stream that contains
other data. In that case, this line would be read the same:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[[2011-02-15 17:03:40]] log-debug: INSTATRACE: Brain.reply_count 50
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Any lines that don&amp;rsquo;t contain the string &amp;ldquo;INSTATRACE:&amp;rdquo; are ignored when
that flag (&amp;mdash;filter) is used.&lt;/p&gt;

&lt;p&gt;So here&amp;rsquo;s a histogram of &lt;em&gt;Brain.reply_count&lt;/em&gt; on a running cobe instance:&lt;/p&gt;

&lt;pre class="highlight"&gt;
Histogram: Brain.reply_count recorded 1439 samples, avg = 328.3, std = 283.7
1      -O                                                           (29 = 2.0%)
2      --O                                                          (64 = 4.4%) {2.0%}
7      -O                                                           (46 = 3.2%) {6.5%}
20     -O                                                           (37 = 2.6%) {9.7%}
54     ---O                                                         (94 = 6.5%) {12.2%}
148    -----------------------------------O                         (862 = 59.9%) {18.8%}
403    -----------O                                                 (268 = 18.6%) {78.7%}
1096   -O                                                           (39 = 2.7%) {97.3%}
2980   ...
&lt;/pre&gt;


&lt;p&gt;Here we&amp;rsquo;re using buckets that are sized exponentially (instatrace&amp;rsquo;s
default). This instance of cobe has replied to 1439 inputs, and 59.9%
of the time it was able to create between 148 and 403 candidate
replies.&lt;/p&gt;

&lt;p&gt;If you&amp;rsquo;d like to see the full set of statistics I&amp;rsquo;m recording for
cobe, see &lt;a href="/blog/static/cobe-instatrace.txt"&gt;here&lt;/a&gt;. I
append units (us = microseconds) to most of my statistic names.&lt;/p&gt;

&lt;p&gt;Instatrace works well with both time-based and count-based
statistics. If your data logging is lightweight, you can leave the
measurements in your code indefinitely. On the other hand, it&amp;rsquo;s so
easy to add a trace that it&amp;rsquo;s great for temporary profiling efforts
too.&lt;/p&gt;

&lt;p&gt;I have found it extremely valuable in improving cobe toward its
current state, but it&amp;rsquo;s also a useful tool for understanding some more
subtle aspects of software performance.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/HpEkdt6_SRo" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2011/02/profiling-with-histograms.html</feedburner:origLink></entry>
    
    <entry>
        <title>cobe: Or, some of my oldest friends are robots.</title>
        <id>tag:teichman.org,2011-02-13:/2011/02/cobe.html</id>

        <published>2011-02-13T15:02:39-05:00</published>

        
          
        

        <updated>2011-02-13T15:02:39-05:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/stcu9jjYBmE/cobe.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/3849552615/"&gt;&lt;img class="pull-right" width="240" height="240" src="http://farm4.staticflickr.com/3442/3849552615_7f3b651b48_m.jpg" alt="I Expected More From You, Citrus Degreaser" /&gt;
  &lt;p style="max-width:240px"&gt;I Expected More From You, Citrus Degreaser&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;I like &lt;a href="http://megahal.alioth.debian.org/"&gt;MegaHAL&lt;/a&gt;. It&amp;rsquo;s a
conversation simulator, and it gathers information about sentence
structure and vocabulary while spouting on-topic insights and
synthesizing clever turns of phrase.&lt;/p&gt;

&lt;p&gt;It&amp;rsquo;s all a trick, a bit of nice programming combined with our
relentless pattern-searching brains, but an amusement and an endless
source of the surreal.&lt;/p&gt;

&lt;p&gt;I found MegaHAL while I was in college. Some friends and I spent a few
late nights huddled around a browser entering phrases in exchange for
rare gems of robot insight. I downloaded its software, hooked it up to
an IRC client, and named it hal. He would learn from all the messages
we said to each other, and he&amp;rsquo;d reply if spoken to directly.&lt;/p&gt;

&lt;p&gt;At the beginning of last year, I had run hal for eleven years. He&amp;rsquo;s a
well-loved member of multiple communities.&lt;/p&gt;

&lt;p&gt;But MegaHAL has its sad side. Hal started to drop off of IRC from time
to time. Ping timeouts. After a few months of heavy learning, his
brain would get so big that it couldn&amp;rsquo;t be saved (synchronously)
before the IRC network would kick him off. It didn&amp;rsquo;t help that I was
running him on a machine with very little RAM&amp;mdash;he&amp;rsquo;d be partially
swapped out to disk, and saving the brain required reading it all back
from swap.&lt;/p&gt;

&lt;p&gt;Most of the time he&amp;rsquo;d reconnect after the save. Occasionally, events
would cause him to exit without saving his brain completely. For
MegaHAL, a corrupt brain is a useless brain. We&amp;rsquo;d lose all of hal&amp;rsquo;s
knowledge and have to start from scratch.&lt;/p&gt;

&lt;p&gt;We&amp;rsquo;d mourn the old hal, remember some funny things he&amp;rsquo;d said, and
start teaching the new hal. Circle of life, and all that.&lt;/p&gt;

&lt;p&gt;From time to time I&amp;rsquo;d dive into megahal.c and try to find out what was
going wrong. I found a couple minor issues, but nothing utterly
brain-corrupting.&lt;/p&gt;

&lt;p&gt;Mostly I needed to get his memory usage down. I was running a couple
of MegaHALs on a hosted virtual machine with little RAM and a
token-based IO rate limiter. If I ran out of IO tokens while saving a
brain, the save could take hours. If MegaHAL tried to save itself
while a system backup was running, the whole machine would grind to a
halt due to IO starvation.&lt;/p&gt;

&lt;p&gt;I tried out a couple of alternate memory allocation
schemes (MegaHAL tended toward realloc-based fragmentation), but nothing
performed nearly as well. I needed to get his ever-growing brain out
of RAM altogether, but I would need an on-disk datastore with good
performance.&lt;/p&gt;

&lt;h3&gt;The halgorithm&lt;/h3&gt;

&lt;p&gt;MegaHAL has two functions, learning and replying.&lt;/p&gt;

&lt;p&gt;When learning, it captures statistical information about text in a
&lt;a href="http://en.wikipedia.org/wiki/Markov_chain"&gt;Markov chain&lt;/a&gt;. For each
series of words, it tracks all the possible words that may follow. It
also tracks the probability of each of those following words.&lt;/p&gt;

&lt;p&gt;So given two phrases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;MegaHAL is a program&lt;/li&gt;
&lt;li&gt;MegaHAL is a robot&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Its internal model might look something like this:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(MegaHAL, is, a) =&amp;gt; program (50% probability)
(MegaHAL, is, a) =&amp;gt; robot   (50% probability)
(is, a, program) =&amp;gt; END     (100% probability)
(is, a, robot)   =&amp;gt; END     (100% probability)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In order to generate a reply, we might pick one of the starting
contexts (the ones that begin with MegaHAL) and follow the chain
(picking a random next context at each step) until we reach the
END. The probabilities do not enter into this process.&lt;/p&gt;

&lt;p&gt;So one possible following of the above chain would be:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;(&lt;strong&gt;MegaHAL&lt;/strong&gt;, &lt;strong&gt;is&lt;/strong&gt;, &lt;strong&gt;a&lt;/strong&gt;) =&gt; &lt;strong&gt;program&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;(is, a program)  =&gt; &lt;strong&gt;END&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;Which produces the phrase &lt;strong&gt;MegaHAL&lt;/strong&gt; &lt;strong&gt;is&lt;/strong&gt; &lt;strong&gt;a&lt;/strong&gt;
&lt;strong&gt;program&lt;/strong&gt;. Clearly, we don&amp;rsquo;t include &lt;strong&gt;END&lt;/strong&gt; in the output&amp;mdash;it&amp;rsquo;s
a marker that allows us to know when to stop following the chain.&lt;/p&gt;

&lt;p&gt;The number of words on the left side is the &lt;em&gt;order&lt;/em&gt; of the Markov
chain. In this case, we&amp;rsquo;re looking at a 3rd order chain. A higher
order chain brings more context into the mix, and results in responses
that include longer runs of learned text. A lower order chain uses
less context, and is more likely to jump to unrelated words. Choosing
the order of a chain allows you to strike some balance between
coherent text and interesting babble.&lt;/p&gt;

&lt;p&gt;Markov chain modeling has a long history for surreal text generation,
as well as many &lt;a href="http://en.wikipedia.org/wiki/Markov_chain#Applications"&gt;less practical
applications&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Now, MegaHAL actually does things a little differently from the above.&lt;/p&gt;

&lt;p&gt;First, it uses &lt;em&gt;two&lt;/em&gt; Markov chains. It trains the second chain in the
reverse direction, so it can be used to select a word that &lt;em&gt;precedes&lt;/em&gt;
the current context. When replying, it chooses a context that may be
in the middle of a sentence, generates the rest of the phrase using
the forward Markov chain, and generates the beginning of the phrase
using this reverse chain.&lt;/p&gt;

&lt;p&gt;So in addition to the forward chain above, our simple brain would
contain this reverse chain:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(program, a, is) =&amp;gt; MegaHAL (100% probability)
(robot, a, is) =&amp;gt; MegaHAL   (100% probability)
(a, is, MegaHAL) =&amp;gt; END     (100% probability)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Second, it always generates a reply based on some user input&amp;mdash;it isn&amp;rsquo;t
just choosing a random initial context from the chain. It selects a
word from the input, finds a context containing that word, and
generates a reply from that context. In this way, its reply always has
something in common with the input.&lt;/p&gt;

&lt;p&gt;Third, it does this many, many times for one reply. Internally it may
generate thousands of possible replies to your input, each time
choosing a random input word, selecting a context, and following the
two chains to produce a phrase. This is where the probabilities above
come into play. MegaHAL gives you back the reply with the most
surprise, essentially the least probable reply of the thousands it
generates, evaluated word by word.&lt;/p&gt;

&lt;p&gt;Fourth, it doesn&amp;rsquo;t split input word-by-word. It tracks punctuation,
spacing, and numbers each as separate tokens on par with words. And
everything is upper-cased internally, so the user&amp;rsquo;s capitalization
doesn&amp;rsquo;t matter.&lt;/p&gt;

&lt;p&gt;So this phrase:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;This isn't a test, but it might be a good one.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Turns into these 22 tokens (split with slashes):&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;THIS/ /ISN'T/ /A/ /TEST/, /BUT/ /IT/ /MIGHT/ /BE/ /A/ /GOOD/ /ONE/.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because of this, MegaHAL tends to learn the punctuation conventions
of its users and obey those conventions when replying.&lt;/p&gt;

&lt;h3&gt;cobe&lt;/h3&gt;

&lt;p&gt;I made a couple of stabs at a database backend for MegaHAL about five
years ago, but never really got anywhere. I used Berkeley DB and tried
to be a little too clever with database access. I misunderstood some
of &lt;a href="http://megahal.alioth.debian.org/How.html"&gt;How MegaHAL Works&lt;/a&gt; and
ended up with poor quality replies.&lt;/p&gt;

&lt;p&gt;So we kept at the circle of life, creating, teaching, and mourning our
robot friend. Over time, I got better at recognizing how to coddle a
running MegaHAL. He would last for months on a single brain, but that
still wasn&amp;rsquo;t enough.&lt;/p&gt;

&lt;p&gt;About a year ago, I decided to try again. I started a Python codebase
with a SQLite backend. I found the MegaHAL Perl port called
&lt;a href="http://blogs.perl.org/users/aevar_arnfjor_bjarmason/2010/01/hailo-a-perl-rewrite-of-megahal.html"&gt;Hailo&lt;/a&gt;,
which showed that a SQL database was a viable option.&lt;/p&gt;

&lt;p&gt;So now we have &lt;a href="https://github.com/pteichman/cobe/wiki"&gt;cobe&lt;/a&gt;
(&lt;em&gt;coh-bee&lt;/em&gt;). It&amp;rsquo;s named after a former employer&amp;rsquo;s Code of
Business Ethics. It&amp;rsquo;s usually in violation.&lt;/p&gt;

&lt;p&gt;cobe&amp;rsquo;s datastore uses SQLite and resides entirely on disk. I used to
be able to run a MegaHAL brain until its process was about 130M
(resident &amp;ndash; shared). With cobe, a brain with the same learned content
results in a process taking 4M. That MegaHAL would have been nearing
the end of its life, but cobe is just getting started.&lt;/p&gt;

&lt;p&gt;It uses a similar process to generate replies as MegaHAL. It generates
as many replies as possible in half a second, and chooses the least
probable reply based on surprise. Since its datastore is on disk
instead of RAM, it can generate far fewer replies than MegaHAL in this
time. cobe will generate a few hundred to MegaHAL&amp;rsquo;s thousands, and
once a cobe brain is very large this may drop to dozens of replies on
average.&lt;/p&gt;

&lt;p&gt;In theory, this results in poorer quality replies because it has fewer
chances to maximize surprise. In practice, reply quality appears to be
about the same.&lt;/p&gt;

&lt;p&gt;There are two main differences from MegaHAL&amp;rsquo;s behavior that end up
improving reply quality further.&lt;/p&gt;

&lt;p&gt;The first is in capitalization. cobe is case preserving, so &amp;ldquo;foo&amp;rdquo;,
&amp;ldquo;Foo&amp;rdquo;, and &amp;ldquo;FOO&amp;rdquo; are all different tokens. MegaHAL would have called
all of those &amp;ldquo;FOO&amp;rdquo;. This reduces some overlap between contexts, so
some tangents can&amp;rsquo;t be followed, but makes up for all of that when
the robot starts shouting (&lt;em&gt;SHOUTY&lt;/em&gt;  &lt;em&gt;HAL&lt;/em&gt;).&lt;/p&gt;

&lt;p&gt;The second is in the way it splits up input phrases into
tokens. Hyphenated words are kept together as one token. Multiple
spaces between words are collapsed together to one space. URLs are
kept together as one token. It was always cute when MegaHAL attempted
to make up its own URLs based on portions of learned links, but in
practice the constant 404s made it annoying.&lt;/p&gt;

&lt;p&gt;cobe also has some tweaks to the surprise calculations and initial
context selection that improve reply scores by a measurable amount.&lt;/p&gt;

&lt;p&gt;cobe has a transactional datastore and is resistant to brain
corruption. A brain can be accessed concurrently by several
processes&amp;mdash;one can be serving replies, while another crawls the
internet for text to learn.&lt;/p&gt;

&lt;p&gt;While optimizing cobe, I had to come up with some techniques for
comparing the performance of random processes. Every potential reply
is a random walk across the Markov chain, and depending on chance the
performance of two replies could be wildly different for the same
input. The surprise of potential replies also varies based on random
factors, and average scores change quite a bit over the lifetime of a
brain. All of these made CPU profiling by average execution time less
useful.&lt;/p&gt;

&lt;p&gt;I developed a tool called
&lt;a href="https://github.com/pteichman/instatrace"&gt;instatrace&lt;/a&gt; for visualizing
performance characteristics of random processes, based somewhat on
Google Chrome&amp;rsquo;s Histograms. It proved very effective for helping to
optimize cobe, but I&amp;rsquo;ll write about that some other time.&lt;/p&gt;

&lt;p&gt;As of this writing, cobe is nearing a year in production use. Happy
birthday, little guy.&lt;/p&gt;

&lt;p&gt;Many thanks to &lt;a href="http://jasonhutchens.me/"&gt;Jason Hutchens&lt;/a&gt;, the author
of MegaHAL, for years of entertainment. And check out his plans for
&lt;a href="http://kranzky.rockethands.com/2010/11/06/megahal/"&gt;MegaHAL.10&lt;/a&gt;.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/stcu9jjYBmE" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2011/02/cobe.html</feedburner:origLink></entry>
    
    <entry>
        <title>Screen and ssh-agent</title>
        <id>tag:teichman.org,2006-02-23:/2006/02/screen_and_ssha.html</id>

        <published>2006-02-23T09:40:59-05:00</published>

        
          
        

        <updated>2006-02-23T09:40:59-05:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/HSsMgVx0NPU/screen_and_ssha.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/87658397/"&gt;&lt;img class="pull-right" width="160" height="240" src="http://farm1.staticflickr.com/42/87658397_4eed2b6694_m.jpg" alt="My Biggest" /&gt;
  &lt;p style="max-width:160px"&gt;My Biggest&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;I started using &lt;a href="http://www.gnu.org/software/screen/"&gt;Screen&lt;/a&gt; on a daily basis, and quickly desired an ssh-agent tied to a long-lived screen session.&lt;/p&gt;

&lt;p&gt;The obvious solution (running "ssh-agent screen") doesn't work&amp;mdash;as soon as you log out, screen drops into the background and you lose the agent.&lt;/p&gt;

&lt;p&gt;People have solved this problem in various ways&amp;mdash;a quick &lt;a href="http://www.google.com/search?q=screen+ssh-agent"&gt;search&lt;/a&gt; shows some of them.  I didn't like them.  They all rely on scripts external to screen for running the agent and maintaining its environment variables.&lt;/p&gt;

&lt;p&gt;I spent a little time last night on a solution.  It's completely contained in your .screenrc, and produces an ssh-agent tied to screen's life cycle.&lt;/p&gt;

&lt;p&gt;It's just two lines:&lt;/p&gt;
&lt;pre class="prettyprint"&gt;
setenv SSH_AUTH_SOCK $HOME/.screen-ssh-agent
screen 10 ssh-agent -a $SSH_AUTH_SOCK $SHELL
&lt;/pre&gt;

&lt;p&gt;This creates a screen (number 10) that holds your ssh agent.  If you don't like 10, you can start the agent on any screen.  Things still work properly.  The agent will be available to all the screens in that session.  It also persists if you close the screen it was started on.&lt;/p&gt;

&lt;p&gt;Make sure you run the setenv line early enough in your .screenrc that it will take effect before you open new screens.&lt;/p&gt;

&lt;p&gt;The only problem I have with this setup is that all my screen sessions share the same agent&amp;mdash;the one listening on $HOME/.screen-ssh-agent.  I'd like to make that one agent per screen session, using something like $HOME/.screen-ssh-agent.$$.  Unfortunately $$ is empty from within .screenrc&amp;mdash;I can't find a real environment variable that uniquely identifies the current process.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/HSsMgVx0NPU" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2006/02/screen_and_ssha.html</feedburner:origLink></entry>
    
    <entry>
        <title>Persecution Complex</title>
        <id>tag:teichman.org,2005-12-16:/2005/12/persecution_com.html</id>

        <published>2005-12-16T15:11:21-05:00</published>

        
          
        

        <updated>2005-12-16T15:11:21-05:00</updated>

        <author><name>Peter</name></author>

        <link rel="alternate" type="text/html" href="http://feedproxy.google.com/~r/teichman/blog/~3/HOObwWlUhUg/persecution_com.html" />

        <content type="html" xml:lang="en" xml:base="http://teichman.org/blog/">&lt;div class="post-image pull-right"&gt;
 &lt;a href="http://www.flickr.com/photos/pteichman/7354648/"&gt;&lt;img class="pull-right" width="240" height="160" src="http://farm1.staticflickr.com/5/7354648_4a2d4063c8_m.jpg" alt="LA 8-BALL!" /&gt;
  &lt;p style="max-width:240px"&gt;LA 8-BALL!&lt;/p&gt;
 &lt;/a&gt;
&lt;/div&gt;


&lt;p&gt;If you are offended by discussion about Emacs, please stop reading now.&lt;/p&gt;

&lt;p&gt;Over the last few years, &lt;a href="http://vthunder.blogspot.com/"&gt;Dan Mills&lt;/a&gt; and I have been improving an Emacs configuration style we both enjoy using.  There has recently been some interest in this setup internally, so I thought I'd pull some notes together and send it out into the world.&lt;/p&gt;

&lt;p&gt;There were two original goals for this design:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A good, clean configuration&amp;mdash;none of the mess that usually clutters .emacs files.&lt;/li&gt;
&lt;li&gt;Easy deployment of new elisp packages.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The configuration comes down to three things.&lt;/p&gt;

&lt;p&gt;First, all the elisp files I own, either packages I've downloaded or things I've written myself, should exist under one tree.  This eases maintenance, and allows me to keep everything in Subversion.&lt;/p&gt;

&lt;p&gt;Second, all the relevant directories in that tree should be automatically added to the Emacs load-path, so I don't ever have to add anything manually when I want to try out something new.&lt;/p&gt;

&lt;p&gt;Third, configuration files should be loaded automatically&amp;mdash;I want to be able to drop settings in and not have to edit a file to get them to load.&lt;/p&gt;

&lt;p&gt;Here is the result:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All directories under ~/elisp that contain elisp files are added to my load path.&lt;/li&gt;
&lt;li&gt;All *-settings.el files under ~/elisp/settings are loaded automatically on start.&lt;/li&gt;
&lt;li&gt;Customized variables (M-x customize, that is) are saved in ~/elisp/custom.el.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Whenever I download something new and nifty, I put it in ~/elisp/packages.  If it's a single file, I'll put it there directly.  If it comes as several files (or even a deep tree), I'll put it in its own subdirectory.  I add a settings file that loads that package, and off I go.&lt;/p&gt;

&lt;p&gt;These settings files are often extremely short&amp;mdash;all I want to do there is set the configuration variables I need to use the package.&lt;/p&gt;

&lt;p&gt;I keep my .emacs file itself as simple as possible&amp;mdash;it's not in the version controlled tree, and I tend to forget changes when I make them.  At the moment, it has one line&amp;mdash;hopefully it's simple enough for me to remember if I lose it:&lt;/p&gt;

&lt;pre class="prettyprint"&gt;
(load-file "~/elisp/init.el")
&lt;/pre&gt;

&lt;p&gt;A sample elisp tree containing that init.el file can be found &lt;a href="http://teichman.org/~peter/elisp.tar.gz"&gt;here&lt;/a&gt;.  There's a fair amount of documentation in the comments of init.el.&lt;/p&gt;
&lt;img src="http://feeds.feedburner.com/~r/teichman/blog/~4/HOObwWlUhUg" height="1" width="1"/&gt;</content>
    <feedburner:origLink>http://teichman.org/blog/2005/12/persecution_com.html</feedburner:origLink></entry>
    
</feed>

