<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>Christopher Peplin</title>
 <link href="http://christopherpeplin.com/atom.xml" rel="self"/>
 <link href="http://christopherpeplin.com/"/>
 <updated>2012-02-06T11:41:27-08:00</updated>
 <id>http://christopherpeplin.com/</id>
 <author>
   <name>Christopher Peplin</name>
   <email>chris.peplin@rhubarbtech.com</email>
 </author>

 
 <entry>
   <title>Maximizing USB Bulk Transfer Throughput</title>
   <link href="http://christopherpeplin.com/2012/02/bulk-usb-throughput"/>
   <updated>2012-02-06T00:00:00-08:00</updated>
   <id>http://christopherpeplin.com/2012/02/usb_throughput</id>
   <content type="html">&lt;p&gt;Ever get so stuck on a problem that web searches only lead back to your own
forum post with the original question? This post is hopefully an end to that for
some people.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;A few months ago, I ran into a performance problem with USB bulk transfers on
the chipKIT, an Arduino-compatible PIC32 microcontroller. I posed this question
to the chipKIT and Microchip communities:&lt;/p&gt;

&lt;p&gt;I'm just getting started with USB programming using the recently released
Microchip library on the chipKIT with the network shield. I've tried to learn as
much as I can about proper USB programming and it's been good for the most part
- however, I'm stuck at a ~75KB/s bulk transfer speed.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;At Ford, we created a device for the &lt;a href=&quot;http://openxcplatform.com&quot;&gt;OpenXC&lt;/a&gt; project that spits out discrete
messages to a host PC over USB as fast as possible. The current strategy is to
use JSON delimited by newlines sent via a bulk transfer endpoint. I trimmed down
the GenericUSB example to test the maximum transfer rate of such a device, and
created a Python receiver for the host (it uses libusb as the underlying USB
library). The benchmarking code is available at &lt;a href=&quot;https://github.com/openxc/arduino-transfer-benchmarking&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;On the chipKIT side, the main loop is pretty simple: it continuously writes a 45
byte JSON message to the USB endpoint:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;while(true) {
    while(usb.HandleBusy(handleInput));
    handleInput = usb.GenWrite(DATA_ENDPOINT, messageBuffer, messageSize);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In Python it's a similar loop that requests a read of some size until it hits
10MB transferred. This is the actual PyUSB read function that gets called:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;device.read(self.endpoint, self.message_size)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The original question continues:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;If you download the benchmarkUSB.pde sketch to the chipKIT, then run &quot;python
receiver.py&quot; it will read 10MB from the device with various &quot;read request&quot; sizes
ranging from 64 bytes (one packet) to 1KB.&lt;/p&gt;

&lt;p&gt;It's my understanding that requesting more data from the host at a time will
increase throughput as messaging overhead drops - for some reason, this isn't
the case for my code. You'll see as the request size increases the throughput
actually drops from 75KB/s to even lower numbers.&lt;/p&gt;&lt;/blockquote&gt;

&lt;h2&gt;Potential Solutions&lt;/h2&gt;

&lt;p&gt;The most common answer I found to throughput issues was to increase the amount
of data requested by the host. This seems like sound advice, but it just wasn't
fixing the problem for me.&lt;/p&gt;

&lt;p&gt;Another bit of advice (from Professor Ed Olson at the University of Michigan)
was to use asynchronous requests so that multiple USB request blocks (URB) were
always in flight. I found this enhancement mentioned elsewhere, and it is in
line with the other potential fix - making sure the USB device is always busy
sending data.&lt;/p&gt;

&lt;p&gt;Unfortunately, switching to asynchronous requests had no effect either - the
benchmark still showed abysmal throughput on the order of 50 - 70KB/s.&lt;/p&gt;

&lt;h2&gt;The Fix&lt;/h2&gt;

&lt;p&gt;The problem ended up being much simpler and had more to do with one of the core
design principles of USB. After shelving this issue for a few months, I
revisited the problem and something caught my eye.&lt;/p&gt;

&lt;p&gt;No matter how many bytes were requested on the host, from 64 to 4096, the read
operation only ever returned 45 bytes - one message. The bulk endpoint here uses
64 byte packets, and USB treats any packet shorter than that (traditionally but
not limited to a zero length packet) as the end of a transfer. Were we causing a
lot of extra overhead by ending every transfer after 45 bytes?&lt;/p&gt;

&lt;p&gt;I padded out the 45 byte test message I was using to 64 bytes, and now it is
&lt;strong&gt;much&lt;/strong&gt; faster (from 70KB/s to 650+ KB/s, as measured from Python). Previously, we
requested 1024 bytes but got 45, which is less than a full 64 byte packet, so of
course the transfer was closed.&lt;/p&gt;
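
&lt;p&gt;To make the short-packet behavior concrete, here is a rough Python model of it.
It is not part of the benchmark code. It assumes every device write fits in a
single packet and that the host requests a multiple of the packet size. The
names &lt;code&gt;pad_to_packet&lt;/code&gt; and &lt;code&gt;bytes_per_read&lt;/code&gt; are made up for this sketch, and
padding with spaces is just one possible choice of filler.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;PACKET_SIZE = 64  # packet size of the bulk endpoint discussed above


def pad_to_packet(message, packet_size=PACKET_SIZE, fill=b" "):
    """Pad message so its length is a whole number of packets."""
    remainder = len(message) % packet_size
    if remainder == 0:
        return message
    return message + fill * (packet_size - remainder)


def bytes_per_read(message_size, request_size, packet_size=PACKET_SIZE):
    """Model how many bytes one host read returns.

    Assumes each device write is a single packet of message_size bytes
    and that request_size is a multiple of packet_size.
    """
    received = 0
    for _ in range(request_size // packet_size):
        received += message_size
        if message_size != packet_size:
            break  # a short packet ends the transfer early
    return received
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Under those assumptions, &lt;code&gt;bytes_per_read(45, 1024)&lt;/code&gt; gives 45, while a message
padded with &lt;code&gt;pad_to_packet&lt;/code&gt; to 64 bytes lets the same 1024 byte request fill
completely.&lt;/p&gt;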

&lt;p&gt;In hindsight this seems like a pretty important thing to know about USB, but
being new to driver development, it wasn't obvious and I couldn't find any
references to how the less-than-max length packet can affect performance
elsewhere.&lt;/p&gt;

&lt;p&gt;The complete fix for OpenXC's use case will be to make sure every message is
padded out to 64 bytes and to request bulk transfer sizes that are big enough to get
good throughput but small enough to not be too delayed. I have yet to confirm,
but my understanding is that if we request a 4KB read and don't mark the end of
the transfer with a less-than-max length packet, the host device will block
waiting for the rest of the requested data.&lt;/p&gt;

&lt;h2&gt;References&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/openxc/arduino-transfer-benchmarking&quot;&gt;Benchmarking code&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.chipkit.cc/forum/viewtopic.php?f=7&amp;amp;t=503&quot;&gt;chipKIT forum post&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.microchip.com/forums/m610161.aspx&quot;&gt;Microchip forum post&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>chipKIT Compatible Arduino-based Makefile</title>
   <link href="http://christopherpeplin.com/2011/12/chipkit-arduino-makefile"/>
   <updated>2011-12-09T00:00:00-08:00</updated>
   <id>http://christopherpeplin.com/2011/12/chipkit-makefile</id>
   <content type="html">&lt;p&gt;The &lt;a href=&quot;https://github.com/peplin/arduino.mk&quot;&gt;Arduino&lt;/a&gt; is a great platform for rapid prototyping hardware devices.
If you need to squeeze a bit more performance out of your project, I've recently
found the &lt;a href=&quot;http://www.digilentinc.com/Products/Catalog.cfm?NavPath=2,892&amp;amp;Cat=18&quot;&gt;Digilent chipKIT&lt;/a&gt; a great drop-in replacement board.&lt;/p&gt;

&lt;p&gt;The chipKIT is based on the PIC32 microcontroller (as opposed to the Arduino's
Atmel ATmega chips), and thus uses a different toolchain for compiling. The
folks at Digilent have kindly released a new version of the Arduino IDE renamed
&lt;a href=&quot;https://github.com/chipKIT32/chipKIT32-MAX/downloads&quot;&gt;MPIDE&lt;/a&gt; which includes the pic32 compilers and other tools.&lt;/p&gt;

&lt;p&gt;If you're not wild about GUI IDEs, there are a few Arduino-compatible Makefiles
floating around that allow you to build and deploy code from the command line.
None of these that I've found have supported the chipKIT until now - I've
published on GitHub an &lt;a href=&quot;https://github.com/peplin/arduino.mk&quot;&gt;extended version&lt;/a&gt; of Martin Oldfield's
&lt;a href=&quot;http://mjo.tc/atelier/2009/02/arduino-cli.html&quot;&gt;Arduino.mk&lt;/a&gt; that works with the tools provided by MPIDE.&lt;/p&gt;

&lt;p&gt;The biggest change is allowing the tool names to be overridden - e.g. you need to
use &lt;code&gt;pic32-gcc&lt;/code&gt; instead of &lt;code&gt;avr-gcc&lt;/code&gt;. To use it, follow Martin's instructions
but instead of including &lt;code&gt;Arduino.mk&lt;/code&gt; at the bottom of your Makefile, just
include &lt;code&gt;chipKIT.mk&lt;/code&gt; instead.&lt;/p&gt;

&lt;p&gt;Thanks to Martin for the well-documented Makefile - this would have taken much
longer had it not been so clearly explained.&lt;/p&gt;

&lt;h2&gt;Source&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Download the chipKIT.mk Makefile from &lt;a href=&quot;https://github.com/peplin/arduino.mk&quot;&gt;GitHub&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Check out Martin Oldfield's &lt;a href=&quot;http://mjo.tc/atelier/2009/02/arduino-cli.html&quot;&gt;original project&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Filesystem Music Collection Sync to Rdio</title>
   <link href="http://christopherpeplin.com/2011/08/rdio-filesystem-collection-sync"/>
   <updated>2011-08-23T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/08/rdiosync</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;http://rdio.com&quot;&gt;Rdio&lt;/a&gt; is pretty neat, but I've been missing out on some of the fun because I
can't sync my music collection to my Rdio account. I don't use iTunes or Windows
Media Player, so the &lt;a href=&quot;http://www.rdio.com/#/apps/desktop/&quot;&gt;Rdio Desktop&lt;/a&gt; Music
Collector doesn't do me much good.&lt;/p&gt;

&lt;p&gt;Thankfully, there's a great &lt;a href=&quot;http://developer.rdio.com/page&quot;&gt;Rdio API&lt;/a&gt; that we
can use to manually sync from the filesystem. I put together a command-line
Python tool that takes care of the job, or at least it did for me. You can run
it multiple times safely, as it keeps track of which albums have already been
uploaded.&lt;/p&gt;
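
&lt;p&gt;The bookkeeping that makes repeated runs safe might look roughly like this. It
is a sketch rather than the actual rdiosync code: the Artist/Album directory
layout, the JSON state file and the &lt;code&gt;upload&lt;/code&gt; callback are all assumptions, and
the callback stands in for whatever Rdio API call adds an album to a collection.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import json
import os


def load_synced(state_path):
    """Return the set of (artist, album) pairs recorded by earlier runs."""
    if not os.path.exists(state_path):
        return set()
    with open(state_path) as state_file:
        return set(tuple(pair) for pair in json.load(state_file))


def save_synced(state_path, synced):
    """Write the synced pairs back to the state file as a sorted list."""
    with open(state_path, "w") as state_file:
        json.dump(sorted(synced), state_file)


def find_albums(music_root):
    """Yield (artist, album) pairs, assuming a music_root/Artist/Album layout."""
    for artist in sorted(os.listdir(music_root)):
        artist_dir = os.path.join(music_root, artist)
        if not os.path.isdir(artist_dir):
            continue
        for album in sorted(os.listdir(artist_dir)):
            if os.path.isdir(os.path.join(artist_dir, album)):
                yield (artist, album)


def sync(music_root, state_path, upload):
    """Upload albums not seen before and return the ones uploaded this run."""
    synced = load_synced(state_path)
    uploaded = []
    for album in find_albums(music_root):
        if album in synced:
            continue  # already handled by an earlier run
        upload(*album)  # hypothetical callback wrapping the API request
        synced.add(album)
        uploaded.append(album)
    save_synced(state_path, synced)
    return uploaded
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Running &lt;code&gt;sync&lt;/code&gt; a second time with the same state file uploads nothing, which
is the property that makes re-running the tool harmless.&lt;/p&gt;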

&lt;p&gt;The tool is available on GitHub at
&lt;a href=&quot;https://github.com/peplin/rdiosync&quot;&gt;rdiosync&lt;/a&gt; - I hope it's helpful for someone
else, too.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Command-line Build and Deploy to the BUG</title>
   <link href="http://christopherpeplin.com/2011/05/bug-labs-maven-build-and-deploy"/>
   <updated>2011-08-04T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/bug-maven</id>
   <content type="html">&lt;p&gt;I recently started a new job, and we're working with the &lt;a href=&quot;http://www.buglabs.net/products&quot;&gt;BUG YT&lt;/a&gt; from
&lt;a href=&quot;http://www.buglabs.net&quot;&gt;Bug Labs&lt;/a&gt;. The primary language for BUG apps is Java, and they helpfully
provide libraries for interacting with all of the available hardware modules as
OSGi services.&lt;/p&gt;

&lt;p&gt;The recommended development environment is their &lt;a href=&quot;http://www.buglabs.net/sdk&quot;&gt;Dragonfly SDK&lt;/a&gt;, an Eclipse
plugin. It provides shortcuts for creating new BUG applications with the proper
build configuration and ways to deploy to either a physical BUG or a virtual
BUG simulator.&lt;/p&gt;

&lt;p&gt;Try as I might, I've never been able to have a positive relationship with
Eclipse - it boils down to the fact that the least important thing in any
workspace view seems to be the code itself. I've paired Eclipse with
&lt;a href=&quot;http://eclim.org/&quot;&gt;Eclim&lt;/a&gt; so I can code in Vim but take advantage of plugins
like Dragonfly, but the complicated setup is dragging my little work laptop
down.&lt;/p&gt;

&lt;h2&gt;Building with Maven&lt;/h2&gt;

&lt;p&gt;The last few times I had to use Java, I found Maven to be a good command-line
alternative to the build features of Eclipse. I've created a Maven archetype and
a small deploy script that can replicate some of the important features of the
Dragonfly SDK on the command line - no mouse required.&lt;/p&gt;

&lt;p&gt;The code for the &lt;code&gt;bug-archetype&lt;/code&gt; is available on
&lt;a href=&quot;https://github.com/peplin/buglabs-maven-archetype&quot;&gt;GitHub&lt;/a&gt;, where I've also
written up more detailed installation instructions. There's not much to the
archetype itself - the trickiest part was finding an existing OSGi bundle
archetype that fit my needs and adding in the Bug Labs libraries.&lt;/p&gt;

&lt;p&gt;This is the first archetype I've made, and I'm a little rusty on the internals
of Maven, so if you have trouble I'd be happy to work through it with you - just
&lt;a href=&quot;https://github.com/peplin/buglabs-maven-archetype/issues&quot;&gt;file an issue&lt;/a&gt; in
the GitHub project.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>A Google Voice Landline with Asterisk</title>
   <link href="http://christopherpeplin.com/2011/05/google-voice-asterisk/"/>
   <updated>2011-06-04T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/google-voice-asterisk</id>
   <content type="html">&lt;p&gt;Inspired by a recent
&lt;a href=&quot;http://www.maximumpc.com/article/how-tos/how_build_your_own_home_phone_server&quot;&gt;Maximum PC article&lt;/a&gt;,
I recently set up a land line telephone that supports both incoming and outgoing
calls from my Google Voice number. I opted to use an installation of Asterisk
on a standard Ubuntu server instead of using one of the pre-built PBX
Linux distributions. The process involved enough fiddling and Google searching
that I thought a summary of how my setup works could be helpful for others.&lt;/p&gt;

&lt;p&gt;You can probably guess by the recent addition of calling to Gmail that Google is
using VoIP protocols to shuffle Voice traffic around. Thankfully, they're
somewhat standard, and Asterisk recently added official support for them. Just
like the calling feature in Gmail, an Asterisk server can connect to Google and
send/receive calls.&lt;/p&gt;

&lt;h2&gt;Chef Cookbook&lt;/h2&gt;

&lt;p&gt;I've written a &lt;a href=&quot;https://github.com/peplin/asterisk-cookbook&quot;&gt;Chef cookbook&lt;/a&gt; that does most of this configuration
automatically. I'll go over the basics of the cookbook if you don't use Chef.&lt;/p&gt;

&lt;h2&gt;Install the Package&lt;/h2&gt;

&lt;p&gt;The best place to get the Asterisk package is the official asterisk.org
repository. Add it to your &lt;code&gt;/etc/apt/sources.list&lt;/code&gt;:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;deb http://packages.asterisk.org/deb natty main
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Then install the PGP key and update your package lists:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sudo apt-key adv --keyserver subkeys.pgp.net --recv-keys 175E41DF
sudo apt-get update
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Finally, install the two packages we need:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;sudo apt-get install asterisk-1.8 asterisk-dahdi
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Configuration&lt;/h2&gt;

&lt;p&gt;There are four configuration files to change:&lt;/p&gt;

&lt;h3&gt;&lt;code&gt;/etc/asterisk/sip.conf&lt;/code&gt;&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Set &lt;code&gt;localnet&lt;/code&gt; to the IP range of your LAN, e.g.
  &lt;code&gt;192.168.1.0/255.255.255.0&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;tcpenable = yes&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;disallow = all&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;allow = ...&lt;/code&gt; for &lt;code&gt;ulaw&lt;/code&gt;, &lt;code&gt;gsm&lt;/code&gt;, &lt;code&gt;ilbc&lt;/code&gt;, and &lt;code&gt;speex&lt;/code&gt;&lt;/li&gt;
&lt;li&gt;&lt;code&gt;externip&lt;/code&gt; must be set to your external (i.e. WAN) IP address&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Finally, in the &lt;code&gt;[authentication]&lt;/code&gt; section, add the details for your user
account:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[DESIRED_ASTERISK_USERNAME]
secret=YOUR_DESIRED_PASSWORD
type=friend
callerid=&quot;Your Name &amp;lt;username&amp;gt;&quot;
host=dynamic
context=outbound
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;&lt;code&gt;/etc/asterisk/extensions.conf&lt;/code&gt;&lt;/h3&gt;

&lt;p&gt;This file sets up all routing paths for getting calls from Google to your user.
This configuration is probably a bit more than is required (and is a mashup of a
few different blog posts), but it's working for me at the moment:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[general]
static=yes
writeprotect=no
clearglobalvars=no

[globals]
CONSOLE=Console/dsp             ; Console interface for demo
IAXINFO=guest                   ; IAXtel username/password
TRUNK=Zap/G2                    ; Trunk interface
TRUNKMSD=1                  ; MSD digits to strip (usually 1 or 0)

[default]
exten =&amp;gt; s,1,Set(CALLERID(name)=${DB(cidname/${CALLERID(num)})})
exten =&amp;gt; s,n,Dial(SIP/YOUR_ASTERISK_USERNAME, 10)
exten =&amp;gt; s,n, Hangup
exten =&amp;gt; YOUR_ASTERISK_USERNAME, 1, Dial(SIP/YOUR_ASTERISK_USERNAME, 10)

[google-in]
exten =&amp;gt; YOUR_ASTERISK_USERNAME, 1, GotoIf(${DB_EXISTS(gv_dialout/channel)}?bridged)
exten =&amp;gt; YOUR_ASTERISK_USERNAME, n, NoOp(Callerid  ${CALLERID(name)})
exten =&amp;gt; YOUR_ASTERISK_USERNAME, n, Set(CALLERID(num)=${SHIFT(CALLERID(name),@)})
exten =&amp;gt; YOUR_ASTERISK_USERNAME, n, Set(CALLERID(name)=${DB(cidname/${CALLERID(num)})})
exten =&amp;gt; YOUR_ASTERISK_USERNAME, n, Dial(SIP/YOUR_ASTERISK_USERNAME, 20, aD(:1))
exten =&amp;gt; YOUR_ASTERISK_USERNAME, n(bridged),Bridge(${DB_DELETE(gv_dialout/channel)}, p)

[outbound]
include =&amp;gt; seven-digit
include =&amp;gt; local-devices
include =&amp;gt; tollfree
include =&amp;gt; talk-gmail-outbound
include =&amp;gt; talk-numeric-outbound
include =&amp;gt; dial-uri

[local-devices]
exten =&amp;gt; YOUR_ASTERISK_EXTENSION_NUM, 1, Dial(SIP/YOUR_ASTERISK_USERNAME, 10)

[tollfree]
exten =&amp;gt; _411, 1, Dial(SIP/18004664411@proxy.ideasip.com,60)
exten =&amp;gt; _1800NXXXXXX,1,Dial(SIP/${EXTEN}@proxy.ideasip.com,60)
exten =&amp;gt; _1888NXXXXXX,1,Dial(SIP/${EXTEN}@proxy.ideasip.com,60)
exten =&amp;gt; _1877NXXXXXX,1,Dial(SIP/${EXTEN}@proxy.ideasip.com,60)
exten =&amp;gt; _1866NXXXXXX,1,Dial(SIP/${EXTEN}@proxy.ideasip.com,60)

[seven-digit]
exten =&amp;gt; _NXXXXXX,1,Set(CALLERID(dnid)=1512${CALLERID(dnid)})
exten =&amp;gt; _NXXXXXX,n,Goto(1512${EXTEN},1)
exten =&amp;gt; _NXXNXXXXXX,1,Set(CALLERID(dnid)=1${CALLERID(dnid)})
exten =&amp;gt; _NXXNXXXXXX,n,Goto(1${EXTEN},1)

[talk-gmail-outbound]
exten =&amp;gt; _[a-z].@gmail.com,1,Dial(gtalk/google/${EXTEN}@gmail.com)
exten =&amp;gt; _[A-Z].@gmail.com,1,Dial(gtalk/google/${EXTEN}@gmail.com)

[talk-numeric-outbound]
exten =&amp;gt; _1XXXXXXXXXX,1,Dial(gtalk/google/+${EXTEN}@voice.google.com)
exten =&amp;gt; _+1XXXXXXXXXX,1,Dial(gtalk/google/+${EXTEN}@voice.google.com)

[dial-uri]
exten =&amp;gt; _[a-z].,1,Dial(SIP/${EXTEN}@${SIPDOMAIN},120,tr)
exten =&amp;gt; _[A-Z].,1,Dial(SIP/${EXTEN}@${SIPDOMAIN},120,tr)
exten =&amp;gt; _X.,1,Dial(SIP/${EXTEN}@${SIPDOMAIN},120,tr)
&lt;/code&gt;&lt;/pre&gt;
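
&lt;p&gt;If the underscore patterns above are unfamiliar: in an Asterisk pattern, X
matches any digit, Z matches 1-9, N matches 2-9, and a period matches one or
more remaining characters. The Python below only illustrates how the
&lt;code&gt;seven-digit&lt;/code&gt; context rewrites numbers; it is not something Asterisk runs. It
ignores bracketed sets like [a-z], and the function names are made up for the
example.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import re

# Asterisk pattern characters and the regex fragments they correspond to.
CLASSES = {"X": "[0-9]", "Z": "[1-9]", "N": "[2-9]", ".": ".+"}


def matches(pattern, number):
    """Check a dialed number against an underscore-prefixed pattern."""
    body = pattern[1:]  # drop the leading underscore
    regex = "".join(CLASSES.get(ch, re.escape(ch)) for ch in body)
    return re.fullmatch(regex, number) is not None


def normalize(number):
    """Mimic the seven-digit context: expand to 1 + area code + number."""
    if matches("_NXXXXXX", number):
        return "1512" + number  # same hard-coded prefix as the config above
    if matches("_NXXNXXXXXX", number):
        return "1" + number
    return number
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For example, &lt;code&gt;normalize(&quot;5551234&quot;)&lt;/code&gt; returns &lt;code&gt;&quot;15125551234&quot;&lt;/code&gt;, which is the
eleven-digit form the &lt;code&gt;talk-numeric-outbound&lt;/code&gt; context expects.&lt;/p&gt;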

&lt;h3&gt;&lt;code&gt;/etc/asterisk/jabber.conf&lt;/code&gt;&lt;/h3&gt;

&lt;p&gt;This file configures the authentication details for your Google account (Jabber
is the protocol used by GTalk, and it's technically how you log in). This file
contains your Google password, so make sure it's only readable by your user or
root.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[general]
autoregister=yes

[google]
type=client
serverhost=talk.google.com
username=YOUR_GOOGLE_ACCOUNT@gmail.com/Talk
secret=YOUR_GOOGLE_PASSWORD
port=5222
statusmessage=Asterisk Server - Not a Human
status=xaway
&lt;/code&gt;&lt;/pre&gt;

&lt;h3&gt;&lt;code&gt;/etc/asterisk/gtalk.conf&lt;/code&gt;&lt;/h3&gt;

&lt;p&gt;Finally, this file links together your Google account (defined previously in the
&lt;code&gt;jabber.conf&lt;/code&gt; file) with Google Voice.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;[general]
context=google-in       ; Context to dump call into
allowguest=yes

[guest]         ; special account for options on guest account
disallow=all
allow=ulaw

[gtalk]
username=YOUR_GOOGLE_USER@gmail.com
disallow=all
allow=ulaw
context=google-in
connection=google
&lt;/code&gt;&lt;/pre&gt;

&lt;h2&gt;Google Voice Configuration&lt;/h2&gt;

&lt;p&gt;The last step is to enable Google Talk as an extension in your Google Voice
account. This should be pretty straightforward.&lt;/p&gt;

&lt;h2&gt;Caveats&lt;/h2&gt;

&lt;p&gt;There are a few known issues with this setup.&lt;/p&gt;

&lt;p&gt;The Google account you use will appear logged into Google Talk anytime the
server is running. Messages sent to the server will just get lost in the ether.
I set up a separate Google account with my Voice number so this wouldn't be an
issue. If you already have an established Voice number that you don't want to
change, another option is to create a second Google account for chat only -
although then you lose the ability to use the inlined chat in Gmail. If anyone
has a better solution for this, I'd love to know about it.&lt;/p&gt;

&lt;p&gt;Incoming calls are somewhat unreliable. Sometimes the phone will ring, but after
picking it up you just get dead air. My cell phone keeps ringing when this
happens so I'm still able to pick up the call. It's annoying, however, and I'd
love to find the root cause.&lt;/p&gt;

&lt;p&gt;I suspect that Google's requirement that you hit &quot;1&quot; when accepting a call via
SIP is the problem. This Asterisk config attempts to do that automatically for
you, but it took me quite a while to nail down that process and it's most likely
still not working. Sometimes jabbing &quot;1&quot; when I get dead air seems to wake it
up, but not every time.&lt;/p&gt;

&lt;p&gt;Otherwise, outgoing calls work quite well. The only issue I've ever had is if
I'm using all of my upstream bandwidth doing something else, but that's not
unique to Asterisk. Some QoS controls in your router should help out with that.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>OAuth for Privacy</title>
   <link href="http://christopherpeplin.com/2011/05/oauth-privacy"/>
   <updated>2011-05-29T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/oauth-privacy</id>
   <content type="html">&lt;p&gt;&lt;em&gt;I wrote this paper for Professor Lorrie Cranor's Privacy Policy, Law &amp;amp;
Technology course at Carnegie Mellon University. A &lt;a href=&quot;http://things.rhubarbtech.com/pdf/oauth-privacy-report.pdf&quot;&gt;PDF&lt;/a&gt; version is available.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Abstract&lt;/h2&gt;

&lt;p&gt;The rise of Facebook, Twitter and their associated ecosystems of third-party
applications was accompanied by a new authentication protocol, OAuth. The
protocol enables single sign-on (SSO) as well as limited information sharing
between web services at the behest of users without revealing credentials. It
also provides an opportune hook for confirming privacy policies with users
during authentication, but because of the difficulties of enforcement, an
extended OAuth is no more or less effective than other efforts to increase user
awareness. The most effective approach will be the education of web developers
and information platform operators.&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;The era of Web 2.0 may have reached buzz-word status, but the changes in
Internet-enabled applications it ushered in have had an enormous impact on
the quantity and flow of personal information online. With web applications,
more user activities are done entirely online, and the services are becoming
increasingly interconnected. The rise of Facebook, Twitter and their associated
ecosystems of third-party applications was accompanied by a new authentication
protocol, &lt;a href=&quot;http://oauth.net&quot;&gt;OAuth&lt;/a&gt;. OAuth enables single sign-on (SSO) as well as limited
information sharing between web services at the behest of users without
revealing credentials. Unlike the last major SSO contender, &lt;a href=&quot;http://openid.net/&quot;&gt;OpenID&lt;/a&gt;, OAuth
has gained widespread adoption. For example, applications taking advantage of
Facebook's OAuth access include the New York Times and Farmville.&lt;/p&gt;

&lt;p&gt;The procedure to grant a third-party client access to data stored in a
centralized personal information store, e.g. Facebook, is now greatly simplified
and users are comfortable with the process. The ease and familiarity means that
more people are sharing more data with more sites, and understanding the privacy
policies of each party involved is even more of a daunting task.&lt;/p&gt;

&lt;p&gt;Authorizing access to a resource through OAuth is inherently different than
visiting a single website with a browser. The authorization is done under the
banner of an already trusted website. The OAuth process provides an opportunity
to better inform the user of the privacy they can expect for their data and to
warn them when data leaves the safety of the host application. One of the
hurdles for the standard privacy enhancing protocol P3P was that it required
adding an additional step to a user's workflow: setting up privacy policy
preferences, paying attention to a warning icon at each new website, using a
different search engine, etc. Instead, P3P-style privacy negotiation can be
bundled with authentication and shown to the user on the same OAuth permission
page they expect. This paper describes one such possible extension to the OAuth
specification.&lt;/p&gt;

&lt;p&gt;OAuth provides an opportune hook for confirming privacy policies with users
during authentication, but because of the difficulties of enforcement, an
extended OAuth is no more or less effective than other efforts to increase user
awareness. The most effective approach will be the education of web developers
and information platform operators.&lt;/p&gt;

&lt;h2&gt;Definitions&lt;/h2&gt;

&lt;p&gt;This project intentionally uses some of the same language as the
&lt;a href=&quot;http://tools.ietf.org/html/draft-ietf-oauth-v2-10&quot;&gt;OAuth2 specification&lt;/a&gt;. The published definitions are provided verbatim, with some
additional comments for clarity.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;protected resource&lt;/strong&gt; &quot;An access-restricted resource which can be obtained
  using an OAuth-authenticated request.&quot; E.g. an e-mail address or phone
  number.&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;resource server&lt;/strong&gt; &quot;A server capable of accepting and responding to protected
  resource requests.&quot; E.g. Facebook or Twitter.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;client&lt;/strong&gt;  &quot;An application obtaining authorization and making protected
  resource requests.&quot; E.g. a third-party such as the New York Times or
  Zynga.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;resource owner&lt;/strong&gt; &quot;An entity capable of granting access to a protected
  resource.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;end-user&lt;/strong&gt; &quot;A human resource owner.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;token&lt;/strong&gt; &quot;A string representing an access authorization issued to the client.
  The string is usually opaque to the client. Tokens represent specific scopes
  and durations of access, granted by the resource owner, and enforced by the
  resource server and authorization servers.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;access token&lt;/strong&gt; &quot;A token used by the client to make authenticated requests on
  behalf of the resource owner.&quot;&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;authorization server&lt;/strong&gt; &quot;A server capable of issuing tokens after
  successfully authenticating the resource owner and obtaining authorization.
  The authorization server may be the same server as the resource server, or a
  separate entity.&quot; E.g. Facebook or Twitter, which act as both resource
  servers and authorization servers.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Web Services&lt;/h2&gt;

&lt;p&gt;A web service, loosely defined, is an application hosted on the Internet that
exposes its functionality and data not only through a graphical user interface
in a browser, but through a computer-readable interface (commonly referred to as
an application programming interface or API). A web application with an API can
be combined with other data sources by developers to create unique combinations
of features or entirely new applications based on data previously collected or
processed. For example, third-parties can offer additional insight into a user's
social network by tapping into the connections they've already made in an
application like Facebook. Such data sharing significantly lowers the barrier to
entry for new applications, which can now piggyback off of the success of other
platforms instead of building an entirely new user base. Users appreciate not
needing to sign up for as many accounts, and developers appreciate the deferring
of authentication work to other servers.&lt;/p&gt;

&lt;p&gt;As users begin connecting the applications they use together, they have started
thinking (often subconsciously) about their online identity. Users are concerned
with who has access to their identity and to which elements of their personal
information and activities online. Application developers continue to need a way
to uniquely identify users to offer services, and of course want to share the
work of authentication if possible. The interests of the two parties can often
seem at odds.&lt;/p&gt;

&lt;h3&gt;Identity Management&lt;/h3&gt;

&lt;p&gt;Identity management is the process or system by which users control the contents
of and access to the pieces of their digital identity. There are two approaches
to identity management - user-centric and organization-centric.&lt;/p&gt;

&lt;p&gt;Organization-centric management is the approach taken by many companies today,
where user data is increasingly linked with a unique identifier to try and
create relationships between an individual's accounts in different business
entities.&lt;/p&gt;

&lt;p&gt;User-centric management gives the individual more control. This system avoids
using globally unique identifiers for users, instead creating service-specific
identifiers that uniquely identify a user within an area but are insufficient
for cross-referencing between databases. In some proposed systems, access
control resides in a &quot;&lt;a href=&quot;http://www.ico.gov.uk/upload/documents/library/data_protection/detailed_specialist_guides/edentity_hp_idm_paper_for_web.pdf&quot;&gt;permission hub&lt;/a&gt;,&quot; where a user identifies and
authenticates, and third-party clients connect to request access to various
resources. While not conceived as such, OAuth's capabilities and usage
patterns are in some ways an embodiment of this idea. Users are consolidating
their personal information into a few large data silos (e.g. Facebook), using
OAuth to control access.&lt;/p&gt;
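
&lt;p&gt;One way to picture service-specific identifiers is as a keyed hash computed by
the permission hub. This is purely an illustration: none of the systems
discussed here are claimed to work this way, and &lt;code&gt;hub_secret&lt;/code&gt; and
&lt;code&gt;service_identifier&lt;/code&gt; are hypothetical names.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;import hashlib
import hmac


def service_identifier(hub_secret, user_id, service):
    """Derive an identifier that is stable per service but differs across them.

    hub_secret is a bytes key known only to the hub, so two services
    cannot link their identifiers for the same user by comparing them.
    """
    message = (service + ":" + user_id).encode("utf-8")
    return hmac.new(hub_secret, message, hashlib.sha256).hexdigest()
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The same user gets the same identifier every time a given service asks, but a
different one at every other service, so the values are of no use for joining
two services' databases.&lt;/p&gt;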

&lt;p&gt;A further evolution of user-centric identity management is the
&lt;a href=&quot;http://wiki.eclipse.org/Personal_Data_Store_Overview&quot;&gt;Personal Data Store&lt;/a&gt; (PDS). A PDS is a server - either a user's own computer
or a hosted solution - that stores personal information and releases it as
needed to third-parties with the user's consent. It combines personal
information storage, authentication, validation and permissions into one system.
This idea shares many features with a &quot;permission hub,&quot; and thus the existing
social networks.&lt;/p&gt;

&lt;p&gt;The PDS highlights some security and privacy concerns with data consolidation,
especially considering that previously distributed, uncorrelated information is
now centralized in a single server and thus a single, vulnerable target. An
analysis of the benefits and risks of this approach to identity management is
beyond the scope of this project. However, considering the success of Facebook
(arguably a proprietary PDS), it is reasonable to assume that for many, these
concerns aren't a top priority or that the benefits of such a system outweigh
the risks.&lt;/p&gt;

&lt;p&gt;The goal of this project is to find a way to provide a basic level of privacy
protection and notification to users of these systems while minimizing any
additional burden on users and software developers. The success of these
so-called &quot;software as a service&quot; applications depends on the availability of a
secure but extremely simple method of authenticating and sharing user
information between services. OAuth fostered innovation by lowering the barrier
to entry for developers to link data and for users to try out new tools, and any
additional privacy controls cannot stand in the way of this feature without
risking alienating the market.&lt;/p&gt;

&lt;h3&gt;Personal Value in Linking Data&lt;/h3&gt;

&lt;p&gt;Beyond single sign-on, OAuth was the first user-friendly protocol for linking
accounts together for the purposes of sharing data. The process is initiated by
the user, so there is likely some apparent benefit for them to do so. This is
not a situation where the resource server is offering user data to third parties
for profit or marketing. The rate of user adoption indicates there is great
value to consumers in being able to link their data between services.&lt;/p&gt;

&lt;p&gt;Conversely, OpenID has had limited success in gaining popular traction. OAuth
and OpenID undoubtedly have different goals. OpenID providers serve primarily
for identity management - the user's OpenID URL is their identity online. OAuth,
by contrast, was created to solve security concerns when delegating access to a
user's account with a specific service. Instead of receiving a username and
password, third parties can use an access token with limited capabilities that
can be revoked at will by the user without having to change their password. For
better or worse, OAuth has emerged as the leading protocol in both realms. Web
developers found OAuth to be easier to implement than OpenID, and so began using
it as their primary SSO tool.&lt;/p&gt;

&lt;p&gt;Debating the merits of relying on a limited number of proprietary OAuth servers
(e.g. Facebook and Twitter) instead of an unrestricted set of OpenID providers
is beyond the scope of this project, but again, we must recognize the higher
acceptance of OAuth among users.&lt;/p&gt;

&lt;h3&gt;API Keys&lt;/h3&gt;

&lt;p&gt;Before OAuth, API keys were the most popular way of authenticating requests to
or between web services. These are simple persistent access tokens generated
once and appended to each HTTP request, which uniquely authenticates and
authorizes a user for that request. Their implementation is generally insecure,
as requests made over standard HTTP (and not SSL-encrypted HTTPS) are sent in
the clear and the token is susceptible to snooping. More importantly, they are
too cumbersome for end users, who can't be bothered to copy and paste long
strings from application to application.&lt;/p&gt;
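
&lt;p&gt;As a rough illustration of the pattern described above, the sketch below appends
a persistent key to a request URL. The endpoint, the &lt;code&gt;api_key&lt;/code&gt; parameter name
and the key value are all hypothetical; real services name and place the key
differently.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def with_api_key(base_url, params, api_key):
    """Append the request parameters and the API key as a query string."""
    query = dict(params)
    query['api_key'] = api_key  # hypothetical parameter name
    return base_url + '?' + urlencode(query)


# The same persistent key travels with every request.
url = with_api_key('http://api.example.com/photos', {'user': 'alice'},
                   'k3y-0123456789abcdef')

# Over plain HTTP, anyone observing the request can read the key back out.
observed_key = parse_qs(urlsplit(url).query)['api_key'][0]
```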

&lt;p&gt;OAuth adds an extra layer of security by default (by using signed authentication
requests in the original specification, and requiring SSL encryption in the next
iteration), and pushes the exchange of access tokens down from the user layer to
the server layer. The end result is the same - the third party has some sort of
authentication token that allows them to make requests on behalf of the user -
but the details of the process are transparent to the user.&lt;/p&gt;

&lt;h3&gt;Single Sign-On vs. Access Delegation&lt;/h3&gt;

&lt;p&gt;The use of OAuth for single sign-on can unfortunately lead to data creep, as
third parties take advantage of the protocol to gather data from users. Users
are so comfortable with the OAuth process that they may not fully review each
request for access. An application that only uses the protocol for
authentication may be requesting access to much more data than is required to
perform its business function.&lt;/p&gt;

&lt;p&gt;The client application does not always have malicious intent. Hoping to avoid
having to reacquire access in the future (and thus bother their users), and
also to give themselves as much flexibility as possible, developers have a
tendency to ask for complete access to a user's information even if a subset
would be sufficient.&lt;/p&gt;

&lt;p&gt;Furthermore, there is no standard granularity (either in the specification or in
common practice) of control that a user has over the access granted to clients.
Some data stores grant access on a per-item basis, while others offer only blunt
&quot;read&quot; and &quot;read and write&quot; options. Requests for authentication have a wide
range of requirements. Linking to personal information, or even identifying an
individual, is often unnecessary. The primary risks of broader application of
authentication include covert identification, excessive use and excessive
aggregation. The authentication protocol should be careful not to encourage
excessive data exposure.&lt;/p&gt;

&lt;h2&gt;State of Identification, Authorization &amp;amp; Privacy&lt;/h2&gt;

&lt;h3&gt;Privacy Policy Troubles&lt;/h3&gt;

&lt;p&gt;The protections offered to users by first-party information stores are not often
made clear during the OAuth authorization workflow. When a user submits personal
information to a trusted website, they expect the site to follow its stated
privacy policy. It is not clear that they can expect the same level of
protection from third-parties to which they grant access. OAuth offers no
opinion on the responsibility of clients to abide by the resource server's
privacy policy. Furthermore, there is no requirement in the specification for
the resource server to verify compliance by the client.&lt;/p&gt;

&lt;p&gt;In practice, the protections offered to users differ on a site-to-site basis.
This is no different than without OAuth (where users still must expect different
privacy policies on different websites), but the addition of almost trivial data
sharing between companies stacks the deck against users. One may say that
granting access via OAuth is sufficient consent, but the current user interfaces
do not sufficiently explain the risks and conditions. The result is a clean,
simple interface for sharing information which encourages users to share more
and give up privacy in exchange for promised improvements in service. This
protocol could be more mindful of users without significantly complicating the
process.&lt;/p&gt;

&lt;h3&gt;Pragmatic Privacy Enhancement&lt;/h3&gt;

&lt;p&gt;Ideas for an overhaul of identity management have been brewing over a period of
years, while users and developers are moving ahead with whatever technology is
most accessible. It is in the interest of all parties to make smaller,
incremental improvements to existing technologies to improve user privacy
instead of focusing solely on ideal, complete systems.&lt;/p&gt;

&lt;p&gt;In a &lt;a href=&quot;http://www.youtube.com/watch?gl=US&amp;amp;hl=uk&amp;amp;v=RrpajcAgR1E&quot;&gt;keynote address&lt;/a&gt; at the Identity 2.0 conference, Dick Hardt questioned
which sector will spearhead such an identity overhaul. The success of OAuth
proves the power of small companies and individual developers in shaping the
technologies used online. Some of the most technically up-to-date,
standards-compliant and accessible websites are from these small players, not
the government, banks or large corporations (who are in fact notoriously behind
the curve online). Many of the disruptive software technologies of the past five
years have started as grassroots efforts led by developers and not the result of
any corporate-backed strategy.&lt;/p&gt;

&lt;h3&gt;In Practice&lt;/h3&gt;

&lt;p&gt;There are many projects in the identity management space that attempt (or have
the capability) to integrate at least one aspect of privacy control with single
sign-on - fine-grained information release, permissions delegation, personal
information storage, etc. OAuth is a slim protocol that has only one parameter
specifically aimed at access control - scope.&lt;/p&gt;

&lt;p&gt;When a client makes a request for a resource on behalf of the resource owner, it
can optionally set the value of the scope parameter. This parameter is a
space-separated, unordered list of values that describe the types of data or
permissions the application is requesting. The OAuth specification leaves the
details of the possible values up to the authorizing and resource servers,
meaning that each OAuth provider has a different set of permitted scope values
and valid combinations. For example, Facebook's implementation of the OAuth2
draft describes a set of &quot;&lt;a href=&quot;http://developers.facebook.com/docs/authentication/permissions&quot;&gt;extended permissions&lt;/a&gt;&quot; that can be requested via
the scope parameter. &lt;a href=&quot;http://wiki.developer.myspace.com/index.php?title=Extended_Permissions&quot;&gt;MySpace&lt;/a&gt; made their own proprietary modifications to the
original OAuth protocol to implement a similar feature.&lt;/p&gt;
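
&lt;p&gt;A minimal sketch of such a request, assuming the OAuth2 draft's
&lt;code&gt;response_type&lt;/code&gt;, &lt;code&gt;client_id&lt;/code&gt;, &lt;code&gt;redirect_uri&lt;/code&gt; and &lt;code&gt;scope&lt;/code&gt;
parameters. The endpoint URL, client identifier and scope values are
placeholders, since each provider defines its own permitted values.&lt;/p&gt;

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def authorization_url(endpoint, client_id, redirect_uri, scopes):
    """Build an authorization request carrying a space-separated scope list."""
    params = {
        'response_type': 'code',
        'client_id': client_id,
        'redirect_uri': redirect_uri,
        # Order carries no meaning, so duplicates are dropped and values sorted.
        'scope': ' '.join(sorted(set(scopes))),
    }
    return endpoint + '?' + urlencode(params)


def parse_scope(value):
    """Read a scope string back as an unordered set of values."""
    return set(value.split())


url = authorization_url(
    'https://provider.example.com/oauth/authorize',  # hypothetical endpoint
    'client-1',
    'https://client.example.com/callback',
    ['friends', 'email'],  # hypothetical scope values
)
requested = parse_scope(parse_qs(urlsplit(url).query)['scope'][0])
```

&lt;p&gt;Parsing the scope into a set on the server side reflects the fact that the list is
unordered: &lt;code&gt;friends email&lt;/code&gt; and &lt;code&gt;email friends&lt;/code&gt; describe the same request.&lt;/p&gt;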

&lt;p&gt;Typically, the user will see a human-readable description of the scope requested
on the authorization screen. If they accept, the scope is stored alongside the
access token that is generated - together, a contract between the two
applications. The client's subsequent requests for protected resources with this
access token can only access the data or perform the actions described by the
stored scope. There is no way for the third-party to access a resource to which
the user did not specifically grant access (unless there are security holes in
the resource server).&lt;/p&gt;
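
&lt;p&gt;On the resource server, this contract could be enforced along the following lines.
This is only a sketch: the in-memory dictionary stands in for whatever token
store a real provider uses, and the token and scope values are made up.&lt;/p&gt;

```python
# Access token -> set of scope values the user approved (in-memory stand-in).
TOKENS = {}


def store_grant(token, scope):
    """Record the approved scope alongside the newly issued access token."""
    TOKENS[token] = set(scope.split())


def is_allowed(token, required_scope):
    """Allow a request only if the token exists and its scope covers it."""
    granted = TOKENS.get(token)
    if granted is None:
        return False
    return required_scope in granted


store_grant('token-123', 'email friends')
```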

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/oauth/fb.png&quot; alt=&quot;Facebook OAuth&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is a screenshot of the OAuth access grant for the New York Times Facebook
application. Facebook informs users of the resources requested by the
application, but does not allow fine-grained control. The page includes a
subdued notice that use of the data is subject to the application's privacy
policy, not Facebook's.&lt;/p&gt;

&lt;p&gt;In common practice, the scope describes only what can be accessed, not what can
be done with the information after it has been shared. There is no notice for a
user that by granting access to their e-mail address (stored with the content
provider), they may be unintentionally providing it to advertisers as well.&lt;/p&gt;

&lt;p&gt;Two examples of OAuth authorization pages, Facebook and MySpace, illustrate the
lack of standardization for informing users of data exposure. MySpace includes a
descriptive notice in small, light gray text at the bottom of the screen: &quot;The
service you are linking to is not provided by MySpace. If you choose to link to
this service, it may share your data in accordance with the privacy policy of
and your privacy settings on the linked service.&quot;&lt;/p&gt;

&lt;p&gt;It also includes a note about and a direct link to the account page for revoking
access for previously authorized applications:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;To revoke access to this linked service and for more information visit the
Sync section of your MySpace account.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/oauth/myspace.png&quot; alt=&quot;MySpace OAuth&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is a screenshot of the extended permissions on the MySpace OAuth access
grant page. MySpace added proprietary permissions parameters to the original
&lt;a href=&quot;http://oauth.net/core/1.0/&quot;&gt;OAuth specification&lt;/a&gt;, as well as OpenID. Users can choose whether or not to grant
permission for each resource individually with check boxes. Similar to Facebook,
the page includes a subdued (but more descriptive) notice that the shared data
is subject to a new privacy policy.&lt;/p&gt;

&lt;p&gt;The authorization page for Facebook applications provides much less information.
If the client application provided one, Facebook will display a link to a
privacy policy with a note about possible extended use. In the case of the New
York Times Facebook client:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;Use of this data is subject to the The New York Times Privacy Policy&quot;.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Again, the text is small, gray and likely to be missed by users. Compared to
MySpace, this version at least provides a link to the client application's
website. Disturbingly, however, if the application omits a privacy policy link
from its application settings, not only is no link displayed but there is no
longer any mention of possible extended data use. In this case, it would be
reasonable (but dangerously incorrect) for a user to assume that their data is
still covered by the Facebook privacy policy, and at no greater risk of exposure
after authorizing the application for access.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/oauth/fb-noprivacy.png&quot; alt=&quot;Facebook OAuth 2&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is especially important since once a user's data is transferred, it could
be stored indefinitely on the third-party servers. By granting OAuth access to
their Facebook profile, for example, a user could allow their entire social
network history to be replicated on an insecure server with no policy for data
protection or breach handling.&lt;/p&gt;

&lt;p&gt;Most application developers implicitly agree to platform policies written by the
resource servers when an application is first created. The policies range in
their limitations, but for example, Facebook's developer agreement requires that
data not be sold to advertisers. The privacy policy and developer policy are
typically not in the same location on the resource server's website, and
end-users are not expected to navigate to the developer area.&lt;/p&gt;

&lt;h2&gt;Proposed OAuth Extension&lt;/h2&gt;

&lt;p&gt;OAuth is an interesting candidate for integrating privacy controls because it is
an open standard and, among popular authentication systems, a relatively
simple protocol. OAuth is also a standard in flux - the next version of the
standard, &lt;a href=&quot;http://tools.ietf.org/html/draft-ietf-oauth-v2-10&quot;&gt;OAuth2&lt;/a&gt;, is currently being drafted. The draft is
already implemented by Facebook, while the &lt;a href=&quot;http://oauth.net/core/1.0/&quot;&gt;first version&lt;/a&gt;
is in use elsewhere. The standard is at a late enough stage that such privacy
extensions will not likely be incorporated, but the activity does suggest that
the market for an authentication standard is active and willing to adapt
quickly.&lt;/p&gt;

&lt;p&gt;No existing authentication system with wide deployment has ever met all of the
criteria for a complete identity management system. Rather than trying to build
a complete system from scratch, or even modify an existing one to cover all
areas, OAuth should be extended only in ways that are most natural. One problem
with privacy enhancing technologies is that they typically add to or change the
workflow of a user. This approach instead augments something they are already
used to doing (OAuth) with privacy controls.&lt;/p&gt;

&lt;p&gt;The proposed extension adds user control of and consent to
information release and an element of minimal disclosure (pseudonyms). The goal
of the extension is not to implement all aspects of identity management, as
others have tried in the past, but to embolden OAuth to become a partner in a
mixed &quot;&lt;a href=&quot;http://www.identityblog.com/stories/2004/12/09/thelaws.html&quot;&gt;identity metasystem&lt;/a&gt;&quot;. Compared to other
existing or proposed systems, the extended OAuth specification does not attempt
to be as feature-complete or secure. It represents a pragmatic approach to
identity management, an attempt to create an incremental improvement on the way
to a better system.&lt;/p&gt;

&lt;p&gt;Viewing an OAuth access request as a pseudo-&lt;a href=&quot;http://www.w3.org/P3P/&quot;&gt;P3P&lt;/a&gt; policy, the protocol is
currently missing the &quot;usage&quot; section which would define the intended use of the
data. OAuth can be combined with P3P-style privacy summaries to allow users to
simultaneously authenticate and approve privacy policies. The extension
registers a usage parameter (similar to scope) that uses P3P compact policies to
describe how the information will be used. Additionally, the specification
recommends that only a pseudonym is exposed to the client by default, unless
more detailed identification is specifically requested.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parameter Name&lt;/strong&gt; &lt;code&gt;usage&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Parameter Usage Location&lt;/strong&gt; The end-user authorization endpoint request, the
end-user authorization endpoint response, the token endpoint request, the
token endpoint response, and the &quot;WWW-Authenticate&quot; header field.&lt;/p&gt;

&lt;h3&gt;Access Grant&lt;/h3&gt;

&lt;p&gt;When a client is requesting an access grant (the process of prompting a user to
grant permissions) they can optionally provide the usage parameter.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;usage&lt;/code&gt; - The intended use of the data in the scope of the access request
expressed as a P3P compact policy. This parameter is optional.&lt;/p&gt;

&lt;h4&gt;Errors&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;invalid_usage&lt;/code&gt; - The requested usage is invalid, unknown, or malformed.&lt;/p&gt;
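
&lt;p&gt;A sketch of how a client might attach the optional &lt;code&gt;usage&lt;/code&gt; parameter to an
access grant request, and how a server might reject a malformed value. The
check here is an assumption made for illustration: it only tests that each
token looks like a P3P compact policy token (three capital letters with an
optional &lt;code&gt;a&lt;/code&gt;, &lt;code&gt;i&lt;/code&gt; or &lt;code&gt;o&lt;/code&gt; suffix), not that the token is actually defined by
P3P. The endpoint and client values are placeholders.&lt;/p&gt;

```python
import re
from urllib.parse import urlencode

# Shape of a compact policy token, e.g. ALL, OUR, CURa (assumed form check only).
TOKEN_FORM = re.compile(r'[A-Z]{3}[aio]?')


def parse_usage(value):
    """Split a usage string into tokens, raising invalid_usage if malformed."""
    tokens = value.split()
    if not tokens or not all(TOKEN_FORM.fullmatch(t) for t in tokens):
        raise ValueError('invalid_usage')
    return frozenset(tokens)


def access_grant_url(endpoint, client_id, redirect_uri, scope, usage=None):
    """Build an access grant request; the usage parameter is optional."""
    params = {
        'response_type': 'code',
        'client_id': client_id,
        'redirect_uri': redirect_uri,
        'scope': scope,
    }
    if usage is not None:
        parse_usage(usage)  # fail early on a malformed policy
        params['usage'] = usage
    return endpoint + '?' + urlencode(params)


url = access_grant_url(
    'https://provider.example.com/oauth/authorize',  # hypothetical endpoint
    'client-1',
    'https://client.example.com/callback',
    'email',
    usage='ALL DSP CURa OUR NOR',
)
```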

&lt;h3&gt;Access Token&lt;/h3&gt;

&lt;p&gt;After the access grant, the client can request an access token.&lt;/p&gt;

&lt;h4&gt;Request&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;usage&lt;/code&gt; - The intended use of the data in the scope of the access request expressed
as a P3P compact policy. If the access grant being used already represents an
approved usage, the requested usage MUST be equal or lesser than the usage
previously granted. This parameter is optional.&lt;/p&gt;

&lt;h4&gt;Response&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;usage&lt;/code&gt; - The allowed use of the data in the scope of the access request expressed
as a P3P compact policy. The authorization server SHOULD include the parameter
if the allowed usage is different from the one requested by the client. This
parameter is optional.&lt;/p&gt;

&lt;h4&gt;Errors&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;invalid_usage&lt;/code&gt; - The requested usage is invalid, unknown, malformed, or exceeds the
previously granted usage.&lt;/p&gt;
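
&lt;p&gt;The token endpoint rule could be sketched as below. Treating &quot;equal or lesser&quot; as
&quot;the requested tokens are a subset of the granted tokens&quot; is a simplifying
assumption of this example; the extension does not define an ordering over
compact policies, and a real comparison would need to account for the meaning
of each P3P element.&lt;/p&gt;

```python
def check_token_usage(requested, granted):
    """Return the usage to attach to the token, or raise invalid_usage.

    Both arguments are space-separated compact policy strings. The request is
    accepted only if it asks for nothing beyond the previously granted usage.
    """
    requested_set = set(requested.split())
    granted_set = set(granted.split())
    excess = requested_set - granted_set
    if excess:
        raise ValueError('invalid_usage: exceeds granted usage: '
                         + ' '.join(sorted(excess)))
    return ' '.join(sorted(requested_set))


allowed = check_token_usage('OUR CURa', 'CURa IVAa OUR')
```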

&lt;h3&gt;Sample P3P Compact Policy&lt;/h3&gt;

&lt;p&gt;A social network user may have the following privacy preferences, expressed and
stored as a P3P compact policy:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ALL DSP CURa ADMa DEVa IVAa IVDa OUR NOR ONL DEM CNT PRE
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This (optimistic) policy will match applications that use the data requested
only for completing the current action (e.g. searching a friends list),
individual analysis or individual decision making (e.g. personalized service
based on a social network profile). The only recipient of the data can be the
client application itself, and the data cannot be retained beyond an active user
session (meaning that the application must re-request the information it
requires from the resource server each time the user logs in). The user must be
able to access all information stored with the application about themselves.
Finally, the application must have a dispute resolution plan.&lt;/p&gt;

&lt;p&gt;If the user begins the authorization process for an application with the
following non-compliant policy, they will be warned of the specific differences:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;ALL DSP CURa ADMa DEVa IVAa IVDa TEL OUR NOR ONL DEM CNT PRE
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;In this case, the application plans to use the data for telemarketing -
something the user's privacy preferences do not allow. Technically, P3P compact
policies are only intended for use with HTTP cookies. Using them here is a
slight stretch of the P3P specification, but it reflects their spirit. One
limitation of compact policies (addressed in the
&lt;a href=&quot;http://www.w3.org/TR/P3P11&quot;&gt;P3P 1.1 draft specification&lt;/a&gt;) is the lack of granularity - the usage
described in the policy applies to all data collected from the user. The 1.1
specification introduced compact statements, which group together a set of
compact policy elements to describe one or more types of data. This allows the
application or user to specify different intended usage for each type of data
mentioned in the scope parameter, a reasonable request for both parties when
considering the wide range of information shared via OAuth (from first name to
geotagged photographs).&lt;/p&gt;
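
&lt;p&gt;The comparison between the two sample policies above can be sketched as a plain
set difference over the compact policy tokens. This is an illustration only:
it treats every token as independent, which ignores the per-element semantics
of P3P.&lt;/p&gt;

```python
USER_PREFERENCES = 'ALL DSP CURa ADMa DEVa IVAa IVDa OUR NOR ONL DEM CNT PRE'
APP_POLICY = 'ALL DSP CURa ADMa DEVa IVAa IVDa TEL OUR NOR ONL DEM CNT PRE'


def policy_differences(user_policy, app_policy):
    """Return (tokens only the application declares, tokens only the user has)."""
    user = set(user_policy.split())
    app = set(app_policy.split())
    return sorted(app - user), sorted(user - app)


# The first list holds what the warning would need to call out to the user.
beyond_preferences, not_declared = policy_differences(USER_PREFERENCES, APP_POLICY)
```

&lt;p&gt;For the two policies shown, the only token the application declares beyond the
user's preferences is &lt;code&gt;TEL&lt;/code&gt;, the telemarketing purpose.&lt;/p&gt;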

&lt;h2&gt;Use Cases&lt;/h2&gt;

&lt;h3&gt;No Privacy Settings&lt;/h3&gt;

&lt;p&gt;If an authorization server does not allow users to predefine their minimum
privacy requirements or the user does not have any set, the site must assume the
highest level of privacy. The user should be prompted on each access grant to
confirm the privacy settings manually.&lt;/p&gt;

&lt;h3&gt;Insufficiently Limited Use&lt;/h3&gt;

&lt;p&gt;If a user has their minimum privacy preferences set at the authorization server
and the client is requesting usage beyond what is allowed by those preferences,
the user should be shown a prominent warning and a description of the specific
discrepancies. The user should be able to manually grant access even in this
case.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/oauth/fb-privacy-BAD.png&quot; alt=&quot;Facebook OAuth Mockup&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This is a mockup of an extended OAuth access page with a privacy policy
mismatch. If the user's privacy settings do not match the usage requested by the
application, they should be warned but allowed to proceed.&lt;/p&gt;

&lt;h3&gt;Sufficiently Limited Use&lt;/h3&gt;

&lt;p&gt;If a user has their minimum privacy preferences set at the authorization server
and the client is requesting usage less than or equal to what is allowed by
those preferences, the user need not be shown anything specific regarding the privacy
settings (beyond perhaps a small icon indicating that there are no issues).
Since the majority of cases will likely fall into this use case, users should
not be constantly bothered with notifications that the privacy settings are
acceptable. They should only be notified when there is an issue. In this case,
the user just needs to decide if they wish to grant access to the specific
application, confident that the third party's privacy policy matches their own
personal privacy requirements.&lt;/p&gt;
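
&lt;p&gt;The three cases described so far could be reduced to one decision on the
authorization server. The sketch below is hypothetical: the labels
&lt;code&gt;confirm&lt;/code&gt;, &lt;code&gt;warn&lt;/code&gt; and &lt;code&gt;quiet&lt;/code&gt; are invented for this example, and
usage is again compared as a simple set of compact policy tokens.&lt;/p&gt;

```python
def grant_screen(user_preferences, requested_usage):
    """Decide what the access grant page shows about privacy.

    user_preferences is the user's stored compact policy, or None when the
    user has none (or the server does not support storing them).
    Returns a (mode, excess_tokens) pair.
    """
    if not user_preferences:
        # No privacy settings: ask the user to confirm manually every time.
        return ('confirm', [])
    allowed = set(user_preferences.split())
    requested = set(requested_usage.split())
    excess = sorted(requested - allowed)
    if excess:
        # Insufficiently limited use: prominent warning, user may still proceed.
        return ('warn', excess)
    # Sufficiently limited use: nothing beyond perhaps a small icon.
    return ('quiet', [])


PREFERENCES = 'ALL DSP CURa ADMa DEVa IVAa IVDa OUR NOR ONL DEM CNT PRE'
```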

&lt;h3&gt;Usage Upgrade&lt;/h3&gt;

&lt;p&gt;If a client has an access token for a user, but wishes to expand the usage of
the user's personal information, it must reacquire the token from the
authorization server, either by notifying the user out-of-band (e.g. by e-mail)
of the new desired usage or by requesting the user's permission at the next
application launch. The reacquired token must carry the new usage settings,
which must be properly stored with the authorization server.&lt;/p&gt;

&lt;h3&gt;Policy Violation&lt;/h3&gt;

&lt;p&gt;There is no technical way to enforce a usage policy. The authorization server
must be vigilant in confirming the practices of its permitted client
applications, and follow up on complaints from its customers. In the case of
a breach of policy, all users of that application (easy to find from the content
provider's access token registry) must be notified and given the option to
continue or terminate their relationship with that client.&lt;/p&gt;
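
&lt;p&gt;Finding the affected users is a simple lookup if the token registry records which
user and client each token belongs to. The registry layout and names below are
assumptions made for the sake of the example.&lt;/p&gt;

```python
def users_to_notify(registry, client_id):
    """Return the users who have granted a token to the given client."""
    return sorted({entry['user'] for entry in registry.values()
                   if entry['client_id'] == client_id})


# Hypothetical access token registry: token -> owner and client application.
REGISTRY = {
    'token-1': {'user': 'alice', 'client_id': 'news-app'},
    'token-2': {'user': 'bob', 'client_id': 'photo-app'},
    'token-3': {'user': 'carol', 'client_id': 'news-app'},
    'token-4': {'user': 'alice', 'client_id': 'news-app'},
}

affected = users_to_notify(REGISTRY, 'news-app')
```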

&lt;h2&gt;Alternatives&lt;/h2&gt;

&lt;p&gt;There are numerous alternative approaches to integrating privacy with
authentication and identity management. This section describes some of the
efforts of other projects, as well as ideas considered and dropped in favor of
the OAuth extension described in this paper.&lt;/p&gt;

&lt;h3&gt;&lt;a href=&quot;http://doi.acm.org/10.1145/1655028.1655036&quot;&gt;Web2ID&lt;/a&gt;&lt;/h3&gt;

&lt;p&gt;Current single sign-on solutions rely on a trusted, centralized identity
provider to authenticate users at various websites. This violates the user's
privacy because the websites they authenticate with are exposed to the identity
provider. One example of a decentralized single sign-on system is &lt;a href=&quot;http://doi.acm.org/10.1145/1655028.1655036&quot;&gt;Web2ID&lt;/a&gt;, an
SSO framework that uses public &amp;amp; private key encryption in place of an identity
provider.&lt;/p&gt;

&lt;p&gt;Tailored specifically for in-browser web service mashups, Web2ID avoids some of
the problems associated with the redirects and page refreshes required by OAuth.
The added complexity is of debatable value for end-users, however. Web2ID also
introduces a broader identity management framework (closer to OpenID) and is not
backwards compatible with any existing protocols.&lt;/p&gt;

&lt;h3&gt;User-centric Federated SSO&lt;/h3&gt;

&lt;p&gt;Centralized SSO also violates user privacy by allowing identity managers and
service providers to exchange and link personal information and web traffic. The
User-centric Federated Single Sign-On System (UFed) adopts the principles of
user-centric identity management to protect privacy. UFed relies on existing
protocols, but not the most widely used ones. It also sacrifices simplicity for
security, something which many service providers do not require and which
developers may resent.&lt;/p&gt;

&lt;h3&gt;P3P&lt;/h3&gt;

&lt;p&gt;The existing machine-readable privacy protocol, P3P, could be used without
modification during the authentication process for clients. The primary use case
for P3P requires the user's browser or other end-user application acting as a
user-agent to access a website's P3P policy. Instead, the authorization server
could act as the user agent and communicate directly with the client
application. Users no longer need to install any additional software for policy
checking and the rollout can be a resource server-led effort.&lt;/p&gt;

&lt;p&gt;Short of implementing dynamic per-user full P3P policies, the authorization
server and client could use standard P3P compact policies in their HTTP headers
when requesting token access. The workflow would be identical to that described
in the Proposed OAuth Extension section, but with the policies sent via HTTP
headers instead of URL parameters.&lt;/p&gt;
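
&lt;p&gt;A rough sketch of the header-based variant, assuming the compact policy travels in
the standard &lt;code&gt;P3P&lt;/code&gt; header as &lt;code&gt;CP=&quot;...&quot;&lt;/code&gt;. Which OAuth requests and
responses would carry the header is not specified here, and the parser ignores
other fields (such as a policy reference) that a real P3P header may contain.&lt;/p&gt;

```python
def p3p_header(policy):
    """Wrap a compact policy string in a P3P HTTP header."""
    return {'P3P': 'CP="' + policy + '"'}


def parse_p3p_header(headers):
    """Extract the compact policy tokens from a P3P header, or None."""
    value = headers.get('P3P', '')
    prefix = 'CP="'
    if not (value.startswith(prefix) and value.endswith('"')):
        return None
    return set(value[len(prefix):-1].split())


headers = p3p_header('ALL DSP CURa OUR NOR')
tokens = parse_p3p_header(headers)
```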

&lt;p&gt;This approach would require the modification of neither the P3P nor the OAuth
specification, so it can be implemented by any resource server immediately. That
said, without being explicitly described in a standard protocol, it must rely on
becoming common practice through other means. Another problem is that the
definition of the scope and usage fields would be less similar in the code. For
example, typical open-source OAuth libraries accept URLs, user IDs and access
tokens as parameters. This approach would require them to also accept the HTTP
request itself, to access the P3P headers. There is an increased chance that the
scope and usage would fall out of sync as their data structures diverge.&lt;/p&gt;

&lt;h3&gt;Identity Metasystem&lt;/h3&gt;

&lt;p&gt;Personal identity on the Internet is based on a fractured landscape of
incompatible systems and has seemingly been on the verge of its next phase (some
say &quot;Identity 2.0&quot;) for multiple years. No single system has been created that
completely satisfies developers and users alike, and those that have come close
never saw widespread deployment. Identity advocates propose an
&quot;&lt;a href=&quot;http://www.identityblog.com/stories/2004/12/09/thelaws.html&quot;&gt;identity metasystem&lt;/a&gt;&quot; that combines the many existing implementations of identity
management into a unified platform.&lt;/p&gt;

&lt;p&gt;The &quot;identity metasystem&quot; attempts to unify the interface for online
authentication for both developers and consumers. The task is not impossible, as
such abstraction layers have been created for other areas of computing (e.g.
video, networking). The inspiration for the system is the
&lt;a href=&quot;http://www.identityblog.com/stories/2004/12/09/thelaws.html&quot;&gt;Laws of Identity&lt;/a&gt;, a set of principles for identity
management set collaboratively by identity advocates. They cover user consent,
disclosure, information user justification and usability.&lt;/p&gt;

&lt;p&gt;The identity metasystem is the most promising of the alternatives, but also one
of the largest in scope. It may eventually mitigate the privacy concerns
described in the Privacy Policy Troubles section, but widespread deployment is
remote compared to the availability of OAuth today.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Despite their similarity in implementation, there is a problematic difference
between the proposed usage parameter and its counterpart scope. If a protected
resource is not in the scope of an access token (e.g. e-mail address), there is
no technical way for the third party to access that data. If the application
doesn't abide by the policy they were granted by the usage parameter, there is
no technical recourse. There is no way for the user to detect that the
information is being misused, and no way to revoke access once it has been
transferred off of the resource server. This fact makes the extended OAuth
specification no better and no worse than existing privacy policy tools when it
comes to protecting users. What it does provide is better notification and a
clearer contractual agreement between parties.&lt;/p&gt;

&lt;p&gt;In the end, any new identity system has to be sold first to developers.
Developers will accept and implement a relatively complicated identification
protocol if and only if the information behind the authentication wall is of
sufficient value. Despite complaining loudly, developers using the Twitter API
all migrated from Basic Authentication to OAuth because the API was compelling
enough to do so. Other service providers with a lower uptake in OAuth usage
suffer from a lack of &lt;a href=&quot;http://hueniverse.com/2010/09/twitter-a-hot-princess-google-an-empty-castle/&quot;&gt;quality offerings&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Without much trouble, OAuth can be extended to incorporate existing approaches
to communicating privacy preferences. Unfortunately, it is susceptible to the
same issues - namely a lack of enforcement and thus a lack of incentive for
client compliance. Unless the resource servers required it (unlikely),
developers would be unlikely to adopt the privacy extension.&lt;/p&gt;

&lt;h3&gt;Recommendations&lt;/h3&gt;

&lt;p&gt;OAuth is not a traditional single sign-on solution in that the identity has
&lt;a href=&quot;http://www.eweek.com/c/a/Security/OAuth-Is-the-New-Hotness-In-Identity-Management-572745/&quot;&gt;stronger ties&lt;/a&gt; to the identity provider. The resource servers &amp;amp; identity
providers should take advantage of this close relationship to make a safer user
experience. Within the bounds of the existing OAuth2 specification, the following
recommendations would improve user notification and expand awareness of the
risks of data exposure:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;In the interest of their customers, resource servers are encouraged to hold
  their client applications to the terms of the host application's privacy
  policy. To maintain the faith of their users, they must aggressively pursue
  those who violate it.&lt;/li&gt;
&lt;li&gt;Resource servers should hold third-party applications to higher standards when
  granting access to the ecosystem. The requirements at registration for
  Facebook and Twitter clients are minimal, with little verification of the
  legitimacy of any proposed application. This has certainly helped with the
  rapid expansion of applications, but at the cost of user security and
  privacy.&lt;/li&gt;
&lt;li&gt;Authorization servers are encouraged to support only SSO by default. A user's
  unique identifier should be considered a protected resource, and a
  pseudonym should be provided in its absence. For example, with a blank scope
  field, Facebook applications can currently access &quot;all public data in a
  user's profile, including her name, profile picture, gender, and friends&quot; as
  well as the user's unique identifier (i.e. Facebook ID). For simple single
  sign-on, absolutely none of this is required - only a valid, permission-less
  access token that sufficiently identifies the user. Clients
  should be required to explicitly request each protected resource in the
  scope parameter beyond this pseudonym, to make it clear to users and
  developers what information is accessible to each party.&lt;/li&gt;
&lt;li&gt;Users should be allowed to modify the valid &lt;a href=&quot;http://sole.dimi.uniud.it/~antonina.dattolo/papers/2009/book/Dattolo-apweb2009pdf#page=120.&quot;&gt;lifetime of access tokens&lt;/a&gt;.
  They should be able to grant single-use or other time-limited access
  tokens instead of the default permanent ones.&lt;/li&gt;
&lt;/ul&gt;
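
&lt;p&gt;The recommendation about explicit scopes can be sketched from the client's side. The following is a minimal illustration, not any provider's actual API: the endpoint URL, the client identifier and the scope names are all hypothetical, and only the standard OAuth2 request parameters are assumed. A request with no scope would yield nothing more than a pseudonymous sign-on token, and anything beyond that has to be named.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical authorization endpoint; a real provider documents its own.
AUTHORIZE_ENDPOINT = "https://auth.example.com/oauth/authorize"


def authorization_url(client_id, redirect_uri, scopes=()):
    """Build an authorization request that names every resource it wants."""
    params = {
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
    }
    if scopes:
        # OAuth2 expresses scope as a space-delimited list of names.
        params["scope"] = " ".join(scopes)
    return AUTHORIZE_ENDPOINT + "?" + urlencode(params)


def requested_scopes(url):
    """List the scopes a request asks for, e.g. to show on a consent screen."""
    query = parse_qs(urlsplit(url).query)
    return query.get("scope", [""])[0].split()
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;An authorization server following this recommendation could pass the result of &lt;code&gt;requested_scopes&lt;/code&gt; straight to its consent screen, so that users see exactly which protected resources a client has asked for.&lt;/p&gt;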


&lt;h2&gt;Offline References&lt;/h2&gt;

&lt;p&gt;The online references for this article are linked inline.&lt;/p&gt;

&lt;p&gt;Wang Bin et al. &quot;Open Identity Management Framework for SaaS Ecosystem.&quot; In:
ICEBE '09: Proceedings of the 2009 IEEE International Conference on e-Business
Engineering. Washington, DC, USA: IEEE Computer Society, 2009, pp. 512–517.
isbn: 978-0-7695-3842-6.&lt;/p&gt;

&lt;p&gt;Stephen Farrell. &quot;API Keys to the Kingdom.&quot; In: IEEE Internet Computing 13.5
(2009), pp. 91–93. issn: 1089-7801.&lt;/p&gt;

&lt;p&gt;Suriadi Suriadi, Ernest Foo, and Audun Josang. &quot;A User-centric Federated Single
Sign-on System.&quot; In: NPC '07: Proceedings of the 2007 IFIP International
Conference on Network and Parallel Computing Workshops. Washington, DC, USA:
IEEE Computer Society, 2007, pp. 99–106. isbn: 0-7695-2943-7.&lt;/p&gt;

&lt;p&gt;Stephen T. Kent and Lynette I. Millett. Who Goes There?: Authentication Through the Lens
of Privacy. Washington, D.C.: National Academies Press, 2003. Chap. 1,2.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Monitoring Civil Infrastructure</title>
   <link href="http://christopherpeplin.com/2011/05/civil-infrastructure-monitoring/"/>
   <updated>2011-05-29T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/infrasructure-monitoring</id>
   <content type="html">&lt;p&gt;&lt;em&gt;A &lt;a href=&quot;http://things.rhubarbtech.com/infrastructure-monitoring/infra-monitoring-report.pdf&quot;&gt;PDF&lt;/a&gt;
of this post is available&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Civil engineers were some of the first to take advantage of the information
revolution around the time of plunging semiconductor prices. As early as the
1970's, major infrastructure projects were wired with real-time monitoring and
control systems to lower costs and improve safety and operating efficiency. This
trend continues today.&lt;/p&gt;

&lt;p&gt;Meanwhile, computer system architecture shifted three major times in these past
40 years. Individual, personal computers gave way to mainframes and thin
clients. The tide shifted back to powerful end-user clients in the 1990's. Now,
the cloud and web services embody somewhat of a throwback to the mainframe era.
Computer networks remained quite small in comparison to some infrastructure
sensor networks, at least until the rise of the Internet as we know it.
Now, some computer systems rival the largest infrastructure projects in size (if
not cost). These systems have some of the same issues with monitoring and
statistics analysis.&lt;/p&gt;

&lt;p&gt;One example of a relatively new distributed computer application is the
massively multiplayer online game. The game publisher Blizzard has a vested
interest in tracking the users of its online game &lt;em&gt;World of Warcraft&lt;/em&gt; for billing,
game balancing and to plan for future expansions. In order to scale the game
world in a reasonable fashion, the developers split up the environment into
thousands of shards. The gameplay statistics must make it back to a centralized
location at Blizzard eventually, but the fractured architecture doesn't lend
itself to a simple solution.&lt;/p&gt;

&lt;p&gt;The data center ecosystem itself is also one massively distributed system,
encompassing thousands of nodes across a diverse geography. In short,
distributed systems are more common than ever and the importance of monitoring,
tracking and accounting hasn't waned.&lt;/p&gt;

&lt;p&gt;These applications are what bring the computer world more in line with the
monitoring situation in civil infrastructure. Civil engineers and government
organizations in charge of projects such as roads, bridges, oil &amp;amp; gas pipelines
and waterways have been struggling with monitoring some of the earliest
distributed systems. These are not distributed in the same computing sense of
the word; they are often entirely offline and unpowered. For decades, their data
has been gathered (often inconsistently) by hand. The engineers tasked with
accounting for trillions of dollars of public assets and physical systems are
dealing with what could be viewed as widespread network unreliability and wholly
unreliable nodes.&lt;/p&gt;

&lt;h2&gt;Infrastructure Monitoring&lt;/h2&gt;

&lt;p&gt;The operators and designers of civil and computing infrastructure have a keen
interest in collecting knowledge of the behavior of the systems. Infrastructure
management is increasingly data-driven, requiring ever more monitoring. It is
also useful for providing a global-level view of the condition of a system. This
can often be a good indicator of when and where a failure occurred, giving
operators prime candidates for further investigation. The three primary
motivations for infrastructure monitoring are asset management, safety and
operations.&lt;/p&gt;

&lt;h4&gt;Asset Management&lt;/h4&gt;

&lt;p&gt;Asset management is an increasingly popular (and in some cases required)
strategy for managing an organization's physical assets. It includes performing
continuous inventory, risk modeling, and life-cycle and condition assessment.
All of these rely heavily on computing for data collection and analysis. A
monitoring system more tightly integrated with asset management tools can give
more accurate predictions - e.g. automatically collected gas measurements in
power transformers can be taken nearly continuously, compared to quarterly or
yearly manual inspections. On the computing side, data centers are becoming
increasingly heterogeneous and asset management is important, albeit to a lesser
degree.&lt;/p&gt;

&lt;h4&gt;Safety&lt;/h4&gt;

&lt;p&gt;Many infrastructure components require constant inspection to ensure safe
operation. Components have different safe operating ranges for various
statistics (temperature, pressure, cycles, etc.) and the more accurate and
up-to-date this information is when it reaches the control center, the better.
There are fewer safety considerations for computer applications, but a parallel
metric is the reliability and robustness of users' data storage.&lt;/p&gt;

&lt;h4&gt;Operations&lt;/h4&gt;

&lt;p&gt;Normal day-to-day operations of some types of infrastructure previously required
many employees on-site to monitor and operate different components. With a
two-way communication system between infrastructure and control room (i.e. one
that sends commands and receives metrics), these jobs can be done more
efficiently with fewer people and with a better sense of the status of the
system as a whole. For example, operators on two ends of a pipeline may have a
difficult time determining the status of an issue occurring in the middle. With
remote monitoring, these parties can deal with the situation all from the same
room. The same goals hold true for data center operations, and especially for
geographically distributed software systems.&lt;/p&gt;

&lt;h3&gt;Sensing&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4563333&quot;&gt;Modern research&lt;/a&gt; prefers wireless sensors over wired. These
systems are harder to disrupt, which is of greater concern when the system is
out in the open and many network hops away from the central office. A widely
distributed wired network involves a significant amount of extra infrastructure,
and damage to the infrastructure being monitored often implies damage to the
monitoring system itself. Wireless systems have the advantage of being easier to
deploy, as they can self-organize into ad-hoc networks (see the figure below) as
long as they are within range of another sensor node. Connectivity problems also
tend to be isolated to individual units, and are easier to troubleshoot.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/infrastructure-monitoring/adhoc.png&quot; alt=&quot;Ad-Hoc&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Wireless sensors along a pipeline form an ad-hoc wireless network and
communicate only with nearby neighbors (&lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4563333&quot;&gt;source&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Recent projects have taken a page (knowingly or not) from tools like the
computer network monitoring application &lt;a href=&quot;http://monitor.millennium.berkeley.edu/&quot;&gt;Ganglia&lt;/a&gt; and now use a unified data
format, regardless of sensor or data type. For example, a
single update could consist of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Type of data (1 byte)&lt;/li&gt;
&lt;li&gt;Geographic coordinates, determined via GPS or inferred via signal
  strength of other nodes&lt;/li&gt;
&lt;li&gt;Network address&lt;/li&gt;
&lt;li&gt;Actual data (4 bytes)&lt;/li&gt;
&lt;/ul&gt;
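
&lt;p&gt;A fixed layout like this is easy to encode and decode. The sketch below is only an illustration: the 1-byte type and 4-byte data widths come from the list above, while the coordinate encoding (two 32-bit floats), the 2-byte network address, the field order and the type code are assumptions made for the example.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
import struct

# Assumed layout (only the type and value widths come from the text):
#   type      : 1 byte, unsigned
#   latitude  : 4-byte float (assumption)
#   longitude : 4-byte float (assumption)
#   address   : 2-byte unsigned node address (assumption)
#   value     : 4-byte float
UPDATE_FORMAT = "!BffHf"  # network byte order, no padding
UPDATE_SIZE = struct.calcsize(UPDATE_FORMAT)

TYPE_PRESSURE = 1  # hypothetical type code


def pack_update(data_type, latitude, longitude, address, value):
    """Encode one sensor update into a fixed-size byte string."""
    return struct.pack(UPDATE_FORMAT, data_type, latitude, longitude, address, value)


def unpack_update(payload):
    """Decode a byte string produced by pack_update into a dict."""
    data_type, latitude, longitude, address, value = struct.unpack(UPDATE_FORMAT, payload)
    return {
        "type": data_type,
        "latitude": latitude,
        "longitude": longitude,
        "address": address,
        "value": value,
    }
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because every update has the same size regardless of sensor type, relay nodes can forward or batch updates without understanding what the data means.&lt;/p&gt;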


&lt;p&gt;These monitoring networks self-organize into a dynamic hierarchy of nodes based
on their placement and capabilities. There are generally three types of nodes
deployed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic sensor node&lt;/li&gt;
&lt;li&gt;Communication relay node - collects data in its 1- or 2-hop neighborhood&lt;/li&gt;
&lt;li&gt;Data discharge node - forwards results to the Network Control Center,
  i.e. the node with a connection to the Internet&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In computer system monitoring, each node generally has similar
capabilities for communication and sensing, whereas the roles of these sensor nodes are bound by
their physical capabilities. Hierarchical organization becomes a simpler problem
of guaranteeing a wide enough dispersal of communication relay and data
discharge nodes to reach all of the levels of the hierarchy, compared to the
somewhat arbitrary trees made in computer networks.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/infrastructure-monitoring/hierarchy.png&quot; alt=&quot;Hierarchy&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;An example hierarchy for a network of sensor nodes (&lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4563333&quot;&gt;source&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A critical feature of these networks is the scalability of their performance:
&quot;[s]uch a dense array must be designed to be scalable, which means that the
system performance does not degrade substantially, or at all, as the number of
components increases&quot; (&lt;a href=&quot;http://onlinelibrary.wiley.com/doi/10.1002/stc.48/abstract&quot;&gt;source&lt;/a&gt;). The fact that each node only
communicates with its closest neighbors is a strong indication that such a
design is scalable to tens of thousands of nodes.&lt;/p&gt;

&lt;h3&gt;Collection &amp;amp; Processing&lt;/h3&gt;

&lt;p&gt;A significant impediment to a monitoring system's success is the amount of
processing required to put the data into a useful form. The amount of data
recorded for infrastructure components can be extensive, reaching multiple
terabytes of data per day, but &quot;[m]uch of such data [is] being collected but not
used because processing is too costly&quot; (&lt;a href=&quot;http://shm.sagepub.com/content/2/3/257.short&quot;&gt;source&lt;/a&gt;). Even with the
recent advances in &quot;massively distributed smart sensors&quot;, infrastructure
managers still lack a general computation framework to explore, experiment with
and analyze the data. Controls software vendors are driven by the requirements
set forth by operators, but these operators' demands are often only responding
to what vendors have given them in the past.&lt;/p&gt;

&lt;p&gt;Data collection, storage, processing and querying is a general enough task that
it could and should be standardized across all areas of infrastructure. Computer
systems have converged around a few simple data formats and protocols to ensure
interoperability and save time by avoiding re-implementing existing features. A
few of these include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The Hypertext Transfer Protocol (HTTP) for exchanging data between
  remote systems, which contributed to the success of the World Wide Web.&lt;/li&gt;
&lt;li&gt;XML and JSON for encoding data into a predictable and quickly
  parseable format, which contributed to the enhanced interactivity of Web
  2.0.&lt;/li&gt;
&lt;li&gt;SQL for querying relational databases, which made user customization
  possible on the web.&lt;/li&gt;
&lt;li&gt;HTML, Cascading Style Sheets (CSS) and JavaScript for user interfaces,
  which led the way for browser-based applications like Google Docs.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;An example monitoring implementation using these technologies is Nagios, a tool
to monitor computer grid infrastructure.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;The Nagios distribution provides only the basic set of sensors, but custom
sensors can be developed by using any existing programming language. This means
that Nagios can be used for monitoring virtually anything as long as appropriate
sensor can be developed.&quot; (&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1272685&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Nagios provides a uniform interface to the monitored statistics, on
top of which any number of processing and analysis applications can be built.
Nagios is not intended for the resiliency that something like an electricity
grid would require, but many smaller infrastructure projects and those with less
critical tasks could likely use this system without modification.&lt;/p&gt;
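
&lt;p&gt;To give a sense of how small a custom sensor can be, here is a sketch of one written against the Nagios plugin convention: a plugin prints a single status line and reports its state through its exit code (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN). The pipeline pressure reading, the thresholds and the function names are hypothetical and stand in for whatever a real sensor would provide.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
# Exit codes defined by the Nagios plugin convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def evaluate(pressure_kpa, warn_kpa, crit_kpa):
    """Map a reading to a (status code, status line) pair."""
    if pressure_kpa is None:
        return UNKNOWN, "PRESSURE UNKNOWN - no reading available"
    if pressure_kpa >= crit_kpa:
        status = CRITICAL
    elif pressure_kpa >= warn_kpa:
        status = WARNING
    else:
        status = OK
    # Text after the pipe is optional performance data.
    line = "PRESSURE {} - {:.1f} kPa | pressure={:.1f}".format(
        LABELS[status], pressure_kpa, pressure_kpa
    )
    return status, line


def read_pressure_kpa():
    """Hypothetical stand-in for querying a real pipeline sensor."""
    return 512.0


def main():
    """Print the status line and return the exit code for the plugin."""
    code, line = evaluate(read_pressure_kpa(), warn_kpa=600.0, crit_kpa=750.0)
    print(line)
    return code

# A real plugin script would finish with: sys.exit(main())
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Keeping the threshold logic in &lt;code&gt;evaluate&lt;/code&gt; separate from the sensor read makes the same check reusable for any numeric statistic.&lt;/p&gt;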

&lt;h3&gt;State of the Art&lt;/h3&gt;

&lt;p&gt;One &lt;a href=&quot;http://books.google.com/books?hl=en&amp;amp;lr=&amp;amp;id=RPXU1vBhIFMC&amp;amp;oi=fnd&amp;amp;pg=PA123&amp;amp;dq=Health+Monitoring+Framework+for+Bridges+and+Civil+Infrastructure&amp;amp;ots=_LI3GPLQF0&amp;amp;sig=Slwwo9mnAB0rGg68Yh88RqLgoYc#v=onepage&amp;amp;q=Health%20Monitoring%20Framework%20for%20Bridges%20and%20Civil%20Infrastructure&amp;amp;f=false&quot;&gt;recent project&lt;/a&gt; at the University of California, San
Diego set out to implement an extensible structural health monitoring platform
using some of these open technologies. They proposed a system with many
components, including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Networked sensor arrays&lt;/li&gt;
&lt;li&gt;A high-performance centralized database&lt;/li&gt;
&lt;li&gt;Computer vision analysis&lt;/li&gt;
&lt;li&gt;Physics analysis&lt;/li&gt;
&lt;li&gt;Visualizations that allow comparison between experimental and
  numerical simulation data&lt;/li&gt;
&lt;li&gt;Modeling and risk analysis&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;This was an ambitious project, attempting to provide a standard way &quot;to measure,
acquire, process, and analyze the massive amount of data that is currently
coming on-line (not to mention the terabytes of streaming data that will
inundate potential users in the near future) in order to extract useful
information concerning the condition assessment of the monitored structures.&quot;
Unfortunately, the project's web portal is no longer online and no further
information about the project could be found. A likely cause of failure is that
the system tried to combine too many components into a single piece. Monitoring
and data querying are among the few tasks that can be completely generalized
across industries. Considering that computer vision analysis and risk modeling
will likely change substantially between a pipeline operator and
telecommunications operator, these systems should be left to the various
industries to implement as they see fit. As long as the data collection system
provides a uniform interface to query the monitoring statistics, it can still be
of great value.&lt;/p&gt;

&lt;h3&gt;SCADA&lt;/h3&gt;

&lt;p&gt;Supervisory Control and Data Acquisition (SCADA) is an all-encompassing
descriptor for real-time communication systems that connect infrastructure to
operators. SCADA systems can be uni- or bi-directional: all of them send
monitoring data back to the operator, and bi-directional systems can also propagate control commands
to infrastructure. These types of systems are popular in industrial environments
as well, as a way to monitor and control heavy equipment.&lt;/p&gt;

&lt;p&gt;There are many existing SCADA-style systems (an estimated 150-200
&lt;a href=&quot;http://www.sciencedirect.com/science/article/B6V8G-4JXRWXY-1/2/b9d08f2cdd9717ffab6c60e9b7d658f1&quot;&gt;different protocols&lt;/a&gt;), dating back to the late 1960's.
SCADA is very popular with power utilities as a mechanism for coordinating
generation capacity among power plant operators; the generators must react very
quickly to changes in load, and an automated communication system is the only
way to respond fast enough. In the first few decades of their existence, SCADA
systems were primarily developed in-house and extremely customized for specific
use cases. As computer software firms spawned in the 1980's and 1990's, more
infrastructure operators switched to purchasing their controls software from
existing vendors. This has both positives and negatives.&lt;/p&gt;

&lt;h4&gt;Standard Networking Protocols&lt;/h4&gt;

&lt;p&gt;In the last 20 years, SCADA software vendors have migrated towards using the
standard Internet Protocol (IP) for communication. This is a widely accepted
protocol in computing systems, and the success and diversity of the Internet
proves its flexibility. The other most popular protocols are DeviceNet and
ControlNet.&lt;/p&gt;

&lt;p&gt;Previously, operators would find themselves purchasing very large, completely
proprietary systems designed to monitor just a single attribute of their system
- for example, monitoring electrical &lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1259313&quot;&gt;power phase&lt;/a&gt;. An
operator's terminal could be littered with different applications, each for a
very specific part of the system.  The integration between these components was
poor because of the difficulties in sharing data among them.&lt;/p&gt;

&lt;p&gt;One concern with using IP is that a SCADA system now runs on the same network
as consumers and potentially malicious attackers. This brings additional
security risks, especially when ill-timed or unauthorized control messages could
have disastrous consequences.&lt;/p&gt;

&lt;h4&gt;Standard Interfaces&lt;/h4&gt;

&lt;p&gt;Familiarizing operators with a SCADA user interface is a critical step in
deploying a successful system, and one that was given short shrift in the first
few decades of deployment. Along with most of the population of the 1960s, very
few operators were familiar with computer interfaces and thus it &quot;was
imperative that the [SCADA] operator interface be graphical in nature to provide
the dispatchers with a similar look and feel to what they were used to&quot;
(&lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=560831&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;h4&gt;Security &amp;amp; Confidentiality&lt;/h4&gt;

&lt;p&gt;As mentioned, the privacy and security of a SCADA system are of particular
concern when using networks shared by others. In general, &quot;[m]onitoring and
controlling these systems is an enormous undertaking, requiring constant
supervision&quot; (&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1047846.1047872&quot;&gt;source&lt;/a&gt;). They must be architected to
avoid cascading failures so connectivity issues in one area of the Internet do
not affect the stability or safety of a piece of critical infrastructure. SCADA
systems are some of the most diverse and complex integrated systems, commonly
containing &quot;as many as 50,000 input/output modules for data collection.&quot; Since
infrastructure systems are so interdependent, failure or damage in one
operator's infrastructure may cause a larger part of the infrastructure to
become unstable. Many types of infrastructure are dependent on
telecommunications to keep their SCADA systems running, and if standardization
and collaboration continue across industries, the integration points will only
become more prevalent.&lt;/p&gt;

&lt;p&gt;Knowing this, systems using a standard shared network with open protocols are
not without fault. They allow for more efficient operation and
potentially more collaboration, &quot;but it also exposes the safety-critical
industrial network to the myriad security problems of the Internet&quot;
(&lt;a href=&quot;http://www.sciencedirect.com/science/article/B6V8G-4JXRWXY-1/2/b9d08f2cdd9717ffab6c60e9b7d658f1&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;It's important to distinguish between the control and monitoring halves of
SCADA, as they have very different security requirements. Some types of
infrastructure have a much greater need for control than others, which are
satisfied with mostly monitoring. It would be a mistake to require both systems
to have the same level of protection (although they both require robustness).&lt;/p&gt;

&lt;p&gt;Some common strategies for improving the security of SCADA systems on an open
network are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Make certain all network traffic is protected using common Internet
  security measures such as Transport Layer Security (TLS).&lt;/li&gt;
&lt;li&gt;Establish industry-wide data security practices.&lt;/li&gt;
&lt;li&gt;Educate operators on these best practices. Many reported
  security breaches were the result of operator error.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Openness&lt;/h3&gt;

&lt;p&gt;Before fretting too much about security and spending additional money, the
operators should also consider the true sensitivity of the data they are
collecting. The surface appeal for protecting data is great, but often the
data would not expose anything a watchful eye could not detect by manual
inspection. The true motivation for protecting data may be different.
For example, the Federal Energy Regulatory Commission removed previously
accessible statistics and documents from their website in the name of security,
&quot;but a 2003 investigation strongly suggests that advancing the economic
interests of favored industries or keeping executive actions from being
scrutinized are the actual motivations&quot; (&lt;a href=&quot;http://ssrn.com/paper=777045&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Operators need a standard and objective process for determining the sensitivity
of information, and should err more on the side of release than protection
(depending on the risk involved). The reason is that &quot;[r]evealing data on the
vulnerabilities of certain kinds of infrastructure can be a net benefit when the
target would be inadequately defended absent that revelation. A series of GAO
reports about weaknesses in defensive measures at commercial nuclear power
plants, for example, played a key role in overcoming industry resistance to
stricter security standards&quot; (&lt;a href=&quot;http://www.informaworld.com/smpp/content~db=all~content=a919474402&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Computer engineers have long appreciated the massive amount of cognitive power
accessible through the Internet, people ready and willing to tackle difficult
problems. The same is true for infrastructure issues, and &quot;government officials
should release organizational information whenever society is more effective
than terrorists at utilizing it&quot;. This level of
transparency could both force utilities to &quot;internalize more fully the costs of
attacks&quot; and encourage more self-interested regulatory
action to protect from attack in the first place.&lt;/p&gt;

&lt;p&gt;Attacks are so rare that as a private company operating a piece of
infrastructure, it doesn't often make business sense to spend money on
protection. The government lacks the regulatory power to make these industries
account for the full social costs of these events - and up to 90% of the
nation's critical infrastructure is privately owned and operated
(&lt;a href=&quot;http://heinonline.org/HOL/LandingPage?collection=journals&amp;amp;handle=hein.journals/aulr53&amp;amp;div=15&amp;amp;id=&amp;amp;page=&quot;&gt;source&lt;/a&gt;). Private industry, even when operating publicly owned
infrastructure, lacks strong incentives to properly protect it.&lt;/p&gt;

&lt;p&gt;In computing, open source cryptographic algorithms are more trusted and widely
used than closed-source proprietary counterparts because their vulnerabilities
have been exposed and patched. They are not more susceptible to attack solely
because of increased knowledge of their inner workings. Security by obscurity is
insufficient; non-disclosure is almost certainly an admission of the existence
of vulnerabilities, but it doesn't guarantee any of them are being resolved.&lt;/p&gt;

&lt;p&gt;Non-disclosure risks more than just the reputation of the company in question -
it risks public safety and the structural integrity of critical infrastructure.
Companies certainly recognize this, and it is part of the reason for protecting
data - anything exposed to the government could potentially show up in civil
enforcement action against the company.&lt;/p&gt;

&lt;p&gt;Of course, there must be a balance between openness and security. Data on
nuclear fuel distribution and quantity is protected because of the potentially
extreme consequences from unauthorized access. However, the
positives of openness and possibilities of early vulnerability discovery likely
outweigh any risk in most other industries.&lt;/p&gt;

&lt;p&gt;A monitoring system based on open standards could easily support a mechanism to
filter and aggregate operating statistics for public consumption. Real-time data
will likely always be considered too sensitive, but there should be a clear path
from SCADA systems to a more accessible (but still digital) version of the data,
one without any extra costs beyond initial setup.&lt;/p&gt;
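
&lt;p&gt;Such a filter-and-aggregate step does not need to be elaborate. The sketch below assumes readings have already been exported from the monitoring system as simple records. The field names and the choice of which metrics are safe to publish are hypothetical. It keeps only whitelisted metrics and reduces them to daily averages, so no real-time or control-related detail leaves the operator.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
```python
from collections import defaultdict
from statistics import mean

# Hypothetical metric names; which ones are safe to publish is an assumption.
PUBLIC_METRICS = {"flow_rate", "temperature"}


def daily_public_summary(readings):
    """Average whitelisted metrics per day, dropping everything else.

    Each reading is a dict with "date" (YYYY-MM-DD), "metric" and "value" keys.
    """
    buckets = defaultdict(list)
    for reading in readings:
        if reading["metric"] in PUBLIC_METRICS:
            key = (reading["date"], reading["metric"])
            buckets[key].append(reading["value"])
    # One averaged value per (date, metric) pair, in sorted order.
    return {key: mean(values) for key, values in sorted(buckets.items())}
```
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Using a whitelist rather than a blacklist means a newly added sensor type stays private until someone deliberately decides to publish it.&lt;/p&gt;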

&lt;h2&gt;Comparison&lt;/h2&gt;

&lt;p&gt;This section compares the culture and basic monitoring styles of civil and
computer engineering.&lt;/p&gt;

&lt;h3&gt;Culture&lt;/h3&gt;

&lt;p&gt;Civil and computer engineers come from different backgrounds and don't have much
shared history. The two industries have evolved with very different major
players and cultures.&lt;/p&gt;

&lt;h4&gt;Traditional Civil Engineering Ethos&lt;/h4&gt;

&lt;p&gt;The first 20 years after monitoring systems were first widely deployed can be
described as being embodied by a traditional civil engineering ethos. With the
newfound availability of computers and data storage, this was a period of great
change in design and planning.&lt;/p&gt;

&lt;h5&gt;Process &amp;amp; Information Security&lt;/h5&gt;

&lt;p&gt;From the start, information security was a critical concern for operators. They
also intended to keep specific operation process details within a company or
industry. These were viewed as details that didn't need to be exposed, for the
good of the company and the infrastructure.&lt;/p&gt;

&lt;h5&gt;Custom Software&lt;/h5&gt;

&lt;p&gt;Software development was a very new field, and many operators found themselves
hiring consultants to develop custom monitoring software for their specific use
case. Each operator had slightly different requirements, even within the same
industry, and at the start there wasn't much interoperability or data sharing.&lt;/p&gt;

&lt;h5&gt;Expert Interfaces&lt;/h5&gt;

&lt;p&gt;For industries with generations of existing workers, there wasn't a clear path
to designing proper user interfaces or training the operators. One approach was
to make the interface as visually similar as possible to the physical component. Others
focused on more tabular displays of data, similar to what was previously a
hand-written spreadsheet. In both cases, it often took an expert understanding
of the system to be able to interpret the (often terse) monitoring system
displays.&lt;/p&gt;

&lt;h4&gt;Computer Engineering Ethos&lt;/h4&gt;

&lt;p&gt;At the same time, the computer engineering world was rapidly expanding and
in hindsight had a few basic principles.&lt;/p&gt;

&lt;h5&gt;Openness&lt;/h5&gt;

&lt;p&gt;In part due to the early ties between computers and academia, some of the first
widely used applications were open source - this means that the underlying
source code for the system is provided free of charge to view and modify. This
allowed new developers to quickly see how existing systems worked, and to assist
in the development and expansion of applications.&lt;/p&gt;

&lt;p&gt;Standard, open communication protocols also played an important role early on,
especially with the development of the Internet. Without a freely implementable
communication standard, the World Wide Web could not have succeeded.&lt;/p&gt;

&lt;p&gt;Finally, openness has been a core tenet of security from the start of
computing. Even today, many large software companies share fine-grained details
of the operations of their systems in order to expand knowledge in the field and
potentially reap the benefit of open source contributions to their projects.&lt;/p&gt;

&lt;h5&gt;Humanized UI&lt;/h5&gt;

&lt;p&gt;User interface design is an important part of everything to do with computers -
without the interface, there is very little of interest in networked
applications. This is opposed to infrastructure, where the physical components
exist in the world regardless of whether they are monitored by a communications
network.&lt;/p&gt;

&lt;p&gt;The interfaces range from technically demanding (like the early civil
infrastructure displays) to layman friendly. A core principle is flexibility,
however, meaning that with time and effort, almost every system can settle on a
good UI.&lt;/p&gt;

&lt;h4&gt;Modern Civil Engineering Ethos&lt;/h4&gt;

&lt;p&gt;As SCADA matured over the last 20 years, the ethos of engineers working on
monitoring civil infrastructure has changed. Some of the ideas from computer
engineering have migrated over, and the further development of a strong software
industry improved standards and interoperability.&lt;/p&gt;

&lt;h5&gt;Industry-level Standardization&lt;/h5&gt;

&lt;p&gt;Most infrastructure industries have some general standards and guidelines for
SCADA systems. Specific software vendors have become more popular in certain
areas, meaning that an operator moving from one power utility to another is
more likely to find familiar nomenclature and interfaces. For example, in 1993
the American Petroleum Institute issued a set of overall SCADA guidelines to
reconcile a huge diversity of interface styles for pipeline monitoring.
Infrastructure management is founded on a set of core ideas and
tools that can be applied to many different industries with only a few changes
(new expert consultants, different Weibull curves, etc.), and the same is true
for monitoring systems.&lt;/p&gt;

&lt;h5&gt;Third-party Software&lt;/h5&gt;

&lt;p&gt;Most monitoring software is now developed by third-party software vendors such
as Siemens and Rockwell Automation. While these systems still have some
proprietary components, communication protocol standards now have much more
support.&lt;/p&gt;

&lt;h3&gt;Active versus Passive&lt;/h3&gt;

&lt;p&gt;Computer network monitoring systems are typically monitoring other software
systems, and the monitoring is inherently very tightly woven with the application
itself. This allows for a &lt;a href=&quot;http://dblp.uni-trier.de/db/conf/eagc/eagc2004.html#HolubKMR04&quot;&gt;passive monitoring&lt;/a&gt; approach that
gathers its data directly from the application, or by politely eavesdropping on
network activity. This isn't possible in non-computing infrastructure where the
&quot;application&quot; is something physical, without any explicit bits associated with
it.&lt;/p&gt;

&lt;p&gt;Both types of monitoring accomplish the same end goal, but passive monitoring in
computer networks is used primarily in an attempt to lower the performance
overhead of monitoring. Civil infrastructure will rarely if ever have the same
problem (since wireless sensors can be positioned so as not to obstruct normal
operation), but the difference could lead to another divergence in monitoring
techniques.&lt;/p&gt;

&lt;h2&gt;User Interfaces&lt;/h2&gt;

&lt;h3&gt;State of UI in SCADA&lt;/h3&gt;

&lt;p&gt;The purpose of SCADA is to &quot;allow operators to monitor and control systems&quot;
(&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1047846.1047872&quot;&gt;source&lt;/a&gt;), so the presentation of data is a critical
component that should receive equal consideration with sensor hardware and data
security. Unfortunately, this hasn't been the case for many SCADA developers. A
2005 &lt;a href=&quot;http://www.ntsb.gov/publictn/2005/ss0502.pdf&quot;&gt;safety study&lt;/a&gt; by the National Transportation Safety Board on SCADA
for liquid pipelines found that inconsistent and problematic user interfaces
were the cause of many accidents, challenging the claims that the systems are a
success. The study concluded:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;The principle issue in the SCADA-related accidents investigated by the Safety
Board was the delay in a controller's recognizing a leak and beginning efforts
to reduce the effect of the leak. SCADA factors identified in these accidents
include alarms, display formats, the accuracy of SCADA screens, the controller's
ability to accurately evaluate SCADA data during abnormal operating conditions,
the appropriateness of controller actions, the ability of the controller and the
supervisor to make appropriate decisions, and the effectiveness of training in
preparing controllers to interpret the SCADA system and react to abnormal
conditions.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;This highlights the importance of good interface design, as most of these issues
are simply the result of controllers not understanding the information presented
to them by the monitoring system. These are operators who interpret a graph
incorrectly, dismiss repeated warnings because of frequent false positives, and
those who can't get a good sense of the true health of the system in the field
based on tabular data.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/infrastructure-monitoring/scada.png&quot; alt=&quot;SCADA&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A screenshot from a SCADA system for monitoring a pipeline (from the NTSB
study)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The study examined the role of SCADA systems in 13 pipeline accidents from 1992
to 2004, and found that a SCADA system contributed to the accident in some form
in ten of them. Implemented poorly, a monitoring and control system can harm
instead of help.&lt;/p&gt;

&lt;p&gt;These systems should strive to control as much as is reasonably possible
automatically. A common complaint about these systems is that false alarms are
so frequent that operators have a difficult time distinguishing the real alarms
when they do occur. These annoying alerts can lead to a serious signal being
ignored for a long time by an operator who has tuned them out, or a misdiagnosis
of the root cause. As many as possible of these false positives should be
handled by a low level of automatic control operations done by the computer,
even if it means stopping system components for a brief period when not strictly
necessary. Recent research on &lt;a href=&quot;http://dblp.uni-trier.de/db/conf/eurosys/eurosys2010.html#BodikGFWA10&quot;&gt;data center fingerprinting&lt;/a&gt;,
which identifies new incidents based on data collected during historical events,
could be applied to such automatic failure identification and repair. The human
operator should be brought into the loop only when a bigger system problem is
identified and automated response is not sufficiently subtle or intelligent.&lt;/p&gt;

&lt;p&gt;The standardization of colors and symbols within (and even across) industries is
also required to minimize re-training and misinterpretation. This hasn't always
happened, primarily because a lot of SCADA system development &quot;occurred
company-by-company due to the unique characteristics of each company's operating
practices and other computer systems. For example, one company may use red to
show an operating pump while another may use green&quot; (from the NTSB study). Consistent,
appropriate use of color is one of the major tenets of good user interface
design and all existing displays should be evaluated to make sure they meet this
standard. Other UI improvements that the NTSB suggests are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real-time comparison with historical data to recognize abnormalities,
  especially in areas where the operator may not be explicitly trained.&lt;/li&gt;
&lt;li&gt;Integrate company safety procedures into control systems to guide
  operators down the correct resolution path.&lt;/li&gt;
&lt;li&gt;Extended operator training - operators should be experts in the
  system.&lt;/li&gt;
&lt;li&gt;Increase contrast between foreground and background colors.&lt;/li&gt;
&lt;li&gt;Minimize the quantity of colors and make sure their meaning is
  consistent between screens.&lt;/li&gt;
&lt;li&gt;Target 40% blank space on screens to minimize clutter.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;User Interfaces &amp;amp; User Experience&lt;/h3&gt;

&lt;p&gt;Beyond incremental improvements, civil infrastructure interfaces have a unique
opportunity to push the envelope of interfaces. These systems have very wide
market penetration, and whatever interface they provide (good or bad) has a
tendency to become the industry standard out of familiarity. A potentially
more revolutionary interface change could take advantage of an immersive
virtual world instead of buttons and charts. Research regarding
different physical interfaces for computers, and the effect of integrating
virtual elements with the real world found that they can &quot;create a system that
leverages the human mind's pattern recognition skills to detect anomalies on a
live running network&quot; (&lt;a href=&quot;http://dblp.uni-trier.de/db/conf/netgames/netgames2006.html#HarropA06&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Naturally, due to the popularity of computer and video games among computer
network operators and researchers, a few &lt;a href=&quot;http://dblp.uni-trier.de/db/conf/netgames/netgames2006.html#HarropA06&quot;&gt;attempts&lt;/a&gt; have been made to create such
an immersive visualization for network monitoring. The primary challenge is to
carefully select &quot;input and output metaphors within [the] real time 3D virtual
environment,&quot; and it proved especially challenging because there
is no obvious analogous physical element for network constructs - the activity
is all bits flowing on wires. Physical infrastructure monitoring is not bothered
as much by this issue, since at the root of most infrastructure monitoring tasks
is some physical object. Metaphors are not required.&lt;/p&gt;

&lt;p&gt;An immersive, humanized environment offers more opportunities for pattern
recognition of data that would be shown in an unsuitable format on a
traditional display. Immersive virtual environments can also improve collaboration between
operators. Currently, operators working together on a problem must communicate
through voice or video. This is because
it's difficult to tell at a glance what an operator is doing at a computer
terminal. Inside a virtual world, the operator's current activity and status can
be made obvious by the actions of their character. If the operator is opening a
valve on the pipeline, they will appear doing just that inside the environment.
In the control center, they're sitting in an office chair as always.&lt;/p&gt;

&lt;h3&gt;Serious Games&lt;/h3&gt;

&lt;p&gt;The idea of serious games is a good example of a daring use of existing
interfaces in a novel way.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;The term serious games has developed as a rebuttal
to the idea that games are purely for leisure purposes and its use goes back to
Plato's work on the importance of play as a teaching method. Recently, the
serious games movement has emerged from academic communities identifying the
power of play for supporting non-leisure activities such as education and
training.&quot; (&lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.158.7434&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Beyond training, serious games can also be used for monitoring and interacting
with live environments in real-time. Using a first- or third-person viewpoint
into a world that looks very similar to the actual physical components is a more
natural way to get the sense of a system's status.&lt;/p&gt;

&lt;p&gt;A core requirement of an infrastructure monitoring version of a serious game is
to avoid simplifying critical components to the degree that accuracy is lost.
The interface must certainly contain abstractions, lest it become a
micromanagement simulator, but it cannot ignore real-world issues that other
training simulations tend to ignore.&lt;/p&gt;

&lt;h3&gt;Modern Video Games&lt;/h3&gt;

&lt;p&gt;Thankfully, a lot of work has already been done in developing immersive 3D
virtual environments in modern computer and video games. For the purposes of
entertainment and creative expression, game developers are creating increasingly
detailed worlds. Some of the important features that could be applied to
infrastructure management are highlighted by a few recent titles:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/infrastructure-monitoring/mirrorsedge.jpg&quot; alt=&quot;Mirror's Edge&quot; /&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Valve's &lt;a href=&quot;http://orange.half-life2.com/&quot;&gt;Half-Life 2&lt;/a&gt; (2004) - provides
  advanced real-time physics simulation that could be used for combining
  real-time sensor data and prediction models.&lt;/li&gt;
&lt;li&gt;Electronic Arts' &lt;a href=&quot;http://www.ea.com/games/mirrors-edge&quot;&gt;Mirror's Edge&lt;/a&gt; (2008) -
  includes expansive, detailed urban environments that could be used for
  building and bridge inspection.&lt;/li&gt;
&lt;li&gt;Gas Powered Games' &lt;a href=&quot;http://www.supremecommander2.com/&quot;&gt;Supreme Commander 2&lt;/a&gt;
  (2010) - gives top-down strategic control of thousands of military units,
  which could be used for routing traffic on busy streets, waterways and in
  the air.&lt;/li&gt;
&lt;li&gt;Rail Simulator Development's &lt;a href=&quot;http://www.railsimulator.com/&quot;&gt;RailWorks 2&lt;/a&gt;
  (2010) - gives complete control of accurately modeled passenger and cargo
  trains. With a connection to a SCADA system, this game could be useful
  almost out of the box.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Most state-of-the-art game engines (the software that renders the world and
controls the behavior of objects in it) are not available to the public
directly, but many expose a flexible interface that allows anyone to modify the
gameplay. Computer games are often popular well beyond their initial release
because of large communities of gamers and developers creating &quot;mods&quot; that
twist the game engine into completely new forms. Some examples that suggest the
possibilities for infrastructure management mods include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://www.garrysmod.com/&quot;&gt;Garry's Mod&lt;/a&gt; (for
  &lt;a href=&quot;http://orange.half-life2.com/&quot;&gt;Half-Life 2&lt;/a&gt;) - strips down the game to
  its core physics engine to allow creating devices as varied as operational
  submarines, construction cranes and rocket ships.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.empiresmod.com/drupal/&quot;&gt;Empires&lt;/a&gt; (for Half-Life 2) - extends the
  formerly single-player, first-person only game to include a commander with a
  top-down view of the battlefield who can give orders to players exploring
  the world. This style of game closely matches the management hierarchy of
  many infrastructure operators.&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.gtagaming.com/downloads/gta-iv/vehicle-mods/2604&quot;&gt;BusMod&lt;/a&gt;
  (for &lt;a href=&quot;http://www.rockstargames.com/IV/&quot;&gt;Grand Theft Auto 4&lt;/a&gt;) - in a game
  built for causing violent mayhem in an urban environment, this mod
  highlights a more mundane feature of the city and lets the player drive a
  bus route.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;A few open-source game engines are available, if the modding interface isn't
sufficient for any specific monitoring task. &lt;a href=&quot;http://dblp.uni-trier.de/db/conf/netgames/netgames2006.html#HarropA06&quot;&gt;Some research&lt;/a&gt; has been done
on using one of these engines for a computer network monitoring task.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;Civil infrastructure is a notoriously underfunded part of our society. The
ASCE's yearly infrastructure &lt;a href=&quot;http://www.infrastructurereportcard.org/&quot;&gt;report cards&lt;/a&gt; have been giving
failing grades for over a decade, yet overall investment in infrastructure
hasn't risen. The improvements proposed in this paper clearly cost money,
something which few infrastructure managers have available for non-critical
upgrades.&lt;/p&gt;

&lt;p&gt;However, remember that the promise of real-time monitoring and SCADA was a
decrease in total lifetime cost of infrastructure. This promise was largely
fulfilled. With some up-front investment, more efficient operation and
monitoring can potentially save a lot of money. After 40 years of SCADA systems,
it's time to move into the next major phase of technological advancement, which
holds the same cost saving promise. Monitoring systems succeeded, but they risk
falling behind like the rest of America's infrastructure without proper
consideration.&lt;/p&gt;

&lt;p&gt;The next evolution in monitoring must consider the true target of the system:
the users. Julian Bleecker's &lt;a href=&quot;http://www.nearfuturelaboratory.com/2009/03/17/design-fiction-a-short-essay-on-design-science-fact-and-fiction/&quot;&gt;Design Fiction&lt;/a&gt; essay gets it right:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;Engineering makes things for end-users. Accounting makes things for markets,
demographics and consumers. Design makes things for people.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Engaging interfaces are not reserved exclusively for entertainment
and pleasure, and neither is good design.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Massively Distributed Monitoring</title>
   <link href="http://christopherpeplin.com/2011/05/distributed-monitoring"/>
   <updated>2011-05-29T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/distributed-monitoring</id>
   <content type="html">&lt;p&gt;&lt;em&gt;A &lt;a href=&quot;http://things.rhubarbtech.com/distributed-monitoring/distributed-monitoring-report.pdf&quot;&gt;PDF&lt;/a&gt;
of this article is available&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Abstract&lt;/h2&gt;

&lt;p&gt;As the scale of distributed systems continues to grow, the basic question of
monitoring the system's status becomes more difficult. How do you monitor an
extremely large scale network of nodes without requiring a massive, centralized
cluster for data collection?&lt;/p&gt;

&lt;p&gt;To monitor these systems from a central location would mean potentially hundreds
of thousands of new connections every second, each with a very small update.
Persistent connections do not help, since a server cannot maintain that many
open connections.&lt;/p&gt;

&lt;p&gt;This paper evaluates different approaches to monitoring and describes the
implementation of a few novel monitoring techniques in the peer-to-peer video
distribution network Astral. Peers in Astral use a self-organized dynamic
hierarchy, temporal batching and update filtering to increase the scalability of
the monitoring subsystem. Some of the trade-offs associated with these
optimizations are also enumerated.&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Computer system architecture shifted three major times in the past 30 years.
Individual, personal computers gave way to mainframes and thin clients. The tide
shifted back to powerful end-user clients in the 1990s. Now, the cloud and
web services embody somewhat of a throwback to the mainframe era.&lt;/p&gt;

&lt;p&gt;A web service is both distributed and centralized - distributed in that the
client and server are separated; centralized in that all clients of a particular
service connect to the same central hub. In data centers and a few emerging
peer-to-peer applications, smaller distributed systems are emerging. The data
center ecosystem itself is one massively distributed system, encompassing
thousands of nodes across a diverse geography. In short, distributed systems are
more common than ever and the importance of monitoring, tracking and accounting
hasn't waned.&lt;/p&gt;

&lt;h3&gt;Problem&lt;/h3&gt;

&lt;p&gt;As distributed systems become the norm, business process managers are more and
more interested in collecting knowledge of the behavior of the system. Planning
and marketing are also increasingly data-driven. Each client, server or peer in
a system generates potentially valuable usage statistics. System operators want
to access this data from the granularity of an individual node up to an
aggregate value for the entire system, or a value derived from many statistics.&lt;/p&gt;

&lt;p&gt;As the scale of a distributed system increases, the monitoring task can
potentially become a burden for an application. The demands and overhead of
monitoring can equal or exceed those of the system's normal duties.&lt;/p&gt;

&lt;p&gt;A prime example of a problematic monitoring situation is that of large
peer-to-peer networks. With lessons learned from the centralization of Napster
and the unstructured overlay network of Gnutella, modern peer-to-peer
applications like BitTorrent focus on a completely decentralized operation that
incorporates a minimal amount of local structure in the network graph.
BitTorrent trackers can collect statistics on the files they seed, but
statistics for a resource distributed among multiple trackers are difficult to
reconcile.&lt;/p&gt;

&lt;p&gt;Another example is massively multiplayer online games. Blizzard has a vested
interest in tracking the users of their massively multiplayer online game
&lt;a href=&quot;http://us.battle.net/wow/en/&quot;&gt;World of Warcraft&lt;/a&gt; for billing, game balancing
and to plan for future expansions. In order to scale the game world in a
reasonable fashion, the developers split up the environment into thousands of
shards. The gameplay statistics must make it back to a centralized location at
Blizzard eventually, but the fractured architecture doesn't lend itself to a
simple solution.&lt;/p&gt;

&lt;p&gt;Civil infrastructure monitoring efforts have also encountered these issues.
Consider a thousand-mile gas pipeline - current monitoring approaches require
wireless sensors spaced evenly along the route, which can quickly outpace a
large data center in raw number of nodes.&lt;/p&gt;

&lt;h3&gt;Research Goals&lt;/h3&gt;

&lt;p&gt;This paper summarizes the challenges with large-scale, distributed monitoring
systems and details some accepted solutions and their limitations. To test these
ideas, we extended the logging and monitoring functionality of the experimental
distributed system &lt;a href=&quot;http://christopherpeplin.com/2011/05/astral&quot;&gt;Astral&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Astral is a peer-to-peer streaming media content delivery network. The intent of
Astral is to leverage the available upstream bandwidth of users watching a live
video stream to alleviate the stress (and bandwidth bill) of the stream
provider. Instead of all retrieving the video stream from a central location,
clients look among network peers for those watching the same event. To match the
quality and quantity of metrics available from a centralized system (and
demanded by management), Astral must provide detailed usage statistics on the
streams and their viewers. Using Astral, we found some of the limits of
centralized data collection and tested the feasibility of a more distributed
approach.&lt;/p&gt;

&lt;h3&gt;Definitions&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Source / Node - The root source of a statistic, e.g. a user's client
 connected to the Astral network.&lt;/li&gt;
&lt;li&gt;Sink / Collector - The hub for collecting statistics from sources.&lt;/li&gt;
&lt;li&gt;Supernode - A parent node in charge of 1 to $n$ child nodes. The
 leader of a neighborhood of nodes. Typically not specially deployed
 hardware, but common nodes promoted by the system dynamically.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Monitoring&lt;/h2&gt;

&lt;h3&gt;Metrics&lt;/h3&gt;

&lt;p&gt;The range of specific metrics that a system designer or operator might like to
have is quite wide, and thus a monitoring framework must have a generic, unified
interface for specifying the type, number and value of data points. It must be
simple to add new metrics for each node, of different types of data.&lt;/p&gt;
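
&lt;p&gt;As a rough illustration of such a generic interface, the sketch below
models every data point as one small record whose value may be a boolean,
number or string. The class and field names are hypothetical and are not taken
from Astral or any other tool mentioned here.&lt;/p&gt;

```python
from dataclasses import dataclass, field
import time
from typing import Union

# Values a data point may carry; other types could be added the same way.
MetricValue = Union[bool, int, float, str]


@dataclass(frozen=True)
class Metric:
    """One data point reported by a node (all names here are hypothetical)."""
    node_id: str
    name: str
    value: MetricValue
    timestamp: float = field(default_factory=time.time)

    @property
    def kind(self) -> str:
        # Derive the type tag from the value itself, so adding a new metric
        # needs no schema change, only a new name.
        return type(self.value).__name__


cpu = Metric("node-1", "cpu.load", 0.42)
gate = Metric("node-7", "flood_gate.open", True)
print(cpu.kind, gate.kind)
```

&lt;p&gt;Because the record carries its own name and type, a collector could accept
new metrics without being redeployed, at the cost of doing less validation up
front.&lt;/p&gt;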

&lt;p&gt;In the context of an individual node, metrics include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CPU usage (instantaneous and averaged)&lt;/li&gt;
&lt;li&gt;Hard disk usage&lt;/li&gt;
&lt;li&gt;Memory usage&lt;/li&gt;
&lt;li&gt;Swap space usage&lt;/li&gt;
&lt;li&gt;Clock skew&lt;/li&gt;
&lt;li&gt;Network throughput&lt;/li&gt;
&lt;li&gt;Network latency&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;One level higher, applications have their own (potentially more interesting)
metrics. For some common infrastructure-level components, metrics include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Database query latency&lt;/li&gt;
&lt;li&gt;Database index performance&lt;/li&gt;
&lt;li&gt;Database replication status&lt;/li&gt;
&lt;li&gt;Task queue failure rates&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;A peer-to-peer network's metrics include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Number of peers&lt;/li&gt;
&lt;li&gt;Supernode organization&lt;/li&gt;
&lt;li&gt;Supernode history&lt;/li&gt;
&lt;li&gt;Network membership history for a peer&lt;/li&gt;
&lt;li&gt;Specific data requested from the network&lt;/li&gt;
&lt;li&gt;Resource available at a peer&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The metrics for an online game include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Individual player playtime&lt;/li&gt;
&lt;li&gt;Aggregate player activity&lt;/li&gt;
&lt;li&gt;Inter-player transactions&lt;/li&gt;
&lt;li&gt;Non-player character spawns&lt;/li&gt;
&lt;li&gt;Occurrence of in-game events&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The types of civil infrastructure vary widely, but they tend to have close
relation to a physical world element and external sensor input. Some sensor
data may come in the form of video or images, further complicating collection
and storage methods. Common infrastructure metrics include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Temperature&lt;/li&gt;
&lt;li&gt;Pressure&lt;/li&gt;
&lt;li&gt;Flood gate status (boolean)&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Challenges&lt;/h3&gt;

&lt;p&gt;The world of data center monitoring is changing rapidly. The developers of the
popular monitoring tool &lt;a href=&quot;http://www.sciencedirect.com/science/article/B6V12-4CMHWWX-2/2/b6b44ba67c732867d1c3881c510b2953&quot;&gt;Ganglia&lt;/a&gt; remark that &quot;high performance systems today have
sharply diverged from the monolithic machines of the past and now face the same
set of challenges as that of large-scale distributed systems.&quot;&lt;/p&gt;

&lt;p&gt;The variety of systems described earlier are all beginning to look like data
centers, and vice versa. The core goal of any monitoring system is a global view
of the system for the purposes of health monitoring, performance optimization
and accounting. An aggregate global view is more useful in a truly large scale
system, where individual node failures are likely masked or handled by efficient
failover. Consider that &quot;failure&quot; in a network of cable TV set-top boxes could
be a user turning off a power strip each night.&lt;/p&gt;

&lt;p&gt;The Ganglia developers suggest that the most important design challenges for a
distributed monitoring system are scalability, robustness, extensibility,
manageability, portability and overhead. This paper covers three of these in
greater detail.&lt;/p&gt;

&lt;h4&gt;Overhead&lt;/h4&gt;

&lt;p&gt;Performance overhead can manifest itself on individual nodes or across the network as a
whole. The monitoring system must not have a significant effect on the core
task of the application, which means it must not consume significant CPU time,
perform much disk access, or transfer large amounts of data over the network.&lt;/p&gt;

&lt;p&gt;A small network footprint also lends itself to a more dynamic system. The less
data that must be transferred, the quicker the operator can view the status of
the entire system. Processing data as close to its source as possible can lessen
both the network load and the central collection server load (&lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=1045780&quot;&gt;source&lt;/a&gt;). This
must not come at the expense of application performance.&lt;/p&gt;

&lt;p&gt;What exactly can and should be processed at the point of collection isn't always
clear. For aggregate statistics, processing isn't possible without a full (or at
least somewhat broad) view of the system. Nodes in a sensor network, by contrast,
are able to process raw sensor data into a time averaged, human-parseable
statistic before sending. In a general sense, the local processing optimization
implies that data should be summarized when and where applicable before being
sent to the sink.&lt;/p&gt;
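
&lt;p&gt;A minimal sketch of that kind of local summarization, assuming a sensor
node that buffers raw readings and reports one average per fixed-size window.
The function name and the window size are illustrative choices only.&lt;/p&gt;

```python
from statistics import mean


def summarize(samples, window):
    """Collapse raw readings into one average per window of readings."""
    return [
        mean(samples[start:start + window])
        for start in range(0, len(samples), window)
    ]


# Six raw temperature readings become three values sent to the sink.
raw = [20.0, 22.0, 21.0, 23.0, 30.0, 32.0]
print(summarize(raw, 2))
```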

&lt;h4&gt;Scaling&lt;/h4&gt;

&lt;p&gt;There are three primary techniques for scaling monitoring systems: hierarchical
aggregation, arithmetic filtering and temporal batching. Unfortunately, all
three introduce complexity, uncertainty and/or delay and can make the system
highly sensitive to failure.&lt;/p&gt;

&lt;h5&gt;Hierarchical Aggregation&lt;/h5&gt;

&lt;p&gt;One problem for a centralized data sink is the sheer number of updates. A way
to alleviate this stress is to aggregate the data from multiple nodes at
strategic points in the network hierarchy. The exact size of each aggregated
group is configurable, depending on the desired load on the collection servers. The
collected statistics can either be forwarded along as a batch of individual
updates or combined into a single summary value (e.g. average CPU load across a
cluster of nodes).&lt;/p&gt;

&lt;p&gt;Unfortunately, network failures are amplified in a system with hierarchical
aggregation. For example, &quot;if a non-leaf node fails, then the entire
subtree rooted at that node can be affected. [The] failure of a level-3
node in a degree-8 aggregation tree can interrupt updates from 512 leaf
node sensors.&quot; (&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1855748&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;
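
&lt;p&gt;The sketch below illustrates both points under simple assumptions: each
supernode forwards a count and a mean rather than individual updates, and a
failed node cuts off every leaf beneath it. The function names are
hypothetical, and reading &quot;level-3&quot; as three levels above the leaves
is an assumption that happens to reproduce the quoted figure of 512.&lt;/p&gt;

```python
def aggregate(loads):
    """Summarize the raw values a supernode collected from its children."""
    return {"count": len(loads), "mean": sum(loads) / len(loads)}


def merge(summaries):
    """Combine child summaries without access to the per-node values."""
    total = sum(s["count"] for s in summaries)
    weighted = sum(s["mean"] * s["count"] for s in summaries)
    return {"count": total, "mean": weighted / total}


def leaves_below(degree, levels_above_leaves):
    """Leaf nodes cut off when a node this far above the leaves fails."""
    return degree ** levels_above_leaves


cluster_a = aggregate([0.25, 0.75])  # two nodes
cluster_b = aggregate([1.0])         # one node
print(merge([cluster_a, cluster_b]))
print(leaves_below(8, 3))
```

&lt;p&gt;Carrying the count alongside the mean is what lets a parent weight each
child summary correctly. Averaging the means directly would overstate the
contribution of small clusters.&lt;/p&gt;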

&lt;h5&gt;Arithmetic Filtering&lt;/h5&gt;

&lt;p&gt;Many metrics change infrequently, so arithmetic filtering can be used to limit
the update frequency. After being reported to the sink once, the metric is
cached and assumed to remain constant if no further updates are received. This
only works for certain classes of statistics (boolean states are a good
example), and also introduces ambiguity - it's difficult or impossible to
distinguish between a non-reporting node that truly has a constant value and one
that has failed. One possible solution for identifying truly failed nodes is to
use the existence of other updates from a node as an implicit aliveness update.&lt;/p&gt;
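
&lt;p&gt;A minimal sketch of the node-side half of this technique: an update is
sent only when the value differs from the last one reported. The
&lt;code&gt;tolerance&lt;/code&gt; parameter is an added assumption that extends the idea
to slowly drifting numeric values, and the class name is hypothetical.&lt;/p&gt;

```python
class ArithmeticFilter:
    """Suppress updates whose value has not moved since the last report."""

    def __init__(self, tolerance=0.0):
        self.tolerance = tolerance  # largest change that is NOT reported
        self.last_sent = {}         # metric name to last reported value

    def should_send(self, name, value):
        previous = self.last_sent.get(name)
        changed = previous is None or abs(value - previous) > self.tolerance
        if changed:
            # The sink caches this value until the next report arrives.
            self.last_sent[name] = value
        return changed


gate = ArithmeticFilter()
for state in [False, False, True, True, False]:
    print(state, gate.should_send("flood_gate.open", state))
```

&lt;p&gt;This sketch does nothing about the ambiguity described above. A silent
node still looks identical to a failed one unless some other traffic from it
reaches the sink.&lt;/p&gt;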

&lt;h5&gt;Temporal Batching&lt;/h5&gt;

&lt;p&gt;For statistics that change frequently, but aren't immediately required by the
system operator, temporal batching can further alleviate stress on the
monitoring system. Either at individual nodes or combined with hierarchical
aggregation, the values for a metric over a period of time are batched before
being sent to the sink.&lt;/p&gt;

&lt;p&gt;Beyond problems associated with the inherent delay in updates, temporal batching
makes the system much more vulnerable to networking problems - &quot;a short
disruption can block a large batch of updates&quot; (&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1855748&quot;&gt;source&lt;/a&gt;).
Updates can be persisted to a log on disk at each node to make sure they are not
lost.&lt;/p&gt;
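
&lt;p&gt;As a sketch of the idea, the class below buffers updates in memory and
hands them to a delivery function once a fixed interval has elapsed.
Timestamps are passed in explicitly to keep the example deterministic. The
names are hypothetical, and the on-disk log mentioned above is left out.&lt;/p&gt;

```python
class Batcher:
    """Collect updates and deliver them together once per interval."""

    def __init__(self, send, interval):
        self.send = send          # callable that receives a list of updates
        self.interval = interval  # seconds between deliveries
        self.pending = []
        self.last_flush = None

    def record(self, now, name, value):
        if self.last_flush is None:
            # The first update starts the clock for the first batch.
            self.last_flush = now
        self.pending.append((now, name, value))
        if now - self.last_flush >= self.interval:
            self.flush(now)

    def flush(self, now):
        if self.pending:
            self.send(self.pending)
            self.pending = []
        self.last_flush = now


delivered = []
batcher = Batcher(delivered.append, interval=10)
for second in [0, 5, 10, 12]:
    batcher.record(second, "cpu.load", 0.5)
print(len(delivered), len(batcher.pending))
```

&lt;p&gt;If the delivery function fails, everything in the batch is at risk at
once, which is the vulnerability the quote above describes.&lt;/p&gt;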

&lt;p&gt;A derived metric called &lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1855748&quot;&gt;network imprecision&lt;/a&gt; (NI) was proposed to account for
these variances when viewing system-wide statistics. NI is a &quot;stability flag&quot;
that indicates if the underlying network organization is stable, which is a good
indicator of the general accuracy of the statistics. To calculate NI, each
update from a neighborhood of nodes must include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The number of nodes that may not be included in this update&lt;/li&gt;
&lt;li&gt;The number of nodes that may be double counted&lt;/li&gt;
&lt;li&gt;The total number of nodes&lt;/li&gt;
&lt;/ul&gt;
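
&lt;p&gt;One way to picture these three fields is as a small record attached to
each update. The sketch below is an illustration only: the field names are
hypothetical, and the bounds calculation is a naive reading of how the fields
could qualify a reported count, not the formula from the cited paper.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class NetworkImprecision:
    """The three values listed above, under hypothetical field names."""
    total_nodes: int
    maybe_missing: int     # nodes that may not be included in the update
    maybe_duplicated: int  # nodes that may be double counted

    def is_stable(self):
        # Treat the network as stable only when nothing is in doubt.
        return self.maybe_missing == 0 and self.maybe_duplicated == 0

    def count_bounds(self, reported_count):
        """Naive range for a true count given the reported one."""
        low = max(0, reported_count - self.maybe_duplicated)
        high = min(self.total_nodes, reported_count + self.maybe_missing)
        return low, high


ni = NetworkImprecision(total_nodes=100, maybe_missing=5, maybe_duplicated=2)
print(ni.is_stable(), ni.count_bounds(40))
```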


&lt;p&gt;This solution for scalability introduces its own scalability issue - the system
must report when nodes no longer are reachable, so an accurate accounting of the
number of active nodes in the system is required. This will not scale well to a
large peer-to-peer network without hierarchical organization and thus, we're back
where we started.&lt;/p&gt;

&lt;h4&gt;Manageability&lt;/h4&gt;

&lt;p&gt;Manageability deals with both the system's own automated organization and
that of the humans ultimately consuming the monitoring data. The management
overhead must scale slowly with the number of nodes for the system to remain
useful.&lt;/p&gt;

&lt;p&gt;The management and monitoring tools of large distributed systems have the
potential to become more complicated than the applications themselves.
Components of the system can be grouped into organizations and split up work in
a federated style to alleviate the management stress. The Domain Name System
(DNS), for example, operates across thousands of administrative and technical
domains by leaving many decisions up to the local administrators.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://portal.acm.org/citation.cfm?id=762485&quot;&gt;Astrolabe&lt;/a&gt; project proposed achieving scalability
through a hierarchy of zones, each zone consisting of one or more nodes. The
zone summarizes statistics into fields of a bounded size - e.g. a count of
nodes with a certain property, not a list of their names. The system provides
eventual consistency of these aggregate values, which is likely sufficient for
many distributed monitoring tasks (especially considering the length of time it
takes any action to propagate through an extremely large and diverse network).&lt;/p&gt;

&lt;h2&gt;State of the Art&lt;/h2&gt;

&lt;p&gt;As data center operations matured, monitoring coalesced around a few
state-of-the-art tools. This section describes some of the design decisions of these
tools.&lt;/p&gt;

&lt;h3&gt;Storage&lt;/h3&gt;

&lt;p&gt;Any monitoring activity can generate a significant amount of data, given enough
nodes and metrics. Even a single metric, measured often, can quickly cripple
flat-file storage and traditional databases.&lt;/p&gt;

&lt;p&gt;As an evolution from flat-file storage, the designers of
&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1037150.1037153&quot;&gt;CARD&lt;/a&gt; chose to use a standard SQL relational database. The
motivation was the existing robust query language (SQL), and the ability to
modify table definitions after creation. The developers were satisfied with
their choice, but there are a few issues for a larger scale system. Monitoring
data is not especially relational and could better fit in a database intended
for simpler data models. A database with map-reduce style querying could also
provide a more natural interface. In fact, the designers had to add custom SQL
syntax to allow flexible enough querying.&lt;/p&gt;

&lt;p&gt;Every monitoring task is going to require this flexibility in data schema and
query capability. A database specialized for such flexibility, specifically one
of the newer schema-less databases like &lt;a href=&quot;http://www.mongodb.org/&quot;&gt;MongoDB&lt;/a&gt; or &lt;a href=&quot;http://redis.io/&quot;&gt;Redis&lt;/a&gt;, is likely a
better fit. Non-traditional databases are not new to monitoring - by far the
most popular choice, used by both Ganglia and &lt;a href=&quot;http://collectd.org/&quot;&gt;collectd&lt;/a&gt;, is &lt;a href=&quot;http://www.mrtg.org/rrdtool/&quot;&gt;RRDtool&lt;/a&gt;.
RRDtool is a circular database designed specifically for time slice data like
monitoring statistics. The operator configures a maximum database size and older
data is automatically overwritten in a round-robin fashion to maintain this
size.&lt;/p&gt;
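
&lt;p&gt;The round-robin idea can be sketched in a few lines. This is not RRDtool's
file format or API, only an illustration of a fixed-size store in which new
samples overwrite the oldest ones; the buffer size and sample values are
arbitrary:&lt;/p&gt;

```python
from collections import deque

# A fixed-size buffer: once full, each new sample evicts the oldest one.
samples = deque(maxlen=4)
for value in [10, 12, 11, 15, 17, 16]:
    samples.append(value)

print(list(samples))  # [11, 15, 17, 16]
```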

&lt;p&gt;Operators who wish to archive such data must work around its cyclical nature by
moving weekly or monthly aggregate data to other RRDtool databases, and again to
yearly and beyond. These quirks (and its rather slow performance when drawing
graphs) make it a sufficient but less than perfect choice.&lt;/p&gt;

&lt;p&gt;Finally, CARD only scaled to a few hundred nodes. Because of its scale-up
(rather than scale-out) design, a relational database would likely have a
difficult time keeping up if the system were to grow into tens of thousands of
nodes.&lt;/p&gt;

&lt;h3&gt;Collection&lt;/h3&gt;

&lt;p&gt;The method a system uses for collecting monitored statistics can have a huge
impact on its scalability; the two ends of the spectrum are push and pull.&lt;/p&gt;

&lt;h3&gt;Pull&lt;/h3&gt;

&lt;p&gt;A pull approach requires the sink to explicitly request each desired data point
from the nodes in the system. The sink must poll the nodes at some regular
interval if continuous updates are desired. This requires central coordination
and registration of all nodes in the system, which is often infeasible,
especially in loosely structured peer-to-peer networks.&lt;/p&gt;

&lt;p&gt;If the time it takes the sink to cycle through all of the nodes exceeds the
polling interval, the interval must be increased (thus losing freshness).
Parallel querying can improve this, but only up to a certain point, after which
the pull model runs into an issue shared with the push model. Regardless of how
often polling can occur, it is likely that much of the data is duplicated or
unchanged. This results in more network traffic than is necessary for freshness.&lt;/p&gt;

&lt;p&gt;The pull method is used by the open source monitoring tool Munin, which
periodically polls pre-registered nodes to retrieve updates.&lt;/p&gt;

&lt;h3&gt;Push&lt;/h3&gt;

&lt;p&gt;A push approach places the burden of sending updates on the end nodes - either
sporadically or at a regular interval, they send their updates to the sink
identified by a known address. One problem with this approach is management and
configuration. The update frequency and sink address are potentially difficult
to change as they are distributed among many nodes (possibly not controlled by
the operator). The use of a long-lived name and DNS can alleviate address
changes, but updating any other settings will require that the nodes
occasionally contact a configuration server to synchronize.&lt;/p&gt;

&lt;p&gt;The scaling challenges of the push model are very similar to those of running a
large web application - the sink must be able to handle a high throughput of
requests, most of them very small write operations. The write-heavy workload
is somewhat unusual, but not especially challenging for today's databases.&lt;/p&gt;

&lt;p&gt;The push method is used by two prominent monitoring tools in the industry,
Ganglia and collectd.&lt;/p&gt;

&lt;h3&gt;Hybrid&lt;/h3&gt;

&lt;p&gt;A hybrid push/pull gathering style can potentially minimize the duplication of
data, maximize freshness and reduce network traffic. From cold start, the sink
sends a request for data to each node with an associated count $c$. The node
performs as in the push model $c$ times, at which point the sink refreshes the
query for another period. This allows occasional configuration changes to happen
more naturally, while not introducing constant network overhead
(&lt;a href=&quot;http://portal.acm.org/citation.cfm?id=1037150.1037153&quot;&gt;source&lt;/a&gt;). The nodes should self-register as in the push
model.&lt;/p&gt;
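
&lt;p&gt;A minimal sketch of the counting behaviour, assuming an in-memory list
stands in for the sink and method calls stand in for network messages (the
class and method names are hypothetical):&lt;/p&gt;

```python
class Node:
    """Pushes updates only while it holds a budget granted by the sink."""

    def __init__(self):
        self.remaining = 0

    def grant(self, count):
        # The sink's request: "send me your next `count` updates".
        self.remaining = count

    def tick(self, value, sink):
        # Called once per measurement interval.
        if self.remaining == 0:
            return False  # stay quiet until the sink asks again
        sink.append(value)
        self.remaining -= 1
        return True


sink = []
node = Node()
node.grant(3)  # c = 3
sent = [node.tick(value, sink) for value in range(5)]
print(sent)  # [True, True, True, False, False]
print(sink)  # [0, 1, 2]
```

&lt;p&gt;A new &lt;code&gt;grant&lt;/code&gt; call is also the natural place for the sink to
deliver changed settings, since every node hears from it once per period.&lt;/p&gt;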

&lt;p&gt;At the time of writing this report, there were no widely used systems in the
industry that use this hybrid approach.&lt;/p&gt;

&lt;h3&gt;Similarities to Civil Infrastructure&lt;/h3&gt;

&lt;p&gt;Interestingly, the shift in data centers to clusters of low to moderately
powerful nodes brings the computing world more in line with the monitoring
situation in civil infrastructure. Civil engineers and government organizations
in charge of projects such as roads, bridges, oil &amp;amp; gas pipelines and waterways
have been struggling with monitoring some of the earliest distributed systems.
These are not distributed in the familiar computing sense; they are often
entirely offline and unpowered. For decades, their data has been gathered (often
inconsistently) by hand. The engineers tasked with accounting for trillions of
dollars of public assets and physical systems are dealing with what could be
viewed as widespread network unreliability and wholly unreliable nodes.&lt;/p&gt;

&lt;p&gt;The increasing affordability of small, accurate, low-power sensors greatly
interests civil engineers and infrastructure planners as the promise of
accurate, fresh data is now realistic. Nearly all new construction comes
complete with a range of sensors and a suite of software for monitoring and
analyzing the current (and predicted future) state of the piece of
infrastructure.&lt;/p&gt;

&lt;p&gt;Modern &lt;a href=&quot;http://ieeexplore.ieee.org/xpl/freeabs_all.jsp?arnumber=4563333&quot;&gt;research&lt;/a&gt; prefers wireless sensors over wired. These systems
are harder to disrupt, which is of greater concern when the system is out in the
open and many network hops away from the central office. A widely distributed
wired network involves a significant amount of extra infrastructure, and damage
to the infrastructure being monitored often implies damage to the monitoring
system itself. Wireless systems have the advantage of being easier to deploy, as
they can self-organize into ad-hoc networks as long as they are within range of
another sensor node. Connectivity problems also tend to be isolated to
individual units, and are easier to troubleshoot.&lt;/p&gt;

&lt;p&gt;Recent projects have taken a page (knowingly or not) from tools like Ganglia and
now use a unified data format, regardless of sensor or data type. For example, a
single update could consist of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Type of data (1 byte)&lt;/li&gt;
&lt;li&gt;Geographic coordinates, determined via GPS or inferred via signal
 strength of other nodes&lt;/li&gt;
&lt;li&gt;Network address&lt;/li&gt;
&lt;li&gt;Actual data (4 bytes)&lt;/li&gt;
&lt;/ul&gt;
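
&lt;p&gt;One possible fixed-width encoding of such an update is sketched below. Only
the 1-byte type and 4-byte data sizes come from the list above; the coordinate
encoding (two 4-byte floats), the IPv4 address and the signed-integer reading
are assumptions made for the example:&lt;/p&gt;

```python
import socket
import struct

# Assumed layout, network byte order, no padding:
# type (1 byte) | latitude, longitude (2 x 4-byte float) |
# IPv4 address (4 bytes) | reading (4-byte signed integer)
UPDATE = struct.Struct("!B2f4si")

packet = UPDATE.pack(7, 40.5, -80.0, socket.inet_aton("10.0.0.5"), 1234)
print(len(packet))  # 17

kind, lat, lon, addr, reading = UPDATE.unpack(packet)
print(kind, lat, lon, socket.inet_ntoa(addr), reading)  # 7 40.5 -80.0 10.0.0.5 1234
```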


&lt;p&gt;These monitoring networks self-organize into a dynamic hierarchy of nodes based
on their placement and capabilities. There are generally three types of nodes
deployed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Basic sensor node&lt;/li&gt;
&lt;li&gt;Communication relay node - collects data in its 1- or 2-hop
 neighborhood&lt;/li&gt;
&lt;li&gt;Data discharge node - forwards results to the Network Control Center,
 i.e. the one with a connection to the Internet&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Whereas in computer system monitoring, each node generally has similar
capabilities for communication and sensing, the roles of these nodes are bound by
their physical capabilities. Hierarchical organization becomes a simpler problem
of guaranteeing a wide enough dispersal of communication relay and data
discharge nodes to reach all of the leaves of the tree, compared to the somewhat
arbitrary trees made in computer networks.&lt;/p&gt;

&lt;h2&gt;Astral, a Testbed&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/distributed-monitoring/images/node.png&quot; alt=&quot;Node&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The major components of the central web application sink and a node in the
Astral network. Nodes communicate with one another via HTTP using an embedded
web server (the Tornado Python package). The web application that accepts
statistics updates is written in Ruby using the Sinatra web framework. The
system component for actually streaming video is completely separate - this is
based around Adobe Flash's Real-Time Messaging Protocol (&lt;a href=&quot;http://www.adobe.com/devnet/rtmp.html&quot;&gt;RTMP&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Astral is a peer-to-peer content distribution network specifically built for
live, streaming media. Without IP multicast, if content producers want to stream
video of live events to users, they are forced to create a separate feed for
each user. A peer-to-peer approach is more efficient and offloads much of the
work from the origin servers to the edge nodes of the network.&lt;/p&gt;

&lt;p&gt;Astral is built on the premise of having knowledge of a virtual overlay network
of streaming clients. The system bootstraps itself and obtains this knowledge
automatically through messaging among nodes and (to a limited extent) an origin
web server. The nodes communicate with HTTP using an embedded web application
running on each. They use simple JSON messages over the wire with a standard
format for statistics. Each node runs a Python background process and the user
sends control messages to it from the browser via simple HTTP requests in
JavaScript.&lt;/p&gt;

&lt;h3&gt;Definitions&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;Stream - A real-time data stream either of a live video source or of
 a stored recording&lt;/li&gt;
&lt;li&gt;Node - A networked computer running the Astral client and connected
 to the Astral network. The node can be acting as a producer, consumer,
 seeder, or a combination.&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Goals&lt;/h3&gt;

&lt;p&gt;The primary motivation for Astral is the group project in Carnegie Mellon
University's &lt;a href=&quot;http://ece842.com&quot;&gt;18-842&lt;/a&gt; Distributed Systems class. A team of
four developers (including myself) designed and implemented the peer discovery &amp;amp;
organization protocol and streaming video service over the Spring 2011 semester.
My own development efforts were also focused on adding a statistics generating
and gathering component to the system for the purposes of this paper.&lt;/p&gt;

&lt;h3&gt;Challenges&lt;/h3&gt;

&lt;p&gt;In contrast to traditional file sharing peer-to-peer networks, Astral is
purpose-built to distribute live media. This prompted some interesting design
decisions; for instance, any client inside the network is guaranteed that what
they are looking for is widely available. A client's membership in the
network is itself a hint that it has data to distribute to its peers.&lt;/p&gt;

&lt;p&gt;Compared to a centralized distribution network, Astral's nodes must pay special
attention to reliability. Users expect a steady video stream, even if the
quality has to be occasionally reduced due to network congestion. In the
centralized architecture, content producers provide reliability by scaling out
with additional origin servers. In a peer-to-peer version, the departure of any
one client could have a rippling effect on its peers. Astral keeps multiple
streams open for the same content to increase robustness, similar to bonding
multiple network interface cards together into one IP address.&lt;/p&gt;

&lt;p&gt;Because of this resource duplication, the statistics component must take care to
deduplicate stream popularity statistics. Each count must be identified with a
unique node identifier to avoid counting the backup as well as the primary
stream as separate users.&lt;/p&gt;
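
&lt;p&gt;A sketch of that deduplication, assuming each open connection is reported
as a (node identifier, stream identifier) pair; the identifiers are invented
for the example:&lt;/p&gt;

```python
# One pair per open connection. Node "a" holds both a primary and a backup
# connection to stream "s1" and must be counted once.
connections = [("a", "s1"), ("a", "s1"), ("b", "s1"), ("c", "s2")]

viewers = {}
for node_id, stream_id in connections:
    viewers.setdefault(stream_id, set()).add(node_id)

counts = {stream: len(nodes) for stream, nodes in viewers.items()}
print(counts)  # {'s1': 2, 's2': 1}
```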

&lt;h3&gt;Statistics&lt;/h3&gt;

&lt;p&gt;The statistics monitored in the Astral network are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Current (deduplicated) number of nodes watching a stream&lt;/li&gt;
&lt;li&gt;Number of nodes acting as seeders for a stream&lt;/li&gt;
&lt;li&gt;Bitrate of the stream&lt;/li&gt;
&lt;li&gt;IP addresses of nodes watching a stream, for geographic visualization&lt;/li&gt;
&lt;li&gt;Network bandwidth of each node&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Hierarchical Aggregation&lt;/h3&gt;

&lt;p&gt;The peers in Astral self-organize into a shallow hierarchy at startup, by going
through this process:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Query a central web application for a bootstrap list of supernodes&lt;/li&gt;
&lt;li&gt;Determine the round trip time to each supernode as a heuristic to find
 the closest&lt;/li&gt;
&lt;li&gt;Register with the supernode - the new node will be attached to this
 supernode for its lifetime. The relationship is stored persistently and
 survives peer restarts.&lt;/li&gt;
&lt;/ul&gt;
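
&lt;p&gt;The selection step can be sketched as follows, assuming the round trip
times have already been measured; the host names and timings are
hypothetical:&lt;/p&gt;

```python
# Round trip times in milliseconds, as a node might measure them against
# the bootstrap list of supernodes.
rtt_ms = {
    "super-a.example": 48.0,
    "super-b.example": 12.5,
    "super-c.example": 97.3,
}

# The supernode with the lowest round trip time is treated as the closest.
closest = min(rtt_ms, key=rtt_ms.get)
print(closest)  # super-b.example
```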


&lt;p&gt;All statistics updates from peers are sent directly to their parent supernode,
so beyond the initial bootstrapping step (which in total occurs only once per
node) there is no load on the central server from individual peers.&lt;/p&gt;

&lt;p&gt;The number of peers managed by a supernode is proportional to its available
processing capacity, uptime and bandwidth. Long-lived peers are preferred as
supernodes to avoid requiring the re-registration of every child
node. Each supernode lessens the load on the sink linearly
with the number of child nodes registered with it.&lt;/p&gt;

&lt;h3&gt;Arithmetic Filtering&lt;/h3&gt;

&lt;p&gt;Astral performs limited arithmetic filtering for the stream viewer statistics.
When a peer first requests a stream, that information is propagated back to the
sink through a supernode. Once it is receiving the stream, the peer sends a heartbeat
every 5 seconds to its parent supernode. The supernode does not propagate this
back to the sink; the sink simply assumes that the peer continues to watch the
stream. When a peer leaves the network (either notifying the supernode during the
proper shutdown procedure or as detected at the supernode by missed
heartbeats), the supernode notifies the sink of the change in the peer's status.&lt;/p&gt;

&lt;p&gt;This filtering minimizes the number of updates making it all the way back to the
sink, but keeps the data as fresh as possible with what are essentially
invalidation callbacks (à la &lt;a href=&quot;http://dblp.uni-trier.de/db/conf/usenix/usenix_wi88.html#Howard88&quot;&gt;AFS&lt;/a&gt;). Without this filtering, the sink would
have to manage the heartbeats, which could quickly overwhelm the server.&lt;/p&gt;
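
&lt;p&gt;The following sketch shows the filtering from the supernode's point of view.
It is not Astral's code: the class is hypothetical, and the timeout of three
missed heartbeats is an assumption, since only the 5 second heartbeat interval
is specified above.&lt;/p&gt;

```python
HEARTBEAT_INTERVAL = 5             # seconds, as in the text
TIMEOUT = 3 * HEARTBEAT_INTERVAL   # assumption: three missed heartbeats


class Supernode:
    def __init__(self):
        self.last_seen = {}   # peer id -> time of the latest heartbeat
        self.to_sink = []     # the only messages forwarded to the sink

    def heartbeat(self, peer, now):
        if peer not in self.last_seen:
            self.to_sink.append(("joined", peer))
        self.last_seen[peer] = now   # absorbed here, never forwarded

    def expire(self, now):
        for peer, seen in list(self.last_seen.items()):
            if now - seen > TIMEOUT:
                del self.last_seen[peer]
                self.to_sink.append(("left", peer))


supernode = Supernode()
for t in range(0, 60, HEARTBEAT_INTERVAL):   # 12 heartbeats from one peer
    supernode.heartbeat("p1", t)
supernode.expire(now=80)                     # p1 was last seen at t=55
print(supernode.to_sink)  # [('joined', 'p1'), ('left', 'p1')]
```

&lt;p&gt;Twelve heartbeats reach the supernode, but only two messages travel on to
the sink.&lt;/p&gt;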

&lt;h3&gt;Temporal Batching&lt;/h3&gt;

&lt;p&gt;The stream provider is generally interested in the average bitrate of video
received by the clients, but this information is not required to be completely
fresh. Even after the live stream is concluded, this information is useful for
provisioning network bandwidth in the future.&lt;/p&gt;

&lt;p&gt;Astral takes advantage of this by batching 5 seconds of video bitrate
statistics and returning only an average of these values to the sink. The
batching is done at the level of individual peers, and in the future could also
be performed at each supernode to further diminish the number of updates.&lt;/p&gt;
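
&lt;p&gt;A minimal sketch of the batching step, assuming one bitrate sample per
second; the function name and the sample values are illustrative:&lt;/p&gt;

```python
def batch_averages(samples, window=5):
    """Average each run of `window` consecutive samples into one update."""
    averages = []
    for start in range(0, len(samples), window):
        chunk = samples[start:start + window]
        averages.append(sum(chunk) / len(chunk))
    return averages


# Ten seconds of bitrate readings (kbit/s) become two updates for the sink.
bitrates = [800, 820, 790, 810, 780, 600, 620, 610, 590, 580]
print(batch_averages(bitrates))  # [800.0, 600.0]
```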

&lt;h2&gt;Evaluation&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;The effects of hierarchy depth and batch window delay on the overall number of
update requests in a 100,000 node cluster. The cells contain the total number of
requests per second received by the sink.&lt;/em&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
    &lt;tr&gt;
        &lt;td&gt;Depth / Batch Window Size&lt;/td&gt;
        &lt;td&gt;0&lt;/td&gt;
        &lt;td&gt;5s&lt;/td&gt;
        &lt;td&gt;10s&lt;/td&gt;
        &lt;td&gt;30s&lt;/td&gt;
    &lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
    &lt;tr&gt;
        &lt;td&gt;One Level (Baseline Centralized)&lt;/td&gt;
        &lt;td&gt;100,000&lt;/td&gt;
        &lt;td&gt;20,000&lt;/td&gt;
        &lt;td&gt;10,000&lt;/td&gt;
        &lt;td&gt;3,333&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Two Level (1000 supernodes)&lt;/td&gt;
        &lt;td&gt;1,000&lt;/td&gt;
        &lt;td&gt;200&lt;/td&gt;
        &lt;td&gt;100&lt;/td&gt;
        &lt;td&gt;33&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Three Level (10 + 1000 Supernodes)&lt;/td&gt;
        &lt;td&gt;10&lt;/td&gt;
        &lt;td&gt;2&lt;/td&gt;
        &lt;td&gt;1&lt;/td&gt;
        &lt;td&gt;0.33&lt;/td&gt;
    &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;


&lt;p&gt;&lt;em&gt;The effects of hierarchy depth and batch window delay on the worst case
freshness in a 100,000 node cluster. The cells contain the worst possible update
delay in seconds.&lt;/em&gt;&lt;/p&gt;

&lt;table&gt;
&lt;thead&gt;
    &lt;tr&gt;
        &lt;td&gt;Depth / Batch Window Size&lt;/td&gt;
        &lt;td&gt;0&lt;/td&gt;
        &lt;td&gt;5s&lt;/td&gt;
        &lt;td&gt;10s&lt;/td&gt;
        &lt;td&gt;30s&lt;/td&gt;
    &lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
    &lt;tr&gt;
        &lt;td&gt;One Level (Baseline Centralized)&lt;/td&gt;
        &lt;td&gt;0&lt;/td&gt;
        &lt;td&gt;5s&lt;/td&gt;
        &lt;td&gt;10s&lt;/td&gt;
        &lt;td&gt;30s&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Two Level (1000 supernodes)&lt;/td&gt;
        &lt;td&gt;0 + 2 hops&lt;/td&gt;
        &lt;td&gt;10s&lt;/td&gt;
        &lt;td&gt;20s&lt;/td&gt;
        &lt;td&gt;60s&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
        &lt;td&gt;Three Level (10 + 1000 Supernodes)&lt;/td&gt;
        &lt;td&gt;0 + 3 hops&lt;/td&gt;
        &lt;td&gt;15s&lt;/td&gt;
        &lt;td&gt;30s&lt;/td&gt;
        &lt;td&gt;90s&lt;/td&gt;
    &lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;


&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/distributed-monitoring/images/graph1.png&quot; alt=&quot;Graph 1&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This graph relates the total number of nodes being monitored with the total
number of requests per second that the sink cluster must be able to handle. The
vertical axis is logarithmic to accommodate the large range of request rates.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/distributed-monitoring/images/graph2.png&quot; alt=&quot;Graph 2&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This graph relates the total number of nodes being monitored with the total
number of sink servers required to process them, assuming a maximum average of
500 requests per second per sink. The vertical axis is logarithmic to
accommodate the large range of request rates.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/distributed-monitoring/images/graph3.png&quot; alt=&quot;Graph 3&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This graph relates the batch window size (in seconds) with the total number of
requests per second, on average, that the sink cluster must be able to handle.
The vertical axis is logarithmic to accommodate the large range of request
rates.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Astral is obviously a very young project, and any quantitative analysis at
this point is likely to change quite a bit. However, we can do some basic
comparisons between a baseline, completely centralized monitoring system and one
with the various improvements discussed. These are currently mathematical
projections, with the goal of performing practical tests when Astral's
development settles.&lt;/p&gt;

&lt;h3&gt;Baseline Centralized&lt;/h3&gt;

&lt;p&gt;With a push-based collection style, the limits of a sink are very
similar to those of a modern web application. The most common bottlenecks are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Throughput &amp;amp; concurrency capabilities of the front-end web server&lt;/li&gt;
&lt;li&gt;Throughput of the application server&lt;/li&gt;
&lt;li&gt;Performance of web application logic&lt;/li&gt;
&lt;li&gt;Database write throughput&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;A state-of-the-art web application stack geared towards a write-heavy workload
could consist of:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://nginx.org/&quot;&gt;Nginx&lt;/a&gt; Web Server as the point of entry for requests&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.modrails.com/&quot;&gt;Phusion Passenger&lt;/a&gt; Ruby application server running 4+ concurrent
 OS threads&lt;/li&gt;
&lt;li&gt;Ruby web application&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://redis.io/&quot;&gt;Redis&lt;/a&gt;, a high-performance key-value store&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The central web application component of Astral runs on this basic stack,
deployed at the moment on the &lt;a href=&quot;http://www.heroku.com/&quot;&gt;Heroku&lt;/a&gt; platform. On an Intel Core 2 2.2GHz
processor laptop with 4GB of RAM, the Redis database performs at an average of
40,000 set (i.e. write) operations per second (determined with the
&lt;code&gt;redis-benchmark&lt;/code&gt; tool, distributed with the Redis server package). A Ruby
application in front of the database can serve an average of 500 requests per
second (and re-implementing the core statistics API in Java or Scala could
increase that further) (&lt;a href=&quot;http://www.rubyenterpriseedition.com/comparisons.html&quot;&gt;source&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;In order to have fresh information within 1 second, and assuming an average of 1
update per second from each node, the sink must be able to handle $n$ requests
per second, where $n$ is the number of nodes in the system. With a cluster of
four application servers and one Redis database, for example, the sink could
handle updates from a 2,000 node system on average. The statistics in the graphs
in this section for a 1-level hierarchy correspond to this baseline
centralized case.&lt;/p&gt;

&lt;h3&gt;Optimized Distributed&lt;/h3&gt;

&lt;p&gt;A much larger scale system such as one for streaming the presidential
inauguration (during which in 2009, CNN served 1.3 million concurrent streams at
the peak) isn't feasible with this linear scaling factor. To handle that many
nodes with a centralized sink would require a 2,600 server cluster just for
monitoring.&lt;/p&gt;
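
&lt;p&gt;The arithmetic behind these figures, using the 500 requests per second per
application server quoted earlier and one update per node per second (the
function name is just for illustration):&lt;/p&gt;

```python
import math

REQUESTS_PER_SERVER = 500  # per second, the figure quoted for the Ruby application


def servers_needed(nodes, updates_per_node=1.0):
    """Application servers required if every update reaches the sink."""
    return math.ceil(nodes * updates_per_node / REQUESTS_PER_SERVER)


print(4 * REQUESTS_PER_SERVER)    # 2000 nodes for a four-server sink
print(servers_needed(1_300_000))  # 2600
```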

&lt;p&gt;The primary goal of the optimizations discussed in this paper is to lower the
rate of updates from each node. Temporal batching like Astral's 5 second buffer
lowers the rate by a factor of 5. Hierarchical aggregation lowers it by a factor
proportional to the fanout of the supernodes. The effect of arithmetic filtering
is more difficult to determine, as it depends on the length of time each node is
connected. The longer a node is connected, the more the cost of the single
connection request required is amortized. Short-lived clients will cost no more
than with a non-filtered approach, and long-lived clients will avoid potentially
thousands of requests over a one hour live event.&lt;/p&gt;

&lt;p&gt;The first table in this section illustrates the effects of both tree depth and batching
window size on the total number of update requests received by the sink.
Something to keep in mind when adding levels to the hierarchy is the load on
supernodes. If these are regular peers in the network, they may not be able to
sustain a high rate of requests without impacting the user experience. These
projections assume the supernodes are capable of an average of 100 requests per
second, which is probably a bit high.&lt;/p&gt;
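
&lt;p&gt;The request rates in the table are consistent with a very simple model,
sketched here under the assumption that every sender directly below the sink
emits one update per second without batching and one update per window with
it. The function name is hypothetical.&lt;/p&gt;

```python
def sink_rate(top_level_senders, window):
    """Requests per second at the sink; window in seconds (0 = no batching)."""
    return top_level_senders / max(window, 1)


rows = [("one level", 100_000), ("two level", 1_000), ("three level", 10)]
for label, senders in rows:
    rates = [round(sink_rate(senders, window), 2) for window in (0, 5, 10, 30)]
    print(label, rates)
```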

&lt;p&gt;The second table illustrates the worst case delay experienced due to batching.
These projections assume that batching is done at every level (both supernodes
and regular nodes). If it only occurs on individual nodes, the delay is never
worse than that in a single level hierarchy (plus a negligible amount for the
increased number of hops).&lt;/p&gt;

&lt;p&gt;The delay from each of these optimizations can be predicted with some
confidence. With a two-level hierarchy (a level of supernodes with regular nodes
beneath each), the delay for batched updates is equal to the batching duration of
each node. A 5 second window avoids a significant number of requests but doesn't
significantly delay the data. Other monitoring tasks might be satisfied with
even longer delays, up to minutes (note that temporal batching could also be
incorporated into a simple centralized collection architecture, with the same
benefits). If the supernodes perform their own temporal batching, or there is a
deeper hierarchy with additional batch windows, the worst case delay is only the
sum of the batching windows along the height of the tree. Delay in one branch
has no effect on the freshness of data from another.&lt;/p&gt;
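
&lt;p&gt;As a small worked example of that sum, assuming one batch window per level
on the path from a leaf node to the sink:&lt;/p&gt;

```python
def worst_case_delay(windows):
    """Sum of the batch windows (in seconds) between a leaf and the sink."""
    return sum(windows)


print(worst_case_delay([5]))           # 5: one level, 5 second window
print(worst_case_delay([5, 5]))        # 10: two levels
print(worst_case_delay([30, 30, 30]))  # 90: three levels, 30 second windows
```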

&lt;p&gt;With a two-level hierarchy and a 5 second batching
window on average (i.e. some updates may be delivered closer to real-time than others,
based on the operator's needs), the same four-server sink cluster from the
baseline centralized example could handle 1 million nodes (up from 2,000).&lt;/p&gt;
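
&lt;p&gt;One set of assumptions that reproduces the 1 million figure is sketched
below. The fanout of 100 nodes per supernode is an assumption chosen for the
example; the sink capacity and window come from the text.&lt;/p&gt;

```python
SINK_CAPACITY = 4 * 500  # requests per second for the four-server cluster
WINDOW = 5               # seconds between updates from each supernode
FANOUT = 100             # assumed number of nodes under each supernode

# Each supernode sends one request every WINDOW seconds, so the sink can
# serve SINK_CAPACITY * WINDOW supernodes.
supernodes = SINK_CAPACITY * WINDOW
print(supernodes)           # 10000
print(supernodes * FANOUT)  # 1000000
```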

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;We have explored how many types of distributed systems are converging around
very similar monitoring challenges, including traditional data center
environments, civil infrastructure and peer-to-peer networks. The modern best
practices described in this paper for scaling up a monitoring system to
thousands of nodes come from the literature and existing systems in many fields
and the experience of implementing Astral, a peer-to-peer content distribution
network implemented in part to test out these ideas.&lt;/p&gt;

&lt;p&gt;Astral as it stands is an incomplete system, and additional work is required
before it is production-ready. Its performance and exact approach to monitoring
will likely change. The quantitative evaluation of Astral can be extended in
future work to determine the optimal values for configurable parameters such as
the supernode fanout and batch window.&lt;/p&gt;

&lt;p&gt;Monitoring these large systems is an increasingly large task, one that cannot
continue to take lower priority than other application features.
Applications stand to benefit from decreased overhead if monitoring can be
worked into the system early on in its development, and developers are
encouraged to plan the accessible views into their systems as early as possible.&lt;/p&gt;

&lt;h3&gt;Source Code&lt;/h3&gt;

&lt;p&gt;Astral is available as an open source project on
&lt;a href=&quot;https://github.com/peplin/astral&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Acknowledgements&lt;/h3&gt;

&lt;p&gt;Thanks to the students in the &lt;a href=&quot;http://www.ece.cmu.edu/~ece845/&quot;&gt;18-845 course&lt;/a&gt;
of Spring 2011 at Carnegie Mellon University who kindly reviewed this paper and
offered their feedback. Thanks also to Professor &lt;a href=&quot;http://www.cs.cmu.edu/~droh/&quot;&gt;David
O'Hallaron&lt;/a&gt; and Kushal Dalmia for their help with
research for the project over the semester.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Data Center Efficiency &amp; Renewable Energy</title>
   <link href="http://christopherpeplin.com/2011/05/data-center-electricity"/>
   <updated>2011-05-29T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/data-center-electricity</id>
   <content type="html">&lt;p&gt;&lt;em&gt;A &lt;a href=&quot;http://things.rhubarbtech.com/data-center-efficiency/dc-efficiency-report.pdf&quot;&gt;PDF&lt;/a&gt;
of this article is available&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Abstract&lt;/h2&gt;

&lt;p&gt;Cloud-based software applications have spurred a renewed
interest in thin clients - laptops, smartphones and tablet PCs. The resulting
increase in the demand for servers in data centers challenges the current
electrical grid, but provides an opportunity to pioneer renewable energy, demand
response and distributed generation. Organizational issues hinder immediate
improvement, but as the infrastructure costs of information technology firms
rise, cooperation with utilities will be an increasingly attractive
idea.&lt;/p&gt;

&lt;h2&gt;Implications of Thin Clients&lt;/h2&gt;

&lt;p&gt;The software as a service (SaaS) business model is relatively new for
application developers and IT firms. Instead of selling packaged software to
customers to run on their local computer hardware, SaaS companies provide access
to their applications exclusively via the Internet. Any computation required is
done by the vendor, on servers they provide. As broadband Internet
connection speeds reach more of the population, SaaS has become increasingly
popular, and the effect is clear in the hardware purchasing decisions of both
businesses and consumers. After years of increasing power in home computers,
SaaS shifts the heavy lifting to servers.&lt;/p&gt;

&lt;p&gt;In a few ways, SaaS is a throwback to the push for &quot;thin clients&quot; in the
1990s. Thin clients were proposed as low-cost, energy efficient, underpowered
machines that would serve as a gateway to applications hosted and run by third
parties on Internet servers, or &quot;in the cloud&quot;. Whatever the reason the idea failed
to take off in the first iteration, the latest push has spurred laptop and
smartphone sales, and encouraged new devices like Apple's iPad. Their minimal
energy demands are even more attractive in today's energy-conscious society.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/data-center-efficiency/images/thin-power.png&quot; alt=&quot;Thin Power&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Thin clients use up to 24% less electricity than desktop computers,
but some of that efficiency is lost in the increased demand for servers.
(&lt;a href=&quot;http://www.vxl.net/fckeditor/editor/filemanager/connectors/aspx/fckeditor/userfiles/file/VXL%20Green%20Benefits%20-Thin%20Clients.pdf&quot;&gt;source&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A consequence of the shift is a dramatic increase in the number of servers
required to support clients that previously ran their own applications.  The
type of hardware used for these servers is also quite different than in the last
decade. Whereas IBM and Sun Microsystems built their companies with large
mainframe servers, they are now sustained by sales and support of volume
servers. These are small, inexpensive servers built with commodity parts.
Instead of a massive mainframe (a single point of failure for a company), data
centers house thousands of rack mounted volume server replacements. A
&lt;a href=&quot;http://www.mendeley.com/research/smart-2020-enabling-the-low-carbon-economy-in-the-information-age/&quot;&gt;recent report&lt;/a&gt; found that &quot;[i]f growth continues in line with
demand, the world will be using 122 million servers in 2020, up from 18 million
today.&quot;&lt;/p&gt;

&lt;p&gt;An increase in the number of servers implies a change in the distribution of
demand for energy. Processing power is being concentrated in data centers,
instead of distributed evenly among all customers.  Unless new data centers are
strategically planned to optimize energy efficiency, they will burden IT firms
with increasing costs, utilities with demand spikes and the environment with
increasing emissions. Regulatory agencies, IT firms and public power utilities
can cooperate to find an optimal solution that maximizes profit, minimizes
emissions, and alleviates some of the problems with young renewable energy
sources.&lt;/p&gt;

&lt;h2&gt;Data Center Efficiency&lt;/h2&gt;

&lt;p&gt;The costs and demands of data centers can be first viewed at the level of an
individual deployment. The site infrastructure capital costs of a data center,
separate from any application development, alone account for two-thirds of its
IT costs (&lt;a href=&quot;http://www.youtube.com/watch?v=sOJoB38OxK0&quot;&gt;source&lt;/a&gt;). These costs, as well as emissions, are increasing
in step with the total number of servers in operation.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/data-center-efficiency/images/smart-datacenters.png&quot; alt=&quot;Smart Datacenters&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This diagram shows projected data center emissions. Volume servers represent
the fastest growing segment (&lt;a href=&quot;http://www.mendeley.com/research/smart-2020-enabling-the-low-carbon-economy-in-the-information-age/&quot;&gt;source&lt;/a&gt;).&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Volume servers are substantially more difficult to run efficiently, primarily
due to their number. Tracking the energy use of 5 high-end machines is a simpler
affair than tracking the use of thousands of volume servers.  It is also common
for a volume server to run with a small amount of load if it is dedicated to a
task with an uneven workload. Similar to power utilities overprovisioning for
reliability, servers are overprovisioned to meet their peak demand.
Unfortunately, a volume server at 20% load still uses 60-90% of its peak power
consumption. In the previous decade, an expensive mainframe would be made to
always run at optimal efficiency, as it represented a significant cost and
investment. A single inefficient volume server is easy to write off because of
the tiny impact, but the combined cost in a data center is high.&lt;/p&gt;
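
&lt;p&gt;To put rough numbers on that overhead, here is a minimal sketch of the
arithmetic. It assumes useful work scales linearly with load, which is a
simplification, and the function name &lt;code&gt;relativeEnergyPerUnitWork&lt;/code&gt; is
made up for illustration.&lt;/p&gt;

```javascript
// Hypothetical helper, not from any library: energy spent per unit of useful
// work, relative to the same server at full load (where the ratio is 1).
// Assumes work done is proportional to load.
function relativeEnergyPerUnitWork(loadFraction, powerFraction) {
    return powerFraction / loadFraction;
}

// Figures from the text: 20% load while drawing 60-90% of peak power.
var bestCase = relativeEnergyPerUnitWork(0.2, 0.6);
var worstCase = relativeEnergyPerUnitWork(0.2, 0.9);

console.log(bestCase.toFixed(1), worstCase.toFixed(1)); // prints: 3.0 4.5
```

&lt;p&gt;Under that assumption, a volume server at 20% load spends roughly three to
four and a half times as much energy per unit of work as the same server at
full load.&lt;/p&gt;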

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/data-center-efficiency/images/1u-cost.png&quot; alt=&quot;1u Cost&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;The total cost of ownership of a 1U or volume server is almost four times that
of the hardware itself and is still increasing, &lt;a href=&quot;http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf&quot;&gt;according to the EPA&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The high energy demands and fast growth of data centers prompted the United
States Congress to pass &lt;a href=&quot;http://www.energystar.gov/ia/products/downloads/Public_Law109-431.pdf&quot;&gt;Public Law 109-431&lt;/a&gt; requiring the Environmental
Protection Agency to report on the current state of energy efficiency in data
centers.  Among the EPA's findings is the surprising distribution of power
within a single data center. Another &lt;a href=&quot;http://www.mendeley.com/research/smart-2020-enabling-the-low-carbon-economy-in-the-information-age/&quot;&gt;recent report&lt;/a&gt; put data to this
fact, not spoken of much outside of IT circles:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;Only about half of the energy used by data centres powers the servers and
storage; the rest is needed to run back-up, uninterruptible power supplies (UPS)
(5%) and cooling systems (45%).&quot; (&lt;a href=&quot;http://www.mendeley.com/research/smart-2020-enabling-the-low-carbon-economy-in-the-information-age/&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Despite the large increase in the number of servers, their individually low
power requirements and strides in server energy efficiency mostly mitigate any
large increase in power demand. The true effect is the indirect power required
to cool the server rooms. This puts data centers more in line with industrial
loads, where demand is largely for reactive power. In more detail
(&lt;a href=&quot;http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf&quot;&gt;from the EPA&lt;/a&gt;),&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;If the power and cooling overhead needed to support the IT equipment are
factored in, only about half the power entering the data center is used by the
IT equipment. The rest is expended for power conversions, backup power, and
cooling. Peak power usage for data centers can range from tens of kilowatts for
a small facility, to tens of megawatts for the largest data centers.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Data centers often perform critical functions for businesses that are loath to
experiment with a system that is working well enough to finish the job. The
increased costs are bearable with the subsequent increase in profit due to
enhanced availability and features. The EPA found resistance to energy
efficiency widespread in IT businesses:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;With the increasing importance of digital information, data centers are critical
to businesses and government operations. Thus, data center operators are
particularly averse to making changes that might increase the risk of down time.
Energy efficiency is perceived as a change that, although attractive in
principle, is of uncertain value and therefore may not be worth the risk.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The EPA plans to introduce Energy Star ratings for data centers and
their equipment, allowing cost planners to better analyze the total cost of
ownership for a server. In combination with increased public pressure for
environmental efficiency, the efficiency of data centers should continue to
increase.&lt;/p&gt;

&lt;h3&gt;Virtualization&lt;/h3&gt;

&lt;p&gt;Even with efficiency improvements, naive data center software will hamper any
energy conservation. As consumers are encouraged to shut electrical
devices off when not in use, and to hunt down glowing lights and phantom power
in their living rooms, a server often burns for them 24/7 to provide on-demand
availability.&lt;/p&gt;

&lt;p&gt;Server virtualization is a new technique to avoid this symptom. Briefly,
virtualization allows many virtual servers to be spun up and down dynamically
across a smaller number of physical servers. One piece of hardware that was
previously leased by a small website can now be used to run 50 small websites
with no change for the customer. For example, at night when application activity
is low, the servers can be automatically shut down to save power. A small volume
website run on a single server can also be scaled up to hundreds in a few
seconds if there is a large traffic spike. Utilities such as
&lt;a href=&quot;http://www.pge.com/includes/docs/pdfs/mybusiness/energysavingsrebates/incentivesbyindustry/hightech/C-4166.pdf&quot;&gt;California's PG &amp;amp; E&lt;/a&gt; are offering monetary incentives to remove physical
hosts from demand and use virtualization instead. Virtualization allows data
centers to avoid the volume server efficiency problem discussed earlier. Virtual
servers can be spread across the smallest number of physical machines required,
each running at 100% load and thus maximum efficiency.&lt;/p&gt;
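
&lt;p&gt;As a rough illustration of that packing idea, the sketch below uses a
first-fit-decreasing heuristic. This is a standard bin-packing approximation,
not necessarily what any real virtualization scheduler does. The function name
&lt;code&gt;consolidate&lt;/code&gt; and the load figures are hypothetical; loads are
whole-number percentages of one physical host's capacity.&lt;/p&gt;

```javascript
// Hypothetical sketch: place virtual servers on as few physical hosts as a
// first-fit-decreasing heuristic can manage.
function consolidate(virtualLoads, hostCapacity) {
    var hosts = []; // each entry: { used: number, guests: [loads] }

    // Largest loads first, without mutating the caller's array.
    var sorted = virtualLoads.slice().sort(function (a, b) { return b - a; });

    sorted.forEach(function (load) {
        if (load > hostCapacity) {
            throw new Error('Load ' + load + ' does not fit on any host');
        }
        // some() stops at the first host with enough spare capacity.
        var placed = hosts.some(function (host) {
            if (hostCapacity - host.used >= load) {
                host.used += load;
                host.guests.push(load);
                return true;
            }
            return false;
        });
        if (!placed) {
            hosts.push({ used: load, guests: [load] });
        }
    });
    return hosts;
}

// Ten dedicated servers, each idling at 20% load, fit on two full hosts.
var tenServers = [20, 20, 20, 20, 20, 20, 20, 20, 20, 20];
var hosts = consolidate(tenServers, 100);
console.log(hosts.length); // prints: 2
```

&lt;p&gt;Using integer percentages sidesteps floating-point rounding when a host is
filled exactly to capacity.&lt;/p&gt;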

&lt;h2&gt;Location&lt;/h2&gt;

&lt;p&gt;Location selection is a critical decision for IT companies, regardless of power,
because of the sensitivity and security requirements of data and software
running inside. They also must consider the added network latency due to the
geographic distance between service and client. At the start of the new Internet
age, data centers were typically located close to metropolitan areas to provide
the lowest latency to their prime users. Unfortunately, this proved less than
ideal as the number of servers grew, according to the EPA:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;It is important to note that two of the cities with the highest concentration of
data centers — New York City and San Francisco — are geographically isolated
areas with relatively limited electricity transmission resources.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/data-center-efficiency/images/congestion.png&quot; alt=&quot;Transmission Congestion&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Some of the areas of high congestion concern noted by the U.S.
Department of Energy are also popular locations for data centers, which
contribute additional strain to the grid.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/data-center-efficiency/images/datacenter-eastcoast.png&quot; alt=&quot;East Coast&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;A high density strip of data centers mirrors the strip of transmission
congestion in the previous figure. (&lt;a href=&quot;http://www.datacentermap.com&quot;&gt;source&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The impact of data center location is less important than it was a decade ago.
IT firms are increasingly interconnecting their networks (peering) to avoid paying
transit costs to the Internet backbone providers, and to create direct links to
their customers. These changes, both policy wise and physically, have more
effect on latency than data center location. The latency within regions of
considerable size is comparable enough to allow more flexibility in locating a
data center.&lt;/p&gt;

&lt;p&gt;Unfortunately, data center locations are notoriously hard to analyze, as
companies are understandably protective of the information. The EPA found that
comprehensive data about the locations is not readily available because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Organizations are concerned about the physical security of these
  critical infrastructure facilities&lt;/li&gt;
&lt;li&gt;Private data centers are seen as a confidential, strategic asset&lt;/li&gt;
&lt;li&gt;Many data centers are part of larger commercial buildings and campuses
  and therefore not separately identified or metered&lt;/li&gt;
&lt;li&gt;Data centers have only recently been seen by government agencies as
  important infrastructure and an indicator of economic activity&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;While increasing the efficiency of individual data centers reduces costs and
increases profits, the environmental impact may be negligible. A recent
&lt;a href=&quot;http://www.greenpeace.org/usa/en/media-center/reports/make-it-green-cloud-computing/&quot;&gt;Greenpeace report&lt;/a&gt; found that many new data centers are located next to
strong but non-renewable power sources.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;[...] efficiency by itself is not green if you are simply working to maximise
output from the cheapest and dirtiest energy source available. The US EPA will
soon be expanding its EnergyStar rating system to apply to data centres, but
similarly does not factor in the fuel source being used to power the data centre
in its rating criteria.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Acting individually, data center operators are encountering the same issue as
consumers regarding renewable power sources. The electricity market and
Kirchhoff's laws dictate that with a general connection to the grid, it cannot be
definitively said where the power originated. Data centers must accept a mix
of clean and dirty energy sources.&lt;/p&gt;

&lt;p&gt;Data centers are typically located near baseload, reliable sources of power.
They are increasingly being built outside of metropolitan areas partly in
response to the congestion encountered with the earliest data centers. The lower
costs of land and construction also contribute to the shift. As long as power and
latency requirements are met at an adequate cost level, the location of a data
center is relatively flexible. This makes data centers a good candidate for pairing with power
sources that are cost effective only in certain regions. An ongoing issue with
wind power is that the wind blows strongest in the central United States,
exactly where the demand is the lowest and most spread out. Utilities could work
with IT firms to move the load, not the generation.&lt;/p&gt;

&lt;h2&gt;Common Goals Among Utilities &amp;amp; ITC&lt;/h2&gt;

&lt;p&gt;The effect of data center growth and location on the power grid and public
utilities was one of the items set to be investigated by
&lt;a href=&quot;http://www.energystar.gov/ia/products/downloads/Public_Law109-431.pdf&quot;&gt;Public Law 109-431&lt;/a&gt;. The EPA was required to analyze &quot;[...] the potential
cost savings and benefits to the energy supply chain through increasing the
energy efficiency of data centers and servers, including reducing demand,
enhancing capacity, and reducing strain on existing grid infrastructure [...].&quot;&lt;/p&gt;

&lt;p&gt;The EPA suggested that electric utilities &quot;consider offering incentives for
energy-efficient data center facilities and equipment.&quot;
Data centers represent a significant portion of the load in the United States,
so having an open communications channel between the two could benefit both
parties. According to the EPA,&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;Nationwide, the energy use estimate for data centers translates
into a peak load of more than 7 GW in 2007 (equivalent to the output of about 15
baseload power plants), growing to about 12 GW if current growth trends
continue.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Many data centers are located near metropolitan areas and contribute to the
transmission congestion problems threatening the United States. According to the
EPA, &quot;the peak-load reductions achievable through energy efficiency improvements
could play a significant role in relieving capacity constraints in these grids.&quot;&lt;/p&gt;

&lt;p&gt;With server virtualization and given that most major IT firms have multiple data
centers among which work is distributed, the demand of any given location could
be made highly flexible. When the servers are operating continuously, the load
shape of a data center is currently very flat. Some of the efficiency
improvements may change that, including the ability to use outside air for
cooling and computational load distribution among multiple data centers.
Consider a data center handling exclusively non real-time data processing timed
to peak at night, located near a wind turbine installation.  A coordinated
effort between IT firms and power utilities could pair computation load profiles
with the best possible generator time profile.&lt;/p&gt;
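
&lt;p&gt;One way to picture that pairing is a scheduler that assigns deferrable batch
work to the hours with the highest forecast wind output. This is only a sketch:
the function name &lt;code&gt;pickBatchHours&lt;/code&gt; and the forecast values are
invented for illustration, and a real system would also have to weigh deadlines,
prices and grid constraints.&lt;/p&gt;

```javascript
// Hypothetical sketch: choose the hours with the highest forecast generator
// output for deferrable (non real-time) batch processing.
function pickBatchHours(forecast, hoursNeeded) {
    return forecast
        // Remember which hour each forecast value belongs to.
        .map(function (output, hour) { return { hour: hour, output: output }; })
        // Highest forecast output first.
        .sort(function (a, b) { return b.output - a.output; })
        .slice(0, hoursNeeded)
        .map(function (entry) { return entry.hour; })
        // Report the chosen hours in chronological order.
        .sort(function (a, b) { return a - b; });
}

// Made-up forecast for eight consecutive hours (arbitrary output units).
var windForecast = [5, 9, 2, 7, 1, 8, 3, 6];
console.log(pickBatchHours(windForecast, 3).join(', ')); // prints: 1, 3, 5
```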

&lt;p&gt;This flexibility, and the requirement of steady power for data centers, offer
two additional opportunities for cooperation - demand response and distributed
generation.&lt;/p&gt;

&lt;h3&gt;Demand Response&lt;/h3&gt;

&lt;p&gt;Due to their requirements for high reliability, data centers are already
equipped to resist short interruptions in power supply. The servers are
usually backed by either a large UPS (flywheel or large battery) for the
entire data center or individual batteries on each server.  With proper
incentives, the utilities could use data centers for demand response by
switching servers to their backup power source for short periods of time. The
EPA's investigation confirms the feasibility of such an idea:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;Although it is often assumed that data centers are not good candidates for load
management because of the critical function they perform, high-reliability data
centers are in fact designed to continue operating when the power grid is
unavailable — using on-site power generation and storage, which suggests that
they can also reduce the energy drawn from the grid at times of peak load.&quot;&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;There would be some additional technical work to implement such a system
reliably, as the backup equipment is not intended to be used regularly. The
failure rates and repair costs may change if the batteries are discharged often.&lt;/p&gt;

&lt;p&gt;These backup batteries also make data centers a good target for renewable power
sources like solar and wind, which both have inconsistent time profiles and
often generate the most power when it is least needed. Energy storage with
compressed air, elevated water or chemical batteries is a promising grid-level
solution, and with batteries already in place in data centers, the utility would
have smaller startup costs. Fluctuations in generation would matter even less if
this was used in combination with a local prime mover generator that was the
primary backup for the data center.&lt;/p&gt;

&lt;h3&gt;Distributed Generation&lt;/h3&gt;

&lt;p&gt;Some data centers have additional power generation on site for backup and also
to alleviate any voltage fluctuations on the grid. Distributed generation is
being discussed by utilities, but a large number of private generators are
already in operation at data centers. An intelligent connection to the grid is
required to bridge between the utilities and these local generators.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;On-site power generation, whether it is an engine, fuel cell, microturbine, or
other prime mover, supports the need for reliable power by protecting against
long-term outages beyond what the UPS and battery systems can provide. DG/CHP
systems that operate continuously provide additional reliability compared to
emergency backup generators that must be started up during a utility outage.&quot;
(&lt;a href=&quot;http://www.energystar.gov/ia/partners/prod_development/downloads/EPA_Datacenter_Report_Congress_Final1.pdf&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Distributed generation applications at data centers include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Standby/backup power&lt;/li&gt;
&lt;li&gt;Continuous prime power&lt;/li&gt;
&lt;li&gt;Combined heat and power (CHP)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Renewable generation is often more feasible and less risky at small scale,
making distributed generation such as this a good candidate for widespread
testing.&lt;/p&gt;

&lt;h2&gt;Impediments to Implementation &amp;amp; Conclusion&lt;/h2&gt;

&lt;p&gt;The biggest impediments to adoption are organizational - the complexity and
massive scale of the power grid combined with the unwillingness of private,
competitive companies to open up what they consider trade secrets make large
scale cooperation unrealistic.&lt;/p&gt;

&lt;p&gt;The relationship between data center operators, their financial planners and
those interested enough in the environment to start a conversation is difficult
to predict.&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;&quot;Many companies, for example, do not know whether a 50% increase
in customer volume would require 25% or 100% more server and data center
capacity. As a result, data center facilities can sit half empty, particularly
just after construction. In other cases, companies find they complete one data
center build program only to find, because of capacity constraints, they must
launch a new one almost immediately.&quot; (&lt;a href=&quot;http://www.mckinsey.com/clientservice/bto/pointofview/pdf/Revolutionizing_Data_Center_Efficiency.pdf&quot;&gt;source&lt;/a&gt;)&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;Ultimately, what will get companies moving is an opportunity to reduce operating
costs of data centers. More and more, they recognize the weight on their balance
sheet of the infrastructure demands of SaaS, and only when there is a concise
solution to seriously reduce that number will they take action.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Viewing GPX files in Embedded Google Maps</title>
   <link href="http://christopherpeplin.com/2011/05/gpxviewer/"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/gpxviewer</id>
   <content type="html">&lt;p&gt;While making &lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;field recordings&lt;/a&gt; for the
&lt;a href=&quot;/2011/05/august23-1966/&quot;&gt;August 23, 1966&lt;/a&gt; project two years ago, I carried
along a small GPS logger. I sat on the resulting
&lt;a href=&quot;http://en.wikipedia.org/wiki/GPS_eXchange_Format&quot;&gt;GPX&lt;/a&gt; file while the project
wrapped up, but now that I've posted an in-depth recap of the project I thought
linking the location with the audio files would be an interesting experiment.&lt;/p&gt;

&lt;p&gt;A quick search led me to &lt;a href=&quot;http://notions.okuda.ca/&quot;&gt;Kaz Okuda's&lt;/a&gt; project on
Google Code, &lt;a href=&quot;http://code.google.com/p/gpxviewer/&quot;&gt;gpxviewer&lt;/a&gt; - a perfect match.
From the project's page:&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;The GPX viewer is a 100% client-side JavaScript GPX file viewer that uses
Google Maps to map waypoints and tracklogs.&lt;/p&gt;

&lt;p&gt;GPX files are a standard GPS data format that is supported by many software
applications. It is an XML based data format which lends itself nicely to being
parsed in JavaScript.&lt;/p&gt;

&lt;p&gt;Since it is written entirely in JavaScript, this script can be added to any web
page with minimal effort and little or no server code. Just copy a GPX file to
your web site, make a web page with an embedded Google Map, include the script,
and initialize it.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The script works fine, but it hasn't been updated since 2007 and is only
compatible with the now deprecated Google Maps API v2. I
&lt;a href=&quot;https://github.com/peplin/gpxviewer&quot;&gt;forked&lt;/a&gt; the project to
GitHub and added support for version 3 of the Google Maps API. I also changed
the coding style of the library a bit, for clarity.&lt;/p&gt;

&lt;p&gt;(Kaz, I hope you don't mind - I'd be happy to get this work merged back in with
your project. Google Code just doesn't lend itself well to parallel
development.)&lt;/p&gt;

&lt;h2&gt;Usage&lt;/h2&gt;

&lt;p&gt;With jQuery, using the library is as simple as:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;js&quot;&gt;&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;loadGPXFileIntoGoogleMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ajax&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;({&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;url&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;filename&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;nx&quot;&gt;dataType&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;xml&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
        &lt;span class=&quot;nx&quot;&gt;success&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
          &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;GPXParser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
          &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;setTrackColour&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;#ff0000&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
          &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;setTrackWidth&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
          &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;setMinTrackPointDelta&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.001&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
          &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;centerAndZoom&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;data&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
          &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;addTrackpointsToMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
          &lt;span class=&quot;nx&quot;&gt;parser&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;addWaypointsToMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
        &lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;nx&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;).&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ready&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kd&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;mapOptions&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;zoom&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt;
      &lt;span class=&quot;nx&quot;&gt;mapTypeId&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;google&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;MapTypeId&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;ROADMAP&lt;/span&gt;
    &lt;span class=&quot;p&quot;&gt;};&lt;/span&gt;
    &lt;span class=&quot;kd&quot;&gt;var&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;nx&quot;&gt;google&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;maps&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;Map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;document&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;getElementById&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s2&quot;&gt;&amp;quot;map&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;),&lt;/span&gt;
        &lt;span class=&quot;nx&quot;&gt;mapOptions&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
    &lt;span class=&quot;nx&quot;&gt;loadGPXFileIntoGoogleMap&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nx&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;myTracks.gpx&amp;quot;&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;);&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h2&gt;Source Code&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/gpxviewer&quot;&gt;Fork with Google Maps API v3 Compatibility&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://code.google.com/p/gpxviewer/&quot;&gt;Original Google Code Project&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Audio of August 23, 1966</title>
   <link href="http://christopherpeplin.com/2011/05/august23-audio/"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/august23-audio</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the
&lt;a href=&quot;/2011/05/august23/&quot;&gt;August 23, 1966&lt;/a&gt; project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;We had two recording sessions for the August project: field recordings for
providing ambient sound in the airlock chamber and voice-over recordings for the
headset narration and voice commands.&lt;/p&gt;

&lt;h2&gt;Airlock Ambiance&lt;/h2&gt;

&lt;p&gt;These recordings were made in and around the University of Michigan campus and
in my apartment building, using an M-Audio Nova microphone. I sought out the
sounds of wind passing through narrow walkways, the hum of machinery, and other
background sounds. Most of the recordings were slowed down quite dramatically.&lt;/p&gt;

&lt;ul class=&quot;audio-playlist&quot; data-id=&quot;1&quot;&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient1.mp3&quot;&gt;Furnace&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient2.mp3&quot;&gt;Water Pump&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient3.mp3&quot;&gt;Corridor&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient4.mp3&quot;&gt;Furnace&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient5.mp3&quot;&gt;Road&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient6.mp3&quot;&gt;Engine&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/ambient7.mp3&quot;&gt;Field&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;div id=&quot;jquery_jplayer_1&quot; class=&quot;jp-jplayer&quot;&gt;&lt;/div&gt;


&lt;div class=&quot;audioplayer jp-audio&quot;&gt;
    &lt;div class=&quot;jp-type-playlist&quot;&gt;
        &lt;div id=&quot;jp_container_1&quot; class=&quot;jp-interface&quot;&gt;
            &lt;ul class=&quot;jp-controls&quot;&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-play&quot; tabindex=&quot;1&quot;&gt;play&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-pause&quot; tabindex=&quot;1&quot;&gt;pause&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-stop&quot; tabindex=&quot;1&quot;&gt;stop&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-mute&quot; tabindex=&quot;1&quot;&gt;mute&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-unmute&quot; tabindex=&quot;1&quot;&gt;unmute&lt;/a&gt;&lt;/li&gt;
            &lt;/ul&gt;
            &lt;div class=&quot;jp-progress&quot;&gt;
              &lt;div class=&quot;jp-seek-bar&quot;&gt;
                &lt;div class=&quot;jp-play-bar&quot;&gt;&lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-volume-bar&quot;&gt;
              &lt;div class=&quot;jp-volume-bar-value&quot;&gt;&lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-current-time&quot;&gt;&lt;/div&gt;
            &lt;div class=&quot;jp-duration&quot;&gt;&lt;/div&gt;
        &lt;/div&gt;
        &lt;div id=&quot;jp_playlist_1&quot; class=&quot;jp-playlist&quot;&gt;
            &lt;ul&gt;
              &lt;li&gt;Title of media&lt;/li&gt;
            &lt;/ul&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;




&lt;div id=&quot;map&quot;&gt;&lt;/div&gt;


&lt;p&gt;&lt;a href=&quot;/files/august/audio.gpx&quot; class=&quot;gpx&quot;&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;Mission Control Instructions&lt;/h2&gt;

&lt;p&gt;Visitors to the &lt;a href=&quot;/2011/05/august23/&quot;&gt;gallery installation&lt;/a&gt; wear a radio headset,
and during their trip to space, a voice gives instructions on how and when to
move about the exhibition.&lt;/p&gt;

&lt;p&gt;These were recorded in my Ann Arbor apartment with an M-Audio Nova microphone,
and that's me speaking.&lt;/p&gt;

&lt;ul class=&quot;audio-playlist&quot; data-id=&quot;2&quot;&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/sequence1.mp3&quot;&gt;Connect Lifeline&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/sequence2.mp3&quot;&gt;Lifeline Connected&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/sequence3.mp3&quot;&gt;Airlock Sequence Finished&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/sequence4.mp3&quot;&gt;Enter the Star Formation Chamber&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/sequence5.mp3&quot;&gt;Star Formation Beginning&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;div id=&quot;jquery_jplayer_2&quot; class=&quot;jp-jplayer&quot;&gt;&lt;/div&gt;


&lt;div class=&quot;audioplayer jp-audio&quot;&gt;
    &lt;div class=&quot;jp-type-playlist&quot;&gt;
        &lt;div id=&quot;jp_container_2&quot; class=&quot;jp-interface&quot;&gt;
            &lt;ul class=&quot;jp-controls&quot;&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-play&quot; tabindex=&quot;1&quot;&gt;play&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-pause&quot; tabindex=&quot;1&quot;&gt;pause&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-stop&quot; tabindex=&quot;1&quot;&gt;stop&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-mute&quot; tabindex=&quot;1&quot;&gt;mute&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-unmute&quot; tabindex=&quot;1&quot;&gt;unmute&lt;/a&gt;&lt;/li&gt;
            &lt;/ul&gt;
            &lt;div class=&quot;jp-progress&quot;&gt;
              &lt;div class=&quot;jp-seek-bar&quot;&gt;
                &lt;div class=&quot;jp-play-bar&quot;&gt;&lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-volume-bar&quot;&gt;
              &lt;div class=&quot;jp-volume-bar-value&quot;&gt;&lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-current-time&quot;&gt;&lt;/div&gt;
            &lt;div class=&quot;jp-duration&quot;&gt;&lt;/div&gt;
        &lt;/div&gt;
        &lt;div id=&quot;jp_playlist_2&quot; class=&quot;jp-playlist&quot;&gt;
            &lt;ul&gt;
              &lt;li&gt;Title of media&lt;/li&gt;
            &lt;/ul&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;Headset Narration&lt;/h2&gt;

&lt;p&gt;As the visitor's star forms on the &lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt; in
front of them, Brian Nord's voice chimes in with occasional narration.&lt;/p&gt;

&lt;p&gt;These were recorded in the DL-1 lab at the University of Michigan with an
M-Audio Nova microphone.&lt;/p&gt;

&lt;ul class=&quot;audio-playlist&quot; data-id=&quot;3&quot;&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration1.mp3&quot;&gt;Star Graveyard&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration2.mp3&quot;&gt;Collapsing Molecular Cloud&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration3.mp3&quot;&gt;Atomic Fusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration4.mp3&quot;&gt;Fusion&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration5.mp3&quot;&gt;Pressure vs. Gravity&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration6.mp3&quot;&gt;Out of Fuel&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration7.mp3&quot;&gt;Black Hole&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration8.mp3&quot;&gt;Supernova&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/narration9.mp3&quot;&gt;Pulsar&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;div id=&quot;jquery_jplayer_3&quot; class=&quot;jp-jplayer&quot;&gt;&lt;/div&gt;


&lt;div class=&quot;audioplayer jp-audio&quot;&gt;
    &lt;div class=&quot;jp-type-playlist&quot;&gt;
        &lt;div id=&quot;jp_container_3&quot; class=&quot;jp-interface&quot;&gt;
            &lt;ul class=&quot;jp-controls&quot;&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-play&quot; tabindex=&quot;1&quot;&gt;play&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-pause&quot; tabindex=&quot;1&quot;&gt;pause&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-stop&quot; tabindex=&quot;1&quot;&gt;stop&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-mute&quot; tabindex=&quot;1&quot;&gt;mute&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-unmute&quot; tabindex=&quot;1&quot;&gt;unmute&lt;/a&gt;&lt;/li&gt;
            &lt;/ul&gt;
            &lt;div class=&quot;jp-progress&quot;&gt;
              &lt;div class=&quot;jp-seek-bar&quot;&gt;
                &lt;div class=&quot;jp-play-bar&quot;&gt;&lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-volume-bar&quot;&gt;
              &lt;div class=&quot;jp-volume-bar-value&quot;&gt;&lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-current-time&quot;&gt;&lt;/div&gt;
            &lt;div class=&quot;jp-duration&quot;&gt;&lt;/div&gt;
        &lt;/div&gt;
        &lt;div id=&quot;jp_playlist_3&quot; class=&quot;jp-playlist&quot;&gt;
            &lt;ul&gt;
              &lt;li&gt;Title of media&lt;/li&gt;
            &lt;/ul&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;Miscellaneous&lt;/h2&gt;

&lt;p&gt;Finally, we recorded a few other phrases to throw into the exhibition to give it
some spice. Some are pretty silly.&lt;/p&gt;

&lt;ul class=&quot;audio-playlist&quot; data-id=&quot;4&quot;&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/grabbag1.mp3&quot;&gt;Starstuff&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/grabbag2.mp3&quot;&gt;Saw Ourselves&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/grabbag3.mp3&quot;&gt;System Initialized&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/grabbag4.mp3&quot;&gt;Warning&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/grabbag5.mp3&quot;&gt;Full Power&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&quot;http://things.rhubarbtech.com/august23/audio/grabbag6.mp3&quot;&gt;Malfunction&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;div id=&quot;jquery_jplayer_4&quot; class=&quot;jp-jplayer&quot;&gt;&lt;/div&gt;


&lt;div class=&quot;audioplayer jp-audio&quot;&gt;
    &lt;div class=&quot;jp-type-playlist&quot;&gt;
        &lt;div id=&quot;jp_container_4&quot; class=&quot;jp-interface&quot;&gt;
            &lt;ul class=&quot;jp-controls&quot;&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-play&quot; tabindex=&quot;1&quot;&gt;play&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-pause&quot; tabindex=&quot;1&quot;&gt;pause&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-stop&quot; tabindex=&quot;1&quot;&gt;stop&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-mute&quot; tabindex=&quot;1&quot;&gt;mute&lt;/a&gt;&lt;/li&gt;
              &lt;li&gt;&lt;a href=&quot;#&quot; class=&quot;jp-unmute&quot; tabindex=&quot;1&quot;&gt;unmute&lt;/a&gt;&lt;/li&gt;
            &lt;/ul&gt;
            &lt;div class=&quot;jp-progress&quot;&gt;
              &lt;div class=&quot;jp-seek-bar&quot;&gt;
                &lt;div class=&quot;jp-play-bar&quot;&gt;&lt;/div&gt;
              &lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-volume-bar&quot;&gt;
              &lt;div class=&quot;jp-volume-bar-value&quot;&gt;&lt;/div&gt;
            &lt;/div&gt;
            &lt;div class=&quot;jp-current-time&quot;&gt;&lt;/div&gt;
            &lt;div class=&quot;jp-duration&quot;&gt;&lt;/div&gt;
        &lt;/div&gt;
        &lt;div id=&quot;jp_playlist_4&quot; class=&quot;jp-playlist&quot;&gt;
            &lt;ul&gt;
              &lt;li&gt;Title of media&lt;/li&gt;
            &lt;/ul&gt;
        &lt;/div&gt;
    &lt;/div&gt;
&lt;/div&gt;


&lt;h2&gt;Other August Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23/&quot;&gt;August Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Wiremap of August 23, 1966</title>
   <link href="http://christopherpeplin.com/2011/05/august23-wiremap/"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/august-wiremap</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the
&lt;a href=&quot;/2011/05/august23/&quot;&gt;August 23, 1966&lt;/a&gt; project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;http://wiremap.phedhex.com&quot;&gt;Wiremap&lt;/a&gt; is an innovative projection technique
that displays a 3D image in space using a standard computer projector. The
projector throws a beam of light on an array of vertical wires. From the focal
point of the projector's lens, all the wires are evenly spaced from one another
and have a known distance from the projector. With that information (both a
horizontal coordinate and depth), and using some careful calculations on the
computer, we can project simple images at various depths in the field. From any
perspective other than the projector's position, the wires appear randomly
placed and the image becomes visible.&lt;/p&gt;

&lt;p&gt;Our implementation uses mason's string for the field, 3/4&quot; plywood for the top
and bottom alignment/hanging boards, and standard nuts and washers as anchors
for each string. Our map has 256 strings, each placed at a randomized depth and
threaded through matching holes in the top and bottom alignment
boards. The strings are secured with bolts on the top board and weighted down
with a washer below the bottom board. When the top board is raised to 8 ft, the
wires become taut and can be aligned at a 90 degree angle to the floor. The
Wiremap must be calibrated each time the projector is positioned - this includes
making sure the wires are parallel, the projector sees the wires evenly spaced,
and there is no unnecessary tilt or keystone in the projector's image.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/wiremapconst6.jpg&quot; alt=&quot;Wiremap&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The 256-string Wiremap installed in the gallery.&lt;/p&gt;

&lt;h2&gt;Wiremap Software Library&lt;/h2&gt;

&lt;p&gt;In order to facilitate quicker prototyping and make the Wiremap software more
accessible to the team, we wrote a
&lt;a href=&quot;https://github.com/peplin/wiremap-shapes&quot;&gt;Processing library&lt;/a&gt; for rendering
simple shapes in the Wiremap field. The library improves upon the source code
provided by the creator of the Wiremap by reducing code duplication and
abstracting most of the implementation details away from a user who wishes to
simply draw a sphere, rectangle or sliver in the field. The library also
includes a
&lt;a href=&quot;https://github.com/peplin/wiremap-shapes/blob/master/examples/ManualCalibrator/ManualCalibrator.pde&quot;&gt;novel calibration method&lt;/a&gt;
developed in response to inaccuracies in our construction.&lt;/p&gt;

&lt;p&gt;The August 23, 1966 logo is actually a sphere generated by this library.
Project this image onto our Wiremap, and a three dimensional sphere appears.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/august_logo_transp.png&quot; alt=&quot;Logo&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Abstraction&lt;/h2&gt;

&lt;p&gt;The shapes library gathers the coordinate conversion and wire selection math
into a single class. The original Processing code required duplicating a set of
functions in every Processing sketch that output to the Wiremap. Now, the user
creates an instance of the Wiremap class and provides a few key measurements of
the physical interface as well as a text file listing the wire depths. The
calculation is done as necessary, and not exposed to the user.&lt;/p&gt;

&lt;h2&gt;Coordinate Systems&lt;/h2&gt;

&lt;p&gt;One key difference between the original source and the Wiremap library is the
coordinate system used for each plane. Previously, the coordinates of X, Y and
Z were all physical inches and matched the actual dimensions of the Wiremap. To
facilitate quicker transitioning from a regular Processing sketch (using the
standard 2D renderer) to one for the Wiremap, the X and Y were changed to be in
the standard, Processing-style pixel coordinate system.&lt;/p&gt;

&lt;p&gt;The Z plane remains in inches, as there is no obvious relationship between Z
space on the screen (which is infinite in both directions) and Z space in the
Wiremap field (limited by the physical dimensions). Thus, Z coordinates in the
field range from 0 to the field depth.&lt;/p&gt;
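
&lt;p&gt;As a rough illustration of the mixed coordinate system, the Python sketch
below picks the wires that pass through a sphere and works out the vertical
span to light on each one. It is not the library's API: the function names, the
even pixel spacing of the wires, the &lt;code&gt;px_per_inch&lt;/code&gt; scale factor and the sample
depths are all assumptions made for this example, and perspective scaling is
ignored.&lt;/p&gt;

```python
import math


def wire_x_positions(wire_count, width_px):
    """Pixel column of each wire, assuming the projector sees them evenly spaced."""
    spacing = width_px / wire_count
    return [spacing * (i + 0.5) for i in range(wire_count)]


def sphere_slices(center_x_px, center_y_px, center_z_in, radius_in,
                  wire_depths_in, width_px, px_per_inch):
    """Return (wire index, top y, bottom y) for each wire that cuts the sphere.

    X and Y are in pixels and Z is in inches, as in the post. The px_per_inch
    factor is a hypothetical calibration value used to compare the two.
    """
    slices = []
    xs = wire_x_positions(len(wire_depths_in), width_px)
    for index, (x_px, z_in) in enumerate(zip(xs, wire_depths_in)):
        dx_in = (x_px - center_x_px) / px_per_inch
        dz_in = z_in - center_z_in
        # Squared half-height of the sphere at this wire, in inches.
        remaining = radius_in ** 2 - dx_in ** 2 - dz_in ** 2
        if remaining > 0:
            half_height_px = math.sqrt(remaining) * px_per_inch
            slices.append((index,
                           center_y_px - half_height_px,
                           center_y_px + half_height_px))
    return slices


# Four wires with made-up depths (inches) across a 400 pixel wide image.
depths = [2.0, 10.0, 5.0, 6.0]
slices = sphere_slices(200, 300, 5.0, 4.0, depths, width_px=400, px_per_inch=50)
for index, top, bottom in slices:
    print(index, round(top, 1), round(bottom, 1))
```

&lt;p&gt;With these made-up values only the third and fourth wires fall inside the
sphere. The others are either too far to the side or at the wrong depth, which
is why the image only appears where wire position and depth line up.&lt;/p&gt;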

&lt;p&gt;The library was released under the Apache open source license on GitHub
(&lt;a href=&quot;https://github.com/peplin/wiremap-shapes&quot;&gt;https://github.com/peplin/wiremap-shapes&lt;/a&gt;).&lt;/p&gt;

&lt;h2&gt;Results&lt;/h2&gt;

&lt;p&gt;Despite careful planning, we had trouble at the very end of the build phase
accurately calibrating our Wiremap. The final transportation from our workspace
to the gallery sealed the deal; we were never able to form a satisfactory
image in the field after the move.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section, details of the
&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;heartbeat monitor&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other August Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23/&quot;&gt;August Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Twoverse, the universe of August 23, 1966</title>
   <link href="http://christopherpeplin.com/2011/05/august23-twoverse/"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/august-twoverse</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the
&lt;a href=&quot;/2011/05/august23/&quot;&gt;August 23, 1966&lt;/a&gt; project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The software core of the August 23, 1966 project is a Java software system named
&quot;Twoverse,&quot; a parallel universe that exists only in the digital realm.&lt;/p&gt;

&lt;p&gt;The Twoverse architecture can be split into three levels - server, client and
input. This diagram &lt;a href=&quot;https://github.com/peplin/august23/raw/master/doc/report/images/twoverse_arch.jpg&quot;&gt;(full size)&lt;/a&gt;
includes some unimplemented elements, such as input from a sound sensor.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/twoverse_arch.jpg&quot; alt=&quot;Arch&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;System Design&lt;/h2&gt;

&lt;p&gt;The broad concept of Twoverse includes mechanics and partial system
specifications for a massively multiplayer online game. The scope was minimized
due to time constraints and to better fit the gallery installation. However,
even with a smaller scope, a complete vertical slice of the entire system was
implemented and used. Downsized from a universe of many types of objects, the
system currently supports a universe made of stars with a few properties, and
constellations that connect them with meta-objects known as links. Extensibility
was considered from the beginning of development, so adding new objects will
require a minimal amount of work.&lt;/p&gt;

&lt;h2&gt;Server&lt;/h2&gt;

&lt;p&gt;The system relies on a central server to provide the following:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A &lt;a href=&quot;https://github.com/peplin/august23/blob/master/database/create.sql&quot;&gt;persistent database&lt;/a&gt;
  of all objects in the universe and their current state, using MySQL&lt;/li&gt;
&lt;li&gt;User account management, as well as authenticated session negotiation&lt;/li&gt;
&lt;li&gt;A &lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/TwoversePublicApi.java&quot;&gt;public API&lt;/a&gt;
  for interacting with the universe, via Apache &lt;a href=&quot;http://ws.apache.org/xmlrpc/&quot;&gt;XML-RPC&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Client pull style updates for minimizing bandwidth requirements, via an XML
  feed&lt;/li&gt;
&lt;li&gt;A &lt;a href=&quot;https://github.com/peplin/august23/tree/master/www&quot;&gt;browser based frontend&lt;/a&gt;
  to view the status of objects in the universe, via PHP&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Client&lt;/h2&gt;

&lt;p&gt;Using the XML-RPC API, many types of clients are possible, including
graphical, text-based, mobile and e-mail clients. The clients implemented for the
gallery installation are graphical clients written using the Processing
development environment and include the following features:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users can scroll and zoom around a graphical universe of glowing stars&lt;/li&gt;
&lt;li&gt;Users can click on an individual star to see a close-up view and additional
  details about its creation and properties&lt;/li&gt;
&lt;li&gt;Users can create a new star in the universe, and watch a 3D visualization of
  their star's formation&lt;/li&gt;
&lt;li&gt;Users can draw constellations that connect the stars in the universe, and
  leave them for other users to see&lt;/li&gt;
&lt;li&gt;Users can visit the gallery website to view a table of all of the stars in the
  universe, their properties and current status&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The graphical client uses the XML-RPC API as defined by the server, and updates
its local cache of the universe via the server's XML feed.&lt;/p&gt;
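
&lt;p&gt;The pull-style update can be sketched in a few lines of Python. This is only
an illustration of the idea, not the Twoverse feed format: the
&lt;code&gt;universe&lt;/code&gt; and &lt;code&gt;star&lt;/code&gt; element names, the &lt;code&gt;updated&lt;/code&gt; counter and both helper
functions are invented for the example.&lt;/p&gt;

```python
import xml.etree.ElementTree as ET


def build_sample_feed(stars):
    """Build a stand-in feed. Element and attribute names are hypothetical."""
    root = ET.Element("universe")
    for star_id, name, updated in stars:
        ET.SubElement(root, "star", id=str(star_id), name=name,
                      updated=str(updated))
    return ET.tostring(root, encoding="unicode")


def apply_feed(cache, feed_xml):
    """Merge feed entries into the local cache, keeping the newer copy.

    Returns the ids of the stars that were added or refreshed.
    """
    changed = []
    for element in ET.fromstring(feed_xml).iter("star"):
        star_id = int(element.get("id"))
        updated = int(element.get("updated"))
        cached = cache.get(star_id)
        if cached is None or updated > cached["updated"]:
            cache[star_id] = {"name": element.get("name"), "updated": updated}
            changed.append(star_id)
    return changed


cache = {}
first_pull = build_sample_feed([(1, "Vega", 10), (2, "Rigel", 12)])
print(apply_feed(cache, first_pull))

# A later pull where only the second star has changed on the server.
second_pull = build_sample_feed([(1, "Vega", 10), (2, "Rigel II", 15)])
print(apply_feed(cache, second_pull))
```

&lt;p&gt;Because the client decides when to pull and only touches entries that are
newer than its cached copy, the server never has to track which clients are
connected.&lt;/p&gt;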

&lt;p&gt;A screenshot of the star chart in a web browser during the time the gallery was
open:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/starchart.png&quot; alt=&quot;Star Chart&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Library&lt;/h2&gt;

&lt;p&gt;The client and server for Twoverse share many common functions, and were
designed to inherit from the same code hierarchy. They both stem from a
&lt;a href=&quot;https://github.com/peplin/august23/tree/master/src/Twoverse/src/twoverse&quot;&gt;Twoverse Java library&lt;/a&gt;,
which includes many utility classes, shared functionality and the server
executable. The Twoverse library provides these features and many others:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/Database.java&quot;&gt;Database wrapper&lt;/a&gt;
  for a persistent universe - could be used to run a SQL database client-side.
  (Revisiting this, a more established ORM should be used.)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/TwoverseServer.java&quot;&gt;XML-RPC servlet&lt;/a&gt;
  for serving XML-RPC requests&lt;/li&gt;
&lt;li&gt;Thread-safe universal &lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/ObjectManager.java&quot;&gt;object manager&lt;/a&gt;
  for maintaining the state of the universe&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/SessionManager.java&quot;&gt;User session manager&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Small &lt;a href=&quot;https://github.com/peplin/august23/tree/master/src/Twoverse/src/twoverse/test&quot;&gt;unit test suite&lt;/a&gt;
  for core classes&lt;/li&gt;
&lt;li&gt;Processing &lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/util/Camera.java&quot;&gt;camera wrapper&lt;/a&gt;
  for simplifying the elusive camera() function&lt;/li&gt;
&lt;li&gt;Flexible 2D/3D point &lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src/twoverse/util/Point.java&quot;&gt;coordinate class&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Additional documentation is available as Javadoc comments with the
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/Twoverse/src&quot;&gt;source code&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Interface Screenshots&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/Screenshot-MultitouchClient.png&quot; alt=&quot;MT-screenshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A screenshot of the multi-touch Client with a group of stars displayed at the
default zoom level. The lines connecting the stars are constellations drawn by a
user.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/Screenshot-MultitouchClient-1.png&quot; alt=&quot;MT-screenshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A screenshot of the multi-touch Client viewing the details of a single star.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/Screenshot-MultitouchClient-2.png&quot; alt=&quot;MT-screenshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A screenshot of the multi-touch Client zoomed in for a closer look at a cluster
of stars.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section, details of the
&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;multi-touch table&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other August Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23/&quot;&gt;August Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Pulse Oximeter of August 23, 1966</title>
   <link href="http://christopherpeplin.com/2011/05/august23-pulse-oximeter/"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/august-pulse</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the
&lt;a href=&quot;/2011/05/august23/&quot;&gt;August 23, 1966&lt;/a&gt; project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In researching ways to detect the human heartbeat, we found that
&lt;a href=&quot;http://en.wikipedia.org/wiki/Photoplethysmograph&quot;&gt;photoplethysmography&lt;/a&gt;
would be the simplest and least expensive way to bring an immediate and personal
touch to the gallery installation. A photoplethysmograph is usually obtained
with a pulse oximeter - this is the device used in hospitals that grips a
person's finger to measure their heart &amp;amp; respiration rates.&lt;/p&gt;

&lt;h2&gt;Concepts&lt;/h2&gt;

&lt;p&gt;A pulse oximeter simply illuminates the skin with light from an LED (usually
infrared), and measures the luminance of the skin on the other side. Each
cardiac cycle brings more blood to the extremities, the finger becomes denser,
and thus less light passes through to the detector. When the blood flows away,
more light is let through. This fluctuation can be measured and the timing of
the luminance peaks used for determining the heart rate.&lt;/p&gt;

&lt;p&gt;This project required knowledge of
&lt;a href=&quot;http://www.swarthmore.edu/NatSci/echeeve1/Ref/FilterBkgrnd/Filters.html&quot;&gt;signal processing filters&lt;/a&gt;
and &lt;a href=&quot;http://web.telia.com/~u85920178/begin/opamp00.htm&quot;&gt;operational amplifiers&lt;/a&gt;.
This &lt;a href=&quot;https://github.com/peplin/august23/raw/master/doc/heartmonitor/schematic-final.png&quot;&gt;schematic&lt;/a&gt;
describes the circuit used inside the pulse oximeter. It is also
provided as a &lt;a href=&quot;http://fritzing.org/&quot;&gt;Fritzing&lt;/a&gt; &lt;a href=&quot;https://github.com/peplin/august23/raw/master/doc/heartmonitor/schematic.fz&quot;&gt;file&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/schematic-final.png&quot; alt=&quot;Pulse&quot; /&gt;&lt;/p&gt;

&lt;p&gt;For fun, you can look at the &lt;a href=&quot;https://github.com/peplin/august23/raw/master/doc/heartmonitor/schematic-draft.jpg&quot;&gt;original schematic&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Implementation&lt;/h2&gt;

&lt;p&gt;This device is advantageous for its low intrusiveness. In our implementation,
the visitor just needs to gently place their finger on top of the light sensor.
Depending on the person, the shape, rate and range of the photoplethysmograph
obtained can vary widely, but we found the results distinct enough to obtain a
stable heart rate from most visitors in under 15 seconds.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/pulse-outside.jpg&quot; alt=&quot;Outsides&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Our device used the amplified signal of an inexpensive photo-resistor passed
through a high pass filter and captured by an &lt;a href=&quot;http://www.arduino.cc/&quot;&gt;Arduino microcontroller&lt;/a&gt;.
The microcontroller
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/PulseOximeter/PulseOximeter.pde&quot;&gt;fed an averaged luminosity&lt;/a&gt;
to a &lt;a href=&quot;http://processing.org/&quot;&gt;Processing&lt;/a&gt;
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/WiremapClient/HeartbeatDetector.pde&quot;&gt;sketch&lt;/a&gt;
on the host computer, which analyzed the signal for peaks. The peaks were then
converted to a frequency, and passed along to the
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/WiremapClient/WiremapClient.pde&quot;&gt;gallery software&lt;/a&gt;.
The software is not specific to the gallery, and can be reused by other
applications that need heart rate data. There is also a sketch to
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/test/HeartbeatPulseViewer/HeartbeatPulseViewer.pde&quot;&gt;visualize the value&lt;/a&gt;
of the pulse oximeter on a graph.&lt;/p&gt;
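
&lt;p&gt;The peak-to-frequency step can be sketched as follows. This is a simplified
Python stand-in rather than the Processing sketch linked above: the function
names, the fixed threshold and the synthetic test signal are assumptions made
for the example.&lt;/p&gt;

```python
def find_peaks(samples, threshold):
    """Indices of local maxima that rise above the threshold."""
    peaks = []
    for i in range(1, len(samples) - 1):
        value = samples[i]
        if (value > threshold and value > samples[i - 1]
                and value >= samples[i + 1]):
            peaks.append(i)
    return peaks


def beats_per_minute(samples, sample_rate_hz, threshold):
    """Convert the mean spacing between peaks into a heart rate.

    Returns None when fewer than two peaks are found, since a single peak
    gives no interval to measure.
    """
    peaks = find_peaks(samples, threshold)
    if 2 > len(peaks):
        return None
    intervals = [(later - earlier) / sample_rate_hz
                 for earlier, later in zip(peaks, peaks[1:])]
    mean_interval = sum(intervals) / len(intervals)
    return 60.0 / mean_interval


# Synthetic signal: one spike every 40 samples at 50 samples per second,
# which is one beat every 0.8 seconds.
signal = [1.0 if i % 40 == 20 else 0.0 for i in range(200)]
print(beats_per_minute(signal, sample_rate_hz=50, threshold=0.5))
```

&lt;p&gt;A real signal is far noisier than this one, which is why the readings were
averaged on the microcontroller before any peak detection took place.&lt;/p&gt;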

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/pulse-inside.jpg&quot; alt=&quot;Innards&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Source Code&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/PulseOximeter/PulseOximeter.pde&quot;&gt;Arduino program&lt;/a&gt;&lt;/li&gt;
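&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/test/HeartbeatPulseViewer/HeartbeatPulseViewer.pde&quot;&gt;Heartbeat Viewer&lt;/a&gt;&lt;/li&gt;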
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/WiremapClient/HeartbeatDetector.pde&quot;&gt;Integrated with Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;em&gt;Continue to the next section, details of the
&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;audio recordings&lt;/a&gt; of August.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other August Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23/&quot;&gt;August Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Multi-touch Table of August 23, 1966</title>
   <link href="http://christopherpeplin.com/2011/05/august23-multitouch/"/>
   <updated>2011-05-28T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/august-multitouch</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the
&lt;a href=&quot;/2011/05/august23/&quot;&gt;August 23, 1966&lt;/a&gt; project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The multi-touch table created for the project is a rear diffused illumination
design, and is housed in a Cor-ten steel cabinet with recessed cooling fans and
an access panel on the rear vertical wall. The touch surface is a 1/2&quot;
polycarbonate sheet with an adhesive projection film applied to the underside
(acting as a diffuser for the projected image).&lt;/p&gt;

&lt;p&gt;Infrared light is projected at the diffuser from below the touch surface (inside
the cabinet). The table used an array of six multi-LED lamps. When an object
touches the surface it reflects more light than the diffuser or objects far away
from the surface. The change in light is detected by a webcam placed inside the
cabinet, and the signal processed by our software as user input.&lt;/p&gt;
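
&lt;p&gt;The detection idea can be sketched in plain Python: subtract a background
frame, keep the pixels that are much brighter than it, and group neighbouring
bright pixels into touches. This is a toy version under stated assumptions and
not how tbeta is implemented. The function name, the list-of-lists frame format
and the threshold value are made up for the example.&lt;/p&gt;

```python
def detect_touches(frame, background, threshold):
    """Return the centroid (row, col) of each bright blob in the frame."""
    rows, cols = len(frame), len(frame[0])
    # A pixel counts as bright when it exceeds the background by the threshold.
    bright = [[frame[r][c] - background[r][c] > threshold for c in range(cols)]
              for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    touches = []
    for r in range(rows):
        for c in range(cols):
            if not bright[r][c] or seen[r][c]:
                continue
            # Flood fill to collect every pixel connected to this one.
            stack, pixels = [(r, c)], []
            seen[r][c] = True
            while stack:
                pr, pc = stack.pop()
                pixels.append((pr, pc))
                for nr, nc in ((pr + 1, pc), (pr - 1, pc),
                               (pr, pc + 1), (pr, pc - 1)):
                    inside = rows > nr >= 0 and cols > nc >= 0
                    if inside and bright[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            mean_row = sum(p[0] for p in pixels) / len(pixels)
            mean_col = sum(p[1] for p in pixels) / len(pixels)
            touches.append((mean_row, mean_col))
    return touches


# A 5x6 frame of dim background light with two reflective "fingers".
background = [[10] * 6 for _ in range(5)]
frame = [row[:] for row in background]
frame[1][1] = frame[1][2] = 200
frame[3][4] = 180
print(detect_touches(frame, background, threshold=50))
```

&lt;p&gt;A fixed threshold like this one is sensitive to uneven lighting: a bright
patch in the background can push untouched pixels over the limit.&lt;/p&gt;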

&lt;h2&gt;Multi-touch Software&lt;/h2&gt;

&lt;p&gt;The multi-touch table used &lt;a href=&quot;http://ccv.nuigroup.com/&quot;&gt;The Beta&lt;/a&gt;, from the NUI
Group, to process the video stream from the webcam. The Beta, tbeta for short,
is an open source tool that analyzes video to find tracking data for objects it
recognizes as fingers or cursor devices. The software provides a great deal of
control over the video parameters (high-pass filter, amplification, threshold,
etc.), so it adapts well to many types of multi-touch displays.&lt;/p&gt;

&lt;p&gt;Tbeta outputs the tracking data using the &lt;a href=&quot;http://tuio.org/&quot;&gt;TUIO protocol&lt;/a&gt;,
which is an open framework for receiving input events in various programming
environments. For this project, the TUIO events sent by tbeta were received
using the open source Java TUIO library in a
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/MultitouchClient/TuioController.pde&quot;&gt;Processing sketch&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse post&lt;/a&gt; includes some screenshots of
the user interface displayed on the multi-touch table.&lt;/p&gt;

&lt;h3&gt;Results&lt;/h3&gt;

&lt;p&gt;Despite careful calibration and testing in our workspace, the final move to the
gallery threw the multi-touch table out of alignment, and in the remaining
time we were unable to make it sufficiently responsive to user
interaction. The distribution of IR light on the table's surface was difficult
to control, and hot spots had a habit of throwing off the exposure of the webcam
and confusing tbeta.&lt;/p&gt;

&lt;p&gt;Although we used the surface as a display, gallery visitors had to use a
traditional mouse to interact with the galaxy.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section, details of the
&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt; display.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other August Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23/&quot;&gt;August Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Simulated Time of Threephase</title>
   <link href="http://christopherpeplin.com/2011/05/threephase-time/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/threephase-time</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the &lt;a href=&quot;/2011/05/threephase/&quot;&gt;Threephase&lt;/a&gt;
project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;A primary task for any game is making things move. The default approach in a
single-player, desktop computer game is somewhat clear. The software has total
control over the CPU, and there is only a single player waiting for game world
updates. A web-based game needs to optimize for many simultaneous games, while
minimizing server costs.&lt;/p&gt;

&lt;h2&gt;Update Frequency&lt;/h2&gt;

&lt;p&gt;In a single-player, desktop computer game, it is reasonable to set a small
timestep (e.g. 1 second) and recalculate the entire world at every step.
Powerful processors permit this method to scale up to very
large games locally, providing the player with a consistent, up-to-date world.&lt;/p&gt;

&lt;p&gt;Unfortunately, this strategy doesn't scale to the web, where a server will have
more individual game instances to process and where it is infeasible to dedicate
an entire processor to a single game instance at all hours of the day. Thanks to
the patterns of web users, there are some relatively easy ways to decrease the
amount of work that needs to be done. Updating at a regular timestep would be
overeager, when a player typically isn't visiting the web application every
second or demanding real-time updates. Users are much more tolerant of short
delays (i.e. 1-5 seconds) on the web than in desktop applications.&lt;/p&gt;

&lt;h2&gt;On-Demand Updates&lt;/h2&gt;

&lt;p&gt;At the minimum, Threephase updates on demand, only when the player visits their
game. To provide background, out-of-band notifications (e.g. e-mails with
in-game alerts), however, the game does occasionally need to be updated to check
the conditions. Instead of maintaining a steady, short timestep, Threephase
gradually lengthens the interval between updates once a player stops visiting the
game. Twelve hours after a visit, the game updates only once an hour. Three days
after the last player visit, game updates may be as infrequent as once a day.&lt;/p&gt;

&lt;p&gt;Players visiting every day (and by implication those more invested in their
virtual world) will receive an appropriately higher update rate, thanks to the
processing power left free by those visiting Threephase less often.&lt;/p&gt;
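
&lt;p&gt;The back-off schedule can be sketched as a single function. This is only an
illustration, not the Threephase implementation: the hourly and daily thresholds come
from the description above, while the method name and the five-minute interval for
recently visited games are assumptions.&lt;/p&gt;

```ruby
HOUR = 60 * 60
DAY = 24 * HOUR

# Hypothetical helper: returns the number of seconds to wait between
# background updates, given the seconds elapsed since the last player visit.
def update_interval(seconds_since_last_visit)
  if seconds_since_last_visit >= 3 * DAY
    DAY          # long-idle games: once a day
  elsif seconds_since_last_visit >= 12 * HOUR
    HOUR         # idle for half a day: once an hour
  else
    5 * 60       # assumed interval for recently visited games
  end
end
```

&lt;p&gt;A scheduler could call &lt;code&gt;update_interval&lt;/code&gt; after each background update to
decide when to queue the next one.&lt;/p&gt;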

&lt;h2&gt;Crossing the Day Boundary&lt;/h2&gt;

&lt;p&gt;The technical impact of this design is that every element of the game needs to
be able to update itself over any time interval - 10 seconds, 10 minutes, 10
days. Instead of simply calculating the difference between now and 1 second ago,
the objects need to handle updates spanning multiple days. This is important in
Threephase because there are certain statistics and calculations that need to
occur exactly once per day. That includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Calculating the marginal cost curve&lt;/li&gt;
&lt;li&gt;Clearing fuel market prices&lt;/li&gt;
&lt;li&gt;Storing average marginal price and operating level statistics&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/frequency2.png&quot; alt=&quot;Decreasing Updates&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The algorithms to calculate these values need to be able to step through the
calculation for multiple days, and not condense the change into a single value.
For example, if three days have passed since an update, the server must
calculate the marginal price of power on each of the three days, not a single
average. It is possible to use an integral in some situations where the
per-day values are unimportant, and this method is preferred where possible.&lt;/p&gt;
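
&lt;p&gt;A minimal sketch of stepping through each elapsed day, assuming day boundaries
fall on UTC midnight. The method names and the shape of the per-day record are
hypothetical and stand in for the real daily calculations.&lt;/p&gt;

```ruby
SECONDS_PER_DAY = 24 * 60 * 60

# Returns every UTC midnight crossed after last_update, up to and including now.
def crossed_day_boundaries(last_update, now)
  start = last_update.getutc
  midnight = Time.utc(start.year, start.month, start.day) + SECONDS_PER_DAY
  boundaries = []
  while now >= midnight
    boundaries.push(midnight)
    midnight += SECONDS_PER_DAY
  end
  boundaries
end

# Hypothetical per-day work: one record per elapsed day, never a single average.
def daily_updates(last_update, now)
  crossed_day_boundaries(last_update, now).map do |day|
    { day: day.strftime("%Y-%m-%d"), cleared: true }
  end
end
```

&lt;p&gt;An update within the same day yields an empty list, so the once-per-day
calculations are skipped entirely.&lt;/p&gt;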

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/mp.png&quot; alt=&quot;Marginal Price&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Update Scope&lt;/h2&gt;

&lt;p&gt;In addition to the question of when to update, the game must decide how much to
update. A naive approach is to update every game element, every time, starting
from the top of the object hierarchy. The amount of work grows with every game
created, and much of it is unnecessary. It also
runs into problems with a fixed timestep - if the work cannot be finished before
the next interval (e.g. 1 second), the updates will fall behind and never catch
up. Another method is to maintain a list of only the objects that explicitly
need to be updated. This avoids duplicated work, but is difficult to maintain.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/update1.png&quot; alt=&quot;Naive Updating&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A representation of the naive, update-all approach to game updating. A naive
approach to update scope chooses to update every game object starting at the
top of the object hierarchy. This approach is not scalable to many simultaneous
game instances, especially with short timesteps in between updates.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/update2.png&quot; alt=&quot;List Updating&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A representation of the improved, list-based approach to game updating. An
improved scope for updates maintains a list of specific objects that need to be
updated. It saves time and work over the naive approach, but still has
difficulty scaling in small timestep situations.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/update3.png&quot; alt=&quot;Lazy Updating&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A representation of the lazy approach to game updating. A lazy approach to update
scope updates objects only on demand from player actions. A request for the
operating level of a generator may or may not propagate up the hierarchy of
objects, depending on the last update time of each object in between.&lt;/p&gt;

&lt;h2&gt;Distributed Updates&lt;/h2&gt;

&lt;p&gt;Threephase uses a distributed update strategy, where updates start at the lowest
levels of the hierarchy and propagate upwards only as needed. This approach
harmonizes well with HTTP, which is closely tied to a request/response cycle of
communication between client and server. For HTTP, there is no concept of a
long-running job that continuously updates the game. Clients should also not
have to worry about keeping the game state up-to-date.&lt;/p&gt;

&lt;p&gt;When a request arrives, the server need only return the best answer it can find,
not necessarily the perfect answer. This approach relies on caching at multiple
levels of the hierarchy to update the minimum amount of game state necessary to
maintain consistency. This frees up processing power for other games, and also
shortens the response time for players.&lt;/p&gt;
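
&lt;p&gt;One way to picture this is a node that refreshes itself only when its cached
state is stale, and asks its parent to do the same first. The sketch below is not
taken from the Threephase source: the &lt;code&gt;LazyNode&lt;/code&gt; class, its fields and the
60-second freshness window are assumptions for illustration.&lt;/p&gt;

```ruby
# Hypothetical lazily-updated game object with a cached last-update time.
class LazyNode
  attr_reader :update_count

  def initialize(parent = nil, max_age = 60)
    @parent = parent
    @max_age = max_age      # seconds a cached state is considered fresh
    @updated_at = nil
    @update_count = 0
  end

  # True when the node was updated no more than max_age seconds ago.
  def fresh?(now)
    return false if @updated_at.nil?
    @max_age >= now - @updated_at
  end

  # Refresh only if stale; the request propagates upward only as needed.
  def refresh(now)
    return if fresh?(now)
    @parent.refresh(now) if @parent
    @updated_at = now
    @update_count += 1
  end
end
```

&lt;p&gt;With a shared parent, a request for one generator does not force an update of
the parent if a sibling's request refreshed it moments earlier.&lt;/p&gt;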

&lt;h2&gt;Accelerated Game Speed&lt;/h2&gt;

&lt;p&gt;These update issues are compounded by the fact that games are permitted to scale
the passage of time to shorten game duration. Running a power system in
real-time speed would be an achingly slow gaming experience, and it would be
difficult to observe trends over time. Instead, the time in game can be scaled
up as much as 200 times normal. The in-game time is displayed on every page, and
begins counting from the epoch of the game. This represents an accelerated view
into the future, allowing players to see the near to medium-term implications of
their choices.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/crossing-over2.png&quot; alt=&quot;Crossing Days&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Without a steady timestep, all updates must handle the possibility that multiple
days have passed since the last update. The lower timeline shows how the day
boundary crossing problem is compounded when game speed is increased.&lt;/p&gt;

&lt;p&gt;At the high end of the time scale, update intervals can be especially
problematic. At the maximum (200 times real-time), a day passes in game roughly
every 7 real minutes (1,440 minutes / 200 = 7.2). Nearly every player visit
crosses dozens, if not hundreds, of day boundaries. The calculations mentioned
above must efficiently handle updating a large number of days.&lt;/p&gt;

&lt;h2&gt;GameTime Helper&lt;/h2&gt;

&lt;p&gt;To take advantage of the useful time helper methods in both Ruby and the
&lt;a href=&quot;http://rubyonrails.org/&quot;&gt;Ruby on Rails&lt;/a&gt; web framework, Threephase uses a
dynamic
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/game.rb#L103&quot;&gt;GameTime&lt;/a&gt;
class when dealing with in-game time. The root problem is that given a Time
object, the system needs to be able to tell if it has already been
scaled from real-time to game time. A method receiving an instance of GameTime
has some implicit metadata (the fact that this is a GameTime, not a regular
Time) that the time is already scaled. In addition, the class provides automatic
conversion between real and game-time as needed.&lt;/p&gt;

&lt;p&gt;Each instance of a Country generates a unique GameTime class definition, with
the game's epoch and speed stored as class constants. All times are scaled
forward from the epoch (and an error is thrown if a pre-epoch time is passed as
an argument), limiting worry about time scaling to a single location in the code
base. The class can smoothly convert between scaled game time
and unscaled real time.&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;game&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;speed&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GameTime&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;game&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GameTime&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GameTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;epoch&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Sat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;06&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Nov&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2010&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;38&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;49&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;UTC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;00&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2010&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;06&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;39&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;UTC&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GameTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Sat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;06&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Nov&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2010&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;42&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;08&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;UTC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;00&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GameTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Sat&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;06&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Nov&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2010&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;42&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;43&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;UTC&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;00&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;00&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;GameTime&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;at&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;no&quot;&gt;Time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;now&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;to_normal&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2010&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mo&quot;&gt;06&lt;/span&gt; &lt;span class=&quot;mo&quot;&gt;03&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;39&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;45&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;UTC&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
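
&lt;p&gt;The arithmetic behind the session above can be approximated with two plain
functions. This is a sketch of the scaling only, assuming game time runs forward
from the epoch at a constant multiple of real time; the function names are
hypothetical and the real GameTime class is not reproduced here.&lt;/p&gt;

```ruby
# Scale a real Time forward from the epoch by the game speed.
def to_game_time(real_time, epoch, speed)
  raise ArgumentError, "time is before the game epoch" if epoch > real_time
  epoch + (real_time - epoch) * speed
end

# Inverse conversion: map a scaled game Time back to real time.
def to_real_time(game_time, epoch, speed)
  epoch + (game_time - epoch) / speed.to_f
end
```

&lt;p&gt;At a speed of 5, a moment 40 real seconds after the epoch maps to 200 game
seconds after it, and converting back returns the original moment.&lt;/p&gt;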


&lt;p&gt;&lt;em&gt;Continue to the next section on
&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;implementing&lt;/a&gt; Threephase.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other Threephase Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase/&quot;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;Game Objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;Game Mechanics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Game Objects of Threephase</title>
   <link href="http://christopherpeplin.com/2011/05/threephase-game-objects/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/threephase-objects</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the &lt;a href=&quot;/2011/05/threephase/&quot;&gt;Threephase&lt;/a&gt;
project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The primary geographical units of Threephase are the
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/game.rb&quot;&gt;Country&lt;/a&gt;,
the
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/state.rb&quot;&gt;State&lt;/a&gt;
and the
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/city.rb&quot;&gt;City&lt;/a&gt;. At
any time, there are multiple game worlds running simultaneously. Each game (or
Country) has its own unique attributes and available power technologies. The
player can choose to either join an existing Country or create a new one
(perhaps to experiment with a new regulatory system). A Country's virtual world
is ongoing and persistent - players can join and leave at any time.&lt;/p&gt;

&lt;p&gt;To join a Country (game and Country are used interchangeably), the player
creates a State.&lt;/p&gt;

&lt;p&gt;The State is the player's relationship to a certain Country - they can
participate in multiple games simultaneously, but with only one State in each
game. The player assigns a name, research budget (which affects the cost of new
technology) and a map to the new State.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/city.png&quot; alt=&quot;Models&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Game/Country&lt;/h2&gt;

&lt;p&gt;In order to lower the barrier to entry for new players, the player may join and
leave games at any time. The time spent finding and joining a game should be
minimal. The player should be able to start making in-game decisions as soon as
possible to grab and keep their attention. The effects of a State suddenly
dropping out of the Country are dampened so that one player leaving does not
negatively affect the experience of the others.&lt;/p&gt;

&lt;h2&gt;Attributes&lt;/h2&gt;

&lt;p&gt;A Country has many
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/game.rb#L30&quot;&gt;adjustable attributes&lt;/a&gt;
that are set at creation. These include, but are not limited to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Minimum &amp;amp; maximum transmission line capacity&lt;/li&gt;
&lt;li&gt;Relative cost of technology&lt;/li&gt;
&lt;li&gt;Relative wind speed&lt;/li&gt;
&lt;li&gt;Type of economic regulation&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The attributes affect the difficulty of the game (e.g. relatively higher capital
costs make new construction more difficult) similar to how new laws and public
policies can affect power utility strategy in real life. They can also simulate
different physical environments, e.g. the relative scale of the average wind
speed can make a State more or less inclined to add wind turbines to their
generator portfolio.&lt;/p&gt;

&lt;h3&gt;Economic Regulation&lt;/h3&gt;

&lt;p&gt;Economic regulation is a critical factor in operating a profitable utility. The
available economic regulation types in Threephase are rate of return, marginal
cost bidding, and locational marginal pricing (discussed in more detail later
on). Power utility regulation is an unsolved problem, and the ability to switch
between types in Threephase makes comparing scenarios in different regulatory
environments a possible use case.&lt;/p&gt;

&lt;h2&gt;Map&lt;/h2&gt;

&lt;p&gt;Each State has a
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/map.rb&quot;&gt;map&lt;/a&gt;
which defines the land and natural resources underneath and
neighboring each City. The proximity of a City to resource-rich areas of the map
can have a non-trivial effect on the price of certain types of generation in the
City (discussed further later).&lt;/p&gt;

&lt;h2&gt;Multiplayer Elements&lt;/h2&gt;

&lt;p&gt;Threephase is a multiplayer game. Each Country is shared among the players that
control a State in that Country. These players currently share national fuel
prices. Generators that use non-renewable fuel purchase their operating fuel at
prices determined by a national
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/fuel_market.rb&quot;&gt;fuel market&lt;/a&gt;.
Every day, the
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/market_price.rb&quot;&gt;market price&lt;/a&gt;
for each &lt;a href=&quot;https://github.com/peplin/threephase/blob/master/db/fixtures/010_fuel_markets.rb&quot;&gt;fuel type&lt;/a&gt;
(e.g. coal, oil or natural gas) is cleared based on the
total demand (the sum of the demand of all generators using that fuel, with
respect to their projected operating level) and the total supply (the sum of the
availability of the fuel's raw natural resource in the State maps).&lt;/p&gt;

&lt;p&gt;To bootstrap the economy, each fuel market is
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/fuel_market.rb#L106&quot;&gt;initialized&lt;/a&gt;
with a starting price (randomized within a pre-specified standard deviation, for
variety's sake) and a price elasticity of supply. The supply of fuel does not
change since State maps are static in the current implementation, so the price
elasticity is used to calculate how the fuel price reacts to changes in demand.&lt;/p&gt;
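
&lt;p&gt;As a rough illustration of how an elasticity can drive the price, the sketch
below assumes the textbook definition (elasticity equals the percentage change in
quantity divided by the percentage change in price) and applies it to a change in
demand. The function name and the linear approximation are assumptions, not the
formula used in the fuel market model linked above.&lt;/p&gt;

```ruby
# Hypothetical price update: a relative change in demand moves the price by
# that change divided by the price elasticity of supply.
def cleared_price(previous_price, previous_demand, demand, elasticity)
  relative_change = (demand - previous_demand) / previous_demand.to_f
  previous_price * (1 + relative_change / elasticity)
end
```

&lt;p&gt;With an elasticity of 2, a 10% rise in demand raises the price by 5%; a smaller
elasticity makes the price more sensitive to demand.&lt;/p&gt;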

&lt;p&gt;The shared fuel prices mean that a player's decision to build many large coal
generators can have implications for the generator portfolio of another player
(who may be less inclined to build coal plants due to the increasing cost of
fuel).&lt;/p&gt;

&lt;h2&gt;State&lt;/h2&gt;

&lt;p&gt;Within a State, the player has complete control over all generators and
transmission lines. The player can view the
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/public/javascripts/application.js#L157&quot;&gt;average cost curve&lt;/a&gt;
of their generators, graphed in order of their average cost. The ideal,
cost-minimizing strategy would run the generators in this order - cheapest
first.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/curve.png&quot; alt=&quot;Average Cost Curve&quot; /&gt;&lt;/p&gt;
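
&lt;p&gt;The cheapest-first ordering can be sketched as a simple dispatch loop. The
&lt;code&gt;Plant&lt;/code&gt; struct, its fields and the sample figures are hypothetical; the
sketch only shows the ordering idea and ignores transmission limits.&lt;/p&gt;

```ruby
# Hypothetical generator record: capacity and demand share one unit,
# average_cost is a cost per unit of output.
Plant = Struct.new(:name, :capacity, :average_cost)

# Run generators in order of average cost until demand is covered.
# Returns [name, output] pairs in dispatch order.
def dispatch(plants, demand)
  remaining = demand
  plants.sort_by { |plant| plant.average_cost }.map do |plant|
    output = [plant.capacity, remaining].min
    remaining -= output
    [plant.name, output]
  end
end
```

&lt;p&gt;Plants at the expensive end of the curve receive an output of zero whenever the
cheaper ones already cover demand.&lt;/p&gt;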

&lt;h2&gt;Average Cost Curve&lt;/h2&gt;

&lt;p&gt;Threephase makes a conscious decision to use the average cost curve, as opposed
to the marginal cost curve, in an attempt to take into account the capital
investment of each generator. This is a deviation from the industry norm, done
in part to assist my own understanding.&lt;/p&gt;

&lt;p&gt;A common point of confusion for non-experts is how utilities could ever expect
to do more than depreciate their equipment while operating at the marginal cost.
Different types of regulation attempt to compensate in various ways, most
commonly with what are known as capacity payments. These are side payments made
by the regulator to encourage re-investment and continued expansion of
operations.&lt;/p&gt;

&lt;h2&gt;Geographic Visualization&lt;/h2&gt;

&lt;p&gt;The dashboard of each State displays a basic visualization of the State and its
Cities. The map also
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/public/javascripts/map.js&quot;&gt;visualizes the composition of the land&lt;/a&gt;
underneath the State. Each small dot is a
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/block.rb&quot;&gt;Block&lt;/a&gt;,
and each Block has one of a few different types - mountains, plains or water.
Each Block also has an index per natural resource, describing its relative
abundance in that area. There are currently indices for the non-renewable
resources natural gas, coal and oil and for the renewable resources sun, water
and wind. In the current implementation, the block types and indices are chosen
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/block.rb#L46&quot;&gt;randomly&lt;/a&gt;.
This can be improved using map generation algorithms to create more
natural and useful land organization - mountain ranges, rivers, etc.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/overview.png&quot; alt=&quot;HUD&quot; /&gt;&lt;/p&gt;

&lt;p&gt;A screenshot of the State &quot;heads-up display&quot;, which shows a map of the natural
landscape, where the cities fall in relationship to natural resources, and
historical statistics on the price of fuel and electricity.&lt;/p&gt;

&lt;h2&gt;Location Based Discounts&lt;/h2&gt;

&lt;p&gt;The location of a City on the map has important implications. Based on the
availability of coal within the region, for example, a certain percentage
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/fuel_market.rb#L93&quot;&gt;discount&lt;/a&gt;
is given to coal generators operating in that City. Wind turbines in an area of
Blocks with high wind indices will be more effective than in other places.&lt;/p&gt;

&lt;p&gt;The discounts are
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/city.rb#L139&quot;&gt;calculated&lt;/a&gt;
by finding all blocks within a certain radius of the City, scaled based on the
population. Larger cities will extend further out from their center point, so
they can be expected to utilize a wider area of natural resources.&lt;/p&gt;
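
&lt;p&gt;A minimal sketch of the radius search, assuming Blocks sit on a flat grid and
that the radius grows linearly with population. The &lt;code&gt;MapBlock&lt;/code&gt; struct, the
scaling constant and the averaging step are illustrative assumptions rather than the
calculation in the City model linked above.&lt;/p&gt;

```ruby
# Hypothetical map block with grid coordinates and a coal abundance index.
MapBlock = Struct.new(:x, :y, :coal_index)

# Assumed scaling: one grid unit of radius per 100,000 residents, minimum 1.
def resource_radius(population)
  [population / 100_000.0, 1.0].max
end

# All blocks whose distance from the city center is within the radius.
def blocks_near(blocks, city_x, city_y, population)
  radius = resource_radius(population)
  blocks.select do |block|
    radius >= Math.hypot(block.x - city_x, block.y - city_y)
  end
end

# Average coal index of nearby blocks, a possible basis for a discount.
def average_coal_index(blocks, city_x, city_y, population)
  nearby = blocks_near(blocks, city_x, city_y, population)
  return 0.0 if nearby.empty?
  nearby.sum { |block| block.coal_index } / nearby.size.to_f
end
```

&lt;p&gt;A larger population widens the circle, so distant coal-rich Blocks start to
count toward the City's index.&lt;/p&gt;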

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/discount.png&quot; alt=&quot;Map&quot; /&gt;&lt;/p&gt;

&lt;p&gt;On the in-game map, the natural resource indices of the land beneath each City
can have an effect on the price of local generation. Blocks within a certain
radius of a City (green dots within the blue circle, scaled based on population)
that contain large amounts of coal or natural gas can make a City a good choice
for matching generator types.&lt;/p&gt;

&lt;h2&gt;Creating Objects&lt;/h2&gt;

&lt;p&gt;To create a
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/generator.rb&quot;&gt;generator&lt;/a&gt;,
the player selects a
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/generator_type.rb&quot;&gt;GeneratorType&lt;/a&gt;
from a list of &lt;a href=&quot;https://github.com/peplin/threephase/blob/master/db/fixtures/050_generator_types.rb&quot;&gt;available technologies&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Each Country can customize the list of available GeneratorTypes to simulate
different time periods and regulatory environments (e.g. before advanced
nuclear or coal with carbon capture and sequestration). The player can compare
the available GeneratorTypes based on their attributes to decide which would be
the best choice for their portfolio. The current interface is a simple table,
sortable by column. Additional comparison visualizations can make the choices
easier to understand, and convey some of the issues with new technologies.
&lt;a href=&quot;http://sds.hss.cmu.edu/risk/fleishman/InformationMaterials.html&quot;&gt;Lauren Fleishman's work&lt;/a&gt;
in summarizing and comparing generators is a good inspiration for the user
interface.&lt;/p&gt;

&lt;h2&gt;Generator Type Attributes&lt;/h2&gt;

&lt;p&gt;The attributes for a GeneratorType loosely belong to three different groups -
those that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scale the marginal cost&lt;/li&gt;
&lt;li&gt;Scale the capital cost&lt;/li&gt;
&lt;li&gt;Scale the rate of occasional one-time events, both positive and negative&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;For example, the waste disposal cost affects the marginal cost of power for the
GeneratorType. The tax credit lowers the initial capital investment, relative
to the size of the generator. The technical reliability affects the frequency
of equipment failure, and the technical complexity affects the time it takes to
repair a generator once a failure occurs. A complete list is available in the
&lt;a href=&quot;https://github.com/peplin/threephase/wiki/Technical-Component-Parameters&quot;&gt;Threephase wiki&lt;/a&gt;.
Some of the values for attributes were inferred from the
&lt;a href=&quot;http://www.eia.doe.gov/&quot;&gt;U.S. Energy Information Administration&lt;/a&gt;, but others
still need to be researched.&lt;/p&gt;

&lt;h3&gt;Capacity Range&lt;/h3&gt;

&lt;p&gt;Each GeneratorType also has a range of valid capacities - the number of MW the
GeneratorType can produce at its peak. This range reflects the general
capabilities of the GeneratorType, and also its typical applications in real
world power grids. For example, gas turbines have relatively lower capacity
limits than nuclear generators, and they are typically used as peaking plants
(to cover spikes in demand) as opposed to baseload plants which are more
efficient in large capacities.&lt;/p&gt;

&lt;h2&gt;Object Type Extensibility&lt;/h2&gt;

&lt;p&gt;These attributes and effects are not specific to generators. The objects in
Threephase are members of a
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/technical_component.rb&quot;&gt;TechnicalComponent&lt;/a&gt;
hierarchy, allowing them to share the flexibility enjoyed by generators.
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/line.rb&quot;&gt;Transmission lines&lt;/a&gt;,
power storage devices, and other component classes all have
these attributes.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/technical-components.png&quot; alt=&quot;Technical Components&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Generators, transmission lines and power storage devices all inherit from a
common parent type - the &lt;code&gt;TechnicalComponent&lt;/code&gt;. This allows the game to share
code when interacting with each type of object, and still allow for some
customization.&lt;/p&gt;

&lt;p&gt;In the case of transmission lines, the player can choose from
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/line_type.rb&quot;&gt;LineTypes&lt;/a&gt;
such as &lt;a href=&quot;https://github.com/peplin/threephase/blob/master/db/fixtures/050_line_types.rb&quot;&gt;high-voltage DC and high-voltage AC&lt;/a&gt;
of varying capacities and resistances.&lt;/p&gt;

&lt;p&gt;Most of the attributes are shared with generators through the TechnicalComponent
model, but class-specific attributes (e.g. underground vs. above ground lines)
are also supported.&lt;/p&gt;

&lt;h3&gt;Implementation Note&lt;/h3&gt;

&lt;p&gt;The relationships are maintained using single-table inheritance (so
&lt;code&gt;GeneratorType&lt;/code&gt;, &lt;code&gt;LineType&lt;/code&gt; and &lt;code&gt;StorageDeviceType&lt;/code&gt; share a database table) and
polymorphic associations (so a State can reference instances of the three
classes in a generic fashion).&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section on
&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;game mechanics&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other Threephase Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;Game Mechanics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-evaluation/&quot;&gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>The Mechanics of Threephase</title>
   <link href="http://christopherpeplin.com/2011/05/threephase-mechanics/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/threephase-mechanics</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the &lt;a href=&quot;/2011/05/threephase/&quot;&gt;Threephase&lt;/a&gt;
project.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Demand&lt;/h2&gt;

&lt;p&gt;In a new state, players will see immediately that their power grid is not
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/state.rb#L179&quot;&gt;meeting customer demand&lt;/a&gt;.
This is critical in power systems, much more so than in any other commodity market.
Unlike typical consumer product supply and demand, a lack of supply of
electricity doesn't generate consumer buzz like a gadget shortage. The system
experiences instability and outages if demand and generation don't match
exactly.&lt;/p&gt;

&lt;h2&gt;Notifications&lt;/h2&gt;

&lt;p&gt;Thanks to the way players authenticate with Threephase (with their existing
Twitter or Facebook account, using the OAuth protocol), the server can
optionally communicate to players out-of-band when such emergency situations
arise. Imagine a tweet or Facebook message from Threephase when generation dips
dangerously close to or below the level of demand.&lt;/p&gt;

&lt;h2&gt;Changing Demand&lt;/h2&gt;

&lt;p&gt;The first solution to not meeting demand, of course, is to build more
generators. Not all is perfect, however, as demand is not static. The load of
each City (and overall that of the State) changes based on the time of day. Each
City has a predefined
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/city.rb#L66&quot;&gt;load profile&lt;/a&gt;
function, which determines how that City's demand changes during the day.
Generally, demand is higher in the afternoon and early evening than late at
night. A player's system may be sufficient at 8am, but insufficient later in the
day.&lt;/p&gt;

&lt;p&gt;The current load profile function is static, and simply scales
linearly with the number of customers in a City. In the future, each City could
have a more intelligent, varying load profile function. The current function
(found by visual approximation) is:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;(-.1 ∗ ((.42 ∗ Hour - 5)^4) + 100) ∗ (Customers / Constant)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/profile.png&quot; alt=&quot;Load Profile&quot; /&gt;&lt;/p&gt;
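
&lt;p&gt;As a rough sketch, the load profile can be written as a small Ruby function.
This reads the formula above as a fourth-power curve and uses a made-up value
for the scaling constant, which this post does not specify. The names
&lt;code&gt;load_profile&lt;/code&gt; and &lt;code&gt;CUSTOMERS_PER_UNIT&lt;/code&gt; are illustrative and are not taken
from the Threephase source.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Hypothetical scaling constant; its real value is not given in this post.
CUSTOMERS_PER_UNIT = 1000.0

# Approximate demand for a city at a given hour of the day (0 to 24).
# The quartic term peaks around midday and falls off toward midnight.
def load_profile(hour, customers)
  shape = -0.1 * ((0.42 * hour - 5)**4) + 100
  shape * customers / CUSTOMERS_PER_UNIT
end

[0, 8, 12, 18, 23].each do |hour|
  puts format('%02d:00 %10.1f', hour, load_profile(hour, 50_000))
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Doubling the number of customers doubles the result at every hour, which is
the linear scaling described above.&lt;/p&gt;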

&lt;h3&gt;Equipment Failure&lt;/h3&gt;

&lt;p&gt;In addition to changes in demand, components in the system can also fail due to
equipment malfunction, natural disasters or union strikes. The system must have
enough capacity to withstand the loss of its largest component - this is known
as an n - 1 reliability constraint. As mentioned earlier, the GeneratorType determines the relative frequency and
severity of failures. The Country can also enforce stricter reliability
constraints (e.g. n - 2) for experimentation.&lt;/p&gt;
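
&lt;p&gt;The n - 1 check itself is simple arithmetic. A minimal sketch, assuming each
generator is represented only by its capacity in MW; the method names are
hypothetical and do not come from the Threephase models:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Spare capacity (in MW) left after losing the n largest generators.
def reserve_margin_mw(capacities_mw, peak_demand_mw, n = 1)
  capacities_mw.sum - capacities_mw.max(n).sum - peak_demand_mw
end

# True when peak demand can still be met after those n failures.
def meets_reliability?(capacities_mw, peak_demand_mw, n = 1)
  !reserve_margin_mw(capacities_mw, peak_demand_mw, n).negative?
end

fleet = [500, 300, 200]
puts meets_reliability?(fleet, 450)     # n - 1
puts meets_reliability?(fleet, 450, 2)  # n - 2
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With 1,000MW installed and 450MW of peak demand, this fleet survives the
loss of its 500MW unit but not the loss of its two largest units.&lt;/p&gt;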

&lt;p&gt;The parameters for describing these failures - technical reliability (i.e. mean
time between failure) and technical complexity (i.e. mean time to repair) - are
not the same descriptors used by the electricity industry, but they are more
familiar to laypeople and describe similar concepts to the system-wide metrics
used by experts (e.g. the system average interruption duration index, or SAIDI).
Together, these two attributes determine how often failures are triggered and
how long they last.&lt;/p&gt;
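
&lt;p&gt;One common way to combine these two numbers is steady-state availability, the
fraction of time a component is expected to be in service. This is a standard
reliability formula rather than something taken from the Threephase code, and
the figures below are invented:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Fraction of time a component is expected to be running, given its
# mean time between failures and mean time to repair (same units).
def availability(mtbf_hours, mttr_hours)
  mtbf_hours.to_f / (mtbf_hours + mttr_hours)
end

puts availability(950, 50)
&lt;/code&gt;&lt;/pre&gt;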

&lt;h2&gt;Primary Player Goals&lt;/h2&gt;

&lt;p&gt;Players of the game have two basic motives:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Generate enough power to meet demand&lt;/li&gt;
&lt;li&gt;Route the power generated to the demand&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Line Constraints&lt;/h3&gt;

&lt;p&gt;The requirement to transmit power changes the reality of the operating strategy
quite a bit. The ideal system, where generators are operated in order of their
marginal cost, becomes impossible when the physical location of generators and
customers is considered. The line constraint feature was removed from the list
of initial features of Threephase, but it is the next big logical step for
simulating reality.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/line-constraint.png&quot; alt=&quot;Line Constraints&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The physical location of load on the grid can make the ideal, cost-minimizing
generation scenario impossible. In this example, Titusville has 150MW of demand
and no generator, but the only transmission line into the City has a maximum
capacity of 50MW. It is impossible to transmit enough electricity to meet
demand.&lt;/p&gt;
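
&lt;p&gt;The shortfall in this example is easy to compute by hand. A small sketch,
using hypothetical method and parameter names since line constraints are not
yet part of Threephase:&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Demand (in MW) that cannot be served in a city, given its local
# generation and the combined capacity of its incoming lines.
def unserved_demand_mw(demand_mw, local_generation_mw, line_capacity_mw)
  [demand_mw - local_generation_mw - line_capacity_mw, 0].max
end

# Titusville: 150MW of demand, no generator, one 50MW line.
puts unserved_demand_mw(150, 0, 50)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;For Titusville this leaves 100MW unserved, no matter how much cheap
generation exists elsewhere in the State.&lt;/p&gt;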

&lt;h2&gt;Profit&lt;/h2&gt;

&lt;p&gt;The true root motive of any utility operator is profit. The ability to make a
profit on the system is critical to re-investment in new technology, system
upgrades, and investor satisfaction (in the case of investor-owned utilities).
This greatly depends on the economic regulatory environment, both in the real
world and Threephase. The
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/state.rb#L164&quot;&gt;current implementation&lt;/a&gt;
supports rate of return and marginal cost bidding regulation.&lt;/p&gt;

&lt;h3&gt;Simplified Operating Costs&lt;/h3&gt;

&lt;p&gt;In all of the regulatory environments, the actual cost of operations depends on
the operating levels of each generator. In the current implementation, this is
set based on the ideal strategy - generators are enabled in order of their
marginal or average cost. Transmission line constraints must be completed before
a more realistic scenario can be demonstrated.&lt;/p&gt;

&lt;h3&gt;Rate of Return&lt;/h3&gt;

&lt;p&gt;&lt;a href=&quot;http://en.wikipedia.org/wiki/Rate-of-return_regulation&quot;&gt;Rate of return&lt;/a&gt;
regulation is the simplest to calculate and understand. The customers' payments
are simply the total cost of operating the system at the level demanded,
marked up by a regulated rate of return (e.g. 8%). This type of regulation is
highly desirable for utilities, as players of the game will quickly realize. A
guaranteed return on investment is great encouragement for expanding the system
to levels beyond what is actually required. The cost of capital in this system
is also lowered, as the risk to banks loaning money is low if the debtor is
guaranteed a return on their investment by the government.&lt;/p&gt;
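
&lt;p&gt;As a worked example, assume the regulated rate is applied as a markup on the
total operating cost. That is one reading of the description above; the linked
&lt;code&gt;State&lt;/code&gt; model is the authority on the actual calculation, and the method name
here is hypothetical.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Customer payments under rate of return regulation: the operating cost
# plus a guaranteed percentage return on that cost.
def regulated_revenue(operating_cost, rate_percent)
  operating_cost + operating_cost * rate_percent / 100.0
end

puts regulated_revenue(1_000_000, 8)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Because the return scales with cost, a larger system earns a larger absolute
return, which is the incentive to over-build described above.&lt;/p&gt;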

&lt;p&gt;In real-world rate of return regulation, there is a possibility that investment
decisions made by the utility will not be approved by the regulator. In the
future, Threephase could add intelligence to its regulating algorithm to reject
extraneous investment and equipment purchases.&lt;/p&gt;

&lt;p&gt;In the current implementation, approval is always granted.&lt;/p&gt;

&lt;h3&gt;Marginal Cost Bidding&lt;/h3&gt;

&lt;p&gt;The next type of regulation is marginal cost, or average cost bidding. This type
calculates the average cost curve each day, and the generators
&quot;&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/bid.rb&quot;&gt;bid in&lt;/a&gt;&quot;
at their marginal or average cost (a bid price enforced by the regulators). The
market price of electricity for the day is set at the intersection of demand
and that curve. Generators at the intersection price will break even, generators
above it will potentially make a profit and those below are operating below
their marginal cost and are thus guaranteed to lose money.&lt;/p&gt;
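
&lt;p&gt;The clearing step can be sketched as a merit-order walk over the bids. This
is an illustration under simplifying assumptions (a single fixed demand figure
for the day and one offer per generator). The &lt;code&gt;Offer&lt;/code&gt; struct and
&lt;code&gt;clear_market&lt;/code&gt; method are hypothetical and are not the linked &lt;code&gt;Bid&lt;/code&gt; model.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;Offer = Struct.new(:generator, :capacity_mw, :price)

# Accept offers from cheapest to most expensive until demand is covered.
# The price of the last accepted offer becomes the market price.
def clear_market(offers, demand_mw)
  remaining = demand_mw
  clearing_price = nil
  dispatch = {}

  offers.sort_by { |offer| offer.price }.each do |offer|
    break unless remaining.positive?

    accepted = [offer.capacity_mw, remaining].min
    dispatch[offer.generator] = accepted
    remaining -= accepted
    clearing_price = offer.price
  end

  { price: clearing_price, dispatch: dispatch, unserved_mw: remaining }
end

offers = [
  Offer.new(:coal, 400, 20),
  Offer.new(:gas, 200, 50),
  Offer.new(:nuclear, 500, 10)
]
p clear_market(offers, 700)
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Here nuclear and coal are dispatched, gas stays offline, and the day's price
is set by coal's offer of 20.&lt;/p&gt;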

&lt;p&gt;In both rate of return and marginal cost bidding, the generators' operating
levels are set
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/app/models/state.rb#L111&quot;&gt;automatically each day&lt;/a&gt;
by the server. An optional operating level override is planned for future
versions, to allow players to experiment with and view the effects of market
manipulation.&lt;/p&gt;

&lt;h3&gt;Locational Marginal Pricing&lt;/h3&gt;

&lt;p&gt;The next regulation type to be implemented in Threephase will be locational
marginal pricing.&lt;/p&gt;

&lt;p&gt;This regulation type depends on determining generator operating levels that
respect transmission line constraints. Each City in the State (more generally
each node) has a local price, which is affected by the system-wide transmission
capacity, cost of local generation and the local demand.&lt;/p&gt;

&lt;p&gt;In this example, Titusville has a marginal price of $80 because of
its isolation from high capacity transmission lines, relatively high demand and
(not pictured) an expensive local generator to make up the difference.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/lmp.png&quot; alt=&quot;LMP&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section on
&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;in-game time&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other Threephase Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase/&quot;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;Game Objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-evaluation/&quot;&gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Implementation of Threephase</title>
   <link href="http://christopherpeplin.com/2011/05/threephase-implementation/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/threephase-implementation</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the &lt;a href=&quot;/2011/05/threephase/&quot;&gt;Threephase&lt;/a&gt;
project.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Test-driven Development&lt;/h2&gt;

&lt;p&gt;The development of Threephase followed the test-driven design philosophy. This
method calls for writing a test case that exercises a small piece of desired
functionality before any implementation exists. Initially the test will fail
since nothing has been implemented. The implementation goal is to write just
enough code to pass the test - providing some degree of certainty in correctness
and making sure no more code than is necessary is written. The resulting
collection of test cases (the test suite) is also a critical tool for making
sure that contributions from the open source community don't break existing
features. Each test case also serves as live, runnable documentation of how a
class or method is supposed to work.&lt;/p&gt;

&lt;h2&gt;Unit Testing&lt;/h2&gt;

&lt;p&gt;At its core, Threephase is a web application using the
&lt;a href=&quot;http://rubyonrails.org/&quot;&gt;Ruby on Rails&lt;/a&gt; framework. Most of the interesting
logic is in the application's models (i.e. the State, Generator, etc.), so there
is relatively loose coupling to Rails itself. The test suite uses the
&lt;a href=&quot;http://rspec.info/&quot;&gt;RSpec&lt;/a&gt; testing framework for its human-readable test cases
and strong integration with Rails. To test the standard request/response
patterns of the game's API, the project uses a
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/spec/support/crud_helper.rb&quot;&gt;custom set of RSpec feature groups&lt;/a&gt;.
These can be called like methods in a test case to avoid duplication, e.g.:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;&lt;span class=&quot;n&quot;&gt;describe&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;StatesController&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;before&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
        &lt;span class=&quot;vi&quot;&gt;@game&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;no&quot;&gt;Factory&lt;/span&gt; &lt;span class=&quot;ss&quot;&gt;:game&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;context&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;as an admin&amp;quot;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;before&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;do&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;login_as_admin&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

        &lt;span class=&quot;n&quot;&gt;it_should_behave_like&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;standard GET show&amp;quot;&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;it_should_behave_like&lt;/span&gt; &lt;span class=&quot;s2&quot;&gt;&amp;quot;standard PUT update&amp;quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;This example makes use of the &lt;code&gt;it_should_behave_like&lt;/code&gt; feature group ability in
RSpec, allowing much of the test logic to be shared among controllers.&lt;/p&gt;

&lt;h3&gt;Object Factories&lt;/h3&gt;

&lt;p&gt;Instead of test fixtures (preloaded database objects), Threephase uses
the &lt;a href=&quot;https://github.com/thoughtbot/factory_girl&quot;&gt;Factory Girl&lt;/a&gt; framework to build
objects in test cases. Object factories are simpler to maintain than
fixtures, and minimize the work needed to keep tests current while data models
are in flux.&lt;/p&gt;

&lt;h2&gt;Graphics&lt;/h2&gt;

&lt;p&gt;Threephase originally intended to use the Javascript graphics library
&lt;a href=&quot;http://processingjs.org&quot;&gt;Processing.js&lt;/a&gt; to render the map, charts and graphs in
the browser. Upon further investigation, the library seemed to lack helpful
charting features that other libraries offered, and the odd support of syntax
from the Java version of Processing made using the language somewhat unnatural
(a pure Javascript API to Processing.js was discovered later).&lt;/p&gt;

&lt;p&gt;Instead, Threephase uses two different Javascript graphics libraries:
&lt;a href=&quot;http://raphaeljs.com&quot;&gt;Raphaël&lt;/a&gt; for basic charting and mapping and
&lt;a href=&quot;http://code.google.com/p/flot/&quot;&gt;Flot&lt;/a&gt; for the piecewise graph
necessary for the
&lt;a href=&quot;https://github.com/peplin/threephase/blob/master/public/javascripts/application.js#L157&quot;&gt;average cost curve visualization&lt;/a&gt;.
Raphaël also has an existing charting extension,
&lt;a href=&quot;http://g.raphaeljs.com&quot;&gt;gRaphaël&lt;/a&gt; which reduced the amount of boilerplate graph
code that had to be written.&lt;/p&gt;

&lt;h3&gt;Performance&lt;/h3&gt;

&lt;p&gt;The performance of both libraries is very good in modern browsers (performance
tested in Mozilla Firefox and Google Chrome). The bottleneck for
complex visualizations at the moment is the time spent downloading the
required data from the server in the background, not rendering.&lt;/p&gt;

&lt;h2&gt;Database&lt;/h2&gt;

&lt;p&gt;The majority of the objects in Threephase are stored using the standard
object-relational mapping provided by Rails, backed by a PostgreSQL database.
The objects in Threephase (and the physical entities in the real world) are
highly relational, so this is a good choice not only for the wide support of SQL
databases, but because it fits the data model well.&lt;/p&gt;

&lt;h2&gt;Asynchronous Tasks&lt;/h2&gt;

&lt;p&gt;The browser-based nature of Threephase rests on HTTP, the most basic web
protocol. HTTP has no knowledge of long running processes, and the relationship
between a client and the server is finished after a single request/response
cycle. There is not an immediately clear way to mesh this with the very
demanding interaction required by games.&lt;/p&gt;

&lt;h3&gt;Motivation&lt;/h3&gt;

&lt;p&gt;In order to provide reasonable response times, so players don't get impatient
waiting for pages to load, the majority of the computation to update the game
needs to happen outside of the normal player interaction cycle - whether
updating one element or a thousand.&lt;/p&gt;

&lt;p&gt;A desktop game may use different threads of execution to make sure a player is
never waiting for a network packet to finish downloading, or for a texture to
load from the hard drive. In web applications, the server can use asynchronous
task queues to accomplish the same thing.&lt;/p&gt;

&lt;p&gt;Whenever possible, computation is bundled up into a &quot;task&quot; and queued to run at
a later time - ideally as soon as possible, so the player gets updated data, but
with no guarantee that it will happen before the server returns a response to
the client. If the job hasn't completed, it may return cached data that is
valid, but not completely up-to-date. There is a trade-off between performance
and liveliness, and in this case player perception of the game's speed is more
important than absolutely current information. To accomplish this, Threephase
uses the &lt;a href=&quot;https://github.com/defunkt/resque&quot;&gt;Resque&lt;/a&gt; background job library backed by a Redis database.&lt;/p&gt;
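
&lt;p&gt;The pattern, independent of any particular library, looks roughly like this.
The sketch below is an in-process stand-in written in plain Ruby, not the
Resque API, and the class and method names are invented for illustration.&lt;/p&gt;

&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;# Stand-in for a job queue: requests enqueue work and return the cached
# value right away; a worker applies the queued updates later.
class LedgerCache
  attr_reader :cached_balance

  def initialize
    @cached_balance = 0
    @pending = Queue.new
  end

  # Runs inside the request/response cycle, so it must return quickly.
  def charge_later(amount)
    @pending.push(amount)
    cached_balance
  end

  # Runs outside the request cycle, e.g. in a background worker.
  def work_off
    @cached_balance += @pending.pop until @pending.empty?
  end
end

ledger = LedgerCache.new
puts ledger.charge_later(25)  # still the old, cached balance
ledger.work_off
puts ledger.cached_balance    # now includes the queued charge
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The first call returns stale but valid data immediately, which is the
trade-off described above.&lt;/p&gt;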

&lt;h3&gt;Task Examples&lt;/h3&gt;

&lt;p&gt;Examples of work to be done in tasks include:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Charging customers based on their demand and the marginal price&lt;/li&gt;
&lt;li&gt;Deducting power grid operating costs over a time period&lt;/li&gt;
&lt;li&gt;Handling random events: research advancements that lower capital costs,
  unionized worker strikes, etc.&lt;/li&gt;
&lt;li&gt;Clearing the market price of each fuel&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Currently, Threephase only uses one simple task to update the game world. As
more update logic is added, it will be implemented as tasks.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section, an
&lt;a href=&quot;/2011/05/threephase-evaluation/&quot;&gt;evaluation&lt;/a&gt; of Threephase.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other Threephase Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase/&quot;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;Game Objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;Game Mechanics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-evaluation/&quot;&gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Evaluation of Threephase</title>
   <link href="http://christopherpeplin.com/2011/05/threephase-evaluation/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/threephase-evaluation</id>
   <content type="html">&lt;p&gt;&lt;em&gt;This post is part of a series describing the &lt;a href=&quot;/2011/05/threephase/&quot;&gt;Threephase&lt;/a&gt;
project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;The proposed endpoint of this project was a deployed game server with a playable
version of Threephase. One of the first tasks to make sure this was accomplished
was to define the scope of the game for the initial version. Clearly, there is
enough material to extend beyond a two month timeframe. The core logic of the
virtual world took longer than expected to implement, and as a result progress
fell behind the list of planned features. Threephase is not currently robust
enough for a public deployment, and needs additional work to optimize
performance and the user interface. The three evaluation criteria from the
project proposal were:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Reception among players. Document reactions during beta testing. Is the game
  captivating?&lt;/li&gt;
&lt;li&gt;Conveyance of critical power systems concepts. Is the game a useful teaching
  aid?&lt;/li&gt;
&lt;li&gt;Robustness and scalability of game architecture. Is the system well-designed?
  Are upgrades streamlined?&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The project was not far enough along by the end of the semester to perform user
testing, but the introduction of power systems concepts and scalability of the
system made good progress.&lt;/p&gt;

&lt;h2&gt;Design Changes&lt;/h2&gt;

&lt;p&gt;Threephase evolved from a turn-based game to real-time before the design was
finalized. Real-time strategy games are a more natural setting for players, but
when the project was proposed the technology for real-time updates in the
browser was not mature. While still an emerging technology, recent developments
in non-blocking web servers and browser sockets led to a shift in Threephase's
playing style.&lt;/p&gt;

&lt;p&gt;Instead of requiring a somewhat complicated system of action points and player
turns, the game world is persistent and never stops. This presented some
additional implementation challenges, but in the end will be a more compelling
interface.&lt;/p&gt;

&lt;h2&gt;Project Management&lt;/h2&gt;

&lt;p&gt;The project uses the &lt;a href=&quot;http://threephase.lighthouseapp.com/&quot;&gt;Lighthouse&lt;/a&gt; issue
tracking system to track milestones and tasks. The initial schedule planned a
milestone every four days. This duration ended up being too short, and did not
allow for slippage or early finishes. Week-long milestones (for a project
receiving 3/4 of the time of its participants) would be a more flexible and
realistic time period.&lt;/p&gt;

&lt;h3&gt;Time Distribution&lt;/h3&gt;

&lt;p&gt;The time spent on Threephase was tracked over the past two months with the time
tracking tool &lt;a href=&quot;https://github.com/samg/timetrap&quot;&gt;Timetrap&lt;/a&gt;. Split into general
categories, most of the time was spent on the backend code (which determines the
data storage and object interaction). The second most time was spent in testing.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/time-tracking.png&quot; alt=&quot;Time Distribution&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The initial concept of Threephase evolved over the summer months, culminating in
software requirements and architecture documents (now in the project's wiki
page). The first month of development was spent implementing the core data
models, application controllers and basic web views to interact with the game.
The second and final month was spent implementing the game logic - maintaining
consistency in the virtual world, updating objects and enforcing the different
regulatory conditions.&lt;/p&gt;

&lt;h2&gt;Future Work&lt;/h2&gt;

&lt;h3&gt;Serious Games&lt;/h3&gt;

&lt;p&gt;Threephase represents a proof of concept of a platform for teaching and
experimenting with power system concepts in the context of a familiar game-like
interface. The motive roughly follows the ideas behind the Serious Games
Initiative (and other, less formal pushes towards games for learning and
experimenting).&lt;/p&gt;

&lt;p&gt;Serious games try to leverage the immersive power of games for education, for
both academics and continued learning. These games also serve as excellent
training tools within industry. Instead of building strictly academic
simulations, experts can make an approachable game, something that laypeople
would be interested in playing and the experts themselves would enjoy outside of a
class or job.&lt;/p&gt;

&lt;h4&gt;Player Responsibility Scope&lt;/h4&gt;

&lt;p&gt;A core requirement of serious games is to avoid simplifying critical components,
which could alienate the experts of the field, while providing enough
automation that newcomers can learn at their own pace. The game needs to be able
to adjust the scope of decisions a player must make, and the extent of the
complexity they see, in order to allow players to focus on only certain aspects
of the system at one time.&lt;/p&gt;

&lt;p&gt;Something discovered firsthand over the course of this project is that it can be
overwhelming to be responsible for all aspects of the system, coming dangerously
close to the micromanagement of resources. The game must balance between system
level decisions and real-world issues that high-level simulations often ignore.
The player should be able to select from a range of difficulty levels that
automate different areas of the system to adjust the difficulty.&lt;/p&gt;

&lt;h4&gt;Expanded Multiplayer&lt;/h4&gt;

&lt;p&gt;Collaborative learning is also possible with serious games. In Threephase,
each State is run by an individual player, but this could be expanded to allow
cooperative play - one player controlling the generators and another the
transmission lines. This actually reflects another real-world regulatory
scenario, where these areas of the power grid are separated by law into
different management entities.&lt;/p&gt;

&lt;h3&gt;Improvements&lt;/h3&gt;

&lt;p&gt;There is a long list of features that could be added to Threephase. The most
interesting and pressing items are:&lt;/p&gt;

&lt;h4&gt;Line Constraints &amp;amp; Locational Marginal Prices&lt;/h4&gt;

&lt;p&gt;Transmission line construction still needs to be implemented, and line
constraints need to be respected when determining the operating levels of
generators. Without this, the game lacks a key component of real power systems.
LMP-style regulation depends on this feature.&lt;/p&gt;

&lt;h4&gt;Avoiding Outages&lt;/h4&gt;

&lt;p&gt;The consequences for not meeting demand in Threephase are unclear.&lt;/p&gt;

&lt;p&gt;The player is warned of the condition and their state essentially
freezes in place - customers aren't charged, operating costs aren't deducted,
and no power is generated. This is an overly forgiving approach, as there are
serious consequences for an outage in the real world ranging from unhappy
customers to financial penalties and even eviction from the market.&lt;/p&gt;

&lt;p&gt;Players in Threephase are given this great leeway to allow new players a
build-up period, where they can build enough generators to meet demand when
first joining a game. This phase could be re-worked to occur before players
officially join the game and must begin running their utility.&lt;/p&gt;

&lt;p&gt;Once the game has started and harsher consequences are in place, a more useful
warning for players will be that &quot;generation is projected to come dangerously
close to not meeting demand,&quot; thereby giving the player a chance to resolve
the situation before an outage.&lt;/p&gt;

&lt;h4&gt;Load Profiles &amp;amp; Demand Response&lt;/h4&gt;

&lt;p&gt;The load profile of each city is static, and varies only linearly with the
population. This could be improved not only by introducing more
interesting variations, but by incorporating the idea of demand response. If
customers are offered time-dependent electricity prices, they (or their
networked appliances) can schedule their operating hours to minimize their costs
and stabilize the load for generators. For example, electricity is generally
less expensive at night due to excess capacity (and can even have a negative
price), and customers are unaffected by short voluntary outages of certain
appliances.&lt;/p&gt;

&lt;p&gt;This feature would require additional intelligence in the load profile
algorithm, as its value would be based on price as well as time.&lt;/p&gt;

&lt;h4&gt;Intelligent Map Generation&lt;/h4&gt;

&lt;p&gt;The maps assigned to each State must be generated more intelligently, creating a
a natural landscape with a relationship between blocks. The current
implementation does not allow for realistic groupings of generators around
certain resources (e.g. wind farms in a windy area), since the indices can shift
dramatically from block to block.&lt;/p&gt;

&lt;h4&gt;Expanded Multiplayer&lt;/h4&gt;

&lt;p&gt;The multiplayer aspects of the game could be expanded beyond shared national
fuel prices to include interstate trade. Interstate transmission lines are
implemented but not exposed to the player in the interface. Fuel contracts and
contracts for differences on transmission have also been proposed.&lt;/p&gt;

&lt;h4&gt;Interactive Visualizations&lt;/h4&gt;

&lt;p&gt;The map and chart visualizations need to be improved to be more accurate,
interactive and useful. The map rendering was intended as the primary interface
for the game, but sits as a sidebar in the current implementation. It should
present a natural way of viewing the geography of the State and interacting with
the generators and transmission lines. Threephase exposes a JSON API for nearly
every function of the game, so the possibilities here depend primarily on user
experience decisions.&lt;/p&gt;

&lt;h4&gt;Mobile Client&lt;/h4&gt;

&lt;p&gt;Because an API already exists, a mobile client would be a good addition to the
system, so players can keep track of their power grid without being near a
browser.&lt;/p&gt;

&lt;h2&gt;Other Threephase Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase/&quot;&gt;Overview&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;Game Objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;Game Mechanics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Threephase - A Browser-based Electric Power System Game</title>
   <link href="http://christopherpeplin.com/2011/05/threephase/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/threephase</id>
   <content type="html">&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/threephase-logo.jpg&quot; alt=&quot;Logo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;This post is part of a series describing the &lt;a href=&quot;/2011/05/threephase/&quot;&gt;Threephase&lt;/a&gt;
project.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Threephase is a web browser-based game simulating the electric power generation
and transmission system. The project was completed over a two month period
beginning in September 2010, to satisfy the graduate project requirement for
Master's in Information Networking (MSIN) candidates at the
&lt;a href=&quot;http://www.ini.cmu.edu&quot;&gt;Information Networking Institute&lt;/a&gt; of Carnegie Mellon
University (CMU).&lt;/p&gt;

&lt;p&gt;This article is an abridged version of the
&lt;a href=&quot;http://things.rhubarbtech.com/threephase/report.pdf&quot;&gt;final report&lt;/a&gt;,
reformatted for the web and with some (less formal) comments and reflections
added. I've split it up into a few posts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;Game Objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;Game Mechanics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;The primary inspiration for Threephase was the class &quot;The Engineering &amp;amp;
Economics of Power Systems&quot; offered at CMU in the Spring of 2010. The class
introduced core power system concepts and discussed many of the issues affecting
the utilities today. From my perspective as a computer scientist and video
gamer, the available computer simulations for learning these concepts had room
for expansion and improvement.&lt;/p&gt;

&lt;p&gt;The power system is a growing, popular concern whose complexity is not
well understood by non-experts. The simulations and teaching tools currently
available aren't sufficiently accessible and modern to attract people from
outside the industry. Threephase is an attempt to balance between the artistic,
playful and technical elements to create an immersive virtual world for
experimentation and learning.&lt;/p&gt;

&lt;p&gt;From conception to implementation, the design shifted in a few ways in response
to the demands of the web-based user interface. The nature of the web protocol
HTTP also presented unique challenges to a real-time game, and Threephase
applies some novel techniques to find scalable solutions.&lt;/p&gt;

&lt;h3&gt;Source Code&lt;/h3&gt;

&lt;p&gt;The game's source code is provided under an MIT open source license at
&lt;a href=&quot;https://github.com/peplin/threephase&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Acknowledgements&lt;/h3&gt;

&lt;p&gt;This project would not have been possible without the assistance of my advisors
and professors at CMU. The course I mentioned was taught by:
&lt;a href=&quot;http://public.tepper.cmu.edu/facultydirectory/FacultyDirectoryProfile.aspx?id=88&quot;&gt;Dr. Lester Lave&lt;/a&gt;
(who sadly passed away shortly after the completion of this project),
&lt;a href=&quot;http://www.ece.cmu.edu/~milic/&quot;&gt;Dr. Marija Ilić&lt;/a&gt; and &lt;a href=&quot;http://www.linkedin.com/pub/jovan-ilic/5/846/319&quot;&gt;Dr. Jovan Ilić&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The advisors for this project in particular were
&lt;a href=&quot;http://public.tepper.cmu.edu/facultydirectory/FacultyDirectoryProfile.aspx?id=211&quot;&gt;Dr. Jay Apt&lt;/a&gt;
and &lt;a href=&quot;http://www.ece.cmu.edu/directory/details/4617&quot;&gt;Dr. Gabriela Hug&lt;/a&gt;.&lt;/p&gt;

&lt;h3&gt;Current Player Actions &amp;amp; Abilities&lt;/h3&gt;

&lt;p&gt;In the current version of Threephase, players can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Create new games and choose attributes for the game&lt;/li&gt;
&lt;li&gt;Join existing games that have already started&lt;/li&gt;
&lt;li&gt;Build city-local generators from a list of available types and a range of
  capacities&lt;/li&gt;
&lt;li&gt;View marginal price of electricity over time&lt;/li&gt;
&lt;li&gt;View marginal cost of each generator over time&lt;/li&gt;
&lt;li&gt;View marginal price of each type of fuel&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Once the objects are created, the current version of the backend can:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Calculate market price for each fuel based on supply and demand&lt;/li&gt;
&lt;li&gt;Automatically assign the optimal operating level for each generator&lt;/li&gt;
&lt;li&gt;Order generators based on marginal cost or average cost&lt;/li&gt;
&lt;li&gt;Deduct operating costs (cost of fuel, cost of workforce, etc.) over a time
  period from a player's cash&lt;/li&gt;
&lt;li&gt;Add customer payments (based on marginal price of electricity) over a time
  period to a player's cash&lt;/li&gt;
&lt;li&gt;Calculate marginal price for rate of return regulation&lt;/li&gt;
&lt;li&gt;Calculate marginal price for marginal cost bidding regulation, assuming a
  vertical demand curve&lt;/li&gt;
&lt;li&gt;Discount generator operating costs based on map geology&lt;/li&gt;
&lt;li&gt;Trigger random equipment failures (rate determined by generator attributes) -
  equipment repair is notably not yet implemented&lt;/li&gt;
&lt;/ul&gt;
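
&lt;p&gt;To make the marginal cost bidding item above more concrete, here is a minimal
sketch of merit-order dispatch under a vertical (fixed) demand curve: generators are
sorted by marginal cost and switched on until demand is met, and the marginal price is
the cost of the last unit dispatched. This is an illustration only, not the Threephase
code. The &lt;code&gt;Generator&lt;/code&gt; fields, the &lt;code&gt;dispatch&lt;/code&gt; function and all of the numbers are
hypothetical.&lt;/p&gt;

```python
from dataclasses import dataclass


@dataclass
class Generator:
    name: str
    capacity_mw: float
    marginal_cost: float  # dollars per MWh (made-up values below)


def dispatch(generators, demand_mw):
    """Return ({name: output_mw}, marginal_price) for a fixed demand."""
    remaining = demand_mw
    levels = {}
    price = 0.0
    # Cheapest generators run first (the "merit order").
    for gen in sorted(generators, key=lambda g: g.marginal_cost):
        output = min(gen.capacity_mw, max(remaining, 0.0))
        levels[gen.name] = output
        if output > 0:
            # The last generator that actually runs sets the price.
            price = gen.marginal_cost
        remaining -= output
    if remaining > 1e-9:
        raise ValueError("demand exceeds total capacity")
    return levels, price


fleet = [
    Generator("coal", 500, 30.0),
    Generator("wind", 100, 5.0),
    Generator("gas", 300, 55.0),
]
levels, price = dispatch(fleet, 650)
```

&lt;p&gt;With this made-up fleet and 650 MW of demand, wind and coal run at full capacity,
gas supplies the last 50 MW, and the gas unit's marginal cost becomes the marginal
price.&lt;/p&gt;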


&lt;h2&gt;Background&lt;/h2&gt;

&lt;p&gt;Now more than ever, the consumption and generation of electricity are on the
minds of policy makers and concerned citizens alike. Green power, smart grids,
and the renewed popularity of nuclear energy seem like obvious solutions to
increasing efficiencies, so the lack of implementation momentum puzzles many
people outside the industry. The most (and nearly only) visible change in the
past decade to consumers is the shift from incandescent to CFL light bulbs -
hardly revolutionary.&lt;/p&gt;

&lt;p&gt;Since the widespread restructuring of the power system in the early 1970's, the
complexity of power economics has surpassed the understanding of most people,
including the politicians charged with deciding the future of the system itself.
The engineering problems are also non-intuitive to those without an electrical
engineering background. For example, despite the hype, wind power alone is not
the ultimate solution to the world's energy and environmental issues, but this
isn't communicated to or well understood by laypeople. There is an opportunity
for educating the public and increasing awareness of the tough realities of the
power system.&lt;/p&gt;

&lt;h3&gt;Computer &amp;amp; Video Games&lt;/h3&gt;

&lt;p&gt;The power system is frequently included in computer and video games, dating at
least back to Maxis' SimCity of 1989 (and the version that I played,
&lt;a href=&quot;www.gamegoldies.org/simcity-2000&quot;&gt;SimCity 2000&lt;/a&gt;). Electricity appears even
earlier in board games, where controlling the power &amp;amp; water utilities in
Monopoly garnered players a key advantage. More recently, the German board game
&lt;a href=&quot;http://www.riograndegames.com/games.html?id=5&quot;&gt;Power Grid&lt;/a&gt; used the power
system as its core game mechanic.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/simcity.png&quot; alt=&quot;SimCity 2000&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;www.gamegoldies.org/simcity-2000&quot;&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Electricity transmission surfaced as a game mechanic in recent computer games as
well, such as &lt;a href=&quot;http://www.introversion.co.uk&quot;&gt;Darwinia&lt;/a&gt; and the new massively
multiplayer game &lt;a href=&quot;http://www.quelsolaar.com/love&quot;&gt;Love&lt;/a&gt;. In both games,
protecting transmission lines from attack and malfunction is a key objective.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/darwinia.jpg&quot; alt=&quot;Darwinia&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.introversion.co.uk&quot;&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/love.png&quot; alt=&quot;Love&quot; /&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.quelsolaar.com/love&quot;&gt;Source&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;On the other end of the spectrum, the current power system teaching tool used at
Carnegie Mellon University,
&lt;a href=&quot;https://www.ece.cmu.edu/~nsf-education/software.html&quot;&gt;Gipsys&lt;/a&gt;, excels in the
technical but isn't approachable enough to engage those with a passing interest.
Since this project started, IBM released a web-based city planning serious game,
&lt;a href=&quot;www.ibm.com/cityone&quot;&gt;CityOne&lt;/a&gt;, which asks players to make public policy decisions to improve
efficiency in their virtual city. IBM's take on serious games is unfortunately
less of a challenging, immersive virtual world and more of a marketing tool.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/threephase/gipsys.png&quot; alt=&quot;Gipsys&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The frequent appearance of the electrical system in games is not a coincidence -
the concepts of generation and transmission fit well with strategy gameplay. The
games market is ripe for a serious game that combines popular fascination with
an idealized power system and the often troublesome state of engineering and
economics in the actual industry. This game could be used for both education and
casual enjoyment.&lt;/p&gt;

&lt;h3&gt;Concept&lt;/h3&gt;

&lt;p&gt;Threephase tries to fill the remaining gap, and balance between the artistic, the
playful and the technical. A new generation of gamers is being formed online, by
the likes of &lt;a href=&quot;http://www.zynga.com/&quot;&gt;Zynga's&lt;/a&gt; Farmville, Frontierville and Mafia
Wars. These gamers are comfortable with having a persistent, virtual world in
the games they play. They are accustomed to games lasting days or months, and
even those without a set endpoint. Unfortunately, few of these games challenge
players to learn or think creatively. They are a missed opportunity to show a
wide audience the positive effects of gaming firsthand.&lt;/p&gt;

&lt;p&gt;The goal of Threephase is to be approachable by a lowest common denominator of
people who understand technology, use the web and are willing to play a game (or
already do). Each player is handed control of a state-wide utility company and
tasked with generating enough power to meet customer demand. Each player
operates in a game world shared with other players, where the repercussions of
energy decisions in one state can be felt by many others.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Continue to the next section on
&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;game objects&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other Threephase Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-game-objects/&quot;&gt;Game Objects&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-mechanics/&quot;&gt;Game Mechanics&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Time&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-implementation/&quot;&gt;Implementation&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/threephase-time/&quot;&gt;Evaluation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>August 23, 1966</title>
   <link href="http://christopherpeplin.com/2011/05/august23/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/august23-1966</id>
   <content type="html">&lt;p&gt;How do you tell the universe's story as our story? Myth has been intertwining
the small and the large for a very long time, with varying degrees of success.
Our modern physical cosmology is another mythic story in the suite of human
cosmogonies. Though it holds a special place as one that is connected to
physical reality, it may be interpreted in many ways. By implementing mechanical
and digital interaction technologies, the interpretation can become
re-manifestation, where the observer tells a personal story of the cosmos.&lt;/p&gt;

&lt;p&gt;In 2009, I along with three teammates at the University of Michigan tried to
make sense of that question. We received a grant as a part of the
&lt;a href=&quot;http://www.dc.umich.edu/dmc/grocs/index.html&quot;&gt;GROCS&lt;/a&gt; program, which was started
to try and encourage inter-departmental collaboration unified by technology. The
team consisted of four people from three departments:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Myself, Chris Peplin (computer science)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://flavors.me/briannord&quot;&gt;Brian Nord&lt;/a&gt; (physics, specifically cosmology)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://gmbcg.blogspot.com/&quot;&gt;Jiangang Hao&lt;/a&gt; (also astrophysics)&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://www.johndanielwalters.com/&quot;&gt;John Walters&lt;/a&gt; (art &amp;amp; design, specifically sculpture)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;In November 2008, we &lt;a href=&quot;/files/august23/proposal.pdf&quot;&gt;proposed&lt;/a&gt; that over the next
semester we would:&lt;/p&gt;

&lt;p&gt;&quot;[...] explore the possibility of a collaborative universe creation
computer game written in Java and &lt;a href=&quot;http://processing.org/&quot;&gt;Processing&lt;/a&gt;. Based on
a true to life cosmic starting point, participants can manipulate galaxies and
change the laws of physics on the fly. The creator can allow players to catalog
and experiment with a shared universe (similar to the game, Spore ) while
exploring scientific laws and theories of creation, order, and design.&quot;&lt;/p&gt;

&lt;p&gt;We named the project &quot;August 23rd, 1966,&quot; after the date of the first photograph
of the Earth from the far side of the moon by
&lt;a href=&quot;http://en.wikipedia.org/wiki/Lunar_Orbiter_1&quot;&gt;Lunar Orbiter 1&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/august_logo_transp.png&quot; alt=&quot;Logo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Our proposal was accepted, and from January to May 2009, we focused our ideas
into a three day interactive gallery installation. Visitors could configure and
create their own personal star, then witness its birth, life and death in a
virtual universe alongside others'.&lt;/p&gt;

&lt;p&gt;This writeup is split into a few separate posts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Gallery Installation&lt;/h2&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/august23/gallery_installation.png&quot; alt=&quot;Gallery&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The gallery experience went something like this:&lt;/p&gt;

&lt;p&gt;A visitor enters and sees a desktop computer, a
&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;multi-touch table&lt;/a&gt; and a big black tent. They
approach the multi-touch table to view a galaxy of stars. They manipulate the
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/MultitouchClient/MultitouchInterface.pde&quot;&gt;viewpoint&lt;/a&gt;
with their hands, and select existing stars to see details of their makeup and
date of birth.&lt;/p&gt;

&lt;p&gt;The user then selects a location for their own, personal addition to the
universe: a new star.&lt;/p&gt;

&lt;p&gt;A webcam connected to the multi-touch table
&lt;a href=&quot;https://github.com/peplin/august23/blob/master/src/gallery/ActiveColorBackground/ActiveColorGrabber.pde&quot;&gt;automatically configures&lt;/a&gt;
the temperature (and thus color) of the new star based on the color of the
user's clothing. This information is fed over the network to a computer powering
a &lt;a href=&quot;/2011/05/august23-wiremap&quot;&gt;Wiremap&lt;/a&gt;, located inside the black tent.&lt;/p&gt;

&lt;p&gt;The user dons a radio headset and space helmet (a modified face shield) and
steps into an antechamber at the edge of the tent - a simulated airlock. A sound
dome suspended from the ceiling fills the room with
&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;ambient industrial noise&lt;/a&gt;, much like the loud life
support systems on the space shuttle.
&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Verbal instructions&lt;/a&gt; are piped to the user from the
multi-touch table's computer over the radio headset, and a second visitor in the
gallery can act as &quot;mission controller&quot; and communicate via another headset.
Much like NASA's controllers in Houston, TX, they do not have the same visuals
as the person in the tent.&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;voice&lt;/a&gt; instructs the user to place their finger on
a sensor at the edge of a &lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;small black box&lt;/a&gt;
in the airlock, lit by a single red bulb, in order to check their life signs.
(The box is a &lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;heartbeat detector&lt;/a&gt; and its
reading is used to set the oscillation frequency of the new star.)&lt;/p&gt;

&lt;p&gt;These events are synchronized with the Wiremap, so when the visitor is
informed that the airlock door is unlocked and they enter the darkened tent
(simulating the isolation of space), the birth of their star begins in front of
their eyes. A new voice on the headset occasionally describes the unfolding
sights, and the visitor is free to walk around and examine a kinetic three
dimensional representation of their star's creation (a light field suspended on
a grid of strings).&lt;/p&gt;

&lt;p&gt;After a few minutes, the process is complete and the user is directed to leave
the tent and return their helmet. Their star is now a part of a universe shared
by all of the other visitors to the gallery. A second computer near the
multi-touch table offers a view of the entire universe via a web browser, with
instructions on how to keep tabs on stars from home.&lt;/p&gt;

&lt;p&gt;This &lt;a href=&quot;http://vimeo.com/5368587&quot;&gt;video&lt;/a&gt; gives a sense of what the final gallery
installation looked like, and hopefully a sense of the experience.&lt;/p&gt;

&lt;iframe
    src=&quot;http://player.vimeo.com/video/5368587?title=0&amp;amp;byline=0&amp;amp;portrait=0&quot;
    width=&quot;600&quot; height=&quot;405&quot; frameborder=&quot;0&quot;&gt;&lt;/iframe&gt;


&lt;p&gt;&lt;a
    href=&quot;http://vimeo.com/5368587&quot;&gt;August 23, 1966&lt;/a&gt; from &lt;a
    href=&quot;http://vimeo.com/user1934112&quot;&gt;Christopher Peplin&lt;/a&gt; on &lt;a
    href=&quot;http://vimeo.com&quot;&gt;Vimeo&lt;/a&gt;.&lt;/p&gt;


&lt;h2&gt;Abstract&lt;/h2&gt;

&lt;p&gt;August's installation can be considered a play off of the
&lt;a href=&quot;http://en.wikipedia.org/wiki/Many-worlds_interpretation&quot;&gt;many-worlds&lt;/a&gt; theory -
alongside our world, with its crumbling economies and warring nations, there
exists a digital universe that is directed by you, the user. Just like in our
known universe, many parameters are outside of your control. The interesting
part is choosing what you can, timing as you may, and watching the results.&lt;/p&gt;

&lt;p&gt;Our intention was to create a multiplayer environment, where visitors to the
gallery and online users from elsewhere are interacting in the same virtual
universe - we called it &lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;. Users would
begin their experience in the gallery with our unique hardware and
visualizations, then continue to monitor the objects they create in a web
interface from home.&lt;/p&gt;

&lt;p&gt;From a game design perspective, we took into account the potential for
replayability of what we were creating. We reasoned there were a few factors
that would hopefully give users reason to return to the installation or visit
the website:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The excitement of putting your name on a cluster of planets or on a convoy of
  space vessels destined for Andromeda.&lt;/li&gt;
&lt;li&gt;The parent-like ownership you may feel watching your star grow up, and the
  pains in your heart when it dies.&lt;/li&gt;
&lt;li&gt;Something to check on for just a few minutes each day, and potentially more
  intellectually rewarding than fantasy football scores.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Objects in the Universe&lt;/h2&gt;

&lt;p&gt;We envisioned the universe would start with some phenomena and astral bodies,
but the rest was up to the users. There are two different categories of objects,
each with a set of parameters configurable by the user: man-made bodies and
celestial bodies. These objects span from human scale (single science satellites or cities) up to
unfathomably large (galaxy clusters).&lt;/p&gt;

&lt;p&gt;Some of the parameters envisioned for satellites include the intended research
purpose, nationality, speed, orbit and destination. For planets, the parameters
could include size, density, composition, orbit and population.&lt;/p&gt;

&lt;p&gt;The current version implements a single object: stars, with configurable size
and temperature.&lt;/p&gt;

&lt;h2&gt;Documentation&lt;/h2&gt;

&lt;p&gt;I've posted more detailed descriptions and documentation for each of the core
components of the installation in separate articles.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Reflection&lt;/h2&gt;

&lt;p&gt;The most important goal to me, and one we did accomplish, was to complete a
vertical slice of the entire Twoverse system. Our gallery installation and the
hardware and software that supported it touched an amazing number of topics, and
I know everyone on the team learned a great deal about their own and one
another's fields in the process. During this project, I expanded my knowledge
with hands on experience in:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;XML-RPC&lt;/li&gt;
&lt;li&gt;Java Servlets&lt;/li&gt;
&lt;li&gt;Multi-threading and databases in Java&lt;/li&gt;
&lt;li&gt;Audio in the Processing environment&lt;/li&gt;
&lt;li&gt;Library development for Processing&lt;/li&gt;
&lt;li&gt;Multi-touch input processing&lt;/li&gt;
&lt;li&gt;Graphics performance optimization&lt;/li&gt;
&lt;li&gt;Analog signal processing (filters, opamps)&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;Future Work&lt;/h2&gt;

&lt;p&gt;There were a few items left incomplete that I think are interesting enough to
mention as possibilities for future work.&lt;/p&gt;

&lt;h3&gt;Real-time Data Integration&lt;/h3&gt;

&lt;p&gt;Far from an isolated simulation, this parallel virtual universe could have
crossover with our current space (and in real time!). Imagine the Sun in this
virtual universe mirroring the solar flares of our own.&lt;/p&gt;

&lt;p&gt;The proliferation of real-time data APIs over the Internet presents an
opportunity for integrating this information into the game. In another example,
the user interface could become blurred during periods of high proton wind, or
space vehicles and satellites could lose communication and spin out of control.&lt;/p&gt;

&lt;p&gt;Twoverse could also bring in factors here that have nothing to do with astronomy
in an attempt to relate more directly back to the human scale - e.g. the amount
of activity in the gallery could determine properties of some special galaxy
cluster, or the people in the room could automatically have an asteroid
generated for them.&lt;/p&gt;

&lt;h3&gt;Time Scales&lt;/h3&gt;

&lt;p&gt;Time is a big issue with this game, one not sufficiently explored.&lt;/p&gt;

&lt;p&gt;The progression of events must be slow enough that you can make some decisions
one day, but then you have to wait 24 hours or more for big enough changes to
happen. The speed must also be fast enough that we can demonstrate phenomena in
a reasonable time span.&lt;/p&gt;

&lt;p&gt;Not all of the objects need obey the same time scale, since the goal of Twoverse
isn't to be absolutely accurate. The time scale for stars could be greatly
accelerated so we can ultimately see them collapse. For space travel, maybe it
takes 3 hours instead of 3 days to go from Earth to the moon.&lt;/p&gt;
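
&lt;p&gt;One way to picture per-object time scales is a table of acceleration factors, one
per object type. The sketch below is only an illustration: the factor of 24 follows
from the Earth-to-moon example above (3 days compressed into 3 hours), while the star
factor and all of the names are hypothetical and not part of Twoverse.&lt;/p&gt;

```python
# Hypothetical acceleration factors: simulated hours per real hour.
TIME_SCALE = {
    "spacecraft": 24.0,  # 3 simulated days pass in 3 real hours
    "star": 1.0e12,      # made-up value so a stellar lifetime fits in days
}


def real_hours(object_type, simulated_hours):
    """Real (wall clock) hours needed for an event of the given simulated length."""
    return simulated_hours / TIME_SCALE[object_type]


# A 3 day trip to the moon, expressed in simulated hours.
trip_hours = real_hours("spacecraft", 3 * 24)
```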

&lt;p&gt;&lt;em&gt;Continue to the next section, details of the software system
&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;Other August Articles&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-twoverse/&quot;&gt;Twoverse&lt;/a&gt;, the core software backend&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-multitouch/&quot;&gt;Multi-touch Table&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-wiremap/&quot;&gt;Wiremap&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-pulse-oximeter/&quot;&gt;Pulse Oximeter (Heartbeat Monitor)&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;/2011/05/august23-audio/&quot;&gt;Audio Recordings&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Astral - Efficiently Distributing Live Video</title>
   <link href="http://christopherpeplin.com/2011/05/astral/"/>
   <updated>2011-05-27T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/astral</id>
   <content type="html">&lt;p&gt;&lt;a href=&quot;https://github.com/peplin/astral&quot;&gt;Astral&lt;/a&gt; is a peer-to-peer content
distribution network specifically built for live, streaming media. Without IP
multicast, if content producers want to stream live events to consumers, they
are forced to create separate feeds for each user. A peer-to-peer approach is
more efficient and offloads much of the work from the origin servers to the
edges of the network.&lt;/p&gt;

&lt;p&gt;This project was completed over the course of the Spring 2011 semester in the
&lt;a href=&quot;http://www.ece.cmu.edu/~ece842/S11/&quot;&gt;Distributed Systems (18-842)&lt;/a&gt; course at
Carnegie Mellon University, taught by Professor Bill Nace. The Astral team
included myself, &lt;a href=&quot;http://fabianpopa.com/&quot;&gt;Fabian Popa&lt;/a&gt;, Darshana Advani and
Anusha Varshney.&lt;/p&gt;

&lt;p&gt;This post is a modified version of the final report the team wrote
collaboratively and submitted with the project. It has been reformatted for the
web, and rewritten/expanded in certain places.&lt;/p&gt;

&lt;h2&gt;Concept&lt;/h2&gt;

&lt;p&gt;Live events are an extremely popular motivation for streaming video. During
President Obama's 2009 inauguration, CNN served a record &lt;a href=&quot;http://mashable.com/2009/01/20/cnn-facebook-inauguration-numbers/&quot;&gt;21.3 million live video streams&lt;/a&gt;,
with as many as 1.3 million concurrent connections at the peak.&lt;/p&gt;

&lt;p&gt;Looking at the way those streams were provided (i.e. in-browser) and considering
related press releases, we can infer that in order to serve such massive amounts
of data, providers invariably need to partner with global content distribution
networks (CDNs) like Akamai and Limelight. That's potentially an expensive
proposition for popular events, although less so than we thought at the start of
the project. Amazon recently added live streaming support to their
&lt;a href=&quot;http://aws.amazon.com/cloudfront/&quot;&gt;CloudFront&lt;/a&gt; CDN at quite reasonable prices,
making the added complexity and reliability issues of a peer-to-peer solution
somewhat suspect.&lt;/p&gt;

&lt;p&gt;The typical centralized streaming architecture depends on a network of
geographically distributed edge servers to which the clients connect. Since
&lt;a href=&quot;http://en.wikipedia.org/wiki/Multicast#IP_multicast&quot;&gt;IP multicast&lt;/a&gt; isn't
feasible over a WAN (due to the many different router configurations between
the source and destination), the stream is duplicated once per end user.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/astral/centralized.png&quot; alt=&quot;Centralized&quot; /&gt;&lt;/p&gt;

&lt;p&gt;In contrast, a peer-to-peer approach could take the producer's stream and seed
it into the network at different (potentially geographically dispersed) points.
The stream's consumers could then participate in uploading the data they're
viewing to others, using a fairly determined portion of their available
bandwidth. A real-world distribution would look much like a telephone fan-out
system for school and office closings.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/astral/idea.png&quot; alt=&quot;Idea&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Design &amp;amp; Architecture&lt;/h2&gt;

&lt;p&gt;Astral is based on a flexible peer-to-peer client application with a few
different user interfaces. The two major components are the centralized web
application (&lt;a href=&quot;https://github.com/peplin/astral-web&quot;&gt;astral-web&lt;/a&gt;) run by the network operator (implemented in Ruby, with
Sinatra) and a Python process running on each individual node.&lt;/p&gt;

&lt;p&gt;The nodes each run a small embedded web server and expose a simple (currently
unauthenticated) REST API. The API is used for three things:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Inter-node communication&lt;/li&gt;
&lt;li&gt;Node to astral-web communication (the Sinatra app implements a
  subset of the same API)&lt;/li&gt;
&lt;li&gt;Communicating with the user interface in the browser (AJAX requests)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;A Flash applet in the browser accesses video devices, encodes and finally sends
the video into the network using the Real-Time Messaging Protocol. The actual
video stream is distributed out-of-band via a chain of TCP tunnels, which
connect all of the stream consumers to a
&lt;a href=&quot;http://en.wikipedia.org/wiki/Real_Time_Messaging_Protocol&quot;&gt;Real Time Messaging Protocol&lt;/a&gt; (RTMP)
server at the source, without ever opening a direct link.&lt;/p&gt;

&lt;p&gt;Depending on its configuration, a Python client can act as a stream source,
seeder, or consumer (or a combination, bandwidth permitting). The source streams
from an external video device through a Flash applet in the browser, which
forwards the media stream to the local node and on to the overlay network. The
bandwidth limitations of common household Internet connections limit the number
of forwarded streams per node to one or two, creating an interesting network
graph of consumer chains.&lt;/p&gt;
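
&lt;p&gt;A bit of arithmetic shows why the per-node forwarding limit matters so much. The
sketch below assumes an idealized network in which a single source and every consumer
forward to exactly &lt;code&gt;fanout&lt;/code&gt; other nodes. Real Astral networks will not be this regular, and
the function names are made up for the example.&lt;/p&gt;

```python
def max_consumers(fanout, depth):
    """Consumers reachable within `depth` hops of a single source."""
    return sum(fanout ** hop for hop in range(1, depth + 1))


def hops_needed(consumers, fanout):
    """Smallest chain depth that can serve the given number of consumers."""
    depth = 0
    while consumers > max_consumers(fanout, depth):
        depth += 1
    return depth


chain_depth = hops_needed(1000, 1)  # every node forwards a single stream
tree_depth = hops_needed(1000, 2)   # every node forwards two streams
```

&lt;p&gt;Under these assumptions, 1,000 viewers form a chain 1,000 hops long when each node
forwards one stream, but fit within 9 hops when each node can forward two. Every extra
hop adds latency and another point of failure.&lt;/p&gt;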

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/astral/architecture.png&quot; alt=&quot;Architecture&quot; /&gt;&lt;/p&gt;

&lt;p&gt;After installing the Astral client, the user visits the URL of astral-web in a
browser. This displays a list of all available streams. Each stream has a
preview screenshot (not implemented) and metadata, provided by the producer and
the source node. When a user selects a stream to watch, the browser communicates
their selection to the background process via JavaScript with HTTP requests.&lt;/p&gt;

&lt;p&gt;Once the stream is forwarded to the client by at least one other node on the
network, the user can view it directly in the browser or in any other streaming
media player (by clicking a stream link embedded in the web page).&lt;/p&gt;

&lt;h3&gt;Node Communication&lt;/h3&gt;

&lt;p&gt;The Astral client is designed with flexibility in mind. A node can be a
content producer, a consumer or a seeder. These three types of nodes make up the
clients of the overlay network. The network loosely follows the supernode
organization of Kazaa and &lt;a href=&quot;http://saikat.guha.cc/pub/iptps06-skype/&quot;&gt;Skype&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A node announces its presence in the network (and thus its candidacy for stream
forwarding) by sending an HTTP &lt;code&gt;POST&lt;/code&gt; request to a supernode with itself as the
data. When a user requests to watch a stream, the node propagates its interest
in this stream through to its neighbor nodes until a node is found that is
capable and willing to forward the content (this is somewhat controlled
flooding, and the strategy could likely be improved).&lt;/p&gt;

&lt;p&gt;When a node leaves the network, it performs a few critical shutdown steps to
give other nodes ample opportunity to adjust their stream source or target; it
sends HTTP &lt;code&gt;DELETE&lt;/code&gt; requests to any nodes to which it is forwarding a stream,
any child nodes, and (if it has one) its primary supernode. This keeps data as
consistent as possible in the network without the overhead of excessive
heartbeat messages.&lt;/p&gt;
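
&lt;p&gt;The join and leave messages described above can be sketched as plain HTTP
requests. This is only an illustration: the &lt;code&gt;/nodes&lt;/code&gt; path, the
payload fields and the host names are assumptions, not Astral's actual API. The
requests are built but never sent.&lt;/p&gt;

```python
import json
import urllib.request

# Hypothetical paths and payload fields; the real Astral API may differ.
def build_announce(supernode_url, node):
    """Build (but do not send) the POST that registers a node."""
    body = json.dumps(node).encode("utf-8")
    return urllib.request.Request(
        supernode_url + "/nodes",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

def build_departures(node_uuid, peer_urls):
    """Build one DELETE per peer that should forget this node."""
    return [
        urllib.request.Request(url + "/nodes/" + node_uuid, method="DELETE")
        for url in peer_urls
    ]

node = {"uuid": "node-a", "ip_address": "10.0.0.5", "port": 8000}
announce = build_announce("http://supernode.example:8000", node)
departures = build_departures("node-a", [
    "http://child.example:8000",
    "http://supernode.example:8000",
])
print(announce.get_method(), announce.full_url)
for request in departures:
    print(request.get_method(), request.full_url)
```

&lt;p&gt;On shutdown, the peer list would hold the forwarding targets, the child
nodes and the primary supernode, in whatever order the node chooses to notify
them.&lt;/p&gt;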

&lt;h3&gt;Communication Design Changes&lt;/h3&gt;

&lt;p&gt;The original design of Astral planned to use &lt;a href=&quot;http://www.zeromq.org/&quot;&gt;ZeroMQ&lt;/a&gt;
for inter-node communication. ZeroMQ is a message-oriented library that sits on
top of TCP sockets to provide very fast messaging between threads, applications
and networked machines. Astral requires occasional messaging between peers, and
ZeroMQ would have been a good fit.&lt;/p&gt;

&lt;p&gt;In the process of implementing the message handling code, however, we realized
that much of the logic for routing messages is already implemented in widely
available web frameworks. Web services that use the
&lt;a href=&quot;http://en.wikipedia.org/wiki/Representational_State_Transfer&quot;&gt;Representational State Transfer&lt;/a&gt;
(ReST) style are also a natural fit for the type of messages that Astral nodes
exchange - e.g. creating and deleting nodes, streams and stream forward
requests.&lt;/p&gt;

&lt;p&gt;With this insight, we replaced the messaging core with an embedded web server
(specifically Facebook's &lt;a href=&quot;http://www.tornadoweb.org/&quot;&gt;Tornado&lt;/a&gt;). Each node
starts an instance of this server listening on port 8000 at startup, and exposes
a simple ReSTful API that accepts and returns data in the JSON format. An
additional advantage of this approach is that it enabled Astral to use simple
HTTP requests in JavaScript to communicate with the node from a web browser.&lt;/p&gt;

&lt;p&gt;(The choice of Tornado was one of familiarity, but its event-driven nature
actually clashes with a few of the Astral API calls. At the moment, it's
possible to deadlock two nodes with a specific series of requests. Tornado
should be swapped out for a threaded web server.)&lt;/p&gt;

&lt;h3&gt;Video Streaming&lt;/h3&gt;

&lt;p&gt;We originally looked at the &lt;a href=&quot;http://gstreamer.freedesktop.org/&quot;&gt;gstreamer&lt;/a&gt; and
&lt;a href=&quot;http://www.videolan.org/vlc/&quot;&gt;VLC&lt;/a&gt; multimedia libraries, mainly to take
advantage of their excellent codec support and stream packaging functionality.
Although we were able to get example streams open with
&lt;a href=&quot;https://github.com/peplin/astral/blob/master/examples/gstreamer/video_pipeline.py&quot;&gt;both&lt;/a&gt;
&lt;a href=&quot;https://github.com/peplin/astral/tree/master/examples/vlc&quot;&gt;libraries&lt;/a&gt; in Linux,
it turned out that accessing video devices was not as well supported in OS X
(the preferred platform of one of the developers).&lt;/p&gt;

&lt;p&gt;Adobe Flash provides better cross-platform device APIs and includes native
support for RTMP - these two things made the choice clear for a
proof-of-concept. RTMP is an application-layer protocol over TCP with built-in
reliability mechanisms (tolerance of lost packets and dynamic packet sizing,
negotiated with the server). The stream is encoded with
&lt;a href=&quot;http://en.wikipedia.org/wiki/VP6&quot;&gt;VP6&lt;/a&gt; video and MP3 audio.&lt;/p&gt;

&lt;p&gt;The choice influenced the core design of Astral. We expected to be performing
explicit chunking of the stream before sending it off to peers. Since RTMP
already provides segmentation and packaging (including meta-information in the
header), we were able to benefit from reliability and picture adjustment right
out of the box. However, it is arguable that additional reliability guarantees
could be achieved in the future by packaging frames directly (e.g. two-stream
input on separate paths for immediate failover).&lt;/p&gt;

&lt;p&gt;On top of this foundation, we integrated a lightweight Python RTMP server
(&lt;a href=&quot;https://github.com/peplin/astral/blob/master/astral/rtmp/rtmp.py&quot;&gt;rtmplite&lt;/a&gt;)
that acts as a distribution hub for all streams published by the node. All
consumers of that stream connect to the corresponding publisher's RTMP server
through a dynamically constructed chain of TCP tunnels hosted by network
participants.&lt;/p&gt;

&lt;p&gt;(This &lt;em&gt;may&lt;/em&gt; be a serious bottleneck, considering that all clients still need to
connect to a single streaming server. Our understanding of the RTMP server isn't
complete enough to say how the bandwidth is conserved in this situation. In the
future, a more native streaming solution is probably the best course of action,
one that allows more precise control over the video frames.)&lt;/p&gt;

&lt;p&gt;In a way, we could look at it as pseudo circuit switching, where you share the
connection through the system with others. It is guaranteed that the consumer
will get the stream they're viewing over that same connection until a node on
the chain of TCP tunnels fails. At that point, Astral immediately tries to
reestablish the connection by replacing the missing link or potentially
constructing a new path.&lt;/p&gt;

&lt;h3&gt;Reliability&lt;/h3&gt;

&lt;p&gt;Reliability is of greater concern in a peer-to-peer distribution network. If any
point in the chain fails, the consumer loses their real-time stream. There are a
few ways around this:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Buffer enough of the stream so that the stream can continue playing while a
  new source is located.&lt;/li&gt;
&lt;li&gt;Always open two connections to the stream, for immediate failover.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Neither is implemented in the current version.&lt;/p&gt;

&lt;h3&gt;Source Stream Uploading&lt;/h3&gt;

&lt;p&gt;Astral currently implements source streaming from the browser only. The original
design allowed producers to direct any existing streaming device or client at a
local TCP socket, but a switch to using the Adobe Flash-based protocol RTMP made
this more challenging. Our target external device, VLC, does not currently
support sending a video stream to an RTMP server. As planned, the streaming
interface is extremely simple; it is very similar to hitting play on YouTube. The
Flash applet also natively supports streaming from any attached device, be it a
USB webcam or FireWire HD camera.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/astral/source-view.png&quot; alt=&quot;Stream Uploading&quot; /&gt;&lt;/p&gt;

&lt;h3&gt;User Interface for Selecting Stream&lt;/h3&gt;

&lt;p&gt;The user interface for Astral proceeded exactly as planned. It is deployed to a
central location, and accessed via a traditional web browser by all clients.
After selecting a stream, the user can view the video embedded in the page via a
Flash consumer applet. The page also displays the RTMP server's URL, so users
can connect with another streaming client if they so choose.&lt;/p&gt;

&lt;h3&gt;Stream Seeding&lt;/h3&gt;

&lt;p&gt;Nodes in the overlay network can volunteer to seed a specific stream in order to
increase its availability. This requires no special logic - the only difference
between a seeding node and a regular consumer is that the seeder does not
connect to the stream with a Flash consumer.&lt;/p&gt;

&lt;h3&gt;Peer-to-Peer Overlay Network Communication&lt;/h3&gt;

&lt;p&gt;Astral clients bootstrap themselves with knowledge of the overlay network obtained
via static configuration files, the origin web server, and finally, their primary
supernode. When a node joins the network, it requests a partial list of
supernodes from the origin web application. It selects the closest supernode
from this list (based on ping round-trip time) and attempts to register with it.
If the supernode is already at capacity (currently a hard-coded limit of 100
children), the node continues down the sorted list of supernodes until one
accepts it.&lt;/p&gt;

&lt;p&gt;If no supernodes are available or none have capacity, a node promotes itself to
supernode status, extending the capacity of the network automatically.&lt;/p&gt;
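
&lt;p&gt;The selection logic above can be sketched in a few lines. Only the limit of
100 children and the round-trip-time ordering come from the description; the
dictionary keys and function names are hypothetical. The sketch also simplifies
registration by assuming each supernode's child count is already known, whereas
a real node would attempt to register and wait for an answer.&lt;/p&gt;

```python
# Hypothetical data shapes; only the 100-child limit and the RTT ordering
# come from the description above.
MAX_CHILDREN = 100

def choose_supernode(candidates):
    """Return the closest supernode with spare capacity, or None.

    candidates: list of dicts with "url", "rtt_ms" and "children" keys.
    """
    for candidate in sorted(candidates, key=lambda c: c["rtt_ms"]):
        if MAX_CHILDREN > candidate["children"]:
            return candidate
    return None

def join(candidates):
    """Pick a parent supernode, or promote this node if none can take it."""
    chosen = choose_supernode(candidates)
    if chosen is None:
        return {"role": "supernode", "parent": None}
    return {"role": "node", "parent": chosen["url"]}

supernodes = [
    {"url": "http://far.example:8000", "rtt_ms": 80.0, "children": 12},
    {"url": "http://near.example:8000", "rtt_ms": 15.0, "children": 100},
    {"url": "http://mid.example:8000", "rtt_ms": 40.0, "children": 99},
]
print(join(supernodes))
print(join([]))
```

&lt;p&gt;Here the nearest supernode is full, so the node falls through to the next
closest one; with an empty list it promotes itself.&lt;/p&gt;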

&lt;h3&gt;Simulation of Overlay Network&lt;/h3&gt;

&lt;p&gt;Astral has not been tested in a large-scale, simulated environment as was
planned. The system has been tested with a 4-node setup, but needs to be scaled
up to flush out issues with the overlay network protocol. A potentially good way
to test the system is to load a node with many fake node details via the HTTP
API.&lt;/p&gt;

&lt;h3&gt;Command Line Interface&lt;/h3&gt;

&lt;p&gt;Astral includes a command line tool for interacting with the local node. This is
useful for programmatic integration testing. It controls the node by sending
HTTP requests to the same ReST API used by other components. The CLI can display
and update stream and node metadata, create new tickets and new streams, and
perform most other node actions.&lt;/p&gt;

&lt;h2&gt;Challenges &amp;amp; Interesting Bits&lt;/h2&gt;

&lt;h3&gt;Streaming Video&lt;/h3&gt;

&lt;p&gt;Video streaming is obviously a critical component of the Astral system. We
needed to confirm our initial assumptions regarding native multimedia libraries
(VLC, gstreamer) much earlier. Native and cross-platform video device access is
possible (other applications are able to do it), but we found ourselves in a
time crunch by the time we came to testing the capabilities. These turned out
to be non-trivial to operate on both Mac OS X and Linux (the operating systems
used by our developers). Using Flash and RTMP for video was a good alternative,
but compromised the original architecture vision and has its own drawbacks.&lt;/p&gt;

&lt;h3&gt;Same Origin Policy&lt;/h3&gt;

&lt;p&gt;The &lt;a href=&quot;http://en.wikipedia.org/wiki/Same_origin_policy&quot;&gt;same origin policy&lt;/a&gt; was
the biggest impediment to implementing the entire UI in the browser.&lt;/p&gt;

&lt;p&gt;The switch to a simpler HTTP API for node communication enabled the browser to
communicate directly with a locally running node using simple JavaScript. The
original plan called for a browser extension, but the switch made this
unnecessary.&lt;/p&gt;

&lt;p&gt;However, browsers enforce something called the same origin policy, which
constrains JavaScript requests to the same domain from which the script was
retrieved (it's a good thing in general for security). The Astral interface's
JavaScript is retrieved from the origin web server (e.g.
&lt;code&gt;http://astral-video.heroku.com&lt;/code&gt;), but needs to connect to
&lt;code&gt;http://localhost:8000&lt;/code&gt; - a clear violation of the same-origin policy.&lt;/p&gt;

&lt;p&gt;Fortunately, we were able to work around this restriction by manipulating URL
query parameters and with clever interpretation of the queries on the server
side. For extended user interaction, a browser extension may be necessary -
these have the advantage of being able to query any domain.&lt;/p&gt;

&lt;h3&gt;Visualization&lt;/h3&gt;

&lt;p&gt;We also implemented a JavaScript
&lt;a href=&quot;https://github.com/peplin/astral-web/blob/master/public/js/visualization.js&quot;&gt;visualization&lt;/a&gt;
of nodes, streams and forwarded streams to get a better sense of the state of
the system. The visualization opens a
&lt;a href=&quot;http://en.wikipedia.org/wiki/WebSockets&quot;&gt;WebSocket&lt;/a&gt; (a new web technology in
modern browsers for persistent, two-way communication between browser and
server) with the server, and the server pushes any updates to its knowledge of
the network through this connection. The nodes, streams and tickets are drawn on
the screen using the &lt;a href=&quot;http://vis.stanford.edu/protovis/&quot;&gt;Protovis&lt;/a&gt; graphics
library.&lt;/p&gt;

&lt;p&gt;Interestingly, the visualizations on each node are not identical due to their
different viewpoint of the network, and thus incomplete information.
Purposefully, none of the nodes has complete global knowledge.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;http://things.rhubarbtech.com/images/astral/visualization.png&quot; alt=&quot;Visualization&quot; /&gt;&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The project proved that distributed distribution is possible, but we found that
the costs of centralized distribution aren't as high as we initially predicted
(which explains its continued popularity). The additional complexity that comes
with distributing via a peer-to-peer network may not be worth the trouble,
especially considering the extra challenge of collecting accurate statistics
about stream consumers. In the industry, the video distribution company
&lt;a href=&quot;http://www.joost.com/&quot;&gt;Joost&lt;/a&gt;
&lt;a href=&quot;http://techcrunch.com/2008/12/17/joost-just-gives-up-on-p2p/&quot;&gt;dropped support&lt;/a&gt;
for their peer-to-peer distribution system in favor of a more traditional,
centralized approach with Flash in 2008.&lt;/p&gt;

&lt;p&gt;That said, Astral proved to be both a challenging and rewarding proof-of-concept
project to learn the complexities of building a peer-to-peer network, as well as
the state of the art in video streaming.&lt;/p&gt;

&lt;h2&gt;Source Code&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/astral&quot;&gt;Python client&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;https://github.com/peplin/astral-web&quot;&gt;Sinatra web application&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://astral.rhubarbtech.com/&quot;&gt;Deployed version&lt;/a&gt; of web application&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Parallelizing the FLAC Encoder</title>
   <link href="http://christopherpeplin.com/2011/05/pflac/"/>
   <updated>2011-05-26T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/pflac</id>
   <content type="html">&lt;p&gt;A couple of years ago, a friend (Max Miller) and I worked on a project to
parallelize the FLAC audio encoder. We modified the official encoder, and while
it never made it into the official release, I think some of the ideas are still
interesting. I've included the original report here. You can also jump
&lt;a href=&quot;https://github.com/peplin/pflac&quot;&gt;straight to the code&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Our project began with the intention to parallelize the FLAC encoder and create
a multi-threaded transcoder (FLAC to MP3) that operated at the audio block
level. Upon acquiring the FLAC source code (which includes the C and C++
libraries, flac command line encoder/decoder and some example programs), we
realized the need to scale back our expectations of the project. The FLAC API is
an extensive, robust library which was built from the ground up, unfortunately,
as a serial application.&lt;/p&gt;

&lt;p&gt;We spent the first two weeks attempting to understand the flow of the basic flac
encoder when dealing with a typical audio file (2 channel, 16-bit WAV). After
that, we identified possible points of parallelism at two levels of the encoding
process: the high level (the frontend, above the library), and low level (within
the library).&lt;/p&gt;

&lt;p&gt;This report summarizes the serial algorithm, its potential for parallelism, and
our issues with the library design in implementation. Ultimately, we were
successful at parallelizing the algorithm at a high level using a pipeline from
Intel's Threading Building Blocks (TBB). This project has definite
implications for the future of the FLAC project, but its integration depends on
the evolution of the flac program (which is currently written in C, and thus
incompatible with TBB) and the cooperation of the project owner (Josh Coalson)
and community.&lt;/p&gt;

&lt;h2&gt;Serial Problem&lt;/h2&gt;

&lt;p&gt;FLAC is a lossless audio codec that converts WAV files to compressed FLAC files
that are seekable and streamable. A FLAC file contains a stream header followed
by a series of audio frames which hold a small piece of the encoded audio along
with enough information to decode that frame. The encoding algorithm can be
described at a high level in this way:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Read WAV file header and confirm its parameters are compatible with this encoder&lt;/li&gt;
&lt;li&gt;Open the output file stream&lt;/li&gt;
&lt;li&gt;Write stream header, including metadata such as artist/album information and
  an MD5 checksum&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Finally, do the encoding:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;cpp&quot;&gt;&lt;span class=&quot;k&quot;&gt;while&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blocksLeft&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;readBlock&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;encodeBlockToFlac&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;writeFrameToOutputStream&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;();&lt;/span&gt;
&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;p&gt;Each block of audio is processed in 3 stages.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inter-channel Decorrelation&lt;/strong&gt; The encoder decides whether or not to split
 the audio into channels (left &amp;amp; right), and if so, by which method. One option is a
 simple left/right split, and another (which often garners significant extra
 compression) is mid and side channels.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Modeling&lt;/strong&gt; The audio signal is approximately modeled by a function in one
 of three ways. The function should ideally represent the original audio with
 fewer bits.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Constant signal frames fit to a simple, constant function with only one argument&lt;/li&gt;
&lt;li&gt;Extremely busy or random frames are modeled verbatim, as themselves&lt;/li&gt;
&lt;li&gt;All other frames are modeled using general linear predictive coding (LPC)&lt;/li&gt;
&lt;/ol&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Residual Coding&lt;/strong&gt; Once modeled, the model is subtracted from the original
 audio to produce the residual, or error. This stream is then encoded using Rice
 codes (a special set of Huffman codes) and run length encoding.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
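
&lt;p&gt;To make the first stage concrete, here is a small sketch of a lossless
mid/side transform, written in Python for brevity. It is an illustration of the
idea rather than libFLAC's code, and the function names are made up. The mid
channel drops one bit when the sum is halved; the decoder recovers it from the
parity of the side channel, so the round trip is exact.&lt;/p&gt;

```python
def to_mid_side(left, right):
    """Convert left/right samples to mid (average) and side (difference)."""
    mid = [(l + r) // 2 for l, r in zip(left, right)]
    side = [l - r for l, r in zip(left, right)]
    return mid, side

def from_mid_side(mid, side):
    """Invert to_mid_side exactly, using the parity of side."""
    left, right = [], []
    for m, s in zip(mid, side):
        total = 2 * m + (s % 2)  # restore the bit lost by the halving
        left.append((total + s) // 2)
        right.append((total - s) // 2)
    return left, right

left = [1000, 1002, 998, -3]
right = [999, 1001, 997, 4]
mid, side = to_mid_side(left, right)
print(mid, side)
print(from_mid_side(mid, side) == (left, right))
```

&lt;p&gt;When the two channels are strongly correlated, as in the first three sample
pairs, the side channel stays close to zero and takes far fewer bits to
encode.&lt;/p&gt;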


&lt;p&gt;The bulk of the FLAC project is a C library (libFLAC) that implements the codec
as per the format specification. There is also a C++ library of wrapper classes
(libFLAC++). Any actual FLAC encoder or program using the FLAC format must
implement the encoding algorithm as described above, using functions provided in
libFLAC. The reference encoder in the source, flac, provides one such
implementation.&lt;/p&gt;

&lt;h3&gt;Buffering &amp;amp; Blocking&lt;/h3&gt;

&lt;p&gt;There are two important considerations regarding speed and compression ratio
with the serial problem. There are two buffers - one reads directly from the
input file and feeds its data to the second buffer that holds the actual block
of audio to be processed. The second buffer is of constant size (one block of
samples), but the first is variable. The example encoder frontend reads 1024
samples into the first buffer, and calls the process frame function. That
function funnels data from the first buffer to the second, waiting until the
second has one full block of audio before actually processing (encoding) the
frame. Thus, the variable size of the first buffer is important to consider in
relation to available memory, disk read speed and cache size.&lt;/p&gt;
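
&lt;p&gt;The two-buffer arrangement can be sketched as follows (in Python for
brevity). The 1024-sample read size comes from the example frontend; the block
size of 4096 samples and the function name are assumptions for illustration, as
is flushing a short final block at the end of the input.&lt;/p&gt;

```python
READ_SIZE = 1024   # samples per read, as in the example frontend
BLOCK_SIZE = 4096  # assumed block size, for illustration only

def blocks_from_reads(reads, block_size=BLOCK_SIZE):
    """Accumulate variable-sized reads and yield full blocks.

    The final, possibly short, block is flushed at the end of the input.
    """
    pending = []
    for chunk in reads:
        pending.extend(chunk)
        while len(pending) >= block_size:
            yield pending[:block_size]
            del pending[:block_size]
    if pending:
        yield pending

samples = list(range(10000))
reads = [samples[i:i + READ_SIZE] for i in range(0, len(samples), READ_SIZE)]
sizes = [len(block) for block in blocks_from_reads(reads)]
print(sizes)
```

&lt;p&gt;Changing &lt;code&gt;READ_SIZE&lt;/code&gt; alters how often the input is touched
without changing the blocks that reach the encoder, which is why the first
buffer can be tuned independently.&lt;/p&gt;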

&lt;p&gt;Block size is also an important parameter. Similar to thread overhead problems
in parallel programming, too small a block size will lower the overall
compression while too large a block will not allow the compressor to generate an
efficient model for the audio.&lt;/p&gt;

&lt;h3&gt;Serial Performance&lt;/h3&gt;

&lt;p&gt;In terms of real world performance, the FLAC encoder typically encodes a 3-4
minute song in 4-6 seconds which translates into approximately 40 seconds per
album. By profiling the reference encoder with gprof, we discovered that 94-98%
of that time is spent within the different variants of process_frame (the
middle step of the while loop from above).&lt;/p&gt;

&lt;p&gt;Truncated profiling results showing the prime candidates for parallel speedup:&lt;/p&gt;

&lt;div class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;bash&quot;&gt;index  % &lt;span class=&quot;nb&quot;&gt;time  &lt;/span&gt;self   children    called     name
                0.00    0.00       1/3080     FLAC__stream_encoder_finish &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;37&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
                0.00   12.99    3079/3080     FLAC__stream_encoder_process &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;6&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;7&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;     96.5    0.00   13.00    3080          process_frame_ &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;7&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
                0.00   12.34    3080/3080     process_subframes_ &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;8&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
                0.00    0.47    3080/3080     FLAC__MD5Accumulate &lt;span class=&quot;o&quot;&gt;[&lt;/span&gt;22&lt;span class=&quot;o&quot;&gt;]&lt;/span&gt;
...
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;


&lt;h2&gt;Concurrent Architecture&lt;/h2&gt;

&lt;p&gt;We spent a large amount of time after beginning this project identifying which
classes and source files were of interest. We narrowed our modifications down to
the example C++ encoder and both the libFLAC and libFLAC++ libraries. At first
look, we had no problem finding embarrassingly parallel loops. Many of the
functions in the library iterate over large data sets with computationally
intensive processes (process_frame, which does the LPC calculations, is a good
example). However, many of the loops turned out to usually iterate over much
smaller numbers of items than we expected.&lt;/p&gt;

&lt;p&gt;After analyzing the actual runtime and size of each loop, our top three parallel
plans were:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A 4-6 stage pipeline within &lt;code&gt;process_frame&lt;/code&gt;, which is called on each frame.&lt;/li&gt;
&lt;li&gt;A &lt;code&gt;parallel_for&lt;/code&gt; within &lt;code&gt;stream_encoder_process&lt;/code&gt; for a loop that iterates
 over each sample in the block.&lt;/li&gt;
&lt;li&gt;A 4 stage pipeline at the encoder frontend level, mimicking the pseudocode
 demonstrated in the Serial Problem section.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;Our program uses plan 3, because it cleanly separates I/O from computation in a
situation very similar to Intel's example code for the TBB pipeline. Each frame is
processed completely independently of any other, so data dependencies are in
theory minimal. Plan 1 is also clearly separated, but the overhead of a pipeline
for each frame is too much, and would limit any speedup drastically. We
determined that the work in the loop of the second plan was ultimately insignificant
in comparison to other areas of the encoder.&lt;/p&gt;
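
&lt;p&gt;The shape of plan 3 can be illustrated with a simplified three-stage version
in Python: a serial read, a parallel encode and a serial, in-order write. This
is not our TBB code. It uses a thread pool instead of a TBB pipeline,
&lt;code&gt;zlib&lt;/code&gt; stands in for the FLAC frame encoder, and all function
names are made up for the example.&lt;/p&gt;

```python
import zlib
from concurrent.futures import ThreadPoolExecutor

def read_blocks(data, block_size):
    """Serial input stage: cut the input into independent blocks."""
    for start in range(0, len(data), block_size):
        yield data[start:start + block_size]

def encode_block(block):
    """Parallel stage: stand-in for FLAC frame encoding (zlib, not FLAC)."""
    return zlib.compress(block)

def run_pipeline(data, block_size=4096, workers=2):
    """Encode blocks concurrently, then collect frames in input order."""
    frames = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() hands blocks to worker threads but yields results in input
        # order, which plays the role of the serial output stage.
        for frame in pool.map(encode_block, read_blocks(data, block_size)):
            frames.append(frame)
    return frames

data = bytes(range(256)) * 64
frames = run_pipeline(data)
restored = b"".join(zlib.decompress(frame) for frame in frames)
print(len(frames), restored == data)
```

&lt;p&gt;One difference worth noting: a TBB pipeline caps the number of tokens in
flight, while this sketch submits every block up front, so it would need a bound
before being used on large files.&lt;/p&gt;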

&lt;h3&gt;Pipeline Filters&lt;/h3&gt;

&lt;p&gt;Before explaining the hazards of this design, we present our pipeline filter
plan and the read/write dependencies between each filter. These filters are
defined in filters.h, and the type of token passed among them is in
&lt;code&gt;pipelinestruct.h&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;InputFilter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read&lt;/em&gt;:  infile, shared encoder&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Write&lt;/em&gt;:  shared encoder, raw WAV buffer, byte counters&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;PCMFilter (parallel)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read&lt;/em&gt;:  byte counter, shared encoder, raw WAV buffer&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Write&lt;/em&gt;: PCM buffer&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;ProcessFilter (parallel)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read&lt;/em&gt;: PCM buffer, byte counter&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Write&lt;/em&gt;: parallel encoder&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;OutputFilter&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Read&lt;/em&gt;: parallel encoder&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Write&lt;/em&gt;: outfile, shared encoder&lt;/p&gt;

&lt;h3&gt;Serial Nature&lt;/h3&gt;

&lt;p&gt;While designing the pipeline, it became obvious that libFLAC was built from the
ground up as a serial library. Every function in the library is based on the
StreamEncoder struct, which stores all information about the audio file,
encoding status and state, progress measures and every buffer minus the first,
variable sized buffer of raw WAV data. This struct is very convenient in serial,
as it imitates object oriented functionality in C. In parallel, the fact that
this much state is held by one object severely restricts the parallelism. To
solve this problem, we rewrote a handful of library functions to be
pipeline-safe. Instead of working on a single encoder struct, they now take two
encoder structs as arguments - a shared encoder read by every filter and another
encoder mutually exclusive to a token. The motive behind this is to give each
pipeline token a unique, local encoder to keep encoding state and buffers.
Any time a function needs file metadata or progress counters, it reads them
from the shared encoder instead (which is thread safe).&lt;/p&gt;
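
&lt;p&gt;The split between shared and per-token state can be sketched like this, in
Python for brevity. Every name below is hypothetical, and the real libFLAC
structs hold far more state; the point is only that metadata and counters live
in one lock-protected object while buffers travel with the token.&lt;/p&gt;

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass

# All names here are hypothetical; libFLAC's real structs hold far more state.
@dataclass(frozen=True)
class StreamInfo:
    """Read-only file metadata, safe to share between threads."""
    channels: int
    bits_per_sample: int

class SharedEncoder:
    """State visible to every filter; mutation goes through a lock."""
    def __init__(self, info):
        self.info = info
        self._lock = threading.Lock()
        self.samples_done = 0

    def add_progress(self, count):
        with self._lock:
            self.samples_done += count

@dataclass
class TokenEncoder:
    """Per-token state: buffers that only one thread touches at a time."""
    pcm: list
    encoded: bytes = b""

def process(shared, token):
    """Toy 'encode': pack samples using metadata read from the shared encoder."""
    width = shared.info.bits_per_sample // 8
    token.encoded = b"".join(
        sample.to_bytes(width, "little", signed=True) for sample in token.pcm
    )
    shared.add_progress(len(token.pcm))
    return token

shared = SharedEncoder(StreamInfo(channels=2, bits_per_sample=16))
tokens = [TokenEncoder(pcm=[i, -i, i + 1]) for i in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    done = list(pool.map(lambda token: process(shared, token), tokens))
print(shared.samples_done, len(done[0].encoded))
```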

&lt;h2&gt;Performance Results&lt;/h2&gt;

&lt;p&gt;The parallel version compares very favorably to the serial example encoder. Both
encoders are still I/O bound at times (especially on large files). CPU usage
sometimes drops below 100%, but this has to do more with buffer sizes than the
parallelism. We used a test suite of 6 audio files, which we encoded while
timing using the two encoders, then verified by hand that the resulting audio
was not corrupted. We saw very large speedups in all of the test cases, which we
attribute to the fact that the intense computation is now parallel, and the I/O
and computation are also separated into multiple threads by the pipeline
structure. File size did not have a noticeable impact on encoding time.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;test1.wav (530MB) - Nine Inch Nails - Ghosts, Disc 1

&lt;ul&gt;
&lt;li&gt;Quiet, ambient CD with lots of constant and silent frames&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;test2.wav (584MB) - Nine Inch Nails - Ghosts, Disc 2

&lt;ul&gt;
&lt;li&gt;Heavier industrial sound, a good mix of the three frame types&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;test3.wav (17MB) - Wolf Eyes - Dead in a Boat

&lt;ul&gt;
&lt;li&gt;Noise rock, stresses verbatim frames&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;test4.wav (58MB) - Charles Mingus - Wednesday Night Prayer Meeting

&lt;ul&gt;
&lt;li&gt;Acoustic jazz, lots of ambient noise, no silent frames&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;test5.wav (57MB) - Fiona Apple - Window&lt;/li&gt;
&lt;li&gt;test6.wav (22MB) - Tom Waits - Lie to Me

&lt;ul&gt;
&lt;li&gt;Electric blues with louder, fuller frames than the previous track&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ul&gt;


&lt;h3&gt;Results&lt;/h3&gt;

&lt;p&gt;Run on a dual core AMD Athlon X2 4200+, averaged over 3 runs:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;Starting testing...

Encoding ./test1.wav with serial encoder...
average wall time  0m48.408s
Encoding ./test1.wav with parallel encoder...
average wall time  0m27.92s
73% speedup

Encoding ./test2.wav with serial encoder...
average wall time  1m2.962
Encoding ./test2.wav with parallel encoder...
average wall time  0m35.693
76% speedup

Encoding ./test3.wav with serial encoder...
average wall time  0m1.971s
Encoding ./test3.wav with parallel encoder...
average wall time  0m1.118s
76% speedup

Encoding ./test4.wav with serial encoder...
average wall time  0m6.209s
Encoding ./test4.wav with parallel encoder...
average wall time  0m2.963s
109% speedup

Encoding ./test5.wav with serial encoder...
average wall time  0m6.524
Encoding ./test5.wav with parallel encoder...
average wall time  0m2.65
146% speedup

Encoding ./test6.wav with serial encoder...
average wall time  0m1.844s
Encoding ./test6.wav with parallel encoder...
average wall time  0m0.973s
89% speedup

Testing finished.
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Average speedup: 94%&lt;/p&gt;
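
&lt;p&gt;The test output does not say how the percentages were derived. The figures
are consistent with &lt;code&gt;(serial / parallel - 1) * 100&lt;/code&gt;, truncated to
a whole number, and with an average taken over those truncated values. The
sketch below is that inference, checked against the wall times above; the
variable and function names are made up.&lt;/p&gt;

```python
# Average wall times in seconds (serial, parallel), transcribed from the
# output above; 1m2.962 is written as 62.962.
timings = {
    "test1": (48.408, 27.92),
    "test2": (62.962, 35.693),
    "test3": (1.971, 1.118),
    "test4": (6.209, 2.963),
    "test5": (6.524, 2.65),
    "test6": (1.844, 0.973),
}

def speedup_percent(serial, parallel):
    """Percentage speedup, truncated to a whole number."""
    return int((serial / parallel - 1) * 100)

percents = {name: speedup_percent(s, p) for name, (s, p) in timings.items()}
average = sum(percents.values()) // len(percents)
print(percents)
print(average)
```

&lt;p&gt;A speedup of 100% on this scale means the parallel encoder finished in half
the wall time of the serial one.&lt;/p&gt;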

&lt;h2&gt;Lessons Learned&lt;/h2&gt;

&lt;p&gt;This project was an immense learning experience for both of us. This was our
first time touching a project of this magnitude, and it was eye opening to see
its code organization and documentation. We also had to understand automake,
and its Makefile generation scripts. We still do not know how to use this tool
to its fullest, but what we could figure out was very helpful for adding our
libraries and new files.&lt;/p&gt;

&lt;p&gt;Our confidence in parallel design patterns was strengthened by planning and
implementing the encoder. Countless times during the planning stage, something
we learned in lecture really solidified in our minds. Examples of this include
being wary of thread overhead and watching for blocking I/O. Most importantly,
we saw firsthand how very difficult it can be to parallelize code written for
serial execution. It is always a better situation to be working with code
already optimized for minimum data dependencies. The majority of our time was
spent separating the shared and mutually exclusive encoders, and still, there
are some issues we have yet to resolve. Namely, the parallel encoder does not
yet support the following options:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Loose mid-side stereo&lt;/li&gt;
&lt;li&gt;Verification while encoding&lt;/li&gt;
&lt;li&gt;Verification after decoding (using the MD5 checksum)&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;These are not serious problems, and could be rectified with additional effort.
We made a decision to focus elsewhere on this project, as these options eluded
complete understanding and aren't common encoding options.&lt;/p&gt;

&lt;h2&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;FLAC is an embarrassingly parallel encoder that looked very simple to
parallelize. The vast source code made that job much harder, but ultimately the
actual parallel design was no more difficult than we anticipated. Intel's
Threading Building Blocks freed us up to break down the library into functions
we could use without worrying about the details of threads. At the scale of the
TBB finger exercise, TBB's usefulness was not entirely clear, but integrated
into a large project like this, TBB's algorithms were extremely helpful. We hope
to see the library in common use, so a program like this could be
distributed.&lt;/p&gt;

&lt;p&gt;Our encoder is not full-featured. Based on the example C++ encoder, it encodes a
very limited subset of the types of files understood by the full-fledged FLAC
reference encoder. There is no reason why our pipeline design could not be
melded with the robust encoder, but that is beyond the time frame of this class.
The reference encoder is a C program, so any further modifications would and
should start as a new C++ reference encoder. Time permitting, we will pursue
this idea with the creator of FLAC and search for interested developers in the
FLAC community. There are hints of interest in a multi-threaded FLAC encoder on
the web, so enlisting the help of other developers should not be impossible with
the solid start we've completed already.&lt;/p&gt;

&lt;h3&gt;Source&lt;/h3&gt;

&lt;p&gt;Files created or modified in FLAC source:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;examples/cpp/encode/file/filters.h
examples/cpp/encode/file/filters.cpp
examples/cpp/encode/file/pipelinestruct.h
examples/cpp/encode/file/main_parallel.cpp
include/FLAC++/encoder.h
include/FLAC/stream_encoder.h
src/libFLAC++/stream_encoder.cpp
src/libFLAC/stream_encoder.c
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The patches to the FLAC encoder are available at
&lt;a href=&quot;https://github.com/peplin/pflac&quot;&gt;GitHub&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A &lt;a href=&quot;https://github.com/peplin/pflac/raw/master/docs/report.pdf&quot;&gt;PDF&lt;/a&gt; version of
this post is also available.&lt;/p&gt;

&lt;h2&gt;References&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href=&quot;http://flac.sourceforge.net/format.html&quot;&gt;FLAC Format Specification&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&quot;http://flac.sourceforge.net/api/index.html&quot;&gt;FLAC Project Documentation&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

</content>
 </entry>
 
 <entry>
   <title>Standardizing Python Web Application Deployment</title>
   <link href="http://christopherpeplin.com/2011/05/python-deployment/"/>
   <updated>2011-05-25T00:00:00-07:00</updated>
   <id>http://christopherpeplin.com/2011/05/test</id>
   <content type="html">&lt;p&gt;Deployment is one of the most talked-about issues in the Python web application
community. Every development shop comes up with its own approach, pieced
together from experience and blog posts.&lt;/p&gt;

&lt;p&gt;Fabric is often mentioned as the Python alternative to the code deploy tool
Capistrano, but there are many more Capistrano plugins and extensions than
Fabfiles. One reason for this is the lack of a standard application layout (e.g.
where to find the WSGI application, the JavaScript files, and the list of
dependencies). What if there were a standard convention for organizing and
deploying small- to medium-sized applications? We are open-sourcing the
configuration we use at Bueda for building and deploying Django and Tornado web
applications. There are four related projects that make up the release:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/bueda/python-webapp-etc&quot;&gt;python-webapp-etc&lt;/a&gt; -
Config files for tools to deploy Python webapps&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/bueda/ops&quot;&gt;buedafab (a.k.a. &quot;ops&quot;)&lt;/a&gt; -
A collection of Fabric commands for deployment&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/bueda/django-boilerplate&quot;&gt;django-boilerplate&lt;/a&gt; -
A standard layout for Django apps&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/bueda/tornado-boilerplate&quot;&gt;tornado-boilerplate&lt;/a&gt; -
A standard layout for Tornado apps&lt;/p&gt;

&lt;p&gt;I will let the documentation of each individual repository speak for itself.
Ideas, comments and contributions are welcome - we hope to come to a community
consensus on good standard practice.&lt;/p&gt;
</content>
 </entry>
 

</feed>

