<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:sy="http://purl.org/rss/1.0/modules/syndication/" xmlns:slash="http://purl.org/rss/1.0/modules/slash/" version="2.0">

<channel>
	<title>Benjamin McCann - Development Blog</title>
	
	<link>http://www.benmccann.com/dev-blog</link>
	<description>The software development weblog of Benjamin McCann.</description>
	<lastBuildDate>Fri, 27 Jan 2012 01:34:07 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3.1</generator>
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/rss+xml" href="http://feeds.feedburner.com/benmccann-tech" /><feedburner:info xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" uri="benmccann-tech" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><feedburner:emailServiceId xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">benmccann-tech</feedburner:emailServiceId><feedburner:feedburnerHostname xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">http://feedburner.google.com</feedburner:feedburnerHostname><item>
		<title>Migrating from MySQL to Percona Server</title>
		<link>http://www.benmccann.com/dev-blog/migrating-from-mysql-to-percona-server/</link>
		<comments>http://www.benmccann.com/dev-blog/migrating-from-mysql-to-percona-server/#comments</comments>
		<pubDate>Mon, 12 Dec 2011 06:58:53 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Datastores]]></category>
		<category><![CDATA[MySQL]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=553</guid>
		<description><![CDATA[Percona Server is just MySQL with a few extra options added in by Percona. It&#8217;s backwards compatible and based off the same code base. If you&#8217;re not familiar with Percona, they are the world&#8217;s leading MySQL consultants. The main reason I switched is because Ubuntu uses an old version of MySQL. Ubuntu is about a [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.percona.com/doc/percona-server/5.5/index.html">Percona Server</a> is just MySQL with a few extra options added in by Percona.  It&#8217;s backwards compatible and based on the same code base.  If you&#8217;re not familiar with Percona, they are the world&#8217;s leading MySQL consultants.  The main reason I switched is that Ubuntu ships an old version of MySQL.  Ubuntu is about a year behind in packaging MySQL, apparently something to do with checking the copyright after Oracle got ahold of it.  Switching to Percona Server seemed to be the easiest way to update.  A few other reasons follow.</p>
<p>Everyone and their mom says <a href="http://www.percona.com/doc/percona-xtrabackup/">XtraBackup</a> is the way to go for MySQL backups.  Even Facebook uses it.  XtraBackup is an open source project made by Percona, and it&#8217;s available in the Percona apt repositories.  mysqldump is fine for small projects, but it doesn&#8217;t scale well once you have any real amount of data.</p>
<p>By default, older versions of MySQL use the MyISAM storage engine, which has fallen out of favor.  The default in newer MySQL installs is InnoDB.  Percona also makes a storage engine called XtraDB, which is backwards compatible with InnoDB and supposedly a bit more performant.  <a href="http://mariadb.org/">MariaDB</a> (a MySQL fork maintained by MySQL&#8217;s original creator) uses it as its default as well.  It sounds like most people don&#8217;t notice a huge difference between XtraDB and InnoDB, but both are much favored over MyISAM, which caused lots of problems for people.</p>
<p>Finally, there&#8217;s also <a href="http://yoshinorimatsunobu.blogspot.com/2010/10/using-mysql-as-nosql-story-for.html">HandlerSocket</a>, which is a plugin for MySQL.  It allows you to do primary key lookups directly against the storage engine, bypassing MySQL&#8217;s SQL layer.  It&#8217;s supposed to be 5-10x faster because it doesn&#8217;t have to parse the SQL and do table locking.  It turns MySQL into a key/value store as good as any of the NoSQL solutions.  It&#8217;s actually much better because you can still run SQL queries on your data, which you can&#8217;t do with most of the NoSQL solutions, and you get MySQL&#8217;s replication etc., which is all very well documented.  As long as your DB can fit in RAM on a single machine it makes MySQL much faster, perhaps even faster and easier to use than memcached.</p>
<p>To migrate, first create a backup:</p>
<pre><code>mysqldump -uroot -p --all-databases > dump.sql</code></pre>
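<p>Before upgrading, it can be worth a quick sanity check that the dump actually finished.  Here is a rough sketch, assuming mysqldump&#8217;s default behaviour of ending the file with a <code>-- Dump completed</code> comment (which is not written if comments are turned off with <code>--skip-comments</code>).  The function name is made up for illustration:</p>
<pre><code># Hypothetical helper: succeed only if the dump file is non-empty and
# its last line is mysqldump's "-- Dump completed" trailer.
dump_looks_complete() {
  [ -s "$1" ] || return 1
  tail -n 1 "$1" | grep -q '^-- Dump completed'
}

if dump_looks_complete dump.sql; then
  echo "dump.sql looks complete"
else
  echo "dump.sql is missing or truncated; re-run mysqldump before upgrading"
fi</code></pre>
<p>This only catches an interrupted dump.  It says nothing about whether the data restores cleanly, so a test restore on another machine is still the safer check.</p>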
<p>Then do the upgrade:</p>
<pre><code>gpg --keyserver hkp://keys.gnupg.net --recv-keys 1C4CBDCDCD2EFD2A
gpg -a --export CD2EFD2A | sudo apt-key add -
sudo emacs /etc/apt/sources.list
# Add the following lines, replacing "maverick" with your Ubuntu release codename:
    ## Percona repository
    deb http://repo.percona.com/apt maverick main
    deb-src http://repo.percona.com/apt maverick main
sudo apt-get update
sudo apt-get install percona-server-server-5.5
sudo apt-get autoremove</code></pre>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/migrating-from-mysql-to-percona-server/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Running Ubuntu on VirtualBox</title>
		<link>http://www.benmccann.com/dev-blog/running-ubuntu-on-virtualbox/</link>
		<comments>http://www.benmccann.com/dev-blog/running-ubuntu-on-virtualbox/#comments</comments>
		<pubDate>Wed, 30 Nov 2011 22:53:03 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Tips and Tricks]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=541</guid>
		<description><![CDATA[I had to figure out a few things to get Ubuntu installed and working well on VirtualBox. I had to enable virtualization technologies in my BIOS. I have a Lenovo T520 and did this by pressing F1 during startup and then going to Security > Virtualization. If I did not do this then I would [...]]]></description>
			<content:encoded><![CDATA[<p>I had to figure out a few things to get Ubuntu installed and working well on VirtualBox.</p>
<p>I had to enable virtualization technologies in my BIOS.  I have a Lenovo T520 and did this by pressing F1 during startup and then going to Security > Virtualization.  Without this, I received the error &#8220;VT-x features locked or unavailable in MSR&#8221; when trying to run with more than 1 CPU or 3584 MB of RAM.</p>
<p>Also, I had to run &#8220;sudo apt-get install dkms&#8221; to get the VirtualBox Guest Additions to work well.</p>
<p>Finally, I remapped the host key.  The default host key is the right Ctrl button, so all kinds of weird things happen when you press it.  This can be fixed by going to File > Preferences&#8230; > Input and then setting Host Key to something you never use, like Pause.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/running-ubuntu-on-virtualbox/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>SSL on localhost with nginx</title>
		<link>http://www.benmccann.com/dev-blog/ssl-on-localhost-with-nginx/</link>
		<comments>http://www.benmccann.com/dev-blog/ssl-on-localhost-with-nginx/#comments</comments>
		<pubDate>Mon, 14 Nov 2011 08:22:11 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Uncategorized]]></category>
		<category><![CDATA[nginx]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=532</guid>
		<description><![CDATA[Install nginx if it&#8217;s not already installed: sudo apt-get install nginx You must have the SSL module installed. The nginx docs say this is not standard. However, it does come installed on Ubuntu. You can verify by running nginx -V and looking for --with-http_ssl_module. Next up is generating the SSL certs. Follow the Slicehost docs [...]]]></description>
			<content:encoded><![CDATA[<p>Install nginx if it&#8217;s not already installed:</p>
<pre><code>sudo apt-get install nginx</code></pre>
<p>You must have the SSL module installed.  The nginx docs say it is not built by default.  However, it does come installed on Ubuntu.  You can verify by running <code>nginx -V</code> and looking for <code>--with-http_ssl_module</code>.</p>
<p>Next up is generating the SSL certs.  <a href="http://articles.slicehost.com/2007/12/19/ubuntu-gutsy-self-signed-ssl-certificates-and-nginx">Follow the Slicehost docs</a> for this step.</p>
<p>Now you&#8217;ll need to update your /etc/nginx/nginx.conf file:</p>
<pre><code>  # Assumed backend: the app server listening on port 8080
  upstream backend {
    server 127.0.0.1:8080;
  }

  server {
    server_name www.yourdomain.com yourdomain.com;
    rewrite ^(.*) https://www.yourdomain.com$1 permanent;
  }

  server {
    server_name local.yourdomain.com;
    rewrite ^(.*) https://local.yourdomain.com$1 permanent;
  }

  server {
    listen               443 ssl;
    ssl_certificate      /etc/ssl/certs/myssl.crt;
    ssl_certificate_key  /etc/ssl/private/myssl.key;
    keepalive_timeout    70;
    server_name www.yourdomain.com local.yourdomain.com;
    location / {
      proxy_pass  http://backend;
    }
  }</code></pre>
<p>Then reload nginx:</p>
<pre><code>sudo nginx -s reload</code></pre>
<p>Finally, in /etc/hosts put:</p>
<pre><code>127.0.0.1   local.yourdomain.com</code></pre>
<p>You can now visit https://local.yourdomain.com/, which nginx will proxy to the server that you have running on port 8080.  Because the certificate is self-signed, your browser will show a warning the first time.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/ssl-on-localhost-with-nginx/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Embedded Tomcat</title>
		<link>http://www.benmccann.com/dev-blog/embedded-tomcat/</link>
		<comments>http://www.benmccann.com/dev-blog/embedded-tomcat/#comments</comments>
		<pubDate>Sun, 28 Aug 2011 09:08:41 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Tomcat]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=518</guid>
		<description><![CDATA[Earlier in the year, I posted a quick writeup on how to run an embedded Jetty instance. Today, I&#8217;m posting basically the same code showing how to run an embedded Tomcat instance. The embedded Tomcat API is much nicer since it matches closely the web.xml syntax. However, the embedded Tomcat instance takes much longer to [...]]]></description>
			<content:encoded><![CDATA[<p>Earlier in the year, I posted a quick <a href="http://www.benmccann.com/dev-blog/embedded-jetty/">writeup on how to run an embedded Jetty instance</a>.  Today, I&#8217;m posting basically the same code showing how to run an embedded Tomcat instance.  The embedded Tomcat API is much nicer since it closely matches the web.xml syntax.  However, the embedded Tomcat instance takes much longer to start up.</p>
<pre><code>package com.benmccann.webtemplate.frontend.server;

import org.apache.catalina.Context;
import org.apache.catalina.core.AprLifecycleListener;
import org.apache.catalina.core.StandardServer;
import org.apache.catalina.deploy.FilterDef;
import org.apache.catalina.deploy.FilterMap;
import org.apache.catalina.startup.Tomcat;
import org.apache.struts2.dispatcher.ng.filter.StrutsPrepareAndExecuteFilter;

import com.beust.jcommander.JCommander;
import com.google.inject.Guice;
import com.google.inject.Inject;
import com.google.inject.Injector;
import com.google.inject.servlet.GuiceFilter;

/**
 * @author Ben McCann (benmccann.com)
 */
public class WebServer {

  private final FrontendSettings webServerSettings;
  private final GuiceListener guiceListener;
  private final Tomcat tomcat;

  @Inject
  public WebServer(
      FrontendSettings webServerSettings,
      GuiceListener guiceListener) {
    this.webServerSettings = webServerSettings;
    this.guiceListener = guiceListener;
    this.tomcat = new Tomcat();
  }

  private FilterDef createFilterDef(String filterName, String filterClass) {
    FilterDef filterDef = new FilterDef();
    filterDef.setFilterName(filterName);
    filterDef.setFilterClass(filterClass);
    return filterDef;
  }

  private FilterMap createFilterMap(String filterName, String urlPattern) {
    FilterMap filterMap = new FilterMap();
    filterMap.setFilterName(filterName);
    filterMap.addURLPattern(urlPattern);
    return filterMap;
  }

  public void run() throws Exception {
    String appBase = ".";
    tomcat.setPort(webServerSettings.getPort());

    tomcat.setBaseDir("webapp");
    tomcat.getHost().setAppBase(appBase);

    String contextPath = "/";

    // Add AprLifecycleListener to give native speed boost
    // sudo apt-get install libtcnative-1
    StandardServer server = (StandardServer)tomcat.getServer();
    AprLifecycleListener listener = new AprLifecycleListener();
    server.addLifecycleListener(listener);

    Context context = tomcat.addWebapp(contextPath, appBase);
    context.addFilterDef(createFilterDef("guice", GuiceFilter.class.getName()));
    FilterDef struts2FilterDef = createFilterDef("struts2",
        StrutsPrepareAndExecuteFilter.class.getName());
    struts2FilterDef.addInitParameter("struts.devMode",
        Boolean.toString(webServerSettings.isDevModeEnabled()));
    context.addFilterDef(struts2FilterDef);
    context.addFilterMap(createFilterMap("guice", "/*"));
    context.addFilterMap(createFilterMap("struts2", "/*"));

    tomcat.start();
    tomcat.getServer().await();
  }

  public static void main(String[] args) throws Exception {
    // Parse command-line flags into the settings object.
    FrontendSettings webServerSettings = new FrontendSettings();
    new JCommander(webServerSettings, args);

    // Build the injector from the module so WebServer's dependencies resolve.
    Injector injector = Guice.createInjector(
        new FrontendModule(webServerSettings));

    WebServer server = injector.getInstance(WebServer.class);
    server.run();
  }

}
</code></pre>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/embedded-tomcat/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Installing CUDA and Theano on Ubuntu 11.04 Natty Narwhal</title>
		<link>http://www.benmccann.com/dev-blog/installing-cuda-and-theano/</link>
		<comments>http://www.benmccann.com/dev-blog/installing-cuda-and-theano/#comments</comments>
		<pubDate>Sun, 10 Jul 2011 03:00:10 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=496</guid>
		<description><![CDATA[Theano is a very interesting Python library developed mainly for deep learning, which can run calculations on some NVIDIA GPUs by using the CUDA library.  Setting up Theano to use the GPU can be a little tricky and take a bit of work. However, Aaron Haviland has set up a CUDA 4.0 PPA, which makes [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://deeplearning.net/software/theano/">Theano</a> is a very interesting Python library developed mainly for deep learning, which can run calculations on <a href="http://developer.nvidia.com/cuda-gpus">some NVIDIA GPUs</a> by using the CUDA library.  Setting up Theano to use the GPU can be a little tricky and take a bit of work. However, Aaron Haviland has set up a <a href="https://launchpad.net/~aaron-haviland/+archive/cuda-4.0">CUDA 4.0 PPA</a>, which makes the installation much simpler.</p>
<p><strong>Install Theano</strong><br />
<code>sudo apt-get install python-numpy libblas-dev liblapack-dev gfortran python-dev python-pip git<br />
sudo pip install --upgrade git+git://github.com/Theano/Theano.git</code><br />
This will put Theano in /usr/local/lib/python2.7/dist-packages/theano</p>
<p><strong>Install CUDA (requires downgrading gcc to 4.4)</strong><br />
<code>sudo add-apt-repository ppa:aaron-haviland/cuda-4.0<br />
sudo apt-get update<br />
sudo apt-get upgrade<br />
sudo apt-get install nvidia-cuda-toolkit g++-4.4 gcc-4.4<br />
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.5 40 --slave /usr/bin/g++ g++ /usr/bin/g++-4.5<br />
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-4.4 60 --slave /usr/bin/g++ g++ /usr/bin/g++-4.4<br />
sudo update-alternatives --config gcc</code></p>
<p><strong>Test it out</strong></p>
<p>Now run the sample program under &#8220;Putting it all Together&#8221; in the <a href="http://deeplearning.net/software/theano/tutorial/using_gpu.html">Theano tutorial</a>. It will hopefully tell you that it used your GPU.</p>
<p>A good benchmark to test out the speed of your setup is to run /usr/local/lib/python2.7/dist-packages/theano/misc/check_blas.py</p>
<p><strong>Credits</strong></p>
<p>Thanks to <a href="http://www-etud.iro.umontreal.ca/~bergstrj/">James Bergstra</a> for the necessary Theano fix to make it work with the PPA as well as the rest of the Theano developers for providing this very cool library. And also to <a href="http://www.cs.stanford.edu/people/ang/">Andrew Ng</a>, <a href="http://bengio.abracadoudou.com/">Samy Bengio</a>, and the other Googlers who have been taking their time to teach the rest of us more machine learning concepts.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/installing-cuda-and-theano/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Getting started with Git</title>
		<link>http://www.benmccann.com/dev-blog/getting-started-with-git/</link>
		<comments>http://www.benmccann.com/dev-blog/getting-started-with-git/#comments</comments>
		<pubDate>Mon, 11 Apr 2011 08:03:43 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Version Control]]></category>
		<category><![CDATA[Git]]></category>
		<category><![CDATA[Tutorial]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=470</guid>
		<description><![CDATA[I&#8217;ve recently started using Git, which I&#8217;ve found I much prefer to Subversion for two reasons. The first is that it&#8217;s really fast since almost all commands are run locally. The second reason is that Subversion litters your source code with .svn directories and should you accidentally delete or move one then you&#8217;re in for [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve recently started using Git, which I&#8217;ve found I much prefer to Subversion for two reasons. The first is that it&#8217;s really fast since almost all commands are run locally. The second reason is that Subversion litters your source code with .svn directories and should you accidentally delete or move one then you&#8217;re in for a world of hurt. Git also handles ignored files in a much easier manner.</p>
<p>There are two downsides with Git. The first is that there&#8217;s no central server to store the code base. <a href="http://www.github.com/">GitHub</a> or <a href="https://bitbucket.org/">BitBucket</a> can fulfill this role if you don&#8217;t mind someone else hosting your source code. If you want to set up a central server yourself it seems the best solution is <a href="https://github.com/sitaramc/gitolite">gitolite</a>. The documentation isn&#8217;t for beginners, but I found a decent <a href="http://www.philwhln.com/install-gitolite-to-manage-your-git-repositories">tutorial on setting up gitolite</a>.</p>
<p>The other downside with git is that the commands can be a bit bizarre.</p>
<p><strong>git aliases</strong></p>
<p>You can set aliases using <code>git config --global</code>.  E.g. <code>git config --global alias.dt "difftool --no-prompt"</code> makes <code>git dt</code> act the same as <code>git difftool --no-prompt</code>. These aliases are saved in ~/.gitconfig. My ~/.gitconfig looks like:</p>
<pre><code>[user]
	name = Ben McCann
	email = ben@benmccann.com
[alias]
	cam = commit -am
	dt = difftool --no-prompt
	dtm = !meld .
	pending = !clear &#038;&#038; git status
	pullom = pull origin master
	pushom = push origin master
	rev = checkout --
	revall = reset --hard HEAD
</code></pre>
<p><strong>Reverting to a previous version</strong></p>
<pre><code>$ git reset --hard YOUR_CHANGESET_HERE
$ git reset --soft @{1}
$ git commit -a</code></pre>
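<p>The three commands above move the branch back to an old commit, put the branch pointer back where it was while leaving the files in their old state, and then record that old state as a new commit.  Nothing in the existing history is rewritten.  A minimal sketch in a throwaway repository (the helper name, file name, and commit messages are made up for illustration):</p>
<pre><code># Hypothetical helper: restore the tree of commit $1 as a new commit
# on top of the current branch.
restore_as_new_commit() {
  git reset -q --hard "$1"      # files, index and branch now match the old commit
  git reset -q --soft '@{1}'    # branch pointer goes back; files stay old
  git commit -q -a -m "Restore state of $1"
}

# Demo in a scratch repository.
repo=$(mktemp -d)
cd "$repo" || exit 1
git init -q
git config user.name "Example"
git config user.email "example@example.com"
echo one > notes.txt
git add notes.txt
git commit -q -m "first"
old=$(git rev-parse HEAD)
echo two > notes.txt
git commit -q -a -m "second"

restore_as_new_commit "$old"
cat notes.txt               # prints: one
git rev-list --count HEAD   # prints: 3</code></pre>
<p>Note that the hard reset throws away any uncommitted changes, so commit or stash them first.</p>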
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/getting-started-with-git/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Sed Cookbook</title>
		<link>http://www.benmccann.com/dev-blog/sed-cookbook/</link>
		<comments>http://www.benmccann.com/dev-blog/sed-cookbook/#comments</comments>
		<pubDate>Fri, 01 Apr 2011 05:50:49 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Linux]]></category>
		<category><![CDATA[Tips and Tricks]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=460</guid>
		<description><![CDATA[The Linux sed command is a stream editor.  What that means is basically that you can do a regex operation on each line of a file or a piped stream.  I always have a bit of trouble remembering how to use it since its regex implementation is a bit different than the ones I&#8217;m used [...]]]></description>
			<content:encoded><![CDATA[<p>The Linux sed command is a stream editor.  What that means is basically that you can do a regex operation on each line of a file or a piped stream.  I always have a bit of trouble remembering how to use it since its regex implementation is a bit different than the ones I&#8217;m used to.  I&#8217;ll post more examples as I encounter them in my work.</p>
<p>Sed regex reminders:</p>
<ul>
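<li>By default you need a backslash before parens in a regex grouping (the <code>-E</code> option switches to extended regexes, where plain parens group)</li>
<li>You refer to matched regex groups using \1, \2, etc.</li>
<li>An unescaped + is a literal character in the default (basic) regex syntax.  Use <code>\+</code> in GNU sed, or <code>-E</code> for extended regexes</li>
<li>Non-greedy quantifiers don&#8217;t work.  For example, .*? will not work</li>
<li>The output is printed to standard out by default.  You need the -i option if you want to edit a file in place with sed.</li>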
</ul>
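<p>The reminders above can be tried out with a few one-liners.  This is a minimal sketch, assuming a sed that supports <code>-E</code> (GNU and BSD sed both do).  The function names are made up, and the tab is passed in literally because <code>\t</code> inside a bracket expression is a GNU extension:</p>
<pre><code># Hypothetical helpers; each one reads a stream on stdin.
TAB=$(printf '\t')

# Basic regex (the default): groups are written \( \) and referenced as \1.
first_column() {
  sed "s/\([^$TAB]*\).*/\1/"
}

# Extended regex (-E): + means "one or more" without a backslash.
squeeze_spaces() {
  sed -E 's/ +/ /g'
}

# No non-greedy .*? in sed: match "anything but the delimiter" instead.
before_first_comma() {
  sed 's/^\([^,]*\),.*/\1/'
}

printf 'a\tb\tc\n' | first_column               # prints: a
printf 'x    y  z\n' | squeeze_spaces           # prints: x y z
printf 'one,two,three\n' | before_first_comma   # prints: one</code></pre>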
<p><strong>Remove all but the first column in a .tsv stream</strong><br />
<code>sed 's/\([^\t]*\).*/\1/'</code></p>
<p><strong>Edit a .tsv file by removing all but the first column</strong><br />
<code>sed -i 's/\([^\t]*\).*/\1/'</code></p>
<p><strong>Remove the first line of a stream</strong><br />
<code>sed '1d'</code></p>
<p><strong>Strip trailing whitespace from a file</strong><br />
<code>sed -i -e 's/[[:space:]]*$//'</code></p>
<p><strong>Replace @inheritDoc with @override after marking for edit</strong><br />
<code>grep @inheritDoc -l -r java/com/benmccann | xargs p4 edit</code><br />
<code>grep @inheritDoc -l -r java/com/benmccann | xargs sed -i 's/\(.*\)@inheritDoc/\1@override/'</code></p>
<p><strong>Replace @inheritDoc with @override in JS files after marking for edit</strong><br />
<code>find java/com/benmccann -name '*.js' -print0 | xargs -0 grep -l @inheritDoc | xargs p4 edit</code><br />
<code>find java/com/benmccann -name '*.js' -print0 | xargs -0 grep -l @inheritDoc | xargs sed -i 's/\(.*\)@inheritDoc/\1@override/'</code></p>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/sed-cookbook/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Using the Guice Struts 2 plugin</title>
		<link>http://www.benmccann.com/dev-blog/guice-3-struts-2-plugin/</link>
		<comments>http://www.benmccann.com/dev-blog/guice-3-struts-2-plugin/#comments</comments>
		<pubDate>Tue, 29 Mar 2011 22:43:25 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Guice]]></category>
		<category><![CDATA[Struts 2]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=449</guid>
		<description><![CDATA[Guice 3.0 was released a few days ago!  One of the easiest ways to use it in your web server is to use Struts 2 with the Guice Struts 2 plugin, which is available in the central Maven repository. This tutorial assumes familiarity with Guice and Struts 2. In order to use the plugin, your [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://code.google.com/p/google-guice/">Guice</a> 3.0 was released a few days ago!  One of the easiest ways to use it in your web server is to use <a href="http://struts.apache.org/">Struts 2</a> with the Guice Struts 2 plugin, which is <a href="http://repo2.maven.org/maven2/com/google/inject/extensions/guice-struts2/3.0/">available in the central Maven repository</a>.</p>
<p>This tutorial assumes familiarity with Guice and Struts 2.</p>
<p>In order to use the plugin, your injector must be created with a Struts2GuicePluginModule:</p>
<pre><code>Injector injector = Guice.createInjector(
    new com.google.inject.servlet.ServletModule(),
    new com.google.inject.struts2.Struts2GuicePluginModule(),
    new MyModule());</code></pre>
<p>You must then define a GuiceServletContextListener to provide the injector to the Struts 2 plugin.  I injected the Injector because I&#8217;m using embedded Jetty.  However, if you&#8217;re using a standard servlet container, you&#8217;d probably just create the injector in the class itself.</p>
<pre><code>package com.benmccann.example;

import com.google.inject.Inject;
import com.google.inject.Injector;
import com.google.inject.servlet.GuiceServletContextListener;

/**
 * @author benmccann.com
 */
public class GuiceListener extends GuiceServletContextListener {

  private final Injector injector;

  @Inject
  public GuiceListener(Injector injector) {
    this.injector = injector;
  }

  @Override
  public Injector getInjector() {
    return injector;
  }

}</code></pre>
<p>You must then wire it up in your web.xml:</p>
<pre><code>  &lt;listener&gt;
    &lt;listener-class&gt;com.benmccann.example.GuiceListener&lt;/listener-class&gt;
  &lt;/listener&gt;  

  &lt;filter&gt;
    &lt;filter-name&gt;guice&lt;/filter-name&gt;
    &lt;filter-class&gt;com.google.inject.servlet.GuiceFilter&lt;/filter-class&gt;
  &lt;/filter&gt;

  &lt;filter-mapping&gt;
    &lt;filter-name&gt;guice&lt;/filter-name&gt;
    &lt;url-pattern&gt;/*&lt;/url-pattern&gt;
  &lt;/filter-mapping&gt;</code></pre>
<p>There&#8217;s also an <a href="http://code.google.com/p/google-guice/source/browse/#svn%2Ftrunk%2Fextensions%2Fstruts2%2Fexample">example in the Guice source code repository</a>.</p>
<p>Enjoy!</p>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/guice-3-struts-2-plugin/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Latent Dirichlet Allocation with Mallet</title>
		<link>http://www.benmccann.com/dev-blog/latent-dirichlet-allocation-mallet/</link>
		<comments>http://www.benmccann.com/dev-blog/latent-dirichlet-allocation-mallet/#comments</comments>
		<pubDate>Fri, 11 Mar 2011 03:40:09 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Machine Learning]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=439</guid>
		<description><![CDATA[We recently had a PhD candidate from UCI come in and speak to the AI club at Google Irvine about her research on Latent Dirichlet Allocation (LDA). LDA is a topic model that groups words into topics, where each article is a mixture of topics. I was interested to play around [...]]]></description>
			<content:encoded><![CDATA[<p>We recently had a PhD candidate from UCI come in and speak to the AI club at Google Irvine about her research on Latent Dirichlet Allocation (LDA).  LDA is a topic model that groups words into topics, where each article is a mixture of topics.  I was interested to play around with this a bit, so I downloaded <a href="http://mallet.cs.umass.edu/">Mallet</a> and wrote up some quick code to try making my own LDA model.</p>
<pre><code>package com.benmccann.topicmodel;

import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

import cc.mallet.pipe.CharSequence2TokenSequence;
import cc.mallet.pipe.Pipe;
import cc.mallet.pipe.SerialPipes;
import cc.mallet.pipe.TokenSequence2FeatureSequence;
import cc.mallet.pipe.TokenSequenceLowercase;
import cc.mallet.pipe.TokenSequenceRemoveStopwords;
import cc.mallet.pipe.iterator.ArrayIterator;
import cc.mallet.topics.ParallelTopicModel;
import cc.mallet.types.Alphabet;
import cc.mallet.types.IDSorter;
import cc.mallet.types.InstanceList;

import com.google.inject.Guice;
import com.google.inject.Inject;
import com.google.inject.Injector;

public class Lda {

  @Inject private com.benmccann.topicmodel.TextProvider textProvider;

  InstanceList createInstanceList(List&lt;String&gt; texts) throws IOException {
    ArrayList&lt;Pipe&gt; pipes = new ArrayList&lt;Pipe&gt;();
    pipes.add(new CharSequence2TokenSequence());
    pipes.add(new TokenSequenceLowercase());
    pipes.add(new TokenSequenceRemoveStopwords());
    pipes.add(new TokenSequence2FeatureSequence());
    InstanceList instanceList = new InstanceList(new SerialPipes(pipes));
    instanceList.addThruPipe(new ArrayIterator(texts));
    return instanceList;
  }

  private ParallelTopicModel createNewModel() throws IOException {
    List&lt;String&gt; texts = textProvider.getTexts();
    InstanceList instanceList = createInstanceList(texts);
    // Rough guess: one topic per five documents, but never fewer than one.
    int numTopics = Math.max(1, instanceList.size() / 5);
    ParallelTopicModel model = new ParallelTopicModel(numTopics);
    model.addInstances(instanceList);
    model.estimate();
    return model;
  }

  ParallelTopicModel getOrCreateModel() throws Exception {
    return getOrCreateModel("model");
  }

  private ParallelTopicModel getOrCreateModel(String directoryPath)
      throws Exception {
    File directory = new File(directoryPath);
    if (!directory.exists()) {
      directory.mkdir();
    }
    File file = new File(directory, "mallet-lda.model");
    ParallelTopicModel model = null;
    if (!file.exists()) {
      model = createNewModel();
      model.write(file);
    } else {
      model = ParallelTopicModel.read(file);
    }
    return model;
  }

  public void printTopics() throws Exception {
    ParallelTopicModel model = getOrCreateModel();
    Alphabet alphabet = model.getAlphabet();
    for (TreeSet&lt;IDSorter&gt; set : model.getSortedWords()) {
      System.out.print("TOPIC: ");
      for (IDSorter s : set) {
        System.out.print(alphabet.lookupObject(s.getID()) + ", ");
      }
      System.out.println();
    }
  }

  public static void main(String[] args) throws Exception {
    Injector injector = Guice.createInjector();
    Lda lda = injector.getInstance(Lda.class);
    lda.printTopics();
  }

}</code></pre>
<p>One of the things I found interesting was that you have to specify a number of topics.  This is where the &#8216;art&#8217; of machine learning comes in.  With some training data this parameter could be tuned to perform better than my random guesses.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/latent-dirichlet-allocation-mallet/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Remote Java debugging in Eclipse</title>
		<link>http://www.benmccann.com/dev-blog/remote-java-debugging-in-eclipse/</link>
		<comments>http://www.benmccann.com/dev-blog/remote-java-debugging-in-eclipse/#comments</comments>
		<pubDate>Wed, 09 Mar 2011 06:20:30 +0000</pubDate>
		<dc:creator>Ben</dc:creator>
				<category><![CDATA[Java]]></category>
		<category><![CDATA[Tips and Tricks]]></category>

		<guid isPermaLink="false">http://www.benmccann.com/dev-blog/?p=434</guid>
		<description><![CDATA[To debug a Java program being run on the command line from Eclipse you can start the Java program in remote debugging mode: java -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y -jar myProgram.jar The program will wait for you to attach the Eclipse debugger to it. Open Eclipse and choose: Run > Debug Configurations... > Remote Java Application > New [...]]]></description>
			<content:encoded><![CDATA[<p>To debug from Eclipse a Java program that is run on the command line, start the program in remote debugging mode:</p>
<pre><code>java -Xdebug -Xrunjdwp:transport=dt_socket,address=8000,server=y,suspend=y -jar myProgram.jar</code></pre>
<p>The program will wait for you to attach the Eclipse debugger to it.  Open Eclipse and choose:</p>
<pre><code>Run > Debug Configurations... > Remote Java Application > New</code></pre>
<p>Make sure to enter the same port that you chose on the command line.  The default is port 8000.  Now hit &#8220;Debug&#8221; and you&#8217;re off!</p>
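<p>On Java 5 and later, the same debug agent can also be loaded with the newer <code>-agentlib:jdwp</code> form, which takes the same sub-options.  Here is a small sketch of a wrapper that builds the flag for a given port.  The function name and the fallback to 8000 are illustrative assumptions:</p>
<pre><code># Hypothetical helper: print the JVM flag for remote debugging on port $1
# (falls back to 8000 when no port is given).
jdwp_flag() {
  port=${1:-8000}
  printf '%s\n' "-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=$port"
}

jdwp_flag 5005
# prints: -agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=5005

# Usage, with the jar from the command above:
#   java "$(jdwp_flag 8000)" -jar myProgram.jar</code></pre>
<p>With <code>suspend=y</code> the JVM waits for the debugger before running <code>main</code>.  Use <code>suspend=n</code> if you would rather attach to an already running program.</p>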
]]></content:encoded>
			<wfw:commentRss>http://www.benmccann.com/dev-blog/remote-java-debugging-in-eclipse/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

