<?xml version="1.0" encoding="UTF-8"?><feed
  xmlns="http://www.w3.org/2005/Atom"
  xmlns:thr="http://purl.org/syndication/thread/1.0"
  xml:lang="en-US"
  xml:base="http://jeffbeard.com/wp-atom.php"
   >
	<title type="text">Jeff Beard</title>
	<subtitle type="text">Blog.blog</subtitle>

	<updated>2016-06-19T17:03:08Z</updated>

	<link rel="alternate" type="text/html" href="http://jeffbeard.com" />
	<id>http://jeffbeard.com/feed/atom/</id>
	<link rel="self" type="application/atom+xml" href="http://jeffbeard.com/feed/atom/" />

	<generator uri="https://wordpress.org/" version="4.8.5">WordPress</generator>
	<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Ubuntu Landscape and MotD integration kills Gitlab SSH performance]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2016/06/ubuntu-landscape-and-motd-integration-kills-gitlab-ssh-performance/" />
		<id>http://jeffbeard.com/?p=518</id>
		<updated>2016-06-19T17:03:08Z</updated>
		<published>2016-06-15T22:23:15Z</published>
		<category scheme="http://jeffbeard.com" term="Computing" /><category scheme="http://jeffbeard.com" term="Infrastructure Management" /><category scheme="http://jeffbeard.com" term="Systems Administration" /><category scheme="http://jeffbeard.com" term="gitlab" /><category scheme="http://jeffbeard.com" term="high load" /><category scheme="http://jeffbeard.com" term="landscape-sysinfo" /><category scheme="http://jeffbeard.com" term="message of the day" /><category scheme="http://jeffbeard.com" term="motd" /><category scheme="http://jeffbeard.com" term="performance" /><category scheme="http://jeffbeard.com" term="slowness" /><category scheme="http://jeffbeard.com" term="ubuntu" />		<summary type="html"><![CDATA[Ubuntu's 'landscape-sysinfo' killed our Gitlab performance]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2016/06/ubuntu-landscape-and-motd-integration-kills-gitlab-ssh-performance/"><![CDATA[<p>I had nothing to do with this discovery but my colleague <a href="https://www.linkedin.com/in/lance-johnston-a45a341">Lance Johnston</a>, who did, felt that we should share it because of a lack of information about the issue on the Internet.</p>
<p>I lead a team at work that, among other things, manages our version control system. We started to have some performance issues with our Gitlab instance as usage increased, and once it started to impact users we decided to restart the Gitlab services first.</p>
<p>We observed a pretty high load when we checked before stopping the services, around 10-12, which we expected to go down when we shut down the services. However, when the services were off, the load did not go down, which was very curious.</p>
<p>Lance investigated, and as he watched &#8216;top&#8217; he observed batches of inbound ssh connections, as one would expect. But whenever the connections arrived, he immediately saw another batch of processes named &#8216;landscape-sysinfo&#8217;. </p>
<p>A little digging turned up some information indicating that whenever a shell is spawned, such as when there&#8217;s an ssh connection, the Message of the Day is presented. The MotD runs the &#8216;landscape-sysinfo&#8217; program in order to collect metrics that are presented to users when they log in. We have literally hundreds of ssh connections at any given time as Jenkins and developers do their jobs, so this program was producing a consistently high load average. </p>
<p>Since the vast majority of ssh connections are not interactive, we disabled the Message of the Day and the load dropped immediately to .01, with the Gitlab services off. When they were turned back on we stabilized around .7 and during the work day it doesn&#8217;t go over 5 during usage spikes.</p>
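<p>How the MotD gets disabled depends on the system. As a minimal sketch, assuming the MotD is triggered by pam_motd entries in /etc/pam.d/sshd (worth checking on your own release), a helper along these lines comments those entries out. The function name disable_pam_motd is made up for illustration and this is not necessarily the exact change described above:</p>
<pre class="brush: bash; title: ; notranslate">
#!/usr/bin/env bash
# Hypothetical helper: comment out every active pam_motd line in a PAM file.
# Assumption: the MotD (and therefore landscape-sysinfo) is started by
# "session ... pam_motd.so" entries in the file passed as the first argument.
disable_pam_motd() {
    local pam_file="$1"
    # Keep a .bak copy next to the file, then prefix matching lines with "# ".
    # Lines that are already commented out do not match and are left alone.
    sed -i.bak -E '/^[[:space:]]*session[[:space:]].*pam_motd\.so/ s/^/# /' "$pam_file"
}
</pre>
<p>Usage, as root: <code>disable_pam_motd /etc/pam.d/sshd</code>. The untouched original is kept alongside with a .bak suffix so the change is easy to revert.</p>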
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2016/06/ubuntu-landscape-and-motd-integration-kills-gitlab-ssh-performance/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2016/06/ubuntu-landscape-and-motd-integration-kills-gitlab-ssh-performance/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Store Time Machine Backups on an Ubuntu Server]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2014/04/store-time-machine-backups-on-an-ubuntu-server/" />
		<id>http://jeffbeard.com/?p=502</id>
		<updated>2014-04-20T22:31:48Z</updated>
		<published>2014-04-19T23:05:11Z</published>
		<category scheme="http://jeffbeard.com" term="Infrastructure Management" /><category scheme="http://jeffbeard.com" term="Personal Computing" /><category scheme="http://jeffbeard.com" term="Systems Administration" /><category scheme="http://jeffbeard.com" term="Uncategorized" />		<summary type="html"><![CDATA[I found this concise article (author&#8217;s claim verified) on setting up Mac OS X Time Machine backups on a network drive. I tried using SMB/CIFS to no avail but setting up a Netatalk share did the trick! Note that I did not modify the Avahi configuration since it wasn&#8217;t necessary to make the share usable [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2014/04/store-time-machine-backups-on-an-ubuntu-server/"><![CDATA[<p>I found this <a href="http://d43.me/blog/1660/concisest-guide-to-setting-up-time-machine-server-on-ubuntu-server-12-04/" title="Concisest guide to setting up Time Machine server on Ubuntu Server 12.04">concise article</a> (author&#8217;s claim verified) on setting up Mac OS X Time Machine backups on a network drive. I tried using SMB/CIFS to no avail but setting up a Netatalk share did the trick! </p>
<p>Note that I did not modify the Avahi configuration since it wasn&#8217;t necessary to make the share usable for backups.</p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2014/04/store-time-machine-backups-on-an-ubuntu-server/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2014/04/store-time-machine-backups-on-an-ubuntu-server/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Processing files from S3 with Cascading]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2013/08/processing-files-from-s3-with-cascading/" />
		<id>http://jeffbeard.org/?p=460</id>
		<updated>2013-08-10T19:05:23Z</updated>
		<published>2013-08-10T19:05:23Z</published>
		<category scheme="http://jeffbeard.com" term="Big Data" /><category scheme="http://jeffbeard.com" term="Computing" /><category scheme="http://jeffbeard.com" term="Data Management" /><category scheme="http://jeffbeard.com" term="Software Development" /><category scheme="http://jeffbeard.com" term="hadoop cascading" />		<summary type="html"><![CDATA[   Cascading is a Hadoop ecosystem framework that provides a higher level abstraction over MapReduce. I recently worked on a Cascading prototype that would read log files from an Amazon Web Services S3 bucket, do a minor transform, land the output in HDFS then move the files to another S3 bucket configured for archiving. Figuring [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2013/08/processing-files-from-s3-with-cascading/"><![CDATA[<p><img class="alignnone" alt="" src="http://docs.cascading.org/cascading/2.1/userguide/htmlsingle/images/cascading-logo.png" width="134" height="125" />   <a href="http://cascading.org">Cascading </a>is a Hadoop ecosystem framework that provides a higher level abstraction over MapReduce. I recently worked on a Cascading prototype that would read log files from an Amazon Web Services S3 bucket, do a minor transform, land the output in HDFS then move the files to another S3 bucket configured for archiving.<br />
<span id="more-460"></span></p>
<p>Figuring out how to get Cascading to read a stream of data from S3 turned out to be a bit tricky since the documentation or example applications didn&#8217;t bring together all the pieces explicitly so this article will capture what I&#8217;ve learned.</p>
<p>The first thing to understand is that Cascading&#8217;s S3 support is really just an extension of Hadoop support for S3.</p>
<p>Secondly, as noted in the <a href="http://wiki.apache.org/hadoop/AmazonS3">Hadoop S3 wiki page</a>, there are two of what I&#8217;ll call &#8220;formats&#8221; that Hadoop supports: HDFS file types and what are called &#8220;native&#8221; files. The former is a file in the same format as it would be stored in HDFS and the latter is what can be thought of as a &#8220;plain old file&#8221;. In the prototypes I&#8217;ve worked on I needed to access native files, which were gzipped, delimited files (Hadoop can process gzipped files natively and Cascading offers an extension that supports zip files too).</p>
<p>In the end the tricky part was finding the properties and understanding the two different URI schemes (s3:// for the HDFS-style block format and s3n:// for native files). After that, it turned out to be very simple to stream files from S3 into a Cascading job. Just set up a Tap with an S3 URI:</p>
<pre class="brush: java; title: ; notranslate">

import java.util.Properties;

import cascading.flow.FlowDef;
import cascading.flow.hadoop.HadoopFlowConnector;
import cascading.pipe.Pipe;
import cascading.property.AppProps;
import cascading.scheme.hadoop.TextDelimited;
import cascading.tap.Tap;
import cascading.tap.hadoop.Hfs;

// ...
public class Main {

    public static void main(String[] args) throws Exception {
        Properties properties = new Properties();
        String accessKey = args[0];
        String secretKey = args[1];

        properties.setProperty(&quot;fs.s3n.awsAccessKeyId&quot;, accessKey);
        properties.setProperty(&quot;fs.s3n.awsSecretAccessKey&quot;, secretKey);

        properties.setProperty(&quot;fs.defaultFS&quot;, &quot;hdfs://localhost:8020/&quot;);
        properties.setProperty(&quot;fs.permissions.umask-mode&quot;, &quot;007&quot;);

        AppProps.setApplicationJarClass( properties, Main.class );

        HadoopFlowConnector flowConnector = new HadoopFlowConnector( properties );

        String input = &quot;s3n://my-bucket/my-log.gz&quot;;

        Tap inTap = new Hfs( new TextDelimited( false, &quot;\t&quot; ), input);

        Pipe copyPipe = new Pipe( &quot;copy&quot; );
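        // Fully qualified HDFS URI: the authority must be the NameNode, not the first path segment.
        Tap outTap = new Hfs( new TextDelimited( false, &quot;,&quot; ), &quot;hdfs://localhost:8020/tmp/output&quot;);

        FlowDef flowDef = FlowDef.flowDef()
            .addSource( copyPipe, inTap )
            .addTailSink( copyPipe, outTap );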

        flowConnector.connect( flowDef ).complete();
    }
}
</pre>
<p>The code above would stream a tab delimited file directly from S3 and output it to the HDFS folder /tmp/output as a comma separated file.</p>
<p>I should also note that this code can be run on Elastic MapReduce as well, so the data never has to leave the cloud.</p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2013/08/processing-files-from-s3-with-cascading/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2013/08/processing-files-from-s3-with-cascading/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Netalyzr &#8211; Network debugging tool]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2012/01/netalyzr-network-debugging-tool/" />
		<id>http://jeffbeard.org/?p=440</id>
		<updated>2012-01-01T17:51:29Z</updated>
		<published>2012-01-01T17:51:29Z</published>
		<category scheme="http://jeffbeard.com" term="Infrastructure Management" />		<summary type="html"><![CDATA[I&#8217;ve had a transient issue with my Internet access randomly &#8220;going away&#8221;. It&#8217;s annoying but generally clears up within a minute or two. I came across a tool called Netalyzr by a group within UC Berkeley. Netalyzr is a Java application available as either an in-browser Applet or a command line utility. It runs a [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2012/01/netalyzr-network-debugging-tool/"><![CDATA[<p>I&#8217;ve had a transient issue with my Internet access randomly &#8220;going away&#8221;. It&#8217;s annoying but generally clears up within a minute or two. I came across a tool called <a href="http://netalyzr.icsi.berkeley.edu/index.html">Netalyzr</a> by a group within UC Berkeley. Netalyzr is a Java application available as either an in-browser Applet or a command line utility. It runs a number of network connectivity tests and provides a detailed report hosted on their web site that uses a simple red/yellow/green motif to show problems and their relative importance.</p>
<p>While Netalyzr didn&#8217;t clearly show what was going on with my Internet connection it did raise a red flag about network buffers that might be the issue. Unfortunately, that&#8217;s a router configuration issue on the part of my ISP so I&#8217;m not hopeful for a resolution. But I can always gather data then open a trouble ticket with the vendor. </p>
<p>Regardless, Netalyzr looks like a great tool for troubleshooting connectivity issues.</p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2012/01/netalyzr-network-debugging-tool/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2012/01/netalyzr-network-debugging-tool/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Prey Project, ping and Cygwin]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2011/09/prey-project-ping-and-cygwin/" />
		<id>http://jeffbeard.org/?p=411</id>
		<updated>2012-05-29T13:42:02Z</updated>
		<published>2011-09-10T16:08:48Z</published>
		<category scheme="http://jeffbeard.com" term="Computing" /><category scheme="http://jeffbeard.com" term="Personal Computing" />		<summary type="html"><![CDATA[File this with the obscure issue department&#8230; The Prey Project looked like a nice system for tracking stolen devices and has gotten a lot of good press recently. I decided to try it out. After getting everything setup and working I noticed a lot of Cygwin bash shells running the ping command. The commands accumulated [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2011/09/prey-project-ping-and-cygwin/"><![CDATA[<p>File this with the obscure issue department&#8230;</p>
<p>The <a href="http://www.preyproject.com" title="Prey Project">Prey Project</a> looked like a nice system for tracking stolen devices and has gotten a lot of good press recently. I decided to try it out. After getting everything set up and working I noticed a lot of <a href="http://www.cygwin.com" title="Cygwin">Cygwin</a> bash shells running the ping command. The commands accumulated, eventually degrading system performance, which is when I noticed.</p>
<p>Prey has a partial UNIX environment (<a href="http://www.mingw.org/" title="MingW">MingW</a>) contained in it and consists of shell scripts wrapping a number of UNIX utilities compiled for Windows. I say partial because it doesn&#8217;t include the &#8220;ping&#8221; command, which is a dependency for the software. And the shell scripts apparently don&#8217;t take into account the potential for a user having other UNIX-like environments installed (Cygwin also has a bash shell and the ping command but there are <a href="http://www.mkssoftware.com/products/">others</a> <a href="http://unxutils.sourceforge.net/">as well</a>.) So what was happening is that the script (<a href="https://github.com/tomas/prey/blob/master/core/pull">pull</a>) naively looks at what operating system it is installed on, looks for a ping command, and issues what it believes are the correct command line arguments. For Windows it&#8217;s this:</p>
<pre class="brush: bash; title: ; notranslate">
ping -n 1 www.google.com
</pre>
<p>This doesn&#8217;t work because Cygwin&#8217;s ping.exe doesn&#8217;t have a &#8220;-n&#8221; switch. But for some reason it doesn&#8217;t fail when it encounters an invalid option. Rather, it tries to ping the IP address 0.0.0.1. This doesn&#8217;t work, of course, but the ping command tries forever, and new instances of the bash shell and ping keep being spawned until they kill your computer.</p>
<p>Anyway, I hard coded a change to the script on my system and filed a <a href="https://github.com/prey/prey-bash-client/issues/216">bug</a> with the Prey developers. </p>
<p>I also submitted an email to the Cygwin mailing list describing the Cygwin ping issue.</p>
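<p>Independently of which ping ends up being called, one defensive pattern is to run the probe under a hard deadline so that a hung process cannot linger and pile up. The following is only a sketch, not code from Prey: the function name probe_with_deadline is hypothetical and it assumes the <code>timeout</code> utility from GNU coreutils is available in the environment:</p>
<pre class="brush: bash; title: ; notranslate">
#!/usr/bin/env bash
# Hypothetical guard, not taken from Prey: run a command under a deadline.
# Assumption: `timeout` from GNU coreutils is on the PATH.
probe_with_deadline() {
    local seconds="$1"
    shift
    # timeout kills the command once the deadline passes and then exits
    # with status 124; otherwise the command's own exit status is returned.
    timeout "$seconds" "$@"
}
</pre>
<p>Usage: <code>probe_with_deadline 5 ping -n 1 www.google.com</code>. A status of 124 means the probe had to be killed, which the calling script can treat the same as a failed ping.</p>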
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2011/09/prey-project-ping-and-cygwin/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2011/09/prey-project-ping-and-cygwin/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Quick-n-dirty git getting started guide]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2011/09/quick-n-dirty-git-getting-started-guide/" />
		<id>http://jeffbeard.org/?p=404</id>
		<updated>2011-09-20T13:28:02Z</updated>
		<published>2011-09-07T21:24:17Z</published>
		<category scheme="http://jeffbeard.com" term="Software Development" />		<summary type="html"><![CDATA[As a git neophyte I approve of this post: http://news.ycombinator.com/item?id=2970637 UPDATE: I also found this helpful site: gitref.org]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2011/09/quick-n-dirty-git-getting-started-guide/"><![CDATA[<p>As a git neophyte I approve of this post:</p>
<p><a href="http://news.ycombinator.com/item?id=2970637">http://news.ycombinator.com/item?id=2970637</a></p>
<p>UPDATE: I also found this helpful site: <a href="http://gitref.org/">gitref.org</a></p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2011/09/quick-n-dirty-git-getting-started-guide/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2011/09/quick-n-dirty-git-getting-started-guide/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Lighting a fire under WordPress]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2011/09/lighting-a-fire-under-wordpress/" />
		<id>http://jeffbeard.org/?p=317</id>
		<updated>2011-09-07T21:25:53Z</updated>
		<published>2011-09-05T17:30:08Z</published>
		<category scheme="http://jeffbeard.com" term="Infrastructure Management" /><category scheme="http://jeffbeard.com" term="9 million hits per day" /><category scheme="http://jeffbeard.com" term="memcached" /><category scheme="http://jeffbeard.com" term="nginx" /><category scheme="http://jeffbeard.com" term="php-fpm" /><category scheme="http://jeffbeard.com" term="varnish" /><category scheme="http://jeffbeard.com" term="wordpress" />		<summary type="html"><![CDATA[Learn how to light a fire under Wordpress and sustain 9 million hits a day with nginx, PHP-FPM, memcached and varnish. ]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2011/09/lighting-a-fire-under-wordpress/"><![CDATA[<p>Since I moved my personal web site from <a title="Apache Roller" href="http://rollerweblogger.org/project/">Roller</a> to <a title="Wordpress" href="http://www.wordpress.org">WordPress</a> a couple of years ago, my web site had been a dog. After reading an <a title="9 Million Hits per day with 120 megs RAM" href="http://tumbledry.org/2011/08/31/9_million_hits_day_with_120">article</a> about a PHP-based web site configured to support 9 million hits per day, and knowing through experience that my site should be significantly faster, I decided it was time to light a fire under WordPress.</p>
<p><strong><em>(Note that I&#8217;ve included gists at the bottom of the article with the important configuration files.)</em></strong><br />
<span id="more-317"></span><br />
I was using a <a title="Slicehost" href="http://www.slicehost.com">Slicehost</a> slice with a typical Apache/mod_php configuration but there wasn&#8217;t enough memory so it would start swapping with a little use which caused frequent outages. But rather than upgrade to the next sized Slice, I found that I could double my RAM for the same money simply by moving to <a title="Linode" href="http://www.linode.com/">Linode</a>. So that was the first change I made. (FWIW, I&#8217;m not suggesting this as a performance enhancement but it&#8217;s definitely a better value.)</p>
<p>Next was a series of changes, some of which were noted in the Tumbledry article, some not. The Tumbledry article was thin on details so I did the research myself and came up with a number of articles, the best being this <a title="Running WordPress with nginx, php-fpm, apc and varnish" href="http://www.cryptkcoding.com/2011/08/running-wordpress-with-nginx-php-fpm-apc-and-varnish/">article</a> on setting up nginx, PHP-FPM, APC, memcached and the W3 Total Cache WordPress plugin. The cryptkcoding article shows how to set up an <a title="Ubuntu" href="http://www.ubuntu.com">Ubuntu</a> Linux system for seriously fast WordPress performance that consumes incredibly few system resources.</p>
<p>First, an ease of use feature that I discovered: someone did a build of PHP 5.3.8 for Ubuntu 10.04 LTS. Since PHP 5.3 includes PHP-FPM, you can keep everything package based. </p>
<p>To use these packages add these lines to /etc/apt/sources.list:</p>
<pre>deb http://ppa.launchpad.net/brianmercer/php/ubuntu lucid main
deb-src http://ppa.launchpad.net/brianmercer/php/ubuntu lucid main</pre>
<p>After updating the sources list, run this command to update the apt cache:</p>
<pre class="brush: bash; title: ; notranslate">
sudo apt-get update
</pre>
<p>(For more details on using these packages, as well as setting up a similar system, check out <a href="http://www.howtoforge.com/installing-php-5.3-nginx-and-php-fpm-on-ubuntu-debian">this</a> HowToForge article. In particular, there are some useful comments at the bottom.)</p>
<p>Anyway, one of the main features of this setup was swapping out the <a href="http://httpd.apache.org">Apache web server</a> for nginx and PHP-FPM. Like most PHP developers, I have treated Apache and mod_php as the default setup for PHP applications for years. However, I can now vouch for the nginx/PHP-FPM combo as both a stable and fast production environment. (I will try out this combo for development on my next PHP project to see how it works.) </p>
<p>Importantly, the system now uses a UNIX socket for the connection between the web and application servers rather than TCP/IP. That means that for the core application and web services there are two UNIX sockets used: one between the web server and the application server, and another between the application server and the database server. (MySQL clients use the UNIX socket when the &#8220;localhost&#8221; host name is used or the host name is blank.)</p>
<p>Anyway, to really see the difference the architecture changes made, I used a <a href="http://blitz.io">blitz.io</a> Rush to hammer the two instances.</p>
<p>First up was the old Slicehost system. This is the Rush configuration I used (same as Tumbledry):</p>
<pre>--pattern 1-250:60 -T 4000 -r california</pre>
<p>The result: this rendered the system completely unresponsive and required a hard reboot. Here&#8217;s a shot of &#8220;top&#8221; before the system stopped responding. Note all the memory being consumed, load on the way up and lots of Apache processes:</p>
<p><a href="http://jeffbeard.org/wp-content/uploads/2011/09/jeffbeard-top-1.jpg"><img class="alignleft size-full wp-image-327" title="jeffbeard-top-1" src="http://jeffbeard.org/wp-content/uploads/2011/09/jeffbeard-top-1.jpg" alt="" width="595" height="492" /></a></p>
<p>Oh no! The system just died:<br />
<a href="http://jeffbeard.org/wp-content/uploads/2011/09/jeffbeard-org-rush.jpg"><img class="alignleft size-full wp-image-331" title="jeffbeard-org-rush" src="http://jeffbeard.org/wp-content/uploads/2011/09/jeffbeard-org-rush.jpg" alt="" width="540" height="483" /></a></p>
<p>Next was the new Linode hosted solution. The result is that the new architecture sustained the Rush with virtually zero CPU usage (I&#8217;m not kidding) or any changes to memory usage. Varnish takes most of the load.</p>
<p><a href="http://jeffbeard.org/wp-content/uploads/2011/09/jeffbeard-org-rush-1.jpg"><img class="size-full wp-image-323 alignnone" title="jeffbeard-org-rush-1" src="http://jeffbeard.org/wp-content/uploads/2011/09/jeffbeard-org-rush-1.jpg" alt="blitz.io Rush Graph for jeffbeard.org" width="437" height="389" /></a></p>
<p>So as it turns out, I can also serve up 9 million hits per day from a small (512MB RAM), inexpensive ($20 per month) virtual server.</p>
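<p>For a sense of scale, a daily figure like that can be converted into an average per-second rate. This is just a rough sanity check of the arithmetic, not part of the test setup, and the function name hits_per_second is made up for illustration:</p>
<pre class="brush: bash; title: ; notranslate">
#!/usr/bin/env bash
# Convert a daily hit count into an average per-second rate.
# Shell arithmetic is integer-only, so the result is rounded down.
hits_per_second() {
    local hits_per_day="$1"
    local seconds_per_day=$(( 24 * 60 * 60 ))   # 86400
    echo $(( hits_per_day / seconds_per_day ))
}

hits_per_second 9000000   # prints 104
</pre>
<p>In other words, 9 million hits per day works out to a sustained average of a little over 100 requests per second.</p>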
<p>Here are the important configuration files:</p>
<p>nginx.conf:<br />
<script src="https://gist.github.com/1195248.js"> </script></p>
<p>nginx virtual host config:<br />
<script src="https://gist.github.com/1195299.js"> </script></p>
<p>php5-fpm.conf:<br />
<script src="https://gist.github.com/1195269.js"> </script></p>
<p>varnish:<br />
<script src="https://gist.github.com/1195393.js"> </script></p>
<p>wordpress.vcl (varnish site config):<br />
<script src="https://gist.github.com/1195286.js"> </script></p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2011/09/lighting-a-fire-under-wordpress/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2011/09/lighting-a-fire-under-wordpress/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Tip for optimizing MySQL data types]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2011/06/tip-for-optimizing-mysql-data-types/" />
		<id>http://jeffbeard.org/?p=227</id>
		<updated>2012-12-11T05:48:32Z</updated>
		<published>2011-06-28T15:08:48Z</published>
		<category scheme="http://jeffbeard.com" term="MySQL" /><category scheme="http://jeffbeard.com" term="Software Development" /><category scheme="http://jeffbeard.com" term="data management" /><category scheme="http://jeffbeard.com" term="database" /><category scheme="http://jeffbeard.com" term="mysql" /><category scheme="http://jeffbeard.com" term="optimizing data types" /><category scheme="http://jeffbeard.com" term="refactoring databases" />		<summary type="html"><![CDATA[This is a tip that I&#8217;ve kept forgetting to write down so here it is: During a system&#8217;s life cycle, requirements change and components are refactored. This includes databases as well, and particularly as data grows. Decisions and assumptions are made at the beginning of a system&#8217;s life cycle that may or may not hold [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2011/06/tip-for-optimizing-mysql-data-types/"><![CDATA[<p>This is a tip that I&#8217;ve kept forgetting to write down so here it is:</p>
<p>During a system&#8217;s life cycle, requirements change and components are refactored. This includes databases as well, and particularly as data grows. Decisions and assumptions are made at the beginning of a system&#8217;s life cycle that may or may not hold up over years of operation and it&#8217;s good practice to continually analyze how well the initial design is working.<br />
<span id="more-227"></span><br />
When doing analysis in support of refactoring database schemas in MySQL, I&#8217;ve found this little bit of SQL to be invaluable.</p>
<p>Code:</p>
<pre class="brush: sql; title: ; notranslate">
-- Replace my_table with the name of the table to inspect.
SELECT * FROM my_table PROCEDURE ANALYSE();
</pre>
<p>(I suggest giving it a try on a small table with few rows.)</p>
<p><a href="http://dev.mysql.com/doc/refman/5.6/en/procedure-analyse.html">PROCEDURE ANALYSE</a> interrogates the values in a table, shows the smallest and largest values and suggests a type for each column. While the results frequently indicate that an ENUM type is the most appropriate you can add arguments to the ANALYSE procedure to get more rational suggestions. However, even with no arguments the results can be useful.</p>
<p>For example, you might see that the max value of a column is actually much smaller than the type it&#8217;s using allows. I&#8217;ve frequently seen INT(11) columns that would work fine as a MEDIUMINT or even TINYINT. Or you might find that an ENUM type is better since the number of distinct values in a VARCHAR column is small. (The benefit of an ENUM is that the data is stored as an integer rather than the string value so its footprint on the disk can be significantly smaller).</p>
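<p>On the arguments mentioned above: ANALYSE takes two optional arguments, max_elements and max_memory (defaulting to 256 and 8192), and ENUM is not suggested for a column with more than max_elements distinct values. As a small sketch, a shell helper can build the statement and pipe it to the mysql client. The function name analyse_sql and the table and database names in the usage line are placeholders, not anything from a real schema:</p>
<pre class="brush: bash; title: ; notranslate">
#!/usr/bin/env bash
# Hypothetical helper: print a PROCEDURE ANALYSE statement for one table.
# $1 = table name, $2 = max_elements (default 256), $3 = max_memory (default 8192)
analyse_sql() {
    local table="$1" max_elements="${2:-256}" max_memory="${3:-8192}"
    # Backticks quote the table name as a MySQL identifier.
    printf 'SELECT * FROM `%s` PROCEDURE ANALYSE(%d, %d);\n' "$table" "$max_elements" "$max_memory"
}
</pre>
<p>Usage: <code>analyse_sql orders 16 256 | mysql my_database</code>. Lowering max_elements to 16 here means ENUM is only suggested for columns with at most 16 distinct values.</p>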
<p>Anyway, while it&#8217;s not a panacea, PROCEDURE ANALYSE() is another helpful tool.</p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2011/06/tip-for-optimizing-mysql-data-types/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2011/06/tip-for-optimizing-mysql-data-types/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[MySQL udf_median on Windows 7 64bit]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2011/05/mysql-udf_median-on-windows-7-64bit/" />
		<id>http://jeffbeard.org/?p=290</id>
		<updated>2011-11-24T03:11:40Z</updated>
		<published>2011-05-21T16:00:08Z</published>
		<category scheme="http://jeffbeard.com" term="Information Technology" /><category scheme="http://jeffbeard.com" term="MySQL" /><category scheme="http://jeffbeard.com" term="mysql" /><category scheme="http://jeffbeard.com" term="udf" /><category scheme="http://jeffbeard.com" term="udf on windows" /><category scheme="http://jeffbeard.com" term="udf_median" /><category scheme="http://jeffbeard.com" term="windows 7 64bit" />		<summary type="html"><![CDATA[In a minor but ongoing saga of supporting the venerable MySQL UDF function udf_median, I can now add a HOWTO for building it on Windows 7 x64 using Microsoft Visual C++ Express 2010. I should point out my previous article on the subject since there are parts of it that are still applicable. This is [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2011/05/mysql-udf_median-on-windows-7-64bit/"><![CDATA[<p>In a minor but ongoing saga of supporting the venerable MySQL UDF function udf_median, I can now add a HOWTO for building it on Windows 7 x64 using Microsoft Visual C++ Express 2010.<br />
<span id="more-290"></span></p>
<p>I should point out <a href="http://jeffbeard.org/2011/04/mysql-udf_median-on-windows">my previous article</a> on the subject since there are parts of it that are still applicable.</p>
<p>This is likely applicable to other MySQL UDFs as well but I haven&#8217;t tried.</p>
<p>I used the MySQL 5.1.57 x64 version for my system and I downloaded the zip archive rather than the installer. (Note that the server can still be installed as a service but you will need to run the cmd.exe program as an administrator in order to run the command line installation process.)</p>
<p>I also used Microsoft Visual C++ Express 2010 for this and my Windows version is Windows 7 Ultimate x64.</p>
<p>In addition to VS C++, you will also need the Windows SDK for Windows 7 which you can download <a href="http://www.microsoft.com/downloads/dlx/en-us/listdetailsview.aspx?FamilyID=6b6c21d2-2006-4afa-9702-529fa782d63b">here</a>. This addition is critical since it contains the x64 compiler and other tools.</p>
<p>When you have everything installed, follow the instructions on Roland Bouman&#8217;s <a href="http://rpbouman.blogspot.com/2007/09/creating-mysql-udfs-with-microsoft.html">blog post</a> about building UDFs on Windows but stop before building and installing the function then follow these steps to 64bit glory (if you are specifically using udf_median, you might also be interested in <a href="http://www.mooreds.com/wordpress/archives/376">this post</a>):</p>
<ol>
<li>Right-click on the project (not the solution), choose &#8220;Properties&#8221;. At the top of the dialog, from the &#8220;Configuration&#8221; dropdown, select &#8220;All Configurations&#8221;.</li>
<li>Next expand &#8220;Configuration Properties&#8221; then select &#8220;General&#8221;. In the field on the right labeled &#8220;Platform Toolset&#8221; make sure the value &#8220;Windows7.1SDK&#8221; is selected.</li>
<li>Now let&#8217;s make it x64. At the top of the dialog box, click on the &#8220;Configuration Manager&#8221; button.</li>
<li>In the resultant grid, select the &#8220;Platform&#8221; dropdown and choose &#8220;New&#8230;&#8221;. When the &#8220;New Project Platform&#8221; dialog comes up, select x64 from the top dropdown, then click &#8220;OK&#8221;, then &#8220;Close&#8221;, then close the Properties dialog.</li>
<li>Now you can try building the project.</li>
<li>If the build is successful, you will find the .dll, in my case udf_median.dll, in the &#8220;Debug&#8221; or &#8220;Release&#8221; folder. Put that in the MySQL lib/plugin/ directory.</li>
<li>Install the plugin using SQL like this:<br />
           <code><br />
               CREATE AGGREGATE FUNCTION median RETURNS REAL SONAME 'udf_median.dll';<br />
           </code>
        </li>
</ol>
<p>And that should be it. </p>
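<p>To confirm that the function registered, a quick check from the command line might look like the following. This is only a sketch: the <code>test.items</code> table and its <code>price</code> column are hypothetical stand-ins for your own data, and it assumes the <code>mysql</code> client is on your path:</p>
<p><code><br />
mysql -u root -p -e "SELECT name, dl, type FROM mysql.func WHERE name = 'median'"<br />
mysql -u root -p -e "SELECT median(price) FROM test.items"<br />
</code></p>
<p>The first query should list the function with udf_median.dll as its library; the second simply exercises it.</p>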
<p>Here are a few things that I came across while setting up my project that might be helpful:</p>
<ol>
<li>In one instance the linker reported this error:<br />
            <code>1>LINK : fatal error LNK1104: cannot open file 'kernel32.lib'</code><br />
         The fix for this was making sure that the &#8220;Platform Toolset&#8221; in the project configuration properties was set to &#8220;Windows7.1SDK&#8221;.
        </li>
<li>If you get this error:<br />
        <code>ERROR 1126 (HY000): Can't open shared library 'file.dll'</code><br />
        this can sometimes mean that the .dll wasn&#8217;t compiled correctly (i.e. this is what you get when you try to use a 32bit .dll with a 64bit server) or the symbol exports didn&#8217;t work. I used the dumpbin program to make sure that the functions that needed to be exported were. Under the &#8220;Tools&#8221; menu in VC++, select &#8220;Visual Studio Command Prompt&#8221; then navigate to the directory with your .dll file and run this command:<br />
          <code>dumpbin  /exports udf_median.dll</code><br />
You should see the exported functions in the output.
        </li>
<li>I did not have to alter any of the C code to make this work even though I saw some comments on Roland&#8217;s post indicating that might be necessary.
        </li>
</ol>
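<p>For the 32bit versus 64bit case above, dumpbin can also report which architecture the .dll was built for. As a rough check from the same Visual Studio Command Prompt (the exact output wording may vary between toolset versions):</p>
<p><code>dumpbin /headers udf_median.dll | findstr /i machine</code></p>
<p>A 64bit build should report an x64 machine type, while a 32bit build reports x86.</p>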
<p>That&#8217;s about it. Feel free to contact me if you have any questions or if the steps didn&#8217;t work for you.</p>
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2011/05/mysql-udf_median-on-windows-7-64bit/#comments" thr:count="6"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2011/05/mysql-udf_median-on-windows-7-64bit/feed/atom/" thr:count="6"/>
		<thr:total>6</thr:total>
		</entry>
		<entry>
		<author>
			<name>Jeff</name>
						<uri>http://jeffbeard.org</uri>
					</author>
		<title type="html"><![CDATA[Intercept HTTP requests with Squid]]></title>
		<link rel="alternate" type="text/html" href="http://jeffbeard.com/2011/04/intercept-http-requests-with-squid/" />
		<id>http://jeffbeard.org/?p=260</id>
		<updated>2011-04-20T14:33:02Z</updated>
		<published>2011-04-20T13:08:53Z</published>
		<category scheme="http://jeffbeard.com" term="Systems Administration" /><category scheme="http://jeffbeard.com" term="linux" /><category scheme="http://jeffbeard.com" term="log requests" /><category scheme="http://jeffbeard.com" term="log web traffic" /><category scheme="http://jeffbeard.com" term="squid" /><category scheme="http://jeffbeard.com" term="transparent proxy" />		<summary type="html"><![CDATA[On one of my projects we had some questions about how much bandwidth was being used by requests to a third party service but we didn&#8217;t have a view beyond general traffic on the network interface. I hit upon the idea of using a transparent proxy to log requests then use log analysis to [&#8230;]]]></summary>
		<content type="html" xml:base="http://jeffbeard.com/2011/04/intercept-http-requests-with-squid/"><![CDATA[<p>On one of my projects we had some questions about how much bandwidth was being used by requests to a third party service but we didn&#8217;t have a view beyond general traffic on the network interface. I hit upon the idea of using a transparent proxy to log requests then use log analysis to break out data transfer amounts per third party service. And since we already had <a href="http://www.squid-cache.org/">squid</a> as part of our infrastructure applications it seemed like a good choice. </p>
<p>The tricky part of this setup is that everything is hosted on the same hardware node and we also have some web services that needed to be left untouched. These requirements implied some network configuration using <a href="http://www.netfilter.org/">iptables</a> to force outbound web requests through the proxy.<br />
<span id="more-260"></span><br />
So the first thing I needed to do was install squid. On this project we use <a href="http://www.centos.org/">CentOS</a> on all our hosts so this was easily accomplished like this:</p>
<p><code>sudo yum install squid</code></p>
<p>Next was adjusting the configuration. The default squid.conf comes with lots of documentation which is helpful but makes the configuration file difficult to read and navigate so the first thing I did was get rid of it like so:<br />
<code><br />
cd /etc/squid && sudo cp squid.conf squid.conf.orig && sudo egrep -v '^#' squid.conf > /tmp/squid.conf<br />
</code></p>
<p>This leaves a lot of empty lines in the file which can be removed like this:</p>
<p><code><br />
sudo sed '/^$/d' /tmp/squid.conf > /tmp/squid.clean && sudo mv /tmp/squid.clean squid.conf<br />
</code></p>
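<p>As a sketch, the two steps could also be collapsed into a single pass that drops comment lines and empty lines together. It assumes the backup copy squid.conf.orig made above:</p>
<p><code><br />
cd /etc/squid<br />
sudo grep -Ev '^(#|$)' squid.conf.orig > /tmp/squid.clean<br />
sudo mv /tmp/squid.clean squid.conf<br />
</code></p>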
<p>Next up was setting up networking and squid.</p>
<p>The squid site has a great set of examples, <a href="http://wiki.squid-cache.org/ConfigExamples/Intercept/LinuxLocalhost">one of which</a> looked like it suited my purposes nicely. </p>
<p>First I configured squid by adding this directive:</p>
<p><code>http_port 3128 transparent</code></p>
<p>Then I started squid:</p>
<p><code>sudo service squid start</code></p>
<p>I also wanted to make sure squid starts when the system is rebooted:</p>
<p><code>sudo chkconfig --levels 2345 squid on</code></p>
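<p>A quick way to confirm that squid is up and listening on the configured port, assuming the net-tools <code>netstat</code> is installed:</p>
<p><code>sudo netstat -tlnp | grep 3128</code></p>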
<p>Next up was network configuration. </p>
<p>I needed to set up iptables with some NAT rules to force requests through the proxy server. The first command clears out any existing NAT rules. If you already have a custom kernel network config, use this with caution:</p>
<p><code>sudo iptables -t nat -F</code></p>
<p>The next rule is for a typical transparent proxy setup. I did not need it in the setup I was working with, something I discovered when running this command disabled the existing web sites. So if you have a web server on the same host, <strong>DO NOT</strong> do this:</p>
<p><code><br />
sudo iptables -t nat -A PREROUTING -p tcp -i eth0 --dport 80 -j REDIRECT --to-port 3128<br />
</code></p>
<p>Here is the start of the iptables configuration we implemented. </p>
<p>Apply the rules to force local HTTP traffic through the transparent proxy:</p>
<p><code><br />
gid=`id -g squid`<br />
sudo iptables -t nat -A OUTPUT -p tcp --dport 80 -m owner --gid-owner $gid -j ACCEPT<br />
sudo iptables -t nat -A OUTPUT -p tcp --dport 80 -j DNAT --to-destination HOSTIP:3128<br />
</code></p>
<p>Replace the string &#8220;HOSTIP&#8221; with the IP address of the host you&#8217;re configuring.</p>
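<p>The ACCEPT rule comes first so that squid&#8217;s own outbound requests, which run under the squid group, are not redirected back into the proxy in a loop. To review the result and, on CentOS, keep it across reboots, something like this should work (<code>service iptables save</code> assumes the stock iptables init script):</p>
<p><code><br />
sudo iptables -t nat -L OUTPUT -n -v --line-numbers<br />
sudo service iptables save<br />
</code></p>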
<p>At this point I needed to test the setup so I tailed the access log. Or at least I tried to. The default directory permissions on the /var/log/squid directory prevented me from viewing its contents. I fixed that with this:</p>
<p><code>sudo chmod 0775 /var/log/squid</code></p>
<p>Then I was able to tail the /var/log/squid/access.log. So I created a request to see if it was logged:</p>
<p><code>wget http://www.google.com</code></p>
<p>I saw the request logged in the squid access log so I was satisfied that it and the networking were functional. However, the log format wasn&#8217;t what we needed to feed to <a href="http://awstats.sourceforge.net/">awstats</a>, which is what I was going to use to process the log.</p>
<p>Since we already use squid and process its logs, I grabbed the configuration from our production configuration file:</p>
<p><code><br />
logformat combined %>a %ui %un [%{%d/%b/%Y:%H:%M:%S %z}tl] "%rm %ru HTTP/%rv" %Hs %<st "%{User-Agent}>h" %Ss:%Sh<br />
access_log /var/log/squid/access.log combined<br />
</code></p>
<p>Then I restarted squid and tested again. It looked good, so I tested one of our batch processes that makes HTTP requests to make sure that it did what I wanted. It did; however, I noticed that query strings from the URI were not being logged. A quick google told me that I needed to update the squid.conf with this:</p>
<p><code>strip_query_terms off</code></p>
<p>As it turns out, squid strips the query string after the &#8220;?&#8221; by default. This is apparently to &#8220;protect privacy&#8221; but we needed the query string to identify individual requests more accurately. </p>
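<p>To check the change, one option is to restart squid, make a request that carries a query string, and look at the last log entry. The search URL here is just an example request:</p>
<p><code><br />
sudo service squid restart<br />
wget -O /dev/null "http://www.google.com/search?q=squid"<br />
sudo tail -n 1 /var/log/squid/access.log<br />
</code></p>
<p>The logged URL should now include everything after the &#8220;?&#8221;.</p>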
<p>At this point I had the system set up and working. It logged all the outbound HTTP requests and the existing web services remained unaffected. All that was left to do was set up awstats to process the logs.</p>
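<p>Before awstats is in place, a rough per-host byte count can be pulled straight from the log with awk. This is a stand-in sketch, not the awstats setup: it assumes the logformat shown above (so the URL is field 7 and the reply size is field 10) and plain http:// URLs:</p>
<p><code><br />
sudo awk '{ split($7, u, "/"); bytes[u[3]] += $10 } END { for (h in bytes) print bytes[h], h }' /var/log/squid/access.log | sort -rn<br />
</code></p>
<p>Each output line is a byte total followed by the host name, largest first.</p>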
]]></content>
			<link rel="replies" type="text/html" href="http://jeffbeard.com/2011/04/intercept-http-requests-with-squid/#comments" thr:count="0"/>
		<link rel="replies" type="application/atom+xml" href="http://jeffbeard.com/2011/04/intercept-http-requests-with-squid/feed/atom/" thr:count="0"/>
		<thr:total>0</thr:total>
		</entry>
	</feed>
