<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">
<channel>
	<title>Website and PHP performance on xarg</title>
	<link>http://www.xarg.org/</link>
	<description>OpenSource and PHP Blog</description>

	<language>en</language>
	
		<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/exarg" type="application/rss+xml" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /><item>
		<title>Performant PHP work with numbers</title>
		<link>http://feedproxy.google.com/~r/exarg/~3/e1UnK8hEHug/</link>
		<comments>http://www.xarg.org/2009/10/performant-php-work-with-numbers/#comments</comments>
		<pubDate>Sun, 25 Oct 09 18:48:26 +0100</pubDate>
		<dc:creator>Robert Eisele</dc:creator>
				<category><![CDATA[PHP]]></category>

				<category><![CDATA[optimize]]></category>

				<category><![CDATA[Numbers]]></category>

				<category><![CDATA[performance]]></category>

				<guid isPermaLink="false">http://www.xarg.org/2009/10/performant-php-work-with-numbers/</guid>

		<description><![CDATA[PHP is a typeunsafe programming language and when you are looking over the source of beginners you'll see that they work with string functions on numbers and other ugly things. In most cases this will not cause an error but it's really blotted and argue a really bad programming style. I don't say that this way is bad in every case. To calculate the length of a number for example. But there are a few ways to achieve our desired goal and what's the best way? The mathematical way is that we divide the logarithm of the number by the logarithm of 10: log($n) / log(10). Now we can ceil() this result and have the number of digits of $n. Sure, the faster way is adding one and cast to int. But let's have a look on a few implementations and their timings:

$x = ceil(log($num) / log(10));

0.029807 sec
0.02982 sec
0.029742 sec
0.029871 sec
0.02988 sec

This is a good starting position and we have now a reference point. So let's search for a better solution by using the already mentoined usage of string functions on numbers:

$x = strlen($num);

0.017769 sec
0.017727 sec
0.0177 sec
0.017697 sec
0.017742 sec

Okay, we are better, but let's see, if we can optimize this, by pre casting the number to a string:

$tmp = (string)$num;
$x = strlen($tmp);

0.009112 sec
0.009172 sec
0.009155 sec
0.009156 sec
0.009155 sec

...yeah. This looks good for the moment. Now that we know, how we can calculate the length of a number, let's try to split the number digit by digit and save the result in an array. We will look over 4 ways of doing this job here:

1. Division
$arr = array();
for($i = 0; (int)$num &gt; 0; $i++) {
	$arr[] = $num % 10;
	$num/= 10;
}

2. Access the digits by brackets
$arr = array();
for($i = strlen($num) - 1; $i &gt;= 0 ; $i--) {
	$arr[] = $num{$i};
}

3. Access the digits with substr()
$arr = array();
for($i = strlen($num) - 1; $i &gt;= 0 ; $i--) {
	$arr[] = substr($num, $i, 1);
}

4. Access the digits with substr(), optimized
$arr = array();
$tmp = (string)$num;
for($i = strlen($tmp) - 1; $i &gt;= 0 ; $i--) {
	$arr[] = substr($tmp, $i, 1);
}

So, what do you think is the best? Lets look over the timing table:

div:
0.010299 sec
0.010278 sec
0.010207 sec
0.0102 sec
0.010209 sec

brackets:
0.103473 sec
0.103237 sec
0.10262 sec
0.102984 sec
0.102821 sec

substr:
0.217405 sec
0.221709 sec
0.217226 sec
0.217958 sec
0.21719 sec

substr opt:
0.142615 sec
0.139224 sec
0.143382 sec
0.139244 sec
0.139225 sec

Conclusion: If you want to do such work on numbers, you should cast the number to a string before you access any digits. Thereby you will save the conversation from a number to string for every loop. In general, we can state, that you should work as properly as you work with typesafe languages also under PHP to get the best performance.
Sure, the savings are really slim and all benchmarks were run 1000 times to get a readable output. But I think the most important benefit of this post should be the insight, that a clean code performs even better.]]></description>

		<content:encoded><![CDATA[<p>PHP is a typeunsafe programming language and when you are looking over the source of beginners you'll see that they work with string functions on numbers and other ugly things. In most cases this will not cause an error but it's really blotted and argue a really bad programming style. I don't say that this way is bad in every case. To calculate the length of a number for example. But there are a few ways to achieve our desired goal and what's the best way? The mathematical way is that we divide the logarithm of the number by the logarithm of 10: <em>log($n) / log(10)</em>. Now we can <em>ceil()</em> this result and have the number of digits of $n. Sure, the faster way is adding one and cast to int. But let's have a look on a few implementations and their timings:</p>

<pre>$x = ceil(log($num) / log(10));

0.029807 sec
0.02982 sec
0.029742 sec
0.029871 sec
0.02988 sec</pre>

<p>This is a good starting position and we have now a reference point. So let's search for a better solution by using the already mentoined usage of string functions on numbers:</p>

<pre>$x = strlen($num);

0.017769 sec
0.017727 sec
0.0177 sec
0.017697 sec
0.017742 sec</pre>

<p>Okay, we are better, but let's see, if we can optimize this, by pre casting the number to a string:</p>

<pre>$tmp = (string)$num;
$x = strlen($tmp);

0.009112 sec
0.009172 sec
0.009155 sec
0.009156 sec
0.009155 sec</pre>

<p>...yeah. This looks good for the moment. Now that we know, how we can calculate the length of a number, let's try to split the number digit by digit and save the result in an array. We will look over 4 ways of doing this job here:</p>

<p>1. Division</p>
<pre>$arr = array();
for($i = 0; (int)$num > 0; $i++) {
	$arr[] = $num % 10;
	$num/= 10;
}</pre>

<p>2. Access the digits by brackets</p>
<pre>$arr = array();
for($i = strlen($num) - 1; $i >= 0 ; $i--) {
	$arr[] = $num{$i};
}</pre>

<p>3. Access the digits with <em>substr()</em></p>
<pre>$arr = array();
for($i = strlen($num) - 1; $i >= 0 ; $i--) {
	$arr[] = substr($num, $i, 1);
}</pre>

<p>4. Access the digits with <em>substr()</em>, optimized</p>
<pre>$arr = array();
$tmp = (string)$num;
for($i = strlen($tmp) - 1; $i >= 0 ; $i--) {
	$arr[] = substr($tmp, $i, 1);
}</pre>

<p>So, what do you think is the best? Lets look over the timing table:</p>

<pre>div:
0.010299 sec
0.010278 sec
0.010207 sec
0.0102 sec
0.010209 sec

brackets:
0.103473 sec
0.103237 sec
0.10262 sec
0.102984 sec
0.102821 sec

substr:
0.217405 sec
0.221709 sec
0.217226 sec
0.217958 sec
0.21719 sec

substr opt:
0.142615 sec
0.139224 sec
0.143382 sec
0.139244 sec
0.139225 sec</pre>

<p><b>Conclusion:</b> If you want to do such work on numbers, you should cast the number to a string before you access any digits. Thereby you will save the conversation from a number to string for every loop. In general, we can state, that you should work as properly as you work with typesafe languages also under PHP to get the best performance.</p>
<p>Sure, the savings are really slim and all benchmarks were run 1000 times to get a readable output. But I think the most important benefit of this post should be the insight, that a clean code performs even better.</p><img src="http://feeds.feedburner.com/~r/exarg/~4/e1UnK8hEHug" height="1" width="1"/>]]></content:encoded>

	<feedburner:origLink>http://www.xarg.org/2009/10/performant-php-work-with-numbers/</feedburner:origLink></item>
		<item>
		<title>How I've got no new .de domain</title>
		<link>http://feedproxy.google.com/~r/exarg/~3/a7eiYXZJQYY/</link>
		<comments>http://www.xarg.org/2009/10/how-i-got-no-new-de-domain/#comments</comments>
		<pubDate>Sun, 25 Oct 09 01:01:05 +0200</pubDate>
		<dc:creator>Robert Eisele</dc:creator>
				<category><![CDATA[Domain]]></category>

				<category><![CDATA[Grabber]]></category>

				<category><![CDATA[PHP]]></category>

				<category><![CDATA[Algorithm]]></category>

				<guid isPermaLink="false">http://www.xarg.org/2009/10/how-i-got-no-new-de-domain/</guid>

		<description><![CDATA[I think most peoples in germany have realized the changes of denic terms. Now it's possible to register all .de domains, without any restriction. Before 23.10.2009 you had no possibility to register one- and two letter domains and domains containing numbers only.
The easing of terms has ignited a fight for these coveted domains. Not least because every single domain has a worth of about 10,000&amp;euro; and more. I think the chance to get one domain is greater than winning with a lottery, especially when we can use a few tricks. And that is the reason why I have been involved in this madness.
So i focused my consideration on 2 letter domains and the also opened registration plate-domains. What did I do. First, I permutated all possible characters and build a list of domains:
aa
ab
ac
ad
...
zz
00
01
...
99
After that, I appended all 3 letter registration plates to that list. Okay, that was easy, but now I tried to evaluate the list to register the good names as fast as possible. A little side note. Some registrars introduced the possebility to place a reservation. There are 273 registered registrars. Every registrar can register 4 domains per minute. So 273 * 4 = 1092 domains can be registered in theorie per minute. We have the eyes on 36&amp;sup2; + ~170 = 1466 domains; means, in the worst case the whole magic would be done in 1 minute and 20 seconds. The good for us is, that most peoples pounce on the domains that jump in their mind at first. I think, if this scattering factor could be calculated, we would have a better prediction. But let's come to the evaluation of our domain names to have a little advantage over the others. 
What I did at first, was a google lookup for every word/token, which made a good preselection. I thought to get a better result, I have to weight with a few other factors. So what makes a domain worth? How can we express, that &amp;quot;los&amp;quot; is even better than &amp;quot;a3y&amp;quot;? Domain sellers use LLL (letter letter letter) or better CVC (consonant vowel consonant) to express this. So a LNL (letter number letter) is not as good as a LLL or CVC. I wrote a little function (here in PHP) to solve this problem:
function rateDomainName($str) {

	$score = 0;
	$extra = 0;

	for($i = 0, $len = strlen($str); $i &amp;lt; $len; $i++) {

		$extra+= ord($str[$i]);

		switch($str[$i]) {

			case 'a':	case 'e':
			case 'i':	case 'o':
			case 'u':
				if($i === 1 &amp;amp;&amp;amp; $len === 3) {
					$score+= 1.;
				} else {
					$score+= .8;
				}
				break;
			case '0':	case '1':
			case '2':	case '3':
			case '4':	case '5':
			case '6':	case '7':
			case '8':	case '9':
				if($i === 1 &amp;amp;&amp;amp; $len === 3) {
					$score+= .1;
				} else {
					$score+= .2;
				}
				break;
			default:
				$score+= .5;
		}
	}

	if($extra === $len * ord($str[0])) {
		return $score + 2.;
	} else {
		return $score;
	}
}
Calculated the result from rateDomainName() with the number of google serps, the ranking of my domains got better, but that was not enough. The last step, was building a black- and a whitelist of names. Some names are desperate and hold a lot of trouble, for example o2.de, 4u.de or other proper nouns. These problems can have other peoples and help me by locking their registrar for the current minute with such crap.
But the time window was anyway a problem. So I created many accounts with every imaginable registrar and droped my list of domains in the right order into their forms. Now I can say, the work was for nothing. I got not even one domain. But I think this is a good proof, that shows, that the only thing you need is money for such problems. Sure denic is a non-profit concern, but most of these 273 registrars made millions with this little change in denic's terms by ordering the queue onto the bidding. So thanks all you grabbers for having about 25.000 new parking-sites :-)]]></description>

		<content:encoded><![CDATA[<p>I think most peoples in germany have realized the changes of denic terms. Now it's possible to register all .de domains, without any restriction. Before 23.10.2009 you had no possibility to register one- and two letter domains and domains containing numbers only.</p>
<p>The easing of terms has ignited a fight for these coveted domains. Not least because every single domain has a worth of about 10,000&euro; and more. I think the chance to get one domain is greater than winning with a lottery, especially when we can use a few tricks. And that is the reason why I have been involved in this madness.</p>
<p>So i focused my consideration on 2 letter domains and the also opened registration plate-domains. What did I do. First, I permutated all possible characters and build a list of domains:</p>
<pre>aa
ab
ac
ad
...
zz
00
01
...
99</pre>
<p>After that, I appended all 3 letter <a rel="nofollow" class="extern" href="http://de.wikipedia.org/wiki/Liste_der_Kfz-Kennzeichen_in_Deutschland">registration plates</a> to that list. Okay, that was easy, but now I tried to evaluate the list to register the good names as fast as possible. A little side note. Some registrars introduced the possebility to place a reservation. There are <strong>273</strong> registered registrars. Every registrar can register 4 domains per minute. So 273 * 4 = <strong>1092</strong> domains can be registered in theorie per minute. We have the eyes on 36&sup2; + ~170 = 1466 domains; means, in the worst case the whole magic would be done in 1 minute and 20 seconds. The good for us is, that most peoples pounce on the domains that jump in their mind at first. I think, if this scattering factor could be calculated, we would have a better prediction. But let's come to the evaluation of our domain names to have a little advantage over the others. </p>
<p>What I did at first, was a google lookup for every word/token, which made a good preselection. I thought to get a better result, I have to weight with a few other factors. So what makes a domain worth? How can we express, that &quot;los&quot; is even better than &quot;a3y&quot;? Domain sellers use <acronym title="letter letter letter">LLL</acronym> (letter letter letter) or better <acronym title="consonant vowel consonant">CVC</acronym> (consonant vowel consonant) to express this. So a <acronym title="letter number letter">LNL</acronym> (letter number letter) is not as good as a LLL or CVC. I wrote a little function (here in PHP) to solve this problem:</p>
<pre>function rateDomainName($str) {

	$score = 0;
	$extra = 0;

	for($i = 0, $len = strlen($str); $i &lt; $len; $i++) {

		$extra+= ord($str[$i]);

		switch($str[$i]) {

			case 'a':	case 'e':
			case 'i':	case 'o':
			case 'u':
				if($i === 1 &amp;&amp; $len === 3) {
					$score+= 1.;
				} else {
					$score+= .8;
				}
				break;
			case '0':	case '1':
			case '2':	case '3':
			case '4':	case '5':
			case '6':	case '7':
			case '8':	case '9':
				if($i === 1 &amp;&amp; $len === 3) {
					$score+= .1;
				} else {
					$score+= .2;
				}
				break;
			default:
				$score+= .5;
		}
	}

	if($extra === $len * ord($str[0])) {
		return $score + 2.;
	} else {
		return $score;
	}
}</pre>
<p>Calculated the result from <em>rateDomainName()</em> with the number of google serps, the ranking of my domains got better, but that was not enough. The last step, was building a black- and a whitelist of names. Some names are desperate and hold a lot of trouble, for example o2.de, 4u.de or other proper nouns. These problems can have other peoples and help me by locking their registrar for the current minute with such crap.</p>
<p>But the time window was anyway a problem. So I created many accounts with every imaginable registrar and droped my list of domains in the right order into their forms. Now I can say, the work was for nothing. I got not even one domain. But I think this is a good proof, that shows, that the only thing you need is money for such problems. Sure denic is a non-profit concern, but most of these 273 registrars made millions with this little change in denic's terms by ordering the queue onto the bidding. So thanks all you grabbers for having about 25.000 new parking-sites :-)</p><img src="http://feeds.feedburner.com/~r/exarg/~4/a7eiYXZJQYY" height="1" width="1"/>]]></content:encoded>

	<feedburner:origLink>http://www.xarg.org/2009/10/how-i-got-no-new-de-domain/</feedburner:origLink></item>
		<item>
		<title>Write a PID file in bash</title>
		<link>http://feedproxy.google.com/~r/exarg/~3/xTOIcnKfBLw/</link>
		<comments>http://www.xarg.org/2009/10/write-a-pid-file-in-bash/#comments</comments>
		<pubDate>Fri, 23 Oct 09 20:35:12 +0200</pubDate>
		<dc:creator>Robert Eisele</dc:creator>
				<category><![CDATA[daemon]]></category>

				<category><![CDATA[queue]]></category>

				<category><![CDATA[pid file]]></category>

				<category><![CDATA[PHP]]></category>

				<category><![CDATA[bash]]></category>

				<guid isPermaLink="false">http://www.xarg.org/2009/10/write-a-pid-file-in-bash/</guid>

		<description><![CDATA[To write a PID-file of a just created background process you can simply use $! like this:

/usr/local/bin/program &amp;amp;
echo $! &amp;gt; /var/run/program.pid

I need this to fork a little daemon written in PHP. Yes, I could use pcntl_fork(), but my goal was, saving the recompilation of PHP and any changes to the code. Normally, I used a little tricky syntax to fork the PHP-process:

((/usr/local/php -f /var/daemon/queue.php)&amp;amp;)

This is quite cool, because the process runs now as a daemon, and we can close the shell. But what happens, when the process dies? This is a problem and I tried to combine the first and the second snippet to get the PID in a file and look continually over the process. First I wrote the backgrounding to a little shellscript:

#!/bin/bash

/usr/local/php -f /var/daemon/queue.php &amp;amp;
echo $! &amp;gt; /var/run/queue.pid

And started the whole thing by ((queue.sh) &amp;amp;).

But let's try to combine both in a single statement. My result was a easy to read command, which spawn a programm as a daemon and also writes the PID-file:

((/usr/local/php/bin/php -f /root/queue.php) &amp;amp; echo $! &amp;gt; /var/run/queue.pid &amp;amp;)

For now, we have the process running and the according PID in a file. Let's write a little watchdog, that will watch over the health of our daemon and creates a new process, when our daemon dies for some reason. I think the best solution for this job is a cron job. So let's pack the stuff in a little shellscript.

#!/bin/bash
pid=`cat /var/run/queue.pid`

if [ ! -e /proc/$pid -a /proc/$pid/exe ]; then
	mail -s &amp;quot;Queue failed&amp;quot; failed@example.com &amp;lt;&amp;lt;EOF
Queue on host $(hostname) failed and restarted
EOF
fi]]></description>

		<content:encoded><![CDATA[<p>To write a PID-file of a just created background process you can simply use <strong>$!</strong> like this:</p>

<pre>/usr/local/bin/program &amp;
echo $! &gt; /var/run/program.pid</pre>

<p>I need this to fork a little daemon written in PHP. Yes, I could use <em>pcntl_fork()</em>, but my goal was, saving the recompilation of PHP and any changes to the code. Normally, I used a little tricky syntax to fork the PHP-process:

<pre>((/usr/local/php -f /var/daemon/queue.php)&amp;)</pre>

<p>This is quite cool, because the process runs now as a daemon, and we can close the shell. But what happens, when the process dies? This is a problem and I tried to combine the first and the second snippet to get the PID in a file and look continually over the process. First I wrote the backgrounding to a little shellscript:

<pre>#!/bin/bash

/usr/local/php -f /var/daemon/queue.php &amp;
echo $! &gt; /var/run/queue.pid</pre>

<p>And started the whole thing by <strong>((queue.sh) &amp;)</strong>.</p>

<p>But let's try to combine both in a single statement. My result was a easy to read command, which spawn a programm as a daemon and also writes the PID-file:</p>

<pre>((/usr/local/php/bin/php -f /root/queue.php) &amp; echo $! &gt; /var/run/queue.pid &amp;)</pre>

<p>For now, we have the process running and the according PID in a file. Let's write a little watchdog, that will watch over the health of our daemon and creates a new process, when our daemon dies for some reason. I think the best solution for this job is a cron job. So let's pack the stuff in a little shellscript.</p>

<pre>#!/bin/bash
pid=`cat /var/run/queue.pid`

if [ ! -e /proc/$pid -a /proc/$pid/exe ]; then
	mail -s &quot;Queue failed&quot; failed@example.com &lt;&lt;EOF
Queue on host $(hostname) failed and restarted
EOF
fi</pre><img src="http://feeds.feedburner.com/~r/exarg/~4/xTOIcnKfBLw" height="1" width="1"/>]]></content:encoded>

	<feedburner:origLink>http://www.xarg.org/2009/10/write-a-pid-file-in-bash/</feedburner:origLink></item>
		<item>
		<title>Fast circular buffer in MySQL</title>
		<link>http://feedproxy.google.com/~r/exarg/~3/QPYWEI3nM4E/</link>
		<comments>http://www.xarg.org/2009/09/fast-circular-buffer-in-mysql/#comments</comments>
		<pubDate>Wed, 30 Sep 09 21:35:58 +0200</pubDate>
		<dc:creator>Robert Eisele</dc:creator>
				<category><![CDATA[circular buffer]]></category>

				<category><![CDATA[memcached]]></category>

				<category><![CDATA[MySQL]]></category>

				<category><![CDATA[optimize]]></category>

				<category><![CDATA[buffer]]></category>

				<guid isPermaLink="false">http://www.xarg.org/2009/09/fast-circular-buffer-in-mysql/</guid>

		<description><![CDATA[A while ago, I wrote a class to implement a circular buffer with memcached. Today, I want to experiment implementing a fixed size storage in MySQL. A nice real world problem is a queue in MySQL. To use this solution as a queue, you have to choose an appropriate size of the buffer and split the read and write pointer.

While I wrote this article, I had an idea. The most common problem on websites is, that you list a fixed size box, like &quot;10 last comments&quot;, &quot;15 newest posts&quot; and so on. When you try to cache this result, you'll delete the cache and read everything from the database. With a fixed size circular buffer, you have an elegant way to update the cache without a new database-query.

But let's try to find a fast solution for MySQL. My first attempt was bringing the idea from the memcached-implementation to MySQL. For this setup, we need two tables. One for the position and one for the data. If you want to use global variables for the pointers, that's your decision. I made bad experiences with that and so I'll use a second table.

CREATE TABLE buffer (
  id int(10) unsigned not null key,
  value varchar(64)
);

CREATE TABLE bufpos (
  pos int(10) unsigned not null,
  max int(10) unsigned not null
);

The circular buffer is fixed in size, so we have to fill the table once, and then only use UPDATE to change the values in the table. In this example, I use varchar(64) for the value. In general, it doesn't matter what type you use. We focus here on the projecting, not the selecting.

INSERT INTO buffer (id, value) VALUES
(1, null),
(2, null),
(3, null),
(4, null),
(5, null),
(6, null),
(7, null),
(8, null),
(9, null),
(10, null);

You also have to fill the &quot;bufpos&quot;-table with one line. Set the max-value to 10, and the position to 1. This table will ever have one line. You can also add a column with a unique id, and add a WHERE-clause in the UPDATE. So you have the chance to reuse this table for different pointers.

INSERT INTO bufpos (pos, max) VALUES (1, 10);

Now I tried to update the pointer and the data-table with maximal 2 querys. If you want to use this, stick it in a procedure, make a LOCK and everthing is fine.

UPDATE bufpos SET pos=@p WHERE @p:= IF(pos=1, max, pos-1) LIMIT 1;
UPDATE buffer SET value='&amp;lt;STRING&amp;gt;' WHERE BID=@p LIMIT 1;

The write solution is okay, but to read from the data-table you have to do 2 querys and the second query with a union which is not really fast:

SELECT @p:= pos FROM bufpos LIMIT 1;
SELECT *
FROM buffer
WHERE id &gt;= @p
UNION
SELECT *
FROM buffer
WHERE id &lt; @p;

Okay, lets see how we can optimize this first approach. There were a few other steps between the first and the final step, but I don't want to bore you.

So my final version uses a single table, with a new column:

CREATE TABLE buffer (
  id int(10) unsigned not null key,
  value varchar(64),
  pos int(10) unsigned not null,
  key(pos)
);

INSERT INTO buffer (id, value, pos) VALUES
(1, null, 1),
(2, null, 2),
(3, null, 3),
(4, null, 4),
(5, null, 5),
(6, null, 6),
(7, null, 7),
(8, null, 8),
(9, null, 9),
(10, null, 10);

The id is not needed, but handy to find the initial-beginnung of the buffer if you lost the table-sorting. If you want to write something to the buffer, you can easily run the following query:

UPDATE buffer SET value='&amp;lt;STRING&amp;gt;', pos=pos+10 ORDER BY pos LIMIT 1;

...where &quot;10&quot; is the size of the buffer, like in the example above. I had a few intermediate steps with limitations, where you had to update the &quot;pos&quot;-column with every update, and the other problem was, that a extra query was needed to get the max value. In this setup, the only limitation is INT_MAX. But with a cron, this should be the slightest problem. Now you can easily read by:

SELECT * FROM buffer ORDER BY pos DESC;

Note, that the &quot;pos&quot; has to be initialized by the same value as the primary key at the beginnung to have a base sort.]]></description>

		<content:encoded><![CDATA[<p>A while ago, I wrote a class to implement a <strong>circular buffer</strong> with memcached. Today, I want to experiment implementing a fixed size storage in MySQL. A nice real world problem is a <strong>queue in MySQL</strong>. To use this solution as a queue, you have to choose an appropriate size of the buffer and split the read and write pointer.</p>

<p>While I wrote this article, I had an idea. The most common problem on websites is, that you list a fixed size box, like "10 last comments", "15 newest posts" and so on. When you try to cache this result, you'll delete the cache and read everything from the database. With a fixed size circular buffer, you have an elegant way to update the cache without a new database-query.</p>

<p>But let's try to find a fast solution for MySQL. My first attempt was bringing the idea from the memcached-implementation to MySQL. For this setup, we need two tables. One for the position and one for the data. If you want to use global variables for the pointers, that's your decision. I made bad experiences with that and so I'll use a second table.</p>

<pre>CREATE TABLE buffer (
  id int(10) unsigned not null key,
  value varchar(64)
);

CREATE TABLE bufpos (
  pos int(10) unsigned not null,
  max int(10) unsigned not null
);</pre>

<p>The circular buffer is fixed in size, so we have to fill the table once, and then only use UPDATE to change the values in the table. In this example, I use <i>varchar(64)</i> for the value. In general, it doesn't matter what type you use. We focus here on the projecting, not the selecting.</p>

<pre>INSERT INTO buffer (id, value) VALUES
(1, null),
(2, null),
(3, null),
(4, null),
(5, null),
(6, null),
(7, null),
(8, null),
(9, null),
(10, null);</pre>

<p>You also have to fill the "bufpos"-table with one line. Set the max-value to 10, and the position to 1. This table will ever have one line. You can also add a column with a unique id, and add a WHERE-clause in the UPDATE. So you have the chance to reuse this table for different pointers.</p>

<pre>INSERT INTO bufpos (pos, max) VALUES (1, 10);</pre>

Now I tried to update the pointer and the data-table with maximal 2 querys. If you want to use this, stick it in a procedure, make a <i>LOCK</i> and everthing is fine.

<pre>UPDATE bufpos SET pos=@p WHERE @p:= IF(pos=1, max, pos-1) LIMIT 1;
UPDATE buffer SET value='&lt;STRING&gt;' WHERE BID=@p LIMIT 1;</pre>

<p>The write solution is okay, but to read from the data-table you have to do 2 querys and the second query with a union which is not really fast:</p>

<pre>SELECT @p:= pos FROM bufpos LIMIT 1;
SELECT *
FROM buffer
WHERE id >= @p
UNION
SELECT *
FROM buffer
WHERE id < @p;</pre>

<p>Okay, lets see how we can optimize this first approach. There were a few other steps between the first and the final step, but I don't want to bore you.</p>

<p>So my final version uses a single table, with a new column:</p>

<pre>CREATE TABLE buffer (
  id int(10) unsigned not null key,
  value varchar(64),
  pos int(10) unsigned not null,
  key(pos)
);

INSERT INTO buffer (id, value, pos) VALUES
(1, null, 1),
(2, null, 2),
(3, null, 3),
(4, null, 4),
(5, null, 5),
(6, null, 6),
(7, null, 7),
(8, null, 8),
(9, null, 9),
(10, null, 10);</pre>

<p>The <i>id</i> is not needed, but handy to find the initial-beginnung of the buffer if you lost the table-sorting. If you want to write something to the buffer, you can easily run the following query:</p>

<pre>UPDATE buffer SET value='&lt;STRING&gt;', pos=pos+10 ORDER BY pos LIMIT 1;</pre>

<p>...where "10" is the size of the buffer, like in the example above. I had a few intermediate steps with limitations, where you had to update the "pos"-column with every update, and the other problem was, that a extra query was needed to get the max value. In this setup, the only limitation is INT_MAX. But with a cron, this should be the slightest problem. Now you can easily read by:</p>

<pre>SELECT * FROM buffer ORDER BY pos DESC;</pre>

<p>Note, that the "pos" has to be initialized by the same value as the primary key at the beginnung to have a base sort.</p><img src="http://feeds.feedburner.com/~r/exarg/~4/QPYWEI3nM4E" height="1" width="1"/>]]></content:encoded>

	<feedburner:origLink>http://www.xarg.org/2009/09/fast-circular-buffer-in-mysql/</feedburner:origLink></item>
		<item>
		<title>Netcat MySQL Stream</title>
		<link>http://feedproxy.google.com/~r/exarg/~3/RI2N2sM4P78/</link>
		<comments>http://www.xarg.org/2009/08/netcat-mysql-stream/#comments</comments>
		<pubDate>Sat, 01 Aug 09 18:47:58 +0200</pubDate>
		<dc:creator>Robert Eisele</dc:creator>
				<category><![CDATA[MySQL]]></category>

				<category><![CDATA[migration]]></category>

				<category><![CDATA[netcat]]></category>

				<category><![CDATA[stream]]></category>

				<guid isPermaLink="false">http://www.xarg.org/2009/08/netcat-mysql-stream/</guid>

		<description><![CDATA[In most cases you'll use a SSH-pipe to push a query stream in a database or use the mysql-client directly on the sending host. If you don't need any encryption – for example in a private network - you will lose time (which is expensive in a migration scenario).

A faster alternative is using netcat(1). To stream query's from one side to another (as I said without the mysql-client on the sending host). Simply use this command on the host which is running the MySQL-daemon to open a new socket:

nc -lp 2222 | mysql -u&amp;lt;user&amp;gt; -p&amp;lt;pass&amp;gt;

...and this command from the sender to push the query's to the new opened port 2222:

cat query.sql | nc -q1 example.com 2222

To take a snapshot of your database, don't export it and read the dump with the commands above. Peter describes a faster way Using netcat to copy MySQL Database, which uses the same approach but pipes the stream to tar(1) instead of mysql client utility.

Note: The option -q1 is necessary to close the connection rather after EOF was sent.]]></description>

		<content:encoded><![CDATA[<p>In most cases you'll use a SSH-pipe to push a query stream in a database or use the mysql-client directly on the sending host. If you don't need any encryption – for example in a private network - you will lose time (which is expensive in a migration scenario).</p>

<p>A faster alternative is using <em>netcat</em>(1). To stream query's from one side to another (as I said without the mysql-client on the sending host). Simply use this command on the host which is running the MySQL-daemon to open a new socket:<p>

<pre>nc -lp 2222 | mysql -u&lt;user&gt; -p&lt;pass&gt;</pre>

<p>...and this command from the sender to push the query's to the new opened port 2222:</p>

<pre>cat query.sql | nc -q1 example.com 2222</pre>

<p>To take a snapshot of your database, don't export it and read the dump with the commands above. Peter describes a faster way <a href="http://www.mysqlperformanceblog.com/2009/05/31/using-netcat-to-copy-mysql-database/" class="extern">Using netcat to copy MySQL Database</a>, which uses the same approach but pipes the stream to <em>tar</em>(1) instead of mysql client utility.</p>

<p><b>Note:</b> The option -q1 is necessary to close the connection rather after <acronym title="end-of-file">EOF</acronym> was sent.</p><img src="http://feeds.feedburner.com/~r/exarg/~4/RI2N2sM4P78" height="1" width="1"/>]]></content:encoded>

	<feedburner:origLink>http://www.xarg.org/2009/08/netcat-mysql-stream/</feedburner:origLink></item>
	</channel>
</rss>
