<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/rss2full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><!-- generator="wordpress/2.2.1" --><rss xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:wfw="http://wellformedweb.org/CommentAPI/" xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0" version="2.0">

<channel>
	<title>good coders code, great reuse</title>
	<link>http://www.catonmat.net</link>
	<description>Peteris Krumins' blog about programming, hacking, software reuse, software ideas, computer security, google and technology.</description>
	<pubDate>Wed, 08 Jul 2009 19:38:16 +0000</pubDate>
	<generator>http://wordpress.org/?v=2.2.1</generator>
	<language>en</language>
			<atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" href="http://feeds.feedburner.com/catonmat" type="application/rss+xml" /><feedburner:emailServiceId>catonmat</feedburner:emailServiceId><feedburner:feedburnerHostname>http://feedburner.google.com</feedburner:feedburnerHostname><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com" /><item>
		<title>On the Linear Time Algorithm For Finding Fibonacci Numbers</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/5t59FUZ0rFo/</link>
		<comments>http://www.catonmat.net/blog/on-the-linear-time-algorithm-for-finding-fibonacci-numbers/#comments</comments>
		<pubDate>Tue, 07 Jul 2009 04:45:08 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>algorithm</category><category>algorithm analysis</category><category>fibonacci</category><category>fibonacci numbers</category><category>linear</category><category>quadratic</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/on-the-linear-time-algorithm-for-finding-fibonacci-numbers/</guid>
		<description><![CDATA[In this article I&#8217;d like to show how the theory does not always match the practice. I am sure you all know the linear time algorithm for finding Fibonacci numbers. The analysis says that the running time of this algorithm is O(n). But is it still O(n) if we actually run it? If not, what [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/fibonacci-of-pisa.jpg' alt='Fibonacci of Pisa' class="post-icon" align="left" />In this article I&#8217;d like to show how the theory does not always match the practice. I am sure you all know the linear time algorithm for finding Fibonacci numbers. The analysis says that the running time of this algorithm is O(n). But is it still O(n) if we actually run it? If not, what is wrong?</p>
<p>Let&#8217;s start with the simplest linear time implementation of the Fibonacci number generating algorithm in Python:</p>
<pre>
def LinearFibonacci(n):
  fn = f1 = f2 = 1
  for x in xrange(2, n):
    fn = f1 + f2
    f2, f1 = f1, fn
  return fn
</pre>
<p>The theory says that this algorithm should run in O(n) - given the n-th Fibonacci number to find, the algorithm does a single loop up to n. </p>
<p>Now let&#8217;s verify if this algorithm is really linear in practice. If it&#8217;s linear then the plot of <em>n</em> vs. running time of LinearFibonacci(n) should be a line. I plotted these values for n up to 200,000 and here is the plot that I got:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/quadratic-performance-of-linear-fibonacci-algorithm-fixed.gif' alt='Quadratic performance of linear algorithm' /><br />
<small><strong>Note</strong>: Each data point was averaged over 10 calculcations.</small>
</div>
<p>Oh no! This does not look linear at all! It looks quadratic! I fitted the data with a quadratic function and it fit nearly perfectly. Do you know why the seemingly linear algorithm went quadratic?</p>
<p>The answer is that the theoretical analysis assumed that all the operations in the algorithm executed in constant time. But this is not the case when we run the algorithm on a real machine! As the Fibonacci numbers get larger, each addition operation for calculating the next Fibonacci number &#8220;fn = f1 + f2 &#8221; runs in time proportional to the length of the previous Fibonacci number. It&#8217;s because these huge numbers no longer fit in the basic units of computation in the CPU; so a big integer library is required. The addition of two numbers of length O(n) in a big integer library takes time of O(n).</p>
<p>I&#8217;ll show you that the running time of the real-life linear Fibonacci algorithm really is O(n^2) by taking into account this hidden cost of bigint library.</p>
<p>So at each iteration <em>i</em> we have a hidden cost of O(number of digits of f<sub>i</sub>) = O(digits(f<sub>i</sub>)). Let&#8217;s sum these hidden cost for the whole loop up to n:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/total-hidden-bigint-fibonacci-cost.gif' alt='Hidden bigint cost in linear fibonacci' />
</div>
<p>Now let&#8217;s find the number of digits in the n-th Fibonacci number. To do that let&#8217;s use the well-known <a href="http://mathworld.wolfram.com/BinetsFibonacciNumberFormula.html">Binet&#8217;s formula</a>, which tells us that the n-th Fibonacci number f<sub>n</sub> can be expressed as:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/binets-fibonacci-formula.gif' alt='Binet’s Fibonacci Formula' />
</div>
<p>It is also well-known that the number of digits in a number is integer part of log<sub>10</sub>(number) + 1. Thus the number of digits in the n-th Fibonacci number is:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/fibonacci-digits.gif' alt='Digits in the n-th Fibonacci number' />
</div>
<p>Thus if we now sum all the hidden costs for finding the n-th Fibonacci number we get:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/total-hidden-fibonacci-cost-final-answer.gif' alt='Hidden integer library cost in linear fibonacci algorithm' />
</div>
<p>There we have it. The running time of this &#8220;linear&#8221; algorithm is actually quadratic if we take into consideration that each addition operation runs proportionally to the length of addends.</p>
<p>Next time I&#8217;ll show you that if the addition operation runs in constant time, then the algorithm is truly linear; and later I will do a similar analysis of the logarithmic time algorithm for finding Fibonnaci numbers that uses this awesome matrix identity:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/07/fibonacci-matrix-identity.gif' alt='Fibonacci Fibonnaci Matrix Identity' />
</div>
<p>Don&#8217;t forget to <a href="http://www.catonmat.net/feed/">subscribe</a> if you are interested! It&#8217;s well worth every byte!</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/catonmat?a=5t59FUZ0rFo:_9seDuExulw:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/catonmat?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=5t59FUZ0rFo:_9seDuExulw:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/catonmat?i=5t59FUZ0rFo:_9seDuExulw:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=5t59FUZ0rFo:_9seDuExulw:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/catonmat?i=5t59FUZ0rFo:_9seDuExulw:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=5t59FUZ0rFo:_9seDuExulw:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/catonmat?i=5t59FUZ0rFo:_9seDuExulw:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/5t59FUZ0rFo" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/on-the-linear-time-algorithm-for-finding-fibonacci-numbers/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/on-the-linear-time-algorithm-for-finding-fibonacci-numbers/</feedburner:origLink></item>
		<item>
		<title>Low Level Bit Hacks You Absolutely Must Know</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/sM02mGRP-r0/</link>
		<comments>http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/#comments</comments>
		<pubDate>Tue, 30 Jun 2009 11:25:48 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>binary</category><category>bit</category><category>bit hacks</category><category>bitwise and</category><category>bitwise not</category><category>bitwise or</category><category>bitwise shift left</category><category>bitwise shift right</category><category>bitwise xor</category><category>byte</category><category>c</category><category>embedded programming</category><category>even</category><category>hacks</category><category>odd</category><category>perl</category><category>proof</category><category>python</category><category>twos complement</category><category>unsigned integers</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/</guid>
		<description><![CDATA[I decided to write an article about a thing that is second nature to embedded systems programmers - low level bit hacks. Bit hacks are ingenious little programming tricks that manipulate integers in a smart and efficient manner. Instead of performing some operation (such as counting the 1 bits in an integer) by looping over [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/06/microcontroller-bit-hacks.jpg' alt='Bit Hacks' class="post-icon" align="left" />I decided to write an article about a thing that is second nature to embedded systems programmers - <strong>low level bit hacks</strong>. Bit hacks are ingenious little programming tricks that manipulate integers in a smart and efficient manner. Instead of performing some operation (such as counting the 1 bits in an integer) by looping over individual bits, these programming nuggets do the same with one or two carefully chosen bitwise operations.</p>
<p>To get things going I&#8217;ll assume that you know what the two&#8217;s complement binary representation of an integer is and also that you know all the the bitwise operations.</p>
<p>I&#8217;ll use the following notation for bitwise operations in the article:</p>
<pre>
&#038;   -  bitwise and
|   -  bitwise or
^   -  bitwise xor
~   -  bitwise not
&lt;&lt;  -  bitwise shift right
>>  -  bitwise shift left
</pre>
<p>The numbers in the article are 8 bit signed integers (though the operations work on arbitrary length signed integers) that are represented as two&#8217;s complement and they are usually named &#8216;x&#8217;. The result is usually &#8216;y&#8217;. The individual bits of &#8216;x&#8217; are named b7, b6, b5, b4, b3, b3, b2, b1 and b0. The bit b7 is the sign bit (the most significant bit), and b0 is the least significant.</p>
<p>I&#8217;ll start with the most basic bit hacks and gradually progress to more difficult ones. I&#8217;ll use examples to explain how each bithack works.</p>
<p>If you are intrigued by this topic I urge you to <a href="http://www.catonmat.net/feed/">subscribe to my blog</a>. I can share a secret that there will be the 2nd part of this article where I cover more advanced bit hacks, and I will also release a cheat sheet with all these bit tricks! <strong>It&#8217;s well worth subscribing</strong>!</p>
<p>Here we go.</p>
<p><strong>Bit Hack #1. Check if the integer is even or odd.</strong></p>
<pre>
if ((x &#038; 1) == 0) {
  x is even
}
else {
  x is odd
}
</pre>
<p>I am pretty sure everyone has seen this trick. The idea here is that an integer is odd if and only if the least significant bit <em>b0</em> is 1. It follows from the binary representation of &#8216;x&#8217;, where bit <em>b0</em> contributes to either 1 or 0. By AND-ing &#8216;x&#8217; with 1 we eliminate all the other bits than <em>b0</em>. If the result after this operation is 0, then &#8216;x&#8217; was even because bit <em>b0</em> was 0. Otherwise &#8216;x&#8217; was odd.</p>
<p>Let&#8217;s look at some examples. Let&#8217;s take integer 43, which is odd. In binary 43 is 0010101<strong>1</strong>. Notice that the least significant bit <em>b0</em> is 1 (in bold). Now let&#8217;s AND it with 1:</p>
<pre>
    00101011
&#038;   00000001   (note: 1 is the same as 00000001)
    --------
    00000001
</pre>
<p>See how AND-ing erased all the higher order bits b1-b7 but left bit <em>b0</em> the same it was? The result is thus 1 which tells us that the integer was odd.</p>
<p>Now let&#8217;s look at -43. Just as a reminder, a quick way to find negative of a given number in two&#8217;s complement representation is to invert all bits and add one. So -43 is 11010101 in binary. Again notice that the last bit is 1, and the integer is odd. (Note that if we used one&#8217;s complement it wouldn&#8217;t be true!)</p>
<p>Now let&#8217;s take a look at an even integer 98. In binary 98 is 1100010. </p>
<pre>
    01100010
&#038;   00000001
    --------
    00000000
</pre>
<p>After AND-ing the result is 0. It means that the bit <em>b0</em> of original integer 98 was 0. Thus the given integer is even.</p>
<p>Now the negative -98. It&#8217;s 10011110. Again, bit <em>b0</em> is 0, after AND-ing, the result is 0, meaning -98 is even, which indeed is true.</p>
<p><strong>Bit Hack #2. Test if the n-th bit is set.</strong></p>
<pre>
if (x &#038; (1&lt;&lt;n)) {
  n-th bit is set
}
else {
  n-th bit is not set
}
</pre>
<p>In the previous bit hack we saw that (x <font face="arial">&#038;</font> 1) tests if the first bit is set. This bit hack improves this result and tests if n-th bit is set. It does it by shifting that first 1-bit n positions to the left and then doing the same AND operation, which eliminates all bits but n-th.</p>
<p>Here is what happens if you shift 1 several positions to the left:</p>
<pre>
1         00000001    (same as 1<<0)
1<<1      00000010
1<<2      00000100
1<<3      00001000
1<<4      00010000
1<<5      00100000
1<<6      01000000
1<<7      10000000
</pre>
<p>Now if we AND &#8216;x&#8217; with 1 shifted n positions to the left we effectively eliminate all the bits but n-th bit in &#8216;x&#8217;. If the result after AND-ing is 0, then that bit must have been 0, otherwise that bit was set.</p>
<p>Let&#8217;s look at some examples.</p>
<p>Does 122 have 3rd bit set? The operation we do to find it out is:</p>
<pre>
122 &#038; (1<<3)
</pre>
<p>Now, 122 is 01111010 in binary. And (1<<3) is 00001000.</p>
<pre>
    01111010
&#038;   00001000
    --------
    00001000
</pre>
<p>We see that the result is not 0, so yes, 122 has the 3rd bit set.</p>
<p><strong>Note</strong>: In my article bit numeration starts with 0. So it&#8217;s 0th bit, 1st bit, &#8230;, 7th bit.</p>
<p>What about -33? Does it have the 5th bit set?</p>
<pre>
    11011111      (-33 in binary)
&#038;   00100000     (1<<5)
    --------
    00000000
</pre>
<p>Result is 0, so the 5th bit is not set.</p>
<p><strong>Bit Hack #3. Set the n-th bit.</strong></p>
<pre>
y = x | (1&lt;&lt;n)
</pre>
<p>This bit hack combines the same (1&lt;&lt;n) trick of setting n-th bit by shifting with OR operation. The result of OR-ing a variable with a value that has n-th bit set is turning that n-th bit on. It&#8217;s because OR-ing any value with 0 leaves the value the same; but OR-ing it with 1 changes it to 1 (if it wasn&#8217;t already). Let&#8217;s see how that works in action:</p>
<p>Suppose we have value 120, and we wish to turn on the 2nd bit.</p>
<pre>
    01111000    (120 in binary)
|   00000100    (1<<2)
    --------
    01111100
</pre>
<p>What about -120 and 6th bit?</p>
<pre>
    10001000   (-120 in binary)
|   01000000   (1<<6)
    --------
    11001000
</pre>
<p><strong>Bit Hack #4. Unset the n-th bit.</strong></p>
<pre>
y = x &#038; ~(1&lt;&lt;n)
</pre>
<p>The important part of this bithack is the ~(1&lt;&lt;n) trick. It turns on all the bits except n-th.</p>
<p>Here is how it looks:</p>
<pre>
~1        11111110  (same as ~(1<<0))
~(1<<1)   11111101
~(1<<2)   11111011
~(1<<3)   11110111
~(1<<4)   11101111
~(1<<5)   11011111
~(1<<6)   10111111
~(1<<7)   01111111
</pre>
<p>The effect of AND-ing variable &#8216;x&#8217; with this quantity is eliminating n-th bit. It does not matter if the n-th bit was 0 or 1, AND-ing it with 0 sets it to 0.</p>
<p>Here is an example. Let&#8217;s unset 4th bit in 127:</p>
<pre>
    01111111    (127 in binary)
&#038;   11101111    (~(1<<4))
    --------
    01101111
</pre>
<p><strong>Bit Hack #5. Toggle the n-th bit.</strong></p>
<pre>
y = x ^ (1&lt;&lt;n)
</pre>
<p>This bit hack also uses the wonderful &#8220;set n-th bit shift hack&#8221; but this time it XOR&#8217;s it with the variable &#8216;x&#8217;. The result of XOR-ing something with something else is that if both bits are the same, the result is 0, otherwise it&#8217;s 1. How does it toggle n-th bit? Well, if n-th bit was 1, then XOR-ing it with 1 changes it to 0; conversely, if it was 0, then XOR-ing with with 1 changes it to 1. See, the bit got flipped.</p>
<p>Here is an example. Suppose you want to toggle 5th bit in value 01110101:</p>
<pre>
    01110101
^   00100000
    --------
    01010101
</pre>
<p>What about the same value but 5th bit originally 0?</p>
<pre>
    01010101
^   00100000
    --------
    01110101
</pre>
<p>Notice something? XOR-ing the same bit twice returned it to the same value. This nifty XOR property is used in calculating parity in RAID arrays and used in simple cryptography cyphers, but more about that in some other article.</p>
<p><strong>Bit Hack #6. Turn off the rightmost 1-bit.</strong></p>
<pre>
y = x &#038; (x-1)
</pre>
<p>Now it finally gets more interesting!!! Bit hacks #1 - #5 were kind of boring to be honest.</p>
<p>This bit hack turns off the rightmost one-bit. For example, given an integer 001010<strong>1</strong>0 (the rightmost 1-bit in bold) it turns it into 00101000. Or given 00010000 it turns it into 0, as there is just a single 1-bit.</p>
<p>Here are more examples:</p>
<pre>
    01010111    (x)
&#038;   01010110    (x-1)
    --------
    01010110

    01011000    (x)
&#038;   01010111    (x-1)
    --------
    01010000

    10000000    (x = -128)
&#038;   01111111    (x-1 = 127 (with overflow))
    --------
    00000000

    11111111    (x = all bits 1)
&#038;   11111110    (x-1)
    --------
    11111110

    00000000    (x = no rightmost 1-bits)
&#038;   11111111    (x-1)
    --------
    00000000
</pre>
<p>Why does it work?</p>
<p>If you look at the examples and think for a while, you&#8217;ll realize that there are two possible scenarios:</p>
<p>1. The value has the rightmost 1 bit. In this case subtracting one from it sets all the lower bits to one and changes that rightmost bit to 0 (so that if you add one now, you get the original value back). This step has masked out the rightmost 1-bit and now AND-ing it with the original value zeroes that rightmost 1-bit out.</p>
<p>2. The value has no rightmost 1 bit (all 0). In this case subtracting one underflows the value (as it&#8217;s signed) and sets all bits to 1. AND-ing all zeroes with all ones produces 0.</p>
<p><strong>Bit Hack #7. Isolate the rightmost 1-bit.</strong></p>
<pre>
y = x &#038; (-x)
</pre>
<p>This bit hack finds the rightmost 1-bit and sets all the other bits to 0. The end result has only that one rightmost 1-bit set. For example, 01010<strong>1</strong>00 (rightmost bit in bold) gets turned into 00000100.</p>
<p>Here are some more examples:</p>
<pre>
    10111100  (x)
&#038;   01000100  (-x)
    --------
    00000100

    01110000  (x)
&#038;   10010000  (-x)
    --------
    00010000

    00000001  (x)
&#038;   11111111  (-x)
    --------
    00000001

    10000000  (x = -128)
&#038;   10000000  (-x = -128)
    --------
    10000000

    11111111  (x = all bits one)
&#038;   00000001  (-x)
    --------
    00000001

    00000000  (x = all bits 0, no rightmost 1-bit)
&#038;   00000000  (-x)
    --------
    00000000
</pre>
<p>This bit hack works because of two&#8217;s complement. In two&#8217;s complement system -x is the same as ~x+1. Now let&#8217;s examine the two possible cases:</p>
<p>1. There is a rightmost 1-bit b<sub>i</sub>. In this case let&#8217;s pivot on this bit and divide all other bits into two flanks - bits to the right and bits to the left. Remember that all the bits to the right b<sub>i-1</sub>, b<sub>i-2</sub> &#8230; b<sub>0</sub> are 0&#8217;s (because b<sub>i</sub> was the rightmost 1-bit). And bits to the left are the way they are. Let&#8217;s call them b<sub>i+1</sub>, &#8230;, b<sub>n</sub>.</p>
<p>Now, when we calculate -x, we first do ~x which turns bit b<sub>i</sub> into 0, bits b<sub>i-1</sub> &#8230; b<sub>0</sub> into 1s, and inverts bits b<sub>i+1</sub>, &#8230;, b<sub>n</sub>, and then we add 1 to this result. </p>
<p>Since bits b<sub>i-1</sub> &#8230; b<sub>0</sub> are all 1&#8217;s, adding one makes them carry this one all the way to bit b<sub>i</sub>, which is the first zero bit.</p>
<p>If we put it all together, the result of calculating -x is that bits b<sub>i+1</sub>, &#8230;, b<sub>n</sub> get inverted, bit b<sub>i</sub> stays the same, and bits b<sub>i-1</sub>, &#8230;, b<sub>0</sub> are all 0&#8217;s.</p>
<p>Now, AND-ing x with -x makes bits b<sub>i+1</sub>, &#8230;, b<sub>n</sub> all 0, leaves bit b<sub>i</sub> as is, and sets bits b<sub>i-1</sub>, &#8230;, b<sub>0</sub> to 0. Only one bit is left, it&#8217;s the bit b<sub>i</sub> - the rightmost 1-bit.</p>
<p>2. There is no rightmost 1-bit. The value is 0. The negative of 0 in two&#8217;s complement is also 0. 0<font face="arial">&#038;</font>0 = 0. No bits get turned on.</p>
<p>We have proved rigorously that this bithack is correct.</p>
<p><strong>Bit Hack #8. Right propagate the rightmost 1-bit.</strong></p>
<pre>
y = x | (x-1)
</pre>
<p>This is best understood by an example. Given a value 01010000 it turns it into 01011111. All the 0-bits right to the rightmost 1-bit got turned into ones.</p>
<p>This is not a clean hack, tho, as it produces all 1&#8217;s if x = 0.</p>
<p>Let&#8217;s look at more examples:</p>
<pre>
    10111100  (x)
|   10111011  (x-1)
    --------
    10111111

    01110111  (x)
|   01110110  (x-1)
    --------
    01110111

    00000001  (x)
|   00000000  (x-1)
    --------
    00000001

    10000000  (x = -128)
|   01111111  (x-1 = 127)
    --------
    11111111

    11111111  (x = -1)
|   11111110  (x-1 = -2)
    --------
    11111111

    00000000  (x)
|   11111111  (x-1)
    --------
    11111111
</pre>
<p>Let&#8217;s prove it, though not as rigorously as in the previous bithack (as it&#8217;s too time consuming and this is not a scientific publication). There are two cases again. Let&#8217;s start with easiest first.</p>
<p>1. There is no rightmost 1-bit. In that case x = 0 and x-1 is -1. -1 in two&#8217;s complement is 11111111. OR-ing 0 with 11111111 produces the same 11111111. (Not the desired result, but that&#8217;s the way it is.)</p>
<p>2. There is the rightmost 1-bit b<sub>i</sub>. Let&#8217;s divide all the bits in two groups again (like in the previous example). Calculating x-1 modifies only bits to the right, turning b<sub>i</sub> into 0, and all the lower bits to 1&#8217;s. Now OR-ing x with x-1 leaves all the higher bits (to the left) the same, leaves bit b<sub>i</sub> as it was 1, and since lower bits are all low 1&#8217;s it also turns them on. The result is that the rightmost 1-bit got propagated to lower order bits.</p>
<p><strong>Bit Hack #9. Isolate the rightmost 0-bit.</strong></p>
<pre>
y = ~x &#038; (x+1)
</pre>
<p>This bithack does the opposite of #7. It finds the rightmost 0-bit, turns off all bits, and sets this bit to 1 in the result. For example, it finds the zero in bold in this number 10101<strong>0</strong>11, producing 00000100.</p>
<p>More examples:</p>
<pre>
    10111100  (x)
    --------
    01000011  (~x)
&#038;   10111101  (x+1)
    --------
    00000001

    01110111  (x)
    --------
    10001000  (~x)
&#038;   01111000  (x+1)
    --------
    00001000

    00000001  (x)
    --------
    11111110  (~x)
&#038;   00000010  (x+1)
    --------
    00000010

    10000000  (x = -128)
    --------
    01111111  (~x)
&#038;   10000001  (x+1)
    --------
    00000001

    11111111  (x = no rightmost 0-bit)
    --------
    00000000  (~x)
&#038;   00000000  (x+1)
    --------
    00000000

    00000000  (x)
    --------
    11111111  (~x)
&#038;   00000001  (x+1)
    --------
    00000001
</pre>
<p>Proof: Suppose there is a rightmost 0-bit. Then ~x turns this rightmost 0 bit into 1 bit. And so does x+1 (because bits more right to the rightmost 0 bit are 1&#8217;s). Now AND-ing ~x with x+1 evaporates all the bits up to this rightmost 0 bit. This is the highest order bit set in the result. Now what about lower order bits to the right of rightmost 0 bit? They also got evaporated because because x+1 turned them into 0&#8217;s (they were 1&#8217;s) and ~x turned them into 0&#8217;s. They got AND-ed with 0 and evaporated.</p>
<p><strong>Bit Hack #10. Turn on the rightmost 0-bit.</strong></p>
<pre>
y = x | (x+1)
</pre>
<p>This hack changes the rightmost 0-bit into 1. For example, given an integer 10100011 it turns it into 10100111.</p>
<p>More examples:</p>
<pre>
    10111100  (x)
|   10111101  (x+1)
    --------
    10111101

    01110111  (x)
|   01111000  (x+1)
    --------
    01111111

    00000001  (x)
|   00000010  (x+1)
    --------
    00000011

    10000000  (x = -128)
|   10000001  (x+1)
    --------
    10000001

    11111111  (x = no rightmost 0-bit)
|   00000000  (x+1)
    --------
    11111111

    00000000  (x)
|   00000001  (x+1)
    --------
    00000001
</pre>
<p>Here is the proof as a bunch of true statements. OR-ing x with x+1 does not lose any information. Adding 1 to x fills the first rightmost 0. The result is max{x, x+1}. If x+1 overflows it&#8217;s x and there were no 0 bits. If it doesn&#8217;t, it&#8217;s x+1 which just got rightmost bit filled with 1.</p>
<h2 style="margin-bottom:10px">Bonus stuff.</h2>
<p>If you decide to play more with these hacks, here are a few utility functions to print binary values of <strong>8 bit signed integers</strong> in Perl, Python and C.</p>
<p>Print binary representation in Perl:</p>
<pre>
sub int_to_bin {
  my $num = shift;
  print unpack "B8", pack "c", $num;
}
</pre>
<p>Or you can print it from command line right away:</p>
<pre>
perl -wle 'print unpack "B8", pack "c", shift' <integer>

# For example:
perl -wle &#39;print unpack &#34;B8&#34;, pack &#34;c&#34;, shift&#39; 113
01110001

perl -wle &#39;print unpack &#34;B8&#34;, pack &#34;c&#34;, shift&#39; &#45;&#45; -128
10000000
</pre>
<p>Print binary number in Python:</p>
<pre>
def int_to_bin(num, bits=8):
 r = ''
 while bits:
  r = ('1' if num&#038;1 else '0') + r
  bits = bits - 1
  num = num >> 1
 print r
</pre>
<p>Print binary representation in C:</p>
<pre>
void int_to_bin(int num) {
  char str[9] = {0};
  int i;
  for (i=7; i>=0; i--) {
    str[i] = (num&#038;1)?'1':'0';
    num >>= 1;
  }
  printf("%s\n", str);
}
</pre>
<p>Have fun with these! I&#8217;ll write about advanced bit hacks some time soon. If you are really intrigued by this topic I encourage you to <a href="http://www.catonmat.net/feed/">subscribe to my blog</a>. Thanks! <img src='http://www.catonmat.net/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>Ps. Let me know in the comments what you think about this article, and let me know if you do not know what two&#8217;s complement, or the basic binary operations are. If there are a few people who would like me to explain these concepts, I&#8217;ll be glad to write another article just about these fundamental topics.</p>
<p>Pps. There is a book entirely on bit hacks like these. It&#8217;s called &#8220;<a href="http://www.amazon.com/gp/product/0201914654?ie=UTF8&#038;tag=catonmat-20&#038;linkCode=as2&#038;camp=1789&#038;creative=9325&#038;creativeASIN=0201914654">Hacker&#8217;s Delight</a>&#8220;. It may be worth getting if you are into this stuff:</p>
<div class="center-aligner">
<iframe src="http://rcm.amazon.com/e/cm?t=catonmat-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=0201914654&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>
</div>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/catonmat?a=sM02mGRP-r0:A9oxEaA16dw:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/catonmat?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=sM02mGRP-r0:A9oxEaA16dw:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/catonmat?i=sM02mGRP-r0:A9oxEaA16dw:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=sM02mGRP-r0:A9oxEaA16dw:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/catonmat?i=sM02mGRP-r0:A9oxEaA16dw:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=sM02mGRP-r0:A9oxEaA16dw:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/catonmat?i=sM02mGRP-r0:A9oxEaA16dw:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/sM02mGRP-r0" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/low-level-bit-hacks-you-absolutely-must-know/</feedburner:origLink></item>
		<item>
		<title>Python Library for Searching Adwords</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/Gq_P2xFmDK8/</link>
		<comments>http://www.catonmat.net/blog/python-library-for-google-sponsored-links-search/#comments</comments>
		<pubDate>Wed, 15 Apr 2009 05:35:05 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>adwords</category><category>google</category><category>google search</category><category>python</category><category>search</category><category>sponsored links</category><category>xgoogle</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/python-library-for-google-sponsored-links-search/</guid>
		<description><![CDATA[Here is another quick hack that I wrote a while ago. It complements the xgoogle library that I published in my previous post with an API for Google Sponsored Links search.
Let me quickly explain why this library is useful, and what the Google Sponsored Links are.
For a typical search, Google shows regular web search results [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/google-python-search-library.jpg' alt='Google Python Search Library' class='post-icon' align='left' />Here is another quick hack that I wrote a while ago. It complements the <a href="http://www.catonmat.net/blog/python-library-for-google-search/">xgoogle library</a> that I published in my previous post with an API for <a href="http://www.google.com/sponsoredlinks">Google Sponsored Links</a> search.</p>
<p>Let me quickly explain why this library is useful, and what the Google Sponsored Links are.</p>
<p>For a typical search, Google shows regular web search results on the left side of the page, and &#8220;Sponsored Links&#8221; in a column on the right side. &#8220;Sponsored&#8221; means the results are pulled from Googe&#8217;s advertising network (<a href="https://adwords.google.com">Adwords</a>).</p>
<p>Here is a screenshot that illustrates the Sponsored Links:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/04/google-sponsored-links.jpg' alt='Google Sponsored Links' /><br />
<small>Google Sponsored Links results for search term &#8220;security&#8221; are <font color="red">in red</font>.</small>
</div>
<p>Okay, now why would I need a library to search the Sponsored results? Suppose that I am an advertiser on Adwords, and I buy some software related keywords like &#8220;video software&#8221;. It is in my interests to know my competitors, their advertisement text, what are they up to, the new players in this niche, and their websites. Without my library it would be practically impossible to keep track of all the competitors. There can literally be hundreds of changes per day. However, with my library it&#8217;s now piece of cake to keep track of all the dynamics.</p>
<h2 style="margin-bottom:10px">How does the library work?</h2>
<p>The sponsored links library pulls the results from this URL: <a href="http://www.google.com/sponsoredlinks">http://www.google.com/sponsoredlinks</a>. Here is an example of all the sponsored results for a query &#8220;security&#8221;:</p>
<div class="center-aligner">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/04/sponsored-links-security-full.jpg' title='Google Sponsored Links for Security Full'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/04/sponsored-links-security.jpg' alt='Google Sponsored Links for Security' /></a><br />
<small>Sponsored links results for &#8220;security&#8221;.</small>
</div>
<p>The library just grabs page after page, calls <a href="http://www.crummy.com/software/BeautifulSoup/">BeautifulSoup</a>, and extracts the search result elements. Elementary.</p>
<h2 style="margin-bottom:10px">How to use the library?</h2>
<p>As I mentioned, this library is part of my <a href="http://www.catonmat.net/blog/python-library-for-google-search/">xgoogle library</a>. Download and extract it first:</p>
<p>Download: <a href="http://www.catonmat.net/download/xgoogle.zip" title="Version 1.1 downloaded 1116 times" rel="enclosure">xgoogle library (.zip)</a><br />
Downloaded: 1116 times.<br />
Download url: http://www.catonmat.net/download/xgoogle.zip</p>
<p>Now, the source file that contains the implementation of this library is &#8220;xgoogle/sponsoredlinks.py&#8221;. To use it, do the usual import &#8220;from xgoogle.sponsoredlinks import SponsoredLinks, SLError&#8221;.</p>
<p><strong>SponsoredLinks</strong> is the class that provides the API and <strong>SLError</strong> is exception class that gets thrown in case of errors, so it&#8217;s a good idea to import both.</p>
<p>The SponsoredLinks has a similar interface as the <a href="http://www.catonmat.net/blog/python-library-for-google-search/">xgoogle.search</a> (the plain google search module). The constructor of SponsoredLinks takes the keyword you want to search for, and the constructed object has several public methods and properties:</p>
<ul>
<li><strong>method get_results()</strong> - gets a page of results, returning a list of SponsoredLink objects. It returns an empty list if there are no more results.</li>
<li><strong>property num_results</strong> - returns number of search results found.</li>
<li><strong>property results_per_page</strong> - sets/gets the number of results to get per page (max 100).</li>
</ul>
<p>The returned SponsoredLink objects have four attributes &#8212; &#8220;title&#8221;, &#8220;desc&#8221;, &#8220;url&#8221;, and &#8220;display_url&#8221;. Here is a picture that illustrates what each attribute stands for:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/04/sponsored-link-api.jpg' alt='Coresspondence between sponsored link and result object' />
</div>
<p>The picture does not show the &#8220;display_url&#8221; attribute as it&#8217;s the actual link the result links to (href of blue link in the pic).</p>
<p>Here is an example usage of this library. It retrieves first 100 Sponsored Links results for keyword &#8220;video software&#8221;:</p>
<pre>
from xgoogle.sponsoredlinks import SponsoredLinks, SLError
try:
  sl = SponsoredLinks("video software")
  sl.results_per_page = 100
  results = sl.get_results()
except SLError, e:
  print "Search failed: %s" % e

for result in results:
  print result.title.encode('utf8')
  print result.desc.encode('utf8')
  print result.display_url.encode('utf8')
  print result.url.encode('utf8')
  print
</pre>
<p>Output:</p>
<pre>
Photoshop Video Software
Time saving software for video. Work faster in Photoshop.
www.toolsfortelevision.com
http://www.toolsfortelevision.com

...
</pre>
<p>That&#8217;s about it for this time. Use it to find your competitors and outsmart them!</p>
<p>Next time I am going to expand the library for <a href="http://labs.google.com/sets">Google Sets</a> search.</p>
<div class="download">
<div class="download-title">Download &#8220;<strong>xgoogle</strong>&#8221; library:</div>
<p>Download: <a href="http://www.catonmat.net/download/xgoogle.zip" title="Version 1.1 downloaded 1116 times" rel="enclosure">xgoogle library (.zip)</a><br />
Downloaded: 1116 times.<br />
Download url: http://www.catonmat.net/download/xgoogle.zip</p>
</div>
<p>Have fun!</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/catonmat?a=Gq_P2xFmDK8:RRYDDmQkHkk:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/catonmat?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=Gq_P2xFmDK8:RRYDDmQkHkk:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/catonmat?i=Gq_P2xFmDK8:RRYDDmQkHkk:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=Gq_P2xFmDK8:RRYDDmQkHkk:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/catonmat?i=Gq_P2xFmDK8:RRYDDmQkHkk:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=Gq_P2xFmDK8:RRYDDmQkHkk:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/catonmat?i=Gq_P2xFmDK8:RRYDDmQkHkk:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/Gq_P2xFmDK8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/python-library-for-google-sponsored-links-search/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/python-library-for-google-sponsored-links-search/</feedburner:origLink></item>
		<item>
		<title>Python Library for Google Search</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/qodKls9RKm4/</link>
		<comments>http://www.catonmat.net/blog/python-library-for-google-search/#comments</comments>
		<pubDate>Thu, 12 Mar 2009 05:00:34 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>google</category><category>google fight</category><category>google search</category><category>python</category><category>search</category><category>xgoogle</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/python-library-for-google-search/</guid>
		<description><![CDATA[Here is a quick hack that I wrote. It&#8217;s a Python library to search Google without using their API. It&#8217;s quick and dirty, just the way I love it.
Why didn&#8217;t I use Google&#8217;s provided REST API? Because it says &#8220;you can only get up to 8 results in a single call and you can&#8217;t go [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/google-python-search-library.jpg' alt='Google Python Search Library' class='post-icon' align='left' />Here is a quick hack that I wrote. It&#8217;s a Python library to search Google without using their API. It&#8217;s <a href="http://www.catonmat.net/blog/difference-between-edsger-dijkstra-and-larry-wall/">quick and dirty</a>, just the way I love it.</p>
<p>Why didn&#8217;t I use Google&#8217;s provided <a href="http://googlesystem.blogspot.com/2008/04/google-search-rest-api.html">REST API</a>? Because it says &#8220;you can only get up to 8 results in a single call and you can&#8217;t go beyond the first 32 results&#8221;. Seriously, what am I gonna do with just 32 results?</p>
<p>I wrote it because I want to do various Google hacks automatically, monitor popularity of some keywords and sites, and to use it for various other reasons.</p>
<p>One of my next post is going to extend on this library and build a tool that perfects your English. I have been using Google for a while to find the correct use of various English idioms, phrases, and grammar. For example, &#8220;i am programmer&#8221; vs. &#8220;i am a programmer&#8221;. The first one is missing an indefinite article &#8220;a&#8221;, but the second is correct. Googling for these terms reveal that the first has 6,230 results, but the second has 136,000 results, so I pretty much trust that the 2nd is more correct than the first.</p>
<p>Subscribe to my posts via <a href="http://www.catonmat.net/feed/">catonmat&#8217;s rss</a>, if you are intrigued and would love to receive my posts automatically!</p>
<h2 style="margin-bottom:10px">How to use the library?</h2>
<p>First download the xgoogle library, and extract it somewhere.</p>
<p>Download: <a href="http://www.catonmat.net/download/xgoogle.zip" title="Version 1.1 downloaded 1116 times" rel="enclosure">xgoogle library (.zip)</a><br />
Downloaded: 1116 times.<br />
Download url: http://www.catonmat.net/download/xgoogle.zip</p>
<p>At the moment it contains just the code for Google search, but in the future I will add other searches (google sets, google suggest, etc).</p>
<p>To use the search, from &#8220;<strong>xgoogle.search</strong>&#8221; import &#8220;<strong>GoogleSearch</strong>&#8221; and, optionally, &#8220;<strong>SearchError</strong>&#8220;.</p>
<p>GoogleSearch is the class you will use to do Google searches. SearchError is an exception class that GoogleSearch throws in case of various errors.</p>
<p>Pass the keyword you want to search as the first parameter to GoogleSearch&#8217;s constructor. The constructed object has several public methods and properties:</p>
<ul>
<li><strong>method get_results()</strong> - gets a page of results, returning a list of SearchResult objects. It returns an empty list if there are no more results.</li>
<li><strong>property num_results</strong> - returns number of search results found.</li>
<li><strong>property results_per_page</strong> - sets/gets the number of results to get per page. Possible values are 10, 25, 50, 100.</li>
<li><strong>property page</strong> - sets/gets the search page.</li>
</ul>
<p>As I said, get_results() method returns a SearchResult object. It has three attributes &#8212; &#8220;title&#8221;, &#8220;desc&#8221;, and &#8220;url&#8221;. They are Unicode strings, so do a proper encoding before outputting them.</p>
<p>Here is a screenshot that illustrates the &#8220;title&#8221;, &#8220;desc&#8221;, and &#8220;url&#8221; attributes:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/04/google-result.jpg' alt='Google Search Result, url, title, description' /><br />
<small>Google search result for &#8220;catonmat&#8221;.</small>
</div>
<p>Here is an example program of doing a Google search. It takes the first argument, does a search on it, and prints the results:</p>
<pre>
from xgoogle.search import GoogleSearch, SearchError
try:
  gs = GoogleSearch(&#34;quick and dirty&#34;)
  gs.results_per_page = 50
  results = gs.get_results()
  for res in results:
    print res.title.encode(&#39;utf8&#39;)
    print res.desc.encode(&#39;utf8&#39;)
    print res.url.encode(&#39;utf8&#39;)
    print
except SearchError, e:
  print &#34;Search failed: %s&#34; % e
</pre>
<p>This code fragment sets up a search for &#8220;quick and dirty&#8221; and specifies that a result page should have 50 results. Then it calls get_results() to get a page of results. Finally it prints the title, description and url of each search result.</p>
<p>Here is the output from running this program:</p>
<pre>
Quick-and-dirty - Wikipedia, the free encyclopedia
Quick-and-dirty is a term used in reference to anything that is an easy way to implement a kludge. Its usage is popular among programmers, ...
http://en.wikipedia.org/wiki/Quick-and-dirty

Grammar Girl's Quick and Dirty Tips for Better Writing - Wikipedia ...
"Grammar Girl's Quick and Dirty Tips for Better Writing" is an educational podcast that was launched in July 2006 and the title of a print book that was ...Writing - 39k -
http://en.wikipedia.org/wiki/Grammar_Girl%27s_Quick_and_Dirty_Tips_for_Better_Writing

Quick &#038; Dirty Tips :: Grammar  Girl
Quick &#038; Dirty Tips(tm) and related trademarks appearing on this website are the property of Mignon Fogarty, Inc. and Holtzbrinck Publishers Holdings, LLC. ...
http://grammar.quickanddirtytips.com/
[...]
</pre>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/04/google-quick-and-dirty-search.jpg' alt='Quick and Dirty search on Google' /><br />
<small>Compare these results to the output above.</small>
</div>
<p>You could also have specified which search page to start the search from. For example, the following code will get 25 results per page and start the search at 2nd page.</p>
<pre>
  gs = GoogleSearch(&#34;quick and dirty&#34;)
  gs.results_per_page = 25
  gs.page = 2
  results = gs.get_results()
</pre>
<p>You can also quickly write a scraper to get all the results for a given search term:</p>
<pre>
from xgoogle.search import GoogleSearch, SearchError
try:
  gs = GoogleSearch(&#34;quantum mechanics&#34;)
  gs.results_per_page = 100
  results = []
  while True:
    tmp = gs.get_results()
    if not tmp: # no more results were found
      break
    results.extend(tmp)
  # ... do something with all the results ...
except SearchError, e:
  print "Search failed: %s" % e
</pre>
<p>You can use this library to constantly monitor how your website is ranking for a given search term. Suppose your website has a domain &#8220;catonmat.net&#8221; and the search term you want to find your position for is &#8220;python videos&#8221;.</p>
<p>Here is a code that outputs your ranking: (it looks through first 100 results, if you need more, put a loop there)</p>
<pre>
import re
from urlparse import urlparse
from xgoogle.search import GoogleSearch, SearchError

target_domain = &#34;catonmat.net&#34;
target_keyword = &#34;python videos&#34;

def mk_nice_domain(domain):
    &#34;&#34;&#34;
    convert domain into a nicer one (eg. www3.google.com into google.com)
    &#34;&#34;&#34;
    domain = re.sub(&#34;^www(\d+)?\.&#34;, &#34;&#34;, domain)
    # add more here
    return domain

gs = GoogleSearch(target_keyword)
gs.results_per_page = 100
results = gs.get_results()
for idx, res in enumerate(results):
  parsed = urlparse(res.url)
  domain = mk_nice_domain(parsed.netloc)
  if domain == target_domain:
    print &#34;Ranking position %d for keyword &#39;%s&#39; on domain %s&#34; % (idx+1, target_keyword, target_domain)
</pre>
<p>Output of this program:</p>
<pre>
Ranking position 6 for keyword python videos on domain catonmat.net
Ranking position 7 for keyword python videos on domain catonmat.net
</pre>
<p>Here is a much wicked example. It uses the GeoIP Python module to find all 10 websites for keyword &#8220;wicked code&#8221; that are physically hosting in California or New York in USA. Make sure you download GeoCityLite database from &#8220;http://www.maxmind.com/download/geoip/database/GeoLiteCity.dat.gz&#8221; and extract it to &#8220;/usr/local/geo_ip&#8221;.</p>
<pre>
import GeoIP
from urlparse import urlparse
from xgoogle.search import GoogleSearch, SearchError

class Geo(object):
  GEO_PATH = &#34;/usr/local/geo_ip/GeoLiteCity.dat&#34;

  def __init__(self):
    self.geo = GeoIP.open(Geo.GEO_PATH, GeoIP.GEOIP_STANDARD)

  def detect_by_host(self, host):
    try:
      gir = self.geo.record_by_name(host)
      return {&#39;country&#39;: gir[&#39;country_code&#39;].lower(),
              &#39;region&#39;: gir[&#39;region&#39;].lower()}
    except Exception, e:
      return {&#39;country&#39;: &#39;none&#39;, &#39;region&#39;: &#39;none&#39;}

dst_country = &#39;us&#39;
dst_states = [&#39;ca&#39;, &#39;ny&#39;]
dst_keyword = &#34;wicked code&#34;
num_results = 10
final_results = []
geo = Geo()

gs = GoogleSearch(dst_keyword)
gs.results_per_page = 100

seen_websites = []
while len(final_results) &lt; num_results:
  results = gs.get_results()
  domains = [urlparse(r.url).netloc for r in results]
  for d in domains:
    geo_loc = geo.detect_by_host(d)
    if (geo_loc[&#39;country&#39;] == dst_country and
                 geo_loc[&#39;region&#39;] in dst_states and
                 d not in seen_websites):
      final_results.append((d, geo_loc[&#39;region&#39;]))
      seen_websites.append(d)
      if len(final_results) == num_results:
        break

print &#34;Found %d websites:&#34; % len(final_results)
for w in final_results:
    print &#34;%s (state: %s)&#34; % w
</pre>
<p>Here is the output of running it:</p>
<pre>
Found 10 websites:
www.wickedcode.com (state: ca)
www.retailmenot.com (state: ca)
www.simplyhired.com (state: ca)
archdipesh.blogspot.com (state: ca)
wagnerblog.com (state: ca)
answers.yahoo.com (state: ca)
devsnippets.com (state: ca)
friendfeed.com (state: ca)
www.thedacs.com (state: ny)
www.tipsdotnet.com (state: ca)
</pre>
<p>You may modify these examples the way you wish. I&#8217;d love to hear some comments about what you can come up with!</p>
<p>And just for fun, here are some other simple uses:</p>
<p>You can make your own <a href="http://www.googlefight.com/">Google Fight</a>:</p>
<pre>
import sys
from xgoogle.search import GoogleSearch, SearchError

args = sys.argv[1:]
if len(args) &lt; 2:
 print &#39;Usage: google_fight.py &#34;keyword 1&#34; &#34;keyword 2&#34;&#39;
 sys.exit(1)

try:
  n0 = GoogleSearch(&#39;&#34;%s&#34;&#39; % args[0]).num_results
  n1 = GoogleSearch(&#39;&#34;%s&#34;&#39; % args[1]).num_results
except SearchError, e:
  print &#34;Google search failed: %s&#34; % e
  sys.exit(1)

if n0 &gt; n1:
  print &#34;%s wins with %d results! (%s had %d)&#34; % (args[0], n0, args[1], n1)
elif n1 &gt; n0:
  print &#34;%s wins with %d results! (%s had %d)&#34; % (args[1], n1, args[0], n0)
else:
  print &#34;It&#39;s a tie! Both keywords have %d results!&#34; % n1
</pre>
<p>Download: <a href="http://www.catonmat.net/download/google_fight.py" title="Version 1.0 downloaded 540 times" rel="enclosure">google_fight.py</a><br />
Downloaded: 540 times.<br />
Download url: http://www.catonmat.net/download/google_fight.py</p>
<p>Here is an example usage of google_fight.py:</p>
<pre>
$ ./google_fight.py google microsoft
google wins with 2680000000 results! (microsoft had 664000000)

$ ./google_fight.py "linux ubuntu" "linux gentoo"
linux ubuntu wins with 4300000 results! (linux gentoo had 863000)
</pre>
<p>After I wrote this, I generalized this Google Fight to take N keywords, and made their passing to program easier by allowing them to be separated by a comma.</p>
<pre>
import sys
from operator import itemgetter
from xgoogle.search import GoogleSearch, SearchError

args = sys.argv[1:]
if not args:
  print &#34;Usage: google_fight.py keyword one, keyword two, ...&#34;
  sys.exit(1)

keywords = [k.strip() for k in &#39; &#39;.join(args).split(&#39;,&#39;)]
try:
  results = [(k, GoogleSearch(&#39;&#34;%s&#34;&#39; % k).num_results) for k in keywords]
except SearchError, e:
  print &#34;Google search failed: %s&#34; % e
  sys.exit(1)

results.sort(key=itemgetter(1), reverse=True)
for res in results:
    print &#34;%s: %d&#34; % res
</pre>
<p>Download: <a href="http://www.catonmat.net/download/google_fight2.py" title="Version 1.0 downloaded 520 times" rel="enclosure">google_fight2.py</a><br />
Downloaded: 520 times.<br />
Download url: http://www.catonmat.net/download/google_fight2.py</p>
<p>Here is an example usage of google_fight2.py:</p>
<pre>
$ ./google_fight2.py earth atmospehere, sun atmosphere, moon atmosphere, jupiter atmosphere
earth atmospehere: 685000
jupiter atmosphere: 31400
sun atmosphere: 24900
moon atmosphere: 8130
</pre>
<p>I am going to expand on this library and add search for <a href="http://labs.google.com/sets">Google Sets</a>, <a href="http://www.google.com/sponsoredlinks">Google Sponsored Links</a>, <a href="http://www.google.com/webhp?complete=1&#038;hl=en">Google Suggest</a>, and perhaps some other Google searches. Then I&#8217;m going to build various tools on them, like a sponsored links competitor finder, use Google Suggest together with Google Sets to find various phrases in English, and apply them to tens of other my ideas.</p>
<div class="download">
<div class="download-title">Download &#8220;<strong>xgoogle</strong>&#8221; library and examples:</div>
<p>Download: <a href="http://www.catonmat.net/download/xgoogle.zip" title="Version 1.1 downloaded 1116 times" rel="enclosure">xgoogle library (.zip)</a><br />
Downloaded: 1116 times.<br />
Download url: http://www.catonmat.net/download/xgoogle.zip</p>
<p>Download: <a href="http://www.catonmat.net/download/google_fight.py" title="Version 1.0 downloaded 540 times" rel="enclosure">google_fight.py</a><br />
Downloaded: 540 times.<br />
Download url: http://www.catonmat.net/download/google_fight.py</p>
<p>Download: <a href="http://www.catonmat.net/download/google_fight2.py" title="Version 1.0 downloaded 520 times" rel="enclosure">google_fight2.py</a><br />
Downloaded: 520 times.<br />
Download url: http://www.catonmat.net/download/google_fight2.py
</p></div>
<div class="ads-aad">
<script type="text/javascript"><!--
google_ad_client = "pub-0433925538517663";
google_ad_width = 468;
google_ad_height = 60;
google_ad_format = "468x60_as";
google_ad_type = "text_image";
//2007-07-12: catonmat after download
google_ad_channel = "9549065335";
google_color_border = "FFFFFF";
google_color_bg = "FFFFFF";
google_color_link = "753206";
google_color_text = "4C4C4C";
google_color_url = "E6E6E6";
//-->
</script>
<script type="text/javascript"
  src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
</div>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/catonmat?a=qodKls9RKm4:xdGEJyrEhCs:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/catonmat?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=qodKls9RKm4:xdGEJyrEhCs:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/catonmat?i=qodKls9RKm4:xdGEJyrEhCs:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=qodKls9RKm4:xdGEJyrEhCs:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/catonmat?i=qodKls9RKm4:xdGEJyrEhCs:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=qodKls9RKm4:xdGEJyrEhCs:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/catonmat?i=qodKls9RKm4:xdGEJyrEhCs:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/qodKls9RKm4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/python-library-for-google-search/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/python-library-for-google-search/</feedburner:origLink></item>
		<item>
		<title>MIT’s Introduction to Algorithms, Lectures 20 and 21: Parallel Algorithms</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/7WddiuzcDU4/</link>
		<comments>http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-thirteen/#comments</comments>
		<pubDate>Sat, 07 Mar 2009 11:00:26 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Introduction to Algorithms]]></category>
<category>algorithms</category><category>cilk</category><category>cilkchess</category><category>dynamic multithreading</category><category>fibonacci</category><category>greedy scheduler</category><category>matrix multiplication</category><category>merge sort</category><category>mit</category><category>multithreaded algorithms</category><category>multithreaded computation</category><category>multithreaded programming</category><category>multithreading</category><category>parallel algorithms</category><category>parallelism</category><category>scheduling</category><category>socrates</category><category>speedup</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-thirteen/</guid>
		<description><![CDATA[This is the thirteenth post in an article series about MIT&#8217;s lecture course &#8220;Introduction to Algorithms.&#8221; In this post I will review lectures twenty and twenty-one on parallel algorithms. These lectures cover the basics of multithreaded programming and multithreaded algorithms.
Lecture twenty begins with a good overview of multithreaded programming paradigm, introduces to various concepts of [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2008/08/post-icon.jpg' alt='MIT Algorithms' class="post-icon" align="left" />This is the thirteenth post in an article series about MIT&#8217;s lecture course &#8220;<strong>Introduction to Algorithms</strong>.&#8221; In this post I will review lectures twenty and twenty-one on <strong>parallel algorithms</strong>. These lectures cover the basics of <strong>multithreaded programming</strong> and <strong>multithreaded algorithms</strong>.</p>
<p>Lecture twenty begins with a good overview of multithreaded programming paradigm, introduces to various concepts of parallel programming and at the end talks about <a href="http://supertech.csail.mit.edu/cilk/">Cilk</a> programming language.</p>
<p>Lecture twenty-one implements several multithreaded algorithms, such as n<small>x</small>n matrix addition, n<small>x</small>n matrix multiplication, and parallel merge-sort. It then goes into great detail to analyze the running time and parallelism of these algorithms.</p>
<h2 style="margin-bottom:10px">Lecture 20: Parallel Algorithms I</h2>
<p>The lecture starts with an introduction to a particular model of parallel computation called <strong>dynamic multithreading</strong>. It&#8217;s easiest to understand this model with an example. Here is a multithreaded algorithm that computes Fibonacci numbers:</p>
<pre>
Fibonacci(n):
1   if n < 2:                     | thread A
2     return n                    | thread A
3   x = <strong>spawn</strong> Fibonacci(n-1)      | thread A
4   y = <strong>spawn</strong> Fibonacci(n-2)      | thread B
5   <strong>sync</strong>                          | thread C
6   return x + y                  | thread C

# this is a really bad algorithm, because it&#8217;s
# exponential time.
#
# see <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-two/">lecture three</a> for four different
# classical algorithms that compute Fibonacci numbers
</pre>
<p>(A thread is defined to be a maximal sequence of instructions not containing the parallel control statements spawn, sync, and return, not something like Java threads or Posix threads.)</p>
<p>I put the most important keywords in bold. They are &#8220;<strong>spawn</strong>&#8221; and &#8220;<strong>sync</strong>&#8220;.</p>
<p>The keyword <strong>spawn</strong> is the parallel analog of an ordinary subroutine call. Spawn before the subroutine call in line 3 indicates that the subprocedure Fibonacci(n-1) can execute in parallel with the procedure Fibonacci(n) itself. Unlike an ordinary function call, where the parent is not resumed until after its child returns, in the case of a spawn, the parent can continue to execute in parallel with the child. In this case, the parent goes on to spawn Fibonacci(n-2). In general, the parent can continue to spawn off children, producing a high degree of parallelism.</p>
<p>A procedure cannot safely use the return values of the children it has spawned until it executes a <strong>sync</strong> statement. If any of its children have not completed when it executes a sync, the procedure suspends and does not resume until all of  its children have completed. When all of  its children return, execution of the procedure resumes at the point immediately following the sync statement. In the Fibonacci example, the sync statement in line 5 is required before the return statement in line 6 to avoid the anomaly that would occur if x and y were summed before each had been computed. </p>
<p>The <strong>spawn</strong> and <strong>sync</strong> keywords specify logical parallelism, not &#8220;actual&#8221; parallelism. That is, these keywords indicate which code may possibly execute in parallel, but what actually runs in parallel is determined by a <strong>scheduler</strong>, which maps the dynamically unfolding computation onto the available processors.</p>
<p>The lecture then continues with a discussion on how a parallel instruction stream can be viewed as a directed acyclic graph G = (V, E), where threads make up the set V of vertices, and each spawn and return statement makes up an edge in E. There are also continuation edges in E that exist between statements.</p>
<p>In Fibonacci example there are three threads: lines 1, 2, 3 make up the thread A, line 4 is thread B and lines 5, 6 are thread C.</p>
<p>Here is how a graph of Fibonacci(4) computation looks like:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/multithreading-dag.jpg' alt='Multithreading Directed Acyclic Graph' /></p>
<p><small>A dag representing multithreaded computation of Fibonacci(4).</small>
</div>
<p>Here the threads are show as circles, and each group of threads belonging to the same procedure are surrounded by a rounded rectangle. Downward edges are spawn dependencies, horizontal edges represent continuation dependencies within a procedure, and upward edges are return dependencies. Every computation starts with a single initial thread, and ends with a single final thread.</p>
<p>So to compute Fib(4), first Fib(3) and Fib(2) have to be computed. Fib(4) spawns one thread to compute Fib(3) and another thread to compute Fib(2) and sync&#8217;es. In order to compute Fib(3), values of Fib(2) and Fib(1) have to be known, and to compute Fib(2) values of Fib(1) and Fib(0) have to be known. Fib(3) and Fib(2) spawn their threads and sync. This way the computation unwinds.</p>
<p>Next, the lecture introduces some concepts for measuring performance of multithreaded algorithms.</p>
<p>First some time measures:</p>
<ul>
<li>T<sub>p</sub> - running time of an algorithm on <i>p</i> processors,</li>
<li>T<sub>1</sub> - running time of algorithm on 1 processor, and</li>
<li>T<sub>∞</sub> as critical path length, that is the longest time to execute the algorithm on infinite number of processors.</li>
</ul>
<p>Based on these measures, the <strong>speedup</strong> of a computation on p processors is T<sub>1</sub>/T<sub>p</sub>, that is how much faster a p processor machine executes the algorithm than a one processor machine. If the speedup is O(p), then we say that the computation has a linear speedup. If, however, speedup > p, then we call it a super-linear speedup. The maximum possible speedup is called <strong>parallelism</strong> and is computed by T<sub>1</sub>/T<sub>∞</sub>.</p>
<p>After these concepts the lecture puts forward the concept of <strong>scheduling</strong> and <strong>greedy scheduler</strong>.</p>
<p>A scheduler maps computation to p processors.</p>
<p>The greedy scheduler schedules as much as it can at every time step. On a p-processor machine every step can be classified into two types. If there are p or more threads ready to execute, the scheduler executes any p of them (this step is also called a complete step). If there are fewer than p threads, it executes all of them (called incomplete step). The greedy scheduler is an offline scheduler and it&#8217;s 2-competitive (see <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-nine/">lecture 13</a> on online/offline algorithms and the meaning of competitive algorithm).</p>
<p>The lecture ends with an introduction to Cilk parallel, multithreaded programming language. It uses a randomized online scheduler. It was used to write two chess programs *Socrates and Cilkchess.</p>
<p>See Charles Leiseron&#8217;s handout for a much more comprehensive overview of dynamic multithreading: <a href='http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-thirteen/a-minicourse-on-dynamic-multithreaded-algorithms-pdf/' rel='attachment wp-att-442' title='A Minicourse on Dynamic Multithreaded Algorithms (pdf)'>A Minicourse on Dynamic Multithreaded Algorithms (pdf)</a>.</p>
<p>You&#8217;re welcome to watch lecture twenty:</p>
<div class="center-aligner">
<embed id="VideoPlayback" src="http://video.google.com/googleplayer.swf?docid=-6296121489758572721&#038;hl=en&#038;fs=true&#038;subtitle=off" style="width:400px;height:326px" allowFullScreen="true" allowScriptAccess="always" type="application/x-shockwave-flash"> </embed></p>
<p>Direct URL: <a href="http://video.google.com/videoplay?docid=-6296121489758572721">http://video.google.com/videoplay?docid=-6296121489758572721</a>
</div>
<p>Topics covered in lecture twenty:</p>
<ul>
<li>[01:33] Parallel algorithms.</li>
<li>[02:33] Dynamic multi-threading.</li>
<li>[03:15] A multithreaded algorithm for fibonnaci numbers.</li>
<li>[04:50] Explanation of spawn and sync.</li>
<li>[06:45] Logical parallelism.</li>
<li>[08:00] Multithreaded computation.</li>
<li>[12:15] Fibonacci(4) computation graph, and its concepts - init thread, spawn edge, continuation edge, return edge, final thread.</li>
<li>[17:45] Edges: spawn, return, continuation.</li>
<li>[19:40] Performance measures: running time on p processors T<sub>p</sub>, work (serial time) T<sub>1</sub>, critcial path length (longest path) T<sub>infinity</sub>.</li>
<li>[24:15] Lower bounds on T<sub>p</sub> (running time on p processors).</li>
<li>[29:35] Speedup, linear speedup, superlinear speedup.</li>
<li>[32:55] Maximum possible speedup.</li>
<li>[36:40] Scheduling.</li>
<li>[38:55] Greedy scheduler.</li>
<li>[43:40] Theorem by Grant and Brent (bound on T<sub>p</sub> for greedy scheduler).</li>
<li>[46:20] Proof of Grant, Brent theorem.</li>
<li>[56:50] Corollary on greedy scheduler&#8217;s linear speedup.</li>
<li>[01:02:20] Cilk programming language.</li>
<li>[01:06:00] Chess programs: *Socrates, Clikchess.</li>
<li>[01:06:30] Analysis of *Socrates running time on 512 processors.</li>
</ul>
<p>Lecture twenty notes:</p>
<div class="center-aligner">
<table>
<tr>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-20-01.jpg' title='MIT Algorithms Lecture 20 Notes Thumbnail. Page 1 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-20-01-thumb.jpg' alt='MIT Algorithms Lecture 20 Notes Thumbnail. Page 1 of 2.' /></a><br />
<small>Lecture 20, page 1 of 2: dynamic multithreading, multithreaded computation, performance measures, speedup, scheduling, greedy scheduler.</small>
</td>
<td width="10"></td>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-20-02.jpg' title='MIT Algorithms Lecture 20 Notes Thumbnail. Page 2 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-20-02-thumb.jpg' alt='MIT Algorithms Lecture 20 Notes Thumbnail. Page 2 of 2.' /></a><br />
<small>Lecture 20, page 2 of 2: grant, brent theorem, cilk, socrates, cilkchess.</small>
</td>
</tr>
</table>
</div>
<h2 style="margin-bottom:10px">Lecture 21: Parallel Algorithms II</h2>
<p>Lecture twenty-one is all about real multithreaded algorithms and their analysis.</p>
<p>It starts with a hugly important problem of multithreaded matrix multiplication: given two n<small>x</small>n matrices A and B, write a multithreaded algorithm Matrix-Multiply that multiplies them, producing matrix C. The natural approach is to use divide and conquer method from <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-two/">lecture three</a>, and formulate the problem as follows: </p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/multithreaded-matrix-multiplication.jpg' alt='Multithreaded Matrix Multiplication' />
</div>
<p>Each n<small>x</small>n matrix can be expressed as 8 multiplications and 4 additions of (n/2)<small>x</small>(n/2) submatrices. So the first thing we need is Matrix-Add algorithm that adds two matrices in parallel.</p>
<p>Here is the parallel Matrix-Add algorithm:</p>
<pre>
Matrix-Add(C, T, n):
   // Adds matrices C and T in-place, producing C = C + T
   // n is power of 2 (for simplicity).
   if  n == 1:
     C[1, 1] = C[1, 1] + T[1, 1]
   else:
     partition C and T into (n/2)<small>x</small>(n/2) submatrices
     spawn Matrix-Add(C<sub>11</sub>, T<sub>11</sub>, n/2)
     spawn Matrix-Add(C<sub>12</sub>, T<sub>12</sub>, n/2)
     spawn Matrix-Add(C<sub>21</sub>, T<sub>21</sub>, n/2)
     spawn Matrix-Add(C<sub>22</sub>, T<sub>22</sub>, n/2)
     sync
</pre>
<p>The base case for this algorithm is when matrices are just scalars, then it just adds the only element of both matrices. In all other cases, it partitions matrices C and T into (n/2)<small>x</small>(n/2) matrices and spawns four subproblems.</p>
<p>Now we can implement the Matrix-Multiply algorithm, by doing 8 multiplications in parallel and then calling Matrix-Add to do 4 additions.</p>
<p>Here is the parallel Matrix-Multiply algorithm:</p>
<pre>
Matrix-Multiply(C, A, B, n):
   // Multiplies matrices A and B, storing the result in C.
   // n is power of 2 (for simplicity).
   if  n == 1:
     C[1, 1] = A[1, 1] · B[1, 1]
   else:
     allocate a temporary matrix T[1...n, 1...n]
     partition A, B, C, and T into (n/2)<small>x</small>(n/2) submatrices
     spawn Matrix-Multiply(C<sub>12</sub>,A<sub>11</sub>,B<sub>12</sub>, n/2)
     spawn Matrix-Multiply(C<sub>21</sub>,A<sub>21</sub>,B<sub>11</sub>, n/2)
     spawn Matrix-Multiply(C<sub>22</sub>,A<sub>21</sub>,B<sub>12</sub>, n/2)
     spawn Matrix-Multiply(T<sub>11</sub>,A<sub>12</sub>,B<sub>21</sub>, n/2)
     spawn Matrix-Multiply(T<sub>12</sub>,A<sub>12</sub>,B<sub>22</sub>, n/2)
     spawn Matrix-Multiply(T<sub>21</sub>,A<sub>22</sub>,B<sub>21</sub>, n/2)
     spawn Matrix-Multiply(T<sub>22</sub>,A<sub>22</sub>,B<sub>22</sub>, n/2)
     sync
     Matrix-Add(C, T, n)
</pre>
<p>The lecture then proceeds to calculating how good the algorithm is, turns out the parallelism (ratio of time of algorithm running on a single processor T<sub>1</sub> to running on infinite number of processors T<sub>inf</sub>) for doing 1000&#215;1000 matrix multiplication is 10<sup>7</sup>, which the parallel algorithm is fast as hell.</p>
<p>Matrix-Multiply used a temporary matrix which was actually unnecessary. To achieve higher performance, it&#8217;s often advantageous for an algorithm to use less space, because more space means more time. Therefore a faster algorithm can be added that multiplies matrices and adds them in-place. See lecture at 33:05 for Matrix-Multiply-Add algorithm. The parallelism of this algorithm for 1000&#215;1000 matrix multiplication is 10<sup>6</sup>, smaller than the previous one, but due to fact that no temporary matrices had to be used, it beats Matrix-Multiply algorithm in practice.</p>
<p>The lecture then proceeds to parallel sorting. The classical sorting algorithms were covered in <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-three/">lectures four and five</a>. This lecture parallelizes the Merge-Sort algorithm.</p>
<p>Here is the parallel merge-sort algorithm:</p>
<pre>
Parallel-Merge-Sort(A, p, r): // sort A[p...r]
  if p < r:
    q = floor((p+r)/2)
    spawn Parallel-Merge-Sort(A, p, q)
    spawn Parallel-Merge-Sort(A, q+1, r)
    sync
    Merge(A, p, q, r) // merge A[p...q] with A[q+1...r]
</pre>
<p>Instead of recursively calling Parallel-Merge-Sort for subproblems, we spawn them, thus utilizing the power of parallelism. After doing analysis turns out that due to fact that we use the classical Merge() function, the parallelism is just lg(n). That&#8217;s puny!</p>
<p>To solve this problem, a Parallel-Merge algorithm has to be written. The lecture continues with presenting a Parallel-Merge algorithm, and after that does the analysis of the same Parallel-Merge-Sort algorithm but with Parallel-Merge function. Turns out this algorithm has parallelism of n/lg<sup>2</sup>(n), that is much better. See lecture at 51:00 for Parallel-Merge algorithm.</p>
<p>The best parallel sorting algorithm know to date has the parallelism of n/lg(n).</p>
<p>You&#8217;re welcome to watch lecture twenty-one:</p>
<div class="center-aligner">
<embed id="VideoPlayback" src="http://video.google.com/googleplayer.swf?docid=-1803812673760482155&#038;hl=en&#038;fs=true&#038;subtitle=off" style="width:400px;height:326px" allowFullScreen="true" allowScriptAccess="always" type="application/x-shockwave-flash"> </embed></p>
<p>Direct URL: <a href="http://video.google.com/videoplay?docid=-1803812673760482155">http://video.google.com/videoplay?docid=-1803812673760482155</a>
</div>
<p>Topics covered in lecture twenty-one:</p>
<ul>
<li>[00:35] Multithreaded algorithms.</li>
<li>[01:32] Multithreaded matrix multiplication.</li>
<li>[02:05] Parallelizing with divide and conquer.</li>
<li>[04:30] Algorithm Matrix-Multiply, that computes C = A*B.</li>
<li>[10:13] Algorithm Matrix-Add, that computes C = C+T.</li>
<li>[12:45] Analysis of running time of Matrix-Multiply and Matrix-Add.</li>
<li>[19:40] Analysis of critical path length of Matrix-Multiply and Matrix-Add.</li>
<li>[26:10] Parallelism of Matrix-Multiply and Matrix-Add.</li>
<li>[33:05] Algorithm Matrix-Multiply-Add, that computes C = C + A*B.</li>
<li>[35:50] Analysis of Matrix-Multiply-Add.</li>
<li>[42:30] Multithreaded, parallel sorting.</li>
<li>[43:30] Parallelized Merge-Sort algorithm.</li>
<li>[45:55] Analysis of multithreaded Merge-Sort.</li>
<li>[51:00] Parallel-Merge algorithm.</li>
<li>[01:00:30] Analysis of Parallel-Merge algorithm.</li>
</ul>
<p>Lecture twenty-one notes:</p>
<div class="center-aligner">
<table>
<tr>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-21-01.jpg' title='MIT Algorithms Lecture 21 Notes Thumbnail. Page 1 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-21-01-thumb.jpg' alt='MIT Algorithms Lecture 21 Notes Thumbnail. Page 1 of 2.' /></a><br />
<small>Lecture 21, page 1 of 2: multithreaded algorithms, matrix multiplication, analysis (critical path length, parallelism).</small>
</td>
<td width="10"></td>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-21-02.jpg' title='MIT Algorithms Lecture 21 Notes Thumbnail. Page 2 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/03/mit-algorithms-lecture-21-02-thumb.jpg' alt='MIT Algorithms Lecture 21 Notes Thumbnail. Page 2 of 2.' /></a><br />
<small>Lecture 21, page 2 of 2: paralel sorting, merge sort, parallel merge, analysis.</small>
</td>
</tr>
</table>
</div>
<p>Have fun with parallel, multithreaded algorithms! The next post is going to be on cache oblivious algorithms! They are a class of algorithms that exploit all the various caches in modern computer architecture to make them run notably faster.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/catonmat?a=7WddiuzcDU4:yH49dYDC9Yw:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/catonmat?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=7WddiuzcDU4:yH49dYDC9Yw:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/catonmat?i=7WddiuzcDU4:yH49dYDC9Yw:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=7WddiuzcDU4:yH49dYDC9Yw:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/catonmat?i=7WddiuzcDU4:yH49dYDC9Yw:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=7WddiuzcDU4:yH49dYDC9Yw:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/catonmat?i=7WddiuzcDU4:yH49dYDC9Yw:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/7WddiuzcDU4" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-thirteen/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-thirteen/</feedburner:origLink></item>
		<item>
		<title>JavaScript: The Good Parts</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/2vRRPg1P-vw/</link>
		<comments>http://www.catonmat.net/blog/javascript-the-good-parts/#comments</comments>
		<pubDate>Sat, 28 Feb 2009 06:12:43 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Video Lectures]]></category>
<category>dom</category><category>douglas crockford</category><category>global variables</category><category>inheritance</category><category>java</category><category>javascript</category><category>js</category><category>jslint</category><category>jsmin</category><category>json</category><category>null</category><category>objects</category><category>perl</category><category>scheme</category><category>typeof</category><category>undefined</category><category>video lecture</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/javascript-the-good-parts/</guid>
		<description><![CDATA[I found a really nice video lecture on JavaScript. I&#8217;m a JavaScript journeyman myself so I decided to review it to learn something new. I have written about JavaScript in the past &#8212; the previous post on JavaScript was &#8220;Learning JavaScript through Video Lectures&#8220;.
This lecture is given by JavaScript guru Doug Crockford. He&#8217;s the author [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/javascript-the-good-parts.jpg' alt='JavaScript: The Good Parts' class='post-icon' align='left' />I found a really nice video lecture on JavaScript. I&#8217;m a JavaScript journeyman myself so I decided to review it to learn something new. I have written about JavaScript in the past &#8212; the previous post on JavaScript was &#8220;<a href="http://www.catonmat.net/blog/learning-javascript-programming-language-through-video-lectures/">Learning JavaScript through Video Lectures</a>&#8220;.</p>
<p>This lecture is given by JavaScript guru <a href="http://crockford.com/">Doug Crockford</a>. He&#8217;s the author of <a href="http://www.json.org/">JSON</a>, <a href="http://www.crockford.com/javascript/jsmin.html">JSMin</a> JavaScript minifier and <a href="http://www.jslint.com/">JSLint</a>. </p>
<p>The talk is based on Douglas&#8217;s recent book &#8220;<a href="http://www.amazon.com/JavaScript-Good-Parts-Douglas-Crockford/dp/0596517742/freesciencand-20">JavaScript: The Good Parts</a>&#8220;. It&#8217;s excellent.</p>
<p>The lecture begins with a brief intro of why JavaScript is the way it is and how it came to be so. An interesting point Doug makes is that JavaScript is basically Scheme, except with Java syntax. He talks about the bad and good parts of JavaScript and gives a few examples of common JavaScript problems, their solutions and patterns. Douglas also talks about how he discovered JSON. After the talk there is a 13 minute Q and A.</p>
<p>You&#8217;re welcome to watch the video lecture. It&#8217;s 1 hour long and it will definitely make your understanding of JavaScript better:</p>
<div class="center-aligner">
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/hQVTIJBZook&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/hQVTIJBZook&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=hQVTIJBZook">http://www.youtube.com/watch?v=hQVTIJBZook</a>
</div>
<p>Here are some interesting points from the lecture:</p>
<p>[09:57] JavaScript was influenced by Scheme, Java and Perl. From Scheme it borrowed lambda functions and loose typing. From Java it took most of the syntax, and from Perl some of its regular expressions. And finally, it derived the idea of prototypal inheritance and dynamic objects from <a href="http://en.wikipedia.org/wiki/Self_(programming_language)">Self</a>. See my <a href="http://www.catonmat.net/blog/learning-javascript-programming-language-through-video-lectures/">previous post on JavaScript</a> for explanation of prototypal inheritance.</p>
<p>[11:38] JavaScript bad parts:</p>
<ul>
<li><strong>Global variables</strong>. Global variables make it harder to run independent subprograms in the same program. If the subprograms happen to have global variables that share the same names, then they will interfere with each other and likely fail, usually in difficult to diagnose ways.</li>
<li><strong>Newlines get converted into semicolons</strong>. JavaScript has a mechanism that tries to correct faulty programs by automatically inserting semicolons. It sometimes inserts semicolons in places where they are not welcome. For example:
<pre>
return
{
    status: true
};
</pre>
<p>returns undefined because JavaScript inserts &#8216;;&#8217; after return.</p>
<p>Correct way:</p>
<pre>
return {
    status: true
};
</pre>
</li>
<li><strong>Operator typeof is not very helpful</strong>. For example, &#8220;typeof null&#8221; is &#8220;object&#8221;, &#8220;typeof [1,2,3]&#8221; is also &#8220;object&#8221;.</li>
<li><strong>Operator + adds numbers and concatenates</strong>. The + operator can add or concatenate. Which one it does depends on the types of the parameters. If either operand is an empty string, it produces the other operand converted to a string. If both operands are numbers, it produces the sum. Otherwise, it converts both operands to strings and concatenates them. This complicated behavior is a common source of bugs. If you intend + to add, make sure that both operands are numbers.</li>
<li><strong>Operators == and != do type coercion</strong>. Use of the === and !== operators is always preferred.</li>
<li><strong>Too many falsy values</strong>. 0, Nan, &#8221; (empty string), false, null, undefined are all false.</li>
</ul>
<p>[17:25] for &#8230; in operator mixes inherited functions with the desired data members (it does deep reflection). Use object.hasOwnProperty to filter out names that belong to object itself:</p>
<pre>
for (name in object) {
    if (object.hasOwnProperty(name)) {
        ...
    }
}
</pre>
<p>[22:00] Javascript good parts:</p>
<ul>
<li><strong>Lambda</strong>. Enough said.</li>
<li><strong>Dynamic objects</strong>. Just add a property to an object, or remove it, no need for classes and derivation to create a similar object.</li>
<li><strong>Loose Typing</strong>. No need for type declarations.</li>
<li><strong>Object Literals</strong>. {} for object literals, [] for array literals and // for regexp literals.</li>
</ul>
<p>[23:00] Two schools of inheritance - classical and prototypal. Prototypal inheritance means that objects can inherit properties directly from other objects. The language is class-free.</p>
<p>[24:35] Realization of prototypal inheritance in JavaScript:</p>
<pre>
if (typeof Object.create !== 'function') {
    Object.create = function (o) {
        function F() {}
        F.prototype = o;
        return new F();
    };
}
</pre>
<p>Now a newObject can be created by inheriting from oldObject:</p>
<pre>
newObject = Object.create(oldObject);
</pre>
<p>[26:05] Example of global variables and why they are bad and how to solve the problem by using closures.</p>
<pre>
var my_thing = function () {
    var names = ["zero", "one", ... ];
    return function(n) {
        return names[n];
    };
}();
</pre>
<p>[29:00] There are four ways to create a new object in JavaScript:</p>
<ul>
<li>Object literal &#8212; var object = { }.</li>
<li>New &#8212; var object = new Object()</li>
<li>Object.create &#8212; var object = Object.create(old_object).</li>
<li>Call another constructor (use different inheritance model).</li>
</ul>
<p>[42:42] <a href="http://www.jslint.com">JSLint</a>. JSLint defines a professional subset of JavaScript. JSLint will hurt your feelings.</p>
<p>[52:00] Q/A: Does strict mode change the behavior or does it take things out? &#8212; Can&#8217;t have &#8220;with&#8221; in strict mode, changes the way eval() works, changes error handling.</p>
<p>[53:00] Q/A: What&#8217;s going on with DOM? &#8212; Ajax libraries fix DOM, these changes should be propagated back into DOM API.</p>
<p>[55:30] Q/A: How do you spell lambda in JavaScript? &#8212; function.</p>
<p>[55:54] Q/A: How to solve global variable problem? &#8212; Each of compilation unit should be isolated, but there should be a way how they can introduce (link) to each other.</p>
<p>[56:30] Q/A: How do JavaScript objects differ from hash tables, they seem the same to me?</p>
<p>[57:23] Q/A: What&#8217;s wrong with HTML 5 and web apps? &#8212; They are doing too much and there are way too many of them.</p>
<p>[59:10] Q/A: How come JSON and JavaScript have almost the same syntax? &#8212; Doug admits he forgot to include Unicode Line Separator (LS) and Paragraph Separator (PS) control codes as whitespace chars.</p>
<p>[01:00:32] Q/A: Why does JSON require quotes around the property names? &#8212; Three reasons: 1. Wanted to align with Python where quotes are required, 2. Wanted to make the grammar of standard simpler, 3. JavaScript has stupid reserved word policy.</p>
<p>[01:02:40] Q/A: Are there any prospects for adding concurrency to the language? &#8212; Definitely no threads. Could be some kind of a messaging model.</p>
<p>If you liked this talk, I recommend that you get Doug&#8217;s book:</p>
<div class="center-aligner">
<iframe src="http://rcm.amazon.com/e/cm?t=freesciencand-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=0596517742&#038;md=10FE9736YVPPT7A0FBG2&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>
</div>
<p>Happy javascripting! <img src='http://www.catonmat.net/blog/wp-includes/images/smilies/icon_wink.gif' alt=';)' class='wp-smiley' /></p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~ff/catonmat?a=2vRRPg1P-vw:ZJGkbhrYPRA:yIl2AUoC8zA"><img src="http://feeds.feedburner.com/~ff/catonmat?d=yIl2AUoC8zA" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=2vRRPg1P-vw:ZJGkbhrYPRA:F7zBnMyn0Lo"><img src="http://feeds.feedburner.com/~ff/catonmat?i=2vRRPg1P-vw:ZJGkbhrYPRA:F7zBnMyn0Lo" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=2vRRPg1P-vw:ZJGkbhrYPRA:V_sGLiPBpWU"><img src="http://feeds.feedburner.com/~ff/catonmat?i=2vRRPg1P-vw:ZJGkbhrYPRA:V_sGLiPBpWU" border="0"></img></a> <a href="http://feeds.feedburner.com/~ff/catonmat?a=2vRRPg1P-vw:ZJGkbhrYPRA:gIN9vFwOqvQ"><img src="http://feeds.feedburner.com/~ff/catonmat?i=2vRRPg1P-vw:ZJGkbhrYPRA:gIN9vFwOqvQ" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/2vRRPg1P-vw" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/javascript-the-good-parts/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/javascript-the-good-parts/</feedburner:origLink></item>
		<item>
		<title>Famous Perl One-Liners Explained, Part I</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/QhmmiUvBEaY/</link>
		<comments>http://www.catonmat.net/blog/perl-one-liners-explained-part-one/#comments</comments>
		<pubDate>Wed, 25 Feb 2009 03:30:32 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>begin</category><category>eric pement</category><category>file spacing</category><category>line spacing</category><category>one liner</category><category>one liners</category><category>perl</category><category>perl1line.txt</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/perl-one-liners-explained-part-one/</guid>
		<description><![CDATA[Hi all! I am starting yet another article series here. Remember my two articles on Famous Awk One-Liners Explained and Famous Sed One-Liners Explained? They have received more than 150,000 views total and they attract a few thousand new visitors every week. Inspired by their success, I am going to create my own perl1line.txt file [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/perl-one-liners.jpg' alt='Perl One Liners' class='post-icon' align='left' />Hi all! I am starting yet another article series here. Remember my two articles on <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-one/">Famous Awk One-Liners Explained</a> and <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-one/">Famous Sed One-Liners Explained</a>? They have received more than 150,000 views total and they attract a few thousand new visitors every week. Inspired by their success, I am going to create my own <strong>perl1line.txt</strong> file and explain every single oneliner in it. I hope it becomes as famous as <a href="http://www.pement.org/awk/awk1line.txt">awk1line.txt</a> and <a href="http://sed.sourceforge.net/sed1line.txt">sed1line.txt</a> files.</p>
<p>A tiny intro to those unfamiliar with the awk1line.txt and sed1line.txt files &#8212; they contain around 80 useful awk and sed one-liners for doing various text file manipulations. Some examples are double spacing a file, numbering the lines, doing various text substitutions, etc. They were compiled by <a href="http://student.northpark.edu/pemente/">Eric Pement</a>. Kudos to him!</p>
<p>The article will be divided in six parts. In parts 1 - 5 I will create the one-liners and explain them, and in part 6 I will release the perl1line.txt file. I found that splitting the article in many parts is much easier to get it written, that&#8217;s why I do it.</p>
<p>Here is the general plan:</p>
<ul>
<li>Part I: File spacing.</li>
<li>Part II: Numbering and calculations.</li>
<li>Part III: String creation. Array creation.</li>
<li>Part IV: Text conversion and substitution.</li>
<li>Part V: Selective printing and deleting of certain lines.</li>
<li>Part VI: Release of perl1line.txt.</li>
</ul>
<p>The one-liners will make heavy use of Perl special variables. Luckily, a few years ago I compiled all the Perl special vars in a single file and called it <a href="http://www.catonmat.net/blog/perls-special-variable-cheat-sheet/">Perl special variable cheat-sheet</a>. Even tho it&#8217;s mostly copied out of <a href="http://perldoc.perl.org/perlvar.html">perldoc perlvar</a>, it&#8217;s still handy to have in front of you. I suggest that you print it.</p>
<p>Other than that, I can&#8217;t wait to start writing the article, so here I go:</p>
<h2 style="margin-bottom: 10px">File Spacing</h2>
<p><strong>1. Double space a file.</strong></p>
<pre>perl -pe '$\="\n"'</pre>
<p>This one-liner double spaces a file. There are three things to explain in this one-liner. The &#8220;-p&#8221; and &#8220;-e&#8221; command line options, and the &#8220;$\&#8221; variable.</p>
<p>First let&#8217;s start with the &#8220;-e&#8221; option. The &#8220;-e&#8221; option can be used to enter a Perl program directly in the command line. Typically you don&#8217;t want to create source files for every small program. By using &#8220;-e&#8221; you can handily specify the program to execute on the command line.</p>
<p>Next the &#8220;-p&#8221; switch. Specifying &#8220;-p&#8221; to a Perl program causes it to assume the following loop around your program:</p>
<pre>
while (<>) {
    # your program goes here
} continue {
    print or die &#34;-p failed: $!\n&#34;;
}
</pre>
<p>This construct loops over all the input, executes your code and prints the value of &#8220;$_&#8221;. This way you can effectively modify all or some lines of input. The &#8220;$_&#8221; variable can be explained as an anonymous variable that gets filled with the good stuff.</p>
<p>The &#8220;$\&#8221; variable is similar to ORS in Awk. It gets appended after every &#8220;print&#8221; operation. Without any arguments &#8220;print&#8221; prints the contents of &#8220;$_&#8221; (the good stuff).</p>
<p>In this one-liner the code specified by &#8220;-e&#8221; is &#8216;$\=&#8221;\n&#8221;&#8216;, thus the whole program looks like this:</p>
<pre>
while (<>) {
    $\ = &#34;\n&#34;;
} continue {
    print or die &#34;-p failed: $!\n&#34;;
}
</pre>
<p>What happens is that after reading each line, &#8220;$_&#8221; gets filled with it (including the existing line&#8217;s newline), the &#8220;$\&#8221; gets set to a newline itself and &#8220;print&#8221; is called. As I already mentioned, without any arguments &#8220;print&#8221; prints the contents of &#8220;$_&#8221; and &#8220;$\&#8221; gets appended at the end. The result is that each line gets printed unmodified and it&#8217;s followed by the &#8220;$\&#8221; which has been set to newline. The input has been double-spaced.</p>
<p>There is actually no need to set &#8220;$\&#8221; to newline on each line. It was just the shortest possible one-liner that double-spaced the file. Here are several others that do the same:</p>
<pre>perl -pe 'BEGIN { $\="\n" }'</pre>
<p>This one sets the &#8220;$\&#8221; to newline just once before Perl does anything (BEGIN block gets executed before everything else).</p>
<pre>perl -pe '$_ .= &#34;\n&#34;'</pre>
<p>This one-liner is equivalent to:</p>
<pre>
while (<>) {
    $_ = $_ . &#34;\n&#34;
} continue {
    print or die &#34;-p failed: $!\n&#34;;
}
</pre>
<p>It appends another new-line at the end of each line, then prints it out.</p>
<p>The cleanest and coolest way to do it is probably use the substitution &#8220;s///&#8221; operator:</p>
<pre>perl -pe 's/$/\n/'</pre>
<p>It replaces the regular expression &#8220;$&#8221; that matches at the end of line with a newline, effectively adding a newline at the end.</p>
<p><strong>2. Double space a file, except the blank lines.</strong></p>
<pre>perl -pe '$_ .= "\n" unless /^$/'</pre>
<p>This one-liner double spaces all lines that are not completely empty. It does it by appending a newline character at the end of each line that is not blank. The &#8220;unless&#8221; means &#8220;if not&#8221;. And &#8220;unless /^$/&#8221; means &#8220;if not &#8216;beginning is end of line&#8217;&#8221;. The condition &#8220;beginning is end of line&#8221; is true only for lines that contain the newline character.</p>
<p>Here is how it looks when expanded:</p>
<pre>
while (<>) {
    if ($_ !~ /^$/) {
        $_ .= &#34;\n&#34;
    }
} continue {
    print or die &#34;-p failed: $!\n&#34;;
}
</pre>
<p>A better test that takes spaces and tabs on the line into account is this one:</p>
<pre>perl -pe '$_ .= "\n" if /\S/'</pre>
<p>Here the line is matched against &#8220;\S&#8221;. &#8220;\S&#8221; is a regular expression that is the inverse of &#8220;\s&#8221;. The inverse of &#8220;\s&#8221; is any non-whitespace character. The result is that every line that has at least one non-whitespace character (tab, vertical tab, space, etc) gets double spaced.</p>
<p><strong>3. Triple space a file.</strong></p>
<pre>perl -pe '$\="\n\n"'</pre>
<p>Or</p>
<pre>perl -pe '$_.="\n\n"'</pre>
<p>They are the same as one-liner #1, except that two new-lines get appended after each line.</p>
<p><strong>4. N-space a file.</strong></p>
<pre>perl -pe '$_.="\n"x7'</pre>
<p>This one-liner uses inserts 7 new lines after each line. Notice how I used &#8216;&#8221;\n&#8221; x 7&#8242; to repeat the newline char 7 times. The &#8220;x&#8221; operator repeats the thing on the left N times.</p>
<p>For example:</p>
<pre>perl -e 'print "foo"x5'</pre>
<p>would print &#8220;foofoofoofoofoo&#8221;.</p>
<p><strong>5. Add a blank line before every line.</strong></p>
<pre>perl -pe 's//\n/'</pre>
<p>This one-liner uses the &#8220;s/pattern/replacement/&#8221; operator. It substitutes the first pattern (regular expression) in the &#8220;$_&#8221; variable with the replacement. In this one-liner the pattern is empty, meaning it matches any position between chars (and in this case it&#8217;s the position before first char) and replaces it with &#8220;\n&#8221;. The effect is that a newline char gets inserted before the line.</p>
<p><strong>6. Remove all blank lines.</strong></p>
<pre>perl -ne 'print unless /^$/'</pre>
<p>This one-liner uses &#8220;-n&#8221; flag. The &#8220;-n&#8221; flag causes Perl to assume to following loop around your program:</p>
<pre>
LINE:
while (<>) {
    # your program goes here
}
</pre>
<p>What happens here is that each line gets read by the diamond &#8220;<>&#8221; operator and is placed in the dolar underscore special variable &#8220;$_&#8221;. At this moment you are free to do with it whatever you want. You specify that in your main program text.</p>
<p>In this one-liner the main program is &#8220;print unless /^$/&#8221;, it gets inserted in the while loop above and the whole Perl program becomes:</p>
<pre>
LINE:
while (<>) {
    print unless /^$/
}
</pre>
<p>Unraveling it further:</p>
<pre>
LINE:
while (<>) {
    print $_ unless $_ =~ /^$/
}
</pre>
<p>This one-liner prints all non-blank lines.</p>
<p>A few other ways to do the same:</p>
<pre>perl -lne 'print if length'</pre>
<p>This one uses the &#8220;-l&#8221; command line argument. The &#8220;-l&#8221; automatically chomps the input line (basically gets rid of newline at the end). Next the line is tested for its length. If there are any chars left, it evals to true and the line gets printed.</p>
<pre>perl -ne 'print if /\S/'</pre>
<p>This one-liner behaves differently than the &#8220;print unless /^$/&#8221; one-liner. The &#8220;print unless /^$/&#8221; one-liner prints lines with spaces and tabs, this one doesn&#8217;t.</p>
<p><strong>7. Remove all consecutive blank lines, leaving just one.</strong></p>
<pre>perl -00 -pe ''</pre>
<p>Ok, this is really tricky, ain&#8217;t it? First of all it does not have any code, the -e is empty. Next it has a silly -00 command line option. This command line option turns paragraph slurp mode on. A paragraph is text between two newlines. All the other newlines get ignored. The paragraph gets put in &#8220;$_&#8221; and the &#8220;-p&#8221; option prints it out.</p>
<p><strong>8. Compress/expand all blank lines into N consecutive ones.</strong></p>
<pre>perl -00 -pe '$_.="\n"x4'</pre>
<p>This one-liner combines the previous one and one-liner #4. It slurps lines paragraph wise, then appends (N-1) new-line. This one-liner expands (or compresses) all new-lines to 5 (&#8221;\n&#8221; x 4 prints four, and there was one at the end of paragraph itself, so 5).</p>
<h2 style="margin-bottom: 10px">Have Fun!</h2>
<p>Have fun with these file spacing one-liners. The next part is going to be about line numbering and calculations one-liners.</p>
<p><strong>Can you think of other file spacing operations that I did not include here?</strong></p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=YnI67lzV"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=dJhJI0Rw"><img src="http://feeds.feedburner.com/~f/catonmat?i=dJhJI0Rw" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=gj5lsZxF"><img src="http://feeds.feedburner.com/~f/catonmat?i=gj5lsZxF" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=i193T54f"><img src="http://feeds.feedburner.com/~f/catonmat?i=i193T54f" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/QhmmiUvBEaY" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/perl-one-liners-explained-part-one/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/perl-one-liners-explained-part-one/</feedburner:origLink></item>
		<item>
		<title>A Unix Utility You Should Know About: Netcat</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/p1p4B7mWkMo/</link>
		<comments>http://www.catonmat.net/blog/unix-utilities-netcat/#comments</comments>
		<pubDate>Tue, 17 Feb 2009 08:00:21 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Tools]]></category>
<category>linux</category><category>nc</category><category>net cat</category><category>netcat</category><category>tar</category><category>tool</category><category>unix</category><category>utility</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/unix-utilities-netcat/</guid>
		<description><![CDATA[This is the second post in the article series about Unix utilities that you should know about. In this post I will introduce you to the netcat tool or simply nc.
Netcat is often referred to as a &#8220;Swiss Army knife&#8221; utility, and for a good reason. Just like the multi-function usefulness of the venerable Swiss [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/unix-utilities.jpg' alt='Unix Utilities' class="post-icon" align="left" />This is the second post in the article series about Unix utilities that you should know about. In this post I will introduce you to the <strong>netcat</strong> tool or simply <strong>nc</strong>.</p>
<p>Netcat is often referred to as a &#8220;<strong>Swiss Army knife</strong>&#8221; utility, and for a good reason. Just like the multi-function usefulness of the venerable Swiss Army pocket knife, netcat&#8217;s functionality is as helpful. Some of its features include port scanning, transferring files, port listening and it can be used a backdoor.</p>
<p>In 2006 netcat was ranked #4 in &#8220;<a href="http://sectools.org/">Top 100 Network Security Tools</a>&#8221; survey, so it&#8217;s definitely a tool to know.</p>
<p>See the first post on <a href="http://www.catonmat.net/blog/unix-utilities-pipe-viewer/">pipe viewer</a> for the introduction to this article series. If you feel like you are interested in this stuff, I suggest that you subscribe to <a href="http://feeds.feedburner.com/catonmat">my rss feed</a> to receive my future posts automatically.</p>
<h2 style="margin-bottom: 10px">How to use nc?</h2>
<p>Let&#8217;s start with a few very simple examples and build up on those.</p>
<p>If you remember, I said that netcat was a Swiss Army knife. What would a Swiss Army knife be if it also wasn&#8217;t a regular knife, right? That&#8217;s why <strong>netcat can be used as a replacement of telnet</strong>:</p>
<pre>
$ nc www.google.com 80
</pre>
<p>It&#8217;s actually much more handy than the regular telnet because you can terminate the connection at any time with ctrl+c, and it handles binary data as regular data (no escape codes, nothing).</p>
<p>You may add &#8220;-v&#8221; parameter for more verboseness, and two -v&#8217;s (-vv) to get statistics of how many bytes were transmitted during the connection.</p>
<p>Netcat can also be <strong>used as a server</strong> itself. If you start it as following, it will listen on port 12345 (on all interfaces):</p>
<pre>
$ nc -l -p 12345
</pre>
<p>If you now connect to port 12345 on that host, everything you type will be sent to the other party, which leads us to using netcat as a <strong>chat server</strong>. Start the server on one computer:</p>
<pre>
# On a computer A with IP 10.10.10.10
$ nc -l -p 12345
</pre>
<p>And connect to it from another:</p>
<pre>
# On computer B
$ nc 10.10.10.10 12345
</pre>
<p>Now both parties can chat!</p>
<p>Talking of which, the chat can be turned to make two processes talk to each other, thus making nc do <strong>I/O over network</strong>! For example, you can send the whole directory from one computer to another by piping tar to nc on the first computer, and redirecting output to another tar process on the second.</p>
<p>Suppose you want to <strong>send files</strong> in /data from computer A with IP 192.168.1.10 to computer B (with any IP). It&#8217;s as simple as this:</p>
<pre>
# On computer A with IP 192.168.1.10
$ tar -cf - /data | nc -l -p 6666

# On computer B
$ nc 192.168.1.10 6666 | tar -xf -
</pre>
<p>Don&#8217;t forget to combine the pipeline with <a href="http://www.catonmat.net/blog/unix-utilities-pipe-viewer/">pipe viewer</a> from previous article in this series to get statistics on how fast the transfer is going!</p>
<p>A single file can be sent even easier:</p>
<pre>
# On computer A with IP 192.168.1.10
$ cat file | nc -l -p 6666

# On computer B
$ nc 192.168.1.10 6666 > file
</pre>
<p>You may even copy and restore the whole disk with nc:</p>
<pre>
# On computer A with IP 192.168.1.10
$ cat /dev/hdb | nc -l -p 6666

# On computer B
$ nc 192.168.1.10 6666 > /dev/hdb
</pre>
<p><strong>Note:</strong> It turns out that &#8220;-l&#8221; can&#8217;t be used together with &#8220;-p&#8221; on a Mac! The solution is to replace &#8220;-l -p 6666&#8243; with just &#8220;-l 6666&#8243;. Like this:</p>
<pre>
$ nc -l 6666

# nc now listens on port 6666 on a Mac computer
</pre>
<p>An uncommon use of netcat is <strong>port scanning</strong>. Netcat is not the best tool for this job, but it does it ok (the best tool is <a href="http://nmap.org/">nmap</a>):</p>
<pre>
$ nc -v -n -z -w 1 192.168.1.2 1-1000
(UNKNOWN) [192.168.1.2] 445 (microsoft-ds) open
(UNKNOWN) [192.168.1.2] 139 (netbios-ssn) open
(UNKNOWN) [192.168.1.2] 111 (sunrpc) open
(UNKNOWN) [192.168.1.2] 80 (www) open
(UNKNOWN) [192.168.1.2] 25 (smtp) : Connection timed out
(UNKNOWN) [192.168.1.2] 22 (ssh) open
</pre>
<p>The &#8220;-n&#8221; parameter here prevents DNS lookup, &#8220;-z&#8221; makes nc not to receive any data from the server, and &#8220;-w 1&#8243; makes the connection timeout after 1 second of inactivity.</p>
<p>Another uncommon behavior is using <strong>netcat as a proxy</strong>. Both ports and hosts can be redirected. Look at this example:</p>
<pre>
$ nc -l -p 12345 | nc www.google.com 80
</pre>
<p>This starts a nc server on port 12345 and all the connections get redirected to google.com:80. If you now connect to that computer on port 12345 and do a request, you will find that no data gets sent back. That&#8217;s correct, because we did not set up a bidirectional pipe. If you add another pipe, you can get the data back on another port:</p>
<pre>
$ nc -l -p 12345 | nc www.google.com 80 | nc -l -p 12346
</pre>
<p>After you have sent the request on port 12345, connect on port 12346 to get the data.</p>
<p>Probably the most powerful netcat&#8217;s feature is <strong>making any process a server</strong>:</p>
<pre>
$ nc -l -p 12345 -e /bin/bash
</pre>
<p>The &#8220;-e&#8221; option spawns the executable with it&#8217;s input and output redirected via network socket. If you now connect to the host on port 12345, you may use bash:</p>
<pre>
$ nc localhost 12345
ls -las
total 4288
   4 drwxr-xr-x 15 pkrumins users    4096 2009-02-17 07:47 .
   4 drwxr-xr-x  4 pkrumins users    4096 2009-01-18 21:22 ..
   8 -rw-------  1 pkrumins users    8192 2009-02-16 19:30 .bash_history
   4 -rw-r--r--  1 pkrumins users     220 2009-01-18 21:04 .bash_logout
   ...
</pre>
<p>The consequences are that nc is a popular hacker tool as it is so easy to create a backdoor on any computer. On a Linux computer you may spawn /bin/bash and on a Windows computer cmd.exe to have total control over it.</p>
<p>That&#8217;s everything I can think of. <strong>Do you know any other netcat uses that I did not include?</strong></p>
<h2 style="margin-bottom: 10px">How to install nc?</h2>
<p>If you&#8217;re on Debian or Debian based system such as Ubuntu do the following:</p>
<pre>
$ sudo aptitude install netcat
</pre>
<p>If you&#8217;re on Fedora or Fedora based system such as CentOS do:</p>
<pre>
$ sudo yum install netcat
</pre>
<p>If you&#8217;re on Slackware, FreeBSD, NetBSD, Solaris or Mac, download the <a href="http://netcat.sourceforge.net/download.php">source code of nc</a> and just:</p>
<pre>
$ tar -zxf nc-version.tar.gz
$ cd nc-version
$ ./configure &#038;&#038; sudo make install
</pre>
<p>Another way to do it on Mac, if you have <a href="http://www.macports.org/">MacPorts</a> is:</p>
<pre>
$ sudo port install netcat
</pre>
<p>On Slackware you can actually install it as a package from n/ package directory:</p>
<pre>
$ sudo installpkg nc-1.10-i386-1.tgz
</pre>
<p>If you&#8217;re on Windows, download the Windoze port of it from <a href="http://www.securityfocus.com/tools/139">securityfocus</a>.</p>
<p>The manual of the utility can be found here <a href="http://linux.die.net/man/1/nc">man nc</a>.</p>
<p>Have fun netcatting, and until next time!</p>
<p>PS. My website just got featured on <a href="http://www.makeuseof.com/tag/4-websites-to-learn-cool-linux-command-line-tricks/">makeuseof.com website</a>. Check it out!</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=pj5mOFnB"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=fReLiZO0"><img src="http://feeds.feedburner.com/~f/catonmat?i=fReLiZO0" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=ov9O6kwc"><img src="http://feeds.feedburner.com/~f/catonmat?i=ov9O6kwc" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=OTGONbuX"><img src="http://feeds.feedburner.com/~f/catonmat?i=OTGONbuX" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/p1p4B7mWkMo" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/unix-utilities-netcat/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/unix-utilities-netcat/</feedburner:origLink></item>
		<item>
		<title>Update on Famous Awk One-Liners Explained</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/GmiWGf69nP8/</link>
		<comments>http://www.catonmat.net/blog/update-on-famous-awk-one-liners-explained/#comments</comments>
		<pubDate>Mon, 09 Feb 2009 07:25:44 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>array creation</category><category>awk</category><category>one liner</category><category>one liners</category><category>selective printing</category><category>string creation</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/update-on-famous-awk-one-liners-explained/</guid>
		<description><![CDATA[This is an update post on my three-part article Famous Awk One-Liners Explained.
I received an email from Eric Pement (the original author of Awk one-liners) and he said that there was a new version of awk1line.txt file available. I did a diff and found that there were seven new one-liners in it!
The new file has [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2008/09/awk-programming-one-liners-explained.jpg' alt='awk programming one-liners explained' class='post-icon' align='left' />This is an update post on my three-part article <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-one/">Famous Awk One-Liners Explained</a>.</p>
<p>I received an email from <a href="http://student.northpark.edu/pemente/">Eric Pement</a> (the original author of Awk one-liners) and he said that there was a new version of awk1line.txt file available. I did a diff and found that there were seven new one-liners in it!</p>
<p>The new file has two new sections &#8220;<strong>String Creation</strong>&#8221; and &#8220;<strong>Array Creation</strong>&#8221; and it updates &#8220;<strong>Selective Printing of Certain Lines</strong>&#8221; section. I&#8217;ll explain the new one-liners in this article.</p>
<p>Here is the latest version of awk1line.txt: <a href='http://www.catonmat.net/blog/wp-content/uploads/2009/02/awk1line-new.txt' title='Eric Pement’s Awk One-Liners'>awk1line-new.txt</a>.</p>
<p>The original Eric Pement&#8217;s Awk one-liner collection consists of five sections, and I explained them in my previous three articles:</p>
<ul>
<li>1. <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-one/">File spacing</a> (explained in part one),</li>
<li>2. <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-one/">Numbering and calculations</a> (explained in part one)</li>
<li>3. <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-two/">Text conversion and substitution</a> (explained in part two),</li>
<li>4. <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-three/">Selective printing of certain lines</a> (explained in part three),</li>
<li>5. <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-three/">Selective deletion of certain lines</a> (explained in part three).</li>
<li>6. <strong>String creation, array creation and update on selective printing of certain lines</strong>. (explained in this part).</li>
</ul>
<p>Okay, let&#8217;s roll with the new one-liners:</p>
<h2 style="margin-bottom: 10px">String Creation</h2>
<p><strong>1. Create a string of a specific length (generate a string of x&#8217;s of length 513).</strong></p>
<pre>awk 'BEGIN { while (a++<513) s=s "x"; print s }'</pre>
<p>This one-liner uses the &#8220;BEGIN { }&#8221; special block that gets executed before anything else in an Awk program. In this block a while loop appends character &#8220;x&#8221; to variable &#8220;s&#8221; 513 times. After it has looped, the &#8220;s&#8221; variable gets printed out. As this Awk program does not have a body, it quits after executing the BEGIN block.</p>
<p>This one-liner printed the 513 x&#8217;s out, but you could have used it for anything you wish in BEGIN, main program or END blocks.</p>
<p>Unfortunately this is not the most effective way to do it. It&#8217;s a linear time solution. My friend <strong>waldner</strong> (who, by the way, wrote a guest post on <a href="http://www.catonmat.net/blog/ten-awk-tips-tricks-and-pitfalls/">10 Awk Tips, Tricks and Pitfalls</a>) showed me a <a href="http://awk.freeshell.org/RepeatAString">solution</a> that&#8217;s logarithmic time (based on idea of <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-two/">recursive squaring</a>):</p>
<pre>
function rep(str, num,     remain, result) {
    if (num < 2) {
        remain = (num == 1)
    } else {
        remain = (num % 2 == 1)
        result = rep(str, (num - remain) / 2)
    }
    return result result (remain ? str  : "")
}
</pre>
<p>This function can be used as following:</p>
<pre>
awk 'BEGIN { s = rep("x", 513) }'
</pre>
<p><strong>2. Insert a string of specific length at a certain character position (insert 49 x&#8217;s after 6th char).</strong></p>
<pre>gawk --re-interval 'BEGIN{ while(a++<49) s=s "x" }; { sub(/^.{6}/,"&#038;" s) }; 1'</pre>
<p>This one-liner works only with Gnu Awk, because it uses the interval expression &#8220;.{6}&#8221; in the Awk program&#8217;s body. Interval expressions were not traditionally available in awk, that&#8217;s why you have to use &#8220;&#45;&#45;re-interval&#8221; option to enable them.</p>
<p>For those that do not know what interval expressions are, they are regular expressions that match a certain number of characters. For example, &#8220;.{6}&#8221; matches any six characters (the any char is specified by the dot &#8220;.&#8221;). An interval expression &#8220;b{2,4}&#8221; matches at least two, but not more than four &#8220;b&#8221; characters. To match words, you have to give them higher precedence - &#8220;(foo){4}&#8221; matches &#8220;foo&#8221; repeated four times - &#8220;foofoofoofoo&#8221;.</p>
<p>The one-liner starts the same way as the previous - it creates a 49 character string &#8220;s&#8221; in the BEGIN block. Next, for each line of the input, it calls sub() function that replaces the first 6 characters with themselves and &#8220;s&#8221; appended. The &#8220;&#038;&#8221; in the sub() function means the matched part of regular expression. The &#8216;&#8221;&#038;&#8221; s&#8217; means matched part of regex and contents of variable &#8220;s&#8221;. The &#8220;1&#8243; at the end of whole Awk one-liner prints out the modified line (it&#8217;s syntactic sugar for just &#8220;print&#8221; (that itself is syntactic sugar for &#8220;print $0&#8243;)).</p>
<p>The same can be achieved with normal standard Awk:</p>
<pre>awk 'BEGIN{ while(a++<49) s=s "x" }; { sub(/^....../,"&#038;" s) }; 1</pre>
<p>Here we just match six chars &#8220;&#46;&#46;&#46;&#46;&#46;&#46;&#8221; at the beginning of line, and replace them with themselves + contents of variable &#8220;s&#8221;.</p>
<p>It may get troublesome to insert a string at 29th position for example&#8230; You&#8217;d have to go tapping &#8220;.&#8221; twenty-nine times &#8220;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#46;&#8221;. Better use Gnu Awk then and write &#8220;.{29}&#8221;.</p>
<p>Once again, my friend waldner corrected me and pointed to <a href="http://awk.freeshell.org/AwkFeatureComparison">Awk Feature Comparsion</a> chart. The chart suggests that the original one-liner with &#8220;.{6}&#8221; would also work with POSIX awk, Busybox awk, and Solaris awk.</p>
<h2 style="margin-bottom: 10px">Array Creation</h2>
<p><strong>3. Create an array from string.</strong></p>
<pre>split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", month, " ")</pre>
<p>This is not a one-liner per se but a technique to create an array from a string. The split(Str, Arr, Regex) function is used do that. It splits string Str into fields by regular expression Regex and puts the fields in array Arr. The fields are placed in Arr[1], Arr[2], &#8230;, Arr[N]. The split() function itself returns the number of fields the string was split into.</p>
<p>In this piece of code the Regex is simply space character &#8221; &#8220;, the array is month and string is &#8220;Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec&#8221;. After the split, month[1] is &#8220;Jan&#8221;, month[2] is &#8220;Feb&#8221;, &#8230;, month[12] is &#8220;Dec&#8221;.</p>
<p><strong>4. Create an array named &#8220;mdigit&#8221;, indexed by strings.</strong></p>
<pre>for (i=1; i<=12; i++) mdigit[month[i]] = i</pre>
<p>This is another array creation technique and not a real one-liner. This technique creates a reverse lookup array. Remember from the previous &#8220;one-liner&#8221; that month[1] was &#8220;Jan&#8221;, &#8230;, month[12] was &#8220;Dec&#8221;. Now we want to the reverse lookup and find the number for each month. To do that we create a reverse lookup array &#8220;mdigit&#8221;, such that mdigit[&#8221;Jan&#8221;] = 1, &#8230;, mdigit[&#8221;Dec&#8221;] = 12.</p>
<p>It&#8217;s really trivial, we loop over month[1], month[2], &#8230;, month[12] and set mdigit[month[i]] to i. This way mdigit[&#8221;Jan&#8221;] = 1, etc.</p>
<h2 style="margin-bottom: 10px">Selective Printing of Certain Lines</h2>
<p><strong>5. Print all lines where 5th field is equal to &#8220;abc123&#8243;.</strong></p>
<pre>awk '$5 == "abc123"'</pre>
<p>This one-liner uses idiomatic Awk - if the given expression is true, Awk prints out the line. The fifth field is referenced by &#8220;$5&#8243; and it&#8217;s checked to be equal to &#8220;abc123&#8243;. If it is, the expression is true and the line gets printed.</p>
<p>Unwinding this idiom, this one-liner is really equal to:</p>
<pre>
awk '{ if ($5 == "abc123") { print $0 } }'
</pre>
<p><strong>6. Print any line where field #5 is not equal to &#8220;abc123&#8243;.</strong></p>
<pre>awk '$5 != "abc123"'</pre>
<p>This is exactly the same as previous one-liner, except it negates the comparison. If the fifth field &#8220;$5&#8243; is not equal to &#8220;abc123&#8243;, then print it.</p>
<p>Unwinding it, it&#8217;s equal to:</p>
<pre>
awk '{ if ($5 != "abc123") { print $0 } }'
</pre>
<p>Another way is to literally negate the whole previous one-liner:</p>
<pre>awk '!($5 == "abc123")'</pre>
<p><strong>7. Print all lines whose 7th field matches a regular expression.</strong></p>
<pre>awk '$7  ~ /^[a-f]/'</pre>
<p>This is also idiomatic Awk. It uses &#8220;~&#8221; operator to test if the seventh &#8220;$7&#8243; field matches a regular expression &#8220;^[a-f]&#8221;. This regular expression means &#8220;all lines that start with a lower-case letter a, b, c, d, e, or f&#8221;.</p>
<pre>awk '$7 !~ /^[a-f]/'</pre>
<p>This one-liner matches negates the previous one and prints all lines that do not start with a lower-case letter a, b, c, d, e, and f.</p>
<p>Another way to write the same is:</p>
<pre>awk '$7 ~ /^[^a-f]/'</pre>
<p>Here we negated the group of letters [a-f] by adding &#8220;^&#8221; in the group. That&#8217;s a regex trick to know.</p>
<h2 style="margin-bottom:10px">Have Fun!</h2>
<p>Have fun with these Awk oneliners!</p>
<p>If you haven&#8217;t already, I recommend that you download my <a href="http://www.catonmat.net/blog/awk-nawk-and-gawk-cheat-sheet/">Awk cheat-cheet</a>, read the &#8220;<a href="http://www.catonmat.net/blog/ten-awk-tips-tricks-and-pitfalls/">10 Awk Tips, Tricks and Pitfalls</a>&#8221; article, and study the source code of my <a href="http://www.catonmat.net/blog/downloading-youtube-videos-with-gawk/">YouTube Video Downloader</a>, written entirely in Gnu Awk.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=2N233oHe"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=15M0ejeQ"><img src="http://feeds.feedburner.com/~f/catonmat?i=15M0ejeQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=zz2jaqo7"><img src="http://feeds.feedburner.com/~f/catonmat?i=zz2jaqo7" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=GQ7UQ4AU"><img src="http://feeds.feedburner.com/~f/catonmat?i=GQ7UQ4AU" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/GmiWGf69nP8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/update-on-famous-awk-one-liners-explained/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/update-on-famous-awk-one-liners-explained/</feedburner:origLink></item>
		<item>
		<title>Musical Geek Friday #17: Hax That Fsck</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/RVJBAs9V8ig/</link>
		<comments>http://www.catonmat.net/blog/musical-geek-friday-hax-that-fsck/#comments</comments>
		<pubDate>Fri, 06 Feb 2009 07:15:31 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Musical Geek Friday]]></category>
<category>0day</category><category>amd</category><category>b0g</category><category>b0w</category><category>bill gates</category><category>christina aguilera</category><category>cia</category><category>fbi</category><category>firefox</category><category>geek</category><category>gibson</category><category>hack</category><category>leech axss</category><category>mp3</category><category>music</category><category>musical geek friday</category><category>risc</category><category>steve jobs</category><category>zero day</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/musical-geek-friday-hax-that-fsck/</guid>
		<description><![CDATA[I have found some new geek songs to post on my Musical Geek Friday. This week it&#8217;s a song called Hax That Fsck. It&#8217;s a hip-hop/rap song about a hacker getting owned by FBI and his adventures in hacking Gibson and CIA.
The song is written by Finnish band called &#8220;Leech Axss&#8220;. This is their second [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/leech-axss-hack-that-fuck.jpg' alt='Don’t Copy That Floppy' class="post-icon" align="left" />I have found some new <strong>geek songs</strong> to post on my <strong>Musical Geek Friday</strong>. This week it&#8217;s a song called <strong>Hax That Fsck</strong>. It&#8217;s a hip-hop/rap song about a hacker getting owned by FBI and his adventures in hacking <a href="http://www.urbandictionary.com/define.php?term=Hack%20the%20Gibson">Gibson</a> and CIA.</p>
<p>The song is written by Finnish band called &#8220;<a href="http://www.mikseri.net/artists/?id=19062">Leech Axss</a>&#8220;. This is their second geek song that I have posted on Musical Geek Friday. See &#8220;<a href="http://www.catonmat.net/blog/musical-geek-friday-leech-access-is-coming-at-you/">Leech Access is Coming at You</a>&#8220;.</p>
<p>I had problems with understanding some parts of the song. I put &#8220;(???)&#8221; in the lyrics where I was not sure what was being said. I&#8217;d appreciate if someone helped me figure it out the correct lyrics!</p>
<p>The song is <strong>strongly NSFW</strong>, be sure to put your headphones on before listening at work.</p>
<p></p>
<p>Download this song: <a href="http://www.catonmat.net/download/leech_axss-hax_that_fuck.mp3" title="Version 1.0 downloaded 2626 times" rel="enclosure">hax that fsck (musical geek friday #17)</a><br />
Downloaded: 2626 times</p>
<p>Download lyrics: <a href="http://www.catonmat.net/download/leech_axss-hax_that_fuck-lyrics.txt" title="Version 1.0 downloaded 616 times" rel="enclosure">hax that fsck lyrics (musical geek friday #17)</a><br />
Downloaded: 616 times</p>
<p>Here is the lyrics (I censored the explicit language, see the &#8216;download lyrics&#8217; link above for uncensored version):</p>
<blockquote><p>
Hax that f**k, move that box<br />
Exploit a hole in a FireFox<br />
Haxing is about breaking in, not fame<br />
Whoever you be haxing is all the same</p>
<p>Crack that code, buffer overflow<br />
Alt-F4 makes a lamer go<br />
Hax that Gibson and CIA<br />
Sell your soul to the zero da-aa-ay</p>
<p>As you well know, I net sex with duty creatures (???)<br />
I now have sex with the federal agents<br />
You bruteforce the password of gay.com<br />
Knock, knock, it&#8217;s the feds and the bubba is on</p>
<p>Right to remain silent like my box in ice<br />
Perfect source code, never write it twice<br />
Better go figure it out before you talk to a fed<br />
It&#8217;s a feeling very close to a blue screen of death</p>
<p>A monkey Steve Jobs intringing with Bill Gates<br />
That ain&#8217;t gonna tway with the fscking absura blades (????)<br />
Apple in my eye and &#8230; (?) in my tank<br />
I bought an itunes that are like tri-hand ****</p>
<p>Still in the game but before I went away<br />
Nerdies want my wishes and they want my zero days<br />
But behave you geeks, and I won&#8217;t go latest patch<br />
The one that cycle is Christina Aguilera snatch</p>
<p>Hax that f**k, move that box<br />
Exploit a hole in the FireFox<br />
Haxing is about breaking in, not fame<br />
Whoever you be haxing is all the same</p>
<p>Crack that code, buffer overflow<br />
Alt-F4 makes a lamer go<br />
Hax that Gibson and CIA<br />
Sell your soul to the zero da-aa-ay</p>
<p>I read b0w and b0g<br />
I sport AMD and fscking RISC<br />
I am not like you, I never disrespect my box<br />
Points to stolen motherf*****g Mozilla leanfox</p>
<p>What&#8217;s under the hood, I&#8217;ve got future of jobs<br />
Sounds like d3s (?) in the court<br />
Drop it like it&#8217;s hot, drop it like it&#8217;s hot, drop it like it&#8217;s hot<br />
Check my benchmark, you skinny retard</p>
<p>[Hshhha] loves a key generator<br />
If it&#8217;s not 0day, don&#8217;t seed it later<br />
I met my share of lamers, I&#8217;ve got some spleen<br />
Zero-bit (?) is like Commander Keen</p>
<p>I&#8217;ve been in hax of yours and to haxifice<br />
I&#8217;ve seen in the eyes which drugsterize (???)<br />
I&#8217;m haxing my way out of the paper bag<br />
Authenticating g in my f*****g &#8230; (???)</p>
<p>Hax that f**k, move that box<br />
Exploit a hole in the FireFox<br />
Haxing is about breaking in, not fame<br />
Whoever you be haxing is all the same</p>
<p>Crack that code, buffer overflow<br />
Alt-F4 makes a lamer go<br />
Hack that Gibbson and CIA<br />
Sell your soul to the zero da-aa-ay</p>
<p>Hax that fsck, move that box<br />
Exploit a hole in the FireFox<br />
Haxing is about breaking in, not fame<br />
Whoever you be haxing is all the same</p>
<p>Crack that code, buffer overflow<br />
Alt-F4 makes a lamer go<br />
Hax that Gibson and CIA<br />
Sell your soul to the zero da-aa-ay
</p></blockquote>
<div class="download">
<div class="download-title">Download &#8220;Hax That Fsck&#8221; Song</div>
<p>Download this song: <a href="http://www.catonmat.net/download/leech_axss-hax_that_fuck.mp3" title="Version 1.0 downloaded 2626 times" rel="enclosure">hax that fsck (musical geek friday #17)</a><br />
Downloaded: 2626 times</p>
<p>Download lyrics: <a href="http://www.catonmat.net/download/leech_axss-hax_that_fuck-lyrics.txt" title="Version 1.0 downloaded 616 times" rel="enclosure">hax that fsck lyrics (musical geek friday #17)</a><br />
Downloaded: 616 times</p>
<p>Click to listen:<br />
</p>
<p>Have fun and until next geeky Friday! <img src='http://www.catonmat.net/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' />
</div>
<div class="ads-aad">
<script type="text/javascript"><!--
google_ad_client = "pub-0433925538517663";
google_ad_width = 468;
google_ad_height = 60;
google_ad_format = "468x60_as";
google_ad_type = "text_image";
//2007-07-12: catonmat after download
google_ad_channel = "9549065335";
google_color_border = "FFFFFF";
google_color_bg = "FFFFFF";
google_color_link = "753206";
google_color_text = "4C4C4C";
google_color_url = "E6E6E6";
//-->
</script>
<script type="text/javascript"
  src="http://pagead2.googlesyndication.com/pagead/show_ads.js">
</script>
</div>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=slEuhq7I"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=yIbWkbVW"><img src="http://feeds.feedburner.com/~f/catonmat?i=yIbWkbVW" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=9XMVxwzQ"><img src="http://feeds.feedburner.com/~f/catonmat?i=9XMVxwzQ" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=7wq9l4QL"><img src="http://feeds.feedburner.com/~f/catonmat?i=7wq9l4QL" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/RVJBAs9V8ig" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/musical-geek-friday-hax-that-fsck/feed/</wfw:commentRss>
<enclosure url="http://www.catonmat.net/download/leech_axss-hax_that_fuck.mp3" length="3745920" type="audio/mpeg" />
		<feedburner:origLink>http://www.catonmat.net/blog/musical-geek-friday-hax-that-fsck/</feedburner:origLink></item>
		<item>
		<title>Vim Plugins You Should Know About, Part III: matchit.vim</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/b_B5LFy-MVo/</link>
		<comments>http://www.catonmat.net/blog/vim-plugins-matchit-vim/#comments</comments>
		<pubDate>Thu, 05 Feb 2009 06:50:22 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>benji fisher</category><category>matchit</category><category>plugin</category><category>python matchit</category><category>vim</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/vim-plugins-matchit-vim/</guid>
		<description><![CDATA[This is the third post in the article series &#8220;Vim Plugins You Should Know About&#8220;. This time I am going to introduce you to a plugin called &#8220;matchit.vim&#8220;.
If you are intrigued by this topic, I suggest that you subscribe to my posts! For the introduction and first post in this article series, follow this link [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2008/12/vim-plugins-surround-vim.gif' alt='Vim Plugins, surround.vim' class="post-icon" align="left" />This is the third post in the article series &#8220;<strong>Vim Plugins You Should Know About</strong>&#8220;. This time I am going to introduce you to a plugin called &#8220;<strong>matchit.vim</strong>&#8220;.</p>
<p>If you are intrigued by this topic, I suggest that you <a href="http://www.catonmat.net/feed/">subscribe to my posts</a>! For the introduction and first post in this article series, follow this link - <a href="http://www.catonmat.net/blog/vim-plugins-surround-vim/">Vim Plugins You Should Know About, Part I: surround.vim</a>. Part II is here: <a href="http://www.catonmat.net/blog/vim-plugins-repeat-vim/">repeat.vim</a>.</p>
<p>Matchit extends the existing functionality of &#8220;<strong>%</strong>&#8221; key (percent key). I&#8217;ll first briefly remind you what the original &#8220;%&#8221; does and then explain how matchit.vim enhances it.</p>
<p>The original &#8220;%&#8221; key allows you to jump between various pairs of characters and some programming constructs. For example, it jumps between pairs of parenthesis ( )&#8217;s, { }&#8217;s, [ ]&#8217;s. It also jumps between opening and closing tags of C style comments /* and */. And it&#8217;s smart enough to jump between C preprocessor directives - from #if to #endif and match #elif or #else in between.</p>
<p>Here is an example. Suppose you have this code and you press &#8220;%&#8221;, the cursor jumps between { and } parens:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/matchit-vim-example.gif' alt='matchit vim plugin example' />
</div>
<p>Matchit.vim extends this functionality. It&#8217;s written by <a href="http://www.vim.org/account/profile.php?user_id=92">Benji Fisher</a> and it adds support to cycle between if, else if, else, endif keywords in various programming languages. Another improvement is the ability to find pairs of HTML tags, such as &lt;p> &#8230; &lt;/p>. Another handy mapping is &#8220;g%&#8221; that does &#8220;%&#8221; in opposite direction (goes from endif to else to else if to if). The plugin also includes several other mappings like  &#8220;]%&#8221;, &#8220;[%&#8221; and &#8220;a%&#8221; but I could not figure out how to effectively use them in real life code, so I don&#8217;t use them at all.</p>
<p>Here is another example. Suppose you are editing this HTML and you quickly want to go to the corresponding closing tag of &lt;body>. Just press &#8220;%&#8221;:</p>
<div class="center-aligner">
<img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/matchit-vim-html-example.gif' alt='matchit vim html example' />
</div>
<p>Overall it&#8217;s a great plugin to have in your inventory!</p>
<h2 style="margin-bottom:10px">How to install matchit.vim?</h2>
<p>Matchit.vim has been included in vim since version 6.0. However there are newer versions of the script available with bug fixes and enhancements.</p>
<p>To get the latest version:</p>
<ul>
<li>1. Download <a href="http://www.vim.org/scripts/script.php?script_id=39">matchit.zip</a>.</li>
<li>2. Extract matchit.zip to ~/.vim (on Unix/Linux) or ~\vimfiles (on Windows).</li>
<li>3. Run :helptags ~/.vim/doc (on Unix/Linux) or :helptags ~/vimfiles/doc (on Windows) to rebuild the tags file (so that you can read :help %, :help g%, etc.)</li>
<li>4. Restart Vim or source matchit.vim with &#8220;:so ~/.vim/plugin/matchit.vim&#8221; on Unix or &#8220;:so ~/vimfiles/plugin/matchit.vim&#8221; on Windows).</li>
<li>5. Use &#8216;%&#8217; to find corresponding </li>
</ul>
<p>For Python programmers: Turns out the original matchit.vim plugin does not match if / elif / else. Benji extended matchit.vim itself and created &#8220;<a href="http://www.vim.org/scripts/script.php?script_id=386">python_matchit.vim</a>&#8220;. This extension allows us to use the &#8220;%&#8221; key to cycle through if/elif/else, try/except/catch, for/continue/break, and while/continue/break structures. The script also defines g% to cycle in the opposite direction, and it defines two other motions, [% and ]%, go to the start and end of the current block, respectively.</p>
<h2 style="margin-bottom:10px">How to install python_matchit.vim?</h2>
<p>Python_matchit.vim is a filetype plugin. It has to be installed in &#8220;ftplugin&#8221; directory. Follow these steps to get it installed:</p>
<ul>
<li>1. Download <a href="http://www.vim.org/scripts/script.php?script_id=386">python_matchit.vim</a>.</li>
<li>2. Put it in ~/.vim/ftplugin (on Unix/Linux) or ~\vimfiles\ftplugin (on Windows).</li>
<li>3. Restart Vim or source matchit.vim with &#8220;:so ~/.vim/ftplugin/python_matchit.vim&#8221; on Unix or &#8220;:so ~/vimfiles/ftplugin/python_matchit.vim&#8221; on Windows).</li>
</ul>
<p>The same steps can be taken to install matchit for <a href="http://vim.sourceforge.net/scripts/script.php?script_id=290">Ruby</a>.</p>
<h2 style="margin-bottom:10px">Have Fun!</h2>
<p>Happy matching with matchit.vim!</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=C9zAuEfx"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=tcQ4Mv4J"><img src="http://feeds.feedburner.com/~f/catonmat?i=tcQ4Mv4J" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=mIaPrn4k"><img src="http://feeds.feedburner.com/~f/catonmat?i=mIaPrn4k" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=HsIUUNro"><img src="http://feeds.feedburner.com/~f/catonmat?i=HsIUUNro" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/b_B5LFy-MVo" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/vim-plugins-matchit-vim/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/vim-plugins-matchit-vim/</feedburner:origLink></item>
		<item>
		<title>A Unix Utility You Should Know About: Pipe Viewer</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/PWwglyg39EU/</link>
		<comments>http://www.catonmat.net/blog/unix-utilities-pipe-viewer/#comments</comments>
		<pubDate>Mon, 02 Feb 2009 05:26:58 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Tools]]></category>
<category>gzip</category><category>linux</category><category>nc</category><category>pipe</category><category>progress bar</category><category>pv</category><category>tar</category><category>unix</category><category>utility</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/unix-utilities-pipe-viewer/</guid>
		<description><![CDATA[Hi all. I&#8217;m starting yet another article series here. This one is going to be about Unix utilities that you should know about. The articles will discuss one Unix program at a time. I&#8217;ll try to write a good introduction to the tool and give as many examples as I can think of.
Before I start, [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/02/unix-utilities.jpg' alt='Unix Utilities' class="post-icon" align="left" />Hi all. I&#8217;m starting yet another article series here. This one is going to be about Unix utilities that you should know about. The articles will discuss one Unix program at a time. I&#8217;ll try to write a good introduction to the tool and give as many examples as I can think of.</p>
<p>Before I start, I want to clarify one thing - Why am I starting so many article series? The answer is that I want to write about many topics simultaneously and switch between them as I feel inspired.</p>
<p>The first post in this series is going to be about not so well known Unix program called <strong>Pipe Viewer</strong> or <strong>pv</strong> for short. Pipe viewer is a terminal-based tool for monitoring the progress of data through a pipeline. It can be inserted into any normal pipeline between two processes to give a visual indication of how quickly data is passing through, how long it has taken, how near to completion it is, and an estimate of how long it will be until completion. </p>
<p><strong>Update</strong>: <a href="http://www.neuronfarm.net/blog/utilitaire-unix-a-connaitre-pipe-viewer">French translation</a> available.</p>
<p>Pipe viewer is written by <a href="http://www.ivarch.com/">Andrew Wood</a>, an experienced Unix sysadmin. The homepage of pv utility is here: <a href="http://www.ivarch.com/programs/pv.shtml">pv utility</a>.</p>
<p>If you feel like you are interested in this stuff, I suggest that you subscribe to <a href="http://feeds.feedburner.com/catonmat">my rss feed</a> to receive my future posts automatically.</p>
<h2 style="margin-bottom: 10px">How to use pv?</h2>
<p>Ok, let&#8217;s start with some really easy examples and progress to more complicated ones.</p>
<p>Suppose that you had a file &#8220;access.log&#8221; that is a few gigabytes in size and contains web logs. You want to compress it into a smaller file, let&#8217;s say a gunzip archive (.gz). The obvious way would be to do:</p>
<pre>
$ gzip -c access.log > access.log.gz
</pre>
<p>As the file is so huge (several gigabytes), you have no idea how long to wait. Will it finish soon? Or will it take another 30 mins?</p>
<p>By using pv you can precisely time how long it will take. Take a look at doing the same through pv:</p>
<pre>
$ pv access.log | gzip > access.log.gz
<strong>611MB 0:00:11 [58.3MB/s] [=>      ] 15% ETA 0:00:59</strong>
</pre>
<p>Pipe viewer acts as &#8220;cat&#8221; here, except it also adds a progress bar. We can see that gzip processed 611MB of data in 11 seconds. It has processed 15% of all data and it will take 59 more seconds to finish.</p>
<p>You may stick several pv processes in between. For example, you can time how fast the data is being read from the disk and how much data is gzip outputting:</p>
<pre>
$ pv -cN source access.log | gzip | pv -cN gzip > access.log.gz
<strong>source:  760MB 0:00:15 [37.4MB/s] [=>     ] 19% ETA 0:01:02
  gzip: 34.5MB 0:00:15 [1.74MB/s] [  <=>  ]</strong>
</pre>
<p>Here we specified the &#8220;-N&#8221; parameter to pv to create a <strong>n</strong>amed stream. The &#8220;-c&#8221; parameter makes sure the output is not garbaged by one pv process writing over the other.</p>
<p>This example shows that &#8220;access.log&#8221; file is being read at a speed of 37.4MB/s but gzip is writing data at only 1.74MB/s. We can immediately calculate the compression rate. It&#8217;s 37.4/1.74 = 21x!</p>
<p>Notice how the gzip does not include how much data is left or how fast it will finish. It&#8217;s because the pv process after gzip has no idea how much data gzip will produce (it&#8217;s just outputting compressed data from input stream). The first pv process, however, knows how much data is left, because it&#8217;s reading it.</p>
<p>Another similar example would be to pack the whole directory of files into a compressed tarball:</p>
<pre>
$ tar -czf - . | pv > out.tgz
<strong> 117MB 0:00:55 [2.7MB/s] [>         ]</strong>
</pre>
<p>In this example pv shows just the output rate of &#8220;tar -czf&#8221; command. Not very interesting and it does not provide information about how much data is left. We need to provide the total size of data we are tarring to pv, it&#8217;s done this way:</p>
<pre>
$ tar -cf - . | pv -s $(du -sb . | awk '{print $1}') | gzip > out.tgz
<strong> 253MB 0:00:05 [46.7MB/s] [>     ]  1% ETA 0:04:49</strong>
</pre>
<p>What happens here is we tell tar to create &#8220;-c&#8221; an archive of all files in current dir &#8220;.&#8221; (recursively) and output the data to stdout &#8220;-f -&#8221;. Next we specify the size &#8220;-s&#8221; to pv of all files in current dir. The &#8220;du -sb . | awk &#8216;{print $1}&#8217;&#8221; returns number of bytes in current dir, and it gets fed as &#8220;-s&#8221; parameter to pv. Next we gzip the whole content and output the result to out.tgz file. This way &#8220;pv&#8221; knows how much data is still left to be processed and shows us that it will take yet another 4 mins 49 secs to finish.</p>
<p>Another fine example is copying large amounts of data over network by using help of &#8220;nc&#8221; utility that I will write about some other time.</p>
<p>Suppose you have two computers A and B. You want to transfer a directory from A to B very quickly. The fastest way is to use tar and nc, and time the operation with pv.</p>
<pre>
# on computer A, with IP address 192.168.1.100
$ tar -cf - /path/to/dir | pv | nc -l -p 6666 -q 5
</pre>
<pre>
# on computer B
$ nc 192.168.1.100 6666 | pv | tar -xf -
</pre>
<p>That&#8217;s it. All the files in /path/to/dir on computer A will get transferred to computer B, and you&#8217;ll be able to see how fast the operation is going.</p>
<p>If you want the progress bar, you have to do the &#8220;pv -s $(&#8230;)&#8221; trick from the previous example (only on computer A).</p>
<p>Another funny example is by my blog reader alexandru. He shows how to time how fast the computer reads from /dev/zero:</p>
<pre>
$ pv /dev/zero > /dev/null
 157GB 0:00:38 [4,17GB/s]
</pre>
<p>That&#8217;s about it. I hope you enjoyed my examples and learned something new. I love explaining things and teaching! <img src='http://www.catonmat.net/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<h2 style="margin-bottom: 10px">How to install pv?</h2>
<p>If you&#8217;re on Debian or Debian based system such as Ubuntu do the following:</p>
<pre>
$ sudo aptitude install pv
</pre>
<p>If you&#8217;re on Fedora or Fedora based system such as CentOS do:</p>
<pre>
$ sudo yum install pv
</pre>
<p>If you&#8217;re on Slackware, go to <a href="http://www.ivarch.com/programs/pv.shtml">pv homepage</a>, download the pv-version.tar.gz archive and do:</p>
<pre>
$ tar -zxf pv-version.tar.gz
$ cd pv-version
$ ./configure &#038;&#038; sudo make install
</pre>
<p>If you&#8217;re a Mac user:</p>
<pre>
$ sudo port install pv
</pre>
<p>If you&#8217;re OpenSolaris user:</p>
<pre>
$ pfexec pkg install pv
</pre>
<p>If you&#8217;re a Windows user on Cygwin:</p>
<pre>
$ ./configure
$ export DESTDIR=/cygdrive/c/cygwin
$ make
$ make install
</pre>
<p>The manual of the utility can be found here <a href="http://www.ivarch.com/programs/quickref/pv.shtml">man pv</a>.</p>
<p>Have fun measuring your pipes with pv, and until next time!</p>
<p><strong>A question to my readers</strong>: what other not so well known Unix utilities do you use and/or know about?</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=DPoE285D"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=nMzszKlb"><img src="http://feeds.feedburner.com/~f/catonmat?i=nMzszKlb" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=yc9OPDQ9"><img src="http://feeds.feedburner.com/~f/catonmat?i=yc9OPDQ9" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=5Te4iEvt"><img src="http://feeds.feedburner.com/~f/catonmat?i=5Te4iEvt" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/PWwglyg39EU" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/unix-utilities-pipe-viewer/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/unix-utilities-pipe-viewer/</feedburner:origLink></item>
		<item>
		<title>MIT’s Introduction to Algorithms, Lectures 17, 18 and 19: Shortest Path Algorithms</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/0G8ioHf9vh8/</link>
		<comments>http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-twelve/#comments</comments>
		<pubDate>Tue, 27 Jan 2009 02:00:29 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Introduction to Algorithms]]></category>
<category>algorithms</category><category>bellman ford algorithm</category><category>bfs</category><category>breadth first search</category><category>constraint graph</category><category>dijkstras algorithm</category><category>edsger dijkstra</category><category>fifo</category><category>floyd warshall</category><category>graphs</category><category>johnsons algorithm</category><category>lecture</category><category>linear programming</category><category>mit</category><category>paths</category><category>priority queue</category><category>shortest path algorithms</category><category>video</category><category>video lecture</category><category>vlsi</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-twelve/</guid>
		<description><![CDATA[This is the twelfth post in an article series about MIT&#8217;s lecture course &#8220;Introduction to Algorithms.&#8221; In this post I will review a trilogy of lectures on graph and shortest path algorithms. They are lectures seventeen, eighteen and nineteen. They&#8217;ll cover Dijkstra&#8217;s Algorithm, Breadth-First Search Algorithm and Bellman-Ford Algorithm for finding single-source shortest paths as [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2008/08/post-icon.jpg' alt='MIT Algorithms' class="post-icon" align="left" />This is the twelfth post in an article series about MIT&#8217;s lecture course &#8220;<strong>Introduction to Algorithms</strong>.&#8221; In this post I will review a trilogy of lectures on graph and shortest path algorithms. They are lectures seventeen, eighteen and nineteen. They&#8217;ll cover <strong>Dijkstra&#8217;s Algorithm</strong>, <strong>Breadth-First Search Algorithm</strong> and <strong>Bellman-Ford Algorithm</strong> for finding single-source shortest paths as well as <strong>Floyd-Warshall Algorithm</strong> and <strong>Johnson&#8217;s Algorithm</strong> for finding all-pairs shortest paths.</p>
<p>These algorithms require a thorough understanding of graphs. See the <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-eleven/">previous lecture</a> for a good review of graphs.</p>
<p>Lecture seventeen focuses on the single-source shortest-paths problem: Given a graph G = (V, E), we want to find a shortest path from a given source vertex s ∈ V to each vertex v ∈ V. In this lecture the weights of edges are restricted to be positive which leads it to Dijkstra&#8217;s algorithm and in a special case when all edges have unit weight to Breadth-first search algorithm.</p>
<p>Lecture eighteen also focuses on the same single-source shortest-paths problem, but allows edges to be negative. In this case a negative-weight cycles may exist and Dijkstra&#8217;s algorithm would no longer work and would produce incorrect results. Bellman-Ford algorithm therefore is introduced that runs slower than Dijkstra&#8217;s but detects negative cycles. As a corollary it is shown that Bellman-Ford solves Linear Programming problems with constraints in form x<sub>j</sub> - x<sub>i</sub> <= w<sub>ij</sub>.</p>
<p>Lecture nineteen focuses on the all-pairs shortest-paths problem: Find a shortest path from u to v for every pair of vertices u and v. Although this problem can be solved by running a single-source algorithm once from each vertex, it can be solved faster with Floyd-Warshall algorithm or Johnson&#8217;s algorithm.</p>
<h2 style="margin-bottom:10px">Lecture 17: Shortest Paths I: Single-Source Shortest Paths and Dijkstra&#8217;s Algorithm</h2>
<p>Lecture seventeen starts with a small review of paths and shortest paths.</p>
<p>It reminds that given a graph G = (V, E, w), where V is a set of vertices, E is a set of edges and w is weight function that maps edges to real-valued weights, a <strong>path</strong> p from a vertex u to a vertex v in this graph is a sequence of vertices (v<sub>0</sub>, v<sub>1</sub>, &#8230;, v<sub>k</sub>) such that u = v<sub>0</sub>, v = v<sub>k</sub> and (v<sub>i-1</sub>, v<sub>i</sub>) ∈ E. The weight w(p) of this path is a sum of weights over all edges = w(v<sub>0</sub>, v<sub>1</sub>) + w(v<sub>1</sub>, v<sub>2</sub>) + &#8230; + w(v<sub>k-1</sub>, v<sub>k</sub>). It also reminds that a <strong>shortest path</strong> from u to v is the path with minimum weight of all paths from u to v, and that a shortest path in a graph might not exist if it contains a negative weight cycle.</p>
<p>The lecture then notes that shortest paths exhibit the optimal substructure property - a subpath of a shortest path is also a shortest path. The proof of this property is given by cut and paste argument. If you remember from previous two lectures on <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-ten/">dynamic programming</a> and <a href="http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-eleven/">greedy algorithms</a>, an optimal substructure property suggests that these two techniques could be applied to solve the problem efficiently. Indeed, applying the greedy idea, Dijkstra&#8217;s algorithm emerges.</p>
<p>Here is a somewhat precise definition of single-source shortest paths problem with non-negative edge weights: Given a graph G = (V, E), and a starting vertex s ∈ V, find shortest-path weights for all vertices v ∈ V.</p>
<p>Here is the greedy idea of Dijkstra&#8217;s algorithm:</p>
<ul>
<li>1. Maintain a set S of vertices whose shortest-path from s are known (s ∈ S initially).</li>
<li>2. At each step add vertex v from the set V-S to the set S. Choose v that has minimal distance from s (be greedy).</li>
<li>3. Update the distance estimates of vertices adjacent to v.</li>
</ul>
<p>I have also posted a video interview with Edsger Dijkstra - <a href="http://www.catonmat.net/blog/edsger-dijkstra-discipline-in-thought/">Edsger Dijkstra: Discipline in Thought</a>, please take a look if you want to see how Dijkstra looked like. <img src='http://www.catonmat.net/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p>The lecture continues with an example of running Dijkstra&#8217;s algorithm on a non-trivial graph. It also introduces to a concept of a shortest path tree - a tree that is formed by edges that were last relaxed in each iteration (hard to explain in English, see lecture at 43:40).</p>
<p>The other half of lecture is devoted to three correctness arguments of Dijkstra&#8217;s algorithm. The first one proves that relaxation never makes a mistake. The second proves that relaxation always makes the right greedy choice. And the third proves that when algorithm terminates the results are correct.</p>
<p>At the final minutes of lecture, running time of Dijkstra&#8217;s algorithm is analyzed. Turns out that the running time depends on what data structure is used for maintaining the priority queue of the set V-S (step 2). If we use an array, the running time is O(V<sup>2</sup>), if we use binary heap, it&#8217;s O(E·lg(V)) and if we use Fibonacci heap, it&#8217;s O(E + V·lg(V)).</p>
<p>Finally a special case of weighted graphs is considered when all weights are unit weights. In this case a single-source shortest-paths problem can be solved by a the Breadth-first search (BFS) algorithm that is actually a simpler version of Dijkstra&#8217;s algorithm with priority queue replaced by a FIFO! The running time of BFS is O(V+E).</p>
<p>You&#8217;re welcome to watch lecture seventeen:</p>
<div class="center-aligner">
<embed id="VideoPlayback" src="http://video.google.com/googleplayer.swf?docid=1944394844023872411&#038;hl=en&#038;fs=true&#038;subtitle=off" style="width:400px;height:326px" allowFullScreen="true" allowScriptAccess="always" type="application/x-shockwave-flash"></embed></p>
<p>Direct URL: <a href="http://video.google.com/videoplay?docid=1944394844023872411">http://video.google.com/videoplay?docid=1944394844023872411</a>
</div>
<p>Topics covered in lecture seventeen:</p>
<ul>
<li>[01:40] Review of paths.</li>
<li>[03:15] Edge weight functions.</li>
<li>[03:30] Example of a path, its edge weights, and weight of the path.</li>
<li>[04:22] Review of shortest-paths.</li>
<li>[05:15] Shortest-path weight.</li>
<li>[06:30] Negative edge weights.</li>
<li>[10:55] Optimal substructure of a shortest path.</li>
<li>[11:50] Proof of optimal substructure property: cut and paste.</li>
<li>[14:23] Triangle inequality.</li>
<li>[15:15] Geometric proof of triangle inequality.</li>
<li>[16:30] Single-source shortest paths problem.</li>
<li>[18:32] Restricted single-source shortest paths problem: all edge weights positive or zero.</li>
<li>[19:35] Greedy idea for ss shortest paths.</li>
<li>[26:40] Dijkstra&#8217;s algorithm.</li>
<li>[35:30] Example of Dijkstra&#8217;s algorithm.</li>
<li>[43:40] Shortest path trees.</li>
<li>[45:12] Correctness of Dijkstra&#8217;s algorithm: why relaxation never makes mistake.</li>
<li>[53:55] Correctness of Dijkstra&#8217;s algorithm: why relaxation makes progress.</li>
<li>[01:01:00] Correctness of Dijkstra&#8217;s algorithm: why it gives correct answer when it terminates.</li>
<li>[01:15:40] Running time of Dijkstra&#8217;s algorithm.</li>
<li>[01:18:40] Running time depending on using array O(V^2), binary heap O(E·lg(V)) and Fibonacci heap O(E + V·lg(V)) for priority queue.</li>
<li>[01:20:00] Unweighted graphs.</li>
<li>[01:20:40] Breadth-First Search (BFS) algorithm.</li>
<li>[01:23:23] Running time of BFS: O(V+E).</li>
</ul>
<p>Lecture seventeen notes:</p>
<div class="center-aligner">
<table>
<tr>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-17-01.jpg' title='MIT Algorithms Lecture 17 Notes Thumbnail. Page 1 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-17-01-thumb.jpg' alt='MIT Algorithms Lecture 17 Notes Thumbnail. Page 1 of 2.' /></a><br />
<small>Lecture 17, page 1 of 2: paths, shortest paths, negative edge weights, optimal substructure, triangle inequality, single-source shortest paths, dijkstra&#8217;s algorithm.</small>
</td>
<td width="10"></td>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-17-02.jpg' title='MIT Algorithms Lecture 17 Notes Thumbnail. Page 2 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-17-02-thumb.jpg' alt='MIT Algorithms Lecture 17 Notes Thumbnail. Page 2 of 2.' /></a><br />
<small>Lecture 17, page 2 of 2: example of dijkstra&#8217;s algorithm, correctness of dijkstra&#8217;s algorithm, running time of dijkstra, unweighted graphs, breadth first search (bfs).</small>
</td>
</tr>
</table>
</div>
<h2 style="margin-bottom:10px">Lecture 18: Shortest Paths II: Bellman-Ford Algorithm</h2>
<p>Lecture eighteen begins with recalling that if a graph contains a negative weight cycle, then a shortest path may not exist and gives a geometric illustration of this fact.</p>
<p>Right after this fact, it jumps to Bellman-Ford algorithm. The Bellman-Ford algorithm solves the single-source shortest-paths problem in the general case in which edge weights may be negative. Given a weighted, directed graph G = (V, E) with source s and weight function w: E → <strong>R</strong>, the Bellman-Ford algorithm produces the shortest paths from s and their weights, if there is no negative weight cycle, and it produces no answer if there is a negative weight cycle.</p>
<p>The algorithm uses relaxation, progressively decreasing an estimate on the weight of a shortest path from the source s to each vertex v ∈ V until it achieves the actual shortest-path weight.</p>
<p>The running time of Bellman-Ford algorithm is O(VE). The lecture also gives a correctness proof of Bellman-Ford algorithm.</p>
<p>The other half of the lecture is devoted to a problem that can be effectively solved by Bellman-Ford. It&#8217;s called the linear feasibility problem that it is a special case of linear programming (LP) problem, where there is no objective but the constraints are in form x<sub>i</sub> <= w<sub>ij</sub>. It is noted that Bellman-Ford is actually a simple case of LP problem.</p>
<p>The lecture ends with an application of Bellman-Ford to solving a special case of VLSI layout problem in 1 dimension.</p>
<p>You&#8217;re welcome to watch lecture eighteen:</p>
<div class="center-aligner">
<embed id="VideoPlayback" src="http://video.google.com/googleplayer.swf?docid=-4318261111901337492&#038;hl=en&#038;fs=true&#038;subtitle=off" style="width:400px;height:326px" allowFullScreen="true" allowScriptAccess="always" type="application/x-shockwave-flash"> </embed></p>
<p>Direct URL: <a href="http://video.google.com/videoplay?docid=-4318261111901337492">http://video.google.com/videoplay?docid=-4318261111901337492</a>
</div>
<p>Topics covered in lecture eighteen:</p>
<ul>
<li>[00:20] A long, long time ago&#8230; in a galaxy far, far away&#8230;</li>
<li>[00:40] Quick review of previous lecture - Dijkstra&#8217;s algorithm, non-negative edge weights.</li>
<li>[01:40] Description of Bellman-Ford algorithm.</li>
<li>[04:30] Bellman-Ford algorithm.</li>
<li>[08:50] Running time of Bellman-Ford O(VE).</li>
<li>[10:05] Example of Bellmen-Ford algorithm.</li>
<li>[18:40] Correctness of Bellman-Ford algorithm:</li>
<li>[36:30] Linear programming (LP).</li>
<li>[42:48] Efficient algorithms for solving LPs: simplex algorithm (exponential in worst case, but practical), ellipsoid algorithm (polynomial time, impractical), interior point methods (polynomial), random sampling (brand new, discovered at MIT).</li>
<li>[45:58] Linear feasibility problem - LP with no objective.</li>
<li>[47:30] Difference constraints - constraints in form x<sub>j</sub> - x<sub>i</sub> <= w<sub>ij</sub>.</li>
<li>[49:50] Example of difference constraints.</li>
<li>[51:04] Constraint graph.</li>
<li>[54:05] Theorem: Negative weight cycle in constraint means difference constraints are infeasible/unsatisfiable.</li>
<li>[54:50] Proof.</li>
<li>[59:15] Theorem: If no negative weight cycle then satisfiable.</li>
<li>[01:00:23] Proof.</li>
<li>[01:08:20] Corollary: Bellman-Ford solves a system of of m difference constraints on n variables in O(mn) time.</li>
<li>[01:12:30] VLSI Layout problem solved by Bellman-Ford.</li>
</ul>
<p>Lecture eighteen notes:</p>
<div class="center-aligner">
<table>
<tr>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-18-01.jpg' title='MIT Algorithms Lecture 18 Notes Thumbnail. Page 1 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-18-01-thumb.jpg' alt='MIT Algorithms Lecture 18 Notes Thumbnail. Page 1 of 2.' /></a><br />
<small>Lecture 18, page 1 of 2: bellman-ford algorithm, correctness of bellman-ford, linear programming.</small>
</td>
<td width="10"></td>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-18-02.jpg' title='MIT Algorithms Lecture 18 Notes Thumbnail. Page 2 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-18-02-thumb.jpg' alt='MIT Algorithms Lecture 18 Notes Thumbnail. Page 2 of 2.' /></a><br />
<small>Lecture 18, page 2 of 2: feasibility problem, bellman-ford and difference constraints, vlsi layouts.</small>
</td>
</tr>
</table>
</div>
<h2 style="margin-bottom:10px">Lecture 19: Shortest Paths III: All-Pairs Shortest Paths and Floyd-Warshall Algorithm</h2>
<p>Lecture nineteen starts with a quick review of lectures seventeen and eighteen. It reminds the running times of various single-source shortest path algorithms, and mentions that in case of a directed acyclic graphs (which was not covered in previous lectures), you can run topological sort and 1 round of Bellman-Ford that makes it find single-source shortest paths in linear time (for graphs) in O(V+E).</p>
<p>The lecture continues all-pairs shortest paths problem, where we want to know the shortest path between every pair of vertices. </p>
<p>A naive approach to this problem is run single-source shortest path from each vertex. For example, on an unweighted graph we&#8217;d run BFS algorithm |V| times that would give O(VE) running time. On a non-negative edge weight graph it would be |V| times Dijkstra&#8217;s algorithm, giving O(VE + V<sup>2</sup>lg(V)) time. And in general case we&#8217;d run Bellman-Ford |V| times that would make the algorithm run in O(V<sup>2</sup>E) time.</p>
<p>The lecture continues with a precise definition of all-pairs shortest paths problem: given a directed graph, find an NxN matrix (N = |V|), where each entry a<sub>ij</sub> is the shortest path from vertex i to vertex j.</p>
<p>In general case, if the graph has negative edges and it&#8217;s dense, the best we can do so far is run Bellman-Ford |V| times. Recalling that E = O(V<sup>2</sup>) in a dense graph, the running time is O(V<sup>2</sup>E) = O(V<sup>4</sup>) - hypercubed in number of vertices = slow.</p>
<p>Lecture then proceeds with a dynamic programming algorithm without knowing if it will be faster or not. It&#8217;s too complicated to explain here, and I recommend watching lecture at 11:54 to understand it. Turns out this dynamic programming algorithm does not give a performance boost and is still O(V<sup>4</sup>), but it gives some wicked ideas.</p>
<p>The most wicked idea is to connect matrix multiplication with the dynamic programming recurrence and using repeated squaring to beat O(V<sup>4</sup>). This craziness gives O(V<sup>3</sup>lgV) time that is an improvement. Please see 23:40 in the lecture for full explanation.</p>
<p>After all this the lecture arrives at Floyd-Warshall algorithm that finds all-pairs shortest paths in O(V<sup>3</sup>). The algorithm is derived from a recurrence that the shortest path from vertex i to vertex j is minimum of { shortest path from i to j directly or shortest path from i to k and shortest path from k to j }.</p>
<p>Finally the lecture explains Johnson&#8217;s algorithm that runs in O(VE + V<sup>2</sup>log(V)) time for sparse graphs. The key idea in this algorithm is to reweigh all edges so that they are all positive, then run Dijkstra from each vertex and finally undo the reweighing.</p>
<p>It turns out, however, that to find the function for reweighing all edges, a set of difference constraints need to be satisfied. It makes us first run Bellman-Ford to solve these constraints.</p>
<p>Reweighing takes O(EV) time, running Dijkstra on each vertex takes O(VE + V<sup>2</sup>lgV) and undoing reweighing takes O(V<sup>2</sup>) time. Of these terms O(VE + V<sup>2</sup>lgV) dominates and defines algorithm&#8217;s running time (for dense it&#8217;s still O(V<sup>3</sup>).</p>
<p>You&#8217;re welcome to watch lecture nineteen:</p>
<div class="center-aligner">
<embed id="VideoPlayback" src="http://video.google.com/googleplayer.swf?docid=8781097484701714492&#038;hl=en&#038;fs=true&#038;subtitle=off" style="width:400px;height:326px" allowFullScreen="true" allowScriptAccess="always" type="application/x-shockwave-flash"> </embed></p>
<p>Direct URL: <a href="http://video.google.com/videoplay?docid=8781097484701714492">http://video.google.com/videoplay?docid=8781097484701714492</a>
</div>
<p>Topics covered in lecture nineteen:</p>
<ul>
<li>[01:00] Review of single-source shortest path algorithms.</li>
<li>[04:45] All-pairs shortest paths by running single-source shortest path algorithms from each vertex.</li>
<li>[05:35] Unweighted edges: |V|xBFS = O(VE).</li>
<li>[06:35] Nonnegative edge weights: |V|xDijkstra = O(VE + V<sup>2</sup>lg(V)).</li>
<li>[07:40] General case: |V|xBellman-Ford = O(V<sup>2</sup>E).</li>
<li>[09:10] Formal definition of all-pairs shortest paths problem.</li>
<li>[11:08] |V|xBellman-Ford for dense graphs (E = O(V<sup>2</sup>)) is O(V<sup>4</sup>).</li>
<li>[11:54] Trying to beat O(V<sup>4</sup>) with dynamic programming.</li>
<li>[19:30] Dynamic programming algorithm, still O(V<sup>4</sup>).</li>
<li>[23:40] A better algorithm via wicked analogy with matrix multiplication. Running time: O(V<sup>3</sup>lg(V)).</li>
<li>[37:45] Floyd-Warshall algorithm. Runs in O(V<sup>3</sup>).</li>
<li>[47:35] Transitive closure problem of directed graphs.</li>
<li>[53:30] Johnson&#8217;s algorithm. Runs in O(VE + V<sup>2</sup>log(V))</li>
</ul>
<p>Lecture nineteen notes:</p>
<div class="center-aligner">
<table>
<tr>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-19-01.jpg' title='MIT Algorithms Lecture 19 Notes Thumbnail. Page 1 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-19-01-thumb.jpg' alt='MIT Algorithms Lecture 19 Notes Thumbnail. Page 1 of 2.' /></a><br />
<small>Lecture 19, page 1 of 2: all-pairs shortest paths, dynamic programming algorithm, matrix multiplication analogy.</small>
</td>
<td width="10"></td>
<td align="center" valign="top">
<a href='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-19-02.jpg' title='MIT Algorithms Lecture 19 Notes Thumbnail. Page 2 of 2.'><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/mit-algorithms-lecture-19-02-thumb.jpg' alt='MIT Algorithms Lecture 19 Notes Thumbnail. Page 2 of 2.' /></a><br />
<small>Lecture 19, page 2 of 2: floyd-warshall algorithm, transitive closure problem, johnson&#8217;s algorithn.</small>
</td>
</tr>
</table>
</div>
<p>Have fun with shortest path algorithms! The next post is going to be an introduction to parallel algorithms - things like dynamic multithreading, scheduling and multithreaded algorithms.</p>
<p>PS. This course is taught from the CLRS book (also called &#8220;Introduction to Algorithms&#8221;). Chapters 24 and 25, called &#8220;Single-Source Shortest Paths&#8221; and &#8220;All-Pairs Shortest Paths&#8221; explain everything I wrote about here in much more detail. If these topics excite you, you may want to buy this awesome book:</p>
<div class="center-aligner">
<iframe src="http://rcm.amazon.com/e/cm?t=freesciencand-20&#038;o=1&#038;p=8&#038;l=as1&#038;asins=0262032937&#038;fc1=000000&#038;IS2=1&#038;lt1=_blank&#038;m=amazon&#038;lc1=0000FF&#038;bc1=000000&#038;bg1=FFFFFF&#038;f=ifr" style="width:120px;height:240px;" scrolling="no" marginwidth="0" marginheight="0" frameborder="0"></iframe>
</div>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=pc7aZlr8"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=ByVyvE28"><img src="http://feeds.feedburner.com/~f/catonmat?i=ByVyvE28" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=udEguCsT"><img src="http://feeds.feedburner.com/~f/catonmat?i=udEguCsT" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=bAbJOHGF"><img src="http://feeds.feedburner.com/~f/catonmat?i=bAbJOHGF" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/0G8ioHf9vh8" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-twelve/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/mit-introduction-to-algorithms-part-twelve/</feedburner:origLink></item>
		<item>
		<title>Famous Sed One-Liners Explained, Part III</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/uVeAYs8Kibg/</link>
		<comments>http://www.catonmat.net/blog/sed-one-liners-explained-part-three/#comments</comments>
		<pubDate>Wed, 14 Jan 2009 14:10:20 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Programming]]></category>
<category>cat</category><category>one liner</category><category>one liners</category><category>sed</category><category>uniq</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/sed-one-liners-explained-part-three/</guid>
		<description><![CDATA[This is the third and final part of an article on the famous sed one-liners. This part will explain sed one-liners for selective deletion of certain lines and special applications of sed. See part one for introduction of the series.
Similarly to famous Awk one-liners, sed one-liners are short and concise sed programs that span less [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2007/07/post-icon-sed.jpg' alt='sed -- the superman of unix stream editing' class='post-icon' align='left' />This is the third and final part of an article on the <strong>famous sed one-liners</strong>. This part will explain sed one-liners for <strong>selective deletion of certain lines</strong> and <strong>special applications</strong> of sed. See <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-one/">part one</a> for introduction of the series.</p>
<p>Similarly to <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-one/">famous Awk one-liners</a>, sed one-liners are short and concise sed programs that span less than one terminal line. They were written by <a href="http://www.pement.org/">Eric Pement</a> and are floating around on the Internet as &#8216;<strong>sed1line.txt</strong>&#8216; file.</p>
<p>If you are intrigued by this article series, I suggest that you <a href="http://www.catonmat.net/feed/">subscribe to my posts</a>!</p>
<p>Eric&#8217;s sed one-liners are divided into several sections:</p>
<ul>
<li>1. <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-one/">File spacing</a> (explained in part one).</li>
<li>2. <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-one/">Numbering</a> (explained in part one).</li>
<li>3. <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-one/">Text conversion and substitution</a> (explained in part one).</li>
<li>4. <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-one/">Selective printing of certain lines</a> (explained in part two).</li>
<li>5. <strong>Selective deletion of certain lines (explained in this part)</strong>.</li>
<li>6. <strong>Special applications (explained in this part)</strong>.</li>
</ul>
<p>Grab my &#8220;<a href="http://www.catonmat.net/blog/sed-stream-editor-cheat-sheet/">sed cheat sheet</a>&#8221; and a local copy of <a href="http://www.catonmat.net/blog/wp-content/uploads/2008/09/sed1line.txt">sed1line.txt</a> file, and let the explanation begin!</p>
<h2 style="margin-bottom:10px">5. Selective Deletion of Certain Lines</h2>
<p><strong>68. Print all lines in the file except a section between two regular expressions.</strong></p>
<pre>sed '/Iowa/,/Montana/d'</pre>
<p>This one-liner continues where the previous left off. <a href="http://www.catonmat.net/blog/sed-one-liners-explained-part-two/">One-liner #67</a> used the range match &#8220;/start/,/finish/&#8221; to print lines between two regular expressions (inclusive). This one-liner, on the other hand, deletes lines between two regular expressions and prints all the lines outside this range. Just to remind you, a range &#8220;/start/,/finish/&#8221; matches all lines starting from the first line that matches a regular expression &#8220;/start/&#8221; to the first line that matches a regular expression &#8220;/finish/&#8221;. In this particular one-liner the &#8220;d&#8221;, delete, command is applied to these lines. The delete command prevents the matching lines from ever seeing the light.</p>
<p>For example, suppose your input to this one-liner was:</p>
<pre>
Florida
<strong>Iowa
New York
San Jose
Montana</strong>
Texas
Fairbanks
</pre>
<p>Then after the sed program has finished running, the output is:</p>
<pre>
Florida
Texas
Fairbanks
</pre>
<p>We see this output because the lines from Iowa to Montana matched the &#8220;/Iowa/,/Montana/&#8221; range match (i put the matched lines in bold) and were deleted.</p>
<p><strong>69. Delete duplicate, consecutive lines from a file (emulates &#8220;uniq&#8221;).</strong></p>
<pre>sed '$!N; /^\(.*\)\n\1$/!P; D'</pre>
<p>This one-liner acts as the &#8220;uniq&#8221; Unix utility. So how does it work? First of all, for every line that is not the very last line of input, sed appends the next line to the pattern space by the &#8220;N&#8221; command. The &#8220;N&#8221; command is restricted to all but the last line by &#8220;$!&#8221; restriction pattern. The newly appended line is separated from the previous line by the &#8220;\n&#8221; character. Next, the pattern space is matched against &#8220;/^\(.*\)\n\1$/&#8221; regular expression. This regular expression captures the previous line up to &#8220;\n&#8221; character and saves it in the match group &#8220;\1&#8243;. Then it tests if the newly appended line is the same as the previous one. If it is not, the &#8220;P&#8221; gets executed. If it is, the &#8220;P&#8221; command does not get executed. The &#8220;P&#8221; command prints everything in the pattern space up to the first &#8220;\n&#8221; character. Next the &#8220;D&#8221; command executes and deletes everything up to the first &#8220;\n&#8221; char, leaving only the newly read line in pattern space. It also forces the sed script to begin from the first command.</p>
<p>This way it loops over all lines, comparing two consecutive lines. If they are equal, the first line gets deleted, and a new line gets appended to what&#8217;s left. If they are not equal, the first one gets deleted, and deleted.</p>
<p>I think it&#8217;s hard to understand what is going on from this description. I&#8217;ll illustrate it with an example. Suppose this is the input:</p>
<pre>
foo
foo
foo
bar
bar
baz
</pre>
<p>The first thing sed does is it reads the first line of input in pattern space. The pattern space now contains &#8220;foo&#8221;. Now the &#8220;N&#8221; command executed. The pattern space now contains &#8220;foo\nfoo&#8221;. Next the pattern space is tested against &#8220;/^\(.*\)\n\1$/&#8221; regular expression. This regular expression matches because &#8220;\(.*\)&#8221; is &#8220;foo&#8221; and &#8220;/^\(.*\)\n\1$/&#8221; is &#8220;foo\nfoo&#8221;, exactly what we have in the pattern space. As it matched, the &#8220;P&#8221; command does not get executed. Now the &#8220;D&#8221; command executes, deleting the everything up to first &#8220;\n&#8221; from pattern space. The pattern space now contains just &#8220;foo&#8221;. The &#8220;D&#8221; command forces sed to start from the first command. Now the &#8220;N&#8221; is executed again, the pattern space now contains &#8220;foo\nfoo&#8221; again and the same thing happens, &#8220;P&#8221; does not get executed and &#8220;D&#8221; deletes the first &#8220;foo&#8221;, leaving the pattern space with just &#8220;foo&#8221; in it. Now the &#8220;N&#8221; gets executed once again, this time &#8220;bar&#8221; gets appended to pattern space. It contains &#8220;foo\nbar&#8221; now. The regular expression &#8220;/^\(.*\)\n\1$/&#8221; does not match and &#8220;P&#8221; gets executed, printing &#8220;foo&#8221;. After that &#8220;D&#8221; gets executed wiping &#8220;foo&#8221; from pattern space. The pattern space now contains &#8220;bar&#8221;. The commands restart and &#8220;N&#8221; gets executed, it appends the next &#8220;bar&#8221; to current pattern space. Now it contains &#8220;bar\nbar&#8221;. Just like with &#8220;foo\nfoo&#8221;, nothing gets printed, and &#8220;D&#8221; deletes the first &#8220;bar&#8221;, leaving pattern space with &#8220;bar&#8221;. The one-liner restarts its execution. Now &#8220;N&#8221; reads in the final line &#8220;baz&#8221;. The pattern space contains &#8220;bar\nbaz&#8221; which does not match the regular expression. The &#8220;P&#8221; prints out the &#8220;bar&#8221; and &#8220;D&#8221; deletes &#8220;bar&#8221;. Now &#8220;N&#8221; does not get executed because we are at the last line of input. The &#8220;$!N&#8221; restricts &#8220;N&#8221; to all lines but last. At this moment pattern space contains only the last &#8220;baz&#8221;, the regular expression does not match, so &#8220;baz&#8221; gets printed. The &#8220;D&#8221; command executes, emptying the pattern space. There is no more input and sed quits.</p>
<p>The output for this example is:</p>
<pre>
foo
bar
baz
</pre>
<p>I think this is one of the most detailed explanations I have written about a single one liner. <img src='http://www.catonmat.net/blog/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /> </p>
<p><strong>70. Delete duplicate, nonconsecutive lines from a file.</strong></p>
<pre>sed -n 'G; s/\n/&#038;&#038;/; /^\([ -~]*\n\).*\n\1/d; s/\n//; h; P'</pre>
<p>This is a very tricky one-liner. It stores the unique lines in hold buffer and at each newly read line, tests if the new line already is in the hold buffer. If it is, then the new line is purged. If it&#8217;s not, then it&#8217;s saved in hold buffer for future tests and printed.</p>
<p>A more detailed description - at each line this one-liner appends the contents of hold buffer to pattern space with &#8220;G&#8221; command. The appended string gets separated from the existing contents of pattern space by &#8220;\n&#8221; character. Next, a substitution is made to that substitutes the &#8220;\n&#8221; character with two &#8220;\n\n&#8221;. The substitute command &#8220;s/\n/&#038;&#038;/&#8221; does that. The &#8220;&#038;&#8221; means the matched string. As the matched string was &#8220;\n&#8221;, then &#8220;&#038;&#038;&#8221; is two copies of it &#8220;\n\n&#8221;. Next, a test &#8220;/^\([ -~]*\n\).*\n\1/&#8221; is done to see if the contents of group capture group 1 is repeated. The capture group 1 is all the characters from space &#8221; &#8221; to &#8220;~&#8221; (which include all printable chars). The &#8220;[ -~]*&#8221; matches that. Replacing one &#8220;\n&#8221; with two was the key idea here. As &#8220;\([ -~]*\n\)&#8221; is greedy (matches as much as possible), the double newline makes sure that it matches as little text as possible. If the test is successful, the current input line was already seen and &#8220;d&#8221; purges the whole pattern space and starts script execution from the beginning. If the test was not successful, the doubled &#8220;\n\n&#8221; gets replaced with a single &#8220;\n&#8221; by &#8220;s/\n//&#8221; command. Then &#8220;h&#8221; copies the whole string to hold buffer, and &#8220;P&#8221; prints the new line.</p>
<p><strong>71. Delete all lines except duplicate consecutive lines (emulates &#8220;uniq -d&#8221;).</strong></p>
<pre>sed '$!N; s/^\(.*\)\n\1$/\1/; t; D'</pre>
<p>This sed one-liner prints only the duplicate lines. This sed one-liner starts with reading in the next line from input with the &#8220;N&#8221; command. As I already mentioned, the current line and the next get separated by &#8220;\n&#8221; character after &#8220;N&#8221; executes. This one-liner also restrics &#8220;N&#8221; to all lines but last with &#8220;$!&#8221; restriction. Now a substitution &#8220;s/^\(.*\)\n\1$/\1/&#8221; is tried. Similarly to one-liner #69, this substitution replaces two repeating strings with one. For example, a string &#8220;foo\nfoo&#8221; gets replaced with just &#8220;foo&#8221;. Now, if this substitution was successful (there was a repeated string), the &#8220;t&#8221; command takes the script to the end where the current pattern space gets printed automatically. If the substitution was not successful, &#8220;D&#8221; executes, deleting the non-repeated string. The cycle continues and this way only the duplicate lines get printed once.</p>
<p>Let&#8217;s take a look at an example. Suppose the input is:</p>
<pre>
foo
foo
bar
baz
</pre>
<p>This one-liner reads the first line and immediately executes the &#8220;N&#8221; command. The pattern space now is &#8220;foo\nfoo&#8221;. The substitution &#8220;s/^\(.*\)\n\1$/\1/&#8221; is tried and it&#8217;s successful, because &#8220;foo&#8221; is repeated twice. The pattern space now contains just a single &#8220;foo&#8221;. As the substitution was successful, &#8220;t&#8221; command branches to the end of the script. At this moment &#8220;foo&#8221; gets printed. Now the cycle repeats. Sed reads in &#8220;bar&#8221;, the &#8220;N&#8221; command appends &#8220;baz&#8221; to &#8220;bar&#8221;. The pattern space now is &#8220;bar\nbaz&#8221;. The substitution is tried, but it&#8217;s not successful, as &#8220;bar&#8221; is not repeated. As the substitution failed, &#8220;t&#8221; does nothing and &#8220;D&#8221; executes, deleting &#8220;bar&#8221; from pattern space. The pattern space is left with single &#8220;baz&#8221;. Command &#8220;N&#8221; no longer executes as we reached end of file, substitution fails, &#8220;t&#8221; fails, and &#8220;D&#8221; deletes the &#8220;baz&#8221;.</p>
<p>The end result is:</p>
<pre>
foo
</pre>
<p>Just as we expected - only the duplicate line got printed.</p>
<p><strong>72. Delete the first 10 lines of a file.</strong></p>
<pre>sed '1,10d'</pre>
<p>This one-liner restricts the &#8220;d&#8221; command to a range of lines by number. The &#8220;1,10&#8243; means a range matching lines 1 to 10 inclusive. On each of the lines the &#8220;d&#8221; command gets executed. It deletes the current pattern space, and restarts the commands from beginning. The default action for lines > 10 is to print the line.</p>
<p><strong>73. Delete the last line of a file.</strong></p>
<pre>sed '$d'</pre>
<p>This one-liner restricts the &#8220;d&#8221; command to the last line of file. It&#8217;s done by specifying the special char &#8220;$&#8221; as the line to match. It matches only the last line. The last line gets deleted, but the others get printed implicitly.</p>
<p><strong>74. Delete the last 2 lines of a file.</strong></p>
<pre>sed 'N;$!P;$!D;$d'</pre>
<p>This one-liner always keeps two lines in the pattern space. At the very last line, it just does not output these last two. All the others before last two get output implicitly. Let&#8217;s see how it does it. As soon as sed reads the first line of input in pattern space, it executes the first command &#8220;N&#8221;. It places the 2nd line of input in pattern space. The next two commands &#8220;$!P&#8221; and &#8220;$!D&#8221; print the first part of pattern space up to newline character, and delete this part from pattern space. They keep doing it until the very last line gets appended to pattern space by &#8220;N&#8221; command. At this moment the last two lines are in pattern space and &#8220;$d&#8221; executes, deleting them both. That&#8217;s it. Last two lines got deleted.</p>
<p>If there is just one line of data, then it outputs it.</p>
<p><strong>75. Delete the last 10 lines of a file.</strong></p>
<pre>sed -e :a -e '$d;N;2,10ba' -e 'P;D'</pre>
<p>This is really straight forward one-liner. It always keeps 10 lines in pattern-space, by appending each new input line with &#8220;N&#8221;, and deleting the 11th excessive line with &#8220;D&#8221;. Once the end of file is reached, it &#8220;d&#8221; the whole pattern space, deleting the last 10 lines.</p>
<pre>sed -n -e :a -e '1,10!{P;N;D;};N;ba'</pre>
<p>This is also a straight forward one-liner. For the lines that are not 1-10, it appends them to pattern space with &#8220;N&#8221;. For lines > 10, it prints the first line in pattern space with &#8220;P&#8221;, appends another line with &#8220;N&#8221; and deletes the printed line with &#8220;D&#8221;. The &#8220;D&#8221; command causes sed to branch to the beginning of script! The &#8220;N;ba&#8221; at the end never, ever gets executed again for lines > 10. It keeps looping this way &#8220;P&#8221;, &#8220;N&#8221;, &#8220;D&#8221;, always keeping 10 lines in pattern space and printing line-10 on each cycle. The &#8220;N&#8221; command causes script to quit if it tries to read past end of file.</p>
<p><strong>76. Delete every 8th line.</strong></p>
<pre>gsed '0~8d'</pre>
<p>This one-liner only works with GNU Sed only. It uses a special address range match &#8220;first~step&#8221; that matches every <em>step</em>&#8216;th line starting with the <em>first</em>. In this one-liner <em>first</em> is 0 and <em>step</em> is 8. Zero is not a valid physical line number, so the very first line of input does not match. The first line to match is 8th, then 16th, then 24th, etc. Each line that matches is deleted by &#8220;d&#8221; command.</p>
<pre>sed 'n;n;n;n;n;n;n;d;'</pre>
<p>This is a portable version. The &#8220;n&#8221; command prints the current pattern space, empties it, and reads in the next line. It does so for every 7 lines, and 8th line gets deleted with &#8220;d&#8221;. This process continues until all input has been processed.</p>
<p><strong>77. Delete lines that match regular expression pattern.</strong></p>
<pre>sed '/pattern/d'</pre>
<p>This one-liner executes the &#8220;d&#8221; command on all lines that match &#8220;/pattern/&#8221;. The &#8220;d&#8221; command deletes the line and skips to the next line.</p>
<p><strong>78. Delete all blank lines in a file (emulates &#8220;grep &#8216;.&#8217;&#8221;.</strong></p>
<pre>sed '/^$/d'</pre>
<p>The regular expression &#8220;/^$/&#8221; in this one-liner tests if the beginning of line matches the end of the line. Only the empty lines have this property and sed deletes them.</p>
<p>Another way to do the same is:</p>
<pre>sed '/./!d'</pre>
<p>This one-liner tests if the line matches at least one character. The dot &#8220;.&#8221; in the regular expression matches any character. An empty line does not have any characters and it does not match this regular expression. Sed deletes all the lines that do not match this regular expression.</p>
<p><strong>79. Delete all consecutive blank lines from a file (emulates &#8220;cat -s&#8221;).</strong></p>
<pre>sed '/./,/^$/!d'</pre>
<p>This one-liner leaves one blank line at the end of the file, if there are multiple blanks at the end. Other than that, all consecutive blanks are stripped.</p>
<p>It uses an inverse range match &#8220;/start/,/finish/!&#8221; to &#8220;d&#8221; delete lines from first blank line, to first non-blank, non-inclusive.</p>
<pre>sed '/^$/N;/\n$/D'</pre>
<p>This one-liner leaves one blank line at the beginning and end of the file, if there are multiple blanks at both sides. Other than that, all consecutive blanks are stripped.</p>
<p>The consecutive empty lines get appended in pattern space by &#8220;/^$/N&#8221; command. The &#8220;/\n$/D&#8221; command matches and deletes blanks until only 1 is left. At that moment it no longer matches, and the line is output.</p>
<p><strong>80. Delete all consecutive blank lines from a file except the first two.</strong></p>
<pre>sed '/^$/N;/\n$/N;//D'</pre>
<p>In case of > 2 blank lines, this one-liner trims them down to two. There is a catch to this one-liner. Let me explain it first. See the last command &#8220;//D&#8221;? It&#8217;s a shortcut for &#8220;/previous-match/D&#8221;. In this case it&#8217;s shortcut for &#8220;/\n$/D&#8221;. Alright, now the one-liner itself. On every empty line, it appends the next to current pattern space with &#8220;/^$/N&#8221; command. Next it tests if the line just read in was actually a blank line with &#8220;/\n$/&#8221;, if it is, it reads another line in with &#8220;N&#8221;. At this moment it repeats the same test &#8220;/\n$/&#8221;. If the line was a blank one again, it deletes the first blank line and restarts sed script from the beginning. Notice that at all times only 2 consecutive blank lines are in pattern space. This way any number of blank lines get deleted and only two are left.</p>
<p><strong>81. Delete all leading blank lines at the top of a file.</strong></p>
<pre>sed '/./,$!d'</pre>
<p>This one-liner inverts a match &#8220;match from the first non-blank line to end of file&#8221;. It becomes &#8220;match from the beginning of file to last blank line&#8221;.</p>
<p><strong>82. Delete all trailing blank lines at the end of a file.</strong></p>
<pre>sed -e :a -e '/^\n*$/{$d;N;ba' -e '}'</pre>
<p>This one-liner accumulates blank lines in pattern space until it either hits end or hits a non-blank line. If it hits end, &#8220;$d&#8221; deletes the whole pattern space (which contained just the trailing blank lines) and quits. If however, it hits non-blank line, the whole pattern space gets printed implicitly and script continues as if nothing had happened.</p>
<p>This one is a portable version.</p>
<pre>gsed -e :a -e '/^\n*$/N;/\n$/ba'</pre>
<p>This is the same script, except a shorter version, made to work with Gnu Sed.</p>
<p><strong>83. Delete the last line of each paragraph.</strong></p>
<pre>sed -n '/^$/{p;h;};/./{x;/./p;}'</pre>
<p>This one-liner always keeps the previous line in hold buffer. It&#8217;s accomplished by 2nd block of commands &#8220;/./{x;/./p;}&#8221;. In this block, the pattern space (1 line) gets exchanged with hold buffer (1 line) by &#8220;x&#8221; command and if the hold buffer was not empty, it gets printed by &#8220;p&#8221;. The next moment to note is what happens on the first empty line. That is the line after the paragraph. At this moment &#8220;/^$/{p;h;}&#8221; gets executed, that prints the blank line (but does not print the last line of paragraph!), and puts the blank line in hold buffer. Once a new paragraph is reached, the script executed just like it was the very first paragraph of the input.</p>
<h2 style="margin-bottom:10px">6. Special Sed Applications</h2>
<p><strong>84. Remove nroff overstrikes.</strong></p>
<p>Nroff overstrikes are chars that are formatted to stand out in bold. They are achieved like in old typewriters, where you would do backspace and hit the same key again. In nroff it&#8217;s key CHAR, CTRL+H, CHAR. This one-liner deletes the CHAR, CTRL+H, leaving just plain CHAR.</p>
<pre>sed 's/.^H//g'</pre>
<p>Press Ctrl+V and then Ctrl+H to insert ^H literally in sed one-liner. It then uses the substitute command to delete any char &#8220;.&#8221; followed by CTRL+H &#8220;^H&#8221;.</p>
<p>Another way to do the same is use a hex escape expression that works in most recent seds:</p>
<pre>sed 's/.\x08//g'</pre>
<p>Yet another way is to use &#8220;echo&#8221; and enable interpretation of backslashed characters:</p>
<pre>sed 's/.'`echo -e "\b"`'//g'</pre>
<p><strong>85. Print Usenet/HTTP/Email message header.</strong></p>
<pre>gsed -r '/^\r?$/q'</pre>
<p>Usenet, HTTP and Email headers are similar. They are a bunch of text lines, separated from the body of the message with two new lines &#8220;\r\n\r\n&#8221;. Some implementations might even go with just &#8220;\n\n&#8221;. This one-liner quits on the first line that is either empty or contains &#8220;\r&#8221;. In other words, it prints the message header and quits.</p>
<p><strong>86. Print Usenet/HTTP/Email message body.</strong></p>
<pre>sed '1,/^$/d'</pre>
<p>This one-liner uses a range match &#8220;1,/^$/&#8221; to delete lines starting from 1st, and ending with the first blank line (inclusive). As I explained in the previous one-liner #78 above, &#8220;/^$/&#8221; matches empty lines. All the lines before first blank line in a Usenet/Email message or a HTTP header are message headers. They get deleted.</p>
<p><strong>87. Extract subject from an email message.</strong></p>
<pre>sed '/^Subject: */!d; s///; q'</pre>
<p>This one-liner deletes all lines that do not match &#8220;^Subject: &#8220;. Then it re-uses the match in &#8220;s///&#8221; to delete &#8220;Subject: &#8221; part from the line, leaving just the real subject. Please notice how &#8220;s///&#8221; is equivalent to &#8220;s/previous-match//&#8221;, where &#8220;previous-match&#8221; is &#8220;^Subject: *&#8221; in this one-liner.</p>
<p><strong>88. Extract sender information from an email message.</strong></p>
<pre>sed '/^From: */!d; s///; q'</pre>
<p>This one liner is equivalent to the previous one, except it prints sender information from email.</p>
<p><strong>89. Extract email address from a &#8220;Name Surname &lt;email@domain.com&gt;&#8221; string.</strong></p>
<pre>sed 's/.*< *//;s/ *>.*//;</pre>
<p>This one-liner strips all symbols before &lt; symbol (and any whitespace after it), and stips all symbols after &gt; symbol (including whitespace before it). That&#8217;s it. What&#8217;s left is email@domain.com.</p>
<p><strong>90. Add a leading angle bracket and space to each line (quote an email message).</strong></p>
<pre>sed 's/^/> /'</pre>
<p>This one-liner substitutes zero-width anchor &#8220;^&#8221; that matches beginning of line with &#8220;> &#8220;. As it&#8217;s a zero-width anchor, the result is that &#8220;> &#8221; gets added to beginning of each line.</p>
<p><strong>91. Delete leading angle bracket from each line (unquote an email message).</strong></p>
<pre>sed 's/^> //'</pre>
<p>It does what it says, deletes two characters &#8220;>&#8221; and a space &#8221; &#8221; from the beginning of each line.</p>
<p><strong>92. Strip HTML tags.</strong></p>
<pre>sed -e :a -e 's/&lt;[^>]*>//g;/&lt;/N;//ba'</pre>
<p>Sed is not made for parsing HTML. This is a very crude version of HTML tag eraser. It starts by creating a branch label named &#8220;a&#8221;. Then on each line it substitutes &#8220;&lt;[^>]*>&#8221; with nothing as many times as possible (&#8221;g&#8221; flag for s/// command). The &#8220;&lt;[^>]*>&#8221; expression means match match symbol &#8220;&lt;&#8221; followed by any other symbols that are not &#8220;>&#8221;, and that ends with &#8220;>&#8221;. This is a common pattern in regular expressions for non-greediness. Next, the one-liner tests if there are any open tags left on the line, if there are &#8220;N&#8221; reads the next line of input to make it work across multiple lines. &#8220;//ba&#8221; finally branches to the beginning of the script (it&#8217;s short for &#8220;/previous-expression/ba&#8221; which in this case is &#8220;/&lt;/ba&#8221;).</p>
<h2 style="margin-bottom:10px">Have Fun!</h2>
<p>This post completes the three part article on the superman of Unix stream editing. I hope that you learned a thing or two and it was my pleasure explaining them.</p>
<p>Just as with the three part article on <a href="http://www.catonmat.net/blog/awk-one-liners-explained-part-one/">famous Awk one-liners explained</a>, my plans are to create a &#8216;<strong>awk1line-explained.txt</strong>&#8216; file that will be supplementary to &#8216;awk1line.txt&#8217;, and publish an ebook in pdf format.</p>
<p>If you have any comments on the one-liners, please let me know in the comments. Thanks!</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=xThS1ydr"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=mcvQxgNs"><img src="http://feeds.feedburner.com/~f/catonmat?i=mcvQxgNs" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=FvPadVG1"><img src="http://feeds.feedburner.com/~f/catonmat?i=FvPadVG1" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=7VRUExEs"><img src="http://feeds.feedburner.com/~f/catonmat?i=7VRUExEs" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/uVeAYs8Kibg" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/sed-one-liners-explained-part-three/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/sed-one-liners-explained-part-three/</feedburner:origLink></item>
		<item>
		<title>How to Save Time by Watching Videos at Higher Playback Speeds</title>
		<link>http://feedproxy.google.com/~r/catonmat/~3/MSXlNjd_4yQ/</link>
		<comments>http://www.catonmat.net/blog/how-to-save-time-by-watching-videos-at-higher-playback-speeds/#comments</comments>
		<pubDate>Wed, 07 Jan 2009 10:10:52 +0000</pubDate>
		<dc:creator>Peteris Krumins</dc:creator>
		
		<category><![CDATA[Hacker's Approach]]></category>
<category>alan kay</category><category>avisynth</category><category>oopsla</category><category>pitch</category><category>playback rate</category><category>speed up</category><category>video</category><category>videos</category>
		<guid isPermaLink="false">http://www.catonmat.net/blog/how-to-save-time-by-watching-videos-at-higher-playback-speeds/</guid>
		<description><![CDATA[This is going to be a small, technical tutorial on how to save a lot of time by watching videos at higher playback rates.
I first read about this idea from my most favorite personal development blog at Steve Pavlina.com. In his post &#8220;Overclock Your Audio Learning&#8221; he says that he occasionally listens to audios at [...]]]></description>
			<content:encoded><![CDATA[<p><img src='http://www.catonmat.net/blog/wp-content/uploads/2009/01/save-time-speed-up-video.jpg' alt='Save Time by Speeding Up Videos' class="post-icon" align="left" />This is going to be a small, technical tutorial on <strong>how to save a lot of time</strong> by watching videos at higher playback rates.</p>
<p>I first read about this idea from my most favorite personal development blog at <a href="http://www.stevepavlina.com">Steve Pavlina.com</a>. In his post &#8220;<a href="http://www.stevepavlina.com/blog/2007/08/overclock-your-audio-learning/">Overclock Your Audio Learning</a>&#8221; he says that he occasionally listens to audios at 4.1x. At this speed 4 hour video/audio can be listened in less 1 hour!</p>
<p>I personally found it impossible to understand anything at 4x speed. My optimal listening speed is 1.65x - 2.1x.</p>
<p>To speed up the videos you will first need to download and install <a href="http://avisynth.org">AviSynth</a>. AviSynth is kind of a video programming language with which you can do all kinds of manipulations to videos programmatically. If you are on Windows, then during the installation make sure to associate .avs file extension with Windows Media Player and not Notepad.</p>
<p>Next, create this AviSynth script, and place it in the same directory as your video. Name the script as &#8220;speedup.avs&#8221; or something similar. Make sure the extension is &#8220;.avs&#8221; if you are on Windows!</p>
<pre>
<strong>file</strong> = &#34;file_name_of_video.avi&#34;
<strong>speedup</strong> = 1.65
<strong>pitch</strong> = 100

DirectShowSource(file)

audio_rate = last.audiorate
video_rate = last.framerate

AssumeSampleRate(int(audio_rate*speedup))
AssumeFPS(video_rate*speedup)
TimeStretch(pitch = pitch)
</pre>
<p>There are three variables that you can change in this simple script. The first is &#8220;<strong>file</strong>&#8220;. It should be the filename of the video you are about to watch. The next is &#8220;<strong>speedup</strong>&#8220;. It&#8217;s the new playback rate, you may set it to any value you wish. For example, if you set it to 2.0, then the video will play twice as fast as it normally would. And the last parameter to change is the &#8220;<strong>pitch</strong>&#8220;. You may change it to something lower than 100 when the video plays at higher speeds to make the speaker sound lower. I usually keep &#8220;speedup&#8221; at 1.65 and &#8220;pitch&#8221; at 75.</p>
<p>Once you have made your own configuration, just double click the .avs on Windows to play it at the new playback speed, or play it through mplayer on Linux!</p>
<p><strong>Update:</strong> My blog readers <strong>bobb</strong> and <strong>crb</strong> suggested two new techniques on how to watch videos faster. Bobb suggested just to use mplayer. Turns out it already has an option to play videos faster!</p>
<pre>
mplayer -speed 1.65 file.avi

# use keys [ ], and { } to control the playback speed
# use backspace to reset video speed to normal.
</pre>
<p>Crb suggested to use <a href="http://myspeedtube.com/?RC=ytchw">MySpeed™ Plug-In for YouTube</a> to speed up video on YouTube in real time.</p>
<p>Here are a few examples that I crafted at speeds 1x, 1.4x, 1.65x, 2x, 2.25x, 2.5x, 2.75x and 3x. They are from Alan Kay&#8217;s keynote speech &#8220;The computer revolution hasn&#8217;t happend yet&#8221; at OOPSLA Conference in 1997.</p>
<p>After a few minutes of listening at higher speeds, try going back to listen at 1x. It will seem incredibly slow because your mind will have adapted to the faster input rate.</p>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at a normal speed. Total time: 5 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/MaRODiPR-ZU&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/MaRODiPR-ZU&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=MaRODiPR-ZU">http://www.youtube.com/watch?v=MaRODiPR-ZU</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 1.4x speed. Total time: 3:34 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/Oc_3yk22gn8&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/Oc_3yk22gn8&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=Oc_3yk22gn8">http://www.youtube.com/watch?v=Oc_3yk22gn8</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 1.65x speed. Total time: 3 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/lrAq86Qk_rU&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/lrAq86Qk_rU&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=lrAq86Qk_rU">http://www.youtube.com/watch?v=lrAq86Qk_rU</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 2x speed. Total time: 2:30 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/WlwEq4HXB3Y&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/WlwEq4HXB3Y&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=WlwEq4HXB3Y">http://www.youtube.com/watch?v=WlwEq4HXB3Y</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 2.25x speed. Total time: 2:13 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/b0K5JMjtP3w&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/b0K5JMjtP3w&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=b0K5JMjtP3w">http://www.youtube.com/watch?v=b0K5JMjtP3w</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 2.5x speed. Total time: 2 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/YBt01TFNIA0&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/YBt01TFNIA0&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=YBt01TFNIA0">http://www.youtube.com/watch?v=YBt01TFNIA0</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 2.75x speed. Total time: 1:49 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/BNCqZp0XOVw&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/BNCqZp0XOVw&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=BNCqZp0XOVw">http://www.youtube.com/watch?v=BNCqZp0XOVw</a>
</div>
<div class="center-aligner">
<small>Alan Kay&#8217;s talk at 3x speed. Total time: 1:40 mins.</small><br />
<object width="425" height="344">
<param name="movie" value="http://www.youtube.com/v/8DsuyPNOqGw&#038;hl=en&#038;fs=1"></param>
<param name="allowFullScreen" value="true"></param>
<param name="allowscriptaccess" value="always"></param><embed src="http://www.youtube.com/v/8DsuyPNOqGw&#038;hl=en&#038;fs=1" type="application/x-shockwave-flash" allowscriptaccess="always" allowfullscreen="true" width="425" height="344"></embed></object><br />
Direct URL: <a href="http://www.youtube.com/watch?v=8DsuyPNOqGw">http://www.youtube.com/watch?v=8DsuyPNOqGw</a>
</div>
<p>Do you have any other techniques to speed up videos? I am also curious at what speeds do you feel the most comfortable watching the videos?</p>
<p>It would also be cool to create a hack that modifies youtube and google video players to make them play videos faster natively.</p>
<p>Ps. Did you know that I was a big fan of video lectures? I have been collecting math, physics, computer science video lectures for almost 3 years now and posting them to my other <a href="http://freescienceonline.blogspot.com">Free Science Online Blog</a>.</p>
<div class="feedflare">
<a href="http://feeds.feedburner.com/~f/catonmat?a=KlKRr93i"><img src="http://feeds.feedburner.com/~f/catonmat?d=41" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=vLyXmjCa"><img src="http://feeds.feedburner.com/~f/catonmat?i=vLyXmjCa" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=mybycf1A"><img src="http://feeds.feedburner.com/~f/catonmat?i=mybycf1A" border="0"></img></a> <a href="http://feeds.feedburner.com/~f/catonmat?a=IqJdlV5c"><img src="http://feeds.feedburner.com/~f/catonmat?i=IqJdlV5c" border="0"></img></a>
</div><img src="http://feeds.feedburner.com/~r/catonmat/~4/MSXlNjd_4yQ" height="1" width="1"/>]]></content:encoded>
			<wfw:commentRss>http://www.catonmat.net/blog/how-to-save-time-by-watching-videos-at-higher-playback-speeds/feed/</wfw:commentRss>
		<feedburner:origLink>http://www.catonmat.net/blog/how-to-save-time-by-watching-videos-at-higher-playback-speeds/</feedburner:origLink></item>
	</channel>
</rss>
