<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" media="screen" href="/~d/styles/atom10full.xsl"?><?xml-stylesheet type="text/css" media="screen" href="http://feeds.feedburner.com/~d/styles/itemcontent.css"?><feed xmlns="http://www.w3.org/2005/Atom" xmlns:feedburner="http://rssnamespace.org/feedburner/ext/1.0">
 
 <title>The Julia Blog</title>
 
 <link href="http://julialang.org/blog" />
 <updated>2012-05-15T17:56:56-07:00</updated>
 <id>http://julialang.org/blog</id>
 <author>
   <name>Julia Developers</name>
   <email>julia-dev@googlegroups.com</email>
 </author>

 
 <atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="self" type="application/atom+xml" href="http://feeds.feedburner.com/JuliaLang" /><feedburner:info uri="julialang" /><atom10:link xmlns:atom10="http://www.w3.org/2005/Atom" rel="hub" href="http://pubsubhubbub.appspot.com/" /><entry>
   <title>New York Open Stats Meetup</title>
   <link href="http://feedproxy.google.com/~r/JuliaLang/~3/nGWSNSEoA3A/nyc-open-stats-meetup-announcement" />
   <updated>2012-04-18T00:00:00-07:00</updated>
   <id>http://julialang.org/blog/2012/04/nyc-open-stats-meetup-announcement</id>
   <content type="html">&lt;p&gt;I&amp;rsquo;ll be giving a talk on Julia at the &lt;a href="http://www.meetup.com/nyhackr/events/60839932/"&gt;New York Open Statistical Programming Meetup on May 1st&lt;/a&gt;. After my presentation, &lt;a href="http://www.johnmyleswhite.com/"&gt;John Myles White&lt;/a&gt; and &lt;a href="http://www.statalgo.com/"&gt;Shane Conway&lt;/a&gt; are going to give followup demos of statistical applications using Julia. Then we&amp;rsquo;re going to hang out and grab drinks nearby. Thanks to &lt;a href="http://www.harlan.harris.name/"&gt;Harlan Harris&lt;/a&gt; and &lt;a href="http://www.drewconway.com/"&gt;Drew Conway&lt;/a&gt; for setting the whole thing up!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Announcement:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After a brief hiatus, we are very excited to announce our May meetup will feature one of the hottest new languages in statistical computing: Julia.  We are delighted to welcome Stefan Karpinski, one of the creators of Julia, to give an introduction to the language and his perspective on statistical computing.&lt;/p&gt;

&lt;p&gt;Julia is a general-purpose, high-level, dynamic language in the tradition of Lisp, Perl, Python and Ruby. It is designed to take advantage of modern techniques for executing dynamic languages with statically-compiled performance. As part of this design, the language has an expressive type system, which programmers may leverage for dispatch and error checking, incidentally providing the compiler with useful type information. Using types is entirely optional, however: &amp;ldquo;typeless Julia&amp;rdquo; is a valid and useful subset of the language, similar to traditional dynamic languages, which nevertheless runs at statically compiled speeds.&lt;/p&gt;

&lt;p&gt;Julia is especially good at running Matlab and R-style programs. Given its level of performance, we envision a new era of technical computing where libraries can be developed in a high-level language instead of C or Fortran. We have also experimented with cloud API integration, and begun to develop a web-based interactive computing environment. The ultimate goal is to make cloud-based supercomputing as easy and accessible as Google Docs.&lt;/p&gt;

&lt;p&gt;We will also hear from a mix of people who have already started developing in Julia and see some examples of what they have developed.&lt;/p&gt;

&lt;p&gt;The meetup will follow our typical schedule: pizza will begin at 6:15pm, Stefan will begin promptly at 7pm, and we will head to The Central Bar around 8:30pm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; You can see the slides for the talk &lt;a href="/images/nyhackr.pdf"&gt;here&lt;/a&gt;. There was no video of the talk, but hopefully the slides are informative: among other things, they contain a lot of code examples that should just work if pasted into the Julia REPL.&lt;/p&gt;
</content>
 <feedburner:origLink>http://julialang.org/blog/2012/04/nyc-open-stats-meetup-announcement</feedburner:origLink></entry>
 
 <entry>
   <title>Lang.NEXT Announcement</title>
   <link href="http://feedproxy.google.com/~r/JuliaLang/~3/PigIYrg6rjw/lang-next-talk-announcement" />
   <updated>2012-03-24T00:00:00-07:00</updated>
   <id>http://julialang.org/blog/2012/03/lang-next-talk-announcement</id>
   <content type="html">&lt;p&gt;Jeff and I will be giving a &lt;a href="http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Julia"&gt;presentation on Julia&lt;/a&gt; at the upcoming &lt;a href="http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012"&gt;Lang.NEXT conference&lt;/a&gt;, a gathering of &amp;ldquo;programming language design experts and enthusiasts&amp;rdquo; featuring &amp;ldquo;talks, panels and discussion on leading programming language work from industry and research.&amp;rdquo;
We are honored and excited to have been invited to speak at an event alongside so many programming language luminaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Julia is a dynamic language in the tradition of Lisp, Perl, Python and Ruby. It aims to advance expressiveness and convenience for scientific and technical computing beyond that of environments like Matlab and NumPy, while simultaneously closing the performance gap with compiled languages like C, C++, Fortran and Java.&lt;/p&gt;

&lt;p&gt;Most high-performance dynamic language implementations have taken an existing interpreted language and worked to accelerate its execution. In creating Julia, we have reconsidered the basic language design, taking into account the capabilities of modern JIT compilers and the specific needs of technical computing. Our design includes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Multiple dispatch as the core language paradigm.&lt;/li&gt;
&lt;li&gt;Exposing a sophisticated type system including parametric dependent types.&lt;/li&gt;
&lt;li&gt;Dynamic type inference to generate fast code from programs with no declarations.&lt;/li&gt;
&lt;li&gt;Aggressive specialization of generated code for types encountered at run-time.&lt;/li&gt;
&lt;/ul&gt;


&lt;p&gt;Julia feels light and natural for data exploration and algorithm prototyping, but has performance that lets you deploy your prototypes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; You can see the slides for our talk &lt;a href="/images/lang.next.pdf"&gt;here&lt;/a&gt;. Video of the presentation is available &lt;a href="http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Julia"&gt;here&lt;/a&gt;.&lt;/p&gt;
</content>
 <feedburner:origLink>http://julialang.org/blog/2012/03/lang-next-talk-announcement</feedburner:origLink></entry>
 
 <entry>
   <title>Shelling Out Sucks</title>
   <link href="http://feedproxy.google.com/~r/JuliaLang/~3/0H6a-4z6Wno/shelling-out-sucks" />
   <updated>2012-03-11T00:00:00-08:00</updated>
   <id>http://julialang.org/blog/2012/03/shelling-out-sucks</id>
   <content type="html">&lt;p&gt;Spawning a pipeline of connected programs via an intermediate shell — a.k.a. &amp;ldquo;shelling out&amp;rdquo; — is a really convenient and effective way to get things done.
It&amp;rsquo;s so handy that some &amp;ldquo;&lt;a href="http://en.wikipedia.org/wiki/Glue_language"&gt;glue languages&lt;/a&gt;,&amp;rdquo; like &lt;a href="http://www.perl.org/"&gt;Perl&lt;/a&gt; and &lt;a href="http://www.ruby-lang.org/"&gt;Ruby&lt;/a&gt;, even have special syntax for it (backticks).
However, shelling out is also a common source of bugs, security holes, unnecessary overhead, and silent failures.
Here are the three reasons why shelling out is problematic:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="#Metacharacter+Brittleness"&gt;Metacharacter brittleness.&lt;/a&gt;&lt;/em&gt;
When commands are constructed programmatically, the resulting code is almost always brittle:
if a variable used to construct the command contains any shell metacharacters, including spaces, the command will likely break and do something very different than what was intended — potentially something quite dangerous.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="#Indirection+and+Inefficiency"&gt;Indirection and inefficiency.&lt;/a&gt;&lt;/em&gt;
When shelling out, the main program forks and execs a shell process just so that the shell can in turn fork and exec a series of commands with their inputs and outputs appropriately connected.
Not only is starting a shell an unnecessary step, but since the main program is not the parent of the pipeline commands, it cannot be notified when they terminate — it can only wait for the pipeline to finish and hope the shell indicates what happened.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="#Silent+Failures+by+Default"&gt;Silent failures by default.&lt;/a&gt;&lt;/em&gt;
Errors in shelled out commands don&amp;rsquo;t automatically become exceptions in most languages.
This default leniency leads to code that fails silently when shelled out commands don&amp;rsquo;t work.
Worse still, because of the indirection problem, there are many cases where the failure of a process in a spawned pipeline &lt;em&gt;cannot&lt;/em&gt; be detected by the parent process, even if errors are fastidiously checked for.&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;In the rest of this post, I&amp;rsquo;ll go over examples demonstrating each of these problems.
At &lt;a href="#Summary+and+Remedy"&gt;the end&lt;/a&gt;, I&amp;rsquo;ll talk about better alternatives to shelling out, and in a followup post, I&amp;rsquo;ll demonstrate how Julia makes these better alternatives dead simple to use.
Examples below are given in Ruby which shells out to &lt;a href="http://www.gnu.org/software/bash/"&gt;Bash&lt;/a&gt;, but the same problems exist no matter what language one shells out from:
it&amp;rsquo;s the technique of using an intermediate shell process to spawn external commands that&amp;rsquo;s at fault, not the language.&lt;/p&gt;

&lt;h2 id="Metacharacter+Brittleness"&gt;Metacharacter Brittleness&lt;/h2&gt;

&lt;p&gt;Let&amp;rsquo;s start with a simple example of shelling out from Ruby.
Suppose you want to count the number of lines containing the string &amp;ldquo;foo&amp;rdquo; in all the files under a directory given as an argument.
One option is to write Ruby code that reads the contents of the given directory, finds all the files, opens them and iterates through them looking for the string &amp;ldquo;foo&amp;rdquo;.
However, that&amp;rsquo;s a lot of work and it&amp;rsquo;s going to be much slower than using a pipeline of standard UNIX commands, which are written in C and heavily optimized.
The most natural and convenient thing to do in Ruby is to shell out, using backticks to capture output:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This expression interpolates the &lt;code&gt;dir&lt;/code&gt; variable into a command, spawns a Bash shell to execute the resulting command, captures the output into a string, and then converts that string to an integer.
The command uses the &lt;code&gt;-print0&lt;/code&gt; and &lt;code&gt;-0&lt;/code&gt; options to correctly handle strange characters in file names piped from &lt;code&gt;find&lt;/code&gt; to &lt;code&gt;xargs&lt;/code&gt; (these options cause file names to be delimited by &lt;a href="http://en.wikipedia.org/wiki/Null_character"&gt;NULs&lt;/a&gt; instead of whitespace).
Even with extra-careful options, this code for shelling out is simple and clear.
Here it is in action:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):001:0&amp;gt; dir="src"
=&amp;gt; "src"
irb(main):002:0&amp;gt; `find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
=&amp;gt; 5    
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Great.
However, this only works as expected if the directory name &lt;code&gt;dir&lt;/code&gt; doesn&amp;rsquo;t contain any characters that the shell considers special.
For example, the shell decides what constitutes a single argument to a command using whitespace.
Thus, if the value of &lt;code&gt;dir&lt;/code&gt; is a directory name containing a space, this will fail:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):003:0&amp;gt; dir="source code"
=&amp;gt; "source code"
irb(main):004:0&amp;gt; `find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
find: `source': No such file or directory
find: `code': No such file or directory
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;
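
&lt;p&gt;To make the splitting visible, here is a minimal sketch that uses &lt;code&gt;Shellwords.split&lt;/code&gt; from Ruby&amp;rsquo;s standard library as a rough stand-in for the shell&amp;rsquo;s word splitting. It only approximates Bourne-shell rules and is not what Bash itself runs; the helper name &lt;code&gt;shell_words&lt;/code&gt; is made up for this illustration.&lt;/p&gt;

```ruby
require 'shellwords'

# Hypothetical helper: interpolate `dir` into the find command the same way
# the backtick expression does, then split the result into words.
# Shellwords.split approximates Bourne-shell word splitting; it is only a
# stand-in for the shell, not the shell itself.
def shell_words(dir)
  Shellwords.split("find #{dir} -type f -print0")
end

p shell_words("src")          # the directory is a single word
p shell_words("source code")  # the space turns the name into two words
```

&lt;p&gt;With &lt;code&gt;"source code"&lt;/code&gt; the resulting list contains &lt;code&gt;source&lt;/code&gt; and &lt;code&gt;code&lt;/code&gt; as two separate words, which is why &lt;code&gt;find&lt;/code&gt; complained about two missing directories.&lt;/p&gt;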

&lt;p&gt;The simple solution to the problem of spaces is to surround the interpolated directory name in quotes, telling the shell to treat spaces inside as normal characters:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):005:0&amp;gt; `find '#{dir}' -type f -print0 | xargs -0 grep foo | wc -l`.to_i
=&amp;gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Excellent.
So what&amp;rsquo;s the problem?
While this solution addresses the issue of file names with spaces in them, it is still brittle with respect to other shell metacharacters.
What if a file name has a quote character in it?
Let&amp;rsquo;s try it.
First, let&amp;rsquo;s create a very weirdly named directory:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;bash-3.2$ mkdir "foo'bar"
bash-3.2$ echo foo &amp;gt; "foo'bar"/test.txt
bash-3.2$ ls -ld foo*bar
drwxr-xr-x 3 stefan staff 102 Feb  3 16:17 foo'bar/
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;That&amp;rsquo;s an admittedly strange directory name, but it&amp;rsquo;s perfectly legal in UNIXes of all flavors.
Now back to Ruby:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):006:0&amp;gt; dir="foo'bar"
=&amp;gt; "foo'bar"
irb(main):007:0&amp;gt; `find '#{dir}' -type f -print0  | xargs -0 grep foo | wc -l`.to_i
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Doh.
Although this may seem like an unlikely corner case that one needn&amp;rsquo;t realistically worry about, there are serious security ramifications.
Suppose the name of the directory came from an untrusted source — like a web submission, or an argument to a setuid program from an untrusted user.
Suppose an attacker could arrange for any value of &lt;code&gt;dir&lt;/code&gt; they wanted:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):008:0&amp;gt; dir="foo'; echo MALICIOUS ATTACK 1&amp;gt;&amp;amp;2; echo '"
=&amp;gt; "foo'; echo MALICIOUS ATTACK 1&amp;gt;&amp;amp;2; echo '"
irb(main):009:0&amp;gt; `find '#{dir}' -type f -print0  | xargs -0 grep foo | wc -l`.to_i
find: `foo': No such file or directory
MALICIOUS ATTACK
grep:  -type f -print0
: No such file or directory
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Your box is now owned.
Of course, you could sanitize the value of the &lt;code&gt;dir&lt;/code&gt; variable, but there&amp;rsquo;s a fundamental tug-of-war between security (as limited as possible) and flexibility (as unlimited as possible).
The ideal behavior is to allow any directory name, no matter how bizarre, as long as it actually exists, but &amp;ldquo;defang&amp;rdquo; all shell metacharacters.&lt;/p&gt;

&lt;p&gt;The only way to fully protect against these sorts of metacharacter attacks, whether malicious or accidental, while still using an external shell to construct the pipeline is to do full shell metacharacter escaping:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):010:0&amp;gt; require 'shellwords'
=&amp;gt; true
irb(main):011:0&amp;gt; `find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
find: `foo\'; echo MALICIOUS ATTACK 1&amp;gt;&amp;amp;2; echo \'': No such file or directory
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;With shell escaping, this safely attempts to search a very oddly named directory instead of executing the malicious attack.
Although shell escaping does work (assuming that there aren&amp;rsquo;t any mistakes in the shell escaping implementation), realistically, no one actually bothers — it&amp;rsquo;s too much trouble.
Instead, code that shells out with programmatically constructed commands is typically riddled with potential bugs in the best case and massive security holes in the worst case.&lt;/p&gt;
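
&lt;p&gt;As a rough check of what escaping buys, the sketch below escapes the directory names used so far and then reads each escaped form back with &lt;code&gt;Shellwords.split&lt;/code&gt;, which approximates (but is not) the shell&amp;rsquo;s own parser. The helper name &lt;code&gt;single_word_after_escape?&lt;/code&gt; is hypothetical.&lt;/p&gt;

```ruby
require 'shellwords'

# Hypothetical helper: true if the escaped form of `name` is read back as
# exactly one word, identical to the original, by Shellwords.split
# (an approximation of Bourne-shell word splitting).
def single_word_after_escape?(name)
  Shellwords.split(Shellwords.escape(name)) == [name]
end

["src", "source code", "foo'bar", "foo'; echo MALICIOUS ATTACK; echo '"].each do |name|
  puts "#{name.inspect} is escaped as #{Shellwords.escape(name)}"
  puts "  read back as one word: #{single_word_after_escape?(name)}"
end
```

&lt;p&gt;Every name, including the malicious one, should come back as a single argument, so the embedded quotes and semicolons never reach the shell as syntax.&lt;/p&gt;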

&lt;h2 id="Indirection+and+Inefficiency"&gt;Indirection and Inefficiency&lt;/h2&gt;

&lt;p&gt;If we were using the above code to count the number of lines with the string &amp;ldquo;foo&amp;rdquo; in a directory, we would want to check to see if everything worked and respond appropriately if something went wrong.
In Ruby, you can check if a shelled out command was successful using the bizarrely named &lt;code&gt;$?.success?&lt;/code&gt; indicator:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):012:0&amp;gt; dir="src"                                                              
=&amp;gt; "src"
irb(main):013:0&amp;gt; `find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
=&amp;gt; 5
irb(main):014:0&amp;gt; $?.success?                                                                
=&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Ok, that correctly indicates success.
Let&amp;rsquo;s make sure that it can detect failure:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):015:0&amp;gt; dir="nonexistent"                                                              
=&amp;gt; "nonexistent"
irb(main):016:0&amp;gt; `find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
find: `nonexistent': No such file or directory
=&amp;gt; 0
irb(main):017:0&amp;gt; $?.success?
=&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Wait. What?!
That wasn&amp;rsquo;t successful.
What&amp;rsquo;s going on?&lt;/p&gt;

&lt;p&gt;The heart of the problem is that when you shell out, the commands in the pipeline are not immediate children of the main program, but rather its grandchildren:
the program spawns a shell, which makes a bunch of UNIX pipes, forks child processes, connects inputs and outputs to pipes using the &lt;a href="https://developer.apple.com/library/IOs/#documentation/System/Conceptual/ManPages_iPhoneOS/man2/dup2.2.html"&gt;&lt;code&gt;dup2&lt;/code&gt; system call&lt;/a&gt;, and then execs the appropriate commands.
As a result, your main program is not the parent of the commands in the pipeline, but rather, their grandparent.
Therefore, it doesn&amp;rsquo;t know their process IDs, nor can it wait on them or get their exit statuses when they terminate.
The shell process, which is their parent, has to do all of that.
Your program can only wait for the shell to finish and see if &lt;em&gt;that&lt;/em&gt; was successful.
If the shell is only executing a single command, this is fine:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):018:0&amp;gt; `cat /dev/null`
=&amp;gt; ""
irb(main):019:0&amp;gt; $?.success?
=&amp;gt; true
irb(main):020:0&amp;gt; `cat /dev/nada`
cat: /dev/nada: No such file or directory
=&amp;gt; ""
irb(main):021:0&amp;gt; $?.success?
=&amp;gt; false
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Unfortunately, by default the shell is quite lenient about what it considers to be a successful pipeline:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):022:0&amp;gt; `cat /dev/nada | sort`
cat: /dev/nada: No such file or directory
=&amp;gt; ""
irb(main):023:0&amp;gt; $?.success?
=&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;As long as the last command in a pipeline succeeds — in this case &lt;code&gt;sort&lt;/code&gt; — the entire pipeline is considered a success.
Thus, even when one or more of the earlier programs in a pipeline fails spectacularly, the last command may not, leading the shell to consider the entire pipeline to be successful.
This is probably not what you meant by success.&lt;/p&gt;

&lt;p&gt;Bash&amp;rsquo;s notion of pipeline success can fortunately be made stricter with the &lt;code&gt;pipefail&lt;/code&gt; option.
This option causes the shell to consider a pipeline successful only if all of its commands are successful:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):024:0&amp;gt; `set -o pipefail; cat /dev/nada | sort`
cat: /dev/nada: No such file or directory
=&amp;gt; ""
irb(main):025:0&amp;gt; $?.success?
=&amp;gt; false
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Since shelling out spawns a new shell every time, this option has to be set for every multi-command pipeline in order to be able to determine its true success status.
Of course, just like shell-escaping every interpolated variable, setting &lt;code&gt;pipefail&lt;/code&gt; at the start of every command is simply something that no one actually does.
Moreover, even with the &lt;code&gt;pipefail&lt;/code&gt; option, your program has no way of determining &lt;em&gt;which&lt;/em&gt; commands in a pipeline were unsuccessful — it just knows that something somewhere went wrong.
While that&amp;rsquo;s better than silently failing and continuing as if there were no problem, it&amp;rsquo;s not very helpful for postmortem debugging:
many programs are not as well-behaved as &lt;code&gt;cat&lt;/code&gt; and don&amp;rsquo;t actually identify themselves or the specific problem when printing error messages before going belly up.&lt;/p&gt;

&lt;p&gt;Given the other problems caused by the indirection of shelling out, it seems like a barely relevant afterthought to mention that execing a shell process just to spawn a bunch of other processes is inefficient.
However, it is a real source of unnecessary overhead:
the main process could just do the work the shell does itself.
Asking the kernel to fork a process and exec a new program is a non-trivial amount of work.
The only reason to have the shell do this work for you is that it&amp;rsquo;s complicated and hard to get right.
The shell makes it easy.
So programming languages have traditionally relied on the shell to set up pipelines for them, regardless of the additional overhead and problems caused by indirection.&lt;/p&gt;
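
&lt;p&gt;To give a sense of the work the shell does on a program&amp;rsquo;s behalf, here is a minimal sketch that wires up the equivalent of &lt;code&gt;cat PATH | sort&lt;/code&gt; by hand, using &lt;code&gt;IO.pipe&lt;/code&gt;, &lt;code&gt;Process.spawn&lt;/code&gt; and &lt;code&gt;Process.wait2&lt;/code&gt; from Ruby&amp;rsquo;s core library. The function name &lt;code&gt;cat_sort&lt;/code&gt; and its return shape are invented for this illustration, and it assumes &lt;code&gt;cat&lt;/code&gt; and &lt;code&gt;sort&lt;/code&gt; are on the &lt;code&gt;PATH&lt;/code&gt;; real code would also need error handling around the spawn calls.&lt;/p&gt;

```ruby
# Hypothetical helper: hand-built equivalent of `cat PATH | sort`, with no
# intermediate shell. Returns the output of sort plus the exit status of
# each command.
def cat_sort(path)
  pipe_r, pipe_w = IO.pipe      # connects the stdout of cat to the stdin of sort
  result_r, result_w = IO.pipe  # carries the stdout of sort back to this program

  # Both commands are spawned directly, so they are children of this process.
  cat_pid  = Process.spawn("cat", path, out: pipe_w)
  sort_pid = Process.spawn("sort", in: pipe_r, out: result_w)

  # The parent must close its own copies of the pipe ends; otherwise sort
  # never sees end-of-file and the read below never finishes.
  [pipe_r, pipe_w, result_w].each { |io| io.close }

  output = result_r.read
  result_r.close

  # Because both commands are direct children, each one can be waited on.
  _, cat_status  = Process.wait2(cat_pid)
  _, sort_status = Process.wait2(sort_pid)
  [output, { cat: cat_status, sort: sort_status }]
end

output, statuses = cat_sort("/dev/nada")
statuses.each { |name, status| puts "#{name}: success=#{status.success?}" }
```

&lt;p&gt;Because both commands are direct children of the program, a failing &lt;code&gt;cat&lt;/code&gt; shows up in its own status even though &lt;code&gt;sort&lt;/code&gt; succeeds.&lt;/p&gt;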

&lt;h2 id="Silent+Failures+by+Default"&gt;Silent Failures by Default&lt;/h2&gt;

&lt;p&gt;Let&amp;rsquo;s return to our example of shelling out to count &amp;ldquo;foo&amp;rdquo; lines.
Here&amp;rsquo;s the total expression we need to use in order to shell out without being susceptible to metacharacter breakage and so we can actually tell whether the entire pipeline succeeded:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;`set -o pipefail; find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, an error isn&amp;rsquo;t raised by default when a shelled out command fails.
To avoid silent errors, we need to explicitly check &lt;code&gt;$?.success?&lt;/code&gt; after every time we shell out and raise an exception if it indicates failure.
Of course, doing this manually is tedious, and as a result, it largely isn&amp;rsquo;t done.
The default behavior — and therefore the easiest and most common behavior — is to assume that shelled out commands worked and completely ignore failures.
To make our &amp;ldquo;foo&amp;rdquo; counting example well-behaved, we would have to wrap it in a function like so:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;def foo_count(dir)
  n = `set -o pipefail;
       find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
  raise("pipeline failed") unless $?.success?
  return n
end
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;This function behaves the way we would like it to:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;irb(main):026:0&amp;gt; foo_count("src")
=&amp;gt; 5
irb(main):027:0&amp;gt; foo_count("source code")
=&amp;gt; 5
irb(main):028:0&amp;gt; foo_count("nonexistent")
find: `nonexistent': No such file or directory
RuntimeError: pipeline failed
    from (irb):5:in `foo_count'
    from (irb):13
    from :0
irb(main):029:0&amp;gt; foo_count("foo'; echo MALICIOUS ATTACK; echo '")
find: `foo\'; echo MALICIOUS ATTACK; echo \'': No such file or directory
RuntimeError: pipeline failed
    from (irb):5:in `foo_count'
    from (irb):14
    from :0
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;However, this 6-line, 200-character function is a far cry from the clarity and brevity we started with:&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;If most programmers saw the longer, safer version of this in a program, they&amp;rsquo;d probably wonder why someone was writing such verbose, cryptic code to get something so simple and straightforward done.&lt;/p&gt;

&lt;h2 id="Summary+and+Remedy"&gt;Summary and Remedy&lt;/h2&gt;

&lt;p&gt;To sum it up, shelling out is great, but making code that shells out bug-free, secure, and not prone to silent failures requires three things that typically aren&amp;rsquo;t done:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Shell-escaping all values used to construct commands&lt;/li&gt;
&lt;li&gt;Prefixing each multi-command pipeline with &amp;ldquo;&lt;code&gt;set -o pipefail;&lt;/code&gt;&amp;rdquo;&lt;/li&gt;
&lt;li&gt;Explicitly checking for failure after each shelled out command&lt;/li&gt;
&lt;/ol&gt;


&lt;p&gt;The trouble is that after doing all of these things, shelling out is no longer terribly convenient, and the code becomes annoyingly verbose.
In short, shelling out responsibly kind of sucks.&lt;/p&gt;

&lt;p&gt;As is so often the case, the root of all of these problems is relying on a middleman rather than doing things yourself.
If a program constructs and executes pipelines itself, it remains in control of all the subprocesses, can determine their individual exit conditions, automatically handle errors appropriately, and give accurate, comprehensive diagnostic messages when things go wrong.
Moreover, without a shell to interpret commands, there is also no shell to treat metacharacters specially, and therefore no danger of metacharacter brittleness.
&lt;a href="http://python.org/"&gt;Python&lt;/a&gt; gets this right:
using &lt;a href="http://docs.python.org/library/os.html#os.popen"&gt;&lt;code&gt;os.popen&lt;/code&gt;&lt;/a&gt; to shell out is officially deprecated, and the recommended way to call external programs is to use the &lt;a href="http://docs.python.org/library/subprocess.html"&gt;&lt;code&gt;subprocess&lt;/code&gt;&lt;/a&gt; module, which spawns external programs without using a shell.
Constructing pipelines using &lt;code&gt;subprocess&lt;/code&gt; &lt;a href="http://docs.python.org/library/subprocess.html#replacing-shell-pipeline"&gt;can be a little verbose&lt;/a&gt;, but it is safe and avoids all the problems that shelling out is prone to.
In my followup post, I will describe how Julia makes constructing and executing pipelines of external commands as safe as Python&amp;rsquo;s &lt;code&gt;subprocess&lt;/code&gt; and as convenient as shelling out.&lt;/p&gt;
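
&lt;p&gt;For comparison, Ruby&amp;rsquo;s standard &lt;code&gt;Open3&lt;/code&gt; library can also spawn a pipeline without an intermediate shell. The following is a sketch of the running example under that approach, not a drop-in replacement: &lt;code&gt;foo_count_direct&lt;/code&gt; is a hypothetical name, and the code assumes &lt;code&gt;find&lt;/code&gt;, &lt;code&gt;xargs&lt;/code&gt;, &lt;code&gt;grep&lt;/code&gt; and &lt;code&gt;wc&lt;/code&gt; are available on the &lt;code&gt;PATH&lt;/code&gt;.&lt;/p&gt;

```ruby
require 'open3'

# Hypothetical helper: count lines containing "foo" under `dir` without an
# intermediate shell. Each command is an argument array, so `dir` reaches
# find verbatim, whatever characters it contains.
def foo_count_direct(dir)
  commands = [
    ["find", dir, "-type", "f", "-print0"],
    ["xargs", "-0", "grep", "foo"],
    ["wc", "-l"],
  ]
  Open3.pipeline_r(*commands) do |last_stdout, wait_threads|
    count = last_stdout.read.to_i
    # One status per command, in pipeline order.
    statuses = wait_threads.map { |thread| thread.value }
    commands.zip(statuses).each do |command, status|
      raise "#{command.first} failed: #{status}" unless status.success?
    end
    count
  end
end
```

&lt;p&gt;Note that &lt;code&gt;grep&lt;/code&gt; exits with a non-zero status when it finds no matching lines, so this strict check, like the &lt;code&gt;pipefail&lt;/code&gt; version above, reports a directory with no &amp;ldquo;foo&amp;rdquo; lines as a failure; whether that is desirable depends on the caller.&lt;/p&gt;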
</content>
 <feedburner:origLink>http://julialang.org/blog/2012/03/shelling-out-sucks</feedburner:origLink></entry>
 
 <entry>
   <title>Stanford Talk Video</title>
   <link href="http://feedproxy.google.com/~r/JuliaLang/~3/tetgEe_JhbE/stanford-talk-video" />
   <updated>2012-03-01T00:00:00-08:00</updated>
   <id>http://julialang.org/blog/2012/03/stanford-talk-video</id>
   <content type="html">&lt;p&gt;Jeff gave his &lt;a href="/blog/2012/02/talk-announcement/"&gt;previously announced&lt;/a&gt;, invited talk at Stanford yesterday and the video is &lt;a href="http://ee380.stanford.edu/cgi-bin/videologger.php?target=120229-ee380-300.asx"&gt;available here&lt;/a&gt;.
Congrats, Jeff!&lt;/p&gt;
</content>
 <feedburner:origLink>http://julialang.org/blog/2012/03/stanford-talk-video</feedburner:origLink></entry>
 
 <entry>
   <title>Stanford Talk Announcement</title>
   <link href="http://feedproxy.google.com/~r/JuliaLang/~3/vjpljHmVJy8/talk-announcement" />
   <updated>2012-02-27T00:00:00-08:00</updated>
   <id>http://julialang.org/blog/2012/02/talk-announcement</id>
   <content type="html">&lt;p&gt;I will be speaking about Julia at the
&lt;a href="http://www.stanford.edu/class/ee380/"&gt;Stanford EE Computer Systems Colloquium&lt;/a&gt;
on Wednesday, February 29 at 4:15PM PST.
The title of the talk is &lt;em&gt;Julia: A Fast Dynamic Language For Technical Computing&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Julia is a general-purpose, high-level, dynamic language, designed from the start to take advantage of techniques for executing dynamic languages at statically-compiled language speeds. As a result the language has a more powerful type system, and generally provides better type information to the compiler.&lt;/p&gt;

&lt;p&gt;Julia is especially good at running MATLAB and R-style programs. Given its level of performance, we envision a new era of technical computing where libraries can be developed in a high-level language instead of C or FORTRAN. We have also experimented with cloud API integration, and begun to develop a web-based, language-neutral platform for visualization and collaboration. The ultimate goal is to make cloud-based supercomputing as easy and accessible as Google Docs.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Speaker Bio:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;&lt;p&gt;Jeff Bezanson has been developing the Julia language for two and a half years with a small distributed team of collaborators. Previously, he worked as a software engineer at Interactive Supercomputing, which developed the Star-P parallel extension to MATLAB. At the company, Jeff was a principal developer of &amp;ldquo;M#&amp;rdquo;, an implementation of the MATLAB language running on .NET. He is now a second-year graduate student at MIT. Jeff received an A.B. in Computer Science from Harvard University in 2004, and has experience with applications of technical computing in medical imaging.&lt;/p&gt;&lt;/blockquote&gt;

&lt;p&gt;The talk will be webcast live.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Edit:&lt;/strong&gt; the video of the talk can be &lt;a href="http://ee380.stanford.edu/cgi-bin/videologger.php?target=120229-ee380-300.asx"&gt;found here&lt;/a&gt;.&lt;/p&gt;
</content>
 <feedburner:origLink>http://julialang.org/blog/2012/02/talk-announcement</feedburner:origLink></entry>
 
 <entry>
   <title>Why We Created Julia</title>
   <link href="http://feedproxy.google.com/~r/JuliaLang/~3/E7TOUPv4r60/why-we-created-julia" />
   <updated>2012-02-14T00:00:00-08:00</updated>
   <id>http://julialang.org/blog/2012/02/why-we-created-julia</id>
   <content type="html">&lt;p&gt;In short, because we are greedy.&lt;/p&gt;

&lt;p&gt;We are power Matlab users.
Some of us are Lisp hackers.
Some are Pythonistas, others Rubyists, still others Perl hackers.
There are those of us who used Mathematica before we could grow facial hair.
There are those who still can&amp;rsquo;t grow facial hair.
We&amp;rsquo;ve generated more R plots than any sane person should.
C is our desert island programming language.&lt;/p&gt;

&lt;p&gt;We love all of these languages;
they are wonderful and powerful.
For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — each one is perfect for some aspects of the work and terrible for others.
Each one is a trade-off.&lt;/p&gt;

&lt;p&gt;We are greedy: we want more.&lt;/p&gt;

&lt;p&gt;We want a language that&amp;rsquo;s open source, with a liberal license.
We want the speed of C with the dynamism of Ruby.
We want a language that&amp;rsquo;s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like Matlab.
We want something as usable for general programming as Python,
as easy for statistics as R,
as natural for string processing as Perl,
as powerful for linear algebra as Matlab,
as good at gluing programs together as the shell.
Something that is dirt simple to learn, yet keeps the most serious hackers happy.
We want it interactive and we want it compiled.&lt;/p&gt;

&lt;p&gt;(Did we mention it should be as fast as C?)&lt;/p&gt;

&lt;p&gt;While we&amp;rsquo;re being demanding, we want something that provides the distributed power of Hadoop — without the kilobytes of boilerplate Java and XML;
without being forced to sift through gigabytes of log files on hundreds of machines to find our bugs.
We want the power without the layers of impenetrable complexity.
We want to write simple scalar loops that compile down to tight machine code using just the registers on a single CPU.
We want to write &lt;code&gt;A*B&lt;/code&gt; and launch a thousand computations on a thousand machines, calculating a vast matrix product together.&lt;/p&gt;

&lt;p&gt;We never want to mention types when we don&amp;rsquo;t feel like it.
But when we need polymorphic functions, we want to use generic programming to write an algorithm just once and apply it to an infinite lattice of types;
we want to use multiple dispatch to efficiently pick the best method for all of a function&amp;rsquo;s arguments, from dozens of method definitions, providing common functionality across drastically different types.
Despite all this power, we want the language to be simple and clean.&lt;/p&gt;

&lt;p&gt;All this doesn&amp;rsquo;t seem like too much to ask for, does it?&lt;/p&gt;

&lt;p&gt;Even though we recognize that we are inexcusably greedy, we still want to have it all.
About two and a half years ago, we set out to create the language of our greed.
It&amp;rsquo;s not complete, but it&amp;rsquo;s time for a 1.0 release — the language we&amp;rsquo;ve created is called &lt;a href="/"&gt;Julia&lt;/a&gt;.
It already delivers on 90% of our ungracious demands, and now it needs the ungracious demands of others to shape it further.
So, if you are also a greedy, unreasonable, demanding programmer, we want you to give it a try.&lt;/p&gt;
</content>
 <feedburner:origLink>http://julialang.org/blog/2012/02/why-we-created-julia</feedburner:origLink></entry>
 
 
</feed>

