<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">

 <title>The Julia Blog</title>
 <link href="http://julialang.org/blog/feed.xml" rel="self"/>
 <link href="http://julialang.org/blog"/>
 <updated>2017-03-14T09:51:44+00:00</updated>
 <id>http://julialang.org/blog</id>
 <author>
   <name>Julia Developers</name>
   <email>julia-dev@googlegroups.com</email>
 </author>

 
 <entry>
   <title>Technical preview: Native GPU programming with CUDAnative.jl</title>
   <link href="http://julialang.org/blog/2017/03/cudanative"/>
   <updated>2017-03-14T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2017/03/cudanative</id>
   <content type="html">&lt;p&gt;After 2 years of slow but steady development, we would like to announce the first preview
release of native GPU programming capabilities for Julia. You can now write your CUDA
kernels in Julia, albeit with some restrictions, making it possible to use Julia’s
high-level language features to write high-performance GPU code.&lt;/p&gt;

&lt;p&gt;The programming support we’re demonstrating here today consists of the low-level building
blocks, sitting at the same abstraction level as CUDA C. You should be interested if you
know (or want to learn) how to program a parallel accelerator like a GPU, while dealing with
tricky performance characteristics and communication semantics.&lt;/p&gt;

&lt;p&gt;You can easily add GPU support to your Julia installation (see below for detailed
instructions) by installing &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl&quot;&gt;CUDAnative.jl&lt;/a&gt;. This
package is built on top of experimental interfaces to the Julia compiler, and the
purpose-built &lt;a href=&quot;https://github.com/maleadt/LLVM.jl&quot;&gt;LLVM.jl&lt;/a&gt; and
&lt;a href=&quot;https://github.com/JuliaGPU/CUDAdrv.jl&quot;&gt;CUDAdrv.jl&lt;/a&gt; packages to compile and execute code.
All this functionality is brand-new and thoroughly untested, so we need your help and
feedback in order to improve and finalize the interfaces before Julia 1.0.&lt;/p&gt;

&lt;h2 id=&quot;how-to-get-started&quot;&gt;How to get started&lt;/h2&gt;

&lt;p&gt;CUDAnative.jl is tightly integrated with the Julia compiler and the underlying LLVM
framework, which complicates version and platform compatibility. For this preview we only
support Julia 0.6 built from source, on Linux or macOS. Luckily, installing Julia from
source is well documented in the &lt;a href=&quot;https://github.com/JuliaLang/julia/blob/master/README.md#source-download-and-compilation&quot;&gt;main repository’s
README&lt;/a&gt;.
Most of the time it boils down to the following commands:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ git clone https://github.com/JuliaLang/julia.git
$ cd julia
$ git checkout v0.6.0-pre.alpha  # or any later tag
$ make                           # add -jN for N parallel jobs
$ ./julia
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;From the Julia REPL, installing CUDAnative.jl and its dependencies is just a matter of using
the package manager. Do note that you need to be using the NVIDIA binary driver, and have
the CUDA toolkit installed.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; Pkg.add(&quot;CUDAnative&quot;)

# Optional: test the package
julia&amp;gt; Pkg.test(&quot;CUDAnative&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;At this point, you can start writing kernels and execute them on the GPU using CUDAnative’s
&lt;code class=&quot;highlighter-rouge&quot;&gt;@cuda&lt;/code&gt;! Be sure to check out the
&lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/tree/master/examples&quot;&gt;examples&lt;/a&gt;, or continue
reading for a more textual introduction.&lt;/p&gt;

&lt;h2 id=&quot;hello-world-vector-addition&quot;&gt;&lt;del&gt;Hello World&lt;/del&gt; Vector addition&lt;/h2&gt;

&lt;p&gt;A typical small demo of GPU programming capabilities (think of it as the &lt;em&gt;GPU Hello World&lt;/em&gt;)
is to perform a vector addition. The snippet below does exactly that using Julia and
CUDAnative.jl:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CUDAdrv&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CUDAnative&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; kernel_vadd&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# from CUDAnative: (implicit) CuDeviceArray type,&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;#                  and thread/block intrinsics&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;blockIdx&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blockDim&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;threadIdx&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;()&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;nothing&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;dev&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CuDevice&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CuContext&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dev&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# generate some data&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;512&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# allocate &amp;amp; upload on the GPU&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;d_a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CuArray&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;d_b&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CuArray&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;d_c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;similar&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d_a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# execute and fetch results&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@cuda&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kernel_vadd&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d_a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d_b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d_c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;    &lt;span class=&quot;c&quot;&gt;# from CUDAnative.jl&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d_c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Base&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Test&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@test&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;destroy&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ctx&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;how-does-it-work&quot;&gt;How does it work?&lt;/h3&gt;

&lt;p&gt;Most of this example does not rely on CUDAnative.jl, but uses functionality from CUDAdrv.jl.
This package makes it possible to interact with CUDA hardware through user-friendly wrappers
of CUDA’s driver API. For example, it provides an array type &lt;code class=&quot;highlighter-rouge&quot;&gt;CuArray&lt;/code&gt;, takes care of memory
management, integrates with Julia’s garbage collector, implements &lt;code class=&quot;highlighter-rouge&quot;&gt;@elapsed&lt;/code&gt; using GPU
events, etc. It is meant to form a strong foundation for all interactions with the CUDA
driver, and does not require a bleeding-edge version of Julia. A slightly higher-level
alternative is available under &lt;a href=&quot;https://github.com/JuliaGPU/CUDArt.jl&quot;&gt;CUDArt.jl&lt;/a&gt;, building
on the CUDA runtime API instead, but it hasn’t been integrated with CUDAnative.jl yet.&lt;/p&gt;

&lt;p&gt;Meanwhile, CUDAnative.jl takes care of all things related to native GPU programming. The
most significant part of that is generating GPU code, which essentially consists of three
phases:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;strong&gt;interfacing with Julia&lt;/strong&gt;: repurpose the compiler to emit GPU-compatible LLVM IR (no
calls to CPU libraries, simplified exceptions, …)&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;interfacing with LLVM&lt;/strong&gt; (using LLVM.jl): optimize the IR, and compile to PTX&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;interfacing with CUDA&lt;/strong&gt; (using CUDAdrv.jl): compile PTX to SASS, and upload it to the
GPU&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;All this is hidden behind the call to &lt;code class=&quot;highlighter-rouge&quot;&gt;@cuda&lt;/code&gt;, which generates code to compile our kernel
upon first use. Every subsequent invocation will re-use that code, convert and upload
arguments&lt;sup id=&quot;fnref:1&quot;&gt;&lt;a href=&quot;#fn:1&quot; class=&quot;footnote&quot;&gt;1&lt;/a&gt;&lt;/sup&gt;, and finally launch the kernel. And much like we’re used to on the CPU, you
can introspect this code using runtime reflection:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;c&quot;&gt;# CUDAnative.jl provides alternatives to the @code_ macros,&lt;/span&gt;
&lt;span class=&quot;c&quot;&gt;# looking past @cuda and converting argument types&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CUDAnative&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;@code_llvm&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@cuda&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;len&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kernel_vadd&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d_a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d_b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d_c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;define&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;void&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@julia_kernel_vadd_68711&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;LLVM&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IR&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# ... but you can also invoke without @cuda&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@code_ptx&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;kernel_vadd&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d_a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d_b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d_c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;visible&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;func&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;julia_kernel_vadd_68729&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;
    &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;PTX&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CODE&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;}&lt;/span&gt;

&lt;span class=&quot;c&quot;&gt;# or manually specify types (this is error prone!)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;code_sass&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel_vadd&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CuDeviceArray&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float32&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CuDeviceArray&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float32&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CuDeviceArray&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float32&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;code&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sm_20&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;Function&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;julia_kernel_vadd_68481&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;SASS&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CODE&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Another important part of CUDAnative.jl is its set of intrinsics: special functions and macros
that provide functionality hard or impossible to express using normal functions. For
example, the &lt;code class=&quot;highlighter-rouge&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;thread,block,grid&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}{&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;Idx,Dim&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt; functions provide access to the size and index
of each level of work. Local shared memory can be created using the &lt;code class=&quot;highlighter-rouge&quot;&gt;@cuStaticSharedMem&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;@cuDynamicSharedMem&lt;/code&gt; macros, while &lt;code class=&quot;highlighter-rouge&quot;&gt;@cuprintf&lt;/code&gt; can be used to display a formatted string
from within a kernel function. Many &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/blob/0721783db9ac4cc2c2948cbf8cbff4aa5f7c4271/src/device/intrinsics.jl#L499-L807&quot;&gt;math
functions&lt;/a&gt; are also available;
these should be used instead of similar functions in the standard library.&lt;/p&gt;
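
&lt;p&gt;The index arithmetic behind these intrinsics can be checked on the CPU. The snippet below is
only a sketch in plain Julia, not GPU code: &lt;code class=&quot;highlighter-rouge&quot;&gt;global_index&lt;/code&gt; is a hypothetical helper that
mirrors the expression used in &lt;code class=&quot;highlighter-rouge&quot;&gt;kernel_vadd&lt;/code&gt;, and the launch configuration of 4 blocks
of 128 threads is an assumption chosen to cover 512 elements.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# CPU-only sketch: the 1-based global index each GPU thread would compute.
# `global_index` is a hypothetical helper, not part of CUDAnative.jl.
global_index(block, blockdim, thread) = (block - 1) * blockdim + thread

# assumed launch configuration: 4 blocks of 128 threads each
blocks, threads = 4, 128

# enumerate every (block, thread) pair, as the hardware would
indices = [global_index(b, threads, t) for b in 1:blocks for t in 1:threads]

# each element of a length-512 vector is visited exactly once
@assert indices == collect(1:blocks * threads)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With the single-block launch used in the vector addition example, the block term is zero
and the index reduces to the thread index.&lt;/p&gt;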

&lt;h3 id=&quot;what-is-missing&quot;&gt;What is missing?&lt;/h3&gt;

&lt;p&gt;As I’ve already hinted, we don’t support all features of the Julia language yet. For
example, it is currently impossible to call any function from the Julia C runtime library
(a.k.a. &lt;code class=&quot;highlighter-rouge&quot;&gt;libjulia.so&lt;/code&gt;). This makes dynamic allocations impossible, cripples exceptions, etc.
As a result, large parts of the standard library are unavailable for use on the GPU. We will
obviously try to improve this in the future, but for now the compiler will error when it
encounters unsupported language features:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; nope() = println(42)
nope (generic function with 1 method)

julia&amp;gt; @cuda (1,1) nope()
ERROR: error compiling nope: emit_builtin_call for REPL[1]:1 requires the runtime language feature, which is disabled
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Another big gap is documentation. Most of CUDAnative.jl mimics or copies &lt;a href=&quot;https://docs.nvidia.com/cuda/cuda-c-programming-guide/&quot;&gt;CUDA
C&lt;/a&gt;, while CUDAdrv.jl wraps the &lt;a href=&quot;http://docs.nvidia.com/cuda/cuda-driver-api/&quot;&gt;CUDA
driver API&lt;/a&gt;. But we haven’t documented what
parts of those APIs are covered, or how the abstractions behave, so you’ll need to refer to
the examples and tests in the CUDAnative and CUDAdrv repositories.&lt;/p&gt;

&lt;h2 id=&quot;another-example-parallel-reduction&quot;&gt;Another example: parallel reduction&lt;/h2&gt;

&lt;p&gt;For a more complex example, let’s have a look at a &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/blob/0721783db9ac4cc2c2948cbf8cbff4aa5f7c4271/examples/reduce/reduce.cu&quot;&gt;parallel
reduction&lt;/a&gt; for &lt;a href=&quot;https://devblogs.nvidia.com/parallelforall/faster-parallel-reductions-kepler/&quot;&gt;Kepler-generation
GPUs&lt;/a&gt;. This
is a typical well-optimized GPU implementation, using fast communication primitives at each
level of execution. For example, threads within a warp execute together on a SIMD-like core,
and can share data through each other’s registers. At the block level, threads are allocated
on the same core but don’t necessarily execute together, which means they need to
communicate through core-local memory. Another level up, only the GPU’s DRAM is a
viable communication medium.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/blob/0721783db9ac4cc2c2948cbf8cbff4aa5f7c4271/examples/reduce/reduce.jl&quot;&gt;Julia version of this algorithm&lt;/a&gt;
looks pretty similar to the CUDA original: this is as intended, because CUDAnative.jl is a
counterpart to CUDA C. The new version is much more generic though, specializing both on the
reduction operator and value type. And just like we’re used to with regular Julia code, the
&lt;code class=&quot;highlighter-rouge&quot;&gt;@cuda&lt;/code&gt; macro will just-in-time compile and dispatch to the correct specialization based on
the argument types.&lt;/p&gt;
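
&lt;p&gt;To illustrate what specializing on the operator and value type means, here is a CPU-only
sketch of a pairwise (tree) reduction in plain Julia. It is not the linked GPU implementation,
which relies on warp-level and block-level communication; &lt;code class=&quot;highlighter-rouge&quot;&gt;tree_reduce&lt;/code&gt; is a hypothetical
name, and the sketch assumes a non-empty input and an operator that is associative and
commutative.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# CPU-only sketch of a pairwise reduction, generic in operator and element type.
# `tree_reduce` is a hypothetical name; `op` is assumed associative and commutative.
function tree_reduce(op, xs::AbstractVector)
    @assert !isempty(xs)
    buf = copy(xs)              # work on a copy, leave the input untouched
    n = length(buf)
    while n != 1
        half = cld(n, 2)
        # on a GPU, each iteration of this inner loop would be one thread
        for i in 1:(n - half)
            buf[i] = op(buf[i], buf[i + half])
        end
        n = half                # the upper half has been folded into the lower
    end
    return buf[1]
end

# Julia compiles a separate specialization for each combination of argument types
total   = tree_reduce(+, collect(1:100))
largest = tree_reduce(max, [3.0, 7.5, 1.25])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;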

&lt;p&gt;So how does it perform? Turns out, pretty well! The chart below compares the performance of
both the CUDAnative.jl and CUDA C implementations&lt;sup id=&quot;fnref:2&quot;&gt;&lt;a href=&quot;#fn:2&quot; class=&quot;footnote&quot;&gt;2&lt;/a&gt;&lt;/sup&gt;, using BenchmarkTools.jl to &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/blob/0721783db9ac4cc2c2948cbf8cbff4aa5f7c4271/examples/reduce/benchmark.jl&quot;&gt;measure
the execution time&lt;/a&gt;. The small
constant overhead (note the logarithmic scale) is due to a deficiency in argument passing,
and will be fixed.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/blog/2017-03-14-cudanative/performance.png&quot; alt=&quot;Performance comparison of parallel reduction
implementations.&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We also aim to be compatible with tools from the CUDA toolkit. For example, you can &lt;a href=&quot;/images/blog/2017-03-14-cudanative/nvvp.png&quot;&gt;profile
Julia kernels&lt;/a&gt; using the NVIDIA Visual
Profiler, or use &lt;code class=&quot;highlighter-rouge&quot;&gt;cuda-memcheck&lt;/code&gt; to detect out-of-bound accesses&lt;sup id=&quot;fnref:3&quot;&gt;&lt;a href=&quot;#fn:3&quot; class=&quot;footnote&quot;&gt;3&lt;/a&gt;&lt;/sup&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ cuda-memcheck julia examples/oob.jl
========= CUDA-MEMCHECK
========= Invalid __global__ write of size 4
=========     at 0x00000148 in examples/oob.jl:14:julia_memset_66041
=========     by thread (10,0,0) in block (0,0,0)
=========     Address 0x1020b000028 is out of bounds
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Full debug information &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/issues/31&quot;&gt;is not
available&lt;/a&gt; yet, so &lt;code class=&quot;highlighter-rouge&quot;&gt;cuda-gdb&lt;/code&gt; and
friends will not work very well.&lt;/p&gt;

&lt;h2 id=&quot;try-it-out&quot;&gt;Try it out!&lt;/h2&gt;

&lt;p&gt;If you have experience with GPUs or CUDA development, or maintain a package which could
benefit from GPU acceleration, please have a look or try out CUDAnative.jl! We need all the
feedback we can get, in order to prioritize development and finalize the infrastructure
before Julia hits 1.0.&lt;/p&gt;

&lt;h3 id=&quot;i-want-to-help&quot;&gt;I want to help&lt;/h3&gt;

&lt;p&gt;Even better! There are many ways to contribute, for example by looking at the issue trackers
of the individual packages making up this support:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/issues&quot;&gt;CUDAnative.jl&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaGPU/CUDAdrv.jl/issues&quot;&gt;CUDAdrv.jl&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/maleadt/LLVM.jl/issues&quot;&gt;LLVM.jl&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of those packages is also in perpetual need of better API coverage, and documentation
to cover and explain what has already been implemented.&lt;/p&gt;

&lt;h2 id=&quot;thanks&quot;&gt;Thanks&lt;/h2&gt;

&lt;p&gt;This work would not have been possible without Viral Shah and Alan Edelman arranging my stay
at MIT. I’d like to thank everybody at Julia Central and around; it has been a blast! I’m
also grateful to Bjorn De Sutter, and IWT Vlaanderen, for supporting my time at Ghent
University.&lt;/p&gt;

&lt;hr /&gt;
&lt;div class=&quot;footnotes&quot;&gt;
  &lt;ol&gt;
    &lt;li id=&quot;fn:1&quot;&gt;
      &lt;p&gt;See the &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/blob/0721783db9ac4cc2c2948cbf8cbff4aa5f7c4271/README.md#object-arguments&quot;&gt;README&lt;/a&gt; for a note on how expensive this currently is.&amp;nbsp;&lt;a href=&quot;#fnref:1&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:2&quot;&gt;
      &lt;p&gt;The measurements include memory transfer time, which is why a CPU implementation was not included (realistically, data would be kept on the GPU as long as possible, making it an unfair comparison).&amp;nbsp;&lt;a href=&quot;#fnref:2&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
    &lt;li id=&quot;fn:3&quot;&gt;
      &lt;p&gt;Bounds-checked arrays are not supported yet, due to &lt;a href=&quot;https://github.com/JuliaGPU/CUDAnative.jl/issues/4&quot;&gt;a bug in the NVIDIA PTX compiler&lt;/a&gt;.&amp;nbsp;&lt;a href=&quot;#fnref:3&quot; class=&quot;reversefootnote&quot;&gt;&amp;#8617;&lt;/a&gt;&lt;/p&gt;
    &lt;/li&gt;
  &lt;/ol&gt;
&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>More Dots: Syntactic Loop Fusion in Julia</title>
   <link href="http://julialang.org/blog/2017/01/moredots"/>
   <updated>2017-01-21T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2017/01/moredots</id>
   <content type="html">&lt;p&gt;After a &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/8450&quot;&gt;lengthy design process&lt;/a&gt; and &lt;a href=&quot;http://julialang.org/blog/2016/10/julia-0.5-highlights#vectorized-function-calls&quot;&gt;preliminary foundations in Julia 0.5&lt;/a&gt;, Julia 0.6 includes new facilities for writing code in the “vectorized”
style (familiar from Matlab, Numpy, R, etcetera) while avoiding the
overhead that this style of programming usually imposes: multiple
vectorized operations can now be “fused” into a single loop, without
allocating any extraneous temporary arrays.&lt;/p&gt;

&lt;p&gt;This is best illustrated with an example (in which we get
&lt;em&gt;order-of-magnitude&lt;/em&gt; savings in memory and time, as demonstrated below).  Suppose we have
a function &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x) = 3x^2 + 5x + 2&lt;/code&gt; that evaluates a polynomial,
and we want to evaluate &lt;code class=&quot;highlighter-rouge&quot;&gt;f(2x^2 + 6x^3 - sqrt(x))&lt;/code&gt; for a whole array &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;,
storing the result in-place in &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;.  You can now do:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;or, &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/20321&quot;&gt;equivalently&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;and the whole computation will be &lt;em&gt;fused&lt;/em&gt; into a single loop, operating in-place,
with performance comparable to the hand-written
“devectorized” loop:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;(Of course, like all Julia code, to get good performance both of these snippets should be executed inside some function, not in global scope.)   To see the details of a variety of performance experiments with this example code, follow along in the attached IJulia/Jupyter &lt;a href=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/moredots/More-Dots.ipynb&quot;&gt;notebook&lt;/a&gt;: we find that the
&lt;code class=&quot;highlighter-rouge&quot;&gt;X .= ...&lt;/code&gt; code has performance within 10% of the hand-devectorized loop (which itself is within 5% of the
speed of C code),
except for very small arrays where there is a modest overhead (e.g. 50% overhead for a length-1 array &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;).&lt;/p&gt;
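
&lt;p&gt;As a minimal sketch of that advice, the two snippets can be wrapped in functions as follows.  The names &lt;code class=&quot;highlighter-rouge&quot;&gt;fused!&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;devectorized!&lt;/code&gt; are hypothetical (they do not come from the notebook), and we assume the scalar polynomial &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; used throughout this post:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;f(x) = 3x^2 + 5x + 2

# Fused broadcast that overwrites X in-place (hypothetical helper name).
function fused!(X)
    @. X = f(2X^2 + 6X^3 - sqrt(X))
    return X
end

# Hand-written loop performing the same update (hypothetical helper name).
function devectorized!(X)
    for i in eachindex(X)
        x = X[i]
        X[i] = f(2x^2 + 6x^3 - sqrt(x))
    end
    return X
end

A = [0.5, 1.0, 2.0]
B = copy(A)
fused!(A)
devectorized!(B)
@assert A ≈ B    # both versions compute the same values
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;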

&lt;p&gt;In this blog post, we delve into some of the details of this new development, in order to answer questions that often arise when this feature is presented:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;What is the overhead of traditional “vectorized” code?  Isn’t vectorized code supposed to be fast already?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Why are all these dots necessary?  Couldn’t Julia just optimize “ordinary” vector code?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Is this something unique to Julia, or can other languages do the same thing?&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The short answers are:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;http://www.johnmyleswhite.com/notebook/2013/12/22/the-relationship-between-vectorized-and-devectorized-code/&quot;&gt;Ordinary vectorized code is fast, but not as fast as a hand-written loop&lt;/a&gt;
  (assuming loops are efficiently compiled, as in Julia)
  because each vectorized operation generates a new temporary array and
  executes a separate loop, leading to a lot of overhead when multiple
  vectorized operations are combined.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The dots allow Julia to recognize the “vectorized” nature of the
  operations at a &lt;em&gt;syntactic&lt;/em&gt; level (before e.g. the type of &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; is known),
  and hence the loop fusion is a &lt;em&gt;syntactic guarantee&lt;/em&gt;, not a
  compiler optimization that may or may not occur for carefully written code.  They also allow the &lt;em&gt;caller&lt;/em&gt; to “vectorize” &lt;em&gt;any&lt;/em&gt; function, rather than relying on the function author.  (The &lt;code class=&quot;highlighter-rouge&quot;&gt;@.&lt;/code&gt; macro lets you add dots to every operation in an expression, improving readability for expressions with lots of dots.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Other languages have implemented loop fusion for vectorized operations,
  but typically for only a small set of types and operations/functions that
  are known to the compiler or vectorization library.  Julia’s ability to do it generically, even
  for &lt;em&gt;user-defined&lt;/em&gt; array types and functions/operators, is unusual
  and relies in part on the syntax choices above and on its ability to efficiently
  compile higher-order functions.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Finally, we’ll review why, since these dots actually correspond to
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; operations, they can &lt;strong&gt;combine arrays and scalars, or combine containers
of different shapes and kinds&lt;/strong&gt;, and we’ll compare &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt;.  Moreover, Julia 0.6 expanded and
clarified the notion of a “scalar” for &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;, so that it is &lt;strong&gt;not limited to numerical operations&lt;/strong&gt;: you can use &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; and fusing “dot calls” for many other
tasks (e.g. string processing).&lt;/p&gt;

&lt;h2 id=&quot;isnt-vectorized-code-already-fast&quot;&gt;Isn’t vectorized code already fast?&lt;/h2&gt;

&lt;p&gt;To explore this question (also discussed
&lt;a href=&quot;http://www.johnmyleswhite.com/notebook/2013/12/22/the-relationship-between-vectorized-and-devectorized-code/&quot;&gt;in this blog post&lt;/a&gt;), let’s begin by rewriting the code above in a more traditional vectorized style, without
so many dots, such as you might use in Julia 0.4 or in other languages
(most famously Matlab, Python/Numpy, or R).&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Of course, this assumes that the functions &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; are “vectorized,”
i.e. that they accept vector arguments &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; and compute the
function elementwise.  This is true of &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt&lt;/code&gt; in Julia 0.4, but it
means that we have to rewrite our function &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; from above in a vectorized style, as
e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x) = 3x.^2 + 5x + 2&lt;/code&gt; (changing &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; to use the elementwise operator &lt;code class=&quot;highlighter-rouge&quot;&gt;.^&lt;/code&gt; because
&lt;code class=&quot;highlighter-rouge&quot;&gt;vector^scalar&lt;/code&gt; is not defined).   (If we were using Julia 0.4 and cared a lot about efficiency,
we might have instead used the &lt;code class=&quot;highlighter-rouge&quot;&gt;@vectorize_1arg f Number&lt;/code&gt; macro to generate
more specialized elementwise code.)&lt;/p&gt;

&lt;h3 id=&quot;which-functions-are-vectorized&quot;&gt;Which functions are vectorized?&lt;/h3&gt;

&lt;p&gt;As an aside, this example illustrates an annoyance with the vectorized style:
you have to &lt;em&gt;decide in advance&lt;/em&gt; whether a given function &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt;
will also be applied elementwise to arrays, and either
write it specially or define a corresponding elementwise method.&lt;/p&gt;

&lt;p&gt;(Our function &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; accepts any &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; type, and in Matlab or R there is no distinction between
a scalar and a 1-element array.  However, even if a function &lt;em&gt;accepts&lt;/em&gt; an array argument &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;,
that doesn’t mean it will &lt;em&gt;work&lt;/em&gt; elementwise
for an array unless you write the function with that in mind.)&lt;/p&gt;

&lt;p&gt;For library functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt&lt;/code&gt;, this means that the library authors
have to guess at which functions should have vectorized methods, and users
have to guess at what vaguely defined subset of library functions work
for vectors.&lt;/p&gt;

&lt;p&gt;One possible solution is to vectorize &lt;em&gt;every function automatically&lt;/em&gt;.   The
language &lt;a href=&quot;https://en.wikipedia.org/wiki/Chapel_%28programming_language%29&quot;&gt;Chapel&lt;/a&gt;
does this: every function &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x...)&lt;/code&gt; implicitly
defines a function &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::Array...)&lt;/code&gt; that evaluates &lt;code class=&quot;highlighter-rouge&quot;&gt;map(f, x...)&lt;/code&gt;
&lt;a href=&quot;http://pgas11.rice.edu/papers/ChamberlainEtAl-Chapel-Iterators-PGAS11.pdf&quot;&gt;(Chamberlain et al, 2011)&lt;/a&gt;.
This could be implemented in Julia as well via
function-call overloading &lt;a href=&quot;https://github.com/JeffBezanson/phdthesis/blob/master/main.pdf&quot;&gt;(Bezanson, 2015: chapter 4)&lt;/a&gt;,
but we chose to go in a different direction.&lt;/p&gt;

&lt;p&gt;Instead, starting in Julia 0.5, &lt;em&gt;any&lt;/em&gt; function &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt; can be applied elementwise
to an array &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; with the &lt;a href=&quot;http://docs.julialang.org/en/stable/manual/functions/#dot-syntax-for-vectorizing-functions&quot;&gt;“dot call” syntax &lt;code class=&quot;highlighter-rouge&quot;&gt;f.(X)&lt;/code&gt;&lt;/a&gt;.
Thus, the &lt;em&gt;caller&lt;/em&gt; decides which functions to vectorize.  In Julia 0.6,
“traditionally” vectorized library functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt(X)&lt;/code&gt; are &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/17302&quot;&gt;deprecated&lt;/a&gt; in
favor of &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt.(X)&lt;/code&gt;, and dot operators
like &lt;code class=&quot;highlighter-rouge&quot;&gt;x .+ y&lt;/code&gt; are &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/17623&quot;&gt;now equivalent&lt;/a&gt; to
dot calls &lt;code class=&quot;highlighter-rouge&quot;&gt;(+).(x,y)&lt;/code&gt;.   Unlike Chapel’s implicit vectorization, Julia’s
&lt;code class=&quot;highlighter-rouge&quot;&gt;f.(x...)&lt;/code&gt; syntax corresponds to &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(f, x...)&lt;/code&gt; rather than &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt;,
allowing you to &lt;em&gt;combine arrays and scalars or arrays of different shapes/dimensions.&lt;/em&gt;  (&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; are compared at the end of this post; each
has its own unique capabilities.)
From the standpoint of the programmer, this adds a certain amount of
clarity because it indicates explicitly when an elementwise operation
is occurring.  From the standpoint of the compiler, dot-call syntax
enables the &lt;em&gt;syntactic loop fusion&lt;/em&gt; optimization described in more detail
below, which we think is an overwhelming advantage of this style.&lt;/p&gt;
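
&lt;p&gt;A small sketch of what this looks like in practice.  The two-argument function &lt;code class=&quot;highlighter-rouge&quot;&gt;g&lt;/code&gt; and the sample arrays are made up for illustration; the point is only that the caller adds the dots, and that scalars and differently shaped arrays can be mixed:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;g(x, y) = x + 2y     # an ordinary scalar function (hypothetical example)

X = [1.0, 2.0, 3.0]  # a 3-element column
Y = [10.0 20.0]      # a 1×2 row

@assert sqrt.(X) == map(sqrt, X)       # equal shapes: the dot call agrees with map
@assert g.(X, 1) == [3.0, 4.0, 5.0]    # an array combined with a scalar
@assert size(g.(X, Y)) == (3, 2)       # a column combined with a row
@assert X .+ 1 == broadcast(+, X, 1)   # a dot operator is a broadcast call
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;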

&lt;h3 id=&quot;why-vectorized-code-is-fast&quot;&gt;Why vectorized code is fast&lt;/h3&gt;

&lt;p&gt;In many dynamically typed languages popular for interactive technical computing
(Matlab, Python, R, etc.), vectorization is seen as a key (often &lt;em&gt;the&lt;/em&gt; key)
performance optimization.   It allows your code to take advantage of highly
optimized (perhaps even parallelized) library routines for basic operations like
&lt;code class=&quot;highlighter-rouge&quot;&gt;scalar*array&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt(array)&lt;/code&gt;. Those functions, in turn, are usually
implemented in a low-level language like C or Fortran.   Writing your own
“devectorized” loops, in contrast, is too slow, unless you are willing to drop
down to a low-level language yourself, because the semantics of those dynamic
languages make it hard to compile them to efficient code in general.&lt;/p&gt;

&lt;p&gt;Thanks to Julia’s design, a properly written devectorized loop in Julia
has performance within a few percent of C or Fortran, so there is no &lt;em&gt;necessity&lt;/em&gt;
of vectorizing; this is explicitly demonstrated for the devectorized
loop above in the accompanying &lt;a href=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/moredots/More-Dots.ipynb&quot;&gt;notebook&lt;/a&gt;. However, vectorization may still be &lt;em&gt;convenient&lt;/em&gt; for some problems.
And vectorized operations like &lt;code class=&quot;highlighter-rouge&quot;&gt;scalar*array&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt(array)&lt;/code&gt; are still fast in Julia
(calling optimized library routines, albeit ones written in Julia itself).&lt;/p&gt;

&lt;p&gt;Furthermore, if your problem involves a function that does not have a pre-written,
highly optimized, vectorized library routine in Julia, and that does not
decompose easily into existing vectorized building blocks like &lt;code class=&quot;highlighter-rouge&quot;&gt;scalar*array&lt;/code&gt;, then
you can write your own building block without dropping down to a low-level language.
(If all the performance-critical code you will ever need already existed in the
form of optimized library routines, programming would be a lot easier!)&lt;/p&gt;

&lt;h3 id=&quot;why-vectorized-code-is-not-as-fast-as-it-could-be&quot;&gt;Why vectorized code is not as fast as it could be&lt;/h3&gt;

&lt;p&gt;There is a tension between two general principles in computing: on
the one hand, &lt;em&gt;re-using&lt;/em&gt; highly optimized code is good for
performance; on the other hand, optimized code that is &lt;em&gt;specialized&lt;/em&gt;
for your problem can usually beat general-purpose functions.
This is illustrated nicely by the traditional vectorized version of our code above:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Each of the operations like &lt;code class=&quot;highlighter-rouge&quot;&gt;X.^2&lt;/code&gt;  and &lt;code class=&quot;highlighter-rouge&quot;&gt;5*X&lt;/code&gt; &lt;em&gt;individually&lt;/em&gt;
calls highly optimized functions, but their &lt;em&gt;combination&lt;/em&gt;
leaves a lot of performance on the table when &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; is an array.   To see that,
you have to realize that this code is equivalent to:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;tmp1 = X.^2
tmp2 = 2*tmp1
tmp3 = X.^3
tmp4 = 6 * tmp3
tmp5 = tmp2 + tmp4
tmp6 = sqrt(X)
tmp7 = tmp5 - tmp6
X = f(tmp7)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That is, each of these vectorized operations allocates a separate
temporary array, and is a separate library call with its own inner
loop.  Both of these properties are bad for performance.&lt;/p&gt;

&lt;p&gt;First, eight arrays are allocated (&lt;code class=&quot;highlighter-rouge&quot;&gt;tmp1&lt;/code&gt; through &lt;code class=&quot;highlighter-rouge&quot;&gt;tmp7&lt;/code&gt;, plus another
for the result of &lt;code class=&quot;highlighter-rouge&quot;&gt;f(tmp7)&lt;/code&gt;), and another four are allocated
internally by &lt;code class=&quot;highlighter-rouge&quot;&gt;f(tmp7)&lt;/code&gt; for the same reasons, for &lt;em&gt;12 arrays in all&lt;/em&gt;.
The resulting &lt;code class=&quot;highlighter-rouge&quot;&gt;X = ...&lt;/code&gt; expression does &lt;em&gt;not&lt;/em&gt; update &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; in-place, but
rather makes the variable &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; “point” to a new array returned by &lt;code class=&quot;highlighter-rouge&quot;&gt;f(tmp7)&lt;/code&gt;,
discarding the old array &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;.   All of these extra arrays are eventually
deallocated by Julia’s garbage collector, but in the meantime they waste
a lot of memory (an order of magnitude!).&lt;/p&gt;
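
&lt;p&gt;The difference between rebinding and in-place assignment can be checked directly with object identity.  This is only an illustrative sketch: the variable &lt;code class=&quot;highlighter-rouge&quot;&gt;old&lt;/code&gt; and the sample data are ours, and we use the dotted right-hand side in both cases so that the only difference is &lt;code class=&quot;highlighter-rouge&quot;&gt;=&lt;/code&gt; versus &lt;code class=&quot;highlighter-rouge&quot;&gt;.=&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;f(x) = 3x^2 + 5x + 2

X = [1.0, 4.0, 9.0]
old = X                    # a second name for the same array

# `=` binds the name X to a newly allocated result array.
X = f.(2 .* X.^2 .+ 6 .* X.^3 .- sqrt.(X))
@assert X !== old                  # X and old are now different objects
@assert old == [1.0, 4.0, 9.0]     # the original data is untouched

# `.=` writes the results into the existing array.
X = old
X .= f.(2 .* X.^2 .+ 6 .* X.^3 .- sqrt.(X))
@assert X === old                  # still the same array, updated in-place
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;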

&lt;p&gt;By itself, allocating/freeing memory can take a significant amount of time
compared to our other computations. This is especially true if &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; is very small
so that the allocation overhead matters (in our benchmark &lt;a href=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/moredots/More-Dots.ipynb&quot;&gt;notebook&lt;/a&gt;, we pay
a 10× cost for a 6-element array and a 6× cost for a 36-element array), or  if
&lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; is very large so that the memory churn matters (see below for numbers).
Furthermore, you pay a &lt;em&gt;different&lt;/em&gt; performance price from the fact that you have
12 loops (12 passes over memory) compared to one, in part because of the loss of
&lt;a href=&quot;https://en.wikipedia.org/wiki/Locality_of_reference&quot;&gt;memory locality&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;In particular, reading or writing data in main computer memory (RAM) is much slower than performing scalar arithmetic operations like &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;*&lt;/code&gt;, so computer hardware stores recently used data in a &lt;a href=&quot;https://en.wikipedia.org/wiki/Cache_%28computing%29&quot;&gt;cache&lt;/a&gt;: a small amount
of much faster memory.  Furthermore, there is a hierarchy of smaller,
faster caches, culminating in the &lt;a href=&quot;https://en.wikipedia.org/wiki/Processor_register&quot;&gt;register memory&lt;/a&gt;
of the CPU itself.   This means that, for good performance, you should
load each datum &lt;code class=&quot;highlighter-rouge&quot;&gt;x = X[i]&lt;/code&gt; &lt;em&gt;once&lt;/em&gt; (so that it goes into cache, or into a register for small enough types), and
then perform several operations like &lt;code class=&quot;highlighter-rouge&quot;&gt;f(2x^2 + 6x^3 - sqrt(x))&lt;/code&gt; on &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;
while you still have fast access to it, before loading the next datum;
this is called “temporal locality.”   The traditional vectorized code
discards this potential locality: each &lt;code class=&quot;highlighter-rouge&quot;&gt;X[i]&lt;/code&gt; is loaded once for a
single small operation like &lt;code class=&quot;highlighter-rouge&quot;&gt;2*X[i]&lt;/code&gt;, writing the result out to a temporary
array before immediately reading the next &lt;code class=&quot;highlighter-rouge&quot;&gt;X[i]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;In typical performance benchmarks (see &lt;a href=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/moredots/More-Dots.ipynb&quot;&gt;notebook&lt;/a&gt;), therefore, the traditional
vectorized code &lt;code class=&quot;highlighter-rouge&quot;&gt;X = f(2 * X.^2 + 6 * X.^3 - sqrt(X))&lt;/code&gt; turns out to be &lt;strong&gt;about
10× slower&lt;/strong&gt; than the devectorized or fused-vectorized versions of the same code
at the beginning of this article for &lt;code class=&quot;highlighter-rouge&quot;&gt;X = zeros(10^6)&lt;/code&gt;.   Even if we
pre-allocate all of the temporary arrays (completely eliminating the allocation
cost),  our benchmarks show that performing a separate loop for each operation
still is about 4–5× slower for a million-element &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;. This is not unique to
Julia!  &lt;strong&gt;Vectorized code is suboptimal in any language&lt;/strong&gt; unless the
language’s compiler can automatically fuse all of these loops (even ones that
appear inside function calls), which rarely happens for the reasons described
below.&lt;/p&gt;
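
&lt;p&gt;To make the “separate loop for each operation” case concrete, here is a rough sketch for the inner expression only (without the call to &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;).  It is not the code from the notebook: the function names are hypothetical, and for brevity it reuses two preallocated buffers.  Each &lt;code class=&quot;highlighter-rouge&quot;&gt;.=&lt;/code&gt; statement in the first function is its own pass over memory, whereas the second function makes a single pass:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# No allocation, but seven separate loops over the data.
function separate_loops!(out, tmp, X)
    out .= X .^ 2        # pass 1
    out .= 2 .* out      # pass 2
    tmp .= X .^ 3        # pass 3
    tmp .= 6 .* tmp      # pass 4
    out .= out .+ tmp    # pass 5
    tmp .= sqrt.(X)      # pass 6
    out .= out .- tmp    # pass 7
    return out
end

# No allocation, and one fused loop.
function single_loop!(out, X)
    out .= 2 .* X.^2 .+ 6 .* X.^3 .- sqrt.(X)
    return out
end

X = [1.0, 4.0, 9.0]
a = separate_loops!(similar(X), similar(X), X)
b = single_loop!(similar(X), X)
@assert a ≈ b    # same values; only the number of passes differs
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;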

&lt;h2 id=&quot;why-does-julia-need-dots-to-fuse-the-loops&quot;&gt;Why does Julia need dots to fuse the loops?&lt;/h2&gt;

&lt;p&gt;You might look at an expression like &lt;code class=&quot;highlighter-rouge&quot;&gt;2 * X.^2 + 6 * X.^3 - sqrt(X)&lt;/code&gt; and
think that it is “obvious” that it could be combined into a single loop
over &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;.  Why can’t Julia’s compiler be smart enough to recognize this?&lt;/p&gt;

&lt;p&gt;The key point is that, in Julia, there is nothing
particularly special about &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt&lt;/code&gt; — they are arbitrary functions
and could do &lt;em&gt;anything&lt;/em&gt;.   &lt;code class=&quot;highlighter-rouge&quot;&gt;X + Y&lt;/code&gt; could send an email or open
a plotting window, for all the compiler knows.   To figure out that it
could fuse e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;2*X + Y&lt;/code&gt; into a single loop, allocating a single
array for the result, the compiler would need to:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;Deduce the types of &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Y&lt;/code&gt; and figure out what &lt;code class=&quot;highlighter-rouge&quot;&gt;*&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt; functions
  to call.  (Julia already does this, at least when &lt;a href=&quot;https://en.wikipedia.org/wiki/Type_inference&quot;&gt;type inference&lt;/a&gt; succeeds.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Look inside of those functions, realize that they are elementwise loops over &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;
  and &lt;code class=&quot;highlighter-rouge&quot;&gt;Y&lt;/code&gt;, and realize that they are &lt;a href=&quot;https://en.wikipedia.org/wiki/Pure_function&quot;&gt;pure&lt;/a&gt;
  (e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;2*X&lt;/code&gt; has no side-effects like modifying &lt;code class=&quot;highlighter-rouge&quot;&gt;Y&lt;/code&gt;).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Analyze expressions like &lt;code class=&quot;highlighter-rouge&quot;&gt;X[i]&lt;/code&gt; (which are calls to a function &lt;code class=&quot;highlighter-rouge&quot;&gt;getindex(X, i)&lt;/code&gt;
  that is “just another function” to the compiler), to detect that they
  are memory reads/writes and determine what &lt;em&gt;data dependencies&lt;/em&gt; they imply
  (e.g. to figure out that &lt;code class=&quot;highlighter-rouge&quot;&gt;2*X&lt;/code&gt; allocates a temporary array that can be eliminated).&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The second and third steps pose an &lt;em&gt;enormous challenge&lt;/em&gt;: looking at an arbitrary
function and “understanding” it at this level turns out to be a very hard
problem for a computer.  If fusion is viewed as a compiler &lt;em&gt;optimization&lt;/em&gt;, then the
compiler is only free to fuse if it can &lt;em&gt;prove&lt;/em&gt; that fusion &lt;em&gt;won’t change the
results&lt;/em&gt;, which requires the detection of purity and other data-dependency
analyses.&lt;/p&gt;

&lt;p&gt;In contrast, when the Julia compiler sees an expression like &lt;code class=&quot;highlighter-rouge&quot;&gt;2 .* X .+ Y&lt;/code&gt;,
it knows just from the &lt;em&gt;syntax&lt;/em&gt; (the “spelling”) that these are elementwise
operations, and Julia &lt;em&gt;guarantees&lt;/em&gt; that the code will &lt;em&gt;always&lt;/em&gt; fuse into a single
loop, freeing it from the need to prove purity.  This is what we
term &lt;strong&gt;syntactic loop fusion&lt;/strong&gt;, described in more detail below.&lt;/p&gt;
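
&lt;p&gt;To see why purity matters, consider a sketch with two deliberately impure functions (the names &lt;code class=&quot;highlighter-rouge&quot;&gt;g&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;h&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;calls&lt;/code&gt; are hypothetical).  The fused and unfused versions return the same values, but the side effects happen in a different order, which is exactly the kind of observable change an optimizer would have to rule out.  The particular interleaving shown for the fused case is what we expect for a plain &lt;code class=&quot;highlighter-rouge&quot;&gt;Vector&lt;/code&gt;; treat it as an implementation detail rather than a documented guarantee:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;calls = Symbol[]                     # records the order in which calls happen
g(x) = (push!(calls, :g); x + 1)     # impure: has a side effect
h(x) = (push!(calls, :h); 2x)        # impure: has a side effect

X = [1, 2, 3]

# Unfused: one loop applying g, then a second loop applying h.
tmp = map(g, X)
unfused = map(h, tmp)
order_unfused = copy(calls)

# Fused: nested dot calls become a single loop.
empty!(calls)
fused = h.(g.(X))
order_fused = copy(calls)

@assert fused == unfused == [4, 6, 8]
@assert order_unfused == [:g, :g, :g, :h, :h, :h]
@assert order_fused == [:g, :h, :g, :h, :g, :h]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;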

&lt;h3 id=&quot;a-halfway-solution-loop-fusion-for-a-few-operationstypes&quot;&gt;A halfway solution: Loop fusion for a few operations/types&lt;/h3&gt;

&lt;p&gt;One approach that may occur to you, and which has been implemented in a
variety of languages (e.g. &lt;a href=&quot;http://dl.acm.org/citation.cfm?id=665526&quot;&gt;Kennedy &amp;amp; McKinley, 1993&lt;/a&gt;;
&lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.46.6627&quot;&gt;Lewis et al., 1998&lt;/a&gt;;
&lt;a href=&quot;http://dl.acm.org/citation.cfm?id=507661&quot;&gt;Chakravarty &amp;amp; Keller, 2001&lt;/a&gt;;
&lt;a href=&quot;http://ieeexplore.ieee.org.libproxy.mit.edu/document/577265/&quot;&gt;Manjikian &amp;amp; Abdelrahman, 2002&lt;/a&gt;;
&lt;a href=&quot;http://ieeexplore.ieee.org/document/5389392/&quot;&gt;Sarkar, 2010&lt;/a&gt;;
&lt;a href=&quot;http://dl.acm.org/citation.cfm?id=1993517&quot;&gt;Prasad et al., 2011&lt;/a&gt;;
&lt;a href=&quot;http://dl.acm.org/citation.cfm?id=2457490&quot;&gt;Wu et al., 2012&lt;/a&gt;), is to only
perform loop fusion for &lt;em&gt;a few “built-in” types and operations&lt;/em&gt; that the
compiler can be designed to recognize.   The same idea has also been
implemented as libraries (e.g. template libraries in C++:
&lt;a href=&quot;http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.43.248&quot;&gt;Veldhuizen, 1995&lt;/a&gt;) or
&lt;a href=&quot;https://en.wikipedia.org/wiki/Domain-specific_language&quot;&gt;domain-specific
languages (DSLs)&lt;/a&gt;
as extensions of existing languages; in Python, for example, loop fusion for a small
set of vector operations and array/scalar types can be found in the
&lt;a href=&quot;http://deeplearning.net/software/theano/introduction.html&quot;&gt;Theano&lt;/a&gt;,
&lt;a href=&quot;https://op2.github.io/PyOP2/&quot;&gt;PyOP2&lt;/a&gt;, and &lt;a href=&quot;https://github.com/numba/numba/pull/1110&quot;&gt;Numba&lt;/a&gt;
software. Likewise, in Julia we could
potentially build the compiler to recognize that it can fuse
&lt;code class=&quot;highlighter-rouge&quot;&gt;*&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;.^&lt;/code&gt;, and similar operations for the built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; type
(and perhaps only for a few scalar types).
This has, in fact, already been implemented in Julia as a macro-based DSL (you add &lt;code class=&quot;highlighter-rouge&quot;&gt;@vec&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;
decorators to a vectorized expression) in the &lt;a href=&quot;https://github.com/lindahua/Devectorize.jl&quot;&gt;Devectorize&lt;/a&gt;
and &lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl&quot;&gt;ParallelAccelerator&lt;/a&gt;
packages.&lt;/p&gt;

&lt;p&gt;However, even though Julia will certainly implement additional compiler
optimizations as time passes, one of the key principles of Julia’s design
is to “build in” as little as possible into the core language, implementing
as much as possible of Julia &lt;em&gt;in Julia&lt;/em&gt; itself &lt;a href=&quot;https://github.com/JeffBezanson/phdthesis/blob/master/main.pdf&quot;&gt;(Bezanson, 2015)&lt;/a&gt;.
Put another way, the same &lt;em&gt;optimizations should be just as available to user-defined
types and functions&lt;/em&gt; as to the “built-in” functions of Julia’s standard library
(&lt;code class=&quot;highlighter-rouge&quot;&gt;Base&lt;/code&gt;).  You should be able to define your own array types
(e.g. via the &lt;a href=&quot;https://github.com/JuliaArrays/StaticArrays.jl&quot;&gt;StaticArrays&lt;/a&gt;
package or &lt;a href=&quot;https://github.com/JuliaParallel/PETSc.jl&quot;&gt;PETSc arrays&lt;/a&gt;)
and functions (such as our &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; above), and have them be capable of fusing vectorized operations.&lt;/p&gt;

&lt;p&gt;Moreover, a difficulty with fancy compiler optimizations is that, as a
programmer, you are often unsure whether they will occur.  You have to learn to
avoid coding styles that accidentally prevent the compiler from recognizing
the fusion opportunity (e.g. because you called a “non-built-in” function), you
need to learn to use additional compiler-diagnostic tools to identify which
optimizations are taking place, and you need to continually check these
diagnostics as new versions of the compiler and language are released.  With
vectorized code, losing a fusion optimization may mean wasting an order of
magnitude in memory and time, so you have to worry much more than you would for
a typical compiler micro-optimization.&lt;/p&gt;

&lt;h3 id=&quot;syntactic-loop-fusion-in-julia&quot;&gt;Syntactic loop fusion in Julia&lt;/h3&gt;

&lt;p&gt;In contrast, Julia’s approach is quite simple and general: the caller
indicates, by adding dots, which function calls and operators are
intended to be applied elementwise (specifically, as &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; calls).
The compiler notices these dots at &lt;em&gt;parse time&lt;/em&gt; (or technically
at “lowering” time, but in any case long before it knows
the types of the variables etc.), and transforms them into calls to
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;.  Moreover, it guarantees that &lt;em&gt;nested&lt;/em&gt; “dot calls” will
&lt;em&gt;always&lt;/em&gt; be fused into a single broadcast call, i.e. a single loop.&lt;/p&gt;

&lt;p&gt;Put another way, &lt;code class=&quot;highlighter-rouge&quot;&gt;f.(g.(x .+ 1))&lt;/code&gt; is treated by Julia as merely
&lt;a href=&quot;https://en.wikipedia.org/wiki/Syntactic_sugar&quot;&gt;syntactic sugar&lt;/a&gt; for
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(x -&amp;gt; f(g(x + 1)), x)&lt;/code&gt;.   An assignment &lt;code class=&quot;highlighter-rouge&quot;&gt;y .= f.(g.(x .+ 1))&lt;/code&gt;
is treated as sugar for the in-place operation
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast!(x -&amp;gt; f(g(x + 1)), y, x)&lt;/code&gt;.   The compiler need not prove
that this produces the same result as a corresponding non-fused operation,
because the fusion is a mandatory transformation defined as part
of the language, rather than an optional optimization.&lt;/p&gt;
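
&lt;p&gt;As a minimal sketch of this equivalence (the scalar functions &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;g&lt;/code&gt; below are
hypothetical stand-ins chosen only for illustration, and the code assumes the Julia 0.6
dot-call syntax described in this post), one can check that the dotted expression and the
explicit &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; call give the same result:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical scalar functions, chosen only for illustration.
f(x) = 2x + 1
g(x) = x^2

x = [1.0, 2.0, 3.0]

fused   = f.(g.(x .+ 1))                  # dot-call syntax
desugar = broadcast(x -&amp;gt; f(g(x + 1)), x)  # the call it is sugar for
@assert fused == desugar

# In-place form: the results are written into the preallocated array y.
y = similar(x)
y .= f.(g.(x .+ 1))
@assert y == fused
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;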

&lt;p&gt;Arbitrary user-defined functions &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt; work with this mechanism,
as do arbitrary user-defined collection types for &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, as long as you
define &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; methods for your collection.  (The default
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; already works for any subtype of &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;.)&lt;/p&gt;

&lt;p&gt;Moreover, dotted operators are now available for not just
the familiar ASCII operators like &lt;code class=&quot;highlighter-rouge&quot;&gt;.+&lt;/code&gt;, but for &lt;em&gt;any&lt;/em&gt;
character that Julia parses as a binary operator.  This includes
a wide array of Unicode symbols like &lt;code class=&quot;highlighter-rouge&quot;&gt;⊗&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;∪&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;⨳&lt;/code&gt;, most
of which are undefined by default.   So, for example, if you
define &lt;code class=&quot;highlighter-rouge&quot;&gt;⊗(x,y) = kron(x,y)&lt;/code&gt; for the &lt;a href=&quot;https://en.wikipedia.org/wiki/Kronecker_product&quot;&gt;Kronecker product&lt;/a&gt;,
you can immediately do &lt;code class=&quot;highlighter-rouge&quot;&gt;[A, B] .⊗ [C, D]&lt;/code&gt; to compute the
“elementwise” operation &lt;code class=&quot;highlighter-rouge&quot;&gt;[A ⊗ C, B ⊗ D]&lt;/code&gt;, because &lt;code class=&quot;highlighter-rouge&quot;&gt;x .⊗ y&lt;/code&gt;
is sugar for &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(⊗, x, y)&lt;/code&gt;.&lt;/p&gt;
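
&lt;p&gt;The same mechanism can be sketched with a simpler operator than the Kronecker product.
Here we assume a hypothetical definition of &lt;code class=&quot;highlighter-rouge&quot;&gt;⊕&lt;/code&gt; as &lt;code class=&quot;highlighter-rouge&quot;&gt;max&lt;/code&gt; (this choice is ours,
purely for illustration); the dotted form &lt;code class=&quot;highlighter-rouge&quot;&gt;.⊕&lt;/code&gt; then works without any further definitions:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical operator for illustration: ⊕ is undefined by default.
⊕(a, b) = max(a, b)

x = [1, 5, 3]
y = [4, 2, 6]

# x .⊕ y is sugar for broadcast(⊕, x, y).
@assert x .⊕ y == broadcast(⊕, x, y)
@assert x .⊕ y == [4, 5, 6]

# The dotted operator nests with other dot calls like any other function.
@assert abs.((x .⊕ y) .- 5) == [1, 0, 1]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;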

&lt;p&gt;Note that “side-by-side” binary operations are actually equivalent
to nested calls, and hence they fuse for dotted operations.   For
example &lt;code class=&quot;highlighter-rouge&quot;&gt;3 .* x .+ y&lt;/code&gt; is equivalent to &lt;code class=&quot;highlighter-rouge&quot;&gt;(+).((*).(3, x), y)&lt;/code&gt;, and
hence it fuses into &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast((x,y) -&amp;gt; 3*x+y, x, y)&lt;/code&gt;.   Note
also that the fusion stops only when a “non-dot” call is encountered,
e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt.(abs.(sort!(x.^2)))&lt;/code&gt; fuses the &lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;abs&lt;/code&gt; operations
into a single loop, but &lt;code class=&quot;highlighter-rouge&quot;&gt;x.^2&lt;/code&gt; occurs in a separate loop (producing
a temporary array) because of the intervening non-dot function call
&lt;code class=&quot;highlighter-rouge&quot;&gt;sort!(...)&lt;/code&gt;.&lt;/p&gt;
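
&lt;p&gt;Both points can be checked with a short sketch. The arrays and the name &lt;code class=&quot;highlighter-rouge&quot;&gt;tmp&lt;/code&gt; are
illustrative assumptions; the manual version spells out the two separate loops that the
non-dot &lt;code class=&quot;highlighter-rouge&quot;&gt;sort!&lt;/code&gt; call forces:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;x = [1.0, -2.0, 3.0]
y = [10.0, 20.0, 30.0]

# Side-by-side dotted operators fuse into a single broadcast call.
@assert 3 .* x .+ y == broadcast((x, y) -&amp;gt; 3*x + y, x, y)

# A non-dot call ends the fusion: x.^2 is materialized first, sorted
# in place, and only then does the fused sqrt/abs loop run.
tmp = x.^2    # separate loop, allocates a temporary array
sort!(tmp)    # non-dot call, operates on the whole array
manual = broadcast(t -&amp;gt; sqrt(abs(t)), tmp)
@assert sqrt.(abs.(sort!(x.^2))) == manual
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;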

&lt;h3 id=&quot;other-partway-solutions&quot;&gt;Other partway solutions&lt;/h3&gt;

&lt;p&gt;For the sake of completeness, we should mention
some other possibilities that would partly
address the problems of vectorization.  For example, functions could
be specially &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/414&quot;&gt;annotated to declare that they are pure&lt;/a&gt;,
container types with array-like semantics could be specially annotated,
and so on, to help the compiler recognize the
possibility of fusion.   But this imposes a lot of requirements
on library authors, and once again it requires them to identify
in advance which functions are likely to be applied to vectors
(and hence be worth the additional analysis and annotation effort).&lt;/p&gt;

&lt;p&gt;Another approach that has been suggested is to define updating operators
like &lt;code class=&quot;highlighter-rouge&quot;&gt;x += y&lt;/code&gt; to be equivalent to calls to a special function,
like &lt;code class=&quot;highlighter-rouge&quot;&gt;x = plusequals!(x, y)&lt;/code&gt;, that can be defined as an in-place operation, rather
than &lt;code class=&quot;highlighter-rouge&quot;&gt;x += y&lt;/code&gt; being a synonym for &lt;code class=&quot;highlighter-rouge&quot;&gt;x = x + y&lt;/code&gt; as in Julia today.
(&lt;a href=&quot;https://docs.python.org/3.3/reference/datamodel.html#object.__iadd__&quot;&gt;NumPy does this&lt;/a&gt;.)
By itself, this can be used to &lt;a href=&quot;http://blog.svenbrauch.de/2016/04/13/processing-scientific-data-in-python-and-numpy-but-doing-it-fast/&quot;&gt;avoid temporary arrays in some simple cases&lt;/a&gt; by breaking them into a sequence of in-place updates, but
it doesn’t handle more complex expressions, is limited to a few
operations like &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt;, and doesn’t address the cache inefficiency of
multiple loops.   (In Julia 0.6, you can do &lt;code class=&quot;highlighter-rouge&quot;&gt;x .+= y&lt;/code&gt; and it is
equivalent to &lt;code class=&quot;highlighter-rouge&quot;&gt;x .= x .+ y&lt;/code&gt;, which does a single fused loop in-place,
but this syntax now extends to arbitrary combinations of arbitrary functions.)&lt;/p&gt;
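
&lt;p&gt;A minimal sketch of the difference, assuming Julia 0.6 semantics (the variable names
&lt;code class=&quot;highlighter-rouge&quot;&gt;x_before&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt; are ours): the dotted update writes into the existing array,
whereas the undotted &lt;code class=&quot;highlighter-rouge&quot;&gt;+=&lt;/code&gt; allocates a new array and rebinds the variable.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;x = [1.0, 2.0, 3.0]
y = [10.0, 20.0, 30.0]
x_before = x              # same array object, not a copy

x .+= y                   # equivalent to x .= x .+ y
@assert x === x_before    # x still refers to the original array
@assert x == [11.0, 22.0, 33.0]

# The in-place form extends to arbitrary fused expressions.
x .= sqrt.(abs.(x .- y)) .+ 2 .* y
@assert x === x_before

# Without the dot, w += y means w = w + y: a new array is allocated.
w = x_before
w += y
@assert w !== x_before
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;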

&lt;h2 id=&quot;should-other-languages-implement-syntactic-loop-fusion&quot;&gt;Should other languages implement syntactic loop fusion?&lt;/h2&gt;

&lt;p&gt;Obviously, Julia’s approach of syntactic loop fusion relies partly on the
fact that Julia, as a young language, is still relatively free to
redefine core syntactic elements like &lt;code class=&quot;highlighter-rouge&quot;&gt;f.(x)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;x .+ y&lt;/code&gt;.  But
suppose you were willing to add this or similar syntax to an
existing language, like Python or Go, or create a DSL add-on on top of those
languages as discussed above; would you then be able to
implement the same fusing semantics efficiently?&lt;/p&gt;

&lt;p&gt;There is a catch: &lt;code class=&quot;highlighter-rouge&quot;&gt;2 .* x .+ x .^ 2&lt;/code&gt; is sugar for
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(x -&amp;gt; 2*x + x^2, x)&lt;/code&gt; in Julia, but for this to be
fast we need the &lt;a href=&quot;https://en.wikipedia.org/wiki/Higher-order_function&quot;&gt;higher-order function&lt;/a&gt;
&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; to be very fast as well.  First, this
requires that arbitrary user-defined scalar (non-vectorized!) functions like
&lt;code class=&quot;highlighter-rouge&quot;&gt;x -&amp;gt; 2*x + x^2&lt;/code&gt; be compiled to fast code, which is often a challenge
in high-level dynamic languages.   Second, it ideally requires that
higher-order functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; be able to &lt;a href=&quot;https://en.wikipedia.org/wiki/Inline_expansion&quot;&gt;inline&lt;/a&gt;
the function argument &lt;code class=&quot;highlighter-rouge&quot;&gt;x -&amp;gt; 2*x + x^2&lt;/code&gt;, and this facility is even
less common.  (It wasn’t available in Julia until version 0.5.)&lt;/p&gt;

&lt;p&gt;Also, the ability of &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; to combine arrays and scalars or
arrays of different shapes (see below) turns out to be subtle to
implement efficiently without losing generality. The current
implementation relies on a metaprogramming feature that Julia provides
called &lt;a href=&quot;http://docs.julialang.org/en/stable/manual/metaprogramming/#generated-functions&quot;&gt;generated
functions&lt;/a&gt;
in order to get compile-time specialization on the number and types of
the arguments.  An alternative solution to the inlining and
specialization issues would be to build the &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; function into
the compiler, but then &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; might no longer be
overloadable for user-defined containers, and users could not write their
own higher-order functions with similar functionality.&lt;/p&gt;

&lt;h3 id=&quot;the-importance-of-higher-order-inlining&quot;&gt;The importance of higher-order inlining&lt;/h3&gt;

&lt;p&gt;In particular, consider
a naive implementation of &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; (only for one-argument functions):&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; naivebroadcast&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;similar&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In Julia, as in other languages, &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; must be some kind of &lt;a href=&quot;https://en.wikipedia.org/wiki/Function_pointer&quot;&gt;function
pointer&lt;/a&gt; or &lt;a href=&quot;https://en.wikipedia.org/wiki/Function_object&quot;&gt;function
object&lt;/a&gt;. Normally, a call
&lt;code class=&quot;highlighter-rouge&quot;&gt;f(x[i])&lt;/code&gt; to a function object &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; must figure out where the actual &lt;a href=&quot;https://en.wikipedia.org/wiki/Machine_code&quot;&gt;machine
code&lt;/a&gt; for the function is (in Julia,
this involves dispatching on the type of &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i]&lt;/code&gt;; in object-oriented languages,
it might involve dispatching on the type of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;), pass the argument &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i]&lt;/code&gt;
to &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; via a register and/or a &lt;a href=&quot;https://en.wikipedia.org/wiki/Call_stack&quot;&gt;call stack&lt;/a&gt;,
jump to the machine instructions to execute them, jump back to
the caller &lt;code class=&quot;highlighter-rouge&quot;&gt;naivebroadcast&lt;/code&gt;, and extract the return value.
That is, calling a function argument &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; involves some overhead beyond
the cost of the computations inside &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;If &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt; is expensive enough, then the overhead of the function call may be negligible,
but for a cheap function like &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x) = 2*x + x^2&lt;/code&gt; the overhead can be very
significant: with Julia 0.4, the overhead is roughly a factor of two compared
to a hand-written loop that evaluates &lt;code class=&quot;highlighter-rouge&quot;&gt;z = x[i]; y[i] = 2*z + z^2&lt;/code&gt;.  Since lots
of vectorized code in practice evaluates relatively cheap functions like this,
this overhead would be a big problem for a generic vectorization method based on &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;.  (The function call also inhibits &lt;a href=&quot;https://software.intel.com/en-us/articles/vectorization-in-julia&quot;&gt;SIMD optimization&lt;/a&gt;
by the compiler, which prevents computations in &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt; from
being applied simultaneously to several &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i]&lt;/code&gt;
elements.)&lt;/p&gt;

&lt;p&gt;However, &lt;a href=&quot;http://julialang.org/blog/2016/10/julia-0.5-highlights#functions&quot;&gt;in Julia 0.5, every function has its own type&lt;/a&gt;.  And, in Julia,
whenever you call a function like &lt;code class=&quot;highlighter-rouge&quot;&gt;naivebroadcast(f, x)&lt;/code&gt;, a &lt;em&gt;specialized version&lt;/em&gt;
of &lt;code class=&quot;highlighter-rouge&quot;&gt;naivebroadcast&lt;/code&gt; is compiled for &lt;code class=&quot;highlighter-rouge&quot;&gt;typeof(f)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;typeof(x)&lt;/code&gt;.   Since
the compiled code is specific to &lt;code class=&quot;highlighter-rouge&quot;&gt;typeof(f)&lt;/code&gt;, i.e. to the specific function
being passed, the Julia compiler is free to &lt;a href=&quot;https://en.wikipedia.org/wiki/Inline_expansion&quot;&gt;inline&lt;/a&gt; &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt; into the generated code
if it wants to, and all of the function-call overhead can disappear.&lt;/p&gt;
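
&lt;p&gt;The following sketch illustrates the ingredients, assuming Julia 0.5 or later. The names
&lt;code class=&quot;highlighter-rouge&quot;&gt;poly&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;handloop&lt;/code&gt; are hypothetical, and the example only checks that the two
versions agree; it does not measure the overhead, which depends on the Julia version and
the hardware.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function naivebroadcast(f, x)
    y = similar(x)
    for i in eachindex(x)
        y[i] = f(x[i])
    end
    return y
end

poly(z) = 2*z + z^2    # hypothetical cheap scalar function

# Each function has its own type, so a method can be specialized
# on which function was passed, not just on "some function".
@assert typeof(sin) != typeof(cos)
@assert typeof(poly) &amp;lt;: Function

# Hand-written loop with the same arithmetic written out inline.
function handloop(x)
    y = similar(x)
    for i in eachindex(x)
        z = x[i]
        y[i] = 2*z + z^2
    end
    return y
end

x = [1.0, 2.0, 3.0]
@assert naivebroadcast(poly, x) == handloop(x)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;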

&lt;p&gt;Julia is neither the first nor the only language that can inline
higher-order functions; e.g. it is reportedly &lt;a href=&quot;http://stackoverflow.com/questions/25566517/can-haskell-inline-functions-passed-as-an-argument&quot;&gt;possible in Haskell&lt;/a&gt; and in
the &lt;a href=&quot;https://kotlinlang.org/docs/reference/inline-functions.html&quot;&gt;Kotlin&lt;/a&gt; language.
Nevertheless, it seems to be a rare feature, especially in &lt;a href=&quot;https://en.wikipedia.org/wiki/Imperative_programming&quot;&gt;imperative languages&lt;/a&gt;. Fast
higher-order functions are a key ingredient of Julia that allows
a function like &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; to be written in Julia itself (and
hence be extensible to user-defined containers), rather than having
to be built in to the compiler (and probably limited to “built-in” container types).&lt;/p&gt;

&lt;h2 id=&quot;not-just-elementwise-math-the-power-of-broadcast&quot;&gt;Not just elementwise math: The power of broadcast&lt;/h2&gt;

&lt;p&gt;Dot calls correspond to the &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; function in Julia.  Broadcasting
is a powerful concept (also found, for example, in &lt;a href=&quot;https://docs.scipy.org/doc/numpy/user/basics.broadcasting.html&quot;&gt;NumPy&lt;/a&gt; and
&lt;a href=&quot;https://www.mathworks.com/help/matlab/ref/bsxfun.html&quot;&gt;Matlab&lt;/a&gt;) in which
the concept of “elementwise” operations is extended to encompass combining
arrays of different shapes or arrays and scalars.   Moreover, this is
not limited to arrays of numbers, and starting in Julia 0.6 a
“scalar” in a &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; context can be an object of an arbitrary type.&lt;/p&gt;

&lt;h3 id=&quot;combining-containers-of-different-shapes&quot;&gt;Combining containers of different shapes&lt;/h3&gt;

&lt;p&gt;You may have noticed that the examples above included expressions like
&lt;code class=&quot;highlighter-rouge&quot;&gt;6 .* X.^3&lt;/code&gt; that combine an array (&lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;) with scalars (&lt;code class=&quot;highlighter-rouge&quot;&gt;6&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;3&lt;/code&gt;).
Conceptually, in &lt;code class=&quot;highlighter-rouge&quot;&gt;X.^3&lt;/code&gt; the scalar &lt;code class=&quot;highlighter-rouge&quot;&gt;3&lt;/code&gt; is “expanded” (or “broadcasted”)
to match the size of &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt;, as if it became an array &lt;code class=&quot;highlighter-rouge&quot;&gt;[3,3,3,...]&lt;/code&gt;,
before performing &lt;code class=&quot;highlighter-rouge&quot;&gt;^&lt;/code&gt; elementwise.  In practice of course, no array
of &lt;code class=&quot;highlighter-rouge&quot;&gt;3&lt;/code&gt;s is ever explicitly constructed.&lt;/p&gt;

&lt;p&gt;More generally, if you combine two arrays of different dimensions or shapes,
any “singleton” (length 1) or missing dimension of one array is “broadcasted”
across that dimension of the other array.   For example, &lt;code class=&quot;highlighter-rouge&quot;&gt;A .+ [1,2,3]&lt;/code&gt;
adds &lt;code class=&quot;highlighter-rouge&quot;&gt;[1,2,3]&lt;/code&gt; to &lt;em&gt;each column&lt;/em&gt; of a 3×&lt;em&gt;n&lt;/em&gt; matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;.   Another typical
example is to combine a row vector (or a 1×&lt;em&gt;n&lt;/em&gt; array) and a column vector to make a matrix
(2d array):&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×3&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;21&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;22&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;23&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;31&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;32&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;33&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;(If &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; is a row vector, and &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt; is a column vector, then &lt;code class=&quot;highlighter-rouge&quot;&gt;A = x .+ y&lt;/code&gt; makes
a matrix with &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j] = x[j] + y[i]&lt;/code&gt;.)&lt;/p&gt;
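
&lt;p&gt;These shape rules can be verified directly. The sketch below uses small illustrative
arrays of our own choosing (&lt;code class=&quot;highlighter-rouge&quot;&gt;M&lt;/code&gt; is a hypothetical 3×2 matrix) and checks both the
row-plus-column case and the vector-plus-matrix case:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;x = [1 2 3]         # 1×3 row
y = [10, 20, 30]    # 3-element column vector

A = x .+ y
@assert size(A) == (3, 3)
for i in 1:3, j in 1:3
    @assert A[i, j] == x[j] + y[i]
end

# A 3-element vector is added to each column of a 3×n matrix.
M = [1 2; 3 4; 5 6]    # 3×2
@assert M .+ [1, 2, 3] == [2 3; 5 6; 8 9]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;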

&lt;p&gt;Although other languages have also implemented similar &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; semantics,
Julia is unusual in being able to support such operations for &lt;em&gt;arbitrary&lt;/em&gt; user-defined
functions and types with &lt;em&gt;performance comparable to hand-written C&lt;/em&gt; loops, even though
its &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; function is written &lt;em&gt;entirely in Julia&lt;/em&gt; with no special support
from the compiler.   This requires not only efficient compilation and
higher-order inlining, as mentioned above, but also the ability
to &lt;a href=&quot;http://julialang.org/blog/2016/02/iteration&quot;&gt;efficiently iterate over arrays of arbitrary dimensionalities&lt;/a&gt; determined
at compile-time for each caller.&lt;/p&gt;

&lt;h3 id=&quot;not-just-numbers&quot;&gt;Not just numbers&lt;/h3&gt;

&lt;p&gt;Although the examples above were all for numeric computations, in fact
neither the &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; function nor the dot-call fusion syntax is limited
to numeric data.  For example:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;The QUICK Brown&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;fox     jumped&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;over the LAZY dog.&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;];&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;replace&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lowercase&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;r&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;\s+&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;-&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;s&quot;&gt;&quot;the-quick-brown&quot;&lt;/span&gt;   
 &lt;span class=&quot;s&quot;&gt;&quot;fox-jumped&quot;&lt;/span&gt;        
 &lt;span class=&quot;s&quot;&gt;&quot;over-the-lazy-dog.&quot;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here, we take an array &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt; of strings, we convert each string to
lower case, and then we replace any sequence of whitespace (the &lt;a href=&quot;http://docs.julialang.org/en/latest/manual/strings.html#Regular-Expressions-1&quot;&gt;regular expression&lt;/a&gt;
&lt;code class=&quot;highlighter-rouge&quot;&gt;r&quot;\s+&quot;&lt;/code&gt;) with a hyphen &lt;code class=&quot;highlighter-rouge&quot;&gt;&quot;-&quot;&lt;/code&gt;.  Since these two dot calls are nested,
they are fused into a single loop over &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt; and are written in-place in &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt;
thanks to the &lt;code class=&quot;highlighter-rouge&quot;&gt;s .= ...&lt;/code&gt; (temporary &lt;em&gt;strings&lt;/em&gt; are allocated in this process,
but not temporary &lt;em&gt;arrays&lt;/em&gt; of strings).   Furthermore, notice that the
arguments &lt;code class=&quot;highlighter-rouge&quot;&gt;r&quot;\s+&quot;&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;&quot;-&quot;&lt;/code&gt; are treated as “scalars” and are “broadcasted”
to every element of &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The general rule (starting in Julia 0.6) is that, in &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;, arguments of &lt;em&gt;any type&lt;/em&gt; are
&lt;em&gt;treated as scalars by default&lt;/em&gt;.  The main exceptions are arrays (subtypes of
&lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;) and tuples, which are treated as containers and are iterated
over.  (If you define your own container type that is not a subtype of
&lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;, you can tell &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; to treat it as a container to
be iterated over by overloading &lt;code class=&quot;highlighter-rouge&quot;&gt;Base.Broadcast.containertype&lt;/code&gt; and a
couple of other functions.)&lt;/p&gt;
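
&lt;p&gt;A short sketch of the rule, assuming Julia 0.6 behavior (the literal values are ours):
a string argument is treated as a scalar and combined with every element of the array,
whereas a tuple is iterated over like an array.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# The string is a &quot;scalar&quot;: it is paired with each element of the array.
labels = string.(&quot;item-&quot;, [1, 2, 3])
@assert labels == [&quot;item-1&quot;, &quot;item-2&quot;, &quot;item-3&quot;]

# The tuple is a container: the scalar 10 is broadcast across it.
shifted = (1, 2, 3) .+ 10
@assert shifted == (11, 12, 13)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;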

&lt;h3 id=&quot;not-just-containers&quot;&gt;Not just containers&lt;/h3&gt;

&lt;p&gt;Since the dot-call syntax corresponds to &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; is just an
ordinary Julia function to which you can add your own methods (as opposed to
some kind of privileged compiler built-in), many possibilities open up.  Not
only can you extend fusing dot calls to your own data structures (e.g.
&lt;a href=&quot;https://github.com/JuliaParallel/DistributedArrays.jl&quot;&gt;DistributedArrays&lt;/a&gt;
extends &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; to work for arrays
&lt;a href=&quot;https://en.wikipedia.org/wiki/Distributed_memory&quot;&gt;distributed&lt;/a&gt; across multiple
computers), but you can apply the same syntax to data types that are &lt;em&gt;hardly
“containers” at all&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;For example, the &lt;a href=&quot;https://github.com/JuliaApproximation/ApproxFun.jl&quot;&gt;ApproxFun&lt;/a&gt;
package defines an object called a &lt;code class=&quot;highlighter-rouge&quot;&gt;Fun&lt;/code&gt; that represents a numerical
approximation of a user-defined function (essentially, a &lt;code class=&quot;highlighter-rouge&quot;&gt;Fun&lt;/code&gt; is a fancy
polynomial fit). Because ApproxFun defines &lt;a href=&quot;https://github.com/JuliaApproximation/ApproxFun.jl/issues/356&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; methods for
&lt;code class=&quot;highlighter-rouge&quot;&gt;Fun&lt;/code&gt;&lt;/a&gt;, you can
now take an &lt;code class=&quot;highlighter-rouge&quot;&gt;f::Fun&lt;/code&gt; and do, for example, &lt;code class=&quot;highlighter-rouge&quot;&gt;exp.(f.^2 .+ f.^3)&lt;/code&gt; and it will
translate to &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(y -&amp;gt; exp(y^2 + y^3), f)&lt;/code&gt;.  This &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; call, in
turn, will evaluate &lt;code class=&quot;highlighter-rouge&quot;&gt;exp(y^2 + y^3)&lt;/code&gt; for &lt;code class=&quot;highlighter-rouge&quot;&gt;y = f(x)&lt;/code&gt; at cleverly selected &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;
points, construct a polynomial fit, and return a new &lt;code class=&quot;highlighter-rouge&quot;&gt;Fun&lt;/code&gt; object representing
the fit. (Conceptually, this replaces &lt;em&gt;elementwise&lt;/em&gt; operations on containers
with &lt;em&gt;pointwise&lt;/em&gt; operations on functions.) In contrast, ApproxFun also allows
you to compute the same result using &lt;code class=&quot;highlighter-rouge&quot;&gt;exp(f^2 + f^3)&lt;/code&gt;, but in this case it will go
through the fitting process &lt;em&gt;four times&lt;/em&gt; (constructing four &lt;code class=&quot;highlighter-rouge&quot;&gt;Fun&lt;/code&gt; objects), once
for each operation like &lt;code class=&quot;highlighter-rouge&quot;&gt;f^2&lt;/code&gt;, and is more than an order of magnitude slower
due to this lack of fusion.&lt;/p&gt;

&lt;h3 id=&quot;broadcast-vs-map&quot;&gt;broadcast vs. map&lt;/h3&gt;

&lt;p&gt;Finally, it is instructive to compare &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; with &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt;, since &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; &lt;em&gt;also&lt;/em&gt;
applies a function elementwise to one or more arrays.   (The dot-call
syntax invokes &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;, not &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt;.) The basic differences are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; handles only &lt;em&gt;containers with “shapes”&lt;/em&gt; M×N×⋯ (i.e., a &lt;code class=&quot;highlighter-rouge&quot;&gt;size&lt;/code&gt; and dimensionality), whereas &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt;
handles “shapeless” containers like &lt;code class=&quot;highlighter-rouge&quot;&gt;Set&lt;/code&gt; or iterators of
unknown length like &lt;code class=&quot;highlighter-rouge&quot;&gt;eachline(file)&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; requires all arguments to have the &lt;em&gt;same length&lt;/em&gt; (and
hence cannot combine arrays and scalars) and (for array containers) the same shape, whereas &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; does
not (it can “expand” smaller containers to match larger ones).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; treats all arguments as &lt;em&gt;containers by default&lt;/em&gt;, and in particular
expects its arguments to &lt;a href=&quot;http://docs.julialang.org/en/latest/manual/interfaces.html#man-interface-iteration-1&quot;&gt;act as iterators&lt;/a&gt;.
In contrast, &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; treats its arguments as &lt;em&gt;scalars by default&lt;/em&gt; (i.e., as 0-dimensional arrays
of one element), except for a few types like &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Tuple&lt;/code&gt;
that are explicitly declared to be broadcast containers.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sometimes, of course, their behavior coincides, e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;map(sqrt, [1,2,3])&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;sqrt.([1,2,3])&lt;/code&gt; give the same result.  But, in general, neither &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt;
nor &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; generalizes the other; each can do things that
the other cannot.&lt;/p&gt;
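
&lt;p&gt;To make the comparison concrete, here is a small sketch (the inputs and the name
&lt;code class=&quot;highlighter-rouge&quot;&gt;evens&lt;/code&gt; are illustrative assumptions) showing one case where the two coincide, one that
relies on &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; expanding its arguments, and one that relies on &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; accepting
an iterator whose length is not known in advance:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Where the two coincide:
@assert map(sqrt, [1, 4, 9]) == sqrt.([1, 4, 9])

# broadcast expands a scalar, or a 1×3 row against a 3-element vector.
@assert [1, 2, 3] .+ 10 == [11, 12, 13]
@assert size([1 2 3] .* [1, 2, 3]) == (3, 3)

# map accepts any iterator, including a filtered generator whose
# length is only known once it has been consumed.
evens = (i for i in 1:10 if iseven(i))
squares = map(x -&amp;gt; x^2, evens)
@assert squares == [4, 16, 36, 64, 100]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;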

</content>
 </entry>
 
 <entry>
   <title>Julia 0.5 Highlights</title>
   <link href="http://julialang.org/blog/2016/10/julia-0.5-highlights"/>
   <updated>2016-10-11T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/10/julia-0.5-highlights</id>
   <content type="html">&lt;p&gt;&lt;em&gt;To follow along with the examples in this blog post and run them live, you can go to &lt;a href=&quot;https://juliabox.com/&quot;&gt;JuliaBox&lt;/a&gt;, create a free login, and open the “Julia 0.5 Highlights” notebook under “What’s New in 0.5”. The notebook can also be downloaded from &lt;a href=&quot;https://raw.githubusercontent.com/JuliaLang/julialang.github.com/master/blog/_posts/Julia-0.5-highlights-notebook/Julia%200.5%20Highlights.ipynb&quot;&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;/blog/2016/10/julia-0.5-release&quot;&gt;Julia 0.5&lt;/a&gt; is a pivotal release.
It introduces more transformative features than any release since the first official version.
Moreover, several of these features set the stage for even more to come in the &lt;a href=&quot;https://www.youtube.com/watch?v=5gXMpbY1kJY&quot;&gt;lead up to Julia 1.0&lt;/a&gt;.
In this post, we’ll go through some of the major changes in 0.5, including improvements to functional programming, comprehensions, generators, arrays, strings, and more.&lt;/p&gt;

&lt;h2 id=&quot;functions&quot;&gt;Functions&lt;/h2&gt;

&lt;p&gt;Julia has always supported functional programming features:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;anonymous functions (&lt;a href=&quot;https://en.wikipedia.org/wiki/Anonymous_function&quot;&gt;lambdas&lt;/a&gt;),&lt;/li&gt;
  &lt;li&gt;inner functions that close over local variables (&lt;a href=&quot;https://en.wikipedia.org/wiki/Closure_(computer_programming)&quot;&gt;closures&lt;/a&gt;),&lt;/li&gt;
  &lt;li&gt;functions passed to and from other functions (&lt;a href=&quot;https://en.wikipedia.org/wiki/First-class_function&quot;&gt;first-class&lt;/a&gt; and &lt;a href=&quot;https://en.wikipedia.org/wiki/Higher-order_function&quot;&gt;higher-order&lt;/a&gt; functions).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Before this release, however, these features all came with a significant performance cost.
In a language that targets high-performance technical computing, that’s a serious limitation.
So the Julia standard library and ecosystem have been rife with work-arounds to get the expressiveness of functional programming without the performance problems.
But the right solution, of course, is to make functional programming fast – ideally just as fast as the optimal hand-written version of your code would be.
In Julia 0.5, it is.
And that changes everything.&lt;/p&gt;

&lt;p&gt;This change is so important that there will be a separate blog post about it in the coming weeks, explaining how higher-order functions, closures and lambdas have been made so efficient, as well as detailing the kinds of zero-cost abstractions these changes enable.
But for now, I’ll just tease with a little timing comparison.
First, some definitions – they’re the same in both 0.4 and 0.5:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;);&lt;/span&gt;                   &lt;span class=&quot;c&quot;&gt;# 10 million random numbers&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;double_it_vec&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;             &lt;span class=&quot;c&quot;&gt;# vectorized doubling of input&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;double_it_map&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;map&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;c&quot;&gt;# map a lambda over input&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here is the timing comparison in Julia 0.4:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VERSION&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;0.4.7&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;@elapsed&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;double_it_vec&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;0.024444888209999998&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;@elapsed&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;double_it_map&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;0.5515606454499999&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;On 0.4, the functional version using &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; is roughly 22 times slower (about 0.55 seconds versus 0.024 seconds per call) than the vectorized version, which uses specialized generated code for maximal speed.
Now, the same comparison in Julia 0.5:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VERSION&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;0.5.0&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;@elapsed&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;double_it_vec&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;0.024549842180000003&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;nd&quot;&gt;@elapsed&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;double_it_map&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;_&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;0.023871925960000002&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The version using &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; is as fast as the vectorized one in 0.5.
In this case, writing &lt;code class=&quot;highlighter-rouge&quot;&gt;2v&lt;/code&gt; happens to be more convenient than writing &lt;code class=&quot;highlighter-rouge&quot;&gt;map(x-&amp;gt;2x, v)&lt;/code&gt;, so we may choose not to use &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; here, but there are many cases where functional constructs are clearer, more general, and more convenient.
Now, they are also fast.&lt;/p&gt;
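
&lt;p&gt;For reference, the kind of hand-written loop that the functional version is implicitly being measured against might look like the sketch below. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;double_it_loop&lt;/code&gt; is hypothetical and was not part of the benchmark above, so no timings are claimed for it:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical hand-written equivalent of double_it_vec and double_it_map
function double_it_loop(v)
    out = similar(v)            # uninitialized array with the same type and shape as v
    for i in eachindex(v)
        out[i] = 2 * v[i]       # double each element explicitly
    end
    return out
end

# All three formulations should compute the same result
w = [0.5, 1.5, 2.5]
same = double_it_loop(w) == 2w == map(x-&amp;gt;2x, w)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;It can be timed with the same &lt;code class=&quot;highlighter-rouge&quot;&gt;@elapsed&lt;/code&gt; pattern used above if you want to compare it on your own machine.&lt;/p&gt;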

&lt;h3 id=&quot;ambiguous-methods&quot;&gt;Ambiguous methods&lt;/h3&gt;

&lt;p&gt;One design decision that any multiple dispatch language must make is how to handle dispatch ambiguities: cases where none of the methods applicable to a given set of arguments is more specific than the rest.
Suppose, for example, that a generic function, &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;, has the following methods:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In Julia 0.4 and earlier, the second method definition causes an ambiguity warning:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;WARNING&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;New&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;definition&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;none&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
&lt;span class=&quot;nb&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ambiguous&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;none&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;To&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fix&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;define&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;before&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;the&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;new&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;definition&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This warning is clear and gets right to the point: the case &lt;code class=&quot;highlighter-rouge&quot;&gt;f(a,b)&lt;/code&gt; where &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; are of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt; (aka &lt;code class=&quot;highlighter-rouge&quot;&gt;Int64&lt;/code&gt; on 64-bit systems) is ambiguous.
Evaluating &lt;code class=&quot;highlighter-rouge&quot;&gt;f(3,4)&lt;/code&gt; calls the first method of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; – but this behavior is undefined.
Giving a warning whenever methods &lt;em&gt;could&lt;/em&gt; be ambiguous is a fairly conservative choice: it urges people to define a method covering the ambiguous intersection before even defining the methods that overlap.
When we decided to give warnings for potentially ambiguous methods, we hoped that people would avoid ambiguities and all would be well in the world.&lt;/p&gt;

&lt;p&gt;Warning about method ambiguities turns out to be both too strict and too lenient.
It’s far too easy for ambiguities to arise when shared generic functions serve as extension points across unrelated packages.
When many packages extend the same generic functions, it’s common for the methods added to have some ambiguous overlap.
This happens even when each package has no ambiguities on its own.
Worse still, slight changes to one package can introduce ambiguities elsewhere, resulting in the least fun game of &lt;a href=&quot;https://en.wikipedia.org/wiki/Whac-A-Mole#Colloquial_usage&quot;&gt;whack-a-mole&lt;/a&gt; ever.
At the same time, the fact that ambiguities &lt;em&gt;only&lt;/em&gt; cause warnings means that people learn to ignore them, which is annoying at best, and dangerous at worst: it’s far too easy for a real problem to be hidden by a barrage of insignificant ambiguity warnings.
In particular, on 0.4 and earlier, if an ambiguous method is actually called, no error occurs.
Instead, one of the possible methods is called, based on the order in which methods were defined – which is essentially arbitrary when they come from different packages.
Usually the method works – it does apply, after all – but this is clearly not the right thing to do.&lt;/p&gt;

&lt;p&gt;The solution is simple: in Julia 0.5 the existence of potential ambiguities is fine, but actually calling an ambiguous method is an immediate error.
The above method definitions for &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;, which previously triggered a warning, are now silent, but &lt;em&gt;calling&lt;/em&gt; &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; with two &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt; arguments is a method dispatch error:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ERROR&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MethodError&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;is&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ambiguous&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Candidates&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;REPL&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;REPL&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eval&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Module&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Any&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;./&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;boot&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jl&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;231&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;macro&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;expansion&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;./&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;REPL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jl&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;92&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;inlined&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Base&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;REPL&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;c&quot;&gt;##1#2{Base.REPL.REPLBackend})() at ./event.jl:46&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This improves the experience of using the Julia package ecosystem considerably, while also making Julia safer and more reliable.
No more torrent of insignificant ambiguity warnings.
No more playing ambiguity whack-a-mole when someone else refactors their code and accidentally introduces ambiguities in yours.
No more risk that a method call could be silently broken because of warnings that we’ve all learned to ignore.&lt;/p&gt;
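
&lt;p&gt;If the ambiguous call is one you actually need, the remedy is the same one the old warning suggested: define a method for the intersection. A minimal sketch, where the return value &lt;code class=&quot;highlighter-rouge&quot;&gt;3&lt;/code&gt; is chosen arbitrarily for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;f(a::Int, b::Real) = 1
f(a::Real, b::Int) = 2
f(a::Int, b::Int) = 3   # more specific than both methods above; covers the intersection

r1 = f(3, 4)     # dispatches to the (Int, Int) method
r2 = f(3, 4.0)   # only the (Int, Real) method applies
r3 = f(3.0, 4)   # only the (Real, Int) method applies&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With the third method in place, the call with two &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt; arguments has a single most specific candidate, so it no longer raises a method dispatch error.&lt;/p&gt;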

&lt;h3 id=&quot;return-type-annotations&quot;&gt;Return type annotations&lt;/h3&gt;

&lt;p&gt;A long-requested feature has been the ability to annotate method definitions with an explicit return type.
This aids the clarity of code, serves as self-documentation, helps the compiler reason about code, and ensures that return types are what programmers intend them to be.
In 0.5, you can annotate method definitions with a return type like so:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;elseif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This function is similar to the built-in &lt;a href=&quot;http://docs.julialang.org/en/release-0.5/stdlib/math/#Base.clamp&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;clamp&lt;/code&gt;&lt;/a&gt; function, but let’s consider this definition for the sake of example.
The return annotation on &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; has the effect of inserting implicit calls to &lt;code class=&quot;highlighter-rouge&quot;&gt;x-&amp;gt;convert(T, x)&lt;/code&gt; at each return point of the method.
It has no effect on any other method of &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt;, only the one where the annotation occurs.
In this case, the annotation ensures that this method always returns a value of the same type as &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, regardless of the types of &lt;code class=&quot;highlighter-rouge&quot;&gt;lo&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;hi&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# convert(T, lo)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# convert(T, x)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;1.5&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# convert(T, hi)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;2.0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;You’ll note that the annotated return type here is &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt;, which is a type parameter of the &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; method.
Not only is that allowed, but the return type can be an arbitrary expression of argument values, type parameters, and values from outer scopes.
For example, here is a variation that promotes its arguments:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; clip2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;promote_type&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;typeof&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;typeof&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;typeof&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;elseif&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;2.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;//&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;2.5&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Return type annotations are a fairly simple syntactic transformation, but they make it easier to write methods with consistent and predictable return types.
If different branches of your code can lead to slightly different types, the fix is now as simple as putting a single type annotation on the entire method.&lt;/p&gt;
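
&lt;p&gt;As a minimal sketch of that situation (the function &lt;code class=&quot;highlighter-rouge&quot;&gt;halve_if_positive&lt;/code&gt; is a hypothetical example, not part of Julia or of the code above), one branch below yields a floating-point quotient and the other an integer literal, yet the single annotation makes every call return a &lt;code class=&quot;highlighter-rouge&quot;&gt;Float64&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical example: the two branches produce different types.
function halve_if_positive(x::Int)::Float64
    if x &amp;gt; 0
        return x / 2   # Float64 quotient
    else
        return 0       # Int literal, converted to Float64 by the annotation
    end
end

halve_if_positive(3)    # 1.5
halve_if_positive(-3)   # 0.0 rather than 0&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;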

&lt;h3 id=&quot;vectorized-function-calls&quot;&gt;Vectorized function calls&lt;/h3&gt;

&lt;p&gt;Julia 0.5 introduces the syntax &lt;code class=&quot;highlighter-rouge&quot;&gt;f.(A1, A2, ...)&lt;/code&gt; for vectorized function calls.
This syntax translates to &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(f, A1, A2, ...)&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; is a higher-order function (introduced in 0.2), which generically implements the kind of broadcasting behavior found in Julia’s “dotted operators” such as &lt;code class=&quot;highlighter-rouge&quot;&gt;.+&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;.-&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;.*&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;./&lt;/code&gt;.
Since higher-order functions are now efficient, &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(f,v,w)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;f.(v,w)&lt;/code&gt; are both about as fast as loops specialized for the operation &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; and the shapes of &lt;code class=&quot;highlighter-rouge&quot;&gt;v&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt;.
This syntax lets you vectorize your scalar functions the way built-in vectorized functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;log&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;exp&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;atan2&lt;/code&gt; work.
In fact, in the future, this syntax will likely replace the pre-vectorized methods of functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;exp&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;log&lt;/code&gt;, so that users will write &lt;code class=&quot;highlighter-rouge&quot;&gt;exp.(v)&lt;/code&gt; to exponentiate a vector of values.
This may seem a little bit uglier, but it’s more consistent than choosing an essentially arbitrary set of functions to pre-vectorize, and as I’ll explain below, this approach can also have significant performance benefits.&lt;/p&gt;
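
&lt;p&gt;The equivalence between the dot call and &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; can be sketched with a small scalar function. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;hyp&lt;/code&gt; and the input vectors are hypothetical, chosen only for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical scalar function, defined only for single numbers.
hyp(a, b) = sqrt(a^2 + b^2)

v = [3.0, 5.0, 8.0]
w = [4.0, 12.0, 15.0]

hyp.(v, w)                          # elementwise result: [5.0, 13.0, 17.0]
hyp.(v, w) == broadcast(hyp, v, w)  # true: the dot call is a broadcast call&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;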

&lt;p&gt;To give a more concrete sense of what this syntax can be used for, consider the &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; function defined above for real arguments.
This scalar function can be applied to vectors using vectorized call syntax without any further method definitions:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;randn&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.868996&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.79301&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.309632&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.16802&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.57178&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.223385&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.608423&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.54862&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.33672&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.864448&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ERROR&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;MethodError&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;no&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;method&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;matching&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Closest&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;candidates&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;are&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;lt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Real&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;at&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;REPL&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.868996&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.309632&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.223385&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.608423&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.864448&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The second and third arguments don’t need to be scalars – as with dotted operators, they can be vectors as well, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; operation will be applied to each corresponding triple of values:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;repmat&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;repmat&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;],&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.868996&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.608423&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.0&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.864448&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;From these examples, it may be unclear why this operation is called “&lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt;”.
The function gets its name from the following behavior:
wherever one of its arguments has a singleton dimension (i.e. dimension of size 1), it “broadcasts” that value along the corresponding dimension of the other arguments when applying the operator.
Broadcasting allows dotted operations to easily do handy tricks like mean-centering the columns of a matrix:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.343976&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.427378&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.503356&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00448691&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.210096&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.531489&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.168928&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.128212&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.13388&lt;/span&gt;    &lt;span class=&quot;mf&quot;&gt;0.104111&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.334428&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.132699&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; is 3×4 and &lt;code class=&quot;highlighter-rouge&quot;&gt;mean(A,1)&lt;/code&gt; is 1×4, so the &lt;code class=&quot;highlighter-rouge&quot;&gt;.-&lt;/code&gt; operator broadcasts the subtraction of each mean value along the corresponding column of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;, thereby mean-centering each column.
Combining this broadcasting behavior with vectorized call syntax lets us write some fairly fancy custom array operations very concisely:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;clip&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;'&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.343976&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00448691&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;0.168928&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.128212&lt;/span&gt;
 &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;0.104111&lt;/span&gt;   &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This expression clips each element of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; with its own specific &lt;code class=&quot;highlighter-rouge&quot;&gt;(lo,hi)&lt;/code&gt; pair from this matrix:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lo&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hi&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tuple&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Vectorized call syntax avoids ever materializing this array of pairs, however, and the messy code to apply &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; to each element of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; with the corresponding &lt;code class=&quot;highlighter-rouge&quot;&gt;lo&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;hi&lt;/code&gt; values doesn’t have to be written.
When &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; is larger than a toy example, not constructing a temporary matrix of &lt;code class=&quot;highlighter-rouge&quot;&gt;(lo,hi)&lt;/code&gt; pairs can be a big efficiency win.&lt;/p&gt;
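
&lt;p&gt;For comparison, here is a rough sketch of the explicit loop that the broadcast call stands in for. It assumes Base’s &lt;code class=&quot;highlighter-rouge&quot;&gt;clamp(x, lo, hi)&lt;/code&gt; as a stand-in for the &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; function defined earlier, and the small matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; below is made up for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Illustrative data: a 3×4 matrix, one lower bound per row, one upper bound per column.
B  = [ 0.5   -0.5    0.0   0.25;
      -0.25   0.75  -1.0   0.125;
       1.0    0.0    0.5  -0.75]
lo = [-0.3, -0.2, -0.1]
hi = [0.4, 0.3, 0.2, 0.1]

# Hand-written version: pair element (i, j) with lo[i] and hi[j].
C = similar(B)
for j in 1:size(B, 2), i in 1:size(B, 1)
    C[i, j] = clamp(B[i, j], lo[i], hi[j])
end

# The broadcast call pairs up the same values without building a matrix of (lo,hi) pairs.
C == clamp.(B, lo, hi')   # true&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;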

&lt;p&gt;There is a bit more to the story about vectorized call syntax.
It’s common to write expressions applying multiple vectorized functions to some arrays.
For example, one might write something like:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This computes the absolute value of each element of &lt;code class=&quot;highlighter-rouge&quot;&gt;X&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Y&lt;/code&gt; and takes the larger of the corresponding elements from &lt;code class=&quot;highlighter-rouge&quot;&gt;abs(X)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;abs(Y)&lt;/code&gt;.
In this traditional vectorized form, the code allocates two temporary intermediate arrays – one to store each of &lt;code class=&quot;highlighter-rouge&quot;&gt;abs(X)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;abs(Y)&lt;/code&gt;.
If we use the new vectorized function call syntax, however, these calls are syntactically fused into a &lt;em&gt;single&lt;/em&gt; call to &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; with an anonymous function.
In other words, we write this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;which internally becomes this:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;broadcast&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This version of the computation avoids allocating any intermediate arrays and performs the entire vectorized computation all at once, directly into the result array.
We can see this difference in memory usage and speed when we benchmark these expressions:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;);&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@benchmark&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Trial&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;estimate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;22.89&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;13.95&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.77&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;median&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;      &lt;span class=&quot;mf&quot;&gt;14.17&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.76&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;14.32&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.78&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;17.15&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.47&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@benchmark&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Trial&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;estimate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;7.63&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mb&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;2.84&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;median&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;      &lt;span class=&quot;mf&quot;&gt;2.98&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;3.27&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;18.26&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;5.96&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;65.68&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;22.89&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;7.63&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;16.63&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.84&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.330729166666667&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;I’m using the &lt;a href=&quot;https://github.com/JuliaCI/BenchmarkTools.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;BenchmarkTools&lt;/code&gt;&lt;/a&gt; package here instead of hand-rolled timing loops. &lt;code class=&quot;highlighter-rouge&quot;&gt;BenchmarkTools&lt;/code&gt; has been carefully designed to avoid many of the common pitfalls of benchmarking code and to provide sound statistical estimates of how much time and memory your code uses.
For the sake of brevity, I’m omitting some of the less relevant output from &lt;code class=&quot;highlighter-rouge&quot;&gt;@benchmark&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;As you can see, the dotted form uses a third of the memory and is 4.3 times faster.
These improvements come from avoiding temporary allocations and performing the entire computation in a single pass over the arrays.
Even greater reduction in allocation can occur when we use the new &lt;code class=&quot;highlighter-rouge&quot;&gt;.=&lt;/code&gt; operator to also do vectorized assignment:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Z&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zeros&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;);&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# matrix of zeros similar to X&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@benchmark&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Z&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;abs&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Trial&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;estimate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;96.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;1.76&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;median&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;      &lt;span class=&quot;mf&quot;&gt;1.82&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;1.89&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;4.24&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ms&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With in-place vectorized assignment, we can fill the pre-allocated array, &lt;code class=&quot;highlighter-rouge&quot;&gt;Z&lt;/code&gt;, without allocating any new array (the reported 96 bytes is a measurement artifact), and do so 7.3 times faster than the old-style vectorized computation.
This can be a big win in situations where we can reuse the same output array for multiple computations.&lt;/p&gt;
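
&lt;p&gt;As a mental model, here is a rough sketch of the loop that &lt;code class=&quot;highlighter-rouge&quot;&gt;Z .= max.(abs.(X), abs.(Y))&lt;/code&gt; amounts to. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;maxabs!&lt;/code&gt; is made up for this illustration, and the real implementation is built on Julia’s broadcast machinery rather than a hand-written loop, so treat this only as an approximation of what happens:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical helper, not part of Julia: a hand-written version of
# Z .= max.(abs.(X), abs.(Y)) for arrays of identical size.
function maxabs!(Z, X, Y)
    @assert size(Z) == size(X) == size(Y)
    for i in eachindex(Z)
        # one pass, one element at a time, no temporary arrays
        Z[i] = max(abs(X[i]), abs(Y[i]))
    end
    return Z
end

X, Y = randn(100, 100), randn(100, 100)
Z = similar(X)      # allocate the output buffer once
maxabs!(Z, X, Y)    # fill it; later calls can reuse the same Z&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Because the output buffer is passed in, a caller that repeats the computation pays for the allocation of &lt;code class=&quot;highlighter-rouge&quot;&gt;Z&lt;/code&gt; only once.&lt;/p&gt;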

&lt;p&gt;The last major missing piece of vectorized call syntax is yet to come – it will be implemented in the next version of Julia.
Dotted operators like &lt;code class=&quot;highlighter-rouge&quot;&gt;.+&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;.*&lt;/code&gt; will cease to be their own independent operators and simply become the vectorized forms of the corresponding scalar operators, &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;*&lt;/code&gt;.
In other words, instead of &lt;code class=&quot;highlighter-rouge&quot;&gt;.+&lt;/code&gt; being a function as it is now, with its own behavior independent of &lt;code class=&quot;highlighter-rouge&quot;&gt;+&lt;/code&gt;, when you write &lt;code class=&quot;highlighter-rouge&quot;&gt;X .+ Y&lt;/code&gt; it will mean &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast(+, X, Y)&lt;/code&gt;.
Furthermore, dotted operators will participate in the same syntax-level fusion as other vectorized calls, so an expression like &lt;code class=&quot;highlighter-rouge&quot;&gt;exp.(log.(X) .+ log.(Y))&lt;/code&gt; will translate into a single call to broadcast:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;broadcast&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;X&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This change will complete the transition to a generalized approach to vectorized function application (including syntax-level loop fusion), which will make Julia’s story for writing allocation-free array code much stronger.&lt;/p&gt;
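
&lt;p&gt;As a quick sanity check of that equivalence, the sketch below compares the dotted expression with the explicit &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; call on small made-up inputs. It only checks that the values agree; whether the dotted form runs as a single fused pass depends on the change described above:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;X, Y = [1.0, 2.0, 3.0], [4.0, 5.0, 6.0]   # small example inputs

dotted   = exp.(log.(X) .+ log.(Y))
explicit = broadcast((x, y) -&amp;gt; exp(log(x) + log(y)), X, Y)

isapprox(dotted, explicit)   # both compute x*y elementwise, up to rounding&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;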

&lt;h2 id=&quot;comprehensions&quot;&gt;Comprehensions&lt;/h2&gt;

&lt;p&gt;Julia’s array comprehensions have always supported some advanced features such as iterating with several variables to produce multidimensional arrays.
This release rounds out the functionality of comprehensions with two additional features: nested generation with multiple &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; clauses, and filtering with a trailing &lt;code class=&quot;highlighter-rouge&quot;&gt;if&lt;/code&gt; clause.
To demonstrate these features, consider making a dollar (100¢) using quarters (25¢), dimes (10¢), nickels (5¢) and pennies (1¢).
We can generate an array of tuples, each giving the total value held in each kind of coin, by using a comprehension with nested &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; clauses:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;change&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;242&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NTuple&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;95&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;90&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;85&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;80&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;n&quot;&gt;⋮&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;There are a few notable differences from the multidimensional array syntax:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Each iteration is a new &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; clause, rather than a single compound iteration separated by commas;&lt;/li&gt;
  &lt;li&gt;Each successive &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; clause &lt;em&gt;can&lt;/em&gt; refer to variables from the previous clauses;&lt;/li&gt;
  &lt;li&gt;The result is a single flat vector regardless of how many nested &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; clauses there are.&lt;/li&gt;
&lt;/ul&gt;
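
&lt;p&gt;A tiny made-up example, unrelated to the coin problem, may help contrast the two forms. The variable names &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; are chosen only for this illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Comma-separated iteration: independent ranges, result is a 2-by-3 matrix
A = [(i, j) for i = 1:2, j = 1:3]
size(A)                          # (2, 3)

# Nested for clauses: the inner range may depend on i, result is a flat vector
B = [(i, j) for i = 1:2 for j = 1:i]
B == [(1, 1), (2, 1), (2, 2)]    # true&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;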

&lt;p&gt;The tuple &lt;code class=&quot;highlighter-rouge&quot;&gt;(q,d,n,p)&lt;/code&gt; in the comprehension body is a breakdown of monetary value into quarters, dimes, nickels and pennies.
Note that the iteration range for &lt;code class=&quot;highlighter-rouge&quot;&gt;p&lt;/code&gt; isn’t a range at all; it’s a single value, &lt;code class=&quot;highlighter-rouge&quot;&gt;100-q-d-n&lt;/code&gt;, the unique number guaranteeing that each tuple adds up to a dollar.
(This relies on the fact that a number behaves like an immutable zero-dimensional container, holding only itself, a behavior which is sometimes convenient but which has been the subject of significant debate.
As of 0.5 it still works.)
We can verify that each tuple adds up to 100:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extrema&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;change&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Since 100 is both the minimum and maximum of all the tuple sums, we know they are all exactly 100.
So, there are 242 ways to make a dollar with common coins.
But suppose we want to ensure that the value in pennies is less than the value in nickels, which is less than the value in dimes, which in turn is less than the value in quarters.
By adding a filter clause, we can do this easily too:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span 
class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;p&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;NTuple&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;50&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;40&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;75&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The only difference here is the &lt;code class=&quot;highlighter-rouge&quot;&gt;if p &amp;lt; n &amp;lt; d &amp;lt; q&lt;/code&gt; clause at the end of the comprehension, which restricts the result to the cases where this predicate holds.
There are exactly four ways to make a dollar with strictly increasing value from pennies to nickels to dimes to quarters.&lt;/p&gt;

&lt;p&gt;Nested and filtered comprehensions aren’t earth-shattering features – everything you can do with them can be done in a variety of other ways – but they are expressive and convenient, found in other languages, and they allow you to try more things with your data quickly and easily, with less pointless refactoring.&lt;/p&gt;

&lt;h2 id=&quot;generators&quot;&gt;Generators&lt;/h2&gt;

&lt;p&gt;In the previous section we used an array comprehension to take the sum of each tuple, save the sums as an array, and then pass that array of sums to the &lt;code class=&quot;highlighter-rouge&quot;&gt;extrema&lt;/code&gt; function to find the largest and smallest sum (they’re all 100):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extrema&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;change&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.000072&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;2.203&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;KB&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Wrapping this in the &lt;code class=&quot;highlighter-rouge&quot;&gt;@time&lt;/code&gt; macro shows that this expression allocates 2.2 KB of memory – mostly for the array of sums, which is thrown away after the computation.
But allocating an array just to find its extrema is unnecessary:
the minimum and maximum can be computed over streamed data by keeping the largest and smallest values seen so far.
In other words, this calculation could be expressed with constant memory overhead by interleaving the production of values with computation of extrema.
Previously, expressing this interleaved computation required some amount of refactoring, and many approaches were considerably less efficient.
In 0.5, if you simply omit the square brackets around an array comprehension, you get a &lt;em&gt;generator expression&lt;/em&gt;, which, instead of producing an array of values, can be iterated over, yielding one value at a time.
Since &lt;code class=&quot;highlighter-rouge&quot;&gt;extrema&lt;/code&gt; works with arbitrary iterable objects – including generators – expressing an interleaved calculation using constant memory is now as simple as deleting &lt;code class=&quot;highlighter-rouge&quot;&gt;[&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;]&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;extrema&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;change&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;0.000066&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;208&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;100&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This avoids allocating a temporary array of sums entirely, instead computing the next tuple’s sum only when the &lt;code class=&quot;highlighter-rouge&quot;&gt;extrema&lt;/code&gt; function is ready to accept a new value.
Using a generator reduces the memory overhead to 208 bytes – the size of the return value.
More importantly, the memory usage doesn’t depend on the size of the &lt;code class=&quot;highlighter-rouge&quot;&gt;change&lt;/code&gt; array anymore – it will always be just 208 bytes, even if &lt;code class=&quot;highlighter-rouge&quot;&gt;change&lt;/code&gt; holds a trillion tuples.
It’s not hard to imagine situations where such a reduction in asymptotic memory usage is crucial.
The similar syntax between array comprehensions and generator expressions makes it trivial to move back and forth between the two styles of computation as needed.&lt;/p&gt;
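
&lt;p&gt;To make the “largest and smallest values seen so far” idea concrete, here is a minimal sketch of a constant-memory reduction over any iterable. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;streaming_extrema&lt;/code&gt; is hypothetical and this is not how &lt;code class=&quot;highlighter-rouge&quot;&gt;extrema&lt;/code&gt; is actually implemented in Base; the four-element &lt;code class=&quot;highlighter-rouge&quot;&gt;change&lt;/code&gt; array is just the tuples printed above, used as stand-in data:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical helper, not part of Base: consumes one value at a time and
# keeps only the running minimum and maximum.
function streaming_extrema(itr)
    lo = hi = nothing
    for x in itr
        if lo === nothing
            lo = hi = x          # first value seen
        else
            lo = min(lo, x)
            hi = max(hi, x)
        end
    end
    if lo === nothing
        throw(ArgumentError(&quot;collection must be non-empty&quot;))
    end
    return (lo, hi)
end

# Stand-in data: the four tuples from the filtered comprehension above.
change = [(50, 30, 15, 5), (50, 30, 20, 0), (50, 40, 10, 0), (75, 20, 5, 0)]

streaming_extrema(sum(t) for t in change)   # no intermediate array of sums
streaming_extrema(k^2 for k = 1:8)          # works for any generator&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Because the loop only ever holds two values, its memory use does not grow with the number of items the generator yields.&lt;/p&gt;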

&lt;h3 id=&quot;initializing-collections&quot;&gt;Initializing collections&lt;/h3&gt;

&lt;p&gt;The new generator syntax dovetails particularly nicely with Julia’s convention for constructing collections – to make a new collection, you call the constructor with a single iterable argument, which yields the values you want in the new collection.
In its simplest form, this looks something like:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;36&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;49&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;36&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;49&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In this expression, an array of integers, which in this case happen to be small squares, is passed to the &lt;code class=&quot;highlighter-rouge&quot;&gt;IntSet&lt;/code&gt; constructor to create an object representing that set.
Once constructed, the &lt;code class=&quot;highlighter-rouge&quot;&gt;IntSet&lt;/code&gt; object no longer refers to the original array of integers.
Instead, it uses a bitmask to efficiently store and operate on sets.
It displays itself as you would construct it from an array, but that’s merely for convenience – there’s no actual array anymore.&lt;/p&gt;

&lt;p&gt;Now, I’m a human (no blogbots here) and I find typing out even short sequences of perfect squares tedious and error-prone – despite a math degree, I’m awful at arithmetic.
It would be much easier to generate squares with an array comprehension:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;36&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;49&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This comprehension produces the same array of integers that I typed manually above.
As before, creating this array object is unnecessary – it would be even better to generate the desired squares as they are inserted into the new &lt;code class=&quot;highlighter-rouge&quot;&gt;IntSet&lt;/code&gt;.
Which, of course, is precisely what generator expressions allow:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;16&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;36&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;49&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Using a generator here is just as clear, more concise, and significantly more efficient:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@benchmark&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Trial&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;estimate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;320.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;163.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;median&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;      &lt;span class=&quot;mf&quot;&gt;199.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;245.18&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;12.95&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;5.36&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;μs&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;92.47&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@benchmark&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;IntSet&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;BenchmarkTools&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Trial&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;memory&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;estimate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;mf&quot;&gt;160.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;bytes&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;minimum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;114.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;median&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;      &lt;span class=&quot;mf&quot;&gt;139.00&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.00&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;        &lt;span class=&quot;mf&quot;&gt;165.74&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ns&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;11.48&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;maximum&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;     &lt;span class=&quot;mf&quot;&gt;4.82&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;μs&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;93.20&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GC&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;As you can see from this benchmark, the version with an array comprehension uses twice as much memory and takes roughly 40 to 50% longer than constructing the same &lt;code class=&quot;highlighter-rouge&quot;&gt;IntSet&lt;/code&gt; using a generator expression.&lt;/p&gt;
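
&lt;p&gt;The same pattern is not specific to &lt;code class=&quot;highlighter-rouge&quot;&gt;IntSet&lt;/code&gt;. As a rough sketch, assuming only that &lt;code class=&quot;highlighter-rouge&quot;&gt;Set&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;sum&lt;/code&gt; from Base accept an arbitrary iterable, a generator can be handed to each of them directly (the variable names are illustrative):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;squares_set = Set(k^2 for k = 1:8)        # a set built without a temporary array
squares_vec = collect(k^2 for k = 1:8)    # materialize an array only when you want one
total       = sum(k^2 for k = 1:8)        # reduce without materializing anything

# Generators accept the same trailing filter clause as comprehensions.
odd_total   = sum(k^2 for k = 1:8 if isodd(k))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Only the &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt; call produces an array; the other three consume the squares one at a time.&lt;/p&gt;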

&lt;h4 id=&quot;constructing-dictionaries&quot;&gt;Constructing dictionaries&lt;/h4&gt;

&lt;p&gt;Generators can be used to construct dictionaries too, and this use case deserves some special attention since it completes a multi-release process of putting user-defined dictionary types on an equal footing with the built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; type.
In Julia 0.3, the &lt;code class=&quot;highlighter-rouge&quot;&gt;=&amp;gt;&lt;/code&gt; operator only existed as part of syntax for constructing &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; objects:
&lt;code class=&quot;highlighter-rouge&quot;&gt;[k₁ =&amp;gt; v₁, k₂ =&amp;gt; v₂]&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;[k(i) =&amp;gt; v(i) for i = c]&lt;/code&gt;.
This design was based on other dynamic languages where dictionaries are among a small set of built-in types with special syntax that are deeply integrated into the language.
As Julia’s ecosystem has matured, however, it has become apparent that Julia is actually more like Java or C++ in this respect than it is like Python or Lua: the &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; type isn’t that special – it happens to be defined in the standard library, but is otherwise quite ordinary.
Many programs use other dictionary implementations: for example, the tree-based &lt;code class=&quot;highlighter-rouge&quot;&gt;SortedDict&lt;/code&gt; type, which sorts values by key, or &lt;code class=&quot;highlighter-rouge&quot;&gt;OrderedDict&lt;/code&gt;, which maintains keys in the order they are inserted.
Having special syntax only for &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; makes using other dictionary implementations problematic.
In 0.3, there was no good syntax for constructing values of these dictionaries – the best one could do was to invoke a constructor with an array of two-tuples:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;SortedDict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k₁&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v₁&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k₂&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v₂&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)])&lt;/span&gt;        &lt;span class=&quot;c&quot;&gt;# fixed-size dictionaries&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SortedDict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;   &lt;span class=&quot;c&quot;&gt;# dictionary comprehensions&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Not only are these constructions inconvenient and ugly, they’re also inefficient since they create temporary heap-allocated arrays of heap-allocated tuples of key-value pairs.
With much relief, we can now instead write:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;SortedDict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k₁&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v₁&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k₂&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v₂&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;          &lt;span class=&quot;c&quot;&gt;# fixed-size dictionaries, since 0.4&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SortedDict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;      &lt;span class=&quot;c&quot;&gt;# dictionary comprehensions, since 0.5&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This last syntax combines two orthogonal features introduced in 0.4 and 0.5, respectively:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;k =&amp;gt; v&lt;/code&gt; as a standalone syntax for a &lt;code class=&quot;highlighter-rouge&quot;&gt;Pair&lt;/code&gt; object, and&lt;/li&gt;
  &lt;li&gt;generator expressions, particularly to initialize collections.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; type is now constructed in exactly the same way:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;foo&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;bar&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;entries&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;bar&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;foo&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;*&quot;&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Dict&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;String&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;entries&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;**********&quot;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;***&quot;&lt;/span&gt;        &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;*******&quot;&lt;/span&gt;    &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;********&quot;&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;*&quot;&lt;/span&gt;          &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;**&quot;&lt;/span&gt;         &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;****&quot;&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;*********&quot;&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;*****&quot;&lt;/span&gt;      &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
  &lt;span class=&quot;s&quot;&gt;&quot;******&quot;&lt;/span&gt;     &lt;span class=&quot;o&quot;&gt;=&amp;gt;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This generalization makes the syntax for constructing a &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; slightly longer, but we feel that the increased consistency, ability to change dictionary implementations with a simple search-and-replace, and putting user-defined dictionary-like types on the same level as the built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; type make this change well worthwhile.&lt;/p&gt;
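
&lt;p&gt;The two ingredients can also be looked at separately. The following sketch, with illustrative variable names, first builds a &lt;code class=&quot;highlighter-rouge&quot;&gt;Pair&lt;/code&gt; on its own and then passes a generator of pairs to the &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt; constructor:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;p = &quot;foo&quot; =&amp;gt; 1                       # a Pair, with no Dict involved
p.first                              # the key half of the pair
p.second                             # the value half of the pair

pair_gen = (&quot;*&quot;^k =&amp;gt; k for k = 1:3)  # a generator that yields Pair objects
d = Dict(pair_gen)                   # the constructor consumes them one by one
d[&quot;**&quot;]                              # look up the value stored under &quot;**&quot;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Since nothing here is special to &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict&lt;/code&gt;, any dictionary type whose constructor accepts an iterable of pairs can be swapped in on the last two lines.&lt;/p&gt;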

&lt;h2 id=&quot;arrays&quot;&gt;Arrays&lt;/h2&gt;

&lt;p&gt;The 0.5 release was originally intended to include a large number of disruptive array changes, collectively dubbed “Arraymageddon”.
After much discussion, experimentation and benchmarking, this set of breaking changes was significantly reduced for a variety of reasons:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Some changes were deemed not to be good ideas after all;&lt;/li&gt;
  &lt;li&gt;Others were of unclear benefit, so it was decided to reconsider them in the future once there is more information to support a decision;&lt;/li&gt;
  &lt;li&gt;A few didn’t get implemented due to lack of developer time, including some cases where everyone agrees there’s a problem but there is not yet any complete design for a solution.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Although not many breaking changes happened in 0.5, this was a major release for Julia’s array infrastructure.
The code to implement various complex polymorphic indexing operations for generic arrays and array-like structures was majorly refactored, and in the process it shrank by 40% while becoming more complete, more general, and faster.
You can read more about the very cool things you can now do with array-like types in an excellent pair of blog posts published here earlier in the year: &lt;a href=&quot;/blog/2016/02/iteration&quot;&gt;&lt;em&gt;Multidimensional algorithms and iteration&lt;/em&gt;&lt;/a&gt; and &lt;a href=&quot;/blog/2016/03/arrays-iteration&quot;&gt;&lt;em&gt;Generalizing AbstractArrays&lt;/em&gt;&lt;/a&gt;.
In the next two subsections, I’ll go over some of the array changes that did happen in 0.5.&lt;/p&gt;

&lt;h3 id=&quot;dimension-sum-slices&quot;&gt;Dimension sum slices&lt;/h3&gt;

&lt;p&gt;The most significant breaking change in the 0.5 cycle affects multidimensional array slicing.
To explain it we’ll need a little terminology.
A &lt;em&gt;singleton dimension&lt;/em&gt; of a multidimensional array is a dimension whose size is 1.
For example, a 5x1 matrix has a trailing singleton dimension and may be called a “column matrix”, and a 1x5 matrix has a leading singleton dimension and may be called a “row matrix”.
A &lt;em&gt;scalar slice&lt;/em&gt; refers to a dimension in a multidimensional slice expression where the index is a scalar integer (considered to be zero-dimensional), rather than a 1-dimensional range or vector, or some higher-dimensional collection of indices.
For example, in &lt;code class=&quot;highlighter-rouge&quot;&gt;A[1,:]&lt;/code&gt; the first slice is scalar, the second is not; in &lt;code class=&quot;highlighter-rouge&quot;&gt;A[:,2]&lt;/code&gt; the second slice is scalar, the first is not; in &lt;code class=&quot;highlighter-rouge&quot;&gt;A[3,4]&lt;/code&gt; both slices are scalar.&lt;/p&gt;

&lt;p&gt;All previous versions of Julia have dropped trailing scalar slices when performing multidimensional array slicing.
That is, when an array was sliced with multiple indices, the resulting array had the number of dimensions of the original array minus the number of trailing scalar slices.
So when you sliced a column out of a matrix the result was a 1-dimensional vector, but when you sliced a row the result was a 2-dimensional row matrix:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VERSION&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;0.4.7&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# vector&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:]&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# row matrix&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This rule is handy for linear algebra since row and column slices have distinct types and different orientations, but its complexity, asymmetry, and lack of generality make it less than ideal for arrays as general purpose containers.
With more dimensions, the asymmetry of this behavior can be seen even in a single slice operation:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VERSION&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;0.4.7&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;^&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x4x2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;

&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;13&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;14&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;26&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;27&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;14&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;26&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The leading dimension of this slice is retained while the trailing dimension is discarded – even though both are scalar slices.
The result array is neither 3-dimensional like the original, nor 1-dimensional like the collective indexes (0 + 1 + 0); instead, it’s 2-dimensional – apropos of nothing.
Here, in another fairly similar slice, all dimensions are kept:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x1x2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;

&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;26&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;27&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;By comparison, the new slicing behavior in 0.5 is simple, systematic, and symmetrical.
(And not original by any means – APL pioneered this array slicing scheme in the 1960s.)
In Julia 0.5, when an array is sliced, the dimension of the result is the sum of the dimensions of the slices, and the dimension sizes of the result are the concatenation of the sizes of the slices.
Thus, row slices and column slices both produce vectors:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;VERSION&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;0.5.0&quot;&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# vector: 1 + 0 = 1&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:]&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# vector: 0 + 1 = 1&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
  &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Similarly, slicing a 3-dimensional array with scalars in all but one dimension also produces a vector:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# vector: 0 + 1 + 0 = 1&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;14&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;26&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The only example from above that doesn’t produce a vector is the last one:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:]&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# matrix: 1 + 0 + 1 = 2&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×2&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;25&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;26&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;27&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The result is a matrix: the leading and trailing slices are ranges, each contributing one dimension, while the middle slice is scalar and contributes none.
The 0.5 slicing behavior naturally generalizes to higher dimensional slices:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;I&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;;&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×3&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;J&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;J&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×3×1×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;19&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;18&lt;/span&gt;

&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;

&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;

&lt;span class=&quot;x&quot;&gt;[:,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;11&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here we have the following natural identity on dimensions:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;J&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;==&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;J&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;J&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In addition to being more systematic and symmetrical, this new behavior allows many complex indexing operations to be expressed concisely.&lt;/p&gt;
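
&lt;p&gt;As a quick sanity check of the rule, here is a minimal sketch that compares the size of a few slices from above with the concatenation of the sizes of their indices. The helpers &lt;code class=&quot;highlighter-rouge&quot;&gt;index_size&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;expected_size&lt;/code&gt; are names made up for this illustration and are not part of Julia; the sketch assumes only integer, colon, and array indices.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;M = [i + j^2 for i = 1:3, j = 1:4]
T = [i + j^2 + k^3 for i = 1:3, j = 1:4, k = 1:2]
I = [1 2 1; 1 3 2]
J = [4 2 1 3]

# Hypothetical helper (not a Julia built-in): the size contributed by a single
# index, given the length `n` of the axis it indexes into.
index_size(idx::Integer, n) = ()         # scalar slice: zero dimensions
index_size(idx::Colon, n)   = (n,)       # `:` selects the whole axis
index_size(idx, n)          = size(idx)  # ranges, vectors, matrices of indices

# Hypothetical helper: concatenate the sizes of all the indices.
function expected_size(A, idxs...)
    sz = ()
    for (d, idx) in enumerate(idxs)
        sz = (sz..., index_size(idx, size(A, d))...)
    end
    return sz
end

@assert size(M[1, :])    == expected_size(M, 1, :)     # (4,)
@assert size(T[2, :, 2]) == expected_size(T, 2, :, 2)  # (4,)
@assert size(T[:, 4, :]) == expected_size(T, :, 4, :)  # (3, 2)
@assert size(M[I, J])    == expected_size(M, I, J)     # (2, 3, 1, 4)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;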

&lt;p&gt;Although the change to multidimensional slicing behavior is a significant breaking change, it has caused surprisingly little havoc in the Julia package ecosystem.
It tends to primarily affect linear algebra code, and when code does break, it’s usually fairly clear what is broken and what needs to be done to fix it.
When updating your code, if you need to keep a dimension that is dropped under the new indexing behavior, you can write &lt;code class=&quot;highlighter-rouge&quot;&gt;M[1:1,:]&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,:]&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;×4&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;mi&quot;&gt;17&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Since integer range construction can be eliminated by Julia’s compiler, writing this is free but has the effect of keeping a dimension which would otherwise be dropped under the new rules.
Unfortunately, there’s no way to make this change without breaking some code – we apologize in advance for the inconvenience, and we hope you find the improvement to be worthwhile.&lt;/p&gt;
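
&lt;p&gt;To make the difference concrete, the sketch below rebuilds the integer matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;M&lt;/code&gt; from earlier and compares the shapes produced by a scalar index and by a one-element range. The variable names are chosen only for this example.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;M = [i + j^2 for i = 1:3, j = 1:4]

r_vec = M[1, :]    # scalar index: the first dimension is dropped
r_mat = M[1:1, :]  # one-element range: the first dimension is kept
c_mat = M[:, 2:2]  # the same trick keeps a trailing dimension

@assert size(r_vec) == (4,)
@assert size(r_mat) == (1, 4)
@assert size(c_mat) == (3, 1)
@assert vec(r_mat) == r_vec  # same elements, different shape&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;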

&lt;h3 id=&quot;array-views&quot;&gt;Array views&lt;/h3&gt;

&lt;p&gt;One of the major news items of 0.5 is a non-change:
array slices still create copies of array data.
There was a lot of discussion about changing the default behavior to creating views, motivated by the prospect of drastically improving performance in a variety of slow cases.
After many experiments and benchmarks, however, we decided against the change and kept the old behavior.
The conversation about this decision is long, so I’ll summarize the major points:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Slicing should either consistently produce views or copies.
Unpredictably doing one or the other depending on types – or worse still, on runtime values – would be a disaster for writing reliable, generic code.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Guaranteeing view semantics for all abstract arrays – especially sparse and custom array types – is hard and can be quite slow and/or expensive in general cases.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Even in the case of dense arrays with cheap array views, it’s not clear that views are always a performance win.
In some cases they definitely are, but in others the fact that a copied slice is contiguous and has optimal memory ordering for iteration overwhelms the benefit of not copying.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Copied slices are easier to reason about and less likely to lead to subtle bugs than views.
Views can lead to situations where someone modifies the view, not realizing that it’s a view, thereby unintentionally modifying the original array.
These kinds of bugs are hard to track down and even harder to notice.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;There is no clear transition or deprecation strategy.
Changing from copying slices to views would be a major compatibility issue.
We generally give programmers deprecation warnings when some behavior is going to break or change in the next release.
Sometimes we can’t do that, so we just bite the bullet and break code with an error. But changing slices to views wouldn’t break code with an error; it would just silently cause code to produce different, incorrect results.
There’s no clear way to make this transition safely.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Taken together, these points make a compelling case against changing the default slicing behavior to return views.
That said, even if they’re not the default, views are a crucial tool for performance in some situations.
Accordingly, a huge amount of work went into improving the ergonomics of views in 0.5, including:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Renaming the function for view construction from “&lt;code class=&quot;highlighter-rouge&quot;&gt;sub&lt;/code&gt;” to “&lt;code class=&quot;highlighter-rouge&quot;&gt;view&lt;/code&gt;”, which seems like a much better name.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Array views now support all forms of indexing supported by arrays.
Previously, views did not support some of the more complex forms of array indexing.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;@view&lt;/code&gt; macro was introduced, allowing the use of natural slicing syntax for views.
In other words you can now write &lt;code class=&quot;highlighter-rouge&quot;&gt;@view A[1:end-1,2:end]&lt;/code&gt; instead of &lt;code class=&quot;highlighter-rouge&quot;&gt;view(A, 1:size(A,1)-1, 2:size(A,2))&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
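
&lt;p&gt;To make the difference between copied slices and views concrete, here is a minimal sketch. The array &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; and the variable names are illustrative only; the snippet uses just &lt;code class=&quot;highlighter-rouge&quot;&gt;view&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;@view&lt;/code&gt; as described above.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;A = collect(reshape(1:12, 3, 4))   # 3x4 integer matrix, filled column by column

s = A[1:2, 2:3]                    # plain slicing makes a copy
s[1, 1] = -1                       # modifies only the copy
A[1, 2]                            # still 4

v = view(A, 1:2, 2:3)              # a view shares memory with A
v[1, 1] = -1                       # writes through to the parent array
A[1, 2]                            # now -1

# The macro form allows `end` inside the index expression:
w = @view A[1:end-1, 2:end]
w == view(A, 1:size(A, 1)-1, 2:size(A, 2))   # true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The write through &lt;code class=&quot;highlighter-rouge&quot;&gt;v&lt;/code&gt; is exactly the kind of aliasing that makes views powerful for in-place algorithms and easy to misuse when one expects a copy.&lt;/p&gt;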

&lt;p&gt;Since views are such an important tool both for performance and for expressing complex mutating operations on arrays (especially with higher order functions), we may introduce a special syntax for view slices in the future.
In particular, the syntax &lt;code class=&quot;highlighter-rouge&quot;&gt;A@[I...]&lt;/code&gt; had a fair amount of popular support.
Stay tuned!&lt;/p&gt;

&lt;h2 id=&quot;and-more&quot;&gt;And more…&lt;/h2&gt;

&lt;p&gt;This is far from the full extent of the improvements introduced in Julia 0.5, but this blog post is already getting quite long, so I’ll just summarize a few of the other big ticket items:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The set of string types and operations has been significantly simplified and streamlined.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;ASCIIString&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;UTF8String&lt;/code&gt; types have been merged into a single &lt;code class=&quot;highlighter-rouge&quot;&gt;String&lt;/code&gt; type, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;UTF16String&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;UTF32String&lt;/code&gt; and related functions have been moved into the &lt;a href=&quot;https://github.com/JuliaArchive/LegacyStrings.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;LegacyStrings&lt;/code&gt;&lt;/a&gt; package, which keeps the same implementations as 0.4.
In the future, better support for different string encodings will be developed under the &lt;a href=&quot;https://github.com/nalimilan/StringEncodings.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;StringEncodings&lt;/code&gt;&lt;/a&gt; package.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Most functionality related to prime generation, primality checking and combinatorics has been moved into two external packages: &lt;a href=&quot;https://github.com/JuliaMath/Primes.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Primes&lt;/code&gt;&lt;/a&gt; and &lt;a href=&quot;https://github.com/JuliaMath/Combinatorics.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Combinatorics&lt;/code&gt;&lt;/a&gt;.
To use these functions, you’ll need to install these packages and do &lt;code class=&quot;highlighter-rouge&quot;&gt;using Primes&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;using Combinatorics&lt;/code&gt; as necessary.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Julia’s LLVM version was upgraded from 3.3 to 3.7.1.
This may not seem like a big deal, but the transition required herculean effort by many core Julia contributors.
For a series of different and impossibly annoying reasons, LLVM versions 3.4, 3.5 and 3.6 were not usable for Julia, so we’re very happy to be back to using current versions of our favorite compiler framework.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Support for compiling and running on &lt;a href=&quot;https://en.wikipedia.org/wiki/ARM_architecture&quot;&gt;ARM&lt;/a&gt; chips is much improved since 0.4.
Julia 0.5 also introduced initial support for &lt;a href=&quot;https://en.wikipedia.org/wiki/Power_Architecture&quot;&gt;Power&lt;/a&gt; systems, a development which has been supported and driven by IBM.
We will be expanding and improving support for many architectures going forward.
With support for ARM and Power, Julia is already a productive platform for technical computing from embedded systems to big iron.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The 0.5 release has experimental multithreading support.
This isn’t ready for production usage, but it’s fun to play around with and you can already get impressive performance gains – scalability is a key focus.
Julia’s threading provides true concurrent execution like C++, Go or Java:
different threads can do work at the same time, up to the number of physical cores available.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Interactive debugging support has been a weak spot in the Julia ecosystem for some time, but not any more.
On a vanilla build of Julia 0.5, you can install the &lt;a href=&quot;https://github.com/Keno/Gallium.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Gallium&lt;/code&gt;&lt;/a&gt; package to get a full-fledged, high-performance debugger:
set breakpoints, step through code, examine variables, and inspect stack frames.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I hope you’ve enjoyed this overview of highlights from the new release of Julia, and that you enjoy the release itself even more.
Julia 0.5 is easily the strongest release to date, but of course the next one will be even better :)&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Julia 0.5 Release Announcement</title>
   <link href="http://julialang.org/blog/2016/10/julia-0.5-release"/>
   <updated>2016-10-10T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/10/julia-0.5-release</id>
   <content type="html">&lt;p&gt;After over a year of development, the Julia community is proud to announce
the release of version 0.5 of the Julia language and standard library.
This release contains major language refinements and numerous standard library improvements.
A long list of changes is available in the &lt;a href=&quot;https://github.com/JuliaLang/julia/blob/release-0.5/NEWS.md#julia-v050-release-notes&quot;&gt;NEWS log&lt;/a&gt; found in our main repository, with a summary reproduced below.
A separate blog post detailing some of the &lt;a href=&quot;/blog/2016/10/julia-0.5-highlights&quot;&gt;highlights of the new release&lt;/a&gt; has also been posted.&lt;/p&gt;

&lt;p&gt;We’ll be releasing regular bugfix backports from the 0.5.x line, which is recommended for users requiring a stable language and API.
Major feature work is ongoing on master for 0.6-dev.&lt;/p&gt;

&lt;p&gt;The Julia ecosystem continues to grow, and there are now &lt;a href=&quot;http://pkg.julialang.org/pulse.html&quot;&gt;over one thousand&lt;/a&gt; registered packages!
The third annual &lt;a href=&quot;http://juliacon.org/&quot;&gt;JuliaCon&lt;/a&gt; took place in Cambridge, MA in the &lt;a href=&quot;http://julialang.org/blog/2016/09/juliacon2016&quot;&gt;summer of 2016&lt;/a&gt;, with an exciting line up of talks and keynotes.
Most of them are &lt;a href=&quot;https://www.youtube.com/playlist?list=PLP8iPy9hna6SQPwZUDtAM59-wPzCPyD_S&quot;&gt;available to view&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Binaries are available from the &lt;a href=&quot;http://julialang.org/downloads/&quot;&gt;main download page&lt;/a&gt;, or you can visit &lt;a href=&quot;https://juliabox.com/&quot;&gt;JuliaBox&lt;/a&gt; to try this release from the comfort of your browser. Happy Coding!&lt;/p&gt;

&lt;h3 id=&quot;notable-compiler-and-language-changes&quot;&gt;Notable compiler and language changes:&lt;/h3&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;The major focus of this release has been the ability to write fast functional code, removing the earlier performance penalty for anonymous functions and closures.
This has been achieved by giving each function and closure its own type, with the captured variables of a closure stored as fields of that type.
All functions, including anonymous functions, are now generic and support all features.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Experimental support for &lt;a href=&quot;http://docs.julialang.org/en/latest/manual/parallel-computing/#multi-threading-experimental&quot;&gt;multi threading&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;All dimensions indexed by &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/13612&quot;&gt;scalars are now dropped&lt;/a&gt;, whereas previously only trailing scalar dimensions would be omitted from the result.
This is a major breaking change, but it makes the indexing rules much more consistent.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/4470&quot;&gt;Generator expressions&lt;/a&gt; now can create iterators that are computed only on demand.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Experimental support for &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/16260&quot;&gt;arrays whose indexing&lt;/a&gt; starts from values other than 1. Standard Julia arrays are still 1-based, but external packages can implement array types with indexing from arbitrary indices.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Major simplification of the &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/16107&quot;&gt;string types&lt;/a&gt;, unifying &lt;code class=&quot;highlighter-rouge&quot;&gt;ASCIIString&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;UTF8String&lt;/code&gt; as &lt;code class=&quot;highlighter-rouge&quot;&gt;String&lt;/code&gt;, as well as moving types and functions related to different encodings out of the standard library.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/11196&quot;&gt;Package operations&lt;/a&gt; now use the &lt;code class=&quot;highlighter-rouge&quot;&gt;libgit2&lt;/code&gt; library rather than shelling out to command line git. This makes package-related functions much faster and more reliable, especially on Windows.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;And &lt;a href=&quot;https://github.com/JuliaLang/julia/blob/release-0.5/NEWS.md#julia-v050-release-notes&quot;&gt;many many more&lt;/a&gt; changes and improvements…&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
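
&lt;p&gt;Several of the items above can be tried directly at the REPL. The following is a small sketch, written in the syntax of recent Julia releases; the names &lt;code class=&quot;highlighter-rouge&quot;&gt;scale&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; are illustrative, and the exact field layout of a closure type is an implementation detail rather than a documented guarantee.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Each closure has its own type; captured variables are stored as fields.
scale = let a = 2
    x -&amp;gt; a * x
end
scale(21)                      # 42
fieldnames(typeof(scale))      # (:a,)

# A generator expression produces its elements only on demand.
sum(x^2 for x in 1:10)         # 385

# Every dimension indexed by a scalar is dropped from the result.
A = collect(reshape(1:12, 3, 4))
size(A[1, :])                  # (4,)
size(A[1:1, :])                # (1, 4), a range index keeps the dimension
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;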

&lt;h3 id=&quot;ports&quot;&gt;Ports&lt;/h3&gt;

&lt;p&gt;Julia now runs on the ARM and Power architectures, making it possible to use it on the widest variety of hardware, from the smallest embedded machines to the largest HPC systems. Porting a language to a new architecture is never easy, so special thanks to the people who made it possible. Part of the work to create the Power port was supported by IBM, for which we are grateful.&lt;/p&gt;

&lt;h3 id=&quot;developing-with-julia&quot;&gt;Developing with Julia&lt;/h3&gt;

&lt;p&gt;The Julia debugger, &lt;a href=&quot;https://github.com/Keno/Gallium.jl&quot;&gt;Gallium&lt;/a&gt;, is now ready to use. It offers a full multi-language debugging experience, handling both Julia and C code with ease. The debugger is also integrated with &lt;a href=&quot;http://junolab.org&quot;&gt;Juno&lt;/a&gt;, the Julia IDE that is now fully featured and ready to use.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>StructuredQueries.jl - A generic data manipulation framework</title>
   <link href="http://julialang.org/blog/2016/10/StructuredQueries"/>
   <updated>2016-10-03T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/10/StructuredQueries</id>
   <content type="html">&lt;p&gt;This post describes my work conducted this summer at the &lt;a href=&quot;http://julia.mit.edu/&quot;&gt;Julia Lab&lt;/a&gt; to develop &lt;a href=&quot;https://github.com/davidagold/StructuredQueries.jl/&quot;&gt;StructuredQueries.jl&lt;/a&gt;, a generic data manipulation framework for &lt;a href=&quot;http://julialang.org/&quot;&gt;Julia&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Our initial vision for this work was much inspired by Hadley Wickham’s &lt;a href=&quot;https://github.com/hadley/dplyr&quot;&gt;dplyr&lt;/a&gt; R package, which provides data manipulation verbs that are generic over in-memory R tabular data structures and SQL databases, and &lt;a href=&quot;https://github.com/JuliaStats/DataFramesMeta.jl&quot;&gt;DataFramesMeta&lt;/a&gt; (begun by &lt;a href=&quot;https://github.com/tshort&quot;&gt;Tom Short&lt;/a&gt;), which provides metaprogramming facilities for working with Julia &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;s.&lt;/p&gt;

&lt;p&gt;While a generic querying interface is a worthwhile end in itself (and has been discussed &lt;a href=&quot;https://groups.google.com/d/topic/julia-dev/jL2FSL4EneE/discussion&quot;&gt;elsewhere&lt;/a&gt;), it may also be useful for solving problems specific to in-memory Julia tabular data structures. We will discuss how a query interface suggests solutions to two important problems facing the development of tabular data structures in Julia: the &lt;em&gt;column-indexing&lt;/em&gt; and &lt;em&gt;nullable semantics&lt;/em&gt; problems. The present post will therefore both describe the progress of my work and discuss a wider scope of issues concerning support for tabular data structures in Julia. I will provide some context for these issues; the reader should feel free to skip over any uninteresting details.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;Recall that the primary shortcoming of &lt;a href=&quot;https://travis-ci.org/JuliaStats/DataArrays.jl&quot;&gt;DataArrays.jl&lt;/a&gt; is that it does not allow for type-inferable indexing. That is, the means by which missing values are represented in &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt;s – i.e. with a token &lt;code class=&quot;highlighter-rouge&quot;&gt;NA::NAtype&lt;/code&gt; object – entails that the most specific return type inferable from &lt;code class=&quot;highlighter-rouge&quot;&gt;Base.getindex(df::DataArray{T}, i)&lt;/code&gt; is &lt;code class=&quot;highlighter-rouge&quot;&gt;Union{T, NAtype}&lt;/code&gt;. This means that until Julia’s compiler can better handle small &lt;code class=&quot;highlighter-rouge&quot;&gt;Union&lt;/code&gt; types, code that naively indexes into a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; will perform unnecessarily poorly.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/JuliaStats/NullableArrays.jl&quot;&gt;NullableArrays.jl&lt;/a&gt; &lt;a href=&quot;http://julialang.org/blog/2015/10/nullablearrays&quot;&gt;remedied&lt;/a&gt; this shortcoming by representing both missing and present values of type &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt; as objects of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{T}&lt;/code&gt;. However, this solution has limitations in other respects. First, use of &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s does nothing to support type inference in column-indexing of &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;s. That is, the return type of &lt;code class=&quot;highlighter-rouge&quot;&gt;Base.getindex(df::DataFrame, field::Symbol)&lt;/code&gt; is not straightforwardly inferable, even if &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;s are built over &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s. Call this first problem the &lt;em&gt;column-indexing problem&lt;/em&gt;. Second, NullableArrays introduces certain difficulties centered around the &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; type. Call this second problem the &lt;em&gt;nullable semantics problem&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;The column-indexing problem is &lt;a href=&quot;http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/&quot;&gt;well-documented&lt;/a&gt;. To see the difficulty, consider the following function&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;where &lt;code class=&quot;highlighter-rouge&quot;&gt;df[:A]&lt;/code&gt; retrieves the column named &lt;code class=&quot;highlighter-rouge&quot;&gt;:A&lt;/code&gt; from &lt;code class=&quot;highlighter-rouge&quot;&gt;df&lt;/code&gt;. A user might reasonably expect the above to be idiomatic Julia: the work is written in a &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; loop that is wrapped inside a function. However, this code will not be (ahead-of-time) compiled to efficient machine instructions because the type of the object that &lt;code class=&quot;highlighter-rouge&quot;&gt;df[:A]&lt;/code&gt; returns cannot be inferred during static analysis. This is because there is nothing the &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt; type can do to communicate the &lt;code class=&quot;highlighter-rouge&quot;&gt;eltype&lt;/code&gt;s of its columns to the compiler.&lt;/p&gt;
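
&lt;p&gt;The inference failure can be reproduced without DataFrames at all. The sketch below uses a hypothetical &lt;code class=&quot;highlighter-rouge&quot;&gt;ToyFrame&lt;/code&gt; type (not part of any package) whose columns live in a &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict{Symbol,Any}&lt;/code&gt;, which is only an assumption about how such a container might be laid out. It is written in the syntax of recent Julia releases and inspects what inference concludes via &lt;code class=&quot;highlighter-rouge&quot;&gt;Base.return_types&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical stand-in for a DataFrame: the element types of the
# columns are not part of the container type.
struct ToyFrame
    columns::Dict{Symbol,Any}
end
Base.getindex(df::ToyFrame, name::Symbol) = df.columns[name]

# The same loop as in f above, written against an arbitrary vector.
function colsum(A::AbstractVector)
    x = zero(eltype(A))
    for i in eachindex(A)
        x += A[i]
    end
    return x
end

df = ToyFrame(Dict{Symbol,Any}(:A =&amp;gt; [1.0, 2.0, 3.0]))

# The column lookup is inferred as Any, so a loop written directly
# after it cannot be specialized on the element type...
Base.return_types(getindex, (ToyFrame, Symbol))

# ...whereas the loop is fully typed once it receives a concrete column.
Base.return_types(colsum, (Vector{Float64},))

colsum(df[:A])   # 6.0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Passing the column to a separate function, as in the last line, is a common way to limit the cost to a single dynamic dispatch, but it leaves the lookup itself untyped.&lt;/p&gt;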

&lt;p&gt;The nullable semantics problem is described throughout a dispersed series of GitHub issues (the interested reader can start &lt;a href=&quot;https://github.com/JuliaStats/NullableArrays.jl/issues/95&quot;&gt;here&lt;/a&gt; and &lt;a href=&quot;https://github.com/JuliaStats/NullableArrays.jl/pull/85&quot;&gt;here&lt;/a&gt;) and in &lt;a href=&quot;https://groups.google.com/d/topic/julia-dev/WD7-vQeweJE/discussion&quot;&gt;at least one&lt;/a&gt; mailing list post. To my knowledge, a self-contained treatment has not been given (I don’t necessarily claim to be giving one now). The problem has two parts, which I’ll call the “easy question” and the “hard question”, respectively:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;What should the semantics of &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::Nullable{T})&lt;/code&gt; be given a definition of &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::T)&lt;/code&gt;?&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;How should we implement these semantics in a sufficiently general and user-friendly way?&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In most cases, the answer to the easy question is clear: &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::Nullable{T})&lt;/code&gt; should return an empty &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{U}&lt;/code&gt; if &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; is null and &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable(f(x.value))&lt;/code&gt; if &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; is not null. There is a question of how to choose the type parameter &lt;code class=&quot;highlighter-rouge&quot;&gt;U&lt;/code&gt;, but a solution involving Julia’s type inference facilities seems to be about right. (The discussion of &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/16622&quot;&gt;0.5-style comprehensions&lt;/a&gt; and &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/7258&quot;&gt;one&lt;/a&gt; or &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/11034&quot;&gt;two&lt;/a&gt; discussions about the return type of &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; over an empty array were all influential on this matter.) We will refer to these semantics as the &lt;em&gt;standard lifting semantics&lt;/em&gt;. It is worth noting that there is at least one notable alternative to standard lifting semantics, at least in the realm of binary operators on &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{Bool}&lt;/code&gt; arguments: &lt;a href=&quot;https://en.wikipedia.org/wiki/Three-valued_logic&quot;&gt;three-valued logic&lt;/a&gt;. But whether to use three-valued logic or standard lifting semantics is usually clear from the context of the program and the intention of the programmer.&lt;/p&gt;
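
&lt;p&gt;As a rough illustration of standard lifting semantics, consider the following sketch. It is written for recent Julia releases, where &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; is no longer part of Base, so it uses a small hypothetical &lt;code class=&quot;highlighter-rouge&quot;&gt;Opt{T}&lt;/code&gt; type in its place. The helper name &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; is likewise an assumption, and the use of &lt;code class=&quot;highlighter-rouge&quot;&gt;Base.promote_op&lt;/code&gt; (an internal, inference-dependent function) is just one possible way to choose &lt;code class=&quot;highlighter-rouge&quot;&gt;U&lt;/code&gt;.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical stand-in for Nullable{T}: a flag plus a possibly unset value.
struct Opt{T}
    hasvalue::Bool
    value::T
    Opt{T}() where {T} = new{T}(false)        # empty; `value` is left unset
    Opt{T}(v) where {T} = new{T}(true, v)     # present
end
Opt(v::T) where {T} = Opt{T}(v)

# Standard lifting: apply f to the wrapped value if there is one,
# otherwise return an empty Opt{U}, with U taken from type inference.
function lift(f, x::Opt{T}) where {T}
    U = Base.promote_op(f, T)                 # inferred result type of f(::T)
    return x.hasvalue ? Opt{U}(f(x.value)) : Opt{U}()
end

present = lift(sqrt, Opt(4.0))
present.hasvalue        # true
present.value           # 2.0

empty = lift(sqrt, Opt{Float64}())
empty.hasvalue          # false
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because &lt;code class=&quot;highlighter-rouge&quot;&gt;U&lt;/code&gt; comes from inference, the empty and non-empty branches return the same wrapper type whenever inference produces a concrete result.&lt;/p&gt;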

&lt;p&gt;On the other hand, the hard question is still unresolved. There are a number of possible solutions, and it’s difficult to know how to weigh their costs and benefits.&lt;/p&gt;

&lt;p&gt;We’ll return to the column-indexing problem and the hard question of nullable semantics after we’ve described the present query interface. Before we dive in, I want to emphasize that this blog post is a status update, not a release notice (though StructuredQueries is registered so that you can play with it if you like). StructuredQueries (SQ) is a work in progress, and it will likely remain that way for some time. I hope to convince the reader that SQ nonetheless represents an interesting and worthwhile direction for the development of tabular data facilities in Julia.&lt;/p&gt;

&lt;h2 id=&quot;the-query-framework&quot;&gt;The query framework&lt;/h2&gt;

&lt;p&gt;The StructuredQueries package provides a framework for representing the &lt;em&gt;structure&lt;/em&gt; of a query without assuming any specific corresponding &lt;em&gt;semantics&lt;/em&gt;. By the structure of a query, we mean the series of particular manipulation verbs invoked and the respective arguments passed to these verbs. By the semantics of a query, we mean the actual behavior of executing a query with a particular structure against a particular data source. A query semantics thus depends both on the structure of the query and on the type of the data source against which the query is executed. We will refer to the implementation of a particular query semantics as a &lt;em&gt;collection machinery&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Decoupling the representation of a query’s structure from the collection machinery helps to make the present query framework&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;generic – the framework should be able to support multiple backends.&lt;/li&gt;
  &lt;li&gt;modular – the framework should encourage modularity of collection machinery.&lt;/li&gt;
  &lt;li&gt;extensible – the framework should be easily extensible to represent (relatively) arbitrary manipulations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These desiderata are interrelated. For instance, modularity of collection machinery allows the latter to be re-used in support for different data backends, thereby supporting generality as well.&lt;/p&gt;

&lt;p&gt;In this section we’ll describe how SQ represents query structures. In the following sections we’ll see how SQ’s query representation framework suggests solutions to the column-indexing and nullable semantics problems described above.&lt;/p&gt;

&lt;p&gt;To express a query in SQ, one uses the &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; macro:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;qry&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;where &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt; is Julia code that follows a certain structure that we will describe below. &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt; is parsed according to what we’ll call a &lt;em&gt;query context&lt;/em&gt;. By a &lt;em&gt;context&lt;/em&gt; we mean a general semantics for Julia code that may differ from the semantics of the standard Julia environment. That is to say: though &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt; must be valid Julia syntax, the code is not run as it would were it executed outside of the &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; macro. Rather, code such as &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt; that occurs inside of a query context is subject to a number of transformations before it is run. &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; uses these transformations to produce a graphical representation of the structure of &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt;. An &lt;code class=&quot;highlighter-rouge&quot;&gt;@query qry&lt;/code&gt; invocation returns a &lt;code class=&quot;highlighter-rouge&quot;&gt;Query&lt;/code&gt; object, which wraps the query graph produced as a result of processing &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;We said above that SQ represents queries in terms of their structure but does not itself guarantee any particular semantics. This allows packages to implement their own semantics for a given query structure. To demonstrate this design, I’ve put together (i) an &lt;a href=&quot;https://github.com/davidagold/AbstractTables.jl&quot;&gt;abstract tabular data type&lt;/a&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractTable&lt;/code&gt;; (ii) an &lt;a href=&quot;https://github.com/davidagold/AbstractTables.jl#column-indexable-interface&quot;&gt;interface&lt;/a&gt; to support a collection machinery against what I call &lt;em&gt;column-indexable&lt;/em&gt; types &lt;code class=&quot;highlighter-rouge&quot;&gt;T &amp;lt;: AbstractTable&lt;/code&gt;; and (iii) a &lt;a href=&quot;https://github.com/davidagold/TablesDemo.jl&quot;&gt;concrete tabular data type&lt;/a&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;Table &amp;lt;: AbstractTable&lt;/code&gt; that satisfies the column-indexable interface and therefore inherits a collection machinery to support SQ queries.&lt;/p&gt;

&lt;p&gt;The following behavior mimics that which one would expect from querying against a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;. The main reason for putting together a demonstration using &lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt;s and not &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;s has to do with ease of experimentation. I can modify the &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractTable&lt;/code&gt;/&lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt; types and interfaces more easily than I can the &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt; type and interface. Indeed, this project has become just as much about designing an in-memory Julia tabular data type that is most compatible with a Julia query framework as it is about designing a query framework compatible with an in-memory Julia tabular data type. Fortunately, the implementation of backend support for &lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt;s will be straightforward to port to support for &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;s once we decide where such support should live.&lt;/p&gt;

&lt;p&gt;Let’s dive into the query interface by considering examples using the iris data set. (Though the package TablesDemo.jl is intended solely as a demonstration, it is registered so that readers can easily install it with &lt;code class=&quot;highlighter-rouge&quot;&gt;Pkg.add(&quot;TablesDemo.jl&quot;)&lt;/code&gt; and follow along.)&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CSV&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Source&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;joinpath&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Pkg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dir&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;Tables&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;csv/iris.csv&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sepal_width&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;petal_length&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;petal_width&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;species&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;├─────┼──────────────┼─────────────┼──────────────┼─────────────┼──────────┤&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.1&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.5&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;4.9&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.0&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;4.7&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.3&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;4.6&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.1&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.5&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.6&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.9&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.7&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.4&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;4.6&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.4&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.3&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.4&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.5&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;4.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;2.9&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.4&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;4.9&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;3.1&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.5&lt;/span&gt;          &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.1&lt;/span&gt;         &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;⋮&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;140&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;more&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We can then use &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; to express a query against this data set – say, filtering rows according to a condition on &lt;code class=&quot;highlighter-rouge&quot;&gt;sepal_length&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This produces a &lt;code class=&quot;highlighter-rouge&quot;&gt;Query{S}&lt;/code&gt; object, where &lt;code class=&quot;highlighter-rouge&quot;&gt;S&lt;/code&gt; is the type of the data source:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;typeof&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;StructuredQueries&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Query&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The structure of the query passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; consists of a &lt;em&gt;manipulation verb&lt;/em&gt; (e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt;) that in turn takes a &lt;em&gt;data source&lt;/em&gt; (e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;iris&lt;/code&gt;) as its first argument and any number of &lt;em&gt;query arguments&lt;/em&gt; (e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;sepal_length &amp;gt; 5.0&lt;/code&gt;) as its remaining arguments. These are the three different “parts” of a query: (1) data sources (or just “sources”), (2) manipulation verbs (or just “verbs”), and (3) query arguments.&lt;/p&gt;

&lt;p&gt;Each part of a query induces its own context in which code is evaluated. The most significant aspect of such contexts is name resolution. That is to say, names resolve differently depending on which part of a query they appear in and in what capacity they appear:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;In a data source specification context – e.g., as the first argument to a verb such as &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt; above – names are evaluated in the enclosing scope of the &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; invocation. Thus, &lt;code class=&quot;highlighter-rouge&quot;&gt;iris&lt;/code&gt; in the query used to define &lt;code class=&quot;highlighter-rouge&quot;&gt;q&lt;/code&gt; above refers precisely to the &lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt; object to which the name is bound in the top level of &lt;code class=&quot;highlighter-rouge&quot;&gt;Main&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Names of manipulation verbs are not resolved to objects but rather merely signal how to construct the graphical representation of the query. (Indeed, in what follows there is no such function &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt; that is ever invoked in the execution of a query involving a &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt; clause.)&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Names of functions called within a query argument context, such as &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;gt;&lt;/code&gt; in &lt;code class=&quot;highlighter-rouge&quot;&gt;sepal_length &amp;gt; 5.0&lt;/code&gt;, are evaluated in the enclosing scope of the &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; invocation.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Names that appear as arguments to function calls within a query argument context, such as &lt;code class=&quot;highlighter-rouge&quot;&gt;sepal_length&lt;/code&gt; in &lt;code class=&quot;highlighter-rouge&quot;&gt;sepal_length &amp;gt; 5.0&lt;/code&gt;, are not resolved to objects but are rather parsed as “attributes” of the data source (in this case, &lt;code class=&quot;highlighter-rouge&quot;&gt;iris&lt;/code&gt;). When the data source is a tabular data structure, such attributes are taken to be column names, but such behavior is just a feature of a particular query semantics (see the section “Roadmap and open questions” below). The attributes that are passed as arguments to a given function call in a query argument are stored as data in the graphical query representation.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;
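
&lt;p&gt;The fourth rule can be sketched in plain Julia. The helper below is hypothetical and is not part of StructuredQueries; it assumes a query argument arrives as a quoted expression built from ordinary function calls. It collects the names in argument position (the would-be attributes) and skips the function names, which would instead resolve in the enclosing scope:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper for illustration; not part of StructuredQueries.
# Returns the names in `ex` that sit in argument position of a call.
function attributes(ex)
    found = Symbol[]
    collect_attributes!(found, ex)
    return found
end

# A bare name in argument position is treated as an attribute.
collect_attributes!(found, s::Symbol) = push!(found, s)

# Literals such as 5.0 carry no names.
collect_attributes!(found, x) = found

# For a call, skip the function name (args[1]) and recurse into the arguments.
# Expression forms other than plain calls are ignored in this sketch.
function collect_attributes!(found, ex::Expr)
    if ex.head == :call
        for arg in ex.args[2:end]
            collect_attributes!(found, arg)
        end
    end
    return found
end

attributes(:(sepal_length &amp;gt; 5.0))   # collects :sepal_length; the function name &amp;gt; is skipped
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A real implementation would presumably also handle forms such as the &lt;code class=&quot;highlighter-rouge&quot;&gt;avg = ...&lt;/code&gt; argument to &lt;code class=&quot;highlighter-rouge&quot;&gt;summarize&lt;/code&gt;, which this sketch leaves out for brevity.&lt;/p&gt;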

&lt;p&gt;One can pipe arguments to verbs inside an &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; context. For instance, the &lt;code class=&quot;highlighter-rouge&quot;&gt;Query&lt;/code&gt; above is equivalent to that produced by&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In this case, the data source specification (&lt;code class=&quot;highlighter-rouge&quot;&gt;iris&lt;/code&gt;) is the first argument to &lt;code class=&quot;highlighter-rouge&quot;&gt;|&amp;gt;&lt;/code&gt; rather than to the verb &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt;, whose first argument is now a query argument (&lt;code class=&quot;highlighter-rouge&quot;&gt;sepal_length &amp;gt; 5.0&lt;/code&gt;).&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Query&lt;/code&gt; objects represent the structure of a query composed of the three building blocks above. To see how, let’s take a look at the internals of a &lt;code class=&quot;highlighter-rouge&quot;&gt;Query&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;fieldnames&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;
 &lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The first field, &lt;code class=&quot;highlighter-rouge&quot;&gt;:source&lt;/code&gt;, just contains the data source specified in the query – in this case, the &lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt; object that was bound to the name &lt;code class=&quot;highlighter-rouge&quot;&gt;iris&lt;/code&gt; when the query was specified. The second field, &lt;code class=&quot;highlighter-rouge&quot;&gt;:graph&lt;/code&gt;, contains a(n admittedly not very interesting) graphical representation of the query structure:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;FilterNode&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;arguments&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;inputs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;DataNode&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;unset&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We see that the &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt; verb from the original &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt; expression passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; is represented in the graph by a &lt;code class=&quot;highlighter-rouge&quot;&gt;FilterNode&lt;/code&gt; object and that the data source is represented by a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataNode&lt;/code&gt; object. Both &lt;code class=&quot;highlighter-rouge&quot;&gt;FilterNode&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;DataNode&lt;/code&gt; are leaf subtypes of the abstract &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; type. The &lt;code class=&quot;highlighter-rouge&quot;&gt;FilterNode&lt;/code&gt; is connected to the &lt;code class=&quot;highlighter-rouge&quot;&gt;DataNode&lt;/code&gt; via the &lt;code class=&quot;highlighter-rouge&quot;&gt;:input&lt;/code&gt; field of the former. In general, these connections constitute directed acyclic graphs. We may refer to such graphs as &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; graphs or query graphs.&lt;/p&gt;
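
&lt;p&gt;As a rough illustration of such a graph, here is a toy version with made-up types. The names &lt;code class=&quot;highlighter-rouge&quot;&gt;ToyNode&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;ToyData&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;ToyFilter&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;chainlength&lt;/code&gt; are assumptions for this sketch and do not reflect the actual definitions in StructuredQueries; only the idea of an &lt;code class=&quot;highlighter-rouge&quot;&gt;input&lt;/code&gt; field linking a verb node to its upstream node is taken from the description above. The sketch assumes a recent Julia release that provides the &lt;code class=&quot;highlighter-rouge&quot;&gt;struct&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;abstract type&lt;/code&gt; keywords:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Toy stand-ins for illustration only; the real node types may differ.
abstract type ToyNode end

# Leaf of the graph: stands in for the (possibly unset) data source.
struct ToyData &amp;lt;: ToyNode end

# A verb node: points at its upstream node and keeps its quoted arguments.
struct ToyFilter &amp;lt;: ToyNode
    input::ToyNode
    args::Vector{Any}
end

# Walk the chain from a node down to its data leaf, counting nodes.
chainlength(::ToyData) = 1
chainlength(node::ToyFilter) = 1 + chainlength(node.input)

g = ToyFilter(ToyData(), Any[:(sepal_length &amp;gt; 5.0)])
chainlength(g)   # 2: one filter node plus one data node
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;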

&lt;p&gt;SQ currently recognizes the following verbs out of the box – that is, it properly incorporates them into a &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; graph:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;select&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;groupby&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;summarize&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;orderby&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;innerjoin&lt;/code&gt; (or just &lt;code class=&quot;highlighter-rouge&quot;&gt;join&lt;/code&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;leftjoin&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;outerjoin&lt;/code&gt;&lt;/li&gt;
  &lt;li&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;crossjoin&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;One uses &lt;code class=&quot;highlighter-rouge&quot;&gt;collect(q::Query)&lt;/code&gt; to materialize &lt;code class=&quot;highlighter-rouge&quot;&gt;q&lt;/code&gt; as a concrete results set – hence the term “collection machinery”. Note that the set of verbs that receive support from the column-indexable interface – that is, the verbs that may be &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt;ed against a column-indexable data source – currently only includes the first four: &lt;code class=&quot;highlighter-rouge&quot;&gt;select&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;filter&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;groupby&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;summarize&lt;/code&gt;. This is what such support currently looks like:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;groupby&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;species&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;summarize&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;avg&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digamma&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_width&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;graph&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;SummarizeNode&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;arguments&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;avg&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;=&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;mean&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;digamma&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_width&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
  &lt;span class=&quot;n&quot;&gt;inputs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
      &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;GroupbyNode&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;arguments&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;species&lt;/span&gt;
                &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;inputs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
                &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;FilterNode&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;arguments&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
                          &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;5.0&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;inputs&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
                          &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;DataNode&lt;/span&gt;
                                &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;unset&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;


&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Grouped&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Groupings&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;by&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;species&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;alias&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;pred_1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;Source&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;species&lt;/span&gt;      &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;pred_1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;avg&lt;/span&gt;       &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;├─────┼──────────────┼────────┼───────────┤&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;virginica&quot;&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.428644&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;3.17557&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;versicolor&quot;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;true&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.136551&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;s&quot;&gt;&quot;setosa&quot;&lt;/span&gt;     &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;false&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;4.7391&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We hope to include support for the other verbs in the near future.&lt;/p&gt;

&lt;p&gt;Again we emphasize that this collection machinery is provided by the AbstractTables package, not StructuredQueries. As we see above, the latter provides a framework for representing a query structure, whereas packages such as AbstractTables (i) decide what it means to execute a query with a particular structure against a particular backend, and (ii) provide the implementation of the behavior in (i).&lt;/p&gt;
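
&lt;p&gt;One way to picture this division of labor is through dispatch on the source type parameter. The sketch below is a toy and not the AbstractTables implementation: &lt;code class=&quot;highlighter-rouge&quot;&gt;ToyQuery&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;execute&lt;/code&gt; are hypothetical names, a plain predicate function stands in for the query graph, and the syntax assumes a recent Julia release. Each “backend” gives the same query structure a meaning by adding a method for its own source type:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Toy stand-in for Query{S}: the source type S is a type parameter.
struct ToyQuery{S}
    source::S
    predicate::Function   # stands in for the query graph
end

# Backend 1: for vectors, keep the elements that satisfy the predicate.
execute(q::ToyQuery{&amp;lt;:AbstractVector}) = filter(q.predicate, q.source)

# Backend 2: for dictionaries, apply the predicate to the values only.
execute(q::ToyQuery{&amp;lt;:AbstractDict}) = filter(kv -&amp;gt; q.predicate(kv.second), q.source)

execute(ToyQuery([4.9, 5.1, 5.4], x -&amp;gt; x &amp;gt; 5.0))   # keeps 5.1 and 5.4
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;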

&lt;p&gt;We provide a convenience macro, &lt;code class=&quot;highlighter-rouge&quot;&gt;@collect(qry)&lt;/code&gt;, which is equivalent to &lt;code class=&quot;highlighter-rouge&quot;&gt;collect(@query(qry))&lt;/code&gt;, for when one wishes to query and collect in the same command:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@collect&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;erf&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;petal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sepal_width&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&amp;gt;&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;summarize&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ifelse&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;()&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sin&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;petal_width&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;       &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;├─────┼───────────┤&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0998334&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Again, note the patterns of name resolution: names of functions (e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;erf&lt;/code&gt;) invoked within the context of a query argument are evaluated within the enclosing scope of the &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; invocation, whereas names in the arguments of such functions (e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;petal_length&lt;/code&gt;) are taken to be attributes of the data source (i.e., &lt;code class=&quot;highlighter-rouge&quot;&gt;iris&lt;/code&gt;).&lt;/p&gt;

&lt;h3 id=&quot;dummy-sources&quot;&gt;Dummy sources&lt;/h3&gt;

&lt;p&gt;We saw above how there are three parts to a query structure: verbs, sources and query arguments. A &lt;code class=&quot;highlighter-rouge&quot;&gt;Query&lt;/code&gt; object represents the verbs and query arguments together in the &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; graph and wraps the data source separately. This suggests that one ought to be able to generate query graphs using &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; even if one does not specify a particular data source. One can do precisely this by using &lt;em&gt;dummy sources&lt;/em&gt;, which are essentially placeholders that can be “filled in” with particular data sources later, when one calls &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt;. To indicate a source as a dummy source, simply prepend it with a &lt;code class=&quot;highlighter-rouge&quot;&gt;:&lt;/code&gt;. For instance:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;twice_sepal_length&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sepal_length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dummy&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;Tables&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Table&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Row&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;twice_sepal_length&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;├─────┼────────────────────┤&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.2&lt;/span&gt;               &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;9.8&lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;9.4&lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;9.2&lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;               &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.8&lt;/span&gt;               &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;9.2&lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;8&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;10.0&lt;/span&gt;               &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;9&lt;/span&gt;   &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8.8&lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;10&lt;/span&gt;  &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;9.8&lt;/span&gt;                &lt;span class=&quot;n&quot;&gt;│&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;⋮&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;with&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;140&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;more&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rows&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The key of the keyword argument passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt; must match the name of the dummy source used in the query (minus the &lt;code class=&quot;highlighter-rouge&quot;&gt;:&lt;/code&gt;). Otherwise, the call will fail:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;collect&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iris&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;ERROR&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ArgumentError&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Undefined&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;source&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Check&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;spelling&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;query&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;#collect#5(::Array{Any,1}, ::Function, ::StructuredQueries.Query{Symbol}) at /Users/David/.julia/v0.6/StructuredQueries/src/query/collect.jl:23&lt;/span&gt;
 &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Base&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;c&quot;&gt;#kw##collect)(::Array{Any,1}, ::Base.#collect, ::StructuredQueries.Query{Symbol}) at ./&amp;lt;missing&amp;gt;:0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2 id=&quot;the-two-problems&quot;&gt;The two problems&lt;/h2&gt;

&lt;p&gt;Now that we’ve seen what the SQ query framework itself consists of, we can discuss how such a framework may help to solve the column-indexing and nullable semantics problems.&lt;/p&gt;

&lt;h3 id=&quot;type-inferability&quot;&gt;Type-inferability&lt;/h3&gt;

&lt;p&gt;Recall that the column-indexing problem consists in the inability of type inference to detect the return type of&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;What &lt;em&gt;would&lt;/em&gt; make &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; above amenable to type inference is to pass &lt;code class=&quot;highlighter-rouge&quot;&gt;A = df[:A]&lt;/code&gt; to an inner function that executes the loop, for instance&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; f_inner&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As long as &lt;code class=&quot;highlighter-rouge&quot;&gt;f_inner&lt;/code&gt; does not get inlined, type inference will run “at” the point at which the body of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; calls &lt;code class=&quot;highlighter-rouge&quot;&gt;f_inner&lt;/code&gt; and will have access to the &lt;code class=&quot;highlighter-rouge&quot;&gt;eltype&lt;/code&gt; of &lt;code class=&quot;highlighter-rouge&quot;&gt;df[:A]&lt;/code&gt;, since the latter is passed as an argument to &lt;code class=&quot;highlighter-rouge&quot;&gt;f_inner&lt;/code&gt;.&lt;/p&gt;
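
&lt;p&gt;The function barrier can be tried out without a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt; at all. The following is only a sketch: a &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict{Symbol,Any}&lt;/code&gt; stands in for the table (an assumption made purely for illustration), and the names &lt;code class=&quot;highlighter-rouge&quot;&gt;cols&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;colsum&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;colsum_inner&lt;/code&gt; are hypothetical. As with column indexing, the result of the lookup is not known to type inference in the outer function, but the inner function is compiled for the concrete type it actually receives.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical stand-in for a DataFrame: columns stored behind an abstract value type.
cols = Dict{Symbol,Any}(:A =&amp;gt; [1.0, 2.0, 3.0])

function colsum(cols)
    A = cols[:A]             # inference only knows that this is Any
    return colsum_inner(A)   # function barrier: dispatch on the concrete type of A
end

function colsum_inner(A)
    x = zero(eltype(A))      # eltype is known here, so x has a concrete type
    for i in eachindex(A)
        x += A[i]
    end
    return x
end

colsum(cols)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The call to &lt;code class=&quot;highlighter-rouge&quot;&gt;colsum_inner&lt;/code&gt; is still dispatched dynamically, but that cost is paid once per column rather than once per element.&lt;/p&gt;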

&lt;p&gt;This strategy of introducing a function barrier also works when one requires multiple columns. For instance, suppose I wanted to generate a new column &lt;code class=&quot;highlighter-rouge&quot;&gt;C&lt;/code&gt; where &lt;code class=&quot;highlighter-rouge&quot;&gt;C[i] = g(A[i], B[i])&lt;/code&gt;. The following solution is type-inferable since the type parameters of the zipped iterator &lt;code class=&quot;highlighter-rouge&quot;&gt;zip(A, B)&lt;/code&gt; reflect the &lt;code class=&quot;highlighter-rouge&quot;&gt;eltype&lt;/code&gt;s of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;df&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;similar&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;f_inner!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;DataFrame&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; f_inner&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;itr&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;c&quot;&gt;# bang because mutates C&lt;/span&gt;
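    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;enumerate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;itr&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;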
        &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;g&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In other words: If one intends to iterate over the rows of some subset of columns of a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrame&lt;/code&gt;, then at some point there must be a function barrier through which is passed an argument whose signature reflects the &lt;code class=&quot;highlighter-rouge&quot;&gt;eltype&lt;/code&gt;s of the relevant columns.&lt;/p&gt;

&lt;p&gt;The manipulation described above could be expressed for a column-indexable table (e.g. a &lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt; object) as&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;select&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;C&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/davidagold/AbstractTables.jl/tree/master/src/column_indexable/query&quot;&gt;collection machinery&lt;/a&gt; that supports this query against, say, a &lt;code class=&quot;highlighter-rouge&quot;&gt;Table&lt;/code&gt; source essentially &lt;a href=&quot;https://github.com/davidagold/AbstractTables.jl/blob/master/src/column_indexable/query/select.jl&quot;&gt;follows&lt;/a&gt; the above pattern of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;f_inner&lt;/code&gt;. That is, an outer function passes a “scalar kernel” (here, &lt;code class=&quot;highlighter-rouge&quot;&gt;row -&amp;gt; row[1] * row[2]&lt;/code&gt;) that reflects the structure of &lt;code class=&quot;highlighter-rouge&quot;&gt;A * B&lt;/code&gt; and a “row iterator” (here &lt;code class=&quot;highlighter-rouge&quot;&gt;zip(tbl[:A], tbl[:B])&lt;/code&gt;) to an inner function that computes the value of the scalar kernel applied to the “rows” returned by iterating over the row iterator. (Note that the argument to the scalar kernel is assumed to be a &lt;code class=&quot;highlighter-rouge&quot;&gt;Tuple&lt;/code&gt; whose individual elements assume the positions of named attributes (such as &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;) in the body of the “value expression” (here &lt;code class=&quot;highlighter-rouge&quot;&gt;A * B&lt;/code&gt;) from which the scalar kernel is generated).&lt;/p&gt;
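
&lt;p&gt;To make the roles of the scalar kernel and the row iterator concrete, here is a minimal sketch using plain vectors in place of table columns. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;apply_kernel!&lt;/code&gt; is hypothetical and is not the function used in AbstractTables; the sketch only mirrors the pattern described above.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Inner function: applies the scalar kernel to each row produced by the row iterator.
function apply_kernel!(C, kernel, itr)
    for (i, row) in enumerate(itr)
        C[i] = kernel(row)   # row is a Tuple, here (A[i], B[i])
    end
    return C
end

A, B = [1, 2, 3], [4, 5, 6]            # stand-ins for tbl[:A] and tbl[:B]
kernel = row -&amp;gt; row[1] * row[2]        # scalar kernel for the value expression A * B
C = apply_kernel!(similar(A), kernel, zip(A, B))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because both the kernel and the zipped iterator are passed as arguments, the element types of the columns are visible to type inference inside &lt;code class=&quot;highlighter-rouge&quot;&gt;apply_kernel!&lt;/code&gt;.&lt;/p&gt;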

&lt;p&gt;The scalar kernel and the information about which columns to extract from &lt;code class=&quot;highlighter-rouge&quot;&gt;tbl&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;zip&lt;/code&gt; together are all stored in the &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; graph produced by &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt;. Much of the work in producing such a graph consists in extracting such information from the &lt;code class=&quot;highlighter-rouge&quot;&gt;qry&lt;/code&gt; expression (here &lt;code class=&quot;highlighter-rouge&quot;&gt;select(tbl, C = A * B)&lt;/code&gt;) and processing it to produce (i) a lambda that captures the form of the transformation (&lt;code class=&quot;highlighter-rouge&quot;&gt;A * B&lt;/code&gt;), (ii) a &lt;code class=&quot;highlighter-rouge&quot;&gt;Symbol&lt;/code&gt; that names the resultant column (&lt;code class=&quot;highlighter-rouge&quot;&gt;C&lt;/code&gt;) and (iii) a &lt;code class=&quot;highlighter-rouge&quot;&gt;Vector{Symbol}&lt;/code&gt; that lists the relevant argument column names (&lt;code class=&quot;highlighter-rouge&quot;&gt;[:A, :B]&lt;/code&gt;) in the order they are encountered during the production of the lambda.&lt;/p&gt;
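
&lt;p&gt;A rough sketch of this kind of processing is given below. It is not the StructuredQueries implementation: the macro &lt;code class=&quot;highlighter-rouge&quot;&gt;@kernel&lt;/code&gt; and the helpers &lt;code class=&quot;highlighter-rouge&quot;&gt;collect_fields!&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;index_fields&lt;/code&gt; are hypothetical names, the macro handles only a single assignment such as &lt;code class=&quot;highlighter-rouge&quot;&gt;C = A * B&lt;/code&gt;, and it is written against current Julia syntax. It walks the value expression at macroexpand-time, records the attribute names in order of first appearance, and rewrites each of them into an index into the row tuple.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Record attribute names in order of first appearance. Function names are
# skipped: they resolve in the enclosing scope, not against the data source.
collect_fields!(fields, x) = fields          # literals contribute nothing
function collect_fields!(fields, s::Symbol)
    s in fields || push!(fields, s)
    return fields
end
function collect_fields!(fields, ex::Expr)
    args = ex.head == :call ? ex.args[2:end] : ex.args
    for a in args
        collect_fields!(fields, a)
    end
    return fields
end

# Replace each attribute name by an index into the row tuple.
index_fields(x, fields, row) = x
function index_fields(s::Symbol, fields, row)
    i = findfirst(isequal(s), fields)
    return i === nothing ? s : Expr(:ref, row, i)
end
function index_fields(ex::Expr, fields, row)
    if ex.head == :call
        rest = [index_fields(a, fields, row) for a in ex.args[2:end]]
        return Expr(:call, ex.args[1], rest...)
    end
    return Expr(ex.head, [index_fields(a, fields, row) for a in ex.args]...)
end

# Hypothetical macro: returns (result field, argument fields, scalar kernel).
macro kernel(ex)
    result, value = ex.args                  # ex is the assignment C = A * B
    fields = collect_fields!(Symbol[], value)
    row = gensym()
    body = index_fields(value, fields, row)
    names = Expr(:vect, map(QuoteNode, fields)...)
    return esc(:(($(QuoteNode(result)), $names, $row -&amp;gt; $body)))
end

spec = @kernel C = A * B
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here &lt;code class=&quot;highlighter-rouge&quot;&gt;spec[1]&lt;/code&gt; is the result field, &lt;code class=&quot;highlighter-rouge&quot;&gt;spec[2]&lt;/code&gt; lists the argument fields, and &lt;code class=&quot;highlighter-rouge&quot;&gt;spec[3]&lt;/code&gt; is a kernel that can be applied to a row such as &lt;code class=&quot;highlighter-rouge&quot;&gt;(2, 3)&lt;/code&gt;. Since the lambda is produced during macro expansion, no call to &lt;code class=&quot;highlighter-rouge&quot;&gt;eval&lt;/code&gt; is involved.&lt;/p&gt;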

&lt;p&gt;Note that these data (a scalar kernel and result and argument fields) are not necessary to generate SQL code from a raw query argument, say the &lt;code class=&quot;highlighter-rouge&quot;&gt;Expr&lt;/code&gt; object &lt;code class=&quot;highlighter-rouge&quot;&gt;:( C = A * B )&lt;/code&gt;. Thus, one might argue that it is somewhat wasteful to compute such data and store it in the &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; graph when one might be able to compute the data at run-time dispatch of &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt; on a &lt;code class=&quot;highlighter-rouge&quot;&gt;Query{S}&lt;/code&gt; where &lt;code class=&quot;highlighter-rouge&quot;&gt;S&lt;/code&gt; is a type that satisfies the column-indexable interface. This is a good point, but there are two considerations to take into account. The first is that computing the scalar kernel and extracting the result and argument fields from the query argument is probably not prohibitively expensive. The second is that generating the scalar kernel at run-time (i) involves use of &lt;code class=&quot;highlighter-rouge&quot;&gt;eval&lt;/code&gt;, which is to be avoided, and (ii) may involve a lot of work to re-incorporate the module information of names appearing in the expression to be &lt;code class=&quot;highlighter-rouge&quot;&gt;eval&lt;/code&gt;’d into a scalar kernel. For now, it is easiest to generate scalar kernels at macroexpand-time and let them come along for the ride in the &lt;code class=&quot;highlighter-rouge&quot;&gt;QueryNode&lt;/code&gt; graph even if the latter is to be collected against a data source (e.g. a SQL connection) that doesn’t need such data.&lt;/p&gt;

&lt;p&gt;The use of metaprogramming to circumvent type-inferability problems is not a new strategy. Indeed, it is the basis for the &lt;a href=&quot;https://github.com/JuliaStats/DataFramesMeta.jl&quot;&gt;DataFramesMeta&lt;/a&gt; manipulation framework. The interested reader is referred &lt;a href=&quot;https://github.com/JuliaStats/DataFrames.jl/issues/523#issuecomment-33908369&quot;&gt;here&lt;/a&gt; and &lt;a href=&quot;https://github.com/JuliaStats/DataFramesMeta.jl/issues/1&quot;&gt;here&lt;/a&gt; for more on the history and motivation for these endeavors.&lt;/p&gt;

&lt;h3 id=&quot;the-hard-question-of-nullable-semantics&quot;&gt;The hard question of nullable semantics&lt;/h3&gt;

&lt;p&gt;Recall that the hard question of nullable semantics involves implementing a given lifting semantics – that is, a given behavior for &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::Nullable{T})&lt;/code&gt; given a defined method &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::T)&lt;/code&gt; – in a “general” way.&lt;/p&gt;

&lt;p&gt;One solution – perhaps the most obvious, and which I have &lt;a href=&quot;https://github.com/JuliaStats/NullableArrays.jl/commit/e3d68ab2502e3e8c2e9e6b7c299f9078b9154e3e#diff-04c6e90faac2675aa89e2176d2eec7d8R140&quot;&gt;previously endorsed&lt;/a&gt; – involves defining the method &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::Nullable{T})&lt;/code&gt; as something like&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Nullable&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;T&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;isnull&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Nullable&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Nullable&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;with natural analogues for methods with n-ary arguments. This process is a bit cumbersome, but it would not be difficult to automate with a macro with which one could annotate the original definition &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x::T)&lt;/code&gt;. Call this approach the “method extension lifting” approach.&lt;/p&gt;

&lt;p&gt;The method extension lifting approach is very flexible. However, it does face some difficulties. One must somehow decide which functions should be lifted in this manner, and it’s not clear how this line (between lifted and non-lifted functions) ought to be drawn. And if one cannot edit the definition of a function then a macro is of no use; one must manually introduce the lifted variant.&lt;/p&gt;

&lt;p&gt;There is a further problem. If one wants to support lifting over arguments with “mixed” signatures – i.e. signatures in which some argument types are &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; and some are not – then one has either to extend the promotion machinery or to define methods for mixed signatures, e.g. &lt;code class=&quot;highlighter-rouge&quot;&gt;+{T}(x, y::Nullable{T})&lt;/code&gt;. That may end up being a lot of methods. Even if their definition can be automated with metaprogramming, the compilation costs associated with method proliferation may be considerable (but I haven’t tested this).&lt;/p&gt;

&lt;p&gt;Finally, there is the problem described in &lt;a href=&quot;https://github.com/JuliaStats/NullableArrays.jl/issues/148#issuecomment-249335994&quot;&gt;NullableArrays.jl#148&lt;/a&gt;. I won’t repeat the entire argument here. The summary of this problem is: if one is going to rely on a minimal set of lifted operators to support generic lifting of user-defined functions, those user-defined functions essentially have to give up much of multiple dispatch.&lt;/p&gt;

&lt;p&gt;The difficulties associated with method extension lifting are not insurmountable, but the solution – namely, keeping a repository of lifted methods – requires an undetermined amount of maintenance and coordination.&lt;/p&gt;

&lt;p&gt;Another way to implement standard lifting semantics is by means of a higher-order function – that is, on Julia 0.5 where higher-order functions are performant. Such a function – call it &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; – might look like the following:&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; lift&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;if&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;hasvalue&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Nullable&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;value&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;else&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;U&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Core&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Inference&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;return_type&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;typeof&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)),))&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Nullable&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;U&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}()&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This definition can naturally be extended to methods with more than one argument. The primary advantage of this approach over method extension lifting is its generality: one needs only to define one (two, three) higher-order &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; method to support lifting of all functions of one (two, &lt;em&gt;n&lt;/em&gt;) argument(s), as opposed to having to define a lifted version for each such function. Note that as long as &lt;code class=&quot;highlighter-rouge&quot;&gt;hasvalue&lt;/code&gt; has some generic fallback method for non-&lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; arguments, such &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; functions cover both standard and mixed-signature lifting. (Ideally one would ensure that the code is optimized for when types are non-&lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt;; in particular, one would ensure that the dead branch is removed – cf. &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/18484&quot;&gt;julia#18484&lt;/a&gt;.) Call this approach the “higher-order lifting” approach.&lt;/p&gt;
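
&lt;p&gt;As a rough illustration of the multi-argument extension, here is a minimal two-argument sketch written against Julia 0.5. The helpers &lt;code class=&quot;highlighter-rouge&quot;&gt;hasval&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;unwrap&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;innertype&lt;/code&gt; are hypothetical names introduced only for this example (&lt;code class=&quot;highlighter-rouge&quot;&gt;hasval&lt;/code&gt; stands in for a &lt;code class=&quot;highlighter-rouge&quot;&gt;hasvalue&lt;/code&gt; with a generic fallback); they are not part of any package, and the sketch makes no attempt at the dead-branch optimization mentioned above.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helpers for this sketch only (not part of any package).
hasval(x::Nullable) = !isnull(x)
hasval(x) = true                      # generic fallback for non-Nullable values
unwrap(x::Nullable) = get(x)
unwrap(x) = x
innertype(x::Nullable) = eltype(typeof(x))
innertype(x) = typeof(x)

# Two-argument analogue of the one-argument lift above.
function lift(f, x, y)
    if hasval(x) &amp;amp;&amp;amp; hasval(y)
        return Nullable(f(unwrap(x), unwrap(y)))
    else
        U = Core.Inference.return_type(f, (innertype(x), innertype(y)))
        return Nullable{U}()
    end
end

lift(+, Nullable(1), Nullable(2))     # standard lifting: both arguments are Nullable
lift(+, Nullable(1), 2)               # mixed-signature lifting
lift(+, Nullable{Int}(), 2)           # a null argument gives a null result
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;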

&lt;p&gt;So, with the higher-order lifting approach we might better avoid method proliferation and generality worries, which is nice. However, now we require users to invoke &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; everywhere. In particular, to lift &lt;code class=&quot;highlighter-rouge&quot;&gt;f(g(x))&lt;/code&gt; over a &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; argument &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, one needs to write &lt;code class=&quot;highlighter-rouge&quot;&gt;lift(f, lift(g, x))&lt;/code&gt;. The least we could do in this case is provide an &lt;code class=&quot;highlighter-rouge&quot;&gt;@lift&lt;/code&gt; macro that, say, traverses the AST of &lt;code class=&quot;highlighter-rouge&quot;&gt;f(g(x))&lt;/code&gt; and replaces each function call &lt;code class=&quot;highlighter-rouge&quot;&gt;f(...)&lt;/code&gt; by an invocation of &lt;code class=&quot;highlighter-rouge&quot;&gt;lift(f, ...)&lt;/code&gt;. That might be reasonable, but it’s still an artifact of implementation details of support for missing values, and ideally it would not be exposed to users.&lt;/p&gt;
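
&lt;p&gt;A minimal sketch of what such a macro could look like follows. The helper name &lt;code class=&quot;highlighter-rouge&quot;&gt;liftcalls&lt;/code&gt; is hypothetical, the sketch assumes that suitable &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; methods are defined wherever the macro is used, and it only rewrites ordinary call expressions (not, for example, short-circuit operators).&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: rewrite every call f(args...) into lift(f, args...).
liftcalls(ex) = ex                    # symbols and literals are left untouched
function liftcalls(ex::Expr)
    args = map(liftcalls, ex.args)    # rewrite nested calls first
    if ex.head == :call
        return Expr(:call, :lift, args...)
    else
        return Expr(ex.head, args...)
    end
end

macro lift(ex)
    return esc(liftcalls(ex))
end

liftcalls(:(f(g(x))))                 # :(lift(f, lift(g, x)))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;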

&lt;p&gt;Recall that the present query framework extracts the “value expression” of a query argument (for instance, &lt;code class=&quot;highlighter-rouge&quot;&gt;A * B&lt;/code&gt; in the query argument &lt;code class=&quot;highlighter-rouge&quot;&gt;C = A * B&lt;/code&gt;) and generates a lambda that mimics the former’s structure (in this case, &lt;code class=&quot;highlighter-rouge&quot;&gt;row -&amp;gt; row[1] * row[2]&lt;/code&gt;). A proposed modification (see &lt;a href=&quot;https://github.com/davidagold/_AbstractTables.jl/issues/2&quot;&gt;AbstractTables#2&lt;/a&gt;) to this process is to rewrite the AST of the value expression (&lt;code class=&quot;highlighter-rouge&quot;&gt;A * B&lt;/code&gt;) by appropriately inserting calls to &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt;, e.g.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;row&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;-&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;lift&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;],&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;row&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;While there is a &lt;a href=&quot;https://github.com/davidagold/AbstractTables.jl/blob/2a7771ce865b961fa0e454508ce8b7aa6a85e1fd/src/column_indexable/query/select.jl#L43-L48&quot;&gt;simpler way&lt;/a&gt; to achieve standard lifting semantics, that simpler approach (which is currently employed by the column-indexing collection machinery) does not easily support non-standard lifting semantics such as three-valued logic.&lt;/p&gt;
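
&lt;p&gt;To make the proposed AST modification concrete, the following sketch replaces column names by row indices and wraps each call in &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt;. The function &lt;code class=&quot;highlighter-rouge&quot;&gt;rewrite&lt;/code&gt; and the &lt;code class=&quot;highlighter-rouge&quot;&gt;index&lt;/code&gt; dictionary are hypothetical names chosen for this example; this is not the AbstractTables implementation.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: turn column names into row[i] and lift each call.
rewrite(ex, index) = ex               # literals are left untouched
rewrite(s::Symbol, index) = haskey(index, s) ? :(row[$(index[s])]) : s
function rewrite(ex::Expr, index)
    if ex.head == :call
        f = ex.args[1]                # the called function itself is not rewritten
        args = [rewrite(a, index) for a in ex.args[2:end]]
        return Expr(:call, :lift, f, args...)
    end
    return Expr(ex.head, [rewrite(a, index) for a in ex.args]...)
end

index = Dict(:A =&amp;gt; 1, :B =&amp;gt; 2)        # assumed column name to position mapping
body = rewrite(:(A * B), index)       # :(lift(*, row[1], row[2]))
lambda = Expr(:-&amp;gt;, :row, body)        # expression for the generated lambda
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;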

&lt;p&gt;The higher-order lifting approach is not without its own drawbacks. Most notably, non-standard lifting semantics, such as three-valued logic, are more difficult to implement and are subject to restrictions that do not apply to the method extension lifting approach. The details of this difficulty are the proper subject of another blog post. The summary of the problem is: higher-order lifting (via code transformation, such as within &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt;) can only give non-standard lifting semantics to methods called explicitly within the expression passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt;. That is,&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;can be given, say, three-valued logic semantics via higher-order lifting, but&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;|&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;y&lt;/span&gt;
&lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;cannot.&lt;/p&gt;
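
&lt;p&gt;The restriction becomes visible when one looks at what a code transformation actually receives. The sketch below (the helper &lt;code class=&quot;highlighter-rouge&quot;&gt;called&lt;/code&gt; is a hypothetical name) lists the functions that appear as explicit calls in an expression; only those are available for rewriting.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: collect the functions called explicitly in an expression.
called(ex) = Any[]                    # symbols and literals contain no calls
function called(ex::Expr)
    names = ex.head == :call ? Any[ex.args[1]] : Any[]
    for a in ex.args
        append!(names, called(a))     # descend into nested expressions
    end
    return names
end

called(:(filter(tbl, A | B)))         # filter and | are both visible
called(:(filter(tbl, f(A, B))))       # only filter and f are visible
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In the second expression &lt;code class=&quot;highlighter-rouge&quot;&gt;|&lt;/code&gt; never appears, because it is hidden inside the body of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;, so a transformation of the query expression has no opportunity to change its semantics.&lt;/p&gt;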

&lt;p&gt;Which approach to solving the hard question of &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; semantics is better? It really is not clear. Right now, the Julia statistics community is trying out both solutions. I am hopeful time and experimentation will yield new insights.&lt;/p&gt;

&lt;h2 id=&quot;sql-backends&quot;&gt;SQL backends&lt;/h2&gt;

&lt;p&gt;Above we have seen (i) how the implementation of a generic querying interface suggested a solution to the column-indexing and the &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; semantics problems and (ii) how these latter solutions may be implemented in a manner generic over so-called column-indexable in-memory Julia tabular data structures. But we haven’t said anything about how the interface is generic over tables other than in-memory Julia objects. In particular, we desire that the above framework be applicable to SQL database connections as well.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/yeesian&quot;&gt;Yeesian Ng&lt;/a&gt;, who provided invaluable feedback and ideas during the development of SQ, also began to develop such an extension in a package called &lt;a href=&quot;https://github.com/yeesian/SQLQuery.jl/pull/2&quot;&gt;SQLQuery&lt;/a&gt;. We are working to further integrate it with StructuredQueries in  &lt;a href=&quot;https://github.com/yeesian/SQLQuery.jl/pull/2&quot;&gt;SQLQuery.jl#2&lt;/a&gt;, and we encourage the reader to stay tuned for updates concerning this endeavor.&lt;/p&gt;

&lt;h2 id=&quot;roadmap-and-open-questions&quot;&gt;Roadmap and open questions&lt;/h2&gt;

&lt;p&gt;There is a general roadmap available at &lt;a href=&quot;https://github.com/davidagold/StructuredQueries.jl/issues/19&quot;&gt;StructuredQueries.jl#19&lt;/a&gt;. I’ll briefly describe some of what I believe are the most pressing/interesting open questions.&lt;/p&gt;

&lt;p&gt;Interpolation syntax and implementation are both significant open questions. Suppose I wish to refer to a name in the enclosing scope of an &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; invocation. A straightforward syntax would be to prepend the interpolated variable with &lt;code class=&quot;highlighter-rouge&quot;&gt;$&lt;/code&gt;, as in&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;q&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@query&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;filter&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;tbl&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;$&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;c&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;How should this be implemented? For full generality, we would like to be able to “capture” &lt;code class=&quot;highlighter-rouge&quot;&gt;c&lt;/code&gt; from the enclosing scope and store it in &lt;code class=&quot;highlighter-rouge&quot;&gt;q&lt;/code&gt;. One way to do so is to include &lt;code class=&quot;highlighter-rouge&quot;&gt;c&lt;/code&gt; in the closure of a lambda &lt;code class=&quot;highlighter-rouge&quot;&gt;() -&amp;gt; c&lt;/code&gt; that we store in &lt;code class=&quot;highlighter-rouge&quot;&gt;q&lt;/code&gt;. However, there is the question of how to deal with &lt;a href=&quot;https://github.com/davidagold/StructuredQueries.jl/issues/22#issuecomment-244995697&quot;&gt;problems of type-inferability&lt;/a&gt;. Solving this problem may either require or strongly suggest some sort of “parametrized queries” API by which one can designate a name inside of a query argument context as a &lt;em&gt;parameter&lt;/em&gt; that can then be bound after the &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; invocation, e.g. specified as kwargs to &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt; or to a function like &lt;code class=&quot;highlighter-rouge&quot;&gt;bind!(q::Query[; kwargs...])&lt;/code&gt;.&lt;/p&gt;
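
&lt;p&gt;The closure-based option can be pictured with a small sketch on a plain vector instead of a table. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;make_predicate&lt;/code&gt; is hypothetical and is not part of the StructuredQueries API.&lt;/p&gt;

&lt;div class=&quot;language-julia highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: build a predicate that captures c in its closure.
make_predicate(c) = a -&amp;gt; a &amp;gt; c

c = 0.5
pred = make_predicate(c)              # c is captured when the closure is created
filter(pred, [0.2, 0.7, 0.9])         # keeps the values greater than c
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Passing &lt;code class=&quot;highlighter-rouge&quot;&gt;c&lt;/code&gt; through a function argument, rather than reading a non-constant global from inside the lambda, is one common way to keep such a closure type-inferable in Julia; whether that is sufficient for the problems discussed in the linked issue is not something this sketch settles.&lt;/p&gt;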

&lt;p&gt;We are also still deciding what the general syntax within a query context should look like. A big part of this decision concerns how aliasing and related functionality ought to work. See &lt;a href=&quot;https://github.com/davidagold/StructuredQueries.jl/issues/21&quot;&gt;StructuredQueries.jl#21&lt;/a&gt; for more details. This issue is similar to that of interpolation syntax insofar as both involve name resolution within different query contexts (e.g. in a data source specification context vs. a query argument context).&lt;/p&gt;

&lt;p&gt;Finally, extensibility of not only &lt;code class=&quot;highlighter-rouge&quot;&gt;collect&lt;/code&gt; but also of the graph generation facilities is an important issue, of which we hope to say more in a later post.&lt;/p&gt;

&lt;h2 id=&quot;related-work&quot;&gt;Related work&lt;/h2&gt;

&lt;p&gt;As mentioned above, &lt;a href=&quot;https://github.com/JuliaStats/DataFramesMeta.jl&quot;&gt;DataFramesMeta&lt;/a&gt; is a pioneering approach to enhancing tabular data support in Julia via metaprogramming. Another exciting (and slightly more mature than the presently discussed package) endeavor in the realm of generic data manipulation facilities support is &lt;a href=&quot;https://github.com/davidanthoff/Query.jl&quot;&gt;Query.jl&lt;/a&gt; by &lt;a href=&quot;https://github.com/davidanthoff&quot;&gt;David Anthoff&lt;/a&gt;. Query.jl and SQ are very similar in their objectives, though different in important respects. A comparison of these packages is the proper topic of a separate blog post.&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;The foregoing post has described a work in progress: not just the StructuredQueries package, but also the Julia statistical ecosystem. Though it will likely take a while for this ecosystem to mature, the general trend I’ve observed over the past two years is encouraging. It’s also worth noting that much of what is described above would have been difficult to conceive without recent developments in the Julia language. In particular, performant higher-order functions and type-inferable map have both allowed us to explore solutions that were previously made difficult by the amount of metaprogramming required to ensure type-inferability. It will be interesting to see what we can come up with given the improvements to Julia in 0.6 and beyond.&lt;/p&gt;

&lt;p&gt;I’m very grateful to John Myles White for his guidance on this project, to Yeesian Ng at MIT for his collaboration, to Viral Shah and Alan Edelman for arranging this opportunity, and to many others at Julia Central and elsewhere for their help and insight.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>A Personal Perspective On JuliaCon 2016</title>
   <link href="http://julialang.org/blog/2016/09/juliacon2016"/>
   <updated>2016-09-21T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/09/juliacon2016</id>
   <content type="html">&lt;p&gt;The gentle breeze brushed my face and the mild sunshine warmed an
otherwise chilly morning. I was standing in front of a large building that can
only be described as unique: a series of metal plates jutting out at odd angles,
whose dull resplendence cast an instant impression. It was the Ray and Maria
Stata Center, a towering monolith and the venue for an event that
people from all over the world came to attend and participate in. I was in town for
the &lt;a href=&quot;https://www.youtube.com/watch?v=EZD3Scuv02g&amp;amp;list=PLP8iPy9hna6SQPwZUDtAM59-wPzCPyD_S&quot;&gt;third edition of JuliaCon&lt;/a&gt;,
the annual Julia Conference at MIT.&lt;/p&gt;

&lt;p&gt;On the eve of JuliaCon, a series of workshops were organised on some important
areas people use Julia for. I was conducting the
&lt;a href=&quot;https://www.youtube.com/watch?v=euZkvgx0fG8&quot;&gt;Parallel Computing workshop&lt;/a&gt;
along with some other members of the &lt;a href=&quot;http://julia.mit.edu/&quot;&gt;JuliaLab&lt;/a&gt;. The key idea in our workshop was
to show users the many different ways of writing and executing parallel code in Julia.
I was talking about easy GPU computing using my package called ArrayFire and achieving
acceleration using Julia’s multi-threading.&lt;/p&gt;

&lt;p&gt;Day 1 started off with the first keynote speaker - Guy Steele, a stalwart
in the software industry and an expert in programming language design. He
&lt;a href=&quot;https://www.youtube.com/watch?v=EZD3Scuv02g&quot;&gt;spoke&lt;/a&gt; about his adventures designing
Fortress, a language that was intended to be good at mathematical programming.
He went through the key design principles and tradeoffs: from the type hierarchy,
to their model for parallelism (automatic work-stealing), and interesting choices
(such as non-transitive operator precedence). My colleague Keno Fischer was up next
with a &lt;a href=&quot;https://www.youtube.com/watch?v=e6-hcOHO0tc&quot;&gt;tour&lt;/a&gt; of the new Julia Debugger:
Gallium! Gallium was quite breathtaking in its complexity and versatility, so much so
that Keno himself uses it to debug code in C and C++! A powerful debugger becomes even
better with GUI-integration, which Mike Innes very usefully
&lt;a href=&quot;https://www.youtube.com/watch?v=yDwUL3aRSRc&quot;&gt;pitched in&lt;/a&gt; with during his demo of
Juno-Gallium integration. Stepping, printing and breakpoints promised a powerful
package development experience.&lt;/p&gt;

&lt;p&gt;The next session was all about data science. Simon Byrne
&lt;a href=&quot;https://www.youtube.com/watch?v=ScCY_nE0hlU&quot;&gt;spoke&lt;/a&gt; about the data science ecosystem
in Julia and future plans. He touched on the
&lt;a href=&quot;http://www.johnmyleswhite.com/notebook/2015/11/28/why-julias-dataframes-are-still-slow/&quot;&gt;famous problem&lt;/a&gt;
with DataFrames, and then laid out a roadmap for the ecosystem. The rest of the
session featured an interesting demo in
&lt;a href=&quot;https://www.youtube.com/watch?v=IOVrVOacLP8&quot;&gt;music processing&lt;/a&gt;,
while Arch Robison &lt;a href=&quot;https://www.youtube.com/watch?v=02NkiDoRDCU&quot;&gt;showed&lt;/a&gt;
us how to use Julia as a code generator.&lt;/p&gt;

&lt;p&gt;The evening had two sessions in parallel in different rooms. This is a recurrent
feature of JuliaCon, and it’s always hard to decide which session to attend.
This time, I chose to attend the sessions on
&lt;a href=&quot;https://www.youtube.com/watch?v=xtfNug-htcs&quot;&gt;automatic differentiation&lt;/a&gt; in
&lt;a href=&quot;https://github.com/JuliaOpt/JuMP.jl&quot;&gt;JuMP&lt;/a&gt; and
&lt;a href=&quot;https://www.youtube.com/watch?v=r2hhRSHiQwY&quot;&gt;forward&lt;/a&gt; differentiation using
&lt;a href=&quot;https://github.com/JuliaDiff/ForwardDiff.jl&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiff.jl&lt;/code&gt;&lt;/a&gt;. I didn’t want to miss the
&lt;a href=&quot;https://www.youtube.com/watch?v=AJHyr-O5qfY&quot;&gt;talk&lt;/a&gt; on iterative methods for sparse
linear systems. The performance of different techniques and approaches was compared
and evaluated, which made for a compelling presentation
that I really enjoyed.&lt;/p&gt;

&lt;p&gt;The evening session featured Jeffrey Sarnoff, one of the sponsors of JuliaCon 2016.
Mr. Sarnoff had some &lt;a href=&quot;https://www.youtube.com/watch?v=R111conL0jM&quot;&gt;very interesting thoughts&lt;/a&gt;
on extended precision floating point numbers. And so ended the first day at JuliaCon.
Now it was time to head to the JuliaHouse! The JuliaHouse was an AirBnb that a bunch of
Julia contributors rented out. They had a yard and a barbecue and it was the ideal place
for people to go relax, unwind and network with the other Julia folks. People chilled there
till the wee hours of the morning, and somehow made it on time for the next day’s session.&lt;/p&gt;

&lt;p&gt;The second day started with a &lt;a href=&quot;https://www.youtube.com/watch?v=fl0g9tHeghA&quot;&gt;keynote speech&lt;/a&gt;
by Professor Tim Holy, a prolific contributor to the Julia language and ecosystem.
He spoke about the state of arrays in Julia and showed us a few of his ideas for iterators.
I saw that Professor Holy is widely admired in the entire Julia community due to his involvement
in various packages and in key issues concerning the language. I noticed that he asked some pretty neat,
insightful questions at various earlier sessions too. Stefan was up next with his super-important
Julia 1.0 &lt;a href=&quot;https://www.youtube.com/watch?v=5gXMpbY1kJY&quot;&gt;talk&lt;/a&gt;. It was quite a comprehensive list
of things that needed to be done before Julia would be 1.0 ready and he touched on a variety of areas
such as the compiler, the type system, the runtime, multi-threading, strings and so on.&lt;/p&gt;

&lt;p&gt;The next session saw a team from UC Berkeley show off their
&lt;a href=&quot;https://www.youtube.com/watch?v=bX4TXWO7dA0&quot;&gt;autonomous racing car&lt;/a&gt; that uses some optimization
packages (JuMP and Ipopt in particular) to solve real-time optimization problems. Julia was running
on an ARM chip with Ubuntu 14.04 installed. Julia can also run on the Raspberry Pi, and my colleague
Avik took some time to &lt;a href=&quot;https://www.youtube.com/watch?v=EvJ-OvTC5eE&quot;&gt;show off&lt;/a&gt; a cool Minecraft demo
running on the Pi. The talk after that was about JuliaBox. Nishanth, another colleague of mine, has
been hard at work porting JuliaBox to Google Cloud from AWS, and he
&lt;a href=&quot;https://www.youtube.com/watch?v=j0tmyWJ-aSQ&quot;&gt;spoke&lt;/a&gt; about his exciting plans for JuliaBox.&lt;/p&gt;

&lt;p&gt;Post lunch, I had to choose again between parallel sessions, but I couldn’t quite
resist the session with stochastic PDEs and Finite Elements. Kristoffer Carlsson
reviewed the state of &lt;a href=&quot;https://www.youtube.com/watch?v=30TUEhbGmuc&quot;&gt;FEM in Julia&lt;/a&gt;,
talking about the packages and ecosystem for every FEM step from assembly to the
conjugate gradient. The next &lt;a href=&quot;https://www.youtube.com/watch?v=EEP2NMgC9Zo&quot;&gt;talk&lt;/a&gt;
was given by a professor at TU Vienna whose group conducts research on nano-biosensors,
and the group uses Julia to solve the stochastic PDEs that come up when trying to model
noise and fluctuations. The next &lt;a href=&quot;https://www.youtube.com/watch?v=IjJqVwtWO3s&quot;&gt;talk&lt;/a&gt;
on astrodynamics was very interesting in that it gave me an insight into the kinds of
computational challenges faced by scientists in the field. There were also some interesting
demos which I enjoyed, particularly the one where we modelled and visualized a target orbit,
which was superimposed upon a visual of the earth in space.&lt;/p&gt;

&lt;p&gt;In the afternoon, after much consideration, I went to the session that featured statistical
modelling and least squares. The first talk on
&lt;a href=&quot;https://www.youtube.com/watch?v=S5sA-Ch_KPo&quot;&gt;sparse least squares optimization&lt;/a&gt; problems
gave me a flavor of the kinds of models and problems economists need to solve, and how the
Julia ecosystem helps them. The &lt;a href=&quot;https://www.youtube.com/watch?v=ZfjRjljXYXk&quot;&gt;next talk&lt;/a&gt;
on computational neuroscience focussed on dealing with tens of terabytes of brain data
coming from both animals and human surgery patients. I had a very interesting discussion
with John earlier about his work, and I was able to get a keen sense of why the package
he was talking about (VinDsl.jl) was important for his work. And so ended Day 2 at JuliaCon,
a highly educational day for me personally, with insights into astrodynamics, finite elements
and computational neuroscience.&lt;/p&gt;

&lt;p&gt;I would contend that one of the best ways to begin your day is to listen to a speech by a Nobel
Laureate. It was quite a &lt;a href=&quot;https://www.youtube.com/watch?v=KkKBwJkYgVk&quot;&gt;surreal experience&lt;/a&gt;
listening to Professor Tom Sargent, and to see him excited by Julia. He gave us a flavor of
macroeconomics research and introduced dynamic programming squared problems that were
“a walking advertisement for Julia”. As a case in point, the
&lt;a href=&quot;https://www.youtube.com/watch?v=Vd2LJI3JLU0&quot;&gt;next session&lt;/a&gt; on DSGE models in Julia highlighted
the benefits Julia can bring to macroeconomics research and analysis.&lt;/p&gt;

&lt;p&gt;The next session had a bunch of Julia Summer of Code (JSOC) students present their projects.
Some couldn’t make it to the conference so they presented their work
&lt;a href=&quot;https://www.youtube.com/watch?v=On0AtfGh758&quot;&gt;through Google Hangouts&lt;/a&gt; or through
&lt;a href=&quot;https://www.youtube.com/watch?v=AVOooQYi9F4&quot;&gt;pre-recorded video&lt;/a&gt;. Unfortunately, I couldn’t
catch all of them because I wanted to catch my colleague Jameson’s Machine Code
&lt;a href=&quot;https://www.youtube.com/watch?v=ErGi9sNgUjw&quot;&gt;talk&lt;/a&gt; which was in another room. The material
he spoke about was very interesting, and got me thinking about the Julia compiler. I also had a
very enlightening discussion with him later about the Julia parser.&lt;/p&gt;

&lt;p&gt;It turned out that in the afternoon, I was crunched for time. I was helping Shashi plug
&lt;a href=&quot;https://github.com/JuliaComputing/ArrayFire.jl&quot;&gt;ArrayFire&lt;/a&gt; into
&lt;a href=&quot;https://github.com/JuliaParallel/Dagger.jl&quot;&gt;Dagger.jl&lt;/a&gt; for his talk that was due in a
couple of hours, while also working on my own ArrayFire notebooks for later that evening.
But we managed to pull through in time. So the afternoon session had
Shashi &lt;a href=&quot;https://www.youtube.com/watch?v=1hvCuQtt6Yg&quot;&gt;presenting&lt;/a&gt;
Dagger, his out-of-core framework, followed by a &lt;a href=&quot;https://www.youtube.com/watch?v=Ti9qqAe_NF4&quot;&gt;tour&lt;/a&gt;
of &lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl&quot;&gt;ParallelAccelerator&lt;/a&gt; from the IntelLabs team.
I have been following ParallelAccelerator for a while, and I’m excited by how certain aspects
of it (such as automatic elimination of bounds checking) can be incorporated into Base Julia.&lt;/p&gt;

&lt;p&gt;The evening session showed people how they can accelerate their code in Julia. The speaker
before me covered vectorization with &lt;a href=&quot;https://www.youtube.com/watch?v=luScuvqiow4&quot;&gt;Yeppp&lt;/a&gt;
before I &lt;a href=&quot;https://www.youtube.com/watch?v=2f32XSMYlDk&quot;&gt;covered&lt;/a&gt; GPU acceleration with ArrayFire.
It was quite overwhelming to be speaking in front of a bunch of experts, but I think I did okay.
But I did finish 5 minutes faster than my allotted time. As it turned out, both parallel sessions
actually ended up concluding a few minutes early.&lt;/p&gt;

&lt;p&gt;Finally, Andreas came up to the podium for the concluding remarks and closed off a very important
JuliaCon for me personally. I was able to appreciate the various kinds of people involved in the
Julia community: some who worked on the core language to some who worked on their own packages as
part of their research; some who worked on Julia part-time, to some (like myself) who worked
full-time; the relatively uninitiated JSOC students to experienced old-timers in the community.
One thing tied them all together though: a quite thorough appreciation of a new language whose
flexibility and power enabled people to solve important problems, whose community’s openness and sense
of democracy welcomed more smart people, and the idea that a group of individuals in different time zones
and from different walks of life can drive a revolution in scientific computing.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>BioJulia 2016 - online sequence search, sequence demultiplexing, new readers and much more!</title>
   <link href="http://julialang.org/blog/2016/09/biojulia2016-mid"/>
   <updated>2016-09-10T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/09/biojulia2016-mid</id>
   <content type="html">&lt;p&gt;We are pleased to announce the release of
&lt;a href=&quot;https://github.com/BioJulia/Bio.jl&quot;&gt;Bio.jl&lt;/a&gt; 0.4, a minor release including
significant functionality improvements as I promised in &lt;a href=&quot;http://julialang.org/blog/2016/04/biojulia2016&quot;&gt;the previous blog
post&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The following features have been added since that post:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Online sequence search algorithms.&lt;/li&gt;
  &lt;li&gt;Sequence data structure for reference genomes.&lt;/li&gt;
  &lt;li&gt;Data reader and writer for the .2bit file format.&lt;/li&gt;
  &lt;li&gt;Data reader and writer for the SAM and BAM file formats.&lt;/li&gt;
  &lt;li&gt;Sequence demultiplexing tool.&lt;/li&gt;
  &lt;li&gt;Package to handle BGZF files.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And many other miscellaneous performance and usability improvements!  Tutorial
notebooks are available at &lt;a href=&quot;https://github.com/BioJulia/BioTutorials&quot;&gt;https://github.com/BioJulia/BioTutorials&lt;/a&gt;.  Here I
briefly introduce you to these new features one by one.&lt;/p&gt;

&lt;h2 id=&quot;online-sequence-search-algorithms&quot;&gt;Online sequence search algorithms&lt;/h2&gt;

&lt;p&gt;Sequence search is an indispensable tool in sequence analysis.  Since the last
post, I have added exact, approximate and regex search algorithms.  The search
interface of Bio.jl mimics that of Julia’s standard library.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using Bio.Seq

julia&amp;gt; seq = dna&quot;ACAGCGTAGCT&quot;
11nt DNA Sequence:
ACAGCGTAGCT

# Exact search.
julia&amp;gt; search(seq, dna&quot;AGCG&quot;)
3:6

# Approximate search with one error or less.
julia&amp;gt; approxsearch(seq, dna&quot;AGGG&quot;, 1)
3:6

# Regular expression search.
julia&amp;gt; search(seq, biore&quot;AGN*?G&quot;d)
3:6
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2 id=&quot;sequence-data-structure-for-reference-genomes&quot;&gt;Sequence data structure for reference genomes&lt;/h2&gt;

&lt;p&gt;In Bio.jl DNA sequences are encoded using 4 bits per base by default in order to
store ambiguous nucleotides, and this encoding works well in most cases. However,
some biological sequences, such as chromosomal sequences, are very long, especially
for eukaryotic organisms, and the default DNA sequence type may waste
memory space. &lt;code class=&quot;highlighter-rouge&quot;&gt;ReferenceSequence&lt;/code&gt; is a new type introduced in Bio.jl that
compresses positions of ambiguous nucleotides using a sparse bit vector. This
type can achieve almost 2-bit encoding in common reference sequences because
most of the ambiguous nucleotides are clustered in a sequence and the number of
them is small compared to other unambiguous nucleotides.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Converting a DNASequence object to ReferenceSequence.
julia&amp;gt; ReferenceSequence(dna&quot;ACGT&quot;^10000)
40000nt Reference Sequence:
ACGTACGTACGTACGTACGTACGTACGTACGTACGTACG…CGTACGTACGTACGTACGTACGTACGTACGTACGTACGT

# Reading chromosome 1 of human from a FASTA file.
julia&amp;gt; open(first, FASTAReader{ReferenceSequence}, &quot;hg38.fa&quot;)
Bio.Seq.SeqRecord{Bio.Seq.ReferenceSequence,Bio.Seq.FASTAMetadata}:
  name: chr1
  sequence: NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN…NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
  metadata: Bio.Seq.FASTAMetadata(&quot;&quot;)

julia&amp;gt; sequence(ans)
248956422nt Reference Sequence:
NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN…NNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNNN
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
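
&lt;p&gt;The idea behind this kind of encoding can be illustrated with a toy packer that stores unambiguous bases in 2 bits each and records runs of &lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt; separately. This is only a sketch: the names &lt;code class=&quot;highlighter-rouge&quot;&gt;CODE&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;pack&lt;/code&gt; and the bit layout are assumptions made for this example, it keeps ranges of positions rather than the sparse bit vector that &lt;code class=&quot;highlighter-rouge&quot;&gt;ReferenceSequence&lt;/code&gt; uses, and it is not the Bio.jl implementation.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Assumed 2-bit codes for the four unambiguous nucleotides.
const CODE = Dict('A' =&amp;gt; 0x00, 'C' =&amp;gt; 0x01, 'G' =&amp;gt; 0x02, 'T' =&amp;gt; 0x03)

function pack(seq)
    words = zeros(UInt64, cld(length(seq), 32))   # 32 bases per 64-bit word
    nranges = UnitRange{Int}[]                    # runs of ambiguous N positions
    for (i, c) in enumerate(seq)
        if c == 'N'
            if !isempty(nranges) &amp;amp;&amp;amp; last(nranges[end]) == i - 1
                nranges[end] = first(nranges[end]):i   # extend the current run
            else
                push!(nranges, i:i)                    # start a new run
            end
        else
            shift = 2 * ((i - 1) % 32)
            words[div(i - 1, 32) + 1] |= UInt64(CODE[c]) &amp;lt;&amp;lt; shift
        end
    end
    return words, nranges
end

# Ten bases fit in one word; the two runs of N are stored as two ranges.
words, nranges = pack(&quot;NNNNACGTNN&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because ambiguous nucleotides tend to be clustered, the list of runs stays short, and the cost per base approaches 2 bits.&lt;/p&gt;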

&lt;h2 id=&quot;data-reader-and-writer-for-the-2bit-file-format&quot;&gt;Data reader and writer for the 2bit file format&lt;/h2&gt;

&lt;p&gt;2bit is a binary file format to store reference sequences. This is a kind of
binary counterpart of &lt;a href=&quot;https://en.wikipedia.org/wiki/FASTA_format&quot;&gt;FASTA&lt;/a&gt; but
specialized for DNA reference sequences to enable smaller file size and faster
loading. Reference sequences of various organisms are distributed from &lt;a href=&quot;http://hgdownload.soe.ucsc.edu/downloads.html&quot;&gt;the
download page of UCSC&lt;/a&gt; in this
file format. An important advantage of 2bit is that sequences are indexed by
name and can be accessed immediately.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Opening a sequence file of yeast (S.cerevisiae).
julia&amp;gt; reader = open(TwoBitReader, &quot;sacCer3.2bit&quot;);

# Loading chromosome VI using the random access index.
julia&amp;gt; reader[&quot;chrVI&quot;]
Bio.Seq.SeqRecord{Bio.Seq.ReferenceSequence,Array{UnitRange{Int64},1}}:
  name: chrVI
  sequence: GATCTCGCAAGTGCATTCCTAGACTTAATTCATATCTGC…GTGTGGTGTGTGGGTGTGGTGTGTGGGTGTGGTGTGTGG
  metadata: UnitRange{Int64}[]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2 id=&quot;data-reader-and-writer-for-the-sam-and-bam-file-formats&quot;&gt;Data reader and writer for the SAM and BAM file formats&lt;/h2&gt;

&lt;p&gt;The SAM and BAM file formats are designed for storing sequences aligned to
reference sequences. SAM is a line-oriented text file format and is easy to handle
with UNIX command line tools. BAM is a compressed binary version of SAM and is
suitable for storing data on disk and processing with purpose-built software
like &lt;a href=&quot;https://samtools.github.io/&quot;&gt;samtools&lt;/a&gt;. The BAM data reader is carefully
tuned so that users can use it in real analysis with large files. It is also
feasible to read a &lt;a href=&quot;http://www.ebi.ac.uk/ena/software/cram-toolkit&quot;&gt;CRAM&lt;/a&gt; file
by combining the BAM reader with the &lt;code class=&quot;highlighter-rouge&quot;&gt;samtools view&lt;/code&gt; command.&lt;/p&gt;

&lt;p&gt;An experimental feature is parallel processing using multiple threads.
Multi-threading support was introduced in Julia 0.5, and we use it to parallelize
the decompression of BAM files. Here is a simple benchmark script to show how
much the reading speed can be improved with multiple threads:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Bio.Align

# Count the number of mapped records.
function countmapped(reader)
    ret = 0
    record = BAMRecord()
    while !eof(reader)
        # in-place reading
        read!(reader, record)
        if ismapped(record)
            ret += 1
        end
    end
    return ret
end

println(open(countmapped, BAMReader, ARGS[1]))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;JULIA_NUM_THREADS&lt;/code&gt; environment variable controls the number of worker threads.
The result below shows that using two threads cuts the elapsed time from 29.27 to 17.40 seconds, a reduction of about 40%:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;~/.j/v/Bio $ time julia countmapped.jl SRR1238088.sort.bam
28040186
       29.27 real        28.64 user         0.66 sys
~/.j/v/Bio $ env JULIA_NUM_THREADS=2 time julia countmapped.jl SRR1238088.sort.bam
28040186
       17.40 real        32.31 user         0.63 sys
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2 id=&quot;package-to-handle-bgzf-files&quot;&gt;Package to handle BGZF files&lt;/h2&gt;

&lt;p&gt;BGZF (Blocked GZip Format) is a gzip-compliant file format commonly used in
bioinformatics. BGZF can be read using standard gzip tools but files in the
format are compressed block by block and special metadata are added to index the
compressed files for random access. BAM files are compressed in this file format
and sequence alignments in a specific genomic region can be retrieved
efficiently. BGZFStreams.jl is a new package that handles BGZF files like usual
I/O streams, and it is built on top of our
&lt;a href=&quot;https://github.com/BioJulia/Libz.jl&quot;&gt;Libz.jl&lt;/a&gt; package. The parallel decompression
mentioned above is implemented in this package.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using BGZFStreams

julia&amp;gt; stream = BGZFStream(&quot;/Users/kenta/.julia/v0.5/BGZFStreams/test/bar.bgz&quot;)
BGZFStreams.BGZFStream{IOStream}(&amp;lt;mode=read&amp;gt;)

julia&amp;gt; readstring(stream)
&quot;bar&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/BioJulia/BGZFStreams.jl&quot;&gt;https://github.com/BioJulia/BGZFStreams.jl&lt;/a&gt;&lt;/p&gt;
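
&lt;p&gt;Random access into a BGZF file is commonly expressed with a 64-bit virtual file
offset, as described in the SAM/BAM specification: the upper 48 bits hold the byte
offset of a compressed block in the file, and the lower 16 bits hold the offset inside
that block after decompression. The helpers below are a hypothetical sketch of that
packing, written for a recent Julia 1.x, and are not functions of BGZFStreams.jl.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helpers, not part of BGZFStreams.jl.
# block_offset:   byte offset of the compressed block in the file (below 2^48)
# inblock_offset: byte offset inside the decompressed block (below 2^16)
virtual_offset(block_offset, inblock_offset) =
    (UInt64(block_offset) &lt;&lt; 16) | UInt64(inblock_offset)

# Recover both parts from a virtual offset.
split_offset(v::UInt64) = (Int(v &gt;&gt; 16), Int(v &amp; 0xffff))

v = virtual_offset(1234, 56)
split_offset(v)  # (1234, 56)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;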

&lt;h2 id=&quot;sequence-demultiplexing-tool&quot;&gt;Sequence demultiplexing tool&lt;/h2&gt;

&lt;p&gt;Sequence demultiplexing is a technique to distinguish the origin of a sequence
using its artificially-attached “barcode” sequence. This is often used in a
preprocessing phase after &lt;a href=&quot;http://www.illumina.com/technology/next-generation-sequencing/multiplexing-sequencing-assay.html&quot;&gt;multiplexed
sequencing&lt;/a&gt;,
a common technique to sequence multiple samples simultaneously.  A barcode
sequence, however, may be corrupted due to sequencing error, and we need to find
the best matching barcode from a barcode set.  The demultiplexer algorithm
implemented in Bio.jl is based on a trie-like data structure, and efficiently
finds the optimal barcode from the prefix of a given DNA sequence.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Set DNA barcode pool.
julia&amp;gt; barcodes = DNASequence[&quot;ATGG&quot;, &quot;CAGA&quot;, &quot;GGAA&quot;, &quot;TACG&quot;];

# Create a sequence demultiplexer that allows at most one error.
julia&amp;gt; dplxr = Demultiplexer(barcodes, n_max_errors=1, distance=:hamming)
Bio.Seq.Demultiplexer{Bio.Seq.BioSequence{Bio.Seq.DNAAlphabet{4}}}:
  distance: hamming
  number of barcodes: 4
  number of correctable errors: 1

# Demultiplex a given sequence from its prefix.
julia&amp;gt; demultiplex(dplxr, dna&quot;ATGGCGNT&quot;)  # 1st barcode with no errors
(1,0)

julia&amp;gt; demultiplex(dplxr, dna&quot;CAGGCGNT&quot;)  # 2nd barcode with one error
(2,1)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
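
&lt;p&gt;To make the matching rule concrete, here is a naive stand-in that scans every
barcode instead of walking a trie. It is only a sketch under stated assumptions: plain
strings rather than Bio.jl sequence types, a recent Julia 1.x, Hamming distance only,
and a hypothetical function name and &lt;code class=&quot;highlighter-rouge&quot;&gt;(0, -1)&lt;/code&gt; sentinel for reads that match no barcode.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Number of mismatching positions between a barcode and a read prefix.
hamming(barcode, prefix) = count(a != b for (a, b) in zip(barcode, prefix))

# Linear scan over all barcodes (not the trie-like structure of Bio.jl).
# Returns (barcode index, distance), or (0, -1) if nothing is close enough.
function naive_demultiplex(barcodes, read; n_max_errors=1)
    best, bestdist = 0, -1
    for (i, bc) in enumerate(barcodes)
        d = hamming(bc, first(read, length(bc)))
        if d &lt;= n_max_errors &amp;&amp; (bestdist == -1 || d &lt; bestdist)
            best, bestdist = i, d
        end
    end
    return best, bestdist
end

barcodes = [&quot;ATGG&quot;, &quot;CAGA&quot;, &quot;GGAA&quot;, &quot;TACG&quot;]
naive_demultiplex(barcodes, &quot;ATGGCGNT&quot;)  # (1, 0)
naive_demultiplex(barcodes, &quot;CAGGCGNT&quot;)  # (2, 1)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The scan costs time proportional to the number of barcodes for every read, which is
the work a trie-like index avoids.&lt;/p&gt;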

&lt;h2 id=&quot;next-step&quot;&gt;Next step&lt;/h2&gt;

&lt;p&gt;This is still the first half of my project this year. The next term will come
with:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Supporting more file formats including GFF3, VCF and BCF.&lt;/li&gt;
  &lt;li&gt;Integration with databases.&lt;/li&gt;
  &lt;li&gt;Integration with genome browsers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;And, of course, improving existing features of Bio.jl and other packages. We
welcome any contributions and feature requests from you all. The &lt;a href=&quot;https://gitter.im/BioJulia/Bio.jl&quot;&gt;Gitter chat
channel&lt;/a&gt; is the best place to communicate
with developers and other users. If you love Julia and/or biology, is there any
reason not to join us?&lt;/p&gt;

&lt;h2 id=&quot;acknowledgements&quot;&gt;Acknowledgements&lt;/h2&gt;

&lt;p&gt;I gratefully acknowledge the Moore Foundation and the Julia project for
supporting the BioJulia project. I also would like to thank &lt;a href=&quot;https://github.com/Ward9250&quot;&gt;Ben J.
Ward&lt;/a&gt; and &lt;a href=&quot;https://github.com/kdmurray91&quot;&gt;Kevin
Murray&lt;/a&gt; for comments on my program code and other
contributions.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Graft.jl - General purpose graph analytics for Julia</title>
   <link href="http://julialang.org/blog/2016/08/GSoC2016-Graft"/>
   <updated>2016-08-22T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/08/GSoC2016-Graft</id>
   <content type="html">&lt;p&gt;This blog post describes my work on &lt;a href=&quot;https://github.com/pranavtbhat/Graft.jl&quot;&gt;Graft.jl&lt;/a&gt;, a general purpose graph analysis package for Julia. For those unfamiliar with graph algorithms, a quick &lt;a href=&quot;https://www.cl.cam.ac.uk/teaching/1011/PrincComm/slides/graph_theory_1-11.pdf&quot;&gt;introduction&lt;/a&gt; might help.&lt;/p&gt;

&lt;h1 id=&quot;proposal&quot;&gt;Proposal&lt;/h1&gt;
&lt;p&gt;My proposal, titled &lt;a href=&quot;https://github.com/pranavtbhat/Gsoc2016/blob/master/Proposal.md&quot;&gt;ParallelGraphs&lt;/a&gt;, was to develop a parallelized/distributed graph algorithms
library. However, in the first month or so, we decided to work towards a more general framework that supports data analysis on
networks (graphs with attributes defined on vertices and edges).
Our change in direction was mainly motivated by:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The challenges associated with distributed graph computations. &lt;a href=&quot;http://www.frankmcsherry.org/graph/scalability/cost/2015/01/15/COST.html&quot;&gt;This&lt;/a&gt;
blog post was an eye opener.&lt;/li&gt;
  &lt;li&gt;Only very large graphs, of the order of terabytes or petabytes, require distributed execution. Most useful graphs can be analyzed on a single compute node.&lt;/li&gt;
  &lt;li&gt;Multi-threading is under heavy development, and we decided to wait for the full multi-threaded programming model to be available.&lt;/li&gt;
  &lt;li&gt;As we looked at public datasets, we felt that the ability to combine graph theoretic analyses with real world data was the missing piece in Julia. &lt;a href=&quot;https://github.com/JuliaGraphs/LightGraphs.jl&quot;&gt;LightGraphs.jl&lt;/a&gt; already provides fast implementations for most graph algorithms, so we decided to target
graph data analysis.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The modified proposal could be summarized as the development of a package that supports:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Vertex and edge metadata : Key value pairs for vertices and edges.&lt;/li&gt;
  &lt;li&gt;Vertex labelling : Allow vertices to be referenced, externally, through arbitrary Julia types.&lt;/li&gt;
  &lt;li&gt;SQL like queries for edge data and metadata.&lt;/li&gt;
  &lt;li&gt;Compatibility with &lt;code class=&quot;highlighter-rouge&quot;&gt;LightGraphs&lt;/code&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;graft&quot;&gt;Graft&lt;/h1&gt;
&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;ParallelGraphs&lt;/code&gt; turned out to be a misnomer, since we were moving towards a more general purpose data analysis framework. So we chose the name &lt;code class=&quot;highlighter-rouge&quot;&gt;Graft&lt;/code&gt;, a kind of abbreviation for Graph Toolkit. The following sections detail &lt;code class=&quot;highlighter-rouge&quot;&gt;Graft's&lt;/code&gt; features:&lt;/p&gt;

&lt;h2 id=&quot;vertex-and-edge-metadata&quot;&gt;Vertex and Edge Metadata&lt;/h2&gt;
&lt;p&gt;Graphs often represent real-world entities and the relationships between them. Such entities (and their relationships) often have data attached to them.
While it is quite straightforward to store vertex data (a simple table will suffice), storing edges and their data is very tricky. The data should be structured on the
source and target vertices, should support random access and should be vectorized for queries.&lt;/p&gt;

&lt;p&gt;At first we tried placing the edge data in a SparseMatrixCSC. This turned out to be a bad idea, because sparse matrices are designed for numeric storage.
A simpler solution is to store edge metadata in a DataFrame, and have a SparseMatrixCSC map edges onto indices for the DataFrame. This strategy needed a lot less
code, and the benchmarks were more promising. Mutations such as the addition or removal of vertices and edges become more complicated, however.&lt;/p&gt;

&lt;h2 id=&quot;vertex-labelling&quot;&gt;Vertex Labelling&lt;/h2&gt;
&lt;p&gt;Most graph libraries do not support vertex labelling. It can be very confusing to refer to a vertex by its (often long) integer identifier. It is also
computationally expensive to use non-integer labels in the implementation of the package (any such implementation would involve dictionaries). There is no reason, however,
for the user to have to use integer labels externally. Graft supports two modes of vertex labelling. By default, a vertex is identified by its internal identifier. A user
can assign labels of any arbitrary Julia type to identify vertices, overriding the internal identifiers. This strategy, we feel, makes a reasonable compromise between
user experience and performance.&lt;/p&gt;

&lt;p&gt;If vertex labels were used in the internal implementation, the graph data structure would probably look like this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  Dict(
     &quot;Alice&quot; =&amp;gt; Dict(
        &quot;age&quot; =&amp;gt; 34,
        &quot;occupation&quot;  =&amp;gt; &quot;Doctor&quot;,
        &quot;adjacencies&quot; =&amp;gt; Dict(&quot;Bob&quot; =&amp;gt; Dict(&quot;relationship&quot; =&amp;gt; &quot;follow&quot;))
     ),
     &quot;Bob&quot; =&amp;gt; Dict(
        &quot;age&quot; =&amp;gt; 36,
        &quot;occupation&quot;  =&amp;gt; &quot;Software Engineer&quot;,
        &quot;adjacencies&quot; =&amp;gt; Dict(&quot;Charlie&quot; =&amp;gt; Dict(&quot;relationship&quot; =&amp;gt; &quot;friend&quot;))
     ),
     &quot;Charlie&quot; =&amp;gt; Dict(
        &quot;age&quot; =&amp;gt; 30,
        &quot;occupation&quot;  =&amp;gt; &quot;Lawyer&quot;,
        &quot;adjacencies&quot; =&amp;gt; Dict(&quot;David&quot; =&amp;gt; Dict(&quot;relationship&quot; =&amp;gt; &quot;follow&quot;))
     ),
     &quot;David&quot; =&amp;gt; Dict(
        &quot;age&quot; =&amp;gt; 29,
        &quot;occupation&quot; =&amp;gt; &quot;Athlete&quot;,
        &quot;adjacencies&quot; =&amp;gt; Dict(&quot;Alice&quot; =&amp;gt; Dict(&quot;relationship&quot; =&amp;gt; &quot;friend&quot;))
     )
  )
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Clearly, using labels internally is a very bad idea. Any sort of data access would set off multiple dictionary lookups. Instead, if a bidirectional map
could be used to translate labels into vertex identifiers and back, the number of dictionary lookups could be reduced to one. The data would also be better structured for query processing.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  # Label Map to resolve queries
  LabelMap(
     # Forward map : labels to vertex identifiers
     Dict(&quot;Alice&quot; =&amp;gt; 1, &quot;David&quot; =&amp;gt; 4, &quot;Charlie&quot; =&amp;gt; 3, &quot;Bob&quot; =&amp;gt; 2),

     # Reverse map : vertex identifiers to labels
     String[&quot;Alice&quot;, &quot;Bob&quot;, &quot;Charlie&quot;, &quot;David&quot;]
  )

  # Vertex DataFrame
  4×2 DataFrames.DataFrame
  │ Row │ age │ occupation          │
  ├─────┼─────┼─────────────────────┤
  │ 1   │ 34  │ &quot;Doctor&quot;            │
  │ 2   │ 36  │ &quot;Software Engineer&quot; │
  │ 3   │ 30  │ &quot;Lawyer&quot;            │
  │ 4   │ 29  │ &quot;Athlete&quot;           │

  # SparseMatrixCSC : maps edges onto indices into Edge DataFrame
  4×4 sparse matrix with 4 Int64 nonzero entries:
     [4, 1]  =  1
     [1, 2]  =  2
     [2, 3]  =  3
     [3, 4]  =  4

  # Edge DataFrame
  4×1 DataFrames.DataFrame
  │ Row │ relationship │
  ├─────┼──────────────┤
  │ 1   │ &quot;follow&quot;     │
  │ 2   │ &quot;friend&quot;     │
  │ 3   │ &quot;follow&quot;     │
  │ 4   │ &quot;friend&quot;     │
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
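
&lt;p&gt;A bidirectional map of this kind needs very little code. The following is a
minimal sketch for a recent Julia 1.x, not Graft's implementation; the type
&lt;code class=&quot;highlighter-rouge&quot;&gt;SimpleLabelMap&lt;/code&gt; and the functions &lt;code class=&quot;highlighter-rouge&quot;&gt;resolve&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;label_of&lt;/code&gt; are hypothetical names,
and a plain vector stands in for a DataFrame column.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical stand-in for the label map described above.
struct SimpleLabelMap{T}
    fmap::Dict{T,Int}   # forward map: label to vertex identifier
    rmap::Vector{T}     # reverse map: vertex identifier to label
end

# Assign identifiers 1..n in the order the labels are given.
function SimpleLabelMap(labels::Vector{T}) where {T}
    fmap = Dict{T,Int}()
    for (id, label) in enumerate(labels)
        fmap[label] = id
    end
    return SimpleLabelMap{T}(fmap, labels)
end

resolve(m::SimpleLabelMap, label) = m.fmap[label]   # one dictionary lookup
label_of(m::SimpleLabelMap, id::Int) = m.rmap[id]   # plain array indexing

m = SimpleLabelMap([&quot;Alice&quot;, &quot;Bob&quot;, &quot;Charlie&quot;, &quot;David&quot;])
ages = [34, 36, 30, 29]        # vertex property column, indexed by identifier
ages[resolve(m, &quot;Charlie&quot;)]    # 30
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;After the single lookup in &lt;code class=&quot;highlighter-rouge&quot;&gt;resolve&lt;/code&gt;, every further access is integer indexing into
columns, which is what keeps vectorized queries cheap.&lt;/p&gt;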

&lt;h2 id=&quot;sql-like-queries&quot;&gt;SQL Like Queries&lt;/h2&gt;
&lt;p&gt;Graft’s query notation is borrowed from &lt;a href=&quot;https://github.com/davidagold/jplyr.jl&quot;&gt;Jplyr&lt;/a&gt;. The &lt;code class=&quot;highlighter-rouge&quot;&gt;@query&lt;/code&gt; macro is used to simplify the query syntax, and
accepts a pipeline of abstractions separated by the pipe operator &lt;code class=&quot;highlighter-rouge&quot;&gt;|&amp;gt;&lt;/code&gt;. The stages are described through abstractions:&lt;/p&gt;

&lt;h3 id=&quot;eachvertex&quot;&gt;eachvertex&lt;/h3&gt;
&lt;p&gt;Accepts an expression that is run over every vertex. Vertex properties can be expressed using the dot notation. Some reserved properties are &lt;code class=&quot;highlighter-rouge&quot;&gt;v.id&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;v.label&lt;/code&gt;,
&lt;code class=&quot;highlighter-rouge&quot;&gt;v.adj&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;v.indegree&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;v.outdegree&lt;/code&gt;.
Examples:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  # Check if the user has overridden the default labels
  julia&amp;gt; @query(g |&amp;gt; eachvertex(v.id == v.label)) |&amp;gt; all

  # Kirchhoff's law :P
  julia&amp;gt; @query(g |&amp;gt; eachvertex(v.outdegree - v.indegree)) .== 0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;eachedge&quot;&gt;eachedge&lt;/h3&gt;
&lt;p&gt;Accepts an expression that is run over every edge. The symbol &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt; is used to denote
the source vertex, and &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; is used to denote the target vertex in the edge. The symbol &lt;code class=&quot;highlighter-rouge&quot;&gt;e&lt;/code&gt; is used to denote
the edge itself. Edge properties can be expressed through the dot notation. Some reserved properties are &lt;code class=&quot;highlighter-rouge&quot;&gt;e.source&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;e.target&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;e.mutualcount&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;e.mutual&lt;/code&gt;.
Examples:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  # Arithmetic expression on edge, source and target properties
  julia&amp;gt; @query g |&amp;gt; eachedge(e.p1 - s.p1 - t.p1)


  # Check if constituent vertices have the same outdegree
  julia&amp;gt; @query g |&amp;gt; eachedge(s.outdegree == t.outdegree)


  # Count the number of &quot;mutual friends&quot; between the source and target vertices in each edge
  julia&amp;gt; @query g |&amp;gt; eachedge(e.mutualcount)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;filter&quot;&gt;filter&lt;/h3&gt;
&lt;p&gt;Accepts vertex or edge expressions and computes subgraphs with a subset of vertices, or a subset
of edges, or both.
Examples:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  # Remove vertices where property p1 equals property p2
  @query g |&amp;gt; filter(v.p1 != v.p2)

  # Remove self loops from the graph
  @query g |&amp;gt; filter(e.source != e.target)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;select&quot;&gt;select&lt;/h3&gt;
&lt;p&gt;Returns a subgraph with a subset of vertex properties, or a subset of edge properties or both.
Examples:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;  # Preserve vertex properties p1, p2 and nothing else
  @query g |&amp;gt; select(v.p1, v.p2)

  # Preserve vertex property p1 and edge property p2
  @query g |&amp;gt; select(v.p1, e.p2)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h1 id=&quot;demonstration&quot;&gt;Demonstration&lt;/h1&gt;
&lt;p&gt;The typical workflow we hope to support with &lt;code class=&quot;highlighter-rouge&quot;&gt;Graft&lt;/code&gt; is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Load a graph from memory&lt;/li&gt;
  &lt;li&gt;Use the query abstractions to construct new vertex/edge properties or obtain subgraphs.&lt;/li&gt;
  &lt;li&gt;Run complex queries on the subgraphs, or export data to &lt;code class=&quot;highlighter-rouge&quot;&gt;LightGraphs&lt;/code&gt; and run computationally expensive algorithms there.&lt;/li&gt;
  &lt;li&gt;Bring the data back into &lt;code class=&quot;highlighter-rouge&quot;&gt;Graft&lt;/code&gt; as a new property, or use it to modify the graph's structure.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The following examples should demonstrate this workflow:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/pranavtbhat/Graft.jl/blob/master/examples/google%2B.ipynb&quot;&gt;Google+&lt;/a&gt;: This demo uses a real, somewhat large, dataset with plenty of text data.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/pranavtbhat/Graft.jl/blob/master/examples/baseball.ipynb&quot;&gt;Baseball Players&lt;/a&gt;: Two separate datasets spliced together, a table on baseball players
and a trust network. The resulting data is quite absurd, but does a good job of showing the quantitative queries Graft can run.&lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;future-work&quot;&gt;Future Work&lt;/h1&gt;
&lt;ul&gt;
  &lt;li&gt;Graph IO : Support more graph file formats.&lt;/li&gt;
  &lt;li&gt;Improve the query interface: The current pipelined macro based syntax has a learning curve, and the macro itself does some eval at runtime. We would like to move towards a cleaner, composable syntax that reads like regular Julia code.&lt;/li&gt;
  &lt;li&gt;New abstractions, such as Group-by, sort, and table output.&lt;/li&gt;
  &lt;li&gt;Database backends : An RDBMS can be used instead of the DataFrames. Or Graft can serve as a wrapper on a GraphDB such as Neo4j.&lt;/li&gt;
  &lt;li&gt;Integration with ComputeFramework for out of core processing. Support for parallelized IO, traversals and queries.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;More information can be found &lt;a href=&quot;https://github.com/pranavtbhat/Graft.jl/issues&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;acknowledgements&quot;&gt;Acknowledgements&lt;/h1&gt;
&lt;p&gt;This work was carried out as part of the Google Summer of Code program, under the guidance of mentors: &lt;a href=&quot;https://github.com/viralbshah&quot;&gt;Viral B Shah&lt;/a&gt; and &lt;a href=&quot;https://github.com/shashi&quot;&gt;Shashi Gowda&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Announcing support for complex-domain linear programs in Convex.jl</title>
   <link href="http://julialang.org/blog/2016/08/announcing-support-for-complex-domain-linear-programs-in-Convex.jl"/>
   <updated>2016-08-17T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/08/announcing-support-for-complex-domain-linear-programs-in-Convex.jl</id>
   <content type="html">&lt;p&gt;I am pleased to announce support for complex-domain linear programs (LPs) in Convex.jl. As one of the &lt;em&gt;Google Summer of Code&lt;/em&gt; students under &lt;em&gt;The Julia Language&lt;/em&gt;, I proposed to implement support for complex semidefinite programming. In the first phase of the project, I started by tackling complex-domain LPs: in the first subphase, I announced support for complex coefficients during &lt;a href=&quot;https://www.youtube.com/watch?v=fHG4uEOlMbY&quot;&gt;JuliaCon’16&lt;/a&gt;, and now I take this opportunity to announce support for complex variables in LPs.&lt;/p&gt;

&lt;p&gt;Complex-domain LPs consist of a real linear objective function, real linear inequality constraints, and real and complex linear equality constraints.&lt;/p&gt;

&lt;p&gt;In order to enable complex-domain LPs, we came up with these ideas:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;We redefined the &lt;strong&gt;conic_form!&lt;/strong&gt; of every affine atom to accept complex arguments.&lt;/li&gt;
  &lt;li&gt;Every complex variable z was internally represented as &lt;code class=&quot;highlighter-rouge&quot;&gt;z = z1 + i*z2&lt;/code&gt;, where z1 and z2 are real.&lt;/li&gt;
  &lt;li&gt;We introduced two new affine atoms &lt;strong&gt;real&lt;/strong&gt; and &lt;strong&gt;imag&lt;/strong&gt; which return the real and the imaginary parts of the complex variable respectively.&lt;/li&gt;
  &lt;li&gt;Because transpose and ctranspose behave differently on complex variables, a new atom &lt;strong&gt;CTransposeAtom&lt;/strong&gt; was created.&lt;/li&gt;
  &lt;li&gt;A complex equality constraint &lt;em&gt;RHS = LHS&lt;/em&gt; can be decomposed into two corresponding real equality constraints, &lt;em&gt;real(RHS) = real(LHS)&lt;/em&gt; and &lt;em&gt;imag(RHS) = imag(LHS)&lt;/em&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;After the above changes were made to the codebase, we wrote two use cases to demonstrate the usability and correctness of our idea, which I present below:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Importing Packages
Pkg.clone(&quot;https://github.com/Ayush-iitkgp/Convex.jl&quot;)
Pkg.checkout(&quot;Convex&quot;, &quot;gsoc2&quot;) # switch to the development branch
using Convex
 
# Complex LP with real variable
n = 10 # variable dimension (parameter)
m = 5 # number of constraints (parameter)
xo = rand(n)
A = randn(m,n) + im*randn(m,n)
b = A * xo 
# Declare a real variable
x = Variable(n)
p1 = minimize(sum(x), A*x == b, x&amp;gt;=0) 
# Notice A*x==b is complex equality constraint 
solve!(p1)
x1 = x.value

# Let's now solve by decomposing complex equality constraint into the corresponding real and imaginary part.
p2 = minimize(sum(x), real(A)*x == real(b), imag(A)*x==imag(b), x&amp;gt;=0)
solve!(p2)
x2 = x.value
x1==x2 # should return true


# Let's now consider an example using a complex variable
# Complex LP with complex variable
n = 10 # variable dimension (parameter)
m = 5 # number of constraints (parameter)
xo = rand(n)+im*rand(n)
A = randn(m,n) + im*randn(m,n)
b = A * xo

# Declare a complex variable
x = ComplexVariable(n)
p1 = minimize(real(sum(x)), A*x == b, real(x)&amp;gt;=0, imag(x)&amp;gt;=0)
solve!(p1)
x1 = x.value

xr = Variable(n)
xi = Variable(n)
p2 = minimize(sum(xr), real(A)*xr-imag(A)*xi == real(b), imag(A)*xr+real(A)*xi == imag(b), xr&amp;gt;=0, xi&amp;gt;=0)
solve!(p2)
x1== xr.value + im*xi.value # should return true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
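
&lt;p&gt;The second formulation in each example rests on splitting a complex equality into
its real and imaginary parts. That identity can be checked numerically without
Convex.jl; the snippet below is a small sanity check for a recent Julia 1.x with
arbitrary sizes chosen only for illustration.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using LinearAlgebra

# Arbitrary small sizes, chosen only for illustration.
m, n = 3, 4
A = randn(m, n) + im * randn(m, n)
x = rand(n) + im * rand(n)
b = A * x

# Real and imaginary parts of A*x written with real arithmetic only.
lhs_re = real(A) * real(x) - imag(A) * imag(x)
lhs_im = imag(A) * real(x) + real(A) * imag(x)

# Both comparisons hold up to floating-point rounding.
isapprox(lhs_re, real(b)) &amp;&amp; isapprox(lhs_im, imag(b))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The comparison uses a tolerance because the two sides are computed along different
floating-point paths.&lt;/p&gt;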

&lt;p&gt;The affine atoms are as follows:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;addition, subtraction, multiplication, division&lt;/li&gt;
  &lt;li&gt;indexing and slicing&lt;/li&gt;
  &lt;li&gt;k-th diagonal of a matrix&lt;/li&gt;
  &lt;li&gt;construct diagonal matrix&lt;/li&gt;
  &lt;li&gt;transpose and ctranspose&lt;/li&gt;
  &lt;li&gt;stacking&lt;/li&gt;
  &lt;li&gt;sum&lt;/li&gt;
  &lt;li&gt;trace&lt;/li&gt;
  &lt;li&gt;conv&lt;/li&gt;
  &lt;li&gt;real and imag&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Now, I am working towards implementing complex-domain second order conic programming. Meanwhile, I invite the Julia community to play around with the complex-domain LPs. The link to the development branch is &lt;a href=&quot;https://github.com/Ayush-iitkgp/Convex.jl/tree/gsoc2&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Looking forward to your suggestions!&lt;/p&gt;

&lt;p&gt;Special thanks to my mentors Madeleine Udell and Dvijotham Krishnamurthy!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>An invitation to JuliaCon 2016</title>
   <link href="http://julialang.org/blog/2016/05/juliacon-invitation"/>
   <updated>2016-05-08T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/05/juliacon-invitation</id>
   <content type="html">&lt;p&gt;&lt;em&gt;For the third year in a row we are happy to invite you to &lt;a href=&quot;http://juliacon.org/&quot;&gt;JuliaCon&lt;/a&gt;,
the annual meeting of the Julia programming language community.
JuliaCon 2016 will be held at the Massachusetts Institute of Technology from
June 21st to 25th and as a first, this year we will have several high-profile
keynote speakers, as well as the top-notch tutorials and talks you have come to
expect over the years.
Please &lt;a href=&quot;http://www.eventbrite.com/e/juliacon-2016-tickets-20943697162?ref=ebtnebregn&quot;&gt;purchase your tickets&lt;/a&gt; before May 13th to take advantage of the
early-bird pricing and we look forward to seeing you in June!&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;a href=&quot;http://juliacon.org/&quot;&gt;JuliaCon&lt;/a&gt;, just like the Julia language, has come a long way over
the last three years.
In 2014 we were roughly 75 attendees meeting in a medium-sized conference room
at the University of Chicago to great success, in 2015 we had about 225
attendees and enough content to cover four full days at the Massachusetts
Institute of Technology, and this year we hope that you will join us for the
greatest JuliaCon yet!&lt;/p&gt;

&lt;p&gt;From June 21st to the 25th JuliaCon 2016 will be held at the Massachusetts
Institute of Technology, for a full five days of Julia-related content.
On Tuesday 21st we will hold several &lt;a href=&quot;http://juliacon.org/workshops.html&quot;&gt;workshops&lt;/a&gt;, on topics ranging
from intermediate Julia programming to more advanced topics such as writing
high-performance and parallel programs.
From Wednesday 22nd to Friday 24th we will start each day with a keynote by a
high-profile speaker, followed by &lt;a href=&quot;http://juliacon.org/abstracts.html&quot;&gt;talks&lt;/a&gt; on a great variety of subjects:
macroeconomics, machine learning, astrophysics, visualisation, and more!
On Saturday 25th, the final day of the conference, we will hold a hackathon
where attendees are encouraged to team up based on personal interests to either
create new Julia projects or contribute to existing ones. All these details are
now in the &lt;a href=&quot;http://juliacon.org/pdf/juliacon2016poster3.pdf&quot;&gt;JuliaCon poster&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Without further ado, please allow us to introduce our keynote speakers:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Timothy E. Holy&lt;/strong&gt; is an Associate Professor of Neuroscience at Washington
  University in St. Louis.
  In 2009 he received the NIH Director’s Pioneer award for innovations in
  optics and microscopy.
  His lab, which studies how the brain detects pheromones and develops new
  optical methods for imaging neuronal activity, was one of the first to adopt
  Julia for scientific research.
  He is a long time Julia contributor and a lead developer of Julia’s
  multidimensional array capabilities as well as the author of far too many
  Julia packages.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Thomas J. Sargent&lt;/strong&gt; is a Professor of Economics at New York University and
  Senior Fellow at the Hoover Institution.
  In 2011 the Royal Swedish Academy of Sciences awarded him the Nobel Memorial
  Prize in Economic Sciences for his work on macroeconomics.
  Together with John Stachurski, he founded quant-econ.net, a Julia and Python
  based learning platform for quantitative economics focusing on algorithms
  and numerical methods for studying economic problems as well as coding
  skills.&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Guy L. Steele, Jr.&lt;/strong&gt; is a Software Architect for Oracle Labs and Principal
  Investigator of the Programming Language Research project.
  In 1994, he was made a fellow of the Association for Computing Machinery
  after receiving the Grace Murray Hopper Award in 1988.
  He is an experienced designer of programming languages, like Scheme,
  Fortress and Java, and many of his ideas have had an impact on the design
  of Julia.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We hope that our invitation entices you to join us – new, intermediate, and
experienced Julia users – for five days of fun at MIT this June and remember to
&lt;a href=&quot;http://www.eventbrite.com/e/juliacon-2016-tickets-20943697162?ref=ebtnebregn&quot;&gt;purchase your tickets&lt;/a&gt; before May 13th to receive a 33% early-bird
discount!&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;We need your help to spread this message far and wide! Post the
&lt;a href=&quot;http://juliacon.org/pdf/juliacon2016poster3.pdf&quot;&gt;JuliaCon poster&lt;/a&gt; and
this blog post to your local email lists. Print the poster and post it
on your local message board. In addition, please tweet, retweet, post
on Facebook and LinkedIn and other social media. This is the biggest
JuliaCon ever, and we need your help in making it a huge success.&lt;/p&gt;

</content>
 </entry>
 
 <entry>
   <title>BioJulia Project in 2016</title>
   <link href="http://julialang.org/blog/2016/04/biojulia2016"/>
   <updated>2016-04-30T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/04/biojulia2016</id>
   <content type="html">&lt;p&gt;I am pleased to announce that the next phase of BioJulia is starting! In the next several months, I’m going to implement many crucial features for bioinformatics that will motivate you to use Julia and BioJulia libraries in your work. But before going into the details of the project, let me briefly introduce the BioJulia project. This project is supported by the Moore Foundation and the Julia project.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/BioJulia&quot;&gt;The BioJulia project&lt;/a&gt; is a collaborative open source project to create an infrastructure for bioinformatics in the Julia programming language. It aims to provide fast and accessible software libraries. Julia’s Just-In-Time (JIT) compiler makes this ambitious goal achievable without resorting to other compiled languages like C/C++. The central package developed under the project is &lt;a href=&quot;https://github.com/BioJulia/Bio.jl&quot;&gt;Bio.jl&lt;/a&gt;, which provides fundamental features including biological symbols/sequences, file format parsers, alignment algorithms, wrappers for external software, etc. It also supports several common file formats such as FASTA, FASTQ, BED, PDB, and so on. Last year, as a Julia Summer of Code (JSoC) student, I made the &lt;a href=&quot;https://github.com/BioJulia/FMIndexes.jl&quot;&gt;FMIndexes.jl&lt;/a&gt; package to build a full-text search index for large genomes, and we released the first development version of Bio.jl. While the BioJulia project is getting more active and the number of contributors is growing, we still lack some important features for realistic applications. Filling in the gaps between our current libraries and actual use cases is the purpose of my new project.&lt;/p&gt;

&lt;p&gt;So, what will be added in it? Here is the summary of my plan:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Sequence analysis:
    &lt;ul&gt;
      &lt;li&gt;Online sequence search algorithms&lt;/li&gt;
      &lt;li&gt;Data structure for reference genomes&lt;/li&gt;
      &lt;li&gt;Error-correcting algorithms for DNA barcodes&lt;/li&gt;
      &lt;li&gt;Parsers for BAM and CRAM file formats&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
  &lt;li&gt;Integration with data viewers and databases:
    &lt;ul&gt;
      &lt;li&gt;Genome browser backend&lt;/li&gt;
      &lt;li&gt;Parsers for GFF3 and VCF/BCF&lt;/li&gt;
      &lt;li&gt;Database access through web APIs&lt;/li&gt;
    &lt;/ul&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These things are of crucial importance for writing analysis programs because they connect software components (e.g. programs, archives, databases, viewers, etc.); data analysis programs in bioinformatics usually read and write formatted data from and to each other. The figure below shows a common workflow for detecting genetic variants; the underlined deliverables will connect software, archives and databases so that you can write your analysis software in the Julia language.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/blog/2016-04-30-biojulia/schema.png&quot; alt=&quot;schema&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;sequence-analysis&quot;&gt;Sequence Analysis&lt;/h2&gt;

&lt;p&gt;The online sequence search algorithms will come in three flavors: exact, approximate, and regular expression search. Exact sequence search means finding the positions where a query sequence matches another sequence exactly. Approximate search is similar but allows up to a specified number of errors: mismatches, insertions, and deletions. Regular expression search accepts a query written as a regular expression, which enables flexible description of a query pattern like &lt;a href=&quot;https://en.wikipedia.org/wiki/Sequence_motif&quot;&gt;motifs&lt;/a&gt;. For these algorithms, there are already half-done pull requests I’m working on: &lt;a href=&quot;https://github.com/BioJulia/Bio.jl/pull/152&quot;&gt;#152&lt;/a&gt;, &lt;a href=&quot;https://github.com/BioJulia/Bio.jl/pull/153&quot;&gt;#153&lt;/a&gt;, &lt;a href=&quot;https://github.com/BioJulia/Bio.jl/pull/143&quot;&gt;#143&lt;/a&gt;.&lt;/p&gt;
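
&lt;p&gt;As a rough illustration of the first two flavors, here is a minimal sketch on plain ASCII strings. It is not the Bio.jl API: &lt;code class=&quot;highlighter-rouge&quot;&gt;exact_positions&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;hamming_positions&lt;/code&gt; are hypothetical names, the code assumes Julia 1.x syntax, and the approximate variant counts only mismatches (Hamming distance), leaving out insertions and deletions for brevity.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helpers for illustration only; not part of Bio.jl.
# Both assume ASCII input, so byte offsets equal sequence positions.

# Start positions where `query` occurs exactly in `seq`.
function exact_positions(query::AbstractString, seq::AbstractString)
    n = ncodeunits(query)
    stop = ncodeunits(seq) - n + 1
    return [i for i in 1:stop if seq[i:i+n-1] == query]
end

# Start positions where `query` matches with at most `k` mismatches.
function hamming_positions(query::AbstractString, seq::AbstractString, k::Int)
    n = ncodeunits(query)
    stop = ncodeunits(seq) - n + 1
    hits = Int[]
    for i in 1:stop
        # Compare byte by byte and count the positions that differ.
        mismatches = count(codeunit(seq, i + j - 1) != codeunit(query, j) for j in 1:n)
        if mismatches in 0:k
            push!(hits, i)
        end
    end
    return hits
end

exact_positions(&quot;ACG&quot;, &quot;TACGACGT&quot;)        # [2, 5]
hamming_positions(&quot;ACG&quot;, &quot;TACGACTT&quot;, 1)   # [2, 5]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Both helpers scan every start position, which is fine for a sketch; a production search would use a smarter algorithm.&lt;/p&gt;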

&lt;p&gt;After the last release of Bio.jl v0.1.0, the sequence data structure has been significantly rewritten to make biological sequence types coherent and extensible. But because we chose an encoding that requires 4 bits per base to represent DNA sequences, the DNA sequence type consumes more memory than necessary to store a &lt;a href=&quot;https://en.wikipedia.org/wiki/Reference_genome&quot;&gt;reference genome&lt;/a&gt;, which is usually composed of four kinds of DNA nucleotides (denoted by A/C/G/T) and a relatively small number of consecutive undetermined nucleotides (denoted by N). After trying some data structures, I found that the memory used for N positions can be reduced dramatically using &lt;a href=&quot;https://github.com/BioJulia/IndexableBitVectors.jl&quot;&gt;IndexableBitVectors.jl&lt;/a&gt;, which is a package I created in &lt;a href=&quot;http://julialang.org/blog/2015/10/biojulia-sequence-analysis/&quot;&gt;JSoC 2015&lt;/a&gt;. I’m developing a separate package for reference genomes, &lt;a href=&quot;https://github.com/BioJulia/ReferenceSequences.jl&quot;&gt;ReferenceSequences.jl&lt;/a&gt;, and am going to improve its functionality and performance to handle huge genomes like the human genome.&lt;/p&gt;
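
&lt;p&gt;The idea of spending 2 bits on A/C/G/T and recording N positions separately can be sketched as follows. This is only an illustration under stated assumptions, not how ReferenceSequences.jl is implemented: the &lt;code class=&quot;highlighter-rouge&quot;&gt;PackedSeq&lt;/code&gt; type is hypothetical, the code assumes Julia 1.x, and a plain &lt;code class=&quot;highlighter-rouge&quot;&gt;BitVector&lt;/code&gt; stands in for the more compact bit vector that IndexableBitVectors.jl provides.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical type for illustration only.
const BASES = ['A', 'C', 'G', 'T']      # 2-bit codes 0, 1, 2, 3

struct PackedSeq
    data::Vector{UInt64}    # 32 bases per 64-bit word
    nmask::BitVector        # true where the base is N
end

function PackedSeq(s::AbstractString)
    n = length(s)
    data = zeros(UInt64, cld(n, 32))
    nmask = falses(n)
    for (i, c) in enumerate(s)
        if c == 'N'
            nmask[i] = true             # `data` keeps code 0 at this position
        else
            code = UInt64(findfirst(isequal(c), BASES) - 1)
            word, slot = divrem(i - 1, 32)
            data[word + 1] |= code &amp;lt;&amp;lt; (2 * slot)
        end
    end
    return PackedSeq(data, nmask)
end

function Base.getindex(p::PackedSeq, i::Int)
    if p.nmask[i]
        return 'N'
    end
    word, slot = divrem(i - 1, 32)
    code = (p.data[word + 1] &amp;gt;&amp;gt; (2 * slot)) &amp;amp; 0x03
    return BASES[Int(code) + 1]
end

p = PackedSeq(&quot;ACGTNNAC&quot;)
join(p[i] for i in 1:8)                 # &quot;ACGTNNAC&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With a plain mask this costs 3 bits per base; the saving described above comes from storing the runs of N in a much smaller structure than one bit per position.&lt;/p&gt;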

&lt;p&gt;If you are a researcher or an engineer who handles high-throughput sequencing data, BAM and CRAM parsers are probably the most eagerly awaited additions on the list. BAM is the &lt;em&gt;de facto&lt;/em&gt; standard file format for aligned sequences, and most sequence mappers generate alignments in this format. CRAM is a storage-efficient alternative to BAM and is gaining popularity as the volume of accumulated sequence data explodes. Since these files contain massive amounts of DNA sequences from high-throughput sequencing machines, high-speed parsing is a practical necessity. I’m going to concentrate on speed through careful tuning and multi-threaded parallel computation, which is planned to be introduced in the next Julia release.&lt;/p&gt;

&lt;h2 id=&quot;integration-with-data-viewers-and-databases&quot;&gt;Integration with Data Viewers and Databases&lt;/h2&gt;

&lt;p&gt;Genome browsers make it possible to interactively visualize genetic features found in
individuals and/or populations. For example, using the UCSC Genome Browser, you can investigate genetic regions along with sequence annotations &lt;a href=&quot;https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&amp;amp;lastVirtModeType=default&amp;amp;lastVirtModeExtraState=&amp;amp;virtModeType=default&amp;amp;virtMode=0&amp;amp;nonVirtPosition=&amp;amp;position=chr9%3A133206569-133324246&amp;amp;hgsid=491214673_Ob3A4L4zTLibsCuyq7xgazU3Goqg&quot;&gt;around the ABO gene&lt;/a&gt; in a window. The genome browser is one of the most common visualization tools, so many implementations have been developed, but unfortunately there is no standardized interface. So, we will need to select a promising one that is open source and supports interaction with other software. The first candidate is &lt;a href=&quot;http://jbrowse.org/&quot;&gt;JBrowse&lt;/a&gt;, which is built with modern JavaScript and HTML5 technologies. It also supports RESTful APIs and hence it can fetch data from a backend server via HTTP. I’m planning to make an API server that responds to queries from a genome browser to interactively visualize in-memory data.&lt;/p&gt;

&lt;p&gt;Many databases distribute their data in standardized file formats. For genetic annotations and variants, GFF3 and VCF are probably the most common formats. If you are using data from human or mouse, you should know that various annotations are available from &lt;a href=&quot;http://www.gencodegenes.org/&quot;&gt;the GENCODE project&lt;/a&gt;. It offers data in GTF or GFF3 file formats. NCBI provides human variation sets in the VCF file format &lt;a href=&quot;http://www.ncbi.nlm.nih.gov/variation/docs/human_variation_vcf/&quot;&gt;here&lt;/a&gt;. These file formats are text, so you may think it is trivial to write parsers when you need them. That is partially true, if you don’t care about completeness and performance. Parsing a text file format in a naive way (for example, using &lt;code class=&quot;highlighter-rouge&quot;&gt;split&lt;/code&gt; on a line with a tab character) allocates many temporary objects and often degrades performance, while careful tuning of a parser leads to complicated code that is hard to maintain. &lt;a href=&quot;https://github.com/dcjones&quot;&gt;@dcjones&lt;/a&gt; took on this problem and did great work adding Julia support to &lt;a href=&quot;http://www.colm.net/open-source/ragel/&quot;&gt;Ragel&lt;/a&gt;, which generates Julia code that executes a finite state machine. Daniel’s talk at JuliaCon 2015 covers the details if you are interested:&lt;/p&gt;
&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/sQvNNj3MthQ&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
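
&lt;p&gt;To illustrate the cost of the naive approach, the sketch below pulls a single field out of a tab-delimited line by scanning for tabs, so it never builds the intermediate array of fields that &lt;code class=&quot;highlighter-rouge&quot;&gt;split&lt;/code&gt; allocates. The helper &lt;code class=&quot;highlighter-rouge&quot;&gt;nth_field&lt;/code&gt; is a hypothetical name, the sample line is made up, and the code assumes Julia 1.x; it is not how Ragel-generated parsers work, only a small example of avoiding temporaries.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper for illustration only.
# Returns the n-th tab-separated field as a SubString that shares the line's memory.
function nth_field(line::AbstractString, n::Int)
    start = firstindex(line)
    for _ in 1:(n - 1)
        tab = findnext(isequal('\t'), line, start)
        if tab === nothing
            error(&quot;line has fewer than $n fields&quot;)
        end
        start = nextind(line, tab)          # first character after the tab
    end
    tab = findnext(isequal('\t'), line, start)
    stop = tab === nothing ? lastindex(line) : prevind(line, tab)
    return SubString(line, start, stop)
end

line = &quot;chr1\tEXAMPLE\tgene\t11869\t14409&quot;   # made-up record
nth_field(line, 4)                      # &quot;11869&quot;
split(line, '\t')[4]                    # same value, but allocates all five fields
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;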

&lt;p&gt;Sometimes you may need only part of the data provided by a database. In such a case, web-based APIs are handy for fetching the necessary data on demand. &lt;a href=&quot;http://www.biomart.org/&quot;&gt;BioMart Central Portal&lt;/a&gt; offers a unified access point to a range of biological databases that is programmatically accessible via REST and SOAP APIs. A Julia wrapper for BioMart will make it much easier to access data by automatically converting responses to Julia objects. In the R language, the &lt;a href=&quot;https://bioconductor.org/packages/release/bioc/html/biomaRt.html&quot;&gt;biomaRt&lt;/a&gt; package is one of the most downloaded Bioconductor packages: &lt;a href=&quot;https://www.bioconductor.org/packages/stats/&quot;&gt;https://www.bioconductor.org/packages/stats/&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;try-biojulia&quot;&gt;Try BioJulia!&lt;/h2&gt;

&lt;p&gt;We need users and collaborators for our libraries. Feedback from real-world users is the most valuable input for improving the quality of our libraries. We welcome feature requests and discussions that will make bioinformatics easier and faster. Tools for phylogenetics and structural biology, which I didn’t mention in this post, are also under active development. You can post issues here: &lt;a href=&quot;https://github.com/BioJulia/Bio.jl/issues&quot;&gt;https://github.com/BioJulia/Bio.jl/issues&lt;/a&gt;; if you want to get in touch with us more casually, this Gitter room may be more convenient: &lt;a href=&quot;https://gitter.im/BioJulia/Bio.jl&quot;&gt;https://gitter.im/BioJulia/Bio.jl&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Google Summer of Code 2016</title>
   <link href="http://julialang.org/blog/2016/04/gsoc"/>
   <updated>2016-04-14T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/04/gsoc</id>
   <content type="html">&lt;p&gt;We’re pleased to announce that the Julia Language is taking part in this year’s &lt;a href=&quot;https://summerofcode.withgoogle.com&quot;&gt;Google Summer of Code&lt;/a&gt;. This means that interested students will have the opportunity to spend their summers getting paid to write code on a project of their choice.&lt;/p&gt;

&lt;p&gt;Student applications are open from &lt;strong&gt;March 14th – 25th&lt;/strong&gt; on the SoC website, but there’s no reason not to get going right away! To get you started thinking about what you’d like to work on, there are a bunch of interesting projects on our &lt;a href=&quot;http://julialang.org/soc/ideas-page.html&quot;&gt;ideas page&lt;/a&gt;. At this stage, it’s also a good idea to start getting involved with the community around your area of interest by opening issues, sending PRs and speaking to developers on relevant packages. Finding a good mentor for your project will be a big help for most applications, and showing mentors your enthusiasm is a great way to get them on board. Once you’re ready to start writing an application, check out our &lt;a href=&quot;http://julialang.org/soc/guidelines/&quot;&gt;guidelines&lt;/a&gt;, which give some hints on what to include.&lt;/p&gt;

&lt;p&gt;To give an idea of the kind of projects we’d like to support:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Parallel and distributed computing&lt;/li&gt;
  &lt;li&gt;Support for data science and analysis&lt;/li&gt;
  &lt;li&gt;Compiler optimisations and work on Julia on Android&lt;/li&gt;
  &lt;li&gt;Numerical and scientific computing – ODEs, matrix library functions, optimisation…&lt;/li&gt;
  &lt;li&gt;IDEs, tooling and 2D/3D visualisation&lt;/li&gt;
  &lt;li&gt;GPUs for graphics and numerical computing&lt;/li&gt;
  &lt;li&gt;Web tooling and networking&lt;/li&gt;
  &lt;li&gt;… and much more.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We welcome involvement in our summer frivolities even if you’re not a student. Firstly, if you happen to know any students, please let them know! We’d also like to encourage people to step up as mentors, so if you’re interested then please contact us (see below) and let us know what areas you’d like to help with. Please also feel free to give technical feedback on proposals that come up on our mailing lists.&lt;/p&gt;

&lt;p&gt;The primary point of contact for the community is our mailing list, julia-users@googlegroups.com. For more administrative questions you can also reach out to us privately at juliasoc@googlegroups.com. Feel free to start discussions about projects and ideas, although note that it’s easier for us to answer broad questions about the process than to give specific technical feedback.&lt;/p&gt;

&lt;p&gt;Our participation in previous years has resulted in some great projects, so we’re really looking forward to working with you this year and seeing what you can do. Good luck!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Generalizing AbstractArrays: opportunities and challenges</title>
   <link href="http://julialang.org/blog/2016/03/arrays-iteration"/>
   <updated>2016-03-27T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/03/arrays-iteration</id>
   <content type="html">&lt;h1 id=&quot;introduction-generic-algorithms-with-abstractarrays&quot;&gt;Introduction: generic algorithms with AbstractArrays&lt;/h1&gt;

&lt;p&gt;Somewhat unusually, this blog post is future-looking: it mostly
focuses on things that don’t yet exist. Its purpose is to lay out the
background for community discussion about possible changes to the core
API for &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;s, and serves as background reading and
reference material for a more focused “julep” (a julia enhancement
proposal).  Here, often I’ll use the shorthand “array” to mean
&lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;, and use &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; if I explicitly mean julia’s concrete
&lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; type.&lt;/p&gt;

&lt;p&gt;As the reader is likely aware, in julia it’s possible to write
algorithms for which one or more inputs are only assumed to be
&lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;s.  This is “generic” code, meaning it should work
(i.e., produce a correct result) on any specific concrete array type.
In an ideal world—which julia approaches rather well in many
cases—generality of code should not have a negative impact on its
performance: a generic implementation should be approximately as fast
as one restricted to specific array type(s).  This implies that
generic algorithms should be written using lower-level operations that
give good performance across a wide variety of array types.&lt;/p&gt;
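
&lt;p&gt;As a small illustration of what “generic” means here, the sketch below defines one function and applies it to three different array types. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;sumsq&lt;/code&gt; is hypothetical, and the code assumes Julia 1.x (where &lt;code class=&quot;highlighter-rouge&quot;&gt;view&lt;/code&gt; creates a &lt;code class=&quot;highlighter-rouge&quot;&gt;SubArray&lt;/code&gt;), which is newer than the release current when this post was written.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical example; written for Julia 1.x.
# One definition, valid for any AbstractArray with numeric elements.
function sumsq(A::AbstractArray)
    s = zero(eltype(A))
    for x in A               # plain iteration works for every array type
        s += x * x
    end
    return s
end

sumsq([1.0, 2.0, 3.0])           # Array: 14.0
sumsq(1:3)                       # range, values computed on demand: 14
sumsq(view([1 2; 3 4], :, 2))    # view of the second column: 20
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;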

&lt;p&gt;Providing efficient low-level operations is a different kind of design
challenge than one experiences with programming languages that
“vectorize” everything.  When successful, it promotes much greater
reuse of code, because efficient, generic low-level parts allow you to
write a wide variety of efficient, generic higher-level functions.&lt;/p&gt;

&lt;p&gt;Naturally, the more diverse our array types become, the more careful we
have to be about our abstractions for these low-level operations.&lt;/p&gt;

&lt;h1 id=&quot;examples-of-arrays&quot;&gt;Examples of arrays&lt;/h1&gt;

&lt;p&gt;In discussing general operations on arrays, it’s useful to have a
diverse collection of concrete arrays in mind.&lt;/p&gt;

&lt;p&gt;In core julia, some types we support fairly well are:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt;: the prototype for all arrays&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Range&lt;/code&gt;s: a good example of what I often consider a “computed”
array, where essentially none of the values are stored in
memory. Since there is no storage, these are immutable containers:
you can’t set values in individual slots.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;BitArray&lt;/code&gt;s: arrays that can only store 0 or 1 (&lt;code class=&quot;highlighter-rouge&quot;&gt;false&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;true&lt;/code&gt;),
and for which the internal storage is packed so that each entry
requires only one bit.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;SubArray&lt;/code&gt;s: the problems this type introduced, and the resolution
we adopted, probably serve as the best model for the
generalizations considered here. Therefore, this case is discussed
in greater detail below.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Another important class of array types in Base are sparse arrays:
&lt;code class=&quot;highlighter-rouge&quot;&gt;SparseMatrixCSC&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;SparseVector&lt;/code&gt;, as well as other sparse
representations like &lt;code class=&quot;highlighter-rouge&quot;&gt;Diagonal&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;Bidiagonal&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;Tridiagonal&lt;/code&gt;.
These are good examples of array types where access patterns deserve
careful thought. Notably, despite many commonalities in “strategy”
among the 5 or so sparse parametrizations we have, implementations of
core algorithms (e.g., matrix multiplication) are specialized for each
sparse-like type—in other words, these mimic the “high level
vectorized functions” strategy common to other languages. What we lack
is a “sparse iteration API” that lets you write the main algorithms of
sparse linear algebra efficiently in a generic way.  Our current model
is probably fine for &lt;code class=&quot;highlighter-rouge&quot;&gt;SparseLike*Dense&lt;/code&gt; operations, but gets to be
harder to manage if you want to efficiently compute, e.g., &lt;code class=&quot;highlighter-rouge&quot;&gt;Bidiagonal*SparseMatrixCSC&lt;/code&gt;: the number of possible combinations you have to
support grows rapidly with more sparse types, and thus represents a
powerful incentive for developing efficient, generic low-level
operations.&lt;/p&gt;

&lt;p&gt;Outside of Base, there are some other mind-stretching examples of
arrays, including:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrames&lt;/code&gt;: indexing arrays with symbols rather than
integers. Other related types include &lt;code class=&quot;highlighter-rouge&quot;&gt;NamedArrays&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;AxisArrays&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Interpolations&lt;/code&gt;: indexing arrays with non-integer floating-point
numbers&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;DistributedArrays&lt;/code&gt;: another great example of a case in which you
need to think through access patterns carefully&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;subarrays-a-case-study&quot;&gt;SubArrays: a case study&lt;/h1&gt;

&lt;p&gt;For arrays of fixed dimension, one can write algorithms that index
arrays as &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j,k,...]&lt;/code&gt; (good examples can be found in our linear
algebra code, where everything is a vector or matrix).  For algorithms
that have to support arbitrary dimensionality, for a long time our
fallback was linear indexing, &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i]&lt;/code&gt; for integer &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt;. However, in
general SubArrays cannot be efficiently accessed by a linear index
because it results in call(s) to &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt; is slow. This is a
CPU problem, not a Julia-specific problem. The slowness of &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt; is
still true despite the &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/15357&quot;&gt;recent addition of
infrastructure&lt;/a&gt; to make
it much faster—now one can make it merely “really bad” rather than
&lt;a href=&quot;https://en.wikipedia.org/wiki/Alexander_and_the_Terrible,_Horrible,_No_Good,_Very_Bad_Day&quot;&gt;“Terrible, Horrible, No Good, and Very
Bad”&lt;/a&gt;.&lt;/p&gt;
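
&lt;p&gt;To see where the division comes from, consider what a linear index into a two-dimensional view has to be turned into before the parent array can be accessed. The sketch below does that conversion by hand; &lt;code class=&quot;highlighter-rouge&quot;&gt;linear_to_ij&lt;/code&gt; is a hypothetical helper rather than Base’s actual implementation, and the code assumes Julia 1.x.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: turn a linear index k into (i, j) for a matrix with m rows.
function linear_to_ij(k::Int, m::Int)
    j, i = divrem(k - 1, m)    # one integer division per element access
    return i + 1, j + 1
end

A = reshape(collect(1:12), 3, 4)
V = view(A, 1:2, 2:3)                  # 2x2 view; its elements are not contiguous in A
i, j = linear_to_ij(3, size(V, 1))     # (1, 2)
V[3] == V[i, j] == A[1, 3]             # true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;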

&lt;p&gt;The way we (largely) resolved this problem was to make it possible to
do cartesian indexing, &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j,k,...]&lt;/code&gt;, for arrays of arbitrary
dimensionality (the &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt; type).  To leverage this in
practical code, we also had to extend our iterators with the &lt;code class=&quot;highlighter-rouge&quot;&gt;for I in
eachindex(A)&lt;/code&gt; construct.  This allows one to select an iterator that
optimizes the efficiency of access to elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;.  In generic
algorithms, the performance gains were not small, sometimes on the
scale of ten- to fifty-fold.  These types were described in a
&lt;a href=&quot;http://julialang.org/blog/2016/02/iteration&quot;&gt;previous blog post&lt;/a&gt;.&lt;/p&gt;
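
&lt;p&gt;A minimal sketch of this style is shown below. The function &lt;code class=&quot;highlighter-rouge&quot;&gt;double!&lt;/code&gt; is a hypothetical example, and the code assumes Julia 1.x, where &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt; on a view like this one yields &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt; values.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical example; written for Julia 1.x.
# Doubles every element in place, using whichever index type suits A best.
function double!(A::AbstractArray)
    for I in eachindex(A)    # integers for Array, CartesianIndex for most views
        A[I] *= 2
    end
    return A
end

A = reshape(collect(1:12), 3, 4)
V = view(A, 1:2, 2:3)
double!(V)                           # writes through to A
A[1, 2]                              # 8 (was 4)
A[3, 2]                              # 6 (outside the view, unchanged)
eachindex(V) isa CartesianIndices    # true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;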

&lt;p&gt;To my knowledge, this approach has given Julia one of the most
flexible yet efficient “array view” types in any programming language.
Many languages base views on array &lt;em&gt;strides&lt;/em&gt;, meaning situations in
which the memory offset is regular along each dimension.  Among other
things, this requires that the underlying array is dense.  In
contrast, in Julia we can easily handle non-strided arrays (e.g.,
sampling at &lt;code class=&quot;highlighter-rouge&quot;&gt;[1,3,17,428,...]&lt;/code&gt; along one dimension, or creating a view
of a &lt;code class=&quot;highlighter-rouge&quot;&gt;SparseMatrixCSC&lt;/code&gt;).  We can also handle arrays for which there is
no underlying storage (e.g., &lt;code class=&quot;highlighter-rouge&quot;&gt;Range&lt;/code&gt;s).  Being able to do this with a
common infrastructure is part of what makes different optimized array
types useful in generic programming.&lt;/p&gt;
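
&lt;p&gt;Two of these cases can be tried directly. The snippet below is a sketch that assumes Julia 1.x, where &lt;code class=&quot;highlighter-rouge&quot;&gt;view&lt;/code&gt; is the function that creates such views; the array contents are arbitrary.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Illustrative only; written for Julia 1.x.
A = reshape(collect(1:20), 5, 4)
V = view(A, [1, 2, 5], :)     # rows 1, 2 and 5: no constant stride along dimension 1
V[3, 4] == A[5, 4]            # true
size(V)                       # (3, 4)

R = view(1:100, [3, 17, 42])  # view of a range, which stores no elements
collect(R)                    # [3, 17, 42]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;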

&lt;p&gt;It’s also worth pointing out some problems:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;Most importantly, it requires that one adopt a slightly different
programming style. Despite being well into another release cycle,
this transition is still &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/15434#issuecomment-194991739&quot;&gt;not complete, even in Base&lt;/a&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;For algorithms that involve two or more arrays, there’s a
possibility that their “best” iterators will be of different
types. &lt;em&gt;In principle&lt;/em&gt;, this is a big problem. Consider matrix-vector
multiplication, &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j]*v[j]&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;j&lt;/code&gt; needs to be in-sync for
both &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;v&lt;/code&gt;, yet you’d also like all of these accesses to be
maximally-efficient.  &lt;em&gt;In practice&lt;/em&gt;, right now this isn’t a burning
problem: even if our arrays don’t all have efficient linear
indexing, to my knowledge all of our (dense) array types have
efficient cartesian indexing. Since indexing by &lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt; integers (where
&lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt; is equal to the dimensionality of the array) is always
performant, this serves as a reliable default for generic code.
(It’s worth noting that this isn’t true for sparse arrays, and the
lack of a corresponding generic solution is probably the main reason
we lack a generic API for writing sparse algorithms.)&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Unfortunately, I suspect that if we want to add support for certain
new operations or types (specific examples below), it will force us to
set the latter problem on fire.&lt;/p&gt;

&lt;h1 id=&quot;challenging-examples&quot;&gt;Challenging examples&lt;/h1&gt;

&lt;p&gt;Some possible new &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt; types pose novel challenges.&lt;/p&gt;

&lt;h2 id=&quot;reshapedarrays-15449&quot;&gt;ReshapedArrays (&lt;a href=&quot;https://github.com/JuliaLang/julia/pull/15449&quot;&gt;#15449&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;These are the front-and-center motivation for this post. They stem
from a desire to ensure that &lt;code class=&quot;highlighter-rouge&quot;&gt;reshape(A, dims)&lt;/code&gt; always returns
a “view” of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; rather than allocating a copy of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;. (Much of the
urgency of this julep goes away if we decide to abandon this goal, in
which case for consistency we should always return a copy of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;.)
It’s worth noting that besides an explicit &lt;code class=&quot;highlighter-rouge&quot;&gt;reshape&lt;/code&gt;, we have some
mechanisms for reshaping that currently cause a copy to be created,
notably &lt;code class=&quot;highlighter-rouge&quot;&gt;A[:]&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;A[:, :]&lt;/code&gt; applied to a 3D array.&lt;/p&gt;

&lt;p&gt;Similar to &lt;code class=&quot;highlighter-rouge&quot;&gt;SubArrays&lt;/code&gt;, the main challenge for &lt;code class=&quot;highlighter-rouge&quot;&gt;ReshapedArrays&lt;/code&gt; is
getting good performance.  If &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; is a 3D array, and you reshape it to
a 2D array &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;, then &lt;code class=&quot;highlighter-rouge&quot;&gt;B[i,j]&lt;/code&gt; must be expanded to &lt;code class=&quot;highlighter-rouge&quot;&gt;A[k,l,m]&lt;/code&gt;.  The
problem is that computing the correct &lt;code class=&quot;highlighter-rouge&quot;&gt;k,l,m&lt;/code&gt; might result in a call
to &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt;. So ReshapedArrays violate a crutch of our current ecosystem,
in that indexing with &lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt; integers might not be the fastest way to
access elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;. From a performance perspective, this problem
is substantial (see &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/15449&quot;&gt;#15449&lt;/a&gt;, about five- to ten-fold).&lt;/p&gt;
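
&lt;p&gt;The index arithmetic involved can be sketched by hand. The helper &lt;code class=&quot;highlighter-rouge&quot;&gt;parent_subscripts&lt;/code&gt; below is hypothetical and is not the code from the pull request; it only shows that mapping &lt;code class=&quot;highlighter-rouge&quot;&gt;B[i,j]&lt;/code&gt; back to &lt;code class=&quot;highlighter-rouge&quot;&gt;A[k,l,m]&lt;/code&gt; takes two integer divisions in this case. The code assumes Julia 1.x.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: map B[i, j] to the subscripts (k, l, m) of the 3D parent A,
# where B is a 2D reshape of A and both are column-major.
function parent_subscripts(i::Int, j::Int, dimsB, dimsA)
    lin = (j - 1) * dimsB[1] + (i - 1)    # zero-based linear offset
    rest, k = divrem(lin, dimsA[1])       # first division
    m, l = divrem(rest, dimsA[2])         # second division
    return k + 1, l + 1, m + 1
end

A = reshape(collect(1:24), 2, 3, 4)
B = reshape(A, 6, 4)
k, l, m = parent_subscripts(5, 3, size(B), size(A))   # (1, 3, 3)
B[5, 3] == A[k, l, m]                                 # true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;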

&lt;p&gt;In simple cases, there’s an easy way to circumvent this performance
problem: define a new iterator type that (internally) iterates over
the parent &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;’s indexes directly.  In other words, create an iterator
so that &lt;code class=&quot;highlighter-rouge&quot;&gt;B[I]&lt;/code&gt; immediately expands to &lt;code class=&quot;highlighter-rouge&quot;&gt;A[I']&lt;/code&gt;, and so that the latter
has “ideal” performance.&lt;/p&gt;
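
&lt;p&gt;For an operation that does not care about element order, the idea can be approximated today by iterating over the parent directly. The sketch below is an assumption-laden illustration, not the iterator type proposed here: &lt;code class=&quot;highlighter-rouge&quot;&gt;sum_via_parent&lt;/code&gt; is a hypothetical name, and it relies on Julia 1.x, where &lt;code class=&quot;highlighter-rouge&quot;&gt;reshape&lt;/code&gt; of a view returns a wrapper whose &lt;code class=&quot;highlighter-rouge&quot;&gt;parent&lt;/code&gt; is that view.&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical example; written for Julia 1.x.
# Visits the elements of B through the parent's own indexes, so no index
# conversion (and hence no div) is needed. Only valid when order is irrelevant.
function sum_via_parent(B::AbstractArray)
    P = parent(B)
    s = zero(eltype(P))
    for I in eachindex(P)
        s += P[I]
    end
    return s
end

M = reshape(collect(1:24), 4, 6)
A = view(M, 1:3, 1:4)        # 3x4 view without efficient linear indexing
B = reshape(A, 2, 6)         # reshaped wrapper around the view
sum_via_parent(B) == sum(B)  # true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;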

&lt;p&gt;Unfortunately, this strategy runs into a lot of trouble when you need
to keep two arrays in sync: if you want to adopt this strategy, you
simply can’t write &lt;code class=&quot;highlighter-rouge&quot;&gt;B[i,j]*v[j]&lt;/code&gt; for matrix-vector multiplication
anymore.  A potential way around &lt;em&gt;this&lt;/em&gt; problem is to define a new class
of iterators that operate on specific dimensions of an array (&lt;a href=&quot;https://github.com/JuliaLang/julia/pull/15459&quot;&gt;#15459&lt;/a&gt;),
writing &lt;code class=&quot;highlighter-rouge&quot;&gt;B[ii,jj]*v[j]&lt;/code&gt;.  &lt;code class=&quot;highlighter-rouge&quot;&gt;jj&lt;/code&gt; (whatever that is) and &lt;code class=&quot;highlighter-rouge&quot;&gt;j&lt;/code&gt; need to be
in-sync, but they don’t necessarily need to both be integers. Using
this kind of strategy, matrix-vector multiplication&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;vj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vj&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;might be written in a more performant manner like this:&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jj&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vj&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Dimension&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ii&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(:,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;jj&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;ii&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;jj&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;vj&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;It’s not too hard to figure out what &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex(B, Dimension{2})&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex(B, (:, jj))&lt;/code&gt; should do: &lt;code class=&quot;highlighter-rouge&quot;&gt;ii&lt;/code&gt;, for example, could be a
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianInnerIndex&lt;/code&gt; (a type that does not yet exist) that for a
particular column of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; iterates from &lt;code class=&quot;highlighter-rouge&quot;&gt;A[3,7,4]&lt;/code&gt; to &lt;code class=&quot;highlighter-rouge&quot;&gt;A[5,8,4]&lt;/code&gt;, where
the &lt;code class=&quot;highlighter-rouge&quot;&gt;d&lt;/code&gt;th index component wraps around at &lt;code class=&quot;highlighter-rouge&quot;&gt;size(A, d)&lt;/code&gt;.  The
big performance advantage of this strategy is that you only have to
compute a &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt; to set the bounds of the iterator on each column; the
inner loop doesn’t require a &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt; on each element access. No doubt,
given a suitable definition of &lt;code class=&quot;highlighter-rouge&quot;&gt;jj&lt;/code&gt; one could be even more clever and
avoid calculating &lt;code class=&quot;highlighter-rouge&quot;&gt;div&lt;/code&gt; altogether.  To the author, this strategy
seems promising as a way to resolve the majority of the performance
concerns about ReshapedArrays—only if you needed “random access”
would you require slow (integer-based) operations.&lt;/p&gt;

&lt;p&gt;However, a big problem is that compared to the “naive” implementation,
this is rather ugly.&lt;/p&gt;

&lt;h2 id=&quot;row-major-matrices-permuteddimensionarrays-and-taking-transposes-seriously&quot;&gt;Row-major matrices, PermutedDimensionArrays, and “taking transposes seriously”&lt;/h2&gt;

&lt;p&gt;Julia’s &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; type stores its entries in column-major order, meaning
that &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j]&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i+1,j]&lt;/code&gt; are in adjacent memory locations.  For
certain applications—or for interfacing with certain external code
bases—it might be convenient to support row-major arrays, where
instead &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j]&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;A[i,j+1]&lt;/code&gt; are in adjacent memory locations. More
fundamentally, this is partially related to one of the most
commented-on issues in all of julia’s development history, known as
“taking transposes seriously” aka &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/4774&quot;&gt;#4774&lt;/a&gt;.  There have been at least two
attempts at implementation, &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/6837&quot;&gt;#6837&lt;/a&gt; and the &lt;code class=&quot;highlighter-rouge&quot;&gt;mb/transpose&lt;/code&gt; branch, and
for the latter a summary of benefits and challenges was &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/4774#issuecomment-149349751&quot;&gt;posted&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;One of the biggest challenges mentioned was the huge explosion of
methods that one would need to support.  Can generic code come to the
rescue here?  There are two related concerns.  The first is linear
indexing: oftentimes this is conflated with “storage order,” i.e.,
given two linear indexes &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;j&lt;/code&gt; for the same array, the offset in
memory is proportional to &lt;code class=&quot;highlighter-rouge&quot;&gt;i-j&lt;/code&gt;.  For row-major arrays, this
notion is not viable, because otherwise a loop&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; copy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;  &lt;span class=&quot;c&quot;&gt;# trouble if `i` means &quot;memory offset&quot;&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;would end up taking a transpose if &lt;code class=&quot;highlighter-rouge&quot;&gt;src&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; don’t use the same
storage order.  Consequently, a linear index has to be defined in
terms of the corresponding cartesian (full-dimensionality) index.
This isn’t much of a real problem, because it’s one we know how to
solve: use &lt;code class=&quot;highlighter-rouge&quot;&gt;ind2sub&lt;/code&gt; (which is slow) when you have to, but for
efficiency make row-major arrays belong to the category (&lt;code class=&quot;highlighter-rouge&quot;&gt;LinearSlow&lt;/code&gt;)
of arrays that defaults to iteration with cartesian indexes.  Doing so
will ensure that if one uses generic constructs like &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex(src)&lt;/code&gt;
rather than &lt;code class=&quot;highlighter-rouge&quot;&gt;1:length(src)&lt;/code&gt;, then the loop above can be fast.&lt;/p&gt;
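
&lt;p&gt;As a minimal sketch of this idea, assuming Julia 1.x (where the &lt;code class=&quot;highlighter-rouge&quot;&gt;LinearSlow&lt;/code&gt; trait
is spelled &lt;code class=&quot;highlighter-rouge&quot;&gt;IndexCartesian&lt;/code&gt;), the hypothetical &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajor&lt;/code&gt; wrapper below is
purely illustrative and not part of Base.  It stores its entries row by row, yet a linear
index is still interpreted through the corresponding cartesian index:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical row-major matrix wrapper (illustration only, not a Base type).
struct RowMajor{T} &amp;lt;: AbstractMatrix{T}
    data::Vector{T}        # entries stored row by row
    dims::Tuple{Int,Int}
end

Base.size(A::RowMajor) = A.dims
# Opt in to cartesian iteration (the Julia 1.x spelling of LinearSlow).
Base.IndexStyle(::Type{&amp;lt;:RowMajor}) = IndexCartesian()
# Map a cartesian index to the row-major memory offset.
Base.getindex(A::RowMajor, i::Int, j::Int) = A.data[(i - 1) * A.dims[2] + j]

R = RowMajor([1, 2, 3, 4, 5, 6], (2, 3))   # rows are [1 2 3] and [4 5 6]
dest = zeros(Int, 2, 3)
for I in eachindex(R)                      # yields CartesianIndex values
    dest[I] = R[I]
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; ends up equal to &lt;code class=&quot;highlighter-rouge&quot;&gt;[1 2 3; 4 5 6]&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;R[2]&lt;/code&gt; is
&lt;code class=&quot;highlighter-rouge&quot;&gt;R[2,1]&lt;/code&gt; (the value 4) rather than the second entry in memory (the value 2).&lt;/p&gt;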

&lt;p&gt;The far more challenging problem concerns cache-efficiency: it’s much
slower to access elements of an array in anything other than
&lt;a href=&quot;http://julialang.org/blog/2013/09/fast-numeric&quot;&gt;storage-order&lt;/a&gt;.  Some
reasonably fast ways to write matrix-vector multiplication are&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;vj&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;vj&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
&lt;p&gt;for a column-major matrix &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;, and&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;j&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;v&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;j&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;for a row-major matrix.  (One can do even better than this by using a
scalar temporary accumulator, but let’s not worry about that here.)
The key point to note is that the order of the loops has been
switched.&lt;/p&gt;

&lt;p&gt;One could generalize this by defining a &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajorRange&lt;/code&gt; iterator
that’s a lot like our &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt; iterator, but traverses the
array in row-major order.  &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt; claims to return an “efficient
iterator,” and without a doubt the &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajorRange&lt;/code&gt; is a (much) more
efficient iterator than a &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt; iterator for row-major
arrays. So let’s imagine that &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt; does what it says, and
returns a &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajorRange&lt;/code&gt; iterator.  Using this strategy, the two
algorithms above can be combined into a single generic implementation:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for I in eachindex(B)
    dest[I[1]] += B[I]*v[I[2]]
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Yay! Score one for efficient generic implementations.&lt;/p&gt;
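
&lt;p&gt;A runnable sketch of the combined loop, assuming Julia 1.x (where &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndices&lt;/code&gt;
takes the place of &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt;): the helper &lt;code class=&quot;highlighter-rouge&quot;&gt;rowmajorrange&lt;/code&gt; is a
hypothetical stand-in for the imagined &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajorRange&lt;/code&gt;, and the iterator is passed in
explicitly rather than obtained from &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical stand-in for RowMajorRange: the last index varies fastest.
rowmajorrange(B) = (CartesianIndex(i, j) for i in axes(B, 1) for j in axes(B, 2))

# Generic kernel: valid for any traversal that visits each index exactly once.
function matvec!(dest, B, v, inds)
    fill!(dest, 0)
    for I in inds
        dest[I[1]] += B[I] * v[I[2]]
    end
    return dest
end

B = [1 2 3; 4 5 6]
v = [1, 1, 1]
colmajor = matvec!(zeros(Int, 2), B, v, CartesianIndices(B))   # column-major order
rowmajor = matvec!(zeros(Int, 2), B, v, rowmajorrange(B))      # row-major order
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Both traversals produce the same product, &lt;code class=&quot;highlighter-rouge&quot;&gt;[6, 15]&lt;/code&gt;; only the memory access
pattern differs.&lt;/p&gt;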

&lt;p&gt;But our triumph is short-lived. Let’s return to the example of
&lt;code class=&quot;highlighter-rouge&quot;&gt;copy!&lt;/code&gt; above, and realize that &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;src&lt;/code&gt; might be two
different array types, and therefore might be most-efficiently indexed
with different iterator types.  We’re tempted to write this as&lt;/p&gt;

&lt;div class=&quot;language-jl highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; copy&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;isrc&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;idest&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;src&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;isrc&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;dest&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Up until we introduced our &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajorRange&lt;/code&gt; return-type for
&lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt;, this implementation would have been fine.  But we just
broke it, because now this will incorrectly take a transpose in
certain situations.&lt;/p&gt;

&lt;p&gt;In other words, without careful design the goals of
“maximally-efficient iteration” and “keeping accesses in-sync” are in
conflict.&lt;/p&gt;
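
&lt;p&gt;The failure is easy to reproduce in a small sketch, assuming Julia 1.x.  The helper
&lt;code class=&quot;highlighter-rouge&quot;&gt;rowmajorrange&lt;/code&gt; is again a hypothetical stand-in for what &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex(src)&lt;/code&gt;
would return for a row-major &lt;code class=&quot;highlighter-rouge&quot;&gt;src&lt;/code&gt;, while &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; keeps the usual
column-major iterator:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical row-major traversal: the last index varies fastest.
rowmajorrange(A) = (CartesianIndex(i, j) for i in axes(A, 1) for j in axes(A, 2))

src = [1 2; 3 4]
dest = zeros(Int, 2, 2)
# The two iterators visit the same set of indices, but in different orders.
for (idest, isrc) in zip(CartesianIndices(dest), rowmajorrange(src))
    dest[idest] = src[isrc]
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;After the loop, &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; is &lt;code class=&quot;highlighter-rouge&quot;&gt;[1 3; 2 4]&lt;/code&gt;, the transpose of
&lt;code class=&quot;highlighter-rouge&quot;&gt;src&lt;/code&gt;, rather than a copy.&lt;/p&gt;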

&lt;h2 id=&quot;offsetarrays-and-the-meaning-of-abstractarray&quot;&gt;OffsetArrays and the meaning of AbstractArray&lt;/h2&gt;

&lt;p&gt;Julia’s arrays are indexed starting at 1, whereas some other languages
start numbering at 0. If you take comments on various blog posts at
face value, there are vast armies of programmers out there eagerly
poised to adopt julia, but who won’t even try it because of this
difference in indexing.  Since recruiting those armies will lead to
world domination, this is clearly a problem of the utmost urgency.&lt;/p&gt;

&lt;p&gt;More seriously, there &lt;em&gt;are&lt;/em&gt; algorithms which simplify if you can index
outside of the range from &lt;code class=&quot;highlighter-rouge&quot;&gt;1:size(A,d)&lt;/code&gt;.  In my own lab’s internal
code, we’ve long been using a &lt;code class=&quot;highlighter-rouge&quot;&gt;CenterIndexedArray&lt;/code&gt; type, in which such
arrays (all of which have odd sizes) are indexed over the range &lt;code class=&quot;highlighter-rouge&quot;&gt;-n:n&lt;/code&gt;
and for which 0 refers to the “center” element. One package which
generalizes this notion is &lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArrays&lt;/code&gt;.  Unfortunately, in practice
both of these array types produce segfaults (due to built-in
assumptions about when &lt;code class=&quot;highlighter-rouge&quot;&gt;@inbounds&lt;/code&gt; is appropriate) for many of julia’s
core functions; over time my lab has had to write implementations
specialized for &lt;code class=&quot;highlighter-rouge&quot;&gt;CenterIndexedArrays&lt;/code&gt; for quite a few julia functions.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArrays&lt;/code&gt; illustrates another conceptual challenge, which can
easily be demonstrated by &lt;code class=&quot;highlighter-rouge&quot;&gt;copy!&lt;/code&gt;.  When &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; is a 1-dimensional
&lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArray&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;src&lt;/code&gt; is a standard &lt;code class=&quot;highlighter-rouge&quot;&gt;Vector&lt;/code&gt;, what should &lt;code class=&quot;highlighter-rouge&quot;&gt;copy!&lt;/code&gt;
do? In particular, where does &lt;code class=&quot;highlighter-rouge&quot;&gt;src[1]&lt;/code&gt; go? Does it go in the &lt;code class=&quot;highlighter-rouge&quot;&gt;first&lt;/code&gt;
element of &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt;, or does it get stored in &lt;code class=&quot;highlighter-rouge&quot;&gt;dest[1]&lt;/code&gt; (which may not
be the first element)?&lt;/p&gt;

&lt;p&gt;Such examples force us to think a little more deeply about what an
array really is.  There seem to be two potential conceptions.  One is
that arrays are lists, and multidimensional arrays are
lists-of-lists-of-lists-of…  In such a world view, the right thing
to do is to put &lt;code class=&quot;highlighter-rouge&quot;&gt;src[1]&lt;/code&gt; into the first slot of &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt;, because 1 is
just a synonym for &lt;code class=&quot;highlighter-rouge&quot;&gt;first&lt;/code&gt;.  However, this world view doesn’t really
endow any kind of “meaning” to the index-tuple of an array, and in
that sense doesn’t even include the distinction conveyed by an
&lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArray&lt;/code&gt;. In other words, in this world an &lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArray&lt;/code&gt; is
simply nonsensical, and shouldn’t exist.&lt;/p&gt;

&lt;p&gt;If instead one thinks &lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArray&lt;/code&gt;s should exist, this essentially
forces one to adopt a different world view: arrays are effectively
associative containers, where each index-tuple is the “key” by which
one retrieves a value.  With this mode of thinking, &lt;code class=&quot;highlighter-rouge&quot;&gt;src[1]&lt;/code&gt; should be
stored in &lt;code class=&quot;highlighter-rouge&quot;&gt;dest[1]&lt;/code&gt;.&lt;/p&gt;
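
&lt;p&gt;The two world views can be contrasted in a small sketch.  The &lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetVec&lt;/code&gt; type
below is a hypothetical, stripped-down stand-in written for this illustration; it is not the
API of the &lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArrays&lt;/code&gt; package:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical minimal offset vector: index i refers to data[i - offset].
struct OffsetVec{T}
    data::Vector{T}
    offset::Int
end
Base.firstindex(A::OffsetVec) = 1 + A.offset
Base.setindex!(A::OffsetVec, x, i::Int) = (A.data[i - A.offset] = x)

src = [10, 20, 30]
dest_list  = OffsetVec(zeros(Int, 3), -2)   # valid indices are -1:1
dest_assoc = OffsetVec(zeros(Int, 3), -2)

# List view: src[1] goes into the first slot, whatever its index is called.
dest_list[firstindex(dest_list)] = src[1]
# Associative view: src[1] goes to the key 1, which here is the last slot.
dest_assoc[1] = src[1]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Under the list view the underlying storage becomes &lt;code class=&quot;highlighter-rouge&quot;&gt;[10, 0, 0]&lt;/code&gt;; under the
associative view it becomes &lt;code class=&quot;highlighter-rouge&quot;&gt;[0, 0, 10]&lt;/code&gt;.&lt;/p&gt;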

&lt;h1 id=&quot;formalizing-abstractarray&quot;&gt;Formalizing AbstractArray&lt;/h1&gt;

&lt;p&gt;These examples suggest a formalization of &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;s are specialized associative containers, in that the
allowable “keys” may be restricted by more than just their julia
type.  Specifically, the allowable keys must be representable as a
cartesian product of one-dimensional lists of values.  The allowed
keys may depend not just on the array type but also the specific
array (e.g., its size).  Attempted access by keys that cannot be
converted to one of the allowed keys, for that specific array,
results in a &lt;code class=&quot;highlighter-rouge&quot;&gt;BoundsError&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;For any given array, one must be able to generate a
finite-dimensional parametrization of the full domain of valid keys
from the array itself.  This might only require knowledge of the
array size, or the keys might depend on some internal storage (think
&lt;code class=&quot;highlighter-rouge&quot;&gt;DataFrames&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;OffsetArrays&lt;/code&gt;).  In some cases, just the array
type might be sufficient (e.g., &lt;code class=&quot;highlighter-rouge&quot;&gt;FixedSizeArrays&lt;/code&gt;).  By this
definition, note that a &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict{ASCII5,Int}&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;ASCII5&lt;/code&gt; is a type
that means an ASCII string with 5 characters, would qualify as a
5-dimensional (sparse) array, but that a &lt;code class=&quot;highlighter-rouge&quot;&gt;Dict{ASCIIString,Int}&lt;/code&gt;
would not (because there is no length limit to an &lt;code class=&quot;highlighter-rouge&quot;&gt;ASCIIString&lt;/code&gt;, and
hence no finite dimensionality).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;An array may be indexed by more than one key type (i.e., keys may
have multiple parametrizations).  Different key parametrizations are
equivalent when they refer to the same element of a given
array. Linear indexes and cartesian indexes are simple examples of
interconvertible representations, but specialized iterators can
produce other key types as well.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Arrays may support multiple iterators that produce non-equivalent
key sequences.  In other words, a row-major matrix may support both
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;RowMajorRange&lt;/code&gt; iterators that access elements
in different orders.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;h1 id=&quot;finding-a-way-forward&quot;&gt;Finding a way forward&lt;/h1&gt;

&lt;p&gt;Resolving these conflicting demands is not easy. One approach might be
to decree that some of these array types simply can’t be supported
with generic code. It is possible that this is the right
strategy. Alternatively, one can attempt to devise an array API that
handles all of these types (and hopefully more).&lt;/p&gt;

&lt;p&gt;In GitHub issue
&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/15648&quot;&gt;#15648&lt;/a&gt;, we are
discussing APIs that may resolve these challenges. Readers are
encouraged to contribute to this discussion.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>An introduction to ParallelAccelerator.jl</title>
   <link href="http://julialang.org/blog/2016/03/parallelaccelerator"/>
   <updated>2016-03-01T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/03/parallelaccelerator</id>
   <content type="html">&lt;p&gt;The High Performance Scripting team at Intel Labs recently released
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl&quot;&gt;ParallelAccelerator.jl&lt;/a&gt;,
a Julia package for high-performance, high-level
&lt;a href=&quot;https://en.wikipedia.org/wiki/Array_programming&quot;&gt;array-style programming&lt;/a&gt;.
The goal of ParallelAccelerator is to make high-level array-style
programs run as efficiently as possible in Julia, with a minimum of
extra effort required from the programmer.  In this post, we’ll take a
look at the ParallelAccelerator package and walk through some examples
of how to use it to speed up some typical array-style programs in
Julia.&lt;/p&gt;

&lt;h2 id=&quot;introduction&quot;&gt;Introduction&lt;/h2&gt;

&lt;p&gt;Ideally, high-level array-style Julia programs should run as
efficiently as possible on high-performance parallel hardware, with a
minimum of extra programmer effort required, and with performance
reasonably close to that of an expert implementation in C or C++.
There are three main things that ParallelAccelerator does to move us
toward this goal:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;First, we identify &lt;em&gt;implicit parallel patterns&lt;/em&gt; in array-style
code the user writes.  We’ll say more about these parallel
patterns shortly.&lt;/li&gt;
  &lt;li&gt;Second, we compile these parallel patterns to explicit parallel
loops.&lt;/li&gt;
  &lt;li&gt;Third, we &lt;em&gt;minimize runtime overheads&lt;/em&gt; incurred by things like
array bounds checks and intermediate array allocations.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The key user-facing feature that the ParallelAccelerator package
provides is a Julia macro called &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;, which is short for
“accelerate”.  Annotating functions or blocks of code with &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt; lets
you designate the parts of your Julia program that you want to compile
to optimized native code.  Here’s a toy example of using &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt; to
annotate a function:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ParallelAccelerator&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@acc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;f&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;generic&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; with&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;method&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;f&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;([&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;element&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}:&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;6&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;12&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;20&lt;/span&gt;
&lt;span class=&quot;mi&quot;&gt;30&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Under the hood, ParallelAccelerator is essentially a compiler – itself implemented in Julia – that
intercepts the usual Julia JIT compilation process for
&lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated functions.  It compiles &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated code to C++
OpenMP code, which can then be compiled to a native library by an
external C++ compiler such as GCC or ICC.
(This intermediate C++ generation step isn’t essential to the design
of ParallelAccelerator, though – instead, the compiler could target
Julia’s own forthcoming native threading backend. [&lt;a href=&quot;#footnote1&quot;&gt;1&lt;/a&gt;])
On the
Julia side, ParallelAccelerator generates a &lt;em&gt;proxy function&lt;/em&gt; that
calls into that native library, and replaces calls to &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated
functions, like &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; in the above example, with calls to the
appropriate proxy function.&lt;/p&gt;

&lt;p&gt;We’ll say more shortly about the parallel patterns that
ParallelAccelerator targets and about how the ParallelAccelerator
compiler works, but before we do, let’s look at some code and some
performance results.&lt;/p&gt;

&lt;h2 id=&quot;a-quick-preview-of-results-black-scholes-option-pricing-benchmark&quot;&gt;A quick preview of results: Black-Scholes option pricing benchmark&lt;/h2&gt;

&lt;p&gt;Let’s see how to use ParallelAccelerator to speed up a classic
high-performance computing benchmark: an implementation of the
&lt;a href=&quot;https://en.wikipedia.org/wiki/Black%E2%80%93Scholes_model&quot;&gt;Black-Scholes formula&lt;/a&gt;
for option pricing.  The following code is a Julia implementation of
the Black-Scholes formula.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; cndf2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;erf&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.707106781&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; blackscholes&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;logterm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;./&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;powterm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;den&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;d1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;powterm&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;logterm&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;./&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;den&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;d2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;den&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;NofXd1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cndf2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;NofXd2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cndf2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;futureValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;futureValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NofXd2&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NofXd1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c1&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;put&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;futureValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; run&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt;   &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;42.0&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;initStrike&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;40.0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;/&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.2&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;       &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;

    &lt;span class=&quot;n&quot;&gt;tic&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;put&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;blackscholes&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;initStrike&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;toq&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;()&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;println&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;checksum: &quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;put&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;t&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here, the &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; function takes five arguments, each of which
is an array of &lt;code class=&quot;highlighter-rouge&quot;&gt;Float64&lt;/code&gt;s.  The &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; function initializes these five
arrays and passes them to &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt;, which, along with the
&lt;code class=&quot;highlighter-rouge&quot;&gt;cndf2&lt;/code&gt; (cumulative normal distribution) function that it calls, does
several computations involving pointwise addition (&lt;code class=&quot;highlighter-rouge&quot;&gt;.+&lt;/code&gt;), subtraction
(&lt;code class=&quot;highlighter-rouge&quot;&gt;.-&lt;/code&gt;), multiplication (&lt;code class=&quot;highlighter-rouge&quot;&gt;.*&lt;/code&gt;), and division (&lt;code class=&quot;highlighter-rouge&quot;&gt;./&lt;/code&gt;) on the arrays.
It’s not necessary to understand the details of the Black-Scholes
formula; the important thing to notice about the code is that we are
doing lots of pointwise array arithmetic.  Using Julia 0.4.4-pre on
a 4-core Ubuntu 14.04 desktop machine with 8 GB of memory, the
&lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; call takes about 11 seconds when &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; is called with an argument
of 40,000,000 (meaning that we are dealing with 40-million-element
arrays):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;40_000_000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;checksum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8.381928525856283e8&lt;/span&gt;
 &lt;span class=&quot;mf&quot;&gt;12.885293&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;458.51&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;k&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;9.855&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GB&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;2.95&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;11.297714183&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here, the &lt;code class=&quot;highlighter-rouge&quot;&gt;11.297714183&lt;/code&gt; being returned from &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; is the number of
seconds it takes the &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; call alone to return.  The
&lt;code class=&quot;highlighter-rouge&quot;&gt;12.885293&lt;/code&gt; seconds reported by &lt;code class=&quot;highlighter-rouge&quot;&gt;@time&lt;/code&gt; is a little longer, because
it’s the running time of the entire &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; call.&lt;/p&gt;
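
&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;tic()&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;toq()&lt;/code&gt; functions used above were removed in Julia 1.0.
As a minimal sketch of the same inner-versus-outer measurement, the following uses
&lt;code class=&quot;highlighter-rouge&quot;&gt;time()&lt;/code&gt; instead.  The names &lt;code class=&quot;highlighter-rouge&quot;&gt;work&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;timed_run&lt;/code&gt; are hypothetical
stand-ins for &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt;, chosen only for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical stand-in for blackscholes: some pointwise array arithmetic.
work(a, b) = a .* b .+ a

function timed_run(n)
    # Setup: counted by the outer measurement only.
    a = Float64[ 42.0 for i = 1:n ]
    b = Float64[ 0.5 for i = 1:n ]

    # Inner measurement: just the call of interest, like tic()/toq() above.
    t0 = time()
    result = work(a, b)
    t = time() - t0

    println(&quot;checksum: &quot;, sum(result))
    return t
end

# Outer measurement: the whole call, including array setup and printing.
t0 = time()
inner = timed_run(1_000_000)
outer = time() - t0
println(&quot;inner: &quot;, inner, &quot; s, outer: &quot;, outer, &quot; s&quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The outer measurement includes the array setup, which is why it comes out longer than the inner one,
just as with the two numbers reported above.&lt;/p&gt;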

&lt;p&gt;The many pointwise array operations in this code make it a great
candidate for speeding up with ParallelAccelerator (as we’ll discuss
more shortly).  Doing so requires only minor changes to the code: we
import the ParallelAccelerator library with &lt;code class=&quot;highlighter-rouge&quot;&gt;using
ParallelAccelerator&lt;/code&gt;, then wrap the &lt;code class=&quot;highlighter-rouge&quot;&gt;cndf2&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt;
functions in an &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt; block, as follows:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;using&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;ParallelAccelerator&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@acc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;begin&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; cndf2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;erf&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;0.707106781&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; blackscholes&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt;
                      &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float64&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;})&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;logterm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;log10&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;./&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;powterm&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;5&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;den&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;volatility&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sqrt&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;d1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(((&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;powterm&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;logterm&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;./&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;den&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;d2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;d1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;den&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;NofXd1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cndf2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;NofXd2&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;cndf2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;d2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;futureValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;strike&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;exp&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rate&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;c1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;futureValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NofXd2&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.*&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;NofXd1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;c1&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;put&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;call&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.-&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;futureValue&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;.+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sptprice&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The definition of &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; stays the same.  With the addition of the
&lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt; wrapper, we now have much better performance:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;40_000_000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;checksum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8.381928525856283e8&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;4.010668&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mf&quot;&gt;1.90&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;M&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.584&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GB&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;2.06&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;3.503281464&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This time, &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; returns in about 3.5 seconds, and the entire
&lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; call finishes in about 4 seconds.  This is already an
improvement, but on subsequent calls to &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt;, we do even better:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;40_000_000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;checksum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8.381928525856283e8&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.418709&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;158&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.490&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GB&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8.98&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;1.007861068&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;nd&quot;&gt;@time&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;run&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;40_000_000&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;checksum&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;8.381928525856283e8&lt;/span&gt;
  &lt;span class=&quot;mf&quot;&gt;1.410865&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;seconds&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;154&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;allocations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;1.490&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;GB&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;7.93&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;%&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;gc&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;time&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;mf&quot;&gt;1.012813958&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In subsequent calls, &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; finishes in about a second, with the entire
&lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; call taking about 1.4 seconds.  The reason for this additional
improvement is that ParallelAccelerator has already compiled the
&lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;cndf2&lt;/code&gt; functions and doesn’t need to do so again
on subsequent runs.&lt;/p&gt;
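
&lt;p&gt;A similar first-call effect can be seen with ordinary Julia functions, whose first call
includes just-in-time compilation.  The sketch below uses plain Julia without ParallelAccelerator,
and &lt;code class=&quot;highlighter-rouge&quot;&gt;sumsq&lt;/code&gt; is a hypothetical example function.  It times a first and a second call to the same
function; the exact numbers will vary from machine to machine:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical example function; any freshly defined function would do.
sumsq(x) = sum(x .* x)

x = Float64[ i for i = 1:1_000_000 ]

# First call: includes the time to compile sumsq for this argument type.
t_first = @elapsed sumsq(x)

# Second call: the compiled code is reused, so only the computation is timed.
t_second = @elapsed sumsq(x)

println(&quot;first call:  &quot;, t_first, &quot; seconds&quot;)
println(&quot;second call: &quot;, t_second, &quot; seconds&quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;When benchmarking, it is therefore common to make one warm-up call before taking measurements.&lt;/p&gt;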

&lt;p&gt;These results were collected on
an ordinary desktop machine, but we can scale up further.  The
following figure reports the time it takes &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; to run on
arrays of 100 million elements, this time on a 36-core machine with
128 GB of RAM [&lt;a href=&quot;#footnote2&quot;&gt;2&lt;/a&gt;]:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/parallelaccelerator_figures/black-scholes-2016-01-31-blogpost.png?raw=true&quot; alt=&quot;Benchmark results for plain Julia and ParallelAccelerator implementations of the Black-Scholes formula&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The first three bars of the above figure show performance results for
ParallelAccelerator using different numbers of threads.  Since
ParallelAccelerator compiles Julia to OpenMP C++, we can use the
&lt;code class=&quot;highlighter-rouge&quot;&gt;OMP_NUM_THREADS&lt;/code&gt; environment variable to control the number of
threads that the code runs with.  Here, with &lt;code class=&quot;highlighter-rouge&quot;&gt;OMP_NUM_THREADS&lt;/code&gt; set to
18, &lt;code class=&quot;highlighter-rouge&quot;&gt;blackscholes&lt;/code&gt; runs in 0.27 seconds; with 36 threads (matching the
number of cores on the machine), running time drops to 0.16 seconds.
The third bar shows results for ParallelAccelerator with
&lt;code class=&quot;highlighter-rouge&quot;&gt;OMP_NUM_THREADS&lt;/code&gt; set to 1, which clocks in at about 3 seconds. For
comparison, the rightmost bar shows results for “plain Julia”, that is,
a version of the code without &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;, which runs in about 21 seconds.&lt;/p&gt;
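
&lt;p&gt;As a quick check on these numbers, the reported running times can be turned into speedup
factors relative to plain Julia.  The values below are the approximate times quoted above, and the
variable names are only for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Approximate running times in seconds, as reported above.
plain_julia    = 21.0
acc_1_thread   = 3.0
acc_18_threads = 0.27
acc_36_threads = 0.16

# Speedup of each ParallelAccelerator configuration over plain Julia.
println(&quot;1 thread:   &quot;, plain_julia / acc_1_thread, &quot;x&quot;)
println(&quot;18 threads: &quot;, round(plain_julia / acc_18_threads), &quot;x&quot;)
println(&quot;36 threads: &quot;, round(plain_julia / acc_36_threads), &quot;x&quot;)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This works out to roughly 7x with one thread, 78x with 18 threads, and 131x with 36 threads.&lt;/p&gt;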

&lt;p&gt;Because Julia doesn’t (yet) have native multithreading support, the
plain Julia results shown in the rightmost bar are for one thread.
But it is interesting to note that the ParallelAccelerator
implementation of Black-Scholes outperforms plain Julia by a factor of
about seven, even when running on just one core.  The reason for this
speedup is that ParallelAccelerator (despite its name!) does more than
just parallelize code.  The ParallelAccelerator compiler is able to do
away with much of the runtime overhead incurred by array bounds checks
and allocation of intermediate arrays.  After that, with the addition
of parallelism, we’re able to do even better, for a total speedup of
more than 100x over plain Julia.&lt;/p&gt;
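
&lt;p&gt;As a quick check of these figures, the following sketch recomputes the speedups from the approximate running times quoted above. The variable names are ours and purely illustrative; the inputs are the rounded timings from the text, so the results are rough as well.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Approximate running times in seconds, as quoted in the text above.
t_plain  = 21.0   # plain Julia, one thread
t_acc_1  = 3.0    # ParallelAccelerator, OMP_NUM_THREADS=1
t_acc_36 = 0.16   # ParallelAccelerator, OMP_NUM_THREADS=36

speedup_1  = t_plain / t_acc_1    # about 7x on a single core
speedup_36 = t_plain / t_acc_36   # about 131x with 36 threads&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;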

&lt;p&gt;To see how ParallelAccelerator accomplishes this, we’ll discuss the
parallel patterns that ParallelAccelerator handles in a bit more
detail, and then we’ll take a closer look at the ParallelAccelerator
compiler pipeline.&lt;/p&gt;

&lt;h2 id=&quot;parallel-patterns&quot;&gt;Parallel patterns&lt;/h2&gt;

&lt;p&gt;ParallelAccelerator works by identifying implicit parallel patterns in
source code and making the parallelism explicit.  These patterns
include &lt;em&gt;map&lt;/em&gt;, &lt;em&gt;reduce&lt;/em&gt;, &lt;em&gt;array comprehension&lt;/em&gt;, and &lt;em&gt;stencil&lt;/em&gt;.&lt;/p&gt;

&lt;h3 id=&quot;map&quot;&gt;Map&lt;/h3&gt;

&lt;p&gt;As we saw in the Black-Scholes example above, the &lt;code class=&quot;highlighter-rouge&quot;&gt;.+&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;.-&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;.*&lt;/code&gt;,
and &lt;code class=&quot;highlighter-rouge&quot;&gt;./&lt;/code&gt; operations in Julia are pointwise array operations that take
input arrays as arguments and produce an output array.
ParallelAccelerator translates these pointwise array operations into
data-parallel &lt;em&gt;map&lt;/em&gt; operations.  (See
&lt;a href=&quot;http://parallelacceleratorjl.readthedocs.org/en/latest/advanced.html#map-and-reduce&quot;&gt;the ParallelAccelerator documentation&lt;/a&gt;
for a complete list of all the pointwise array operations that it
knows how to parallelize.)  Furthermore, ParallelAccelerator
translates array assignments into &lt;em&gt;in-place&lt;/em&gt; map operations.  For
instance, assigning &lt;code class=&quot;highlighter-rouge&quot;&gt;a = a .* b&lt;/code&gt; where &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; are arrays would
map &lt;code class=&quot;highlighter-rouge&quot;&gt;.*&lt;/code&gt; over &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; and update &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; in place with the result.
For both standard map and in-place map, it is possible for
ParallelAccelerator to avoid any array bounds checking once we’ve
established that the input arrays and the output arrays are the same
size.&lt;/p&gt;
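
&lt;p&gt;To make the in-place map concrete, here is a hand-written sketch of roughly what &lt;code class=&quot;highlighter-rouge&quot;&gt;a = a .* b&lt;/code&gt; amounts to after this transformation. The function name &lt;code class=&quot;highlighter-rouge&quot;&gt;inplace_mul!&lt;/code&gt; is hypothetical, and this is plain sequential Julia rather than the code ParallelAccelerator actually generates; it only illustrates how a single up-front size check can stand in for a bounds check on every element access.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Illustrative only: a hand-written equivalent of the in-place map a = a .* b.
function inplace_mul!(a::Vector{Float64}, b::Vector{Float64})
    # One size check up front replaces a bounds check on every access.
    length(a) == length(b) || throw(DimensionMismatch(&quot;a and b must have the same length&quot;))
    @inbounds for i in eachindex(a)
        a[i] = a[i] * b[i]   # each iteration touches only element i
    end
    return a
end

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
inplace_mul!(a, b)   # a is now [4.0, 10.0, 18.0]&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Because every iteration reads and writes only its own index, the loop iterations are independent of one another, which is the property that makes a map safe to run in parallel.&lt;/p&gt;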

&lt;h3 id=&quot;reduce&quot;&gt;Reduce&lt;/h3&gt;

&lt;p&gt;Reduce operations take an array argument and produce a scalar result
by combining all the elements of the array with an associative and
commutative operation, so partial results can be computed in any order
and then combined.  ParallelAccelerator translates the Julia
functions &lt;code class=&quot;highlighter-rouge&quot;&gt;minimum&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;maximum&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;sum&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;prod&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;any&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;all&lt;/code&gt; into
data-parallel reduce operations when they are called on arrays.&lt;/p&gt;
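
&lt;p&gt;The sketch below shows, under simplifying assumptions, why such operations lend themselves to data parallelism: the array is split into chunks, each chunk is reduced on its own, and the partial results are combined at the end. The function &lt;code class=&quot;highlighter-rouge&quot;&gt;chunked_sum&lt;/code&gt; is a hypothetical, sequential illustration and not ParallelAccelerator’s implementation; in a parallel setting each chunk could be handled by a different thread.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Illustrative only: a reduction organized as independent per-chunk sums.
function chunked_sum(x::Vector{Int}, nchunks::Int)
    n = length(x)
    partials = zeros(Int, nchunks)
    for c = 1:nchunks                     # each chunk is independent of the others
        lo = div((c - 1) * n, nchunks) + 1
        hi = div(c * n, nchunks)
        for i = lo:hi
            partials[c] += x[i]
        end
    end
    return sum(partials)                  # combine the per-chunk results
end

x = collect(1:100)
chunked_sum(x, 4) == sum(x)   # true; both are 5050&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;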

&lt;h3 id=&quot;array-comprehension&quot;&gt;Array comprehension&lt;/h3&gt;

&lt;p&gt;Julia supports
&lt;a href=&quot;http://docs.julialang.org/en/release-0.4/manual/arrays/#comprehensions&quot;&gt;array comprehensions&lt;/a&gt;,
a convenient and concise way to construct arrays.  For example, the
expressions that initialize the five input arrays in the Black-Scholes
example above are all array comprehensions.  As a more sophisticated
example, the following &lt;code class=&quot;highlighter-rouge&quot;&gt;avg&lt;/code&gt; function, taken from
&lt;a href=&quot;http://docs.julialang.org/en/release-0.4/manual/arrays/#comprehensions&quot;&gt;the Julia manual&lt;/a&gt;,
takes a one-dimensional input array &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; of length &lt;em&gt;n&lt;/em&gt; and uses an
array comprehension to construct an output array of length &lt;em&gt;n&lt;/em&gt;-2, in
which each element is a weighted average of the corresponding element
in the original array and its two neighbors:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;avg&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.25&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.5&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.25&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;-&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Comprehensions like this one can also be parallelized by ParallelAccelerator: in a nutshell, ParallelAccelerator can transform array comprehensions to code that first allocates an output array and then performs an in-place map that can write to each element of the output array in parallel.&lt;/p&gt;
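
&lt;p&gt;As a rough illustration of that transformation, the &lt;code class=&quot;highlighter-rouge&quot;&gt;avg&lt;/code&gt; comprehension above could be rewritten by hand as follows. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;avg_explicit&lt;/code&gt; is hypothetical, and this is a sequential sketch of the allocate-then-map structure rather than ParallelAccelerator’s actual output.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Illustrative only: the avg comprehension as an explicit allocate-then-map.
function avg_explicit(x::Vector{Float64})
    n = length(x)
    # Step 1: allocate the output array (length n-2, or empty for short inputs).
    out = zeros(max(n - 2, 0))
    # Step 2: in-place map; iteration i writes only out[i-1] and reads only x.
    for i = 2:n-1
        out[i-1] = 0.25*x[i-1] + 0.5*x[i] + 0.25*x[i+1]
    end
    return out
end

avg_explicit([1.0, 2.0, 3.0, 4.0])   # returns [2.0, 3.0]&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Since each iteration writes a distinct element of &lt;code class=&quot;highlighter-rouge&quot;&gt;out&lt;/code&gt; and never modifies &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, the iterations do not depend on one another.&lt;/p&gt;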

&lt;p&gt;Array comprehensions differ from map and reduce operations in that
they involve explicit array indexing.  But it is still possible to
parallelize array comprehensions in Julia, as long as there are no
side effects in the comprehension body (everything before the
&lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt;). [&lt;a href=&quot;#footnote3&quot;&gt;3&lt;/a&gt;] ParallelAccelerator uses a conservative
static analysis to try to identify and reject side-effecting
operations in comprehensions.&lt;/p&gt;
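
&lt;p&gt;To see why side effects matter, compare the two comprehensions in the sketch below. All names here are hypothetical and chosen only for illustration. The first body merely reads its input, so its iterations can run in any order; the second calls a function that mutates shared state, so its result depends on the order in which the iterations happen to run.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;x = [1.0, 2.0, 3.0]

# Pure body: each element depends only on x and i.
doubled = [2 * x[i] for i = 1:length(x)]

# Side-effecting body: every iteration mutates the shared counter.
counter = [0]
function next_count!()
    counter[1] += 1      # side effect on shared state
    return counter[1]
end
numbered = [next_count!() * x[i] for i = 1:length(x)]
# Run sequentially, numbered is [1.0, 4.0, 9.0]; a parallel schedule could
# hand out the counter values in a different order.&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;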

&lt;h3 id=&quot;stencil&quot;&gt;Stencil&lt;/h3&gt;

&lt;p&gt;In addition to map, reduce, and comprehension, ParallelAccelerator
targets a fourth parallel pattern:
&lt;a href=&quot;https://en.wikipedia.org/wiki/Stencil_code&quot;&gt;stencil computations&lt;/a&gt;.  A
stencil computation updates the elements of an array according to a
fixed pattern called a stencil.  In fact, the &lt;code class=&quot;highlighter-rouge&quot;&gt;avg&lt;/code&gt; comprehension
example above could also be thought of as a stencil computation,
because it updates the contents of an array based on each element’s
neighbors.  However, stencil computations differ from the other
patterns that ParallelAccelerator targets, because Julia has no
built-in, user-facing language feature that expresses stencil
computations specifically.  So, ParallelAccelerator introduces a new
user-facing language construct called &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; for expressing
stencil computations in Julia.  Next, we’ll look at an example that
illustrates how &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; works.&lt;/p&gt;

&lt;h2 id=&quot;example-blurring-an-image-with-runstencil&quot;&gt;Example: Blurring an image with runStencil&lt;/h2&gt;

&lt;p&gt;Let’s consider a stencil computation that blurs an image using a
&lt;a href=&quot;https://en.wikipedia.org/wiki/Gaussian_blur&quot;&gt;Gaussian blur&lt;/a&gt;.  The
image is represented as a two-dimensional array of pixels.  To blur
the image, we set the value of each output pixel to a particular
weighted average of the corresponding input pixel’s value and the
values of its neighboring input pixels.  By repeating this process
multiple times, we can get an increasingly blurred
image. [&lt;a href=&quot;#footnote4&quot;&gt;4&lt;/a&gt;]&lt;/p&gt;
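
&lt;p&gt;For a single output pixel, that weighted average can be sketched as follows. The 5x5 weights are the ones that appear in the blur code in this post; the names &lt;code class=&quot;highlighter-rouge&quot;&gt;W&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;blur_pixel&lt;/code&gt; are hypothetical, and the sketch assumes the pixel lies at least two pixels away from every edge of the image.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# 5x5 weights, copied from the blur code in this post.
const W = [0.0030 0.0133 0.0219 0.0133 0.0030;
           0.0133 0.0596 0.0983 0.0596 0.0133;
           0.0219 0.0983 0.1621 0.0983 0.0219;
           0.0133 0.0596 0.0983 0.0596 0.0133;
           0.0030 0.0133 0.0219 0.0133 0.0030]

# Illustrative only: weighted average of the 5x5 neighborhood centered on (i, j).
# Assumes (i, j) is at least two pixels away from every edge of img.
function blur_pixel(img::Matrix{Float32}, i::Int, j::Int)
    acc = 0.0
    for dj = -2:2, di = -2:2
        acc += W[di + 3, dj + 3] * img[i + di, j + dj]
    end
    return Float32(acc)
end&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The weights sum to roughly 0.9997, so a region of uniform brightness keeps almost exactly the same value after blurring.&lt;/p&gt;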

&lt;p&gt;The following code implements a Gaussian blur in Julia.  It operates
on a 2D array of &lt;code class=&quot;highlighter-rouge&quot;&gt;Float32&lt;/code&gt;s: the pixels of the source image.  It’s
easy to obtain such an array using, for instance, the &lt;code class=&quot;highlighter-rouge&quot;&gt;load&lt;/code&gt; function
from the &lt;a href=&quot;https://github.com/timholy/Images.jl&quot;&gt;Images.jl&lt;/a&gt; library,
followed by a call to
&lt;a href=&quot;http://docs.julialang.org/en/release-0.4/manual/conversion-and-promotion/#conversion&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;convert&lt;/code&gt;&lt;/a&gt;
to get an array of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Array{Float32,2}&lt;/code&gt;.  (For simplicity, we’re
assuming that the input image is a grayscale image, so each pixel has
just one value instead of red, green, and blue values.  However, it
would be straightforward to use the same approach for RGB pixels.)&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; blur&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float32&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;h&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;
      &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 
           &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0030&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0030&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.1621&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
           &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;4&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0030&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span 
class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span 
class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;w&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;h&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0030&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here, to compute the value of a pixel in the output image, we use the
corresponding input pixel as well as all its neighboring pixels,
to a depth of two pixels out from the input pixel – so, twenty-four
neighbors.  In all, there are twenty-five pixel values to examine.  We
add all these pixel values together, each multiplied by a weight – in
this case &lt;code class=&quot;highlighter-rouge&quot;&gt;0.0030&lt;/code&gt; for the cornermost pixels, &lt;code class=&quot;highlighter-rouge&quot;&gt;0.1621&lt;/code&gt; for the center
pixel, and for all the other pixels, something in between – and the
total is the value of the output pixel.  At the borders of the image,
we don’t have enough neighboring pixels to compute an output pixel
value, so we simply skip those pixels and don’t assign to
them. [&lt;a href=&quot;#footnote5&quot;&gt;5&lt;/a&gt;]&lt;/p&gt;
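
&lt;p&gt;To make the weighted sum concrete, here is a minimal sketch that computes a single output pixel.  The names &lt;code class=&quot;highlighter-rouge&quot;&gt;weights&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;blur_pixel&lt;/code&gt; are ours, introduced only for illustration; the weight values are the ones that appear in the &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; code.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# 5x5 weights taken from the blur code; rows and columns cover offsets -2..2.
weights = [0.0030 0.0133 0.0219 0.0133 0.0030;
           0.0133 0.0596 0.0983 0.0596 0.0133;
           0.0219 0.0983 0.1621 0.0983 0.0219;
           0.0133 0.0596 0.0983 0.0596 0.0133;
           0.0030 0.0133 0.0219 0.0133 0.0030]

# Hypothetical helper: weighted sum of the 25 pixels centered on (i, j).
# The caller must keep (i, j) at least two pixels away from every border.
function blur_pixel(img, i, j)
    acc = 0.0
    for dj in -2:2, di in -2:2
        acc += img[i + di, j + dj] * weights[di + 3, dj + 3]
    end
    return acc
end

img = ones(Float32, 8, 8)
println(blur_pixel(img, 4, 4))  # close to 1, since the weights sum to roughly 0.9997&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;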

&lt;p&gt;Notice that the &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; function explicitly loops over the number of
iterations, that is, the number of times to apply the blur to the image, but it
does not explicitly loop over pixels in the image.  Instead, the code
is written in array style: it performs just one assignment to the
array &lt;code class=&quot;highlighter-rouge&quot;&gt;img&lt;/code&gt;, using the ranges &lt;code class=&quot;highlighter-rouge&quot;&gt;3:w-2&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;3:h-2&lt;/code&gt; to avoid assigning
to the borders of the image.  On a
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/blob/master/examples/example.jpg&quot;&gt;large grayscale input image&lt;/a&gt;
of 7095 by 5322 pixels, this code takes about 10 minutes to run for
100 iterations.&lt;/p&gt;
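
&lt;p&gt;If array style is unfamiliar, the following one-dimensional sketch shows the same idea on a small vector.  It is our own toy example, not part of the benchmark: a three-point average written as a single assignment over ranges, with the first and last elements left alone.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;v = [0.0, 0.0, 9.0, 0.0, 0.0]
n = length(v)

# One array-style assignment; the range 2:n-1 avoids writing to the borders.
# The whole right-hand side is evaluated before anything is stored in v.
v[2:n-1] = (v[1:n-2] + v[2:n-1] + v[3:n]) / 3

println(v)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;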

&lt;p&gt;Using ParallelAccelerator, we can get much better performance.  Let’s look at a version of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; that uses &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@acc&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; blur&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float32&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;},&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;kt&quot;&gt;Float32&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; 
    &lt;span class=&quot;n&quot;&gt;runStencil&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;buf&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iterations&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;oob_skip&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;do&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;
       &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 
            &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.003&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span 
class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0030&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
             &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; 
&lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
             &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.1621&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; 
&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
             &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0983&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0596&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; 
&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;
             &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.003&lt;/span&gt;  &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0219&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0133&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; 
&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;*&lt;/span&gt; &lt;span class=&quot;mf&quot;&gt;0.0030&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
       &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;b&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;return&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;img&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here, we again have a function called &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; – now annotated with
&lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt; – that takes the same arguments as the original code.  This
version of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; allocates a new 2D array called &lt;code class=&quot;highlighter-rouge&quot;&gt;buf&lt;/code&gt; that is the
same size as the original &lt;code class=&quot;highlighter-rouge&quot;&gt;img&lt;/code&gt; array.  The allocation of &lt;code class=&quot;highlighter-rouge&quot;&gt;buf&lt;/code&gt; is
followed by a call to &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;.  Let’s take a closer look at the
&lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; call.&lt;/p&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; has the following signature:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;runStencil&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;kernel&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Function&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;buffer2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;iteration&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;kt&quot;&gt;Int&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;boundaryHandling&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;::&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Symbol&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt;, the call to &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; uses Julia’s
&lt;a href=&quot;http://docs.julialang.org/en/release-0.4/manual/functions/#do-block-syntax-for-function-arguments&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;do&lt;/code&gt;-block syntax for function arguments&lt;/a&gt;,
so the &lt;code class=&quot;highlighter-rouge&quot;&gt;do b, a ... end&lt;/code&gt; block is actually the first argument to the
&lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; call.  The &lt;code class=&quot;highlighter-rouge&quot;&gt;do&lt;/code&gt; block creates an anonymous function that
binds the variables &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt;.  The arguments &lt;code class=&quot;highlighter-rouge&quot;&gt;buffer1, buffer2,
...&lt;/code&gt; that are passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; become the arguments to the
anonymous function.  In this case, we are passing two buffers, &lt;code class=&quot;highlighter-rouge&quot;&gt;buf&lt;/code&gt;
and &lt;code class=&quot;highlighter-rouge&quot;&gt;img&lt;/code&gt;, to &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;, and so the anonymous function takes two
arguments.&lt;/p&gt;
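
&lt;p&gt;For readers new to &lt;code class=&quot;highlighter-rouge&quot;&gt;do&lt;/code&gt; blocks, here is a small sketch of the calling convention.  The function &lt;code class=&quot;highlighter-rouge&quot;&gt;call_kernel&lt;/code&gt; is a hypothetical stand-in that, like &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;, takes a function as its first argument followed by two buffers; it is not part of ParallelAccelerator.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical stand-in: function argument first, then the buffers.
call_kernel(kernel, buf1, buf2) = kernel(buf1, buf2)

# The do block becomes the first argument (kernel); b and a are bound
# to the two buffers in the order they are passed.
result = call_kernel([0, 0], [1, 2]) do b, a
    b[1] = a[1] + a[2]
    return a, b
end

println(result)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;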

&lt;p&gt;Aside from the anonymous function and the two buffers, &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;
takes two other arguments.  The first of these is the number of
iterations for which we want to run the stencil computation.  In this
case, we simply pass along the &lt;code class=&quot;highlighter-rouge&quot;&gt;iterations&lt;/code&gt; argument that is passed to
&lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt;.  Finally, the last argument to &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; is a symbol
indicating how stencil boundaries are to be handled.  Here, we’re
using the &lt;code class=&quot;highlighter-rouge&quot;&gt;:oob_skip&lt;/code&gt; symbol, short for “out-of-bounds skip”.  It
means that when input indices are out of bounds – for instance, in
the situation where the input pixel is one of those on the two-pixel
border of the image, and there aren’t enough neighbor pixels to
compute the output pixel value – then we simply skip writing to the
output pixel.  This has the same effect as the careful indexing in the
original version of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Finally, let’s look at the body of the &lt;code class=&quot;highlighter-rouge&quot;&gt;do&lt;/code&gt; block that we’re passing
to &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;.  It contains an assignment to &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt;, using values
computed from &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt;.  As we’ve said, &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; here are &lt;code class=&quot;highlighter-rouge&quot;&gt;buf&lt;/code&gt; and
&lt;code class=&quot;highlighter-rouge&quot;&gt;img&lt;/code&gt;: our newly-allocated buffer, and the original image.  The code
here is similar to that of the original implementation of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt;, but
here we’re using &lt;em&gt;relative&lt;/em&gt; rather than absolute indexing into arrays.
The index &lt;code class=&quot;highlighter-rouge&quot;&gt;0,0&lt;/code&gt; in &lt;code class=&quot;highlighter-rouge&quot;&gt;b[0,0]&lt;/code&gt; doesn’t refer to any particular element of
&lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt;, but instead to the current position of a cursor that can be
thought of as traversing all the elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt;.  On the right side
of the assignment, &lt;code class=&quot;highlighter-rouge&quot;&gt;a[-2,-1]&lt;/code&gt; refers to the element in &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; that is
two elements to the left and one element up from the &lt;code class=&quot;highlighter-rouge&quot;&gt;0,0&lt;/code&gt; element of
&lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt;.  In this way, we can express a stencil computation more concisely
than the original version of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; did, and we don’t have to worry
about getting the indices correct for boundary handling as we had to
do before, because the &lt;code class=&quot;highlighter-rouge&quot;&gt;:oob_skip&lt;/code&gt; argument tells &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;
everything it needs to know to handle boundaries correctly.&lt;/p&gt;
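
&lt;p&gt;One way to picture relative indexing with &lt;code class=&quot;highlighter-rouge&quot;&gt;:oob_skip&lt;/code&gt; is as a plain loop in which the cursor visits every position and skips those whose neighbors would be out of bounds.  The sketch below does this for a much simpler three-point stencil.  It illustrates the semantics only, under our own naming (&lt;code class=&quot;highlighter-rouge&quot;&gt;stencil_pass!&lt;/code&gt; is hypothetical), and says nothing about how ParallelAccelerator actually implements &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Emulates b[0,0] = (a[-1,0] + a[0,0] + a[1,0]) / 3 with skip-on-out-of-bounds.
function stencil_pass!(b, a)
    h, w = size(a)
    for j in 1:w, i in 1:h
        # Skip cursor positions where a[i-1, j] or a[i+1, j] does not exist,
        # leaving the corresponding element of b untouched.
        (i in 2:h-1) || continue
        # Relative offsets -1, 0, 1 are resolved against the cursor (i, j).
        b[i, j] = (a[i - 1, j] + a[i, j] + a[i + 1, j]) / 3
    end
    return b
end

a = [1.0 1.0; 4.0 4.0; 7.0 7.0]
b = zeros(3, 2)
stencil_pass!(b, a)
println(b)  # only the middle row is written&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;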

&lt;p&gt;Finally, at the end of the &lt;code class=&quot;highlighter-rouge&quot;&gt;do&lt;/code&gt; block, we return &lt;code class=&quot;highlighter-rouge&quot;&gt;a, b&lt;/code&gt;.  They were
bound as &lt;code class=&quot;highlighter-rouge&quot;&gt;b, a&lt;/code&gt;, but we return them in the opposite order so that for
each iteration of the stencil, we’ll be using the already-blurred
buffer as the input for another round of blurring.  This continues for
however many iterations we’ve specified.  There’s therefore no need to
write an explicit &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; loop for stencil iterations when using
&lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;; one just passes an argument saying how many iterations
should occur.&lt;/p&gt;
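
&lt;p&gt;The effect of returning the buffers in swapped order can be sketched with an ordinary loop.  Both &lt;code class=&quot;highlighter-rouge&quot;&gt;run_iterations&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;double!&lt;/code&gt; below are hypothetical names of ours, and the driver is only a model of the behavior described above, not ParallelAccelerator’s implementation.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Model driver: each iteration writes into b using a, then swaps their roles
# according to the order in which the kernel returns them.
function run_iterations(kernel, buf, img, iterations)
    b, a = buf, img
    for k in 1:iterations
        b, a = kernel(b, a)
    end
    return a  # after the final swap, a is the buffer written most recently
end

# Toy kernel: write twice the input into the output, then return (a, b).
function double!(b, a)
    b[:] = 2 * a
    return a, b
end

println(run_iterations(double!, zeros(3), [1.0, 2.0, 3.0], 3))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;With three iterations, the input is doubled three times, and each round reads the output of the previous one.&lt;/p&gt;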

&lt;p&gt;In short, &lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; enables us to write more concise code than
plain Julia, as we’d expect from a language extension.  But where
&lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt; really shines is in the performance it enables.  The
following figure compares performance results for plain Julia and
ParallelAccelerator implementations of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt;, each running for 100
iterations on the aforementioned 7095x5322 source image, run using the
same machine as for the previous Black-Scholes benchmark.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/parallelaccelerator_figures/gaussian-blur-2016-03-02-blogpost.png?raw=true&quot; alt=&quot;Benchmark results for plain Julia and ParallelAccelerator implementations of Gaussian blur&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The rightmost column shows the results for plain Julia, using the
first implementation of &lt;code class=&quot;highlighter-rouge&quot;&gt;blur&lt;/code&gt; shown above.  The three columns to the
left show results for the ParallelAccelerator version that uses
&lt;code class=&quot;highlighter-rouge&quot;&gt;runStencil&lt;/code&gt;.  As we can see, even when running on just one thread,
ParallelAccelerator enables a speedup of about 15x: from about 600
seconds to about 40 seconds.  Running on 36 threads provides a further
parallel speedup of more than 26x, resulting in a total speedup of
nearly 400x over plain single-threaded Julia.&lt;/p&gt;
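
&lt;p&gt;As a quick check on how these figures combine, the arithmetic can be written out using the rounded numbers quoted above (so the results are approximate):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;plain_seconds = 600.0       # approximate plain Julia time
one_thread_seconds = 40.0   # approximate ParallelAccelerator time on one thread

serial_speedup = plain_seconds / one_thread_seconds  # 15.0
total_speedup = serial_speedup * 26                  # 390.0, using the 26x lower bound

println(total_speedup)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;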

&lt;h2 id=&quot;an-overview-of-the-parallelaccelerator-compiler-architecture&quot;&gt;An overview of the ParallelAccelerator compiler architecture&lt;/h2&gt;

&lt;p&gt;Now that we’ve talked about the parallel patterns that
ParallelAccelerator speeds up and seen some code examples, let’s take
a look at how the ParallelAccelerator compiler works.&lt;/p&gt;

&lt;p&gt;The standard Julia JIT compiler parses Julia source code into the
Julia abstract syntax tree (AST) representation.  It performs type
inference on the AST, then transforms the AST to LLVM IR, and finally
generates native assembly code.  ParallelAccelerator intercepts this
process at the level of the AST.  It introduces new AST nodes for the
parallel patterns we discussed above.  It then does various
optimizations on the resulting AST.  Finally, it generates C++ code
that can be compiled by an external C++ compiler.  The following
figure shows an overview of the ParallelAccelerator compilation
process:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/parallelaccelerator_figures/compiler-pipeline.png?raw=true&quot; alt=&quot;The ParallelAccelerator compiler pipeline&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As many readers of this blog will know, Julia has good support for
&lt;a href=&quot;http://docs.julialang.org/en/release-0.4/devdocs/reflection/&quot;&gt;inspecting and manipulating its own ASTs&lt;/a&gt;.
Its built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;code_typed&lt;/code&gt; function will return the AST of any function
after Julia’s type inference has taken place.  This is very convenient
for ParallelAccelerator, which is able to use the output from
&lt;code class=&quot;highlighter-rouge&quot;&gt;code_typed&lt;/code&gt; as the input to the first pass of its compiler, which is
called “Domain Transformations”.  The Domain Transformations pass
produces ParallelAccelerator’s &lt;em&gt;Domain AST&lt;/em&gt; intermediate
representation.&lt;/p&gt;
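
&lt;p&gt;To get a feel for the input that such a pass works from, you can call &lt;code class=&quot;highlighter-rouge&quot;&gt;code_typed&lt;/code&gt; yourself on a small function.  The function &lt;code class=&quot;highlighter-rouge&quot;&gt;axpy&lt;/code&gt; below is an illustrative name of ours, and the exact form of the printed output depends on the Julia version.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# A tiny function to inspect (illustrative only).
axpy(a, x, y) = a * x + y

# Request the type-inferred code for one concrete argument signature.
typed = Base.code_typed(axpy, (Float64, Float64, Float64))

println(typed)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;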

&lt;p&gt;Domain AST is similar to Julia’s AST, except it introduces new AST
nodes for parallel patterns that it identifies.  We call these nodes
“domain nodes”, collectively.  The Domain Transformations pass
replaces certain parts of the AST with domain nodes.&lt;/p&gt;

&lt;p&gt;The Domain Transformations pass is followed by the Parallel
Transformations pass, which replaces domain nodes with “parfor” nodes,
each of which represents one or more nested parallel &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; loops.
Loop fusion also takes place during the Parallel Transformations pass.
We call the result of Parallel Transformations &lt;em&gt;Parallel
AST&lt;/em&gt;. [&lt;a href=&quot;#footnote6&quot;&gt;6&lt;/a&gt;]&lt;/p&gt;

&lt;p&gt;The compiler hands off Parallel AST code to the last pass of the
compiler, CGen, which generates C++ code and converts parfor nodes
into OpenMP loops.  Finally, an external C++ compiler creates an
executable which is linked to OpenMP and to a small array runtime
component written in C that manages the transfer of arrays back and
forth between Julia and C++.&lt;/p&gt;

&lt;h2 id=&quot;caveats&quot;&gt;Caveats&lt;/h2&gt;

&lt;p&gt;ParallelAccelerator is still a proof of concept at this stage.  Users
should be aware of two issues that can stand in the way of being able
to make effective use of ParallelAccelerator.  Those issues are,
first, package load time, and second, limitations in which Julia
programs ParallelAccelerator is able to handle.  We discuss each of
these issues in turn.&lt;/p&gt;

&lt;h3 id=&quot;package-load-time&quot;&gt;Package load time&lt;/h3&gt;

&lt;p&gt;Because ParallelAccelerator is a large Julia package (it’s a compiler,
after all), it takes a long time (perhaps 20 or 25 seconds on a 4-core
desktop machine) for &lt;code class=&quot;highlighter-rouge&quot;&gt;using ParallelAccelerator&lt;/code&gt; to run.  This long
pause is &lt;em&gt;not&lt;/em&gt; the time that ParallelAccelerator is taking to compile
your &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated code; it’s the time that Julia is taking to
compile ParallelAccelerator itself.  After this initial pause, the
first call to an &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function will incur a brief
compilation pause (this time from the ParallelAccelerator compiler,
not Julia itself) of perhaps a couple of seconds.  Subsequent calls to
the same function won’t incur the compilation pause.&lt;/p&gt;

&lt;p&gt;Let’s see what these compilation pauses look like in practice.  The
ParallelAccelerator package comes with a collection of
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/tree/master/examples&quot;&gt;example programs&lt;/a&gt;
that print timing information, including the
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/blob/master/examples/black-scholes/black-scholes.jl&quot;&gt;Black-Scholes&lt;/a&gt;
and
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/blob/master/examples/gaussian-blur/gaussian-blur.jl&quot;&gt;Gaussian blur&lt;/a&gt;
examples shown in this post.  All the examples print timing
information for two calls to an &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function: first a
“warm-up” call with trivial arguments to measure compilation time, and
then a more realistic call.  In the output printed by each example,
timing information for the more realistic call is preceded by the
string &lt;code class=&quot;highlighter-rouge&quot;&gt;&quot;SELFTIMED&quot;&lt;/code&gt;, while timing information for the warm-up call is
preceded by &lt;code class=&quot;highlighter-rouge&quot;&gt;&quot;SELFPRIMED&quot;&lt;/code&gt;.  Let’s run the Black-Scholes example and
time it using the &lt;code class=&quot;highlighter-rouge&quot;&gt;time&lt;/code&gt; shell command:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-bash&quot; data-lang=&quot;bash&quot;&gt;&lt;span class=&quot;gp&quot;&gt;$ &lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;time &lt;/span&gt;julia ParallelAccelerator/examples/black-scholes/black-scholes.jl 
iterations &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 10000000
SELFPRIMED 1.766323497
checksum: 2.0954821257116848e8
rate &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; 1.9205394841503927e8 opts/sec
SELFTIMED 0.052068703

real	0m26.454s
user	0m31.027s
sys	0m0.874s&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Here, we’re running Black-Scholes for 10,000,000 iterations on our
4-core desktop machine.  The total wall-clock time of 26.454 seconds
consists mostly of the time it takes for &lt;code class=&quot;highlighter-rouge&quot;&gt;using ParallelAccelerator&lt;/code&gt;
to run.  Once that’s done, the example reports a &lt;code class=&quot;highlighter-rouge&quot;&gt;SELFPRIMED&lt;/code&gt; time of about
1.8 seconds, which is dominated by the time it takes for
ParallelAccelerator to compile the &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated code, and finally
the &lt;code class=&quot;highlighter-rouge&quot;&gt;SELFTIMED&lt;/code&gt; time is about 0.05 seconds for this problem size.&lt;/p&gt;
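
&lt;p&gt;The breakdown can be checked with a little arithmetic on the numbers printed
above.  This is only a sketch: the variable names are ours, and the remainder of
roughly 24.6 seconds is an upper bound on package load time, since it also
includes Julia startup and the rest of the script.  The implied throughput agrees
with the rate printed by the example:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;total_wall = 26.454        # wall-clock seconds from the real line of the time output
selfprimed = 1.766323497   # warm-up call, dominated by ParallelAccelerator compilation
selftimed  = 0.052068703   # the more realistic call
iterations = 10000000

# What is left over: Julia startup, package load, and the rest of the script
remainder = total_wall - selfprimed - selftimed
@show remainder

# Throughput implied by the timed call, for comparison with the printed rate
rate = iterations / selftimed
@show rate&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;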

&lt;p&gt;As Julia’s compilation speed improves, we expect that
package load time will be less of a problem for ParallelAccelerator.&lt;/p&gt;

&lt;h3 id=&quot;compiler-limitations&quot;&gt;Compiler limitations&lt;/h3&gt;

&lt;p&gt;ParallelAccelerator is able to handle only a limited subset of Julia
language features, and it only supports a limited subset of Julia’s
&lt;code class=&quot;highlighter-rouge&quot;&gt;Base&lt;/code&gt; library functions.  In other words, you cannot yet put an
&lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt; annotation on arbitrary Julia code and expect it to go faster out of
the box.  The examples in this post give an idea of what kinds of
programs are supported currently; for more, check out the
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/tree/master/examples&quot;&gt;full collection of ParallelAccelerator examples&lt;/a&gt;.
However, if ParallelAccelerator can’t compile some code in an
&lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function, it will simply fall back to running the
function under regular Julia.  So your code will run, regardless of
whether ParallelAccelerator can speed it up.&lt;/p&gt;

&lt;p&gt;One reason why an &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function might fail to compile is
that ParallelAccelerator tries to transitively compile every Julia
function that is called by the &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function.  So, if an
&lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function makes several Julia library calls,
ParallelAccelerator will attempt to compile those functions as well –
and every Julia function that &lt;em&gt;they&lt;/em&gt; call, and so on.  If any of the
code in the call chain contains a feature that ParallelAccelerator
doesn’t currently support, ParallelAccelerator will fail to compile
the original &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;-annotated function.  It is therefore a good idea
to begin by annotating small (but expensive) computational kernels
with &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;, rather than wrapping an entire program in an &lt;code class=&quot;highlighter-rouge&quot;&gt;@acc&lt;/code&gt;
block.  The ParallelAccelerator
&lt;a href=&quot;http://parallelacceleratorjl.readthedocs.org/en/latest/limits.html&quot;&gt;documentation&lt;/a&gt;
has many more details on which Julia features we don’t support and why.&lt;/p&gt;

&lt;p&gt;These limitations explain why the kind of performance improvements
that ParallelAccelerator provides aren’t already the default in Julia.
Supporting all of Julia would be a major undertaking; however, in many
cases, there’s not a fundamental reason why ParallelAccelerator
couldn’t support a particular Julia feature or a function in &lt;code class=&quot;highlighter-rouge&quot;&gt;Base&lt;/code&gt;,
and supporting it is a matter of learning that its absence is a problem for
users and putting in the necessary engineering effort to fix it.  So,
when you come across code that ParallelAccelerator can’t handle,
please do
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/issues&quot;&gt;file bugs&lt;/a&gt;!&lt;/p&gt;

&lt;h2 id=&quot;conclusion&quot;&gt;Conclusion&lt;/h2&gt;

&lt;p&gt;In this post, we’ve introduced
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl&quot;&gt;ParallelAccelerator.jl&lt;/a&gt;,
a package for speeding up array-style Julia programs.  It works by
identifying implicit parallel patterns in source code and compiling
them to efficient, explicitly parallel executables, along the way
getting rid of many of the usual overheads of high-level array-style
programming.&lt;/p&gt;

&lt;p&gt;ParallelAccelerator is an open source project in its early stages, and
we enthusiastically encourage comments, questions,
&lt;a href=&quot;https://github.com/IntelLabs/ParallelAccelerator.jl/issues&quot;&gt;bug reports&lt;/a&gt;,
and contributions from the Julia community.  We welcome everyone’s
participation, and we are especially interested in how
ParallelAccelerator can be used to speed up real-world Julia programs.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;a name=&quot;footnote1&quot;&gt;&lt;/a&gt;[1] Starting with Julia 0.5, Julia will have its own
native threading support, which means that ParallelAccelerator can
target Julia’s own native threads instead of generating C++ OpenMP
code for parallelism.  We’ve begun work on implementing a
native-threading-based backend for ParallelAccelerator, but we still
target C++ by default.&lt;/p&gt;

&lt;p&gt;&lt;a name=&quot;footnote2&quot;&gt;&lt;/a&gt;[2] Detailed machine and benchmarking
specifications: We use a machine with two Intel Xeon E5-2699 v3
processors (2.3 GHz) with 18 physical cores each and 128 GB RAM,
running the CentOS 6.7 Linux distribution.  We use the Intel C++
Compiler (ICC) v15.0.2 with “-O3” for compilation of the generated C++
code.  The Julia version is 0.4.4-pre+26.  The results shown are the
average of three runs (we run each version of a benchmark five times
and discard the first and last runs).&lt;/p&gt;

&lt;p&gt;&lt;a name=&quot;footnote3&quot;&gt;&lt;/a&gt;[3] In Julia, it is not possible to index into
a comprehension’s output array in the body of the comprehension.  (The
&lt;code class=&quot;highlighter-rouge&quot;&gt;avg&lt;/code&gt; example indexes only into the input array, not the output
array.)  Therefore, it’s not necessary to do any bounds checking on
writes to the output array.  However, we still need to bounds-check
reads from the input array (for instance, in the &lt;code class=&quot;highlighter-rouge&quot;&gt;avg&lt;/code&gt; example, if
we’d written &lt;code class=&quot;highlighter-rouge&quot;&gt;0.25*x[i-2]&lt;/code&gt;, that would be out of bounds), so we cannot
avoid all array bounds checking for comprehensions in the way that we
can for map operations.&lt;/p&gt;

&lt;p&gt;&lt;a name=&quot;footnote4&quot;&gt;&lt;/a&gt;[4] In practice, rather than applying
successive Gaussian blurs to an image, we’d probably apply a single,
larger Gaussian blur, which, as
&lt;a href=&quot;https://en.wikipedia.org/wiki/Gaussian_blur&quot;&gt;Wikipedia notes&lt;/a&gt;, is at
least as efficient computationally.  Nevertheless, we’ll use successive blurs here
as an example of a stencil computation that can be iterated.&lt;/p&gt;

&lt;p&gt;&lt;a name=&quot;footnote5&quot;&gt;&lt;/a&gt;[5] A more sophisticated implementation of
Gaussian blur might do a fancier form of border handling, using only
the pixels it has available at the borders.&lt;/p&gt;

&lt;p&gt;&lt;a name=&quot;footnote6&quot;&gt;&lt;/a&gt;[6] The names “Domain AST” and “Parallel AST”
are inspired by the Domain IR and Parallel IR of the
&lt;a href=&quot;https://ppl.stanford.edu/papers/pact11-brown.pdf&quot;&gt;Delite compiler framework&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Multidimensional algorithms and iteration</title>
   <link href="http://julialang.org/blog/2016/02/iteration"/>
   <updated>2016-02-01T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/02/iteration</id>
   <content type="html">&lt;p&gt;Starting with release 0.4, Julia makes it easy to write elegant and
efficient multidimensional algorithms. The new capabilities rest on
two foundations: a new type of iterator, called &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt;, and
sophisticated array indexing mechanisms.  Before I explain, let me
emphasize that developing these capabilities was a collaborative
effort, with the bulk of the work done by Matt Bauman (@mbauman),
Jutho Haegeman (@Jutho), and myself (@timholy).&lt;/p&gt;

&lt;p&gt;These new iterators are deceptively simple, so much so that I’ve never
been entirely convinced that this blog post is necessary: once you
learn a few principles, there’s almost nothing to it.  However, like
many simple concepts, the implications can take a while to sink in.
There also seems to be some widespread confusion about the
relationship between these iterators and
&lt;a href=&quot;http://docs.julialang.org/en/stable/devdocs/cartesian/&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;Base.Cartesian&lt;/code&gt;&lt;/a&gt;,
which is a completely different (and much more painful) approach to
solving the same problem.  There are still a few occasions where
&lt;code class=&quot;highlighter-rouge&quot;&gt;Base.Cartesian&lt;/code&gt; is necessary, but for many problems these new
capabilities represent a vastly simplified approach.&lt;/p&gt;

&lt;p&gt;Let’s introduce these iterators with an extension of an example taken
from the
&lt;a href=&quot;http://docs.julialang.org/en/stable/manual/arrays/#iteration&quot;&gt;manual&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;eachindex-cartesianindex-and-cartesianrange&quot;&gt;eachindex, CartesianIndex, and CartesianRange&lt;/h1&gt;

&lt;p&gt;You may already know that, in Julia 0.4, there are two recommended
ways to iterate over the elements in an &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;: if you don’t
need an index associated with each element, then you can use&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;a&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;    &lt;span class=&quot;c&quot;&gt;# A is an AbstractArray&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# Code that does something with the element a&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;If instead you also need the index, then use&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# Code that does something with i and/or A[i]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;In some cases, the first line of this loop expands to &lt;code class=&quot;highlighter-rouge&quot;&gt;for i =
1:length(A)&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt; is just an integer.  However, in other cases,
this will expand to the equivalent of&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# i is now a CartesianIndex&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# Code that does something with i and/or A[i]&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Let’s see what these objects are:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;rand&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;

&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
          &lt;span class=&quot;nd&quot;&gt;@show&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;
       &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;A &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex{N}&lt;/code&gt; represents an &lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt;-dimensional index.
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt;es are based on tuples, and indeed you can access the
underlying tuple with &lt;code class=&quot;highlighter-rouge&quot;&gt;i.I&lt;/code&gt;.  However, they also support certain
arithmetic operations, treating their contents like a fixed-size
&lt;code class=&quot;highlighter-rouge&quot;&gt;Vector{Int}&lt;/code&gt;.  Since the length is fixed, Julia/LLVM can generate
very efficient code (without introducing loops) for operations with
N-dimensional &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt;es.&lt;/p&gt;
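
&lt;p&gt;Here is a small sketch of that arithmetic, using two hand-built indexes (the
variable names are chosen only for illustration):&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;I  = CartesianIndex((2, 3))   # a 2-dimensional index
I1 = CartesianIndex((1, 1))   # a unit offset along every dimension

@show I.I       # the underlying tuple
@show I + I1    # elementwise addition
@show I - I1    # elementwise subtraction&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;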

&lt;p&gt;A &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt; is just a pair of &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt;es, encoding the
start and stop values along each dimension, respectively:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;You can construct these manually: for example,&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;julia&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;&amp;gt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;
&lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}}(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)),&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;}((&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;7&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;15&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)))&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;constructs a range that will loop over &lt;code class=&quot;highlighter-rouge&quot;&gt;-7:7&lt;/code&gt; along the first
dimension and &lt;code class=&quot;highlighter-rouge&quot;&gt;0:15&lt;/code&gt; along the second.&lt;/p&gt;

&lt;p&gt;One reason that &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt; is recommended over &lt;code class=&quot;highlighter-rouge&quot;&gt;for i = 1:length(A)&lt;/code&gt;
is that some &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt;s cannot be indexed efficiently with a
linear index; in contrast, a much wider class of objects can be
efficiently indexed with a multidimensional iterator.  (SubArrays are,
generally speaking, &lt;a href=&quot;http://docs.julialang.org/en/stable/devdocs/subarrays/#indexing-cartesian-vs-linear-indexing&quot;&gt;a prime
example&lt;/a&gt;.)
&lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt; is designed to pick the most efficient iterator for the
given array type.  You can even use&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;eachindex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;to increase the likelihood that &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt; will be efficient for accessing
both &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;.&lt;/p&gt;
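
&lt;p&gt;As a sketch of how the two-array form might be used, the function below
accumulates elementwise products of two arrays of the same size.  The name
&lt;code class=&quot;highlighter-rouge&quot;&gt;sum_of_products&lt;/code&gt; is hypothetical and not part of Julia:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical helper, for illustration only; A and B must have the same size
function sum_of_products(A::AbstractArray, B::AbstractArray)
    total = zero(eltype(A))
    # eachindex picks an index type intended to work well for both arrays
    for i in eachindex(A, B)
        total += A[i] * B[i]
    end
    total
end

@show sum_of_products([1.0 2.0; 3.0 4.0], [1.0 1.0; 1.0 1.0])&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;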

&lt;p&gt;As we’ll see below, these iterators have another purpose: independent
of whether the underlying arrays have efficient linear indexing,
multidimensional iteration can be a powerful ally when writing
algorithms.  The rest of this blog post will focus on this
latter application.&lt;/p&gt;

&lt;h1 id=&quot;writing-multidimensional-algorithms-with-cartesianindex-iterators&quot;&gt;Writing multidimensional algorithms with CartesianIndex iterators&lt;/h1&gt;

&lt;h2 id=&quot;a-multidimensional-boxcar-filter&quot;&gt;A multidimensional boxcar filter&lt;/h2&gt;

&lt;p&gt;Let’s suppose we have a multidimensional array &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;, and we want to
compute the &lt;a href=&quot;https://en.wikipedia.org/wiki/Boxcar_averager&quot;&gt;“moving
average”&lt;/a&gt; over a
3-by-3-by-… block around each element.  From any given index position,
we’ll want to sum over a region offset by &lt;code class=&quot;highlighter-rouge&quot;&gt;-1:1&lt;/code&gt; along each dimension.
Edge positions have to be treated specially, of course, to avoid going
beyond the bounds of the array.&lt;/p&gt;

&lt;p&gt;In many languages, writing a general (N-dimensional) implementation of
this conceptually simple algorithm is somewhat painful, but in Julia
it’s a piece of cake:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; boxcar3&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;AbstractArray&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;similar&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;R&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;I1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Iend&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;first&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;R&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;last&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;R&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;I&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;R&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;zero&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;J&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;max&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Iend&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;J&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;/&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;out&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Let’s walk through this line by line:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;out = similar(A)&lt;/code&gt; allocates the output. In a “real” implementation,
you’d want to be a little more careful about the element type of the
output (what if the input array element type is &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt;?), but
we’re cutting a few corners here for simplicity.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;R = CartesianRange(size(A))&lt;/code&gt; creates the iterator for the array,
ranging from &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex((1, 1, 1, ...))&lt;/code&gt; to
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex((size(A,1), size(A,2), size(A,3), ...))&lt;/code&gt;.  We don’t
use &lt;code class=&quot;highlighter-rouge&quot;&gt;eachindex&lt;/code&gt;, because we can’t be sure whether that will return a
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt; iterator, and here we explicitly need one.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;I1 = first(R)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Iend = last(R)&lt;/code&gt; return the lower
(&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex((1, 1, 1, ...))&lt;/code&gt;) and upper
(&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex((size(A,1), size(A,2), size(A,3), ...))&lt;/code&gt;) bounds
of the iteration range, respectively.  We’ll use these to ensure
that we never access out-of-bounds elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;.&lt;/p&gt;

    &lt;p&gt;Conveniently, &lt;code class=&quot;highlighter-rouge&quot;&gt;I1&lt;/code&gt; can also be used to compute the offset range.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;for I in R&lt;/code&gt;: here we loop over each entry of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;n = 0&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;s = zero(eltype(out))&lt;/code&gt; initialize the accumulators. &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt;
will hold the sum of neighboring values. &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt; will hold the number of
neighbors used; in most cases, after the loop we’ll have &lt;code class=&quot;highlighter-rouge&quot;&gt;n == 3^N&lt;/code&gt;,
but for edge points the number of valid neighbors will be smaller.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;for J in CartesianRange(max(I1, I-I1), min(Iend, I+I1))&lt;/code&gt; is
probably the most “clever” line in the algorithm.  &lt;code class=&quot;highlighter-rouge&quot;&gt;I-I1&lt;/code&gt; is a
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt; that is lower by 1 along each dimension, and &lt;code class=&quot;highlighter-rouge&quot;&gt;I+I1&lt;/code&gt;
is higher by 1.  Therefore, for interior points, this constructs a
range that extends by 1 in either direction along each
dimension.&lt;/p&gt;

    &lt;p&gt;However, when &lt;code class=&quot;highlighter-rouge&quot;&gt;I&lt;/code&gt; represents an edge point, either &lt;code class=&quot;highlighter-rouge&quot;&gt;I-I1&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;I+I1&lt;/code&gt;
(or both) might be out-of-bounds.  &lt;code class=&quot;highlighter-rouge&quot;&gt;max(I1, I-I1)&lt;/code&gt; ensures that each
coordinate of &lt;code class=&quot;highlighter-rouge&quot;&gt;J&lt;/code&gt; is 1 or larger, while &lt;code class=&quot;highlighter-rouge&quot;&gt;min(Iend, I+I1)&lt;/code&gt; ensures
that &lt;code class=&quot;highlighter-rouge&quot;&gt;J[d] &amp;lt;= size(A,d)&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;The inner loop accumulates the sum in &lt;code class=&quot;highlighter-rouge&quot;&gt;s&lt;/code&gt; and the number of visited
neighbors in &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Finally, we store the average value in &lt;code class=&quot;highlighter-rouge&quot;&gt;out[I]&lt;/code&gt;.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Not only is this implementation simple, but it is surprisingly robust:
for edge points it computes the average of whatever nearest-neighbors
it has available.  It even works if &lt;code class=&quot;highlighter-rouge&quot;&gt;size(A, d) &amp;lt; 3&lt;/code&gt; for some
dimension &lt;code class=&quot;highlighter-rouge&quot;&gt;d&lt;/code&gt;; we don’t need any error checking on the size of &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;.&lt;/p&gt;
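
&lt;p&gt;To see the clamping at work, here is a minimal sketch that counts the
neighbors visited for a few points of a small array.  The helper name
&lt;code class=&quot;highlighter-rouge&quot;&gt;countneighbors&lt;/code&gt; is hypothetical (it is not part of the
implementation above), and like the rest of this post the code assumes a Julia
version that provides &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical helper: count the in-bounds neighbors of index I (including I itself)
function countneighbors(A, I)
    R = CartesianRange(size(A))
    I1, Iend = first(R), last(R)
    n = 0
    for J in CartesianRange(max(I1, I-I1), min(Iend, I+I1))
        n += 1
    end
    n
end

A = zeros(3, 4)
countneighbors(A, CartesianIndex((2, 2)))             # interior point: 3^2 = 9
countneighbors(A, CartesianIndex((1, 4)))             # corner: rows 1:2, columns 3:4, so 4
countneighbors(zeros(2, 1), CartesianIndex((1, 1)))   # tiny array: only 2 neighbors&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;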

&lt;h2 id=&quot;computing-a-reduction&quot;&gt;Computing a reduction&lt;/h2&gt;

&lt;p&gt;For a second example, consider the implementation of multidimensional
&lt;em&gt;reductions&lt;/em&gt;. A reduction takes an input array, and returns an array
(or scalar) of smaller size.  A classic example would be summing along
particular dimensions of an array: given a three-dimensional array,
you might want to compute the sum along dimension 2, leaving
dimensions 1 and 3 intact.&lt;/p&gt;

&lt;h3 id=&quot;the-core-algorithm&quot;&gt;The core algorithm&lt;/h3&gt;

&lt;p&gt;An efficient way to write this algorithm requires that the output
array, &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;, is pre-allocated by the caller (later we’ll see how one
might go about allocating &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; programmatically).  For example, if the
input &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; is of size &lt;code class=&quot;highlighter-rouge&quot;&gt;(l,m,n)&lt;/code&gt;, then when summing along just dimension
2 the output &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; would have size &lt;code class=&quot;highlighter-rouge&quot;&gt;(l,1,n)&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Given this setup, the implementation is shockingly simple:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; sumalongdims&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;c&quot;&gt;# It's assumed that B has size 1 along any dimension that we're summing&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;fill!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Bmax&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianIndex&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;I&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;))&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;min&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Bmax&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;I&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The key idea behind this algorithm is encapsulated in the single
statement &lt;code class=&quot;highlighter-rouge&quot;&gt;B[min(Bmax,I)]&lt;/code&gt;.  For our three-dimensional example where
&lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; is of size &lt;code class=&quot;highlighter-rouge&quot;&gt;(l,m,n)&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; is of size &lt;code class=&quot;highlighter-rouge&quot;&gt;(l,1,n)&lt;/code&gt;, the inner loop
is essentially equivalent to&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;B[i,1,k] += A[i,j,k]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;because &lt;code class=&quot;highlighter-rouge&quot;&gt;min(1,j) = 1&lt;/code&gt;.&lt;/p&gt;
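
&lt;p&gt;A quick way to convince yourself of this is to evaluate the
&lt;code class=&quot;highlighter-rouge&quot;&gt;min&lt;/code&gt; directly.  The sizes below are arbitrary values chosen
only for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Suppose A has size (2, 3, 4) and B has size (2, 1, 4)
Bmax = CartesianIndex((2, 1, 4))

# min acts on each coordinate, so dimension 2 always collapses to 1
Ia = min(Bmax, CartesianIndex((1, 2, 3)))   # equals CartesianIndex((1, 1, 3))
Ib = min(Bmax, CartesianIndex((2, 3, 4)))   # equals CartesianIndex((2, 1, 4))&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;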

&lt;h3 id=&quot;the-wrapper-and-handling-type-instability-using-function-barriers&quot;&gt;The wrapper, and handling type-instability using function barriers&lt;/h3&gt;

&lt;p&gt;As a user, you might prefer an interface more like &lt;code class=&quot;highlighter-rouge&quot;&gt;sumalongdims(A,
dims)&lt;/code&gt; where &lt;code class=&quot;highlighter-rouge&quot;&gt;dims&lt;/code&gt; specifies the dimensions you want to sum along.
&lt;code class=&quot;highlighter-rouge&quot;&gt;dims&lt;/code&gt; might be a single integer, like &lt;code class=&quot;highlighter-rouge&quot;&gt;2&lt;/code&gt; in our example above, or
(should you want to sum along multiple dimensions at once) a tuple or
&lt;code class=&quot;highlighter-rouge&quot;&gt;Vector{Int}&lt;/code&gt;.  This is indeed the interface used in &lt;code class=&quot;highlighter-rouge&quot;&gt;sum(A, dims)&lt;/code&gt;;
here we want to write our own (somewhat simpler) implementation.&lt;/p&gt;

&lt;p&gt;A bare-bones implementation of the wrapper is straightforward:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; sumalongdims&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dims&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dims&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;B&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;sumalongdims!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Obviously, this simple implementation skips all relevant error
checking.  However, here the main point I wish to explore is that the
allocation of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; turns out to be
&lt;a href=&quot;http://docs.julialang.org/en/stable/manual/faq/#what-does-type-stable-mean&quot;&gt;type-unstable&lt;/a&gt;:
&lt;code class=&quot;highlighter-rouge&quot;&gt;sz&lt;/code&gt; is a &lt;code class=&quot;highlighter-rouge&quot;&gt;Vector{Int}&lt;/code&gt;, and because the length (number of elements) of a specific
&lt;code class=&quot;highlighter-rouge&quot;&gt;Vector{Int}&lt;/code&gt; is not encoded by the type itself, the
dimensionality of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; cannot be inferred.&lt;/p&gt;

&lt;p&gt;Now, we could fix that in several ways, for example by annotating the
result:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Array&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;eltype&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;sz&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;nb&quot;&gt;typeof&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;However, this isn’t really necessary: in the remainder of this
function, &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; is not used for any performance-critical operations.
&lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; simply gets passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;sumalongdims!&lt;/code&gt;, and it’s the job of the
compiler to ensure that, given the type of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;, an efficient version
of &lt;code class=&quot;highlighter-rouge&quot;&gt;sumalongdims!&lt;/code&gt; gets generated.  In other words, the type
instability of &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;’s allocation is prevented from “spreading” by the
fact that &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; is henceforth used only as an argument in a function
call.  This trick, using a &lt;a href=&quot;http://docs.julialang.org/en/stable/manual/performance-tips/#separate-kernel-functions&quot;&gt;function-call to separate a
performance-critical step from a potentially type-unstable
precursor&lt;/a&gt;,
is sometimes referred to as introducing a &lt;em&gt;function barrier&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;As a general rule, when writing multidimensional code you should
ensure that the main iteration is in a separate function from
type-unstable precursors.  Even when you take appropriate precautions,
there’s a potential “gotcha”: if your inner loop is small, Julia’s
ability to inline code might eliminate the intended function barrier,
and you get dreadful performance.  For this reason, it’s recommended
that you annotate function-barrier callees with &lt;code class=&quot;highlighter-rouge&quot;&gt;@noinline&lt;/code&gt;:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;nd&quot;&gt;@noinline&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; sumalongdims&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;B&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;A&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Of course, in this example there’s a second motivation for making this
a standalone function: if this calculation is one you’re going to
repeat many times, re-using the same output array can reduce the
amount of memory allocation in your code.&lt;/p&gt;
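
&lt;p&gt;The same pattern applies outside of reductions.  As a minimal sketch (the
names &lt;code class=&quot;highlighter-rouge&quot;&gt;makegrid&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;_fillgrid!&lt;/code&gt; are hypothetical,
chosen only for illustration), a wrapper can build an array whose dimensionality
is known only at run time and then hand it to a separately compiled kernel:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Type-unstable wrapper: length(sz) is a run-time value, so the
# dimensionality of B cannot be inferred here
function makegrid(sz::Vector{Int})
    B = zeros(sz...)
    _fillgrid!(B)   # function barrier: dispatch on the concrete type of B happens here
end

# Inside the kernel the concrete type of B is known to the compiler
@noinline function _fillgrid!(B)
    for i in eachindex(B)
        B[i] = i
    end
    B
end

makegrid([2, 3])      # a 2×3 array
makegrid([2, 3, 4])   # a 2×3×4 array, handled by a different specialization&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;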

&lt;h2 id=&quot;filtering-along-a-specified-dimension-exploiting-multiple-indexes&quot;&gt;Filtering along a specified dimension (exploiting multiple indexes)&lt;/h2&gt;

&lt;p&gt;One final example illustrates an important new point: when you index
an array, you can freely mix &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt;es and
integers.  To illustrate this, we’ll write an &lt;a href=&quot;https://en.wikipedia.org/wiki/Exponential_smoothing&quot;&gt;exponential
smoothing
filter&lt;/a&gt;.  An
efficient way to implement such filters is to have the smoothed output
value &lt;code class=&quot;highlighter-rouge&quot;&gt;s[i]&lt;/code&gt; depend on a combination of the current input &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i]&lt;/code&gt; and
the previous filtered value &lt;code class=&quot;highlighter-rouge&quot;&gt;s[i-1]&lt;/code&gt;; in one dimension, you can write
this as&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; expfilt1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;mi&quot;&gt;0&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;&amp;lt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;||&lt;/span&gt; &lt;span class=&quot;nb&quot;&gt;error&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;s&quot;&gt;&quot;α must be between 0 and 1&quot;&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;length&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
        &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;This would result in an approximately-exponential decay with timescale &lt;code class=&quot;highlighter-rouge&quot;&gt;1/α&lt;/code&gt;.&lt;/p&gt;
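
&lt;p&gt;For instance, feeding a unit impulse through the filter makes the decay
visible.  The block below is a self-contained sketch: it repeats the
one-dimensional filter under the hypothetical name
&lt;code class=&quot;highlighter-rouge&quot;&gt;expfilt1_sketch!&lt;/code&gt;, omits the check on &lt;code class=&quot;highlighter-rouge&quot;&gt;α&lt;/code&gt; for
brevity, and uses input values chosen only for illustration:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Hypothetical stand-alone copy of the 1-D filter (no validation of α)
function expfilt1_sketch!(s, x, α)
    s[1] = x[1]
    for i = 2:length(x)
        s[i] = α*x[i] + (1-α)*s[i-1]
    end
    s
end

x = [1.0, 0.0, 0.0, 0.0]                   # unit impulse
s = expfilt1_sketch!(similar(x), x, 0.5)
# s == [1.0, 0.5, 0.25, 0.125]: each step retains a fraction (1-α) of the previous value&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;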

&lt;p&gt;Here, we want to implement this algorithm so that it can be used to
exponentially filter an array along any chosen dimension.  Once again,
the implementation is surprisingly simple:&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; expfiltdim&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;::&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Integer&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;similar&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Rpre&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;Rpost&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;CartesianRange&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;+&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;])&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;_expfilt!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;size&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;dim&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;),&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;

&lt;span class=&quot;nd&quot;&gt;@noinline&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;function&lt;/span&gt;&lt;span class=&quot;nf&quot;&gt; _expfilt&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;!&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpost&lt;/span&gt;
        &lt;span class=&quot;c&quot;&gt;# Initialize the first value along the filtered dimension&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpre&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;c&quot;&gt;# Handle all other entries&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;mi&quot;&gt;2&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;:&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;n&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt; &lt;span class=&quot;k&quot;&gt;in&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Rpre&lt;/span&gt;
                &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;=&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;+&lt;/span&gt; &lt;span class=&quot;x&quot;&gt;(&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;α&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;)&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;*&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;o&quot;&gt;-&lt;/span&gt;&lt;span class=&quot;mi&quot;&gt;1&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt;
            &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;
    &lt;span class=&quot;n&quot;&gt;s&lt;/span&gt;
&lt;span class=&quot;k&quot;&gt;end&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;Note once again the use of the function barrier technique.  In the
core algorithm (&lt;code class=&quot;highlighter-rouge&quot;&gt;_expfilt!&lt;/code&gt;), our strategy is to use &lt;em&gt;two&lt;/em&gt;
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt; iteration variables, &lt;code class=&quot;highlighter-rouge&quot;&gt;Ipre&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Ipost&lt;/code&gt;, where the first covers
dimensions &lt;code class=&quot;highlighter-rouge&quot;&gt;1:dim-1&lt;/code&gt; and the second &lt;code class=&quot;highlighter-rouge&quot;&gt;dim+1:ndims(x)&lt;/code&gt;; the filtering
dimension &lt;code class=&quot;highlighter-rouge&quot;&gt;dim&lt;/code&gt; is handled separately by an integer-index &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt;.
Because the filtering dimension is specified by an integer input,
there is no way to infer how many entries will be within each
index-tuple &lt;code class=&quot;highlighter-rouge&quot;&gt;Ipre&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Ipost&lt;/code&gt;.  Hence, we compute the &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt;s in
the type-unstable portion of the algorithm, and then pass them as
arguments to the core routine &lt;code class=&quot;highlighter-rouge&quot;&gt;_expfilt!&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;What makes this implementation possible is the fact that we can index
&lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; as &lt;code class=&quot;highlighter-rouge&quot;&gt;x[Ipre, i, Ipost]&lt;/code&gt;.  Note that the total number of indexes
supplied is &lt;code class=&quot;highlighter-rouge&quot;&gt;(dim-1) + 1 + (ndims(x)-dim)&lt;/code&gt;, which is just &lt;code class=&quot;highlighter-rouge&quot;&gt;ndims(x)&lt;/code&gt;.
In general, you can supply any combination of integer and
&lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndex&lt;/code&gt; indexes when indexing an &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray&lt;/code&gt; in Julia.&lt;/p&gt;
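
&lt;p&gt;A minimal sketch of both ideas, the function barrier and mixed indexing, is given below. It is not taken from any package: the names &lt;code class=&quot;highlighter-rouge&quot;&gt;sum_along&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;_sum_along&lt;/code&gt; are hypothetical, and the code assumes Julia 1.x, where &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianIndices&lt;/code&gt; plays the role of the &lt;code class=&quot;highlighter-rouge&quot;&gt;CartesianRange&lt;/code&gt; used in this post.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;# Type-unstable outer function: the lengths of the two index tuples depend
# on the runtime value of dim, so the ranges are built here.
function sum_along(A, dim)
    Rpre  = CartesianIndices(size(A)[1:dim-1])
    Rpost = CartesianIndices(size(A)[dim+1:end])
    return _sum_along(A, Rpre, size(A, dim), Rpost)
end

# Core routine behind the function barrier: compiled for the concrete
# types of Rpre and Rpost.
function _sum_along(A, Rpre, n, Rpost)
    total = zero(eltype(A))
    for Ipost in Rpost, i in 1:n, Ipre in Rpre
        total += A[Ipre, i, Ipost]   # (dim-1) + 1 + (ndims(A)-dim) indexes
    end
    return total
end

A = reshape(collect(1:24), 2, 3, 4)
sum_along(A, 2) == sum(A)   # the loop visits every element exactly once&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;The leftmost loop in the combined &lt;code class=&quot;highlighter-rouge&quot;&gt;for&lt;/code&gt; statement is the outermost one, so &lt;code class=&quot;highlighter-rouge&quot;&gt;Ipre&lt;/code&gt; varies fastest.&lt;/p&gt;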

&lt;p&gt;The &lt;a href=&quot;https://github.com/timholy/AxisAlgorithms.jl&quot;&gt;AxisAlgorithms&lt;/a&gt;
package makes heavy use of tricks such as these, and in turn provides
core support for high-performance packages like
&lt;a href=&quot;https://github.com/tlycken/Interpolations.jl&quot;&gt;Interpolations&lt;/a&gt; that
require multidimensional computation.&lt;/p&gt;

&lt;h1 id=&quot;additional-issues&quot;&gt;Additional issues&lt;/h1&gt;

&lt;p&gt;It’s worth noting one point that has thus far remained unstated: all
of the examples here are relatively &lt;em&gt;cache efficient&lt;/em&gt;.  This is a key
property to observe when writing &lt;a href=&quot;http://julialang.org/blog/2013/09/fast-numeric/&quot;&gt;efficient
code&lt;/a&gt;.  In
particular, Julia arrays are stored in first-to-last dimension order
(for matrices, “column-major” order), and hence you should nest
iterations from last-to-first dimensions.  For example, in the
filtering example above we were careful to iterate in the order&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;&lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
    &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
        &lt;span class=&quot;k&quot;&gt;for&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;
            &lt;span class=&quot;n&quot;&gt;x&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;[&lt;/span&gt;&lt;span class=&quot;n&quot;&gt;Ipre&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;i&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;,&lt;/span&gt; &lt;span class=&quot;n&quot;&gt;Ipost&lt;/span&gt;&lt;span class=&quot;x&quot;&gt;]&lt;/span&gt; &lt;span class=&quot;o&quot;&gt;...&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;

&lt;p&gt;so that &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; would be traversed in memory-order.&lt;/p&gt;
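
&lt;p&gt;For a plain matrix, the same rule can be sketched as follows. The function name &lt;code class=&quot;highlighter-rouge&quot;&gt;sum_memory_order&lt;/code&gt; is hypothetical and Julia 1.x syntax is assumed; the point is only that the column index goes in the outer loop and the row index in the inner loop.&lt;/p&gt;

&lt;figure class=&quot;highlight&quot;&gt;&lt;pre&gt;&lt;code class=&quot;language-julia&quot; data-lang=&quot;julia&quot;&gt;function sum_memory_order(M)
    total = zero(eltype(M))
    for j in 1:size(M, 2)        # last dimension: outer loop
        for i in 1:size(M, 1)    # first dimension: inner loop, contiguous in memory
            total += M[i, j]
        end
    end
    return total
end

M = reshape(collect(1.0:6.0), 2, 3)
sum_memory_order(M) == sum(M)&lt;/code&gt;&lt;/pre&gt;&lt;/figure&gt;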

&lt;h1 id=&quot;summary&quot;&gt;Summary&lt;/h1&gt;

&lt;p&gt;As is hopefully clear by now, much of the pain of writing generic
multidimensional algorithms is eliminated by Julia’s elegant
iterators.  The examples here just scratch the surface, but the
underlying principles are very simple; it is hoped that these
examples will make it easier to write your own algorithms.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Julia IDE work in Atom</title>
   <link href="http://julialang.org/blog/2016/01/atom-work"/>
   <updated>2016-01-07T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2016/01/atom-work</id>
   <content type="html">&lt;div align=&quot;center&quot;&gt;&lt;img src=&quot;https://github.com/JunoLab/atom-ink/raw/readme/demos/full.gif&quot; /&gt;&lt;/div&gt;

&lt;blockquote&gt;
  &lt;p&gt;A PL designer used to be able to design some syntax and semantics for their language, implement a compiler, and then call it a day. –  Sean McDirmid&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In the few years since its &lt;a href=&quot;http://julialang.org/blog/2012/02/why-we-created-julia/&quot;&gt;initial release&lt;/a&gt;, the Julia language has made wonderful progress. Over &lt;a href=&quot;https://github.com/JuliaLang/julia/graphs/contributors&quot;&gt;four hundred contributors&lt;/a&gt; – and counting – have donated their time developing exciting and modern language features like &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/12042&quot;&gt;channels&lt;/a&gt; for concurrency, a &lt;a href=&quot;http://docs.julialang.org/en/latest/manual/documentation/&quot;&gt;native documentation system&lt;/a&gt;, &lt;a href=&quot;http://docs.julialang.org/en/latest/manual/metaprogramming/#generated-functions&quot;&gt;staged functions&lt;/a&gt;, &lt;a href=&quot;http://docs.julialang.org/en/release-0.4/manual/modules/#module-initialization-and-precompilation&quot;&gt;compiled packages&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/13410&quot;&gt;threading&lt;/a&gt;, and tons more. In the lead up to 1.0 we have a faster and more stable runtime, a more comprehensive standard library, and a more enthusiastic community than ever before.&lt;/p&gt;

&lt;p&gt;However, a programming language isn’t just a compiler or spec in a vacuum. More and more, the ecosystem around a language – the packages, tooling, and community that support you – is a huge determining factor in where a language can be used, and who it can be used by. Making Julia accessible to everybody means facing these issues head-on. In particular, we’ll be putting a lot of effort into building a comprehensive IDE, Juno, which supports users with features like smart autocompletion, plotting and data handling, interactive live coding and debugging, and more.&lt;/p&gt;

&lt;p&gt;Julia users aren’t just programmers – they’re engineers, scientists, data mungers, financiers, statisticians, researchers, and many other things, so it’s vital that our IDE is flexible and extensible enough to support all their different workflows fluidly. At the same time, we want to avoid reinventing well-oiled wheels, and don’t want to compromise on the robust and powerful core editing experience that people have come to expect. Luckily enough, we think we can have our cake and eat it too by building on top of the excellent &lt;a href=&quot;http://atom.io/&quot;&gt;Atom&lt;/a&gt; editor.&lt;/p&gt;

&lt;p&gt;The Atom community has done an amazing job of building an editor that’s powerful and flexible without sacrificing a pleasant and intuitive experience. Web technologies not only make hacking on the editor extremely accessible for new contributors, but also make it easy for us to experiment with exciting and modern features like live coding, making it a really promising option for our work.&lt;/p&gt;

&lt;p&gt;Our immediate priorities will be to get basic interactive usage working really well, including strong multimedia support for display and graphics. Before long we’ll have a comprehensive IDE bundle which includes Juno, Julia, and a bunch of useful packages for things like plotting – with the aim that anyone can get going productively with Julia within a few minutes. Once the basics are in place, we’ll integrate the documentation system and the up-and-coming debugger, implement performance linting, and make sure that there’s help and tutorials in place so that it’s easy for everyone to get started.&lt;/p&gt;

&lt;p&gt;Juno is implemented as a &lt;a href=&quot;https://github.com/JunoLab&quot;&gt;large collection&lt;/a&gt; of independent modules and plugins; although this adds some development overhead, we think it’s well worthwhile to make sure that other projects can benefit from our work. For example, our collection of IDE components for Atom, &lt;a href=&quot;https://github.com/JunoLab/atom-ink&quot;&gt;Ink&lt;/a&gt;, is completely language-agnostic and should be reusable by other languages.&lt;/p&gt;

&lt;p&gt;New contributions are always welcome, so if you’re interested in helping to push this exciting project forward, check out the &lt;a href=&quot;https://github.com/JunoLab/atom-julia-client/tree/master/docs&quot;&gt;developer install instructions&lt;/a&gt; and send us a PR!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JSoC 2015 project: DataStreams.jl</title>
   <link href="http://julialang.org/blog/2015/10/datastreams"/>
   <updated>2015-10-25T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/datastreams</id>
   <content type="html">&lt;p&gt;Data processing got ya down? Good news! The &lt;a href=&quot;https://github.com/JuliaDB/DataStreams.jl&quot;&gt;DataStreams.jl&lt;/a&gt; package, er, framework, has arrived!&lt;/p&gt;

&lt;p&gt;The DataStreams processing framework provides a consistent interface for working with data, from source to sink and eventually every step in-between. It’s really about putting forth an interface (specific types and methods) to go about ingesting and transferring data sources that hopefully makes for a consistent experience for users, no matter what kind of data they’re working with.&lt;/p&gt;

&lt;h5 id=&quot;how-does-it-work&quot;&gt;How does it work?&lt;/h5&gt;

&lt;p&gt;DataStreams is all about creating “sources” (Julia types that represent true data sources; e.g. csv files, database backends, etc.), “sinks” or data destinations, and defining the appropriate &lt;code class=&quot;highlighter-rouge&quot;&gt;Data.stream!(source, sink)&lt;/code&gt; methods to actually transfer data from source to sink. Let’s look at a quick example.&lt;/p&gt;

&lt;p&gt;Say I have a table of data in a CSV file on my local machine and need to do a little cleaning and aggregation on the data before building a model with the &lt;a href=&quot;https://github.com/JuliaStats/GLM.jl&quot;&gt;GLM.jl&lt;/a&gt; package. Let’s see some code in action:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using CSV, SQLite, DataStreams, DataFrames, GLM

# let's create a Julia type that understands our data file
csv_source = CSV.Source(&quot;datafile.csv&quot;)

# let's also create an SQLite destination for our data
# according to its structure
db = SQLite.DB() # create an in-memory SQLite database

# creates an SQLite table
sqlite_sink = SQLite.Sink(Data.schema(csv_source), db, &quot;mydata&quot;)

# parse the CSV data directly into our SQLite table
Data.stream!(csv_source, sqlite_sink)

# now I can do some data cleansing/aggregation
# ...various SQL statements on the &quot;mydata&quot; SQLite table...

# now I'm ready to get my data out and ready for model fitting
sqlite_source = SQLite.Source(sqlite_sink)

# stream our data into a Julia structure (Data.Table)
dt = Data.stream!(sqlite_source, Data.Table)

# convert to DataFrame (non-copying)
df = DataFrame(dt)

# do model-fitting
OLS = glm(Y~X,df,Normal(),IdentityLink())
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here we see it’s quite simple to create a &lt;code class=&quot;highlighter-rouge&quot;&gt;Source&lt;/code&gt; type by wrapping a true datasource (our CSV file), a destination for that data (an SQLite table), and to transfer the data. We can then turn our &lt;code class=&quot;highlighter-rouge&quot;&gt;SQLite.Sink&lt;/code&gt; into an &lt;code class=&quot;highlighter-rouge&quot;&gt;SQLite.Source&lt;/code&gt; for getting the data back out again.&lt;/p&gt;

&lt;h5 id=&quot;so-what-have-you-really-been-working-on&quot;&gt;So What Have You Really Been Working On?&lt;/h5&gt;

&lt;p&gt;Well, a lot actually. Even though the DataStreams framework is currently simple and minimalistic, it took a lot of back and forth on the design, including several discussions at this year’s JuliaCon at MIT. Even with a tidy little framework, however, the bulk of the work still lies in actually implementing the interface in various packages. The two that are ready for release today are &lt;a href=&quot;https://github.com/JuliaDB/CSV.jl&quot;&gt;CSV.jl&lt;/a&gt; and &lt;a href=&quot;https://github.com/JuliaDB/SQLite.jl&quot;&gt;SQLite.jl&lt;/a&gt;. They are currently available for Julia 0.4+ only.&lt;/p&gt;

&lt;p&gt;Quick rundown of each package:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;CSV: provides types and methods for working with CSV and other delimited files. Aims to be (and currently is) the fastest and most flexible CSV reader in Julia.&lt;/li&gt;
  &lt;li&gt;SQLite: an interface to the popular &lt;a href=&quot;http://sqlite.org/&quot;&gt;SQLite&lt;/a&gt; local-machine database. Provides methods for creating/managing database files, along with executing SQL statements and viewing the results of such.&lt;/li&gt;
&lt;/ul&gt;

&lt;h5 id=&quot;so-whats-next&quot;&gt;So What’s Next?&lt;/h5&gt;
&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaDB/ODBC.jl&quot;&gt;ODBC.jl&lt;/a&gt;: the next package to get the DataStreams makeover is ODBC. I’ve already started work on this, and it should hopefully be ready soon.&lt;/li&gt;
  &lt;li&gt;Other packages: I’m always on the hunt for new ways to spread the framework; if you’d be interested in implementing DataStreams for your own package or want to collaborate, just &lt;a href=&quot;https://github.com/quinnj&quot;&gt;ping&lt;/a&gt; me and I’m happy to discuss!&lt;/li&gt;
  &lt;li&gt;transforms: an important part of data processing tasks is not just connecting to and moving the data to somewhere else: often you need to clean/transform/aggregate the data in some way in-between. Right now, that’s up to users, but I have some ideas around creating DataStreams-friendly ways to easily incorporate transform steps as data is streamed from one place to another.&lt;/li&gt;
  &lt;li&gt;DataStreams for chaining pipelines + transforms: I’m also excited about the idea of creating entire &lt;code class=&quot;highlighter-rouge&quot;&gt;DataStreams&lt;/code&gt;, which would define entire data processing tasks end-to-end. Setting up a pipeline that could consistently move and process data gets even more powerful as we start looking into automatic-parallelism and extensibility.&lt;/li&gt;
  &lt;li&gt;DataStream scheduling/management: I’m also interested in developing capabilities around scheduling and managing DataStreams.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;em&gt;The work on DataStreams.jl was carried out as part of the Julia Summer of Code program, made possible thanks to the generous support of the &lt;a href=&quot;https://moore.org&quot;&gt;Gordon and Betty Moore Foundation&lt;/a&gt;, and MIT.&lt;/em&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JSoC 2015 project: Automatic Differentiation in Julia with ForwardDiff.jl</title>
   <link href="http://julialang.org/blog/2015/10/auto-diff-in-julia"/>
   <updated>2015-10-23T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/auto-diff-in-julia</id>
   <content type="html">&lt;script type=&quot;text/javascript&quot; src=&quot;https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;This summer, I’ve had the good fortune to be able to participate in the first ever &lt;strong&gt;Julia Summer of Code (JSoC)&lt;/strong&gt;, generously sponsored by the Gordon and Betty Moore Foundation. My JSoC project was to explore the use of Julia for automatic differentiation (AD), a topic with a wide array of applications in the field of optimization.&lt;/p&gt;

&lt;p&gt;Under the mentorship of Miles Lubin and Theodore Papamarkou, I completed a major overhaul of &lt;strong&gt;&lt;a href=&quot;https://github.com/JuliaDiff/ForwardDiff.jl&quot;&gt;ForwardDiff.jl&lt;/a&gt;&lt;/strong&gt;, a Julia package for calculating derivatives, gradients, Jacobians, Hessians, and higher-order derivatives of native Julia functions (or any callable Julia type, really).&lt;/p&gt;

&lt;p&gt;By the end of this post, you’ll hopefully know a little bit about how ForwardDiff.jl works, why it’s useful, and why Julia is uniquely well-suited for AD compared to other languages.&lt;/p&gt;

&lt;h1 id=&quot;what-is-automatic-differentiation&quot;&gt;What is Automatic Differentiation?&lt;/h1&gt;

&lt;p&gt;In broad terms, &lt;a href=&quot;https://en.wikipedia.org/wiki/Automatic_differentiation&quot;&gt;automatic differentiation&lt;/a&gt; describes a class of algorithms for automatically taking exact derivatives of user-provided functions. In addition to producing more accurate results, AD methods are also often faster than other common differentiation methods (such as &lt;a href=&quot;https://en.wikipedia.org/wiki/Numerical_differentiation&quot;&gt;finite differencing&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;The two main flavors of AD are called &lt;em&gt;forward mode&lt;/em&gt; and &lt;em&gt;reverse mode&lt;/em&gt;. As you might’ve guessed, this post only discusses forward mode, which is the kind of AD implemented by ForwardDiff.jl.&lt;/p&gt;

&lt;h1 id=&quot;seeing-forwarddiffjl-in-action&quot;&gt;Seeing ForwardDiff.jl In Action&lt;/h1&gt;

&lt;p&gt;Before we get down to the nitty-gritty details, it might be helpful to see a simple example that illustrates various methods from ForwardDiff.jl’s API.&lt;/p&gt;

&lt;p&gt;The snippet below is a somewhat contrived example, but works well enough as an introduction to the package. First, we define a target function we’d like to differentiate, then use ForwardDiff.jl to calculate some derivatives of the function at a given input:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using ForwardDiff

julia&amp;gt; f(x::Vector) = sum(sin, x) + prod(tan, x) * sum(sqrt, x);

julia&amp;gt; x = rand(5)
5-element Array{Float64,1}:
 0.986403
 0.140913
 0.294963
 0.837125
 0.650451

julia&amp;gt; g = ForwardDiff.gradient(f); # g = ∇f

julia&amp;gt; g(x)
5-element Array{Float64,1}:
 1.01358
 2.50014
 1.72574
 1.10139
 1.2445

julia&amp;gt; j = ForwardDiff.jacobian(g); # j = J(∇f)

julia&amp;gt; j(x)
5x5 Array{Float64,2}:
 0.585111  3.48083  1.7706    0.994057  1.03257
 3.48083   1.06079  5.79299   3.25245   3.37871
 1.7706    5.79299  0.423981  1.65416   1.71818
 0.994057  3.25245  1.65416   0.251396  0.964566
 1.03257   3.37871  1.71818   0.964566  0.140689

julia&amp;gt; ForwardDiff.hessian(f, x) # H(f)(x) == J(∇f)(x), as expected
5x5 Array{Float64,2}:
 0.585111  3.48083  1.7706    0.994057  1.03257
 3.48083   1.06079  5.79299   3.25245   3.37871
 1.7706    5.79299  0.423981  1.65416   1.71818
 0.994057  3.25245  1.65416   0.251396  0.964566
 1.03257   3.37871  1.71818   0.964566  0.140689
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Tada!&lt;/p&gt;

&lt;p&gt;Okay, that’s not &lt;em&gt;too&lt;/em&gt; exciting - I could’ve just done the same thing with Calculus.jl. Why would I ever want to use ForwardDiff.jl?&lt;/p&gt;

&lt;p&gt;The simple answer is that ForwardDiff.jl’s AD-based methods are, in many cases, much more performant than the finite differencing methods implemented in other packages.&lt;/p&gt;

&lt;h1 id=&quot;how-forwarddiffjl-works---an-overview&quot;&gt;How ForwardDiff.jl Works - An Overview&lt;/h1&gt;

&lt;p&gt;The key technique leveraged by ForwardDiff.jl is the implementation of several different &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt; types, each of which allocates storage space for both normal values and derivative values. Elementary numerical functions on a &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt; are then overloaded to evaluate both the original function and the function’s derivative, returning the results in the form of a new &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Thus, we can pass these number types into a general function &lt;script type=&quot;math/tex&quot;&gt;f&lt;/script&gt; (which is assumed to be composed of the overloaded elementary functions), and the derivative information is naturally propagated at each step of the calculation by way of the chain rule. The final result of the evaluation (usually a &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt; or an array of them) then contains both &lt;script type=&quot;math/tex&quot;&gt;f(x)&lt;/script&gt; and &lt;script type=&quot;math/tex&quot;&gt;f'(x)&lt;/script&gt;, where &lt;script type=&quot;math/tex&quot;&gt;x&lt;/script&gt; was the original point of evaluation.&lt;/p&gt;
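
&lt;p&gt;As a rough illustration of this propagation, suppose each intermediate quantity is carried as a pair &lt;script type=&quot;math/tex&quot;&gt;(v, d)&lt;/script&gt; of a value and a derivative. The pair notation is only an illustrative shorthand, not ForwardDiff.jl’s own notation. Two representative overloads then act as&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;\begin{gathered} \sin\big((v, d)\big) = (\sin v,\ d \cos v) \\ (v_1, d_1) \cdot (v_2, d_2) = (v_1 v_2,\ v_2 d_1 + v_1 d_2) \end{gathered}&lt;/script&gt;

&lt;p&gt;The first is the chain rule applied to &lt;script type=&quot;math/tex&quot;&gt;\sin&lt;/script&gt; and the second is the product rule. Seeding the input as &lt;script type=&quot;math/tex&quot;&gt;(x, 1)&lt;/script&gt; and applying rules of this kind at every step yields &lt;script type=&quot;math/tex&quot;&gt;(f(x), f'(x))&lt;/script&gt;.&lt;/p&gt;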

&lt;h1 id=&quot;simple-forward-mode-ad-in-julia&quot;&gt;Simple Forward Mode AD in Julia&lt;/h1&gt;

&lt;p&gt;The easiest way to write actual Julia code demonstrating this technique is to implement a simple &lt;a href=&quot;https://en.wikipedia.org/wiki/Dual_number&quot;&gt;dual number&lt;/a&gt; type. Note that there is already &lt;a href=&quot;https://github.com/JuliaDiff/DualNumbers.jl&quot;&gt;a Julia package&lt;/a&gt; dedicated to such an implementation, but we’re going to roll our own here for pedagogical purposes.&lt;/p&gt;

&lt;p&gt;Here’s how we’ll define our &lt;code class=&quot;highlighter-rouge&quot;&gt;DualNumber&lt;/code&gt; type:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;immutable DualNumber{T} &amp;lt;: Number
    value::T
    deriv::T
end

value(d::DualNumber) = d.value
deriv(d::DualNumber) = d.deriv
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Next, we can start defining functions on &lt;code class=&quot;highlighter-rouge&quot;&gt;DualNumber&lt;/code&gt;. Here are a few examples to give you a feel for the process:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function Base.sqrt(d::DualNumber)
    new_value = sqrt(value(d))
    new_deriv = 0.5 / new_value
    return DualNumber(new_value, new_deriv*deriv(d))
end

function Base.sin(d::DualNumber)
    new_value = sin(value(d))
    new_deriv = cos(value(d))
    return DualNumber(new_value, new_deriv*deriv(d))
end

function Base.(:+)(a::DualNumber, b::DualNumber)
    new_value = value(a) + value(b)
    new_deriv = deriv(a) + deriv(b)
    return DualNumber(new_value, new_deriv)
end

function Base.(:*)(a::DualNumber, b::DualNumber)
    val_a, val_b = value(a), value(b)
    new_value = val_a * val_b
    new_deriv = val_b * deriv(a) + val_a * deriv(b)
    return DualNumber(new_value, new_deriv)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We can now evaluate the derivative of &lt;em&gt;any scalar function composed of the above elementary functions&lt;/em&gt;. To do so, we simply pass an instance of our &lt;code class=&quot;highlighter-rouge&quot;&gt;DualNumber&lt;/code&gt; type into the function, and extract the derivative from the result. For example:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; f(x) = sqrt(sin(x * x)) + x
f (generic function with 1 method)

julia&amp;gt; f(1.0)
1.8414709848078965

julia&amp;gt; d = f(DualNumber(1.0, 1.0))
DualNumber{Float64}(1.8414709848078965,1.589002649374538)

julia&amp;gt; deriv1 = deriv(d)
1.589002649374538

julia&amp;gt; using Calculus; deriv2 = Calculus.derivative(f, 1.0)
1.5890026493377403

julia&amp;gt; deriv1 - deriv2
3.679767601738604e-11
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Notice that our dual number result comes &lt;em&gt;close&lt;/em&gt; to the result obtained from Calculus.jl, but is actually slightly different. That slight difference is due to the approximation error inherent to the finite differencing method employed by Calculus.jl.&lt;/p&gt;
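
&lt;p&gt;As a sanity check, the derivative in this example can also be worked out by hand with the chain rule:&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;f'(x) = \frac{x \cos(x^2)}{\sqrt{\sin(x^2)}} + 1, \qquad f'(1) = \frac{\cos 1}{\sqrt{\sin 1}} + 1 \approx 1.589003&lt;/script&gt;

&lt;p&gt;This agrees with &lt;code class=&quot;highlighter-rouge&quot;&gt;deriv1&lt;/code&gt; to the digits shown here.&lt;/p&gt;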

&lt;p&gt;In reality, the number types that ForwardDiff.jl provides are quite a bit more complicated than &lt;code class=&quot;highlighter-rouge&quot;&gt;DualNumber&lt;/code&gt;. Instead of simple dual numbers, the various &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt; types behave like &lt;em&gt;ensembles&lt;/em&gt; of dual numbers and &lt;a href=&quot;https://adl.stanford.edu/hyperdual/Fike_AIAA-2011-886.pdf&quot;&gt;hyper-dual numbers&lt;/a&gt; (the higher-order analog of dual numbers). This ensemble-based approach allows for simultaneous calculation of multiple higher-order partial derivatives in a single evaluation of the target function. For an in-depth examination of ForwardDiff.jl’s number type implementation, see &lt;a href=&quot;http://www.juliadiff.org/ForwardDiff.jl/types.html&quot;&gt;this section of the developer documentation&lt;/a&gt;.&lt;/p&gt;

&lt;h1 id=&quot;performance-comparison-the-ackley-function&quot;&gt;Performance Comparison: The Ackley Function&lt;/h1&gt;

&lt;p&gt;The best way to illustrate the performance gains that can be achieved using ForwardDiff.jl is to do some benchmarking. Let’s compare the time to calculate the gradient of a function using ForwardDiff.jl, Calculus.jl, and a Python-based AD tool, AlgoPy.&lt;/p&gt;

&lt;p&gt;The function we’ll be using in our test is the &lt;a href=&quot;http://www.sfu.ca/~ssurjano/ackley.html&quot;&gt;Ackley function&lt;/a&gt;, which is mathematically defined as&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;f(\vec{x}) = -a \exp\left( -b \sqrt{\frac{1}{k} \sum_{i=1}^k x^{2}_{i}} \right) - \exp\left(\frac{1}{k} \sum_{i=1}^k \cos(cx_{i})\right) + a + \exp(1)&lt;/script&gt;

&lt;p&gt;Here’s the definition of the function in Julia:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function ackley(x)
    a, b, c = 20.0, -0.2, 2.0*π
    len_recip = inv(length(x))
    sum_sqrs = zero(eltype(x))
    sum_cos = sum_sqrs
    for i in x
        sum_cos += cos(c*i)
        sum_sqrs += i^2
    end
    return (-a * exp(b * sqrt(len_recip*sum_sqrs)) -
            exp(len_recip*sum_cos) + a + e)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;…and here’s the corresponding Python definition:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import algopy
import numpy

def ackley(x):
    a, b, c = 20.0, -0.2, 2.0*numpy.pi
    len_recip = 1.0/len(x)
    sum_sqrs, sum_cos = 0.0, 0.0
    for i in x:
        sum_cos += algopy.cos(c*i)
        sum_sqrs += i*i
    return (-a * algopy.exp(b*algopy.sqrt(len_recip*sum_sqrs)) -
            algopy.exp(len_recip*sum_cos) + a + numpy.e)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h1 id=&quot;performance-comparison-the-results&quot;&gt;Performance Comparison: The Results&lt;/h1&gt;

&lt;p&gt;The benchmarks were performed with input vectors of length 16, 1600, and 16000, taking the best time out of 5 trials for each test. I ran them on a late 2013 MacBook Pro (OS X 10.9.5, 2.6 GHz Intel Core i5, 8 GB 1600 MHz DDR3) with the following versions of the relevant libraries: Julia v0.4.1-pre+15, Python v2.7.9, ForwardDiff.jl v0.1.2, Calculus.jl v0.1.13, and AlgoPy v0.5.1.&lt;/p&gt;

&lt;p&gt;Let’s start by looking at the evaluation times of &lt;code class=&quot;highlighter-rouge&quot;&gt;ackley(x)&lt;/code&gt; in both Python and Julia:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;length(x) &lt;/th&gt;
      &lt;th&gt;Python time (s) &lt;/th&gt;
      &lt;th&gt;Julia time (s) &lt;/th&gt;
      &lt;th&gt;Speed-Up vs. Python &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;0.00011&lt;/td&gt;
      &lt;td&gt;2.3e-6&lt;/td&gt;
      &lt;td&gt;47.83x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1600&lt;/td&gt;
      &lt;td&gt;0.00477&lt;/td&gt;
      &lt;td&gt;4.0269e-5&lt;/td&gt;
      &lt;td&gt;118.45x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;16000&lt;/td&gt;
      &lt;td&gt;0.04747&lt;/td&gt;
      &lt;td&gt;0.00037&lt;/td&gt;
      &lt;td&gt;128.30x&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;As you can see, there’s already a significant performance difference between the languages. We’ll have to keep that in mind when comparing our Julia differentiation tools with AlgoPy, in order to avoid confusing the languages’ performance characteristics with those of the libraries (though there is obviously a solid coupling between the two concepts).&lt;/p&gt;

&lt;p&gt;The table below shows the evaluation times of &lt;code class=&quot;highlighter-rouge&quot;&gt;∇ackley(x)&lt;/code&gt; using various libraries (the &lt;code class=&quot;highlighter-rouge&quot;&gt;chunk_size&lt;/code&gt; column denotes a configuration option passed to the &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiff.gradient&lt;/code&gt; method; see the &lt;a href=&quot;http://www.juliadiff.org/ForwardDiff.jl/chunk_vec_modes.html&quot;&gt;chunk-mode docs&lt;/a&gt; for details):&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;length(x) &lt;/th&gt;
      &lt;th&gt;AlgoPy time (s) &lt;/th&gt;
      &lt;th&gt;Calculus.jl time (s) &lt;/th&gt;
      &lt;th&gt;ForwardDiff time (s) &lt;/th&gt;
      &lt;th&gt;chunk_size &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;0.00212&lt;/td&gt;
      &lt;td&gt;2.2e-5&lt;/td&gt;
      &lt;td&gt;3.5891e-5&lt;/td&gt;
      &lt;td&gt;16&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1600&lt;/td&gt;
      &lt;td&gt;0.53439&lt;/td&gt;
      &lt;td&gt;0.10259&lt;/td&gt;
      &lt;td&gt;0.01304&lt;/td&gt;
      &lt;td&gt;10&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;16000&lt;/td&gt;
      &lt;td&gt;101.55801&lt;/td&gt;
      &lt;td&gt;11.18762&lt;/td&gt;
      &lt;td&gt;1.35411&lt;/td&gt;
      &lt;td&gt;10&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;From the above tables, we can calculate the speed-up ratio of ForwardDiff.jl over the other libraries:&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;length(x) &lt;/th&gt;
      &lt;th&gt;Speed-Up vs. AlgoPy &lt;/th&gt;
      &lt;th&gt;Speed-Up vs. Calculus.jl &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;59.07x&lt;/td&gt;
      &lt;td&gt;0.61x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1600&lt;/td&gt;
      &lt;td&gt;40.98x&lt;/td&gt;
      &lt;td&gt;7.86x&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;16000&lt;/td&gt;
      &lt;td&gt;74.99x&lt;/td&gt;
      &lt;td&gt;8.26x&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;As you can see, Python + AlgoPy falls pretty short of the speeds achieved by Julia + ForwardDiff.jl, or even Julia + Calculus.jl. While Calculus.jl is actually about 1.6 times as fast as ForwardDiff.jl for the lowest input dimension vector, it is ~8 times slower than ForwardDiff.jl for the higher input dimension vectors.&lt;/p&gt;

&lt;p&gt;Another metric that might be useful to look at is the “slowdown ratio” between the gradient evaluation time and the function evaluation time, defined as:&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;\text{slowdown ratio} = \frac{\text{gradient time}}{\text{function time}}&lt;/script&gt;

&lt;p&gt;Here are the results (lower is better):&lt;/p&gt;

&lt;table&gt;
  &lt;thead&gt;
    &lt;tr&gt;
      &lt;th&gt;length(x) &lt;/th&gt;
      &lt;th&gt;AlgoPy ratio &lt;/th&gt;
      &lt;th&gt;Calculus.jl ratio &lt;/th&gt;
      &lt;th&gt;ForwardDiff.jl ratio &lt;/th&gt;
    &lt;/tr&gt;
  &lt;/thead&gt;
  &lt;tbody&gt;
    &lt;tr&gt;
      &lt;td&gt;16&lt;/td&gt;
      &lt;td&gt;19.27&lt;/td&gt;
      &lt;td&gt;9.56&lt;/td&gt;
      &lt;td&gt;15.60&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;1600&lt;/td&gt;
      &lt;td&gt;112.03&lt;/td&gt;
      &lt;td&gt;2547.61&lt;/td&gt;
      &lt;td&gt;323.82&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;16000&lt;/td&gt;
      &lt;td&gt;2139.41&lt;/td&gt;
      &lt;td&gt;30236.81&lt;/td&gt;
      &lt;td&gt;3659.77&lt;/td&gt;
    &lt;/tr&gt;
  &lt;/tbody&gt;
&lt;/table&gt;

&lt;p&gt;Both AlgoPy and ForwardDiff.jl beat out Calculus.jl for evaluation at higher input dimensions, which isn’t too surprising. AlgoPy beating ForwardDiff.jl at those dimensions, though, might catch you off guard: ForwardDiff.jl had the fastest absolute runtimes, after all! One explanation for this outcome is that AlgoPy falls back to vectorized NumPy methods when calculating the gradient, while the &lt;code class=&quot;highlighter-rouge&quot;&gt;ackley&lt;/code&gt; function itself uses your usual, slow Python scalar arithmetic. Julia’s scalar arithmetic is &lt;em&gt;much&lt;/em&gt; faster than Python’s, so ForwardDiff.jl doesn’t have as much “room for improvement” as AlgoPy does.&lt;/p&gt;

&lt;h1 id=&quot;julias-ad-advantage&quot;&gt;Julia’s AD Advantage&lt;/h1&gt;

&lt;p&gt;At the beginning of this post, I promised I would give the reader an answer to the question: “Why is Julia uniquely well-suited for AD compared to other languages?”&lt;/p&gt;

&lt;p&gt;There are several good answers, but the chief reason for Julia’s superiority is its efficient implementation of &lt;em&gt;multiple dispatch&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;Unlike many other languages, Julia’s type-based operator overloading is fast and natural, as it’s one of the central design tenets of the language. Since Julia is JIT-compiled, the compiled code for a Julia function can be tied directly to the types with which the function is called. This allows the compiler to optimize every Julia method for the specific input types at runtime.&lt;/p&gt;

&lt;p&gt;This ability is phenomenally useful for implementing forward mode AD, which relies almost entirely on operator overloading in order to work. In most other scientific computing languages, operator overloading is either very slow (e.g. MATLAB), fraught with weird edge cases (e.g. Python), arduous to implement generally (e.g. C++) or some combination of all three. In addition, very few languages allow operator overloading to naturally extend to native, black-box, user-written code. Julia’s multiple dispatch is the secret weapon leveraged by ForwardDiff.jl to overcome these hurdles.&lt;/p&gt;
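
&lt;p&gt;To make this concrete, here is a minimal sketch of forward mode AD built on operator overloading. It assumes a recent Julia 1.x, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;Dual&lt;/code&gt; type and &lt;code class=&quot;highlighter-rouge&quot;&gt;derivative&lt;/code&gt; helper are made up for illustration; they are not ForwardDiff.jl’s actual types or API:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# A toy dual number: a value together with its derivative.
# Hypothetical type for illustration only.
struct Dual
    value::Float64
    deriv::Float64
end

# Overload a few operators and elementary functions for Dual,
# applying the sum, product and chain rules.
Base.:+(a::Dual, b::Dual) = Dual(a.value + b.value, a.deriv + b.deriv)
Base.:*(a::Dual, b::Dual) = Dual(a.value * b.value, a.deriv * b.value + a.value * b.deriv)
Base.sin(a::Dual) = Dual(sin(a.value), cos(a.value) * a.deriv)

# Generic user code, written with no knowledge of Dual.
f(x) = x * x + sin(x)

# Seed the derivative with 1 and read it back out of the result.
derivative(f, x) = f(Dual(x, 1.0)).deriv

derivative(f, 2.0)   # equals 2 * 2.0 + cos(2.0)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note that &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; is ordinary user code: dispatch selects the &lt;code class=&quot;highlighter-rouge&quot;&gt;Dual&lt;/code&gt; methods when it is called with a &lt;code class=&quot;highlighter-rouge&quot;&gt;Dual&lt;/code&gt; argument, and the compiler specializes &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; for that type.&lt;/p&gt;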

&lt;h1 id=&quot;future-directions&quot;&gt;Future Directions&lt;/h1&gt;

&lt;p&gt;The new version of ForwardDiff.jl has just been released, but development of the package is still ongoing! Here’s a list of things I’d like to see ForwardDiff.jl support in the future:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;More elementary function definitions on &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt; types&lt;/li&gt;
  &lt;li&gt;More optimized versions of existing elementary function definitions on &lt;code class=&quot;highlighter-rouge&quot;&gt;ForwardDiffNumber&lt;/code&gt; types&lt;/li&gt;
  &lt;li&gt;Methods for evaluating Jacobian-matrix products (highly useful in conjunction with reverse mode AD)&lt;/li&gt;
  &lt;li&gt;Parallel/shared-memory/distributed-memory versions of current API methods for handling problems with huge input/output dimensions&lt;/li&gt;
  &lt;li&gt;A more robust benchmarking suite for catching performance regressions&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you have any ideas on how to make ForwardDiff.jl more useful, feel free to open a pull request or issue in &lt;a href=&quot;https://github.com/JuliaDiff/ForwardDiff.jl&quot;&gt;the package’s GitHub repository&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JSoC 2015 project: Interactive Visualizations in Julia with GLVisualize.jl</title>
   <link href="http://julialang.org/blog/2015/10/glvisualize"/>
   <updated>2015-10-22T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/glvisualize</id>
   <content type="html">&lt;p&gt;GLVisualize is an interactive visualization library that supports 2D and 3D rendering as well as building of basic GUIs. It’s written entirely in Julia and OpenGL.
I’m really glad that I could continue working on this project with the support of Julia Summer of Code.&lt;/p&gt;

&lt;p&gt;During &lt;strong&gt;JSoC&lt;/strong&gt;, my main focus was on advancing &lt;a href=&quot;https://github.com/JuliaGL/GLVisualize.jl&quot;&gt;GLVisualize&lt;/a&gt;, but I also improved the surrounding infrastructure, such as &lt;a href=&quot;https://github.com/JuliaGeometry/GeometryTypes.jl&quot;&gt;GeometryTypes&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaIO/FileIO.jl&quot;&gt;FileIO&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaIO/ImageMagick.jl&quot;&gt;ImageMagick&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaIO/MeshIO.jl&quot;&gt;MeshIO&lt;/a&gt; and &lt;a href=&quot;https://github.com/SimonDanisch/FixedSizeArrays.jl&quot;&gt;FixedSizeArrays&lt;/a&gt;.
All recorded gifs in this blog post suffer from lossy compression. You can click on most of them to see the code that produced them.&lt;/p&gt;

&lt;p&gt;One of the most interesting parts of &lt;strong&gt;GLVisualize&lt;/strong&gt; is that it combines GUIs and visualizations, instead of relying on a third-party library like &lt;strong&gt;Qt&lt;/strong&gt; for GUIs.
This has both advantages and disadvantages.
The main advantage is that interactive visualizations share a lot of infrastructure with GUI libraries.
By combining these two, new features are possible, e.g. text editing of labels in 3D space, or making elements of a visualization work like a button. These features should end up being pretty snappy, since &lt;strong&gt;GLVisualize&lt;/strong&gt; was created with &lt;a href=&quot;http://randomfantasies.com/2015/05/glvisualize-benchmark/&quot;&gt;high performance&lt;/a&gt; in mind.&lt;/p&gt;

&lt;p&gt;The biggest downside is that it is really hard to reach the maturity and feature completeness of a library like &lt;strong&gt;Qt&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;So, to really get the best of both worlds, a lot of work is needed.&lt;/p&gt;

&lt;h2 id=&quot;current-status-of-glvisualize-and-what-ive-been-doing-during-jsoc&quot;&gt;Current status of GLVisualize, and what I’ve been doing during &lt;strong&gt;JSoC&lt;/strong&gt;&lt;/h2&gt;

&lt;p&gt;A surprisingly large amount of time went into improving &lt;strong&gt;FileIO&lt;/strong&gt; together with &lt;a href=&quot;https://github.com/timholy&quot;&gt;Tim Holy&lt;/a&gt;.
The selling point of &lt;strong&gt;FileIO&lt;/strong&gt; is that one can just hand a file to &lt;strong&gt;FileIO&lt;/strong&gt; and it will recognize the format and load the respective IO library.
This makes it a lot easier to start working with files in Julia, since no prior knowledge about formats and loading files in Julia is needed.
This is perfect for a visualization library, since most visualizations start from data that comes in some format, which might even be unknown initially.&lt;/p&gt;
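
&lt;p&gt;The idea of recognizing a format and then delegating to the right loader can be sketched roughly as follows. This is a toy illustration assuming Julia 1.6 or later; the registry and every name in it (&lt;code class=&quot;highlighter-rouge&quot;&gt;LOADERS&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;load_any&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;load_png&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;load_gif&lt;/code&gt;) are hypothetical and not FileIO’s actual code:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Toy registry mapping leading "magic" bytes to loader functions.
# All names here are hypothetical; this is not the FileIO implementation.
const PNG_MAGIC = UInt8[0x89, 0x50, 0x4e, 0x47]   # "\x89PNG"
const GIF_MAGIC = UInt8[0x47, 0x49, 0x46, 0x38]   # "GIF8"

# Stand-ins for the format-specific IO libraries.
load_png(bytes) = (:png, length(bytes))
load_gif(bytes) = (:gif, length(bytes))

const LOADERS = [(PNG_MAGIC, load_png), (GIF_MAGIC, load_gif)]

# One entry point for every format: sniff the leading bytes, then delegate.
function load_any(bytes::Vector{UInt8})
    for (magic, loader) in LOADERS
        if first(bytes, length(magic)) == magic
            return loader(bytes)
        end
    end
    error("unrecognized file format")
end

load_any(UInt8[0x47, 0x49, 0x46, 0x38, 0x39, 0x61])   # (:gif, 6)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;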

&lt;p&gt;Since all files are loaded with the same function, it becomes much easier to implement functionality like drag and drop of any file supported by FileIO.
To give you an example, the implementation of the drag and drop feature in &lt;strong&gt;GLVisualize&lt;/strong&gt; only needs a &lt;a href=&quot;https://gist.github.com/SimonDanisch/e0a8a2cbc3106ce6c123#file-dragndrop-jl&quot;&gt;few lines of code&lt;/a&gt; thanks to FileIO:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://gist.github.com/SimonDanisch/e0a8a2cbc3106ce6c123#file-dragndrop-jl&quot;&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/dragndrop2.gif?raw=true&quot; alt=&quot;drag and drop&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Another feature I’ve been working on is better 2D support.
I’ve implemented different anti-aliased markers, text rendering and line types.
Apart from the image markers, they all use the &lt;a href=&quot;http://www.valvesoftware.com/publications/2007/SIGGRAPH2007_AlphaTestedMagnification.pdf&quot;&gt;distance field technique&lt;/a&gt; to achieve view-independent anti-aliasing.
Here are a few examples:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/lines.png?raw=true&quot; alt=&quot;lines&quot; /&gt;
&lt;a href=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/marker.jl&quot;&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/markers.gif?raw=true&quot; alt=&quot;markers&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In the last example all the markers move together.
This is actually one of the core features of &lt;strong&gt;GLVisualize&lt;/strong&gt;. The markers share the same memory for the positions on the GPU without any overhead. Each marker then just has a different offset to that shared position.
This is easily achieved in &lt;strong&gt;GLVisualize&lt;/strong&gt;, since all visualization methods are defined on the GPU objects.
This also works for GPU objects which come from some simulation calculated on the GPU.&lt;/p&gt;

&lt;p&gt;During &lt;strong&gt;JSoC&lt;/strong&gt;, I also implemented sliders and line editing widgets for GLVisualize.
One can use them to add interactivity to parameters of a visualization:&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/color_volume.jl&quot;&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/volume_color.gif?raw=true&quot; alt=&quot;line_edit&quot; /&gt;&lt;/a&gt;
&lt;a href=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/arbitrary_surf.jl&quot;&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/arbitrary_surf.gif?raw=true&quot; alt=&quot;arbitrary_surf&quot; /&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I have also worked with &lt;a href=&quot;https://github.com/dpsanders&quot;&gt;David P. Sanders&lt;/a&gt; to visualize his &lt;a href=&quot;https://github.com/dpsanders/BilliardModels.jl&quot;&gt;billiard model&lt;/a&gt;, which demonstrates the particle system and a new camera type.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/billard.jl&quot;&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/billiard.gif?raw=true&quot; alt=&quot;billiard&quot; /&gt;&lt;/a&gt;
The particle system can use any mesh primitive. To make it easy to load and create meshes, &lt;a href=&quot;https://github.com/sjkelly&quot;&gt;Steve Kelly&lt;/a&gt; and I rewrote the &lt;a href=&quot;https://github.com/JuliaGeometry/Meshes.jl&quot;&gt;Meshes&lt;/a&gt; package to include more features and have a better separation of mesh IO and manipulation. The IO is now in &lt;strong&gt;MeshIO&lt;/strong&gt;, which supports the &lt;strong&gt;FileIO&lt;/strong&gt; interface. The mesh types are in &lt;strong&gt;GeometryTypes&lt;/strong&gt; and meshing algorithms are in different packages in the &lt;a href=&quot;https://github.com/JuliaGeometry&quot;&gt;JuliaGeometry&lt;/a&gt; org.&lt;/p&gt;

&lt;p&gt;In this example one can see that there are also some GUI widgets to interact with the camera.
The small rectangles in the corner are for switching between orthographic and perspective projection. The cube can be used to center the camera on a particular side.
These kinds of widgets are easy to implement in &lt;strong&gt;GLVisualize&lt;/strong&gt;, as it was built for GUIs and interactivity from the beginning.
Better camera controls are a big usability win, and I will put more time into improving these even further.&lt;/p&gt;

&lt;p&gt;I recorded one last demo to give you some more ideas of what &lt;strong&gt;GLVisualize&lt;/strong&gt; is currently capable of:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/SimonDanisch/Blog/blob/master/10-22-15-jsoc/interactivity.gif?raw=true&quot; alt=&quot;interactivity&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The demo shows different kinds of animations, 3D text editing and pop-ups that are all relatively easy to include in any visualization created with &lt;strong&gt;GLVisualize&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;All of this looks promising, but there is still a lot of work needed!
First of all, there is still no tagged version of &lt;strong&gt;GLVisualize&lt;/strong&gt; that will just install via Julia’s package manager.
This is because &lt;a href=&quot;https://github.com/JuliaLang/Reactive.jl&quot;&gt;Reactive.jl&lt;/a&gt; and &lt;a href=&quot;https://github.com/timholy/Images.jl&quot;&gt;Images.jl&lt;/a&gt; are currently not tagged on a version that works with &lt;strong&gt;GLVisualize&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;On the other hand, the API is not fully thought out yet.
It is planned to use more ideas from &lt;a href=&quot;https://github.com/shashi/Escher.jl&quot;&gt;Escher.jl&lt;/a&gt; and &lt;a href=&quot;https://github.com/dcjones/Compose.jl&quot;&gt;Compose.jl&lt;/a&gt; to improve the API.
The goal is to fully support the Compose interface at some point.
That way, &lt;strong&gt;GLVisualize&lt;/strong&gt; can be used as a backend for &lt;a href=&quot;https://github.com/dcjones/Gadfly.jl&quot;&gt;Gadfly&lt;/a&gt;. This will make &lt;strong&gt;Gadfly&lt;/strong&gt; much better suited to large, animated data sets.
In the next weeks, I will need to work on tutorials, documentation and better handling of edge cases.&lt;/p&gt;

&lt;p&gt;Big thanks go to the Julia team and everyone involved in making this possible!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JSoC 2015 project: Efficient data structures and algorithms for sequence analysis in BioJulia</title>
   <link href="http://julialang.org/blog/2015/10/biojulia-sequence-analysis"/>
   <updated>2015-10-21T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/biojulia-sequence-analysis</id>
   <content type="html">&lt;ul&gt;
  &lt;li&gt;Participant: Kenta Sato (&lt;a href=&quot;https://github.com/bicycle1885&quot;&gt;@bicycle1885&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;Mentor: Daniel C. Jones (&lt;a href=&quot;https://github.com/dcjones&quot;&gt;@dcjones&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thanks to a grant from the Gordon and Betty Moore Foundation, I enjoyed taking
part in the Julia Summer of Code 2015 program, administered by NumFOCUS, and a
trip to JuliaCon 2015 in Boston.  During this program, I created several
packages providing data structures and algorithms for sequence analysis, mainly
targeted at bioinformatics.  Even though Julia had lots of practical packages
for numerical computing on floating-point numbers, it lacked efficient and
compact data structures that are fundamental in bioinformatics.&lt;/p&gt;

&lt;p&gt;Recent developments in high-throughput DNA sequencers have made it possible to
sequence massive numbers of DNA fragments (known as reads) from biological
samples within a day.  The first step of sequence analysis is locating the
positions of these fragments in a long reference sequence; we can then detect
genetic variants or measure gene expression based on the result.  This step is
called sequence mapping or aligning, and because reference sequences are most
commonly genome-scale (about 3.2 billion base pairs for human), a full-text
search index is used to speed up this alignment process.  This kind of full-text
search index is implemented in many bioinformatics tools, most notably
&lt;a href=&quot;http://bowtie-bio.sourceforge.net/bowtie2/index.shtml&quot;&gt;bowtie2&lt;/a&gt; and
&lt;a href=&quot;http://bio-bwa.sourceforge.net/&quot;&gt;BWA&lt;/a&gt;, whose papers have been cited thousands of
times.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/blog/2015-10-03-sequence-analysis/mapping.png&quot; alt=&quot;Mapping&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The main focus of my project was creating a full-text search index in Julia
that is easy to use and efficient in practical applications.  Along the way,
I’ve created several packages that are useful as
building blocks for other data structures.  I’m going to introduce these
packages in this post.&lt;/p&gt;

&lt;h2 id=&quot;intarraysjl&quot;&gt;IntArrays.jl&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/bicycle1885/IntArrays.jl&quot;&gt;IntArrays.jl&lt;/a&gt; is a package for arrays of unsigned integers.
So, is it useful? Yes, it is! This is because the &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray&lt;/code&gt; type implemented in this package can store integers in as little space as possible.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray&lt;/code&gt; type has a type parameter &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt; that represents the number of bits required to encode elements in an array.
For example, if each element is an integer between 0 and 3, you only need to use two bits to encode it and &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt; can be set to 2 or greater.
These 2-bit integers are packed into a buffer and therefore the array consumes only one fourth of the space of a usual array of bytes.
The following shows how the byte sequence &lt;code class=&quot;highlighter-rouge&quot;&gt;[0x01, 0x03, 0x02, 0x00]&lt;/code&gt; is packed:&lt;/p&gt;

&lt;pre&gt;
    index:                           1          2          3          4
    byte sequence (hex):          0x01       0x03       0x02       0x00
    byte sequence (bin):    0b00000001 0b00000011 0b00000010 0b00000000
    packed sequence (w=2):          01         11         10         00
    in-memory layout:         00101101
&lt;/pre&gt;
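
&lt;p&gt;The layout above can be reproduced with a few lines of plain Julia. This is only a sketch of the packing idea, assuming a recent Julia 1.x; &lt;code class=&quot;highlighter-rouge&quot;&gt;pack2&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;unpack2&lt;/code&gt; are hypothetical helpers and not part of IntArrays.jl, which packs into a larger buffer:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Pack up to four 2-bit values into one byte, first element in the lowest bits.
# Multiplying by 4^(i - 1) is the same as shifting left by 2(i - 1) bits.
function pack2(values)
    byte = 0x00
    for (i, v) in enumerate(values)
        byte += UInt8(v) * 0x04^(i - 1)
    end
    return byte
end

# Read the i-th 2-bit value back out of a packed byte.
unpack2(byte, i) = (byte ÷ 0x04^(i - 1)) % 0x04

packed = pack2([0x01, 0x03, 0x02, 0x00])
bitstring(packed)    # "00101101"
unpack2(packed, 2)   # 0x03
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;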

&lt;p&gt;The full type definition is &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray{w,T,n}&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt; is the number of bits
for each element as I explained, &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt; is the type of elements, and &lt;code class=&quot;highlighter-rouge&quot;&gt;n&lt;/code&gt; is the
dimension of the array.  This type is a subtype of &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractArray{T,n}&lt;/code&gt; and
behaves like a familiar array: allocation, random access and update are
supported.  &lt;code class=&quot;highlighter-rouge&quot;&gt;IntVector&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;IntMatrix&lt;/code&gt; are also defined as type aliases like
&lt;code class=&quot;highlighter-rouge&quot;&gt;Vector&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Matrix&lt;/code&gt;, respectively.&lt;/p&gt;

&lt;p&gt;Here is an example:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; IntArray{2,UInt8}(2, 3)
2x3 IntArrays.IntArray{2,UInt8,2}:
 0x00  0x00  0x01
 0x00  0x00  0x03

julia&amp;gt; array = IntVector{2,UInt8}(6)
6-element IntArrays.IntArray{2,UInt8,1}:
 0x00
 0x00
 0x03
 0x03
 0x02
 0x00

julia&amp;gt; array[1] = 0x02
0x02

julia&amp;gt; array
6-element IntArrays.IntArray{2,UInt8,1}:
 0x02
 0x00
 0x03
 0x03
 0x02
 0x00

julia&amp;gt; sort!(array)
6-element IntArrays.IntArray{2,UInt8,1}:
 0x00
 0x00
 0x02
 0x02
 0x03
 0x03
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;And the memory footprint of &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray&lt;/code&gt; is much smaller:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; sizeof(IntVector{2,UInt8}(1_000_000))
250000

julia&amp;gt; sizeof(Vector{UInt8}(1_000_000))
1000000
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Since packing and unpacking integers in a buffer requires additional operations,
there is some overhead and &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray&lt;/code&gt; is often slower than &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt;.
I’ve tried to keep this discrepancy as small as possible, but sorting an &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray&lt;/code&gt; is
about 4-5 times slower:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; array = rand(0x00:0x03, 2^24);

julia&amp;gt; sort(array); @time sort(array);
  0.488779 seconds (8 allocations: 16.000 MB)

julia&amp;gt; iarray = IntVector{2}(array);

julia&amp;gt; sort(iarray); @time sort(iarray);
  2.290878 seconds (18 allocations: 4.001 MB)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If you have a great idea to improve the performance, please let me know!&lt;/p&gt;

&lt;h2 id=&quot;indexablebitvectorsjl&quot;&gt;IndexableBitVectors.jl&lt;/h2&gt;

&lt;p&gt;The next package is &lt;a href=&quot;https://github.com/BioJulia/IndexableBitVectors.jl&quot;&gt;IndexableBitVectors.jl&lt;/a&gt;.
You may be familiar with the &lt;code class=&quot;highlighter-rouge&quot;&gt;BitVector&lt;/code&gt; type in the standard library; the types defined in my package are static but indexable versions of it.
Here “indexable” means that a query for the number of 1 bits within an arbitrary range can be answered &lt;strong&gt;in constant time&lt;/strong&gt;.
If you are already familiar with &lt;a href=&quot;https://en.wikipedia.org/wiki/Succinct_data_structure&quot;&gt;succinct data structures&lt;/a&gt;, you may know this is an important building block of other succinct data structures like wavelet trees, LOUDS, and so on.&lt;/p&gt;

&lt;p&gt;The package exports two variants of such bit vectors: &lt;code class=&quot;highlighter-rouge&quot;&gt;SucVector&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;RRR&lt;/code&gt;.
&lt;code class=&quot;highlighter-rouge&quot;&gt;SucVector&lt;/code&gt; is simpler and faster than &lt;code class=&quot;highlighter-rouge&quot;&gt;RRR&lt;/code&gt;, but &lt;code class=&quot;highlighter-rouge&quot;&gt;RRR&lt;/code&gt; is compressible and will be smaller if 0/1 bits are localized in a bit vector.
Both types split a bit vector into blocks and cache the number of 1 bits up to the start of each block.
In &lt;code class=&quot;highlighter-rouge&quot;&gt;SucVector&lt;/code&gt;, the extra space is about 1/4 bit per bit, so it will be ~25% larger than the original bit vector.&lt;/p&gt;

&lt;p&gt;The most important query operation over these data structures would be the &lt;code class=&quot;highlighter-rouge&quot;&gt;rank1(bv, i)&lt;/code&gt; query, which counts the number of 1 bits within &lt;code class=&quot;highlighter-rouge&quot;&gt;bv[1:i]&lt;/code&gt;. Owing to the cached bit counts, we can finish the rank operation in constant time:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using IndexableBitVectors

julia&amp;gt; bv = bitrand(2^30);

julia&amp;gt; function myrank1(bv, i)  # count ones by loop
           r = 0
           for j in 1:i
               r += bv[j]
           end
           return r
       end
myrank1 (generic function with 1 method)

julia&amp;gt; myrank1(bv, 2^29); @time myrank1(bv, 2^29);
  0.714866 seconds (6 allocations: 192 bytes)

julia&amp;gt; sbv = SucVector(bv);

julia&amp;gt; rank1(sbv, 2^29); @time rank1(sbv, 2^29);  # much faster!
  0.000003 seconds (6 allocations: 192 bytes)

julia&amp;gt; rrr = RRR(bv);

julia&amp;gt; rank1(rrr, 2^29); @time rank1(rrr, 2^29);  # much faster, too!
  0.000004 seconds (6 allocations: 192 bytes)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;select1(bv, j)&lt;/code&gt; query, which locates the
&lt;code class=&quot;highlighter-rouge&quot;&gt;j&lt;/code&gt;-th 1 bit in the bit vector &lt;code class=&quot;highlighter-rouge&quot;&gt;bv&lt;/code&gt;, is also useful in many cases.  For example, if a set of positive
integers is represented as a bit vector, you can efficiently query the
&lt;code class=&quot;highlighter-rouge&quot;&gt;j&lt;/code&gt;-th smallest member of the set.&lt;/p&gt;
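
&lt;p&gt;To illustrate the semantics of the select query, here is a naive sketch in plain Julia (assuming a recent Julia 1.x). The helper names are hypothetical, the rank is computed by a linear scan, and the binary search is just one simple way to derive select from rank; it is not necessarily how the package implements it:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Naive rank: number of 1 bits in bv[1:i] (a linear scan, for illustration).
naive_rank1(bv, i) = count(view(bv, 1:i))

# Naive select: binary search for the smallest i with naive_rank1(bv, i) ≥ j.
# Assumes 1 ≤ j ≤ count(bv).
function naive_select1(bv, j)
    lo, hi = 1, length(bv)
    while lo != hi
        mid = (lo + hi) ÷ 2
        if naive_rank1(bv, mid) ≥ j
            hi = mid
        else
            lo = mid + 1
        end
    end
    return lo
end

# The set {3, 7, 8, 15} represented as a bit vector of length 16.
members = falses(16)
members[[3, 7, 8, 15]] .= true

naive_select1(members, 3)   # 8, the third smallest member
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;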

&lt;p&gt;Let’s see the internal representation of &lt;code class=&quot;highlighter-rouge&quot;&gt;SucVector&lt;/code&gt; to understand the magic.
A bit vector is separated into large blocks:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type SucVector &amp;lt;: AbstractIndexableBitVector
    blocks::Vector{Block}
    len::Int
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Each large block contains 256 bits and consists of four small blocks of
64 bits each.  A large block stores the &lt;em&gt;global&lt;/em&gt; count of 1s up to
its starting position, and a small block stores the &lt;em&gt;local&lt;/em&gt; count of 1s from
the beginning of its parent large block up to its own starting position.  The bits themselves are stored in
four bit chunks corresponding to the small blocks:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;immutable Block
    # large block
    large::UInt32
    # small blocks
    #   the first small block is used for 8-bit extension of the large block
    #   hence, 40 (= 32 + 8) bits are available in total
    smalls::NTuple{4,UInt8}
    # bit chunks (64bits × 4 = 256bits)
    chunks::NTuple{4,UInt64}
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;/images/blog/2015-10-03-sequence-analysis/sucvector.png&quot; alt=&quot;Block&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Since the bit count of the first small block is always zero, we can exploit
this space to extend the cache of the large block (red frame).  When running
the &lt;code class=&quot;highlighter-rouge&quot;&gt;rank1(bv, i)&lt;/code&gt; query, it first picks the large and small block pair that the
&lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt;-th bit belongs to and adds their cached bit counts, then counts the
remaining 1 bits in the corresponding chunk on the fly.&lt;/p&gt;
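
&lt;p&gt;The lookup described above can be sketched in plain Julia. This is a simplified illustration assuming a recent Julia 1.x: the names &lt;code class=&quot;highlighter-rouge&quot;&gt;RankIndex&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;rank1_sketch&lt;/code&gt; are hypothetical, the cached counts live in separate arrays rather than in packed &lt;code class=&quot;highlighter-rouge&quot;&gt;Block&lt;/code&gt; values, and the last step scans at most 64 bits of a &lt;code class=&quot;highlighter-rouge&quot;&gt;BitVector&lt;/code&gt; where an optimized version would apply a population count to a 64-bit chunk:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Two-level rank cache (hypothetical type, for illustration only).
struct RankIndex
    bits::BitVector
    large::Vector{Int}     # 1s before each 256-bit large block (global count)
    small::Vector{UInt8}   # 1s from the large block start to each 64-bit small block
end

function RankIndex(bits::BitVector)
    nsmall = cld(length(bits), 64)
    large = zeros(Int, cld(nsmall, 4))
    small = zeros(UInt8, nsmall)
    total = 0                      # 1s seen so far
    for s in 1:nsmall
        l = cld(s, 4)              # index of the parent large block
        if s % 4 == 1              # first small block of a large block
            large[l] = total
        end
        small[s] = total - large[l]   # at most 192, so it fits in a UInt8
        lo = 64 * (s - 1) + 1
        hi = min(64 * s, length(bits))
        total += count(view(bits, lo:hi))
    end
    return RankIndex(bits, large, small)
end

# Number of 1 bits in bits[1:i]: two cache lookups plus a scan of at most 64 bits.
function rank1_sketch(idx::RankIndex, i::Integer)
    if i == 0
        return 0
    end
    s = cld(i, 64)
    l = cld(s, 4)
    lo = 64 * (s - 1) + 1
    return idx.large[l] + idx.small[s] + count(view(idx.bits, lo:i))
end

bits = falses(1000)
bits[3:3:end] .= true        # every third bit is set
idx = RankIndex(bits)
rank1_sketch(idx, 300)       # 100
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because the work after the two lookups is bounded by the small block size, the query time does not grow with the length of the bit vector.&lt;/p&gt;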

&lt;p&gt;As I mentioned, this data structure can be used as a building block of various
data structures. The next package I’m going to introduce is one of them.&lt;/p&gt;

&lt;h2 id=&quot;waveletmatricesjl&quot;&gt;WaveletMatrices.jl&lt;/h2&gt;

&lt;p&gt;You may already know about the &lt;a href=&quot;https://en.wikipedia.org/wiki/Wavelet_Tree&quot;&gt;wavelet
tree&lt;/a&gt;, which supports the &lt;em&gt;rank&lt;/em&gt;
and &lt;em&gt;select&lt;/em&gt; queries like &lt;code class=&quot;highlighter-rouge&quot;&gt;SucVector&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;RRR&lt;/code&gt;, but elements are not
restricted to 0/1 bits.  In fact, the &lt;em&gt;rank&lt;/em&gt; and &lt;em&gt;select&lt;/em&gt; queries are available
on arbitrary unsigned integers. The wavelet tree can be thought of as a
generalization of indexable bit vectors in this respect.  What I’ve implemented
is not the well-known wavelet tree, but a variant of it called the “wavelet matrix”.
You can find an implementation and a link to a paper at
&lt;a href=&quot;https://github.com/BioJulia/WaveletMatrices.jl&quot;&gt;WaveletMatrices.jl&lt;/a&gt;.
According to the authors of the paper, the wavelet matrix is “simpler to build,
simpler to query, and faster in practice than the levelwise wavelet tree”.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;WaveletMatrix&lt;/code&gt; type takes three type parameters: &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt;.  &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt;
and &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt; are analogous to those of &lt;code class=&quot;highlighter-rouge&quot;&gt;IntArray{w,T,n}&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;B&lt;/code&gt; is a type of
indexable bit vector.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using WaveletMatrices

julia&amp;gt; wm = WaveletMatrix{2}([0x00, 0x01, 0x02, 0x03])
4-element WaveletMatrices.WaveletMatrix{2,UInt8,IndexableBitVectors.SucVector}:
 0x00
 0x01
 0x02
 0x03

julia&amp;gt; wm[3]
0x02

julia&amp;gt; rank(0x02, wm, 2)
0

julia&amp;gt; rank(0x02, wm, 3)
1

julia&amp;gt; xs = rand(0x00:0x03, 2^16);

julia&amp;gt; wm = WaveletMatrix{2}(xs);  # 2-bit encoding

julia&amp;gt; sum(xs[1:2^15] .== 0x03)
8171

julia&amp;gt; rank(0x03, wm, 2^15)
8171
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The details of the data structure and algorithms are relatively simple but
beyond the scope of this post. For people who are interested in this data
structure, the paper I mentioned above and my implementation would be helpful.
There are more operations that the wavelet matrix can run efficiently and those
operations will be added in the future.&lt;/p&gt;

&lt;h2 id=&quot;fmindexesjl&quot;&gt;FMIndexes.jl&lt;/h2&gt;

&lt;p&gt;80% of sequence analysis in bioinformatics is about sequence search, which
includes pattern search, homologous gene search, genome comparison, short-read
mapping, and so on.  The &lt;a href=&quot;https://en.wikipedia.org/wiki/FM-index&quot;&gt;FM-Index&lt;/a&gt; is
often regarded as one of the most efficient indices for full-text search, and I’ve
implemented it in the &lt;a href=&quot;https://github.com/BioJulia/FMIndexes.jl&quot;&gt;FMIndexes.jl&lt;/a&gt;
package.  Thanks to the packages I’ve introduced so far, its code looks
really simple.  For example, counting the number of occurrences of a given
pattern in a text can be written as follows (slightly simplified for explanatory
purposes):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function count(query, index::FMIndex)
    sp, ep = 1, length(index)
    # backward search
    i = length(query)
    while sp ≤ ep &amp;amp;&amp;amp; i ≥ 1
        char = convert(UInt8, query[i])
        c = index.count[char+1]
        sp = c + rank(char, index.bwt, sp - 1) + 1
        ep = c + rank(char, index.bwt, ep)
        i -= 1
    end
    return length(sp:ep)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A unique property of the FM-Index is that an index itself is just a permutation
of characters of an original text and counts of characters contained in it.
This permutation is called &lt;a href=&quot;https://en.wikipedia.org/wiki/Burrows%E2%80%93Wheeler_transform&quot;&gt;Burrows-Wheeler
transform&lt;/a&gt;
(also known as BWT), and the permuted text is stored in a wavelet matrix (or a
wavelet tree) in order to efficiently count the number of characters within a
specific region.  Therefore, the space required to index a text is often
smaller than that of other full-text indices (actually, in practice,
efficiently finding positions of a query needs auxiliary data as well).
Moreover, this transform is
&lt;a href=&quot;https://en.wikipedia.org/wiki/Bijection&quot;&gt;bijective&lt;/a&gt;, and thus the original
text can be restored from an index.&lt;/p&gt;
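
&lt;p&gt;To see how the transform and the backward search fit together, here is a naive, self-contained sketch in plain Julia (assuming a recent Julia 1.x and ASCII input). The names &lt;code class=&quot;highlighter-rouge&quot;&gt;naive_bwt&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;occ&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;naive_count&lt;/code&gt; are hypothetical and not part of FMIndexes.jl; the transform is built by sorting all rotations and the rank query is a linear scan, which is far too slow for real genomes but keeps the logic visible:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Naive Burrows-Wheeler transform: sort all rotations of the text
# (plus a sentinel that sorts first) and keep the last column.
function naive_bwt(text)
    s = text * '$'
    rotations = sort([s[i:end] * s[1:i-1] for i in 1:length(s)])
    return join(r[end] for r in rotations)
end

# Number of occurrences of c in bwt[1:i] (the rank query, done by scanning).
occ(c, bwt, i) = count(isequal(c), bwt[1:i])

# Backward search over the BWT, mirroring the count function shown earlier.
function naive_count(query, bwt)
    sp, ep = 1, length(bwt)
    for c in reverse(query)
        smaller = count(isless(x, c) for x in bwt)   # characters sorting before c
        sp = smaller + occ(c, bwt, sp - 1) + 1
        ep = smaller + occ(c, bwt, ep)
        sp ≤ ep || return 0                          # empty range: no match
    end
    return ep - sp + 1
end

bwt = naive_bwt("abracadabra")   # "ard$rcaaaabb"
naive_count("abra", bwt)         # 2
naive_count("a", bwt)            # 5
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;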

&lt;p&gt;Building an index for full-text search is ridiculously simple: just pass a
sequence to the constructor:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using FMIndexes

julia&amp;gt; fmindex = FMIndex(&quot;abracadabra&quot;);
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;FMIndex&lt;/code&gt; type supports two main queries: &lt;code class=&quot;highlighter-rouge&quot;&gt;count&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;locate&lt;/code&gt;.  The
&lt;code class=&quot;highlighter-rouge&quot;&gt;count(query, index)&lt;/code&gt; query counts the number of occurrences of the
&lt;code class=&quot;highlighter-rouge&quot;&gt;query&lt;/code&gt; string, and the &lt;code class=&quot;highlighter-rouge&quot;&gt;locate(query, index)&lt;/code&gt; query returns the starting positions of the
&lt;code class=&quot;highlighter-rouge&quot;&gt;query&lt;/code&gt;.  In order to restore the original text, you can use the &lt;code class=&quot;highlighter-rouge&quot;&gt;restore&lt;/code&gt;
function.  Here is a simple example:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; count(&quot;a&quot;, fmindex)
5

julia&amp;gt; count(&quot;abra&quot;, fmindex)
2

julia&amp;gt; locate(&quot;a&quot;, fmindex) |&amp;gt; collect
5-element Array{Any,1}:
 11
  8
  1
  4
  6

julia&amp;gt; locate(&quot;abra&quot;, fmindex) |&amp;gt; collect
2-element Array{Any,1}:
 8
 1

julia&amp;gt; bytestring(restore(fmindex))
&quot;abracadabra&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
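
&lt;p&gt;As a sanity check on what the two queries return, the results above can be reproduced with a brute-force scan over the text. This is only a sketch in plain Julia (Julia 1.x); &lt;code class=&quot;highlighter-rouge&quot;&gt;naive_locate&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;naive_count&lt;/code&gt; are hypothetical helper names and are not part of FMIndexes:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Brute-force reference: compare the query against every starting
# position (1-based) of the text. Time is O(length(text) * length(query)).
function naive_locate(query::AbstractString, text::AbstractString)
    q, t = collect(query), collect(text)
    m, n = length(q), length(t)
    return [i for i in 1:n-m+1 if t[i:i+m-1] == q]
end

# Counting is just the number of located positions.
naive_count(query, text) = length(naive_locate(query, text))

naive_locate(&quot;abra&quot;, &quot;abracadabra&quot;)  # [1, 8]
naive_count(&quot;a&quot;, &quot;abracadabra&quot;)      # 5
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The scan reports positions in ascending order, whereas the index returned the same positions in a different order in the session above. The scan also has to touch the whole text for every query, which is exactly the cost the index avoids.&lt;/p&gt;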

&lt;p&gt;As an example, for bioinformaticians, let’s try several queries on a
chromosome.  You also need to install the
&lt;a href=&quot;https://github.com/BioJulia/Bio.jl&quot;&gt;Bio.jl&lt;/a&gt; package to efficiently parse a
&lt;a href=&quot;https://en.wikipedia.org/wiki/FASTA_format&quot;&gt;FASTA&lt;/a&gt; file. The following script reads
a chromosome from a FASTA file, builds an FM-Index, and then serializes it into a
file for later use (I love Julia’s serializers; they are available for
free!):&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;index.jl&lt;/strong&gt;&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Bio.Seq
using IntArrays
using FMIndexes

# encode a DNA sequence with 3-bit unsigned integers;
# this is because a reference genome has five nucleotides: A/C/G/T/N.
function encode(seq)
    encoded = IntVector{3,UInt8}(length(seq))
    for i in 1:endof(seq)
        encoded[i] = convert(UInt8, seq[i])
    end
    return encoded
end

# read a chromosome from a FASTA file
filepath = ARGS[1]
record = first(open(filepath, FASTA))
println(record.name, &quot;: &quot;, length(record.seq), &quot;bp&quot;)
# build an FM-Index
fmindex = FMIndex(encode(record.seq))
# save it in a file
open(string(filepath, &quot;.index&quot;), &quot;w+&quot;) do io
    serialize(io, fmindex)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Next, create an index for human chromosome 22 (you can download it from
&lt;a href=&quot;http://hgdownload.cse.ucsc.edu/goldenPath/hg38/chromosomes/&quot;&gt;here&lt;/a&gt;):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;$ julia4 index.jl chr22.fa
chr22: 50818468bp
$ ls -lh chr22.fa.index
-rw-r--r--+ 1 kenta  staff    74M  9 26 06:30 chr22.fa.index
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;After construction finishes (this takes several minutes), load the index in
the REPL:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using FMIndexes

julia&amp;gt; fmindex = open(deserialize, &quot;chr22.fa.index&quot;);
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Now you can run queries to search for a DNA fragment:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using Bio.Seq

julia&amp;gt; count(dna&quot;GACTTTCAC&quot;, fmindex)  # this DNA fragment hits at 111 locations
111

julia&amp;gt; count(dna&quot;GACTTTCACTTT&quot;, fmindex)  # this hits at 3 locations
3

julia&amp;gt; locate(dna&quot;GACTTTCACTTT&quot;, fmindex) |&amp;gt; collect  # the loci of these hits
3-element Array{Any,1}:
 36253071
 47308573
 34159872

julia&amp;gt; count(dna&quot;GACTTTCACTTTCCC&quot;, fmindex)  # found a unique hit!
1

julia&amp;gt; locate(dna&quot;GACTTTCACTTTCCC&quot;, fmindex) |&amp;gt; collect
1-element Array{Any,1}:
 36253071

julia&amp;gt; @time locate(dna&quot;GACTTTCACTTTCCC&quot;, fmindex);  # this can be located in 32 μs!
  0.000032 seconds (5 allocations: 192 bytes)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This locus,
&lt;a href=&quot;https://genome.ucsc.edu/cgi-bin/hgTracks?db=hg38&amp;amp;position=chr22%3A36253071-36253120&amp;amp;hgsid=446220019_CeC0woSUOd5ov3GLph7a6fs5Uryo&quot;&gt;chr22:36253071&lt;/a&gt;,
is the starting position of the &lt;em&gt;APOL1&lt;/em&gt; gene.&lt;/p&gt;

&lt;h2 id=&quot;applications&quot;&gt;Applications&lt;/h2&gt;

&lt;p&gt;My aim in creating these packages was to prove that it is practical to
implement high-performance data structures for bioinformatics in Julia.  I’m
pretty sure that it is, but others may be skeptical.  So, I’m going
to prove it by writing useful and performant applications using these packages.
I’m now working on &lt;a href=&quot;https://github.com/bicycle1885/FMM.jl&quot;&gt;FMM.jl&lt;/a&gt;, which
aligns massive amounts of DNA fragments to a genome sequence using the FM-Index
and other algorithms.  This is still a work in progress, and there are likely many
bugs and unusual cases I still need to take care of, but its performance is not so bad
compared to other implementations.&lt;/p&gt;

&lt;p&gt;The &lt;a href=&quot;https://github.com/BioJulia&quot;&gt;BioJulia&lt;/a&gt; project is also under active
development.  The packages I made are intended to work with the
&lt;a href=&quot;https://github.com/BioJulia/Bio.jl&quot;&gt;Bio.jl&lt;/a&gt; package.  If you are interested in
the BioJulia project, we really welcome your contributions!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JSoC 2015 project: Interactive 3D Graphics in the Browser with Compose3D</title>
   <link href="http://julialang.org/blog/2015/10/compose3d-threejs"/>
   <updated>2015-10-20T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/compose3d-threejs</id>
   <content type="html">&lt;p&gt;Over the last three months, I’ve been working on &lt;a href=&quot;https://github.com/rohitvarkey/Compose3D.jl&quot;&gt;Compose3D&lt;/a&gt;,
which is an extension of the amazing &lt;a href=&quot;https://github.com/dcjones/Compose.jl&quot;&gt;Compose&lt;/a&gt; package to 3D. My work on
Compose3D began as a project for my Computer Graphics course along with &lt;a href=&quot;https://github.com/pranavtbhat&quot;&gt;Pranav T Bhat&lt;/a&gt;,
and by the end of the course, we had a working prototype for Compose3D with support for contexts and geometries and a
very basic WebGL backend.&lt;/p&gt;

&lt;p&gt;It has been my pleasure to have been able to continue this work under the guidance of &lt;a href=&quot;https://github.com/shashi&quot;&gt;Shashi Gowda&lt;/a&gt;
and &lt;a href=&quot;https://github.com/SimonDanisch&quot;&gt;Simon Danisch&lt;/a&gt; as a part of the first ever Julia Summer of Code, generously
sponsored by the &lt;a href=&quot;https://www.moore.org/&quot;&gt;Gordon and Betty Moore Foundation&lt;/a&gt;. While I’ve been able to add quite a lot of
functionality to Compose3D, it isn’t quite ready for release yet; hopefully it will be soon.
As a happy side effect, I have been able to abstract out the WebGL rendering functionality provided
by the original prototype (and a lot more!) to a separate package called
&lt;a href=&quot;https://github.com/rohitvarkey/ThreeJS.jl&quot;&gt;ThreeJS.jl&lt;/a&gt;,
which can now be used to render 3D graphics in browsers using Julia, opening up possibilities of displaying such
scenes in &lt;a href=&quot;https://github.com/JuliaLang/IJulia.jl&quot;&gt;IJulia&lt;/a&gt; notebooks and &lt;a href=&quot;https://github.com/shashi/Escher.jl&quot;&gt;Escher&lt;/a&gt;.&lt;/p&gt;

&lt;h2 id=&quot;threejsjl&quot;&gt;ThreeJS.jl&lt;/h2&gt;

&lt;p&gt;ThreeJS is now responsible for all the WebGL rendering done by Compose3D. It can also be used as a standalone package for
other graphics packages to use as a backend.&lt;/p&gt;

&lt;p&gt;Initially, my approach to rendering scenes in Compose3D was to just emit the corresponding JavaScript code into the
IJulia notebook, which would then run it! This worked pretty well in IJulia notebooks, but it soon became apparent that
there were several flaws with this approach.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It was hard to extend.&lt;/li&gt;
  &lt;li&gt;It did not play well with Escher.&lt;/li&gt;
  &lt;li&gt;It did not work with Interact to provide interactivity.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So Shashi suggested implementing a &lt;a href=&quot;https://www.polymer-project.org/1.0/&quot;&gt;Polymer&lt;/a&gt; wrapper around the excellent
&lt;a href=&quot;http://threejs.org/&quot;&gt;three.js&lt;/a&gt; library, to create threejs web components. The Polymer team had done some work on
creating threejs components and had a basic implementation of the same ready, which I promptly &lt;a href=&quot;https://github.com/rohitvarkey/three-js&quot;&gt;forked&lt;/a&gt;
and tweaked to add functionality I needed. It’s quite safe to say that I’ve spent more time writing JavaScript than
Julia during JSoC!&lt;/p&gt;

&lt;p&gt;Switching over to web components opened up two major avenues: Compose3D could now work with Escher and
could also provide interactivity. ThreeJS outputs &lt;a href=&quot;https://github.com/shashi/Patchwork.jl&quot;&gt;Patchwork&lt;/a&gt;
elements, which let it use Patchwork’s clever diffing capabilities, thereby updating only the required DOM elements and
helping performance.&lt;/p&gt;

&lt;p&gt;On the other hand, web components introduced issues with IJulia notebooks regarding serving the files required by
ThreeJS. I’m still working on finding a good solution for this problem, but for now, a hack gets ThreeJS working in
IJulia, albeit with some limitations.&lt;/p&gt;

&lt;h3 id=&quot;drawing-stuff&quot;&gt;Drawing stuff!&lt;/h3&gt;

&lt;p&gt;Anyway, now we were all set to draw 3D scenes in browsers! The code snippet below, for example, draws a red cube
illuminated from a corner. The camera in the scenes drawn by ThreeJS can be rotated, zoomed and panned using your mouse
or trackpad, allowing you to explore the scene.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import ThreeJS
ThreeJS.outerdiv() &amp;lt;&amp;lt; (ThreeJS.initscene() &amp;lt;&amp;lt;
    [
        ThreeJS.mesh(0.0, 0.0, 0.0) &amp;lt;&amp;lt;
        [
            ThreeJS.box(1.0,1.0,1.0),
            ThreeJS.material(Dict(:kind=&amp;gt;&quot;lambert&quot;,:color=&amp;gt;&quot;red&quot;))
        ],
        ThreeJS.pointlight(3.0, 3.0, 3.0),
        ThreeJS.camera(0.0, 0.0, 10.0)
    ])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;making-them-interactive&quot;&gt;Making them interactive&lt;/h3&gt;

&lt;p&gt;Currently, interactivity is broken in IJulia (a side effect of the switch to Polymer 1.0 and its new shady DOM),
so Escher is the way to go if you want to interact with your 3D scene. As an example, take the same scene as before
and add a slider that controls the size of the cube.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import ThreeJS
function main(window)
  push!(window.assets, &quot;widgets&quot;)
  push!(window.assets, (&quot;ThreeJS&quot;, &quot;threejs&quot;))
  side = Input(1.0)
  vbox(
    slider(1.0:5.0) &amp;gt;&amp;gt;&amp;gt; side,
    lift(side) do val
      ThreeJS.outerdiv() &amp;lt;&amp;lt; (ThreeJS.initscene() &amp;lt;&amp;lt;
      [
          ThreeJS.mesh(0.0, 0.0, 0.0) &amp;lt;&amp;lt;
          [
              ThreeJS.box(val, val, val),
              ThreeJS.material(Dict(:kind=&amp;gt;&quot;lambert&quot;,:color=&amp;gt;&quot;red&quot;))
          ],
          ThreeJS.pointlight(3.0, 3.0, 3.0),
          ThreeJS.camera(0.0, 0.0, 10.0)
      ])
    end
  )
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h3 id=&quot;you-can-also-do-animations&quot;&gt;You can also do animations!&lt;/h3&gt;

&lt;p&gt;Small-scale animations can also be created using Escher. Instead of using sliders to update the elements,
we update them at regular intervals using the &lt;code class=&quot;highlighter-rouge&quot;&gt;every&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;fpswhen&lt;/code&gt; functions. A scene with a
rotating cube can be drawn with just a couple of modifications to the code above.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;import ThreeJS
function main(window)
  push!(window.assets, &quot;widgets&quot;)
  push!(window.assets, (&quot;ThreeJS&quot;, &quot;threejs&quot;))
  rx = 0.0
  ry = 0.0
  rz = 0.0
  delta = fpswhen(window.alive, 60) #Update at 60 FPS
  lift(delta) do _
      rx += 0.5
      ry += 0.5
      rz += 0.5
      ThreeJS.outerdiv() &amp;lt;&amp;lt; (ThreeJS.initscene() &amp;lt;&amp;lt;
      [
          ThreeJS.mesh(0.0, 0.0, 0.0) &amp;lt;&amp;lt;
          [
              ThreeJS.box(2.0, 2.0, 2.0, rx = rx, ry = ry, rz = rz),
              ThreeJS.material(Dict(:kind=&amp;gt;&quot;lambert&quot;,:color=&amp;gt;&quot;red&quot;))
          ],
          ThreeJS.pointlight(3.0, 3.0, 3.0),
          ThreeJS.camera(0.0, 0.0, 10.0)
      ])
  end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;https://gist.github.com/rohitvarkey/1d65925850198bc284f5/raw/b7dc41f2b3f869c103dcbcb79632f92397767b01/rotating_cube.gif&quot; alt=&quot;Rotating Cube&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;surf-and-mesh-plots-sort-of&quot;&gt;Surf and mesh plots! (Sort of)&lt;/h3&gt;

&lt;p&gt;ThreeJS supports rendering parametric surfaces, which are basically the kind of surfaces drawn by
a typical &lt;code class=&quot;highlighter-rouge&quot;&gt;surf&lt;/code&gt; plot. It can also draw lines like a typical &lt;code class=&quot;highlighter-rouge&quot;&gt;mesh&lt;/code&gt; plot. Colormaps can
be applied to these surfaces by passing in an array of colors to be used; ThreeJS calculates and chooses
the colors to apply. The colors take effect when the surface is combined with a material whose &lt;code class=&quot;highlighter-rouge&quot;&gt;colorkind&lt;/code&gt;
property is set to &lt;code class=&quot;highlighter-rouge&quot;&gt;vertex&lt;/code&gt;. Screenshots of such surfaces drawn by ThreeJS are shown below.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://gist.github.com/rohitvarkey/1d65925850198bc284f5/raw/d1d8e389dd5baf5420cb24c1dfdf784bc61bf217/parametric.png&quot; alt=&quot;Parametric surface&quot; /&gt;
&lt;img src=&quot;https://gist.github.com/rohitvarkey/1d65925850198bc284f5/raw/d1d8e389dd5baf5420cb24c1dfdf784bc61bf217/meshlines.png&quot; alt=&quot;Mesh lines&quot; /&gt;&lt;/p&gt;

&lt;h2 id=&quot;compose3d&quot;&gt;Compose3D&lt;/h2&gt;

&lt;p&gt;Compose3D provides an abstraction over the rendering library and lets you compose primitives together to
build scenes, just like the library that inspired it, Compose. This lets you create very interesting
structures with very little code! Compose3D has features similar to Compose: users can create 3D contexts, use relative and absolute measures inside them, and compose other primitives together.&lt;/p&gt;

&lt;p&gt;My favorite example to showcase Compose3D would be the Sierpinski pyramid example. Here, we split the parent context
into the sections that we want and then just draw the pyramid in them! So the bottom half of the 3D space is split into 4,
and then, a pyramid is arranged on top of them.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Compose3D

function sierpinski(n)
    if n == 0
        compose(Context(0w,0h,0d,1w,1h,1d),pyramid(0w,0h,0d,1w,1h)) #The basic unit
    else
        t = sierpinski(n - 1)
        compose(Context(0w,0h,0d,1w,1h,1d),
        (Context(0w,0h,0d,(1/2)w,(1/2)h,(1/2)d), t),
        (Context(0w,0h,0.5d,(1/2)w,(1/2)h,(1/2)d), t),
        (Context(0.5w,0h,0.5d,(1/2)w,(1/2)h,(1/2)d), t),
        (Context(0.5w,0h,0d,(1/2)w,(1/2)h,(1/2)d), t),
        (Context(0.25w,0.5h,0.25d,(1/2)w,(1/2)h,(1/2)d), t)) #The top one
    end
end
compose(Context(-5mm,-5mm,-5mm,10mm,10mm,10mm),sierpinski(3))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;And voila! You have a Sierpinski pyramid of level 3 like in the figure below.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://gist.github.com/rohitvarkey/1d65925850198bc284f5/raw/d1d8e389dd5baf5420cb24c1dfdf784bc61bf217/sierpinski.png&quot; alt=&quot;Sierpinski&quot; /&gt;
The switch to ThreeJS gives Compose3D all the advantages that come with ThreeJS. This includes interactivity
and animations!&lt;/p&gt;

&lt;p&gt;For example, the same Sierpinski example can have some interactive elements, say a slider defining the
number of levels of recursion and maybe others controlling the colors of the pyramid. This can be done easily
in Escher, just as it was done with ThreeJS. After defining the &lt;code class=&quot;highlighter-rouge&quot;&gt;sierpinski&lt;/code&gt; function given above, just creating a slider
and hooking it up to the &lt;code class=&quot;highlighter-rouge&quot;&gt;sierpinski&lt;/code&gt; function will set this up!&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function main(window)
    push!(window.assets, (&quot;ThreeJS&quot;, &quot;threejs&quot;)) #Push the threejs static assets
    push!(window.assets, &quot;widgets&quot;)
    n = Input(0.0)

    vbox(
        slider(0.0:3.0) &amp;gt;&amp;gt;&amp;gt; n, #Set up the slider
        lift(n) do i
            #Draw the composed figure!
            draw(
                Patchable3D(100,100),
                compose(
                    Context(-5mm,-5mm,-5mm,10mm,10mm,10mm), sierpinski(i)
                )
            )
        end
    )
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;https://gist.github.com/rohitvarkey/1d65925850198bc284f5/raw/78fefb17032a0bd9861e8497133cb6ce3876a4d4/interactive_sierpinski.gif&quot; alt=&quot;Interactive Sierpinski&quot; /&gt;&lt;/p&gt;

&lt;p&gt;As an example of animations, I ported the Escher boids example by Ian Dunning from 2D to 3D; a screencast of it can be found below.&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;https://www.youtube.com/embed/Yul3iBkAVHs&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;

&lt;h2 id=&quot;future-directions&quot;&gt;Future directions&lt;/h2&gt;

&lt;ul&gt;
  &lt;li&gt;Add Compose3D counterparts for the several new ThreeJS primitives that don’t yet have them.&lt;/li&gt;
  &lt;li&gt;Add support for text in ThreeJS, allowing the use of labels in plots.&lt;/li&gt;
  &lt;li&gt;Provide &lt;code class=&quot;highlighter-rouge&quot;&gt;surf&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;mesh&lt;/code&gt; functions that automatically draw scaled surface plots in browsers, and build a WebGL-based
plotting library around ThreeJS.&lt;/li&gt;
  &lt;li&gt;Actually get Compose3D ready for public use!&lt;/li&gt;
&lt;/ul&gt;
</content>
 </entry>
 
 <entry>
   <title>JSoC 2015 project: NullableArrays.jl</title>
   <link href="http://julialang.org/blog/2015/10/nullablearrays"/>
   <updated>2015-10-16T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/nullablearrays</id>
   <content type="html">&lt;p&gt;My project under the 2015 &lt;a href=&quot;http://julialang.org/jsoc&quot;&gt;Julia Summer of Code&lt;/a&gt; program has been to develop the &lt;a href=&quot;https://github.com/JuliaStats/NullableArrays.jl&quot;&gt;NullableArrays&lt;/a&gt; package, which provides the &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; data type and its respective interface. I first encountered Julia earlier this year as a suggestion for which language I ought to learn as a matriculating PhD student in statistics. This summer has been an incredible opportunity for me both to develop as a young programmer and to contribute to an open-source community as full of possibility as Julia’s. I’d be remiss not to thank &lt;a href=&quot;http://www-math.mit.edu/~edelman/&quot;&gt;Alan Edelman&lt;/a&gt;’s group at MIT, &lt;a href=&quot;http://numfocus.org/&quot;&gt;NumFocus&lt;/a&gt;, and the &lt;a href=&quot;https://www.moore.org/&quot;&gt;Gordon &amp;amp; Betty Moore Foundation&lt;/a&gt; for their financial support, &lt;a href=&quot;https://github.com/johnmyleswhite/&quot;&gt;John Myles White&lt;/a&gt; for his mentorship and guidance, and many others of the Julia community who have helped to contribute both to the package and to my edification as a programmer over the summer. Much of my work on this project was conducted at the &lt;a href=&quot;https://www.recurse.com&quot;&gt;Recurse Center&lt;/a&gt;, where I received the support of an amazing community of self-directed learners.&lt;/p&gt;

&lt;h2 id=&quot;the-nullablearray-data-structure&quot;&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; data structure&lt;/h2&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s are array structures that efficiently represent missing values without incurring the performance difficulties that face &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; objects, which have heretofore been used to store data that include missing values. The core issue responsible for &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt;s’ performance woes concerns the way in which they represent missing values, i.e. through a token &lt;code class=&quot;highlighter-rouge&quot;&gt;NA&lt;/code&gt; object of token type &lt;code class=&quot;highlighter-rouge&quot;&gt;NAType&lt;/code&gt;. In particular, indexing into, say, a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray{Int}&lt;/code&gt; can return an object either of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt; or of type &lt;code class=&quot;highlighter-rouge&quot;&gt;NAType&lt;/code&gt;. This design does not provide sufficient information to Julia’s type inference system at JIT-compilation time to support the sort of static analysis that Julia’s compiler can otherwise leverage to emit efficient machine code. We can illustrate as much through the following example, in which we calculate the sum of five million random &lt;code class=&quot;highlighter-rouge&quot;&gt;Float64&lt;/code&gt;s stored in a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using DataArrays
# warnings suppressed…

julia&amp;gt; A = rand(5_000_000);

julia&amp;gt; D = DataArray(A);

julia&amp;gt; function f(D::AbstractArray)
           x = 0.0
           for i in eachindex(D)
               x += D[i]
           end
           x
       end
f (generic function with 1 method)

julia&amp;gt; f(D);

julia&amp;gt; @time f(D)
  0.163567 seconds (10.00 M allocations: 152.598 MB, 9.21% gc time)
2.500102419334644e6
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Looping through and summing the elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;D&lt;/code&gt; is over twenty times slower and allocates far more memory than running the same loop over &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; f(A);

julia&amp;gt; @time f(A)
  0.007465 seconds (5 allocations: 176 bytes)
2.500102419334644e6
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This is because the code generated for &lt;code class=&quot;highlighter-rouge&quot;&gt;f(D)&lt;/code&gt; must assume that &lt;code class=&quot;highlighter-rouge&quot;&gt;getindex(D, i)&lt;/code&gt; for an arbitrary index &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt; may return an object either of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Float64&lt;/code&gt; or of type &lt;code class=&quot;highlighter-rouge&quot;&gt;NAType&lt;/code&gt; and hence must “box” every object returned from indexing into &lt;code class=&quot;highlighter-rouge&quot;&gt;D&lt;/code&gt;. The performance penalty incurred by this requirement is reflected in the comparison above. (The interested reader can find more about these issues &lt;a href=&quot;http://www.johnmyleswhite.com/notebook/2014/11/29/whats-wrong-with-statistics-in-julia/&quot;&gt;here&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;On the other hand, &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s are designed to support the sort of static analysis used by Julia’s type inference system to generate efficient machine code. The crux of the strategy is to use a single type, &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{T}&lt;/code&gt;, to represent both missing and present values. &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{T}&lt;/code&gt; objects are specialized containers that hold precisely either one or zero values. A &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; that wraps, say, &lt;code class=&quot;highlighter-rouge&quot;&gt;5&lt;/code&gt; can be taken to represent a present value of &lt;code class=&quot;highlighter-rouge&quot;&gt;5&lt;/code&gt;, whereas an empty &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{Int}&lt;/code&gt; can represent a missing value that, if it had been present, would have been of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt;. Crucially, both such objects are of the same type, i.e. &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable{Int}&lt;/code&gt;. Interested readers can hear a bit more on these design considerations in my &lt;a href=&quot;https://www.youtube.com/watch?v=2v5k28F80BQ&quot;&gt;JuliaCon 2015 lightning talk&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Here is the result of running the same loop over a comparable &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; using NullableArrays

julia&amp;gt; X = NullableArray(A);

julia&amp;gt; function f(X::NullableArray)
           x = Nullable(0.0)
           for i in eachindex(X)
               x += X[i]
           end
           x
       end
f (generic function with 1 method)

julia&amp;gt; f(X);

julia&amp;gt; @time f(X)
  0.009812 seconds (5 allocations: 192 bytes)
Nullable(2.500102419334644e6)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As can be seen, naively looping over a &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; is on the same order of magnitude as naively looping over a regular &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; in terms of both time elapsed and memory allocated. Below is a set of plots (drawn with &lt;a href=&quot;https://github.com/dcjones/Gadfly.jl&quot;&gt;Gadfly.jl&lt;/a&gt;) that visualize the results of running 20 benchmark samples of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; over both &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; arguments each consisting of 5,000,000 random &lt;code class=&quot;highlighter-rouge&quot;&gt;Float64&lt;/code&gt; values and containing either zero null entries or approximately half randomly chosen null entries.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/2015-10-03-nullablearrays-images/f_plot.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Of course, it is possible to bring the performance of such a loop over a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; up to par with that of a loop over an &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt;. But such optimizations generally introduce additional complexity that oughtn’t to be required to achieve acceptable performance in such a simple task. Considerably more complex code can be required to achieve performance in more involved implementations, such as that of &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast!&lt;/code&gt;. We intend for &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s to perform well under involved tasks involving missing data while requiring as little interaction with &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; internals as possible. This includes allowing users to leverage extant implementations without sacrificing performance. Consider for instance the results of relying on Base’s implementation of &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast!&lt;/code&gt; for &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; arguments (i.e., having omitted the respective &lt;code class=&quot;highlighter-rouge&quot;&gt;src/broadcast.jl&lt;/code&gt; from each package’s source code).
Below are plots that visualize the results of running 20 benchmark samples of &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast!(dest, src1, src2)&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;dest&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;src2&lt;/code&gt; are &lt;code class=&quot;highlighter-rouge&quot;&gt;5_000_000 x 2&lt;/code&gt; &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt;s, &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s or &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt;s, and &lt;code class=&quot;highlighter-rouge&quot;&gt;src1&lt;/code&gt; is a &lt;code class=&quot;highlighter-rouge&quot;&gt;5_000_000 x 1&lt;/code&gt; &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt;. As above, the &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; arguments are tested in cases with either zero or approximately half null entries:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;/images/2015-10-03-nullablearrays-images/bcast_plot.png&quot; alt=&quot;&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We have designed the &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; type to feel as much like a regular &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; as possible. However, that &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt;s return &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; objects is a significant departure from both &lt;code class=&quot;highlighter-rouge&quot;&gt;Array&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;DataArray&lt;/code&gt; behavior. Arguably the most important issue is to support user-defined functions that lack methods for &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; arguments as they interact with &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; objects. Throughout my project I have also worked to develop interfaces that make dealing with &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; objects user-friendly and safe.&lt;/p&gt;

&lt;p&gt;Given a method &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; defined on an argument signature of types &lt;code class=&quot;highlighter-rouge&quot;&gt;(U1, U2, …, UN)&lt;/code&gt;, we would like to provide an accessible, safe and performant way for a user to call &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; on arguments of signature &lt;code class=&quot;highlighter-rouge&quot;&gt;(Nullable{U1}, Nullable{U2}, …, Nullable{UN})&lt;/code&gt; without having to extend &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; herself. Doing so should return &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable(f(get(u1), get(u2), …, get(un)))&lt;/code&gt; if each argument is non-null, and should return an empty &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; if any argument is null. Systematically extending an arbitrary method &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; over &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; argument signatures is often referred to as “lifting” &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; over the &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; arguments.&lt;/p&gt;
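
&lt;p&gt;The lifting rule described above can be sketched in a few lines. The code below is an illustration only, written for Julia 1.x, and is not the NullableArrays implementation: &lt;code class=&quot;highlighter-rouge&quot;&gt;Maybe&lt;/code&gt; is a hypothetical stand-in for &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt;, the helpers &lt;code class=&quot;highlighter-rouge&quot;&gt;isnull&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;unwrap&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;lift&lt;/code&gt; are defined here rather than taken from any package, and the caller is assumed to supply the result type explicitly:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical stand-in for Nullable{T}: holds either one value or none.
struct Maybe{T}
    hasvalue::Bool
    value::T
    Maybe{T}() where {T} = new{T}(false)     # empty: `value` is left unset
    Maybe{T}(x) where {T} = new{T}(true, x)  # present value
end
Maybe(x::T) where {T} = Maybe{T}(x)

isnull(m::Maybe) = !m.hasvalue
unwrap(m::Maybe) = m.value

# Lift `f` over Maybe arguments. `R` is the result type the caller expects,
# so the return type is Maybe{R} whether or not a value is missing.
function lift(f, ::Type{R}, args::Maybe...) where {R}
    if any(isnull, args)
        return Maybe{R}()                    # any missing argument: empty result
    end
    return Maybe{R}(f(map(unwrap, args)...))
end

double(x::Int) = 2x
lift(double, Int, Maybe(5))            # present value 10
lift(double, Int, Maybe{Int}())        # empty Maybe{Int}
lift(+, Int, Maybe(1), Maybe{Int}())   # empty, since one argument is missing
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Both branches return the same concrete type, which is the property that keeps code operating on such values amenable to type inference.&lt;/p&gt;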

&lt;p&gt;NullableArrays offers keyword arguments for certain methods such as &lt;code class=&quot;highlighter-rouge&quot;&gt;broadcast&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; that direct the latter methods to lift passed function arguments over &lt;code class=&quot;highlighter-rouge&quot;&gt;NullableArray&lt;/code&gt; arguments:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; X = NullableArray(collect(1:10), rand(Bool, 10))
10-element NullableArray{Int64,1}:
 #NULL
 #NULL
 #NULL
     4
     5
     6
     7
     8
 #NULL
    10

julia&amp;gt; f(x::Int) = 2x
f (generic function with 2 methods)

julia&amp;gt; map(f, X)
ERROR: MethodError: `f` has no method matching f(::Nullable{Int64})
Closest candidates are:
  f(::Any, ::Any)
 [inlined code] from /Users/David/.julia/v0.4/NullableArrays/src/map.jl:93
 in _F_ at /Users/David/.julia/v0.4/NullableArrays/src/map.jl:124
 in map at /Users/David/.julia/v0.4/NullableArrays/src/map.jl:172

julia&amp;gt; map(f, X; lift=true)
10-element NullableArray{Int64,1}:
 #NULL
 #NULL
 #NULL
     8
    10
    12
    14
    16
 #NULL
    20
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;I also plan to release shortly a small package that will offer a more flexible “lift” macro, which will be able to lift function calls over &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; arguments within a variety of expression types.&lt;/p&gt;

&lt;p&gt;We hope that the new NullableArrays package will help to support not only Julia’s statistical computing ecosystem as it moves forward but also any endeavor that requires an efficient, developed interface for handling arrays of &lt;code class=&quot;highlighter-rouge&quot;&gt;Nullable&lt;/code&gt; objects. Please do try the package, submit feature requests, report bugs, and, if you’re interested, submit a PR or two. Happy coding!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Julia 0.4 Release Announcement</title>
   <link href="http://julialang.org/blog/2015/10/julia-0.4-release"/>
   <updated>2015-10-09T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/10/julia-0.4-release</id>
   <content type="html">&lt;p&gt;We are pleased to announce the release of Julia 0.4.0.  This release contains
major language refinements and numerous standard library improvements.
A summary of changes is available in the
&lt;a href=&quot;https://github.com/JuliaLang/julia/blob/release-0.4/NEWS.md&quot;&gt;NEWS log&lt;/a&gt;
found in our main repository. We will be making regular 0.4.x bugfix releases from
the release-0.4 branch of the codebase, and we recommend the 0.4.x line for users
requiring a more stable Julia environment.&lt;/p&gt;

&lt;p&gt;The Julia ecosystem continues to grow, and there are now
&lt;a href=&quot;http://pkg.julialang.org/pulse.html&quot;&gt;over 700&lt;/a&gt; registered packages! (highlights below).
JuliaCon 2015 was held in June, and &amp;gt;60 talks are &lt;a href=&quot;https://www.youtube.com/playlist?list=PLP8iPy9hna6Sdx4soiGrSefrmOPdUWixM&quot;&gt;available to view&lt;/a&gt;. &lt;a href=&quot;http://www.juliacon.in/2015&quot;&gt;JuliaCon India&lt;/a&gt; will be held in Bangalore on 9 and 10 October.&lt;/p&gt;

&lt;p&gt;We welcome bug reports on our GitHub tracker, and general usage questions on the
users mailing list, StackOverflow, and several &lt;a href=&quot;http://julialang.org/community/&quot;&gt;community forums&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Binaries are available from the
&lt;a href=&quot;http://julialang.org/downloads/&quot;&gt;main download page&lt;/a&gt;, or visit &lt;a href=&quot;https://juliabox.org/&quot;&gt;JuliaBox&lt;/a&gt;
to try 0.4 from the comfort of your browser. Happy Coding!&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;strong&gt;Notable compiler and language news:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/8745&quot;&gt;Incremental code caching for packages&lt;/a&gt;,
resulting in a major reduction in loading time for &lt;a href=&quot;http://gadflyjl.org/&quot;&gt;Gadfly&lt;/a&gt; and other large,
inter-dependent packages.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/5227&quot;&gt;Generational garbage collector&lt;/a&gt; which greatly
reduces GC overhead for many common workloads.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/pull/8712&quot;&gt;Function call overloading for arbitrary objects&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/7311&quot;&gt;Generated functions&lt;/a&gt; (sometimes known as “staged functions”) introduce finer control
over compile-time specialization.
&lt;a href=&quot;http://docs.julialang.org/en/release-0.4/manual/metaprogramming/#generated-functions&quot;&gt;Docs&lt;/a&gt;
and related &lt;a href=&quot;https://www.youtube.com/watch?v=KAN8zbM659o&amp;amp;list=PLP8iPy9hna6Sdx4soiGrSefrmOPdUWixM&amp;amp;index=55&quot;&gt;JuliaCon talk&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/pull/8791&quot;&gt;Support for documenting user functions and other objects&lt;/a&gt;
and retrieving the documentation via the help system.&lt;/li&gt;
  &lt;li&gt;Improvements in the performance and flexibility of &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/10525&quot;&gt;multidimensional abstract arrays&lt;/a&gt;,
&lt;a href=&quot;https://github.com/JuliaLang/julia/pull/8501&quot;&gt;SubArrays (array views)&lt;/a&gt;,
and efficient &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/8432&quot;&gt;multidimensional iterators&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/pull/12264&quot;&gt;Inter-task channels&lt;/a&gt; for faster communication between parallel tasks&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/JuliaLang/julia/issues/10380&quot;&gt;Tuple type improvements&lt;/a&gt;: the tuple type &lt;code class=&quot;highlighter-rouge&quot;&gt;(A,B)&lt;/code&gt;
is now written &lt;code class=&quot;highlighter-rouge&quot;&gt;Tuple{A,B}&lt;/code&gt;. This change has improved the performance of many tuple-related operations, and allows one to write fixed-size aggregate fields
as &lt;code class=&quot;highlighter-rouge&quot;&gt;field::NTuple{N,T}&lt;/code&gt; (&lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt;umber of elements of given &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt;ype).&lt;/li&gt;
  &lt;li&gt;Major improvements in Julia’s test coverage and the ability to analyze the test coverage of packages&lt;/li&gt;
  &lt;li&gt;The command line (REPL) now supports &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/10709&quot;&gt;tab-completion of emoji characters&lt;/a&gt; (common LaTeX symbols have been supported since 0.3!)&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;strong&gt;Upcoming work for 0.5&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Nightly builds will use the versioning scheme 0.5.0-dev.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;A major focus of 0.5 will be further (breaking) improvements to core array functionality, as detailed
in &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/13157&quot;&gt;this issue&lt;/a&gt;.&lt;/li&gt;
  &lt;li&gt;We plan to merge the &lt;a href=&quot;https://github.com/JuliaLang/julia/pull/13410&quot;&gt;threading branch&lt;/a&gt;,
but the functionality will be considered experimental and only available as a compile-time
flag for the near future.&lt;/li&gt;
&lt;/ul&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;strong&gt;Community News&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The Julia ecosystem continues to grow, and there are now
&lt;a href=&quot;http://pkg.julialang.org/pulse.html&quot;&gt;over 700&lt;/a&gt; registered packages! (highlights below)&lt;/p&gt;

&lt;p&gt;The second &lt;a href=&quot;http://www.juliacon.org&quot;&gt;JuliaCon&lt;/a&gt; was held in Cambridge (USA) in June, 2015.
Over 60 talks were recorded and
&lt;a href=&quot;https://www.youtube.com/playlist?list=PLP8iPy9hna6Sdx4soiGrSefrmOPdUWixM&quot;&gt;are available for viewing&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.juliacon.in/2015&quot;&gt;JuliaCon India&lt;/a&gt; will be held in Bangalore on 9 and 10 October.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.juliabloggers.com/&quot;&gt;JuliaBloggers&lt;/a&gt; is going strong! A notable recent feature is
the #MonthOfJulia series exploring the core language and a number of packages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Topical Highlights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://juliastats.github.io/&quot;&gt;JuliaStats&lt;/a&gt; - statistical and machine learning community.&lt;br /&gt;
&lt;a href=&quot;http://www.juliaopt.org/&quot;&gt;JuliaOpt&lt;/a&gt; - optimization community.&lt;br /&gt;
&lt;a href=&quot;https://juliaquantum.github.io/&quot;&gt;JuliaQuantum&lt;/a&gt; - Julia libraries for quantum-science and technology.&lt;br /&gt;
&lt;a href=&quot;https://github.com/JuliaGPU&quot;&gt;JuliaGPU&lt;/a&gt; - GPU libraries and tooling.&lt;br /&gt;
&lt;a href=&quot;https://github.com/JuliaLang/IJulia.jl&quot;&gt;IJulia&lt;/a&gt; - notebook interface built on IPython.&lt;br /&gt;
&lt;a href=&quot;https://github.com/timholy/Images.jl&quot;&gt;Images&lt;/a&gt; - image processing and i/o library.&lt;br /&gt;
&lt;a href=&quot;http://gadflyjl.org/&quot;&gt;Gadfly&lt;/a&gt; - Grammar of Graphics-inspired statistical plotting.&lt;br /&gt;
&lt;a href=&quot;https://github.com/nolta/Winston.jl&quot;&gt;Winston&lt;/a&gt; - 2D plotting.&lt;br /&gt;
&lt;a href=&quot;http://junolab.org/&quot;&gt;JunoLab&lt;/a&gt; - LightTable-based interactive environment.&lt;br /&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JuliaCon 2015 Preview - Deep Learning, 3D Printing, Parallel Computing, and so much more</title>
   <link href="http://julialang.org/blog/2015/05/juliacon-preview"/>
   <updated>2015-05-30T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/05/juliacon-preview</id>
   <content type="html">&lt;p&gt;&lt;em&gt;&lt;a href=&quot;http://juliacon.org&quot;&gt;JuliaCon 2015&lt;/a&gt; is being held at the Massachusetts Institute of Technology from June 24th to the 28th. &lt;a href=&quot;http://www.eventbrite.com/e/juliacon-2015-tickets-16517619645&quot;&gt;Get your tickets&lt;/a&gt; and &lt;a href=&quot;http://juliacon.org/#accom&quot;&gt;book your hotel&lt;/a&gt; before June 4th to take advantage of early bird pricing.&lt;/em&gt;&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;The &lt;a href=&quot;http://juliacon.org/2014/&quot;&gt;first ever JuliaCon&lt;/a&gt; was held in Chicago last year and was a great success. JuliaCon is back for 2015, this time in Cambridge, Massachusetts at &lt;a href=&quot;http://web.mit.edu/&quot;&gt;MIT&lt;/a&gt;’s architecturally-delightful Stata Center, the &lt;a href=&quot;https://www.csail.mit.edu/&quot;&gt;home of computer science at MIT&lt;/a&gt;. Last year we had a single-track format, but this year we’ve expanded into a four-day extravaganza:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;On Wednesday 24th there will be an introduction to Julia workshop run by &lt;a href=&quot;https://github.com/dpsanders&quot;&gt;David P. Sanders (@dpsanders)&lt;/a&gt; as well as a Julia &lt;strong&gt;hackathon&lt;/strong&gt; - a great chance to get some help for your new Julia projects, or to begin contributing to Julia or its many packages.&lt;/li&gt;
  &lt;li&gt;On Thursday 25th and Friday 26th we will be having speakers talking about a range of topics - we were fortunate to have so many fantastic submissions that we had to open up a second track of talks. The near-final &lt;a href=&quot;http://juliacon.org&quot;&gt;schedule is on the main page&lt;/a&gt;. We’ll be alternating between ~40 minute long “regular” talks, and ~10 minute long “lightning” talks across all the sessions.&lt;/li&gt;
  &lt;li&gt;On Saturday 27th we will finish with a series of &lt;strong&gt;workshops&lt;/strong&gt; on a range of topics: data wrangling and visualization, &lt;a href=&quot;http://juliaopt.org&quot;&gt;optimization&lt;/a&gt;, high-performance computing and more. These workshops run from 1.5 to 3 hours and will be a great way to rapidly boost your Julia skills.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3 id=&quot;thursdays-talks&quot;&gt;Thursday’s Talks&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;https://avatars3.githubusercontent.com/u/6486271&quot; width=&quot;20%&quot; /&gt;&lt;img src=&quot;http://juliaastro.github.io/images/logo.svg&quot; width=&quot;20%&quot; /&gt;&lt;img src=&quot;https://www.juliabox.org/assets/img/juliacloudlogo.png&quot; width=&quot;20%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;After getting everyone settled in, we’ll start the conference proper with a session about the use of Julia in a wide variety of &lt;strong&gt;scientific applications&lt;/strong&gt;. Many of the talks at the conference focus on Julia package organizations: groupings of similar packages that promote interoperability and focussing of efforts. In the session, &lt;a href=&quot;https://github.com/dcjones&quot;&gt;Daniel C. Jones (@dcjones)&lt;/a&gt;, the creator of the visualization package &lt;a href=&quot;http://gadflyjl.org&quot;&gt;Gadfly&lt;/a&gt;, will discuss the advances being made in the &lt;a href=&quot;https://github.com/BioJulia&quot;&gt;BioJulia&lt;/a&gt; &lt;strong&gt;bioinformatics&lt;/strong&gt; organization, and &lt;a href=&quot;https://github.com/kbarbary&quot;&gt;Kyle Barbary (@kbarbary)&lt;/a&gt; will present &lt;a href=&quot;http://juliaastro.github.io/&quot;&gt;JuliaAstro&lt;/a&gt;, a home for &lt;strong&gt;astronomy and astrophysics&lt;/strong&gt; packages. There’s something for everyone: quantitative &lt;strong&gt;economic modeling&lt;/strong&gt; (&lt;a href=&quot;http://quantecon.org/&quot;&gt;QuantEcon.jl&lt;/a&gt;), &lt;strong&gt;quantum statistical simulations&lt;/strong&gt;, and how to fit Julia into a pre-existing body of code in other languages.&lt;/p&gt;

&lt;p&gt;After lunch we’ll be splitting into two tracks: &lt;strong&gt;visualization and interactivity&lt;/strong&gt; and &lt;strong&gt;statistics&lt;/strong&gt;. The &lt;strong&gt;visualization&lt;/strong&gt; track will be demonstrating some of the exciting advances being made that enable Julia not only to produce high-quality visualizations but also to share them. &lt;a href=&quot;https://github.com/one-more-minute&quot;&gt;Mike Innes (@one-more-minute)&lt;/a&gt;, creator of the &lt;strong&gt;&lt;a href=&quot;http://junolab.org/&quot;&gt;Juno&lt;/a&gt; IDE for Julia&lt;/strong&gt;, will be sharing his work on building &lt;strong&gt;web-powered apps&lt;/strong&gt; in Julia, while &lt;a href=&quot;https://github.com/ViralBShah&quot;&gt;Viral B. Shah (@ViralBShah)&lt;/a&gt;, one of the Julia founders, will be discussing more about the inner workings of and plans for &lt;strong&gt;&lt;a href=&quot;http://juliabox.org&quot;&gt;JuliaBox&lt;/a&gt;&lt;/strong&gt;. For a different take on “visualization”, &lt;a href=&quot;https://github.com/jminardi&quot;&gt;Jack Minardi of Voxel8&lt;/a&gt; will be sharing how Julia is powering their &lt;strong&gt;3D printing work&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;statistics&lt;/strong&gt; session covers some hot topics in the field, including two talks from researchers at MIT about how Julia is playing a big part: &lt;strong&gt;probabilistic programming&lt;/strong&gt; (&lt;a href=&quot;https://github.com/zenna/Sigma.jl&quot;&gt;Sigma.jl&lt;/a&gt;) and &lt;strong&gt;deep learning&lt;/strong&gt; (&lt;a href=&quot;https://github.com/pluskid/Mocha.jl&quot;&gt;Mocha.jl&lt;/a&gt;). Facebooker John Myles White, author of &lt;a href=&quot;http://shop.oreilly.com/product/0636920018483.do&quot;&gt;“Machine Learning for Hackers”&lt;/a&gt; and a variety of packages in R and Julia, will share his thoughts on how statistics in Julia can be taken to the next stage in development, and &lt;a href=&quot;https://github.com/ninjin&quot;&gt;Pontus Stenetorp (@ninjin)&lt;/a&gt; will educate and entertain in his talk “Suitably Naming a Child with Multiple Nationalities using Julia”.&lt;/p&gt;

&lt;p&gt;We’ll come together at the end of Thursday to learn more about how to write &lt;a href=&quot;https://github.com/tonyhffong/Lint.jl&quot;&gt;good&lt;/a&gt; Julia code, how to write packages that Just Work on Windows, and how wrappers around C libraries can be made easier than you might think through the magic of &lt;a href=&quot;https://github.com/ihnorton/Clang.jl&quot;&gt;Clang.jl&lt;/a&gt;. &lt;a href=&quot;http://github.com/IainNZ&quot;&gt;Iain Dunning (@IainNZ)&lt;/a&gt;, maintainer of &lt;a href=&quot;http://pkg.julialang.org&quot;&gt;Julia’s package listing and test infrastructure&lt;/a&gt;, will follow up on last year’s talk by giving a brief history and updated status report on Julia’s package ecosystem. Finally, current Googler &lt;a href=&quot;https://github.com/astrieanna&quot;&gt;Leah Hanson (@astrieanna)&lt;/a&gt; will share some of her tips for people looking to get started with contributing to Julia and to open-source projects.&lt;/p&gt;

&lt;p&gt;Whatever you get up to after the talks end on Thursday, make sure you are up in time for…&lt;/p&gt;

&lt;h3 id=&quot;fridays-talks&quot;&gt;Friday’s talks&lt;/h3&gt;

&lt;p&gt;&lt;img src=&quot;http://www.juliaopt.org/images/juliaopt.svg&quot; width=&quot;20%&quot; /&gt;&lt;img src=&quot;https://camo.githubusercontent.com/12d691a97c0fb8364be856247ceb90c9204c2e01/687474703a2f2f6c6962656c656d656e74616c2e6f72672f5f7374617469632f656c656d656e74616c2e706e67&quot; width=&quot;20%&quot; /&gt;&lt;/p&gt;

&lt;p&gt;If you are interested in learning &lt;strong&gt;how Julia works&lt;/strong&gt; from the people who work on it every day, then Friday morning’s session is for you. The morning will kick off with newly-minted-PhD and Julia co-founder &lt;a href=&quot;https://github.com/JeffBezanson&quot;&gt;Jeff Bezanson (@JeffBezanson)&lt;/a&gt;, who is still recovering from his defense and will be updating us on the title of his talk soon. We’ll be learning more about different stages of the &lt;strong&gt;compilation process&lt;/strong&gt; from contributors &lt;a href=&quot;https://github.com/jakebolewski&quot;&gt;Jake Bolewski (@jakebolewski)&lt;/a&gt; and &lt;a href=&quot;https://github.com/quinnj&quot;&gt;Jacob Quinn (@quinnj)&lt;/a&gt;, and we’ll be covering a miscellany of other cutting-edge topics for Julia like tuning LLVM, debugging, and interfaces.&lt;/p&gt;

&lt;p&gt;In the afternoon we’ll have four sessions split across two rooms. In the second &lt;strong&gt;scientific applications&lt;/strong&gt; session we’ll be learning more about how Julia is being used to &lt;strong&gt;prevent airborne collisions&lt;/strong&gt; from Lincoln Lab’s Robert Moss, and &lt;a href=&quot;http://github.com/IainNZ&quot;&gt;Iain Dunning (@IainNZ)&lt;/a&gt; will give a sequel to last year’s &lt;a href=&quot;http://juliaopt.org&quot;&gt;JuliaOpt&lt;/a&gt; talk to update us on how Julia is becoming the language of choice for many for &lt;strong&gt;optimization&lt;/strong&gt;. We’ll also hear how Julia is enabling rapid development of advanced algorithms for simulating &lt;strong&gt;quantum systems&lt;/strong&gt;, evolving graphs, and analyzing &lt;strong&gt;seismic waves&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The &lt;strong&gt;numerical computing&lt;/strong&gt; track kicks off with Stanford’s Prof. &lt;a href=&quot;https://github.com/poulson&quot;&gt;Jack Poulson (@poulson)&lt;/a&gt;, creator of the &lt;a href=&quot;https://github.com/elemental/Elemental&quot;&gt;Elemental&lt;/a&gt; library for &lt;strong&gt;distributed-memory linear algebra&lt;/strong&gt;. Right after, the linear algebra wizard &lt;a href=&quot;https://github.com/xianyi&quot;&gt;Zhang Xianyi (@xianyi)&lt;/a&gt; will give a talk about &lt;a href=&quot;https://github.com/xianyi/OpenBLAS&quot;&gt;OpenBLAS&lt;/a&gt;, the high-performance linear algebra library Julia ships with. After a break, we’ll hear Viral’s thoughts on how &lt;strong&gt;sparse matrices&lt;/strong&gt; currently work and how they should work in Julia, before finishing off with lightning talks about &lt;strong&gt;validated numerics&lt;/strong&gt; and &lt;strong&gt;Taylor series&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;We’ll see out the day with two sessions that hit some topics of interest to people deploying Julia into larger systems: &lt;strong&gt;data&lt;/strong&gt; and &lt;strong&gt;parallel computing&lt;/strong&gt;. In the data session we’ll learn about the nuts and bolts of &lt;strong&gt;sharing and storing data&lt;/strong&gt; in Julia and hear more about plans for the future from the contributors working in these areas. Make sure to check out the talk by &lt;a href=&quot;https://github.com/aviks&quot;&gt;Avik Sengupta (@aviks)&lt;/a&gt; about his real-world industry experience putting Julia code behind a &lt;strong&gt;web-accessible API&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The parallel computing session will tackle parallelism at all levels. Contributor &lt;a href=&quot;https://github.com/amitmurthy/&quot;&gt;Amit Murthy (@amitmurthy)&lt;/a&gt; will open the session with a discussion of his recent work and plans for managing Julia in a cluster. We’ll also hear about work being done to &lt;strong&gt;make Julia multithreaded&lt;/strong&gt; at Intel, and about running Julia on a &lt;strong&gt;Cray supercomputer&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;After all that you will surely be inspired to hack on Julia projects all night, but make sure to wake up for a full day of &lt;strong&gt;workshops&lt;/strong&gt; on Saturday!&lt;/p&gt;

&lt;p&gt;Remember to &lt;a href=&quot;http://www.eventbrite.com/e/juliacon-2015-tickets-16517619645&quot;&gt;get your tickets&lt;/a&gt; and &lt;a href=&quot;http://juliacon.org/#accom&quot;&gt;book your hotel&lt;/a&gt; before June 4th to take advantage of early bird pricing. We’d also like to thank our &lt;strong&gt;platinum sponsors&lt;/strong&gt;: the Gordon and Betty Moore Foundation, BlackRock, and Julia Computing. We can’t forget our &lt;strong&gt;silver sponsors&lt;/strong&gt; either: Intel and Invenia. We’re looking forward to seeing you there!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Julia Summer of Code 2015</title>
   <link href="http://julialang.org/blog/2015/05/jsoc-cfp"/>
   <updated>2015-05-23T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2015/05/jsoc-cfp</id>
   <content type="html">&lt;p&gt;Thanks to a generous grant from the &lt;a href=&quot;http://www.moore.org/&quot;&gt;Moore Foundation&lt;/a&gt;, we are happy to announce the 2015 &lt;strong&gt;Julia Summer of Code (JSoC)&lt;/strong&gt; administered by &lt;a href=&quot;http://numfocus.org/&quot;&gt;NumFocus&lt;/a&gt;. We realize that this announcement comes quite late in the summer internship process, but we are hoping to fund six projects. The duration of JSoC 2015 will be June 15-September 15. Last date for submitting applications is June 1.&lt;/p&gt;

&lt;p&gt;Stipends will match those of the Google Summer of Code (GSoC) at $5500 for the summer &lt;strong&gt;plus travel support to attend this year’s &lt;a href=&quot;http://juliacon.org/&quot;&gt;JuliaCon&lt;/a&gt; at MIT&lt;/strong&gt;. Some amazing work from last year’s GSoC includes the &lt;a href=&quot;http://junolab.org/&quot;&gt;Juno IDE&lt;/a&gt;, the &lt;a href=&quot;https://github.com/JuliaLang/Interact.jl&quot;&gt;Interact.jl&lt;/a&gt; package, and &lt;a href=&quot;https://github.com/SimonDanisch/GLPlot.jl&quot;&gt;GLPlot&lt;/a&gt;; we hope to support another round of fun and useful projects.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you are looking for a project&lt;/strong&gt;, first, find a mentor. You may want to contact your favorite core developer, package author, or look through some of the &lt;a href=&quot;http://julialang.org/gsoc/2015/&quot;&gt;previously proposed&lt;/a&gt; projects. Mentors will be looking for some evidence that you have experience using Julia and contributing to open source projects, but you are not expected to be an expert in the proposed project area. In fact, JSoC could be a great opportunity to explore an entirely new subject. &lt;strong&gt;If you’re already a contributor to Julia or a Julia package and want to get paid to continue an existing project, that’s okay too!&lt;/strong&gt; In this case we still ask you to find a mentor who’s familiar with your field of work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;If you are a mentor looking for a student&lt;/strong&gt;, advertise the project! Post it on &lt;a href=&quot;https://groups.google.com/forum/#!forum/julia-users&quot;&gt;julia-users&lt;/a&gt; and relevant community forums. Keep in mind that project proposals should be concrete but flexible enough to adapt to the interests of a broad range of potential applicants.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Once a mentor and student have agreed on a project&lt;/strong&gt;, send an email to juliasoc@googlegroups.com for feedback and approval. We ask for this to be done by &lt;strong&gt;June 1st&lt;/strong&gt; at the latest (yes that’s soon!).&lt;/p&gt;

&lt;p&gt;Note that we use student in the broad sense. Participation is open to all, in accordance with applicable regulations. Participants do not need to demonstrate student status in any formal way. Contact juliasoc@googlegroups.com with any questions regarding eligibility.&lt;/p&gt;

&lt;p&gt;Happy coding!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Julia 0.3 Release Announcement</title>
   <link href="http://julialang.org/blog/2014/08/julia-0.3-release"/>
   <updated>2014-08-20T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2014/08/julia-0.3-release</id>
   <content type="html">&lt;p&gt;We are pleased to announce the release of Julia 0.3.0.  This release contains numerous improvements across the
board from standard library changes to pure performance enhancements as well as an expanded ecosystem of packages as
compared to the 0.2 releases. A summary of changes is available in &lt;a href=&quot;https://github.com/JuliaLang/julia/blob/release-0.3/NEWS.md&quot;&gt;NEWS.md&lt;/a&gt;
found in our main repository, and binaries are now available on our &lt;a href=&quot;http://julialang.org/downloads/&quot;&gt;main download page&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;A few notable changes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;System image caching for fast startup.&lt;/li&gt;
  &lt;li&gt;A pure-Julia REPL was introduced, replacing readline and providing expanded functionality and customization.&lt;/li&gt;
  &lt;li&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;workspace()&lt;/code&gt; function was added, to clear the environment without restarting.&lt;/li&gt;
  &lt;li&gt;Tab substitution of LaTeX character codes is now supported in the REPL, IJulia, and several editor environments.&lt;/li&gt;
  &lt;li&gt;Unicode improvements including expanded operators and NFC normalization.&lt;/li&gt;
  &lt;li&gt;Multi-process shared memory support. (multi-threading support is in progress and has been a major summer focus)&lt;/li&gt;
  &lt;li&gt;Improved hashing and floating point range support.&lt;/li&gt;
  &lt;li&gt;Better tuple performance.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We are now transitioning into the 0.4 development cycle and encourage users to use the 0.3.X line if they need a stable
Julia environment.  Many breaking changes will be entering the environment over the course of the next few months. To reflect this period of change, nightly builds will use the versioning scheme 0.4.0-dev.  Once the major breaking changes have been merged and the
development cycle progresses towards a stable release, the version will shift to 0.4.0-pre, at which point package authors
and users should start to think about transitioning their codebases over to the 0.4.X line.&lt;/p&gt;

&lt;p&gt;The release-0.3 branch of the codebase will remain open for bugfixes during this time. We encourage users facing
problems to open issues on our GitHub tracker, or email the julia-users mailing list.&lt;/p&gt;

&lt;p&gt;Happy coding.&lt;/p&gt;

&lt;hr /&gt;

&lt;p&gt;&lt;strong&gt;News&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://www.juliabloggers.com/&quot;&gt;JuliaBloggers&lt;/a&gt; and the &lt;a href=&quot;http://pkg.julialang.org/&quot;&gt;searchable package listing&lt;/a&gt; were recently introduced.&lt;/p&gt;

&lt;p&gt;The first ever &lt;a href=&quot;http://www.juliacon.org&quot;&gt;JuliaCon&lt;/a&gt; was held in Chicago in June, 2014. Several session recordings are available, and the others will be released soon:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;http://julialang.org/blog/2014/08/juliacon-opening-session/&quot;&gt;Opening session&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;http://julialang.org/blog/2014/08/juliacon-opt-session/&quot;&gt;Optimization session&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Julia community participated in &lt;a href=&quot;http://julialang.org/gsoc/2014/&quot;&gt;Google Summer of Code 2014&lt;/a&gt;. Wrap-up blog posts will be coming soon from the participants:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/SimonDanisch&quot;&gt;Simon Danisch&lt;/a&gt; (&lt;a href=&quot;https://randomphantasies.wordpress.com/&quot;&gt;3D visualization&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/shashi&quot;&gt;Shashi Gowda&lt;/a&gt; (&lt;a href=&quot;https://github.com/shashi/Interact.jl&quot;&gt;Interactive Widgets for IJulia&lt;/a&gt; and &lt;a href=&quot;http://shashi.github.io/React.jl&quot;&gt;React.jl&lt;/a&gt;)&lt;/li&gt;
  &lt;li&gt;&lt;a href=&quot;https://github.com/one-more-minute&quot;&gt;Mike Innes&lt;/a&gt; (&lt;a href=&quot;https://github.com/one-more-minute/Juno-LT&quot;&gt;Julia + LightTable&lt;/a&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Topical highlights&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;“&lt;a href=&quot;http://jiahao.github.io/julia-blog/2014/06/09/the-colors-of-chemistry.html&quot;&gt;The colors of chemistry&lt;/a&gt;” notebook by &lt;a href=&quot;http://github.com/jiahao&quot;&gt;Jiahao Chen&lt;/a&gt; demonstrating IJulia, Gadfly, dimensional computation with SIUnits, and more.&lt;/p&gt;

&lt;p&gt;&lt;a href=&quot;http://juliastats.github.io/&quot;&gt;JuliaStats&lt;/a&gt; - statistical and machine learning community.&lt;br /&gt;
&lt;a href=&quot;http://www.juliaopt.org/&quot;&gt;JuliaOpt&lt;/a&gt; - optimization community.&lt;br /&gt;
&lt;a href=&quot;https://github.com/JuliaLang/IJulia.jl&quot;&gt;IJulia&lt;/a&gt; - notebook interface built on IPython.&lt;br /&gt;
&lt;a href=&quot;https://github.com/timholy/Images.jl&quot;&gt;Images&lt;/a&gt; - image processing and i/o library.&lt;br /&gt;
&lt;a href=&quot;http://gadflyjl.org/&quot;&gt;Gadfly&lt;/a&gt; - Grammar of Graphics-inspired statistical plotting.&lt;br /&gt;
&lt;a href=&quot;https://github.com/nolta/Winston.jl&quot;&gt;Winston&lt;/a&gt; - 2D plotting.&lt;br /&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JuliaCon 2014 Optimization Presentations</title>
   <link href="http://julialang.org/blog/2014/08/juliacon-opt-session"/>
   <updated>2014-08-09T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2014/08/juliacon-opt-session</id>
   <content type="html">&lt;h2 id=&quot;optimization-session&quot;&gt;Optimization Session&lt;/h2&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h3 id=&quot;iain-dunning--joey-huchette--juliaopt---optimization-packages-for-julia&quot;&gt;&lt;a href=&quot;http://iaindunning.com/&quot;&gt;Iain Dunning&lt;/a&gt; / &lt;a href=&quot;http://www.mit.edu/~huchette/&quot;&gt;Joey Huchette&lt;/a&gt; — &lt;a href=&quot;http://www.juliaopt.org/&quot;&gt;JuliaOpt&lt;/a&gt; - Optimization Packages for Julia&lt;/h3&gt;

&lt;p&gt;Iain Dunning and Joey Huchette are both doctoral students in the Massachusetts Institute of Technology Operations Research Center, where they study constrained continuous and combinatorial numerical optimization methods and theory. In this session they present the JuliaOpt suite of optimization packages and how they interoperate. They also discuss how various Julia features enable exciting functionality in these packages.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Video:&lt;/strong&gt; &lt;a href=&quot;http://youtu.be/VwZvUvXX-vY&quot;&gt;http://youtu.be/VwZvUvXX-vY&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Slides:&lt;/strong&gt; &lt;a href=&quot;http://goo.gl/RwUdOI&quot;&gt;http://goo.gl/RwUdOI&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href=&quot;https://github.com/IainNZ&quot;&gt;https://github.com/IainNZ&lt;/a&gt; &lt;a href=&quot;https://github.com/joehuchette&quot;&gt;https://github.com/joehuchette&lt;/a&gt; &lt;a href=&quot;https://github.com/mlubin&quot;&gt;https://github.com/mlubin&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;//www.youtube.com/embed/VwZvUvXX-vY?list=PLP8iPy9hna6TSRouJfvobfxkZFYiPSvPd&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h3 id=&quot;madeleine-udell--convex-optimization-in-julia&quot;&gt;Madeleine Udell — Convex Optimization in Julia&lt;/h3&gt;

&lt;p&gt;Madeleine Udell is a PhD candidate in Computational &amp;amp; Mathematical Engineering at Stanford University, where she works with Professor Stephen Boyd. Madeleine’s work focuses on modeling and solving large-scale optimization problems and in finding and exploiting structure in high dimensional data. She is the lead developer of the CVX.jl package.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Video:&lt;/strong&gt; &lt;a href=&quot;http://youtu.be/SoI0lEaUvTs&quot;&gt;http://youtu.be/SoI0lEaUvTs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Slides:&lt;/strong&gt; &lt;a href=&quot;http://goo.gl/Nfy14D&quot;&gt;http://goo.gl/Nfy14D&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Website:&lt;/strong&gt; &lt;a href=&quot;http://web.stanford.edu/~udell/&quot;&gt;http://web.stanford.edu/~udell/&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;//www.youtube.com/embed/SoI0lEaUvTs?list=PLP8iPy9hna6TSRouJfvobfxkZFYiPSvPd&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>JuliaCon 2014 Opening Session Presentations</title>
   <link href="http://julialang.org/blog/2014/08/juliacon-opening-session"/>
   <updated>2014-08-09T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2014/08/juliacon-opening-session</id>
   <content type="html">&lt;h2 id=&quot;scientific-applications-session&quot;&gt;Scientific Applications Session&lt;/h2&gt;
&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h3 id=&quot;tim-holy--image-representation-and-analysis&quot;&gt;Tim Holy — Image Representation and Analysis&lt;/h3&gt;

&lt;p&gt;Tim Holy is a Professor in the Department of Anatomy and Neurobiology at Washington University in St. Louis. He’s been involved with Julia development for over 2 years. In this presentation, Tim describes how Images.jl can be used for rapid inquiry and dissection of biomedical imaging data.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Video:&lt;/strong&gt; &lt;a href=&quot;http://youtu.be/FA-1B_amwt8&quot;&gt;http://youtu.be/FA-1B_amwt8&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Slides:&lt;/strong&gt; &lt;a href=&quot;https://github.com/JuliaCon/presentations/tree/master/Images&quot;&gt;https://github.com/JuliaCon/presentations/tree/master/Images&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href=&quot;https://github.com/timholy&quot;&gt;https://github.com/timholy&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;//www.youtube.com/embed/FA-1B_amwt8?list=PLP8iPy9hna6TSRouJfvobfxkZFYiPSvPd&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h3 id=&quot;pontus-stenetorp--natural-language-processing-with-julia&quot;&gt;Pontus Stenetorp — Natural Language Processing with Julia&lt;/h3&gt;

&lt;p&gt;Pontus Stenetorp is a Japan Society for the Promotion of Science Postdoctoral Research Fellow at the University of Tokyo working in the areas of machine learning and natural language processing (NLP). In this talk, Pontus describes his recent experience in learning Julia and how Julia and its community have helped in his implementing a transition-based dependency parser in Julia.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Video:&lt;/strong&gt; &lt;a href=&quot;http://youtu.be/OrFxjE44COc&quot;&gt;http://youtu.be/OrFxjE44COc&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;Slides:&lt;/strong&gt; &lt;a href=&quot;https://github.com/JuliaCon/presentations/blob/master/JuliaNLP/JuliaNLP.pdf&quot;&gt;https://github.com/JuliaCon/presentations/blob/master/JuliaNLP/JuliaNLP.pdf&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href=&quot;https://github.com/ninjin&quot;&gt;https://github.com/ninjin&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;//www.youtube.com/embed/OrFxjE44COc?list=PLP8iPy9hna6TSRouJfvobfxkZFYiPSvPd&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;h3 id=&quot;speed-vs-correctness-led-by-arch-robison&quot;&gt;Speed vs. Correctness (led by Arch Robison)&lt;/h3&gt;

&lt;p&gt;Arch Robison is a Senior Principal Engineer at Intel and is an expert in parallel programming, being the original designer of the widely used Intel Threading Building Blocks library. In this session, Arch discusses the tradeoffs between instruction-level correctness and its implications for compiler optimizations.&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;&lt;strong&gt;Video:&lt;/strong&gt; &lt;a href=&quot;http://youtu.be/GFTCQNYddhs&quot;&gt;http://youtu.be/GFTCQNYddhs&lt;/a&gt;&lt;/li&gt;
  &lt;li&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href=&quot;https://github.com/ArchRobison&quot;&gt;https://github.com/ArchRobison&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;br /&gt;&lt;/p&gt;

&lt;div style=&quot;text-align: center&quot;&gt;&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;//www.youtube.com/embed/GFTCQNYddhs?list=PLP8iPy9hna6TSRouJfvobfxkZFYiPSvPd&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;&lt;/div&gt;
</content>
 </entry>
 
 <entry>
   <title>Fast Numeric Computation in Julia</title>
   <link href="http://julialang.org/blog/2013/09/fast-numeric"/>
   <updated>2013-09-04T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/09/fast-numeric</id>
   <content type="html">&lt;p&gt;Working on numerical problems daily, I have always dreamt of a language that provides an elegant interface while allowing me to write codes that run blazingly fast on large data sets. Julia is a language that turns this dream into a reality. 
With Julia, you can focus on your problem, keep your codes clean, and more importantly, write fast codes without diving into lower level languages such as C or Fortran even when performance is critical.&lt;/p&gt;

&lt;p&gt;However, you should not take this potential speed for granted. To get your codes fast, you should keep performance in mind and follow general best practice guidelines. Here, I would like to share with you my experience in writing efficient codes for numerical computation.&lt;/p&gt;

&lt;h2 id=&quot;first-make-it-correct&quot;&gt;First, make it correct&lt;/h2&gt;

&lt;p&gt;As in any language, the foremost goal when you implement your algorithm is to &lt;em&gt;make it correct&lt;/em&gt;.
An algorithm that doesn’t work correctly is useless no matter how fast it runs. One can always optimize the codes afterwards when necessary.
When there are different approaches to a problem, you should choose the one that is &lt;em&gt;asymptotically more efficient&lt;/em&gt;.
For example, an unoptimized quick-sort implementation can easily beat a carefully optimized bubble-sort when sorting even moderately large arrays.
Given a particular choice of algorithm, however, implementing it carefully and observing common performance guidelines can still make a big difference in performance – I will focus on this in the remaining part.&lt;/p&gt;

&lt;h2 id=&quot;devectorize-expressions&quot;&gt;Devectorize expressions&lt;/h2&gt;

&lt;p&gt;Users of other high level languages such as MATLAB&lt;sup&gt;®&lt;/sup&gt; or Python are often advised to &lt;em&gt;vectorize&lt;/em&gt; their codes as much as possible to get performance, because loops are slow in those languages. In Julia, on the other hand, loops can run as fast as those written in C and you no longer have to count on vectorization for speed. Actually, turning vectorized expressions into loops, which we call &lt;em&gt;devectorization&lt;/em&gt;, often results in even higher performance.&lt;/p&gt;

&lt;p&gt;Consider the following:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;r = exp(-abs(x-y))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Very simple expression, right?
Behind the scenes, however, it takes a lot of steps and temporary arrays to get you the results of this expression.
The following sequence of temporary array constructions is what is done to compute the above expression:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;n = length(x)

tmp1 = Array(Float64, n)
for i = 1:n
    tmp1[i] = x[i]-y[i]
end

tmp2 = Array(Float64, n)
for i = 1:n
    tmp2[i] = abs(tmp1[i])
end

tmp3 = Array(Float64, n)
for i = 1:n
    tmp3[i] = -tmp2[i]
end

r = Array(Float64, n)
for i = 1:n
    r[i] = exp(tmp3[i])
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We can see that this procedure creates three temporary arrays and it takes four passes to complete the computation.
This introduces significant overhead:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;It takes time to allocate memory for the temporary arrays;&lt;/li&gt;
  &lt;li&gt;It takes time to reclaim the memory of these arrays during garbage collection;&lt;/li&gt;
  &lt;li&gt;It takes time to traverse the memory – generally, fewer passes means higher efficiency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Such overhead is significant in practice, often leading to a 2x to 3x slowdown. To get optimal performance, one should &lt;em&gt;devectorize&lt;/em&gt; this code like so:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;r = similar(x) 
for i = 1:length(x)
    r[i] = exp(-abs(x[i]-y[i]))
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This version finishes the computation in one pass, without introducing any temporary arrays. Moreover, if &lt;code class=&quot;highlighter-rouge&quot;&gt;r&lt;/code&gt; is pre-allocated, one can even omit the statement that creates &lt;code class=&quot;highlighter-rouge&quot;&gt;r&lt;/code&gt;. The &lt;a href=&quot;https://github.com/lindahua/Devectorize.jl&quot;&gt;&lt;em&gt;Devectorize.jl&lt;/em&gt;&lt;/a&gt; package provides a macro &lt;code class=&quot;highlighter-rouge&quot;&gt;@devec&lt;/code&gt; that can automatically translate vectorized expressions into loops:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Devectorize

@devec r = exp(-abs(x-y))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The comprehension syntax also provides a concise syntax for devectorized computation:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;r = [exp(-abs(x[i]-y[i])) for i = 1:length(x)]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note that comprehension always creates new arrays to store the results. Hence, to write results to pre-allocated arrays, you still have to devectorize the computation manually or use the &lt;code class=&quot;highlighter-rouge&quot;&gt;@devec&lt;/code&gt; macro.&lt;/p&gt;
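
&lt;p&gt;As a minimal sketch of writing results into a pre-allocated array, the devectorized loop can be wrapped in a function that takes the output as its first argument. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;expabsdiff!&lt;/code&gt; is made up for this illustration and is not part of any package:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# hypothetical helper: writes exp(-abs(x[i]-y[i])) into the pre-allocated r
function expabsdiff!(r, x, y)
    for i = 1:length(x)
        r[i] = exp(-abs(x[i] - y[i]))
    end
    return r
end

x = [1.0, 2.0, 3.0]
y = [1.0, 0.0, 5.0]
r = similar(x)          # allocated once, outside any hot loop
expabsdiff!(r, x, y)    # fills r in one pass, no temporary arrays
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Calling such a function repeatedly with the same &lt;code class=&quot;highlighter-rouge&quot;&gt;r&lt;/code&gt; reuses its storage instead of allocating a new result each time.&lt;/p&gt;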

&lt;h2 id=&quot;merge-computations-into-a-single-loop&quot;&gt;Merge computations into a single loop&lt;/h2&gt;

&lt;p&gt;Traversing arrays, especially large ones, may incur cache misses or even page faults, both of which can cause significant latency.
Thus, it is desirable to minimize the number of round trips to memory as much as possible.
For example, you may compute multiple maps with one loop:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for i = 1:length(x)
    a[i] = x[i] + y[i]
    b[i] = x[i] - y[i]
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This is usually faster than writing &lt;code class=&quot;highlighter-rouge&quot;&gt;a = x + y; b = x - y&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The following example shows how you can compute multiple statistics (e.g. sum, max, and min) over a dataset efficiently.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;n = length(x)
rsum = rmax = rmin = x[1]
for i = 2:n
    xi = x[i]
    rsum += xi
    if xi &amp;gt; rmax
        rmax = xi
    elseif xi &amp;lt; rmin
        rmin = xi
    end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2 id=&quot;write-cache-friendly-codes&quot;&gt;Write cache-friendly codes&lt;/h2&gt;

&lt;p&gt;Modern computer systems have a complicated heterogeneous memory structure that combines registers, multiple levels of caches, and RAM. Data are accessed through the cache hierarchy – a smaller and much faster memory that stores copies of frequently used data.&lt;/p&gt;

&lt;p&gt;Most systems do not provide ways to directly control the cache system. However, you can take steps to make it much easier for the automated cache management system to help you if you write &lt;em&gt;cache-friendly&lt;/em&gt; codes. In general, you don’t have to understand every detail about how a cache system works. It is often sufficient to observe the simple rule below:&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Access data in a pattern similar to how the data resides in memory – don’t jump around between non-contiguous locations in memory.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;This is sometimes referred to as the &lt;em&gt;principle of locality&lt;/em&gt;. For example, if &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; is a contiguous array, then after reading &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i]&lt;/code&gt;, it is much more likely that &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i+1]&lt;/code&gt; is already in the cache than it is that &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i+1000000]&lt;/code&gt; is, in which case it will be &lt;em&gt;much&lt;/em&gt; faster to access &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i+1]&lt;/code&gt; than &lt;code class=&quot;highlighter-rouge&quot;&gt;x[i+1000000]&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Julia arrays are stored in column-major order, which means that the rows of a column are contiguous, but the columns of a row are generally not. It is therefore generally more efficient to access data column-by-column than row-by-row. 
Consider the problem of computing the sum of each row in a matrix. It is natural to implement this as follows:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;m, n = size(a)
r = Array(Float64, m)

for i = 1:m
    s = 0.
    for j = 1:n
        s += a[i,j]
    end
    r[i] = s
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The loop here accesses the elements row-by-row, as &lt;code class=&quot;highlighter-rouge&quot;&gt;a[i,1], a[i,2], ..., a[i,n]&lt;/code&gt;. The interval between these elements is &lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt;. Intuitively, in each inner loop it jumps with a stride of &lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt; from the beginning of a row to its end, and then jumps back to the beginning of the next row. This is not very efficient, especially when &lt;code class=&quot;highlighter-rouge&quot;&gt;m&lt;/code&gt; is large.&lt;/p&gt;

&lt;p&gt;This procedure can be made much more cache-friendly by changing the order of computation:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for i = 1:m
    r[i] = a[i,1]
end

for j = 2:n, i = 1:m
    r[i] += a[i,j]
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Some benchmarking shows that this version can be &lt;em&gt;5-10 times&lt;/em&gt; faster than the one above for large matrices.&lt;/p&gt;
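
&lt;p&gt;To try the comparison yourself, the two traversal orders can be wrapped in functions, as in the sketch below. The function names are invented for this example, and the matrix size is an arbitrary choice; timing each function with &lt;code class=&quot;highlighter-rouge&quot;&gt;@time&lt;/code&gt; after a warm-up call gives a rough idea of the difference on your machine:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# row-by-row traversal: the inner loop strides through memory
function rowsums_rowwise(a)
    m, n = size(a)
    r = zeros(m)
    for i = 1:m
        s = 0.
        for j = 1:n
            s += a[i,j]
        end
        r[i] = s
    end
    return r
end

# column-by-column traversal: the inner loop walks contiguous memory
function rowsums_colwise(a)
    m, n = size(a)
    r = zeros(m)
    for j = 1:n, i = 1:m
        r[i] += a[i,j]
    end
    return r
end

a = rand(2000, 2000)
r1 = rowsums_rowwise(a)
r2 = rowsums_colwise(a)   # same row sums, different memory access pattern
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;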

&lt;h2 id=&quot;avoid-creating-arrays-in-loops&quot;&gt;Avoid creating arrays in loops&lt;/h2&gt;

&lt;p&gt;Creating arrays requires memory allocation and adds to the workload of the garbage collector. Reusing the same array is a good way to reduce the cost of memory management.&lt;/p&gt;

&lt;p&gt;It is not uncommon that you want to update arrays in an iterative algorithm. For example, in K-means, you may want to update both the cluster means and distances in each iteration. A straightforward way to do this might look like:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;while !converged &amp;amp;&amp;amp; t &amp;lt; maxiter
    means = compute_means(x, labels)
    dists = compute_distances(x, means)
    labels = assign_labels(dists)
    ...
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In this implementation of K-means, the arrays &lt;code class=&quot;highlighter-rouge&quot;&gt;means&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;dists&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;labels&lt;/code&gt; are recreated at each iteration. This reallocation of memory on each step is unnecessary. The sizes of these arrays are fixed, and their storage can be reused across iterations. The following alternative code is a more efficient way to implement the same algorithm:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;d, n = size(x)

# pre-allocate storage
means = Array(Float64, d, K)
dists = Array(Float64, K, n)
labels = Array(Int, n)

while !converged &amp;amp;&amp;amp; t &amp;lt; maxiter
    update_means!(means, x, labels)
    update_distances!(dists, x, means)
    update_labels!(labels, dists)
    ...
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In this version, the functions invoked in the loop update pre-allocated arrays in-place.&lt;/p&gt;

&lt;p&gt;If you are writing a package, it is recommended that you provide two versions for each function that outputs arrays: one that performs the update in-place, and another that returns a new array. The latter can usually be implemented as a light-weight wrapper of the former that allocates the output array (or copies the input) and then updates it in-place.
A good example is the &lt;a href=&quot;https://github.com/JuliaStats/Distributions.jl&quot;&gt;&lt;em&gt;Distributions.jl&lt;/em&gt;&lt;/a&gt; package, which provides both &lt;code class=&quot;highlighter-rouge&quot;&gt;logpdf&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;logpdf!&lt;/code&gt;, so that one can write &lt;code class=&quot;highlighter-rouge&quot;&gt;lp = logpdf(d,x)&lt;/code&gt; when a new array is needed, or &lt;code class=&quot;highlighter-rouge&quot;&gt;logpdf!(lp,d,x)&lt;/code&gt; when &lt;code class=&quot;highlighter-rouge&quot;&gt;lp&lt;/code&gt; has been pre-allocated.&lt;/p&gt;
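
&lt;p&gt;One way to arrange such a pair is sketched below, with the allocating version written as a thin wrapper around the in-place one. The functions &lt;code class=&quot;highlighter-rouge&quot;&gt;sqdiff!&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;sqdiff&lt;/code&gt; are hypothetical names chosen for this illustration:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# in-place version: writes abs2(x[i]-y[i]) into the pre-allocated r
function sqdiff!(r, x, y)
    for i = 1:length(x)
        r[i] = abs2(x[i] - y[i])
    end
    return r
end

# allocating version: creates the output, then delegates to sqdiff!
sqdiff(x, y) = sqdiff!(similar(x), x, y)

x = [1.0, 2.0, 3.0]
y = [0.0, 4.0, 3.0]
r = sqdiff(x, y)       # returns a new array
sqdiff!(r, y, x)       # reuses the storage of r
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;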

&lt;h2 id=&quot;identify-opportunities-to-use-blas&quot;&gt;Identify opportunities to use BLAS&lt;/h2&gt;

&lt;p&gt;Julia wraps a large number of &lt;a href=&quot;http://en.wikipedia.org/wiki/Basic_Linear_Algebra_Subprograms&quot;&gt;BLAS&lt;/a&gt; routines for linear algebraic computation. These routines are the result of decades of research and optimization by many of the world’s top experts in fast numerical computation. As a result, using them where possible can provide performance boosts that seem almost magical – BLAS routines are often orders of magnitude faster than the simple loop implementations they replace.&lt;/p&gt;

&lt;p&gt;For example, consider accumulating weighted versions of vectors as follows:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;r = zeros(size(x,1))
for j = 1:size(x,2)
    r += x[:,j] * w[j]
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;You can replace the statement &lt;code class=&quot;highlighter-rouge&quot;&gt;r += x[:,j] * w[j]&lt;/code&gt; with a call to the BLAS &lt;code class=&quot;highlighter-rouge&quot;&gt;axpy!&lt;/code&gt; function to get better performance:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;for j = 1:size(x,2)
    axpy!(w[j], x[:,j], r)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This, however, is still far from optimal. If you are familiar with linear algebra, you have probably noticed that this is just a matrix-vector multiplication, which can be written as &lt;code class=&quot;highlighter-rouge&quot;&gt;r = x * w&lt;/code&gt;. That is not only shorter, simpler and clearer than either of the above loops – it also runs much faster than either version.&lt;/p&gt;
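
&lt;p&gt;As a quick sanity check, the accumulation can be compared against the matrix-vector product on a small input. The helper &lt;code class=&quot;highlighter-rouge&quot;&gt;weighted_sum&lt;/code&gt; below is a made-up, devectorized version of the loop, and the input values are arbitrary:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# hypothetical helper: accumulates x[:,j] * w[j] without temporaries
function weighted_sum(x, w)
    m, n = size(x)
    r = zeros(m)
    for j = 1:n
        wj = w[j]
        for i = 1:m
            r[i] += x[i,j] * wj
        end
    end
    return r
end

x = [1.0 2.0; 3.0 4.0]
w = [10.0, 1.0]
r_loop = weighted_sum(x, w)
r_mul = x * w          # matrix-vector product
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Both results hold the same values, up to floating-point rounding.&lt;/p&gt;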

&lt;p&gt;Our next example is a subtler application of BLAS routines to computing pairwise Euclidean distances between columns in two matrices. Below is a straightforward implementation that directly computes pairwise distances:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;m = size(a, 2)   # number of columns in a
n = size(b, 2)   # number of columns in b
r = Array(Float64, m, n)

for j = 1:n, i = 1:m
    r[i,j] = sqrt(sum(abs2(a[:,i] - b[:,j])))
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This is clearly suboptimal – a lot of temporary arrays are created in evaluating the expression in the inner loop. To speed this up, we can devectorize the inner expression:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;d, m = size(a)
n = size(b,2)
r = Array(Float64, m, n)

for j = 1:n, i = 1:m
    s = 0.
    for k = 1:d
        s += abs2(a[k,i] - b[k,j])
    end
    r[i,j] = sqrt(s)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This version is much more performant than the vectorized form. But is it the best we can do? By employing an alternative strategy, we can write an even faster algorithm for computing pairwise distances. The trick is that the squared Euclidean distance between two vectors can be expanded as:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;sum(abs2(x-y)) == sum(abs2(x)) + sum(abs2(y)) - 2*dot(x,y)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If we evaluate these three terms separately, the computation can be mapped to BLAS routines perfectly. Below, we have a new implementation of pairwise distances written using only BLAS routines, including the norm calls that are wrapped by the &lt;a href=&quot;https://github.com/lindahua/NumericExtensions.jl&quot;&gt;&lt;em&gt;NumericExtensions.jl&lt;/em&gt;&lt;/a&gt; package:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using NumericExtensions   # for sqsum
using Base.LinAlg.BLAS    # for gemm!

m = size(a, 2)   # number of columns in a
n = size(b, 2)   # number of columns in b

sa = sqsum(a, 1)   # sum(abs2(x)) for each column in a
sb = sqsum(b, 1)   # sum(abs2(y)) for each column in b

r = sa .+ reshape(sb, 1, n)          # first two terms
gemm!('T', 'N', -2.0, a, b, 1.0, r)  # add (-2.0) * a' * b to r

for i = 1:length(r)
    r[i] = sqrt(r[i])
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This version is over &lt;em&gt;100 times&lt;/em&gt; faster than our original implementation — the &lt;code class=&quot;highlighter-rouge&quot;&gt;gemm&lt;/code&gt; function in BLAS has been optimized to the extreme by many talented developers and engineers over the past few decades.&lt;/p&gt;
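
&lt;p&gt;The expansion can also be checked without any extra packages. The sketch below is not the implementation above: it replaces &lt;code class=&quot;highlighter-rouge&quot;&gt;sqsum&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;gemm!&lt;/code&gt; with a plain loop and an ordinary matrix product, and all function names are invented for this illustration. It compares the expansion-based result with the direct loops on small random inputs:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# direct evaluation with explicit loops
function pairwise_loops(a, b)
    d, m = size(a)
    n = size(b, 2)
    r = zeros(m, n)
    for j = 1:n, i = 1:m
        s = 0.
        for k = 1:d
            s += abs2(a[k,i] - b[k,j])
        end
        r[i,j] = sqrt(s)
    end
    return r
end

# sum of squares of each column
function colsqsums(a)
    d, m = size(a)
    s = zeros(m)
    for i = 1:m, k = 1:d
        s[i] += abs2(a[k,i])
    end
    return s
end

# expansion-based evaluation: the cross term is a single matrix product
function pairwise_expanded(a, b)
    sa = colsqsums(a)
    sb = colsqsums(b)
    r = -2.0 * (a' * b)     # r[i,j] == -2*dot(a[:,i], b[:,j])
    for j = 1:length(sb), i = 1:length(sa)
        # clamp at zero: rounding can leave a tiny negative value
        r[i,j] = sqrt(max(r[i,j] + sa[i] + sb[j], 0.))
    end
    return r
end

a = rand(3, 4)
b = rand(3, 5)
r1 = pairwise_loops(a, b)
r2 = pairwise_expanded(a, b)   # agrees with r1 up to rounding
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The clamp before the square root is a precaution added for this sketch: when two columns are nearly identical, the three terms can cancel to a slightly negative number.&lt;/p&gt;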

&lt;p&gt;We should mention that you don’t have to implement this yourself if you really want to compute pairwise distances: the &lt;a href=&quot;https://github.com/lindahua/Distance.jl&quot;&gt;&lt;em&gt;Distance.jl&lt;/em&gt;&lt;/a&gt; package provides optimized implementations of a broad variety of distance metrics, including this one. We presented this optimization trick as an example to illustrate the substantial performance gains that can be achieved by writing code that uses BLAS routines wherever possible.&lt;/p&gt;

&lt;h2 id=&quot;explore-available-packages&quot;&gt;Explore available packages&lt;/h2&gt;

&lt;p&gt;Julia has a very active open source ecosystem. A variety of packages have been developed that provide optimized algorithms for high performance computation.
Look for a package that does what you need before you decide to roll your own – and if you don’t find what you need, consider contributing it!
Here are a couple of packages that might be useful for those interested in high performance computation:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://github.com/lindahua/NumericExtensions.jl&quot;&gt;NumericExtensions.jl&lt;/a&gt; – extensions to Julia’s base functionality for high-performance support for a variety of common computations (many of these will gradually get moved into base Julia).&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;&lt;a href=&quot;https://github.com/lindahua/Devectorize.jl&quot;&gt;Devectorize.jl&lt;/a&gt; – macros and functions to de-vectorize vector expressions. With this package, users can write computations in high-level vectorized way while enjoying the high run-time performance of hand-coded de-vectorized loops.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Check out the &lt;a href=&quot;http://pkg.julialang.org/&quot;&gt;Julia package list&lt;/a&gt; for many more packages. Julia also ships with a &lt;a href=&quot;http://docs.julialang.org/en/latest/stdlib/profile/&quot;&gt;sampling profiler&lt;/a&gt; to measure where your code is spending most of its time. When in doubt, measure, don’t guess!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Building GUIs with Julia, Tk, and Cairo, Part II</title>
   <link href="http://julialang.org/blog/2013/05/graphical-user-interfaces-part2"/>
   <updated>2013-05-23T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/05/graphical-user-interfaces-part2</id>
   <content type="html">&lt;h1 id=&quot;drawing-painting-and-plotting&quot;&gt;Drawing, painting, and plotting&lt;/h1&gt;

&lt;p&gt;In this installment, we’ll cover both low-level graphics (using Cairo) and plotting graphs inside GUIs (using Winston).
Here again we’re relying on infrastructure built by many people, including Jeff Bezanson, Mike Nolta, and Keno Fischer.&lt;/p&gt;

&lt;h2 id=&quot;cairo&quot;&gt;Cairo&lt;/h2&gt;

&lt;h3 id=&quot;the-basics&quot;&gt;The basics&lt;/h3&gt;

&lt;p&gt;The display of the image is handled by Cairo, a C library for two-dimensional drawing.
Julia’s Cairo wrapper isn’t currently documented, so let’s walk through a couple of basics first.&lt;/p&gt;

&lt;p&gt;If you’re new to graphics libraries like Cairo, there are a few concepts that may not be immediately obvious but are introduced in the Cairo &lt;a href=&quot;http://cairographics.org/tutorial/&quot;&gt;tutorial&lt;/a&gt;.
The key concept is that the Cairo API works like “stamping,” where a &lt;em&gt;source&lt;/em&gt; gets applied to a &lt;em&gt;destination&lt;/em&gt; in a region specified by a &lt;em&gt;path&lt;/em&gt;.
Here, the destination will be the pixels corresponding to a region of a window on the screen.
We’ll control the source and the path to achieve the effects we want.&lt;/p&gt;

&lt;p&gt;Let’s play with this.
First, inside a new window we create a Cairo-enabled Canvas for drawing:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Base.Graphics
using Cairo
using Tk

win = Toplevel(&quot;Test&quot;, 400, 200)
c = Canvas(win)
pack(c, expand=true, fill=&quot;both&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We’ve created a window 400 pixels wide and 200 pixels high.
&lt;code class=&quot;highlighter-rouge&quot;&gt;c&lt;/code&gt; is our Canvas, a type defined in the &lt;code class=&quot;highlighter-rouge&quot;&gt;Tk&lt;/code&gt; package.
Later we’ll dig into the internals a bit, but for now suffice it to say that a Canvas is a multi-component object that you can often treat as a black box.
The initial call creating the canvas leaves a lot of its fields undefined, because you don’t yet know crucial details like the size of the canvas.
The call to &lt;code class=&quot;highlighter-rouge&quot;&gt;pack&lt;/code&gt; specifies that this canvas fills the entire window, and simultaneously fills in the missing information in the Canvas object itself.&lt;/p&gt;

&lt;p&gt;Note that the window is currently blank, because we haven’t drawn anything to it yet, so you can see whatever was lying underneath.
In my case it captured a small region of my desktop:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/cairo_example2.jpg?raw=true&quot; alt=&quot;cairo snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Now let’s do some drawing.
Cairo doesn’t know anything about Tk Canvases, so we have to pull out the part of it that works directly with Cairo:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ctx = getgc(c)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;getgc&lt;/code&gt; means “get graphics context,” returning an object (here &lt;code class=&quot;highlighter-rouge&quot;&gt;ctx&lt;/code&gt;) that holds all relevant information about the current state of drawing to this canvas.&lt;/p&gt;

&lt;p&gt;One nice feature of Cairo is that the coordinates are abstracted; ultimately we care about screen pixels, but we can set up &lt;em&gt;user coordinates&lt;/em&gt; that have whatever scaling is natural to the problem.
We just have to tell Cairo how to convert user coordinates to &lt;em&gt;device&lt;/em&gt; (screen) coordinates.
We set up a coordinate system using &lt;code class=&quot;highlighter-rouge&quot;&gt;set_coords&lt;/code&gt;, defined in &lt;code class=&quot;highlighter-rouge&quot;&gt;base/graphics.jl&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function set_coords(ctx::GraphicsContext, x, y, w, h, l, r, t, b)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; (horizontal) and &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt; (vertical) specify the upper-left corner of the drawing region in &lt;em&gt;device&lt;/em&gt; coordinates, and &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;h&lt;/code&gt; its width and height, respectively.
(Note Cairo uses (0,0) for the top-left corner of the window.) &lt;code class=&quot;highlighter-rouge&quot;&gt;l&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;r&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; are the &lt;em&gt;user&lt;/em&gt; coordinates corresponding to the left, right, top, and bottom, respectively, of this region.
Note that &lt;code class=&quot;highlighter-rouge&quot;&gt;set_coords&lt;/code&gt; will also &lt;code class=&quot;highlighter-rouge&quot;&gt;clip&lt;/code&gt; any drawing that occurs outside the region defined by &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;h&lt;/code&gt;; however, the coordinate system you’ve specified extends to infinity, and you can draw all the way to the edge of the canvas by calling &lt;code class=&quot;highlighter-rouge&quot;&gt;reset_clip()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Let’s fill the drawing region with a color, so we can see it:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Set coordinates to go from 0 to 10 within a 300x100 centered region
set_coords(ctx, 50, 50, 300, 100, 0, 10, 0, 10)
set_source_rgb(ctx, 0, 0, 1)   # set color to blue
paint(ctx)                     # paint the entire clip region
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Perhaps surprisingly, nothing happened.
The reason is that the Tk Canvas implements a technique called &lt;a href=&quot;http://en.wikipedia.org/wiki/Multiple_buffering#Double_buffering_in_computer_graphics&quot;&gt;double buffering&lt;/a&gt;, which means that you do all your drawing to a back (hidden) surface, and then blit the completed result to the front (visible) surface.
We can see this in action simply by bringing another window over the top of the window we’re using to draw, and then bringing our window back to the top; suddenly you’ll see a nice blue rectangle within the window, surrounded by whatever is in the background window(s):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/cairo_example3.jpg?raw=true&quot; alt=&quot;cairo snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Fortunately, to display your graphics you don’t have to rely on users changing the stacking order of windows: call &lt;code class=&quot;highlighter-rouge&quot;&gt;reveal(c)&lt;/code&gt; to update the front surface with the contents of the back surface, followed by &lt;code class=&quot;highlighter-rouge&quot;&gt;update()&lt;/code&gt; (or perhaps better, &lt;code class=&quot;highlighter-rouge&quot;&gt;Tk.update()&lt;/code&gt; since &lt;code class=&quot;highlighter-rouge&quot;&gt;update&lt;/code&gt; is a fairly generic name) to give Tk a chance to expose the front surface to the OS’s window manager.&lt;/p&gt;

&lt;p&gt;Now let’s draw a red line:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;move_to(ctx, -1, 5)
line_to(ctx, 7, 6)
set_source_rgb(ctx, 1, 0, 0)
set_line_width(ctx, 5)
stroke(ctx)
reveal(c)
Tk.update()
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We started at a position outside the coordinate region (we’ll get to see the clipping in action this way).
The next command, &lt;code class=&quot;highlighter-rouge&quot;&gt;line_to&lt;/code&gt;, creates a segment of a &lt;em&gt;path&lt;/em&gt;, the way that regions are defined in Cairo.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;stroke&lt;/code&gt; command draws a line along the trajectory of the path, after which the path is cleared.
(You can use &lt;code class=&quot;highlighter-rouge&quot;&gt;stroke_preserve&lt;/code&gt; if you want to re-use this path for another purpose later.)&lt;/p&gt;

&lt;p&gt;Let’s illustrate this by adding a solid green rectangle with a magenta border, letting it spill over the edges of the previously-defined coordinate region:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;reset_clip(ctx)
rectangle(ctx, 7, 5, 4, 4)
set_source_rgb(ctx, 0, 1, 0)
fill_preserve(ctx)
set_source_rgb(ctx, 1, 0, 1)
stroke(ctx)
reveal(c)
Tk.update()
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;fill&lt;/code&gt; differs from &lt;code class=&quot;highlighter-rouge&quot;&gt;paint&lt;/code&gt; in that &lt;code class=&quot;highlighter-rouge&quot;&gt;fill&lt;/code&gt; works inside the currently-defined path, whereas &lt;code class=&quot;highlighter-rouge&quot;&gt;paint&lt;/code&gt; fills the entire clip region.&lt;/p&gt;

&lt;p&gt;Here is our masterpiece, where the “background” may differ for you (mine was positioned over the bottom of a Wikipedia page):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/cairo_example.jpg?raw=true&quot; alt=&quot;cairo snapshot&quot; /&gt;&lt;/p&gt;

&lt;h3 id=&quot;rendering-an-image&quot;&gt;Rendering an image&lt;/h3&gt;

&lt;p&gt;Images are rendered in Cairo inside a &lt;code class=&quot;highlighter-rouge&quot;&gt;rectangle&lt;/code&gt; (controlling placement of the image) followed by &lt;code class=&quot;highlighter-rouge&quot;&gt;fill&lt;/code&gt;.
So far this is just like the simple drawing above.
The difference is the &lt;em&gt;source&lt;/em&gt;, which now will be a &lt;em&gt;surface&lt;/em&gt; instead of an RGB color.
If you’re drawing from Julia, chances are that you want to display an in-memory array.
The main trick is that Cairo requires this array to be a matrix of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Uint32&lt;/code&gt; encoding the color.
The scheme is that the least significant byte is the blue value (ranging from &lt;code class=&quot;highlighter-rouge&quot;&gt;0x00&lt;/code&gt; to &lt;code class=&quot;highlighter-rouge&quot;&gt;0xff&lt;/code&gt;), the next is green, and the next red.
(The most significant byte can encode the alpha value, or transparency, if you specify that transparency is to be used in your image surface.)&lt;/p&gt;
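
&lt;p&gt;As a rough illustration of this layout, the following sketch packs three 8-bit channels into one 32-bit word. The helper &lt;code class=&quot;highlighter-rouge&quot;&gt;pack_rgb&lt;/code&gt; is hypothetical and is not provided by &lt;code class=&quot;highlighter-rouge&quot;&gt;Cairo.jl&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;Images&lt;/code&gt;; the snippet is written for current Julia, where the type is spelled &lt;code class=&quot;highlighter-rouge&quot;&gt;UInt32&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper (an assumption for illustration): pack 8-bit red, green,
# and blue values into a single 32-bit word, blue in the least significant byte.
# The most significant byte (alpha) is left at zero here.
pack_rgb(r, g, b) = (UInt32(r) &amp;lt;&amp;lt; 16) | (UInt32(g) &amp;lt;&amp;lt; 8) | UInt32(b)

pack_rgb(0xff, 0x00, 0x00)   # 0x00ff0000 (pure red)
pack_rgb(0x00, 0xff, 0x00)   # 0x0000ff00 (pure green)
pack_rgb(0x00, 0x00, 0xff)   # 0x000000ff (pure blue)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;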

&lt;p&gt;Both &lt;code class=&quot;highlighter-rouge&quot;&gt;Winston&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;Images&lt;/code&gt; can generate a buffer of &lt;code class=&quot;highlighter-rouge&quot;&gt;Uint32&lt;/code&gt; for you.
Let’s try the one in &lt;code class=&quot;highlighter-rouge&quot;&gt;Images&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Images
img = imread(&quot;some_photo.jpg&quot;)
buf = uint32color(img)'
image(ctx, CairoRGBSurface(buf), 0, 0, 10, 10)
reveal(c)
Tk.update()
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Rather than manually calling &lt;code class=&quot;highlighter-rouge&quot;&gt;rectangle&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;fill&lt;/code&gt;, we use the convenience method &lt;code class=&quot;highlighter-rouge&quot;&gt;image(ctx, surf, x, y, w, h)&lt;/code&gt; (defined in &lt;code class=&quot;highlighter-rouge&quot;&gt;Cairo.jl&lt;/code&gt;).
Here &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;h&lt;/code&gt; are user-coordinates of your canvas, not pixels on the screen or pixels in your image; being able to express location in user coordinates is the main advantage of using &lt;code class=&quot;highlighter-rouge&quot;&gt;image()&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The image should now be displayed within your window (squashed, because we haven’t worried about aspect ratio):&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/cairo_image1.jpg?raw=true&quot; alt=&quot;cairo snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;It fills only part of the window because of the coordinate system we’ve established, where the range &lt;code class=&quot;highlighter-rouge&quot;&gt;0:10&lt;/code&gt; corresponds to an inset region in the center of the window.&lt;/p&gt;

&lt;p&gt;While it’s a minor point, note that &lt;code class=&quot;highlighter-rouge&quot;&gt;CairoRGBSurface&lt;/code&gt; takes a transpose for you, to convert from the column-major order of matrices in Julia to the row-major convention of Cairo.
&lt;code class=&quot;highlighter-rouge&quot;&gt;Images&lt;/code&gt; avoids taking transposes unless necessary, and is capable of handling images with any storage order.
Here we do a transpose in preparation to have it be converted back to its original shape by &lt;code class=&quot;highlighter-rouge&quot;&gt;CairoRGBSurface&lt;/code&gt;.
If performance is critical, you can avoid the default behavior of &lt;code class=&quot;highlighter-rouge&quot;&gt;CairoRGBSurface&lt;/code&gt; by calling &lt;code class=&quot;highlighter-rouge&quot;&gt;CairoImageSurface&lt;/code&gt; directly (see the &lt;code class=&quot;highlighter-rouge&quot;&gt;Cairo.jl&lt;/code&gt; code).&lt;/p&gt;
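
&lt;p&gt;To see why a transpose is involved at all, it can help to look at how a small matrix is laid out in memory. This is a plain-Julia sketch (written for current Julia, with no Cairo involved), using a made-up 2x3 matrix to stand in for an image buffer:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# A made-up 2x3 matrix standing in for an image buffer
A = [1 2 3;
     4 5 6]

vec(A)               # column-major storage order: [1, 4, 2, 5, 3, 6]
vec(permutedims(A))  # row-by-row order:           [1, 2, 3, 4, 5, 6]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A row-major consumer such as Cairo reads memory in the second order, so somewhere along the way the data has to be transposed.&lt;/p&gt;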

&lt;h3 id=&quot;redrawing--resize-support&quot;&gt;Redrawing &amp;amp; resize support&lt;/h3&gt;

&lt;p&gt;A basic feature of windows is that they should behave properly under resizing operations.
This doesn’t come entirely for free, although the grid (and pack) managers of Tk take care of many details for us.
However, for Canvases we need to do a little bit of extra work; to see what I mean, just try resizing the window we created above.&lt;/p&gt;

&lt;p&gt;The key is to have a callback that gets activated whenever the canvas changes size, and to have this callback capable of redrawing the window at arbitrary size.
Canvases make this easy by having a field, &lt;code class=&quot;highlighter-rouge&quot;&gt;resize&lt;/code&gt;, that you assign the callback to.
This function will receive a single argument, the canvas itself, but as always you can provide more information.
Taking our image example, we could set&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;c.resize = c-&amp;gt;redraw(c, buf)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;and then define&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function redraw(c::Canvas, buf)
    ctx = getgc(c)
    set_source_rgb(ctx, 1, 0, 0)
    paint(ctx)
    set_coords(ctx, 50, 50, Tk.width(c)-100, Tk.height(c)-100, 0, 10, 0, 10)
    image(ctx, CairoRGBSurface(buf), 0, 0, 10, 10)
    reveal(c)
    Tk.update()
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here you can see that we’re aiming to be a bit more polished, and want to avoid seeing bits of the desktop around the borders of our drawing region.
So we fill the window with a solid color (but choose a garish red, to make sure we notice it) before displaying the image.
We also have to re-create our coordinate system, because resizing destroys the old Cairo context and, with it, the coordinates we had set up; in this case we dynamically adjust the coordinates to the size of the canvas.
Finally, we redraw the image.
Note we didn’t have to go through the process of converting to &lt;code class=&quot;highlighter-rouge&quot;&gt;Uint32&lt;/code&gt;-based color again.
Obviously, you can use this &lt;code class=&quot;highlighter-rouge&quot;&gt;redraw&lt;/code&gt; function even for the initial rendering of the window, so there’s really no extra work in setting up your code this way.&lt;/p&gt;

&lt;p&gt;If you grab the window handle and resize it, now you should see something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/cairo_image2.jpg?raw=true&quot; alt=&quot;cairo snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Voilà! We’re really getting somewhere now.&lt;/p&gt;

&lt;p&gt;Unlike the complete GUI, this implementation doesn’t have the option to preserve the image’s aspect ratio.
However, there’s really no magic there; it all comes down to computing sizes and controlling the drawing region and coordinate system.&lt;/p&gt;
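
&lt;p&gt;As a sketch of the kind of computation involved, the function below finds the largest centered rectangle with a given aspect ratio that fits inside a canvas. The name &lt;code class=&quot;highlighter-rouge&quot;&gt;fit_region&lt;/code&gt; and its arguments are made up for this illustration, and the complete GUI may well do this differently:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper (an assumption, not the ImageView implementation):
# largest centered region with aspect ratio img_w/img_h that fits inside
# a canvas of size canvas_w x canvas_h. Returns (x, y, w, h) in pixels.
function fit_region(canvas_w, canvas_h, img_w, img_h)
    scale = min(canvas_w / img_w, canvas_h / img_h)  # limiting dimension wins
    w = img_w * scale
    h = img_h * scale
    x = (canvas_w - w) / 2   # center horizontally
    y = (canvas_h - h) / 2   # center vertically
    return (x, y, w, h)
end

fit_region(400, 200, 100, 100)  # (100.0, 0.0, 200.0, 200.0)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The four returned values could then serve as the &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;w&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;h&lt;/code&gt; arguments of &lt;code class=&quot;highlighter-rouge&quot;&gt;set_coords&lt;/code&gt;.&lt;/p&gt;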

&lt;p&gt;One important point: resizing the window causes the existing Cairo context(s) to be destroyed, and creates new ones suitable for the new canvas size.
One consequence is that your old &lt;code class=&quot;highlighter-rouge&quot;&gt;ctx&lt;/code&gt; variable &lt;em&gt;is now invalid, and trying to use it for drawing will cause a segfault&lt;/em&gt;.
For this reason, you shouldn’t ever store a &lt;code class=&quot;highlighter-rouge&quot;&gt;ctx&lt;/code&gt; object on its own; always begin drawing by calling &lt;code class=&quot;highlighter-rouge&quot;&gt;getgc(c)&lt;/code&gt; again.&lt;/p&gt;

&lt;h3 id=&quot;canvases-and-the-mouse&quot;&gt;Canvases and the mouse&lt;/h3&gt;

&lt;p&gt;A Canvas already comes with a set of fields prepared for mouse events.
For example, in the complete GUI we have the equivalent of the following:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;selectiondonefunc = (c, bb) -&amp;gt; zoombb(imgc, img2, bb)
c.mouse.button1press = (c, x, y) -&amp;gt; rubberband_start(c, x, y, selectiondonefunc)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;rubberband_start&lt;/code&gt;, a function defined in &lt;code class=&quot;highlighter-rouge&quot;&gt;rubberband.jl&lt;/code&gt;, will now be called whenever the user presses the left mouse button.
&lt;code class=&quot;highlighter-rouge&quot;&gt;selectiondonefunc&lt;/code&gt; is a callback that we supply; it will be executed when the user releases the mouse button, and it needs to implement whatever it is we want to achieve with the selected region (in this case, a zoom operation).
Part of what &lt;code class=&quot;highlighter-rouge&quot;&gt;rubberband_start&lt;/code&gt; does is to bind &lt;code class=&quot;highlighter-rouge&quot;&gt;selectiondonefunc&lt;/code&gt; to the release of the mouse button, via &lt;code class=&quot;highlighter-rouge&quot;&gt;c.mouse.button1release&lt;/code&gt;.
&lt;code class=&quot;highlighter-rouge&quot;&gt;bb&lt;/code&gt; is a &lt;code class=&quot;highlighter-rouge&quot;&gt;BoundingBox&lt;/code&gt; (a type defined in &lt;code class=&quot;highlighter-rouge&quot;&gt;base/graphics.jl&lt;/code&gt;) that will store the region selected by the user, and this gets passed to &lt;code class=&quot;highlighter-rouge&quot;&gt;selectiondonefunc&lt;/code&gt;.
(The first two inputs to &lt;code class=&quot;highlighter-rouge&quot;&gt;zoombb&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;imgc&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;img2&lt;/code&gt;, store settings that are relevant to this particular GUI but will not be described in detail here.)&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;mouse&lt;/code&gt; inside a &lt;code class=&quot;highlighter-rouge&quot;&gt;Canvas&lt;/code&gt; is an object of type &lt;code class=&quot;highlighter-rouge&quot;&gt;MouseHandler&lt;/code&gt;, which has fields for &lt;code class=&quot;highlighter-rouge&quot;&gt;press&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;release&lt;/code&gt; of all 3 mouse buttons and additional ones for motion.
However, a few cases (which happen to be relevant to this GUI) are not available in &lt;code class=&quot;highlighter-rouge&quot;&gt;MouseHandler&lt;/code&gt;.
Here are some examples of how to configure these actions:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Bind double-clicks
bind(c.c, &quot;&amp;lt;Double-Button-1&amp;gt;&quot;, (path,x,y)-&amp;gt;zoom_reset(imgc, img2))
# Bind Shift-scroll (using the wheel mouse)
bindwheel(c.c, &quot;Shift&quot;, (path,delta)-&amp;gt;panhorz(imgc,img2,int(delta)))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;delta&lt;/code&gt; argument for the wheel mouse will encode the direction of scrolling.&lt;/p&gt;

&lt;h3 id=&quot;the-rubber-band-region-selection&quot;&gt;The rubber band (region selection)&lt;/h3&gt;

&lt;p&gt;Support for the rubber band is provided in the file &lt;code class=&quot;highlighter-rouge&quot;&gt;rubberband.jl&lt;/code&gt;.
Like &lt;code class=&quot;highlighter-rouge&quot;&gt;navigation.jl&lt;/code&gt;, this is a stand-alone set of functions that you should be able to incorporate into other projects.
It draws a dashed rectangle employing the same machinery we described at the top of this page, with slight modifications to create the dashes (through the &lt;code class=&quot;highlighter-rouge&quot;&gt;set_dash&lt;/code&gt; function).
By now, this should all be fairly straightforward.&lt;/p&gt;

&lt;p&gt;However, these functions use one additional trick worth mentioning.
Let’s finally look at the Tk &lt;code class=&quot;highlighter-rouge&quot;&gt;Canvas&lt;/code&gt; object:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type Canvas
    c::TkWidget
    front::CairoSurface  # surface for window
    back::CairoSurface   # backing store
    frontcc::CairoContext
    backcc::CairoContext
    mouse::MouseHandler
    redraw
    
    function ...
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here we can explicitly see the two buffers, used in double-buffering, and their associated contexts.
&lt;code class=&quot;highlighter-rouge&quot;&gt;getgc(c)&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;c&lt;/code&gt; is a &lt;code class=&quot;highlighter-rouge&quot;&gt;Canvas&lt;/code&gt;, simply returns &lt;code class=&quot;highlighter-rouge&quot;&gt;backcc&lt;/code&gt;.
This is why all drawing occurs on the back surface.
For the rubber band, we choose instead to draw on the front surface, and then (as the size of the rubber band changes) “repair the damage” by copying from the back surface.
Since we only have to modify the pixels along the band itself, this is fast.
You can see these details in &lt;code class=&quot;highlighter-rouge&quot;&gt;rubberband.jl&lt;/code&gt;.&lt;/p&gt;

&lt;h2 id=&quot;winston&quot;&gt;Winston&lt;/h2&gt;

&lt;p&gt;For many GUIs in Julia, an important component will be the ability to display data graphically.
While we could draw graphs directly with Cairo, it would be a lot of work to build from scratch; fortunately, there’s an excellent package, Winston, that already does this.&lt;/p&gt;

&lt;p&gt;Since there’s a nice set of &lt;a href=&quot;https://github.com/nolta/Winston.jl/blob/master/doc/examples.md&quot;&gt;examples&lt;/a&gt; of some of the things you can do with Winston, here our focus is very narrow: how to integrate Winston plots into GUIs built with Tk.
Fortunately, this is quite easy.
Let’s walk through an example:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Tk
using Winston

win = Toplevel(&quot;Testing&quot;, 400, 200)
fwin = Frame(win)
pack(fwin, expand=true, fill=&quot;both&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;We chose to fill the entire window with a frame &lt;code class=&quot;highlighter-rouge&quot;&gt;fwin&lt;/code&gt;, so that everything inside this GUI will have a consistent background. All other objects will be placed inside &lt;code class=&quot;highlighter-rouge&quot;&gt;fwin&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Next, let’s set up the elements, a Canvas on the left and a single button on the right:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;c = Canvas(fwin, 300, 200)
grid(c, 1, 1, sticky=&quot;nsew&quot;)
fctrls = Frame(fwin)
grid(fctrls, 1, 2, sticky=&quot;sw&quot;, pady=5, padx=5)
grid_columnconfigure(fwin, 1, weight=1)
grid_rowconfigure(fwin, 1, weight=1)

ok = Button(fctrls, &quot;OK&quot;)
grid(ok, 1, 1)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Finally, let’s plot something inside the Canvas:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;x = linspace(0.0,10.0,1001)
y = sin(x)
p = FramedPlot()
add(p, Curve(x, y, &quot;color&quot;, &quot;red&quot;))

Winston.display(c, p)
reveal(c)
Tk.update()
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/winston.jpg?raw=true&quot; alt=&quot;Winston snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;You’ll note that you can resize this window, and the plot grows or shrinks accordingly.&lt;/p&gt;

&lt;p&gt;Easy, huh? The only part of this code that is specific to GUIs is the line &lt;code class=&quot;highlighter-rouge&quot;&gt;Winston.display(c, p)&lt;/code&gt;, where we specified that we wanted our plot to appear inside a particular Canvas.
Of course, there’s a lot of magic behind the scenes in Winston, but covering its internals is beyond our scope here.&lt;/p&gt;

&lt;h2 id=&quot;conclusions&quot;&gt;Conclusions&lt;/h2&gt;

&lt;p&gt;There’s more one could cover, but most of the rest is fairly specific to this particular GUI.
A fair amount of code is needed to handle coordinates: selecting specific regions within the 4d image, and rendering to specific regions of the output canvas.
If you want to dive into these details, your best bet is to start reading through the &lt;code class=&quot;highlighter-rouge&quot;&gt;ImageView&lt;/code&gt; code, but it’s not going to be covered in any more detail here.&lt;/p&gt;

&lt;p&gt;Hopefully by this point you have a pretty good sense for how to produce on-screen output with Tk, Cairo, and Winston.
It takes a little practice to get comfortable with these tools, but the end result is quite powerful.
Happy hacking!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Building GUIs with Julia, Tk, and Cairo, Part I</title>
   <link href="http://julialang.org/blog/2013/05/graphical-user-interfaces-part1"/>
   <updated>2013-05-23T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/05/graphical-user-interfaces-part1</id>
   <content type="html">&lt;p&gt;This is the first of two blog posts designed to walk users through the process of creating GUIs in Julia.
Those following Julia development will know that plotting in Julia is still evolving, and one could therefore expect that it might be premature to build GUIs with Julia.
My own recent experience has taught me that this expectation is wrong: compared with building GUIs in Matlab (my only previous GUI-writing experience), Julia already offers a number of quite compelling advantages.
We’ll see some of these advantages on display below.&lt;/p&gt;

&lt;p&gt;We’ll go through the highlights needed to create an image viewer GUI.
Before getting into how to write this GUI, first let’s play with it to get a sense for how it works.
It’s best if you just try these commands yourself, because it’s difficult to capture things like interactivity with static text and pictures.&lt;/p&gt;

&lt;p&gt;You’ll need the &lt;code class=&quot;highlighter-rouge&quot;&gt;ImageView&lt;/code&gt; package:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;Pkg.add(&quot;ImageView&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;It’s worth pointing out that this package is expected to evolve over time; if things have changed from what’s described in this blog post, try checking out the “blog” branch directly from the &lt;a href=&quot;https://github.com/timholy/ImageView.jl&quot;&gt;repository&lt;/a&gt;.
I should also point out that this package was developed on my Linux system, and it’s possible that things may not work as well on other platforms.&lt;/p&gt;

&lt;p&gt;First let’s try it with a photograph. Load one this way:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Images
using ImageView
img = imread(&quot;my_photo.jpg&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Any typical image format should be fine; it doesn’t have to be a jpg.
Now display the image this way:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;display(img, pixelspacing = [1,1])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The basic command to view the image is &lt;code class=&quot;highlighter-rouge&quot;&gt;display&lt;/code&gt;.
The optional &lt;code class=&quot;highlighter-rouge&quot;&gt;pixelspacing&lt;/code&gt; input tells &lt;code class=&quot;highlighter-rouge&quot;&gt;display&lt;/code&gt; that this image has a fixed aspect ratio, and that this needs to be honored when displaying the image. (Alternatively, you could set &lt;code class=&quot;highlighter-rouge&quot;&gt;img[&quot;pixelspacing&quot;] = [1,1]&lt;/code&gt; and then you wouldn’t have to tell this to the &lt;code class=&quot;highlighter-rouge&quot;&gt;display&lt;/code&gt; function.)&lt;/p&gt;

&lt;p&gt;You should get a window with your image:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo1.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;OK, nice.
But we can start to have some fun if we resize the window, which causes the image to get bigger or smaller:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo2.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Note the black perimeter; that’s because we’ve specified the aspect ratio through the &lt;code class=&quot;highlighter-rouge&quot;&gt;pixelspacing&lt;/code&gt; input, and when the window doesn’t have the same aspect ratio as the image you’ll have a perimeter either horizontally or vertically.
Try it without specifying &lt;code class=&quot;highlighter-rouge&quot;&gt;pixelspacing&lt;/code&gt;, and you’ll see that the image stretches to fill the window, but it looks distorted:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;display(img)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo3.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;(This won’t work if you’ve already defined &lt;code class=&quot;highlighter-rouge&quot;&gt;&quot;pixelspacing&quot;&lt;/code&gt; for &lt;code class=&quot;highlighter-rouge&quot;&gt;img&lt;/code&gt;; if necessary, use &lt;code class=&quot;highlighter-rouge&quot;&gt;delete!(img, &quot;pixelspacing&quot;)&lt;/code&gt; to remove that setting.)&lt;/p&gt;

&lt;p&gt;Next, click and drag somewhere inside the image.
You’ll see the typical rubberband selection, and once you let go the image display will zoom in on the selected region.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo4.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;
&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo5.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Again, the aspect ratio of the display is preserved.
Double-clicking on the image restores the display to full size.&lt;/p&gt;

&lt;p&gt;If you have a wheel mouse, zoom in again and scroll the wheel, which should cause the image to pan vertically.
If you scroll while holding down Shift, it pans horizontally; hold down Ctrl and you affect the zoom setting.
Note as you zoom via the mouse, the zoom stays focused around the mouse pointer location, making it easy to zoom in on some small feature simply by pointing your mouse at it and then Ctrl-scrolling.&lt;/p&gt;
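
&lt;p&gt;The idea behind pointer-centered zooming can be sketched in a few lines. The function &lt;code class=&quot;highlighter-rouge&quot;&gt;zoom_about&lt;/code&gt; below is a hypothetical illustration, not the code used in &lt;code class=&quot;highlighter-rouge&quot;&gt;ImageView&lt;/code&gt;: it shrinks a visible interval around a fixed point, so that the coordinate under the pointer stays at the same relative position on screen.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper (an assumption for illustration): zoom the visible
# interval (lo, hi) by `factor` while keeping the coordinate `p` under the
# pointer at the same relative position. Apply it to x and y independently.
function zoom_about(lo, hi, p, factor)
    newlo = p - (p - lo) / factor
    newhi = p + (hi - p) / factor
    return (newlo, newhi)
end

zoom_about(0, 100, 25, 2)  # (12.5, 62.5)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In this example the pointer sits a quarter of the way across the view both before and after the zoom.&lt;/p&gt;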

&lt;p&gt;Long-time users of Matlab may note a number of nice features about this behavior:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;The resizing and panning are much smoother than Matlab’s.&lt;/li&gt;
  &lt;li&gt;Matlab doesn’t expose modifier keys in conjunction with the wheel mouse, making it difficult to implement this degree of interactivity.&lt;/li&gt;
  &lt;li&gt;In Matlab, zooming with the wheel mouse is always centered on the middle of the display, requiring you to alternate between zooming and panning to magnify a particular small region of your image or plot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These already give a taste of some of the features we can achieve quite easily in Julia.&lt;/p&gt;

&lt;p&gt;However, there’s more to this GUI than meets the eye.
You can display the image upside-down with&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;display(img, pixelspacing = [1,1], flipy=true)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;or switch the &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt; axes with&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;display(img, pixelspacing = [1,1], xy=[&quot;y&quot;,&quot;x&quot;])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo6.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;
&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/photo7.jpg?raw=true&quot; alt=&quot;photo&quot; /&gt;&lt;/p&gt;

&lt;p&gt;To experience the full functionality, you’ll need a “4D image,” a movie (time sequence) of 3D images.
If you don’t happen to have one lying around, you can create one via &lt;code class=&quot;highlighter-rouge&quot;&gt;include(&quot;test/test4d.jl&quot;)&lt;/code&gt;, where &lt;code class=&quot;highlighter-rouge&quot;&gt;test&lt;/code&gt; means the test directory in &lt;code class=&quot;highlighter-rouge&quot;&gt;ImageView&lt;/code&gt;.
(Assuming you installed &lt;code class=&quot;highlighter-rouge&quot;&gt;ImageView&lt;/code&gt; via the package manager, you can say &lt;code class=&quot;highlighter-rouge&quot;&gt;include(joinpath(Pkg.dir(), &quot;ImageView&quot;, &quot;test&quot;, &quot;test4d.jl&quot;))&lt;/code&gt;.)
This creates a solid cone that changes color over time, again in the variable &lt;code class=&quot;highlighter-rouge&quot;&gt;img&lt;/code&gt;.
Then, type &lt;code class=&quot;highlighter-rouge&quot;&gt;display(img)&lt;/code&gt;.
You should see something like this:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/display_GUI.jpg?raw=true&quot; alt=&quot;GUI snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;The green circle is a “slice” from the cone.
At the bottom of the window you’ll see a number of buttons and our current location, &lt;code class=&quot;highlighter-rouge&quot;&gt;z=1&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;t=1&lt;/code&gt;, which correspond to the base of the cone and the beginning of the movie, respectively.
Click the upward-pointing green arrow, and you’ll “pan” through the cone in the &lt;code class=&quot;highlighter-rouge&quot;&gt;z&lt;/code&gt; dimension, making the circle smaller.
You can go back with the downward-pointing green arrow, or step frame-by-frame with the black arrows.
Next, clicking the “play forward” button moves forward in time, and you’ll see the color change through gray to magenta.
The black square is a stop button. You can, of course, type a particular &lt;code class=&quot;highlighter-rouge&quot;&gt;z&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; location into the entry boxes, or grab the sliders and move them.&lt;/p&gt;

&lt;p&gt;If you have a wheel mouse, Alt-scroll changes the time, and Ctrl-Alt-scroll changes the z-slice.&lt;/p&gt;

&lt;p&gt;You can change the playback speed by right-clicking in an empty space within the navigation bar, which brings up a popup (context) menu:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/popup.jpg?raw=true&quot; alt=&quot;GUI snapshot&quot; /&gt;&lt;/p&gt;


&lt;p&gt;By default, &lt;code class=&quot;highlighter-rouge&quot;&gt;display&lt;/code&gt; will show you slices in the &lt;code class=&quot;highlighter-rouge&quot;&gt;xy&lt;/code&gt;-plane.
You might want to see a different set of slices from the 4d image:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;display(img, xy=[&quot;x&quot;,&quot;z&quot;])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Initially you’ll see nothing, but that’s because this edge of the image is black.
Type 151 into the &lt;code class=&quot;highlighter-rouge&quot;&gt;y:&lt;/code&gt; entry box (note its name has changed) and hit enter, or move the “y” slider into the middle of its range; now you’ll see the cone from the side.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/display_GUI2.jpg?raw=true&quot; alt=&quot;GUI snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;This GUI is also useful for “plain movies” (2d images with time), in which case the &lt;code class=&quot;highlighter-rouge&quot;&gt;z&lt;/code&gt; controls will be omitted and it will behave largely as a typical movie-player.
Likewise, the &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; controls will be omitted for 3d images lacking a temporal component, making this a nice viewer for MRI scans.&lt;/p&gt;

&lt;p&gt;Again, we note a number of improvements over Matlab:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When you resize the window, note that the controls keep their initial size, while the image fills the window. With some effort this behavior is possible to achieve in Matlab, but (as you’ll see later in these posts) it’s essentially trivial with Julia and Tk.&lt;/li&gt;
  &lt;li&gt;When we move a slider, the display updates while we drag it, not just when we let go of the mouse button.&lt;/li&gt;
  &lt;li&gt;If you try this with a much larger 3d or 4d image, you may also notice that the display feels snappy and responsive in a way that’s sometimes hard to achieve with Matlab.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Altogether, advantages such as these combine to give a substantially more polished feel to GUI applications written in Julia.&lt;/p&gt;

&lt;p&gt;This completes our tour of the features of this GUI.
Now let’s go through a few of the highlights needed to create it.
We’ll tackle this in pieces; not only will this make it easier to learn, but it also illustrates how to build re-usable components.
Let’s start with the navigation frame.&lt;/p&gt;

&lt;h2 id=&quot;first-steps-the-navigation-frame&quot;&gt;First steps: the navigation frame&lt;/h2&gt;

&lt;p&gt;First, let me acknowledge that this GUI is built on the work of many people who have contributed to Julia’s Cairo and Tk packages.
For this step, we’ll make particular use of John Verzani’s contribution of a huge set of convenience wrappers for most of Tk’s widget functionality.
John wrote up a nice set of &lt;a href=&quot;https://github.com/JuliaLang/Tk.jl/tree/master/examples&quot;&gt;examples&lt;/a&gt; that demonstrate many of the things you can do with it; this first installment is essentially just a “longer” example, and won’t surprise anyone who has read his documentation.&lt;/p&gt;

&lt;p&gt;Let’s create a couple of types to hold the data we’ll need.
We need a type that stores “GUI state,” which here consists of the currently-viewed location in the image and information needed to implement the “play” functionality:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type NavigationState
    # Dimensions:
    zmax::Int          # number of frames in z, set to 1 if only 2 spatial dims
    tmax::Int          # number of frames in t, set to 1 if only a single image
    z::Int             # current position in z-stack
    t::Int             # current moment in time
    # Other state data:
    timer              # nothing if not playing, TimeoutAsyncWork if we are
    fps::Float64       # playback speed in frames per second
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Next, let’s create a type to hold “handles” to all the widgets:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type NavigationControls
    stepup                            # z buttons...
    stepdown
    playup
    playdown
    stepback                          # t buttons...
    stepfwd
    playback
    playfwd
    stop
    editz                             # edit boxes
    editt
    textz                             # static text (information)
    textt
    scalez                            # scale (slider) widgets
    scalet
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;It might not be strictly necessary to hold handles to all the widgets (you could do everything with callbacks), but having them available is convenient.
For example, if you don’t like the icons I created, you can initialize the GUI and then use the handles to replace the icons with something better.&lt;/p&gt;

&lt;p&gt;We’ll talk about initialization later; for now, assume that we have a variable &lt;code class=&quot;highlighter-rouge&quot;&gt;state&lt;/code&gt; of type &lt;code class=&quot;highlighter-rouge&quot;&gt;NavigationState&lt;/code&gt; that holds the current position in the (possibly) 4D image, and &lt;code class=&quot;highlighter-rouge&quot;&gt;ctrls&lt;/code&gt; which contains a fully-initialized set of widget handles.&lt;/p&gt;

&lt;p&gt;Each button needs a callback function to be executed when it is clicked.
Let’s go through the functions for controlling &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt;.
First, there is a general utility that is not tied to any button but affects many of the controls:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function updatet(ctrls, state)
    set_value(ctrls.editt, string(state.t))
    set_value(ctrls.scalet, state.t)
    enableback = state.t &amp;gt; 1
    set_enabled(ctrls.stepback, enableback)
    set_enabled(ctrls.playback, enableback)
    enablefwd = state.t &amp;lt; state.tmax
    set_enabled(ctrls.stepfwd, enablefwd)
    set_enabled(ctrls.playfwd, enablefwd)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The first two lines synchronize the entry box and slider to the current value of &lt;code class=&quot;highlighter-rouge&quot;&gt;state.t&lt;/code&gt;;
the currently-selected time can change by many different mechanisms (one of the buttons, typing into the entry box, or moving the slider), so we make &lt;code class=&quot;highlighter-rouge&quot;&gt;state.t&lt;/code&gt; be the “authoritative” value and synchronize everything to it.
The remaining lines of this function control which of the &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; navigation buttons are enabled (if &lt;code class=&quot;highlighter-rouge&quot;&gt;t==1&lt;/code&gt;, we can’t go any earlier in the movie, so we gray out the backwards buttons).&lt;/p&gt;

&lt;p&gt;A second utility function modifies &lt;code class=&quot;highlighter-rouge&quot;&gt;state.t&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function incrementt(inc, ctrls, state, showframe)
    state.t += inc
    updatet(ctrls, state)
    showframe(state)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note the call to &lt;code class=&quot;highlighter-rouge&quot;&gt;updatet&lt;/code&gt; described above.
The new part of this is the &lt;code class=&quot;highlighter-rouge&quot;&gt;showframe&lt;/code&gt; function, whose job it is to display the image frame (or any other visual information) to the user.
Typically, the actual &lt;code class=&quot;highlighter-rouge&quot;&gt;showframe&lt;/code&gt; function will need additional information such as where to render the image, but you can provide this information using anonymous functions.
We’ll see how that works in the next installment; below we’ll just create a simple “stub” function.&lt;/p&gt;

&lt;p&gt;Now we get to the callbacks, which we’ll “bind” to the step and play buttons:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function stept(inc, ctrls, state, showframe)
    if 1 &amp;lt;= state.t+inc &amp;lt;= state.tmax
        incrementt(inc, ctrls, state, showframe)
    else
        stop_playing!(state)
    end
end

function playt(inc, ctrls, state, showframe)
    if !(state.fps &amp;gt; 0)
        error(&quot;Frame rate is not positive&quot;)
    end
    stop_playing!(state)
    dt = 1/state.fps
    state.timer = TimeoutAsyncWork(i -&amp;gt; stept(inc, ctrls, state, showframe))
    start_timer(state.timer, iround(1000*dt), iround(1000*dt))
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;stept()&lt;/code&gt; increments the &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; frame by the specified amount (typically 1 or -1), while &lt;code class=&quot;highlighter-rouge&quot;&gt;playt()&lt;/code&gt; starts a timer that will call &lt;code class=&quot;highlighter-rouge&quot;&gt;stept&lt;/code&gt; at regular intervals.
The timer is stopped if play reaches the beginning or end of the movie.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;stop_playing!&lt;/code&gt; function checks to see whether we have an active timer, and if so stops it:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function stop_playing!(state::NavigationState)
    if !is(state.timer, nothing)
        stop_timer(state.timer)
        state.timer = nothing
    end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
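
&lt;p&gt;The bounds check in &lt;code class=&quot;highlighter-rouge&quot;&gt;stept&lt;/code&gt; and the enable/disable rule in &lt;code class=&quot;highlighter-rouge&quot;&gt;updatet&lt;/code&gt; can be tried out without any widgets.
The following is only a minimal sketch, written in the syntax of later Julia releases (&lt;code class=&quot;highlighter-rouge&quot;&gt;mutable struct&lt;/code&gt; rather than &lt;code class=&quot;highlighter-rouge&quot;&gt;type&lt;/code&gt;); the names &lt;code class=&quot;highlighter-rouge&quot;&gt;MiniState&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;enabled_buttons&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;step!&lt;/code&gt; are hypothetical and are not part of ImageView:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical, GUI-free stand-in for NavigationState (t axis only)
mutable struct MiniState
    tmax::Int   # number of frames in t
    t::Int      # current moment in time
end

# Mirrors the rule in updatet: which buttons would be enabled right now
enabled_buttons(s::MiniState) = (back = s.t &gt; 1, fwd = s.t &lt; s.tmax)

# Mirrors stept: move by inc only if the result stays within 1:tmax.
# Returns true if we moved, false if we hit either end of the movie.
function step!(s::MiniState, inc::Integer, showframe)
    if 1 &lt;= s.t + inc &lt;= s.tmax
        s.t += inc
        showframe(s)
        return true
    end
    return false
end

showframe = s -&gt; println(&quot;showframe t=&quot;, s.t)
s = MiniState(3, 1)
step!(s, -1, showframe)   # already at the first frame: returns false
step!(s, 1, showframe)    # prints &quot;showframe t=2&quot; and returns true
enabled_buttons(s)        # (back = true, fwd = true)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In the real GUI, the branch that returns &lt;code class=&quot;highlighter-rouge&quot;&gt;false&lt;/code&gt; here is where &lt;code class=&quot;highlighter-rouge&quot;&gt;stop_playing!&lt;/code&gt; is called, so that a running timer halts at either end of the movie.&lt;/p&gt;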

&lt;p&gt;An alternative way to handle playback without a timer would be in a loop, like this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function stept(inc, ctrls, state, showframe)
    if 1 &amp;lt;= state.t+inc &amp;lt;= state.tmax
        incrementt(inc, ctrls, state, showframe)
    end
end

function playt(inc, ctrls, state, showframe)
    state.isplaying = true
    while 1 &amp;lt;= state.t+inc &amp;lt;= state.tmax &amp;amp;&amp;amp; state.isplaying
        tcl_doevent()    # allow the stop button to take effect
        incrementt(inc, ctrls, state, showframe)
    end
    state.isplaying = false
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With this version we would use a single Boolean value to signal whether there is active playback.
A key point here is the call to &lt;code class=&quot;highlighter-rouge&quot;&gt;tcl_doevent()&lt;/code&gt;, which allows Tk to interrupt the execution of the loop to handle user interaction (in this case, clicking the stop button).
But with the timer that’s not necessary, and moreover the timer gives us control over the speed of playback.&lt;/p&gt;

&lt;p&gt;Finally, there are callbacks for the entry and slider widgets:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function sett(ctrls, state, showframe)
    tstr = get_value(ctrls.editt)
    try
        val = int(tstr)
        state.t = val
        updatet(ctrls, state)
        showframe(state)
    catch
        updatet(ctrls, state)
    end
end

function scalet(ctrls, state, showframe)
    state.t = get_value(ctrls.scalet)
    updatet(ctrls, state)
    showframe(state)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;sett&lt;/code&gt; runs when the user types an entry into the edit box; if the user types in nonsense like “foo”, it will gracefully reset it to the current position.&lt;/p&gt;
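
&lt;p&gt;As written, the code above only guards against text that is not an integer; it does not check that the number lies between 1 and &lt;code class=&quot;highlighter-rouge&quot;&gt;tmax&lt;/code&gt;.
One possible extension is sketched below. It assumes a later Julia release (1.0 or newer), where &lt;code class=&quot;highlighter-rouge&quot;&gt;tryparse&lt;/code&gt; returns &lt;code class=&quot;highlighter-rouge&quot;&gt;nothing&lt;/code&gt; on failure, and the helper name &lt;code class=&quot;highlighter-rouge&quot;&gt;parse_frame&lt;/code&gt; is hypothetical rather than part of ImageView:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: return a valid frame index, or fall back to
# `current` if the text is not an integer in the range 1:tmax
function parse_frame(str::AbstractString, current::Int, tmax::Int)
    val = tryparse(Int, strip(str))     # nothing if str is not an integer
    if val === nothing || !(1 &lt;= val &lt;= tmax)
        return current
    end
    return val
end

parse_frame(&quot;17&quot;, 5, 1000)     # 17
parse_frame(&quot;foo&quot;, 5, 1000)    # 5 (not a number, keep current position)
parse_frame(&quot;5000&quot;, 5, 1000)   # 5 (out of range, keep current position)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A callback like &lt;code class=&quot;highlighter-rouge&quot;&gt;sett&lt;/code&gt; could then assign the result to &lt;code class=&quot;highlighter-rouge&quot;&gt;state.t&lt;/code&gt; and call &lt;code class=&quot;highlighter-rouge&quot;&gt;updatet&lt;/code&gt; in either case, without needing a &lt;code class=&quot;highlighter-rouge&quot;&gt;try&lt;/code&gt; block.&lt;/p&gt;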

&lt;p&gt;There’s a complementary set of these functions for the &lt;code class=&quot;highlighter-rouge&quot;&gt;z&lt;/code&gt; controls.&lt;/p&gt;

&lt;p&gt;These callbacks implement the functionality of this “navigation” GUI.
The other main task is initialization.
We won’t cover this in gory detail (you are invited to browse the code), but let’s hit a few highlights.&lt;/p&gt;

&lt;h4 id=&quot;creating-the-buttons&quot;&gt;Creating the buttons&lt;/h4&gt;

&lt;p&gt;You can use image files (e.g., .png files) for your icons, but the ones here are created programmatically.
To do this, specify two colors, the “foreground” and “background”, as strings.
One also needs the &lt;code class=&quot;highlighter-rouge&quot;&gt;data&lt;/code&gt; array (of type &lt;code class=&quot;highlighter-rouge&quot;&gt;Bool&lt;/code&gt;), which is true for the pixels that should be colored by the foreground color, and false for the ones to be set to the background.
There’s also the &lt;code class=&quot;highlighter-rouge&quot;&gt;mask&lt;/code&gt; array, which can prevent the &lt;code class=&quot;highlighter-rouge&quot;&gt;data&lt;/code&gt; array from taking effect in any pixels marked as false in the &lt;code class=&quot;highlighter-rouge&quot;&gt;mask&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Given suitable &lt;code class=&quot;highlighter-rouge&quot;&gt;data&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;mask&lt;/code&gt; arrays (here we just set the mask to &lt;code class=&quot;highlighter-rouge&quot;&gt;trues&lt;/code&gt;), and color strings, we create the icon and assign it to a button like this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;icon = Tk.image(data, mask, &quot;gray70&quot;, &quot;black&quot;)  # background=gray70, foreground=black
ctrls.stop = Button(f, icon)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; is the “parent frame” that the navigation controller will be rendered in.
A frame is a container that organizes a collection of related GUI elements.
Later we’ll find out how to create one.&lt;/p&gt;

&lt;h4 id=&quot;assigning-callbacks-to-widgets&quot;&gt;Assigning callbacks to widgets&lt;/h4&gt;

&lt;p&gt;The “stop” and “play backwards” buttons look like this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bind(ctrls.stop, &quot;command&quot;, path -&amp;gt; stop_playing!(state))
bind(ctrls.playback, &quot;command&quot;, path -&amp;gt; playt(-1, ctrls, state, showframe))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;path&lt;/code&gt; input is generated by Tk/Tcl, but we don’t have to use it.
Instead, we use anonymous functions to pass the arguments relevant to this particular GUI instantiation.
Note that these two buttons share &lt;code class=&quot;highlighter-rouge&quot;&gt;state&lt;/code&gt;; that means that any changes made by one callback will affect the other.&lt;/p&gt;

&lt;h4 id=&quot;placing-the-buttons-in-the-frame-layout-management&quot;&gt;Placing the buttons in the frame (layout management)&lt;/h4&gt;

&lt;p&gt;Here our layout needs are quite simple, but I recommend that you read the &lt;a href=&quot;http://www.tkdocs.com/tutorial/concepts.html#geometry&quot;&gt;excellent&lt;/a&gt; &lt;a href=&quot;http://www.tkdocs.com/tutorial/grid.html&quot;&gt;tutorial&lt;/a&gt; on Tk’s &lt;code class=&quot;highlighter-rouge&quot;&gt;grid&lt;/code&gt; layout engine.
&lt;code class=&quot;highlighter-rouge&quot;&gt;grid&lt;/code&gt; provides a great deal of functionality missing in Matlab, and in particular allows flexible and polished GUI behavior when resizing the window.&lt;/p&gt;

&lt;p&gt;We position the stop button this way:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;grid(ctrls.stop, 1, stopindex, padx=3*pad, pady=pad)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;After the handle for the button itself, the next two inputs determine the row, column position of the widget.
Here the column position is set using a variable (an integer) whose value will depend on whether the z controls are present.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;pad&lt;/code&gt; settings just apply a bit of horizontal and vertical padding around the button.&lt;/p&gt;

&lt;p&gt;To position the slider widgets, we could do something like this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ctrls.scalez = Slider(f, 1:state.zmax)
grid(ctrls.scalez, 2, start:stop, sticky=&quot;we&quot;, padx=pad)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This positions them in row 2 of the frame’s grid, and has them occupy the range of columns (indicated by &lt;code class=&quot;highlighter-rouge&quot;&gt;start:stop&lt;/code&gt;) used by the button controls for the same &lt;code class=&quot;highlighter-rouge&quot;&gt;z&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; axis.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;sticky&lt;/code&gt; setting means that it will stretch to fill from West to East (left to right).&lt;/p&gt;

&lt;p&gt;In the main GUI we’ll use one more feature of &lt;code class=&quot;highlighter-rouge&quot;&gt;grid&lt;/code&gt;, so let’s cover it now.
This feature controls how regions of the window expand or shrink when the window is resized:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;grid_rowconfigure(win, 1, weight=1)
grid_columnconfigure(win, 1, weight=1)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This says that row 1, column 1 will expand at a rate of &lt;code class=&quot;highlighter-rouge&quot;&gt;1&lt;/code&gt; when the figure is made larger.
You can set different weights for different GUI components.
The default value is 0, indicating that it shouldn’t expand at all.
That’s what we want for this navigation frame, so that the buttons keep their size when the window is resized.
Larger weight values indicate that the given component should expand (or shrink) at faster rates.&lt;/p&gt;

&lt;h3 id=&quot;putting-it-all-together-and-testing-it-out&quot;&gt;Putting it all together and testing it out&lt;/h3&gt;

&lt;p&gt;We’ll place the navigation controls inside a Tk frame.
Let’s create one from the command line:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;using Tk
win = Toplevel()
f = Frame(win)
pack(f, expand=true, fill=&quot;both&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The first three lines create the window and the frame. &lt;code class=&quot;highlighter-rouge&quot;&gt;pack&lt;/code&gt; is an alternative layout engine to &lt;code class=&quot;highlighter-rouge&quot;&gt;grid&lt;/code&gt;, and slightly more convenient when all you want is to place a single item so that it fills its container.
(You can mix &lt;code class=&quot;highlighter-rouge&quot;&gt;pack&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;grid&lt;/code&gt; as long as they are operating on separate containers.
Here we’ll have a frame &lt;code class=&quot;highlighter-rouge&quot;&gt;pack&lt;/code&gt;ed in the window, and the widgets will be &lt;code class=&quot;highlighter-rouge&quot;&gt;grid&lt;/code&gt;ded inside the frame.)
After that fourth line, the window is rather tiny; the call to &lt;code class=&quot;highlighter-rouge&quot;&gt;pack&lt;/code&gt; causes the frame to expand to fill the whole window, but at the moment the frame has no contents, so the window is as small as it can be.&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/navigation1.jpg?raw=true&quot; alt=&quot;GUI snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;We need a &lt;code class=&quot;highlighter-rouge&quot;&gt;showframe&lt;/code&gt; callback; for now let’s create a very simple one that will help in testing:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;showframe = x -&amp;gt; println(&quot;showframe z=&quot;, x.z, &quot;, t=&quot;, x.t)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Next, load the GUI code (&lt;code class=&quot;highlighter-rouge&quot;&gt;using ImageView.Navigation&lt;/code&gt;) and create the &lt;code class=&quot;highlighter-rouge&quot;&gt;NavigationState&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;NavigationControls&lt;/code&gt; objects:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ctrls = NavigationControls()
state = NavigationState(40, 1000, 2, 5)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here we’ve set up a fake movie with 40 image slices in &lt;code class=&quot;highlighter-rouge&quot;&gt;z&lt;/code&gt;, and 1000 image stacks in &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Finally, we initialize the widgets:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;init_navigation!(f, ctrls, state, showframe)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/navigation2.jpg?raw=true&quot; alt=&quot;GUI snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Now when you click on buttons, or change the text in the entry boxes, you’ll see the GUI in action.
You can tell from the command line output, generated by &lt;code class=&quot;highlighter-rouge&quot;&gt;showframe&lt;/code&gt;, what’s happening internally:&lt;/p&gt;

&lt;p&gt;&lt;img src=&quot;https://github.com/JuliaLang/julialang.github.com/blob/master/blog/_posts/GUI_figures/navigation_repl.jpg?raw=true&quot; alt=&quot;GUI snapshot&quot; /&gt;&lt;/p&gt;

&lt;p&gt;Hopefully this demonstrates another nice feature of developing GUIs in Julia: it’s straightforward to build re-usable components.
This navigation frame can be added as an element to any window, and the grid layout manager takes care of the rest.
All you need to do is to include &lt;code class=&quot;highlighter-rouge&quot;&gt;ImageView/src/navigation.jl&lt;/code&gt; into your module, and you can make use of it with just a few lines of code.&lt;/p&gt;

&lt;p&gt;Not too hard, right? The next step is to render the image, which brings us into the domain of Cairo.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Passing Julia Callback Functions to C</title>
   <link href="http://julialang.org/blog/2013/05/callback"/>
   <updated>2013-05-10T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/05/callback</id>
   <content type="html">&lt;p&gt;One of the great strengths of Julia is that it is so easy to &lt;a href=&quot;http://docs.julialang.org/en/latest/manual/calling-c-and-fortran-code.html&quot;&gt;call C
code&lt;/a&gt; natively, with no special “glue” routines or overhead to marshal
arguments and convert return values.  For example, if you want to call
&lt;a href=&quot;http://www.gnu.org/software/gsl/&quot;&gt;GNU GSL&lt;/a&gt; to compute a special function
like a &lt;a href=&quot;http://linux.math.tifr.res.in/manuals/html/gsl-ref-html/gsl-ref_7.html&quot;&gt;Debye integral&lt;/a&gt;, it is as easy as:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;debye_1(x) = ccall((:gsl_sf_debye_1,:libgsl), Cdouble, (Cdouble,), x)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;at which point you can compute &lt;code class=&quot;highlighter-rouge&quot;&gt;debye_1(2)&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;debye_1(3.7)&lt;/code&gt;, and so
on.  (Even easier would be to use Jiahao Chen’s
&lt;a href=&quot;https://github.com/jiahao/GSL.jl&quot;&gt;GSL package&lt;/a&gt; for Julia, which has
already created such wrappers for you.)  This makes a vast array of
existing C libraries accessible to you in Julia (along with Fortran
libraries and other languages with C-accessible calling conventions).&lt;/p&gt;

&lt;p&gt;In fact, you can even go the other way around, passing Julia routines
to C, so that C code is calling Julia code in the form of &lt;em&gt;callback&lt;/em&gt;
functions.   For example, a C library for numerical integration might
expect you to pass the integrand as a &lt;em&gt;function&lt;/em&gt; argument, which the
library will then call to evaluate the integrand as many times as
needed to estimate the integral.  Callback functions are also natural
for optimization, root-finding, and many other numerical tasks, as well
as in many non-numerical problems.  The purpose of this blog post is to
illustrate the techniques for passing Julia functions as callbacks to
C routines, which is straightforward and efficient but requires some
lower-level understanding of how functions and other values are passed as
arguments.&lt;/p&gt;

&lt;p&gt;The code in this post requires Julia 0.2 (or a recent &lt;code class=&quot;highlighter-rouge&quot;&gt;git&lt;/code&gt; facsimile
thereof); the key features needed for callback functions (especially
&lt;code class=&quot;highlighter-rouge&quot;&gt;unsafe_pointer_to_objref&lt;/code&gt;) are not available in Julia 0.1.&lt;/p&gt;

&lt;h2 id=&quot;sorting-with-qsort&quot;&gt;Sorting with &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt;&lt;/h2&gt;

&lt;p&gt;Perhaps the most well-known example of a callback parameter is
provided by the
&lt;a href=&quot;http://pubs.opengroup.org/onlinepubs/009695399/functions/qsort.html&quot;&gt;qsort&lt;/a&gt;
function, part of the ANSI C standard library and declared in C as:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void qsort(void *base, size_t nmemb, size_t size,
           int(*compare)(const void *a, const void *b));
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;base&lt;/code&gt; argument is a pointer to an array of length &lt;code class=&quot;highlighter-rouge&quot;&gt;nmemb&lt;/code&gt;, with
elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;size&lt;/code&gt; bytes each.  &lt;code class=&quot;highlighter-rouge&quot;&gt;compare&lt;/code&gt; is a callback function which
takes pointers to two elements &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; and returns an integer
less/greater than zero if &lt;code class=&quot;highlighter-rouge&quot;&gt;a&lt;/code&gt; should appear before/after &lt;code class=&quot;highlighter-rouge&quot;&gt;b&lt;/code&gt; (or zero
if any order is permitted).  Now, suppose that we have a 1d array &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt;
of values in Julia that we want to sort using the &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; function
(rather than Julia’s built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;sort&lt;/code&gt; function).  Before we worry about
calling &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; and passing arguments, we need to write a comparison
function that works for some arbitrary type &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt;, e.g.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function mycompare{T}(a_::Ptr{T}, b_::Ptr{T})
    a = unsafe_load(a_)
    b = unsafe_load(b_)
    return a &amp;lt; b ? cint(-1) : a &amp;gt; b ? cint(+1) : cint(0)
end
cint(n) = convert(Cint, n)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Notice that we use the built-in function &lt;code class=&quot;highlighter-rouge&quot;&gt;unsafe_load&lt;/code&gt; to fetch the
values pointed to by the arguments &lt;code class=&quot;highlighter-rouge&quot;&gt;a_&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;b_&lt;/code&gt; (which is “unsafe”
because it will crash if these are not valid pointers, but &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt;
will always pass valid pointers).  Also, we have to be a little
careful about return values: &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; expects a function returning a C
&lt;code class=&quot;highlighter-rouge&quot;&gt;int&lt;/code&gt;, so we must be sure to return &lt;code class=&quot;highlighter-rouge&quot;&gt;Cint&lt;/code&gt; (the corresponding type in
Julia) via a call to &lt;code class=&quot;highlighter-rouge&quot;&gt;convert&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Now, how do we pass this to C?  A function pointer in C is essentially
just a pointer to the memory location of the machine code implementing
that function, whereas a function value &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt; (of type
&lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt;) in Julia is quite different.  Thanks to Julia’s &lt;a href=&quot;http://en.wikipedia.org/wiki/Just-in-time_compilation&quot;&gt;JIT
compilation&lt;/a&gt;
approach, a Julia function may not even be &lt;em&gt;compiled&lt;/em&gt; until the first
time it is called, and in general the &lt;em&gt;same&lt;/em&gt; Julia function may be
compiled into &lt;em&gt;multiple&lt;/em&gt; machine-code instantiations, which are
specialized for arguments of different types (e.g. different &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt; in
this case).  So, you can imagine that &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt; must internally
point to a rather complicated data structure (a &lt;code class=&quot;highlighter-rouge&quot;&gt;jl_function_t&lt;/code&gt; in
&lt;code class=&quot;highlighter-rouge&quot;&gt;julia.h&lt;/code&gt;, if you are interested), which holds information about the
argument types, the compiled versions (if any), and so on.  In
general, it must store a
&lt;a href=&quot;http://en.wikipedia.org/wiki/Closure_%28computer_science%29&quot;&gt;closure&lt;/a&gt;
with information about the environment in which the function was
defined; we will talk more about this below.  In any case, it is a
very different object than a simple pointer to machine code for one
set of argument types.  Fortunately, we can get the latter simply by
calling a &lt;a href=&quot;http://docs.julialang.org/en/latest/stdlib/base.html#c-interface&quot;&gt;built-in Julia function&lt;/a&gt; called &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;const mycompare_c = cfunction(mycompare, Cint, (Ptr{Cdouble}, Ptr{Cdouble}))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here, we pass &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; three arguments: the function &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt;,
the return type &lt;code class=&quot;highlighter-rouge&quot;&gt;Cint&lt;/code&gt;, and a tuple of the argument types, in this case to
sort an array of &lt;code class=&quot;highlighter-rouge&quot;&gt;Cdouble&lt;/code&gt; (&lt;code class=&quot;highlighter-rouge&quot;&gt;Float64&lt;/code&gt;) elements.  Julia compiles a version of
&lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt; specialized for these argument types (if it has not done
so already), and returns a &lt;code class=&quot;highlighter-rouge&quot;&gt;Ptr{Void}&lt;/code&gt; holding the address of the
machine code, &lt;em&gt;exactly&lt;/em&gt; what we need to pass to &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt;.  We are
now ready to call &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; on some sample data:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;A = [1.3, -2.7, 4.4, 3.1]
ccall(:qsort, Void, (Ptr{Cdouble}, Csize_t, Csize_t, Ptr{Void}),
      A, length(A), sizeof(eltype(A)), mycompare_c)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;After this executes, &lt;code class=&quot;highlighter-rouge&quot;&gt;A&lt;/code&gt; is changed to the sorted array &lt;code class=&quot;highlighter-rouge&quot;&gt;[ -2.7, 1.3,
3.1, 4.4]&lt;/code&gt;.  Note that Julia knows how to convert an array
&lt;code class=&quot;highlighter-rouge&quot;&gt;A::Vector{Cdouble}&lt;/code&gt; into a &lt;code class=&quot;highlighter-rouge&quot;&gt;Ptr{Cdouble}&lt;/code&gt;, how to compute the &lt;code class=&quot;highlighter-rouge&quot;&gt;sizeof&lt;/code&gt;
a type in bytes (identical to C’s &lt;code class=&quot;highlighter-rouge&quot;&gt;sizeof&lt;/code&gt; operator), and so on.  For fun,
try inserting a &lt;code class=&quot;highlighter-rouge&quot;&gt;println(&quot;mycompare($a,$b)&quot;)&lt;/code&gt; line into &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt;, which
will allow you to see the comparisons that &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; is performing (and to verify
that it is really calling the Julia function that you passed to it).&lt;/p&gt;
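
&lt;p&gt;As a rough sketch of that experiment, the comparison function might be instrumented as
follows.  The name &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare_verbose&lt;/code&gt; is made up for this illustration, and its body is
only assumed to resemble &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt;; the calls themselves are the ones used above:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical instrumented comparison function (not part of the original example):
# it prints every pair of values that qsort asks it to compare.
function mycompare_verbose(a_::Ptr{Cdouble}, b_::Ptr{Cdouble})
    a = unsafe_load(a_)
    b = unsafe_load(b_)
    println(&quot;mycompare($a,$b)&quot;)
    return a &amp;lt; b ? cint(-1) : a &amp;gt; b ? cint(+1) : cint(0)
end
mycompare_verbose_c = cfunction(mycompare_verbose, Cint, (Ptr{Cdouble}, Ptr{Cdouble}))

A = [1.3, -2.7, 4.4, 3.1]
ccall(:qsort, Void, (Ptr{Cdouble}, Csize_t, Csize_t, Ptr{Void}),
      A, length(A), sizeof(eltype(A)), mycompare_verbose_c)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The number and order of the comparisons printed depend on the C library’s &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt;
implementation, so the output will differ between systems.&lt;/p&gt;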

&lt;h2 id=&quot;the-problem-with-closures&quot;&gt;The problem with closures&lt;/h2&gt;

&lt;p&gt;We aren’t done yet, however.  If you start passing callback functions to
C routines, it won’t be long before you discover that &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; doesn’t
always work.  For example, suppose we tried to declare our comparison
function inline, via:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;mycomp = cfunction((a_,b_) -&amp;gt; unsafe_load(a_) &amp;lt; unsafe_load(b_) ? 
                              cint(-1) : cint(+1),
                   Cint, (Ptr{Cdouble}, Ptr{Cdouble}))
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Julia barfs on this, printing &lt;code class=&quot;highlighter-rouge&quot;&gt;ERROR: function is not yet c-callable&lt;/code&gt;.  In
general, &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; only works for “top-level” functions: named
functions defined in the top-level (global or module) scope, but &lt;em&gt;not&lt;/em&gt;
anonymous (&lt;code class=&quot;highlighter-rouge&quot;&gt;args -&amp;gt; value&lt;/code&gt;) functions and not functions defined within
other functions (“nested” functions).  The reason for this stems from
one important concept in computer science: a
&lt;a href=&quot;http://en.wikipedia.org/wiki/Closure_%28computer_science%29&quot;&gt;closure&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;To understand the need for closures, and the difficulty they pose for
callback functions, suppose that we wanted to provide a nicer interface
for &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt;, one that permitted the user to simply pass a &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt; function
returning &lt;code class=&quot;highlighter-rouge&quot;&gt;true&lt;/code&gt; or &lt;code class=&quot;highlighter-rouge&quot;&gt;false&lt;/code&gt; while hiding all of the low-level business with
pointers, &lt;code class=&quot;highlighter-rouge&quot;&gt;Cint&lt;/code&gt;, and so on.  We might &lt;em&gt;like&lt;/em&gt; to do something of the form:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function qsort!{T}(A::Vector{T}, lessthan::Function)
    function mycompare(a_::Ptr{T}, b_::Ptr{T})
        a = unsafe_load(a_)
        b = unsafe_load(b_)
        return lessthan(a, b) ? cint(-1) : cint(+1)
    end
    mycompare_c = cfunction(mycompare, Cint, (Ptr{T}, Ptr{T}))
    ccall(:qsort, Void, (Ptr{T}, Csize_t, Csize_t, Ptr{Void}),
          A, length(A), sizeof(T), mycompare_c)
    A
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Then we could simply call &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!([1.3, -2.7, 4.4, 3.1], &amp;lt;)&lt;/code&gt; to sort
in ascending order using the built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;&lt;/code&gt; comparison, or any other
comparison function we wanted.  Unfortunately &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; will again
barf when you try to call &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt;, and it is no longer so difficult
to understand why.  Notice that the nested &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt; function is no
longer self-contained: it uses the variable &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt; from the
surrounding scope.  This is a common pattern for nested functions and
anonymous functions: often, they are parameterized by local variables
in the environment where the function is defined.  Technically, the
ability to have this kind of dependency is provided by &lt;a href=&quot;http://en.wikipedia.org/wiki/Scope_%28computer_science%29&quot;&gt;lexical
scoping&lt;/a&gt; in
a programming language like Julia, and is typical of any language in
which functions are
“&lt;a href=&quot;http://en.wikipedia.org/wiki/First-class_function&quot;&gt;first-class&lt;/a&gt;”
objects.  In order to support lexical scoping, a Julia &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt;
object needs to internally carry around a pointer to the variables in
the enclosing environment, and this encapsulation is called a
&lt;em&gt;closure&lt;/em&gt;.&lt;/p&gt;
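
&lt;p&gt;A closure can be seen in isolation, without any C code, in a small sketch like the one
below.  The helper &lt;code class=&quot;highlighter-rouge&quot;&gt;compare_by&lt;/code&gt; is hypothetical and exists only for this illustration: the
anonymous function it returns keeps using the local variable &lt;code class=&quot;highlighter-rouge&quot;&gt;key&lt;/code&gt; after &lt;code class=&quot;highlighter-rouge&quot;&gt;compare_by&lt;/code&gt; has
returned, in the same way that the nested &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt; uses &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical helper: returns a comparison function that remembers `key`.
function compare_by(key::Function)
    return (a, b) -&amp;gt; key(a) &amp;lt; key(b)
end

by_magnitude = compare_by(abs)
by_magnitude(1.3, -2.7)   # true, because abs(1.3) &amp;lt; abs(-2.7)
1.3 &amp;lt; -2.7                # false: the plain comparison sees the signs
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;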

&lt;p&gt;In contrast, a C function pointer is &lt;em&gt;not&lt;/em&gt; a closure.  It doesn’t
enclose a pointer to the environment in which the function was
defined, or anything else for that matter; it is just the address of a
stream of instructions.  This makes it hard, in C, to write functions
to transform other functions (&lt;a href=&quot;http://en.wikipedia.org/wiki/Higher-order_function&quot;&gt;higher-order
functions&lt;/a&gt;) or to
parameterize functions by local variables.  This apparently leaves us
with two options, neither of which is especially attractive:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;
    &lt;p&gt;We could store &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt; in a global variable, and reference that
from a top-level &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt; function.  (This is the traditional solution
for C programmers calling &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; with parameterized comparison functions.)
The problem with this strategy is that it is not &lt;a href=&quot;http://en.wikipedia.org/wiki/Reentrancy_%28computing%29&quot;&gt;re-entrant&lt;/a&gt;: it prevents us from calling &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt;
recursively (e.g. if the comparison function itself needs to do a sort, for
some complicated data structure), or from calling &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt; from multiple
threads (when a future Julia version supports shared-memory parallelism).
Still, this is better than nothing.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Every time &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt; is called, Julia could JIT-compile a new version
of &lt;code class=&quot;highlighter-rouge&quot;&gt;mycompare&lt;/code&gt;, which hard-codes the reference to the &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt;
argument passed on that call.  This is technically possible and has
been implemented in some languages (e.g. reportedly &lt;a href=&quot;http://www.gnu.org/software/guile/manual/html_node/Dynamic-FFI.html&quot;&gt;GNU
Guile&lt;/a&gt;
and &lt;a href=&quot;http://luajit.org/ext_ffi_semantics.html&quot;&gt;LuaJIT&lt;/a&gt;
do something like this).  However, this strategy
comes at a price: it requires that callbacks be recompiled every time
a parameter in them changes, which is not true of the global-variable
strategy.  Anyway, it is not implemented yet in Julia.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ul&gt;
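
&lt;p&gt;The first option might look roughly like the following sketch.  The names
&lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_lessthan&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_global_compare&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_global!&lt;/code&gt; are invented for this
illustration; the calls are the same ones used earlier:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical global holding the comparison that is currently in use.
qsort_lessthan = isless

# Top-level function, so cfunction accepts it; the comparison is read from the global.
function qsort_global_compare{T}(a_::Ptr{T}, b_::Ptr{T})
    a = unsafe_load(a_)
    b = unsafe_load(b_)
    return qsort_lessthan(a, b) ? cint(-1) : cint(+1)
end

function qsort_global!{T}(A::Vector{T}, lessthan::Function)
    global qsort_lessthan
    qsort_lessthan = lessthan   # not re-entrant: only one comparison at a time
    compare_c = cfunction(qsort_global_compare, Cint, (Ptr{T}, Ptr{T}))
    ccall(:qsort, Void, (Ptr{T}, Csize_t, Csize_t, Ptr{Void}),
          A, length(A), sizeof(T), compare_c)
    return A
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because every call overwrites the single global, a comparison function that itself calls
&lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_global!&lt;/code&gt; would replace the comparison used by the outer sort while that sort is
still running, which is precisely the re-entrancy problem described above.&lt;/p&gt;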

&lt;p&gt;Fortunately, there is often a &lt;em&gt;third&lt;/em&gt; option, because C programmers
long ago recognized these limitations of function pointers, and
devised a workaround: most modern C callback interfaces allow
arbitrary data to be passed through to the callback via a
“pass-through” (or “thunk”) pointer parameter.  As explained in the
next section, we can exploit this technique in Julia to pass a “true”
closure as a callback.&lt;/p&gt;

&lt;h2 id=&quot;passing-closures-via-pass-through-pointers&quot;&gt;Passing closures via pass-through pointers&lt;/h2&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; interface is nowadays considered rather antiquated.  Years
ago, it was supplemented on BSD-Unix systems, and eventually in GNU
libc, by a function called &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; that solves the problem of passing
parameters to the callback in a re-entrant way.  This is how the BSD (e.g. MacOS)
&lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; function &lt;a href=&quot;https://developer.apple.com/library/mac/documentation/Darwin/Reference/ManPages/man3/qsort_r.3.html&quot;&gt;is defined&lt;/a&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void qsort_r(void *base, size_t nmemb, size_t size, void *thunk,
             int (*compare)(void *thunk, const void *a, const void *b));
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Compared to &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt;, there is an extra &lt;code class=&quot;highlighter-rouge&quot;&gt;thunk&lt;/code&gt; parameter, and this is
&lt;em&gt;passed through&lt;/em&gt; to the &lt;code class=&quot;highlighter-rouge&quot;&gt;compare&lt;/code&gt; function as its first argument.  In this
way, you can pass a pointer to &lt;em&gt;arbitrary&lt;/em&gt; data through to your callback,
and we can exploit this to pass a closure through for an arbitrary Julia
callback.&lt;/p&gt;

&lt;p&gt;All we need is a way to convert a Julia &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt; into an opaque
&lt;code class=&quot;highlighter-rouge&quot;&gt;Ptr{Void}&lt;/code&gt; so that we can pass it through to our callback, and then a
way to convert the opaque pointer back into a &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt;.  The former
is automatic if we simply declare the &lt;code class=&quot;highlighter-rouge&quot;&gt;ccall&lt;/code&gt; argument as type &lt;code class=&quot;highlighter-rouge&quot;&gt;Any&lt;/code&gt;
(which passes the argument as an opaque Julia object pointer), and the
latter is accomplished by the built-in function
&lt;code class=&quot;highlighter-rouge&quot;&gt;unsafe_pointer_to_objref&lt;/code&gt;.  (Technically, we could use type
&lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt; or an explicit call to &lt;code class=&quot;highlighter-rouge&quot;&gt;pointer_from_objref&lt;/code&gt; instead of
&lt;code class=&quot;highlighter-rouge&quot;&gt;Any&lt;/code&gt;.)  Using these, we can now define a working high-level &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt;
function that takes an arbitrary &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt; comparison-function
argument:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function qsort!_compare{T}(lessthan_::Ptr{Void}, a_::Ptr{T}, b_::Ptr{T})
    a = unsafe_load(a_)
    b = unsafe_load(b_)
    lessthan = unsafe_pointer_to_objref(lessthan_)::Function
    return lessthan(a, b) ? cint(-1) : cint(+1)
end

function qsort!{T}(A::Vector{T}, lessthan::Function=&amp;lt;)
    compare_c = cfunction(qsort!_compare, Cint, (Ptr{Void}, Ptr{T}, Ptr{T}))
    ccall(:qsort_r, Void, (Ptr{T}, Csize_t, Csize_t, Any, Ptr{Void}),
          A, length(A), sizeof(T), lessthan, compare_c)
    return A
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!_compare&lt;/code&gt; is a top-level function, so &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; has no
problem with it, and it will only be compiled once per type &lt;code class=&quot;highlighter-rouge&quot;&gt;T&lt;/code&gt; to be
sorted (rather than once per call to &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt; or per &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt;
function).  We use the explicit &lt;code class=&quot;highlighter-rouge&quot;&gt;::Function&lt;/code&gt; assertion to tell
the compiler that we will only pass &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt; pointers in
&lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan_&lt;/code&gt;. Note that we gave the &lt;code class=&quot;highlighter-rouge&quot;&gt;lessthan&lt;/code&gt; argument a default value
of &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;lt;&lt;/code&gt; (default arguments being a &lt;a href=&quot;https://github.com/JuliaLang/julia/issues/1817&quot;&gt;recent
feature&lt;/a&gt; added to
Julia).&lt;/p&gt;

&lt;p&gt;We can now do &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!([1.3, -2.7, 4.4, 3.1])&lt;/code&gt; and it will
return the array sorted in ascending order, or &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!([1.3, -2.7,
4.4, 3.1], &amp;gt;)&lt;/code&gt; to sort in descending order.&lt;/p&gt;

&lt;h4 id=&quot;warning-qsort_r-is-not-portable&quot;&gt;Warning: &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; is not portable&lt;/h4&gt;

&lt;p&gt;The example above has one major problem that has nothing to do with
Julia: the &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; function is not portable.  The above example
won’t work on Windows, since the Windows C library doesn’t define
&lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; (instead, it has a function called
&lt;a href=&quot;http://msdn.microsoft.com/en-us/library/4xc60xas%28VS.80%29.aspx&quot;&gt;qsort_s&lt;/a&gt;,
which of course uses an argument order incompatible with &lt;em&gt;both&lt;/em&gt; the
BSD and GNU &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; functions).  Worse, it will crash on GNU/Linux
systems, which &lt;em&gt;do&lt;/em&gt; provide &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; but with an
&lt;a href=&quot;http://www.memoryhole.net/kyle/2009/11/qsort_r.html&quot;&gt;incompatible&lt;/a&gt;
&lt;a href=&quot;http://www.cygwin.com/ml/libc-alpha/2008-12/msg00008.html&quot;&gt;calling
convention&lt;/a&gt;.  As
a result, it is difficult to call &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; in a way that works
on both GNU/Linux and BSD (e.g. MacOS) systems.  This is how
glibc’s &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; is defined:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;void qsort_r(void *base, size_t nmemb, size_t size,
             int (*compare)(const void *a, const void *b, void *thunk),
             void *thunk);
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note that the position of the &lt;code class=&quot;highlighter-rouge&quot;&gt;thunk&lt;/code&gt; argument is moved, both in &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; itself and
in the comparison function.   So, the corresponding &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort!&lt;/code&gt; Julia code on
GNU/Linux systems should be:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function qsort!_compare{T}(a_::Ptr{T}, b_::Ptr{T}, lessthan_::Ptr{Void})
    a = unsafe_load(a_)
    b = unsafe_load(b_)
    lessthan = unsafe_pointer_to_objref(lessthan_)::Function
    return lessthan(a, b) ? cint(-1) : cint(+1)
end

function qsort!{T}(A::Vector{T}, lessthan::Function=&amp;lt;)
    compare_c = cfunction(qsort!_compare, Cint, (Ptr{T}, Ptr{T}, Ptr{Void}))
    ccall(:qsort_r, Void, (Ptr{T}, Csize_t, Csize_t, Ptr{Void}, Any),
          A, length(A), sizeof(T), compare_c, lessthan)
    return A
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If you really needed to call &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt; from Julia, you could use the
above definitions if &lt;code class=&quot;highlighter-rouge&quot;&gt;OS_NAME == :Linux&lt;/code&gt; and the BSD definitions
otherwise, with a third version using &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_s&lt;/code&gt; on Windows, but
fortunately there is not much need as Julia comes with its own
perfectly adequate &lt;code class=&quot;highlighter-rouge&quot;&gt;sort&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;sort!&lt;/code&gt; routines.&lt;/p&gt;

&lt;h2 id=&quot;passing-closures-in-data-structures&quot;&gt;Passing closures in data structures&lt;/h2&gt;

&lt;p&gt;As another example that is oriented more towards numerical
computations, we’ll examine how we might call the numerical integration
routines in the &lt;a href=&quot;http://www.gnu.org/software/gsl/&quot;&gt;GNU Scientific Library
(GSL)&lt;/a&gt;.  There is already a &lt;a href=&quot;https://github.com/jiahao/GSL.jl&quot;&gt;GSL
package&lt;/a&gt; that handles the wrapper
work below for you, but it is instructive to look at how this is
implemented because GSL simulates closures in a slightly different
way, with data structures.&lt;/p&gt;

&lt;p&gt;Like most modern C libraries accepting callbacks, GSL uses a &lt;code class=&quot;highlighter-rouge&quot;&gt;void*&lt;/code&gt; pass-through
parameter to allow arbitrary data to be passed through to the callback routine,
and we can use that to support arbitrary closures in Julia.   Unlike &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort_r&lt;/code&gt;,
however, GSL wraps both the C function pointer and the pass-through pointer in
a data structure called &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_function&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;typedef struct {
    double (*function)(double x, void *params);
    void *params;
} gsl_function;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Using the techniques above, we can easily declare a &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Function&lt;/code&gt; type in Julia
that mirrors this C type, with a constructor &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Function(f::Function)&lt;/code&gt; that
creates a wrapper around an arbitrary Julia function &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function gsl_function_wrap(x::Cdouble, params::Ptr{Void})
    f = unsafe_pointer_to_objref(params)::Function
    convert(Cdouble, f(x))::Cdouble
end
const gsl_function_wrap_c = cfunction(gsl_function_wrap,
                                      Cdouble, (Cdouble, Ptr{Void}))

type GSL_Function
    func::Ptr{Void}
    params::Any
    GSL_Function(f::Function) = new(gsl_function_wrap_c, f)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;One subtlety with the above code is that we need to explicitly
&lt;code class=&quot;highlighter-rouge&quot;&gt;convert&lt;/code&gt; the return value of &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; to a &lt;code class=&quot;highlighter-rouge&quot;&gt;Cdouble&lt;/code&gt; (in case the caller’s
code returns some other numeric type for some &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, such as an &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt;).
Moreover, we need to explicitly assert (&lt;code class=&quot;highlighter-rouge&quot;&gt;::Cdouble&lt;/code&gt;) that the result
of the &lt;code class=&quot;highlighter-rouge&quot;&gt;convert&lt;/code&gt; was a &lt;code class=&quot;highlighter-rouge&quot;&gt;Cdouble&lt;/code&gt;.  As with the &lt;code class=&quot;highlighter-rouge&quot;&gt;qsort&lt;/code&gt; example, this
is because &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; only works if Julia can guarantee that
&lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_function_wrap&lt;/code&gt; returns the specified &lt;code class=&quot;highlighter-rouge&quot;&gt;Cdouble&lt;/code&gt; type, and
Julia cannot infer the return type of &lt;code class=&quot;highlighter-rouge&quot;&gt;convert&lt;/code&gt; since it does not
know the return type of &lt;code class=&quot;highlighter-rouge&quot;&gt;f(x)&lt;/code&gt;.&lt;/p&gt;
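
&lt;p&gt;To see why the conversion matters, consider a hypothetical objective that returns an
integer rather than a floating-point value (the names &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;y&lt;/code&gt; are used only for this
small illustration):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;f = x -&amp;gt; 1                              # returns an Int for every x
y = convert(Cdouble, f(0.5))::Cdouble   # 1.0, which is safe to hand back to C
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;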

&lt;p&gt;Given the above definitions, it is a simple matter to pass this to the
&lt;a href=&quot;http://www.gnu.org/software/gsl/manual/html_node/QAG-adaptive-integration.html&quot;&gt;GSL
adaptive-integration&lt;/a&gt;
routines in a wrapper function &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_integration_qag&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function gsl_integration_qag(f::Function, a::Real, b::Real, epsrel::Real=1e-12,
                             maxintervals::Integer=10^7)
    s = ccall((:gsl_integration_workspace_alloc,:libgsl), Ptr{Void}, (Csize_t,),
              maxintervals)
    result = Array(Cdouble,1)
    abserr = Array(Cdouble,1)
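    # GSL's gsl_integration_qag takes both epsabs and epsrel; pass epsabs = 0.0
    # so that only the relative tolerance is used
    ccall((:gsl_integration_qag,:libgsl), Cint,
          (Ptr{GSL_Function}, Cdouble, Cdouble, Cdouble, Cdouble, Csize_t, Cint,
           Ptr{Void}, Ptr{Cdouble}, Ptr{Cdouble}),
          &amp;amp;GSL_Function(f), a, b, 0.0, epsrel, maxintervals, 1, s, result, abserr)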
    ccall((:gsl_integration_workspace_free,:libgsl), Void, (Ptr{Void},), s)
    return (result[1], abserr[1])
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note that &lt;code class=&quot;highlighter-rouge&quot;&gt;&amp;amp;GSL_Function(f)&lt;/code&gt; passes a pointer to a &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Function&lt;/code&gt;
“struct” containing a pointer to &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_function_wrap_c&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt;, corresponding
to the &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_function*&lt;/code&gt; argument in C.  The return value is a tuple of the estimated
integral and an estimated error.&lt;/p&gt;

&lt;p&gt;For example, &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_integration_qag(cos, 0, 1)&lt;/code&gt; returns
&lt;code class=&quot;highlighter-rouge&quot;&gt;(0.8414709848078965,9.34220461887732e-15)&lt;/code&gt;, which matches the
exact integral &lt;code class=&quot;highlighter-rouge&quot;&gt;sin(1)&lt;/code&gt; to machine precision.&lt;/p&gt;

&lt;h2 id=&quot;taking-out-the-trash-or-not&quot;&gt;Taking out the trash (or not)&lt;/h2&gt;

&lt;p&gt;In the above examples, we pass an opaque pointer (object reference) to a
Julia &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt; into C.  Whenever one passes pointers to Julia data into C
code, one has to ensure that the Julia data is not garbage-collected until
the C code is done with it, and functions are no exception to this rule.
An anonymous function that is no longer referred to by any Julia variable
may be garbage collected, at which point any C pointers to it become invalid.&lt;/p&gt;

&lt;p&gt;This sounds scary, but in practice you don’t need to worry about it very often,
because Julia guarantees that &lt;code class=&quot;highlighter-rouge&quot;&gt;ccall&lt;/code&gt; arguments won’t be garbage-collected until
the &lt;code class=&quot;highlighter-rouge&quot;&gt;ccall&lt;/code&gt; exits.  So, in all of the above examples, we are safe: the &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt;
only needs to live as long as the &lt;code class=&quot;highlighter-rouge&quot;&gt;ccall&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The only danger arises when you pass a function pointer to C and the C code
&lt;em&gt;saves the pointer&lt;/em&gt; in some data structure which it will use in a &lt;em&gt;later&lt;/em&gt; &lt;code class=&quot;highlighter-rouge&quot;&gt;ccall&lt;/code&gt;.
In that case, you are responsible for ensuring that the &lt;code class=&quot;highlighter-rouge&quot;&gt;Function&lt;/code&gt; object stays alive
(is referred to by some Julia variable) for as long as the C code might need it.&lt;/p&gt;

&lt;p&gt;For example, in the GSL &lt;a href=&quot;http://www.gnu.org/software/gsl/manual/html_node/One-dimensional-Minimization.html&quot;&gt;one-dimensional minimization
interface&lt;/a&gt;,
you don’t simply pass your objective function to a minimization
routine and wait until it is minimized.  Instead, you call a GSL
routine to create a “minimizer object”, store your function pointer in this
object, call routines to iterate the minimization, and then deallocate the
minimizer when you are done.  The Julia function must not be garbage-collected
until this process is complete.  The easiest way to ensure this is to create
a Julia wrapper type around the minimizer object that stores an &lt;em&gt;explicit&lt;/em&gt;
reference to the Julia function, like this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type GSL_Minimizer
    m::Ptr{Void} # the gsl_min_fminimizer pointer
    f::Any  # explicit reference to objective, to prevent garbage-collection
    function GSL_Minimizer(t)
        m = ccall((:gsl_min_fminimizer_alloc,:libgsl), Ptr{Void}, (Ptr{Void},), t)
        p = new(m, nothing)
        finalizer(p, p -&amp;gt; ccall((:gsl_min_fminimizer_free,:libgsl),
                                Void, (Ptr{Void},), p.m))
        p
    end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This wraps a &lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_min_fminimizer&lt;/code&gt; object of type &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt;, with a
placeholder &lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; to store a reference to the objective function (once
it is set below), and registers a &lt;code class=&quot;highlighter-rouge&quot;&gt;finalizer&lt;/code&gt; to deallocate the GSL object
when the &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Minimizer&lt;/code&gt; is garbage-collected.  The parameter &lt;code class=&quot;highlighter-rouge&quot;&gt;t&lt;/code&gt; is
used to specify the minimization algorithm, which could default to
Brent’s algorithm via:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;const gsl_brent = unsafe_load(cglobal((:gsl_min_fminimizer_brent,:libgsl), Ptr{Void}))
GSL_Minimizer() = GSL_Minimizer(gsl_brent)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;(The call to &lt;code class=&quot;highlighter-rouge&quot;&gt;cglobal&lt;/code&gt; yields a pointer to the
&lt;code class=&quot;highlighter-rouge&quot;&gt;gsl_min_fminimizer_brent&lt;/code&gt; global variable in GSL, which we then
dereference to get the &lt;em&gt;actual&lt;/em&gt; pointer via &lt;code class=&quot;highlighter-rouge&quot;&gt;unsafe_load&lt;/code&gt;.)&lt;/p&gt;

&lt;p&gt;Then, when we set the function to minimize (the “objective”), we store
an extra reference to it in the &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Minimizer&lt;/code&gt; to prevent
garbage-collection for the lifetime of the &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Minimizer&lt;/code&gt;, again
using the &lt;code class=&quot;highlighter-rouge&quot;&gt;GSL_Function&lt;/code&gt; type defined above to wrap the callback:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function gsl_minimizer_set!(m::GSL_Minimizer, f, x0, xmin, xmax)
    ccall((:gsl_min_fminimizer_set,:libgsl), Cint,
          (Ptr{Void}, Ptr{GSL_Function}, Cdouble, Cdouble, Cdouble),
          m.m, &amp;amp;GSL_Function(f), x0, xmin, xmax)
    m.f = f
    m
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;There are then various GSL routines to iterate the minimizer and to check the
current &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, objective value, or bounds on the minimum, which are convenient to wrap:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;gsl_minimizer_iterate!(m::GSL_Minimizer) =
    ccall((:gsl_min_fminimizer_iterate,:libgsl), Cint, (Ptr{Void},), m.m)

gsl_minimizer_x(m::GSL_Minimizer) =
    ccall((:gsl_min_fminimizer_x_minimum,:libgsl), Cdouble, (Ptr{Void},), m.m)

gsl_minimizer_f(m::GSL_Minimizer) =
    ccall((:gsl_min_fminimizer_f_minimum,:libgsl), Cdouble, (Ptr{Void},), m.m)

gsl_minimizer_xmin(m::GSL_Minimizer) =
    ccall((:gsl_min_fminimizer_x_lower,:libgsl), Cdouble, (Ptr{Void},), m.m)
gsl_minimizer_xmax(m::GSL_Minimizer) =
    ccall((:gsl_min_fminimizer_x_upper,:libgsl), Cdouble, (Ptr{Void},), m.m)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Putting all of these together, we can minimize a simple function &lt;code class=&quot;highlighter-rouge&quot;&gt;sin(x)&lt;/code&gt; in 
the interval [-3,1], with a starting guess -1, via:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;m = GSL_Minimizer()
gsl_minimizer_set!(m, sin, -1, -3, 1)
while gsl_minimizer_xmax(m) - gsl_minimizer_xmin(m) &amp;gt; 1e-6
    println(&quot;iterating at x = $(gsl_minimizer_x(m))&quot;)
    gsl_minimizer_iterate!(m)
end
println(&quot;found minimum $(gsl_minimizer_f(m)) at x = $(gsl_minimizer_x(m))&quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;After a few iterations, it prints &lt;code class=&quot;highlighter-rouge&quot;&gt;found minimum -1.0 at x =
-1.5707963269964016&lt;/code&gt;, which locates the minimum (at x = −π/2) correctly to
about 10 digits.&lt;/p&gt;

&lt;p&gt;At this point, I will shamelessly plug my own &lt;a href=&quot;https://github.com/stevengj/NLopt.jl&quot;&gt;NLopt
package&lt;/a&gt; for Julia, which wraps
around my free/open-source &lt;a href=&quot;http://ab-initio.mit.edu/nlopt&quot;&gt;NLopt&lt;/a&gt; library
to provide many more optimization algorithms than GSL, with perhaps a nicer
interface.   However, the techniques used to pass callback functions to
NLopt are actually quite similar to those used for GSL.&lt;/p&gt;

&lt;p&gt;An even more complicated version of these techniques can be found in
the &lt;a href=&quot;https://github.com/stevengj/PyCall.jl&quot;&gt;PyCall package&lt;/a&gt; to call
Python from Julia.  In order to pass a Julia function to Python, we
again use &lt;code class=&quot;highlighter-rouge&quot;&gt;cfunction&lt;/code&gt; on a wrapper function that handles the type
conversions and so on, and pass the actual Julia closure through via a
pass-through pointer.  But in that case, the pass-through pointer
consists of a Python object that has been created with a new type that
allows it to wrap a Julia object, and garbage-collection is deferred
by storing the Julia object in a global dictionary of saved objects
(removing it via the Python destructor of the new type).  That is all
somewhat tricky stuff and beyond the scope of this blog post; I only
mention it to illustrate the fact that it is possible to implement
quite complex inter-language calling behaviors purely in Julia by
building on the above techniques.&lt;/p&gt;
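
&lt;p&gt;The idea of deferring garbage-collection with a global dictionary can be sketched in a
few lines.  This is not PyCall’s actual code; the names &lt;code class=&quot;highlighter-rouge&quot;&gt;saved_objects&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;save_object&lt;/code&gt;, and
&lt;code class=&quot;highlighter-rouge&quot;&gt;release_object&lt;/code&gt; are hypothetical and only illustrate the principle:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Hypothetical registry of Julia objects that external code still refers to.
const saved_objects = Dict{Any,Any}()

# Keep `obj` alive until release_object is called with the same key.
function save_object(key, obj)
    saved_objects[key] = obj
    return obj
end

# Drop the saved reference, so that `obj` may be garbage-collected again.
function release_object(key)
    delete!(saved_objects, key)
    return nothing
end

f = save_object(:objective, x -&amp;gt; sin(x))   # safe to hand to C and keep there
release_object(:objective)                    # once the C side is done with it
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As long as an object is stored in the dictionary it is reachable from a global
variable, so the garbage collector will not free it, no matter how long the external
code holds on to the pointer.&lt;/p&gt;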
</content>
 </entry>
 
 <entry>
   <title>Put This In Your Pipe</title>
   <link href="http://julialang.org/blog/2013/04/put-this-in-your-pipe"/>
   <updated>2013-04-08T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/04/put-this-in-your-pipe</id>
   <content type="html">
&lt;p&gt;In a &lt;a href=&quot;http://julialang.org/blog/2012/03/shelling-out-sucks&quot;&gt;previous post&lt;/a&gt;, I talked about why “shelling out” to spawn a pipeline of external programs via an intermediate shell is a common cause of bugs, security holes, unnecessary overhead, and silent failures.
But it’s so convenient!
Why can’t running pipelines of external programs be convenient &lt;em&gt;and&lt;/em&gt; safe?
Well, there’s no real reason, actually.
The shell itself manages to construct and execute pipelines quite well.
In principle, there’s nothing stopping high-level languages from doing it at least as well as shells do – the common ones just don’t by default, instead requiring users to make the extra effort to use external programs safely and correctly.
There are two major impediments:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Some moderately tricky low-level UNIX plumbing using the &lt;a href=&quot;https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/pipe.2.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;pipe&lt;/code&gt;&lt;/a&gt;, &lt;a href=&quot;https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/dup2.2.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;dup2&lt;/code&gt;&lt;/a&gt;, &lt;a href=&quot;https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/fork.2.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;fork&lt;/code&gt;&lt;/a&gt;, &lt;a href=&quot;https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/close.2.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;close&lt;/code&gt;&lt;/a&gt;, and &lt;a href=&quot;https://developer.apple.com/library/mac/#documentation/Darwin/Reference/ManPages/man2/execve.2.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;exec&lt;/code&gt;&lt;/a&gt; system calls;&lt;/li&gt;
  &lt;li&gt;The UX problem of designing an easy, flexible programming interface for commands and pipelines.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This post describes the system we designed and implemented for Julia, and how it avoids the major flaws of shelling out in other languages.
First, I’ll present the Julia version of the previous post’s example – counting the number of lines in a given directory containing the string “foo”.
The fact that Julia provides complete, specific diagnostic error messages when pipelines fail turns out to reveal a surprising and subtle bug, lurking in what appears to be a perfectly innocuous UNIX pipeline.
After fixing this bug, we go into details of how Julia’s external command execution and pipeline construction system actually works, and why it provides greater flexibility and safety than the traditional approach of using an intermediate shell to do all the heavy lifting.&lt;/p&gt;

&lt;h2 id=&quot;simple-pipeline-subtle-bug&quot;&gt;Simple Pipeline, Subtle Bug&lt;/h2&gt;

&lt;p&gt;Here’s how you write the example of counting the number of lines in a directory containing the string “foo” in Julia
(you can follow along at home if you have Julia installed from source by changing directories into the Julia source directory and doing &lt;code class=&quot;highlighter-rouge&quot;&gt;cp -a src &quot;source code&quot;; mkdir tmp&lt;/code&gt; and then firing up the Julia REPL):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;src&quot;;

julia&amp;gt; int(readchomp(`find $dir -type f -print0` |&amp;gt; `xargs -0 grep foo` |&amp;gt; `wc -l`))
5
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This Julia command looks suspiciously similar to the naïve Ruby version we started with in the previous post:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;However, it isn’t susceptible to the same problems:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;source code&quot;;

julia&amp;gt; int(readchomp(`find $dir -type f -print0` |&amp;gt; `xargs -0 grep foo` |&amp;gt; `wc -l`))
5

julia&amp;gt; dir = &quot;nonexistent&quot;;

julia&amp;gt; int(readchomp(`find $dir -type f -print0` |&amp;gt; `xargs -0 grep foo` |&amp;gt; `wc -l`))
find: `nonexistent': No such file or directory
ERROR: failed processes:
  Process(`find nonexistent -type f -print0`, ProcessExited(1)) [1]
  Process(`xargs -0 grep foo`, ProcessExited(123)) [123]
 in pipeline_error at process.jl:412
 in readall at process.jl:365
 in readchomp at io.jl:172

julia&amp;gt; dir = &quot;foo'; echo MALICIOUS ATTACK; echo '&quot;;

julia&amp;gt; int(readchomp(`find $dir -type f -print0` |&amp;gt; `xargs -0 grep foo` |&amp;gt; `wc -l`))
find: `foo\'; echo MALICIOUS ATTACK; echo \'': No such file or directory
ERROR: failed processes:
  Process(`find &quot;foo'; echo MALICIOUS ATTACK; echo '&quot; -type f -print0`, ProcessExited(1)) [1]
  Process(`xargs -0 grep foo`, ProcessExited(123)) [123]
 in pipeline_error at process.jl:412
 in readall at process.jl:365
 in readchomp at io.jl:172
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The default, simplest-to-achieve behavior in Julia is:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;not susceptible to any kind of metacharacter breakage,&lt;/li&gt;
  &lt;li&gt;reliably detects all subprocess failures,&lt;/li&gt;
  &lt;li&gt;automatically raises an exception if any subprocess fails,&lt;/li&gt;
  &lt;li&gt;prints error messages including exactly which commands failed.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In the above examples, we can see that even when &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; contains spaces or quotes, the expression still behaves exactly as intended – the value of &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; is interpolated as a single argument to the &lt;code class=&quot;highlighter-rouge&quot;&gt;find&lt;/code&gt; command.
When &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; is not the name of a directory that exists, &lt;code class=&quot;highlighter-rouge&quot;&gt;find&lt;/code&gt; fails – as it should – and this failure is detected and automatically converted into an informative exception, including the fully expanded command-lines that failed.&lt;/p&gt;

&lt;p&gt;In the previous post, we observed that using the &lt;code class=&quot;highlighter-rouge&quot;&gt;pipefail&lt;/code&gt; option for Bash allows detection of pipeline failures, like this one, occurring before the last process in the pipeline.
However, it only allows us to detect that at least one thing in the pipeline failed.
We still have to guess at what parts of the pipeline actually failed.
In the Julia example, on the other hand, there is no guessing required:
when a non-existent directory is given, we can see that both &lt;code class=&quot;highlighter-rouge&quot;&gt;find&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; fail.
While it is unsurprising that &lt;code class=&quot;highlighter-rouge&quot;&gt;find&lt;/code&gt; fails in this case, it is unexpected that &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; also fails.
Why &lt;em&gt;does&lt;/em&gt; &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; fail?&lt;/p&gt;

&lt;p&gt;One possibility to check for is that the &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; program fails with no input.
We can use Julia’s &lt;code class=&quot;highlighter-rouge&quot;&gt;success&lt;/code&gt; predicate to try it out:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; success(`cat /dev/null` |&amp;gt; `xargs true`)
true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Ok, so &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; seems perfectly happy with no input.
Maybe grep doesn’t like not getting any input?&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; success(`cat /dev/null` |&amp;gt; `grep foo`)
false
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Aha! &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; returns a non-zero status when it doesn’t get any input.
Good to know.
It turns out that &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; indicates whether it matched anything or not with its return status.
Most programs use their return status to indicate success or failure, but some, like &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt;, use it to indicate some other boolean condition – in this case “found something” versus “didn’t find anything”:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; success(`echo foo` |&amp;gt; `grep foo`)
true

julia&amp;gt; success(`echo bar` |&amp;gt; `grep foo`)
false
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Now we know why &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; is “failing” – and &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; too, since it returns a non-zero status if the program it runs returns non-zero.
This means that our Julia pipeline and the “responsible” Ruby version are both susceptible to bogus failures when we search an existing directory that happens not to contain the string “foo” anywhere:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;tmp&quot;;

julia&amp;gt; int(readchomp(`find $dir -type f -print0` |&amp;gt; `xargs -0 grep foo` |&amp;gt; `wc -l`))
ERROR: failed process: Process(`xargs -0 grep foo`, ProcessExited(123)) [123]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in pipeline_error at process.jl:407
 in readall at process.jl:365
 in readchomp at io.jl:172
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Since &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; indicates not finding anything using a non-zero return status, the &lt;code class=&quot;highlighter-rouge&quot;&gt;readall&lt;/code&gt; function concludes that its pipeline failed and raises an error to that effect.
In this case, this default behavior is undesirable:
we want the expression to just return &lt;code class=&quot;highlighter-rouge&quot;&gt;0&lt;/code&gt; without raising an error.
The simple fix in Julia is this:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;tmp&quot;;

julia&amp;gt; int(readchomp(`find $dir -type f -print0` |&amp;gt; ignorestatus(`xargs -0 grep foo`) |&amp;gt; `wc -l`))
0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This works correctly in all cases.
Next I’ll explain &lt;em&gt;how&lt;/em&gt; all of this works, but for now it’s enough to note that the detailed error message provided when our pipeline failed exposed a rather subtle bug that would eventually have caused hard-to-debug problems in production.
Without such detailed error reporting, this bug would be pretty difficult to track down.&lt;/p&gt;

&lt;h2 id=&quot;do-nothing-backticks&quot;&gt;Do-Nothing Backticks&lt;/h2&gt;

&lt;p&gt;Julia borrows the backtick syntax for external commands from Perl and Ruby, both of which in turn got it from the shell.
Unlike in these predecessors, however, in Julia backticks don’t immediately run commands, nor do they necessarily indicate that you want to capture the output of the command.
Instead, backticks just construct an object representing a command:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; `echo Hello`
`echo Hello`

julia&amp;gt; typeof(ans)
Cmd
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;(In the Julia REPL, &lt;code class=&quot;highlighter-rouge&quot;&gt;ans&lt;/code&gt; is automatically bound to the value of the last evaluated input.)
In order to actually run a command, you have to &lt;em&gt;do&lt;/em&gt; something with a command object.
To run a command and capture its output into a string – what other languages do with backticks automatically – you can apply the &lt;code class=&quot;highlighter-rouge&quot;&gt;readall&lt;/code&gt; function:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; readall(`echo Hello`)
&quot;Hello\n&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Since it’s very common to want to discard the trailing line break at the end of a command’s output, Julia provides the &lt;code class=&quot;highlighter-rouge&quot;&gt;readchomp(x)&lt;/code&gt; function, which is equivalent to writing &lt;code class=&quot;highlighter-rouge&quot;&gt;chomp(readall(x))&lt;/code&gt;:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; readchomp(`echo Hello`)
&quot;Hello&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;To run a command without capturing its output, letting it just print to the same &lt;code class=&quot;highlighter-rouge&quot;&gt;stdout&lt;/code&gt; stream as the main process – i.e. what the &lt;code class=&quot;highlighter-rouge&quot;&gt;system&lt;/code&gt; function does when given a command as a string in other languages – use the &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; function:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; run(`echo Hello`)
Hello
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;&quot;Hello\n&quot;&lt;/code&gt; after the &lt;code class=&quot;highlighter-rouge&quot;&gt;readall&lt;/code&gt; command is a returned value, whereas the &lt;code class=&quot;highlighter-rouge&quot;&gt;Hello&lt;/code&gt; after the &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; command is printed output.
(If your terminal supports color, these are colored differently so that you can easily distinguish them visually.)
Nothing is returned by the &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; command, but if something goes wrong, an exception is raised:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; run(`false`)
ERROR: failed process: Process(`false`, ProcessExited(1)) [1]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in run at process.jl:384

julia&amp;gt; run(`notaprogram`)
execvp(): No such file or directory
ERROR: failed process: Process(`notaprogram`, ProcessExited(-1)) [-1]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in run at process.jl:384
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As with &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; above, this may not always be desirable.
In such cases, you can use &lt;code class=&quot;highlighter-rouge&quot;&gt;ignorestatus&lt;/code&gt; to indicate that the command returning a non-zero value should not be considered an error:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; run(ignorestatus(`false`))

julia&amp;gt; run(ignorestatus(`notaprogram`))
execvp(): No such file or directory
ERROR: failed process: Process(`notaprogram`, ProcessExited(-1)) [-1]
 in error at error.jl:22
 in pipeline_error at process.jl:394
 in run at process.jl:384
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In the latter case, an error is still raised in the parent process since the problem is that the executable doesn’t even exist, rather than merely that it ran and returned a non-zero status.&lt;/p&gt;
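
&lt;p&gt;When a failure should be handled rather than ignored, the exception can also be caught. The following is a minimal sketch, assuming only that &lt;code class=&quot;highlighter-rouge&quot;&gt;run&lt;/code&gt; raises an exception on failure as shown above; the helper name &lt;code class=&quot;highlighter-rouge&quot;&gt;ran_ok&lt;/code&gt; is hypothetical and not part of Julia:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# hypothetical helper: run a command, report failure as a boolean
function ran_ok(cmd)
    try
        run(cmd)        # raises an exception if the process fails
        return true
    catch err
        return false    # the failure is handled here instead of propagating
    end
end

ran_ok(`true`)     # the command exits with status zero
ran_ok(`false`)    # the command exits with a non-zero status
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This is roughly the question that the built-in &lt;code class=&quot;highlighter-rouge&quot;&gt;success&lt;/code&gt; predicate used earlier answers, so in practice the explicit form is mainly useful when the handler needs to do more than return a boolean.&lt;/p&gt;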

&lt;p&gt;Although Julia’s backtick syntax intentionally mimics the shell as closely as possible, there is an important distinction:
the command string is never passed to a shell to be interpreted and executed;
instead it is parsed in Julia code, using the same rules the shell uses to determine what the command and arguments are.
Command objects allow you to see what the program and arguments were determined to be by accessing the &lt;code class=&quot;highlighter-rouge&quot;&gt;.exec&lt;/code&gt; field:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; cmd = `perl -e 'print &quot;Hello\n&quot;'`
`perl -e 'print &quot;Hello\n&quot;'`

julia&amp;gt; cmd.exec
3-element Union(UTF8String,ASCIIString) Array:
 &quot;perl&quot;
 &quot;-e&quot;
 &quot;print \&quot;Hello\\n\&quot;&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This field is a plain old array of strings that can be manipulated like any other Julia array.&lt;/p&gt;
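
&lt;p&gt;As a small sketch of what such manipulation can look like, the following appends an extra argument to an existing command object before running it. It assumes only that &lt;code class=&quot;highlighter-rouge&quot;&gt;.exec&lt;/code&gt; behaves like the ordinary string array shown above; the appended argument is purely illustrative:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;cmd = `echo Hello`

# .exec is an ordinary array, so the usual array functions apply
push!(cmd.exec, &quot;World&quot;)

length(cmd.exec)   # the program name plus two arguments
cmd.exec[end]      # the argument that was just appended

run(cmd)           # echo now receives both arguments
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;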

&lt;h2 id=&quot;constructing-commands&quot;&gt;Constructing Commands&lt;/h2&gt;

&lt;p&gt;The purpose of the backtick notation in Julia is to provide a familiar, shell-like syntax for making objects representing commands with arguments.
To that end, quotes and spaces work just as they do in the shell.
The real power of backtick syntax doesn’t emerge, however, until we begin constructing commands programmatically.
Just as in the shell (and in Julia strings), you can interpolate values into commands using the dollar sign (&lt;code class=&quot;highlighter-rouge&quot;&gt;$&lt;/code&gt;):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;src&quot;;

julia&amp;gt; `find $dir -type f`.exec
4-element Union(UTF8String,ASCIIString) Array:
 &quot;find&quot;
 &quot;src&quot;
 &quot;-type&quot;
 &quot;f&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Unlike in the shell, however, Julia values interpolated into commands are interpolated as a single verbatim argument – no characters inside the value are interpreted as special after the value has been interpolated:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;two words&quot;;

julia&amp;gt; `find $dir -type f`.exec
4-element Union(UTF8String,ASCIIString) Array:
 &quot;find&quot;
 &quot;two words&quot;
 &quot;-type&quot;
 &quot;f&quot;

julia&amp;gt; dir = &quot;foo'bar&quot;;

julia&amp;gt; `find $dir -type f`.exec
4-element Union(UTF8String,ASCIIString) Array:
 &quot;find&quot;
 &quot;foo'bar&quot;
 &quot;-type&quot;
 &quot;f&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This works no matter what the contents of the interpolated value are, allowing simple interpolation of characters that are quite difficult to pass as parts of command-line arguments even in the shell (for the following examples, &lt;code class=&quot;highlighter-rouge&quot;&gt;tmp/a.tsv&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;tmp/b.tsv&lt;/code&gt; can be created in the shell with &lt;code class=&quot;highlighter-rouge&quot;&gt;echo -e &quot;foo\tbar\nbaz\tqux&quot; &amp;gt; tmp/a.tsv; echo -e &quot;foo\t1\nbaz\t2&quot; &amp;gt; tmp/b.tsv&lt;/code&gt;):&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; tab = &quot;\t&quot;;

julia&amp;gt; cmd = `join -t$tab tmp/a.tsv tmp/b.tsv`;

julia&amp;gt; cmd.exec
4-element Union(UTF8String,ASCIIString) Array:
 &quot;join&quot;
 &quot;-t\t&quot;
 &quot;tmp/a.tsv&quot;
 &quot;tmp/b.tsv&quot;

julia&amp;gt; run(cmd)
foo     bar     1
baz     qux     2
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Moreover, what comes after the &lt;code class=&quot;highlighter-rouge&quot;&gt;$&lt;/code&gt; can actually be any valid Julia expression, not just a variable name:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; `join -t$&quot;\t&quot; tmp/a.tsv tmp/b.tsv`.exec
4-element Union(UTF8String,ASCIIString) Array:
 &quot;join&quot;
 &quot;-t\t&quot;
 &quot;tmp/a.tsv&quot;
 &quot;tmp/b.tsv&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;A tab character is somewhat harder to pass in the shell, requiring command substitution and some tricky quoting:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bash-3.2$ join -t&quot;$(printf '\t')&quot; tmp/a.tsv tmp/b.tsv
foo	    bar	    1
baz	    qux	    2
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;While interpolating values with spaces and other strange characters is great for non-brittle construction of commands, there was a reason why the shell split values on spaces in the first place:
to allow interpolation of multiple arguments.
Most modern shells have first-class array types, but older shells used space-separation to simulate arrays.
Thus, if you interpolate a value like “foo bar” into a command in the shell, it’s treated as two separate words by default.
In languages with first-class array types, however, there’s a much better option:
consistently interpolate single values as single arguments and interpolate arrays as multiple values.
This is precisely what Julia’s backtick interpolation does:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dirs = [&quot;foo&quot;, &quot;bar&quot;, &quot;baz&quot;];

julia&amp;gt; `find $dirs -type f`.exec
6-element Union(UTF8String,ASCIIString) Array:
 &quot;find&quot;
 &quot;foo&quot;
 &quot;bar&quot;
 &quot;baz&quot;
 &quot;-type&quot;
 &quot;f&quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;And of course, no matter how strange the strings contained in an interpolated array are, they become verbatim arguments, without any shell interpretation.
Julia’s backticks have one more fancy trick up their sleeve.
We saw earlier (without really remarking on it) that you could interpolate single values into a larger argument:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; x = &quot;bar&quot;;

julia&amp;gt; `echo foo$x`
`echo foobar`
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;What happens if &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; is an array?
Only one way to find out:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; x = [&quot;bar&quot;, &quot;baz&quot;];

julia&amp;gt; `echo foo$x`
`echo foobar foobaz`
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Julia does what the shell would do if you wrote &lt;code class=&quot;highlighter-rouge&quot;&gt;echo foo{bar,baz}&lt;/code&gt;.
This even works correctly for multiple values interpolated into the same shell word:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia&amp;gt; dir = &quot;/data&quot;; names = [&quot;foo&quot;,&quot;bar&quot;]; exts=[&quot;csv&quot;,&quot;tsv&quot;];

julia&amp;gt; `cat $dir/$names.$exts`
`cat /data/foo.csv /data/foo.tsv /data/bar.csv /data/bar.tsv`
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This is the same Cartesian product expansion that the shell does if multiple &lt;code class=&quot;highlighter-rouge&quot;&gt;&lt;span class=&quot;p&quot;&gt;{&lt;/span&gt;&lt;span class=&quot;err&quot;&gt;...&lt;/span&gt;&lt;span class=&quot;p&quot;&gt;}&lt;/span&gt;&lt;/code&gt; expressions are used in the same word.&lt;/p&gt;
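
&lt;p&gt;To make the expansion order concrete, here is a rough sketch that builds the same argument list by hand with two nested loops. It only illustrates the resulting order and is not how backtick interpolation is implemented internally; the function name &lt;code class=&quot;highlighter-rouge&quot;&gt;expand_paths&lt;/code&gt; is made up for this example:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# hypothetical helper mirroring what `cat $dir/$names.$exts` expands to
function expand_paths(dir, names, exts)
    args = String[]
    for n in names        # the first interpolated array varies slowest
        for e in exts     # the last interpolated array varies fastest
            push!(args, string(dir, &quot;/&quot;, n, &quot;.&quot;, e))
        end
    end
    return args
end

# yields /data/foo.csv, /data/foo.tsv, /data/bar.csv, /data/bar.tsv
expand_paths(&quot;/data&quot;, [&quot;foo&quot;,&quot;bar&quot;], [&quot;csv&quot;,&quot;tsv&quot;])
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;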

&lt;h2 id=&quot;further-reading&quot;&gt;Further Reading&lt;/h2&gt;

&lt;p&gt;You can read more in Julia’s &lt;a href=&quot;http://docs.julialang.org/en/release-0.1/manual/running-external-programs/&quot;&gt;online manual&lt;/a&gt;, including how to construct complex pipelines, and how shell-compatible quoting and interpolation rules in Julia’s backtick syntax make it both simple and safe to cut-and-paste shell commands into Julia code.
The whole system is designed on the principle that the easiest thing to do should also be the right thing.
The end result is that starting and interacting with external processes in Julia is both convenient and safe.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Distributed Numerical Optimization</title>
   <link href="http://julialang.org/blog/2013/04/distributed-numerical-optimization"/>
   <updated>2013-04-05T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/04/distributed-numerical-optimization</id>
   <content type="html">&lt;script src=&quot;https://cdn.mathjax.org/mathjax/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML&quot;&gt;
&lt;/script&gt;

&lt;p&gt;This post walks through the parallel computing functionality of Julia
to implement an asynchronous parallel version of the classical
&lt;em&gt;cutting-plane&lt;/em&gt; algorithm for convex (nonsmooth) optimization,
demonstrating the complete workflow including running on both Amazon
EC2 and a large multicore server. I will quickly review the
cutting-plane algorithm and will be focusing primarily on parallel
computation patterns, so don’t worry if you’re not familiar with the
optimization side of things.&lt;/p&gt;

&lt;h3 id=&quot;cutting-plane-algorithm&quot;&gt;Cutting-plane algorithm&lt;/h3&gt;

&lt;p&gt;The cutting-plane algorithm is a method for solving the optimization problem&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;\min_{x \in \mathbb R^d} \sum_{i=1}^n f_i(x)&lt;/script&gt;

&lt;p&gt;where the functions \( f_i \) are convex but not necessarily differentiable.
The absolute value function \( |x| \) and the 1-norm \( ||x|| _ 1 \) are
typical examples. Important applications also arise from &lt;a href=&quot;http://en.wikipedia.org/wiki/Lagrangian_relaxation&quot;&gt;Lagrangian
relaxation&lt;/a&gt;. The idea of the algorithm is to approximate the functions \(
f_i \) with piecewise linear models \( m_i \) which are built up from
information obtained by evaluating \( f_i \) at different points. We
iteratively minimize over the models to generate candidate solution points.&lt;/p&gt;
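
&lt;p&gt;One common way to build such a model, stated here as an assumption since this post does not spell out its construction, is to take the maximum of the linear under-estimators (cuts) collected so far. If \( f_i \) has been evaluated at points \( x^1, \ldots, x^k \), and \( g_i^j \) denotes a subgradient of \( f_i \) at \( x^j \), the model is&lt;/p&gt;

&lt;script type=&quot;math/tex; mode=display&quot;&gt;m_i(x) = \max_{j=1,\ldots,k} \left\{ f_i(x^j) + (g_i^j)^T (x - x^j) \right\}&lt;/script&gt;

&lt;p&gt;By convexity each cut lies below \( f_i \), so \( m_i(x) \le f_i(x) \) everywhere, and every new evaluation can only tighten the model.&lt;/p&gt;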

&lt;p&gt;We can state the algorithm as&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Choose starting point \( x \).&lt;/li&gt;
  &lt;li&gt;For \(i = 1,\ldots,n\), evaluate \(
f_i(x) \) and update corresponding model \( m_i \).&lt;/li&gt;
  &lt;li&gt;Let the next
candidate \( x \) be the minimizer of \( \sum_{i=1}^n m_i(x) \).&lt;/li&gt;
  &lt;li&gt;If not converged, go to step 2.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If it is costly to evaluate \( f_i(x) \), then the algorithm is naturally
parallelizable at step 2. The minimization in step 3 can be computed by solving
a linear optimization problem, which is usually very fast. (Let me point out
here that Julia has interfaces to linear programming and other
optimization solvers under &lt;a href=&quot;http://juliaopt.org/&quot;&gt;JuliaOpt&lt;/a&gt;.)&lt;/p&gt;

&lt;p&gt;Abstracting the math, we can write the algorithm using the following Julia code.&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# functions initialize, isconverged, solvesubproblem, and process implemented elsewhere
state, subproblems = initialize()
while !isconverged(state)
    results = map(solvesubproblem,subproblems)
    state, subproblems = process(state, results)
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The function &lt;code class=&quot;highlighter-rouge&quot;&gt;solvesubproblem&lt;/code&gt; corresponds to evaluating \( f_i(x) \) for a
given \( i \) and \( x \) (the elements of &lt;code class=&quot;highlighter-rouge&quot;&gt;subproblems&lt;/code&gt; could be tuples
&lt;code class=&quot;highlighter-rouge&quot;&gt;(i,x)&lt;/code&gt;). The function &lt;code class=&quot;highlighter-rouge&quot;&gt;process&lt;/code&gt; corresponds to minimizing the model in step
3, and it produces a new state and a new set of subproblems to solve.&lt;/p&gt;

&lt;p&gt;Note that the algorithm looks much like a map-reduce that would be easy to
parallelize using many existing frameworks. Indeed, in Julia we can simply
replace &lt;code class=&quot;highlighter-rouge&quot;&gt;map&lt;/code&gt; with &lt;code class=&quot;highlighter-rouge&quot;&gt;pmap&lt;/code&gt; (parallel map). Let’s consider a twist that makes
the parallelism not so straightforward.&lt;/p&gt;

&lt;h3 id=&quot;asynchronous-variant&quot;&gt;Asynchronous variant&lt;/h3&gt;

&lt;p&gt;Variability in the time taken by the &lt;code class=&quot;highlighter-rouge&quot;&gt;solvesubproblem&lt;/code&gt; function can lead to
load imbalance and limit parallel efficiency as workers sit idle waiting for new
tasks. Such variability arises naturally if &lt;code class=&quot;highlighter-rouge&quot;&gt;solvesubproblem&lt;/code&gt; itself requires
solving an optimization problem, or if the workers and network are shared, as is
often the case with cloud computing.&lt;/p&gt;

&lt;p&gt;We can consider a new variant of the cutting-plane algorithm to address this
issue. The key point is&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;When proportion \(0 &amp;lt; \alpha \le 1 \) of subproblems for a given candidate
have been solved, generate a new candidate and corresponding set of
subproblems by using whatever information is presently available.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In other words, we generate new tasks to feed to workers without needing to wait
for all current tasks to complete, making the algorithm asynchronous. The
algorithm remains convergent, although the total number of iterations may
increase. For more details, see this &lt;a href=&quot;http://dx.doi.org/10.1023/A:1021858008222&quot;&gt;paper&lt;/a&gt; by Jeff Linderoth and
Stephen Wright.&lt;/p&gt;

&lt;p&gt;By introducing asynchronicity we can no longer use a nice black-box &lt;code class=&quot;highlighter-rouge&quot;&gt;pmap&lt;/code&gt;
function and have to dig deeper into the parallel implementation. Fortunately,
this is easy to do in Julia.&lt;/p&gt;

&lt;h3 id=&quot;parallel-implementation-in-julia&quot;&gt;Parallel implementation in Julia&lt;/h3&gt;

&lt;p&gt;Julia implements distributed-memory parallelism based on one-sided message
passing, where processes push work onto others (via &lt;code class=&quot;highlighter-rouge&quot;&gt;remotecall&lt;/code&gt;) and the
results are retrieved (via &lt;code class=&quot;highlighter-rouge&quot;&gt;fetch&lt;/code&gt;) by the process which requires them. Macros
such as &lt;code class=&quot;highlighter-rouge&quot;&gt;@spawn&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;@parallel&lt;/code&gt; provide pretty syntax around this low-level
functionality.  This model of parallelism is very different from the typical
SPMD style of MPI. Both approaches are useful in different contexts, and I
expect an MPI wrapper for Julia will appear in the future (see also &lt;a href=&quot;https://github.com/lcw/julia-mpi&quot;&gt;here&lt;/a&gt;).&lt;/p&gt;

&lt;p&gt;Reading the &lt;a href=&quot;http://docs.julialang.org/en/release-0.1/manual/parallel-computing/&quot;&gt;manual&lt;/a&gt;
on parallel computing is highly recommended, and I won’t try to reproduce it in
this post. Instead, we’ll dig into and extend one of the examples it presents.&lt;/p&gt;

&lt;p&gt;The implementation of &lt;code class=&quot;highlighter-rouge&quot;&gt;pmap&lt;/code&gt; in Julia is&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function pmap(f, lst)
    np = nprocs()  # determine the number of processes available
    n = length(lst)
    results = cell(n)
    i = 1
    # function to produce the next work item from the queue.
    # in this case it's just an index.
    next_idx() = (idx=i; i+=1; idx)
    @sync begin
        for p=1:np
            if p != myid() || np == 1
                @spawnlocal begin
                    while true
                        idx = next_idx()
                        if idx &amp;gt; n
                            break
                        end
                        results[idx] = remotecall_fetch(p, f, lst[idx])
                    end
                end
            end
        end
    end
    results
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;At first sight, this code is not particularly intuitive. The &lt;code class=&quot;highlighter-rouge&quot;&gt;@spawnlocal&lt;/code&gt;
macro creates a &lt;em&gt;&lt;a href=&quot;http://docs.julialang.org/en/latest/manual/control-flow/#man-tasks&quot;&gt;task&lt;/a&gt;&lt;/em&gt;
on the &lt;em&gt;master process&lt;/em&gt; (e.g. process 1). Each task feeds work to a
corresponding worker; the call &lt;code class=&quot;highlighter-rouge&quot;&gt;remotecall_fetch(p, f, lst[idx])&lt;/code&gt; runs
&lt;code class=&quot;highlighter-rouge&quot;&gt;f&lt;/code&gt; on process &lt;code class=&quot;highlighter-rouge&quot;&gt;p&lt;/code&gt; and returns the result when finished. Tasks are
uninterruptible and only surrender control at specific points such as
&lt;code class=&quot;highlighter-rouge&quot;&gt;remotecall_fetch&lt;/code&gt;. Tasks cannot directly modify variables from the enclosing
scope, but the same effect can be achieved by using the &lt;code class=&quot;highlighter-rouge&quot;&gt;next_idx&lt;/code&gt; function to
access and mutate &lt;code class=&quot;highlighter-rouge&quot;&gt;i&lt;/code&gt;. &lt;em&gt;The task idiom functions in place of using a loop to
poll for results from each worker process.&lt;/em&gt;&lt;/p&gt;
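
&lt;p&gt;The shared-counter idiom can be tried in isolation. The sketch below is a stand-alone illustration rather than part of &lt;code class=&quot;highlighter-rouge&quot;&gt;pmap&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;make_counter&lt;/code&gt; is a hypothetical name. It returns a closure that, like &lt;code class=&quot;highlighter-rouge&quot;&gt;next_idx&lt;/code&gt; above, hands out consecutive indices by reading and updating a variable in its enclosing scope:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# hypothetical helper: builds a counter shaped like next_idx in pmap
function make_counter()
    i = 1
    # return the current index, then advance it for the next caller
    next_idx() = (idx = i; i += 1; idx)
    return next_idx
end

next_idx = make_counter()
first_item = next_idx()     # 1
second_item = next_idx()    # 2
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because tasks only give up control at specific points, two tasks never interleave inside a call to the counter, so each index is handed out exactly once without any locking.&lt;/p&gt;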

&lt;p&gt;Implementing our asynchronous algorithm is not much more than a modification of
the above code:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# given constants n and 0 &amp;lt; alpha &amp;lt;= 1
# functions initialize and solvesubproblem defined elsewhere
np = nprocs()
state, subproblems = initialize()
converged = false
isconverged() = converged
function updatemodel(mysubproblem, result)
    # store result
    ...
    # decide whether to generate new subproblems
    state.numback[mysubproblem.parent] += 1
    if state.numback[mysubproblem.parent] &amp;gt;= alpha*n &amp;amp;&amp;amp; !state.didtrigger[mysubproblem.parent]
        state.didtrigger[mysubproblem.parent] = true
        # generate newsubproblems by solving linear optimization problem
        ...
        if ... # convergence test
            converged = true
        else
            append!(subproblems, newsubproblems)
            push!(state.didtrigger, false)
            push!(state.numback, 0)
            # ensure that for s in newsubproblems, s.parent == length(state.numback)
        end
    end
end

@sync begin
    for p=1:np
        if p != myid() || np == 1
            @spawnlocal begin
                while !isconverged()
                    if length(subproblems) == 0
                        # no more subproblems but haven't converged yet
                        yield()
                        continue
                    end
                    mysubproblem = shift!(subproblems) # pop subproblem from queue
                    result = remotecall_fetch(p, solvesubproblem, mysubproblem)
                    updatemodel(mysubproblem, result)
                end
            end
        end
    end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;where &lt;code class=&quot;highlighter-rouge&quot;&gt;state&lt;/code&gt; is an instance of a type defined as&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;type State
    didtrigger::Vector{Bool}
    numback::Vector{Int}
    ...
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;There is little difference in the structure of the code inside the &lt;code class=&quot;highlighter-rouge&quot;&gt;@sync&lt;/code&gt;
blocks, and the asynchronous logic is encapsulated in the local &lt;code class=&quot;highlighter-rouge&quot;&gt;updatemodel&lt;/code&gt;
function which conditionally generates new subproblems. A strength of Julia is
that functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;pmap&lt;/code&gt; are implemented in Julia itself, so that it is
particularly straightforward to make modifications like this.&lt;/p&gt;

&lt;h3 id=&quot;running-it&quot;&gt;Running it&lt;/h3&gt;

&lt;p&gt;Now for the fun part. The complete cutting-plane algorithm (along with
additional variants) is implemented in &lt;a href=&quot;https://github.com/mlubin/JuliaBenders&quot;&gt;JuliaBenders&lt;/a&gt;. The code is
specialized for &lt;a href=&quot;http://en.wikipedia.org/wiki/Stochastic_programming&quot;&gt;stochastic
programming&lt;/a&gt; where the cutting-plane algorithm is known as the &lt;a href=&quot;http://www.springerreference.com/docs/html/chapterdbid/72429.html&quot;&gt;L-shaped
method&lt;/a&gt; or Benders decomposition and is used to decompose the solution of
large linear optimization problems. Here, &lt;code class=&quot;highlighter-rouge&quot;&gt;solvesubproblem&lt;/code&gt; entails solving a
relatively small linear optimization problem. Test instances are taken from the
previously mentioned &lt;a href=&quot;http://dx.doi.org/10.1023/A:1021858008222&quot;&gt;paper&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;We’ll first run on a large multicore server. The
&lt;code class=&quot;highlighter-rouge&quot;&gt;runals.jl&lt;/code&gt; (asynchronous L-shaped) file contains the algorithm we’ll use. Its
usage is&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia runals.jl [data source] [num subproblems] [async param] [block size]
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;where &lt;code class=&quot;highlighter-rouge&quot;&gt;[num subproblems]&lt;/code&gt; is the \(n\) as above and &lt;code class=&quot;highlighter-rouge&quot;&gt;[async param]&lt;/code&gt; is
the proportion \(\alpha\). By setting \(\alpha = 1\) we obtain the
synchronous algorithm. For the asynchronous version we will take \(\alpha =
0.6\). The &lt;code class=&quot;highlighter-rouge&quot;&gt;[block size]&lt;/code&gt; parameter controls how many subproblems are sent to
a worker at once (in the previous code, this value was always 1). We will use
4000 subproblems in our experiments.&lt;/p&gt;

&lt;p&gt;To run multiple Julia processes on a shared-memory machine, we pass the &lt;code class=&quot;highlighter-rouge&quot;&gt;-p N&lt;/code&gt;
option to the &lt;code class=&quot;highlighter-rouge&quot;&gt;julia&lt;/code&gt; executable, which will start up &lt;code class=&quot;highlighter-rouge&quot;&gt;N&lt;/code&gt; system processes.
To execute the asynchronous version with 10 workers, we run&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia -p 12 runals.jl Data/storm 4000 0.6 30
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Note that we start 12 processes. These are the 10 workers, the master (which
distributes tasks), and another process to perform the master’s computations (an
additional refinement which was not described above). Results from various runs
are presented in the table below.&lt;/p&gt;

&lt;table style=&quot;text-align:right;margin-left:auto;margin-right:auto&quot; cellspacing=&quot;5&quot;&gt;
&lt;tr style=&quot;text-align:center&quot;&gt;
	&lt;td&gt; &lt;/td&gt;
	&lt;td colspan=&quot;2&quot; style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Synchronous&lt;/td&gt;
	&lt;td colspan=&quot;2&quot; style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Asynchronous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;text-align:center&quot;&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;No. Workers&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Speed&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Efficiency&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Speed&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;10&lt;/td&gt;
	&lt;td&gt;154&lt;/td&gt;
	&lt;td&gt;Baseline&lt;/td&gt;
	&lt;td&gt;166&lt;/td&gt;
	&lt;td&gt;Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;20&lt;/td&gt;
	&lt;td&gt;309&lt;/td&gt;
	&lt;td&gt;100.3%&lt;/td&gt;
	&lt;td&gt;348&lt;/td&gt;
	&lt;td&gt;105%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;40&lt;/td&gt;
	&lt;td&gt;517&lt;/td&gt;
	&lt;td&gt;84%&lt;/td&gt;
	&lt;td&gt;654&lt;/td&gt;
	&lt;td&gt;98%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;60&lt;/td&gt;
	&lt;td&gt;674&lt;/td&gt;
	&lt;td&gt;73%&lt;/td&gt;
	&lt;td&gt;918&lt;/td&gt;
	&lt;td&gt;92%&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p class=&quot;caption&quot; style=&quot;text-align:center&quot;&gt;&lt;b&gt;Table:&lt;/b&gt;
Results on a shared-memory 8x Xeon E7-8850 server. Workers correspond to
individual cores. Speed is the rate of subproblems solved per second. Efficiency
is calculated as the percent of ideal parallel speedup obtained. The superlinear
scaling observed with 20 workers is likely a system artifact.
&lt;/p&gt;
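
&lt;p&gt;As a worked example of the efficiency column, here is a small sketch that takes the 10-worker run as the baseline and divides the observed speedup by the ideal one. The function name &lt;code class=&quot;highlighter-rouge&quot;&gt;efficiency&lt;/code&gt; and its argument names are illustrative and are not taken from JuliaBenders:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# observed speedup divided by ideal speedup, relative to a baseline run
efficiency(speed, workers, base_speed, base_workers) =
    (speed / base_speed) / (workers / base_workers)

# 20 workers against the 10-worker baseline, in percent
sync20  = round(100 * efficiency(309, 20, 154, 10), digits=1)   # 100.3
async20 = round(100 * efficiency(348, 20, 166, 10), digits=1)   # 104.8
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;These values agree with the 100.3% and 105% entries in the 20-worker row.&lt;/p&gt;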

&lt;p&gt;There are a few more hoops to jump through in order to run on EC2. First we must
build a system image (AMI) with Julia installed. Julia connects to workers over
ssh, so I found it useful to put my EC2 ssh key on the AMI and also set
&lt;code class=&quot;highlighter-rouge&quot;&gt;StrictHostKeyChecking no&lt;/code&gt; in &lt;code class=&quot;highlighter-rouge&quot;&gt;/etc/ssh/ssh_config&lt;/code&gt; to disable the
authenticity prompt when connecting to new workers. Someone will likely correct
me on whether this is the right approach.&lt;/p&gt;

&lt;p&gt;Assuming we have an AMI in place, we can fire up the instances. I used an
m3.xlarge instance for the master and m1.medium instances for the workers.
(Note: you can save a lot of money by using the spot market.)&lt;/p&gt;

&lt;p&gt;To add remote workers on startup, Julia accepts a file with a list of host names
through the &lt;code class=&quot;highlighter-rouge&quot;&gt;--machinefile&lt;/code&gt; option. We can generate this easily enough by
using the EC2 API Tools (Ubuntu package &lt;code class=&quot;highlighter-rouge&quot;&gt;ec2-api-tools&lt;/code&gt;) with the command&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;ec2-describe-instances | grep running | awk '{ print $5; }' &amp;gt; mfile
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;On the master instance we can then run&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;julia --machinefile mfile runatr.jl Data/storm 4000 0.6 30
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Results from various runs are presented in the table below.&lt;/p&gt;

&lt;table style=&quot;text-align:right;margin-left:auto;margin-right:auto&quot; cellspacing=&quot;5&quot;&gt;
&lt;tr style=&quot;text-align:center&quot;&gt;
	&lt;td&gt; &lt;/td&gt;
	&lt;td colspan=&quot;2&quot; style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Synchronous&lt;/td&gt;
	&lt;td colspan=&quot;2&quot; style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Asynchronous&lt;/td&gt;
&lt;/tr&gt;
&lt;tr style=&quot;text-align:center&quot;&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;No. Workers&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Speed&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Efficiency&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Speed&lt;/td&gt;
	&lt;td style=&quot;border-bottom-style:solid;border-bottom-width:2px&quot;&gt;Efficiency&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;10&lt;/td&gt;
	&lt;td&gt;149&lt;/td&gt;
	&lt;td&gt;Baseline&lt;/td&gt;
	&lt;td&gt;151&lt;/td&gt;
	&lt;td&gt;Baseline&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;20&lt;/td&gt;
	&lt;td&gt;289&lt;/td&gt;
	&lt;td&gt;97%&lt;/td&gt;
	&lt;td&gt;301&lt;/td&gt;
	&lt;td&gt;99.7%&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
	&lt;td style=&quot;text-align:center&quot;&gt;40&lt;/td&gt;
	&lt;td&gt;532&lt;/td&gt;
	&lt;td&gt;89%&lt;/td&gt;
	&lt;td&gt;602&lt;/td&gt;
	&lt;td&gt;99.5%&lt;/td&gt;
&lt;/tr&gt;
&lt;/table&gt;

&lt;p class=&quot;caption&quot; style=&quot;text-align:center&quot;&gt;&lt;b&gt;Table:&lt;/b&gt;
Results on Amazon EC2. Workers correspond to individual m1.medium instances. The
master process is run on an m3.xlarge instance.
&lt;/p&gt;

&lt;p&gt;On both architectures the asynchronous version solves subproblems at a higher
rate and has significantly better parallel efficiency. Scaling is better on EC2
than on the shared-memory server, likely because the subproblem calculation is
memory bound, so performance is better on the distributed-memory
architecture. Either way, with Julia we can easily experiment on both.&lt;/p&gt;

&lt;h3 id=&quot;further-reading&quot;&gt;Further reading&lt;/h3&gt;

&lt;p&gt;A more detailed &lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/blob/master/NumericalOptimization/tutorial.pdf?raw=true&quot;&gt;tutorial&lt;/a&gt;
was prepared for the Julia &lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial&quot;&gt;IAP session&lt;/a&gt; at MIT in January 2013.&lt;/p&gt;

&lt;p&gt;&lt;a rel=&quot;license&quot; href=&quot;http://creativecommons.org/licenses/by/4.0/&quot;&gt;&lt;img alt=&quot;Creative Commons License&quot; style=&quot;border-width:0&quot; src=&quot;https://i.creativecommons.org/l/by/4.0/88x31.png&quot; /&gt;&lt;/a&gt;&lt;br /&gt;&lt;span xmlns:dct=&quot;http://purl.org/dc/terms/&quot; property=&quot;dct:title&quot;&gt;Distributed Numerical Optimization&lt;/span&gt; by &lt;span xmlns:cc=&quot;http://creativecommons.org/ns#&quot; property=&quot;cc:attributionName&quot;&gt;Miles Lubin&lt;/span&gt; is licensed under a &lt;a rel=&quot;license&quot; href=&quot;http://creativecommons.org/licenses/by/4.0/&quot;&gt;Creative Commons Attribution 4.0 International License&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Videos from the Julia tutorial at MIT</title>
   <link href="http://julialang.org/blog/2013/03/julia-tutorial-MIT"/>
   <updated>2013-03-30T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/03/julia-tutorial-MIT</id>
   <content type="html">&lt;p&gt;We held a two-day Julia tutorial at MIT in January 2013, which included 10 sessions. &lt;a href=&quot;http://ocw.mit.edu/&quot;&gt;MIT OpenCourseWare&lt;/a&gt; and &lt;a href=&quot;http://www.mitx.org/&quot;&gt;MIT-X&lt;/a&gt; graciously provided support for recording these lectures, so that the wider Julia community can benefit from these sessions.&lt;/p&gt;

&lt;h2 id=&quot;julia-lightning-round-slides&quot;&gt;Julia Lightning Round (&lt;a href=&quot;https://raw.github.com/JuliaLang/julia-tutorial/master/LightningRound/IAP_2013_Lightning.pdf&quot;&gt;slides&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;This session is a rapid introduction to Julia, organized as a series of lightning rounds. It uses short examples to demonstrate syntax and features, and gives a quick feel for the language.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/37L1OMk_3FU&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;rationale-behind-julia-and-the-vision-slides&quot;&gt;Rationale behind Julia and the Vision (&lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/raw/master/Vision/vision.pdf&quot;&gt;slides&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;The rationale and vision behind Julia, along with its design principles, are discussed in this session.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/02U9AJMEWx0&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;data-analysis-with-dataframes-slides&quot;&gt;Data Analysis with DataFrames (&lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/raw/master/DataFrames/slides.pdf&quot;&gt;slides&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;https://github.com/HarlanH/DataFrames.jl&quot;&gt;DataFrames&lt;/a&gt; is one of the most widely used Julia packages. This session is an introduction to data analysis with Julia using DataFrames.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/XRClA5YLiIc&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;statistical-models-in-julia-slides&quot;&gt;Statistical Models in Julia (&lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/raw/master/Stats/slides.pdf&quot;&gt;slides&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;This session demonstrates Julia’s statistics capabilities, which are provided by these packages: &lt;a href=&quot;https://github.com/JuliaStats/Distributions.jl&quot;&gt;Distributions&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaStats/GLM.jl&quot;&gt;GLM&lt;/a&gt;, and &lt;a href=&quot;https://github.com/JuliaStats/LM.jl&quot;&gt;LM&lt;/a&gt;.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/v9Io-p_iymI&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;fast-fourier-transforms&quot;&gt;Fast Fourier Transforms&lt;/h2&gt;

&lt;p&gt;Julia provides a built-in interface to the &lt;a href=&quot;http://www.fftw.org/&quot;&gt;FFTW&lt;/a&gt; library. This session demonstrates Julia’s &lt;a href=&quot;http://docs.julialang.org/en/release-0.1/stdlib/base/#signal-processing&quot;&gt;signal processing&lt;/a&gt; capabilities, such as FFTs and DCTs. Also see the &lt;a href=&quot;https://github.com/stevengj/Hadamard.jl&quot;&gt;Hadamard&lt;/a&gt; package.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/1iBLaHGL1AM&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;optimization-slides&quot;&gt;Optimization (&lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/raw/master/NumericalOptimization/presentation.pdf&quot;&gt;slides&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;This session focuses largely on using Julia for solving linear programming problems. The algebraic modeling language discussed was later released as &lt;a href=&quot;https://github.com/IainNZ/JuMP.jl&quot;&gt;JuMP&lt;/a&gt;. Benchmarks are shown evaluating the performance of Julia for implementing low-level optimization code. Optimization software in Julia has been grouped under the &lt;a href=&quot;http://juliaopt.org/&quot;&gt;JuliaOpt&lt;/a&gt; project.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/O1icUP6sajU&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;metaprogramming-and-macros&quot;&gt;Metaprogramming and Macros&lt;/h2&gt;

&lt;p&gt;Julia is homoiconic: it represents its own code as a data structure of the language itself. Since code is represented by objects that can be created and manipulated from within the language, it is possible for a program to transform and generate its own code. &lt;a href=&quot;http://docs.julialang.org/en/release-0.1/manual/metaprogramming/&quot;&gt;Metaprogramming&lt;/a&gt; is described in detail in the Julia manual.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/EpNeNCGmyZE&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;parallel-and-distributed-computing-lab-solution&quot;&gt;Parallel and Distributed Computing (&lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/raw/master/NumericalOptimization/tutorial.pdf&quot;&gt;Lab&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/blob/master/NumericalOptimization/Tutorial.jl&quot;&gt;Solution&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;&lt;a href=&quot;http://docs.julialang.org/en/release-0.1/manual/parallel-computing/&quot;&gt;Parallel and distributed computing&lt;/a&gt; have been an integral part of Julia’s capabilities from an early stage. This session describes existing basic capabilities, which can be used as building blocks for higher level parallel libraries.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/JoRn4ryMclc&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;networking&quot;&gt;Networking&lt;/h2&gt;

&lt;p&gt;Julia provides asynchronous networking I/O using the &lt;a href=&quot;https://github.com/joyent/libuv&quot;&gt;libuv&lt;/a&gt; library. Libuv is a portable networking library created as part of the &lt;a href=&quot;http://www.nodejs.org/&quot;&gt;Node.js&lt;/a&gt; project.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/qYjHYTn7r2w&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;

&lt;h2 id=&quot;grid-of-resistors-lab-solution&quot;&gt;Grid of Resistors (&lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/blob/master/GridOfResistors/GridOfResistors.md&quot;&gt;Lab&lt;/a&gt;, &lt;a href=&quot;https://github.com/JuliaLang/julia-tutorial/tree/master/GridOfResistors&quot;&gt;Solution&lt;/a&gt;)&lt;/h2&gt;

&lt;p&gt;The Grid of Resistors is a classic numerical problem to compute the voltages and the effective resistance of a 2n+1 by 2n+2 grid of 1 ohm resistors if a battery is connected to the two center points. As part of this lab, the problem is solved in Julia in a number of different ways such as a vectorized implementation, a devectorized implementation, and using comprehensions, in order to study the performance characteristics of various methods.&lt;/p&gt;

&lt;iframe width=&quot;560&quot; height=&quot;315&quot; src=&quot;http://www.youtube.com/embed/OFWYPqwVtHU&quot; frameborder=&quot;0&quot; allowfullscreen=&quot;&quot;&gt;&lt;/iframe&gt;
</content>
 </entry>
 
 <entry>
   <title>Efficient Aggregates in Julia</title>
   <link href="http://julialang.org/blog/2013/03/efficient-aggregates"/>
   <updated>2013-03-05T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2013/03/efficient-aggregates</id>
   <content type="html">&lt;p&gt;We recently introduced an exciting feature that has been in planning for some
time: immutable aggregate types. In fact, we have been planning to do this
for so long that this feature is the subject of our issue #13 on GitHub,
out of more than 2400 total issues so far.&lt;/p&gt;

&lt;p&gt;Essentially, this feature drastically reduces the overhead of user-defined
types that represent small number-like values, or that wrap a small number
of other objects. Consider an RGB pixel type:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;immutable Pixel
    r::Uint8
    g::Uint8
    b::Uint8
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Instances of this type can now be packed efficiently into arrays, using
exactly 3 bytes per object. In all other respects, these objects continue
to act like normal first-class objects. To see how we might use
this, here is a function that converts an RGB image in standard 24-bit
framebuffer format to grayscale:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;function rgb2gray!(img::Array{Pixel})
    for i=1:length(img)
        p = img[i]
        v = uint8(0.30*p.r + 0.59*p.g + 0.11*p.b)
        img[i] = Pixel(v,v,v)
    end
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This code will run blazing fast, performing no memory allocation. We
have not done thorough benchmarking, but this is in fact likely to be the
fastest way to write this function in Julia from now on.&lt;/p&gt;

&lt;p&gt;The key to this behavior is the new &lt;code class=&quot;highlighter-rouge&quot;&gt;immutable&lt;/code&gt; keyword, which means
instances of the type cannot be modified. At first this sounds like
a mere restriction — how come I’m not allowed to modify one? — but
what it really means is that the object is identified with its contents,
rather than its memory address. A mutable object has “behavior”; it changes
over time, and there may be many references to the object, all of which
can observe those changes. An immutable object, on the other hand, has only
a value, and no time-varying behavior. Its location does not matter. It is
“just some bits”.&lt;/p&gt;

&lt;p&gt;Julia has always had some immutable values, in the form of bits types,
which are used to represent fixed-bit-width numbers. It is highly intuitive
that numbers are immutable. If &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt; equals 2, you might later change the value
of &lt;code class=&quot;highlighter-rouge&quot;&gt;x&lt;/code&gt;, but it is understood that the value of 2 itself does not change.
The &lt;code class=&quot;highlighter-rouge&quot;&gt;immutable&lt;/code&gt; keyword generalizes this idea to structured data types with
named fields. Julia variables and containers, including arrays, are all
still mutable. While a &lt;code class=&quot;highlighter-rouge&quot;&gt;Pixel&lt;/code&gt; object itself can’t change, a new &lt;code class=&quot;highlighter-rouge&quot;&gt;Pixel&lt;/code&gt;
can be written over an old one within an array, since the array is mutable.&lt;/p&gt;
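
&lt;p&gt;The difference between replacing an element and modifying a value can be seen in a short sketch. It assumes Julia 1.x syntax, where &lt;code class=&quot;highlighter-rouge&quot;&gt;struct&lt;/code&gt; declares an immutable type and &lt;code class=&quot;highlighter-rouge&quot;&gt;UInt8&lt;/code&gt; replaces &lt;code class=&quot;highlighter-rouge&quot;&gt;Uint8&lt;/code&gt;; the variable names are illustrative:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Julia 1.x spelling of the Pixel type (struct is immutable by default)
struct Pixel
    r::UInt8
    g::UInt8
    b::UInt8
end

img = [Pixel(10, 20, 30), Pixel(40, 50, 60)]
img[1] = Pixel(0, 0, 0)      # fine: the array is mutable, so an element can be replaced

# assigning to a field of an immutable value raises an error
field_write_fails = try
    img[2].r = 0x00
    false
catch
    true
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here &lt;code class=&quot;highlighter-rouge&quot;&gt;field_write_fails&lt;/code&gt; ends up &lt;code class=&quot;highlighter-rouge&quot;&gt;true&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;img[2]&lt;/code&gt; is left unchanged.&lt;/p&gt;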

&lt;p&gt;Let’s take a look at the benefits of this feature.&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;
    &lt;p&gt;The compiler and GC have a lot of freedom to move and copy these objects
around. This flexibility can be used to store data more efficiently,
for example keeping the real and imaginary parts of a complex number in
separate registers, or keeping only one part in a register.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Immutable objects are easy to reason about. Some languages, such as C++
and C#, provide “value types”, which have many of the benefits of immutable
objects. However, their behavior can be confusing. Consider code like
the following:&lt;/p&gt;

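    &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;item = lookup(collection, index)
modify!(item)
&lt;/code&gt;&lt;/pre&gt;
    &lt;/div&gt;

    &lt;p&gt;The question here is whether we have modified the same &lt;code class=&quot;highlighter-rouge&quot;&gt;item&lt;/code&gt; that is in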
the collection, or if we have modified a local copy. In Julia there are
only two possibilities: either &lt;code class=&quot;highlighter-rouge&quot;&gt;item&lt;/code&gt; is mutable, in which case we modified the
one and only copy of it, or it is immutable, in which case modifying it is
not allowed.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;No-overhead data abstractions become possible. It is often useful to
define a new type that simply wraps a single value, and modifies its
behavior in some way. Our favorite modular integer example type fits this
description:&lt;/p&gt;

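    &lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;immutable ModInt{n} &amp;lt;: Integer
    k::Int
    ModInt(k) = new(mod(k,n))
end
&lt;/code&gt;&lt;/pre&gt;
    &lt;/div&gt;

    &lt;p&gt;Since a given &lt;code class=&quot;highlighter-rouge&quot;&gt;ModInt&lt;/code&gt; doesn’t need to exist at a particular address, it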
can be passed to functions, stored in arrays, and so on, as efficiently as
a single &lt;code class=&quot;highlighter-rouge&quot;&gt;Int&lt;/code&gt;, with no wrapping overhead. But, in Julia, the overhead will not
&lt;em&gt;always&lt;/em&gt; be zero. The &lt;code class=&quot;highlighter-rouge&quot;&gt;ModInt&lt;/code&gt; type information will “follow the data around”
at compile time to the extent possible, but heap-allocated wrappers will be
added as needed at run time. Typically these wrappers will be short-lived;
if the final destination of a &lt;code class=&quot;highlighter-rouge&quot;&gt;ModInt&lt;/code&gt; is in a &lt;code class=&quot;highlighter-rouge&quot;&gt;ModInt&lt;/code&gt; array, for example,
the wrapper can be discarded when the value is assigned. But if the value is
only used locally inside a function, there will most likely be no wrappers
at all.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;Abstractions are fully enforced. If a custom constructor is written for
an immutable type, then all instances will be created by it. Since the
constructed objects are never modified, the invariants provided by the
constructor cannot be violated. At this time, uninitialized arrays are an
exception to this rule. New arrays of “plain data” immutable types have
unspecified contents, so it is possible to obtain an invalid value from one.
This is usually harmless in practice, since arrays must be initialized anyway,
and are often created through functions like &lt;code class=&quot;highlighter-rouge&quot;&gt;zeros&lt;/code&gt; that do so.&lt;/p&gt;
  &lt;/li&gt;
  &lt;li&gt;
    &lt;p&gt;We can automatically type-specialize fields. Since field values at
construction time are final, their types are too, so we learn everything
about the type of an immutable object when it is constructed.&lt;/p&gt;
  &lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;There are many potential optimizations here, and we have not implemented
all of them yet. But having this feature in place provides another lever to
help us improve performance over time.&lt;/p&gt;

&lt;p&gt;For now though, we at least have a much simpler implementation of complex
numbers, and will be able to take advantage of efficient rational matrices
and other similar niceties.&lt;/p&gt;

&lt;h2 id=&quot;addendum-under-the-hood&quot;&gt;Addendum: Under the hood&lt;/h2&gt;

&lt;p&gt;For purposes of calling C and writing reflective code, it helps to know a
bit about how immutable types are implemented. Before this change, we had
types &lt;code class=&quot;highlighter-rouge&quot;&gt;AbstractKind&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;BitsKind&lt;/code&gt;, and &lt;code class=&quot;highlighter-rouge&quot;&gt;CompositeKind&lt;/code&gt;, for separating which
types are abstract, which are represented by immutable bit strings, and which
are mutable aggregates. It was sometimes convenient that the type system
reflected these differences, but also a bit unwarranted since all these
types participate in the same hierarchy and follow the same subtyping rules.&lt;/p&gt;

&lt;p&gt;Now, the type landscape is both simpler and more complex. The three Kinds
have been merged into a single kind called &lt;code class=&quot;highlighter-rouge&quot;&gt;DataType&lt;/code&gt;. The type of every
value in Julia is now either a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataType&lt;/code&gt;, or else a tuple type (union types
still exist, but of course are always abstract). To find out the details
of a &lt;code class=&quot;highlighter-rouge&quot;&gt;DataType&lt;/code&gt;’s physical representation, you must query its properties.
&lt;code class=&quot;highlighter-rouge&quot;&gt;DataType&lt;/code&gt;s have three boolean properties &lt;code class=&quot;highlighter-rouge&quot;&gt;abstract&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;mutable&lt;/code&gt;, and
&lt;code class=&quot;highlighter-rouge&quot;&gt;pointerfree&lt;/code&gt;, and an integer property &lt;code class=&quot;highlighter-rouge&quot;&gt;size&lt;/code&gt;. The &lt;code class=&quot;highlighter-rouge&quot;&gt;CompositeKind&lt;/code&gt; properties
&lt;code class=&quot;highlighter-rouge&quot;&gt;names&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;types&lt;/code&gt; are still there to describe fields.&lt;/p&gt;

&lt;p&gt;The &lt;code class=&quot;highlighter-rouge&quot;&gt;abstract&lt;/code&gt; property indicates that the type was declared with the
&lt;code class=&quot;highlighter-rouge&quot;&gt;abstract&lt;/code&gt; keyword and has no direct instances. &lt;code class=&quot;highlighter-rouge&quot;&gt;mutable&lt;/code&gt; indicates, for
concrete types, whether instances are mutable. &lt;code class=&quot;highlighter-rouge&quot;&gt;pointerfree&lt;/code&gt; means that
instances contain “just data” and no references to other Julia values.
&lt;code class=&quot;highlighter-rouge&quot;&gt;size&lt;/code&gt; gives the size of an instance in bytes.&lt;/p&gt;

&lt;p&gt;What used to be &lt;code class=&quot;highlighter-rouge&quot;&gt;BitsKind&lt;/code&gt;s are now &lt;code class=&quot;highlighter-rouge&quot;&gt;DataType&lt;/code&gt;s that are immutable, concrete,
have no fields, and have non-zero size. The former &lt;code class=&quot;highlighter-rouge&quot;&gt;CompositeKind&lt;/code&gt;s are
mutable and concrete, and either have fields or are zero size if they
have zero fields. Clearly, new combinations are now possible. We have
already mentioned immutable types with fields. We could have the equivalent
of mutable &lt;code class=&quot;highlighter-rouge&quot;&gt;BitsKind&lt;/code&gt;s, but this combination is not exposed in the language,
since it is easily emulated using mutable fields. Another new combination
is abstract types with fields, which would allow you to declare that all
subtypes of some abstract type should have certain fields. That one is
definitely useful, and we plan to provide syntax for it.&lt;/p&gt;

&lt;p&gt;Typically, the only time you need to worry about these things
is when calling native code, when you want to know whether some array
or struct has C-compatible data layout. This is handled by the type
predicate &lt;code class=&quot;highlighter-rouge&quot;&gt;isbits(T)&lt;/code&gt;.&lt;/p&gt;
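
&lt;p&gt;In Julia 1.x the type-level predicate is spelled &lt;code class=&quot;highlighter-rouge&quot;&gt;isbitstype(T)&lt;/code&gt;, while &lt;code class=&quot;highlighter-rouge&quot;&gt;isbits(x)&lt;/code&gt; takes a value. A minimal sketch, assuming that version and restating the &lt;code class=&quot;highlighter-rouge&quot;&gt;Pixel&lt;/code&gt; type with current syntax:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;# Julia 1.x spelling of the Pixel type from the start of this post
struct Pixel
    r::UInt8
    g::UInt8
    b::UInt8
end

isbitstype(Pixel)                   # true: plain data, no references
sizeof(Pixel)                       # 3 bytes per instance
sizeof(fill(Pixel(0, 0, 0), 4))     # 12 bytes: elements are stored inline
isbitstype(Vector{Pixel})           # false: an array is a mutable object
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;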
</content>
 </entry>
 
 <entry>
   <title>Design and implementation of Julia</title>
   <link href="http://julialang.org/blog/2012/08/design-and-implementation-of-julia"/>
   <updated>2012-08-16T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/08/design-and-implementation-of-julia</id>
   <content type="html">&lt;p&gt;We describe the design and implementation of Julia in our first paper - &lt;a href=&quot;/images/julia-dynamic-2012-tr.pdf&quot;&gt;Julia: A Fast Dynamic Language for Technical Computing&lt;/a&gt;. This is work in progress and comments are appreciated.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>New York Open Stats Meetup</title>
   <link href="http://julialang.org/blog/2012/04/nyc-open-stats-meetup-announcement"/>
   <updated>2012-04-18T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/04/nyc-open-stats-meetup-announcement</id>
   <content type="html">&lt;p&gt;I’ll be giving a talk on Julia at the &lt;a href=&quot;http://www.meetup.com/nyhackr/events/60839932/&quot;&gt;New York Open Statistical Programming Meetup on May 1st&lt;/a&gt;. After my presentation, &lt;a href=&quot;http://www.johnmyleswhite.com/&quot;&gt;John Myles White&lt;/a&gt; and &lt;a href=&quot;http://www.statalgo.com/&quot;&gt;Shane Conway&lt;/a&gt; are going to give followup demos of statistical applications using Julia. Then we’re going to hang out and grab drinks nearby. Thanks to &lt;a href=&quot;http://www.harlan.harris.name/&quot;&gt;Harlan Harris&lt;/a&gt; and &lt;a href=&quot;http://www.drewconway.com/&quot;&gt;Drew Conway&lt;/a&gt; for setting the whole thing up!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Announcement:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;After a brief hiatus, we are very excited to announce our May meetup will feature one of the hottest new languages in statistical computing: Julia.  We are delighted to welcome Stefan Karpinski, one of the creators of Julia, to give an introduction to the language and his perspective on statistical computing.&lt;/p&gt;

&lt;p&gt;Julia is a general-purpose, high-level, dynamic language in the tradition of Lisp, Perl, Python and Ruby. It is designed to take advantage of modern techniques for executing dynamic languages with statically-compiled performance. As part of this design, the language has an expressive type system, which programmers may leverage for dispatch and error checking, incidentally providing the compiler with useful type information. Using types is entirely optional, however: “typeless Julia” is a valid and useful subset of the language, similar to traditional dynamic languages, which nevertheless runs at statically compiled speeds.&lt;/p&gt;

&lt;p&gt;Julia is especially good at running Matlab and R-style programs. Given its level of performance, we envision a new era of technical computing where libraries can be developed in a high-level language instead of C or Fortran. We have also experimented with cloud API integration, and begun to develop a web-based interactive computing environment. The ultimate goal is to make cloud-based supercomputing as easy and accessible as Google Docs.&lt;/p&gt;

&lt;p&gt;We will also hear from a mix of people who have already started developing in Julia and see some examples of what they have developed.&lt;/p&gt;

&lt;p&gt;The meetup will follow our typical schedule: pizza will begin at 6:15pm, Stefan will begin promptly at 7pm, and we will head to The Central Bar around 8:30pm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; You can see the slides for the talk &lt;a href=&quot;/images/nyhackr.pdf&quot;&gt;here&lt;/a&gt;. There was no video of the talk, but hopefully the slides are informative: among other things, they contain a lot of code examples that should just work if pasted into the Julia REPL.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Lang.NEXT Announcement</title>
   <link href="http://julialang.org/blog/2012/03/lang-next-talk-announcement"/>
   <updated>2012-03-24T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/03/lang-next-talk-announcement</id>
   <content type="html">&lt;p&gt;Jeff and I will be giving a &lt;a href=&quot;http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Julia&quot;&gt;presentation on Julia&lt;/a&gt; at the upcoming &lt;a href=&quot;http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012&quot;&gt;Lang.NEXT conference&lt;/a&gt;, a gathering of “programming language design experts and enthusiasts” featuring “talks, panels and discussion on leading programming language work from industry and research.”
We are honored and excited to have been invited to speak at an event alongside so many programming language luminaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Julia is a dynamic language in the tradition of Lisp, Perl, Python and Ruby. It aims to advance expressiveness and convenience for scientific and technical computing beyond that of environments like Matlab and NumPy, while simultaneously closing the performance gap with compiled languages like C, C++, Fortran and Java.&lt;/p&gt;

&lt;p&gt;Most high-performance dynamic language implementations have taken an existing interpreted language and worked to accelerate its execution. In creating Julia, we have reconsidered the basic language design, taking into account the capabilities of modern JIT compilers and the specific needs of technical computing. Our design includes:&lt;/p&gt;

&lt;ul&gt;
  &lt;li&gt;Multiple dispatch as the core language paradigm.&lt;/li&gt;
  &lt;li&gt;Exposing a sophisticated type system including parametric dependent types.&lt;/li&gt;
  &lt;li&gt;Dynamic type inference to generate fast code from programs with no declarations.&lt;/li&gt;
  &lt;li&gt;Aggressive specialization of generated code for types encountered at run-time.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Julia feels light and natural for data exploration and algorithm prototyping, but has performance that lets you deploy your prototypes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; You can see the slides for our talk &lt;a href=&quot;/images/lang.next.pdf&quot;&gt;here&lt;/a&gt;. Video of the presentation is available &lt;a href=&quot;http://channel9.msdn.com/Events/Lang-NEXT/Lang-NEXT-2012/Julia&quot;&gt;here&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Shelling Out Sucks</title>
   <link href="http://julialang.org/blog/2012/03/shelling-out-sucks"/>
   <updated>2012-03-11T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/03/shelling-out-sucks</id>
   <content type="html">
&lt;p&gt;Spawning a pipeline of connected programs via an intermediate shell — a.k.a. “shelling out” — is a really convenient and effective way to get things done.
It’s so handy that some “&lt;a href=&quot;http://en.wikipedia.org/wiki/Glue_language&quot;&gt;glue languages&lt;/a&gt;,” like &lt;a href=&quot;http://www.perl.org/&quot;&gt;Perl&lt;/a&gt; and &lt;a href=&quot;http://www.ruby-lang.org/&quot;&gt;Ruby&lt;/a&gt;, even have special syntax for it (backticks).
However, shelling out is also a common source of bugs, security holes, unnecessary overhead, and silent failures.
Here are the three reasons why shelling out is problematic:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;&lt;em&gt;&lt;a href=&quot;#Metacharacter+Brittleness&quot;&gt;Metacharacter brittleness.&lt;/a&gt;&lt;/em&gt;
When commands are constructed programmatically, the resulting code is almost always brittle:
if a variable used to construct the command contains any shell metacharacters, including spaces, the command will likely break and do something very different than what was intended — potentially something quite dangerous.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;&lt;a href=&quot;#Indirection+and+Inefficiency&quot;&gt;Indirection and inefficiency.&lt;/a&gt;&lt;/em&gt;
When shelling out, the main program forks and execs a shell process just so that the shell can in turn fork and exec a series of commands with their inputs and outputs appropriately connected.
Not only is starting a shell an unnecessary step, but since the main program is not the parent of the pipeline commands, it cannot be notified when they terminate — it can only wait for the pipeline to finish and hope the shell indicates what happened.&lt;/li&gt;
  &lt;li&gt;&lt;em&gt;&lt;a href=&quot;#Silent+Failures+by+Default&quot;&gt;Silent failures by default.&lt;/a&gt;&lt;/em&gt;
Errors in shelled out commands don’t automatically become exceptions in most languages.
This default leniency leads to code that fails silently when shelled out commands don’t work.
Worse still, because of the indirection problem, there are many cases where the failure of a process in a spawned pipeline &lt;em&gt;cannot&lt;/em&gt; be detected by the parent process, even if errors are fastidiously checked for.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;In the rest of this post, I’ll go over examples demonstrating each of these problems.
At &lt;a href=&quot;#Summary+and+Remedy&quot;&gt;the end&lt;/a&gt;, I’ll talk about better alternatives to shelling out, and in a &lt;a href=&quot;http://julialang.org/blog/2013/04/put-this-in-your-pipe&quot;&gt;followup post&lt;/a&gt;, I’ll demonstrate how Julia makes these better alternatives dead simple to use.
Examples below are given in Ruby which shells out to &lt;a href=&quot;http://www.gnu.org/software/bash/&quot;&gt;Bash&lt;/a&gt;, but the same problems exist no matter what language one shells out from:
it’s the technique of using an intermediate shell process to spawn external commands that’s at fault, not the language.&lt;/p&gt;

&lt;h2 id=&quot;metacharacter-brittleness&quot;&gt;Metacharacter Brittleness&lt;/h2&gt;

&lt;p&gt;Let’s start with a simple example of shelling out from Ruby.
Suppose you want to count the number of lines containing the string “foo” in all the files under a directory given as an argument.
One option is to write Ruby code that reads the contents of the given directory, finds all the files, opens them and iterates through them looking for the string “foo”.
However, that’s a lot of work and it’s going to be much slower than using a pipeline of standard UNIX commands, which are written in C and heavily optimized.
The most natural and convenient thing to do in Ruby is to shell out, using backticks to capture output:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This expression interpolates the &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; variable into a command, spawns a Bash shell to execute the resulting command, captures the output into a string, and then converts that string to an integer.
The command uses the &lt;code class=&quot;highlighter-rouge&quot;&gt;-print0&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;-0&lt;/code&gt; options to correctly handle strange characters in file names piped from &lt;code class=&quot;highlighter-rouge&quot;&gt;find&lt;/code&gt; to &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt; (these options cause file names to be delimited by &lt;a href=&quot;http://en.wikipedia.org/wiki/Null_character&quot;&gt;NULs&lt;/a&gt; instead of whitespace).
Even with extra-careful options, this code for shelling out is simple and clear.
Here it is in action:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):001:0&amp;gt; dir=&quot;src&quot;
=&amp;gt; &quot;src&quot;
irb(main):002:0&amp;gt; `find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
=&amp;gt; 5
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Great.
However, this only works as expected if the directory name &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; doesn’t contain any characters that the shell considers special.
For example, the shell decides what constitutes a single argument to a command using whitespace.
Thus, if the value of &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; is a directory name containing a space, this will fail:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):003:0&amp;gt; dir=&quot;source code&quot;
=&amp;gt; &quot;source code&quot;
irb(main):004:0&amp;gt; `find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
find: `source': No such file or directory
find: `code': No such file or directory
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The simple solution to the problem of spaces is to surround the interpolated directory name in quotes, telling the shell to treat spaces inside as normal characters:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):005:0&amp;gt; `find '#{dir}' -type f -print0 | xargs -0 grep foo | wc -l`.to_i
=&amp;gt; 5
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Excellent.
So what’s the problem?
While this solution addresses the issue of file names with spaces in them, it is still brittle with respect to other shell metacharacters.
What if a file name has a quote character in it?
Let’s try it.
First, let’s create a very weirdly named directory:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;bash-3.2$ mkdir &quot;foo'bar&quot;
bash-3.2$ echo foo &amp;gt; &quot;foo'bar&quot;/test.txt
bash-3.2$ ls -ld foo*bar
drwxr-xr-x 3 stefan staff 102 Feb  3 16:17 foo'bar/
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;That’s an admittedly strange directory name, but it’s perfectly legal in UNIXes of all flavors.
Now back to Ruby:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):006:0&amp;gt; dir=&quot;foo'bar&quot;
=&amp;gt; &quot;foo'bar&quot;
irb(main):007:0&amp;gt; `find '#{dir}' -type f -print0  | xargs -0 grep foo | wc -l`.to_i
sh: -c: line 0: unexpected EOF while looking for matching `''
sh: -c: line 1: syntax error: unexpected end of file
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Doh.
Although this may seem like an unlikely corner case that one needn’t realistically worry about, there are serious security ramifications.
Suppose the name of the directory came from an untrusted source — like a web submission, or an argument to a setuid program from an untrusted user.
Suppose an attacker could arrange for any value of &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; they wanted:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):008:0&amp;gt; dir=&quot;foo'; echo MALICIOUS ATTACK 1&amp;gt;&amp;amp;2; echo '&quot;
=&amp;gt; &quot;foo'; echo MALICIOUS ATTACK 1&amp;gt;&amp;amp;2; echo '&quot;
irb(main):009:0&amp;gt; `find '#{dir}' -type f -print0  | xargs -0 grep foo | wc -l`.to_i
find: `foo': No such file or directory
MALICIOUS ATTACK
grep:  -type f -print0
: No such file or directory
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Your box is now owned.
Of course, you could sanitize the value of the &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; variable, but there’s a fundamental tug-of-war between security (as limited as possible) and flexibility (as unlimited as possible).
The ideal behavior is to allow any directory name, no matter how bizarre, as long as it actually exists, but “defang” all shell metacharacters.&lt;/p&gt;

&lt;p&gt;The only way to fully protect against these sorts of metacharacter attacks, whether malicious or accidental, while still using an external shell to construct the pipeline is to do full shell metacharacter escaping:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):010:0&amp;gt; require 'shellwords'
=&amp;gt; true
irb(main):011:0&amp;gt; `find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
find: `foo\'; echo MALICIOUS ATTACK 1&amp;gt;&amp;amp;2; echo \'': No such file or directory
=&amp;gt; 0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With shell escaping, this safely attempts to search a very oddly named directory instead of executing the malicious attack.
Although shell escaping does work (assuming that there aren’t any mistakes in the shell escaping implementation), realistically, no one actually bothers — it’s too much trouble.
Instead, code that shells out with programmatically constructed commands is typically riddled with potential bugs in the best case and massive security holes in the worst case.&lt;/p&gt;
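&lt;p&gt;To make the effect of escaping concrete, here is a minimal sketch using the same &lt;code class=&quot;highlighter-rouge&quot;&gt;Shellwords&lt;/code&gt; module from Ruby’s standard library. The hostile value of &lt;code class=&quot;highlighter-rouge&quot;&gt;dir&lt;/code&gt; is a made-up example, and the round trip through &lt;code class=&quot;highlighter-rouge&quot;&gt;printf&lt;/code&gt; assumes a POSIX-style shell. It is an illustration of the idea rather than a complete defense.&lt;/p&gt;

```ruby
require "shellwords"

# Hypothetical hostile input, modeled on the attack string used above.
dir = "foo'; echo MALICIOUS ATTACK; echo '"

# Shellwords.escape puts a backslash in front of every character that the
# shell could otherwise treat specially (quotes, semicolons, spaces, ...).
escaped = Shellwords.escape(dir)
puts escaped

# Round trip: the shell now sees a single literal argument and passes it to
# printf unchanged, so nothing inside the string is executed as a command.
echoed = `printf %s #{escaped}`
puts echoed == dir
```

&lt;p&gt;The escaped string is safe to interpolate into a command line, but only if every interpolated value gets the same treatment, which is exactly the discipline that tends to be skipped in practice.&lt;/p&gt;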

&lt;h2 id=&quot;indirection-and-inefficiency&quot;&gt;Indirection and Inefficiency&lt;/h2&gt;

&lt;p&gt;If we were using the above code to count the number of lines with the string “foo” in a directory, we would want to check to see if everything worked and respond appropriately if something went wrong.
In Ruby, you can check if a shelled out command was successful using the bizarrely named &lt;code class=&quot;highlighter-rouge&quot;&gt;$?.success?&lt;/code&gt; indicator:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):012:0&amp;gt; dir=&quot;src&quot;
=&amp;gt; &quot;src&quot;
irb(main):013:0&amp;gt; `find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
=&amp;gt; 5
irb(main):014:0&amp;gt; $?.success?
=&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Ok, that correctly indicates success.
Let’s make sure that it can detect failure:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):015:0&amp;gt; dir=&quot;nonexistent&quot;
=&amp;gt; &quot;nonexistent&quot;
irb(main):016:0&amp;gt; `find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
find: `nonexistent': No such file or directory
=&amp;gt; 0
irb(main):017:0&amp;gt; $?.success?
=&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Wait. What?!
That wasn’t successful.
What’s going on?&lt;/p&gt;

&lt;p&gt;The heart of the problem is that when you shell out, the commands in the pipeline are not immediate children of the main program, but rather its grandchildren:
the program spawns a shell, which makes a bunch of UNIX pipes, forks child processes, connects inputs and outputs to pipes using the &lt;a href=&quot;https://developer.apple.com/library/IOs/#documentation/System/Conceptual/ManPages_iPhoneOS/man2/dup2.2.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;dup2&lt;/code&gt; system call&lt;/a&gt;, and then execs the appropriate commands.
As a result, your main program is not the parent of the commands in the pipeline, but rather, their grandparent.
Therefore, it doesn’t know their process IDs, nor can it wait on them or get their exit statuses when they terminate.
The shell process, which is their parent, has to do all of that.
Your program can only wait for the shell to finish and see if &lt;em&gt;that&lt;/em&gt; was successful.
If the shell is only executing a single command, this is fine:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):018:0&amp;gt; `cat /dev/null`
=&amp;gt; &quot;&quot;
irb(main):019:0&amp;gt; $?.success?
=&amp;gt; true
irb(main):020:0&amp;gt; `cat /dev/nada`
cat: /dev/nada: No such file or directory
=&amp;gt; &quot;&quot;
irb(main):021:0&amp;gt; $?.success?
=&amp;gt; false
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Unfortunately, by default the shell is quite lenient about what it considers to be a successful pipeline:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):022:0&amp;gt; `cat /dev/nada | sort`
cat: /dev/nada: No such file or directory
=&amp;gt; &quot;&quot;
irb(main):023:0&amp;gt; $?.success?
=&amp;gt; true
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;As long as the last command in a pipeline succeeds — in this case &lt;code class=&quot;highlighter-rouge&quot;&gt;sort&lt;/code&gt; — the entire pipeline is considered a success.
Thus, even when one or more of the earlier programs in a pipeline fails spectacularly, the last command may not, leading the shell to consider the entire pipeline to be successful.
This is probably not what you meant by success.&lt;/p&gt;

&lt;p&gt;Bash’s notion of pipeline success can fortunately be made stricter with the &lt;code class=&quot;highlighter-rouge&quot;&gt;pipefail&lt;/code&gt; option.
This option causes the shell to consider a pipeline successful only if all of its commands are successful:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):024:0&amp;gt; `set -o pipefail; cat /dev/nada | sort`
cat: /dev/nada: No such file or directory
=&amp;gt; &quot;&quot;
irb(main):025:0&amp;gt; $?.success?
=&amp;gt; false
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Since shelling out spawns a new shell every time, this option has to be set for every multi-command pipeline in order to be able to determine its true success status.
Of course, just like shell-escaping every interpolated variable, setting &lt;code class=&quot;highlighter-rouge&quot;&gt;pipefail&lt;/code&gt; at the start of every command is simply something that no one actually does.
Moreover, even with the &lt;code class=&quot;highlighter-rouge&quot;&gt;pipefail&lt;/code&gt; option, your program has no way of determining &lt;em&gt;which&lt;/em&gt; commands in a pipeline were unsuccessful — it just knows that something somewhere went wrong.
While that’s better than silently failing and continuing as if there were no problem, it’s not very helpful for postmortem debugging:
many programs are not as well-behaved as &lt;code class=&quot;highlighter-rouge&quot;&gt;cat&lt;/code&gt; and don’t actually identify themselves or the specific problem when printing error messages before going belly up.&lt;/p&gt;

&lt;p&gt;Given the other problems caused by the indirection of shelling out, it seems like a barely relevant afterthought to mention that execing a shell process just to spawn a bunch of other processes is inefficient.
However, it is a real source of unnecessary overhead:
the main process could just do the work the shell does itself.
Asking the kernel to fork a process and exec a new program is a non-trivial amount of work.
The only reason to have the shell do this work for you is that it’s complicated and hard to get right.
The shell makes it easy.
So programming languages have traditionally relied on the shell to set up pipelines for them, regardless of the additional overhead and problems caused by indirection.&lt;/p&gt;
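&lt;p&gt;As a rough sketch of the plumbing the shell performs, the snippet below wires up a two-command pipeline by hand in Ruby, with no intermediate shell. It assumes a Unix-like system with &lt;code class=&quot;highlighter-rouge&quot;&gt;cat&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;wc&lt;/code&gt; on the path; the variable names are illustrative. &lt;code class=&quot;highlighter-rouge&quot;&gt;Process.spawn&lt;/code&gt; takes care of the fork, the redirection, and the exec. Because the main program is now the direct parent of both commands, it can collect each exit status separately.&lt;/p&gt;

```ruby
# Hand-built equivalent of `cat /dev/nada | wc -l`, without a shell in between.
reader, writer = IO.pipe

# Multi-argument spawn: each string is passed to the program as-is,
# so no shell ever parses these arguments.
cat_pid = Process.spawn("cat", "/dev/nada", out: writer, err: File::NULL)
wc_pid  = Process.spawn("wc", "-l", in: reader, out: File::NULL)

# The parent must close its own copies of the pipe ends;
# otherwise wc would never see end-of-file on its input.
writer.close
reader.close

# As the direct parent, we can wait on each child individually.
_, cat_status = Process.wait2(cat_pid)
_, wc_status  = Process.wait2(wc_pid)

puts "cat succeeded? #{cat_status.success?}"
puts "wc succeeded?  #{wc_status.success?}"
```

&lt;p&gt;Even for two commands there are several steps to get right, such as closing the pipe ends in the parent, which is part of why delegating this work to the shell has been so tempting.&lt;/p&gt;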

&lt;h2 id=&quot;silent-failures-by-default&quot;&gt;Silent Failures by Default&lt;/h2&gt;

&lt;p&gt;Let’s return to our example of shelling out to count “foo” lines.
Here’s the total expression we need to use in order to shell out without being susceptible to metacharacter breakage and so we can actually tell whether the entire pipeline succeeded:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;`set -o pipefail; find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;However, an error isn’t raised by default when a shelled out command fails.
To avoid silent errors, we need to explicitly check &lt;code class=&quot;highlighter-rouge&quot;&gt;$?.success?&lt;/code&gt; after every time we shell out and raise an exception if it indicates failure.
Of course, doing this manually is tedious, and as a result, it largely isn’t done.
The default behavior — and therefore the easiest and most common behavior — is to assume that shelled out commands worked and completely ignore failures.
To make our “foo” counting example well-behaved, we would have to wrap it in a function like so:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;def foo_count(dir)
  n = `set -o pipefail;
       find #{Shellwords.shellescape(dir)} -type f -print0  | xargs -0 grep foo | wc -l`.to_i
  raise(&quot;pipeline failed&quot;) unless $?.success?
  return n
end
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;This function behaves the way we would like it to:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;irb(main):026:0&amp;gt; foo_count(&quot;src&quot;)
=&amp;gt; 5
irb(main):027:0&amp;gt; foo_count(&quot;source code&quot;)
=&amp;gt; 5
irb(main):028:0&amp;gt; foo_count(&quot;nonexistent&quot;)
find: `nonexistent': No such file or directory
RuntimeError: pipeline failed
	from (irb):5:in `foo_count'
	from (irb):13
	from :0
irb(main):029:0&amp;gt; foo_count(&quot;foo'; echo MALICIOUS ATTACK; echo '&quot;)
find: `foo\'; echo MALICIOUS ATTACK; echo \'': No such file or directory
RuntimeError: pipeline failed
	from (irb):5:in `foo_count'
	from (irb):14
	from :0
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;However, this 6-line, 200-character function is a far cry from the clarity and brevity we started with:&lt;/p&gt;

&lt;div class=&quot;highlighter-rouge&quot;&gt;&lt;pre class=&quot;highlight&quot;&gt;&lt;code&gt;`find #{dir} -type f -print0 | xargs -0 grep foo | wc -l`.to_i
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;If most programmers saw the longer, safer version of this in a program, they’d probably wonder why someone was writing such verbose, cryptic code to get something so simple and straightforward done.&lt;/p&gt;

&lt;h2 id=&quot;summary-and-remedy&quot;&gt;Summary and Remedy&lt;/h2&gt;

&lt;p&gt;To sum it up, shelling out is great, but making code that shells out bug-free, secure, and not prone to silent failures requires three things that typically aren’t done:&lt;/p&gt;

&lt;ol&gt;
  &lt;li&gt;Shell-escaping all values used to construct commands&lt;/li&gt;
  &lt;li&gt;Prefixing each multi-command pipeline with “&lt;code class=&quot;highlighter-rouge&quot;&gt;set -o pipefail;&lt;/code&gt;”&lt;/li&gt;
  &lt;li&gt;Explicitly checking for failure after each shelled out command&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The trouble is that after doing all of these things, shelling out is no longer terribly convenient, and the code becomes annoyingly verbose.
In short, shelling out responsibly kind of sucks.&lt;/p&gt;

&lt;p&gt;As is so often the case, the root of all of these problems is relying on a middleman rather than doing things yourself.
If a program constructs and executes pipelines itself, it remains in control of all the subprocesses, can determine their individual exit conditions, automatically handle errors appropriately, and give accurate, comprehensive diagnostic messages when things go wrong.
Moreover, without a shell to interpret commands, there is also no shell to treat metacharacters specially, and therefore no danger of metacharacter brittleness.
&lt;a href=&quot;http://python.org/&quot;&gt;Python&lt;/a&gt; gets this right:
using &lt;a href=&quot;http://docs.python.org/library/os.html#os.popen&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;os.popen&lt;/code&gt;&lt;/a&gt; to shell out is officially deprecated, and the recommended way to call external programs is to use the &lt;a href=&quot;http://docs.python.org/library/subprocess.html&quot;&gt;&lt;code class=&quot;highlighter-rouge&quot;&gt;subprocess&lt;/code&gt;&lt;/a&gt; module, which spawns external programs without using a shell.
Constructing pipelines using &lt;code class=&quot;highlighter-rouge&quot;&gt;subprocess&lt;/code&gt; &lt;a href=&quot;http://docs.python.org/library/subprocess.html#replacing-shell-pipeline&quot;&gt;can be a little verbose&lt;/a&gt;, but it is safe and avoids all the problems that shelling out is prone to.
In my &lt;a href=&quot;/blog/2013/04/put-this-in-your-pipe&quot;&gt;followup post&lt;/a&gt;, I will describe how Julia makes constructing and executing pipelines of external commands as safe as Python’s &lt;code class=&quot;highlighter-rouge&quot;&gt;subprocess&lt;/code&gt; and as convenient as shelling out.&lt;/p&gt;
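&lt;p&gt;For comparison with the Ruby examples above, here is a hedged sketch of the same shell-free approach using &lt;code class=&quot;highlighter-rouge&quot;&gt;Open3.pipeline_r&lt;/code&gt; from Ruby’s standard library. It reuses the &lt;code class=&quot;highlighter-rouge&quot;&gt;foo_count&lt;/code&gt; name from earlier; the error message format is an invention for illustration, and the code assumes &lt;code class=&quot;highlighter-rouge&quot;&gt;find&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;xargs&lt;/code&gt;, &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; and &lt;code class=&quot;highlighter-rouge&quot;&gt;wc&lt;/code&gt; are installed.&lt;/p&gt;

```ruby
require "open3"

# Count lines containing "foo" under dir without spawning a shell.
def foo_count(dir)
  # Each command is an array of literal arguments, so dir is never
  # interpreted by a shell and needs no escaping.
  commands = [
    ["find", dir, "-type", "f", "-print0"],
    ["xargs", "-0", "grep", "foo"],
    ["wc", "-l"],
  ]

  Open3.pipeline_r(*commands) do |stdout, wait_threads|
    count = stdout.read.to_i

    # One wait thread per command: each yields that command's exit status.
    statuses = wait_threads.map { |thread| thread.value }
    failed = commands.zip(statuses).reject { |_, status| status.success? }

    unless failed.empty?
      names = failed.map { |cmd, status| "#{cmd.first} (exit #{status.exitstatus})" }
      raise "pipeline failed: #{names.join(', ')}"
    end

    count
  end
end
```

&lt;p&gt;Calling &lt;code class=&quot;highlighter-rouge&quot;&gt;foo_count(&quot;source code&quot;)&lt;/code&gt; needs no quoting, and a failure reports which commands exited unsuccessfully. One caveat carries over from the &lt;code class=&quot;highlighter-rouge&quot;&gt;pipefail&lt;/code&gt; version: &lt;code class=&quot;highlighter-rouge&quot;&gt;grep&lt;/code&gt; exits with a non-zero status when it finds no matching lines, so a directory with no “foo” lines is also reported as a failure.&lt;/p&gt;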
</content>
 </entry>
 
 <entry>
   <title>Stanford Talk Video</title>
   <link href="http://julialang.org/blog/2012/03/stanford-talk-video"/>
   <updated>2012-03-01T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/03/stanford-talk-video</id>
   <content type="html">&lt;p&gt;Jeff gave his &lt;a href=&quot;/blog/2012/02/talk-announcement/&quot;&gt;previously announced&lt;/a&gt;, invited talk at Stanford yesterday and the video is &lt;a href=&quot;http://ee380.stanford.edu/cgi-bin/videologger.php?target=120229-ee380-300.asx&quot;&gt;available here&lt;/a&gt;.
Congrats, Jeff!&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Stanford Talk Announcement</title>
   <link href="http://julialang.org/blog/2012/02/talk-announcement"/>
   <updated>2012-02-27T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/02/talk-announcement</id>
   <content type="html">&lt;p&gt;I will be speaking about Julia at the
&lt;a href=&quot;http://www.stanford.edu/class/ee380/&quot;&gt;Stanford EE Computer Systems Colloquium&lt;/a&gt;
on Wednesday, February 29 at 4:15PM PST.
The title of the talk is &lt;em&gt;Julia: A Fast Dynamic Language For Technical Computing&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Abstract:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Julia is a general-purpose, high-level, dynamic language, designed from the start to take advantage of techniques for executing dynamic languages at statically-compiled language speeds. As a result the language has a more powerful type system, and generally provides better type information to the compiler.&lt;/p&gt;

  &lt;p&gt;Julia is especially good at running MATLAB and R-style programs. Given its level of performance, we envision a new era of technical computing where libraries can be developed in a high-level language instead of C or FORTRAN. We have also experimented with cloud API integration, and begun to develop a web-based, language-neutral platform for visualization and collaboration. The ultimate goal is to make cloud-based supercomputing as easy and accessible as Google Docs.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Speaker Bio:&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
  &lt;p&gt;Jeff Bezanson has been developing the Julia language for two and a half years with a small distributed team of collaborators. Previously, he worked as a software engineer at Interactive Supercomputing, which developed the Star-P parallel extension to MATLAB. At the company, Jeff was a principal developer of “M#”, an implementation of the MATLAB language running on .NET. He is now a second-year graduate student at MIT. Jeff received an A.B. in Computer Science from Harvard University in 2004, and has experience with applications of technical computing in medical imaging.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The talk will be webcast live.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Edit:&lt;/strong&gt; the video of the talk can be &lt;a href=&quot;http://ee380.stanford.edu/cgi-bin/videologger.php?target=120229-ee380-300.asx&quot;&gt;found here&lt;/a&gt;.&lt;/p&gt;
</content>
 </entry>
 
 <entry>
   <title>Why We Created Julia</title>
   <link href="http://julialang.org/blog/2012/02/why-we-created-julia"/>
   <updated>2012-02-14T00:00:00+00:00</updated>
   <id>http://julialang.org/blog/2012/02/why-we-created-julia</id>
   <content type="html">&lt;p&gt;In short, because we are greedy.&lt;/p&gt;

&lt;p&gt;We are power Matlab users.
Some of us are Lisp hackers.
Some are Pythonistas, others Rubyists, still others Perl hackers.
There are those of us who used Mathematica before we could grow facial hair.
There are those who still can’t grow facial hair.
We’ve generated more R plots than any sane person should.
C is our desert island programming language.&lt;/p&gt;

&lt;p&gt;We love all of these languages;
they are wonderful and powerful.
For the work we do — scientific computing, machine learning, data mining, large-scale linear algebra, distributed and parallel computing — each one is perfect for some aspects of the work and terrible for others.
Each one is a trade-off.&lt;/p&gt;

&lt;p&gt;We are greedy: we want more.&lt;/p&gt;

&lt;p&gt;We want a language that’s open source, with a liberal license.
We want the speed of C with the dynamism of Ruby.
We want a language that’s homoiconic, with true macros like Lisp, but with obvious, familiar mathematical notation like MATLAB.
We want something as usable for general programming as Python,
as easy for statistics as R,
as natural for string processing as Perl,
as powerful for linear algebra as MATLAB,
as good at gluing programs together as the shell.
Something that is dirt simple to learn, yet keeps the most serious hackers happy.
We want it interactive and we want it compiled.&lt;/p&gt;

&lt;p&gt;(Did we mention it should be as fast as C?)&lt;/p&gt;

&lt;p&gt;While we’re being demanding, we want something that provides the distributed power of Hadoop — without the kilobytes of boilerplate Java and XML;
without being forced to sift through gigabytes of log files on hundreds of machines to find our bugs.
We want the power without the layers of impenetrable complexity.
We want to write simple scalar loops that compile down to tight machine code using just the registers on a single CPU.
We want to write &lt;code class=&quot;highlighter-rouge&quot;&gt;A*B&lt;/code&gt; and launch a thousand computations on a thousand machines, calculating a vast matrix product together.&lt;/p&gt;

&lt;p&gt;We never want to mention types when we don’t feel like it.
But when we need polymorphic functions, we want to use generic programming to write an algorithm just once and apply it to an infinite lattice of types;
we want to use multiple dispatch to efficiently pick the best method for all of a function’s arguments, from dozens of method definitions, providing common functionality across drastically different types.
Despite all this power, we want the language to be simple and clean.&lt;/p&gt;

&lt;p&gt;All this doesn’t seem like too much to ask for, does it?&lt;/p&gt;

&lt;p&gt;Even though we recognize that we are inexcusably greedy, we still want to have it all.
About two and a half years ago, we set out to create the language of our greed.
It’s not complete, but it’s time for a 1.0 release — the language we’ve created is called &lt;a href=&quot;/&quot;&gt;Julia&lt;/a&gt;.
It already delivers on 90% of our ungracious demands, and now it needs the ungracious demands of others to shape it further.
So, if you are also a greedy, unreasonable, demanding programmer, we want you to give it a try.&lt;/p&gt;
</content>
 </entry>
 

</feed>
