Affine arithmetic was proposed as an alternative that explicitly accounts for dependencies between variables, by representing quantities as *affine forms.* Here's the generic affine form $\hat{x}$, corresponding to some quantity x:

$$\hat{x} = x_0 + x_1\epsilon_1 + x_2\epsilon_2 + \dots + x_n\epsilon_n$$

As we can see, the affine form is a linear combination. The first term, $x_0$, is the central value. The remaining terms are of the form $x_i\epsilon_i$. Each such term represents one source of noise, or uncertainty, that we know about x. In those terms, the $x_i$ are real coefficients, and the $\epsilon_i$ are called *noise symbols*; each can take any value in the interval [-1, 1]. This means the range of an affine form is $[x_0 - r, x_0 + r]$, where $r = \sum_{i=1}^{n} |x_i|$.

In the simplest case, then, one could represent an interval [a, b] as an affine form with a single noise symbol: $\hat{x} = \frac{a+b}{2} + \frac{b-a}{2}\epsilon_1$. In fact, this is how one converts between affine forms and intervals.
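This conversion is straightforward to sketch in code. Below is a minimal illustration, assuming a form is stored as a central value plus a dict mapping noise-symbol indices to coefficients (the class and method names are my own, not from any particular library):

```python
# Minimal sketch of an affine form: center x0 plus noise terms x_i * eps_i,
# where each eps_i ranges over [-1, 1]. Names are illustrative.

class AffineForm:
    def __init__(self, center, noise=None):
        self.center = center
        self.noise = dict(noise or {})  # {noise symbol index: coefficient}

    def radius(self):
        # Total deviation r = sum of |x_i|, since each eps_i lies in [-1, 1].
        return sum(abs(c) for c in self.noise.values())

    def to_interval(self):
        # The range of the form is [x0 - r, x0 + r].
        r = self.radius()
        return (self.center - r, self.center + r)

    @classmethod
    def from_interval(cls, lo, hi, symbol):
        # [a, b] becomes (a+b)/2 + (b-a)/2 * eps_symbol.
        return cls((lo + hi) / 2, {symbol: (hi - lo) / 2})

x = AffineForm.from_interval(13, 27, symbol=1)
print(x.center, x.noise, x.to_interval())  # 20.0 {1: 7.0} (13.0, 27.0)
```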

Things become more interesting, however, when we note that noise symbols can be shared by multiple affine forms. In other words, a symbol such as, say, $\epsilon_1$ is *the same* in all affine forms that contain it, and takes the same value when they are evaluated. All those forms have some source of uncertainty in common, and the shared noise symbol is the way affine arithmetic represents that relation. Those forms are no longer independent, and their dependencies are explicitly handled through the sharing of noise symbols.

For an example of how this is useful, consider two affine forms, $\hat{x} = 20 + 3\epsilon_1 + 4\epsilon_2$ and $\hat{y} = 8 - 4\epsilon_1 + 2\epsilon_3$. Using the formula for ranges above, we know that the value of x is in the interval [13, 27], and the range of y is the interval [2, 14]. So if we were to plot the point (x, y) on a plane, it could be anywhere in the rectangle delimited by the above values (light gray in the picture below). But we know that x and y have two sources of uncertainty each, and that some of their uncertainty is explained by the same source (corresponding to $\epsilon_1$). The shared noise symbol allows us to estimate their joint range, or the area in which the point (x, y) can lie, more precisely, as the darker polygon in the figure below.
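As a quick sanity check, we can enumerate the extreme values of the noise symbols for two hypothetical forms with the ranges above, $\hat{x} = 20 + 3\epsilon_1 + 4\epsilon_2$ and $\hat{y} = 8 - 4\epsilon_1 + 2\epsilon_3$ (coefficients chosen for illustration), and confirm that sharing $\epsilon_1$ rules out parts of the rectangle:

```python
import itertools

# Since the forms are linear in the noise symbols, extremes occur at
# eps_i = -1 or +1, so enumerating corners is enough for this demo.
def x_of(e1, e2):
    return 20 + 3 * e1 + 4 * e2

def y_of(e1, e3):
    return 8 - 4 * e1 + 2 * e3

points = [(x_of(e1, e2), y_of(e1, e3))
          for e1, e2, e3 in itertools.product([-1, 1], repeat=3)]

# Every point stays inside the naive rectangle [13, 27] x [2, 14]...
assert all(13 <= x <= 27 and 2 <= y <= 14 for x, y in points)
# ...but the corner (27, 14) is unreachable: x = 27 forces e1 = +1,
# which caps y at 8 - 4 + 2 = 6.
print(max(y for x, y in points if x == 27))  # 6
```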

Affine arithmetic provides operations on affine forms as well. These can be of two kinds: affine operations, which are linear combinations of their inputs, and non-affine operations. The first kind includes addition, subtraction, negation, multiplication by scalars and combinations of those. The generic formula for linear combinations of two affine forms $\hat{x}$ and $\hat{y}$ is:

$$\alpha\hat{x} + \beta\hat{y} + \zeta = (\alpha x_0 + \beta y_0 + \zeta) + \sum_{i=1}^{n}(\alpha x_i + \beta y_i)\epsilon_i$$

where $\alpha$, $\beta$ and $\zeta$ are real numbers. Obviously, setting appropriate values for one or more of those gives us the elementary operations of addition ($\alpha = \beta = 1$, $\zeta = 0$), subtraction ($\alpha = 1$, $\beta = -1$, $\zeta = 0$) and so forth.

All affine operations are computed as specializations of the above formula, so they introduce no approximation error. Elementary arithmetic identities are also respected, so the results look natural. For instance, it's easy to see that $\hat{x} - \hat{x}$ will be exactly zero. Affine operations are also commutative, associative, etc.
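A sketch of these affine operations, assuming each form is stored as a pair `(center, {noise symbol index: coefficient})` (the function names are illustrative). Note how subtracting a form from itself cancels term by term, something plain interval arithmetic cannot do:

```python
# Affine operations as specializations of alpha*x + beta*y + zeta.

def scale(form, alpha, zeta=0.0):
    # alpha * x + zeta: scale the center and every noise coefficient.
    x0, xs = form
    return (alpha * x0 + zeta, {i: alpha * c for i, c in xs.items()})

def add(f, g):
    # (x0 + y0) + sum((x_i + y_i) * eps_i): shared noise symbols combine.
    x0, xs = f
    y0, ys = g
    return (x0 + y0,
            {i: xs.get(i, 0.0) + ys.get(i, 0.0)
             for i in sorted(set(xs) | set(ys))})

def sub(f, g):
    return add(f, scale(g, -1.0))

x = (20.0, {1: 3.0, 2: 4.0})
print(sub(x, x))  # (0.0, {1: 0.0, 2: 0.0}) -- exactly zero, with no width
```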

Unfortunately, not all useful operations are affine ones. Non-affine operations such as multiplication, division, logarithms and others are less straightforward, and will be presented in the next article in this series.

Most operations in interval arithmetic are simple and intuitive. Some examples are:

- $[a, b] + [c, d] = [a + c, b + d]$;
- $[a, b] - [c, d] = [a - d, b - c]$;
- $-[a, b] = [-b, -a]$;
- $[a, b] \times [c, d] = [\min(ac, ad, bc, bd), \max(ac, ad, bc, bd)]$;
- $[a, b] / [c, d] = [a, b] \times [1/d, 1/c]$, if $0 \notin [c, d]$;
- $\sqrt{[a, b]} = [\sqrt{a}, \sqrt{b}]$, if $a \geq 0$.
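These operations take only a few lines each. Here is a minimal sketch with intervals as `(lo, hi)` tuples (the function names are my own); the square root clamps the input to its valid domain, matching the design choice discussed below:

```python
import math

# Interval arithmetic primitives over (lo, hi) tuples.

def iadd(a, b):
    return (a[0] + b[0], a[1] + b[1])

def isub(a, b):
    return (a[0] - b[1], a[1] - b[0])

def ineg(a):
    return (-a[1], -a[0])

def imul(a, b):
    # The extremes are among the four endpoint products.
    products = [a[0] * b[0], a[0] * b[1], a[1] * b[0], a[1] * b[1]]
    return (min(products), max(products))

def idiv(a, b):
    if b[0] <= 0 <= b[1]:
        raise ZeroDivisionError("divisor interval contains 0")
    return imul(a, (1 / b[1], 1 / b[0]))

def isqrt(a):
    # Operate only on the valid (non-negative) part of the domain.
    lo = max(a[0], 0.0)
    return (math.sqrt(lo), math.sqrt(a[1]))

print(iadd((1, 2), (3, 5)))    # (4, 7)
print(imul((1, 2), (-2, -1)))  # (-4, -1)
```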

Intervals don’t have to be finite. There’s an empty interval, represented by $\varnothing$, and one or both interval limits can be infinite: $[a, +\infty)$, $(-\infty, b]$ or $(-\infty, +\infty)$. Interval arithmetic implementations typically handle invalid domain inputs (such as being asked to compute the square root of an interval containing negative numbers) by operating only on the valid part of the range (this is a design choice, and perhaps it would be preferable to “fail early” with an error).

There are numerous open source implementations of interval arithmetic in various languages, including C++ (part of Boost), MATLAB, Python, and others. There is also an online interval calculator you can play with.

Let’s work through a simple expression using intervals and the above operations. Say $X = [1, 2]$ and $Y = [3, 5]$. Let’s compute $(X + Y) \times (X - Y)$, in steps:

- $X + Y = [1 + 3, 2 + 5] = [4, 7]$;
- $X - Y = [1 - 5, 2 - 3] = [-4, -1]$;
- the candidate endpoint products are $4 \times -4 = -16$, $4 \times -1 = -4$, $7 \times -4 = -28$ and $7 \times -1 = -7$;
- the minimum among those is $-28$ and the maximum is $-4$;
- so $(X + Y) \times (X - Y) = [-28, -4]$.
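The steps above can be checked directly in a few lines of Python (assuming $X = [1, 2]$ and $Y = [3, 5]$ as in the example, with intervals as `(lo, hi)` tuples):

```python
# Step-by-step interval evaluation of (X + Y) * (X - Y).
X, Y = (1, 2), (3, 5)

s = (X[0] + Y[0], X[1] + Y[1])  # X + Y
d = (X[0] - Y[1], X[1] - Y[0])  # X - Y

# Multiplication: take min and max over the four endpoint products.
candidates = [s[0] * d[0], s[0] * d[1], s[1] * d[0], s[1] * d[1]]
result = (min(candidates), max(candidates))

print(s, d, result)  # (4, 7) (-4, -1) (-28, -4)
```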

While interval arithmetic operations are simple (and therefore fast implementations can be written) and mostly intuitive, they suffer from some issues. First, if an interval appears multiple times in the same computation, IA doesn’t know it represents the same quantity. This leads to some results that, from a real arithmetic perspective, are nonsensical. For instance, consider $X = [1, 2]$ above. $X - X = [1 - 2, 2 - 1] = [-1, 1]$, while clearly the correct answer is 0.

This leads to an error explosion problem when the same quantity is used repeatedly in a chain of operations, such as evaluating a polynomial. The end result is likely to be too conservative (i.e., too wide) to be practical. For instance, still considering $X = [1, 2]$, the simple expression $X \times (X - 3)$ leads to $[1, 2] \times [-2, -1] = [-4, -1]$, which is 12 times wider than the real result, $[-2.25, -2]$ (just iterate through a few values of x and you’ll get there).
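That "iterate through a few values" suggestion is easy to automate. A brute-force comparison, assuming $X = [1, 2]$ and the expression $x(x - 3)$:

```python
# Compare interval evaluation of x*(x - 3) on X = [1, 2] with the true
# range of the expression, sampled over many values of x.
X = (1.0, 2.0)

# Interval evaluation, treating the two occurrences of X as independent.
d = (X[0] - 3, X[1] - 3)                           # X - 3 = (-2, -1)
cands = [X[0] * d[0], X[0] * d[1], X[1] * d[0], X[1] * d[1]]
ia_result = (min(cands), max(cands))               # (-4, -1): width 3

# True range: sample x over [1, 2] on a fine grid.
xs = [1.0 + i * 0.001 for i in range(1001)]
vals = [x * (x - 3) for x in xs]
true_result = (min(vals), max(vals))               # (-2.25, -2.0): width 0.25

print(ia_result, true_result)  # (-4.0, -1.0) (-2.25, -2.0)
```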

Several extensions and improvements to interval arithmetic have been proposed to handle these issues. The one I use in my PhD work is affine arithmetic, and I’ll talk about it next in this series.

While computing with ranges and uncertainties has been done for a long time, most people consider Moore’s Interval Analysis book from the 1960s to be the seminal work in the field. That’s one of those books everyone cites but very few people read, as numerous modern introductions exist. Wikipedia has a good starting point, as usual.

The original motivation for the introduction of interval arithmetic was to handle rounding errors introduced during numerical computations, as well as measurement errors in the inputs. Regarding the former, it’s interesting to note that Moore’s work predates the IEEE 754 standard for floating-point arithmetic by a couple of decades.

It turns out that interval arithmetic is useful for many applications besides error handling. It is useful in computer graphics, robotics, and many other areas. One particular application is relevant to my work: robust optimization.

In order to understand robust optimization, let’s take a look at the grossly simplistic function plotted below. Assume we need to find the value of x that maximizes f(x). What’s that value?

It seems pretty obvious, right? In practice, though, the optimal value depends on how precisely you can measure x. In many real-world scenarios, x is estimated by a noisy process, which introduces a degree of error. Let’s say our measurement error is in the [-2, 2] interval; that is, a reading of 50 for x actually means that x could be anywhere from 48 to 52. That means there is no longer a single value for f(x); rather, we have a range of values (an interval).

Now, if we want to find the best **guaranteed** range for f(x), we’re in the field of robust optimization. While there are many ways to define robust optimization, this is a simple and useful one: find the best worst case scenario, or the range of f(x) values with the highest min value (or lowest max value in minimization problems, hence the term min-max optimization). With this definition we see that our desired optimal value changes:

The rectangles in the above chart are made of intervals. In the horizontal axis we have the range of values for x, while the vertical axis corresponds to the resulting range of values for f(x). Using interval arithmetic one can estimate guaranteed bounds for the vertical interval, giving us a worst case scenario value, which is what we’d want to maximize in robust optimization. In this case we’d much rather have x around 85 than x around 20.
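Since the plotted function isn't reproduced here, we can illustrate the idea with a hypothetical stand-in: a function with a tall narrow peak near x = 20 and a broad, flat bump near x = 85 (all values and names below are assumptions for illustration). The worst case over [x - 2, x + 2] is approximated by sampling, standing in for a guaranteed interval bound:

```python
import math

# Hypothetical objective: tall narrow peak near 20, broad bump near 85.
def f(x):
    return (10 * math.exp(-((x - 20) / 1.0) ** 2)
            + 6 * math.exp(-((x - 85) / 10.0) ** 2))

def worst_case(x, noise=2.0, samples=50):
    # Min of f over [x - noise, x + noise], sampled on a grid
    # (a cheap stand-in for a guaranteed interval lower bound).
    return min(f(x + noise * (2 * i / (samples - 1) - 1))
               for i in range(samples))

nominal_best = max(range(101), key=f)           # ignores measurement error
robust_best = max(range(101), key=worst_case)   # min-max: best worst case
print(nominal_best, robust_best)  # 20 85
```

The narrow peak wins the nominal optimization, but with ±2 of noise its worst case collapses, so the robust optimum moves to the broad bump, matching the intuition above.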

There are multiple ways to incorporate intervals into optimization algorithms, the most common and natural one being to use them as a filter in a branch and bound method.
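A toy sketch of that idea, maximizing $f(x) = x(3 - x)$ on [0, 4]: the interval evaluation of f over a box gives a guaranteed upper bound, so any box whose bound falls below the best value found so far can be discarded without further subdivision (all names are illustrative):

```python
# Branch and bound with an interval-arithmetic filter, for maximization.

def f_interval(lo, hi):
    # Interval evaluation of x*(3 - x): 3 - [lo, hi] = [3 - hi, 3 - lo],
    # then multiply via the four endpoint products.
    a, b = 3 - hi, 3 - lo
    cands = [lo * a, lo * b, hi * a, hi * b]
    return min(cands), max(cands)

def branch_and_bound(lo, hi, tol=1e-6):
    best = lo * (3 - lo)                 # value at a sample point
    stack = [(lo, hi)]
    while stack:
        a, b = stack.pop()
        _, upper = f_interval(a, b)
        if upper <= best:                # filter: box cannot beat the incumbent
            continue
        mid = (a + b) / 2
        best = max(best, mid * (3 - mid))
        if b - a > tol:                  # branch: split the surviving box
            stack += [(a, mid), (mid, b)]
    return best

result = branch_and_bound(0.0, 4.0)
print(result)  # 2.25, the true maximum (attained at x = 1.5)
```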

In the next post, we’ll look into interval arithmetic in more detail, including some of its shortcomings.
