Today I want to wrap up (for now) my series on multiresolution pyramids by showing you how to make this strange-looking fruit:... read more >>

]]>Today I want to wrap up (for now) my series on multiresolution pyramids by showing you how to make this strange-looking fruit:

This a slightly nontraditional demonstration of a technique called image blending. I first considered writing a blog about image blending when I saw an example recently that was posted inside MathWorks. I asked the engineer who wrote the example, Steve Kuznicki, for more information about the example, and he shared his method and code with me.

The method for blending two images together was based on using Laplacian pyramids. The earliest reference I know about is Burt and Adelson, "A Multiresolution Spline with Application to Image Mosaics," *ACM Transactions on Graphics*, vol. 2, no. 4, October 1983, pp. 217-236.

The code, though, looked a little complicated. I realized that the code was complicated because of underlying limitations in the functional design of `impyramid`, which I wrote about earlier in the series (02-Apr-2019 and 09-Apr-2019).

So, instead of jumping right into image blending, I spent some time first writing about image tiling and revised interfaces for computing and visualization multiresolution and Laplacian pyramids.

Now, I'm ready to come back to image blending.

Because, I think, of the influence of the Burt and Adelson paper, it has become common to demonstrate the technique by blending images of an apple and orange, with the seam between the two images split right down the middle. I found a couple of public-domain images (Big red apple and 20140119Hockenheim1) to use for a demonstration.

apple_url = 'https://blogs.mathworks.com/steve/files/apple.jpg'; A = im2double(imread(apple_url)); orange_url = 'https://blogs.mathworks.com/steve/files/orange.jpg'; B = im2double(imread(orange_url)); subplot(1,2,1) imshow(A) subplot(1,2,2) imshow(B)

Now I will make a mask image to define which part of the blended image will come from image A (the apple).

```
mask_A = zeros(1024,1024);
mask_A(:,1:512) = 1;
clf
imshow(mask_A)
xticks([])
yticks([])
axis on
```

The simplest way to blend the images is to just put the pieces together. We can do that with a little mask arithmetic.

C = (A .* mask_A) + (B .* (1 - mask_A)); clf imshow(C) xticks([]) yticks([])

You can see the sharp edge and contrast at the seam between the two images. The idea behind using Laplacian pyramids is to smooth the seam in a spatial-frequency-dependent way. Here is the procedure:

- Construct Laplacian pyramids for the two images.
- Make a third Laplacian pyramid that is contructed by a mask-based joining of the two original-image Laplacian pyramids at every pyramid level.
- Reconstruct the output image from the blended Laplacian pyramid.

The overall effect is to smooth low spatial frequencies at lot at the seam, but to smooth high spatial frequencies only a little. It's a clever, deceptively simple idea, really.

Here's the first step. (Note: the multiresolution pyramid formed from the mask will be used to combine the Laplacian pyramids at each level.)

mrp_A = multiresolutionPyramid(A); mrp_B = multiresolutionPyramid(B); mrp_mask_A = multiresolutionPyramid(mask_A); lap_A = laplacianPyramid(mrp_A); lap_B = laplacianPyramid(mrp_B);

Next, form the blended Laplacian pyramid.

for k = 1:length(lap_A) lap_blend{k} = (lap_A{k} .* mrp_mask_A{k}) + ... (lap_B{k} .* (1 - mrp_mask_A{k})); end

Finally, reconstruct the output image from the blended Laplacian pyramid. (The code for `reconstructFromLaplacianPyramid` is below.)

C_blended = reconstructFromLaplacianPyramid(lap_blend); imshow(C_blended)

A side-by-side blend is boring, though. Note that the two regions in the mask can have any shape. Here's how I made the blended image that started this post. It is based on a mask with a circular region in the center.

x = linspace(-1,1,1024); y = x'; mask2_A = hypot(x,y) <= 0.5; mrp_mask2_A = multiresolutionPyramid(mask2_A); for k = 1:length(lap_A) lap_blend2{k} = (lap_A{k} .* mrp_mask2_A{k}) + ... (lap_B{k} .* (1 - mrp_mask2_A{k})); end C2_blended = reconstructFromLaplacianPyramid(lap_blend2); imshow(C2_blended)

Doesn't that look yummy?

Functions used above:

function mrp = multiresolutionPyramid(A,num_levels) %multiresolutionPyramid(A,numlevels) % mrp = multiresolutionPyramid(A,numlevels) returns a multiresolution % pyramd from the input image, A. The output, mrp, is a 1-by-numlevels % cell array. The first element of mrp, mrp{1}, is the input image. % % If numlevels is not specified, then it is automatically computed to % keep the smallest level in the pyramid at least 32-by-32. % Steve Eddins % Copyright The MathWorks, Inc. 2019 A = im2double(A); M = size(A,1); N = size(A,2); if nargin < 2 lower_limit = 32; num_levels = min(floor(log2([M N]) - log2(lower_limit))) + 1; else num_levels = min(num_levels, min(floor(log2([M N]))) + 2); end mrp = cell(1,num_levels); smallest_size = [M N] / 2^(num_levels - 1); smallest_size = ceil(smallest_size); padded_size = smallest_size * 2^(num_levels - 1); Ap = padarray(A,padded_size - [M N],'replicate','post'); mrp{1} = Ap; for k = 2:num_levels mrp{k} = imresize(mrp{k-1},0.5,'lanczos3'); end mrp{1} = A; end function lapp = laplacianPyramid(mrp) % Steve Eddins % MathWorks lapp = cell(size(mrp)); num_levels = numel(mrp); lapp{num_levels} = mrp{num_levels}; for k = 1:(num_levels - 1) A = mrp{k}; B = imresize(mrp{k+1},2,'lanczos3'); [M,N,~] = size(A); lapp{k} = A - B(1:M,1:N,:); end lapp{end} = mrp{end}; end function out = reconstructFromLaplacianPyramid(lapp) % Steve Eddins % MathWorks num_levels = numel(lapp); out = lapp{end}; for k = (num_levels - 1) : -1 : 1 out = imresize(out,2,'lanczos3'); g = lapp{k}; [M,N,~] = size(g); out = out(1:M,1:N,:) + g; end end

Get
the MATLAB code

Published with MATLAB® R2019a

Does this line of code make you raise your eyebrows in puzzlement?

c = sum(A,[1 2])

If so, you are likely an experienced MATLAB user who does not make a habit of carefully studying the release notes, line by line, every six months. So, let me tell you about this change to `sum` and other related functions.

I think of `sum` as a *vector-wise* function. Its basic definition is expressed in terms of its operation on a vector: `sum(v)`, where `v` is a vector, returns a scalar that is the sum of the elements of `v`.

v = [1 2 3 4 5 6]; sum(v)

ans = 6 times 7 divided 2, silly.

For `sum` and many, many other functions, MATLAB has a forgiving definition of the term "vector." It can refer to a single column (Px1) or a single row (1xP). The expression `sum(v)` doesn't care.

w = v' sum(w)

w = 1 2 3 4 5 6 ans = you can't fool me, the answer is still 21.

How about `sum` on a matrix? We just love magic squares around here, especially when we're talking about sums.

A = magic(3)

A = 8 1 6 3 5 7 4 9 2

What does `sum(A)` do? Remember what I said earlier - I think of `sum` as a vector-wise function, meaning that fundamentally it operates on vectors. If you pass it a matrix, it will treat that matrix as a **stack of vectors**. Specifically, it will treat the matrix as a stack of column vectors, computing the sum of each of them.

sum(A)

ans = 15 15 15

For the last two decades (and just a bit more), the `sum` function has accepted an optional argument that we call `DIM`. This argument lets you change the way `sum` treats a matrix (or a multidimensional array) as a stack of vectors. For example, the call `sum(A,2)` tells `sum` to treat `A` as a stack of row vectors and compute the sum of each one, result in a column vector containing row sums.

sum(A,2)

ans = 15 15 15

Of course, a magic square has the same row sums as column sums, which is why a magic square was probably a bad choice for this example. However, my employment contract requires that I use `magic(3)` at every opportunity.

Now, suppose I have an RGB image, `F`, which is MxNx3, and I want to compute the sum of each of the three component images, `F(:,:,1)`, `F(:,:,2)`, and `F(:,:,3)`? For most of MATLAB history, this is what we would tell you to do: Compute the column sums and then compute the row sums of the result. (Or, you could do it the other way 'round and get the same answer.)

```
F = imread('peppers.png');
c = sum(sum(F,1),2)
```

c(:,:,1) = 23746472 c(:,:,2) = 13054194 c(:,:,3) = 11331211

OK, suppose I want to compute the sum of **all** the elements of `F`? The historical answer: convert `F` to a single column vector using the expression `F(:)` and then call `sum`.

sum(F(:))

ans = 48131877

But those coding patterns are **SO** R2018a. With the R2018b release, `DIM` can be a vector, and it call also be the string `"all"` (or `'all'`). Computing the sum of the each of the three component images can now be:

sum(F,[1 2])

ans(:,:,1) = 23746472 ans(:,:,2) = 13054194 ans(:,:,3) = 11331211

And computing the sum of all of the elements of the three-dimensional array can be:

sum(F,[1 2 3])

ans = 48131877

or

```
sum(F,"all")
```

ans = 48131877

After you have summed everything in sight that seems to need summing, you can check out these other MATLAB vector-wise functions that also have this new `DIM` behavior: `all`, `any`, `bounds`, `max`, `min`, `mean`, `median`, `mode`, `prod`, `std`, and `var`.

Let me leave you with a little bit of programming advice related to these functions. If you are certain that the input argument you are passing to one of these functions is a vector (because you constructed or computed it yourself), then you can safely use the single-input syntax, like `sum(v)`. Otherwise, I recommend that you always provide the optional `DIM` argument to avoid certain ambiguities associated with MATLAB's forgiving definition of "vector."

Get
the MATLAB code

Published with MATLAB® R2019a

In my April 2 post, I introduced multiresolution pyramids, and I explained my dissatisfaction with the function impyramid, which (sadly) I designed. (Aside: I had that post ready to go on April 1, but I decided that maybe I should wait a day to publish it.) I followed that up... read more >>

]]>In my April 2 post, I introduced *multiresolution pyramids*, and I explained my dissatisfaction with the function `impyramid`, which (sadly) I designed. (Aside: I had that post ready to go on April 1, but I decided that maybe I should wait a day to publish it.) I followed that up on April 9 with an alternative computation method and interface for computing such a pyramid.

Today, I'll introduce a variation called the *Laplacian pyramid*. While the ordinary multiresolution pyramid contains a bunch of images that have been successively lowpass filtered and downsampled, the Laplacian pyramid contains a bunch of images that have essentially been bandpass filtered and downsampled. Burt and Adelson described the Laplacian pyramid as a data structure useful for image compression in "The Laplacian Pyramid as a Compact Image Code," IEEE Transactions on Communications, vol. COM-31, no. 4, April 1983, pp. 532-540.

The i-th image in the Laplacian pyramid is computed by taking the (i+1)-st image in the multiresolution pyramid, expanding it by a factor of 2, and subtracting the result from the i-th multiresolution pyramid image. The only thing to be particulary careful about is to handle the case where the level i image is not exactly twice the size of the level (i+1) image. If you compute the multiresolution pyramid the way I did last time, that can only happen between the first two levels.

Let's try it on the image I used last time.

```
A = imread('https://blogs.mathworks.com/steve/files/IMG_9968.jpg');
imshow(A)
```

First, we need to compute the multiresolution pyramid using the function I presented previously.

mrp = multiresolutionPyramid(im2double(A));

Now we follow the computational recipe that I described in words above.

lapp = cell(size(mrp)); num_levels = numel(mrp); lapp{num_levels} = mrp{num_levels}; for k = 1:(num_levels - 1) A = mrp{k}; B = imresize(mrp{k+1},2,'lanczos3'); [M,N,~] = size(A); lapp{k} = A - B(1:M,1:N,:); end lapp{end} = mrp{end};

Another function, `showLaplacianPyramid` (also at the end of this script), visualizes the pyramid.

showLaplacianPyramid(lapp)

Given a Laplacian pyramid, we can reconstruct the original image by reversing the recipe above.

num_levels = numel(lapp); B = lapp{end}; for k = (num_levels - 1) : -1 : 1 B = imresize(B,2,'lanczos3'); Lk = lapp{k}; [M,N,~] = size(Lk); B = B(1:M,1:N,:) + Lk; end imshow(B)

See the function `reconstructFromLaplacianPyramid` below.

In the Burt and Adelson paper, a key idea was to quantize the level images in the Laplacian pyramid so that they could be stored using fewer bits per pixel (on average). To simulate the effect, I'll round each pixel in the Laplacian level images (except the last one) to the nearest 0.1 (in a normalized range) and reconstruct from that.

for k = 1:(numel(lapp) - 1) lapp_quantized{k} = round(lapp{k},1); end lapp_quantized{numel(lapp)} = lapp{end}; B_coded = reconstructFromLaplacianPyramid(lapp_quantized); imshow(B_coded)

You can see the distortions resulting from this crude quantization method, but the overall image is surprisingly recognizable and even detailed considering how much information we tossed away.

Next time, I plan to discuss the application that I was originally interested when I started writing this series on multiresolution pyramids: image blending.

function lapp = laplacianPyramid(mrp) % Steve Eddins % MathWorks lapp = cell(size(mrp)); num_levels = numel(mrp); lapp{num_levels} = mrp{num_levels}; for k = 1:(num_levels - 1) A = mrp{k}; B = imresize(mrp{k+1},2,'lanczos3'); [M,N,~] = size(A); lapp{k} = A - B(1:M,1:N,:); end lapp{end} = mrp{end}; end function showLaplacianPyramid(p) % Steve Eddins % MathWorks M = size(p{1},1); N = size(p{1},2); stretch_factor = 3; for k = 1:numel(p) Mk = size(p{k},1); Nk = size(p{k},2); Mpad1 = ceil((M - Mk)/2); Mpad2 = M - Mk - Mpad1; Npad1 = ceil((N - Nk)/2); Npad2 = N - Nk - Npad1; if (k < numel(p)) pad_value = -0.1/stretch_factor; else pad_value = 0.4; end A = p{k}; A = padarray(A,[Mpad1 Npad1],pad_value,'pre'); A = padarray(A,[Mpad2 Npad2],pad_value,'post'); p{k} = A; end for k = 1:(numel(p)-1) p{k} = (stretch_factor*p{k} + 0.5); end imshow(imtile(p,'GridSize',[NaN 2],... 'BorderSize',20,'BackgroundColor',[0.3 0.3 0.3])) end function out = reconstructFromLaplacianPyramid(lapp) % Steve Eddins % MathWorks num_levels = numel(lapp); out = lapp{end}; for k = (num_levels - 1) : -1 : 1 out = imresize(out,2,'lanczos3'); g = lapp{k}; [M,N,~] = size(g); out = out(1:M,1:N,:) + g; end end

Get
the MATLAB code

Published with MATLAB® R2019a

Last time, I talked about the function impyramid and how I have been dissatisfied with it. (Confession: I designed it.) Today I want to present an alternative approach to creating a multiresolution pyramid.... read more >>

]]>Last time, I talked about the function `impyramid` and how I have been dissatisfied with it. (Confession: I designed it.) Today I want to present an alternative approach to creating a multiresolution pyramid.

Here are some of my wishes:

- Don't make me think about how many levels to specify. Find a good heuristic rule and apply it by default.
- Compute all the levels at once. Don't make me write a loop or initialize a place to keep the levels.
- Handle all the picky details like how to handle reducing a level with odd dimensions.
- Use a better resampling filter.
- Make the sizes match when you reduce a level and then expand it.

To start off, then, how many levels should we compute by default? Another way to ask that question is: how small do we want to let the last level get? Should we avoid letting the images get smaller than a certain size, such as 32x32? Or should we compute the levels all the way down to a scalar?

I don't know what is best. To answer the question, I would want to experiment with practical applications of multiresolution pyramids to find a heuristic rule that works well. For now, I'll go with a lower size limit of 32x32. Let's tinker with some code based on that rule.

```
A = imread('https://blogs.mathworks.com/steve/files/IMG_9968.jpg');
imshow(A)
```

M = size(A,1); N = size(A,2); [M N]

ans = 1360 1904

lower_limit = 32;

Each level of the pyramid reduces the size by a factor of two, so the `log2` function seems relevant.

log2([M N])

ans = 10.4094 10.8948

ans - log2(lower_limit)

ans = 5.4094 5.8948

If we don't divide by 2 more than the above number of times, then the smallest image size in the pyramid will be no smaller than `lower_limit`. Clearly, the number of size reduction steps has to be an integer.

num_levels = min(floor(log2([M N]) - log2(lower_limit)))

num_levels = 5

To handle degenerate cases smoothly, as well as to make visualizations easier, I want to include the original image as the first level in the pyramid, so ...

num_levels = num_levels + 1

num_levels = 6

Next, let's consider how to avoid odd image size dimensions when reducing each pyramid level. Our example image is 1360x1904, and if you just divide those numbers by 2 five times, you end up with [42.5 59.5]. Let's take the point of view that we want the smallest image in the pyramid to have an integer number of rows and columns. And then let's figure out how much to pad the original image so that we don't end up with any fractional rows and columns anywhere.

smallest_size = [M N] / 2^(num_levels - 1)

smallest_size = 42.5000 59.5000

smallest_size = ceil(smallest_size)

smallest_size = 43 60

padded_size = smallest_size * 2^(num_levels - 1)

padded_size = 1376 1920

I can imagine different ways to pad the original image, each of which has pros and cons. I'm inclined to do something simple, like pad to the right and bottom of the image by replicating the border pixels.

A_padded = padarray(A,padded_size - [M N],'replicate','post');

Now I need to pick a method for shrinking images by a factor of two to form each level. I propose to use `imresize` with the Lanczos3 interpolating kernel. (The `lanczos3` function is at the bottom of this script.

figure fplot(@lanczos3,[-4 4],'LineWidth',1.5) title('Lanczos3 interpolating kernel')

You can see that this is similar to a sinc function but nonzero only in the interval $|x| \leq 3$. The kernel has reasonably good sharpness and antialiasing behavior.

Now we're ready to shrink the original image (as padded) several times to form the multiresolution pyramid.

pyramid = cell(1,num_levels); pyramid{1} = A_padded; for k = 2:num_levels pyramid{k} = imresize(pyramid{k-1},0.5,'lanczos3'); end

One final fix-up step: let's use the original (unpadded) image as level 1. One advantage of doing that is that we don't have to keep track of the original image size as a separate piece of data.

pyramid{1} = A;

The function `multiresolutionPyramid`, included at the bottom of this post, includes the logic and computations that I have described above. I have included another function, `visualizePyramid`, for visualizing the pyramid.

pyramid = multiresolutionPyramid(A); visualizePyramid(pyramid)

Next time, I expect to talk about Laplacian pyramids.

function f = lanczos3(x) % See Graphics Gems, Andrew S. Glasser (ed), Morgan Kaufman, 1990, % pp. 157-158. f = (sin(pi*x) .* sin(pi*x/3) + eps) ./ ((pi^2 * x.^2 / 3) + eps); f = f .* (abs(x) < 3); end function mrp = multiresolutionPyramid(A,num_levels) %multiresolutionPyramid(A,numlevels) % mrp = multiresolutionPyramid(A,numlevels) returns a multiresolution % pyramd from the input image, A. The output, mrp, is a 1-by-numlevels % cell array. The first element of mrp, mrp{1}, is the input image. % % If numlevels is not specified, then it is automatically computed to % keep the smallest level in the pyramid at least 32-by-32. % Steve Eddins % MathWorks A = im2double(A); M = size(A,1); N = size(A,2); if nargin < 2 lower_limit = 32; num_levels = min(floor(log2([M N]) - log2(lower_limit))) + 1; else num_levels = min(num_levels, min(floor(log2([M N]))) + 2); end mrp = cell(1,num_levels); smallest_size = [M N] / 2^(num_levels - 1); smallest_size = ceil(smallest_size); padded_size = smallest_size * 2^(num_levels - 1); Ap = padarray(A,padded_size - [M N],'replicate','post'); mrp{1} = Ap; for k = 2:num_levels mrp{k} = imresize(mrp{k-1},0.5,'lanczos3'); end mrp{1} = A; end function tiles_out = visualizePyramid(p) % Steve Eddins % MathWorks M = size(p{1},1); N = size(p{1},2); for k = 1:numel(p) Mk = size(p{k},1); Nk = size(p{k},2); Mpad1 = ceil((M - Mk)/2); Mpad2 = M - Mk - Mpad1; Npad1 = ceil((N - Nk)/2); Npad2 = N - Nk - Npad1; A = p{k}; A = padarray(A,[Mpad1 Npad1],0.5,'pre'); A = padarray(A,[Mpad2 Npad2],0.5,'post'); p{k} = A; end tiles = imtile(p,'GridSize',[NaN 2],'BorderSize',20,'BackgroundColor',[0.3 0.3 0.3]); imshow(tiles) if nargout > 0 tiles_out = tiles; end end

Get
the MATLAB code

Published with MATLAB® R2019a

There's a function in the Image Processing Toolbox that I'm not especially fond of: impyramid.... read more >>

]]>There's a function in the Image Processing Toolbox that I'm not especially fond of: `impyramid`.

I shouldn't tell you who designed this function, but I'll give you a hint: his initials are "Steve Eddins."

Years ago, we had occasional user requests for creating multiresolution image pyramids, such as Gaussian pyramids and Laplacian pyramids. I thought these requests would be easy to satisfy by `impyramid`, but I was wrong. I was reminded of the problems by some code written by one our Pilot engineers, Steve Kuznicki. Steve made a nice of use `impyramid` to implement a multiresolution image blending method, but weaknesses in the design of `impyramid` made some of the code awkward. Today I want to discuss those weaknesses. In a follow-up post later, I'll demonstrate some possible improvements.

I'll start by explaining multiresolution pyramids. Below an example of one. This form is sometimes called a lowpass pyramid.

The original image is shown in the upper left. The image is lowpass filtered and then subsampled by a factor of 2 in each direction to form the image in the upper right. That process is repeated several times, reaching the tiny version of the original image at the lower right.

So, I think "multiresolution pyramid" and "lowpass pyramid" are both sensible terms, but why is this frequently called a "Gaussian pyramid"? The answer comes from an implementation technique described in Burt, "Fast Filter Transforms for Image Processing," *Computer Graphics and Image Processing*, vol. 16, pp. 20-51, 1981. Burt described a procedure for forming a multiresolution pyramid by filtering at each step with the kernel $h_1(x)$ in Fig. 2 from the paper:

The coefficients of $h_1(x)$ are $1/4 - a/2$, 0.25, and $a$, with typical values for $a$ around 0.4.

As you successively apply this filter at each stage of a multiresolution pyramid, the overall effect becomes similar to filtering with a continuous Gaussian-shaped kernel, and this is where the name *Gaussian pyramid* comes from.

At the time this paper was written, computers were vastly slower than they are now, especially with floating-point computation. The floating-point convolution of a 512x512 image (considered large then, and considered quite small now) with a filter of modest size could take minutes.

That explains part of the significance of Burt's implementation method. Not only is the filter applied at each pyramid stage very small, but when $a = 0.375$, all the filter coefficients are exact binary fractions and so can be implemented by direct processor bit-shift instructions on the integer pixel values. That allowed for reasonably fast implementations on the computers of that time.

When I try to put `impyramid` to good use, or when I have seen other people try to use it, the following problems stand out to me:

- Each call to the function
`impyramid`implements a single level, either a reduction or an expansion. That leaves it up to the caller to handle a lot tedious code for deciding how many levels there should be, keeping track of the size of the original image, and handling cases where the original image dimensions are not powers-of-two multiples of the size of the smallest image in the pyramid. - When you call
`impyramid`to reduce an image by a level, and then you call`impyramid`on the reduced image to expand it, you don't get the same image size that you started with. For example, if you start with a 1360x1904 image, you end up with a 1359x1903 image after reducing and then expanding again. I distinctly remember convincing myself that there was a legit reason for this. Sigh. It just adds a lot more tedious image padding code around many uses of`impyramid`. - As image resizing filters go, the one used by
`impyramid`(described above) is not a very good one for downsampling applications. I understand its use in 1981, but on modern computers, we can do better. - Finally, there needs to be an easier way to visualize the multiresolution pyramid. This would be useful as an aid to general understanding, as well as to debugging specific issues. The
`imtile`function that I discussed on March 17 doesn't quite do it because it automatically resizes each image tile to be the same size.

Extra credit: Do you know what building is in the image above? If so, leave a note in the comments.

Get
the MATLAB code

Published with MATLAB® R2019a

I was noodling around recently with a future blog topic (multiresolution pyramids; stay tuned) when I was reminded of a recent addition to MATLAB: the function imtile, introduced in R2018b.... read more >>

]]>I was noodling around recently with a future blog topic (multiresolution pyramids; stay tuned) when I was reminded of a recent addition to MATLAB: the function `imtile`, introduced in R2018b.

This function has an interesting design history. (Interesting to me, anyway.) The Image Processing Toolbox team has moved several functions into MATLAB over the past few years to support basic image processing workflows in products such as the Deep Learning Toolbox. Examples include `imshow`, `imresize`, and `rgb2gray`. For similar reasons, the team was acting on a request to move the `montage` function into MATLAB. This prompted a review of the function's design.

And, well, we weren't convinced that the `montage` function was what we really wanted to put into MATLAB.

This function is one of the oldest functions in the toolbox, dating back to the initial version 1.0 release in 1993. It's been steadily tinkered with over all those years, but the function we have today still isn't meeting everyone's needs. It is also a little confused about whether it is primarily a display function or a computation function. It tries to do both, but using it as a computation function is awkward.

So, the team took a fresh look at all the sources of feedback we have about `montage`. For example, there are a lot of questions and discussions about `montage` on MATLAB Answers. The team wrote internal design documents summarizing the feedback.

As I reread those documents to prepare this post, I thought that two pains associated with `montage` particularly stood out. The first pain is that, as I mentioned above, it is awkward to get the tiled image that `montage` creates because `montage` doesn't return it directly as an output argument. Instead, you have to get the `CData` property of the image object that `montage` displays on the screen.

The second pain is that it's very difficult to achieve adequate control over the resolution of the tiled image, especially when the tiled image has more pixels than the display.

Following this analysis, the toolbox team recommended a new function, `imtile`, that would act purely as a computation function. It creates and returns a new array of pixels, using exactly the original image pixels - The new function does not display anything. If you want to write a script that tiles images and writes the result to a file, with no intermediate result appearing on the screen, you can easily do it with `imtile`. If you want to display the tiled result, use `imshow`.

Here is a little snippet of code that creates a tiled image out of the three color components of the peppers.png image. The code specifies the size of the border to be used between the tiles, the background color between the tiles, and the size of the tile grid. Then the code displays the tiled image using `imshow`.

rgb = imread('peppers.png'); tiled_components = imtile(rgb,'BorderSize',20,... 'BackgroundColor',[72 162 63]/255,... 'GridSize',[1 3]); imshow(tiled_components)

The array returned by `imtile` can be written directly out to an image file.

```
imwrite(tiled_components,'tiled_components.png')
```

I hope you find that the new function is a useful addition to MATLAB.

Get
the MATLAB code

Published with MATLAB® R2018b

An Image Processing Toolbox user recently reported to us good results for using makecform and applycform to convert colors from sRGB to CMYK. Here's the code they used:... read more >>

]]>An Image Processing Toolbox user recently reported to us good results for using `makecform` and `applycform` to convert colors from sRGB to CMYK. Here's the code they used:

c = makecform('srgb2cmyk'); cmyk_colors = applycform(srgb_colors,c);

The user asked if we could provide formulas for the conversion, but we could not. The conversion process involves several computation steps. One of the steps is a specialized interpolation involving a four-dimensional lookup table, for which there is no formula.

If one is very familiar with the internals of `makecform` and `applycform`, as well as with the organization of code within the toolbox, one could work out the exact steps of the computation. But that's only about four people. So, I thought I would decode the steps here and take you through it.

Before diving in too deep, though, I first want to observe that there is no single standard way to convert to CMYK. That's because the conversion depends strongly on several fundamental factors, including:

- the reflectance properties of a specific set of ink pigments
- how the spread of ink dots on a particular type of paper
- heuristics that control exactly where and how to lay down the black (K) ink
- how to adjust input colors that cannot be exactly reproduced in the output CMYK space

These factors make the conversion depend on the characteristics of a specific device.

If that's the case, though, then what is `applycform` doing? It is performing conversion based on the information contained in two *ICC profiles*, sRGB.icm and swopcmyk.icm. ICC stands for International Color Consortium, and an ICC profile contains computational recipes for converting device colors (or a standard color space such as sRGB) to a common working color space, called a *connection space*. The connection space is either CIEXYZ or CIELAB.

Here's a peek at the contents of sRGB.icm:

```
srgb_profile = iccread('sRGB.icm');
srgb_profile.Header
```

ans = struct with fields: Size: 3124 CMMType: 'appl' Version: '2.1.0' DeviceClass: 'display' ColorSpace: 'RGB' ConnectionSpace: 'XYZ' CreationDate: '03-Dec-1999 17:30:00' Signature: 'acsp' PrimaryPlatform: 'Apple' Flags: 0 IsEmbedded: 0 IsIndependent: 1 DeviceManufacturer: 'IEC ' DeviceModel: 'sRGB' Attributes: 0 IsTransparency: 0 IsMatte: 0 IsNegative: 0 IsBlackandWhite: 0 RenderingIntent: 'perceptual' Illuminant: [0.9642 1 0.8249] Creator: 'HP '

This profile specifies the computations for converting between sRGB and CIEXYZ.

Here's a look at the second profile, swopcmyk.icm:

```
cmyk_profile = iccread('swopcmyk.icm');
cmyk_profile.Header
```

ans = struct with fields: Size: 503780 CMMType: 'appl' Version: '4.0.0' DeviceClass: 'output' ColorSpace: 'CMYK' ConnectionSpace: 'Lab' CreationDate: '14-Apr-2006 02:00:00' Signature: 'acsp' PrimaryPlatform: 'Apple' Flags: 131072 IsEmbedded: 0 IsIndependent: 1 DeviceManufacturer: ' ' DeviceModel: ' ' Attributes: 0 IsTransparency: 0 IsMatte: 0 IsNegative: 0 IsBlackandWhite: 0 RenderingIntent: 'relative colorimetric' Illuminant: [0.9642 1 0.8249] Creator: ' ' ProfileID: [1×16 uint8]

This profile specifies the computations for converting between something called SWOP CMYK and CIELAB. SWOP (Specifications for Web Offset Printing), now part of Idealliance, was a U.S. organization that created common industry specifications that are widely used in the U.S. print processes and materials.

Enough background. To see how the computation is actually performed, we need to dig (deep) into the internals of the struct returned by `makecform`.

```
cform = makecform('srgb2cmyk')
```

cform = struct with fields: c_func: @applyiccsequence ColorSpace_in: 'RGB' ColorSpace_out: 'CMYK' encoding: 'uint16' cdata: [1×1 struct]

Notice the function handle, `@applyiccsequence`. Most of the function handles we're going to see are for "private" toolbox functions. That means they are not directly callable, but you **can** look inside them to see what they're doing. Use `which -all` to see where `applyiccsequence` is located.

which -all applyiccsequence

/Applications/MATLAB_R2018b.app/toolbox/images/colorspaces/private/applyiccsequence.m % Private to colorspaces

Then you can type

edit private/applyiccsequence

to load the MATLAB source file into the editor for your inspection. This particular source file is not very interesting; it just applies a sequence of computational steps. The data used by the function handle is in the `cdata` field of the struct.

cform.cdata.sequence

ans = struct with fields: source: [1×1 struct] fix_black: [1×1 struct] fix_pcs: [1×1 struct] destination: [1×1 struct]

The four fields mentioned represent four steps:

- Convert from the "source" colorspace (sRGB) to the connection colorspace (which is CIEXYZ for the sRGB.icm profile).
- Apply a blackpoint adjustment step that is needed because the two ICC profiles have different versions.
- Convert from CIEXYZ to CIELAB, which is needed because the two profiles are using different connection spaces.
- Convert from CIELAB to the "destination" colorspace (SWOP CMYK).

This step is achieved using the private function `applymattrc_fwd` and parameters stored in the nested `cdata` struct.

cform.cdata.sequence.source

ans = struct with fields: c_func: @applymattrc_fwd ColorSpace_in: 'rgb' ColorSpace_out: 'xyz' encoding: 'double' cdata: [1×1 struct]

mattrc = cform.cdata.sequence.source.cdata.MatTRC

mattrc = struct with fields: RedColorant: [0.4361 0.2225 0.0139] GreenColorant: [0.3851 0.7169 0.0971] BlueColorant: [0.1431 0.0606 0.7141] RedTRC: [1024×1 uint16] GreenTRC: [1024×1 uint16] BlueTRC: [1024×1 uint16]

The "MatTRC" computation applies a "tone-response curve" to each of the input color components, and then it multiplies by a matrix formed from `RedColorant`, `GreenColorant`, and `BlueColorant`. Here are the tone response curves.

x = linspace(0,1,length(mattrc.RedTRC)); subplot(2,2,1) plot(x,double(mattrc.RedTRC)/65535) title('Red TRC') subplot(2,2,2) plot(x,double(mattrc.GreenTRC)/65535) title('Green TRC') subplot(2,2,3) plot(x,double(mattrc.BlueTRC)/65535) title('Blue TRC')

This step is performed using the private function `applyblackpoint` and the parameters in the nested `cdata` struct.

cform.cdata.sequence.fix_black

ans = struct with fields: c_func: @applyblackpoint ColorSpace_in: 'xyz' ColorSpace_out: 'xyz' encoding: 'double' cdata: [1×1 struct]

cform.cdata.sequence.fix_black.cdata

ans = struct with fields: colorspace: 'xyz' upconvert: 1

For these parameters, I can see from reading `applyblackpoint` that the specific computation performed is:

out = (0.0034731 * whitepoint) + (0.9965269 * in)

Here are the specifics:

cform.cdata.sequence.fix_pcs

ans = struct with fields: c_func: @xyz2lab ColorSpace_in: 'xyz' ColorSpace_out: 'lab' encoding: 'double' cdata: [1×1 struct]

cform.cdata.sequence.fix_pcs.cdata

ans = struct with fields: whitepoint: [0.9642 1 0.8249]

The function that is used here, `xyz2lab` is not the documented, user-callable toolbox, but a private version of it. As with the other private functions, you can see it by typing:

edit private/xyz2lab

The whitepoint parameter for the computation is the standard whitepoint value used for ICC profile computations:

```
whitepoint('icc')
```

ans = 0.9642 1.0000 0.8249

This turns out to the most complicated step. Here is the computational function handle and the parameters used:

cform.cdata.sequence.destination.c_func

ans = function_handle with value: @applyclut

cform.cdata.sequence.destination.cdata.luttag

ans = struct with fields: MFT: 4 PreShaper: {[256×1 uint16] [256×1 uint16] [256×1 uint16]} PreMatrix: [3×4 double] InputTables: {[256×1 uint16] [256×1 uint16] [256×1 uint16]} CLUT: [21×21×21×4 uint16] OutputTables: {1×4 cell} PostMatrix: [] PostShaper: []

You really have to step through the execution of `applyclut` in the debugger to see each of about 6 substeps. In this case it turns out that only 2 of the substeps actually have an effect. One is a simple scaling to account for encoding differences between the two ICC profiles:

out = in * (257 / 256);

The second substep that matters is the big one, and it uses this 4-D lookup table:

size(cform.cdata.sequence.destination.cdata.luttag.CLUT)

ans = 21 21 21 4

This is really 4 different 3-D lookup tables packed into one array. The three dimensions of each lookup table correspond to the three components of the CIELAB space (L*, a*, and b*), and the 4 different tables correspond to the four components of the output space (C, M, Y, and K). The `applyclut` file contains a local function, `clutinterp_tet3`, that performs *tetrahedral interpolation*. See my 24-Nov-2006 blog post for a discussion of tetrahedral interpolation for color-space conversion.

All of the other steps do have relatively simple formulas associated with them, but this last step, based on these multidimensional lookup tables, does not.

Now you know, more or less, the computational steps for converting this orange color to SWOP CMYK.

orange_rgb = [215 136 37]/255; clf patch([0 1 1 0 0],[0 0 1 1 0],orange_rgb);

orange_cmyk = applycform(orange_rgb,cform)

orange_cmyk = 0.0286 0.4748 0.8917 0.1153

That's just a tiny bit of cyan, a fair amount of magenta, a lot of yellow, and small amount of black.

Get
the MATLAB code

Published with MATLAB® R2018b

Today's blog post is dedicated to everyone whose eyes aren't as young as they used to be.... read more >>

]]>Today's blog post is dedicated to everyone whose eyes aren't as young as they used to be.

In last week's blog post on pursuit curves, I showed this figure:

If you ran my code exactly as I posted it, though, you probably wouldn't see exactly the same thing. You would see something like this instead:

Do you see the difference? Exactly what you see will vary depending on your computer and your display resolution, but the original image from my blog post has thicker colored lines.

That's because I don't use the default line thickness for my plots.

Let me show you what I mean. Here's a basic example of `plot` from the MATLAB documentation:

x = linspace(-2*pi,2*pi); y1 = sin(x); y2 = cos(x); figure plot(x,y1,x,y2)

The plot above uses the default MATLAB line width of 0.5 points. Here's where I have to wave my hands a little. Because of the way the figure above was captured for display in your browser, the lines probably appear a little thicker than 0.5 points. On a high resolution display, however, the plotted lines are pretty close to 0.5 points thick.

And, to my eyes, that's too thin to see the colors clearly.

So, I like to plot thicker lines, roughly 1.5-2.0 points. You can do that in each call to `plot` by using the `LineWidth` parameter, like this:

```
plot(x,y1,x,y2,'LineWidth',2.0)
```

But then you have to remember to add the `LineWidth` parameter all the time. It turns out that there's a way to get MATLAB to draw all plotted lines thicker by default. Here it is:

```
set(groot,'defaultLineLineWidth',2.0)
```

This odd-looking line of code sets the default `LineWidth` property for `line` objects to 2.0. For a full explanation of the MATLAB system for setting default graphics object properties, see Default Property Values in the MATLAB documentation.

*
Update: Using a thicker line width might only work well for high-DPI monitors. Blog reader Gang Yao pointed out that using 2-pt. lines can make markers indistinct, and I can confirm that observation when not using a high-DPI monitor. After some additional experimentation, I found that using 1.5-pt. lines on a high-DPI monitor works a little better.
*

This default setting does not persist between MATLAB sessions, though. If you want to do this for each MATLAB session, put the call to `set` into a file called `startup.m` that is located in your *user path* folder. Run the `userpath` function to see where that is. Here's mine:

userpath

ans = '/Users/steve/OneDrive - MathWorks/MATLAB'

You can also use the `userpath` function to change your user path location, if you don't like the default location chosen by MATLAB. (That's what I did.)

For more information, see the function reference pages for `startup` and `userpath`.

Do you like thicker plotted lines too? Or is it just me?

Get
the MATLAB code

Published with MATLAB® R2018b

Here's an interesting snippet I learned while working on today's blog post: Mathematician Jacob Bernoulli asked for a logarithmic spiral to be engraved on his tombstone. Here's an example of a such a spiral.... read more >>

]]>Here's an interesting snippet I learned while working on today's blog post: Mathematician Jacob Bernoulli asked for a *logarithmic spiral* to be engraved on his tombstone. Here's an example of a such a spiral.

theta = -6*pi:0.01:6*pi; rho = 1.1.^theta; polarplot(theta,rho) rlim([0 max(rho)])

Sadly, the engravers of his tombstone apparently did not fully understand his request. They engraved a spiral, but it was not logarithmic.

I came across the concept of logarithmic spirals while looking into something called *pursuit curves*. Here's an example, in the form of what is sometimes called the *mice problem*. Start by constructing `P`, a vector of four complex-valued points, so that they form the corners of a square in the complex plane.

P = [0 1 1+1i 1i]; plot(real(P),imag(P),'*') axis equal axis([-.5 1.5 -.5 1.5])

Now imagine that each of these "mice" is chasing the next one, always running directly toward it as is moves. To illustrate the basic idea, let's have the mice take a few fixed-length steps toward their targets.

P = [0 1 1+1i 1i]; step_size = 0.1; Pt = P([2 3 4 1]); V = Pt - P; P = [P ; P + step_size*V]; % Second step Pn = P(end,:); Pt = Pn(end,[2 3 4 1]); V = Pt - Pn; P = [P ; Pn + step_size*V]; % Third step. Pn = P(end,:); Pt = Pn(end,[2 3 4 1]); V = Pt - Pn; P = [P ; Pn + step_size*V]; % Fourth step. Pn = P(end,:); Pt = Pn(end,[2 3 4 1]); V = Pt - Pn; P = [P ; Pn + step_size*V];

Now plot the original positions and steps taken by each mouse.

hold on for k = 1:4 plot(real(P(:,k)),imag(P(:,k))) end hold off

You get the idea, and you can see four spirals start to take shape.

I've written a function, shown at the bottom of this post, that simulates and plots pursuit curves for any collection of starting points (mice). I chose to plot 1000 steps, with each step covering 1/100 of the remaining distance, so that the mice never quite catch their targets. The plotting function also plots the tangent lines every 10 steps. Here's the result for the four-mice-on-a-square starting configuration.

P = [0 1 1+1i 1i]; clf plotPursuitCurves(P) axis([-0.1 1.1 -0.1 1.1])

Logarithmic spiral curves have the interesting property that, when you scale them up by a constant factor, the curve remains the same, but rotated. We can get a sense of this property by zooming in on the plot. Let's zoom in by a factor of 100.

axis([0.495 0.505 0.495 0.505])

You can see that the curve's overall shape has not changed. It's just been rotated.

The function `plotPursuitCurves` lets us explore other starting configurations, such as a triangle.

clf P1 = 0; P2 = 1; P3 = exp(1j*pi/3); plotPursuitCurves([P1 P2 P3]) axis([-0.1 1.1 -0.1 1.0])

Or a hexagon.

clf theta = 0:60:330; P = cosd(theta) + 1i*sind(theta); plotPursuitCurves(P) axis([-1.1 1.1 -1.1 1.1])

We can even tinker with the select of targets. In this next example, the mice start at the vertices of a hexagon, but instead of chasing the next mouse over, they start out chasing **other** mice. We do this by reordering the vertices.

P2 = P([1 3 5 2 4 6]); clf plotPursuitCurves(P2)

Those are some odd-looking curves. If we zoom in we can see why the curves evolve the way they do by examining the tangent lines. We can also see that the curves do appear to settle into a set of regular spirals, although I do not know whether this is a general property of such pursuit curves.

xlim([-0.147 0.103]) ylim([-0.129 0.069])

Do you like to explore mathematical visualizations like this? If you have a favorite, let us know in the comments.

function plotPursuitCurves(P) P = reshape(P,1,[]); N = length(P); d = 0.01; for k = 1:1000 V = P(end,[(2:N) 1]) - P(end,:); P(end+1,:) = P(end,:) + d*V; end hold on Q = P(:,[(2:N) 1]); for k = 1:10:size(P,1) for m = 1:N T = [P(k,m) Q(k,m)]; plot(real(T),imag(T),'LineWidth',0.5,'Color',[.8 .8 .8]) end end for k = 1:N plot(real(P(:,k)),imag(P(:,k))) end hold off axis equal end

Get
the MATLAB code

Published with MATLAB® R2018b

Have you heard of animated PNG (APNG) files? Are you interested in creating them from MATLAB? They are supposed to be an improvement over GIF for making animations that are viewable in a web browser. (If you have an opinion about this, one way or the other, please share your... read more >>

]]>Have you heard of animated PNG (APNG) files? Are you interested in creating them from MATLAB? They are supposed to be an improvement over GIF for making animations that are viewable in a web browser. (If you have an opinion about this, one way or the other, please share your thoughts in the comments.)

Here's a sample image. Do you see the animation in your web browser, or do you see only a still image?

*Public domain image. Credit: Holger Will*

When I look at this image on my Mac, it animates using Safari (version 11.1.2) and Chrome (version 62), but not with the MATLAB browser.

I have to admit that I had never heard of animated PNG files until last week, when I happened to see them mentioned in MATLAB Answers. One of the MATLAB Answers pages where APNG has been discussed is the question, "How can I create animated GIF images in MATLAB?", where Royi suggested using APNG as a superior replacement for GIF.

I was curious, so I started looking into it.

First, a little background. The PNG image format was created in the 1990s, and it was intended as a replacement for GIF. It is an efficient and popular lossless image file format that is universally supported in browsers and other applications that deal with image files. However, GIF has one capability that PNG does not have: animations. For this reason, PNG never replaced GIF for use on web pages, despite some temporary legal restrictions associated with using GIF in the 1990s. (Search for "gif unisys" if you're interested.)

As an image file format, GIF has a couple of significant weaknesses. The first is that a GIF image can have only at most 256 different colors. For animated GIFs, this limitation applies to each frame. The second weakness is that saving an image with transparency uses up one of the 256 colors, and the transparency of each image pixel is either fully opaque or fully transparent. This makes it difficult to make images with smoothly antialiased edges, and it also causes artifacts when a GIF still image or animation using transparency is displayed on top of a different background color than what was originally used to render the image.

The animated PNG format was intended to overcome both of these weaknesses. (Also, unlike GIF, there is no controversy over how to pronounce PNG.)

Here are some key information sources about APNG:

- Wikipedia article
- APNG specification (by Stuart Parmenter, Vladimir Vukicevic, and Andrew Smith)
- APNG web site with history, samples, tutorial, and some code

And here are a few interesting things I've learned:

- An APNG file is a valid PNG file. According to the PNG spec, compliant file readers are required to ignore file chunks that they don't know about. Therefore, a compliant PNG reader that doesn't handle APNG should simply show the first image frame in the file.
- Apparently, the community responsible for maintaining the PNG specification never accepted the APNG spec extension or updated the reference library (libpng) to handle APNG.
- Despite that, APNG has achieved a surprisingly broad level of browser support. According to the Wikipedia page, as of this writing, only Internet Explorer and Microsoft Edge are holdouts.
- Various benchmarks are available online that claim to show that smaller file sizes are achievable with APNG, despite the fact that PNG files are not limited to just 256 colors. I haven't verified it, but I believe this file size improvement is because of the optional interframe differencing feature of APNG.
- There is a fairly straightforward implementation recipe on the Wikipedia page for combining any number of individual PNG files into an APNG file.

Finally, I learned (by accident!) that the Preview app on the Mac is capable of converting an animated GIF file to an APNG file. You can accomplish this by viewing an animated GIF file in Preview and then using the "Export" menu to save it as a PNG. It gets saved with the animation intact.

My 20-Apr-2017 post on the "Eight Queens Problem" included an animated GIF. Here it is as an APNG file, as created by the Preview app. (If you are looking at this blog post in a browser that doesn't support APNG, you'll probably just see an empty 5x5 chessboard.)

In this case, the APNG file is about 2.5 times the size of the GIF file, so perhaps the Preview app isn't fully optimizing the APNG output. Also, the APNG file likely would have been smaller if it had been created directly from the original graphic, because the GIF file appears to use dithering in the chessboard squares.

If you have an interest in APNG files, please let us know in the comments.

Get
the MATLAB code

Published with MATLAB® R2018b