What's the problem

I needed a segmentation of an image with a somewhat unusual constraint: each segment has to be a quadrilateral of roughly the same size.

Image of a plane with a possible segmentation
(Image:"XA-VOC Volaris Airlines Airbus A319-132 C/N 2997 'Claudia'" by TDelCoro is licensed under CC BY-SA 2.0

Each quadrilateral is a superpixel: a connected group of pixels with similar intensities. You can imagine how many applications that has (see here; it's 2020 and we are still working on this problem, given its importance).

Now, the key here is the quadrilateral. Superpixel methods only care about grouping the underlying pixels, but here we also deal with a geometric shape. So, instead of outputting an image, we output a mesh (see the sketch after the list), defined by:

  • Vertices: pixel coordinates, each mapping to a quadrilateral corner
  • Faces: in our case, each face has 4 indices, one per vertex
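
To make this concrete, here is a minimal sketch of what such a mesh could look like in C++. The type names are illustrative; the actual superpixel-mesh API may differ.

```cpp
#include <array>
#include <vector>

// Illustrative types only; the actual superpixel-mesh API may differ.
struct Vertex {
  double x;  // pixel column
  double y;  // pixel row
};

// Each face stores 4 indices into the vertex vector, one per quadrilateral corner.
using Face = std::array<int, 4>;

struct Mesh {
  std::vector<Vertex> vertices;
  std::vector<Face> faces;
};
```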

In addition, our goal of roughly equal sizes makes things hard: we need consistent pixel data inside each segment without breaking this requirement. Superpixels are irregular by nature, and forcing them to be (near) square is really the issue.

A solution

So we know we want to find the coordinates of the 4 vertices of each superpixel.

A first attempt would be to stand on the shoulders of giants and use an existing superpixel method, then, once the superpixels are identified, run some polygonal fitting to find the quadrilaterals. But using something existing, first, removes the fun, and second, that extra transform/optimization step removes an opportunity to do things better: the optimal solution may be different when the quadrilateral constraint is enforced from the start.

Instead, let's define the problem as an optimization problem. Why? Note first how superpixels are connected. Say you shift a vertex by some amount. To improve the solution, we will likely have to adjust all the quadrilaterals that use that vertex, which will be 4. But if we adjust those neighbors, we may also need to update the neighbors of the neighbors, and so on.

Algorithm

Being a fan of ceres-solver, I thought this is surely a problem we can model with it. The algorithm is as follows (a sketch of the problem setup comes after the list):

  1. Take the input image and apply some blur. This makes the optimizer's life easier.
  2. Initialize the mesh. We create square superpixels of some size that cover the input image.
  3. Create an optimization problem from that initial mesh and optimize the vertex coordinates.
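
To make step 3 a bit more concrete, here is a rough sketch of how the problem could be assembled with ceres-solver. The `Mesh` type is the one sketched earlier; `DissimilarityCost::Create` and `AreaRegularization::Create` are hypothetical factories returning ceres cost functions (functor sketches appear in "The Math" section below), so treat this as an outline rather than the actual superpixel-mesh code.

```cpp
#include <ceres/ceres.h>
#include <ceres/cubic_interpolation.h>

// Sketch only: one dissimilarity and one regularization residual per face.
// DissimilarityCost::Create and AreaRegularization::Create are hypothetical
// factories returning ceres cost functions (functor sketches further below).
void OptimizeMesh(Mesh& mesh,
                  const ceres::BiCubicInterpolator<ceres::Grid2D<double>>& image,
                  double target_area, double lambda) {
  ceres::Problem problem;
  for (const Face& face : mesh.faces) {
    // Each vertex is a parameter block of 2 doubles (x and y are contiguous).
    double* v0 = &mesh.vertices[face[0]].x;
    double* v1 = &mesh.vertices[face[1]].x;
    double* v2 = &mesh.vertices[face[2]].x;
    double* v3 = &mesh.vertices[face[3]].x;
    problem.AddResidualBlock(DissimilarityCost::Create(image),
                             nullptr, v0, v1, v2, v3);
    problem.AddResidualBlock(AreaRegularization::Create(target_area, lambda),
                             nullptr, v0, v1, v2, v3);
  }

  ceres::Solver::Options options;
  options.max_num_iterations = 100;
  ceres::Solver::Summary summary;
  ceres::Solve(options, &problem, &summary);
}
```

Because shared vertices show up in the residual blocks of every face that touches them, the solver propagates adjustments across neighbouring quadrilaterals automatically, which is exactly the coupling discussed above.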

Try it yourself

Next is a demo that was implemented in C++ and then targeted to the web using WebAssembly.

The image you choose WON'T BE SENT TO ANY SERVER! It will be processed locally on your device.


Code

The following pieces are involved:

  • ceres-solver. The optimizer we use to minimize our cost function (see more below).
  • Eigen. The linear algebra backend we use with ceres. Note that other backends are available for ceres, but Eigen plays well with WebAssembly.
  • superpixel-mesh. The actual code that performs the meshing, written in C++17.
  • emscripten. LLVM-based toolchain to compile our C++ code to WASM.
  • superpixel-mesh-wasm. CMake script and WebIDL definitions to port superpixel-mesh to the browser.

As for the browser portion of this:

  • worker-loader. Webpack loader to easily call web workers as modules. Starting with Webpack 5, workers can be loaded directly without this loader.
  • file-loader. Another loader to copy the WASM file to the output folder.

The Math

First, let's define an error function we want to minimize:

Error(V1, V2, ..., Vn) = ∑ SuperpixelDissimilarity + λ ∑ Regularization

As for the summations, in both cases we have one term per superpixel. Let's review each term:

SuperpixelDissimilarity(V1, V2, V3, V4) = StdDev(I(x))

Here I(x) refers to all pixels in the superpixel. We compute their standard deviation, and that becomes our dissimilarity term: we want to minimize this standard deviation within each superpixel. To normalize for the superpixel area, we use the 4 vertices of each superpixel to compute a homography from its current configuration to a fixed-size square. Here ceres-solver is really handy, as it already ships an image interpolation implementation that we need to sample the remapped pixels.
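
To give a flavour of what such a cost could look like, here is a minimal sketch of a dissimilarity functor using ceres' Grid2D/BiCubicInterpolator. To keep it short, a simple bilinear warp of the 4 corners stands in for the homography described above, and the sample count and names are illustrative rather than the actual superpixel-mesh implementation.

```cpp
#include <ceres/ceres.h>
#include <ceres/cubic_interpolation.h>

// Sketch only: sample a fixed grid of points inside the quadrilateral and
// return each sample's deviation from the mean intensity. A bilinear warp of
// the 4 corners stands in for the homography used in the actual term.
struct DissimilarityCost {
  static constexpr int kGrid = 8;                  // 8x8 samples per face
  static constexpr int kResiduals = kGrid * kGrid;

  explicit DissimilarityCost(
      const ceres::BiCubicInterpolator<ceres::Grid2D<double>>& image)
      : image_(image) {}

  template <typename T>
  bool operator()(const T* v0, const T* v1, const T* v2, const T* v3,
                  T* residuals) const {
    T mean = T(0);
    for (int i = 0; i < kGrid; ++i) {
      for (int j = 0; j < kGrid; ++j) {
        const T u = T(i) / T(kGrid - 1);
        const T v = T(j) / T(kGrid - 1);
        // Bilinear map from the unit square to the quadrilateral.
        const T x = (T(1) - v) * ((T(1) - u) * v0[0] + u * v1[0]) +
                    v * ((T(1) - u) * v3[0] + u * v2[0]);
        const T y = (T(1) - v) * ((T(1) - u) * v0[1] + u * v1[1]) +
                    v * ((T(1) - u) * v3[1] + u * v2[1]);
        image_.Evaluate(y, x, &residuals[i * kGrid + j]);  // (row, column)
        mean += residuals[i * kGrid + j];
      }
    }
    mean /= T(kResiduals);
    for (int k = 0; k < kResiduals; ++k) {
      residuals[k] -= mean;  // deviation from the mean; its norm tracks StdDev
    }
    return true;
  }

  const ceres::BiCubicInterpolator<ceres::Grid2D<double>>& image_;
};
```

Wrapped in a ceres::AutoDiffCostFunction with one 2-dimensional parameter block per vertex, this is roughly what the hypothetical DissimilarityCost::Create factory in the earlier sketch would return.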

&lambda is closely related to the regularization parameter you can change in the demo, but there ara additional normalization steps.

As for the regularization, we first compute 4 triangle areas:

Regularization(V1, V2, V3, V4) = (0.5 targetArea - TriangleArea(V4, V3, V2)) +
(0.5 targetArea - TriangleArea(V3, V2, V1)) +
(0.5 targetArea - TriangleArea(V2, V1, V4)) +
(0.5 targetArea - TriangleArea(V1, V4, V3))

Where targetArea is the square of the targetSize parameter you can change in the demo. This regularization was the one I liked the most; it works as long as there are enough squares.
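
Here is a sketch of what the corresponding regularization functor could look like. How exactly λ and the extra normalization steps are folded in is an assumption here, and the names are again illustrative.

```cpp
// Sketch only: each of the 4 triangles spanned by consecutive corners should
// cover half of the target area. Folding lambda into the residuals is a
// simplification of the normalization used in the actual code.
struct AreaRegularization {
  AreaRegularization(double target_area, double lambda)
      : target_area_(target_area), lambda_(lambda) {}

  template <typename T>
  static T TriangleArea(const T* a, const T* b, const T* c) {
    // Signed triangle area from the cross product of two edge vectors.
    return T(0.5) * ((b[0] - a[0]) * (c[1] - a[1]) -
                     (b[1] - a[1]) * (c[0] - a[0]));
  }

  template <typename T>
  bool operator()(const T* v1, const T* v2, const T* v3, const T* v4,
                  T* residuals) const {
    const T half_area = T(0.5 * target_area_);
    residuals[0] = T(lambda_) * (half_area - TriangleArea(v4, v3, v2));
    residuals[1] = T(lambda_) * (half_area - TriangleArea(v3, v2, v1));
    residuals[2] = T(lambda_) * (half_area - TriangleArea(v2, v1, v4));
    residuals[3] = T(lambda_) * (half_area - TriangleArea(v1, v4, v3));
    return true;
  }

  double target_area_;
  double lambda_;
};
```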

Conclusions

  • I just showed you one way to solve this superpixel meshing problem.
  • It is not the fastest (not even close) nor the best method, but it certainly works and its behaviour can be tweaked.
  • Furthermore, we can extend it to use other sources of data; for instance, adding a depth map would be straightforward.
  • In the short term I plan to support multilevel meshing. I'll post an update here once done.