I needed a segmentation of the image with some weird contraint: each segment has to be a quadrilateral with sort of the same size.
Image of a plane with a possible segmentation
(Image:"XA-VOC Volaris Airlines Airbus A319-132 C/N 2997 'Claudia'" by TDelCoro is licensed under CC BY-SA 2.0
Each quadrilateral is a superpixel, which is a connected group of pixels with similar intensities. You can imagine how many applications that can have. (see here, it's 2020 and we are still working on that problem given its importance).
Now, here the key is the quadrilateral. Superpixel methods only care abour base pixel grouping, but in our case here we deal with a geometrical shape. So, intead of outputting an image, we output a mesh, defined by:
In addition, our goal of sort of the same size makes things impossible. Having consistent pixel data without breaking this requirement. Superpixels are irregular in nature, and forcing them to be square is really the issue.
So we know we want to find the 4 vertices coordinates for each superpixel.
A first attempt to solve would be to stand on giants shoulders and use something existing for superpixels, and after having superpixels identified, do some set of polygonal fits to find quadrilaterals. But using something existing, first it removes the fun, and second, that extra transform/optimization removes an opportunity to do things better. This is because the actual solution may be different when forcing the quadrilateral constraint for a superpixel.
Instead, let's define the problem as an optimization problem. Why? Note first how superpixels are connected. Let's say you shift a vertex by some ammount. To improve the solution, we are likely to adjust the all the quadrilaterals that use that vertex, which will be 4. But if we adjust those neighbors, we may also need to update the neighbors of the neighbors, and so on.
Being a fan of ceres-solver I though, this is for sure a problem we can model with that. The following is the algo:
Next is a demo that was implemented in C++ and then targeted to the web using WebAssembly.
Image you choose WON'T BE SENT TO ANY SERVER! it will be processed locally in your device.
Drag here or click to browse for an image to test.
The following pieces are involved
As for the browser portion of this:
First, lets define define an error function we would to minimize:
Error(V1, V2 ...Vn) = ∑SuperpixelDissimilary + λ ∑Regularization
As for sumations, in both cases we have 1 term for each superpixel. Let's review each term:
SuperpixelDissimilary(V1, V2, V3, V4) = ∑StdDev(I(x))
Here I(x) refers to all pixels in the superpixel. We compute the standard deviation and that becomes our dissimilarity term: we want to minimize this std deviation in each superpixel. To normalize the area of the pixels, we use the 4 vertices of each superpixel to compute a homography from the configuration to a fixed size square. In this part ceres-solver is really handy as it already has an interpolation implementation that we need to get the remapped pixels.
&lambda is closely related to the regularization parameter you can change in the demo, but there ara additional normalization steps.
As for regularization we first computer 4 triangle areas:
Regularization(V1, V2, V3, V4) = (0.5 (targetArea) - TriangleArea(V4, V3, V2)) +
(0.5 (targetArea) - TriangleArea(V3, V2, V1)) +
(0.5 (targetArea) - TriangleArea(V2, V1, V4)) +
(0.5 (targetArea) - TriangleArea(V1, V4, V3)) +
Where targetArea is the square of the targetSize parameter you can change in the demo. This regularization was the one I liked the most, it works as far as there are enough squares.