CS180 Project 4
The basic idea here is that we know that some objects are rectangular in real life, so once we label the corners of the rectangle in the image, we can find a homography from them to a hard-coded rectangle and use this to make that particular rectangle appear to be a rectangle in the image.
To deal with problems associated with the homography being unstable, we actually labeled 9 equally spaced points in a rectangular grid and used these to constrain the computation of the homography.
Here's a photo of a multi-monitor setup, and the result of rectifying it to each monitor:
For completeness, here's also a photo of part of an office, and the result of rectifying it to the dresser (the dresser is parallel to the whiteboard, so it looks as if we rectified the whiteboard as well; I originally tried to rectify to the points on the shelf and we'll discuss why this was a bad idea later):
Using the homography, we can also overlay corresponding points in two images to make a mosaic. There are some subtleties about how to handle the boundaries of the images, but the basic summary is that we can use a Gaussian filter to weight the contributions of the two images on their overlap. This still causes trouble in the corners of the overlap, since they are close to both of the original images, which is fundamentally unsolvable if the original images are sufficiently different in lighting conditions or perspective.
Here's a mosaic of two photos of a bench:
Here's a mosaic of two photos of a laundry room:
Here's a mosaic of two new photos in the same office as before:
For reference, two photos originally taken in the same office were labeled a bit poorly, plus the constraining points were nowhere near the boundary image, so the homography didn't align many of the edges. However, the Gaussian blur still convincingly blends them if you don't look at the corners of the overlap:
I originally tried to rectify to the points on the shelf in the office, thinking it would be great because the photo would be extremely distorted, which would produce a cool visual effect.
Unfortunately, the shelf is actually angled far enough from the plane of the original image that the naive homography puts other points in the image behind the "camera", and other points to the side of the "camera", so projecting them back to the image results in the points basically being unbounded, causing the naive code to hang and making me think I had a bug.
Also, I tried for a long time to figure out how to join corresponding points in the images to make a mosaic. My initial approach was to write a bunch of functions to register the two images against each other using their features, similar to in Assignment 1. I originally thought that naive L2 difference would be sufficient, and I spent a long time trying out different feature choices because most of them worked surprisingly badly, and eventually I realized that the homography tells us exactly where the boundaries of one image end up in the other image, so we can figure out exactly how they lie on top of each other.
These are both great examples of why it's important to thoroughly understand what the homography you're computing actually does.
Part B: Automatic feature correspondences
As expressed in the spec, this part follows the logic of the paper “Multi-Image Matching using Multi-Scale Oriented Patches” by Brown et al, but with several simplifications. The main simplification is that our feature extractor doesn't involve any orientation; we just use axis-aligned patches at one scale.
First we start with a simple Harris corner detector. Here are the top 200 Harris corners for the laundry room image:
The biggest problem with this is that too many of the strongest corner candidates are clustered together, which makes for bad feature correspondences.
So we apply a non-maximum suppression to filter out corners that are too close to each other. Here are the top 200 corners after non-maximum suppression:
We undergo this process for the two images we want to mosaic, and then in each image, we extract a feature descriptor at each corner. It's hard to helpfully visualize the feature descriptor, but we take a 40x40 grid of pixels centered at the corner, downsample it to 8x8, and normalize it to have zero mean and unit variance.
As referenced above, obviously this isn't invariant to rotation or to scale, so this method won't robustly find correspondences for other images, but even this simple feature descriptor is strong enough for these.
We use these features to find correspondences between the two images. A feature in one image is better match for a feature in the other image if the L2 distance between their feature descriptors is small.
Now to be clear, for each feature in one image we calculate its matching score with every feature in the other image. We use Lowe's method here to determine the quality of the correspondence: A correspondence is better if the difference in quality of match between the best candidate and the second best candidate is large.
We sort the correspondences in order of (highest matching score / second best matching score), with a lower ratio being better.
Here are the top 16 correspondences found by this process:
As you can see, not all of the correspondences are correct, but 14 out of 16 of them are about as good as our manually labeled correspondences.
It follows that we can use this method to mosaic the two images, the same way we did in part A. However, we can't simply minimize the least-squares solution to the homography, because the correspondences are noisy.
So we use RANSAC to repeatedly fit homographies to four points, and find a homography that maximizes the number of inliers. Here inliers are points that, when transformed by the homography, are within some tolerance of their corresponding point.
We set the tolerance really high, to 50 pixels, which doesn't really matter because if a correspondence is bad, it's unlikely to be within 50 pixels of the correct one.
Here is the mosaic we generated with manual correspondences (i.e, the same mosaic as above in part A), followed by the one we generated with automatic correspondences:
They are practically indistinguishable, as expected.
Now here is the same thing again with the other above mosaics, with the manual correspondences first in each case:
Again, they are practically indistinguishable.
I noticed that the office mosaics are a little bit blurry in the middle, but the border between the wall and ceiling is very clean and so are the lines on the square shelf.
Even checking the correspondences manually, the correspondences are extremely good and they fit a RANSAC with quite few outliers even with 8 pixel tolerance.
I think this is because the photos actually were taken with a slight translation, so that no homography can perfectly align them.