Project 4
Rectifying images and automatically creating panorama images through feature matching and homography
Part A: Computing Homographies
To align the images for both the mosaic and rectification, I needed to compute the homography matrix between the images. The homography defines the transformation between two perspectives of the same scene, which can be used to map points from one image to the other.
The homography matrix is a 3x3 matrix that represents a projective transformation. We want to recover this projective transformation such that

$$\begin{bmatrix} wx' \\ wy' \\ w \end{bmatrix} = \begin{bmatrix} a & b & c \\ d & e & f \\ g & h & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$$

If we expand this out, we have the following system of equations:

$$wx' = ax + by + c$$
$$wy' = dx + ey + f$$
$$w = gx + hy + 1$$

Expanding this further by substituting $w$ into the first two equations:

$$x' = \frac{ax + by + c}{gx + hy + 1}, \qquad y' = \frac{dx + ey + f}{gx + hy + 1}$$

Which simplifies to:

$$ax + by + c - gxx' - hyx' = x'$$
$$dx + ey + f - gxy' - hyy' = y'$$

This gives us the following system of linear equations for each correspondence $(x, y) \leftrightarrow (x', y')$; stacking the rows for four or more correspondences, we can solve for the eight unknowns $a, \dots, h$ (with least squares when the system is overdetermined):

$$\begin{bmatrix} x & y & 1 & 0 & 0 & 0 & -xx' & -yx' \\ 0 & 0 & 0 & x & y & 1 & -xy' & -yy' \end{bmatrix} \begin{bmatrix} a \\ b \\ c \\ d \\ e \\ f \\ g \\ h \end{bmatrix} = \begin{bmatrix} x' \\ y' \end{bmatrix}$$
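The stacked system above can be solved directly with least squares. Here is a minimal NumPy sketch under the same conventions; the function name and the `(N, 2)` point layout are my own choices:

```python
import numpy as np

def compute_homography(src, dst):
    """Solve for the 3x3 homography H mapping src points to dst points.

    src, dst: (N, 2) arrays of corresponding (x, y) points, N >= 4.
    Builds the 2N x 8 linear system from the expanded equations and
    solves for the 8 unknowns, fixing the bottom-right entry to 1.
    """
    src = np.asarray(src, dtype=float)
    dst = np.asarray(dst, dtype=float)
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    # least-squares solution of A h = b for the 8 free parameters
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)
```

With exactly four correspondences the system is square and the solution is exact; with more points least squares averages out clicking noise.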
Rectification
To ensure the homography matrix was calculated correctly, I took a picture of a square painting from an angle. I then manually selected the four corner points of the painting in the image using the point-selection tool from an earlier project. Then, I mapped those points to a perfect square, in this case [0, 0], [0, 100], [100, 0], [100, 100], to compute the homography. I then inverse warped the image using the inverse of the homography matrix, with bilinear interpolation where pixels mapped between multiple source pixels. Here are the points I selected on the painting and the result after warping the image into the square:
Painting with defined points
Painting Rectified
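The inverse warp with bilinear interpolation can be sketched as follows. This is a hypothetical vectorized NumPy version (the function name and the convention that H maps source coordinates to destination coordinates are my own assumptions):

```python
import numpy as np

def inverse_warp(img, H, out_shape):
    """Inverse-warp img into a canvas of out_shape using homography H.

    H maps source (x, y) to destination (x, y). For every destination
    pixel we apply H^-1 to find its source location and bilinearly
    interpolate the four surrounding source pixels.
    """
    h_out, w_out = out_shape
    Hinv = np.linalg.inv(H)
    ys, xs = np.mgrid[0:h_out, 0:w_out]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(xs.size)])
    src = Hinv @ pts
    sx, sy = src[0] / src[2], src[1] / src[2]
    h, w = img.shape[:2]
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    dx, dy = sx - x0, sy - y0
    # bilinear blend of the four neighbouring source pixels
    out = (img[y0, x0].T * (1 - dx) * (1 - dy)
           + img[y0, x0 + 1].T * dx * (1 - dy)
           + img[y0 + 1, x0].T * (1 - dx) * dy
           + img[y0 + 1, x0 + 1].T * dx * dy).T
    # zero out destination pixels that fall outside the source image
    valid = (sx >= 0) & (sx <= w - 1) & (sy >= 0) & (sy <= h - 1)
    out[~valid] = 0
    return out.reshape(h_out, w_out, *img.shape[2:])
```

Iterating over destination pixels (rather than forward-mapping source pixels) is what avoids holes in the warped result.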
I then did the same thing with a picture of a mirror.
Mirror with defined points
Mirror Rectified
Warping for the Mosaic
For the mosaic, I took three photos of I-House, each overlapping the next by about 60%. I made sure to keep the camera fixed in one location and rotate it around its own axis so that the center of projection was the same for all images.
I House left image
I House middle image
I House right image
Afterward, I manually selected correspondence points between the left and middle images and between the right and middle images.
Left image points
Middle image correspondence points (left pair)
Middle image correspondence points (right pair)
Right image points
Using the same warping function as before, I warped the images onto three separate canvases, all with the same dimensions.
Left image canvas
Middle image canvas
Right image canvas
Blending the Images into a Mosaic
To combine the images onto one canvas and smooth the transitions between them, I first created an alpha mask for the middle image: 1 at the center, gradually decreasing to 0 at the edges based on Euclidean distance from the center. I then multiplied this mask with the pixel values of the middle image to ensure a smooth transition between the middle image and the warped images on either side.
Alpha Mask
Middle Canvas with Alpha Mask
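The distance-based alpha mask can be sketched like this; a linear falloff normalized by the maximum distance is my own assumption about the exact falloff shape:

```python
import numpy as np

def center_alpha_mask(h, w):
    """Alpha mask that is 1 at the image center and falls off to 0 at
    the corners, based on Euclidean distance from the center."""
    ys, xs = np.mgrid[0:h, 0:w]
    cy, cx = (h - 1) / 2, (w - 1) / 2
    dist = np.sqrt((ys - cy) ** 2 + (xs - cx) ** 2)
    # linear falloff: 1 at the center, 0 at the farthest pixel
    return 1 - dist / dist.max()
```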
I then computed the weighted average of the three images by summing the weighted pixel values from all images and normalizing by the total weight map. This way, every pixel becomes a weighted average of all the images that overlap it, which produced smooth transitions between the overlapping images. After cropping the image to remove the black edges, I got this final blended panorama.
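The weighted-average blend might look like the following sketch, assuming each warped canvas comes with a matching per-pixel weight map (the `eps` guard against empty regions is my own addition):

```python
import numpy as np

def blend_weighted(images, weights, eps=1e-8):
    """Blend warped canvases with per-pixel weight maps.

    images: list of (H, W) or (H, W, C) canvases on the same grid.
    weights: matching list of (H, W) weight maps.
    Each output pixel is the weight-normalized average of all the
    canvases that cover it.
    """
    num, den = 0, 0
    for img, w in zip(images, weights):
        if img.ndim == 3:
            w = w[..., None]          # broadcast weight over channels
        num = num + img * w
        den = den + w
    # normalize by the total weight; eps avoids division by zero
    # where no image covers a pixel
    return num / np.maximum(den, eps)
```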
I then repeated the exact same procedure for three images of Haas and three images of the street next to it.
Haas left image
Haas middle image
Haas right image
Street left image
Street middle image
Street right image
Part B: Feature Matching and Autostitching
Harris points
To start implementing the autostitching, I used the sample code given in the project to detect Harris interest points, which gave the following result.
Adaptive Non-Maximal Suppression
As we can see, keeping all the Harris interest points results in a lot of points. To reduce their number, I used the Adaptive Non-Maximal Suppression (ANMS) algorithm to select interest points that are both spread out and have a high corner response. I did this by first sorting the Harris points by their strength. For each point, I then found all points that remained stronger than the current point even after multiplying by a robustness factor $c_{\text{robust}} = 0.9$, and stored the distance $r_i$ to the closest such point. This process can be described with the following equation:

$$r_i = \min_{j} \, \lVert x_i - x_j \rVert \quad \text{s.t.} \quad f(x_i) < c_{\text{robust}} \, f(x_j)$$
With this list of suppression radii, I sorted it and selected the points with the largest radii, i.e., the strong points furthest away from any stronger point. For the number of points, I selected 500 as in the paper, which resulted in the following image.
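The ANMS step described above can be sketched as follows; the function name and the simple O(n²) loop are my own (the paper discusses faster variants):

```python
import numpy as np

def anms(coords, strengths, n_points=500, c_robust=0.9):
    """Adaptive Non-Maximal Suppression.

    For each point i, r_i is the distance to the nearest point j that
    is robustly stronger: f(x_i) < c_robust * f(x_j). The n_points
    points with the largest radii are kept, so the result is both
    strong and spatially spread out.
    """
    coords = np.asarray(coords, dtype=float)
    strengths = np.asarray(strengths, dtype=float)
    n = len(coords)
    radii = np.full(n, np.inf)   # global maximum keeps r = infinity
    for i in range(n):
        stronger = strengths[i] < c_robust * strengths
        if stronger.any():
            d = np.linalg.norm(coords[stronger] - coords[i], axis=1)
            radii[i] = d.min()
    keep = np.argsort(-radii)[:n_points]
    return coords[keep], radii[keep]
```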
Feature Descriptor extraction
The next step was to extract a feature descriptor for each point in each image. To do this, I sampled every 5th pixel in a 36x36 window around each point (sampling every 5th pixel of a 36x36 window yields the same 8x8 samples as a 40x40 window would) to get an 8x8 descriptor. I then normalized each descriptor by subtracting its mean and dividing by its standard deviation. This gave me descriptors that looked like this:
Descriptor before normalization
Descriptor after normalization
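The descriptor extraction can be sketched as follows; this is a minimal version that assumes the interest point lies far enough from the image border for the full window to fit:

```python
import numpy as np

def extract_descriptor(img, y, x, spacing=5, size=8):
    """Extract a bias/gain-normalized 8x8 descriptor around (y, x).

    Samples a size x size grid with the given spacing (a 36x36
    footprint for size=8, spacing=5), then subtracts the mean and
    divides by the standard deviation so descriptors are invariant
    to brightness and contrast changes.
    """
    half = spacing * (size - 1) // 2      # half-extent of the footprint
    ys = y - half + spacing * np.arange(size)
    xs = x - half + spacing * np.arange(size)
    patch = img[np.ix_(ys, xs)].astype(float)
    return (patch - patch.mean()) / (patch.std() + 1e-8)
```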
Feature Matching
After getting the descriptors, I matched them between the two images by first normalizing them and then computing the Euclidean distance between every pair. For each descriptor, I sorted the distances to find its first and second nearest neighbors, and kept only the matches that satisfied Lowe's ratio test: the ratio of the distance to the first nearest neighbor over the distance to the second must be below a threshold, which ensures the match is both good and unique. This gave me the following matches:
Matches between left and middle image
Matches between right and middle image
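The matching with Lowe's ratio test might look like this sketch; the ratio threshold of 0.6 is an illustrative choice, not necessarily the one used here:

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.6):
    """Match flattened descriptors with Lowe's ratio test.

    desc1: (N1, D), desc2: (N2, D), with N2 >= 2. For each descriptor
    in desc1, find its two nearest neighbours in desc2 by Euclidean
    distance and keep the match only if d(1-NN) / d(2-NN) < ratio.
    """
    desc1 = np.asarray(desc1, dtype=float)
    desc2 = np.asarray(desc2, dtype=float)
    # pairwise squared distances via ||a - b||^2 = a^2 + b^2 - 2ab
    d2 = (np.sum(desc1 ** 2, axis=1)[:, None]
          + np.sum(desc2 ** 2, axis=1)[None, :]
          - 2 * desc1 @ desc2.T)
    d2 = np.maximum(d2, 0)          # guard against tiny negatives
    matches = []
    for i in range(len(desc1)):
        nn = np.argsort(d2[i])[:2]  # indices of the two nearest
        if np.sqrt(d2[i, nn[0]]) < ratio * np.sqrt(d2[i, nn[1]]):
            matches.append((i, int(nn[0])))
    return matches
```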
RANSAC
I now had the matches between the images, but as we see in the images above, some of the matches are still off. To fix this, I implemented the RANSAC algorithm to find the inliers among the matches. I did this by repeatedly selecting 4 random matches and computing the homography matrix between them. I then used this matrix to warp the points from the first image into the second and checked whether each warped point landed within a threshold of 1 pixel of its matched point. I counted the number of inliers and kept the homography matrix with the most inliers. I ended up with these inliers:
Inliers between left and middle image
Inliers between right and middle image
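The RANSAC loop can be sketched as follows; the iteration count and seeding are illustrative assumptions, while the 1-pixel threshold matches the text:

```python
import numpy as np

def fit_homography(src, dst):
    """Least-squares homography (bottom-right entry fixed to 1)."""
    A, b = [], []
    for (x, y), (xp, yp) in zip(src, dst):
        A.append([x, y, 1, 0, 0, 0, -x * xp, -y * xp])
        A.append([0, 0, 0, x, y, 1, -x * yp, -y * yp])
        b.extend([xp, yp])
    h, *_ = np.linalg.lstsq(np.array(A), np.array(b), rcond=None)
    return np.append(h, 1).reshape(3, 3)

def ransac_homography(pts1, pts2, n_iters=1000, thresh=1.0, seed=0):
    """Return the inlier mask of the best 4-point homography.

    Each iteration samples 4 matches, fits an exact homography, warps
    all pts1, and counts matches whose warped position lies within
    thresh pixels of pts2. The model with the most inliers wins.
    """
    pts1 = np.asarray(pts1, dtype=float)
    pts2 = np.asarray(pts2, dtype=float)
    rng = np.random.default_rng(seed)
    homog = np.hstack([pts1, np.ones((len(pts1), 1))]).T   # 3 x N
    best_mask = np.zeros(len(pts1), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(pts1), 4, replace=False)
        H = fit_homography(pts1[idx], pts2[idx])
        with np.errstate(divide="ignore", invalid="ignore"):
            proj = H @ homog
            warped = (proj[:2] / proj[2]).T
            mask = np.linalg.norm(warped - pts2, axis=1) < thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    return best_mask
```

A common follow-up, not shown here, is to refit the homography on all the returned inliers with least squares for a more stable final estimate.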
With the inliers, I could now warp the images into the final mosaic.
Autostitched
Manually stitched
Haas Autostitched
Haas Manually stitched
Street Autostitched
Street Manually stitched