CS1674: Homework 4 - Programming

Due: 9/29/2016, 11:59pm

This assignment is worth 50 points. Please read the full assignment description before you start working! I expect this assignment will take you 3-5 hours.

Part I: Feature Detection (25 points)

In this problem, you will implement feature extraction using the Harris corner detector, as discussed in class.
  1. Use the following signature: function [x, y, scores, Ix, Iy] = extract_keypoints(image); image is a color image of type uint8 which you should convert to grayscale and double in your function. Each of x,y is an nx1 vector that denotes the x and y locations, respectively, of each of the n detected keypoints (i.e. points with "cornerness" scores greater than a threshold). Keep in mind that x denotes the horizontal direction, hence columns of the image, and y denotes the vertical direction, hence rows, counting from the top-left of the image. scores is an nx1 vector that contains the value (denoted by R in the lecture slides) to which you applied a threshold, for each detected keypoint. Ix,Iy are matrices with the same number of rows and columns as your input image, and store the gradients in the x and y directions at each pixel. You can reuse code to compute gradients from HW2P.
  2. You can use a window function of your choice; opt for the simplest one, e.g. the 1 inside, 0 outside one from class. Use a window size of e.g. 5 pixels.
  3. Common values for the k value in the "Harris Detector: Algorithm" slide are 0.04-0.06.
  4. You can set the threshold for the "cornerness" score R however you like; for example, you can set it to 5 times the average R score. Alternatively, you can simply output the top n keypoints (e.g. top 1%).
  5. One potentially annoying step is computing the M matrix for each window. Rather than loop over the pixels in the window, it may be easier to index the values in a local neighborhood of the gradient images, e.g. Ix(i-offset:i+offset, j-offset:j+offset). You can then compute each element-wise gradient product on this sub-matrix, e.g. Ix.^2 (note the dot: in Matlab, ^ alone is the matrix power), and sum all the values in the resulting (post-multiplication) matrix. This gives you the first entry of the matrix M. Repeat with the other gradient products shown on the "Harris Detector: Algorithm" slide to get the full matrix M.
  6. [2 points extra credit] You can also perform non-maximum suppression by only keeping those keypoints whose R score is larger than all of their 8 neighbors; if a keypoint does not have 8 neighbors, do not keep it. Don't remove indices while looping over pixels; instead keep a vector of indices you want to remove (start it empty and concatenate indices to it as needed), run the unique operation on it, then set the keypoints at those indices to []. The scores/x/y that you output should correspond to the final set of keypoints, after non-max suppression, i.e. you should remove the values at the same indices in these three vectors.
  7. After you have written your extract_keypoints function, show what it does on a set of 10 images of your choice. Do this in a script called show_keypoints.m. Visualize the keypoints you have detected, for example by drawing circles over them. Use the scores variable and make keypoints with higher scores correspond to larger circles. For example, you can use plot(x(i), y(i), 'ro', 'MarkerSize', scores(i) / 1000000000); Note that Matlab's plot counts from the top-left when plotting over an image. Save the figures that show your features and include them with your submission.
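
The windowed computation described in items 2-5 can be sketched as follows. This is only an illustration on a small synthetic image, not a reference solution. It assumes the common form R = det(M) - k * trace(M)^2 for the cornerness score (check it against the "Harris Detector: Algorithm" slide), uses Matlab's built-in gradient as a stand-in for your HW2P gradient code, and keeps the top 1% of pixels rather than applying a threshold.

```matlab
% Synthetic test image (assumed example data): a bright square whose
% top-left corner sits at row 8, column 8.
im = zeros(20, 20);
im(8:20, 8:20) = 1;

% Stand-in gradients; your HW2P code may use different filters.
[Ix, Iy] = gradient(im);      % Ix: along columns (x), Iy: along rows (y)

k = 0.05;                     % within the 0.04-0.06 range
offset = 2;                   % 5x5 window, 1 inside and 0 outside
[rows, cols] = size(im);
R = zeros(rows, cols);        % pixels too close to the border keep R = 0

for i = 1 + offset : rows - offset
    for j = 1 + offset : cols - offset
        wx = Ix(i-offset:i+offset, j-offset:j+offset);
        wy = Iy(i-offset:i+offset, j-offset:j+offset);
        % Entries of M are sums of element-wise gradient products.
        M = [sum(wx(:).^2),       sum(wx(:) .* wy(:));
             sum(wx(:) .* wy(:)), sum(wy(:).^2)];
        R(i, j) = det(M) - k * trace(M)^2;   % assumed form of R
    end
end

% Keep the top 1% of pixels by score (one of the options in item 4).
[sorted_R, order] = sort(R(:), 'descend');
n = ceil(0.01 * numel(R));
[y, x] = ind2sub(size(R), order(1:n));       % rows are y, columns are x
scores = sorted_R(1:n);
```

On this image the strongest response lands within a pixel or two of the square's corner, while pixels along the straight edges get negative scores. On real images you will need to tune k, the window size, and the threshold.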

Part II: Feature Description (25 points)

In this problem, you will implement a feature description pipeline, as discussed in class. While you will not exactly implement it, the SIFT paper by David Lowe is a useful resource, in addition to Section 4.1 of the Szeliski textbook.
  1. Use the following signature: function [features, x, y, scores] = compute_features(x, y, scores, Ix, Iy).
  2. x, y, scores, Ix, Iy are defined as above.
  3. features is an nxd matrix whose i-th row contains the d-dimensional descriptor for the i-th keypoint.
  4. We'll simplify the histogram creation procedure a bit, compared to the original implementation presented in class. In particular, we'll compute a descriptor with dimensionality d=8 (rather than 4x4x8), which contains an 8-dimensional histogram of gradients computed from an 11x11 grid centered on each detected keypoint (i.e. a -5:+5 neighborhood horizontally and vertically).
  5. Quantize the gradient orientations into 8 bins, each spanning 22.5 degrees (so put angles between 0 and 22.5 degrees in one bin, angles between 22.5 and 45 degrees in another bin, etc.). For example, you can have a variable with the same size as the image that says to which bin (1 through 8) the gradient at each pixel belongs. To populate the SIFT histogram, consider each of the 8 bins in turn: within the 11x11 neighborhood, sum the gradient magnitudes of the pixels whose orientation falls in that bin. For instance, the bin covering 0 to 22.5 degrees receives the sum of the magnitudes of all pixels whose orientation lies in that range. Repeat analogously for all bins.
  6. Finally, normalize each descriptor, clip all values at 0.2 as discussed in class, and then normalize again. You can normalize using e.g. hist_final = hist_final / sum(hist_final); which scales the entries so that they sum to 1. You do not have to implement any more sophisticated detail from the Lowe paper.
  7. If a detected keypoint is less than 5 pixels from the top/left or less than 5 pixels from the bottom/right of the image, erase it from the x, y, scores vectors at the start of your code and do not compute a descriptor for it.
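  8. To compute the gradient magnitude m(x, y) and gradient angle θ(x, y) at point (x, y), take L to be the image and use the formula shown in class (given here as it appears in the Lowe paper), together with Matlab's atand:

    m(x, y) = sqrt((L(x+1, y) - L(x-1, y))^2 + (L(x, y+1) - L(x, y-1))^2)
    θ(x, y) = atan((L(x, y+1) - L(x, y-1)) / (L(x+1, y) - L(x-1, y)))

    In code, with Ix and Iy standing in for the horizontal and vertical differences, this looks something like: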

    grad_mag(i, j) = sqrt(Ix(i, j)^2 + Iy(i, j)^2);
    orient_raw = atand(Iy(i, j) / Ix(i, j));
    if(isnan(orient_raw))
        assert(grad_mag(i, j) == 0);
        orient_raw = 0; % if no change, we won't count a gradient magnitude
    end

    And to get the mapping of orientations to a number between 1 and 8, you can use:

    assert(orient_raw >= -90);
    if(orient_raw <= -67.5)
        grad_orient(i, j) = 1;
    elseif(orient_raw <= -45)
        grad_orient(i, j) = 2;
    % FILL IN!
    end
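
Once every pixel has a gradient magnitude and a bin index, the remaining steps (items 4-7) can be sketched as below. This is a rough illustration, not a reference solution. The grad_mag and grad_orient matrices are filled with random stand-in values, and the keypoints are made up, so you would substitute your own. The border test shown is one reading of item 7 that guarantees the 11x11 patch fits inside the image.

```matlab
rng(0);                                  % reproducible stand-in data
grad_mag    = rand(30, 30);              % hypothetical gradient magnitudes
grad_orient = randi(8, 30, 30);          % hypothetical bin indices, 1..8

% Made-up keypoints (x = column, y = row).
x = [3; 12; 28];  y = [15; 15; 4];  scores = [0.9; 0.7; 0.5];

r = 5;                                   % -5:+5 neighborhood => 11x11 patch
[rows, cols] = size(grad_mag);

% Drop keypoints whose 11x11 patch would leave the image (assumed rule).
keep = x > r & x <= cols - r & y > r & y <= rows - r;
x = x(keep);  y = y(keep);  scores = scores(keep);

features = zeros(numel(x), 8);
for p = 1:numel(x)
    mag_patch = grad_mag(y(p)-r:y(p)+r, x(p)-r:x(p)+r);
    bin_patch = grad_orient(y(p)-r:y(p)+r, x(p)-r:x(p)+r);

    % Sum the magnitudes that fall into each of the 8 bins.
    hist_final = accumarray(bin_patch(:), mag_patch(:), [8 1])';

    % Normalize, clip at 0.2, normalize again.
    % (A patch with all-zero magnitudes would need special handling.)
    hist_final = hist_final / sum(hist_final);
    hist_final = min(hist_final, 0.2);
    hist_final = hist_final / sum(hist_final);

    features(p, :) = hist_final;
end
```

Using accumarray avoids an explicit loop over the 8 bins. A loop that sums mag_patch(bin_patch == b) for b = 1:8 gives the same histogram and may be easier to debug.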