CS2770: Homework 3

Due: 3/23/2017, 11:59pm


Part I: Color quantization with K-means (20 points)



For this problem you will write code to quantize a color space by applying K-means clustering to the pixels in a given input image. You are allowed to use built-in K-means code in Matlab or Python; you do not have to write your own K-means. Include each of the following components in your submission:
  1. [10 pts] Given an RGB image, perform clustering in the 3-dimensional RGB space, and map each pixel in the input image to its nearest center. That is, replace the RGB value at each pixel with its nearest cluster's average RGB value. For example, if you set K=2, every pixel in the output image takes one of only two colors.

    Since these average RGB values may not be integers, you should round them to the nearest integer (0 through 255). Your function should be called quantizeRGB, should take in inputs origImg and k, and return outputs outputImg, meanColors, clusterIds. The variables origImg and outputImg are RGB images, k specifies the number of colors to quantize to, and meanColors is a K x 3 array of the K centers (one value for each cluster and each color channel). clusterIds is a numpixels x 1 matrix (with numpixels = numrows * numcolumns) that says which cluster each pixel belongs to.

  2. [2 pts] Write a function to compute the Euclidean distance between the original RGB pixel values and the quantized values. Your function should be called computeQuantizationError, should take in inputs origImg, quantizedImg, and should return an output error, where origImg and quantizedImg are both RGB images, and error is a real number.

  3. [8 pts] Write a function colorQuantizeMain that calls all the above functions appropriately using the image fish.jpg, and displays the results. Illustrate the quantization with at least three different values of K. Label all plots clearly with titles. In a text file explanation.txt, briefly answer the following: How and why does the error differ based on the value of K?
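
The quantization and error steps above can be sketched in Python with NumPy. This is a minimal illustration, not a reference solution. It uses a short hand-written Lloyd's K-means loop only to stay dependency-free (a built-in K-means is allowed and would normally replace it). The helper nearest_center, the parameters n_iter and seed, and the initialization from distinct pixel colors are assumptions of this sketch. It also assumes the image has at least k distinct colors, and it interprets the error as the Euclidean distance between the two images viewed as flat vectors.

```python
import numpy as np

def nearest_center(pixels, centers):
    # Squared distance from every pixel to every center (numpixels x k),
    # then the index of the closest center for each pixel.
    d2 = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def quantizeRGB(origImg, k, n_iter=20, seed=0):
    # Flatten the H x W x 3 image into a numpixels x 3 array of floats.
    pixels = origImg.reshape(-1, 3).astype(np.float64)
    rng = np.random.default_rng(seed)
    # Assumption: start from k distinct colors that occur in the image.
    distinct = np.unique(pixels, axis=0)
    centers = distinct[rng.choice(len(distinct), size=k, replace=False)].copy()
    for _ in range(n_iter):
        ids = nearest_center(pixels, centers)
        for j in range(k):
            members = pixels[ids == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    clusterIds = nearest_center(pixels, centers)
    # Round the cluster means to integers in the range 0 through 255.
    meanColors = np.clip(np.rint(centers), 0, 255).astype(np.uint8)
    outputImg = meanColors[clusterIds].reshape(origImg.shape)
    return outputImg, meanColors, clusterIds.reshape(-1, 1)

def computeQuantizationError(origImg, quantizedImg):
    # Euclidean distance between the images treated as flat vectors.
    diff = origImg.astype(np.float64) - quantizedImg.astype(np.float64)
    return float(np.sqrt((diff ** 2).sum()))
```

The pixel-to-center distance array has numpixels x k x 3 entries, so for large images it may be worth processing the pixels in chunks or switching to a library K-means.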

Part II: Edge detection and circle detection (30 points)



In this problem, you will implement (1) a simple edge detector, and (2) a Hough Transform circle detector that takes an input image and a fixed radius, and returns the centers of any detected circles of about that size. You are not allowed to use any built-in functions for finding edges or circles. Include the following in your submission:
  1. [10 pts] A function called detectEdges which takes in as input im, threshold and returns output edges. This function computes edges in an image. im is the input color image, and threshold is a user-set threshold for detecting edges. edges is an Nx4 matrix containing 4 numbers for each of N detected edge points: the x location of the point, the y location of the point, the gradient magnitude at the point, and the gradient orientation (non-quantized) at the point.

  2. [15 pts] A function called detectCircles which takes in as input im, edges, radius, top_k and returns as output centers. This function finds and visualizes circles from an edge map. im, edges are defined as above, radius specifies the size of circle we are looking for, and top_k says how many of the top-scoring circle center possibilities to show. The output centers is a Kx2 matrix in which each row lists the x, y position of a detected circle's center.

  3. [5 pts] Demonstrate the function applied to the images jupiter.jpg and egg.jpg. Display the images with detected circle(s), labeling the figure with the radius, save your image outputs, and include them in your submission.
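
One possible shape for the two functions above is sketched below in Python with NumPy. It is a hedged illustration under several assumptions: grayscale conversion by channel averaging, central-difference gradients, gradient orientation in radians, and Hough voting along the gradient direction in both senses (since the edge polarity is not known). The visualization step that detectCircles is asked to perform is omitted here.

```python
import numpy as np

def detectEdges(im, threshold):
    gray = im.astype(np.float64)
    if gray.ndim == 3:
        gray = gray.mean(axis=2)  # assumption: average the color channels
    gx = np.zeros_like(gray)
    gy = np.zeros_like(gray)
    # Central differences in x and y; border pixels are left at zero.
    gx[:, 1:-1] = (gray[:, 2:] - gray[:, :-2]) / 2.0
    gy[1:-1, :] = (gray[2:, :] - gray[:-2, :]) / 2.0
    mag = np.hypot(gx, gy)
    ori = np.arctan2(gy, gx)  # non-quantized orientation, in radians
    ys, xs = np.nonzero(mag > threshold)
    # One row per edge point: x, y, gradient magnitude, gradient orientation.
    return np.column_stack([xs, ys, mag[ys, xs], ori[ys, xs]])

def detectCircles(im, edges, radius, top_k):
    h, w = im.shape[:2]
    acc = np.zeros((h, w), dtype=np.int64)  # accumulator over candidate centers
    x, y, theta = edges[:, 0], edges[:, 1], edges[:, 3]
    for sign in (1, -1):
        # A center lies one radius away from the edge point along the gradient.
        a = np.rint(x + sign * radius * np.cos(theta)).astype(int)
        b = np.rint(y + sign * radius * np.sin(theta)).astype(int)
        ok = (a >= 0) & (a < w) & (b >= 0) & (b < h)
        np.add.at(acc, (b[ok], a[ok]), 1)  # unbuffered add handles repeated bins
    top = np.argsort(acc, axis=None)[::-1][:top_k]
    rows, cols = np.unravel_index(top, acc.shape)
    return np.column_stack([cols, rows])  # each row is (x, y)
```

Voting along the gradient casts only two votes per edge point, which keeps the accumulator sparse but relies on reasonably accurate orientations. Voting for every point on a full circle of the given radius around each edge point is a slower alternative that does not depend on the orientation estimate.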

Part III: Spatial Pyramid Match (50 points)

In this problem, you will develop a scene categorization system, using the spatial pyramid representation proposed in 2006 by Svetlana Lazebnik, Cordelia Schmid and Jean Ponce.
What you need to include in your submission:
  1. [30 pts] A function computeSPMHistogram which takes in inputs im, means and returns output pyramid. This function computes the Spatial Pyramid Match histogram as discussed above. im should be a grayscale image whose SIFT features you should extract inside the function, means should be the cluster centers from the bag-of-visual-words clustering operation, and pyramid should be a 1xD feature descriptor for the image.
  2. [20 pts] A function SPMMain which gets the training/test images and their labels, extracts the SIFT features of training images that will be used for clustering, runs K-means to find the cluster means, computes SPM representations for all images, runs the SVM, and computes accuracy.
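
The pooling step inside computeSPMHistogram can be sketched as follows, in Python with NumPy. SIFT extraction is left out because it depends on the library you choose, so this hypothetical helper, spmHistogramFromFeatures, starts from keypoint positions and descriptors that are assumed to be already extracted. The helper name, its arguments xy, descriptors, im_shape and num_levels, and the use of the level weights from the Lazebnik et al. paper (1/2^L for level 0 and 1/2^(L-l+1) for level l >= 1) are assumptions of this sketch, not requirements stated in this assignment.

```python
import numpy as np

def spmHistogramFromFeatures(xy, descriptors, means, im_shape, num_levels=2):
    # xy: N x 2 keypoint positions (x, y); descriptors: N x d; means: K x d.
    h, w = im_shape[:2]
    k = means.shape[0]
    # Assign each descriptor to its nearest cluster center (visual word).
    d2 = ((descriptors[:, None, :] - means[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    L = num_levels
    parts = []
    for level in range(L + 1):
        cells = 2 ** level  # the grid at this level is cells x cells
        col = np.minimum((xy[:, 0] * cells / w).astype(int), cells - 1)
        row = np.minimum((xy[:, 1] * cells / h).astype(int), cells - 1)
        hist = np.zeros((cells, cells, k))
        np.add.at(hist, (row, col, words), 1)  # count words per grid cell
        # Level weights as in the spatial pyramid paper (assumed here).
        weight = 1.0 / 2 ** L if level == 0 else 1.0 / 2 ** (L - level + 1)
        parts.append(weight * hist.ravel())
    # Concatenate all levels into a single 1 x D row vector.
    return np.concatenate(parts)[None, :]
```

With K visual words and levels 0 through L, the descriptor length is D = K * (1 + 4 + ... + 4^L), for example 21K when L = 2.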

Acknowledgement: Parts I and II of this assignment are adapted from Kristen Grauman.