CS1699: Homework 4

Due: 11/24/2015, 11:59pm

Instructions: Please provide your code and your written answers. Your written answers should be in the form of a single PDF or Word document (.doc or .docx). Your code should be written in Matlab. Zip or tar your written answers and .m files and upload the .zip or .tar file to CourseWeb -> CS1699 -> Assignments -> Homework 4. Name the file YourFirstName_YourLastName.zip or YourFirstName_YourLastName.tar. Include your name in your write-up.

Note: This homework includes up to 30 points of extra credit!

Part I: Scene categorization (60 points)

In this problem, you will develop two variants of a scene categorization system.
What you need to include in your submission:
  1. [15 points] function [pyramid] = computeSPMHistogram(im, codebook_centers); which computes the Spatial Pyramid Match histogram as discussed above. im should be a grayscale image whose SIFT features you should extract, codebook_centers should be the cluster centers from the bag-of-visual-words clustering operation, and pyramid should be a 1xd feature descriptor for the image. You're allowed to pass in optional extra parameters after the first two.
  2. [15 points] function [labels] = findLabelsKNN(pyramids_train, pyramids_test, labels_train); which predicts the labels of the test images using the KNN classifier. pyramids_train and pyramids_test should be an Mx1 cell array and an Nx1 cell array, respectively, where M is the size of the training image set and N is the size of the test image set, and each pyramids{i} is the 1xd Spatial Pyramid Match representation of the corresponding training or test image. labels_train should be an Mx1 vector of training labels, and labels should be an Nx1 vector of predicted labels for the test images.
  3. [5 points] function [labels] = findLabelsSVM(pyramids_train, pyramids_test, labels_train); which predicts the labels of the test images using an SVM. This function should include training the SVM. The inputs and outputs are defined as above but now use an SVM.
  4. [5 points] function [accuracy] = computeAccuracy(trueLabels, predictedLabels); which computes and prints the accuracy of a classifier on the test images, where trueLabels is the Nx1 vector of ground truth labels that came with the dataset, and predictedLabels is the corresponding Nx1 vector of labels predicted by the classifier.
  5. [20 points] A script which gets all images and their labels (feel free to reuse code from HW3 that shows how to get the contents of a directory), extracts the features of the training images, runs kmeansML to find the codebook centers, then computes SPM representations, and runs the KNN and SVM classifiers, including computing their accuracy. In this script, run the KNN classification with the following values of the KNN k (not to be confused with the k-means k = 100): 1, 5, 25, 125. In other words, you have to run KNN 4 times and show 4 accuracy values, plus 1 for SVM. Include your accuracy results in your write-up.
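
To make the expected data layout concrete, here is a minimal sketch of KNN voting and accuracy computation on toy data. It is not a reference solution. The toy descriptors, the choice of squared Euclidean distance, the value k = 3, and tie-breaking via mode are all assumptions made for illustration; the assignment does not specify a distance measure.

```matlab
% Toy data (made up): 4 training descriptors and 2 test descriptors, each 1xd with d = 3.
pyramids_train = {[1 0 0]; [0.9 0.1 0]; [0 0 1]; [0 0.1 0.9]};   % Mx1 cell array
labels_train   = [1; 1; 2; 2];                                   % Mx1 vector
pyramids_test  = {[0.8 0.2 0]; [0.1 0 0.9]};                     % Nx1 cell array
trueLabels     = [1; 2];                                         % Nx1 vector
k = 3;   % assumed neighbor count for this toy example

% Stack the cell arrays into Mxd and Nxd matrices.
Xtrain = cell2mat(pyramids_train);
Xtest  = cell2mat(pyramids_test);

N = size(Xtest, 1);
labels = zeros(N, 1);
for i = 1:N
    % Squared Euclidean distance (an assumption) from test descriptor i
    % to every training descriptor.
    diffs = Xtrain - repmat(Xtest(i, :), size(Xtrain, 1), 1);
    dists = sum(diffs .^ 2, 2);
    [~, order] = sort(dists, 'ascend');
    nearest = labels_train(order(1:k));
    % Majority vote; mode returns the smallest label when there is a tie.
    labels(i) = mode(nearest);
end

% Accuracy is the fraction of test images whose predicted label is correct.
accuracy = sum(labels == trueLabels) / numel(trueLabels);
fprintf('Accuracy: %.2f%%\n', 100 * accuracy);
```

In the actual script, the same loop would be run once for each of the four required values of k, and the accuracy printed each time.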

Part II: Pedestrian detection (40 points)

In this problem, you will implement a simple pedestrian detection system. This system is somewhat similar to the one described in the 2005 paper by Navneet Dalal and Bill Triggs, "Histograms of Oriented Gradients for Human Detection".
What you need to include in your submission:
  1. [20 points] A script setup_and_train.m that gets the positive crops and generates the negative crops (feel free to just use a sample for each), extracts their features, and trains an SVM with these.
  2. [20 points] A script test.m that implements sliding window detection for a test image, plus your write-up which includes your test images and the predicted person detections in each.
  3. [up to 20 points of extra credit] A script evaluate.m which computes Intersection Over Union scores, and from those, precision and recall. Also include the precision and recall scores in your write-up.
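
For the extra-credit evaluation, the following is a minimal sketch of how Intersection Over Union scores can be turned into precision and recall. It is only one possible approach, shown under stated assumptions: the [x1 y1 x2 y2] box format, the toy boxes, the 0.5 IoU threshold, and the greedy one-to-one matching of detections to ground-truth boxes are illustrative choices, not requirements of the assignment.

```matlab
% Boxes are rows of [x1 y1 x2 y2] (assumed format); all values are made up.
gt  = [10 10 50 90; 100 20 140 100];                  % ground-truth boxes
det = [12 14 52 94; 200 30 240 110; 98 18 138 98];    % predicted detections
thresh = 0.5;   % assumed IoU threshold for counting a detection as correct

% Area of a box; zero if the box is empty (x2 <= x1 or y2 <= y1).
boxArea  = @(b) max(0, b(3) - b(1)) * max(0, b(4) - b(2));
% Intersection rectangle of two boxes (may be empty).
interBox = @(a, b) [max(a(1), b(1)), max(a(2), b(2)), min(a(3), b(3)), min(a(4), b(4))];
% IoU = intersection area / union area.
iou = @(a, b) boxArea(interBox(a, b)) / ...
              (boxArea(a) + boxArea(b) - boxArea(interBox(a, b)));

matched = false(size(gt, 1), 1);   % each ground-truth box may be matched once
tp = 0;
for i = 1:size(det, 1)
    best = 0;
    bestJ = 0;
    for j = 1:size(gt, 1)
        if ~matched(j)
            s = iou(det(i, :), gt(j, :));
            if s > best
                best = s;
                bestJ = j;
            end
        end
    end
    if best >= thresh
        matched(bestJ) = true;     % true positive: claim this ground-truth box
        tp = tp + 1;
    end
end

fp = size(det, 1) - tp;            % detections with no matching ground truth
fn = size(gt, 1) - tp;             % ground-truth boxes that were missed
precision = tp / (tp + fp);
recall    = tp / (tp + fn);
fprintf('Precision: %.3f, Recall: %.3f\n', precision, recall);
```

Note that this sketch does not guard against the case of zero detections, where precision is undefined; a full evaluate.m would need to decide how to report that case.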