CS1674: Homework 9

Due: 12/4/2018, 11:59pm

This assignment is worth 50 points.

In this assignment, you will use a deep network to perform categorization. In order to simplify the problem, you will not train a deep net. Instead, you will use a common practice of simply using a pre-trained network to extract features. You will then use these features to train an SVM classifier which discriminates between the 10 scene categories used in HW7.

You will use the Caffe package, a very popular deep learning framework for computer vision. Caffe is a C++ framework, but has both Python and Matlab interfaces. In this assignment, we will only use the Matlab interface. We have installed Caffe for you on the class4.cs.pitt.edu server.

Submit all code you write in a single script solution.m.

Setup:

You will be connecting to the server via SSH. If you are using a Windows machine and haven't used SSH before, you will need to first download a SSH client such as PuTTY. You can download PuTTY from here. If you are using a Mac or Linux, you already have ssh installed.
If you are off campus and don't have the VPN installed, you will need to first connect to Pitt's server and then ssh over to the class4 server. If you are on a Mac or Linux, open a terminal and type: ssh USERNAME@unixs.cis.pitt.edu (where USERNAME is your Pitt username) and press enter to connect to the server. If you are on Windows, open PuTTY; for the host name, enter unixs.cis.pitt.edu, then click Open to connect to the server. You will need to enter your Pitt username and password when prompted by the server. Once you are connected to that server, you can then type: ssh class4.cs.pitt.edu and press enter to connect to the class4 server. If you are on campus, or have the VPN installed, you should be able to connect to the class4 server directly.
Once you are logged in, you will be taken to your AFS home directory and will probably see a "public" and "private" directory (if you have not changed these yourself). Make sure to put any assignment files you are working on in the private directory (or another directory which no one except you can access).
You can either write your Matlab assignment file on your own computer and transfer it to the server using scp (on Mac or Linux) or WinSCP (you'll need to download this on Windows) to run it on the server, or directly write the Matlab assignment file on the server using a text editor. On Mac or Linux an scp command to copy a file you've written to the server might look like this (where my username is clt29): scp your_matlab_assignment_file.m clt29@class4.cs.pitt.edu:/afs/pitt.edu/home/c/l/clt29/private/ (note you should replace "c" with the first letter of your username, and "l" with the second letter). This command will copy the Matlab file from your computer to your AFS storage space. If you are on Windows and install WinSCP, you will be presented with a GUI interface where you can drag and drop files from your computer to your AFS space.
Caffe requires libraries to be visible to Matlab for it to work. We need to tell Matlab where these libraries are located. Before starting matlab, type the following directly into the bash shell on the server: export LD_LIBRARY_PATH=/u/caffe/helper_tools/lib:/u/caffe/build/lib:/opt/cuda-7.5/lib64:/usr/local/lib:$LD_LIBRARY_PATH
To launch Matlab on the server, type matlab -nodisplay. We include the -nodisplay so that Matlab does not attempt to open the GUI interface. You can then begin typing commands as you normally would in Matlab or run a script that you write.
Matlab needs to know where to find its interface to the Caffe library. The Matlab addpath command allows you to specify the location of files your scripts need to run. Add the following line to the top of your script: addpath('/u/caffe/matlab/')

Loading the model and data, extracting features:

We will now load in a pretrained CNN model. The model we are loading has been trained on 1.4M images to classify each into 1000 classes. Add the following line to your script:
net = caffe.Net('/u/caffe/hw_cs1674/models/deploy.prototxt', '/u/caffe/hw_cs1674/models/alexnet.caffemodel', 'test');
The caffe.Net function loads a network model for use in Matlab. The first argument specifies a file containing the network structure which tells Caffe how the various network layers connect. The second argument specifies the learned model weights to load, and copies them into the network structure created by the first argument. The final argument tells Caffe to load the network in test mode, rather than train mode. You will see a lot of output appear once you execute this command which you can ignore.
It is standard practice to subtract the mean of the train data from each image before feeding the image to the CNN. Load the mean image of the train data by using the following command:
image_mean = caffe.io.read_mean('/u/caffe/hw_cs1674/models/mean227.binaryproto');
The data for this assignment is located at /u/caffe/hw_cs1674/data/scenes_lazebnik/. You will find 10 folders with scene category names. Each folder contains 60 images of the scene specified. You can use the Matlab imageSet function with the 'recursive' parameter to obtain a list of folders and all images in them for easy processing. Note this function is part of the Computer Vision Toolbox. Alternatively, you can any useful parts from load_split_dataset.m from HW7.
The name of the folder is the image's ground truth label. For each image, you will need to extract two feature types from the CNN, 'fc6' and 'fc7', and store them in a variable. Later, you will train a linear SVM on each of the feature types.
In order to extract the features for an image, first load the image in Matlab using the caffe.io.load_image('PATH') function, where PATH is the path to the image you want to pass through the network. Do not use Matlab's imread function. Caffe expects images in BGR format (instead of RGB), needs to have the width and height dimensions flipped, and needs to be in single precision. The load_image function will do all of these things for you automatically. Because we are using black and white images, we need to convert to color using cat(3, im, im, im). After this conversion, use Matlab's imresize function to resize the image to height and width 227 (which is what this model expects). After resizing the image, subtract the image_mean we loaded previously from the image. You can then run the image through the neural network by using the command net.forward({im});
Once the image has been run through the neural network, we are ready to extract features from the network. You will be extracting features from two of the three fully connected layers of the network, 'fc7' and 'fc6'. To extract an image feature from the network, use the command
net.blobs(feature_name).get_data();
Store the features you extract somewhere, so you can use them to train the SVM.

Training and testing the SVM:

Use 30 images from each of 10 classes to train, and 30 images from each class to test. Randomly sample the 30 images for training from the 60 images avaialble for each class. Use the remaining 30 images per class for testing.
Train 2 linear SVMs using Matlab's fitcecoc function.
Test each of your SVMs on the test set and report the accuracy, in a file accuracies.txt. Add a brief sentence in the text file saying how the performance of these features compares to the performance you obtained in HW7 with SIFT BOW and SIFT SPM.

Submission:

solution.m
accuracies.txt

Grading rubric:

[10 pts] Setting up the network, getting the mean image, iterating over the images in folders
[10 pts] Loading and preparing each image in order to pass it through the network
[10 pts] Extracting the features from the network for all images
[10 pts] Aggregating the features and training the two SVMs
[10 pts] Computing and reporting accuracy

Acknowledgement: This assignment was designed by Chris Thomas and Nils Murrugarra-Llerena.