CS 2770: Homework 1 (Matlab Version)

Due: 2/9/2017, 11:59pm

In this homework assignment, you will use a deep network to perform image categorization. You will first use a pretrained network (trained on a different problem) to extract features. You will then use these features to train an SVM classifier that discriminates between 20 object categories. Next, you will train a network on this task, with its weights initialized from the same pretrained network. Finally, you will compare the performance of the pretrained network to the network you trained on this problem.

You will use the Caffe package, a very popular deep learning framework for computer vision. Caffe is a C++ framework, but has both Python and Matlab interfaces. This page is for the Matlab interface. We have installed Caffe for you on the nietzsche.cs.pitt.edu server.

Training the CNN in this assignment may take a long time, and several of you will be using the limited computing resources at the same time, so be sure to start this assignment early.

Part I: SSH Basics - Getting Connected to the Server and Transferring Files

  1. You will be connecting to the server via SSH. If you are using a Windows machine and haven't used SSH before, you will first need to download an SSH client such as PuTTY. You can download PuTTY from here. If you are using a Mac or Linux, you already have SSH installed.
  2. This server only allows incoming connections from computers in the CS department or via the VPN client. If you are connecting from off campus, you must first install the Pulse VPN client (see here for instructions) in order to connect to the server. Connect to the VPN before trying to ssh to the server.
  3. If you are on a Mac or Linux, open a terminal and type: ssh nietzsche.cs.pitt.edu and press enter to connect to the server. If you are on Windows, open PuTTY and for the host name, enter nietzsche.cs.pitt.edu and click Open to connect to the server. You will need to enter your departmental username and password when prompted by the server.
  4. Once you are logged in, you will be taken to your AFS home directory and will probably see a "public" and "private" directory (if you have not changed these yourself). Make sure to put any assignment files you are working on in the private directory (or another directory which no one except you can access).
  5. You have two options for getting your Matlab assignment file onto the server. You can write it on your own computer and transfer it to the server using scp (on Mac or Linux) or WinSCP (which you'll need to download on Windows). Alternatively, you can write the file directly on the server using a text editor such as vim. On Mac or Linux, an scp command to copy a file you've written to the server might look like this (where my username is chris):
    scp file.m chris@nietzsche.cs.pitt.edu:/afs/cs.pitt.edu/usr0/chris/private/ 
    This command will copy the Matlab file from your computer to your AFS storage space. If you are on Windows and install WinSCP, you will be presented with a graphical interface where you can drag and drop files from your computer to your AFS space.

Part II: Setting Up Your Environment and Matlab for Caffe

  1. Caffe requires libraries to be visible to Matlab for it to work. We need to tell Matlab where these libraries are located. Before starting Matlab, copy paste the following directly into the shell on the server:
    bash (press enter after each line)
    export LD_LIBRARY_PATH=/tmp/caffe/ffmpeg:/opt/cuda-8.0-cuDNN5.1/lib64:/tmp/caffe/opencv/install/lib:/tmp/caffe/anaconda2/lib:/opt/OpenBLAS/lib:/usr/local/lib
    export PATH=/opt/cuda-8.0-cuDNN5.1/bin:/tmp/caffe/anaconda2/bin:/usr/lib64/qt-3.3/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin
  2. To launch Matlab on the server, type matlab -nodisplay. We include -nodisplay so that Matlab does not attempt to open its graphical interface. You can then begin typing commands as you normally would in Matlab or run a script that you write.
  3. Matlab needs to know where to find its interface to the Caffe library. The Matlab addpath command allows you to specify the location of files your scripts need to run. Add the following line to the top of your script: addpath('/tmp/caffe/matlab/')
  4. You will be using a GPU to accelerate your CNNs. There are 4 GPUs on this machine. Type nvidia-smi (before starting Matlab) to view them. Look in the center column and you will see four lines like: 0MiB / 11439MiB which show the memory utilization of each GPU. The first number is the current utilization. Note which GPU has the least memory utilization (this will change depending on who is using which GPUs). Once a model loads on a GPU, that memory cannot be used by anyone else, so make sure to exit Matlab when you are done with your work so that you do not hold memory unnecessarily.
  5. Add the following lines to the top of your script:
    caffe.set_device(0)  % replace 0 with the GPU number you noted above (0-3)
    caffe.set_mode_gpu()
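
If you would rather not read the nvidia-smi table by eye, the sketch below shows one possible way to pick the least-used GPU from within Matlab. It is only an illustration: the string out is a hypothetical stand-in for the text that nvidia-smi prints when called with its --query-gpu=memory.used --format=csv,noheader,nounits options, and the variable names are made up for this example.

```matlab
% Hypothetical output of:
%   nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits
% On the server you could obtain it with:
%   [status, out] = system('nvidia-smi --query-gpu=memory.used --format=csv,noheader,nounits');
out = sprintf('10325\n0\n2048\n11021\n');

used_mib = sscanf(out, '%f');     % one number per GPU, memory used in MiB
[~, idx] = min(used_mib);         % Matlab index (1-based) of the least-used GPU
gpu_id = idx - 1;                 % GPUs are numbered from 0

% With Caffe on the path, the result would then be passed on:
%   caffe.set_device(gpu_id);
%   caffe.set_mode_gpu();
```

Keep in mind that this is only a snapshot: another user may load a model onto the same GPU right after you check.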

Part III: Preparing the Dataset for the Experiment

  1. We need to subtract the mean of the training data from each image before the CNN classifies it. Load the mean image of the training data by using the following command:
    image_mean = caffe.io.read_mean('/tmp/caffe/models/data_mean.binaryproto');
  2. The data for this assignment is located at /tmp/caffe/data/. You will find 20 folders with images in them. Each folder name is the category of the images it contains. For each image, you will need to extract image features from the CNN and store them in a variable along with the name of the folder that the image came from. Later, you will train a linear SVM using these features to predict which folder an image came from. In the second part of the assignment, you will train the network on these images. Note: You can use the Matlab imageSet function with the recursive flag to easily get a list of all the images and the folders that they are in.
  3. You will need to randomly withhold 10% of the images as a validation set for training the CNN. Withhold an additional 10% as a test set for evaluation. You can use the Matlab datasample function with 'replace' set to false to randomly sample from the data (make sure to then remove the images you sample from the list you sampled them from). Make sure to retain your data split for the entire assignment because you will use the same data split for training, validating, and testing the neural network.
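
The split described in step 3 can be sketched as follows. This is a minimal illustration rather than the required implementation: it uses randperm instead of datasample (both sample without replacement), the image count n is a hypothetical stand-in for the length of your own image list, and the variable names are invented for this example.

```matlab
% Hypothetical number of images; in practice this is the length of your image list.
n = 1000;

perm = randperm(n);                          % random ordering with no repeats
n_val  = round(0.10 * n);                    % 10% for validation
n_test = round(0.10 * n);                    % another 10% for testing

val_idx   = perm(1 : n_val);
test_idx  = perm(n_val + 1 : n_val + n_test);
train_idx = perm(n_val + n_test + 1 : end);  % the remaining 80%

% To keep the same split for the whole assignment, store the indices once
% and load them in later scripts, for example:
%   save('split.mat', 'train_idx', 'val_idx', 'test_idx');
```

Because the three index sets are consecutive slices of one permutation, no image can land in more than one set.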

Part IV: Using a Pretrained Network as a Feature Extractor

  1. We will now load in a pretrained CNN model. The model we are loading has been trained on 1.4M images to classify images into 1000 classes (which aren't necessarily the object categories we will be classifying). Add the following line to your script:
    net = caffe.Net('/tmp/caffe/models/deploy.prototxt', '/tmp/caffe/models/weights.caffemodel', 'test')

    The caffe.Net function loads a network model for use in Matlab. The first argument specifies a file containing the network structure which tells Caffe how the various network layers connect. The second argument specifies the learned model to load containing the weights learned during training and copies those weights into the network structure created by the first argument.  The final argument tells Caffe to load the network in test mode, rather than train mode. You will see a lot of output appear once you execute this command which you can ignore.
  2. To extract the features for an image, first load the image in Matlab using the caffe.io.load_image('/path/to/image/to/load.jpg') function. Do not use Matlab's imread function. Caffe expects images in BGR format (instead of RGB), with the width and height dimensions flipped, and in single precision. The load_image function does all of this for you automatically. After loading the image, use Matlab's imresize function to resize the image to height and width 227 (which is what this model expects). After resizing the image, subtract the image_mean we loaded previously from the image. You can then run the image through the neural network by using the command net.forward({image});
  3. Once an image has been run through the neural network, we are ready to extract features from the network for that image. You will be extracting features from the fc8 layer of the network. To extract an image feature from the network for an image, use the command net.blobs('fc8').get_data(). Store the features you extract somewhere for training the SVM, along with the folder that the image came from.
  4. Train a linear SVM using Matlab's fitcecoc function on the train set, but do not train on the withheld validation set or test set. To specify that Matlab should train a linear SVM, pass the following templateSVM to the fitcecoc function (via its 'Learners' argument): templateSVM('Standardize',1,'KernelFunction','linear'); Because 'Standardize' is set to 1, Matlab will also automatically standardize your data for you. Note that you will not be using the validation set for this part of the assignment.
  5. Test your SVM on the test set and report the accuracy of the SVM at predicting the folder that the image was in. Also compute a confusion matrix of the predictions using the confusionmat function and include it in your submission. What do you observe about the types of errors the network makes?
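
To make the quantities in step 5 concrete, here is a small sketch of how accuracy and a confusion matrix relate to the true and predicted labels. The labels are hypothetical, and accumarray is used only to show what each entry means; for your submission, use confusionmat as the assignment asks.

```matlab
% Hypothetical true and predicted class indices (1..K) for six test images.
K = 3;
truth = [1; 1; 2; 2; 3; 3];
pred  = [1; 2; 2; 2; 3; 1];

accuracy = mean(pred == truth);            % fraction of correct predictions

% C(i,j) counts the images of true class i that were predicted as class j.
C = accumarray([truth pred], 1, [K K]);

per_class_acc = diag(C) ./ sum(C, 2);      % fraction correct within each true class
```

Large off-diagonal entries in C point to pairs of categories that the classifier tends to confuse.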

Part V: Preparing Your Own Network

  1. Before we train the network, we must first set up the network solver, which contains parameters necessary for training the network. Copy all of the prototxt files from the /tmp/caffe/models directory to your own directory.
  2. We will begin by editing the solver.prototxt file. You will see the syntax of the file when you open it. Each parameter is on its own line, followed by a colon and then its value.
  3. Now, we will need to change the train_val.prototxt file to handle our problem. Currently, the network is trained to handle 1000 object classes. We need to change the classifier output so that there are only 20 outputs (for our 20 categories). Find the line: num_output: 1000 and change it to num_output: 20 to accommodate the 20 object classes in our dataset. You will also need to rename the layer you changed since you changed the dimensions of the layer. Search the file for fc8 and rename it to something of your choice (it appears in multiple places, so be sure to change them all). While you are in this file, you can view the overall network structure and see the different layers in the network.

Part VI: Training and Evaluating Your Own Network

  1. We are now ready to begin training in Matlab. Begin by creating a Caffe solver:
    solver = caffe.Solver('Path to your solver.prototxt');

    This instantiates the solver in Matlab. However, we don't have enough data to train the network entirely from scratch, so we will initialize the network to the same weights we used before. To do this, type:
    solver.net.copy_from('/tmp/caffe/models/weights.caffemodel');
  2. Write a loop that passes through your train set 25 times (25 epochs). You will process 8 images each iteration. For each iteration, randomly choose 8 images and their labels from your train set (but do not use the same images again in that epoch). Note: Caffe accepts labels as 0 indexed, so your labels should be integers from 0 to 19, not strings. Load the 8 images and subtract the mean from each as you did in Part IV, step 2. You will now create an input "blob" for the Caffe network from the 8 preprocessed images. To do this, concatenate the 8 images along the fourth dimension using the cat(4,...) function to form a Matlab array of shape [227 227 3 8]. Also create an 8x1 labels array which contains an integer from 0 to 19 for each of the images in the input image blob.
  3. Provide Caffe with the data and labels using these commands:
    solver.net.blobs('data').set_data(INPUT MINIBATCH)
    solver.net.blobs('label').set_data(INPUT LABELS)
  4. Train the network on the minibatch using solver.step(1). This tells Caffe to perform one update of the weights using your minibatch.
  5. After each step of the solver, get the value of the "loss" layer and save it in an array. See Part IV, step 3 for how to get the value of a layer.
  6. After each epoch of training, evaluate the model on the validation set. To do this, load and preprocess the images as usual, and run the images through the network by providing the images and their labels as you did in step 3 above (you will need to run the images through the network in batches of 8). However, instead of calling net.forward, you need to run the network using solver.net.forward_prefilled(). Do not use solver.step because we are not training on the validation set. Finally, get the accuracy on each minibatch from the validation set by getting the result of the accuracy layer. Take the average of the accuracies of all the minibatches in the validation set and you have the accuracy of the network at that epoch.
  7. After training, you can use the solver.net.save('FILENAME.caffemodel') command to save your final trained network.
  8. Provide a plot of the train losses in your report. Also, provide a second plot of your validation set accuracies (you should have 25 numbers in this plot).
  9. Perform Part IV using your trained network instead of the pretrained model. Use your network which had the best accuracy on the validation set. You can reuse all of your code from Part IV. You will need to change the line to point to your network instead of the pretrained model:
    net = caffe.Net(MODIFIED DEPLOY FILE, YOUR CAFFEMODEL FILE, 'test')
  10. Note: you will also need to modify the deploy.prototxt file to have num_output: 20 and to use the name of the layer that you changed in the train_val.prototxt file (i.e. find every occurrence of fc8 in the deploy.prototxt file and rename it to whatever name you chose).
  11. Report the accuracy of your network on the train set and test set without the SVM. To do this, you can extract the network's classification scores for each image by accessing the output of the fc8 layer (remember to access your re-named version) and using the class with the max score as the network's prediction to compute the accuracy.
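
The bookkeeping around the training loop (0-indexed labels, non-repeating minibatches within an epoch, the 4-D input blob, and turning scores into a prediction) can be sketched as below. This is an illustration under stated assumptions, not the required solution: the folder names, the three-class setup, the two epochs, and all variable names are hypothetical, zero arrays stand in for preprocessed images, and the Caffe calls appear only as comments. Skipping the images left over after the last full batch is one simple choice, not something the assignment prescribes.

```matlab
% Hypothetical folder name for each of 20 training images (3 classes).
folders = {'cat', 'dog', 'cat', 'bird', 'dog', 'bird', 'cat', 'dog', ...
           'bird', 'cat', 'dog', 'bird', 'cat', 'dog', 'bird', 'cat', ...
           'dog', 'bird', 'cat', 'dog'};
[classes, ~, class_idx] = unique(folders);   % classes are sorted alphabetically
labels_all = class_idx(:) - 1;               % 0-indexed labels, as Caffe expects

n_train = numel(folders);
batch_size = 8;
n_epochs = 2;                                % the assignment uses 25
n_batches = floor(n_train / batch_size);     % leftover images are skipped this epoch
step = 0;

for epoch = 1:n_epochs
    order = randperm(n_train);               % fresh random order every epoch
    for b = 1:n_batches
        % 8 distinct images; no image repeats within an epoch.
        idx = order((b-1)*batch_size + 1 : b*batch_size);

        imgs = cell(1, batch_size);
        for k = 1:batch_size
            % Stand-in for the loaded, resized, mean-subtracted image idx(k).
            imgs{k} = zeros(227, 227, 3, 'single');
        end
        batch = cat(4, imgs{:});             % size [227 227 3 8]
        batch_labels = labels_all(idx);      % 8x1, one label per image

        % With Caffe loaded, steps 3 and 4 would go here:
        %   solver.net.blobs('data').set_data(batch);
        %   solver.net.blobs('label').set_data(batch_labels);
        %   solver.step(1);
        step = step + 1;
    end
end

% Step 11: turn one image's class scores into a 0-indexed prediction.
scores = [0.2; 3.1; -0.5];                   % hypothetical scores for 3 classes
[~, best] = max(scores);
pred_label = best - 1;                       % classes{best} is the predicted folder
```

Drawing each epoch's batches from a single permutation guarantees that no image is used twice within an epoch, which is what step 2 requires.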

If you need additional help with Matlab Caffe syntax, you may want to consult the Caffe interface tutorial. Scroll down to the "Use MatCaffe" section. It is short and covers basics of how to create a network, perform input and output, access data blobs, and train a network.

Grading rubric:

  1. [10 points] Setting up and splitting the data correctly.
  2. [30 points] Accuracy of pretrained model using SVM, and confusion matrix.
  3. [40 points] Accuracy of trained model without SVM.
  4. [10 points] Accuracy of trained model using SVM.
  5. [10 points] Plot of train losses and validation accuracies.
Acknowledgement: The photos used for this assignment come from the PASCAL VOC dataset. The network model used in this assignment is AlexNet.