CS1674: Homework 9

Due: 12/5/2019, 11:59pm

This assignment is worth 50 points.


In this assignment, you will use deep networks to perform categorization. You will use the same dataset as for HW7. However, because of specifics of the functions we are going to use, you will need to download a new copy of the data: scenes_all, which contains all the data (train and test splitting will be described later).

This assignment has five parts. Each part is worth 10 points. In the first, you will train a neural network from scratch. In the second and third, you will transfer layers from a pretrained network, then append one fully-connected (FC) layer. In the fourth, you will experiment with different learning rates. In the last part, you will describe your findings.

First, you might need to install the Matlab Deep Learning (DL) Toolbox add-on. Go to Home --> Add-ons to do that.

For each problem, write your code in a separate script titled part_X.m where X is i, ii, iii or iv. Also submit a separate file answers.txt where you describe and compare the performance of your different networks. Briefly hypothesize why you observe these trends, based on what we have discussed in class.

You will need to rely on the Matlab documentation to learn how to use the built-in neural network functions. Any existing functions are fair game; of course, do not look for scripts that accomplish the entirety of what an assignment part asks. However, for this assignment, the goal IS to learn how to use the documentation, so please do look up functions and examples. Some useful links are below. Please skim through all of them to get a sense of how the DL toolbox works.
  1. deep learning with images
  2. train network
  3. training options
  4. transfer learning
  5. classify
  6. convolution layer
  7. max pooling layer
  8. fully connected layer
Unless otherwise specified, use a learning rate of 0.001, a maximum of 1 epoch, 100 images per class for training, and the remaining 50 per class for testing. Note that Matlab's DL toolbox provides support for splitting your data so you don't have to do this manually; see the links above.
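The split above can be done with an imageDatastore. The sketch below assumes scenes_all is unpacked into one subfolder per class; adjust the path to wherever you placed the data.

```matlab
% Sketch: load the data and split 100 images per class for training,
% leaving the remaining 50 per class for testing.
imds = imageDatastore('scenes_all', ...
    'IncludeSubfolders', true, 'LabelSource', 'foldernames');
[imdsTrain, imdsTest] = splitEachLabel(imds, 100, 'randomized');
```

If your images are not already 227x227x3, see the "deep learning with images" link for ways to resize them when they are read in.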

You will need to train 5 networks in total, and training each should take about 15-20 min on CPU, so allow at least 2 hours for training, separate from writing and debugging your code.


Part I (10 pts):

In this part, you will train a neural network from scratch to classify the eight scenes from HW7.
  1. You need to specify a folder for the train set and the test set. Refer to the "train network" link for details on how to set up the data.
  2. Create a network with three types of layers (denoted A, B, C in the following). These correspond to a few more layers in practice. First, use an image input layer; this layer takes in images of size 227x227x3. Next, use a group of layers for type A: a convolutional layer with 50 filters of size 11x11 (the same size as the filters in the first layer of AlexNet), followed by RELU and a max pooling layer of size 3x3 and stride 1. Then use a group B of 60 5x5 filters, RELU, and max pooling of size 3x3 and stride 2. Then create group C: a fully-connected layer of size 8 (for 8 classes), followed by a softmax layer (which computes probabilities) and a classification layer. Check the links above for the corresponding functions and their input formats.
  3. You need to specify options for training the network. Use the "training options" link above. Specify the max number of epochs, the learning rate, and set the 'Plots' variable such that it shows training progress.
  4. For simplicity, we will not use a validation set. Train the network and report accuracy on the test set after the last iteration. You can use the classify function, and the imdsTest.Labels variable to get the ground-truth labels on the test set.
  5. In your answers file, hypothesize why you see such high/low performance. Keep in mind what performance was in HW7.
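The steps above can be sketched as follows. This is a minimal outline, assuming imdsTrain and imdsTest were created with splitEachLabel as described earlier; consult the linked documentation pages for the full set of options.

```matlab
% Sketch of Part I: build, train, and evaluate a small network from scratch.
layers = [
    imageInputLayer([227 227 3])
    convolution2dLayer(11, 50)          % group A: 50 filters of size 11x11
    reluLayer
    maxPooling2dLayer(3, 'Stride', 1)
    convolution2dLayer(5, 60)           % group B: 60 filters of size 5x5
    reluLayer
    maxPooling2dLayer(3, 'Stride', 2)
    fullyConnectedLayer(8)              % group C: one output per class
    softmaxLayer
    classificationLayer];

options = trainingOptions('sgdm', ...
    'InitialLearnRate', 0.001, ...
    'MaxEpochs', 1, ...
    'Plots', 'training-progress');

net = trainNetwork(imdsTrain, layers, options);

% Accuracy: fraction of test images whose predicted label matches ground truth.
predicted = classify(net, imdsTest);
accuracy = mean(predicted == imdsTest.Labels);
```

The 'sgdm' solver is one reasonable choice; the "training options" link lists the alternatives.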


Part II (10 pts):

In this part, you will transfer layers from an AlexNet network trained on the ImageNet dataset. Refer to the "transfer learning" link. Transfer all layers up to but excluding the FC6 layer. It will be helpful to see what layers you are transferring; inspect net.Layers to get a list of the layers in AlexNet. Append a single fully-connected layer (of size 8), followed by softmax and classification. Then train and evaluate performance, and describe your observations.
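A sketch of the layer transfer is below. It assumes the standard 25-layer AlexNet from Matlab's pretrained model support package, in which FC6 is layer 17, so layers 1 through 16 are everything before it; verify the index yourself by printing net.Layers before relying on it.

```matlab
% Sketch: transfer AlexNet layers up to (but excluding) fc6.
% Requires the "Deep Learning Toolbox Model for AlexNet Network" add-on.
net = alexnet;
layersTransfer = net.Layers(1:16);   % assumed index of the layer before fc6

layers = [
    layersTransfer
    fullyConnectedLayer(8)           % new classifier head for the 8 scenes
    softmaxLayer
    classificationLayer];
```

Training and evaluation then proceed exactly as in Part I, reusing the same trainingOptions.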


Part III (10 pts):

In this part, you will also transfer layers, but additionally transfer FC6 and FC7, all the way up to (but excluding) FC8. Also transfer the layers that come after FC6 and FC7 (RELU, dropout). Now append a single fully-connected layer, as before. Train the network, evaluate performance on the test set, and describe your observations in the answers file.
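Only the transfer point changes relative to Part II. The sketch below assumes FC8 is layer 23 in the standard AlexNet layer list, so layers 1 through 22 include FC6, FC7, and the RELU and dropout layers that follow each; again, confirm the index with net.Layers.

```matlab
% Sketch: transfer AlexNet layers up to (but excluding) fc8.
net = alexnet;
layersTransfer = net.Layers(1:22);   % assumed: through fc7's relu and dropout

layers = [
    layersTransfer
    fullyConnectedLayer(8)
    softmaxLayer
    classificationLayer];
```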


Part IV (10 pts):

In this part, you will construct a network with the same structure as the previous part, but use different learning rates. In particular, in addition to 0.001 which you tried before, also use 0.0001 and 0.01. Train the networks, separately using each learning rate, evaluate performances on the test set, and describe your observations in the answers file.
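One way to organize this part is a loop over the three rates, rebuilding the options each time. The sketch assumes the Part III layers array and the imdsTrain/imdsTest split are already in the workspace.

```matlab
% Sketch: train the Part III network once per learning rate.
for lr = [0.0001 0.001 0.01]
    options = trainingOptions('sgdm', ...
        'InitialLearnRate', lr, ...
        'MaxEpochs', 1, ...
        'Plots', 'training-progress');
    net = trainNetwork(imdsTrain, layers, options);
    predicted = classify(net, imdsTest);
    fprintf('learning rate %g: accuracy %.4f\n', ...
        lr, mean(predicted == imdsTest.Labels));
end
```

Record each accuracy as it prints, since you will list all of them in answers.txt for Part V.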


Part V (10 pts):

In a file answers.txt, list all accuracies for each setting described above. Then hypothesize the reasons for the relative performance of each method. For the first part, hypothesize reasons for the network's performance relative to the performance of the SVM classifier we developed in HW7. For later parts, hypothesize reasons for the performance of the network in that part, compared to the network in the previous part. Aim for about 5-8 sentences.


Submission: