CS1674: Homework 9

Due: 11/5/2020, 11:59pm

This assignment is worth 50 points.


In this assignment, you will train deep networks to perform categorization. The assignment has four parts. In the first part, you will train a neural network from scratch. In the second and third, you will transfer layers from a pretrained network, then append and train one fully-connected (FC) layer. In the last part, you will describe your findings.

You will use the same dataset as for HW7. However, because we want to make sure we use square images (i.e. 227x227), you will need to create a separate folder with the same eight folders (categories) as those in HW7. Inside each folder, copy only the images with "resized" in the filename, resulting in 1200 images total in the top-level folder scenes_lazebnik. (You can also download a new copy of the data, with just the resized images, from Canvas.)

You will also need to install the Matlab Deep Learning (DL) Toolbox add-on. Go to Home --> Add-ons to do that.

For each problem, write your code in a separate script titled part_X.m where X is i, ii or iii. Also submit a separate file answers.txt where you describe and compare the performance of your different networks. Briefly hypothesize why you observe these trends, based on what we have discussed in class.

You will need to rely on the Matlab documentation to learn how to use the built-in neural network functions. Any existing functions are fair game-- of course, do not look for scripts that accomplish the entirety of what an assignment part asks. However, for this assignment, the goal IS to learn how to use the documentation, hence please do look up functions and examples. Some useful links are below. Please skim through all of them to get a sense of how the DL toolbox works.
  1. deep learning with images
  2. train network
  3. training options
  4. transfer learning
  5. classify
  6. convolution layer
  7. max pooling layer
  8. fully connected layer
Some of the functions you will need to call are imageDatastore, splitEachLabel (to load and split the dataset; we will not use load_split_dataset from before), trainingOptions, trainNetwork, classify. Doing the assignment will be very easy if you skim through the references above.
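For reference, the data-loading step might look like the following sketch (the folder name and per-class split count come from this handout; the 'randomized' option is one reasonable choice, not a requirement):

```matlab
% Load all images, using each subfolder name as the class label.
imds = imageDatastore('scenes_lazebnik', ...
    'IncludeSubfolders', true, ...
    'LabelSource', 'foldernames');

% 100 training images per class; the remaining images form the test set.
[imdsTrain, imdsTest] = splitEachLabel(imds, 100, 'randomized');
```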

Unless otherwise specified, use a learning rate of 0.001, a maximum of 1 epoch, 100 images per class for training, and the rest (50 per class) for testing.

You will need to train 5 networks in total, and training each should take about 10-45 min on CPU, so allow at least 2 hours for training, separate from writing and debugging your code. Your computer might become slow or unresponsive, so reduce the number of applications running while training. On my computer, Part I took about 10 min, Part II about 20 min, and Part III about 45 min. Training may take less time if you are using a more recent version of Matlab; this is ok assuming you get reasonable performance in the last two parts.


Part I (15 pts):

In this part, you will train a neural network for the task of classifying the eight scenes from HW7, from scratch.
  1. You need to specify a folder for the train set and the test set. Refer to the "train network" link for details on how to set up the data.
  2. Create a network with three types of layers (denoted A, B, C in the following). First, use an image input layer; this layer takes in the images at size 227x227x3. Next, use a group of layers for type A: a convolutional layer with 50 filters of size 11x11 (the same size as the filters in the first layer of AlexNet), followed by a RELU layer and a max pooling layer of size 3x3 with stride 1. Then use a group B: 60 filters of size 5x5, RELU, and max pooling of size 3x3 with stride 2. Then create group C: a fully-connected layer of size 8 (for the 8 classes), followed by a softmax layer (which computes probabilities) and a classification layer. Check the links above for the corresponding functions and their input formats.
  3. You need to specify options for training the network. Use the "training options" link above. Specify the max number of epochs, the learning rate, and set the 'Plots' variable such that it shows training progress.
  4. For simplicity, we will not use a validation set. Train the network and output performance on the test set after the last iteration. You can use the classify function and the imdsTest.Labels variable to get the ground-truth labels on the test set.
  5. In your answers file, hypothesize why you see such high/low performance. Keep in mind what performance was in HW7.
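Put together, the steps above might be sketched as follows. This assumes imdsTrain and imdsTest were created with imageDatastore and splitEachLabel as described earlier; the layer sizes come directly from step 2, and 'sgdm' is one common solver choice:

```matlab
% Architecture from step 2: input, group A, group B, group C.
layers = [
    imageInputLayer([227 227 3])
    % group A
    convolution2dLayer(11, 50)
    reluLayer
    maxPooling2dLayer(3, 'Stride', 1)
    % group B
    convolution2dLayer(5, 60)
    reluLayer
    maxPooling2dLayer(3, 'Stride', 2)
    % group C
    fullyConnectedLayer(8)
    softmaxLayer
    classificationLayer];

% Training options from step 3 and the defaults above.
options = trainingOptions('sgdm', ...
    'InitialLearnRate', 0.001, ...
    'MaxEpochs', 1, ...
    'Plots', 'training-progress');

net = trainNetwork(imdsTrain, layers, options);

% Step 4: accuracy on the test set.
predicted = classify(net, imdsTest);
accuracy = mean(predicted == imdsTest.Labels);
```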

Part II (15 pts):

In this part, you will transfer layers from an AlexNet network trained on the ImageNet dataset. Refer to the "transfer learning" link. Transfer all layers up to, but excluding, the FC6 layer. It will be helpful to see which layers you are transferring; try net.Layers to get a list of the layers in AlexNet. Append a single fully-connected layer (of size 8), followed by softmax and classification layers. Then train and evaluate performance, and describe your observations.
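As a sketch: in the standard AlexNet layer list, fc6 is layer 17 (verify this yourself by printing net.Layers -- the index is an assumption here), so transferring everything before it might look like:

```matlab
anet = alexnet;          % requires the AlexNet support package
anet.Layers              % inspect the layer list; find where 'fc6' sits

transferred = anet.Layers(1:16);   % everything up to, but excluding, fc6

layers = [
    transferred
    fullyConnectedLayer(8)         % new head for the 8 scene classes
    softmaxLayer
    classificationLayer];
```

The new head is then trained on the scene data with the same trainingOptions as in Part I.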


Part III (10 pts):

In this part, you will also transfer layers, but additionally transfer FC6 and FC7, all the way up to (but excluding) FC8. Also transfer the layers that come after FC6 and FC7 (RELU, dropout). Now append a single fully-connected layer, as before. Train the network, evaluate performance on the test set, and describe your observations in the answers file.
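Analogously to Part II, a sketch of this transfer (assuming the standard AlexNet layer list, where fc8 is layer 23 -- verify with anet.Layers before relying on the index) might be:

```matlab
anet = alexnet;

layers = [
    anet.Layers(1:22)    % conv stack + fc6/relu6/drop6 + fc7/relu7/drop7
    fullyConnectedLayer(8)
    softmaxLayer
    classificationLayer];
```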


Part IV (10 pts):

In a file answers.txt, list all accuracies for each setting described above. Then hypothesize the reasons for the relative performance of each method. For the first part, hypothesize reasons for the network's performance relative to the performance of the SVM classifier we developed in HW7. For later parts, hypothesize reasons for the performance of the network in that part, compared to the network in the previous part. Aim for about 3-5 sentences.


Submission: