CS 1678: Homework 2

Due: 10/6/2021, 11:59pm

This assignment is worth 40 points. The first part asks you to implement a simple neural network from scratch, and the second asks you to test it. The third part asks you to perform "training" of a neural network by hand (following an exercise we will do in class).


Part A: Training a neural network (14 points)

In this part, you will write code to train and apply a very simple neural network. Follow the example in Bishop Ch. 5 (linked under Readings), which uses a single hidden layer with a tanh activation, an identity function at the output layer, and a squared error loss. The network will have 30 hidden neurons (i.e. M=30) and 1 output neuron (i.e. K=1). To implement it, follow the equations in the slides and Bishop Ch. 5. You can include the bias term or omit it. Make sure that you use a small enough learning rate (e.g. 10e-3 to 10e-5) and that you initialize the weights to small random numbers, as shown in the slides.

First, write a function forward that takes inputs X, W1, W2 and outputs y_pred, Z. This function computes activations from the front towards the back of the network, using fixed input features and weights. You will also use forward to apply the network (run inference) and to compute its loss during and after training.
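The forward pass described above can be sketched as follows. This is only one possible layout, not the required one: the array shapes (samples as rows of X, W1 of shape M x D, W2 of shape 1 x M) are assumptions, since the assignment does not fix them, and the bias is assumed to be handled by a constant-1 column in X if you choose to include it.

```python
import numpy as np

def forward(X, W1, W2):
    """Forward pass with a tanh hidden layer and an identity output.

    Assumed shapes (not prescribed by the assignment):
      X  : (N, D) inputs, one sample per row (a single (D,) vector also works)
      W1 : (M, D) input-to-hidden weights
      W2 : (1, M) hidden-to-output weights
    Returns y_pred with shape (N,) and hidden activations Z with shape (N, M).
    """
    X = np.atleast_2d(X)           # promote a single sample to shape (1, D)
    Z = np.tanh(X @ W1.T)          # hidden activations, shape (N, M)
    y_pred = (Z @ W2.T).ravel()    # identity output unit, shape (N,)
    return y_pred, Z
```

Returning Z alongside y_pred is what lets the training routine reuse the hidden activations when it backpropagates the output error.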

Second, write a function backprop that takes inputs X, y, M, iters, eta and outputs W1, W2, error_over_time. This function performs training using backpropagation (and calls the forward function as it iterates). Construct the network in this function, i.e. create the weight matrices and initialize the weights to small random numbers, then iterate: pick a training sample, compute the error at the output, then backpropagate to the hidden layer, and update the weights with the resulting error.
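A minimal sketch of this training loop is given below, under several assumptions that the assignment leaves open: the weight shapes (W1 is M x D, W2 is 1 x M), an initialization scale of 0.01, uniform random sampling of the training example, and recording half the mean squared error over the whole training set at each iteration. The seed argument is a hypothetical extra parameter added only to make runs repeatable. A compact forward is repeated so that the block runs on its own.

```python
import numpy as np

def forward(X, W1, W2):
    # tanh hidden layer, identity output (shapes are assumptions)
    Z = np.tanh(np.atleast_2d(X) @ W1.T)
    return (Z @ W2.T).ravel(), Z

def backprop(X, y, M, iters, eta, seed=0):
    """Train a one-hidden-layer network with stochastic gradient descent.

    seed is a hypothetical extra argument, not part of the assignment.
    """
    rng = np.random.default_rng(seed)
    N, D = X.shape
    # Small random initial weights; the 0.01 scale is an assumption.
    W1 = rng.normal(scale=0.01, size=(M, D))
    W2 = rng.normal(scale=0.01, size=(1, M))
    error_over_time = np.zeros(iters)

    for t in range(iters):
        n = rng.integers(N)                 # pick one training sample
        x_n = X[n]
        y_pred, Z = forward(x_n, W1, W2)
        z = Z[0]                            # hidden activations, shape (M,)

        # Output error for squared loss with an identity output unit.
        delta_out = y_pred[0] - y[n]
        # Backpropagate through tanh: derivative is 1 - z^2.
        delta_hid = (1.0 - z ** 2) * W2[0] * delta_out

        # Gradient steps (delta_hid was computed with the old W2).
        W2 -= eta * delta_out * z[np.newaxis, :]
        W1 -= eta * np.outer(delta_hid, x_n)

        # Track half the mean squared error on the full training set.
        full_pred, _ = forward(X, W1, W2)
        error_over_time[t] = 0.5 * np.mean((full_pred - y) ** 2)

    return W1, W2, error_over_time
```

Evaluating the error on the full training set at every step is convenient for plotting but costs one extra forward pass per iteration; recording only the error of the sampled example would be cheaper and noisier.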

Part B: Testing your neural network on wine quality (14 points)

You will use the Wine Quality dataset. Use only the red wine data. The goal is to predict the quality score of a wine based on its attributes. Write your code in a script neural_net.py.
  1. [6 pts] First, download the winequality-red.csv file, load it, and divide the data into a training set and a test set, using approximately 50% for training. Standardize the data by computing the mean and standard deviation of each feature dimension using the training set only, then subtracting the mean and dividing by the standard deviation for each feature of each sample (in both sets). Append a 1 to each feature vector, which will correspond to the bias that our model learns. Set the number of hidden units, the number of iterations to run, and the learning rate.
  2. [3 pts] Call the backprop function to construct and train the network. Use 1000 iterations and 30 hidden neurons.
  3. [3 pts] Then call the forward function to make predictions on the test set and compute the root mean squared error between predicted and ground-truth labels, sqrt(mean(square(y_test_pred - y_test))). Report this number in a file report.pdf/docx.
  4. [2 pts] Experiment with three different values of the learning rate. For each, plot the error over time (output by backprop above). Include these plots in your report.
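
The data preparation in step 1 and the error measure in step 3 can be sketched as below. The helper names (standardize, append_bias, rmse) are hypothetical, and synthetic random data stands in for the wine file so that the block runs without a download; the last column is treated as the label, which is an assumption about the file layout.

```python
import numpy as np

def standardize(X_train, X_test):
    """Scale both sets using the mean and std of the training set only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0            # guard against constant features
    return (X_train - mean) / std, (X_test - mean) / std

def append_bias(X):
    """Append a constant-1 column that acts as the bias input."""
    return np.hstack([X, np.ones((X.shape[0], 1))])

def rmse(y_pred, y_true):
    """Root mean squared error, as defined in step 3."""
    return np.sqrt(np.mean(np.square(y_pred - y_true)))

# Synthetic stand-in for the wine data: 11 features plus a label column.
rng = np.random.default_rng(0)
data = rng.normal(loc=5.0, scale=2.0, size=(100, 12))
X, y = data[:, :-1], data[:, -1]

# Random split, roughly 50% for training.
perm = rng.permutation(len(X))
half = len(X) // 2
train_idx, test_idx = perm[:half], perm[half:]

X_train, X_test = standardize(X[train_idx], X[test_idx])
X_train, X_test = append_bias(X_train), append_bias(X_test)
y_train, y_test = y[train_idx], y[test_idx]
```

For the real file, something like np.genfromtxt("winequality-red.csv", delimiter=";", skip_header=1) may work, assuming the file is semicolon-delimited with one header row; check the file before relying on this.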

Part C: Computing weight updates by hand (12 points)

In class, we saw how to compute activations in a neural network, and how to perform stochastic gradient descent to train it. We computed activations for two example networks, but only showed how to train one of them. Show how to train the second network using just a single example, x = [1 1 1], y = [0 0] (note that in this case, the label is a vector). Initialize all weights to 0.05. Use a learning rate of 0.3. You only need to perform a single iteration of training. Include your answers in text form in the file report.pdf/docx.
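
For reference, the single-example update equations from Bishop Ch. 5 are sketched below for a tanh hidden layer and identity outputs, as in Part A. Whether the second in-class network uses these same activations is an assumption; if it uses a different activation, replace the factor (1 - z_j^2) with that activation's derivative. Here t_k denotes the target (the label y in this assignment), y_k the network output, and eta the learning rate.

```latex
\begin{align*}
a_j &= \sum_i w^{(1)}_{ji} x_i, \qquad z_j = \tanh(a_j), \qquad
y_k = \sum_j w^{(2)}_{kj} z_j \\
E &= \tfrac{1}{2} \sum_k (y_k - t_k)^2 \\
\delta_k &= y_k - t_k, \qquad
\delta_j = (1 - z_j^2) \sum_k w^{(2)}_{kj} \delta_k \\
w^{(2)}_{kj} &\leftarrow w^{(2)}_{kj} - \eta\, \delta_k z_j, \qquad
w^{(1)}_{ji} \leftarrow w^{(1)}_{ji} - \eta\, \delta_j x_i
\end{align*}
```

Both deltas are computed with the weights as they were before the update, so the hidden-layer error uses the initial second-layer weights, not the updated ones.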


Submission: Please include the following files in your submission zip file: