CS 1699: Homework 1

Due: 1/23/2020, 11:59pm

This assignment is worth 50 points. Each question/micro-exercise is worth 2.5 points.
The starter code hw1.py is in the zip file hw1_starter.zip, provided on CourseWeb. Your task is to complete the functions in the starter file. The specific function(s) that you need to complete are listed in parentheses at the end of each question. Where a question asks you to write something down, put your answer in a file answers.txt.
It is fair game to look up the Python documentation, or to look for answers on the web, assuming you look for individual functions that accomplish what you are asked, rather than entire code blocks.
Please use Python3 for all assignments (Python3.5+ recommended). You can also use numpy/scipy, scikit-image and matplotlib libraries for this assignment.

Matrices and functions:

Generate a 1000000x1 (one million by one) vector of random numbers from a Gaussian (normal) distribution with mean of 0 and standard deviation of 5. (generate_random_numbers)
Add 1 to every value in the vector from the previous question, by using a loop. To determine how many times to loop, use the size or shape functions. Time this operation and print the elapsed time in your code. Write that number down in answers.txt. (add_one_by_loop, measure_time_consumptions)
Now add 1 to every value in the original random vector, without using a loop. Time this operation, print the time and write it down. (add_one_without_loop, measure_time_consumptions)
Plot the exponential function 2**x, for non-negative even values of x smaller than 30, without using loops. Save the figure to a file called exponential.png for submission. (plot_without_loop)
Create a script that prints all the values between 1 and 10, in random order, with a pause of 1 second between consecutive prints. (print_one_to_ten_in_random_order_with_pauses)
Generate two random matrices A and B, and compute their product by hand, using loops. It is guaranteed that the two matrices can be multiplied. Your code should generate the same results as Python's A@B operation or numpy's np.matmul(). (matrix_multiplication_by_loop)
Generate a matrix of shape [10, 10] containing numbers from 0 to 99 by manipulation of a given vector. Specifically, given a vector containing numbers ranging from 0 to 9, you need to perform some matrix manipulations on the vector (addition, transpose, broadcast, etc.), and generate a matrix containing 0 to 99. You should not initialize the desired matrix manually. (matrix_manipulation)
Write a function normalize_rows which uses a single command (one line and no loops) to make the sum in each row of the matrix 1. More specifically, row-wise normalization requires the following property to hold:
1. Sum of the entries in each row should be 1.
2. If the elements in a row were not identical before the normalization, they should remain different after your normalization, and their relative order should be preserved.
Assume the input matrix to your function is (1) non-negative and (2) all rows contain at least 1 non-zero element. (normalize_rows)
Create a recursive function that returns the n-th number (n >= 1) in the Fibonacci sequence 1, 1, 2, 3, 5, 8, 13... Call it to demonstrate how it works. (recursive_fibonacci)
Implement a function that takes in a matrix M, removes duplicate rows from that input matrix and outputs the result as matrix N. You cannot call numpy's np.unique or Python's unique functions. (unique_rows)
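
As a rough illustration of the loop versus no-loop timing, broadcasting, and row normalization ideas in the questions above, here is a minimal sketch. It is not the starter code: the helper names (time_call, add_one_loop, add_one_vectorized) are hypothetical, the vector is one-dimensional and much shorter than the one million entries the assignment asks for, and time.perf_counter is just one reasonable way to measure elapsed time.

```python
import time
import numpy as np


def time_call(fn, *args):
    """Hypothetical helper: return (result, elapsed_seconds) for one call."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def add_one_loop(v):
    """Add 1 to each entry with an explicit loop over the vector length."""
    out = v.copy()
    for i in range(out.shape[0]):
        out[i] += 1
    return out


def add_one_vectorized(v):
    """Add 1 to each entry using numpy's elementwise arithmetic."""
    return v + 1


# Assumption: a short 1-D vector so the sketch runs quickly.
rng = np.random.default_rng(0)
v = rng.normal(loc=0.0, scale=5.0, size=10_000)

looped, t_loop = time_call(add_one_loop, v)
vectorized, t_vec = time_call(add_one_vectorized, v)
print(f"loop: {t_loop:.6f}s, vectorized: {t_vec:.6f}s")

# Broadcasting: a (10, 1) column of tens plus a (10,) row of units
# yields a (10, 10) grid holding 0..99 in row-major order.
digits = np.arange(10)
grid = digits[:, np.newaxis] * 10 + digits

# Row normalization in one expression: divide each row by its own sum.
m = np.array([[1.0, 3.0], [2.0, 2.0]])
normalized = m / m.sum(axis=1, keepdims=True)
```

Dividing by the row sum is one way to satisfy both normalization properties for non-negative input, since dividing a row by a positive constant keeps distinct entries distinct and in the same order.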

Images:

Read pittsburgh.png into Python as a matrix, and write down its dimensions. (read_image)
Convert the image to grayscale. There are a few different libraries for handling images in Python (such as Scikit-Image, PIL, OpenCV, etc.). The input and output of your function should be np.ndarray, so please make sure you are dealing with the correct data type if you want to use an external library. You are also welcome to implement this function by yourself (via matrix manipulation). (convert_image_into_grayscale)
Find the darkest pixel in the image, and write its value and [row, column] in your answer file. (find_darkest_pixel)
Place a 31x31 square (a square with side equal to 31 pixels) centered on the darkest pixel from the previous question. In other words, replace all pixels in that square with white pixels. (mask_image_around_darkest_pixel)
Display the modified image (which includes the original image with a white square over it), and save the new figure to a file masked_image.png. (save_image)
Using the original pittsburgh.png image, compute the scalar average pixel value along each channel (R, G, B) separately, then subtract the average value per channel. Display the resulting image and write it to a file mean_subtracted.png. (subtract_per_channel_mean)
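
The array operations behind the image questions above can be sketched on synthetic data. This is only an illustration under stated assumptions: the random array stands in for pittsburgh.png (no file is read), grayscale is taken as a plain average of the three channels (libraries such as Scikit-Image use weighted formulas instead), and white is assumed to be 255 on a 0 to 255 scale.

```python
import numpy as np

# Assumption: a random (height, width, 3) array stands in for the real image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(64, 48, 3)).astype(np.float64)

# Grayscale as an unweighted channel average (one of several conventions).
gray = image.mean(axis=2)

# Darkest pixel: argmin gives a flat index, unravel_index converts it
# back to a (row, column) pair.
row, col = np.unravel_index(np.argmin(gray), gray.shape)

# 31x31 square centered on that pixel: 15 pixels on each side of the center,
# clipped so the square stays inside the image near the borders.
half = 15
r0, r1 = max(row - half, 0), min(row + half + 1, gray.shape[0])
c0, c1 = max(col - half, 0), min(col + half + 1, gray.shape[1])
masked = gray.copy()
masked[r0:r1, c0:c1] = 255.0  # assumed white value

# Per-channel mean: average over rows and columns, leaving one value per
# channel, then subtract via broadcasting over the last axis.
channel_means = image.mean(axis=(0, 1))
centered = image - channel_means
```

Note that the mean-subtracted array contains negative values, so displaying or saving it will likely need rescaling or clipping, depending on the library used.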

Text:

Read in waugh.txt as a string. Measure and write down the number of characters in it. (read_and_count_text)
Convert all text to lowercase, remove punctuation, and extract all words. Place them in an array. (preprocess_text)
Compute the frequency of each word in the text. Report the frequencies of the top-5 most frequent words in your answers file, using the format word1: count1, word2: count2, ... (measure_word_frequency)
Use the shuffle function from Python's standard random module (random.shuffle) to shuffle the letters in each word, then put the words back together in a single string, and save the string to a new text file waugh_shuffled.txt. (shuffle_texts_in_file)
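
The text steps above (lowercasing, stripping punctuation, counting words, shuffling letters) can be sketched as follows. The sample string and the helper names preprocess and shuffle_word are hypothetical, waugh.txt is not read here, and splitting on whitespace is an assumed definition of a word. Punctuation is removed with string.punctuation, which covers ASCII punctuation only.

```python
import random
import string
from collections import Counter

# Assumption: a short sample string stands in for the contents of waugh.txt.
text = "The cat saw the dog. The dog saw the cat, too!"


def preprocess(raw):
    """Lowercase, drop ASCII punctuation, and split on whitespace."""
    lowered = raw.lower()
    stripped = lowered.translate(str.maketrans("", "", string.punctuation))
    return stripped.split()


def shuffle_word(word, rng):
    """Shuffle the letters of one word; shuffle works in place on a list."""
    letters = list(word)
    rng.shuffle(letters)
    return "".join(letters)


words = preprocess(text)

# Counter.most_common returns (word, count) pairs, highest count first.
top = Counter(words).most_common(2)
print(", ".join(f"{word}: {count}" for word, count in top))

# A seeded generator makes the shuffle reproducible for this sketch.
rng = random.Random(0)
shuffled = " ".join(shuffle_word(w, rng) for w in words)
print(shuffled)
```

Because shuffle modifies a list in place and returns None, each word is converted to a list of letters first and joined back into a string afterward.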

Submission: Please include the following files in your submission zip:

A completed Python file hw1.py
An answers file (where answers are requested above) answers.txt
Image files masked_image.png, mean_subtracted.png and exponential.png
Text file waugh_shuffled.txt