Homework 2 (CS 1671)

Assigned: October 2, 2018

Due: October 16, 2018 (midnight)

2.1 HMM Decoding (Viterbi) (20 points)

A partial Viterbi calculation is pictured here. This calculation takes us up through t=2 where v2(1) and v2(2) are computed. In the picture, the index 1 is used for the state labeled C and the index 2 is used for the state labeled H. Compute v3(1) and v3(2). You will need the transition and observation probabilities given here.

Think of this as filling in a table whose columns are time steps and whose rows are the states of the HMM. With the values from the diagram above filled in, a column added for t = 0, and every probability cell shown, the table looks like this:

end      0      0      0
H        0      .32    .0448
C        0      .02    .048
start    1.0    0      0
t =      0      1      2        3

Each cell in the Viterbi table holds one of the Viterbi values computed in the diagram. Like the diagram, the table is complete through t=2. Each value is a Viterbi probability: for example, v2(2) is the probability of the most probable path that ends in state 2 at time 2.
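Filling in one more column of the table can be sketched in code. The sketch below assumes the standard Viterbi recurrence, v_t(j) = max over i of v_{t-1}(i) * a(i, j) * b(j, o_t), where a holds transition probabilities and b holds observation probabilities. The function name viterbi_step and all of the numbers are made up for illustration; they are not the probabilities from this homework's diagram.

```python
def viterbi_step(prev, trans, emit, obs):
    """Compute one column of a Viterbi table.

    prev  : {state: v_{t-1}(state)}
    trans : {from_state: {to_state: probability}}
    emit  : {state: {observation: probability}}
    obs   : the observation at time t
    Returns ({state: v_t(state)}, {state: best previous state}).
    """
    values, backptr = {}, {}
    for j in emit:
        # Pick the predecessor that maximizes v_{t-1}(i) * a(i, j).
        best_i = max(prev, key=lambda i: prev[i] * trans[i][j])
        values[j] = prev[best_i] * trans[best_i][j] * emit[j][obs]
        backptr[j] = best_i
    return values, backptr


# Toy numbers only (hypothetical, NOT the homework's probabilities).
toy_prev = {"C": 0.2, "H": 0.1}
toy_trans = {"C": {"C": 0.5, "H": 0.5}, "H": {"C": 0.25, "H": 0.75}}
toy_emit = {"C": {"x": 0.5, "y": 0.5}, "H": {"x": 0.25, "y": 0.75}}

toy_values, toy_backptr = viterbi_step(toy_prev, toy_trans, toy_emit, "y")
print(toy_values, toy_backptr)
```

With these toy numbers, both states are best reached from C, because 0.2 * 0.5 is larger than either product that starts from H. The back-pointers are what lets you recover the best path once the table is complete.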

2.2 CKY Parsing (60 points)

Implement a non-probabilistic CKY parser.
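As a starting point, the core of CKY can be sketched as a recognizer, which only answers whether a sentence is derivable and does not build trees. The sketch assumes the grammar is already in Chomsky normal form (binary rules plus word-level rules). The function name, the dictionary layout, and the tiny hand-simplified grammar are all hypothetical; a full parser would additionally store back-pointers in each cell and handle the unit and longer rules in the assignment's grammar.

```python
from collections import defaultdict


def cky_recognize(words, binary_rules, lexical_rules, start="S"):
    """Return True if `words` can be derived from `start`.

    binary_rules  : {(B, C): {A, ...}} for rules A -> B C
    lexical_rules : {word: {A, ...}} for rules A -> word
    """
    n = len(words)
    table = defaultdict(set)  # (i, j) -> labels that span words[i:j]
    for i, word in enumerate(words):
        table[i, i + 1] = set(lexical_rules.get(word, ()))
    for width in range(2, n + 1):          # span length
        for i in range(n - width + 1):     # span start
            j = i + width                  # span end (exclusive)
            for k in range(i + 1, j):      # split point
                for b in table[i, k]:
                    for c in table[k, j]:
                        table[i, j] |= binary_rules.get((b, c), set())
    return start in table[0, n]


# Hypothetical toy grammar in CNF (not the assignment's grammar).
toy_binary = {("NP", "VP"): {"S"}, ("Verb", "NP"): {"VP"}}
toy_lexical = {"i": {"NP"}, "book": {"Verb"}, "flights": {"NP"}}

print(cky_recognize(["i", "book", "flights"], toy_binary, toy_lexical))  # True
print(cky_recognize(["book", "i"], toy_binary, toy_lexical))             # False
```

The three nested loops over span width, start position, and split point are what give CKY its cubic running time in the sentence length.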

0.80 S -> NP VP
0.15 S -> Aux NP VP
0.05 S -> VP
0.35 NP -> Pronoun
0.30 NP -> Proper-Noun
0.20 NP -> Det Nominal
0.15 NP -> Nominal
0.75 Nominal -> Noun
0.20 Nominal -> Nominal Noun
0.05 Nominal -> Nominal PP
0.35 VP -> Verb
0.20 VP -> Verb NP
0.10 VP -> Verb NP PP
0.15 VP -> Verb PP
0.05 VP -> Verb NP NP
0.15 VP -> VP PP
1.0 PP -> Preposition NP
Det -> that [0.10] | a [0.30] | the [0.60]
Noun -> book [0.10] | flight [0.30] | meal [0.15] | money [0.05] | flights [0.40] | dinner [0.10]
Verb -> book [0.30] | includes [0.30] | prefer [0.40]
Pronoun -> i [0.40] | she [0.05] | me [0.15] | you [0.40]
Proper-Noun -> houston [0.60] | twa [0.40]
Aux -> does [0.60] | can [0.40]
Preposition -> from [0.30] | to [0.30] | on [0.20] | near [0.15] | through [0.05]
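The listing above uses two line shapes: phrase rules with a leading probability (0.80 S -> NP VP) and word rules with bracketed probabilities separated by | (Aux -> does [0.60] | can [0.40]). A minimal sketch for reading both is shown below. It assumes the grammar file uses exactly this layout, which the assignment does not spell out, and the function name and the (lhs, rhs, probability) tuple format are illustrative choices.

```python
import re


def parse_grammar_line(line):
    """Return the list of (lhs, rhs_tuple, probability) rules on one line."""
    left, right = (part.strip() for part in line.split("->", 1))
    head = left.split()
    if len(head) == 2:
        # Phrase rule, e.g. "0.80 S -> NP VP"
        prob, lhs = head
        return [(lhs, tuple(right.split()), float(prob))]
    # Word rule, e.g. "Aux -> does [0.60] | can [0.40]"
    lhs = head[0]
    rules = []
    for alternative in right.split("|"):
        match = re.fullmatch(r"\s*(\S+)\s*\[([0-9.]+)\]\s*", alternative)
        if match is None:
            raise ValueError("unrecognized alternative: " + repr(alternative))
        rules.append((lhs, (match.group(1),), float(match.group(2))))
    return rules


print(parse_grammar_line("0.80 S -> NP VP"))
print(parse_grammar_line("Aux -> does [0.60] | can [0.40]"))
```

A non-probabilistic parser can simply ignore the third element of each tuple; keeping it around means the same loader can be reused where probabilities matter.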

Input/Output Requirements

Your script (for Python users) or executable jar (for Java users) must take two parameters, the grammar file and the sentence to parse, in that order:

If you use Python, your code will be tested as:

python cky.py cfg.txt "A test sentence"

If you use Java, your code will be tested as:

java -cp yourname.jar cs1671.hw2.CKY cfg.txt "A test sentence"

The output should be printed to the standard output stream. Print all of the parse trees for the sentence in the following bracket-based format:

[S [NP [Pronoun I]] [VP [Verb book] [NP [Det a] [Nominal [Noun flight]]] [PP [Preposition to] [NP [Proper-Noun houston]]]]]

(Copy and paste this string into mshang.ca/syntree to visualize it. You will find this tool very useful throughout this homework.)
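Producing the bracket format is a short recursive walk over the tree. The sketch below assumes a tree is stored as a nested tuple (label, child, child, ...) whose leaves are plain strings; that representation and the name to_brackets are illustrative choices, not something the assignment requires.

```python
def to_brackets(tree):
    """Render a nested-tuple tree in the [Label child ...] bracket format."""
    if isinstance(tree, str):  # a leaf is just the word itself
        return tree
    label, *children = tree
    parts = [label] + [to_brackets(child) for child in children]
    return "[" + " ".join(parts) + "]"


example_tree = (
    "S",
    ("NP", ("Pronoun", "I")),
    ("VP",
     ("Verb", "book"),
     ("NP", ("Det", "a"), ("Nominal", ("Noun", "flight"))),
     ("PP", ("Preposition", "to"), ("NP", ("Proper-Noun", "houston")))),
)
print(to_brackets(example_tree))
```

For example_tree this prints the same string as the sample output above, so it can be pasted into the visualizer directly.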

What to Include in Submission?

For Python users, include:

For Java users, include:

2.3 Probabilistic Parsing (20 points)

The probabilistic grammar provided has rules such as VP -> Verb NP PP, which have more than two non-terminals on the right-hand side.
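One common way to handle such rules in a CKY-style parser is to binarize them: introduce a fresh non-terminal for the first two symbols, give that new rule probability 1.0, and leave the original probability on the remaining rule, so that the product along any derivation is unchanged. The sketch below assumes this approach. The X_ naming scheme and the (lhs, rhs, probability) tuple format are hypothetical, and the intermediate symbols would need to be removed again before printing trees.

```python
def binarize(lhs, rhs, prob):
    """Split a rule with more than two right-hand-side symbols into binary rules.

    Returns a list of (lhs, rhs_tuple, probability) rules whose probabilities
    multiply to `prob` along the derivation that replaces the original rule.
    """
    rules = []
    rhs = list(rhs)
    while len(rhs) > 2:
        new_symbol = "X_" + "_".join(rhs[:2])  # hypothetical naming scheme
        rules.append((new_symbol, tuple(rhs[:2]), 1.0))
        rhs = [new_symbol] + rhs[2:]
    rules.append((lhs, tuple(rhs), prob))
    return rules


for rule in binarize("VP", ("Verb", "NP", "PP"), 0.10):
    print(rule)
```

Here VP -> Verb NP PP becomes X_Verb_NP -> Verb NP with probability 1.0 and VP -> X_Verb_NP PP with probability 0.10. Rules that already have one or two symbols on the right pass through unchanged.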