HUMAN LANGUAGE TECHNOLOGIES (CS 1671), Fall 2018
Professor | Dr. Diane Litman (5105 SENSQ) |
TA | Yanbing Xue (5324 SENSQ) |
When & Where | Tuesdays and Thursdays 1:00-2:15, SENSQ 5313 |
Office Hours | Litman: Tu 4:30-5:30, Th 2:15-3:30, by appointment; Xue: W, F 2-3:30 |
Description | This course provides an introduction to the field of natural
language processing - the creation of computer programs
that can understand, generate, and learn languages used by
humans. It will expose students to applications such
as question answering and dialogue agents by means of
computational techniques including search algorithms, dynamic
programming, hidden Markov models, probabilistic context-free
grammars, and machine learning algorithms.
Prerequisites: CS 1501 and CS 1502, or consent of the instructor
Textbook: Speech and Language Processing (3rd edition online draft - free!) |
Required Work (tentative) | Homework (40%): written and programming
Exams (40%): midterm and final
Group Course Project (20%): presentation and written report
Late Penalty: For assignments that may be accepted late, the penalty is 10% per day for up to 5 days, including Saturdays, Sundays, and holidays. Assignments are due by 11:59pm. |
Date/Topic | Textbook Readings | Assignments and Other Readings |
August 28 Introduction (pdf) |
Ch 1 | Due 8/29: Fill out the Background Knowledge Survey in CourseWeb
xkcd Humor (credits to E. Riloff for this and links below) |
August 30, September 4 Regular Expressions, Text Normalization, Edit Distance (pdf) |
Ch 2 | Unix for Poets, pages 1-19
regular-expressions.info Humor: plurals, sentence tokenization |
September 4, 6, 11 Language Modeling with N-Grams (pdf) |
Ch 3 (3.1-3.4) | Humor
Authorship Attribution (for the op-ed): NYtimes, a linguist, textbook
HW1: assigned 9/11, due 9/27 |
September 13, 18 Part-of-Speech Tagging (pdf) |
Ch 8 (8.1-8.4.6) | Humor
Schoolhouse Rock for Conjunctions |
September 20 Formal Grammars of English (pdf) |
Ch 10 (10.1-10.5) | Humor |
September 25, 27 Syntactic Parsing (pdf) |
Ch 11 (11.1-11.3.0) | Humor |
October 2, 4 Statistical Parsing (pdf) |
Ch 12 (12.1-12.6.0, 12.8) |
HW2: assigned 10/2, due 10/16 |
October 4, 9, 11 Naive Bayes and Sentiment Classification (pdf) |
Ch 4 (through 4.8) | Humor
Bag of Words and the Beatles |
October 18 Logistic Regression (pdf) |
Ch 5 (5.1-5.2) | |
October 23 Note: No class October 16 (fall break) |
Midterm Exam (closed book) | Through 10/4 class (Chapter 12) NO MAKEUPS |
October 25, 30 Vector Semantics (pdf1, pdf2) |
Ch 6 (skip 6.7) | Background for project (pdf) Project: assigned 10/30 |
November 1, 6, 8 ML Tutorial by Yanbing (11/1); Midterm Review (11/6); Information Extraction (pdf) |
Ch 17 (17.1-17.2) | |
November 13, 15 Entity Linking; Semantic Role Labeling (pdf) |
Ch 18 (and part of missing Ch 20) | 11/15: Project preliminary evaluation deadline |
November 20, 27 Question Answering (pdf) |
Ch 23 | Watson documentary |
November 27, 29 Dialog Systems and Chatbots (pdf) |
Ch 24 | HW3: assigned 11/27, due 12/6
11/29: Project final evaluation deadline
Project Paper Instructions: Your paper should both describe your system (the architecture, components, etc.) and discuss how well the version turned in for the final evaluation performed (using the provided programs to compute performance). Papers should be NO LONGER THAN 4 pages (excluding references) and should use these LaTeX or Word templates.
Alexa Prize (VentureBeat, 11/26/18)
Alexa/Google/Siri (Washington Post, 11/21/18)
Why Amazon thought that the Mets David Wright was 234 years old (Washington Post, 4/18/17)
Amazon Alexa Silver (Saturday Night Live)
|
December 4 Ethics, Social Good (pdf) |
The Social Impact of Natural Language Processing
Man is to computer programmer as woman is to homemaker? Debiasing word embeddings |
Project results
12/4: Project paper deadline |
December 6 | Project Presentations |
12/6: HW3 deadline |
December 13th (Thursday), 10:00am - 11:50am (Pitt Exam Schedule) |
Final Exam Note room assignment!!! |
All material since midterm (not cumulative) NO MAKEUPS |
Acknowledgements: Some of the materials used in this course are borrowed from Kai-Wei Chang, Jason Eisner, Rebecca Hwa, Dan Jurafsky, Chris Manning, Kathleen McKeown, Ellen Riloff, Noah Smith, and others.