Overview
Course description: In this class, students will learn the basics of modern computer vision. The first major part of the course will cover fundamental concepts such as image filtering, edge detection, feature extraction, texture description, and grouping and fitting. The second part will focus on visual recognition. We will study state-of-the-art approaches to object recognition and detection, examine the interplay between vision and language, learn how to model motion and actions, and briefly survey generative techniques. We will cover recently popular techniques such as convolutional and recurrent neural networks. We will also discuss a few topics from recent computer vision conferences. The course format includes lectures, in-class activities, and weekly homework assignments.
Prerequisites: CS1501
Piazza: Sign up for it here. Note that we will use Piazza primarily for classmate-to-classmate discussion of homework problems, etc. The instructor and TA will monitor it infrequently. Questions for the instructor or TA should be asked during office hours.
Programming languages: We will use Matlab, which is available for download for free using Software Downloads in My Pitt.
Textbooks: We will have required readings from the following textbooks:
- Computer Vision: Algorithms and Applications by Richard Szeliski (available for free on the author's page)
- Visual Object Recognition by Kristen Grauman and Bastian Leibe (accessible for free from campus)
- Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville
- Pattern Recognition and Machine Learning by Christopher Bishop
- Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
- Computer Vision: Models, Learning, and Inference by Simon Prince
Policies
Grading
Grading will be based on the following components:
- Homework: Programming assignments (9 assignments x 8% each = 72%)
- Homework: Essays and paper reviews (2 assignments x 8% each = 16%)
- Participation (8%)
- Quizzes (16 quizzes x 0.25% each = 4%)
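As a quick check of the breakdown above, the four components sum to 100%. The following is a minimal Python sketch of how a final percentage could be computed from per-component averages. It is not course-provided code, and Python is used only for illustration (the course itself uses Matlab). The names `WEIGHTS` and `final_grade`, the example scores, and the assumption that each component is expressed as a fraction between 0 and 1 are all hypothetical; how the 1-5 participation rubric maps to a percentage is not specified in this syllabus.

```python
# Weights (percent of the final grade) taken from the grading breakdown above.
WEIGHTS = {
    "programming": 9 * 8.0,    # 9 programming assignments x 8% each = 72%
    "essays": 2 * 8.0,         # 2 essays/paper reviews x 8% each = 16%
    "participation": 8.0,      # 8%
    "quizzes": 16 * 0.25,      # 16 quizzes x 0.25% each = 4%
}


def final_grade(scores):
    """Return the final percentage given average fractions earned per component.

    `scores` maps each component name in WEIGHTS to a value in [0, 1].
    This is an illustrative helper, not the instructor's grading procedure.
    """
    for name in WEIGHTS:
        if not 0.0 <= scores[name] <= 1.0:
            raise ValueError(f"score for {name!r} must be in [0, 1]")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical student: averages of 90%, 85%, 80%, and 75% on the four components.
example = {
    "programming": 0.90,
    "essays": 0.85,
    "participation": 0.80,
    "quizzes": 0.75,
}

if __name__ == "__main__":
    print(f"Total weight: {sum(WEIGHTS.values()):.2f}%")
    print(f"Final grade:  {final_grade(example):.2f}%")
```

Because programming assignments carry 72% of the weight, a change in the programming average moves the final percentage far more than the same change in the quiz average.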
Homework Submission Mechanics
Homework is due at 11:59pm on the due date. You will submit your homework using Canvas. Navigate to the Canvas page for CS1674 and open the corresponding homework ID. You should submit a single zip file with .m files (and images/results if requested) for programming assignments, and .pdf/.docx files for essays and reviews. Name the file YourFirstName_YourLastName.zip. Please comment your code! Grades will be posted on Canvas as well.
Participation
Students are expected to regularly attend the class lectures, and should actively engage in in-class discussions. Attendance will not be taken, but keep in mind that if you don't attend, you cannot participate. You can actively participate by, for example, responding to the instructor's or others' questions, asking questions or making meaningful remarks and comments about the lecture, answering others' questions on Piazza, or bringing in relevant articles you saw in the news. The grading rubric will be as follows: 1 = you attended infrequently, 2 = you attended frequently but did not speak in class, 3 = you attended frequently and spoke a few times, 4 = you participated frequently, 5 = you participated every other week or more. By 11/17, you should submit a text response on Canvas explaining how you participated during the semester, to complement my observations.
Quizzes
We will conclude almost every class with a 2-question quiz about what was discussed. Each quiz will count for only 0.25% of the final grade.
Homework Late Policy
You get 3 "free" late days, counted in minutes; that is, you can submit a total of 72 hours late. For example, you can submit one homework 12 hours late, and another 60 hours late. The 72-hour "budget" is the total for all assignments, NOT per assignment. Once you've used up your free late days, you will incur a penalty of 25% of the total possible assignment credit for each late day. A late day is anything from 1 minute to 24 hours.
Collaboration Policy and Academic Honesty
You will do your work (exams and homework) individually. The work you turn in must be your own work. You are allowed to discuss the assignments with your classmates, but do not look at code they might have written for the assignments, or at their written answers. You are also not allowed to search for code on the internet, to use solutions posted online unless you are explicitly allowed to look at them, or to use Matlab's implementation if you are asked to write your own code. When in doubt about what you can or cannot use, ask the instructor! A first offense will cause you to get 0% credit on the assignment. A report will be filed with the school. A second offense will cause you to fail the class and receive a disciplinary penalty. Please consult SCI's Academic Integrity Code and Pitt's Academic Integrity Guidelines.
Note on Disabilities
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course.
Note on Medical Conditions
If you have a medical condition which will prevent you from doing a certain assignment, you must inform the instructor of this before the deadline. You must then submit documentation of your condition within a week of the assignment deadline.
Statement on Classroom Recording
To ensure the free and open discussion of ideas, students may not record classroom lectures, discussion and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student's own private use.
Getting Help
If, in rare cases, you need to talk to the instructor privately rather than in office hours, please send an email and we can arrange a separate Zoom session. Please do not use this for general assignment or lecture questions.
Schedule
Date | Chapter | Topic | Readings | Lecture slides | Due |
8/20 | Intro | Introduction (incl. review) | Szeliski Sec. 1.1-1.2 | pptx pdf tutorial.m myfunction.m myotherfunction.m pittsburgh.png | |
8/25 | | | | | |
8/27 | | | | | |
9/1 | Low-level vision | Filtering and texture | Szeliski Sec. 3.2, 4.1.1, 10.5 | pptx pdf filtering.m butterfly.jpg | |
9/3 | | | | | HW1 due |
9/8 | | | | | |
9/10 | | Feature detection and description | Szeliski Sec. 4.1; Grauman/Leibe Ch. 1-3, Sec. 4.2.1; SIFT paper by David Lowe | pptx pdf autocorr_surface.m blobs.m butterfly3.png | HW2 due |
9/15 | | | | | |
9/17 | | | | | HW3 due |
9/22 | | Edges, lines, circles and segments | Szeliski Sec. 4.2, 4.3.2, 5.3-4; Grauman/Leibe Ch. 5.2 | pptx pdf | |
9/24 | | | | | HW4 due |
9/29 | | Multiple views | Szeliski Sec. 2.1, 3.6.1, 7.2, 11.1.1; Grauman/Leibe Sec. 5.1 | pptx pdf | |
10/1 | | | | | HW5 due |
10/6 | High-level vision | Intro to recognition | Grauman/Leibe Ch. 7, Sec. 8.1, 8.2, 9, 10.2.1.1, 10.3.3, 11.1,2,5; Szeliski Sec. 14.1, 14.4; Bishop PRML Sec. 1.1, 7.1 | pptx pdf | |
10/8 | | | | | HW6 due |
10/13 | | | | | |
10/15 | | | | | Essay 1 due |
10/20 | | Convolutional neural networks | Karpathy's notes, Module 1, Module 2 | pptx pdf | |
10/22 | | | | | HW7 due |
10/27 | | | | | |
10/29 | | Object recognition, detection, segmentation | Szeliski Sec. 14.1, 14.4; Grauman/Leibe Sec. 8, 9, 10.2.1.1, 10.3.3, 11.1,2,5; Girshick CVPR 2014, Ren NIPS 2015, Redmon CVPR 2016, Zhou CVPR 2016, Ye ICCV 2019 | pptx pdf | HW8 due |
11/3 | | | | | |
11/5 | | | | | HW9 due |
11/10 | | Sequences: Vision and language, video and motion | blog1, blog2, Karpathy CVPR 2015, Antol ICCV 2015, Goyal CVPR 2017, Hussain CVPR 2017, Gurari CVPR 2018, Nagrani CVPR 2020 | pptx pdf | |
11/12 | | | | | |
11/17 | | Unsupervised learning and generation methods | Doersch ICCV 2015, Jayaraman ICCV 2015, Pinto ECCV 2016, Goodfellow NIPS 2014, Isola CVPR 2017, Zhu ICCV 2017 | pptx pdf | Participation notes due |
11/19 | | | | | Essay 2 due |
Resources
This course was inspired by the following courses:
- Computer Vision by Kristen Grauman, UT Austin, Spring 2011
- Computer Vision by Derek Hoiem, UIUC, Spring 2015
- Convolutional Neural Networks for Visual Recognition by Fei-Fei Li, Andrej Karpathy, Justin Johnson, and Serena Yeung, Stanford University, Spring 2017
- Matlab tutorial
- Linear algebra review by Fei-Fei Li
- Brief machine learning intro by Aditya Khosla and Joseph Lim
- Resources list (including code and data, tutorials, and other related courses) compiled by Devi Parikh
- Microsoft COCO (Common Objects in Context) (object recognition, segmentation, image description)
- ImageNet (object recognition)
- SUN Database (scenes)
- Caltech-UCSD Birds 200 (fine-grained object recognition)
- MSRC Annotations (active learning)
- Animals with Attributes (attribute-based recognition)
- a-Pascal + a-Yahoo (attribute-based recognition)
- Shoes (attribute-based search)
- INRIA Movie Actions (action recognition)
- ADL (ego-centric action recognition)
- Action Quality (evaluating action quality)
- CarDb Historical Cars (style classification of cars)
- Recognizing Image Style (photographic style classification)
- Judd gaze (visual saliency prediction)
- Visual Persuasion (predicting subtle messages in images)
- Advertisements: Images and Videos (understanding what the ad prompts the viewer to do and why)
- VQA (visual question-answering)
- Recognition datasets list compiled by Kristen Grauman
- Human activity datasets list compiled by Chao-Yeh Chen
- LIBSVM (by Chih-Chung Chang and Chih-Jen Lin)
- SVM Light (by Thorsten Joachims)
- VLFeat (feature extraction, tutorials and more, by Andrea Vedaldi)
- TensorFlow (deep learning framework by Google)
- Caffe (deep learning framework by Yangqing Jia et al.)
- PyTorch (another popular deep learning framework)
- Keras (deep learning library)