Overview
Course description: In this class, students will learn the basics of modern computer vision. The first major part of the course will cover fundamental concepts such as image filtering, edge detection, feature extraction, texture description, and grouping and fitting. The second part will focus on visual recognition. We will study state-of-the-art approaches to object recognition and detection, examine the interplay between vision and language, learn how to model motion and actions, and briefly survey generative techniques. We will cover recently popular techniques such as convolutional and recurrent neural networks, and transformers. We will also discuss a few topics from recent computer vision conferences. The course format includes lectures, in-class activities, exams, and weekly homework assignments.
Prerequisites: CS1501
Programming languages: We will use Matlab, which is available as a free download through Software Downloads in My Pitt. Please use the latest version.
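For students new to Matlab, the sketch below shows the general shape of the scripts you will write for homework. This is only an illustration: the image name is a placeholder, and the smoothing step assumes the Image Processing Toolbox is installed.

  % Read an image, convert to grayscale, smooth it, and display the result.
  % 'pittsburgh.png' is a placeholder -- use any image on your Matlab path.
  img = imread('pittsburgh.png');             % load image as a uint8 array
  gray = rgb2gray(img);                       % RGB -> single-channel grayscale
  h = fspecial('gaussian', [7 7], 2);         % 7x7 Gaussian kernel, sigma = 2
  smoothed = imfilter(gray, h, 'replicate');  % filter, replicating border pixels
  figure; imshowpair(gray, smoothed, 'montage');
  title('Original (left) vs. smoothed (right)');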
Textbooks: We will have required readings from the following textbooks:
- Computer Vision: Algorithms and Applications, Second Edition, by Richard Szeliski (available for free on the author's page)
- Visual Object Recognition by Kristen Grauman and Bastian Leibe (accessible for free from campus)
- Deep Learning by Ian Goodfellow, Yoshua Bengio, Aaron Courville
- Pattern Recognition and Machine Learning by Christopher Bishop
- Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
- Computer Vision: Models, Learning, and Inference by Simon Prince
Policies
Grading
Grading for CS 1674 will be based on the following components:
- Homework (9 assignments x 6% each = 54%)
- First exam (20%)
- Second exam (20%)
- Participation (6%)

Grading for CS 2074 will be based on the following components:
- Homework (9 assignments x 5% each = 45%)
- First exam (15%)
- Second exam (15%)
- Participation (5%)
- Project (20%)
  - Proposal (3%)
  - Status report (5%)
  - Presentation (5%)
  - Final report (7%)
Homework Submission Mechanics
Homework is due at 11:59pm on the due date. You will submit your homework using Canvas. Navigate to the Canvas page for CS 1674/2074 and find the corresponding homework assignment. You should submit a single zip file with your .m files (and images/results if requested). Name the file YourFirstName_YourLastName.zip. Please comment your code! Grades will be posted on Canvas as well.
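If you prefer to build the archive from within Matlab, the built-in zip function will do it; the file names below are placeholders for whatever a given assignment asks you to submit:

  % Package a homework submission as a single zip file (file names are placeholders).
  files = {'hw1_filtering.m', 'hw1_edges.m', 'results.png'};
  zip('Jane_Doe.zip', files);   % creates Jane_Doe.zip in the current folder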
Exams
There will be two in-class exams. The second exam is not cumulative and will not cover material from the first exam. There will be no make-up exams unless you or a very close friend/relative is seriously ill!
Project (CS 2074 only)
Students will complete a project that studies one of the topics we cover in class in more depth. Students should work in groups of two or three. These projects should focus on one of the following:
- a novel approach which addresses one of the problems covered in class, properly evaluated
- a definition of a new problem, along with detailed argumentation of why this problem is important and challenging, an approach to solve this problem, and an evaluation of this approach
- an extensive analysis and experimental evaluation of one or more of the approaches covered in class
The mid-semester project status report (due Nov. 3) will describe the students' progress on the project and any problems encountered along the way. The status report should use a known conference format (e.g. CVPR/ICCV/ECCV), but can be more informal than a conference paper. It should include the following sections: Introduction, Related Work, Approach, and Results. In Results, include your experimental setup (this can change later). If you have results but they do not yet look great, include them anyway, and comment on any challenges encountered.
The project presentation, scheduled for the last day of class (Dec. 6), will describe the students' approach and their experimental findings in a clear and engaging fashion. This will be a chance to get feedback from the class before final submission of your report. Presentations will be about 10-15 minutes long. Please submit a copy of your slides to Canvas on the same day as your presentation.
The project final report (due Dec. 8) should be formatted and should read like a conference paper, with a clear problem definition and argumentation of why the problem is important, an overview of related work, a detailed explanation of the approach, and a well-motivated experimental evaluation. Each student should document which part of the project they did and how duties and tasks were divided.
Participation
Students are expected to attend the class lectures regularly and to engage actively in in-class discussions. Attendance will not be taken, but keep in mind that if you do not attend, you cannot participate. You can actively participate by, for example, responding to the instructor's or other students' questions, asking questions or making meaningful remarks and comments about the lecture, or bringing in relevant articles you saw in the news. The grading rubric will be as follows: 1 = you attended infrequently, 2 = you attended frequently but did not speak in class, 3 = you attended frequently and spoke a few times, 4 = you participated frequently, 5 = you participated every other week or more. Near the end of the course, you should submit a text response on Canvas explaining how you participated during the semester, to complement the instructor's observations.
Homework Late Policy
You get 3 "free" late days, tracked to the minute: you can submit a total of 72 hours late across the semester. For example, you can submit one homework 12 hours late and another 60 hours late. The 72-hour "budget" is a total for all assignments, NOT per assignment. Once you have used up your free late days, you will incur a penalty of 25% of the total assignment credit possible for each late day. A late day is anything from 1 minute to 24 hours. The free late days only apply to homework assignments (HW1-HW9). Participation and project submissions cannot be submitted late.
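To make the arithmetic concrete, here is one hypothetical reading of the policy in Matlab (a sketch only, not an official grade calculator; the hours are made up):

  % Hypothetical late-penalty calculation following the policy above.
  free_budget_hours = 72;        % total free late hours for the semester
  hours_used_so_far = 70;        % spent on earlier homeworks (made up)
  hours_late_now = 30;           % current submission is 30 hours late (made up)
  excess_hours = max(0, hours_used_so_far + hours_late_now - free_budget_hours);  % 28
  late_days = ceil(excess_hours / 24);   % any part of a day counts as a late day -> 2
  penalty = 0.25 * late_days;            % 25% of possible credit per late day -> 50%
  fprintf('Late days charged: %d, penalty: %d%%\n', late_days, penalty * 100);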
Collaboration Policy and Academic Honesty
You will do your work (exams and homework) individually and without the help of artificial intelligence systems (e.g. ChatGPT). The work you turn in must be your own. You are allowed to discuss the assignments with your classmates, but do not look at code they might have written for the assignments, or at their written answers. You are also not allowed to search for or generate code on the internet, to use solutions posted online unless you are explicitly allowed to look at them, or to use Matlab's implementation if you are asked to write your own code. When in doubt about what you can or cannot use, ask the instructor! A first offense will result in 0% credit on the assignment, and a report will be filed with the school. A second offense will cause you to fail the class and will result in a disciplinary penalty. Please consult SCI's Academic Integrity Code and Pitt's Academic Integrity Guidelines.
Note on Disabilities
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course.
Note on Medical Conditions
If you have a medical condition which will prevent you from completing a certain assignment, you must inform the instructor before the deadline. You must then submit documentation of your condition within a week of the assignment deadline.
Statement on Classroom Recording
To ensure the free and open discussion of ideas, students may not record classroom lectures, discussions, and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance may be used solely for the student's own private use.
Schedule
Date | Chapter | Topic | Readings | Lecture slides | Due
8/28 | Intro | Introduction (incl. review) | Szeliski Sec. 1.1, 1.2; Grauman/Leibe Ch. 1 | pptx pdf tutorial.m myfunction.m myotherfunction.m pittsburgh.png |
8/30 | | | | |
9/6 | | | | |
9/11 | Low-level vision | Filtering and texture | Szeliski Sec. 3.2 | pptx pdf filtering.m butterfly.jpg |
9/13 | | | | | HW1
9/18 | | | | |
9/20 | | Feature detection and description | Szeliski Sec. 7.1, Grauman/Leibe Ch. 2, 3, Sec. 4.2; SIFT paper by David Lowe | pptx pdf autocorr_surface.m blobs.m butterfly3.png | HW2
9/25 | | | | |
9/27 | | | | | HW3
10/2 | | Grouping: Edges, lines, circles and segments | Szeliski Sec. 5.2.1, 5.2.2, 7.2, 7.5, 7.4.2; Grauman/Leibe Ch. 5.2 | pptx pdf |
10/4 | | | | | HW4; proposal (10/6)
10/9 | | Multiple views | Szeliski Sec. 2.1, 3.6.1, 11.3, 12.1.1; Grauman/Leibe Sec. 5.1 | pptx pdf |
10/11 | | | | | HW5
10/16 | | First exam | | |
10/18 | High-level vision | Intro to recognition | Szeliski Sec. 5.1, 6.1, 6.2; Grauman/Leibe Ch. 6, 7, Sec. 8.1, 10.1, 10.2, 11.3; Bishop PRML Sec. 1.1, Bishop PRML Sec. 7.1 | pptx pdf |
10/23 | | | | | HW6
10/25 | | | | |
10/30 | | Convolutional neural networks | Szeliski Sec. 5.3, 5.4, 5.5.3; Karpathy's notes, Module 1, Module 2 | pptx pdf |
11/1 | | | | | HW7; status (11/3)
11/6 | | | | |
11/8 | | Object recognition, detection, segmentation | Szeliski Sec. 6.3, 6.4; Grauman/Leibe Sec. 8.2, 9, 10.3, 11.1, 11.2, 11.5; Girshick CVPR 2014, Ren NIPS 2015, Redmon CVPR 2016, Radford ICML 2021 | pptx pdf |
11/13 | | | | |
11/15 | | Sequences: Vision and language, video and motion | Szeliski Sec. 5.5.1, 5.5.2, 6.5, 6.6; blog1, blog2, Karpathy CVPR 2015, Vaswani NeurIPS 2017 | pptx pdf | HW8
11/27 | | | | |
11/29 | | | | |
12/4 | | Unsupervised learning | Szeliski Sec. 5.5.4; Doersch ICCV 2015, Chen ICML 2020, Goodfellow NIPS 2014, Zhu ICCV 2017, Rombach CVPR 2022 | pptx pdf | HW9
12/6 | | Project presentations (speakers: CS2074 students) | | | presentation; report (12/8)
12/13 | | Second exam (12-1:15pm, 5129 Sennott Square) | | | participation notes
Resources
This course was inspired by the following courses:
- Computer Vision by Kristen Grauman, UT Austin, Spring 2011
- Computer Vision by Derek Hoiem, UIUC, Spring 2015
- Convolutional Neural Networks for Visual Recognition by Fei-Fei Li, Andrej Karpathy, Justin Johnson, and Serena Yeung, Stanford University, Spring 2017
- Matlab tutorial
- Linear algebra review by Fei-Fei Li
- Brief machine learning intro by Aditya Khosla and Joseph Lim
- Resources list (including code and data, tutorials, and other related courses) compiled by Devi Parikh
- Microsoft COCO (Common Objects in Context) (object recognition, segmentation, image description)
- ImageNet (object recognition)
- SUN Database (scenes)
- Caltech-UCSD Birds 200 (fine-grained object recognition)
- MSRC Annotations (active learning)
- Animals with Attributes (attribute-based recognition)
- a-Pascal + a-Yahoo (attribute-based recognition)
- Shoes (attribute-based search)
- INRIA Movie Actions (action recognition)
- ADL (ego-centric action recognition)
- Action Quality (evaluating action quality)
- CarDb Historical Cars (style classification of cars)
- Recognizing Image Style (photographic style classification)
- Judd gaze (visual saliency prediction)
- Visual Persuasion (predicting subtle messages in images)
- Advertisements: Images and Videos (understanding what the ad prompts the viewer to do and why)
- VQA (visual question-answering)
- Recognition datasets list compiled by Kristen Grauman
- Human activity datasets list compiled by Chao-Yeh Chen
- LIBSVM (by Chih-Chung Chang and Chih-Jen Lin)
- SVM Light (by Thorsten Joachims)
- VLFeat (feature extraction, tutorials and more, by Andrea Vedaldi)
- TensorFlow (deep learning framework by Google)
- Caffe (deep learning framework by Yangqing Jia et al.)
- PyTorch (another popular deep learning framework)
- Keras (deep learning library)