Overview
Course description: In this class, students will learn about modern computer vision. The first part of the course will cover fundamental concepts such as filtering, extracting features and describing images, grouping features, and matching features across multiple views. In the second part, we will cover techniques and approaches to classic tasks/topics such as object recognition, vision and language, and tracking, including tools such as support vector machines and convolutional neural networks. In the third part, we will survey recent and emerging topics such as self-supervised and embodied learning, visual reasoning, generative adversarial models, etc. The format will include lectures, homework assignments, exams, and a course project.
Prerequisites: CS1501 and MATH 0280 (or equivalent). The expectation is that you can program and analyze the performance of programs. Some experience with linear algebra (matrix and vector operations), basic calculus, and probability and statistics is strongly recommended.
Piazza: Sign up for it here. Note that we will use Piazza primarily for classmate-to-classmate discussion of homework problems, etc. The instructor will not monitor Piazza, and the TA will monitor it infrequently. The time when you should ask the instructor or TA questions is during office hours.
Programming languages: For homework assignments, you can use Matlab or Python. For the course project, you can use any language of your choice.
Textbooks:
- Readings from:
  - Computer Vision: Algorithms and Applications by Richard Szeliski (available for free on author's page)
  - Visual Object Recognition by Kristen Grauman and Bastian Leibe (accessible for free from campus)
- For reference:
  - Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
  - Computer Vision: Models, Learning, and Inference by Simon Prince (available for free on author's page)
  - Pattern Recognition and Machine Learning by Christopher Bishop
  - Deep Learning by Ian Goodfellow, Yoshua Bengio and Aaron Courville (available for free online)
Policies
Grading
Grading will be based on the following components:
- Homework assignments (3 assignments x 10% each = 30%)
- Project (15% first presentation + 15% second presentation = 30%)
- Exams (15% first exam + 15% second exam = 30%)
- Participation (10%)
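As an illustration of how the weights above combine, here is a minimal sketch of the arithmetic. It assumes each component has already been scored on a 0-100 scale; that scale, the `WEIGHTS` dictionary, and the `final_score` helper are hypothetical names chosen for this example and are not part of the official grading procedure.

```python
# Component weights as listed above (fractions of the final grade).
WEIGHTS = {
    "homework": 0.30,       # 3 assignments x 10% each
    "project": 0.30,        # 15% first presentation + 15% second presentation
    "exams": 0.30,          # 15% first exam + 15% second exam
    "participation": 0.10,
}


def final_score(scores):
    """Return the weighted sum of per-component scores.

    Assumption: every component score is on a 0-100 scale.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Hypothetical example scores, for illustration only.
example = {"homework": 90, "project": 85, "exams": 80, "participation": 100}
print(round(final_score(example), 1))  # 86.5
```

In this made-up example the contributions are 27 + 25.5 + 24 + 10 = 86.5 points out of 100.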
Homework Submission Mechanics
You will submit your homework using CourseWeb. Navigate to the CourseWeb page for CS2770, then click on "Assignments" (on the left) and the corresponding homework ID. Your source code should be a single zip file (also including images/results if requested). Name the file YourFirstName_YourLastName.[extension]. Please comment your code! Homework is due at 11:59pm on the due date. Grades will be posted on CourseWeb.
Exams
There will be two in-class exams. The second exam is not cumulative and will not cover material from the first exam. There will be no make-up exams unless you or a close relative is seriously ill!
Participation
Students are expected to regularly attend the class lectures, and should actively engage in in-class discussions. Attendance will not be taken, but keep in mind that if you don't attend, you cannot participate. You can actively participate by, for example, responding to the instructor's or others' questions, asking questions or making meaningful remarks and comments about the lecture, answering others' questions on Piazza, or bringing in relevant articles you saw in the news. The grading rubric will be as follows: 1 = you attended infrequently, 2 = you attended frequently but did not speak in class, 3 = you attended frequently and spoke a few times, 4 = in between 3 and 5, 5 = you attended and participated frequently.
New this year: For each lecture, I will rely on a "panel" of experts who have gone through the readings for that lecture in detail, and are able to ask questions and address questions from their classmates. This "panel" will help me moderate the discussion. Please sign up here for the lecture on which you want to serve as a panel member. Your help on a panel will count as part of your participation grade.
Late Policy
On your programming assignments only, you get 3 "free" late days counted in minutes, i.e., you can submit a total of 72 hours late. For example, you can submit one homework 12 hours late, and another 60 hours late. The 72-hour "budget" is the total for all programming assignments, NOT per assignment. Once you've used up your free late days, you will incur a penalty of 25% of the total possible assignment credit for each late day. A late day is anything from 1 minute to 24 hours. Note that this policy does not apply to components of the project.
Collaboration Policy and Academic Honesty
You will do your work (exams and homework) individually. The only exception is the project, which can be done in pairs. The work you turn in must be your own work. You are allowed to discuss the assignments with your classmates, but do not look at code they might have written for the assignments, or at their written answers. You are also not allowed to search for code on the internet, to use solutions posted online unless you are explicitly allowed to look at those, or to use Matlab's or Python's implementation if you are asked to write your own code. When in doubt about what you can or cannot use, ask the instructor! Plagiarism will cause you to fail the class and receive a disciplinary penalty. Please consult the University Guidelines on Academic Integrity.
Note on Disabilities
If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course.
Note on Medical Conditions
If you have a medical condition which will prevent you from doing a certain assignment, you must inform the instructor of this before the deadline. You must then submit documentation of your condition within a week of the assignment deadline.
Statement on Classroom Recording
To ensure the free and open discussion of ideas, students may not record classroom lectures, discussion and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student's own private use.
Project
Please enter your team information and topic here by March 1.
The project is expected to be a new method for an existing problem or an application of techniques we studied in class (or another method) to a new problem that we have not discussed in class. Below are some tips:
- The project should include some amount of novelty. For example, you cannot just re-implement an existing paper or project. You are allowed to use existing code for known methods, but your project is expected to be a significant amount of work and not just a straight-up run of some package.
- Do not rely on data collection to be the novel component of your work. If you are proposing to tackle a new problem, you might need to collect data, but while this is a contribution, it will not be enough to earn a good project grade. You still have to come up with a solid method idea, i.e. your project has to have sufficient technical novelty.
- You must show that your method is in some sense better (quantitatively) than at least some relatively recent existing methods. For example, you can show that your method achieves superior accuracy in some prediction task compared to prior methods, or that it achieves comparable accuracy but is faster. This outcome is not guaranteed to come out the way you intended during the limited timespan of a course project, so whether or not you outperform the state of the art will only be a small component of your grade. Further, if you propose a sufficiently interesting method, rather than an extremely simple method, it will be less of a problem if your method does not outperform other existing approaches to the problem.
- You are encouraged to use any external expertise you might have (e.g. biology, physics, etc.) so that your project makes the best use of areas you know well, and is as interesting as possible.
- Students are encouraged to work in groups of three for their final project.
- You are strongly encouraged to present a project proposal to the instructor during one office hours session. Please use visual aids (e.g. bring slides/figures/tables, and/or draw on the whiteboard during the conversation). The project proposal discussion should happen before the date indicated in the schedule below. This component is not graded.
- There will be two presentations per team. The first will focus on motivation, related work, and your proposed approach. The goal is to have carefully designed your method conceptually, and to have made notable progress in terms of implementing this proposed method. During the first presentation, you will receive feedback from your classmates. Do comment on challenges you are facing, if you think this will be useful. You should present a literature review during your first presentation, so your classmates know the space in which you are working. Good sources for learning what work has been done in your domain of interest are search engines, Google Scholar, and arxiv.org.
- The second presentation will focus on your experimental validation and findings. You only need to briefly refresh your classmates' memory about the motivation for your work and the proposed method.
- Your slides are due on CourseWeb on the indicated due date, at 11:59pm.
- You are allowed to change your slides for the second presentation between presenting and submitting them, but make sure to change the slide background for any new slides to yellow so the instructor can easily tell what content is new compared to the version that was presented.
- Your grades will be based on two rubrics, available below. For each presentation, your grade will be an average of (1) your classmates' scores, and (2) the instructor's scores. You will be able to see your score for each item in the rubric.
- If you are interested in receiving feedback from the instructor, please come to office hours after your grades are posted.
- There will be no written report, but you will practice your ability to describe your work in a clear and memorable fashion through your presentation.
- Combining your final project for this class and another class is generally permitted, but the project proposal and final report should clearly outline what part of the work was done to get credit in this class, and the instructor should approve the proposed breakdown of work between this and another class.
- The first presentation should take about 12 minutes, with 3 minutes for questions and discussion. The second presentation should take 10 minutes, with 1 minute for questions. Make sure to test your laptops before class as we'll have a very tight schedule!
- The order in which you present will be decided by a random draw. The order of teams for the first presentation will be inversely related to the order for the second presentation. For example, if you present on the third day for the first presentation, you'll present on the first day for the second.
Rubric for the first presentation:
- How well did the authors (presenters) explain what problem they are trying to solve?
- How well did they explain why this problem is important?
- How well did they explain why the problem is challenging?
- How thorough was the literature review?
- How clearly was prior work described?
- How well did the authors explain how their proposed work is different than prior work?
- How clearly did the authors describe their proposed approach?
- How novel is the proposed approach?
- How challenging and ambitious is the proposed approach? (1-10)
Rubric for the second presentation:
- To what extent did the authors develop the method as described in the first presentation? (1-10)
- How well did the authors describe their experimental validation?
- How informative were the figures used?
- Were all/most relevant baselines and competitor methods included in the experimental validation?
- Were sufficient experimental settings (e.g. datasets) tested?
- To what extent is the performance of the proposed method satisfactory?
- How informative were the conclusions the authors drew about their method’s performance relative to other methods?
- How sensible was the discussion of limitations?
- How interesting was the discussion of future work?
Ideas for finding a project topic:
- Look at the datasets and tasks below.
- Look at the topics in the programs of some of the recent computer vision conferences: CVPR 2018 (with papers downloadable here), CVPR 2017 (with papers here), ICCV 2017 (with papers here) and ECCV 2018 (with papers here).
- Check out project resources and topics in a related class here.
- Read some paper abstracts on this page.
Schedule
Date | Chapter | Topic | Readings | Lecture slides | Due |
1/8 | Basics | Introduction | Szeliski Sec. 1.1-1.2 | pptx pdf | |
1/10 | |||||
1/15 | Filters | Szeliski Sec. 3.2, 10.5, 4.1.1 | pptx pdf | ||
1/17 | |||||
1/22 | Features | Szeliski Sec. 4.1, Grauman/Leibe Sec. 3, 4.2.1; feature survey Sec. 1, 3.2, 7; Lowe IJCV 2004 | pptx pdf | ||
1/24 | |||||
1/29 | Grouping and transformations | Szeliski Sec. 2.1, 3.6.1, 4.2, 4.3.2, 5.3-4, 6.1.4, 7.2, 11.1.1; Grauman/Leibe Sec. 5.1, 5.2 | pptx pdf | ||
1/31 | |||||
2/5 | |||||
2/7 | Classification, support vector machines (SVMs), convolutional neural networks (CNNs) | Bishop PRML Sec. 1.1, Bishop PRML Sec. 7.1, Karpathy Module 1 and Module 2; Krizhevsky NIPS 2012, Zeiler ECCV 2014 | pptx pdf | ||
2/12 | HW1 | ||||
2/14 | |||||
2/19 | First exam | ||||
2/21 | Classics | Object detection | Szeliski Sec. 14.1, 14.4; Grauman/Leibe Sec. 8, 9, 10.2.1.1, 10.3.3, 11.1,2,5; Felzenszwalb PAMI 2010, Girshick CVPR 2014, Ren NIPS 2015, Redmon CVPR 2016, Zhou CVPR 2016, Harwath ECCV 2018 | pptx pdf | proposal meetings |
2/26 | |||||
2/28 | |||||
3/5 | Vision, language and reasoning | blog1, blog2, Karpathy CVPR 2015, Venugopalan CVPR 2017, Wu CVPR 2016, Narasimhan ECCV 2018 | pptx pdf | ||
3/7 | HW2 | ||||
3/19 | |||||
3/21 | Motion and tracking | Laptev CVPR 2008 | pptx pdf | ||
3/26 | First project presentations and discussion | ||||
3/28 | No class | ||||
4/2 | First project presentations and discussion (cont'd) | ||||
4/4 | slides | ||||
4/9 | Emergent etc. | Self-supervised and embodied learning | Doersch ICCV 2015, Jayaraman ICCV 2015, Pinto ECCV 2016, Mnih 2013, Caicedo ICCV 2015, Zhu ICRA 2017, Das CVPR 2018 | pptx pdf | |
4/11 | |||||
4/16 | Generative models | Goodfellow NIPS 2014, Radford NIPS 2016, Isola CVPR 2017, Zhu ICCV 2017, Ren ECCV 2018, Bansal ECCV 2018 | pptx pdf | HW3 | |
4/18 | Second exam | ||||
4/23 | Second project presentations | ||||
4/25 | slides |
Resources
This course was inspired by the following courses:
- Computer Vision by Kristen Grauman, UT Austin, Spring 2011
- Computer Vision by Derek Hoiem, UIUC, Spring 2015
- Convolutional Neural Networks for Visual Recognition by Fei-Fei Li, Andrej Karpathy, Justin Johnson, and Serena Yeung, Stanford University, Spring 2018
- Matlab tutorial
- Linear algebra review by Fei-Fei Li
- Brief machine learning intro by Aditya Khosla and Joseph Lim
- Resources list (including code and data, tutorials, and other related courses) compiled by Devi Parikh
- Microsoft COCO (Common Objects in Context) (object recognition, segmentation, image description)
- ImageNet (object recognition)
- SUN Database (scenes)
- Caltech-UCSD Birds 200 (fine-grained object recognition)
- MSRC Annotations (active learning)
- Animals with Attributes (attribute-based recognition)
- a-Pascal + a-Yahoo (attribute-based recognition)
- Shoes (attribute-based search)
- INRIA Movie Actions (action recognition)
- ADL (ego-centric action recognition)
- Action Quality (evaluating action quality)
- CarDb Historical Cars (style classification of cars)
- Recognizing Image Style (photographic style classification)
- Judd gaze (visual saliency prediction)
- Visual Persuasion (predicting subtle messages in images)
- Advertisements: Images and Videos (understanding what the ad prompts of the viewer and why)
- VQA (visual question-answering)
- Recognition datasets list compiled by Kristen Grauman
- Human activity datasets list compiled by Chao-Yeh Chen
- LIBSVM (by Chih-Chung Chang and Chih-Jen Lin)
- SVM Light (by Thorsten Joachims)
- VLFeat (feature extraction, tutorials and more, by Andrea Vedaldi)
- TensorFlow (deep learning framework by Google)
- Caffe (deep learning framework by Yangqing Jia et al.)
- PyTorch (another popular deep learning framework)
- Keras (deep learning library)