CS2770: Computer Vision

Overview

Course description: In this class, students will learn about modern computer vision. The first part of the course will cover fundamental concepts such as filtering, extracting features and describing images, grouping features, matching features across multiple views, and convolutional neural networks. In the second part, we will cover approaches to classic tasks/topics such as object recognition, vision and language, tracking, and human pose and activity recognition. In the third part, we will overview recent and emerging topics such as self-supervised learning, domain adaptation, and visual discovery. The format will include lectures, homework assignments, exams, and a course project.

Prerequisites: CS1501 and MATH 0280 (or equivalent). The expectation is that you can program and analyze the performance of programs. Some experience with linear algebra (matrix and vector operations), basic calculus, and probability and statistics is recommended.

Piazza: Sign up for it here. Note that we will use Piazza primarily for classmate-to-classmate discussion of homework problems, etc. The instructor will not monitor Piazza, and the TA will monitor it infrequently. The time when you should ask the instructor or TA questions is during office hours.

Programming languages: For homework assignments, you can use Matlab or Python. For the course project, you can use any language of your choice.

Textbooks:

Readings from:
- Computer Vision: Algorithms and Applications by Richard Szeliski (available for free on author's page)
- Visual Object Recognition by Kristen Grauman and Bastian Leibe (accessible for free from campus)
For reference:
- Computer Vision: A Modern Approach by David Forsyth and Jean Ponce
- Computer Vision: Models, Learning, and Inference by Simon Prince (available for free on author's page)
- Pattern Recognition and Machine Learning by Christopher Bishop

[top]

Policies

Grading

Grading will be based on the following components:

Homework assignments (3 assignments x 10% each = 30%)
Project (15% first presentation + 15% second presentation = 30%)
Exams (20% first exam + 15% second exam = 35%)
Participation (5%)

Homework Submission Mechanics

You will submit your homework using CourseWeb. Navigate to the CourseWeb page for CS2770, then click on "Assignments" (on the left) and the corresponding homework ID. Your source code should be a single zip file (also including images/results if requested). Name the file YourFirstName_YourLastName.[extension]. Please comment your code! Homework is due at 11:59pm on the due date. Grades will be posted on CourseWeb.

Exams

There will be two in-class exams. The second exam is not cumulative and will not cover material from the first exam. There will be no make-up exams unless you or a close relative is seriously ill!

Participation

Students are expected to regularly attend the class lectures, and should actively engage in in-class discussions. Attendance will not be taken, but keep in mind that if you don't attend, you cannot participate. You can actively participate by, for example, responding to the instructor's or others' questions, asking questions or making meaningful remarks and comments about the lecture, answering others' questions on Piazza, or bringing in relevant articles you saw in the news. The grading rubric will be as follows: 1 = you attended infrequently, 2 = you attended frequently but did not speak in class, 3 = you attended frequently and spoke a few times, 5 = you attended and participated frequently, 4 = in between 3 and 5.

Late Policy

On your programming assignments only, you get 3 "free" late days counted in minutes, i.e., you can submit a total of 72 hours late. For example, you can submit one homework 12 hours late, and another 60 hours late. The 72-hour "budget" is total for all programming assignments, NOT per assignment. Once you've used up your free late days, you will incur a penalty of 25% from the total assignment credit possible for each late day. A late day is anything from 1 minute to 24 hours. Note this policy does not apply to components of the project.

Collaboration Policy and Academic Honesty

You will do your work (exams and homework) individually. The only exception is the project, which can be done in pairs. The work you turn in must be your own work. You are allowed to discuss the assignments with your classmates, but do not look at code they might have written for the assignments, or at their written answers. You are also not allowed to search for code on the internet, use solutions posted online unless you are explicitly allowed to look at those, or to use Matlab's or Python's implementation if you are asked to write your own code. When in doubt about what you can or cannot use, ask the instructor! Plagiarism will cause you to fail the class and receive disciplinary penalty. Please consult the University Guidelines on Academic Integrity.

Note on Disabilities

If you have a disability for which you are or may be requesting an accommodation, you are encouraged to contact both your instructor and Disability Resources and Services (DRS), 140 William Pitt Union, (412) 648-7890, drsrecep@pitt.edu, (412) 228-5347 for P3 ASL users, as early as possible in the term. DRS will verify your disability and determine reasonable accommodations for this course.

Note on Medical Conditions

If you have a medical condition which will prevent you from doing a certain assignment, you must inform the instructor of this before the deadline. You must then submit documentation of your condition within a week of the assignment deadline.

Statement on Classroom Recording

To ensure the free and open discussion of ideas, students may not record classroom lectures, discussion and/or activities without the advance written permission of the instructor, and any such recording properly approved in advance can be used solely for the student's own private use.

[top]

Project

The project is expected to be a new method for an existing problem or an application of techniques we studied in class (or another method) to a new problem that we have not discussed in class. Below are some tips:

The project should include some amount of novelty. For example, you cannot just re-implement an existing paper or project. You are allowed to use existing code for known methods, but your project is expected to be a significant amount of work and not just a straight-up run of some package.
Do not rely on data collection to be the novel component of your work. If you are proposing to tackle a new problem, you might need to collect data, but while this is a contribution, it will not be enough to earn a good project grade. You still have to come up with a solid method idea, i.e. your project has to have sufficient technical novelty.
You must show that your method is in some sense better (quantitatively) than at least some relatively recent existing methods. For example, you can show that your method achieves superior accuracy in some prediction task compared to prior methods, or that it achieves comparable accuracy but is faster. This outcome is not guaranteed to come out the way you intended during the limited timespan of a course project, so whether or not your outperform the state of the art will only be a small component of your grade. Further, if you propose a sufficiently interesting method, rather than an extremely simple method, it will be less of a problem if your method does not outperform other existing approaches to the problem.
You are encouraged to use any external expertise you might have (e.g. biology, physics, etc.) so that your project makes the best use of areas you know well, and is as interesting as possible.

Rules and logistics:

Students are encouraged to work in groups of three for their final project.
You are strongly encouraged to present a project proposal to the instructor during one office hours session. Please use visual aides (e.g. bring slides/figures/tables, and/or draw on the whiteboard during the conversation). The project proposal discussion should happen before the date indicated in the schedule before. This component is not graded.
There will be two presentations per team. The first will focus on motivation, related work, and your proposed approach. The goal is to have carefully designed your method conceptually, and to have made notable progress in terms of implementing this proposed method. During the first presentations, you will receive feedback from your classmates. Do comment on challenges you are facing, if you think this will be useful. You should present a literature review during your first presentation, so your classmates know the space in which you are working. A good source for learning about what work has been done in your domain of interest are search engines, Google Scholar, and arxiv.org.
The second presentation will focus on your experimental validation and findings. You only need to briefly refresh your classmates' memory about the motivation for your work and the proposed method.
Your slides are due on CourseWeb on the indicated due date, at 11:59pm.
You are allowed to change your slides for the second presentation between presenting and submitting them, but make sure to change the slide background for any new slides to yellow so the instructor can easily tell what content is new compared to the version that was presented.
Your grades will be based on two rubrics, available below. For each presentation, your grade will be an average of (1) your classmates' scores, and (2) the instructor's scores. You will be able to see your score for each item in the rubric.
If you are interested in receiving feedback from the instructor, please come to office hours after your grades are posted.
There will be no written report, but you will practice your ability to describe your work in clear and memorable fashion through your presentation.
Combining your final project for this class and another class is generally permitted, but the project proposal and final report should clearly outline what part of the work was done to get credit in this class, and the instructor should approve the proposed breakdown of work between this and another class.

Rubric for first presentation (all questions except the last one are scored on a scale of 1 to 5, with 5 being best):

How well did the authors (presenters) explain what problem they are trying to solve?
How well did they explain why this problem is important?
How well did they explain why the problem is challenging?
How thorough was the literature review?
How clearly was prior work described?
How well did the authors explain how their proposed work is different than prior work?
How clearly did the authors describe their proposed approach?
How novel is the proposed approach?
How challenging and ambitious is the proposed approach? (1-10)

Rubric for second presentation (all questions except the first one are scored on a scale of 1 to 5, with 5 being best):

To what extent did the authors develop the method as described in the first presentation? (1-10)
How well did the authors describe their experimental validation?
How informative were the figures used?
Were all/most relevant baselines and competitor methods included in the experimental validation?
Were sufficient experimental settings (e.g. datasets) tested?
To what extent is the performance of the proposed method satisfactory?
How informative were the conclusions the authors drew about their method’s performance relative to other methods?
How sensible was the discussion of limitations?
How interesting was the discussion of future work?

Some potential sources of ideas:

Look at the datasets and tasks below.
Read some paper abstracts on this page.
Look at the topics in the programs of some of the recent computer vision conferences: CVPR 2017 (with papers downloadable here), ICCV 2017 (with papers downloadable here) and ECCV 2016.

[top]

Schedule

Date	Chapter	Topic	Readings	Lecture slides	Due
1/9	Basics	Introduction	Szeliski Sec. 1.1-1.2	pptx pdf
1/11		Filters	Szeliski Sec. 3.2, 10.5, 4.1.1	pptx pdf
1/16					HW 1 out
1/18
1/23		Features	Szeliski Sec. 4.1, Grauman/Leibe Sec. 3, 4.2.1; feature survey Sec. 1,3.2,7; SIFT paper by David Lowe	pptx pdf autocorr blobs
1/25
1/30
2/1		Grouping	Szeliski Sec. 4.2, 4.3.2, 5.3-4, 6.1.4; Grauman/Leibe Sec. 5.2; Hariharan CVPR 2015	pptx pdf
2/6		Grouping		pptx pdf	HW1 due
2/8		Transformations	Szeliski Sec 2.1, 3.6.1, 7.2, 11.1.1; Grauman/Leibe Sec. 5.1	pptx pdf
2/13		Recognition and support vector machines (SVMs); neural networks (begin)	Bishop PRML Sec. 1.1, Bishop PRML Sec. 7.1	pptx pdf	HW2 out
2/15
2/20
2/22		First exam
2/27		Convolutional neural networks (CNNs)	Karpathy Module 1 and Module 2; Krizhevsky NIPS 2012, Zeiler ECCV 2014	pptx pdf
3/1		Convolutional neural networks (CNNs)		pptx pdf	proposal due
3/13	Classics	Object recognition	Szeliski Sec. 14.1, 14.4; Grauman/Leibe Sec. 8, 9, 10.2.1.1, 10.3.3, 11.1,2,5; Viola Jones CVPR 2001, Felzenszwalb PAMI 2010, Girshick CVPR 2014	pptx pdf
3/15		Object recognition		pptx pdf
3/20		First project presentations
3/22		First project presentations			slides due
3/27		Presentation discussion; Object recognition wrap-up			HW2 due; HW3 out
3/29		Presentation discussion; Object recognition wrap-up
4/3		Vision and language	blog1, blog2, Karpathy CVPR 2015, Wu CVPR 2016	pptx pdf
4/5		Motion, tracking and actions	Laptev CVPR 2008	pptx pdf
4/10	Emergent etc.	Self-supervised learning	Doersch ICCV 2015, Jayaraman ICCV 2015, Lee ICCV 2013	pptx pdf
4/12		Generative adversarial networks	Goodfellow NIPS 2014, Isola CVPR 2017, Zhu ICCV 2017	pptx pdf
4/17		Generative adversarial networks	Goodfellow NIPS 2014, Isola CVPR 2017, Zhu ICCV 2017	pptx pdf	HW3 due
4/19		Second exam
4/24		Second project presentations
4/26		Second project presentations			slides due

[top]

Resources

This course was inspired by the following courses:

Computer Vision by Kristen Grauman, UT Austin, Spring 2011
Computer Vision by Derek Hoiem, UIUC, Spring 2015

Tutorials:

Matlab tutorial
Linear algebra review by Fei-Fei Li
Brief machine learning intro by Aditya Khosla and Joseph Lim
Resources list (including code and data, tutorials, and other related courses) compiled by Devi Parikh

Some datasets:

Microsoft COCO (Common Objects in Context) (object recognition, segmentation, image description)
ImageNet (object recognition)
SUN Database (scenes)
Caltech-UCSD Birds 200 (fine-grained object recognition)
MSRC Annotations (active learning)
Animals with Attributes (attribute-based recognition)
a-Pascal + a-Yahoo (attribute-based recognition)
Shoes (attribute-based search)
INRIA Movie Actions (action recognition)
ADL (ego-centric action recognition)
Action Quality (evaluating action quality)
CarDb Historical Cars (style classification of cars)
Recognizing Image Style (photographic style classification)
Judd gaze (visual saliency prediction)
Visual Persuasion (predicting subtle messages in images)
VQA (visual question-answering)
Recognition datasets list compiled by Kristen Grauman
Human activity datasets list compiled by Chao-Yeh Chen

Some code/frameworks of interest:

LIBSVM (by Chih-Chung Chang and Chih-Jen Lin)
SVM Light (by Thorsten Joachims)
VLFeat (feature extraction, tutorials and more, by Andrea Vedaldi)
Caffe (deep learning framework by Yangqing Jia et al.)
Torch (deep learning framework)
TensorFlow (deep learning framework)
Theano (deep learning framework)

[top]