16-720-B: Computer Vision [Fall 2022]


Monday and Wednesday, 11:50 AM - 1:10 PM
@ DH [Doherty Hall] A302


Course Description

This course introduces the fundamental techniques used in computer vision, that is, the analysis of patterns in visual images to reconstruct and understand the objects and scenes that generated them. Topics covered include image processing basics, Hough Transforms, feature detection, feature descriptors, image representations, image classification and object detection. We will also cover camera geometry, multi-view geometry, stereo, 3D reconstruction from images, optical flow, motion analysis and tracking.

Version

Version B of 16-720 is intended for students with prior knowledge of computer vision and prior exposure to machine learning. Undergraduate students should take 16-385, which is the undergraduate version of the class. Those with no exposure to computer vision or machine learning should take the A version of the class. Those with advanced experience in computer vision should take the 800-level computer vision courses.

Prerequisites (self evaluation)

Linear Algebra, Multivariate Calculus, Probability theory, Programming

Educational Outcomes

  • Implement the Hough Transform to detect lines in an image
  • Extract SIFT features to build a Bag-of-Words representation of an image for classification
  • Perform object recognition using a convolutional neural network
  • Detect Harris Corners and implement the RANSAC algorithm to find the homography between two images
  • Perform 3D reconstruction and stereo rectification to implement stereo block matching using two images
  • Implement a gradient descent based image alignment algorithm to track objects in a video
  • Learn to use Python and PyTorch through the programming assignments

Course Staff

Instructor: Kris Kitani

Teaching Assistants:

  • Rawal Khirodkar
  • Sheng-Yu Wang
  • Rohan Choudhury
  • Jinkun Cao
  • Arkadeep Chaudhury

Grading

Programming assignments account for 100% of the grade (6 assignments total). Grades are determined on an absolute scale. Typically, 90% and above is an A, 80% - 89% is a B, 70% - 79% is a C, 60% - 69% is a D, and 59% or below is an R. There will be extra credit opportunities for students who want to go deeper into the material.

  • Hough Transform (10%)
  • Bag of Visual Words (18%)
  • Neural Networks (18%)
  • Homography (18%)
  • 3D Reconstruction (18%)
  • LK Image Alignment and Tracking (18%)
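
To make the weighting concrete, the following is a minimal sketch of how a final percentage and a typical letter grade could be computed from the six assignment scores. The function names, the treatment of a missing assignment as zero, and the handling of fractional totals (for example, 89.6% falling below the A cutoff) are illustrative assumptions, not course policy. Extra credit is not modeled because the page does not say how it is applied.

```python
# Assignment weights in percentage points, taken from the list above.
WEIGHTS = {
    "Hough Transform": 10,
    "Bag of Visual Words": 18,
    "Neural Networks": 18,
    "Homography": 18,
    "3D Reconstruction": 18,
    "LK Image Alignment and Tracking": 18,
}


def final_percentage(scores):
    """Return the weighted course percentage.

    `scores` maps an assignment name to a score out of 100.
    Assumption: an assignment missing from `scores` counts as 0.
    """
    total = sum(weight * scores.get(name, 0) for name, weight in WEIGHTS.items())
    return total / 100


def letter_grade(percentage):
    """Map a percentage to the typical letter grade on the absolute scale.

    Assumption: cutoffs are applied without rounding, so 89.6 is a B.
    """
    if percentage >= 90:
        return "A"
    if percentage >= 80:
        return "B"
    if percentage >= 70:
        return "C"
    if percentage >= 60:
        return "D"
    return "R"


# Hypothetical scores, for illustration only.
example_scores = {
    "Hough Transform": 100,
    "Bag of Visual Words": 90,
    "Neural Networks": 80,
    "Homography": 85,
    "3D Reconstruction": 95,
    "LK Image Alignment and Tracking": 70,
}
example_total = final_percentage(example_scores)  # 85.6
example_letter = letter_grade(example_total)      # "B"
```

With these hypothetical scores, the Hough Transform assignment contributes 10 points and the other five contribute 18 × (90 + 80 + 85 + 95 + 70) / 100 = 75.6 points, for a total of 85.6%, which typically corresponds to a B.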


* This page style is adopted from Shubham's page for another awesome course by Jinkun Cao. [Last Update: 08/31/2022]