Learning for 3D Vision

Course Description

This course introduces the fundamental techniques used in computer vision, that is, the analysis of patterns in visual images to reconstruct and understand the objects and scenes that generated them. Topics covered include image processing basics, Hough Transforms, feature detection, feature descriptors, image representations, image classification and object detection. We will also cover camera geometry, multi-view geometry, stereo, 3D reconstruction from images, optical flow, motion analysis and tracking.

Version

Version B of 16-720 is intended for students with prior knowledge of computer vision and prior exposure to machine learning. Undergraduate students should take 16-385, which is the undergraduate version of the class. Those with no exposure to computer vision or machine learning should take the A version of the class. Those with advanced experience in computer vision should take the 800-level computer vision courses.

Prerequisites (self evaluation)

Linear Algebra, Multivariate Calculus, Probability theory, Programming

Educational Outcomes

Implement the Hough Transform to detect lines in an image
Extract SIFT features to build a Bag-of-Words representation of an image for classification
Perform object recognition using a convolutional neural network
Detect Harris Corners and implement the RANSAC algorithm to find the homography between two images
Perform 3D reconstruction and stereo rectification to implement stereo block matching using two images
Implement a gradient descent based image alignment algorithm to track objects in a video
Students will learn how to use Python and PyTorch through the programming assignments

Course Staff

Instructor Teaching Assistants

Grading

Programming assignments account for 100% of the grade (6 assignments total). Grades are determined on an absolute scale. Typically, 90% and above is an A, 80% - 89% is a B, 70% - 79% is a C, 60% - 69% is a D, and 59% or below is an R. There will be extra credit opportunities for students who want to go deeper into the material.

Hough Transform (10%)
Bag of Visual Words (18%)
Neural Networks (18%)
Homography (18%)
3D Reconstruction (18%)
LK Image Alignment and Tracking (18%)
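
As a rough illustration of how the weights above combine into a final grade, here is a minimal Python sketch. The function names (final_percentage, letter_grade) are hypothetical, and the hard cutoffs at 90/80/70/60 with no rounding are an assumption: the syllabus only says these boundaries are "typical", and it does not say how extra credit is applied.

```python
# Assignment weights as listed in the syllabus (percent of the final grade).
WEIGHTS = {
    "Hough Transform": 10,
    "Bag of Visual Words": 18,
    "Neural Networks": 18,
    "Homography": 18,
    "3D Reconstruction": 18,
    "LK Image Alignment and Tracking": 18,
}


def final_percentage(scores):
    """Weighted average of per-assignment scores.

    `scores` maps each assignment name to a score in percent (0-100).
    How extra credit enters the total is not specified in the syllabus,
    so it is not modeled here.
    """
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS) / 100.0


def letter_grade(percentage):
    """Map a percentage to a letter using the 'typical' cutoffs.

    Assumption: no rounding, and boundaries sit exactly at 90/80/70/60.
    """
    if percentage >= 90:
        return "A"
    if percentage >= 80:
        return "B"
    if percentage >= 70:
        return "C"
    if percentage >= 60:
        return "D"
    return "R"


if __name__ == "__main__":
    # Hypothetical student: full marks on the first assignment, 80% on the rest.
    example = {name: 80 for name in WEIGHTS}
    example["Hough Transform"] = 100
    total = final_percentage(example)
    print(total, letter_grade(total))
```

Because the first assignment carries only 10% of the grade, a strong score there moves the total much less than the same score on any of the 18% assignments.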

16-720-B: Computer Vision [Fall 2022]

Course Page | Schedule | Piazza | Syllabus | Office Hour [TBD]

Monday and Wednesday, 11:50 AM - 1:10 PM
@ DH [Doherty Hall] A302
