Chapter 1 - Vision AI
Introduction
Notes based on EECS 498-007 / 598-005, Deep Learning for Computer Vision, at the University of Michigan
Image credits, left to right: NASA (public domain), Unsplash, Unsplash, Unsplash
Image credits, left to right: Unsplash, Unsplash, Unsplash, ExtremeTech (copyrighted)
[Figure: Computer Vision, Machine Learning, Deep Learning]
Hubel & Wiesel, 1959
[Figure: experimental setup: display a stimulus, measure brain activity]
Simple cells: response to light orientation
Complex cells: response to light orientation and movement
Hypercomplex cells: response to movement with an end point
[Figure: stimulus vs. response, including stimuli that produce no response]
Roberts, 1963
[Figure: (a) Original picture (b) Differentiated picture (c) Feature points selected]
David Marr, 1970s
[Figure: stages of visual representation]
Perceived intensities
Zero crossings, blobs, edges, bars, ends, virtual lines, groups, curves, boundaries
Local surface orientation and discontinuities in depth and in surface orientation
3-D models hierarchically organized in terms of surface and volumetric primitives
Normalized Cuts, Shi and Malik, 1997
SIFT, David Lowe, 1999
1959 1963 1970s 1979 1986 1997 1999
Hubel & Wiesel Roberts David Marr Gen. Cylinders Canny Norm. Cuts SIFT
[Figure: example images with object labels: train, person, person, airplane]
1959 1963 1970s 1979 1986 1997 1999 2001 2007 2009
Hubel & Wiesel Roberts David Marr Gen. Cylinders Canny Norm. Cuts SIFT V&J PASCAL ImageNet
2012
AlexNet
1958 2012
Perceptron AlexNet
Rumelhart et al., 1986
Introduced backpropagation for computing gradients in neural networks
Successfully trained perceptrons with multiple layers
Illustration of Rumelhart et al., 1986 by Lane McIntosh, copyright CS231n 2017
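As a rough illustration of what "backpropagation for computing gradients" means, here is a minimal sketch in plain Python. It is not the formulation from Rumelhart et al.: the network size (one hidden layer of sigmoid units, one sigmoid output), the squared-error loss, and every function name below are assumptions chosen for brevity. The gradient is obtained by applying the chain rule backwards from the output, and a finite-difference helper is included as a sanity check.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_params(n_in, n_hidden, seed=0):
    """Random weights for a hypothetical n_in -> n_hidden -> 1 network."""
    rng = random.Random(seed)
    W1 = [[rng.uniform(-1, 1) for _ in range(n_in)] for _ in range(n_hidden)]
    b1 = [0.0] * n_hidden
    W2 = [rng.uniform(-1, 1) for _ in range(n_hidden)]
    b2 = 0.0
    return W1, b1, W2, b2


def forward(params, x):
    """Forward pass: returns hidden activations h and output y."""
    W1, b1, W2, b2 = params
    h = [sigmoid(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(W1, b1)]
    y = sigmoid(sum(w * hj for w, hj in zip(W2, h)) + b2)
    return h, y


def loss(params, x, t):
    """Squared error for a single example (an assumed choice of loss)."""
    _, y = forward(params, x)
    return 0.5 * (y - t) ** 2


def backprop(params, x, t):
    """Gradients of the loss w.r.t. all parameters via the chain rule."""
    W1, b1, W2, b2 = params
    h, y = forward(params, x)
    # Output layer: dL/dz2 = (y - t) * sigmoid'(z2), with sigmoid' = y * (1 - y)
    delta2 = (y - t) * y * (1.0 - y)
    gW2 = [delta2 * hj for hj in h]
    gb2 = delta2
    # Hidden layer: push delta2 back through W2 and each unit's sigmoid'
    delta1 = [delta2 * w * hj * (1.0 - hj) for w, hj in zip(W2, h)]
    gW1 = [[d * xi for xi in x] for d in delta1]
    gb1 = list(delta1)
    return gW1, gb1, gW2, gb2


def numerical_grad_W2(params, x, t, j, eps=1e-6):
    """Central finite difference for dL/dW2[j], used only as a check."""
    W1, b1, W2, b2 = params
    up, down = list(W2), list(W2)
    up[j] += eps
    down[j] -= eps
    return (loss((W1, b1, up, b2), x, t)
            - loss((W1, b1, down, b2), x, t)) / (2 * eps)


if __name__ == "__main__":
    params = init_params(n_in=2, n_hidden=3)
    x, t = [1.0, 0.0], 1.0
    _, _, gW2, _ = backprop(params, x, t)
    for j, g in enumerate(gW2):
        print(j, g, numerical_grad_W2(params, x, t, j))
```

The two printed columns should agree closely for each weight. Training "perceptrons with multiple layers" then amounts to repeatedly subtracting a small multiple of these gradients from the parameters.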
Figures copyright Alex Krizhevsky, Ilya Sutskever, and Geoffrey Hinton, 2012.
Kaggle Challenge
Image Captioning: Vinyals et al., 2015; Karpathy and Fei-Fei, 2015
Data
Computation
http://en.wikipedia.org/wiki/FLOPS#Hardware_costs
1959 1963 1970s 1979 1986 1997 1999 2001 2007 2009 2021 2023
Hubel & Wiesel Roberts David Marr Gen. Cylinders Canny Norm. Cuts SIFT V&J PASCAL ImageNet Dall-E This Class!