Lec01 Intro
https://slazebni.cs.illinois.edu/spring23/
Lecture overview
• About the class
• Milestones of deep learning
• Recent successes and origins
• Visual recognition
• Natural language understanding
• Generative modeling
• Games
• Robotics
• Topics to be covered in class
A few historical milestones
• 1958: Rosenblatt’s perceptron
• 1969: Minsky and Papert Perceptrons book
• 1980: Fukushima’s Neocognitron
• 1986: Back-propagation
  • Origins in control theory and optimization: Kelley (1960), Dreyfus (1962), Bryson & Ho (1969), Linnainmaa (1970)
  • Application to neural networks: Werbos (1974)
  • Popularized by Rumelhart, Hinton & Williams (1986)
• 1989–1998: Convolutional neural networks, from LeNet to LeNet-5 (Yann LeCun, 2018 ACM Turing Award winner with Hinton and Bengio)
• 2012: AlexNet
  • Fascinating reading: The secret auction that set off the race for AI supremacy, Wired, 3/16/2021
• 2012–present: deep learning explosion
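To make the earliest milestone concrete, Rosenblatt’s perceptron learns a linear decision rule with a simple error-driven update. The sketch below is illustrative only (names and the toy OR task are mine, not from the slides):

```python
def perceptron_update(w, b, x, y, lr=1.0):
    """One step of Rosenblatt's perceptron rule: if the thresholded
    prediction disagrees with the label y in {-1, +1}, nudge the
    weights toward the misclassified example."""
    pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else -1
    if pred != y:
        w = [wi + lr * y * xi for wi, xi in zip(w, x)]
        b = b + lr * y
    return w, b

# Toy run: learn the (linearly separable) OR function.
w, b = [0.0, 0.0], 0.0
data = [([0, 0], -1), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
for _ in range(10):
    for x, y in data:
        w, b = perceptron_update(w, b, x, y)
```

For linearly separable data like this, the perceptron convergence theorem guarantees the loop stops making updates after finitely many passes; Minsky and Papert’s 1969 critique centered on what a single such linear unit cannot represent (e.g. XOR).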
[Figure: ILSVRC classification error over the years, comparing pre-deep-learning methods, convolutional architectures, and the human baseline (figure source)]
ImageNet is obsolete?
Visual recognition: Concerns
• How China Uses High-Tech Surveillance to Subdue Minorities – New York Times, 5/22/2019
• The Secretive Company That Might End Privacy As We Know It – New York Times, 1/18/2020
• Wrongfully Accused by an Algorithm – New York Times, 6/24/2020
Lecture overview
• About the class
• Milestones of deep learning
• Progress in the last decade
• Visual recognition
• Natural language understanding
Neural machine translation
Y. Wu et al. Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation. arXiv 2016
A. Vaswani et al. Attention is all you need. NeurIPS 2017
Previous system (before deep learning): PBMT (2014): 37 BLEU
https://mobile.nytimes.com/2016/12/14/magazine/the-great-ai-awakening.html
Figure source
Large language models: Google BERT
• Self-supervised pre-training task: masked token prediction
Bidirectional Encoder Representations from Transformers (BERT)
Figure source
J. Devlin et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. NAACL 2019
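The masked-prediction pre-training task can be sketched in a few lines: hide a fraction of the input tokens and train the model to recover the originals at exactly those positions. This toy sketch uses BERT’s 15% default but omits the paper’s 80/10/10 replacement details; the function name and setup are illustrative, not from the paper:

```python
import random

def mask_tokens(tokens, mask_token="[MASK]", mask_prob=0.15, seed=0):
    """BERT-style masking sketch: replace a random fraction of tokens
    with a mask symbol; the training loss is computed only at the
    masked positions, where `targets` holds the original token."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_prob:
            inputs.append(mask_token)
            targets.append(tok)   # model must predict this token
        else:
            inputs.append(tok)
            targets.append(None)  # position ignored in the loss
    return inputs, targets
```

Because the model sees unmasked context on both sides of each blank, the learned representations are bidirectional, which is the “B” in BERT.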
Large language models: OpenAI GPT
• Self-supervised pre-training task: next token prediction
Figure source
GPT: A. Radford et al. Improving language understanding with unsupervised learning. 2018
GPT-2 (1.5B parameters): A. Radford et al. Language models are unsupervised multitask learners. 2019
GPT-3 (175B parameters): T. Brown et al. Language models are few-shot learners. NeurIPS 2020 (Best Paper Award)
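Next-token prediction amounts to training on every (prefix, next token) pair that a sequence yields; unlike BERT, the model only ever conditions on the left context. A minimal sketch (helper name is mine):

```python
def next_token_pairs(tokens):
    """GPT-style objective sketch: at each position the model sees the
    prefix tokens[:i] and is trained to predict tokens[i]."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]
```

For example, `next_token_pairs(["the", "cat", "sat"])` yields the training pairs `(["the"], "cat")` and `(["the", "cat"], "sat")`; scaling this simple objective from GPT to GPT-3 is what produced few-shot behavior.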
Stochastic parrots or sentient entities?*
*Asking either question will get you fired from Google
https://www.technologyreview.com/2020/12/04/1013294/google-ai-ethics-research-paper-forced-out-timnit-gebru/
https://www.cnn.com/2022/07/23/business/google-ai-engineer-fired-sentient/index.html
L. Ouyang et al. Training language models to follow instructions with human feedback. NeurIPS 2022
https://openai.com/blog/chatgpt/
ChatGPT
Generated on 1/10/2023
ChatGPT: Concerns
https://www.nytimes.com/2023/01/16/technology/chatgpt-artificial-intelligence-universities.html
ChatGPT: Concerns – and opportunities
[Figure: text-to-image generation pipeline: text prompt encoding (256 tokens) and image encoding (1024 = 32×32 tokens), decoded to a 256×256 image]
A. Ramesh et al. Hierarchical text-conditional image generation with CLIP latents. 2022
Diffusion models
• Idea: convert noise to an image in multiple passes
https://www.foley.com/en/insights/publications/2022/12/venture-capital-investors-betting-generative-ai
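The multi-pass idea can be caricatured in a few lines: start from noise and repeatedly apply a small denoising step. Here `toy_denoise` is a stand-in for the learned noise-prediction network; this is purely illustrative, not a real diffusion sampler:

```python
def reverse_diffusion(x, denoise_step, num_steps=50):
    # Diffusion sampling idea: many small denoising passes
    # gradually turn noise into a sample.
    for t in reversed(range(num_steps)):
        x = denoise_step(x, t)
    return x

def toy_denoise(x, t, target=1.0, rate=0.2):
    # Stand-in for a trained denoiser: nudge the 1-D "image"
    # a little toward the data mean each pass.
    return x + rate * (target - x)

sample = reverse_diffusion(5.0, toy_denoise)  # starts far from the data
```

The point of the many-steps design is that each individual denoising step only has to make a small, easy correction; the composition of fifty easy steps accomplishes what a single noise-to-image jump cannot.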
Generative modeling: Concerns
• Deepfakes
  https://www.wired.com/story/zelensky-deepfake-facebook-twitter-playbook/
• DALL-E 2 images of lawyers, flight attendants (source)
• AI-generated work wins first prize at art fair
Lecture overview
• About the class
• Milestones of deep learning
• Progress in the last decade
• Vision
• Language
• Generative modeling
• Games
Games
• 2013: DeepMind uses deep reinforcement learning to beat humans at some Atari games
• 2016: DeepMind’s AlphaGo system beats Go grandmaster Lee Sedol 4-1
• 2017: AlphaZero learns to play Go and chess from scratch
• 2019: DeepMind’s StarCraft 2 AI is better than 99.8 percent of human players
Lecture overview
• About the class
• Milestones of deep learning
• Progress in the last decade
• Vision
• Language
• Generative modeling
• Games
• Robotics
Sensorimotor learning
Overview video,
training video
S. Levine, C. Finn, T. Darrell, P. Abbeel, End-to-end training of deep visuomotor policies, JMLR 2016
Sensorimotor learning
A. Agarwal, A. Kumar, J. Malik, and D. Pathak. Legged Locomotion in Challenging Terrains using Egocentric Vision. CoRL 2022
Lecture overview
• About the class
• Milestones of deep learning
• Progress in the last decade
• Vision
• Language
• Generative modeling
• Games
• Robotics
• Topics to be covered in class
Topics to be covered in class
• ML basics, linear classifiers
• Multilayer neural networks, backpropagation
• Convolutional networks for classification
• Networks for detection, dense prediction
• Self-supervised learning
• Generative models (GANs, image-to-image translation, diffusion models)
1500 characters (26 letters, 10 digits from 50 writers), 12x12 resolution, stored on IBM 704 punch cards
Bill Highleyman and Louis Kamentsky, Bell Labs