Introduction

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Who are we? - Lab Members

Andreas Maier, Zijin Yang, Alexander Barnhill, Chang Liu, Leonhard Rist,
Merlin Nau, Noah Maul, Mathias Zinnen, Srikrishna Jaganathan, Kai Packhäuser

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction April 15, 2024 1
Who are we? - Student Members

Lisa Schmidt, Majid Sharghi, Chengze Ye, Teena Tom Dieck, Leyi Tang,
Supraja Ramesh, Karlo Gabriel Fonseca Yakovenko, Jingyi Yao, Anna-Sophie Stephan, Philip Wagner
Deep Learning – Buzzwords

Outline

Motivation

Machine Learning and Pattern Recognition

Perceptron

Organizational Matters
Motivation
NVIDIA Stock Market

Source: https://www.google.com/finance/quote/

The Big Bang of Deep Learning

ImageNet [8] Dataset


• ≈ 14 million images, labeled into ≈ 20,000 synonym sets (synsets)
• ImageNet Large Scale Visual Recognition Challenge using ≈ 1000 classes
• Images downloaded from the Internet, single label per image
• 2012: Breakthrough by Krizhevsky et al. [10]

Source: Krizhevsky et al. 2012

ImageNet Large Scale Visual Recognition Challenge

Figure: ILSVRC top-5 error [%] by year:
2011: 25.8 — 2012: 16.4 (first CNN) — 2013: 11.7 — 2014: 6.7 — Human: 5.1 — 2015: 3.6 (Residual Network) — 2016: 3.0 — 2017: 2.4

• First CNN approach now famous as AlexNet [10]


• “Superhuman” should be Super-Karpathy-an performance
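The top-5 error on the chart counts a prediction as wrong only if the true label is not among the five highest-scoring classes. A minimal sketch of the metric (the function name and array layout are our own, not from the ILSVRC tooling):

```python
import numpy as np

def top5_error(scores, labels):
    """scores: (n_samples, n_classes) array of class scores,
    labels: (n_samples,) array of true class indices."""
    # indices of the five highest-scoring classes per sample
    top5 = np.argsort(scores, axis=1)[:, -5:]
    # a sample counts as correct if its true label appears among them
    hits = np.any(top5 == labels[:, None], axis=1)
    return 1.0 - hits.mean()
```

Under this protocol a classifier may hedge over five guesses per image, which is also how the human reference error of about 5.1% was measured.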

Source: image-net.org, Russakovsky et al. 2015

ImageNet Large Scale Visual Recognition Challenge

Source: Krizhevsky et al. 2012

Deep Learning Users

Playing Go

• 1997: Deep Blue beats Garry Kasparov
• Go as the next challenge: large branching factor
• 2016: AlphaGo [16] beats a professional
• 2017: AlphaGo Zero [1] surpasses every human in Go through self-play
• 2017: AlphaZero [2] generalizes to a number of other board games
• 2019: AlphaStar beats professional StarCraft players

Source: https://commons.wikimedia.org/wiki/File:FloorGoban.jpg

Google DeepDream

Attempt to understand the inner workings of the network: What it "dreams" about
when presented with images

Idea:
• Arbitrary image or noise as input
• Instead of adjusting network
parameters, tweak image towards
high activations
• Different layers enhance different
features (low or high level)
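The idea can be sketched numerically: freeze a "layer" (here a single hand-set 3×3 filter, a stand-in for a trained network) and perform gradient ascent on the input image so that the layer's activation grows. The filter, step size, and function names are illustrative assumptions, not Google's implementation:

```python
import numpy as np

# stand-in "layer": one fixed 3x3 filter (a trained CNN would have many)
KERNEL = np.array([[-1., 0., 1.],
                   [-2., 0., 2.],
                   [-1., 0., 1.]])

def activation(img):
    # mean squared filter response over all valid 3x3 windows
    h, w = img.shape
    resp = np.array([[np.sum(img[i:i+3, j:j+3] * KERNEL)
                      for j in range(w - 2)] for i in range(h - 2)])
    return np.mean(resp ** 2)

def dream_step(img, lr=0.1, eps=1e-4):
    # tweak the image (not the network weights) towards higher activation,
    # using a finite-difference gradient for simplicity
    grad = np.zeros_like(img)
    for idx in np.ndindex(img.shape):
        d = np.zeros_like(img); d[idx] = eps
        grad[idx] = (activation(img + d) - activation(img - d)) / (2 * eps)
    return img + lr * grad

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))   # arbitrary image or noise as input
before = activation(img)
for _ in range(5):
    img = dream_step(img)
after = activation(img)         # activation has increased
```

Running the same ascent against a deeper layer of a real network enhances high-level features instead of edges, which is what produces the characteristic DeepDream imagery.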

Source: https://research.googleblog.com

Google DeepDream

Looking for new animals in the clouds

Source: https://research.googleblog.com

Real-Time Object Detection: YOLO, YOLO9000, YOLOv3 [11]–[13]



• YOLO: You Only Look Once
• Prior systems: use classifiers at multiple locations and scales
• YOLO: simultaneous regression of bounding boxes and labels
• Fast: 40–90 frames/second on an NVIDIA Titan X

Source: www.youtube.com, Redmon and Farhadi 2016

Everyday Use
Siri

Siri: Speech Interpretation and Recognition Interface

Source: www.apple.com/ios/siri/
Amazon Echo & Alexa Voice Service

Source: www.amazon.com

Google Translate

Source: translate.google.de

Introduction - Part 2

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Research at the Pattern Recognition Lab
Assisted and Automated Driving

Goal
Find new ways to train and update deep learning mechanisms in environments with
high safety requirements

• Assisted and automated driving relies on sensor data
• Cameras detect dynamic objects, driving lanes, and free space
• Detection and segmentation tasks ⇒ deep learning

Source: Audi AG

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction - Part 2 April 15, 2024 18
Assisted and Automated Driving

• Currently: neural networks are trained and thoroughly tested before deployment
⇒ Requires huge amounts of manually labeled data
• Regular test drives cannot verify system reliability in all traffic scenarios
• Challenge: new ways to test algorithms in simulated environments and to utilize data collected in production cars equipped with appropriate hardware

Source: Mobileye N.V.

Smart Devices

Problem statement
Renewable energy production ≠ energy demand

• Underproduction ⇒ backup power plants
• Overproduction ⇒ energy lost
⇒ Real-time pricing to match energy demand and supply
• Needs smart devices to shift workload automatically

Smart Devices

Goal
Establish energy equilibrium by predicting energy consumption

• Example: Interrupt fridge cooling cycle when price is high, start washing
machine when price is low
• Dependencies between tasks, user information and action necessary (e.g.,
washer/dryer)
• Task: Identify time-shiftable loads and assess appropriate time frame
• Approach: Train recurrent neural networks to identify usage patterns and
dependencies between devices

Cloud Detection for Power Forecast [4]

Goal
Power forecast for solar power plants with a high temporal and spatial resolution

Approach

1. Monitor the sky
2. Detect clouds
3. Estimate the cloud motion
4. Establish power forecasts

Cloud Detection for Power Forecast [4]

Figure: Network with r, g, b input channels processing sky images.

Input: sky moving towards the sun
Output: Clear Sky Index, with values between 0 (overcast sky) and 1 (clear sky)

Writer Recognition

Goal
Writer identification with limited training data (few pages per writer)


Source: ICDAR’13 dataset, QUWI’15 dataset, freepik.com

Writer Recognition using CNN Activation Features [6]

Use a neural network for feature extraction.

Figure: Network with input and layers 1 … K; the activation features are taken from a hidden layer instead of the classification layer.

Medical Applications
Cell Classification for Tumor Diagnostics [3]

Goal
Identify cells undergoing mitosis to assess tumor proliferation and aggressiveness in histological images

Challenge
• Histological images contain a large number of cells
• Full annotations are not feasible ⇒ sparse annotations
• Cells vary significantly in size, shape, etc.

Source: Aubreville et al. 2017

Cell Classification for Tumor Diagnostics [3]

Approach
Use spatial transformer networks (STNs) to learn affine transformation and
classification

Source: Aubreville et al. 2017

Defect Pixel Interpolation

Goal
• Reconstruction of coronaries based on truncated X-ray images
• Create “virtual” digital subtraction angiography

Approach

1. Segment coronary vessels
2. Mask fluoroscopic image
3. Inpaint using U-net
4. Subtract inpainted image to get untruncated data
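These four steps can be sketched end to end on a toy image. The U-net is replaced here by simple iterative neighbor averaging inside the mask (an assumption made only to keep the sketch self-contained); the pipeline structure is the same:

```python
import numpy as np

def inpaint_mean(img, mask, iters=200):
    # stand-in for the U-net: iteratively replace masked pixels by the
    # average of their four neighbors until the hole is filled smoothly
    # (masking happens here: masked pixels are treated as unknown)
    out = img.copy()
    for _ in range(iters):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]
    return out

# toy fluoroscopic image: smooth background plus an attenuating "vessel"
x = np.linspace(0.0, 1.0, 64)
background = np.outer(x, x)
img = background.copy()
mask = np.zeros_like(img, dtype=bool)
mask[30:34, 1:63] = True         # step 1: segmented vessel pixels
img[mask] -= 0.5                 # vessel darkens the projection

virtual = inpaint_mean(img, mask)   # steps 2+3: mask and inpaint the hole
dsa = virtual - img                 # step 4: difference shows the vessel only
```

The difference image is zero outside the vessel and recovers the vessel contrast inside it, which is exactly the "virtual" digital subtraction angiography effect.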

Defect Pixel Interpolation

Processing pipeline

Figure: The X-ray projection is fed to a segmentation algorithm that yields a binary mask; masking and an inpainting algorithm produce a virtual mask image, which is subtracted from the X-ray projection to obtain the digital subtraction angiogram.

Defect Pixel Interpolation

Deep learning for inpainting

Organ Search [7]

Goal

Locate anatomic structures automatically

Approach

• Deep reinforcement learning
• Learn strategies for how to search for objects
⇒ Learn the optimal, shortest search path through the image volume to different landmarks
• Hierarchical approach to improve speed and robustness
Source: Ghesu et al. 2016, Ghesu et al. 2017

Organ Search [7]
Figure: 3D search trajectories starting from the image center (left) and the estimated body range (right). Detected landmarks (aortic arch, bifurcations, sternum tip, kidneys, hip bones, knees) span the 0–100% body range; missing landmarks such as the brainstem (109.7%) and the right knee (−31.4%) are obtained by interpolation and extrapolation.

X-ray-transform Invariant Anatomical Landmark Detection

Goal
• Detect landmarks in X-ray images
• Knowing correspondences enables symbolic reconstruction
⇒ Classic computer vision reconstruction

Challenge
• Transmission imaging
⇒ Overlap/superposition of structures
⇒ High variance due to projection
⇒ Artifacts, e.g., interventional devices

Source: Bier et al. 2018

X-ray-transform Invariant Anatomical Landmark Detection

Approach: Convolutional Pose Machine (CPM) [17]

• Sequential prediction framework to detect landmarks
⇒ Yields 2D belief maps

Properties
• Large receptive fields enable learning of configurations
• Estimation is refined over stages

Source: Wei et al. 2016

X-ray-transform Invariant Anatomical Landmark Detection

Source: Bier et al. 2018

Organ Prediction

Goal
Estimation of body and organ shapes based on patient’s height and weight for X-ray
exposure estimation.

Could we achieve more if we had old CT data of a patient?


Action Learning for 3D Point Cloud Based Organ Segmentation

Goal: Versatile organ segmentation for:
• Computer-aided diagnosis
• Treatment planning
• Dose management

Dose estimation in interventions with overlays

Action Learning for 3D Point Cloud Based Organ Segmentation

Challenges for clinical applications

• Robustness w.r.t.
1. Individual anatomy
2. Scan protocols
• Time constraints

Pre-operative CT (left) and contrast enhanced CT (right)

Action Learning for 3D Point Cloud Based Organ Segmentation

• Reinforcement learning
• Predict the transformation at a given state

Figure: Action prediction pipeline for 3D point cloud based organ segmentation

Source: Zhong et al. 2018

Action Learning for 3D Point Cloud Based Organ Segmentation

• Runtime:
  1. 0.3–2.6 s per volume
  2. 50–100× speedup compared to U-net [5]
• Very accurate
• Robust to:
  1. scan protocol
  2. contrast agent
  3. organ initialization

Source: Zhong et al. 2018

Introduction - Part 3

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Limitations
Image Captioning

Figure: "baseball player is throwing ball in game."
Figure: "girl in pink dress is jumping in air."
Figure: "man in black shirt is playing guitar."

Source: https://cs.stanford.edu/people/karpathy/deepimagesent

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction - Part 3 April 15, 2024 44
Image Captioning & Generating

GPT4 can do even better:


GPT4: “An athletically built man with a
serious expression stands in a well-lit
room with a wooden ceiling. He is
wearing a blue t-shirt with the phrase
’SAVE THE CHUBBY UNICORNS’ and
an image of a rhinoceros. The man is
taking a selfie with a modern
smartphone, capturing his reflection in a
mirror. In the background, one can see
gym equipment including dumbbells and
a weight rack. There is also a hanging
chandelier with a unique design,
illuminating the room”

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Create an image for this prompt in different styles"

Figure: "version 1." Figure: "version 2." Figure: "version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Can you create a few pixel art versions of this?"

Figure: "pixel version 1." Figure: "pixel version 2." Figure: "pixel version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Show the scene in the style of a massive online roleplaying game"

Figure: "warcraft version 1." Figure: "warcraft version 2." Figure: "warcraft version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Make a few versions that show the scene in Lego Style"

Figure: "lego version 1." Figure: "lego version 2." Figure: "lego version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Challenges with Training Data

• Deep learning applications often rely on huge, manually annotated data sets
• Hard to obtain: time-consuming, expensive, ambiguous
• To err is human: mislabeled ground-truth annotations
⇒ May cause a significant drop in performance

• Question: How far can we get with simulations?

Generating Synthetic Data

• Sample from trained latent diffusion model

Figure: four chest X-ray images sampled from a trained latent diffusion model. Image generation was done in a
conditional way to produce images of specific abnormality classes. The induced abnormality patterns in the
synthetic images are highlighted with red arrows and circles.

Source: Packhäuser et al. IEEE ISBI 2023

Memorization problem for diffusion models

• Problem: models in practice tend to reproduce patterns or complete images from the training set
• Proposed method: use a pre-trained patient retrieval model to compare a generated synthetic scan to all training images (see pipeline below)

Privacy-enhancing Image Sampling Strategy

Challenges with Trust and Reliability

• Verification is mandatory for high-risk applications
⇒ End-to-end learning prohibits verification of parts
⇒ Largely unsolved
• Possible solution: Reformulate classical algorithms

Large Language Models for MRI Scanners

To GPT: Can you implement the sequence shown in the given sequence diagram in pypulseq?

GPT: Yes. (On the right)

• Still buggy, works only 50% of the time

Future Directions
Learning of Algorithms

• Computed Tomography
• Efficient solution via filtered back-projection:

  f(x, y) = \int_0^{\pi} p(s, \theta) * h(s) \,\big|_{s = x \cos\theta + y \sin\theta} \, \mathrm{d}\theta

• Three steps:
  • Convolution along s
  • Back-projection along θ
  • Suppress negative values
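These three steps can be sketched compactly with NumPy for a parallel-beam geometry (function names, the nearest-neighbor detector lookup, and the discretization are illustrative choices, not a production implementation):

```python
import numpy as np

def ramp_filter(sinogram):
    # step 1: convolution along s, done as multiplication with |w| in Fourier domain
    n = sinogram.shape[1]
    ramp = np.abs(np.fft.fftfreq(n))
    return np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))

def backproject(filtered, thetas, size):
    # step 2: back-projection along theta, smearing each projection over the image
    recon = np.zeros((size, size))
    coords = np.arange(size) - size / 2
    X, Y = np.meshgrid(coords, coords)
    n_det = filtered.shape[1]
    for p, theta in zip(filtered, thetas):
        s = X * np.cos(theta) + Y * np.sin(theta)   # s = x cos(theta) + y sin(theta)
        idx = np.clip(np.round(s + n_det / 2).astype(int), 0, n_det - 1)
        recon += p[idx]
    return recon * np.pi / len(thetas)

def fbp(sinogram, thetas, size):
    # step 3: suppress negative values
    return np.maximum(backproject(ramp_filter(sinogram), thetas, size), 0.0)
```

For a centered disk of radius r, every parallel projection is the chord length 2·sqrt(r² − s²), which gives a quick consistency check: the reconstruction is bright inside the disk and near zero outside.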

Reconstruction Networks

• All three steps can be modeled as a neural network:

Figure: Sinogram → convolution layer C → fully connected back-projection layer A_pb^T → rectified linear unit (non-negativity constraint) → reconstruction → loss function.

• All weights are known from FBP

Reconstruction Networks

• Reconstruction networks can be expanded:

Figure: Sinogram → weighting layers W_cos2D and W_red → convolution layer C → fully connected back-projection layer A_cb^T → rectified linear unit (non-negativity constraint) → reconstruction → loss function.

• Embedding of "heuristics" for artifact reduction possible

Application to Incomplete Scans [18]

Figure: Reconstruction with 360°

Application to Incomplete Scans [18]

Figure: Reconstruction with 180° (FBP)

Application to Incomplete Scans [18]

Figure: Reconstruction with 180° (NN)

Application to Incomplete Scans [18]

Figure: Location of the line plot in the image (left) and the line plot itself (right): normalized intensity over position [px] for the full-scan reference f_r, the limited-angle result f_l, and our model f_m.

Parker Weights

Figure: Parker weights before learning, plotted over gantry rotation (0 … π) and detector shift [px].

Parker Weights

Figure: Parker weights after learning, plotted over gantry rotation (0 … π) and detector shift [px].

Further Extensions

• Add non-linear de-streaking and de-noising step:

Figure: Step 1: neural network CT reconstruction (weighting layers W_cos and W_comp, convolution C, back-projection B, non-linearity Ψ(·)), followed by Step 2: variational network non-linear filtering over gradient descent steps GD_1 … GD_T with learned filters k_i,t and potential functions ρ'_i,t.

Further Extensions

Figure: Full-scan reference, neural network input, BM3D result, and variational network result (k = 13).

Introduction - Part 4

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Machine Learning and Pattern Recognition
Terminology and Notation

Throughout these slides, we will use the following notation:


• Matrices: bold, uppercase, e.g., M, A
• Vectors: bold, lowercase, e.g., v, x
• Scalars: italic, lowercase, e.g., y, w, α
• Gradient of a function: ∇; partial derivative: ∂

Notation regarding deep learning:
• Trainable parameters ("weights"): w
• Features/input: x
• Ground truth label/target: y
• Estimated output: ŷ
• Index denoting iteration will be in superscript, e.g., x^(i)

The notation and the terminology will be further developed throughout the lecture.
A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction - Part 4 April 15, 2024 66
“Classical” Image Processing Pipeline

Lecture Introduction to Pattern Recognition covers the classification phase:
recording → preprocessing → feature extraction (f → c) → classification (→ Ω_κ)

Lecture Pattern Recognition covers the learning phase:
training on a sample set ω

“Classical” Image Processing Pipeline: Apple vs. Pears

Source: https://commons.wikimedia.org

Pipeline in Deep Learning

Source: https://xkcd.com/1838/

Pipeline in Deep Learning

Reminder:
measurement → preprocessing → feature extraction → classification (with training)
Pipeline in Deep Learning

Now:
measurement → representation learning engine (with training)

Postulates for Pattern Recognition

6 Postulates:

1. Availability of a representative sample ω of patterns ^i f(x) for the given field of problems Ω:

   ω = {^1 f(x), . . . , ^N f(x)} ⊆ Ω.

2. A (simple) pattern has features which characterize its membership in a certain class Ω_κ.

Postulates for Pattern Recognition (cont.)

3. Compact domain of features of the same class; domains of different classes are (reasonably) separable.
   • small intra-class distance
   • high inter-class distance

Example of an increasingly less compact domain in the feature space:
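The compactness postulate can be checked numerically on synthetic data: two well-separated Gaussian clusters should show a small mean intra-class distance and a large inter-class distance (cluster positions and spread are arbitrary choices for illustration):

```python
import numpy as np

def mean_pairwise_dist(p, q):
    # average Euclidean distance between all pairs of points from p and q
    return np.mean(np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1))

rng = np.random.default_rng(1)
class_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(100, 2))
class_b = rng.normal(loc=(5.0, 5.0), scale=0.5, size=(100, 2))

intra = 0.5 * (mean_pairwise_dist(class_a, class_a) +
               mean_pairwise_dist(class_b, class_b))
inter = mean_pairwise_dist(class_a, class_b)
# compact, separable domains: intra-class distance << inter-class distance
```

As the clusters spread out or drift towards each other, `intra` grows and `inter` shrinks, which is exactly the "increasingly less compact" situation in the example.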

Postulates for Pattern Recognition (cont.)

4. A (complex) pattern consists of simpler constituents, which have certain relations to each other. A pattern may be decomposed into these constituents.

5. A (complex) pattern f(x) ∈ Ω has a certain structure. Not every arrangement of simple constituents is a valid pattern. Many patterns may be represented with relatively few constituents.

6. Two patterns are similar if their features or simpler constituents differ only slightly.

Perceptron
Perceptron Biology - Neural Excitation (simplified)

• Neurons are connected by synapses / dendrites
• If the sum of incoming (excitatory and
inhibitory) activations is large enough, an
action potential is created
• The action potential activates synapses to
other neurons, “transmitting” information
• All-or-none response: a higher stimulus does not cause a higher response; the neuron acts as a “binary classifier”

Source: https://commons.wikimedia.org

Rosenblatt’s Perceptron

• In 1957, Frank Rosenblatt [14] invented the Perceptron
• Binary classification y ∈ {−1, 1}
• It computes the function

  ŷ = sign(wᵀx),

  where
  w = (w0 , . . . , wn ): set of weights (w0 = bias)
  x = (1, x1 , . . . , xn ): input feature vector

[Diagram: the inputs 1, x1 , . . . , xn are multiplied by the weights w0 , w1 , . . . , wn , summed (Σ), and passed through the activation function.]

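The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not lab code; the helper name and the convention that wᵀx = 0 maps to class +1 are assumptions:

```python
import numpy as np

def perceptron_predict(w, x):
    """Perceptron output ŷ = sign(wᵀx) for a single sample.

    w: weight vector (w[0] is the bias w0), shape (n + 1,)
    x: raw feature vector, shape (n,); a leading 1 is prepended
       so the bias is absorbed into the dot product.
    """
    x_aug = np.concatenate(([1.0], x))   # x = (1, x1, ..., xn)
    # Convention assumed here: wᵀx = 0 is mapped to class +1
    return 1 if w @ x_aug >= 0 else -1
```

For example, with w = (0, 1, 1) the decision boundary is the line x1 + x2 = 0, so (1, 1) is classified as +1 and (−1, −1) as −1.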
Perceptron Objective Function

Task: Find weights that minimize the distance of misclassified samples to the
decision boundary

Assumptions

• Let S = {(x1 , y1 ), (x2 , y2 ), . . . , (xm , ym )} be a training data set


• Let M be the set of misclassified feature vectors, i.e., yi ≠ ŷi = sign(wᵀxi )
  according to a given set of weights w
• Optimization problem:

  argmin_w D(w) = − Σ_{xi ∈ M} yi · (wᵀxi )

Perceptron Objective Function – Observations

• Objective function depends on the misclassified feature vectors M : iterative optimization
• In each iteration, the cardinality and composition of M may change
• The gradient of the objective function is:

  ∇D(w) = − Σ_{xi ∈ M} yi · xi

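The objective and its gradient can be evaluated in one vectorized pass over the misclassified set M. A minimal NumPy sketch; the function name and the convention that wᵀxi = 0 counts as a misclassification are assumptions:

```python
import numpy as np

def perceptron_loss_and_grad(w, X, y):
    """D(w) = -Σ_{xi ∈ M} yi·(wᵀxi) and its gradient -Σ_{xi ∈ M} yi·xi.

    X: (m, n+1) matrix of augmented feature vectors (leading 1 per row)
    y: (m,) labels in {-1, +1}
    """
    scores = X @ w                     # wᵀxi for every sample
    misclassified = y * scores <= 0    # the set M (score 0 counted as a mistake)
    loss = -np.sum(y[misclassified] * scores[misclassified])
    grad = -X[misclassified].T @ y[misclassified]
    return loss, grad
```

Note that both loss and gradient are zero when M is empty, i.e., when all training samples are classified correctly.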
Perceptron Training

• Strategy 1: Process all samples, then perform weight update


• Strategy 2: Take an update step right after each misclassified sample
• Update rule in iteration (k + 1) for the misclassified sample xi simplifies to:

w^(k+1) = w^(k) + yi · xi

• Optimization until convergence or for a predefined number of iterations

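Strategy 2 (an update immediately after each misclassified sample) can be sketched as follows on a toy linearly separable dataset; the function name and the data are illustrative:

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """Sample-wise perceptron training with w <- w + yi·xi on each mistake.

    X: (m, n) raw feature vectors; y: (m,) labels in {-1, +1}.
    Stops when a full pass produces no misclassification (convergence)
    or after `epochs` passes over the data.
    """
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias input 1
    w = np.zeros(X_aug.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X_aug, y):
            if y_i * (w @ x_i) <= 0:      # misclassified (score 0 counts too)
                w = w + y_i * x_i         # update rule w^(k+1) = w^(k) + yi·xi
                mistakes += 1
        if mistakes == 0:                 # converged: all samples correct
            break
    return w

# Toy data: class +1 in the upper right, class -1 in the lower left
X = np.array([[2.0, 2.0], [1.5, 1.0], [-1.0, -1.0], [0.0, -2.0]])
y = np.array([1, 1, -1, -1])
w = train_perceptron(X, y)
```

On linearly separable data the perceptron convergence theorem guarantees that this loop terminates with a separating weight vector after finitely many updates.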
Introduction - Part 5

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Organizational Matters
Grading

• Module consists of lecture and exercises (together 5 ECTS)


• A 90 min. written exam in the semester break determines the grade
• Exercises are optional. Completing 100% of the exercises earns a 10% bonus on
the grade, provided you pass the exam

Exercise Content

• Python introduction
• Developing a neural network framework from scratch
• Feed Forward Neural Networks
• Convolutional Neural Networks
• Regularization
• Recurrent Networks
• Using the PyTorch framework
• Large scale classification

Exercise Requirements

• Basic knowledge of Python and Numpy


• Linear algebra
• Image processing
• Pattern recognition fundamentals
• Passion for coding
• Attention to detail
• Time

How it works

• Five exercises throughout the semester


• Unit tests for all but the last exercise
• Last exercise: PyTorch + Challenge
• Assistance during exercise sessions
• Personal demonstration of every exercise to get bonus points
• Exercise deadlines are announced in the respective exercise sessions

Summary

• Deep learning is more and more present in day-to-day life


• Huge support and interest from industry
• Very active area of research!
• Perceptron as binary classifier motivated by biological neurons

Next Lecture Block

• Extending the Perceptron to obtain a universal function approximator


• Gradient based training algorithm for these models
• Efficient automatic computation of gradients

Comprehensive Questions

• What are the six postulates of pattern recognition?


• What is the Perceptron objective function?
• Can you name three applications successfully tackled by deep learning?

Further Reading

• Link - Deep learning book


• Link - Research and publications at the Pattern Recognition Lab
• Link - Google Research Blog with posts on e.g. Deep dream or Alpha Go

Questions?
References
References I

[1] David Silver, Julian Schrittwieser, Karen Simonyan, et al. “Mastering the
game of go without human knowledge”. In: Nature 550.7676 (2017), p. 354.
[2] David Silver, Thomas Hubert, Julian Schrittwieser, et al. “Mastering Chess
and Shogi by Self-Play with a General Reinforcement Learning Algorithm”. In:
arXiv preprint arXiv:1712.01815 (2017).
[3] M. Aubreville, M. Krappmann, C. Bertram, et al. “A Guided Spatial
Transformer Network for Histology Cell Differentiation”. In: ArXiv e-prints (July
2017). arXiv: 1707.08525 [cs.CV].
[4] David Bernecker, Christian Riess, Elli Angelopoulou, et al. “Continuous
short-term irradiance forecasts using sky images”. In: Solar Energy 110
(2014), pp. 303–315.
References II

[5] Patrick Ferdinand Christ, Mohamed Ezzeldin A Elshaer, Florian Ettlinger, et al.
“Automatic liver and lesion segmentation in CT using cascaded fully convolutional
neural networks and 3D conditional random fields”. In: International Conference on
Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer,
2016, pp. 415–423.
[6] Vincent Christlein, David Bernecker, Florian Hönig, et al. “Writer Identification
Using GMM Supervectors and Exemplar-SVMs”. In: Pattern Recognition 63
(2017), pp. 258–267.
[7] Florin Cristian Ghesu, Bogdan Georgescu, Tommaso Mansi, et al. “An
Artificial Agent for Anatomical Landmark Detection in Medical Images”. In:
Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016.
Athens, 2016, pp. 229–237.
References III

[8] Jia Deng, Wei Dong, Richard Socher, et al. “ImageNet: A large-scale
hierarchical image database”. In: IEEE Conference on Computer Vision and
Pattern Recognition (CVPR 2009). IEEE, 2009, pp. 248–255.
[9] A. Karpathy and L. Fei-Fei. “Deep Visual-Semantic Alignments for Generating
Image Descriptions”. In: ArXiv e-prints (Dec. 2014). arXiv: 1412.2306
[cs.CV].
[10] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. “ImageNet
Classification with Deep Convolutional Neural Networks”. In:
Advances in Neural Information Processing Systems 25. Curran Associates,
Inc., 2012, pp. 1097–1105.
References IV

[11] Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, et al. “You Only
Look Once: Unified, Real-Time Object Detection”. In: CoRR abs/1506.02640
(2015).
[12] J. Redmon and A. Farhadi. “YOLO9000: Better, Faster, Stronger”. In:
ArXiv e-prints (Dec. 2016). arXiv: 1612.08242 [cs.CV].
[13] Joseph Redmon and Ali Farhadi. “YOLOv3: An Incremental Improvement”. In:
arXiv (2018).
[14] Frank Rosenblatt. The Perceptron–a perceiving and recognizing automaton.
85-460-1. Cornell Aeronautical Laboratory, 1957.
[15] Olga Russakovsky, Jia Deng, Hao Su, et al. “ImageNet Large Scale Visual
Recognition Challenge”. In: International Journal of Computer Vision 115.3
(2015), pp. 211–252.
References V

[16] David Silver, Aja Huang, Chris J. Maddison, et al. “Mastering the game of Go
with deep neural networks and tree search”. In: Nature 529.7587 (Jan. 2016),
pp. 484–489.
[17] S. E. Wei, V. Ramakrishna, T. Kanade, et al. “Convolutional Pose Machines”.
In: CVPR. 2016, pp. 4724–4732.
[18] Tobias Würfl, Florin C Ghesu, Vincent Christlein, et al. “Deep learning
computed tomography”. In: International Conference on Medical Image Computing
and Computer-Assisted Intervention (MICCAI). Springer International Publishing,
2016, pp. 432–440.
