Introduction

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Who are we? - Lab Members

Andreas Maier, Zijin Yang, Alexander Barnhill, Chang Liu, Leonhard Rist,
Merlin Nau, Noah Maul, Mathias Zinnen, Srikrishna Jaganathan, Kai Packhäuser

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction April 15, 2024 1
Who are we? - Student Members

Lisa Schmidt, Majid Sharghi, Chengze Ye, Teena Tom Dieck, Leyi Tang,
Supraja Ramesh, Karlo Gabriel Fonseca Yakovenko, Jingyi Yao, Anna-Sophie Stephan, Philip Wagner
Deep Learning – Buzzwords

Outline

Motivation

Machine Learning and Pattern Recognition

Perceptron

Organizational Matters
Motivation
NVIDIA Stock Market

Source: https://www.google.com/finance/quote/

The Big Bang of Deep Learning

ImageNet [8] Dataset


• ≈ 14 million images, labeled into ≈ 20,000 synonym sets (synsets)
• ImageNet Large Scale Visual Recognition Challenge using ≈ 1000 classes
• Images downloaded from the Internet, single label per image
• 2012: Breakthrough by Krizhevsky et al. [10]

Source: Krizhevsky et al. 2012

ImageNet Large Scale Visual Recognition Challenge

Figure: ILSVRC top-5 error [%] by year:
2011: 25.8 — 2012: 16.4 (first CNN) — 2013: 11.7 — 2014: 6.7 — Human: 5.1 — 2015: 3.6 (Residual Network) — 2016: 3.0 — 2017: 2.4

• First CNN approach now famous as AlexNet [10]


• “Superhuman” should be Super-Karpathy-an performance
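The top-5 error on the chart counts a prediction as wrong only if the true label is not among the five highest-scoring classes. A minimal sketch of the metric (the function name and array layout are our own, not from the ILSVRC tooling):

```python
import numpy as np

def top5_error(scores, labels):
    """scores: (n_samples, n_classes) array of class scores,
    labels: (n_samples,) array of true class indices."""
    # indices of the five highest-scoring classes per sample
    top5 = np.argsort(scores, axis=1)[:, -5:]
    # a sample counts as correct if its true label appears among them
    hits = np.any(top5 == labels[:, None], axis=1)
    return 1.0 - hits.mean()
```

Under this protocol a classifier may hedge over five guesses per image, which is also how the human reference error of about 5.1% was measured.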

Source: image-net.org, Russakovsky et al. 2015

ImageNet Large Scale Visual Recognition Challenge

Source: Krizhevsky et al. 2012

Deep Learning Users

Playing Go

• 1997: Deep Blue beats Garry Kasparov
• Go as the next challenge: large branching factor
• 2016: AlphaGo [16] beats a professional
• 2017: AlphaGo Zero [1] surpasses every human in Go through self-play
• 2017: AlphaZero [2] generalizes to a number of other board games
• 2019: AlphaStar beats professional StarCraft players

Source: https://commons.wikimedia.org/wiki/File:FloorGoban.jpg

Google DeepDream

Attempt to understand the inner workings of the network: What it "dreams" about
when presented with images

Idea:
• Arbitrary image or noise as input
• Instead of adjusting network
parameters, tweak image towards
high activations
• Different layers enhance different
features (low or high level)
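The idea can be sketched numerically: freeze a "layer" (here a single hand-set 3×3 filter, a stand-in for a trained network) and perform gradient ascent on the input image so that the layer's activation grows. The filter, step size, and function names are illustrative assumptions, not Google's implementation:

```python
import numpy as np

# stand-in "layer": one fixed 3x3 filter (a trained CNN would have many)
KERNEL = np.array([[-1., 0., 1.],
                   [-2., 0., 2.],
                   [-1., 0., 1.]])

def activation(img):
    # mean squared filter response over all valid 3x3 windows
    h, w = img.shape
    resp = np.array([[np.sum(img[i:i+3, j:j+3] * KERNEL)
                      for j in range(w - 2)] for i in range(h - 2)])
    return np.mean(resp ** 2)

def dream_step(img, lr=0.1, eps=1e-4):
    # tweak the image (not the network weights) towards higher activation,
    # using a finite-difference gradient for simplicity
    grad = np.zeros_like(img)
    for idx in np.ndindex(img.shape):
        d = np.zeros_like(img); d[idx] = eps
        grad[idx] = (activation(img + d) - activation(img - d)) / (2 * eps)
    return img + lr * grad

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 8))   # arbitrary image or noise as input
before = activation(img)
for _ in range(5):
    img = dream_step(img)
after = activation(img)         # activation has increased
```

Running the same ascent against a deeper layer of a real network enhances high-level features instead of edges, which is what produces the characteristic DeepDream imagery.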

Source: https://research.googleblog.com

Google DeepDream

Looking for new animals in the clouds

Source: https://research.googleblog.com

Real-Time Object Detection: YOLO, YOLO9000, YOLOv3 [11]–[13]



• YOLO: You Only Look Once
• Prior systems: use classifiers at multiple locations and scales
• YOLO: simultaneous regression of bounding boxes and labels
• Fast: 40–90 frames/second on an NVIDIA Titan X

Source: www.youtube.com, Redmon and Farhadi 2016

Everyday Use
Siri

Siri: Speech Interpretation and Recognition Interface

Source: www.apple.com/ios/siri/
Amazon Echo & Alexa Voice Service

Source: www.amazon.com

Google Translate

Source: translate.google.de

Introduction - Part 2

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Research at the Pattern Recognition Lab
Assisted and Automated Driving

Goal
Find new ways to train and update deep learning mechanisms in environments with
high safety requirements

• Assisted and automated driving relies on sensor data
• Cameras detect dynamic objects, driving lanes, and free space
• Detection and segmentation tasks ⇒ deep learning

Source: Audi AG

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction - Part 2 April 15, 2024 18
Assisted and Automated Driving

• Currently: neural networks are trained and thoroughly tested before deployment
⇒ Requires huge amounts of manually labeled data
• Regular test drives cannot verify system reliability in all traffic scenarios
• Challenge: new ways to test algorithms in simulated environments and to utilize data collected in production cars equipped with appropriate hardware

Source: Mobileye N.V.

Smart Devices

Problem statement
Renewable energy production ≠ energy demand

• Underproduction ⇒ backup power plants
• Overproduction ⇒ energy lost
⇒ Real-time pricing to match energy demand and supply
• Needs smart devices to shift workload automatically

Smart Devices

Goal
Establish energy equilibrium by predicting energy consumption

• Example: Interrupt fridge cooling cycle when price is high, start washing
machine when price is low
• Dependencies between tasks, user information and action necessary (e.g.,
washer/dryer)
• Task: Identify time-shiftable loads and assess appropriate time frame
• Approach: Train recurrent neural networks to identify usage patterns and
dependencies between devices

Cloud Detection for Power Forecast [4]

Goal
Power forecast for solar power plants with a high temporal and spatial resolution

Approach

1. Monitor the sky
2. Detect clouds
3. Estimate the cloud motion
4. Establish power forecasts

Cloud Detection for Power Forecast [4]

Figure: Network with r, g, b input channels processing sky images.

Input: sky moving towards the sun
Output: Clear Sky Index, with values between 0 (overcast sky) and 1 (clear sky)

Writer Recognition

Goal
Writer identification with limited training data (few pages per writer)


Source: ICDAR’13 dataset, QUWI’15 dataset, freepik.com

Writer Recognition using CNN Activation Features [6]

Use a neural network for feature extraction.

Figure: Network with input and layers 1 … K; the activation features are taken from a hidden layer instead of the classification layer.

Medical Applications
Cell Classification for Tumor Diagnostics [3]

Goal
Identify cells undergoing mitosis to assess tumor proliferation and aggressiveness in histological images

Challenge
• Histological images contain a large number of cells
• Full annotations are not feasible ⇒ sparse annotations
• Cells vary significantly in size, shape, etc.

Source: Aubreville et al. 2017

Cell Classification for Tumor Diagnostics [3]

Approach
Use spatial transformer networks (STNs) to learn affine transformation and
classification

Source: Aubreville et al. 2017

Defect Pixel Interpolation

Goal
• Reconstruction of coronaries based on truncated X-ray images
• Create “virtual” digital subtraction angiography

Approach

1. Segment coronary vessels
2. Mask fluoroscopic image
3. Inpaint using U-net
4. Subtract inpainted image to get untruncated data
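These four steps can be sketched end to end on a toy image. The U-net is replaced here by simple iterative neighbor averaging inside the mask (an assumption made only to keep the sketch self-contained); the pipeline structure is the same:

```python
import numpy as np

def inpaint_mean(img, mask, iters=200):
    # stand-in for the U-net: iteratively replace masked pixels by the
    # average of their four neighbors until the hole is filled smoothly
    # (masking happens here: masked pixels are treated as unknown)
    out = img.copy()
    for _ in range(iters):
        avg = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
               np.roll(out, 1, 1) + np.roll(out, -1, 1)) / 4.0
        out[mask] = avg[mask]
    return out

# toy fluoroscopic image: smooth background plus an attenuating "vessel"
x = np.linspace(0.0, 1.0, 64)
background = np.outer(x, x)
img = background.copy()
mask = np.zeros_like(img, dtype=bool)
mask[30:34, 1:63] = True         # step 1: segmented vessel pixels
img[mask] -= 0.5                 # vessel darkens the projection

virtual = inpaint_mean(img, mask)   # steps 2+3: mask and inpaint the hole
dsa = virtual - img                 # step 4: difference shows the vessel only
```

The difference image is zero outside the vessel and recovers the vessel contrast inside it, which is exactly the "virtual" digital subtraction angiography effect.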

Defect Pixel Interpolation

Processing pipeline

Figure: The X-ray projection is fed to a segmentation algorithm that yields a binary mask; masking and an inpainting algorithm produce a virtual mask image, which is subtracted from the X-ray projection to obtain the digital subtraction angiogram.

Defect Pixel Interpolation

Deep learning for inpainting

Organ Search [7]

Goal

Locate anatomic structures automatically

Approach

• Deep reinforcement learning
• Learn strategies for how to search for objects
⇒ Learn the optimal, shortest search path through the image volume to different landmarks
• Hierarchical approach to improve speed and robustness
Source: Ghesu et al. 2016, Ghesu et al. 2017

Organ Search [7]
Figure: 3D search trajectories starting from the image center (left) and the estimated body range (right). Detected landmarks (aortic arch, bifurcations, sternum tip, kidneys, hip bones, knees) span the 0–100% body range; missing landmarks such as the brainstem (109.7%) and the right knee (−31.4%) are obtained by interpolation and extrapolation.

X-ray-transform Invariant Anatomical Landmark Detection

Goal
• Detect landmarks in X-ray images
• Knowing correspondences enables symbolic reconstruction
⇒ Classic computer vision reconstruction

Challenge
• Transmission imaging
⇒ Overlap/superposition of structures
⇒ High variance due to projection
⇒ Artifacts, e.g., interventional devices

Source: Bier et al. 2018

X-ray-transform Invariant Anatomical Landmark Detection

Approach: Convolutional Pose Machine (CPM) [17]

• Sequential prediction framework to detect landmarks
⇒ Yields 2D belief maps

Properties
• Large receptive fields enable learning of configurations
• Estimation is refined over stages

Source: Wei et al. 2016

X-ray-transform Invariant Anatomical Landmark Detection

Source: Bier et al. 2018

Organ Prediction

Goal
Estimation of body and organ shapes based on patient’s height and weight for X-ray
exposure estimation.

Could we achieve more if we had old CT data of a patient?


Action Learning for 3D Point Cloud Based Organ Segmentation

Goal: Versatile organ segmentation for:
• Computer-aided diagnosis
• Treatment planning
• Dose management

Dose estimation in interventions with overlays

Action Learning for 3D Point Cloud Based Organ Segmentation

Challenges for clinical applications

• Robustness w.r.t.
1. Individual anatomy
2. Scan protocols
• Time constraints

Pre-operative CT (left) and contrast enhanced CT (right)

Action Learning for 3D Point Cloud Based Organ Segmentation

• Reinforcement learning
• Predict the transformation at a given state

Figure: Action prediction pipeline for 3D point cloud based organ segmentation

Source: Zhong et al. 2018

Action Learning for 3D Point Cloud Based Organ Segmentation

• Runtime:
  1. 0.3–2.6 s per volume
  2. 50–100× speedup compared to U-net [5]
• Very accurate
• Robust to:
  1. scan protocol
  2. contrast agent
  3. organ initialization

Source: Zhong et al. 2018

Introduction - Part 3

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Limitations
Image Captioning

Figure: "baseball player is throwing ball in game."
Figure: "girl in pink dress is jumping in air."
Figure: "man in black shirt is playing guitar."

Source: https://cs.stanford.edu/people/karpathy/deepimagesent

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction - Part 3 April 15, 2024 44
Image Captioning & Generating

GPT4 can do even better:


GPT4: “An athletically built man with a
serious expression stands in a well-lit
room with a wooden ceiling. He is
wearing a blue t-shirt with the phrase
’SAVE THE CHUBBY UNICORNS’ and
an image of a rhinoceros. The man is
taking a selfie with a modern
smartphone, capturing his reflection in a
mirror. In the background, one can see
gym equipment including dumbbells and
a weight rack. There is also a hanging
chandelier with a unique design,
illuminating the room”

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Create an image for this prompt in different styles"

Figure: "version 1." Figure: "version 2." Figure: "version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Can you create a few pixel art versions of this?"

Figure: "pixel version 1." Figure: "pixel version 2." Figure: "pixel version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Show the scene in the style of a massive online roleplaying game"

Figure: "warcraft version 1." Figure: "warcraft version 2." Figure: "warcraft version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Image Captioning & Generating

To DALL-E3, "Make a few versions that show the scene in Lego Style"

Figure: "lego version 1." Figure: "lego version 2." Figure: "lego version 3."

Source: https://twitter.com/maier_ak/status/1713563211574751494

Challenges with Training Data

• Deep learning applications often rely on huge, manually annotated data sets
• Hard to obtain: time-consuming, expensive, ambiguous
• To err is human: mislabeled ground-truth annotations
⇒ May cause a significant drop in performance

• Question: How far can we get with simulations?

Generating Synthetic Data

• Sample from trained latent diffusion model

Figure: four chest X-ray images sampled from a trained latent diffusion model. Image generation was done in a
conditional way to produce images of specific abnormality classes. The induced abnormality patterns in the
synthetic images are highlighted with red arrows and circles.

Source: Packhäuser et al. IEEE ISBI 2023

Memorization problem for diffusion models

• Problem: models in practice tend to reproduce patterns or complete images from the training set
• Proposed method: use a pre-trained patient retrieval model to compare a generated synthetic scan to all training images (see pipeline below)

Privacy-enhancing Image Sampling Strategy

Challenges with Trust and Reliability

• Verification is mandatory for high-risk applications
⇒ End-to-end learning prohibits verification of parts
⇒ Largely unsolved
• Possible solution: Reformulate classical algorithms

Large Language Models for MRI Scanners

To GPT: Can you implement the sequence shown in the given sequence diagram in pypulseq?

GPT: Yes. (On the right)

• Still buggy, works only 50% of the time

Future Directions
Learning of Algorithms

• Computed Tomography
• Efficient solution via filtered back-projection:

  f(x, y) = \int_0^{\pi} p(s, \theta) * h(s) \,\big|_{s = x \cos\theta + y \sin\theta} \, \mathrm{d}\theta

• Three steps:
  • Convolution along s
  • Back-projection along θ
  • Suppress negative values
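These three steps can be sketched compactly with NumPy for a parallel-beam geometry (function names, the nearest-neighbor detector lookup, and the discretization are illustrative choices, not a production implementation):

```python
import numpy as np

def ramp_filter(sinogram):
    # step 1: convolution along s, done as multiplication with |w| in Fourier domain
    n = sinogram.shape[1]
    ramp = np.abs(np.fft.fftfreq(n))
    return np.real(np.fft.ifft(np.fft.fft(sinogram, axis=1) * ramp, axis=1))

def backproject(filtered, thetas, size):
    # step 2: back-projection along theta, smearing each projection over the image
    recon = np.zeros((size, size))
    coords = np.arange(size) - size / 2
    X, Y = np.meshgrid(coords, coords)
    n_det = filtered.shape[1]
    for p, theta in zip(filtered, thetas):
        s = X * np.cos(theta) + Y * np.sin(theta)   # s = x cos(theta) + y sin(theta)
        idx = np.clip(np.round(s + n_det / 2).astype(int), 0, n_det - 1)
        recon += p[idx]
    return recon * np.pi / len(thetas)

def fbp(sinogram, thetas, size):
    # step 3: suppress negative values
    return np.maximum(backproject(ramp_filter(sinogram), thetas, size), 0.0)
```

For a centered disk of radius r, every parallel projection is the chord length 2·sqrt(r² − s²), which gives a quick consistency check: the reconstruction is bright inside the disk and near zero outside.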

Reconstruction Networks

• All three steps can be modeled as a neural network:

Figure: Sinogram → convolution layer C → fully connected back-projection layer A_pb^T → rectified linear unit (non-negativity constraint) → reconstruction → loss function.

• All weights are known from FBP

Reconstruction Networks

• Reconstruction networks can be expanded:

Figure: Sinogram → weighting layers W_cos2D and W_red → convolution layer C → fully connected back-projection layer A_cb^T → rectified linear unit (non-negativity constraint) → reconstruction → loss function.

• Embedding of "heuristics" for artifact reduction possible

Application to Incomplete Scans [18]

Figure: Reconstruction with 360°

Application to Incomplete Scans [18]

Figure: Reconstruction with 180° (FBP)

Application to Incomplete Scans [18]

Figure: Reconstruction with 180° (NN)

Application to Incomplete Scans [18]

Figure: Location of the line plot in the image (left) and the line plot itself (right): normalized intensity over position [px] for the full-scan reference f_r, the limited-angle result f_l, and our model f_m.

Parker Weights

Figure: Parker weights before learning, plotted over gantry rotation (0 … π) and detector shift [px].

Parker Weights

Figure: Parker weights after learning, plotted over gantry rotation (0 … π) and detector shift [px].

Further Extensions

• Add non-linear de-streaking and de-noising step:

Figure: Step 1: neural network CT reconstruction (weighting layers W_cos and W_comp, convolution C, back-projection B, non-linearity Ψ(·)), followed by Step 2: variational network non-linear filtering over gradient descent steps GD_1 … GD_T with learned filters k_i,t and potential functions ρ'_i,t.

Further Extensions

Figure: Full-scan reference, neural network input, BM3D result, and variational network result (k = 13).

Introduction - Part 4

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Machine Learning and Pattern Recognition
Terminology and Notation

Throughout these slides, we will use the following notation:


• Matrices: bold, uppercase, e.g., M, A
• Vectors: bold, lowercase, e.g., v, x
• Scalars: italic, lowercase, e.g., y, w, α
• Gradient of a function: ∇; partial derivative: ∂

Notation regarding deep learning:
• Trainable parameters ("weights"): w
• Features/input: x
• Ground truth label/target: y
• Estimated output: ŷ
• Index denoting iteration will be in superscript, e.g., x^(i)

The notation and the terminology will be further developed throughout the lecture.
A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, A. Barnhill | Introduction - Part 4 April 15, 2024 66
“Classical” Image Processing Pipeline

Lecture Introduction to Pattern Recognition covers the classification phase:
recording → preprocessing → feature extraction (f → c) → classification (→ Ω_κ)

Lecture Pattern Recognition covers the learning phase:
training on a sample set ω

“Classical” Image Processing Pipeline: Apple vs. Pears

Source: https://commons.wikimedia.org

Pipeline in Deep Learning

Source: https://xkcd.com/1838/

Pipeline in Deep Learning

Reminder:
measurement → preprocessing → feature extraction → classification (with training)
Pipeline in Deep Learning

Now:
measurement → representation learning engine (with training)

Postulates for Pattern Recognition

6 Postulates:

1. Availability of a representative sample ω of patterns ^i f(x) for the given field of problems Ω:

   ω = {^1 f(x), . . . , ^N f(x)} ⊆ Ω.

2. A (simple) pattern has features which characterize its membership in a certain class Ω_κ.

Postulates for Pattern Recognition (cont.)

3. Compact domain of features of the same class; domains of different classes are (reasonably) separable.
   • small intra-class distance
   • high inter-class distance

Example of an increasingly less compact domain in the feature space:
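The compactness postulate can be checked numerically on synthetic data: two well-separated Gaussian clusters should show a small mean intra-class distance and a large inter-class distance (cluster positions and spread are arbitrary choices for illustration):

```python
import numpy as np

def mean_pairwise_dist(p, q):
    # average Euclidean distance between all pairs of points from p and q
    return np.mean(np.linalg.norm(p[:, None, :] - q[None, :, :], axis=-1))

rng = np.random.default_rng(1)
class_a = rng.normal(loc=(0.0, 0.0), scale=0.5, size=(100, 2))
class_b = rng.normal(loc=(5.0, 5.0), scale=0.5, size=(100, 2))

intra = 0.5 * (mean_pairwise_dist(class_a, class_a) +
               mean_pairwise_dist(class_b, class_b))
inter = mean_pairwise_dist(class_a, class_b)
# compact, separable domains: intra-class distance << inter-class distance
```

As the clusters spread out or drift towards each other, `intra` grows and `inter` shrinks, which is exactly the "increasingly less compact" situation in the example.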

Postulates for Pattern Recognition (cont.)

4. A (complex) pattern consists of simpler constituents, which have certain relations to each other. A pattern may be decomposed into these constituents.

5. A (complex) pattern f(x) ∈ Ω has a certain structure. Not every arrangement of simple constituents is a valid pattern. Many patterns may be represented with relatively few constituents.

6. Two patterns are similar if their features or simpler constituents differ only slightly.

Perceptron
Perceptron Biology - Neural Excitation (simplified)

• Neurons are connected by synapses / dendrites
• If the sum of incoming (excitatory and
inhibitory) activations is large enough, an
action potential is created
• The action potential activates synapses to
other neurons, “transmitting” information
• All-or-none response: a higher stimulus does not cause a higher response; the neuron acts as a “binary classifier”

Source: https://commons.wikimedia.org

Rosenblatt’s Perceptron

• In 1957, Frank Rosenblatt [14] invented the Perceptron
• Binary classification y ∈ {−1, 1}
• It computes the function

  ŷ = sign(wᵀx),

  where
  w = (w0 , . . . , wn ): set of weights (w0 = bias)
  x = (1, x1 , . . . , xn ): input feature vector

[Diagram: the inputs 1, x1 , . . . , xn are multiplied by the weights w0 , w1 , . . . , wn , summed (Σ), and passed through the activation function.]

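The formula above can be sketched in a few lines of NumPy. This is a minimal illustration, not lab code; the helper name and the convention that wᵀx = 0 maps to class +1 are assumptions:

```python
import numpy as np

def perceptron_predict(w, x):
    """Perceptron output ŷ = sign(wᵀx) for a single sample.

    w: weight vector (w[0] is the bias w0), shape (n + 1,)
    x: raw feature vector, shape (n,); a leading 1 is prepended
       so the bias is absorbed into the dot product.
    """
    x_aug = np.concatenate(([1.0], x))   # x = (1, x1, ..., xn)
    # Convention assumed here: wᵀx = 0 is mapped to class +1
    return 1 if w @ x_aug >= 0 else -1
```

For example, with w = (0, 1, 1) the decision boundary is the line x1 + x2 = 0, so (1, 1) is classified as +1 and (−1, −1) as −1.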
Perceptron Objective Function

Task: Find weights that minimize the distance of misclassified samples to the
decision boundary

Assumptions

• Let S = {(x1 , y1 ), (x2 , y2 ), . . . , (xm , ym )} be a training data set


• Let M be the set of misclassified feature vectors, i.e., yi ≠ ŷi = sign(wᵀxi )
  according to a given set of weights w
• Optimization problem:

  argmin_w D(w) = − Σ_{xi ∈ M} yi · (wᵀxi )

Perceptron Objective Function – Observations

• Objective function depends on the misclassified feature vectors M : iterative optimization
• In each iteration, the cardinality and composition of M may change
• The gradient of the objective function is:

  ∇D(w) = − Σ_{xi ∈ M} yi · xi

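The objective and its gradient can be evaluated in one vectorized pass over the misclassified set M. A minimal NumPy sketch; the function name and the convention that wᵀxi = 0 counts as a misclassification are assumptions:

```python
import numpy as np

def perceptron_loss_and_grad(w, X, y):
    """D(w) = -Σ_{xi ∈ M} yi·(wᵀxi) and its gradient -Σ_{xi ∈ M} yi·xi.

    X: (m, n+1) matrix of augmented feature vectors (leading 1 per row)
    y: (m,) labels in {-1, +1}
    """
    scores = X @ w                     # wᵀxi for every sample
    misclassified = y * scores <= 0    # the set M (score 0 counted as a mistake)
    loss = -np.sum(y[misclassified] * scores[misclassified])
    grad = -X[misclassified].T @ y[misclassified]
    return loss, grad
```

Note that both loss and gradient are zero when M is empty, i.e., when all training samples are classified correctly.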
Perceptron Training

• Strategy 1: Process all samples, then perform weight update


• Strategy 2: Take an update step right after each misclassified sample
• Update rule in iteration (k + 1) for the misclassified sample xi simplifies to:

w^(k+1) = w^(k) + yi · xi

• Optimization until convergence or for a predefined number of iterations

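Strategy 2 (an update immediately after each misclassified sample) can be sketched as follows on a toy linearly separable dataset; the function name and the data are illustrative:

```python
import numpy as np

def train_perceptron(X, y, epochs=100):
    """Sample-wise perceptron training with w <- w + yi·xi on each mistake.

    X: (m, n) raw feature vectors; y: (m,) labels in {-1, +1}.
    Stops when a full pass produces no misclassification (convergence)
    or after `epochs` passes over the data.
    """
    X_aug = np.hstack([np.ones((X.shape[0], 1)), X])  # prepend bias input 1
    w = np.zeros(X_aug.shape[1])
    for _ in range(epochs):
        mistakes = 0
        for x_i, y_i in zip(X_aug, y):
            if y_i * (w @ x_i) <= 0:      # misclassified (score 0 counts too)
                w = w + y_i * x_i         # update rule w^(k+1) = w^(k) + yi·xi
                mistakes += 1
        if mistakes == 0:                 # converged: all samples correct
            break
    return w

# Toy data: class +1 in the upper right, class -1 in the lower left
X = np.array([[2.0, 2.0], [1.5, 1.0], [-1.0, -1.0], [0.0, -2.0]])
y = np.array([1, 1, -1, -1])
w = train_perceptron(X, y)
```

On linearly separable data the perceptron convergence theorem guarantees that this loop terminates with a separating weight vector after finitely many updates.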
Introduction - Part 5

A. Maier, V. Christlein, K. Breininger, Z. Yang, L. Rist, M. Nau, S. Jaganathan, C. Liu, N. Maul, L. Folle,
K. Packhäuser, M. Zinnen
Pattern Recognition Lab, Friedrich-Alexander-Universität Erlangen-Nürnberg
April 15, 2024
Organizational Matters
Grading

• Module consists of lecture and exercises (together 5 ECTS)


• A 90 min. written exam in the semester break determines the grade
• Exercises are optional. Completing 100% of the exercises earns a 10% bonus on
the grade, provided you pass the exam

Exercise Content

• Python introduction
• Developing a neural network framework from scratch
• Feed Forward Neural Networks
• Convolutional Neural Networks
• Regularization
• Recurrent Networks
• Using the PyTorch framework
• Large scale classification

Exercise Requirements

• Basic knowledge of Python and Numpy


• Linear algebra
• Image processing
• Pattern recognition fundamentals
• Passion for coding
• Attention to detail
• Time

How it works

• Five exercises throughout the semester


• Unit tests for all but the last exercise
• Last exercise: PyTorch + Challenge
• Assistance during exercise sessions
• Personal demonstration of every exercise to get bonus points
• Exercise deadlines are announced in the respective exercise sessions

Summary

• Deep learning is more and more present in day-to-day life


• Huge support and interest from industry
• Very active area of research!
• Perceptron as binary classifier motivated by biological neurons

Next Lecture Block

• Extending the Perceptron to obtain a universal function approximator


• Gradient based training algorithm for these models
• Efficient automatic computation of gradients

Comprehensive Questions

• What are the six postulates of pattern recognition?


• What is the Perceptron objective function?
• Can you name three applications successfully tackled by deep learning?

Further Reading

• Link - Deep learning book


• Link - Research and publications at the Pattern Recognition Lab
• Link - Google Research Blog with posts on e.g. Deep dream or Alpha Go

Questions?
References
References I

[1] David Silver, Julian Schrittwieser, Karen Simonyan, et al. “Mastering the
game of go without human knowledge”. In: Nature 550.7676 (2017), p. 354.
[2] David Silver, Thomas Hubert, Julian Schrittwieser, et al. “Mastering Chess
and Shogi by Self-Play with a General Reinforcement Learning Algorithm”. In:
arXiv preprint arXiv:1712.01815 (2017).
[3] M. Aubreville, M. Krappmann, C. Bertram, et al. “A Guided Spatial
Transformer Network for Histology Cell Differentiation”. In: ArXiv e-prints (July
2017). arXiv: 1707.08525 [cs.CV].
[4] David Bernecker, Christian Riess, Elli Angelopoulou, et al. “Continuous
short-term irradiance forecasts using sky images”. In: Solar Energy 110
(2014), pp. 303–315.
References II

[5] Patrick Ferdinand Christ, Mohamed Ezzeldin A Elshaer, Florian Ettlinger, et al.
“Automatic liver and lesion segmentation in CT using cascaded fully convolutional
neural networks and 3D conditional random fields”. In: International Conference on
Medical Image Computing and Computer-Assisted Intervention (MICCAI). Springer,
2016, pp. 415–423.
[6] Vincent Christlein, David Bernecker, Florian Hönig, et al. “Writer Identification
Using GMM Supervectors and Exemplar-SVMs”. In: Pattern Recognition 63
(2017), pp. 258–267.
[7] Florin Cristian Ghesu, Bogdan Georgescu, Tommaso Mansi, et al. “An
Artificial Agent for Anatomical Landmark Detection in Medical Images”. In:
Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016.
Athens, 2016, pp. 229–237.
References III

[8] Jia Deng, Wei Dong, Richard Socher, et al. “ImageNet: A large-scale
hierarchical image database”. In: IEEE Conference on Computer Vision and
Pattern Recognition (CVPR 2009). IEEE, 2009, pp. 248–255.
[9] A. Karpathy and L. Fei-Fei. “Deep Visual-Semantic Alignments for Generating
Image Descriptions”. In: ArXiv e-prints (Dec. 2014). arXiv: 1412.2306
[cs.CV].
[10] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E Hinton. “ImageNet
Classification with Deep Convolutional Neural Networks”. In:
Advances in Neural Information Processing Systems 25. Curran Associates,
Inc., 2012, pp. 1097–1105.
References IV

[11] Joseph Redmon, Santosh Kumar Divvala, Ross B. Girshick, et al. “You Only
Look Once: Unified, Real-Time Object Detection”. In: CoRR abs/1506.02640
(2015).
[12] J. Redmon and A. Farhadi. “YOLO9000: Better, Faster, Stronger”. In:
ArXiv e-prints (Dec. 2016). arXiv: 1612.08242 [cs.CV].
[13] Joseph Redmon and Ali Farhadi. “YOLOv3: An Incremental Improvement”. In:
arXiv (2018).
[14] Frank Rosenblatt. The Perceptron–a perceiving and recognizing automaton.
85-460-1. Cornell Aeronautical Laboratory, 1957.
[15] Olga Russakovsky, Jia Deng, Hao Su, et al. “ImageNet Large Scale Visual
Recognition Challenge”. In: International Journal of Computer Vision 115.3
(2015), pp. 211–252.
References V

[16] David Silver, Aja Huang, Chris J. Maddison, et al. “Mastering the game of Go
with deep neural networks and tree search”. In: Nature 529.7587 (Jan. 2016),
pp. 484–489.
[17] S. E. Wei, V. Ramakrishna, T. Kanade, et al. “Convolutional Pose Machines”.
In: CVPR. 2016, pp. 4724–4732.
[18] Tobias Würfl, Florin C Ghesu, Vincent Christlein, et al. “Deep learning
computed tomography”. In: International Conference on Medical Image Computing
and Computer-Assisted Intervention (MICCAI). Springer International Publishing,
2016, pp. 432–440.
