HOW I MADE
ZOOM IN AND ENHANCE
CONVOLUTIONAL NEURAL NETWORKS
How I Made Zoom In and Enhance - Seattle Mobile .NET
?
1. SOMEONE ELSE DID IT
2. FINALLY THERE WAS A
LIBRARY I COULD UNDERSTAND
THEORY
THEORY
▸ A way to compute a function inspired by the human brain
▸ Much simpler than the brain!
▸ Input -> Network -> Output
▸ Composed of many neurons
▸ These neurons are interconnected
▸ Some are “inputs” and some are “outputs”
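As a rough illustration of "Input -> Network -> Output", here is a minimal sketch of a few interconnected neurons in plain Python. The sigmoid activation and every weight and bias value are assumptions chosen for the example; the slides do not specify them.

```python
import math

def neuron(inputs, weights, bias):
    """Weighted sum of the inputs, squashed into (0, 1) by a sigmoid."""
    total = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-total))

def tiny_network(inputs):
    """Two hidden neurons feed a single output neuron."""
    # All weights and biases below are made-up illustrative values.
    h1 = neuron(inputs, [0.5, -0.6], 0.1)
    h2 = neuron(inputs, [-0.3, 0.8], 0.0)
    return neuron([h1, h2], [1.2, -0.7], 0.05)

output = tiny_network([1.0, 2.0])
print(output)
```

The two values passed to `tiny_network` play the role of the "inputs" and the returned number is the "output".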
THEORY
DEEP NETWORKS
▸ Nowadays networks have so many neurons that we no longer describe them
neuron by neuron
▸ Instead, we have layers that contain many neurons
▸ When layers are connected to each other, the constituent neurons get
connected
▸ Neurons behave differently depending on the type of layer they belong
to
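One way to picture layers is as functions over whole vectors that get chained together. The sketch below is an assumption-laden toy: `dense_layer`, `relu_layer`, and `connect` are hypothetical helper names, not the API of Torch or TensorFlow, and the weights are arbitrary.

```python
def dense_layer(weights, biases):
    """Build a layer in which every output neuron sees every input."""
    def layer(inputs):
        return [sum(x * w for x, w in zip(inputs, row)) + b
                for row, b in zip(weights, biases)]
    return layer

def relu_layer(inputs):
    """A different layer type: each neuron just clips negatives to zero."""
    return [max(0.0, x) for x in inputs]

def connect(*layers):
    """Connecting layers connects their neurons: outputs feed the next layer."""
    def network(inputs):
        for layer in layers:
            inputs = layer(inputs)
        return inputs
    return network

# Arbitrary illustrative weights.
network = connect(
    dense_layer([[1.0, -1.0], [0.5, 0.5]], [0.0, -1.0]),
    relu_layer,
    dense_layer([[2.0, 1.0]], [0.5]),
)
result = network([3.0, 1.0])
print(result)  # [5.5]
```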
THEORY
CONVOLUTIONAL NEURAL NETWORKS
▸ CNNs use a layer type called a Spatial Convolution to operate on images
▸ This is a way of connecting neurons so that nearby pixels in 2D images are
connected to nearby neurons
▸ The operation the neurons perform is called convolution and is a generalized
technique for manipulating images
▸ The benefits are that local information gets local connections and the
operation itself is very powerful
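To make the convolution operation concrete, here is a minimal pure-Python sketch that slides a small kernel over a 2D image. It computes the unflipped sliding-window sum that CNN libraries commonly call convolution; the image and the kernel are made-up values, and real layers learn their kernels during training.

```python
def convolve2d(image, kernel):
    """Slide the kernel over the image; each output pixel depends only on
    the nearby input pixels under the kernel (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0
            for ky in range(kh):
                for kx in range(kw):
                    total += image[y + ky][x + kx] * kernel[ky][kx]
            row.append(total)
        output.append(row)
    return output

# A dark region on the left and a bright region on the right.
image = [
    [0, 0, 9, 9],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
# Responds where brightness increases from left to right.
edge_kernel = [[-1, 1]]
edges = convolve2d(image, edge_kernel)
print(edges)  # [[0, 9, 0], [0, 9, 0], [0, 9, 0]]
```

The output is non-zero only in the column where the dark and bright regions meet, which is the "local information gets local connections" idea in miniature.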
THEORY
RECURRENT NEURAL NETWORKS
▸ RNNs use feedback to predict the future!
▸ Most networks are pure functions that process inputs and produce outputs
▸ RNNs feed outputs back into the network to model time
▸ It’s scarily effective
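The feedback idea can be sketched with a single recurrent unit whose previous state is fed back in alongside each new input. The `tanh` activation and the two weights are assumptions for illustration only.

```python
import math

def rnn_step(x, hidden, w_in=0.5, w_back=0.9):
    """One time step: mix the current input with the fed-back state."""
    return math.tanh(w_in * x + w_back * hidden)

def run_rnn(sequence):
    """Process a sequence one item at a time, carrying state forward."""
    hidden = 0.0
    states = []
    for x in sequence:
        hidden = rnn_step(x, hidden)
        states.append(hidden)
    return states

# A single pulse followed by silence.
states = run_rnn([1.0, 0.0, 0.0])
print(states)
```

Even though the second and third inputs are zero, the later states stay non-zero: the network still "remembers" the first input through the feedback path.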
THEORY
NETWORK NOTATION
▸ Networks are written like a chemistry formula
▸ Instead of atoms, layer types are used
THEORY
TRAINING
▸ We train the network by showing it a bunch of inputs and desired outputs
▸ The training algorithm is called backpropagation and involves a lot of
number crunching
▸ Neurons assign weights to the neurons they’re connected to
▸ These weights control how much the neighbors influence that neuron
▸ Training is the process of determining these weights
▸ One full pass over all the training pairs is called an epoch, so the epoch count is the number of times each pair has been reused
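The training loop can be sketched on the smallest possible "network": a single weight fitted by gradient descent. This is an assumption-heavy toy, not pix2pix's training code. Real backpropagation applies the same error-gradient idea through many layers using the chain rule; the learning rate and epoch count here are arbitrary.

```python
def train(pairs, epochs=200, learning_rate=0.05):
    """Fit prediction = weight * x by nudging the weight to reduce error."""
    weight = 0.0
    for epoch in range(epochs):          # one epoch = one pass over all pairs
        for x, target in pairs:
            prediction = weight * x
            error = prediction - target
            gradient = 2 * error * x     # derivative of error**2 w.r.t. weight
            weight -= learning_rate * gradient
    return weight

# Inputs paired with desired outputs; the hidden rule is output = 2 * input.
pairs = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned = train(pairs)
print(learned)
```

After training, `learned` ends up very close to 2.0, the weight that minimizes the error on these pairs.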
THEORY
MINIMIZE ERROR BY ADJUSTING WEIGHTS
THEORY
GENERALIZING
▸ What’s the point of training a network if we already have a bunch of inputs and
outputs?
▸ The hope is that the network will learn how to solve problems it hasn’t yet seen
▸ When we train, we always reserve a batch of validation inputs and outputs that the
network never sees
▸ We then use those inputs and outputs to rate the network
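Holding out validation examples might look like the sketch below. The 20% split, the fixed seed, and the use of mean squared error as the rating are all assumptions; `model` is a stand-in lambda rather than a trained network.

```python
import random

def split_examples(pairs, validation_fraction=0.2, seed=0):
    """Shuffle the examples and reserve a fraction the network never trains on."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_validation = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_validation:], shuffled[:n_validation]

def mean_squared_error(model, pairs):
    """Rate a model by its average squared error on the given pairs."""
    return sum((model(x) - y) ** 2 for x, y in pairs) / len(pairs)

pairs = [(float(x), 2.0 * x) for x in range(10)]
training, validation = split_examples(pairs)

model = lambda x: 2.0 * x  # stand-in for a network trained on `training`
score = mean_squared_error(model, validation)
print(len(training), len(validation), score)
```

A low score on `validation` suggests the network generalizes; a low score only on `training` suggests it has memorized its examples.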
PRACTICE
PRACTICE
NEURAL NETWORK DISTRIBUTION
▸ While the concepts are universal…
▸ Neural networks are released as source code that can train and execute the network
▸ The code can be in any number of languages and use any number of support libraries
▸ Python with TensorFlow
▸ Lua with Torch
▸ Networks may only work with some hardware
▸ NVIDIA CUDA is used a lot (they invest in academia)
▸ Cloud solutions exist, ranging from virtualized hardware to proprietary languages
PRACTICE
PIX2PIX https://github.com/phillipi/pix2pix
▸ 2 Networks: Generator + Discriminator
PRACTICE
PREREQUISITES - HARDWARE
LOTS OF
PROCESSING
POWER
PRACTICE
INSTALLATION
▸ Install NVIDIA drivers
▸ CUDA - GPU programming SDK
▸ cuDNN - GPU library that helps with writing nets
▸ I did this on Mac and Linux
▸ Install Torch
▸ Lua libraries that can use cuDNN
▸ Install pix2pix
PRACTICE
TRAINING
$ DATA_ROOT=./datasets/facades \
  name=facades_generation \
  which_direction=AtoB \
  th train.lua
PRACTICE
WE’RE TRAINING!
PRACTICE
TRAINING COMPLETE
▸ We get a trained model file
▸ Contains the structure of the network
▸ Along with the learned weights
PRACTICE
RUNNING THE NETWORK
▸ Now that the network is trained, we can run it against new inputs
▸ Put images you want to test in a special folder
▸ Run test.lua instead of train.lua
▸ Now you have outputs!
PRACTICE
BACK TO
ZOOM AND ENHANCE
ZOOM AND ENHANCE
GOOD TRAINING EXAMPLES
▸ It took some fooling around to learn what the network can and can’t do
▸ You can’t just throw images at it and hope for the best
▸ You must spend time to give it good training examples
▸ Inputs and outputs should differ only in what you want the network to
learn
▸ Other differences will cause slow or impossible learning
▸ Examples: aspect ratio, cropping / region of interest, backgrounds
ZOOM AND ENHANCE
FACE ZOOMING EXAMPLE GENERATION
▸ Narrowed the task down to zooming in on faces
▸ Wrote an app that extracts faces from images using Apple’s CoreImage
framework
▸ Wrote another app that downsamples those faces by 8x, then 16x
▸ This only simulates the problem of Z&E since noise is drastically reduced by
this downsampling
▸ Ideally, my inputs and outputs would be taken with the same camera using
two different zoom levels. But alas…
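Generating a training pair from one face might look like the sketch below. The slides do not show the app's code, so block averaging is an assumed downsampling method, `make_training_pair` is a hypothetical helper, and the synthetic 16x16 grayscale "face" is placeholder data.

```python
def downsample(image, factor):
    """Shrink a grayscale image by averaging each factor x factor block."""
    height = len(image) // factor
    width = len(image[0]) // factor
    small = []
    for y in range(height):
        row = []
        for x in range(width):
            block = [image[y * factor + dy][x * factor + dx]
                     for dy in range(factor)
                     for dx in range(factor)]
            row.append(sum(block) / len(block))
        small.append(row)
    return small

def make_training_pair(face, factor=8):
    """Return (input, desired output): the low-res face and the original."""
    return downsample(face, factor), face

# Placeholder 16x16 image standing in for an extracted face.
face = [[(x + y) % 256 for x in range(16)] for y in range(16)]
low_res, high_res = make_training_pair(face, factor=8)
print(len(low_res), len(low_res[0]))  # 2 2
```

Averaging blocks also averages away sensor noise, which is the reason the slide notes that this only simulates the real zoom problem.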
ZOOM AND ENHANCE
300 EXAMPLES - NOT JUST ME
ZOOM AND ENHANCE
RESULT
ZOOM AND ENHANCE
RESULT
ZOOM AND ENHANCE
RESULT
ZOOM AND ENHANCE
RESULT
FUTURE
GOTTA GITIT WORKIN
ON MOBILE
FUTURE
MOBILE LIBRARIES
▸ TensorFlow is a C++ library that runs on Android & iOS
▸ Miguel de Icaza is binding TensorFlow to .NET (hope it works on mobile!)
▸ https://github.com/migueldeicaza/TensorFlowSharp
▸ Apple ships Metal Performance Shaders, which contain basic CNN routines
▸ I’m porting Torch, for now…


