Deep Learning OCR using Nimbix
POWER8
INTRODUCTION
• OCR (optical character recognition) is the transformation of images of
text into machine-encoded text.
• A simple API to an OCR library might provide a function that takes an
image as input and outputs a string.
• In this project we applied a deep-learning neural network to
optical character recognition.
• We used TensorFlow and a convolutional neural network (CNN).
MOTIVATION
• Optical character recognition is needed when information should be
readable by both humans and machines and the alternative inputs
cannot be predefined.
• The basic OCR system was invented to convert data available on paper
into computer-processable documents, so that the documents can be
edited and reused.
• Traditional OCR techniques are typically multi-stage processes. For
example, first the image is divided into smaller regions that contain
individual characters, then the individual characters are recognized,
and finally the result is pieced back together. A difficulty with this
approach is obtaining a good division of the original image.
Sample Architecture for CNN
What is a Convolutional Neural Network?
• Step 1 – Convolution Operation
• Step 1(b) – ReLU layer (Rectified Linear Unit)
• Step 2 – Pooling
• Step 3 – Flattening
• Step 4 – Full Connection
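The steps listed above can be sketched in a few lines of NumPy. This is only an illustrative toy, not the network used in the project: the image size, the 3x3 kernel, the 2x2 pooling window, and the three output units are all assumptions chosen to keep the example small, and the weights are random rather than trained.

```python
import numpy as np

def conv2d(image, kernel):
    """Step 1: 'valid' 2-D cross-correlation of a single-channel image with one kernel."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Step 1(b): rectified linear unit, replaces negative values with zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Step 2: non-overlapping max pooling; rows/columns that do not fill a window are dropped."""
    h = (x.shape[0] // size) * size
    w = (x.shape[1] // size) * size
    x = x[:h, :w]
    return x.reshape(h // size, size, w // size, size).max(axis=(1, 3))

def dense(x, weights, bias):
    """Step 4: fully connected layer."""
    return x @ weights + bias

# Hypothetical toy input and random (untrained) parameters.
rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.standard_normal((3, 3))

feature_map = relu(conv2d(image, kernel))      # shape (4, 4)
pooled = max_pool(feature_map)                 # shape (2, 2)
flat = pooled.reshape(-1)                      # Step 3: flattening, shape (4,)
weights = rng.standard_normal((flat.size, 3))
scores = dense(flat, weights, np.zeros(3))     # shape (3,)
```

A real CNN stacks many such kernels per layer and learns the kernel and dense weights by gradient descent; frameworks such as TensorFlow provide optimized versions of each step.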
Fully Connected Layer of CNN model
Source: Created by Kirill Eremenko, Hadelin de Ponteves, SuperDataScience Team
[Figure: pipeline diagram with the labels OCRGen.py, STRING, POWER AI, CSV file, Fully Connected Layer, and Predicted Output. The dataset is generated using the Python Imaging Library (PIL), and a fully convolutional network transforms the input volume into a sequence of character predictions.]
Deep OCR Architecture
• A fully convolutional network is presented
which transforms the input volume into a
sequence of character predictions. These
character predictions can then be transformed
into a string. The architecture of the network
is shown in the figure below.
• In the figure, N is the number of possible characters. In this example,
there are 63 possible characters: uppercase and lowercase letters, digits,
and a blank character. The parenthesized values in the convolutional
layers are the filter sizes and stride values, from top to bottom
respectively. The values in the reshape layer are the reshaped dimensions.
• The input volume is a rectangular RGB image. The height and width of
this volume are reduced across the convolutional layers using striding.
The third dimension of this volume increases from 3 channels (RGB) to N
channels, one for each possible character. Thus, the volume is transformed
from an RGB image into a sequence of vectors. Applying argmax across the
channel dimension gives a sequence of character indices (equivalently,
1-hot encoded vectors), which can be transformed into a string.
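The final decoding step described above (argmax over the channel dimension, then mapping to characters) might look like the following NumPy sketch. The ordering of the 63-character alphabet, the treatment of trailing blanks as padding, and the helper names `decode` and `fake_scores` are assumptions made for illustration; the source does not specify them.

```python
import string
import numpy as np

# Assumed alphabet order: uppercase, lowercase, digits, then the blank character.
CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits + ' '
N = len(CHARS)  # 63 possible characters

def decode(scores):
    """Turn a (sequence_length, N) array of per-character scores into a string."""
    indices = np.argmax(scores, axis=-1)   # argmax across the channel dimension
    text = ''.join(CHARS[i] for i in indices)
    return text.rstrip(' ')                # assumption: trailing blanks are padding

def fake_scores(text, length):
    """Hypothetical stand-in for network output: a perfect 1-hot score per position."""
    indices = [CHARS.index(ch) for ch in text.ljust(length)]
    return np.eye(N)[indices]

scores = fake_scores('Ab3', 5)   # shape (5, 63)
print(decode(scores))            # Ab3
```

The 1-hot form mentioned in the text can be recovered from the indices with `np.eye(N)[indices]`.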
SOURCE
https://github.com/nicholastoddsmith/pythonml/blob/master/DeepOCR/TFModel/_classes.txt
Result
• To facilitate training this network, a dataset is generated using the Python
Imaging Library (PIL). Random strings consisting of alphanumeric
characters are generated. Using PIL, an image is generated for each
random string. A CSV file is also generated which contains each file name
and the associated random string. Some examples from the generated
dataset are shown in the figure below.
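A minimal sketch of this kind of dataset generation is given below. It is not the project's OCRGen.py: the string lengths, image size, text position, file naming scheme, and all function names are assumptions for illustration. Pillow is imported inside `render_image` so that the string and CSV parts run even where Pillow is not installed.

```python
import csv
import io
import random
import string

ALPHANUMERIC = string.ascii_letters + string.digits

def random_string(rng, min_len=1, max_len=8):
    """Random alphanumeric string; the length bounds are assumed values."""
    length = rng.randint(min_len, max_len)
    return ''.join(rng.choice(ALPHANUMERIC) for _ in range(length))

def render_image(text, path, size=(128, 32)):
    """Draw black text on a white RGB image with Pillow's default font and save it."""
    from PIL import Image, ImageDraw  # lazy import: only needed when rendering
    image = Image.new('RGB', size, color=(255, 255, 255))
    ImageDraw.Draw(image).text((4, 8), text, fill=(0, 0, 0))
    image.save(path)

def build_index(num_samples, seed=0):
    """Pair a hypothetical file name with a random string for each sample."""
    rng = random.Random(seed)
    return [(f'{i:06d}.png', random_string(rng)) for i in range(num_samples)]

def write_csv(rows, handle):
    """Write (file name, string) rows to an open text handle."""
    csv.writer(handle).writerows(rows)

rows = build_index(5)
buffer = io.StringIO()          # stands in for an open CSV file
write_csv(rows, buffer)

# To also render the images (requires Pillow):
# for file_name, text in rows:
#     render_image(text, file_name)
```

Seeding the random generator keeps the generated strings reproducible, which makes it easier to regenerate the same training set later.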
Training Data
Generating Data
Test Data
• Training and cross-validation results are
shown
Training the Network
• To train the network, the CSV file is parsed and the images are loaded into
memory. Each target value for the training data is a sequence of 1-hot
vectors. Thus the target matrix is a 3D array whose three dimensions
correspond to sample, character, and 1-hot encoding respectively.
• Next, the neural network is constructed using the artificial neural network
classifier (ANNC) class from TFANN. The architecture described above is
represented in a few lines of code using ANNC.
• Softmax cross-entropy is used as the loss function, computed
over the 3rd dimension of the output.
• Fitting the network and performing predictions is simple using the ANNC
class. The prediction is split into batches using array_split from NumPy
to prevent out-of-memory errors.
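The target encoding, the loss, and the batched prediction described above can be illustrated with plain NumPy. This sketch does not use TFANN or its ANNC class; the function names, the alphabet ordering, the blank-padding of short strings, and the stand-in `predict_fn` are all assumptions made for the example.

```python
import string
import numpy as np

# Assumed alphabet order: uppercase, lowercase, digits, then the blank character.
CHARS = string.ascii_uppercase + string.ascii_lowercase + string.digits + ' '

def one_hot_targets(strings, chars, max_len):
    """Build a (sample, character, 1-hot) array; strings must be at most max_len long."""
    lookup = {ch: i for i, ch in enumerate(chars)}
    targets = np.zeros((len(strings), max_len, len(chars)))
    for s, text in enumerate(strings):
        for p, ch in enumerate(text.ljust(max_len)):   # pad short strings with blanks
            targets[s, p, lookup[ch]] = 1.0
    return targets

def softmax_cross_entropy(logits, targets):
    """Mean softmax cross-entropy, with the softmax taken over the 3rd (last) dimension."""
    shifted = logits - logits.max(axis=-1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    return float(-(targets * log_probs).sum(axis=-1).mean())

def predict_in_chunks(predict_fn, images, num_chunks):
    """Run predict_fn on slices of the input so that no single call sees all samples."""
    chunks = np.array_split(images, num_chunks)
    return np.concatenate([predict_fn(chunk) for chunk in chunks], axis=0)

targets = one_hot_targets(['Ab', 'xyz9'], CHARS, max_len=4)   # shape (2, 4, 63)
uniform_loss = softmax_cross_entropy(np.zeros_like(targets), targets)

# Hypothetical predictor standing in for a trained network.
doubled = predict_in_chunks(lambda batch: batch * 2, np.arange(10).reshape(5, 2), 3)
```

With all-zero logits every character is equally likely, so `uniform_loss` equals the natural logarithm of the number of characters, which is a convenient sanity check for an untrained network.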
System Details
• Distributed Deep Learning (DDL) environment
on a POWER8 system with IBM PowerAI ML/DL
frameworks.
• 40-thread POWER8, 256 RAM, 1 x K80 GPU
• PushToCompute for compiling POWER8
applications and deploying them directly to the
Nimbix Cloud

Ocr using tensor flow

  • 1. Deep Learning OCR using Nimbix POWER8
  • 2. INTRODUCTION • OCR is the transformation of images of text into machine-encoded text. • A simple API to an OCR library might provide a function that takes an image as input and outputs a string. • In this project we apply a deep learning neural network to solve optical character recognition. • We use TensorFlow and a convolutional neural network.
  • 3. MOTIVATION • Optical character recognition is needed when information must be readable by both humans and machines and alternative inputs cannot be predefined. • The basic OCR system was invented to convert data available on paper into computer-processable documents, so that the documents can be edited and reused. • Traditional OCR techniques are typically multi-stage processes. For example, first the image may be divided into smaller regions that contain the individual characters, second the individual characters are recognized, and finally the result is pieced back together. A difficulty with this approach is obtaining a good division of the original image.
  • 4. Sample Architecture for CNN What are Convolutional Neural Networks? • Step 1 – Convolution Operation • Step 1(b) – ReLU layer (Rectified Linear Unit) • Step 2 – Pooling • Step 3 – Flattening • Step 4 – Full Connection
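The four steps listed on this slide can be illustrated with a minimal NumPy sketch. This is not the network used in the project: the function names (`conv2d`, `relu`, `max_pool`, `forward`), the single-channel 6x6 input, and the random weights are all assumptions chosen only to show how data flows through convolution, ReLU, pooling, flattening, and a fully connected layer.

```python
import numpy as np

def conv2d(image, kernel, stride=1):
    """Step 1: valid (no padding) 2D convolution of a single-channel image."""
    kh, kw = kernel.shape
    out_h = (image.shape[0] - kh) // stride + 1
    out_w = (image.shape[1] - kw) // stride + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def relu(x):
    """Step 1(b): rectified linear unit, negative values become zero."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Step 2: non-overlapping max pooling over size x size windows."""
    h, w = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h * size, :w * size]
    return trimmed.reshape(h, size, w, size).max(axis=(1, 3))

def forward(image, kernel, weights, bias):
    feature_map = relu(conv2d(image, kernel))  # Steps 1 and 1(b)
    pooled = max_pool(feature_map)             # Step 2
    flat = pooled.ravel()                      # Step 3: flattening
    return flat @ weights + bias               # Step 4: full connection

# Hypothetical toy input: 6x6 image, 3x3 kernel -> 4x4 map -> 2x2 pooled -> 4 values.
rng = np.random.default_rng(0)
image = rng.random((6, 6))
kernel = rng.standard_normal((3, 3))
weights = rng.standard_normal((4, 3))  # 4 flattened features, 3 output classes
bias = np.zeros(3)
scores = forward(image, kernel, weights, bias)
```

A real CNN stacks several such convolution and pooling stages with many learned kernels per layer; the sketch keeps one of each so the shape changes are easy to follow.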
  • 5. Fully Connected Layer of CNN model. Source: created by Kirill Eremenko, Hadelin de Ponteves, SuperDataScience Team
  • 6. Pipeline diagram (labels): OCRGen.py; STRING; POWER AI; CSV file; Fully Connected Layer; Predicted Output. Dataset is generated using the Python Imaging Library (PIL). A fully convolutional network is presented which transforms the input volume into a sequence of character predictions.
  • 7. Deep OCR Architecture • A fully convolutional network is presented which transforms the input volume into a sequence of character predictions. These character predictions can then be transformed into a string. The architecture of the network is shown in the figure below.
  • 8. • Where N is the number of possible characters. In this example, there are 63 possible characters: uppercase and lowercase letters, digits, and a blank character. The parenthesized values in the convolutional layers are the filter sizes and stride values, from top to bottom respectively. The values in the reshape layer are the reshaped dimensions. • The input volume is a rectangular RGB image. The height and width of this volume are reduced across the convolutional layers using striding. The third dimension of this volume increases from 3 channels (RGB) to one channel for each possible character. Thus, the volume is transformed from an RGB image into a sequence of vectors. Applying argmax across the channel dimension selects one character per position (a sequence of 1-hot encoded vectors), which can be transformed into a string. SOURCE: https://github.com/nicholastoddsmith/pythonml/blob/master/DeepOCR/TFModel/_classes.txt
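The argmax decoding step described on this slide can be sketched as follows. The class ordering in `ALPHABET`, the use of a space as the blank character, and the helper names `one_hot` and `decode` are assumptions for illustration; the ordering in the project's `_classes.txt` may differ.

```python
import string
import numpy as np

# Assumed ordering of the 63 classes: 26 uppercase, 26 lowercase, 10 digits, 1 blank.
BLANK = " "
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + BLANK

def one_hot(text, length):
    """Encode text as a (length, 63) matrix, padding with the blank class."""
    encoded = np.zeros((length, len(ALPHABET)))
    for pos, ch in enumerate(text.ljust(length, BLANK)):
        encoded[pos, ALPHABET.index(ch)] = 1.0
    return encoded

def decode(scores):
    """Take argmax over the channel (class) axis and map indices to characters."""
    indices = np.argmax(scores, axis=-1)
    return "".join(ALPHABET[i] for i in indices).rstrip(BLANK)

# Simulated network output: a clean 1-hot target plus small noise (values below 0.1).
noise = 0.1 * np.random.default_rng(0).random((5, len(ALPHABET)))
scores = one_hot("Ab3", 5) + noise
print(decode(scores))
```

Trailing blanks are stripped on the assumption that the blank class pads strings shorter than the fixed output length.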
  • 9. Result • To facilitate training this network, a dataset is generated using the Python Imaging Library (PIL). Random strings consisting of alphanumeric characters are generated. Using PIL, an image is generated for each random string. A CSV file is also generated, which contains each file name and its associated random string. Some examples from the generated dataset are shown in the figure below. (Figure panels: Training Data, Generating Data)
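A minimal sketch of this kind of dataset generator is shown below. It is not the project's OCRGen.py: the function names, image size, string lengths, file naming, and the `labels.csv` file name are all assumptions. Pillow is imported lazily inside `render_image`, so the string and CSV logic can run on its own with `render=False`.

```python
import csv
import random
import string
from pathlib import Path

CHARS = string.ascii_letters + string.digits  # alphanumeric characters

def random_string(rng, min_len=1, max_len=8):
    """Return a random alphanumeric string (length bounds are assumptions)."""
    length = rng.randint(min_len, max_len)
    return "".join(rng.choice(CHARS) for _ in range(length))

def render_image(text, path, size=(128, 32)):
    """Draw text in black on a white RGB image using Pillow's default font."""
    from PIL import Image, ImageDraw  # lazy import: only needed when rendering
    img = Image.new("RGB", size, "white")
    ImageDraw.Draw(img).text((4, 8), text, fill="black")
    img.save(path)

def generate_dataset(out_dir, count, seed=0, render=True):
    """Write `count` images plus a CSV of (file name, string) rows."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    rng = random.Random(seed)
    rows = []
    for i in range(count):
        text = random_string(rng)
        name = f"{i:05d}.png"
        if render:
            render_image(text, out / name)
        rows.append((name, text))
    with open(out / "labels.csv", "w", newline="") as f:
        csv.writer(f).writerows(rows)
    return rows
```

Calling `generate_dataset("data", 1000)` would write 1000 PNG files and one CSV into a `data` directory, assuming Pillow is installed.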
  • 11. • Training and cross-validation results are shown
  • 12. Training the Network • To train the network, the CSV file is parsed and the images are loaded into memory. Each target value for the training data is a sequence of 1-hot vectors. Thus the target is a 3D array whose three dimensions correspond to sample, character, and 1-hot encoding respectively. • Next, the neural network is constructed using the artificial neural network classifier (ANNC) class from TFANN. The architecture described above is expressed in code using ANNC. • Softmax cross-entropy, computed over the 3rd dimension of the output, is used as the loss function. • Fitting the network and performing predictions is simple using the ANNC class. The prediction is split into chunks using array_split from numpy to prevent out-of-memory errors.
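The target layout, the loss, and the chunked prediction described on this slide can be sketched in plain NumPy. This does not use TFANN or the ANNC class, whose API is not shown in the slides; the helper names, the toy three-character alphabet, and the stand-in `predict_fn` are assumptions.

```python
import numpy as np

def one_hot_targets(strings, alphabet, max_len, blank=" "):
    """Build a 3D target array with axes (sample, character, class)."""
    index = {ch: i for i, ch in enumerate(alphabet)}
    targets = np.zeros((len(strings), max_len, len(alphabet)))
    for s, text in enumerate(strings):
        # Strings are assumed to be no longer than max_len; shorter ones are padded.
        for c, ch in enumerate(text.ljust(max_len, blank)):
            targets[s, c, index[ch]] = 1.0
    return targets

def softmax_cross_entropy(logits, targets):
    """Mean softmax cross-entropy, with softmax taken over the 3rd (class) axis."""
    shifted = logits - logits.max(axis=2, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=2, keepdims=True))
    return float(-(targets * log_probs).sum(axis=2).mean())

def predict_in_chunks(predict_fn, images, n_chunks):
    """Run predict_fn on chunks from np.array_split to limit peak memory use."""
    chunks = np.array_split(images, n_chunks)
    return np.concatenate([predict_fn(chunk) for chunk in chunks], axis=0)

# Toy example with a hypothetical 3-class alphabet ("a", "b", blank).
alphabet = "ab "
targets = one_hot_targets(["ab", "a"], alphabet, max_len=3)
uniform_loss = softmax_cross_entropy(np.zeros_like(targets), targets)
confident_loss = softmax_cross_entropy(100.0 * targets, targets)
```

With all-zero logits every class is equally likely, so the loss equals the natural log of the number of classes; logits that strongly favor the correct class drive the loss toward zero.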
  • 13. System Details • Distributed Deep Learning (DDL) environment on a POWER8 system with IBM PowerAI ML/DL frameworks. • 40-thread POWER8, 256 RAM, 1 x K80 GPU • PushToCompute for compiling POWER8 applications and deploying them directly to the Nimbix Cloud