Future of Cognitive Computing and AI
Semantic PDF Processing and Knowledge
Representation
Sridhar Iyengar
Distinguished Engineer
Cognitive Computing Research
IBM T.J. Watson Research Center
siyengar@us.ibm.com
Financial Services: Query from Unstructured Data
Financial Documents (.pdf, .html, .docx…)
Ingest
Query: “Show me revenues for Citibank between 2009 and 2015”
[Chart: query result plotted by year, x-axis 2009 to 2015, y-axis 0 to 150]
© 2017, IBM Corporation
Summary: PDF understanding is hard and requires significant
research breakthroughs and product innovations
▪ PDF documents are optimized for display and often do not
include the metadata and structure needed to facilitate cognitive
post-processing
– Existing technologies and solutions are optimized for printing and
viewing, not cognitive post-processing
▪ Need to handle programmatically created documents (via MS Word, PPT…)
as well as legacy and scanned documents (forms, handwritten
notes…)
▪ Approach: define a Semantic Document Structure
Model (DSM) as a consistent internal representation of document
structures, to be used in future WDC services and products
▪ Currently focused on table and diagram understanding from PDF
– Healthcare, Financial Services, Compliance, Legal…
Research Focus : IBM AI Platform for Business
• Best platform for building applications that incorporate enterprise and industry knowledge
• Time to Value at every step of cognitive application development
• Tools & Methodology to support development, deployment & intuitive usage
Data:
‒ Structured & Unstructured data sources
‒ Multimodal (text, visual, speech, etc.) data sources
‒ Public & private data sources
Training
‒ Create Domain Models and Specialize them
§ For conversations aligned with business processes
§ For discovery of insights
‒ Fast adaptation to new domains
‒ Scale from small to large amounts of training data
‒ Tuned model creation for accuracy vs. training time
‒ Incremental & Automated Knowledge Evolution
Conversation
‒ Tools for SMEs to train from Business Processes
‒ Inference engines for specific content structure
Discovery
‒ Tools for SMEs to train from Business Knowledge
‒ Reason about domain knowledge (vs. Lexical/Syntactic)
Tools & Methodology
‒ Cognitive application lifecycle (code/data/model)
‒ Resilient deployment of cognitive models
Cognitive Computing (AI) Technologies Research
[Diagram of research areas; asterisks are as marked on the slide]
▪ Decision Support
▪ People Insights
▪ Cognitive Software and Data Life Cycle*
▪ Reasoning and Planning
▪ Human Computer Interaction
▪ Conversation
▪ Query and Retrieval*
▪ Knowledge Extraction and Representation*
▪ Learning*
▪ Natural Language & Text Understanding*
▪ Visual Comprehension*
▪ Speech and Audio
▪ Embodied Cognition
▪ Cognitive Computing Platform Infrastructure
▪ Signal Comprehension
▪ Reasoning About Domains
▪ Interaction Systems
▪ Trust and Security
▪ Semantic PDF Processing*
Goal: From Raw Data to Business Artifacts
Example 1: Semantic PDF Processing
[Diagram labels: .pdf; Section; fragment (×3); Obligation; Line Plot; Bulleted List]
Section
• Create document fragments by parsing out chunks
• Document structure models
• Reason about document chunks
Fragment
• Create a representation for each fragment
• Document fragment models
• Reason about fragment constituents
Obligation
• Create a representation for an obligation
• Models for “obligation language”
• Reason about the list or data that refines the obligation
• Hierarchical Processing
• Machine-learned models and reasoning at all levels
• Learnability of artifacts, models
• Learn how to specify reasoners
From Raw Data to Business Artifacts
Example 1: Semantic PDF Processing (the .pdf hierarchy from the previous slide, repeated)
Example 2: Semantic MPEG Processing
[Diagram labels: .mp4; Scene (×2); Boy; Girl; Night; Soft Music; Candles; Romantic Scene]
• Hierarchical Processing
• Machine-learned models and reasoning at all levels
• Learnability of artifacts, models
• Learn how to specify reasoners
Why is PDF Processing hard?
Complexity akin to “Natural Language Understanding”
▪ Thousands of PDF generators (drivers), each with its own rules for
placing marks on paper.
▪ Incredible variety in content: complex tables, images, diagrams,
formulas, varying resolution in scanned content
▪ No closed-form / algorithmic solution is feasible; must resort to
machine learning.
Why is it hard? Variety of tables: 20-25 major table types
in discussion with just one major customer
▪ Complex tables: graphical lines can be misleading (is this 1, 2 or 3 tables?)
▪ Table with visual clues only
▪ Multi-row, multi-column column headers
▪ Nested row headers
▪ Tables with textual content
▪ Table with graphic lines
▪ Table interleaved with text and charts
▪ Complex multi-row, multi-column column headers identifiable
using graphical lines and visual clues
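To make the "multi-row, multi-column column headers" case concrete, here is a minimal Python sketch of one way such a header could be flattened into one label per data column once the cells and their column spans are known. The `(text, col_span)` cell format, the `flatten_headers` name and the sample header are assumptions made for illustration; they are not taken from the slides, and row spans are deliberately ignored.

```python
from typing import List, Tuple

# Assumed input format: each header row is a list of (text, col_span) cells,
# rows listed top to bottom. Row spans are not modelled in this sketch.
HeaderRow = List[Tuple[str, int]]


def flatten_headers(rows: List[HeaderRow]) -> List[str]:
    """Return one 'Parent / Child' label per data column."""
    if not rows:
        return []
    # Repeat each cell's text across the columns it spans.
    expanded = []
    for row in rows:
        cells: List[str] = []
        for text, span in row:
            cells.extend([text] * span)
        expanded.append(cells)
    width = len(expanded[0])
    if any(len(cells) != width for cells in expanded):
        raise ValueError("header rows cover different numbers of columns")
    # Join the non-empty header texts found above each column.
    labels = []
    for col in range(width):
        parts = [cells[col] for cells in expanded if cells[col]]
        labels.append(" / ".join(parts))
    return labels


# Hypothetical two-row header, invented for illustration only.
header = [
    [("", 1), ("Revenue", 2), ("Cost", 2)],
    [("Year", 1), ("Q1", 1), ("Q2", 1), ("Q1", 1), ("Q2", 1)],
]
print(flatten_headers(header))
```

The hard part described on this slide is recovering the cells and spans in the first place (from graphical lines or visual clues); the sketch only covers the bookkeeping that follows.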
Why is it hard? Variety in Image, Diagram Types
[Figure: page 1305 of L. Lin et al., Pattern Recognition 42 (2009) 1297-1307, shown as an example page that mixes body text with a figure (Fig. 8: ROC curves of the detection results for bicycle parts)]
PDF rendering
▪ .doc and .ppt rendering to .pdf keeps minimal structure formatting;
it is geared towards visual fidelity
▪ Often .pdf is created by “screen scraping”, scanning, or hybrid
ways that do not keep structure information.
Multi-modality: extremely rich information
▪ Images, text and tables co-exist and also form nested
hierarchies, possibly with several levels
[Callouts on example pages]
▪ Nested table (numeric and non-numeric + image)
▪ Tabular representation of images with pictorial cross reference
▪ Images + captions + cross references and text that comments on the image
Two major approaches to tackling PDF Processing
▪ Unsupervised learning and out-of-the-box PDF
processing
– Works well for a large class of domains, with some compromise in
quality
▪ Supervised learning with a graphical labelling tool
– Potential for improved quality when many similar documents are
available
Both approaches can be used together
PDF Understanding: High Level Overview
[Diagram components]
▪ PDF Parser
▪ Text analytics (Programmatic PDF)
▪ TU: Table understanding (Programmatic PDF)
▪ Image classification
▪ DU: Line plots (LP)
▪ DU: Flow charts (FC)
▪ DU: Bubble plots (BP)
▪ DU: Scanned tables (ST)
▪ Data integration: linking text to diagrams, tables, serialization…
Learned Semantic Document Representation
PDF Processing Overview in WDC
[Diagram labels: PDF Docs; WDC DCS Service; HTML; JSON; Plain Text; Text and Simple Table structure]
https://www.ibm.com/watson/developercloud/document-conversion.html
Current implementation of DCS has limited table processing capability and no support
for scanned documents, diagrams, graphs, etc.
PDF Processing.Next Overview (2017/2018)
[Diagram labels]
▪ WDC DCS.Next Service
▪ WDC DCS Service…
▪ New PDF Tools: PDF2HTML, PDF2JSON, PDF2-DSMXML
▪ PDF, HTML, Word Docs
▪ PDF, HTML, Word Docs (Training)
▪ DSM-XML, JSON, Plain Text, HTML
▪ Text, Tables, Diagrams, Graphs…
▪ Developer, using DCS API
▪ SME, Data Scientist (Domain Adaptation using ML)
PDF Conversion Architecture
Programmatic PDF Extraction
▪ Programmatic PDF
▪ PDFBox API: Parse PDF Document
▪ Cleanse Raw PDF Data
Scanned & Hybrid PDF Extraction
▪ Scanned PDF, Hybrid PDF
▪ OCR Engine API: Scan PDF Document
▪ Cleanse Raw OCR Output
• Scanned PDF processing available now using Datacap OCR engine
• Extension using Tesseract engine
ML-based PDF Conversion Pipeline
[Diagram stages, in extraction order: Layout + Reading Order Inference; HTML Generation; Table Structure Population; Metadata Identification; Table Identification; Composite Unit / Region Identification; Chart Identification; output: HTML]
[Legend: Open Source or Commercial Software; Research Extensions]
• ML-based PDF conversion pipeline is source-independent
• The same ML-based algorithms can be applied directly to data
extracted from either scanned or programmatic PDF
• PDF conversion ML algorithms are unsupervised and thus achieve
stated performance out of the box, with no training / tuning data
required
• Deployable in the cloud as a document-at-a-time processing service
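The "source-independent" claim can be illustrated with a minimal Python sketch: if both extraction paths are normalized to the same element type, every later stage can be shared. Everything below is an assumption made for illustration: the `Element` type, the input shapes of `from_programmatic` and `from_ocr`, the stage order, and the naive top-to-bottom sort that stands in for the ML-based reading order inference. None of it uses the PDFBox, Datacap or Tesseract APIs.

```python
import html
from dataclasses import dataclass
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), origin at top-left


@dataclass
class Element:
    """Common representation assumed for both extraction paths."""
    text: str
    bbox: Box


def from_programmatic(items: List[Tuple[str, Box]]) -> List[Element]:
    # Hypothetical shape for parser output: (text, bbox) pairs.
    return [Element(text, box) for text, box in items]


def from_ocr(words: List[Dict]) -> List[Element]:
    # Hypothetical shape for OCR output: dicts with "text" and "box" keys.
    return [Element(w["text"], tuple(w["box"])) for w in words]


def cleanse(elements: List[Element]) -> List[Element]:
    # Drop whitespace-only elements.
    return [e for e in elements if e.text.strip()]


def reading_order(elements: List[Element]) -> List[Element]:
    # Naive stand-in: sort top to bottom, then left to right.
    return sorted(elements, key=lambda e: (e.bbox[1], e.bbox[0]))


def to_html(elements: List[Element]) -> str:
    return "\n".join(f"<p>{html.escape(e.text)}</p>" for e in elements)


def convert(elements: List[Element]) -> str:
    """Shared stages, applied regardless of where the elements came from."""
    for stage in (cleanse, reading_order):
        elements = stage(elements)
    return to_html(elements)


prog = from_programmatic([("Length", (50.0, 10.0, 90.0, 20.0)),
                          ("Bridge", (10.0, 10.0, 40.0, 20.0))])
ocr = from_ocr([{"text": "Length", "box": [50, 10, 90, 20]},
                {"text": "Bridge", "box": [10, 10, 40, 20]}])
print(convert(prog))
```

With these inputs, `convert(prog)` and `convert(ocr)` yield the same HTML, which is the property the slide describes: only the extraction front end differs between scanned and programmatic PDF.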
PDF Conversion Downstream Analytics Example
• HTML output from WCS PDF Conversion is directly consumable by downstream analytics
• PDF Conversion table processing example:
[Diagram labels: PDF; WCS PDF Conversion; HTML; Table Processing; Structured Facts from Table; Watson Knowledge Graph; NLQ Answering; Answer]
Original scanned PDF table
HTML generated from current PDF Conversion web service:
Bridge          Designer              Length
Brooklyn        J. A. Roebling        1595
Manhattan       G. Lindenthal         1470
Queensborough   Palmer & Hornbostel   1182
Structured facts from existing table processing libraries
(with appropriate customization)
NLQ Answering: “Who designed Brooklyn Bridge?”
Answer: J. A. Roebling
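The table-to-answer step in this example can be sketched in a few lines of Python. The rows are the ones shown on the slide; the function names and the single regular-expression question pattern are assumptions made for illustration, and the pattern is a toy stand-in for the NLQ Answering component, not a description of it.

```python
import re
from typing import Dict, List, Optional

# Header and rows as shown on the slide.
HEADER = ["Bridge", "Designer", "Length"]
ROWS = [
    ["Brooklyn", "J. A. Roebling", "1595"],
    ["Manhattan", "G. Lindenthal", "1470"],
    ["Queensborough", "Palmer & Hornbostel", "1182"],
]


def table_to_facts(header: List[str], rows: List[List[str]]) -> List[Dict[str, str]]:
    """Turn each table row into a {column name: value} record."""
    return [dict(zip(header, row)) for row in rows]


def who_designed(question: str, facts: List[Dict[str, str]]) -> Optional[str]:
    """Toy matcher for questions of the form 'Who designed <name> Bridge?'."""
    match = re.match(r"who designed (?:the )?(.+?)(?: bridge)?\?*$",
                     question.strip(), re.IGNORECASE)
    if not match:
        return None
    name = match.group(1).lower()
    for fact in facts:
        if fact["Bridge"].lower() == name:
            return fact["Designer"]
    return None


facts = table_to_facts(HEADER, ROWS)
print(who_designed("Who designed Brooklyn Bridge?", facts))  # J. A. Roebling
```

Once the table has been recovered with its header, the lookup itself is trivial; the difficulty sits upstream, in getting a clean header and rows out of a scanned PDF.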
Document Structure Model (Document Representation)
• Define common document structure ideal for subsequent semantic analysis
• Defined per feature: Section, Bulleted Lists, Headers, Footers, Footnotes, Tables, ...
▪ Define how section information such as title,
number and nesting should be represented
▪ Define how list information such as list items
and list type should be represented
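As a rough illustration of what "defined per feature" could look like for sections and lists, the Python sketch below models title, number and nesting for a section, and list items and list type for a list. The class names, field names and sample values are hypothetical; the slides do not give the actual DSM definitions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ListBlock:
    list_type: str                       # assumed values, e.g. "bulleted" or "numbered"
    items: List[str] = field(default_factory=list)


@dataclass
class Section:
    title: str
    number: str                          # assumed to be a dotted label such as "2.1"
    lists: List[ListBlock] = field(default_factory=list)
    subsections: List["Section"] = field(default_factory=list)  # nesting


def outline(section: Section, depth: int = 0) -> List[str]:
    """Render the section tree as indented 'number title' lines."""
    lines = [f"{'  ' * depth}{section.number} {section.title}"]
    for child in section.subsections:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical document fragment, invented for illustration only.
doc = Section(
    title="Obligations", number="2",
    subsections=[
        Section(title="Reporting", number="2.1",
                lists=[ListBlock("bulleted", ["Quarterly filing", "Annual audit"])]),
    ],
)
print("\n".join(outline(doc)))
```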
Document Structure Model (DSM) - Draft
[Diagram: Logical Data Model, Ontology Representation]
▪ Sources: Scan PDF, Prog PDF
▪ Text objects: Page, PageColumn, Paragraph, TextLine, Phrase, Token, Character
▪ Other objects: PageChart, Table, TableCell, Graphical Line, Embedded Image
▪ Attributes shown: Color, displayOrder, rowSpan, colSpan, BoundBoxCoords, contents
▪ [1…n] means ordered list
▪ All objects have a BoundingBox attribute
• Goal: Define common document structure ideal for subsequent semantic
processing
• Captures both raw extracted information (text, vector graphics) and
inferred artifacts (tables, charts, paragraphs)
• Start with PDF documents and extend to other formats such as Word
and Excel
• DSM schema in OWL; serializations to HTML, JSON...
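A small Python sketch can make the draft model more tangible: every object carries a bounding box, containers hold ordered lists of children, and the whole structure serializes to JSON. The object and attribute names (`Page`, `TextLine`, `Token`, `Table`, `TableCell`, `rowSpan`, `colSpan`, `contents`) follow the diagram labels, but the containment shown here, the field layout and the `union` helper are assumptions; the slides state that the actual schema is defined in OWL, which these dataclasses only stand in for.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class BoundingBox:
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Token:
    text: str
    bbox: BoundingBox


@dataclass
class TextLine:
    bbox: BoundingBox
    tokens: List[Token] = field(default_factory=list)    # ordered list [1…n]


@dataclass
class TableCell:
    bbox: BoundingBox
    contents: str
    rowSpan: int = 1
    colSpan: int = 1


@dataclass
class Table:
    bbox: BoundingBox
    cells: List[TableCell] = field(default_factory=list)  # ordered list [1…n]


@dataclass
class Page:
    bbox: BoundingBox
    lines: List[TextLine] = field(default_factory=list)
    tables: List[Table] = field(default_factory=list)


def union(boxes: List[BoundingBox]) -> BoundingBox:
    """Smallest box enclosing all given boxes (hypothetical helper)."""
    return BoundingBox(min(b.x0 for b in boxes), min(b.y0 for b in boxes),
                       max(b.x1 for b in boxes), max(b.y1 for b in boxes))


# Coordinates are invented for illustration only.
tokens = [Token("Brooklyn", BoundingBox(10, 10, 60, 20)),
          Token("Bridge", BoundingBox(65, 10, 100, 20))]
line = TextLine(bbox=union([t.bbox for t in tokens]), tokens=tokens)
page = Page(bbox=BoundingBox(0, 0, 612, 792), lines=[line])

serialized = json.dumps(asdict(page), indent=2)
print(serialized)
```

Keeping raw tokens and inferred containers in one tree mirrors the bullet above about capturing both extracted information and inferred artifacts.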
Thank You

diannepatricia
 
Cognitive Assistance for the Aging
Cognitive Assistance for the AgingCognitive Assistance for the Aging
Cognitive Assistance for the Aging
diannepatricia
 
From complex Systems to Networks: Discovering and Modeling the Correct Network"
From complex Systems to Networks: Discovering and Modeling the Correct Network"From complex Systems to Networks: Discovering and Modeling the Correct Network"
From complex Systems to Networks: Discovering and Modeling the Correct Network"
diannepatricia
 
The Role of Dialog in Augmented Intelligence
The Role of Dialog in Augmented IntelligenceThe Role of Dialog in Augmented Intelligence
The Role of Dialog in Augmented Intelligence
diannepatricia
 
Developing Cognitive Systems to Support Team Cognition
Developing Cognitive Systems to Support Team CognitionDeveloping Cognitive Systems to Support Team Cognition
Developing Cognitive Systems to Support Team Cognition
diannepatricia
 
Cyber-Social Learning Systems
Cyber-Social Learning SystemsCyber-Social Learning Systems
Cyber-Social Learning Systems
diannepatricia
 
“IT Technology Trends in 2017… and Beyond”
“IT Technology Trends in 2017… and Beyond”“IT Technology Trends in 2017… and Beyond”
“IT Technology Trends in 2017… and Beyond”
diannepatricia
 
"Curious Learning: using a mobile platform for early literacy education as a ...
"Curious Learning: using a mobile platform for early literacy education as a ..."Curious Learning: using a mobile platform for early literacy education as a ...
"Curious Learning: using a mobile platform for early literacy education as a ...
diannepatricia
 
Embodied Cognition - Booch HICSS50
Embodied Cognition - Booch HICSS50Embodied Cognition - Booch HICSS50
Embodied Cognition - Booch HICSS50
diannepatricia
 
KATE - a Platform for Machine Learning
KATE - a Platform for Machine LearningKATE - a Platform for Machine Learning
KATE - a Platform for Machine Learning
diannepatricia
 
Cognitive Computing for Aging Society
Cognitive Computing for Aging SocietyCognitive Computing for Aging Society
Cognitive Computing for Aging Society
diannepatricia
 
“Semantic Technologies for Smart Services”
“Semantic Technologies for Smart Services” “Semantic Technologies for Smart Services”
“Semantic Technologies for Smart Services”
diannepatricia
 

“Semantic PDF Processing & Document Representation”

  • 1. Future of Cognitive Computing and AI: Semantic PDF Processing and Knowledge Representation. Sridhar Iyengar, Distinguished Engineer, Cognitive Computing Research, IBM T.J. Watson Research Center, siyengar@us.ibm.com
  • 2. Financial Services: Query from Unstructured Data. Financial Documents (.pdf, .html, docx…) → Ingest → “Show me revenues for Citibank between 2009 and 2015”. [Figure: chart with year ticks 2009 to 2015 and value ticks 0 to 150] © 2017, IBM Corporation
  • 3. Summary: PDF understanding is hard and requires significant Research breakthroughs and Product Innovations
      ▪ PDF Documents are optimized for display and often do not include metadata and structure to facilitate Cognitive post processing
          – Existing technologies and solutions are optimized for printing and viewing, not cognitive post processing
      ▪ Need to handle Programmatically created (via MS Word, PPT….) and Legacy and scanned documents (Forms, hand written notes...)
      ▪ Approach: The definition of a Semantic Document Structure Model (DSM) for a consistent internal representation of document structures to be used in Future WDC Services and products
      ▪ Currently focused on Table and Diagram Understanding from PDF
          – Healthcare, Financial Services, Compliance, Legal…
  • 4. Research Focus: IBM AI Platform for Business
      • Best platform for building applications that incorporate enterprise and industry knowledge
      • Time to Value at every step of cognitive application development
      • Tools & Methodology to support development, deployment & intuitive usage
      Data:
      ‒ Structured & Unstructured data sources
      ‒ Multimodal (text, visual, speech, etc.) data sources
      ‒ Public & private data sources
      Training:
      ‒ Create Domain Models and Specialize them
          § For conversations aligned with business process
          § For discovery of insights
      ‒ Fast adaptation to new domains
      ‒ Scale from small to large amounts of training data
      ‒ Tuned model creation for accuracy vs. training time
      ‒ Incremental & Automated Knowledge Evolution
      Conversation:
      ‒ Tools for SMEs to train from Business Processes
      ‒ Inference engines for specific content structure
      Discovery:
      ‒ Tools for SMEs to train from Business Knowledge
      ‒ Reason about domain knowledge (vs. Lexical/Syntactic)
      Tools & Methodology:
      ‒ Cognitive application lifecycle (code/data/model)
      ‒ Resilient deployment of cognitive models
  • 5. Cognitive Computing (AI) Technologies Research Decision Support People Insights Cognitive Software and Data Life Cycle* Reasoning and Planning Human Computer Interaction Conversation Query and Retrieval* Knowledge Extraction and Representation* Learning* Natural Language & Text Understanding* Visual Comprehension* Speech and Audio Embodied Cognition Cognitive Computing Platform Infrastructure Signal Comprehension Reasoning About Domains Interaction Systems Trust and Security Semantic PDF Processing* © 2017, IBM Corporation
  • 6. Goal: From Raw Data to Business Artifacts. Example 1: Semantic PDF Processing
      [Diagram labels: .pdf, Section, fragment (three nodes), Line Plot, Bulleted List, Obligation]
      • Create representation for an obligation
      • Models for “obligation language”
      • Reason about list or data that refines the obligation
      • Create document fragments by parsing out chunks
      • Document structure models
      • Reason about document chunks
      • Create representation for a fragment
      • Document fragment models
      • Reason about fragment constituents
      • Hierarchical Processing
      • Machine-learned models and reasoning at all levels
      • Learnability of artifacts, models
      • Learn how to specify reasoners
  • 7. From Raw Data to Business Artifacts. Example 1: Semantic PDF Processing; Example 2: Semantic MPEG Processing
      [Diagram labels, Example 1: .pdf, Section, fragment (three nodes), Line Plot, Bulleted List, Obligation]
      [Diagram labels, Example 2: .mp4, Scene (two nodes), Boy, Girl, Night, Soft Music, Candles, Romantic Scene]
      • Create representation for an obligation
      • Models for “obligation language”
      • Reason about list or data that refines the obligation
      • Create document fragments by parsing out chunks
      • Document structure models
      • Reason about document chunks
      • Create representation for a fragment
      • Document fragment models
      • Reason about fragment constituents
      • Hierarchical Processing
      • Machine-learned models and reasoning at all levels
      • Learnability of artifacts, models
      • Learn how to specify reasoners
  • 8. Why is PDF Processing hard? Complexity akin to “Natural Language Understanding”
      ▪ Thousands of PDF generators (drivers), each with its own rules for placing marks on paper
      ▪ Incredible variety in content: complex tables, images, diagrams, formulas, varying resolution in scanned content
      ▪ No closed-form / algorithmic solution is feasible; must resort to machine learning
  • 9. Why is it hard? Variety of tables: 20-25 major table types in discussion with just one major customer
      [Figure labels on the example tables:]
      – Complex tables: graphical lines can be misleading (is this 1, 2 or 3 tables?)
      – Table with visual clues only
      – Multi-row, multi-column column headers
      – Nested row headers
      – Tables with textual content
      – Table with graphic lines
      – Table interleaved with text and charts
      – Complex multi-row, multi-column column headers identifiable using graphical lines and visual clues
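One of the table types listed above, multi-row, multi-column column headers, can be illustrated with a small sketch. It assumes the row and column spans of each header cell are already known (which is the hard part the slide refers to) and only shows how spanned cells could be expanded into a grid and flattened into one name per column. The function name `flatten_headers`, the `(text, rowspan, colspan)` cell format, and the sample labels are all hypothetical and do not come from the deck.

```python
from typing import Dict, List, Tuple

# Each header cell is (text, rowspan, colspan); this format is an assumption.
HeaderCell = Tuple[str, int, int]


def flatten_headers(rows: List[List[HeaderCell]]) -> List[str]:
    """Expand spanned header cells into a grid, then join each column top-down."""
    grid: Dict[Tuple[int, int], str] = {}
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            # Skip grid slots already filled by a cell spanning down from above.
            while (r, c) in grid:
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan

    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    names = []
    for c in range(n_cols):
        parts: List[str] = []
        for r in range(n_rows):
            text = grid.get((r, c), "")
            # A cell spanning several rows should contribute its text only once.
            if text and (not parts or parts[-1] != text):
                parts.append(text)
        names.append(" / ".join(parts))
    return names


# Illustrative two-row header: "Bank" spans two rows, the others span two columns.
header_rows = [
    [("Bank", 2, 1), ("Revenue", 1, 2), ("Profit", 1, 2)],
    [("2014", 1, 1), ("2015", 1, 1), ("2014", 1, 1), ("2015", 1, 1)],
]
column_names = flatten_headers(header_rows)
```

With the sample header, each data column ends up with a single qualified name such as "Revenue / 2014", which is the form a downstream fact extractor would typically need.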
  • 10. Why is it hard? Variety in Image, Diagram Types
      [Sample page shown on the slide: L. Lin et al. / Pattern Recognition 42 (2009) 1297-1307, p. 1305]
      Fig. 8. ROC curves of the detection results for bicycle parts. Each graph shows the ROC curve of the results for a different part of the bicycle using just bottom-up information and bottom-up + top-down information. We can see that the addition of top-down information greatly improves the results. We can also see that the bicycle wheel is the most reliably detected object using only bottom-up cues, so we will look for that part first.
      With a quick second glance, even the seat and handlebars may be “seen”, though they are actually occluded. Our algorithm simulates the top-down process (indicated by blue/green downward arrows in Fig. 4) in a similar way, using the constructed And–Or graphs.
      Verification of hypotheses: Each of the bottom-up proposals activates a production rule that matches the terminal nodes in the graph, and the algorithm predicts its neighboring nodes subject to the learned relationships and node attributes. For example in Fig. 4, a proposed circle will activate the rule that expands a wheel into two rings. The algorithm then searches for another circle of proportional radius, subject to the concentric relation with existing circle. In Fig. 5(b), the wheels are already verified. The candidate frames are then predicted with their ends affixed to the center points of the wheels. Since we cannot tell the front wheels from the rear ones at this moment, frames facing in two different directions are both predicted and put in the Open List. In Fig. 5(a), the triangle templates are detected using a Generalized Hough Transform only when the wheels are first verified and frames are predicted. If no neighboring nodes are matched, the algorithm stops pursuing this proposal and removes it from the Lists. Otherwise, if all of the neighboring nodes are matched, the production rule is completed. The grouped nodes are then put in the Closed List and lined up to be another bottom-up proposal for the higher level. Note that we may have both bottom-up and top-down information being passed about a particular proposal as shown by the gray arrows in Fig. 3. In Fig. 4, the sub-parts of the frame are predicted in the top-down phase from the frame node (blue arrows); at the same time, they are also proposed in the bottom-up phase based on the triangles we detected (red arrows). Proposals with bidirectional supports such as these are more likely to be accepted. After one particle is accepted from the Open List, any other overlapping particles should update accordingly.
      Template match: The pre-defined part templates, such as the bicycle frames or teapot bodies, are represented by sub-sketch-graphs, which are composed of a set of linked edgelets and junctions. Once a template is proposed and placed at a location with initial attributes, the template matching process is then activated. As shown in
      PDF rendering
      ▪ .doc, .ppt rendering to .pdf keeps minimal structure formatting. Geared towards visual fidelity
      ▪ Often .pdf is created by “screen scraping” or scanning or hybrid ways that do not keep structure information.
      Multi-modality: extremely rich information
      ▪ Images + Text + Tables both co-exist as well as form nested hierarchies possibly with several levels
      [Figure labels:]
      – Nested table (numeric and non-numeric + image)
      – Tabular representation of images with pictorial cross reference
      – Images + captions + cross references and text that comments the image
  • 11. Two major approaches to tackling PDF Processing
      ▪ Unsupervised Learning and out of the box PDF processing
          – Works well for a large class of domains with some compromise in quality
      ▪ Supervised Learning with a graphical labelling tool
          – Potential for improved quality when many similar documents are available
      Both approaches can be used together
  • 13. Learned Semantic Document Representation © 2017, IBM Corporation
  • 16. PDF Conversion Architecture: ML-based PDF Conversion Pipeline
      [Diagram labels, Programmatic PDF Extraction: Programmatic PDF; PDFBox API: Parse PDF Document; Cleanse Raw PDF Data]
      [Diagram labels, Scanned & Hybrid PDF Extraction: Scanned PDF; Hybrid PDF; OCR Engine API: Scan PDF Document; Cleanse Raw OCR Output]
      [Diagram labels, pipeline stages: Composite Unit / Region Identification; Table Identification; Chart Identification; Metadata Identification; Table Structure Population; Layout + Reading Order Inference; HTML Generation; HTML]
      [Legend: Open Source or Commercial Software; Research Extensions]
      • ML-based PDF conversion Pipeline is source-independent
      • SAME ML-based algorithms can be applied directly to data extracted from either scanned or programmatic PDF
      • PDF Conversion ML algorithms are unsupervised; thus achieve stated performance out-of-box with NO training / tuning data required
      • Deployable in Cloud for document-at-a-time processing service
      • Scanned PDF processing available now using Datacap OCR engine
      • Extension using Tesseract engine
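The source-independence claim on this slide can be sketched as a single list of stages that runs unchanged on tokens from either extraction path. This is a minimal illustration, not the IBM pipeline: the `Document` class, the stage functions, and the stage order are assumptions made here (the slide's flattened diagram does not give the order), and every stage except cleansing and HTML generation is a stand-in that only records that it ran.

```python
import html
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Document:
    """Hypothetical working state handed from one stage to the next."""
    source: str                       # "programmatic" or "scanned"
    tokens: List[str]                 # raw tokens from a PDF parser or an OCR engine
    artifacts: Dict[str, object] = field(default_factory=dict)
    trace: List[str] = field(default_factory=list)


Stage = Callable[[Document], Document]


def cleanse(doc: Document) -> Document:
    # Strip whitespace and drop empty tokens; real cleansing would be richer.
    doc.tokens = [t.strip() for t in doc.tokens if t.strip()]
    doc.trace.append("cleanse")
    return doc


def placeholder(name: str) -> Stage:
    # Build a stand-in stage that only records that it ran (no real inference).
    def run(doc: Document) -> Document:
        doc.artifacts.setdefault(name, [])
        doc.trace.append(name)
        return doc
    return run


def generate_html(doc: Document) -> Document:
    # Emit a trivial HTML paragraph; a real stage would use the inferred structure.
    doc.artifacts["html"] = "<p>" + html.escape(" ".join(doc.tokens)) + "</p>"
    doc.trace.append("html_generation")
    return doc


# Stage names follow the slide's labels; the ordering is an assumption.
PIPELINE: List[Stage] = [
    cleanse,
    placeholder("region_identification"),
    placeholder("table_identification"),
    placeholder("chart_identification"),
    placeholder("metadata_identification"),
    placeholder("table_structure_population"),
    placeholder("reading_order_inference"),
    generate_html,
]


def convert(doc: Document) -> Document:
    """Run every stage in order; nothing here branches on doc.source."""
    for stage in PIPELINE:
        doc = stage(doc)
    return doc


programmatic = convert(Document("programmatic", ["Revenue ", "", " 2015"]))
scanned = convert(Document("scanned", ["Revenue", "2015"]))
```

Both calls go through the same eight stages; only the upstream extraction (PDF parser versus OCR engine) would differ in a real system.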
  • 17. PDF Conversion Downstream Analytics Example
      • HTML output from WCS PDF Conversion is directly consumable by downstream analytics
      • PDF Conversion Table processing example:
      [Diagram labels: PDF; WCS PDF Conversion; HTML; Table Processing; Structured Facts from Table; Watson Knowledge Graph; NLQ Answering; Answer]
      [Panel 1: Original Scanned PDF table]
      [Panel 2: HTML generated from current PDF Conversion Web service]
          Bridge          Designer              Length
          Brooklyn        J. A. Roebling        1595
          Manhattan       G. Lindenthal         1470
          Queensborough   Palmer & Hornbostel   1182
      [Panel 3: Structured facts from existing Table Processing Libraries (with appropriate customization)]
      [Panel 4: NLQ Answering] Who designed Brooklyn Bridge? → J. A. Roebling …
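The table-to-answer path on this slide can be sketched with the Python standard library. The HTML markup below is an assumed shape for the conversion output (the deck does not show the actual markup), the table values are the ones on the slide, and the mapping from the question "Who designed Brooklyn Bridge?" to the pair `("Brooklyn", "Designer")` is hard-coded here; in the deck that step belongs to the NLQ answering component. `TableParser`, `table_to_facts`, and `answer` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

# Assumed shape of the converter's HTML output; values are taken from the slide.
HTML_TABLE = """
<table>
  <tr><th>Bridge</th><th>Designer</th><th>Length</th></tr>
  <tr><td>Brooklyn</td><td>J. A. Roebling</td><td>1595</td></tr>
  <tr><td>Manhattan</td><td>G. Lindenthal</td><td>1470</td></tr>
  <tr><td>Queensborough</td><td>Palmer &amp; Hornbostel</td><td>1182</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by <tr> row."""

    def __init__(self) -> None:
        super().__init__()
        self.rows: List[List[str]] = []
        self._cell: Optional[List[str]] = None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self.rows.append([])
        elif tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self.rows[-1].append("".join(self._cell).strip())
            self._cell = None


Fact = Tuple[str, str, str]  # (subject, attribute, value)


def table_to_facts(rows: List[List[str]]) -> List[Fact]:
    # Treat the first column as the subject and each other column as an attribute.
    header, *body = rows
    return [(row[0], col, val)
            for row in body
            for col, val in zip(header[1:], row[1:])]


def answer(facts: List[Fact], subject: str, attribute: str) -> Optional[str]:
    # Case-insensitive lookup of a single fact; returns None when nothing matches.
    for s, a, v in facts:
        if s.lower() == subject.lower() and a.lower() == attribute.lower():
            return v
    return None


parser = TableParser()
parser.feed(HTML_TABLE)
facts = table_to_facts(parser.rows)
designer = answer(facts, "Brooklyn", "Designer")
```

Treating the first column as the subject is a simplification that suits this table; tables with nested row headers or multi-row column headers would need the header structure resolved first.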
  • 19. Document Structure Model (DSM) - Draft
      [Diagram: Logical Data Model and Ontology Representation. Node labels: Scan PDF, Prog PDF, Page, PageColumn, Paragraph, TextLine, Phrase, Token, Character, PageChart, Table, TableCell, Graphical Line, Embedded Image. Attribute labels: Color, displayOrder, rowSpan, colSpan, BoundBoxCoords, contents. Relations are marked [1…n], which means ordered list. All objects have a BoundingBox attribute.]
      • Goal: Define common document structure ideal for subsequent semantic processing
      • Captures both raw extracted information (text, vector graphics) along with inferred artifacts (tables, charts, paragraphs)
      • Start with PDF documents and extend to other formats such as Word and Excel
      • DSM Schema in OWL, Serializations to HTML, JSON...
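A small part of such a model can be sketched with Python dataclasses to show the ideas the slide names: every object carries a bounding box, children are ordered lists, table cells carry `rowSpan` and `colSpan`, and the whole structure serializes to JSON. The class names follow the slide's labels, but the containment hierarchy, the field names other than `rowSpan`, `colSpan`, and `contents`, and the coordinate convention are assumptions, since the flattened diagram does not show them. The actual DSM schema is stated to be in OWL, which this sketch does not attempt.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class BoundingBox:
    # Corner coordinates; the origin and units are assumptions.
    x0: float
    y0: float
    x1: float
    y1: float


@dataclass
class Token:
    text: str
    bbox: BoundingBox


@dataclass
class TextLine:
    tokens: List[Token]              # ordered list, as [1…n] on the slide
    bbox: BoundingBox


@dataclass
class Paragraph:
    lines: List[TextLine]
    bbox: BoundingBox


@dataclass
class TableCell:
    contents: str
    bbox: BoundingBox
    rowSpan: int = 1
    colSpan: int = 1


@dataclass
class Table:
    cells: List[TableCell]
    bbox: BoundingBox


@dataclass
class Page:
    number: int
    bbox: BoundingBox
    paragraphs: List[Paragraph] = field(default_factory=list)
    tables: List[Table] = field(default_factory=list)


# Build a one-page example with a single header cell spanning two columns.
header = TableCell("Revenue", BoundingBox(100, 50, 300, 70), colSpan=2)
word = Token("Summary", BoundingBox(72, 20, 130, 32))
line = TextLine([word], BoundingBox(72, 20, 130, 32))
page = Page(
    number=1,
    bbox=BoundingBox(0, 0, 612, 792),
    paragraphs=[Paragraph([line], BoundingBox(72, 20, 130, 32))],
    tables=[Table([header], BoundingBox(100, 50, 300, 200))],
)

# asdict() recurses through nested dataclasses and lists, so one call is enough.
serialized = json.dumps(asdict(page), indent=2)
```

Keeping raw items (tokens) and inferred items (paragraphs, tables) in the same tree mirrors the slide's point that the model captures both extracted information and inferred artifacts.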
  • 20. Financial Services: Query from Unstructured Data. Financial Documents (.pdf, .html, docx…) → Ingest → “Show me revenues for Citibank between 2009 and 2015”. [Figure: chart with year ticks 2009 to 2015 and value ticks 0 to 150] © 2017, IBM Corporation