DATA VIRTUALIZATION & INFORMATION AS A SERVICE (IAAS)
By Anil Allewar
Senior Solutions Architect - Synerzip
1
About Me!!
2
Anil Allewar
Senior Solutions Architect @ Synerzip
Technology Evangelist & speaker
Core interests: JEE, EAI, EII
Agenda
3
• Use cases
• What does it mean?
• Architecture explained
• Implementation Frameworks
• Demo
• Questions?
Why it makes sense?
4
Use Cases
5
[Diagram labels: Data Warehouse, Data Mart, ETL (x3), Financial Data, OLTP Data, 3rd Party Data, Web Service 1, Web Service 2, Legacy Data, Custom Program, Excel files]
Traditional Data Integration
6
[Diagram labels: Source System (x2), ETL (x2), Enterprise Information System, Business Applications]
Problems with ETL
7
• More than 1 copy of data for staging
• Intermediate data => Errors
• Lead time to add new source
• Domain knowledge for mapping
• Batch Process => No real time data
Problems with DBMS consolidation
8
• Alternate approach => Single EIS (say RDBMS)
• Extensive changes to existing apps
• Might not satisfy everyone’s requirements
Data Virtualization & Federation
10
• Single API to access data
• Only metadata stored at virtualization layer
• Real-time access without copying/moving data
• Federate data across heterogeneous/homogeneous sources
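The points above can be illustrated with a small Python sketch. It is not Teiid code: the `VirtualLayer` class, its `register` and `query` methods, and the two sample sources are hypothetical names chosen for illustration. The layer keeps only metadata (a name and a way to reach each source), so every read reflects the current state of the source.

```python
from typing import Callable, Dict, Iterable, List

Row = Dict[str, object]


class VirtualLayer:
    """Hypothetical virtualization layer: holds metadata, never row data."""

    def __init__(self) -> None:
        # Metadata only: virtual table name -> function that reads the source.
        self._sources: Dict[str, Callable[[], Iterable[Row]]] = {}

    def register(self, name: str, reader: Callable[[], Iterable[Row]]) -> None:
        self._sources[name] = reader

    def query(self, name: str) -> List[Row]:
        # Rows are fetched from the source at query time, not from a copy.
        return [dict(row) for row in self._sources[name]()]


# Two stand-in sources with different shapes: a list of rows and a mapping.
crm_rows = [{"id": 1, "name": "Asha"}]
erp_stock = {"widget": 12}

layer = VirtualLayer()
layer.register("customers", lambda: crm_rows)
layer.register("stock", lambda: ({"item": k, "qty": v} for k, v in erp_stock.items()))

before = layer.query("stock")
erp_stock["widget"] = 7        # the source changes...
after = layer.query("stock")   # ...and the next query sees the new value
print(before, after)
```

Because nothing is staged, there is no batch window between a change in the source and its visibility to the caller, which is the contrast with the ETL approach described earlier.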
Data Virtualization
11
Architecture
13
[Diagram labels: User Application, Common Access API, Runtime & Query Engine, Virtual Database, Translator 1, Translator 2, Connector 1, Connector 2]
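A rough Python sketch of how the pieces named in the architecture diagram might fit together. The class names mirror the diagram labels, but they are hypothetical and not part of any Teiid API; the "sources" are plain Python lists so the example stays self-contained.

```python
from typing import Dict, List, Tuple

Row = Dict[str, object]


class Connector:
    """Talks to one physical source; here the source is just a Python list."""

    def __init__(self, raw_records: list) -> None:
        self._raw = raw_records

    def fetch(self) -> list:
        return list(self._raw)


class TupleTranslator:
    """Turns (id, name) tuples from a source into the engine's row format."""

    def to_rows(self, raw: list) -> List[Row]:
        return [{"id": r[0], "name": r[1]} for r in raw]


class DocTranslator:
    """Turns document-style records ({'_id': ..., 'n': ...}) into rows."""

    def to_rows(self, raw: list) -> List[Row]:
        return [{"id": d["_id"], "name": d["n"]} for d in raw]


class QueryEngine:
    """Common access point: routes each request to a translator/connector pair."""

    def __init__(self) -> None:
        self._models: Dict[str, Tuple[object, Connector]] = {}

    def add_source(self, model: str, translator, connector: Connector) -> None:
        self._models[model] = (translator, connector)

    def select_all(self, model: str) -> List[Row]:
        translator, connector = self._models[model]
        return translator.to_rows(connector.fetch())


engine = QueryEngine()
engine.add_source("relational", TupleTranslator(), Connector([(1, "Asha")]))
engine.add_source("documents", DocTranslator(), Connector([{"_id": 2, "n": "Ben"}]))

# The application uses one API regardless of the underlying source format.
rows = engine.select_all("relational") + engine.select_all("documents")
print(rows)
```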
Vendors
15
• Commercial Products
  • Composite Software
    http://www.compositesw.com/data-virtualization/
  • Denodo
    http://www.denodo.com/en/product/overview.php?n=h
  • IBM
    http://www-03.ibm.com/software/products/en/ibminfofedeserv
  • Informatica
    http://www.informatica.com/us/data-virtualization/
  • Red Hat
    http://www.redhat.com/products/jbossenterprisemiddleware/data-virtualization/
• Open Source
  • JBoss Teiid
    http://teiid.jboss.org/
Selected Platform – JBoss Teiid
16
• Open Source
• Supports a number of relational/NoSQL/ERP/CRM data stores
• JEE standards
• Add custom EIS support using JEE components
• Active & responsive community
• Synerzip contribution: Defect discovery, root cause analysis, feature verification
Teiid Components
17
• Virtual Database
  • Container for components used to integrate data from multiple data sources
• Source Models
  • Structure and characteristics of physical data sources
• View Models
  • Structure and characteristics of the abstract structures you want to expose to your applications
• Teiid Designer
  • Eclipse-based UI to dynamically discover data source objects and apply data federation
  • Generates a virtual database from 1 or more sources
Teiid Components
18
• Translator
  • Provides an abstraction layer between the Teiid Query Engine and the source system
  • Converts Teiid SQL commands to source-specific execution commands
  • Converts result data from the source system to the Teiid-specific format
• Resource Adapter
  • Provides connectivity to the physical data source
  • Integration provided through the Java Connector Architecture (JCA) API
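To make the translator's first job concrete, here is a hedged Python sketch that converts one engine-level equality filter into two source-specific commands. Real Teiid translators are Java classes; the functions `to_sql` and `to_mongo_filter` and the `customers`/`city` names are assumptions made for illustration only.

```python
from typing import Dict, List, Tuple


def to_sql(table: str, column: str, value: object) -> Tuple[str, List[object]]:
    """Render an equality filter as parameterized SQL for a relational source."""
    # Identifiers are assumed to come from trusted metadata, not user input.
    return f"SELECT * FROM {table} WHERE {column} = ?", [value]


def to_mongo_filter(field: str, value: object) -> Dict[str, object]:
    """Render the same filter as a MongoDB-style query document."""
    return {field: value}


sql, params = to_sql("customers", "city", "Pune")
mongo_filter = to_mongo_filter("city", "Pune")
print(sql, params, mongo_filter)
```

The same logical request ends up as a SQL string with a bound parameter for one source and as a filter document for the other, which is the kind of conversion the translator bullet describes.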
Teiid – Supported EIS
19
• Amazon SimpleDB
• Apache Accumulo
• Apache SOLR
• Cassandra
• File
• Google Spreadsheet
• JPA
• LDAP
• Excel – as file
• SalesForce
• JDBC
  • MS Access, DB2, Derby, excel-odbc, Greenplum, H2, Hive (for accessing Hadoop), Oracle, Teradata and most RDBMS
• MongoDB
• Object
• OData
• OLAP
• Web Services
• SAP Netweaver Gateway
Performance Characteristics
20
• Access same data using Oracle and Teiid drivers
  • Retrieval times comparable when accessing tables having no Blobs
[Chart: "No. of rows Vs Time: No Blobs"; axes: No. of rows, ms; series: Oracle-JDBC, Teiid-JDBC]
Performance Characteristics
21
• Teiid slower when accessing Blob data
  • Can be tuned
[Chart: "No. of rows Vs Time: Blobs"; axes: No. of rows, ms; series: Oracle-JDBC, Teiid-JDBC; data labels as extracted: 0, 0, 2, 42, 21,804, 32,531, 185,454]
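A comparison of this kind could be set up with a small timing harness. The Python sketch below is only an outline of the measurement approach: `time_fetch` and the two `fake_*` functions are hypothetical stand-ins for reading the same rows through the source's own driver and through the virtualization layer, and the row counts are arbitrary. It does not reproduce the figures from the charts.

```python
import time
from typing import Callable, Dict, Iterable, List


def time_fetch(fetch: Callable[[int], Iterable], row_counts: List[int]) -> Dict[int, float]:
    """Return elapsed milliseconds for fully consuming fetch(n), per row count."""
    timings: Dict[int, float] = {}
    for n in row_counts:
        start = time.perf_counter()
        consumed = sum(1 for _ in fetch(n))  # drain the whole result set
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        assert consumed == n, "fetch returned an unexpected number of rows"
        timings[n] = elapsed_ms
    return timings


def fake_direct_fetch(n: int) -> Iterable:
    """Stand-in for reading n rows through the source's own driver."""
    return ((i, "x") for i in range(n))


def fake_virtual_fetch(n: int) -> Iterable:
    """Stand-in for reading the same n rows through the virtualization layer."""
    return ((i, "x") for i in range(n))


counts = [10, 100, 1000]
direct = time_fetch(fake_direct_fetch, counts)
virtual = time_fetch(fake_virtual_fetch, counts)
for n in counts:
    print(f"{n} rows: direct={direct[n]:.3f} ms, virtual={virtual[n]:.3f} ms")
```

Draining the full result set inside the timed region matters for Blob columns in particular, since drivers may defer reading large values until they are accessed.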
Demo
23
[Diagram labels: JDBC Client, JDBC API, Teiid Runtime & Query Engine, Federated VDB, MySQL Translator, MongoDB Translator, RDBMS Resource Adapter, MongoDB Resource Adapter, MySQL]
Demo-Steps
24
• Pre-requisites
  • MySQL server 5.5+ installed
  • MongoDB 2.4.x+ installed
• Steps
  • Load the MySQL and MongoDB databases with sample data
  • Set up the environment – JBoss, Eclipse
  • Create a Teiid project in Eclipse using Teiid Designer
  • Import the source model using JDBC
  • Create the virtual model and federate data from the source model
  • Create a virtual database (VDB) and deploy it to JBoss
  • Access data using a JDBC client or through a browser using OData
Demo – Scenario
25
Federated
Data
Demo – Connection Profile
26
Demo – Source Model
27
Demo - Source Model Generation
28
Demo – Map Source To View
29
Demo - Association
30
Demo – Data Federation
31
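The idea behind the federation step can be tried out without the full JBoss/Teiid setup. The following is a standard-library Python stand-in, not the demo itself: an in-memory `sqlite3` database plays the role of the relational (MySQL) source, a list of dictionaries plays the role of a MongoDB collection, and the `customers`/`orders` schema and the `federated_view` function are assumptions that do not come from the demo repository.

```python
import sqlite3
from typing import Dict, List

# Relational stand-in (the demo uses MySQL; sqlite3 keeps this self-contained).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)", [(1, "Asha"), (2, "Ben")])

# Document stand-in (the demo uses MongoDB): one dict per document.
orders = [
    {"customer_id": 1, "item": "laptop"},
    {"customer_id": 1, "item": "mouse"},
    {"customer_id": 2, "item": "phone"},
]


def federated_view() -> List[Dict[str, object]]:
    """Join relational customers with document orders on the customer id."""
    names = dict(conn.execute("SELECT id, name FROM customers"))
    return [
        {"customer": names[o["customer_id"]], "item": o["item"]}
        for o in orders
        if o["customer_id"] in names
    ]


view = federated_view()
print(view)
```

In the actual demo this join would be defined once in a view model and executed by the Teiid query engine, so the client only issues a single query against the VDB.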
Demo – Source Code
32
• Source code
  • https://github.com/anilallewar/JBoss-Teiid
• Contains
  • Configuration files
  • Instructions
  • “How-to” videos
  • VDBs, source models and view models
Conclusion
33
• Data Virtualization and Federation is a rapidly emerging technology that addresses traditional BI/ETL problems.
• It lowers time to market, distributes data across the enterprise as a service, and provides real-time access to enterprise data.
Data virtualization, Data Federation & IaaS with Jboss Teiid


Editor's Notes

  • #8:
      • Requires more than 1 copy of data for staging
      • Creating, storing and manipulating this intermediate data can lead to errors in data quality
      • Lead time is required to add data from new sources
      • Depends on domain knowledge for mapping entities between different data sources
      • Batch processing – information lags behind real-time data
  • #9:
      • The alternate approach is to move all enterprise data to a common Enterprise Information System (typically an RDBMS)
      • Requires extensive changes to existing applications, resulting in end user impact
      • Might not satisfy every group’s requirements – say group 1 has partitioned data but the target RDBMS doesn’t support partitioning
  • #11:
      • Single API to access data from heterogeneous sources
      • Only metadata is stored at the virtualization layer
      • Real-time access to data without copying/moving data from the source Enterprise Information System (EIS)
      • Federate data across multiple heterogeneous/homogeneous sources
      • An enterprise information system (EIS) is any kind of information system that improves the functions of an enterprise's business processes through integration. An EIS could use a database, web service, flat files or any other custom system for storing this information.
  • #17: JBoss Teiid
      • Open Source
      • Supports a number of relational and non-relational data sources
      • Integrated with the JBoss Application Server and JEE architecture
      • Ability to add custom data sources using standard JEE components
      • Very active and responsive community
  • #20:
      • Amazon SimpleDB - web service for running queries on structured data in real time
      • Apache Accumulo - sorted, distributed key value store
      • Apache SOLR - search system for indexing data/services
      • Cassandra - NoSQL database
      • File - exposes stored procedures to leverage file system resources
      • JPA - reverse a JPA object model into a relational model
      • LDAP - exposes an LDAP directory tree relationally
      • MongoDB - NoSQL database
      • Object - reading Java objects from external sources (i.e., Infinispan Cache or Map cache)
      • OData - consume OData web services and also act as web server to expose VDB as an OData service
      • OLAP - online analytical processing exposing data as 3-D arrays called cubes
      • SalesForce - CRM product
      • SAP Netweaver Gateway - Web service calls to SAP
      • Web Services - exposes stored procedures for calling web services