Pentaho Data Integration 4
and MySQL
Matt Casters:
Pentaho's Chief of Data Integration
Kettle Project Founder
MySQL User Conference, Tuesday April 13th, 2010
Agenda
Pentaho: an introduction
Pentaho Data Integration
Version 4: New features
MySQL support in PDI
Q&A
Pentaho Introduction
Commercial open source alternative for business intelligence (BI)
Founded in 2004: Pioneer in commercial open source BI
Large referenceable customer base, wide range of BI/DW deployments
Management - proven BI and open source veterans
Business Objects, Cognos, Hyperion, JBoss, Oracle, Red Hat, SAS, SugarCRM
Board of Directors – deep expertise and proven success in open source
Bob Bearden - Executive chairman of the board (former SpringSource)
Larry Augustin - founder, VA Software, helped coin the phrase “open source”
Zack Urlocker – VP of Products, MySQL/Oracle
Benchmark Capital, Index Ventures, New Enterprise Associates
Widely recognized as the leader in open source BI
Pentaho Introduction
Complete Business Intelligence Suite
End-to-end coverage of all BI needs
Standards-based, modular, standalone or embeddable platform
Open Source Licensing
Lower software acquisition costs
Lower Total Cost of Ownership (TCO)
Enterprise Development Methodology
Transparent, detailed roadmap
Product roadmap and contributions managed by Pentaho
Core developers are Pentaho employees
Extensive QA
Expert Services
Comprehensive Training, Consulting, Enterprise service offerings
Delivered by the Experts
Pentaho Introduction – Enterprise Edition
Pentaho Introduction – Deployments
Wide range of deployments
Reporting
Data Integration / ETL
Dashboards
Full BI Suite
Thousands of users
3,000 on a single server
Large data volumes
Half a terabyte of live interactive OLAP data
ETL loading 300K rows/second
Sophisticated applications
Hundreds of dimensions
Small deployments as well
20 users, MS Access databases
Pentaho Introduction – Technology
Componentized and modular
Service-implemented architecture
Built “from the ground up” as a set of services
Exposed via AJAX and Web Services
100% Java EE server side
Scalable, standards-based
Web-based, thin-client end user interfaces
Graphical design interfaces
Embedded process workflow engine
Pentaho Introduction – Reporting
Access and format data from disparate sources
RDBMS, XML, OLAP
Produce in popular formats
Multiple report types
Operational
Analytical
Financial
Parameterized
Go directly against data sources or Pentaho’s centralized metadata layer
Pentaho Introduction – Analysis
Navigate and explore
Ad hoc, interactive analysis
Drill into further detail
Select specific members for analysis
View data “dimensionally”
e.g. Sales by region, by channel, by time period
ROLAP architecture
Works with all popular open source and proprietary DBs
No intermediate storage
Aggregate table “aware” for faster analytic queries
Design tools to build OLAP schemas and improve query performance
Pentaho Introduction – Dashboards
Gain visibility into your organization’s key performance indicators (KPIs)
Monitor top-level performance and drill into supporting detail
Illuminate metrics for quick insight into business activities
Track exceptions and receive alerts
Leverage the full Pentaho BI Suite
Comprehensive auditing of user activity, performance and data access
Context-sensitive drilling to reports and analysis views
Integrated security, scheduling, alerting, portal integration
Integrate with 3rd-party and custom applications
Pentaho Introduction – Dashboard Designer
Web-based end user dashboard creation
From Pentaho User Console
“Zero training”
Template and theme-based creation
Incorporate reports, analysis views, Adobe Flash-based charts and other Pentaho content
Create new charts and interactive data grids from scratch
Pentaho metadata – no SQL required
Filter controls
Pentaho Introduction – BI Platform
Provides critical services for end users
Easy access to business information
Intuitive scheduling
Delivery over the web or via email
Alerting and notification
Provides critical services for administrators
Centralized thin-client administration
Data source and security management
Auditing and performance monitoring
Enterprise security integration
Definition and execution of business rules
Integration points with 3rd-party applications
Pentaho Enterprise Console
Pentaho User Console
Pentaho Introduction – Metadata
Provides an abstraction layer between source systems and business user concepts
Graphical design environment for defining metadata model
Data presented to business users in business terms
Allows business users to create their own ad hoc reports based on centralized business rules, without any technical skills or knowledge of SQL
Changes to physical database do not impact reports or analytic views
[Diagram labels: Business User; Business Intelligence Metadata; Automated SQL generation; Physical Database Model]
Pentaho Introduction – Data Mining
Take BI to the next level with predictive analytics
Gain insight into hidden patterns and relationships
Discover indicators of future performance
Exploit correlations to improve organizational performance
Embed recommendations in reports, dashboards, or custom applications
Agenda
Pentaho: an introduction
Pentaho Data Integration
Version 4: New features
MySQL support in PDI
Q&A
Pentaho Data Integration for BI
Business Intelligence!
That's what we do.
Pentaho Data Integration – Kettle
Kettle
Extraction
Transportation
Transformation
Loading
Environment
Pentaho Data Integration – Extraction
Extract data from:
35+ database types
MySQL, PostgreSQL, SQLite, ...
Oracle, SQL Server, etc
Text files
XML files
XLS files
Xbase files (dBase, FoxPro, etc.)
File system information
Generated data
MS Access files
LDAP
Geo-data
...
Pentaho Data Integration – Transportation
Transportation of data
Engine based data transfer (no code generator)
Very flexible pathways:
splitting
partitioning
merging
joining
duplicating
clustering (MPP)
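Two of the pathways listed above, splitting and duplicating, can be pictured with a small sketch. This is only an illustration in plain Python, not Kettle's engine or API: the function names `split_rows` and `duplicate_rows` and the row layout are assumptions made for the example.

```python
from typing import Dict, Iterable, List

Row = Dict[str, object]


def split_rows(rows: Iterable[Row], n_targets: int) -> List[List[Row]]:
    """Distribute rows round-robin so each target receives a share of the stream."""
    targets: List[List[Row]] = [[] for _ in range(n_targets)]
    for index, row in enumerate(rows):
        targets[index % n_targets].append(row)
    return targets


def duplicate_rows(rows: Iterable[Row], n_targets: int) -> List[List[Row]]:
    """Send a full copy of the stream to every target."""
    materialized = list(rows)
    return [list(materialized) for _ in range(n_targets)]


if __name__ == "__main__":
    sample = [{"id": i} for i in range(5)]
    print(split_rows(sample, 2))
    print(duplicate_rows(sample, 2))
```

Splitting divides the work between targets, while duplicating gives every target the complete data set, for example to write the same rows to two different outputs.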
Pentaho Data Integration – Transformation
Flexibly transform data
Looking up data
databases
files
memory...
Calculating
Scripting
JavaScript, SQL, RegExp
Splitting
Mapping
Selecting
Filtering
Pivoting ...
Pentaho Data Integration – Loading
Load data into a target format
Database loads
Data warehouse population
Partitioned loading
Bulk loading
Parallel loading
Clustering
Pentaho Data Integration – Environment
Full GUI called “Spoon” to edit every option in Kettle
Drag & Drop
Debugger
Rich GUI
Command line tools
execute jobs
execute transformations
Web server
clustering
remote execution
Programming API for Java
Plugin eco-system
...
Pentaho Data Integration – Community
Paying Pentaho customers
Large and small corporations
All possible sectors
Lone rangers & hobbyists
All regions on Earth
Meet on our forum: 30,000+ posts in 3 years
Use our JIRA case tracking system
Download more than 10,000 copies of Kettle per month
http://www.ohloh.net/projects/3624?p=Kettle
http://www.softpedia.com/progClean/Kettle-Clean-80094.html
Pentaho Data Integration – use-cases
Load data from text files and store it into a database [demo]
Export data from a database to text files or to other databases
Data migration between database applications
Exploration of data in existing databases (tables, views, etc.)
Information improvement using lookups
Data cleaning
Application integration
Data warehouse population
Report data generation
...
Pentaho Data Integration – Adoption
Wide range of production deployments
Small and medium-sized companies
Large enterprises
Rapid product evolution
Driven by Pentaho investment
Includes significant community contributions
“Contribution-friendly” architecture
Natural fit for additional data sources, targets and transformations
Pentaho Data Integration – Adoption
Most deployed open source data integration solution. Independent study by Mark Madsen of Third Nature and the BeyeNETWORK
Download free study at pentaho.com
Pentaho Data Integration – Links
Homepage: http://kettle.pentaho.org
Forum: http://forums.pentaho.org/forumdisplay.php?f=69
Case tracker: http://jira.pentaho.org/browse/PDI
Continuous Integration Server: http://ci.pentaho.com/job/Kettle
Wiki: http://wiki.pentaho.org/display/EAI
IRC Channel: ##pentaho (on Freenode)
Mailing list: http://groups.google.com/group/kettle-developers
My blog: http://www.ibridge.be
My coordinates: mcasters at pentaho dot org
Agenda
Pentaho: an introduction
Pentaho Data Integration
Version 4 : New features
MySQL support in PDI
Q&A
Version 4: New features - Visualisation
Demo
New welcome screen
Mouse-over slide-outs for icons
Hop creation
Improved error handling configuration
New perspectives support for Agile BI visualisations, modelling, scheduling, etc.
Version 4: New features - Running jobs
Drill down into running job entries
Visual indicators of running and completed job entries
Success and failure mini-icons
Mousing over completion mini-icons shows details of execution results
Log capturing of completed job entries
Version 4: New features - Running transformations
Drill down into running transformation job entries and mappings
Row input/output sniff testing: see what rows are passing (demo)
Remote input/output sniff testing on a Carte server
Version 4: New features - Better logging
Reduced memory consumption
Incremental log updates
Global log buffer size limit for long running jobs/transformations
Interval logging
Auto clean-up of old log records
Log record time-outs & execution lineage
Log record colour coding in Spoon (blue and red for error lines)
Step and job entry level Logging
Execution lineage logging
Renaming individual columns
Global configuration options for all log tables
Version 4: New features - Plugins
Unified plug-in architecture
Easier deployment and packaging
Step, job entry, partitioner, database type, spoon perspective, life-cycle, ... : all pluggable
--> MySQL 5.1 plugin
Version 4: New features - Repositories
Allowing for 3rd-party repositories like the Pentaho Unified Enterprise Repository
Removed dependencies on the relational database repository (still supported, though)
Added support for repositories capable of team development (file locking)
Added support for repositories capable of fine-grained security
Added support for repositories capable of storing and retrieving revision history
Version 4: New features – New steps
SAP Input
Data Grid
OLAP Input (Mondrian, Palo, SSAS, SAP B/W)
Palo Cell Input/Output, Dimension Input/Output
Salesforce Delete, Insert, Update, Upsert
Add fields changing sequence (group sequence)
User Defined Java Class: create your own plugin in Java on the fly in a step
Send information using Syslog: Send a message to a Syslog server.
Java Filter
Memory Group By
Farrago streaming bulk loader
Teradata FastLoad bulk loader
Experimental steps like Get table names, Email messages input, ...
Agenda
Pentaho: an introduction
Pentaho Data Integration
Version 4 : New features
MySQL support in PDI
Q&A
MySQL Support in PDI
JDBC/ODBC Driver Integration
Reading: MySQL Result Streaming (cursor emulation) support
Writing: MySQL dialects for data types
Job entry: Bulk Loader of text files for MySQL
Job entry: Bulk writer to a text file for MySQL
Database Partitioning (Sharding) Demo
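The idea behind the result streaming bullet above is to consume rows as they arrive instead of buffering the whole result set in memory first. The sketch below shows that reading pattern only. It is an assumption-laden stand-in: it uses Python's built-in `sqlite3` module rather than MySQL or PDI, and the `stream_rows` helper and the `sales` table are hypothetical names invented for the example.

```python
import sqlite3
from typing import Iterator, Tuple


def stream_rows(conn: sqlite3.Connection, query: str,
                batch_size: int = 1000) -> Iterator[Tuple]:
    """Yield rows in small batches so only batch_size rows are held at a time."""
    cursor = conn.cursor()
    cursor.execute(query)
    while True:
        batch = cursor.fetchmany(batch_size)
        if not batch:
            break
        for row in batch:
            yield row


if __name__ == "__main__":
    # Hypothetical in-memory table standing in for a large source table.
    connection = sqlite3.connect(":memory:")
    connection.execute("CREATE TABLE sales (id INTEGER, amount REAL)")
    connection.executemany(
        "INSERT INTO sales VALUES (?, ?)", [(i, i * 1.5) for i in range(10)]
    )
    for sales_row in stream_rows(connection, "SELECT id, amount FROM sales", 4):
        print(sales_row)
```

With this pattern the reader's memory use depends on the batch size rather than on the size of the table, which is what makes it attractive for large extracts.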
Database partitioning
[Diagram: Sales table with Year 2003, Year 2004, Year 2005 and Year 2006 partitions, shown across four databases]
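Year-based partitioning of a sales table, as on this slide, amounts to routing each row to a database chosen by its year. The following is a minimal sketch of that routing in plain Python, not PDI's partitioning implementation: the database names in `PARTITION_BY_YEAR`, the `sale_date` field and the `route_sales_rows` function are all assumptions made for illustration.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, Iterable, List

# Hypothetical mapping of sales year to target database name.
PARTITION_BY_YEAR: Dict[int, str] = {
    2003: "DB1",
    2004: "DB2",
    2005: "DB3",
    2006: "DB4",
}


def route_sales_rows(rows: Iterable[dict],
                     partitions: Dict[int, str] = PARTITION_BY_YEAR
                     ) -> Dict[str, List[dict]]:
    """Group rows by the database that owns the year of their sale_date."""
    routed: Dict[str, List[dict]] = defaultdict(list)
    for row in rows:
        year = row["sale_date"].year
        if year not in partitions:
            raise ValueError(f"no partition configured for year {year}")
        routed[partitions[year]].append(row)
    return dict(routed)


if __name__ == "__main__":
    sample = [
        {"sale_date": date(2003, 5, 1), "amount": 10.0},
        {"sale_date": date(2006, 1, 15), "amount": 25.0},
    ]
    print(route_sales_rows(sample))
```

Raising an error for an unmapped year is one possible design choice; a real deployment might instead send such rows to a default partition.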
Questions and Closing
Other Pentaho related User Conference information:
Collapsing BI from Months to Minutes (Agile BI)
Jared Cornelius
Ballroom H
11:55am Tuesday April 13th
MySQL Binary Log Analysis With Pentaho BI
Robert Booth
Ballroom B
5:15pm Wednesday April 14th
The Pentaho Booth 516 in the Exhibition Hall
ETA: September 2010
Ad

More Related Content

What's hot (20)

Introduction To Pentaho
Introduction To PentahoIntroduction To Pentaho
Introduction To Pentaho
pentaho Content
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
Matillion
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflake
Sivakumar Ramar
 
Kettle – Etl Tool
Kettle – Etl ToolKettle – Etl Tool
Kettle – Etl Tool
Dr Anjan Krishnamurthy
 
Data Sharing with Snowflake
Data Sharing with SnowflakeData Sharing with Snowflake
Data Sharing with Snowflake
Snowflake Computing
 
Snowflake Overview
Snowflake OverviewSnowflake Overview
Snowflake Overview
Snowflake Computing
 
Power BI Overview
Power BI Overview Power BI Overview
Power BI Overview
Gal Vekselman
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
Databricks
 
Power bi
Power biPower bi
Power bi
Lakshmi Prasanna Kottagorla
 
Sap bw 4 hana vs sap bw on hana
Sap bw 4 hana vs sap bw on hanaSap bw 4 hana vs sap bw on hana
Sap bw 4 hana vs sap bw on hana
Jasbir Khanuja
 
Power BI Architecture
Power BI ArchitecturePower BI Architecture
Power BI Architecture
Arthur Graus
 
Data catalog
Data catalogData catalog
Data catalog
iamtodor
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
Thang Bui (Bob)
 
Presentation 1 - SSRS (1)
Presentation 1 - SSRS (1)Presentation 1 - SSRS (1)
Presentation 1 - SSRS (1)
Anurag Rana
 
Snowflake Datawarehouse Architecturing
Snowflake Datawarehouse ArchitecturingSnowflake Datawarehouse Architecturing
Snowflake Datawarehouse Architecturing
Ishan Bhawantha Hewanayake
 
Technical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfTechnical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdf
Ilham31574
 
Deeper Insights with Alteryx
Deeper Insights with AlteryxDeeper Insights with Alteryx
Deeper Insights with Alteryx
Phil Budden
 
Power BI Made Simple
Power BI Made SimplePower BI Made Simple
Power BI Made Simple
James Serra
 
Getting Started with Delta Lake on Databricks
Getting Started with Delta Lake on DatabricksGetting Started with Delta Lake on Databricks
Getting Started with Delta Lake on Databricks
Knoldus Inc.
 
285295780-SAP-BW-Info-Provider.ppt
285295780-SAP-BW-Info-Provider.ppt285295780-SAP-BW-Info-Provider.ppt
285295780-SAP-BW-Info-Provider.ppt
ntrnbk
 
Master the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - SnowflakeMaster the Multi-Clustered Data Warehouse - Snowflake
Master the Multi-Clustered Data Warehouse - Snowflake
Matillion
 
An overview of snowflake
An overview of snowflakeAn overview of snowflake
An overview of snowflake
Sivakumar Ramar
 
Free Training: How to Build a Lakehouse
Free Training: How to Build a LakehouseFree Training: How to Build a Lakehouse
Free Training: How to Build a Lakehouse
Databricks
 
Sap bw 4 hana vs sap bw on hana
Sap bw 4 hana vs sap bw on hanaSap bw 4 hana vs sap bw on hana
Sap bw 4 hana vs sap bw on hana
Jasbir Khanuja
 
Power BI Architecture
Power BI ArchitecturePower BI Architecture
Power BI Architecture
Arthur Graus
 
Data catalog
Data catalogData catalog
Data catalog
iamtodor
 
Demystifying data engineering
Demystifying data engineeringDemystifying data engineering
Demystifying data engineering
Thang Bui (Bob)
 
Presentation 1 - SSRS (1)
Presentation 1 - SSRS (1)Presentation 1 - SSRS (1)
Presentation 1 - SSRS (1)
Anurag Rana
 
Technical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdfTechnical Deck Delta Live Tables.pdf
Technical Deck Delta Live Tables.pdf
Ilham31574
 
Deeper Insights with Alteryx
Deeper Insights with AlteryxDeeper Insights with Alteryx
Deeper Insights with Alteryx
Phil Budden
 
Power BI Made Simple
Power BI Made SimplePower BI Made Simple
Power BI Made Simple
James Serra
 
Getting Started with Delta Lake on Databricks
Getting Started with Delta Lake on DatabricksGetting Started with Delta Lake on Databricks
Getting Started with Delta Lake on Databricks
Knoldus Inc.
 
285295780-SAP-BW-Info-Provider.ppt
285295780-SAP-BW-Info-Provider.ppt285295780-SAP-BW-Info-Provider.ppt
285295780-SAP-BW-Info-Provider.ppt
ntrnbk
 

Viewers also liked (20)

كفايات التدريس بالبرمجيات التعليمية
كفايات التدريس بالبرمجيات التعليميةكفايات التدريس بالبرمجيات التعليمية
كفايات التدريس بالبرمجيات التعليمية
AHMED ENNAJI
 
Serie 1 tc semestre 1
Serie 1 tc  semestre 1Serie 1 tc  semestre 1
Serie 1 tc semestre 1
AHMED ENNAJI
 
Serie 4 tc6
Serie 4 tc6Serie 4 tc6
Serie 4 tc6
AHMED ENNAJI
 
Serie 6 2bac sm biof nombres complexes
Serie 6  2bac sm biof  nombres complexesSerie 6  2bac sm biof  nombres complexes
Serie 6 2bac sm biof nombres complexes
AHMED ENNAJI
 
Diaporama logique raisonnement
Diaporama logique raisonnementDiaporama logique raisonnement
Diaporama logique raisonnement
AHMED ENNAJI
 
Serie5( 2bac sm biof)
Serie5( 2bac sm biof)Serie5( 2bac sm biof)
Serie5( 2bac sm biof)
AHMED ENNAJI
 
2bacsm biof (serie2)
2bacsm biof (serie2)2bacsm biof (serie2)
2bacsm biof (serie2)
AHMED ENNAJI
 
Serie 5(derive)
Serie 5(derive)Serie 5(derive)
Serie 5(derive)
AHMED ENNAJI
 
Devoir surveille 1 semestre2
Devoir surveille 1 semestre2Devoir surveille 1 semestre2
Devoir surveille 1 semestre2
AHMED ENNAJI
 
Practice 1
Practice 1Practice 1
Practice 1
AHMED ENNAJI
 
Serie 3(suites et trigonometries1sm biof)
Serie 3(suites et trigonometries1sm biof)Serie 3(suites et trigonometries1sm biof)
Serie 3(suites et trigonometries1sm biof)
AHMED ENNAJI
 
Série 2 (derive)
Série 2 (derive)Série 2 (derive)
Série 2 (derive)
AHMED ENNAJI
 
ennaji ahmed base de donnees
ennaji ahmed base de donneesennaji ahmed base de donnees
ennaji ahmed base de donnees
AHMED ENNAJI
 
Biranzarne glaf du manuel tc bac international
Biranzarne glaf du manuel tc bac internationalBiranzarne glaf du manuel tc bac international
Biranzarne glaf du manuel tc bac international
AHMED ENNAJI
 
Con 1 tc semestre 1
Con 1 tc semestre 1Con 1 tc semestre 1
Con 1 tc semestre 1
AHMED ENNAJI
 
Devoir 1sm biof oumorabiaa semestr1
Devoir 1sm biof oumorabiaa semestr1Devoir 1sm biof oumorabiaa semestr1
Devoir 1sm biof oumorabiaa semestr1
AHMED ENNAJI
 
Serie 6(derive)
Serie 6(derive)Serie 6(derive)
Serie 6(derive)
AHMED ENNAJI
 
Pilotage de l'entreprise
Pilotage de l'entreprisePilotage de l'entreprise
Pilotage de l'entreprise
AHMED ENNAJI
 
Serie 3(derive)
Serie 3(derive)Serie 3(derive)
Serie 3(derive)
AHMED ENNAJI
 
كفايات التدريس بالبرمجيات التعليمية
كفايات التدريس بالبرمجيات التعليميةكفايات التدريس بالبرمجيات التعليمية
كفايات التدريس بالبرمجيات التعليمية
AHMED ENNAJI
 
Serie 1 tc semestre 1
Serie 1 tc  semestre 1Serie 1 tc  semestre 1
Serie 1 tc semestre 1
AHMED ENNAJI
 
Serie 6 2bac sm biof nombres complexes
Serie 6  2bac sm biof  nombres complexesSerie 6  2bac sm biof  nombres complexes
Serie 6 2bac sm biof nombres complexes
AHMED ENNAJI
 
Diaporama logique raisonnement
Diaporama logique raisonnementDiaporama logique raisonnement
Diaporama logique raisonnement
AHMED ENNAJI
 
Serie5( 2bac sm biof)
Serie5( 2bac sm biof)Serie5( 2bac sm biof)
Serie5( 2bac sm biof)
AHMED ENNAJI
 
2bacsm biof (serie2)
2bacsm biof (serie2)2bacsm biof (serie2)
2bacsm biof (serie2)
AHMED ENNAJI
 
Devoir surveille 1 semestre2
Devoir surveille 1 semestre2Devoir surveille 1 semestre2
Devoir surveille 1 semestre2
AHMED ENNAJI
 
Serie 3(suites et trigonometries1sm biof)
Serie 3(suites et trigonometries1sm biof)Serie 3(suites et trigonometries1sm biof)
Serie 3(suites et trigonometries1sm biof)
AHMED ENNAJI
 
ennaji ahmed base de donnees
ennaji ahmed base de donneesennaji ahmed base de donnees
ennaji ahmed base de donnees
AHMED ENNAJI
 
Biranzarne glaf du manuel tc bac international
Biranzarne glaf du manuel tc bac internationalBiranzarne glaf du manuel tc bac international
Biranzarne glaf du manuel tc bac international
AHMED ENNAJI
 
Con 1 tc semestre 1
Con 1 tc semestre 1Con 1 tc semestre 1
Con 1 tc semestre 1
AHMED ENNAJI
 
Devoir 1sm biof oumorabiaa semestr1
Devoir 1sm biof oumorabiaa semestr1Devoir 1sm biof oumorabiaa semestr1
Devoir 1sm biof oumorabiaa semestr1
AHMED ENNAJI
 
Pilotage de l'entreprise
Pilotage de l'entreprisePilotage de l'entreprise
Pilotage de l'entreprise
AHMED ENNAJI
 
Ad

Similar to Pentaho data integration 4.0 and my sql (20)

Sap Bw 3.5 Overview
Sap Bw 3.5 OverviewSap Bw 3.5 Overview
Sap Bw 3.5 Overview
Trevor Prescod
 
Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho
Uday Kothari
 
Pentaho Suite Analysis
Pentaho Suite Analysis Pentaho Suite Analysis
Pentaho Suite Analysis
Kymberly Grayson-Perry
 
P6 analytics
P6 analyticsP6 analytics
P6 analytics
InSync Conference
 
BI Reporting Application Comparison
BI Reporting Application ComparisonBI Reporting Application Comparison
BI Reporting Application Comparison
Scott Mitchell
 
Pentaho-BI
Pentaho-BIPentaho-BI
Pentaho-BI
Edureka!
 
Webinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroWebinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence Intro
SpagoWorld
 
Pentaho Partner Program Info
Pentaho Partner Program InfoPentaho Partner Program Info
Pentaho Partner Program Info
Sharmila Wijeyakumar
 
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
PwC
 
The Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewThe Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture View
DataWorks Summit/Hadoop Summit
 
Webinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroWebinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence Intro
SpagoWorld
 
Power bi introduction
Power bi introductionPower bi introduction
Power bi introduction
Bishwadeb Dey
 
Complete Business Intelligence Solution for Your Microsoft Platform
Complete Business Intelligence Solution for Your Microsoft PlatformComplete Business Intelligence Solution for Your Microsoft Platform
Complete Business Intelligence Solution for Your Microsoft Platform
www.panorama.com
 
SAP - Business Objects - Ri happy
SAP - Business Objects - Ri happySAP - Business Objects - Ri happy
SAP - Business Objects - Ri happy
Douglas Bernardini
 
Open Source Solution
Open Source SolutionOpen Source Solution
Open Source Solution
ittishait
 
Hadoop uk user group meeting final
Hadoop uk user group meeting finalHadoop uk user group meeting final
Hadoop uk user group meeting final
Skills Matter
 
Oracle BI 11g Insync presentation
Oracle BI 11g Insync presentationOracle BI 11g Insync presentation
Oracle BI 11g Insync presentation
InSync Conference
 
Powerbi presentation from Microsoft Corporation
Powerbi presentation from Microsoft CorporationPowerbi presentation from Microsoft Corporation
Powerbi presentation from Microsoft Corporation
EngineerMBA1
 
P6 analytics r1 vp public 001
P6 analytics r1 vp public 001P6 analytics r1 vp public 001
P6 analytics r1 vp public 001
Mark Kromer
 
Sap bw bi
Sap bw biSap bw bi
Sap bw bi
trainer4ss
 
Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho Business Intelligence and Big Data Analytics with Pentaho
Business Intelligence and Big Data Analytics with Pentaho
Uday Kothari
 
BI Reporting Application Comparison
BI Reporting Application ComparisonBI Reporting Application Comparison
BI Reporting Application Comparison
Scott Mitchell
 
Pentaho-BI
Pentaho-BIPentaho-BI
Pentaho-BI
Edureka!
 
Webinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroWebinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence Intro
SpagoWorld
 
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
Apache Hadoop Summit 2016: The Future of Apache Hadoop an Enterprise Architec...
PwC
 
The Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture ViewThe Future of Apache Hadoop an Enterprise Architecture View
The Future of Apache Hadoop an Enterprise Architecture View
DataWorks Summit/Hadoop Summit
 
Webinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence IntroWebinar: Open Source Business Intelligence Intro
Webinar: Open Source Business Intelligence Intro
SpagoWorld
 
Power bi introduction
Power bi introductionPower bi introduction
Power bi introduction
Bishwadeb Dey
 
Complete Business Intelligence Solution for Your Microsoft Platform
Complete Business Intelligence Solution for Your Microsoft PlatformComplete Business Intelligence Solution for Your Microsoft Platform
Complete Business Intelligence Solution for Your Microsoft Platform
www.panorama.com
 
SAP - Business Objects - Ri happy
SAP - Business Objects - Ri happySAP - Business Objects - Ri happy
SAP - Business Objects - Ri happy
Douglas Bernardini
 
Open Source Solution
Open Source SolutionOpen Source Solution
Open Source Solution
ittishait
 
Hadoop uk user group meeting final
Hadoop uk user group meeting finalHadoop uk user group meeting final
Hadoop uk user group meeting final
Skills Matter
 
Oracle BI 11g Insync presentation
Oracle BI 11g Insync presentationOracle BI 11g Insync presentation
Oracle BI 11g Insync presentation
InSync Conference
 
Powerbi presentation from Microsoft Corporation
Powerbi presentation from Microsoft CorporationPowerbi presentation from Microsoft Corporation
Powerbi presentation from Microsoft Corporation
EngineerMBA1
 
P6 analytics r1 vp public 001
P6 analytics r1 vp public 001P6 analytics r1 vp public 001
P6 analytics r1 vp public 001
Mark Kromer
 
Ad

More from AHMED ENNAJI (20)

Discipline positive
Discipline positiveDiscipline positive
Discipline positive
AHMED ENNAJI
 
Controle3 sur table elbilia tc1
Controle3 sur table elbilia tc1Controle3 sur table elbilia tc1
Controle3 sur table elbilia tc1
AHMED ENNAJI
 
Controle 1sur table tc1 elbilia nnaji
Controle 1sur table tc1 elbilia nnajiControle 1sur table tc1 elbilia nnaji
Controle 1sur table tc1 elbilia nnaji
AHMED ENNAJI
 
Controle2 sur table elbilia tc1
Controle2 sur table elbilia tc1Controle2 sur table elbilia tc1
Controle2 sur table elbilia tc1
AHMED ENNAJI
 
Contr 1 om pc biof decembre
Contr 1 om pc biof decembreContr 1 om pc biof decembre
Contr 1 om pc biof decembre
AHMED ENNAJI
 
Contr 2 om pc biof (janvier)2
Contr 2 om pc biof (janvier)2Contr 2 om pc biof (janvier)2
Contr 2 om pc biof (janvier)2
AHMED ENNAJI
 
Contr 3 om pc biof (janvier)
Contr 3 om pc biof (janvier)Contr 3 om pc biof (janvier)
Contr 3 om pc biof (janvier)
AHMED ENNAJI
 
Diagnos1
Diagnos1Diagnos1
Diagnos1
AHMED ENNAJI
 
Serie 1espace
Serie 1espaceSerie 1espace
Serie 1espace
AHMED ENNAJI
 
Devoir surveille 1 2 bac pc 2019
Devoir surveille 1  2 bac pc 2019Devoir surveille 1  2 bac pc 2019
Devoir surveille 1 2 bac pc 2019
AHMED ENNAJI
 
Exercice sur les fonctions exponentielles pour 2 bac pc
Exercice sur les fonctions exponentielles pour 2 bac pcExercice sur les fonctions exponentielles pour 2 bac pc
Exercice sur les fonctions exponentielles pour 2 bac pc
AHMED ENNAJI
 
Bac blanc3 oum
Bac blanc3 oumBac blanc3 oum
Bac blanc3 oum
AHMED ENNAJI
 
Bac blanc 11
Bac blanc 11Bac blanc 11
Bac blanc 11
AHMED ENNAJI
 
Examen blanc 7
Examen blanc 7Examen blanc 7
Examen blanc 7
AHMED ENNAJI
 
Bac blanc 5
Bac blanc 5Bac blanc 5
Bac blanc 5
AHMED ENNAJI
 
Bac blanc 6
Bac blanc 6Bac blanc 6
Bac blanc 6
AHMED ENNAJI
 
Bac blanc 10
Bac blanc 10Bac blanc 10
Bac blanc 10
AHMED ENNAJI
 
Bac blan 8 pc biof
Bac blan 8 pc biofBac blan 8 pc biof
Bac blan 8 pc biof
AHMED ENNAJI
 
Exercice bac pc1
Exercice bac pc1Exercice bac pc1
Exercice bac pc1
AHMED ENNAJI
 
Devoir surveille 1 semestre2 1sm om
Devoir surveille 1 semestre2 1sm omDevoir surveille 1 semestre2 1sm om
Devoir surveille 1 semestre2 1sm om
AHMED ENNAJI
 
Discipline positive
Discipline positiveDiscipline positive
Discipline positive
AHMED ENNAJI
 
Controle3 sur table elbilia tc1
Controle3 sur table elbilia tc1Controle3 sur table elbilia tc1
Controle3 sur table elbilia tc1
AHMED ENNAJI
 
Controle 1sur table tc1 elbilia nnaji
Controle 1sur table tc1 elbilia nnajiControle 1sur table tc1 elbilia nnaji
Controle 1sur table tc1 elbilia nnaji
AHMED ENNAJI
 
Controle2 sur table elbilia tc1
Controle2 sur table elbilia tc1Controle2 sur table elbilia tc1
Controle2 sur table elbilia tc1
AHMED ENNAJI
 
Contr 1 om pc biof decembre
Contr 1 om pc biof decembreContr 1 om pc biof decembre
Contr 1 om pc biof decembre
AHMED ENNAJI
 
Contr 2 om pc biof (janvier)2
Contr 2 om pc biof (janvier)2Contr 2 om pc biof (janvier)2
Contr 2 om pc biof (janvier)2
AHMED ENNAJI
 
Contr 3 om pc biof (janvier)
Contr 3 om pc biof (janvier)Contr 3 om pc biof (janvier)
Contr 3 om pc biof (janvier)
AHMED ENNAJI
 
Devoir surveille 1 2 bac pc 2019
Devoir surveille 1  2 bac pc 2019Devoir surveille 1  2 bac pc 2019
Devoir surveille 1 2 bac pc 2019
AHMED ENNAJI
 
Exercice sur les fonctions exponentielles pour 2 bac pc
Exercice sur les fonctions exponentielles pour 2 bac pcExercice sur les fonctions exponentielles pour 2 bac pc
Exercice sur les fonctions exponentielles pour 2 bac pc
AHMED ENNAJI
 
Bac blan 8 pc biof
Bac blan 8 pc biofBac blan 8 pc biof
Bac blan 8 pc biof
AHMED ENNAJI
 
Devoir surveille 1 semestre2 1sm om
Devoir surveille 1 semestre2 1sm omDevoir surveille 1 semestre2 1sm om
Devoir surveille 1 semestre2 1sm om
AHMED ENNAJI
 


Pentaho Data Integration 4.0 and MySQL

  • 1. Pentaho Data Integration 4 and MySQL Matt Casters: Pentaho's Chief Data Integration, Kettle Project Founder MySQL User Conference, Tuesday April 13th, 2010
  • 2. Agenda Pentaho: an introduction Pentaho Data Integration Version 4: New features MySQL support in PDI Q&A
  • 3. Pentaho Introduction Commercial open source alternative for business intelligence (BI) Founded in 2004: Pioneer in commercial open source BI Large referenceable customer base, wide range of BI/DW deployments Management - proven BI and open source veterans Business Objects, Cognos, Hyperion, JBoss, Oracle, Red Hat, SAS, SugarCRM Board of Directors – deep expertise and proven success in open source Bob Bearden - Executive chairman of the board (former SpringSource) Larry Augustin - founder, VA Software, helped coin the phrase “open source” Zack Urlocker – VP of Products, MySQL/Oracle Benchmark Capital, Index Ventures, New Enterprise Associates Widely recognized as the leader in open source BI
  • 4. Pentaho Introduction Complete Business Intelligence Suite End-to-end coverage of all BI needs Standards-based, modular, standalone or embeddable platform Open Source Licensing Lower software acquisition costs Lower Total Cost of Ownership (TCO) Enterprise Development Methodology Transparent, detailed roadmap Product roadmap and contributions managed by Pentaho Core developers are Pentaho employees Extensive QA Expert Services Comprehensive Training, Consulting, Enterprise service offerings Delivered by the Experts
  • 5. Pentaho Introduction – Enterprise Edition
  • 6. Pentaho Introduction – Deployments Wide range of deployments Reporting Data Integration / ETL Dashboards Full BI Suite Thousands of users 3,000 on a single server Large data volumes Half a terabyte of live interactive OLAP data ETL loading 300K rows/second Sophisticated applications Hundreds of dimensions Small deployments as well 20 users, MS Access databases
  • 7. Pentaho Introduction – Technology Componentized and modular Service-implemented architecture Built “from the ground up” as a set of services Exposed via AJAX and Web Services 100% Java EE server side Scalable, standards-based Web-based, thin-client end user interfaces Graphical design interfaces Embedded process workflow engine
  • 8. Pentaho Introduction – Reporting Access and format data from disparate sources RDBMS, XML, OLAP Produce in popular formats Multiple report types Operational Analytical Financial Parameterized Go directly against data sources or Pentaho’s centralized metadata layer
  • 9. Pentaho Introduction – Analysis Navigate and explore Ad hoc, interactive analysis Drill into further detail Select specific members for analysis View data “dimensionally” i.e. Sales by region, by channel, by time period ROLAP architecture Works with all popular open source and proprietary DBs No intermediate storage Aggregate table “aware” for faster analytic queries Design tools to build OLAP schemas and improve query performance
  • 10. Pentaho Introduction – Dashboards Gain visibility into your organization’s key performance indicators (KPIs) Monitor top-level performance and drill into supporting detail Illuminate metrics for quick insight into business activities Track exceptions and receive alerts Leverage the full Pentaho BI Suite Comprehensive auditing of user activity, performance and data access Context-sensitive drilling to reports and analysis views Integrated security, scheduling, alerting, portal integration Integrate with 3rd-party and custom applications
  • 11. Pentaho Introduction – Dashboard Designer Web-based end user dashboard creation From Pentaho User Console “Zero training” Template and theme-based creation Incorporate reports, analysis views, Adobe Flash-based charts and other Pentaho content Create new charts and interactive data grids from scratch Pentaho metadata – no SQL required Filter controls
  • 12. Pentaho Introduction – BI Platform Provides critical services for end users Easy access to business information Intuitive scheduling Delivery over the web or via email Alerting and notification Provides critical services for administrators Centralized thin-client administration Data source and security management Auditing and Performance monitoring Enterprise security integration Definition and execution of business rules Integration points with 3rd party applications Pentaho Enterprise Console Pentaho User Console
  • 13. Pentaho Introduction – Metadata Provides an abstraction layer between source systems and business user concepts Graphical design environment for defining metadata model Data presented to business users in business terms Allows business users to create their own ad hoc reports based on centralized business rules, without any technical skills or knowledge of SQL Changes to physical database do not impact reports or analytic views Business Intelligence Metadata Business User Automated SQL generation Physical Database Model
  • 14. Pentaho Introduction – Data Mining Take BI to the next level with predictive analytics Gain insight into hidden patterns and relationships Discover indicators of future performance Exploit correlations to improve organizational performance Embed recommendations in reports, dashboards, or custom applications
  • 15. Agenda Pentaho: an introduction Pentaho Data Integration Version 4: New features MySQL support in PDI Q&A
  • 16. Pentaho Data Integration for BI Business Intelligence! That's what we do.
  • 17. Pentaho Data Integration – Kettle Kettle Extraction Transportation Transformation Loading Environment
  • 18. Pentaho Data Integration – Extraction Extract data from : 35+ database types MySQL, PostgreSQL, SQLite, ... Oracle, SQL Server, etc Text files XML files XLS files Xbase files (dBase, Foxpro, etc) File systems information Generated data MS Access files LDAP Geo-data ...
  • 19. Pentaho Data Integration – Transportation Transportation of data Engine based data transfer (no code generator) Very flexible pathways: splitting partitioning merging joining duplicating clustering (MPP)
  • 20. Pentaho Data Integration – Transformation Flexibly transform data Looking up data databases files memory... Calculating Scripting JavaScript, SQL, RegExp Splitting Mapping Selecting Filtering Pivoting ...
  • 21. Pentaho Data Integration – Loading Load data into a target format Database loads Data warehouse population Partitioned loading Bulk loading Parallel loading Clustering
  • 22. Pentaho Data Integration – Environment Full GUI called “Spoon” to edit every option in Kettle Drag & Drop Debugger Rich GUI Command line tools execute jobs execute transformations Web server clustering remote execution Programming API for Java Plugin eco-system ...
  • 23. Pentaho Data Integration – Community Paying Pentaho customers Large and small corporations All possible sectors Lone rangers & hobbyists All regions on Earth Meet on our Forum: 30,000+ posts in 3 years Use our JIRA case tracking systems Download more than 10,000 copies of Kettle per month http://www.ohloh.net/projects/3624?p=Kettle http://www.softpedia.com/progClean/Kettle-Clean-80094.html
  • 24. Pentaho Data Integration – use-cases Load data from text files and store it into a database [demo] Export data from a database to a text file or to other databases Data migration between database applications Exploration of data in existing databases (tables, views, etc.) Information improvement using lookups Data cleaning Application integration Data warehouse population Report data generation ...
  • 25. Pentaho Data Integration – Adoption Wide range of production deployments Small and medium-sized companies Large enterprises Rapid product evolution Driven by Pentaho investment Includes significant community contributions “Contribution-friendly” architecture Natural fit for additional data sources, targets and transformations
  • 26. Pentaho Data Integration – Adoption Most deployed open source data integration solution. Independent study by Mark Madsen of Third Nature and the BeyeNETWORK Download free study at pentaho.com
  • 27. Pentaho Data Integration – Links Homepage: http://kettle.pentaho.org Forum: http://forums.pentaho.org/forumdisplay.php?f=69 Case tracker: http://jira.pentaho.org/browse/PDI Continuous Integration Server: http://ci.pentaho.com/job/Kettle Wiki: http://wiki.pentaho.org/display/EAI IRC Channel: ##pentaho (on Freenode) Mailing list: http://groups.google.com/group/kettle-developers My blog: http://www.ibridge.be My coordinates: mcasters at pentaho dot org
  • 28. Agenda Pentaho: an introduction Pentaho Data Integration Version 4: New features MySQL support in PDI Q&A
  • 29. Version 4: New features - Visualisation Demo New welcome screen Mouse-over slide-outs for icons Hop creation Improved error handling configuration New perspectives support for Agile BI visualisations, modelling, scheduling, etc.
  • 30. Version 4: New features - Running jobs Drill down into running job entries Visual indicators of running and completed job entries Success and failure mini-icons Mouse over completion mini-icons shows details of execution results Log capturing of completed job entries
  • 31. Version 4: New features - Running transformations Drill down into running transformation job entries and mappings Row input/output sniff testing: see what rows are passing (demo) Remote input/output sniff testing on a Carte server
  • 32. Version 4: New features - Better logging Reduced memory consumption Incremental log updates Global log buffer size limit for long running jobs/transformations Interval logging Auto clean-up of old log records Log record time-outs & execution lineage Log record colour coding in Spoon (blue and red for error lines) Step and job entry level Logging Execution lineage logging Renaming individual columns Global configuration options for all log tables
  • 33. Version 4: New features - Plugins Unified plug-in architecture Easier deployment and packaging Step, job entry, partitioner, database type, spoon perspective, life-cycle, ... : all pluggable --> MySQL 5.1 plugin
  • 34. Version 4: New features - Repositories Allowing for 3rd-party repositories like the Pentaho Unified Enterprise Repository Removed dependencies on the relational database repository (still supported though) Added support for repositories capable of team development (file locking) Added support for repositories capable of fine-grained security Added support for repositories capable of storing and retrieving revision history
  • 35. Version 4: New features – New steps SAP Input Data Grid OLAP Input (Mondrian, Palo, SSAS, SAP B/W) Palo Cell Input/Output, Dimension Input/Output Salesforce Delete, Insert, Update, Upsert Add fields changing sequence (group sequence) User Defined Java Class: create your own plugin in Java on the fly in a step Send information using Syslog: Send a message to a Syslog server. Java Filter Memory Group By Farrago streaming bulk loader Teradata Fastload Bulk loader Experimental steps like Get table names, Email messages input, ...
  • 36. Agenda Pentaho: an introduction Pentaho Data Integration Version 4: New features MySQL support in PDI Q&A
  • 37. MySQL Support in PDI JDBC/ODBC Driver Integration Reading: MySQL Result Streaming (cursor emulation) support Writing: MySQL dialects for data types Job entry: Bulk Loader of text files for MySQL Job entry: Bulk writer to a text file for MySQL Database Partitioning (Sharding) Demo
  • 38. Database partitioning [Diagram: a Sales table with data for 2003, 2004, 2005 and 2006, split into Year 2003, Year 2004, Year 2005 and Year 2006 partitions spread across four databases]
  • 39. Questions and Closing Other Pentaho related User Conference information: Collapsing BI from Months to Minutes (Agile BI) Jared Cornelius Ballroom H 11:55am Tuesday April 13th MySQL Binary Log Analysis With Pentaho BI Robert Booth Ballroom B 5:15pm Wednesday April 14th The Pentaho Booth 516 in the Exhibition Hall ETA: September 2010
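
The year-based database partitioning (sharding) mentioned on slides 37 and 38 can be sketched roughly as follows. This is a minimal illustration of the routing idea only, not how PDI implements it: the partition names, the `partition_for` and `route_rows` helpers, and the row layout are all hypothetical and do not come from the slides.

```python
from collections import defaultdict
from datetime import date

# Hypothetical mapping from sales year to a partition (database) name.
# The names are illustrative; the slides do not give usable identifiers.
PARTITION_BY_YEAR = {
    2003: "sales_2003",
    2004: "sales_2004",
    2005: "sales_2005",
    2006: "sales_2006",
}


def partition_for(sale_date):
    """Return the partition name for a sale date, keyed on its year."""
    try:
        return PARTITION_BY_YEAR[sale_date.year]
    except KeyError:
        raise ValueError(f"no partition configured for year {sale_date.year}")


def route_rows(rows):
    """Group (sale_date, amount) rows by the partition they belong to."""
    routed = defaultdict(list)
    for sale_date, amount in rows:
        routed[partition_for(sale_date)].append((sale_date, amount))
    return dict(routed)


# Sample rows invented for the example.
rows = [
    (date(2003, 5, 1), 120.0),
    (date(2006, 1, 15), 75.5),
    (date(2003, 11, 30), 42.0),
]
routed = route_rows(rows)
for name in sorted(routed):
    print(name, len(routed[name]))
```

Each group in `routed` could then be written to its own database connection. Rejecting years that have no configured partition is a design choice for this sketch; a real setup might instead send such rows to a default partition.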