Talend Data Integration Certification and Training
This training is designed to help you master Talend Data Integration, an open source ETL
tool that can boost your productivity. With it you can integrate all your data with your
data warehouse and applications, or synchronize data between systems. You'll also use
Talend Administration Center to create users and projects and to schedule and publish
Talend Jobs.
Target Audience
Business Analysts
Data warehousing professionals
Data Analysts
Solution & Data Architects
System Administrators
Software Engineers
College Students
Prerequisites
There are no prerequisites for learning Talend Data Integration. Basic knowledge of
programming languages, an RDBMS (any one of MySQL, SQL Server or Oracle) and ETL
fundamentals will be beneficial, but is not mandatory.
Course Duration
40 hours
Course Objectives
After the completion of this course, you should be able to:
Understand ETL concepts and how to solve real-time business problems
using Talend
Understand Talend Architecture and its various components.
Learn how to install and configure a Talend product and create a project
Create and run a Job that reads, converts, and writes data
Merge data from several sources within a Job
Save a schema for repeated use
Create and use metadata and context variables within Jobs
Connect to, read from, and write to a database from a Job
Access a web service from a Job
Work with master Jobs and Sub Jobs
Build, export, and test-run Jobs outside Studio
Implement basic error-handling techniques
See a demonstration of the 100 most commonly used components
Implement several methods of parallel execution in a Talend Job
Create Joblets and call them from a master Job
Create a unit test from a working Job
Configure a database to monitor and log changes in a separate change data capture
(CDC) database
Use the CDC database to perform incremental updates between the source and target
Understand SCD (Slowly Changing Dimension) and its implementation
Use best practices for Job and component naming, hints, and documentation
Optimize and improve Talend Jobs
Course Outline
Goal:
In this module, you will get an overview of ETL technologies and the reasons why Talend is
referred to as a next-generation leader in data integration. You will also get a brief
introduction to the products Talend has offered to date and their relevance to data
integration and Big Data. You will then learn about Talend Studio for Data Integration and
its installation.
Objectives:
At the end of this module, you should be able to:
Topics:
Working with ETL
Rise of Data Integration and Migration.
Role of Open Source ETL Technologies in Big Data
Comparison with other market-leading tools in the ETL domain
Importance of Talend (Why Talend)
Talend and its Products
Introduction of Talend Open Studio
Talend for Data Integration
GUI of Talend Studio
Hands On:
Downloading:
1. Talend Open Studio (TOS)
2. Talend Data Integration (30-day trial)
Module 2: Installing Talend Studio and Understanding the Talend Glossary
Goal:
This module acquaints you with the installation methods for Talend products on different
environments such as Windows, Linux and Mac. We cover the hardware and software
requirements for installing Talend and how to install a Talend product using the Talend
Installer. Once the installation is finished, we start Talend Studio and go through the
related glossary. You will learn what the different sections of Talend Studio are, how to
use them, and which items each section contains.
Objectives:
At the end of this module, you should be able to:
Identify the hardware and software requirements for installing a Talend product
Set JAVA_HOME as an environment variable
Explain what the repository in Talend Studio is
List the items present in the repository
Describe the design workspace and palette in Studio
Use the Talend configuration tabs
Topics:
Installing Talend Studio and setting JAVA_HOME
The Talend repository and the following items:
o Job Designs
o Contexts
o Codes
o SQL Templates
o Metadata
o Documentation
o Business Model
o Recycle bin
Talend design workspace and palette
Talend configuration tabs
Hands On:
All of the above topics are hands-on intensive. You will create a connection to a MySQL
database and create different file entries in Metadata.
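Since Talend Studio depends on a working Java installation, it can help to verify JAVA_HOME before launching the installer. The following is a minimal sketch in plain Java, not part of any Talend tooling; the class name JavaHomeCheck and its messages are made up for illustration.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper, not part of Talend: reports whether JAVA_HOME looks usable.
final class JavaHomeCheck {

    /** Returns a short human-readable status for the given JAVA_HOME value. */
    static String describe(String javaHome) {
        if (javaHome == null || javaHome.trim().isEmpty()) {
            return "JAVA_HOME is not set";
        }
        Path home = Paths.get(javaHome.trim());
        if (!Files.isDirectory(home)) {
            return "JAVA_HOME does not point to a directory: " + home;
        }
        // A Java installation keeps its launcher under <JAVA_HOME>/bin (java or java.exe).
        boolean hasLauncher = Files.exists(home.resolve("bin").resolve("java"))
                || Files.exists(home.resolve("bin").resolve("java.exe"));
        return hasLauncher
                ? "JAVA_HOME looks valid: " + home
                : "JAVA_HOME has no bin/java launcher: " + home;
    }

    public static void main(String[] args) {
        System.out.println(describe(System.getenv("JAVA_HOME")));
        System.out.println("Running on Java " + System.getProperty("java.version"));
    }
}
```

Running it in the same terminal session you will use for the installer shows whether the variable is visible to new processes.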
Module 3: Components in Talend Studio
Goal:
In this module, you will understand what a component is in Talend and what role it plays.
You will also learn about the different families of components and the most commonly used
components in Talend.
Objectives:
At the end of this module, you should be able to:
Understand the role of a component in Talend
Explain Communication between homogeneous/heterogeneous data sources
Filter data
Identify the different families of components
Topics:
Databases (DB2, MySQL, SQL Server, Oracle, PostgreSQL, AS400, etc.)
Files (XML, JSON, Excel, CSV, etc.)
Processing components
o tFilterRow
o tJoin
o tMap
o tNormalize
o tReplace
o tReplaceList
o tSampleRow
o tSortRow
o tSplitRow
o tUniqRow
o tReplicate
o tUnite
o tAggregateRow
Orchestration family components
o tFileList
o tFlowToIterate
o tForeach
o tInfiniteLoop
o tIterateToFlow
o tLoop
o tPostjob
o tPrejob
o tReplicate
o tRunJob
o tSleep
o tUnite
Logs & Errors components
o tAssert
o tAssertCatcher
o tChronometerStart
o tChronometerStop
o tDie
o tFlowMeter
o tFlowMeterCatcher
o tLogCatcher
o tLogRow
o tStatCatcher
o tWarn
Hands On:
You will create Talend Jobs using each of these components.
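To give a feel for what the processing components do to a flow of rows, here is a rough plain-Java analogy of the filter, de-duplicate, sort and aggregate steps (in the spirit of tFilterRow, tUniqRow, tSortRow and tAggregateRow). It is only a sketch: it is not the code Talend generates, and the Row class and its country and amount columns are invented for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.stream.Collectors;

final class RowProcessingSketch {

    /** One input row; the columns (country, amount) are made up for illustration. */
    static final class Row {
        final String country;
        final int amount;
        Row(String country, int amount) { this.country = country; this.amount = amount; }
    }

    /** Filter step: keep rows whose amount is at least the threshold. */
    static List<Row> filter(List<Row> rows, int minAmount) {
        return rows.stream().filter(r -> r.amount >= minAmount).collect(Collectors.toList());
    }

    /** De-duplication step: keep the first row seen for each country key. */
    static List<Row> uniqueByCountry(List<Row> rows) {
        Set<String> seen = new LinkedHashSet<>();
        List<Row> out = new ArrayList<>();
        for (Row r : rows) {
            if (seen.add(r.country)) out.add(r);
        }
        return out;
    }

    /** Sort step: order rows by amount, largest first. */
    static List<Row> sortByAmountDesc(List<Row> rows) {
        return rows.stream()
                .sorted(Comparator.comparingInt((Row r) -> r.amount).reversed())
                .collect(Collectors.toList());
    }

    /** Aggregation step: sum of amount per country, in first-seen order. */
    static Map<String, Integer> sumByCountry(List<Row> rows) {
        Map<String, Integer> totals = new LinkedHashMap<>();
        for (Row r : rows) totals.merge(r.country, r.amount, Integer::sum);
        return totals;
    }

    public static void main(String[] args) {
        List<Row> rows = Arrays.asList(
                new Row("FR", 40), new Row("DE", 15), new Row("FR", 70), new Row("IN", 90));
        // Chain two steps, as you would link two components in a Job.
        System.out.println(sumByCountry(filter(rows, 20))); // prints {FR=110, IN=90}
    }
}
```

In Studio each of these steps is a component configured on the design workspace rather than hand-written code; the sketch only mirrors the row-level behaviour.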
Module 4: Basic Transformations, Custom Code and Context Variables
Goal:
In this module, you will understand data mapping and transformations using TOS. You will
also learn how to filter and join various data sources and then search and sort through them.
Objectives:
At the end of this module, you should be able to:
Perform data mapping and transformations
Explain Communication between homogeneous/heterogeneous data sources
Filter data
Join different data sources
Implement advanced lookups
Search and sort different sets of data
Enhance Data Quality
Use context variables and routines
Understand tMap and its wide range of uses
Topics:
How to select the right component for a task
Routines in Talend
Custom code in Talend
Accessing job level/ component level information within the job
Data Quality and Cleansing
Context variables and uses
tMap functionality
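The lookup join at the heart of tMap can be pictured as a hash lookup: the lookup flow is loaded into memory keyed on the join column, and each main row is matched against it, with unmatched rows of an inner join sent to a reject output. The sketch below shows that idea in plain Java under stated assumptions; the class, the order and customer data, and the matched/rejected lists are hypothetical and are not Talend's generated code.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class LookupJoinSketch {

    /** Result of an inner join: matched rows plus the main rows that found no lookup match. */
    static final class JoinResult {
        final List<String> matched = new ArrayList<>();
        final List<String> rejected = new ArrayList<>();
    }

    /** Joins main rows (orderId -> customerId) against a lookup (customerId -> name). */
    static JoinResult innerJoin(Map<String, Integer> orders, Map<Integer, String> customers) {
        JoinResult result = new JoinResult();
        for (Map.Entry<String, Integer> order : orders.entrySet()) {
            String name = customers.get(order.getValue()); // hash lookup on the join key
            if (name != null) {
                result.matched.add(order.getKey() + ";" + name);
            } else {
                result.rejected.add(order.getKey());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<Integer, String> customers = new HashMap<>();
        customers.put(1, "Asha");
        customers.put(2, "Bruno");

        Map<String, Integer> orders = new LinkedHashMap<>();
        orders.put("A-100", 1);
        orders.put("A-101", 3); // no customer 3, so this row is rejected
        orders.put("A-102", 2);

        JoinResult r = innerJoin(orders, customers);
        System.out.println("matched:  " + r.matched);
        System.out.println("rejected: " + r.rejected);
    }
}
```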
Hands On:
All of the above topics are hands-on intensive.
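As an illustration of the routines and data cleansing topics, a user routine is essentially a Java class with static methods that can be called from component expressions. The sketch below shows what such a cleansing routine might look like; the class name CustomerNameRoutine and both methods are hypothetical, and the package declaration a routine would normally carry in Studio is left out so the file runs on its own.

```java
import java.util.Locale;

// Hypothetical routine for illustration. In Studio a user routine would normally be
// declared in the "routines" package; that line is omitted so this file is standalone.
final class CustomerNameRoutine {

    /** Trims the input and collapses internal whitespace; blank or null input becomes null. */
    public static String clean(String raw) {
        if (raw == null) return null;
        String collapsed = raw.trim().replaceAll("\\s+", " ");
        return collapsed.isEmpty() ? null : collapsed;
    }

    /** Cleans the input, then capitalizes the first letter of each word. */
    public static String toTitleCase(String raw) {
        String cleaned = clean(raw);
        if (cleaned == null) return null;
        StringBuilder out = new StringBuilder();
        for (String word : cleaned.split(" ")) {
            if (out.length() > 0) out.append(' ');
            out.append(word.substring(0, 1).toUpperCase(Locale.ROOT))
               .append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(toTitleCase("  jANE   doE ")); // prints "Jane Doe"
    }
}
```

Inside a Job, a call such as CustomerNameRoutine.toTitleCase(row1.name) could then appear in a tMap expression, where row1.name stands in for whatever input column your Job actually has.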
Module 5: Advanced Transformations (CDC, SCD) and Executing Jobs Remotely in Talend
Goal:
In this module, you will understand transformations and the steps involved in looping in
Talend, ways to search for files in a directory, and how to process them in sequence.
You will also learn to export and import Jobs and run them remotely.
Objectives:
At the end of this module, you should be able to:
Perform file management
Explain type casting
Describe looping components
Read data from folders and different FTP locations
Export and import Talend jobs
Use tFileList, tRunJob
Understand CDC and its implementation
Parameterize a Talend job
Understand SCD
Schedule and run Talend DI jobs externally/ remotely
Topics:
Various components of file management
Type Casting (convert datatypes among source-target platforms)
Looping components (such as tLoop, tForeach)
tFileList, tFTPFileList
tRunJob
How to schedule and run Talend DI jobs externally (not in GUI)
Parameterizing a Talend job
CDC and SCD
Hands On:
All of the above topics are hands-on intensive.
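To make the SCD topic concrete, here is a minimal plain-Java sketch of the Type 2 pattern, in which a changed attribute closes the current version of a dimension row and opens a new one so that history is preserved. It assumes a single tracked attribute (city) and an in-memory list standing in for the dimension table; all class and field names are invented, and a real Job would typically rely on Talend's SCD components instead.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

final class ScdType2Sketch {

    /** One version of a dimension record; endDate == null marks the current version. */
    static final class CustomerVersion {
        final int customerId;
        final String city;
        final LocalDate startDate;
        LocalDate endDate;

        CustomerVersion(int customerId, String city, LocalDate startDate) {
            this.customerId = customerId;
            this.city = city;
            this.startDate = startDate;
        }

        boolean isCurrent() { return endDate == null; }
    }

    /**
     * Applies one incoming source row to the dimension.
     * Unchanged rows are ignored; changed rows close the current version and open a new one.
     */
    static void apply(List<CustomerVersion> dimension, int customerId, String city, LocalDate loadDate) {
        for (CustomerVersion v : dimension) {
            if (v.customerId == customerId && v.isCurrent()) {
                if (v.city.equals(city)) {
                    return; // no change, nothing to record
                }
                v.endDate = loadDate; // close the old version
                break;
            }
        }
        dimension.add(new CustomerVersion(customerId, city, loadDate));
    }

    public static void main(String[] args) {
        List<CustomerVersion> dimension = new ArrayList<>();
        apply(dimension, 1, "Pune", LocalDate.of(2024, 1, 1));
        apply(dimension, 1, "Pune", LocalDate.of(2024, 2, 1));  // unchanged, ignored
        apply(dimension, 1, "Delhi", LocalDate.of(2024, 3, 1)); // change, new version
        for (CustomerVersion v : dimension) {
            System.out.println(v.customerId + " " + v.city + " " + v.startDate + " -> " + v.endDate);
        }
    }
}
```

A Type 1 change would instead overwrite the city in place and keep no history, which is the main design difference to keep in mind when choosing between the two.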
Module 6: Different Data Sources and Targets for Talend Jobs
Goal:
In this module, you will learn about the different data sources you can use in Talend and
the components available for them. You will understand how to read data from
files (CSV, XML, JSON, Excel), databases (MySQL, SQL Server, Oracle) and APIs.
Objectives:
At the end of this module, you should be able to:
Understand different types of files (CSV, XML, JSON, Excel)
Write XPath and JSONPath queries
Understand the components available for XML and JSON files in Talend
Understand database components
Use web services and the components that call REST and SOAP services from Talend Studio
Use commonly used CRM components (Salesforce, SugarCRM, etc.)
Topics:
CSV files and related components
Different components for creating XML files
Wide use of the tXMLMap component in Talend
Different ways to create a database connection
Difference between Built-In and Repository schemas
XPath and JSONPath queries
Big Data setup using the Hortonworks Sandbox on your personal computer
Introduction to the TOS for Big Data environment
Hands On:
All of the above topics are hands-on intensive. We will cover a number of assignments on
XML and JSON files and database components.
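For the XPath topic, it may help to try queries outside Studio first. The sketch below uses only the standard Java XPath API (javax.xml.xpath) to evaluate two expressions against a small, made-up orders document; the class name and the sample XML are assumptions for illustration and say nothing about how Talend's XML components are implemented.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

final class XPathSketch {

    /** Evaluates an XPath expression against an XML string and returns the text of each match. */
    static List<String> select(String xml, String expression) throws XPathExpressionException {
        XPath xpath = XPathFactory.newInstance().newXPath();
        NodeList nodes = (NodeList) xpath.evaluate(
                expression, new InputSource(new StringReader(xml)), XPathConstants.NODESET);
        List<String> values = new ArrayList<>();
        for (int i = 0; i < nodes.getLength(); i++) {
            values.add(nodes.item(i).getTextContent());
        }
        return values;
    }

    public static void main(String[] args) throws XPathExpressionException {
        // Made-up sample document.
        String xml = "<orders>"
                + "<order id=\"1\"><customer>Asha</customer><total>120</total></order>"
                + "<order id=\"2\"><customer>Bruno</customer><total>45</total></order>"
                + "</orders>";
        System.out.println(select(xml, "/orders/order/customer"));              // prints [Asha, Bruno]
        System.out.println(select(xml, "/orders/order[total > 100]/customer")); // prints [Asha]
    }
}
```

The first expression walks every order element, while the second adds a predicate on total; the same expressions can then be reused when configuring an XML input component in a Job.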