Syllabus of DataStage Course

This document outlines a 22-module course syllabus on InfoSphere DataStage. The course covers data warehousing concepts, ETL processes, installing and configuring DataStage, designing jobs, file handling, data transformation, database integration, parallel job design, exception handling, deployment, monitoring, and performance tuning.

Uploaded by

Dinesh Sanodiya
Copyright
© All Rights Reserved

Syllabus of DataStage Course in Chennai

Module 1: Introduction to Data Warehouse Concepts

​ What is a Data Warehouse?

​ Data Mart

​ OLAP vs. OLTP

​ Data Warehouse Architecture

​ What is Data Modelling?

​ Exploring Dimensional Modelling

​ Exploring Star Schema

​ Exploring Snowflake Schema

​ Understanding Dimensions

​ Understanding Facts

​ Slowly Changing Dimension

​ Lifecycle of a Data Warehouse

Module 2: Understanding ETL (Extraction, Transformation, Load)

​ Overview of ETL

​ Features and Benefits for Business

​ Different SCD Types

​ ETL tools in the market

​ Explanation of staging tables

​ Explanation of Transformation

​ Loading data into tables at different stages
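
The Slowly Changing Dimension (SCD) types listed in this module can be illustrated with a small sketch. The code below is plain Python, not DataStage, and shows only the general idea behind Type 1 (overwrite) and Type 2 (keep history). The column names (`customer_id`, `start_date`, `end_date`, `is_current`) and the function names are hypothetical, chosen for illustration.

```python
from datetime import date


def apply_type1(dimension, key, new_attrs):
    """Type 1: overwrite attributes in place; no history is kept."""
    for row in dimension:
        if row["customer_id"] == key:
            row.update(new_attrs)
    return dimension


def apply_type2(dimension, key, new_attrs, effective):
    """Type 2: expire the current row and append a new current version."""
    current = next(
        (r for r in dimension if r["customer_id"] == key and r["is_current"]),
        None,
    )
    if current is None:
        return dimension  # nothing to version; inserting new keys is out of scope here
    # Close the old version.
    current["is_current"] = False
    current["end_date"] = effective
    # Open the new version, carrying over unchanged attributes.
    dimension.append(
        {
            **current,
            **new_attrs,
            "start_date": effective,
            "end_date": None,
            "is_current": True,
        }
    )
    return dimension


# Hypothetical dimension table with one current row.
customers = [
    {
        "customer_id": 1,
        "city": "Chennai",
        "start_date": date(2020, 1, 1),
        "end_date": None,
        "is_current": True,
    }
]
apply_type2(customers, 1, {"city": "Mumbai"}, date(2024, 1, 1))
```

After the call, the table holds two rows for the same customer: the expired "Chennai" version and the current "Mumbai" version. With `apply_type1` the single row would simply be overwritten.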


Module 3: Overview of InfoSphere DataStage

​ What is InfoSphere DataStage?

​ Architecture of DataStage

​ Explanation of Topologies

​ Components in DataStage

​ Runtime Architecture

​ OSH Script and Execution Flow

Module 4: Installation and Configuration of InfoSphere DataStage

​ Prerequisites for InfoSphere DataStage

​ Install InfoSphere DataStage

​ Verify Installation

​ Set up Environment variables

​ Create / Update / Delete projects

​ User creation and Granting permissions

Module 5: Working with DataStage Designer

​ Overview of Designer

​ Exploring DataStage Designer

​ High-level overview of Commonly used Stages

​ Schema

​ Pipeline Parallelism

​ Manipulating the configuration file

​ Repository Palette

​ Passive and Active stages

​ Annotations and Creating jobs

​ Import and Export Metadata


​ Dataset Management

​ Partitioning techniques
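
To make the partitioning topic above more concrete, here is a minimal sketch of two common strategies, key-based (hash) and round-robin partitioning. It is plain Python rather than DataStage, and the function names and the choice of CRC32 as the hash are assumptions made for illustration only.

```python
import zlib


def hash_partition(rows, key, num_partitions):
    """Send rows with the same key value to the same partition."""
    partitions = [[] for _ in range(num_partitions)]
    for row in rows:
        # CRC32 is used only because it is deterministic across runs.
        idx = zlib.crc32(str(row[key]).encode("utf-8")) % num_partitions
        partitions[idx].append(row)
    return partitions


def round_robin_partition(rows, num_partitions):
    """Deal rows out in turn so partitions end up nearly equal in size."""
    partitions = [[] for _ in range(num_partitions)]
    for i, row in enumerate(rows):
        partitions[i % num_partitions].append(row)
    return partitions


# Hypothetical input rows.
orders = [
    {"customer": "A", "amount": 10},
    {"customer": "B", "amount": 20},
    {"customer": "A", "amount": 30},
    {"customer": "C", "amount": 40},
]
by_key = hash_partition(orders, "customer", 2)
evenly = round_robin_partition(orders, 2)
```

Key-based partitioning keeps related rows together, which matters for operations such as joins and aggregations that group by a key. Round-robin ignores the data and simply balances the load.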

Module 6: DataStage Job

​ Overview of Job types

​ Explanation of Sequence and Parallel Jobs

​ Explanation of Server Jobs

​ Different stages

​ Understanding Containers

Module 7: DataStage Director

​ Introduction to DataStage Director

​ Director User Interface

​ Job status and view

​ Compiling Single and Multiple jobs

​ Run, Reset and Restart jobs

​ Scheduling Batches

​ Performance monitor

Module 8: Creating Parallel Job

​ Overview of Parallel Jobs

​ Design a Parallel Job using Designer

​ Pipeline Parallelism

​ Partition Parallelism

​ Working in NLS Mode

​ Maps in Parallel Jobs


​ Run Parallel Jobs

Module 9: Handle Files

​ Introduction to file handling

​ Sequential File and Complex Flat File stages

​ Huge File Manipulation

​ Error and Invalid Records Rejection

Module 10: File Stages

​ File Stages

​ Sequential File stage

​ Explanation of Data Sets

​ Complex Flat File stage

​ Create jobs to read and write sequential files

​ Reading multiple files using file patterns

​ Null handling in File Stage

​ Lookup File Set

Module 11: Combining and Partitioning Data

​ Overview of data processing for Combining and Partitioning

​ Combine data using the Lookup stage

​ Combine data using the Merge stage

​ Combine data using the Join stage

​ Combine data using the Funnel stage
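
As a rough illustration of the combining topics above, the sketch below mimics the general idea of a lookup (enriching a stream from in-memory reference data) and a funnel (concatenating inputs that share the same columns). It is plain Python, not DataStage. The function names are hypothetical, and passing unmatched rows through unchanged is a simplifying choice for this sketch.

```python
def lookup(stream, reference, key):
    """Enrich each stream row with columns from the matching reference row."""
    ref_index = {r[key]: r for r in reference}  # reference data held in memory
    out = []
    for row in stream:
        match = ref_index.get(row[key])
        # Unmatched rows pass through unchanged in this sketch.
        out.append({**row, **match} if match else dict(row))
    return out


def funnel(*inputs):
    """Concatenate several inputs with the same columns into one output."""
    return [row for rows in inputs for row in rows]


# Hypothetical data.
orders = [{"cust_id": 1, "amount": 50}, {"cust_id": 2, "amount": 75}]
customers = [{"cust_id": 1, "name": "Asha"}]
enriched = lookup(orders, customers, "cust_id")
combined = funnel(orders, [{"cust_id": 3, "amount": 20}])
```

Here `enriched` gains a `name` column only for the order that has a matching customer, and `combined` holds all three order rows in input order.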


Module 12: Sorting and Aggregating Data

​ Sort data using in-stage sorts and Sort stage

​ Data aggregation using the Aggregator stage

​ Unique data using the Remove Duplicates stage

Module 13: Transformation on Data

​ Understanding DataStage internal logical message

​ Column generator and Row generator

​ Transform messages from one format to another

​ Filter data based on business criteria

​ Control data flow based on data conditions

​ Cover real-time scenarios using different Processing Stages

​ Creating routes

Module 14: Working with Relational Databases

​ Understanding Database Stage

​ Database Metadata

​ Explanation of ODBC Connection

​ Import Definitions for Tables

​ Use Connector stages in a job

​ Define SQL statements using SQL Builder

​ ODBC Connector

​ Oracle Connector

​ Parallel Job with Connector

Module 15: Advanced Parallel Jobs


​ Overview of Type 1 and Type 2 processes

​ Range lookup process

​ Job Performance analysis

​ Performance tuning

Module 16: Job Sequence

​ Job activities in Sequencer

​ Sequence Trigger

​ Notification and Terminator activity

​ Start and End Loop activity

​ Error and Exception handling activity

Module 17: Working with Cleansing Data

​ Overview of Cleansing

​ Explain Workflow of Standardization

​ Create and Configure Standardize Stage Job

​ Explanation of Rule sets

​ Managing Rule sets

Module 18: Exception Handling in DataStage

​ Introduction to Exception Handling

​ How to design a job to link with exceptions

​ Explanation of the Exception stage

​ One-source and Two-source Match Exception Stage

​ Route exception to Exception Stage


Module 19: Deployment in InfoSphere DataStage

​ Introduction to InfoSphere Information Server Manager

​ Explanation of the Deployment life cycle

​ Adding a Domain in Information Server Manager

​ View job and asset properties

​ Explain Deployment Package

​ Deployment Workflow

​ Define Deployment Package

​ Set up Deploy Path

​ Deploy Package

​ Import and Export Assets

​ Explain various types of Source Control for DataStage

Module 20: Working with Monitoring Jobs

​ Introduction to Monitoring Jobs

​ Explanation of the Operations Console

​ How to monitor jobs using the console

Module 21: Job Performance Tuning

​ Understanding performance impact activities

​ Design Job for Optimal Performance

​ Design flows that minimize CPU and Memory usage

​ Explanation of Buffering

​ Deadlock prevention

Module 22: Best Practices for DataStage and Data Load
