Data Stage Interview Questions

DataStage is an ETL tool used to integrate data from multiple sources and process large volumes of data. It provides a graphical user interface to design jobs that extract, transform, and load data. Common interview questions about DataStage include explaining what it is, how sources are populated, commands to import and export jobs, differences between versions, and explaining stages like Merge, Join, and Lookup.

Uploaded by

Manoj Sharma
© All Rights Reserved

DATA STAGE INTERVIEW QUESTIONS & ANSWERS

DataStage is an ETL tool. It uses a graphical notation to build solutions for data
integration, and it is available in several editions, such as SE, EE, and MVS. The
DataStage role requires knowledge of data warehousing, ETL, DataStage configuration,
job design, and the various stages and modules in DataStage. The tool is used to
integrate multiple systems and to process high volumes of data, and it provides a
user-friendly graphical front end for designing jobs. The following DataStage interview
questions and answers are useful for preparing for job interviews and getting
shortlisted for a position.
1. Question 1. Explain Data Stage?
Answer :
DataStage is a tool used to design, develop, and execute applications that
populate the tables in a data warehouse or data mart.
2. Question 2. Tell How A Source File Is Populated?
Answer :
We can generate a source file in various ways, such as by writing a SQL query
in Oracle or by using the Row Generator stage.
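The SQL-query route can be sketched outside DataStage as well. The following Python snippet is a minimal, hypothetical illustration (the table name, columns, and file name are all made up): it runs a query against a small in-memory database and writes the result set out as a flat source file.

```python
import csv
import sqlite3

# Hypothetical example: build a tiny table, run a SQL query against it,
# and write the result set out as a flat source file (CSV).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (id INTEGER, name TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(1, "Asha"), (2, "Ravi")])

with open("source_file.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["id", "name"])                          # header row
    for row in conn.execute("SELECT id, name FROM customers ORDER BY id"):
        writer.writerow(row)                                 # one data row per record
conn.close()
```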
3. Question 3. Write The Command Line Functions To Import And Export
The Ds Jobs?
Answer :
To import the DS jobs, dsimport.exe is used, and to export the DS jobs,
dsexport.exe is used.
4. Question 4. Differentiate Between Datastage 7.5 And 7.0?
Answer :
In Datastage 7.5, several new stages were added for more robustness and
smoother performance, such as the Procedure stage and the Command stage.
5. Question 5. Explain Merge?
Answer :
Merge means to join two or more tables. The tables are joined on the basis of
the primary key columns in both tables.
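As a rough illustration of joining two tables on a primary key column, here is a small Python sketch. The tables and columns are invented for the example; this is not DataStage's Merge stage itself, only the idea behind it.

```python
# Two illustrative "tables": orders rows, and customers keyed by the primary key.
orders = [{"cust_id": 1, "amount": 250}, {"cust_id": 2, "amount": 90}]
customers = {1: "Asha", 2: "Ravi"}

# Join each order to its customer on the primary key column.
merged = [
    {"cust_id": o["cust_id"], "name": customers[o["cust_id"]], "amount": o["amount"]}
    for o in orders
    if o["cust_id"] in customers  # keep only rows with a matching key
]
```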
6. Question 6. Differentiate Between Data File And Descriptor File?
Answer :
As the name implies, data files contain the data, while the descriptor file
contains the description of the data in the data files.
7. Question 7. Differentiate Between Data Stage And Informatica?
Answer :
In Datastage, there is a concept of partitioning and parallelism for node
configuration, while Informatica has no such concept of partitioning and
parallelism for node configuration. Also, Informatica is more scalable than
Datastage, and Datastage is easier to use than Informatica.

8. Question 8. Explain Routines And Their Types?


Answer :
Routines are collections of functions defined in the DS Manager. They can be
called from the Transformer stage. Routines are of three types: parallel
routines, server routines, and mainframe routines.
9. Question 9. How Can We Write Parallel Routines In Data Stage Px?
Answer :
We can write parallel routines in C or C++. Such routines are also created in
the DS Manager and can be called from the Transformer stage.
10. Question 10. What Is The Procedure Of Removing Duplicates, Without
The Remove Duplicate Stage?
Answer :
Duplicates can be removed by using the Sort stage with the option
allow duplicate = false.
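The Sort-stage approach can be sketched in plain Python: sort the rows on the key, then keep only the first row for each key value, which is what allow duplicate = false amounts to. The data below is purely illustrative.

```python
rows = [("A", 3), ("B", 1), ("A", 2), ("C", 5), ("B", 4)]

rows.sort(key=lambda r: r[0])        # sort on the key column (stable sort)
deduped = []
for row in rows:
    if not deduped or deduped[-1][0] != row[0]:  # key changed, so keep this row
        deduped.append(row)
```

Because Python's sort is stable, the first occurrence of each key in the input is the one that survives.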
11. Question 11. What Steps Should Be Taken To Improve The Performance Of Datastage Jobs?
Answer :
In order to improve the performance of Datastage jobs, we have to first establish
baselines. Secondly, we should not use only one flow for performance testing.
Thirdly, we should work in increments. Then, we should evaluate data skews. Then
we should isolate and solve the problems, one by one. After that, we should
distribute the file systems to remove bottlenecks, if any. Also, we should not
include the RDBMS at the start of the testing phase. Last but not least, we should
understand and assess the available tuning knobs.
12. Question 12. Compare And Contrast Between Join, Merge And Lookup
Stage?
Answer :
All three differ from each other in the way they use memory, in their input
requirements, and in how they treat various records. Join and Merge need less
memory than the Lookup stage.
13. Question 13. Describe Quality Stage?
Answer :
The Quality stage, also known as the Integrity stage, assists in integrating
various types of data from different sources.
14. Question 14. Describe Job Control?
Answer :
Job control can be best performed by using Job Control Language (JCL). This
tool is used to execute various jobs concurrently, without using any kind of
loop.
15. Question 15. Contrast Between Symmetric Multiprocessing And
Massive Parallel Processing?
Answer :
In Symmetric Multiprocessing, the hardware resources are shared by the
processors. There is one operating system, and the processors communicate
through shared memory. In Massively Parallel Processing, each processor
accesses the hardware resources exclusively. This type of processing is also
known as Shared Nothing, since nothing is shared, and it is faster than
Symmetric Multiprocessing.
16. Question 16. Write The Steps Required To Kill The Job In Datastage?
Answer :
To kill a job in Datastage, we have to kill the respective processing ID.
17. Question 17. Contrast Between Validated And Compiled In The
Datastage?
Answer :
In Datastage, validating a job means executing it: while validating, the
Datastage engine checks whether all the required properties are provided.
In contrast, while compiling a job, the Datastage engine checks whether all
the given properties are valid.
18. Question 18. How Can We Perform Date Conversion In Datastage?
Answer :
We can use the date conversion function for this purpose, i.e. Oconv
(Iconv(FieldName, "Existing Date Format"), "Another Date Format").
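The Oconv(Iconv(...)) pattern parses a date from its existing format and then re-emits it in the target format. A Python analogue of the same two-step idea, with made-up field values and format strings:

```python
from datetime import datetime

def convert_date(value, existing_fmt, target_fmt):
    parsed = datetime.strptime(value, existing_fmt)  # like Iconv: parse to internal form
    return parsed.strftime(target_fmt)               # like Oconv: format for output

print(convert_date("2024-01-31", "%Y-%m-%d", "%d/%m/%Y"))  # prints 31/01/2024
```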
19. Question 19. What Is The Need Of Exception Activity In Datastage?
Answer :
All the stages after the Exception activity in Datastage are executed if any
unknown error occurs while running the job sequencer.
20. Question 20. Explain Apt_config In Datastage?
Answer :
It is the environment variable used to identify the *.apt configuration file in
Datastage. This file stores the node information, scratch information, and
disk storage information.
21. Question 21. Write The Different Types Of Lookups In Datastage?
Answer :
There are two types of Lookups in Datastage i.e. Normal lookup and Sparse
lookup.
22. Question 22. How Can We Convert A Server Job To A Parallel Job?
Answer :
We can convert a server job into a parallel job by using the IPC stage and the
Link Collector.
23. Question 23. Explain Repository Tables In Datastage?
Answer :
In Datastage, the Repository is another name for a data warehouse. It can be
centralized as well as distributed.
24. Question 24. Describe Oconv () And Iconv () Functions In Datastage?
Answer :
In Datastage, the OConv() and IConv() functions are used to convert data from
one format to another, i.e. conversions of time, Roman numerals, radix, dates,
numeric ASCII, etc. IConv() is mostly used to convert formats for the system to
understand, while OConv() is used to convert formats for users to understand.
25. Question 25. Define Usage Analysis In Datastage?
Answer :
In Datastage, Usage Analysis takes only a few clicks: launch the Datastage
Manager, right-click the job, and then select Usage Analysis.
26. Question 26. How We Can Find The Number Of Rows In A Sequential
File?
Answer :
To find the number of rows in a sequential file, we can use the system variable @INROWNUM.
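@INROWNUM is Datastage's own row counter inside a job. Outside a job, counting the rows of a sequential (flat) file can be sketched as below; the file name and the header convention are assumptions made for the example.

```python
def count_rows(path, has_header=True):
    with open(path) as f:
        total = sum(1 for _ in f)       # one line per row in a sequential file
    return total - 1 if has_header else total

# Create a small sample file: a header plus two data rows.
with open("seq_file.txt", "w") as f:
    f.write("id,name\n1,Asha\n2,Ravi\n")

print(count_rows("seq_file.txt"))       # prints 2
```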
27. Question 27. Contrast Between Hash File And Sequential File?
Answer :
The only difference between the hash file and the sequential file is that the
hash file stores data using a hash algorithm and a hash key value, while the
sequential file has no key value for saving data. Because of this hash key,
searching in a hash file is faster than in a sequential file.
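The speed difference can be illustrated in Python: a hash (keyed) structure goes straight to the key, while a sequential file has to be scanned row by row. The key format and record count below are invented.

```python
# 1000 key/value records, keys like "K0042".
records = [("K%04d" % i, i) for i in range(1000)]

hash_file = dict(records)        # hash-file style: keyed storage, direct access

def sequential_search(key):
    for k, v in records:         # sequential-file style: scan row by row
        if k == key:
            return v
    return None
```

Both return the same value; the dict lookup is constant time, the scan is linear in the file size.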
28. Question 28. How We Can Clean The Datastage Repository?
Answer :
We can clean the Datastage repository via the Clean Up Resources
functionality in the Datastage Manager.
29. Question 29. How Can A Routine Be Called In A Datastage Job?
Answer :
We can call a routine from the transformer stage in Datastage job.
30. Question 30. Differentiate Between Operational Datastage (ods) And
Data Warehouse?
Answer :
We can say that an ODS is a small data warehouse. An ODS doesn't hold
information for more than one year, while a data warehouse has detailed
information about the entire business.
31. Question 31. What Does NLS Stand For In Datastage?
Answer :
NLS stands for National Language Support. It can be used to incorporate
various languages, such as French, German, and Spanish, into the data, as
required for processing by the data warehouse.
32. Question 32. Can You Explain How One Could Drop The Index
Before Loading The Data In Target In Datastage?
Answer :
In Datastage, we can drop the index before loading the data into the target by
using the Direct Load functionality of the SQL Loader utility.
33. Question 33. Does Datastage Support Slowly Changing Dimensions?
Answer :
Yes, version 8.5 and later supports this feature in Datastage.
34. Question 34. How Are Complex Jobs Implemented In Datastage To
Improve Performance?
Answer :
In order to improve performance in Datastage, it is suggested not to use more
than 20 stages in a job. If you need to use more than 20 stages, then it is
advisable to move the extra stages into a separate job.
35. Question 35. Name The Third Party Tools That Can Be Used In
Datastage?
Answer :
The third party tools that can be used in Datastage, are Autosys, TNG and
Event Co-ordinator.
36. Question 36. Describe Project In Datastage?
Answer :
Whenever we launch the Datastage client, we are asked to connect to a Datastage
project. A Datastage project contains Datastage jobs, built-in components,
and Datastage Designer or user-defined components.
37. Question 37. What Types Of Hash Files Are There?
Answer :
There are two types of hash files: the static hash file and the dynamic
hash file.
38. Question 38. Describe Meta Stage?
Answer :
In Datastage, MetaStage is used to store metadata that is beneficial for data
lineage and data analysis.
39. Question 39. Why Unix Environment Is Useful In Datastage?
Answer :
It is useful in Datastage because sometimes one has to write UNIX programs,
such as batch programs, to invoke batch processing.
40. Question 40. Contrast Between Datastage And Datastage Tx?
Answer :
Datastage is an ETL (Extract, Transform and Load) tool, while Datastage TX is
an EAI (Enterprise Application Integration) tool.
41. Question 41. What Is Size Of A Transaction And An Array Means In A
Datastage?
Answer :
Transaction size means the number of rows written before committing the
records to a table. Array size means the number of rows written to or read
from the table at a time.
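The idea of a transaction size (rows per commit) can be sketched with Python's sqlite3 module; the table, batch size, and row counts here are illustrative, not defaults from any tool.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (n INTEGER)")

rows = [(i,) for i in range(10)]
transaction_size = 4             # commit after every 4 rows written

for start in range(0, len(rows), transaction_size):
    batch = rows[start:start + transaction_size]   # an "array size" worth of rows
    conn.executemany("INSERT INTO target VALUES (?)", batch)
    conn.commit()                # one commit per transaction

count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
print(count)                     # prints 10
```

Committing per batch rather than per row is the usual trade-off between throughput and how much work is redone after a failure.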
42. Question 42. Name The Various Types Of Views In A Datastage Director?
Answer :
There are three types of views in a Datastage Director i.e. Log View, Job View
and Status View.
43. Question 43. What Is The Use Of Surrogate Key?
Answer :
A surrogate key is mostly used to retrieve data faster. It uses an index to
perform the retrieval operation.
44. Question 44. How Are Rejected Rows Processed In Datastage?
Answer :
In Datastage, rejected rows are managed by constraints in the transformer.
We can either place the rejected rows in the properties of a transformer or
create temporary storage for rejected rows with the help of the REJECTED
command.
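A transformer constraint that routes failing rows to a reject store can be sketched as a simple filter. The constraint used here (amount must be non-negative) is invented for the example.

```python
rows = [{"id": 1, "amount": 50}, {"id": 2, "amount": -7}, {"id": 3, "amount": 120}]

output, rejected = [], []
for row in rows:
    if row["amount"] >= 0:       # the constraint
        output.append(row)
    else:
        rejected.append(row)     # temporary storage for rejected rows
```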
45. Question 45. Contrast Between Odbc And Drs Stage?
Answer :
The DRS stage is faster than the ODBC stage because it uses native database
drivers for connectivity.
46. Question 46. Describe Orabulk And Bcp Stages?
Answer :
The Orabulk stage is used to load large amounts of data into one target table
of an Oracle database, while the BCP stage is used to load large amounts of
data into one target table of Microsoft SQL Server.
47. Question 47. Describe Ds Designer?
Answer :
The DS Designer is used to design the work area and add various links to it.
48. Question 48. What Is The Need Of Link Partitioner And Link Collector In
Datastage?
Answer :
In Datastage, the Link Partitioner is used to split data into various parts by
certain partitioning methods, and the Link Collector is used to gather data
from the various partitions into a single stream and save it to the target table.
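Link Partitioner and Link Collector can be sketched as splitting rows across partitions (round-robin here) and gathering them back into one stream. The per-partition work shown is an arbitrary placeholder.

```python
rows = list(range(8))
num_partitions = 3

# Link Partitioner: split the stream across partitions (round-robin method).
partitions = [rows[i::num_partitions] for i in range(num_partitions)]

# Placeholder per-partition work, done independently on each partition.
processed = [[r * 10 for r in part] for part in partitions]

# Link Collector: gather the partitions back into a single stream.
collected = [r for part in processed for r in part]
```

Note that collecting does not restore the original order, which is why the collected stream is usually re-sorted or written keyed.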
DATASTAGE:-

1) Define Data Stage?

A data stage is basically a tool that is used to design, develop and execute
various applications to fill multiple tables in data warehouse or data marts. It is a
program for Windows servers that extracts data from databases and change
them into data warehouses. It has become an essential part of IBM WebSphere
Data Integration suite.

2) Explain how a source file is populated?

We can populate a source file in many ways such as by creating a SQL query
in Oracle, or  by using row generator extract tool etc.

3) Name the command line functions to import and export the DS jobs?

To import the DS jobs, dsimport.exe is used and to export the DS jobs,
dsexport.exe is used.

4) What is the difference between Datastage 7.5 and 7.0?

In Datastage 7.5 many new stages are added for more robustness and smooth
performance, such as Procedure Stage, Command Stage, Generate Report etc.

5) In Datastage, how can you fix the truncated data error?

The truncated data error can be fixed by using the environment variable
IMPORT_REJECT_STRING_FIELD_OVERRUN.

6) Define Merge?

Merge means to join two or more tables. The two tables are joined on the basis
of Primary key columns in both the tables.

7) Differentiate between data file and descriptor file?

As the name implies, data files contain the data and the descriptor file contains
the description/information about the data in the data files.

8) Differentiate between datastage and informatica?

In Datastage, there is a concept of partition and parallelism for node
configuration, while there is no concept of partition and parallelism in
Informatica for node configuration. Also, Informatica is more scalable than
Datastage, and Datastage is more user-friendly as compared to Informatica.

9) Define Routines and their types?

Routines are basically collections of functions defined by the DS Manager.
They can be called via the transformer stage. There are three types of routines:
parallel routines, mainframe routines and server routines.

10) How can you write parallel routines in datastage PX?

We can write parallel routines in C or C++ compiler. Such routines are also
created in DS manager and can be called from transformer stage.

11) What is the method of removing duplicates, without the remove duplicate stage?

Duplicates can be removed by using the Sort stage. We can use the option allow
duplicate = false.

12) What steps should be taken to improve Datastage jobs?

In order to improve performance of Datastage jobs, we have to first establish the
baselines. Secondly, we should not use only one flow for performance testing.
Thirdly, we should work in increments. Then, we should evaluate data skews. Then
we should isolate and solve the problems, one by one. After that, we should
distribute the file systems to remove bottlenecks, if any. Also, we should not
include RDBMS in the start of the testing phase. Last but not the least, we should
understand and assess the available tuning knobs.

13) Differentiate between Join, Merge and Lookup stage?

All the three concepts are different from each other in the way they use the
memory storage, compare input requirements and how they treat various
records. Join and Merge needs less memory as compared to the Lookup stage.

14) Explain Quality stage?

Quality stage is also known as Integrity stage. It assists in integrating different
types of data from various sources.

15) Define Job control?

Job control can be best performed by using Job Control Language (JCL). This tool
is used to execute multiple jobs simultaneously, without using any kind of loop.

16) Differentiate between Symmetric Multiprocessing and Massive Parallel Processing?

In Symmetric Multiprocessing, the hardware resources are shared by the
processors. The processors run one operating system and communicate through
shared memory. In Massive Parallel Processing, each processor accesses the
hardware resources exclusively. This type of processing is also known as Shared
Nothing, since nothing is shared in it. It is faster than Symmetric
Multiprocessing.

17) What are the steps required to kill the job in Datastage?

To kill the job in Datastage, we have to kill the respective processing ID.

18) Differentiate between validated and compiled in the Datastage?

In Datastage, validating a job means executing it. While validating, the
Datastage engine verifies whether all the required properties are provided or not.
In contrast, while compiling a job, the Datastage engine verifies whether all the
given properties are valid or not.
19) How to manage date conversion in Datastage?

We can use the date conversion function for this purpose, i.e.
Oconv(Iconv(FieldName,"Existing Date Format"),"Another Date Format").

20) Why do we use exception activity in Datastage?

All the stages after the exception activity in Datastage are executed if any
unknown error occurs while executing the job sequencer.

21) Define APT_CONFIG in Datastage?

It is the environment variable that is used to identify the *.apt file in Datastage. It
is also used to store the node information, disk storage information and scratch
information.

22) Name the different types of Lookups in Datastage?

There are two types of Lookups in Datastage: the Normal lookup and the Sparse
lookup. In a Normal lookup, the data is saved in memory first and then the lookup
is performed. In a Sparse lookup, the data is looked up directly in the database.
Therefore, the Sparse lookup is faster than the Normal lookup.
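The two lookup styles can be sketched with sqlite3: the normal lookup caches the reference table in memory once, while the sparse lookup issues one query per driving row. The table and column names are invented for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ref (k INTEGER PRIMARY KEY, v TEXT)")
conn.executemany("INSERT INTO ref VALUES (?, ?)", [(1, "a"), (2, "b")])
stream = [2, 1, 2]               # the driving input rows

# Normal lookup: cache the whole reference table in memory first.
cache = dict(conn.execute("SELECT k, v FROM ref"))
normal = [cache.get(k) for k in stream]

# Sparse lookup: one database query per driving row.
def sparse(k):
    row = conn.execute("SELECT v FROM ref WHERE k = ?", (k,)).fetchone()
    return row[0] if row else None

sparse_result = [sparse(k) for k in stream]
```

Both styles produce the same values; they differ in where the work happens, in memory or in the database.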

23) How a server job can be converted to a parallel job?

We can convert a server job in to a parallel job by using IPC stage and Link
Collector.

24) Define Repository tables in Datastage?

In Datastage, the Repository is another name for a data warehouse. It can be
centralized as well as distributed.

25) Define OConv () and IConv () functions in Datastage?

In Datastage, OConv () and IConv() functions are used to convert formats from
one format to another i.e. conversions of roman numbers, time, date, radix,
numeral ASCII etc. IConv () is basically used to convert formats for system to
understand. While, OConv () is used to convert formats for users to understand.
26) Explain Usage Analysis in Datastage?

In Datastage, Usage Analysis is performed within a few clicks: launch the
Datastage Manager, right-click the job, then select Usage Analysis and that's it.

27) How do you find the number of rows in a sequential file?

To find rows in sequential file, we can use the System variable @INROWNUM.

28) Differentiate between Hash file and Sequential file?

The only difference between the Hash file and the Sequential file is that the
Hash file saves data based on a hash algorithm and a hash key value, while the
sequential file doesn't have any key value to save the data. Based on this hash
key feature, searching in a Hash file is faster than in a sequential file.

29) How to clean the Datastage repository?

We can clean the Datastage repository by using the Clean Up Resources
functionality in the Datastage Manager.

30) How a routine is called in Datastage job?

In Datastage, routines are of two types i.e. Before Sub Routines and After Sub
Routines. We can call a routine from the transformer stage in Datastage.

31) Differentiate between Operational Datastage (ODS) and Data warehouse?

We can say, ODS is a mini data warehouse. An ODS doesn't contain information
for more than 1 year, while a data warehouse contains detailed information
regarding the entire business.

32) NLS stands for what in Datastage?

NLS means National Language Support. It can be used to incorporate other
languages such as French, German, and Spanish, in the data required for
processing by the data warehouse. These languages have the same script as the
English language.
33) Can you explain how could anyone drop the index before loading the
data in target in Datastage?

In Datastage, we can drop the index before loading the data in target by using
the Direct Load functionality of SQL Loaded Utility.

34) Does Datastage support slowly changing dimensions?

Yes. Version 8.5 and later supports this feature.

35) How can one find bugs in job sequence?

We can find bugs in job sequence by using DataStage Director.

36) How complex jobs are implemented in Datastage to improve performance?

In order to improve performance in Datastage, it is recommended not to use more
than 20 stages in every job. If you need to use more than 20 stages, then it is
better to use another job for those stages.

37) Name the third party tools that can be used in Datastage?

The third party tools that can be used in Datastage are Autosys, TNG and Event
Co-ordinator.

38) Define Project in Datastage?

Whenever we launch the Datastage client, we are asked to connect to a Datastage
project. A Datastage project contains Datastage jobs, built-in components and
Datastage Designer or User-Defined components.

39) How many types of hash files are there?

There are two types of hash files in DataStage: the Static Hash File and the
Dynamic Hash File. The static hash file is used when a limited amount of data is
to be loaded into the target database; the dynamic hash file is used when we
don't know the amount of data from the source file.
40) Define Meta Stage?

In Datastage, MetaStage is used to save metadata that is helpful for data lineage
and data analysis.

41) Have you ever worked in a UNIX environment, and why is it useful in
Datastage?

Yes, I have worked in a UNIX environment. This knowledge is useful in Datastage
because sometimes one has to write UNIX programs, such as batch programs, to
invoke batch processing.

42) Differentiate between Datastage and Datastage TX?

Datastage is a tool from ETL (Extract, Transform and Load) and Datastage TX is a
tool from EAI (Enterprise Application Integration).

43) What do the size of a transaction and an array mean in Datastage?

Transaction size means the number of rows written before committing the records
in a table. An array size means the number of rows written to or read from the
table at a time.

44) How many types of views are there in a Datastage Director?

There are three types of views in a Datastage Director i.e. Job View, Log View and
Status View.

45) Why we use surrogate key?

In Datastage, we use a Surrogate Key instead of a unique key. A surrogate key is
mostly used for retrieving data faster. It uses an index to perform the retrieval
operation.

46) How rejected rows are managed in Datastage?

In Datastage, the rejected rows are managed through constraints in the
transformer. We can either place the rejected rows in the properties of a
transformer or we can create a temporary storage for rejected rows with the help
of the REJECTED command.

47) Differentiate between ODBC and DRS stage?

DRS stage is faster than the ODBC stage because it uses native databases for
connectivity.

48) Define Orabulk and BCP stages?

Orabulk stage is used to load large amount of data in one target table of Oracle
database. The BCP stage is used to load large amount of data in one target table
of Microsoft SQL Server.

49) Define DS Designer?

The DS Designer is used to design work area and add various links to it.

50) Why do we use Link Partitioner and Link Collector in Datastage?

In Datastage, Link Partitioner is used to divide data into different parts through
certain partitioning methods. Link Collector is used to gather data from various
partitions/segments to a single data and save it in the target table.
