Ds 42 Tutorial en
Tutorial
© 2022 SAP SE or an SAP affiliate company. All rights reserved.
2 Product overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.1 Product components. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 The Designer user interface. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.3 Designer tool palette. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
3 About objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.1 Object hierarchy. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
3.2 Jobs and subordinate objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
3.3 Work flows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
3.4 Data flows. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
3.5 Naming conventions for objects . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
3.6 Delete reusable objects. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
8.4 Adding a work flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
8.5 Adding a data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
8.6 Define the data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Add objects to the DF_SalesOrg data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Define the order of steps in a data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .48
Configure the query transform. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
8.7 Validating the DF_SalesOrg data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
8.8 Viewing details of validation errors and warnings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
8.9 Saving the project. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
8.10 Ensuring that the Job Server is running. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .52
8.11 Executing the job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
8.12 What's next. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
11.7 Leveraging the XML_Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Create a job, work flow, and data flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Unnesting the schema with the XML_Pipeline transform. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
11.8 What's next. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
View audit results. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
14.6 What's next. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
17.2 Defining an SAP application datastore. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
17.3 Importing metadata . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 180
17.4 Repopulate the customer dimension table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
Adding the SAP_CustDim job, work flow, and data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
Adding ABAP data flow to Customer Dimension job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 183
Defining the DF_SAP_CustDim ABAP data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 184
Executing the JOB_SAP_CustDim job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 187
ABAP job execution errors. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .188
17.5 Repopulating the material dimension table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Adding the Material Dimension job, work flow, and data flow. . . . . . . . . . . . . . . . . . . . . . . . . . .189
Adding ABAP data flow to Material Dimension job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
Defining the DF_SAP_MtrlDim ABAP data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 191
Executing the JOB_SAP_MtrlDim job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
17.6 Repopulating the Sales Fact table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 195
Adding the Sales Fact job, work flow, and data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Adding ABAP data flow to Sales Fact job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 196
Defining the DF_ABAP_SalesFact ABAP data flow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 197
Executing the JOB_SAP_SalesFact job. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 203
17.7 What's next. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
1 Introduction to the tutorial
This tutorial introduces you to the basic use of SAP Data Services Designer by explaining key concepts and
providing a series of related exercises and sample data.
Data Services Designer is a graphical user interface (GUI) development environment in which you extract,
transform, and load batch data from flat-file and relational database sources for use in a data warehouse. You
can also use Designer for real-time data extraction and integration.
The tutorial introduces core SAP Data Services Designer functionality. We wrote the tutorial assuming that you have experience in some of the following areas: database management, SQL, and Microsoft Windows.
Related Information
1.2 Tutorial objectives
After you complete this tutorial, you’ll be able to extract, transform, and load data from various source and
target types, and understand the concepts and features of SAP Data Services Designer.
You’ll know about the various Data Services objects such as datastores and transforms, and you’ll be able to
define a file format, import data, and analyze data results.
You’ll learn how to use Data Services Designer features and functions to do the following:
Related Information
2 Product overview
SAP Data Services extracts, transforms, and loads (ETL) data from heterogeneous sources into a target
database or data warehouse.
You specify data mappings and transformations by using Designer, the graphical user interface for Data
Services. Data Services combines industry-leading data quality and integration into one platform. It transforms
your data in many ways. For example, it standardizes input data, adds additional address data, cleanses data,
and removes duplicate entries.
Data Services provides additional support for real-time data movement and access. It performs predefined
operations in real time, as it receives information. The Data Services real-time components also provide
services to Web applications and other client applications.
For a complete list of Data Services resources, see the SAP Help Portal at http://help.sap.com/bods.
Component Description
Job Server Application that launches the Data Services processing engine and serves as an interface to the engine
and other components in the Data Services suite.
Engine Executes individual jobs that you define in the Designer to effectively accomplish the defined tasks.
Repository Database that stores Designer predefined system objects and user-defined objects including source and
target metadata and transformation rules. Use a local repository to store your data and objects. To share
data and objects with others and for version control, use a central repository.
Access Server Passes messages between Web applications and the Data Services Job Server and engines. Provides a
reliable and scalable interface for request-response processing.
Administrator Web administrator that provides the following browser-based administration of Data Services resources:
The following diagram illustrates Data Services product components and relationships.
Parent topic: Product overview [page 9]
Related Information
Use the many tools in SAP Data Services Designer to create objects, projects, data flows, and workflows to
process data.
The Designer interface contains key work areas that help you set up and run jobs. The following illustration
shows the key areas of the Designer user interface.
Parent topic: Product overview [page 9]
Related Information
The SAP Data Services objects to build work flows and data flows appear as icons on the tool palette to the
right of the workspace.
If the object isn't applicable to what you have open in the workspace, Data Services disables the icon. We use
only a few of the objects in the tool palette for the tutorial.
The following table contains descriptions for some of the objects in the tool palette. For descriptions and use of
all objects in the tool palette, see the Designer Guide.
Object Description
Related Information
3 About objects
SAP Data Services objects are entities that you create, add, define, modify, or work with in the software.
Each Data Services object has similar characteristics for creating and configuring objects.
Characteristic Description
Properties Text that describes the object. For example, the name, description, and creation date describe aspects of an object.
Attributes Properties that organize objects and make them easier for you to find. For example, organize objects by attributes such as object types.
The Designer contains a Local Object Library that is divided by tabs. Each tab is labeled with an object type.
Objects in a tab are listed in groups. For example, the Project tab groups projects by project name and further
by job names that exist in the project.
• Projects
• Jobs
• Workflows
• Data flows
• Transforms
• Datastores
• Formats
• Functions
Delete reusable objects [page 21]
SAP Data Services deletes objects differently based on the location in which you choose to delete the
object.
Each object in SAP Data Services has a specific place in the object hierarchy.
The highest object in the hierarchy is a project. All other objects are subordinate to a project. The following
diagram shows the hierarchical order of key object types in Data Services.
When you create objects in a project, add objects in hierarchical order.
Example
In a project, you must first create a batch job, which is the second-highest object in Data Services. Then you can add a work flow and a data flow. A work flow isn't required, but a job must have a data flow to process data.
A data flow can contain only subordinate objects such as tables, transforms, and template tables.
Related Information
Add a job to a project, and build the job by adding subordinate objects in a specific order.
A project is the highest-level object in the Designer hierarchy. Projects organize jobs and the related subordinate
objects in a job.
Note
A job doesn't have to be a part of a project. You can create a job independent of a project. A data flow,
however, must be a part of a job.
Open a project by right-clicking the project in the object library and selecting Open. After you open a project, it
appears in the project area pane. If you open a different project from the Project tab in the object library, Data
Services closes the opened project in the project area and displays the newly opened project.
Build a project by adding subordinate objects in hierarchical order. The following table contains a list of objects
and the related subordinate objects.
Object Subordinate object Subordinate description
Related Information
Use a work flow to specify the order in which SAP Data Services processes multiple objects, including other
work flows and subordinate data flows.
A work flow is a reusable object. It executes only within a job. Use work flows to perform the following tasks:
Example
Open a work flow in the workspace and start to build it by adding applicable objects such as data flows.
Arrange data flows in the workspace so that the output from one data flow is ready as input to the next data flow.
The following is an example of a work flow diagram in a workspace:
Related Information
Use data flows to transform source data into target data in a specific order.
Data flows process data in the order in which they’re arranged in the workspace.
A data flow defines the tasks of a job. It also defines the direction or flow of data. Data flows can be as simple as
having a source, transform, and target. However, data flows can be complicated and involve, for example,
several sources, if/then statements, queries, multiple targets, and more.
• Define the transformations to perform on the data.
• Identify the target object and define the transfer protocol for transformed data.
A data flow is a reusable object. It's always called by either a parent work flow or a parent job.
Related Information
We recommend that you decide on a comprehensive naming convention before you begin creating objects.
Example
The naming convention described in the following table adds prefixes or suffixes to identify the object type,
and includes the job name, to relate the object to a specific job.
Object Prefix or suffix Example
Datastore DS SalesOrg_DS
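For example, the objects that you create in the tutorial exercises follow the same pattern:
Job JOB_ JOB_SalesOrg
Work flow WF_ WF_SalesOrg
Data flow DF_ DF_SalesOrg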
Related Information
Jobs and subordinate objects [page 17]
Work flows [page 18]
Data flows [page 19]
Delete reusable objects [page 21]
SAP Data Services deletes objects differently based on the location in which you choose to delete the object.
The following table describes what Data Services deletes when you delete an object from either the project
area or the object library.
Project area Deletes only the object from the opened project.
Object library Deletes the object from the object library, and deletes all
calls to the object in the following locations:
• Repository
• Parent objects
Before Data Services deletes an object from the object library, it issues a message letting you know when an
object is used in multiple locations. The message provides the following options:
• Yes: Continue with the deletion of the object from the repository.
• No: Cancel the deletion of the object from the repository.
• View Where Used: Display a list of the related objects.
Related Information
Naming conventions for objects [page 20]
4 Preparation for this tutorial
Ensure that you or an administrator perform all of the preparation for the tutorial so that you can successfully
complete each exercise.
The preparation includes some steps that your administrator must complete. Contact your administrator for
important connection information and access information related to those tasks.
We have a complete documentation set for SAP Data Services available on our Customer Help Portal. If you are
unclear about a process in the tutorial, or if you don't understand a concept, refer to the online documentation
at http://help.sap.com/bods.
Required tasks
The steps to prepare for the SAP Data Services tutorial exercises include tasks for your administrator and tasks
that you can perform.
You must have sufficient user permission to perform the exercises in the tutorial. For information about
permissions, see the Administrator Guide.
The following table lists each task and who performs the task. Instructions for administrator-only tasks aren’t
included in the tutorial, but they’re included in other Data Services documents such as the Installation Guide.
Installed through either of the following applications:
• SAP BusinessObjects Business Intelligence platform (BI platform)
• Information platform services (IPS platform)
Installed before the installation of SAP Data Services. More information is in the Installation Guide.
Task Who performs
The scripts add the source and target tables to your RDBMS, and add the other data, such as an XML file, to the Tutorial directory.
The tutorial requires that you have access to an SAP Data Services repository, a source database, and a target
database.
Repository database
SAP Data Services requires a repository database. Data Services installation includes the creation of a
repository database. Ask your administrator for the repository name and password. Then use that information
when you log into Data Services.
If you or your administrator creates a repository specifically for this tutorial, make sure to follow the
instructions in the Post-Installation section, “Configuring repositories”, in the Installation Guide.
Source and target databases
Either request that your administrator create a source and target database, or create the databases yourself. If
you create the databases, you must have the required permissions in the relational database management
system (RDBMS) to create users and databases.
Create the source and target database in either the same RDBMS as the repository, or a different RDBMS. For
example, create a source and target database in SAP SQL Anywhere, which is bundled with Data Services.
To add tables to the source and target databases, you run special scripts designed for certain database
management systems. There is a script for each of the following RDBMS:
• DB2
• Informix
• Microsoft SQL Server
• Microsoft SQL Server 2005 and later versions
• ODBC
• Oracle
• SAP SQL Anywhere (Sybase)
The scripts are prepopulated with the database names, user names, and passwords listed in the following
table.
The scripts also require the server name on which you run your RDBMS.
If you use different database names, user names, or passwords when you create the databases, enter the
information in the handy worksheet located in Database connections worksheet [page 26]. We refer to the
information in the worksheet in several of the tutorial exercises.
Remember
If you use different names, update the scripts with the information that you used.
Database requirements
Ensure that the source and target users have the following permissions in the RDBMS:
Note
For Oracle, set the protocol to TCP/IP and enter a service name; for example, training.sap. The service
name can act as your connection name.
Consult your RDBMS documentation for specific requirements for setting permissions.
Related Information
Complete the database connections worksheet and refer to the information in the sheet to complete the
exercises in this tutorial.
Print this page, or copy it to an editable format, and enter the applicable information in each column. We refer
you to the information in the worksheet in many of the tutorial exercises.
Database Connections
Value Repository Source Target
Database type:
Database name:
User name:
Password:
Related Information
Running the provided SQL scripts [page 27]
Run the tutorial batch files to populate the source and target databases with tables.
Before you perform the following steps, make sure that you have permission to copy and edit files for the
directory in which the scripts are located.
• CreateTables_DB2
• CreateTables_Informix
• CreateTables_MSSQL
• CreateTables_MSSQL2005 (use for Microsoft SQL Server versions 2005 and later)
• CreateTables_ODBC
• CreateTables_ORA
• CreateTables_Sybase
The batch files run scripts that create tables in the source and target databases that you prepared for the
tutorial.
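You don't need to write these scripts yourself, but it helps to understand roughly what they do. The following is a minimal sketch of the kind of statement a script runs against the source database; the column names come from the ODS_CUSTOMER table used later in the tutorial, while the data types and lengths are assumptions and will differ from the delivered scripts.

    -- Hypothetical excerpt: creates one of the source tables in the ODS database.
    -- Column names are taken from the tutorial; data types are assumed for illustration.
    CREATE TABLE ods_customer (
        cust_id         VARCHAR(10) NOT NULL,  -- customer key, used later as the primary key
        cust_classf     VARCHAR(2),            -- customer classification
        name1           VARCHAR(35),
        address         VARCHAR(35),
        city            VARCHAR(35),
        region_id       INTEGER,
        zip             VARCHAR(10),
        cust_timestamp  TIMESTAMP              -- not mapped in the tutorial exercises
    );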
To edit and run the provided SQL scripts, perform the following steps:
1. Locate the batch file for your specific RDBMS in the Data Services installation directory.
Example
Alter the original file name by adding “_original” after the file name. For the batch file
CreateTables_ORA, rename the original file to CreateTables_ORA_original.
Example
You created the source and target databases using Oracle. You completed the database connections
worksheet as follows:
Database name User name Password
Example
You created the source and target databases using Microsoft SQL Server 2019. You completed the
database connections worksheet as follows:
The scripts create the following tables in the source and target databases:
Source tables Target tables
CDC_time CDC_time
cust_dim cust_dim
employee_dim employee_dim
mtrl_dim mtrl_dim
ods_customer sales_fact
ods_delivery salesorg_dim
ods_employee status_table
ods_material time_dim
ods_region
ods_salesitem
ods_salesorder
sales_fact
salesorg_dim
status_table
time_dim
Related Information
5 Tutorial data model
To introduce you to the features in SAP Data Services, the tutorial uses a simplified data model.
The tutorial data model is a sales data warehouse with a star schema that contains one fact table and some
dimension tables.
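In a star schema, the fact table holds the measures and carries a foreign key to each dimension table. As a rough illustration only, a reporting query against such a model might look like the following sketch; the table names match the tutorial, but the fact-table column names and the sales measure are assumptions, not the actual tutorial schema.

    -- Illustration of a star-schema join, not the tutorial's delivered schema.
    -- CUST_ID and DATE_ID are dimension keys that appear later in the tutorial;
    -- the fact-table columns (cust_id, date_id, sales_amount) are assumed.
    SELECT c.name1, t.date_id, f.sales_amount
    FROM sales_fact f
    JOIN cust_dim c ON c.cust_id = f.cust_id
    JOIN time_dim t ON t.date_id = f.date_id;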
In the tutorial, you perform tasks on the sales data warehouse. We divided the tutorial exercises into the
following segments:
Tutorial segments
Segment Lessons
Create datastores and import metadata Introduces how to create and use datastores to access data
from various data sources.
Populate a table from a flat file Introduces basic data flows, query transforms, and source
and target tables.
Populate a table with data from a relational table Introduces data extraction from relational tables.
Populate a table from an XML File Introduces data extraction from nested sources.
Populate a table from multiple relational tables Continues data extraction from relational tables and introduces joins and the lookup function.
The tutorial also has segments that introduce the following concepts:
• Data assessment
• Recovery mechanisms
• Multiuser development
• SAP application data
• Real-time jobs
Complete each segment before going on to the next segment. Each segment creates the jobs and objects that
you need in the next segment. We reinforce each skill in subsequent segments.
6 Logging into the Designer
When you start the tutorial, and each time you resume the tutorial, you need to log into the Designer.
Before you perform the following steps, know your user name and password assigned to you when your
administrator created your user account. Also know the repository name and password.
After you log into the Designer a few times, you won't need to refer to these steps, and you will remember your login credentials.
See an example of the Designer in The Designer user interface [page 11].
Option Description
System-host[:port] The name of the Central Management Server (CMS) system. You may also
need to specify the port when applicable.
Note
If applicable, enter localhost.
User name The user name assigned to you when your administrator created your user account in the Central Management Console (CMC).
Password The password assigned to you when your administrator created your user account in the CMC.
Authentication The authentication type for your user account.
Note
This value is usually Enterprise.
If there is more than one repository, the software displays a list of existing local repositories in the bottom
pane. If there is just one repository, the prompt for your password appears.
4. Enter the repository password and select OK.
Next you learn how to use a datastore to define the connections to the source and target databases.
Related Information
1. Select Project > Exit.
If you haven't saved your changes, SAP Data Services prompts you to save your work before you exit.
2. Select Yes to save your work.
Related Information
When you’re ready to resume the tutorial, start at the point where you exited.
To resume the tutorial, log into SAP Data Services and perform the following steps:
Related Information
7 Create datastores and import metadata
Datastores contain connection configurations to databases and applications in which you have data.
• How to create datastores that connect to the database where the source and target tables are stored.
• How to use the datastores to import metadata from source and target tables into the local repository
Use the connection information in datastores to import metadata from the database or application for which
you created the datastore. Use the metadata from the import in source and target objects in jobs.
• As a source object, Data Services accesses the data through the connection information in the datastore
and loads the data into the data flow.
• As a target object, Data Services outputs processed data from the data flow into the target object and, if
configured to do so, uploads the data to the database or application using the datastore connection
information.
In addition to other elements such as functions and connection information, the metadata in a datastore
consists of the following table elements:
• Table name
• Column names
• Column data types
• Primary and foreign key columns
• Table attributes
Data Services datastores can connect to any of the following databases or applications:
• Databases
• Mainframe file systems
• Applications that have prepackaged or user-written adapters
• J.D. Edwards, One World, J.D. Edwards World, Oracle applications, PeopleSoft, SAP applications, SAP Data
Quality Management, microservices for location data, Siebel applications, and Google BigQuery.
• Remote servers using FTP, SFTP, and SCP
• SAP systems: SAP applications, SAP NetWeaver Business Warehouse (BW) Source, and BW Target
For complete information about datastores, see the Designer Guide. See the various supplements for
information about specific databases and applications. For example, for applications with adapters, see the
Supplement for Adapters.
To perform this task, you need to know the RDBMS that you used for the database. You also need the user
name and password information that you entered for your source database in the database connections
worksheet.
The remaining options change based on the database type you choose.
6. Enter the connection information for the source database.
7. Enter the database name, the user name, and the password for the source database.
If you used the suggested values, enter ODS for the database name, and enter ods for both the user name
and password. Otherwise, consult your database connections worksheet.
Note
For the tutorial, we don't open or complete any of the advanced options.
8. Select OK.
Data Services saves the source datastore in the repository.
Related Information
Create a database datastore to use as a connection to the target database that you created for the tutorial.
To create a target datastore, follow the same process as for the source datastore, except enter Target_DS for
the datastore name. If you used the suggested values, enter Target for the database name, and enter target for
both the user name and password. Otherwise, consult your database connections worksheet for the values you
used.
Related Information
7.3 Importing metadata for source tables
Use the datastore that you created for the source database to import metadata into Designer.
SAP Data Services opens the Datastore Explorer in the right pane. With External Metadata selected at the
top, the explorer lists all the tables in the source database.
Note
Data Services imports the metadata for each table into the local repository.
Note
5. Expand the Tables node under the ODS_DS datastore in the object library.
Related Information
Use the datastore that you created for the target database to import metadata into Designer.
SAP Data Services opens the Datastore Explorer in the right pane. With External Metadata selected at the
top of the right pane, the explorer lists all of the tables in the target database.
Note
Data Services imports the metadata for each table into the local repository.
Note
5. Expand the Tables node under the Target_DS datastore in the object library.
Related Information
When you’ve created the source and target datastores and imported metadata, you’re ready to begin the next
segment.
In the next segment, create a file format to define the schema for a flat file named sales_org.txt. Then use
the information in the file format to populate the Sales Org Dimension table with data from the
sales_org.txt flat file.
Related Information
Importing metadata for source tables [page 38]
Importing metadata for target tables [page 38]
Populate a table from a flat file [page 41]
8 Populate a table from a flat file
Populate a table with data from a flat file using a file format object.
After you complete the tasks in this segment, you'll learn how to:
The goal
The purpose of this segment is to populate the Sales Org Dimension table with data from a source flat file
named sales_org.txt.
The circled portion of the Star Schema in the following diagram shows the portion we’ll work on in this
segment.
Each task in this segment adds objects to an SAP Data Services project. The project contains objects in a
specific hierarchical order.
At the end of each task, save your work. You can either proceed to the next task or exit Data Services. If you exit
Data Services before you save your work, the software asks that you save your work before you exit.
Related Information
A file format specifies a set of properties that describe the structure of a flat file.
Use the SAP Data Services file format editor to create a file format for the flat file named sales_org.txt.
1. Open the Formats tab in the object library and right-click a blank area in the tab.
Option Value
General group:
Type Delimited
Name Format_SalesOrg
Location Local
Date Select ddmmyyyy from the list. If the date format isn't in
the list, type the format in the field.
Input/Output group:
4. In the upper right of the File Format Editor, select date in the DateOpen row under the Data Type column.
5. Select Save & Close.
The following screen capture shows the completed File Format Editor.
The new format, Format_SalesOrg, appears under the Flat Files node in the File Formats tab of the object
library.
To create a new project, log into the Designer and perform the following steps:
A list of your existing projects appears. If you don’t have any projects created, the list is empty.
2. Enter the following name in Project name: Class_Exercises.
3. Select Create.
The project Class_Exercises appears in the Project Area of the Designer, and in the Projects tab of the Local
Object Library.
Select the Save All icon to save the project.
Related Information
To create a job in the Class_Exercises project, log into SAP Data Services Designer and perform the
following steps:
1. Open the Project tab in the object library and double-click Class_Exercises.
A new job node appears under the project node with the name "New Job". At this point, the job name is editable. If you make a menu selection or click away from the new job, you can rename it by right-clicking the job and selecting Rename.
Note
The job appears in the Project Area under Class_Exercises, and in the Jobs tab under the Batch Jobs node
in the Local Object Library.
Related Information
8.4 Adding a work flow
Work flows contain the order of steps in which the software executes a job.
To add a work flow to the Job_SalesOrg batch job, with the Class_Exercises project open in the Project
Area, perform the following steps:
The job opens in the workspace and the tool palette appears to the right of the workspace.
2. Select the work flow button ( ) from the tool palette and select an empty area of the workspace.
A work flow icon appears in the workspace. The work flow also appears in the Project Area hierarchy under
the job JOB_SalesOrg node.
3. Rename the new work flow WF_SalesOrg.
4. Select WF_SalesOrg in the project area.
Note
Work flows are easiest to read in the workspace from left to right and from top to bottom. Keep this
arrangement in mind as you add objects to the work flow workspace.
Related Information
A data flow contains instructions to extract, transform, and load data through data flow objects.
To add a data flow to the WF_SalesOrg workspace, perform the following steps:
1. Select the Data Flow icon ( ) from the tool palette and select an empty area of the workspace.
The data flow icon appears in the workspace and the new data flow appears under the work flow node in
the Project Area.
2. Rename the new data flow DF_SalesOrg.
The project, job, work flow, and data flow objects display in hierarchical order in the Project Area.
Related Information
The data flow contains objects with instructions to SAP Data Services for building the sales organization
dimension table.
To build the sales organization dimension table, add a source file, query object, and a target table to the
DF_SalesOrg data flow in the workspace.
Perform the steps in each of the following task groups to define and configure the DF_SalesOrg data flow:
To add objects to the data flow, open DF_SalesOrg in the workspace and perform the following steps:
1. Open the Formats ( ) tab in the Local Object Library and expand the Flat Files node.
2. Drag and drop the Format_SalesOrg file format to the left side of the workspace and choose Make
Source from the popup menu.
Position the object to the left of the workspace area to make room for other objects.
3. Select the Query icon on the tool palette ( ) and select an area in the workspace to the right of the file
format object.
All the objects necessary to create the sales organization dimension table are now in the workspace. In the next
section, you connect the objects in the order in which you want the data to flow.
Task overview: Define the data flow [page 47]
Next task: Define the order of steps in a data flow [page 48]
To define the order that SAP Data Services processes the objects in the data flow DF_SalesOrg, connect the
objects in a specific order.
Data Services reads objects in a data flow from left to right. Therefore, arrange the objects in order from left to
right.
1. Select the small square on the right edge of the Format_SalesOrg source file and drag your pointer to the
triangle on the left edge of the query transform.
The mouse pointer turns into a hand holding a pencil. When you drag from the square to the triangle, the
software connects the two objects with an arrow that points in the direction of the data flow.
2. Use the same drag technique to connect the square on the right edge of the query transform to the triangle
on the left edge of the SALESORG_DIM target table.
The order of operation is established after you connect all of the objects. Next you configure the query
transform.
Previous task: Add objects to the DF_SalesOrg data flow [page 47]
Next task: Configure the query transform [page 49]
The query transform retrieves a data set that satisfies conditions that you specify.
Before you can configure the query transform, all objects in the data flow must be connected. When you
connect the objects in the data flow, the column information from the source and target files appears in the
Query transform to help you set up the query.
To configure the query transform in the Job_SalesOrg job, perform the following steps:
1. Double-click the query object listed under the DF_SalesOrg node in the Project Area.
The query editor opens. The query editor is divided into the following areas:
• Schema In pane: Lists the columns in the source file
• Schema Out pane: Lists the columns in the target file
• Options pane: Contains tabs for defining the query
2. To map the input columns to the output columns, select the column icon in the Schema In pane and drag it
to the corresponding column in the Schema Out pane. Map the columns as listed in the following table.
SalesOffice → SALESOFFICE
DateOpen → DATEOPEN
Region → REGION
Note
After you drag the input column to the output column, an arrow icon appears next to the source column to
indicate that the column has been mapped.
The following graphic of the Query Editor contains red letters that relate to the descriptions in the following
legend:
• A. Target schema
• B. Source schema
• C. Query option tabs
• D. Column mapping definition
3. Optional: Select a field in the Schema Out area and view the column mapping definition in the Mapping tab
of the options pane.
Example
For example, in the graphic, the mapping for the SalesOffice input column to the SALESOFFICE output
column is: Format_SalesOrg.SalesOffice.
4. Select the cell in the SALESOFFICE row under the Type column in the Schema Out pane and choose
Decimal from the list.
5. Set Precision to 10 and Scale to 2 in the Type:Decimal popup dialog and select OK.
6. Select the Back arrow icon from the Designer toolbar to close the query editor and return to the data
flow workspace.
7. Save your work and optionally close the workspaces that you have open.
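Conceptually, the query transform that you just configured behaves like a simple relational projection from the source into the target. The following SQL is only an analogy of that mapping (the flat file isn't really a database table), including the decimal(10,2) conversion that you defined for SALESOFFICE:

    -- Analogy only: what the DF_SalesOrg query transform expresses.
    -- 'sales_org' stands in for the sales_org.txt flat file read through Format_SalesOrg.
    SELECT CAST(SalesOffice AS DECIMAL(10, 2)) AS SALESOFFICE,
           DateOpen                            AS DATEOPEN,
           Region                              AS REGION
    FROM   sales_org;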
Previous task: Define the order of steps in a data flow [page 48]
The Validation menu provides design-time validation options and not runtime verification. You can check for
runtime errors later in the process.
2. Select Validation > Validate Current View.
Note
You can alternatively use the icon bar and select either the Validate Current icon or the Validate All icon
to perform the same validations.
After the validation completes, Data Services displays the Output dialog with either the Errors tab or the Warnings tab open.
Note
For this exercise, two warning messages appear. The warnings are a result of changing the data type of
the SALESOFFICE column when we defined the output schema. You don't have to fix anything for
warning messages.
For this exercise, there aren't any errors. However, if you do receive errors, fix the errors before you
proceed.
3. Select the “X” in the upper right corner of the Output dialog to close it.
Related Information
If there are errors or warnings after you validate your job, view more details and open the area where the
problem exists.
The job doesn't execute when there are validation errors. Therefore, you must fix errors. Warnings don't
prohibit the job from running.
To help you fix validation errors, view more information about the error. To view more information about
validation errors and warnings in the Output dialog box, perform the following steps:
SAP Data Services displays the Message dialog box in which you can read the expanded notification text.
2. Double-click the error notification or right-click the error and select Go to Error.
Data Services takes you to the object that contains the error.
Example
Data Services opens the target editor with the column SALESOFFICE highlighted in the Schema Out
pane.
Save the objects that you have created and exit SAP Data Services at any time.
To save objects and close Data Services, use one of the following methods:
• Save all changed objects from the current session: Select the Save All icon in the toolbar.
• Save while closing: When you select to close Data Services, it presents a list of all changed objects that
haven't been saved. Select Yes to save all objects in the list, or select specific objects to save and then
select Yes.
Before you execute a job, either as an immediate or scheduled task, ensure that the Job Server is running.
With the SAP Data Services Designer open, look at the bottom right of the page. The status of the Job Server is
indicated with icons.
Icon Description
The name of the active Job Server and port number appears in the status bar in the lower left when the cursor
is over the icon.
An additional icon appears indicating whether the Profiler Server is running. You will know what the icon represents by viewing the message that appears when you hover your mouse over the icon.
When you execute the job, SAP Data Services moves your data through the Query transform and loads the data
to the target table, SALESORG_DIM.
Complete all of the steps to populate the Sales Organization Dimension from a flat file. Ensure that all errors
are fixed and that you save the job. If you exited Data Services, log back in to Data Services, and ensure that the
Job Server is running.
If you have not saved changes that you made to the job, the software prompts you to save them now.
The software validates the job and displays the Execution Properties dialog box.
The Execution Properties dialog box includes parameters and options for executing the job and for setting traces and global variables. Do not change the default settings for this exercise.
4. Select OK.
Data Services displays a job log in the workspace. Trace messages appear while the software executes the
job. If the job encounters errors, an error icon becomes active and the job stops executing.
5. Change the log view by selecting the applicable log icon at the top of the job log.
Log files
Log file Description
Trace log A list of the job steps in the order they started.
Monitor log A list of each step in the job, the number of rows processed by that step, and the time required to complete the operation.
Note
The error icon is not active when there are no errors.
Note
Remember that you should periodically close the tabs in the workspace when you are finished working with
the objects in the tab. To close a tab, click the X icon in the upper right of the workspace.
In the next segment, you populate the Time Dimension table with the following time attributes:
• Year number
• Month number
• Business quarter
You can now exit Data Services or go to the next group of tutorial exercises. If you exit, the software reminds
you to save your work if you did not save it before. The software saves all projects, jobs, workflows, data flows,
and results in the local repository.
Related Information
9 Populate a time dimension table
Time dimension tables contain date and time-related attributes such as season, holiday period, fiscal quarter,
and other attributes that aren’t directly obtainable from traditional SQL style date and time data types.
In this segment, you'll practice the basic skills that you learned in the previous segment. In addition, you'll learn
how to do the following:
The goal
The Time Dimension table in this segment is simple in that it contains only the year number, month number,
and business quarter as time attributes. It uses a Julian date as a primary key.
Note
The name of this section implies that we’re working with time, as in hours, minutes, and seconds. However,
we’re actually working with time in the sense of year, month, and business quarter. Don't confuse our
reference to time with time on the clock.
The following diagram shows the Star Schema with the portion we’ll work on in this segment circled.
Adding a job and data flow to the project [page 57]
Prepare a new job and data flow to populate the Time Dimension table.
Prepare a new job and data flow to populate the Time Dimension table.
Log into SAP Data Services Designer and open the Class_Exercises project in the Project Area.
To add a new job and data flow to the Class_Exercises project, perform the following steps:
1. Right-click the project name Class_Exercises in the Project Area and select New Batch Job.
The new job appears under the Class_Exercises project node in the Project Area and an empty job
workspace opens. Notice that the new job listed in the Project Area contains the generic job name in a text
box.
2. Rename the job to JOB_TimeDim.
3. Right-click in the empty Job_TimeDim workspace and select Add New > Data Flow.
Note
A work flow is an optional object. For this job, we don't include a work flow.
Related Information
The components of the DF_TimeDim data flow consist of a Date_Generation transform as a source and a table
as a target.
To add objects to the DF_TimeDim data flow, perform the following steps in the DF_TimeDim workspace:
1. Open the Transforms tab ( ) in the Local Object Library and expand the Data Integrator
node.
2. Drag and drop the Date_Generation transform onto the data flow workspace.
The transforms in the Transform tab are predefined. The transform on your workspace is a copy of the
predefined Date_Generation transform.
3. Select the Query icon on the tool palette ( ) and select an empty area to the right of the
Date_Generation transform.
SAP Data Services adds a query transform icon to the data flow.
4. Open the Datastore tab in the Local Object Library and expand the Tables node under Target_DS.
5. Drag and drop the TIME_DIM table onto the workspace to the right of the query object and select Make
Target from the popup menu.
6. Connect all of the objects starting with the Date_Generation transform, through the query transform and
finally to the target table.
All of the objects to create the time dimension table are in the workspace.
Related Information
The Date_Generation transform outputs one column of dates in a sequence that you define. For more
information about the Date_Generation transform, see the Reference Guide.
To configure the Date_Generation transform, perform the following steps with the DF_TimeDim open in the
workspace:
Increment daily
3. Select the Back arrow icon in the upper toolbar to close the transform editor and return to the data
flow.
4. Select the Save All icon in the toolbar.
Related Information
Configure the query to map the DI_GENERATED_DATE column from the transform, to apply functions to the
output columns, and to map the output columns to an internal data set.
Perform the following steps with the DF_TimeDim data flow workspace open:
The query editor opens. The Schema In pane of the query editor contains one column from the
Date_Generation transform, DI_GENERATED_DATE. The Schema Out pane has columns that are copied
from the target table.
2. Drag the DI_GENERATED_DATE column from the Schema In pane to the NATIVEDATE column in the
Schema Out pane.
A blue arrow appears to the left of each column name indicating that the column is mapped.
3. Map each of the other output columns in the output schema by performing the following substeps:
a. Select the column name in the Schema Out pane.
b. Type a function for the column in the Mapping tab in the lower pane as directed in the following table.
The following table contains the column name and the corresponding function to type.
DATE_ID julian(di_generated_date) Sets the Julian date for the date value.
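The other output columns use date functions in the same way. The following rows are an illustration only; the output column names shown here are placeholders, so map the functions to the year, month, and business-quarter columns that exist in your TIME_DIM table:
YEAR_NUM to_char(di_generated_date, 'yyyy') Extracts the four-digit year from the date value (placeholder column name).
MONTH_NUM month(di_generated_date) Sets the month number for the date value (placeholder column name).
BUS_QUARTER quarter(di_generated_date) Sets the business quarter for the date value (placeholder column name).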
The data flow validates without errors. If there are errors, go back through the steps up to this point and
perform any steps that you missed.
6. Select the Save All icon in the toolbar.
Related Information
After you save and validate the data flow DF_TimeDim, execute the job, JOB_TimeDim.
The JOB_TimeDim job populates the TIME_DIM dimension table with the transformed data.
The following graphic shows the data flow with the magnifying glass icon circled.
The target data table opens in the lower pane. The following screen capture shows the view of the data. The
table contains a row for each date beginning with the start date that you entered in the Date_Generation
transform editor and ending with the end date. The functions that you entered in the query editor break
down each date by month, business quarter, and year.
Task overview: Populate a time dimension table [page 56]
Related Information
In the next segment, you'll extract data to populate the Customer Dimension table.
At this point, you've populated the following tables in the sales data warehouse:
• Sales Org Dimension from a flat file
• Time Dimension from a transform
Remember to periodically close the workspace tabs when you are finished working with the objects in the tab.
Right now, you can exit SAP Data Services or go to the next segment of tutorial exercises. If you exit, Data
Services reminds you to save any work that you haven't saved. Data Services saves all projects, jobs,
workflows, data flows, and results in the local repository.
Related Information
10 Populate a table with data from a relational table
In this segment, you extract data from a relational table to populate the Customer Dimension table.
While you perform the tasks in this segment, you'll learn some basic features of the interactive debugger. In
addition, you'll learn about the following:
The goal
Populate the Customer Dimension table with data from a relational table. Then use the interactive debugger
feature to examine the data after it flows through each transform or object in the data flow.
The following diagram shows the Star Schema with the portion we’ll work on in this segment circled.
Adding the CustDim job, work flow, and data flow [page 65]
Add a new job, work flow, and data flow to the Class_Exercises project.
Add objects to DF_CustDim data flow in the workspace area to build the instructions for populating the
Customer Dimension table.
10.1 Adding the CustDim job, work flow, and data flow
Add a new job, work flow, and data flow to the Class_Exercises project.
Open the Class_Exercises project so it appears in the Project Area in SAP Data Services Designer.
A tab opens in the workspace area for the new batch job.
2. Rename this job JOB_CustDim.
3. Select the work flow button from the tool palette at right and click in the workspace.
Task overview: Populate a table with data from a relational table [page 64]
Related Information
Executing the CustDim job [page 69]
The interactive debugger [page 70]
What's next [page 74]
Work flows [page 18]
Add objects to DF_CustDim data flow in the workspace area to build the instructions for populating the
Customer Dimension table.
In this exercise, you build the data flow by adding the following objects:
• Source table
• Query transform
• Target table
Parent topic: Populate a table with data from a relational table [page 64]
Related Information
Adding the CustDim job, work flow, and data flow [page 65]
Validating the CustDim data flow [page 68]
Executing the CustDim job [page 69]
The interactive debugger [page 70]
What's next [page 74]
1. Open the Datastore tab in the Local Object Library and expand the Tables node under ODS_DS.
2. Drag and drop the ODS_CUSTOMER table to the workspace and select Make Source from the popup dialog
box.
3. Select the Query button on the tool palette at right ( ) and select an empty area of the workspace to
the right of the ODS_CUSTOMER table.
Next you'll define the input and output schemas in the Query transform.
You configure the query transform by mapping columns from the source to the target objects.
Note
Schema In column Schema Out column
CUST_ID → CUST_ID
CUST_CLASSF → CUST_CLASSF
NAME1 → NAME1
ADDRESS → ADDRESS
CITY → CITY
REGION_ID → REGION_ID
ZIP → ZIP
Note
If your database manager is Microsoft SQL Server or Sybase ASE, specify the columns in the order
shown in the table.
A blue arrow appears to the left of each column indicating the column is mapped. Because you didn't map
the CUST_TIMESTAMP column, it doesn’t have a blue arrow.
3. Right-click CUST_ID in the Schema Out pane and select Primary Key.
A key icon appears to the left of the CUST_ID column in the Schema Out pane indicating that the column is
a primary key.
4. Select the Back arrow icon ( ) in the toolbar to close the query editor and return to the data flow.
5. Save your work.
Validate the data flow before execution to make sure that it’s constructed correctly.
Note
You can alternatively use the icon bar and select either the Validate Current icon or the Validate All icon to perform the same validations.
The Output dialog box opens. If there are errors, the dialog box opens with the Errors tab displayed. In this exercise, the dialog box opens with the Warnings tab displayed.
Task overview: Populate a table with data from a relational table [page 64]
Related Information
Adding the CustDim job, work flow, and data flow [page 65]
Define the data flow [page 66]
Executing the CustDim job [page 69]
The interactive debugger [page 70]
What's next [page 74]
1. Right-click the JOB_CustDim job in the Project Area and select Execute.
The Trace log opens and displays the job execution process. The job completes when you see the message
that the job completed successfully.
3. After the execution completes successfully, view the output data:
a. Open the DF_CustDim data flow in the workspace.
b. Select the magnifying glass icon that appears on the lower right corner of the target object.
For information about the icon options above the sample data, see “Using View Data” in the Designer
Guide.
4. Select the Back arrow icon in the upper toolbar to close the workspace pane.
Task overview: Populate a table with data from a relational table [page 64]
Related Information
Adding the CustDim job, work flow, and data flow [page 65]
Define the data flow [page 66]
Validating the CustDim data flow [page 68]
The interactive debugger [page 70]
What's next [page 74]
10.5 The interactive debugger
SAP Data Services Designer has an interactive debugger that enables you to examine and modify data row by
row during job execution.
The debugger uses filters and breakpoints so that you can examine what happens to the data after each
transform or object in the data flow:
• Debug filter: Functions as a simple query transform with a WHERE clause (see the example expression after this list). Use a filter to reduce a data set in a debug job execution.
• Breakpoint: A point in the execution where the debugger pauses the job execution and returns the control
to you.
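As an illustration (not taken from the tutorial itself), a debug filter or breakpoint condition is simply a Boolean expression over source columns. For example, the REGION_ID test used later in this segment could be written as:

ODS_CUSTOMER.REGION_ID = 2

A debug filter with this expression passes only region 2 rows into the debug run; a breakpoint with this condition pauses execution when such a row is processed.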
When you start a job in the interactive debugger, Data Services opens additional panes in the workspace area.
The following screen capture shows the default locations for the additional panes.
The following icons in the upper toolbar enable you to toggle the panes in the workspace:
• Call Stack
• Debug Variables
• Trace
The Tutorial doesn't show you all aspects of the debugger feature. To learn more about the interactive
debugger, see the Designer Guide.
Parent topic: Populate a table with data from a relational table [page 64]
Related Information
Adding the CustDim job, work flow, and data flow [page 65]
Define the data flow [page 66]
Validating the CustDim data flow [page 68]
Executing the CustDim job [page 69]
What's next [page 74]
A breakpoint is a location in the data flow where a debug job execution pauses and returns control to you.
Ensure that you have the Class_Exercises project open in the Project Area.
To set a breakpoint in the DF_CustDim data flow, perform the following steps:
The following screen capture shows the Breakpoint editor with the correct options selected.
5. Select OK.
Related Information
The interactive debugger stops at intervals so that you can see what's happening during job execution.
Before you perform the following task, make sure that you set a breakpoint by following the steps in Setting a
breakpoint in a data flow [page 71].
The debugging starts. After SAP Data Services processes the first row, the debugger stops the process and
displays the first record, Cust_ID DT01 in the Target Data pane. Also notice that, for each row processed,
a trace message appears in the Trace pane.
Another row replaces the existing row in the Target Data pane.
4. To see all debugged rows, select the All checkbox in the upper right of the Target Data pane.
The Target Data pane shows the first two rows that it has debugged. As you progress through each row, the
Target Data pane adds the processed rows.
5. To stop the debugger, select the Stop Debug icon in the toolbar ( ).
Related Information
Set a condition on the breakpoint to stop processing when a specific condition is met.
Example
Add a breakpoint condition for the Customer Dimension job to break when the debugger reaches a row in
the data with a REGION_ID value of 2.
1. Double-click the breakpoint icon on the connector line between the source and the query in the
DF_CustDim data flow.
4. Type 2 for Value.
5. Select OK.
6. Right-click JOB_CustDim in the Project Area and choose Start debug.
SAP Data Services starts to debug the job. The debugger stops after processing the row that has the value
of 2 for the REGION_ID column.
7. To stop the debug mode, select the Stop Debug icon ( ) in the toolbar.
Related Information
In the next segment, you'll learn about document type definitions (DTD) and extracting data from an XML file.
For more information about the topics covered in this section, see the Designer Guide.
Parent topic: Populate a table with data from a relational table [page 64]
Related Information
Adding the CustDim job, work flow, and data flow [page 65]
Define the data flow [page 66]
Validating the CustDim data flow [page 68]
Executing the CustDim job [page 69]
The interactive debugger [page 70]
Populate a table from an XML File [page 75]
11 Populate a table from an XML File
In this segment, use a DTD (document type definition) file to define the format of an XML file, which has a
hierarchical structure.
An XML file represents hierarchical data using XML tags instead of rows and columns as in a relational table.
In this segment, you'll learn two methods to flatten a nested schema and process an XML file:
• Use a query transform to unnest the nested schema manually.
• Use the XML_Pipeline transform.
Tip
Using an XML_Pipeline transform is much easier than using a query transform. However, performing the
exercises using a query transform first helps you to appreciate the simplicity of the XML_Pipeline method.
To help you understand the goal for the tasks in this section, read about nested data in the Designer Guide.
The goal
Data Services can process hierarchical data only after you’ve flattened the hierarchy. The goal of this segment
is to flatten a nested schema from an XML file and output the data to a table.
The circled portion of the Star Schema in the following diagram shows the portion we’ll work on in this
segment.
Nested data [page 77]
SAP Data Services provides a way to view and manipulate hierarchical relationships within data flow
sources, targets, and transforms using Nested Relational Data Modeling (NRDM).
Adding MtrlDim job, work flow, and data flow [page 78]
To create the objects for this task, we omit the details and rely on the skills that you learned in the first
few exercises of the tutorial.
11.1 Nested data
SAP Data Services provides a way to view and manipulate hierarchical relationships within data flow sources,
targets, and transforms using Nested Relational Data Modeling (NRDM).
In this tutorial, we use an XML file that has a hierarchical structure. We use a document type definition (DTD)
schema to define the XML. The DTD describes the data contained in the XML document and the relationships
among the elements in the data.
The nested data method can represent hierarchical data more concisely than other methods because it avoids repeating information.
Example
For example, when you represent nested data in a single data set, you have repeated information. In the
following table, the first four columns contain repeated information.
Also, a nested schema can itself contain columns and other nested schemas. There is a unique instance of each
nested schema for each row at each level of the relationship.
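The tables from the original example aren't reproduced here. Purely as an illustration, with hypothetical values, a flat single-data-set representation of one sales order with two line items repeats the order-level columns on every row:

OrderNo CustID OrderDate ShipTo ItemNo Qty
9001 C001 2007.01.15 Chicago 10 2
9001 C001 2007.01.15 Chicago 20 5

In the nested representation, OrderNo, CustID, OrderDate, and ShipTo are stored once for the order, and the line items (ItemNo, Qty) appear as rows in a nested schema, so no order-level information is repeated.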
The following screen capture shows the structure of nested source data in the Schema In pane of a query
editor in Designer:
Parent topic: Populate a table from an XML File [page 75]
Related Information
Adding MtrlDim job, work flow, and data flow [page 78]
Importing a document type definition [page 79]
Define the MtrlDim data flow [page 80]
Validating the MtrlDim data flow [page 88]
Executing the MtrlDim job [page 89]
Leveraging the XML_Pipeline [page 89]
What's next [page 93]
To create the objects for this task, we omit the details and rely on the skills that you learned in the first few
exercises of the tutorial.
Object Rename
Job JOB_MtrlDim
Related Information
A document type definition (DTD) schema file describes the data contained in an XML document and the
relationships among the elements in the data.
The scripts that you ran at the beginning of the tutorial added the necessary objects for you to perform the
following task.
Import the DTD schema named Mtrl_List by performing the following steps.
MTRL_MASTER_LIST is the primary node. SAP Data Services imports only elements of the DTD that belong
to this primary node and any subnodes.
Data Services adds the DTD Mtrl_List to the Nested Schemas group in the Local Object Library. The
following is a text view of the Mtrl.dtd file:
<?xml encoding="UTF-8"?>
<!ELEMENT MTRL_MASTER_LIST (MTRL_MASTER+, EFF_DATE)>
<!ELEMENT MTRL_MASTER (MTRL_ID, MTRL_TYPE, IND_SECTOR, MTRL_GROUP, UNIT,
TOLERANCE, HAZMAT_IND*, TEXT+ )>
<!ELEMENT MTRL_ID (#PCDATA)>
<!ELEMENT MTRL_TYPE (#PCDATA)>
<!ELEMENT IND_SECTOR (#PCDATA)>
<!ELEMENT MTRL_GROUP (#PCDATA)>
<!ELEMENT UNIT (#PCDATA)>
<!ELEMENT TOLERANCE (#PCDATA)>
<!ELEMENT HAZMAT_IND (HAZMAT_TYPE, HAZMAT_LEVEL )>
<!ELEMENT HAZMAT_TYPE (#PCDATA)>
<!ELEMENT HAZMAT_LEVEL (#PCDATA)>
<!ELEMENT TEXT (LANGUAGE, SHORT_TEXT, LONG_TEXT*)>
<!ELEMENT LANGUAGE (#PCDATA)>
<!ELEMENT SHORT_TEXT (#PCDATA)>
<!ELEMENT LONG_TEXT (#PCDATA)>
<!ELEMENT EFF_DATE (#PCDATA)>
Related Information
In this exercise you add specific objects to the DF_MtrlDim data flow workspace and connect them in the
order in which the software should process them.
Follow the tasks in this exercise to configure the objects in the DF_MtrlDim data flow so that the data flow
correctly processes hierarchical data from an XML source file.
Related Information
Build the DF_MtrlDim data flow with a source, target, and query transform.
The Source File Editor opens containing the Schema Out options in the upper pane and the Source options
in the lower pane.
5. To complete the options in the Source tab, perform the following substeps:
a. Ensure that XML is selected.
b. Choose <Select file> from the File list.
c. Navigate to <LINK_DIR>\Tutorial Files\, select the mtrl_list.xml file, and select Open.
The File option in the Source tab populates with the file name and location of the XML file.
d. Select Enable validation.
Enable validation compares the incoming data to the stored data type definition (DTD) format.
Data Services automatically populates the following options in the Source tab:
• Format name: The schema name Mtrl_List
• Root element name: The primary node name MTRL_MASTER_LIST
6. Select the Back arrow icon in the toolbar to return to the DF_MtrlDim data flow workspace.
7. Select the Query Transform icon in the tool palette and then select an empty area of the workspace, to the
right of the table object.
8. Rename the query transform “qryunnest”.
9. Drag and drop the MTRL_DIM table from the Tables node under Target_DS to the workspace.
10. Choose Make Target from the popup menu.
11. Connect the objects in the data flow to indicate the flow of data from the source XML file through the query
to the target table.
12. Save your work.
Related Information
Use the query transform to unnest the hierarchical Mtrl_List XML source data properly.
We've broken this process into several segments. Make sure that you take your time and try to understand
what you accomplish in each segment.
The Query editor opens as shown in the following screen capture. Notice the nested structure of the source in
the Schema In pane. The Schema Out pane reflects the current structure of the target table MTRL_DIM. Notice
the differences in column names and data types between the input and output schemas.
In the next several exercises, we use specific configuration settings to systematically unnest the table.
Related Information
Move the MTRL_MASTER schema from the Schema In pane to the Schema Out pane in the query editor for
qryunnest:
1. Select all five columns in the Schema Out pane so they're highlighted:
• MTRL_ID
• MTRL_TYP
• IND_SECTOR
• MTRL_GRP
• DESCR
2. Right-click and select Cut.
SAP Data Services removes the five columns and saves the column names and data types to your
clipboard.
Caution
Don’t use Delete. By selecting Cut instead of Delete, SAP Data Services copies the correct column
names and data types from the target schema to the clipboard. In a later step, we instruct you to paste
the clipboard information to the Schema Out pane of the MTRL_Master target table schema.
3. Drag and drop the MTRL_MASTER schema to the Schema Out pane from the Schema In pane.
The following screen capture shows the results on the qryunnest schema in the Schema Out pane. Notice
that MTRL_MASTER is now nested under the qryunnest schema.
The schema now contains the MTRL_MASTER schema that you just moved to the Schema Out pane in
the query editor.
1. Right-click MTRL_MASTER in the Schema Out pane and choose Make Current.
2. Delete the following columns from under the MTRL_MASTER schema in the Schema Out pane:
• MTRL_ID
• MTRL_TYPE
• IND_SECTOR
• MTRL_GROUP
• UNIT
• TOLERANCE
• HAZMAT_IND
The following screen capture shows the remaining nested nodes under MTRL_MASTER in the Schema Out
pane.
3. Right-click the MTRL_MASTER schema in the Schema Out pane and choose Paste.
The columns that you originally deleted from the Schema Out pane are added back to the schema.
However, now the columns appear under the MTRL_MASTER schema.
4. Map the following fields from the Schema In pane to the corresponding columns in the Schema Out pane:
• MTRL_ID
• MTRL_TYPE
• IND_SECTOR
• MTRL_GROUP
The following screen capture shows the columns that you added back under the MTRL_MASTER schema.
11.4.2.3 3. Map the DESCR column
Map the SHORT_TEXT column in the Schema In pane to the DESCR column in the Schema Out pane.
1. Right-click the DESCR column in the Schema Out pane and choose Cut from the popup menu.
SAP Data Services removes the DESCR column from the Schema Out pane, but saves it to your clipboard.
2. Right-click the TEXT nested table in the Schema Out pane and select Make Current from the popup menu.
3. Right-click the LANGUAGE column in the Schema Out pane and select Paste Insert Below.
Data Services places the DESCR column at the same level as the SHORT_TEXT column.
4. Map the SHORT_TEXT column from the Schema In pane to the DESCR column in the Schema Out pane.
5. Delete the following two columns and nested schema from the Schema Out pane:
• LANGUAGE
• SHORT_TEXT
• TEXT_nt_1
The TEXT nested table in the Schema Out pane contains only the DESCR column.
6. View the results of the steps on the MTRL_DIM target table:
a. Double-click MTRL_DIM in the Project Area under DF_MtrlDim.
The Schema In pane shows the same schemas and columns that appear in the qryunnest query
Schema Out pane. However, the Schema In of the MTRL_DIM target table is still not flat, and it won't
produce the flat schema that the target requires. Therefore, next we flatten the remaining schema.
1. In the Schema Out pane, right-click the TEXT node and choose Unnest.
The table icon next to the TEXT node appears with a left-pointing arrow ( ).
2. Right-click MTRL_MASTER in the Schema Out pane and select Make Current.
The Schema In and Schema Out panes show one level for each.
5. Select the Save All icon in the upper toolbar.
After unnesting the source data using the query in the previous exercises, validate the DF_MtrlDim data flow to make sure
that there are no errors.
The Output dialog box opens in the Warnings tab. There are warning messages indicating that data type
conversion will be used to convert from varchar(1024) to the data type and length of the target columns.
If your design contains any errors in the Errors tab, you must fix them. For example, the following error
indicates that the source schema is still nested: “The flat loader...cannot be connected to NRDM”. Right-
click the error message and select Go to error. If you have syntax errors, a dialog box appears with a
message describing the error. Address all errors before executing the job.
Related Information
11.6 Executing the MtrlDim job
Execute the JOB_MtrlDim to see the unnested data in the output table.
Before you execute the JOB_MtrlDim job, validate the data flow and save your work.
The Trace Messages dialog opens showing processing messages. The last message is that the job
completed successfully.
Open DF_MtrlDim in the workspace and select the magnifying glass icon in the lower right corner of the
MTRL_DIM target table. A table opens in the lower pane showing a sample of the transformed data.
Related Information
The XML_Pipeline transform extracts data from an XML file using tools such as SQL SELECT statements.
When you extract data from an XML file to load into a target data warehouse, you obtain only parts of the XML
file. In the previous exercises, we used the Query transform for partial extraction. The XML_Pipeline transform
extracts much more than the Query transform because it uses many of the clauses of a SQL SELECT
statement. Additionally, the XML_Pipeline transform performs better than the Query transform because of the
way it uses memory:
• Uses less memory: Processes each instance of a repeatable schema within the XML file rather than
building the whole XML structure first.
• Uses memory efficiently: Releases and reuses memory continually to flow XML data through the
transform more steadily.
To build the MTRL_DIM table from a nested XML file, use the XML_Pipeline transform in addition to a Query
transform. Construct the data flow with the following objects:
• XML file: The source
• XML_Pipeline transform: Obtains a repeatable portion of the nested source schema
• Query transform: Maps the output from the XML_Pipeline transform to a flat target schema
• Flat file: The target
Related Information
In this exercise, you’ll achieve the same outcome as in the previous exercise, but you use the XML Pipeline
transform for more efficient configuration and processing.
1. Add the following objects to Class_Exercises in the Project Area using one of the methods you've
learned in previous exercises:
• JOB_Mtrl_Pipe
• WF_Mtrl_Pipe
• DF_Mtrl_Pipe
2. Open DF_Mtrl_Pipe in the workspace.
3. Expand the Nested Schemas node in the Formats tab of the Local Object Library.
4. Drag and drop the Mtrl_List file into the DF_Mtrl_Pipe workspace and choose Make File Source.
5. Double-click the Mtrl_List source file to open the source file editor.
6. In the Source tab, ensure XML is selected.
7. Choose Select file from the File list.
8. Select the mtrl_list.xml in <LINK_DIR>\Tutorial Files\ and select Open.
9. Select Enable Validation.
Enable Validation enables comparison of the incoming data to the stored DTD format.
10. Select the back arrow in the upper toolbar to return to the data flow workspace.
11. Expand the Data Integrator node in the Transforms tab of the Local Object Library.
12. Drag and drop the XML_Pipeline transform to the DF_Mtrl_Pipe workspace.
13. Select the Query transform icon in the tool palette and select an empty area of the workspace.
14. Rename the Query transform Query_Pipeline.
15. Drag and drop the MTRL_DIM table from the Tables node of the Target_DS datastore to the DF_Mtrl_Pipe
workspace and select Make Target.
16. Connect the objects in the data flow to indicate the flow of data from the source XML file through the
XML_Pipeline and Query_Pipeline transforms to the target table.
The following shows an example of the data flow in Designer.
Related Information
The XML_Pipeline transform enables you to map a nested column directly to a flat target table.
Set up the job as instructed in Create a job, work flow, and data flow [page 90].
The transform editor opens. The Schema In pane shows the nested structure of the source file.
2. Drag and drop the following columns from the Schema In pane to the Schema Out pane.
• MTRL_ID
• MTRL_TYPE
• IND_SECTOR
• MTRL_GROUP
• SHORT_TEXT
3. Click the Back arrow icon ( ) from the upper toolbar to close the transform editor.
4. Double-click Query_Pipeline to open the query editor.
5. Map each column from the Schema In pane to the column in the Schema Out pane as shown in the
following table.
MTRL_ID → MTRL_ID
MTRL_TYPE → MTRL_TYPE
IND_SECTOR → IND_SECTOR
MTRL_GROUP → MTRL_GROUP
SHORT_TEXT → DESCR
When you map each column from the Schema In pane to the Schema Out pane, the column Type in
Schema Out doesn't change, even though the input fields have the type varchar(1024).
6. Double-click MTRL_DIM in the data flow to open the target table editor.
7. Open the Options tab in the lower pane and select Delete data from table before loading.
This option deletes existing data in the table before loading new data. If you don’t select this option, SAP
Data Services appends data to the existing table.
8. Select the Back arrow icon in the upper toolbar to close the target table editor.
9. Select the Validate icon from the upper toolbar.
The Warnings tab opens. The warnings indicate that each column will be converted to the data type in the
Schema Out pane.
10. Execute the JOB_Mtrl_Pipe job.
11. Accept the default settings in Execution Properties and select OK.
After the job successfully executes, open DF_Mtrl_Pipe in the workspace and select the magnifying glass
icon in the lower right corner of the MTRL_DIM target table. A table opens in the lower pane showing a sample
of the transformed data.
Related Information
11.8 What's next
In the next segment, learn about using joins and functions to obtain data from multiple relational tables.
Related Information
Tutorial
Populate a table from an XML File PUBLIC 93
12 Populate a table from multiple relational
tables
The goal
Create an inner join to combine data from the ods_SalesItem and ods_SalesOrder tables to populate the
SalesFact table. Then add order status information from the ods_Delivery table using a Lookup function.
The circled portion of the Star Schema in the following diagram shows the portion we’ll work on in this
segment.
More information:
• For information about joins in the Query transform, see the Query transform section in the Reference
Guide.
• For more information about operations on nested data, see the Nested data section in the Designer Guide.
• For more information about the Lookup expression, functions, and filters, see the Designer Guide.
• For more information about Impact and Lineage reports, see the Management Console Guide.
Viewing Impact and Lineage Analysis for the SALES_FACT target table [page 105]
Use the Data Services Management Console to view a lineage analysis of the Sales Fact job.
Use the basic skills that you’ve learned in earlier exercises to set up a new job, work flow, and data flow.
1. Add a new job to the Class_Exercises project in the Project Area and name it JOB_SalesFact.
2. Add the following objects to the job using the skills you've learned in previous exercises:
Task overview: Populate a table from multiple relational tables [page 94]
Related Information
Build the DF_SalesFact data flow by adding objects including two source tables.
Arrange the two sources vertically on the left of the workspace, with one above the other.
7. Save your work.
Task overview: Populate a table from multiple relational tables [page 94]
Related Information
Use an inner join to join the columns of the two source tables to include only the matching columns from both
tables.
SAP Data Services defines the relationship between the ODS_SALESORDER and ODS_SALESITEM tables by
matching the key column SALES_ORDER_NUMBER, which is in both tables. The join option generates a join
expression based on primary and foreign keys and column names.
The values in the SALES_ORDER_NUMBER column must be the same in each table before the record is included
in the output.
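For reference, the generated join expression is an equality on the shared key column; it should look similar to the following line in the FROM clause (shown here as a sketch, not copied from the tutorial):

ODS_SALESORDER.SALES_ORDER_NUMBER = ODS_SALESITEM.SALES_ORDER_NUMBER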
1. Double-click the Query transform in the DF_SalesFact workspace to open the query editor.
2. Open the FROM tab in the lower pane.
6. Click the ellipses icon next to the Right table name ODS_SALESITEM.
The Smart Editor opens. Add a filter to apply to the records that qualify for the inner join.
7. Place your cursor at the end of the first line and press Enter .
8. Type the following two lines, each on its own line, using the casing as shown. Alternately, copy and paste
the text into the Smart Editor:
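The exact lines are provided with the full tutorial materials; as a sketch, assuming the ORDER_DATE column of ODS_SALESORDER and the built-in to_date function (the date-format string is an assumption), the filters take this shape:

AND ODS_SALESORDER.ORDER_DATE >= to_date('2007.01.01', 'YYYY.MM.DD')
AND ODS_SALESORDER.ORDER_DATE <= to_date('2007.12.31', 'YYYY.MM.DD')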
These lines filter the sales orders by date. Data Services moves all orders that are from January 1, 2007 up
to and including December 31, 2007 to the target table.
Tip
If you decide to type the lines, as you type the function names, the Smart Editor prompts you with
options. Either ignore the prompts and keep typing or select an option that is highlighted and press
Enter . You can alternately double-click the prompt to accept it.
The join conditions that you added in the Smart Editor appear in the Join Condition column and in the
FROM Clause area.
10. In the Schema In and Schema Out panes of the query editor, map the following source columns to output
columns using drag and drop.
ORDER_DATE → SLS_DOC_DATE
SALES_LINE_ITEM_ID → SLS_DOC_LINE_NO
MTRL_ID → MATERIAL_NO
PRICE → NET_VALUE
11. Keep the query editor open for the next task.
Task overview: Populate a table from multiple relational tables [page 94]
Related Information
Adding objects to the SalesFact data flow [page 96]
Purpose of the lookup_ext function [page 99]
Configuring the lookup_ext function [page 101]
Executing the SalesFact job [page 104]
Viewing Impact and Lineage Analysis for the SALES_FACT target table [page 105]
What's next [page 107]
The lookup_ext function gets data from a lookup table and outputs the data when user-defined conditions
are met.
In this example, we create a lookup_ext function to output data to the SALES_FACT target table from a non-
source table. We designate the non-source table as a lookup table in the data flow configuration.
The SALES_FACT target table contains a column named ORD_STATUS that we haven't mapped because there
are no comparable columns in our two source tables. The ODS_DELIVERY table contains the order status
information in the DEL_ORDER_STATUS column. Therefore, we establish ODS_DELIVERY as a lookup table so
that we include the order status information in our target table. To ensure that the correct delivery order status
is output with each record, we set conditions.
The following table shows the columns from the lookup table and the corresponding columns in the source
table that we use in the lookup_ext conditions. The values in each field pair must match to satisfy the
conditions.
DEL_SALES_ORDER_NUMBER = SALES_ORDER_NUMBER
DEL_ORDER_ITEM_NUMBER = SALES_LINE_ITEM_ID
The syntax of the lookup_ext function seems complicated; however, a graphical user interface helps you
create the function. The following code shows the syntax of the lookup_ext function with just the
portions that we use in this example:
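The syntax listing isn't reproduced here. As a simplified sketch derived from the completed function call shown later in this exercise (see the Reference Guide for the authoritative syntax), the portions we use have this shape:

lookup_ext([<lookup_table>, <cache_spec>, <return_policy>],
 [<return_column_list>],
 [<default_value_list>],
 [<condition_column>, '<operator>', <compare_column>, ...])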
The following table describes each variable and the values for the lookup_ext we set in this example. For a
complete list of all of the options, see “lookup_ext” in the “Descriptions of Data Services built-in functions”
section of the Reference Guide.
Note
Because some of the function sections are too wide for the table, we've shown them with line breaks.
Condition 2: DEL_ORDER_ITEM_NUMBER, '=', ODS_SALESITEM.SALES_LINE_ITEM_ID
Parent topic: Populate a table from multiple relational tables [page 94]
Related Information
What's next [page 107]
The lookup_ext function retrieves data from a column in the ODS_DELIVERY table to include in the
SALES_FACT output table.
The following steps continue from the exercise in Creating an inner join [page 97].
For the following exercise, we use the ODS_DELIVERY table as the lookup table. We'll create two conditions in
the mapping of the ORD_STATUS column in the Schema Out pane. Perform the following steps in the
DF_SalesFact query editor:
The column hasn't been mapped yet, so a Column icon appears to the left of the column name.
2. Open the Mapping tab in the lower pane and select Functions....
The Select Parameters dialog box opens with options to define the Lookup_Ext function.
6. Establish ODS_DELIVERY as the lookup table by performing the following substeps:
Note
The lookup table is where the lookup_ext function obtains the value to put into the ORD_STATUS
column.
a. Select the down pointing arrow at the end of the Lookup Table option at the top.
The Input Parameter dialog box closes. The ODS_DELIVERY table is now the lookup table.
7. Expand the two source tables to expose the columns to use for the conditions:
a. Expand the Lookup table node at left and then expand the ODS_DELIVERY subnode.
b. Expand the Input Schema node at left and then expand the ODS_SALESITEM subnode.
We don't use the ODS_SALESORDER source table because it doesn't contain the columns we need for
the conditions in the function.
8. Set the first condition:
a. Drag and drop DEL_SALES_ORDER_NUMBER from under ODS_DELIVERY to the Condition group under
Column in Lookup table.
b. Ensure that the equal sign (=) appears under Op.(&).
c. Select the ellipses at the end of the row.
ODS_DELIVERY.DEL_SALES_ORDER_NUMBER = ODS_SALESITEM.SALES_ORDER_NUMBER
ODS_DELIVERY.DEL_ORDER_ITEM_NUMBER = ODS_SALESITEM.SALES_LINE_ITEM_ID
The following screen capture shows the completed Select Parameters dialog box:
The final lookup_ext function displays in the Mapping tab and looks as follows:
lookup_ext([ODS_DS.DBO.ODS_DELIVERY,'PRE_LOAD_CACHE','MAX'],
 [DEL_ORDER_STATUS],[NULL],
 [DEL_SALES_ORDER_NUMBER,'=',ODS_SALESITEM.SALES_ORDER_NUMBER,
  DEL_ORDER_ITEM_NUMBER,'=',ODS_SALESITEM.SALES_LINE_ITEM_ID])
SET ("run_as_separate_process"='no', "output_cols_info"='<?xml version="1.0" encoding="UTF-8"?><output_cols_info><col index="1" expression="no"/></output_cols_info>' )
12. Select Validate Current in the upper toolbar to make sure that there are no errors.
13. Select Back ( ) in the upper toolbar to close the query editor.
14. Save your work.
Task overview: Populate a table from multiple relational tables [page 94]
Related Information
After you have performed the validation step and fixed any errors, execute the JOB_SalesFact job.
The trace messages open. The execution is complete when you see the message in trace messages that
the job completed successfully.
3. Select DF_SalesFact in the Project Area to open it in the workspace.
4. Select the View Data icon (magnifying-glass) on the lower right corner of the SALES_FACT target object to
view 17 rows of data.
• Based on the filter you set for the inner join, the records show dates in the ORDER_DATE column that are
greater than or equal to January 1, 2007 and less than or equal to December 31, 2007.
• The ORD_STATUS column contains either a “D” or an “O” to indicate that the order status is D = delivered
or O = ordered.
Task overview: Populate a table from multiple relational tables [page 94]
Related Information
12.7 Viewing Impact and Lineage Analysis for the
SALES_FACT target table
Use the Data Services Management Console to view a lineage analysis of the Sales Fact job.
View information about the SALES_FACT target table by performing the following steps:
1. In SAP Data Services Designer, select Tools > Data Services Management Console.
The Impact and Lineage page opens with Objects to Analyze at left and repository information at right.
4. Select Settings in the upper right corner.
5. Check the name in the Repository text box at right to make sure that it contains the current repository.
6. Open the Refresh Usage Data tab to make sure that it lists the current job server in the Job Server text box.
7. Select Calculate Column Mapping.
The software calculates the current column mapping and displays a notification that column mappings are
calculated successfully at the top of the tab.
8. Select Close.
9. In the file tree at left, expand Datastores and then Target_DS to view the list of tables.
10. Expand Data Flow Column Mapping Calculation in the right pane to view the calculation status of each data
flow.
11. Select the SALES_FACT table name under Target_DS in the file tree.
The Overview tab for SALES_FACT table opens at right. The Overview tab displays general information
about the table such as the table datastore name and the table type.
12. Open the Lineage tab.
The following screen capture shows the Impact and Lineage Analysis for the SALES_FACT target table.
When you move the pointer over a source table icon, the name of the datastore, data flow, and owner
appear.
13. Expand the SALES_FACT table in the file tree and double-click the ORD_STATUS column.
The Lineage tab in the right-pane refreshes to show the lineage for the column. The following screen
capture shows the lineage of SALES_FACT.ORD_STATUS.
Notice that it shows the lookup table as the source for the ORD_STATUS column.
14. Print the reports by selecting the print option in your browser. For example, for Chrome, select the Tools
icon in the upper right and select Print.
Task overview: Populate a table from multiple relational tables [page 94]
Related Information
In the next segment, you'll learn how to take advantage of change data capture.
Parent topic: Populate a table from multiple relational tables [page 94]
Related Information
13 Changed data capture
Changed data capture (CDC) extracts only new or modified data after you process an initial load of the data to
the target system.
In this segment, you'll learn how to use SAP Data Services data flows and scripts to build logic for finding
changed data. This method uses date, time, or datetime stamps to identify new rows added to a source table at
a given point in time. To learn about the other methods for capturing changed data, see the Designer Guide.
The tasks in this segment use the following CDC objects:
• Global variables
• Template tables
• Scripts
• Custom functions
The goal
In this exercise, we create two jobs, an initial load job and a delta load job:
• Initial load job: Loads all rows from a source table to a target table. The job deletes all data in the target
table before it loads data. Therefore, the target data is the same as the source data.
• Delta load job: Reads data from the source table, but inserts only the new data into the target because of the
WHERE condition. The WHERE condition filters the data between the $GV_STARTTIME and $GV_ENDTIME
global variables. Therefore, the job loads only the changed data to the target table. The job doesn't delete
existing data in the target table before loading changed data.
In the initial load job, Data Services establishes a baseline using an assigned date and time for each row in the
data source. In the delta load job, Data Services determines which rows are new or changed based on the last
date and time data.
The target database contains a job status table called CDC_time. Data Services stores the last date and time
data for each row in CDC_time. The delta load job updates that date and time for the next execution.
Creating the initial load job and defining global variables [page 110]
Create a job that processes the initial load of data.
Use the replicate feature to copy the existing DF_CDC_Initial data flow and use the copy in the delta
load job.
Creating the Delta job and scripts for global variables [page 116]
SAP Data Services uses the delta load job to update the target table with data that is new or changed
since the last time the job ran.
Global variables are symbolic placeholders for values in a specific job that increase the flexibility and reusability
of jobs.
In general, an initial job contains the usual objects, such as a source, transform, and a target. But, an initial job
can also serve as a baseline for the source data through global variables.
• Global variables are available within the job for which they were created. They aren’t available for any other
jobs.
Example
For example, you have 26 jobs named JobA, JobB, through JobZ. You create a global variable for JobA.
You can't use the JobA global variable for JobsB through Z.
• Set values for global variables in several ways, including in scripts, at job execution, or in job schedule
properties.
• Global variables provide you with maximum flexibility at runtime.
Example
For example, you can change default values for global variables at runtime from a job's schedule or
SOAP call without having to open a job in the SAP Data Services Designer.
For complete information about using global variables in Data Services, see the Designer Guide.
Related Information
Creating the initial load job and defining global variables [page 110]
Replicating the initial load data flow [page 116]
Creating the Delta job and scripts for global variables [page 116]
Execute the jobs [page 119]
What's next [page 122]
After you create the initial load job, create two global variables. The global variables serve as placeholders for
job execution start and end time stamps.
1. Open Class_Exercises in the Project Area and create a new batch job.
2. Rename the new job JOB_CDC_Initial.
3. Select job JOB_CDC_Initial in the Project Area to highlight it.
The Variables and Parameters dialog box opens. The dialog box displays the job name in the context
header.
5. Right-click Global Variables and select Insert.
Map the columns from the source table to the output schema in the QryCDC query, and add a function
that checks the date and time.
Related Information
Add objects to the JOB_CDC_Initial job, including a work flow, data flow, initialization script, and termination
script.
The initialization and termination scripts define values for the global variables.
5. Add a script object ( ) from the tool palette to the left side of the work flow workspace.
6. Name the script SET_START_END_TIME.
7. Add a data flow object from the tool palette to the right of the script object in the workspace.
8. Name the data flow DF_CDC_Initial.
9. Add a second script object from the tool palette to the right of the DF_CDC_Initial data flow object.
10. Name the second script UPDATE_CDC_TIME_TABLE.
11. Connect the objects from left to right to set the order of the work flow.
The following screen capture shows the completed WF_CDC_Initial workflow:
Next you add functions to the scripts.
Task overview: Creating the initial load job and defining global variables [page 110]
Related Information
Use your database management scripting language to add expressions that define the values for the global
variables.
When you define scripts, make sure that you follow the syntax rules for your database management system.
Before you define the scripts, check the date and time in the existing database to make sure you use a date in
the script that includes all of the records. To check the date and time, perform the following prerequisite steps:
To define the scripts in the WF_CDC_Initial work flow, perform the following steps:
1. Expand the WF_CDC_Initial node in the Project Area and select the SET_START_END_TIME script to open
it in the workspace.
This script establishes the $GV_STARTTIME as a date that includes all records in the table. The end time is
the time of job execution. The initial data flow captures all of the rows in the source table. In the
prerequisite steps, you noted the date for all rows in the source table as 2008.03.27 00:00:00. Therefore,
set the value for $GV_STARTTIME to a date that includes all rows. For example, the date 2008.01.01
00:00:000 is before the timestamp date in the table, so it includes all rows from the table.
a. Enter the script directly in the text area using the syntax applicable for your database.
The following example is for Microsoft SQL Server. It establishes a start date and time of 2008.01.01
00:00:000. Then it establishes that the end date is the system date for the initial load job execution:
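The script text isn't reproduced here. A minimal sketch in Data Services script syntax, based on the values described above (verify the exact statements against the full tutorial and your database's date format):

# Baseline start time that predates every row in the source table.
$GV_STARTTIME = '2008.01.01 00:00:000';
# End time is the system date and time at job execution.
$GV_ENDTIME = sysdate();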
Tip
As you start to type the global variable name, a list of variable names appears. Double-click the
applicable variable name from the list to add it to the string.
b. Select the Validate Current icon in the upper toolbar to validate the script.
Fix any syntax errors and revalidate if necessary.
Note
Even if SAP Data Services doesn't find syntax errors, your DBMS can find syntax errors when you
execute the job.
The script resets the LAST_TIME column value in the CDC_time job status table to the system date in
$GV_ENDTIME.
a. Enter the script directly in the text area using the syntax applicable for your database.
The following example is for Microsoft SQL Server. The function deletes the current date in the
CDC_time job status table and inserts the value from the global variable $GV_ENDTIME:
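The script text isn't reproduced here. A sketch of what the UPDATE_CDC_TIME_TABLE script might look like, assuming the CDC_time table is reached through the Target_DS datastore with owner ODS (substitute your datastore and owner names; the exact statements are in the full tutorial):

# Remove the previous timestamp from the job status table.
sql('Target_DS', 'DELETE FROM ODS.CDC_TIME');
# Record this execution's end time for use as the next delta load's start time.
sql('Target_DS', 'INSERT INTO ODS.CDC_TIME VALUES ({$GV_ENDTIME})');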
Note
Note that in the script, “ODS” is the owner name. Use your owner name in your script.
Note
Even if SAP Data Services doesn't find syntax errors, your DBMS can find syntax errors when you
execute the job.
Task overview: Creating the initial load job and defining global variables [page 110]
Related Information
13.2.3 Defining the data flow
Define the data flow by adding a query and a template table to the data flow.
With a target template table, you don’t have to specify the table schema or import metadata. Instead, during
job execution, SAP Data Services has the DBMS create the table with the schema defined by the data flow.
Template tables appear in the Local Object Library under each datastore.
Data Services adds the Query to the data flow in the workspace.
5. Rename the query QryCDC.
6. Expand Target_DS in the Datastores tab in the Local Object Library.
7. Drag and drop the Template Tables icon to the workspace as the target in the data flow.
Task overview: Creating the initial load job and defining global variables [page 110]
Related Information
13.2.4 Defining the QryCDC query
Map the columns from the source table to the output schema in the QryCDC query, and add a function that
checks the date and time.
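The detailed steps aren't reproduced here. The key setting is the WHERE clause of the QryCDC query, which compares the source timestamp column to the two global variables. A sketch, assuming the ODS_CUSTOMER.CUST_TIMESTAMP column mentioned earlier in the tutorial:

(ODS_CUSTOMER.CUST_TIMESTAMP >= $GV_STARTTIME) AND (ODS_CUSTOMER.CUST_TIMESTAMP <= $GV_ENDTIME)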
Task overview: Creating the initial load job and defining global variables [page 110]
Related Information
13.3 Replicating the initial load data flow
Use the replicate feature to copy the existing DF_CDC_Initial data flow and use the copy in the delta load
job.
After you replicate the data flow, change the name and adjust some of the settings.
A new data flow appears in the Data Flow list with "Copy_1" added to the data flow name.
3. Rename the copied data flow DF_CDC_Delta.
4. Double-click the DF_CDC_Delta data flow in the Local Object Library to open it in the workspace.
5. Double-click the CUST_CDC target table object in the workspace to open the Template Target Table Editor.
6. Open the Options tab in the lower pane.
7. Deselect Delete data from table before loading.
This step enables the job to update the target table with changed data while retaining the current data.
8. Close the Template Target Table Editor.
9. Save your work.
Related Information
13.4 Creating the Delta job and scripts for global variables
SAP Data Services uses the delta load job to update the target table with data that is new or changed since the
last time the job ran.
1. Create a new batch job in Class_Exercises and name the job JOB_CDC_Delta.
2. Select the JOB_CDC_Delta job name in the Project Area to highlight it.
The Variables and Parameters dialog box opens. The banner of the dialog box contains the job name
JOB_CDC_Delta.
4. Right-click Global Variables and choose Insert.
Even though you use the same global variable names as for the initial load job, Data Services doesn't consider
them as duplicates because you create them for different jobs.
Related Information
6. Add a second script from the tool palette to the right of the data flow object in the workspace and name it
UPDATE_CDC_TIME_TABLE.
7. Connect the objects in the workspace from left to right.
8. Save your work.
Task overview: Creating the Delta job and scripts for global variables [page 116]
Related Information
Define the $GV_STARTTIME and $GV_ENDTIME global variables in the delta load scripts.
When you define scripts, make sure that you follow the rules for your database management system.
Perform the following steps beginning in the WF_CDC_Delta work flow workspace in SAP Data Services
Designer:
Note
Note that in the script, “ODS” is the owner name. Use your owner name in your script.
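The script text isn't reproduced here. One possible form of the delta load SET_START_END_TIME script, assuming the start time is read back from the CDC_time job status table (the sql statement, datastore name, and conversion format are assumptions; check the full tutorial for the exact text):

# Start from the time stamp recorded by the last successful run.
$GV_STARTTIME = to_date(sql('Target_DS', 'SELECT LAST_TIME FROM ODS.CDC_TIME'), 'YYYY.MM.DD HH24:MI:SS');
# End time is the system date and time at job execution.
$GV_ENDTIME = sysdate();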
Fix any syntax errors and revalidate until there are no errors.
Note
Even if SAP Data Services doesn't find syntax errors, your DBMS can find syntax errors when you
execute the job.
d. Select the Back arrow icon in the toolbar to close the Script Editor.
2. Define the UPDATE_CDC_TIME_TABLE script:
a. Double-click the UPDATE_CDC_TIME_TABLE script in the workspace to open the Script Editor.
b. Define the script to replace the value in the LAST_TIME column in the CDC_Time job status table to the
system date that is defined in $GV_ENDTIME global variable.
Note
Note that in the script, “ODS” is the owner name. Use the actual owner name in your script.
Note
Even if SAP Data Services doesn't find syntax errors, your DBMS can find syntax errors when you
execute the job.
Correct any errors, and ignore any warnings for this exercise.
4. Save your work and close all workspace tabs.
Task overview: Creating the Delta job and scripts for global variables [page 116]
Related Information
To understand how change data capture (CDC) works, execute the JOB_CDC_Initial job, and then use your DBMS to
change the data in the ODS_CUSTOMER table before you execute the JOB_CDC_Delta job.
The JOB_CDC_Delta job extracts the changed data from the table and updates the target table with only the
changed data.
View the results to see the different time stamps and to verify that only the changed data was loaded to the
target table.
Executing the delta load job [page 121]
The delta-load job outputs the row that you added to the table after you ran the initial-load job.
Related Information
The initial load job outputs the source data to the target table, and updates the job status table with the job
execution date and time.
Use your DBMS to open the ODS_CUSTOMER table. Notice that there are 12 rows. The columns are the same as
the columns that appear in the Schema In pane in the QryCDC object.
To execute the initial load job, perform the following steps in SAP Data Services Designer:
1. Right-click the JOB_CDC_Initial job in the Project Area and select Execute.
2. Accept all of the default settings in the Execution Properties dialog box and select OK.
The Job Log opens in the workspace area. View the messages in the Trace page. If there’s an error, the Error
icon activates and processing stops. Select the Error icon and read the error messages. Even when the
script syntax validates in Designer, your database management system can still issue script errors when you execute the job.
3. After successful execution, select the Monitor icon ( ) and view the Row Count column. The job is
successful when the column contains 12, which indicates the job processed all 12 rows of the source table.
4. View the data in the CUST_CDC target table.
The CUST_CDC target table contains the rows from the ODS_CUSTOMER source table.
Related Information
13.5.2 Changing the source data
To see change data capture (CDC) in action, add a row to the ODS_CUSTOMER table and execute the delta load
job.
Note
If your database does not allow nulls for some fields, copy the data from another row.
Cust_ID ZZ01
Cust_Classf ZZ
Name1 EZ BI
Address NULL
City NULL
Region_ID NULL
ZIP ZZZZZ
Related Information
The delta-load job outputs the row that you added to the table after you ran the initial-load job.
Before you perform the following steps, ensure that you added an additional row to the ODS_Customer table in
your database management system.
1. Right-click the JOB_CDC_Delta job in the Project Area and select Execute.
2. Accept all of the default settings in the Execution Properties dialog box and select OK.
The Job Log opens in the workspace area. View the messages in the Trace page. If there’s an error, the Error
icon activates and processing stops. Select the Error icon and read the error messages. Even when the
script syntax validates in Designer, your database management system can still issue script errors when you execute the job.
3. After successful execution, select the Monitor icon ( ) and view the Row Count column.
The job is successful when the column contains 1, which indicates the job processed only the changed row
in the source table.
4. View the data in the CUST_CDC target table.
The row that you added to the table in your database management system appears in the CUST_CDC target
table along with the original content of the table.
Related Information
In the next segment, learn how to verify your source data and improve source data quality.
Related Information
14 Data assessment
Use data assessment features to identify problems in your data, separate out bad data, and audit data to
improve the quality and validity of your data.
Data Assessment provides features that enable you to trust the accuracy and quality of your source data.
In this segment, learn about the following methods to profile and audit data details:
• View table data and use the profile tools to view the default profile statistics.
• Use the Validation transform in a data flow to find records in your data that violate a data format
requirement in a specific column.
• Create an audit expression and an action for when a record fails the expression.
• Add an additional target table for records that fail an audit rule.
• View audit details in Operational Dashboard reports in the Data Services Management Console
The goal
In the previous exercise for change data capture, we instructed you to add the value ZZZZZ in the ZIP
column. For this exercise, we employ a business rule from a fictional company that requires that the target ZIP
column contain only numeric data.
The exercises in this section introduce the following Data Services features:
• Data profiling: Pulls specific data statistics about the quality of your source data.
• Validation transform: Applies your business rules to data and sends data that failed the rules to a separate
target table.
• Audit dataflow: Outputs invalid records to a separate table.
• Auditing tools: Tracks your jobs in the Data Services Management Console.
For more information about data assessment features, see the Designer Guide.
Audit objects [page 130]
Auditing provides a way to ensure that a data flow loads correct data into intended targets.
The Data Profiler executes on a profiler server to provide column and relationship information about your data.
The software reveals statistics for each column that you choose to evaluate. The following table describes the
default statistics.
Statistic Description
Distincts The total number of distinct values out of all records for the column.
Nulls The total number of NULL values out of all records in the column.
For more information about using the data profiler, see the Data Assessment section of the Designer Guide.
Related Information
Use features of the View Data dialog box to see profile statistics about source data that help you determine
data quality before processing.
1. Open the Datastores tab in the Local Object Library and expand ODS_DS > Tables.
2. Right-click the ODS_CUSTOMER table and select View Data.
3. Select the Profile Tab icon ( ), which is the second tab from the left.
The Profile Tab opens. The first column contains the column names in the table. Subsequent columns
contain profile information for each column. The following screen capture shows the Profile Tab for
ODS_CUSTOMER.
Notice the ZIP column contains “ZZZZZ” in the Max column.
4. After you examine the statistics, select the “X” in the upper right corner to close the View Data dialog.
Next, create a validation job that changes the invalid entry of “ZZZZZ” in the ZIP column to blank.
Related Information
The Validation transform qualifies a data set based on rules for input schema columns.
Use a Validation transform to define rules that sort good data from bad data. The Validation transform outputs
up to three values: Pass, Fail, and RuleViolation. Data outputs are based on the condition that you specify
in the transform.
For this exercise, we set up a Pass target table for the first job execution. Then we alter the first job by adding a
Fail target table with audit rules.
Parent topic: Data assessment [page 123]
Related Information
1. Add a new job to the Class_Exercises project and name the job JOB_CustGood.
2. Open JOB_CustGood in the workspace.
3. Add a Data Flow object to the JOB_CustGood workspace from the tool palette and name the data flow
DF_CustGood.
4. Open the DF_CustGood data flow in the workspace.
5. Add the ODS_CUSTOMER table to the DF_CustGood data flow workspace and select Make Source from the
popup menu.
Find ODS_CUSTOMER in the Datastores tab of the Local Object Library under ODS_DS.
6. Add a Validation transform icon to the DF_CustGood workspace.
Find the Validation transform in the Transform tab of the Local Object Library under the Platform node.
7. Add a Template Table icon to the DF_CustGood workspace.
Find the Template Table icon in the Datastores tab of the Local Object Library under Target_DS.
The Pass option requires that SAP Data Services passes all rows to the target table, even rows that fail the
validation rules. The following screen capture shows the data flow:
12. Save your work.
Related Information
Create a rule in the Validation transform that marks records that have a noncompliant value in the ZIP column
and substitutes <Blank> for the noncompliant value.
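The exercise builds the rule through the Rule Editor options rather than by typing an expression. As a hedged illustration of what the rule checks, an equivalent custom validation condition could use the built-in match_pattern function, where each 9 matches a single digit:

match_pattern(ODS_CUSTOMER.ZIP, '99999') = 1

Rows for which this condition is false fail the 5_Digit_ZIP_Column_Rule.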
To configure the Validation transform, open the DF_CustGood data flow workspace and perform the following
steps:
Option Instruction
Note
A small check mark appears next to the specified validation column in the Schema In pane of the Transform Editor.
Send to Pass passes the row to the target table even when
it fails the 5_Digit_ZIP_Column_Rule rule.
Bindings
The following screenshot shows the completed Rule Editor dialog box:
4. Select OK.
The Rule Editor dialog box closes. The new rule appears in the Validation Rules tab under Rules.
5. Select the Enabled checkbox under If any rule fails and Send to Pass, substitute with.
6. Double-click the cell next to the checked Enabled cell, under Column.
7. Select ODS_Customer.ZIP from the list.
8. Enter the following under Expression: '' (two single quotes with no space between).
The two single quotes substitute <Blank> for the ZIP values that don't pass the 5-digit string rule.
9. Select the Validate Current icon in the upper toolbar.
10. Fix any validation errors, if necessary.
11. Select the Back arrow icon in the upper toolbar to close the Transform Editor.
12. Right-click JOB_CustGood in the Project Area and select Execute.
After a successful execution, view the data in the CUST_GOOD target table to verify that the rule worked as you
intended. The row with the CUST_ID value of ZZ01, that contained “ZZZZZ” for the ZIP column, now contains
<Blank> in the ZIP column.
Related Information
Auditing provides a way to ensure that a data flow loads correct data into intended targets.
Collect audit statistics on data that flows out of any object in SAP Data Services, such as a source, transform,
or target.
In the next exercise, we set up the validation job from the last exercise to output records to two target tables:
• Fail target table: Contains records that don't pass the validation rule.
• Pass target table: Contains records that pass the validation rule.
The following table describes the various audit objects involved in creating an audit data flow.
Setting Description
Audit function Collects statistics for the audit points. For this exercise, we set up a Count audit function on the source and pass target tables. The Count audit function collects two statistics: the good row count and the error row count for each audit point.
Audit label Unique name in the data flow that is generated for the audit
statistics for each defined audit function.
Setting Description
Audit rule Boolean expression that uses audit labels to verify the job.
Audit action on failure Action the job takes when there’s a failure.
For a complete list of audit objects and descriptions, see the Data Assessment section of the Designer Guide.
Related Information
To configure auditing in the DF_CustGood data flow, add a second target table to the Validation transform.
Send to Fail causes Data Services to send the rows that fail the 5_Digit_ZIP_Column_Rule to a fail
target table.
6. Select the Back arrow in the upper toolbar to close the Transform Editor.
7. Add a Template Table icon to the DF_CustGood data flow as a second target object.
8. Enter Cust_Bad_Format in Template name in the Create Template dialog box and select OK.
9. Draw a connection from the Validation transform to the Cust_Bad_Format target table and select the Fail
option from the popup menu.
The following screen capture shows an example of the finished data flow.
Related Information
Create an audit function in the DF_CustGood data flow to direct failed records to the applicable target table.
The following steps set up a rule expression for each of the target tables in the data flow.
1. Open the DF_CustGood data flow in the workspace and select the Audit icon ( ) in the upper toolbar.
4. Open the Rule tab.
5. Select Add in the upper right.
6. Select the following values from each of the three lists in the center of the Rule tab:
• $Count_ODS_CUSTOMER
• =
• $CountError_ODS_CUSTOMER
7. Select Add.
Data Services adds the first auditing rule and opens a new line to add a second auditing rule.
8. Select the following values from each of the three lists in the center of the Rule tab:
• $Count_CUST_GOOD
• =
• $CountError_CUST_GOOD
9. In the Action on failure group at the right of the pane, deselect Raise exception.
Deselecting Raise exception prevents the job from stopping when an exception occurs.
10. Select Close to close the Audit dialog box.
The following screen capture shows the completed data flow. Notice that Data Services indicates the audit
points with the Audit icon on the right side of the ODS_Customer source table and Cust_Good target table.
The audit points are where Data Services collects audit statistics.
11. Select the Validate All icon in the upper toolbar to verify that there are no errors.
12. Save your work.
13. Right-click the Job_CustGood job and select Execute.
Related Information
14.5 Viewing audit details in Operational Dashboard reports
View audit details, such as an audit rule summary and audit labels and values, in the SAP Data Services
Management Console.
The Management Console is browser-based. Therefore, Data Services opens your browser and presents
the Management Console login screen.
2. Log into the Management Console using your access credentials.
The Dashboard opens with statistics and data. For more information about the Dashboard, see the
Management Console Guide.
4. Select one of the JOB_CustGood jobs listed in the table at right.
There are two JOB_CustGood jobs because you executed it in this exercise and the last exercise.
Note
If the job doesn't appear in the table, adjust the Time Period dropdown list to a longer or shorter time
period as applicable.
The Job Execution Details pane opens showing the job execution history of the JOB_CustGood job.
5. Select the JOB_CustGood in the Job Execution History table.
The Job Details table opens. The Contains Audit Data column contains YES.
6. Select DF_CustGood in the Data Flow Name column.
Three graphs appear at right: Buffer Used, Row Processed, and CPU Used. Read about these graphs in the
Management Console Guide.
7. Select View Audit Data located just above the View Audit Data table.
The Audit Details dialog box opens. The following screen shot shows an example of the Audit Details dialog
box.
The following table explains the Audit Details pane.
Audit Rule Failed The violated audit rule from the job execution.
The Audit Details table lists the number counts for each Audit Label:
• $Count_ODS_CUSTOMER = 13
• $Count_CUST_GOOD = 12
The validation rule requires that all records comply with the 5_Digit_ZIP_Column_Rule. One record
failed the rule. That was the record that you manually added to the data table. It contained a ZIP value of
“ZZZZZ”. The audit rules that you created require that the row counts are equal. However, because one row
failed the validation rule, the counts are not equal.
Related Information
After the job executes, open the fail target table to view the failed record.
Open the data flow in the workspace and click the magnifying glass icon in the lower right corner of the
CUST_BAD_FORMAT target table. The CUST_BAD_FORMAT target table contains one record. In addition to the
fields selected for output, the software added and populated three additional fields for error information:
• DI_ERRORACTION = F
• DI_ERRORCOLUMNS = Validation failed rule(s): ZIP
• DI_ROWID = 1.000000
These are the rule violation output fields that are automatically included in the Validation transform. For
complete information about the Validation transform, see the Reference Guide.
Parent topic: Viewing audit details in Operational Dashboard reports [page 134]
The next segment shows you how to design jobs that are recoverable if the job malfunctions, crashes, or
doesn’t complete.
Related Information
15 Recovery mechanisms
Use SAP Data Services recovery mechanisms to set up automatic recovery or to recover jobs manually that
don’t complete successfully.
A recoverable work flow is one that can run repeatedly after failure without loading duplicate data. Examples of
failure include source or target server crashes or target database errors that cause a job or work flow to
terminate prematurely.
In this segment, learn about the job recovery mechanisms that you can use to recover jobs that fail after running only partially.
The goal
Create a recoverable job that loads the sales organization dimension table that you loaded in the exercise
Populate a table from a flat file [page 41]. Reuse the data flow DF_SalesOrg from that exercise to complete this
segment.
For more information about recovery methods, see the Designer Guide and the Reference Guide.
Executing the job [page 145]
Execute the job to see how the software functions with the recovery mechanism.
Create a job that contains three objects that are configured so that the job is recoverable.
The recoverable job that you create in this section contains the following objects:
• A script named GetWFStatus that determines whether the previous run completed successfully
• A conditional named recovery_needed that specifies which data flow to execute
• A script named UpdateWFStatus that updates the status table after a successful run
Related Information
Local variables contain information that you can use in a script to determine when a job must be recovered.
In previous exercises you defined global variables. Local variables differ from global variables. Use local
variables in a script or expression that is defined in the job or work flow that calls the script.
1. Open the Class_Exercises project in the Project Area and add a new job named JOB_Recovery.
A new variable appears named $NewVariableX where X indicates the new variable number.
4. Double-click $NewVariableX and enter $recovery_needed for Name.
5. Select int from the Data type dropdown list.
6. Follow the same steps to create another local variable.
7. Name the variable $end_time and select varchar(20) from the Data type dropdown list.
Related Information
15.3 Creating the script that determines the status
Create a script that checks the $end_time variable to determine if the job completed properly.
The script reads the ending time in the status_table table that corresponds to the most recent start time. If
there is no ending time for the most recent starting time, the software determines that the prior data flow must
not have completed properly.
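The exercise assumes that a table named status_table already exists in the target database with a start time and an end time for each run. The following is a minimal sketch of such a table; the column names are inferred from the script logic, so adapt the names and data types to your DBMS:
Sample Code
CREATE TABLE status_table (
    start_time datetime NOT NULL,  -- written when the work flow starts
    end_time   datetime NULL       -- written only after a successful run
);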
1. With JOB_Recovery opened in the workspace, add a script to the left side of the workspace and name it
GetWFStatus.
2. Open the script in the workspace and type the script directly into the Script Editor. Make sure that the
script complies with syntax rules for your DBMS.
For Microsoft SQL Server or SAP ASE, enter a script that reads the most recent end_time from the status_table and sets the $recovery_needed flag accordingly.
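The following is a minimal sketch of such a script, assuming the status_table described earlier and a datastore named Target_DS; adjust the embedded SQL and the date conversion to your DBMS:
Sample Code
# Read the end_time that belongs to the most recent start_time
$end_time = sql('Target_DS', 'SELECT CONVERT(VARCHAR(20), end_time, 120) FROM status_table WHERE start_time = (SELECT MAX(start_time) FROM status_table)');
# If the last run never wrote an end_time, the job must be recovered
$recovery_needed = ifthenelse(($end_time IS NULL) or ($end_time = ''), 1, 0);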
Related Information
15.4 Conditionals
Conditionals are single-use objects, which means that you can use them only in the job for which they were created.
Define a conditional for this exercise to specify a recoverable data flow. To define a conditional, you specify a
condition and two logical branches:
Then Work flow elements to execute when the “If” expression evaluates to TRUE.
Else (Optional) Work flow elements to execute when the “If” expression evaluates to FALSE.
Related Information
15.4.1 Adding the conditional
2. Click the conditional icon on the tool palette, then click in the workspace to the right of the GetWFStatus script.
3. Name the conditional recovery_needed.
4. Double-click the conditional in the workspace to open the Conditional Editor.
5. Enter the following expression in the If field:
($recovery_needed = 1)
Complete the conditional by specifying the work flows to execute for the If and Then conditions.
Related Information
Complete the conditional by specifying the data flows to use if the conditional equals true or false.
Follow these steps with the recovery_needed conditional open in the workspace:
1. Open the Data Flow tab in the Local Object Library and move DF_SalesOrg to the Else portion of the
Conditional Editor using drag and drop.
You use this data flow for the “false” branch of the conditional.
2. Right-click DF_SalesOrg in the Data Flow tab in the Local Object Library and select Replicate.
3. Name the replicated data flow ACDF_SalesOrg.
4. Move ACDF_SalesOrg to the Then area of the conditional using drag and drop.
6. Double-click the SALESORG_DIM target table to open it in the workspace.
7. Open the Options tab in the lower pane of the Target Table Editor.
8. Find the Update control category in the Advanced section and set Auto correct load to Yes.
Auto correct loading ensures that the same row is not duplicated in a target table by matching primary key
fields. See the Reference Guide for more information about how auto correct load works.
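Conceptually, auto correct load behaves like an upsert keyed on the primary key. The following SQL only illustrates that behavior and is not what Data Services literally generates; the table and column names for the sales organization dimension are assumed here:
Sample Code
MERGE INTO salesorg_dim AS tgt
USING incoming_rows AS src
    ON tgt.SalesOffice = src.SalesOffice          -- match on the primary key
WHEN MATCHED THEN
    UPDATE SET Region = src.Region,               -- overwrite non-key columns
               DateOpen = src.DateOpen
WHEN NOT MATCHED THEN
    INSERT (SalesOffice, Region, DateOpen)
    VALUES (src.SalesOffice, src.Region, src.DateOpen);
Because matching rows are updated rather than inserted again, rerunning the data flow does not create duplicate rows in the target table.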
Related Information
This script updates the status_table table with the current timestamp after the work flow in the conditional has
completed. The timestamp indicates a successful execution.
1. With JOB_Recovery opened in the workspace, add the script icon to the right of the recovery_needed
conditional.
2. Name the script UpdateWFStatus.
3. Double-click UpdateWFStatus to open the Script Editor in the workspace.
4. Enter text using the syntax for your RDBMS.
For Microsoft SQL Server and SAP ASE, enter text that sets the end_time for the most recent start_time in the status_table.
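The following is a minimal sketch of such a script, assuming the status_table described earlier and the Target_DS datastore; adjust the date function to your DBMS:
Sample Code
# Mark the most recent run as successfully completed
sql('Target_DS', 'UPDATE status_table SET end_time = GETDATE() WHERE start_time = (SELECT MAX(start_time) FROM status_table)');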
Connect the GetWFStatus script to the recovery_needed conditional, and then connect the recovery_needed conditional to the UpdateWFStatus script.
8. Save your work.
Related Information
Make sure that job configuration for JOB_Recovery is complete by verifying that the objects are ready.
Objects in JOB_Recovery
Object Purpose
GetWFStatus script Determines whether recovery is needed by reading the end_time for the most recent run in the status table.
recovery_needed Conditional Specifies the work flow to execute when the “If” statement is
true or false.
UpdateWFStatus script Updates the status table with the current timestamp after
the work flow in the conditional has completed. The time
stamp indicates a successful execution.
Object Purpose
DF_SalesOrg data flow The data flow to execute when the conditional equals false.
ACDF_SalesOrg data flow The data flow to execute when the conditional equals true.
Parent topic: Recovery mechanisms [page 137]
Related Information
Execute the job to see how the software functions with the recovery mechanism.
Edit the status table status_table in your DBMS and make sure that the end_time column is NULL or blank.
1. Execute JOB_Recovery.
2. View the Trace messages and the Monitor data to see that the conditional chose ACDF_SalesOrg to
process.
ACDF_SalesOrg is the data flow that runs when the condition is true. The condition is true because there
was no date in the end_time column in the status table. The software concludes that the previous job did
not complete and needs recovery.
3. Now execute the JOB_Recovery again.
4. View the Trace messages and the Monitor data to see that the conditional chose DF_SalesOrg to process.
DF_SalesOrg is the data flow that runs when the condition is false. The condition is false for this run
because the end_time column in the status_table contained the date and time of the last execution of the
job. The software concludes that the previous job completed successfully, and that it does not require
recovery.
Related Information
Conditionals [page 141]
Creating the script that updates the status [page 143]
Verify the job setup [page 144]
Data Services automated recovery properties [page 146]
What's next [page 147]
Data Services provides automated recovery methods to use as an alternative to the job setup for
JOB_Recovery.
With automatic recovery, Data Services records the result of each successfully completed step in a job. If a job
fails, you can choose to run the job again in recovery mode. During recovery mode, the software retrieves the
results for successfully completed steps and reruns incomplete or failed steps under the same conditions as
the original job.
Data Services has the following automatic recovery settings that you can use to recover jobs:
• Select Enable recovery and Recover from last failed execution in the job Execution Properties dialog.
• Select Recover as a unit in the work flow Properties dialog.
For more information about how to use the automated recovery properties in Data Services, see the Designer
Guide.
Related Information
15.9 What's next
The remaining segments in the tutorial provide information about some of the advanced features in SAP Data
Services.
The next three segments are optional. They contain exercises that help you learn about working in a multiuser
environment, working with SAP application data, and about running real-time jobs.
Related Information
16 Multiuser development
SAP Data Services enables teams of developers working on separate local repositories to store and share their
work in a central repository.
Each individual developer or team works on the application in their unique local repository. Each team uses a
central repository to store the master copy of its application. The central repository preserves all versions of all
objects in the application so you can revert to a previous version if necessary.
You can implement optional security features for central repositories. For more information about
implementing Central Repository security, see the Designer Guide.
In this segment, you'll learn how to perform the following multiuser development tasks:
The goal
We base the exercises for multiuser development on the following use case:
Example
Two developers use a Data Services job to collect data for the HR department. Each developer has their
own local repository and they share a central repository. Throughout the exercises, the developers modify
the objects in the job and use the central repository to store and manage the modified versions of the
objects.
Perform the exercises by acting as both developers, or work with another person with each of you assuming
one of the developer roles.
How multiuser development works [page 151]
Data Services uses a central repository as a storage location and a version control tool for all objects
uploaded from local repositories.
The central object library provides access to reusable objects in a central repository, which you use in a multiuser environment to check objects out to your local repository.
The central object library is a source control mechanism for objects in a central repository. It tracks the check-out and check-in status of all objects that multiple users access. The central object library is a dockable and movable pane just like the project area and local object library.
Through the central object library, authorized users access the central repository. The central repository
contains versions of objects saved by other users from their local repositories. The central object library
enables administrators to control who can add, view, and modify the objects stored in the central repository.
Example
Check out an object from the central repository to your local repository. Edit and save the object, then
check the object back into the central repository. Data Services adds the edited object to the central
repository as a new version, and also maintains the original version. When you check an object out of the
central repository, no other user can work on that object until you check the object back into the central
repository.
Users must belong to a user group that has permission to perform tasks in the central repository.
Administrators assign permissions to an entire group of users as well as assign various levels of permissions to
the users in a group.
Related Information
What's next [page 176]
Multi-user development
The central object library pane contains controls for working with objects in the central repository as well as
version information.
The top of the central object library pane displays a Group Permission box with the current user permissions,
and the name of the central repository. There are icons located at the top of the pane for performing the
following tasks:
The central object library contains the same tabs as the local object library for accessing the existing objects
from the central repository. For example, open the Datastores tab and the central object library lists all of the
datastores saved to the central repository.
The following table describes the content for the additional columns in the central object library pane.
Check out user The name of the user who currently has the object checked
out of the library. Blank when the object is not checked out.
Check out repository The name of the local repository that contains the checked-out object. Blank when the object is not checked out.
Permission The authorization type for the group that appears in the
Group Permission box at the top of the pane. When you add a
new object to the central object library, the current group
gets FULL permission to the object and all other groups get
READ permission.
Latest version A version number and a timestamp that indicate when the
software saved this version of the object.
Related Information
Data Services uses a central repository as a storage location and a version control tool for all objects uploaded
from local repositories.
The central repository retains a history for all objects stored there. Developers use their local repositories to
create, modify, or execute objects such as jobs.
• Get objects
• Add objects
• Check out objects
• Check in objects
Task Description
Get objects Copy objects from the central repository to your local repository. If the object already exists in your local repository, the file from the central repository overwrites the object in your local repository.
Check out objects The software locks the object when you check it out from the central repository. No one else can work on the object when you have it checked out. Other users can copy a locked object and put it into their local repository, but it is only a copy. Any changes that they make cannot be uploaded to the central repository.
Task Description
Check in objects When you check the object back into the central repository,
Data Services creates a new version of the object and saves
the previous version. Other users can check out the object
after you check it in. Other users can also view the object
history to view changes that you made to the object.
Add objects Add objects from your local repository to the central repository any time, as long as the object does not already exist in the central repository.
The central repository works like file collaboration and version control software. The central repository retains a
history for each object. The object history lists all versions of the object. Revert to a previous version of the
object if you want to undo your changes. Before you revert an object to a previous version, make sure that you
are not mistakenly undoing changes from other users.
Related Information
16.4 Preparation
Your system administrator sets up the multiuser environment to include two repositories and a central
repository.
Create three repositories using the user names and passwords listed in the following table.
User name Password
central central
user1 user1
user2 user2
Example
For example, with Oracle use the same database for the additional repositories. However, first add the users
listed in the table to the existing database. Make sure that you assign the appropriate access rights for each
user. When you create the additional repositories, Data Services qualifies the names of the repository
tables with these user names.
Example
For Microsoft SQL Server, create a new database for each of the repositories listed in the table. When you
create the user names and passwords, ensure that you specify appropriate server and database roles to
each database.
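A rough T-SQL sketch of that setup for one of the repositories follows; the names come from the table above, and the db_owner role assignment is only an example, so follow your own security policy when you assign server and database roles:
Sample Code
CREATE DATABASE user1;
GO
CREATE LOGIN user1 WITH PASSWORD = 'user1';
GO
USE user1;
CREATE USER user1 FOR LOGIN user1;       -- map the login to the repository database
ALTER ROLE db_owner ADD MEMBER user1;    -- example role; adjust to your policy
Repeat the same pattern for the central and user2 repositories.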
Consult the Designer Guide and the Management Console Guide for additional details about multiuser
environments.
Related Information
Follow these steps to configure a central repository. If you created a central repository during installation, use
that central repository for the exercises.
2. From your Windows Start menu, click Programs > SAP Data Services 4.2 > Data Services Repository Manager.
Data Services creates repository tables in the database that you identified.
8. Click Close.
Configure the two local repositories using the Data Services Repository Manager.
Repeat these steps to configure the user1 repository and the user2 repository.
2. From the Start menu, click Programs > SAP Data Services 4.2 > Data Services Repository Manager.
3. Enter the database connection information for the local repository.
4. Type the following user name and password based on which repository you are creating:
Repository User name Password
1 user1 user1
2 user2 user2
16.4.3 Associating repositories to your job server
You assign a Job Server to each repository to enable job execution in Data Services.
1. From the Start menu, click Programs > SAP Data Services 4.2 > Data Services Server Manager.
2. Click Configuration Editor in the Job Server tab.
The Job Server Properties dialog box opens. A list of current associated repositories appears in the
Associated Repositories list, if applicable.
4. Click Add under the Associated Repositories list.
The Repository Information options become active on the right side of the dialog box.
5. Select the appropriate database type for your local repository from the Database type dropdown list.
6. Complete the appropriate connection information for your database type as applicable.
7. Type user1 in both the User name and Password fields.
8. Click Apply.
The software resyncs the job server with the repositories that you just set up.
Assign the central repository named central to user1 and user2 repositories.
1. Start the Designer, enter your log in credentials, and click Log on.
2. Select the repository user1 and click OK.
3. Enter the password for user1.
4. Select Tools > Central Repositories.
If a prompt appears asking to overwrite the Job Server option parameters, select Yes.
10. Exit Designer.
11. Perform the same steps to connect user2 to the central repository.
Related Information
As you perform the tasks in this section, Data Services adds all objects to your local repositories.
Adding objects to the central repository [page 159]
After you import objects to the user1 local repository, add the objects to the central repository for
storage.
Related Information
16.5.1 Activating a connection to the central repository
Activate the central repository for the user1 and user2 local repositories so that the local repository has central
repository connection information.
The Central Repository Connections option is selected by default in the Designer list.
2. In the Central repository connections list, select Central and click Activate.
Data Services activates a link between the user1 repository and the central repository.
3. Select the option Activate automatically.
This option enables you to move back and forth between user1 and user2 local repositories without
reactivating the connection to the central repository each time.
4. Open the Central Object Library by clicking the Central Object Library icon on the Designer toolbar.
For the rest of the exercises in this section, we assume that you have the Central Object Library available in the
Designer.
Related Information
Before you can import objects into the local repository, complete the tasks in the section Preparation [page
152].
2. In the Local Object Library, right-click in a blank space and click Repository > Import From File.
3. Select multiusertutorial.atl located in <LINK_DIR>\Tutorial Files and click Open.
A prompt opens explaining that the chosen ATL file is from an earlier release of Data Services. The older ATL version does not affect the tutorial exercises. Therefore, click Yes.
Another prompt appears asking if you want to overwrite existing data. Click Yes.
The Import Plan window opens.
4. Click Import.
5. Enter dstutorial for the passphrase and click Import.
The multiusertutorial.atl file contains a batch job with previously created work flows and data flows.
6. Open the Project tab in the Local Object Library and double-click MU to open the project in the Project
Area.
The MU project contains the following objects:
• JOB_Employee
• WF_EmpPos
• DF_EmpDept
• DF_EmpLoc
• WF_PosHireDate
• DF_PosHireDate
After you import objects to the user1 local repository, add the objects to the central repository for storage.
When you add objects to the central repository, add a single object or the object and its dependents. All
projects and objects in the object library can be stored in a central repository.
After importing objects into the user1 local repository, you can add them to the central repository for storage.
Follow these steps to add a single object from the user1 repository to the central repository:
Note
Make sure that you verify that you are using the correct library by reading the header information.
4. Optional. Add any comments about the object.
5. Click Continue.
A status Options dialog box opens to indicate that Data Services added the object successfully.
Note
If the object already exists in the central repository, the Add to Central Repository option is not active.
6. Open the Central Object Library and open the Formats tab.
Expand Flat Files to see the NameDate_Format file is now in the central repository.
Related Information
Select to add an object and object dependents from the local repository to the central repository.
Log in to Data Services Designer, select the user1 repository, and enter user1 for the repository password.
• DF_EmpDept
• DF_EmpLoc
3. Right-click WF_EmpPos in the Local Object Library and select Add to Central Repository > Object and dependents.
Instead of choosing the right-click options, you can move objects from your local repository to the central repository using drag and drop. The Version Control Confirmation dialog box opens. Click Next and then click Next again so that all dependent objects are included in the addition.
The comment appears for the object and all dependents when you view the history in the central
repository.
6. Click Continue.
The Output dialog box displays with a message that states “Add object completed”. Close the dialog box.
7. Verify that the Central Object Library contains the WF_EmpPos, DF_EmpDept, and DF_EmpLoc objects in
their respective tabs.
When you include the dependents of the WF_EmpPos, you add other dependent objects, including dependents
of the two data flows DF_EmpDept and DF_EmpLoc.
• Open the Datastores tab in the Central Object Library to see the NAMEDEPT and POSLOC tables.
• Open the Formats tab in the Central Object Library to see the PosDept_Format, NamePos_Format, and NameLoc_Format flat file objects.
Related Information
Add an object and dependents that has dependent objects that were already added to the central repository
through a different object.
This topic continues from Adding an object and dependents to the central repository [page 160]. We assume
that you are still logged in to the user1 repository in Designer.
2. Right-click WF_PosHireDate and select Add to Central Repository > Objects and dependents.
The Add to Central Repository Alert dialog box appears listing the objects that already exist in the central
repository:
• DW_DS
• NameDate_Format
• NamePos_Format
• POSHDATE(DW_DS.USER1)
3. Click Yes to continue.
It is okay to continue with the process because you haven't changed the existing objects yet.
4. Enter a comment and select Apply comments to all objects.
5. Click Continue.
6. Close the Output dialog box.
The central repository now contains all objects in the user1 local repository. Developers who have access to the
central repository can check out, check in, label, and get those objects.
Related Information
When you check out an object from the central repository, it becomes unavailable for other users to change.
You can check out a single object or check out an object with dependents.
• If you check out a single object such as WF_EmpPos, no other user can change it. However, the dependent object DF_EmpDept remains in the central repository, and other users can check it out.
• If you check out WF_EmpPos and the dependent DF_EmpDept, no one else can check out those objects.
Change the objects and save your changes locally, and then check the objects with your changes back into
the central repository. The repository creates a new version of the objects that include your changes.
After you make your changes and check the changed objects back into the central repository, other users can
view your changes, and check out the objects to make additional changes.
Check out an object and dependent objects from the central repository using menu options or icon tools.
Perform the following steps while you are logged in to the user1 repository.
1. Open the Central Object Library and open the Work Flows tab.
A warning appears telling you that checking out WF_EmpPos does not include the datastores. To include the
datastores in the checkout, use the Check Out with Filtering check out option.
Note
The software does not include the datastore DW_DS in the checkout as the message states. However,
the tables NAMEDEPT and POSLOC, which are listed under the Tables node of DW_DS, are included in the
dependent objects that are checked out.
Alternatively, you can select the object in the Central Object Library and then click the Check out object and dependents icon on the Central Object Library toolbar.
Data Services copies the most recent version of WF_EmpPos and its dependent objects from the central repository into the user1 local repository. A red check mark appears on the icon for objects that are checked out in both the local and central repositories.
User1 can modify the WF_EmpPos work flow and the checked out dependents in the local repository while it is
checked out of the central repository.
Task overview: Check out objects from the central repository [page 162]
Related Information
Task overview: Check out objects from the central repository [page 162]
Related Information
You can check in an object by itself or check it in along with all associated dependent objects. When an object
and its dependents are checked out and you check in the single object without its dependents, the dependent
objects remain checked out.
After you change an existing object, check it into the central repository so that other users can access it.
Data Services copies the object from the user1 local repository to the central repository and removes the
check-out marks.
6. In the Central Object Library window, right-click DF_EmpLoc and click Show History.
The History dialog box contains the user name, date, action, and version number for each time the file was
checked out and checked back in. The dialog box also lists the comments that the user included when they
checked the object into the central repository. This information is helpful for many reasons, including:
• Providing information to the next developer who checks out the object.
• Helping you decide what version to choose when you want to roll back to an older version.
• Viewing the difference between versions.
For more information about viewing history, see the Designer Guide.
7. After you have reviewed the history, click Close.
Task overview: Checking in objects to the central repository [page 164]
Related Information
Set up the environment for user2 so that you can perform the remaining tasks in Multiuser development.
Log into SAP Data Services Designer and choose the user2 repository. Enter user2 for the password.
Set up the user2 developer environment in the same way that you set up the environment for user1. The
following is a summary of the steps:
Related Information
Undo a checkout to restore the object in the central repository to the condition in which it was when you
checked it out.
In this exercise, you check out DF_PosHireDate from the central repository, modify it, and save your changes
to your local repository. Then you undo the checkout of DF_PosHireDate from the central repository.
When you undo a checkout, you restore the object in the central repository to the way it was when you checked
it out. SAP Data Services does not save changes or create a new version in the central repository. Your local
repository, however, retains the changes that you made. To undo changes in your local repository, “get” the
object from the central repository after you undo the checkout. The software overwrites your local copy and
replaces it with the restored copy of the object in the central repository.
Undo checkout works for both a single object as well as objects with dependents.
Undo an object checkout when you do not want to save your changes, and you want to revert the object
back to the original content when you checked it out.
Related Information
Check out the DF_PosHireDate and modify the output mapping in the query.
The DF_PosHireDate object appears with a red checkmark in both the Local Object Library and the
Central Object Library indicating that it is checked out.
4. In the local object library, double-click DF_PosHireDate to open it in the workspace.
5. Double-click the query in the data flow to open the Query Editor.
6. In the Schema Out pane, right-click LName and click Cut.
Related Information
Undo an object checkout when you do not want to save your changes, and you want to revert the object back to
the original content when you checked it out.
1. Open the Data Flow tab in the Central Object Library and expand Data Flows.
2. Right-click DF_PosHireDate and click Undo Check Out > Object.
Data Services removes the check-out symbol from DF_PosHireDate in the Local and Central Object Library,
without saving your changes in the central repository. The object in your local repository still has the output
mapping change.
Related Information
Compare two objects, one from the local repository, and the same object from the central repository to view
the differences between the objects.
Make sure that you have followed all of the steps in the Undo checkout section.
1. Expand the Data Flow tab in the Local Object Library and expand Data Flows.
The Difference Viewer opens in the workspace. It shows the local repository contents for DF_PosHireDate
on the left and the central repository contents for DF_PosHireDate on the right.
3. Examine the data in the Difference Viewer.
The Difference Viewer helps you find the differences between the local object and the object in the central
repository.
Expand the Query node and then expand the Query table icon. The Difference Viewer indicates that the
LName column was removed in the local repository on the left, but it was added back in the central
repository. The text is in green, and the green icon appears signifying that there was an insertion.
The Difference Viewer shows the difference between an object in the local repository and the central repository.
In the following screen capture, the Difference Viewer shows the differences between the DF_PosHireDate
objects in the left and right panes. Notice the following areas of the dialog box:
• Each line represents an object or item in the object.
• The red bars on the right indicate where data is different. Click a red bar on the right and the viewer
highlights the line that contains the difference.
• The changed lines contain a colored status icon on the object icon that shows the status: Deleted,
changed, inserted, or consolidated. There is a key at the bottom of the Difference Viewer that lists the
status that corresponds to each colored status icon.
The Difference Viewer contains a status line at the bottom of the dialog box as shown in the image below. The
status line indicates the number of differences. If there are no differences, the status line indicates Difference [ ]
of 0. To the left of the status line is a key to the colored status icons.
Deleted The item does not appear in the object in the right pane.
Changed The differences between the items are highlighted in blue (the default) text.
Inserted The item has been added to the object in the right pane.
Consolidated The items within the line have differences. Expand the item by clicking its plus sign to view the differences.
16.5.9 Check out object without replacement
Check out an object from the central repository so that SAP Data Services does not overwrite your local copy.
Example
For example, you may need to use the checkout without replacement option when you change an object in
your local repository before you check it out from the central repository.
The option prevents Data Services from overwriting the changes that you made in your local copy.
After you have checked out the object from the central repository, the object in both the central and local repositories has a red check-out icon. But the local copy is not replaced with the version in the central repository.
You can then check your local version into the central repository so that it is updated with your changes.
Do not use the check out without replacement option if another user checked out the file from the central repository, made changes, and then checked in the changes.
Example
For example, you make changes to your local copy of Object-A without realizing you are working in your
local copy.
Meanwhile, another developer checks out Object-A from the central repository, makes extensive changes
and checks it back in to the central repository.
You finally remember to check out Object-A from the central repository. Instead of checking the object
history, you assume that you were the last developer to work in the master of Object-A, so you check
Object-A out of the central repository using the without replacement option. When you check your local
version of Object-A into the central repository, all changes that the other developer made are overwritten.
Caution
Before you use the Object without replacement option in a multiuser environment, check the history of the
object in the central repository. Make sure that you are the last person who worked on the object.
In the next exercise, user2 uses the check out option Object without replacement to be able to update the
master version in the central repository with changes from the version in the local repository.
16.5.9.1 Checking out an object without replacement
Use the checkout option without replacement to check out an object from the central repository without
overwriting the local copy that has changed.
1. Open the Data Flow tab in the Local Object Library and expand Data Flows.
2. Double-click DF_EmpLoc to open it in the workspace.
3. Double-click the query in the workspace to open the Query Editor.
4. Right-click FName in the Schema Out pane and click Cut.
5. Save your work.
6. Open the Data Flow tab of the Central Object Library and expand Data Flows.
The software marks the DF_EmpLoc object in the Central Object Library and the Local Object Library as
checked out. The software does not overwrite the object in the Local Object Library, but preserves the
object as is.
Related Information
Check in the local version of DF_EmpLoc to update the central repository version to include your changes.
These steps continue from the topic Checking out an object without replacement [page 170].
1. In the Central Object Library, right-click DF_EmpLoc and select Check in > Object.
2. Type a comment in the Comment dialog box and click Continue.
Now the central repository contains a third version of DF_EmpLoc. This version is the same as the copy of
DF_EmpLoc in the user2 local object library.
3. Right-click DF_EmpLoc in your Local Object Library and select Compare > Object to central.
The Difference Viewer should show the two objects as the same.
Task overview: Check out object without replacement [page 169]
Related Information
Related Information
16.5.10 Get objects
When you get an object from the central repository, you are making a copy of a specific version for your local
repository.
You might want to copy a specific version of an object from the central repository into your local repository.
Getting objects allows you to select a version other than the most recent version to copy. When you get an
object, you replace the version in your local repository with the version that you copied from the central
repository. The object is not checked out of the central repository, and it is still available for others to lock and
check out.
Perform the following steps in Designer. You can use either the user1 or user2 repository.
7. Right-click DF_EmpLoc and select Get Latest Version > Object from the dropdown menu.
Data Services copies the most recent version of the data flow from the central repository to the local
repository.
8. Open the DF_EmpLoc data flow in the Local Object Library.
9. Open the query to open the Query Editor.
10. Notice that there are now three columns in the Schema Out pane: LName, Pos, and Loc.
The latest version of DF_EmpLoc from the central repository overwrites the previous copy in the local
repository.
11. Click the Back arrow in the icon menu bar to return to the data flow.
Related Information
16.5.10.2 Getting a previous version of an object
Obtain a copy of a select previous version of an object from the central repository.
Perform the following steps in Designer. You can use either the user1 or user2 repository.
When you get a previous version of an object, you get the object but not its dependents.
Version 1 of DF_EmpLoc is the version that you first added to the central repository at the beginning of this
section. The software overwrote the altered version in your local repository with Version 1 from the central
repository.
Related Information
Use filtering to select the dependent objects to include, exclude, or replace when you add, check out, or check
in objects in a central repository.
When multiple users work on an application, some objects can contain repository-specific information. For
example, datastores and database tables might refer to a particular database connection unique to a user or a
phase of development. After you check out an object with filtering, you can change or replace the following
configurations:
Related Information
The Version Control Confirmation dialog box opens with a list of dependent object types. Expand each node
to see a list of dependent objects of that object type.
4. Select NamePos_Format under Flat Files.
5. Select Exclude from the Target status dropdown list.
The word “excluded” appears next to NamePos_Format in the Action column. Data Services excludes the
flat file NamePos_Format from the dependent objects to be checked out.
6. Click Next.
The Datastore Options dialog box opens listing the datastores that are used by NamePos_Format.
7. Click Finish.
You may see a Check Out Alert dialog box stating that there are some dependent objects checked out by other
users. For example, if user1 checked in the WF_EmpPos work flow to the central repository without selecting to
include the dependent objects, the dependent objects could still be checked out. The Check Out Alert lists the
reasons why each listed object cannot be checked out. For example, “The object is checked out by the
repository: user1”. This reason provides you with the information to decide what to do next:
• Select Yes to get copies of the latest versions of the selected objects into your repository.
• Select No to check out the objects that are not already checked out by another user.
• Select Cancel to cancel the checkout.
You can delete objects from the local or the central repository.
16.5.12.1 Deleting an object from the central repository
When you delete objects from the central repository, dependent objects and objects in your local repositories are not always deleted.
When you delete objects from the central repository, you delete only the selected object and all versions of
it; you do not delete any dependent objects.
7. Open the Work Flows tab in the local object library to verify that WF_PosHireDate was not deleted from the
user2 local object library.
When you delete an object from a central repository, it is not automatically deleted from the connected
local repositories.
Related Information
When you delete an object from a local repository, it is not deleted from the central repository.
When you delete an object from a local repository, the software does not delete it from the central
repository. If you delete an object from your local repository by accident, recover the object by selecting to
“Get” the object from the central repository, if it exists in your central repository.
4. Open the Central Object Library.
5. Click the Refresh icon on the object library toolbar.
6. Open the Data Flows tab in the Central Object Library and verify that DF_EmpDept was not deleted from
the central repository.
7. Exit Data Services.
Task overview: Deleting objects [page 174]
Related Information
In the next segment, learn how to extract SAP application data using SAP Data Services.
Related Information
17 Extracting SAP application data
To work with data from SAP applications, use specific tools and objects in SAP Data Services.
In this segment, use the following advanced features to obtain and process data from SAP applications:
• ABAP code and ABAP data flow: Define the data to extract from SAP applications.
• Data transport object: Carries data from the SAP application into Data Services.
• Lookup function and additional lookup values: Obtains data from a source that isn't included in a job.
For more information about using SAP application data in Data Services, see the Supplement for SAP.
The goal
In this section, we work with the data sources that are circled in the star schema:
Note
To perform the exercises in this section, your implementation of Data Services must be able to connect to
an SAP remote server. Ask your administrator for details.
Note
The structure of standard SAP tables varies between versions. Therefore, the sample tables for these
exercises may not work with all versions of SAP applications. If the exercises in this section aren’t working
as documented, it can be because of the versions of your SAP applications.
SAP applications are the main building blocks of the SAP solution portfolios for industries.
SAP applications provide the software foundation with which organizations address their business issues. SAP
delivers the following types of applications:
Ask your system administrator about the types of SAP applications that your organization uses.
Related Information
Use the SAP application datastore to connect Data Services to the SAP application server.
Log on to Designer and to the tutorial repository. Do not use the user1, user2, or central repositories that you
created for the multiuser exercises.
For details about completing other datastore options, see the Datastores for SAP applications section of
the Supplement for SAP.
7. Click OK.
The new datastore appears in the Datastore tab of the Local Object Library.
Related Information
17.3 Importing metadata
Import SAP application tables into the new datastore SAP_DS for the exercises in this section.
Create and configure the SAP application datastore named SAP_DS before you import the metadata.
MAKT
MARA
VBAK
VBUP
The software adds the tables to the Datastores tab of the Local Object Library under Tables.
Related Information
Repopulate the customer dimension table by configuring a data flow that outputs SAP application data to a
datastore table.
Configure a Data Services job that includes a work flow and an ABAP data flow. The ABAP data flow extracts
SAP data and loads it into the customer dimension table.
To configure the Data Services job so that it communicates with the SAP application, configure an ABAP data
flow. The ABAP data flow contains Data Services supplied commands so you do not need to know ABAP.
For more information about configuring an ABAP data flow, see the Supplement for SAP.
1. Adding the SAP_CustDim job, work flow, and data flow [page 182]
The job for repopulating the customer dimension table includes a work flow and a data flow.
2. Adding ABAP data flow to Customer Dimension job [page 183]
Add the ABAP data flow to JOB_SAP_CustDim and set options in the ABAP data flow.
3. Defining the DF_SAP_CustDim ABAP data flow [page 184]
Define the ABAP data flow so that it communicates the job tasks to the SAP application.
4. Executing the JOB_SAP_CustDim job [page 187]
Validate and then execute the JOB_SAP_CustDim job.
5. ABAP job execution errors [page 188]
There are some common ABAP job execution errors that have solutions.
Related Information
Defining an SAP application datastore [page 179]
Importing metadata [page 180]
Repopulating the material dimension table [page 188]
Repopulating the Sales Fact table [page 195]
What's next [page 204]
17.4.1 Adding the SAP_CustDim job, work flow, and data flow
The job for repopulating the customer dimension table includes a work flow and a data flow.
Next task: Adding ABAP data flow to Customer Dimension job [page 183]
The SAP_CustDim data flow needs an ABAP data flow to extract SAP application data.
The ABAP data flow interacts directly with the SAP application database layer. Because the database layer is
complex, Data Services accesses it using ABAP code.
Data Services executes the SAP_CustDim batch job in the following way:
17.4.2 Adding ABAP data flow to Customer Dimension job
Add the ABAP data flow to JOB_SAP_CustDim and set options in the ABAP data flow.
2. Click the ABAP data flow icon from the tool palette and click in the workspace to add it to the data flow.
Option Action
Generated ABAP file name Specify a file name for the generated ABAP code. The
software stores the file in the ABAP directory that you
specified in the SAP_DS datastore.
ABAP program name Specify the name for the ABAP program that the Data
Services job uploads to the SAP application. Adhere to the
following name requirements:
• Begins with the letter Y or Z
• Cannot exceed 8 characters
Job name Type SAP_CustDim. The name is for the job that runs in
the SAP application.
4. Open the General tab and name the data flow DF_SAP_CustDim.
5. Click OK.
6. Open the Datastores tab in the Local Object Library and expand Target_DS Tables .
7. Move the CUST_DIM table onto the workspace using drag and drop.
Previous task: Adding the SAP_CustDim job, work flow, and data flow [page 182]
Next task: Defining the DF_SAP_CustDim ABAP data flow [page 184]
17.4.3 Defining the DF_SAP_CustDim ABAP data flow
Define the ABAP data flow so that it communicates the job tasks to the SAP application.
Perform the following group of tasks to define the ABAP data flow:
Previous task: Adding ABAP data flow to Customer Dimension job [page 183]
2. Open the Datastores tab in the Local Object Library and expand SAP_DS Tables .
3. Move the KNA1 table to the left side of the workspace using drag and drop.
4. Select Make Source.
5. Add a query from the tool palette to the right of the KNA1 table in the workspace.
6. Add a data transport from the tool palette to the right of the query in the workspace.
7. Connect the icons in the data flow to indicate the flow of data as shown.
17.4.3.2 Defining the query
Complete the output schema in the query to define the data to extract from the SAP application.
1. Open the query in the workspace to open the Query Editor dialog box.
2. Expand the KNA1 table in the Schema In pane to see the columns.
3. Click the column head (above the table name) to sort the list in alphabetical order.
4. Map the following seven source columns to the target schema. Use Ctrl + Click to select multiple
columns and drag them to the output schema.
KUKLA
KUNNR
NAME1
ORT01
PSTLZ
REGIO
STRAS
The icon next to the source column changes to an arrow to indicate that the column has been mapped. The
Mapping tab in the lower pane of the Query Editor shows the mapping relationships.
5. Rename the target columns and verify or change the data types and descriptions using the information in
the following table. To change these settings, right-click the column name and select Properties from the
dropdown list.
Note
Microsoft SQL Server and Sybase ASE DBMSs require that you specify the columns in the order shown
in the following table and not alphabetically.
6. Click the Back arrow icon in the icon toolbar to return to the data flow and to close the Query Editor.
7. Save your work.
A data transport defines a staging file for the data that is extracted from the SAP application.
17.4.3.4 Setting the execution order
Set the order of execution by joining the objects in the data flow.
The data flow contains the ABAP data flow and the target table named Cust_Dim.
2. Connect the ABAP data flow to the target table.
3. Save your work.
Related Information
1. With the job selected in the Project Area, click the Validate All icon on the icon toolbar.
If your design contains errors, a message appears describing the error. The software requires that you
resolve the error before you can proceed.
If the job has warning messages, you can continue. Warnings do not prohibit job execution.
If your design does not have errors, the following message appears:
2. Right-click the job name in the project area and click Execute.
If you have not saved your work, a save dialog box appears. Save your work and continue. The Execution
Properties dialog box opens.
3. Leave the default selections and click OK.
After the job completes, check the Output window for any error or warning messages.
4. Use a query tool to check the contents of the cust_dim table in your DBMS.
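A minimal way to perform this check, assuming a standard SQL query tool connected to the target database and the cust_dim table created in the earlier exercises (adjust the table qualification and letter case for your DBMS):
SELECT COUNT(*) FROM cust_dim;
SELECT * FROM cust_dim;
The first statement confirms that rows were loaded from KNA1; the second lets you inspect the renamed columns and their values.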
Previous task: Defining the DF_SAP_CustDim ABAP data flow [page 184]
17.4.5 ABAP job execution errors
Some common ABAP job execution errors have known solutions.
The following list describes a few common ABAP job execution errors, their probable causes, and how to fix them.
Error: Cannot open ABAP output file
Probable cause: Lack of permissions for the Job Server service account.
Solution:
1. Open the Services Control Panel.
2. Double-click the Data Services service and select a user account that has permissions to the working folder on the SAP server.
Error: Cannot create ABAP output file
Probable cause: The working directory on the SAP server is specified incorrectly.
Solution: Open the Datastores tab in the Local Object Library and correct the working directory specified for the SAP datastore.
If you have other ABAP errors, read about debugging and testing ABAP jobs in the Supplement for SAP.
For this exercise, you create a data flow that is similar to the data flow that you created to repopulate the
customer dimension table. However, in this process, the data for the material dimension table is the result of a
join between two SAP application tables.
1. Adding the Material Dimension job, work flow, and data flow [page 189]
Create the Material Dimension job and add a work flow and a data flow.
2. Adding ABAP data flow to Material Dimension job [page 189]
Add the ABAP data flow to JOB_SAP_MtrlDim and set options in the ABAP data flow.
3. Defining the DF_SAP_MtrlDim ABAP data flow [page 191]
Define the ABAP data flow so that it communicates the job tasks to the SAP application.
4. Executing the JOB_SAP_MtrlDim job [page 194]
Validate and then execute the JOB_SAP_MtrlDim job.
Related Information
17.5.1 Adding the Material Dimension job, work flow, and data
flow
Create the Material Dimension job and add a work flow and a data flow.
Log into SAP Data Services Designer and open the Class_Exercises project in the Project Area.
Next task: Adding ABAP data flow to Material Dimension job [page 189]
Add the ABAP data flow to JOB_SAP_MtrlDim and set options in the ABAP data flow.
2. Click the ABAP data flow icon from the tool palette and click in the workspace to add it to the data flow.
The Properties window of the ABAP data flow opens.
3. Complete the fields in the Options tab as described in the following table:
Option Action
Generated ABAP file name Specify a file name for the generated ABAP code. The
software stores the file in the ABAP directory that you
specified in the SAP_DS datastore.
ABAP program name Specify the name for the ABAP program that the Data
Services job uploads to the SAP application. Adhere to the
following name requirements:
• Begins with the letter Y or Z
• Cannot exceed 8 characters
Job name Type SAP_MtrlDim. The name is for the job that runs in
the SAP application.
4. Open the General tab and name the data flow DF_SAP_MtrlDim.
5. Click OK.
6. Open the Datastores tab in the Local Object Library and expand Target_DS Tables .
7. Move the MTRL_DIM table to the workspace using drag and drop.
Previous task: Adding the Material Dimension job, work flow, and data flow [page 189]
Next task: Defining the DF_SAP_MtrlDim ABAP data flow [page 191]
Related Information
17.5.3 Defining the DF_SAP_MtrlDim ABAP data flow
Define the ABAP data flow so that it communicates the job tasks to the SAP application.
Perform the following group of tasks to define the ABAP data flow:
Defining the query with a join between source tables [page 192]
Set up a join between the two source tables and complete the output schema to define the data to
extract from the SAP application
Previous task: Adding ABAP data flow to Material Dimension job [page 189]
Related Information
Add the necessary objects to complete the DF_SAP_MtrlDim ABAP data flow.
2. Open the Datastores tab in the Local Object Library and expand SAP_DS Tables .
3. Move the MARA table to the left side of the workspace using drag and drop.
4. Select Make Source.
5. Move the MAKT table to the workspace using drag and drop. Position it under the MARA table.
6. Select Make Source.
7. Add a query from the tool palette to the right of the tables in the workspace.
8. Add a data transport from the tool palette to the right of the query in the workspace.
9. Connect the icons in the data flow to indicate the flow of data as shown.
Related Information
Set up a join between the two source tables and complete the output schema to define the data to extract from
the SAP application
1. Double-click the query in the workspace to open the Query Editor dialog box.
2. Open the FROM tab in the lower pane.
3. In the Join pairs group, select MARA from the Left dropdown list.
4. Select MAKT from the Right dropdown list.
The source rows must meet the requirements of the condition to be passed to the target, including the join
relationship between sources. The MARA and MAKT tables are related by a common column named
MATNR. The MATNR column contains the material number and is the primary key between the two tables.
(MARA.MATNR = MAKT.MATNR)
6. Type the language filter condition in the Smart Editor. Use all uppercase.
This condition filters the material descriptions by language. Only the records with material descriptions
in English are output to the target. (A hedged sketch of such a condition appears at the end of this
procedure.)
7. Click OK to close the Smart Editor.
8. In the Schema In and Schema Out panes, map the following source columns to output columns using drag
and drop.
Table Column
MARA MATNR
MTART
MBRSH
MATKL
MAKT MAKTX
9. Rename the target columns, verify data types, and add descriptions based on the information in the
following table.
10. Click the Back arrow in the icon toolbar to return to the data flow.
11. Save your work.
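For reference, the join relationship and the language filter described in this procedure can be written as a single condition. The following is only a sketch: the join part (MARA.MATNR = MAKT.MATNR) is stated in this procedure, while the SPRAS column (the SAP language key in MAKT) and the value 'E' for English are assumptions based on standard SAP tables rather than text confirmed by this tutorial.
MARA.MATNR = MAKT.MATNR AND SPRAS = 'E'
If the condition in your environment differs, keep the intent the same: restrict MAKT rows to English-language material descriptions.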
Related Information
A data transport defines a staging file for the data that is extracted from the SAP application.
This file stores the data set produced by the ABAP data flow. The full path name for this file is the path of
the SAP Data Services shared directory concatenated with the file name that you just entered.
4. Select Replace File.
Replace File truncates this file each time the data flow is executed.
5. Click the Back icon in the icon toolbar to return to the data flow.
6. Save your work.
Set the order of execution by joining the objects in the data flow.
The data flow contains the ABAP data flow and the target table named Mtrl_Dim.
2. Connect the ABAP data flow to the target table.
3. Save your work.
Related Information
1. With JOB_SAP_MtrlDim selected in the Project Area, click the Validate All icon on the icon toolbar.
If your design contains errors, a message appears describing the error, which requires solving before you
can proceed.
If your design contains warnings, a warning message appears. Warnings do not prohibit job execution.
If your design does not have errors, a message appears confirming that there are no errors.
2. Right-click the job name in the Project Area and click the Execute icon in the toolbar.
If you have not saved your work, a save dialog box appears. Save your work and continue. The Execution
Properties dialog box opens.
3. Leave the default selections and click OK.
After the job completes, check the Output window for any error or warning messages.
4. Use a query tool to check the contents of the Mtrl_Dim table in your DBMS.
Task overview: Repopulating the material dimension table [page 188]
Previous task: Defining the DF_SAP_MtrlDim ABAP data flow [page 191]
Related Information
Repopulate the Sales Fact table from two SAP application sources.
This task extracts data from two source tables, and it extracts a single column from a third table using a lookup
function.
1. Adding the Sales Fact job, work flow, and data flow [page 196]
Create the Sales Fact job and add a work flow and a data flow.
2. Adding ABAP data flow to Sales Fact job [page 196]
Add the ABAP data flow to JOB_SAP_SalesFact and set options in the ABAP data flow.
3. Defining the DF_ABAP_SalesFact ABAP data flow [page 197]
Define the ABAP data flow so that it communicates the job tasks to the SAP application.
4. Executing the JOB_SAP_SalesFact job [page 203]
Validate and then execute the JOB_SAP_SalesFact job.
Related Information
17.6.1 Adding the Sales Fact job, work flow, and data flow
Create the Sales Fact job and add a work flow and a data flow.
Log into SAP Data Services Designer and open the Class_Exercises project in the Project Area.
Next task: Adding ABAP data flow to Sales Fact job [page 196]
Add the ABAP data flow to JOB_SAP_SalesFact and set options in the ABAP data flow.
2. Click the ABAP data flow icon from the tool palette and click in the workspace to add it to the data flow.
Option Action
Generated ABAP file name Specify a file name for the generated ABAP code. The
software stores the file in the ABAP directory that you
specified in the SAP_DS datastore.
ABAP program name Specify a name for the ABAP program that the Data
Services job uploads to the SAP application. Adhere to the
following naming requirements:
• Begins with the letter Y or Z
• Cannot exceed 8 characters
Job name Type SAP_SalesFact. The name is for the job that runs
in the SAP application.
4. Open the General tab and name the ABAP data flow DF_ABAP_SalesFact.
5. Click OK.
6. Open the Datastores tab in the Local Object Library and expand Target_DS Tables .
7. Move the SALES_FACT table to the workspace using drag and drop.
Previous task: Adding the Sales Fact job, work flow, and data flow [page 196]
Next task: Defining the DF_ABAP_SalesFact ABAP data flow [page 197]
Define the ABAP data flow so that it communicates the job tasks to the SAP application.
Perform the following group of tasks to define the ABAP data flow:
Defining the query with a join between source tables [page 199]
Set up a join between the two source tables and complete the output schema to define the data to
extract from the SAP application
Defining the lookup function to add output column with a value from another table [page 200]
Use a lookup function to extract data from a table that is not defined in the job.
Task overview: Repopulating the Sales Fact table [page 195]
Previous task: Adding ABAP data flow to Sales Fact job [page 196]
Related Information
Add the necessary objects to complete the DF_ABAP_SalesFact ABAP data flow.
2. Open the Datastores tab in the Local Object Library and expand SAP_DS Tables .
3. Move the VBAP table to the left side of the workspace using drag and drop.
4. Select Make Source.
5. Move the VBAK table to the workspace using drag and drop. Place it under the VBAP table.
6. Select Make Source.
7. Add a query from the tool palette to the right of the tables in the workspace.
8. Add a data transport from the tool palette to the right of the query in the workspace.
9. Connect the icons in the data flow to indicate the flow of data as shown.
10. Save your work.
Related Information
Defining the query with a join between source tables [page 199]
Set up a join between the two source tables and complete the output schema to define the data to extract from
the SAP application
VBAP.VBELN = VBAK.VBELN
This condition joins the two tables on the sales document number. The complete condition also filters the
sales orders by date so that only the sales orders from one year are brought into the target table. (A hedged
sketch of such a combined condition appears at the end of this procedure.)
7. Click OK.
8. In the Schema In and Schema Out panes, map the following source columns to output columns using drag
and drop:
Table Column
VBAP VBELN
POSNR
MATNR
NETWR
VBAK KVGR1
AUDAT
9. Rename the target columns, verify data types, and add descriptions as shown in the following table:
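For reference, the combined join and date condition described earlier in this procedure could look like the following sketch. The join on VBELN is stated in this procedure; the AUDAT column (the order date in VBAK) is a standard SAP field, and the date boundaries are placeholders that depend on your sample data:
VBAP.VBELN = VBAK.VBELN AND VBAK.AUDAT >= '<start_date>' AND VBAK.AUDAT <= '<end_date>'
Replace <start_date> and <end_date> with the one-year range that matches the sales orders in your SAP system.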
Use a lookup function to extract data from a table that is not defined in the job.
Option Value
Name ord_status
Length 1
4. Click OK.
Restriction
The LOOKUP function is case sensitive. Enter the values using the case as listed in the following table.
Type the entries in the text boxes instead of using the dropdown arrow or the Browse button.
Option Value Description
Result column GBSTA The column from the VBUP table that contains the value for the target column ord_status.
Default value 'none' The value used if the lookup isn't successful. Use single quotes as shown.
Cache spec 'NO_CACHE' Specifies whether to cache the table. Use single quotes as shown.
Note
The value for the ord_status column comes from the GBSTA column in the VBUP table. The value in
the GBSTA column indicates the status of a specific item in the sales document. The software needs
both an order number and an item number to determine the correct value to extract from the table.
The function editor provides fields for only one dependency, which you defined using the values from
the table.
The lookup function can process any number of comparison value pairs. To include the dependency on the
item number in the lookup expression, add the item number column from the translation table and the
item number column from the input (source) schema as follows:
POSNR, VBAP.POSNR
13. Click the Back arrow in the icon toolbar to close the Query Editor.
14. Save your work.
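Putting the lookup settings together, the mapping for the ord_status output column is a single lookup expression. The following sketch assumes the standard Data Services syntax lookup(translate_table, result_column, default_value, cache_spec, compare_column, expression, ...) and the SAP_DS datastore used in this exercise; the exact table qualification in your repository may differ:
lookup(SAP_DS.VBUP, GBSTA, 'none', 'NO_CACHE', VBELN, VBAP.VBELN, POSNR, VBAP.POSNR)
The first comparison pair matches the sales document number and the second, added in the step above, matches the item number, so the status is retrieved for the correct line item.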
Related Information
17.6.3.4 Defining the details of the data transport
A data transport defines a staging file for the data that is extracted from the SAP application.
Related Information
Set the order of execution by joining the objects in the data flow.
Related Information
1. With JOB_SAP_SalesFact selected in the Project Area, click the Validate All icon in the toolbar.
If your design contains errors, a message appears describing the error, which requires solving before you
can proceed.
If your design contains warnings, a warning message appears. Warnings do not prohibit job execution.
If your design does not have errors, a message appears confirming that there are no errors.
2. Right-click JOB_SAP_SalesFact in the Project Area and click the Execute icon in the toolbar.
If you have not saved your work, a save dialog box appears. Save your work and continue. The Execution
Properties dialog box opens.
3. Leave the default selections and click OK.
After the job completes, check the Output window for any error or warning messages.
4. Use a query tool to check the contents of the Sales_Fact table in your DBMS.
Previous task: Defining the DF_ABAP_SalesFact ABAP data flow [page 197]
Related Information
In the next section, learn how to import and run a real-time job.
The tutorial employs batch jobs to help you learn how to use SAP Data Services. However, real-time jobs
process requests from external systems or Web applications, and send back replies in real time.
Related Information
18 Real-time jobs
In this segment, you execute a real-time job to see the basic functionality.
For real-time jobs, Data Services receives requests from ERP systems and Web applications and sends replies
immediately after receiving the requested data. Requested data comes from a data cache or a second
application. You define operations for processing on-demand messages by building real-time jobs in the
Designer.
Real-time jobs use the following:
• A single real-time data flow (RTDF) that runs until explicitly stopped
• Requests in XML message format and SAP applications using IDoc format
Note
The tutorial exercise focuses on a simple XML-based example that you import.
For more information about real-time jobs, see the Reference Guide.
The goal
We've developed a simple real-time job that you import and run in test mode.
1. Copy the following files from <LINK_DIR>\ConnectivityTest and paste them into your temporary
directory. For example, C:\temp:
• TestOut.dtd
• TestIn.dtd
• TestIn.xml
• ClientTest.txt
2. Copy the file ClientTest.exe from <LINK_DIR>\bin and paste it to your temporary directory.
Note
ClientTest.exe uses DLLs in your <LINK_DIR>\bin directory. If you encounter problems, ensure
that you have included <LINK_DIR>\bin in the Windows environment variables path statement.
4. Right-click in a blank space in the Local Object Library and select Repository Import From File .
Related Information
Run a real-time job that transforms an input string of Hello World to World Hello.
Use the files that you imported previously to create a real-time job.
5. Expand Job_TestConnectivity and click RT_TestConnectivity to open it in the workspace.
The workspace contains one XML message source named TestIn (XML request) and one XML message
target named TestOut (XML reply).
6. Double-click TestIn to open it. Verify that the Test file option in the Source tab is C:\temp\TestIn.XML.
7. In Windows Explorer, open Testin.XML in your temporary directory. For example, C:\temp\TestIn.XML.
Confirm that it contains the following message:
<test>
<Input_string>Hello World</Input_string>
</test>
8. Back in Designer, double-click TestOut in the workspace to open it. Verify that the Test file option in the
Target tab is C:\temp\TestOut.XML.
9. Execute the job Job_TestConnectivity.
10. Click Yes to save all changes if applicable.
11. Accept the default settings in the Execution Properties dialog box and click OK.
12. When the job completes, open Windows Explorer and open C:\temp\TestOut.xml. Verify that the file
contains the following text:
<test>
<output_string>World Hello</output_string>
</test>
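For reference, the word swap that this job performs could be expressed in a query transform mapping for the output_string column using the Data Services word_ext function. This is only a sketch of one possible implementation; the imported Job_TestConnectivity may be built differently:
word_ext(TestIn.Input_string, 2, ' ') || ' ' || word_ext(TestIn.Input_string, 1, ' ')
word_ext extracts the nth word from a string using the given separator, so this expression outputs the second word, a space, and then the first word.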
Related Information
Important Disclaimers and Legal Information
Hyperlinks
Some links are classified by an icon and/or a mouseover text. These links provide additional information.
About the icons:
• Links with the icon : You are entering a Web site that is not hosted by SAP. By using such links, you agree (unless expressly stated otherwise in your
agreements with SAP) to this:
• The content of the linked-to site is not SAP documentation. You may not infer any product claims against SAP based on this information.
• SAP does not agree or disagree with the content on the linked-to site, nor does SAP warrant the availability and correctness. SAP shall not be liable for any
damages caused by the use of such content unless damages have been caused by SAP's gross negligence or willful misconduct.
• Links with the icon : You are leaving the documentation for that particular SAP product or service and are entering a SAP-hosted Web site. By using such
links, you agree that (unless expressly stated otherwise in your agreements with SAP) you may not infer any product claims against SAP based on this
information.
Example Code
Any software coding and/or code snippets are examples. They are not for productive use. The example code is only intended to better explain and visualize the syntax
and phrasing rules. SAP does not warrant the correctness and completeness of the example code. SAP shall not be liable for errors or damages caused by the use of
example code unless damages have been caused by SAP's gross negligence or willful misconduct.
Bias-Free Language
SAP supports a culture of diversity and inclusion. Whenever possible, we use unbiased language in our documentation to refer to people of all cultures, ethnicities,
genders, and abilities.
www.sap.com/contactsap
SAP and other SAP products and services mentioned herein as well as
their respective logos are trademarks or registered trademarks of SAP
SE (or an SAP affiliate company) in Germany and other countries. All
other product and service names mentioned are the trademarks of their
respective companies.