75 Loading Data

This document provides instructions on saving and executing a job to populate the TIME_DIM dimension table and summarizes the skills learned in the process, including using different objects in data flows. It details the steps for adding and executing a job to populate the Customer dimension table from a relational table, as well as utilizing an interactive debugger for data examination. The document concludes with a summary of the key learnings and next steps for further exercises.


Next task: Saving and executing the job [page 56]

5.7 Saving and executing the job

After you save the data flow DF_TimeDim, execute the JOB_TimeDim job to populate the TIME_DIM dimension
table with the changed data.

For instructions to validate and execute the job, see Validating the DF_SalesOrg data flow [page 44] and
Executing the job [page 47].

After the job completes successfully, view the output data using your database management tool. Compare
the output to the input data to see how the functions that you set up in the query affected the output data.

Note

Periodically close the tabs in the workspace when you are finished working with the objects in them.

Task overview: Populate the Time dimension table [page 51]

Previous task: Defining the output of the query [page 55]

Next: Summary and what to do next [page 56]

5.8 Summary and what to do next

In the exercises to populate the Time Dimension table, you practiced the skills that you learned in the first
group of exercises, plus you learned how to use different objects as source and target in a data flow.

You have now populated the following tables in the sales data warehouse:

● Sales Org dimension from a flat file


● Time dimension from a transform

What you have learned:

● Use a project for multiple jobs
● Create a job without an optional workflow
● Use a predefined software-supplied transform in your job
● Use the Date_Generation transform as a source
● Set up an output schema using functions
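
The idea of a transform acting as a source, with functions shaping the output schema, can be sketched outside the software. The following Python sketch is only an analogy, not how Data Services implements Date_Generation. The column names (generated_date, year, quarter, month) and the date range are assumptions made for illustration.

```python
from datetime import date, timedelta


def generate_dates(start, end):
    """Yield one row per day from start to end, inclusive (the 'source')."""
    current = start
    while current <= end:
        yield {"generated_date": current}
        current += timedelta(days=1)


def to_time_row(row):
    """Derive output columns from the generated date using functions."""
    d = row["generated_date"]
    return {
        "generated_date": d,
        "year": d.year,
        "quarter": (d.month - 1) // 3 + 1,
        "month": d.month,
    }


# Hypothetical three-day range, chosen only to keep the example small.
time_rows = [to_time_row(r) for r in generate_dates(date(2002, 1, 1), date(2002, 1, 3))]
```

Each generated date becomes one output row, and every other column is computed from it, which mirrors how the query functions determine the contents of the output schema.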

In the next section, you will extract data to populate the Customer dimension table.

You can now exit Data Services or go to the next group of tutorial exercises. If you exit, the software reminds
you to save your work if you did not save it before. The software saves all projects, jobs, workflows, data flows,
and results in the local repository.

Parent topic: Populate the Time dimension table [page 51]

Previous task: Saving and executing the job [page 56]

Related Information

Populate the Customer dimension table from a relational table [page 58]

6 Populate the Customer dimension table from a relational table

In this exercise, you populate the Customer dimension table in the Sales star schema with data from a
relational table.

In the previous exercises, you used a flat file to populate the Sales Org dimension table and a transform to
populate the Time dimension table. In this exercise, you use a relational table to populate the Customer
dimension table.

You also use the interactive debugger to examine the data after each transform or object in the data flow.

Before you continue with this exercise, make sure that you imported the source and target tables as instructed
in the Importing metadata [page 32] section.

1. Adding the CustDim job and workflow [page 59]
Add a new job and workflow to the Class_Exercises project.
2. Adding the CustDim data flow [page 59]
Create a data flow named DF_CustDim inside the workflow WF_CustDim.
3. Define the data flow [page 60]
Add objects to DF_CustDim in the workspace area to define the data flow instructions for populating
the Customer dimension table.
4. Validating the CustDim data flow [page 62]
5. Executing the CustDim job [page 63]
You execute the CustDim job in the same way that you execute the other jobs in the tutorial.
This time, however, we also show you how to view the output data.
6. The interactive debugger [page 63]
The Designer interactive debugger allows you to examine and modify data row by row using filters and
breakpoints on lines in a data flow diagram.
7. Summary and what to do next [page 67]
In the exercise to populate the Customer dimension table with a relational table, you learned to use
some basic features of the interactive debugger.

6.1 Adding the CustDim job and workflow

Add a new job and workflow to the Class_Exercises project.

Open the Class_Exercises project so it appears in the Project Area in Designer.

1. Right-click the Class_Exercises project name and select New Batch Job.

A tab opens in the workspace area for the new batch job.
2. Rename this job JOB_CustDim.
3. Select the workflow button from the tool palette at right and click the workspace area.

The workflow icon appears in the workspace.


4. Rename the workflow WF_CustDim.
5. Save your work.

Task overview: Populate the Customer dimension table from a relational table [page 58]

Next task: Adding the CustDim data flow [page 59]

Related Information

Work flows [page 15]

6.2 Adding the CustDim data flow

Create a data flow named DF_CustDim inside the workflow WF_CustDim.

Make sure the workspace is open for the WF_CustDim workflow.

1. Click the data flow button in the tool palette at right and click in the workspace.

A new data flow appears in the workspace.


2. Rename the data flow DF_CustDim.

The project, job, workflow, and data flow objects display in hierarchical form in the Project Area. To navigate
to these levels, click their names in the project area.

3. Click DF_CustDim in the Project Area.

A blank definition area for the data flow appears in the workspace.

Task overview: Populate the Customer dimension table from a relational table [page 58]

Previous task: Adding the CustDim job and workflow [page 59]

Next: Define the data flow [page 60]

Related Information

Data flows [page 16]

6.3 Define the data flow

Add objects to DF_CustDim in the workspace area to define the data flow instructions for populating the
Customer dimension table.

In this exercise, you build the data flow by adding the following objects:

● Source table
● Query transform
● Target table

Adding objects to a data flow [page 60]


Add three objects to the DF_CustDim data flow workspace.

Configuring the query transform [page 61]


You configure the query transform by mapping columns from the source to the target objects.

Parent topic: Populate the Customer dimension table from a relational table [page 58]

Previous task: Adding the CustDim data flow [page 59]

Next task: Validating the CustDim data flow [page 62]

6.3.1 Adding objects to a data flow

Add three objects to the DF_CustDim data flow workspace.

Make sure that the DF_CustDim data flow workspace is open.

1. Open the Datastore tab in the Local Object Library and expand the Tables node under ODS_DS.
2. Drag and drop the ODS_CUSTOMER table to the workspace and click Make Source.

3. Click the query button on the tool palette at right and click in the workspace to the right of the
CUSTOMER table.

The query icon appears in the workspace.


4. Open the Datastore tab in the Local Object Library and expand the Tables node under Target_DS.
5. Drag and drop the CUST_DIM table to the right of the query and click Make Target.
6. Connect the objects to indicate the flow of data: source table to query, then query to target table.

6.3.2 Configuring the query transform

You configure the query transform by mapping columns from the source to the target objects.

1. Double-click the query in the workspace to open the query editor.


2. Drag the CUST_ID key column from Schema In to the Schema Out column area.

The software adds CUST_ID as a column in the Query table.


3. Remap the following source columns to the target schema, leaving the names and data types as they are in
the target.

Note

Do not map CUST_TIMESTAMP.

Schema In column    Schema Out column    Description
CUST_CLASSF         CUST_CLASSF          Customer classification
NAME1               NAME1                Customer name
ADDRESS             ADDRESS              Address
CITY                CITY                 City
REGION_ID           REGION_ID            Region
ZIP                 ZIP                  Postal code

Note

If your database manager is Microsoft SQL Server or Sybase ASE, specify the columns in the order
shown in the table.
4. Click the Back arrow in the icon bar to return to the data flow.
5. Save your work.
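
The effect of the mapping above can be sketched in a few lines of Python. This is only an illustration of the idea of a column mapping, not how the query transform works internally. The sample row values are invented, and apply_query is a hypothetical helper name.

```python
# Schema In column -> Schema Out column, in the order given in the steps above.
# CUST_TIMESTAMP is deliberately absent, so it never reaches the target.
COLUMN_MAP = {
    "CUST_ID": "CUST_ID",
    "CUST_CLASSF": "CUST_CLASSF",
    "NAME1": "NAME1",
    "ADDRESS": "ADDRESS",
    "CITY": "CITY",
    "REGION_ID": "REGION_ID",
    "ZIP": "ZIP",
}


def apply_query(source_rows, column_map=COLUMN_MAP):
    """Project each source row onto the mapped output columns only."""
    for row in source_rows:
        yield {out_col: row[in_col] for in_col, out_col in column_map.items()}


# Invented sample row standing in for one row of the source table.
sample_source = [
    {
        "CUST_ID": "C001",
        "CUST_CLASSF": "A",
        "NAME1": "Example Customer",
        "ADDRESS": "1 Example Street",
        "CITY": "Springfield",
        "REGION_ID": 2,
        "ZIP": "00000",
        "CUST_TIMESTAMP": "2001-01-01 00:00:00",
    }
]

target_rows = list(apply_query(sample_source))
```

Because the output row is built only from the mapped columns, any unmapped source column such as CUST_TIMESTAMP is simply left out of the target.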

6.4 Validating the CustDim data flow

Next, verify that the data flow has been constructed properly.

From the menu bar, click Validation > Validate > All Objects in View.

Note

○ Current View validates the object definition open in the workspace.
○ All Objects in View validates the object definition open in the workspace and all of the objects that it
calls.

Alternatively, click Validate Current or Validate All in the icon bar to perform the same validations.

If your design contains syntax errors, a dialog box appears with a message describing the error. Warning
messages usually do not affect proper execution of the job.

If your data flow contains no errors, the following message appears:

Validate: No Errors Found

Task overview: Populate the Customer dimension table from a relational table [page 58]

Previous: Define the data flow [page 60]

Next task: Executing the CustDim job [page 63]

6.5 Executing the CustDim job

You execute the CustDim job in the same way that you execute the other jobs in the tutorial. This time,
however, we also show you how to view the output data.

1. In the Project Area, right-click the JOB_CustDim job and click Execute.
2. If you are prompted to save your work, click OK.

The software opens Execution Properties.


3. Leave all of the default settings and click OK.

The software executes the JOB_CustDim job.


4. Check for any warnings or errors after the job execution completes. If errors exist, fix the errors and
execute the job again.
5. After successful execution, view the output data by following these substeps.

1. Click the DF_CustDim data flow in the Project Area. The data flow workspace opens.
2. Click the magnifying glass that appears on the lower right corner of the target object.

A sample view of the output data appears in the lower pane. Notice that the output has no CUST_TIMESTAMP
column, because you did not map it. However, the output does include the CUST_ID column that you added.

For information about the icon options above the sample data, see the Designer Guide.

Task overview: Populate the Customer dimension table from a relational table [page 58]

Previous task: Validating the CustDim data flow [page 62]

Next: The interactive debugger [page 63]

6.6 The interactive debugger

The Designer interactive debugger allows you to examine and modify data row by row using filters and
breakpoints on lines in a data flow diagram.

The debugger allows you to examine what happens to the data after each transform or object in the flow.

● Debug filter: Functions as a simple query transform with a WHERE clause. Use a filter to reduce a data set
in a debug job execution.
● Breakpoint: Location where a debug job execution pauses and returns control to you.
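
The two concepts can be pictured with a short Python sketch. It is an analogy only and does not use any Data Services API. The sample rows and the debug_filter helper are invented for illustration. The filter plays the role of the WHERE clause, and stepping through an iterator with next() stands in for a breakpoint that returns control after each row.

```python
def debug_filter(rows, predicate):
    """Keep only rows that satisfy the predicate, like a WHERE clause."""
    return (row for row in rows if predicate(row))


# Invented sample rows standing in for the source table.
source_rows = [
    {"CUST_ID": "C1", "CITY": "Springfield"},
    {"CUST_ID": "C2", "CITY": "Riverton"},
    {"CUST_ID": "C3", "CITY": "Springfield"},
    {"CUST_ID": "C4", "CITY": "Lakeside"},
]

# The filter reduces the data set that the debug run has to process.
reduced = debug_filter(source_rows, lambda row: row["CITY"] == "Springfield")

# Pulling one row at a time hands control back to the caller after each row,
# which is what a breakpoint without a condition does.
stepper = iter(reduced)
first_row = next(stepper)
second_row = next(stepper)
remaining = list(stepper)
```

In this sketch, only two of the four rows ever reach the stepping stage, which is why a filter is useful when the full data set is large.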

When you start a job in the interactive debugger, Designer displays three additional panes as well as the View
Data panes beneath the workspace area. The following diagram shows the default locations for these panes.

1. View data panes, left and right
2. Call Stack pane
3. Trace pane
4. Debug Variables pane

The left View Data pane shows the data in the CUSTOMER source table, and the right pane shows one row at a
time (the default) that has passed to the query.

Optionally, set a condition in a breakpoint to search for specific rows. For example, you can set a condition to
stop the data flow when the debugger reaches a row in the data with a Region_ID value of 2.

In the next exercise, we show you how to set a breakpoint and debug your DF_CustDim data flow.

Learn more about the interactive debugger in the Designer Guide.

Setting a breakpoint in a data flow [page 65]


A breakpoint is a location in the data flow where a debug job execution pauses and returns control to
you.

Debugging JOB_CustDim with the interactive debugger [page 66]


Run the interactive debugger on the Customer Dimension job to see its basic functionality.

Setting a breakpoint condition [page 66]


Set a condition on the breakpoint to stop processing when a specific condition is met.

Parent topic: Populate the Customer dimension table from a relational table [page 58]

Previous task: Executing the CustDim job [page 63]

Next: Summary and what to do next [page 67]

6.6.1 Setting a breakpoint in a data flow

A breakpoint is a location in the data flow where a debug job execution pauses and returns control to you.

Ensure that you have the Class_Exercises project open in the Project Area.

Follow these steps to set a breakpoint in the DF_CustDim data flow.

1. Expand JOB_CustDim and click DF_CustDim in the Project Area.

The DF_CustDim definition opens in the workspace.


2. Right-click the connector line between the source table and the query and select Set Breakpoint.

A red breakpoint icon appears on the connector line.


3. Double-click the breakpoint icon on the connector to open the Breakpoint editor.

The Breakpoint settings are in the right pane of the Breakpoint editor.
4. Select the Set checkbox.

Leave the other options set at the default settings.

5. Click OK.

6.6.2 Debugging JOB_CustDim with the interactive debugger


Run the interactive debugger on the Customer Dimension job to see its basic functionality.

1. In the Designer Project Area, right-click JOB_CustDim and select Start debug.
Click OK if you see a prompt to save your work.

The Debug Properties editor opens. See The interactive debugger [page 63] for an explanation of the
Debug Properties editor.
2. Click OK to close the Debug Properties editor.

The debugging stops after the first row and displays the View data left and right panes.

3. To process the next row, click Get Next Row in the icon toolbar at the top of the workspace area.

The next row replaces the existing row in the right view data pane.
4. To see all debugged rows, select the All checkbox in the upper right of the right view data pane.

The right pane shows the first two rows that it has debugged.
5. To stop the debug mode, click Stop Debug from the Debug menu, or click the Stop Debug button on the
toolbar.

Now debug the job with a breakpoint condition.

6.6.3 Setting a breakpoint condition


Set a condition on the breakpoint to stop processing when a specific condition is met.

For example, add a breakpoint condition for the Customer Dimension job to break when the debugger reaches
a row in the data with a Region_ID value of 2.

1. Open the breakpoint dialog box by double-clicking the breakpoint icon in the data flow.
2. Click the cell under the Column heading and click the down arrow to display a dropdown list of columns.
3. Click CUSTOMER.REGION_ID.
4. Click the cell under the Operator heading and click the down arrow to display a dropdown list of operators.
Click = .
5. Click the cell under the Value heading and type 2.
6. Click OK.
7. Right-click the job name and click Start debug.
The debugger stops after processing the first row with a REGION_ID of 2. The right View Data pane shows
the row at which the debugger stopped.
8. To stop the debug mode, from the Debug menu, click Stop Debug, or click the Stop Debug button on the
toolbar.
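
The column, operator, and value that you enter in the breakpoint dialog box amount to a simple condition that is tested against each row. The Python sketch below illustrates that logic under stated assumptions: the sample rows are invented, run_until_break is a hypothetical helper, and only three operators are modeled. It is not the debugger's actual implementation.

```python
import operator

# A small subset of comparison operators, keyed by the symbol you would pick.
OPERATORS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}


def run_until_break(rows, column, op, value):
    """Process rows in order and stop at the first row that meets the condition.

    Returns (processed_rows, break_row); break_row is None if no row matched.
    """
    compare = OPERATORS[op]
    processed = []
    for row in rows:
        processed.append(row)
        if compare(row[column], value):
            return processed, row
    return processed, None


# Invented sample rows standing in for the source table.
customer_rows = [
    {"CUST_ID": "C1", "REGION_ID": 1},
    {"CUST_ID": "C2", "REGION_ID": 3},
    {"CUST_ID": "C3", "REGION_ID": 2},
    {"CUST_ID": "C4", "REGION_ID": 2},
]

processed, break_row = run_until_break(customer_rows, "REGION_ID", "=", 2)
```

With this sample data, processing stops at the third row, the first one whose REGION_ID equals 2, and the fourth row is not processed until the run continues.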

6.7 Summary and what to do next

In the exercise to populate the Customer dimension table with a relational table, you learned to use some basic
features of the interactive debugger.

What you learned:

● Extract data from a relational table


● View a sample of the data by clicking the magnifying glass in the lower right corner of the source or target
icon in the data flow
● Remap source columns to specific data types in the query transform
● Use the basic features of the interactive debugger

In the next section, you learn about document type definitions (DTD) and extracting data from an XML file.

For more information about the topics covered in this section, see the Designer Guide.

Parent topic: Populate the Customer dimension table from a relational table [page 58]

Previous: The interactive debugger [page 63]

Related Information

Populate the Material Dimension from an XML File [page 68]

7 Populate the Material Dimension from an XML File

In this exercise, we use a DTD to define the format of an XML file, which has a hierarchical structure. The
software can process the data only after you have flattened the hierarchy.

An XML file represents hierarchical data using XML tags instead of rows and columns as in a relational table.

There are two methods for flattening the hierarchy of an XML file so that the software can process your data. In
this exercise we first use a Query transform and systematically flatten the input file structure. Then we use an
XML_Pipeline transform to select portions of the nested data to process.
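
To make "flattening the hierarchy" concrete, the following Python sketch unnests a small XML document into rows and columns using only the standard library. It illustrates the concept and is neither the Query transform nor the XML_Pipeline transform. Apart from the root element name MTRL_MASTER_LIST, every element name and value in the sample is an assumption made for illustration and may not match mtrl.dtd.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document: each material nests one or more TEXT entries.
SAMPLE_XML = """
<MTRL_MASTER_LIST>
  <MTRL_MASTER>
    <MTRL_ID>M1</MTRL_ID>
    <TEXT><LANGUAGE>EN</LANGUAGE><SHORT_TEXT>Bolt</SHORT_TEXT></TEXT>
    <TEXT><LANGUAGE>DE</LANGUAGE><SHORT_TEXT>Schraube</SHORT_TEXT></TEXT>
  </MTRL_MASTER>
  <MTRL_MASTER>
    <MTRL_ID>M2</MTRL_ID>
    <TEXT><LANGUAGE>EN</LANGUAGE><SHORT_TEXT>Nut</SHORT_TEXT></TEXT>
  </MTRL_MASTER>
</MTRL_MASTER_LIST>
"""


def flatten(xml_text):
    """Return one flat row per nested TEXT entry, repeating the parent's ID."""
    root = ET.fromstring(xml_text)
    rows = []
    for master in root.findall("MTRL_MASTER"):
        mtrl_id = master.findtext("MTRL_ID")
        for text in master.findall("TEXT"):
            rows.append(
                {
                    "MTRL_ID": mtrl_id,
                    "LANGUAGE": text.findtext("LANGUAGE"),
                    "SHORT_TEXT": text.findtext("SHORT_TEXT"),
                }
            )
    return rows


flat_rows = flatten(SAMPLE_XML)
```

The parent value (MTRL_ID) is repeated on every row produced from its nested children, which is the usual cost of turning a hierarchy into a relational shape.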

To help you understand the goal for the tasks in this section, read about nested data in the Designer Guide.

1. Nested data [page 69]
The software provides a way to view and manipulate hierarchical relationships within data flow sources,
targets, and transforms using Nested Relational Data Modeling (NRDM).
2. Adding MtrlDim job, workflow, and data flow [page 69]
To create the objects for this task, we omit the details and rely on the skills that you learned in the first
few exercises of the tutorial.
3. Importing a document type definition [page 70]
Import the document type definition (DTD) schema named mtrl.dtd as described in the following
steps.
4. Define the MtrlDim data flow [page 71]
In this exercise you add specific objects to the DF_MtrlDim data flow workspace and connect them in
the order in which the software should process them.
5. Validating that the MtrlDim data flow has been constructed properly [page 75]
After unnesting the source data using the Query in the last exercise, validate the DF_MtrlDim to make
sure there are no errors.
6. Executing the MtrlDim job [page 76]
After you save the MtrlDim data flow, execute the MtrlDim job.
7. Leveraging the XML_Pipeline [page 76]
The main purpose of the XML_Pipeline transform is to extract parts of the XML file.
8. Summary and what to do next [page 79]
In this section you learned two ways to process an XML file: with a Query transform and with the
XML_Pipeline transform.

Related Information

Designer Guide: Nested Data
Designer Guide: Nested Data, Operations on nested data, Overview of nested data and the Query transform
Designer Guide: Nested Data, Operations on nested data, Unnesting nested data

7.1 Nested data

The software provides a way to view and manipulate hierarchical relationships within data flow sources, targets,
and transforms using Nested Relational Data Modeling (NRDM).

In this tutorial, we use a document type definition (DTD) schema to define an XML source. XML files have a
hierarchical structure. The DTD describes the data contained in the XML document and the relationships
among the elements in the data.

You imported the mtrl.dtd file when you ran the script for this tutorial. It is located in the Formats tab of the
Local Object Library under Nested Schemas.

For complete information about nested data, see the Designer Guide.

Parent topic: Populate the Material Dimension from an XML File [page 68]

Next task: Adding MtrlDim job, workflow, and data flow [page 69]

7.2 Adding MtrlDim job, workflow, and data flow

To create the objects for this task, we omit the details and rely on the skills that you learned in the first few
exercises of the tutorial.

1. Add a new job to the Class_Exercises project and name it JOB_MtrlDim. To remind you of the steps,
see Adding the CustDim job and workflow [page 59].

2. Add a workflow and name it WF_MtrlDim. To remind you of the steps, see Adding the CustDim job and
workflow [page 59].
3. Click WF_MtrlDim in the Project Area to open it in the workspace.
4. Add a data flow to the workflow definition and name it DF_MtrlDim. To remind you of the steps, see
Adding a data flow [page 40].

Task overview: Populate the Material Dimension from an XML File [page 68]

Previous: Nested data [page 69]

Next task: Importing a document type definition [page 70]

7.3 Importing a document type definition

Import the document type definition (DTD) schema named mtrl.dtd as described in the following steps.

1. Open the Formats tab in the Local Object Library.

2. Right-click Nested Schemas and click New > DTD.

The Import DTD Format dialog opens.


3. Type Mtrl_List for the DTD definition name.
4. Click Browse and open <LINK_DIR>\Tutorial Files\mtrl.dtd.

The directory and file name appear in File name.


5. For File type, keep the default option value of DTD.
6. Click the dropdown arrow in Root element name and select MTRL_MASTER_LIST.
7. Click OK.

The software adds the DTD file Mtrl_List to the Nested Schemas group in the Local Object Library.

Task overview: Populate the Material Dimension from an XML File [page 68]

Previous task: Adding MtrlDim job, workflow, and data flow [page 69]

Next: Define the MtrlDim data flow [page 71]
