75 Loading Data
After you save the data flow DF_TimeDim, execute the JOB_TimeDim job to populate the TIME_DIM dimension
table with the changed data.
For instructions to validate and execute the job, see Validating the DF_SalesOrg data flow [page 44] and
Executing the job [page 47].
After the job successfully completes, view the output data using your database management tool. Compare
the output to the input data and see how the functions that you set up in the query affected the output data.
Note
Remember that you should periodically close the tabs in the workspace when you are finished working with
the objects in the tab.
In the exercises to populate the Time Dimension table, you practiced the skills that you learned in the first
group of exercises, plus you learned how to use different objects as source and target in a data flow.
You have now populated the following tables in the sales data warehouse:
● Sales Org. dimension table (from a flat file)
● Time dimension table (using a transform)
In the next section, you will extract data to populate the Customer dimension table.
Tutorial
56 PUBLIC Populate the Time dimension table
You can now exit Data Services or go to the next group of tutorial exercises. If you exit, the software reminds
you to save your work if you did not save it before. The software saves all projects, jobs, workflows, data flows,
and results in the local repository.
Related Information
Populate the Customer dimension table from a relational table [page 58]
6 Populate the Customer dimension table from a relational table
In this exercise, you populate the Customer dimension table in the Sales star schema with data from a
relational table.
In the previous exercises, you used a flat file to populate the Sales Org. dimension table and a transform to
populate the Time dimension table. In this exercise, you use a relational table to populate the Customer
dimension table.
You also use the interactive debugger to examine the data after each transform or object in the data flow.
Before you continue with this exercise, make sure that you imported the source and target tables as instructed
in the Importing metadata [page 32] section.
The Designer interactive debugger allows you to examine and modify data row by row using filters and
breakpoints on lines in a data flow diagram.
7. Summary and what to do next [page 67]
In the exercise to populate the Customer dimension table with a relational table, you learned to use
some basic features of the interactive debugger.
1. Right-click the Class_Exercises project name and select New Batch Job.
A tab opens in the workspace area for the new batch job.
2. Rename this job JOB_CustDim.
3. Select the workflow button from the tool palette at right and click the workspace area.
Task overview: Populate the Customer dimension table from a relational table [page 58]
1. Click the data flow button in the tool palette at right and click in the workspace.
The project, job, workflow, and data flow objects display in hierarchical form in the Project Area. To navigate
to these levels, click their names in the project area.
3. Click DF_CustDim in the Project Area.
A blank definition area for the data flow appears in the workspace.
Task overview: Populate the Customer dimension table from a relational table [page 58]
Previous task: Adding the CustDim job and workflow [page 59]
Add objects to DF_CustDim in the workspace area to define the data flow instructions for populating the
Customer dimension table.
In this exercise, you build the data flow by adding the following objects:
● Source table
● Query transform
● Target table
Parent topic: Populate the Customer dimension table from a relational table [page 58]
1. Open the Datastore tab in the Local Object Library and expand the Tables node under ODS_DS.
2. Drag and drop the ODS_CUSTOMER table to the workspace and click Make Source.
3. Click the query button on the tool palette at right and click in the workspace to the right of the
CUSTOMER table.
You configure the query transform by mapping columns from the source to the target objects.
Note
If your database manager is Microsoft SQL Server or Sybase ASE, specify the columns in the order
shown in the table.
4. Click the Back arrow in the icon bar to return to the data flow.
5. Save your work.
Next you will verify that the data flow has been constructed properly.
From the menu bar, click Validation > Validate > All Objects in View.
Note
You can alternatively use the icon bar and click Validate Current and Validate All to perform the same
validations.
If your design contains syntax errors, a dialog box appears with a message describing the error. Warning
messages usually do not affect proper execution of the job.
Task overview: Populate the Customer dimension table from a relational table [page 58]
6.5 Executing the CustDim job
You execute the CustDim job in the same way that you execute the other jobs in the tutorial. However, we also
show you how to view the output data.
1. In the Project Area, right-click the JOB_CustDim job and click Execute.
2. Click OK.
1. Click the DF_CustDim data flow in the Project Area. The data flow workspace opens.
2. Click the magnifying glass that appears on the lower right corner of the target object.
A sample view of the output data appears in the lower pane. Notice that there is not a CUST_TIMESTAMP
column in the output file. However, the software added the CUST_ID column to the output file.
For information about the icon options above the sample data, see the Designer Guide.
Task overview: Populate the Customer dimension table from a relational table [page 58]
The Designer interactive debugger allows you to examine and modify data row by row using filters and
breakpoints on lines in a data flow diagram.
The debugger allows you to examine what happens to the data after each transform or object in the flow.
● Debug filter: Functions as a simple query transform with a WHERE clause. Use a filter to reduce a data set
in a debug job execution.
● Breakpoint: Location where a debug job execution pauses and returns control to you.
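As a rough analogy outside Designer, a debug filter behaves like a WHERE clause applied to the rows that enter a debug run. The Python sketch below only illustrates that idea. The sample rows, the column values, and the `debug_filter` helper are hypothetical and are not part of Data Services.

```python
# Analogy only: a debug filter acts like a WHERE clause that reduces
# the data set for a debug run. The rows and the helper are hypothetical.

def debug_filter(rows, predicate):
    """Yield only the rows that satisfy the filter condition."""
    for row in rows:
        if predicate(row):
            yield row

# Hypothetical sample of source rows.
customer_rows = [
    {"CUST_ID": "C1", "REGION_ID": 1},
    {"CUST_ID": "C2", "REGION_ID": 2},
    {"CUST_ID": "C3", "REGION_ID": 2},
    {"CUST_ID": "C4", "REGION_ID": 3},
]

# Equivalent in spirit to: WHERE REGION_ID = 2
filtered_rows = list(debug_filter(customer_rows, lambda r: r["REGION_ID"] == 2))
```

Only the rows that pass the condition continue downstream, which is why a filter is useful for shrinking a large source during debugging.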
When you start a job in the interactive debugger, Designer displays three additional panes as well as the View
Data panes beneath the workspace area. The following diagram shows the default locations for these panes.
1. View data panes, left and right
2. Call Stack pane
3. Trace pane
4. Debug Variables pane
The left View Data pane shows the data in the CUSTOMER source table, and the right pane shows one row at a
time (the default) that has passed to the query.
Optionally, set a condition in a breakpoint to search for specific rows. For example, you can set a condition to
stop the data flow when the debugger reaches a row in the data with a Region_ID value of 2.
In the next exercise, we show you how to set a breakpoint and debug your DF_CustDim data flow.
Parent topic: Populate the Customer dimension table from a relational table [page 58]
A breakpoint is a location in the data flow where a debug job execution pauses and returns control to you.
Ensure that you have the Class_Exercises project open in the Project Area.
The Breakpoint settings are in the right pane of the Breakpoint editor.
4. Select the Set checkbox.
5. Click OK.
1. In the Designer Project Area, right-click JOB_CustDim and select Start debug.
Click OK if you see a prompt to save your work.
The Debug Properties editor opens. See The interactive debugger [page 63] for an explanation of the
Debug Properties editor.
2. Click OK to close the Debug Properties editor.
The debugging stops after the first row and displays the View data left and right panes.
3. To process the next row, click the Get Next Row button in the icon toolbar at the top of the workspace area.
The next row replaces the existing row in the right view data pane.
4. To see all debugged rows, select the All checkbox in the upper right of the right view data pane.
The right pane shows the first two rows that it has debugged.
5. To stop the debug mode, click Stop Debug from the Debug menu, or click the Stop Debug button on the
toolbar.
For example, add a breakpoint condition for the Customer Dimension job to break when the debugger reaches
a row in the data with a Region_ID value of 2.
1. Open the breakpoint dialog box by double-clicking the breakpoint icon in the data flow.
2. Click the cell under the Column heading and click the down arrow to display a dropdown list of columns.
3. Click CUSTOMER.REGION_ID.
4. Click the cell under the Operator heading and click the down arrow to display a dropdown list of operators.
Click = .
5. Click the cell under the Value heading and type 2.
6. Click OK.
7. Right-click the job name and click Start debug.
The debugger stops after processing the first row with a Region_ID of 2. The right View Data pane shows
the row at which the breakpoint was reached.
8. To stop the debug mode, from the Debug menu, click Stop Debug, or click the Stop Debug button on the
toolbar.
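The effect of the conditional breakpoint in the steps above can be pictured with a small Python sketch. It is an analogy under stated assumptions, not how the debugger is implemented: the sample rows and the `run_until_breakpoint` helper are hypothetical.

```python
# Analogy only: a conditional breakpoint pauses processing at the first
# row that meets the condition. The rows and the helper are hypothetical.

def run_until_breakpoint(rows, condition):
    """Process rows in order and stop at the first row matching the condition.

    Returns (processed_rows, break_row); break_row is None if no row matched.
    """
    processed = []
    for row in rows:
        processed.append(row)
        if condition(row):
            return processed, row  # pause here and hand control back
    return processed, None

sample_rows = [
    {"CUST_ID": "C1", "REGION_ID": 1},
    {"CUST_ID": "C2", "REGION_ID": 3},
    {"CUST_ID": "C3", "REGION_ID": 2},
    {"CUST_ID": "C4", "REGION_ID": 2},
]

# Condition corresponding to: CUSTOMER.REGION_ID = 2
processed, break_row = run_until_breakpoint(
    sample_rows, lambda r: r["REGION_ID"] == 2
)
```

In this sketch, processing pauses at the third row, the first one with a REGION_ID of 2, and the later rows are not processed until execution is resumed.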
6.7 Summary and what to do next
In the exercise to populate the Customer dimension table with a relational table, you learned to use some basic
features of the interactive debugger.
In the next section, you learn about document type definitions (DTD) and extracting data from an XML file.
For more information about the topics covered in this section, see the Designer Guide.
Parent topic: Populate the Customer dimension table from a relational table [page 58]
7 Populate the Material Dimension from an XML File
In this exercise, we use a DTD to define the format of an XML file, which has a hierarchical structure. The
software can process the data only after you have flattened the hierarchy.
An XML file represents hierarchical data using XML tags instead of rows and columns as in a relational table.
There are two methods for flattening the hierarchy of an XML file so that the software can process your data. In
this exercise we first use a Query transform and systematically flatten the input file structure. Then we use an
XML_Pipeline transform to select portions of the nested data to process.
To help you understand the goal for the tasks in this section, read about nested data in the Designer Guide.
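To make the idea of flattening concrete, the following Python sketch turns a small nested XML document into flat rows, one row per nested child with the parent value repeated. It is only an illustration written with the standard library. The element names and sample data are hypothetical and do not reflect the tutorial's mtrl.dtd, and the `flatten` helper is not a Data Services function.

```python
import xml.etree.ElementTree as ET

# Hypothetical nested XML; the element names are assumptions made
# for illustration and are not taken from the tutorial's DTD.
SAMPLE_XML = """
<MATERIAL_LIST>
  <MATERIAL>
    <MTRL_ID>M-001</MTRL_ID>
    <TEXT>
      <LANGUAGE>EN</LANGUAGE>
      <SHORT_TEXT>Widget</SHORT_TEXT>
    </TEXT>
    <TEXT>
      <LANGUAGE>DE</LANGUAGE>
      <SHORT_TEXT>Geraet</SHORT_TEXT>
    </TEXT>
  </MATERIAL>
</MATERIAL_LIST>
"""

def flatten(xml_text):
    """Return one flat row per nested TEXT element, repeating the parent ID."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for material in root.findall("MATERIAL"):
        mtrl_id = material.findtext("MTRL_ID")
        for text in material.findall("TEXT"):
            rows.append({
                "MTRL_ID": mtrl_id,
                "LANGUAGE": text.findtext("LANGUAGE"),
                "SHORT_TEXT": text.findtext("SHORT_TEXT"),
            })
    return rows

flat_rows = flatten(SAMPLE_XML)
```

Each resulting row has plain columns, which is the shape that a relational target table expects.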
After unnesting the source data using the Query in the last exercise, validate the DF_MtrlDim to make
sure there are no errors.
6. Executing the MtrlDim job [page 76]
After you save the MtrlDim data flow, execute the MtrlDim job.
7. Leveraging the XML_Pipeline [page 76]
The main purpose of the XML_Pipeline transform is to extract parts of the XML file.
8. Summary and what to do next [page 79]
In this section, you learned two ways to process an XML file: with a Query transform and with the
XML_Pipeline transform.
The software provides a way to view and manipulate hierarchical relationships within data flow sources, targets,
and transforms using Nested Relational Data Modeling (NRDM).
In this tutorial, we use a document type definition (DTD) schema to define an XML source. XML files have a
hierarchical structure. The DTD describes the data contained in the XML document and the relationships
among the elements in the data.
You imported the mtrl.dtd file when you ran the script for this tutorial. It is located in the Formats tab of the
Local Object Library under Nested Schemas.
For complete information about nested data, see the Designer Guide.
Parent topic: Populate the Material Dimension from an XML File [page 68]
Next task: Adding MtrlDim job, workflow, and data flow [page 69]
To create the objects for this task, we omit the details and rely on the skills that you learned in the first few
exercises of the tutorial.
1. Add a new job to the Class_Exercises project and name it JOB_MtrlDim. To remind you of the steps,
see Adding the CustDim job and workflow [page 59].
2. Add a workflow and name it WF_MtrlDim. To remind you of the steps, see Adding the CustDim job and
workflow [page 59].
3. Click WF_MtrlDim in the Project Area to open it in the workspace.
4. Add a data flow to the workflow definition and name it DF_MtrlDim. To remind you of the steps, see
Adding a data flow [page 40].
Task overview: Populate the Material Dimension from an XML File [page 68]
Import the document type definition (DTD) schema named mtrl.dtd as described in the following steps.
The software adds the DTD file Mtrl_List to the Nested Schemas group in the Local Object Library.
Task overview: Populate the Material Dimension from an XML File [page 68]
Previous task: Adding MtrlDim job, workflow, and data flow [page 69]