Guide to Migrating to IBM InfoSphere Information Server, Version 8.5
Version 8 Release 5
SC19-2965-00
Note: Before using this information and the product that it supports, read the information in "Notices and trademarks" on page 53.
Copyright IBM Corporation 2006, 2010. US Government Users Restricted Rights - Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Part 1. Overview of migration
  Chapter 1. Migrating to IBM InfoSphere Information Server, Version 8.5
    Migration strategies

    Job migration in legacy operational mode
    QualityStage job migration in expanded form
    Rule set migration
    Match specification migration
    Installing the WebSphere QualityStage migration utility
    Running the QualityStage migration utility
    Importing the migrated files into the Version 8.5 Designer client
    Preparing migrated QualityStage jobs for operation
    Migrating real-time QualityStage jobs from Release 7 to Version 8.5
    Replace InfoSphere QualityStage 7 data quality stages

  Chapter 5. Migration from IBM WebSphere RTI 7.5, 7.5.1, and 7.5.2 installations
    Migrating RTIX files to Version 8.5

Part 4. Appendixes
  Product accessibility
  Accessing product documentation
  Reading command-line syntax
  Links to non-IBM Web sites
  Notices and trademarks
  Contacting IBM
  Index
Migration strategies
You will use different migration strategies and methods to migrate to Version 8.5 from existing installations. Plan your strategy and choose your migration method based on the version of your existing installation. Successful migration requires that you understand your IBM InfoSphere Information Server installation topology on the source computers and your InfoSphere Information Server installation topology on the target computers. You must use one of the following methods to migrate your source installation to a new Version 8.5 installation:
Table 1. Supported migrations to Version 8.5

Existing installation (source): Information Server, Version 8.0.1 and later
Migration method: If you want to do this type of migration, see www.ibm.com/support/docview.wss?uid=swg21445403.

Existing installation (source): DataStage 7.5.3 and earlier versions
Migration method: Export projects and save the settings files from the existing installation. Then import the projects and move the settings files to the Version 8.5 installation. See Chapter 3, Migration from DataStage 7.5.3 and earlier versions, on page 9.

Existing installation (source): QualityStage 7.5 and earlier releases
Migration method: Use the QualityStage migration utility. See Chapter 4, Migration from QualityStage 7.5 and earlier releases, on page 25.

Existing installation (source): WebSphere RTI 7.5, 7.5.1, and 7.5.2
Migration method: Use the WebSphere RTI Export Wizard. See Chapter 5, Migration from IBM WebSphere RTI 7.5, 7.5.1, and 7.5.2 installations, on page 35.
Note: You do not migrate computers that only have the client tier installed on them. Computers that only have client tier software installed do not contain installation information or data that requires migration. To upgrade these computers to Version 8.5, you run the InfoSphere Information Server suite installation program.
Exporting projects
You export your DataStage projects from your source installation before migrating the projects to your Version 8.5 target installation.
Procedure
1. Open the DataStage Manager client, and attach to the project that you want to export.
2. Ensure that View > Host View is selected.
3. Select Export > DataStage Components.
4. In the Export window, specify details about the project that you want to export:
   a. In the Export to file field, type or browse for the path name of the file in which to store the project. By default, export files have the suffix .dsx.
   b. In the Components tab, select the Whole Project option.
5. Click OK. The project is exported to the file that you specified.
6. In the left pane of the DataStage Manager window, select the next project and repeat the export process from step 3, specifying a different name for the export file.
Procedure
1. Open a command line editor and go to the DataStage client directory (the default path is C:\Ascential\Program Files\Ascential\DataStageversion).
2. Enter the command:
   dscmdexport /H=hostname /U=username /P=password project_name export_file_path [/V]
   The arguments are as follows:
   - hostname is the name of the DataStage server computer where the project is located.
   - username is your user name on the server computer.
   - password is the password for that user name.
   - project_name is the name of the project that you are exporting.
   - export_file_path is the path name of the destination file. By convention, export files have the suffix .dsx.
   - /V is optional. Including /V turns the verbose option on so that you can follow the progress of the export procedure.
Example
For example, the following command exports the project named monthlyaudit that is located on the DataStage server named R101 and writes the project to a file named monthlyaudit.dsx. The target file is located in a directory named migrated_projects on the client computer.
dscmdexport /H=R101 /U=BillG /P=paddock monthlyaudit C:\migrated_projects\monthlyaudit.dsx
What to do next
You can enter a command for each project that you want to export, or you can create a script that contains commands for all of the projects on your DataStage server computer.
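A script of this kind can be generated rather than typed by hand. The following Python sketch is only an illustration: the function name build_export_commands and the list of project names are hypothetical, and the argument order simply follows the dscmdexport syntax shown in the procedure above.

```python
from pathlib import PureWindowsPath


def build_export_commands(host, user, password, projects, out_dir, verbose=False):
    """Return one dscmdexport argument list per project.

    Argument order follows the documented syntax:
    dscmdexport /H=hostname /U=username /P=password project_name export_file_path [/V]
    """
    commands = []
    for project in projects:
        # By convention, export files have the suffix .dsx.
        export_file = PureWindowsPath(out_dir) / f"{project}.dsx"
        args = [
            "dscmdexport",
            f"/H={host}",
            f"/U={user}",
            f"/P={password}",
            project,
            str(export_file),
        ]
        if verbose:
            # /V turns the verbose option on.
            args.append("/V")
        commands.append(args)
    return commands
```

Joining each returned list with spaces gives one command line per project, which you can paste into a batch file or pass to a process launcher on the client computer.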
Procedure
1. Copy the $DSHOME/dsenv file to a location outside of the DataStage file structure.
2. To keep a record of the existing configuration and the configuration of your ODBC drivers, copy the following files to a location outside of the DataStage file structure:
   - $DSHOME/.odbc.ini
   - $DSHOME/uvodbc.config
   - The uvodbc.config file located in each project directory
Chapter 3. Migration from DataStage 7.5.3 and earlier versions
3. Copy the DSParams file from each project directory to a location outside of the DataStage file structure.
4. Also copy the DSParams file from the Template project:
   $DSHOME/../Template/DSParams
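The copy steps for Linux and UNIX can be collected into one small script. The sketch below is a minimal illustration under stated assumptions: the function name backup_settings, the backup folder layout (one subfolder named engine, one named Template, and one per project), and the way project directories are passed in are all hypothetical. Only the file names come from the procedure.

```python
import shutil
from pathlib import Path


def backup_settings(dshome, project_dirs, backup_root):
    """Copy the settings files named in the procedure to backup_root.

    Returns the list of copied files. Files that do not exist are skipped.
    """
    dshome = Path(dshome)
    backup_root = Path(backup_root)
    # Engine-level files, plus DSParams from the Template project
    # ($DSHOME/../Template/DSParams).
    sources = [
        (dshome / "dsenv", "engine"),
        (dshome / ".odbc.ini", "engine"),
        (dshome / "uvodbc.config", "engine"),
        (dshome.parent / "Template" / "DSParams", "Template"),
    ]
    # Per-project files: uvodbc.config and DSParams.
    for project in map(Path, project_dirs):
        sources.append((project / "uvodbc.config", project.name))
        sources.append((project / "DSParams", project.name))

    copied = []
    for src, label in sources:
        if not src.is_file():
            continue
        dest_dir = backup_root / label
        dest_dir.mkdir(parents=True, exist_ok=True)
        copied.append(Path(shutil.copy2(src, dest_dir / src.name)))
    return copied
```

Choose a backup_root that lies outside of the DataStage file structure. A project that happens to be named engine or Template would share a backup subfolder with the engine-level files, so rename the labels if that applies to you.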
Procedure
1. Copy the DSParams file from each project directory to a safe location.
2. Also copy the DSParams file from the Template project. For example, save the C:\Ascential\DataStage\Template\DSParams file.
3. Copy the uvodbc.config file from the engine directory ($DSHOME/uvodbc.config).
4. Copy the uvodbc.config file located in each project directory.
Procedure
1. Locate the hashed files in your directory structure:
   - Each static hashed file is represented by two operating system files. For example, a static hashed file named price_lookup is contained in the two files named price_lookup and D_price_lookup.
   - Each dynamic hashed file is represented by a directory with the same name as the hashed file and a file named D_hashed_file_name. For example, a dynamic hashed file named code_lookup is represented by the directory named code_lookup and the file named D_code_lookup.
2. Copy the files or directories that represent each of your hashed files to a safe location outside of the DataStage directory structure.
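The copy in step 2 can be sketched as follows. This is an illustration only: the function name backup_hashed_file is hypothetical, and the companion file name defaults to D_ followed by the hashed file name, which is an assumption based on the naming pattern described above. Pass a different companion name if your files are named differently.

```python
import shutil
from pathlib import Path


def backup_hashed_file(project_dir, name, backup_dir, companion=None):
    """Copy a hashed file and its companion file to backup_dir.

    A static hashed file is a plain file; a dynamic hashed file is a directory.
    """
    project_dir, backup_dir = Path(project_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    data = project_dir / name
    # Assumed default companion name: D_<hashed file name>.
    extra = project_dir / (companion or f"D_{name}")
    if data.is_dir():
        # Dynamic hashed file: copy the whole directory.
        shutil.copytree(data, backup_dir / name)
    else:
        # Static hashed file: copy the single data file.
        shutil.copy2(data, backup_dir / name)
    shutil.copy2(extra, backup_dir / extra.name)
    return backup_dir / name, backup_dir / extra.name
```

Because shutil.copytree refuses to overwrite an existing directory, use a fresh backup directory for each run.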
What to do next
You can restore these files after you install and configure InfoSphere DataStage.
Procedure
1. Open the control panel.
2. In the control panel, select Add or Remove Programs.
3. In the list of installed programs, select DataStage server.
4. Click Change/Remove.
Procedure
1. Log in as the root user.
2. Change directory to the top level directory of the CD, or the directory to which you copied the CD contents.
3. Type the relevant uninstallation command:

   Operating system: Solaris, AIX, HP-UX, Linux
The uninstallation program guides you through the procedure for removing the DataStage server.
Procedure
1. Open the Windows control panel.
2. In the control panel, select Add or Remove Programs.
3. In the list of installed programs, select DataStage Clients.
4. Click Change/Remove.
Procedure
1. Move the .dsx file or files that you created to the computer where the IBM InfoSphere DataStage and QualityStage, Version 8.5 clients are installed.
2. Open the Administrator client and create a project or projects to contain the objects from your exported projects:
   a. On the Projects page, click Add.
   b. In the Add Project window, type the name of the project that you want to create and specify a path name for it. You can use the names of the original projects if required.
   c. Click OK to create the project.
   d. Repeat these steps for each project that you want to create. You might want to create a project for each of the projects that you exported and give it the same name as the original project.
3. Open the Designer client and attach to the target project.
4. Select Import > DataStage Components.
5. Specify the name of the .dsx file that you want to import, and click OK.
6. After you unit test the migrated components in your Version 8.5 development environment, you must export the DataStage components from your Version 8.5 environment and then import them to your Version 8.5 production environment.
Procedure
1. Open the saved version of the settings file in a text editor.
2. Open the new version of the same file in a text editor.
3. Compare the contents of the saved file with the new file.
4. Add any required entries to the new file.
5. Save the new file and close the text editor.
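The comparison in step 3 can be partly automated. The following sketch lists the lines of the saved file that do not appear in the new file, so that you can review them as candidates for step 4. The function name missing_entries is hypothetical, and treating lines that start with # as comments is an assumption about the file format.

```python
def missing_entries(saved_text, new_text):
    """Return lines of the saved file that are absent from the new file.

    Blank lines and lines that start with '#' (assumed to be comments) are ignored.
    """
    new_lines = {line.strip() for line in new_text.splitlines()}
    missing = []
    for line in saved_text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        if stripped not in new_lines:
            missing.append(stripped)
    return missing
```

The result is a list to review, not a list to paste in blindly: entries that contain paths from the old installation might need to be adjusted for the Version 8.5 directory structure.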
Procedure
1. For each file, find the location for the file in the new IBM InfoSphere Information Server directory structure. For example, if the file that you want to restore was in the project directory in your previous DataStage installation, find the project directory to which you imported the project contents.
2. Use operating system commands to copy the saved file to the required location.
3. Ensure that the jobs that reference these files can locate them. If the file is referred to directly, you must edit the path name in the job design. If the file is referred to by a job parameter, you might need to edit the default value of the parameter to reference the new location for the file.
Procedure
1. For jobs in your migrated project that create the hashed files, open the InfoSphere DataStage and QualityStage Director client, and attach to the migrated project that uses the hashed files.
2. Run or validate the job that creates the hashed files.
3. Close all InfoSphere DataStage clients and stop the DataStage services.
4. Locate the newly created files in your directory structure and copy the saved hashed files over the top of them.
5. Restart the DataStage services.
What to do next
Your hashed files are now available for use.
Procedure
1. Open the Designer client, and attach to the migrated project that uses the hashed files.
2. Create a new server job.
3. For each hashed file that you need to restore, add a Sequential File stage linked to a Hashed File stage.
4. In the Sequential File stage, point to an empty text file and define one column in the Outputs page Columns tab.
5. In the Hashed File stage, specify the name of the hashed file that you are restoring, then select the Create File option and specify the file type of the hashed file that you are restoring.
6. When you have added stages for all the hashed files that you want to restore, compile and then validate or run the job to create the empty hashed files.
7. Close all InfoSphere DataStage clients and stop the DataStage services.
8. Locate the newly created files in your directory structure and copy your saved hashed files over the top of them.
9. Restart the DataStage services.
What to do next
Your hashed files are now available for use.
Recompiling jobs
You must recompile the jobs and routines in the migrated projects to create new executable jobs.
Procedure
1. If you started the wizard from the Tools menu, specify the criteria for selecting jobs to compile. Choose one or more of:
   - Server
   - Parallel
   - Mainframe
   - Sequence
   - Custom server routines
   - Custom parallel stage types
   You can also specify that only currently uncompiled jobs will be compiled, and that you want to manually select the items to compile.
2. Click Next>. If you chose the Show manual selection page option, the Job Selection Override screen appears. Choose jobs in the left pane and add them to the right pane by using the Add buttons or remove them from the right pane by using the Remove buttons. Clicking Add while you have a folder selected selects all the items in that folder and moves them to the right pane. All the jobs in the right pane will be compiled.
3. Click Next>. If you are compiling parallel or mainframe jobs, the Compiler Options screen appears, allowing you to specify the following:
   - Force compile (for parallel jobs).
   - An upload profile for mainframe jobs that you are generating code for.
4. Click Next>. The Compile Process screen appears, displaying the names of the selected items and their current compile status.
5. Click Start Compile to start the compilation. As the compilation proceeds, the status changes from Queued to Compiling to Compiled OK or Failed, and details about each job are displayed in the compilation output window as it compiles. Click the Cancel button to stop the compilation. You can cancel only between compilations, so the Designer client might take some time to respond.
6. Click Finish. If the Show compile report checkbox was selected, the job compilation report screen appears, displaying the report generated by the compilation.
Procedure
1. Start the Multi-client Manager by double-clicking the desktop shortcut.
2. In the Current installation field, check whether the currently selected version is the version that you want.
   - If the correct version is selected, you need take no further action.
   - If the correct version is not selected, select the correct client in the Known installations list and click Select.
3. Click Close to close the Multi-client Manager.
Table 2. Scenario 1: InfoSphere DataStage, Version 8.5 server installed on the same computer as an existing DataStage 7.5.1 server (continued)

Engine tier instance: InfoSphere DataStage, Version 8.5 server
Server details:
- Itag 123
- Port 31540
- /opt/IBM/InformationServer/Server
The following table shows a scenario of a multi-server installation with three servers, and illustrates the use of port numbers and ITAGs.
Table 3. Scenario 2: InfoSphere DataStage, Version 8.5 server installed on the same computer as two existing DataStage 7.5.1 servers

Engine tier instance: WebSphere DataStage, release 7.2 server
Server details:
- Itag ADE
- Port 31538
- /disk1/Ascential/DataStage

Engine tier instance: WebSphere DataStage, release 7.5.1 server
Server details:
- Itag A23
- Port 31546
- /disk2/Ascential/DataStage

Engine tier instance: InfoSphere DataStage, Version 8.5 server
Server details:
- Itag 123
- Port 31540
- /opt/IBM/InformationServer/Server
The following table shows a scenario with three Version 8.5 servers, and illustrates the use of port numbers and ITAGs for Version 8.5 installations.
Table 4. Scenario 3: Three instances of InfoSphere DataStage, Version 8.5 server installed on the same computer

Engine tier instance: InfoSphere DataStage, Version 8.5 server
Server details:
- Itag ADE
- Port 31538
- /opt/IBM/InformationServer/Server

Engine tier instance: InfoSphere DataStage, Version 8.5 server
Server details:
- Itag BED
- Port 31540
- /opt2/IBM/InformationServer/Server

Engine tier instance: InfoSphere DataStage, Version 8.5 server
Server details:
- Itag 123
- Port 31542
- /opt3/IBM/InformationServer/Server
Installing IBM InfoSphere DataStage, Version 8.5 clients in addition to existing clients
When you maintain multiple versions of the InfoSphere DataStage server on Linux or UNIX computers, you must maintain corresponding client versions on Windows computers.
Procedure
1. Log on to the Windows computer as an administrator.
2. Turn off any firewall software that is installed on the computer.
3. Optional: Turn off your antivirus software.
4. Go to the root directory on the InfoSphere Information Server, Version 8.5 installation media or downloaded installation image.
5. Double-click setup.exe. The installation program starts and guides you through the installation procedure.
6. When asked for an installation directory, select the New installation option and either use the default directory, or specify a new directory.
7. When asked to select the product modules and components, select InfoSphere DataStage and QualityStage. Select other components as required by your installation plan.
Procedure
1. Set the $DSHOME environment variable to point to the /opt/IBM/InformationServer/Server/DSEngine directory.
2. Stop the server by using the following command:
   $DSHOME/bin/uv -admin -stop
3. Wait thirty seconds to give the server time to stop.
4. Start the server by using the following command:
   $DSHOME/bin/uv -admin -start
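The stop, wait, and start sequence can be wrapped in a short script. The sketch below is a minimal illustration: the function names uv_command and restart_engine are hypothetical, and the script assumes that it runs on the engine tier computer as a user who is allowed to administer the engine. Only the uv -admin -stop and uv -admin -start commands come from the procedure.

```python
import os
import posixpath
import subprocess
import time


def uv_command(dshome, action):
    """Build the uv administration command for 'start' or 'stop'."""
    if action not in ("start", "stop"):
        raise ValueError("action must be 'start' or 'stop'")
    return [posixpath.join(dshome, "bin", "uv"), "-admin", f"-{action}"]


def restart_engine(dshome, wait_seconds=30):
    """Stop the engine, wait for it to shut down, and start it again."""
    # Make DSHOME available to the child processes, as in step 1.
    env = dict(os.environ, DSHOME=dshome)
    subprocess.run(uv_command(dshome, "stop"), env=env, check=True)
    # Give the server time to stop before restarting it.
    time.sleep(wait_seconds)
    subprocess.run(uv_command(dshome, "start"), env=env, check=True)
```

With check=True, a nonzero exit status from either command raises an exception instead of silently continuing to the next step.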
Procedure
1. Select the Manager, Designer, or Director client from the Start menu.
2. In the Host System field of the Attach to Project window, type the identity of the server computer in the form hostname:portnumber where portnumber is the port number that the server uses. For example, type R101:31538 in the Host System field.
3. Type your user name and password.
4. Specify the name of the project that you want to attach to.
Procedure
1. Select the Administrator client from the Start menu.
2. In the Host System field of the Attach to DataStage window, type the identity of the server computer in the form hostname:portnumber where portnumber is the port number that the server uses. For example, type R101:31538 in the Host System field.
3. Type your user name and password.
Procedure
1. Select the Designer client or Director client from the Start menu.
2. In the Domain field of the Attach to Project window, type the name of the domain to which your InfoSphere DataStage server belongs in the form DomainServer:9080. For example, type R101:9080.
3. Type your user name and password.
4. In the Project field, specify the identity of the project that you want to attach to in the form hostname:portnumber/project where portnumber is the port number that the server uses. For example, type R101:31348/datastage.
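When you keep a list of servers and projects in a script or spreadsheet, it can help to build and check the hostname:portnumber/project form programmatically. The following sketch is illustrative only; both function names are hypothetical, and the validation rules (a non-empty host, a numeric port, a non-empty project) are assumptions rather than documented client behavior.

```python
def project_identity(hostname, port, project):
    """Build a project identity in the form hostname:portnumber/project."""
    return f"{hostname}:{port}/{project}"


def parse_project_identity(identity):
    """Split hostname:portnumber/project into its three parts."""
    host_port, _, project = identity.partition("/")
    hostname, _, port = host_port.partition(":")
    if not (hostname and port.isdigit() and project):
        raise ValueError(f"expected hostname:portnumber/project, got {identity!r}")
    return hostname, int(port), project
```

For the example in step 4, parse_project_identity("R101:31348/datastage") returns the host R101, the port 31348, and the project datastage.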
Connecting to IBM InfoSphere DataStage and QualityStage, Version 8.5 from an Administrator client
Use this procedure to connect to a Version 8.5 server from the Administrator client.
Procedure
1. Select the Administrator client from the Start menu.
2. In the Domain field of the Attach to DataStage window, type the name of the domain to which your InfoSphere DataStage server belongs in the form DomainServer:9080. For example, type R101:9080 in the Domain field or type R201:80 if you are using a front-end Web server.
3. Type your user name and password.
4. In the DataStage Server field, specify the identity of the computer where the server that you want to attach to is located in the form hostname:portnumber where portnumber is the port number that the server uses. For example, type R201:9080 or type R301:80 if you are using a front-end Web server.
Procedure
- To connect to a project from the command line, you specify the server name and port number of the required instance with the -server argument in the form -server server:portnumber for local computers.
- For remote computers that run InfoSphere DataStage, Version 8.5 server instances, you must specify the domain and the server name in the form -domain domain:domain_portnumber -server server:portnumber. The default domain port number is 9080.

For example, to run a job on the local computer on the server that uses port 31359, run this command:
   dsjob -server r101:31359 -run myproj myjob
- To run a job on the local computer on the default server, run this command:
   dsjob -run myotherproj myotherjob
- To run a job on a remote computer called R101 on the Version 8.5 server that uses port 31360, you must also specify the host computer, the domain, and supply the login information. For example, enter the following command:
   dsjob -domain mydomain:9080 -server r101:31360 -user billg -password paddock -run myproj myjob
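The three command forms above differ only in which optional arguments are present, so they can be assembled by one helper. The following sketch is illustrative: the function name dsjob_run_command is hypothetical, and it uses only the -domain, -server, -user, -password, and -run arguments that appear in the examples above.

```python
def dsjob_run_command(project, job, server=None, domain=None, user=None, password=None):
    """Build a dsjob -run argument list.

    Optional arguments are added only when a value is supplied, in the same
    order as in the documented examples.
    """
    args = ["dsjob"]
    if domain:
        args += ["-domain", domain]
    if server:
        args += ["-server", server]
    if user:
        args += ["-user", user]
    if password:
        args += ["-password", password]
    args += ["-run", project, job]
    return args
```

Joining the result with spaces reproduces the example command lines. Passing the list directly to a process launcher avoids shell quoting problems with passwords that contain special characters.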
Note that, in a Japanese environment, the migrated job uses Data Set stages instead of Sequential File stages. In this case, you must create an additional job that reads the source data from the sequential file and writes it to a data set. The data set can then be read by the migrated job. If your original job wrote results to a sequential file, then you must create another job to write the results from the data set produced by the migrated job to a sequential file.

Do not use the legacy operational mode option if you are migrating a job that contains the following QualityStage 7 stages because their functionality is not supported by the QualityStage Legacy stage:
- Postal stages such as CASS and SERP
- Program stage
- Multinational Standardize stage
- WAVES stage
- Format Convert stage
As with rule sets, match specifications are renamed when the information is imported. The match specification name has the following form:
QS-7-RefMatch-or-UndupMatch-Stage-Name_QS-7-Project-Name
Procedure
1. Choose one of the following options, depending on operating system type and whether you want to install the IBM InfoSphere DataStage and QualityStage server on the same computer as the QualityStage 7 server.

   Option: Installing the InfoSphere DataStage and QualityStage server on the same computer as the QualityStage 7 server
   Description: Install IBM Information Server, installing the engine tier on the computer where the QualityStage 7 server is installed. The migration utility is installed with the engine tier.

   Option: Installing the InfoSphere DataStage and QualityStage server on a different computer from the QualityStage 7 server (Windows or UNIX or Linux)
   Description: Install IBM Information Server on the target computer. The migration utility is installed with the engine tier. Before you run the migration utility on the target computer, make the QualityStage 7 project metadata available to the target computer. Make the metadata available by copying the entire QualityStage 7 project directory and all its contents to the target computer, or by making the directory available via the network (for example, by mapping a drive on Windows).
2. If you are migrating from a Unicode-enabled version of QualityStage Version 7, make the following changes to the qsmig.env file (located in the directory where the migration tool is installed). Make the changes before you use the migration utility.
   - Change the line FLDEXTPR=0 to FLDEXTPR=1
   - For the Japanese language, add the line QSLCHARSET="CP932"
   - For the Chinese language, add the line QSLCHARSET="CP936"
3. On Windows, if the QualityStage Version 7 project directories are on a different drive from the migration utility, add the following line to the qsmig.env file:
   - APPCLIB=${PROJ}/Controls
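The qsmig.env changes in steps 2 and 3 are simple line edits, so they can be applied with a small text filter. The sketch below is an illustration under assumptions: the function name update_qsmig_env is hypothetical, and it assumes that qsmig.env holds one NAME=value setting per line. Keep a copy of the original file before writing the result back.

```python
def update_qsmig_env(text, unicode_enabled=False, charset=None, add_appclib=False):
    """Return the qsmig.env text with the documented changes applied.

    unicode_enabled: change FLDEXTPR=0 to FLDEXTPR=1.
    charset: 'CP932' for Japanese or 'CP936' for Chinese; adds a QSLCHARSET line.
    add_appclib: add the APPCLIB line for the Windows different-drive case.
    """
    lines = text.splitlines()
    if unicode_enabled:
        lines = ["FLDEXTPR=1" if line.strip() == "FLDEXTPR=0" else line for line in lines]
    if charset and not any(line.startswith("QSLCHARSET=") for line in lines):
        lines.append(f'QSLCHARSET="{charset}"')
    if add_appclib and not any(line.startswith("APPCLIB=") for line in lines):
        lines.append("APPCLIB=${PROJ}/Controls")
    return "\n".join(lines) + "\n"
```

The checks for existing QSLCHARSET and APPCLIB lines make the function safe to run more than once on the same file.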
What to do next
After you install the QualityStage migration utility, you can migrate QualityStage Version 7 jobs to InfoSphere DataStage and QualityStage, Version 8.5.
Procedure
1. Ensure that the QualityStage project metadata is accessible from the computer on which the migration utility is installed.
2. From the migration utility directory, run the script to start the utility.
   - On UNIX or Linux computers, enter ./qsmigrate.sh.
   - On Windows, double-click the qsmigrate.bat file.
3. When prompted, enter the full path name of the QualityStage 7 project directory that contains the jobs that you want to migrate. The utility returns a list of the jobs and rule sets in the project.
4. Enter the number of an option from the list of options.
5. Select one of the following procedures, depending on the migration option that you selected:
Option: 1. Migrate multiple QualityStage 7.x jobs plus dependencies OR 2. Migrate multiple QualityStage 7.x jobs in expanded form plus dependencies
Steps for options:
1. Enter a name for the output file that the migration utility produces. All the objects that you migrate are written to this file. The utility reports the path name of this file each time it informs you of the success or failure of an object migration.
2. When prompted with a job name, enter Y to migrate that job or N to skip that job. Repeat this step until you have migrated or skipped all the jobs.
3. When prompted with the name of a rule set or a match specification, enter Y to migrate that object or N to skip that object. Repeat this step until you have migrated or skipped all the objects.

Option: 3. Migrate individual QualityStage 7.x job OR 4. Migrate individual QualityStage 7.x job in expanded form
Steps for options: When prompted, enter the name of the job that you want to migrate. The output file name is derived from the job name.

Option: 5. Migrate individual QualityStage 7.x job plus dependencies OR 6. Migrate individual QualityStage 7.x job in expanded form plus dependencies
Steps for options:
1. When prompted, enter the name of the job that you want to migrate. The output file name is derived from the job name.
2. When prompted with the name of a rule set or a match specification, enter Y to migrate that object or N to skip that object. Repeat this step until you have migrated or skipped all the objects.

Steps for options: When prompted, enter the name of the rule set or match specification that you want to migrate. The output file name is derived from this name.
What to do next
The QualityStage migration utility places the .dsx files that it creates in the Temp directory under the project directory. After you complete the migration of all of your jobs and objects, move the .dsx files to the computer where the Version 8.5 IBM InfoSphere DataStage and QualityStage client is installed.
Importing the migrated files into the Version 8.5 Designer client
After you complete the file migration, you must import the files into the metadata repository.
Procedure
1. Move the .dsx file or files that you created when you ran the migration script to the computer where the InfoSphere DataStage and QualityStage clients are installed.
2. Open the Designer client and attach to the project where you want to save the InfoSphere DataStage and QualityStage jobs.
3. Select Import > DataStage Components.
4. Specify the name of the migration file and click OK. The migrated jobs, rule sets, and match specifications are saved in the following folders:
   - Project_name > Jobs folder
   - Project_name > Standardization Rules > Imported Rules > Rule Sets folder
   - Project_name > Match Specifications folder
Procedure
1. In the Designer client, find the rule set within the repository tree Project > Standardization Rules > Imported Rules > Rule Sets folder.
2. Select the rule set.
3. Right-click and select Provision All from the menu.
Results
You can compile and run any job that uses the rule set, except for jobs that were migrated in expanded form. If you used the expanded form, read the instructions for preparing migrated jobs in the expanded format.
Procedure
1. In the Designer client, find the match specification within the repository tree Project > Match Specifications folder.
2. Select the match specification and double-click to open the Match Designer.
3. Select Save > All Passes.
4. Select Save > Specification.
5. Click OK to close the Match Designer.
6. From the repository, select the match specification.
7. Right-click the match specification and select Provision All from the menu.
Results
You can now use the match specification in a Match job.
Procedure
1. Double-click the job in the Designer client repository tree to open it on the Designer canvas. When you run the migrated job, the results will vary depending on how you ran the job in previous versions of QualityStage. If you used any mode other than the parallel extender mode, the results might be significantly different from previous runs.
2. If you did not previously run the job in parallel extender mode, insert a sort operation into the job design:
   a. Double-click the target Sequential File stage to open the Input Partitioning page.
   b. Select Sort Merge from the Collector type list.
   c. Under the Sorting section, click Perform sort.
3. Click OK to close the window.
4. Select File > Compile to compile the job.
Procedure
1. Double-click the job in the Designer client repository tree view to open it. The job contains both QualityStage Legacy stages and Data Quality stages.
2. If you have Standardize, Survive, MNS, or WAVES stages, double-click each stage to open it and then click OK.
3. Review any migration warnings that are displayed at the bottom of the job and resolve these issues.
4. Save the job.
5. Select File > Compile. When you run the migrated job, the results will vary depending on how you ran the job in previous versions of QualityStage. If you used any mode other than the parallel extender mode, the results might be significantly different from previous runs.
6. If you did not previously run the job in parallel extender mode, insert a sort operation into the job design:
   a. Double-click the target Sequential File stage to open the Input Partitioning page.
   b. Select Sort Merge from the Collector type list.
   c. Under the Sorting section, click Perform sort.
7. Optional: Replace QualityStage Legacy stages with the equivalent data quality or processing stage.
   a. Double-click the QualityStage Legacy stage to open the Properties window.
   b. Find the stage that offers functionality that is equivalent to the Legacy stage functionality from the Data Quality section of the palette.
   c. Substitute the QualityStage Legacy stage with the equivalent data quality stage or stages. To optimize your job, it is more efficient to replace the QualityStage Legacy stages.
   d. Configure the new stage or stages.
   e. Compile the job.
Preparing migrated jobs to use updated Address Verification Interface and SERP libraries
IBM InfoSphere QualityStage, Version 8.5 includes updated libraries for the Address Verification Interface and SERP modules. You must update the migrated jobs to use these updated libraries before they can be run.
Procedure
1. Migrate the job by running the migration utility in expanded form.
2. Replace the legacy stages with appropriate stages according to the table of replacement stages.
3. If your migrated job contains more than one input stage and more than one output stage, reconfigure your job to reduce the number of inputs and outputs. A job can have only one ISD Input stage and only one ISD Output stage. Alternatively, you can reconfigure a job with more than one input to align with the behavior of the real-time stages. For more information about creating real-time jobs with two input sources in InfoSphere DataStage and QualityStage, see IBM InfoSphere Information Services Director User Guide.
4. Update your jobs to add the ISD Output and ISD Input stages to replace the sequential input and output stages that exist in the migrated jobs.
5. Use InfoSphere Information Services Director to connect to InfoSphere QualityStage. For more information about connecting to InfoSphere DataStage and QualityStage, see IBM InfoSphere Information Services Director User Guide.
6. Develop an application, service, and operation by using InfoSphere Information Services Director. Your migrated real-time job is the information provider for the operation of the service. For more information about developing applications, services, and operations, see IBM InfoSphere Information Services Director User Guide.
7. Deploy the application as a service. For more information about deploying applications, see IBM InfoSphere Information Services Director User Guide.
Table 5. Replacement InfoSphere DataStage and QualityStage stages for migrated QualityStage stages

QualityStage 7 stage: Build
Purpose: Rebuilds a single record from multiple records that are created with a Parse stage.
Replacement stage: No direct replacement. Build was often used with Parse to analyze multi-domain data fields. Use Standardize to accomplish the same function in one step.

QualityStage 7 stage: Collapse
Purpose: Generates a list of each unique value in single-domain data fields.
Replacement stage: Sort stage

QualityStage 7 stage: Collapse
Purpose: Generates frequency counts of data values in a field or a group of fields.
Replacement stage: Aggregate stage

QualityStage 7 stage: Format Convert
Purpose: Reformats files from delimited to fixed-length and vice versa.
Replacement stage: Sequential File stage

QualityStage 7 stage: Format Convert
Purpose: Provides I/O to an ODBC database.
Replacement stage: ODBC stage or database specific stage

QualityStage 7 stage: Investigate
Purpose: Analysis of data quality.
Replacement stage: Investigate stage and the Reporting tab for the WebConsole for IBM Information Server.

QualityStage 7 stage: Match
Purpose: Identifying data duplicates in a single file by using fuzzy match logic.
Replacement stage: Unduplicate Match stage in conjunction with the Match Frequency stage.

QualityStage 7 stage: Match
Purpose: Pairing records from one file with those in another by using fuzzy match logic.
Replacement stage: Reference Match stage in conjunction with the Match Frequency stage.

QualityStage 7 stage: Multinational Standardize
Purpose: Standardize multinational address data.
Replacement stage: MNS stage

QualityStage 7 stage: Parse
Purpose: Tokenizes a text field by resolving free-form text fields into fixed-format records that contain individual data elements.
Replacement stage: No direct replacement. Parse was often used with Build to analyze multi-domain data fields. Use the Standardize stage to accomplish the same function in one step.

QualityStage 7 stage: Program
Purpose: Invokes a customer-written program.
Replacement stage: Depends on the functionality of the customer-written program. Possibilities include adding a Parallel Build, Custom, or Wrapped stage type.
33
Table 5. Replacement InfoSphere DataStage and QualityStage stages for migrated QualityStage stages. (continued) QualityStage 7 Select Purpose Conditionally routes records that are based on values in selected fields. Sorts a list. Breaks down multi-domain data columns into a set of standardized single-domain columns. Produces the best results record from a group of related records. Rearranges and reformats columns in a record. Acts as a gatekeeper for files in non-standard formats (variable length records, non-standard code page, binary or packed data). Produces multiple output records from a single input record. Adds record keys that consists of sequence number plus an optional fixed "file identifier." Join records from two files based on a key. Pairing records from one file with those in another by using fuzzy match logic. Merges data from multiple records into one. Manipulate and transform data record. Standardize and validate multinational address data. Replacement stage Switch and Filter stages
Sort Standardize
Survive
Survive stage
Transfer Transfer
No separate stage is required to do this in QualityStage 8. Sequential File or Complex Flat File stage
Transfer
Splitting records can be achieved by Copy stage followed by Funnel stage Surrogate Key Generator stage
Transfer
Unijoin Unijoin
Join stage or Lookup stage Reference Match stage in conjunction with Match Frequency stage. Join stage and Merge stage Transformer stage WAVES stage
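When you review many migrated jobs, it can help to keep the replacement table in a form that a script can query. The following Python sketch copies a few rows of Table 5 into a dictionary. The helper name and data structure are hypothetical and are not part of the migration utility. A legacy stage that served several purposes maps to several candidates, so the result is a starting point for manual review and not an automatic rewrite.

```python
# Hypothetical lookup built from selected rows of Table 5.
# Keys are QualityStage 7 stage names; values are candidate replacements.
REPLACEMENTS = {
    "Collapse": ["Sort stage", "Aggregate stage"],
    "Format Convert": [
        "Sequential File stage",
        "ODBC stage or database-specific stage",
    ],
    "Match": [
        "Unduplicate Match stage with the Match Frequency stage",
        "Reference Match stage with the Match Frequency stage",
    ],
    "Select": ["Switch and Filter stages"],
    "Survive": ["Survive stage"],
}


def replacement_candidates(legacy_stages):
    """Map each legacy stage name to its candidate replacements.

    Stages that are not in the lookup map to an empty list so that
    they stand out for manual review.
    """
    return {stage: REPLACEMENTS.get(stage, []) for stage in legacy_stages}
```

For example, `replacement_candidates(["Select", "Collapse"])` returns one candidate for Select and two for Collapse; which Collapse replacement applies depends on the purpose that the stage served in the original job.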
Chapter 5. Migration from IBM WebSphere RTI 7.5, 7.5.1, and 7.5.2 installations
Use the WebSphere RTI Export Wizard to migrate from WebSphere RTI 7.5, 7.5.1, and 7.5.2 installations to an IBM InfoSphere Information Services Director, Version 8.5 installation. After migration, you can use one of the following methods to administer and deploy InfoSphere Information Services Director:
- InfoSphere Information Services Director command line interface
- IBM InfoSphere Information Server console
Procedure
1. Halt WebSphere RTI.
2. Use the WebSphere RTI Export Wizard on the source computer to create an RTIX file. This RTIX file contains descriptions of operations and services.
3. Move the RTIX file to the Version 8.5 computer.
4. Use the IBM InfoSphere Information Server console Import function to import the RTIX file. This imported file is the equivalent of the output of the console design function.
5. Associate the imported service descriptions with an application object before you deploy the services. The import function runs at the application level, which creates this association.
Results
The imported service description is the equivalent of a service that is designed in InfoSphere Information Server. You can deploy the service description like any natively designed information service.
Procedure
1. Start the installation program in graphical mode as described in "Starting the installation program in graphical mode" in the Installation Guide.
2. Follow the prompts in the wizard as they appear.
   When the installation program detects that the target directory contains a client installation, confirm that you want to upgrade the installation. The current client installation is uninstalled.
   Note: To retain your existing client versions, select a different installation directory for Version 8.5 so that your existing versions remain intact.
   When the installation program has collected your settings, it saves a response file for you. This file is a text file that contains the settings that you made in the previous pages. When you run the installation program again, you can load the settings into the program instead of manually selecting them again.
   Note: Passwords are not saved in the response file. You must edit the response file by using a text editor to add passwords.
   Based on your settings, the installation program runs a prerequisites check. During this check, it analyzes the resources and file system of the computer to determine whether the new client installation is likely to succeed. The wizard page displays each check. Each check that fails is marked FAILED.
3. If you receive a FAILED message, double-click the message to learn more about resolving it. Try to resolve the issue without exiting the installation program. If you believe that you solved the problem, click Check Again in the Prerequisites Check page. If it is necessary to exit the installation program, click Cancel and close the browser window. Resolve the issue and then restart the installation program.
4. The installation program summarizes your settings and then begins the installation.
5. Monitor the installation as described in "Installation progress monitoring" in the Installation Guide. Leave the terminal window open until the installation is complete.
   Note: During the installation, the installation program might occasionally request a response from you. Check periodically to make sure that the system is not waiting for you to respond. The installation might fail if it halts for a long time.
6. After the installation is completed, install the correct Microsoft XML Core Services (MSXML) Service Pack.
7. Repeat the upgrade process for each client computer.
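Because passwords are not saved in the response file, a quick check before you reuse the file can catch entries that are still blank. The following Python sketch assumes that the response file uses key=value lines, that comment lines start with #, and that password keys contain the word "password". These are assumptions about the file layout, and the key names in the usage note are invented for illustration.

```python
def find_blank_passwords(response_text):
    """Return the keys that mention 'password' and have an empty value.

    Assumes one key=value pair per line; blank lines and lines that
    start with '#' are ignored.
    """
    blank_keys = []
    for raw_line in response_text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        if "password" in key.lower() and not value.strip():
            blank_keys.append(key.strip())
    return blank_keys
```

Given the hypothetical text `"admin.user=isadmin\nadmin.password=\n"`, the function returns `["admin.password"]`, which tells you which entry to fill in with a text editor before you load the file.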
What to do next
If the installation fails:
1. View the installation logs for more information. See "Log files" in the Installation Guide.
2. Resolve any issues that are listed in the installation log files.
3. Remove the installation directory structure and the installation log file.
4. (Windows) Restart the computer.
5. Run the installation program again.
Client tier computers that include DataStage client software include the Version 8.5 Multi-client Manager. After Version 8.5 is installed, Version 8.5 is the active version. To use an earlier version of the client software, use the Version 8.5 Multi-client Manager.
Part 4. Appendixes
Product accessibility
You can get information about the accessibility status of IBM products. The IBM InfoSphere Information Server product modules and user interfaces are not fully accessible. The installation program installs the following product modules and components:
- IBM InfoSphere Business Glossary
- IBM InfoSphere Business Glossary Anywhere
- IBM InfoSphere DataStage
- IBM InfoSphere FastTrack
- IBM InfoSphere Information Analyzer
- IBM InfoSphere Information Services Director
- IBM InfoSphere Metadata Workbench
- IBM InfoSphere QualityStage
For information about the accessibility status of IBM products, see the IBM product accessibility information at http://www.ibm.com/able/product_accessibility/index.html.
Accessible documentation
Accessible documentation for InfoSphere Information Server products is provided in an information center. The information center presents the documentation in XHTML 1.0 format, which is viewable in most Web browsers. XHTML allows you to set display preferences in your browser. It also allows you to use screen readers and other assistive technologies to access the documentation.
Note:
- The maximum number of characters in an argument is 256.
- Enclose argument values that have embedded spaces with either single or double quotation marks.

For example:

    wsetsrc [-S server] [-l label] [-n name] source

The source argument is the only required argument for the wsetsrc command. The brackets around the other arguments indicate that these arguments are optional.

    wlsac [-l | -f format] [key... ] profile

In this example, the -l and -f format arguments are mutually exclusive and optional. The profile argument is required. The key argument is optional. The ellipsis (...) that follows the key argument indicates that you can specify multiple key names.

    wrb -import {rule_pack | rule_set}...

In this example, the rule_pack and rule_set arguments are mutually exclusive, but one of the arguments must be specified. Also, the ellipsis marks (...) indicate that you can specify multiple rule packs or rule sets.
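To make the notation concrete, the following Python sketch models the wlsac syntax line with the standard argparse module. It only illustrates what the brackets, the vertical bar, and the ellipsis mean. It is not how the wlsac command is implemented, and the function name build_parser is an assumption.

```python
import argparse


def build_parser():
    """Model the syntax line: wlsac [-l | -f format] [key... ] profile"""
    parser = argparse.ArgumentParser(prog="wlsac")

    # [-l | -f format]: both are optional, and at most one can be given.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-l", action="store_true")
    group.add_argument("-f", dest="format", metavar="format")

    # [key... ]: zero or more key names.
    parser.add_argument("key", nargs="*")

    # profile: the only required argument.
    parser.add_argument("profile")
    return parser
```

With this parser, `build_parser().parse_args(["-f", "long", "k1", "k2", "myprofile"])` yields two keys and the profile myprofile, and passing both -l and -f makes argparse exit with a usage error, which mirrors the mutually exclusive bar in the syntax line.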
Notices
IBM may not offer the products, services, or features discussed in this document in other countries. Consult your local IBM representative for information on the products and services currently available in your area. Any reference to an IBM product, program, or service is not intended to state or imply that only that IBM product, program, or service may be used. Any functionally equivalent product, program, or service that does not infringe any IBM intellectual property right may be used instead. However, it is the user's responsibility to evaluate and verify the operation of any non-IBM product, program, or service. IBM may have patents or pending patent applications covering subject matter described in this document. The furnishing of this document does not grant you any license to these patents. You can send license inquiries, in writing, to: IBM Director of Licensing IBM Corporation North Castle Drive Armonk, NY 10504-1785 U.S.A. For license inquiries regarding double-byte character set (DBCS) information, contact the IBM Intellectual Property Department in your country or send inquiries, in writing, to: Intellectual Property Licensing Legal and Intellectual Property Law IBM Japan Ltd. 1623-14, Shimotsuruma, Yamato-shi Kanagawa 242-8502 Japan The following paragraph does not apply to the United Kingdom or any other country where such provisions are inconsistent with local law: INTERNATIONAL BUSINESS MACHINES CORPORATION PROVIDES THIS PUBLICATION "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESS OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF NON-INFRINGEMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Some states do not allow disclaimer of express or implied warranties in certain transactions, therefore, this statement may not apply to you. This information could include technical inaccuracies or typographical errors. 
Changes are periodically made to the information herein; these changes will be incorporated in new editions of the publication. IBM may make improvements and/or changes in the product(s) and/or the program(s) described in this publication at any time without notice. Any references in this information to non-IBM Web sites are provided for convenience only and do not in any manner serve as an endorsement of those Web
sites. The materials at those Web sites are not part of the materials for this IBM product and use of those Web sites is at your own risk. IBM may use or distribute any of the information you supply in any way it believes appropriate without incurring any obligation to you. Licensees of this program who wish to have information about it for the purpose of enabling: (i) the exchange of information between independently created programs and other programs (including this one) and (ii) the mutual use of the information which has been exchanged, should contact: IBM Corporation J46A/G4 555 Bailey Avenue San Jose, CA 95141-1003 U.S.A. Such information may be available, subject to appropriate terms and conditions, including in some cases, payment of a fee. The licensed program described in this document and all licensed material available for it are provided by IBM under terms of the IBM Customer Agreement, IBM International Program License Agreement or any equivalent agreement between us. Any performance data contained herein was determined in a controlled environment. Therefore, the results obtained in other operating environments may vary significantly. Some measurements may have been made on development-level systems and there is no guarantee that these measurements will be the same on generally available systems. Furthermore, some measurements may have been estimated through extrapolation. Actual results may vary. Users of this document should verify the applicable data for their specific environment. Information concerning non-IBM products was obtained from the suppliers of those products, their published announcements or other publicly available sources. IBM has not tested those products and cannot confirm the accuracy of performance, compatibility or any other claims related to non-IBM products. Questions on the capabilities of non-IBM products should be addressed to the suppliers of those products. 
All statements regarding IBM's future direction or intent are subject to change or withdrawal without notice, and represent goals and objectives only. This information is for planning purposes only. The information herein is subject to change before the products described become available. This information contains examples of data and reports used in daily business operations. To illustrate them as completely as possible, the examples include the names of individuals, companies, brands, and products. All of these names are fictitious and any similarity to the names and addresses used by an actual business enterprise is entirely coincidental. COPYRIGHT LICENSE: This information contains sample application programs in source language, which illustrate programming techniques on various operating platforms. You may copy, modify, and distribute these sample programs in any form without payment to
IBM, for the purposes of developing, using, marketing or distributing application programs conforming to the application programming interface for the operating platform for which the sample programs are written. These examples have not been thoroughly tested under all conditions. IBM, therefore, cannot guarantee or imply reliability, serviceability, or function of these programs. The sample programs are provided "AS IS", without warranty of any kind. IBM shall not be liable for any damages arising out of your use of the sample programs. Each copy or any portion of these sample programs or any derivative work, must include a copyright notice as follows: (your company name) (year). Portions of this code are derived from IBM Corp. Sample Programs. Copyright IBM Corp. _enter the year or years_. All rights reserved. If you are viewing this information softcopy, the photographs and color illustrations may not appear.
Trademarks
IBM, the IBM logo, and ibm.com are trademarks of International Business Machines Corp., registered in many jurisdictions worldwide. Other product and service names might be trademarks of IBM or other companies. A current list of IBM trademarks is available on the Web at www.ibm.com/legal/copytrade.shtml.

The following terms are trademarks or registered trademarks of other companies:

Adobe is a registered trademark of Adobe Systems Incorporated in the United States, and/or other countries.

IT Infrastructure Library is a registered trademark of the Central Computer and Telecommunications Agency which is now part of the Office of Government Commerce.

Intel, Intel logo, Intel Inside, Intel Inside logo, Intel Centrino, Intel Centrino logo, Celeron, Intel Xeon, Intel SpeedStep, Itanium, and Pentium are trademarks or registered trademarks of Intel Corporation or its subsidiaries in the United States and other countries.

Linux is a registered trademark of Linus Torvalds in the United States, other countries, or both.

Microsoft, Windows, Windows NT, and the Windows logo are trademarks of Microsoft Corporation in the United States, other countries, or both.

ITIL is a registered trademark, and a registered community trademark of the Office of Government Commerce, and is registered in the U.S. Patent and Trademark Office.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Cell Broadband Engine is a trademark of Sony Computer Entertainment, Inc. in the United States, other countries, or both and is used under license therefrom.
Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both. The United States Postal Service owns the following trademarks: CASS, CASS Certified, DPV, LACSLink, ZIP, ZIP + 4, ZIP Code, Post Office, Postal Service, USPS and United States Postal Service. IBM Corporation is a non-exclusive DPV and LACSLink licensee of the United States Postal Service. Other company, product or service names may be trademarks or service marks of others.
Contacting IBM
You can contact IBM for customer support, software services, product information, and general information. You also can provide feedback to IBM about products and documentation. The following table lists resources for customer support, software services, training, and product and solutions information.
Table 6. IBM resources

Resource | Description and location
IBM Support Portal | You can customize support information by choosing the products and the topics that interest you at www.ibm.com/support/entry/portal/Software/Information_Management/InfoSphere_Information_Server
Software services | You can find information about software, IT, and business consulting services, on the solutions site at www.ibm.com/businesssolutions/
My IBM | You can manage links to IBM Web sites and information that meet your specific technical support needs by creating an account on the My IBM site at www.ibm.com/account/
Training and certification | You can learn about technical training and education services designed for individuals, companies, and public organizations to acquire, maintain, and optimize their IT skills at http://www.ibm.com/software/swtraining/
IBM representatives | You can contact an IBM representative to learn about solutions at www.ibm.com/connect/ibm/us/en/
Providing feedback
The following table describes how to provide feedback to IBM about products and product documentation.
Table 7. Providing feedback to IBM

Type of feedback | Action
Product feedback | You can provide general product feedback through the Consumability Survey at www.ibm.com/software/data/info/consumability-survey
Documentation feedback | To comment on the information center, click the Feedback link on the top right side of any topic in the information center. You can also send comments about PDF file books, the information center, or any other documentation in the following ways: online reader comment form at www.ibm.com/software/data/rcf/, or e-mail to [email protected]