What Is the Compiler in Datastage - Compilation Process in Datastage
Compilation is the process of converting the GUI job design into machine code, that is, into a machine-understandable language. During this process the compiler checks all the link requirements and the mandatory stage property values, and reports any logical errors. The compiler then produces OSH (Orchestrate Shell) code.
Advantages of Datamart
A datamart is the access layer of the data warehouse environment; we create a datamart so that users can retrieve the data faster. A datamart is a subset of the data warehouse, which means all the data available in the datamart is also available in the data warehouse. A datamart is created for a specific business purpose (for example, a telecom database or a banking database). There are many reasons to create a datamart, and it brings real advantages. It is easy to access frequently needed data from the database when required by the client. We can give a group of users access to view the datamart when it is required, and of course performance will be good. A datamart is easy to create and maintain, it stays focused on a specific business, and it costs far less to create a datamart than to build a full data warehouse with huge storage.
Using the Director, we can:
- View the logs
- Run batch jobs
- Unlock jobs
- Schedule jobs
- Monitor the jobs
- Handle messages (message handling)

In the Manager, we can:
- Import & export the jobs
- Do node configuration

And by using the Admin, we can:
- Create the projects
- Organize the projects
- Delete the projects
Features of Datastage | Datastage Features
Datastage features are:
1) Any to Any (any source to any target)
2) Platform independent
3) Node configuration
4) Partition parallelism
5) Pipeline parallelism

1) Any to Any: Datastage can extract the data from any source and load the data into any target.
2) Platform Independent: a job developed on one platform can run on any other platform. For example, if we design a job for uniprocessor (single-CPU) processing, it can also run on an SMP machine.
3) Node Configuration: node configuration is a technique for creating logical CPUs; a node is a logical CPU.
4) Partition Parallelism: partition parallelism is a technique of distributing the data across the nodes based on partition techniques (a small sketch of the two families follows this list). The partition techniques are:
a) Key based: 1) Hash 2) Modulus 3) Range 4) DB2
b) Key less: 1) Same 2) Entire 3) Round Robin 4) Random
5) Pipeline Parallelism: pipeline parallelism is the process in which extraction, transformation and loading occur simultaneously.
Re-partitioning: the re-distribution of already distributed data is re-partitioning.
Reverse partitioning: reverse partitioning is called collecting. The collecting methods are Ordered, Round Robin, Sort Merge, and Auto.
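To make the two families of partition techniques concrete, here is a minimal Python sketch (not DataStage code; the four-node count and sample rows are invented for illustration) of how a key-based partitioner such as Hash and a key-less partitioner such as Round Robin assign rows to nodes:

from itertools import cycle

NODES = 4  # number of logical nodes, assumed for illustration

def hash_partition(rows, key):
    """Key based: rows with the same key value always land on the same node."""
    parts = [[] for _ in range(NODES)]
    for row in rows:
        parts[hash(row[key]) % NODES].append(row)
    return parts

def round_robin_partition(rows):
    """Key less: rows are dealt evenly across nodes, ignoring their content."""
    parts = [[] for _ in range(NODES)]
    for node, row in zip(cycle(range(NODES)), rows):
        parts[node].append(row)
    return parts

rows = [{"dept_no": d} for d in (10, 20, 30, 40, 10, 20)]
print(hash_partition(rows, "dept_no"))
print(round_robin_partition(rows))

Note how hash partitioning sends every row with the same dept_no to the same node, which is what makes key-based grouping correct, while round robin simply balances row counts.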
How to Remove Rows with Special Characters and Load the Rest of the Data
Here we are going to see how to remove rows containing special characters and load the rest of the rows into the target. Sometimes some of the rows arrive with special characters mixed into a column. If we want to remove those rows, we can use the Alpha function: it drops the rows whose value is not purely alphabetic and loads the rest of the rows into the target. So take a Sequential File stage to read the data and a Transformer stage to apply the business logic. In the Transformer stage, write the Alpha function in the constraint, drag and drop the columns to the target, then compile and run. (A Python sketch of the same idea appears after the Alpha examples below.)
Alpha Function
Checks if a string is alphabetic. If NLS is enabled, the result of this function depends on the current locale setting of the Ctype convention.
Syntax
Alpha (string)
Examples
These examples show how to check that a string contains only alphabetic characters:
Column1 = "ABcdEF%"                              ;* the "%" character is non-alpha
Column2 = (If Alpha(Column1) Then "A" Else "B")  ;* Column2 set to "B"

Column1 = ""                                     ;* note that the empty string is non-alpha
Column2 = (If Alpha(Column1) Then "A" Else "B")  ;* Column2 set to "B"
DataStage Functions
Following is the list of DataStage String Functions
Function - Usage
AlNum - Can be used to check if the given string has alphanumeric characters
Alpha - TRUE if string is completely alphabetic
CompactWhiteSpace - All consecutive whitespace will be reduced to a single space
Compare - Compares two strings for sorting
CompareNoCase - Compare two strings irrespective of case (case-insensitive)
CompareNum - Compare the first n characters of the two strings
CompareNumNoCase - Compare the first n characters of the two strings irrespective of case (case-insensitive)
Convert - Replace a character in a string with the given character
Count - Count the number of times a given substring occurs in a string
Dcount - Returns the count of delimited fields in a string
DownCase - Change all uppercase letters in a string to lowercase
DQuote - Enclose a string in double quotation marks
Field - Return 1 or more delimited substrings
Index - Find the starting character position of a substring
Left - Finds the leftmost n characters of a string
Len - Length of the string, i.e. the total number of characters in the string
Num - Return 1 if the string can be converted to a number
PadString - Return the string padded with the optional pad character and optional length
Right - Finds the rightmost n characters of a string
Soundex - Returns a string which identifies a set of words that are phonetically similar
Space - Return a string of N space characters
Squote - Encloses a string in single quotation marks
Str - Repeat a string
StripWhiteSpace - Return the string after removing all whitespace
Trim - Remove all leading and trailing spaces and tabs; also reduce internal occurrences of spaces and tabs to one
TrimB - Remove all trailing spaces and tabs
TrimF - Remove all leading spaces and tabs
TrimLeadingTrailing - Returns a string with leading and trailing whitespace removed
UpCase - Change all lowercase letters in a string to uppercase
It adds the full state name to a new column defined on the output link, so that the full state name is added to each row based on the code given. If a code is not found in the lookup table, the record is rejected. The Lookup stage can also be used to validate rows.

The Lookup stage is a processing stage which performs horizontal combining. The Lookup stage supports N inputs (for normal lookup), 2 inputs (for sparse lookup), 1 output, and 1 reject link.

Up to Datastage version 7 we had only 2 types of lookups: a) Normal Lookup and b) Sparse Lookup. In Datastage version 8, enhancements have taken place and two more were added: c) Range Lookup and d) Caseless Lookup.

Normal Lookup: in a normal lookup, all the reference records are copied into memory and the primary records are cross-verified against the reference records.

Sparse Lookup: in a sparse lookup, each primary record is sent to the source database and cross-verified against the reference records there. We use a sparse lookup when the reference data is too large to fit in memory and the number of primary records is small relative to the reference data.

Range Lookup: a range lookup performs range checking on selected columns. For example, if we want to check the range of salary in order to find the grades of employees, we can use a range lookup, as in the sketch below.
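The salary-grade example can be sketched in Python as follows; the salary ranges and grade letters are invented purely for illustration:

# Hypothetical grade ranges: (low, high, grade)
GRADE_RANGES = [(0, 20000, "C"), (20001, 50000, "B"), (50001, 10**9, "A")]

def range_lookup(salary):
    """Return the grade whose [low, high] range contains the salary."""
    for low, high, grade in GRADE_RANGES:
        if low <= salary <= high:
            return grade
    return None  # lookup failure

for sal in (15000, 30000, 75000):
    print(sal, range_lookup(sal))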
In the Lookup stage properties you have a Constraints option. If you click the Constraints button you get the lookup-failure options Continue, Drop, Fail, and Reject. If you select Continue, a left outer join operation is performed; if you select Drop, an inner join operation is performed. (A sketch of these two behaviours follows.)
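Here is a minimal Python sketch of the Continue and Drop options (the employee and department rows are made up; a real Lookup stage does this inside the engine):

employees = [{"e_id": 1, "dept": 10}, {"e_id": 2, "dept": 99}]
departments = {10: "Sales", 20: "HR"}  # reference data held in memory

def lookup(rows, ref, on_failure):
    out = []
    for row in rows:
        match = ref.get(row["dept"])
        if match is not None:
            out.append({**row, "dept_name": match})
        elif on_failure == "continue":   # left outer join behaviour
            out.append({**row, "dept_name": None})
        # on_failure == "drop": inner join behaviour, row discarded
    return out

print(lookup(employees, departments, "continue"))
print(lookup(employees, departments, "drop"))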
Read and load the data in a sequential file. In the Aggregator stage select Group = dno, Aggregator type = count rows, and Count output column = dno_count (user defined). In the Output tab drag and drop the required columns, then click OK. In the Filter stage, set the first where clause to dno_count > 1 with output link = 0, and the second where clause to dno_count <= 1 with output link = 1. Drag and drop the outputs to the two targets, give the target file names, then compile and run the job. You will get the required data in the targets. (A Python sketch of this duplicate/unique split follows.)
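The same duplicate/unique split can be sketched in plain Python; the dno values are invented sample data:

from collections import Counter

rows = [{"dno": 10}, {"dno": 20}, {"dno": 10}, {"dno": 30}]
counts = Counter(r["dno"] for r in rows)             # Aggregator: count rows per dno

dups    = [r for r in rows if counts[r["dno"]] > 1]   # Filter: dno_count > 1
uniques = [r for r in rows if counts[r["dno"]] <= 1]  # Filter: dno_count <= 1
print(dups, uniques)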
Read and load the data into two sequential files. Go to the Funnel stage properties and select Funnel Type = Continuous Funnel (or any other type according to your requirement). Go to Output and drag and drop the columns (remember, the source column structures should be the same), then click OK. Give a file name for the target dataset, then compile and run the job. (A rough analogy of the funnel types follows.)
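Conceptually, a sequence funnel concatenates its inputs while a continuous funnel interleaves records as they arrive; a rough Python analogy with made-up lists:

from itertools import chain, zip_longest

file1 = ["a1", "a2", "a3"]
file2 = ["b1", "b2"]

# Sequence funnel: all of input 1, then all of input 2.
sequence = list(chain(file1, file2))

# Continuous funnel: records taken from each input as they arrive
# (approximated here by strict interleaving).
continuous = [r for pair in zip_longest(file1, file2) for r in pair if r is not None]

print(sequence)
print(continuous)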
Read and Load the data in Source file In Transformer Stage just Drag and Drop the data to the target tables. Write expression in constraints as below dept_no=10 or dept_no= 40 for table 1 dept_no=30 for table 1 dept_no=20 or dept_no= 40 for table 1 Click ok Give file name at the target file and compile and Run the Job to get the Output
Read and load the data in three sequential files. In the first Join stage, go to Properties, select the key column as deptno, and select Join type = Inner. Drag and drop the required columns in the Output and click OK. In the second Join stage, go to Properties, select the key column as loc_id, and again select Join type = Inner. Drag and drop the required columns in the output and click OK. Give a file name to the target file, then compile and run the job. (A rough equivalent of the two chained joins is sketched below.)
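A rough Python equivalent of chaining the two inner joins, first on deptno and then on loc_id (all rows invented for illustration):

emps  = [{"e_id": 1, "deptno": 10}, {"e_id": 2, "deptno": 20}]
depts = [{"deptno": 10, "loc_id": 100}, {"deptno": 20, "loc_id": 200}]
locs  = [{"loc_id": 100, "city": "NY"}, {"loc_id": 200, "city": "LA"}]

def inner_join(left, right, key):
    """Inner join two row lists on the named key column."""
    index = {}
    for r in right:
        index.setdefault(r[key], []).append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]

step1 = inner_join(emps, depts, "deptno")   # first Join stage
step2 = inner_join(step1, locs, "loc_id")   # second Join stage
print(step2)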
Merge stage master input:
cars,ac,tv,music_system
BMW,avlb,avlb,Adv
Benz,avlb,avlb,Adv
Camray,avlb,avlb,basic
Honda,avlb,avlb,medium
Toyota,avlb,avlb,medium

Merge stage update 1:
cars,cooling_glass,CC
BMW,avlb,1050
Benz,avlb,1010
Camray,avlb,900
Honda,avlb,1000
Toyota,avlb,950

Merge stage update 2:
cars,model,colour
BMW,2008,black
Benz,2010,red
Camray,2009,grey
Honda,2008,white
Toyota,2010,skyblue

Take the job design as below.
Read and load the Data into all the input files.
In the Merge stage, take cars as the key column. In the Output tab, drag and drop all the columns to the output files.
Give a file name to the target/output file and, if you want, you can add reject links (n-1 reject links for n input links). Compile and run the job to get the required output. (A sketch of the merge behaviour follows.)
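A small Python sketch of what the Merge stage is doing here: master rows are enriched with the columns of each update on the key cars, and an update row with no master match would go to a reject link (the master data is trimmed, and the extra Audi row is invented just to show the reject path):

master  = {"BMW": {"ac": "avlb", "tv": "avlb"}}
update1 = {"BMW": {"cooling_glass": "avlb", "CC": "1050"}, "Audi": {"CC": "999"}}
update2 = {"BMW": {"model": "2008", "colour": "black"}}

merged, rejects = {}, []
for cars, row in master.items():
    merged[cars] = dict(row)
for upd in (update1, update2):
    for cars, row in upd.items():
        if cars in merged:
            merged[cars].update(row)     # matched: columns added to the output row
        else:
            rejects.append((cars, row))  # unmatched update row -> reject link
print(merged, rejects)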
Read and load the two tables in sequential files. Go to the Lookup stage and drag and drop the primary columns to the output. Join e_state from the primary table to e_state in the reference table, and drag and drop full_state to the output. In the properties select Lookup Failure = Drop, then click OK. Give the target file name, then compile and run the job.
In real time, jobs have to be delivered quickly to clients, so we need each job to finish in very little time and we must try our best to get good performance out of it. Both the Join stage and the Lookup stage do the same thing: they combine the tables we have. So why was the Lookup stage introduced? The Lookup stage has some extra benefits that do not come with the Join stage. The Lookup stage does not require the data to be sorted, whereas sorting is mandatory for the Join stage. In the Lookup stage, columns with different column names can be joined, which is not possible in the Join stage; in the Join stage the key column names must match. The Lookup stage supports reject links, so if our requirement demands reject links we cannot go with the Join stage, because the Join stage does not support reject links. The Lookup stage also has an option to fail the job if the lookup fails, which is useful when the lookup is expected to succeed. The Lookup stage keeps the reference data in memory, which yields better performance for smaller volumes of data; if you have a large amount of data, you need to go with the Join stage. (The sketch below contrasts the two approaches.)
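To see why data volume drives the choice, here is a Python sketch contrasting a lookup-style join, which must hold the whole reference table in memory, with a sort-merge join in the spirit of the Join stage, which only needs sorted inputs (keys are assumed unique on both sides; sample data invented):

left  = [(10, "a"), (20, "b"), (30, "c")]
right = [(10, "X"), (30, "Y")]

# Lookup-style: build an in-memory index of the reference data.
index = dict(right)
lookup_join = [(k, v, index[k]) for k, v in left if k in index]

# Join-stage-style: sort both inputs, then merge with two cursors.
left_s, right_s = sorted(left), sorted(right)
merge_join, i, j = [], 0, 0
while i < len(left_s) and j < len(right_s):
    if left_s[i][0] == right_s[j][0]:
        merge_join.append((left_s[i][0], left_s[i][1], right_s[j][1]))
        i += 1
    elif left_s[i][0] < right_s[j][0]:
        i += 1
    else:
        j += 1

print(lookup_join == merge_join)  # True: same result, different memory profile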