100% found this document useful (1 vote)
70 views

ADFDF Cheat Sheet Sqlplayer v2.1

The document summarizes various data transformation activities that can be performed using Azure Data Factory mapping data flows and compares them to equivalent transformations in SQL Server Integration Services (SSIS) and Transact-SQL. It lists activities for multiple inputs/outputs, joining data, conditional splitting, unioning, lookups, schema modification, selections, aggregations, ranking, pivoting, formatting and parsing, custom functions, filtering, sorting, row modification, and defining sources and sinks. For each activity, it provides a brief description and notes equivalent functionality in SSIS and T-SQL.

Uploaded by

Bala
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
100% found this document useful (1 vote)
70 views

ADFDF Cheat Sheet Sqlplayer v2.1

The document summarizes various data transformation activities that can be performed using Azure Data Factory mapping data flows and compares them to equivalent transformations in SQL Server Integration Services (SSIS) and Transact-SQL. It lists activities for multiple inputs/outputs, joining data, conditional splitting, unioning, lookups, schema modification, selections, aggregations, ranking, pivoting, formatting and parsing, custom functions, filtering, sorting, row modification, and defining sources and sinks. For each activity, it provides a brief description and notes equivalent functionality in SSIS and T-SQL.

Uploaded by

Bala
Copyright
© © All Rights Reserved
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 2

Transformations of Mapping Data Flow

Azure Data Factory #MappingDataFlows Azure Synapse

Activity Description SSIS equivalent SQL Server equivalent

Multiple inputs/outputs

New branch Create a new flow branch SELECT INTO


with the same data Multicast (+icon) SELECT OUTPUT

Join data from two streams INNER | LEFT | RIGHT JOIN,


Join based on a condition Merge join CROSS | FULL OUTER JOIN

Conditional Route data into different SELECT INTO WHERE condition1


streams based on conditions Conditional Split SELECT INTO WHERE condition2
Split CASE ... WHEN

Check the existence of data in Lookup / Merge SELECT * FROM Table


Exists another stream WHERE EXISTS(SELECT ... ) | JOIN
Join | NOT EXISTS

Collect data from multiple SELECT col1a UNION (ALL)


Union streams Union All SELECT col1b

Lookup additional data from LEFT | RIGHT JOIN


Lookup another stream Lookup

Schema modifier

Cast columns to different SELECT


Cast types Derived Column CAST(NumStr as INT) as Number,
CONVERT(DATE, '14/07/2022', 103)

Derived Compute new columns based SELECT Column1 * 1.09 as NewColumn


on the existing once Derived Column
Column

Choose columns to flow to OUTPUT in components, SELECT Column1, Column4


Select the next stream FROM Table
mapping columns

Calculate aggregation on the SELECT Year(DateOfBirth) as Year,


Aggregate stream Aggregate MIN(), MAX(), AVG()
GROUP BY Year(DateOfBirth)

Surrogate Add a surrogate key column Script SELECT ROW_NUMBER()


to output stream from a OVER(ORDER BY n ASC) AS R#, n
Key Component FROM sys.databases
specific value + Incremental Primary Key (with limited capabilities)

Pivots row values into SELECT rowCol, c1, c2 FROM


Pivot columns, groups columns and Pivot ( SELECT sourceCols FROM Table)
PIVOT
aggregates data ( SUM(sumCol) FOR col IN (...) )

Unpivots columns into row SELECT rowCol, col, X FROM (


Unpivot values and ungroups columns Unpivot SELECT rowCol, c1, c2 FROM pvt)
UNPIVOT
(X FOR col FROM (c1, c2)) AS unpvt

Aggregates data based on a [Sort] + SELECT fun() OVER(


Window window and joins with PARTITION BY pc ORDER BY oc) newc,
Custom Script pc, oc, otherCols FROM Table
original data

Generates an ordered ranking [Sort] + SELECT ROW_NUMBER() OVER (


Rank based upon sort conditions PARTITION BY pc ORDER BY oc) newc,
Custom Script pc, oc, otherCols FROM Table
specified by the user

Enables custom function by Script Not available out of the box.


External Call calling out to external REST Some tricks are possible: OLE Automation,
Component CLR function, although not recommended.
endpoints row-by-row

BLOG:
https://SQLPlayer.net/tag/ADFDF @SQLPlayer
Transformations of Mapping Data Flow

Azure Data Factory #MappingDataFlows Azure Synapse

Activity Description SSIS equivalent SQL Server equivalent

Formatters

Takes array values from Script Component or SELECT * FROM OPENJSON(json) WITH (
Flatten hierarchical structures such as id INT 'strict $.id',
3rd Party Component Name NVARCHAR(50) '$.info.name');
JSON (denormalization) (ZappySys SSIS JSON Parser Transform)

Parses columns in document Script Component or SELECT * FROM OPENJSON ...


Parse form data (JSON, XML, SELECT * FROM OPENXML ...
3rd Party Component SELECT value FROM
delimited text) (ZappySys SSIS JSON Parser Transform)
STRING_SPLIT('clothing,road', ',');

Turns complex data types into Script


Stringify strings Component

Row modifier

Filter rows in the stream SELECT * FROM Table


Filter based on a condition Conditional Split WHERE [Column] LIKE '%pattern%'

Order data in the stream SELECT * FROM Table


Sort based on column(s) Sort ORDER BY [Column] ASC

Set action policy on rows Conditional Split MERGE |


Alter Row when database is sink INSERT, UPDATE, DELETE
+ n Destinations IF | WHERE

Enables you to build custom Conditional Split or WHERE, REPLACE, SUBSTRING,


Assert rules for data quality and data IF, RAISERROR, THROW
DQS Cleansing
validation

Source / Destination

Source for your data flow. OLE DB Source SELECT * FROM SourceTable
Source Obligatory first element of and more …
every Data Flow in ADF

Destination for your data flow OLE DB Destination and INSERT INTO TargetTable
Sink stream more…

Version: 2.1 (Aug 2022)

BLOG:
https://SQLPlayer.net/tag/ADFDF @SQLPlayer

You might also like