
Load Data with Azure Data Factory

[AZURE.SELECTOR]

 Data Factory
 PolyBase
 BCP

This tutorial shows you how to create a pipeline in Azure Data Factory to move data from Azure
Storage Blob to SQL Data Warehouse. With the following steps you will:

 Set up sample data in an Azure Storage Blob.


 Connect resources to Azure Data Factory.

 Create a pipeline to move data from Storage Blobs to SQL Data Warehouse.

[AZURE.VIDEO loading-azure-sql-data-warehouse-with-azure-data-factory]

Before you begin


To familiarize yourself with Azure Data Factory, see Introduction to Azure Data Factory.

Create or identify resources


Before starting this tutorial, you need to have the following resources.

 Azure Storage Blob: This tutorial uses Azure Storage Blob as the data source for the Azure
Data Factory pipeline, and so you need to have one available to store the sample data. If you
don't have one already, learn how to Create a storage account.

 SQL Data Warehouse: This tutorial moves the data from Azure Storage Blob to SQL Data
Warehouse, and so you need to have a data warehouse online that is loaded with the
AdventureWorksDW sample data. If you do not already have a data warehouse, learn how to
provision one. If you have a data warehouse but didn't provision it with the sample data, you
can load it manually.

 Azure Data Factory: Azure Data Factory will complete the actual load, and so you need to
have one that you can use to build the data movement pipeline. If you don't have one
already, learn how to create one in Step 1 of Get started with Azure Data Factory (Data
Factory Editor).

 AzCopy: You need AzCopy to copy the sample data from your local client to your Azure
Storage Blob. For install instructions, see the AzCopy documentation.

Step 1: Copy sample data to Azure Storage Blob


Once you have all of the pieces in place, you are ready to copy sample data to your Azure Storage
Blob.

1. Download sample data. This data will add another three years of sales data to your
AdventureWorksDW sample data.

2. Use this AzCopy command to copy the three years of data to your Azure Storage Blob:

AzCopy /Source:<Sample Data Location> /Dest:https://<storage account>.blob.core.windows.net/<container name> /DestKey:<storage key> /Pattern:FactInternetSales.csv
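Before uploading, you may want to sanity-check that the file matches the format the pipeline expects later in this tutorial: comma-delimited, newline-terminated rows, with one header line. The helper below is an illustrative sketch (not part of the tutorial, and the sample rows are made up, since the real FactInternetSales.csv has many more columns):

```python
import csv
import io

def check_csv_format(text, expected_columns):
    """Verify a comma-delimited extract has a header plus data rows
    with a consistent column count."""
    rows = list(csv.reader(io.StringIO(text)))
    header, data = rows[0], rows[1:]
    assert len(header) == expected_columns, "unexpected header width"
    assert all(len(r) == expected_columns for r in data), "ragged row found"
    return len(data)  # number of data rows, header excluded

# A tiny stand-in for FactInternetSales.csv.
sample = "OrderKey,ProductKey,SalesAmount\n1,100,9.99\n2,101,24.50\n"
print(check_csv_format(sample, expected_columns=3))  # 2 data rows
```

The single header row matters: the Copy activity created in Step 3 skips it with `skipHeaderLineCount: 1`.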

Step 2: Connect resources to Azure Data Factory


Now that the data is in place, we can create the Azure Data Factory pipeline to move the data from
Azure Storage Blob into SQL Data Warehouse.

To get started, open the Azure portal and select your data factory from the left-hand menu.

Step 2.1: Create Linked Service


Link your Azure storage account and SQL Data Warehouse to your data factory.

1. First, begin the registration process by clicking the 'Linked Services' section of your data
factory and then clicking 'New data store'. Choose a name to register your Azure Storage account
under, select Azure Storage as the type, and then enter your Account Name and Account Key.

2. To register SQL Data Warehouse, navigate to the 'Author and Deploy' section, select 'New
Data Store', and then 'Azure SQL Data Warehouse'. Copy and paste in this template, and then
fill in your specific information.

{
    "name": "<Linked Service Name>",
    "properties": {
        "description": "",
        "type": "AzureSqlDW",
        "typeProperties": {
            "connectionString": "Data Source=tcp:<server name>.database.windows.net,1433;Initial Catalog=<database name>;Integrated Security=False;User ID=<user>@<server name>;Password=<password>;Connect Timeout=30;Encrypt=True"
        }
    }
}
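To see how the placeholders in the connection string fit together, here is a small sketch that assembles one from its parts. The server, database, and user names below are illustrative, not real resources:

```python
def build_dw_connection_string(server, database, user, password):
    """Assemble an AzureSqlDW connection string in the shape the
    linked-service template above expects."""
    return (
        f"Data Source=tcp:{server}.database.windows.net,1433;"
        f"Initial Catalog={database};Integrated Security=False;"
        f"User ID={user}@{server};Password={password};"
        "Connect Timeout=30;Encrypt=True"
    )

# Hypothetical values for illustration only.
conn = build_dw_connection_string("mydwserver", "AdventureWorksDW",
                                  "loaduser", "P@ssw0rd!")
print(conn)
```

Note that Initial Catalog is the database name, not the server name, and that the user is qualified with `@<server name>`.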

Step 2.2: Define the dataset


After creating the linked services, we will have to define the datasets. Here this means defining the
structure of the data that is being moved from your storage to your data warehouse. You can read
more about creating datasets in the Azure Data Factory documentation.

1. Start this process by navigating to the 'Author and Deploy' section of your data factory.

2. Click 'New dataset' and then 'Azure Blob storage' to link your storage to your data factory.
You can use the below script to define your data in Azure Blob storage:
{
    "name": "<Dataset Name>",
    "properties": {
        "type": "AzureBlob",
        "linkedServiceName": "<linked storage name>",
        "typeProperties": {
            "folderPath": "<container name>",
            "fileName": "FactInternetSales.csv",
            "format": {
                "type": "TextFormat",
                "columnDelimiter": ",",
                "rowDelimiter": "\n"
            }
        },
        "external": true,
        "availability": {
            "frequency": "Hour",
            "interval": 1
        },
        "policy": {
            "externalData": {
                "retryInterval": "00:01:00",
                "retryTimeout": "00:10:00",
                "maximumRetry": 3
            }
        }
    }
}
3. Now we will also define our dataset for SQL Data Warehouse. We start in the same way, by
clicking 'New dataset' and then 'Azure SQL Data Warehouse'.

{
    "name": "DWDataset",
    "properties": {
        "type": "AzureSqlDWTable",
        "linkedServiceName": "AzureSqlDWLinkedService",
        "typeProperties": {
            "tableName": "FactInternetSales"
        },
        "availability": {
            "frequency": "Hour",
            "interval": 1
        }
    }
}
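Both datasets declare "frequency": "Hour" with "interval": 1, which tells Data Factory to produce one data slice per hour. Roughly, the scheduler enumerates hourly windows between the pipeline's start and end dates; the sketch below is a simplified model of that slicing, not the actual scheduler logic:

```python
from datetime import datetime, timedelta

def hourly_slices(start, end):
    """Enumerate the hourly activity windows a dataset with
    frequency=Hour, interval=1 would produce (simplified model)."""
    slices = []
    t = start
    while t < end:
        slices.append((t, t + timedelta(hours=1)))
        t += timedelta(hours=1)
    return slices

windows = hourly_slices(datetime(2016, 1, 1), datetime(2016, 1, 2))
print(len(windows))  # 24 one-hour windows in a single day
```

Each window corresponds to one run of the Copy activity defined in the next step.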

Step 3: Create and run your pipeline


Finally, we will set up and run the pipeline in Azure Data Factory. This is the operation that
completes the actual data movement. You can find a full view of the operations that you can
complete with SQL Data Warehouse and Azure Data Factory here.

In the 'Author and Deploy' section, click 'More Commands' and then 'New Pipeline'. After you
create the pipeline, you can use the below code to transfer the data to your data warehouse:
{
    "name": "<Pipeline Name>",
    "properties": {
        "description": "<Description>",
        "activities": [
            {
                "type": "Copy",
                "typeProperties": {
                    "source": {
                        "type": "BlobSource",
                        "skipHeaderLineCount": 1
                    },
                    "sink": {
                        "type": "SqlDWSink",
                        "writeBatchSize": 0,
                        "writeBatchTimeout": "00:00:10"
                    }
                },
                "inputs": [
                    {
                        "name": "<Storage Dataset>"
                    }
                ],
                "outputs": [
                    {
                        "name": "<Data Warehouse Dataset>"
                    }
                ],
                "policy": {
                    "timeout": "01:00:00",
                    "concurrency": 1
                },
                "scheduler": {
                    "frequency": "Hour",
                    "interval": 1
                },
                "name": "Sample Copy",
                "description": "Copy Activity"
            }
        ],
        "start": "<Date YYYY-MM-DD>",
        "end": "<Date YYYY-MM-DD>",
        "isPaused": false
    }
}
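Before deploying, it can help to check that a pipeline definition has the pieces the Copy activity above relies on: inputs, outputs, a policy, a scheduler, and a start/end window. The check below is a hypothetical pre-deployment lint, not an official Data Factory tool, and the filled-in names are illustrative:

```python
def validate_pipeline(pipeline):
    """Report which fields this tutorial's Copy activity relies on are
    missing from a pipeline definition (simplified, illustrative check)."""
    props = pipeline["properties"]
    problems = []
    for activity in props.get("activities", []):
        if activity.get("type") != "Copy":
            continue
        for field in ("inputs", "outputs", "policy", "scheduler"):
            if field not in activity:
                problems.append(f"{activity.get('name', '?')}: missing {field}")
    if "start" not in props or "end" not in props:
        problems.append("pipeline: missing start/end window")
    return problems

# Hypothetical pipeline filled in from the template above.
pipeline = {
    "name": "CopySalesPipeline",
    "properties": {
        "activities": [{
            "type": "Copy",
            "name": "Sample Copy",
            "inputs": [{"name": "BlobDataset"}],
            "outputs": [{"name": "DWDataset"}],
            "policy": {"timeout": "01:00:00", "concurrency": 1},
            "scheduler": {"frequency": "Hour", "interval": 1},
        }],
        "start": "2016-01-01",
        "end": "2016-01-02",
    },
}
print(validate_pipeline(pipeline))  # [] means nothing is missing
```

An empty result means the definition has every field the check looks for; any strings returned name the gaps to fix before deploying.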

Next steps
To learn more, start by viewing:

 Azure Data Factory learning path.


 Azure SQL Data Warehouse Connector. This is the core reference topic for using Azure Data
Factory with Azure SQL Data Warehouse.

These topics provide detailed information about Azure Data Factory. They discuss Azure SQL
Database or HDInsight, but the information also applies to Azure SQL Data Warehouse.

 Tutorial: Get started with Azure Data Factory. This is the core tutorial for processing data with
Azure Data Factory. In this tutorial, you build your first pipeline, which uses HDInsight to
transform and analyze web logs on a monthly basis. Note that there is no copy activity in this
tutorial.
 Tutorial: Copy data from Azure Storage Blob to Azure SQL Database. In this tutorial, you will
create a pipeline in Azure Data Factory to copy data from Azure Storage Blob to Azure SQL
Database.

 Real-world scenario tutorial. This is an in-depth tutorial for using Azure Data Factory.
