Azure Data Factory Whitepaper: Passing Parameters
This article walks you through how to pass parameters between a pipeline
and an activity, as well as between activities.
Written By-
Reviewed By-
© 2019 Microsoft Corporation. This document is for informational purposes only. MICROSOFT MAKES
NO WARRANTIES, EXPRESS OR IMPLIED, IN THIS SUMMARY. The names of actual companies and
products mentioned herein may be the trademarks of their respective owners.
Azure Data Factory – Passing Parameters
Passing parameters in ADF is quite important, as it provides the flexibility required to create dynamic
pipelines. To reference a parameter, you must provide its fully qualified name, and it is worth noting
that parameter names are case sensitive. A parameter can be a user input, meaning it is passed in from
the pipeline layer, or it can be an input coming from an activity within the pipeline.
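As a quick illustration (a minimal sketch, assuming a pipeline parameter named relativeurl and an activity named Get Metadata1, both of which appear later in this walkthrough), the two kinds of references look like this in a dynamic content expression:

Pipeline parameter (user input):  @pipeline().parameters.relativeurl
Activity output:                  @activity('Get Metadata1').output.lastModified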
Select the “Create pipeline” icon. Rename the pipeline by replacing “pipeline1” with a name of your choice. It
is worth following some naming convention.
Example of naming convention-
*We use numbering to get around limits such as the maximum number of activities in a pipeline.
Once the Parameters tab is clicked, you can hit the “+ New” icon. Type in the name, relativeurl, with the type as String
and the default value as /speakers.
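Behind the designer, this adds a parameters section to the pipeline JSON roughly like the following (a minimal sketch of what ADF generates for a string parameter):

"parameters": {
    "relativeurl": {
        "type": "string",
        "defaultValue": "/speakers"
    }
}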
Either use the search bar to find the Copy Data activity, or expand the “Move & Transform” node to find it.
Drag and drop this activity onto the white canvas on your right.
Once copied to the right pane, your screen will appear like the one below-
Within the copy activity, we need to define the source and sink. In this case, the source will be the REST
API and the sink will be a blob store.
Defining Source
Click the Source tab and click the “+ New” icon. When the list of new datasets appears, search for “REST”
and select it.
Fill in the information as given below and hit the “Test connection” button-
At the source page, set the dataset-level parameter to the pipeline-level parameter. It is very important to
note that you cannot see the pipeline-level parameter at the dataset level; the dataset only knows about
its own parameter, so the pipeline value has to be passed in here.
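As a sketch of how the two layers connect (assuming the REST dataset defines its own parameter, also named relativeurl, and uses it in its relative URL):

Dataset relative URL:                           @dataset().relativeurl
Copy activity source, dataset parameter value:  @pipeline().parameters.relativeurl

The dataset only ever sees @dataset().relativeurl; the pipeline value reaches it because the copy activity passes @pipeline().parameters.relativeurl into that dataset parameter.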
Defining Sink
In the General tab, fill in the details using the naming convention that is used by your team.
Click on the Connection tab and set up the linked service. Use managed identity where possible.
Once the destination has been set up, the connection page will look like the one below-
To save this, we need to click the “Publish all” icon at the top left-
Once the pipeline has been published successfully, you can trigger the pipeline to test it. Select “Trigger
now” to trigger the pipeline.
Click on the Monitor icon on the left and check whether the pipeline has completed successfully.
Passing parameters between activities
Before we even kick off this section, it is important to understand what an activity is. An activity is where a
particular action is performed. The action could be to cleanse data or to update a control table within a SQL
database. An activity sits within a pipeline, where the pipeline is a logical container holding one or more
activities.
We will be creating a pipeline that copies the metadata information of a file in a blob store and then
loads this information into an Azure SQL Database table. All the information will move between
activities via parameters, specifically the output parameters.
The architecture
We will follow a three-step process to show how to pass parameters between activities. The flat file
currently exists on ADLS Gen1 storage. We use an ADF activity to get the metadata about a file stored in
ADLS Gen1. Once this information is available, we use parameters to pass the details to the next activity,
which is a stored procedure within Azure SQL Database.
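In pipeline JSON terms, the flow is simply two chained activities. The sketch below assumes the designer default names Get Metadata1 and Stored procedure1, and omits the typeProperties for brevity:

"activities": [
    { "name": "Get Metadata1", "type": "GetMetadata" },
    {
        "name": "Stored procedure1",
        "type": "SqlServerStoredProcedure",
        "dependsOn": [
            { "activity": "Get Metadata1", "dependencyConditions": [ "Succeeded" ] }
        ]
    }
]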
Go to the data factory, create a new pipeline, and drag in the “Get Metadata” activity.
We plan to find the last modified date of a file in ADLS storage. Set the dataset to be the file stored in
ADLS storage. We plan to capture three arguments: Item Name, Item Type and Last Modified.
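In the Get Metadata activity's JSON, those selections correspond to the field list shown below (SourceFileDataset is a placeholder name for the dataset pointing at the file):

"typeProperties": {
    "dataset": { "referenceName": "SourceFileDataset", "type": "DatasetReference" },
    "fieldList": [ "itemName", "itemType", "lastModified" ]
}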
To pass these arguments to the next activity, we first need to debug the pipeline. You will be able to see
the outcome in the Output tab.
Once the execution completes, you have two options within Actions: one is the input and the other is the
output.
In this case, click on the output symbol and check whether the three arguments we selected are displayed.
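For reference, the output of the Get Metadata activity is a small JSON document along these lines (the values are purely illustrative; the property names itemName, itemType and lastModified are the ones the activity returns):

{
    "itemName": "sample.csv",
    "itemType": "File",
    "lastModified": "2019-03-15T10:30:00Z"
}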
Create a table and a stored procedure within the Azure SQL database that was created. The code is provided
below.
/*===================================================
Create a control table that will store information
=====================================================*/
CREATE TABLE TB_FILE_METADATA
(
    I_ID INT IDENTITY(1,1),
    V_ITEM_NAME VARCHAR(255) NOT NULL,
    V_ITEM_TYPE VARCHAR(255) NOT NULL,
    D_LAST_MODIFIED DATETIME NOT NULL
)
-- =======================================================
-- Create Stored Procedure Template for Azure SQL Database
-- =======================================================
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =========================================================================================
-- Author: John Doe
-- Create Date: 31/05/2019
-- Description: This stored procedure will populate the data in the control table
-- =========================================================================================
CREATE PROCEDURE usp_populate_control_table
(
    -- Add the parameters for the stored procedure here
    @V_Item_Name varchar(255),
    @V_Item_Type varchar(255),
    @D_Last_Modified datetime
)
AS
BEGIN
    SET NOCOUNT ON;

    BEGIN TRY
        -- Insert the metadata passed in from the pipeline into the control table
        INSERT INTO TB_FILE_METADATA (V_ITEM_NAME, V_ITEM_TYPE, D_LAST_MODIFIED)
        VALUES (@V_Item_Name, @V_Item_Type, @D_Last_Modified);
    END TRY
    BEGIN CATCH
        -- Re-throw the error so that the pipeline run is marked as failed
        THROW;
    END CATCH
END
GO
You can either import the parameters or manually enter them. When you use import
parameters, the input parameters for the stored procedure are identified automatically.
Click on the Value text box and press “Add dynamic content”.
Select the activity output, or manually type the value.
The first selection will look something like the one below.
The names are case sensitive and, once completed, the screen will show up as follows-
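For reference, the dynamic content assigned to each stored procedure parameter ends up looking like this (assuming the Get Metadata activity kept its default name, Get Metadata1):

V_Item_Name        @activity('Get Metadata1').output.itemName
V_Item_Type        @activity('Get Metadata1').output.itemType
D_Last_Modified    @activity('Get Metadata1').output.lastModified

Because the names are case sensitive, itemName, itemType and lastModified must be typed exactly as they appear in the activity output.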
You can now debug the pipeline and, if everything looks good, you will see that the table in the Azure SQL
database is populated.
This indicates that the file was last modified in March 2019.