Azure Data Factory
Access Data Factory
Dashboard
See the Dashboard section of this documentation for more information.
- Click on the Dashboard menu from the Azure Portal.
ADF URL
- Navigate to https://adf.azure.com and select the Data Factory instance that was created for you.
Azure Portal
- In the Azure Portal search box, search for Data factories.
- You should then see a list of the Data Factories you have been given permission to access.
Authoring
Click on Author & Monitor.
In Data Factory, you can author and deploy resources.
See Visual authoring in Azure Data Factory for more information.
You can also use the various wizards provided on the Data Factory Overview page.
NOTE: Configuring SSIS Integration is NOT recommended. Contact the support team through the Slack channel if you have questions.
See Azure Documentation Tutorials for more details.
Access the Data Lake from ADF
A Data Lake connection has been pre-configured for your environment.
- Click on Manage.
- Click on Linked services.
- The linked service with the Azure Data Lake Storage Gen2 type is your Data Lake.
Note: You have been granted access to specific containers created in the Data Lake for your environment.
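For reference, an Azure Data Lake Storage Gen2 linked service definition typically looks like the following JSON (viewable from the linked service's code view under Manage > Linked services). This is a minimal sketch: the name and storage account URL are placeholders, not the values in your environment.

```json
{
    "name": "DataLakeLinkedService",
    "properties": {
        "type": "AzureBlobFS",
        "typeProperties": {
            "url": "https://<storage-account>.dfs.core.windows.net"
        }
    }
}
```

Note that AzureBlobFS is the type name Data Factory uses internally for Azure Data Lake Storage Gen2.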
Access Azure SQL Database
Some projects have an Azure SQL Database instance.
- Click on Manage.
- Click on Linked services.
- The linked service(s) with the Azure SQL Database type are your database(s).
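Because the shared runtime cannot reach Azure SQL Database (see the Integration Runtimes section below), a SQL linked service in this environment is typically pinned to the self-hosted runtime through the connectVia property. A hedged sketch, with a placeholder name and connection string:

```json
{
    "name": "AzureSqlLinkedService",
    "properties": {
        "type": "AzureSqlDatabase",
        "typeProperties": {
            "connectionString": "Server=tcp:<server>.database.windows.net,1433;Database=<database>;"
        },
        "connectVia": {
            "referenceName": "selfHostedCovidIaaSVnet",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```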
Save / Publish Your Data Factory Resources
Azure Data Factory can be configured to save your work in either of the following ways:
- To a Git repository
- Directly to the Data Factory service
Git (when supported)
When Git is enabled, you can view your configuration and save your work to a specific branch.
- Click on Manage.
- Click on Git configuration.
- Review the Git configuration that was set up for you.
- When authoring a workflow, it can be saved to your branch. Click + New branch from the branch dropdown to create a new feature branch.
- When you are ready to merge the changes from your feature branch into your collaboration branch (master), click the branch dropdown and select Create pull request. This takes you to the Azure DevOps Git repo, where you can create pull requests, do code reviews, and merge changes into your collaboration branch (master) after the pull request has been approved.
- After you have merged changes to the collaboration branch (master), click Publish to publish your code changes from the master branch to Azure Data Factory. Contact the support team through the Slack channel if you receive an error when trying to publish.
Data Factory Service
When Data Factory is not integrated with source control, your workflows are stored directly in the Data Factory service. You cannot save partial changes; you can only Publish all, which overwrites the current state of the Data Factory with your changes and makes them visible to everyone.
Ingest and Transform Data with ADF
Integration Runtimes
AutoResolveIntegrationRuntime
Do not use this runtime. Use the canadaCentralIR-4nodesDataFlow or selfHostedCovidIaaSVnet runtimes instead.
The auto-resolve runtime is created by default with the Data Factory instance and resolves to the Azure data centre closest to the data, which may violate data residency policies.
canadaCentralIR-4nodesDataFlow
This runtime is shared by all users and runs continuously.
Can Access:
- Internal Data Lake
- External Storage Account
- External Data Sources (Internet)
Cannot Access:
- Azure SQL Database
selfHostedCovidIaaSVnet
Located inside CAE virtual network (VNet).
Can Access:
- Internal Data Lake
- SQL Server
Cannot Access:
- External Storage Account
- External Data Sources (Internet)
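To ensure a workflow does not fall back to AutoResolveIntegrationRuntime, a linked service can name its runtime explicitly through the connectVia property. Below is a sketch for an internet-facing HTTP source pinned to the shared runtime; the linked service name and URL are illustrative placeholders.

```json
{
    "name": "GitHubHttpSource",
    "properties": {
        "type": "HttpServer",
        "typeProperties": {
            "url": "https://raw.githubusercontent.com",
            "authenticationType": "Anonymous"
        },
        "connectVia": {
            "referenceName": "canadaCentralIR-4nodesDataFlow",
            "type": "IntegrationRuntimeReference"
        }
    }
}
```

An external (internet) source must use the shared runtime, since the self-hosted runtime cannot reach the internet.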
Example: How to connect to Johns Hopkins data
- There is an example workflow that shows how to ingest data from GitHub using a Data Factory pipeline.
- Data can be filtered from within Data Factory.
- Alternatively, data can be pulled from GitHub using code in a Databricks notebook.
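The notebook approach can be sketched in a few lines of pandas. The sample below uses inline data mimicking the Johns Hopkins CSSE wide time-series layout so it runs anywhere; in a real Databricks notebook you would read the raw CSV from the GitHub repo instead. The `filter_country` helper and the column names are illustrative assumptions, not part of the pre-built example workflow.

```python
import io

import pandas as pd

# Inline sample in the Johns Hopkins CSSE wide time-series layout.
# In a real notebook you would read the raw CSV from the GitHub repo, e.g.:
# df = pd.read_csv("https://raw.githubusercontent.com/CSSEGISandData/...")
sample_csv = """Province/State,Country/Region,Lat,Long,1/22/20,1/23/20
Ontario,Canada,51.25,-85.32,0,1
Quebec,Canada,52.94,-73.55,0,0
,Italy,41.87,12.57,0,0
"""
df = pd.read_csv(io.StringIO(sample_csv))


def filter_country(frame: pd.DataFrame, country: str) -> pd.DataFrame:
    """Keep one country and unpivot the per-date columns into rows."""
    kept = frame[frame["Country/Region"] == country]
    return kept.melt(
        id_vars=["Province/State", "Country/Region", "Lat", "Long"],
        var_name="date",
        value_name="confirmed",
    )


canada = filter_country(df, "Canada")
print(canada[["Province/State", "date", "confirmed"]])
```

The same unpivot-and-filter step could equally be done with a mapping data flow inside Data Factory, as noted above.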
Microsoft Documentation
- Introduction to Azure Data Factory - Azure Data Factory
- Create an Azure data factory using the Azure Data Factory UI - Azure Data Factory
- Copy data by using the Azure Copy Data tool - Azure Data Factory
- Create a mapping data flow - Azure Data Factory
- Expression functions in the mapping data flow - Azure Data Factory
- Mapping data flow Debug Mode - Azure Data Factory
- Mapping data flow Visual Monitoring - Azure Data Factory
YouTube Videos
- Ingest, prepare & transform using Azure Databricks & Data Factory | Azure Friday
- Azure Friday | Visually build pipelines for Azure Data Factory V2
- How to prepare data using wrangling data flows in Azure Data Factory | Azure Friday
- How to develop and debug with Azure Data Factory | Azure Friday
- Building Data Flows in Azure Data Factory