Prepare to run the Azure Data Factory collector
Set up Azure data assets, authentication, and permissions before running the collector.
Before you begin
Role required: admin
About this task
The collector authenticates with an Azure Service Principal. You must register an application, obtain the Azure IDs, and grant the Service Principal access at the subscription level. You can also prepare datasets for lineage harvesting.
Procedure
Register application and create client secret
Register an Azure application and create a client secret for Service Principal authentication.
Before you begin
Role required: admin
About this task
Register an application in Azure Active Directory (now named Microsoft Entra ID) and create credentials for the collector.
Procedure
- Register a new application.
- Create a client secret.
  - On the application page, select .
  - Select New client secret.
  - Add a description and set the expiration date.
  - Select Add.
  - Copy and save the secret value.
- Obtain the Client ID.
  - Select the Overview tab.
  - Copy the Application (Client) ID from the Essentials section.
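To illustrate what the saved values are for, the following is a minimal sketch of an OAuth 2.0 client credentials token request against the Microsoft identity platform. It is not the collector's own implementation. The function name build_token_request and all ID and secret values are hypothetical placeholders, and the tenant ID is the value obtained in the next task. The sketch only builds the request and does not send it.

```python
import urllib.parse
import urllib.request

# Microsoft identity platform v2.0 token endpoint (client credentials flow).
TOKEN_URL_TEMPLATE = "https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
# Scope for Azure Resource Manager; ".default" requests the permissions
# already granted to the application.
ARM_SCOPE = "https://management.azure.com/.default"


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> urllib.request.Request:
    """Build (but do not send) a client credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
        "scope": ARM_SCOPE,
    }).encode("ascii")
    return urllib.request.Request(
        TOKEN_URL_TEMPLATE.format(tenant_id=tenant_id),
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


# Hypothetical placeholder values; substitute the IDs and secret you saved.
request = build_token_request(
    tenant_id="00000000-0000-0000-0000-000000000000",
    client_id="11111111-1111-1111-1111-111111111111",
    client_secret="example-secret-value",
)
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would return a JSON document with an access_token field when the credentials are valid. Keep the secret out of source control, for example by reading it from an environment variable.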
Obtain subscription and tenant IDs
Obtain Azure subscription and tenant IDs for collector configuration.
Before you begin
Role required: admin
About this task
You will use these IDs when configuring the collector.
Procedure
- Obtain the Tenant ID.
- Obtain the Subscription ID.
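Tenant and subscription IDs are GUIDs, so a quick format check can catch copy-and-paste mistakes before you configure the collector. The following is a minimal sketch. The helper name is_guid is hypothetical, and the check confirms only the format, not that the ID exists in Azure.

```python
import uuid


def is_guid(value: str) -> bool:
    """Return True if value is a GUID in canonical 8-4-4-4-12 form."""
    candidate = value.strip()  # tolerate whitespace picked up when copying
    try:
        parsed = uuid.UUID(candidate)
    except ValueError:
        return False
    # uuid.UUID also accepts undashed and brace-wrapped input, so compare
    # against the canonical dashed form to reject those variants.
    return str(parsed) == candidate.lower()


# Hypothetical example values.
print(is_guid("3f2504e0-4f89-11d3-9a0c-0305e82c3301"))
print(is_guid("not-a-guid"))
```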
Grant Service Principal access to Data Factory
Grant the Reader role to the Service Principal for Azure Data Factory access.
Before you begin
Role required: admin
About this task
You do not need to grant the Service Principal permission on each Data Factory individually. Instead, add the Service Principal with the Reader role to the subscription that contains the Data Factories you want to catalog.
Procedure
- Navigate to the Subscriptions page in the Azure portal.
- Select the subscription containing your Data Factories.
- Select the Access Control (IAM) tab.
- Select .
- Under Job Function Roles, search for and select Reader.
- Select the Members tab.
- Select Select Members.
- Search for and select DataDotWorldADFApplication.
- Select Review + assign.
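One way to confirm that the role assignment took effect is to list the Data Factories in the subscription through the Azure Resource Manager REST API, using a token issued to the Service Principal. The following is a minimal sketch, not part of the collector. The function name build_list_factories_request and the sample values are hypothetical, and the sketch builds the request without sending it.

```python
import urllib.parse
import urllib.request

ARM_BASE_URL = "https://management.azure.com"
# API version of the Microsoft.DataFactory resource provider used in this sketch.
DATA_FACTORY_API_VERSION = "2018-06-01"


def build_list_factories_request(subscription_id: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a request listing Data Factories in a subscription."""
    path = (
        f"/subscriptions/{urllib.parse.quote(subscription_id)}"
        "/providers/Microsoft.DataFactory/factories"
    )
    query = urllib.parse.urlencode({"api-version": DATA_FACTORY_API_VERSION})
    return urllib.request.Request(
        f"{ARM_BASE_URL}{path}?{query}",
        headers={"Authorization": f"Bearer {access_token}"},
        method="GET",
    )


# Hypothetical placeholder values.
list_request = build_list_factories_request(
    subscription_id="22222222-2222-2222-2222-222222222222",
    access_token="example-access-token",
)
print(list_request.full_url)
```

If the Reader role is in place, sending this request returns a JSON document whose value array contains the Data Factories in the subscription. A 403 response usually means the role assignment is missing or has not yet propagated.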
Prepare to harvest lineage
Import dataset schemas to enable column and lineage harvesting.
Before you begin
Role required: admin
About this task
Complete this task if you want to harvest columns and the associated lineage.
Procedure
- Navigate to the dataset you want to harvest columns and lineage from.
- Select the Schema tab.
- Select Import Schema.
- Publish the dataset.
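After Import Schema and publishing, the dataset's JSON definition normally carries a schema array under properties, with one entry per column. The following is a minimal sketch for checking an exported dataset definition. The helper name has_imported_schema and the sample dataset are hypothetical, and it is an assumption that the collector relies on this field to harvest columns.

```python
import json


def has_imported_schema(dataset: dict) -> bool:
    """Return True if the dataset definition has a non-empty properties.schema list."""
    schema = dataset.get("properties", {}).get("schema")
    return isinstance(schema, list) and len(schema) > 0


# Hypothetical dataset definition, shaped like the JSON shown in the
# dataset's code view in Azure Data Factory.
sample_dataset = json.loads("""
{
  "name": "ExampleSqlTable",
  "properties": {
    "type": "AzureSqlTable",
    "schema": [
      {"name": "customer_id", "type": "int"},
      {"name": "customer_name", "type": "nvarchar"}
    ]
  }
}
""")

print(has_imported_schema(sample_dataset))
print(has_imported_schema({"name": "NoSchema", "properties": {"type": "AzureSqlTable"}}))
```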