Azure Data Factory metadata collector

Release version: Australia

Updated April 1, 2026

The Azure Data Factory metadata collector provides read-only access to metadata from an external Azure Data Factory account.

Use this collector to harvest metadata from ADF, including pipelines, datasets, dataflows, linked services, triggers, integration runtimes, and global parameters. The collector also gathers lineage information among ADF datasets and between ADF and external sources such as Snowflake and Databricks.

Metadata cataloged

The Azure Data Factory collector catalogs the following information.

Table 1. Metadata harvested
Object	Information cataloged
Factory	ID, Name, ETag, Location, Create Time, Provisioning State, Version, Public Network Access, Factory Tags, Repository Configuration (Account Name, Collaboration Branch, Repository Name, Disable Publish, Root Folder, Host Name, Client ID, Project Name, Last Commit ID, Tenant ID, Repo Configuration Type)
Pipeline	ID, Name, Description, Etag, Concurrency, Folder, Parameters, Metric Policy Duration, Variables
Pipeline Activity	Name, Description, Type, Inactivity Status, State, User Properties, Activity Policy (Retry, Timeout, Retry Interval In Secs, Secure Input, Secure Output)
Linked Service	ID, Name, Description, Type, Etag, Connection String, Domain, Parameters. Note: The Connection String is not harvested for SFTP linked services.
Dataset	ID, Name, Etag, Type, Database, Schema, Table, Folder, Container, File Name, Parameters
Dataflow	ID, Name, Etag, Type, Description, Folder
Trigger	ID, Name, Etag, Type, State, Description, Frequency, Interval, Start time, End time
Integration Runtime	ID, Etag, Name, Type, Description, State, Compute Properties (Node Size, Number of Nodes, Max Parallel Execution Per Node, Core Count, Compute Type, Clean Up, Number of External Nodes, Number of Pipeline Nodes), SSIS Properties (Catalog Server Endpoint, Catalog Admin Username, Catalog Pricing Tier, License Type, Dual Standby Pair Name, Edition)
Global Parameter	ID, Name, Value, Type
ADF Table	ID, Name
ADF Column	ID, Name, Type, Precision, Scale
Pipeline Activity	Query
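
As an illustration of how the Pipeline and Pipeline Activity rows in Table 1 relate to an ADF pipeline definition, the following is a minimal sketch in Python. It is not the collector's implementation. The sample pipeline and the function names `summarize_pipeline` and `summarize_activity` are hypothetical, and the JSON property names (`concurrency`, `policy`, `retryIntervalInSeconds`, and so on) are assumed to follow the pipeline JSON that ADF exports.

```python
from typing import Any, Dict

# Hypothetical sample shaped like an exported ADF pipeline definition.
SAMPLE_PIPELINE: Dict[str, Any] = {
    "name": "CopySalesData",
    "etag": "0a00b1c2-0000-0100-0000-000000000000",
    "properties": {
        "description": "Nightly sales copy",
        "concurrency": 1,
        "folder": {"name": "sales"},
        "parameters": {"runDate": {"type": "String"}},
        "variables": {"rowCount": {"type": "Integer"}},
        "activities": [
            {
                "name": "CopyToWarehouse",
                "type": "Copy",
                "policy": {
                    "timeout": "0.12:00:00",
                    "retry": 2,
                    "retryIntervalInSeconds": 30,
                    "secureInput": False,
                    "secureOutput": False,
                },
            }
        ],
    },
}


def summarize_activity(activity: Dict[str, Any]) -> Dict[str, Any]:
    """Map one activity onto the Pipeline Activity fields listed in Table 1."""
    policy = activity.get("policy", {})
    return {
        "Name": activity.get("name"),
        "Type": activity.get("type"),
        "Description": activity.get("description"),
        "Activity Policy": {
            "Retry": policy.get("retry"),
            "Timeout": policy.get("timeout"),
            "Retry Interval In Secs": policy.get("retryIntervalInSeconds"),
            "Secure Input": policy.get("secureInput"),
            "Secure Output": policy.get("secureOutput"),
        },
    }


def summarize_pipeline(pipeline: Dict[str, Any]) -> Dict[str, Any]:
    """Map a pipeline definition onto the Pipeline fields listed in Table 1."""
    props = pipeline.get("properties", {})
    return {
        "Name": pipeline.get("name"),
        "Etag": pipeline.get("etag"),
        "Description": props.get("description"),
        "Concurrency": props.get("concurrency"),
        "Folder": props.get("folder", {}).get("name"),
        # Only parameter and variable names are kept in this sketch.
        "Parameters": sorted(props.get("parameters", {})),
        "Variables": sorted(props.get("variables", {})),
        "Activities": [summarize_activity(a) for a in props.get("activities", [])],
    }


summary = summarize_pipeline(SAMPLE_PIPELINE)
```

Missing properties come back as `None` rather than raising an error, which suits definitions where optional fields such as `description` or `policy` are omitted.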

Relationships between objects

Catalog pages show relationships between the following data asset types:

Table 2. Relationships between harvested data asset pages
Data asset page	Relationship
Factory	Contains Global Parameter, Contains Pipeline, Contains Dataset, Contains Dataflow, Contains Trigger, Contains Integration Runtime
Pipeline	Has Tag (also known as Annotation), Contains Activity
Activity	Belongs to Pipeline, Contains Activity, Depends on Activity, Uses Linked Service, Uses Integration Runtime, Uses Dataset
Linked Service	Uses Integration Runtime, Has Tag (also known as Annotation), Connects to Database
Dataset	Uses Linked Service, Has Tabular Datasource, Has Tag (also known as Annotation)
Dataflow	Uses Dataflow, Imports Data From Linked Service, Exports Data From Linked Service, Imports Data From Dataset, Exports Data From Dataset, Has Tag (also known as Annotation)
Integration Runtime	Uses Integration Runtime, Uses Linked Service
Trigger	Triggers Pipeline, Has Tag (also known as Annotation)
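
One way to reason about the relationships in Table 2 is as a directed graph of (source, relationship, target) triples. The sketch below, in Python, uses hypothetical asset names and a hypothetical `reachable` helper to find everything a trigger ultimately depends on. It illustrates the relationship model only and is not how the catalog stores or queries these relationships.

```python
from collections import defaultdict, deque

# Each triple is (source asset, relationship label from Table 2, target asset).
# Assets are (type, name) pairs; all names here are hypothetical.
RELATIONSHIPS = [
    (("Trigger", "NightlyTrigger"), "Triggers Pipeline", ("Pipeline", "CopySalesData")),
    (("Pipeline", "CopySalesData"), "Contains Activity", ("Activity", "CopyToWarehouse")),
    (("Activity", "CopyToWarehouse"), "Uses Dataset", ("Dataset", "SalesCsv")),
    (("Activity", "CopyToWarehouse"), "Uses Dataset", ("Dataset", "SalesTable")),
    (("Dataset", "SalesCsv"), "Uses Linked Service", ("Linked Service", "BlobStore")),
    (("Dataset", "SalesTable"), "Uses Linked Service", ("Linked Service", "SnowflakeDW")),
]


def build_index(triples):
    """Group outgoing relationships by their source asset."""
    index = defaultdict(list)
    for source, relation, target in triples:
        index[source].append((relation, target))
    return index


def reachable(index, start):
    """Return every asset reachable from `start` by following relationships (breadth-first)."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for _relation, target in index.get(node, []):
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen


index = build_index(RELATIONSHIPS)
dependencies = reachable(index, ("Trigger", "NightlyTrigger"))
```

With this sample data, `dependencies` contains the pipeline, its activity, both datasets, and both linked services, which is the kind of dependency view the relationship pages are meant to expose.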

Lineage for Azure Data Factory

Collected lineage information:

Table 3. Lineage availability by object
Object	Lineage available
Dataset	The collector identifies the source or sink of the dataset in two cases: (1) when the source or sink is Snowflake, Databricks, PostgreSQL, MySQL, Oracle, Teradata, DB2, or SQL Server; (2) when a Copy Activity run copies data between two datasets.
ADF table	The collector identifies the associated upstream table that the data is sourced from or written to.
ADF column	The collector identifies the associated upstream column that the data is sourced from or written to.
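
The Copy Activity case in Table 3 can be sketched as follows. This Python example derives dataset-to-dataset lineage edges from the `inputs` and `outputs` dataset references of Copy activities in a pipeline definition. It is a simplification under stated assumptions: the collector works from Copy Activity runs, whereas this sketch reads the static definition; the sample pipeline and the `copy_lineage` function are hypothetical; and activities nested inside containers such as ForEach are not traversed.

```python
# Hypothetical pipeline definition; the inputs/outputs shape is assumed to
# follow the dataset references used by ADF Copy activities.
PIPELINE = {
    "name": "CopySalesData",
    "properties": {
        "activities": [
            {
                "name": "CopyToWarehouse",
                "type": "Copy",
                "inputs": [{"referenceName": "SalesCsv", "type": "DatasetReference"}],
                "outputs": [{"referenceName": "SalesTable", "type": "DatasetReference"}],
            },
            {
                # Non-Copy activities contribute no lineage in this sketch.
                "name": "WaitForUpstream",
                "type": "Wait",
            },
        ]
    },
}


def copy_lineage(pipeline):
    """Return (source dataset, activity name, sink dataset) edges for Copy activities."""
    edges = []
    for activity in pipeline.get("properties", {}).get("activities", []):
        if activity.get("type") != "Copy":
            continue
        sources = [ref["referenceName"] for ref in activity.get("inputs", [])]
        sinks = [ref["referenceName"] for ref in activity.get("outputs", [])]
        # Every input dataset is treated as feeding every output dataset.
        for source in sources:
            for sink in sinks:
                edges.append((source, activity["name"], sink))
    return edges


edges = copy_lineage(PIPELINE)
```

Each edge names the activity that moved the data, so the source and sink datasets can be linked back to the pipeline that produced the lineage.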

Supported data sources for cross-system lineage:

Snowflake
Databricks

Authentication types supported

The Azure Data Factory collector authenticates using an Azure service principal.
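
For context, a service principal typically obtains an access token through the OAuth 2.0 client credentials flow of the Microsoft identity platform. The Python sketch below only builds such a token request and does not send it. Whether the collector calls this endpoint directly is an assumption, as this page does not describe its internals. The function name `build_token_request` and the placeholder credential values are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(tenant_id: str, client_id: str, client_secret: str) -> Request:
    """Build (but do not send) a client credentials token request for Azure Resource Manager."""
    url = f"https://login.microsoftonline.com/{tenant_id}/oauth2/v2.0/token"
    body = urlencode(
        {
            "grant_type": "client_credentials",
            "client_id": client_id,
            "client_secret": client_secret,
            # The .default scope requests the permissions already granted
            # to the service principal for Azure Resource Manager.
            "scope": "https://management.azure.com/.default",
        }
    ).encode("ascii")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


# Placeholder values for illustration only; real secrets belong in a secure store.
request = build_token_request("my-tenant-id", "my-client-id", "my-client-secret")
```

The tenant ID, client ID, and client secret are the three values that identify a service principal, so they are the details to have on hand when configuring authentication.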