dbt Cloud metadata collector
Summary of dbt Cloud metadata collector
The dbt Cloud metadata collector enables ServiceNow customers to access read-only metadata from an external dbt Cloud account. It connects to a dbt Cloud project to harvest comprehensive metadata about dbt assets and their column-level lineage relationships derived from database views. This integration helps you catalog and visualize data asset details and relationships to improve data governance and impact analysis.
Key Features
- Metadata Cataloging: The collector gathers detailed metadata for various dbt assets including analyses, models, model columns, projects, snapshots, seeds, sources, tests, test results, semantic models, entities, dimensions, measures, and metrics. Each asset’s attributes such as name, description, SQL code, package name, resource type, and unique identifiers are cataloged.
- Relationship Mapping: The collector maps relationships between data asset types such as models linked to projects and tests, semantic models connected to components and metrics, and lineage between upstream and downstream assets. This provides a clear view of how data elements interact within the dbt ecosystem.
- Lineage Collection: It collects lineage information for dbt models materialized as views, including their referenced database tables and columns, as well as upstream and downstream dbt data assets. Currently, Snowflake is supported for cross-system lineage visualization.
Practical Use and Preparation
- Before running the collector, configure your dbt Cloud environment and create authentication tokens to enable secure access.
- Create and configure a dbt Cloud metadata collector within ServiceNow to import and leverage the harvested metadata.
Benefits for ServiceNow Customers
By integrating dbt Cloud metadata, you gain enhanced visibility into your data transformation workflows, enabling more effective data governance, impact analysis, and troubleshooting. This comprehensive metadata and lineage capture supports better decision-making and operational efficiency in managing your data assets.
The dbt Cloud metadata collector provides read-only access to metadata from an external dbt Cloud account.
The dbt Cloud collector connects to a dbt Cloud project and harvests dbt assets, along with column-level lineage relationships from the database views associated with those assets.
Metadata cataloged
The dbt Cloud collector catalogs the following information.
| Object | Information cataloged |
|---|---|
| Analysis | Name, Description, Path, Root path, Package name, Unique ID, Alias, Meta, Raw SQL, Compiled SQL/Compiled Code, Enabled, Materialized, Resource type |
| Model | Name, Description, Path, Root path, Package name, Unique ID, Alias, Meta, Raw SQL, Compiled SQL/Compiled Code, Enabled, Materialized, Resource type |
| Model column | Column name |
| Project | Name, Project version |
| Snapshot | Name, Description, Path, Root path, Package name, Unique ID, Alias, Meta, Raw SQL/Raw Code, Compiled SQL/Compiled Code, Enabled, Materialized, Resource type |
| Seed | Name, Description, Path, Root path, Package name, Unique ID, Alias, Meta, Raw SQL/Raw Code, Compiled SQL/Compiled Code, Enabled, Materialized, Resource type |
| Source | Name, Description, Path, Root path, Package name, Unique ID, Alias, Meta, Raw SQL/Raw Code, Compiled SQL/Compiled Code, Enabled, Source name, Resource type |
| Test | Name, Description, Path, Root path, Package name, Unique ID, Alias, Meta, Raw SQL/Raw Code, Compiled SQL/Compiled Code, Enabled, Materialized, Resource type |
| Test result | Time the test was executed, Status, Count of failures (if any), Message emitted by the test (if any) |
| Semantic Models | Name, Description, Path, Package name, Unique ID, Enabled, Resource Type, Semantic Model Components, Primary Entity |
| Entities | Title, SQL Expression, Entity Type |
| Dimensions | Title, Dimension Type |
| Measures | Title, Description, Has Measure Aggregation |
| Metrics | Title, Description, Path, Package Name, Unique ID, Metric Type |
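To make the shape of a cataloged record more concrete, the Model row above could be pictured as a small Python dataclass. This is only an illustrative sketch: the class name `CatalogedModel`, the field names, and the example values are hypothetical and chosen to mirror the column labels in the table. They are not the collector's actual schema.

```python
from dataclasses import asdict, dataclass, field
from typing import Any, Dict, Optional


@dataclass
class CatalogedModel:
    """Hypothetical record mirroring the 'Model' row of the table above."""

    name: str
    unique_id: str
    package_name: str
    resource_type: str = "model"
    description: str = ""
    path: str = ""
    root_path: str = ""
    alias: Optional[str] = None
    meta: Dict[str, Any] = field(default_factory=dict)
    raw_sql: str = ""
    compiled_sql: str = ""
    enabled: bool = True
    materialized: str = "view"


# Illustrative values only; the project and model names are made up.
example = CatalogedModel(
    name="orders",
    unique_id="model.jaffle_shop.orders",
    package_name="jaffle_shop",
    raw_sql="select * from {{ ref('stg_orders') }}",
    compiled_sql="select * from analytics.stg_orders",
)

# Flatten to a plain dict, one key per cataloged attribute.
record = asdict(example)
```

The other asset types in the table (Snapshot, Seed, Test, and so on) share most of these attributes, so a similar structure would apply to them with minor changes.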
Relationship between objects
Catalog pages show relationships between the following data asset types:
| Data asset page | Relationship |
|---|---|
| Model | Project containing the dbt model, Tests testing the integrity of the model, dbt data assets (Test, Seed, Model, Snapshot, Source) that are upstream of the model, dbt data assets (Test, Seed, Model, Snapshot, Source) that are downstream of the model |
| Semantic Model | Project containing the semantic model, dbt model related to the semantic model, dbt semantic model components (dimensions, entities, measures), Metric that the semantic model provides context for |
| Model column | The database column in the materialized table or view |
| Project | dbt data assets (Test, Seed, Model, Snapshot, Source) contained within project |
| Snapshot | Project containing the dbt snapshot, dbt data assets (Test, Seed, Model, Source) that are upstream of the snapshot, dbt data assets (Test, Seed, Model, Source) that are downstream of the snapshot |
| Seed | Project containing the dbt seed, dbt data assets (Test, Seed, Model, Snapshot, Source) that are upstream of the seed, dbt data assets (Test, Seed, Model, Snapshot, Source) that are downstream of the seed |
| Source | Project containing the dbt source, dbt data assets (Test, Seed, Model, Snapshot) that are downstream of the source, Database schema that the source represents |
| Test | Project containing the dbt test, dbt model that has its integrity tested by this test |
| Test result | The dbt test that was executed to produce the result |
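The upstream and downstream relationships in this table are two views of the same dependency edges. As a minimal sketch, assuming each asset is identified by a unique ID and lists the assets it depends on, the downstream view can be derived by inverting the upstream one. The `depends_on` mapping, the `invert` helper, and all asset IDs below are hypothetical examples, not output of the collector.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical input: each asset's unique ID mapped to its direct upstream assets.
depends_on: Dict[str, List[str]] = {
    "model.shop.stg_orders": ["source.shop.raw.orders"],
    "model.shop.orders": ["model.shop.stg_orders", "seed.shop.country_codes"],
    "test.shop.not_null_orders_id": ["model.shop.orders"],
}


def invert(edges: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Turn a child -> parents mapping into a parent -> children mapping."""
    downstream: Dict[str, List[str]] = defaultdict(list)
    for child, parents in edges.items():
        for parent in parents:
            downstream[parent].append(child)
    return dict(downstream)


downstream_of = invert(depends_on)

# Direct upstream and downstream assets of one model.
upstream_of_orders = depends_on["model.shop.orders"]
downstream_of_orders = downstream_of["model.shop.orders"]
```

Here the model page for `orders` would list the staging model and the seed as upstream assets and the test as a downstream asset.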
Lineage for dbt
The following lineage information is collected by the dbt Cloud collector.
| Object | Lineage available |
|---|---|
| dbt model materialized as view | Referenced database tables and columns in dbt model materialized as view |
| dbt resource | dbt data assets that are upstream or downstream of the dbt data asset (for example, seeds that are upstream of models, and tests that are downstream of models) |
Snowflake is the currently supported data source for cross-system lineage.
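Lineage of this kind is typically used for impact analysis: given one asset, find everything that could be affected by a change to it. The sketch below shows one way to do that with a breadth-first traversal over downstream edges. The `downstream` mapping, the `impacted_assets` function, and the asset IDs are hypothetical and stand in for whatever lineage the catalog exposes.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical direct-downstream edges keyed by asset unique ID.
downstream: Dict[str, List[str]] = {
    "seed.shop.country_codes": ["model.shop.customers"],
    "model.shop.customers": ["model.shop.orders", "test.shop.unique_customers_id"],
    "model.shop.orders": ["test.shop.not_null_orders_id"],
}


def impacted_assets(start: str, edges: Dict[str, List[str]]) -> Set[str]:
    """Return every asset reachable downstream of `start` (breadth-first)."""
    seen: Set[str] = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for child in edges.get(node, []):
            if child not in seen:  # skip assets already visited
                seen.add(child)
                queue.append(child)
    return seen


impacted = impacted_assets("seed.shop.country_codes", downstream)
```

In this example, a change to the `country_codes` seed reaches both models and both tests, while a change to `orders` reaches only its own test.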