Create a dbt Cloud metadata collector
Create a collector to import metadata from dbt Cloud.
Before you begin
Before you begin, verify the following:
- A MID Server is set up for the collectors. For more information, see MID Server for metadata collectors.
- All prerequisite tasks are completed. For more information, see Prepare to run the dbt Cloud collector.
- Role required: connection-admin
Procedure
- Navigate to All > Workflow Data Fabric > Workflow Data Fabric Home.
- Select the Connect Hub icon in the left sidebar.
- Select Create > Metadata collector.
- From the System list, select dbt Cloud.
- From the Connection type list, select one of the following:
  - Select New connection to configure a new connection.
  - Select Existing connection to reuse an existing connection, and then select that connection from the Connections list.
    The configuration form is filled with details from the existing connection. The name is appended with the word Copy, and sensitive details such as the password aren't copied.
- On the form, fill in the fields.

Table 1. dbt Cloud metadata collector form

| Field | Description |
| --- | --- |
| Connection name | Unique identifier for the connection. This field can't be modified once the connection is established. |
| Short description | Purpose and details of the connection. |

- Enter the dbt Cloud configuration details.
Table 2. Configuration details

| Field | Description |
| --- | --- |
| dbt Cloud API key | A dbt Cloud-issued API key with permissions to access the specified account. |
| dbt Cloud host | The host for your organization's account on dbt Cloud. If unspecified, the host defaults to cloud.getdbt.com. |
| dbt cloud account ID | The dbt Cloud account that owns the project from which to harvest dbt metadata artifacts. |
| dbt Cloud project | The name or numeric identifier of the project from which to harvest dbt metadata artifacts. |
| dbt cloud run ID | The numeric identifier of the run that produced the artifacts to be harvested. If not specified, the most recent successful run that produced artifacts within the project is harvested. |
| dbt Cloud environment | The dbt Cloud environment (ID or name) used to filter the job runs from which to harvest dbt metadata artifacts. |
| dbt Cloud job | The dbt Cloud job (ID or name) used to filter the job runs from which to harvest dbt metadata artifacts. |

- Enter the target database details.
Note: To harvest Snowflake lineage relationships between columns specified through views, you must set Target database to Snowflake overrides.
Table 3. Target database details

| Field | Description |
| --- | --- |
| Target database | Option to override the database connection information configured on the project in dbt Cloud. Options: No Target database overrides (the collector skips connecting to a data warehouse and harvests only dbt assets; no lineage is available for views); Snowflake overrides (select to harvest Snowflake lineage relationships between columns specified through views). |

Authentication (Snowflake overrides)

| Field | Description |
| --- | --- |
| Authentication | Authentication method to use if Snowflake overrides is selected. Options: No Snowflake authentication overrides; Snowflake username and password overrides; Snowflake private key file overrides. |

Note: If you select Snowflake overrides and don't provide any authentication details, the collector obtains connection information (Snowflake account, role, and warehouse) from the identified dbt Cloud run.

Snowflake username and password overrides

| Field | Description |
| --- | --- |
| Database username | Username to use when connecting to the target database. |
| Database password | Password to use when connecting to the target database. |

Snowflake private key file overrides

| Field | Description |
| --- | --- |
| Database username | Username to use when connecting to the target database. |
| Snowflake key file path | Private key file to use for authentication with Snowflake (for example, rsa_key.p8). Use this option to override the dbt profile. |
| Snowflake key file password | Password for the private key file, if the key is encrypted and a password was set. Use this option to override the dbt profile or cloud configuration. |

Other optional settings

| Field | Description |
| --- | --- |
| Snowflake application | Application connection parameter to use when connecting to the target Snowflake database. Use this option to override the dbt profile or cloud configuration. Default: datadotworld |
| Snowflake account | Snowflake account or tenant. |
| Snowflake role | Role to use when connecting to the target Snowflake database. Use this option to override the dbt profile or cloud configuration. This field is case-insensitive. |
| Snowflake warehouse | Warehouse to use when connecting to the target Snowflake database. Use this option to override the dbt profile or cloud configuration. This field is case-insensitive. |

- Enter the advanced options.
Table 4. Advanced options

| Field | Description |
| --- | --- |
| Max retries | The number of times the system retries a failed API call. Default: 5 |
| Retry delay | The number of seconds to wait between retry attempts for a failed API call. Default: 2 seconds |
| API HTTP header | Name-value pairs included as HTTP headers in API calls made by the collector. To specify multiple headers, add one value per line. |
| JDBC driver properties | JDBC driver properties to pass through to the driver connection. To specify multiple properties, add one value per line. If you use NTLM authentication, you must set these two properties: integratedSecurity=true and authenticationScheme=NTLM |

- Select Save.
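
To sanity-check the dbt Cloud configuration details outside the form, it can help to see how they might map onto a dbt Cloud API request. The following is a minimal sketch, not the collector's actual implementation. The endpoint path, the `Token` authorization scheme, the status code `10` for a successful run, and the query parameter names are assumed from the dbt Cloud Administrative API v2 and should be confirmed against the dbt documentation. `build_runs_request` is a hypothetical helper name, and the sketch accepts only numeric IDs, although the form also accepts names.

```python
from urllib.parse import urlencode

# Default host named in the configuration details table.
DEFAULT_HOST = "cloud.getdbt.com"

# Assumption: dbt Cloud reports a successful run with status code 10.
STATUS_SUCCESS = 10


def build_runs_request(api_key, account_id, host=None,
                       project_id=None, environment_id=None, job_id=None):
    """Return (url, headers) for looking up the latest successful run.

    Mirrors the form fields: API key, host, account ID, and the optional
    project, environment, and job filters (numeric IDs only here).
    """
    params = {
        "status": STATUS_SUCCESS,
        "order_by": "-finished_at",  # newest finished run first
        "limit": 1,
    }
    # Add only the filters that were actually supplied.
    if project_id is not None:
        params["project_id"] = project_id
    if environment_id is not None:
        params["environment_id"] = environment_id
    if job_id is not None:
        params["job_definition_id"] = job_id

    base = f"https://{host or DEFAULT_HOST}/api/v2/accounts/{account_id}/runs/"
    headers = {
        "Authorization": f"Token {api_key}",
        "Accept": "application/json",
    }
    return f"{base}?{urlencode(params)}", headers


# Example values are placeholders, not real credentials.
url, headers = build_runs_request("example-api-key", 12345, job_id=678)
print(url)
```

The sketch only builds the request. Sending it with an HTTP client and getting a 200 response would suggest that the API key, host, and account ID are consistent with one another.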
Result
The metadata collector is created and appears on the Connectors page with a Configured status. It is now ready to connect to the source system and harvest metadata.
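
During a run, the Max retries and Retry delay advanced options govern how failed API calls are retried. The following is a minimal sketch of that behavior. It assumes a fixed delay between attempts and that Max retries counts retries after the first attempt, so the defaults allow up to six calls in total. The collector's actual retry logic isn't documented here, and `call_with_retries` and `flaky_call` are hypothetical names.

```python
import time


def call_with_retries(call, max_retries=5, retry_delay=2.0, sleep=time.sleep):
    """Run call(), retrying up to max_retries times after a failure.

    Waits retry_delay seconds between attempts. The sleep function is
    injectable so the behavior can be exercised without real waiting.
    """
    retries_used = 0
    while True:
        try:
            return call()
        except Exception:
            # Out of retries: surface the last error to the caller.
            if retries_used >= max_retries:
                raise
            retries_used += 1
            sleep(retry_delay)


# Usage: a call that fails twice and then succeeds.
attempts = []


def flaky_call():
    attempts.append(1)
    if len(attempts) < 3:
        raise ConnectionError("temporary failure")
    return "ok"


# Pass a no-op sleep so the example finishes immediately.
result = call_with_retries(flaky_call, sleep=lambda seconds: None)
print(result, len(attempts))
```

Raising Max retries or Retry delay makes a run more tolerant of transient API failures, at the cost of a longer wait before a persistent failure is reported.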
What to do next
After creating the collector, you can perform any of the following tasks:
- Run the collector manually to harvest metadata immediately. See Run metadata collectors manually.
- Automate metadata collection by scheduling regular collector runs. See Schedule metadata collector runs.
- Monitor execution status and troubleshoot issues by viewing the runtime logs. See View runtime logs for collector runs.
- Discover and evaluate the harvested data assets in the Data Catalog. See Governing the Data Catalog.