BigQuery metadata collector
The BigQuery metadata collector provides read-only access to metadata from an external BigQuery account.
The collector harvests metadata for BigQuery datasets, projects, tables, and columns in a BigQuery instance and makes it searchable and discoverable in the data catalog. The collector also harvests column-level lineage relationships between tables and views.
Metadata cataloged
The BigQuery collector catalogs the following information.
| Object | Information cataloged |
|---|---|
| Datasets | ID, name, description, labels (key/value pairs), created date, last modified date, default table expiry, default partition expiry, data location |
| Projects | Name |
| Tables | Name, Description, Created date, Last modified date, Default table expiration, Data location, Labels, Type (Standard, External, Snapshot, Model), Partitioned on field, Clustered by columns (standard and snapshot tables), Partition type (range or time), Requires partition filter, Range partition details (start, end, interval), Time partition details (partition type: hour, day, month, or year; expiration) |
| Columns | Name, Description, Data Type, Is Nullable, Column size |
| Views | Name, description, created date, default table expiration, last modified date, data location, default collation, labels, view SQL, clustered by columns (materialized views) |
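For orientation, some of the column-level fields listed above (name, data type, nullability) are also exposed by BigQuery itself through the `INFORMATION_SCHEMA.COLUMNS` view. The sketch below only builds the query text for one dataset. It is an illustration, not a description of how the collector retrieves metadata; the function name and the `my-project` / `sales` identifiers are hypothetical.

```python
def columns_metadata_query(project: str, dataset: str) -> str:
    """Build a query listing column metadata for every table in one dataset."""
    # Backticks quote the fully qualified view name in BigQuery standard SQL.
    return (
        "SELECT table_name, column_name, data_type, is_nullable\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        "ORDER BY table_name, ordinal_position"
    )


if __name__ == "__main__":
    # Hypothetical project and dataset names, used only for illustration.
    print(columns_metadata_query("my-project", "sales"))
```

Running the generated SQL requires read access to the dataset, which is consistent with the read-only access the collector uses.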
Relationships between objects
Catalog pages show relationships between the following data asset types:
| Data asset page | Relationships |
|---|---|
| Datasets | Tables, Views |
| Projects | Datasets |
| Tables | Columns, Labels |
| Columns | Tables, Views |
| Views | Columns |
| Label Value | Tables, Views, Projects, Datasets |
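The relationship table above can be read as a simple mapping from a catalog page to the asset types it links to. The following minimal sketch encodes that mapping, using singular type names for lookup, and answers the reverse question: which pages show a given asset type. The dictionary and function names are hypothetical and are not part of the collector.

```python
# Page type -> related asset types, transcribed from the relationship table.
RELATIONSHIPS = {
    "Dataset": ["Table", "View"],
    "Project": ["Dataset"],
    "Table": ["Column", "Label"],
    "Column": ["Table", "View"],
    "View": ["Column"],
    "Label Value": ["Table", "View", "Project", "Dataset"],
}


def pages_showing(asset_type: str) -> list[str]:
    """Return the page types whose relationships include the given asset type."""
    return [page for page, related in RELATIONSHIPS.items() if asset_type in related]


if __name__ == "__main__":
    print(pages_showing("View"))
```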
Lineage for BigQuery
The following lineage information is collected by the BigQuery collector.
| Object | Lineage available |
|---|---|
| View Column | The collector identifies the associated column in an upstream view or table. |
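To illustrate what column-level lineage for views looks like, the sketch below models each view column as pointing to the upstream columns it is derived from, then walks those links back to the originating columns. The lineage data, column names, and function name are all hypothetical; the collector's internal representation is not described in this document.

```python
from collections import deque

# Hypothetical lineage edges: view column -> upstream columns it derives from.
UPSTREAM = {
    "reporting.v_daily_sales.total": ["reporting.v_orders.amount"],
    "reporting.v_orders.amount": ["sales.orders.amount"],
}


def root_columns(column: str, upstream: dict[str, list[str]]) -> list[str]:
    """Walk lineage upstream and return the columns that have no recorded source."""
    roots = []
    seen = {column}
    queue = deque([column])
    while queue:
        current = queue.popleft()
        parents = upstream.get(current, [])
        # A reached column without upstream edges is treated as an origin.
        if not parents and current != column:
            roots.append(current)
        for parent in parents:
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return roots


if __name__ == "__main__":
    print(root_columns("reporting.v_daily_sales.total", UPSTREAM))
```

The breadth-first walk with a `seen` set keeps the traversal finite even if the lineage data contains a cycle.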