Metrics: defining and joining tables¶
DataHub's metric layer lets you define a single, governed source of truth for every business metric — once, and then reuse it across dashboards, alerts, and downstream consumers. This page covers two surfaces of the Metric builder modal that data stewards reach for most often:
- Joining tables — building a metric whose semantic model spans more than one physical table, joined by a Unity Catalog foreign key.
- System schemas — what to do when the foreign-key picker tells you Unity Catalog system schemas are disabled.
If you are looking for the read-only metric query surface (running an existing metric from a dashboard, applying filters, picking a granularity), that is documented separately and out of scope here.
When to choose this¶
Use the Joined tables flow in the metric builder when:
- Your metric needs to slice a measure by a dimension that lives on a different table — for example
sum(orders.amount) by customers.region. The amount lives on the orders table; the region lives on the customers table. - The two tables are already related by a foreign key declared in Unity Catalog (the
referential_constraintsview insystem.information_schema). DataHub will not invent a join for you — UC is the source of truth. - Both tables live in the same Databricks catalog and schema. Cross-catalog and cross-schema joins are out of scope for v1.
Stick with the Single table tab (the default) when:
- Your metric's measure and grouping columns all live on one physical table.
- You want to ship the metric without involving an admin to declare new UC FKs.
- You are still iterating on the measure or aggregation — picking columns is faster on a single-table surface.
You can always start in the Single table tab, ship a v1 metric, and come back later to add joins. The Joined tables tab does not lock you out of single-table flows — your existing metrics are unchanged.
Setup — what an admin needs to do once¶
The Joined tables tab depends on two things being in place in Databricks. Both are one-time admin steps; once they are set, every metric author benefits.
1. Declare the foreign key in Unity Catalog¶
DataHub reads foreign-key relationships from system.information_schema.referential_constraints and system.information_schema.key_column_usage. If the relationship is not declared in UC, the picker will show No declared foreign keys in Unity Catalog for that table — even if the column names match.
Declare the relationship on the child table (the one carrying the FK column):
sql
ALTER TABLE main.raw.orders
ADD CONSTRAINT orders_customer_id_fk
FOREIGN KEY (customer_id)
REFERENCES main.raw.customers (customer_id);
UC also requires a primary key on the parent table — DataHub picks this up automatically if it exists:
sql
ALTER TABLE main.raw.customers
ADD CONSTRAINT customers_pk
PRIMARY KEY (customer_id);
Once both are declared, the next time a metric author opens the builder against orders, the Joined tables tab will list customer_id → main.raw.customers as an available join target.
2. Enable system schemas (Standard SKU only)¶
Premium SKU workspaces have system.information_schema enabled by default. Standard SKU workspaces require an explicit enablement — see § System schemas below.
What metric authors see¶
Open the metric builder from Metrics → Create metric (top-right of the metrics list). The modal opens on the Single table tab — exactly as it did before. The new behaviour lives behind the second tab:
| Tab | What it does |
|---|---|
| Single table | The existing single-table builder, unchanged. Pick a connection, catalog, schema, table; pick a measure column, time dimension, and primary key; save. This is the default landing surface. |
| Joined tables | Adds a foreign-key picker plus a list of staged joins. Each join is one hop: [base table].[FK column] → [joined table].[PK column]. |
The Joined tables tab¶
The tab's content depends on the state of the base-table pick:
| State | What you see |
|---|---|
| You have not picked a base table yet | "Pick a base table first to discover joinable tables." |
| The base table has no declared UC foreign keys | An info banner: "This table has no declared foreign keys in Unity Catalog. See the docs on joining tables to add a join manually." (links here) |
The workspace has not enabled system.information_schema |
A warning banner: "Foreign-key auto-discovery requires Unity Catalog system.information_schema access. Ask your workspace admin to enable system schemas to use the joined-tables flow." See § System schemas. |
| The base table has at least one FK | A dropdown labelled Join via foreign key, an Add join button, and a list of staged joins (one card per join, with a remove button). |
When at least one join is staged:
- The Measure column and Time column dropdowns gain a header per source table so you can clearly see which column comes from which table.
- The Primary key field continues to refer to the base table's primary key — this is the entity the metric is anchored to.
- The compiled YAML now contains one
semantic_models:entry per source table (the base table plus each joined table). MetricFlow uses the entity declarations to wire the join automatically — no hand-rolled SQL.
Joining tables¶
You will arrive at this anchor from the Joined tables tab's info banner when the base table has no declared UC foreign keys, or from a teammate sharing the link.
The metric builder's Joined tables tab supports one-hop joins only in v1. That means a metric can span:
orders ⨝ customers✅events ⨝ users✅
but not:
orders ⨝ customers ⨝ regions❌ (two hops)
If your metric needs a two-hop chain, the v1 workaround is to materialise a single-hop view in dbt and bind the metric to that. We are tracking multi-hop joins as a v2 polish item.
How dependency tracking handles joined tables¶
When a metric joins multiple tables (for example, joining orders to customers to slice revenue by region), the metric depends on every contract whose table contributes a measure or a dimension. Linking only the headline table's contract is not enough — the catalog will refuse to save the metric until all contributing contracts are linked, surfacing a COLUMN_REF_NOT_IN_LINKED_CONTRACTS error that names the table and column it could not find.
Foreign-key entities used purely to wire up the join (no measures or dimensions pulled from the joined model) do not require their own contract link.
Concretely: if Total revenue lives on orders and you slice by customers.region, you must link both the orders contract AND the customers contract. If you only link orders, the save fails with a clear pointer at the customers.region reference.
Removing a staged join¶
Click the trash icon on a staged join's card. If the join already has measure / time / primary-key columns bound to it, DataHub asks you to confirm before removing — the bound columns will be cleared because they no longer have a home.
System schemas¶
You will arrive at this anchor from the Joined tables tab's warning banner when the workspace's system.information_schema is not enabled.
Unity Catalog ships with a system catalog that exposes metadata views — including referential_constraints (the source of truth for foreign keys) and key_column_usage (the column-pair mapping). Premium SKU workspaces have these enabled by default. Standard SKU workspaces require an account-admin to enable them once.
Without system.information_schema access, the FK picker has no way to discover joinable tables. We deliberately surface this as a clear admin-action banner — silently falling back to manual entry would erode the "UC is the source of truth" invariant the metric layer relies on.
How an admin enables system schemas¶
- Go to the Databricks Account Console.
- Under Catalog Explorer (or Data), open the
systemcatalog. - If the catalog does not exist, click Enable system catalog (Account admin role required).
- Inside
system, ensure theinformation_schemaschema is enabled. Some workspaces also needaccessandbillingfor separate features; onlyinformation_schemais required for DataHub's FK picker.
The official Databricks docs cover this in detail under Enable system schemas. Once enabled, refresh the metric builder; the FK picker should appear within a few seconds.
How it works¶
The metric builder reads UC metadata via DataHub's POST /insightsmodule/query/foreign-keys endpoint, which queries:
system.information_schema.referential_constraints(the FK declarations themselves), andsystem.information_schema.key_column_usage(the column-pair mapping for each FK).
Both queries are parameterised — DataHub does not interpolate raw catalog / schema / table names into SQL. The endpoint is gated on the same insights.manage role as the rest of the catalog discovery surface.
When you save a metric with one or more joined tables, DataHub:
- Generates one
semantic_models:block per source table — the base table plus each joined table. - Adds the FK column to the base table's
entities:block as atype: foreignentry, matching the joined table'stype: primaryentity by name. MetricFlow uses this entity-name match to wire the join at compile time. - Compiles the spec via MetricFlow, persists a
metric_contract_dependenciesrow per(contract, column), and only then accepts the save.
If MetricFlow rejects the spec — for example, because a contributing contract is missing — the save fails with a precise pointer at the unsatisfied reference. You can adjust contracts and re-save without losing your other inputs.
Limitations¶
| Limit | Why | Workaround |
|---|---|---|
One join hop only (A ⨝ B, not A ⨝ B ⨝ C) |
Keeps the FK-discovery query trivial and the UI flat for v1. | Materialise a one-hop view in dbt and bind the metric to that. |
| Both tables must live in the same catalog and schema | v1 scope decision. | Move the joined table or define it as a view inside the base catalog/schema. |
| FKs must be declared in Unity Catalog | UC is the source of truth — no manual FK entry in v1 to avoid typos and drift. | Declare the FK in UC (see § Setup). |
| The headline measure stays on the base table | A joined model carrying measures: requires its own agg_time_dimension, which the v1 emitter does not add. |
Either keep the measure on the base table, or model the joined surface as a separate metric. |
| Embedded dashboards do not honour per-user UC RLS | Embed tokens use a different identity model — see Databricks per-user OAuth. | This is a Databricks identity-model constraint, not a metric-layer one. |
Audit & compliance¶
Every multi-table metric is auditable through the same surfaces as a single-table metric:
| Question a CISO might ask | Where to look |
|---|---|
| "Which contracts does this metric depend on?" | The metric detail page lists every linked contract; backed by metric_contract_dependencies rows in Postgres. |
| "Did the user have permission to query both tables?" | UC enforces this at query time — DataHub mints a per-user OAuth token (when configured) so the actual end-user identity flows to UC. |
| "Was the join definition tampered with after approval?" | Each save creates an immutable metric_versions snapshot containing the full spec_yaml. The dependency rows are version-scoped (metric_version_id, not metric_definition_id). |
| "Why did this metric fail to save?" | The error envelope names the unsatisfied reference (COLUMN_REF_NOT_IN_LINKED_CONTRACTS plus the (table, column) pair). |
Troubleshooting¶
| Symptom | Likely cause | Fix |
|---|---|---|
| Joined tables tab shows "No declared foreign keys in Unity Catalog" | UC FK is not declared, or DataHub has not refreshed its UC cache yet. | Verify with SHOW CONSTRAINTS ON main.raw.orders; in Databricks. If declared, give DataHub ~30s to refresh; otherwise declare the FK as shown in § Setup. |
| Joined tables tab shows "Foreign-key auto-discovery requires Unity Catalog system.information_schema access" | Workspace is on Standard SKU and system.information_schema is disabled. |
See § System schemas. |
Save fails with COLUMN_REF_NOT_IN_LINKED_CONTRACTS naming a column on a joined table |
You linked the headline table's contract but not the joined table's contract. | Open the Contracts picker on the metric builder and link the contract that covers the joined table; re-save. |
Save fails with dbt parse failed mentioning agg_time_dimension |
The joined model received a measure but no time dimension was declared on it. | In v1, keep measures on the base table. If you must move the measure to a joined table, also pick a time column on that table. |
FK picker is empty even though referential_constraints shows the FK in Databricks |
The customer querying user (the per-user OAuth identity) does not have USE CATALOG / USE SCHEMA privileges on system.information_schema. |
Grant USAGE on system.information_schema to the user or the role they assume. |
| The compiled YAML contains the joined table but the metric returns no rows when run | The join key types do not match across the two tables (e.g. INT on one side, STRING on the other). |
Cast the column types to match in upstream dbt models — UC does not cast for you. |
If something is broken that this page does not cover, ping your DataHub contact with the metric ID and the error text — both are visible on the metric detail page.