Metrics: defining and joining tables¶
Datahub's metric layer lets you define a single, governed source of truth for every business metric — once, and then reuse it across dashboards, alerts, and downstream consumers. This page covers four surfaces of the metric builder that data stewards reach for most often:
- The Create / Edit page — full-page builder at
/metrics/newand/metrics/:id/edit(NumberedSection layout: General + HercHint at the top, then Data source → optional Joins → Measure → Preview). - The Suggest-with-HERC entry point — AI batch flow at
/metrics/suggest, reached from the Suggest with HERC secondary header action on/metrics. - Joining tables — building a metric whose semantic model spans more than one physical table, joined by a Unity Catalog foreign key.
- System schemas — what to do when the foreign-key picker tells you Unity Catalog system schemas are disabled.
- Where my metric reads from — how Datahub binds your metric's compiled SQL to your real Unity Catalog tables, and what to do if an older metric refuses to compile.
If you are looking for the read-only metric query surface (running an existing metric from a dashboard, applying filters, picking a granularity), that is documented separately and out of scope here.
When to choose this¶
Use the Joined tables branch in the metric builder when:
- Your metric needs to slice a measure by a dimension that lives on a different table — for example
sum(orders.amount) by customers.region. The amount lives on the orders table; the region lives on the customers table. - The two tables are already related by a foreign key declared in Unity Catalog (the
referential_constraintsview insystem.information_schema). Datahub will not invent a join for you — UC is the source of truth. - Both tables live in the same Databricks catalog and schema. Cross-catalog and cross-schema joins are out of scope for v1.
Stick with the Single table mode (pick the Single table card on the first step) when:
- Your metric's measure and grouping columns all live on one physical table.
- You want to ship the metric without involving an admin to declare new UC FKs.
- You are still iterating on the measure or aggregation — picking columns is faster on a single-table surface.
You can always start in Single table mode, ship a v1 metric, and come back later to recreate or extend with joins. Choosing Joined tables does not change existing saved metrics until you save a new version.
Setup — what an admin needs to do once¶
The Joined tables flow depends on two things being in place in Databricks. Both are one-time admin steps; once they are set, every metric author benefits.
1. Declare the foreign key in Unity Catalog¶
Datahub reads foreign-key relationships from system.information_schema.referential_constraints and system.information_schema.key_column_usage. If the relationship is not declared in UC, the picker will show No declared foreign keys in Unity Catalog for that table — even if the column names match.
Declare the relationship on the child table (the one carrying the FK column):
ALTER TABLE main.raw.orders
ADD CONSTRAINT orders_customer_id_fk
FOREIGN KEY (customer_id)
REFERENCES main.raw.customers (customer_id);
UC also requires a primary key on the parent table — Datahub picks this up automatically if it exists:
Once both are declared, the next time a metric author opens the builder against orders and picks Joined tables, the Joins step will list customer_id → main.raw.customers as an available join target.
2. Enable system schemas (Standard SKU only)¶
Premium SKU workspaces have system.information_schema enabled by default. Standard SKU workspaces require an explicit enablement — see § System schemas below.
What metric authors see¶
The metric builder has two entry points on the metrics list (/metrics):
| Action | Route | When to use |
|---|---|---|
| Create metric (primary header action) | /metrics/new |
You know what you want to measure and which table it lives on. The page renders every section (General, Data source, Measure, Preview) at once — no stepper, no mode picker. |
| Suggest with HERC (secondary header action) | /metrics/suggest |
You want HERC to propose drafts from a warehouse table and then refine the ones you like. Batch flow — one fetch, multiple drafts saved in one go. |
Editing an existing metric (pencil icon on a row, or Edit in builder on the detail page) lands you on the same /metrics/:id/edit page in the same NumberedSection layout.
The Create / Edit page (NumberedSection layout)¶
The page renders top-to-bottom as one continuous form so you can see the road ahead from page load:
| # | Section | What it does |
|---|---|---|
| — | General (top, paired with a HERC hint card) | Name, description, key, governance metadata (tags, domain, department, team), fiscal calendar, cache settings. Always visible. |
| 1 | Data source | Connection → catalog → schema → table picker. A Single ↔ Joined toggle sits inside this section. Switching to Joined reveals the Joins section below; switching back is blocked while joined tables are staged (so you can't silently drop them). |
| 2 | Joins (only when Joined is on) | Foreign-key picker, optional Discover join paths (UC FK graph, up to three hops), a read-only join graph preview, and a list of staged joins. Each join edge is [source table].[FK column] → [joined table].[PK column] (the base table is the first source; later hops use intermediate tables as sourceTable in the saved spec). |
| 2 / 3 | Measure | MetricFlow type (simple / ratio / derived / cumulative / conversion / funnel), measure column, aggregation, time grain, primary key. |
| 3 / 4 | Preview | Run the live spec against your warehouse and inspect rows + comparison results before saving. |
A sticky action bar (Save as draft / Save and promote) slides in when the page header scrolls out of view. Save and promote stays disabled until validation clears; Save as draft only requires a name (so you can park work in progress).
Editing an unparseable metric¶
If the saved YAML was hand-edited and Datahub can't reconstruct it, the Data source / Joins / Measure / Preview sections are hidden and a HERC hint explains why. You can still edit name, description, and governance — the saved spec is preserved verbatim.
The Joins section¶
The Joins section appears only when the Single ↔ Joined toggle is on, and its contents depend on the state of the base-table pick:
| State | What you see |
|---|---|
| You have not picked a base table yet | "Pick a base table first to discover joinable tables." |
| The base table has no declared UC foreign keys | An info banner: "This table has no declared foreign keys in Unity Catalog. See the docs on joining tables to add a join manually." (links here) |
The workspace has not enabled system.information_schema |
A warning banner: "Foreign-key auto-discovery requires Unity Catalog system.information_schema access. Ask your workspace admin to enable system schemas to use the joined-tables flow." See § System schemas. |
| The base table has at least one FK | A dropdown labelled Join via foreign key, Discover join paths (when available), an Add join button, and a list of staged joins (one card per join, with a remove button). |
The Suggest-with-HERC entry point¶
Open the suggest flow from Metrics → Suggest with HERC (secondary header action on /metrics). You land on /metrics/suggest, which mounts a batch wizard with the Mode picker hidden (the route pins it to AI). The flow is:
- Data source — pick a connection, catalog, schema, and table.
- Select proposals — HERC returns a small set of starter metrics (simple / ratio / cumulative / conversion / funnel where applicable). Add an optional business context to bias the proposals. Multi-select the ones you want to keep.
- Governance — set tags / domain / team / fiscal calendar that apply to every draft in the batch.
- Review — see one card per selected proposal; on save, Datahub creates one draft metric per proposal.
Each draft lands on the metrics list as Status = draft. Open it from there and refine in the standard NumberedSection builder before promoting.
Why a separate route? The batch flow has fundamentally different shape from single-metric editing (one fetch produces N metrics, no Save and promote, no Preview). Keeping it on its own route means improvements to the single-metric layout cannot regress the batch flow, and vice-versa.
When at least one join is staged:
- The Measure column and Time column dropdowns gain a header per source table so you can clearly see which column comes from which table.
- The Primary key field continues to refer to the base table's primary key — this is the entity the metric is anchored to.
- The compiled YAML now contains one
semantic_models:entry per source table (the base table plus each joined table). MetricFlow uses the entity declarations to wire the join automatically — no hand-rolled SQL.
Joining tables¶
You will arrive at this anchor from the Joins step info banner when the base table has no declared UC foreign keys, or from a teammate sharing the link.
The metric builder's Joined tables branch can follow up to three foreign-key hops along Unity Catalog–declared FKs (for example orders → customers → regions → countries). Use Discover join paths to list candidate chains and stage one; a read-only graph preview shows the tables and FK direction. Cycles (join steps that would loop back to an already-used table, including a self-loop) are rejected in the UI before save.
Not supported: many-to-many join paths that require an implicit bridge table — model those as an explicit table or view first.
If you need more than three hops, materialise a view (or narrow table) that collapses the chain to three hops or fewer and bind the metric to that.
How contract linkage works for joined tables (April 2026)¶
Direct metric→contract linking was retired in April 2026. A metric reaches a contract by being wrapped in a Data Product (the bridge concept introduced in data-product-bundle-bridge):
- Save the metric (no contract picker on the wizard any more).
- From the metric detail page, click Register as Data Product to create or join a Data Product.
- From the Data Product detail page, link any data contracts whose schemas the underlying tables belong to.
For a joined metric (orders ⨝ customers), bundle both tables' Data Products into a single product (or wrap the metric in a product that lists both tables as assets) and link that product to each of the contracts. There is no longer a save-time COLUMN_REF_NOT_IN_LINKED_CONTRACTS failure — the binding is enforced through Data Product membership at view time, not through column-level dependency rows at compile time.
Removing a staged join¶
Click the trash icon on a staged join's card. If the join already has measure / time / primary-key columns bound to it, Datahub asks you to confirm before removing — the bound columns will be cleared because they no longer have a home.
System schemas¶
You will arrive at this anchor from the Joins step warning banner when the workspace's system.information_schema is not enabled.
Unity Catalog ships with a system catalog that exposes metadata views — including referential_constraints (the source of truth for foreign keys) and key_column_usage (the column-pair mapping). Premium SKU workspaces have these enabled by default. Standard SKU workspaces require an account-admin to enable them once.
Without system.information_schema access, the FK picker has no way to discover joinable tables. We deliberately surface this as a clear admin-action banner — silently falling back to manual entry would erode the "UC is the source of truth" invariant the metric layer relies on.
How an admin enables system schemas¶
- Go to the Databricks Account Console.
- Under Catalog Explorer (or Data), open the
systemcatalog. - If the catalog does not exist, click Enable system catalog (Account admin role required).
- Inside
system, ensure theinformation_schemaschema is enabled. Some workspaces also needaccessandbillingfor separate features; onlyinformation_schemais required for Datahub's FK picker.
The official Databricks docs cover this in detail under Enable system schemas. Once enabled, refresh the metric builder; the FK picker should appear within a few seconds.
Where my metric reads from¶
Every metric you create in the wizard is bound, at save time, to the exact catalog · schema · table you picked in the data-source step. That binding is stored alongside the metric's spec and is what Datahub uses at query time to issue the FROM clause against your real Unity Catalog tables — no placeholder schema, no test fixture leakage, no surprise routing to a different catalog.
How the binding is captured¶
| What you do in the wizard | What Datahub stores | What runs at query time |
|---|---|---|
| Pick connection → catalog → schema → table on the Data source step | A source_map entry per ref, e.g. { "orders": { "catalog": "main", "schema": "gold", "table": "orders" } }, persisted on the metric version snapshot |
The compiled SQL's FROM clause resolves to `main`.`gold`.`orders` |
| Add a Joined table with its own catalog/schema/table | One additional source_map entry per joined table — each binds independently |
Each joined table's FROM clause resolves to its own UC location; UC handles the join |
| Save a cosmetic-only edit (rename, description, tag changes) without touching Data source | The previous version's source_map is inherited by the new save |
The compiled SQL keeps reading from the same UC tables — no rebinding |
| Save an edit that changes the spec (measure column, time grain, contract link) without touching Data source | The previous version's source_map is inherited; the new spec recompiles against the same UC tables |
The new compiled SQL reads from the same UC tables, with the new spec semantics |
The wizard is the only place this binding can be set or changed. If you re-pick the table on the Data source step and save, Datahub captures the new binding. If you only edit the YAML, the existing binding stays in place.
Identifier rules¶
Datahub locks every catalog / schema / table value to the character set A–Z, a–z, 0–9, and _. UC names with hyphens (-), dots (.), or spaces are rejected at save time with an inline error pointing at the offending value — for example, "orders.schema 'my-schema' contains characters that are not allowed." This matches what dbt and MetricFlow can render natively without quoting tricks.
If your tables use restricted characters in their UC names, the workaround in v1 is to expose a sanitised view in the same catalog/schema (e.g. CREATE VIEW main.gold.my_schema_orders AS SELECT * FROM main.gold.\my-schema`.orders;`) and bind the metric to that view.
Legacy metrics: "this metric was created before per-ref source binding existed"¶
Metrics created before Datahub shipped per-ref source binding do not have a source_map. Trying to compile or query one returns a clear error:
METRIC_REQUIRES_SOURCE_BINDING: this metric was created before per-ref source binding existed; recreate it via the wizard.
There is no automatic backfill. Datahub never persisted the catalog / schema for those metrics anywhere, so we have nothing to migrate from. The intentional fix is to recreate the metric via the wizard, which captures the binding correctly from the data-source picker.
When you recreate, the new metric:
- Gets a fresh definition + version (so links from dashboards / alerts pointing at the old definition need to be updated to the new ID).
- Picks up the same governance scaffolding (tags, domain, fiscal calendar) you set on the original — none of that is lost; you just retype the picker values that were never persisted in the first place.
- Compiles against your real UC tables on the first save and is queryable immediately.
Troubleshooting the binding¶
| Symptom | Likely cause | Fix |
|---|---|---|
Save fails with a 422 naming a (ref, segment, value) triple, e.g. "orders.schema 'my-schema' …" |
The catalog / schema / table value contains a hyphen, dot, or space. | Expose a sanitised view with an underscore-only name and bind the metric to that view. |
Run query fails with METRIC_REQUIRES_SOURCE_BINDING |
The metric predates per-ref source binding. | Recreate via the wizard (see above). |
Run query fails with [TABLE_OR_VIEW_NOT_FOUND] naming a real-looking catalog / schema / table |
The UC table was renamed, moved, or revoked from the calling user since the metric was last saved. | Re-open the metric in the wizard, re-pick the (possibly renamed) table on the Data source step, and save. The new save creates a new version with the corrected binding. |
Compiled SQL on the metric detail page shows datahub_metric_placeholder.public.<table> |
This is the original bug Datahub closed in metric-source-binding SP1–SP6 (April 2026). It should not appear on any metric saved after that release. | Recreate via the wizard — the compile path now refuses to fall back to a placeholder and instead raises METRIC_REQUIRES_SOURCE_BINDING. |
How it works¶
The metric builder reads UC metadata via Datahub's POST /insightsmodule/query/foreign-keys endpoint (outbound FKs from one table) and POST /insightsmodule/query/foreign-key-chains (BFS up to depth three for multi-hop discovery), which query:
system.information_schema.referential_constraints(the FK declarations themselves), andsystem.information_schema.key_column_usage(the column-pair mapping for each FK).
Both code paths use parameterised SQL — Datahub does not interpolate raw catalog / schema / table names into SQL. The endpoints are gated on the same insights.manage role as the rest of the catalog discovery surface.
When you save a metric with one or more joined tables, Datahub:
- Generates one
semantic_models:block per source table — the base table plus each joined table. - Adds the FK column to the base table's
entities:block as atype: foreignentry, matching the joined table'stype: primaryentity by name. MetricFlow uses this entity-name match to wire the join at compile time. - Compiles the spec via MetricFlow against the per-ref
source_map(the catalog/schema/table you picked on the Data source step) and persists the compiled SQL.
If MetricFlow rejects the spec — for example, because a ref(...) does not bind to a real Unity Catalog table — the save fails with a precise pointer at the unsatisfied reference. Contract linkage now happens out-of-band via Data Products (see § How contract linkage works for joined tables).
Limitations¶
| Limit | Why | Workaround |
|---|---|---|
More than three FK hops (A → B → C → D → E) |
UC graph walk is capped at depth three for predictable compile and UX. | Materialise a view (or table) that shortens the chain and bind the metric to that. |
| Many-to-many via an undeclared bridge | Bridge semantics are deferred. | Expose the bridge (or a flattened view) as a first-class table with UC FKs, then join. |
| Both tables must live in the same catalog and schema | v1 scope decision. | Move the joined table or define it as a view inside the base catalog/schema. |
| FKs must be declared in Unity Catalog | UC is the source of truth — no manual FK entry in v1 to avoid typos and drift. | Declare the FK in UC (see § Setup). |
| The headline measure stays on the base table | A joined model carrying measures: requires its own agg_time_dimension, which the v1 emitter does not add. |
Either keep the measure on the base table, or model the joined surface as a separate metric. |
| Embedded dashboards do not honour per-user UC RLS | Embed tokens use a different identity model — see Databricks per-user OAuth. | This is a Databricks identity-model constraint, not a metric-layer one. |
Audit & compliance¶
Every multi-table metric is auditable through the same surfaces as a single-table metric:
| Question a CISO might ask | Where to look |
|---|---|
| "Which contracts does this metric depend on?" | The metric detail page's Part of panel lists every Data Product that wraps the metric; each product's detail page lists its linked contracts. |
| "Did the user have permission to query both tables?" | UC enforces this at query time — Datahub mints a per-user OAuth token (when configured) so the actual end-user identity flows to UC. |
| "Was the join definition tampered with after approval?" | Each save creates an immutable metric_versions snapshot containing the full spec_yaml and source_map. |
| "Why did this metric fail to save?" | Spec failures surface as dbt parse errors with the offending ref; binding failures surface as METRIC_REQUIRES_SOURCE_BINDING (or a 422 naming the rejected (ref, segment, value) triple). |
Funnel metrics: ordered steps¶
Use a funnel metric when you need ordered steps on one entity (for example a customer or session id) within a conversion window — e.g. signup → activation → purchase within 30 days. The metric builder (single-table mode) lets you add two or more steps; each step maps to a numeric measure on the same physical table.
When to choose this
- You care about drop-off between consecutive steps, not a single conversion ratio only.
- Every step can be expressed as a measure on the same table and tied to the same primary key / entity as the other steps.
- You accept v1 limits: one entity column, linear steps only (no branching), single semantic model (no cross-table funnels in this release).
What you define
- Entity — aligned with the semantic model’s primary key (the builder uses your chosen primary key column; it maps to a single
primary_entityin the spec). - Window — e.g. 30 days or 12 hours; conversions must complete from step i to step i+1 within that window.
- Steps — each step has a name, a measure (column + aggregation), and optional filters in YAML for advanced users.
What the viewer shows
- An overall headline rate (product of step conversion rates) plus a funnel chart and a per-step transition table (current vs comparison period when you use YoY or similar).
Limitations (v1)
- No branching funnels (one step splitting into two paths).
- No multi-entity identity resolution (e.g. anonymous → identified) in one funnel.
- Funnel is not a substitute for retention cohorts or anomaly metrics — those stay on the roadmap.
Fiscal calendars¶
Fiscal calendars tell Datahub when your financial year starts so metrics can align periods to it. They are used by metrics whose period grain is fiscal_quarter or fiscal_year and by yoy_fiscal_aligned comparisons.
The seeded default. Every tenant starts with one calendar named "Fiscal Year (default)" with January as the start month. You can rename it and change the start month to match your real fiscal year (for example April for the UK, October for the US federal year). The default cannot be deleted — rename it instead.
When to add a custom calendar. Only if your organisation operates under multiple fiscal regimes (for example a UK group plus a US subsidiary). Each metric definition links to exactly one calendar. If one calendar covers the whole organisation, the seeded default is enough — you do not need to add another one.
How it surfaces in the metric builder. The General → Governance section shows a Fiscal calendar picker. The picker is pre-set to the system default when you create a new metric, so the Fiscal year and Fiscal YoY options work out of the box. You can switch to a custom calendar from the dropdown — the picker shows each calendar as "{name} — {month} 1", e.g. "Fiscal Year (default) — January 1" or "UK FY — April 1".
Where to manage them. Stewards with the metric.steward role can manage the list under Administration → Fiscal calendars. The page lists every calendar in your tenant — the seeded default plus any custom ones — with its starting month and a System / Custom badge.
What you cannot delete.
- The system row (Fiscal Year (default)) — the delete button is hidden on it.
- Any calendar that is currently linked to a metric definition — the API returns a 409 with
"Fiscal calendar in use by N metric definition(s)". Reassign or archive the metrics first, then retry the delete.
Audit & compliance. Name and start-month changes flow through the standard system audit log via sysmodifier/sysmodified, the same as every other governed entity.
Troubleshooting¶
| Symptom | Likely cause | Fix |
|---|---|---|
| Funnel save fails: steps on different semantic models | Measures for steps point at different models | In v1, keep all step measures on one table. |
| Funnel window rejected | Window outside allowed range or bad unit | Use N days or N hours with a positive integer within product limits. |
| Funnel per-step table empty after Run query | Pair-step query failed or returned no data | Widen the run window, check the integration, and confirm step measures are populated for the period. |
| Joins step shows "No declared foreign keys in Unity Catalog" | UC FK is not declared, or Datahub has not refreshed its UC cache yet. | Verify with SHOW CONSTRAINTS ON main.raw.orders; in Databricks. If declared, give Datahub ~30s to refresh; otherwise declare the FK as shown in § Setup. |
| Joins step shows "Foreign-key auto-discovery requires Unity Catalog system.information_schema access" | Workspace is on Standard SKU and system.information_schema is disabled. |
See § System schemas. |
Save fails with dbt parse failed mentioning agg_time_dimension |
The joined model received a measure but no time dimension was declared on it. | In v1, keep measures on the base table. If you must move the measure to a joined table, also pick a time column on that table. |
FK picker is empty even though referential_constraints shows the FK in Databricks |
The customer querying user (the per-user OAuth identity) does not have USE CATALOG / USE SCHEMA privileges on system.information_schema. |
Grant USAGE on system.information_schema to the user or the role they assume. |
| The compiled YAML contains the joined table but the metric returns no rows when run | The join key types do not match across the two tables (e.g. INT on one side, STRING on the other). |
Cast the column types to match in upstream dbt models — UC does not cast for you. |
If something is broken that this page does not cover, ping your Datahub contact with the metric ID and the error text — both are visible on the metric detail page.