Data Products: bundles, contracts, and asset registration

A Data Product in Datahub is a curated, named, owned bundle of one or more underlying assets (metrics, glossary terms, catalog assets, BPM processes, metadata-engine connections, logic-engine alert rules) that is the only thing a Data Contract can attach to. This page covers when to create one, how to register an existing asset as a Data Product, and how the bundle relates to your contracts.

When to choose this

Use a Data Product when you want to:

  • Describe a customer-facing capability that spans multiple assets. Example: Customer 360 = 4 metrics (mrr, arr, churn_rate, nps) + 2 catalog assets (customers, subscriptions) + 1 glossary term (Customer). The Data Product is the single noun; the bundle is what's inside.
  • Attach an asset to a Data Contract. Contracts no longer link directly to metrics, glossary terms, catalog assets, processes, connections, or alert rules. They link only to Data Products, and the Data Product carries the bundled assets. (See Why does a contract no longer link directly to a metric? below.)
  • Set ownership and lifecycle on a coherent set of assets at once. A Data Product has a single owner team, status state machine (draft → active → deprecated → retired), domain, and product type. The bundled assets keep their own owners and lifecycles, but the product is what stewards and consumers reach for.

You do not need a Data Product for:

  • A one-off ad-hoc query against the warehouse.
  • A glossary term that nobody outside your team consumes (it lives in the glossary; promote it later if it earns the audience).
  • An asset that will never be put on a contract and is never the answer to "what does this team ship?"

What a Data Product looks like

Surface | Where | What you see
List | /data-products | Filterable table: name, status, product type, owner team, domain, system, tags. KPIs / chart at the top of the lander.
Detail: Overview tab | /data-products/{id} (default tab) | Name, description, owner team, product type, domain, system, tags, status with state-machine actions, external URL, audit metadata.
Detail: Assets tab | /data-products/{id} (Assets tab) | The bundle. One row per wrapped asset: source type (metric / glossary / catalog / process / connection / alert rule), resolved name, added-on, and a remove (trash) button gated on the dataproducts.products.write role. Use Add Asset to wrap more.
"Part of" panel on every asset detail page | catalog asset, glossary term, metric, BPM process, metadata-engine connection, logic-engine alert rule detail pages | Shows "Part of N data products" with chips that link to each product. If the asset is not yet wrapped, the panel offers a Register as Data Product button.

Setup — what an admin needs to do once

There is no per-tenant setup required for the Data Product bundle bridge itself. The relevant prerequisites are:

Prereq | Where | Why
At least one Product type seeded | /admin/data-products (or the Product Types section) | New products pick a type at creation time (Dataset, API, Report, Stream, ML Model, Data Pipeline are platform defaults; admins can add more).
At least one Team | /admin/teams | Every Data Product has an owner team.
Roles | /admin/users (role groups) | dataproducts.products.read to view, dataproducts.products.write to create / edit / delete / wrap-asset / unwrap-asset.

Before you register: the asset must be published

Catalog assets carry a lifecycle (draft → under review → published → archived). A catalog asset can only be registered as a Data Product (or added to an existing one) when its status is published. This is the platform's "accepted / green" signal: it tells stewards and consumers that the asset has earned the right to be governed at the contract layer.

What you'll see in practice:

  • On a catalog asset detail page (/data-catalog/assets/{id}), the Register as Data Product button is greyed out and shows a tooltip — "Only published assets can be registered as a data product. Move this asset through review first." — until the asset reaches published.
  • If you try the API or the Add Asset flow inside a Data Product against a non-published catalog asset, the request is rejected with HTTP 422 and the error message "Asset cannot be registered as a data product while its status is 'draft'. Only published assets can be wrapped."
  • This rule only fires for asset types whose module has opted in. Today that's catalog assets; the other source types (glossary terms, metrics, BPM processes, metadata-engine connections, logic-engine alert rules) accept any state until their owning module opts in.
  • Existing wraps are not affected. If a Data Product already contains an asset whose status later moves back to draft, the wrap stays — the gate only fires on new registration attempts.

To unblock yourself, advance the asset through its review flow on the asset detail page until the status badge reads Published (green). The Register as Data Product button is then enabled.
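
The gate described above can be summarised as a small validation function. This is a minimal sketch, not the platform's implementation: the names `OPTED_IN_TYPES`, `RegistrationRejected`, and `check_can_wrap` are hypothetical, while the source type identifier, the 422 status, and the error text are taken from this page.

```python
# Source types whose module has opted in to the publish gate.
# Per this page, that is only catalog assets today.
OPTED_IN_TYPES = {"data_catalog"}


class RegistrationRejected(Exception):
    """Hypothetical error carrying the HTTP status the API would return."""

    def __init__(self, status_code: int, message: str) -> None:
        super().__init__(message)
        self.status_code = status_code
        self.message = message


def check_can_wrap(source_type: str, asset_status: str) -> None:
    """Raise if a NEW wrap should be rejected.

    Existing wraps are never re-checked, so this runs only on
    registration or Add Asset attempts.
    """
    if source_type in OPTED_IN_TYPES and asset_status != "published":
        raise RegistrationRejected(
            422,
            "Asset cannot be registered as a data product while its status "
            f"is '{asset_status}'. Only published assets can be wrapped.",
        )
```

In this sketch, `check_can_wrap("metric", "draft")` passes silently because metrics have not opted in, while `check_can_wrap("data_catalog", "draft")` raises with status 422.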

The full architectural decision is in docs/adr/ADR-012 (internal).

How to register an existing asset as a Data Product

There are three paths: two equivalent UI flows (Paths A and B) and one through HERC (Path C). All produce the same result: a Data Product whose bundle contains your asset, ready to be linked to a contract.

Path A — From the asset detail page

  1. Open the asset's detail page (/data-catalog/assets/{id}, /business-glossary/terms/{id}, /metrics/{id}, /bpm/processes/{id}, /metadata-engine/connections/{id}, or /logic-engine/alert-rules/{id}).
  2. Find the Part of N data products panel.
  3. Click Register as Data Product.
  4. Choose Create new (default) — name and description are pre-filled from the asset; tweak as needed and pick an owner team / product type / domain. The new product starts in draft status with your asset already wrapped.
  5. Or choose Add to existing — pick from the dropdown of products that don't yet contain this asset. The dropdown excludes products this asset is already in.
  6. Save. The panel updates in place to show Part of 1 data product (or N+1) with a chip pointing at the new product.

Path B — From the Data Product detail page

  1. Open /data-products/{id}.
  2. Switch to the Assets tab.
  3. Click Add Asset.
  4. Pick a source type (metric / glossary term / catalog asset / process / connection / alert rule).
  5. Pick the specific asset from the populated dropdown. Already-bundled assets are filtered out.
  6. Save. The new row appears in the Assets table.
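
The Add Asset flow is backed by POST /products/{id}/assets, the endpoint named in the Troubleshooting section. The sketch below only builds such a request and does not send it. The base URL, the bearer-token header, and the JSON field names are assumptions made for illustration (the field names mirror the columns in the unique constraint listed under Troubleshooting); the real request schema may differ.

```python
import json
from urllib.request import Request

BASE_URL = "https://datahub.example.com/api"  # hypothetical base URL


def build_add_asset_request(
    product_id: str, source_type: str, asset_id: str, token: str
) -> Request:
    """Build (but do not send) a request that adds an asset to a bundle."""
    # Assumed body shape; not confirmed by the source.
    body = {"source_entity_type": source_type, "source_entity_id": asset_id}
    return Request(
        url=f"{BASE_URL}/products/{product_id}/assets",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed auth scheme
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send it; a 422 (unpublished catalog asset) or 409 (duplicate wrap) response surfaces there as `urllib.error.HTTPError`.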

Path C — Through HERC (the in-app AI assistant)

Ask HERC in plain language:

  • "Register the metric mrr as a data product called Customer 360 Revenue."
  • "What data products contain the glossary term Customer?"
  • "What's in the Customer 360 product?"

HERC routes these to the Data Products agent and uses the register_asset_as_product, list_products_for_asset, and list_assets_for_product tools to act and answer in one turn. After registration HERC tells you the new product's name and status (always draft) and suggests the next step (link the product to a contract).

What "the bundle" means in practice

The bundle is a 1-to-many junction between a Data Product and any number of source assets. The same source asset can also belong to multiple Data Products — there is no exclusive ownership.

Property | Behaviour
Cardinality | 1 product → N assets (a product can wrap any number of assets)
Sharing | The same asset can be wrapped by multiple products (no exclusive lock)
Standalone products | A Data Product with zero wrapped assets is valid, which is useful when you want the named product to exist before deciding what goes in it.
Nested products | Not supported in v1. A Data Product cannot wrap another Data Product.
Allowed source types | data_catalog, business_glossary, metric, bpm_process, metadata_engine, logicengine_alert_rule. Any other value is rejected at the API layer.

When you delete a Data Product, the bundle rows are cascade-deleted automatically — the bundled assets themselves stay (they live in their own modules with their own lifecycles).
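
The bundle rules above (allowed source types, shared assets, cascade on product delete) can be modelled in a few lines. This is an in-memory sketch under stated assumptions: `BundleStore` and its methods are hypothetical names, and the real bundle lives in database rows rather than a Python set.

```python
ALLOWED_SOURCE_TYPES = {
    "data_catalog", "business_glossary", "metric",
    "bpm_process", "metadata_engine", "logicengine_alert_rule",
}


class BundleStore:
    """Hypothetical in-memory stand-in for the bundle junction rows."""

    def __init__(self) -> None:
        # Each row is (product_id, source_type, asset_id).
        self.rows = set()

    def wrap(self, product_id: str, source_type: str, asset_id: str) -> None:
        # "data_product" is not an allowed type, so nesting is rejected here too.
        if source_type not in ALLOWED_SOURCE_TYPES:
            raise ValueError(f"source type not allowed: {source_type!r}")
        self.rows.add((product_id, source_type, asset_id))

    def delete_product(self, product_id: str) -> None:
        """Cascade: drop the product's bundle rows; the assets are untouched."""
        self.rows = {r for r in self.rows if r[0] != product_id}

    def products_for_asset(self, source_type: str, asset_id: str) -> list:
        """Every product that wraps the asset (there is no exclusive lock)."""
        return sorted(r[0] for r in self.rows if r[1:] == (source_type, asset_id))
```

Wrapping the same metric into two products and then deleting one of them leaves the metric listed under the surviving product only.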

Why does a contract no longer link directly to a metric?

This is the most common question after April 2026. The short answer:

Contracts now link only to Data Products. To put a metric (or glossary term, catalog asset, process, connection, alert rule) on a contract, wrap it in a Data Product first, then link the Data Product to the contract.

The longer answer is in docs/adr/ADR-008 (internal). The summary you need as a steward:

  • Before April 2026, Contract.linked_products could contain six different entity types and each one had its own lookup path. That polymorphism leaked into every consumer module and made adding a new linkable type expensive.
  • Now Contract.linked_products contains only Data Products. Want a metric on a contract? Register the metric as (or add it to) a Data Product, then link that product to the contract. The contract's "Linked Products" tab and the metric's "Part of" panel both surface the same relationship.
  • This is the only path. Pre-existing direct links (test data) were dropped at migration; if you had a real link that mattered, register the asset as a Data Product and re-attach.

For metrics specifically: the old primary_contract_id field, the metric → contract dependency table, the lineage graph organism, and the save-time COLUMN_REF_NOT_IN_LINKED_CONTRACTS failure are all retired. Save the metric, then govern it through Data Product membership.
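
Because the only path is now contract → product → asset, finding the contracts a metric flows into is a two-step lookup. The function below is an illustrative sketch: `contracts_for_asset` and the tuple shapes are assumptions, not the platform's data model.

```python
def contracts_for_asset(asset, bundle_rows, contract_links):
    """Return the sorted contract IDs that reach `asset` through a Data Product.

    asset          -- (source_type, asset_id)
    bundle_rows    -- iterable of (product_id, source_type, asset_id)
    contract_links -- iterable of (contract_id, product_id)
    """
    # Step 1: which products wrap this asset?
    products = {p for (p, t, a) in bundle_rows if (t, a) == asset}
    # Step 2: which contracts link to any of those products?
    return sorted({c for (c, p) in contract_links if p in products})
```

An asset that is not wrapped in any product yields an empty list, which matches the rule that an unwrapped asset cannot be on a contract.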

How it works

Asset name resolution

Every row in the Assets tab shows a resolved name (e.g. "Monthly Recurring Revenue" for a metric, "customers" for a catalog asset, "Customer" for a glossary term). Datahub resolves these by asking each consumer module for the human-readable name of an (entity_type, entity_id) pair.

If the consumer module's name resolver hasn't been registered (during a partial deployment, for example), the row falls back to displaying the raw UUID in italic muted text. This is rare in production but explicitly handled so the bundle never crashes when a single module is misconfigured.
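
The fallback can be pictured as a registry lookup that degrades to the raw ID. This is a minimal sketch, assuming a dictionary of resolver callables keyed by entity type; `register_resolver` and `resolve_name` are hypothetical names and do not reflect the actual resolver interface.

```python
from typing import Callable, Dict, Tuple

# Hypothetical registry: one name resolver per entity type.
_resolvers: Dict[str, Callable[[str], str]] = {}


def register_resolver(entity_type: str, resolver: Callable[[str], str]) -> None:
    """Called by each consumer module when it starts up (assumed)."""
    _resolvers[entity_type] = resolver


def resolve_name(entity_type: str, entity_id: str) -> Tuple[str, bool]:
    """Return (display_text, resolved).

    When no resolver is registered for the type, fall back to the raw ID
    so the Assets tab can still render the row (in italic muted text).
    """
    resolver = _resolvers.get(entity_type)
    if resolver is None:
        return entity_id, False
    return resolver(entity_id), True
```

The boolean lets the caller choose the muted styling without inspecting the text itself.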

Cascade behaviour

What you do | What happens to the bundle
Delete a Data Product | All its bundle rows are cascade-deleted. Bundled assets are untouched.
Delete a bundled asset (out of band) | The bundle row becomes orphaned. The platform's archive-before-delete rule prevents this in normal flows. Orphan cleanup is deferred until it becomes a problem in practice.
Unwrap an asset | The bundle row is removed; the product and the asset are both untouched.

Limitations

Limit | Why | Workaround
You cannot wrap a Data Product inside another Data Product | Cyclic-bundle pathology + transitive-membership semantics aren't worth the cost without a real customer ask. | Use multiple Data Products that share contracts, or model the parent as a higher-level Data Product whose description references the others.
Bundling an asset that's hard-deleted later orphans the bundle row | Postgres can't FK a polymorphic (type, id) pair without an exhaustive CHECK trigger that would re-introduce the upward dependency. | Use the platform's archive-before-delete flow. Hard-delete is a guarded admin action; rare in practice.
Direct contract → asset links are gone | Replaced by contract → product → asset. | Register the asset as (or add it to) a Data Product, then link the product to the contract.
The "Linked Products" picker on a contract only shows Data Products | DPB-D3. | This is the new platform model; see Why does a contract no longer link directly to a metric?
Catalog assets must reach published status before they can be wrapped in a Data Product | ADR-012 / DPS-D3. Customer language: "accepted / green". Only published assets can flow into contract-bound bundles. | Move the asset through review on its detail page; the Register as Data Product CTA enables once the status badge turns green. The same gate applies to the Add Asset flow inside an existing Data Product.
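
The removal of direct contract → asset links is backed by a database CHECK constraint on contract_product_links.entity_type, as the Audit & compliance section notes. The sketch below reproduces that kind of constraint in SQLite purely for illustration: the platform itself runs on Postgres, and every column other than entity_type is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE contract_product_links (
        contract_id TEXT NOT NULL,          -- assumed column
        entity_type TEXT NOT NULL CHECK (entity_type = 'data_product'),
        entity_id   TEXT NOT NULL           -- assumed column
    )
    """
)


def link(contract_id: str, entity_type: str, entity_id: str) -> bool:
    """Try to insert a link; return False if the CHECK constraint rejects it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO contract_product_links VALUES (?, ?, ?)",
                (contract_id, entity_type, entity_id),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

Here `link("c1", "data_product", "p1")` succeeds, while `link("c1", "metric", "m1")` is rejected by the database no matter what the application layer allows.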

Audit & compliance

Question a CISO might ask | Where to look
"Which contracts does this metric / asset / term flow into?" | The asset's detail page → Part of N data products panel → click each product chip → the product's detail page lists every linked contract.
"Who can register an asset as a data product?" | Anyone with the dataproducts.products.write role group. The CTA is hidden for read-only users.
"Can a user bypass the bundle and link a metric directly to a contract?" | No. The database has a CHECK constraint on contract_product_links.entity_type = 'data_product' and the model rejects any other value. Both paths are enforced.
"What happened to legacy direct links?" | They were deleted at the lockdown migration. The migration logged the row count via RAISE NOTICE for audit.
"What's in the audit log when an asset is wrapped or unwrapped?" | Each data_product_assets row carries created_at + created_by_user_id (when the registry resolves the caller). The Data Product's audit metadata reflects the wrap/unwrap event timestamp.

Troubleshooting

Symptom | Likely cause | Fix
The Register as Data Product button is missing on an asset detail page | The user does not have the dataproducts.products.write role | Ask an admin to add the role group, then re-login.
The Assets tab Add-Asset dropdown is empty after picking a source type | All available assets of that type are already in the bundle, or the user does not have the consumer module's read role for that type | Check the consumer module's read role; if all are already bundled, that's expected.
A bundled row shows the asset's UUID in italic muted text instead of a name | The consumer module's IAssetNameResolver is not registered (partial deploy or import error in Dependencies.py) | Check the consumer module's Dependencies.py for the registration block; restart the backend; the row will resolve on next list.
Trying to wrap the same asset twice in the same product returns 409 | UNIQUE constraint on (data_product_id, source_entity_type, source_entity_id) | This is intentional: the duplicate is rejected. The first wrap is already there.
Trying to register an asset whose product name collides | Product name uniqueness is enforced per-tenant | Pick a different name (the modal surfaces the 409 with a clear message).
The Register as Data Product button is greyed out with a tooltip about status | The catalog asset is not in published status (it's draft, under review, or archived); the platform only allows registration of accepted assets (ADR-012) | Move the asset through review until its status badge reads Published, then re-open the asset detail page.
Trying to add a non-published catalog asset to an existing Data Product returns 422 | Same status gate as the Register CTA, enforced server-side at POST /products/{id}/assets and POST /products/register-from-asset | Publish the catalog asset first, then add it to the bundle.
The Linked Products tab on a contract is empty even though I "linked things" before April 2026 | Pre-bundle direct links were deleted at the lockdown migration (DPB-D4) | Re-register the relevant assets as Data Products, then link them to the contract.
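
The 409 on a duplicate wrap comes from the UNIQUE constraint listed in the table above. The sketch below shows how such a constraint turns a second insert into a conflict. It uses SQLite for illustration only, takes the table and column names from this page, and assumes a 201 status for the successful case, which the source does not state.

```python
import sqlite3
from datetime import datetime, timezone
from typing import Optional

db = sqlite3.connect(":memory:")
db.execute(
    """
    CREATE TABLE data_product_assets (
        data_product_id    TEXT NOT NULL,
        source_entity_type TEXT NOT NULL,
        source_entity_id   TEXT NOT NULL,
        created_at         TEXT NOT NULL,
        created_by_user_id TEXT,
        UNIQUE (data_product_id, source_entity_type, source_entity_id)
    )
    """
)


def wrap(product_id: str, source_type: str, asset_id: str,
         user_id: Optional[str] = None) -> int:
    """Insert a bundle row; return 409 on a duplicate, else 201 (assumed)."""
    now = datetime.now(timezone.utc).isoformat()
    try:
        with db:  # commits on success, rolls back on error
            db.execute(
                "INSERT INTO data_product_assets VALUES (?, ?, ?, ?, ?)",
                (product_id, source_type, asset_id, now, user_id),
            )
    except sqlite3.IntegrityError:
        return 409  # the first wrap is already there
    return 201
```

The same asset can still be wrapped by a different product, because the constraint includes data_product_id.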

If something is broken that this page does not cover, ping your Datahub contact with the data product ID (visible in the URL) and the error text from the toast or banner.