AI platform

The AI platform is everything Datahub needs to run HERC, the agent system, and any module that uses an LLM (metric suggestions, term enrichment, dashboard generation, transcription, …).

It's not a single page. It's the collection of admin surfaces under /ai/* that govern which models you use, with which keys, under what guardrails, with what observability, and what they cost.

This page is the map.

You want to… | Go to
Add or rotate an OpenAI / Anthropic / Databricks API key | Provider keys
Switch on the platform-hosted private model (Granite-4) | Provider keys + license
Change the default model for an agent (e.g. cheaper / faster) | Model settings
Block PII, secrets, prompt injection, off-topic prompts | Guardrails
See every LLM call (latency, tokens, prompt, response, evals) | Observability
Run an agent against a fixed dataset to compare versions | Experiments

Provider keys — /ai/provider-keys

Datahub talks to LLM providers using per-tenant API keys stored encrypted in the database (Fernet, with the encryption key derived from the platform secret). One tenant can hold multiple active keys for the same provider; the most recently added one is preferred. Decryption is fault-tolerant: a corrupted key is skipped with a warning, so one bad row never breaks the rest.
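A minimal sketch of that selection rule, assuming a hypothetical `KeyRow` shape and a `decrypt` callable passed in by the caller (in a real deployment this would presumably wrap Fernet decryption; the actual table columns and function names are not documented here):

```python
import logging
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Iterable, Optional

logger = logging.getLogger("provider_keys")

@dataclass
class KeyRow:
    # Hypothetical row shape, for illustration only.
    provider: str
    ciphertext: bytes
    created_at: datetime
    active: bool = True

def pick_key(rows: Iterable[KeyRow], provider: str,
             decrypt: Callable[[bytes], str]) -> Optional[str]:
    """Return the newest active key for `provider` that decrypts cleanly."""
    candidates = [r for r in rows if r.active and r.provider == provider]
    # Newest first, so the most recent key is preferred.
    for row in sorted(candidates, key=lambda r: r.created_at, reverse=True):
        try:
            return decrypt(row.ciphertext)
        except Exception as exc:
            # A corrupted row is skipped with a warning instead of failing the lookup.
            logger.warning("skipping undecryptable %s key: %s", provider, exc)
    return None
```

If the newest row is corrupted, the lookup falls through to the next-newest key rather than raising.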

Supported providers:

Provider | Use cases | Notes
OpenAI | Default for HERC, agents, transcription (Whisper). | Standard OPENAI_API_KEY.
Anthropic | Alternative LLM (Claude Sonnet / Haiku). Useful as a fallback. | The platform automatically chooses sensible defaults per provider.
Databricks | LLM via Databricks Foundation Model APIs / Genie. | Wired through the Databricks workspace integration.
Datahub Private AI (Granite-4) | Platform-hosted private model. Credentials live in Azure Key Vault, not the tenant DB. | License-gated. Tenants can start a 30-day trial directly from the banner.

Fallback

If a code path needs a provider the tenant hasn't configured (e.g. a service hardcoded openai but only Anthropic is available), the platform falls back transparently. The model name is rewritten to the alternate provider's default and the request is retried. A structured provider_key_fallback log helps admins see the drift.
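The fallback can be pictured roughly as follows. This is a sketch under stated assumptions: the `DEFAULT_MODELS` mapping and the function name are hypothetical, and the order in which alternates are tried is a guess; only the `provider_key_fallback` log name and the "rewrite the model to the alternate provider's default" behaviour come from the description above.

```python
import logging
from typing import Set, Tuple

logger = logging.getLogger("provider_keys")

# Hypothetical per-provider defaults; the platform picks its own.
DEFAULT_MODELS = {"openai": "gpt-4o-mini", "anthropic": "claude-haiku-4"}

def resolve_with_fallback(provider: str, model: str,
                          configured: Set[str]) -> Tuple[str, str]:
    """Return the (provider, model) pair to actually call."""
    if provider in configured:
        return provider, model
    # Requested provider has no key: try any configured alternate (order assumed).
    for alternate in sorted(configured):
        if alternate in DEFAULT_MODELS:
            logger.warning(
                "provider_key_fallback requested=%s used=%s", provider, alternate
            )
            # The model name is rewritten to the alternate provider's default.
            return alternate, DEFAULT_MODELS[alternate]
    raise LookupError(f"No active API key found for provider '{provider}'")
```

A tenant with only an Anthropic key that hits a code path hardcoded to `openai` would, in this sketch, be routed to the Anthropic default and leave one warning in the log.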

Datahub Private AI

Datahub Private AI is the platform-hosted Granite-4 model on Datahub-managed hardware. It is offered in three tiers:

Tier | Pricing | Where the model runs
Business | From €100 / month | Shared sovereign Granite-4 instance. 30-day trial available.
Enterprise | Base fee + per-token usage | Dedicated / siloed Granite-4 instance per organisation.
Enterprise+ | Custom | Granite-4 + Datahub management software deployed inside your environment.

Trial flow is self-service: click Start trial in the locked banner, fill in name / email / company, the license server activates a signed trial license, and the tenant becomes entitled immediately. The trial is hard-capped at a token limit and a 30-day window; the admin cannot raise the cap during trial.

For paid plans, monthly token usage is metered and visible at /ai/provider-keys (per-agent breakdown, hard-limit toggle, warn threshold). Hitting the hard limit returns 429 to the caller.
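As an illustration of how the warn threshold and hard-limit toggle might interact, here is a small sketch. The `Quota` fields, the 0.8 default threshold, and the function name are assumptions made for the example; only the 429 status on a hard limit comes from the text.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class Quota:
    # Hypothetical settings object mirroring the toggles described above.
    monthly_limit: int            # tokens per month
    warn_threshold: float = 0.8   # assumed fraction of the limit that triggers a warning
    hard_limit: bool = True       # when False, usage past the limit is still allowed

def check_quota(used_tokens: int, quota: Quota) -> Tuple[int, bool]:
    """Return (http_status, warn) for a request given usage so far this month."""
    warn = used_tokens >= quota.warn_threshold * quota.monthly_limit
    if quota.hard_limit and used_tokens >= quota.monthly_limit:
        return 429, warn   # hard limit reached: reject the call
    return 200, warn
```

With the hard limit switched off, the same usage would only raise the warning flag and the call would still go through.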

Model settings — /ai/model-settings

Per-agent overrides for which model + what parameters to use:

Parameter | What it controls
Provider + model name | E.g. openai / gpt-4o-mini, anthropic / claude-haiku-4, datahub_private / granite4@dev.
Temperature, top-p | Determinism vs creativity.
Max tokens | Hard ceiling per response.
Frequency / presence penalty | Repetition penalties. (Auto-stripped for providers that don't support them, e.g. Anthropic, Granite.)

Two layers:

  • Global default — the model used when an agent has no override.
  • Per-agent override — e.g. give the metric agent a stronger model and the catalog agent a faster one.

Resolution: agent value > global value > hardcoded defaults (openai / gpt-5-nano).
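The resolution order can be sketched as a layered merge. This assumes settings are merged field by field (the platform may instead resolve whole records), and the dictionary keys and function name are hypothetical; the hardcoded default and the penalty-stripping behaviour are taken from the description above.

```python
from typing import Any, Dict, Optional

HARDCODED_DEFAULTS = {"provider": "openai", "model": "gpt-5-nano"}
# Providers described as not supporting repetition penalties.
NO_PENALTY_PROVIDERS = {"anthropic", "datahub_private"}
PENALTY_KEYS = ("frequency_penalty", "presence_penalty")

def resolve_settings(agent: Optional[Dict[str, Any]],
                     global_: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Apply agent value > global value > hardcoded defaults."""
    resolved: Dict[str, Any] = dict(HARDCODED_DEFAULTS)
    # Later layers win, so the agent override is applied last.
    for layer in (global_ or {}, agent or {}):
        resolved.update({k: v for k, v in layer.items() if v is not None})
    # Drop parameters the chosen provider does not accept.
    if resolved["provider"] in NO_PENALTY_PROVIDERS:
        for key in PENALTY_KEYS:
            resolved.pop(key, None)
    return resolved
```

An agent with no override and no global default therefore ends up on openai / gpt-5-nano.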

Datahub Private AI is filtered out of the dropdowns unless the tenant is entitled.

Guardrails — /ai/guardrails

Pluggable content-safety pipeline that runs before the LLM sees the prompt and after the LLM responds. Rules are direction-scoped (input, output, or both) and optionally agent-scoped.

Rule types:

Type | What it does | Default?
PII redact | Detects + masks email, phone, SSN, credit card, IP via regex. | Yes (input + output).
Regex block | Block on a custom regex. | No.
Topic block | Reject prompts touching a configured topic list. | No.
Disclaimer | Prepend / append a fixed disclaimer. | No.
Prompt injection | Heuristic regex with three sensitivity tiers. | Yes (input).
Content moderation | OpenAI Moderation API (free, fail-open). Uses the platform moderation key (or the tenant's OpenAI key as fallback). | Yes.
Secret detection | AWS keys, API keys, bearer tokens, private keys, connection strings, JWTs. | Yes.

Six rules ship out of the box; tenants can add more. Every violation is logged as a span attribute (datahub.guardrail_violations) and visible in the trace detail.
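To make the direction and agent scoping concrete, here is a minimal sketch of a rule pipeline. The `Rule` shape, the two simplified regexes, and the choice to block (rather than redact) on a detected secret are all assumptions for illustration; the platform's real patterns and actions are not documented here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional, Tuple

# Deliberately simplified patterns, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
AWS_KEY = re.compile(r"\bAKIA[0-9A-Z]{16}\b")

@dataclass
class Rule:
    name: str
    pattern: re.Pattern
    direction: str               # "input", "output", or "both"
    action: str                  # "redact" or "block"
    agent: Optional[str] = None  # None means the rule applies to every agent

def apply_rules(text: str, direction: str, agent: str,
                rules: List[Rule]) -> Tuple[Optional[str], List[str]]:
    """Return (text or None if blocked, names of violated rules)."""
    violations: List[str] = []
    for rule in rules:
        if rule.direction not in (direction, "both"):
            continue  # rule is scoped to the other direction
        if rule.agent is not None and rule.agent != agent:
            continue  # rule is scoped to a different agent
        if not rule.pattern.search(text):
            continue
        violations.append(rule.name)
        if rule.action == "block":
            return None, violations
        text = rule.pattern.sub("[REDACTED]", text)
    return text, violations
```

The returned list of rule names is the kind of value that could be attached to the trace as datahub.guardrail_violations.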

The Test button on each rule lets you dry-run text through the pipeline before activating.

Observability — /ai/observability

The native trace browser, backed by Arize Phoenix.

Every agent call is automatically traced via OpenTelemetry. The trace dashboard shows:

Metric | What it tells you
Trace count, error rate, P50 / P99 latency | Operational health.
Token usage over time | Spend pattern.
Per-agent breakdown | Which agents are called most.
Estimated cost | Blended $ / 1M tokens (configurable).
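The estimated cost is a blended rate applied to total tokens, so the per-agent figure is simple arithmetic. A sketch, assuming hypothetical span field names (`agent`, `prompt_tokens`, `completion_tokens`):

```python
from collections import Counter
from typing import Dict, Iterable, Mapping

def cost_by_agent(spans: Iterable[Mapping], usd_per_million: float) -> Dict[str, float]:
    """Sum tokens per agent and apply one blended $ / 1M tokens rate."""
    tokens: Counter = Counter()
    for span in spans:
        tokens[span["agent"]] += span["prompt_tokens"] + span["completion_tokens"]
    return {agent: total / 1_000_000 * usd_per_million
            for agent, total in tokens.items()}
```

Because the rate is blended, prompt and completion tokens are priced identically here; the result is an estimate, not an invoice.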

Click any trace to see:

  • The span waterfall — orchestrator → tool calls → LLM → guardrails → response.
  • Attributes (model, provider, agent name, conversation id, hashed user id, guardrail violations).
  • The actual prompt + response (with PII already redacted by the guardrails).
  • Evaluations from Phoenix (auto-eval rules that score relevance, faithfulness, etc.).
  • Annotations — manual feedback you and your team add to traces (label / score / freeform). Define schemas at Observability → Annotation configs.

Phoenix runs in-process by default (auto-launched on first use, persisted to disk), so the dashboard is always available without any extra deployment. Production deployments can point at an external Phoenix collector (DATAHUB_PHOENIX_COLLECTOR_ENDPOINT).
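The choice between the in-process instance and an external collector could be expressed as below. Only the two environment variable names come from this page; the accepted values of DATAHUB_PHOENIX_ENABLED, its default, and the function name are assumptions.

```python
import os
from typing import Mapping

def phoenix_target(env: Mapping[str, str] = os.environ) -> str:
    """Return "disabled", "in-process", or the external collector endpoint."""
    # Assumed convention: tracing is on unless explicitly switched off.
    enabled = env.get("DATAHUB_PHOENIX_ENABLED", "true").strip().lower()
    if enabled in ("0", "false", "no"):
        return "disabled"
    # An explicit collector endpoint takes precedence over the in-process default.
    endpoint = env.get("DATAHUB_PHOENIX_COLLECTOR_ENDPOINT", "").strip()
    return endpoint or "in-process"
```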

Experiments — /ai/experiments

Offline evaluation suite for agents. The flow:

  1. Create a dataset of prompts (with optional gold-standard outputs). Bulk-add or import CSV.
  2. Create an experiment: pick the dataset and the agent to test.
  3. Execute — the platform runs every example through the agent (using the same MCP tool filtering as production), capturing per-example latency, token counts, and output.
  4. Compare runs across agents / model settings / prompt versions.

Caps: max 100 examples per run.
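A dataset larger than the cap has to be split into several runs. The sketch below shows one way to batch examples and capture per-example latency; the function names and result fields are hypothetical, and `agent` stands in for whatever callable executes a prompt.

```python
import time
from typing import Callable, Dict, Iterator, List

MAX_EXAMPLES_PER_RUN = 100  # cap stated above

def split_runs(examples: List[str],
               size: int = MAX_EXAMPLES_PER_RUN) -> Iterator[List[str]]:
    """Yield consecutive batches that each fit within one run."""
    for start in range(0, len(examples), size):
        yield examples[start:start + size]

def run_batch(examples: List[str], agent: Callable[[str], str]) -> List[Dict]:
    """Run every example through `agent`, recording output and latency."""
    if len(examples) > MAX_EXAMPLES_PER_RUN:
        raise ValueError("a run is capped at 100 examples")
    results = []
    for prompt in examples:
        started = time.perf_counter()
        output = agent(prompt)
        results.append({
            "prompt": prompt,
            "output": output,
            "latency_s": time.perf_counter() - started,
        })
    return results
```

A 250-example dataset would become three runs of 100, 100, and 50 examples.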

Use it to:

  • A / B test model upgrades before flipping the default.
  • Catch regressions when a system prompt changes.
  • Validate that guardrails don't over-block legitimate queries.

How the AI platform fits with the rest

Feature | Depends on
HERC chat | At least one provider key + a model resolution (global or per-agent).
Metric suggestions, dashboard generation, term enrichment | Same.
Transcription (Whisper) | OpenAI provider key.
Genie chat on dashboards | Databricks workspace credentials (proxied; no AI provider key needed).
Conversation persistence | Built-in: every chat is stored per-user with optional summary on archive.
Conversation language | Driven by the user's UI locale; the platform prepends [LANGUAGE: en|nl] to every prompt to enforce adherence.
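The language directive amounts to a prefix on the prompt. A sketch, assuming a single space between directive and prompt and a fallback to English for unsupported locales (neither detail is confirmed on this page):

```python
SUPPORTED_LANGUAGES = ("en", "nl")

def with_language_directive(prompt: str, locale: str, default: str = "en") -> str:
    """Prefix the prompt with a [LANGUAGE: xx] directive derived from the UI locale."""
    # Reduce locales such as "nl-NL" or "nl_NL" to their language code.
    language = locale.split("-")[0].split("_")[0].lower()
    if language not in SUPPORTED_LANGUAGES:
        language = default  # assumed fallback
    return f"[LANGUAGE: {language}] {prompt}"
```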

Roles

Role | What you can do
aimodule.providerkeys.read | View provider keys, model settings, guardrails, observability, experiments.
aimodule.providerkeys.write | Manage provider keys, model settings, guardrails.
aimodule.agent.run | Run conversations (HERC chat).
aimodule.conversations.read / write | Read / manage conversations.
aimodule.transcription.run | Use the Transcription module.

Bundled into role groups: Administrator and AI.Editor get all AI roles. AI.Viewer is read-only.
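A role check against these groups might look like the sketch below. Two things are assumptions: that "aimodule.conversations.read / write" expands to two separate role names, and that AI.Viewer holds exactly the roles ending in ".read". The page only states that AI.Viewer is read-only.

```python
from typing import Iterable

AI_ROLES = frozenset({
    "aimodule.providerkeys.read",
    "aimodule.providerkeys.write",
    "aimodule.agent.run",
    "aimodule.conversations.read",   # assumed expansion of "read / write"
    "aimodule.conversations.write",  # assumed expansion of "read / write"
    "aimodule.transcription.run",
})

ROLE_GROUPS = {
    "Administrator": AI_ROLES,
    "AI.Editor": AI_ROLES,
    # Assumption: "read-only" means every role whose name ends in ".read".
    "AI.Viewer": frozenset(r for r in AI_ROLES if r.endswith(".read")),
}

def is_allowed(groups: Iterable[str], role: str) -> bool:
    """True if any of the user's role groups grants `role`."""
    return any(role in ROLE_GROUPS.get(group, frozenset()) for group in groups)
```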

Limitations

Limit | Why | Workaround
Provider keys are tenant-scoped, not user-scoped. | Most teams want one shared budget. | Use observability + guardrails to track per-user usage.
Granite-4 (Private AI) is license-gated. | Commercial offering. | Start a 30-day trial from the banner; upgrade to Business / Enterprise / Enterprise+.
Phoenix dashboards are best on a fresh page load. | OTel batching. | Refresh after a recent call if you don't see it.
Experiments cap at 100 examples per run. | Cost guardrail. | Split into multiple datasets.
Conversation history is trimmed for classification + follow-ups. | Token budget. | Long-running threads still answer; classifier just sees the recent slice.
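Trimming history to a "recent slice" under a token budget can be sketched as follows. The whitespace-based token count is a stand-in for a real tokenizer, and the rule that the newest message is always kept is an assumption; the platform's actual trimming policy is not described here.

```python
from typing import Callable, List

def recent_slice(messages: List[str], token_budget: int,
                 count_tokens: Callable[[str], int] = lambda m: len(m.split())
                 ) -> List[str]:
    """Keep the newest messages that fit in the budget, in chronological order."""
    kept: List[str] = []
    used = 0
    for message in reversed(messages):   # walk from newest to oldest
        cost = count_tokens(message)
        if kept and used + cost > token_budget:
            break                        # budget exhausted: older turns are dropped
        kept.append(message)             # the newest turn is always kept
        used += cost
    kept.reverse()                       # restore chronological order
    return kept
```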

Audit & compliance

Question | Where to look
"Where do prompts go?" | The provider configured for the agent, visible per call in /ai/observability. With Datahub Private AI, prompts stay on platform infrastructure.
"Who asked the AI what?" | Per-conversation (and per-message) trace tied to the user JWT. Visible in observability + the conversations API.
"Did the AI see PII it shouldn't?" | Guardrails redact PII before the prompt leaves the platform. Violations are logged on the trace.
"How much did this cost?" | Observability → estimated cost (blended) + per-agent token breakdown. For Private AI, the metered ledger lives on the provider-keys page.
"Are conversations encrypted at rest?" | Yes: provider keys are Fernet-encrypted; conversations are PostgreSQL rows.
"Can I disable a model after the fact?" | Yes: delete the provider key + remove the per-agent override. The platform falls back transparently.

Troubleshooting

Symptom | Likely cause | Fix
503 No active API key found for provider 'openai' | Tenant has no OpenAI key + service hardcoded openai. | The fallback should engage automatically; if not, add an OpenAI key.
429 Too Many Requests from Private AI | Hard token limit hit (or trial cap). | Wait for the next month rollover, or upgrade.
Trace doesn't show in observability | Phoenix not enabled, or collector unreachable. | Check DATAHUB_PHOENIX_ENABLED; the in-process Phoenix should auto-launch.
Guardrail blocks a legitimate prompt | Topic / regex rule too aggressive. | Test rule from the guardrails page; tune sensitivity.
HERC keeps responding in the wrong language | Locale not resolving from JWT. | Check user account locale setting; the language directive prefixes every prompt.

See also

  • HERC — the user-facing AI assistant the platform routes through.
  • Administration — broader admin surfaces, including integrations.
  • Transcription — Whisper-backed speech to text, governed by the same provider keys.
  • Organisation DNA — the knowledge graph HERC reads from, built once per tenant.