Flows: visual automation that runs¶
The Flows module is Datahub's automation engine. Build a graph — start, actions, transforms, conditions, sub-flows, end — connect it, and the platform runs it on a schedule, on a webhook, on a platform event, or on demand. Think n8n inside Datahub, with first-class versioning, replay, audit, and AI assistance.
It's where Processes-the-documentation meets execution: turn a procedure into something the platform actually does.
When to choose this¶
Use Flows when you want to:
- Run an HTTP call when something happens. Trigger by webhook, schedule, or platform event; POST to your CRM / ERP / Slack.
- Wire a multi-step automation. Conditional branching, parallel execution, sub-flows, retries.
- React to a platform event. A new contract is published → enrich it via your downstream system; an alert fires → call a workflow that pages on-call.
- Compose flows from flows. Build small, reusable atomic flows; compose them from a parent flow.
- Keep a paper trail. Every step's input / output context is captured (with secret redaction); every flow change is audited; every execution is replayable.
You do not need this module to:
- Document a process visually for human eyes only — that's Processes.
- Execute SQL alerts on Databricks — that's Logic Engine, which can trigger a flow.
- Approve / route entities through a governance gate — that's Workflows.
What Flows looks like¶
| Surface | Where | What you see |
|---|---|---|
| Landing | /flows | KPI strip (active flows, success rate, runs today, avg duration), mini execution timeline, health signals card, AI suggestions, definitions table with create / search / filter / bulk select. |
| Templates browser | /flows/templates | Card grid with category filter, search, Use template, import / export. Six starter templates ship out of the box. |
| Analytics | /flows/analytics | Per-definition success / failure / duration metrics, stacked timeline (recharts), step-level analytics, dependency graph showing cross-flow sub-flow references. |
| Builder | /flows/{id} | Floating-panel canvas, palette, config panel, version switcher, run history panel, schedule + event panels, audit log, variables panel, webhook management, Debug tab. |
| Run detail | /flows/{id}/runs/{instanceId} | Step timeline (memoised, pulse on running steps, expandable input / output context), execution logs (virtualised, level filtered), execution tree (parent / child sub-flows), retry from failure, replay-from-step. |
Concepts¶
| Concept | What it is |
|---|---|
| Flow definition | The reusable template. Name, trigger, version chain, schedule(s), event subscription(s), webhook secret, variables. |
| Flow version | A snapshot of the graph (nodes, edges, node configs). Versions auto-increment; only one is active at a time. |
| Flow instance | One execution of an active version. Has a context (JSONB) carried through every step. |
| Step execution | One step's run within an instance. Records input context, output context, duration, status, error. |
| Step handler | The code that runs for a node type — built-in handlers for action (HTTP), transform, condition. |
| Trigger | What starts a flow: manual, schedule (cron), webhook, platform event, or sub-flow invocation. |
| Variables | Flow-scoped key-value pairs injected into the execution context at start. |
| Template | A publishable, portable flow definition with category and tags; importable into another tenant. |
Node types¶
| Node | Category | What it does |
|---|---|---|
| Start | control | Entry point. |
| End | control | Termination. |
| Action | step | HTTP webhook call (URL, method, headers, body template, timeout). |
| Transform | step | Map values from input context to output context (rename, reshape). |
| Condition | control | Evaluate an expression → branch on true / false via the edge's source handle. |
| Sub-flow | step | Invoke another flow as a child step (max depth 5; cascade prevention). |
| Sync (join) | control | Join parallel branches before continuing. |
Condition operators: ==, !=, >, <, >=, <=, in, not_in, is_empty, is_not_empty.
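To make the operator list concrete, here is a minimal sketch of how a Condition node might evaluate one comparison and pick a branch. The function name, the `field` / `operator` / `value` config shape, and the exact emptiness rules are assumptions for illustration; the platform's real expression syntax may differ.

```python
from typing import Any, Callable


def _is_empty(value: Any) -> bool:
    # Assumed emptiness rule: missing value, empty string, empty list, empty dict.
    return value is None or value == "" or value == [] or value == {}


# Hypothetical mapping from the documented operator names to Python predicates.
OPERATORS: dict[str, Callable[[Any, Any], bool]] = {
    "==": lambda left, right: left == right,
    "!=": lambda left, right: left != right,
    ">": lambda left, right: left > right,
    "<": lambda left, right: left < right,
    ">=": lambda left, right: left >= right,
    "<=": lambda left, right: left <= right,
    "in": lambda left, right: left in right,
    "not_in": lambda left, right: left not in right,
    # Unary operators ignore the right-hand operand.
    "is_empty": lambda left, _right: _is_empty(left),
    "is_not_empty": lambda left, _right: not _is_empty(left),
}


def evaluate_condition(context: dict, field: str, operator: str, value: Any = None) -> str:
    """Return the source handle ("true" or "false") of the edge to follow."""
    if operator not in OPERATORS:
        raise ValueError(f"unknown operator: {operator}")
    left = context.get(field)
    return "true" if OPERATORS[operator](left, value) else "false"
```

For example, `evaluate_condition({"amount": 120}, "amount", ">", 100)` selects the `"true"` handle under these assumptions.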
Setup — what an admin needs to do once¶
| Prereq | Where | Why |
|---|---|---|
| Roles | /rolegroups | flow.definitions.manage to author flows; flow.instances.read to view executions. |
| AI provider key | /admin/integrations → AI Provider Keys | Powers the HERC button on the canvas and the AI agent that answers questions about flows. |
| Webhook auth (per flow) | Builder → Webhook panel → Reveal secret | Each definition gets an auto-generated 128-char webhook secret used in the X-Webhook-Secret header. Rotate from the same panel. |
| Schedules / events | Builder → Schedule panel / Event subscription panel | If you want the flow to run automatically. |
Building a flow¶
- Create. /flows → New flow → name, description, trigger type, tags. Auto-creates version 1 with an empty graph.
- Drag from the palette. Start → Actions / Transforms / Conditions / Sub-flows → End.
- Configure each node. Click a node → floating config panel with type-specific fields and inline validation.
- Set the trigger.
  - Manual — click Run in the builder.
  - Schedule — Schedule panel → cron string or preset (every minute, every hour, daily at 09:00, …).
  - Webhook — Webhook panel → reveal the URL + secret; POST anywhere with X-Webhook-Secret: <secret>. Rate limit 60 / minute.
  - Event — Event subscription panel → pick a platform event (e.g. contract.published).
- Variables. Variables panel → key-value pairs; injected into context at start.
- Save. Auto-saves to the current draft version. Activate the version when ready.
- Run. Trigger-test from the builder, or wait for the schedule / webhook / event.
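The schedule presets named in the steps above correspond to ordinary cron strings. The sketch below assumes the Schedule panel accepts standard five-field cron (minute, hour, day of month, month, day of week); the preset labels and the `looks_like_cron` helper are illustrative, not the platform's own validator.

```python
# Assumed five-field cron equivalents for the presets mentioned above.
CRON_PRESETS = {
    "every minute": "* * * * *",
    "every hour": "0 * * * *",
    "daily at 09:00": "0 9 * * *",
}


def looks_like_cron(expression: str) -> bool:
    """Cheap shape check: five fields built only from digits and * / , - characters."""
    allowed = set("0123456789*/,-")
    fields = expression.split()
    return len(fields) == 5 and all(set(field) <= allowed for field in fields)
```

This only checks the shape of the string; it does not verify that field values are in range.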
Versioning, replay, debug¶
Flows take versioning seriously:
- New version at any time — copies the graph from the current active version, becomes the new draft.
- Activate to make a version live; only one version per definition is active.
- Diff two versions → field-by-field table with semantic colours (added: green, removed: red, changed: yellow).
- Audit log — every definition / version change is recorded with field-level diffs.
- Replay-from-step — open a run → pick a step → Replay from here → a brand-new instance starts at that step on the active version, optionally with overridden context. Source instance is untouched.
- Retry from failure — a failed run can be retried from the failed step; the new instance is linked via superseded_by.
- Debug tab in the builder lets you start a run from any node (engine entry mid-graph), pass overrides, and trace through.
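The field-by-field diff can be pictured as a comparison of two flat config mappings. This is a minimal sketch under that assumption; the `diff_configs` name and the row shape are hypothetical, and the real diff likely also walks nodes and edges.

```python
def diff_configs(old: dict, new: dict) -> list[tuple[str, str, object, object]]:
    """Return (field, kind, old_value, new_value) rows, where kind is
    "added", "removed", or "changed". Unchanged fields are omitted."""
    rows = []
    for field in sorted(old.keys() | new.keys()):
        if field not in old:
            rows.append((field, "added", None, new[field]))
        elif field not in new:
            rows.append((field, "removed", old[field], None))
        elif old[field] != new[field]:
            rows.append((field, "changed", old[field], new[field]))
    return rows
```

Each `kind` would map to one of the semantic colours listed above (added: green, removed: red, changed: yellow).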
Step input / output context¶
Every step records its input context and its output context as JSONB.
- The full context is redacted through SecretRedactor (key-based recursive masking) before storage — keys with names like *token*, *secret*, *password* are masked.
- The redacted context is capped at 64 KiB UTF-8 per field. Over the cap, the stored value is { "_truncated": true, "original_bytes": N } so you can see the size without storing the payload.
- Output context becomes the next step's input context (with transform / mapping applied).
This is what makes replay safe: you can re-run from a step with the exact (redacted) context the original run had.
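The redaction and size cap described in this section can be sketched as follows. This is not the SecretRedactor implementation: the marker list, the `***` mask value, and the function names are assumptions, and the real redactor may match more key patterns.

```python
import json
from typing import Any

SENSITIVE_MARKERS = ("token", "secret", "password")  # assumed pattern list
MAX_CONTEXT_BYTES = 64 * 1024  # the 64 KiB cap described above
MASK = "***"  # assumed mask value


def redact(value: Any) -> Any:
    """Recursively mask any value stored under a key containing a sensitive marker."""
    if isinstance(value, dict):
        return {
            key: MASK
            if any(marker in str(key).lower() for marker in SENSITIVE_MARKERS)
            else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


def prepare_for_storage(context: dict) -> dict:
    """Redact first, then replace an oversized payload with a size marker."""
    redacted = redact(context)
    size = len(json.dumps(redacted, ensure_ascii=False).encode("utf-8"))
    if size > MAX_CONTEXT_BYTES:
        return {"_truncated": True, "original_bytes": size}
    return redacted
```

Redacting before measuring means the recorded `original_bytes` reflects the masked payload, which matches the order stated in the list above.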
Templates¶
The Templates browser is a library of reusable, importable flows:
- Use — clones a template into a new flow definition in your tenant.
- Publish — promote a flow definition to a template (with category + tags).
- Export / Import JSON — portable envelope for sharing across tenants.
- Bulk delete templates from the browser.
Six starter templates ship out of the box (HTTP webhook, daily report, conditional notification, sub-flow composition, schedule + transform, event reactor).
Schedules and events¶
- Schedules are cron-based. The Procrastinate periodic job picks due schedules and triggers them.
- Event subscriptions subscribe a flow to a platform event (e.g. asset.created, term.published, alert_rule.triggered) with optional filters. The event bus delivers; cascade prevention guards against infinite loops.
Both can coexist on the same flow.
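As an illustration of an event subscription with optional filters, the sketch below assumes filters are simple equality checks on top-level payload fields. The subscription shape and the `matches` helper are hypothetical; the platform's filter semantics may be richer.

```python
def matches(subscription: dict, event_name: str, payload: dict) -> bool:
    """Return True when the event name matches and every filter equals the payload field."""
    if subscription["event"] != event_name:
        return False
    filters = subscription.get("filters", {})
    return all(payload.get(field) == expected for field, expected in filters.items())


# Hypothetical subscription: react only to assets created in the finance domain.
subscription = {"event": "asset.created", "filters": {"domain": "finance"}}
```

A subscription without a `filters` entry matches every event with that name.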
Webhook trigger¶
The webhook endpoint is public (no JWT) but protected by:
- X-Webhook-Secret header — must match the per-definition secret.
- Rate limit — 60 requests / minute per definition.
- Body — arbitrary JSON, becomes the trigger payload of the flow execution context.
Rotate the secret from the builder's Webhook panel any time you suspect leakage. Old secret is revoked immediately.
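A caller might build the webhook request like this. The URL is a placeholder (the real one is revealed in the Webhook panel), and `build_webhook_request` is a hypothetical helper; only the X-Webhook-Secret header and the JSON body come from the description above.

```python
import json
import urllib.request


def build_webhook_request(url: str, secret: str, payload: dict) -> urllib.request.Request:
    """Build a POST request carrying the shared secret and a JSON trigger payload."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json", "X-Webhook-Secret": secret},
    )


# Placeholder URL and secret: copy the real values from the builder's Webhook panel.
request = build_webhook_request(
    "https://datahub.example.com/placeholder-webhook-url",
    "<secret>",
    {"contract_id": "c-123"},
)
# To send it: urllib.request.urlopen(request, timeout=10)
# A 401 response indicates a wrong or missing secret; 429 indicates the rate limit.
```

Keeping request construction separate from sending makes it easy to add throttling or retries on the caller's side without touching the payload logic.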
Sub-flows¶
Sub-flows let you compose: build small, focused flows (one HTTP call + transform), then have a parent flow invoke them as steps. Cascade is prevented at the engine level (max depth 5; cycle detection in the dependency graph).
The Run Detail's Execution tree visualises the parent → child instances so you can navigate between them.
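The two safeguards mentioned above, a depth cap and cycle detection over the dependency graph, can be sketched as below. The graph shape (flow id mapped to the ids of the sub-flows it invokes), the function names, and the convention that a root flow runs at depth 0 are assumptions; the engine's own bookkeeping may differ.

```python
from typing import Optional

MAX_SUBFLOW_DEPTH = 5  # the documented cap


def child_depth(parent_depth: int) -> int:
    """Return the depth of a child invocation, refusing to exceed the cap."""
    depth = parent_depth + 1
    if depth > MAX_SUBFLOW_DEPTH:
        raise RuntimeError(f"sub-flow depth {depth} exceeds {MAX_SUBFLOW_DEPTH}")
    return depth


def find_cycle(dependencies: dict[str, list[str]], start: str) -> Optional[list[str]]:
    """Depth-first search from start; return the path that closes a cycle, or None."""
    path: list[str] = []
    on_path: set[str] = set()
    finished: set[str] = set()

    def visit(flow_id: str) -> Optional[list[str]]:
        if flow_id in on_path:
            # The flow is already on the current path, so this edge closes a cycle.
            return path[path.index(flow_id):] + [flow_id]
        if flow_id in finished:
            return None
        path.append(flow_id)
        on_path.add(flow_id)
        for child in dependencies.get(flow_id, []):
            cycle = visit(child)
            if cycle is not None:
                return cycle
        path.pop()
        on_path.discard(flow_id)
        finished.add(flow_id)
        return None

    return visit(start)
```

Running the cycle check when a version is saved or activated would catch a loop before any instance starts, while the depth check protects at run time.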
Analytics¶
- KPI strip — active definitions, success rate (rolling), runs today, average duration.
- Timeline — stacked bar chart by day: succeeded / failed. Click a bar to drill into the executions that day.
- Per-definition table — invocations, duration p50 / p95, success rate.
- Step-level table — per-node-type metrics (invocations, duration, success rate). Optional heatmap toggle on the canvas to colour nodes by latency.
- Dependency graph — visual map of cross-flow sub-flow references, useful before deactivating a flow other flows depend on.
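For readers unfamiliar with the p50 / p95 columns, this sketch computes them with the nearest-rank method over a list of run durations, plus a success rate. The method choice and the `"succeeded"` status string are assumptions; the Analytics page may use a different percentile definition.

```python
def percentile(values: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("no samples")
    ordered = sorted(values)
    rank = max(1, -(-pct * len(ordered) // 100))  # ceiling division in integers
    return ordered[rank - 1]


def success_rate(statuses: list[str]) -> float:
    """Share of runs whose status is "succeeded" (assumed status name)."""
    return sum(status == "succeeded" for status in statuses) / len(statuses)


durations_ms = [120, 80, 200, 950, 100, 110, 90, 130, 105, 115]
p50 = percentile(durations_ms, 50)
p95 = percentile(durations_ms, 95)
```

With a single slow outlier in the sample, p50 stays near the typical duration while p95 surfaces the outlier, which is why both columns are shown.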
Limitations¶
| Limit | Why | Workaround |
|---|---|---|
| Sub-flow depth capped at 5. | Sanity. | Refactor deeply nested flows into siblings. |
| Action node only does HTTP. | Started simple. | More handlers (queue, file, email) are on the roadmap. |
| Webhook is per-definition; no per-instance auth. | Header-based shared secret. | Rotate the secret regularly; restrict via tenant-side allowlist. |
| Step output cap is 64 KiB. | Storage hygiene. | Stream large payloads to object storage and pass the URL. |
| Cron is timezone-fixed to the tenant default. | Determinism. | Pass the tz in the cron expression if your scheduler supports it; otherwise schedule in tenant TZ. |
| Replay starts a new instance — it doesn't mutate the original. | Auditable. | The new run's superseded_by link makes the chain visible. |
Audit & compliance¶
| Question a CISO might ask | Where to look |
|---|---|
| "Who changed this flow and when?" | Builder → Audit toggle → field-level diff timeline. |
| "Who triggered this run?" | Run detail → triggered_by + trigger source (manual / schedule / webhook / event / sub-flow). |
| "Did this flow leak credentials in the context?" | All step contexts are redacted via the SecretRedactor; verify by inspecting any step's input / output. |
| "Where did this webhook call go?" | Action node config + execution log entry shows the resolved URL. |
| "Did this run touch a sub-flow that touched another?" | Run detail → Execution tree. |
| "Can a non-author trigger an active flow?" | Yes if they have flow.definitions.manage; webhook calls require the secret. |
Troubleshooting¶
| Symptom | Likely cause | Fix |
|---|---|---|
| Webhook returns 401 | Wrong / missing X-Webhook-Secret header. | Reveal the secret in the Webhook panel; rotate if leaked. |
| Webhook returns 429 | Rate limit (60 / min). | Throttle the caller. |
| Run stays running forever | Step timed out and the timeout handler crashed (rare); or the engine couldn't compute a UTC-aware duration. | Cancel from the run detail; check execution logs. |
| Step input shows _truncated: true | Payload exceeded 64 KiB. | Reduce upstream payload or stream to object storage. |
| Replay-from-step errors with "non-step node" | You picked Start, End, or a gateway. | Pick an action / transform / condition / sub-flow node. |
| Activate fails | Config validation on a node failed. | The failing node is highlighted; required fields are red. |
| "Submit for Review" fails | Version not in draft, or a capability node names an unregistered capability. | Fix graph errors shown in the validation panel. |
| Version stuck in "Under Review" | The assigned reviewer hasn't acted yet. | Ask them to approve in the Tasks inbox, or withdraw the review. |
| Schedule doesn't fire | Schedule paused, or cron is malformed. | Toggle on; use the cron preview button. |
Version approval workflow¶
Every flow version must be approved before it becomes active. This ensures that a human reviewer consciously approves the capabilities a flow may use (e.g. querying metrics, sending notifications).
- Build — edit the graph in draft. Add capability nodes for the actions the flow needs.
- Submit for Review — validates the graph and starts an approval workflow. The version is frozen while under review.
- Review — a user with the flow.definitions.manage role sees a task in their Tasks inbox. They can approve or reject.
- Approve — the version becomes active with a frozen capability set. The previously active version is superseded.
- Reject — the version returns to draft for further editing.
You can withdraw a review at any time before the reviewer acts, returning the version to draft.
The platform records which version was active at every point in time (SCD Type 2), so the brain and audit trail can explain what logic produced any given outcome.
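The approval lifecycle above reads naturally as a small state machine. The sketch below uses hypothetical state and action names (`draft`, `under_review`, `active`, `superseded`; `submit`, `approve`, `reject`, `withdraw`, `supersede`) derived from the steps in this section, not from the platform's schema.

```python
# Assumed transition table: (current state, action) -> next state.
TRANSITIONS = {
    ("draft", "submit"): "under_review",
    ("under_review", "approve"): "active",
    ("under_review", "reject"): "draft",
    ("under_review", "withdraw"): "draft",
    ("active", "supersede"): "superseded",
}


def transition(state: str, action: str) -> str:
    """Return the next version state, or raise if the action is not allowed."""
    try:
        return TRANSITIONS[(state, action)]
    except KeyError:
        raise ValueError(f"cannot {action} a version in state {state}") from None
```

Because no transition leaves `under_review` except approve, reject, or withdraw, the table also captures the rule that a version is frozen while under review.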
See also¶
- Flow Capabilities — governed operations (run metric, notify) in flows.
- Processes — model the human side of the work.
- Logic Engine — alerts can trigger flows.
- HERC — ask "what flows ran today?" to navigate runs conversationally.
- Tasks — flows can create tasks; tasks can trigger flows.