Teach transcription your domain vocabulary

The Speech-to-Text module records audio and turns it into searchable text. By default the underlying model (OpenAI Whisper) does a great job on everyday speech, but it has no idea what your team calls things. "Asset Lifecycle Management" comes back as "lifestyle management". "EBITDA" shows up as "abita". Names of products, methodologies, internal projects — all the words that only your organisation uses — are exactly the words the model gets wrong.

Glossary linking fixes that. Highlight a phrase in the transcript, link it to a glossary term (or create one on the fly), and the next recording in this session, along with every future session that links the same term, is primed with that vocabulary and is much more likely to get it right.

It's the bridge between your Business Glossary and your Transcription sessions: terms you've already governed get re-used as recording vocabulary automatically.

When to use this

Reach for glossary linking when:

  • A recording got a domain word wrong and you fixed it manually — link it once so it doesn't happen again.
  • You're about to start a long-running customer interview series, deposition, or strategy off-site where the same set of names and acronyms will come up repeatedly.
  • Your team uses a Dutch / English mix where some words don't translate (e.g. "klantreis", "stuurgroep"). Link them and Whisper is much more likely to recognise them.
  • You're capturing a methodology call ("Asset Lifecycle Management", "Customer Lifetime Value", "Total Cost of Ownership") where every transcription needs to spell the methodology correctly.

You do not need this for:

  • Generic everyday speech (Whisper handles that fine).
  • Single one-off recordings where you don't care about future quality (just edit the transcript and move on).
  • Audio so noisy or low-quality that no amount of vocabulary biasing will rescue it (recording quality is still the limit).

How it works

Three steps the user sees:

  1. Select a phrase in the transcript — works in both the editable view (during edit) and the read-only viewer (after completing).
  2. Link it to a glossary term — pick from the search popover, or click "+ Create new term" to author a brand-new term inline (without leaving the transcript).
  3. Record again. The next chunk uses the linked term's name + synonyms as a hint to the speech model, so domain words are far more likely to land correctly.

What happens behind the scenes:

| Step | What the platform does |
| --- | --- |
| You select text | A small floating "Link to glossary" button appears above your selection. |
| You pick an existing term | The selection is stored as a link with the surface text + character offsets. |
| You create a new term | The same form as /glossary/new opens with the name pre-filled to your selection. Fill in domain, definition, and synonyms, then save. The new term is auto-linked to your selection. |
| You start the next recording | The platform gathers all linked terms for this session (name + synonyms), assembles them into a single hint string, and passes it to the speech model alongside the previous chunk's tail for sentence continuity. |
| You stop a recording | The hint is not re-applied to past chunks; it only biases recordings made after the link exists. (See "Limitations" below.) |
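
The linking and hint-assembly steps in the table can be sketched in a few lines of Python. This is an illustration only, not the platform's code: the class names, field names, the comma separator, and the de-duplication rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GlossaryTerm:
    name: str
    synonyms: tuple[str, ...] = ()


@dataclass(frozen=True)
class TermLink:
    term: GlossaryTerm
    surface_text: str  # the phrase exactly as it appears in the transcript
    start: int         # character offset where the selection begins
    end: int           # character offset just past the selection


def make_link(transcript: str, start: int, end: int, term: GlossaryTerm) -> TermLink:
    """Store the selection as surface text plus character offsets."""
    return TermLink(term, transcript[start:end], start, end)


def build_vocabulary_hint(links: list[TermLink]) -> str:
    """Collect name + synonyms of every linked term, de-duplicated, in link order."""
    seen: set[str] = set()
    phrases: list[str] = []
    for link in links:
        for phrase in (link.term.name, *link.term.synonyms):
            key = phrase.casefold()
            if key not in seen:  # linking the same term twice adds nothing
                seen.add(key)
                phrases.append(phrase)
    return ", ".join(phrases)


# The mis-heard phrase is linked to the governed term.
transcript = "We reviewed lifestyle management for the fleet."
wrong = "lifestyle management"
start = transcript.index(wrong)
term = GlossaryTerm("Asset Lifecycle Management", ("ALM", "levenscyclus"))
link = make_link(transcript, start, start + len(wrong), term)
hint = build_vocabulary_hint([link])
```

In this sketch the hint is what would be handed to the speech model as its prompt for the next chunk, together with the tail of the previous chunk.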

Setup

There is no setup specific to glossary linking. If your tenant has both Speech-to-Text and Business Glossary enabled, the feature is live. Prerequisites:

| Prerequisite | Where | Why |
| --- | --- | --- |
| Active OpenAI provider key | /admin/integrations | Whisper is OpenAI-only today (same key powers HERC). |
| At least one domain in the glossary | /glossary (Domains tab) | New terms must be assigned to a domain at create time. The platform seeds 8 default domains. |
| aimodule.transcription.run role | /rolegroups | Required to record, edit, and link. (You already have this role if you can edit transcripts.) |
| glossary.write role | /rolegroups | Optional; only needed to create new glossary terms inline. Without it, you can still link to existing terms, but the "+ Create new term" button is hidden. |

What you see

| Surface | Where | What it does |
| --- | --- | --- |
| Floating "Link to glossary" button | Above any selection in the transcript | Click it to open the picker. |
| Glossary search popover | Anchored to your selection | Debounced search across glossary terms; arrow keys + Enter to pick; Esc to dismiss. |
| Existing-link highlight | Inline <mark> over the linked phrase | Hover for a tooltip showing the term name + abbreviation. Click the highlight and confirm in the dialog to unlink. |
| Orphan strike-through | Faded, struck-through link in edit mode | You deleted the linked phrase from the transcript while editing. The link will be removed on Save; cancelling the edit restores it. |
| "+ Create new term" in the popover | Bottom of the picker | Opens the same form as /glossary/new with the name pre-filled. Auto-links the new term to your selection on save. Hidden if you lack glossary.write. |
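
The orphan behaviour in the table can be pictured with a small Python sketch. It assumes a deliberately simple rule (a link is orphaned once its surface text no longer occurs in the draft) and ignores offset tracking; the class and function names are hypothetical, not the platform's.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Link:
    term_name: str
    surface_text: str


def is_orphan(link: Link, draft: str) -> bool:
    """Assumed rule: a link is orphaned once its surface text is gone from the draft."""
    return link.surface_text not in draft


def save(links: list[Link], draft: str) -> list[Link]:
    """On Save, orphaned links are flushed and the rest are kept."""
    return [link for link in links if not is_orphan(link, draft)]


def cancel_edit(links: list[Link]) -> list[Link]:
    """Cancelling discards the draft, so every link survives untouched."""
    return list(links)


links = [
    Link("EBITDA", "EBITDA"),
    Link("Asset Lifecycle Management", "lifecycle management"),
]
# The editor removed the sentence that contained "lifecycle management".
draft = "Our EBITDA improved this quarter."

struck_through = [link for link in links if is_orphan(link, draft)]
kept_after_save = save(links, draft)
kept_after_cancel = cancel_edit(links)
```

Here the second link would render struck-through while editing, disappear on Save, and come back intact if the edit is cancelled.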

Tips for high-quality linking

  • Link the canonical phrase, not every variant. If your term is "Asset Lifecycle Management" with synonyms alm, levenscyclus, link the canonical phrase once. Synonyms are passed to Whisper together — duplicating variants as separate links doesn't help and clutters the transcript.
  • Use the synonyms field at create time. When you create a term inline, fill the synonyms field with every spoken variant your team uses (alm, Asset LCM, levenscyclus van assets). Whisper sees every variant as biasing input for the next chunk.
  • Link in the edit view. You can link in either view, but the editable view's mark renders in the same flow as your edits, so it's easier to spot orphans. The read-only viewer is fine for ad-hoc linking after the session is complete.
  • Don't re-link after a re-recording. Links belong to the session, not to a single take, and they bias every chunk recorded after the link is made. If you re-record because the first take had bad audio, the new recording already sees your existing links.

Limitations

| Limit | Why | Workaround |
| --- | --- | --- |
| Biasing only affects recordings made after the link, never the current chunk and never past chunks. | Whisper's prompt argument is read at call time, not retroactively. | Edit the past transcript manually; link the term so future recordings get it right. |
| Total vocabulary cap is roughly 30–60 terms per session. | Whisper's prompt argument is capped at ~900 characters; vocabulary names + synonyms compete for that budget. | If you hit the ceiling, the previous-text tail gets dropped first (vocabulary wins), so transcription accuracy stays best-effort. Consider splitting very long recordings into multiple sessions. |
| Per-session, not tenant-wide. | Each session's links feed only that session's prompt. | Link the same terms in each new session. A tenant-default vocabulary is on the roadmap. |
| Whisper biases, doesn't enforce. | The prompt is documented as a hint, not a constraint. The model can still mis-transcribe under heavy noise, accent, or unusual pronunciation. | Edit the transcript manually for stubborn cases. The link still helps the next chunk and any future session that links the same term. |
| Selection in the editable textarea anchors the floating button to the textarea, not the precise caret. | Browsers don't expose textarea caret coordinates without a hidden mirror DOM. | If the button feels off, mention it to support; the upgrade path is documented internally. |
| English-only term in a Dutch session is still passed verbatim. | Multi-lingual term variants are a glossary-module concern, not a transcription one. | Add the local-language variant as a synonym on the term. |
| No automatic "where is this term used?" panel on the glossary detail page. | The schema supports it (FK + index) but the UI is deferred. | The signal is queryable today via SQL; a glossary panel may follow later. |
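
The "vocabulary wins" rule for the ~900-character budget can be sketched as follows. The function name, the single-space separator, and the ordering (vocabulary first, tail last) are assumptions for illustration; only the budget figure and the priority rule come from the table above.

```python
PROMPT_BUDGET = 900  # approximate character ceiling quoted in the table above


def fit_prompt(vocabulary_hint: str, previous_tail: str, budget: int = PROMPT_BUDGET) -> str:
    """Vocabulary wins: keep the hint, then give the tail whatever room is left."""
    hint = vocabulary_hint[:budget]
    if not hint:
        # No linked terms: the whole budget goes to sentence continuity.
        return previous_tail[-budget:] if previous_tail else ""
    room = budget - len(hint) - 1  # reserve one character for the separator
    if not previous_tail or room <= 0:
        return hint  # the tail is the first thing to be dropped
    # Keep the END of the previous chunk, since that is what runs into the next one.
    return f"{hint} {previous_tail[-room:]}"


short = fit_prompt("ALM, EBITDA", "and then we said")
squeezed = fit_prompt("A" * 890, "x" * 50)   # only 9 characters of tail fit
saturated = fit_prompt("A" * 900, "x" * 50)  # no room left for the tail
```

Under these assumptions a session with few links keeps the full continuity tail, while a session whose names and synonyms approach the ceiling loses the tail entirely. That is one reason trimming synonyms to the variants people actually say can help.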

Audit & compliance

| Question a CISO might ask | Answer |
| --- | --- |
| Does Whisper see my glossary terminology? | Yes, but only on calls for sessions where you have linked terms. The hint string is sent to the speech endpoint configured in /admin/integrations (OpenAI). It is not logged on our side. |
| Are linked terms logged? | Only IDs and counts (transcription_id, term_count, prompt_chars). The assembled prompt string is never logged because synonyms can carry customer terminology. |
| Can a different tenant see my links? | No. Links are scoped by organization_id; cross-tenant access returns 404 (the API doesn't even acknowledge the session exists). |
| Can a glossary term be deleted while it's linked? | No. The foreign key is RESTRICT, so the user has to unlink first. This protects the audit trail and avoids silently corrupted prompts. |
| What audit trail exists for link/unlink? | Every link carries created_by (UUID) and syscreated. In the UI, unlinking is deferred: orphaned links are flushed on Save, and the link rows are removed at that point. The transcription session itself remains unchanged. |
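
The RESTRICT and tenant-scoping answers can be tried out against an in-memory SQLite database. The schema below is a hypothetical stand-in: apart from organization_id, created_by, and syscreated, the table and column names are assumptions, and the real platform may use a different database entirely.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when enabled
conn.executescript("""
CREATE TABLE glossary_term (
    id INTEGER PRIMARY KEY,
    organization_id TEXT NOT NULL,
    name TEXT NOT NULL
);
CREATE TABLE transcription_term_link (
    id INTEGER PRIMARY KEY,
    organization_id TEXT NOT NULL,
    transcription_id TEXT NOT NULL,
    term_id INTEGER NOT NULL REFERENCES glossary_term(id) ON DELETE RESTRICT,
    surface_text TEXT NOT NULL,
    created_by TEXT NOT NULL,
    syscreated TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE INDEX idx_link_term ON transcription_term_link(term_id);
""")

conn.execute("INSERT INTO glossary_term VALUES (1, 'org-a', 'Asset Lifecycle Management')")
conn.execute(
    "INSERT INTO transcription_term_link "
    "(organization_id, transcription_id, term_id, surface_text, created_by) "
    "VALUES ('org-a', 'session-1', 1, 'lifestyle management', 'user-1')"
)


def links_for(organization_id: str, transcription_id: str) -> list[tuple]:
    """Tenant-scoped read: another organization simply gets no rows."""
    return conn.execute(
        "SELECT term_id, surface_text FROM transcription_term_link "
        "WHERE organization_id = ? AND transcription_id = ?",
        (organization_id, transcription_id),
    ).fetchall()


def try_delete_term(term_id: int) -> bool:
    """Return True if the term was deleted, False if a link still blocks it."""
    try:
        conn.execute("DELETE FROM glossary_term WHERE id = ?", (term_id,))
        return True
    except sqlite3.IntegrityError:
        return False


blocked = not try_delete_term(1)                     # term is still linked
other_tenant_rows = links_for("org-b", "session-1")  # wrong tenant sees nothing
conn.execute("DELETE FROM transcription_term_link WHERE term_id = 1")  # unlink
deleted = try_delete_term(1)                         # delete is now allowed
```

The index on term_id is also what would make a "where is this term used?" lookup cheap; in this sketch that is a plain SELECT filtered by term_id and organization_id.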

Troubleshooting

| Symptom | Likely cause | Fix |
| --- | --- | --- |
| I linked a term but the next recording still got it wrong. | Whisper biases, doesn't constrain. The audio quality, accent, or speaker pace might be the limit. | Make sure the variant the speaker actually used (e.g. "ALM" rather than "Asset Lifecycle Management") is listed as a synonym on the linked term. If it is still wrong, the audio is the constraint. |
| The "+ Create new term" button isn't visible. | You don't have the glossary.write role. | Ask your admin to assign the Glossary Editor role group, or pick an existing term from the picker. |
| The picker shows "No results" even though I know that term exists. | The search debounces at 200 ms and matches term name + abbreviation only. Domains and synonyms are not searched. | Try the term's exact name or abbreviation. If still empty, check /glossary to confirm the term exists in your tenant. |
| A linked phrase is showing strike-through. | You deleted the surface text in the editable view. The link will be removed when you save. | If you want to keep the link, undo the delete (or cancel the edit). If you want to drop it, just save. |
| I can't unlink a term; the confirm dialog doesn't appear. | Both the edit view and the read-only viewer support unlinking, but the click has to land on the highlight itself, not on the surrounding text. | Click directly on the highlighted phrase. If still nothing happens, refresh; there may be a stale render. |
| Whisper's prompt budget feels saturated (lots of mis-transcription on a session with many links). | The 900-character ceiling has been hit. Vocabulary wins truncation, but very long names + synonyms can fill the budget on their own. | Trim synonyms to the variants people actually say. Consider splitting the session into smaller ones with more focused vocabularies. |
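
The "No results" row is easier to reason about with a sketch of the matching rule. The 200 ms debounce is a UI concern and is not modelled here; case-insensitive substring matching and all names below are assumptions. The part taken from the table is that only name and abbreviation are searched.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    name: str
    abbreviation: str = ""
    synonyms: tuple[str, ...] = ()


def search_terms(terms: list[Term], query: str) -> list[Term]:
    """Match on name or abbreviation only; synonyms and domains are ignored."""
    q = query.strip().casefold()
    if not q:
        return []
    return [
        t for t in terms
        if q in t.name.casefold() or q in t.abbreviation.casefold()
    ]


glossary = [
    Term("Asset Lifecycle Management", "ALM", ("levenscyclus",)),
    Term("Customer Lifetime Value", "CLV"),
]

by_name = search_terms(glossary, "lifecycle")
by_abbreviation = search_terms(glossary, "alm")
by_synonym = search_terms(glossary, "levenscyclus")  # empty: synonyms are not searched
```

So a term that people know by its Dutch synonym will not appear when that synonym is typed into the picker, even though the same synonym is still passed to Whisper once the term is linked.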

See also

  • Transcription — the parent module: recording, editing, archiving sessions.
  • Business Glossary — terms, domains, synonyms, approval workflow.
  • HERC — ask "find calls about asset lifecycle management" to navigate transcripts conversationally.
  • AI platform — provider keys, model settings, observability for Whisper calls.