How Fluidworks Keeps Hallucinations < 5 %
Jose Kuttan
Aug 27, 2025

At Fluidworks, factual accuracy is a product requirement, not an afterthought. Our agent runs every response through layered defenses; whenever confidence drops, it reformulates, asks for clarification, or hands off to a human.
1. Retrieval-First Architecture
Grounding in data
Every doc, call transcript, and UI screenshot you upload is chunk-indexed with pgvector on GCP Postgres.
The agent retrieves top-k passages (semantic similarity + recency bias) and assembles a context window.
The LLM must cite which chunks it used. If none pass the relevance threshold, it replies “I’m not sure” instead of guessing.
Why it matters: Most hallucinations come from ungrounded prompts. Retrieval-first ensures the model rarely generates in a vacuum.
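To make this concrete, here is a minimal sketch of the retrieval step in Python. It assumes a pgvector-backed chunks table plus hypothetical embed() and generate_with_citations() helpers; the column names, recency weighting, and 0.75 threshold are illustrative, not our production schema.

```python
# Illustrative retrieval-first sketch: table, threshold, and helpers are assumptions.
import psycopg2

TOP_K = 8
RELEVANCE_THRESHOLD = 0.75  # assumed similarity cutoff

def retrieve_chunks(conn, query_embedding):
    """Top-k chunks by cosine similarity, with a small recency bonus."""
    vec = "[" + ",".join(map(str, query_embedding)) + "]"  # pgvector text format
    sql = """
        SELECT id, content, 1 - (embedding <=> %s::vector) AS similarity
        FROM chunks
        ORDER BY (embedding <=> %s::vector)
                 - 0.1 * (created_at > now() - interval '30 days')::int
        LIMIT %s;
    """
    with conn.cursor() as cur:
        cur.execute(sql, (vec, vec, TOP_K))
        return cur.fetchall()

def grounded_answer(conn, question, embed, generate_with_citations):
    rows = retrieve_chunks(conn, embed(question))
    context = [(chunk_id, text) for chunk_id, text, sim in rows
               if sim >= RELEVANCE_THRESHOLD]
    if not context:
        return "I'm not sure."  # refuse instead of guessing
    # The model is prompted with only these chunks and must cite their IDs.
    return generate_with_citations(question, context)
```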
2. Model Ensemble Voting
Primary model (e.g., GPT-4, Gemini 2 Pro) drafts an answer.
Verifier models (lighter Mistral-class) produce their own.
If ≥ 2 models agree (0.8+ cosine similarity), we proceed; otherwise confidence drops, triggering a rerun or user clarification.
Why it matters: Different LLMs tend to hallucinate independently, so the odds that several models invent the same wrong answer at once are low.
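Here is a rough sketch of the voting check, assuming hypothetical primary_model and verifier_models callables and an off-the-shelf sentence embedder for comparing answers; the 0.8 agreement threshold matches the figure above, everything else is illustrative.

```python
# Illustrative ensemble-voting sketch; model wrappers and the embedder are assumptions.
from sentence_transformers import SentenceTransformer, util

embedder = SentenceTransformer("all-MiniLM-L6-v2")
AGREEMENT_THRESHOLD = 0.8   # cosine similarity, as described above
QUORUM = 2                  # at least 2 models must agree

def vote(question, primary_model, verifier_models):
    draft = primary_model(question)
    candidates = [draft] + [m(question) for m in verifier_models]
    embeddings = embedder.encode(candidates, convert_to_tensor=True)
    # Count verifiers whose answer is close enough to the primary draft.
    sims = util.cos_sim(embeddings[0:1], embeddings[1:])[0]
    agreeing = 1 + int((sims >= AGREEMENT_THRESHOLD).sum())
    if agreeing >= QUORUM:
        return draft, "accept"
    return None, "low_confidence"   # triggers a rerun or user clarification
```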
3. Guardrail Filtering
Rule-based layer: Blocks unsupported claims (“guaranteed ROI”), speculative numbers, or PII leaks.
Regex + semantic filters: Catch brand-new entities not in your KB, forcing a grounding pass.
Safe completion: Strips risky output or replaces it with a clarifying fallback.
Why it matters: Even correct answers can over-promise or leak. Guardrails enforce compliance and brand safety.
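A simplified sketch of what such a pass can look like; the blocked patterns, the capitalized-word entity heuristic, and the fallback wording are examples only, not our full rule set.

```python
# Illustrative guardrail pass; patterns and fallbacks are examples, not the production filter.
import re

BLOCKED_PATTERNS = [
    r"guaranteed\s+ROI",                     # unsupported claims
    r"\b\d{1,3}\s?%\s+(savings|uplift)\b",   # speculative numbers
    r"\b\d{3}-\d{2}-\d{4}\b",                # PII: SSN-like strings
]
FALLBACK = "I'd rather not guess at that. Want me to pull the exact figure from your docs?"

def guardrail(answer, known_entities):
    # Rule-based layer: block risky claims, speculative numbers, and PII.
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, answer, flags=re.IGNORECASE):
            return FALLBACK, "blocked"
    # Crude entity check (stand-in for the semantic filter): unfamiliar
    # capitalized terms that aren't in the KB force another grounding pass.
    unknown = [w for w in re.findall(r"\b[A-Z][a-zA-Z0-9]{2,}\b", answer)
               if w not in known_entities]
    if unknown:
        return answer, "needs_grounding"
    return answer, "pass"
```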
4. User Flagging & Human-in-the-Loop
Inline “Report Issue” button in every transcript.
Triage queue: Issues appear on a JIRA-like board within minutes.
Flagged turns auto-feed into nightly fine-tune jobs and prompt updates.
Why it matters: Edge cases surface fastest in the wild. A 24-hour SLA keeps the feedback loop tight.
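For illustration, here is roughly the shape of record a flag produces; the field names are hypothetical, but each flag carries the full turn plus the chunks that were retrieved, so the triage board and downstream nightly jobs have the context they need.

```python
# Hypothetical flagged-turn record; field names are illustrative, not our actual schema.
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class FlaggedTurn:
    conversation_id: str
    turn_index: int
    user_comment: str
    agent_answer: str
    retrieved_chunk_ids: list
    flagged_at: str

def build_flag(conversation_id, turn_index, comment, answer, chunk_ids):
    """Payload pushed to the triage queue and read by nightly fine-tune / prompt-update jobs."""
    return asdict(FlaggedTurn(
        conversation_id=conversation_id,
        turn_index=turn_index,
        user_comment=comment,
        agent_answer=answer,
        retrieved_chunk_ids=chunk_ids,
        flagged_at=datetime.now(timezone.utc).isoformat(),
    ))
```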
5. Continuous Evaluation & Shadow Tests
Shadow prompts: 10 % of live turns are replayed against a staging model to check for deviations.
Monthly red-team sweeps: Adversarial prompts (e.g., “Invent an ROI metric”) test resilience. Failures trigger new guardrails.
Metric tracked: Hallucination Incident Rate = (confirmed incidents ÷ total responses) × 100.
Last 90-day rolling avg: 0.74 %.
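The sampling and the metric are simple enough to show directly; the helper names and example counts below are hypothetical, only the 10 % rate and the formula come from above.

```python
# Illustrative sketches of shadow sampling and the tracked metric; names and counts are assumptions.
import random

def should_shadow(sample_rate=0.10):
    """Sample roughly 10% of live turns for replay on the staging model."""
    return random.random() < sample_rate

def hallucination_incident_rate(confirmed, total):
    """Hallucination Incident Rate = (confirmed / total) * 100."""
    return 100.0 * confirmed / total if total else 0.0

# Hypothetical counts, shown only to illustrate the arithmetic:
# hallucination_incident_rate(37, 5000) -> 0.74
```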
When Confidence Drops
Explicit disclaimer: “I’m not 100 % sure—want me to connect you to a teammate?”
Human handoff: Escalate to live chat or Slack.
Auto-log: If no one’s available, a ticket is filed with full context.
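A minimal sketch of that fallback chain, with an assumed confidence cutoff and hypothetical escalate_to_human / file_ticket helpers standing in for the real chat and ticketing integrations.

```python
# Illustrative low-confidence fallback; threshold and integration helpers are assumptions.
CONFIDENCE_FLOOR = 0.6  # assumed cutoff

def respond(answer, confidence, escalate_to_human, file_ticket, context):
    if confidence >= CONFIDENCE_FLOOR:
        return answer
    disclaimer = "I'm not 100% sure. Want me to connect you to a teammate?"
    if escalate_to_human(context):          # live chat or Slack handoff succeeded
        return disclaimer
    # No one available: auto-log a ticket with the full conversation context.
    file_ticket(summary="Low-confidence answer needs review", context=context)
    return disclaimer + " I've logged this for the team to follow up."
```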
Takeaway
By layering grounding, model voting, guardrails, and human review, Fluidworks keeps hallucinations both rare and recoverable, delivering onboarding you can trust.