LLM Report Verification
A verification platform for LLM-generated reports giving enterprise teams confidence scoring, hallucination detection, and structured review workflows so AI-drafted research can be trusted before it ships.
Client
Modguard.ai
Role
Lead Product Designer
Timeline
12 weeks
Platform
Enterprise SaaS
Domain
AI/ML Trust & Safety
Year
2024
The Challenge
AI-drafted reports look authoritative regardless of accuracy. A hallucinated statistic is indistinguishable from a verified fact, and in regulated industries like healthcare and defense, a single unverified claim can trigger compliance violations or security breaches.
I led the design of Modguard’s verification platform: every claim individually scored, traceable to its source, and reviewable through a structured approval workflow. The design was developed through 98 user interviews and validated with enterprise customers.
Analyst Review Workspace
Walk through the verification experience from the analyst’s perspective: accepting reports, drilling into flagged claims, resolving issues with source evidence, and signing off.
Meridian Corp Q3 Investment Memo
1. Report Queue
2. Open Report
3. Resolve Hallucination
4. Research & Sources
5. Resolve Remaining
6. Mark as Reviewed

Step 1 · Report Generation
Analyst requests a research memo on Meridian Corp’s Q3 earnings. The LLM generates a 2,400-word draft with inline confidence scoring on every claim.
Reports Awaiting Review
Meridian Corp Q3 Investment Memo
2 min ago · 4 alerts
Accept & Review
Healthcare Sector 2025 Outlook
18 min ago · 9 alerts
EM Fixed Income Analysis
1 hr ago · 2 alerts
Measured Impact
62%
reduction in report review time with claim-level confidence scoring
A/B test, n=40 enterprise users
34%
improvement in hallucination catch rate before reports reach clients
Validation study, 12-week period
91%
of reviewers said visible AI reasoning increased their confidence in approving reports
User research interviews, n=98
3.2x
faster source verification with inline citation panels vs. traditional footnotes
Task completion analysis, n=32
Research & Discovery
Enterprise User Personas
Elena · The Compliance Lead
12 years in financial compliance · Healthcare enterprise
Needs absolute traceability. Every LLM report must have an audit trail. Zero tolerance for unverified claims reaching clients.
Core Pain Point
Spends 4+ hours daily manually checking AI reports against sources
Design Goal
Automated verification she can trust and defend to auditors
Marcus · The Research Analyst
6 years in equity research · Mid-cap investment firm
Power user drafting 10+ reports per week with LLMs. Burned by a hallucinated earnings figure that reached a client deck.
Core Pain Point
Lost trust after a high-profile error; now manually verifies everything
Design Goal
Clear confidence signals he can act on quickly
Key Design Decisions
Claim-Level vs. Document-Level Confidence
A single confidence score for the entire document. Users had no way to identify which specific claims were reliable. A document could score 85% while containing a fabricated statistic.
Before
Research Memo Q3 Analysis
87% Confident
No way to identify which claims are unreliable
Every sentence individually scored across four tiers: Verified (≥90%), Sourced (75–89%), Inferred (60–74%), Speculative (<60%). Low-confidence claims auto-expand their source panel.
After
98% Verified · 89% Sourced · 72% Inferred · 54% Speculative
Measured Impact
Report review time reduced by 62%; reviewers focus on flagged claims instead of re-reading entire documents
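To make the tiering concrete: a minimal sketch of the threshold mapping, assuming the bands above. The names are hypothetical, not Modguard’s actual API.

```ts
// Hypothetical sketch of the four-tier mapping; thresholds from the design above.
type ConfidenceTier = "Verified" | "Sourced" | "Inferred" | "Speculative";

// Map a claim's confidence score (0-100) to its display tier.
function tierFor(score: number): ConfidenceTier {
  if (score >= 90) return "Verified";
  if (score >= 75) return "Sourced";
  if (score >= 60) return "Inferred";
  return "Speculative";
}

// Low-confidence claims auto-expand their source panel.
function shouldAutoExpand(score: number): boolean {
  const tier = tierFor(score);
  return tier === "Inferred" || tier === "Speculative";
}
```

Scoring at the claim level rather than the document level is what lets reviewers jump straight to Inferred and Speculative claims instead of re-reading everything.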
Passive Footnotes vs. Active Reasoning Chains
Citations appeared as numbered footnotes at the bottom of the document. Verifying a claim required scrolling to the footnote, then navigating to the source: a four-step process most users skipped.
Before
[1]
FOOTNOTES
[1] SEC Filing, Q3 2024, pg 23...
4 steps to verify; most users skip
Inline citation chips expand to reveal step-by-step reasoning chains: what the AI retrieved, how it extracted data, what it cross-referenced, and why it assigned a confidence level.
After
2 sources
REASONING CHAIN
1. Retrieved SEC filing
2. Extracted revenue data
3. Cross-referenced transcript
4. Confidence: 98%

Measured Impact
Source verification speed improved 3.2x, from an average of 48 seconds to 15 seconds per claim
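A sketch of the data a citation chip could carry, assuming the four chain steps shown above; every field and ID here is illustrative, not the product’s real schema.

```ts
// Illustrative shape for an inline citation chip and its reasoning chain.
interface ReasoningStep {
  kind: "retrieval" | "extraction" | "cross-reference";
  description: string;   // e.g. "Retrieved SEC filing"
  sourceId?: string;     // link back to the underlying document, when one exists
}

interface CitationChip {
  claimId: string;
  sources: string[];       // document IDs backing the claim ("2 sources")
  chain: ReasoningStep[];  // rendered step by step when the chip expands
  confidence: number;      // 0-100, shown as the final step of the chain
}

// The Meridian example from the mockup above (IDs invented for illustration).
const chip: CitationChip = {
  claimId: "claim-042",
  sources: ["sec-10q-q3-2024", "earnings-call-q3-2024"],
  chain: [
    { kind: "retrieval", description: "Retrieved SEC filing", sourceId: "sec-10q-q3-2024" },
    { kind: "extraction", description: "Extracted revenue data" },
    { kind: "cross-reference", description: "Cross-referenced transcript", sourceId: "earnings-call-q3-2024" },
  ],
  confidence: 98,
};
```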
Post-Hoc Review vs. Real-Time Hallucination Alerts
Hallucinations were caught during manual review, if they were caught at all. 12% of LLM reports contained at least one factual error that reached the final draft.
Before
Error found during manual review
...or by the client
Real-time hallucination detection flags inaccuracies inline. Each flag shows the original claim, the issue, the source conflict, and a one-click correction.
After
HALLUCINATION DETECTED
Claimed: $500B → Actual: $487B
Fix
Remove
Caught before it reaches anyone
Measured Impact
Unverified claims in published reports dropped from 12% to 0.8%
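A sketch of what a hallucination flag could look like as data, based only on the UI described above; the severity taxonomy and field names are assumptions.

```ts
// Assumed payload for an inline hallucination flag.
interface HallucinationFlag {
  claimId: string;
  severity: "high" | "medium" | "low";
  claimedText: string;      // e.g. "approached $500 billion"
  sourceConflict: string;   // why no source supports the claim
  suggestedFix?: string;    // e.g. "approached $490 billion"
}

// One-click resolution: apply the suggested fix, or remove the claim entirely.
function resolve(flag: HallucinationFlag, action: "fix" | "remove"): string | null {
  if (action === "fix" && flag.suggestedFix) return flag.suggestedFix;
  return null; // "remove" drops the claim from the draft
}
```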
Explore the Verification Platform
5 reports · Live
Overview
Unverifiable Claims
Hallucinations
The Human Layer
Total Reports
5
Avg Verifiability
83%
Avg Hallucination
8%
Edits Complete
3/5
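The overview numbers are plain means over the scorecard rows below; a sketch of the roll-up, with a hypothetical row shape:

```ts
// Hypothetical per-report row; mirrors the scorecard columns below.
interface ScorecardRow {
  verifiability: number;   // 0-100
  hallucination: number;   // 0-100
  editComplete: boolean;
}

function overview(rows: ScorecardRow[]) {
  const mean = (xs: number[]) => xs.reduce((a, b) => a + b, 0) / xs.length;
  return {
    totalReports: rows.length,
    avgVerifiability: Math.round(mean(rows.map((r) => r.verifiability))), // 82.8 -> 83% for the five reports
    avgHallucination: Math.round(mean(rows.map((r) => r.hallucination))), // 7.6 -> 8%
    editsComplete: rows.filter((r) => r.editComplete).length,             // 3 of 5
  };
}
```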
Report Scorecard
Report | Verifiability | Hallucinations | Rating | Edit Complete
Meridian Corp Q3 Investment Memo (Research Note · GPT-4o · 2,400 words) | 87% | 4% | 4.0/5 | Yes
Healthcare Sector 2025 Outlook (Market Analysis · Claude 3.5 · 3,800 words) | 74% | 12% | 3.1/5 | No
Client Brief Portfolio Rebalancing (Client Memo · GPT-4o · 850 words) | 94% | 1% | 4.8/5 | Yes
ESG Compliance Tech Sector (Compliance · Claude 3.5 · 4,200 words) | 68% | 18% | 2.3/5 | No
EM Fixed Income Analysis (Research Note · GPT-4o · 1,600 words) | 91% | 3% | 4.2/5 | Yes

Complete Verification Flow
From AI-generated draft to approved memo: six steps that take an LLM report from raw output to a verified, auditable deliverable.
Step 1 · Report Generation · AI System · ~8 seconds
Analyst requests a research memo on Meridian Corp’s Q3 earnings. The LLM generates a 2,400-word draft with inline confidence scoring on every claim.
The system processes the earnings transcript, 10-Q filing, and three analyst reports. Each claim in the report is scored as it’s generated.
Step 2 · Automated Compliance Scan · AI System · ~3 seconds
Before the report reaches the reviewer, an automated compliance scan checks for regulatory language, unverified claims, and potential misquotations.
The scan identifies 1 high-severity hallucination (fabricated market cap figure) and 3 unverifiable claims in the draft report.
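A sketch of how scan findings might be modeled, assuming high-severity findings gate the report before a reviewer opens it; the types are illustrative, not the real pipeline.

```ts
// Assumed shape for one finding from the automated compliance scan.
interface ScanFinding {
  type: "hallucination" | "unverifiable-claim" | "regulatory-language" | "misquotation";
  severity: "high" | "medium" | "low";
  claimId: string;
  detail: string;   // e.g. "fabricated market cap figure"
}

// For the Meridian draft: 1 high-severity hallucination + 3 unverifiable claims.
function blocksAutoRelease(findings: ScanFinding[]): boolean {
  return findings.some((f) => f.severity === "high");
}
```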
Step 3 · Reviewer Triage · Human Reviewer · ~30 seconds
Sarah Chen opens the Overview tab. The memo appears with its scores: 87% verifiability, 4% hallucination rate, 3 unverifiable claims to address.
She immediately sees the severity breakdown and prioritizes the hallucination flag before moving to unverifiable claims.
Step 4 · Claim-Level Drilldown · Human Reviewer · ~45 seconds
Sarah switches to the Unverifiable Claims tab and selects the Meridian memo. She reviews 3 flagged sentences the verification model couldn’t confirm.
Each claim shows the source reference, why verification failed, and the confidence score. She accepts two and adds a note to the third.
Step 5 · Hallucination Resolution · Human Reviewer · ~2 minutes
Sarah reviews the hallucination tab. One high-severity flag: the LLM fabricated a $500B market cap figure that doesn’t appear in any source document.
She accepts the suggested fix (‘approached $490 billion’) and the hallucination rate drops from 4% to 0%. The report’s verifiability score updates in real time.
Step 6 · Final Sign-Off · Human Reviewer · ~15 seconds
With all flags resolved, Sarah marks the report as edit-complete in The Human Layer tab. The audit trail captures every decision.
The report is now marked as reviewed with a 4.0/5 rating, visible to the full team in the Overview.
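The six steps imply a simple review lifecycle with an append-only audit trail; a sketch of the states and events, all names assumed rather than taken from the product:

```ts
// Assumed lifecycle states, mapped to the six steps above.
type ReportState =
  | "generated"       // step 1: LLM draft with claim-level scores
  | "scanned"         // step 2: automated compliance scan complete
  | "in-review"       // steps 3-5: triage, drilldown, resolution
  | "edit-complete";  // step 6: signed off, visible to the team

const nextState: Record<ReportState, ReportState | null> = {
  generated: "scanned",
  scanned: "in-review",
  "in-review": "edit-complete",
  "edit-complete": null, // terminal; only the audit trail grows after this
};

// Every human or AI decision lands in the audit trail.
interface AuditEvent {
  actor: "ai-system" | string;  // reviewer ID for human steps
  action: string;               // e.g. "accepted-fix", "marked-reviewed"
  claimId?: string;
  timestamp: string;            // ISO 8601
}
```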
Designed by Rebecka Raj, 2024
Patterns developed through 98 user research interviews with enterprise teams generating LLM reports in healthcare and defense. Validated with A/B testing across 40 enterprise reviewers over 12 weeks.