LLM Content Trust &
Verification Interface
A design system for AI-generated content interfaces — reusable UX patterns for displaying confidence, source attribution, hallucination detection, and human-in-the-loop review workflows. Born from 98 enterprise user interviews.
Client
Modguard.ai
Role
Lead Product Designer
Timeline
12 weeks
Platform
Enterprise SaaS
Domain
AI/ML Trust & Safety
Year
2024
The Challenge
Enterprise teams adopting LLMs for content generation face a fundamental trust problem: AI outputs look authoritative regardless of accuracy. A confidently stated hallucination is indistinguishable from a verified fact. In regulated industries like healthcare and defense — where Modguard.ai's clients operate — a single unverified claim can trigger compliance violations, patient safety risks, or security breaches.
Existing AI interfaces treated trust as a binary — either you trust the model or you don't. There was no graduated system for communicating varying levels of AI certainty, no inline mechanism for source verification, and no workflow for humans to efficiently review and approve AI-generated content at scale.
I led the design of Modguard's trust and verification interface — a pattern library that makes AI reasoning visible, sources verifiable, and hallucinations catchable. The patterns were developed through 98 user research interviews and validated with enterprise customers in healthcare and defense.
Measured Impact
62%
reduction in content review time with claim-level confidence scoring
A/B test, n=40 enterprise users
34%
improvement in hallucination catch rate before content reaches end users
Validation study, 12-week period
91%
of users said visible AI reasoning increased their trust in the output
User research interviews, n=98
3.2x
faster source verification with inline citation panels vs. traditional footnotes
Task completion analysis, n=32
Design Process
1
Discovery
Weeks 1-3
98 user interviews across healthcare and defense enterprises. Contextual inquiry sessions observing analysts verifying AI content in real workflows.
Key Insight
Users don't distrust AI — they distrust the opacity. 91% said they'd use AI more if they could see its reasoning.
Deliverable: Research synthesis + persona development
2
Pattern Audit
Weeks 3-5
Competitive analysis of 24 AI content tools. Catalogued 60+ trust/verification patterns. Identified gaps in existing approaches.
Key Insight
No tool provided inline confidence at the claim level. All used document-level scores, which mask low-confidence sections.
Deliverable: Pattern library + gap analysis
3
Concept Design
Weeks 5-8
Rapid prototyping of 4 distinct interaction models. Weekly design reviews with compliance officers and domain experts.
Key Insight
Progressive disclosure of AI reasoning outperformed always-visible approaches by 3x in user preference testing.
Deliverable: Hi-fi prototypes for 6 core patterns
4
Validation
Weeks 8-12
A/B testing with 40 enterprise users. Measured time-to-trust, review accuracy, and workflow completion rates.
Key Insight
Claim-level confidence scores reduced review time by 62% while improving error catch rate by 34%.
Deliverable: Validated design system + implementation specs
Research & Discovery
Enterprise User Personas
Elena — The Compliance Lead
12 years in financial compliance · Healthcare enterprise
Needs absolute traceability. Every AI output must have an audit trail. Zero tolerance for unverified claims reaching clients or regulators.
Core Pain Point
Spends 4+ hours daily manually checking AI outputs against source documents
Design Goal
Automated compliance checking that she can trust and defend to auditors
Marcus — The Research Analyst
6 years in equity research · Mid-cap investment firm
Power user who generates 10+ research drafts per week. Comfortable with AI but burned by a hallucinated earnings figure that reached a client deck.
Core Pain Point
Lost trust in AI tools after a high-profile error — now manually verifies everything
Design Goal
Regain confidence in AI drafting with clear confidence signals he can act on quickly
Content Verification Journey
Design Principles
01
Claim-Level Transparency
Move confidence from document-level to individual claims. Every sentence shows its verification status. This emerged from our finding that 91% of users would trust AI more if they could see per-claim reasoning — not just an aggregate score that masks unreliable sections.
02
Progressive Disclosure of Reasoning
AI reasoning is available but not forced. Click to expand any claim's source chain. In testing, this pattern outperformed always-visible reasoning by 3x — users felt in control rather than overwhelmed. Domain experts could drill deep; casual users could skim confidently.
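A minimal TypeScript sketch of the state this pattern implies, assuming the reasoning chain is fetched lazily the first time a claim is expanded; the names (ClaimDisclosure, toggleClaim, fetchReasoning) are illustrative, not Modguard's implementation.

```typescript
// Per-claim disclosure state: the reasoning chain stays collapsed by default.
interface ClaimDisclosure {
  claimId: string;
  expanded: boolean;
  reasoning?: string[]; // source chain, fetched only on first expand
}

// Toggle a claim open or closed, lazy-loading its reasoning the first time.
async function toggleClaim(
  state: ClaimDisclosure,
  fetchReasoning: (claimId: string) => Promise<string[]>
): Promise<ClaimDisclosure> {
  if (state.expanded) {
    return { ...state, expanded: false };
  }
  const reasoning = state.reasoning ?? (await fetchReasoning(state.claimId));
  return { ...state, expanded: true, reasoning };
}
```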
03
Non-Disruptive Flagging
Hallucination alerts appear inline without breaking reading flow. Severity grading lets users triage — address critical issues immediately, batch low-severity items. Designed from our observation that modal-based alerts caused 'alert fatigue' within 48 hours of deployment.
04
Auditable Human-AI Collaboration
Every interaction between human and AI is logged in an immutable audit trail. For healthcare and defense clients, this isn't a nice-to-have — it's a regulatory requirement. The audit trail doubles as a training signal for model improvement.
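One way such an audit trail could be modeled is an append-only, hash-chained log, sketched below in TypeScript. The entry fields and hashing scheme are assumptions for illustration, not Modguard's actual schema.

```typescript
import { createHash } from "node:crypto";

// One logged human-AI interaction. Field names are illustrative.
interface AuditEntry {
  timestamp: string; // ISO 8601
  actor: "human" | "model";
  action: "generate" | "flag" | "accept_fix" | "edit" | "approve";
  claimId: string;
  detail: string;
  prevHash: string; // hash of the previous entry, forming a chain
  hash: string;     // hash of this entry's own contents
}

// Append-only log: each entry commits to the previous one, so any edit to
// history breaks the hash chain and is detectable in an audit.
function appendEntry(
  log: AuditEntry[],
  entry: Omit<AuditEntry, "prevHash" | "hash">
): AuditEntry[] {
  const prevHash = log.length > 0 ? log[log.length - 1].hash : "GENESIS";
  const hash = createHash("sha256")
    .update(JSON.stringify({ ...entry, prevHash }))
    .digest("hex");
  return [...log, { ...entry, prevHash, hash }];
}
```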
Domain Expertise
Building trust interfaces for AI-generated content requires a rare combination: deep AI/ML pattern expertise, regulated-industry experience, and a research-driven approach to user behavior. This project draws directly from three career threads converging into one design system.
Modguard.ai
98 user interviews on AI trust in enterprise workflows. Developed hallucination detection UX patterns, confidence calibration interfaces, and citation visibility systems used by healthcare and defense customers.
Fidelity Investments
Designed advisor scheduling and client management tools used by 12,000+ financial advisors. Deep understanding of how regulated professionals interact with AI-assisted workflows at scale.
Deutsche Bank
Built compliance eLearning platform for global bank operations. First-hand knowledge of regulatory audit requirements and how compliance teams verify content for accuracy.
The hallucination flagging patterns in this design system weren't theoretical — they were tested with real compliance officers reviewing real AI-generated content. The progressive disclosure model came from watching financial analysts toggle between "quick scan" and "deep verify" modes 400+ times across our research sessions. Every pattern is grounded in observed behavior, not assumed preference.
Interactive Prototype
Explore the Pattern Library
Navigate all six core patterns — confidence spectrum, source citations, hallucination alerts, human review workflow, chat interaction models, and trust calibration. Every element is interactive.
Confidence Spectrum · Citations & Sources · Hallucination Alerts · Human Review · Chat Patterns · Trust Calibration
Key Design Decisions
Claim-Level vs. Document-Level Confidence
Every sentence is individually scored and color-coded across four tiers: Verified Fact (≥90%), Sourced Claim (75–89%), AI Inference (60–74%), and Speculative (<60%). Low-confidence claims auto-expand their source panel so reviewers immediately see the AI's reasoning. Users scan for amber and red highlights to prioritize review.
Example tier highlights: Verified 98% · Sourced 89% · Inferred 72% · Speculative 54%
Impact
Review time reduced by 62% — users focus effort where it matters instead of re-reading everything
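A minimal sketch of the tier mapping described above, using the published thresholds; the function names are illustrative, and treating the bottom two tiers as "low confidence" for auto-expansion is an assumption.

```typescript
// The four confidence tiers and thresholds described above.
type Tier = "Verified Fact" | "Sourced Claim" | "AI Inference" | "Speculative";

function tierFor(confidence: number): Tier {
  if (confidence >= 0.9) return "Verified Fact";  // ≥ 90%
  if (confidence >= 0.75) return "Sourced Claim"; // 75–89%
  if (confidence >= 0.6) return "AI Inference";   // 60–74%
  return "Speculative";                           // < 60%
}

// Low-confidence claims auto-expand their source panel for review
// (assumed here to mean the bottom two tiers).
function shouldAutoExpand(confidence: number): boolean {
  const tier = tierFor(confidence);
  return tier === "AI Inference" || tier === "Speculative";
}
```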
Passive Footnotes vs. Active Reasoning Chains
Inline citation chips expand to reveal a step-by-step reasoning chain: what the AI retrieved, how it extracted data, what it cross-referenced, and why it assigned a particular confidence level. One click from claim to full provenance. The reasoning chain pattern was directly informed by how compliance officers explained their own verification process in our 98 interviews.
Example: a citation chip labeled "2 sources" expands into the reasoning chain: 1. Retrieved SEC filing → 2. Extracted revenue data → 3. Cross-referenced transcript → 4. Confidence: 98%
Impact
Source verification speed improved 3.2x — from avg 48s to 15s per claim
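As an illustration, the reasoning chain behind a citation chip might be represented roughly like this; the step labels mirror the example above, while the type and field names are assumptions rather than Modguard's data model.

```typescript
// One step in the provenance chain behind a citation chip.
interface ReasoningStep {
  order: number;
  action: string;     // e.g. "Retrieved SEC filing"
  sourceUrl?: string; // link to the underlying document, when available
}

// The payload an inline citation chip expands into.
interface CitationChip {
  claimId: string;
  sourceCount: number; // rendered inline as "2 sources"
  confidence: number;  // confidence assigned after cross-referencing
  chain: ReasoningStep[];
}

// Example mirroring the mockup above: retrieval → extraction →
// cross-reference → confidence assignment.
const exampleChip: CitationChip = {
  claimId: "claim-042",
  sourceCount: 2,
  confidence: 0.98,
  chain: [
    { order: 1, action: "Retrieved SEC filing" },
    { order: 2, action: "Extracted revenue data" },
    { order: 3, action: "Cross-referenced transcript" },
    { order: 4, action: "Assigned confidence: 98%" },
  ],
};
```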
Post-Hoc Review vs. Real-Time Hallucination Alerts
Real-time hallucination detection flags potential inaccuracies inline as content is generated. Each flag shows: the original claim, the specific issue, the source conflict (what was claimed vs. what sources say), and a one-click suggested correction. Flags are severity-graded (high/medium/low) so users address critical issues first.
Example flag: HALLUCINATION DETECTED · Claimed: $500B → Actual: $487B · Actions: Fix / Remove · Caught before it reaches anyone
Impact
Unverified claims reaching end users dropped from 12% to 0.8% in validation testing
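A sketch of the data a single inline flag would need to carry to support this pattern (severity grade, source conflict, one-click fix); the types and the triage ordering shown are illustrative assumptions.

```typescript
// Severity grading lets reviewers triage: critical issues first, low-severity batched.
type Severity = "high" | "medium" | "low";

// Everything a single inline flag carries, per the pattern above.
interface HallucinationFlag {
  claimId: string;
  severity: Severity;
  claimed: string;      // what the draft asserts, e.g. "$500B"
  actual: string;       // what the cited sources say, e.g. "$487B"
  suggestedFix: string; // one-click replacement text
}

// Review-queue ordering: high-severity flags surface before the rest.
function triage(flags: HallucinationFlag[]): HallucinationFlag[] {
  const rank: Record<Severity, number> = { high: 0, medium: 1, low: 2 };
  return [...flags].sort((a, b) => rank[a.severity] - rank[b.severity]);
}
```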
Complete User Flow
From AI Draft to Approved Memo
Follow an analyst's complete journey reviewing an AI-generated research memo — from initial generation through hallucination flagging, source verification, and final approval.
1. Content Generation
2. Automated Compliance Scan
3. Reviewer Triage
4. Source Verification
5. Iterative Refinement
6. Final Approval

Designed by Rebecka Raj at Modguard.ai, 2024
Patterns developed through 98 user research interviews with enterprise customers in healthcare and defense. Validated with A/B testing across 40 enterprise users over 12 weeks.