Rebecka Raj
Modguard.ai · Case Study 02 · LLM Trust & Safety
Designing trust into LLMs for healthcare and defense.
Around 100 user interviews, three core patterns shipped, and a product acquired with the patterns embedded — across two regulated verticals where verifying LLM output had to fit inside an expert workflow.
Client
Modguard.ai
Role
Design Advisor
Timeline
9 months
Verticals
Healthcare + Defense
Stage
Early → Acquired
Year
2023-2024
Where Modguard played
Two verticals, one product, a shared trust problem.
Vertical 01
Healthcare
Primary user: clinicians reviewing AI-generated patient summaries
High cost of error, time-bound between visits
Compliance: HIPAA, medical liability
The product
Modguard · LLM trust layer
Stage
Early-stage AI/ML startup
Mandate
LLM solutions for regulated industries
My role
Design Advisor, ~9 months
Outcome
Acquired with patterns intact
Vertical 02
Defense
Primary user: intel analysts validating LLM-synthesized briefs
Deadline-bound, fabricated-source risk
Compliance: classification, source provenance
Two domains, a shared trust problem: helping experts act on imperfect output without slowing them down.
The brief I reformulated
The brief I was given, and the brief I reformulated.
· The original brief
Make the LLM feel trustworthy.
"Reduce hallucinations. Add citations. Make the model sound more confident."
Treats trust as an output-quality property
Aims to make the model the source of confidence
Underspecified: 2023-era LLMs hallucinate no matter how the prompt is tuned
Reframed
· The reframed thesis
Make verification cheap enough to absorb imperfect output.
Trust is a workflow property, not an output property.
1
Avoid making the model sound confident — that combines opacity with risk.
2
Make the evidence easier to inspect, not the prose easier to accept.
3
Treat the user as an expert who needs speed, not a novice who needs reassurance.
The reformulation opened the design space. With trust framed as a workflow problem, the question shifted from "how do we improve the model?" to "how do we make verification fast enough that imperfect output becomes manageable?"
Two users
Two users. Both experts. Both time-bound.
User profile 01
The clinician
Reviews LLM-generated patient summaries between visits
What they do with the LLM
Reads the model's summary of a patient chart in the 90 seconds between back-to-back appointments. Decides whether anything in the original record was glossed over.
What scares them
Missing a flag in the chart that the model smoothed over, or spending so long verifying that they fall behind schedule.
User profile 02
The intel analyst
Validates LLM-synthesized briefs under deadline
What they do with the LLM
Receives a synthesized intel brief and has to confirm every claim is sourced before forwarding it. Often facing a hard deadline and dozens of cited passages.
What scares them
Forwarding a well-written paragraph that turns out to cite a source that doesn't exist.
The shared structure: domain experts under time pressure who pay a high cost when wrong. Different chrome, same workflow shape. The patterns we shipped had to work for both.
Three insights from research
~40 clinicians and ~60 analysts later, three findings shaped every pattern we shipped.
01
From clinician interviews
Confidence scores were rejected immediately. "60% confident" is meaningless when the question is whether to act.
What we built
Show evidence, not confidence
Replace probability badges with inline source markers. Strong source = clean text. Weak source = subtle inline glyph. No retrieval = explicit gap.
02
From analyst shadow sessions
Analysts already mentally tag each claim to a source. We weren't introducing a new behavior — we were giving it a UI.
What we built
Inline citations over reference list
Per-sentence provenance. Every claim links back to its source passage, not a footnote at the bottom of the brief.
03
From early prototype testing
Aggressive flagging caused alarm fatigue within minutes. Three banners on a page and clinicians stopped reading the warnings.
What we built
Graded, silent-until-needed
Flag only the most consequential claims. Default state is invisible. Hover or click to inspect. The rest reads as normal text.
Design principle
Make uncertainty legible.
"
The model knows where its evidence is thin. The user usually doesn't.
→ The corollary
Verification must be cheap.
The cost to check has to approach zero, because experts will not trade speed for ceremony. If verification takes longer than skipping it, it won't happen.
The contested tradeoff
Speed vs. warning density: flag every uncertain claim, or only the most consequential? Flag too few → over-trust. Flag too many → alarm fatigue.
Where we landed: flag only the most consequential, with the recognition that this call would need to be revisited if a real adverse event surfaced. A decision made deliberately, with the tradeoffs documented.
Three patterns shipped
Hallucination flagging, citation visibility, guardrail indicators. Each addresses a different failure mode of LLM output, designed to fit inside an expert workflow.
Pattern 01 · Hallucination flagging
Modguard verified · 3 of 5 grounded
Chart summary · M. Chen · F · 58 · MRN 4218
The patient was admitted on March 14 with elevated troponin and chest pain consistent with prior records. [1] A previous note suggests reduced ejection fraction in the 35–40% range, though the source is from an outside facility and was not re-confirmed. [! weak] The patient's response to beta-blocker therapy has been favorable across recent visits. [2] Current medications include lisinopril 10mg, atorvastatin 40mg, and metoprolol 50mg, per the active medication list.
Hover the dashed claim above. The marker is silent until inspected. Three states: clean text (strong source), inline marker (weak), explicit gap (no retrieval). No confidence badges.
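The three display states above can be sketched as a single mapping from a claim's retrieval result to a render state. This is an illustrative sketch, not Modguard's shipped code; the names (`Retrieval`, `displayState`) and the 0.8 threshold are assumptions.

```typescript
// Illustrative types for a claim's retrieval result. Not the product schema.
type Retrieval = { found: boolean; sourceScore?: number };

// The three silent-until-inspected states described above:
// strong source -> clean text, weak source -> inline marker, no retrieval -> explicit gap.
type EvidenceState = "clean" | "weak" | "gap";

// Map retrieval strength to a display state. No probability badge is ever
// shown; the score only decides which of the three states to render.
function displayState(r: Retrieval, strongThreshold = 0.8): EvidenceState {
  if (!r.found) return "gap"; // no retrieval: render an explicit gap, never smooth prose
  return (r.sourceScore ?? 0) >= strongThreshold ? "clean" : "weak";
}
```

The point of the sketch is that "show evidence, not confidence" collapses a continuous model score into three legible states, so the expert never has to interpret a number.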
Pattern 02 · Citation visibility
4 of 4 grounded
Synthesized brief · case 2241 · Pattern of life
The subject was last documented at the southern checkpoint at 14:32 local time.[1] Border crossings in the preceding 90 days indicate a pattern of weekday transits, primarily Tuesday and Thursday.[2] Vehicle registration matches a 2019 sedan registered to a regional logistics firm.[3] Communications metadata for the same window suggests sustained contact with two known associates.[4]
Click any citation [n] in the brief to verify the source passage.
Before
Open new tab, find source, scroll, return
~60s
After
Click sentence. Source rises beside it.
~3s
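The before/after gap comes from a data-shape decision: each sentence carries its own source passage, so verification is a lookup rather than an open-tab-and-search round trip. A minimal sketch, assuming a hypothetical `Claim` shape (the field names are illustrative, not the shipped schema):

```typescript
// Per-sentence provenance: every claim stores the exact source span it came from.
interface Claim {
  text: string;     // the sentence as rendered in the brief
  citation: number; // the [n] marker shown inline
  sourceId: string; // document the passage was retrieved from
  passage: string;  // the exact source span, shown beside the sentence on click
}

// Clicking [n] resolves directly to the stored passage; no new tab, no scrolling.
function passageFor(claims: Claim[], citation: number): string | undefined {
  return claims.find((c) => c.citation === citation)?.passage;
}
```

Storing the passage at write time, rather than re-fetching it at read time, is what makes the ~3-second interaction possible.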
Pattern 03 · Guardrail indicators
~6% deferral false-positive rate
In competence · chart review · persistent
Summarize this patient's last three visits.
Three visits over the past 14 months. Recurring presentation of intermittent tachycardia, evaluated by cardiology in May. Lab trends within range. Current medication list reconciled at last visit.
Near hard zone · dosage · deferral
What dose of metoprolol should I start her on?
Not in my competence
Dosage isn't something I should be the source of truth on. Here's where it lives:
Handoff, not refusal. The status strip stays visible across the whole session, not just at deferral moments. Three rule-based layers detect the boundary: an input classifier, a retrieval-source filter, and prompt-level constraints. For this function, auditable beats accurate: every rule can be read and certified by a compliance reviewer, which a learned classifier can't offer.
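The layered structure can be sketched as a short pipeline. Every rule below is invented for illustration (the real classifiers were product-specific), but the shape shows why the approach is auditable: each layer is a rule a reviewer can read.

```typescript
// Illustrative guardrail pipeline. All rule contents are hypothetical.
type Verdict = { allowed: boolean; reason?: string };

// Layer 1: input classifier, a readable pattern for out-of-competence questions.
const DOSAGE_PATTERN = /\b(dose|dosage|mg)\b/i;

// Layer 2: retrieval-source filter, only approved record types may ground an answer.
const APPROVED_SOURCES = new Set(["chart", "lab", "medication-list"]);

function guardrail(input: string, retrievedFrom: string[]): Verdict {
  if (DOSAGE_PATTERN.test(input)) {
    return { allowed: false, reason: "dosage question: defer to the formulary" };
  }
  if (retrievedFrom.some((s) => !APPROVED_SOURCES.has(s))) {
    return { allowed: false, reason: "unapproved retrieval source" };
  }
  // Layer 3 (prompt-level constraints) lives in the system prompt, not in this function.
  return { allowed: true };
}
```

A deferral here is a handoff with a stated reason, which is what the status strip surfaces to the user.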
What the work actually moved
Acquisition
Acquired
Modguard was acquired with the trust patterns intact in the product surface.
Verification time
~10x faster
Per-citation verification time, from minutes-per-claim to seconds-per-claim in the analyst workflow.
Trust metric
CTR
Citation click-through rate adopted as a leading trust metric, replacing post-hoc trust surveys.
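As a behavioral metric, citation CTR is just clicks over renders, aggregated across sessions. A minimal sketch under that assumption (the `CitationEvent` shape is illustrative):

```typescript
// Hypothetical per-session counts: citations rendered vs. citations inspected.
interface CitationEvent { rendered: number; clicked: number }

// Fraction of rendered citations the expert actually opened, a behavioral
// signal that replaces asking "do you trust this?" after the fact.
function citationCTR(sessions: CitationEvent[]): number {
  const rendered = sessions.reduce((n, s) => n + s.rendered, 0);
  const clicked = sessions.reduce((n, s) => n + s.clicked, 0);
  return rendered === 0 ? 0 : clicked / rendered;
}
```

Because it is computed from what users do rather than what they report, it moves with every release instead of waiting for a survey cycle.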
// the lesson I'm taking with me
In regulated industries, deferring well is more valuable than answering well. Asymmetric costs require asymmetric design.
// what generalizes
Trust is a workflow property.
Graded flagging, hover-to-verify, persistent competence boundaries — the patterns travel from clinical chart review to intel briefing to any expert-facing LLM surface.
Legal review · Financial reporting · Compliance audit
Designed by Rebecka Raj · Modguard.ai · 2023-2024
The patterns shipped before the acquisition and survived it. They remain the spine of the trust surface in the acquiring company's regulated-industry product line.