How it's built

The provider abstraction, the context reduction, the agent loop, and the cost guards that make each sweep cheap and traceable. Enough to judge the engineering, not a marketing tour.

Architecture

swap pointLink flowFixtures today, live Flexpa Link when keys land
Provider interfaceOne contract, either source behind it
FlattenerFHIR bundle to compact TSV rows
AgentLLMBackend, CLI local or BYOK deployed
Eval harnessGraded against authored ground truth
Replay layerCaptured runs, re-served with no live calls

The provider interface is the seam. This build reads sandbox-shaped fixtures through it, and the same interface swaps to live Flexpa Link when API keys land. Nothing downstream changes.

Flattening the context

The reduction is ViewDefinition-inspired, credit to Larry Ditton's Flexpa post on SQL on FHIR for LLM context reduction. Rather than run a full ViewDefinition runtime, Clawback flattens each EOB into one tabular row and serializes it as compact TSV for the model.

On the demo sweep that cut the context from 24,610 raw FHIR tokens to 1,936 flattened tokens, a 92.1% estimated reduction, chars-per-token estimator on both sides of the division. Larry's post reports a similar figure on its own corpus, but the two are not directly comparable. That measurement ran over 14 real EOBs and this one runs over 48 synthetic single-line claims, so treat the closeness as directional, not a benchmark. The token meter shows the paired counts and the dollar cost, and the evals show the scores hold on the smaller context.

Build versus buy

Four decisions every LLM product has to make, and what this demo chose for each.

Concern	What this demo chose
Retrieval	Flatten the bundle to TSV rows. The corpus is small and bounded, so no vector store earns its keep.
Guardrails	Forced JSON schemas via zod, spend caps, rate limits, and provenance required on every finding.
Evals	A committed harness with an LLM judge under a separate strict prompt. Scores and failures are published.
Versioning	Every run persisted, replay artifacts and eval results committed by git SHA.

Who pays, and what degrades

The plausible buyer is more likely an employer, a TPA, or a benefits platform than a consumer paying a subscription. A consumer version most likely works as free monitoring with a success fee on dollars actually recovered, so the member never pays to be told nothing is wrong.

The patient-access economics are real. Identity proofing to IAL2 adds a per-member cost and real funnel friction, and Flexpa's published tiers start at a $20K annual platform license. Those two facts mean the unit economics only close with either scale or a B2B2C channel that spreads the license and the proofing cost across many members.

The findings also degrade on real payer data, not just on this clean synthetic set. The copay-tier and deductible-accumulator findings need plan benefit design data that patient-access Coverage resources frequently do not carry, and the denial findings need CARC codes that payers populate inconsistently. The adjudication-quality script in docs/adjudication-quality.md exists to measure exactly this per payer. The live section stays pending until API keys land.

Stack

Next.js 15	App Router, server components, one deployable
TypeScript	Strict types across agent, flattener, and UI
claude-sonnet-5	Subject and judge, reached through an LLMBackend abstraction
zod	Schema validation on every model response
vitest	180 passing tests across the pure logic

Built by

Life-OS is the agent system I run my own work through, a set of Claude Code agents that handle my pipeline, my research, and my writing every day. Before that I helped build and sell two healthcare companies, most of that work on the reimbursement side, so claims data and denial logic are familiar ground.

I applied to Flexpa on June 10 and built this demo on their rails. The demo is the pitch. Everything here runs on synthetic data, the evals are published, and the token meter shows the real cost.

More at chriscardinal.me.

Clawbacks have always run one direction. This one runs yours.