Verial is the benchmarking platform for healthcare AI agents.
Verial gives healthcare AI teams realistic, reproducible environments to evaluate their agents against. You compose simulated health systems (FHIR EHRs, phone lines, fax, payer portals, clearinghouses, HL7, SFTP, X12, CDS Hooks), author benchmarks of typed tasks, and drive rollouts from your agent. Verial scores every observable outcome with the verification engine so you can compare versions, catch regressions, and ship with confidence.
Simulate full healthcare systems
Run your agent against fully functional FHIR R4 EHRs, voice/IVR lines, fax, payer portals, and clearinghouses. Every protocol your production agent touches, available as a live sandbox endpoint.
Benchmark with typed criteria
Published benchmarks with versioned slugs. Each task has typed assertions (`fhir-resource-state`, `portal-state-match`, `voice-transcript`, etc.) that score every observable outcome, not just LLM-judged prose.
Drop in your agent
Solver keys, a stateless v1 API, an official TypeScript SDK, and an MCP server. Drive rollouts from any language or connect directly from Claude, Cursor, or ChatGPT.
Compare, regress, improve
Score runs along axes (correctness, safety, efficiency), compare versions, and regression-test every prompt or tool change. CI integration via GitHub Actions.
How It Works
- Compose an environment from simulators and datasets.
- Author a benchmark of tasks with typed criteria.
- Submit a run. Verial provisions a playground; your agent drives the sandbox endpoints.
- The verification engine scores each task, producing per-criterion results, per-axis scores, and a benchmark score.
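The last step above rolls per-criterion results up into per-axis scores and a benchmark score. The sketch below illustrates one way such a rollup could work. It is not the Verial SDK or API schema: the type names, the `scoreTask` and `scoreBenchmark` helpers, and the unweighted-mean aggregation are all assumptions made for illustration. Only the axis names (correctness, safety, efficiency) come from this page.

```typescript
// Hypothetical shapes for illustration; not the Verial SDK or API schema.
type Axis = "correctness" | "safety" | "efficiency";

interface CriterionResult {
  criterion: string; // e.g. "fhir-resource-state"
  axis: Axis;
  passed: boolean;
}

interface TaskScore {
  perAxis: Partial<Record<Axis, number>>;
  overall: number;
}

// Assumption: an axis score is the fraction of its criteria that passed,
// and a task score is the unweighted mean of its axis scores.
function scoreTask(results: CriterionResult[]): TaskScore {
  const counts: Partial<Record<Axis, { passed: number; total: number }>> = {};
  for (const r of results) {
    const c = counts[r.axis] ?? { passed: 0, total: 0 };
    c.total += 1;
    if (r.passed) c.passed += 1;
    counts[r.axis] = c;
  }

  const perAxis: Partial<Record<Axis, number>> = {};
  let sum = 0;
  let axes = 0;
  for (const axis of Object.keys(counts) as Axis[]) {
    const c = counts[axis];
    if (!c) continue;
    const score = c.passed / c.total;
    perAxis[axis] = score;
    sum += score;
    axes += 1;
  }
  return { perAxis, overall: axes === 0 ? 0 : sum / axes };
}

// Assumption: the benchmark score is the unweighted mean of task scores.
function scoreBenchmark(tasks: TaskScore[]): number {
  if (tasks.length === 0) return 0;
  return tasks.reduce((acc, t) => acc + t.overall, 0) / tasks.length;
}
```

Keeping per-axis scores separate from the overall number means a safety regression stays visible even when the overall score holds steady. The actual weighting Verial applies may differ from this sketch.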
Get Started
Quickstart
Drive an agent through a published benchmark in 10 minutes with the v1 API.
Core Concepts
Environments, simulators, benchmarks, criteria, runs, verification.
Simulators
How your agent connects to each simulated healthcare interface.
API Reference
REST endpoints for internal and v1 flows.
Who Is Verial For?
Teams building healthcare AI agents that interact with EHRs, payers, clearinghouses, portals, phone lines, fax, or messaging. Use Verial to benchmark an agent before production, compare versions, and regression-test new prompt or tool changes.
Core Concepts
| Concept | Description |
|---|---|
| Environment | Reusable composition of simulators and datasets |
| Simulator | A simulated interface (FHIR, HL7, Voice, Fax, Portal, SFTP, X12, CDS Hooks, Messages) |
| Dataset | Synthetic data loaded into a simulator sandbox |
| Benchmark | Published, versioned set of tasks referencing an environment |
| Task | One test case with criteria, optional scenario, and dataset bindings |
| Criterion | A typed assertion attached to a task |
| Run | One execution of a benchmark |
| Verification | How the engine scores task runs using criteria |
| Playground | A provisioned environment ready for rollouts |
| Sandbox | A running simulator instance with branching and checkpoints |
| Interaction | Recorded evidence (transcripts, FHIR logs, fax docs, portal events) |
| Solver | Per-organization agent identity that submits runs against published benchmarks |
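To show how several of these concepts relate, here is a rough TypeScript data model. It is a sketch under stated assumptions, not Verial's schema: every field name (`resourceType`, `expected`, `mustMention`, `slug`, `version`, and so on) and the `describeCriterion` helper are hypothetical. Only the three criterion type strings come from this page.

```typescript
// Hypothetical data model; field names are illustrative, not Verial's schema.

// A criterion is a typed assertion. The `type` tag selects the payload shape.
type Criterion =
  | { type: "fhir-resource-state"; resourceType: string; expected: Record<string, unknown> }
  | { type: "portal-state-match"; expected: Record<string, unknown> }
  | { type: "voice-transcript"; mustMention: string[] };

// A task is one test case with criteria and an optional scenario.
interface Task {
  id: string;
  criteria: Criterion[];
  scenario?: string;
}

// An environment is a reusable composition of simulators and datasets.
interface Environment {
  simulators: string[];
  datasets: string[];
}

// A benchmark is a published, versioned set of tasks referencing an environment.
interface Benchmark {
  slug: string;
  version: string;
  environment: Environment;
  tasks: Task[];
}

// The switch is checked for exhaustiveness by the compiler, so adding a
// new criterion type without handling it here is a compile-time error.
function describeCriterion(c: Criterion): string {
  switch (c.type) {
    case "fhir-resource-state":
      return `FHIR ${c.resourceType} matches expected state`;
    case "portal-state-match":
      return "portal state matches expected state";
    case "voice-transcript":
      return `transcript mentions ${c.mustMention.join(", ")}`;
  }
}

// Count how many criteria a benchmark defines across all of its tasks.
function countCriteria(b: Benchmark): number {
  return b.tasks.reduce((n, t) => n + t.criteria.length, 0);
}
```

Modeling criteria as a discriminated union keeps each assertion type's payload distinct, which mirrors the page's emphasis on typed criteria rather than free-form, LLM-judged checks.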