Welcome - Verial

Verial is the benchmarking platform for healthcare AI agents.

Verial gives healthcare AI teams realistic, reproducible environments to evaluate their agents against. You compose simulated health systems (FHIR EHRs, phone lines, fax, payer portals, clearinghouses, HL7, SFTP, X12, CDS Hooks), author benchmarks of typed tasks, and drive rollouts from your agent. Verial scores every observable outcome with the verification engine so you can compare versions, catch regressions, and ship with confidence.

Simulate full healthcare systems

Run your agent against fully functional FHIR R4 EHRs, voice/IVR lines, fax, payer portals, and clearinghouses. Every protocol your production agent touches, available as a live sandbox endpoint.

Benchmark with typed criteria

Published benchmarks with versioned slugs. Each task has typed assertions (fhir-resource-state, portal-state-match, voice-transcript, etc.) that score every observable outcome, not just LLM-judged prose.
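To make "typed assertions" concrete, here is a minimal sketch of how such criteria might be modeled and checked deterministically. The three criterion type names come from the text above; every field name, the Evidence shape, and the evaluateCriterion function are hypothetical illustrations, not Verial's actual schema or SDK.

```typescript
// Hypothetical shapes: the criterion type names appear in the text above,
// but all field names here are illustrative assumptions.
type Criterion =
  | { type: "fhir-resource-state"; resourceType: string; field: string; expected: string }
  | { type: "portal-state-match"; key: string; expected: string }
  | { type: "voice-transcript"; mustContain: string };

// Hypothetical evidence observed during a rollout.
interface Evidence {
  fhirResources: Array<{ resourceType: string; fields: Record<string, string> }>;
  portalState: Record<string, string>;
  transcript: string;
}

// Deterministic pass/fail check for one criterion (no LLM judge involved).
function evaluateCriterion(c: Criterion, e: Evidence): boolean {
  switch (c.type) {
    case "fhir-resource-state":
      // Passes if any observed resource of the given type has the expected field value.
      return e.fhirResources.some(
        (r) => r.resourceType === c.resourceType && r.fields[c.field] === c.expected
      );
    case "portal-state-match":
      return e.portalState[c.key] === c.expected;
    case "voice-transcript":
      // Case-insensitive substring match on the call transcript.
      return e.transcript.toLowerCase().includes(c.mustContain.toLowerCase());
  }
}

// Example evidence with made-up values.
const evidence: Evidence = {
  fhirResources: [{ resourceType: "Appointment", fields: { status: "booked" } }],
  portalState: { priorAuthStatus: "submitted" },
  transcript: "Thank you, your appointment is confirmed.",
};
```

Because each criterion carries its own type tag, a checker like this can be exhaustive at compile time: adding a new criterion type without handling it in the switch becomes a type error rather than a silent gap.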

Drop in your agent

Solver keys, a stateless v1 API, an official TypeScript SDK, and an MCP server. Drive rollouts from any language or connect directly from Claude, Cursor, or ChatGPT.

Compare, regress, improve

Score runs along axes (correctness, safety, efficiency), compare versions, and regression-test every prompt or tool change. CI integration via GitHub Actions.

How It Works

Compose an environment from simulators and datasets.
Author a benchmark of tasks with typed criteria.
Submit a run. Verial provisions a playground; your agent drives the sandbox endpoints.
The verification engine scores each task, producing per-criterion results, per-axis scores, and a benchmark score.
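The roll-up in the last step (per-criterion results, then per-axis scores, then a benchmark score) can be sketched as follows. The page does not say how Verial aggregates scores, so the unweighted means used here are an assumption, and all type and function names are hypothetical.

```typescript
// Axis names are taken from the page; everything else is an illustrative assumption.
type Axis = "correctness" | "safety" | "efficiency";

interface CriterionResult {
  axis: Axis;
  passed: boolean;
}

// Arithmetic mean; returns 0 for an empty list.
function mean(xs: number[]): number {
  return xs.length === 0 ? 0 : xs.reduce((a, b) => a + b, 0) / xs.length;
}

// Per-axis score for one task: fraction of passed criteria on each axis (assumed).
function axisScores(results: CriterionResult[]): Partial<Record<Axis, number>> {
  const buckets: Partial<Record<Axis, number[]>> = {};
  for (const r of results) {
    const bucket = buckets[r.axis] ?? [];
    bucket.push(r.passed ? 1 : 0);
    buckets[r.axis] = bucket;
  }
  const scores: Partial<Record<Axis, number>> = {};
  for (const axis of Object.keys(buckets) as Axis[]) {
    scores[axis] = mean(buckets[axis] ?? []);
  }
  return scores;
}

// Task score: unweighted mean of that task's axis scores (assumed).
function taskScore(results: CriterionResult[]): number {
  const scores = axisScores(results);
  const values: number[] = [];
  for (const axis of Object.keys(scores) as Axis[]) {
    const s = scores[axis];
    if (s !== undefined) values.push(s);
  }
  return mean(values);
}

// Benchmark score: unweighted mean of task scores (assumed).
function benchmarkScore(tasks: CriterionResult[][]): number {
  return mean(tasks.map(taskScore));
}

// Two made-up tasks.
const task1: CriterionResult[] = [
  { axis: "correctness", passed: true },
  { axis: "correctness", passed: false },
  { axis: "safety", passed: true },
];
const task2: CriterionResult[] = [
  { axis: "correctness", passed: true },
  { axis: "efficiency", passed: false },
];
```

Under these assumptions, task1 scores 0.5 on correctness and 1 on safety, so its task score is 0.75; task2 scores 0.5; the benchmark score is 0.625. A real engine might weight axes or criteria differently.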

Get Started

Quickstart

Drive an agent through a published benchmark in 10 minutes with the v1 API.

Core Concepts

Environments, simulators, benchmarks, criteria, runs, verification.

Simulators

How your agent connects to each simulated healthcare interface.

API Reference

REST endpoints for internal and v1 flows.

Who Is Verial For?

Teams building healthcare AI agents that interact with EHRs, payers, clearinghouses, portals, phone lines, fax, or messaging. Use Verial to benchmark an agent before production, compare versions, and regression-test new prompt or tool changes.

Core Concepts

Concept	Description
Environment	Reusable composition of simulators and datasets
Simulator	A simulated interface (FHIR, HL7, Voice, Fax, Portal, SFTP, X12, CDS Hooks, Messages)
Dataset	Synthetic data loaded into a simulator sandbox
Benchmark	Published, versioned set of tasks referencing an environment
Task	One test case with criteria, optional scenario, and dataset bindings
Criterion	A typed assertion attached to a task
Run	One execution of a benchmark
Verification	How the engine scores task runs using criteria
Playground	A provisioned environment ready for rollouts
Sandbox	A running simulator instance with branching and checkpoints
Interaction	Recorded evidence (transcripts, FHIR logs, fax docs, portal events)
Solver	Per-organization agent identity that submits runs against published benchmarks
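The relationships in the table (an environment composes simulators and datasets, a benchmark references an environment, and each task binds datasets) can be sketched as plain types. This is a minimal illustration under assumed field names, not Verial's actual data model; the findUnboundDatasets helper is likewise hypothetical.

```typescript
// Hypothetical shapes derived from the concept table; all field names are assumptions.
interface Dataset {
  id: string;
  simulator: string; // which simulator sandbox the data is loaded into
}

interface Environment {
  id: string;
  simulators: string[]; // e.g. "FHIR", "Voice", "Fax"
  datasets: Dataset[];
}

interface Task {
  id: string;
  criteria: string[]; // criterion type names, e.g. "fhir-resource-state"
  datasetBindings: string[]; // dataset ids this task depends on
  scenario?: string; // optional, per the table
}

interface Benchmark {
  slug: string;
  version: string;
  environmentId: string;
  tasks: Task[];
}

// Lists "taskId:datasetId" for every binding that the environment does not provide.
function findUnboundDatasets(benchmark: Benchmark, env: Environment): string[] {
  const missing: string[] = [];
  for (const task of benchmark.tasks) {
    for (const datasetId of task.datasetBindings) {
      const found = env.datasets.some((d) => d.id === datasetId);
      if (!found) missing.push(task.id + ":" + datasetId);
    }
  }
  return missing;
}

// Made-up example: the second task binds a dataset the environment lacks.
const exampleEnv: Environment = {
  id: "env-1",
  simulators: ["FHIR", "Voice"],
  datasets: [{ id: "ds-patients", simulator: "FHIR" }],
};

const exampleBenchmark: Benchmark = {
  slug: "example-benchmark",
  version: "1",
  environmentId: "env-1",
  tasks: [
    { id: "task-1", criteria: ["fhir-resource-state"], datasetBindings: ["ds-patients"] },
    { id: "task-2", criteria: ["voice-transcript"], datasetBindings: ["ds-missing"] },
  ],
};
```

A check like this is the kind of validation that makes sense before submitting a run: a task whose dataset binding is absent from the environment cannot be scored meaningfully.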
