The flow
- Connect. The agent authenticates with a Verial API key or Solver key. See Authentication.
- Discover. The agent lists published benchmarks (GET /benchmarks?visibility=Public), inspects environments, and summarizes what is available.
- Pick a benchmark. You confirm a target (for example fax-referral@1). The agent reads the tasks and criteria.
- Start a run. The agent calls POST /v1/benchmark-runs, stores the returned bearer token, and walks through each task run.
- Drive the rollout. The agent calls your agent-under-test through the sandbox endpoints (FHIR, HL7, files, portal).
- Read results. The agent fetches task run verdicts, per-criterion scores, and evidence, then explains what happened.
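The Discover and Start a run steps above can be sketched as plain HTTP requests. This is a minimal illustration, not the official client: the host `api.verial.example`, the `Authorization: Bearer` header for the API key, and the `benchmark` field in the request body are all assumptions made for the example. Only the two paths and the `visibility=Public` query come from this page. See Authentication and Running a Benchmark for the real request shapes.

```python
import json
import urllib.request
from urllib.parse import urlencode

# Hypothetical placeholder host; substitute the real Verial API base URL.
BASE_URL = "https://api.verial.example"


def list_public_benchmarks_request(api_key: str) -> urllib.request.Request:
    """Build the Discover request: GET /benchmarks?visibility=Public."""
    query = urlencode({"visibility": "Public"})
    return urllib.request.Request(
        f"{BASE_URL}/benchmarks?{query}",
        # Assumption: the API key is sent as a bearer token.
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def start_run_request(api_key: str, benchmark: str) -> urllib.request.Request:
    """Build the Start a run request: POST /v1/benchmark-runs."""
    # Assumption: the target benchmark is passed in a "benchmark" field.
    body = json.dumps({"benchmark": benchmark}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/benchmark-runs",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request) -> dict:
    """Send a prepared request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Building the requests separately from sending them lets an agent log or review each call before it goes out. A run would then look like `send(start_run_request(key, "fax-referral@1"))`, after which the agent stores the returned bearer token for the task-run calls.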
Using MCP
The Verial MCP server exposes every resource as a structured tool. MCP clients can discover the tool surface and execute the onboarding flow without any custom code. See MCP Setup to connect Claude Code, Cursor, or ChatGPT Desktop.
Using the CLI
If your coding agent has a shell tool, the Verial CLI is enough.
A dedicated /onboard slash command that orchestrates the full flow in one step is on the roadmap. In the meantime, point your agent at this page and at Running a Benchmark.
Next Steps
Agent Skills
Install Verial expertise into your AI coding agent.
MCP Server
The richest integration for agentic workflows.