Recommended Tool Flow
For most simulation workflows, follow this progression:

- Setup: `simulators` to define interfaces, `environments` to compose them, `datasets` to prepare patient data
- Define: `benchmarks` to create test suites, `tasks` to add test cases, `criteria` to add assertions
- Execute: `benchmark_runs` to start runs, then poll with `get` until status is `Completed`
- Analyze: `task-runs` to see per-task results, `criterion-runs` to see per-criterion reasoning and scores

Tool Chaining Patterns
Create-then-link
Simulators exist independently from environments. Create them first, then attach them.

Benchmark definition
Benchmarks, tasks, and criteria form a hierarchy. Create them top-down.

Run and poll
Start a run, poll for completion, then drill into results.

The `benchmark_runs` `get` response includes the overall score and verdict. Use `task-runs` and `criterion-runs` to understand which specific checks passed or failed.
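
The run-and-poll pattern can be sketched as a bounded polling loop. This is a minimal sketch under stated assumptions, not the documented client API: `call_tool` is a hypothetical callable that invokes a tool action and returns a response dict, and the `id` parameter and `status` field names are illustrative. Only the `benchmark_runs` tool, the `get` action, and the `Completed` status come from this page.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical client signature: call_tool(tool, action, params) -> response dict.
ToolCaller = Callable[[str, str, Dict[str, Any]], Dict[str, Any]]


def wait_for_run(
    call_tool: ToolCaller,
    run_id: str,
    interval_s: float = 5.0,
    max_polls: int = 120,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll `benchmark_runs` `get` until the run reports status `Completed`."""
    for _ in range(max_polls):
        # The "id" parameter and "status" field names are assumptions.
        run = call_tool("benchmark_runs", "get", {"id": run_id})
        if run.get("status") == "Completed":
            return run
        sleep(interval_s)
    raise TimeoutError(f"run {run_id} not Completed after {max_polls} polls")
```

Capping the number of polls keeps a stuck run from blocking the caller indefinitely. Once the run is returned, its overall score and verdict can be read before drilling into `task-runs` and `criterion-runs`.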
Writing Good Criteria
When writing criteria, describe observable outcomes the verification engine can check against sandbox state. Write them like test assertions: specific, observable, and unambiguous.

| Assertion | Quality | Why |
|---|---|---|
| "The agent did a good job" | Bad | Subjective, no observable criteria |
| "A prior auth was submitted" | Okay | Observable but vague about what counts as "submitted" |
| "A prior authorization request was submitted through the payer portal with CPT code 72148" | Good | Specific action, specific channel, specific data point |
| "The 271 eligibility response shows active coverage with plan type PPO" | Good | Specific transaction type, specific fields to check |
| "The agent called the FHIR endpoint GET /Patient and received a 200 response" | Good | Verifiable against interaction logs |
| "The agent handled the error gracefully" | Bad | "Gracefully" is subjective |

One assertion per criterion
Split compound checks into separate criteria rather than combining them. This gives you granular scoring and clearer failure messages.

Error Handling
All tools return errors in a consistent shape. Each error carries one of the following codes:

| Error Code | Meaning | Recommended Action |
|---|---|---|
| `not_found` | Entity does not exist or belongs to a different organization | Verify the ID; use a list action to find valid IDs |
| `validation_error` | Invalid parameters (missing required field, wrong type) | Check the parameter types and required fields in the tool reference |
| `conflict` | Duplicate or conflicting state (e.g., simulator already linked) | Use `get` to check current state before retrying |
| `timeout` | Run exceeded the benchmark timeout | Increase the benchmark timeout value or simplify the task |
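
One way to act on these codes in a client is a simple lookup from code to follow-up. The sketch below assumes the error code has already been extracted from the tool's error response as a string; the extraction is not shown because it depends on the response shape. `recommended_action` is a hypothetical helper, not part of the tool set.

```python
# Follow-ups copied from the table above, keyed by error code.
RECOMMENDED_ACTIONS = {
    "not_found": "Verify the ID; use a list action to find valid IDs",
    "validation_error": (
        "Check the parameter types and required fields in the tool reference"
    ),
    "conflict": "Use get to check current state before retrying",
    "timeout": "Increase the benchmark timeout value or simplify the task",
}

# Fallback for codes this page does not list (an assumption of this sketch).
UNKNOWN_CODE_ACTION = "Unrecognized error code; consult the tool reference"


def recommended_action(error_code: str) -> str:
    """Return the documented follow-up for an error code."""
    return RECOMMENDED_ACTIONS.get(error_code, UNKNOWN_CODE_ACTION)
```

Keeping the mapping in one place means an agent or script can surface the same guidance whenever a tool call fails, rather than handling each code ad hoc.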
Next Steps
- Tools Reference: full parameter documentation for each tool.
- Workflow Examples: step-by-step tool call sequences for common simulation tasks.