Eval Runs contain the result of a single Eval assertion within a Task Run. The LLM judge evaluates the assertion against evidence collected from sandbox interactions and produces a score and detailed reasoning.
These endpoints are not yet included in the OpenAPI spec.

## Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | `/eval-runs?task_run_id={taskRunId}` | List eval runs for a task run |
| GET | `/eval-runs/{id}` | Get eval run details |
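As a rough sketch, the list endpoint can be called directly over HTTP. The base URL and bearer-token auth here are assumptions for illustration, not part of the documented API:

```typescript
// Assumption: replace with the real API host; auth scheme is also assumed.
const BASE_URL = 'https://api.example.com'

function evalRunsUrl(taskRunId: string): string {
  // URLSearchParams percent-encodes the task_run_id value
  const params = new URLSearchParams({ task_run_id: taskRunId })
  return `${BASE_URL}/eval-runs?${params.toString()}`
}

async function listEvalRuns(taskRunId: string, apiKey: string) {
  const res = await fetch(evalRunsUrl(taskRunId), {
    headers: { Authorization: `Bearer ${apiKey}` },
  })
  if (!res.ok) throw new Error(`HTTP ${res.status}`)
  return res.json()
}
```

The SDK shown below wraps these calls, so building URLs by hand is only needed when working without it.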

## Eval Run Object

| Field | Type | Description |
| --- | --- | --- |
| `id` | string | Unique identifier |
| `task_run_id` | string | Parent Task Run |
| `eval_id` | string | Source Eval |
| `result` | string | Pass/fail result |
| `score` | number | Score for this assertion (0–1) |
| `details` | string \| null | LLM judge reasoning |
| `started_at` | datetime | When evaluation started |
| `completed_at` | datetime \| null | When evaluation finished |
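The field table above can be sketched as a TypeScript interface. The field names come from the table; the interface name, the exact `result` values, and the ISO 8601 datetime encoding are assumptions:

```typescript
// Sketch of the Eval Run object, assuming datetimes arrive as ISO 8601 strings.
interface EvalRun {
  id: string
  task_run_id: string
  eval_id: string
  result: string          // pass/fail result; exact values not specified above
  score: number           // 0-1
  details: string | null  // LLM judge reasoning
  started_at: string      // when evaluation started
  completed_at: string | null // null while the evaluation is still running
}

// Example use: wall-clock evaluation duration in ms, null if still running
function evalDurationMs(run: EvalRun): number | null {
  if (run.completed_at === null) return null
  return Date.parse(run.completed_at) - Date.parse(run.started_at)
}
```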

## SDK Example

```typescript
// List eval runs for a task run
const evalRuns = await verial.evalRuns.list({ taskRunId: 'tr_abc123' })

// Get eval run details (includes LLM judge reasoning)
const details = await verial.evalRuns.get({ id: evalRuns.data[0].id })

console.log(`${details.result}: ${details.details}`)
```