Tasks are individual test cases within a Benchmark. Each task defines an instruction for the agent, optional trigger conditions, and a set of Evals that determine success or failure.
Note: these endpoints are not yet included in the OpenAPI spec.

Endpoints

Method  Endpoint                           Description
GET     /tasks?benchmark_id={benchmarkId}  List tasks for a benchmark
POST    /tasks                             Create a task
GET     /tasks/{id}                        Get task details
PATCH   /tasks/{id}                        Update a task
DELETE  /tasks/{id}                        Delete a task
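
As a rough illustration of how the paths above map to HTTP requests, the sketch below builds a method and URL pair for the list and single-task endpoints. The base URL, the `TaskRequest` type, and both helper names are hypothetical and not part of the documented API; authentication is left out because it is not described here.

```typescript
// Hypothetical base URL (assumption): replace with the real API host.
const BASE_URL = "https://api.example.com";

// Illustrative shape for a request descriptor; not an SDK type.
type TaskRequest = {
  method: "GET" | "POST" | "PATCH" | "DELETE";
  url: string;
};

// GET /tasks?benchmark_id={benchmarkId}
function listTasksRequest(benchmarkId: string, baseUrl: string = BASE_URL): TaskRequest {
  const url = new URL("/tasks", baseUrl);
  // searchParams handles percent-encoding of the query value.
  url.searchParams.set("benchmark_id", benchmarkId);
  return { method: "GET", url: url.toString() };
}

// GET, PATCH, or DELETE /tasks/{id}
function taskRequest(
  method: "GET" | "PATCH" | "DELETE",
  id: string,
  baseUrl: string = BASE_URL,
): TaskRequest {
  // Encode the id so it stays a single path segment.
  const url = new URL(`/tasks/${encodeURIComponent(id)}`, baseUrl);
  return { method, url: url.toString() };
}
```

Note that the list endpoint takes the benchmark as a `benchmark_id` query parameter, while the single-task endpoints put the task id in the path.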

Task Object

Field            Type             Description
id               string           Unique identifier
benchmark_id     string           Parent Benchmark
name             string           Task name
instruction      string | null    Natural language instruction for the agent
timeout          number | null    Task-level timeout override in seconds
trigger          object | null    Conditions that start the task
tags             string[] | null  Tags for filtering and grouping
organization_id  string           Parent organization
created_at       datetime         Creation timestamp
updated_at       datetime         Last modification timestamp
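
For readers working in TypeScript, the fields above could be modeled roughly as follows. This is a sketch under assumptions: datetimes are taken to be serialized as strings, the internal shape of `trigger` is not documented here so it is left opaque, and the `effectiveTimeout` helper (including the idea of falling back to a benchmark-level default) is hypothetical, inferred only from the word "override" in the `timeout` description.

```typescript
// Sketch of the Task object; field names follow the table above.
interface Task {
  id: string;
  benchmark_id: string;
  name: string;
  instruction: string | null;
  timeout: number | null; // seconds
  trigger: Record<string, unknown> | null; // shape not documented; kept opaque
  tags: string[] | null;
  organization_id: string;
  created_at: string; // assumption: datetime serialized as a string
  updated_at: string; // assumption: datetime serialized as a string
}

// Hypothetical helper: use the task-level timeout when set,
// otherwise fall back to a caller-supplied default (in seconds).
function effectiveTimeout(task: Task, defaultSeconds: number): number {
  return task.timeout ?? defaultSeconds;
}
```

Using `??` instead of `||` keeps an explicit `timeout` of `0` from being replaced by the default; only `null` triggers the fallback.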

SDK Example

// Create a task
const task = await verial.tasks.create({
  benchmarkId: 'bench_abc123',
  name: 'Submit prior auth for MRI',
  instruction: 'Submit a prior authorization request for a knee MRI',
  tags: ['prior-auth', 'imaging'],
})

// List tasks for a benchmark
const tasks = await verial.tasks.list({ benchmarkId: 'bench_abc123' })

// Get a specific task
const details = await verial.tasks.get({ id: task.id })

// Update the task-level timeout (in seconds)
await verial.tasks.update({
  id: task.id,
  timeout: 120,
})

// Delete the task
await verial.tasks.delete({ id: task.id })