Benchmarks group Tasks together into a test suite for evaluating an AI agent. Each benchmark references an Environment and defines timeout and concurrency settings for execution.

Endpoints

| Method | Endpoint | Description |
| --- | --- | --- |
| GET | /benchmarks | List benchmarks |
| POST | /benchmarks | Create a benchmark |
| GET | /benchmarks/{id} | Get benchmark details |
| PATCH | /benchmarks/{id} | Update a benchmark |
| DELETE | /benchmarks/{id} | Delete a benchmark |
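
For callers that do not use the SDK, the endpoint table can be sketched as a small helper that maps each operation to its HTTP method and URL. This is an illustration only: `BASE_URL` is a placeholder, `benchmarkRequest` and `RequestSpec` are hypothetical names, and authentication is not covered by this page, so it is left out.

```typescript
// Placeholder host: the real base URL is not stated in this reference.
const BASE_URL = 'https://api.example.com'

type BenchmarkAction = 'list' | 'create' | 'get' | 'update' | 'delete'

interface RequestSpec {
  method: 'GET' | 'POST' | 'PATCH' | 'DELETE'
  url: string
}

// Map an action from the endpoint table to its HTTP method and URL.
// Collection actions ignore `id`; item actions require it.
function benchmarkRequest(action: BenchmarkAction, id?: string): RequestSpec {
  const collection = `${BASE_URL}/benchmarks`
  if (action === 'list') return { method: 'GET', url: collection }
  if (action === 'create') return { method: 'POST', url: collection }
  if (id === undefined) {
    throw new Error(`"${action}" requires a benchmark id`)
  }
  const methods = { get: 'GET', update: 'PATCH', delete: 'DELETE' } as const
  return {
    method: methods[action],
    url: `${collection}/${encodeURIComponent(id)}`,
  }
}

// Illustrative id only.
const spec = benchmarkRequest('update', 'bm_123')
console.log(spec.method, spec.url)
```

The resulting `method` and `url` can be handed to any HTTP client, together with whatever credentials and JSON body the API requires.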

Benchmark Object

| Field | Type | Description |
| --- | --- | --- |
| id | string | Unique identifier |
| name | string | Benchmark name |
| description | string \| null | Optional description |
| timeout | number | Max execution time in seconds |
| concurrency | number | Max concurrent task executions |
| organization_id | string | Parent organization |
| created_at | datetime | Creation timestamp |
| updated_at | datetime | Last modification timestamp |
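
The fields above translate directly into a TypeScript type, and a runtime guard can help when parsing raw API responses. The sketch below assumes that `datetime` fields are serialized as strings in JSON (for example ISO 8601), which this page does not confirm; `Benchmark` and `isBenchmark` are illustrative names, not part of the SDK.

```typescript
// Shape of the Benchmark object, field for field from the table.
// Assumption: datetime values arrive as strings in JSON.
interface Benchmark {
  id: string
  name: string
  description: string | null
  timeout: number // seconds
  concurrency: number
  organization_id: string
  created_at: string
  updated_at: string
}

// Check that an unknown value has every documented field with the documented type.
function isBenchmark(value: unknown): value is Benchmark {
  if (typeof value !== 'object' || value === null) return false
  const v = value as Record<string, unknown>
  const stringFields = ['id', 'name', 'organization_id', 'created_at', 'updated_at']
  return (
    stringFields.every((key) => typeof v[key] === 'string') &&
    (typeof v['description'] === 'string' || v['description'] === null) &&
    typeof v['timeout'] === 'number' &&
    typeof v['concurrency'] === 'number'
  )
}

// Illustrative payload: every value here is made up for the example.
const payload: unknown = JSON.parse(`{
  "id": "bm_123",
  "name": "Prior Auth E2E",
  "description": null,
  "timeout": 300,
  "concurrency": 5,
  "organization_id": "org_123",
  "created_at": "2024-01-01T00:00:00Z",
  "updated_at": "2024-01-01T00:00:00Z"
}`)

if (isBenchmark(payload)) {
  console.log(`${payload.name}: timeout ${payload.timeout}s`)
}
```

A guard like this only checks types; it does not validate ranges such as a positive `timeout` or `concurrency`.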

SDK Example

// Create a benchmark
const benchmark = await verial.benchmarks.create({
  name: 'Prior Auth E2E',
  environmentId: 'env_abc123',
  timeout: 300,
  concurrency: 5,
})

// List all benchmarks
const benchmarks = await verial.benchmarks.list()

// Get a specific benchmark
const details = await verial.benchmarks.get({ id: benchmark.id })

// Update the timeout to 600 seconds
await verial.benchmarks.update({
  id: benchmark.id,
  timeout: 600,
})

// Delete the benchmark
await verial.benchmarks.delete({ id: benchmark.id })