Interactions

An interaction is the evidence a Sandbox records while the agent drives a rollout. Verial records every request to a FHIR store, every HL7 outbound message, every portal form submit, every voice turn, every uploaded file, and every X12 response. After the task run completes, the verification engine reads these interactions as it runs each Criterion.

Evidence by Simulator

Each simulator type produces its own evidence shape. The verification engine dispatches to a check implementation keyed by assertion.assert, and each check pulls evidence from the matching source.

Simulator	Evidence
FHIR	HTTP request/response log: method, path, request body, status code, response body. Verification runs FHIR searches against the final store state.
HL7	Outbound HL7v2 messages recorded as `hl7_outbound` sandbox events, with the full message payload (MSH, PID, PV1, OBX, etc.).
Voice	Call turns with speaker (`agent` / `caller`) and transcribed text, plus the full recording.
Fax	Inbound or outbound fax document, with OCR text extracted for assertion.
Portal	Sandbox events per action: form submits, patient searches, auth submissions, with the submitted payload and the resulting state row.
Files / SFTP	Uploaded file metadata (path, size) plus the raw content in object storage.
X12	Submitted and response records per transaction (270/271/276/277/278).
CDS Hooks	Hook invocations and the cards returned by the agent.
Message	Outbound SMS/text messages with the rendered body.
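
To make the table more concrete, here is a minimal sketch of how two of these evidence shapes might be modeled on the client side. The class and field names (`FhirLogEntry`, `VoiceTurn`, `request_body`, and so on) are assumptions chosen to mirror the table, not the actual Verial schema, and the sample turns are made up for illustration.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass(frozen=True)
class FhirLogEntry:
    """One FHIR HTTP exchange. Field names are illustrative, not an official schema."""
    method: str
    path: str
    request_body: Optional[dict]
    status_code: int
    response_body: Optional[dict]


@dataclass(frozen=True)
class VoiceTurn:
    """One turn of a call; the speaker is either `agent` or `caller`."""
    speaker: Literal["agent", "caller"]
    text: str


def agent_utterances(turns):
    """Return the transcribed text of every agent turn, in call order."""
    return [t.text for t in turns if t.speaker == "agent"]


# Made-up sample data, for illustration only.
turns = [
    VoiceTurn("caller", "Hi, I need to reschedule."),
    VoiceTurn("agent", "Sure, what date works for you?"),
]
```

A check over voice evidence would typically start from a filter like `agent_utterances(turns)` before matching on the text.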

How the Verification Engine Reads Interactions

For each criterion on the task, the engine:

1. Reads assertion.assert to pick a check implementation.
2. Pulls the relevant evidence from the sandbox (a FHIR search against the store, HL7 outbound rows, portal state rows, voice turns, SFTP objects, X12 responses).
3. Runs the typed assertion against that evidence.
4. Writes a Criterion Run with passed, score, details, and the evidence it considered.
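
The steps above can be sketched as a small dispatch loop. This is an illustration under stated assumptions, not the engine's implementation: the check key `hl7-message-sent`, the `message_type` assertion field, and the dictionary shapes for criteria and sandbox events are all hypothetical. Only `assertion.assert`, the `hl7_outbound` event type, and the Criterion Run fields (passed, score, details, evidence) come from this page.

```python
from dataclasses import dataclass, field


@dataclass
class CriterionRun:
    passed: bool
    score: float
    details: str
    evidence: list = field(default_factory=list)


def msh9(payload):
    """Return MSH-9 (message type) from a raw HL7v2 message, or '' if absent."""
    fields = payload.split("\r")[0].split("|")
    # MSH-1 is the field separator itself, so MSH-9 sits at index 8 after splitting.
    return fields[8] if len(fields) > 8 else ""


def check_hl7_message_sent(assertion, sandbox):
    """Hypothetical check: pass if any outbound HL7 message has the wanted type."""
    rows = [e for e in sandbox["events"] if e["type"] == "hl7_outbound"]
    matched = [r for r in rows if msh9(r["payload"]).startswith(assertion["message_type"])]
    passed = bool(matched)
    details = f"{len(matched)} of {len(rows)} outbound messages matched"
    return CriterionRun(passed, 1.0 if passed else 0.0, details, rows)


# Dispatch table keyed by assertion["assert"] (the key name is an assumption).
CHECKS = {"hl7-message-sent": check_hl7_message_sent}


def run_criterion(criterion, sandbox):
    assertion = criterion["assertion"]
    check = CHECKS.get(assertion["assert"])
    if check is None:
        return CriterionRun(False, 0.0, f"no check for {assertion['assert']!r}")
    return check(assertion, sandbox)


sandbox = {"events": [
    {"type": "hl7_outbound",
     "payload": "MSH|^~\\&|APP|FAC|LAB|FAC|20240101||ORU^R01|1|P|2.5\rPID|1"},
]}
criterion = {"assertion": {"assert": "hl7-message-sent", "message_type": "ORU"}}
result = run_criterion(criterion, sandbox)
```

Keeping the evidence on the returned record, rather than only the verdict, is what lets a failed criterion be debugged later without replaying the rollout.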

See Verification for the full dispatch table and scoring rules.

Reading Interactions

Interactions surface in two places:

Per sandbox: GET /sandboxes/{id}/events returns the raw event log for a sandbox. Useful for debugging a rollout or authoring new criteria from real traces.
Per criterion run: GET /criterion-runs/{id} returns the specific evidence the check considered for that criterion, with field-level diffs where applicable.
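
A minimal sketch of calling these two endpoints from Python's standard library follows. The base URL is a placeholder, and bearer-token authentication and a JSON response body are assumptions; only the two paths come from this page.

```python
import json
import urllib.request
from urllib.parse import quote

# Placeholder host: substitute the real API base URL.
BASE_URL = "https://api.example.invalid"


def events_url(sandbox_id, base_url=BASE_URL):
    """Build the URL for GET /sandboxes/{id}/events."""
    return f"{base_url}/sandboxes/{quote(sandbox_id, safe='')}/events"


def criterion_run_url(run_id, base_url=BASE_URL):
    """Build the URL for GET /criterion-runs/{id}."""
    return f"{base_url}/criterion-runs/{quote(run_id, safe='')}"


def get_json(url, token):
    """GET a URL and decode the JSON body. Bearer auth is an assumption."""
    req = urllib.request.Request(
        url,
        headers={"Authorization": f"Bearer {token}", "Accept": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

For example, `get_json(events_url("sbx_123"), token)` would return the raw event log for that sandbox, where `sbx_123` is a made-up identifier.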

During a scored benchmark run, per-field evidence is omitted from completion responses so the agent cannot learn the rubric. Fetch the full evidence later from GET /criterion-runs/{id}.

Retention

Interactions persist after a playground is torn down. Teardown releases the live resources (phone numbers, FHIR stores, portal users) but keeps every recorded event so you can review evidence, debug failed criteria, and compare rollouts across benchmark runs.

Next Steps

Verification

Sandboxes API