Interface BenchmarkOutcome

Outcome data for a single benchmark run within a benchmark job, representing results for one agent configuration.

interface BenchmarkOutcome {
    agent_name: string;
    benchmark_run_id: string;
    n_completed: number;
    n_failed: number;
    n_timeout: number;
    scenario_outcomes: Runloop.BenchmarkJobView.BenchmarkOutcome.ScenarioOutcome[];
    average_score?: number | null;
    duration_ms?: number | null;
    model_name?: string | null;
}

Index

Properties

agent_name benchmark_run_id n_completed n_failed n_timeout scenario_outcomes average_score? duration_ms? model_name?

Properties

agent_name

agent_name: string

The name of the agent configuration used.

benchmark_run_id

benchmark_run_id: string

The ID of the benchmark run.

n_completed

n_completed: number

Number of scenarios that completed successfully.

n_failed

n_failed: number

Number of scenarios that failed.

n_timeout

n_timeout: number

Number of scenarios that timed out.
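
The three counters can be combined into a completion rate. The sketch below is illustrative only: `OutcomeCounts` and `completionRate` are hypothetical names, not part of the SDK, and it assumes the counters are mutually exclusive so that their sum is the number of scenarios attempted, which this page does not state.

```typescript
// Local structural type mirroring the three counter fields documented above.
// Hypothetical: declared here only so the example is self-contained.
interface OutcomeCounts {
  n_completed: number;
  n_failed: number;
  n_timeout: number;
}

// Hypothetical helper, not an SDK function.
// Returns the fraction of scenarios that completed, or null when no
// scenarios were counted (avoids dividing by zero).
function completionRate(counts: OutcomeCounts): number | null {
  // Assumption: each scenario is counted in exactly one of the three fields.
  const total = counts.n_completed + counts.n_failed + counts.n_timeout;
  return total === 0 ? null : counts.n_completed / total;
}

const rate = completionRate({ n_completed: 8, n_failed: 1, n_timeout: 1 });
console.log(rate); // 0.8
```

Returning `null` for an empty run keeps "no data" distinct from a genuine rate of 0.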

scenario_outcomes

scenario_outcomes: Runloop.BenchmarkJobView.BenchmarkOutcome.ScenarioOutcome[]

Detailed outcomes for each scenario in this benchmark run.

`Optional` average_score

average_score?: number | null

Average score across all completed scenarios (0.0 to 1.0).
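
Because `average_score` is typed `number | null` and is also optional, a consumer has to handle three cases: a number, `null`, and a missing property. The following is a minimal sketch of one way to do that. `ScoredOutcome` and `formatScore` are hypothetical names introduced for illustration, and the percentage formatting is just one presentation choice.

```typescript
// Local structural type with only the fields this example needs.
// Hypothetical: mirrors the documented property types.
interface ScoredOutcome {
  agent_name: string;
  average_score?: number | null;
}

// Hypothetical helper, not an SDK function.
function formatScore(outcome: ScoredOutcome): string {
  // `== null` is true for both null and undefined, but not for 0,
  // so a legitimate score of 0.0 is still reported.
  if (outcome.average_score == null) {
    return `${outcome.agent_name}: no score`;
  }
  return `${outcome.agent_name}: ${(outcome.average_score * 100).toFixed(1)}%`;
}

console.log(formatScore({ agent_name: "agent-a", average_score: 0.75 })); // agent-a: 75.0%
console.log(formatScore({ agent_name: "agent-b", average_score: null })); // agent-b: no score
```

A truthiness check such as `if (!outcome.average_score)` would treat a score of 0 as missing, which is why the explicit null comparison is used here.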

`Optional` duration_ms

duration_ms?: number | null

Total duration of the benchmark run in milliseconds.
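
As a small illustration, a millisecond duration can be converted to a readable string while tolerating `null` or a missing value. `formatDuration` is a hypothetical helper, not part of the SDK, and the `"unknown"` fallback text is an arbitrary choice.

```typescript
// Hypothetical helper, not an SDK function.
// Accepts the same type as the documented `duration_ms` property.
function formatDuration(durationMs?: number | null): string {
  // Covers both null and undefined.
  if (durationMs == null) {
    return "unknown";
  }
  // Truncate to whole seconds, then split into minutes and seconds.
  const totalSeconds = Math.floor(durationMs / 1000);
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = totalSeconds % 60;
  return `${minutes}m ${seconds}s`;
}

console.log(formatDuration(125000)); // 2m 5s
console.log(formatDuration(null));   // unknown
```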

`Optional` model_name

model_name?: string | null

The model name used by the agent.
