@runloop/api-client - v1.4.0

    Outcome data for a single benchmark run within a benchmark job, representing results for one agent configuration.

    interface BenchmarkOutcome {
        agent_name: string;
        benchmark_run_id: string;
        n_completed: number;
        n_failed: number;
        n_timeout: number;
        scenario_outcomes: Runloop.BenchmarkJobView.BenchmarkOutcome.ScenarioOutcome[];
        average_score?: number | null;
        duration_ms?: number | null;
        model_name?: string | null;
    }
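As an illustration of how the three counters might be combined, the sketch below computes a completion rate. It assumes that `n_completed`, `n_failed`, and `n_timeout` are disjoint and together cover every scenario that was counted, which this page does not state. `BenchmarkOutcomeLike` and `completionRate` are hypothetical names; the type is a local mirror of the interface above so the example compiles on its own, not an import from the package.

```typescript
// Local mirror of the documented fields (hypothetical name, not a package export).
// The shape of ScenarioOutcome is not shown in this section, so it stays unknown.
interface BenchmarkOutcomeLike {
  agent_name: string;
  benchmark_run_id: string;
  n_completed: number;
  n_failed: number;
  n_timeout: number;
  scenario_outcomes: unknown[];
  average_score?: number | null;
  duration_ms?: number | null;
  model_name?: string | null;
}

// Hypothetical helper: fraction of counted scenarios that completed.
// Assumes the three counters do not overlap; returns null when nothing was counted.
function completionRate(
  o: Pick<BenchmarkOutcomeLike, "n_completed" | "n_failed" | "n_timeout">
): number | null {
  const total = o.n_completed + o.n_failed + o.n_timeout;
  return total === 0 ? null : o.n_completed / total;
}
```

Returning `null` for an empty run avoids a division by zero and keeps "no data" distinct from a rate of 0.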

    Properties

    agent_name: string

    The name of the agent configuration used.

    benchmark_run_id: string

    The ID of the benchmark run.

    n_completed: number

    Number of scenarios that completed successfully.

    n_failed: number

    Number of scenarios that failed.

    n_timeout: number

    Number of scenarios that timed out.

    scenario_outcomes: Runloop.BenchmarkJobView.BenchmarkOutcome.ScenarioOutcome[]

    Detailed outcomes for each scenario in this benchmark run.

    average_score?: number | null

    Average score across all completed scenarios (0.0 to 1.0).

    duration_ms?: number | null

    Total duration of the benchmark run in milliseconds.

    model_name?: string | null

    The model name used by the agent.
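Because `average_score`, `duration_ms`, and `model_name` are each typed as optional and nullable, a consumer has to handle both `undefined` and `null`. The sketch below shows one way to do that when formatting a one-line summary. `OptionalOutcomeFields` and `describeOutcome` are hypothetical names introduced for illustration, and the fallback strings are arbitrary choices, not values defined by the library.

```typescript
// Hypothetical local subset of the documented fields, so the sketch is self-contained.
type OptionalOutcomeFields = {
  agent_name: string;
  average_score?: number | null;
  duration_ms?: number | null;
  model_name?: string | null;
};

// Hypothetical helper: builds a one-line summary, substituting placeholders
// for fields that are missing or null.
function describeOutcome(o: OptionalOutcomeFields): string {
  // `??` falls back on both null and undefined.
  const model = o.model_name ?? "unknown model";
  // `!= null` (loose inequality) also rejects both null and undefined,
  // while still accepting a legitimate score or duration of 0.
  const score =
    o.average_score != null ? `${(o.average_score * 100).toFixed(1)}%` : "n/a";
  const duration =
    o.duration_ms != null ? `${(o.duration_ms / 1000).toFixed(1)}s` : "n/a";
  return `${o.agent_name} (${model}): score ${score}, duration ${duration}`;
}
```

A truthiness check such as `if (o.average_score)` would treat a score of 0 as missing, which is why the sketch compares against `null` explicitly.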