Skip to main content

Usage

benchspan retry <run_id> --agent <agent> [options]

Options

FlagDescription
--agentRequired. Built-in agent name or path to agent directory
--parallelismMax concurrent containers (default: 5, max: 10)

What it does

Scans the previous run for instances that failed or errored, and re-runs only those instances with a fresh agent. Creates a new run tagged retry:<original_run_id>.

Example

# Run 50 instances, some fail
benchspan run --benchmark swebench --agent claude-code --instances 50

# Retry just the failures
benchspan retry run_20260329_210534 --agent claude-code