The fastest way: use the onboarding skill
If you use a coding agent like Claude Code, Codex, or OpenCode, our onboarding skill can handle the whole setup for you. Install the /onboard-agent skill for your coding agent, then start a session with your coding agent and run it. The skill will:
- Ask about your agent (where’s the code, what runtime, what env vars it needs)
- Explore your codebase to understand how to invoke it
- Generate a working runner.sh
- Help you set up env vars on the dashboard
- Run the healthcheck benchmark to verify everything works
- Diagnose and fix any failures automatically
The skill follows the Agent Skills open standard and works with Claude Code, Cursor, Codex, GitHub Copilot, and other compatible agents.
Or do it manually
If you prefer to do it yourself, here’s the process:
Understand how it works
Your agent runs in a Docker container. It gets a problem statement and a working directory. It produces output. Benchspan grades the result.
You write one file, runner.sh, that installs and runs your agent. That’s the entire integration.
See: How it works
Write your runner.sh
Follow the pattern that matches your setup: pip package, npm package, binary, or build from source.
See: Writing your runner.sh
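As a rough illustration of the pip package pattern, a runner.sh might have the shape below. This is only a sketch under stated assumptions: the package name my-agent, the my-agent command, its --problem-file flag, and the PROBLEM_FILE path are all hypothetical placeholders, and how Benchspan actually hands the problem statement to your script is not shown here, so check the runner.sh guide for the real contract. The sketch defaults to a dry run that only prints the commands it would execute.

```shell
#!/usr/bin/env bash
# Sketch of a runner.sh for the "pip package" pattern (not an official template).
set -euo pipefail

# Hypothetical placeholders: substitute your real package and entry point.
AGENT_PACKAGE="${AGENT_PACKAGE:-my-agent}"
AGENT_COMMAND="${AGENT_COMMAND:-my-agent}"
# Assumption: the problem statement is readable from this path.
PROBLEM_FILE="${PROBLEM_FILE:-problem.txt}"
# Defaults to a dry run that only prints commands; set DRY_RUN=0 to execute.
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: install the agent inside the container.
install_agent() {
  run pip install "$AGENT_PACKAGE"
}

# Step 2: invoke the agent on the problem statement (hypothetical flag).
run_agent() {
  run "$AGENT_COMMAND" --problem-file "$PROBLEM_FILE"
}

main() {
  install_agent
  run_agent
}

main
```

With the defaults, the script prints `+ pip install my-agent` followed by `+ my-agent --problem-file problem.txt`. Splitting install and run into separate functions keeps the other patterns (npm package, binary, build from source) to a change in install_agent only.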
Set env vars on the dashboard
Add your agent’s required env vars (API keys, model config) at your dashboard settings. Whatever you set there gets injected into the container.
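Because those values are injected into the container, it can help for runner.sh to fail fast when one is missing instead of letting the agent crash partway through a task. A minimal bash sketch of such a check follows; the helper name require_env is our own, not part of Benchspan.

```shell
#!/usr/bin/env bash
# Return non-zero and name every listed variable that is unset or empty.
require_env() {
  local name missing=0
  for name in "$@"; do
    # ${!name:-} is bash indirect expansion: the value of the variable
    # whose name is stored in $name, or empty if it is unset.
    if [ -z "${!name:-}" ]; then
      echo "missing required env var: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

Near the top of runner.sh you could then call, for example, `require_env MY_AGENT_API_KEY MY_AGENT_MODEL || exit 1`, where both variable names are hypothetical stand-ins for whatever your agent needs.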
Test with the healthcheck benchmark
Run 10 simple tasks that verify your runner.sh works across different environments.
See: Healthcheck details
What you’ll need
- A Benchspan account (sign up)
- The CLI installed (pip install benchspan)
- Your agent’s source code or published package
- API keys for whatever LLM your agent uses