Modes

Benchspan has two operating modes. You pick one when constructing the SDK; it applies to every scan from that instance.

`block` (default)

The SDK waits for the scan result. If an injection is detected, it raises an exception before the LLM call happens; the model never sees the poisoned content. Adds the scan latency (typically sub-100ms on tool outputs) to your agent’s critical path.

from benchspan import BenchGuard, InjectionDetectedError

guard = BenchGuard(api_key="ag_live_...", mode="block")

try:
    result = llm.invoke(messages, config={"callbacks": [guard]})
except InjectionDetectedError as e:
    print(f"Blocked: score={e.result.score:.4f}")
    # Return a safe error to your user, log the incident, alert, etc.

import { BenchGuard, InjectionDetectedError } from "@benchspan/sdk";

const guard = new BenchGuard({ apiKey: "ag_live_...", mode: "block" });

try {
  await guard.scanOrThrow(toolOutput, { role: "tool" });
} catch (e) {
  if (e instanceof InjectionDetectedError) {
    console.log(`Blocked: score=${e.result.score.toFixed(4)}`);
  }
}

Use block in production. It’s the default for a reason: an injection that reaches the LLM is already damage, even if you catch it in logs afterwards.

`warn`

Zero latency. The SDK fires the scan in the background (daemon thread in Python, unawaited Promise in TypeScript) and the LLM call proceeds immediately with no added wait. Detection still happens; the verdict lands in your dashboard logs asynchronously and a warning is logged locally, but the agent never pauses. Useful for:

Evaluating false-positive rate on real traffic before enforcing. You see every would-be block in the dashboard without affecting production latency.
Shadow deployments: running Benchspan in parallel with your existing controls to compare coverage.
Latency-critical agents where you’d rather observe than block. Voice, real-time chat, any flow where even a sub-100ms stall matters.

guard = BenchGuard(api_key="ag_live_...", mode="warn")

result = llm.invoke(messages, config={"callbacks": [guard]})
# Returns immediately. Scan runs in a daemon thread and logs a warning
# (+ updates the dashboard) if an injection is detected.

const guard = new BenchGuard({ apiKey: "ag_live_...", mode: "warn" });

// Your LLM call proceeds with zero added latency. The scan runs in the
// background and any injection lands in your dashboard.
const result = await llm.generate(prompt);

Warn mode is fire-and-forget. If you explicitly call guard.scan(...) and await it, you get the synchronous verdict back; the zero-latency behavior only applies to the framework integrations (callbacks, hooks, middleware).

Recommended rollout

Start in warn mode

Deploy with mode="warn". Watch your dashboard for injections and false positives on real traffic for a few days. No user-visible latency impact.

Tune if needed

If you see legitimate content being flagged, send us a sample at founders@benchspan.com. For high-volume deployments, we can train a custom model on your traffic; reach out.

Switch to block

Once your false-positive rate is acceptable and the latency budget allows it, flip mode="block". Same SDK, one-line change.

Do not run without Benchspan in production with the assumption that your system prompt alone will prevent injection. Every major published IPI attack has broken system-prompt-only defenses.

Getting Started

Concepts

SDKs

Framework Integrations

`block` (default)

`warn`

Recommended rollout

​block (default)

​warn

​Recommended rollout

`block` (default)

`warn`

Recommended rollout