LangChain

Benchspan integrates with LangChain as a callback handler. It scans user and tool messages flowing through the chat model and raises InjectionDetectedError (in block mode) before the LLM call goes out.

Python

Install

pip install benchspan langchain-anthropic  # or any LangChain provider

Pass the guard as a callback

agent.py

from benchspan import BenchGuard
from langchain_anthropic import ChatAnthropic
from langchain_core.messages import HumanMessage, ToolMessage

guard = BenchGuard(api_key="ag_live_...", agent="email-agent")
llm = ChatAnthropic(model="claude-sonnet-4-6")

messages = [
    HumanMessage(content="Summarize this email"),
    ToolMessage(
        content=email_body,           # scanned by BenchGuard
        tool_call_id="call_123",
        name="read_email",
    ),
]

result = llm.invoke(messages, config={"callbacks": [guard]})

BenchGuard implements the BaseCallbackHandler interface directly, so no wrapper class is needed. Pass it to any chain, agent, or .invoke() call that accepts callbacks.

Handle injections

from benchspan import InjectionDetectedError

try:
    result = llm.invoke(messages, config={"callbacks": [guard]})
except InjectionDetectedError as e:
    # Tell your user, log, alert. The LLM call never happened.
    return {"error": "Suspicious content detected", "score": e.result.score}

Works with

Any LangChain provider: Anthropic, OpenAI, Google, Mistral, Ollama, and custom LLMs. The callback attaches to the chat model, not the provider.

TypeScript

Install

npm install @benchspan/sdk @langchain/anthropic

Wrap the callback

agent.ts

import { BenchGuard } from "@benchspan/sdk";
import { ChatAnthropic } from "@langchain/anthropic";

const guard = new BenchGuard({ apiKey: "ag_live_...", agent: "email-agent" });
const llm = new ChatAnthropic({ model: "claude-sonnet-4-6" });

const result = await llm.invoke(messages, {
  callbacks: [guard.asLangChainCallback()],
});

LangChain JS requires an object with handleChatModelStart / handleLLMStart. asLangChainCallback() returns exactly that.

What gets scanned

Message type	Scanned?
`HumanMessage` / `user`	✅
`ToolMessage` / `tool`	✅
`SystemMessage` / `system`	❌ (trusted)
`AIMessage` / `assistant`	❌ (trusted)

Duplicates are skipped. If the same tool output appears in multiple turns of a conversation, it’s only scanned once.

CrewAI

CrewAI uses the same LangChain callback protocol. Pass BenchGuard directly to the Crew:

crew.py

from benchspan import BenchGuard
from crewai import Agent, Crew, Task

guard = BenchGuard(api_key="ag_live_...", agent="research-crew")

crew = Crew(
    agents=[...],
    tasks=[...],
    callbacks=[guard],
)

See CrewAI integration for a full crew example.

Getting Started

Concepts

SDKs

Framework Integrations

Python

Works with

TypeScript

What gets scanned

CrewAI

​Python

​Works with

​TypeScript

​What gets scanned

​CrewAI

Python

Works with

TypeScript

What gets scanned

CrewAI