Installation

pip install pandaprobe[anthropic]

Setup

from pandaprobe.wrappers import wrap_anthropic
from anthropic import Anthropic

client = wrap_anthropic(Anthropic())

Messages API

Span name: "anthropic-messages", SpanKind: LLM

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    system="You are a helpful assistant.",
    messages=[{"role": "user", "content": "Explain quantum computing."}],
)

What gets traced
  • Input: system prompt (from top-level system kwarg) plus messages, normalized to a standard format
  • Output: text content blocks
  • Model name
  • Token usage
  • Model parameters: temperature, top_p, top_k, max_tokens, stop_sequences, thinking configuration
  • Extended thinking or reasoning blocks stored in metadata as reasoning_summary

Streaming

response = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for event in response:
    if hasattr(event, "delta") and hasattr(event.delta, "text"):
        print(event.delta.text, end="")

Both streaming patterns (passing stream=True to messages.create() and the messages.stream() context-manager helper) are fully supported, with time-to-first-token tracking.
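
The text-accumulation logic in the loop above can be sketched against the Messages API event shapes, using plain dicts as stand-ins for the SDK's event objects:

```python
def collect_text(events: list[dict]) -> str:
    """Accumulate streamed text from content_block_delta / text_delta events."""
    chunks = []
    for event in events:
        if event.get("type") == "content_block_delta":
            delta = event.get("delta", {})
            if delta.get("type") == "text_delta":
                chunks.append(delta.get("text", ""))
    return "".join(chunks)
```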

Extended thinking

When using Anthropic’s extended thinking feature, thinking blocks are automatically extracted and stored in the span metadata under the reasoning_summary key. Thinking content is stripped from the visible output.
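
Extended thinking is enabled via the thinking parameter on messages.create(). The extraction described above (thinking blocks separated from visible text blocks) can be sketched as follows; this is an illustrative sketch using the Messages API block shapes, not PandaProbe's internals, and the helper name is hypothetical.

```python
def split_thinking(content_blocks: list[dict]) -> tuple[str, str]:
    """Separate thinking content from visible text in a response's content blocks."""
    # Thinking blocks carry their content under the "thinking" key;
    # text blocks carry theirs under "text".
    thinking = "".join(
        b.get("thinking", "") for b in content_blocks if b.get("type") == "thinking"
    )
    text = "".join(
        b.get("text", "") for b in content_blocks if b.get("type") == "text"
    )
    return thinking, text
```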

Token usage mapping

Anthropic Field                PandaProbe Field
input_tokens                   prompt_tokens
output_tokens                  completion_tokens
cache_read_input_tokens        cache_read_tokens
cache_creation_input_tokens    cache_creation_tokens
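
The mapping above can be expressed as a small lookup; this is a sketch for illustration, and the helper name is hypothetical, not part of PandaProbe's public API.

```python
def map_usage(anthropic_usage: dict) -> dict:
    """Translate Anthropic usage field names to PandaProbe field names."""
    field_map = {
        "input_tokens": "prompt_tokens",
        "output_tokens": "completion_tokens",
        "cache_read_input_tokens": "cache_read_tokens",
        "cache_creation_input_tokens": "cache_creation_tokens",
    }
    # Unrecognized fields are dropped rather than passed through.
    return {field_map[k]: v for k, v in anthropic_usage.items() if k in field_map}
```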