Installation
pip install "pandaprobe[openai]"
Setup
from openai import OpenAI, AsyncOpenAI
from pandaprobe.wrappers import wrap_openai

client = wrap_openai(OpenAI())
async_client = wrap_openai(AsyncOpenAI())
Works with both synchronous and asynchronous clients; use the same wrap_openai entry point.
Chat Completions API
Span name: "openai-chat", SpanKind: LLM
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain quantum computing."},
    ],
    temperature=0.7,
)
What gets traced
- Input: messages array
- Output: assistant message
- Model name
- Token usage: prompt_tokens, completion_tokens, total_tokens, plus detail fields (for example reasoning_tokens from completion_tokens_details)
- Model parameters: temperature, top_p, max_tokens, and other safe parameters only
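An allowlist filter is one plausible way to capture "safe parameters only" (keeping sampling settings while never recording message content). The allowlist below beyond temperature, top_p, and max_tokens is an assumption for illustration, not pandaprobe's actual list:

```python
# Illustrative sketch of safe-parameter capture. Only temperature, top_p,
# and max_tokens are documented above; the extra entries are assumptions.
SAFE_PARAMS = {"temperature", "top_p", "max_tokens",
               "frequency_penalty", "presence_penalty"}

def safe_model_params(kwargs: dict) -> dict:
    """Keep only allowlisted parameters; message content is never included."""
    return {k: v for k, v in kwargs.items() if k in SAFE_PARAMS}

print(safe_model_params({
    "temperature": 0.7,
    "messages": [{"role": "user", "content": "hi"}],
}))
# {'temperature': 0.7}
```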
Streaming
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
Streaming is fully supported. The wrapper records completion_start_time on the first chunk for time-to-first-token tracking. Chunks are reduced to a single response for the span output.
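The two behaviors described above can be sketched in plain Python. This is an illustrative reduction over delta strings, not pandaprobe's actual implementation:

```python
import time

def reduce_stream(chunks):
    """Record the time of the first chunk (time-to-first-token) and
    concatenate streamed deltas into a single response string,
    similar in spirit to what the wrapper does per span."""
    completion_start_time = None
    parts = []
    for delta in chunks:
        if completion_start_time is None:
            completion_start_time = time.monotonic()  # first chunk arrived
        if delta:
            parts.append(delta)
    return "".join(parts), completion_start_time

text, t0 = reduce_stream(["Hel", "lo", "!", None])
print(text)  # Hello!
```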
Responses API
Span name: "openai-response", SpanKind: LLM
response = client.responses.create(
    model="gpt-4o",
    instructions="You are a helpful assistant.",
    input="What is the capital of France?",
)
What gets traced
- Input: instructions plus input, normalized to messages format
- Output: response output items
- Token usage: input_tokens mapped to prompt tokens, output_tokens mapped to completion tokens, plus detail fields
- Reasoning summaries extracted from reasoning output items
- Model parameters: max_output_tokens, temperature, top_p, reasoning, and related fields
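The "normalized to messages format" step above can be illustrated with a small helper. The function name and shapes are hypothetical, not pandaprobe API:

```python
def to_messages(instructions, user_input):
    """Sketch: fold Responses-style instructions + input into the
    Chat Completions messages format used for span input."""
    messages = []
    if instructions:
        messages.append({"role": "system", "content": instructions})
    if isinstance(user_input, str):
        messages.append({"role": "user", "content": user_input})
    else:
        # input may already be a list of message-like items
        messages.extend(user_input)
    return messages

print(to_messages("You are a helpful assistant.", "What is the capital of France?"))
```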
Built-in tools such as web_search, file_search, and code_interpreter are automatically traced as child spans with SpanKind TOOL:
response = client.responses.create(
    model="gpt-4o",
    input="Search the web for PandaProbe",
    tools=[{"type": "web_search"}],
)
Each tool invocation produces a child TOOL span with the tool type as the span name (for example "web_search_call", "function_call").
Function calls (function_call items) are also captured as TOOL child spans with arguments as input and results as output.
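One way to picture the child-span extraction is a pass over the response's output items, picking out anything that looks like a tool invocation. The item shapes below are simplified assumptions, not the exact SDK objects:

```python
def extract_tool_spans(output_items):
    """Sketch: collect tool invocations from Responses output items,
    as the wrapper might when creating child TOOL spans. The span name
    is the item type, e.g. "web_search_call" or "function_call"."""
    spans = []
    for item in output_items:
        if item.get("type", "").endswith("_call"):
            spans.append({
                "name": item["type"],
                "kind": "TOOL",
                "input": item.get("arguments"),
                "output": item.get("output"),
            })
    return spans

items = [
    {"type": "web_search_call", "arguments": "PandaProbe", "output": "..."},
    {"type": "message", "content": "Here is what I found."},
]
print(extract_tool_spans(items))
```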
Token usage mapping
| OpenAI Field | PandaProbe Field |
|---|---|
| prompt_tokens | prompt_tokens |
| completion_tokens | completion_tokens |
| total_tokens | total_tokens |
| completion_tokens_details.reasoning_tokens | reasoning_tokens |
| (Responses) input_tokens | prompt_tokens |
| (Responses) output_tokens | completion_tokens |
| (Responses) input_tokens_details.cached_tokens | cache_read_tokens |
| (Responses) output_tokens_details.reasoning_tokens | reasoning_tokens |
Chat Completions and Responses return different usage object shapes from the SDK. The wrapper normalizes both into the PandaProbe fields in this table; do not assume raw OpenAI field names are identical across APIs when reading span payloads in custom exporters.
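The mapping table can be expressed as a single normalization function. This is an illustrative sketch of the table's rules, not pandaprobe's real code; the presence of input_tokens is used here as the discriminator between the two shapes:

```python
def normalize_usage(usage: dict) -> dict:
    """Normalize either usage shape into the PandaProbe fields from the
    mapping table: Responses (input_tokens/output_tokens) or Chat
    Completions (prompt_tokens/completion_tokens)."""
    if "input_tokens" in usage:  # Responses API shape
        details_in = usage.get("input_tokens_details", {})
        details_out = usage.get("output_tokens_details", {})
        return {
            "prompt_tokens": usage["input_tokens"],
            "completion_tokens": usage["output_tokens"],
            "total_tokens": usage.get(
                "total_tokens", usage["input_tokens"] + usage["output_tokens"]
            ),
            "cache_read_tokens": details_in.get("cached_tokens"),
            "reasoning_tokens": details_out.get("reasoning_tokens"),
        }
    # Chat Completions shape
    details = usage.get("completion_tokens_details", {})
    return {
        "prompt_tokens": usage["prompt_tokens"],
        "completion_tokens": usage["completion_tokens"],
        "total_tokens": usage["total_tokens"],
        "reasoning_tokens": details.get("reasoning_tokens"),
    }
```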