> ## Documentation Index
> Fetch the complete documentation index at: https://docs.pandaprobe.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Google Gemini

> Auto-trace Google Gemini generate_content API calls

### Installation

<Tabs>
  <Tab title="pip">
    ```bash theme={null}
    pip install "pandaprobe[gemini]"
    ```
  </Tab>

  <Tab title="uv">
    ```bash theme={null}
    uv add "pandaprobe[gemini]"
    ```
  </Tab>
</Tabs>

### Setup

```python theme={null}
from pandaprobe.wrappers import wrap_gemini
from google import genai

client = wrap_gemini(genai.Client())
```

## Generate content

Span name: `"gemini-generate"`, SpanKind: `LLM`

```python theme={null}
response = client.models.generate_content(
    model="gemini-3.1-flash-preview",
    contents="Explain quantum computing.",
    config={"temperature": 0.7},
)
print(response.text)
```

**What gets traced**

* Input: `contents` plus `system_instruction` normalized to messages format (role `"model"` mapped to `"assistant"`)
* Output: answer text (non-thought parts)
* Model name
* Token usage
* Model parameters: `temperature`, `top_p`, `top_k`, `max_output_tokens`, `stop_sequences`, and related fields
* Thinking or reasoning parts stored in metadata as `reasoning_summary`

### Streaming and async

<CodeGroup>
  ```python title="Streaming" theme={null}
  stream = client.models.generate_content_stream(
      model="gemini-3.1-flash-preview",
      contents="Hello!",
  )
  for chunk in stream:
      print(chunk.text, end="")
  ```

  ```python title="Async" theme={null}
  response = await client.aio.models.generate_content(
      model="gemini-3.1-flash-preview",
      contents="Hello!",
  )

  async for chunk in client.aio.models.generate_content_stream(
      model="gemini-3.1-flash-preview",
      contents="Hello!",
  ):
      print(chunk.text, end="")
  ```
</CodeGroup>

<Note>
  All four methods are traced: synchronous blocking, synchronous streaming, asynchronous blocking, asynchronous streaming.
</Note>

<Accordion title="Sync vs async streaming">
  Use `models.generate_content_stream` for synchronous iterators and `aio.models.generate_content_stream` with `async for` when the call site is already async. The wrapper emits the same span fields in both cases; only the execution model differs.
</Accordion>

## Thinking mode

When using Gemini's thinking mode, thought parts are automatically separated from answer parts. Thought content is stored in metadata as `reasoning_summary`, while the span output contains only the answer text.

## Token usage mapping

| Gemini Field                 | PandaProbe Field    |
| ---------------------------- | ------------------- |
| `prompt_token_count`         | `prompt_tokens`     |
| `candidates_token_count`     | `completion_tokens` |
| `total_token_count`          | `total_tokens`      |
| `thoughts_token_count`       | `reasoning_tokens`  |
| `cached_content_token_count` | `cache_read_tokens` |
