Create Monitor
Create an evaluation monitor that spawns eval runs on a recurring schedule.
Monitors persist a reusable evaluation configuration (target type, metrics,
filters, cadence) and the system automatically creates eval runs at each
scheduled interval. If only_if_changed is true (default), runs are
skipped when no new data has arrived since the previous run.
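The skip rule can be pictured as a small decision function. This is only a sketch of the behavior described above, not the service's implementation: the function name `should_run` and both timestamp parameters are hypothetical, and the handling of a first run (no previous run yet) is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional


def should_run(only_if_changed: bool,
               last_run_at: Optional[datetime],
               newest_item_at: Optional[datetime]) -> bool:
    """Decide whether a scheduled eval run should be created.

    Hypothetical helper: the real scheduler's inputs are not documented here.
    """
    if not only_if_changed:
        return True   # always fire on schedule
    if newest_item_at is None:
        return False  # no matching traces/sessions at all, nothing new
    if last_run_at is None:
        return True   # assumption: on the first run, any data counts as new
    return newest_item_at > last_run_at


previous_run = datetime(2024, 1, 2, tzinfo=timezone.utc)
older_data = datetime(2024, 1, 1, tzinfo=timezone.utc)
print(should_run(True, previous_run, older_data))   # data predates the last run
print(should_run(False, previous_run, older_data))  # only_if_changed disabled
```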
Request body fields:
- name (string, required): Human-readable label, e.g. "Daily prod eval".
- target_type (string, required): "TRACE" or "SESSION".
- metrics (string[], required): Metric names to run. Use GET /evaluations/trace-metrics (TRACE) or GET /evaluations/session-metrics (SESSION) for available names.
- filters (object, optional): Scope the data the monitor evaluates. Pass {} to match everything. Accepted keys depend on target_type:
  - TRACE: date_from, date_to (ISO 8601), status (PENDING/RUNNING/COMPLETED/ERROR), session_id, user_id, tags (string[]), name (substring match).
  - SESSION: date_from, date_to, user_id, has_error (bool), tags (string[]), min_trace_count (int).
- cadence (string, required): Firing schedule. Predefined: "every_6h", "daily", "weekly". Custom cron: "cron:<min hour dom month dow>", e.g. "cron:0 3 * * *" (daily 3 AM UTC), "cron:0 6 * * 1-5" (weekdays 6 AM).
- sampling_rate (float, optional, default 1.0): Fraction of matching items to evaluate per run (0.0–1.0).
- model (string, optional): LLM model override, e.g. "openai/gpt-4o".
- only_if_changed (bool, optional, default true): Skip the run if no new traces/sessions exist since the last run.
- signal_weights (object, optional, SESSION only): Override signal aggregation weights. Keys: confidence, loop_detection, tool_correctness, coherence.
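As an illustration of how the fields above fit together, the following sketch assembles a request body and applies the constraints this page states (allowed target types, per-target filter keys, sampling range, SESSION-only signal weights). The helper name `build_monitor_payload` is hypothetical, the checks are client-side conveniences rather than the server's own validation, and the metric names are taken from the example on this page without confirming which target type they belong to.

```python
import json

# Filter keys listed on this page for each target type.
TRACE_FILTER_KEYS = {"date_from", "date_to", "status", "session_id",
                     "user_id", "tags", "name"}
SESSION_FILTER_KEYS = {"date_from", "date_to", "user_id", "has_error",
                       "tags", "min_trace_count"}


def build_monitor_payload(name, target_type, metrics, cadence, filters=None,
                          sampling_rate=1.0, model=None, only_if_changed=True,
                          signal_weights=None):
    """Assemble a Create Monitor request body (hypothetical helper)."""
    if target_type not in ("TRACE", "SESSION"):
        raise ValueError("target_type must be 'TRACE' or 'SESSION'")
    if not metrics:
        raise ValueError("metrics must contain at least one metric name")
    allowed = TRACE_FILTER_KEYS if target_type == "TRACE" else SESSION_FILTER_KEYS
    filters = dict(filters or {})  # {} matches everything
    unknown = set(filters) - allowed
    if unknown:
        raise ValueError(f"unsupported {target_type} filter keys: {sorted(unknown)}")
    if not 0.0 <= sampling_rate <= 1.0:
        raise ValueError("sampling_rate must be between 0.0 and 1.0")
    if signal_weights is not None and target_type != "SESSION":
        raise ValueError("signal_weights is only valid for SESSION monitors")

    payload = {
        "name": name,
        "target_type": target_type,
        "metrics": list(metrics),
        "filters": filters,
        "cadence": cadence,
        "sampling_rate": sampling_rate,
        "only_if_changed": only_if_changed,
    }
    # Optional fields are omitted when unset so server defaults apply.
    if model is not None:
        payload["model"] = model
    if signal_weights is not None:
        payload["signal_weights"] = dict(signal_weights)
    return payload


payload = build_monitor_payload(
    name="Daily prod eval",
    target_type="TRACE",
    metrics=["task_completion", "step_efficiency"],
    cadence="daily",
    filters={"status": "COMPLETED", "tags": ["production"]},
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be serialized with `json.dumps` and sent as the request body. The endpoint path is not given in this section, so it is left out of the sketch.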
Auth: Bearer + X-Project-ID | X-API-Key + X-Project-Name
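The two credential combinations above map to two sets of request headers. A minimal sketch follows, assuming the Bearer token travels in the standard `Authorization` header; the helper name `auth_headers` and its keyword arguments are hypothetical.

```python
def auth_headers(*, bearer_token=None, project_id=None,
                 api_key=None, project_name=None):
    """Build headers for one of the two supported auth combinations."""
    if bearer_token and project_id:
        # Bearer token + X-Project-ID
        return {
            "Authorization": f"Bearer {bearer_token}",
            "X-Project-ID": project_id,
        }
    if api_key and project_name:
        # X-API-Key + X-Project-Name
        return {
            "X-API-Key": api_key,
            "X-Project-Name": project_name,
        }
    raise ValueError(
        "provide either bearer_token + project_id or api_key + project_name"
    )


# Placeholder credentials, for illustration only.
print(auth_headers(bearer_token="<token>", project_id="<project-id>"))
print(auth_headers(api_key="<api-key>", project_name="<project-name>"))
```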
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
Body
Create an evaluation monitor that spawns runs on a cadence.
- name (string, required): Human-readable name for the monitor, e.g. 'Daily prod trace eval'.
- target_type (string, required): Evaluation scope: 'TRACE' for trace-level metrics or 'SESSION' for session-level metrics.
- metrics (string[], required): Metric names to run on each scheduled eval. For TRACE monitors use GET /evaluations/trace-metrics; for SESSION monitors use GET /evaluations/session-metrics to list available names. Example: ['task_completion', 'step_efficiency'].
- cadence (string, required): How often the monitor fires. Predefined intervals: 'every_6h', 'daily', 'weekly'. Custom cron: 'cron:<5-part expression>' where the five parts are minute hour day-of-month month day-of-week. Examples: 'cron:0 3 * * *' (daily at 3 AM UTC), 'cron:0 6 * * 1-5' (weekdays at 6 AM), 'cron:0 */4 * * *' (every 4 hours).
- filters (object, optional): JSON object defining which traces/sessions the monitor targets. Pass {} to match everything in the project. TRACE monitors accept: date_from (ISO 8601), date_to (ISO 8601), status (PENDING|RUNNING|COMPLETED|ERROR), session_id, user_id, tags (string array, ANY match), name (substring, case-insensitive). SESSION monitors accept: date_from, date_to, user_id, has_error (bool), tags, min_trace_count (int). Example: {"status": "COMPLETED", "tags": ["production"]}.
- sampling_rate (float, optional, default 1.0): Fraction of matching traces/sessions to evaluate per run. 1.0 = all, 0.1 = random 10%. Required range: 0 <= x <= 1.
- model (string, optional): LLM model override for judge calls (e.g. 'openai/gpt-4o'). Uses system default if null.
- only_if_changed (bool, optional, default true): When true, the scheduled run is skipped if no new traces/sessions have arrived since the last run, saving LLM costs. Set to false to always run on schedule regardless of new data.
- signal_weights (object, optional, SESSION only): Override default signal weights for session-level aggregation. Only valid for SESSION monitors (rejected for TRACE). Keys: confidence, loop_detection, tool_correctness, coherence. Defaults: confidence=1.0, loop_detection=1.0, tool_correctness=0.8, coherence=1.0. Example: {"confidence": 1.0, "loop_detection": 1.5}.
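Two of the fields described above lend themselves to simple client-side checks before a request is sent. The sketch below is illustrative only: `parse_cadence` checks just the shape of a cadence string (a predefined name, or `cron:` followed by five whitespace-separated parts) and does not validate cron value ranges, while `effective_signal_weights` assumes that keys missing from an override keep the documented defaults. Both function names are hypothetical.

```python
PREDEFINED_CADENCES = {"every_6h", "daily", "weekly"}

# Default weights as documented for session-level aggregation.
DEFAULT_SIGNAL_WEIGHTS = {
    "confidence": 1.0,
    "loop_detection": 1.0,
    "tool_correctness": 0.8,
    "coherence": 1.0,
}


def parse_cadence(cadence: str):
    """Return ("predefined", name) or ("cron", [min, hour, dom, month, dow])."""
    if cadence in PREDEFINED_CADENCES:
        return ("predefined", cadence)
    if cadence.startswith("cron:"):
        parts = cadence[len("cron:"):].split()
        if len(parts) != 5:
            raise ValueError("cron cadence needs exactly five parts")
        return ("cron", parts)
    raise ValueError(f"unrecognized cadence: {cadence!r}")


def effective_signal_weights(overrides=None):
    """Merge overrides onto the defaults (assumed merge semantics)."""
    overrides = dict(overrides or {})
    unknown = set(overrides) - set(DEFAULT_SIGNAL_WEIGHTS)
    if unknown:
        raise ValueError(f"unknown signal weight keys: {sorted(unknown)}")
    return {**DEFAULT_SIGNAL_WEIGHTS, **overrides}


print(parse_cadence("daily"))
print(parse_cadence("cron:0 6 * * 1-5"))
print(effective_signal_weights({"confidence": 1.0, "loop_detection": 1.5}))
```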
Response
Successful Response
Full evaluation monitor representation.

