Skip to main content
Version: 0.5

Cookbook

Practical query recipes for common ROS2 observability tasks. Copy, adapt, and run against your telemetry data.

Basic queries

Find all error spans

Returns every span with a non-OK status code — the fastest way to triage what broke.

FROM traces WHERE status = 'ERROR'

Error logs at a specific severity

Fetch log entries at ERROR level across all services.

FROM logs WHERE severity = 'ERROR'

Slowest actions (last hour)

Find navigation actions that exceeded a duration threshold, ranked slowest first.

SELECT span_name, duration
FROM traces
WHERE action_name = '/navigate_to_pose' AND duration > 500 ms
SINCE 1 hour ago
ORDER BY duration DESC
LIMIT 20

Filter by ROS2 node

Scope traces to a single ROS2 node by its fully-qualified name.

FROM traces WHERE node = '/bt_navigator'

Robot-scoped query

Prefix any query with FOR ROBOT to filter by robot.id resource attribute.

FOR ROBOT 'robot_sim_001' FROM logs WHERE severity = 'ERROR'

Scope by version

Narrows results to a specific firmware or software version via the service.version resource attribute.

FOR VERSION 'v1.2.3' FROM traces WHERE status = 'ERROR'

Scope by environment

Filter to a specific deployment environment such as production or staging.

FOR ENVIRONMENT 'production' FROM traces WHERE status = 'ERROR' SINCE 1 hour ago

Scope by session

Scope to a single ROS2 run identified by ros.session.id.

FOR SESSION 'sess_abc123' FROM logs WHERE severity = 'ERROR'

Compose multiple FOR clauses

Stack multiple FOR clauses to intersect any combination of robot, version, environment, and session.

FOR ROBOT 'robot_42' FOR VERSION 'v1.0.0' FOR ENVIRONMENT 'prod' FROM traces WHERE status = 'ERROR'

Read topic messages

Query raw message data from the topic_messages table, with access to fields for any message field.

FROM topics WHERE topic_name = '/battery_state' AND fields['percentage'] < 20

Topic alias

ROSQL provides shorthand aliases for common ROS2 topics so you don't have to type the full topic name.

-- Shorthand for: FROM topics WHERE topic_name = '/odom'
FROM odom LIMIT 5

Cross-signal correlation

Find error traces that occurred while battery percentage was below a threshold — DURING() handles the time-window correlation automatically.

SELECT trace_id, span_name, service_name, duration, status_code, span_attributes
FROM traces
WHERE status = 'ERROR' AND action_name = '/navigate_to_pose'
DURING(
FROM topics WHERE topic_name = '/battery_state'
AND fields['percentage'] < 15
)
SINCE 6 hours ago

Errors during high CPU load

Correlate trace failures with periods of CPU saturation — useful for distinguishing software bugs from resource exhaustion.

FROM traces WHERE status = 'ERROR'
DURING(
FROM metrics WHERE metric_name = 'system.cpu.total_usage_pct'
AND metric_value > 90
)
SINCE yesterday

Causality and message tracing

Trace full message causality chain

TRACE 'a3f1c9d2e8b04f7a'

Returns all spans in the causal chain by walking parent_span_id → span_id recursively via a CTE.

Find all message paths for a topic

Shows every node that published or subscribed to a topic, derived from span attributes.

MESSAGE FLOW FROM TOPIC '/cmd_vel'

Shows which nodes published and subscribed to /cmd_vel.

Filter message flow to a specific destination node

Narrows the flow graph to paths that terminate at a specific subscriber node.

MESSAGE FLOW FROM TOPIC '/cmd_vel' TO NODE '/motor_driver'

Filter message flow to a specific destination topic

Narrows the flow graph to paths that terminate at a specific downstream topic.

MESSAGE FLOW FROM TOPIC '/sensor_data' TO TOPIC '/processed_data'

All spans for a specific trace

Returns every span belonging to a trace, ordered by timestamp.

TRACE 'trace-002'

Robot health dashboard

Five focused queries that together give a complete picture of robot health. Run them individually or wire them into a dashboard.

Error rate by robot

Count errors per robot over the last 30 minutes — the headline metric for a health dashboard.

SELECT COUNT(*) FROM traces WHERE status = 'ERROR' FACET robot_id SINCE 30 minutes ago

Log severity breakdown

Break down log volume by severity to spot warn/error spikes.

SELECT COUNT(*) FROM logs WHERE severity IN ('ERROR', 'WARN') FACET severity SINCE 30 minutes ago

Action success rates

Compute the ratio of succeeded to total executions for an action server.

SELECT ACTION_SUCCESS_RATE('/navigate_to_pose') FROM traces SINCE 30 minutes ago

Topic publish rates

Check the current publish rate for a topic in Hz.

SELECT TOPIC_RATE('/cmd_vel') FROM metrics SINCE 30 minutes ago

P95 action latency

95th-percentile latency for a navigation or manipulation action — catches tail-latency regressions.

SELECT APPROX_PERCENTILE(duration, 95) FROM traces WHERE action_name = '/navigate_to_pose' SINCE 30 minutes ago

Anomaly detection

Detect statistical anomalies in span duration

Uses a two-phase z-score CTE to compare current values against a historical or fleet-wide baseline.

-- Anomalous robots compared to last week
ANOMALY(duration) COMPARED TO last week FACET robot_id SINCE 7 days ago

-- Anomalies compared to fleet-wide baseline
ANOMALY(duration) COMPARED TO fleet SINCE 1 hour ago FACET robot_id

Returns robot_id, avg_current, avg_baseline, z_score, is_anomalous, direction.

Finding anomalous robots after deployment

Compare post-deployment behaviour against the previous week's baseline to catch regressions:

ANOMALY(duration) FROM traces COMPARED TO last week FACET robot_id SINCE 24 hours ago

Pair with FOR VERSION to scope to a specific release:

FOR VERSION 'v1.3.0' ANOMALY(duration) COMPARED TO last week FACET robot_id SINCE 12 hours ago

Path deviation analysis

Check if a robot deviated from its planned path during a specific trace:

PATH DEVIATION FOR TRACE 'trace-002'

Or over a time window for a robot:

PATH DEVIATION FOR ROBOT 'robot_sim_001' SINCE 1 hour ago

Select a specific plan (0 = first occurrence, -1 = latest, default = -1):

PATH DEVIATION PLAN 0 FOR ROBOT 'robot_sim_001' SINCE 6 hours ago

Returns lateral deviation in metres per waypoint plus summary stats.

Joint trajectory tracking

Compare planned vs. actual joint positions to detect mechanical issues or controller errors:

JOINT DEVIATION FOR TRACE 'trace-002'

Or across a time window:

JOINT DEVIATION FOR ROBOT 'arm_01' SINCE 1 hour ago

Returns per-joint position_error_rad and summary statistics.

Browse robot joint map

Inspect the URDF-derived joint map registered for a robot model:

SHOW JOINTS FOR ROBOT 'robot_sim_001'

Returns joint_name, joint_index, joint_type, lower_limit, upper_limit.


MCAP recordings

Find recordings covering a time window

FROM recordings queries the mcap_metadata index — one row per MCAP file. It returns file-level metadata (robot, time window, topics, file_uri — an s3://, file://, or gs:// URI). Use file_uri with the returned timestamps to fetch and seek into the file client-side for sub-range slicing.

-- All recent recordings for a robot (SINCE uses overlap semantics on end_time)
FROM recordings WHERE robot_id = 'amr-01' SINCE 6 hours ago LIMIT 10

-- Recordings that contain joint state data
FROM recordings
WHERE '/joint_states' = ANY(topics)
SINCE 24 hours ago

-- Find the recording that covers a specific error timestamp
FROM recordings
WHERE robot_id = 'amr-01'
AND start_time <= '2026-04-17T10:22:00Z'
AND end_time >= '2026-04-17T10:22:00Z'

Returns robot_id, session_id, start_time, end_time, file_uri, topics, message_types.


Pipeline syntax

Pipeline syntax uses | to chain stages, making complex queries more readable:

Find slow error spans, grouped by robot

FROM traces
| WHERE duration > 500 ms
| WHERE status = 'ERROR'
| FACET robot_id
FROM traces
| WHERE action_name = '/navigate_to_pose'
| WHERE status = 'ERROR'
| FACET robot_id
| COMPARE TO last week

Compare error rate to a previous version

Compare current error volume against a historical version baseline.

FROM traces WHERE status = 'ERROR' SINCE 1 hour ago | COMPARE TO VERSION 'v1.0.0'

Compare two versions head-to-head

Side-by-side comparison of an action's behaviour across two specific firmware versions.

FROM traces WHERE action_name = '/navigate_to_pose' COMPARE VERSION 'v1.0' TO VERSION 'v2.0'

Deployment history for a robot

List every firmware version and environment combination the robot has been seen with.

SHOW DEPLOYMENTS FOR ROBOT 'robot_42' SINCE 7 days ago

Top spans by average duration

Rank all span types by mean duration — useful for spotting slow nodes at a glance.

SHOW SPAN SUMMARY FOR ROBOT 'robot_42' SINCE 1 hour ago

Plan spans for a specific trace

Returns all spans in a trace that carry a ros.plan.id attribute — i.e. planning spans emitted by Nav2 or a custom motion planner. Useful for inspecting how long path planning took within a navigation action.

SHOW PLANS FOR TRACE 'a3f1c9d2e8b04f7a'

Also works scoped to a robot or time window:

SHOW PLANS FOR ROBOT 'robot_42' SINCE 1 hour ago

Log analysis with ordering

Fetch recent warning and error logs sorted by severity.

FROM logs
| WHERE severity IN ('ERROR', 'WARN')
| ORDER BY severity DESC
| LIMIT 20

Time-series and aggregations

Average span duration per robot

Mean action duration faceted by robot — compare fleet-wide performance in one query.

SELECT AVG(duration) FROM traces FACET robot_id

Error count by severity

Log volume grouped by severity across robots.

SELECT COUNT(*) FROM logs WHERE severity = 'ERROR' FACET robot_id

Percentile latency for navigation

p95 latency for a specific action across all robots and sessions.

SELECT PERCENTILE(duration, 95) FROM traces WHERE action_name = '/navigate_to_pose'

Multiple aggregations

Compute several aggregate statistics in a single query.

SELECT AVG(duration) AS avg_dur, MAX(duration) AS max_dur, COUNT(*) AS total FROM traces

Rolling average (MOVING_AVG)

Smooth out spikes in action duration over a 5-row window:

SELECT MOVING_AVG(duration, 5) FROM traces WHERE action_name = '/navigate_to_pose'

Rate of change (DERIVATIVE)

Detect rising metric values — useful for spotting resource exhaustion trends:

SELECT DERIVATIVE(metric_value) FROM metrics WHERE metric_name = 'system.cpu.total_usage_pct'

Approximate distinct count

Count unique action names without a full distinct scan (DuckDB uses HLL; PostgreSQL falls back to exact):

SELECT APPROX_COUNT_DISTINCT(span_name) FROM traces SINCE 1 hour ago

Geospatial filtering

Filter by GPS proximity (Haversine)

Find odom messages where the robot was within 500 m of a GPS coordinate:

FROM odom WHERE position WITHIN 500 m OF (37.7749, -122.4194) SINCE 1 hour ago

Uses inline Haversine SQL (great-circle distance) extracting position.latitude / position.longitude from the fields JSONB column.

Filter by local frame proximity (Euclidean)

Find odom messages within 2 m of a local coordinate using Euclidean distance:

FROM odom WHERE position WITHIN 2 m OF POSITION (1.5, 3.0) SINCE 1 hour ago

Extracts pose.pose.position.x / pose.pose.position.y from fields.

Monitoring joint positions during failures

Use array-indexed field access to read individual joint positions from /joint_states messages:

FROM joint_states
WHERE fields['position[0]'] > 1.5
FOR ROBOT 'arm_01'
SINCE 1 hour ago

fields['position[0]'] compiles to "fields"->'position'->>0 on PostgreSQL/DuckDB.


Pagination

Page through results with LIMIT and OFFSET

Standard keyset-less pagination — useful for UI tables that load results in pages.

FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 0
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 20
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 40

Pipeline form:

FROM logs | ORDER BY timestamp DESC | LIMIT 20 | OFFSET 40

Investigating failed navigation with ENRICH WITH

Use ENRICH WITH to pull correlated data from a second source alongside your primary query. Here we fetch the error trace and enrich it with log context from the same trace.

Step 1: find the failed navigation trace

FROM traces
WHERE action_name = '/navigate_to_pose' AND status = 'ERROR'
SINCE 2 hours ago
ORDER BY timestamp DESC
LIMIT 5

Step 2: enrich with logs for the same trace

FROM traces
WHERE action_name = '/navigate_to_pose' AND status = 'ERROR'
SINCE 2 hours ago
ENRICH WITH logs

The result includes all primary trace columns plus _enriched.logs — a nested array of log rows sharing the same trace_id.

Step 3: inspect the trace tree

TRACE 'a3f1c9d2e8b04f7a'

Returns a nested span tree (format_hint: Gantt) showing the full execution timeline for that navigation attempt.


Finding anomalous robots after deployment

Spot robots behaving differently from the fleet after a code push

ANOMALY(duration)
COMPARED TO fleet
FACET robot_id
SINCE 1 hour ago

Returns one row per robot with z_score, is_anomalous, and direction (higher/lower). Robots with |z_score| > 2 are flagged as anomalous.

tip

Always add FACET robot_id or FACET action_name to ANOMALY. Without a FACET, the comparison mixes heterogeneous spans (heartbeats, navigation, diagnostics) producing meaningless z-scores. ROSQL will warn you if you forget.

Compare this version to the previous one

FROM traces
WHERE action_name = '/navigate_to_pose'
COMPARE VERSION 'v2.3.1' TO VERSION 'v2.3.0'

REPL showcase examples

These six queries illustrate the range of visualizations available. Each uses a distinct format_hint.

1. Line chart — error rate over time

SELECT COUNT(*) FROM traces WHERE status = 'ERROR' TIMESERIES 5 min SINCE 6 hours ago

Response metadata:

{
"format_hint": "LineChart",
"visualization": { "x_axis": "time_bucket", "y_axis": "count" }
}

2. Gantt chart — trace waterfall

TRACE 'abc123'

Response metadata:

{
"format_hint": "Gantt",
"visualization": null
}

3. Directed graph — pub/sub topology

MESSAGE FLOW FROM TOPIC '/cmd_vel'

Response metadata:

{
"format_hint": "DirectedGraph",
"visualization": null
}

4. Severity-colored log table

FROM logs WHERE severity IN ('ERROR', 'FATAL') SINCE 1 hour ago

Response metadata:

{
"format_hint": "LogTable",
"visualization": { "color_field": "severity" }
}

5. Scalar cards — KPI summary

SELECT COUNT(*) AS total_errors, AVG(duration) AS avg_duration FROM traces SINCE 1 hour ago

Response metadata:

{
"format_hint": "ScalarCards",
"visualization": { "label_field": "total_errors" }
}

6. Bar chart — error count by robot

SELECT action_name, COUNT(*) FROM traces FACET action_name SINCE 1 hour ago

Response metadata:

{
"format_hint": "BarChart",
"visualization": { "x_axis": "action_name", "y_axis": "count" }
}

Session and mission data model

A ROSQL session corresponds to a ROS2 run scoped by ros.session.id. All spans, logs, and metrics emitted during a single robot operation (delivery, inspection, etc.) share this attribute.

Setting session attributes (Python)

from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource

resource = Resource.create({
"robot.id": "robot_42",
"ros.session.id": "delivery_042",
"ros.session.type": "delivery",
"service.version": "2.3.1",
"deployment.environment": "production",
})

provider = TracerProvider(resource=resource)
trace.set_tracer_provider(provider)

Setting session attributes (C++)

#include "opentelemetry/sdk/resource/resource.h"

auto resource = opentelemetry::sdk::resource::Resource::Create({
{"robot.id", "robot_42"},
{"ros.session.id", "delivery_042"},
{"ros.session.type", "delivery"},
{"service.version", "2.3.1"},
{"deployment.environment", "production"},
});

Querying by session

FOR SESSION 'delivery_042' FROM traces WHERE status = 'ERROR'
FOR SESSION 'delivery_042' FROM logs ORDER BY timestamp DESC

See also