Cookbook
Practical query recipes for common ROS2 observability tasks. Copy, adapt, and run against your telemetry data.
Basic queries
Find all error spans
Returns every span with a non-OK status code — the fastest way to triage what broke.
FROM traces WHERE status = 'ERROR'
Error logs at a specific severity
Fetch log entries at ERROR level across all services.
FROM logs WHERE severity = 'ERROR'
Slowest actions (last hour)
Find navigation actions that exceeded a duration threshold, ranked slowest first.
SELECT span_name, duration
FROM traces
WHERE action_name = '/navigate_to_pose' AND duration > 500 ms
SINCE 1 hour ago
ORDER BY duration DESC
LIMIT 20
Filter by ROS2 node
Scope traces to a single ROS2 node by its fully-qualified name.
FROM traces WHERE node = '/bt_navigator'
Robot-scoped query
Prefix any query with FOR ROBOT to filter by robot.id resource attribute.
FOR ROBOT 'robot_sim_001' FROM logs WHERE severity = 'ERROR'
Scope by version
Narrows results to a specific firmware or software version via the service.version resource attribute.
FOR VERSION 'v1.2.3' FROM traces WHERE status = 'ERROR'
Scope by environment
Filter to a specific deployment environment such as production or staging.
FOR ENVIRONMENT 'production' FROM traces WHERE status = 'ERROR' SINCE 1 hour ago
Scope by session
Scope to a single ROS2 run identified by ros.session.id.
FOR SESSION 'sess_abc123' FROM logs WHERE severity = 'ERROR'
Compose multiple FOR clauses
Stack multiple FOR clauses to intersect any combination of robot, version, environment, and session.
FOR ROBOT 'robot_42' FOR VERSION 'v1.0.0' FOR ENVIRONMENT 'prod' FROM traces WHERE status = 'ERROR'
Read topic messages
Query raw message data from the topic_messages table, with access to fields for any message field.
FROM topics WHERE topic_name = '/battery_state' AND fields['percentage'] < 20
Topic alias
ROSQL provides shorthand aliases for common ROS2 topics so you don't have to type the full topic name.
-- Shorthand for: FROM topics WHERE topic_name = '/odom'
FROM odom LIMIT 5
Cross-signal correlation
Navigation failures during low battery
Find error traces that occurred while battery percentage was below a threshold — DURING() handles the time-window correlation automatically.
SELECT trace_id, span_name, service_name, duration, status_code, span_attributes
FROM traces
WHERE status = 'ERROR' AND action_name = '/navigate_to_pose'
DURING(
FROM topics WHERE topic_name = '/battery_state'
AND fields['percentage'] < 15
)
SINCE 6 hours ago
Errors during high CPU load
Correlate trace failures with periods of CPU saturation — useful for distinguishing software bugs from resource exhaustion.
FROM traces WHERE status = 'ERROR'
DURING(
FROM metrics WHERE metric_name = 'system.cpu.total_usage_pct'
AND metric_value > 90
)
SINCE yesterday
Causality and message tracing
Trace full message causality chain
TRACE 'a3f1c9d2e8b04f7a'
Returns all spans in the causal chain by walking parent_span_id → span_id recursively via a CTE.
Find all message paths for a topic
Shows every node that published or subscribed to a topic, derived from span attributes.
MESSAGE FLOW FROM TOPIC '/cmd_vel'
Shows which nodes published and subscribed to /cmd_vel.
Filter message flow to a specific destination node
Narrows the flow graph to paths that terminate at a specific subscriber node.
MESSAGE FLOW FROM TOPIC '/cmd_vel' TO NODE '/motor_driver'
Filter message flow to a specific destination topic
Narrows the flow graph to paths that terminate at a specific downstream topic.
MESSAGE FLOW FROM TOPIC '/sensor_data' TO TOPIC '/processed_data'
All spans for a specific trace
Returns every span belonging to a trace, ordered by timestamp.
TRACE 'trace-002'
Robot health dashboard
Five focused queries that together give a complete picture of robot health. Run them individually or wire them into a dashboard.
Error rate by robot
Count errors per robot over the last 30 minutes — the headline metric for a health dashboard.
SELECT COUNT(*) FROM traces WHERE status = 'ERROR' FACET robot_id SINCE 30 minutes ago
Log severity breakdown
Break down log volume by severity to spot warn/error spikes.
SELECT COUNT(*) FROM logs WHERE severity IN ('ERROR', 'WARN') FACET severity SINCE 30 minutes ago
Action success rates
Compute the ratio of succeeded to total executions for an action server.
SELECT ACTION_SUCCESS_RATE('/navigate_to_pose') FROM traces SINCE 30 minutes ago
Topic publish rates
Check the current publish rate for a topic in Hz.
SELECT TOPIC_RATE('/cmd_vel') FROM metrics SINCE 30 minutes ago
P95 action latency
95th-percentile latency for a navigation or manipulation action — catches tail-latency regressions.
SELECT APPROX_PERCENTILE(duration, 95) FROM traces WHERE action_name = '/navigate_to_pose' SINCE 30 minutes ago
Anomaly detection
Detect statistical anomalies in span duration
Uses a two-phase z-score CTE to compare current values against a historical or fleet-wide baseline.
-- Anomalous robots compared to last week
ANOMALY(duration) COMPARED TO last week FACET robot_id SINCE 7 days ago
-- Anomalies compared to fleet-wide baseline
ANOMALY(duration) COMPARED TO fleet SINCE 1 hour ago FACET robot_id
Returns robot_id, avg_current, avg_baseline, z_score, is_anomalous, direction.
Finding anomalous robots after deployment
Compare post-deployment behaviour against the previous week's baseline to catch regressions:
ANOMALY(duration) FROM traces COMPARED TO last week FACET robot_id SINCE 24 hours ago
Pair with FOR VERSION to scope to a specific release:
FOR VERSION 'v1.3.0' ANOMALY(duration) COMPARED TO last week FACET robot_id SINCE 12 hours ago
Path deviation analysis
Check if a robot deviated from its planned path during a specific trace:
PATH DEVIATION FOR TRACE 'trace-002'
Or over a time window for a robot:
PATH DEVIATION FOR ROBOT 'robot_sim_001' SINCE 1 hour ago
Select a specific plan (0 = first occurrence, -1 = latest, default = -1):
PATH DEVIATION PLAN 0 FOR ROBOT 'robot_sim_001' SINCE 6 hours ago
Returns lateral deviation in metres per waypoint plus summary stats.
Joint trajectory tracking
Compare planned vs. actual joint positions to detect mechanical issues or controller errors:
JOINT DEVIATION FOR TRACE 'trace-002'
Or across a time window:
JOINT DEVIATION FOR ROBOT 'arm_01' SINCE 1 hour ago
Returns per-joint position_error_rad and summary statistics.
Browse robot joint map
Inspect the URDF-derived joint map registered for a robot model:
SHOW JOINTS FOR ROBOT 'robot_sim_001'
Returns joint_name, joint_index, joint_type, lower_limit, upper_limit.
MCAP recordings
Find recordings covering a time window
FROM recordings queries the mcap_metadata index — one row per MCAP file. It returns file-level metadata (robot, time window, topics, file_uri — an s3://, file://, or gs:// URI). Use file_uri with the returned timestamps to fetch and seek into the file client-side for sub-range slicing.
-- All recent recordings for a robot (SINCE uses overlap semantics on end_time)
FROM recordings WHERE robot_id = 'amr-01' SINCE 6 hours ago LIMIT 10
-- Recordings that contain joint state data
FROM recordings
WHERE '/joint_states' = ANY(topics)
SINCE 24 hours ago
-- Find the recording that covers a specific error timestamp
FROM recordings
WHERE robot_id = 'amr-01'
AND start_time <= '2026-04-17T10:22:00Z'
AND end_time >= '2026-04-17T10:22:00Z'
Returns robot_id, session_id, start_time, end_time, file_uri, topics, message_types.
Pipeline syntax
Pipeline syntax uses | to chain stages, making complex queries more readable:
Find slow error spans, grouped by robot
FROM traces
| WHERE duration > 500 ms
| WHERE status = 'ERROR'
| FACET robot_id
Navigation failures vs. last week
FROM traces
| WHERE action_name = '/navigate_to_pose'
| WHERE status = 'ERROR'
| FACET robot_id
| COMPARE TO last week
Compare error rate to a previous version
Compare current error volume against a historical version baseline.
FROM traces WHERE status = 'ERROR' SINCE 1 hour ago | COMPARE TO VERSION 'v1.0.0'
Compare two versions head-to-head
Side-by-side comparison of an action's behaviour across two specific firmware versions.
FROM traces WHERE action_name = '/navigate_to_pose' COMPARE VERSION 'v1.0' TO VERSION 'v2.0'
Deployment history for a robot
List every firmware version and environment combination the robot has been seen with.
SHOW DEPLOYMENTS FOR ROBOT 'robot_42' SINCE 7 days ago
Top spans by average duration
Rank all span types by mean duration — useful for spotting slow nodes at a glance.
SHOW SPAN SUMMARY FOR ROBOT 'robot_42' SINCE 1 hour ago
Plan spans for a specific trace
Returns all spans in a trace that carry a ros.plan.id attribute — i.e. planning spans emitted by Nav2 or a custom motion planner. Useful for inspecting how long path planning took within a navigation action.
SHOW PLANS FOR TRACE 'a3f1c9d2e8b04f7a'
Also works scoped to a robot or time window:
SHOW PLANS FOR ROBOT 'robot_42' SINCE 1 hour ago
Log analysis with ordering
Fetch recent warning and error logs sorted by severity.
FROM logs
| WHERE severity IN ('ERROR', 'WARN')
| ORDER BY severity DESC
| LIMIT 20
Time-series and aggregations
Average span duration per robot
Mean action duration faceted by robot — compare fleet-wide performance in one query.
SELECT AVG(duration) FROM traces FACET robot_id
Error count by severity
Log volume grouped by severity across robots.
SELECT COUNT(*) FROM logs WHERE severity = 'ERROR' FACET robot_id
Percentile latency for navigation
p95 latency for a specific action across all robots and sessions.
SELECT PERCENTILE(duration, 95) FROM traces WHERE action_name = '/navigate_to_pose'
Multiple aggregations
Compute several aggregate statistics in a single query.
SELECT AVG(duration) AS avg_dur, MAX(duration) AS max_dur, COUNT(*) AS total FROM traces
Rolling average (MOVING_AVG)
Smooth out spikes in action duration over a 5-row window:
SELECT MOVING_AVG(duration, 5) FROM traces WHERE action_name = '/navigate_to_pose'
Rate of change (DERIVATIVE)
Detect rising metric values — useful for spotting resource exhaustion trends:
SELECT DERIVATIVE(metric_value) FROM metrics WHERE metric_name = 'system.cpu.total_usage_pct'
Approximate distinct count
Count unique action names without a full distinct scan (DuckDB uses HLL; PostgreSQL falls back to exact):
SELECT APPROX_COUNT_DISTINCT(span_name) FROM traces SINCE 1 hour ago
Geospatial filtering
Filter by GPS proximity (Haversine)
Find odom messages where the robot was within 500 m of a GPS coordinate:
FROM odom WHERE position WITHIN 500 m OF (37.7749, -122.4194) SINCE 1 hour ago
Uses inline Haversine SQL (great-circle distance) extracting position.latitude / position.longitude from the fields JSONB column.
Filter by local frame proximity (Euclidean)
Find odom messages within 2 m of a local coordinate using Euclidean distance:
FROM odom WHERE position WITHIN 2 m OF POSITION (1.5, 3.0) SINCE 1 hour ago
Extracts pose.pose.position.x / pose.pose.position.y from fields.
Monitoring joint positions during failures
Use array-indexed field access to read individual joint positions from /joint_states messages:
FROM joint_states
WHERE fields['position[0]'] > 1.5
FOR ROBOT 'arm_01'
SINCE 1 hour ago
fields['position[0]'] compiles to "fields"->'position'->>0 on PostgreSQL/DuckDB.
Pagination
Page through results with LIMIT and OFFSET
Standard keyset-less pagination — useful for UI tables that load results in pages.
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 0
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 20
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 40
Pipeline form:
FROM logs | ORDER BY timestamp DESC | LIMIT 20 | OFFSET 40
Investigating failed navigation with ENRICH WITH
Use ENRICH WITH to pull correlated data from a second source alongside your primary query. Here we fetch the error trace and enrich it with log context from the same trace.
Step 1: find the failed navigation trace
FROM traces
WHERE action_name = '/navigate_to_pose' AND status = 'ERROR'
SINCE 2 hours ago
ORDER BY timestamp DESC
LIMIT 5
Step 2: enrich with logs for the same trace
FROM traces
WHERE action_name = '/navigate_to_pose' AND status = 'ERROR'
SINCE 2 hours ago
ENRICH WITH logs
The result includes all primary trace columns plus _enriched.logs — a nested array of log rows sharing the same trace_id.
Step 3: inspect the trace tree
TRACE 'a3f1c9d2e8b04f7a'
Returns a nested span tree (format_hint: Gantt) showing the full execution timeline for that navigation attempt.
Finding anomalous robots after deployment
Spot robots behaving differently from the fleet after a code push
ANOMALY(duration)
COMPARED TO fleet
FACET robot_id
SINCE 1 hour ago
Returns one row per robot with z_score, is_anomalous, and direction (higher/lower). Robots with |z_score| > 2 are flagged as anomalous.
Always add FACET robot_id or FACET action_name to ANOMALY. Without a FACET, the comparison mixes heterogeneous spans (heartbeats, navigation, diagnostics) producing meaningless z-scores. ROSQL will warn you if you forget.
Compare this version to the previous one
FROM traces
WHERE action_name = '/navigate_to_pose'
COMPARE VERSION 'v2.3.1' TO VERSION 'v2.3.0'
REPL showcase examples
These six queries illustrate the range of visualizations available. Each uses a distinct format_hint.
1. Line chart — error rate over time
SELECT COUNT(*) FROM traces WHERE status = 'ERROR' TIMESERIES 5 min SINCE 6 hours ago
Response metadata:
{
"format_hint": "LineChart",
"visualization": { "x_axis": "time_bucket", "y_axis": "count" }
}
2. Gantt chart — trace waterfall
TRACE 'abc123'
Response metadata:
{
"format_hint": "Gantt",
"visualization": null
}
3. Directed graph — pub/sub topology
MESSAGE FLOW FROM TOPIC '/cmd_vel'
Response metadata:
{
"format_hint": "DirectedGraph",
"visualization": null
}
4. Severity-colored log table
FROM logs WHERE severity IN ('ERROR', 'FATAL') SINCE 1 hour ago
Response metadata:
{
"format_hint": "LogTable",
"visualization": { "color_field": "severity" }
}
5. Scalar cards — KPI summary
SELECT COUNT(*) AS total_errors, AVG(duration) AS avg_duration FROM traces SINCE 1 hour ago
Response metadata:
{
"format_hint": "ScalarCards",
"visualization": { "label_field": "total_errors" }
}
6. Bar chart — error count by robot
SELECT action_name, COUNT(*) FROM traces FACET action_name SINCE 1 hour ago
Response metadata:
{
"format_hint": "BarChart",
"visualization": { "x_axis": "action_name", "y_axis": "count" }
}
Session and mission data model
A ROSQL session corresponds to a ROS2 run scoped by ros.session.id. All spans, logs, and metrics emitted during a single robot operation (delivery, inspection, etc.) share this attribute.
Setting session attributes (Python)
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.resources import Resource
resource = Resource.create({
"robot.id": "robot_42",
"ros.session.id": "delivery_042",
"ros.session.type": "delivery",
"service.version": "2.3.1",
"deployment.environment": "production",
})
provider = TracerProvider(resource=resource)
trace.set_tracer_provider(provider)
Setting session attributes (C++)
#include "opentelemetry/sdk/resource/resource.h"
auto resource = opentelemetry::sdk::resource::Resource::Create({
{"robot.id", "robot_42"},
{"ros.session.id", "delivery_042"},
{"ros.session.type", "delivery"},
{"service.version", "2.3.1"},
{"deployment.environment", "production"},
});
Querying by session
FOR SESSION 'delivery_042' FROM traces WHERE status = 'ERROR'
FOR SESSION 'delivery_042' FROM logs ORDER BY timestamp DESC
See also
- Command Reference → — full syntax for every ROSQL statement
- Schema Reference → — tables, fields, and OTel conventions
- Concepts → — SHOW vs FROM, scoping, joint indexing
- Examples page → — detailed worked examples