Skip to main content
Version: 0.4

Cookbook

Practical query recipes for common ROS2 observability tasks. Copy, adapt, and run against your telemetry data.

Basic queries

Find all error spans

FROM traces WHERE status = 'ERROR'

Error logs at a specific severity

FROM logs WHERE severity = 'ERROR'

Slowest actions (last hour)

SELECT span_name, duration
FROM traces
WHERE action_name = '/navigate_to_pose' AND duration > 500 ms
SINCE 1 hour ago
ORDER BY duration DESC
LIMIT 20

Filter by ROS2 node

FROM traces WHERE node = '/bt_navigator'

Robot-scoped query

FOR ROBOT 'robot_sim_001' FROM logs WHERE severity = 'ERROR'

Scope by version

FOR VERSION 'v1.2.3' FROM traces WHERE status = 'ERROR'

Scope by environment

FOR ENVIRONMENT 'production' FROM traces WHERE status = 'ERROR' SINCE 1 hour ago

Scope by session

FOR SESSION 'sess_abc123' FROM logs WHERE severity = 'ERROR'

Compose multiple FOR clauses

FOR ROBOT 'robot_42' FOR VERSION 'v1.0.0' FOR ENVIRONMENT 'prod' FROM traces WHERE status = 'ERROR'

Read topic messages

FROM topics WHERE topic_name = '/battery_state' AND fields['percentage'] < 20

Topic alias

-- Shorthand for: FROM topics WHERE topic_name = '/odom'
FROM odom LIMIT 5

Cross-signal correlation

SELECT trace_id, span_name_col, service_name, duration, status_code, span_attributes
FROM traces
WHERE status = 'ERROR' AND action_name = '/navigate_to_pose'
DURING(
FROM topics WHERE topic_name = '/battery_state'
AND fields['percentage'] < 15
)
SINCE 6 hours ago

Errors during high CPU load

FROM traces WHERE status = 'ERROR'
DURING(
FROM metrics WHERE metric_name = 'system.cpu.total_usage_pct'
AND metric_value > 90
)
SINCE yesterday

Causality and message tracing

Trace full message causality chain

TRACE 'a3f1c9d2e8b04f7a'

Returns all spans in the causal chain by walking parent_span_id → span_id recursively via a CTE.

Removed in v0.4.1

MESSAGE JOURNEY FOR TRACE 'id' has been removed. Use TRACE 'id' instead.

Find all message paths for a topic

MESSAGE FLOW FROM TOPIC '/cmd_vel'

Shows which nodes published and subscribed to /cmd_vel.

Filter message flow to a specific destination node

MESSAGE FLOW FROM TOPIC '/cmd_vel' TO NODE '/motor_driver'

Filter message flow to a specific destination topic

MESSAGE FLOW FROM TOPIC '/sensor_data' TO TOPIC '/processed_data'
Removed in v0.4.1

MESSAGE PATHS FOR TOPIC '/topic' and MESSAGE PATH FROM TOPIC '/src' TO NODE '/dst' have been removed. Use MESSAGE FLOW FROM TOPIC ... instead.

All spans for a specific trace

TRACE 'trace-002'

Composable Health Dashboard

Five focused queries that together cover what HEALTH() will do when implemented. Run them individually or wire them into a dashboard.

Error rate by robot

SELECT COUNT(*) FROM traces WHERE status = 'ERROR' FACET robot_id SINCE 30 minutes ago

Log severity breakdown

SELECT COUNT(*) FROM logs WHERE severity IN ('ERROR', 'WARN') FACET severity SINCE 30 minutes ago

Action success rates

SELECT ACTION_SUCCESS_RATE('/navigate_to_pose') FROM traces SINCE 30 minutes ago

Topic publish rates

SELECT TOPIC_RATE('/cmd_vel') FROM metrics SINCE 30 minutes ago

P95 action latency

SELECT APPROX_PERCENTILE(duration, 95) FROM traces WHERE action_name = '/navigate_to_pose' SINCE 30 minutes ago

Robot health and anomaly detection

Robot health assessment

Not implemented in v0.4

HEALTH() is not yet compiled. Use the Composable Health Dashboard recipes above.

HEALTH() FOR ROBOT 'robot_sim_001'

Fan-out across traces, logs, and metrics to produce an overall health assessment.

Detect statistical anomalies in span duration (v0.4.3+)

Uses a two-phase z-score CTE to compare current values against a historical or fleet-wide baseline.

-- Anomalous robots compared to last week
ANOMALY(duration) COMPARED TO last week FACET robot_id SINCE 7 days ago

-- Anomalies compared to fleet-wide baseline
ANOMALY(duration) COMPARED TO fleet SINCE 1 hour ago FACET robot_id

Returns robot_id, avg_current, avg_baseline, z_score, is_anomalous, direction.

Finding anomalous robots after deployment (v0.4.3+)

Compare post-deployment behaviour against the previous week's baseline to catch regressions:

ANOMALY(duration) FROM traces COMPARED TO last week FACET robot_id SINCE 24 hours ago

Pair with FOR VERSION to scope to a specific release:

FOR VERSION 'v1.3.0' ANOMALY(duration) COMPARED TO last week FACET robot_id SINCE 12 hours ago

Path deviation analysis (v0.4.3+)

Check if a robot deviated from its planned path during a specific trace:

PATH DEVIATION FOR TRACE 'trace-002'

Or over a time window for a robot:

PATH DEVIATION FOR ROBOT 'robot_sim_001' SINCE 1 hour ago

Select a specific plan (0 = first occurrence, -1 = latest, default = -1):

PATH DEVIATION PLAN 0 FOR ROBOT 'robot_sim_001' SINCE 6 hours ago

Returns lateral deviation in metres per waypoint plus summary stats.

Joint trajectory tracking (v0.4.3+)

Compare planned vs. actual joint positions to detect mechanical issues or controller errors:

JOINT DEVIATION FOR TRACE 'trace-002'

Or across a time window:

JOINT DEVIATION FOR ROBOT 'arm_01' SINCE 1 hour ago

Returns per-joint position_error_rad and summary statistics.

Browse robot joint map (v0.4.3+)

Inspect the URDF-derived joint map registered for a robot model:

SHOW JOINTS FOR ROBOT 'robot_sim_001'

Returns joint_name, joint_index, joint_type, lower_limit, upper_limit.


MCAP recordings

Find recordings covering a time window

FROM recordings queries the mcap_metadata index — one row per MCAP file. It returns file-level metadata (robot, time window, topics, s3_key). Use s3_key with the returned timestamps to fetch and seek into the file client-side for sub-range slicing.

-- All recent recordings for a robot (SINCE uses overlap semantics on end_time)
FROM recordings WHERE robot_id = 'amr-01' SINCE 6 hours ago LIMIT 10

-- Recordings that contain joint state data
FROM recordings
WHERE '/joint_states' = ANY(topics)
SINCE 24 hours ago

-- Find the recording that covers a specific error timestamp
FROM recordings
WHERE robot_id = 'amr-01'
AND start_time <= '2026-04-17T10:22:00Z'
AND end_time >= '2026-04-17T10:22:00Z'

Returns robot_id, session_id, start_time, end_time, s3_key, topics, message_types.


Pipeline syntax

Pipeline syntax uses | to chain stages, making complex queries more readable:

Find slow error spans, grouped by robot

FROM traces
| WHERE duration > 500 ms
| WHERE status = 'ERROR'
| FACET robot_id
FROM traces
| WHERE action_name = '/navigate_to_pose'
| WHERE status = 'ERROR'
| FACET robot_id
| COMPARE TO last week

Compare error rate to a previous version

FROM traces WHERE status = 'ERROR' SINCE 1 hour ago | COMPARE TO VERSION 'v1.0.0'

Compare two versions head-to-head

FROM traces WHERE action_name = '/navigate_to_pose' COMPARE VERSION 'v1.0' TO VERSION 'v2.0'

Deployment history for a robot

SHOW DEPLOYMENTS FOR ROBOT 'robot_42' SINCE 7 days ago

Top spans by average duration

SHOW SPAN SUMMARY FOR ROBOT 'robot_42' SINCE 1 hour ago

Plan spans for a specific trace

SHOW PLANS FOR TRACE 'a3f1c9d2e8b04f7a'

Log analysis with ordering

FROM logs
| WHERE severity IN ('ERROR', 'WARN')
| ORDER BY severity DESC
| LIMIT 20

Time-series and aggregations

Average span duration per robot

SELECT AVG(duration) FROM traces FACET robot_id

Error count by severity

SELECT COUNT(*) FROM logs WHERE severity = 'ERROR' FACET robot_id

Percentile latency for navigation

SELECT PERCENTILE(duration, 95) FROM traces WHERE action_name = '/navigate_to_pose'

Multiple aggregations

SELECT AVG(duration) AS avg_dur, MAX(duration) AS max_dur, COUNT(*) AS total FROM traces

Rolling average (MOVING_AVG)

Smooth out spikes in action duration over a 5-row window:

SELECT MOVING_AVG(duration, 5) FROM traces WHERE action_name = '/navigate_to_pose'

Rate of change (DERIVATIVE)

Detect rising metric values — useful for spotting resource exhaustion trends:

SELECT DERIVATIVE(metric_value) FROM metrics WHERE metric_name = 'system.cpu.total_usage_pct'

Approximate distinct count

Count unique action names without a full distinct scan (DuckDB uses HLL; PostgreSQL falls back to exact):

SELECT APPROX_COUNT_DISTINCT(span_name) FROM traces SINCE 1 hour ago

Geospatial filtering (v0.4.3+)

Filter by GPS proximity (Haversine)

Find odom messages where the robot was within 500 m of a GPS coordinate:

FROM odom WHERE position WITHIN 500 m OF (37.7749, -122.4194) SINCE 1 hour ago

Uses inline Haversine SQL (great-circle distance) extracting position.latitude / position.longitude from the fields JSONB column.

Filter by local frame proximity (Euclidean)

Find odom messages within 2 m of a local coordinate using Euclidean distance:

FROM odom WHERE position WITHIN 2 m OF POSITION (1.5, 3.0) SINCE 1 hour ago

Extracts pose.pose.position.x / pose.pose.position.y from fields.

Monitoring joint positions during failures

Use array-indexed field access to read individual joint positions from /joint_states messages:

FROM joint_states
WHERE fields['position[0]'] > 1.5
FOR ROBOT 'arm_01'
SINCE 1 hour ago

fields['position[0]'] compiles to "fields"->'position'->>0 on PostgreSQL/DuckDB.


Pagination

Page through results with LIMIT and OFFSET

FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 0
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 20
FROM logs ORDER BY timestamp DESC LIMIT 20 OFFSET 40

Pipeline form:

FROM logs | ORDER BY timestamp DESC | LIMIT 20 | OFFSET 40

See also