When a node is misbehaving — events not flowing, peers not syncing, tasks stalling — the diagnostics API and kernel console give you a structured view into what the network bridge is actually doing. Rather than grepping raw logs, you can query the running node directly and get a snapshot of gossip topics, backfill cursors, transport events, and peer state all in one place.

GET /api/diagnostics

The primary diagnostics endpoint reads the node’s diagnostics/wattswarm_node.jsonl log and combines it with a live bridge observability snapshot.
curl http://127.0.0.1:7788/api/diagnostics | jq .
You can filter the event list with query parameters:
| Parameter | Description |
| --- | --- |
| limit | Maximum number of diagnostic events to return |
| level | Filter by log level (e.g. error, warn, info) |
| component | Filter by component name (e.g. gossip, backfill) |
| category | Filter by event category |
| mode | Filter by mode field |
| phase | Filter by lifecycle phase |
| event_id | Filter by specific SEL event ID |
| object_id | Filter by task, run, or object ID |
| source_node_id | Filter events from a specific remote node |
| search | Free-text search across event payloads |
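
The filters above can be combined in a single request. The following is a minimal Python sketch, assuming the node listens on the default address used on this page and that filters are passed as ordinary URL query parameters. The helper names `diagnostics_url` and `fetch_diagnostics` are hypothetical and not part of wattswarm.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://127.0.0.1:7788"  # default address used throughout this page


def diagnostics_url(base_url=BASE_URL, **filters):
    """Build an /api/diagnostics URL, skipping filters whose value is None."""
    params = {key: value for key, value in filters.items() if value is not None}
    query = urlencode(params)  # percent-encodes values such as free-text searches
    return f"{base_url}/api/diagnostics" + (f"?{query}" if query else "")


def fetch_diagnostics(base_url=BASE_URL, timeout=5.0, **filters):
    """GET the endpoint and decode the JSON response body."""
    with urlopen(diagnostics_url(base_url, **filters), timeout=timeout) as resp:
        return json.load(resp)
```

Against a running node, `fetch_diagnostics(component="backfill", level="error", limit=50)` would return the decoded response for that filter combination.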

What the response includes

The response object contains the following fields:
  • network_service_started: true if the P2P bridge started successfully during this node session.
  • network_service_status: current bridge status string (e.g. running, stopped).
  • snapshot: the latest bridge observability snapshot, which includes:
    • p2p_foundation: "iroh" on the standard state-dir startup path; indicates which transport stack is active.
    • local_iroh_endpoint_id: the Iroh NodeId / EndpointId for this node’s active endpoint.
    • Iroh gossip topic IDs joined: the set of deterministic gossip topic IDs derived from network_id + scope + gossip_kind.
    • Known Iroh contact count: number of peers for which persisted Iroh contact material is available.
    • legacy_transport_active: false on normal state-dir startup. If p2p_foundation is "iroh", legacy transport fields in the snapshot are compatibility placeholders only.
  • diagnostics: array of structured transport, gossip, backfill, and agent-callback event entries from the JSONL log.
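
As a quick sanity check, the top-level fields can be turned into a short list of warnings. This is a sketch only: it takes an already decoded response body, uses the field names listed above, and treats "running" and "iroh" as the expected healthy values based on the examples on this page. The function name `bridge_warnings` is hypothetical.

```python
def bridge_warnings(response):
    """Return human-readable warnings for a decoded /api/diagnostics body."""
    warnings = []
    if not response.get("network_service_started", False):
        warnings.append("P2P bridge did not start in this session")
    status = response.get("network_service_status")
    if status != "running":  # "running" is the healthy value assumed here
        warnings.append(f"bridge status is {status!r}, expected 'running'")
    snapshot = response.get("snapshot") or {}
    if snapshot.get("p2p_foundation") != "iroh":
        warnings.append("p2p_foundation is not 'iroh'")
    if snapshot.get("legacy_transport_active"):
        warnings.append("legacy transport is active")
    return warnings
```

An empty list means none of these checks fired; it does not by itself prove that peers are syncing.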

Accessing diagnostics via the UI

The kernel console includes a formatted diagnostics view. Open http://127.0.0.1:7788/diagnostics in your browser to see the same data rendered with filtering controls. Use this when you want to scan events visually rather than parsing JSON.
Open the /swarm dashboard at http://127.0.0.1:7788/swarm to watch real-time task state transitions driven by live kernel calls. The dashboard calls /api/swarm/state and /api/swarm/tick against actual executor runtimes, so the panels reflect true SEL and projection state — not a simulation.

CLI log commands

The wattswarm log subcommands give you direct access to the append-only Structured Event Log (SEL) stored in PostgreSQL.
# Show the latest events in the SEL (most recent head sequence number and entries)
wattswarm log head --pg-url postgres://postgres:postgres@127.0.0.1:55432/wattswarm

# Replay all events and rebuild projections from scratch
wattswarm log replay --pg-url postgres://postgres:postgres@127.0.0.1:55432/wattswarm

# Verify log integrity — checks sequence continuity and hash chain
wattswarm log verify --pg-url postgres://postgres:postgres@127.0.0.1:55432/wattswarm
Use log replay after a crash or manual schema change to rebuild the node’s projection tables from the raw event history. Use log verify to confirm the log has not been truncated or tampered with.
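
A recovery script might chain the two commands so that projections are rebuilt only from a log that passed verification. The sketch below assumes that `wattswarm log verify` exits with a non-zero status when verification fails; that convention is not confirmed on this page, so check it against your installation. The helper names are hypothetical.

```python
import subprocess

PG_URL = "postgres://postgres:postgres@127.0.0.1:55432/wattswarm"


def log_command(subcommand, pg_url=PG_URL):
    """Build the argv for one of the log subcommands shown above."""
    if subcommand not in ("head", "replay", "verify"):
        raise ValueError(f"unknown log subcommand: {subcommand}")
    return ["wattswarm", "log", subcommand, "--pg-url", pg_url]


def verify_then_replay(pg_url=PG_URL, run=subprocess.run):
    """Run `log verify`, then `log replay` only if verify exited with 0.

    Treating a non-zero exit status as failure is an assumption.
    """
    verify = run(log_command("verify", pg_url))
    if verify.returncode != 0:
        return False
    replay = run(log_command("replay", pg_url))
    return replay.returncode == 0
```

The `run` parameter exists so the function can be exercised with a stub in place of `subprocess.run`.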

Node status

To check the node’s current identity, network membership, and P2P info:
curl http://127.0.0.1:7788/api/node/status | jq .
The response includes node_id, running, mode, local_protocol_version, and peer_protocol_distribution — a map of protocol version strings to the count of peers seen at each version.
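
One use of peer_protocol_distribution is spotting version skew. The following sketch takes a decoded status response and returns only the entries whose version differs from the local one. It assumes local_protocol_version is a string that matches the map keys exactly; the function name is hypothetical.

```python
def peers_on_other_versions(status):
    """Return {version: peer_count} for versions other than the local one."""
    local = status.get("local_protocol_version")
    distribution = status.get("peer_protocol_distribution") or {}
    return {version: count for version, count in distribution.items() if version != local}
```

An empty result means every peer seen so far reports the same protocol version as this node.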

Peers list

To see all discovered peers and how they were found:
curl http://127.0.0.1:7788/api/peers/list | jq .

# or via CLI
wattswarm peers list --pg-url postgres://postgres:postgres@127.0.0.1:55432/wattswarm
Each peer entry includes a source_kind that tells you how the node learned about it:
| source_kind | Meaning |
| --- | --- |
| udp | Discovered via UDP multicast or broadcast announce |
| bootstrap | Loaded from startup bootstrap_contacts |
| connected | Established an active Iroh session |
| identify | Learned via P2P handshake/identify |
| bootstrap_index | Found via bootstrap index lookup |
| local_discovery | Discovered via mDNS (legacy compatibility path) |
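
To see at a glance how peers were discovered, the entries can be tallied by source_kind. This sketch assumes you have already extracted the peer entries from the response as a list of dictionaries; the exact shape of the response envelope is not documented here, and the function name is hypothetical.

```python
from collections import Counter


def count_by_source_kind(peers):
    """Tally peer entries by their source_kind field."""
    return Counter(peer.get("source_kind", "unknown") for peer in peers)
```

A tally with no "connected" entries, for example, suggests the node has learned about peers but has not established an active session with any of them.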

Checking executor health from the kernel

To trigger an executor health check from within the running kernel (rather than from the CLI), send a POST request directly:
curl -X POST http://127.0.0.1:7788/api/executors/check \
  -H "Content-Type: application/json" \
  -d '{"name": "rt"}'
The kernel calls the executor’s /health and /capabilities endpoints and returns a structured result. This is the same check the worker performs before dispatching a step.
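
The same request can be issued from a script. Below is a minimal sketch using only the Python standard library; it mirrors the curl call above, and the helper names are hypothetical. The structure of the returned result is not described on this page, so the sketch only decodes it as JSON.

```python
import json
from urllib.request import Request, urlopen

BASE_URL = "http://127.0.0.1:7788"


def executor_check_request(name, base_url=BASE_URL):
    """Build the POST request for /api/executors/check."""
    body = json.dumps({"name": name}).encode("utf-8")
    return Request(
        f"{base_url}/api/executors/check",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def check_executor(name, base_url=BASE_URL, timeout=10.0):
    """Send the check and return the decoded JSON result."""
    with urlopen(executor_check_request(name, base_url), timeout=timeout) as resp:
        return json.load(resp)
```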

Common investigation patterns

Missing events between nodes

If you expect events from a remote node but they are not appearing in the local SEL, check the backfill cursor state in the diagnostics log:
curl 'http://127.0.0.1:7788/api/diagnostics?component=backfill&limit=50' | jq '.diagnostics'
Look for backfill events where phase is request but there is no corresponding response, or where error payloads appear. A persistent gap in cursors means the peer is either unreachable or returning empty backfill responses. Check that the remote node is up, then run wattswarm log head against it to confirm that its head_seq is advancing.
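
Matching requests to responses by eye gets tedious with more than a few peers. The sketch below counts them per peer from the diagnostics array. It rests on several assumptions that you should confirm against real entries: that each entry exposes phase and source_node_id as top-level keys, that the response phase is literally named "response", and that source_node_id identifies the remote peer involved. The function name is hypothetical.

```python
from collections import Counter


def unanswered_backfill(events):
    """Return {peer: n} where n requests have no matching response.

    Matching is by count per peer only; it does not pair individual
    requests with individual responses.
    """
    pending = Counter()
    for event in events:
        peer = event.get("source_node_id", "unknown")
        phase = event.get("phase")
        if phase == "request":
            pending[peer] += 1
        elif phase == "response":
            pending[peer] -= 1
    return {peer: count for peer, count in pending.items() if count > 0}
```

Because the endpoint returns a limited window of events, a request near the edge of the window can look unanswered when its response simply fell outside the limit.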

Slow finalization

If tasks are taking longer than expected to finalize, check the resolution path breakdown in the run result:
curl 'http://127.0.0.1:7788/api/run/result?run_id=<run-id>' | jq '.result.aggregation.resolution_paths'
The resolution_paths field shows which aggregation steps were taken (e.g. TIE, REEXPLORE, STOCHASTIC). Cross-reference with aggregation.null_resolution to see if null paths were triggered. For latency profiling, the runtime metrics endpoint exposes true p95 latency computed from collected samples.
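
When a run went through many aggregation steps, a tally makes the dominant path obvious. This sketch assumes resolution_paths is a list of step-name strings such as "TIE" or "REEXPLORE", located at the same path the jq filter above uses; the actual shape may differ, and the function name is hypothetical.

```python
from collections import Counter


def tally_resolution_paths(run_result):
    """Count how often each resolution path appears in a decoded run result."""
    aggregation = (run_result.get("result") or {}).get("aggregation") or {}
    paths = aggregation.get("resolution_paths") or []
    return Counter(paths)
```

Calling `tally_resolution_paths(body).most_common()` on a decoded response lists the paths from most to least frequent.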

Sync issues after reconnect

If a node reconnects after a partition and events are still missing, look for gossip backfill events with error payloads in the diagnostics log:
curl 'http://127.0.0.1:7788/api/diagnostics?component=gossip&level=error' | jq '.diagnostics'
Backfill is lane-aware and persists cursor state across reconnects, so a successfully reconnected peer should resume from where it left off. Error payloads here indicate the remote peer rejected or could not serve the backfill range — check whether the remote node’s event log covers the requested sequence range.