Agent-to-Agent Protocol Specification (I7)

Version: 1.0
Status: Ratified
Author: GO
Date: 2026-03-30
Scope: All STRUXIO agents (GO, HO, PO, IO, Specialists, Workers)


1. Principle: No Direct Agent-to-Agent Communication

All inter-agent communication passes through the STRUXIO Bus. Agents MUST NOT open direct TCP/HTTP connections to one another. Agents MUST NOT write to each other's state files directly. The Bus is the single source of truth for all messages, events, and presence.

This constraint enables:

  • Reliable message delivery (Bus persists all messages)
  • Auditable communication (every message is a row in events)
  • Decoupled agents (spawner does not need to know the target's address)
  • Replay and cursor-based recovery


2. Transport

| Layer | Technology |
|---|---|
| Protocol | HTTPS (TLS 1.3) to bus.struxio.ai |
| Auth | Bearer token per agent (issued at spawn time) |
| Encoding | JSON (UTF-8) |
| Persistence | PostgreSQL events table |
| Streaming | Server-Sent Events (SSE) at /api/sse/events for real-time consumers |

Agents MAY use SSE for low-latency notification, but MUST NOT rely on SSE for durability. The poll+cursor pattern is the authoritative delivery mechanism.


3. Message Format

Every message sent via bus_send_message creates one row in the events table.

3.1 Wire format (JSON body sent to Bus)

{
  "from_actor": "<actor_id>",
  "to_actor":   "<actor_id | broadcast>",
  "topic":      "<topic_string>",
  "payload":    { /* arbitrary JSON object */ },
  "reply_to":   "<seq_number | null>",
  "idempotency_key": "<uuid | null>"
}

| Field | Type | Required | Description |
|---|---|---|---|
| from_actor | string | yes | Sending agent's registered actor ID (e.g. GO, HO:h1, PO:paperclip) |
| to_actor | string | yes | Recipient actor ID, or broadcast for all subscribers |
| topic | string | yes | Dot-separated topic path (e.g. task.assigned, alert.fired, agent.spawned) |
| payload | object | yes | Message body. Schema varies by topic (see Section 5) |
| reply_to | integer | no | seq of the message this is a reply to; null for new threads |
| idempotency_key | uuid | no | Client-supplied dedup key; Bus discards duplicate (from_actor, idempotency_key) pairs |
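
As an illustration of the wire format above, the following minimal sketch assembles and serializes a message body. The helper name build_message and its validation rules are assumptions made for this example, not part of the Bus client; only the field names and types come from the table.

```python
import json
import uuid
from typing import Any, Optional


def build_message(
    from_actor: str,
    to_actor: str,
    topic: str,
    payload: dict[str, Any],
    reply_to: Optional[int] = None,
    idempotency_key: Optional[str] = None,
) -> dict[str, Any]:
    """Assemble a Section 3.1 wire-format body (hypothetical helper)."""
    for name, value in (("from_actor", from_actor), ("to_actor", to_actor), ("topic", topic)):
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} must be a non-empty string")
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    if reply_to is not None and not isinstance(reply_to, int):
        raise ValueError("reply_to must be an integer seq or None")
    return {
        "from_actor": from_actor,
        "to_actor": to_actor,
        "topic": topic,
        "payload": payload,
        "reply_to": reply_to,
        # Generate a dedup key when the caller does not supply one (assumed policy).
        "idempotency_key": idempotency_key or str(uuid.uuid4()),
    }


msg = build_message("GO", "HO:h1", "task.assigned", {"ticket_id": "abc-123", "priority": 2})
body = json.dumps(msg).encode("utf-8")  # JSON, UTF-8 encoded, per Section 2
```

Generating the key inside the helper is a design choice for the sketch; a sender that retries must reuse the same key across attempts rather than calling the helper again.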

3.2 Stored event format (as returned by bus_poll)

{
  "seq":        42,
  "from_actor": "GO",
  "to_actor":   "HO:h1",
  "topic":      "task.assigned",
  "payload":    { "ticket_id": "abc-123", "priority": 2 },
  "reply_to":   null,
  "created_at": "2026-03-30T09:15:00.123Z"
}

The seq field is the canonical message identifier and ordering key.


4. Delivery Semantics

4.1 At-least-once delivery

The Bus persists every accepted message as an immutable row in events before returning HTTP 200. Messages are never silently dropped. Agents must be idempotent: the same message MAY be delivered more than once if an agent crashes after processing but before advancing its cursor.

Use idempotency_key (a top-level field of the wire format in Section 3.1, not a member of payload) to detect and suppress duplicates at the application level.
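
A receiver can also guard against redelivery by keying on seq, which Section 3.2 names as the canonical message identifier. The sketch below is one possible approach; the class name IdempotentHandler is hypothetical, and the in-memory set is an assumption that would not survive a crash (a real agent would persist it or make its side effects idempotent).

```python
from typing import Any, Callable

Event = dict[str, Any]


class IdempotentHandler:
    """Wrap a handler so each seq is processed at most once (in-memory sketch)."""

    def __init__(self, handler: Callable[[Event], None]) -> None:
        self._handler = handler
        self._seen: set[int] = set()

    def handle(self, event: Event) -> bool:
        """Return True if the event was processed, False if it was a duplicate."""
        seq = event["seq"]
        if seq in self._seen:
            return False
        self._handler(event)
        # Record the seq only after the handler succeeds, so a failed
        # attempt is retried on redelivery.
        self._seen.add(seq)
        return True


processed: list[int] = []
guard = IdempotentHandler(lambda e: processed.append(e["seq"]))
event = {"seq": 42, "topic": "task.assigned", "payload": {"ticket_id": "abc-123"}}
first = guard.handle(event)
second = guard.handle(event)  # simulated redelivery of the same message
```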

4.2 Ordering guarantee

Messages are ordered by seq (monotonically increasing integer, assigned by the Bus at insert time). Within a single from_actor, messages are delivered in send order. Cross-actor ordering is not guaranteed beyond wall-clock proximity.

4.3 Cursor-based polling

Each agent maintains a cursor — the seq of the last successfully processed message.

bus_poll(actor="HO:h1", cursor=41)
  → returns messages with seq > 41
  → agent processes each in seq order
  → agent calls bus_ack(actor="HO:h1", seq=45)
  → next poll uses cursor=45

On reconnect, agents pass their last known cursor to receive any messages missed during downtime. The Bus stores the cursor per actor, so agents need not persist it locally.
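
The poll, process, ack cycle can be sketched against an in-memory stand-in for the Bus. FakeBus and drain are hypothetical names for this example; the method names mirror bus_poll and bus_ack, but their signatures and the filtering by to_actor are assumptions rather than the real client API.

```python
from typing import Any, Callable

Event = dict[str, Any]


class FakeBus:
    """In-memory stand-in for the Bus, for illustration only."""

    def __init__(self, events: list[Event]) -> None:
        self.events = sorted(events, key=lambda e: e["seq"])
        self.cursors: dict[str, int] = {}

    def bus_poll(self, actor: str, cursor: int) -> list[Event]:
        # Return this actor's messages with seq greater than the cursor.
        return [e for e in self.events if e["to_actor"] == actor and e["seq"] > cursor]

    def bus_ack(self, actor: str, seq: int) -> None:
        # The stored cursor only ever moves forward.
        self.cursors[actor] = max(seq, self.cursors.get(actor, 0))


def drain(bus: FakeBus, actor: str, cursor: int, handle: Callable[[Event], None]) -> int:
    """Poll once, process in seq order, ack the last seq, return the new cursor."""
    batch = bus.bus_poll(actor=actor, cursor=cursor)
    for event in batch:
        handle(event)
        cursor = event["seq"]
    if batch:
        bus.bus_ack(actor=actor, seq=cursor)
    return cursor


bus = FakeBus([
    {"seq": 42, "to_actor": "HO:h1", "topic": "task.assigned"},
    {"seq": 43, "to_actor": "HO:h1", "topic": "task.progress"},
    {"seq": 44, "to_actor": "PO:paperclip", "topic": "task.assigned"},
    {"seq": 45, "to_actor": "HO:h1", "topic": "message.direct"},
])
seen: list[int] = []
new_cursor = drain(bus, "HO:h1", 41, lambda e: seen.append(e["seq"]))
```

Acking once per batch means a crash mid-batch causes the whole batch to be redelivered, which is consistent with the at-least-once semantics in Section 4.1.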

4.4 No TTL / no expiry

Messages do not expire. An agent that was offline for hours will receive all missed messages on its next poll. Old messages are never deleted by the Bus automatically (archival is a separate ops concern).


5. Standard Topics

| Topic | Direction | Description |
|---|---|---|
| task.assigned | GO → any | New ticket/task handed to an agent |
| task.completed | any → GO | Agent reports task done |
| task.failed | any → GO | Agent reports unrecoverable failure |
| task.progress | any → GO | Incremental progress update (optional) |
| agent.spawned | GO → GO | Confirmation that a subagent was started |
| agent.terminated | any → GO | Agent clean exit notification |
| alert.fired | Bus/Governor → GO | An alert has been triggered |
| alert.resolved | Bus/Governor → GO | An alert has been cleared |
| heartbeat | any → Bus | Presence keepalive (actors must send every 60 s) |
| message.direct | any → any | Free-form inter-agent message |
| broadcast.all | GO → broadcast | Urgent system-wide notification |

New topics are registered in /opt/struxio/bus/src/store/schema/topics.yaml. Topics not registered there are still accepted by the Bus, but each use triggers a topic.unknown warning event.


6. Retry Protocol

6.1 Sender retries (transient Bus errors)

If bus_send_message returns HTTP 5xx or a connection error:

  1. Wait 2 s and retry once.
  2. Wait 10 s and retry once.
  3. Wait 60 s and retry once.
  4. If all three retries fail: log to the local incident file and halt (circuit breaker).

Use the same idempotency_key on all retry attempts so the Bus deduplicates if the first send actually succeeded.
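
The retry schedule above can be sketched as follows. The names send_with_retry, BusUnavailable and CircuitOpen are hypothetical, and the send callable stands in for whatever transport wraps bus_send_message; it is assumed to raise BusUnavailable on HTTP 5xx or a connection error.

```python
import time
import uuid
from typing import Any, Callable

RETRY_DELAYS_S = (2, 10, 60)  # waits before retries 1, 2 and 3 (Section 6.1)


class BusUnavailable(Exception):
    """Hypothetical error a transport raises on HTTP 5xx or a connection error."""


class CircuitOpen(Exception):
    """Raised once the initial attempt and all retries have failed."""


def send_with_retry(
    send: Callable[[dict[str, Any]], Any],
    message: dict[str, Any],
    sleep: Callable[[float], None] = time.sleep,
    log_incident: Callable[[str], None] = print,
) -> Any:
    body = dict(message)
    # Fix the key once so every attempt carries the same idempotency_key.
    if not body.get("idempotency_key"):
        body["idempotency_key"] = str(uuid.uuid4())
    last_error = None
    for delay in (0, *RETRY_DELAYS_S):  # 0 = initial attempt, no wait
        if delay:
            sleep(delay)
        try:
            return send(body)
        except BusUnavailable as exc:
            last_error = exc
    log_incident(f"bus send failed after {len(RETRY_DELAYS_S)} retries: {last_error!r}")
    raise CircuitOpen("halting sender") from last_error
```

Passing sleep and log_incident as parameters is a choice made for the sketch so the schedule can be exercised without real delays; writing to the local incident file is left to the caller-supplied log_incident.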

6.2 Receiver retries (missed messages)

Agents poll on a configurable interval (default: 10 s when active, 60 s when idle). On reconnect after any gap, agents pass their stored cursor:

bus_poll(actor="<id>", cursor=<last_acked_seq>)

The Bus returns all messages with seq > cursor, filling the gap automatically. No explicit "re-send" request is needed.

6.3 Dead-letter handling

If an agent processes a message and the action fails irrecoverably:

  1. Agent sends task.failed back to GO with reply_to: <original_seq> and an error_code.
  2. GO decides to reassign, escalate, or discard.
  3. The original message remains in events — it is never deleted.
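
Step 1 can be illustrated by building the task.failed reply from the stored event. The helper name build_task_failed is hypothetical. Placing error_code inside payload and deriving a deterministic idempotency_key from the original seq are both assumptions for this sketch; the spec only requires reply_to and an error_code.

```python
import uuid
from typing import Any


def build_task_failed(actor: str, original: dict[str, Any], error_code: str) -> dict[str, Any]:
    """Build a task.failed reply to GO for an event that could not be handled."""
    seq = original["seq"]
    return {
        "from_actor": actor,
        "to_actor": "GO",
        "topic": "task.failed",
        # Assumed payload shape: the spec does not define it in this section.
        "payload": {"error_code": error_code, "failed_topic": original["topic"]},
        "reply_to": seq,
        # Deterministic key (assumption): re-reporting the same failure
        # yields the same key, so the Bus can discard the duplicate.
        "idempotency_key": str(uuid.uuid5(uuid.NAMESPACE_URL, f"task.failed:{actor}:{seq}")),
    }


original = {"seq": 42, "from_actor": "GO", "to_actor": "HO:h1", "topic": "task.assigned"}
report = build_task_failed("HO:h1", original, "E_DISK_FULL")
```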

7. Presence and Heartbeat

Agents announce presence via:

bus_presence_heartbeat(actor="<id>", surface="<runtime_surface>")

Heartbeat interval: every 60 seconds while active. The Bus records a last-seen timestamp per actor. GO monitors for agents that have not sent a heartbeat in more than 3 minutes and raises an agent.stale alert.
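
The staleness rule can be sketched as a pure function over last-seen timestamps. The function name find_stale and the dictionary input are assumptions for this example; how GO actually reads last-seen data from the Bus is not specified here.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=3)  # threshold from Section 7


def find_stale(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Return actors whose last heartbeat is more than 3 minutes old."""
    return sorted(actor for actor, seen in last_seen.items() if now - seen > STALE_AFTER)


now = datetime(2026, 3, 30, 9, 15, tzinfo=timezone.utc)
last_seen = {
    "HO:h1": now - timedelta(seconds=45),
    "PO:paperclip": now - timedelta(minutes=4),
    "IO:shai": now - timedelta(minutes=3),  # exactly at the threshold, not stale
}
stale = find_stale(last_seen, now)
```

With a 60-second heartbeat interval, the 3-minute threshold tolerates two consecutive missed heartbeats before an actor is flagged.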


8. Actor ID Conventions

| Pattern | Example | Description |
|---|---|---|
| GO | GO | Global Orchestrator (singleton) |
| HO:<host> | HO:hetzner-1 | Host Orchestrator |
| PO:<project> | PO:paperclip | Project Orchestrator |
| IO:<human> | IO:shai | Interface Orchestrator |
| W:<uuid> | W:a3f9... | Ephemeral Worker (uses UUID to prevent collision) |
| S:<type>:<uuid> | S:db:b2c1... | Specialist |

Worker and Specialist IDs include a UUID suffix because multiple instances may run concurrently. GO, HO, PO, and IO are singletons per their domain — the name alone is unique.
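
A parser for these patterns might look like the sketch below. ActorId and parse_actor_id are hypothetical names. The allowed character set for host, project, human and type names (no colons or whitespace) is an assumption, as is the requirement that the UUID part be a full, well-formed UUID.

```python
import re
import uuid
from typing import NamedTuple, Optional


class ActorId(NamedTuple):
    role: str                      # GO, HO, PO, IO, W or S
    name: Optional[str]            # host/project/human/type, where applicable
    instance: Optional[uuid.UUID]  # set for Workers and Specialists only


_NAMED = re.compile(r"(HO|PO|IO):([^:\s]+)")


def parse_actor_id(raw: str) -> ActorId:
    """Parse an actor ID per Section 8, raising ValueError if it does not match."""
    if raw == "GO":
        return ActorId("GO", None, None)
    match = _NAMED.fullmatch(raw)
    if match:
        return ActorId(match.group(1), match.group(2), None)
    parts = raw.split(":")
    try:
        if len(parts) == 2 and parts[0] == "W":
            return ActorId("W", None, uuid.UUID(parts[1]))
        if len(parts) == 3 and parts[0] == "S" and parts[1]:
            return ActorId("S", parts[1], uuid.UUID(parts[2]))
    except ValueError:
        pass  # malformed UUID: fall through to the generic error below
    raise ValueError(f"unrecognised actor ID: {raw!r}")


worker_uuid = uuid.uuid4()
worker = parse_actor_id(f"W:{worker_uuid}")
host = parse_actor_id("HO:hetzner-1")
```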


9. Security

  • Every agent authenticates to the Bus with a per-agent Bearer token.
  • Tokens are issued by GO at spawn time via POST /api/agents/tokens.
  • Tokens are rotated per migration 012_token_rotation.sql.
  • An agent MUST NOT send messages on behalf of another actor (from_actor must match the token's registered actor).
  • The Bus validates from_actor server-side and returns HTTP 403 on mismatch.
  • Broadcast messages (to_actor: broadcast) are only accepted from GO-tier actors.
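
The server-side checks in the last three rules can be sketched as a single function. The name authorize_send is hypothetical and this is not the Bus implementation. Two assumptions are made: "GO-tier" is read as the GO actor only, and a rejected broadcast also returns HTTP 403 (the spec states 403 only for a from_actor mismatch).

```python
from typing import Any


def authorize_send(token_actor: str, message: dict[str, Any]) -> int:
    """Return the HTTP status the Bus would answer with (sketch)."""
    # from_actor must match the actor the Bearer token was issued to.
    if message["from_actor"] != token_actor:
        return 403
    # Assumption: only the GO actor counts as GO-tier for broadcasts.
    if message["to_actor"] == "broadcast" and token_actor != "GO":
        return 403
    return 200


spoofed = authorize_send("HO:h1", {"from_actor": "GO", "to_actor": "PO:paperclip"})
worker_broadcast = authorize_send("HO:h1", {"from_actor": "HO:h1", "to_actor": "broadcast"})
go_broadcast = authorize_send("GO", {"from_actor": "GO", "to_actor": "broadcast"})
```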

10. Versioning and Compatibility

  • This spec is versioned as 1.0.
  • Breaking changes (field removal, type change) require a new major version and a migration period.
  • Additive changes (new optional fields, new topics) are backward-compatible and do not increment the major version.
  • The Bus's /health endpoint returns {"protocol_version": "1.0"}.
  • Agents that receive an unknown topic MUST silently ack and continue (forward compatibility).
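
The forward-compatibility rule in the last bullet can be sketched as a dispatcher that skips unknown topics without raising, so the cursor still advances past them. The name process_batch and the handler-table shape are assumptions for this example.

```python
from typing import Any, Callable, Optional

Event = dict[str, Any]


def process_batch(
    events: list[Event],
    handlers: dict[str, Callable[[Event], None]],
) -> Optional[int]:
    """Handle known topics, skip unknown ones, and return the seq to ack."""
    last_seq: Optional[int] = None
    for event in sorted(events, key=lambda e: e["seq"]):
        handler = handlers.get(event["topic"])
        if handler is not None:
            handler(event)
        # Unknown topics are skipped silently; the seq still counts as
        # processed so the agent acks it and moves on.
        last_seq = event["seq"]
    return last_seq


handled: list[int] = []
handlers = {"task.assigned": lambda e: handled.append(e["seq"])}
ack_seq = process_batch(
    [
        {"seq": 50, "topic": "task.assigned"},
        {"seq": 51, "topic": "topic.from.future"},
        {"seq": 52, "topic": "task.assigned"},
    ],
    handlers,
)
```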

11. Reference Implementation

| Component | Path |
|---|---|
| Bus send endpoint | POST /api/bus/send |
| Bus poll endpoint | GET /api/bus/poll?actor=<id>&cursor=<seq> |
| Bus ack endpoint | POST /api/bus/ack |
| Presence endpoint | POST /api/bus/heartbeat |
| SSE stream | GET /api/sse/events?actor=<id> |
| events table schema | /opt/struxio/bus/src/store/migrations/001_initial.sql |
| MCP tools | bus_send_message, bus_poll, bus_ack, bus_presence_heartbeat |
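
For agents calling the HTTP endpoints directly rather than through the MCP tools, the poll and send requests could be constructed as in this sketch using Python's standard library. The function names are hypothetical. The base URL is taken from Section 2; the Authorization header format and the Content-Type header are assumptions based on common Bearer-token practice. The requests are only built here, not sent.

```python
import json
import urllib.parse
import urllib.request
from typing import Any

BUS_BASE_URL = "https://bus.struxio.ai"  # host from Section 2


def poll_request(actor: str, cursor: int, token: str) -> urllib.request.Request:
    """Build GET /api/bus/poll?actor=<id>&cursor=<seq>."""
    query = urllib.parse.urlencode({"actor": actor, "cursor": cursor})
    return urllib.request.Request(
        f"{BUS_BASE_URL}/api/bus/poll?{query}",
        headers={"Authorization": f"Bearer {token}"},
        method="GET",
    )


def send_request(message: dict[str, Any], token: str) -> urllib.request.Request:
    """Build POST /api/bus/send with a Section 3.1 JSON body."""
    return urllib.request.Request(
        f"{BUS_BASE_URL}/api/bus/send",
        data=json.dumps(message).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


poll = poll_request("HO:h1", 41, "example-token")
send = send_request({"from_actor": "HO:h1", "to_actor": "GO", "topic": "heartbeat", "payload": {}}, "example-token")
```

Passing either object to urllib.request.urlopen would perform the call; note that urlencode percent-encodes the colon in actor IDs such as HO:h1.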