Agent-to-Agent Protocol Specification (I7)
Version: 1.0 Status: Ratified Author: GO Date: 2026-03-30 Scope: All STRUXIO agents (GO, HO, PO, IO, Specialists, Workers)
1. Principle: No Direct Agent-to-Agent Communication
All inter-agent communication passes through the STRUXIO Bus. Agents MUST NOT open direct TCP/HTTP connections to one another. Agents MUST NOT write to each other's state files directly. The Bus is the single source of truth for all messages, events, and presence.
This constraint enables:
- Reliable message delivery (Bus persists all messages)
- Auditable communication (every message is a row in events)
- Decoupled agents (spawner does not need to know the target's address)
- Replay and cursor-based recovery
2. Transport
| Layer | Technology |
|---|---|
| Protocol | HTTPS (TLS 1.3) to bus.struxio.ai |
| Auth | Bearer token per agent (issued at spawn time) |
| Encoding | JSON (UTF-8) |
| Persistence | PostgreSQL events table |
| Streaming | Server-Sent Events (SSE) at /api/sse/events for real-time consumers |
Agents MAY use SSE for low-latency notification, but MUST NOT rely on SSE for durability. The poll+cursor pattern is the authoritative delivery mechanism.
3. Message Format
Every message sent via bus_send_message creates one row in the events table.
3.1 Wire format (JSON body sent to Bus)
{
"from_actor": "<actor_id>",
"to_actor": "<actor_id | broadcast>",
"topic": "<topic_string>",
"payload": { /* arbitrary JSON object */ },
"reply_to": "<seq_number | null>",
"idempotency_key": "<uuid | null>"
}
| Field | Type | Required | Description |
|---|---|---|---|
| `from_actor` | string | yes | Sending agent's registered actor ID (e.g. GO, HO:h1, PO:paperclip) |
| `to_actor` | string | yes | Recipient actor ID, or broadcast for all subscribers |
| `topic` | string | yes | Dot-separated topic path (e.g. task.assigned, alert.fired, agent.spawned) |
| `payload` | object | yes | Message body. Schema varies by topic (see Section 5) |
| `reply_to` | integer | no | seq of the message this is a reply to; null for new threads |
| `idempotency_key` | uuid | no | Client-supplied dedup key; Bus discards duplicate (from_actor, idempotency_key) pairs |
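As an illustration of the wire format, the following Python sketch assembles a message body with the six fields from the table and checks the required ones. The helper name `build_message` and its validation rules are hypothetical and not part of the Bus API; a real client would hand the resulting object to bus_send_message.

```python
import json
import uuid
from typing import Any, Optional


def build_message(
    from_actor: str,
    to_actor: str,
    topic: str,
    payload: dict[str, Any],
    reply_to: Optional[int] = None,
    idempotency_key: Optional[str] = None,
) -> dict[str, Any]:
    """Return a wire-format body (hypothetical helper, not a Bus API)."""
    for name, value in (("from_actor", from_actor), ("to_actor", to_actor), ("topic", topic)):
        if not isinstance(value, str) or not value:
            raise ValueError(f"{name} must be a non-empty string")
    if not isinstance(payload, dict):
        raise ValueError("payload must be a JSON object")
    if reply_to is not None and (isinstance(reply_to, bool) or not isinstance(reply_to, int)):
        raise ValueError("reply_to must be an integer seq or None")
    if idempotency_key is not None:
        # uuid.UUID raises ValueError on malformed input; str() normalises the form.
        idempotency_key = str(uuid.UUID(idempotency_key))
    return {
        "from_actor": from_actor,
        "to_actor": to_actor,
        "topic": topic,
        "payload": payload,
        "reply_to": reply_to,
        "idempotency_key": idempotency_key,
    }


if __name__ == "__main__":
    body = build_message("GO", "HO:h1", "task.assigned", {"ticket_id": "abc-123", "priority": 2})
    print(json.dumps(body, indent=2))
```

Optional fields are emitted as null rather than omitted, which mirrors the template in Section 3.1; whether the Bus also accepts bodies that omit them is not stated here.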
3.2 Stored event format (as returned by bus_poll)
{
"seq": 42,
"from_actor": "GO",
"to_actor": "HO:h1",
"topic": "task.assigned",
"payload": { "ticket_id": "abc-123", "priority": 2 },
"reply_to": null,
"created_at": "2026-03-30T09:15:00.123Z"
}
The seq field is the canonical message identifier and ordering key.
4. Delivery Semantics
4.1 At-least-once delivery
The Bus persists every accepted message as an immutable row in events before returning HTTP 200.
Messages are never silently dropped.
Agents MUST be idempotent: the same message MAY be delivered more than once if an agent crashes after processing but before advancing its cursor.
Use idempotency_key to detect and suppress duplicates at the application level; it is a top-level field of the wire format (Section 3.1), not a member of payload.
4.2 Ordering guarantee
Messages are ordered by seq (monotonically increasing integer, assigned by the Bus at insert time).
Within a single from_actor, messages are delivered in send order.
Across different senders, seq reflects only the order in which the Bus inserted the messages, which approximates wall-clock send order but does not guarantee it.
4.3 Cursor-based polling
Each agent maintains a cursor — the seq of the last successfully processed message.
bus_poll(actor="HO:h1", cursor=41)
→ returns messages with seq > 41
→ agent processes each in seq order
→ agent calls bus_ack(actor="HO:h1", seq=45)
→ next poll uses cursor=45
On reconnect, agents pass their last known cursor to receive any messages missed during downtime. The Bus stores the cursor per actor, so agents need not persist it locally.
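The poll, process, ack cycle above can be sketched in Python against an in-memory stand-in for the Bus. `InMemoryBus` and `drain` are hypothetical names used only for illustration; a real agent would call the bus_poll and bus_ack tools instead. The sketch acks once per batch with the highest processed seq, as in the trace above.

```python
from typing import Any, Callable

Event = dict[str, Any]


class InMemoryBus:
    """Toy stand-in for the Bus: an append-only event list plus per-actor cursors."""

    def __init__(self) -> None:
        self.events: list[Event] = []
        self.cursors: dict[str, int] = {}

    def send(self, to_actor: str, topic: str, payload: dict[str, Any]) -> int:
        seq = len(self.events) + 1  # monotonically increasing, assigned at insert time
        self.events.append({"seq": seq, "to_actor": to_actor, "topic": topic, "payload": payload})
        return seq

    def poll(self, actor: str, cursor: int) -> list[Event]:
        # Events addressed to this actor with seq > cursor, already in seq order.
        return [e for e in self.events if e["to_actor"] == actor and e["seq"] > cursor]

    def ack(self, actor: str, seq: int) -> None:
        # The cursor never moves backwards.
        self.cursors[actor] = max(seq, self.cursors.get(actor, 0))


def drain(bus: InMemoryBus, actor: str, handle: Callable[[Event], None]) -> int:
    """Process everything after the stored cursor, then ack the last seq seen."""
    cursor = bus.cursors.get(actor, 0)
    batch = bus.poll(actor, cursor)
    for event in batch:
        handle(event)  # must be idempotent: a crash before the ack replays the batch
    if batch:
        cursor = batch[-1]["seq"]
        bus.ack(actor, cursor)
    return cursor
```

Because the ack happens only after the whole batch is handled, a crash mid-batch causes redelivery rather than loss, which is the at-least-once behaviour described in Section 4.1.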
4.4 No TTL / no expiry
Messages do not expire. An agent that was offline for hours will receive all missed messages on its next poll. Old messages are never deleted by the Bus automatically (archival is a separate ops concern).
5. Standard Topics
| Topic | Direction | Description |
|---|---|---|
| `task.assigned` | GO → any | New ticket/task handed to an agent |
| `task.completed` | any → GO | Agent reports task done |
| `task.failed` | any → GO | Agent reports unrecoverable failure |
| `task.progress` | any → GO | Incremental progress update (optional) |
| `agent.spawned` | GO → GO | Confirmation that a subagent was started |
| `agent.terminated` | any → GO | Agent clean exit notification |
| `alert.fired` | Bus/Governor → GO | An alert has been triggered |
| `alert.resolved` | Bus/Governor → GO | An alert has been cleared |
| `heartbeat` | any → Bus | Presence keepalive (actors must send every 60 s) |
| `message.direct` | any → any | Free-form inter-agent message |
| `broadcast.all` | GO → broadcast | Urgent system-wide notification |
New topics are registered in /opt/struxio/bus/src/store/schema/topics.yaml.
Undocumented topics are accepted by the Bus but will trigger a topic.unknown warning event.
6. Retry Protocol
6.1 Sender retries (transient Bus errors)
If bus_send_message returns HTTP 5xx or a connection error:
- Wait 2 s and retry once.
- Wait 10 s and retry once.
- Wait 60 s and retry once.
- After all three retries fail: log to the local incident file and halt (circuit breaker).
Use the same idempotency_key on all retry attempts so the Bus deduplicates if the first send actually succeeded.
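A minimal Python sketch of this retry schedule follows. It assumes the reading that the three waits follow one initial attempt, for four attempts in total. The names `send_with_retry`, `BusUnavailable` and `CircuitOpen` are hypothetical, and the `send` callable stands in for whatever client wraps bus_send_message. Writing the incident file is left to the caller that catches `CircuitOpen`.

```python
import time
import uuid
from typing import Any, Callable

RETRY_DELAYS_S = (2, 10, 60)  # waits before retry 1, 2 and 3


class BusUnavailable(Exception):
    """Raised by the send callable on HTTP 5xx or a connection error (assumed)."""


class CircuitOpen(Exception):
    """Raised once every retry has failed; the caller should log and halt."""


def send_with_retry(
    send: Callable[[dict[str, Any]], Any],
    message: dict[str, Any],
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    body = dict(message)
    # One key for every attempt, so the Bus can deduplicate if an earlier
    # attempt was stored but its response was lost.
    body["idempotency_key"] = body.get("idempotency_key") or str(uuid.uuid4())
    for delay in (0, *RETRY_DELAYS_S):
        if delay:
            sleep(delay)
        try:
            return send(body)
        except BusUnavailable:
            continue
    raise CircuitOpen(f"send failed after {len(RETRY_DELAYS_S)} retries")
```

Passing `sleep` as a parameter keeps the schedule testable without real waits; production code would leave the default.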
6.2 Receiver retries (missed messages)
Agents poll on a configurable interval (default: 10 s when active, 60 s when idle). On reconnect after any gap, agents pass their stored cursor to bus_poll, as in Section 4.3.
The Bus returns all messages with seq > cursor, filling the gap automatically.
No explicit "re-send" request is needed.
6.3 Dead-letter handling
If an agent processes a message and the action fails irrecoverably:
- Agent sends `task.failed` back to GO with `reply_to: <original_seq>` and an `error_code`.
- GO decides to reassign, escalate, or discard.
- The original message remains in `events`; it is never deleted.
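The failure report in the first step might look like the following Python sketch. The helper name `build_task_failed` is hypothetical, and placing error_code inside payload is an assumption: this section says only that the message carries an error_code, not where it lives.

```python
from typing import Any


def build_task_failed(original: dict[str, Any], self_actor: str, error_code: str) -> dict[str, Any]:
    """Build a task.failed reply for a stored event (hypothetical helper)."""
    return {
        "from_actor": self_actor,
        "to_actor": "GO",
        "topic": "task.failed",
        # Assumption: error_code travels in the payload.
        "payload": {"error_code": error_code},
        # Thread the report to the message that could not be processed.
        "reply_to": original["seq"],
        "idempotency_key": None,
    }
```

Because reply_to carries the original seq, GO can look the failed message up in events when deciding whether to reassign, escalate, or discard.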
7. Presence and Heartbeat
Agents announce presence by calling bus_presence_heartbeat (POST /api/bus/heartbeat; see Section 11).
Heartbeat interval: every 60 seconds while active.
The Bus records a last-seen timestamp per actor.
GO monitors for agents that have not sent a heartbeat in more than 3 minutes and raises an agent.stale alert.
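The staleness rule can be sketched in Python as a pure function over last-seen timestamps. The function name `stale_actors` and the dictionary shape are hypothetical; how GO actually reads last-seen data from the Bus is not specified here.

```python
from datetime import datetime, timedelta, timezone

STALE_AFTER = timedelta(minutes=3)  # threshold from this section


def stale_actors(last_seen: dict[str, datetime], now: datetime) -> list[str]:
    """Return actor IDs whose last heartbeat is strictly older than the threshold."""
    return sorted(actor for actor, ts in last_seen.items() if now - ts > STALE_AFTER)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(stale_actors({"HO:h1": now - timedelta(minutes=5)}, now))
```

With a 60-second heartbeat interval, the 3-minute threshold tolerates two missed heartbeats before an actor is flagged.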
8. Actor ID Conventions
| Pattern | Example | Description |
|---|---|---|
| `GO` | `GO` | Global Orchestrator (singleton) |
| `HO:<host>` | `HO:hetzner-1` | Host Orchestrator |
| `PO:<project>` | `PO:paperclip` | Project Orchestrator |
| `IO:<human>` | `IO:shai` | Interface Orchestrator |
| `W:<uuid>` | `W:a3f9...` | Ephemeral Worker (uses UUID to prevent collision) |
| `S:<type>:<uuid>` | `S:db:b2c1...` | Specialist |
Worker and Specialist IDs include a UUID suffix because multiple instances may run concurrently. GO, HO, PO, and IO are singletons per their domain — the name alone is unique.
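As a rough illustration of these patterns, the Python sketch below splits an actor ID into its tier prefix and remaining segments. The function `parse_actor_id` is hypothetical. It assumes segments never contain a colon, and it does not validate the UUID segment because the exact UUID form is not given here.

```python
# Number of colon-separated segments expected after each tier prefix.
_SEGMENTS_BY_TIER = {"GO": 0, "HO": 1, "PO": 1, "IO": 1, "W": 1, "S": 2}


def parse_actor_id(actor_id: str) -> tuple[str, tuple[str, ...]]:
    """Split an actor ID into (tier, segments); raise ValueError if malformed."""
    tier, *parts = actor_id.split(":")
    expected = _SEGMENTS_BY_TIER.get(tier)
    if expected is None or len(parts) != expected or not all(parts):
        raise ValueError(f"malformed actor ID: {actor_id!r}")
    return tier, tuple(parts)
```

For example, `parse_actor_id("PO:paperclip")` yields the tier `PO` with the single segment `paperclip`, while an ID with an unknown prefix or an empty segment is rejected.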
9. Security
- Every agent authenticates to the Bus with a per-agent Bearer token.
- Tokens are issued by GO at spawn time via `POST /api/agents/tokens`.
- Tokens are rotated per migration `012_token_rotation.sql`.
- An agent MUST NOT send messages on behalf of another actor (`from_actor` must match the token's registered actor).
- The Bus validates `from_actor` server-side and returns HTTP 403 on mismatch.
- Broadcast messages (`to_actor: broadcast`) are only accepted from GO-tier actors.
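The two server-side checks in this list can be sketched in Python as follows. The function `check_send` and its string results are hypothetical. The sketch assumes "GO-tier" means the actor ID is exactly GO, and it does not assign an HTTP status to the broadcast rejection because only the 403 for a from_actor mismatch is specified.

```python
from typing import Any


def check_send(token_actor: str, message: dict[str, Any]) -> str:
    """Return "ok" or a rejection reason for a message sent with this token's actor."""
    if message["from_actor"] != token_actor:
        return "from_actor_mismatch"  # the case answered with HTTP 403
    if message["to_actor"] == "broadcast" and token_actor != "GO":
        return "broadcast_not_allowed"  # assumption: only the GO actor may broadcast
    return "ok"
```

The identity check runs first so that a spoofed from_actor of GO cannot be used to reach the broadcast branch.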
10. Versioning and Compatibility
- This spec is versioned as `1.0`.
- Breaking changes (field removal, type change) require a new major version and a migration period.
- Additive changes (new optional fields, new topics) are backward-compatible and do not increment the major version.
- The Bus's `/health` endpoint returns `{"protocol_version": "1.0"}`.
- Agents that receive an unknown topic MUST silently ack and continue (forward compatibility).
11. Reference Implementation
| Component | Path |
|---|---|
| Bus send endpoint | POST /api/bus/send |
| Bus poll endpoint | GET /api/bus/poll?actor=<id>&cursor=<seq> |
| Bus ack endpoint | POST /api/bus/ack |
| Presence endpoint | POST /api/bus/heartbeat |
| SSE stream | GET /api/sse/events?actor=<id> |
| events table schema | /opt/struxio/bus/src/store/migrations/001_initial.sql |
| MCP tools | bus_send_message, bus_poll, bus_ack, bus_presence_heartbeat |