
XIOPro Production Blueprint v5.0

Part 4 — Execution & Agent System


1. Purpose of This Part

This document defines how XIOPro:

  • executes work
  • runs agents
  • manages sessions
  • integrates LLM providers
  • supports Remote Control (RC)
  • ensures continuity and recovery
  • optimizes cost and performance

This is the layer that connects:

ODM (Part 3) → Real execution in the world


2. Execution Philosophy

XIOPro execution is:

  • agent-driven
  • ticket-based
  • state-controlled
  • cost-aware
  • provider-agnostic
  • continuously improving

Execution must:

  • never depend on a single session
  • survive crashes
  • resume from state
  • remain observable

3. Execution Stack Overview

flowchart TD
    Ticket --> Task
    Task --> A000
    Orchestrator["Orchestrator"] --> AgentSelection
    AgentSelection --> ModelRouter
    ModelRouter --> ExecutionEngine
    ExecutionEngine --> Activity
    Activity --> DB
    DB --> Governor["Governor"]

4. Core Components

4.1 Orchestrator Role

Formerly: O00 — Orchestrator

Role

Primary execution coordinator. In the unified agent identity model, the orchestrator role is one of several role bundles that can be assigned to an agent. See Part 1, Section 8 for the complete role bundle and agent identity definitions.

Responsibilities

  • read work graph
  • assign tasks
  • select agents
  • trigger execution
  • handle failures
  • maintain continuity

4.1A Orchestrator Surface Names

XIOPro uses named orchestrator surfaces for easy identification:

| Name | Full Name | Host | Launch Command | Role |
| --- | --- | --- | --- | --- |
| GO | Global Orchestrator | Hetzner | devxio go or GO | Primary orchestrator. Runs 24x7. Manages all projects, agents, state. |
| MO | Mac Orchestrator | Mac Studio | devxio mo or MO | Mac-local orchestrator. Handles Mac tasks, browser testing, local experiments. |

Rules

  • GO and MO are surface names, not agent IDs. The agent running GO might be 000 or any agent with the orchestrator role.
  • Both launch via the devxio command with the surface as argument.
  • GO is the primary -- MO reports to GO via the Control Bus.
  • Both can run simultaneously on different hosts.

4.2 Governor Role

Formerly: O01 — Governor

This role is part of the XIOPro Optimizer (see Part 1, Section 8A).

Role

System optimization and protection. In the unified agent identity model, the governor role is a role bundle that can be assigned alongside the orchestrator role.

Responsibilities

  • cost tracking
  • anomaly detection
  • performance analysis
  • optimization recommendations
  • circuit breaking

4.2A Rule Steward Role

Formerly: R01 — Rule & Skill Steward

This role is part of the XIOPro Optimizer (see Part 1, Section 8A).

Role

Role bundle responsible for the lifecycle quality of:

  • RULE_* assets
  • SKILL_* assets
  • agent activation assets such as claude.md
  • reusable operating patterns / templates

The rule steward role is not a runtime governor like the governor role. It is the steward of behavioral assets that shape how XIOPro thinks and executes.

Why It Exists

As XIOPro evolves, the system will continuously accumulate:

  • new skills
  • revised rules
  • agent-specific activations
  • overlapping procedures
  • obsolete guidance
  • conflicting operating patterns

Without a dedicated steward, these assets drift, duplicate, and eventually degrade execution quality.

The rule steward role exists to keep the rule/skill layer:

  • coherent
  • reusable
  • discoverable
  • conflict-minimized
  • approval-governed

Primary Responsibilities

The rule steward must:

  • search for existing rules/skills before new ones are created
  • detect missing capabilities and propose new skill creation
  • validate structure, metadata, and completeness of rule/skill assets
  • detect overlap, contradiction, duplication, and drift
  • evaluate whether an activation file like claude.md remains effective
  • propose consolidation, supersession, deprecation, or promotion
  • draft new skills using existing approved skills when appropriate
  • open approval flows for protected changes
  • maintain lineage across revisions

Non-Responsibilities

The rule steward must not:

  • silently change live execution behavior
  • bypass founder approval for protected changes
  • replace governor runtime governance
  • become uncontrolled self-modification
  • commit rule/skill mutations directly into production without policy

Managed Asset Classes

managed_assets:
  - RULE
  - SKILL
  - ACTIVATION
  - PATTERN
  - PROTOCOL

Operating Modes

r01_modes:
  - audit
  - evaluate
  - propose
  - normalize
  - deprecate

Core Inputs

The rule steward should consume:

  • current RULE_* files
  • current SKILL_* files
  • activation files such as claude.md
  • historical incidents and overrides
  • task/result/reflection history
  • Dream Engine proposals
  • founder/operator requests
  • performance and reuse signals

Core Outputs

The rule steward may emit:

  • validation report
  • conflict report
  • redundancy report
  • skill-gap report
  • draft skill proposal
  • draft rule proposal
  • activation improvement proposal
  • deprecation recommendation
  • approval request

Technology Model

For T1P, rule and skill stewardship should use a dual representation:

  1. Human-readable source of truth

     • Markdown assets in Git
     • explicit metadata/front matter
     • examples, rationale, and scope

  2. Structured runtime mirror

     • normalized YAML/DB representation
     • queryable scope, precedence, owner, status, and approval requirements
     • machine-evaluable validation state

The rule steward operates across both layers.

Stewardship Flow

flowchart TD
    NeedOrProposal --> SearchExisting
    SearchExisting -->|found reusable asset| EvaluateFit
    SearchExisting -->|gap detected| DraftNewAsset
    EvaluateFit --> ValidateAsset
    DraftNewAsset --> ValidateAsset
    ValidateAsset --> DetectConflicts
    DetectConflicts --> ApprovalGate
    ApprovalGate --> PublishApprovedAsset
    PublishApprovedAsset --> UpdateIndex
    UpdateIndex --> AvailableForAgents
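The stewardship flow above can be sketched as a small pipeline. This is an illustrative sketch, not the T1P implementation: the names (`Asset`, `steward_flow`, the in-memory `index`) are hypothetical, and validation/conflict detection are stubbed.

```python
from dataclasses import dataclass, field

@dataclass
class Asset:
    name: str
    kind: str                 # RULE, SKILL, ACTIVATION, PATTERN, PROTOCOL
    approved: bool = False

@dataclass
class StewardshipResult:
    asset: Asset
    conflicts: list = field(default_factory=list)
    published: bool = False

def steward_flow(need: str, index: dict) -> StewardshipResult:
    """Illustrative pipeline: search-before-create, validate, gate, publish."""
    # 1. Search existing assets before drafting anything new
    asset = index.get(need)
    if asset is None:
        # 2. Gap detected: draft a new asset (starts unapproved)
        asset = Asset(name=need, kind="SKILL")
    # 3. Validate + detect conflicts (stubbed in this sketch)
    result = StewardshipResult(asset=asset)
    # 4. Approval gate: only approved assets are published and indexed
    if asset.approved:
        result.published = True
        index[asset.name] = asset
    return result
```

The key property the sketch preserves is that a drafted asset never reaches the index without passing the approval gate.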

Relation to Other Components

| Component | Relation to Rule Steward |
| --- | --- |
| Orchestrator role | consumes approved rules/skills during execution |
| Governor role | governs runtime behavior using approved policy/rule outputs |
| Librarian | stores, indexes, versions, and retrieves managed assets |
| Dream Engine | may propose skill/rule improvements but does not approve them |
| Human Operator | approves protected changes and resolves high-impact conflicts |

Final Rule

The rule steward role is the custodian of execution behavior assets.

The governor role protects runtime. The rule steward role protects the quality and evolution of the rule/skill layer.


4.2B Prompt Steward Role

Formerly: P01 — ContextPrompting Orchestrator

This role is part of the XIOPro Optimizer (see Part 1, Section 8A).

See resources/DESIGN_rc_architecture.md for the Remote Control architecture design covering how human-agent interaction surfaces (Open WebUI, Prompt Composer) connect to the prompt steward role via the Control Bus.

Role

Role bundle responsible for transforming vague human intent and incomplete task context into execution-ready prompt packages.

The prompt steward does not replace context engineering.

It complements context engineering by deciding:

  • whether enough information already exists
  • whether questions should be asked
  • which questions are worth asking
  • how human answers should be converted into durable execution context
  • how the final prompt package should be assembled for the active runtime

Why It Exists

XIOPro does not rely on a single giant "super prompt".

Prompt quality is not only a writing problem. It is also a questioning problem.

As topics, tickets, and issues evolve, the system must be able to:

  • detect ambiguity
  • detect missing constraints
  • detect weak assumptions
  • ask the minimum useful questions
  • preserve the answers for future execution continuity

This applies to:

  • XIOPro itself
  • all STRUXIO products (see MVP1_PRODUCT_SPEC.md for the first product)
  • future STRUXIO.ai product flows

Core Principle

XIOPro replaces static prompt engineering with:

  • context engineering
  • prompt orchestration
  • interactive inquiry

When ambiguity materially affects quality, risk, relevance, or cost, the system should prefer targeted inquiry over silent assumption.

Primary Responsibilities

The prompt steward must:

  • assess task readiness before execution
  • identify missing intent, constraints, preferences, and assumptions
  • select an appropriate prompting mode
  • generate targeted clarifying questions
  • classify questions as optional or blocking
  • convert human answers into structured execution context
  • assemble runtime-specific prompt packages for the orchestrator / execution agents
  • maintain prompt lineage across revisions and retries
  • support human collaboration during design/problem-shaping tasks

Non-Responsibilities

The prompt steward role must not:

  • replace orchestrator execution orchestration
  • replace governor governance
  • replace rule steward rule/skill stewardship
  • ask unbounded or low-value questions
  • block execution when policy allows bounded assumptions
  • silently mutate durable context without traceability

ContextPrompting Modes

contextprompting_modes:
  - direct
  - governed
  - clarify
  - collaborate

Mode meanings:

  • direct = execute immediately with no inquiry unless required by policy
  • governed = ask only required approval/risk/policy questions
  • clarify = ask a small number of targeted questions before execution
  • collaborate = work interactively with the human to shape the problem

Default Mode

Default user-facing mode should be:

default_contextprompting_mode: collaborate

Question Budget

The prompt steward should control how many questions are asked.

question_budget:
  - none
  - light
  - normal
  - deep

Typical guidance:

  • none → direct execution utility task
  • light → one or two material clarifications
  • normal → bounded pre-execution shaping
  • deep → collaborative framing for complex strategic/design work

Prompting Readiness Decision

Before execution, the prompt steward should determine one of:

prompt_readiness_decision:
  - ready_now
  - ask_optional_questions
  - ask_blocking_questions
  - require_human_collaboration
  - require_governed_approval

Inquiry Output Classes

Human answers gathered by the prompt steward should be transformed into durable objects such as:

  • clarified_intent
  • assumptions
  • constraints
  • preferences
  • unresolved_questions
  • approval_inputs
  • prompt_packet_inputs

These outputs must be attachable to:

  • ticket
  • task
  • activity
  • runtime
  • session
  • human decision history

Prompt Package Contract

The prompt steward should produce a bounded prompt package rather than a monolithic prompt blob.

prompt_package:
  task_id: string|null
  runtime_id: string|null
  prompting_mode: enum
  readiness_decision: enum

  goal: string|null
  scope: [string]
  constraints: [string]
  assumptions: [string]
  unresolved_questions: [string]

  relevant_context_refs: [string]
  relevant_rule_refs: [string]
  relevant_skill_refs: [string]

  human_answer_refs: [string]
  recommended_next_step: string|null
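The contract above could be mirrored as a typed structure inside the api-service. A minimal sketch: the field names follow the prompt_package contract, while the class name `PromptPackage` and the defaults are assumptions.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class PromptPackage:
    # Identity and mode (mirrors the prompt_package contract)
    task_id: Optional[str] = None
    runtime_id: Optional[str] = None
    prompting_mode: str = "collaborate"    # direct | governed | clarify | collaborate
    readiness_decision: str = "ready_now"

    # Shaped intent
    goal: Optional[str] = None
    scope: List[str] = field(default_factory=list)
    constraints: List[str] = field(default_factory=list)
    assumptions: List[str] = field(default_factory=list)
    unresolved_questions: List[str] = field(default_factory=list)

    # References rather than inlined blobs -- keeps the package bounded
    relevant_context_refs: List[str] = field(default_factory=list)
    relevant_rule_refs: List[str] = field(default_factory=list)
    relevant_skill_refs: List[str] = field(default_factory=list)
    human_answer_refs: List[str] = field(default_factory=list)
    recommended_next_step: Optional[str] = None
```

Using reference lists instead of embedded content is what keeps the package a bounded contract rather than a monolithic prompt blob.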

Example Operating Logic

  • If context is sufficient and risk is low → direct
  • If policy or approval applies → governed
  • If a few answers would materially improve quality → clarify
  • If the problem itself needs shaping with the founder → collaborate
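The operating logic above reduces to a small decision function. A sketch under assumptions: the input predicates (`context_sufficient`, `policy_applies`, etc.) are hypothetical signals the prompt steward would compute, and giving policy precedence over the other branches is an assumption.

```python
def select_mode(context_sufficient: bool, low_risk: bool,
                policy_applies: bool, needs_shaping: bool) -> str:
    """Map the prompt steward's readiness signals to a contextprompting mode."""
    if policy_applies:
        return "governed"      # approval/risk/policy questions only
    if needs_shaping:
        return "collaborate"   # shape the problem with the human
    if context_sufficient and low_risk:
        return "direct"        # execute immediately
    return "clarify"           # a few targeted questions first
```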

Collaboration Rule

For strategic design, architecture, product shaping, and other high-value ambiguous work, XIOPro should usually prefer collaborate mode.

This is especially relevant to blueprint creation, MVP definition, and early STRUXIO.ai product design.

Interaction with Other Components

| Component | Relation to Prompt Steward |
| --- | --- |
| Orchestrator role | consumes execution-ready prompt packages |
| Governor role | constrains prompting when governance/policy requires |
| Rule Steward role | supplies validated rules/skills/activations for prompt assembly |
| Librarian | supplies supporting knowledge/context assets |
| RC / UI | provides the human interaction surface for inquiry |
| Dream Engine | may identify recurring gaps in skills or question patterns |

Success Criteria

The prompt steward is successful when:

  • the system asks fewer but better questions
  • execution starts with clearer intent
  • assumptions become explicit rather than hidden
  • human collaboration improves high-value tasks
  • prompt packages remain compact, relevant, and traceable
  • inquiry improves quality without becoming friction-heavy

Final Rule

Prompting in XIOPro is not a single artifact.

It is a governed interactive process for shaping execution quality.


4.2C Module Steward Role

Formerly: M01 — Module Portfolio Steward & Optimizer

This role is part of the XIOPro Optimizer (see Part 1, Section 8A).

Role

Role bundle responsible for governing, evaluating, optimizing, and evolving XIOPro's module portfolio across:

  • subscription-backed access
  • API-key access
  • local/self-hosted runtimes
  • cloud/server-hosted runtimes
  • future hybrid execution paths

The module steward treats module choice as a governed optimization problem, not an ad hoc per-agent preference.

Why It Exists

XIOPro will use many modules across many surfaces, agents, and workflows.

That creates a portfolio problem, not just a cost problem.

The system must continuously optimize the use of modules and subscriptions across constrained resources such as:

  • compute power
  • memory
  • bandwidth
  • time / latency
  • monetary cost
  • quota / subscription utilization

while maximizing:

  • quality
  • stability
  • trust

Without a dedicated steward, module usage drifts into:

  • duplicated capability
  • poor routing choices
  • underused subscriptions
  • wasteful cost
  • weak fallback design
  • hidden dependency on vendor-specific surfaces
  • unmanaged self-hosted complexity

Primary Responsibilities

The module steward must:

  • maintain the governed registry of available modules and access paths
  • understand which modules are available by subscription, API, or self-hosting
  • evaluate module fitness by task type, quality target, and environment
  • optimize module selection across constrained resources
  • recommend preferred and fallback modules
  • detect waste, underuse, poor fit, and overlap
  • detect deprecated or weakening module options
  • recommend when self-hosting becomes justified
  • scout and evaluate new modules, plans, and hosting options
  • prepare adoption / upgrade / retirement proposals
  • coordinate with governor role for runtime enforcement
  • coordinate with Part 8 constraints for actual hosting feasibility

Non-Responsibilities

The module steward role must not:

  • auto-purchase subscriptions
  • auto-deploy new module stacks into production
  • auto-switch the portfolio without approval where policy requires it
  • replace governor runtime governance
  • replace prompt steward prompt-package assembly
  • replace rule steward rule/skill stewardship

Governed Asset Classes

managed_module_assets:
  - MODULE
  - MODULE_POLICY
  - SUBSCRIPTION
  - HOSTING_PROFILE
  - MODULE_EVALUATION
  - MODULE_RECOMMENDATION

Optimization Objective

Core optimization principle:

Module choice is a governed optimization game.

Optimization is not only about lowering cost.

It is about achieving the best feasible balance of:

  • quality
  • stability
  • trust
  • speed
  • resource efficiency
  • operational resilience

Typical Dimensions Evaluated

The module steward should evaluate at least:

  • task fit
  • quality / output reliability
  • latency
  • token / usage cost
  • subscription utilization
  • memory and compute footprint
  • bandwidth / network dependency
  • privacy / exposure profile
  • execution surface compatibility
  • hosting feasibility
  • fallback availability
  • operational complexity
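One way to make multi-dimensional evaluation concrete is a weighted score per candidate module. Illustrative only: the dimension subset, weights, module names, and scores below are all hypothetical.

```python
def score_module(scores: dict, weights: dict) -> float:
    """Weighted fitness score across evaluation dimensions (each 0.0-1.0)."""
    total_w = sum(weights.values())
    return sum(scores.get(dim, 0.0) * w for dim, w in weights.items()) / total_w

# Hypothetical candidates scored on a few of the dimensions above
weights = {"task_fit": 0.4, "latency": 0.2, "cost": 0.2, "fallback": 0.2}
candidates = {
    "module-a": {"task_fit": 0.9, "latency": 0.6, "cost": 0.5, "fallback": 1.0},
    "module-b": {"task_fit": 0.7, "latency": 0.9, "cost": 0.9, "fallback": 0.5},
}
ranked = sorted(candidates,
                key=lambda m: score_module(candidates[m], weights),
                reverse=True)
```

In practice the weights themselves would be governed policy, not per-agent preference, which is exactly the point of treating module choice as a portfolio problem.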

Example Questions the Module Steward Must Answer

  • Which module should this class of task prefer by default?
  • Which fallback should be used when the preferred module is unavailable?
  • Which subscriptions are underused or strategically weak?
  • Which self-hosted options are worth evaluating next?
  • Which modules should be deprecated or constrained?
  • Which environments can actually support a proposed new module?

Evidence Sources & Scouting Inputs

The module steward must optimize from evidence, not intuition alone.

Its scouting and evaluation inputs may include:

  • provider documentation
  • provider pricing and plan changes
  • approved benchmark/evaluation reports
  • internal task/module evaluation history
  • research outputs from the Research Center
  • approved web research results
  • Hugging Face model and repository research
  • local or remote CLI-based research tools
  • self-hosting feasibility notes from infrastructure

Hugging Face Rule

Hugging Face may be used as a governed scouting source for:

  • candidate module discovery
  • repository discovery
  • self-hosting research leads
  • capability comparison
  • surrounding ecosystem signals

But Hugging Face findings are not automatic approvals.

They are candidate inputs that must still flow through:

  • module steward evaluation
  • rule steward / prompt steward / governor constraints where relevant
  • approval policy for adoption or strategic change

Runtime Feedback & Telemetry Requirement

The module steward must receive real usage evidence from execution.

At minimum, module usage should be traceable by:

  • module/provider
  • access path
  • execution surface
  • runtime
  • session
  • activity
  • task
  • ticket
  • latency / retry profile
  • estimated or billed cost where available

This feedback loop is required so module optimization can use:

  • actual performance
  • actual stability
  • actual cost pressure
  • actual subscription utilization
  • actual fallback frequency

rather than only assumptions.

Optimization Rule

A module recommendation is incomplete unless it can be supported by at least one of:

  • direct internal usage evidence
  • credible external evaluation
  • controlled comparison result
  • explicit exploratory candidate status

This prevents portfolio decisions from becoming folklore.
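The rule above is easy to enforce mechanically. A sketch: the evidence labels mirror the four bullets, while the function name and data shape are hypothetical.

```python
# Accepted evidence classes, one per bullet in the Optimization Rule
ACCEPTED_EVIDENCE = {
    "internal_usage",         # direct internal usage evidence
    "external_evaluation",    # credible external evaluation
    "controlled_comparison",  # controlled comparison result
    "exploratory_candidate",  # explicit exploratory candidate status
}

def recommendation_is_complete(evidence: list) -> bool:
    """A module recommendation needs at least one accepted evidence class."""
    return any(e in ACCEPTED_EVIDENCE for e in evidence)
```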

Interaction with Other Components

| Component | Relation to Module Steward |
| --- | --- |
| Orchestrator role | executes within the governed module portfolio |
| Governor role | enforces module policy, constraints, and anomaly responses at runtime |
| Prompt Steward role | uses module portfolio guidance when assembling prompt packages |
| Rule Steward role | stewards rules/skills/activations that shape module usage |
| LiteLLM / routing layer | applies preferred/fallback routing decisions where applicable |
| Part 8 infrastructure | provides the actual hosting and resource envelope |
| Human operator | approves protected additions, removals, and strategic changes |

Success Criteria

The module steward is successful when:

  • module usage is explainable rather than ad hoc
  • portfolio choices improve quality, stability, and trust
  • cost and resource use are optimized without hidden fragility
  • subscriptions are used deliberately rather than accidentally
  • self-hosting proposals are grounded in real need and real feasibility
  • new modules are evaluated systematically before adoption
  • fallback and retirement decisions are intentional

Final Rule

Module usage in XIOPro is not a side effect.

It is a governed optimization discipline.


4.2D T1P Implementation Form of Role Bundles

Purpose

The named XIOPro role bundles are architectural capabilities assigned to agents (see Part 1, Section 8).

For T1P, they must also be made concrete as implementation units.

This section defines what these role bundles are at code and deployment level, so the blueprint can be ticketized without pretending that every role is a separate distributed system.

Note: In the unified agent identity model, all five role bundles (orchestrator, governor, rule_steward, prompt_steward, module_steward) can be assigned to a single agent. They are implemented as separate code modules, not separate agents.


4.2D.1 Core T1P Deployables

T1P should begin with a deliberately small set of deployables:

  1. web-ui

     • widget-first web control center

  2. api-service

     • FastAPI-based control/API layer
     • owns core request/response interfaces
     • exposes founder/operator and UI-facing APIs
     • emits SSE streams for live updates

  3. worker-service

     • Python worker/runtime service
     • executes jobs, orchestration loops, research tasks, and background governance tasks

  4. postgres

     • authoritative operational state store

  5. reverse-proxy

     • Caddy

  6. observability stack

     • OpenTelemetry / Prometheus / Grafana

Optional in T1P where needed:

  1. litellm-router

     • only for API-backed module routing paths
     • not required for subscription-only human-operated surfaces

  2. Ruflo execution fabric

     • runtime integration layer for bounded multi-agent execution

Rule

The named professions do not require one deployable each in T1P.

They may begin as application services/modules inside a smaller number of processes.


4.2D.2 Orchestrator Role — Implementation Form

Formerly: O00.

T1P form:

  • application service / orchestration module
  • primarily hosted inside:
  • api-service
  • worker-service for longer-running execution and coordination loops

The orchestrator module should own:

  • orchestration logic
  • task assignment logic
  • execution progression logic
  • handoff into runtime fabric
  • resume/recovery coordination at orchestration level

The orchestrator module should not be implemented as:

  • only a prompt persona
  • only a markdown convention
  • only a UI abstraction

State Ownership

The orchestrator module does not own state authoritatively.

Authoritative state remains in the PostgreSQL-backed work graph / ODM.

The orchestrator module reads and mutates that state through explicit services and records.


4.2D.3 Governor Role — Implementation Form

Formerly: O01.

T1P form:

  • governance service / policy evaluation module
  • implemented as:
  • synchronous policy checks in api-service
  • background anomaly / breaker / rollup evaluation in worker-service

The governor module should own:

  • alert evaluation
  • breaker logic
  • approval gate checks
  • runtime constraint decisions
  • cost anomaly checks
  • governance event emission

The governor module should not be:

  • only a chat persona
  • an invisible UI-side heuristic layer

State Ownership

The governor module does not own canonical operational objects.

It owns:

  • governance logic
  • governance records
  • policy evaluation outputs

Authoritative governance events and related objects remain stored in PostgreSQL.


4.2D.4 Rule Steward Role — Implementation Form

Formerly: R01.

T1P form:

  • application service / governed asset module
  • implemented initially inside api-service and worker-service
  • not required as a separate deployable in T1P

The rule steward module should own:

  • search-before-create checks
  • validation of rule/skill/activation assets
  • conflict/overlap detection
  • publication and approval routing
  • asset-lifecycle support

State Ownership

The rule steward module does not own the source of truth for assets.

Sources of truth remain:

  • Git-managed asset files
  • structured mirror records in PostgreSQL where applicable

The rule steward module owns validation and stewardship behavior over those assets.


4.2D.5 Prompt Steward Role — Implementation Form

Formerly: P01.

T1P form:

  • application service / prompt-package and inquiry module
  • implemented initially inside api-service
  • may use worker-service for longer-running preparation tasks if needed

The prompt steward module should own:

  • prompting mode interpretation
  • readiness decision
  • question selection
  • blocking vs optional inquiry classification
  • prompt-package assembly
  • promotion of meaningful answers into durable context

State Ownership

The prompt steward module does not own chat history as the source of truth.

It reads and writes through:

  • discussion threads
  • tasks
  • sessions
  • human decisions
  • prompt package records / structured context refs


4.2D.6 Module Steward Role — Implementation Form

Formerly: M01.

T1P form:

  • application service / module registry and optimization module
  • implemented initially inside api-service
  • background evidence aggregation and recommendation refresh may run in worker-service

The module steward module should own:

  • module registry logic
  • recommendation logic
  • fallback logic
  • evidence-backed comparison logic
  • subscription/access-path awareness
  • hosting-feasibility evaluation
  • proposal preparation for adoption/deprecation

T1P Narrowing Rule

For T1P, the module steward may begin as a narrow module registry + recommendation layer rather than a large autonomous portfolio engine.

The architectural role remains module_steward, but its initial implementation scope may be deliberately narrow.


4.2D.7 Communication Model

T1P communication should stay simple.

UI <-> Backend

  • REST/JSON over HTTPS
  • SSE for live updates
  • WebSocket only where true bidirectional streaming is justified

API <-> PostgreSQL

  • ORM / explicit persistence layer
  • no direct UI-to-DB path

API <-> Worker

  • PostgreSQL-backed job dispatch / claim / update model
  • no separate broker required in T1P

Backend <-> Ruflo / runtime surfaces

  • adapter/service boundary
  • explicit execution records
  • runtime/session IDs preserved

Backend <-> LiteLLM

  • used only for API-backed module routes
  • not required to mediate subscription-only human-operated paths
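The API <-> Worker dispatch/claim/update model can be sketched in miniature. This in-memory version is illustrative only: in T1P the same pattern would run against PostgreSQL (commonly an atomic claim guarded by `SELECT ... FOR UPDATE SKIP LOCKED`), and all names here are hypothetical.

```python
import itertools
from typing import Optional

class JobQueue:
    """In-memory miniature of the PostgreSQL-backed dispatch/claim/update model."""

    def __init__(self):
        self._seq = itertools.count(1)
        self.jobs = {}  # job_id -> {"state": ..., "payload": ..., "worker": ...}

    def dispatch(self, payload: dict) -> int:
        # api-service inserts a queued job row
        job_id = next(self._seq)
        self.jobs[job_id] = {"state": "queued", "payload": payload, "worker": None}
        return job_id

    def claim(self, worker: str) -> Optional[int]:
        # worker-service claims the oldest queued job; in PostgreSQL this is
        # one atomic UPDATE guarded by FOR UPDATE SKIP LOCKED
        for job_id, job in sorted(self.jobs.items()):
            if job["state"] == "queued":
                job["state"] = "claimed"
                job["worker"] = worker
                return job_id
        return None

    def update(self, job_id: int, state: str) -> None:
        # worker-service reports progress/completion back through the DB
        self.jobs[job_id]["state"] = state
```

Because the queue lives in the database, no separate broker is needed and job records double as durable execution evidence.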

4.2D.8 State Ownership Summary

| Role / Component | Owns Logic | Owns Canonical State |
| --- | --- | --- |
| Orchestrator | Yes | No |
| Governor | Yes | No |
| Rule Steward | Yes | No |
| Prompt Steward | Yes | No |
| Module Steward | Yes | No |
| Reviewer | Yes | No |
| PostgreSQL / ODM | No | Yes |
| Git-managed governed assets | No | Yes (for source assets) |

Rule

The professions own behavior. The system stores canonical state in explicit durable stores.

This prevents role descriptions from becoming state silos.


4.2D.9 T1P Implementation Constraint

T1P should prefer:

  • fewer deployables
  • more explicit modules/services inside those deployables
  • strong contracts
  • durable state
  • clear event and job records

The blueprint may name many professions.

T1P should not force each profession into an independent distributed runtime prematurely.


4.2D.10 Final Rule

For T1P, XIOPro should be implemented as:

  • a small number of deployables
  • a larger number of explicit services/modules
  • one canonical work graph/state layer
  • one clear operator UI
  • one recoverable orchestration and governance core

This preserves architectural clarity without over-distributing the system too early.


4.2E T1P Implementation Form Table (v5.0 Addition)

Each role bundle's concrete T1P implementation form is summarized below for quick reference during ticketization and implementation.

All role bundles are implemented as separate code modules, not separate agents. Current assignment: see Section 19.

| Role Bundle | T1P Implementation |
| --- | --- |
| orchestrator | Python module in api-service. Reads ODM, assigns tasks, triggers execution. Uses Ruflo for agent spawning. |
| governor | Python module in api-service. Policy evaluation, breaker logic, cost tracking. Thin initially. |
| rule_steward | Python module in worker-service. Validation, conflict detection, search-before-create. Thin initially -- most logic handled by orchestrator module inline. |
| prompt_steward | Python module in api-service. Readiness assessment, question generation, prompt assembly. Start as simple logic, not full orchestrator. |
| module_steward | Python module in worker-service. Module registry, usage tracking, recommendation. Start as config file + simple recommendation logic. |
| reviewer | On-demand agent spawned per review request. No persistent module -- spawned as a short-lived Claude Code session with reviewer role activation. Verdict stored as Activity Evaluation in PostgreSQL. |

4.2F Host Resource Awareness (v5.0 Addition)

The orchestrator must check host capacity before spawning any agent.

Pre-Spawn Check

Before spawning an agent, the orchestrator must:

  1. Query the Host Registry for the target host's current state
  2. Check active_agents against max_concurrent_agents
  3. Check current RAM usage against 85% threshold
  4. If capacity insufficient: queue the task, try another host, or escalate
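The pre-spawn check can be written as a pure function over Host Registry data. The field names follow this section and the Agent-Host Binding below; the function itself is a hypothetical sketch.

```python
def can_spawn(host: dict, agent_ram_gb: float) -> bool:
    """Pre-spawn capacity check: agent slots and the 85% RAM threshold."""
    # Check active_agents against max_concurrent_agents
    if host["active_agents"] >= host["max_concurrent_agents"]:
        return False
    # Check projected RAM usage against the 85% threshold
    projected = host["ram_used_gb"] + agent_ram_gb
    return projected / host["ram_total_gb"] < 0.85
```

If `can_spawn` returns False, the orchestrator falls back to the options in step 4: queue the task, try another host, or escalate.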

Agent-Host Binding

Every Agent Runtime must carry:

host_id: string          # which host this agent runs on
host_name: string        # human-readable reference
resource_estimate:
  ram_gb: float          # estimated RAM this agent will consume
  cpu_cores: float       # estimated CPU usage

Multi-Host Execution

XIOPro supports execution across multiple hosts:

| Host | Role | Typical Workloads |
| --- | --- | --- |
| Hetzner CPX62 | control_plane | orchestrator (all roles), services, domain brains, workers |
| Mac Studio M1 (32GB) | hybrid | remote worker, local experiments, overflow agents |
| Future cloud nodes | worker / gpu | compute-intensive tasks, self-hosted models |

The orchestrator should prefer the control plane host for orchestration and distribute overflow to available hosts.

OOM Prevention

  • 85% RAM threshold triggers "no new agents" gate
  • 90% triggers graceful shutdown of lowest-priority agents
  • 95% triggers emergency agent termination + alert to founder
  • Host health is monitored by the governor with breaker policies
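The escalation ladder above maps directly to a threshold function. A sketch; the action labels are hypothetical names for the behaviors listed.

```python
def oom_action(ram_fraction: float) -> str:
    """Map host RAM utilization (0.0-1.0) to the OOM-prevention action."""
    if ram_fraction >= 0.95:
        return "emergency_terminate_and_alert"       # terminate agents + alert founder
    if ram_fraction >= 0.90:
        return "graceful_shutdown_lowest_priority"   # shed lowest-priority agents
    if ram_fraction >= 0.85:
        return "no_new_agents"                       # spawn gate closes
    return "normal"
```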

4.2G Ruflo Relationship to Orchestrator Role (v5.0 Clarification)

Ruflo (claude-flow) is the agent execution runtime. The orchestrator is the orchestration logic that uses Ruflo.

The separation is:

  • The orchestrator decides WHAT to execute. It reads the work graph, selects tasks, assigns agents, determines execution order, and manages progression.
  • Ruflo decides HOW to spawn agents. Ruflo handles agent lifecycle, process spawning, sub-agent coordination, execution boundaries, and runtime fabric management.

The orchestrator invokes Ruflo as its execution fabric. Ruflo does not contain orchestration logic -- it provides the runtime machinery that the orchestrator directs.

This distinction prevents confusion between the orchestration role and the execution runtime (Ruflo). They are complementary, not interchangeable.


4.2H XIOPro Control Bus (v5.0 Addition)

The XIOPro Control Bus is the unified communication and coordination backbone. Full specification (architecture, capabilities table, intervention model, push delivery, data access rules, migration path): see Part 2, Section 5.8.

Agent Communication Flow

Agent starts session
  → registers with Bus (POST /agents/register) using 3-digit agent_id
  → opens SSE channel (GET /events/{agent_id})
  → receives tasks, messages, interventions via push
  → reports activity results back to Bus
  → heartbeats every 60 seconds
Agent ends session
  → Bus marks agent offline
  → queued messages persist for next session
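The session flow above implies a small client-side contract. A sketch under assumptions: the endpoints (`POST /agents/register`, `GET /events/{agent_id}`) and the 60-second heartbeat come from the flow itself, but the payload shape, the two-missed-heartbeat staleness rule, and the helper names are hypothetical.

```python
import re

def build_registration(agent_id: str, host: str) -> dict:
    """Payload for POST /agents/register; agent_id must be the 3-digit form."""
    if not re.fullmatch(r"\d{3}", agent_id):
        raise ValueError(f"agent_id must be 3 digits, got {agent_id!r}")
    return {
        "agent_id": agent_id,
        "host": host,
        "events_channel": f"/events/{agent_id}",  # SSE channel opened after registering
        "heartbeat_interval_s": 60,
    }

def is_stale(last_heartbeat_s: float, now_s: float,
             interval_s: int = 60, missed: int = 2) -> bool:
    """Bus-side liveness check: mark offline after `missed` missed heartbeats."""
    return now_s - last_heartbeat_s > interval_s * missed
```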

Relationship to Ruflo

| Layer | Scope | Persistence |
| --- | --- | --- |
| Control Bus | Cross-session, cross-host | PostgreSQL -- survives everything |
| Ruflo | Within-session, within-host | Session memory -- dies with session |

Ruflo reports state to the Bus. The Bus does not depend on Ruflo.


4.2I Reviewer Role (v5.0 Addition)

Role

The Reviewer is a short-lived agent role spawned by an orchestrator (GO or PO) after a builder agent completes significant work. Its sole purpose is to independently evaluate the output against the original specification and return a verdict to the spawning orchestrator.

The Reviewer is not the builder. It is never the same agent that produced the work.

Why It Exists

Builders verify their own output via the Completion Self-Check Protocol (Section 5.2). That is insufficient for high-stakes deliverables. Self-evaluation has a structural blind spot: the builder shares the same context, assumptions, and potential misunderstandings that produced the work.

The Reviewer role closes this gap by introducing an independent perspective:

  • reads the spec and the output independently, with no shared build context
  • applies a different model tier where possible (Opus reviews Sonnet's work; Sonnet reviews Haiku's work)
  • cannot be influenced by the builder's reasoning path
  • reports a clean verdict with evidence

When to Spawn a Reviewer

A Reviewer should be spawned when:

  • a ticket is marked significant (architectural change, public API, schema migration, security-sensitive work)
  • the builder's Completion Self-Check confidence is in the 0.5–0.8 range
  • the orchestrator's policy for the project requires mandatory review
  • the builder explicitly requests independent review (rare but permitted)

A Reviewer is NOT spawned for:

  • routine sub-hour tasks
  • documentation edits without behavioral impact
  • tasks already reviewed by a human via RC

Spawning Rule

reviewer_spawn_rule:
  spawned_by: orchestrator (GO or PO)
  trigger: builder marks task complete on a significant ticket
  constraint: reviewer_agent_id != builder_agent_id
  model_preference:
    - if builder used sonnet → prefer opus for reviewer
    - if builder used opus → prefer sonnet for reviewer
    - if builder used haiku → prefer sonnet for reviewer
    - if preferred model unavailable → use any different model tier
  lifecycle: short-lived — spawned for one review, terminates after verdict
  bus_registration: yes — registered in Control Bus for traceability
  cost_attribution: separate ledger entry, attributed to the ticket
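
The `model_preference` ladder above can be written as one selection function. This is a sketch of the spawning rule only; the function name and the `available` argument are assumptions, not part of the spec.

```python
def pick_reviewer_model(builder_model: str, available: set) -> str:
    """Apply the model_preference ladder, then the 'any different tier' fallback."""
    preference = {"sonnet": "opus", "opus": "sonnet", "haiku": "sonnet"}
    preferred = preference.get(builder_model)
    if preferred in available:
        return preferred
    # Fallback: use any model tier different from the builder's.
    for candidate in sorted(available):
        if candidate != builder_model:
            return candidate
    raise RuntimeError("no distinct model tier available for independent review")
```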

Reviewer Responsibilities

The Reviewer must:

  • read the original ticket specification (goal, scope, acceptance criteria)
  • read the builder's output (code, document, artifact, or result)
  • evaluate each acceptance criterion independently
  • identify gaps, regressions, or spec deviations
  • produce a structured verdict

The Reviewer must not:

  • fix the output itself
  • negotiate with the builder
  • consult the builder about intent
  • carry over context from a previous session on this ticket

Verdict Structure

review_verdict:
  ticket_id: string
  task_id: string
  reviewer_agent_id: string
  builder_agent_id: string
  reviewer_model: string
  builder_model: string
  verdict: APPROVED | NEEDS_FIX | REJECTED
  criteria_results:
    - criterion: string
      result: pass | fail | partial
      evidence: string
  gaps_found: [string]
  fix_required: [string]   # populated when verdict = NEEDS_FIX
  rejection_reason: string  # populated when verdict = REJECTED
  recommendation: string

Verdict Meanings

  • APPROVED — all acceptance criteria pass; orchestrator may close or promote the ticket
  • NEEDS_FIX — one or more criteria are partial or failing; orchestrator re-assigns to builder with the fix list
  • REJECTED — output does not meet the spec at a fundamental level; orchestrator decides whether to reassign or escalate

Orchestrator Response to Verdict

| Verdict | Orchestrator Action |
|---|---|
| APPROVED | Mark task complete, proceed with ticket progression |
| NEEDS_FIX | Re-open task, assign fix list to original builder, re-trigger review on completion |
| REJECTED | Escalate to human (RC) or reassign entire task to a different agent |
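
The verdict-to-action mapping reduces to a small dispatch function. The action labels here are shorthand for the table rows, not real orchestrator calls.

```python
def act_on_verdict(verdict: str) -> str:
    """Map a review verdict to the orchestrator action described above."""
    actions = {
        "APPROVED": "mark_complete_and_progress_ticket",
        "NEEDS_FIX": "reopen_task_and_assign_fix_list",
        "REJECTED": "escalate_to_human_or_reassign",
    }
    if verdict not in actions:
        raise ValueError(f"unknown verdict: {verdict!r}")
    return actions[verdict]
```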

Relation to Completion Self-Check

The Completion Self-Check (Section 5.2) is the builder's internal gate. The Reviewer is the external gate.

Both must pass before a significant ticket is closed.

Builder self-check passes → task marked complete (builder)
                          → orchestrator spawns Reviewer
                          → Reviewer returns verdict
                          → APPROVED → ticket closed
                          → NEEDS_FIX / REJECTED → ticket re-opened

T1P Implementation Form

  • Reviewer is spawned as an on-demand agent (Pattern 2, Section 5A.2) with the reviewer role assigned
  • For T1P, review is triggered manually by the orchestrator after builder completion on significant tickets
  • Automated spawn-on-completion is a post-T1P enhancement
  • The review_verdict output is stored as an Activity Evaluation entity (Part 3, Section 4.6.1) attached to the reviewed task

Interaction with Other Components

| Component | Relation to Reviewer |
|---|---|
| Orchestrator role | spawns Reviewer, receives verdict, acts on it |
| Builder (Specialist/Worker) | produces the work being reviewed; cannot interact with active Reviewer |
| Completion Self-Check | builder-side gate that precedes Review spawn |
| Activity Evaluation (Part 3) | verdict stored as evaluation record |
| Governor role | tracks review cost; may require review for high-cost tickets |
| RC | receives REJECTED verdicts requiring human judgment |

Final Rule

The Reviewer exists to catch what self-evaluation misses.

It is not a bureaucratic gate. It is a targeted quality signal for work that matters.


4.3 Ruflo — Agent Swarm Engine

Role

Agent orchestration runtime.

Responsibilities

  • spawn agents
  • manage sub-agents
  • control execution lifecycle
  • enforce boundaries

Notes

  • acts as execution fabric
  • integrates with Claude Code / other agents
  • supports multi-agent collaboration

4.4 LiteLLM — Model Router

Role

Provider abstraction layer.

Responsibilities

  • route requests to:
    • Claude
    • OpenAI
    • Gemini
    • local models
  • optimize cost vs performance
  • handle fallbacks
  • unify the API interface

Key Feature

Enables provider independence


4.5 Execution Engine

Role

Actual execution runtime.

Can include:

  • Claude Code (primary)
  • RooCode
  • custom Python agents
  • CLI-based execution
  • future local models

4.6 Remote Control (RC)

4.6.1 Purpose

RC enables the human operator to interact with live execution in a controlled and auditable way.

RC exists to:

  • attach to a running execution context
  • respond to escalation requests
  • approve or reject protected actions
  • inject bounded guidance
  • redirect or constrain execution
  • recover decision continuity during ambiguity or failure

RC is the primary human interaction surface for live XIOPro brains.

It supports:

  • exploratory conversation
  • execution-bound discussion
  • approval and escalation handling
  • recovery intervention
  • bounded guidance and redirection

When RC interaction materially affects execution, it must be converted into durable operational state.

4.6.2 Principle

RC transforms XIOPro from:

autonomous system → governed autonomous system

The key principle is:

RC is not the system of record.

The system of record is the durable operational state held in:

  • Agent Runtime
  • Session
  • Escalation Request
  • Human Decision
  • Activity / Ticket / Task lineage
  • Transcript and context references

RC is the human interaction surface over those objects.


4.6.3 Canonical Objects Used by RC

RC must operate on the canonical runtime objects defined in the ODM.

Agent Runtime

Represents the live actor doing work.

RC may:

  • attach to it
  • pause it
  • constrain it
  • redirect it
  • resume it

Session

Represents the durable execution session.

RC must attach to a session, not merely to an abstract agent name.

A session may be:

  • active
  • idle
  • paused
  • waiting
  • blocked
  • crashed
  • recovering
  • closed
  • archived

Escalation Request

Represents a durable request for human discussion, clarification, or approval.

RC should open or respond to an escalation request rather than relying on ad hoc chat state.

Human Decision

Represents the durable answer or approval outcome recorded by the founder/operator.

RC is one way to create a Human Decision, but the decision must persist beyond the UI.

Execution Surface

Represents where the runtime is actually executing, such as:

  • Claude Code
  • Codex
  • Gemini CLI
  • custom CLI
  • API worker
  • future local model runtime

RC must be aware of execution surface constraints.


4.6.4 RC Interaction Modes

RC should support at least these modes:

Attach Mode

Used when the founder wants to connect to an already-running session.

Escalation Response Mode

Used when a task or runtime has opened a durable Escalation Request.

Approval Mode

Used when a protected action requires formal go / no-go input.

Redirect Mode

Used when the founder changes goal, scope, constraints, provider, or path.

Recovery Mode

Used when a session crashed, degraded, or became blocked and a recovery decision is needed.


4.6.5 RC Architecture

flowchart TD
    Human --> RCInterface
    RCInterface --> RCManager
    RCManager --> Session
    RCManager --> AgentRuntime
    RCManager --> EscalationRequest
    EscalationRequest --> HumanDecision
    HumanDecision --> RCManager
    RCManager --> Orchestrator["Orchestrator"]
    RCManager --> Governor["Governor"]
    Orchestrator --> ExecutionSurface
    ExecutionSurface --> Session
    Session --> TranscriptStore
    Session --> ContextBundle

4.6.6 RC Manager Responsibilities

RC Manager is the backend control layer for RC.

It must:

  • locate attachable sessions
  • bind human interaction to the correct runtime scope
  • assemble the required context bundle
  • persist interaction history
  • route decisions back into runtime execution
  • preserve ticket/task/activity lineage
  • support multi-brain switching
  • prevent uncontrolled cross-session contamination

It must not:

  • silently overwrite durable system state
  • bypass approval requirements
  • become a generic freeform chat relay without structure

4.6.7 Context Bundle Contract

Before attaching or escalating, RC should assemble a bounded context bundle.

Minimum bundle contents:

context_bundle:
  runtime_id: string
  session_id: string
  execution_surface_id: string|null

  ticket_id: string|null
  task_id: string|null
  activity_id: string|null

  current_goal: string|null
  current_state: string
  blocker_summary: string|null

  recent_actions_ref: string|null
  relevant_knowledge_refs: [string]
  transcript_ref: string|null
  checkpoint_ref: string|null

  escalation_request_id: string|null
  pending_approval: boolean
  recommended_next_step: string|null

This keeps human intervention compact, explicit, and resumable.
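
A minimal builder can guarantee that every key of the contract is present and that nothing outside the contract slips in. The field names are the contract's own; the builder function and its defaults (`False` for `pending_approval`, an empty list for `relevant_knowledge_refs`) are assumptions consistent with the declared types.

```python
def make_context_bundle(runtime_id: str, session_id: str,
                        current_state: str, **extra) -> dict:
    """Assemble a bundle with every contract key present, optional keys defaulted."""
    bundle = {
        "runtime_id": runtime_id,
        "session_id": session_id,
        "execution_surface_id": None,
        "ticket_id": None,
        "task_id": None,
        "activity_id": None,
        "current_goal": None,
        "current_state": current_state,
        "blocker_summary": None,
        "recent_actions_ref": None,
        "relevant_knowledge_refs": [],
        "transcript_ref": None,
        "checkpoint_ref": None,
        "escalation_request_id": None,
        "pending_approval": False,
        "recommended_next_step": None,
    }
    # Reject keys outside the contract so the bundle stays bounded.
    unknown = set(extra) - set(bundle)
    if unknown:
        raise KeyError(f"keys outside the context bundle contract: {sorted(unknown)}")
    bundle.update(extra)
    return bundle
```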


4.6.8 RC Triggers

RC may be invoked by:

  • requires_human = true
  • requires_approval = true
  • ambiguity detected
  • recovery tradeoff required
  • quality failure requires judgment
  • runtime blocked
  • founder manual intervention
  • governance escalation from the governor

4.6.9 RC Flow

flowchart TD
    RuntimeActive --> TriggerDetected
    TriggerDetected --> EscalationOrAttach
    EscalationOrAttach --> ContextBundleBuilt
    ContextBundleBuilt --> HumanInteraction
    HumanInteraction --> HumanDecisionRecorded
    HumanDecisionRecorded --> RuntimeResume
    RuntimeResume --> SessionUpdated
    SessionUpdated --> AuditTrail

4.6.10 RC Interaction Durability

| Mode | Purpose | Durability Requirement |
|---|---|---|
| exploratory conversation | think with a brain without immediate execution change | optional unless promoted |
| execution-bound discussion | guide or clarify active work | must persist if it affects work |
| approval / escalation | formal human gate | durable by default |
| recovery intervention | unblock or redirect after failure/degradation | durable by default |

Rule

RC is the unified human interaction surface for XIOPro brains.

Not every RC conversation must mutate execution state.

But any RC conversation that changes execution, constraints, approvals, direction, or recovery must be recorded as durable operational state.


4.6.11 RC Success Criteria

RC is successful when:

  • the founder can attach to the correct live execution context
  • discussion and approval become durable system state
  • context injection is bounded and traceable
  • session continuity is preserved after intervention
  • multiple brains can be switched without confusion
  • recovery decisions are captured and replayable

4.6.12 Final Statement

RC is not a convenience chat layer.

It is the controlled human intervention surface for live execution.


4.7 Session Manager

4.7.1 Role

Session Manager owns session lifecycle control.

It ensures that execution continuity survives:

  • normal pause/resume
  • human escalation
  • provider instability
  • runtime crash
  • surface switch
  • controlled recovery

4.7.2 Responsibilities

Session Manager must:

  • open and close sessions
  • monitor session health
  • persist session checkpoints
  • transfer or rebuild context
  • coordinate session recovery
  • track attachment eligibility
  • support resume semantics
  • preserve transcript references
  • prevent orphaned runtimes

4.7.3 Session State Model

Session Manager must honor the canonical session states:

  • active
  • idle
  • paused
  • waiting
  • blocked
  • crashed
  • recovering
  • closed
  • archived

Interpretation

  • active = currently executing
  • idle = no immediate work, resumable
  • paused = intentionally halted
  • waiting = waiting on human/dependency/event
  • blocked = cannot continue without intervention
  • crashed = abnormal interruption
  • recovering = recovery path underway
  • closed = ended and no longer active
  • archived = retained for history

4.7.4 Recovery Paths

Session recovery should support at least:

  • retry same session
  • resume with new session on same surface
  • switch execution surface
  • switch model/provider path
  • escalate to human for recovery decision
  • terminal close

Recovery must preserve lineage to:

  • runtime
  • ticket
  • task
  • activity
  • escalation request
  • human decision

4.7.5 Attachment Eligibility

A session is attachable when:

  • it has not been terminally closed
  • it is still relevant to live or recoverable work
  • required context is available
  • ownership/lock conditions permit intervention

Attachable states usually include:

  • active
  • idle
  • paused
  • waiting
  • recovering
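
The eligibility rules above can be sketched as one predicate. The state set is the "usually include" list, treated here as the default policy; the boolean arguments stand in for the context-availability and ownership/lock checks, which are richer in practice.

```python
ATTACHABLE_STATES = {"active", "idle", "paused", "waiting", "recovering"}

def is_attachable(state: str, *, context_available: bool, lock_permits: bool) -> bool:
    """A session is attachable when its state allows it, required context
    is available, and ownership/lock conditions permit intervention."""
    return state in ATTACHABLE_STATES and context_available and lock_permits
```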

4.7.6 Ownership & Locking

Session Manager should prevent unsafe simultaneous human/control collisions.

Minimum rules:

  • one human control attachment at a time
  • explicit lock release on detach or timeout
  • emergency override allowed with audit log
  • session ownership visible to the orchestrator/governor and the control surface
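
The minimum rules above can be sketched as a small in-memory lock. The 15-minute timeout, method names, and the audit tuple shape are assumptions for illustration; a real implementation would persist ownership where the orchestrator/governor can see it.

```python
import time

class SessionLock:
    """One human control attachment at a time; release on detach or timeout;
    emergency override is allowed but always audit-logged."""

    def __init__(self, timeout_s: float = 900.0):
        self.timeout_s = timeout_s
        self.owner = None
        self.acquired_at = 0.0
        self.audit = []  # every override is recorded here

    def attach(self, who: str, *, override: bool = False, now: float = None) -> bool:
        now = time.monotonic() if now is None else now
        expired = self.owner is not None and now - self.acquired_at > self.timeout_s
        if self.owner is not None and not expired and not override:
            return False  # someone else holds the attachment
        if self.owner is not None and not expired and override:
            self.audit.append(("override", who, self.owner))
        self.owner, self.acquired_at = who, now
        return True

    def detach(self, who: str) -> None:
        if self.owner == who:
            self.owner = None
```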

4.7.7 Session Success Criteria

Session management is successful when:

  • sessions survive ordinary interruptions
  • recoverable failures remain recoverable
  • no important context is silently lost
  • human intervention can resume the right work reliably
  • execution surfaces can be switched without breaking lineage

4.8 Memory / Context Layer

4.8.1 Role

This layer preserves the working context needed for execution continuity.

It is not identical to long-term knowledge storage.

It provides the bounded memory bridge between:

  • live runtime execution
  • ticket/task state
  • durable knowledge
  • human intervention
  • recovery

4.8.2 Context Horizons

Short-Term

Immediate live execution state:

  • current step
  • recent tool calls
  • latest outputs
  • transient working memory

Mid-Term

Execution continuity state:

  • ticket/task context
  • session checkpoints
  • escalation context
  • current constraints
  • recent decisions

Long-Term

Persistent knowledge:

  • rules
  • skills
  • activations
  • documents
  • reflections
  • prior decisions
  • knowledge graph / ledger references

4.8.3 Context Sources

The context layer may assemble context from:

  • Session transcript
  • checkpoint artifacts
  • task/activity state
  • Librarian / knowledge layer
  • Knowledge Ledger
  • human decisions
  • governance decisions
  • Dream-derived improvements where approved

4.8.4 Context Rules

The context layer must:

  • minimize irrelevant context
  • preserve critical continuity
  • avoid cross-ticket contamination
  • keep human intervention bounded
  • allow reconstruction after crash or restart

It must not:

  • blindly dump all history into runtime prompts
  • let one brain inherit another brain's state without justification
  • treat chat history as the only memory source

4.8.5 Resume Bundle

A resumed runtime should receive a structured resume bundle, not just raw transcript replay.

Minimum resume bundle:

resume_bundle:
  session_id: string
  runtime_id: string
  current_goal: string|null
  latest_valid_checkpoint_ref: string|null
  latest_human_decision_ref: string|null
  active_constraints: [string]
  relevant_knowledge_refs: [string]
  next_expected_action: string|null

4.8.6 Success Criteria

The context layer is successful when:

  • tasks resume without silent amnesia
  • human decisions remain attached to execution
  • context remains compact and relevant
  • session recovery is practical
  • long-term knowledge improves execution without polluting it

4.8A Memory Engineering Principles (from 5-Layer Memory Stack Research)

These 5 production engineering rules apply to all XIOPro memory operations (Hindsight, Librarian, state files, knowledge vault). Derived from @the_enterprise.ai's "5-Layer AI Agent Memory Stack" research (see struxio-knowledge/vault/research_inbox/REVIEW_5_layer_memory_stack_images.md for full analysis).

memory_engineering_principles:

  1_async_updates:
    rule: "Never block the main agent execution for memory operations"
    implementation: "Memory extraction, indexing, and storage happen in background threads or post-activity hooks"
    applies_to: [Hindsight, Librarian, Knowledge Ledger]

  2_debounce_writes:
    rule: "Batch memory operations; don't write on every turn"
    implementation: "Wait 30 seconds or N turns, batch messages, make one extraction call"
    applies_to: [Hindsight, session state, activity logging]
    benefit: "Reduces token usage and prevents memory thrashing"

  3_confidence_threshold:
    rule: "Discard low-confidence facts (< 0.7). Cap total facts per agent at 100"
    implementation: "Every stored fact gets a confidence score 0.0-1.0. Below threshold = discard. Above cap = trim lowest confidence."
    applies_to: [Knowledge Objects, Hindsight memories, agent lessons]
    benefit: "Prevents unbounded memory growth and low-quality knowledge accumulation"

  4_token_budget:
    rule: "Control context injection size: max 2000 tokens for memory context"
    implementation: "When injecting memories/knowledge into agent context, trim to budget by removing lowest-confidence items first"
    applies_to: [Prompt Steward context assembly, Hindsight auto-inject, RAG retrieval]
    benefit: "Prevents context window bloat, keeps agent focused"

  5_atomic_writes:
    rule: "Write state files atomically: temp file then rename"
    implementation: "Write to plan.yaml.tmp, then mv plan.yaml.tmp plan.yaml. Never corrupt state mid-write."
    applies_to: [plan.yaml, next-actions.yaml, agents.yaml, all state files, session checkpoints]
    benefit: "Prevents corrupted state from crashes or concurrent writes"
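
Principle 5 has a direct Python rendering. `os.replace` performs an atomic rename on POSIX filesystems, so readers either see the old file or the new one, never a half-written state file; the helper name is illustrative.

```python
import os
import tempfile

def atomic_write(path: str, data: str) -> None:
    """Write `data` to a temp file in the target directory, then rename.
    A crash mid-write leaves the original file untouched."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            f.write(data)
            f.flush()
            os.fsync(f.fileno())  # ensure data hits disk before the rename
        os.replace(tmp, path)  # atomic on POSIX
    except BaseException:
        os.unlink(tmp)  # only reached if the rename did not happen
        raise
```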

Relation to Existing XIOPro Architecture

| Principle | XIOPro Component | Current Status | T1P Action |
|---|---|---|---|
| 1. Async Updates | Bus async messaging, background workers | Partially covered by Bus architecture | Enforce for Hindsight/Librarian processing |
| 2. Debounce Writes | Hindsight extraction, session state | Not yet implemented | Add batching to Hindsight extraction pipeline |
| 3. Confidence Threshold | Knowledge Objects, Hindsight memories | Not yet implemented — no confidence field exists | Add confidence field to Knowledge Objects (Part 5, Section 6.1) |
| 4. Token Budget | Context Rules (Section 4.8.4), Prompt Steward | Context rules say "minimize" but lack hard number | Define 2000-token hard budget for memory context injection |
| 5. Atomic Writes | State files (plan.yaml, next-actions.yaml) | Not enforced | Implement write-to-temp-then-rename for all state files |

Implementation Requirements

  • All agents must use atomic writes for state file mutations (Principle 5)
  • The Prompt Steward (Section 4.2B) must enforce the 2000-token memory context budget (Principle 4) when assembling prompt packages
  • Hindsight and Librarian processing must be async and debounced (Principles 1, 2) — see Part 5, Sections 9 and 4 for implementation requirements
  • Knowledge Objects must carry a confidence score field; the Librarian must enforce threshold and cap (Principle 3) — see Part 5, Section 6.1


4.9 Dream Engine (Sleep-Time Intelligence Layer)

The Dream Engine and its T1P subset (Idle Maintenance, Section 4.9.9) are part of the XIOPro Optimizer (see Part 1, Section 8A).

XIOPro includes a background cognition layer called the Dream Engine.

This system runs during idle periods and performs:

  • memory consolidation
  • knowledge pruning
  • contradiction resolution
  • pattern extraction
  • cost optimization suggestions
  • system-level improvements

4.9.1 Purpose

Prevent:

  • memory decay
  • knowledge fragmentation
  • context pollution
  • repeated mistakes

Enable:

  • long-term intelligence accumulation
  • system self-improvement
  • reduced token usage over time

4.9.2 Trigger Conditions

Dream cycles are triggered when:

  • time threshold reached (e.g. 24h)
  • activity threshold reached (e.g. N sessions / N tasks)
  • manual trigger by founder
  • major system change detected

4.9.3 Scope of Operation

Dream Engine operates on:

  • knowledge base (.md / .yaml / DB)
  • tickets history
  • task execution logs
  • reflections
  • agent performance data

4.9.4 Core Phases

  1. Orientation
    • scan current system state
    • build structural map
  2. Signal Extraction
    • identify:
      • repeated failures
      • corrections
      • decisions
      • patterns
  3. Consolidation
    • merge duplicates
    • remove obsolete entries
    • normalize metadata
    • convert relative → absolute time
  4. Optimization
    • propose:
      • rule updates
      • skill improvements
      • routing optimizations
      • cost reductions
  5. Index Rebuild
    • update:
      • librarian index
      • topic structure
      • search mappings

4.9.5 Output Artifacts

Dream Engine produces:

  • updated knowledge files
  • improvement proposals
  • rule modification suggestions
  • skill enhancement suggestions
  • anomaly reports

4.9.6 Governance

  • runs in isolated mode
  • cannot modify execution directly
  • requires approval for:
    • rule changes
    • system behavior changes

4.9.7 Relation to Other Systems

| System | Role |
|---|---|
| Auto Memory | capture |
| Librarian | organize |
| Dream Engine | refine |
| Reflection Engine | evaluate |
| Improvement Engine | apply |

4.9.8 Strategic Impact

Dream Engine transforms XIOPro from:

"execution system"

into:

self-evolving intelligence system


4.9.9 T1P Dream Engine Posture (v5.0 Addition)

Posture: Idle Maintenance Only

The full Dream Engine architecture is preserved in this blueprint as the target capability.

For T1P, only the following subset is implemented:

  • Memory consolidation (AutoDream) -- consolidate session artifacts, clean transient state
  • Stale knowledge detection -- flag documents that have not been referenced or updated beyond a threshold
  • Morning brief generation -- produce a daily summary of system state, pending work, and alerts
  • Session cleanup -- archive completed sessions, remove orphaned runtime artifacts
  • Idea review -- scan Ideas with status new or deferred whose next_review_at has passed or whose last_reviewed_at exceeds the configured review cycle. Surface stale ideas in the morning brief for user attention.

The full Dream Engine phases (signal extraction, optimization proposals, index rebuild, contradiction resolution) are deferred to post-T1P.

Rule

T1P Dream Engine is operational but narrow. It maintains system hygiene without attempting autonomous intelligence evolution. Full capability is a post-T1P milestone.


4.10 Agent Activation Architecture (v5.0 Addition)

Problem

Current activation files (ACTIVATE_BM.md, ACTIVATE_B1.md, etc.) are 65-108 lines each and contain significant duplication:

| Duplicated Content | Lines | Repeated In |
|---|---|---|
| Execution Discipline (Boris rules) | 6-8 | All 7 agents |
| Paperclip protocol | 2-4 | All agents |
| Session Start Protocol | 5-6 | All agents |
| Memory (Hindsight + Obsidian) | 2-3 | domain brains |
| Worker spawning rules | 3-4 | domain brains |
| First Action (read tools + state) | 3-5 | All agents |

This wastes ~1,400 tokens per agent load. Over dozens of daily sessions across 7 agents, this is significant token waste — and worse, it creates maintenance burden (changing a rule means editing 7 files).

Solution: Skill-Based Activation

Extract duplicated content into shared skills. Activation files become slim identity-only documents that declare which skills to load.

Extracted Skills

| Skill | Content | Loaded By |
|---|---|---|
| SKILL_bootstrap | Read tools reference, state files, lessons. Set context. | All agents |
| SKILL_execution_discipline | Boris Cherny rules: plan first, subagents, self-improve, verify, circuit breaker, git discipline. | All agents |
| SKILL_memory | Hindsight bank setup, Obsidian query patterns, knowledge retrieval. | All agents |
| SKILL_worker_spawn | Worker naming ([brain_id][seq]), max 3 active, headless mode, supervision rules. | domain brains |
| SKILL_paperclip_sync | Already exists. Ticket checkout, comments, completion, cost reporting. | All agents |
| SKILL_session_start | Heartbeat, bus poll, state load, resume top action. | All agents |

Slim Activation File Pattern

---
title: "ACTIVATE: 002 — Engineering Brain"
agent: 002
version: "5.0.0"
skills_on_load: [bootstrap, execution-discipline, session-start]
skills_available: [paperclip-sync, memory, worker-spawn]
---

# 002 — Engineering Brain

## Identity
You are **002** (Engineering) — STRUXIO's Product Engineering Brain.
Ruflo worker on Hetzner under orchestrator coordination.

## Domain
Python, REST APIs, product integrations, domain-specific tooling.

## Workers
201 (coder), 202 (tester), 203 (code-reviewer). Max 3.

## Model
Sonnet 4.6 default. Opus when ticket specifies.

## On Activation
Load `skills_on_load` from frontmatter. Execute SKILL_bootstrap.
Other skills load on demand when triggered.

~20 lines instead of 68. Token savings: ~200 tokens per agent load.

Skill Loading Strategy

skills_on_load:     # Always loaded at session start. Critical for identity and bootstrap.
skills_available:   # Loaded on demand when the agent's task requires them.

This mirrors how Superpowers skills work — frontmatter declares what's available, runtime loads when needed.

Connection to Dream Engine / Idle Maintenance

This optimization IS what the Dream Engine does in practice:

  1. Review activation files, skill files, and rules for duplication
  2. Detect redundancy, drift, and token waste
  3. Propose consolidation (new shared skills, slim activation files)
  4. Report to founder / rule steward role for approval

Adding to the Idle Maintenance scope:

idle_maintenance_tasks:
  - memory_consolidation          # existing
  - stale_knowledge_detection     # existing
  - morning_brief                 # existing
  - session_cleanup               # existing
  - activation_optimization       # NEW: review activation files for duplication
  - skill_dedup_detection         # NEW: detect overlapping skills
  - token_waste_analysis          # NEW: estimate token savings from consolidation
  - idea_review                   # NEW: scan ideas not reviewed within their review_cycle
  - skill_performance_review      # NEW: compare internal skill metrics against alternatives (see Part 5 Section 8.9A)
  - skill_token_optimization      # NEW: identify skills with high token usage, suggest alternatives (see Part 5 Section 8.9A)

This is the bridge between "Idle Maintenance" (T1P) and "Dream Engine" (full capability) — practical optimization that proves the Dream concept without requiring the full autonomous intelligence layer.

Migration Plan

  1. Create the 4 new skills (bootstrap, execution-discipline, memory, worker-spawn)
  2. Update SKILL_REGISTRY.yaml with all skills
  3. Slim activation files (one at a time, test each)
  4. Add Idle Maintenance task to detect future drift

4.11 Skill Selection Architecture (v5.0 Addition)

When the orchestrator assigns a task, it must select which skills the agent loads. This prevents token waste (loading 48 skills when 3 are needed) and ensures model-appropriate skill assignment.

Problem

The Skill Registry (Part 5, Section 8.9) defines what skills exist. The Activation Architecture (Section 4.10) defines how activation files reference skills. Neither defines which skills to load for a given task assignment.

Without selection logic:

  • Every agent loads all skills it has access to (token waste)
  • Haiku agents receive skills that require deep reasoning (quality loss)
  • Task-irrelevant skills dilute the agent's context (precision loss)

Foundation: Role → Topic → Skill Binding Chain (v5.0.5 Clarification)

Skills bind to roles via topics, not directly to agent numbers. A role has multiple topics. A topic has multiple skills. Any agent assigned a role inherits all topic-skill bindings for that role.

role_topic_skill_chain:
  role: designer
  topics:
    - brand_identity
    - content_creation
    - visual_design
  skills_per_topic:
    brand_identity: [voice-dna-creator, brainstorming]
    content_creation: [content-research-writer, writing-plans]
    visual_design: [brainstorming]

  role: specialist_compliance
  topics:
    - iso_19650
    - bim_fidelity
    - cde_management
  skills_per_topic:
    iso_19650: [claude-deep-research, writing-plans]
    bim_fidelity: [systematic-debugging, verification-quality]
    cde_management: [writing-plans]

This binding chain is the structural foundation for the 3-step selection filter below. Step 1 resolves the role's topic-skill bindings; Steps 2 and 3 then narrow the result by task type and model tier.

Solution: 3-Step Skill Selection Filter

When the orchestrator assigns a task, it computes the skill set through three sequential filters. The final skill set is the intersection of all three.

Step 1 — Filter by Agent Role (via Topic-Skill Bindings)

Each role has a base skill set derived from its topic-skill bindings. An agent only considers skills permitted for its role.

agent_role_skills:
  orchestrator: [writing-plans, brainstorming, paperclip-sync, dispatching-parallel-agents]
  specialist: [writing-plans, TDD, systematic-debugging, code-review, brainstorming, paperclip-sync]
  worker: [TDD, verification-before-completion, paperclip-sync]
  reviewer: [code-review, receiving-code-review, verification-quality, systematic-debugging]
  interface: []  # UI agents have no reasoning skills

Step 2 — Filter by Task Type

Each task type declares which skills are relevant. Only skills that survived Step 1 AND appear in the task type list continue.

task_type_skills:
  coding: [TDD, systematic-debugging, verification-before-completion]
  research: [brainstorming, writing-plans]
  review: [code-review, receiving-code-review, verification-quality]
  design: [brainstorming, writing-plans]
  deployment: [verification-before-completion]
  debugging: [systematic-debugging]
  planning: [brainstorming, writing-plans, executing-plans]
  ticket_management: [paperclip-sync]

Step 3 — Filter by Model Tier

The assigned model determines final compatibility. Skills that require reasoning beyond the model's capability are excluded.

model_skill_compatibility:
  haiku:
    exclude: [brainstorming, writing-plans]  # too complex for haiku
    best_for: [paperclip-sync, verification-before-completion]  # simple execution
  sonnet:
    exclude: []  # handles everything
    best_for: [TDD, systematic-debugging, code-review]  # sweet spot
  opus:
    exclude: []
    best_for: [brainstorming, writing-plans, architecture, complex-debugging]  # deep reasoning

Selection Formula

final_skills = role_skills ∩ task_skills − model_excludes

Example: specialist + coding + sonnet:

  • Step 1 (role): [writing-plans, TDD, systematic-debugging, code-review, brainstorming, paperclip-sync]
  • Step 2 (task): [TDD, systematic-debugging, verification-before-completion]
  • Intersection: [TDD, systematic-debugging]
  • Step 3 (model): sonnet excludes nothing
  • Result: [TDD, systematic-debugging]
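
The selection formula is small enough to implement directly; preserving the role list's order keeps the result deterministic. The function name and dictionary arguments are illustrative, but the data below is taken from the filter tables above.

```python
def select_skills(role, task_type, model, role_skills, task_skills, model_excludes):
    """final_skills = role_skills ∩ task_skills − model_excludes,
    in the role list's declared order."""
    task = set(task_skills.get(task_type, []))
    excluded = set(model_excludes.get(model, []))
    return [s for s in role_skills.get(role, []) if s in task and s not in excluded]
```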

Known Skill Library (Categorized)

All skills are managed by the rule steward role. Skill categories determine the default model tier.

Execution Skills (any model)
| Skill ID | Purpose |
|---|---|
| paperclip-sync | Ticket lifecycle management |
| verification-before-completion | Verify before marking done |
| finishing-a-development-branch | PR/merge workflow |
| using-git-worktrees | Isolated feature work |

Engineering Skills (Sonnet+)
| Skill ID | Purpose |
|---|---|
| test-driven-development | TDD workflow |
| systematic-debugging | Debug before fix |
| pair-programming | AI pair programming |
| code-review | Requesting code review |
| receiving-code-review | Receiving code review |
| verification-quality | Truth scoring |

Architecture Skills (Sonnet/Opus)
| Skill ID | Purpose |
|---|---|
| brainstorming | Explore before building |
| writing-plans | Implementation plans |
| executing-plans | Execute with checkpoints |
| subagent-driven-development | Parallel execution |
| dispatching-parallel-agents | Independent work dispatch |

Infrastructure Skills (any model)
| Skill ID | Purpose |
|---|---|
| hooks-automation | Hooks management |
| swarm-orchestration | Multi-agent coordination |
| swarm-advanced | Advanced swarm patterns |

Knowledge Skills (Sonnet+)
| Skill ID | Purpose |
|---|---|
| writing-skills | Create/edit skills |
| skill-builder | Generate skill templates |
| reasoningbank-agentdb | Adaptive learning |
| agentdb-memory-patterns | Persistent memory |

Domain Skills
| Skill ID | Purpose |
|---|---|
| sparc-methodology | SPARC development workflow |
| claude-api | Claude API / Anthropic SDK integration |
| github-code-review | GitHub code review |
| github-workflow-automation | GitHub Actions workflow automation |
| github-project-management | Project board and sprint planning |
| github-release-management | Release orchestration and versioning |
| github-multi-repo | Multi-repository coordination |
| github-code-review-swarm | Swarm-coordinated code review |
| flow-nexus-platform | Flow Nexus authentication, sandboxes, apps |
| flow-nexus-swarm | Cloud swarm deployment with Flow Nexus |
| flow-nexus-neural | Neural network training in Flow Nexus |

Advanced/Candidate Skills (Review Required)

These skills exist but require rule steward review before T1P adoption:

| Skill ID | Purpose | Concern |
|---|---|---|
| v3-performance-optimization | Aggressive performance targets | claude-flow v3 specific |
| v3-mcp-optimization | MCP server optimization | claude-flow v3 specific |
| v3-cli-modernization | CLI modernization | claude-flow v3 specific |
| v3-ddd-architecture | DDD architecture patterns | claude-flow v3 specific |
| v3-core-implementation | Core module implementation | claude-flow v3 specific |
| v3-security-overhaul | Security architecture overhaul | claude-flow v3 specific |
| v3-memory-unification | Memory system unification | claude-flow v3 specific |
| v3-integration-deep | Deep agentic-flow integration | claude-flow v3 specific |
| v3-swarm-coordination | 15-agent hierarchical coordination | claude-flow v3 specific |
| agentdb-vector-search | Semantic vector search | Advanced, may be premature |
| agentdb-memory-patterns | Persistent memory patterns | Advanced, may be premature |
| agentdb-learning | RL learning plugins | Advanced, may be premature |
| agentdb-optimization | Performance optimization | Advanced, may be premature |
| agentdb-advanced | Multi-DB management | Advanced, may be premature |
| reasoningbank-intelligence | Adaptive learning patterns | Advanced, may be premature |
| stream-chain | Stream-JSON chaining | Niche use case |
| browser | Web browser automation | Overlaps with Playwright MCP |

Full Skill Count Summary
| Category | Count | Model Tier |
|---|---|---|
| Execution | 4 | any |
| Engineering | 6 | Sonnet+ |
| Architecture | 5 | Sonnet/Opus |
| Infrastructure | 3 | any |
| Knowledge | 4 | Sonnet+ |
| Domain | 11 | varies |
| Advanced/Candidate | 17 | varies |
| Total | 50 | |

The rule steward reviews this catalog during idle maintenance to detect unused skills, propose consolidation, and evaluate candidate skills for promotion or retirement.


Task Assignment with Skills

When the orchestrator assigns a task to an agent, the selection result is included in the assignment:

task_assignment:
  task_id: "1001"
  agent_id: "002"
  skills_required: [TDD, systematic-debugging]  # from selection logic
  skills_available: [verification-before-completion]  # on-demand if needed
  model: sonnet
  host: hetzner-cpx62

The skills_required field is computed by the 3-step filter. The skills_available field lists additional skills the agent may invoke on-demand (present in its role set but not in the task type set).

Connection to Rule Steward Role

The rule steward maintains the skill library and selection logic:

  • Reviews skill usage patterns across task assignments
  • Detects unused skills (no assignments in 30 days)
  • Proposes consolidation when skills overlap
  • Updates model compatibility as new models release
  • Adds new skills when gaps detected in task coverage
  • Adjusts role-skill mappings when new roles are introduced

This is governed by the same idle maintenance cycle defined in Section 4.9.9 and the Rule Steward responsibilities in Section 4.2A.

T1P Implementation

For T1P, skill selection is performed manually by the orchestrator when dispatching tasks:

  • The orchestrator reads the role + task type + model and picks skills accordingly
  • No automated selection engine required
  • The YAML definitions above serve as the reference lookup table

Full automation (selection engine integrated with Ruflo task dispatch) is deferred to post-T1P.


5. Execution Flow

flowchart TD
    Ticket --> Task
    Task --> Orchestrator["Orchestrator"]
    Orchestrator --> AssignAgent
    AssignAgent --> Execute
    Execute --> Activity
    Activity --> Evaluate
    Evaluate --> Continue

5.1 Task Dependency Resolution (v5.0.8 Addition)

Tasks can have dependencies (depends_on, blocks). The orchestrator must resolve these before assignment.

dependency_resolution:
  algorithm: "topological_sort"
  rules:
    - task cannot start until all depends_on tasks are completed
    - if circular dependency detected: flag as error, escalate to user
    - parallel execution: tasks with no shared dependencies run simultaneously
    - blocked tasks: re-evaluate when any dependency completes

  execution_order:
    1. build dependency graph from all active tasks
    2. topological sort to determine execution order
    3. identify tasks with zero dependencies (ready now)
    4. assign ready tasks to available agents (respecting host capacity)
    5. when task completes: remove from graph, re-check dependents
    6. repeat until all tasks complete or blocked
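
The execution_order loop maps directly onto Python's standard-library graphlib, which also detects circular dependencies. A minimal sketch with a hypothetical four-task graph (task IDs are illustrative):

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical task graph: task_id -> set of depends_on task_ids.
tasks = {
    "1001": set(),
    "1002": {"1001"},
    "1003": {"1001"},            # independent of 1002: can run in parallel
    "1004": {"1002", "1003"},
}

ts = TopologicalSorter(tasks)
try:
    ts.prepare()                 # raises CycleError on a circular dependency
except CycleError as err:
    raise SystemExit(f"circular dependency, escalate to user: {err}")

waves = []                       # each wave can execute simultaneously
while ts.is_active():
    ready = sorted(ts.get_ready())   # tasks with zero unresolved dependencies
    waves.append(ready)
    for task_id in ready:            # completion releases dependents
        ts.done(task_id)

print(waves)  # → [['1001'], ['1002', '1003'], ['1004']]
```

In production each wave would be bounded by host capacity before assignment; here completion is simulated immediately.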

Design Rationale

  • Topological sort is the minimal correct algorithm for DAG resolution. It guarantees no task starts before its dependencies complete, and it detects circular dependencies (which are errors by definition).
  • Parallel execution is implicit: any tasks with zero unresolved dependencies at the same time can run simultaneously, bounded by host capacity (see Section 4.2F).
  • Re-evaluation on completion means the orchestrator does not need to pre-compute the full schedule. It reacts to task completion events and releases newly-unblocked tasks.

T1P Implementation

For T1P, the orchestrator performs dependency resolution manually:

  • Read task depends_on and blocks fields from the ODM (Part 3, Section 4.5)
  • Build a simple in-memory dependency graph
  • Assign tasks in topological order
  • If the graph is small (< 50 tasks per project), no external DAG engine is needed

A formal DAG execution engine (e.g., integrated into Ruflo) is deferred to post-T1P when project complexity may require it.


5.2 Completion Self-Check Protocol (v5.0.8 Addition)

Before an agent claims a task is complete, it must run a self-evaluation. This strengthens the Reflection pattern (Part 1, Section 7) from post-hoc to in-line.

completion_self_check:
  steps:
    1. re_read_objective: "Read the task objective again"
    2. check_acceptance_criteria: "For each criterion, verify it is met"
    3. run_completion_test: "Execute the completion_test command if defined"
    4. self_score: "Rate confidence 0.0-1.0 that the task is truly done"
    5. identify_gaps: "List anything that might be incomplete"
    6. decision:
        - if confidence >= 0.8 and completion_test passes: mark done
        - if confidence 0.5-0.8: mark done with caveats noted
        - if confidence < 0.5: do NOT mark done, continue working or escalate

  output:
    completion_evaluation:
      task_id: string
      confidence: float
      criteria_met: [string]
      criteria_unmet: [string]
      completion_test_result: pass|fail|not_defined
      gaps_identified: [string]
      decision: done|continue|escalate
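
The decision branch reduces to a threshold check. A sketch; the protocol does not specify the case of high confidence with a failing completion test, so routing that case to escalation is an assumption here:

```python
def completion_decision(confidence: float, test_result: str) -> str:
    """Map self-check results to a decision.

    test_result is one of "pass" | "fail" | "not_defined".
    """
    if confidence >= 0.8 and test_result in ("pass", "not_defined"):
        return "done"
    if 0.5 <= confidence < 0.8:
        return "done_with_caveats"
    # Low confidence, or a failing completion test despite high confidence
    # (assumed behavior): keep working or escalate, never mark done.
    return "continue_or_escalate"

print(completion_decision(0.9, "pass"))  # → done
```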

Design Rationale

  • Re-reading the objective counters drift: agents can lose track of the original goal during long execution sequences.
  • Acceptance criteria check is explicit: each criterion from the task definition (Part 3, Section 4.5) must be individually verified, not assumed.
  • Confidence scoring introduces nuance: not all completions are equal. A task marked "done with caveats" signals to the orchestrator that review may be warranted.
  • Escalation at low confidence prevents agents from marking tasks done when they know they fell short. This is cheaper than discovering incomplete work downstream.

Relation to Activity Evaluation

The completion_evaluation output becomes an Activity Evaluation entity (Part 3, Section 4.6.1) attached to the final activity of the task. This makes self-evaluation auditable and queryable.

T1P Implementation

For T1P, the self-check is enforced via activation files:

  • Every agent activation includes the completion self-check protocol as a rule
  • The orchestrator verifies that task completion messages include the completion_evaluation block
  • Tasks marked done without evaluation are flagged for review

5.3 Agent Auto-Pickup (v5.0.13 Addition)

Agents signal readiness and self-retrieve their next task rather than waiting passively for orchestrator push. This reduces orchestrator polling overhead and allows agents to resume immediately after completing a task.

agent_auto_pickup:
  endpoint: POST /agents/{id}/pickup
  behavior:
    - Agent calls pickup when it becomes idle (task complete or session start)
    - Bus evaluates assigned tasks for the agent, returns highest-priority ready task
    - If no task is ready: returns 204 No Content — agent polls again after backoff
  task_query_endpoint: GET /agents/{id}/tasks
  backoff_schedule: [5s, 10s, 30s, 60s]
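
The pickup-with-backoff behavior can be sketched as a small polling loop. Here fetch_task stands in for POST /agents/{id}/pickup (returning None models a 204 No Content); both helper names are illustrative:

```python
import time

BACKOFF = [5, 10, 30, 60]   # seconds, from backoff_schedule above

def pickup_loop(fetch_task, execute, max_polls, sleep=time.sleep):
    """Poll for work; back off while idle, reset backoff after a real task."""
    misses = 0
    for _ in range(max_polls):
        task = fetch_task()                  # None models 204 No Content
        if task is None:
            sleep(BACKOFF[min(misses, len(BACKOFF) - 1)])
            misses += 1
        else:
            misses = 0                       # real work resets the backoff
            execute(task)

# Two empty polls, then a ready task.
responses = iter([None, None, {"task_id": "1001"}])
delays, done = [], []
pickup_loop(lambda: next(responses, None), done.append,
            max_polls=3, sleep=delays.append)
print(delays, done)  # → [5, 10] [{'task_id': '1001'}]
```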

Why Auto-Pickup

  • Orchestrator pushes tasks via Bus when assigning, but agents may miss push on session restart
  • Auto-pickup ensures no assigned task is silently dropped on session recovery
  • Pair with SSE: agent receives push notification AND can self-poll on reconnect

5.4 Paperclip Auto-Sync (v5.0.13 Addition)

Paperclip task records are kept in sync with XIOPro ODM task state via fire-and-forget async calls. Agents do not wait for Paperclip acknowledgement.

paperclip_auto_sync:
  trigger: any task CRUD operation (create, update, complete, block)
  pattern: fire-and-forget
  behavior:
    - Task state change occurs in XIOPro ODM (source of truth)
    - Async call to Paperclip API issued in background
    - Failure is logged but does not block execution
    - Sync catches up on next successful call
  note: Paperclip is the current task tracker (to be superseded by XIOPro ODM)
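
A minimal sketch of the fire-and-forget pattern using a background thread; sync_call stands in for the Paperclip API request and is an assumed parameter:

```python
import logging
import threading

log = logging.getLogger("paperclip-sync")

def fire_and_forget(sync_call, task_event: dict) -> threading.Thread:
    """Issue the Paperclip sync in the background; the caller never waits.

    Failures are logged, never raised: the XIOPro ODM remains the source
    of truth and the next successful call catches the record up.
    """
    def _run():
        try:
            sync_call(task_event)
        except Exception as exc:
            log.warning("paperclip sync failed, will catch up: %s", exc)

    t = threading.Thread(target=_run, daemon=True)
    t.start()
    return t

# A failing sync never interrupts the calling agent.
def broken_api(event):
    raise ConnectionError("paperclip unreachable")

fire_and_forget(broken_api, {"task_id": "1001", "op": "complete"}).join()
```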

5A. Agent Spawning Patterns (v5.0 Addition)

XIOPro distinguishes three spawning patterns. Each serves a different purpose and has different lifecycle, visibility, and cost characteristics.

5A.1 Agent vs Sub-Agent Distinction

| Property | Agent | Sub-Agent |
|---|---|---|
| Identity | Own 3-digit ID (e.g., 002) | No ID — lives under parent |
| Bus registration | Registered, sends heartbeats | NOT registered in Bus |
| Session | Own independent session | Shares parent's session context |
| Memory | Own Hindsight bank | Uses parent's memory |
| Lifecycle | Survives parent restart | Dies with parent session |
| Orchestrator visibility | Visible — orchestrator can intervene | Invisible — parent's responsibility |
| Communication | Through Control Bus (SSE, REST) | Direct to parent via Ruflo |
| Cost tracking | Own cost ledger entries | Rolled into parent's cost |
| Model | Configured per agent | Usually Haiku (cheap) |
| Capacity | Counts against host limit | Max 3 per parent |

5A.2 Three Spawning Patterns

Pattern 1: Project Roster Agent (Commissioned)

Spawned when a project starts. Long-lived. Assigned to project roster.

pattern: project_roster
spawned_by: orchestrator or system master
when: "Project needs sustained domain expertise"
duration: entire project or sprint
lifecycle:
  - orchestrator creates agent
  - registers in Control Bus
  - added to project roster with roles
  - works on project tickets
  - freed when project completes or no longer needed
examples:
  - "A product project needs a compliance specialist for 2 weeks"
  - "XIOPro needs a dedicated backend engineer"
visibility: full — orchestrator sees status, cost, tasks

Pattern 2: On-Demand Agent (Task-Scoped)

Spawned for a specific task. Medium-lived. Has own identity.

pattern: on_demand
spawned_by: orchestrator
when: "A specific task needs a dedicated agent"
duration: task duration (hours to days)
lifecycle:
  - orchestrator identifies task needing dedicated agent
  - checks host capacity
  - spawns agent with task assignment
  - agent registers in Control Bus
  - works on assigned task
  - reports results
  - terminated when task complete
examples:
  - "Research all competitors in the target domain → spawn a research agent"
  - "Build the SSE endpoint → spawn a backend agent"
  - "Run security audit → spawn a security agent"
visibility: full — registered, trackable, cost-attributed

Pattern 3: Sub-Agent (Ephemeral, Parent-Managed)

Spawned WITHIN an agent's session for parallel subtasks. Short-lived. No independent identity.

pattern: sub_agent
spawned_by: parent agent (via Ruflo claude -p)
when: "Agent needs parallel help within its own work"
duration: minutes to hours, within parent session
lifecycle:
  - parent agent decides it needs parallel help
  - spawns sub-agent via Ruflo (claude -p headless)
  - sub-agent executes narrow task
  - reports result directly to parent
  - parent reviews and integrates
  - sub-agent terminates
  - cost rolls into parent's ledger
examples:
  - "I'm coding and need tests run in parallel"
  - "I need a quick code review of my current diff"
  - "Fetch and summarize 5 web pages while I continue"
  - "Run the DDL migration while I update the docs"
max_concurrent: 3 per parent agent
model: typically Haiku (cheapest capable model)
visibility: invisible to orchestrator — parent's responsibility

5A.3 Spawning Decision Logic

When the orchestrator receives a task:

flowchart TD
    Task["New Task"] --> NeedAgent{"Need a new agent?"}
    NeedAgent -->|"No - existing agent available"| Assign["Assign to existing agent"]
    NeedAgent -->|"Yes"| Duration{"Expected duration?"}
    Duration -->|"Sprint/project"| Roster["Spawn Project Roster Agent"]
    Duration -->|"Days"| OnDemand["Spawn On-Demand Agent"]
    Duration -->|"Hours or less"| Parent{"Can existing agent sub-agent it?"}
    Parent -->|"Yes"| SubAgent["Parent spawns Sub-Agent"]
    Parent -->|"No"| OnDemand
    Roster --> Register["Register in Control Bus"]
    OnDemand --> Register
    SubAgent --> ParentManages["Parent manages internally"]

5A.4 Rules

  • Only the orchestrator spawns agents (roster and on-demand). Agents spawn sub-agents.
  • Agents count against host capacity. Sub-agents count against parent's sub-agent limit (max 3).
  • Sub-agents should NEVER be used for work that needs to survive a session restart. Use on-demand agents for that.
  • Cost attribution: agents get their own ledger entries; sub-agent costs roll into parent.
  • The orchestrator cannot see or intervene on sub-agents. If a sub-agent is stuck, the parent agent handles it or escalates.

6. Agent Lifecycle

flowchart TD
    Spawn --> Initialize
    Initialize --> Execute
    Execute --> Complete
    Execute --> Fail
    Fail --> Retry
    Retry --> Execute
    Complete --> Terminate

6.1 Heartbeat, Staleness & Orphan Cleanup

Heartbeat Intervals

| Surface | Heartbeat Interval | Protocol |
|---|---|---|
| Agents (registered via Bus) | 60 seconds | POST /agents/{id}/heartbeat |
| SSE clients | 30 seconds | SSE :ping frame or heartbeat event |

Staleness Thresholds

| Threshold | Duration | Agent Status | Action |
|---|---|---|---|
| Healthy | < 300 seconds since last heartbeat | online | Normal operation |
| Stale | 300 seconds (5 minutes) | stale | Governor emits agent.stale warning; no task reassignment yet |
| Dead | 600 seconds (10 minutes) | offline | Agent marked offline; orphaned tasks reassigned |

Orphan Cleanup

The Governor runs a cleanup sweep every 60 seconds:

  1. Query all agents where last_heartbeat_at < NOW() - INTERVAL '300 seconds' and status = 'online' -- mark as stale
  2. Query all agents where last_heartbeat_at < NOW() - INTERVAL '600 seconds' and status IN ('online', 'stale') -- mark as offline
  3. For agents marked offline: find all tasks with assigned_agent_id = {agent_id} and status = 'in_progress' -- reset to queued for reassignment
  4. Emit agent.offline governance event with agent_id, last_heartbeat_at, and count of reassigned tasks
  5. SSE connections that miss 3 consecutive pings (90 seconds) are closed server-side
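
The sweep can be sketched against in-memory rows; the production version runs the equivalent SQL against PostgreSQL. Field names follow the queries above, and the helper name is illustrative:

```python
from datetime import datetime, timedelta

STALE = timedelta(seconds=300)
DEAD = timedelta(seconds=600)

def cleanup_sweep(agents: dict, tasks: dict, now: datetime) -> list:
    """One Governor pass: mark stale/offline agents, requeue orphaned tasks."""
    events = []
    for agent_id, agent in agents.items():
        age = now - agent["last_heartbeat_at"]
        if age >= DEAD and agent["status"] in ("online", "stale"):
            agent["status"] = "offline"
            reassigned = 0
            for task in tasks.values():
                if (task["assigned_agent_id"] == agent_id
                        and task["status"] == "in_progress"):
                    task["status"] = "queued"     # release for reassignment
                    reassigned += 1
            events.append(("agent.offline", agent_id, reassigned))
        elif age >= STALE and agent["status"] == "online":
            agent["status"] = "stale"             # warn only, no reassignment
            events.append(("agent.stale", agent_id, 0))
    return events

now = datetime(2026, 3, 30, 12, 0, 0)
agents = {"002": {"status": "online",
                  "last_heartbeat_at": now - timedelta(seconds=700)}}
tasks = {"1001": {"status": "in_progress", "assigned_agent_id": "002"}}
events = cleanup_sweep(agents, tasks, now)
print(events)  # → [('agent.offline', '002', 1)]
```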

Rules

  • An agent recovering from stale to online must re-register via POST /agents/register and reclaim its queued tasks
  • An agent recovering from offline must re-register; previously reassigned tasks are NOT automatically returned
  • The Governor must not mark the master orchestrator (GO) as offline without emitting a critical alert

7. Session Lifecycle

flowchart TD
    Start --> Active
    Active --> Idle
    Idle --> Resume
    Idle --> Dream
    Active --> Crash
    Crash --> Recover
    Recover --> Active

7.1 Context Rotation Protocol

The Global Orchestrator (GO) runs in long sessions that accumulate context. When context approaches capacity, GO must rotate to a fresh session without losing state.

Protocol Steps

context_rotation:
  trigger: "Session duration > 8 hours OR context feels compressed OR many agents spawned"

  steps:
    1. save_state:
      - Update Part 11 (Execution Log) with current session work
      - Update memory files (~/.claude/projects/*/memory/)
      - Push all repos to Git

    2. prepare_handoff:
      - Launch background restart process: nohup bash -c "sleep 5 && devxio go" &
      - OR spawn a rotation agent to manage the restart

    3. exit_session:
      - /exit (or session ends naturally)

    4. new_session_boots:
      - Reads CLAUDE.md (activation protocol)
      - Reads memory files (current project state)
      - Reads Part 11 (what was done, what's pending)
      - Reads plan.yaml (ticket status)
      - Resumes from exact point

  continuity:
    - Agent identity persists (GO = Global Orchestrator, same role)
    - State files are the bridge between sessions
    - No work is lost — everything is in Git + memory + Part 11

  frequency: "As needed. Typically once per 8-12 hour session."

Rule

Context rotation is transparent to the user. The orchestrator self-manages it. The user always talks to the same role (GO), just with fresh context.


8. Model Selection Strategy

Inputs

  • task complexity
  • required reasoning
  • cost constraints
  • latency requirements

Examples

| Scenario | Model |
|---|---|
| heavy reasoning | Claude Opus |
| balanced | Claude Sonnet |
| cheap execution | GPT / Gemini |
| bulk tasks | cheaper models |

9. Cost Optimization Layer

Managed by the Governor

Strategies:

  • downgrade models when possible
  • batch operations
  • avoid redundant work
  • detect runaway loops
  • enforce budget limits

10. Failure Handling

Types

  • agent failure
  • model failure
  • session crash
  • incomplete task

Handling

  • retry
  • escalate
  • switch model
  • request human input

11. Human-in-the-Loop

Trigger Conditions

  • requires_human = true
  • requires_approval = true
  • ambiguity detected

Flow

  • pause execution
  • open RC
  • await input
  • resume

12. Multi-Agent Coordination

  • The orchestrator delegates to domain brains
  • Domain brains spawn workers
  • Results flow upward
  • The orchestrator maintains global consistency

12.1 Agent-to-Agent Communication Protocol

Message Format

All agent-to-agent communication uses JSON messages delivered through the Control Bus bus_send_message tool:

agent_message:
  id: uuid                    # unique message ID
  from_actor: string          # sender agent_id (e.g., "000")
  to_actor: string            # recipient agent_id or topic
  topic: string               # message topic / channel
  type: enum                  # task_assignment | result | query | notification | coordination
  payload: jsonb              # message-specific structured content
  idempotency_key: string     # unique key for deduplication
  correlation_id: string|null # links related messages in a conversation
  created_at: datetime

Delivery Guarantee

  • At-least-once delivery: the Bus persists all events to PostgreSQL. Agents poll with a cursor. Messages are never silently dropped.
  • If an agent is offline, messages accumulate in the Bus and are delivered when the agent next polls or reconnects via SSE.

Ordering

  • Per-topic ordering guaranteed: messages within a single topic are assigned sequential seq numbers by the Bus. Agents process messages in seq order.
  • Cross-topic ordering is NOT guaranteed. Agents must not depend on message ordering across different topics.

Retry Behavior

  • If a message delivery fails (agent unreachable, processing error), the sender retries up to 3 times with exponential backoff: 5s, 15s, 45s.
  • After 3 failed retries, the message is marked delivery_failed and a governance event message.delivery_failed is emitted.
  • The Bus itself does not retry -- retry is the sender's responsibility.

Idempotency

  • Every message must carry an idempotency_key (typically {from_actor}:{correlation_id}:{seq} or a UUID).
  • Recipients must deduplicate incoming messages by idempotency_key. Processing the same key twice must produce no side effects.
  • The Bus MAY deduplicate at ingestion if the same idempotency_key is submitted within a 5-minute window.
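
Recipient-side deduplication needs only a key-to-timestamp map with window expiry. A minimal sketch (the 5-minute window mirrors the Bus's optional ingestion dedup; the class name is illustrative):

```python
import time

class Deduplicator:
    """Track idempotency_keys; processing the same key twice is a no-op."""

    def __init__(self, window_seconds: float = 300.0, clock=time.monotonic):
        self.window = window_seconds
        self.clock = clock
        self.seen = {}                       # idempotency_key -> first seen

    def should_process(self, key: str) -> bool:
        now = self.clock()
        # Expire old keys so the table does not grow without bound.
        self.seen = {k: t for k, t in self.seen.items()
                     if now - t < self.window}
        if key in self.seen:
            return False                     # duplicate: skip side effects
        self.seen[key] = now
        return True

dedup = Deduplicator()
print(dedup.should_process("000:corr-1:1"))  # → True
print(dedup.should_process("000:corr-1:1"))  # → False (duplicate)
```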

Acknowledgement

  • Agents acknowledge processed messages via bus_ack with their cursor position.
  • Unacknowledged messages are re-delivered on the next poll.

13. CLI-First Principle

Everything must be runnable:

  • via CLI
  • headless
  • without UI

UI is optional, execution is not.


14. Infrastructure Execution Mapping

Cloud (Hetzner)

  • Orchestrator (BrainMaster)
  • DB
  • Ruflo
  • LiteLLM
  • API services

Local (Mac Studio)

  • fallback execution
  • knowledge access
  • RC sessions
  • future local models

15. Restart & Recovery

System must support:

  • full restart command
  • state reload
  • session recovery
  • task continuation

16. GitHub & Backup Integration

GitHub

  • version control
  • agent code
  • rules
  • blueprints

Backblaze B2

  • backups
  • snapshots
  • disaster recovery

17. Observability

System must expose:

  • active agents
  • running tasks
  • cost
  • errors
  • alerts

18. Security & Isolation

  • API keys protected
  • agent isolation
  • permission control
  • environment separation

19. Current State (v5.0 Addition)

19.1 Agent Identity Model

All agents use 3-digit numeric IDs. Roles are assigned properties, not fixed identities. See Part 1, Section 8 for the complete identity schema.

Agent-to-role assignments are project-scoped (see Part 3, Section 4.2.1 Project Agent Roster). The architecture does not hardcode which agent holds which role — that is an operational decision made per project.

ID Allocation Ranges

| Range | Purpose |
|---|---|
| 000-009 | Core agents (orchestrators, specialists) |
| 010-019 | Remote/external host agents |
| 020-029 | Interface agents |
| 030-099 | Reserved for future core expansion |
| 100-999 | On-demand and ephemeral agents |

Current Agent Registry (operational, not architectural)

The current agent registry is maintained in the Control Bus and the agents.yaml state file. It changes as projects are created and agents are commissioned or retired. See Part 11 (Execution Log) for the live registry.

19.2 Current Execution Runtime

  • Agent spawning: Ruflo (claude-flow) on Hetzner CPX62
  • Primary execution surface: Claude Code CLI on Hetzner
  • Remote execution: via Tailscale to Mac Studio
  • Agent communication: STRUXIO Bus (PostgreSQL-backed, evolving to Control Bus)
  • Task tracking: Paperclip (to be superseded by XIOPro ODM)
  • Agent identity: Unified 3-digit numbering, role-based assignment

20. Execution Success Criteria

Execution layer is successful if:

  • tasks complete reliably
  • sessions recover automatically
  • cost is controlled
  • agents remain coordinated
  • system runs 24/7

21. Final Statement

This layer is the engine of XIOPro.

If this is strong:

  • the system works continuously
  • the founder scales beyond time

If weak:

  • everything collapses into manual work

22. Error Handling Implementation Specification

This section closes the error handling gap identified by all three external reviewers. Part 7 defines governance policy objects and breaker types. This section specifies the concrete implementation parameters.

22.1 Retry Policy

retry_policy:
  default_max_retries: 3
  backoff: "exponential (1s, 2s, 4s)"
  max_backoff: 30s
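
The retry policy is a standard capped exponential backoff. A sketch; with_retries is an illustrative helper, not an existing XIOPro API:

```python
import time

def with_retries(op, max_retries=3, base=1.0, max_backoff=30.0,
                 sleep=time.sleep):
    """Run op; on failure back off 1s, 2s, 4s, ... capped at max_backoff."""
    for attempt in range(max_retries + 1):
        try:
            return op()
        except Exception:
            if attempt == max_retries:
                raise                        # retries exhausted: escalate
            sleep(min(base * 2 ** attempt, max_backoff))

# Fails twice, then succeeds; observed delays follow the schedule.
attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient")
    return "ok"

delays = []
print(with_retries(flaky, sleep=delays.append), delays)  # → ok [1.0, 2.0]
```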

22.2 Circuit Breaker Implementation Parameters

circuit_breakers:
  cost_breaker:
    threshold: "85% of budget_cap triggers warning, 100% halts non-critical agents"
    evaluation_frequency: "per-activity"
  loop_breaker:
    threshold: "same error 3 times in sequence"
    action: "halt agent, escalate to orchestrator"
  failure_breaker:
    threshold: "5 failed activities in 1 hour"
    action: "pause agent, alert user"
  memory_breaker:
    threshold: "host RAM at 85% = no new agents, 90% = graceful shutdown lowest priority, 95% = emergency terminate"
    evaluation_frequency: "every 60 seconds via host monitor"
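
The memory breaker's tiered thresholds reduce to a highest-first check; the action names here are illustrative labels for the behaviors above:

```python
def memory_breaker_action(ram_pct: float) -> str:
    """Map host RAM usage to the breaker tier (checked highest first)."""
    if ram_pct >= 95:
        return "emergency_terminate"
    if ram_pct >= 90:
        return "graceful_shutdown_lowest_priority"
    if ram_pct >= 85:
        return "no_new_agents"
    return "normal"

print(memory_breaker_action(92))  # → graceful_shutdown_lowest_priority
```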

22.3 Bus Down Fallback

bus_down_fallback:
  detection: "3 consecutive failed heartbeats (3 minutes)"
  agent_behavior: "continue current task locally, queue messages for retry"
  recovery: "on Bus recovery, flush queued messages, re-register"

22.4 Runaway Detection

runaway_detection:
  definition: "agent consuming >10x normal tokens for task type, or >30 minutes on a task estimated at <5 minutes"
  action: "governor alerts user, pauses agent if no response in 5 minutes"

22.5 Cross-Reference

These parameters implement the breaker types defined in Part 7, Section 9.3 and the recovery policies in Part 7, Section 8.4. The memory breaker implements the memory pressure survival rule from Part 8, Section 11.10.3.


Changelog

| Version | Date | Author | Changes |
|---|---|---|---|
| 4.1.0 | 2026-03-26 | BM | Initial v4.1 release |
| 4.2.0 | 2026-03-28 | BM | Added: T1P implementation form table (4.2E). Added: Ruflo relationship to O00 clarification (4.2F). Fixed: "Rufio" renamed to "Ruflo" globally. Added: Dream Engine T1P posture -- Idle Maintenance only (4.9.9). Added: Current agent mapping table (19.1). Added: Current execution runtime state (19.2). Added: Changelog section. Updated version header to 4.2.0. |
| 4.2.1 | 2026-03-28 | BM | Unified Agent Identity Model: Reframed O00/O01/R01/P01/M01 as role bundles assigned to agents, not separate agent identities. Updated all section headers from profession codes to role names (e.g., "4.1 Orchestrator Role" instead of "4.1 O00"). Updated 4.2E table to show role bundles with agent 000 assignment. Updated 4.2F/4.2G/4.2H to use 3-digit agent IDs. Updated Section 19 agent mapping to unified 3-digit identity table with Old ID column. Updated all body text references from O00/O01/R01/P01/M01 to role-based naming. Updated all Mermaid diagrams to use 3-digit agent IDs. |
| 4.2.2 | 2026-03-28 | 000 | Agent naming migration: B1-B5 replaced with 001-005 in skill tables and activation examples. BM replaced with 000. W21-W23 replaced with 201-203 in example activation. Slim activation example updated from B2 to 002 naming. Backblaze B2 references preserved unchanged. Changelog author entries preserved as historical. |
| 4.2.3 | 2026-03-28 | 000 | Idea + User entities: Added idea_review to idle_maintenance_tasks (Section 4.10). Added Idea review to T1P Dream Engine posture (Section 4.9.9) — scan ideas not reviewed within review_cycle, surface in morning brief. |
| 4.2.4 | 2026-03-28 | 000 | Skill Selection Architecture (Section 4.11): 3-step filter (role + task type + model tier) for selecting which skills an agent loads per task assignment. Includes categorized skill library, task assignment contract with skills_required/skills_available fields, and rule steward governance connection. Updated Part 5 Section 8.9 to cross-reference. |
| 4.2.5 | 2026-03-28 | 000 | Founder clarifications: (1) Role-Topic-Skill binding chain added to Section 4.11 -- skills bind to roles via topics, not directly to agent numbers. (2) Added skill_performance_review and skill_token_optimization to idle_maintenance_tasks (Section 4.9.9). |
| 4.2.6 | 2026-03-28 | 000 | Roles over numbers: Removed agent IDs from all architectural role descriptions, section headers, diagrams, and tables. Agent numbers retained only in Section 19 (Current State) and Changelog. Blueprint now describes WHAT roles do, not WHICH agent holds them. |
| 4.2.7 | 2026-03-28 | BM | XIOPro Optimizer cross-references: Added "part of the XIOPro Optimizer (see Part 1, Section 8A)" note to Governor (4.2), Rule Steward (4.2A), Prompt Steward (4.2B), Module Steward (4.2C), and Dream Engine (4.9). |
| 4.2.8 | 2026-03-28 | BM | AGI pattern gap fixes: (1) Task Dependency Resolution (5.1) — topological sort DAG algorithm for depends_on/blocks resolution. Addresses audit gap "Workflow DAG Formalization" (Principle 21). (2) Completion Self-Check Protocol (5.2) — 5-step self-evaluation before marking tasks done, with confidence scoring and escalation rules. Addresses audit gap "Agent Self-Evaluation" (Principle 1 depth). |
| 4.2.9 | 2026-03-28 | 000 | Wave 1-2 BP fixes: Expanded Domain Skills in Section 4.11 — github- (6 skills), flow-nexus- (3 skills) now listed individually. Added Advanced/Candidate Skills table (17 skills) with review concerns. Added Full Skill Count Summary table (50 total skills across 7 categories). |
| 4.2.10 | 2026-03-28 | 000 | Memory engineering principles: Added Section 4.8A — 5 production engineering rules for memory operations (async updates, debounce writes, confidence threshold, token budget, atomic writes) from 5-Layer Memory Stack research. Includes relation table to existing architecture and implementation requirements. Slimmed Section 4.2H Control Bus to cross-reference Part 2 Section 5.8 (removed duplicated capabilities list). |
| 4.2.11 | 2026-03-29 | BM | Added Section 4.1A (Orchestrator Surface Names) — GO/MO naming convention for Hetzner and Mac orchestrator surfaces with launch commands and rules. Added Section 7.1 (Context Rotation Protocol) — session rotation procedure for long-running orchestrator sessions with state preservation via Part 11, memory files, and Git. |
| 4.2.12 | 2026-03-29 | BM | Cross-references: Added pointer to resources/DESIGN_rc_architecture.md (RC architecture — human-agent interaction surface design, Open WebUI evaluation, multi-provider routing). |
| 4.2.13 | 2026-03-29 | 000 | Batch BP update from recent tickets: Added Section 5.3 (Agent Auto-Pickup) — /agents/{id}/pickup endpoint, self-retrieval pattern, backoff schedule. Added Section 5.4 (Paperclip Auto-Sync) — fire-and-forget pattern for ODM-to-Paperclip sync on task CRUD. |
| 4.2.14 | 2026-03-30 | 000 | Reviewer role: Added Section 4.2I (Reviewer Role) — formal agent role for post-build independent review. Spawned by GO/PO after builder completes significant work; must be a different agent than the builder; uses different model tier where possible (Opus reviews Sonnet, Sonnet reviews Haiku); reads spec + output independently; returns APPROVED / NEEDS_FIX / REJECTED verdict to orchestrator; short-lived. Updated 4.2D.8 state ownership table, 4.2E T1P form table, and Section 4.11 agent_role_skills to include reviewer. |
| 5.0.1 | 2026-03-30 | GO | I4: Added Section 6.1 (Heartbeat, Staleness & Orphan Cleanup) -- heartbeat intervals (60s agents, 30s SSE), stale threshold (300s), dead threshold (600s), Governor cleanup sweep every 60s, orphaned task reassignment to queued. I7: Added Section 12.1 (Agent-to-Agent Communication Protocol) -- JSON message format via Bus, at-least-once delivery, per-topic ordering with sequential seq numbers, 3-retry exponential backoff, idempotency_key deduplication, bus_ack acknowledgement. |