XIOPro Production Blueprint v5.0

Part 12 — Work Plan, Migration & Test Strategy


1. Purpose

This document defines how XIOPro moves from the current mixed implementation state to a governed, testable, continuously operating system.

It answers:

  • how the build proceeds without a big-bang rewrite
  • how the current repositories and assets are migrated into the target model
  • what gets built first
  • what must be proven by tests before moving forward
  • how review and outside validation are staged

2. Core Delivery Principle

Build while operating. Migrate in place. Prove each layer before treating it as real.

XIOPro should not be rebuilt from zero if live assets, rules, repos, and workflows already exist.

The implementation path should:

  • preserve existing useful assets
  • normalize and govern them
  • replace fragile/manual paths incrementally
  • keep rollback realistic
  • attach every major capability to explicit proof

3. Current Baseline

XIOPro does not start from nothing.

The current baseline already includes:

  • multiple STRUXIO repositories
  • rule and skill assets
  • ticket history
  • infrastructure definitions
  • frontend/backend experiments
  • swarm/runtime tooling
  • Librarian-related work
  • voice, dashboard, and execution-daemon experiments
  • existing bus/relay concepts
  • current state files and workspace graph

This means the work plan is a migration and consolidation plan, not only a net-new build plan.


4. Canonical Active Repository Baseline

The active repository family for implementation should be treated as:

  • struxio-os
  • struxio-logic
  • struxio-design
  • struxio-app
  • struxio-business
  • struxio-knowledge (v5.0 addition — governed knowledge, syncs with Obsidian)

A transition/legacy path may still exist for:

  • struxio-aibus

Reference repos may still exist for research or inspiration, but they are not the canonical operating core.

See Part 8, Section 8.13.2 for the full repository topology with detailed descriptions of each repo's contents and ownership (struxio-os, struxio-logic, struxio-design, struxio-app, struxio-business, struxio-knowledge, struxio-aibus transitional path).

4.2 Transitional Repo Rule

struxio-aibus should not remain a permanent first-class pillar. See Part 8, Section 8.13.2 for the migration and archive plan.


5. Work Streams

XIOPro implementation should be managed as parallel governed work streams.

5.1 WS1 — Control Plane & State

Focus:

  • DB / ODM
  • tickets/tasks/activities
  • sessions / escalations / decisions
  • APIs / scheduler / control services

5.2 WS2 — Execution Fabric

Focus:

  • orchestrator
  • Ruflo/runtime integration
  • worker execution
  • session continuity
  • recovery wiring

5.3 WS3 — Governance & Optimization

Focus:

  • governor
  • breakers
  • alerts
  • approval flows
  • rule steward / prompt steward / module steward support
  • module cost and optimization logic

5.4 WS4 — Knowledge & Research

Focus:

  • Librarian
  • knowledge ingestion
  • Research Center
  • NotebookLM / Obsidian linkage
  • Hindsight / Dream proposal loops

5.5 WS5 — UI / Control Center

Focus:

  • widget system
  • prompt composer
  • brain collaboration
  • governance / research / module workspaces
  • desktop and mobile operator flows

5.6 WS6 — Infrastructure & Operations

Focus:

  • bootstrap
  • deployment
  • repo/storage layout
  • secrets and networking
  • telemetry
  • backup / restore
  • runtime health and rollout discipline

5.7 WS7 — Product Integration

Focus:

  • XIOPro support for the first product (see MVP1_PRODUCT_SPEC.md)
  • customer/runtime separation
  • handoff from internal operating system to product runtime

6. Self-Build Strategy & Parallel Operation

6.1 Purpose

XIOPro should help build XIOPro.

The system already has partially active headless execution capacity, legacy swarm patterns, and current operational tickets/assets.

This section defines how the existing agent capability is used to accelerate migration, without pretending the new architecture already fully exists.

6.2 Core Principle

Use the current system to build the next system, but under explicit migration discipline.

This means:

  • existing brains/agents may execute migration and build work
  • current execution capacity should be harnessed, not ignored
  • outputs must be normalized into the new governed model
  • legacy naming and partial architecture must not silently become permanent

The system should bootstrap itself progressively, not magically.

6.3 Legacy-to-Target Brain Transition

Current legacy names such as:

  • BM (now the BrainMaster)
  • M0 (now the Mac Worker)

may continue as transitional identifiers during migration.

But the target architecture should normalize toward role-based naming:

  • BrainMaster with orchestrator, governor, rule steward, prompt steward, module steward roles
  • domain brains
  • Mac Worker
  • Face
  • workers under governed runtime identity

Rule

Legacy names may remain operationally useful for continuity, but target tickets and docs should increasingly anchor on the new role model.

6.4 Dual-Project Strategy

XIOPro and product projects may run in parallel, provided the boundary remains explicit.

XIOPro Track

Focus:

  • core operating system
  • control plane
  • governance
  • work graph
  • knowledge/research
  • UI / control center
  • infrastructure hardening

Product Track

Focus:

  • revenue-facing product delivery
  • product APIs and features
  • user-facing outcomes validated by XIOPro

For the first product (MVP1), see MVP1_PRODUCT_SPEC.md.

Boundary Rule

Parallel work is allowed. Architectural collapse is not.

XIOPro must remain the internal operating system. Product work remains a separate track.

6.5 Immediate Migration Reality

Before full XIOPro-native operation, the implementation should explicitly support a transition from the older mixed environment:

  • mixed XIOPro beta / product work
  • older swarm naming
  • mixed repo history
  • older bus/relay concepts
  • partial manual workflows

A cleanup wave is required to separate:

  • XIOPro project work
  • product project work (see MVP1_PRODUCT_SPEC.md)
  • legacy / archive material
  • transitional repo content

This cleanup is not optional. It is part of enabling self-build.

The earliest self-build tickets should include at least:

  1. server upgrade / stabilization assessment ticket
  2. memory-pressure / lockout cleanup ticket
  3. repo-role normalization ticket
  4. XIOPro vs product project separation ticket
  5. struxio-aibus migration / archive ticket
  6. legacy brain naming normalization ticket
  7. current swarm execution inventory ticket
  8. first XIOPro-native migration execution ticket

These tickets create the runway for the system to build more of itself safely.

6.7 Self-Build Workflow

Recommended pattern:

  1. founder defines or approves migration/build ticket
  2. current headless agents execute under supervision
  3. output is reviewed and normalized
  4. output is committed into canonical repos/state
  5. resulting capability increases XIOPro's own ability to execute the next wave
flowchart TD
    Ticket --> LegacyOrCurrentAgents
    LegacyOrCurrentAgents --> ReviewNormalize
    ReviewNormalize --> CommitToCanonicalSystem
    CommitToCanonicalSystem --> NewCapability
    NewCapability --> NextTicketWave
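The loop above can be sketched as a toy model: each committed wave raises the system's capability, which lets the next wave batch more tickets. This is purely illustrative; the function and names are assumptions, not part of the real orchestrator.

```python
# Toy model of the self-build loop: reviewed, committed output from each wave
# increases capacity for the next wave. All names here are illustrative.
def run_waves(tickets, capability=1):
    """Execute ticket waves; each committed wave grows capacity for the next."""
    committed = []
    while tickets:
        # take as many tickets as current capability allows
        wave, tickets = tickets[:capability], tickets[capability:]
        # each wave: execute under supervision, review/normalize, then commit
        committed.extend(f"committed:{t}" for t in wave)
        capability += 1  # new capability -> a bigger next ticket wave
    return committed
```

The point of the model is the feedback edge in the flowchart: capability gained in wave N is what makes wave N+1 larger.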

6.8 Guardrail

Self-build is allowed only if:

  • the output is reviewable
  • the output lands in canonical repos/state
  • the output does not bypass governance
  • the output does not mutate production behavior silently
  • rollback remains realistic

XIOPro may help build itself. It may not self-redefine without control.

6.9 Parallel 24x7 Operation Rule

If continuous agent availability exists, XIOPro and product work may run as parallel workstreams.

This is desirable because it reduces dependence on founder real-time attention.

However, 24x7 availability does not remove the need for:

  • approval gates
  • prioritization
  • escalation handling
  • evidence-backed module use
  • recovery discipline
  • daily supervision rhythm

Always-on execution is a multiplier, not a replacement for governance.

6.10 Schedule Expectation Rule

Aggressive timelines may be used as operating targets, but blueprint gates should remain evidence-based.

Examples of good gates:

  • one recovery path proven
  • one end-to-end task flow proven
  • one approval path proven
  • one research workflow preserved
  • one UI interaction becoming durable state

The system should optimize for accelerated progress, but not by removing proof.

6.11 Parallel Operation Rule

During migration, old services and new services run in parallel.

This means:

  • Bus continues running alongside new API service
  • Paperclip continues running alongside new ODM-based ticket management
  • Phase 1 dashboard continues running alongside new Control Center
  • Paperclip DB runs alongside ODM PostgreSQL until consolidation

No big-bang cutover.

Old services are retired only after new services reach feature parity and are proven stable in production for a reasonable period.

Retirement sequence follows the service fate map in resources/SERVICE_FATE_MAP_v4_2.md.
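One way to keep the parallel-operation rule explicit is a per-capability routing table, so cutover happens one capability at a time and rollback is a single flip back. This is a sketch only; the routing table and function names are assumptions, and the real mechanism lives in the service fate map and deployment config.

```python
# Sketch of the no-big-bang rule: each capability routes to its old or new
# service via an explicit table. Service names follow this section; the
# table itself is a hypothetical illustration.
ROUTES = {
    "tickets": "paperclip",         # legacy until ODM-based management reaches parity
    "messaging": "bus",             # legacy Bus runs alongside the new API service
    "dashboard": "phase1-dashboard" # until the new Control Center is proven
}

def backend_for(capability: str) -> str:
    """Resolve which service currently handles a capability."""
    return ROUTES.get(capability, "unknown")

def cut_over(capability: str, new_service: str) -> None:
    """Flip one capability to its replacement; rollback is flipping it back."""
    ROUTES[capability] = new_service
```

Retiring an old service then becomes a check that no capability routes to it anymore, rather than a calendar event.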


7. Delivery Phases

The phases below are ordered by dependency, but some work streams run in parallel.

Phase 0 — Grounding (Completed 2026-03-28)

Goals:

  • lock technology decisions
  • write executable DDL from ODM schema
  • map service fates (current -> target)
  • install must-have CLI tools
  • harden security (firewall, SSH, git history)
  • verify emergency access paths
  • establish 4-digit ticket numbering
  • define Control Bus architecture
  • define unified agent identity model
  • retire deprecated services
  • create initial ticket backlog
  • produce BP v5.0 (all 12 parts)

Key outputs:

  • resources/SCHEMA_walking_skeleton_v4_2.sql — 21 tables, 27 enums, runs on PostgreSQL without error
  • resources/SERVICE_FATE_MAP_v4_2.md — 14 containers mapped
  • resources/CLI_TOOLS_ASSESSMENT.md — 7 must-have CLI tools assessed
  • resources/BP_v4_2_CORRECTIONS.md — 30+ correction items from v4.1
  • CLI tools installed: jq 1.8.1, yq 4.52.5, fd 10.4.2, fzf 0.70.0, uv 0.11.2, gh 2.89.0
  • Security: UFW firewall enabled, Tailscale SSH, git history cleaned (plaintext secrets purged)
  • Stale services retired: devxio-frontend, devxio-bridge (123 MB freed), Neo4j (1.83 GB freed)
  • First batch of 4-digit tickets created (TKT-1001 to TKT-1008) in Paperclip
  • Control Bus architecture defined (Parts 2, 4, 8)
  • Unified Agent Identity Model (3-digit IDs, role bundles) across all 12 parts
  • struxio-knowledge repo created and seeded (13 technology evaluations)
  • Brand brief v1 produced
  • resources/DESIGN_rc_architecture.md — Remote Control architecture (Open WebUI, multi-provider routing, Prompt Composer integration)
  • resources/DESIGN_cli_services.md — CLI services framework (config-driven operational commands via Bus API)

Gate to exit:

  • DDL runs on PostgreSQL successfully (done)
  • Service fate map approved by founder (done)
  • CLI tools installed and verified (done)
  • Security hardened (done)
  • BP v5.0 complete (done)

Open items carried to Phase 1:

  • pg_dump not yet in backup scope (needs root SSH)
  • backup.sh has plaintext B2 credentials (needs SOPS encryption)
  • direnv not installed (needs sudo/apt)

Estimated duration: 1 day (completed in 1 day)

Phase 0A — Baseline Consolidation

Goals:

  • freeze authoritative blueprint set
  • freeze canonical repo roles
  • inventory current assets
  • mark transitional/legacy repos
  • define the first ticket backlog from blueprint decisions

Key outputs:

  • canonical repository registry
  • canonical blueprint set
  • migration inventory
  • first workstream board

Phase 1 — Foundation & Walking Skeleton

Goals:

  • establish Node A baseline
  • validate Node B role
  • finalize bootstrap/startup/update scripts
  • establish repo/storage conventions
  • prove deployment/restart discipline
  • prove the walking skeleton end-to-end path

Key outputs:

  • repeatable bootstrap
  • controlled update flow
  • storage/repo conventions active
  • base observability and backup running
  • walking skeleton entities live in PostgreSQL

Gate to exit:

  • fresh bootstrap works
  • warm restart works
  • rollback path documented and smoke-tested
  • one end-to-end task can run with durable state (walking skeleton)

Estimated duration: 1 week

Phase 2 — Work Graph / ODM + Governance & Research

Goals:

  • implement authoritative ODM schema
  • implement tickets/tasks/activities
  • implement runtimes/sessions/escalations/human decisions
  • implement cost/time propagation
  • replace ad hoc work tracking with governed state
  • activate the governor
  • implement alerting and breakers
  • implement approval/discussion gates
  • activate rule steward / prompt steward / module steward governance paths
  • connect cost signals and optimization evidence

Key outputs:

  • DB schema
  • migrations
  • API/domain contracts
  • first real ticket/task execution chain
  • alerts and breaker baseline
  • approval workflow
  • module portfolio governance
  • ContextPrompting governance
  • governed asset publication path

Gate to exit:

  • one end-to-end task can run with durable state
  • one escalation can be opened and resolved
  • one activity cost record is attributable upward
  • one breaker triggers correctly
  • one approval path blocks and resumes execution
  • one module decision is evidence-backed and queryable
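The "attributable upward" gate can be made concrete with a small roll-up: flat activity cost records aggregate to task and ticket totals. This is a minimal sketch; the record shape is an assumption, not the ODM schema.

```python
from collections import defaultdict

# Hypothetical flat activity cost records; in the real system these would be
# rows in the ODM. Field names are illustrative.
activities = [
    {"ticket": "1001", "task": "1001.01", "cost_usd": 0.42},
    {"ticket": "1001", "task": "1001.01", "cost_usd": 0.18},
    {"ticket": "1001", "task": "1001.02", "cost_usd": 0.90},
]

def roll_up(records):
    """Aggregate activity costs upward to task and ticket totals."""
    task_cost, ticket_cost = defaultdict(float), defaultdict(float)
    for r in records:
        task_cost[r["task"]] += r["cost_usd"]
        ticket_cost[r["ticket"]] += r["cost_usd"]
    return dict(task_cost), dict(ticket_cost)
```

If a cost record cannot be rolled up this way because its task or ticket link is missing, the propagation gate has not been met.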

Estimated duration: 1 week

Phase 3 — Execution Fabric + UI Minimum

Goals:

  • wire the orchestrator to the work graph
  • wire Ruflo/runtime execution into tickets/tasks
  • establish session manager / resume semantics
  • establish recovery paths
  • prove headless execution without UI
  • implement widget-first web UI
  • implement Prompt Composer
  • implement Brain Collaboration preset
  • implement Control Center preset

Key outputs:

  • orchestrator active
  • task-to-runtime execution flow
  • session continuity and restart support
  • recovery path choices
  • widget grid shell
  • core workspaces active

Gate to exit:

  • one ticket executes end-to-end headlessly
  • one restart/recovery scenario is successful
  • session lineage remains intact
  • founder can talk to a brain from the UI
  • an execution-changing interaction becomes durable state

Estimated duration: 1-2 weeks

Phase 4 — Migration, Cleanup & Knowledge

Goals:

  • retire stale services (devxio-frontend, devxio-bridge)
  • ~~evaluate Neo4j instances~~ (done -- both retired and removed, see Part 5 Section 12.1)
  • migrate Paperclip data to ODM
  • activate Librarian discipline
  • implement Research Center task model
  • connect Hindsight and Dream proposal loops
  • tune performance and cost behavior
  • reduce manual operator burden

Key outputs:

  • stale services retired
  • Paperclip data migrated or adapted
  • Librarian ingestion path active
  • research task scheduling active
  • known failure and recovery patterns documented
  • operational hardening complete

Gate to exit:

  • one research task runs and is preserved
  • one research output is ingested and retrievable
  • stale services stopped and containers removed
  • system can run continuously with controlled human intervention

Estimated duration: 1 week

Phase 5 — Knowledge & Research (Extended)

Goals:

  • implement governed ingestion
  • implement Research Center task model fully
  • connect NotebookLM and Obsidian through bounded roles
  • implement Hindsight and Dream proposal loops

Key outputs:

  • Librarian ingestion path
  • research task scheduling
  • research output types
  • research-to-knowledge promotion path

Gate to exit:

  • one research task runs and is preserved
  • one research output is ingested and retrievable
  • one Hindsight or Dream proposal routes correctly for approval

Phase 6 — UI / Control Center (Extended)

Goals:

  • implement approvals / alerts / research / module workspaces
  • establish mobile reduced-surface mode
  • implement hot/warm/cold widget model

Key outputs:

  • hot/warm/cold widget model
  • prompt widget with Prompting Mode, attachments, voice, search/research, style, module controls
  • approval and recovery visible and actionable in UI
  • mobile mode supports core founder control

Gate to exit:

  • approval and recovery are visible and actionable in UI
  • mobile mode supports core founder control

Phase 7 — Product Integration & Hardening

Goals:

  • connect XIOPro to product-relevant flows (see MVP1_PRODUCT_SPEC.md for the first product)
  • prove product/runtime separation
  • harden failure cases
  • reduce manual operator burden
  • tune performance and cost behavior

Key outputs:

  • product support path
  • operational hardening
  • reduced manual intervention
  • known failure and recovery patterns documented

Gate to exit:

  • XIOPro supports at least one meaningful product workflow
  • system can run continuously with controlled human intervention
  • critical quality/stability/trust criteria are met

7A. Timeline Estimate

T1P vs Phase 2 Scope Boundary

The external reviews (ChatGPT, Claude, Gemini) unanimously identified the original 10-14 day claim as not credible for the full scope. This section therefore distinguishes T1P (Walking Skeleton) from Phase 2 (Full System): full production delivery takes 4-6 weeks, while sprint compression can deliver individual features in hours.

t1p_scope:  # 2-3 days (sprint-compressed; 4-6 weeks for full production)
  - ODM schema deployed (DONE)
  - Control Bus (SSE, registration, intervention, tasks)
  - Walking skeleton (one full path: discussion -> ticket -> task -> execution -> cost)
  - One escalation/approval path
  - Minimum UI (shell + 3-4 core widgets)
  - Agent spawning with host capacity check
  - Basic cost tracking

phase_2_scope:  # 4-6 weeks after T1P
  - Full governance (all breakers, all policies, all review gates)
  - Full knowledge system (Librarian service, Research Center automation, RAG pipeline)
  - Full UI (all 6 widgets, mobile, layout presets)
  - Hindsight integration
  - Dream Engine / Idle Maintenance automation
  - Skill Performance DB active
  - Behavioral testing
  - Product integration (see `MVP1_PRODUCT_SPEC.md`)

Accelerated Timeline (Revised 2026-03-28)

accelerated_timeline:
  lesson: "34 tickets completed in Day 1 (~2.5 hours agent time)"
  implication: "Human-pace estimates are 50-100x too slow for agent execution"

  revised_plan:
    day_1: "DONE  Control Bus, ODM, Governance, UI, Knowledge, Infra, Testing (34 tickets)"
    day_2: "Docker deploy, UI polish, product integration start, Paperclip cleanup"
    day_3: "Product core (see MVP1_PRODUCT_SPEC.md)"
    day_4: "Integration testing, SSE real-time, UI connected to live data"
    day_5: "Phase 2 planning, full governance activation, research center automation"

  t1p_completion: "Day 2-3 (not Day 10-14)"
  phase_2_start: "Day 4-5 (not week 4-6)"

T1P Target (Founder Directive 2026-03-28, revised 2026-03-29)

T1P (Walking Skeleton + Control Bus + One Approval Path + Minimum UI) must be operational within 2-3 days, revised from the original estimates based on Day 1 execution velocity. Full production still requires 4-6 weeks, with sprint compression delivering individual features in hours. Product work begins Day 2 and runs in parallel (see MVP1_PRODUCT_SPEC.md). Additional budget (tokens + compute) is available if needed.

The broader scope (full governance, full knowledge system, full research center, all 25 modules operational) is Phase 2 -- estimated Day 4-5 start (revised from 4-6 weeks after T1P).

Phase | Duration | Cumulative | Parallel Work | Ticket Coverage
--- | --- | --- | --- | ---
Phase 0 (Grounding) | Day 1 | Day 1 | DONE — DDL, CLI, security, BP v5.0, tickets, Control Bus, ODM, Governance, UI, Knowledge, Infra, Testing (34 tickets) | Pre-ticketing + TKT-1001 to TKT-1063
Phase 1 (Docker Deploy + UI Polish) | Day 2 | Day 2 | Docker containers live, Control Center deployed, Caddy routing, Paperclip cleanup | TKT-1070+
Phase 2 (Product Core) | Day 2-3 | Day 3 | Product integration (see MVP1_PRODUCT_SPEC.md) | Product tickets
Phase 3 (Integration + Real-time) | Day 3-4 | Day 4 | SSE real-time, UI connected to live Bus/ODM data, end-to-end validation | Integration tickets
Phase 4 (Phase 2 + Full Governance) | Day 4-5 | Day 5 | Phase 2 planning, full governance activation, research center automation | Phase 2 tickets
Total T1P | 2-3 days | | |
Phase 2 begins | Day 4-5 | | Full governance, full UI, knowledge, migration | Remaining tickets

Acceleration Levers (Proven)

  • Multi-agent parallel execution: 34 tickets in ~2.5 hours proves the model
  • 24x7 continuous operation: BrainMaster runs without human-gated pauses
  • Compressed phases: All phases overlap -- Day 1 covered Phases 0-3 of the original plan
  • Mac Studio M1: Overflow capacity (~25 additional agent slots)
  • Budget flexibility: Founder willing to add tokens and compute power

Resource Allocation

Agent | Primary Phase | Secondary
--- | --- | ---
BrainMaster (orchestrator) | All phases — orchestration | Control Bus coding
Engineering Brain | Phase 1-2 — Bus upgrade, backend | Phase 3 — API endpoints
Brand Brain | Phase 3 — UI / Control Center | Phase 2 — brand integration
DevOps Brain | Phase 1 — infrastructure, testing | Phase 4 — migration
Mac Worker | All phases — Mac tasks, Obsidian | Product support

Timeline Rules

  • Human-pace estimates are 50-100x too slow for agent execution -- plan accordingly.
  • Gates are evidence-based — but compress validation, don't skip it.
  • Parallel work across phases is the default, not the exception.
  • 24x7 agent availability accelerates delivery and is expected.
  • If blocked, escalate immediately — don't wait for next day.
  • Additional compute or API budget available on request.

7B. Walking Skeleton Definition

7B.1 Purpose

XIOPro needs a provable minimum end-to-end execution path before broader expansion.

This is the walking skeleton.

It is the smallest real slice that proves the machine is a machine, not only a document set.

7B.2 Walking Skeleton Goal

The walking skeleton should prove this flow:

founder request -> durable discussion / intake -> ticket -> task -> runtime / session -> activity / result -> optional escalation -> human decision -> completion -> traceable cost and audit signals

This is the minimum proof path for governed execution.

7B.3 Required Objects in the Skeleton

The walking skeleton must use real instances of at least:

  • Discussion Thread
  • Ticket
  • Task
  • Agent Runtime
  • Session
  • Activity
  • Escalation Request
  • Human Decision
  • cost/usage record
  • audit/governance event

If these are mocked only in prose, the skeleton is not real enough.
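A minimal, runnable stand-in for the core chain can make "real, not prose-mocked" concrete: each object carries an explicit link to its parent, so any activity is traceable back to its originating discussion. Field names here are assumptions for illustration, not the authoritative schema in resources/SCHEMA_walking_skeleton_v4_2.sql.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the skeleton's core chain; field names are
# assumptions, not the authoritative ODM schema.
@dataclass
class DiscussionThread:
    id: int

@dataclass
class Ticket:
    id: int
    thread_id: int

@dataclass
class Task:
    id: int
    ticket_id: int

@dataclass
class Session:
    id: int
    task_id: int

@dataclass
class Activity:
    id: int
    session_id: int
    cost_usd: float

def lineage(activity, sessions, tasks, tickets, threads):
    """Walk an activity back to its originating discussion thread."""
    s = sessions[activity.session_id]
    t = tasks[s.task_id]
    tk = tickets[t.ticket_id]
    th = threads[tk.thread_id]
    return [th, tk, t, s, activity]
```

The design point is that lineage is resolved through stored foreign keys, not reconstructed from logs or prose.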

DDL Reference

The executable schema for the walking skeleton is defined in:

resources/SCHEMA_walking_skeleton_v4_2.sql

This DDL must run on PostgreSQL without error as the Phase 0 gate. It defines all tables, enums, indexes, and constraints for the walking skeleton entities.

7B.4 Required Services in the Skeleton

The walking skeleton should involve at least:

  • one backend API/service layer
  • PostgreSQL-backed state
  • orchestrator execution/orchestration path
  • one constrained runtime/execution surface
  • one approval or escalation path
  • one UI or operator surface path
  • one telemetry/cost signal path

7B.5 Skeleton Scenarios

T1P should prove at least these scenarios:

Scenario A — Straight Through Execution

  1. founder opens a Discussion Thread
  2. thread promotes into Ticket / Task
  3. orchestrator assigns execution
  4. Agent Runtime and Session are created
  5. Activity executes and returns a result
  6. task completes
  7. cost and audit signals are queryable

Scenario B — Escalation / Human Gate

  1. execution reaches ambiguity or approval threshold
  2. Escalation Request opens
  3. founder responds
  4. Human Decision is recorded
  5. execution resumes and completes
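The Scenario B gate can be sketched as a tiny state machine: execution blocks on an open escalation and can only resume through a recorded human decision. State and event names are illustrative, not the ODM enums.

```python
# Minimal sketch of the human gate: no transition out of "blocked" exists
# except via an explicit human decision (approve/reject). Names are illustrative.
ALLOWED = {
    ("running", "escalate"): "blocked",   # Escalation Request opens
    ("blocked", "approve"): "running",    # Human Decision: proceed
    ("blocked", "reject"): "cancelled",   # Human Decision: stop
    ("running", "complete"): "done",
}

def step(state: str, event: str) -> str:
    """Advance execution state; forbid transitions that bypass the human gate."""
    if (state, event) not in ALLOWED:
        raise ValueError(f"illegal transition: {event!r} from {state!r}")
    return ALLOWED[(state, event)]
```

Because "blocked" has no completion edge, the gate cannot be bypassed silently; that is the property Scenario B is meant to prove.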

Scenario C — Recovery / Resume

  1. a session is interrupted
  2. recovery path is chosen
  3. session or replacement session resumes
  4. lineage remains intact

Scenario D — Research Flow

  1. founder creates or promotes work into a Research Task
  2. research runs
  3. Research Output is preserved
  4. output is promoted into Librarian-managed knowledge or explicitly left draft

7B.6 Skeleton Boundaries

The walking skeleton should avoid false scale.

It does not need:

  • full multi-brain sophistication
  • all widget presets
  • advanced Dream behavior
  • large module portfolio automation
  • complete self-build autonomy

It needs only enough reality to prove the canonical path.

7B.7 Acceptance Criteria

The walking skeleton is complete when all are true:

  • the end-to-end flow is real, not mocked
  • state persists across services
  • one escalation/resume path works
  • one cost signal is attributable
  • one audit trail is inspectable
  • one research task path is real
  • one recovery path is proven
  • the founder can supervise the path from the UI or control surface

7B.8 Implementation Rule

All broader implementation ambition should be built on top of the walking skeleton.

If the skeleton is not working, additional architectural sophistication increases risk instead of reducing it.


8. Ticket Numbering

8.1 New Ticket Numbering Scheme

All new tickets use a 4-digit numbering scheme starting at 1001.

NNNN          — Sequential ticket number (1001, 1002, ...)
NNNN.NN       — Sub-task (1001.01, 1001.02)
EPIC-NN       — Epic grouping (EPIC-01, EPIC-02)

8.2 Historical Ticket Preservation

  • Existing 3-digit tickets (064-079) are preserved as-is. Do not renumber.
  • Paperclip STR-N identifiers retire when Paperclip retires.
  • ODM uses the 4-digit number as the canonical identifier.

8.3 Migration

The transition from 3-digit to 4-digit is a clean cut:

  • All tickets created before v5.0 keep their original numbers
  • All tickets created under v5.0 start at 1001
  • No ambiguity: 3-digit = historical, 4-digit = current
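The clean cut between schemes is mechanically checkable. A sketch of a classifier, assuming the formats given above (the helper name and the exact rules are illustrative):

```python
import re

# Illustrative patterns for the v5.0 numbering scheme; the real ODM
# identifier rules may differ in detail.
TICKET_RE = re.compile(r"^\d{4}$")          # NNNN, e.g. 1001
SUBTASK_RE = re.compile(r"^\d{4}\.\d{2}$")  # NNNN.NN, e.g. 1001.01
EPIC_RE = re.compile(r"^EPIC-\d{2}$")       # EPIC-NN, e.g. EPIC-01

def classify(identifier: str) -> str:
    """Classify an identifier: current scheme, historical 3-digit, or unknown."""
    if TICKET_RE.match(identifier):
        # v5.0 tickets start at 1001; smaller 4-digit values are not valid
        return "ticket" if int(identifier) >= 1001 else "unknown"
    if SUBTASK_RE.match(identifier):
        return "subtask"
    if EPIC_RE.match(identifier):
        return "epic"
    if re.match(r"^\d{3}$", identifier):    # preserved 064-079 style tickets
        return "historical"
    return "unknown"
```

Legacy Paperclip STR-N identifiers deliberately classify as unknown here, matching their planned retirement.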

The first ticket waves should roughly follow this order:

  1. bootstrap / deploy / restore baseline
  2. ODM schema + migration runner (DDL from resources/SCHEMA_walking_skeleton_v4_2.sql)
  3. ticket/task/activity runtime chain
  4. session / escalation / human decision support
  5. orchestrator execution orchestration
  6. session recovery paths
  7. governor alerts / breakers / approvals
  8. Librarian ingestion and retrieval baseline
  9. Research Center task model
  10. widget shell + Prompt Composer
  11. Brain Collaboration and Control Center presets
  12. module portfolio evidence and optimization views
  13. mobile reduced-surface mode
  14. Product integration tickets (see MVP1_PRODUCT_SPEC.md)
  15. outside review and correction wave

9A. T1P Sequencing Constraints & First Skeleton Wave

9A.1 Purpose

The work plan now includes:

  • explicit T1P posture rules
  • a defined walking skeleton
  • a self-build strategy
  • parallel XIOPro and product operation

This section constrains their interaction so early implementation does not become uncontrolled parallelism.


9A.2 Sequencing Principle

Not all work should begin at once just because the architecture names many subsystems.

T1P sequencing should follow this rule:

stabilize -> prove the skeleton -> expand carefully

This protects the project from:

  • blueprint drift
  • premature parallelization
  • self-build overreach
  • UI-first illusion
  • governance that exists only in prose


9A.3 Hard Ordering Constraints

The following must be treated as hard prerequisites.

Constraint A — Bootstrap Before Breadth

Before broad feature work begins, XIOPro must have:

  • repeatable bootstrap
  • controlled restart
  • basic observability
  • basic backup/restore confidence
  • emergency access / lockout survival path

Constraint B — ODM Before Rich Orchestration

Before richer orchestration or advanced UI behavior, the system must have authoritative:

  • Discussion Thread
  • Ticket
  • Task
  • Agent Runtime
  • Session
  • Escalation Request
  • Human Decision
  • Research Task
  • cost and audit records

Constraint C — Walking Skeleton Before Broad Self-Build

Before XIOPro meaningfully "builds itself," the walking skeleton must already be real.

Self-build may assist setup and migration preparation earlier, but it should not become a broad primary delivery method until the skeleton proves:

  • durable state
  • controlled execution
  • one human gate
  • one recovery path
  • one research path

Constraint D — XIOPro Boundary Before Product Acceleration

Parallel product work is allowed (see MVP1_PRODUCT_SPEC.md), but broad parallel acceleration should wait until XIOPro proves:

  • work graph integrity
  • execution continuity
  • governance basics
  • module/cost visibility
  • a stable founder control surface

9A.4 First Skeleton Wave

The first implementation wave should be deliberately narrow.

Wave 1 Goal

Prove the canonical end-to-end XIOPro path using a minimal but real slice.

Wave 1 Scope

Build only enough to prove:

  1. Discussion Thread creation
  2. Ticket / Task promotion
  3. orchestrator assignment
  4. Agent Runtime + Session creation
  5. Activity record creation
  6. Result persistence
  7. optional Escalation Request
  8. Human Decision recording
  9. completion state
  10. queryable cost/audit signal

Wave 1 Explicit Non-Goals

Do not require in Wave 1:

  • full widget catalog
  • broad multi-brain sophistication
  • complete module portfolio intelligence
  • rich Dream behavior
  • rich Hindsight automation
  • broad self-build autonomy
  • many runtime surfaces

Wave 1 is about proving the machine, not proving every future sophistication.


9A.5 First Research Wave

Research remains in T1P, but the first research wave should also be narrow.

It should prove only:

  1. Research Task creation
  2. source bundle association
  3. research execution or synthesis run
  4. Research Output preservation
  5. promotion into Librarian-managed knowledge or explicit draft retention

Research Rule

Research is preserved in T1P, but it does not need a giant multi-surface ecosystem before the core research path is real.


9A.6 First Governance Wave

The first governance wave should prove only:

  1. one alert path
  2. one approval-required path
  3. one breaker path
  4. one override path
  5. one cost anomaly or module constraint path

This is enough to prove governance is operational, without pretending all advanced governance exists on day one.
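A single cost breaker is enough to demonstrate items 3-5 together: spend past a budget trips the breaker, and only an explicit human override resumes execution. This is a hedged sketch; class name, fields, and thresholds are illustrative.

```python
# Illustrative breaker: trip when cumulative spend crosses a budget, and
# require an explicit human override to resume. All names are assumptions.
class CostBreaker:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent = 0.0
        self.tripped = False

    def record(self, cost_usd: float) -> bool:
        """Record spend; return True while execution may continue."""
        self.spent += cost_usd
        if self.spent > self.budget_usd:
            self.tripped = True
        return not self.tripped

    def override(self, approved_by: str, new_budget_usd: float) -> None:
        """Human override path: reset the breaker under a raised budget."""
        self.tripped = False
        self.budget_usd = new_budget_usd
```

The governance proof is the shape, not the numbers: a trip path, an override path, and a cost anomaly path all exercised by one small mechanism.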


9A.7 First UI Wave

The first UI wave should prove only:

  • Control Center preset
  • Brain Collaboration preset
  • Prompt Composer
  • approval visibility
  • trace visibility
  • mobile reduced-surface basics

UI Rule

The first UI wave should expose real backend state.

It must not become a polished simulation layer over incomplete backend contracts.


9A.8 First Self-Build Wave

XIOPro may help build XIOPro early, but the first self-build wave should be constrained to:

  • migration inventory work
  • repo normalization assistance
  • documentation cleanup
  • blueprint-to-ticket preparation
  • research preparation
  • non-destructive code/doc generation
  • reviewed implementation tickets under founder supervision

Rule

Before the walking skeleton is proven, self-build should be assistive and review-heavy, not broadly autonomous.


9A.9 First Parallel Product Wave

Parallel product work (see MVP1_PRODUCT_SPEC.md) may continue in T1P, but only in bounded areas that do not depend on unfinished XIOPro core guarantees.

Good early parallel candidates:

  • product design
  • requirements shaping
  • UX exploration
  • non-critical application slices
  • research and business-facing prep

Riskier candidates to delay until XIOPro core proof:

  • heavy dependence on unstable orchestration
  • deep coupling to incomplete governance
  • runtime-critical workflows requiring strong recovery guarantees

9A.10 First-Sprint Acceptance Gate

The first serious implementation wave should not be declared successful unless all of the following are true:

  • bootstrap and restart are repeatable
  • the walking skeleton path is real
  • at least one escalation/human decision path works
  • at least one research task path works
  • at least one approval/governance path works
  • founder can inspect the flow from the UI/control surface
  • cost/audit evidence is queryable

If these are not true, the system is still in architectural assembly, not operational proof.


10. Migration Strategy

10.1 Migrate, Do Not Flatten

Existing assets should be:

  • classified
  • normalized
  • migrated
  • archived where necessary

Do not collapse everything into one repo or one giant backlog.

10.2 Legacy / Transitional Asset Handling

For assets such as:

  • older tickets
  • legacy repo structures
  • old bus concepts
  • historical draft blueprints

choose one of:

  • adopt
  • normalize
  • supersede
  • archive

10.3 Rule for Legacy Systems

A legacy path may continue temporarily if:

  • it still provides operational value
  • it has not yet been replaced
  • its boundary is explicit
  • its retirement path is defined

11. Sprint Strategy

11.1 Sprint Length

Recommended initial length:

  • 1 week

This is short enough for fast correction, but long enough to prove real technical increments.

11.2 Sprint Shape

Every sprint should include:

  • planned build work
  • validation work
  • documentation update
  • cleanup / debt reduction
  • explicit review
  • next-sprint refinement

11.3 Sprint Rhythm

flowchart TD
    Plan --> Build
    Build --> Validate
    Validate --> Review
    Review --> Correct
    Correct --> NextSprint

11.4 Sprint Rule

No sprint is complete unless:

  • implementation changed
  • validation ran
  • state/docs were updated
  • open defects and follow-ups were recorded

12. Daily Operating Rhythm

Morning

  • review attention queue
  • review alerts and blocked work
  • review approvals needed
  • adjust priorities and ticket order

Day

  • active build and supervision
  • founder/brain collaboration
  • governed interventions
  • module/research decisions as required

Evening / Night

  • lower-touch execution
  • scheduled research
  • Hindsight/Dream cycles where allowed
  • maintenance/refresh jobs
  • optimization passes within policy

13. Test Strategy

Testing is not a final polish phase.

It is part of the delivery model.

13.1 Test Layers

XIOPro should use multiple test layers:

  • unit tests
  • integration tests
  • workflow tests
  • recovery tests
  • governance tests
  • UI tests
  • performance tests
  • acceptance tests

13.1A Default Test Toolchain

See Part 2, Section 5.11 for the canonical test toolchain decision (pytest for backend, Playwright for E2E, optional Vitest).

Practical mapping for the work plan:

  • ODM, policy, scheduler, cost, and governance tests -> pytest
  • API/service integration tests -> pytest
  • UI workflow tests -> Playwright
  • mobile reduced-surface flow tests -> Playwright
  • widget rendering contract tests -> Playwright first, optional Vitest later if needed

13.2 Unit Tests

Target:

  • ODM logic
  • state transitions
  • policy evaluators
  • module recommendation logic
  • ContextPrompting decisions
  • parsing / validation helpers

Purpose:

  • prove deterministic local correctness
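As a minimal sketch of what a deterministic unit test at this layer could look like: the example below exercises state transitions against an explicit transition table. The state names, the `ALLOWED` table, and `TransitionError` are illustrative assumptions, not the actual XIOPro ODM.

```python
# Illustrative unit test for work-item state transitions, written in
# plain pytest style (functions named test_*). The state machine here
# is an assumption for the sketch, not the real XIOPro ODM.
ALLOWED = {
    "open": {"in_progress", "archived"},
    "in_progress": {"blocked", "done"},
    "blocked": {"in_progress", "escalated"},
    "escalated": {"in_progress", "done"},
    "done": set(),
    "archived": set(),
}

class TransitionError(Exception):
    pass

def transition(state: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED.get(state, set()):
        raise TransitionError(f"{state} -> {target} is not a legal transition")
    return target

def test_happy_path():
    assert transition("open", "in_progress") == "in_progress"
    assert transition("in_progress", "done") == "done"

def test_illegal_transition_raises():
    try:
        transition("done", "open")
    except TransitionError:
        pass
    else:
        raise AssertionError("expected TransitionError for done -> open")
```

Because the table is data, adding a new state or edge is a one-line change that every test immediately re-checks.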

13.3 Integration Tests

Target:

  • API <-> DB
  • orchestrator <-> work graph
  • governor <-> policy/breaker system
  • Librarian ingestion path
  • Research Center scheduling
  • UI <-> backend contracts
  • module routing / LiteLLM / runtime adapter paths

Purpose:

  • prove service cooperation
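A hedged sketch of the API <-> DB pairing, using an in-memory SQLite store so the test exercises real SQL rather than mocks. The `tickets` table shape and function names are assumptions for illustration, not the actual XIOPro schema or API.

```python
# Illustrative integration-style test: a minimal service function
# against a real (in-memory) database. Table and function names are
# assumptions, not the actual XIOPro API or schema.
import sqlite3

def create_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (id INTEGER PRIMARY KEY, title TEXT, state TEXT)"
    )
    return conn

def create_ticket(conn: sqlite3.Connection, title: str) -> int:
    cur = conn.execute(
        "INSERT INTO tickets (title, state) VALUES (?, 'open')", (title,)
    )
    conn.commit()
    return cur.lastrowid

def get_ticket(conn: sqlite3.Connection, ticket_id: int) -> dict:
    row = conn.execute(
        "SELECT id, title, state FROM tickets WHERE id = ?", (ticket_id,)
    ).fetchone()
    return {"id": row[0], "title": row[1], "state": row[2]}

def test_ticket_round_trip():
    conn = create_store()
    tid = create_ticket(conn, "normalize struxio-aibus")
    assert get_ticket(conn, tid)["state"] == "open"
```

The same pattern scales to the real Postgres-backed store by swapping the connection fixture; the assertions stay identical.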

13.4 Workflow / End-to-End Tests

Target scenarios:

  • founder request -> ticket -> task -> runtime -> result
  • escalation request -> human decision -> resume
  • research task -> output -> ingestion -> retrieval
  • module recommendation -> constrained execution -> measured outcome
  • UI interaction -> durable state mutation

Purpose:

  • prove the machine works as a machine
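The first target scenario can be sketched end to end as a single pipeline whose history is asserted, not just its final state. Everything here (`Ticket`, `run_task`, `pipeline`) is an illustrative assumption; a real workflow test would dispatch to an actual runtime adapter.

```python
# Minimal workflow sketch of: founder request -> ticket -> task ->
# runtime -> durable result. All names are illustrative assumptions.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Ticket:
    request: str
    state: str = "open"
    result: str | None = None
    history: list = field(default_factory=list)

def run_task(request: str) -> str:
    # Stand-in for the runtime adapter; a real test would await
    # the actual runtime's result here.
    return f"done: {request}"

def pipeline(request: str) -> Ticket:
    t = Ticket(request)
    t.history.append("open")
    t.state = "in_progress"
    t.history.append("in_progress")
    t.result = run_task(t.request)
    t.state = "done"
    t.history.append("done")
    return t

def test_request_to_durable_result():
    t = pipeline("summarize repo topology")
    assert t.state == "done"
    assert t.result is not None
    assert t.history == ["open", "in_progress", "done"]
```

Asserting the full history is what makes this a machine test rather than a snapshot test: it proves the path was traversed, not merely that the end state looks right.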

13.5 Recovery Tests

Target scenarios:

  • session crash
  • container restart
  • host reboot
  • provider failure
  • recovery reroute
  • admission closed then reopened
  • backup restore drill

Purpose:

  • prove recoverability, not only happy path
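One way to make the session-crash scenario concrete: run steps against a durable checkpoint, crash mid-run, then resume in a fresh "session" and assert only the remaining work executes. The checkpoint shape and function names are assumptions for the sketch.

```python
# Recovery-test sketch: a session "crashes" mid-run and a new session
# resumes from the last durable checkpoint instead of restarting.
import json
import os
import tempfile

def run_steps(steps, checkpoint_path, crash_after=None):
    """Run steps, persisting progress after each one; optionally crash."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for step in steps:
        if step in done:
            continue  # already completed before the crash
        done.append(step)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)
        if crash_after == step:
            raise RuntimeError("simulated session crash")
    return done

def test_resume_after_crash():
    path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    steps = ["plan", "build", "validate"]
    try:
        run_steps(steps, path, crash_after="build")
    except RuntimeError:
        pass
    # A fresh session resumes; only "validate" actually runs.
    assert run_steps(steps, path) == steps
```

The same skeleton extends to container restart and host reboot drills: only the "crash" injection changes, the resume assertion stays the same.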

13.6 Governance Tests

Target scenarios:

  • breaker trigger
  • approval-required action blocked
  • ContextPrompting asks blocking question
  • bad module path constrained
  • override recorded correctly
  • audit events emitted correctly

Purpose:

  • prove the system remains governed under stress
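The breaker-trigger scenario can be sketched as a small cost breaker that trips once spend exceeds budget, blocks further actions, and records every decision for audit. The `CostBreaker` class is an illustrative assumption, not the actual governor implementation.

```python
# Governance-test sketch: a cost breaker trips on budget overrun and
# blocks subsequent actions; every decision lands in an audit list.
class BreakerTripped(Exception):
    pass

class CostBreaker:
    def __init__(self, budget: float):
        self.budget = budget
        self.spent = 0.0
        self.tripped = False
        self.audit = []  # every decision is recorded

    def charge(self, amount: float, action: str):
        if self.tripped:
            self.audit.append(("blocked", action))
            raise BreakerTripped(action)
        self.spent += amount
        if self.spent > self.budget:
            self.tripped = True
            self.audit.append(("tripped", action))
            raise BreakerTripped(action)
        self.audit.append(("allowed", action))

def test_breaker_blocks_after_budget():
    b = CostBreaker(budget=1.0)
    b.charge(0.6, "research")
    try:
        b.charge(0.7, "research")  # exceeds budget -> trips
    except BreakerTripped:
        pass
    try:
        b.charge(0.1, "cleanup")  # blocked while tripped
    except BreakerTripped:
        pass
    assert b.tripped
    assert ("blocked", "cleanup") in b.audit
```

Asserting on the audit list, not just the exception, is the point: the governance test proves the evidence trail exists, not only that the block happened.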

13.7 UI Tests

Target:

  • widget rendering and contract integrity
  • layout preset loading
  • Prompt Composer behavior
  • approvals / alerts / research widget flows
  • mobile reduced-surface mode
  • long-list virtualization behavior

Purpose:

  • prove the control center is usable and safe

13.8 Performance Tests

Target:

  • hot widget responsiveness
  • event throughput
  • cost telemetry ingestion
  • search/retrieval latency
  • module routing latency
  • mobile responsiveness
  • long conversation/log virtualization behavior

Purpose:

  • prove widget-first design remains operational
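At its simplest, a performance test at this layer is a latency budget asserted in CI. The 50 ms budget, call count, and `route_module` stand-in below are all illustrative assumptions; the real test would call the actual router.

```python
# Performance-test sketch: assert a latency budget on a hot path.
# Budget and function are illustrative, not measured XIOPro targets.
import time

def route_module(task: str) -> str:
    # Stand-in for module routing; replace with the real router call.
    return {"code": "module-a"}.get(task, "module-default")

def test_routing_latency_budget():
    start = time.perf_counter()
    for _ in range(1000):
        route_module("code")
    elapsed = time.perf_counter() - start
    assert elapsed < 0.05, f"hot path took {elapsed:.3f}s for 1000 calls"
```

A failing budget assertion is a regression signal, not a benchmark; deeper profiling happens outside the test suite.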

13.9 Acceptance Tests

Acceptance should be written from the founder/operator perspective.

Example acceptance cases:

  • "I can ask a brain to work, clarify it, and see the durable result."
  • "I can approve or reject a protected action and know what changed."
  • "I can see which module was used, what it cost, and why."
  • "I can recover from a failed runtime without losing context."
  • "I can create and review research in a repeatable way."
  • "I can use the core controls from mobile."

13.10 Agent Behavioral Testing (v5.0.8 Addition)

Beyond code testing (pytest, Playwright), XIOPro must test agent behavior itself. Code tests verify that the system works; behavioral tests verify that agents behave correctly within the system.

behavioral_testing:
  prompt_regression:
    what: "Verify that agent responses remain consistent after skill/rule changes"
    method: "Golden set of input->expected_output pairs per role"
    when: "After any skill, rule, or activation file change"

  drift_detection:
    what: "Detect when agent behavior degrades over time"
    method: "Compare current task completion rate, token usage, and quality scores against baseline"
    when: "Weekly via Idle Maintenance"
    threshold: "Flag if completion rate drops >10% or token usage increases >20%"

  role_conformance:
    what: "Verify agent operates within its assigned role boundaries"
    method: "Check that agent didn't use capabilities outside its role bundle"
    when: "Post-task evaluation"

  escalation_testing:
    what: "Verify agents escalate correctly when they should"
    method: "Present ambiguous/impossible tasks, verify escalation happens"
    when: "Monthly via controlled test scenarios"

Design Rationale

  • Prompt regression is the agentic equivalent of unit testing. When a skill or rule changes, the agent's behavior may change in unexpected ways. A golden set of input/output pairs detects regressions before they reach production.
  • Drift detection catches gradual degradation that no single change causes. Model updates, accumulated context patterns, or subtle prompt erosion can all cause drift. Weekly measurement via Idle Maintenance (Part 4, Section 4.9.9) provides early warning.
  • Role conformance ensures agents stay within their assigned boundaries. An agent with the engineering role should not be making governance decisions. Post-task evaluation checks for boundary violations.
  • Escalation testing verifies a critical safety property: agents must escalate when they cannot complete a task or when the task is ambiguous. This is tested with controlled scenarios, not just observed in production.

T1P Implementation

For T1P, behavioral testing is lightweight:

  • Prompt regression: maintain 5-10 golden test cases per role bundle. Run after any activation file change. Use pytest to invoke the prompt and compare output against expected patterns (fuzzy match, not exact).
  • Drift detection: track completion rate and token usage per role in the Skill Performance DB (Part 5, Section 8.9A). Weekly Idle Maintenance task compares current week vs baseline.
  • Role conformance: manual review during task evaluation. Automated enforcement deferred to post-T1P.
  • Escalation testing: manual test scenarios during sprint reviews. Automated escalation test suite deferred to post-T1P.
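The prompt-regression item above could look like the sketch below: golden cases checked with a fuzzy keyword match rather than exact string equality, so harmless rewording passes while real regressions fail. The golden-case shape and `check_golden_case` are assumptions for T1P, not an existing harness.

```python
# Prompt-regression sketch: fuzzy golden-set checking for T1P.
# Case shape and helper are illustrative assumptions.
GOLDEN_CASES = [
    {
        "role": "research",
        "input": "Summarize the repo topology.",
        "must_contain": ["struxio-os", "struxio-logic"],
        "must_not_contain": ["DROP TABLE"],
    },
]

def check_golden_case(case: dict, response: str) -> list:
    """Return a list of failures; an empty list means the case passed."""
    failures = []
    lowered = response.lower()
    for kw in case["must_contain"]:
        if kw.lower() not in lowered:
            failures.append(f"missing required keyword: {kw}")
    for kw in case["must_not_contain"]:
        if kw.lower() in lowered:
            failures.append(f"forbidden content present: {kw}")
    return failures

def test_golden_case_passes():
    response = "The core repos are struxio-os and struxio-logic."
    assert check_golden_case(GOLDEN_CASES[0], response) == []

def test_golden_case_catches_regression():
    response = "The core repo is struxio-os."
    assert check_golden_case(GOLDEN_CASES[0], response) != []
```

Running this via pytest after every activation file change keeps the 5-10 cases per role cheap to maintain while still catching behavior regressions before they reach production.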

Relation to Existing Test Layers

Behavioral tests complement, not replace, the test layers in Sections 13.1-13.9:

  • Unit tests (13.2) verify code logic
  • Behavioral tests (13.10) verify agent decision-making
  • Governance tests (13.6) verify policy enforcement
  • Together they cover the full stack: code correctness, agent behavior, and system governance

14. Exit Criteria by Major Layer

Work Graph / ODM Exit Criteria

  • explicit state
  • queryable lineage
  • cost/time propagation
  • escalation + human decision support

Execution Exit Criteria

  • headless ticket execution
  • session continuity
  • recovery path support

Governance Exit Criteria

  • alerts
  • breakers
  • approvals
  • auditability
  • governed module/prompting behavior

Knowledge / Research Exit Criteria

  • governed ingestion
  • retrievable outputs
  • scheduled research support
  • proposal routing from Hindsight/Dream

UI Exit Criteria

  • widget-first shell
  • Prompt Composer
  • collaboration preset
  • governance visibility
  • mobile reduced-surface mode

15. Review Strategy

15.1 Internal Review

After each major phase:

  • compare implementation to blueprint
  • compare docs to implementation
  • update open gaps
  • record correction tickets

15.2 Outside Review

After Part 10 completion and before major lock:

  • run an outside architecture review
  • run an implementation practicality review
  • run an OSS/capability scan for build-vs-adopt decisions

Outside review should challenge:

  • overdesign
  • missing operational details
  • unnecessary custom code
  • weak failure handling
  • weak testing discipline

16. Risks & Mitigation

Risk: Complexity exceeds control

Mitigation:

  • strict ticketization
  • sprint gates
  • layered implementation
  • outside review

Risk: Hidden drift between blueprint and code

Mitigation:

  • phase reviews
  • update docs as part of sprint completion
  • treat mismatch as a tracked defect

Risk: Cost visibility remains weak

Mitigation:

  • execution-time telemetry
  • attribution pipeline
  • governor and module steward integration
  • cost-focused acceptance tests

Risk: Widget UI becomes heavy and unstable

Mitigation:

  • hot/warm/cold widget model
  • lazy loading
  • virtualization
  • layout presets before full freedom

Risk: Legacy assets create confusion

Mitigation:

  • classify adopt / supersede / archive
  • define repo roles
  • remove transitional repos from the canonical core once migrated

Risk: Research becomes fragmented again

Mitigation:

  • Research Center task model
  • source bundle discipline
  • Librarian promotion path
  • scheduled refresh rules

17. Success Criteria

The plan is successful if:

  • XIOPro becomes buildable in layers
  • migration preserves useful existing assets
  • repos and responsibilities become clear
  • the system can run headlessly and be supervised through the UI
  • module use becomes governable and optimizable
  • research becomes repeatable and reusable
  • test coverage proves the critical workflows
  • product work can be supported without collapsing XIOPro into product runtime (see MVP1_PRODUCT_SPEC.md)

18. Current State

As of 2026-03-28, the work plan exists in blueprint form only.

What exists today:

  • 14 Docker containers running on Hetzner CPX62
  • Active ticket backlog (3-digit, 064-079 range + STR-N Paperclip identifiers)
  • Five canonical repos (struxio-os, struxio-logic, struxio-design, struxio-app, struxio-business)
  • One transitional repo (struxio-aibus)
  • BrainMaster orchestrator running via Claude Code
  • Walking skeleton DDL written (resources/SCHEMA_walking_skeleton_v4_2.sql)
  • Service fate map written (resources/SERVICE_FATE_MAP_v4_2.md)
  • CLI tools assessment written (resources/CLI_TOOLS_ASSESSMENT.md)
  • Blueprint v5.0 corrections identified and being applied

What must happen next:

  • Execute Phase 0 (Grounding): run DDL, install CLI tools, create first 4-digit tickets
  • Begin Phase 1 (Walking Skeleton): prove the end-to-end path
  • Retire stale containers (devxio-frontend, devxio-bridge) for immediate RAM savings

19. Final Statement

This plan turns XIOPro from a partially manual, partially experimental operating environment into a governed, testable, continuously improving machine.

The goal is not speed without proof.

The goal is controlled acceleration: build, validate, govern, optimize, and only then scale.


Changelog

Version Date Author Changes
4.1.0 2026-03-27 BM Initial work plan blueprint
4.2.0 2026-03-28 BM C9.1: Switched to 4-digit ticket numbering (1001+, historical 3-digit preserved) -- new Section 8. C9.2: Added Phase 0 (Grounding) before Phase 1 with concrete gate criteria. C9.3: Added DDL reference for walking skeleton in Section 7B.3. C9.4: Added parallel operation rule (Section 6.11) -- old+new services run together, no big-bang cutover. C9.5: Added timeline estimate (Section 7A) -- Phase 0: 1-2 days, Phase 1: 1 week, Phase 2: 1 week, Phase 3: 1-2 weeks, Phase 4: 1 week = 4-6 weeks total T1P. CX.1: Global "Rufio" to "Ruflo" rename. CX.2: Updated version header to 4.2.0. CX.3: Added changelog. CX.4: Added current state section (Section 18). Renumbered Phase 0 (old Baseline Consolidation) to Phase 0A. Renumbered sections 14-17 to 15-18, final statement to 19.
4.2.2 2026-03-28 000 Agent naming migration: O00 replaced with 000 (orchestrator). O01 replaced with 000 (governor). R01/P01/M01 replaced with role-based naming. BM replaced with 000 (BrainMaster). B2/B3/B5/M0 replaced with 002/003/005/010 in resource allocation table. Legacy naming section updated to reference 3-digit model. Changelog author entries preserved as historical.
4.2.3 2026-03-28 000 Roles over numbers: Removed agent IDs from work plan phases, resource allocation, milestone lists, and integration references. Role names used throughout instead of agent numbers.
4.2.7 2026-03-28 BM Neo4j deprecated: Phase 4 "evaluate Neo4j instances" marked as done (both retired).
4.2.8 2026-03-28 BM AGI pattern gap fix: Added Agent Behavioral Testing section (13.10) — prompt regression, drift detection, role conformance, escalation testing. Addresses audit gap "Agent Behavioral Testing" (Principle 8 depth).
4.2.9 2026-03-28 000 Wave 1-2 BP fixes: Rewrote Phase 0 to reflect actual Day 0 completion (36 items from Part 11 execution log). Updated timeline table (7A) with Phase 0 marked DONE and ticket coverage column (TKT-1001 to 1008). Added open items carried to Phase 1.
4.2.10 2026-03-28 000 Content deduplication: Section 4.1 Repo Roles — replaced duplicated repo descriptions with cross-reference to Part 8 Section 8.13.2. Section 13.1A Test Toolchain — replaced duplicated toolchain decision with cross-reference to Part 2 Section 5.11.
4.2.11 2026-03-29 000 External review fix: Section 7A rewritten with T1P vs Phase 2 scope boundary. T1P = walking skeleton + control bus + one approval path + minimum UI (2-3 days sprint-compressed). Phase 2 = full governance, full knowledge, full UI, MVP1 (4-6 weeks for full production). Schema count corrected from "15 tables" to "21 tables, 27 enums" in Phase 0 outputs.
4.2.12 2026-03-29 BM Cross-references: Added pointers to resources/DESIGN_rc_architecture.md and resources/DESIGN_cli_services.md in Phase 0 outputs.
5.0.1 2026-03-30 GO N21: Updated Section 7A timeline language -- replaced "10-14 days" phrasing with "4-6 weeks for full production, with sprint compression achieving feature delivery in hours". T1P target and Phase 2 language adjusted to match validated execution velocity.