XIOPro Production Blueprint v5.0

Part 12 — Work Plan, Migration & Test Strategy


1. Purpose

This document defines how XIOPro moves from the current mixed implementation state to a governed, testable, continuously operating system.

It answers:

  • how the build proceeds without a big-bang rewrite
  • how the current repositories and assets are migrated into the target model
  • what gets built first
  • what must be proven by tests before moving forward
  • how review and outside validation are staged

2. Core Delivery Principle

Build while operating. Migrate in place. Prove each layer before treating it as real.

XIOPro should not be rebuilt from zero if live assets, rules, repos, and workflows already exist.

The implementation path should:

  • preserve existing useful assets
  • normalize and govern them
  • replace fragile/manual paths incrementally
  • keep rollback realistic
  • attach every major capability to explicit proof

3. Current Baseline

XIOPro does not start from nothing.

The current baseline already includes:

  • multiple STRUXIO repositories
  • rule and skill assets
  • ticket history
  • infrastructure definitions
  • frontend/backend experiments
  • swarm/runtime tooling
  • Librarian-related work
  • voice, dashboard, and execution-daemon experiments
  • existing bus/relay concepts
  • current state files and workspace graph

This means the work plan is a migration and consolidation plan, not only a net-new build plan.


4. Canonical Active Repository Baseline

The active repository family for implementation should be treated as:

  • struxio-os
  • struxio-logic
  • struxio-design
  • struxio-app
  • struxio-business
  • struxio-knowledge (v5.0 addition — governed knowledge, syncs with Obsidian)

A transition/legacy path may still exist for:

  • struxio-aibus

Reference repos may still exist for research or inspiration, but they are not the canonical operating core.

See Part 8, Section 8.13.2 for the full repository topology with detailed descriptions of each repo's contents and ownership (struxio-os, struxio-logic, struxio-design, struxio-app, struxio-business, struxio-knowledge, struxio-aibus transitional path).

4.2 Transitional Repo Rule

struxio-aibus should not remain a permanent first-class pillar. See Part 8, Section 8.13.2 for the migration and archive plan.


5. Work Streams

XIOPro implementation should be managed as parallel governed work streams.

5.1 WS1 — Control Plane & State

Focus:

  • DB / ODM
  • tickets/tasks/activities
  • sessions / escalations / decisions
  • APIs / scheduler / control services

5.2 WS2 — Execution Fabric

Focus:

  • orchestrator
  • Ruflo/runtime integration
  • worker execution
  • session continuity
  • recovery wiring

5.3 WS3 — Governance & Optimization

Focus:

  • governor
  • breakers
  • alerts
  • approval flows
  • rule steward / prompt steward / module steward support
  • module cost and optimization logic

5.4 WS4 — Knowledge & Research

Focus:

  • Librarian
  • knowledge ingestion
  • Research Center
  • NotebookLM / Obsidian linkage
  • Hindsight / Dream proposal loops

5.5 WS5 — UI / Control Center

Focus:

  • widget system
  • prompt composer
  • brain collaboration
  • governance / research / module workspaces
  • desktop and mobile operator flows

5.6 WS6 — Infrastructure & Operations

Focus:

  • bootstrap
  • deployment
  • repo/storage layout
  • secrets and networking
  • telemetry
  • backup / restore
  • runtime health and rollout discipline

5.7 WS7 — Product Integration

Focus:

  • XIOPro support for the first product (see MVP1_PRODUCT_SPEC.md)
  • customer/runtime separation
  • handoff from internal operating system to product runtime

6. Self-Build Strategy & Parallel Operation

6.1 Purpose

XIOPro should help build XIOPro.

The system already has partially active headless execution capacity, legacy swarm patterns, and current operational tickets/assets.

This section defines how the existing agent capability is used to accelerate migration, without pretending the new architecture already fully exists.

6.2 Core Principle

Use the current system to build the next system, but under explicit migration discipline.

This means:

  • existing brains/agents may execute migration and build work
  • current execution capacity should be harnessed, not ignored
  • outputs must be normalized into the new governed model
  • legacy naming and partial architecture must not silently become permanent

The system should bootstrap itself progressively, not magically.

6.3 Legacy-to-Target Brain Transition

Current legacy names such as:

  • BM (now the BrainMaster)
  • M0 (now the Mac Worker)

may continue as transitional identifiers during migration.

But the target architecture should normalize toward role-based naming:

  • BrainMaster with orchestrator, governor, rule steward, prompt steward, module steward roles
  • domain brains
  • Mac Worker
  • Face
  • workers under governed runtime identity

Rule

Legacy names may remain operationally useful for continuity, but target tickets and docs should increasingly anchor on the new role model.

6.4 Dual-Project Strategy

XIOPro and product projects may run in parallel, provided the boundary remains explicit.

XIOPro Track

Focus:

  • core operating system
  • control plane
  • governance
  • work graph
  • knowledge/research
  • UI / control center
  • infrastructure hardening

Product Track

Focus:

  • revenue-facing product delivery
  • product APIs and features
  • user-facing outcomes validated by XIOPro

For the first product (MVP1), see MVP1_PRODUCT_SPEC.md.

Boundary Rule

Parallel work is allowed. Architectural collapse is not.

XIOPro must remain the internal operating system. Product work remains a separate track.

6.5 Immediate Migration Reality

Before full XIOPro-native operation, the implementation should explicitly support a transition from the older mixed environment:

  • mixed XIOPro beta / product work
  • older swarm naming
  • mixed repo history
  • older bus/relay concepts
  • partial manual workflows

A cleanup wave is required to separate:

  • XIOPro project work
  • product project work (see MVP1_PRODUCT_SPEC.md)
  • legacy / archive material
  • transitional repo content

This cleanup is not optional. It is part of enabling self-build.

The earliest self-build tickets should include at least:

  1. server upgrade / stabilization assessment ticket
  2. memory-pressure / lockout cleanup ticket
  3. repo-role normalization ticket
  4. XIOPro vs product project separation ticket
  5. struxio-aibus migration / archive ticket
  6. legacy brain naming normalization ticket
  7. current swarm execution inventory ticket
  8. first XIOPro-native migration execution ticket

These tickets create the runway for the system to build more of itself safely.

6.7 Self-Build Workflow

Recommended pattern:

  1. founder defines or approves migration/build ticket
  2. current headless agents execute under supervision
  3. output is reviewed and normalized
  4. output is committed into canonical repos/state
  5. resulting capability increases XIOPro's own ability to execute the next wave
flowchart TD
    Ticket --> LegacyOrCurrentAgents
    LegacyOrCurrentAgents --> ReviewNormalize
    ReviewNormalize --> CommitToCanonicalSystem
    CommitToCanonicalSystem --> NewCapability
    NewCapability --> NextTicketWave
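The loop above can be sketched as a toy model: each committed wave raises the system's capability, which lets the next wave batch more tickets. This is purely illustrative; the function and names are assumptions, not part of the real orchestrator.

```python
# Toy model of the self-build loop: reviewed, committed output from each wave
# increases capacity for the next wave. All names here are illustrative.
def run_waves(tickets, capability=1):
    """Execute ticket waves; each committed wave grows capacity for the next."""
    committed = []
    while tickets:
        # take as many tickets as current capability allows
        wave, tickets = tickets[:capability], tickets[capability:]
        # each wave: execute under supervision, review/normalize, then commit
        committed.extend(f"committed:{t}" for t in wave)
        capability += 1  # new capability -> a bigger next ticket wave
    return committed
```

The point of the model is the feedback edge in the flowchart: capability gained in wave N is what makes wave N+1 larger.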

6.8 Guardrail

Self-build is allowed only if:

  • the output is reviewable
  • the output lands in canonical repos/state
  • the output does not bypass governance
  • the output does not mutate production behavior silently
  • rollback remains realistic

XIOPro may help build itself. It may not self-redefine without control.

6.9 Parallel 24x7 Operation Rule

If continuous agent availability exists, XIOPro and product work may run as parallel workstreams.

This is desirable because it reduces dependence on founder real-time attention.

However, 24x7 availability does not remove the need for:

  • approval gates
  • prioritization
  • escalation handling
  • evidence-backed module use
  • recovery discipline
  • daily supervision rhythm

Always-on execution is a multiplier, not a replacement for governance.

6.10 Schedule Expectation Rule

Aggressive timelines may be used as operating targets, but blueprint gates should remain evidence-based.

Examples of good gates:

  • one recovery path proven
  • one end-to-end task flow proven
  • one approval path proven
  • one research workflow preserved
  • one UI interaction becoming durable state

The system should optimize for accelerated progress, but not by removing proof.

6.11 Parallel Operation Rule

During migration, old services and new services run in parallel.

This means:

  • Bus continues running alongside new API service
  • Paperclip continues running alongside new ODM-based ticket management
  • Phase 1 dashboard continues running alongside new Control Center
  • Paperclip DB runs alongside ODM PostgreSQL until consolidation

No big-bang cutover.

Old services are retired only after new services reach feature parity and are proven stable in production for a reasonable period.

Retirement sequence follows the service fate map in resources/SERVICE_FATE_MAP_v4_2.md.
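One way to keep the parallel-operation rule explicit is a per-capability routing table, so cutover happens one capability at a time and rollback is a single flip back. This is a sketch only; the routing table and function names are assumptions, and the real mechanism lives in the service fate map and deployment config.

```python
# Sketch of the no-big-bang rule: each capability routes to its old or new
# service via an explicit table. Service names follow this section; the
# table itself is a hypothetical illustration.
ROUTES = {
    "tickets": "paperclip",         # legacy until ODM-based management reaches parity
    "messaging": "bus",             # legacy Bus runs alongside the new API service
    "dashboard": "phase1-dashboard" # until the new Control Center is proven
}

def backend_for(capability: str) -> str:
    """Resolve which service currently handles a capability."""
    return ROUTES.get(capability, "unknown")

def cut_over(capability: str, new_service: str) -> None:
    """Flip one capability to its replacement; rollback is flipping it back."""
    ROUTES[capability] = new_service
```

Retiring an old service then becomes a check that no capability routes to it anymore, rather than a calendar event.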


7. Delivery Phases

The phases below are ordered by dependency, but some work streams run in parallel.

Phase 0 — Grounding (Completed 2026-03-28)

Goals:

  • lock technology decisions
  • write executable DDL from ODM schema
  • map service fates (current -> target)
  • install must-have CLI tools
  • harden security (firewall, SSH, git history)
  • verify emergency access paths
  • establish 4-digit ticket numbering
  • define Control Bus architecture
  • define unified agent identity model
  • retire deprecated services
  • create initial ticket backlog
  • produce BP v5.0 (all 12 parts)

Key outputs:

  • resources/SCHEMA_walking_skeleton_v4_2.sql — 21 tables, 27 enums, runs on PostgreSQL without error
  • resources/SERVICE_FATE_MAP_v4_2.md — 14 containers mapped
  • resources/CLI_TOOLS_ASSESSMENT.md — 7 must-have CLI tools assessed
  • resources/BP_v4_2_CORRECTIONS.md — 30+ correction items from v4.1
  • CLI tools installed: jq 1.8.1, yq 4.52.5, fd 10.4.2, fzf 0.70.0, uv 0.11.2, gh 2.89.0
  • Security: UFW firewall enabled, Tailscale SSH, git history cleaned (plaintext secrets purged)
  • Stale services retired: devxio-frontend, devxio-bridge (123 MB freed), Neo4j (1.83 GB freed)
  • First batch of 4-digit tickets created (TKT-1001 to TKT-1008) in Paperclip
  • Control Bus architecture defined (Parts 2, 4, 8)
  • Unified Agent Identity Model (3-digit IDs, role bundles) across all 12 parts
  • struxio-knowledge repo created and seeded (13 technology evaluations)
  • Brand brief v1 produced
  • resources/DESIGN_rc_architecture.md — Remote Control architecture (Open WebUI, multi-provider routing, Prompt Composer integration)
  • resources/DESIGN_cli_services.md — CLI services framework (config-driven operational commands via Bus API)

Gate to exit:

  • DDL runs on PostgreSQL successfully (done)
  • Service fate map approved by founder (done)
  • CLI tools installed and verified (done)
  • Security hardened (done)
  • BP v5.0 complete (done)

Open items carried to Phase 1:

  • pg_dump not yet in backup scope (needs root SSH)
  • backup.sh has plaintext B2 credentials (needs SOPS encryption)
  • direnv not installed (needs sudo/apt)

Estimated duration: 1 day (completed in 1 day)

Phase 0A — Baseline Consolidation

Goals:

  • freeze authoritative blueprint set
  • freeze canonical repo roles
  • inventory current assets
  • mark transitional/legacy repos
  • define the first ticket backlog from blueprint decisions

Key outputs:

  • canonical repository registry
  • canonical blueprint set
  • migration inventory
  • first workstream board

Phase 1 — Foundation & Walking Skeleton

Goals:

  • establish Node A baseline
  • validate Node B role
  • finalize bootstrap/startup/update scripts
  • establish repo/storage conventions
  • prove deployment/restart discipline
  • prove the walking skeleton end-to-end path

Key outputs:

  • repeatable bootstrap
  • controlled update flow
  • storage/repo conventions active
  • base observability and backup running
  • walking skeleton entities live in PostgreSQL

Gate to exit:

  • fresh bootstrap works
  • warm restart works
  • rollback path documented and smoke-tested
  • one end-to-end task can run with durable state (walking skeleton)

Estimated duration: 1 week

Phase 2 — Work Graph / ODM + Governance & Research

Goals:

  • implement authoritative ODM schema
  • implement tickets/tasks/activities
  • implement runtimes/sessions/escalations/human decisions
  • implement cost/time propagation
  • replace ad hoc work tracking with governed state
  • activate the governor
  • implement alerting and breakers
  • implement approval/discussion gates
  • activate rule steward / prompt steward / module steward governance paths
  • connect cost signals and optimization evidence

Key outputs:

  • DB schema
  • migrations
  • API/domain contracts
  • first real ticket/task execution chain
  • alerts and breaker baseline
  • approval workflow
  • module portfolio governance
  • ContextPrompting governance
  • governed asset publication path

Gate to exit:

  • one end-to-end task can run with durable state
  • one escalation can be opened and resolved
  • one activity cost record is attributable upward
  • one breaker triggers correctly
  • one approval path blocks and resumes execution
  • one module decision is evidence-backed and queryable
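The "attributable upward" gate can be made concrete with a small roll-up: flat activity cost records aggregate to task and ticket totals. This is a minimal sketch; the record shape is an assumption, not the ODM schema.

```python
from collections import defaultdict

# Hypothetical flat activity cost records; in the real system these would be
# rows in the ODM. Field names are illustrative.
activities = [
    {"ticket": "1001", "task": "1001.01", "cost_usd": 0.42},
    {"ticket": "1001", "task": "1001.01", "cost_usd": 0.18},
    {"ticket": "1001", "task": "1001.02", "cost_usd": 0.90},
]

def roll_up(records):
    """Aggregate activity costs upward to task and ticket totals."""
    task_cost, ticket_cost = defaultdict(float), defaultdict(float)
    for r in records:
        task_cost[r["task"]] += r["cost_usd"]
        ticket_cost[r["ticket"]] += r["cost_usd"]
    return dict(task_cost), dict(ticket_cost)
```

If a cost record cannot be rolled up this way because its task or ticket link is missing, the propagation gate has not been met.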

Estimated duration: 1 week

Phase 3 — Execution Fabric + UI Minimum

Goals:

  • wire the orchestrator to the work graph
  • wire Ruflo/runtime execution into tickets/tasks
  • establish session manager / resume semantics
  • establish recovery paths
  • prove headless execution without UI
  • implement widget-first web UI
  • implement Prompt Composer
  • implement Brain Collaboration preset
  • implement Control Center preset

Key outputs:

  • orchestrator active
  • task-to-runtime execution flow
  • session continuity and restart support
  • recovery path choices
  • widget grid shell
  • core workspaces active

Gate to exit:

  • one ticket executes end-to-end headlessly
  • one restart/recovery scenario is successful
  • session lineage remains intact
  • founder can talk to a brain from the UI
  • an execution-changing interaction becomes durable state

Estimated duration: 1-2 weeks

Phase 4 — Migration, Cleanup & Knowledge

Goals:

  • retire stale services (devxio-frontend, devxio-bridge)
  • ~~evaluate Neo4j instances~~ (done -- both retired and removed, see Part 5 Section 12.1)
  • migrate Paperclip data to ODM
  • activate Librarian discipline
  • implement Research Center task model
  • connect Hindsight and Dream proposal loops
  • tune performance and cost behavior
  • reduce manual operator burden

Key outputs:

  • stale services retired
  • Paperclip data migrated or adapted
  • Librarian ingestion path active
  • research task scheduling active
  • known failure and recovery patterns documented
  • operational hardening complete

Gate to exit:

  • one research task runs and is preserved
  • one research output is ingested and retrievable
  • stale services stopped and containers removed
  • system can run continuously with controlled human intervention

Estimated duration: 1 week

Phase 5 — Knowledge & Research (Extended)

Goals:

  • implement governed ingestion
  • implement Research Center task model fully
  • connect NotebookLM and Obsidian through bounded roles
  • implement Hindsight and Dream proposal loops

Key outputs:

  • Librarian ingestion path
  • research task scheduling
  • research output types
  • research-to-knowledge promotion path

Gate to exit:

  • one research task runs and is preserved
  • one research output is ingested and retrievable
  • one Hindsight or Dream proposal routes correctly for approval

Phase 6 — UI / Control Center (Extended)

Goals:

  • implement approvals / alerts / research / module workspaces
  • establish mobile reduced-surface mode
  • implement hot/warm/cold widget model

Key outputs:

  • hot/warm/cold widget model
  • prompt widget with Prompting Mode, attachments, voice, search/research, style, module controls
  • approval and recovery visible and actionable in UI
  • mobile mode supports core founder control

Gate to exit:

  • approval and recovery are visible and actionable in UI
  • mobile mode supports core founder control

Phase 7 — Product Integration & Hardening

Goals:

  • connect XIOPro to product-relevant flows (see MVP1_PRODUCT_SPEC.md for the first product)
  • prove product/runtime separation
  • harden failure cases
  • reduce manual operator burden
  • tune performance and cost behavior

Key outputs:

  • product support path
  • operational hardening
  • reduced manual intervention
  • known failure and recovery patterns documented

Gate to exit:

  • XIOPro supports at least one meaningful product workflow
  • system can run continuously with controlled human intervention
  • critical quality/stability/trust criteria are met

7A. Timeline Estimate

T1P vs Phase 2 Scope Boundary

The external reviews (ChatGPT, Claude, Gemini) unanimously identified the original 10-14 day claim as not credible for the full scope. This section therefore distinguishes T1P (Walking Skeleton) from Phase 2 (Full System): full production delivery takes 4-6 weeks, while sprint compression can deliver individual features in hours.

t1p_scope:  # 2-3 days (sprint-compressed; 4-6 weeks for full production)
  - ODM schema deployed (DONE)
  - Control Bus (SSE, registration, intervention, tasks)
  - Walking skeleton (one full path: discussion -> ticket -> task -> execution -> cost)
  - One escalation/approval path
  - Minimum UI (shell + 3-4 core widgets)
  - Agent spawning with host capacity check
  - Basic cost tracking

phase_2_scope:  # 4-6 weeks after T1P
  - Full governance (all breakers, all policies, all review gates)
  - Full knowledge system (Librarian service, Research Center automation, RAG pipeline)
  - Full UI (all 6 widgets, mobile, layout presets)
  - Hindsight integration
  - Dream Engine / Idle Maintenance automation
  - Skill Performance DB active
  - Behavioral testing
  - Product integration (see `MVP1_PRODUCT_SPEC.md`)

Accelerated Timeline (Revised 2026-03-28)

accelerated_timeline:
  lesson: "34 tickets completed in Day 1 (~2.5 hours agent time)"
  implication: "Human-pace estimates are 50-100x too slow for agent execution"

  revised_plan:
    day_1: "DONE  Control Bus, ODM, Governance, UI, Knowledge, Infra, Testing (34 tickets)"
    day_2: "Docker deploy, UI polish, product integration start, Paperclip cleanup"
    day_3: "Product core (see MVP1_PRODUCT_SPEC.md)"
    day_4: "Integration testing, SSE real-time, UI connected to live data"
    day_5: "Phase 2 planning, full governance activation, research center automation"

  t1p_completion: "Day 2-3 (not Day 10-14)"
  phase_2_start: "Day 4-5 (not week 4-6)"

T1P Target (Founder Directive 2026-03-28, revised 2026-03-29)

T1P (Walking Skeleton + Control Bus + One Approval Path + Minimum UI) must be operational within 2-3 days, revised from the original estimates based on Day 1 execution velocity. Full production still requires 4-6 weeks, with sprint compression delivering individual features in hours. Product work begins Day 2 and runs in parallel (see MVP1_PRODUCT_SPEC.md). Additional budget (tokens + compute) is available if needed.

The broader scope (full governance, full knowledge system, full research center, all 25 modules operational) is Phase 2 -- estimated Day 4-5 start (revised from 4-6 weeks after T1P).

Phase | Duration | Cumulative | Parallel Work | Ticket Coverage
--- | --- | --- | --- | ---
Phase 0 (Grounding) | Day 1 | Day 1 | DONE — DDL, CLI, security, BP v5.0, tickets, Control Bus, ODM, Governance, UI, Knowledge, Infra, Testing (34 tickets) | Pre-ticketing + TKT-1001 to TKT-1063
Phase 1 (Docker Deploy + UI Polish) | Day 2 | Day 2 | Docker containers live, Control Center deployed, Caddy routing, Paperclip cleanup | TKT-1070+
Phase 2 (Product Core) | Day 2-3 | Day 3 | Product integration (see MVP1_PRODUCT_SPEC.md) | Product tickets
Phase 3 (Integration + Real-time) | Day 3-4 | Day 4 | SSE real-time, UI connected to live Bus/ODM data, end-to-end validation | Integration tickets
Phase 4 (Phase 2 + Full Governance) | Day 4-5 | Day 5 | Phase 2 planning, full governance activation, research center automation | Phase 2 tickets
Total T1P | 2-3 days | | |
Phase 2 begins | Day 4-5 | | Full governance, full UI, knowledge, migration | Remaining tickets

Acceleration Levers (Proven)

  • Multi-agent parallel execution: 34 tickets in ~2.5 hours proves the model
  • 24x7 continuous operation: BrainMaster runs without human-gated pauses
  • Compressed phases: All phases overlap -- Day 1 covered Phases 0-3 of the original plan
  • Mac Studio M1: Overflow capacity (~25 additional agent slots)
  • Budget flexibility: Founder willing to add tokens and compute power

Resource Allocation

Agent | Primary Phase | Secondary
--- | --- | ---
BrainMaster (orchestrator) | All phases — orchestration | Control Bus coding
Engineering Brain | Phase 1-2 — Bus upgrade, backend | Phase 3 — API endpoints
Brand Brain | Phase 3 — UI / Control Center | Phase 2 — brand integration
DevOps Brain | Phase 1 — infrastructure, testing | Phase 4 — migration
Mac Worker | All phases — Mac tasks, Obsidian | Product support

Timeline Rules

  • Human-pace estimates are 50-100x too slow for agent execution -- plan accordingly.
  • Gates are evidence-based — but compress validation, don't skip it.
  • Parallel work across phases is the default, not the exception.
  • 24x7 agent availability accelerates delivery and is expected.
  • If blocked, escalate immediately — don't wait for next day.
  • Additional compute or API budget available on request.

7B. Walking Skeleton Definition

7B.1 Purpose

XIOPro needs a provable minimum end-to-end execution path before broader expansion.

This is the walking skeleton.

It is the smallest real slice that proves the machine is a machine, not only a document set.

7B.2 Walking Skeleton Goal

The walking skeleton should prove this flow:

founder request -> durable discussion / intake -> ticket -> task -> runtime / session -> activity / result -> optional escalation -> human decision -> completion -> traceable cost and audit signals

This is the minimum proof path for governed execution.

7B.3 Required Objects in the Skeleton

The walking skeleton must use real instances of at least:

  • Discussion Thread
  • Ticket
  • Task
  • Agent Runtime
  • Session
  • Activity
  • Escalation Request
  • Human Decision
  • cost/usage record
  • audit/governance event

If these are mocked only in prose, the skeleton is not real enough.
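A minimal, runnable stand-in for the core chain can make "real, not prose-mocked" concrete: each object carries an explicit link to its parent, so any activity is traceable back to its originating discussion. Field names here are assumptions for illustration, not the authoritative schema in resources/SCHEMA_walking_skeleton_v4_2.sql.

```python
from dataclasses import dataclass

# Illustrative stand-ins for the skeleton's core chain; field names are
# assumptions, not the authoritative ODM schema.
@dataclass
class DiscussionThread:
    id: int

@dataclass
class Ticket:
    id: int
    thread_id: int

@dataclass
class Task:
    id: int
    ticket_id: int

@dataclass
class Session:
    id: int
    task_id: int

@dataclass
class Activity:
    id: int
    session_id: int
    cost_usd: float

def lineage(activity, sessions, tasks, tickets, threads):
    """Walk an activity back to its originating discussion thread."""
    s = sessions[activity.session_id]
    t = tasks[s.task_id]
    tk = tickets[t.ticket_id]
    th = threads[tk.thread_id]
    return [th, tk, t, s, activity]
```

The design point is that lineage is resolved through stored foreign keys, not reconstructed from logs or prose.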

DDL Reference

The executable schema for the walking skeleton is defined in:

resources/SCHEMA_walking_skeleton_v4_2.sql

This DDL must run on PostgreSQL without error as the Phase 0 gate. It defines all tables, enums, indexes, and constraints for the walking skeleton entities.

7B.4 Required Services in the Skeleton

The walking skeleton should involve at least:

  • one backend API/service layer
  • PostgreSQL-backed state
  • orchestrator execution/orchestration path
  • one constrained runtime/execution surface
  • one approval or escalation path
  • one UI or operator surface path
  • one telemetry/cost signal path

7B.5 Skeleton Scenarios

T1P should prove at least these scenarios:

Scenario A — Straight Through Execution

  1. founder opens a Discussion Thread
  2. thread promotes into Ticket / Task
  3. orchestrator assigns execution
  4. Agent Runtime and Session are created
  5. Activity executes and returns a result
  6. task completes
  7. cost and audit signals are queryable

Scenario B — Escalation / Human Gate

  1. execution reaches ambiguity or approval threshold
  2. Escalation Request opens
  3. founder responds
  4. Human Decision is recorded
  5. execution resumes and completes
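The Scenario B gate can be sketched as a tiny state machine: execution blocks on an open escalation and can only resume through a recorded human decision. State and event names are illustrative, not the ODM enums.

```python
# Minimal sketch of the human gate: no transition out of "blocked" exists
# except via an explicit human decision (approve/reject). Names are illustrative.
ALLOWED = {
    ("running", "escalate"): "blocked",   # Escalation Request opens
    ("blocked", "approve"): "running",    # Human Decision: proceed
    ("blocked", "reject"): "cancelled",   # Human Decision: stop
    ("running", "complete"): "done",
}

def step(state: str, event: str) -> str:
    """Advance execution state; forbid transitions that bypass the human gate."""
    if (state, event) not in ALLOWED:
        raise ValueError(f"illegal transition: {event!r} from {state!r}")
    return ALLOWED[(state, event)]
```

Because "blocked" has no completion edge, the gate cannot be bypassed silently; that is the property Scenario B is meant to prove.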

Scenario C — Recovery / Resume

  1. a session is interrupted
  2. recovery path is chosen
  3. session or replacement session resumes
  4. lineage remains intact

Scenario D — Research Flow

  1. founder creates or promotes work into a Research Task
  2. research runs
  3. Research Output is preserved
  4. output is promoted into Librarian-managed knowledge or explicitly left draft

7B.6 Skeleton Boundaries

The walking skeleton should avoid false scale.

It does not need:

  • full multi-brain sophistication
  • all widget presets
  • advanced Dream behavior
  • large module portfolio automation
  • complete self-build autonomy

It needs only enough reality to prove the canonical path.

7B.7 Acceptance Criteria

The walking skeleton is complete when all are true:

  • the end-to-end flow is real, not mocked
  • state persists across services
  • one escalation/resume path works
  • one cost signal is attributable
  • one audit trail is inspectable
  • one research task path is real
  • one recovery path is proven
  • the founder can supervise the path from the UI or control surface

7B.8 Implementation Rule

All broader implementation ambition should be built on top of the walking skeleton.

If the skeleton is not working, additional architectural sophistication increases risk instead of reducing it.


8. Ticket Numbering

8.1 New Ticket Numbering Scheme

All new tickets use a 4-digit numbering scheme starting at 1001.

NNNN          — Sequential ticket number (1001, 1002, ...)
NNNN.NN       — Sub-task (1001.01, 1001.02)
EPIC-NN       — Epic grouping (EPIC-01, EPIC-02)

8.2 Historical Ticket Preservation

  • Existing 3-digit tickets (064-079) are preserved as-is. Do not renumber.
  • Paperclip STR-N identifiers retire when Paperclip retires.
  • ODM uses the 4-digit number as the canonical identifier.

8.3 Migration

The transition from 3-digit to 4-digit is a clean cut:

  • All tickets created before v5.0 keep their original numbers
  • All tickets created under v5.0 start at 1001
  • No ambiguity: 3-digit = historical, 4-digit = current
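The clean cut between schemes is mechanically checkable. A sketch of a classifier, assuming the formats given above (the helper name and the exact rules are illustrative):

```python
import re

# Illustrative patterns for the v5.0 numbering scheme; the real ODM
# identifier rules may differ in detail.
TICKET_RE = re.compile(r"^\d{4}$")          # NNNN, e.g. 1001
SUBTASK_RE = re.compile(r"^\d{4}\.\d{2}$")  # NNNN.NN, e.g. 1001.01
EPIC_RE = re.compile(r"^EPIC-\d{2}$")       # EPIC-NN, e.g. EPIC-01

def classify(identifier: str) -> str:
    """Classify an identifier: current scheme, historical 3-digit, or unknown."""
    if TICKET_RE.match(identifier):
        # v5.0 tickets start at 1001; smaller 4-digit values are not valid
        return "ticket" if int(identifier) >= 1001 else "unknown"
    if SUBTASK_RE.match(identifier):
        return "subtask"
    if EPIC_RE.match(identifier):
        return "epic"
    if re.match(r"^\d{3}$", identifier):    # preserved 064-079 style tickets
        return "historical"
    return "unknown"
```

Legacy Paperclip STR-N identifiers deliberately classify as unknown here, matching their planned retirement.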

The first ticket waves should roughly follow this order:

  1. bootstrap / deploy / restore baseline
  2. ODM schema + migration runner (DDL from resources/SCHEMA_walking_skeleton_v4_2.sql)
  3. ticket/task/activity runtime chain
  4. session / escalation / human decision support
  5. orchestrator execution orchestration
  6. session recovery paths
  7. governor alerts / breakers / approvals
  8. Librarian ingestion and retrieval baseline
  9. Research Center task model
  10. widget shell + Prompt Composer
  11. Brain Collaboration and Control Center presets
  12. module portfolio evidence and optimization views
  13. mobile reduced-surface mode
  14. Product integration tickets (see MVP1_PRODUCT_SPEC.md)
  15. outside review and correction wave

9A. T1P Sequencing Constraints & First Skeleton Wave

9A.1 Purpose

The work plan now includes:

  • explicit T1P posture rules
  • a defined walking skeleton
  • a self-build strategy
  • parallel XIOPro and product operation

This section constrains their interaction so early implementation does not become uncontrolled parallelism.


9A.2 Sequencing Principle

Not all work should begin at once just because the architecture names many subsystems.

T1P sequencing should follow this rule:

stabilize -> prove the skeleton -> expand carefully

This protects the project from:

  • blueprint drift
  • premature parallelization
  • self-build overreach
  • UI-first illusion
  • governance that exists only in prose


9A.3 Hard Ordering Constraints

The following must be treated as hard prerequisites.

Constraint A — Bootstrap Before Breadth

Before broad feature work begins, XIOPro must have:

  • repeatable bootstrap
  • controlled restart
  • basic observability
  • basic backup/restore confidence
  • emergency access / lockout survival path

Constraint B — ODM Before Rich Orchestration

Before richer orchestration or advanced UI behavior, the system must have authoritative:

  • Discussion Thread
  • Ticket
  • Task
  • Agent Runtime
  • Session
  • Escalation Request
  • Human Decision
  • Research Task
  • cost and audit records

Constraint C — Walking Skeleton Before Broad Self-Build

Before XIOPro meaningfully "builds itself," the walking skeleton must already be real.

Self-build may assist setup and migration preparation earlier, but it should not become a broad primary delivery method until the skeleton proves:

  • durable state
  • controlled execution
  • one human gate
  • one recovery path
  • one research path

Constraint D — XIOPro Boundary Before Product Acceleration

Parallel product work is allowed (see MVP1_PRODUCT_SPEC.md), but broad parallel acceleration should wait until XIOPro proves:

  • work graph integrity
  • execution continuity
  • governance basics
  • module/cost visibility
  • a stable founder control surface

9A.4 First Skeleton Wave

The first implementation wave should be deliberately narrow.

Wave 1 Goal

Prove the canonical end-to-end XIOPro path using a minimal but real slice.

Wave 1 Scope

Build only enough to prove:

  1. Discussion Thread creation
  2. Ticket / Task promotion
  3. orchestrator assignment
  4. Agent Runtime + Session creation
  5. Activity record creation
  6. Result persistence
  7. optional Escalation Request
  8. Human Decision recording
  9. completion state
  10. queryable cost/audit signal

Wave 1 Explicit Non-Goals

Do not require in Wave 1:

  • full widget catalog
  • broad multi-brain sophistication
  • complete module portfolio intelligence
  • rich Dream behavior
  • rich Hindsight automation
  • broad self-build autonomy
  • many runtime surfaces

Wave 1 is about proving the machine, not proving every future sophistication.


9A.5 First Research Wave

Research remains in T1P, but the first research wave should also be narrow.

It should prove only:

  1. Research Task creation
  2. source bundle association
  3. research execution or synthesis run
  4. Research Output preservation
  5. promotion into Librarian-managed knowledge or explicit draft retention

Research Rule

Research is preserved in T1P, but it does not need a giant multi-surface ecosystem before the core research path is real.


9A.6 First Governance Wave

The first governance wave should prove only:

  1. one alert path
  2. one approval-required path
  3. one breaker path
  4. one override path
  5. one cost anomaly or module constraint path

This is enough to prove governance is operational, without pretending all advanced governance exists on day one.
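A single cost breaker is enough to demonstrate items 3-5 together: spend past a budget trips the breaker, and only an explicit human override resumes execution. This is a hedged sketch; class name, fields, and thresholds are illustrative.

```python
# Illustrative breaker: trip when cumulative spend crosses a budget, and
# require an explicit human override to resume. All names are assumptions.
class CostBreaker:
    def __init__(self, budget_usd: float):
        self.budget_usd = budget_usd
        self.spent = 0.0
        self.tripped = False

    def record(self, cost_usd: float) -> bool:
        """Record spend; return True while execution may continue."""
        self.spent += cost_usd
        if self.spent > self.budget_usd:
            self.tripped = True
        return not self.tripped

    def override(self, approved_by: str, new_budget_usd: float) -> None:
        """Human override path: reset the breaker under a raised budget."""
        self.tripped = False
        self.budget_usd = new_budget_usd
```

The governance proof is the shape, not the numbers: a trip path, an override path, and a cost anomaly path all exercised by one small mechanism.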


9A.7 First UI Wave

The first UI wave should prove only:

  • Control Center preset
  • Brain Collaboration preset
  • Prompt Composer
  • approval visibility
  • trace visibility
  • mobile reduced-surface basics

UI Rule

The first UI wave should expose real backend state.

It must not become a polished simulation layer over incomplete backend contracts.


9A.8 First Self-Build Wave

XIOPro may help build XIOPro early, but the first self-build wave should be constrained to:

  • migration inventory work
  • repo normalization assistance
  • documentation cleanup
  • blueprint-to-ticket preparation
  • research preparation
  • non-destructive code/doc generation
  • reviewed implementation tickets under founder supervision

Rule

Before the walking skeleton is proven, self-build should be assistive and review-heavy, not broadly autonomous.


9A.9 First Parallel Product Wave

Parallel product work (see MVP1_PRODUCT_SPEC.md) may continue in T1P, but only in bounded areas that do not depend on unfinished XIOPro core guarantees.

Good early parallel candidates:

  • product design
  • requirements shaping
  • UX exploration
  • non-critical application slices
  • research and business-facing prep

Riskier candidates to delay until XIOPro core proof:

  • heavy dependence on unstable orchestration
  • deep coupling to incomplete governance
  • runtime-critical workflows requiring strong recovery guarantees

9A.10 First-Sprint Acceptance Gate

The first serious implementation wave should not be declared successful unless all of the following are true:

  • bootstrap and restart are repeatable
  • the walking skeleton path is real
  • at least one escalation/human decision path works
  • at least one research task path works
  • at least one approval/governance path works
  • founder can inspect the flow from the UI/control surface
  • cost/audit evidence is queryable

If these are not true, the system is still in architectural assembly, not operational proof.


10. Migration Strategy

10.1 Migrate, Do Not Flatten

Existing assets should be:

  • classified
  • normalized
  • migrated
  • archived where necessary

Do not collapse everything into one repo or one giant backlog.

10.2 Legacy / Transitional Asset Handling

For assets such as:

  • older tickets
  • legacy repo structures
  • old bus concepts
  • historical draft blueprints

choose one of:

  • adopt
  • normalize
  • supersede
  • archive

10.3 Rule for Legacy Systems

A legacy path may continue temporarily if:

  • it still provides operational value
  • it has not yet been replaced
  • its boundary is explicit
  • its retirement path is defined

11. Sprint Strategy

11.1 Sprint Length

Recommended initial length:

  • 1 week

This is short enough for fast correction, but long enough to prove real technical increments.

11.2 Sprint Shape

Every sprint should include:

  • planned build work
  • validation work
  • documentation update
  • cleanup / debt reduction
  • explicit review
  • next-sprint refinement

11.3 Sprint Rhythm

flowchart TD
    Plan --> Build
    Build --> Validate
    Validate --> Review
    Review --> Correct
    Correct --> NextSprint

11.4 Sprint Rule

No sprint is complete unless:

  • implementation changed
  • validation ran
  • state/docs were updated
  • open defects and follow-ups were recorded

12. Daily Operating Rhythm

Morning

  • review attention queue
  • review alerts and blocked work
  • review approvals needed
  • adjust priorities and ticket order

Day

  • active build and supervision
  • founder/brain collaboration
  • governed interventions
  • module/research decisions as required

Evening / Night

  • lower-touch execution
  • scheduled research
  • Hindsight/Dream cycles where allowed
  • maintenance/refresh jobs
  • optimization passes within policy

13. Test Strategy

Testing is not a final polish phase.

It is part of the delivery model.

13.1 Test Layers

XIOPro should use multiple test layers:

  • unit tests
  • integration tests
  • workflow tests
  • recovery tests
  • governance tests
  • UI tests
  • performance tests
  • acceptance tests

13.1A Default Test Toolchain

See Part 2, Section 5.11 for the canonical test toolchain decision (pytest for backend, Playwright for E2E, optional Vitest).

Practical mapping for the work plan:

  • ODM, policy, scheduler, cost, and governance tests -> pytest
  • API/service integration tests -> pytest
  • UI workflow tests -> Playwright
  • mobile reduced-surface flow tests -> Playwright
  • widget rendering contract tests -> Playwright first, optional Vitest later if needed

13.2 Unit Tests

Target:

  • ODM logic
  • state transitions
  • policy evaluators
  • module recommendation logic
  • ContextPrompting decisions
  • parsing / validation helpers

Purpose:

  • prove deterministic local correctness
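As a minimal sketch of what a deterministic unit test at this layer could look like: the example below exercises state transitions against an explicit transition table. The state names, the `ALLOWED` table, and `TransitionError` are illustrative assumptions, not the actual XIOPro ODM.

```python
# Illustrative unit test for work-item state transitions, written in
# plain pytest style (functions named test_*). The state machine here
# is an assumption for the sketch, not the real XIOPro ODM.
ALLOWED = {
    "open": {"in_progress", "archived"},
    "in_progress": {"blocked", "done"},
    "blocked": {"in_progress", "escalated"},
    "escalated": {"in_progress", "done"},
    "done": set(),
    "archived": set(),
}

class TransitionError(Exception):
    pass

def transition(state: str, target: str) -> str:
    """Return the new state, or raise if the move is not allowed."""
    if target not in ALLOWED.get(state, set()):
        raise TransitionError(f"{state} -> {target} is not a legal transition")
    return target

def test_happy_path():
    assert transition("open", "in_progress") == "in_progress"
    assert transition("in_progress", "done") == "done"

def test_illegal_transition_raises():
    try:
        transition("done", "open")
    except TransitionError:
        pass
    else:
        raise AssertionError("expected TransitionError for done -> open")
```

Because the table is data, adding a new state or edge is a one-line change that every test immediately re-checks.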

13.3 Integration Tests

Target:

  • API <-> DB
  • orchestrator <-> work graph
  • governor <-> policy/breaker system
  • Librarian ingestion path
  • Research Center scheduling
  • UI <-> backend contracts
  • module routing / LiteLLM / runtime adapter paths

Purpose:

  • prove service cooperation
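A hedged sketch of the API <-> DB pairing, using an in-memory SQLite store so the test exercises real SQL rather than mocks. The `tickets` table shape and function names are assumptions for illustration, not the actual XIOPro schema or API.

```python
# Illustrative integration-style test: a minimal service function
# against a real (in-memory) database. Table and function names are
# assumptions, not the actual XIOPro API or schema.
import sqlite3

def create_store() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE tickets (id INTEGER PRIMARY KEY, title TEXT, state TEXT)"
    )
    return conn

def create_ticket(conn: sqlite3.Connection, title: str) -> int:
    cur = conn.execute(
        "INSERT INTO tickets (title, state) VALUES (?, 'open')", (title,)
    )
    conn.commit()
    return cur.lastrowid

def get_ticket(conn: sqlite3.Connection, ticket_id: int) -> dict:
    row = conn.execute(
        "SELECT id, title, state FROM tickets WHERE id = ?", (ticket_id,)
    ).fetchone()
    return {"id": row[0], "title": row[1], "state": row[2]}

def test_ticket_round_trip():
    conn = create_store()
    tid = create_ticket(conn, "normalize struxio-aibus")
    assert get_ticket(conn, tid)["state"] == "open"
```

The same pattern scales to the real Postgres-backed store by swapping the connection fixture; the assertions stay identical.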

13.4 Workflow / End-to-End Tests

Target scenarios:

  • founder request -> ticket -> task -> runtime -> result
  • escalation request -> human decision -> resume
  • research task -> output -> ingestion -> retrieval
  • module recommendation -> constrained execution -> measured outcome
  • UI interaction -> durable state mutation

Purpose:

  • prove the machine works as a machine
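The first target scenario can be sketched end to end as a single pipeline whose history is asserted, not just its final state. Everything here (`Ticket`, `run_task`, `pipeline`) is an illustrative assumption; a real workflow test would dispatch to an actual runtime adapter.

```python
# Minimal workflow sketch of: founder request -> ticket -> task ->
# runtime -> durable result. All names are illustrative assumptions.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class Ticket:
    request: str
    state: str = "open"
    result: str | None = None
    history: list = field(default_factory=list)

def run_task(request: str) -> str:
    # Stand-in for the runtime adapter; a real test would await
    # the actual runtime's result here.
    return f"done: {request}"

def pipeline(request: str) -> Ticket:
    t = Ticket(request)
    t.history.append("open")
    t.state = "in_progress"
    t.history.append("in_progress")
    t.result = run_task(t.request)
    t.state = "done"
    t.history.append("done")
    return t

def test_request_to_durable_result():
    t = pipeline("summarize repo topology")
    assert t.state == "done"
    assert t.result is not None
    assert t.history == ["open", "in_progress", "done"]
```

Asserting the full history is what makes this a machine test rather than a snapshot test: it proves the path was traversed, not merely that the end state looks right.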

13.5 Recovery Tests

Target scenarios:

  • session crash
  • container restart
  • host reboot
  • provider failure
  • recovery reroute
  • admission closed then reopened
  • backup restore drill

Purpose:

  • prove recoverability, not only happy path
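One way to make the session-crash scenario concrete: run steps against a durable checkpoint, crash mid-run, then resume in a fresh "session" and assert only the remaining work executes. The checkpoint shape and function names are assumptions for the sketch.

```python
# Recovery-test sketch: a session "crashes" mid-run and a new session
# resumes from the last durable checkpoint instead of restarting.
import json
import os
import tempfile

def run_steps(steps, checkpoint_path, crash_after=None):
    """Run steps, persisting progress after each one; optionally crash."""
    done = []
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)
    for step in steps:
        if step in done:
            continue  # already completed before the crash
        done.append(step)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)
        if crash_after == step:
            raise RuntimeError("simulated session crash")
    return done

def test_resume_after_crash():
    path = os.path.join(tempfile.mkdtemp(), "ckpt.json")
    steps = ["plan", "build", "validate"]
    try:
        run_steps(steps, path, crash_after="build")
    except RuntimeError:
        pass
    # A fresh session resumes; only "validate" actually runs.
    assert run_steps(steps, path) == steps
```

The same skeleton extends to container restart and host reboot drills: only the "crash" injection changes, the resume assertion stays the same.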

13.6 Governance Tests

Target scenarios:

  • breaker trigger
  • approval-required action blocked
  • ContextPrompting asks blocking question
  • bad module path constrained
  • override recorded correctly
  • audit events emitted correctly

Purpose:

  • prove the system remains governed under stress
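The breaker-trigger scenario can be sketched as a small cost breaker that trips once spend exceeds budget, blocks further actions, and records every decision for audit. The `CostBreaker` class is an illustrative assumption, not the actual governor implementation.

```python
# Governance-test sketch: a cost breaker trips on budget overrun and
# blocks subsequent actions; every decision lands in an audit list.
class BreakerTripped(Exception):
    pass

class CostBreaker:
    def __init__(self, budget: float):
        self.budget = budget
        self.spent = 0.0
        self.tripped = False
        self.audit = []  # every decision is recorded

    def charge(self, amount: float, action: str):
        if self.tripped:
            self.audit.append(("blocked", action))
            raise BreakerTripped(action)
        self.spent += amount
        if self.spent > self.budget:
            self.tripped = True
            self.audit.append(("tripped", action))
            raise BreakerTripped(action)
        self.audit.append(("allowed", action))

def test_breaker_blocks_after_budget():
    b = CostBreaker(budget=1.0)
    b.charge(0.6, "research")
    try:
        b.charge(0.7, "research")  # exceeds budget -> trips
    except BreakerTripped:
        pass
    try:
        b.charge(0.1, "cleanup")  # blocked while tripped
    except BreakerTripped:
        pass
    assert b.tripped
    assert ("blocked", "cleanup") in b.audit
```

Asserting on the audit list, not just the exception, is the point: the governance test proves the evidence trail exists, not only that the block happened.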

13.7 UI Tests

Target:

  • widget rendering and contract integrity
  • layout preset loading
  • Prompt Composer behavior
  • approvals / alerts / research widget flows
  • mobile reduced-surface mode
  • long-list virtualization behavior

Purpose:

  • prove the control center is usable and safe

13.8 Performance Tests

Target:

  • hot widget responsiveness
  • event throughput
  • cost telemetry ingestion
  • search/retrieval latency
  • module routing latency
  • mobile responsiveness
  • long conversation/log virtualization behavior

Purpose:

  • prove widget-first design remains operational
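At its simplest, a performance test at this layer is a latency budget asserted in CI. The 50 ms budget, call count, and `route_module` stand-in below are all illustrative assumptions; the real test would call the actual router.

```python
# Performance-test sketch: assert a latency budget on a hot path.
# Budget and function are illustrative, not measured XIOPro targets.
import time

def route_module(task: str) -> str:
    # Stand-in for module routing; replace with the real router call.
    return {"code": "module-a"}.get(task, "module-default")

def test_routing_latency_budget():
    start = time.perf_counter()
    for _ in range(1000):
        route_module("code")
    elapsed = time.perf_counter() - start
    assert elapsed < 0.05, f"hot path took {elapsed:.3f}s for 1000 calls"
```

A failing budget assertion is a regression signal, not a benchmark; deeper profiling happens outside the test suite.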

13.9 Acceptance Tests

Acceptance should be written from the founder/operator perspective.

Example acceptance cases:

  • "I can ask a brain to work, clarify it, and see the durable result."
  • "I can approve or reject a protected action and know what changed."
  • "I can see which module was used, what it cost, and why."
  • "I can recover from a failed runtime without losing context."
  • "I can create and review research in a repeatable way."
  • "I can use the core controls from mobile."

13.10 Agent Behavioral Testing (v5.0.8 Addition)

Beyond code testing (pytest, Playwright), XIOPro must test agent behavior itself. Code tests verify that the system works; behavioral tests verify that agents behave correctly within the system.

behavioral_testing:
  prompt_regression:
    what: "Verify that agent responses remain consistent after skill/rule changes"
    method: "Golden set of input->expected_output pairs per role"
    when: "After any skill, rule, or activation file change"

  drift_detection:
    what: "Detect when agent behavior degrades over time"
    method: "Compare current task completion rate, token usage, and quality scores against baseline"
    when: "Weekly via Idle Maintenance"
    threshold: "Flag if completion rate drops >10% or token usage increases >20%"

  role_conformance:
    what: "Verify agent operates within its assigned role boundaries"
    method: "Check that agent didn't use capabilities outside its role bundle"
    when: "Post-task evaluation"

  escalation_testing:
    what: "Verify agents escalate correctly when they should"
    method: "Present ambiguous/impossible tasks, verify escalation happens"
    when: "Monthly via controlled test scenarios"

Design Rationale

  • Prompt regression is the agentic equivalent of unit testing. When a skill or rule changes, the agent's behavior may change in unexpected ways. A golden set of input/output pairs detects regressions before they reach production.
  • Drift detection catches gradual degradation that no single change causes. Model updates, accumulated context patterns, or subtle prompt erosion can all cause drift. Weekly measurement via Idle Maintenance (Part 4, Section 4.9.9) provides early warning.
  • Role conformance ensures agents stay within their assigned boundaries. An agent with the engineering role should not be making governance decisions. Post-task evaluation checks for boundary violations.
  • Escalation testing verifies a critical safety property: agents must escalate when they cannot complete a task or when the task is ambiguous. This is tested with controlled scenarios, not just observed in production.

T1P Implementation

For T1P, behavioral testing is lightweight:

  • Prompt regression: maintain 5-10 golden test cases per role bundle. Run after any activation file change. Use pytest to invoke the prompt and compare output against expected patterns (fuzzy match, not exact).
  • Drift detection: track completion rate and token usage per role in the Skill Performance DB (Part 5, Section 8.9A). Weekly Idle Maintenance task compares current week vs baseline.
  • Role conformance: manual review during task evaluation. Automated enforcement deferred to post-T1P.
  • Escalation testing: manual test scenarios during sprint reviews. Automated escalation test suite deferred to post-T1P.
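The prompt-regression item above could look like the sketch below: golden cases checked with a fuzzy keyword match rather than exact string equality, so harmless rewording passes while real regressions fail. The golden-case shape and `check_golden_case` are assumptions for T1P, not an existing harness.

```python
# Prompt-regression sketch: fuzzy golden-set checking for T1P.
# Case shape and helper are illustrative assumptions.
GOLDEN_CASES = [
    {
        "role": "research",
        "input": "Summarize the repo topology.",
        "must_contain": ["struxio-os", "struxio-logic"],
        "must_not_contain": ["DROP TABLE"],
    },
]

def check_golden_case(case: dict, response: str) -> list:
    """Return a list of failures; an empty list means the case passed."""
    failures = []
    lowered = response.lower()
    for kw in case["must_contain"]:
        if kw.lower() not in lowered:
            failures.append(f"missing required keyword: {kw}")
    for kw in case["must_not_contain"]:
        if kw.lower() in lowered:
            failures.append(f"forbidden content present: {kw}")
    return failures

def test_golden_case_passes():
    response = "The core repos are struxio-os and struxio-logic."
    assert check_golden_case(GOLDEN_CASES[0], response) == []

def test_golden_case_catches_regression():
    response = "The core repo is struxio-os."
    assert check_golden_case(GOLDEN_CASES[0], response) != []
```

Running this via pytest after every activation file change keeps the 5-10 cases per role cheap to maintain while still catching behavior regressions before they reach production.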

Relation to Existing Test Layers

Behavioral tests complement, not replace, the test layers in Sections 13.1-13.9:

  • Unit tests (13.2) verify code logic
  • Behavioral tests (13.10) verify agent decision-making
  • Governance tests (13.6) verify policy enforcement
  • Together they cover the full stack: code correctness, agent behavior, and system governance

14. Exit Criteria by Major Layer

Work Graph / ODM Exit Criteria

  • explicit state
  • queryable lineage
  • cost/time propagation
  • escalation + human decision support

Execution Exit Criteria

  • headless ticket execution
  • session continuity
  • recovery path support

Governance Exit Criteria

  • alerts
  • breakers
  • approvals
  • auditability
  • governed module/prompting behavior

Knowledge / Research Exit Criteria

  • governed ingestion
  • retrievable outputs
  • scheduled research support
  • proposal routing from Hindsight/Dream

UI Exit Criteria

  • widget-first shell
  • Prompt Composer
  • collaboration preset
  • governance visibility
  • mobile reduced-surface mode

15. Review Strategy

15.1 Internal Review

After each major phase:

  • compare implementation to blueprint
  • compare docs to implementation
  • update open gaps
  • record correction tickets

15.2 Outside Review

After Part 10 completion and before major lock:

  • run an outside architecture review
  • run an implementation practicality review
  • run an OSS/capability scan for build-vs-adopt decisions

Outside review should challenge:

  • overdesign
  • missing operational details
  • unnecessary custom code
  • weak failure handling
  • weak testing discipline

16. Risks & Mitigation

Risk: Complexity exceeds control

Mitigation:

  • strict ticketization
  • sprint gates
  • layered implementation
  • outside review

Risk: Hidden drift between blueprint and code

Mitigation:

  • phase reviews
  • update docs as part of sprint completion
  • treat mismatch as a tracked defect

Risk: Cost visibility remains weak

Mitigation:

  • execution-time telemetry
  • attribution pipeline
  • governor and module steward integration
  • cost-focused acceptance tests

Risk: Widget UI becomes heavy and unstable

Mitigation:

  • hot/warm/cold widget model
  • lazy loading
  • virtualization
  • layout presets before full freedom

Risk: Legacy assets create confusion

Mitigation:

  • classify adopt / supersede / archive
  • define repo roles
  • remove transitional repos from the canonical core once migrated

Risk: Research becomes fragmented again

Mitigation:

  • Research Center task model
  • source bundle discipline
  • Librarian promotion path
  • scheduled refresh rules

17. Success Criteria

The plan is successful if:

  • XIOPro becomes buildable in layers
  • migration preserves useful existing assets
  • repos and responsibilities become clear
  • the system can run headlessly and be supervised through the UI
  • module use becomes governable and optimizable
  • research becomes repeatable and reusable
  • test coverage proves the critical workflows
  • product work can be supported without collapsing XIOPro into product runtime (see MVP1_PRODUCT_SPEC.md)

18. Current State

As of 2026-03-28, the work plan exists in blueprint form only.

What exists today:

  • 14 Docker containers running on Hetzner CPX62
  • Active ticket backlog (3-digit, 064-079 range + STR-N Paperclip identifiers)
  • Five canonical repos (struxio-os, struxio-logic, struxio-design, struxio-app, struxio-business)
  • One transitional repo (struxio-aibus)
  • BrainMaster orchestrator running via Claude Code
  • Walking skeleton DDL written (resources/SCHEMA_walking_skeleton_v4_2.sql)
  • Service fate map written (resources/SERVICE_FATE_MAP_v4_2.md)
  • CLI tools assessment written (resources/CLI_TOOLS_ASSESSMENT.md)
  • Blueprint v5.0 corrections identified and being applied

What must happen next:

  • Execute Phase 0 (Grounding): run DDL, install CLI tools, create first 4-digit tickets
  • Begin Phase 1 (Walking Skeleton): prove the end-to-end path
  • Retire stale containers (devxio-frontend, devxio-bridge) for immediate RAM savings

19. Final Statement

This plan turns XIOPro from a partially manual, partially experimental operating environment into a governed, testable, continuously improving machine.

The goal is not speed without proof.

The goal is controlled acceleration: build, validate, govern, optimize, and only then scale.


Changelog

Version Date Author Changes
4.1.0 2026-03-27 BM Initial work plan blueprint
4.2.0 2026-03-28 BM C9.1: Switched to 4-digit ticket numbering (1001+, historical 3-digit preserved) -- new Section 8. C9.2: Added Phase 0 (Grounding) before Phase 1 with concrete gate criteria. C9.3: Added DDL reference for walking skeleton in Section 7B.3. C9.4: Added parallel operation rule (Section 6.11) -- old+new services run together, no big-bang cutover. C9.5: Added timeline estimate (Section 7A) -- Phase 0: 1-2 days, Phase 1: 1 week, Phase 2: 1 week, Phase 3: 1-2 weeks, Phase 4: 1 week = 4-6 weeks total T1P. CX.1: Global "Rufio" to "Ruflo" rename. CX.2: Updated version header to 4.2.0. CX.3: Added changelog. CX.4: Added current state section (Section 18). Renumbered Phase 0 (old Baseline Consolidation) to Phase 0A. Renumbered sections 14-17 to 15-18, final statement to 19.
4.2.2 2026-03-28 000 Agent naming migration: O00 replaced with 000 (orchestrator). O01 replaced with 000 (governor). R01/P01/M01 replaced with role-based naming. BM replaced with 000 (BrainMaster). B2/B3/B5/M0 replaced with 002/003/005/010 in resource allocation table. Legacy naming section updated to reference 3-digit model. Changelog author entries preserved as historical.
4.2.3 2026-03-28 000 Roles over numbers: Removed agent IDs from work plan phases, resource allocation, milestone lists, and integration references. Role names used throughout instead of agent numbers.
4.2.7 2026-03-28 BM Neo4j deprecated: Phase 4 "evaluate Neo4j instances" marked as done (both retired).
4.2.8 2026-03-28 BM AGI pattern gap fix: Added Agent Behavioral Testing section (13.10) — prompt regression, drift detection, role conformance, escalation testing. Addresses audit gap "Agent Behavioral Testing" (Principle 8 depth).
4.2.9 2026-03-28 000 Wave 1-2 BP fixes: Rewrote Phase 0 to reflect actual Day 0 completion (36 items from Part 11 execution log). Updated timeline table (7A) with Phase 0 marked DONE and ticket coverage column (TKT-1001 to 1008). Added open items carried to Phase 1.
4.2.10 2026-03-28 000 Content deduplication: Section 4.1 Repo Roles — replaced duplicated repo descriptions with cross-reference to Part 8 Section 8.13.2. Section 13.1A Test Toolchain — replaced duplicated toolchain decision with cross-reference to Part 2 Section 5.11.
4.2.11 2026-03-29 000 External review fix: Section 7A rewritten with T1P vs Phase 2 scope boundary. T1P = walking skeleton + control bus + one approval path + minimum UI (2-3 days sprint-compressed). Phase 2 = full governance, full knowledge, full UI, MVP1 (4-6 weeks for full production). Schema count corrected from "15 tables" to "21 tables, 27 enums" in Phase 0 outputs.
4.2.12 2026-03-29 BM Cross-references: Added pointers to resources/DESIGN_rc_architecture.md and resources/DESIGN_cli_services.md in Phase 0 outputs.
5.0.1 2026-03-30 GO N21: Updated Section 7A timeline language -- replaced "10-14 days" phrasing with "4-6 weeks for full production, with sprint compression achieving feature delivery in hours". T1P target and Phase 2 language adjusted to match validated execution velocity.