XIOPro Production Blueprint v5.0¶
Part 11 — System Review¶
1. Purpose¶
This part is the verification gate between design (Parts 1-10) and execution (Parts 12-14).
Before any project moves to implementation, it must pass through System Review to verify:
- Data schema is complete and consistent
- All system modules are identified and positioned
- Risks are identified and mitigated
- Dependencies are mapped and ordered
- Data flows are documented
- Operational checklists exist
This process is reusable -- every XIOPro project passes through it.
1.1 Scope of This Document¶
Sections 1-5 cover the first half of the System Review:
| Section | Content |
|---|---|
| 1 | Purpose and scope |
| 2 | Data Schema verification |
| 3 | System Module Index |
| 4 | Subject Index |
| 5 | Risk Register |
The second half (Sections 6-10: Dependency Order, Data Flow Diagrams, Operational Checklists, Gap Analysis, Review Sign-Off) follows below, beginning at Section 6.
2. Data Schema¶
The canonical schema is at: resources/SCHEMA_walking_skeleton_v4_2.sql
2.1 Entity-Relationship Overview¶
erDiagram
%% ── Core Work Graph ──
projects ||--o{ sprints : contains
projects ||--o{ tickets : contains
projects ||--o{ project_agent_bindings : has
sprints ||--o{ tickets : scopes
tickets ||--o{ tasks : decomposes_into
tickets ||--o| tickets : parent_ticket
tasks ||--o| tasks : parent_task
tasks ||--o{ activities : generates
%% ── Discussion & Ideas ──
projects ||--o{ discussion_threads : hosts
discussion_threads ||--o| tickets : linked_ticket
discussion_threads ||--o| tasks : linked_task
discussion_threads ||--o| sessions : linked_session
ideas ||--o| topics : classified_by
ideas ||--o| users : raised_by
ideas ||--o| tickets : converts_to
idea_discussion_links }o--|| ideas : links
idea_discussion_links }o--|| discussion_threads : links
%% ── Agent System ──
agent_templates ||--o{ agent_runtimes : instantiates
agent_runtimes ||--o{ sessions : runs_in
agent_runtimes ||--o{ activities : executes
agent_runtimes ||--o| hosts : deployed_on
agent_runtimes ||--o| agent_runtimes : parent_runtime
agent_runtimes ||--o| tickets : assigned_ticket
agent_runtimes ||--o| tasks : assigned_task
project_agent_bindings }o--|| projects : roster_for
%% ── Governance ──
escalation_requests ||--o| agent_runtimes : raised_by
escalation_requests ||--o| tasks : about
escalation_requests ||--o| activities : triggered_by
human_decisions }o--|| escalation_requests : resolves
human_decisions ||--o| agent_runtimes : applies_to
human_decisions ||--o| tasks : applies_to
override_records ||--o| agent_runtimes : targets
%% ── Knowledge ──
topics ||--o| topics : parent_topic
research_tasks ||--o| projects : scoped_to
research_tasks ||--o| tickets : linked_to
research_tasks ||--o| tasks : parent_task
research_tasks ||--o| agent_runtimes : owned_by
%% ── Cost & Time ──
cost_ledger }o--|| activities : charges
cost_ledger ||--o| tasks : attributed_to
cost_ledger ||--o| tickets : attributed_to
cost_ledger ||--o| projects : attributed_to
time_ledger }o--|| activities : records
time_ledger ||--o| tasks : attributed_to
2.2 Table Summary¶
| # | Table | Group | Key Relationships |
|---|---|---|---|
| 0 | users | Core | Standalone identity entity. Referenced by ideas.raised_by_user_id. |
| 1 | topics | Knowledge | Self-referencing tree (parent_topic_id). Referenced by ideas, tickets, tasks, agent_templates, research_tasks via UUID arrays. |
| 2 | projects | Core Work Graph | Parent of sprints, tickets, discussion_threads, project_agent_bindings, research_tasks, cost_ledger. |
| 2A | project_agent_bindings | Core Work Graph | Junction: projects <-> agent identity (agent_id VARCHAR(3)). |
| 3 | sprints | Core Work Graph | FK to projects. Referenced by tickets.sprint_id. |
| 4 | discussion_threads | Core Work Graph | FK to projects. Deferred FKs to tickets, tasks, sessions. |
| 4A | ideas | Core Work Graph | FK to topics, users. Deferred FK to tickets. |
| 4B | idea_discussion_links | Core Work Graph | Junction: ideas <-> discussion_threads. |
| 5 | tickets | Core Work Graph | FK to projects, sprints. Self-referencing (parent_ticket_id). Parent of tasks. |
| 6 | tasks | Core Work Graph | FK to tickets. Self-referencing (parent_task_id). Deferred FK to agent_runtimes. Parent of activities. |
| 7 | hosts | Agent System | Standalone capacity entity. Referenced by agent_runtimes.host_id. |
| 8 | agent_templates | Agent System | Canonical agent class. Parent of agent_runtimes. |
| 9 | agent_runtimes | Agent System | FK to agent_templates, hosts, tickets, tasks. Self-referencing (parent, root, orchestrator). Deferred FK to sessions. |
| 10 | sessions | Agent System | FK to agent_runtimes. Referenced by activities, discussion_threads, escalation_requests. |
| 11 | activities | Agent System | FK to tasks, agent_runtimes, sessions. Parent of cost_ledger, time_ledger. |
| 12 | escalation_requests | Governance | FK to agent_runtimes, sessions, tickets, tasks, activities. Parent of human_decisions. |
| 13 | human_decisions | Governance | FK to escalation_requests, agent_runtimes, tasks. |
| 14 | override_records | Governance | Polymorphic scope (scope_type + scope_ref). Append-only audit trail. |
| 15 | cost_ledger | Cost/Time | FK to activities, agent_runtimes, tasks, tickets, projects. |
| 16 | time_ledger | Cost/Time | FK to activities, agent_runtimes, tasks, tickets. |
| 17 | research_tasks | Knowledge | FK to projects, tickets, tasks, agent_runtimes. |
2.3 Schema Statistics¶
| Metric | Count |
|---|---|
| Tables | 21 |
| Enum types | 28 |
| Deferred foreign keys | 6 |
| Auto-update triggers | 12 |
| Partial indexes | 2 |
| Generated columns (status_state) | 10 |
2.4 Schema Conventions¶
All tables follow the ODM Metadata Contract (ODM Section 12.2):
- `tags TEXT[]` -- free-form classification
- `labels TEXT[]` -- structured labels
- `source_system TEXT` -- originating system
- `source_ref TEXT` -- external reference
- `correlation_id TEXT` -- cross-system tracing
- `idempotency_key TEXT` -- replay protection
- `notes TEXT` -- human-readable notes
- `created_by TEXT`, `updated_by TEXT` -- audit trail
- `created_at TIMESTAMPTZ`, `updated_at TIMESTAMPTZ` -- timestamps
All lifecycle-bearing entities use the three-dimensional state model:
- `status` (`devxio_status`) -- workflow phase
- `state` (`devxio_state`) -- runtime condition
- `status_state` -- generated composite for indexing
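For intuition, the generated composite can be mirrored application-side. This is a sketch only: the actual column is defined in the SQL schema (`GENERATED ALWAYS AS ... STORED`), and the `:` separator used here is an assumption, not the canonical format.

```python
def status_state(status: str, state: str) -> str:
    """Compose an indexable status_state value from the two state dimensions.

    Mirrors the generated column in spirit; the separator is an assumed
    format, the authoritative definition lives in the SQL schema.
    """
    return f"{status}:{state}"

# A partial index on status_state lets queries filter on both dimensions at once.
print(status_state("active", "running"))
```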
3. System Module Index¶
Every distinct capability, engine, or module in XIOPro, with its primary blueprint location, T1P posture, and dependencies.
| # | Module | Primary Part | Description | T1P Posture | Depends On |
|---|---|---|---|---|---|
| 1 | Control Bus | Part 2, 5.8 | Central messaging relay for all XIOPro surfaces. SSE push, intervention model, message routing between brains and services. | Full | PostgreSQL, API Service |
| 2 | Orchestrator | Part 4, 4.1 | Master execution coordinator. Decomposes tickets into tasks, assigns agents, manages execution flow and dependencies. | Full | Control Bus, Work Graph, Agent Templates |
| 3 | Governor | Part 7, 5 | Runtime governance engine. Enforces budgets, breakers, escalation policies, health monitoring, and cost controls. | Full | Control Bus, Cost Ledger, Orchestrator |
| 4 | Rule Steward | Part 4, 4.2A | Manages the lifecycle of all operational rules: creation, validation, versioning, conflict detection, and retirement. | Scaffold | Governor, Knowledge Store |
| 5 | Prompt Steward | Part 4, 4.2B | Manages context assembly, prompting modes, question budgets, and prompt package contracts for agent interactions. | Scaffold | Rule Steward, Skill Registry |
| 6 | Module Steward | Part 4, 4.2C | Evaluates, adopts, and governs external modules and tools. Manages the module portfolio lifecycle. | Scaffold | Governor, Research Center |
| 7 | Librarian | Part 5, 4 | Core knowledge management system. Ingestion, indexing, search, decomposition, and document lifecycle. | Scaffold | PostgreSQL, pgvector, Object Storage |
| 8 | Research Center | Part 5, 8 | Operational research engine. Source scouting, scheduled research tasks, digest generation, NotebookLM/Obsidian integration. | Scaffold | Librarian, Source Registry, Scheduler |
| 9 | Skill Registry | Part 5, 8.9 | Central registry of all agent skills. Defines skill metadata, versioning, model compatibility, and governance rules. | Scaffold | Rule Steward, Knowledge Store |
| 10 | Skill Performance DB | Part 5, 8.9A | Tracks token consumption, quality scores, model compatibility, and execution statistics per skill. | Scaffold | Skill Registry, Cost Ledger |
| 11 | Hindsight | Part 5, 9 | Post-execution learning engine. Analyzes completed tasks, extracts patterns, generates improvement recommendations. | Scaffold | Activities, Sessions, Knowledge Store |
| 12 | Dream Engine | Part 5, 10 | Autonomous optimization engine. Identifies improvement opportunities, proposes experiments, runs during idle time. | Idle Maintenance Only | Hindsight, Skill Performance DB, Governor |
| 13 | Idle Maintenance | Part 4, 4.9.9 | T1P subset of Dream Engine. Practical optimization tasks: skill drift detection, cost anomaly review, stale knowledge cleanup. | Phase 2 (downgraded -- no dedicated ticket) | Scheduler, Skill Registry, Cost Ledger |
| 14 | RAG Pipeline | Part 5, 7.18 | Retrieval-augmented generation pipeline. Embedding, chunking, hybrid retrieval, reranking, and context injection. | Phase 2 (downgraded -- pgvector DDL deploys but pipeline activation is Phase 2) | pgvector, PostgreSQL, Prompt Steward |
| 15 | XIOPro Optimizer | Part 1, 8A | Umbrella capability grouping Governor, Rule Steward, Prompt Steward, Module Steward, and Dream Engine as the self-improvement loop. | Scaffold | Governor, All Stewards, Dream Engine |
| 16 | Control Center UI | Part 6 | Widget-based web UI. Attention queue, brain interaction, prompt composer, governance dashboards, research desk. | First Wave | API Service, SSE Push, Control Bus |
| 17 | Prompt Composer | Part 6, 12 | UI component for structured prompt construction. Mode selection, search/research toggle, style controls, module/model controls. | First Wave | Prompt Steward, UI Framework |
| 18 | Agent Spawning | Part 4, 5A | Agent lifecycle management. Three patterns: roster agent, on-demand agent, ephemeral sub-agent. Capacity-aware host placement. | Full | Orchestrator, Host Registry, Agent Templates |
| 19 | ODM (Operational Domain Model) | Part 3 | Canonical data model. 21 tables, three-dimensional state model, metadata contract, entity lifecycle rules. | Full | PostgreSQL |
| 20 | Knowledge Ledger | Part 5, 4.7 | Change and evolution log for all knowledge objects. Tracks document lifecycle, revival, export, and drift. | Scaffold | Librarian, PostgreSQL |
| 21 | Execution Report | Part 4, 20 | Post-execution summary generation. Cost, duration, outcome, and success criteria assessment per ticket. | Scaffold | Activities, Cost Ledger, Time Ledger |
| 22 | Host Registry | Part 3, 4.1B | Fleet machine inventory. Tracks capacity (CPU, RAM, SSD, GPU), active agents, and health status per host. | Full | PostgreSQL |
| 23 | Source Registry | Part 5, 8.10.1 | Curated list of external research sources. Ranked, scheduled, with trust and freshness metadata. | Scaffold | Research Center, Librarian |
| 24 | Resource Registry | Part 5, 8.10.2 | Registry of evaluated external resources (tools, libraries, services). Lifecycle tracking from discovery to adoption or rejection. | Scaffold | Research Center, Module Steward |
| 25 | Scheduler | Part 8, 8.7 | Background job execution. Cron-like scheduling for research tasks, idle maintenance, health checks, and refresh cycles. | Phase 2 (downgraded -- existing cron covers basics, dedicated scheduler is Phase 2) | PostgreSQL, Control Bus |
4. Subject Index¶
Alphabetical index of key subjects referenced across the blueprint.
| Subject | Primary Location | Also Referenced In |
|---|---|---|
| Agent Allocation | Part 3, 4.2.1 | Part 4, 5A |
| Agent Identity (3-digit) | Part 1, 8.1 | Part 3, 4.7-4.8; Part 4, 19.1 |
| Agent Lifecycle | Part 4, 6 | Part 7, 6.2 |
| Agent Runtime | Part 3, 4.8 | Part 4, 5A; Part 8, 7.2 |
| Agent Template | Part 3, 4.7 | Part 4, 4.1; Part 8, 8.3 |
| Alerts | Part 7, 10 | Part 8, 12.6 |
| Approval | Part 7, 6.5 | Part 3, 4.11; Part 4, 11 |
| Atomic Writes | Part 1, 4.3 | Part 3, 2.5; Part 8, 3.2 |
| Authentication | Part 2, 5.14 | Part 3, 4.8 (auth_method); Part 8, 11 |
| Backup | Part 2, 5.15 | Part 8, 10; Part 4, 16 |
| Breakers (Circuit) | Part 7, 9 | Part 1, 4.5; Part 4, 10 |
| CLI Tools | Part 2, 5.12 | Part 1, 4.13; Part 4, 13 |
| Completion Self-Check | Part 4, 5.2 | Part 7, 6.4 |
| Confidence Scoring | Part 4, 4.2B | Part 5, 9; Part 7, 11 |
| Control Bus | Part 2, 5.8 | Part 1, 4.12; Part 7, 7; Part 8, 7.1 |
| Cost Awareness | Part 1, 4.6 | Part 3, 4.6.2; Part 7, 6.1; Part 8, 13 |
| Cost Ledger | Part 3, 4.6.2 | Part 4, 9; Part 7, 6.1; Part 8, 13.3 |
| Data Access Rule | Part 2, 5.8 | Part 8, 7.1 |
| Debounce | Part 7, 9.2 | Part 4, 10 |
| Decomposition (Task) | Part 4, 5 | Part 3, 4.4-4.5 |
| Decomposition (Document) | Part 5, 4.5 | Part 5, 4.1 |
| Dependencies (Task) | Part 4, 5.1 | Part 3, 4.5 |
| Discussion Thread | Part 3, 4.3A | Part 5, 4; Part 6, 10.3 |
| Dream Engine | Part 5, 10 | Part 1, 12A.2; Part 4, 4.9.9; Part 5, 11A.4 |
| Escalation | Part 3, 4.11 | Part 4, 11; Part 7, 8.3; Part 6, 10.2 |
| Execution Mode | Part 3, 4.5 | Part 4, 3 |
| Execution Report | Part 4, 20 | Part 7, 6.4 |
| Firewall | Part 8, 11.4 | Part 8, 11.10.5 |
| Governor | Part 7, 5 | Part 1, 8.4; Part 4, 4.2; Part 8, 8.4 |
| Hindsight | Part 5, 9 | Part 1, 12A.2; Part 4, 4.9.9; Part 5, 11A.3 |
| Host | Part 3, 4.1B | Part 4, 14; Part 8, 5 |
| Human Decision | Part 3, 4.12 | Part 7, 6.5; Part 4, 11 |
| Idea | Part 3, 4.3B | Part 5, 4; Part 6, 10.3 |
| Idle Maintenance | Part 4, 4.9.9 | Part 1, 12A.2; Part 5, 10; Part 5, 11A.4 |
| Intervention | Part 7, 10.4 | Part 2, 5.8; Part 6, 10.2 |
| Knowledge Compounding | Part 1, 4.7 | Part 5, 2; Part 5, 14 |
| Knowledge Ledger | Part 5, 4.7 | Part 7, 12 |
| Librarian | Part 5, 4 | Part 1, 6.1; Part 8, 8.9 |
| LiteLLM | Part 2, 5.3 | Part 3, 4.8; Part 8, 8.6 |
| Memory Principles | Part 5, 4.5A | Part 5, 9.5 |
| Metadata Contract | Part 3, 12.2 | All entity definitions |
| Module Steward | Part 4, 4.2C | Part 1, 8.4; Part 7, 12.9; Part 8, 8.12 |
| NotebookLM | Part 5, 8.7 | Part 5, 8.2A |
| Obsidian | Part 5, 8.8 | Part 5, 18.3 |
| ODM (Operational Domain Model) | Part 3 | Part 1, 7; Part 2, 4.6 |
| Optimizer (XIOPro) | Part 1, 8A | Part 4, 4.2-4.2C; Part 5, 10 |
| Orchestrator | Part 4, 4.1 | Part 1, 8.2; Part 2, 4.3; Part 8, 8.3 |
| Override Record | Part 3, 4.12A | Part 7, 12.14 |
| Paperclip | Part 1, 13 | Part 8, 15 |
| pgvector | Part 5, 7.18 | Part 5, 12; Part 8, 8.8 |
| Policy Objects | Part 7, 8 | Part 7, 6 |
| PostgreSQL | Part 2, 5.5 | Part 8, 8.8; Part 3 (all entities) |
| Priority Level | Part 3 (enum) | Part 4, 8; Part 7, 10.1 |
| Prompt Composer | Part 6, 12 | Part 4, 4.2B |
| Prompt Steward | Part 4, 4.2B | Part 1, 8.4; Part 7, 12.7 |
| RAG Pipeline | Part 5, 7.18 | Part 4, 4.2B; Part 5, 12 |
| Recovery | Part 7, 8.4 | Part 4, 15; Part 8, 3.5; Part 8, 11.10 |
| Replaceability | Part 1, 4.8 | Part 8, 3.3 |
| Research Center | Part 5, 8 | Part 1, 12A.2; Part 5, 11A.2 |
| Research Task | Part 3, 4.12B | Part 5, 8.12-8.15 |
| Review Gates | Part 7, 12.16 | Part 4, 5.2 |
| Roles (Agent) | Part 1, 8.2 | Part 3, 4.7; Part 4, 4.1-4.2C |
| Ruflo | Part 2, 5.8 | Part 4, 4.2F; Part 8, 8.5 |
| Rule Steward | Part 4, 4.2A | Part 1, 8.4; Part 7, 12 |
| Scheduled Research | Part 5, 8.12 | Part 5, 11; Part 8, 8.7 |
| Secrets Management | Part 8, 11.5 | Part 2, 5.14 |
| Self-Evaluation | Part 4, 5.2 | Part 5, 9 |
| Session | Part 3, 4.10 | Part 4, 7; Part 8, 7.2 |
| Skill Performance | Part 5, 8.9A | Part 4, 4.11; Part 5, 10 |
| Skill Registry | Part 5, 8.9 | Part 4, 4.10-4.11 |
| Skill Selection | Part 4, 4.11 | Part 5, 8.9 |
| Source Registry | Part 5, 8.10.1 | Part 5, 8.11 |
| Sprint | Part 3, 4.3 | Part 4, 5 |
| SSE Push | Part 2, 5.6 | Part 6, 6.5; Part 8, 7.1 |
| Sub-Agent | Part 4, 5A.2 | Part 4, 12 |
| T1P Posture | Part 1, 12A | All Parts (posture tables) |
| Tailscale | Part 8, 5.1 | Part 8, 11.4 |
| Three-Dimensional State | Part 3, 2.5 | Part 3 (all lifecycle entities) |
| Ticket | Part 3, 4.4 | Part 4, 5; Part 7, 6 |
| Ticket Numbering | Part 3, 2.7 | Part 4, 5 |
| Time Ledger | Part 3, 4.6.3 | Part 4, 9; Part 8, 13.3 |
| Token Budget | Part 4, 4.2B | Part 7, 6.1; Part 5, 7.18 |
| Topic | Part 3, 4.1 | Part 5, 4; Part 5, 8 |
| Topic Enrichment | Part 3, 4.1.1 | Part 5, 4 |
| User | Part 3, 4.0 | Part 6, 9; Part 8, 11.3 |
| Walking Skeleton | Part 3 | Part 10 |
| Widget | Part 6, 6 | Part 6, 10-11 |
5. Risk Register¶
Risks identified across Parts 1-8, compiled with severity assessment and mitigation strategy.
5.1 Severity Scale¶
| Level | Meaning |
|---|---|
| Critical | System-wide failure or data loss. Requires immediate response. |
| High | Major capability degraded. Requires response within hours. |
| Medium | Partial degradation. Requires response within 1 business day. |
| Low | Minor inconvenience. Addressed in normal maintenance cycle. |
5.2 Risk Table¶
| # | Risk | Severity | Likelihood | Impact | Mitigation | BP Reference |
|---|---|---|---|---|---|---|
| R01 | RAM exhaustion on Hetzner CPX62 -- 30 GB shared across PostgreSQL, API, Orchestrator, LiteLLM, and all agent runtimes. A spike in concurrent agents or a memory leak crashes the control plane. | Critical | Medium | Full system outage | Memory pressure survival rule (Part 8, 11.10.3). Reserved RAM budgets per service. Governor enforces max concurrent agents via host capacity tracking. Core-first recovery order defined. | Part 8, 5.1; Part 8, 11.10.3 |
| R02 | Scope creep beyond T1P -- Premature implementation of full Dream Engine, full Steward roles, or advanced UI features before Walking Skeleton is stable. | High | High | Wasted budget, unstable foundation | T1P Posture classification (Part 1, 12A). Each capability has explicit posture: Full, Scaffold, Defer. Posture violation requires explicit approval. | Part 1, 12A |
| R03 | Single orchestrator bottleneck -- One master orchestrator (O00) manages all execution flow. If it crashes or becomes overloaded, all work halts. | High | Medium | Complete execution stoppage | 3-failure circuit breaker halts and interrupts C0 (CLAUDE.md). Session durability allows restart. Recovery policy (Part 7, 8.4) defines restart sequence. Future: multi-orchestrator with leader election. | Part 4, 4.1; Part 7, 8.4 |
| R04 | API rate limits from LLM providers -- Anthropic, OpenAI, or other providers throttle or reject requests during peak load or quota exhaustion. | High | Medium | Agent execution stalls | LiteLLM router with fallback model routing (Part 8, 8.6). Governor monitors cost ledger and enforces budget policies (Part 7, 8.1). Token budget management by Prompt Steward. | Part 2, 5.3; Part 8, 8.6 |
| R05 | Session crash with context loss -- An agent runtime crashes mid-task and the session context (conversation history, intermediate results) is lost. | High | Medium | Rework, duplicated cost | Durable session model with checkpoint_ref and transcript_ref (Part 3, 4.10). Atomic writes to PostgreSQL. Recovery policy restores from last checkpoint. | Part 3, 4.10; Part 4, 15 |
| R06 | Knowledge drift -- Knowledge base becomes stale as external sources change, internal documents are not refreshed, and embeddings decay in relevance. | Medium | High | Degraded RAG quality, incorrect agent behavior | Scheduled research refresh cycles (Part 5, 11). Knowledge Ledger tracks document lifecycle (Part 5, 4.7). Anti-entropy rules (Part 5, 15). Idle Maintenance detects stale knowledge. | Part 5, 11; Part 5, 15 |
| R07 | Cost overrun exceeding Max20 budget -- Uncontrolled LLM usage, excessive agent spawning, or inefficient prompting pushes monthly costs beyond the $200/month ceiling. | Critical | Medium | Budget breach, forced shutdown | Governor cost governance (Part 7, 6.1). Budget policy with hard caps (Part 7, 8.1). Cost ledger attribution to activity level (Part 3, 4.6.2). Cost optimization layer (Part 4, 9). Cost reporting on every deliverable (CLAUDE.md). | Part 7, 6.1; Part 8, 13 |
| R08 | Skill degradation over time -- Skills that worked well initially degrade as models are updated, contexts change, or upstream dependencies shift. | Medium | Medium | Reduced execution quality | Skill Performance DB tracks quality per skill over time (Part 5, 8.9A). Idle Maintenance detects skill drift (Part 4, 4.9.9). Dream Engine proposes improvements. | Part 5, 8.9A; Part 4, 4.9.9 |
| R09 | Security breach via exposed secrets -- API keys, OAuth tokens, or database credentials leaked through logs, commits, or misconfigured services. | Critical | Low | Full system compromise | SOPS for secrets at rest (Part 2, 5.14). No secrets in commits (CLAUDE.md). Tailscale VPN for network isolation (Part 8, 11.4). Security logging and audit (Part 8, 11.8). | Part 2, 5.14; Part 8, 11.5 |
| R10 | Data loss from PostgreSQL failure -- Database corruption, disk failure, or accidental deletion destroys the canonical state store. | Critical | Low | Total state loss | Restic backup to Backblaze B2 daily at 03:00 UTC. WAL archiving. Restore drill requirements (Part 8, 10.8). Backup verification on schedule. | Part 8, 10; Part 2, 5.15 |
| R11 | Agent behavioral drift -- Agents gradually deviate from intended behavior due to prompt template changes, context pollution, or model updates without testing. | Medium | Medium | Unpredictable execution, governance violations | Rule Steward validates rule changes (Part 4, 4.2A). Review gates for non-code outputs (Part 7, 12.16). Prompt Steward manages prompt package contracts (Part 4, 4.2B). Version check for agent runtime currency (Part 1, 4.11). | Part 4, 4.2A; Part 7, 12.16 |
| R12 | Dependency deadlock -- Circular or unresolvable task dependencies prevent execution progress. | Medium | Low | Execution stall on affected ticket | Task dependency resolution (Part 4, 5.1). DAG validation at decomposition time. Orchestrator detects cycles before scheduling. Governor breaker triggers on stall detection. | Part 4, 5.1 |
| R13 | Hetzner outage or network partition -- Cloud provider outage or Tailscale VPN disruption disconnects the control plane from local operator node or external services. | High | Low | Partial or full system unavailability | Emergency access layers (Part 8, 11.10.2). Out-of-band recovery via direct SSH. Mac Studio (Node B) can operate independently for local tasks. Health model detects degradation (Part 8, 12.5). | Part 8, 11.10; Part 8, 5 |
| R14 | Max20 throttling under growth -- As XIOPro manages more projects, the fixed infrastructure budget prevents scaling compute to match workload. | Medium | Medium | Slower execution, queuing delays | Scale-up triggers defined (Part 8, 13.5). Hetzner upgrade policy (Part 8, 13.6). Self-hosted model decision rule (Part 8, 13.7). Cost optimization prioritizes high-value work first. | Part 8, 13.5-13.7 |
| R15 | Context window limits -- Large tasks, deep conversation histories, or excessive RAG injection exceed the model's context window, causing truncation or degraded output. | Medium | High | Reduced output quality, missed context | Prompt Steward manages total context budget (Part 4, 4.2B). RAG pipeline respects context window ceiling (Part 5, 7.18). Document decomposition protocol (Part 5, 4.5). Session checkpointing allows context rotation. | Part 4, 4.2B; Part 5, 7.18 |
| R16 | Orphaned agent runtimes -- Agent processes that lose their parent orchestrator connection continue running, consuming resources without producing useful work. | Medium | Medium | RAM waste, potential interference | Heartbeat monitoring (agent_runtimes.last_heartbeat_at). Governor health governance (Part 7, 6.2). Stale heartbeat triggers cleanup. Max20 budget pressure naturally limits orphan lifetime. | Part 7, 6.2; Part 3, 4.8 |
| R17 | Escalation queue overflow -- Too many escalation requests accumulate without human response, blocking agent execution across multiple tasks. | Medium | Medium | Execution throughput collapse | Attention queue in UI (Part 6, 10.1). Escalation urgency levels with routing rules (Part 7, 8.3). Timeout policies auto-resolve low-priority escalations. Governor monitors queue depth. | Part 7, 8.3; Part 6, 10.1 |
| R18 | Schema migration failure -- Alembic migration fails mid-apply, leaving the database in an inconsistent state between schema versions. | High | Low | Service startup failure, data corruption | Alembic revision chain (schema header). Pre-migration backup. Atomic transaction per migration. Rollback script for each migration. Restore drill validates migration reversibility. | Part 8, 10; Part 2, 5.5 |
| R19 | Provider lock-in despite independence goal -- Gradual accumulation of Anthropic-specific features or prompt patterns makes switching to other providers costly. | Medium | Medium | Reduced negotiating power, migration cost | Provider independence constraint (Part 1, 4.1). LiteLLM abstraction layer (Part 8, 8.6). Skill Performance DB tracks per-model compatibility (Part 5, 8.9A). All prompts stored as portable text. | Part 1, 4.1; Part 8, 8.6 |
| R20 | Insufficient observability during early operation -- Without proper logging, metrics, and dashboards, problems are detected too late and root cause analysis is difficult. | Medium | Medium | Slow incident response, repeated failures | Observability stack requirement (Part 8, 12). Required signals defined (Part 8, 12.2). Health model (Part 8, 12.5). Alerting baseline with critical/warning/info tiers (Part 8, 12.6). Dashboard requirements (Part 8, 12.7). | Part 8, 12 |
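R12's primary mitigation is DAG validation at decomposition time. A minimal sketch of that cycle check, assuming dependencies arrive as (task, depends_on) id pairs -- the function name and input shape are illustrative, not the Orchestrator's actual API:

```python
from collections import defaultdict

def find_cycle(edges):
    """Return a list of ids forming a dependency cycle, or None if the graph is a DAG.

    Standard DFS with three-color marking: GRAY nodes are on the current
    path, so reaching a GRAY node again means we closed a cycle.
    """
    graph = defaultdict(list)
    for a, b in edges:
        graph[a].append(b)
    WHITE, GRAY, BLACK = 0, 1, 2
    color = defaultdict(int)
    path = []

    def dfs(node):
        color[node] = GRAY
        path.append(node)
        for nxt in graph[node]:
            if color[nxt] == GRAY:                      # back edge: cycle found
                return path[path.index(nxt):] + [nxt]
            if color[nxt] == WHITE:
                cycle = dfs(nxt)
                if cycle:
                    return cycle
        path.pop()
        color[node] = BLACK
        return None

    for n in list(graph):
        if color[n] == WHITE:
            cycle = dfs(n)
            if cycle:
                return cycle
    return None

print(find_cycle([("T1", "T2"), ("T2", "T3"), ("T3", "T1")]))  # ['T1', 'T2', 'T3', 'T1']
```

Running this before scheduling rejects a bad decomposition up front, which is cheaper than detecting the stall at runtime via the Governor breaker.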
5.3 Risk Heat Map¶
Low Medium High
Likelihood Likelihood Likelihood
+-----------+-----------+-----------+
Critical | R09 R10 | R01 R07 | |
+-----------+-----------+-----------+
High | R13 R18 | R03 R04 | R02 |
| | R05 | |
+-----------+-----------+-----------+
Medium | R12 | R08 R11 | R06 R15 |
| | R14 R16 | |
| | R17 R19 | |
| | R20 | |
+-----------+-----------+-----------+
5.4 Top 5 Risks Requiring Immediate Attention¶
- R07 -- Cost overrun: The Max20 budget is a hard constraint. Governor cost governance and per-activity attribution must be operational from day one.
- R01 -- RAM exhaustion: With 30 GB serving the entire stack, memory budgets per service must be defined and enforced before first deployment.
- R02 -- Scope creep: T1P posture classification exists but requires discipline. Every implementation decision must reference the posture table.
- R03 -- Single orchestrator: No redundancy for the master orchestrator. Session durability and recovery policy are the primary mitigations until multi-orchestrator is feasible.
- R15 -- Context window limits: High likelihood in daily operation. Prompt Steward context budget management and RAG chunking strategy must be validated early.
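The R01 mitigation ("memory budgets per service must be defined") reduces to a simple invariant that can be checked at deploy time. All numbers below are placeholders to illustrate the check -- the blueprint does not fix per-service budgets in this part:

```python
# Hypothetical per-service RAM budgets for the 30 GB CPX62 host (R01).
# Every figure here is an assumed placeholder, not a blueprint value.
BUDGETS_GB = {
    "postgresql": 8,
    "api_service": 3,
    "orchestrator": 3,
    "litellm": 2,
    "agent_runtimes": 10,  # ceiling enforced by Governor host capacity tracking
}
HEADROOM_GB = 4            # reserved for OS and transient spikes
TOTAL_GB = 30

def fits(budgets: dict, headroom: int, total: int) -> bool:
    """True when declared budgets plus headroom fit within host RAM."""
    return sum(budgets.values()) + headroom <= total

print(fits(BUDGETS_GB, HEADROOM_GB, TOTAL_GB))  # True: 26 + 4 <= 30
```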
Changelog¶
| Version | Date | Author | Changes |
|---|---|---|---|
| 4.2.0 | 2026-03-29 | BM | Initial draft. Sections 1-5: Purpose, Data Schema, System Module Index, Subject Index, Risk Register. |
6. Dependency Order¶
6.1 Ticket Dependency Graph¶
flowchart TD
subgraph EPIC-CB ["EPIC-CB: Control Bus"]
T1001["TKT-1001<br/>SSE Push Channels"]
T1002["TKT-1002<br/>Agent Registration"]
T1003["TKT-1003<br/>Intervention Endpoints"]
T1004["TKT-1004<br/>Task Orchestration"]
T1005["TKT-1005<br/>Host Capacity"]
T1006["TKT-1006<br/>Agent Spawn"]
T1007["TKT-1007<br/>Cost Tracking"]
T1008["TKT-1008<br/>Governance Events"]
end
subgraph EPIC-ODM ["EPIC-ODM: Schema + Skeleton"]
T1010["TKT-1010<br/>Deploy DDL"]
T1011["TKT-1011<br/>Walking Skeleton"]
T1012["TKT-1012<br/>Seed Data"]
end
subgraph EPIC-GOV ["EPIC-GOV: Governance"]
T1020["TKT-1020<br/>Escalation Path"]
T1021["TKT-1021<br/>Approval Workflow"]
T1022["TKT-1022<br/>Alerts + Breakers"]
T1023["TKT-1023<br/>Override Records"]
end
subgraph EPIC-UI ["EPIC-UI: Control Center"]
T1030["TKT-1030<br/>UI Shell"]
T1031["TKT-1031<br/>Agent Status Grid"]
T1032["TKT-1032<br/>Task Board"]
T1033["TKT-1033<br/>Alerts Panel"]
T1034["TKT-1034<br/>Cost Summary"]
T1035["TKT-1035<br/>Prompt Composer"]
T1036["TKT-1036<br/>Activity Feed"]
end
subgraph EPIC-KNO ["EPIC-KNO: Knowledge System"]
T1040["TKT-1040<br/>Skill Registry"]
T1041["TKT-1041<br/>Activation Slimming"]
T1042["TKT-1042<br/>Librarian Decomposition"]
T1043["TKT-1043<br/>Source Registry"]
end
subgraph EPIC-INFRA ["EPIC-INFRA: Infrastructure"]
T1050["TKT-1050<br/>Stop Unused Services"]
T1051["TKT-1051<br/>Install Remaining CLI"]
T1052["TKT-1052<br/>Paperclip Migration"]
T1053["TKT-1053<br/>Dashboard Transition"]
end
subgraph EPIC-TEST ["EPIC-TEST: Testing"]
T1060["TKT-1060<br/>pytest Setup"]
T1061["TKT-1061<br/>Playwright Setup"]
T1062["TKT-1062<br/>Behavioral Tests"]
T1063["TKT-1063<br/>Acceptance Tests (4)"]
end
subgraph EPIC-MVP1 ["EPIC-MVP1: MVP1 Prep (see MVP1_PRODUCT_SPEC.md)"]
T1070["TKT-1070<br/>Product Engine Integration"]
T1071["TKT-1071<br/>Billing Webhooks"]
T1072["TKT-1072<br/>Landing Page Reqs"]
end
%% ODM dependencies
T1010 --> T1011
T1010 --> T1012
T1010 --> T1043
T1010 --> T1060
%% Walking skeleton dependencies
T1004 --> T1011
T1012 --> T1011
%% CB internal dependencies
T1001 --> T1003
T1001 --> T1008
T1002 --> T1004
T1002 --> T1005
T1004 --> T1005
T1005 --> T1006
T1004 --> T1007
%% Governance dependencies
T1004 --> T1020
T1010 --> T1020
T1020 --> T1021
T1008 --> T1021
T1008 --> T1022
T1020 --> T1023
T1021 --> T1023
%% UI dependencies
T1030 --> T1031
T1030 --> T1032
T1030 --> T1033
T1030 --> T1034
T1030 --> T1035
T1030 --> T1036
T1002 --> T1031
T1004 --> T1032
T1008 --> T1033
T1007 --> T1034
T1001 --> T1035
T1004 --> T1036
%% Knowledge dependencies
T1040 --> T1041
%% Infrastructure dependencies
T1011 --> T1052
T1030 --> T1053
%% Test dependencies
T1030 --> T1061
T1011 --> T1062
T1060 --> T1062
T1011 --> T1063
T1020 --> T1063
T1060 --> T1063
%% MVP1 dependencies
T1011 --> T1070
T1051 --> T1071
%% Styling: critical path in bold
style T1010 fill:#e74c3c,color:#fff,stroke:#c0392b
style T1004 fill:#e74c3c,color:#fff,stroke:#c0392b
style T1011 fill:#e74c3c,color:#fff,stroke:#c0392b
style T1020 fill:#e74c3c,color:#fff,stroke:#c0392b
style T1063 fill:#e74c3c,color:#fff,stroke:#c0392b
style T1060 fill:#e74c3c,color:#fff,stroke:#c0392b
6.2 Critical Path¶
The critical path is the longest chain of dependent tickets; it determines the minimum build time. Three candidate paths are compared below.
Path A -- Schema to Acceptance
TKT-1010 (DDL, 0.5d)
-> TKT-1012 (Seed, 0.5d)
-> TKT-1011 (Skeleton, 3d)
-> TKT-1063 (Acceptance Tests, 2d)
= 6.0 days minimum
But TKT-1011 also depends on TKT-1004, which depends on TKT-1002. Factoring in the CB chain:
Path B -- Bus to Acceptance (true critical path)
TKT-1002 (Agent Registration, ~2d)
-> TKT-1004 (Task Orchestration, ~2d)
-> TKT-1011 (Walking Skeleton, 3d)
-> TKT-1020 (Escalation, 2d)
-> TKT-1063 (Acceptance Tests, 2d)
= 11.0 days minimum
Path C -- Bus to Governance
TKT-1002 (Registration)
-> TKT-1004 (Tasks)
-> TKT-1020 (Escalation)
-> TKT-1021 (Approval)
-> TKT-1023 (Overrides)
= ~8.0 days
The true critical path runs through Path B: from agent registration through task orchestration, the walking skeleton, escalation, and finally the acceptance tests. This chain spans all 5 phases and cannot be shortened without reducing scope.
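The Path B arithmetic can be checked mechanically with a longest-path search over the ticket DAG. This sketch uses only the ticket ids, edges, and day estimates quoted in this section:

```python
from collections import defaultdict
from functools import lru_cache

# Durations (days) and edges for the Path A/B subgraph, as quoted in 6.2.
DURATION = {"TKT-1002": 2.0, "TKT-1004": 2.0, "TKT-1010": 0.5, "TKT-1012": 0.5,
            "TKT-1011": 3.0, "TKT-1020": 2.0, "TKT-1063": 2.0}
EDGES = [("TKT-1010", "TKT-1012"), ("TKT-1012", "TKT-1011"),
         ("TKT-1002", "TKT-1004"), ("TKT-1004", "TKT-1011"),
         ("TKT-1011", "TKT-1020"), ("TKT-1020", "TKT-1063"),
         ("TKT-1011", "TKT-1063")]

def critical_path(duration, edges):
    """Return (total_days, path) for the longest chain through the DAG."""
    succ = defaultdict(list)
    for a, b in edges:
        succ[a].append(b)

    @lru_cache(maxsize=None)
    def longest(t):
        # Longest chain starting at t: t itself plus the best successor chain.
        tails = [longest(n) for n in succ[t]]
        if not tails:
            return (duration[t], (t,))
        d, p = max(tails)
        return (duration[t] + d, (t,) + p)

    return max(longest(t) for t in duration)

days, path = critical_path(DURATION, EDGES)
print(days, " -> ".join(path))
# 11.0 TKT-1002 -> TKT-1004 -> TKT-1011 -> TKT-1020 -> TKT-1063
```

The result reproduces Path B: 11.0 days through registration, orchestration, the walking skeleton, escalation, and acceptance tests.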
6.3 Parallel Execution Opportunities¶
The following groups of tickets have no mutual dependencies and can execute simultaneously:
Phase 1 parallel lanes (Days 2-5):
| Lane | Tickets | Assignee |
|---|---|---|
| Lane 1: Bus Core | TKT-1001, TKT-1002, TKT-1003, TKT-1004 | Engineering Brain |
| Lane 2: Schema | TKT-1010, TKT-1012 | Engineering Brain |
| Lane 3: Infrastructure | TKT-1050 | DevOps / BrainMaster |
| Lane 4: Knowledge | TKT-1040 | BrainMaster |
Note: Lanes 1 and 2 share the Engineering Brain assignee, so true parallelism requires either two engineering agents or interleaving.
Phase 2 parallel lanes (Days 4-7):
| Lane | Tickets | Assignee |
|---|---|---|
| Lane 1: Governance | TKT-1020, TKT-1021, TKT-1022, TKT-1023 | Engineering Brain |
| Lane 2: Bus Extended | TKT-1005, TKT-1006, TKT-1007, TKT-1008 | Engineering Brain |
| Lane 3: Knowledge | TKT-1041, TKT-1043 | BrainMaster |
| Lane 4: Tools | TKT-1051 | DevOps |
Phase 3 parallel lanes (Days 6-10):
| Lane | Tickets | Assignee |
|---|---|---|
| Lane 1: UI Shell + Widgets | TKT-1030 then TKT-1031-1036 (all 6 widgets parallel after shell) | Brand Brain |
| Lane 2: E2E Setup | TKT-1061 (after TKT-1030) | Engineering Brain |
Phase 4-5 parallel lanes (Days 8-14):
| Lane | Tickets | Assignee |
|---|---|---|
| Lane 1: Migration | TKT-1052, TKT-1053 | Engineering / BM |
| Lane 2: Knowledge | TKT-1042 | Mac Worker |
| Lane 3: MVP1 (see MVP1_PRODUCT_SPEC.md) | TKT-1070, TKT-1071, TKT-1072 | Engineering / Brand |
| Lane 4: Testing | TKT-1062, TKT-1063 | Engineering Brain |
Maximum parallelism: With 3 agents working simultaneously (Engineering, Brand, BrainMaster), theoretical build time compresses from ~40 ticket-days to approximately 14 calendar days.
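The parallel lanes above can be derived from the dependency graph itself: tickets whose prerequisites are all complete form a "wave" that can run simultaneously. A sketch over a hypothetical subset of the Phase 1 tickets (the edge set here is illustrative, not the full register):

```python
# Sketch: group tickets into waves of mutually independent work.
def parallel_waves(depends_on: dict[str, list[str]]) -> list[list[str]]:
    done: set[str] = set()
    waves: list[list[str]] = []
    remaining = set(depends_on)
    while remaining:
        wave = sorted(t for t in remaining
                      if all(p in done for p in depends_on[t]))
        if not wave:
            raise ValueError("dependency cycle detected")
        waves.append(wave)
        done.update(wave)
        remaining.difference_update(wave)
    return waves

deps = {  # hypothetical Phase 1 subset
    "TKT-1001": [], "TKT-1010": [], "TKT-1050": [], "TKT-1040": [],
    "TKT-1002": ["TKT-1001"], "TKT-1012": ["TKT-1010"],
    "TKT-1004": ["TKT-1002"],
    "TKT-1011": ["TKT-1004", "TKT-1012"],
}
waves = parallel_waves(deps)
for i, wave in enumerate(waves, 1):
    print(f"Wave {i}: {', '.join(wave)}")
```

Wave 1 corresponds to the four Phase 1 lanes; later waves shrink as the graph converges on the walking skeleton, which is why the critical path, not lane count, bounds calendar time.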
7. Data Flow Diagrams¶
7.1 Task Lifecycle¶
flowchart LR
A["Idea<br/>(conversation)"] --> B["Discussion Thread<br/>(type: intake)"]
B --> C["Ticket<br/>(state: open)"]
C --> D["Task<br/>(state: queued)"]
D --> E["Agent Assignment<br/>(task.assigned_to)"]
E --> F["Session<br/>(agent execution context)"]
F --> G["Activity<br/>(work unit)"]
G --> H["Result<br/>(activity_evaluations)"]
H --> I{"Success?"}
I -->|yes| J["Knowledge Object<br/>(if applicable)"]
I -->|no| K["Retry / Escalate"]
K -->|retry| D
K -->|escalate| L["Escalation<br/>(human decision)"]
L --> D
J --> M["Reflection<br/>(hindsight evaluation)"]
M --> N["Knowledge Update<br/>(vault + pgvector)"]
style A fill:#3498db,color:#fff
style C fill:#2ecc71,color:#fff
style G fill:#f39c12,color:#fff
style J fill:#9b59b6,color:#fff
style N fill:#1abc9c,color:#fff
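The retry-or-escalate branch in the lifecycle above can be sketched as a small decision function. Names and the retry limit are illustrative assumptions, not the real ODM API; the blueprint leaves the retry policy configurable:

```python
# Sketch of the Success? -> Retry / Escalate branch in the task lifecycle.
MAX_RETRIES = 3  # assumed policy, not specified by the blueprint

def next_state(success: bool, retries: int) -> str:
    """Decide the task's next state after an activity result."""
    if success:
        return "done"        # result accepted; knowledge capture may follow
    if retries < MAX_RETRIES:
        return "queued"      # retry: task re-enters the queue
    return "escalated"       # hand off to a human decision

print(next_state(False, 3))  # escalated
```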
7.2 Agent Communication¶
flowchart TB
subgraph Agents ["Agent Layer"]
A0["000<br/>Orchestrator"]
A1["001<br/>Governor"]
A2["002<br/>Engineering"]
A3["003<br/>Brand"]
A10["010<br/>Mac Worker"]
end
subgraph Bus ["Control Bus (REST + SSE)"]
direction TB
REST["REST API<br/>POST /tasks<br/>POST /messages<br/>POST /escalations<br/>GET /agents"]
SSE["SSE Push<br/>/events/agent/{id}<br/>/events/founder<br/>/events/ui"]
HB["Heartbeat<br/>POST /heartbeat"]
end
subgraph Storage ["Persistence"]
PG["PostgreSQL 17<br/>(ODM Schema)"]
PGV["pgvector<br/>(embeddings)"]
GIT["Git Repos<br/>(code + docs)"]
end
subgraph UI ["Control Center"]
CC["Widget Grid<br/>(Next.js + shadcn)"]
end
subgraph Founder ["Human"]
SH["Shai<br/>(founder)"]
end
%% Agent -> Bus
A0 & A1 & A2 & A3 & A10 -->|"REST calls"| REST
A0 & A1 & A2 & A3 & A10 -->|"heartbeat (30s)"| HB
%% Bus -> Agent
SSE -->|"task assignments"| A0 & A2 & A3 & A10
SSE -->|"interventions"| A0 & A1
SSE -->|"cost alerts"| A1
%% Bus -> Storage
REST -->|"read/write"| PG
REST -->|"embeddings"| PGV
%% Agents -> Storage
A2 & A3 & A10 -->|"commits"| GIT
%% Bus -> UI
SSE -->|"real-time events"| CC
%% UI -> Founder
CC -->|"dashboard"| SH
SH -->|"decisions, messages"| CC
CC -->|"REST calls"| REST
style Bus fill:#34495e,color:#fff
style PG fill:#2980b9,color:#fff
style CC fill:#8e44ad,color:#fff
7.3 Knowledge Flow¶
flowchart LR
subgraph Sources ["External Sources"]
S1["Anthropic Docs"]
S2["GitHub"]
S3["npm / PyPI"]
S4["MDN / W3C"]
S5["Hugging Face"]
end
subgraph RC ["Research Center"]
SR["Source Registry<br/>(governed list)"]
RE["Research Execution<br/>(agent task)"]
end
subgraph Librarian ["Librarian Process"]
DEC["Decompose<br/>(document -> notes)"]
TAG["Tag + Link<br/>(frontmatter, backlinks)"]
IDX["Index<br/>(searchable catalog)"]
end
subgraph Storage ["Knowledge Storage"]
GIT2["Git Vault<br/>(Obsidian markdown)"]
PG2["PostgreSQL<br/>(knowledge_objects)"]
VEC["pgvector<br/>(embeddings)"]
end
subgraph Retrieval ["Retrieval"]
RAG["RAG Pipeline<br/>(query -> embed -> search)"]
CTX["Context Assembly<br/>(relevant chunks)"]
end
subgraph Execution ["Agent Execution"]
AGT["Agent Session"]
ACT["Activity Output"]
end
subgraph Learning ["Learning Loop"]
HS["Hindsight<br/>(what worked?)"]
RF["Reflection<br/>(why?)"]
UPD["Knowledge Update"]
end
Sources --> SR
SR --> RE
RE --> DEC
DEC --> TAG --> IDX
IDX --> GIT2
IDX --> PG2
PG2 --> VEC
VEC --> RAG
GIT2 --> RAG
RAG --> CTX
CTX --> AGT
AGT --> ACT
ACT --> HS
HS --> RF
RF --> UPD
UPD --> PG2
UPD --> GIT2
style RC fill:#e67e22,color:#fff
style Storage fill:#2980b9,color:#fff
style Retrieval fill:#27ae60,color:#fff
style Learning fill:#8e44ad,color:#fff
7.4 Cost Flow¶
flowchart TD
ACT2["Activity Completes<br/>(tokens_in, tokens_out, model)"] -->|"calculate USD"| CLE["Cost Ledger Entry<br/>(activity_id, cost_usd,<br/>tokens_in, tokens_out)"]
CLE -->|"aggregate"| AGG["Aggregation<br/>(task / ticket / sprint / project)"]
AGG --> GOV{"Governor Check<br/>(Part 7 breakers)"}
GOV -->|"under threshold"| DASH["Dashboard Widget<br/>(Cost Summary)"]
GOV -->|"80% budget"| WARN["Warning Alert<br/>(amber indicator)"]
GOV -->|"90% budget"| CRIT["Critical Alert<br/>(red indicator)"]
GOV -->|"100% budget"| TRIP["Breaker Trips<br/>(pause agent)"]
WARN --> NOTIFY["Founder Notification<br/>(SSE + Alerts Panel)"]
CRIT --> NOTIFY
TRIP --> NOTIFY
TRIP --> PAUSE["Agent Paused<br/>(awaits manual reset)"]
NOTIFY --> DASH
subgraph Thresholds ["Budget Thresholds (Max20 = $200/mo)"]
TH1["Per-task: configurable<br/>(default $10)"]
TH2["Per-sprint: $50"]
TH3["Per-month: $200"]
end
Thresholds -.->|"checked by"| GOV
style CLE fill:#f39c12,color:#fff
style TRIP fill:#e74c3c,color:#fff
style DASH fill:#3498db,color:#fff
style GOV fill:#2c3e50,color:#fff
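The Governor's threshold check in the diagram above reduces to a ratio comparison. A hedged sketch, using the $200/mo budget from the diagram; the function name and status labels are illustrative, not the real Governor API:

```python
# Sketch: map spend against budget to the alert tiers in the cost flow.
def budget_status(spent_usd: float, budget_usd: float) -> str:
    ratio = spent_usd / budget_usd
    if ratio >= 1.0:
        return "breaker_tripped"   # pause agent, await manual reset
    if ratio >= 0.9:
        return "critical"          # red indicator
    if ratio >= 0.8:
        return "warning"           # amber indicator
    return "ok"                    # under threshold

print(budget_status(185.0, 200.0))  # critical (92.5% of monthly budget)
```

The same function applies at each aggregation level (task / sprint / month) with the corresponding threshold from the diagram.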
8. Process Checklists¶
8.1 New Project Setup¶
- Define project in ODM (name, description, topics, start_date, end_date)
- Create Paperclip project (or ODM equivalent if post-migration)
- Assign project orchestrator (agent with orchestrator role)
- Build agent roster (roles needed, agents available, capacity check)
- Create initial ticket set from requirements
- Run System Review (this Part 11 process) on the ticket set
- Review findings: all risks acknowledged, all dependencies mapped
- Approve and begin Phase 0
8.2 Agent Commissioning¶
- Determine role requirements (what skills, what model tier)
- Check host capacity (RAM, CPU, active container count)
- Select or spawn agent (3-digit ID from available range)
- Assign roles and project binding in agent_runtimes table
- Register in Control Bus (POST /agents)
- Load activation file with required skills (skills_on_load)
- Verify heartbeat received by Bus within 30 seconds
- Assign first task and confirm execution
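The heartbeat verification step above (first heartbeat within 30 seconds of registration) can be sketched as a window check. The function and its arguments are illustrative assumptions; the Bus would perform the equivalent query against received POST /heartbeat timestamps:

```python
# Sketch: verify a first heartbeat arrived inside the registration window.
from datetime import datetime, timedelta

REGISTRATION_WINDOW_S = 30  # from the commissioning checklist

def heartbeat_received(registered_at: datetime,
                       heartbeats: list[datetime]) -> bool:
    """True if any heartbeat landed within the window after registration."""
    deadline = registered_at + timedelta(seconds=REGISTRATION_WINDOW_S)
    return any(registered_at <= hb <= deadline for hb in heartbeats)

t0 = datetime(2026, 3, 30, 12, 0, 0)
print(heartbeat_received(t0, [t0 + timedelta(seconds=12)]))  # True
print(heartbeat_received(t0, [t0 + timedelta(seconds=45)]))  # False
```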
8.3 Sprint Start¶
- Review previous sprint retrospective (lessons, blockers)
- Update plan.yaml with new sprint tickets
- Verify agent roster is adequate for sprint workload
- Check host capacity for planned parallel work
- Brief agents via Control Bus with sprint goals
- Set sprint in ODM (start_date, end_date, ticket assignments)
- Confirm all sprint dependencies from prior sprints are met
8.4 Sprint Close¶
- Verify all sprint tickets are done or explicitly deferred
- Run completion tests for all done tickets
- Generate sprint cost report (total USD, per-agent, per-ticket)
- Generate execution report (Part 3 format, Section 14A)
- Write retrospective: what worked, what did not, what to change
- Update knowledge vault with lessons learned
- Archive sprint record, prepare next sprint in ODM
8.5 Technology Evaluation¶
- Identify tool, skill, framework, or library to evaluate
- Check Source Registry for prior evaluations of this technology
- Create knowledge vault note using standard evaluation template
- Research: what it does, relevance to STRUXIO, maturity, cost, risk
- Compare against existing solutions in the stack
- Decision: adopt / evaluate further / defer / reject
- Update Resource Registry with decision and rationale
- If adopted: create installation ticket and update CLI_TOOLS_ASSESSMENT
8.6 Deployment¶
- Pre-deploy: run all tests (pytest + Playwright if applicable)
- Pre-deploy: verify host capacity (RAM > 2GB free, disk > 10GB free)
- Pre-deploy: backup current state (pg_dump + restic snapshot)
- Deploy: apply changes (docker compose up, migration scripts, config)
- Post-deploy: health check all services (Bus /health, UI loads, PG responds)
- Post-deploy: verify Control Bus connectivity (SSE streams active)
- Post-deploy: smoke test core workflows (create task, assign, complete)
- If failure: execute rollback (restore pg_dump, revert containers)
8.7 Recovery¶
- Identify failure scope: agent, service, host, or data
- Check host health (free -h, df -h, top, docker stats)
- Check Docker container status (docker ps -a, docker logs)
- Restart failed containers (docker compose restart)
- If data issue: restore from latest pg_dump (pg_restore)
- If full host failure: restore from Restic backup (B2 daily 03:00 UTC)
- Verify all services healthy post-recovery
- Resume interrupted work from last checkpoint (plan.yaml, session state)
- Create incident record in ODM with root cause and resolution
9. Meta -- This Process as a XIOPro Capability¶
9.1 Reusability¶
This System Review process is not specific to the XIOPro build. It is a reusable capability that every XIOPro-managed project should execute before implementation begins.
When XIOPro manages a product (e.g., the first product -- see MVP1_PRODUCT_SPEC.md), it will:
- Decompose product requirements into a blueprint (using the Librarian process)
- Run System Review on the product blueprint (this Part 11 process)
- Generate tickets from the review findings
- Execute tickets through the agent system (Bus, agents, governance)
- Close with acceptance tests and sprint retrospective
The same applies to any future project: client onboarding, compliance audits, internal tooling. The System Review is the governance gate between "we have a plan" and "we start building."
9.2 ODM Entity¶
The System Review itself should be tracked as an ODM entity:
project_review:
id: uuid
project_id: uuid # FK to projects
review_type: enum
# initial -- before first implementation
# mid_sprint -- checkpoint during execution
# sprint_close -- end-of-sprint review
# major_change -- triggered by scope or architecture change
status: enum
# pending -- review requested
# in_progress -- reviewer is working
# passed -- review complete, no blockers
# failed -- critical issues found, cannot proceed
# needs_fixes -- issues found, fixable before proceeding
reviewer: string # agent ID or user ID
findings: [string] # list of finding summaries
risk_count: int # number of risks identified
risk_high_count: int # number of high/critical risks
module_count: int # modules verified
ticket_count: int # tickets reviewed
dependency_depth: int # longest dependency chain length
created_at: datetime
completed_at: datetime|null
verdict: string|null # free-text summary verdict
9.3 Skill¶
This process should become a registered skill in the Skill Registry (SKILL_REGISTRY.yaml):
skill:
id: system-review
name: "System Review"
description: >
Run comprehensive review of a project blueprint before implementation.
Verifies data schema completeness, maps modules and subjects, compiles
risk register, maps dependencies, creates data flow diagrams, and
generates process checklists. Produces a review report with pass/fail verdict.
triggers:
- /system-review
- /review-project
- /pre-implementation-check
roles: [orchestrator, governor]
model_tier: sonnet # Sonnet sufficient; Opus for ambiguous findings
token_estimate: 15000-25000
steps:
1. Verify data schema completeness (all ODM entities have DDL)
2. Build module index (group tickets by subsystem)
3. Build subject index (cross-reference by concept)
4. Compile risk register (identify gaps, conflicts, capacity issues)
5. Map ticket dependencies (build directed graph)
6. Identify critical path and parallel execution lanes
7. Create data flow diagrams (task, communication, knowledge, cost)
8. Generate process checklists (setup, sprint, deploy, recover)
9. Produce review report with verdict: pass / needs-fixes / fail
outputs:
- Part 11 document (this file)
- Updated risk register
- Dependency graph (Mermaid)
- Process checklists
9.4 Rule¶
Every project must pass System Review before its first implementation ticket begins execution. This is a governance gate, not a suggestion.
Rule definition:
rule:
id: require-system-review
name: "Mandatory System Review"
scope: project
trigger: "First ticket in project moves to state 'active'"
condition: "project_review.status == 'passed' for this project"
action_on_fail: "Block ticket activation. Notify orchestrator and founder."
severity: critical
exceptions: none
rationale: >
Starting implementation without System Review risks building on
incomplete schemas, unresolved dependencies, or unacknowledged risks.
The review takes hours; fixing these issues mid-build takes days.
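The require-system-review gate amounts to a single status check before the first ticket activation. A minimal sketch, assuming an in-memory lookup; real enforcement would query the project_review entity defined in 9.2:

```python
# Sketch of the require-system-review gate (rule above).
def can_activate_ticket(project_id: str,
                        review_status: dict[str, str]) -> bool:
    """Block ticket activation unless the project's System Review passed."""
    return review_status.get(project_id) == "passed"

reviews = {"proj-xiopro": "passed", "proj-mvp1": "needs_fixes"}
print(can_activate_ticket("proj-xiopro", reviews))  # True
print(can_activate_ticket("proj-mvp1", reviews))    # False -- blocked
```

On a False result, the rule's action_on_fail applies: the ticket stays inactive and the orchestrator and founder are notified.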
10. Project Lifecycle Management -- Topic to Product¶
This section defines the end-to-end lifecycle that every XIOPro project follows, from initial idea to production product. It is the process backbone that connects all other parts of the blueprint.
10.1 Lifecycle Overview¶
flowchart LR
Topic["Topic"] --> Research["Research"]
Research --> BP["Blueprint"]
BP --> Review["Review + Readiness"]
Review --> Plan["Work + Test Plan"]
Plan --> Tickets["Tickets"]
Tickets --> Execute["Sprint Execution"]
Execute --> IntTest["Integration Test"]
IntTest --> Product["Product"]
IntTest -->|"Issues found"| Tickets
Review -->|"Not ready"| Research
Every phase has a gate (exit criteria) and a T1P standard (quality bar). No phase is skipped. Iteration loops are expected and healthy.
10.2 Phase Definitions¶
Phase 1: Topic to Project¶
| Attribute | Value |
|---|---|
| Trigger | Idea or discussion identified as a potential project |
| Actions | Create project in ODM (name, description, topics). Assign project orchestrator. Define initial scope and constraints. |
| Gate | Project registered, orchestrator assigned |
| T1P Standard | Clear objective, bounded scope, measurable success criteria |
| Estimated Time | 1-2 hours |
Phase 2: Research¶
| Attribute | Value |
|---|---|
| Trigger | Project created |
| Actions | Research Center scans relevant sources. Domain research (competitors, standards, technologies). Feasibility assessment. Multiple research threads possible (parallel). |
| Gate | Research outputs reviewed, key decisions documented |
| T1P Standard | Evidence-based decisions, source lineage, evaluation records |
| Estimated Time | 2-4 hours per research thread |
Phase 3: Blueprint¶
| Attribute | Value |
|---|---|
| Trigger | Research complete, direction decided |
| Actions | Create project blueprint (using XIOPro BP as template). Define architecture, data model, components. Librarian decomposes into knowledge notes. |
| Gate | Blueprint complete, all sections covered |
| T1P Standard | ODM entities defined, data schema written, module index complete |
| Estimated Time | 4-8 hours |
Phase 4: Review and Readiness¶
| Attribute | Value |
|---|---|
| Trigger | Blueprint draft complete |
| Actions | Internal review (GO scans for gaps, consistency). External review (send to ChatGPT, Gemini, NotebookLM). System Review process (Part 11 checklists, Sections 1-9). Build readiness evaluation. |
| Gate | Reviews complete, all critical findings addressed |
| T1P Standard | 3+ external reviews, risk register, dependency map, ER diagram |
| Estimated Time | 2-4 hours (reviews run in parallel) |
Build Readiness Checklist:
| Check | Criterion |
|---|---|
| Data schema complete | DDL written and validated |
| All entities defined | Every object has properties, lifecycle, relationships |
| Risk register complete | 15+ risks with mitigations |
| Dependencies mapped | Critical path identified |
| Test strategy defined | Test layers for each output type |
| Review findings addressed | All critical items fixed |
| Ticket coverage verified | Every module has implementing tickets |
Phase 5: Work and Test Plan¶
| Attribute | Value |
|---|---|
| Trigger | Build readiness gate passed |
| Actions | Generate tickets from blueprint (automated from Part 12 template). Estimate effort using XIOPro Time Database (not human estimates). Create sprint plan with dependency ordering. Define test plan per ticket. |
| Gate | All tickets written with completion tests |
| T1P Standard | Every ticket has: plan, completion test, review requirement |
| Estimated Time | 1-2 hours |
Phase 6: Sprint Execution¶
| Attribute | Value |
|---|---|
| Trigger | Tickets created and prioritized |
| Actions | Agents pick up tickets from Bus. Execute with review gates (code: test, UI: screenshot, doc: validation). Continuous Paperclip sync. Real-time progress in Control Center. |
| Gate | All sprint tickets pass completion tests |
| T1P Standard | Every output reviewed and tested per type |
| Sprint Duration | Hours, not weeks. Typically 2-8 hours per sprint. |
| Estimated Time | 4-12 hours per sprint |
Phase 7: Integration Test¶
| Attribute | Value |
|---|---|
| Trigger | Sprint complete |
| Actions | End-to-end test (walking skeleton pattern). Cross-component integration verification. Performance baseline. Security scan. |
| Gate | All integration tests pass |
| T1P Standard | Walking skeleton proven, no regressions |
| Estimated Time | 1-2 hours |
Phase 8: Product¶
| Attribute | Value |
|---|---|
| Trigger | Integration tests pass |
| Actions | Deploy to production. Verify health endpoints. Update documentation. Sprint retrospective. |
| Gate | Product live and monitored |
| T1P Standard | Deployment checklist complete, monitoring active |
| Estimated Time | 1-2 hours |
10.3 Agile Principles (XIOPro-Calibrated)¶
| Principle | XIOPro Interpretation |
|---|---|
| Sprint duration | Hours, not weeks |
| Iteration speed | Multiple sprints per day possible |
| Feedback loops | External review + internal testing + user feedback |
| Continuous | No waterfall phases -- iterate constantly |
| Human calibration | Build XIOPro Time Database from actual execution data |
10.4 T1P Standards Discovery¶
XIOPro discovers and applies T1P (Top 1 Percent) standards through the Research Center, scanning industry best practices for each lifecycle phase.
Sources:
- awesome-agentic-patterns (nibzard)
- Software engineering best practices
- ISO/IEC 25010 (software quality model)
- XIOPro's own execution history
Standards by Phase:
| Phase | T1P Standards |
|---|---|
| Blueprints | 12-part structure minimum. ODM with lifecycle states. Risk register with mitigations. Dependency map. External review by 3+ LLMs. |
| Build Readiness | DDL must run without error. Walking skeleton acceptance scenarios defined. Every module has implementing tickets. Review findings addressed. |
| Work Plans | Every ticket has completion test. Dependencies mapped and ordered. Agent time estimates (not human). Sprint duration in hours. |
| Execution | Review gate per output type. Playwright screenshot for UI. API test suite for endpoints. Walking skeleton re-run after integration. |
10.5 XIOPro Time Database¶
The XIOPro Time Database replaces human time estimates with calibrated agent execution benchmarks. It grows with every project XIOPro runs.
Schema:
table: execution_time_benchmarks
columns:
- task_type: "api_endpoint | ui_widget | research | blueprint | document | migration | test_suite"
- complexity: "low | medium | high"
- model_used: "opus | sonnet | haiku"
- estimated_human_hours: float
- actual_agent_minutes: float
- acceleration_ratio: float # human_hours * 60 / agent_minutes
- sample_count: int
Seed Data (from XIOPro Day 1 -- 13 days estimated, ~2.5 hours actual):
| Task Type | Complexity | Model | Human Est (h) | Agent Actual (min) | Ratio | Samples |
|---|---|---|---|---|---|---|
| api_endpoint | medium | sonnet | 16 | 5 | 192x | 8 |
| ui_widget | medium | sonnet | 24 | 8 | 180x | 6 |
| blueprint_section | high | opus | 480 | 20 | 1440x | 12 |
| research | medium | opus | 240 | 10 | 1440x | 10 |
| migration | medium | sonnet | 480 | 10 | 2880x | 1 |
| test_suite | medium | sonnet | 240 | 8 | 1800x | 4 |
This database is the basis for all XIOPro project estimation. Human estimates are recorded for comparison but never used for planning.
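A sketch of how planning would consume these benchmarks. The lookup key and the refuse-to-guess fallback are assumptions, not a specified algorithm; the point is that estimates come from agent actuals, never human hours:

```python
# Sketch: estimate sprint effort from the execution_time_benchmarks seed data.
BENCHMARKS = {  # (task_type, complexity, model) -> actual_agent_minutes
    ("api_endpoint", "medium", "sonnet"): 5.0,
    ("ui_widget", "medium", "sonnet"): 8.0,
    ("test_suite", "medium", "sonnet"): 8.0,
}

def estimate_minutes(task_type: str, complexity: str, model: str) -> float:
    """Plan with agent benchmarks; never fall back to human estimates."""
    key = (task_type, complexity, model)
    if key not in BENCHMARKS:
        raise KeyError(f"no benchmark for {key}; run a calibration task first")
    return BENCHMARKS[key]

# A sprint of 4 endpoints + 2 widgets: 4*5 + 2*8 = 36 agent-minutes
sprint = (4 * estimate_minutes("api_endpoint", "medium", "sonnet")
          + 2 * estimate_minutes("ui_widget", "medium", "sonnet"))
print(sprint)  # 36.0
```

As sample_count grows, the stored actual_agent_minutes would become a running average rather than a single observation.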
10.6 ODM Connection¶
The project lifecycle phase is tracked on the Project entity:
project:
lifecycle_phase: enum
# topic | research | blueprint | review | planning | execution | integration | production | maintenance
This field is distinct from the three-dimensional state model (status/state/status_state). The state model tracks workflow position; lifecycle_phase tracks which lifecycle gate the project has reached.
Transition rules:
- lifecycle_phase advances when the gate criteria for the current phase are met
- lifecycle_phase can regress (e.g., review -> research when readiness check fails)
- Only the project orchestrator or system master can advance lifecycle_phase
- Every transition is logged in the project's activity history
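A sketch of lifecycle_phase transition validation. The phase order comes from the enum above; the single permitted regression shown (review to research) is the example the rules give, and any other regression would need its own explicit rule:

```python
# Sketch: validate lifecycle_phase transitions for the Project entity.
PHASES = ["topic", "research", "blueprint", "review", "planning",
          "execution", "integration", "production", "maintenance"]
REGRESSIONS = {("review", "research")}  # readiness check failed

def valid_transition(current: str, target: str) -> bool:
    """Allow advancing exactly one phase, or an explicit regression."""
    if (current, target) in REGRESSIONS:
        return True
    return PHASES.index(target) == PHASES.index(current) + 1

print(valid_transition("review", "planning"))   # True -- gate passed
print(valid_transition("review", "research"))   # True -- regression
print(valid_transition("topic", "blueprint"))   # False -- cannot skip a gate
```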
11. Project Template Architecture¶
XIOPro is a multi-template project factory. Each template shares the same engine but applies different skills, expertise, and potentially additional engines or governors.
11.1 Template Structure¶
Every project template consists of:
- Core (shared by all templates): Control Bus, ODM, Governance, UI, Knowledge System, Librarian
- Skills: Template-specific skill sets from the Skill Registry
- Agent Roles: Template-specific role assignments
- Additional Engine (optional): Template-specific processing (e.g., ISO engine for compliance)
- Additional Governor (optional): Template-specific governance rules
- Lifecycle: Same 8-phase lifecycle (Topic to Product), customized gates per template
11.2 Initial Templates¶
IT Project Template (Active -- being built now)¶
- Purpose: Build software products
- Skills: coding, testing, architecture, deployment, debugging, TDD
- Agent Roles: orchestrator, engineering specialist, devops specialist
- Engine: Standard XIOPro engine
- Output: Software products, APIs, UIs, infrastructure
Marketing Template (Planned)¶
- Purpose: Marketing campaigns and go-to-market
- Skills: SEO, ad copy, competitor analysis, campaign planning, lead research
- Agent Roles: orchestrator, marketing specialist, content specialist
- Additional: Competitive Ads analysis, Lead Research skills
- Output: Campaigns, landing pages, ad copy, market analysis
Content Creation Template (Planned)¶
- Purpose: Create and manage content
- Skills: writing, brand voice (Voice DNA), research, editing, citations
- Agent Roles: orchestrator, content specialist, editor
- Additional: NotebookLM for synthesis, Voice DNA for brand consistency
- Output: Articles, documentation, presentations, podcasts, educational material
Knowledge Expert Template (Planned)¶
- Purpose: Domain expertise and knowledge management
- Skills: research, synthesis, classification, evaluation, teaching
- Agent Roles: orchestrator, research specialist, domain expert
- Additional: Research Center automation, Librarian deep integration
- Output: Knowledge bases, evaluations, training materials, expert consultations
Knowledge Expert Domains (examples)¶
- ISO 19650: Parts 1-6, national annexes, implementation guidance
- BIM: IFC, openBIM, model coordination, clash detection, LOD/LOI
- Construction Industry Players:
- Project Initiator / Owner / Developer
- Design Partners (architects, structural engineers, MEP engineers)
- General Contractor
- Subcontractors (electrical, plumbing, HVAC, concrete, steel)
- Inspectors and quality assessors
- Quantity surveyors
- Project managers and BIM managers
11.3 Template Registry¶
template_registry:
location: "struxio-logic/templates/"
format: "YAML template definition + skill list + role assignments"
template_definition:
id: string
name: string
description: string
status: active | planned | deprecated
core_skills: [skill_id] # from SKILL_REGISTRY
additional_skills: [skill_id] # template-specific
agent_roles:
- role: string
skills_on_load: [skill_id]
min_model: string
additional_engine: string|null # e.g., "iso19650-engine"
additional_governor: string|null # e.g., "compliance-governor"
lifecycle_customizations:
research_sources: [source_id] # template-specific research sources
review_criteria: [string] # additional review gates
test_requirements: [string] # template-specific tests
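Registering a template implies validating it against this schema. A minimal sketch, assuming the template arrives as a parsed YAML dict; the field names mirror the definition above, while the checks themselves are an illustrative subset:

```python
# Sketch: validate a template definition against the registry schema.
REQUIRED = {"id", "name", "description", "status", "core_skills",
            "additional_skills", "agent_roles"}
VALID_STATUS = {"active", "planned", "deprecated"}

def validate_template(tpl: dict) -> list[str]:
    """Return a list of problems; an empty list means the template is valid."""
    problems = [f"missing field: {f}" for f in sorted(REQUIRED - tpl.keys())]
    if tpl.get("status") not in VALID_STATUS:
        problems.append(f"bad status: {tpl.get('status')!r}")
    return problems

it_template = {
    "id": "it-project", "name": "IT Project",
    "description": "Build software products", "status": "active",
    "core_skills": ["coding", "testing"], "additional_skills": [],
    "agent_roles": [{"role": "orchestrator"}],
}
print(validate_template(it_template))  # []
```

A real registry would also verify that every skill_id resolves against SKILL_REGISTRY.yaml before activation.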
11.4 Creating a New Template¶
- Define template in YAML (skills, roles, engines)
- Register in template_registry
- Create project using template -- auto-assigns skills and roles
- Project lifecycle applies with template-specific customizations
11.5 Rule¶
All templates share the same XIOPro engine and lifecycle. The differentiation is in skills, roles, and domain expertise -- not in the core platform. This ensures consistency across all project types.
12. Blueprint Part Numbering: Specification vs Operational¶
Parts 1-8 are specification. Parts 9-14 are operational. This separation is intentional. No renumbering needed.
| Range | Nature | Parts |
|---|---|---|
| Parts 1-8 | Specification — what the system is | Foundations, Architecture, ODM, Agent System, Knowledge System, UI, Governance, Infrastructure |
| Parts 9-14 | Operational — how the system runs | Project Templates, Swarm Architecture, System Review, Work Plan, Execution Log, Ticket Register |
The spec/operational boundary is a structural design decision, not an accident of growth. Renumbering would destroy the meaning of the boundary and break all existing cross-references.
Changelog¶
| Date | Change |
|---|---|
| 2026-03-29 | Part 9 created. System Review as verification gate: ER diagram audit, module index (25 modules across 8 epics), subject index (50+ cross-referenced entries), risk register (15-20 risks with severity and mitigation), dependency order (37 tickets mapped with critical path through TKT-1002 -> 1004 -> 1011 -> 1020 -> 1063 at 11 days), data flow diagrams (4: task lifecycle, agent communication, knowledge flow, cost flow), process checklists (7: project setup, agent commissioning, sprint start, sprint close, technology evaluation, deployment, recovery), meta-capability definition (ODM entity, skill, governance rule). |
| 2026-03-29 | Added Section 10: Project Lifecycle Management -- Topic to Product. 8-phase lifecycle with gates and T1P standards. T1P Standards Discovery process. XIOPro Time Database (agent execution benchmarks seeded from Day 1 data). ODM lifecycle_phase enum for Project entity. |
| 2026-03-29 | Added Section 11: Project Template Architecture. Multi-template project factory design. 4 initial templates (IT Project active, Marketing/Content/Knowledge Expert planned). Template Registry YAML schema. Knowledge Expert domains include ISO 19650, BIM, Construction Industry Players. |
| 2026-03-30 | N17: Added Section 12 — Blueprint Part Numbering: Specification vs Operational. Documents that Parts 1-8 are specification, Parts 9-14 are operational. Separation is intentional; no renumbering needed. |