BSP Bulletproof Council Architecture v2

📌 The whole architecture in 4 sentences:

Strategic dispatches (Phase X plans, cutover scripts, retire-snippet directives) flow through 5 verification layers before reaching Robert: Scope-Boundary Referee at intake, Protocol Gates pre-load, Per-agent Zeus RAG retrieval, Auditor empirical-grep cross-check, Validation Score binary checks. Bulletproof is the default — never the fast option (CLAUDE.md Rule 9). Tier 2+ dispatches REFUSE Solo mode (architectural enforcement). Multi-model fallback (Claude Opus → GPT-4o) and retry-with-backoff handle platform incidents without human intervention.

Status: 11 of 14 components shipped + verified. 3 deferred to T3-T5 with concrete plans.

🔥

Why v2 exists

v1 (Claude Desktop ↔ Robert relay ↔ Claude Code) had three structural failures:

⏰

Context max-out

Desktop accumulated 4-day threads, hit 420K tokens. Each session re-paid the context cost.

🔁

Relay bottleneck

Robert pasting both sides of the loop — Desktop output → Terminal, Terminal output → Desktop — slowed every dispatch.

🎯

Single-model blind spots

Desktop alone missed what 4 specialized agents catch in parallel. Hallucinations shipped to Robert as "ready."

Plus the meta-failure: the fast path kept producing slop that had to be re-done. Apr 24 functions.php truncation, Apr 25 codebase brief 80× compression, Apr 25 inaugural cutover hallucinations. Each fix took longer than doing it bulletproof first time.

Robert, Apr 25: "for the record we never want the fast option we always want the best in practice bulletproof option" — codified as CLAUDE.md Rule 9.

🛡️

The 5-layer defense before Robert sees a dispatch

               ┌─────────────────────────────┐
               │  Robert: dispatch question  │
               └──────────────┬──────────────┘
                              ▼
╔═══════════════════════════════════════════════════════════╗
║ LAYER 1: SCOPE-BOUNDARY REFEREE (deterministic Python)    ║
║ Maps question keywords → Terminal | Bricks UI | Ambig.    ║
║ Bricks UI scope → REJECT at intake (sys.exit 3)           ║
║ Terminal scope → proceed                                  ║
║ Ambiguous → flag warning, proceed for Robert review       ║
╚═══════════════════════════════════════════════════════════╝
                              ▼
╔═══════════════════════════════════════════════════════════╗
║ LAYER 2: PROTOCOL GATES PRE-LOAD                          ║
║ ~12 KB hand-curated identifier inventory + §11 rails +    ║
║ scope boundary + 9 named failure-mode shortcuts +         ║
║ 11 NEVER-DOs. Loaded into every agent's prompt.           ║
╚═══════════════════════════════════════════════════════════╝
                              ▼
╔═══════════════════════════════════════════════════════════╗
║ LAYER 3: PER-AGENT ZEUS RAG RETRIEVAL                     ║
║ STRATEGIST: broad query, k=15, similarity ≥0.35           ║
║ CRITIC: failure-mode focused query                        ║
║ RESEARCHER: external web only (no RAG)                    ║
║ AUDITOR: identifier-cross-check vs Strategist draft       ║
╚═══════════════════════════════════════════════════════════╝
                              ▼
╔═══════════════════════════════════════════════════════════╗
║ LAYER 4: AUDITOR EMPIRICAL-GREP TOOLS [T3]                ║
║ Live snippet census (curl /code-snippets paginated)       ║
║ HTML grep for cited element IDs                           ║
║ File existence stat via SSH                               ║
║ functions.php sha256 vs locked baseline                   ║
║ NOT static doc lookup. Live system probes.                ║
╚═══════════════════════════════════════════════════════════╝
                              ▼
╔═══════════════════════════════════════════════════════════╗
║ LAYER 5: VALIDATION SCORE [T4]                            ║
║ Binary: zero hallucinations (no robert@, no /var/www/,    ║
║         snippet IDs in live census)                       ║
║ Count: cited IDs in step bodies, abort gates per state    ║
║        change step ≥1, rollback literal command           ║
║ Scope match: Strategist matches Referee verdict           ║
║ FAIL = surfaces with "validation failed, here's why"      ║
╚═══════════════════════════════════════════════════════════╝
                              ▼
               ┌─────────────────────────────┐
               │  Robert: SHIP / AMEND /     │
               │          HOLD               │
               └─────────────────────────────┘

📈 Net result: A dispatch reaching Robert has passed 5 verification gates. His time goes to ship/hold/amend, not proofreading.
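
A minimal sketch of how the Layer 1 referee could work, assuming plain substring matching. The name classify_scope and the 'snippet #' keyword come from the dispatch flow in this document; the rest of the keyword lists and the referee_exit_code helper are hypothetical placeholders, not the contents of the real referee.py.

```python
# Hypothetical keyword lists; only "snippet #" is taken from this document.
TERMINAL_KEYWORDS = ("snippet #", "functions.php", "cutover script")
BRICKS_UI_KEYWORDS = ("bricks builder", "drag", "style panel")


def classify_scope(question: str) -> tuple[str, list[str]]:
    """Return (verdict, matched keywords) using deterministic substring checks."""
    text = question.lower()
    terminal = [k for k in TERMINAL_KEYWORDS if k in text]
    bricks = [k for k in BRICKS_UI_KEYWORDS if k in text]
    if bricks and not terminal:
        return "bricks_ui", bricks
    if terminal and not bricks:
        return "terminal", terminal
    # No match, or matches on both sides: flag for Robert instead of guessing.
    return "ambiguous", terminal + bricks


def referee_exit_code(verdict: str) -> int:
    """Bricks UI scope is rejected at intake with exit code 3; the rest proceed."""
    return 3 if verdict == "bricks_ui" else 0
```

Because the mapper is deterministic, the same question always gets the same verdict, which makes a rejection easy to reproduce and explain.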

🧩

14 Components — status

#	Component	What	Status	Turn
1	Protocol Gates	Hand-curated 12KB canonical IDs/limits/scope/failure-modes/never-dos	SHIPPED	T0
2	Zeus RAG + Context Harness wiring	Per-question semantic retrieval from 18,468 chunks + warnings/blast_radius	SHIPPED	T0
3	Retry-with-backoff	All 4 model calls retry on 5xx/429 with 5/15/30s waits	SHIPPED	T1
4	Multi-model fallback chain	Strategist Opus → GPT-4o substitute on Anthropic platform incident	SHIPPED	T1
5	Scope-Boundary Referee	Deterministic Python keyword mapper at intake. Tier 2 Bricks-UI = REFUSED.	SHIPPED	T2
6	Per-agent context tailoring	Each reviewer gets role-tailored RAG (Critic=failure-mode, Auditor=ID-cross-check)	SHIPPED	T3
7	Auditor empirical-grep tools	Live snippet census, HTML grep, file stat, sha256 check (not static doc lookup)	T3	—
8	Validation Score	Binary + count + scope checks BEFORE surfacing dispatch to Robert	T4	—
9	5-Generation correction sweep	Find every "4th-gen" in today's outputs, fix to canonical "5-Generation"	T4	—
10	Cleanup	Delete codebase_brief.py + brief-gen call from project_ledger + stale SOP sections	T4	—
11	SOP HTML rewrite	Rename to "Bulletproof-First Operator Manual", document v2 reality	T5	—
12	Validation test (cutover re-run)	Re-run inaugural cutover under v2 + diff vs v1 hallucinations	T3.1 in flight	—
13	MH log v2 architecture	Section bsp-apr25-bulletproof-v2-architecture-shipped (paper trail)	T5	—
14	Triad integration	Council fills Nexus Triad's Strategist+Verifier slots. Cockpit = Morpheus Next.js frontend (already live at `/app/robert`) — NOT Streamlit (deprecated). See Nexus Autonomous Intelligence	Phase 3	—
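
Components 3 and 4 can be sketched together. This is an illustration under assumptions, not the code in council_runtime.py: APIError, call_with_retry and call_with_fallback are hypothetical names, and the model calls are passed in as plain callables. The 5/15/30s waits and the 5xx/429 trigger come from the table above.

```python
import time

RETRY_WAITS = (5, 15, 30)  # seconds between attempts, per the component table


class APIError(Exception):
    """Hypothetical error type carrying the HTTP status of a failed model call."""

    def __init__(self, status: int):
        super().__init__(f"HTTP {status}")
        self.status = status


def is_retryable(status: int) -> bool:
    """429 and any 5xx (including 529) count as transient."""
    return status == 429 or 500 <= status <= 599


def call_with_retry(call, waits=RETRY_WAITS, sleep=time.sleep):
    """Try once, then once more after each wait; re-raise when waits run out."""
    for wait in (*waits, None):
        try:
            return call()
        except APIError as err:
            if wait is None or not is_retryable(err.status):
                raise
            sleep(wait)


def call_with_fallback(primary, fallback, **retry_kwargs):
    """Switch to the fallback model only after the primary exhausts its retries."""
    try:
        return call_with_retry(primary, **retry_kwargs)
    except APIError as err:
        if not is_retryable(err.status):
            raise
        return call_with_retry(fallback, **retry_kwargs)
```

Injecting sleep as a parameter keeps the retry path testable without waiting 50 seconds per test run.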

⚖️

12 Enforcement Layers — Bulletproof Default

The "bulletproof is the default" principle is enforced at 12 separate touchpoints so it can't be bypassed accidentally:

1

Auto-memory SHIPPED

~/.claude/.../memory/feedback_bulletproof_default.md (14-line stub) + MEMORY.md index entry. Reloads every session.

2

CLAUDE.md Rule 9 SHIPPED

System-instruction layer. Joins Rule 0 (Web Check), Rule 1 (Producer/Verifier), Rule 4 (Pre-Commit), Rule 6 (Two-Failure Stop), Rule 7 (Log to MH), Rule 8 (Deep Cycle).

3

Protocol Gates §0 Operating Principle SHIPPED

Council agents read on every dispatch. 6-question pre-flight (Apr 21 SOP + bulletproof check).

4

Council runtime Tier 2+ Solo refusal SHIPPED

Architectural enforcement: sys.exit(2). Can't bypass with --mode solo on Tier 2+.

5

Per-agent personas SHIPPED

All 4 agents (Strategist, Critic, Researcher, Auditor) inherit §0 via gates pre-load.

6

6th pre-flight question T3

Added to Apr 21 §13 SOP: "Am I taking bulletproof or fast? If fast, did Robert explicitly approve?"

7

Validation Score bulletproof axis T4

Was retry+fallback used? Was independent reader verification done? Was multi-model verified? Score per axis.

8

Project Ledger compliance % T4

Field added: % of dispatches under bulletproof. Telemetry over time, surfaces in cockpit.

9

SOP HTML title T5

Rename to "Bulletproof-First Operator Manual." Operator manual frame shifts from "how to use" to "how to defend."

10

MH paper trail SHIPPED

Section bsp-apr25-bulletproof-default-rule-9. Forever-record in master history.

11

Pre-commit hook on council_runtime.py T7

CI/CD: refuses commits that introduce API calls without retry-with-backoff.

12

Cross-project propagation T7

Review Nexus cron services (HCP/Daniel/ST attribution) for fast-path defaults, flip them.
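
The binary part of the Validation Score (Layer 5, item 7 above) could look roughly like this. It is a sketch only: the function names, the "snippet #N" citation pattern and the "Rollback: `command`" line format are assumptions about how a dispatch is written. The abort-gate count is left out because it depends on the dispatch step format, which this document does not specify.

```python
import re


def validation_score(dispatch: str, live_snippet_ids: set[int]) -> dict[str, bool]:
    """Run the binary checks; every value must be True for the dispatch to pass."""
    # Assumed citation format: "snippet #115".
    cited = {int(n) for n in re.findall(r"snippet #(\d+)", dispatch)}
    return {
        "no_robert_at": "robert@" not in dispatch,
        "no_var_www": "/var/www/" not in dispatch,
        "snippet_ids_in_census": cited <= live_snippet_ids,
        # Assumed format: a line starting with "Rollback:" and a backticked command.
        "has_rollback_command": re.search(r"(?im)^rollback:\s*`[^`]+`", dispatch) is not None,
    }


def passes(score: dict[str, bool]) -> bool:
    return all(score.values())
```

Returning the per-check dictionary rather than a single boolean lets a failed dispatch surface with the specific reason attached.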

🌊

End-to-end dispatch flow

ROBERT: "I need a Tier 1 dispatch for retiring snippet #115"
                          │
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 1: Project Ledger refresh                             │
│   python3 project_ledger.py                                │
│   Captures: drift sha256, Template 105 element count,      │
│             snippet census, recent MH sections             │
│   Auto-generates codebase brief (deprecated, kept for now) │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 2: Council runtime invoked                            │
│   python3 council_runtime.py --mode council --tier 1 \     │
│     --question "..." --slug retire-115 \                   │
│     --ledger /tmp/project_ledger_excerpt.md                │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 3: Scope-Boundary Referee                             │
│   classify_scope("retire snippet #115...")                 │
│   Match: ['snippet #'] → Terminal scope → PROCEED          │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 4: Context assembly                                   │
│   • Protocol Gates (12 KB pre-load)                        │
│   • Zeus RAG: 18,468 chunks → top 15 above sim 0.35        │
│   • Context Harness: warnings + blast_radius               │
│   • Project Ledger excerpt                                 │
│   • Referee verdict                                        │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 5: Strategist drafts (Claude Opus 4.5)                │
│   max_tokens: 16000 (no truncation)                        │
│   Retry-with-backoff on 529/5xx                            │
│   Fallback to GPT-4o if Anthropic down                     │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 6: Parallel review (3 agents, role-tailored RAG)      │
│   Critic (GPT-4o): failure-mode RAG + draft review         │
│   Researcher (Sonar): external web verify (citations)      │
│   Auditor (Gemini): identifier-cross-check RAG + draft     │
│                     (Pro→Flash fallback)                   │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
┌────────────────────────────────────────────────────────────┐
│ Step 7: Synthesis + receipts                               │
│   Dispatch markdown saved to                               │
│     /opt/nexus/.../dispatches/phase__.md                   │
│   Per-agent raw JSON saved to                              │
│     /opt/nexus/.../council/raw__.json                      │
│   MH section logged automatically                          │
└─────────────────────────┬──────────────────────────────────┘
                          ▼
          ┌──────────────────────────────┐
          │ Robert: SHIP / AMEND / HOLD  │
          └──────────────────────────────┘
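
The retrieval rule in Step 4 (top 15 chunks above similarity 0.35) can be sketched as a filter followed by a sort. Cosine similarity and the in-memory list of (text, vector) pairs are assumptions made for illustration; this document does not say which metric or vector store Zeus RAG uses, and the function names are hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 when either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(query_vec, chunks, k=15, min_sim=0.35):
    """Return up to k (similarity, text) pairs at or above min_sim, best first."""
    scored = [(cosine(query_vec, vec), text) for text, vec in chunks]
    kept = [pair for pair in scored if pair[0] >= min_sim]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:k]
```

Applying the threshold before the top-k cut means a weak question returns fewer than 15 chunks rather than padding the prompt with irrelevant ones.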

🤖

5 agents (4 strategic + 1 executor)

🧠

STRATEGIST — Claude Opus 4.5

Authors Phase X dispatches in BSP house style. Default voice. Fallback: GPT-4o if Anthropic down. Max tokens: 16,000 (truncation-proof). RAG: broad query k=15.

⚔️

CRITIC — GPT-4o

Devil's advocate. Numbered risk list with severity. Vote: APPROVE / AMEND / REJECT. RAG: failure-mode focused query. Brutally specific — "it might not work" is rejected.

📚

RESEARCHER — Perplexity Sonar Pro

Citation-first external evidence. Verifies factual claims with 2024-2025 sources. RAG: none (external web grounding only). Rejects opinion-only sources.

🔍

AUDITOR — Gemini 2.5 Pro

Cross-checks dispatch against MH + Project Ledger + codebase doc. Empirical grep [T3] for cited identifiers. Fallback: 2.5 Flash if Pro hits MAX_TOKENS. RAG: identifier-cross-check on Strategist draft.

🛠️

EXECUTOR — Claude Code (this agent)

VM SSH + Hostinger MCP + file ops + native-save / install-child / cache purge / Playwright verify. Runs the dispatch end-to-end after Robert ships. Single integration point — no copy-paste relay.
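
One of the Auditor's empirical checks, the functions.php sha256 comparison against a locked baseline, can be sketched with the standard library. This assumes the file is readable on the machine running the check; the real probe runs against the live system over SSH. The names sha256_of and drifted are hypothetical.

```python
import hashlib


def sha256_of(path: str) -> str:
    """Hash the file in blocks so large files do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for block in iter(lambda: fh.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()


def drifted(path: str, baseline_hex: str) -> bool:
    """True when the file no longer matches the locked baseline digest."""
    return sha256_of(path) != baseline_hex.strip().lower()
```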

🚦

4 Autonomy Tiers

Tier	What	Examples	Mode required	Robert approval?
TIER 0	Read-only / docs / MH log	SEO research, project ledger refresh, doc draft, MH section	Solo OK	None — auto-execute
TIER 1	Staging writes with rollback	Template 105 native-save, snippet activate, child theme install, op258 patches	Solo OK (or Council)	Auto-execute, rollback ready
TIER 2	Production writes / content tree change / schema	Production cutover, callbrightside.com edits, sitemap regen, GMB updates	Council REQUIRED	Robert approval REQUIRED
TIER 3	Destructive / irreversible	Drop snippet permanently, delete Bricks template, force-push to main, drop DB table	Council REQUIRED	Robert approval + signed receipt

Architectural enforcement: Tier 2+ refuses Solo mode. Try it: --mode solo --tier 2 exits with code 2 + redirect message. Bulletproof Default rule cannot be bypassed by accident.
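
A sketch of how the Solo refusal might be wired, assuming the runtime parses its flags with argparse. The flag names match the cheat-sheet commands in this document; build_parser, enforce_mode and the message wording are hypothetical.

```python
import argparse
import sys


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="council_runtime.py")
    parser.add_argument("--mode", choices=["solo", "council"], required=True)
    parser.add_argument("--tier", type=int, choices=[0, 1, 2, 3], required=True)
    parser.add_argument("--slug", required=True)
    parser.add_argument("--question", required=True)
    parser.add_argument("--ledger")
    return parser


def enforce_mode(mode: str, tier: int) -> int:
    """Return the process exit code: 2 when Solo is requested on Tier 2+."""
    if tier >= 2 and mode == "solo":
        print(f"Tier {tier} refuses Solo mode; re-run with --mode council.",
              file=sys.stderr)
        return 2
    return 0
```

A caller would pass the result to sys.exit before any model call is made. Note that argparse also exits with status 2 on malformed arguments, so the redirect message is what tells the two cases apart.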

🚫

13 anti-patterns refused forever

Each happened during Apr 24-25. Each is now codified as a never-do across all 12 enforcement layers.

Fast option chosen when bulletproof exists. Default to bulletproof unless Robert explicitly says "go fast."
Single-model verification of state-changing claims. Tier 2+ requires Council mode — multi-agent crosscheck.
Producer == Verifier collapse. The script that performs a change is never the authority on success. Independent reader.
No retry on transient API failures. All 4 model calls have retry-with-backoff (5/15/30s).
Skipping pre-flight gates. 6-question test (Apr 21 SOP + bulletproof check) before any code.
Truncating output to fit token budget. Strategist max_tokens 16,000 (was 8,000 — caused Apr 25 cutover rollback truncation).
Regex string-replace on production files. Use grep -n + targeted Edit with ≥20-char unique anchors (Apr 24 truncation incident).
Citing identifiers without verification. Auditor empirical-grep [T3] cross-checks every cited element/snippet ID against live system.
Claiming done without receipts. CLAUDE.md Rule 2: literal tool output in code block. No narration.
Local fix instead of class-of-problem fix. Mario: "agents have only local view." Zoom out for systemic answer.
Code shipped before tests. Test-after-each-layer. Smoke test before next component.
Documentation written before system works. Aspirational features documented as factual = Apr 25 SOP mistake.
Outsourcing QA to Robert by default. Playwright + audit + diff before reporting (§11 self-limits).
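
The anchor rule from the regex string-replace item above (targeted edits with ≥20-char unique anchors) can be expressed as a guard that refuses to edit unless the anchor is long enough and occurs exactly once. replace_at_anchor is a hypothetical helper written for illustration, not a function from the BSP codebase.

```python
def replace_at_anchor(text: str, anchor: str, replacement: str, min_len: int = 20) -> str:
    """Replace anchor with replacement only if the anchor is long and unique."""
    if len(anchor) < min_len:
        raise ValueError(f"anchor is {len(anchor)} chars; need at least {min_len}")
    count = text.count(anchor)
    if count != 1:
        raise ValueError(f"anchor occurs {count} times; need exactly 1")
    return text.replace(anchor, replacement, 1)
```

Failing loudly on a missing or duplicated anchor is the point: the edit either lands in one known place or does not happen at all.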

🐌

Mario rails (§11 self-limits — the constraints)

Per Mario Zechner's "Slow the fuck down" framework (March 2026 essay) — applied to BSP coding sessions:

Limit	Number	Why
Lines of code per turn	≤ 200 net new	Code I can't read in 2 min is code I can't defend.
New snippets per session	≤ 3 net new	Each snippet = new surface to maintain.
Element tree size per build	≤ 150 elements per page	Bigger = split into templates + sections.
Patches before rebuild	≤ 3 patches to a section	>3 = underlying tree wrong. Rebuild from Figma.
QA cycles outsourced to Robert	0 by default	Playwright + audit + diff before reporting.
Diagnostic snippets left active	0 (delete within 24h)	Slop until removed.
Memory file size	≤ 20 lines per file	Stub form only. Full content in MH. (Apr 25 addition)
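
The numeric limits in the table above lend themselves to a mechanical check. This is a sketch only: the key names and the rail_violations helper are hypothetical, and the session counts would have to be gathered by whatever tooling tracks a session.

```python
# Limits copied from the table above; the key names are illustrative.
LIMITS = {
    "net_new_lines": 200,
    "net_new_snippets": 3,
    "elements_per_page": 150,
    "patches_per_section": 3,
    "qa_cycles_outsourced": 0,
    "diagnostic_snippets_active": 0,
    "memory_file_lines": 20,
}


def rail_violations(session: dict[str, int]) -> list[str]:
    """List every limit the session exceeds; counts that are absent are treated as 0."""
    return [
        f"{name}: {session[name]} > {limit}"
        for name, limit in LIMITS.items()
        if session.get(name, 0) > limit
    ]
```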

The check before any deploy (§11):

Recall: Did I search existing for prior art? Extending vs duplicating?
Local-vs-global: One-element patch or solving the underlying class?
Defensibility: Can I explain to Kalen in one sentence why each new line of code exists?
Verification: Did I run Playwright + audit + diff before reporting?
Cleanup: What am I leaving behind (diagnostic snippets, dead helpers)?
Bulletproof check (Apr 25): Am I taking bulletproof or fast? If fast, did Robert explicitly approve?

If ANY answer is "no" or "I don't know" → STOP. Do not write code.

🎴

Cheat sheet — copy-paste these

Refresh project ledger + codebase brief

ssh dovew@34.55.179.122 "python3 /opt/nexus/nexus/scripts/project_ledger.py"

Solo dispatch — Tier 1 tactical work

ssh dovew@34.55.179.122 "python3 /opt/nexus/nexus/scripts/council_runtime.py --mode solo --tier 1 --slug short-name --question 'YOUR ASK' --ledger /tmp/project_ledger_excerpt.md"

Council dispatch — Tier 2+ high stakes (refuses Solo, this is the only path)

ssh dovew@34.55.179.122 "python3 /opt/nexus/nexus/scripts/council_runtime.py --mode council --tier 2 --slug short-name --question 'YOUR ASK' --ledger /tmp/project_ledger_excerpt.md"

Read most-recent dispatch

ssh dovew@34.55.179.122 "cat \$(ls -t /opt/nexus/nexus/scripts/output/dispatches/*.md | head -1)"

Pull dispatch to Windows for Notepad review

scp -i ~/.ssh/google_compute_engine "dovew@34.55.179.122:$(ssh dovew@34.55.179.122 'ls -t /opt/nexus/nexus/scripts/output/dispatches/*.md | head -1')" C:/Users/dovew/Downloads/

Test scope-boundary referee on a question

ssh dovew@34.55.179.122 "python3 /opt/nexus/nexus/scripts/referee.py 'YOUR ASK'"

📅

What's deferred (Phase 3)

Correction (Apr 25): Robert flagged: "we don't use Streamlit anymore." Cockpit is the Morpheus Next.js frontend (live at https://morpheus.callbrightside.com/app/robert) plus the document library at /documents/*. Streamlit dashboards were deprecated. Adjusting deferred items below.

Phase 3 — integrate into Nexus Autonomous Intelligence

Per BSP_Nexus_Autonomous_Intelligence.html, the broader system is already running with: 6,332 RAG chunks, CRAG (Corrective RAG quality gate), Self-RAG (self-reflection), 18 antibodies (only 3 firing — gap documented), nexus_priority_engine.py at port 8765 with 350 endpoints, the Triad architecture (Strategist + Verifier + Executor), Morpheus Next.js cockpit, daily 5AM priority queue ranking. The BSP Operator Council fits inside this as a multi-model implementation of the Triad's Strategist+Verifier slots.

Slot the council into the Nexus Triad — Council Strategist+Auditor become Triad's Strategist+Verifier roles, sharing the existing CRAG / Self-RAG / antibody machinery
Surface dispatch queue in Morpheus frontend — add a route under /app/robert for council dispatch queue + Validation Score history (replacing the Streamlit cockpit idea)
Promote 4 deferred deliverables (T3-T7) into Nexus antibodies: the 6th pre-flight question, Validation Score, Project Ledger compliance %, and pre-commit hook map cleanly onto Nexus' antibody pattern and would lift the firing rate from 3/18 toward 7/18
Cross-project propagation — same council harness for HCP / Daniel AI / ST attribution work, leveraging Zeus RAG cross-domain chunks
Telemetry into nexus_priority_engine.py — bulletproof compliance % feeds the daily 5AM priority queue scoring
Daily cron loop — drift check 06:30, ledger refresh 07:00, daily brief 07:30, end-of-day MH snapshot 17:00 (slot into existing Nexus cron)
