The 60-Second Version
For Kalen and Stephanie. Plain English. No jargon.
📝 Why This Document Exists
This document was created because Kalen was right that the numbers were wrong. It exists as three things at once: a record of what happened, a plan to fix it, and a set of standing protocols so it cannot happen again.
What Happened (Critical)
At 04:07 UTC on April 3, 2026, a single process injected 10,461 rows into titan.jobs. Every row had created_at stamped at the ingestion minute, while scheduled_at spanned November 30, 2020 through April 2, 2026. All rows carried lead_source = "Imported Default Campaign". Of those, 10,444 were marked job_status = completed, with invoice_total summing to $6,380,257.
Every downstream query that filtered by created_at BETWEEN '2026-04-03' AND '2026-04-04' saw six years of historical job data compressed into one fake minute. The math engine's weekly_revenue formula and /api/math/validate-revenue endpoint began returning the phantom number. The story API's revenue narrative inherited it. The Sacred HTML's revenue card inherited it. The causal validator's google_ads_to_revenue chain inherited it.
The Timeline
Proof of Injection
From apr3_phantom_sync_investigation.py output:
Why This Is A Pattern, Not A One-Off
After pinpointing the phantom to nexus_titan_migration.py line 249, a blast-radius search was run against /opt/nexus/titan and /opt/nexus/nexus/scripts for every INSERT into a titan table that does not explicitly set created_at. The grep returned 29 files with the same pattern, including two that write to titan.jobs (the phantom source) and seven that write to tables which are currently flagged as stale in the freshness scan.
This means the bug is not a one-off that can be fixed with a single patch. It is a codebase-wide convention where developers (including me, past me) relied on Postgres defaults for created_at instead of explicitly passing the authoritative timestamp from the source system. Every one of these writes is a future phantom waiting to happen.
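A rough reconstruction of that blast-radius scan, for the record. It assumes the INSERTs are plain SQL strings inside .py files; the regex is illustrative and will miss dynamically built queries, so treat it as a first pass, not proof of absence.

```python
import re
from pathlib import Path

# Flag any INSERT into a titan.* table whose column list omits created_at.
INSERT_RE = re.compile(
    r"INSERT\s+INTO\s+titan\.(\w+)\s*\(([^)]*)\)",
    re.IGNORECASE | re.DOTALL,
)

def scan(root: str) -> list[tuple[str, str]]:
    """Return (file, table) pairs for INSERTs that never set created_at."""
    hits = []
    for path in Path(root).rglob("*.py"):
        text = path.read_text(errors="ignore")
        for match in INSERT_RE.finditer(text):
            table, columns = match.group(1), match.group(2)
            if "created_at" not in columns:
                hits.append((str(path), f"titan.{table}"))
    return hits

for root in ("/opt/nexus/titan", "/opt/nexus/nexus/scripts"):
    for path, table in scan(root):
        print(f"{path}: INSERT without explicit created_at -> {table}")
```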
The Blast Radius Inventory
| Severity | File | Target Table | Why It Matters |
|---|---|---|---|
| Crit | nexus/scripts/nexus_titan_migration.py:249 | titan.jobs | Apr 3 phantom source. 10,461 rows injected. Must fix first. |
| Crit | titan/titan_sync_daemon.py:206 | titan.jobs | Second writer to titan.jobs. Runs as a daemon, not a one-shot. Every write stamps NOW(). |
| Crit | titan/titan_sync_daemon.py:273 | titan.estimates | Estimates carry sold_by_id attribution. Same bug class. |
| Crit | titan/api/portal.py:173 | titan.jobs | Third writer to titan.jobs. Customer portal path. |
| High | titan/api/estimates.py:141 | titan.estimates | Estimate creation endpoint. |
| High | titan/api/estimates.py:261 | titan.invoices | Invoice creation. Already a source of the 71% zero-invoice problem. |
| High | titan/api/threecx.py:98 | titan.call_logs | call_logs is STALE 437 hours. Explains why freshness broke. |
| High | titan/api/vapi_voice.py:157 | titan.voice_calls | Daniel AI call ingestion. |
| High | titan/api/vapi_voice.py:254 | titan.daniel_learning | Daniel RAG learning loop. |
| High | titan/api/sms.py:197 | titan.call_logs | Second writer to call_logs. Stale table. |
| High | titan/api/sms.py:257 | titan.call_logs | Third writer to call_logs. |
| High | titan/nexus_st_webhook_listener.py:529 | titan.gclid_captures | GCLID pipeline writes. Attribution bridge. |
| High | titan/nexus_st_webhook_listener.py:797 | titan.invoice_events | Invoice events drive the webhook to Google Ads. Same pattern. |
| High | titan/nexus_st_webhook_listener.py:854 | titan.google_ads_conversion_log | Conversion upload log. Same pattern. |
| Med | titan/nexus_gclid_bridge_v2.py:119 | titan.gclid_captures | GCLID bridge. |
| Med | titan/gclid_bridge.py:79 | titan.gclid_captures | Second GCLID bridge. |
| Med | titan/nexus_doctor.py:236 | titan.google_ads_conversion_log | Doctor repair path. |
| Med | titan/nexus_blog_scraper.py:237 | titan.knowledge_base | RAG ingestion. Lower severity because knowledge_base time filtering is less critical. |
| Med | titan/nexus_playbook_indexer.py:169 | titan.knowledge_base | Playbook RAG indexer. |
| Med | titan/nexus_email_rag_ingester.py:368 | titan.knowledge_base | Email RAG ingester. |
| Med | titan/nexus_daniel_rag_bridge.py:178 | titan.knowledge_gaps | RAG gap tracking. |
| Med | titan/nexus_metric_drift_detector.py:104 | titan.metric_history | Metric drift detection. |
| Med | titan/api/tasks.py:172 | titan.trust_audit | Trust audit insertion. |
| Med | titan/api/notifications.py:120 | titan.trust_audit | Second trust_audit writer. |
| Med | titan/api/xact_pricing.py:112 | titan.xact_estimates | Xactimate estimate creation. |
| Med | titan/api/xact_pricing.py:137 | titan.xact_estimate_items | Xactimate line items. |
| Med | titan/api/transcript_repo.py:204 | titan.call_transcripts | Transcript repository. |
| Med | titan/api/eta_relay.py:250 | titan.tech_relays | ETA relay system. |
| Med | titan/api/feedback_loop.py:18,42,45 | titan.citation_log, knowledge_base | Feedback loop writers. |
Test Methodology
Every formula registered in the math engine was hit with a curl against /api/math/compute?formula=NAME&inputs=A,B,C using realistic production numbers pulled from the trusted sources (QuickBooks P&L, Google Ads live API, Big Sale Tracker). The output was captured verbatim along with any anomaly warnings. The goal was to verify that the engine itself is mathematically sound, regardless of whether the inputs fed into it by other systems are contaminated.
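For reproducibility, here is a minimal sketch of that harness in Python rather than raw curl. It assumes /api/math/compute takes formula and inputs query parameters exactly as shown above; the base URL and the result/anomaly response fields are assumptions.

```python
import requests

BASE = "http://localhost:8000"  # assumed host; the real deployment may differ

# (formula, inputs) pairs pulled from the trusted sources, per the methodology above
CASES = [
    ("cpl", "11187,350"),                # Google Ads spend / lead count, 30d
    ("margin", "573182,230906,246124"),  # QB income, COGS, OpEx
    ("roas", "11187,170252"),            # the run that fired the anomaly
]

for formula, inputs in CASES:
    r = requests.get(f"{BASE}/api/math/compute",
                     params={"formula": formula, "inputs": inputs}, timeout=10)
    r.raise_for_status()
    body = r.json()
    # Capture the verbatim result plus any anomaly warning the engine attaches
    print(formula, body.get("result"), body.get("anomaly"))
```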
Formula Inventory Table
| Formula ID | Name | Status | Expected Range | Live Test Result |
|---|---|---|---|---|
| cpl | Cost Per Lead | Working | $5 - $500 | $31.96 ($11,187 / 350 leads, 30d) |
| roas | Return on Ad Spend | Working | 0.5 - 100 | 0.07 ANOMALY FIRED ($11,187 / $170,252) |
| booking_rate | Booking Rate | Formula OK | 10% - 100% | Untested, needs leads input |
| avg_ticket | Average Ticket | Contaminated | $100 - $25000 | Relies on ST revenue / completed jobs |
| weekly_revenue | Weekly Revenue | Broken | $10K - $200K | Lambda signature error: takes 1 arg, got 2 |
| close_rate | Close Rate | Formula OK | 5% - 80% | Untested, needs estimate data |
| margin | Operating Margin | Working | -50% to +60% | 16.77% ($573,182 income, $230,906 COGS, $246,124 OpEx from QB) |
| annualized_revenue | Annualized Revenue | Working | $500K - $10M | $3,659,385 ($70,180 per 7 days) |
| customer_ltv | Customer Lifetime Value | Formula OK | $500 - $50000 | Untested, needs retention data |
| revenue_per_tech | Revenue per Technician | Contaminated | $2K - $50K/wk | Relies on ST revenue as numerator |
Key Proof: The Math Engine Correctly Fired An Anomaly
When ROAS was computed with $11,187 of ad spend against $170,252 of QB 30-day revenue (which is all revenue, not ad-attributed), the engine returned ROAS = 0.07 and flagged "ANOMALY: Return on Ad Spend = 0.07 is outside expected range [0.5, 100]". The anomaly is correct. Note that 0.07 is 11,187 / 170,252, i.e. spend divided by revenue — the inverse of a true ROAS — so the operands arrived inverted; and even read at face value, 0.07 would mean 7 cents of revenue per dollar of ad spend, which would mean the ads are catastrophically failing. On top of that, QB's $170K is not the right numerator anyway, because it includes all revenue, not just ad-attributed revenue. The engine worked exactly as designed: the bounds check refused a garbage value. The bug is upstream: there is no "ad-attributed revenue" source wired into the causal validator yet.
Key Proof: Margin Formula Validates Profitability
When Margin was computed with QuickBooks actuals (income $573,182, COGS $230,906, OpEx $246,124), the engine returned Margin = 16.77%. This sits within the expected range of -50% to +60%, and it reconciles with the 13.9% net-income margin once Other Expenses ($16,410) are included. The math engine proves that the business is profitable, and the number is defensible against any ST-contaminated claim. This is the number Kalen can trust.
The Missing Guardrails: Zero Metric Bounds Registered
Although the math engine formulas themselves have bounds, the 28 metrics registered in metrics_registry.yaml have zero bounds entries. None of them will fire an out-of-range warning when the underlying data source drifts. This is why the phantom $6.4M figure flowed through without being caught: no metric said "hey, $6.4M for a two-tech shop is outside the expected $20K-$200K range." The permanent fix is to add an expected_range to every metric in the YAML.
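A sketch of what the fix could look like. The expected_range key, the bounds, and the check function below are proposals, not the current registry schema:

```python
# Proposed shape for metrics_registry.yaml entries (expected_range is the new key):
#
#   weekly_revenue:
#     source: big_sale_tracker
#     expected_range: [20000, 200000]   # $20K - $200K for a two-tech shop, 7d window
#
# A minimal check that would have caught the phantom $6.4M:

def check_bounds(name: str, value: float, registry: dict) -> list[str]:
    """Return anomaly messages for any metric outside its expected_range."""
    entry = registry.get(name, {})
    bounds = entry.get("expected_range")
    if bounds is None:
        return [f"{name}: no expected_range registered -- unguarded metric"]
    lo, hi = bounds
    if not (lo <= value <= hi):
        return [f"ANOMALY: {name} = {value:,.0f} is outside expected range [{lo:,}, {hi:,}]"]
    return []

registry = {"weekly_revenue": {"expected_range": [20_000, 200_000]}}
print(check_bounds("weekly_revenue", 6_451_383, registry))
# -> ['ANOMALY: weekly_revenue = 6,451,383 is outside expected range [20,000, 200,000]']
```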
Why Stale Tables Are Dangerous
A stale table in the warehouse is not just a missing update. It is a time bomb. When someone eventually runs a backfill or a re-sync to "fix" the staleness, the same created_at = NOW() bug will re-trigger, compressing months of history into a single minute and creating a new phantom. Every stale table in the list below represents a potential future Apr 3 04:07 UTC incident.
The stale tables also break the graph brain and the freshness heatmap. graph_nodes and graph_edges are themselves stale (283h and 193h respectively), which means the graph brain's view of the ecosystem is describing Nexus as it was 11 days ago. seo_keywords_full has 40,847 rows that have been locked for 9 days. kalen_review_criteria has been locked for 8 days, which means Kalen's review framework is not updating.
Stale Table Inventory (>24h since last update)
| Table | Rows | Age (hours) | Business Impact |
|---|---|---|---|
| call_logs | 352 | 437.9 | 3CX call ingestion broken. Demand signal lost. |
| price_book | 400 | 428.6 | ST price book not syncing. Xactimate baseline frozen. |
| locations | 5,755 | 428.5 | Customer address book frozen. |
| trust_audit | 5 | 420.5 | Trust audit log frozen 17 days. |
| dispatch_schedule | 7,831 | 420.1 | Dispatch board data stale 17 days. |
| payroll | 199 | 419.7 | Technician commission data frozen. |
| xact_pricebook | 62 | 373.4 | Xactimate pricing frozen 15 days. |
| call_transcripts | 34 | 341.8 | 3CX + Vapi transcripts not flowing. |
| graph_nodes | 54 | 283.7 | Graph brain describing an 11-day-old ecosystem. |
| competitive_intel | 34 | 228.4 | Competitor intelligence frozen 9 days. |
| metric_thresholds | 12 | 228.1 | The guardrails themselves are stale. |
| session_learning | 17 | 225.5 | Claude session learnings not flowing. |
| outcome_tracking | 13 | 223.9 | Action outcome feedback loop frozen. |
| seo_keywords_full | 40,847 | 221.4 | 40K SEO keywords locked. Biggest row count at risk. |
| competitor_profiles | 11 | 194.8 | Competitor tracking stale. |
| kalen_review_criteria | 25 | 194.6 | Kalen's review framework not updating. |
| graph_edges | 85 | 193.1 | Graph edge connections 8 days old. |
| ramp_merchant_categories | 32 | 150.5 | Ramp COGS categorization stale. |
| ramp_tech_mapping | 16 | 150.1 | Ramp-to-technician attribution stale. |
| competitor_reviews | 57 | 144.9 | Competitor review scraping frozen. |
| financial_reconciliation | 8 | 133.9 | The reconciliation log itself is stale. |
| predictive_alerts | 21 | 123.4 | Predictive alerting paused. |
| TOTAL | 55,870 | | |
The Three Numbers
| Source | 30-Day Revenue | Methodology | Trust Level |
|---|---|---|---|
| Big Sale Tracker | $134,384 | Sum of 4 most recent weekly totals from the #big-sale-alert Slack channel (Ashton daily log) | Primary SoT |
| QuickBooks P&L (pro-rata) | $170,252 | $573,182 YTD income divided by 101 days, multiplied by 30 | Primary SoT |
| ST Clean (quarantine-filtered) | $93,436 | titan.jobs LEFT JOIN quarantined_jobs, last 30d, completed only | Operational only |
| ST Dirty (no filter, phantom included) | $6,451,383 | titan.jobs without quarantine, last 30d created_at | Do not use |
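For reference, a sketch of the "ST Clean" query as it could be run from Python. Table and column names follow the queries quoted elsewhere in this document (titan.quarantined_jobs.job_id, jobs.id); the connection string is a placeholder.

```python
import psycopg2

SQL = """
SELECT COALESCE(SUM(j.invoice_total), 0) AS clean_30d_revenue
FROM titan.jobs j
LEFT JOIN titan.quarantined_jobs q ON q.job_id = j.id
WHERE q.job_id IS NULL                        -- drop the 10,461 phantom rows
  AND j.job_status = 'completed'
  AND j.created_at >= NOW() - INTERVAL '30 days'
"""

with psycopg2.connect("dbname=nexus") as conn:   # placeholder connection string
    with conn.cursor() as cur:
        cur.execute(SQL)
        print(cur.fetchone()[0])                 # should land near $93,436
```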
Why The Numbers Diverge
- Big Sale $134K vs QB $170K = $36K gap: QuickBooks includes receivables collected from prior-period work, cash tips, AR catch-up, and any non-sale income (e.g. tool rentals to partner shops). Big Sale Tracker only captures what Ashton records in the daily log, which is a strict "sold today" snapshot. A $9K/week gap is consistent with QB catching trailing collections that Big Sale misses.
- QB $170K vs Clean ST $93K = $77K gap: This is the real operational leak. The anomaly detector has been firing zero_invoice CRITICAL at a 76% rate since April 7. Nineteen out of every twenty-five completed jobs in ST are closed with a zero invoice_total, either because the invoice webhook never fires or because the tech never finalized the invoice in the ST app. This is Pipe 1 from the April 8 Battle Plan. Fix the invoice closure workflow and the $77K reappears.
- ST Dirty $6.4M: The phantom. 10,461 historical rows with created_at stamped at Apr 3 04:07 UTC. Already quarantined, do not use.
Reconciliation Verdict
The true 30-day revenue is somewhere between Big Sale's $134K (conservative floor) and QB's $170K (full cash recognition). The ST clean number of $93K is too low because of the zero-invoice closure problem, not because of the phantom. Report Big Sale as the "this week / this month" number and QB as the "profit and cash" number. Never report the ST number until Pipe 1 (invoice closure) is fixed.
Step 1: Fix nexus_titan_migration.py line 249
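A sketch of the fix. Only the change itself — adding created_at to the column list and passing the ST payload's createdOn — comes from this document; the surrounding column names and payload keys are reconstructed for illustration.

```python
def upsert_job(cur, job: dict) -> None:
    # BEFORE (the phantom source): created_at omitted, so Postgres stamped
    # DEFAULT NOW() -- six years of history compressed into one ingest minute.
    #   INSERT INTO titan.jobs (st_job_id, scheduled_at, job_status, invoice_total)
    #   VALUES (%s, %s, %s, %s)
    #
    # AFTER: created_at is explicit and carries ST's authoritative timestamp.
    cur.execute(
        """
        INSERT INTO titan.jobs
            (st_job_id, scheduled_at, job_status, invoice_total, created_at)
        VALUES (%s, %s, %s, %s, %s)
        """,
        (
            job["id"],
            job["scheduledOn"],
            job["status"],
            job["total"],
            job.get("createdOn"),  # ST's own creation time, never our ingest time
        ),
    )
```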
Step 2: Add warehouse_ingested_at column to titan.jobs
The schema change separates "when ST created this job" from "when our warehouse pulled it in." Both are useful, but they must not be conflated. After this migration, created_at always means ST's authoritative time and never defaults to NOW(). Every query that previously filtered by created_at will continue to work because the semantics are preserved.
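A sketch of the migration. The backfill choice (copying the old created_at stamp into warehouse_ingested_at once) and dropping the NOW() default are assumptions consistent with the semantics just described.

```python
import psycopg2

MIGRATION = """
ALTER TABLE titan.jobs
    ADD COLUMN IF NOT EXISTS warehouse_ingested_at timestamptz DEFAULT now();

-- Backfill: for existing rows the only ingest time we have is the old
-- created_at stamp, so copy it over once.
UPDATE titan.jobs
   SET warehouse_ingested_at = created_at
 WHERE warehouse_ingested_at IS NULL;

-- From here on, created_at must be set explicitly by every writer.
ALTER TABLE titan.jobs ALTER COLUMN created_at DROP DEFAULT;
"""

with psycopg2.connect("dbname=nexus") as conn:  # placeholder connection string
    with conn.cursor() as cur:
        cur.execute(MIGRATION)
```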
Step 3: Patch all 29 INSERT sites systematically
A single patch script (patch_created_at_omission.py) walks through every file in the blast radius inventory. For each file:
- Parse the INSERT statement column list.
- If `created_at` is missing, add it as the last column.
- Find the `VALUES` tuple and add a new `%s`.
- Find the Python tuple being passed to `execute()` and add the source-system timestamp (or `datetime.now(timezone.utc)` if no source-system time exists).
- Write a backup of the original file before modification.
- Print a diff summary per file for Robert's review.
- Never auto-deploy. Always stop at the diff stage.
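A dry-run sketch of what patch_created_at_omission.py could look like. It handles only the simple literal-SQL case and exists to illustrate the walk-parse-diff flow; the Python argument tuple still has to be edited by hand, as noted in the comments.

```python
import difflib
import re
import shutil
from pathlib import Path

INSERT_RE = re.compile(
    r"(INSERT\s+INTO\s+titan\.\w+\s*\()([^)]*)(\)\s*VALUES\s*\()([^)]*)(\))",
    re.IGNORECASE | re.DOTALL,
)

def patch_text(text: str) -> str:
    """Append created_at to the column list and a %s to VALUES where missing."""
    def fix(m: re.Match) -> str:
        head, cols, mid, vals, tail = m.groups()
        if "created_at" in cols:
            return m.group(0)                      # already explicit, leave alone
        return f"{head}{cols.rstrip()}, created_at{mid}{vals.rstrip()}, %s{tail}"
    return INSERT_RE.sub(fix, text)

def process(path: Path, apply: bool = False) -> None:
    old = path.read_text()
    new = patch_text(old)
    if new == old:
        return
    # Diff stage: print for Robert's review; never auto-deploy.
    for line in difflib.unified_diff(old.splitlines(), new.splitlines(),
                                     str(path), str(path) + " (patched)", lineterm=""):
        print(line)
    if apply:
        shutil.copy2(path, path.with_suffix(path.suffix + ".bak"))  # backup first
        path.write_text(new)
    # NOTE: the Python argument tuple still needs the source timestamp added
    # by hand -- that part cannot be patched safely with a regex.

for path in Path("/opt/nexus/titan").rglob("*.py"):
    process(path, apply=False)                     # dry-run by default
```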
Step 4: Add a pg trigger as a safety net
Even after the patch, a future developer could write a new INSERT and forget created_at. The safety net is a Postgres trigger that raises an exception if created_at would fall back to DEFAULT NOW() without being explicitly set.
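A sketch of that trigger. The NULL check assumes the Step 2 migration dropped the column default; the second check encodes the Apr 3 phantom signature (created_at stamped "right now" while scheduled_at is far in the past), which is what the Day 2 verification in the rollout plan below exercises. The one-minute and 30-day thresholds are assumptions.

```python
import psycopg2

TRIGGER = """
CREATE OR REPLACE FUNCTION titan.enforce_explicit_created_at()
RETURNS trigger AS $$
BEGIN
    -- Writer forgot created_at entirely (default was dropped in Step 2).
    IF NEW.created_at IS NULL THEN
        RAISE EXCEPTION
            'titan.jobs INSERT without explicit created_at (see Apr 3 phantom)';
    END IF;
    -- Phantom signature: created_at is "right now" but the job was scheduled
    -- long ago, i.e. an ingest-time stamp masquerading as a source timestamp.
    IF NEW.created_at > now() - interval '1 minute'
       AND NEW.scheduled_at < now() - interval '30 days' THEN
        RAISE EXCEPTION
            'created_at looks like an ingest-time default for a historical job';
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

DROP TRIGGER IF EXISTS trg_explicit_created_at ON titan.jobs;
CREATE TRIGGER trg_explicit_created_at
    BEFORE INSERT ON titan.jobs
    FOR EACH ROW EXECUTE FUNCTION titan.enforce_explicit_created_at();
"""

with psycopg2.connect("dbname=nexus") as conn:  # placeholder connection string
    with conn.cursor() as cur:
        cur.execute(TRIGGER)
```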
The Ten Protocols
- P-01: Never report revenue from ST as a primary number. Run `cat /opt/nexus/nexus/scripts/output/big_sale_tracker.json` and `cat /opt/nexus/nexus/scripts/output/qb_pl_latest.json` before SSHing anything to titan.jobs.
- P-02: Every INSERT into a titan table must explicitly set created_at.
- P-03: Before editing titan.jobs, query the graph cascade.
- P-04: Before any financial claim, run it through the math engine.
- P-05: Always query the Intelligence Stack before saying "we do not have X".
- P-06: Check the HTML playbooks before building a new one. They live in /opt/nexus/nexus/scripts/output/playbooks/.
- P-07: Use standalone .py files via scp, never inline SSH heredocs. Write the script in BrightSidePlumbing/scripts/, scp it to /tmp on the VM, then ssh and run.
- P-08: After any audit, write a memory entry so the next session inherits the finding.
- P-09: Revenue hierarchy is Big Sale, then QB, then Ramp, then ST counts.
- P-10: Load this HTML as a reference at session start.
Week 1 (Apr 12 - Apr 18): Code Fix and Schema Migration
- Day 1 (today): Fix `nexus_titan_migration.py` line 249. Add `created_at` to the INSERT column list and pass `job.get("createdOn")` from the ST payload. Commit with reference to this document.
- Day 1: Make the script non-executable (`chmod -x`) so it can never be run by accident again. Move it to `scripts/one_time_migrations/`.
- Day 2: Add the `warehouse_ingested_at` column to `titan.jobs`. Run the migration SQL from Step 2 above. Verify via `SELECT created_at, warehouse_ingested_at FROM titan.jobs WHERE id IN (SELECT job_id FROM titan.quarantined_jobs LIMIT 10)`.
- Day 2: Install the `enforce_explicit_created_at()` trigger. Verify it fires by attempting an INSERT with `created_at = NOW()` and `scheduled_at = 2020-01-01`. It should raise.
- Day 3: Build `patch_created_at_omission.py` to walk the 29-file blast radius and print the diffs. Run in dry-run mode. Robert reviews every diff before applying.
- Day 3: Apply the patch to the top 5 critical files (`titan_sync_daemon.py`, `api/portal.py`, `api/estimates.py`, `api/threecx.py`, `nexus_st_webhook_listener.py`). Restart affected services.
- Day 4: Add `expected_range` to every metric in `metrics_registry.yaml`. Use realistic bounds based on the last 90 days of data. Revenue metrics get [$20K, $200K] for 7d and [$80K, $800K] for 30d; spend metrics get [$500, $50K]; and so on.
- Day 5: Rewrite the math engine `weekly_revenue` formula to accept a pluggable source argument. The default source becomes Big Sale Tracker. ST is wired as a fallback only.
- Day 5: Rewrite the math engine `avg_ticket` and `revenue_per_tech` formulas to read from the same pluggable source layer.
avg_ticketandrevenue_per_techformulas to read from the same pluggable source layer. - Day 6: Fix the anomaly detector's zero_invoice check to page Ashton in Slack (not just log), since it has been firing CRITICAL for 5 days without escalation.
- Day 7: Write the memory entry and update MEMORY.md. Deploy this HTML to the playbooks directory.
Week 2 (Apr 19 - Apr 25): Reconnect Stale Tables and Ship Sacred v2
- Day 8: Investigate each stale table and restart its ingestion path. `graph_nodes` and `graph_edges` first, because the graph brain depends on them.
- Day 8: Fix the broken `nexus-dashboard-integrity.timer`, which last ran on Apr 2 and has been "active waiting" ever since.
- Day 9: Reconnect `call_logs` ingestion. Both 3CX and SMS paths write to it; the 437-hour staleness means both paths are broken.
- Day 9: Reconnect `price_book` and `xact_pricebook`. Kalen's estimate workflow depends on fresh pricing.
- Day 10: Rebuild Sacred HTML v2 with the new source hierarchy. Big Sale card on top, QB margin card, Ramp COGS card, Google Ads efficiency card, 3CX demand card, ST counts card at the bottom labeled "operational only".
- Day 11: Wire the causal validator `google_ads_to_revenue` chain to Big Sale Tracker as the revenue endpoint. Currently it terminates at an ST node.
- Day 12: Walk Kalen through this document. Show him the pinpoint, the fix, and the hierarchy. Get his signoff on Big Sale as the weekly SoT.
- Day 13: Ship the Sacred v2 draft to Stephanie. Keep v1 accessible for rollback.
- Day 14: Post-mortem. What would have caught the phantom on Apr 3? Were the existing protocols enough? Update this document with anything the post-mortem surfaces.
22 APIs Wired
| API | Files Touching It | Purpose |
|---|---|---|
| ServiceTitan | 203 | Primary ops system (broken as SoT for revenue, still used for counts) |
| Google Ads | 104 | Ad spend, clicks, CPC, real conversions (LIVE API, trustworthy) |
| GA4 / Analytics | 66 | Website behavior, conversions (no warehouse table yet) |
| 3CX | 62 | Inbound/outbound call data (titan.voice_calls) |
| Anthropic (Claude) | 59 | RAG responses, ask-v2 endpoint, auto-research |
| OpenAI | 54 | text-embedding-3-small for knowledge_base chunks |
| WordPress | 50 | Page content, Code Snippets, RankMath SEO |
| QuickBooks | 35 | P&L pull (manual until Plaid auth) |
| Gmail | 29 | Email monitor, RAG ingester |
| Slack | 26 | Big Sale tracker, win alerts |
| Vapi (Daniel AI) | 24 | Voice agent calls, RAG bridge |
| Cloudflare | 23 | CDN, WAF, Web Analytics |
| Semrush | 16 | Keyword research, competitor tracking |
| Google Tag Manager | 14 | Tag deployment, GCLID capture |
| Ramp | 13 | COGS, materials spend (titan.ramp_transactions) |
| Google Search Console | 13 | Organic rankings (no warehouse table yet) |
| ClickCease / CHEQ | 11 | Click fraud blocking |
| Telnyx | 11 | SMS, 10DLC registration (pending BSP EIN) |
| Meta Ads | 7 | Facebook budget automation |
| BirdEye / Reviews | 4 | Review aggregation |
| Google LSA | 3 | Local Service Ads (Tier 1-4 cities) |
Gaps (APIs Not Yet Wired)
- Ahrefs: Zero files. Robert has access but no ingestion path. Would add backlink intelligence and competitive SEO data.
- Plaid (for QuickBooks): Blocked on Stephanie authorization. Once wired, QB P&L refreshes automatically instead of on manual pull.
- GSC warehouse table: API is wired (13 files) but no `titan.gsc_*` table exists. Organic rankings are queried live, never cached.
- GA4 warehouse table: Same as GSC. API wired (66 files) but no warehouse table.
💥 Root Cause: Auto-Repair Agent
The nexus_repair_agent.py auto-repair timer injected a 3-line SLACK_ENABLED = False block into 19 Python files at the wrong indentation, causing an IndentationError cascade across 20 timers. The "self-healing" systems were the disease, not the cure. Error log counts: immune system 12,657, repair agent 13,799, sentinel 9,712.
Fix: All 19 files repaired (14 from .bak backups, 5 by removing injected blocks). Auto-repair agent quarantined to /opt/nexus-staging/quarantine/. Permanently disabled.
📜 The NEXUS Treaty (3 Gates)
- Gate 1: Git Mandate — `/opt/nexus/.git/` initialized. Pre-commit hook blocks syntax errors. 2 commits baseline the clean state.
- Gate 2: Truth Service — `nexus_truth_service.py` is the SINGLE source for all revenue numbers. Every script must import from it. Replaces the 8 competing scripts.
- Gate 3: Verification Gate — `verify_sacred_math.py` checks Sacred v2 against the Truth Service before deploy. Blocks phantom numbers, stale values, and unrendered templates. 14/14 checks passing.
🛠️ Infrastructure Deployed
- Staging: `/opt/nexus-staging/` with scripts, titan, test, and quarantine directories
- Deploy Gate: `deploy_gate.sh` — syntax + safety + diff review before production promotion
- ALCOA+ Test: `test_idempotency.py` — double-run verification per the Google SRE book (see the sketch after this list)
- Production Lock: 14 core scripts set to chmod 444 (read-only)
- Quarantine: 5 dangerous scripts isolated (auto_repair, immune_system, self_healer, homeostasis, auto_expansion)
- Null-Revenue Shield: `DataUnavailableError` + `validate_revenue()` + `validate_input()` in the Math Engine
- RAG Decontamination: 152 error-pattern chunks purged from pgvector (IndentationError, SyntaxError patterns)
- RUNBOOK: 238-line operations manual at `/opt/nexus/RUNBOOK.md`
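A sketch of the double-run check test_idempotency.py could perform. The target script path and the row-count probe are placeholders; the principle is that a second run of the same ingest must change nothing.

```python
import subprocess

import psycopg2

# Double-run verification: running the same ingest script twice must leave the
# warehouse in the same state, or the script is not idempotent (ALCOA+).
SCRIPT = "/opt/nexus-staging/scripts/some_ingest.py"   # hypothetical target
COUNT_SQL = "SELECT count(*) FROM titan.jobs"

def row_count() -> int:
    with psycopg2.connect("dbname=nexus_staging") as conn:  # staging, never prod
        with conn.cursor() as cur:
            cur.execute(COUNT_SQL)
            return cur.fetchone()[0]

subprocess.run(["python3", SCRIPT], check=True)   # first run: may insert rows
first = row_count()
subprocess.run(["python3", SCRIPT], check=True)   # second run: must be a no-op
second = row_count()

assert first == second, f"NOT idempotent: {first} rows -> {second} rows"
print(f"idempotent: {first} rows after both runs")
```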
📊 Before vs After
| Metric | Before (Apr 12 morning) | After (Apr 12 close) |
|---|---|---|
| Active timers | 292 | 59 |
| Failing timers | 20 | 0 |
| Syntax-broken files | 19 | 0 |
| Error logs/day | 654+ (sentinel alone) | Near zero |
| Git commits | 0 | 2 |
| Deploy gates | 0 | 3 |
| Revenue sources | 8 competing | 1 (Truth Service) |
| Staging environment | None | Created |
| Rollback capability | .bak files | git checkout |
| RAG contamination | 152 error chunks | 0 |
🔗 API Capability Registry (8/8 Active)
Every API gets a health ping before any script uses it. If a capability is LOST, the system alerts immediately instead of running with stale data.
| API | Status | Scope | Refresh Command |
|---|---|---|---|
| ServiceTitan | ACTIVE | Jobs, Campaigns, Customers (NOT revenue) | Refresh in ST Developer Portal |
| Google Ads | ACTIVE | GAQL queries, offline conversions, campaigns | google_oauth_setup.py |
| QuickBooks | ACTIVE | P&L reports (SSoT for profitability) | OAuth flow, Plaid from Stephanie |
| Ramp | ACTIVE | Transactions (COGS, materials) | Ramp dashboard |
| Gmail | ACTIVE | gmail.modify (read + label) | nexus_email_monitor.py --scan |
| OpenAI | ACTIVE | text-embedding-3-small for RAG | OpenAI dashboard |
| Slack | ACTIVE | #big-sale-alert, #nexus-alerts | Slack app settings |
| Vapi | ACTIVE | Daniel AI voice agent | Vapi dashboard |
File: /opt/nexus/nexus/scripts/output/api_capabilities.json | Script: /opt/nexus/titan/nexus_capabilities.py
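A sketch of how those health pings could be recorded. The JSON layout mirrors the file named above, but the schema, the per-API ping functions, and the alert hook are assumptions for illustration; the sketch writes to a local .example file so it never clobbers the real registry.

```python
import json
import time

REGISTRY = "/opt/nexus/nexus/scripts/output/api_capabilities.json"  # the real file

def ping_slack() -> bool:
    # Placeholder: a real check would call the API's cheapest authenticated
    # endpoint (e.g. an auth/whoami-style call) and return True on success.
    return True

CHECKS = {"slack": ping_slack}   # one entry per wired API

def run_checks() -> dict:
    results = {}
    for name, check in CHECKS.items():
        try:
            ok = check()
        except Exception:
            ok = False
        results[name] = {
            "status": "ACTIVE" if ok else "LOST",
            "checked_at": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        }
        if not ok:
            print(f"ALERT: capability LOST -> {name}")  # page instead of running stale
    return results

with open("api_capabilities.example.json", "w") as f:
    json.dump(run_checks(), f, indent=2)
```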