🧠 NEXUS EVOLUTION PROOF · DEEP-DIVE DEBRIEF

Why the Dashboard Says 🔴 CRITICAL and the Plumbers Don't

April 17, 2026 · 9:55 PM Central Time · For: Robert Dove · Status: Diagnosis complete, no code changed

🎯 The 60-Second Story

① Patient Vitals · what's actually running vs. what's not

🟢 Actually Alive

Component | Last heartbeat
titan_sync_daemon.py | 1 min ago ✅ (every 15m)
titan_invoice_sync.py | 05:15 CT today ✅
titan-killer.service (API) | active ✅
zeus modules (6 of 7) | all 🟢
ST API auth | pulled 169 invoices today ✅
Postgres INSERTs into titan.jobs | last insert 1h 33m ago ✅

🔴 What the Monitors Yell About

Alert | Why it fires
Anomaly zero_invoice 75% | 6 of 8 recent jobs have no invoice_total
Data freshness: ST Jobs (6d) | st_jobs_cache.json mtime is Apr 12

Both are downstream of the same Apr 12 decision. Neither reflects a problem with how technicians are closing jobs.

② The Pipeline Map · live flow of one ST job from creation to dashboard

┌─────────────────────────────────────────────────────────────────────┐
│ 🏠 Customer calls · BSP schedules · Tech arrives · Job gets done    │
└─────────────────────────────────┬───────────────────────────────────┘
                                  │
                      Writes into ServiceTitan
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 🏛️ ServiceTitan API (source of truth for operational data)          │
│    /jobs  /invoices  /customers  /estimates                         │
└─────────┬───────────────────────┬────────────────────────┬──────────┘
          │                       │                        │
    every 15 min           05:15 CT daily           RETIRED Apr 12
          │                       │                        │
          ▼                       ▼                        ▼
┌───────────────────┐  ┌────────────────────┐  ┌──────────────────────┐
│ titan_sync_daemon │  │ titan_invoice_sync │  │ nexus_titan_migration│
│                   │  │                    │  │ (quarantined; was    │
│ INSERT new jobs   │  │ UPDATE invoice_    │  │ the $6.4M phantom)   │
│ INSERT customers  │  │ total WHERE st_id  │  │                      │
│ INSERT estimates  │  │ matches            │  │ Used to also write:  │
│                   │  │                    │  │ • invoice backfill   │
│ (no job_number,   │  │ 169 invoices/day   │  │ • job_number sync    │
│  no invoice_total,│  │ 86–115 updates/day │  │ • st_jobs_cache.json │
│  no scheduled_at) │  │                    │  │                      │
└─────────┬─────────┘  └──────────┬─────────┘  └───────────┬──────────┘
          │                       │                        │
          └───────────────────────┴────────────────────────┘
                                  │
                                  ▼
┌─────────────────────────────────────────────────────────────────────┐
│ 🗄️ Postgres · bsp_analytics · titan.jobs (the live table)           │
│    11,831 jobs · last insert 1h 33m ago                             │
└───────────┬───────────────────────────────────────────┬─────────────┘
            │                                           │
            ▼                                           ▼
┌────────────────────────┐                 ┌──────────────────────────┐
│ 🧠 Dashboards + APIs   │                 │ 👮 Anomaly detector      │
│ (HCP, Stephanie,       │                 │ reads zero_invoice       │
│ Big Sale, Audrey…)     │                 │ on 8 recent jobs         │
│                        │                 │ 🔴 fires at 75%          │
│ SHOWS: broken ST rev   │                 │                          │
│ $13.7K/wk              │                 │ 👮 Session enforcer      │
│                        │                 │ checks st_jobs_cache     │
│ Real revenue lives     │                 │ mtime · 6d old           │
│ in Big Sale $226K/wk   │                 │ 🔴 STALE                 │
└────────────────────────┘                 └──────────────────────────┘

③ The Chronic Gap · invoice_total populated rate, by day

Every row below counts completed jobs. If tech work flows into invoices, and invoices flow into invoice_total, the Populated % column should be close to full. It is not, and it has not been all month.

Day | Completed | With $ | Populated %
Apr 18 (partial) | 1 | 0 | 0%
Apr 17 | 10 | 0 | 0%
Apr 16 | 10 | 0 | 0%
Apr 15 | 11 | 6 | 55%
Apr 14 | 19 | 2 | 11%
Apr 13 | 9 | 1 | 11%
Apr 10 | 9 | 3 | 33%
Apr 08 | 11 | 1 | 9%
Apr 04 | 4 | 2 | 50%
Apr 02 | 6 | 3 | 50%

Average across April: ~26% of completed jobs carry a non-zero invoice_total. Peak day: 55%. That means that for roughly 74% of completed jobs, our own ST mirror does not know what the job was worth.

④ The Causal Chain · Apr 3 fire · Apr 12 treaty · Apr 17 alert

📅 Apr 3 04:07 UTC
▼
🔥 PHANTOM $6.4M discovered
nexus_titan_migration.py:249 INSERT missing created_at
10,461 jobs stamped same timestamp · scheduled_at spans 5 years

📅 Apr 3 → Apr 12
▼
🧯 Incident response · Evolution Protocols v1 published
29-file blast radius documented in BSP_Data_Trust_Evolution_v1.html

📅 Apr 12 (The Nexus Treaty)
▼
🔒 nexus_titan_migration.py → one_time_migrations/ + chmod -x
🔒 Postgres trigger guard added to titan.jobs (prevents bulk INSERT)
🔒 titan_sync_daemon.py takes over job INSERTs (every 15 min)
11,729 phantom rows quarantined · 292 → 128 timers
BUT three responsibilities were never reassigned:
❌ Invoice total backfill on older jobs
❌ job_number population
❌ st_jobs_cache.json daily write

📅 Apr 12 → Apr 17 (5 days)
▼
🕳️ Orphaned work piles up
Each day's completed jobs enter titan.jobs as skeletons and stay that way

📅 Apr 17 21:42 CT
▼
🚨 Evolution Proof fires 🔴🔴
zero_invoice CRITICAL: 6 of 8 is over 70% threshold
ST Jobs 6d stale: cache file mtime is Apr 12 00:00

⑤ The Math · why "75%" is both technically correct and statistically loud

Small-sample noise check

Sample: 8 jobs · 6 zero-invoice · point estimate 75%. Wilson 95% confidence interval: 40.9% to 92.9%. On 8 data points, the true rate could be 41% or it could be 93%. The threshold bar at 70% is inside that interval, so this sample alone cannot tell us whether the true rate is above or below the threshold.

Metric | Value | Note
Sample | n = 8 | jobs scheduled ≤ 7d
Zero-invoice | 6 | invoice_total is null or 0
Point rate | 75% | fires at ≥ 70%
95% CI | 41 – 93% | Wilson score

But the signal is real at larger N

Widen to all completed jobs Apr 1 through Apr 18: 124 jobs total, 100 of them zero-invoice. That's an 80.6% zero rate on n=124. Wilson 95% CI: 73% to 87%. Statistically robust. The detector picked the wrong window, but the underlying finding is real.
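
Both intervals can be reproduced with the standard Wilson score formula. The sketch below is a minimal Python version, assuming z = 1.96 for the 95% level. The function name wilson_interval is illustrative and is not part of the Nexus codebase.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives ~95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


THRESHOLD = 0.70  # the zero_invoice alert fires at >= 70%

# The detector's window: 6 zero-invoice jobs out of 8.
small = wilson_interval(6, 8)
# All completed jobs Apr 1 through Apr 18: 100 zero-invoice out of 124.
large = wilson_interval(100, 124)

print(f"n=8:   {small[0]:.0%} to {small[1]:.0%}")    # 41% to 93%
print(f"n=124: {large[0]:.0%} to {large[1]:.0%}")    # 73% to 87%

# The threshold sits inside the small-sample interval but below the large one.
print(small[0] < THRESHOLD < small[1])  # True
print(large[0] > THRESHOLD)             # True
```

The last two lines are the whole argument in miniature: at n=8 the interval straddles 70%, at n=124 the entire interval sits above it.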

Revenue implication

Evolution Proof reports $13,745/wk from ST and $226,703/wk from Big Sale. Ratio: 6.1%. If Big Sale is truth, ST is capturing only about 6% of real revenue. The roughly 80% zero-invoice rate on our ST mirror and the 6% revenue capture ratio are the same story told two different ways.

⑥ Monitor vs Reality · lane diagram

Alert the dashboard shows | What it's measuring | What it means in reality
🔴 CRIT zero_invoice 75% | Last 8 jobs with scheduled_at in 7d | Technically correct. Label ("techs not closing jobs") is wrong. Real cause: invoice sync doesn't backfill older jobs.
🔴 STALE ST Jobs 6d | mtime of st_jobs_cache.json | File was written by the quarantined migration script. DB itself is fresh. Monitor watches a ghost.
💪 56/100 Muscle score | How much Nexus is acting on data | Downstream of the first two. If ST mirror is 74% empty, action engines have thin signal.
🟢 6 of 7 data fresh | ad_throttler, 3cx, st_enforce, ai_intake, anomaly_log, ads_audit | Correct. All six write-every-day files are under 6h old.
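
The STALE row is a case of the monitor reading the wrong clock. Here is a minimal sketch of the difference, using the timestamps from this debrief. The 24-hour limit is an assumption (the enforcer's real limit is not stated here), and freshness_status and MAX_AGE are hypothetical names, not code from nexus_session_enforcer_v2.py. In a live check the second timestamp would come from the database, for example MAX(updated_at) on titan.jobs.

```python
from datetime import datetime, timedelta, timezone

CENTRAL = timezone(timedelta(hours=-5))  # CDT offset in April, fixed here for simplicity
MAX_AGE = timedelta(hours=24)            # assumed limit, not taken from the enforcer


def freshness_status(last_write: datetime, now: datetime,
                     max_age: timedelta = MAX_AGE) -> str:
    """Return 'FRESH' or 'STALE' based on the age of the last write."""
    return "FRESH" if now - last_write <= max_age else "STALE"


now = datetime(2026, 4, 17, 21, 55, tzinfo=CENTRAL)

# What the monitor reads: the mtime of a file nobody writes anymore.
cache_mtime = datetime(2026, 4, 12, 0, 0, tzinfo=CENTRAL)

# What the database says: the last INSERT into titan.jobs was 1h 33m ago.
db_last_insert = now - timedelta(hours=1, minutes=33)

print("st_jobs_cache.json:", freshness_status(cache_mtime, now))     # STALE
print("titan.jobs:        ", freshness_status(db_last_insert, now))  # FRESH
```

The rule is the same in both calls; only the input clock differs, and that is why the alert and the pipeline disagree.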

⑦ The Fix Menu · ten levers, ranked by impact × effort

# | Lever | Impact | Effort | Risk
A1 | Write a small job that has titan_sync_daemon.py also emit st_jobs_cache.json on each cycle | Silences ST Jobs stale alert permanently | ~30 min | 🟢 Low
A2 | Drop st_jobs_cache.json from DATA_SOURCES in nexus_session_enforcer_v2.py and replace with a live DB freshness query | Silences the alert AND makes the freshness check accurate | ~20 min | 🟢 Low
B1 | Raise check_zero_invoice_rate() minimum sample from no minimum to n ≥ 20 | Stops small-N false alarms without hiding real gap | ~10 min | 🟢 Low
B2 | Also exclude jobs completed in last 24h (give invoice sync a chance) | Cuts the residual daily lag noise | ~10 min | 🟢 Low
C1 | Widen invoice sync window from 7 days to 30 days | Should lift population rate substantially on older jobs | ~10 min + one re-run | 🟡 Medium
C2 | Change invoice sync key from modifiedOnOrAfter to pull invoices for all open jobs in last 30d regardless of invoice modify date | Catches jobs whose invoice was created but never modified | ~1 hr | 🟡 Medium
C3 | Hook invoice sync to ST webhook invoice.updated so it is event-driven, not cron | Near-realtime population · strongest fix | ~3 hr (webhook listener exists) | 🟡 Medium
D1 | Diagnostic: sample 10 recent zero-invoice jobs, call ST API directly, compare totals | Tells us how many are sync-miss vs. genuinely $0 (warranty / declined) | ~20 min | 🟢 Low
D2 | Remove the four cron lines that point into /purgatory/ and /backups/ (monday_sync, st_data_fixer, etc.) | Cleans noise; no functional change | ~10 min | 🟢 Low
E1 | Rename the zero_invoice alert label from "Techs not closing jobs" to "ST mirror invoice coverage gap" | Stops misleading anyone who glances at the dashboard | ~2 min | 🟢 Low
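
Levers B1 and B2 amount to two guards in front of the existing 70% threshold. A minimal sketch of that logic follows, assuming each job exposes a completion time and an invoice_total. The Job class, the function name zero_invoice_status, the status strings, and the 250.0 placeholder amount are all hypothetical; none of this reflects how check_zero_invoice_rate() is actually written.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Job:
    completed_at: datetime
    invoice_total: Optional[float]  # None or 0 counts as zero-invoice


def zero_invoice_status(jobs, now, min_sample=20,
                        grace=timedelta(hours=24), threshold=0.70):
    """Return (status, n, rate) for the zero-invoice check."""
    # B2: skip jobs completed inside the grace period so invoice sync can catch up.
    eligible = [j for j in jobs if now - j.completed_at >= grace]
    n = len(eligible)
    # B1: refuse to judge a sample smaller than min_sample.
    if n < min_sample:
        return ("INSUFFICIENT_DATA", n, None)
    zero = sum(1 for j in eligible if not j.invoice_total)
    rate = zero / n
    return ("CRITICAL" if rate >= threshold else "OK", n, rate)


now = datetime(2026, 4, 17, 21, 55)
three_days_ago = now - timedelta(days=3)

# The detector's current window: 6 of 8 jobs with no invoice_total.
small = [Job(three_days_ago, None)] * 6 + [Job(three_days_ago, 250.0)] * 2
# The month-to-date picture: 100 of 124 jobs with no invoice_total.
large = [Job(three_days_ago, 0.0)] * 100 + [Job(three_days_ago, 250.0)] * 24

print(zero_invoice_status(small, now)[0])  # INSUFFICIENT_DATA
print(zero_invoice_status(large, now)[0])  # CRITICAL
```

With both guards in place the 8-job window no longer fires on its own, while the month-wide gap still does, which is the behavior B1 and B2 are meant to produce.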

⑧ My Recommendation · what to do Monday morning

🎯 Sequenced plan

  1. Today (quiet the noise): Do A2 + B1 + B2 + E1. All four together take under 45 minutes and do not touch the data pipeline. The dashboard goes 🟢 without pretending problems away.
  2. Diagnostic before fix (D1): Pick 10 zero-invoice jobs, pull their invoices from ST directly, find out the split between "sync missed it" and "job truly has no invoice yet". This decides whether C1 or C3 is the right fix.
  3. The real fix (C1 first, C3 later): Widen invoice sync window to 30 days as a one-line change. Watch the population rate for 48 hours. If still below 70%, promote to C3 event-driven.
  4. Cleanup (D2): Retire the dead cron lines. These have been erroring daily since Apr 12 and add noise to every log file.

What I did not do: I touched zero code on the VM. All findings are read-only. Saying "yes" to any of A–E means I make the change and you review it before I restart the service.

🧭 How to read the Evolution Proof from now on

When the proof says 🔴 CRITICAL, ask three questions in this order:

  1. Does the DB itself say the data is stale? (Query MAX(updated_at), not a cache file.)
  2. Is the sample size big enough to trust the percentage? (Under n=20, treat any rate as a hint not a verdict.)
  3. Is the ALERT LABEL telling you the cause, or just the symptom? Most of our labels describe symptoms.

The dashboard is a thermometer, not a diagnosis. It is very good at telling you something is off. It is not very good at telling you what.