Each chapter preserved what worked and upgraded what was limiting. Chapter 3 came from an unexpected place: an audit finding that forced a fundamentally better architecture.
During v2 development, a data integrity audit revealed that some benchmark values had been pattern-matched rather than verified from the source spreadsheet. Rather than patch the problem, we made a fundamental architecture decision: stop asking the advisor to manually enter data that already exists in public filings. The Form 5500 dataset — 869,889 plans with 48 verified fields each — became the primary data source for Section 1. This made the product faster (zero entry), more defensible (DOL-attested data), and more scalable (peer groups computed from real filing populations, not survey samples). The PLANSPONSOR survey data remains available in Section 2 for plan design comparisons, using only values verified against the original spreadsheet.
The "Before" column captures the state under Excel and v1. The "Now" column captures v3.
The jump from "digitized Excel" to "filing benchmark engine" happens along six specific axes. Each one was chosen because it either eliminates advisor effort or makes the benchmark more defensible.
Type a company name. A live results table shows matching plans from 869,889 filings: plan name, sponsor, location, assets, score. Click one and the benchmark renders instantly. No EIN lookup, no manual typing; from first keystroke to rendered benchmark in under 3 seconds.
Every Form 5500 filing is legally attested by the plan administrator. This is census data, not sample data. When the report says "among 10,656 Manufacturing plans with 11–25 participants," that's the actual population — not an estimate from a survey of ~1,000 plan sponsors.
Primary peers: same NAICS sector + size band. State peers: same industry + state (shown when n ≥ 50 for statistical credibility). National industry: all plans in the sector regardless of size. The advisor sees which layer is driving each comparison and exactly how many plans are in the peer group.
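The three-layer fallback above can be sketched as a pure function. This is an illustrative sketch, not the product's actual code; the field names (`naicsSector`, `sizeBand`, `state`) and the `Plan` shape are assumptions, though the n ≥ 50 state threshold comes from the text.

```typescript
interface Plan {
  naicsSector: string;
  sizeBand: string;
  state: string;
}

interface PeerLayer {
  label: string;
  peers: Plan[];
  shown: boolean; // whether this layer appears in the report
}

const MIN_STATE_PEERS = 50; // n >= 50 credibility threshold from the text

function peerLayers(target: Plan, universe: Plan[]): PeerLayer[] {
  // Primary: same NAICS sector AND same size band.
  const primary = universe.filter(
    (p) => p.naicsSector === target.naicsSector && p.sizeBand === target.sizeBand,
  );
  // State: same sector AND same state, only shown when the group is large enough.
  const statePeers = universe.filter(
    (p) => p.naicsSector === target.naicsSector && p.state === target.state,
  );
  // National industry: everything in the sector, regardless of size.
  const national = universe.filter((p) => p.naicsSector === target.naicsSector);

  return [
    { label: "Primary (sector + size band)", peers: primary, shown: true },
    { label: "State (sector + state)", peers: statePeers, shown: statePeers.length >= MIN_STATE_PEERS },
    { label: "National industry (sector)", peers: national, shown: true },
  ];
}
```

Each layer carries its own peer count, which is what lets the UI display exactly how many plans drive each comparison.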
Every plan gets a 1–10 score from four weighted decile components: employer contributions (40%), participant contributions (30%), account balances (20%), plan age (10%). Computed within the plan's size band so a 15-person plan isn't penalized for having lower assets than a 5,000-person plan.
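The composite can be written as a weighted sum of the four decile components. A minimal sketch, assuming each component arrives as a 1–10 decile already computed within the plan's size band, and assuming round-and-clamp as the final step (the weights are from the text; everything else is illustrative):

```typescript
interface DecileComponents {
  employerContrib: number;    // decile 1-10 within the plan's size band
  participantContrib: number;
  accountBalances: number;
  planAge: number;
}

// Component weights stated in the text: 40% / 30% / 20% / 10%.
const WEIGHTS = {
  employerContrib: 0.4,
  participantContrib: 0.3,
  accountBalances: 0.2,
  planAge: 0.1,
} as const;

function planScore(d: DecileComponents): number {
  const raw =
    d.employerContrib * WEIGHTS.employerContrib +
    d.participantContrib * WEIGHTS.participantContrib +
    d.accountBalances * WEIGHTS.accountBalances +
    d.planAge * WEIGHTS.planAge;
  // Round to the nearest integer and clamp to the 1-10 scale.
  return Math.min(10, Math.max(1, Math.round(raw)));
}
```

Because each decile is computed within the size band, a 15-person plan is compared only against similarly sized plans before the weights are applied.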
Claude Sonnet 4 receives the full plan data + peer stats and writes a structured analysis: executive summary, strengths, areas of concern, peer comparison highlights, actionable recommendations. Every number in the narrative comes from the actual filing data — the model is explicitly instructed never to invent statistics.
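The shape of that request can be sketched as a prompt-assembly step. This is a hypothetical reconstruction, not the product's real system prompt; the `PlanData` and `PeerStats` shapes and all wording are assumptions, illustrating only the structure described above (sectioned output, numbers restricted to supplied data):

```typescript
interface PlanData { name: string; assets: number; participants: number; }
interface PeerStats { peerCount: number; medianAssets: number; }

function buildNarrativePrompt(plan: PlanData, peers: PeerStats): string {
  return [
    "You are an expert retirement-plan analyst.",
    "Write a structured analysis with these sections:",
    "executive summary, strengths, areas of concern,",
    "peer comparison highlights, actionable recommendations.",
    // The guardrail from the text: every number must come from the filing data.
    "Use ONLY the numbers provided below. Never invent statistics.",
    "",
    `Plan data: ${JSON.stringify(plan)}`,
    `Peer statistics: ${JSON.stringify(peers)}`,
  ].join("\n");
}
```

Keeping the instruction and the data in one deterministic template makes the "never invent statistics" constraint auditable: the exact text sent to the model can be logged alongside the filing data it was given.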
The Excel workflow topped out at 1 plan / 30 min. Bench(k) v3 runs at 1 plan / 30 sec — a 60× throughput improvement. For a Prime advisor with a book of 100 plans, that's the difference between roughly 50 hours of manual work and a 50-minute afternoon session.
The complete advisor flow from opening the app to reading the AI analysis. No manual data entry anywhere in the chain.
Visit bench-k.pages.dev. The 299 MB Form 5500 dataset streams with live progress. NOX constellation spinner + animated status messages keep the advisor informed: "Loading 869,889 plans…" → "Preparing 15 benchmark charts…" → "AI analysis engine ready." ~25s · once per session
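Streaming a 299 MB file with a live progress indicator follows a standard Streams API pattern. A generic sketch, not the product's actual loader; the function name and callback shape are assumptions, and it presumes the total byte count is known up front (e.g. from a `Content-Length` header):

```typescript
async function readWithProgress(
  body: ReadableStream<Uint8Array>,
  totalBytes: number,
  onProgress: (loadedBytes: number, totalBytes: number) => void,
): Promise<Uint8Array> {
  const reader = body.getReader();
  const chunks: Uint8Array[] = [];
  let loaded = 0;
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    chunks.push(value);
    loaded += value.length;
    onProgress(loaded, totalBytes); // drives the live "Loading…" status messages
  }
  // Stitch the chunks into a single contiguous buffer.
  const out = new Uint8Array(loaded);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

In the browser this would wrap `fetch`: pass `response.body!` and `Number(response.headers.get("Content-Length"))`, then decode the returned buffer into the in-memory plan index.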
Type "anello" or "summit" or any part of a company name. Live results appear in the sidebar with score badges, location, and assets. Click a plan. The identity card, 15 benchmark charts, and peer stats render instantly. <3s
Claude Sonnet 4 receives the full plan data and peer statistics. It writes a structured expert narrative — executive summary, ranked strengths and concerns, peer comparison highlights, actionable recommendations. ~15s
The advisor scrolls through 15 benchmark charts and the AI narrative. The Help Center in the sidebar explains scoring methodology, data sources, and plan feature codes. Export produces a print-ready report. ~10s
Faster benchmarks are the visible improvement. The deeper change is what an advisor can now do that wasn't possible before.
At 30 seconds per plan, a Prime advisor can run their entire book in an afternoon. Quarterly benchmarking becomes a calendar event, not a project.
Before a prospect meeting, search for their company name — public filing data — and walk in with a benchmark of their current plan vs. peers. Zero sponsor effort required. Prime advisors show up with data, not brochures.
Every peer comparison shows the exact peer count. Every number comes from legally attested Form 5500 filings. Plan committees and ERISA counsel can trace every data point to its source.