The Case Study
Five-Step Walkthrough
The Excel and SQL projects in this portfolio both look backward — they verify invoices that have already been billed and reconcile payments that have already been collected. But by the time an exception report flags a problem, the revenue impact has often compounded for several billing cycles.
Revenue Operations teams need a forward-looking signal: which accounts are behaving differently from their own normal pattern right now, before the deviation becomes a quarter-end surprise. A single month-over-month comparison isn't enough — a 15% dip might be totally normal for a seasonal account and alarming for a steady one. The detection logic needs to be relative to each account's own history, not a flat company-wide threshold.
- Flat thresholds ("alert if revenue drops >10%") produce false positives on naturally volatile accounts and miss real problems on stable ones
- Manual review of 30+ accounts across 24 months of history doesn't scale
- Slow erosion — a few percent decline every month — rarely trips a single-month alert but compounds into serious revenue loss
- No way to rank which anomalies deserve attention first when several flag in the same week
generate_data.py builds 720 rows of monthly revenue across 30 fictitious SaaS accounts spanning July 2024 through June 2026. Each account starts with its own baseline MRR, its own gentle growth or decline trend, and its own natural month-to-month noise — so the simulated data doesn't look artificially uniform.
Five deliberate anomalies are seeded on top of that baseline, each shaped differently so the detection logic has to handle more than one pattern type:
- Apex Holdings — sudden churn-style drop to ~30% of prior revenue, staying low (a customer that's leaving)
- Lighthouse Media — one-month spike to nearly 3x normal revenue, then reverting (a billing error, not real growth)
- Northgate Partners — slow erosion across 12 consecutive months, a few percent at a time (easy to miss month-to-month)
- Sterling Capital — sharp one-month spike followed by a gradual climb to a genuinely higher baseline (a real expansion, not an error)
- Crestline Industries — sudden drop followed by partial recovery the next month (a service disruption that got partially resolved)
The core method: for every account, compute a trailing 6-month rolling mean and rolling standard deviation, shifted forward one month so the current month is being compared against its own prior baseline — not a baseline that includes itself.
flag if |z| ≥ 2.5
The classification logic that turns raw z-scores and erosion flags into a labeled anomaly type:
def classify(row): # Point anomalies (single dramatic month) take priority over erosion # since a sudden drop or spike is more urgent than a gradual trend if row["point_anomaly"] and row["z_score"] > 0: return "SPIKE" elif row["point_anomaly"] and row["z_score"] < 0: return "DROP" elif row["erosion_flag"]: return "EROSION" else: return "NORMAL" g["anomaly_type"] = g.apply(classify, axis=1) # Severity score unifies point anomalies and erosion onto one scale g["severity_score"] = np.where( g["anomaly_type"] == "EROSION", g["erosion_pct"].abs() * 100, g["z_score"].abs().fillna(0) )
Top of the ranked anomaly report
| account | month | revenue | rolling_mean | pct_vs_baseline | type | severity |
|---|---|---|---|---|---|---|
| Apex Holdings | 2026-04 | $8,954.90 | $24,896.35 | -64.0% | EROSION | 73.09 |
| Apex Holdings | 2026-03 | $9,594.54 | $28,832.55 | -66.7% | EROSION | 71.11 |
| Lighthouse Media | 2025-10 | $115,321.82 | $45,438.23 | +153.8% | SPIKE | 31.21 |
| Sterling Capital | 2025-04 | $50,884.28 | $29,359.53 | +73.3% | SPIKE | 21.32 |
| Apex Holdings | 2026-02 | $11,193.63 | $32,642.42 | -65.7% | DROP | 20.77 |
| Crestline Industries | 2026-01 | $6,543.81 | $14,552.17 | -55.0% | DROP | 18.34 |
The script outputs a four-panel chart of the top flagged accounts, showing each account's revenue line against its own rolling baseline with anomalies marked at the point of deviation. Gradient fills and glowing markers make the deviation points immediately visible without needing to read the underlying numbers.
Apex Holdings shows the churn-style drop staying low across three months. Lighthouse Media shows the single-month billing-error spike reverting immediately after. Crestline Industries shows a sharp drop with partial recovery. Sterling Capital shows a spike followed by a genuinely higher sustained baseline — the model correctly stops flagging it as anomalous once the new baseline catches up.
Outcomes
What the Detection Run Produced
Technical Skills
Tools & Techniques Used
Python & Data Logic
Business & Revenue Ops Domains
Download the script
Run it against your own monthly revenue export — swap the CSV, the detection logic doesn't change.