Revenue Anomaly Detector — Case Study

Portfolio simulation notice: All account names and revenue figures in this case study are entirely fictitious, generated programmatically for portfolio demonstration purposes. No real client, billing, or proprietary revenue data has been used or referenced.

The Case Study

Five-Step Walkthrough

The Problem

Why a backward-looking audit isn't enough

▾

The Excel and SQL projects in this portfolio both look backward — they verify invoices that have already been billed and reconcile payments that have already been collected. But by the time an exception report flags a problem, the revenue impact has often compounded for several billing cycles.

Revenue Operations teams need a forward-looking signal: which accounts are behaving differently from their own normal pattern right now, before the deviation becomes a quarter-end surprise. A single month-over-month comparison isn't enough — a 15% dip might be totally normal for a seasonal account and alarming for a steady one. The detection logic needs to be relative to each account's own history, not a flat company-wide threshold.

Flat thresholds ("alert if revenue drops >10%") produce false positives on naturally volatile accounts and miss real problems on stable ones
Manual review of 30+ accounts across 24 months of history doesn't scale
Slow erosion — a few percent decline every month — rarely trips a single-month alert but compounds into serious revenue loss
No way to rank which anomalies deserve attention first when several flag in the same week

The Simulated Dataset

30 accounts, 24 months, 5 seeded scenarios

▾

generate_data.py builds 720 rows of monthly revenue across 30 fictitious SaaS accounts spanning July 2024 through June 2026. Each account starts with its own baseline MRR, its own gentle growth or decline trend, and its own natural month-to-month noise — so the simulated data doesn't look artificially uniform.

Five deliberate anomalies are seeded on top of that baseline, each shaped differently so the detection logic has to handle more than one pattern type:

Apex Holdings — sudden churn-style drop to ~30% of prior revenue, staying low (a customer that's leaving)
Lighthouse Media — one-month spike to nearly 3x normal revenue, then reverting (a billing error, not real growth)
Northgate Partners — slow erosion across 12 consecutive months, a few percent at a time (easy to miss month-to-month)
Sterling Capital — sharp one-month spike followed by a gradual climb to a genuinely higher baseline (a real expansion, not an error)
Crestline Industries — sudden drop followed by partial recovery the next month (a service disruption that got partially resolved)

Detection Logic

Rolling baseline, z-score, and erosion check

▾

The core method: for every account, compute a trailing 6-month rolling mean and rolling standard deviation, shifted forward one month so the current month is being compared against its own prior baseline — not a baseline that includes itself.

Rolling Baseline

rolling_mean = revenue.rolling(6, min_periods=5).mean().shift(1)

Each account's own trailing 6 months become its personal baseline — a volatile account and a stable account each get judged against their own history, not a single company-wide rule.

Std Floor (stability fix)

rolling_std = rolling_std.clip(lower = rolling_mean * 0.03)

Without this floor, a few unusually flat trailing months produce a near-zero standard deviation, which sends the z-score to extreme and meaningless values. Flooring it at 3% of the rolling mean keeps the score stable.

Z-Score Classification

z = (revenue - rolling_mean) / rolling_std
flag if |z| ≥ 2.5

Months that land more than 2.5 standard deviations from the account's own baseline are flagged as a point anomaly — classified as SPIKE if above baseline, DROP if below.

Erosion Check

erosion = revenue.pct_change(periods=6) ≤ -30%

A secondary check that catches slow decline a single-month z-score would miss — if revenue has fallen 30%+ over the trailing 6 months even without one dramatic month, it's flagged as EROSION.

Severity Score

severity = |z-score| or |erosion %| × 100

Point anomalies and erosion are measured on different scales (standard deviations vs. percentage decline). The severity score normalizes both onto one comparable number so every anomaly type can be ranked together.

Priority Tiering

HIGH / MEDIUM / LOW based on severity thresholds per type

Converts the raw severity score into an executive-friendly triage label, so a RevOps manager can scan the report and know immediately which accounts need a conversation this week.

Code & Output Preview

The classification function and ranked report

▾

The classification logic that turns raw z-scores and erosion flags into a labeled anomaly type:

def classify(row):
    # Point anomalies (single dramatic month) take priority over erosion
    # since a sudden drop or spike is more urgent than a gradual trend
    if row["point_anomaly"] and row["z_score"] > 0:
        return "SPIKE"
    elif row["point_anomaly"] and row["z_score"] < 0:
        return "DROP"
    elif row["erosion_flag"]:
        return "EROSION"
    else:
        return "NORMAL"

g["anomaly_type"] = g.apply(classify, axis=1)

# Severity score unifies point anomalies and erosion onto one scale
g["severity_score"] = np.where(
    g["anomaly_type"] == "EROSION",
    g["erosion_pct"].abs() * 100,
    g["z_score"].abs().fillna(0)
)

Top of the ranked anomaly report

account	month	revenue	rolling_mean	pct_vs_baseline	type	severity
Apex Holdings	2026-04	$8,954.90	$24,896.35	-64.0%	EROSION	73.09
Apex Holdings	2026-03	$9,594.54	$28,832.55	-66.7%	EROSION	71.11
Lighthouse Media	2025-10	$115,321.82	$45,438.23	+153.8%	SPIKE	31.21
Sterling Capital	2025-04	$50,884.28	$29,359.53	+73.3%	SPIKE	21.32
Apex Holdings	2026-02	$11,193.63	$32,642.42	-65.7%	DROP	20.77
Crestline Industries	2026-01	$6,543.81	$14,552.17	-55.0%	DROP	18.34

Results & Visualization

Chart of the top flagged accounts

▾

The script outputs a four-panel chart of the top flagged accounts, showing each account's revenue line against its own rolling baseline with anomalies marked at the point of deviation. Gradient fills and glowing markers make the deviation points immediately visible without needing to read the underlying numbers.

Revenue

Rolling Baseline

SPIKE

DROP

EROSION

Revenue vs Rolling Baseline — Apex Holdings

Month-over-Month % Change

Apex Holdings shows the churn-style drop staying low across three months. Lighthouse Media shows the single-month billing-error spike reverting immediately after. Crestline Industries shows a sharp drop with partial recovery. Sterling Capital shows a spike followed by a genuinely higher sustained baseline — the model correctly stops flagging it as anomalous once the new baseline catches up.

Outcomes

What the Detection Run Produced

6.7%

Of all 720 account-months flagged — a clean signal, not noise

4 of 5

Seeded anomaly scenarios surfaced at the very top of the ranked report

3 types

SPIKE, DROP, and EROSION all detected by one unified scoring method

Self-correcting

Sterling Capital's new higher baseline stopped triggering once the rolling window caught up — no manual threshold reset needed

Technical Skills

Tools & Techniques Used

Python & Data Logic

Business & Revenue Ops Domains

Download the script

Run it against your own monthly revenue export — swap the CSV, the detection logic doesn't change.

↓ Download anomaly_detector.py ← Back to Portfolio Get in touch