Every ops leader eventually learns the same lesson: a noisy alert system is worse than no alert system.
It creates false urgency, trains people to ignore messages, and erodes trust. The goal is not "more alerts." The goal is faster noticing with less noise.
The core principle: alerts must be spend-aware
A 20% CPA swing is not always meaningful. The question is: how much money was exposed while the metric drifted?
So instead of "% change," design alerts around:
Spend-at-risk (simple version)
Spend-at-risk is the amount of spend flowing through segments that are missing target. For a cost metric like CPA, that means segments running above target.
Example: If 60% of spend sits in segments above CPA target, you have a real problem even if the blended average looks fine.
This one framing makes client reporting more honest and ops work easier to prioritize.
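To make the definition concrete, here is a minimal Python sketch of the simple version. The `Segment` type, the field names, and the numbers are assumptions made for illustration, not part of any ad platform's API. Segments that spend without converting are treated as at risk, which is also an assumption you may want to change.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    # Hypothetical shape: one row per campaign, ad set, or audience.
    name: str
    spend: float
    conversions: int


def spend_at_risk(segments, cpa_target):
    """Return (spend in segments above CPA target, its share of total spend)."""
    total = sum(s.spend for s in segments)
    at_risk = 0.0
    for s in segments:
        # Spend with zero conversions has no finite CPA; count it as above target.
        cpa = s.spend / s.conversions if s.conversions else float("inf")
        if s.spend > 0 and cpa > cpa_target:
            at_risk += s.spend
    share = at_risk / total if total else 0.0
    return at_risk, share


# Illustrative numbers only: blended CPA is 1000 / 30, about 33, under a target of 50.
segments = [
    Segment("prospecting", spend=600.0, conversions=10),  # CPA 60, above target
    Segment("retargeting", spend=400.0, conversions=20),  # CPA 20, below target
]
at_risk, share = spend_at_risk(segments, cpa_target=50.0)
print(f"Spend-at-risk: {at_risk:.0f} ({share:.0%} of spend)")
```

In this made-up account the blended CPA looks healthy, yet most of the money is flowing through the segment that misses target, which is the situation the 60% example describes.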
The 6 rules of production alerting
1) Define severity levels
Keep it simple:
If everything is critical, nothing is.
2) Use persistence windows
Most performance data is noisy. Require persistence.
Examples:
This avoids chasing spikes.
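One way to express a persistence window in code is to require that the most recent N observations all breach the threshold. The sketch below assumes one value per day and uses invented numbers. The function name and the "+10% for 2 days" setting are illustrative choices, not a recommended default.

```python
def breached_persistently(values, threshold, window):
    """True only if the last `window` observations are all above `threshold`."""
    if window <= 0 or len(values) < window:
        # Not enough history yet: stay quiet rather than guess.
        return False
    return all(v > threshold for v in values[-window:])


# Hypothetical daily pacing deviation from plan (0.12 means +12%).
one_day_spike = [0.04, 0.22, 0.06]
sustained_drift = [0.04, 0.06, 0.12, 0.15]

print(breached_persistently(one_day_spike, threshold=0.10, window=2))
print(breached_persistently(sustained_drift, threshold=0.10, window=2))
```

The single-day spike does not fire, while two consecutive days above the threshold do. A longer window trades slower detection for fewer false alarms.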
3) Use minimum data thresholds
Do not alert on a metric computed from 3 conversions.
Add minimums: impressions, clicks, conversions, spend.
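A minimum-data gate can be a small check that runs before any threshold is evaluated. The floor values below are placeholders chosen for illustration. Sensible minimums depend on the account and the metric, so treat them as assumptions.

```python
# Hypothetical floors; tune per account and per metric.
MINIMUMS = {"impressions": 1000, "clicks": 50, "conversions": 20, "spend": 100.0}


def below_minimums(stats, minimums=MINIMUMS):
    """Return the metrics that fall short of their floor; an empty list means OK to alert."""
    return [name for name, floor in minimums.items() if stats.get(name, 0) < floor]


stats = {"impressions": 5000, "clicks": 120, "conversions": 3, "spend": 250.0}
short = below_minimums(stats)
if short:
    print("Suppress alert, not enough data for:", ", ".join(short))
```

Returning the list of metrics that fell short, rather than a bare boolean, makes it easy to log why an alert was suppressed.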
4) Always attach baselines and comparisons
Every alert must include:
5) Link directly to the object
If the alert says "campaign," include the campaign link. If it says "ad set," include the ad set link.
Hunting for the right object destroys response time.
6) Assign an owner and escalation path
For critical alerts, escalation should be explicit:
Example: what a good alert looks like
```
⚠️ Critical: Meta pacing +18% MTD
Drivers: 2 ad sets (links)
Evidence: yesterday vs 7-day CPA
Threshold: +10% for 2 days
Spend-at-risk: $X projected by month-end
Next action: reduce cap or shift budget to Y
```
Exception, evidence, next action. No dashboard.
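If alerts are generated by code, the same structure can be enforced before anything is sent. The sketch below is one possible approach. The `Alert` class and its field names are hypothetical and simply mirror the lines of the sample alert. It refuses to render an alert that has a blank field.

```python
from dataclasses import dataclass, fields


@dataclass
class Alert:
    # Field names are illustrative and mirror the sample alert's lines.
    severity: str
    headline: str
    drivers: str
    evidence: str
    threshold: str
    spend_at_risk: str
    next_action: str

    def render(self):
        # Reject alerts that would reach someone without evidence or a next step.
        missing = [f.name for f in fields(self) if not getattr(self, f.name).strip()]
        if missing:
            raise ValueError("alert is missing: " + ", ".join(missing))
        return "\n".join([
            f"{self.severity}: {self.headline}",
            f"Drivers: {self.drivers}",
            f"Evidence: {self.evidence}",
            f"Threshold: {self.threshold}",
            f"Spend-at-risk: {self.spend_at_risk}",
            f"Next action: {self.next_action}",
        ])


alert = Alert(
    severity="Critical",
    headline="Meta pacing +18% MTD",
    drivers="2 ad sets (links)",
    evidence="yesterday vs 7-day CPA",
    threshold="+10% for 2 days",
    spend_at_risk="$X projected by month-end",
    next_action="reduce cap or shift budget to Y",
)
print(alert.render())
```

Failing loudly on a blank field keeps half-formed alerts out of the channel, at the cost of needing a fallback path for alerts that cannot be completed automatically.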
How to operationalize alert tuning
Once a month, review:
Tune thresholds. Alerting is not "set and forget." It's a product.
FAQ
What's the #1 reason alerts fail?
They don't include evidence and they don't have owners.
How do we avoid alert fatigue quickly?
Add persistence windows and minimum data thresholds. Then prune categories aggressively.