Complaint handling used to be reactive

Complaint handling used to be entirely reactive. An order would go wrong, a customer would file a complaint, and only then would ops look into what happened. By the time anyone acted, the damage to trust was already done.

Scoring every live order, in real time

Agility built an order-risk prediction layer into ATLAS that scores every live order, in real time, for the probability of a preventable failure before it becomes a complaint. The model draws on order value, time of day, how busy the merchant's kitchen is at that moment, the customer's complaint history, the assigned driver's historical error rate, the zone, and live traffic conditions.
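As a rough illustration of what scoring a live order can look like, the sketch below passes a handful of the signals listed above through a logistic function to get a failure probability. The source does not describe the model type, so the logistic form, the field names, and every weight here are assumptions chosen for readability, not the actual ATLAS model. The zone is left out for brevity; in practice it would likely enter as a categorical feature.

```python
import math
from dataclasses import dataclass


@dataclass
class OrderSignals:
    # Hypothetical feature names; the real ATLAS schema is not described in the source.
    order_value: float            # order total in currency units
    is_peak_hour: bool            # time-of-day signal
    kitchen_delay_min: float      # how far behind the merchant is running right now
    customer_complaints_90d: int  # prior complaints from this customer
    driver_error_rate: float      # driver's historical error rate, between 0 and 1
    traffic_delay_min: float      # live traffic delay on the route


# Illustrative weights only; a real model would learn these from labeled orders.
WEIGHTS = {
    "order_value": 0.01,
    "is_peak_hour": 0.4,
    "kitchen_delay_min": 0.08,
    "customer_complaints_90d": 0.15,
    "driver_error_rate": 3.0,
    "traffic_delay_min": 0.05,
}
BIAS = -3.0


def failure_probability(order: OrderSignals) -> float:
    """Combine the signals linearly, then squash to a probability in (0, 1)."""
    z = BIAS + sum(WEIGHTS[name] * float(value) for name, value in vars(order).items())
    return 1.0 / (1.0 + math.exp(-z))


calm_order = OrderSignals(20.0, False, 0.0, 0, 0.02, 0.0)
risky_order = OrderSignals(80.0, True, 22.0, 2, 0.15, 10.0)

for label, order in (("calm", calm_order), ("risky", risky_order)):
    print(label, round(failure_probability(order), 2))
```

Under these made-up weights, the order with a late kitchen, peak-hour load, and a weaker driver record scores well above the calm one, which is the ordering the risk queue depends on.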

Rather than returning a black-box score, the model surfaces the actual contributing factors behind each prediction using SHAP, so a dispatcher sees not just "this order is high risk" but "high risk because this merchant is running 22 minutes behind and this driver has a below-average on-time rate in this zone."
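The source names SHAP but not the model or explainer behind it. For a model that is linear in its inputs, and assuming the features are independent, the SHAP value of a feature has a simple closed form: its weight times the gap between the order's value and the fleet-wide average. The sketch below uses that closed form in log-odds space to pick the top risk-raising factors. The feature names, weights, and baseline averages are all hypothetical.

```python
# Hypothetical linear log-odds model; weights and baseline means are invented.
BIAS = -3.0
WEIGHTS = {
    "kitchen_delay_min": 0.08,
    "driver_late_rate_in_zone": 3.0,
    "order_value": 0.01,
    "traffic_delay_min": 0.05,
}
# Average value of each feature across recent orders (the SHAP reference point).
BASELINE_MEANS = {
    "kitchen_delay_min": 5.0,
    "driver_late_rate_in_zone": 0.10,
    "order_value": 30.0,
    "traffic_delay_min": 4.0,
}


def score(x: dict) -> float:
    """Linear log-odds score for one order."""
    return BIAS + sum(WEIGHTS[name] * x[name] for name in WEIGHTS)


def linear_shap(x: dict) -> dict:
    """Closed-form SHAP values for a linear model with independent features."""
    return {name: WEIGHTS[name] * (x[name] - BASELINE_MEANS[name]) for name in WEIGHTS}


def top_factors(x: dict, k: int = 2) -> list:
    """Return the k features pushing risk up the most, largest first."""
    raising = [(name, value) for name, value in linear_shap(x).items() if value > 0]
    return sorted(raising, key=lambda item: item[1], reverse=True)[:k]


order = {
    "kitchen_delay_min": 22.0,
    "driver_late_rate_in_zone": 0.25,
    "order_value": 35.0,
    "traffic_delay_min": 4.0,
}

for name, contribution in top_factors(order):
    print(f"{name}: +{contribution:.2f} log-odds")
```

The contributions sum to the difference between this order's score and the score of an average order, which is the property that lets a dispatcher read them as "what moved this order away from normal." A tree-based or otherwise non-linear model would need a different explainer, but the output a dispatcher sees has the same shape.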

ORDER RISK QUEUE
HIGH (0.86) · #48213 · Zone 37: Merchant running 22 min behind · driver below-avg on-time here
MED (0.61) · #48197 · Zone 14: Peak-hour load · high-value order
LOW (0.12) · #48180 · Zone 6: Nominal · driver on-time rate strong
Ranked by predicted failure probability · SHAP factors shown

The risk queue inside the Dispatch Command Center: orders ranked by predicted failure probability, with the contributing factors surfaced.

Ranked so the riskiest orders get attention first

That risk queue now sits inside the Dispatch Command Center, ranked so the highest-risk orders get human attention first. The model retrains on a schedule with a promote-only-if-better guardrail, meaning a new version never goes live unless it actually outperforms the one it's replacing, and every prediction gets logged for drift monitoring.
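A promote-only-if-better guardrail boils down to scoring the candidate and the live model on the same held-out orders and promoting only on a strict improvement. The source does not say which metric or margin is used; the sketch below assumes ROC AUC and a hypothetical min_gain margin, and the labels and scores are toy data.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random failed order outranks a random clean one (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def should_promote(
    labels: Sequence[int],
    current_scores: Sequence[float],
    candidate_scores: Sequence[float],
    min_gain: float = 0.0,  # hypothetical margin; 0.0 means any strict improvement
) -> bool:
    """Promote only if the candidate beats the live model on the same holdout set."""
    current_auc = roc_auc(labels, current_scores)
    candidate_auc = roc_auc(labels, candidate_scores)
    return candidate_auc > current_auc + min_gain


# Toy holdout set: 1 = order ended in a preventable failure, 0 = it did not.
labels = [1, 0, 1, 0, 0, 1]
current_scores = [0.6, 0.5, 0.4, 0.3, 0.2, 0.7]
candidate_scores = [0.8, 0.3, 0.6, 0.2, 0.1, 0.9]

print("promote candidate:", should_promote(labels, current_scores, candidate_scores))
```

The comparison is strict on purpose: a retrained model that merely ties the live one stays on the bench, so a scheduled retrain can never quietly swap in something no better than what is already running.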

The honest limitation, stated plainly

Ranking orders by risk and surfacing the contributing factors is reliable and differentiated. Predicting the exact type of failure is only as good as the underlying complaint data, and in this deployment roughly 90% of attributable complaints trace back to restaurant or kitchen issues. With labels that skewed, the model sees few examples of any other failure type to learn from.

That's not a flaw in the model; it's a ceiling set by the incident data itself. It's exactly why data quality, not just model sophistication, is the real lever for improving prediction depth over time.
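One way to see the ceiling is to compare against a baseline that always predicts the majority failure type. The sketch below uses the roughly 90% share quoted above; the sample size of 1,000 complaints, the other category names, and the split of the remaining 10% are invented for illustration.

```python
from collections import Counter

# Hypothetical complaint log: only the 90% kitchen share comes from the text.
complaints = ["kitchen"] * 900 + ["driver"] * 60 + ["other"] * 40

counts = Counter(complaints)
majority_label, majority_count = counts.most_common(1)[0]

# Accuracy of a "model" that ignores every signal and always answers the majority label.
baseline_accuracy = majority_count / len(complaints)

# Everything the model has to learn all other failure types from.
minority_examples = len(complaints) - majority_count

print(f"always predicting '{majority_label}' is right {baseline_accuracy:.0%} of the time")
print(f"examples of every other failure type combined: {minority_examples}")
```

A failure-type classifier has to beat that do-nothing baseline to add value, and it has to do so from the small remainder of examples. Cleaner, more finely attributed incident records are what widen that remainder.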

The pattern for other industries

Wherever a business has driver history, customer history, and live operational signals, this same architecture predicts failure risk before it happens, whether that failure is a late package, a missed service window, or a temperature excursion in a cold-chain shipment.