Incident Date: September 4, 2025
Reported By: P&G on September 6 at 9:43 AM IST (resolved at 4:40 PM IST, approximately 7 hours later)
System: The Coupon Bureau (TCB) Platform
Component: AWS Sharing API Lambda for DLQ (Dead Letter Queue) Processing
Summary
On September 4, the TCB platform experienced elevated Lambda concurrency and a growing message backlog. The issue was triggered during DLQ processing: exceptions within the DLQ-processing Lambda caused invocations to exit abnormally. Because the affected messages were not successfully acknowledged, they reappeared in the queue and were retried multiple times. This retry loop pushed concurrency close to the reserved limit of 100, leading to processing delays across the system.
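The retry loop described above can be illustrated with a minimal sketch of a queue-triggered Lambda handler that catches exceptions per message instead of exiting abnormally. This is not the TCB implementation. It assumes the Lambda is invoked by an SQS event source mapping with partial batch responses (ReportBatchItemFailures) enabled; `process_message` and the `coupon_id` field are hypothetical placeholders.

```python
import json
import logging

logger = logging.getLogger(__name__)


def process_message(body: dict) -> None:
    """Hypothetical business logic (assumption); raises on malformed input."""
    if "coupon_id" not in body:
        raise ValueError("missing coupon_id")


def handler(event: dict, context: object = None) -> dict:
    """Process each SQS record independently and report only the failures."""
    failures = []
    for record in event.get("Records", []):
        try:
            process_message(json.loads(record["body"]))
        except Exception:
            # Catching here keeps one bad message from aborting the whole
            # invocation, which would otherwise return every message in the
            # batch to the queue for another attempt.
            logger.exception("failed to process message %s", record["messageId"])
            failures.append({"itemIdentifier": record["messageId"]})
    # Partial batch response: only the listed messages are retried.
    return {"batchItemFailures": failures}
```

With this response shape, successfully processed messages are removed from the queue and only the listed message IDs become visible again, so a single failing message no longer causes the whole batch to be reprocessed.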
Impact
Root Cause
Corrective Actions Taken