Post-mortem
Impact window: ~10:08 AM – 12:30 PM CT, May 31, 2026
Summary: Starting around 10:08 AM CT, some customers experienced delays and failures when sending messages and receiving inbound message events, along with elevated timeouts or 5xx API responses. Message sending and inbound delivery were restored by ~11:25 AM CT; we declared full resolution at ~12:30 PM CT after confirming stability.
Cause: A component in our message-processing pipeline became overloaded during a peak in traffic and stopped processing correctly, interrupting message delivery for a subset of traffic.
Resolution: We rebalanced and restored the affected component, then verified end-to-end recovery. The platform is fully operational.
Resolved
Monitoring
All services are now back online and fully operational. We're giving the all-clear, and our engineering team will continue monitoring closely to ensure we don't see any regressions. Thank you for your patience throughout this incident.
Identified
Most services are now back online and we're seeing continued recovery across the platform. Our engineering team is still actively monitoring and working toward full resolution. We're encouraged by the progress and will keep you updated until everything is fully stable.
We'll give the all-clear only once we're confident that everything is fully back online.
Identified
We are seeing some recovery. We've identified the issue and are working to bring all services back online.
Identified
Update: A fix has been successfully implemented, and we are starting to see services recover. We are currently monitoring the systems to ensure full stability.
Investigating
We're experiencing service degradation and high error rates for some partners. We're investigating and will update with more info.