Incident affecting run scheduling

Incident Report for Spacelift

Resolved

After continued monitoring, we have confirmed that the system is stable again.

Posted Jul 23, 2025 - 10:16 UTC

Monitoring

We have identified the root cause of the incident. We are monitoring the system to confirm that it remains stable, and we will take steps to prevent a recurrence.

Posted Jul 23, 2025 - 09:40 UTC

Identified

The problem was caused by a large backlog of messages building up on one of our message queues. The backlog has cleared and the system appears to be operating correctly again. We are now working to understand the root cause so that we can prevent it from happening again.

Posted Jul 23, 2025 - 08:49 UTC

Investigating

We are currently investigating an incident affecting run scheduling and other asynchronous processing in Spacelift. The incident is causing delays in starting runs and in updating statuses in VCS providers. We will post a further update as soon as we have more information.

Posted Jul 23, 2025 - 08:39 UTC

This incident affected: Event processing and Public workers.