Between 25 March 2025, 11:36 UTC and 27 March 2025, 11:12 UTC, our API experienced a slightly elevated rate of 504 Gateway Timeout errors. The issue was reported by customers at 10:30 UTC on 27 March, prompting an investigation.
The issue was traced to a DELETE query that became stuck in the preparing state. This led to lock contention in the database, occasionally blocking other queries and resulting in timeouts.
Once identified, the problematic query was terminated and lock contention was cleared. Error rates returned to normal levels as of 11:12 UTC on 27 March.
We are continuing to investigate why the query became stuck in the preparing state and why it caused lock contention affecting unrelated queries.
As a general best practice, we recommend implementing retries with exponential backoff in workflows that depend on our API, to gracefully handle occasional transient errors like 504 timeouts.
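For illustration, retries with exponential backoff might look like the minimal Python sketch below. It is not an official client. The names `TransientError` and `call_with_backoff`, along with the attempt count and delay values, are assumptions chosen for the example. The `request` argument stands for whatever function in your workflow calls the API and raises `TransientError` on a retryable response such as a 504.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical error your request function raises on a retryable failure (e.g. HTTP 504)."""


def call_with_backoff(request, max_attempts=5, base_delay=0.5, max_delay=30.0, sleep=time.sleep):
    """Call request() until it succeeds or max_attempts calls have failed.

    All defaults are illustrative assumptions, not recommended values.
    """
    for attempt in range(max_attempts):
        try:
            return request()
        except TransientError:
            # Out of attempts: surface the last error to the caller.
            if attempt == max_attempts - 1:
                raise
            # Exponential backoff: the ceiling doubles each attempt, capped at max_delay.
            ceiling = min(max_delay, base_delay * 2 ** attempt)
            # Random jitter spreads retries out so clients do not retry in lockstep.
            sleep(random.uniform(0, ceiling))
```

Retrying is safest for requests that are idempotent, since a request that timed out may still have been processed.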
We apologize for any inconvenience this may have caused and appreciate the customers who reported the issue.