Monitoring - We’re happy to report that the system has fully recovered and all encodings are now being processed normally across all regions and infrastructure environments.

As a next step, we are conducting a full root cause analysis to understand what led to the incident and to define measures that will prevent similar issues in the future.
We will provide a detailed postmortem once the analysis is completed.

Thank you again for your patience and trust throughout this incident.

Jul 15, 2025 - 16:41 UTC
Update - We have now manually recovered all previously stuck encodings and successfully cleaned up the instances that were consuming our GCP SSD quota.

As a result, GCP-based encoding jobs are once again processing successfully. We are currently performing a controlled ramp-up of system capacity to work through the backlog of affected jobs and ensure platform stability.

We continue to monitor closely and will provide further updates as full performance is restored.

Thank you for your patience during this incident.

Jul 15, 2025 - 15:10 UTC
Update - We have successfully recovered encoding operations for all non-GCP accounts and fully processed the existing backlog.

We are now beginning a controlled and gradual recovery of our GCP (Google Cloud Platform) operations. Scheduling of encoding jobs on GCP remains halted as we work to stabilize the environment and resolve the underlying SSD quota constraints.

We will continue to monitor the situation closely and provide further updates as we increase GCP capacity and restore full service.

Thank you for your continued patience and understanding.

Jul 15, 2025 - 10:15 UTC
Identified - We have identified the root cause of the current encoding disruption as an exhaustion of SSD quota within our Google Cloud Platform (GCP) infrastructure. This quota issue is preventing us from provisioning additional instances required to scale encoding jobs, which is causing GCP-based workloads to become extremely slow and trigger our overall system safeguards.
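
For context, Compute Engine enforces SSD capacity as a per-region quota (for example, the SSD_TOTAL_GB metric); once a region's quota is fully consumed, requests to provision additional SSD-backed instances fail until capacity is freed or a quota increase is granted. As a minimal sketch, assuming the google-cloud-compute Python client and placeholder project and region names, such a quota can be inspected like this:

```python
# Illustrative sketch only: list a region's SSD-related quotas with the
# google-cloud-compute client. Project and region values are placeholders.
from google.cloud import compute_v1


def print_ssd_quotas(project_id: str, region: str) -> None:
    client = compute_v1.RegionsClient()
    region_info = client.get(project=project_id, region=region)
    for quota in region_info.quotas:
        # Quotas such as SSD_TOTAL_GB cap the total SSD persistent disk
        # capacity that can be provisioned in the region.
        if "SSD" in str(quota.metric):
            print(f"{quota.metric}: {quota.usage:.0f} / {quota.limit:.0f} GB")


if __name__ == "__main__":
    print_ssd_quotas("example-project", "us-central1")
```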

To mitigate the impact, we have paused scheduling of encoding jobs on GCP and are actively working to restore service functionality for customers using non-GCP infrastructure.

Thank you for your patience as we work to fully restore encoding operations.

Jul 15, 2025 - 09:14 UTC
Monitoring - We are currently experiencing scheduling delays across our encoding platform. Our monitoring systems detected increased queue times and longer job scheduling intervals starting at approximately 04:00 UTC today.

Our engineering team has been notified and is actively investigating the root cause. We are seeing signs of recovery with scheduling times beginning to improve.

We will provide updates as more information becomes available and will post a follow-up once the issue has been fully resolved.
We apologize for any inconvenience this may have caused and appreciate your patience as we work to restore normal scheduling performance.

Jul 15, 2025 - 06:37 UTC

Component status (uptime over the past 90 days):

Bitmovin API: Operational, 99.99% uptime
Account Service: Operational, 100.0% uptime
Input Service: Operational, 100.0% uptime
Encoding Service: Operational, 99.99% uptime
Output Service: Operational, 99.99% uptime
Statistics Service: Operational, 100.0% uptime
Infrastructure Service: Operational, 100.0% uptime
Configuration Service: Operational, 100.0% uptime
Manifest Service: Operational, 100.0% uptime
Player Service: Operational, 100.0% uptime
Player Licensing: Operational, 100.0% uptime
Analytics Service: Operational, 100.0% uptime
Analytics Ingress: Operational, 100.0% uptime
Query Service: Operational, 100.0% uptime
Export Service: Operational, 100.0% uptime
Bitmovin Dashboard: Operational, 100.0% uptime

Jul 15, 2025
Resolved - This incident has been fully resolved.
All services are operational, and all buffered data has been successfully backfilled as of 06:57 UTC. No data was lost during the incident.

Jul 15, 07:59 UTC
Monitoring - We have resolved the underlying issues affecting the Analytics backend database. All services have fully recovered, and query error rates have returned to normal levels.

We are now actively backfilling the buffered data to ensure all historical events are written and available in the system. No data has been lost.

We will continue to monitor the system closely and provide a final update once the backfill is fully complete.

Jul 15, 06:47 UTC
Investigating - We are currently investigating elevated query error rates and backend insert failures affecting our main Analytics database.
While queries may intermittently fail and analytics dashboards may show incomplete or delayed data, no data is being lost — all incoming events that cannot currently be written are being safely buffered.

Our engineering team is actively working to identify the root cause and restore full functionality.
We will continue to provide updates as the situation evolves.

Jul 15, 06:40 UTC
Jul 14, 2025

No incidents reported.

Jul 13, 2025

No incidents reported.

Jul 12, 2025

No incidents reported.

Jul 11, 2025

No incidents reported.

Jul 10, 2025
Resolved - On 2025-07-10 between 00:06 UTC and 00:26 UTC, and again from 01:19 UTC to 01:39 UTC, our API experienced elevated latency and an increased rate of 503 Service Unavailable and 504 Gateway Timeout errors. The issue was detected by our monitoring systems during the first incident window, prompting an immediate investigation.

The issue was traced to a customer workflow that generated unexpectedly high traffic volumes, placing excessive load on our API gateway. The gateway was unable to keep up with the volume of rate-limiting decisions required, leading to memory pressure on the gateway nodes. This memory pressure resulted in longer request processing times and upstream timeouts.

Once identified, additional API gateway capacity was added and memory pressure on the gateway nodes was alleviated. Response times and error rates returned to normal levels as of 01:39 UTC.

The high traffic volume during the incident also resulted in a significant backlog of encoding jobs. Queue times and processing throughput have been impacted and are slowly returning to normal levels as the system works through the accumulated backlog.

We have already tweaked our API gateway configuration to increase the number of available nodes and allocate more memory per node to better handle traffic spikes. Additionally, we will be implementing auto-scaling capabilities over the coming weeks to further prevent similar incidents in the future.

This incident also triggered a comprehensive review of our rate limiting configuration. As a result of this analysis, we have adjusted our rate limits to better balance system protection with customer workflow requirements. To see our current API rate limits, please check the following documentation: https://developer.bitmovin.com/encoding/reference/introduction-of-api-rate-limits

As a general best practice, we recommend implementing retries with exponential backoff in workflows that depend on our API, to gracefully handle occasional transient errors like 503/504 responses.
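
As a minimal sketch of that pattern, assuming a plain HTTP client and placeholder endpoint and header values (adapt them to your own API calls), a retry loop with exponential backoff and jitter could look like this:

```python
# Minimal sketch of retries with exponential backoff for transient 503/504 errors.
# The URL and header below are placeholders, not a specific Bitmovin endpoint.
import random
import time

import requests

RETRYABLE_STATUS = {503, 504}


def get_with_backoff(url: str, api_key: str, max_retries: int = 5) -> requests.Response:
    delay = 1.0  # initial backoff in seconds
    for attempt in range(max_retries + 1):
        response = requests.get(url, headers={"X-Api-Key": api_key}, timeout=30)
        if response.status_code not in RETRYABLE_STATUS:
            return response
        if attempt == max_retries:
            break
        # Sleep for an exponentially growing interval with random jitter
        # so that many clients do not retry in lockstep.
        time.sleep(delay + random.uniform(0, delay))
        delay *= 2
    response.raise_for_status()
    return response
```

Capping the number of attempts and adding jitter keeps clients from retrying simultaneously and overwhelming the API again the moment it recovers.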

We apologize for any inconvenience this may have caused.

Jul 10, 00:30 UTC
Jul 9, 2025

No incidents reported.

Jul 8, 2025

No incidents reported.

Jul 7, 2025

No incidents reported.

Jul 6, 2025

No incidents reported.

Jul 5, 2025

No incidents reported.

Jul 4, 2025

No incidents reported.

Jul 3, 2025

No incidents reported.

Jul 2, 2025

No incidents reported.

Jul 1, 2025

No incidents reported.