Incidents | Runpod

Planned Network Switch Firmware Upgrade (EUR-NO-2)

Fri, 24 Jul 2026 15:00:22 -0000

Maintenance Window
Location: EUR-NO-2
Date: Friday, July 24, 2026
Start Time: 15:00 UTC
End Time: 23:00 UTC
Expected Duration: 8 Hours

Overview & Reason
We will be performing a scheduled firmware upgrade on the Mellanox data switches within the EUR-NO-2 environment. This maintenance is part of our ongoing commitment to reliability and includes essential performance optimizations and security enhancements.

Expected Customer Impact
During the 8-hour maintenance window, customers may experience brief network disconnections or intermittent latency as individual switches are sequentially upgraded and rebooted.
Data Integrity: There is no expected impact on customer data or data persistence.

Customer Recommendation
To minimize potential disruptions, we recommend avoiding traffic-intensive workloads or deferring large, critical data transfers during the maintenance window where possible.

Contingency Plan
Our engineering team will actively monitor the environment throughout the upgrade. Should any unexpected issues arise, a comprehensive rollback procedure is in place to halt the upgrade and restore normal service as quickly as possible.

We apologize for any inconvenience this may cause and appreciate your understanding as we work to improve our infrastructure. If you have any questions or concerns regarding this maintenance, please contact the support team.
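For customers who schedule transfers programmatically, the recommendation in this notice can be approximated with a simple time check. The following is a minimal sketch, not an official Runpod tool: the window bounds are copied from the notice, and the helper names `in_maintenance_window` and `should_defer_transfer` are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional

# Window copied from the notice: Friday, July 24, 2026, 15:00-23:00 UTC.
WINDOW_START = datetime(2026, 7, 24, 15, 0, tzinfo=timezone.utc)
WINDOW_END = datetime(2026, 7, 24, 23, 0, tzinfo=timezone.utc)


def in_maintenance_window(now: datetime) -> bool:
    """Return True if the timezone-aware `now` falls inside the window."""
    return WINDOW_START <= now < WINDOW_END


def should_defer_transfer(now: Optional[datetime] = None) -> bool:
    """Hypothetical helper: defer large transfers while the window is open."""
    if now is None:
        now = datetime.now(timezone.utc)
    return in_maintenance_window(now)
```

A scheduler could call `should_defer_transfer()` before starting a large copy and retry after `WINDOW_END`. The end bound is treated as exclusive here, which is an assumption; the notice does not say either way.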

EUR-IS-1 recovered

Tue, 21 Jul 2026 23:54:55 +0000

EUR-IS-1 recovered

EUR-IS-1 Network Issue

Tue, 21 Jul 2026 23:45:00 -0000

Issue has been resolved and the network in EUR-IS-1 DC is operating normally.

EUR-IS-1 Network Issue

Tue, 21 Jul 2026 22:35:00 -0000

We are experiencing network connectivity issues in EUR-IS-1. We are working to resolve it.

EUR-IS-1 went down

Tue, 21 Jul 2026 22:30:07 +0000

EUR-IS-1 went down

Planned Network Upgrade - InfiniBand Switches & UFM

Sat, 11 Jul 2026 18:30:00 -0000

Scheduled Firmware Upgrade: Unified Fabric Manager (UFM) and InfiniBand (IB) Switches

Purpose: This maintenance is necessary to enhance the stability and performance of our infrastructure. By keeping our network systems up to date, we ensure a more robust environment and help prevent unplanned disruptions to your hosted business services.

Impact to Services:
Node Availability: All GPU nodes will remain online and reachable throughout the maintenance window.
Running Jobs: Because the InfiniBand network switches are being upgraded, network connectivity between nodes will be interrupted. Consequently, any active GPU jobs during this time will be disrupted and are likely to fail.

Action Required: Please do not initiate any new GPU jobs during this maintenance window.

EUR-IS-4 recovered

Fri, 10 Jul 2026 21:02:18 +0000

EUR-IS-4 recovered

EUR-IS-4 Network Issue

Fri, 10 Jul 2026 21:02:00 -0000

Issue has been resolved and the network in EUR-IS-4 DC is operating normally.

EUR-IS-4 went down

Fri, 10 Jul 2026 20:25:00 +0000

EUR-IS-4 went down

EUR-IS-4 Network Issue

Fri, 10 Jul 2026 20:24:00 -0000

We are experiencing network connectivity issues in EUR-IS-4. We are working to resolve it.

EUR-IS-1 Networking is Degraded

Fri, 10 Jul 2026 19:00:00 -0000

The issue has been resolved and networking in EUR-IS-1 is operating normally.

EUR-IS-1 Networking is Degraded

Fri, 10 Jul 2026 10:00:00 -0000

Network performance is degraded in EUR-IS-1. We are working on resolving the issue.

US-TX-3 Network Storage Issue

Thu, 09 Jul 2026 16:50:00 -0000

The issue has been resolved and network storage in US-TX-3 is operating normally.

US-TX-3 Network Storage Issue

Thu, 09 Jul 2026 06:24:00 -0000

We are experiencing a network storage issue in US-TX-3 DC. We are working to resolve it.

US-GA-2 Networking is Degraded

Wed, 08 Jul 2026 22:12:00 -0000

The issue has been resolved and networking in US-GA-2 is operating normally.

US-GA-2 Networking is Degraded

Wed, 08 Jul 2026 20:16:00 -0000

Network performance is degraded in US-GA-2. We are working on resolving the issue.

EUR-NO-1 Network Storage Issue

Wed, 08 Jul 2026 10:27:00 -0000

The network storage issue in EUR-NO-1 is resolved and storage is operating normally.

CA-MTL-3 Network Storage Issue

Wed, 08 Jul 2026 09:56:00 -0000

The network storage issue in CA-MTL-3 is resolved and storage is operating normally.

EUR-NO-1 Network Storage Issue

Wed, 08 Jul 2026 09:52:00 -0000

We are experiencing a network storage issue in EUR-NO-1 DC. We are working to resolve it.

CA-MTL-3 Network Storage Issue

Tue, 07 Jul 2026 21:20:00 -0000

We are experiencing a network storage issue in CA-MTL-3 DC. We are working to resolve it.

Planned Network Hardware Capacity Upgrade in AP-IN-1

Sun, 05 Jul 2026 12:30:00 -0000

Planned Network Hardware Capacity Upgrade in AP-IN-1 Expected Downtime: Brief service interruptions or network fluctuations. Details: During the upcoming network upgrade, you may experience temporary network fluctuations caused by minor packet drops (approximately 3 to 4 packets). Full service will normalize immediately upon the successful completion of the maintenance. Please be assured that all configurations and data stored on your infrastructure will remain completely safe and unaffected throughout this process.

US-PA-1 Network Inbound Issue

Fri, 26 Jun 2026 14:00:00 -0000

This issue has been resolved. ------ Some machines' public IPs and ports are unreachable in US-PA-1, and we're also seeing network interruptions causing the servers to lose connectivity. We are checking with the ISP, and our network engineers are working on getting a full root cause and resolution.

Planned Network Maintenance in AP-IN-1

Sat, 20 Jun 2026 15:30:23 -0000

EUR-IS-1 Network Issue

Fri, 19 Jun 2026 14:26:00 -0000

Issue has been resolved and the network in EUR-IS-1 DC is operating normally.

EUR-IS-1 Network Issue

Fri, 19 Jun 2026 13:46:00 -0000

We are experiencing network connectivity issues in EUR-IS-1. We are working to resolve it.

Upstream issue - Elevated image pull error rates from DockerHub Cloudfront

Mon, 08 Jun 2026 19:52:00 -0000

The issue with downloading images from DockerHub has been resolved and image pulls are operating normally.

Planned Network Maintenance - AP-IN-1

Sat, 06 Jun 2026 18:30:00 -0000

Activity: Scheduled upgrade of UFM1 and InfiniBand (IB) Switches Firmware.

Purpose: This maintenance is necessary to enhance the stability and performance of our infrastructure. By keeping our systems up to date, we ensure a more robust environment and help prevent unplanned disruptions to your hosted business services.

Impact & Important Instructions:
Node Availability: All GPU nodes will remain online and reachable throughout the maintenance window.
Running Jobs: Any active GPU jobs will be disrupted and may fail.

Action Required: Please do not initiate any new GPU jobs during this activity window.

CA-MTL-1 recovered

Tue, 02 Jun 2026 23:29:26 +0000

CA-MTL-1 recovered

CA-MTL-1 went down

Tue, 02 Jun 2026 22:56:43 +0000

CA-MTL-1 went down

EU-SE-1 recovered

Mon, 01 Jun 2026 21:48:21 +0000

EU-SE-1 recovered

EU-SE-1 went down

Mon, 01 Jun 2026 07:42:00 +0000

EU-SE-1 went down

CA-MTL-1 recovered

Wed, 27 May 2026 21:36:14 +0000

CA-MTL-1 recovered

CA-MTL-1 went down

Wed, 27 May 2026 21:21:36 +0000

CA-MTL-1 went down

Planned Network Maintenance - US-NE-1

Wed, 27 May 2026 18:00:53 -0000

Planned network upgrades are scheduled for internet connectivity at the US-NE-1 datacenter. Brief connectivity interruptions may occur while we upgrade the circuit infrastructure.

Upstream authentication provider is experiencing an issue.

Tue, 26 May 2026 21:57:00 -0000

The root cause has been identified and addressed. We will continue to monitor going forward. ------- We are investigating an upstream services issue that is affecting user signup and account switching.

Upstream issue - Elevated image pull error rates from DockerHub Cloudfront

Tue, 26 May 2026 12:00:00 -0000

We're observing increased error rates when downloading images from DockerHub. We've notified their team and are awaiting further detail. We will provide additional information as we receive it.

[Docker added Cloudfront to DockerHub's distribution network on 2026-05-20](https://docs.docker.com/docker-hub/release-notes/#2026-05-20). Since that time, we've seen a growing level of errors for image pulls distributed over Cloudfront. The Docker team is looking into the issue. Runpod cannot control which CDN is used and has confirmed that this is not a Runpod-specific issue.

At this point, it's believed that images with layers larger than 50GB cause Cloudfront to have different cache behavior, resulting in an error. While Docker remediates the issue, we advise using an alternate image distribution network or reducing layer sizes below 50GB.

The issue produces error messages in the console similar to:

ERROR

The request could not be satisfied.

Generated Wed, 03 Jun 2026 23:35:36 GMT
Request ID: yFjsOWdc6NZVG8D5l5omQzXCD4vMBR-4h_JE5gmXP06Y3vxx3M9thw==
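To find out whether an image is likely to be affected, its layer sizes can be compared against the 50GB threshold mentioned above. The sketch below assumes you already have the image manifest as parsed JSON (for example, the output of `docker manifest inspect` for a single-platform image), in which each entry under `layers` carries a `digest` and a `size` in bytes. The digests and sizes in `example_manifest` are made up for illustration. The notice does not say whether the threshold applies to compressed or uncompressed layers, nor whether "GB" is decimal or binary, so the limit used here is an assumption.

```python
# Assumption: "50GB" is read as 50 decimal gigabytes of manifest-reported size.
LAYER_LIMIT_BYTES = 50 * 10**9


def oversized_layers(manifest: dict, limit: int = LAYER_LIMIT_BYTES) -> list:
    """Return (digest, size) pairs for every layer larger than `limit` bytes."""
    return [
        (layer["digest"], layer["size"])
        for layer in manifest.get("layers", [])
        if layer["size"] > limit
    ]


# Hypothetical manifest fragment; real digests are full sha256 hashes.
example_manifest = {
    "schemaVersion": 2,
    "layers": [
        {"digest": "sha256:aaa", "size": 12_000_000_000},
        {"digest": "sha256:bbb", "size": 61_000_000_000},
    ],
}

for digest, size in oversized_layers(example_manifest):
    print(f"{digest}: {size / 10**9:.1f} GB exceeds the limit")
```

Any layer reported by this check is a candidate for splitting into smaller layers at build time.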

CA-MTL-3 Network Issue

Thu, 14 May 2026 21:49:00 -0000

The network issue in CA-MTL-3 is resolved and the network is operating normally.

CA-MTL-3 Network Issue

Thu, 14 May 2026 20:40:00 -0000

Network has been restored in CA-MTL-3 DC. We are continuing to monitor for issues.

CA-MTL-3 Network Issue

Thu, 14 May 2026 20:24:00 -0000

We are experiencing network connectivity issues in CA-MTL-3. We are working to resolve it.

Console: Delayed Pod starts and Empty Console Logs

Fri, 08 May 2026 21:49:00 -0000

The issue has been resolved. Pod starts and the display of logs on the console have returned to normal.

Console: Delayed Pod starts and Empty Console Logs

Fri, 08 May 2026 21:47:00 -0000

We've identified the issue and are working to incorporate the fix.

Upstream Systems recovered

Fri, 08 May 2026 20:02:26 +0000

Upstream Systems recovered

Upstream Systems went down

Fri, 08 May 2026 19:28:18 +0000

Upstream Systems went down

Console: Delayed Pod starts and Empty Console Logs

Fri, 08 May 2026 17:08:00 -0000

We are experiencing increased times for Pods to start along with console logs not being displayed. We are investigating the issue and will provide an update once we have more information.

US-CA-2 recovered

Fri, 08 May 2026 01:46:43 +0000

US-CA-2 recovered

US-CA-2 went down

Fri, 08 May 2026 01:23:38 +0000

US-CA-2 went down

Planned Network Maintenance - EUR-IS-1

Wed, 29 Apr 2026 09:00:49 -0000

Scheduled Maintenance: Datacenter EUR-IS-1
Affected Region: Iceland (EUR-IS-1)
Maintenance Window: April 29th, 2026
Start: 9AM GMT+0
End: 11AM GMT+0

📝 Maintenance Overview
Our Datacenter Network Engineers will be performing essential updates to the core switches within the EUR-IS-1 facility. This maintenance includes critical firmware updates to improve hardware stability and security protocols.

⚡ Expected Impact
We anticipate minimal impact during this window. However, as the core switches transition and links are re-established, you may observe:
Brief Latency Spikes: Temporary increases in ping times.
Minor Packet Loss: Potential for small bursts of dropped packets while traffic is rerouted through redundant paths.

Planned Network Maintenance - US-CA-2

Tue, 28 Apr 2026 20:00:14 -0000

Key Maintenance Details
Date: April 28, 2026
Time: 1:00 PM – 6:00 PM PDT
Location: Datacenter US-CA-2
Scope: Network infrastructure upgrades specifically targeting storage system connectivity.

Impact Assessment
Service Interruption: Expect downtime or intermittent connectivity for systems residing within this datacenter during the maintenance window.
Specific Notifications: Not all machines may be affected. Check your system-level dashboard for individual maintenance flags to confirm if your specific instances are in scope.
Objective: These changes are designed to "enhance network connectivity for performance storage systems," which typically results in lower latency and higher throughput for data-heavy operations once completed.

CA-MTL-1 and CA-MTL-3 Network Issue

Tue, 28 Apr 2026 08:12:00 -0000

Network connectivity to CA-MTL-1 DC has been restored.

CA-MTL-1 and CA-MTL-3 Network Issue

Tue, 28 Apr 2026 06:27:00 -0000

Network connectivity to CA-MTL-3 DC has been restored. We are continuing to work on resolving network issues in CA-MTL-1 DC.

CA-MTL-1 and CA-MTL-3 Network Issue

Tue, 28 Apr 2026 04:52:00 -0000

We are experiencing network connectivity issues in CA-MTL-1 and CA-MTL-3. We are working to resolve it.

Planned Power maintenance for EUR-IS-3 and EUR-IS-4

Wed, 22 Apr 2026 07:00:01 -0000

Planned Power maintenance for EUR-IS-3 and EUR-IS-4

Our datacenter partner will be performing scheduled power maintenance on Wednesday, April 22. While they expect zero downtime for the vast majority of users, please review the details below.

Maintenance Window
PST (USA): 00:00 – 06:00
Iceland Time: 08:00 – 14:00
Beijing Time (CST): 16:00 – 22:00

Key Takeaways
Customer Action: No action is required.
Performance & Data: GPU performance will remain stable, and all local NVMe/Scratch data will be preserved.
Operational Setup: The datacenter partner is consolidating power to a single operational rail. Systems will maintain full PSU functionality during this period.

Expected Impact
While a seamless transition is expected, there is a small risk that a few servers may experience an RPDU trip, resulting in a temporary shutdown.
Mitigation Strategy: The datacenter partner will provide continuous on-site supervision and enhanced monitoring throughout the window. If a shutdown occurs, they are prepared to perform immediate RPDU resets to bring systems back online.

Please monitor your systems as usual; our partner will manage the maintenance and recovery efforts.

Runpod.io website is unavailable

Tue, 14 Apr 2026 20:41:00 -0000

The issue with the runpod.io website has been resolved. We are continuing to monitor for issues.

Runpod.io website is unavailable

Tue, 14 Apr 2026 20:09:00 -0000

The runpod.io website is experiencing issues, but the console and services are operating normally and are not impacted. We are actively working with the upstream provider to address this issue.

[Planned Maintenance] Network Performance Upgrades - EUR-NO-2

Thu, 09 Apr 2026 11:00:39 -0000

Maintenance Notice
We will be performing scheduled network maintenance at our EUR-NO-2 data center. This work is required to optimize traffic flow and ensure the continued stability of our infrastructure.

Details
Location: EUR-NO-2
Start Time: April 9, 2026, at 11:00 UTC
End Time: April 9, 2026, at 11:30 UTC

Summary of Work
This maintenance is focused on infrastructure improvements to enhance TCP performance on IPv4 and resolve existing performance degradation with non-TCP protocols across both IPv4 and IPv6.

Service Impact
Connectivity: All systems within the EUR-NO-2 region will experience a brief interruption.
Expected Downtime: Approximately 5 minutes during the maintenance window.
Note: Our team will be monitoring the systems closely throughout the process to ensure services are restored as quickly as possible.

We appreciate your understanding and apologize for any inconvenience this may cause as we work to provide a better experience.

Incident Notice: Power Service Interruption – US-NC-1

Sat, 04 Apr 2026 17:58:00 -0000

Update: At 1:45 PM PT, power was restored to the site and machines began powering back on. The datacenter is back on full utility power with UPS backup. Datacenter engineers are continuing to investigate why the generator supplied power at a voltage outside the expected range for the UPS, which caused the UPS not to switch over.

---------------

Original issue: Our US-NC-1 datacenter has reported a power failure at the facility that occurred at approximately 8:00 AM PT on April 4th. This failure has taken down the entire A-side power. The servers currently failing are not N+N redundant; they are N+1. Typically, in a power outage, UPS and generators act as backup; however, this failover is not working. Onsite teams are working on a Root Cause Analysis (RCA) to find out why the backup power was unable to kick in to run these servers. Updates will be shared as our datacenter partners provide estimated repair times and as more information becomes available.

EU-FR-1 Network Issue

Tue, 31 Mar 2026 08:39:00 -0000

Issue has been resolved and the network in EU-FR-1 DC is operating normally.

EU-FR-1 Network Issue

Tue, 31 Mar 2026 02:41:00 -0000

We have network recovery in EU-FR-1 DC but will continue to monitor for issues.

EU-FR-1 Network Issue

Tue, 31 Mar 2026 01:11:00 -0000

We are experiencing network connectivity issues in EU-FR-1. We are working to resolve it.

Planned Network Maintenance: US-CA-2 Infrastructure Enhancement

Thu, 26 Mar 2026 18:00:55 -0000

We are expanding our VXLAN fabric in the US-CA-2 datacenter by adding an additional Spine switch. This enhancement increases redundancy and ensures N+1 fault tolerance for all services hosted in this region.

Schedule:
Location: US-CA-2 Datacenter
Date: Thursday, March 26, 2026
Maintenance Window: 11:00 AM – 2:00 PM PDT

Service Impact:
Expected Impact: None
Details: This is a non-disruptive infrastructure addition. Network traffic will continue to route through existing nodes while the new node is integrated.
Monitoring: Our engineering team will be actively monitoring the fabric throughout the window.

No action is required on your part.

US-NC-1 Network Issue

Wed, 25 Mar 2026 04:48:00 -0000

Issue has been resolved and the network in US-NC-1 DC is operating normally.

US-NC-1 Network Issue

Wed, 25 Mar 2026 04:18:00 -0000

We are experiencing network connectivity issues in US-NC-1. We are working to resolve it.

US-NE-1 internet network speeds are degraded

Wed, 18 Mar 2026 11:55:00 -0000

Issue with internet network speeds in US-NE-1 DC has been resolved and is now operating normally.

Scheduled Network Maintenance – EUR-IS-1 (Iceland)

Wed, 18 Mar 2026 09:00:00 -0000

**Date & Time:** Wednesday, March 18, 2026 from 09:00 to 11:00 UTC

**Estimated Duration:** Up to 2 hours

**Affected Region:** EUR-IS-1 (Iceland)

**Summary:** Our datacenter provider will be performing minor configuration changes on edge routers in the EUR-IS-1 region during this window. This is routine network maintenance intended to maintain the health and stability of the datacenter's network infrastructure.

**Expected Impact:** Pods in EUR-IS-1 may experience brief, intermittent network disruptions during this window. The maintenance involves minor edge router changes, so any interruptions are expected to be short-lived rather than a sustained outage. Pod compute resources should remain unaffected.

**Recommended Actions:**

- Be aware that network connectivity for pods in EUR-IS-1 may fluctuate briefly during the maintenance window
- Expect to re-establish SSH sessions or restart network-dependent processes if connectivity is interrupted
- Monitor your pods for any connectivity issues between 09:00 and 11:00 UTC on March 18th

We appreciate your patience as our infrastructure partners work to maintain network reliability in this region.

US-NE-1 internet network speeds are degraded

Tue, 17 Mar 2026 16:23:00 -0000

US-NE-1 DC is experiencing slower internet connectivity. We are working on resolving this issue.

CA-MTL-1 Connectivity issue to network and network storage

Tue, 17 Mar 2026 01:45:00 -0000

Network connectivity has been restored and is operating normally.

CA-MTL-1 Connectivity issue to network and network storage

Mon, 16 Mar 2026 21:43:00 -0000

There are network connectivity issues in CA-MTL-1 DC. We are working to resolve the issue.

Emergency Network Maintenance Notice – US-CA-2 (California)

Sat, 14 Mar 2026 07:00:16 -0000

**Date & Time:** Saturday, March 14, 2026 from 07:01 to 13:00 UTC

**Affected Region:** US-CA-2 (California)

**Summary:** RunPod's data center partner has been notified by their upstream internet provider of an upcoming emergency hardware upgrade on network equipment serving the US-CA-2 facility.

**Expected Impact:** We anticipate minimal impact for pods in US-CA-2. In the most likely scenario, customers may experience brief packet loss during any carrier traffic switchover. A sustained outage is not expected. Infrastructure partners will be actively monitoring throughout the entire window.

**Recommended Actions:**

- Be aware that pods in US-CA-2 may experience brief, intermittent network disruption between 07:01 and 13:00 UTC on March 14th
- Expect to re-establish SSH sessions or restart network-dependent processes if connectivity is interrupted
- Monitor pods for connectivity during the maintenance window

Issue with creating new storage volumes in EU-RO-1 DC

Sat, 14 Mar 2026 02:21:00 -0000

The issue has been resolved. New network volume creation in EU-RO-1 DC is now operating normally.

Issue with creating new storage volumes in EU-RO-1 DC

Sat, 14 Mar 2026 01:36:00 -0000

We are experiencing an issue with creating new storage volumes in EU-RO-1 DC. Existing storage volumes continue to operate as normal. We are working on resolving this issue.

EUR-NO-1 - Network storage performance is degraded

Wed, 11 Mar 2026 19:15:00 -0000

The network storage performance issue in EUR-NO-1 has been resolved and storage is operating normally.

EUR-NO-1 - Network storage performance is degraded

Wed, 11 Mar 2026 15:45:00 -0000

EUR-NO-1's network storage array performance is degraded. We're working on restoring performance.

US-CA-2 Network Storage connectivity issues

Tue, 10 Mar 2026 23:24:00 -0000

Connectivity to the network storage volumes in US-CA-2 has been restored.

US-CA-2 Network Storage connectivity issues

Tue, 10 Mar 2026 22:40:00 -0000

There are network connectivity issues affecting the network storage volumes in US-CA-2 DC.

Console Login Issues

Tue, 10 Mar 2026 16:21:00 -0000

We are currently monitoring issues with an upstream authentication provider that may cause console logins to be slow or fail for some users. Our team is investigating and working with the provider to resolve the issue.

US-CA-2 networking is degraded

Sat, 07 Mar 2026 11:06:00 -0000

All telemetry is returning nominal levels and the ISP is preparing a post mortem report.

US-CA-2 networking is degraded

Sat, 07 Mar 2026 09:45:00 -0000

We are continuing to monitor network performance to ensure full recovery. At this time, we are observing recovery; however, we will keep tracking until we receive positive confirmation from the upstream provider.

US-CA-2 networking is degraded

Sat, 07 Mar 2026 08:30:00 -0000

Remediation has been performed and we are seeing telemetry return to baseline levels. We are continuing to monitor the situation.

US-CA-2 networking is degraded

Sat, 07 Mar 2026 08:08:00 -0000

US-CA-2 networking is degraded, which is affecting the ability to connect to pods within the datacenter. Runpod engineering is investigating in coordination with on-site staff.

Billing Explorer experiencing issue.

Tue, 03 Mar 2026 18:01:00 -0000

We are currently experiencing issues with a downstream service provider, which is impacting Billing Explorer queries. ---- The downstream service provider is back to normal.

EU-FR-1 Network Issue

Tue, 24 Feb 2026 20:57:00 -0000

We've completed monitoring. Issue has been resolved and network is operating normally.

EU-FR-1 Network Issue

Tue, 24 Feb 2026 05:45:00 -0000

We have network recovery in EU-FR-1 DC but will be continuing to monitor for the next 24 hours.

EU-FR-1 Network Issue

Tue, 24 Feb 2026 00:57:00 -0000

We are continuing to work on resolving the network issue in EU-FR-1 DC.

EU-FR-1 Network Issue

Mon, 23 Feb 2026 22:21:00 -0000

We are experiencing network connectivity issues in EU-FR-1. We are working to resolve it.

Emergency Network Maintenance for US-NC-2

Sat, 21 Feb 2026 21:30:56 -0000

**Date & Time:** Saturday, February 21, 2026 at 21:30 UTC

**Estimated Duration:** Up to 30 minutes

**Affected Region:** US-NC-2 (North Carolina)

**Summary:** RunPod's data center partner will be deploying a dedicated NAT Gateway for the US-NC-2 region to address network latency and packet processing issues. This upgrade is intended to improve long-term network stability and performance for all pods in this region.

**Expected Impact:** Pods in US-NC-2 may experience intermittent network connectivity disruptions during the cutover window. Active connections (including SSH sessions, API calls, and data transfers) may be dropped and will need to be re-established. Pod compute resources will remain online; only network connectivity is affected.

**Recommended Actions:**

- Pause or checkpoint any network-sensitive workloads prior to 21:30 UTC on 2/21 if possible
- Expect to re-establish SSH sessions or restart network-dependent processes after the maintenance window
- Monitor your pods for connectivity restoration following the cutover
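Because this notice warns that API calls and data transfers may be dropped mid-flight, client code that talks to pods during the cutover may benefit from retrying with exponential backoff. The following is a generic sketch, not a Runpod API: `retry_with_backoff` and its parameters are hypothetical names, and the defaults (5 attempts, 1 second base delay, 30 second cap) are illustrative assumptions rather than recommended values.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `operation` until it succeeds or `attempts` is exhausted.

    Only OSError and its subclasses (ConnectionError, TimeoutError, ...)
    are retried; any other exception propagates immediately.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except OSError:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Delay doubles each attempt (base, 2x, 4x, ...), capped at max_delay.
            sleep(min(base_delay * 2**attempt, max_delay))
    raise ValueError("attempts must be at least 1")
```

Wrapping a network call as `retry_with_backoff(lambda: fetch())`, where `fetch` is your own function, lets short interruptions pass without manual restarts. Long-running transfers are still better checkpointed beforehand, as the notice recommends.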

Emergency Network Maintenance for US-TX-4

Sat, 21 Feb 2026 21:00:05 -0000

**Date & Time:** Saturday, February 21, 2026 at 21:00 UTC

**Estimated Duration:** Up to 30 minutes

**Affected Region:** US-TX-4 (Texas)

**Summary:** RunPod's data center partner will be deploying a dedicated NAT Gateway for the US-TX-4 region to address network latency and packet processing issues. This upgrade is intended to improve long-term network stability and performance for all pods in this region.

**Expected Impact:** Pods in US-TX-4 may experience intermittent network connectivity disruptions during the cutover window. Active connections (including SSH sessions, API calls, and data transfers) may be dropped and will need to be re-established. Pod compute resources will remain online; only network connectivity is affected.

**Recommended Actions:**

- Pause or checkpoint any network-sensitive workloads prior to 21:00 UTC on 2/21 if possible
- Expect to re-establish SSH sessions or restart network-dependent processes after the maintenance window
- Monitor your pods for connectivity restoration following the cutover

US-TX-4 Network Outage

Sat, 21 Feb 2026 04:14:00 -0000

On Saturday, February 21st at approximately 3:15 AM UTC, our datacenter provider at US-TX-4 suffered an unexpected network outage. Network engineers resolved the issue and network stability returned at 3:36 AM UTC. The datacenter is performing maintenance on Feb 21st that will address this issue and correct it long term.

Console login is slow or failing

Thu, 19 Feb 2026 19:09:00 -0000

The issues have been resolved by the upstream vendor and we are not seeing errors at this time, but will continue monitoring.

Console login is slow or failing

Thu, 19 Feb 2026 16:58:00 -0000

We are still tracking errors from an upstream authentication provider which are causing console login to be very slow or fail for some users.

Console login is slow or failing

Thu, 19 Feb 2026 16:29:00 -0000

We are monitoring errors with an upstream authentication provider that are causing console login to be very slow or fail for some users.

Emergency Network Maintenance for US-NC-1

Wed, 18 Feb 2026 21:00:02 -0000

**Date & Time:** Wednesday, February 18, 2026 at 21:00 UTC

**Estimated Duration:** Up to 30 minutes

**Affected Region:** US-NC-1 (North Carolina)

**Summary:** RunPod's data center partner will be deploying a dedicated NAT Gateway for the US-NC-1 region to address network latency and packet processing issues. This upgrade is intended to improve long-term network stability and performance for all pods in this region.

**Expected Impact:** Pods in US-NC-1 may experience intermittent network connectivity disruptions during the cutover window. Active connections (including SSH sessions, API calls, and data transfers) may be dropped and will need to be re-established. Pod compute resources will remain online; only network connectivity is affected.

**Recommended Actions:**

- Pause or checkpoint any network-sensitive workloads prior to 21:00 UTC on 2/18 if possible
- Expect to re-establish SSH sessions or restart network-dependent processes after the maintenance window
- Monitor your pods for connectivity restoration following the cutover

Planned Maintenance: Redundancy Upgrade – EUR-NO-2

Sat, 14 Feb 2026 08:00:59 -0000

Activity: Redundancy upgrade
Location: EUR-NO-2 - Lefdal
Urgency: High Priority

Description of Change
This maintenance covers critical redundancy upgrades to the infrastructure at EUR-NO-2, aimed at improving system resilience and operational stability.

Reason for Change
To increase redundancy and overall infrastructure stability within the facility.

Expected Service Impact
Downtime: A service downtime of approximately 3 hours is expected during the maintenance window.
Storage Status: Local storage across all GPU nodes will be in read-only mode. All services and compute tasks will be shut down to ensure data integrity during the infrastructure upgrade.
System Integrity: To avoid data inconsistency and ensure system integrity, all affected services and devices at the site will be shut down prior to the maintenance start.
Restoration: Full service restoration will commence once the maintenance is successfully completed.

Planned Network Maintenance for EU-FR-1 - Public Network IP change

Mon, 09 Feb 2026 14:00:00 -0000

Maintenance Window
Date: February 9, 2026
Start Time: 14:00 GMT
End Time: 15:00 GMT
Duration: 1 hour

Description
We are performing a mandatory network infrastructure migration for the EU-FR-1 Datacenter. This cutover moves services to a new public IP infrastructure to improve security and scalability.

Impact Analysis
During the 1-hour window (14:00 – 15:00 GMT), services in the EU-FR-1 Datacenter may experience:
Brief connectivity instability or packet loss.
Delays during DNS propagation and routing switches.

Customer Actions
Whitelisting: Ensure your network authorizes the following subnet for the EU-FR-1 Datacenter prior to 14:00 GMT:
Subnet: 31.24.80.0/24
Note: If you utilize a custom configuration not standard to our environment, please submit a support ticket prior to the window.
Post-Maintenance: Old IP addresses will be decommissioned immediately following the successful verification of the migration.
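When updating an allowlist for this migration, it can help to confirm that a given address actually falls inside the announced subnet. The sketch below uses Python's standard `ipaddress` module. The subnet is copied from the notice; the helper name `in_new_subnet` and the sample addresses are hypothetical.

```python
import ipaddress

# Subnet announced in the notice for the EU-FR-1 Datacenter.
NEW_SUBNET = ipaddress.ip_network("31.24.80.0/24")


def in_new_subnet(address: str) -> bool:
    """Return True if `address` belongs to the announced EU-FR-1 subnet."""
    return ipaddress.ip_address(address) in NEW_SUBNET


# Sample addresses chosen for illustration only.
for candidate in ("31.24.80.17", "31.24.81.1"):
    status = "inside" if in_new_subnet(candidate) else "outside"
    print(f"{candidate} is {status} {NEW_SUBNET}")
```

A /24 covers 256 addresses (31.24.80.0 through 31.24.80.255), so a firewall rule for the whole subnet is simpler than listing individual pod IPs.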

EUR-IS-1 Network Issue

Wed, 04 Feb 2026 21:01:00 -0000

Issue has been resolved and network is operating normally.

EUR-IS-1 Network Issue

Wed, 04 Feb 2026 20:09:00 -0000

We are experiencing network connectivity issues in EUR-IS-1. We are working to resolve it.

BE ADVISED: Winter Weather Impact - Texas Data Centers

Fri, 23 Jan 2026 08:00:00 -0000

RunPod is closely monitoring an incoming winter storm system affecting Texas. Our team is taking proactive measures to ensure service continuity during adverse weather conditions. We will provide updates if conditions change or if any service impact occurs.

US-WA-1 Network Issue

Thu, 15 Jan 2026 19:36:00 -0000

Issue has been resolved and network is operating normally.

US-WA-1 Network Issue

Thu, 15 Jan 2026 19:02:00 -0000

We are experiencing network connectivity issues in US-WA-1. We are working to resolve it.

Planned Power Maintenance (No Outage Expected) - EUR-IS-3 Datacenter

Wed, 14 Jan 2026 08:00:27 -0000

We are notifying you of upcoming construction-related maintenance affecting the EUR-IS-3 datacenter. Please be advised that this is not a total power outage. The facility will operate at reduced power redundancy during the maintenance window. While one power feed will be disconnected for safety, all equipment will remain online supported by the redundant UPS power supply.

Maintenance Window
Date: Wednesday, January 14
Duration: 5 Hours
PST (Pacific Time): 00:00 – 05:00
GMT (Iceland Time): 08:00 – 13:00
CST (Beijing Time): 16:00 – 21:00
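The three local times in this notice can be cross-checked by converting a single UTC window into each zone. This is a minimal sketch under two assumptions: the year is 2026 (taken from the post date, since the notice gives none), and the IANA zone names below correspond to the labels used in the notice. The function name `local_window` is hypothetical.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo

# Window from the notice, expressed in UTC (Iceland observes UTC year-round).
START = datetime(2026, 1, 14, 8, 0, tzinfo=timezone.utc)
END = datetime(2026, 1, 14, 13, 0, tzinfo=timezone.utc)

# Assumed mapping from the notice's labels to IANA time zone names.
ZONES = {
    "Pacific Time": "America/Los_Angeles",
    "Iceland Time": "Atlantic/Reykjavik",
    "Beijing Time": "Asia/Shanghai",
}


def local_window(start: datetime, end: datetime, tz_name: str) -> tuple:
    """Return the window's start and end as HH:MM strings in `tz_name`."""
    tz = ZoneInfo(tz_name)
    return (
        start.astimezone(tz).strftime("%H:%M"),
        end.astimezone(tz).strftime("%H:%M"),
    )


for label, tz_name in ZONES.items():
    begin, finish = local_window(START, END, tz_name)
    print(f"{label}: {begin} - {finish}")
```

Converting from one authoritative UTC window, rather than maintaining each local time by hand, avoids mismatches when daylight saving time is in effect.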

US-IL-1 data center is currently experiencing a network issue

Fri, 09 Jan 2026 21:23:00 -0000

The US-IL-1 data center is currently experiencing a network issue. We’re actively investigating.

The public endpoint is experiencing intermittent issues.

Thu, 08 Jan 2026 15:19:00 -0000

The public endpoint is currently experiencing intermittent issues, and some requests may fail. Our team is actively investigating the issue. ---- The issue has been mitigated and we’re continuing to monitor it. If you’re still seeing issues, please report them to help@runpod.io.

Planned Network Maintenance for US-TX-4

Sun, 21 Dec 2025 21:00:37 -0000

Schedule Details Start Time: December 21, 2025, at 4:00 PM EST Estimated Duration: 45 minutes Expected End Time: December 21, 2025, at 4:45 PM EST Summary of Work Our datacenter provider will be conducting essential network maintenance. This work is necessary to ensure the continued stability, security, and high performance of our infrastructure. Impact During this 45-minute window, you may experience: Intermittent or total loss of network connectivity to routers and switches within the US-TX-4 region.

Planned Network Maintenance for US-WA-1

Fri, 19 Dec 2025 23:00:00 -0000

Schedule Details Start Time: December 19, 2025, at 6:00 PM EST Estimated Duration: 45 minutes Expected End Time: December 19, 2025, at 6:45 PM EST Summary of Work Our datacenter provider will be conducting essential network maintenance. This work is necessary to ensure the continued stability, security, and high performance of our infrastructure. Impact During this 45-minute window, you may experience: Intermittent or total loss of network connectivity to routers and switches within the US-WA-1 region.

Planned Network Maintenance for US-NC-1

Wed, 17 Dec 2025 20:00:51 -0000

Schedule Details Start Time: December 17, 2025, at 3:00 PM EST Estimated Duration: 45 minutes Expected End Time: December 17, 2025, at 3:45 PM EST Summary of Work Our datacenter provider will be conducting essential network maintenance. This work is necessary to ensure the continued stability, security, and high performance of our infrastructure. Impact During this 45-minute window, you may experience: Intermittent or total loss of network connectivity to routers and switches within the US-NC-1 region.

Planned Network Maintenance for EUR-IS-3

Fri, 12 Dec 2025 09:30:40 -0000

There is a Planned Network Maintenance for Datacenter EUR-IS-3 Details: Date: Friday, December 12th Start Time: 09:30 (UTC) End Time: 10:30 (UTC) Impact: There is no expected downtime for this window. You may experience brief periods of increased latency or packet loss as network devices are adjusted. This maintenance is planned and essential for ensuring continued stability and performance.

Planned Network Maintenance for EUR-IS-1

Thu, 11 Dec 2025 20:00:00 -0000

Our datacenter provider for EUR-IS-1 is planning a network router firmware upgrade. Details: Thursday December 11th Start time: 20:00 (UTC) End time: 22:00 (UTC) Impact: There is no expected downtime for this window. You may experience brief periods of increased latency or packet loss. This maintenance is planned and essential for ensuring continued stability and performance.

Planned Network Maintenance for EU-SE-1

Wed, 03 Dec 2025 23:00:57 -0000

We are conducting scheduled internet service provider maintenance in the EU-SE-1 data center. Details Start Time: December 3, 2025, at 23:00 UTC End Time: December 5, 2025, at 23:00 UTC Impact: During this 48-hour window, internet service may experience brief periods of increased latency or packet loss. This maintenance is planned and essential for ensuring continued stability and performance.

Planned Internet Maintenance CA-MTL-4

Thu, 27 Nov 2025 22:45:14 -0000

We will be performing a planned failover for CA-MTL-4 ISP connections in order to perform firewall firmware upgrades. Internet connections may be reset during this period. Internal workloads and networking will not be impacted.

Planned Internet Maintenance EU-FR-1

Thu, 20 Nov 2025 10:00:00 -0000

We are conducting planned internet service maintenance in data center EU-FR-1 on November 20, 2025, between 10:00-12:00 UTC. During this scheduled time, internet service will experience a brief outage.

Upstream provider outage affecting Runpod services

Tue, 18 Nov 2025 12:33:00 -0000

There is an ongoing Cloudflare outage that is degrading or blocking Runpod products. https://www.cloudflarestatus.com/

Partial Network Outage in EUR-IS-2

Tue, 18 Nov 2025 03:29:00 -0000

The network issue has been resolved and machines have been returned to service.

EUR-IS-2 network issues resolved

Tue, 18 Nov 2025 03:27:00 -0000

The network issue has been resolved and machines have been returned to service.

Partial Network Outage in EUR-IS-2

Tue, 18 Nov 2025 02:13:00 -0000

A set of machines in EUR-IS-2 have encountered network issues. We have identified the cause and are working to resolve the issue.

CPU pod, Serverless and some public endpoint disruption

Mon, 03 Nov 2025 12:47:00 -0000

We’ve added more CPU resources, and so far both CPU serverless and public endpoints are scaling up normally again. -------- We are tracking an ongoing incident with CPU pods, serverless, GitHub integration, and public endpoints. We are spinning up emergency reserves of CPU server capacity in multiple datacenters and tracking current usage to pinpoint the exact issues.

Upstream Issue: AWS Outages

Mon, 20 Oct 2025 14:43:00 -0000

Our GPU utilization is back to normal, and most clients’ serverless queue items have been processed. AWS still hasn’t updated their status page, but we’ll continue monitoring the situation.

Upstream Issue: AWS Outages

Mon, 20 Oct 2025 14:32:00 -0000

We’re beginning to see early signs of recovery in the affected AWS region, and parts of our system are returning to normal operating thresholds. The team is closely monitoring the situation and will continue working on the migration as a precaution.

Upstream Issue: AWS Outages

Mon, 20 Oct 2025 14:01:00 -0000

We're currently working on migrating our services away from the affected AWS region. This may take a few hours. In the meantime, if the AWS region recovers sooner, our services should come back online as well, whichever happens first. Thank you for your patience. AWS health Status: https://health.aws.amazon.com/health/status

Upstream Issue: AWS Outages

Mon, 20 Oct 2025 12:48:00 -0000

AWS has redeclared the same incident, which is affecting our API service. We expect this will also impact the console, but we have not seen any impact there yet.

Upstream Issue: AWS Outages

Mon, 20 Oct 2025 08:24:00 -0000

AWS is having a large-scale outage that is causing downstream issues for our hosting provider and partial or full downtime on console.runpod.io: https://www.vercel-status.com/ Serverless and API calls are also seeing high error rates due to the AWS outage: https://health.aws.amazon.com/health/status https://downdetector.com/status/aws-amazon-web-services/

US-CA-2 Network issue

Tue, 14 Oct 2025 19:43:00 -0000

Issue has been resolved and network is operating normally.

US-CA-2 Network issue

Mon, 13 Oct 2025 00:38:00 -0000

We've recovered from the network connectivity issue in US-CA-2. We are continuing to monitor.

US-CA-2 Network issue

Sun, 12 Oct 2025 22:25:00 -0000

We are experiencing network connectivity issues in US-CA-2. We are working to resolve it.

US-CA-2 Network issue

Sun, 12 Oct 2025 02:46:00 -0000

Issue has been resolved and network is operating normally.

US-CA-2 Network issue

Sun, 12 Oct 2025 01:17:00 -0000

We are experiencing network connectivity issues in US-CA-2. We are working to resolve it.

Issue with provisioning Pods using images with Auth against Docker Hub

Fri, 10 Oct 2025 19:26:00 -0000

This has been resolved and the service is operating normally.

Intermittent issues with connectivity to proxy

Fri, 10 Oct 2025 17:45:00 -0000

Changes have been made to improve the connectivity to proxy. Issue has been resolved and is operating normally.

Issue with provisioning Pods using images with Auth against Docker Hub

Thu, 09 Oct 2025 17:47:00 -0000

This issue should now be improved and resolved, but we are continuing to monitor the situation.

EU-RO-1 network connectivity issues to AWS

Thu, 09 Oct 2025 15:42:00 -0000

Network connectivity between AWS and EU-RO-1 has been restored. Network is operating normally.

EU-RO-1 network connectivity issues to AWS

Thu, 09 Oct 2025 14:16:00 -0000

We're investigating network connectivity issues from the EU-RO-1 DC to AWS.

Intermittent issues with connectivity to proxy

Thu, 09 Oct 2025 14:08:00 -0000

We are investigating intermittent network connectivity issues while connecting to *.proxy.runpod.net.

Issue with provisioning Pods using images with Auth against Docker Hub

Wed, 08 Oct 2025 21:41:00 -0000

We are investigating an issue that is contributing to slower provisioning of Pods using images that utilize Auth against Docker Hub.

Emergency Network Maintenance in US-TX-4

Tue, 07 Oct 2025 23:00:19 -0000

We are conducting emergency network maintenance at our US-TX-4 data center starting at 10/07/2025 23:00:00 (UTC). During this maintenance window, you may experience a brief network impact of up to 5 minutes, which could result in temporary session loss or connection timeouts. No other impact is expected beyond this short interruption. We apologize for the short notice of this emergency maintenance.

Emergency Network Maintenance in US-WA-1

Tue, 07 Oct 2025 21:00:01 -0000

We are conducting emergency network maintenance at our US-WA-1 data center starting at 10/07/2025 21:00:00 (UTC). During this maintenance window, you may experience a brief network impact of up to 5 minutes, which could result in temporary session loss or connection timeouts. No other impact is expected beyond this short interruption. We apologize for the short notice of this emergency maintenance.

US-WA-1 Network issue

Tue, 07 Oct 2025 17:47:00 -0000

Issue has been resolved and network is operating normally.

US-WA-1 Network issue

Tue, 07 Oct 2025 16:50:00 -0000

We are experiencing network connectivity issues in US-WA-1. We are working on this issue.

Runpod APIs experiencing intermittent errors

Thu, 02 Oct 2025 17:43:00 -0000

Issue has been resolved. We've been monitoring the situation and it is operating normally.

US-CA-2 Network issue

Thu, 02 Oct 2025 15:00:00 -0000

Issue has been resolved and network is operating normally.

US-CA-2 Network issue

Thu, 02 Oct 2025 14:40:00 -0000

We are experiencing network connectivity issues for some of the machines in US-CA-2. We are working on this issue.

Runpod APIs experiencing intermittent errors

Thu, 02 Oct 2025 14:34:00 -0000

We are experiencing intermittent connectivity issues with our upstream provider that are impacting calls to Runpod APIs and the Console. We are working on resolving this issue.

Runpod console intermittent connectivity issue

Wed, 01 Oct 2025 14:52:00 -0000

Services have returned to normal and we are continuing to monitor the situation

EU-FR-1 DC experiencing connectivity issues with H100 machines

Tue, 30 Sep 2025 17:04:00 -0000

The EU-FR-1 DC issue with the H100 machines has been resolved and the machines are now operating normally.

Console data unavailable

Mon, 29 Sep 2025 22:53:00 -0000

At 2025/09/29 20:39:30 UTC, Runpod's primary API suffered an outage. This was due to an upstream vendor's managed database offering suffering an internal critical error. Vendor support was engaged immediately and their team was able to perform recovery at 2025/09/29 21:26:30. During the outage period, all calls made to the REST and GraphQL APIs and the console UI did not respond correctly, and a percentage of serverless API job submissions were impacted. Existing pod workloads and data were not impacted. At this time all services are operating normally and all global capacity is available. We are engaging with the vendor to determine the root cause.

Console data unavailable

Mon, 29 Sep 2025 22:11:00 -0000

Upstream resolved their issue and we are back to operating normally.

Console data unavailable

Mon, 29 Sep 2025 21:30:00 -0000

We are seeing some preliminary recovery. We are continuing to work through the issue and monitor the situation.

Console data unavailable

Mon, 29 Sep 2025 21:13:00 -0000

There is an issue related to an upstream component. We are working on resolving it.

Console data unavailable

Mon, 29 Sep 2025 20:48:00 -0000

Console data is not showing up on the dashboard. We are investigating and working on resolving the issue.

EU-FR-1 DC experiencing connectivity issues with H100 machines

Mon, 29 Sep 2025 17:50:00 -0000

EU-FR-1 is experiencing network connectivity issues with H100 machines. We are working on getting this resolved.

Planned Internet Maintenance in CA-MTL-4 DC

Sat, 27 Sep 2025 13:00:44 -0000

Maintenance will be performed from September 27, 13:00 UTC to September 29, 4:00 UTC. This maintenance is focused on enhancing the performance and reliability of the internet services in CA-MTL-4 DC. While we are taking every precaution to avoid service interruptions, you may experience brief packet loss to the internet while maintenance activities take place.

US-IL-1 Network issue

Fri, 26 Sep 2025 19:39:00 -0000

We’ve identified the root cause of the networking issue and do not expect it to resurface. It has been resolved and the network is operating normally.

US-IL-1 Network issue

Fri, 26 Sep 2025 18:35:00 -0000

Network issue has recovered in US-IL-1. We are continuing to monitor the situation and will provide an update as we get more information.

US-IL-1 Network issue

Thu, 25 Sep 2025 21:26:00 -0000

US-IL-1 is experiencing a network issue. We are working on getting this resolved.

Upstream issue: Docker Hub Registry

Thu, 25 Sep 2025 01:34:00 -0000

Docker Hub has resolved its service issues and has returned to normal operation. Further details are captured here: https://www.dockerstatus.com/pages/history/533c6539221ae15e3f000031

Upstream issue: Docker Hub Registry

Thu, 25 Sep 2025 01:25:00 -0000

Error rates have returned to normal. We'll continue to monitor as Docker finalizes this as resolved.

Upstream issue: Docker Hub Registry

Wed, 24 Sep 2025 23:43:00 -0000

Docker is observing issues with authenticated requests which may impact image pulls and pushes. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://www.dockerstatus.com/ for further detail.

CA-MTL-4 - Network performance is degraded

Wed, 24 Sep 2025 12:12:00 -0000

CA-MTL-4 network issue has been resolved and network is operating normally.

Upstream Issue - Console Logins are Unavailable

Mon, 15 Sep 2025 16:25:00 -0000

This issue has been resolved and the upstream provider has confirmed resolution.

Upstream Issue - Console Logins are Unavailable

Mon, 15 Sep 2025 14:00:00 -0000

Logins to the Runpod console are impacted by an upstream provider. This may result in logging in to the console operating slowly, or being stuck on the Runpod loading page. Please try refreshing your browser. We are tracking with this provider and will provide updates as we receive them.

US-CA-2 Network issue

Sat, 13 Sep 2025 20:52:00 -0000

Issue has been resolved and network is operating normally.

US-CA-2 Network issue

Sat, 13 Sep 2025 19:25:00 -0000

US-CA-2 is experiencing a network issue. We are working on getting this resolved.

Upstream Issue - Console login flow errors

Thu, 28 Aug 2025 23:31:00 -0000

This issue has been resolved and the upstream provider has confirmed resolution.

Upstream Issue - Console login flow errors

Thu, 28 Aug 2025 17:31:00 -0000

The upstream provider has stabilized error rates and is still monitoring the situation. We will keep this status open until we receive positive confirmation that the upstream event is resolved.

Upstream Issue - Console login flow errors

Thu, 28 Aug 2025 15:00:00 -0000

Logins to the RunPod console are impacted by an upstream provider. This may result in logging in to the console operating slowly, or being stuck on the Runpod loading page. Please try refreshing your browser. We are tracking with this provider and will provide updates as we receive them.

CA-MTL-4 Network Issue

Tue, 26 Aug 2025 22:23:00 -0000

The issue has been resolved and networking is operating correctly.

CA-MTL-4 Network Issue

Tue, 26 Aug 2025 21:47:00 -0000

The issue has been identified and initial remediation has been applied. We are monitoring traffic recovery.

CA-MTL-4 Network Issue

Tue, 26 Aug 2025 21:34:00 -0000

Networking at CA-MTL-4 has been disrupted. We are assessing and working on determining the root cause.

US-TX-1 Network Issue

Tue, 26 Aug 2025 00:44:00 -0000

Issue has been resolved and network is operating normally.

US-TX-1 Network Issue

Mon, 25 Aug 2025 17:40:00 -0000

US-TX-1 is experiencing a network issue. We are working on getting this resolved.

Email related actions are degraded

Mon, 25 Aug 2025 03:32:00 -0000

The issue has been resolved and service related email sending has returned to normal levels. This did not impact any core functionality of RunPod Serverless, Pods, or other workloads.

Email related actions are degraded

Sun, 24 Aug 2025 22:43:00 -0000

Actions related to email are currently degraded and may not deliver email. This includes forgot password, new account, etc. The engineering team is working on resolution and will provide updates here.

CA-MTL-3 - Network storage performance is degraded

Sat, 02 Aug 2025 14:18:00 -0000

We identified an edge case around network volume cleanup that was impacting the performance of the storage cluster in CA-MTL-3. The network storage performance has returned to normal. We will continue to monitor this closely.

CA-MTL-3 - Network storage performance is degraded

Sat, 02 Aug 2025 00:37:00 -0000

We are actively continuing to work on a resolution to this issue.

CA-MTL-3 - Network storage performance is degraded

Fri, 01 Aug 2025 14:44:00 -0000

We are continuing to work on resolving the issue and toward a path to recovery.

CA-MTL-3 - Network storage performance is degraded

Fri, 01 Aug 2025 01:14:00 -0000

We are continuing to work on restoring performance.

EUR-NO-1 DC is experiencing issues

Thu, 31 Jul 2025 18:49:00 -0000

The EUR-NO-1 data center issue has been resolved.

CA-MTL-3 - Network storage performance is degraded

Thu, 31 Jul 2025 16:09:00 -0000

CA-MTL-3's network storage array performance is degraded. We're working on restoring performance now.

Console Elevated Error Rates

Fri, 25 Jul 2025 19:13:00 -0000

The Console is running and functioning as normal.

Console Elevated Error Rates

Fri, 25 Jul 2025 17:13:00 -0000

Investigating - We are currently experiencing an issue with the Console returning elevated error rates for certain features. We will post an update as soon as we are able.

Cloudflare Global DNS is unavailable

Mon, 14 Jul 2025 23:29:00 -0000

Cloudflare has resolved the issue and we are observing normal network patterns when communicating with Cloudflare subnets.

Cloudflare Global DNS is unavailable

Mon, 14 Jul 2025 22:15:00 -0000

Be advised, Cloudflare Global DNS is having an outage. Services which rely on 1.1.1.1 may fail to operate correctly. RunPod is not observing any service degradation at this time, however we are assessing the situation. More details available here: https://www.cloudflarestatus.com/incidents/28r0vbbxsh8f

Network performance to Cloudflare is degraded in AP-JP-1

Wed, 02 Jul 2025 00:40:00 -0000

The upstream issue has been resolved and routing performance has returned to normal levels.

Network performance to Cloudflare is degraded in AP-JP-1

Tue, 01 Jul 2025 20:39:00 -0000

We are experiencing elevated packet loss in AP-JP-1 to certain network subnets on the global internet. We are engaging with our upstream provider to determine root cause and resolution.

Login to RunPod Console is degraded (Upstream service outage)

Thu, 26 Jun 2025 07:30:00 -0000

Clerk has confirmed full recovery and access to the Runpod Console has been restored.

Login to RunPod Console is degraded (Upstream service outage)

Thu, 26 Jun 2025 07:09:00 -0000

We are observing recovery of logins and we are seeing correct login behavior on the console. We are still monitoring while Clerk confirms full recovery.

Login to RunPod Console is degraded (Upstream service outage)

Thu, 26 Jun 2025 06:33:00 -0000

We are aware of an upstream issue with our authentication provider, Clerk, that prevents users from logging in to the Runpod console. When attempting to log into the console, the login form will not load and the user experiences an infinite loading animation. Existing pods, serverless, and other workloads are not impacted at this time. More information from Clerk is available here: https://status.clerk.com/incidents/01JYNESV77Q8D10QZKP2PF63PN

RunPod console maintenance

Wed, 18 Jun 2025 16:16:00 -0000

Access to the RunPod console has been restored.

RunPod console maintenance

Wed, 18 Jun 2025 15:25:00 -0000

The RunPod console is experiencing issues. We are working on resolving them and will provide updates.

Monitoring Issues With Other Cloud Providers

Thu, 12 Jun 2025 20:54:00 -0000

Docker Hub and cloud providers appear to be functioning normally.

Monitoring Issues With Other Cloud Providers

Thu, 12 Jun 2025 19:57:00 -0000

We are aware of issues with various cloud providers and are monitoring the situation to ensure there is no impact to the Runpod platform. Docker Hub has acknowledged issues, which may affect some image pulls. You can view their status page here: https://www.dockerstatus.com/

Downtime in US-IL-1

Thu, 12 Jun 2025 06:56:00 -0000

The network issue in the US-IL-1 data center has been fully resolved. Our team will continue to monitor the situation.

Downtime in US-IL-1

Thu, 12 Jun 2025 06:07:00 -0000

We’ve detected network downtime affecting the US-IL-1 data center. Our team is actively investigating the issue and will continue to monitor the situation closely. We’ll provide updates as we learn more.

Upstream issue - Docker Hub Registry

Thu, 05 Jun 2025 14:23:00 -0000

Docker Hub has resolved its service issues and has returned to normal operation. Further details are captured here: https://www.dockerstatus.com/pages/history/533c6539221ae15e3f000031

Upstream issue - Docker Hub Registry

Wed, 04 Jun 2025 12:51:00 -0000

Docker is observing issues with pulls and pushes against Docker Hub. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://www.dockerstatus.com/ for further detail.

Upstream issue - Canonical (Ubuntu) package manager

Fri, 30 May 2025 16:16:00 -0000

Canonical has resolved its service issues, and measured error levels have returned to normal levels. Further details are captured here: https://status.canonical.com/#/incident/KNms6QK9ewuzz-7xUsPsNylV20jEt5kyKsd8A-3ptQGnu9-UhZcQUtDmIVRYTQMx6Vt0EjSxe6Bz4_D89gPRLg==

Upstream issue - Canonical (Ubuntu) package manager

Thu, 29 May 2025 16:41:00 -0000

Canonical (Ubuntu)'s package mirrors are degraded. Users may encounter timeouts or other connection related issues when running `apt-get` commands. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://status.canonical.com/ for further detail.

Planned Internet Maintenance EU-FR-1

Tue, 27 May 2025 11:00:50 -0000

We are conducting planned internet service maintenance in data center EU-FR-1 on May 27, 2025, between 11:00-14:00 UTC. During this scheduled time, internet service will be temporarily offline, but power will be maintained.

Planned Internet Maintenance US-TX-3

Wed, 14 May 2025 10:00:39 -0000

Maintenance completed

Planned Internet Maintenance US-TX-3

Wed, 14 May 2025 07:00:39 -0000

We are conducting planned internet service maintenance in data center US-TX-3 on May 14, 2025, from 07:00-10:00 UTC. During this scheduled time, internet service will be temporarily offline, but power will be maintained.

EU-RO-1 Network Storage is degraded

Tue, 06 May 2025 17:04:00 -0000

The storage cluster has been restored to nominal operating performance, and we are continuing to monitor performance.

EU-RO-1 Network Storage is degraded

Tue, 06 May 2025 16:26:00 -0000

Reads and writes have been re-enabled on this cluster. Performance remains degraded as the system restores.

EU-RO-1 Network Storage is degraded

Tue, 06 May 2025 16:17:00 -0000

The team has isolated the issue and is working to restore service now.

EU-RO-1 Network Storage is degraded

Tue, 06 May 2025 15:48:00 -0000

EU-RO-1 Network Storage is degraded, resulting in inability to read and write to network stores. We are working to restore service now.

US-NC-1 Network Issue

Tue, 29 Apr 2025 01:50:00 -0000

Our US-NC-1 data center is currently experiencing a network issue. The team is actively investigating. ---- The network has been restored.

Error rates elevated for Serverless endpoints

Mon, 21 Apr 2025 18:40:00 -0000

The issue has been resolved and error rates have returned to normal levels.

Error rates elevated for Serverless endpoints

Mon, 21 Apr 2025 18:33:00 -0000

The fix has been deployed and we are monitoring recovery - error rates are returning to normal levels.

Error rates elevated for Serverless endpoints

Mon, 21 Apr 2025 18:04:00 -0000

The team has identified the issue and is deploying a fix at this time.

Error rates elevated for Serverless endpoints

Mon, 21 Apr 2025 17:53:00 -0000

We are observing elevated error rates for Serverless endpoints which is resulting in failed requests and responses. The Engineering team is investigating now.

RunPod console shows Pods and Serverless endpoints unavailable

Thu, 10 Apr 2025 19:06:00 -0000

Monitoring - all services are returning to normal operating baselines, however we are continuing to monitor overall service recovery. ----- On April 10, 2025, between 18:26:30 UTC and 18:53:00 UTC, a service disruption occurred due to a software release that was dependent on a database change which had not yet been applied. This caused our primary API to become temporarily non-functional. As a result, customers experienced issues including missing pods and serverless endpoints in the dashboard, and delayed request processing due to serverless endpoints being unable to scale.

RunPod console shows Pods and Serverless endpoints unavailable

Thu, 10 Apr 2025 18:55:00 -0000

Identified - This issue is caused by a database problem. We've applied the fix and are continuing to monitor recovery.

RunPod console shows Pods and Serverless endpoints unavailable

Thu, 10 Apr 2025 18:43:00 -0000

Investigating - We are currently experiencing an issue with RunPod console and API where users are not able to access or deploy new Pods and Serverless endpoints. We are currently investigating and will post an update as soon as we are able.

Billing and Audit Log pages down

Mon, 07 Apr 2025 21:08:00 -0000

Resolved - Users were unable to access the Billing and Audit Log pages in User Settings. We rolled out a fix and this issue is now resolved.

Billing and Audit Log pages down

Mon, 07 Apr 2025 20:54:00 -0000

Identified - This issue is caused by a bug in the application code. A hot fix will be released imminently. We will provide another update once the hot fix has been rolled out and service is restored.

Billing and Audit Log pages down

Mon, 07 Apr 2025 20:35:00 -0000

Investigating - We are currently experiencing an issue with some pages not loading in the RunPod Console User Settings. Specifically, we are aware that users are not able to access the Billing and Audit Log pages at this time. We are currently investigating and will post an update as soon as we are able.

Urgent: Emergency Firmware Update for US-TX-4 at 21:00 UTC (March 11, 2025)

Tue, 11 Mar 2025 18:59:00 -0000

Our engineering team has identified a network disruption at our US-TX-4 datacenter, caused by a required firmware update for our router. To resolve this, we will deploy an emergency fix at 21:00 UTC on March 11, 2025, with a maximum expected downtime of 10-15 minutes. ----------- The update was successfully completed.

US-NC-1 Network Issue

Thu, 06 Mar 2025 18:44:00 -0000

Our primary ISP circuit for the US-NC-1 data center experienced an outage. The secondary router failed to take over due to a known firmware issue that was scheduled for a later patch. We’ve now upgraded the router to the latest patched version and are running on the secondary circuit. --------- The issue has been resolved.

Issue with Volume Storage in CA-MTL-1

Tue, 25 Feb 2025 14:53:00 -0000

We have discovered an issue affecting pods running in CA-MTL-1 when using volume disk or network storage. When executing commands, the process may hang, although the file is still created successfully. So far, this issue primarily impacts most H100 GPUs and a few A40 GPUs. Our team is actively investigating and will provide updates here as we learn more. ------- We have identified the root cause of the issue, and the team is pushing updates to the machines. ------- All machines have been updated, and the issue is now resolved.

EU-CZ-1 Data Center Upgrade

Sat, 15 Feb 2025 17:00:00 -0000

We are currently upgrading the EU-CZ-1 data center, and all machines are offline during this process. Services hosted in this region are temporarily unavailable during this period. ------ We’ve successfully brought most of the machines online. However, due to some technical issues, we need a bit more time to restore the remaining ones. Thanks for your patience, we’ll keep you posted! ------ All machines in the EU-CZ-1 data center are now fully online. The data center upgrade is complete, thank you for your patience!

Serverless Request Issue

Thu, 13 Feb 2025 23:23:00 -0000

We experienced an issue affecting serverless requests from 10:00 PM to 10:23 PM UTC. This was due to an update made to improve system capacity in the NYC region, which led to temporary request issues. The issue has been identified and resolved, and we’ve taken steps to minimize future risks. ---- We are still seeing issues, and our team is actively investigating. We’ll provide further updates as soon as we have more information. ---- We have identified the issue and will be rolling out a release to fix it soon. Thank you for your patience while we work on resolving this. ----- The issue was still related to the new server we added. After adding the new server, it triggered an unexpected bug that caused the worker to be unable to retrieve the request payload. --------- The team has just confirmed that the issue is now resolved.

🚨 CA-MTL-1 Network Volume Performance Issue 🚨

Tue, 11 Feb 2025 16:00:00 -0000

We’re currently experiencing performance issues with network volumes in the CA-MTL-1 data center. Our team is investigating the issue, and we’ll provide updates as soon as possible. ------ We detected a performance issue with one of the chunk servers and have isolated the affected server. ------ The issue has been resolved.

CA-MTL-3 Network Disruption

Thu, 30 Jan 2025 11:14:00 -0000

CA-MTL-3 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. --------- The network is restored.

US-TX-4 Network Disruption

Thu, 23 Jan 2025 22:52:00 -0000

US-TX-4 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. ----- The US-TX-4 region will experience a short network disruption at approximately 01/23/2025 5:30 PM CST for about 10 minutes due to an emergency firewall update. We apologize for any inconvenience and appreciate your understanding as we perform this critical update. ---------- The issue affecting US-TX-4 has been resolved. Services are now operating normally. Thank you for your patience and understanding.

EU-SE-1 Network Disruption

Thu, 16 Jan 2025 00:00:00 -0000

The network issue at the data center has been resolved. Thank you for your patience.

EU-SE-1 Network Disruption

Mon, 13 Jan 2025 04:00:00 -0000

EU-SE-1 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now.

US-TX-3 Network Disruption

Thu, 19 Dec 2024 04:34:00 -0000

US-TX-3 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now.

US-TX-3 Network Disruption

Tue, 17 Dec 2024 20:42:00 -0000

This issue was due to an upstream provider and has been resolved. We have requested an RCA and will provide updates as applicable.

US-TX-3 Network Disruption

Tue, 17 Dec 2024 20:25:00 -0000

US-TX-3 suffered a network disruption due to an upstream provider issue.

CA-MTL-1 data center is currently inaccessible

Tue, 29 Oct 2024 11:51:00 -0000

Our CA-MTL-1 data center recently underwent maintenance, which was completed with minimal impact. However, during post-maintenance monitoring, the data center became inaccessible due to an unexpected issue. Our team is actively working to resolve the problem. ---- The network issue has been resolved for the CA-MTL-1 data center.

Elevated errors for dashboard and API

Thu, 10 Oct 2024 19:46:00 -0000

The root cause has been resolved, and services have returned to normal operating levels.

Elevated errors for dashboard and API

Thu, 10 Oct 2024 19:17:00 -0000

We are currently experiencing elevated error rates for the console and primary APIs. We have identified the issue and are in the process of resolving it.

EUR-IS-1 Network Issue

Tue, 08 Oct 2024 19:59:00 -0000

The network issue at the data center has been resolved. Thank you for your patience.

EUR-IS-1 Network Issue

Tue, 08 Oct 2024 18:20:00 -0000

We’re currently experiencing network packet loss issues in the EUR-IS-1 region, leading to connectivity errors and connection loss. Our team is actively coordinating with the data center and networking teams to resolve the problem.

Elevated network packetloss in US-OR-1 DC

Tue, 01 Oct 2024 00:20:00 -0000

The root cause of the issue has been addressed and congestion has returned to baseline levels.

Elevated network packetloss in US-OR-1 DC

Mon, 30 Sep 2024 14:43:00 -0000

We are currently observing elevated packet loss within the US-OR-1 DC. This is resulting in increased connection resets and failures. We are engaging with the network provider to determine the root cause.

Network availability issues in EUR-IS-1

Thu, 05 Sep 2024 18:04:00 -0000

Network availability has been restored by the upstream provider. We will perform an RCA and provide further details.

Network availability issues in EUR-IS-1

Thu, 05 Sep 2024 17:30:00 -0000

We're experiencing elevated network errors in EUR-IS-1, resulting in connectivity errors and connection loss. We are coordinating with the DC and networking teams.

Elevated network packetloss in US-OR-1 DC

Wed, 04 Sep 2024 01:59:00 -0000

We have received confirmation from the upstream network provider, and we have validated that this issue is resolved. The root cause was a network protection ruleset that triggered on a false positive and dropped a subset of packets. This resulted in failures to establish connections and reduced bandwidth over TCP/QUIC connections. We will provide an RCA once we receive the report from the upstream provider.

Elevated network packetloss in US-OR-1 DC

Tue, 03 Sep 2024 21:16:00 -0000

Packet loss has returned to nominal levels. We are still monitoring closely.

Elevated network packetloss in US-OR-1 DC

Tue, 03 Sep 2024 20:19:00 -0000

The network provider is still in the process of mitigating the issue. We will provide regular updates as they make progress.

Elevated network packetloss in US-OR-1 DC

Tue, 03 Sep 2024 19:32:00 -0000

The network provider is still in the process of mitigating the issue. We will provide regular updates as they make progress.

Elevated network packetloss in US-OR-1 DC

Tue, 03 Sep 2024 18:49:00 -0000

The network provider is still in the process of mitigating the issue.

Elevated network packetloss in US-OR-1 DC

Tue, 03 Sep 2024 18:21:00 -0000

The network provider is diagnosing the issue and has isolated it to a specific network segment.

Elevated network packetloss in US-OR-1 DC

Tue, 03 Sep 2024 17:02:00 -0000

We are currently observing elevated packet loss within the US-OR-1 DC. This is resulting in increased connection resets and failures. We are engaging with the network provider to determine the root cause.

Serverless Workers Unable To Read Environment Variables In Templates

Tue, 11 Jun 2024 20:32:00 -0000

Starting at 9:02 AM PDT, workers in serverless endpoints were unable to read environment variables set in templates. As a result, workers that were not already initialized and relied on environment variables from the template failed to start. This issue was resolved at 1:32 PM PDT.

US-OR-1 Firewall Under Stress

Tue, 11 Jun 2024 19:33:00 -0000

We have resolved this incident.

US-OR-1 Firewall Under Stress

Mon, 10 Jun 2024 20:32:00 -0000

Our firewall is currently handling an unusually high number of small packets, which may cause some temporary service disruptions and a slowdown in upload and download speeds in US-OR-1. We are working to resolve the issue.

Network Volume outage in RO region

Wed, 12 Jul 2023 01:25:00 -0000

We patched the configuration and the problem should now be resolved. We will continue to monitor.

Network Volume outage in RO region

Wed, 12 Jul 2023 01:00:00 -0000

We had a configuration issue that prevented network volumes in the RO region from being registered, causing a widespread outage for pods in the region. We have resolved the issue and patched the configuration so that this won't happen again. We are also reviewing this configuration in other regions to bring it in line with this region.