Incidents | Runpod Incidents reported on status page for Runpod https://uptime.runpod.io/ Login to RunPod Console is degraded (Upstream service outage) https://uptime.runpod.io/incident/609416 Thu, 26 Jun 2025 07:30:00 -0000 https://uptime.runpod.io/incident/609416#aad3e087259ffe97970178bc47f1404a2a09ddaf040fffbc165a7bdff44aa249 Clerk has confirmed full recovery and access to the Runpod Console has been restored. Login to RunPod Console is degraded (Upstream service outage) https://uptime.runpod.io/incident/609416 Thu, 26 Jun 2025 07:09:00 -0000 https://uptime.runpod.io/incident/609416#e048e7ac99395b50edb95917e2087e446601cffee2f24825bbd53396ff6bbf1d We are observing recovery of logins and we are seeing correct login behavior on the console. We are still monitoring while Clerk confirms full recovery. 
Login to RunPod Console is degraded (Upstream service outage) https://uptime.runpod.io/incident/609416 Thu, 26 Jun 2025 06:33:00 -0000 https://uptime.runpod.io/incident/609416#82882e9a88643b22e7ed281aa41d474adce28408ea7e055dd457e8d356488691 We are aware of an upstream issue with our authentication provider, Clerk, that prevents users from logging in to the Runpod console. When attempting to log into the console, the login form will not load and the user experiences an infinite loading animation. Existing pods, serverless, and other workloads are not impacted at this time. More information from Clerk is available here: https://status.clerk.com/incidents/01JYNESV77Q8D10QZKP2PF63PN RunPod console maintenance https://uptime.runpod.io/incident/605227 Wed, 18 Jun 2025 16:16:00 -0000 https://uptime.runpod.io/incident/605227#84e2f3f6b0648c0c1dd4cf3a98f29a0c6dc0e68e4412cb985e171197aeed19ac Access to the RunPod console has been restored. RunPod console maintenance https://uptime.runpod.io/incident/605227 Wed, 18 Jun 2025 15:25:00 -0000 https://uptime.runpod.io/incident/605227#98c22b1cbef67eb62ef4626f6c1b9bf176675e0569c161dbd85dfeaa6705565b RunPod console is experiencing issues. We are working on resolving and will provide updates. 
ui: runpod.io/console recovered https://uptime.runpod.io/ Wed, 18 Jun 2025 15:24:34 +0000 https://uptime.runpod.io/#5708672de8ba0b299534fddf49a60b67aa8cdf96f0defcd66a8df82edb016360 ui: runpod.io/console recovered ui: runpod.io/console went down https://uptime.runpod.io/ Wed, 18 Jun 2025 15:04:35 +0000 https://uptime.runpod.io/#5708672de8ba0b299534fddf49a60b67aa8cdf96f0defcd66a8df82edb016360 ui: runpod.io/console went down Monitoring Issues With Other Cloud Providers https://uptime.runpod.io/incident/602037 Thu, 12 Jun 2025 20:54:00 -0000 https://uptime.runpod.io/incident/602037#094b84f52f8a786bbba9c425313b46014ed7d4fbdf90bae7bb3a5fd44398a825 Docker Hub and cloud providers appear to be functioning normally. Monitoring Issues With Other Cloud Providers https://uptime.runpod.io/incident/602037 Thu, 12 Jun 2025 19:57:00 -0000 https://uptime.runpod.io/incident/602037#2efb2a98f87fc24103c4c55e9e8d30e7a7776a9698fcee6e352dbbc445fb5b56 We are aware of issues with various cloud providers and are monitoring the situation to ensure there is no impact to the Runpod platform. Docker Hub has acknowledged issues, which may affect some image pulls. You can view their status page here: https://www.dockerstatus.com/ Downtime in US-IL-1 https://uptime.runpod.io/incident/601337 Thu, 12 Jun 2025 06:56:00 -0000 https://uptime.runpod.io/incident/601337#a94cf48b31e79784d65e605c4dfd666d6b5423c9507a51493de051d172f15a96 The network issue in the US-IL-1 data center has been fully resolved. Our team will continue to monitor the situation. US-IL-1 recovered https://uptime.runpod.io/ Thu, 12 Jun 2025 06:42:20 +0000 https://uptime.runpod.io/#1d7648ac9a86a7862c220dd1801d4524841b7defd5fb1a038441f464ebdd2f10 US-IL-1 recovered Downtime in US-IL-1 https://uptime.runpod.io/incident/601337 Thu, 12 Jun 2025 06:07:00 -0000 https://uptime.runpod.io/incident/601337#9264f39556d827a1862c6530ffceff77a29ea819bc550afe88b4988084eba22c We’ve detected network downtime affecting the US-IL-1 data center. 
Our team is actively investigating the issue and will continue to monitor the situation closely. We’ll provide updates as we learn more. US-IL-1 went down https://uptime.runpod.io/ Thu, 12 Jun 2025 05:32:43 +0000 https://uptime.runpod.io/#1d7648ac9a86a7862c220dd1801d4524841b7defd5fb1a038441f464ebdd2f10 US-IL-1 went down EUR-IS-1 recovered https://uptime.runpod.io/ Wed, 11 Jun 2025 01:56:55 +0000 https://uptime.runpod.io/#90d2e1ceba9972ca184fd08e53fb843ff94aad6f87e912264c12df3ed1aeed81 EUR-IS-1 recovered EUR-IS-1 went down https://uptime.runpod.io/ Wed, 11 Jun 2025 01:47:14 +0000 https://uptime.runpod.io/#90d2e1ceba9972ca184fd08e53fb843ff94aad6f87e912264c12df3ed1aeed81 EUR-IS-1 went down Upstream issue - Docker Hub Registry https://uptime.runpod.io/incident/597075 Thu, 05 Jun 2025 14:23:00 -0000 https://uptime.runpod.io/incident/597075#d8415e3a7d2980f2444f525a7c5f6fd538dd37350382f8f3f7af3e348976dec8 Docker Hub has resolved its service issues and has returned to normal operation. Further details are captured here: https://www.dockerstatus.com/pages/history/533c6539221ae15e3f000031 Upstream issue - Docker Hub Registry https://uptime.runpod.io/incident/597075 Wed, 04 Jun 2025 12:51:00 -0000 https://uptime.runpod.io/incident/597075#feba563ed80f42445498601343481cdc24d12541c82e0dce77745d770fee043c Docker is observing issues with pulls and pushes against Docker Hub. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://www.dockerstatus.com/ for further detail. Upstream issue - Canonical (Ubuntu) package manager https://uptime.runpod.io/incident/583629 Fri, 30 May 2025 16:16:00 -0000 https://uptime.runpod.io/incident/583629#40da1719ada80c7737080e0d51fa937b42dab2ec4d6d8cd913d5b8ac5bfaaee1 Canonical has resolved its service issues, and measured error levels have returned to normal levels. 
Further details are captured here: https://status.canonical.com/#/incident/KNms6QK9ewuzz-7xUsPsNylV20jEt5kyKsd8A-3ptQGnu9-UhZcQUtDmIVRYTQMx6Vt0EjSxe6Bz4_D89gPRLg== Upstream issue - Canonical (Ubuntu) package manager https://uptime.runpod.io/incident/583629 Thu, 29 May 2025 16:41:00 -0000 https://uptime.runpod.io/incident/583629#db8e4d7b9c962b80be958131b5c91ceeec966d9ce64860fa139fa60cc70e1989 Canonical (Ubuntu)'s package mirrors are degraded. Users may encounter timeouts or other connection related issues when running `apt-get` commands. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://status.canonical.com/ for further detail. Planned Internet Maintenance EU-FR-1 https://uptime.runpod.io/incident/577181 Tue, 27 May 2025 14:00:50 +0000 https://uptime.runpod.io/incident/577181#6cdcc341d94453fd78811b1d89198e2fd3af6565b8b10a961821cd443189f706 Maintenance completed Planned Internet Maintenance EU-FR-1 https://uptime.runpod.io/incident/577181 Tue, 27 May 2025 11:00:50 -0000 https://uptime.runpod.io/incident/577181#00a90a5306980309984dcd18fd5862773b58e7983c08fd169c48654b32c10ad5 We are conducting planned internet service maintenance in data center EU-FR-1 on May 27, 2025, between 11:00-14:00 UTC. During this scheduled time, internet service will be temporarily offline, but power will be maintained. 
EUR-NO-1 recovered https://uptime.runpod.io/ Sun, 25 May 2025 22:11:21 +0000 https://uptime.runpod.io/#fd88f65df5809f80e4f81af40c20ce15ef3debc8e350e8eca95a327fd814c1bf EUR-NO-1 recovered EUR-NO-1 went down https://uptime.runpod.io/ Sun, 25 May 2025 22:01:45 +0000 https://uptime.runpod.io/#fd88f65df5809f80e4f81af40c20ce15ef3debc8e350e8eca95a327fd814c1bf EUR-NO-1 went down Planned Internet Maintenance US-TX-3 https://uptime.runpod.io/incident/559319 Wed, 14 May 2025 10:00:39 +0000 https://uptime.runpod.io/incident/559319#6b11f988c07f59e5ca11e10adfd2080e25c423c9c9dfd2cde0c0612e4cad0590 Maintenance completed Planned Internet Maintenance US-TX-3 https://uptime.runpod.io/incident/559319 Wed, 14 May 2025 07:00:39 -0000 https://uptime.runpod.io/incident/559319#dcc2f6dc3edc1597d5f4450b3a0149b9a9b07dbed170963467c513ada54ff9ae We are conducting planned internet service maintenance in data center US-TX-3 on May 14, 2025, from 07:00-10:00 UTC. During this scheduled time, internet service will be temporarily offline, but power will be maintained. 
EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 17:04:00 -0000 https://uptime.runpod.io/incident/557552#0c511ced79af707b623cf3362ccb760ededea15dc892cb397de4f2aa7fe2a52d The storage cluster has been restored to nominal operating performance, and we are continuing to monitor performance. EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 16:26:00 -0000 https://uptime.runpod.io/incident/557552#abfbe04a2e8dca3e22b70056e7ca9e7f2be85c25783ee8f8fe1178e88ecda1ec Reads and writes have been re-enabled on this cluster. Performance remains degraded as the system restores. EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 16:17:00 -0000 https://uptime.runpod.io/incident/557552#dc5fb3c7cf102ae24b7ad7f9c17356ffebefc0f643e48e607d9015729d866a7a The team has isolated the issue and is working to restore service now. EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 15:48:00 -0000 https://uptime.runpod.io/incident/557552#665d7aca1cfdc27893d615b3fc744ac312101846addef029325f49d6c6be0ab1 EU-RO-1 Network Storage is degraded, resulting in inability to read and write to network stores. We are working to restore service now. US-NC-1 Network Issue https://uptime.runpod.io/incident/553492 Tue, 29 Apr 2025 01:50:00 -0000 https://uptime.runpod.io/incident/553492#4565b80f695ab3065afedaba2533420d030eb537fe9b063cd03e862402c623c0 Our US-NC-1 data center is currently experiencing a network issue. 
The team is actively investigating. ---- The network has been restored. US-IL-1 recovered https://uptime.runpod.io/ Fri, 25 Apr 2025 20:40:43 +0000 https://uptime.runpod.io/#41f2005365416d77a8120aaf95dc6440f2006bb522cdf1f2fb992091b0d13fda US-IL-1 recovered US-IL-1 went down https://uptime.runpod.io/ Fri, 25 Apr 2025 20:31:07 +0000 https://uptime.runpod.io/#41f2005365416d77a8120aaf95dc6440f2006bb522cdf1f2fb992091b0d13fda US-IL-1 went down Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 18:40:00 -0000 https://uptime.runpod.io/incident/548625#5bb6d044b2530e5e030ac4afccce0b1371c1d59aa9431144bc5a64f61e3f14ef The issue has been resolved and error rates have returned to normal levels. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 18:33:00 -0000 https://uptime.runpod.io/incident/548625#1fa6a48885767dbcd246c69f72fa97f5b336e0bbe61cf5d4be2ea4a42cb18c30 The fix has been deployed and we are monitoring recovery - error rates are returning to normal levels. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 18:04:00 -0000 https://uptime.runpod.io/incident/548625#32f2d4f8f8b6281cd834091fb0bbf00ad2c7736376a4759e07ef37f1e9fb4044 The team has identified the issue and is deploying a fix at this time. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 17:53:00 -0000 https://uptime.runpod.io/incident/548625#a8ad16475a4aae6e43b8f99dd2e0b5ad93ea98661cdb7e0df96342bf4456d37b We are observing elevated error rates for Serverless endpoints which is resulting in failed requests and responses. The Engineering team is investigating now. 
RunPod console shows Pods and Serverless endpoints unavailable https://uptime.runpod.io/incident/543258 Thu, 10 Apr 2025 19:06:00 -0000 https://uptime.runpod.io/incident/543258#432ae229000dc585e314e667b5681ae5d1db5242f8983e68bae511496b643df9 Monitoring - all services are returning to normal operating baselines, however we are continuing to monitor overall service recovery. ----- On April 10, 2025, between 18:26:30 UTC and 18:53:00 UTC, a service disruption occurred due to a software release that was dependent on a database change which had not yet been applied. This caused our primary API to become temporarily non-functional. As a result, customers experienced issues including missing pods and serverless endpoints in the dashboard, and delayed request processing due to serverless endpoints being unable to scale. 
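The failure mode described above (a release that depends on a database change that has not yet been applied) can be guarded against with a startup check. The following is a minimal sketch only, not Runpod's actual fix: `REQUIRED_SCHEMA_VERSION`, `SchemaNotReadyError`, and `check_schema` are hypothetical names, and SQLite's `PRAGMA user_version` stands in for whatever migration bookkeeping a real deployment uses.

```python
import sqlite3

# Hypothetical: the schema version this release was built against.
REQUIRED_SCHEMA_VERSION = 7


class SchemaNotReadyError(RuntimeError):
    """Raised when the database is behind the version a release expects."""


def check_schema(conn: sqlite3.Connection,
                 required: int = REQUIRED_SCHEMA_VERSION) -> int:
    """Return the applied schema version, or raise if it is older than required."""
    # SQLite stores an application-defined integer in PRAGMA user_version;
    # this sketch assumes migrations bump it after each schema change.
    applied = conn.execute("PRAGMA user_version").fetchone()[0]
    if applied < required:
        raise SchemaNotReadyError(
            f"schema version {applied} is older than required {required}; "
            "apply pending migrations before deploying"
        )
    return applied


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA user_version = 6")  # simulate a pending migration
    try:
        check_schema(conn)
    except SchemaNotReadyError as exc:
        print(f"refusing to start: {exc}")
```

Failing fast at startup keeps a mismatched release from serving traffic against a schema it does not expect.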
RunPod console shows Pods and Serverless endpoints unavailable https://uptime.runpod.io/incident/543258 Thu, 10 Apr 2025 18:55:00 -0000 https://uptime.runpod.io/incident/543258#7e5d52e00e4e8b8a00381fcd0ae40365143d205388f8f46b9036b5958c82770c Identified - This issue is caused by a database problem. We've applied the fix and are continuing to monitor recovery. 
RunPod console shows Pods and Serverless endpoints unavailable https://uptime.runpod.io/incident/543258 Thu, 10 Apr 2025 18:43:00 -0000 https://uptime.runpod.io/incident/543258#790ff76c6a985582ad0677574741e7711012f55512be08386d7230732a89893a Investigating - We are currently experiencing an issue with RunPod console and API where users are not able to access or deploy new Pods and Serverless endpoints. We are currently investigating and will post an update as soon as we are able. Billing and Audit Log pages down https://uptime.runpod.io/incident/541449 Mon, 07 Apr 2025 21:08:00 -0000 https://uptime.runpod.io/incident/541449#84b3fc571bf98e3c45e66909e9a62ec20285bf9106a7f77cc6004ac3406d1dff Resolved - Users were unable to access the Billing and Audit Log pages in User Settings. We rolled out a fix and this issue is now resolved. 
Billing and Audit Log pages down https://uptime.runpod.io/incident/541449 Mon, 07 Apr 2025 20:54:00 -0000 https://uptime.runpod.io/incident/541449#69839c8856bbda317412524c134dfaf1af26e26b1b49c03d0111b4244c114eab Identified - This issue is caused by a bug in the application code. A hot fix will be released imminently. We will provide another update once the hot fix has been rolled out and service is restored. Billing and Audit Log pages down https://uptime.runpod.io/incident/541449 Mon, 07 Apr 2025 20:35:00 -0000 https://uptime.runpod.io/incident/541449#6f268e31335ff0a8b703bf35e0c90c8ab213a8b6dfb940782e652693217f8ffd Investigating - We are currently experiencing an issue with some pages not loading in the RunPod Console User Settings. Specifically, we are aware that users are not able to access the Billing and Audit Log pages at this time. We are currently investigating and will post an update as soon as we are able. Urgent: Emergency Firmware Update for US-TX-4 at 21:00 UTC (March 11, 2025) https://uptime.runpod.io/incident/526582 Tue, 11 Mar 2025 18:59:00 -0000 https://uptime.runpod.io/incident/526582#6367a5612e3c0d0caba76f5fe8e9be696d81a2f2fe37a6e4f4a3c04afd9b7e86 Our engineering team has identified a network disruption at our US-TX-4 datacenter, caused by a required firmware update for our router. To resolve this, we will deploy an emergency fix at 21:00 UTC on March 11, 2025, with a maximum expected downtime of 10-15 minutes. ----------- The update was successfully completed. US-NC-1 Network Issue https://uptime.runpod.io/incident/523954 Thu, 06 Mar 2025 18:44:00 -0000 https://uptime.runpod.io/incident/523954#a443da98c1ea5485cc1e02caaaf18502cd7ebeea9ec3fe2f1094fb7d0ce1cfbc Our primary ISP circuit for the US-NC-1 data center experienced an outage. The secondary router failed to take over due to a known firmware issue that was scheduled for a later patch. We’ve now upgraded the router to the latest patched version and are running on the secondary circuit. 
--------- The issue has been resolved. Issue with Volume Storage in CA-MTL-1 https://uptime.runpod.io/incident/518776 Tue, 25 Feb 2025 14:53:00 -0000 https://uptime.runpod.io/incident/518776#a38a988f6fcbccc9e0a0fdc072c6655b943826743b09279a7425c0157ad384a9 We have discovered an issue affecting pods running in CA-MTL-1 when using volume disk or network storage. When executing commands, the process may hang, although the file is still created successfully. So far, this issue primarily impacts most H100 GPUs and a few A40 GPUs. Our team is actively investigating and will provide updates here as we learn more. ------- We have identified the root cause of the issue, and the team is pushing the updates to the machines. ------- All machines have been updated, and the issue is now resolved. EU-CZ-1 Data Center Upgrade https://uptime.runpod.io/incident/513399 Sat, 15 Feb 2025 17:00:00 -0000 https://uptime.runpod.io/incident/513399#5c3d731bc79877773d2e4e31e89f7f6a40d3c220efa1f14df8b7992460e18907 We are currently upgrading the EU-CZ-1 data center, and all machines are offline during this process. Services hosted in this region are temporarily unavailable during this period. ------ We’ve successfully brought most of the machines online. However, due to some technical issues, we need a bit more time to restore the remaining ones. Thanks for your patience, we’ll keep you posted! ------ All machines in the EU-CZ-1 data center are now fully online. The data center upgrade is complete, thank you for your patience! Serverless Request Issue https://uptime.runpod.io/incident/512662 Thu, 13 Feb 2025 23:23:00 -0000 https://uptime.runpod.io/incident/512662#cd26736a958bc4f7e05df46a6ab050be4b007f33ae6e8f3bbc3fd15816b6bf62 We experienced an issue affecting serverless requests from 10:00 PM to 10:23 PM UTC. This was due to an update made to improve system capacity in the NYC region, which led to temporary request issues. 
The issue has been identified and resolved, and we’ve taken steps to minimize future risks. ---- We are still seeing issues, and our team is actively investigating. We’ll provide further updates as soon as we have more information. ---- We have identified the issue and will be rolling out a release to fix it soon. Thank you for your patience while we work on resolving this. ----- The issue was still related to the new server we added. Adding the new server triggered an unexpected bug that left the worker unable to retrieve the request payload. --------- The team has just confirmed that the issue is now resolved. 🚨 CA-MTL-1 Network Volume Performance Issue 🚨 https://uptime.runpod.io/incident/511142 Tue, 11 Feb 2025 16:00:00 -0000 https://uptime.runpod.io/incident/511142#b390762037be10eb8aa08081fbf68dd63266d8e872aee7e84c44284a8df481fd We’re currently experiencing performance issues with network volumes in the CA-MTL-1 data center. Our team is investigating the issue, and we’ll provide updates as soon as possible. ------ We detected a performance issue with one of the chunk servers and have isolated the affected server. ------ The issue has been resolved. CA-MTL-3 Network Disruption https://uptime.runpod.io/incident/504467 Thu, 30 Jan 2025 11:14:00 -0000 https://uptime.runpod.io/incident/504467#4eee808b5ee6eff1041a5b1dd202ab4c710a65294646c6ef19e91adbe893a3ae CA-MTL-3 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. --------- The network is restored. US-TX-4 Network Disruption https://uptime.runpod.io/incident/500904 Thu, 23 Jan 2025 22:52:00 -0000 https://uptime.runpod.io/incident/500904#f98451bf98724360bc2d433708dfab2fe45094af5b465449fda11bd3122d664d US-TX-4 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. 
----- The US-TX-4 region will experience a short network disruption at approximately 01/23/2025 5:30 PM CST for about 10 minutes due to an emergency firewall update. We apologize for any inconvenience and appreciate your understanding as we perform this critical update. ---------- The issue affecting US-TX-4 has been resolved. Services are now operating normally. Thank you for your patience and understanding. EU-SE-1 Network Disruption https://uptime.runpod.io/incident/496566 Thu, 16 Jan 2025 00:00:00 -0000 https://uptime.runpod.io/incident/496566#bac999ca6f72114dc0143198f9378a39a59fc98bda798366a7fcb144b874f57a The network issue at the data center has been resolved. Thank you for your patience. EU-SE-1 Network Disruption https://uptime.runpod.io/incident/496566 Mon, 13 Jan 2025 04:00:00 -0000 https://uptime.runpod.io/incident/496566#6e5797f4a081232adc2c738c6b37214d18483c5850df0b4682f56414b4077165 EU-SE-1 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. US-TX-3 Network Disruption https://uptime.runpod.io/incident/484139 Thu, 19 Dec 2024 04:34:00 -0000 https://uptime.runpod.io/incident/484139#3377ce032047262417f18071137dea5cec0dfabfb8215b95228b947ce70d19f2 US-TX-3 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. US-TX-3 Network Disruption https://uptime.runpod.io/incident/483548 Tue, 17 Dec 2024 20:42:00 -0000 https://uptime.runpod.io/incident/483548#8f20af3c61e182f47833106a93d6a72e88e38d3f1f2d48b69ca8a4ed4a5ecc43 This issue was due to an upstream provider and has been resolved. We have requested an RCA and will provide updates as applicable. 
US-TX-3 Network Disruption https://uptime.runpod.io/incident/483548 Tue, 17 Dec 2024 20:25:00 -0000 https://uptime.runpod.io/incident/483548#aaaf0c42e4c518d8b8217a2847e9cd97f45584a583be458fcca2e81e319de440 US-TX-3 suffered a network disruption due to an upstream provider issue. CA-MTL-1 data center is currently inaccessible https://uptime.runpod.io/incident/452574 Tue, 29 Oct 2024 11:51:00 -0000 https://uptime.runpod.io/incident/452574#1f3b3b519b2080a72c8dbc07de49b6941232c353ae70b11bcf1db17e16ae66f4 Our CA-MTL-1 data center recently underwent maintenance, which was completed with minimal impact. However, during post-maintenance monitoring, the data center became inaccessible due to an unexpected issue. Our team is actively working to resolve the problem. ---- The network issue has been resolved for the CA-MTL-1 data center. Elevated errors for dashboard and API https://uptime.runpod.io/incident/442564 Thu, 10 Oct 2024 19:46:00 -0000 https://uptime.runpod.io/incident/442564#7cc59f1d855f565f3181d01f81c7918cf6c3c7fcc2369ff65730df7a5ba663ba The root cause has been resolved, and services have returned to normal operating levels. Elevated errors for dashboard and API https://uptime.runpod.io/incident/442564 Thu, 10 Oct 2024 19:17:00 -0000 https://uptime.runpod.io/incident/442564#aeaac89a546035e9510426195f43d675408cebe6c6ff3fe94b4f690c3783e377 We are currently experiencing elevated error rates for the console and primary APIs. We have identified the issue and are in the process of resolving it. 
EUR-IS-1 Network Issue https://uptime.runpod.io/incident/441372 Tue, 08 Oct 2024 19:59:00 -0000 https://uptime.runpod.io/incident/441372#be330aacb37d015b9469eb7ea6a5fd94aec2b20ad238ce83f24143b4977a539d The network issue at the data center has been resolved. Thank you for your patience. EUR-IS-1 Network Issue https://uptime.runpod.io/incident/441372 Tue, 08 Oct 2024 18:20:00 -0000 https://uptime.runpod.io/incident/441372#028dd57188d133992ad1feb43529e129977b2e6ae9c1d9694cb91a32fea3bd89 We’re currently experiencing network packet loss issues in the EUR-IS-1 region, leading to connectivity errors and connection loss. Our team is actively coordinating with the data center and networking teams to resolve the problem. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/435242 Tue, 01 Oct 2024 00:20:00 -0000 https://uptime.runpod.io/incident/435242#98262437ae0f119ac56df9c16306c87ef16bb02fd126f62177748eba52443232 The root cause of the issue has been addressed and congestion has returned to baseline levels. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/435242 Mon, 30 Sep 2024 14:43:00 -0000 https://uptime.runpod.io/incident/435242#5bb3ae97b27a733d3cc69b782fa0c1c0cbbb312e6667902194ce1b6eefb8f045 We are currently observing elevated packet loss within the EUR-IS-1 DC. This is resulting in increased connection resets and failures. We are engaging with the network provider to determine the root cause. 
Network availability issues in EUR-IS-1 https://uptime.runpod.io/incident/424794 Thu, 05 Sep 2024 18:04:00 -0000 https://uptime.runpod.io/incident/424794#ea3e05de77229d04ef8fefb1ff51c5b87419aa5c3c991a8aba879d04ad9b71f6 Network availability has been restored by the upstream provider. We will be performing an RCA and will provide further details. Network availability issues in EUR-IS-1 https://uptime.runpod.io/incident/424794 Thu, 05 Sep 2024 17:30:00 -0000 https://uptime.runpod.io/incident/424794#7dbd09f2bec60affd409d52939a6c462fef7735511d1b1fe066cdb003100e29f We're experiencing elevated network errors in EUR-IS-1 resulting in connectivity errors and connection loss. We are coordinating with the DC and networking teams. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Wed, 04 Sep 2024 01:59:00 -0000 https://uptime.runpod.io/incident/423675#3be8f57d1f34097955cdc708aaec7b624fd1f822affe6b98c2ac2612a83c2c89 We have received confirmation from the upstream network provider and we have validated that this issue is resolved. The root cause was a network protection ruleset that triggered on a false positive and dropped a selection of packets. This resulted in failure to establish connections and impact to bandwidth over TCP/QUIC connections. We will provide an RCA once we receive the report from the upstream provider. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 21:16:00 -0000 https://uptime.runpod.io/incident/423675#c8810bdb8cc229c41275f1cee6d87bf9a238b165ce4d52bb8d042a3d114df593 Packetloss has returned to nominal levels. We are still monitoring closely. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 20:19:00 -0000 https://uptime.runpod.io/incident/423675#3b018d4313fbc8d1548b8d109a4cd74c183cc578e5b33fc89e2152fa4f3fe3eb The network provider is still in the process of mitigating the issue. 
We will provide regular updates as they make progress. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 19:32:00 -0000 https://uptime.runpod.io/incident/423675#356e6437a6cd95d5f0a9122417e8d469b1bb3a46c58eb76c21fa97b3e763c849 The network provider is still in the process of mitigating the issue. We will provide regular updates as they make progress. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 18:49:00 -0000 https://uptime.runpod.io/incident/423675#ef2774d80ff90a1e39a1630f7a487ee6c34854b91a03200dfe896693faaa8db0 The network provider is still in the process of mitigating the issue. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 18:21:00 -0000 https://uptime.runpod.io/incident/423675#44ce8ab39b9b84b6cbe2cdf4773d438418326ee404e5434b317c651f5a9cc1cc The network provider is diagnosing the issue and has isolated it to a specific network segment. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 17:02:00 -0000 https://uptime.runpod.io/incident/423675#a089eae61a9a6e7ee976dcd2ab18cf5acce50474518a800aadc3aa218f532b39 We are currently observing elevated packet loss within the US-OR-1 DC. This is resulting in increased connection resets and failures. We are engaging with the network provider to determine the root cause. Serverless Workers Unable To Read Environment Variables In Templates https://uptime.runpod.io/incident/382897 Tue, 11 Jun 2024 20:32:00 -0000 https://uptime.runpod.io/incident/382897#3c8f4ebcf1d8fb1d1fe1bfc48169c89f881f7adbbe26ebfd95c5bb3c496f0866 At 9:02 AM PST, workers in serverless endpoints were unable to read environment variables set in templates. As a result, workers that were not already initialized and relied on environment variables from the template would fail to start. This issue was resolved at 1:32 PM PST. 
US-OR-1 Firewall Under Stress https://uptime.runpod.io/incident/382380 Tue, 11 Jun 2024 19:33:00 -0000 https://uptime.runpod.io/incident/382380#8db84946b7b069b16703060dab37b93daf5c33e0a00b5642b6692de7b8370de2 We have resolved this incident. US-OR-1 Firewall Under Stress https://uptime.runpod.io/incident/382380 Mon, 10 Jun 2024 20:32:00 -0000 https://uptime.runpod.io/incident/382380#62ee7e3f49104103bbef65a954d4d6e4c781e4ecdfdc6ea50e2424a5cdf0dbda Our firewall is currently handling an unusually high number of small packets, which may cause some temporary service disruptions and a slowdown in upload and download speeds in US-OR-1. We are working to resolve the issue. Network Volume outage in RO region https://uptime.runpod.io/incident/231767 Wed, 12 Jul 2023 01:25:00 -0000 https://uptime.runpod.io/incident/231767#ab9f609106dbd5ed786435cc420f2829d54f28a5a4cc23313909f128084e1659 We patched the configuration and the problem should now be resolved. We will continue to monitor. Network Volume outage in RO region https://uptime.runpod.io/incident/231767 Wed, 12 Jul 2023 01:00:00 -0000 https://uptime.runpod.io/incident/231767#ca244d30eb80b2577b1fb82c08b4546c4709aa8373db377fb7355e40a148cdf3 We had a configuration issue that prevented network volumes in the RO region from being registered, causing a widespread outage for pods in the region. We have resolved the issue and patched the configuration so that this won't happen again. We are also reviewing this configuration in other regions to bring it in line with this region. 