Incidents | Runpod Incidents reported on status page for Runpod https://uptime.runpod.io/ Runpod APIs experiencing intermittent errors https://uptime.runpod.io/incident/736405 Thu, 02 Oct 2025 17:43:00 -0000 https://uptime.runpod.io/incident/736405#fb20c29975b590c57e0623adbc0b5071bb265e38c342c4e6f10eab797930bfb8 Issue has been resolved. We've been monitoring the situation and it is operating normally. US-CA-2 Network issue https://uptime.runpod.io/incident/736409 Thu, 02 Oct 2025 15:00:00 -0000 https://uptime.runpod.io/incident/736409#96293fcfeafd8a4a6dfc1a84fec51cabd82d4215df23fa3849835de728c06d16 Issue has been resolved and network is operating normally. US-CA-2 Network issue https://uptime.runpod.io/incident/736409 Thu, 02 Oct 2025 14:40:00 -0000 https://uptime.runpod.io/incident/736409#32b04853955f540c9031857921625045936033eba33ed99d1eef1b9d6d277c45 We are experiencing network connectivity issues for some of the machines in US-CA-2. We are working on this issue. Runpod APIs experiencing intermittent errors https://uptime.runpod.io/incident/736405 Thu, 02 Oct 2025 14:34:00 -0000 https://uptime.runpod.io/incident/736405#d44cafaee69c0f70d058d263b868e27bd9d869621ae244fb61302dc4b06aa093 We are experiencing intermittent connectivity issues to our upstream, which is impacting calls to Runpod APIs and Console. We are working on resolving this issue.
graphql: api.runpod.io recovered https://uptime.runpod.io/ Thu, 02 Oct 2025 14:32:14 +0000 https://uptime.runpod.io/#bec922363d8f6ec244a841a6835804bca853273e379f19227ede296e7913c942 graphql: api.runpod.io recovered graphql: api.runpod.io went down https://uptime.runpod.io/ Thu, 02 Oct 2025 14:24:35 +0000 https://uptime.runpod.io/#bec922363d8f6ec244a841a6835804bca853273e379f19227ede296e7913c942 graphql: api.runpod.io went down Runpod console intermittent connectivity issue https://uptime.runpod.io/incident/735788 Wed, 01 Oct 2025 14:52:00 -0000 https://uptime.runpod.io/incident/735788#627dc11ac06eea6641f7f8d6b88674126fefdc4d9ca1fcdb1c70ab62638a96cc Services have returned to normal and we are continuing to monitor the situation.
CA-MTL-4 went down https://uptime.runpod.io/ Wed, 01 Oct 2025 12:39:08 +0000 https://uptime.runpod.io/#bccd493063f4a752b82ba8ac8bc041e48130abcfd5bd7cd5004792e0d0b7344a CA-MTL-4 went down CA-MTL-4 recovered https://uptime.runpod.io/ Wed, 01 Oct 2025 12:37:45 +0000 https://uptime.runpod.io/#62d34294a474fc2b0f01cf6798a80f20dcb8b5a7086b116d920e1c3f3d057c92 CA-MTL-4 recovered CA-MTL-4 went down https://uptime.runpod.io/ Wed, 01 Oct 2025 12:31:42 +0000 https://uptime.runpod.io/#62d34294a474fc2b0f01cf6798a80f20dcb8b5a7086b116d920e1c3f3d057c92 CA-MTL-4 went down CA-MTL-4 recovered https://uptime.runpod.io/ Wed, 01 Oct 2025 12:26:57 +0000 https://uptime.runpod.io/#a98e44322ecf5853acedc1e05d3e8107b77fb7f56d5c62a99fff729aada25dd1 CA-MTL-4 recovered CA-MTL-4 went down https://uptime.runpod.io/ Wed, 01 Oct 2025 11:31:13 +0000 https://uptime.runpod.io/#a98e44322ecf5853acedc1e05d3e8107b77fb7f56d5c62a99fff729aada25dd1 CA-MTL-4 went down EU-FR-1 DC experiencing connectivity issues with H100 machines https://uptime.runpod.io/incident/734444 Tue, 30 Sep 2025 17:04:00 -0000 https://uptime.runpod.io/incident/734444#cda6866d7e197e6e385ce6fa0ebe8f760268039e198a17dd4ffb36eae28bdff6 The EU-FR-1 DC issue with the H100 machines has been resolved and they are now operating normally.
EUR-IS-1 recovered https://uptime.runpod.io/ Tue, 30 Sep 2025 01:53:20 +0000 https://uptime.runpod.io/#010f6b233cd5be39fff3d706d720e77a582cf10f68f0a9063d36996396c3a7f3 EUR-IS-1 recovered EUR-IS-1 went down https://uptime.runpod.io/ Tue, 30 Sep 2025 01:33:40 +0000 https://uptime.runpod.io/#010f6b233cd5be39fff3d706d720e77a582cf10f68f0a9063d36996396c3a7f3 EUR-IS-1 went down Console data unavailable https://uptime.runpod.io/incident/734527 Mon, 29 Sep 2025 22:53:00 -0000 https://uptime.runpod.io/incident/734527#40bf9c70989be57d026105290b826a7ac04148103940ad6a05377d3a5afc074e At 2025/09/29 20:39:30 UTC, Runpod's primary API suffered an outage. This was due to an upstream vendor's managed database offering suffering an internal critical error. Vendor support was engaged immediately and their team was able to perform recovery at 2025/09/29 21:26:30. During the outage period, calls made to the REST and GraphQL APIs and the console UI did not respond correctly, and a percentage of serverless API job submissions were impacted. Existing pod workloads and data were not impacted. At this time all services are operating normally and all global capacity is available. We are engaging with the vendor to determine the root cause. Console data unavailable https://uptime.runpod.io/incident/734527 Mon, 29 Sep 2025 22:11:00 -0000 https://uptime.runpod.io/incident/734527#de5be3403b6d39c07560f41704321984ef5db3a5889241f2a4769c05d7a02fd3 Upstream resolved their issue and we are back to operating normally. Console data unavailable https://uptime.runpod.io/incident/734527 Mon, 29 Sep 2025 21:30:00 -0000 https://uptime.runpod.io/incident/734527#653283413ed4fe1c5c09cc7924937517321d882e5bc9875dfda5a8f26d9f4ffa We are seeing some preliminary recovery. We are continuing to work through and monitor the situation.
Console data unavailable https://uptime.runpod.io/incident/734527 Mon, 29 Sep 2025 21:13:00 -0000 https://uptime.runpod.io/incident/734527#7153f379998a5c31c902cfc9521578efbe27ec019daad69563f73d6b2e8aa308 There is an issue related to an upstream component. We are working on resolving it. Console data unavailable https://uptime.runpod.io/incident/734527 Mon, 29 Sep 2025 20:48:00 -0000 https://uptime.runpod.io/incident/734527#3cf2e05fd296182005d151499839b51c3818afc683588d32c11e45717b8c5bd4 Console data is not showing up on the dashboard. We are investigating and working on resolving the issue. EU-FR-1 DC experiencing connectivity issues with H100 machines https://uptime.runpod.io/incident/734444 Mon, 29 Sep 2025 17:50:00 -0000 https://uptime.runpod.io/incident/734444#f4de370fdc6c6597dff2f3c5d986b784988e7ffab61e02dddff2587782f2a1ce EU-FR-1 is experiencing network connectivity issues with H100 machines. We are working on getting this resolved. Planned Internet Maintenance in CA-MTL-4 DC https://uptime.runpod.io/incident/729097 Mon, 29 Sep 2025 03:59:44 -0000 https://uptime.runpod.io/incident/729097#2ea7e8ac37858303974817b74ba3306c69fed313dc472294526a651f60142556 Maintenance completed CA-MTL-4 recovered https://uptime.runpod.io/ Sun, 28 Sep 2025 02:16:07 +0000 https://uptime.runpod.io/#5f0c799b579c7fbf738ad9f22ec375136223f9eabc08f4532e11417e313f9a68 CA-MTL-4 recovered CA-MTL-4 went down https://uptime.runpod.io/ Sun, 28 Sep 2025 01:53:54 +0000 https://uptime.runpod.io/#5f0c799b579c7fbf738ad9f22ec375136223f9eabc08f4532e11417e313f9a68 CA-MTL-4 went down Planned Internet Maintenance in CA-MTL-4 DC https://uptime.runpod.io/incident/729097 Sat, 27 Sep 2025 13:00:44 -0000 https://uptime.runpod.io/incident/729097#f668a0eb9bf3b3bdab6ae5fbad1cf3dc524caf04dd70e05856fa36dddf581a75 Maintenance will be performed from September 27, 13:00 UTC to September 29, 4:00 UTC.
This maintenance is focused on enhancing the performance and reliability of the internet services in CA-MTL-4 DC. While we are taking every precaution to avoid service interruptions, there may be brief packet loss to the internet while maintenance activities take place. US-IL-1 Network issue https://uptime.runpod.io/incident/732439 Fri, 26 Sep 2025 19:39:00 -0000 https://uptime.runpod.io/incident/732439#646b237f12f281eae2e89bac39f7690c4d1064caec1341d0393ab366b18fef09 We’ve identified the root cause of the networking issue and do not expect it to resurface. It has been resolved and the network is operating normally. CA-MTL-4 recovered https://uptime.runpod.io/ Fri, 26 Sep 2025 18:35:32 +0000 https://uptime.runpod.io/#fb25ef3a0e70361c354a6384d8dff83c343d506e8b45c856651d3efb02edbb79 CA-MTL-4 recovered US-IL-1 Network issue https://uptime.runpod.io/incident/732439 Fri, 26 Sep 2025 18:35:00 -0000 https://uptime.runpod.io/incident/732439#10f17272f5f2d32784080323af53ad3fb59680fb73f3a7ae53ed9103e77def0f Network issue has recovered in US-IL-1. We are continuing to monitor the situation and will provide an update as we get more information. CA-MTL-4 went down https://uptime.runpod.io/ Fri, 26 Sep 2025 18:03:49 +0000 https://uptime.runpod.io/#fb25ef3a0e70361c354a6384d8dff83c343d506e8b45c856651d3efb02edbb79 CA-MTL-4 went down US-IL-1 recovered https://uptime.runpod.io/ Fri, 26 Sep 2025 15:45:20 +0000 https://uptime.runpod.io/#4cc71c02cb8aa2961b8d09b2b400a5f52eb6c29fcfb1465e9d950c8d9242f2b3 US-IL-1 recovered US-IL-1 went down https://uptime.runpod.io/ Fri, 26 Sep 2025 15:35:46 +0000 https://uptime.runpod.io/#4cc71c02cb8aa2961b8d09b2b400a5f52eb6c29fcfb1465e9d950c8d9242f2b3 US-IL-1 went down US-IL-1 Network issue https://uptime.runpod.io/incident/732439 Thu, 25 Sep 2025 21:26:00 -0000 https://uptime.runpod.io/incident/732439#bce81c626b6e07f6c9e8a7cd29d65cccd2aad0699db3a0c6a1b4e3fdfeab9713 US-IL-1 is experiencing a network issue.
We are working on getting this resolved. EUR-IS-1 recovered https://uptime.runpod.io/ Thu, 25 Sep 2025 02:08:38 +0000 https://uptime.runpod.io/#07d5902c5168c3ad5cd579db9e97fb9fe06c2a2bb0d0cde32c035a19ac66c4b7 EUR-IS-1 recovered Upstream issue: Docker Hub Registry https://uptime.runpod.io/incident/731666 Thu, 25 Sep 2025 01:34:00 -0000 https://uptime.runpod.io/incident/731666#c9b28e4c40e3271e27d3da77e43910df513f00f673394e052e58ef1533148404 Docker Hub has resolved its service issues and has returned to normal operation. Further details are captured here: https://www.dockerstatus.com/pages/history/533c6539221ae15e3f000031 Upstream issue: Docker Hub Registry https://uptime.runpod.io/incident/731666 Thu, 25 Sep 2025 01:25:00 -0000 https://uptime.runpod.io/incident/731666#4e1af99995ac7a400af28c6d63c498dd3c2e5b799be203b578f20399ef6db1f0 Error rates have returned to normal. We'll continue to monitor until Docker marks this as resolved. EUR-IS-1 went down https://uptime.runpod.io/ Thu, 25 Sep 2025 01:09:17 +0000 https://uptime.runpod.io/#07d5902c5168c3ad5cd579db9e97fb9fe06c2a2bb0d0cde32c035a19ac66c4b7 EUR-IS-1 went down Upstream issue: Docker Hub Registry https://uptime.runpod.io/incident/731666 Wed, 24 Sep 2025 23:43:00 -0000 https://uptime.runpod.io/incident/731666#052e7c0f391b2a38e6cbfc7875ae63acb22c8bbefeefafaa75be37c7540d8fb3 Docker is observing issues with authenticated requests which may impact image pulls and pushes. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://www.dockerstatus.com/ for further detail.
CA-MTL-4 recovered https://uptime.runpod.io/ Wed, 24 Sep 2025 12:45:03 +0000 https://uptime.runpod.io/#bc6f1f64f7e57ff40dbadbe28e74f0dfdd7f9fc8e789090125da76e21b611ee1 CA-MTL-4 recovered CA-MTL-4 - Network performance is degraded https://uptime.runpod.io/incident/731138 Wed, 24 Sep 2025 12:12:00 -0000 https://uptime.runpod.io/incident/731138#8592ab64612d16a66573eb2f7498169e33e37d5a27698225492f3952791a4a6b CA-MTL-4 network issue has been resolved and network is operating normally. CA-MTL-4 went down https://uptime.runpod.io/ Wed, 24 Sep 2025 11:30:21 +0000 https://uptime.runpod.io/#bc6f1f64f7e57ff40dbadbe28e74f0dfdd7f9fc8e789090125da76e21b611ee1 CA-MTL-4 went down Upstream Issue - Console Logins are Unavailable https://uptime.runpod.io/incident/724608 Mon, 15 Sep 2025 16:25:00 -0000 https://uptime.runpod.io/incident/724608#08e55146c84526147e929240da8d099ba1a6150142b3636e1d931d6ddc1a13de This issue has been resolved and the upstream provider has confirmed resolution. Upstream Issue - Console Logins are Unavailable https://uptime.runpod.io/incident/724608 Mon, 15 Sep 2025 14:00:00 -0000 https://uptime.runpod.io/incident/724608#fe2f256138157c4a5fa39b6e04fe5b1734987e96bf927e3b5f96fc2598a4180a Logins to the Runpod console are impacted by an upstream provider. Logging in to the console may be slow, or may get stuck on the Runpod loading page. Please try refreshing your browser. We are tracking with this provider and will provide updates as we receive them. US-CA-2 Network issue https://uptime.runpod.io/incident/723742 Sat, 13 Sep 2025 20:52:00 -0000 https://uptime.runpod.io/incident/723742#bbfb2204e2fc57c9c61d2b750ebfdce251dd16977c6111c14c2001d3dea013b3 Issue has been resolved and network is operating normally. US-CA-2 Network issue https://uptime.runpod.io/incident/723742 Sat, 13 Sep 2025 19:25:00 -0000 https://uptime.runpod.io/incident/723742#6715d99e55fc852d9f956eddc421560e8d12a4112725af4da7cd3b4072c518a8 US-CA-2 is experiencing a network issue.
We are working on getting this resolved. US-TX-4 recovered https://uptime.runpod.io/ Wed, 10 Sep 2025 07:05:04 +0000 https://uptime.runpod.io/#1a2ae1c83e4122a8bbeb33b80fc26bfe8bf232b15fa0d472bfd38448f2034812 US-TX-4 recovered US-TX-4 went down https://uptime.runpod.io/ Wed, 10 Sep 2025 06:40:29 +0000 https://uptime.runpod.io/#1a2ae1c83e4122a8bbeb33b80fc26bfe8bf232b15fa0d472bfd38448f2034812 US-TX-4 went down US-NC-1 recovered https://uptime.runpod.io/ Thu, 04 Sep 2025 05:18:32 +0000 https://uptime.runpod.io/#3304791d28ab626343aacb14d8bc5797dffc37d6312f84c62279c39e8283f8b2 US-NC-1 recovered US-NC-1 went down https://uptime.runpod.io/ Thu, 04 Sep 2025 05:08:52 +0000 https://uptime.runpod.io/#3304791d28ab626343aacb14d8bc5797dffc37d6312f84c62279c39e8283f8b2 US-NC-1 went down Upstream Systems recovered https://uptime.runpod.io/ Sun, 31 Aug 2025 22:05:16 +0000 https://uptime.runpod.io/#1598cc96e7bb8e57b870a880eec32eb0678e7de474c706d1b0fe8d6adb102459 Upstream Systems recovered Upstream Systems went down https://uptime.runpod.io/ Sun, 31 Aug 2025 21:56:05 +0000 https://uptime.runpod.io/#1598cc96e7bb8e57b870a880eec32eb0678e7de474c706d1b0fe8d6adb102459 Upstream Systems went down Upstream Issue - Console login flow errors https://uptime.runpod.io/incident/714299 Thu, 28 Aug 2025 23:31:00 -0000 https://uptime.runpod.io/incident/714299#82de84b7735f86fdfffcca44fbf5935966ab74d7ddcdb9a1b586a687b83ccd6a This issue has been resolved and the upstream provider has confirmed resolution. Upstream Issue - Console login flow errors https://uptime.runpod.io/incident/714299 Thu, 28 Aug 2025 17:31:00 -0000 https://uptime.runpod.io/incident/714299#42646a71efe66936fce9b240d7940c097ce238bb658f82c994942c88d1e4376f The upstream provider has stabilized error rates and is still monitoring the situation. We will keep this status open until we receive positive confirmation that the upstream event is resolved. 
Upstream Issue - Console login flow errors https://uptime.runpod.io/incident/714299 Thu, 28 Aug 2025 15:00:00 -0000 https://uptime.runpod.io/incident/714299#eebd542905bb42b8f11ddcf86630857222f1ede7373a7b2e08548510b3f13ba9 Logins to the RunPod console are impacted by an upstream provider. Logging in to the console may be slow, or may get stuck on the Runpod loading page. Please try refreshing your browser. We are tracking with this provider and will provide updates as we receive them. Upstream Systems recovered https://uptime.runpod.io/ Wed, 27 Aug 2025 18:51:55 +0000 https://uptime.runpod.io/#766f665655b77d628e0623476f49cc7115384b5dc0963737c0ee27899386c785 Upstream Systems recovered Upstream Systems went down https://uptime.runpod.io/ Wed, 27 Aug 2025 18:33:18 +0000 https://uptime.runpod.io/#766f665655b77d628e0623476f49cc7115384b5dc0963737c0ee27899386c785 Upstream Systems went down CA-MTL-4 Network Issue https://uptime.runpod.io/incident/713024 Tue, 26 Aug 2025 22:23:00 -0000 https://uptime.runpod.io/incident/713024#6771da27fe53882eb018f67b635a3833218e203ce806c966d1d562562d0c6770 The issue has been resolved and networking is operating correctly. CA-MTL-4 recovered https://uptime.runpod.io/ Tue, 26 Aug 2025 22:13:49 +0000 https://uptime.runpod.io/#f4dd3f7c2151c08e2ce694e425896315410b7be0dabe3b218b3d0ac5ddc82cf3 CA-MTL-4 recovered CA-MTL-4 Network Issue https://uptime.runpod.io/incident/713024 Tue, 26 Aug 2025 21:47:00 -0000 https://uptime.runpod.io/incident/713024#489065c77886df9c0d6a1067b4dd303334e2112b2e918f7376a385b3f1434f89 The issue has been identified and initial remediation has been applied. We are monitoring traffic recovery.
CA-MTL-4 went down https://uptime.runpod.io/ Tue, 26 Aug 2025 21:39:18 +0000 https://uptime.runpod.io/#f4dd3f7c2151c08e2ce694e425896315410b7be0dabe3b218b3d0ac5ddc82cf3 CA-MTL-4 went down CA-MTL-4 Network Issue https://uptime.runpod.io/incident/713024 Tue, 26 Aug 2025 21:34:00 -0000 https://uptime.runpod.io/incident/713024#70ebe6606a8eb2f57e804b4f600314fa5f6c570d01d0d68fc2c7771f40c8df15 Networking at CA-MTL-4 has been disrupted. We are assessing and working on determining the root cause. US-TX-1 Network Issue https://uptime.runpod.io/incident/712177 Tue, 26 Aug 2025 00:44:00 -0000 https://uptime.runpod.io/incident/712177#46cfaa18415bd15611396906ad367f66fee69e2dc4e95f2c3a5afec3c95e318e Issue has been resolved and network is operating normally. US-TX-1 Network Issue https://uptime.runpod.io/incident/712177 Mon, 25 Aug 2025 17:40:00 -0000 https://uptime.runpod.io/incident/712177#3789784499c63fc3451e86a8f37d20d667cb026c4aa4fd3629ca6288177c7c40 US-TX-1 is experiencing a network issue. We are working on getting this resolved. Email related actions are degraded https://uptime.runpod.io/incident/711668 Mon, 25 Aug 2025 03:32:00 -0000 https://uptime.runpod.io/incident/711668#3b6e0c397c417434e1171440800e43e46c2c7aea9a0c3026971c1b3ba07411ef The issue has been resolved and service-related email sending has returned to normal levels. This did not impact any core functionality of RunPod Serverless, Pods, or other workloads. Email related actions are degraded https://uptime.runpod.io/incident/711668 Sun, 24 Aug 2025 22:43:00 -0000 https://uptime.runpod.io/incident/711668#776343f4a0016cbd90ba94a737108f7b23cd5d74819f7dbc3e1d4e119da4cd7a Actions related to email are currently degraded and email may not be delivered. This includes forgot password, new account, etc. The engineering team is working on a resolution and will provide updates here.
US-KS-2 recovered https://uptime.runpod.io/ Fri, 22 Aug 2025 03:16:49 +0000 https://uptime.runpod.io/#fe20bf55923370a41269770ce1f6c3ebc3a872cee78b66d61b90c54cf57a4a68 US-KS-2 recovered US-KS-2 went down https://uptime.runpod.io/ Fri, 22 Aug 2025 03:07:11 +0000 https://uptime.runpod.io/#fe20bf55923370a41269770ce1f6c3ebc3a872cee78b66d61b90c54cf57a4a68 US-KS-2 went down US-KS-2 recovered https://uptime.runpod.io/ Mon, 18 Aug 2025 18:50:27 +0000 https://uptime.runpod.io/#bce36b0ad97ecb3417001b90af5ecfc0c6312735ee6b90d2b905183732f1de04 US-KS-2 recovered US-KS-2 went down https://uptime.runpod.io/ Mon, 18 Aug 2025 18:40:46 +0000 https://uptime.runpod.io/#bce36b0ad97ecb3417001b90af5ecfc0c6312735ee6b90d2b905183732f1de04 US-KS-2 went down Upstream Systems recovered https://uptime.runpod.io/ Tue, 12 Aug 2025 14:26:37 +0000 https://uptime.runpod.io/#680c585222af65d84d7339114b2bc3e5b6112059dfa4bd62f9ef479bbf5f1c44 Upstream Systems recovered Upstream Systems went down https://uptime.runpod.io/ Tue, 12 Aug 2025 14:18:55 +0000 https://uptime.runpod.io/#680c585222af65d84d7339114b2bc3e5b6112059dfa4bd62f9ef479bbf5f1c44 Upstream Systems went down CA-MTL-3 - Network storage performance is degraded https://uptime.runpod.io/incident/673519 Sat, 02 Aug 2025 14:18:00 -0000 https://uptime.runpod.io/incident/673519#b6bb037f42a7637af5c9db5afa1c273c5bf413ec943c1086c163ee62321e76a1 We identified an edge case around network volume cleanup that was impacting the performance of the storage cluster in CA-MTL-3. The network storage performance has returned to normal. We will continue to monitor this closely. CA-MTL-3 - Network storage performance is degraded https://uptime.runpod.io/incident/673519 Sat, 02 Aug 2025 00:37:00 -0000 https://uptime.runpod.io/incident/673519#55180fb3931d6e06eb35005651c61f7e54030c1bee441cffe7bf839784109187 We are actively continuing to work on a resolution to this issue.
CA-MTL-3 - Network storage performance is degraded https://uptime.runpod.io/incident/673519 Fri, 01 Aug 2025 14:44:00 -0000 https://uptime.runpod.io/incident/673519#f606faf8f0c3b5f230b1faa800b4e640ebf0a455425c8a97b3ec85adab95033a We are continuing to work on resolving the issue and toward a path to recovery. CA-MTL-3 - Network storage performance is degraded https://uptime.runpod.io/incident/673519 Fri, 01 Aug 2025 01:14:00 -0000 https://uptime.runpod.io/incident/673519#d57db4f2f7a533da3aa601641da830e71737b8f20f23a5575f71d9413c5886ae We are continuing to work on restoring performance. EUR-NO-1 DC is experiencing issues https://uptime.runpod.io/incident/676392 Thu, 31 Jul 2025 18:49:00 -0000 https://uptime.runpod.io/incident/676392#136c0b4918dcc8e33c315aaada44a0a329645d1b612dfe0811da718113e9ea58 The EUR-NO-1 data center issue has been resolved. CA-MTL-3 - Network storage performance is degraded https://uptime.runpod.io/incident/673519 Thu, 31 Jul 2025 16:09:00 -0000 https://uptime.runpod.io/incident/673519#e91b9524237402c10f8e5914fd6f0f0c7d07f2e612eac4d1b5817d9cabcf4e69 CA-MTL-3's network storage array performance is degraded. We're working on restoring performance now. Console Elevated Error Rates https://uptime.runpod.io/incident/625805 Fri, 25 Jul 2025 19:13:00 -0000 https://uptime.runpod.io/incident/625805#cd2af9502e811089eeb36af92a0030cfefdc3bd5076804618326a8cdddf0ed5c The Console is running and functioning as normal. Console Elevated Error Rates https://uptime.runpod.io/incident/625805 Fri, 25 Jul 2025 17:13:00 -0000 https://uptime.runpod.io/incident/625805#26ae0eb1c2b234387aba49f314fd5ab901a4b4799b095182c76569bc11191977 Investigating - We are currently experiencing an issue with the Console returning elevated error rates for certain features. We will post an update as soon as we are able.
Cloudflare Global DNS is unavailable https://uptime.runpod.io/incident/619615 Mon, 14 Jul 2025 23:29:00 -0000 https://uptime.runpod.io/incident/619615#e68448b995159ac76bd3468d742cce22b54d51a18e43de8e29fc58b1cfb01928 Cloudflare has resolved the issue and we are observing normal network patterns when communicating with Cloudflare subnets. Cloudflare Global DNS is unavailable https://uptime.runpod.io/incident/619615 Mon, 14 Jul 2025 22:15:00 -0000 https://uptime.runpod.io/incident/619615#47b48958e2885aaf56d377de2a45ef8f73b063dc18c69d1add1f0596eb91161e Be advised, Cloudflare Global DNS is having an outage. Services which rely on 1.1.1.1 may fail to operate correctly. RunPod is not observing any service degradation at this time, however we are assessing the situation. More details available here: https://www.cloudflarestatus.com/incidents/28r0vbbxsh8f ui: runpod.io/console recovered https://uptime.runpod.io/ Mon, 07 Jul 2025 08:10:52 +0000 https://uptime.runpod.io/#66fb8fdb9e946d3b4c56705590126c3aaabdf0758d5243d8029cb5c70724e9ef ui: runpod.io/console recovered ui: runpod.io/console went down https://uptime.runpod.io/ Mon, 07 Jul 2025 08:10:08 +0000 https://uptime.runpod.io/#66fb8fdb9e946d3b4c56705590126c3aaabdf0758d5243d8029cb5c70724e9ef ui: runpod.io/console went down ui: runpod.io/console recovered https://uptime.runpod.io/ Sun, 06 Jul 2025 19:39:45 +0000 https://uptime.runpod.io/#bbdcf18fe4590f958ee0ebaec0e2bef736abdeaf23932f3266b64e96d3b36b72 ui: runpod.io/console recovered ui: runpod.io/console went down https://uptime.runpod.io/ Sun, 06 Jul 2025 19:38:48 +0000 https://uptime.runpod.io/#bbdcf18fe4590f958ee0ebaec0e2bef736abdeaf23932f3266b64e96d3b36b72 ui: runpod.io/console went down Network performance to Cloudflare is degraded in AP-JP-1 https://uptime.runpod.io/incident/612650 Wed, 02 Jul 2025 00:40:00 -0000 https://uptime.runpod.io/incident/612650#da0a4082e3cf9c2bf8a687ef27de21dd477a85b01600734c4759456f91e9e67b The upstream issue has been resolved and routing 
performance has returned to normal levels. Network performance to Cloudflare is degraded in AP-JP-1 https://uptime.runpod.io/incident/612650 Tue, 01 Jul 2025 20:39:00 -0000 https://uptime.runpod.io/incident/612650#bbfae10d7a1e4963edfc7c41bd3bcb3f67d5b4e04a8cf960afddc52bb2e9cf41 We are experiencing elevated packet loss in AP-JP-1 to certain network subnets on the global internet. We are engaging with our upstream provider to determine the root cause and resolution. Login to RunPod Console is degraded (Upstream service outage) https://uptime.runpod.io/incident/609416 Thu, 26 Jun 2025 07:30:00 -0000 https://uptime.runpod.io/incident/609416#aad3e087259ffe97970178bc47f1404a2a09ddaf040fffbc165a7bdff44aa249 Clerk has confirmed full recovery and access to the Runpod Console has been restored. Login to RunPod Console is degraded (Upstream service outage) https://uptime.runpod.io/incident/609416 Thu, 26 Jun 2025 07:09:00 -0000 https://uptime.runpod.io/incident/609416#e048e7ac99395b50edb95917e2087e446601cffee2f24825bbd53396ff6bbf1d We are observing recovery of logins and we are seeing correct login behavior on the console. We are still monitoring while Clerk confirms full recovery.
Login to RunPod Console is degraded (Upstream service outage) https://uptime.runpod.io/incident/609416 Thu, 26 Jun 2025 06:33:00 -0000 https://uptime.runpod.io/incident/609416#82882e9a88643b22e7ed281aa41d474adce28408ea7e055dd457e8d356488691 We are aware of an upstream issue with our authentication provider, Clerk, that prevents users from logging in to the Runpod console. When attempting to log into the console, the login form will not load and the user experiences an infinite loading animation. Existing pods, serverless, and other workloads are not impacted at this time. More information from Clerk is available here: https://status.clerk.com/incidents/01JYNESV77Q8D10QZKP2PF63PN RunPod console maintenance https://uptime.runpod.io/incident/605227 Wed, 18 Jun 2025 16:16:00 -0000 https://uptime.runpod.io/incident/605227#84e2f3f6b0648c0c1dd4cf3a98f29a0c6dc0e68e4412cb985e171197aeed19ac Access to the RunPod console has been restored. RunPod console maintenance https://uptime.runpod.io/incident/605227 Wed, 18 Jun 2025 15:25:00 -0000 https://uptime.runpod.io/incident/605227#98c22b1cbef67eb62ef4626f6c1b9bf176675e0569c161dbd85dfeaa6705565b RunPod console is experiencing issues. We are working on resolving them and will provide updates.
Monitoring Issues With Other Cloud Providers https://uptime.runpod.io/incident/602037 Thu, 12 Jun 2025 20:54:00 -0000 https://uptime.runpod.io/incident/602037#094b84f52f8a786bbba9c425313b46014ed7d4fbdf90bae7bb3a5fd44398a825 Docker Hub and cloud providers appear to be functioning normally. Monitoring Issues With Other Cloud Providers https://uptime.runpod.io/incident/602037 Thu, 12 Jun 2025 19:57:00 -0000 https://uptime.runpod.io/incident/602037#2efb2a98f87fc24103c4c55e9e8d30e7a7776a9698fcee6e352dbbc445fb5b56 We are aware of issues with various cloud providers and are monitoring the situation to ensure there is no impact to the Runpod platform. Docker Hub has acknowledged issues, which may affect some image pulls. You can view their status page here: https://www.dockerstatus.com/ Downtime in US-IL-1 https://uptime.runpod.io/incident/601337 Thu, 12 Jun 2025 06:56:00 -0000 https://uptime.runpod.io/incident/601337#a94cf48b31e79784d65e605c4dfd666d6b5423c9507a51493de051d172f15a96 The network issue in the US-IL-1 data center has been fully resolved. Our team will continue to monitor the situation. Downtime in US-IL-1 https://uptime.runpod.io/incident/601337 Thu, 12 Jun 2025 06:07:00 -0000 https://uptime.runpod.io/incident/601337#9264f39556d827a1862c6530ffceff77a29ea819bc550afe88b4988084eba22c We’ve detected network downtime affecting the US-IL-1 data center. Our team is actively investigating the issue and will continue to monitor the situation closely. We’ll provide updates as we learn more. Upstream issue - Docker Hub Registry https://uptime.runpod.io/incident/597075 Thu, 05 Jun 2025 14:23:00 -0000 https://uptime.runpod.io/incident/597075#d8415e3a7d2980f2444f525a7c5f6fd538dd37350382f8f3f7af3e348976dec8 Docker Hub has resolved its service issues and has returned to normal operation. 
Further details are captured here: https://www.dockerstatus.com/pages/history/533c6539221ae15e3f000031 Upstream issue - Docker Hub Registry https://uptime.runpod.io/incident/597075 Wed, 04 Jun 2025 12:51:00 -0000 https://uptime.runpod.io/incident/597075#feba563ed80f42445498601343481cdc24d12541c82e0dce77745d770fee043c Docker is observing issues with pulls and pushes against Docker Hub. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://www.dockerstatus.com/ for further detail. Upstream issue - Canonical (Ubuntu) package manager https://uptime.runpod.io/incident/583629 Fri, 30 May 2025 16:16:00 -0000 https://uptime.runpod.io/incident/583629#40da1719ada80c7737080e0d51fa937b42dab2ec4d6d8cd913d5b8ac5bfaaee1 Canonical has resolved its service issues, and measured error rates have returned to normal. Further details are captured here: https://status.canonical.com/#/incident/KNms6QK9ewuzz-7xUsPsNylV20jEt5kyKsd8A-3ptQGnu9-UhZcQUtDmIVRYTQMx6Vt0EjSxe6Bz4_D89gPRLg== Upstream issue - Canonical (Ubuntu) package manager https://uptime.runpod.io/incident/583629 Thu, 29 May 2025 16:41:00 -0000 https://uptime.runpod.io/incident/583629#db8e4d7b9c962b80be958131b5c91ceeec966d9ce64860fa139fa60cc70e1989 Canonical (Ubuntu)'s package mirrors are degraded. Users may encounter timeouts or other connection-related issues when running `apt-get` commands. We are monitoring the situation and will provide ongoing updates if the situation changes. See https://status.canonical.com/ for further detail.
Planned Internet Maintenance EU-FR-1 https://uptime.runpod.io/incident/577181 Tue, 27 May 2025 14:00:50 -0000 https://uptime.runpod.io/incident/577181#6cdcc341d94453fd78811b1d89198e2fd3af6565b8b10a961821cd443189f706 Maintenance completed Planned Internet Maintenance EU-FR-1 https://uptime.runpod.io/incident/577181 Tue, 27 May 2025 11:00:50 -0000 https://uptime.runpod.io/incident/577181#00a90a5306980309984dcd18fd5862773b58e7983c08fd169c48654b32c10ad5 We are conducting planned internet service maintenance in data center EU-FR-1 on May 27, 2025, between 11:00-14:00 UTC. During this scheduled time, internet service will be temporarily offline, but power will be maintained. Planned Internet Maintenance US-TX-3 https://uptime.runpod.io/incident/559319 Wed, 14 May 2025 10:00:39 -0000 https://uptime.runpod.io/incident/559319#6b11f988c07f59e5ca11e10adfd2080e25c423c9c9dfd2cde0c0612e4cad0590 Maintenance completed Planned Internet Maintenance US-TX-3 https://uptime.runpod.io/incident/559319 Wed, 14 May 2025 07:00:39 -0000 https://uptime.runpod.io/incident/559319#dcc2f6dc3edc1597d5f4450b3a0149b9a9b07dbed170963467c513ada54ff9ae We are conducting planned internet service maintenance in data center US-TX-3 on May 14, 2025, from 07:00-10:00 UTC. During this scheduled time, internet service will be temporarily offline, but power will be maintained.
EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 17:04:00 -0000 https://uptime.runpod.io/incident/557552#0c511ced79af707b623cf3362ccb760ededea15dc892cb397de4f2aa7fe2a52d The storage cluster has been restored to nominal operating performance, and we are continuing to monitor performance. EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 16:26:00 -0000 https://uptime.runpod.io/incident/557552#abfbe04a2e8dca3e22b70056e7ca9e7f2be85c25783ee8f8fe1178e88ecda1ec Reads and writes have been re-enabled on this cluster. Performance remains degraded as the system restores. EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 16:17:00 -0000 https://uptime.runpod.io/incident/557552#dc5fb3c7cf102ae24b7ad7f9c17356ffebefc0f643e48e607d9015729d866a7a The team has isolated the issue and is working to restore service now.
EU-RO-1 Network Storage is degraded https://uptime.runpod.io/incident/557552 Tue, 06 May 2025 15:48:00 -0000 https://uptime.runpod.io/incident/557552#665d7aca1cfdc27893d615b3fc744ac312101846addef029325f49d6c6be0ab1 EU-RO-1 Network Storage is degraded, resulting in inability to read and write to network stores. We are working to restore service now. US-NC-1 Network Issue https://uptime.runpod.io/incident/553492 Tue, 29 Apr 2025 01:50:00 -0000 https://uptime.runpod.io/incident/553492#4565b80f695ab3065afedaba2533420d030eb537fe9b063cd03e862402c623c0 Our US-NC-1 data center is currently experiencing a network issue. The team is actively investigating. ---- The network has been restored. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 18:40:00 -0000 https://uptime.runpod.io/incident/548625#5bb6d044b2530e5e030ac4afccce0b1371c1d59aa9431144bc5a64f61e3f14ef The issue has been resolved and error rates have returned to normal levels. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 18:33:00 -0000 https://uptime.runpod.io/incident/548625#1fa6a48885767dbcd246c69f72fa97f5b336e0bbe61cf5d4be2ea4a42cb18c30 The fix has been deployed and we are monitoring recovery - error rates are returning to normal levels. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 18:04:00 -0000 https://uptime.runpod.io/incident/548625#32f2d4f8f8b6281cd834091fb0bbf00ad2c7736376a4759e07ef37f1e9fb4044 The team has identified the issue and is deploying a fix at this time. Error rates elevated for Serverless endpoints https://uptime.runpod.io/incident/548625 Mon, 21 Apr 2025 17:53:00 -0000 https://uptime.runpod.io/incident/548625#a8ad16475a4aae6e43b8f99dd2e0b5ad93ea98661cdb7e0df96342bf4456d37b We are observing elevated error rates for Serverless endpoints which is resulting in failed requests and responses. The Engineering team is investigating now. 
RunPod console shows Pods and Serverless endpoints unavailable https://uptime.runpod.io/incident/543258 Thu, 10 Apr 2025 19:06:00 -0000 https://uptime.runpod.io/incident/543258#432ae229000dc585e314e667b5681ae5d1db5242f8983e68bae511496b643df9 Monitoring - all services are returning to normal operating baselines, however we are continuing to monitor overall service recovery. ----- On April 10, 2025, between 18:26:30 UTC and 18:53:00 UTC, a service disruption occurred due to a software release that was dependent on a database change which had not yet been applied. This caused our primary API to become temporarily non-functional. As a result, customers experienced issues including missing pods and serverless endpoints in the dashboard, and delayed request processing due to serverless endpoints being unable to scale.
RunPod console shows Pods and Serverless endpoints unavailable https://uptime.runpod.io/incident/543258 Thu, 10 Apr 2025 18:55:00 -0000 https://uptime.runpod.io/incident/543258#7e5d52e00e4e8b8a00381fcd0ae40365143d205388f8f46b9036b5958c82770c Identified - This issue is caused by a database problem. We've applied the fix and are continuing to monitor recovery.
RunPod console shows Pods and Serverless endpoints unavailable https://uptime.runpod.io/incident/543258 Thu, 10 Apr 2025 18:43:00 -0000 https://uptime.runpod.io/incident/543258#790ff76c6a985582ad0677574741e7711012f55512be08386d7230732a89893a Investigating - We are currently experiencing an issue with RunPod console and API where users are not able to access or deploy new Pods and Serverless endpoints. We are currently investigating and will post an update as soon as we are able. Billing and Audit Log pages down https://uptime.runpod.io/incident/541449 Mon, 07 Apr 2025 21:08:00 -0000 https://uptime.runpod.io/incident/541449#84b3fc571bf98e3c45e66909e9a62ec20285bf9106a7f77cc6004ac3406d1dff Resolved - Users were unable to access the Billing and Audit Log pages in User Settings. We rolled out a fix and this issue is now resolved.
Billing and Audit Log pages down https://uptime.runpod.io/incident/541449 Mon, 07 Apr 2025 20:54:00 -0000 https://uptime.runpod.io/incident/541449#69839c8856bbda317412524c134dfaf1af26e26b1b49c03d0111b4244c114eab Identified - This issue is caused by a bug in the application code. A hot fix will be released imminently. We will provide another update once the hot fix has been rolled out and service is restored. Billing and Audit Log pages down https://uptime.runpod.io/incident/541449 Mon, 07 Apr 2025 20:35:00 -0000 https://uptime.runpod.io/incident/541449#6f268e31335ff0a8b703bf35e0c90c8ab213a8b6dfb940782e652693217f8ffd Investigating - We are currently experiencing an issue with some pages not loading in the RunPod Console User Settings. Specifically, we are aware that users are not able to access the Billing and Audit Log pages at this time. We are currently investigating and will post an update as soon as we are able. Urgent: Emergency Firmware Update for US-TX-4 at 21:00 UTC (March 11, 2025) https://uptime.runpod.io/incident/526582 Tue, 11 Mar 2025 18:59:00 -0000 https://uptime.runpod.io/incident/526582#6367a5612e3c0d0caba76f5fe8e9be696d81a2f2fe37a6e4f4a3c04afd9b7e86 Our engineering team has identified a network disruption at our US-TX-4 datacenter, caused by a required firmware update for our router. To resolve this, we will deploy an emergency fix at 21:00 UTC on March 11, 2025, with a maximum expected downtime of 10-15 minutes. ----------- The update was successfully completed. US-NC-1 Network Issue https://uptime.runpod.io/incident/523954 Thu, 06 Mar 2025 18:44:00 -0000 https://uptime.runpod.io/incident/523954#a443da98c1ea5485cc1e02caaaf18502cd7ebeea9ec3fe2f1094fb7d0ce1cfbc Our primary ISP circuit for the US-NC-1 data center experienced an outage. The secondary router failed to take over due to a known firmware issue that was scheduled for a later patch. We’ve now upgraded the router to the latest patched version and are running on the secondary circuit. 
--------- The issue has been resolved. Issue with Volume Storage in CA-MTL-1 https://uptime.runpod.io/incident/518776 Tue, 25 Feb 2025 14:53:00 -0000 https://uptime.runpod.io/incident/518776#a38a988f6fcbccc9e0a0fdc072c6655b943826743b09279a7425c0157ad384a9 We have discovered an issue affecting pods running in CA-MTL-1 when using volume disk or network storage. When executing commands, the process may hang, although the file is still created successfully. So far, this issue primarily impacts most H100 GPUs and a few A40 GPUs. Our team is actively investigating and will provide updates here as we learn more. ------- We have identified the root cause of the issue, and the team is pushing updates to the machines. ------- All machines have been updated, and the issue is now resolved. EU-CZ-1 Data Center Upgrade https://uptime.runpod.io/incident/513399 Sat, 15 Feb 2025 17:00:00 -0000 https://uptime.runpod.io/incident/513399#5c3d731bc79877773d2e4e31e89f7f6a40d3c220efa1f14df8b7992460e18907 We are currently upgrading the EU-CZ-1 data center, and all machines are offline during this process. Services hosted in this region are temporarily unavailable during this period. ------ We’ve successfully brought most of the machines online. However, due to some technical issues, we need a bit more time to restore the remaining ones. Thanks for your patience, we’ll keep you posted! ------ All machines in the EU-CZ-1 data center are now fully online. The data center upgrade is complete, thank you for your patience! Serverless Request Issue https://uptime.runpod.io/incident/512662 Thu, 13 Feb 2025 23:23:00 -0000 https://uptime.runpod.io/incident/512662#cd26736a958bc4f7e05df46a6ab050be4b007f33ae6e8f3bbc3fd15816b6bf62 We experienced an issue affecting serverless requests from 10:00 PM to 10:23 PM UTC. This was due to an update made to improve system capacity in the NYC region, which led to temporary request issues.
The issue has been identified and resolved, and we’ve taken steps to minimize future risks. ---- We are still seeing issues, and our team is actively investigating. We’ll provide further updates as soon as we have more information. ---- We have identified the issue and will be rolling out a release to fix it soon. Thank you for your patience while we work on resolving this. ----- The issue was still related to the new server we added. After adding the new server, it triggered an unexpected bug that caused the worker to be unable to retrieve the request payload. --------- The team has just confirmed that the issue is now resolved. 🚨 CA-MTL-1 Network Volume Performance Issue 🚨 https://uptime.runpod.io/incident/511142 Tue, 11 Feb 2025 16:00:00 -0000 https://uptime.runpod.io/incident/511142#b390762037be10eb8aa08081fbf68dd63266d8e872aee7e84c44284a8df481fd We’re currently experiencing performance issues with network volumes in the CA-MTL-1 data center. Our team is investigating the issue, and we’ll provide updates as soon as possible. ------ We detected a performance issue with one of the chunk servers and have isolated the affected server. ------ The issue has been resolved CA-MTL-3 Network Disruption https://uptime.runpod.io/incident/504467 Thu, 30 Jan 2025 11:14:00 -0000 https://uptime.runpod.io/incident/504467#4eee808b5ee6eff1041a5b1dd202ab4c710a65294646c6ef19e91adbe893a3ae CA-MTL-3 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. --------- the network is restored US-TX-4 Network Disruption https://uptime.runpod.io/incident/500904 Thu, 23 Jan 2025 22:52:00 -0000 https://uptime.runpod.io/incident/500904#f98451bf98724360bc2d433708dfab2fe45094af5b465449fda11bd3122d664d US-TX-4 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. 
----- The US-TX-4 region will experience a short network disruption at approximately 01/23/2025 5:30 PM CST for about 10 minutes due to an emergency firewall update. We apologize for any inconvenience and appreciate your understanding as we perform this critical update. ---------- The issue affecting US-TX-4 has been resolved. Services are now operating normally. Thank you for your patience and understanding. EU-SE-1 Network Disruption https://uptime.runpod.io/incident/496566 Thu, 16 Jan 2025 00:00:00 -0000 https://uptime.runpod.io/incident/496566#bac999ca6f72114dc0143198f9378a39a59fc98bda798366a7fcb144b874f57a The network issue at the data center has been resolved. Thank you for your patience. EU-SE-1 Network Disruption https://uptime.runpod.io/incident/496566 Mon, 13 Jan 2025 04:00:00 -0000 https://uptime.runpod.io/incident/496566#6e5797f4a081232adc2c738c6b37214d18483c5850df0b4682f56414b4077165 EU-SE-1 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. US-TX-3 Network Disruption https://uptime.runpod.io/incident/484139 Thu, 19 Dec 2024 04:34:00 -0000 https://uptime.runpod.io/incident/484139#3377ce032047262417f18071137dea5cec0dfabfb8215b95228b947ce70d19f2 US-TX-3 is suffering a network disruption due to an upstream provider issue. We are in contact with the provider and are working to restore network availability now. US-TX-3 Network Disruption https://uptime.runpod.io/incident/483548 Tue, 17 Dec 2024 20:42:00 -0000 https://uptime.runpod.io/incident/483548#8f20af3c61e182f47833106a93d6a72e88e38d3f1f2d48b69ca8a4ed4a5ecc43 This issue was due to an upstream provider and has been resolved. We have requested an RCA and will provide updates as applicable. 
US-TX-3 Network Disruption https://uptime.runpod.io/incident/483548 Tue, 17 Dec 2024 20:25:00 -0000 https://uptime.runpod.io/incident/483548#aaaf0c42e4c518d8b8217a2847e9cd97f45584a583be458fcca2e81e319de440 US-TX-3 suffered a network disruption due to an upstream provider issue. CA-MTL-1 data center is currently inaccessible https://uptime.runpod.io/incident/452574 Tue, 29 Oct 2024 11:51:00 -0000 https://uptime.runpod.io/incident/452574#1f3b3b519b2080a72c8dbc07de49b6941232c353ae70b11bcf1db17e16ae66f4 Our CA-MTL-1 data center recently underwent maintenance, which was completed with minimal impact. However, during post-maintenance monitoring, the data center became inaccessible due to an unexpected issue. Our team is actively working to resolve the problem. ---- The network issue has been resolved for the CA-MTL-1 data center. Elevated errors for dashboard and API https://uptime.runpod.io/incident/442564 Thu, 10 Oct 2024 19:46:00 -0000 https://uptime.runpod.io/incident/442564#7cc59f1d855f565f3181d01f81c7918cf6c3c7fcc2369ff65730df7a5ba663ba The root cause has been resolved, and services have returned to normal operating levels. Elevated errors for dashboard and API https://uptime.runpod.io/incident/442564 Thu, 10 Oct 2024 19:17:00 -0000 https://uptime.runpod.io/incident/442564#aeaac89a546035e9510426195f43d675408cebe6c6ff3fe94b4f690c3783e377 We are currently experiencing elevated error rates for the console and primary APIs. We have identified the issue and are in the process of resolving it.
EUR-IS-1 Network Issue https://uptime.runpod.io/incident/441372 Tue, 08 Oct 2024 19:59:00 -0000 https://uptime.runpod.io/incident/441372#be330aacb37d015b9469eb7ea6a5fd94aec2b20ad238ce83f24143b4977a539d The network issue at the data center has been resolved. Thank you for your patience. EUR-IS-1 Network Issue https://uptime.runpod.io/incident/441372 Tue, 08 Oct 2024 18:20:00 -0000 https://uptime.runpod.io/incident/441372#028dd57188d133992ad1feb43529e129977b2e6ae9c1d9694cb91a32fea3bd89 We’re currently experiencing network packet loss issues in the EUR-IS-1 region, leading to connectivity errors and connection loss. Our team is actively coordinating with the data center and networking teams to resolve the problem. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/435242 Tue, 01 Oct 2024 00:20:00 -0000 https://uptime.runpod.io/incident/435242#98262437ae0f119ac56df9c16306c87ef16bb02fd126f62177748eba52443232 The root cause of the issue has been addressed and congestion has returned to baseline levels. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/435242 Mon, 30 Sep 2024 14:43:00 -0000 https://uptime.runpod.io/incident/435242#5bb3ae97b27a733d3cc69b782fa0c1c0cbbb312e6667902194ce1b6eefb8f045 We are currently observing elevated packet loss within the EUR-IS-1 DC. This is resulting in increased connection resets and failures. We are engaging with the network provider to determine the root cause.
Network availability issues in EUR-IS-1 https://uptime.runpod.io/incident/424794 Thu, 05 Sep 2024 18:04:00 -0000 https://uptime.runpod.io/incident/424794#ea3e05de77229d04ef8fefb1ff51c5b87419aa5c3c991a8aba879d04ad9b71f6 Network availability has been restored by the upstream provider. We will be performing an RCA and will provide further details. Network availability issues in EUR-IS-1 https://uptime.runpod.io/incident/424794 Thu, 05 Sep 2024 17:30:00 -0000 https://uptime.runpod.io/incident/424794#7dbd09f2bec60affd409d52939a6c462fef7735511d1b1fe066cdb003100e29f We're experiencing elevated network errors in EUR-IS-1 resulting in connectivity errors and connection loss. We are coordinating with the DC and networking teams. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Wed, 04 Sep 2024 01:59:00 -0000 https://uptime.runpod.io/incident/423675#3be8f57d1f34097955cdc708aaec7b624fd1f822affe6b98c2ac2612a83c2c89 We have received confirmation from the upstream network provider and we have validated that this issue is resolved. The root cause was a network protection ruleset which engaged in a false-positive manner to drop a selection of packets. This resulted in failure to establish connections and impact to bandwidth over TCP/QUIC connections. We will provide an RCA once we receive the report from the upstream provider. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 21:16:00 -0000 https://uptime.runpod.io/incident/423675#c8810bdb8cc229c41275f1cee6d87bf9a238b165ce4d52bb8d042a3d114df593 Packetloss has returned to nominal levels. We are still monitoring closely. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 20:19:00 -0000 https://uptime.runpod.io/incident/423675#3b018d4313fbc8d1548b8d109a4cd74c183cc578e5b33fc89e2152fa4f3fe3eb The network provider is still in the process of mitigating the issue.
We will provide regular updates as they make progress. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 19:32:00 -0000 https://uptime.runpod.io/incident/423675#356e6437a6cd95d5f0a9122417e8d469b1bb3a46c58eb76c21fa97b3e763c849 The network provider is still in the process of mitigating the issue. We will provide regular updates as they make progress. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 18:49:00 -0000 https://uptime.runpod.io/incident/423675#ef2774d80ff90a1e39a1630f7a487ee6c34854b91a03200dfe896693faaa8db0 The network provider is still in the process of mitigating the issue. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 18:21:00 -0000 https://uptime.runpod.io/incident/423675#44ce8ab39b9b84b6cbe2cdf4773d438418326ee404e5434b317c651f5a9cc1cc The network provider is diagnosing the issue and has isolated it to exist within a specific network segment. Elevated network packetloss in US-OR-1 DC https://uptime.runpod.io/incident/423675 Tue, 03 Sep 2024 17:02:00 -0000 https://uptime.runpod.io/incident/423675#a089eae61a9a6e7ee976dcd2ab18cf5acce50474518a800aadc3aa218f532b39 We are currently observing elevated packet loss within the US-OR-1 DC. This is resulting in increased connection resets and failures. We are engaging with the network provider to determine the root cause. Serverless Workers Unable To Read Environment Variables In Templates https://uptime.runpod.io/incident/382897 Tue, 11 Jun 2024 20:32:00 -0000 https://uptime.runpod.io/incident/382897#3c8f4ebcf1d8fb1d1fe1bfc48169c89f881f7adbbe26ebfd95c5bb3c496f0866 At 9:02 AM PST workers in serverless endpoints were unable to read environment variables set in templates. Thus, workers that were not already initialized and relied on environment variables from the template would fail to start. This issue has been resolved at 1:32 PM PST. 
US-OR-1 Firewall Under Stress https://uptime.runpod.io/incident/382380 Tue, 11 Jun 2024 19:33:00 -0000 https://uptime.runpod.io/incident/382380#8db84946b7b069b16703060dab37b93daf5c33e0a00b5642b6692de7b8370de2 We have resolved this incident. US-OR-1 Firewall Under Stress https://uptime.runpod.io/incident/382380 Mon, 10 Jun 2024 20:32:00 -0000 https://uptime.runpod.io/incident/382380#62ee7e3f49104103bbef65a954d4d6e4c781e4ecdfdc6ea50e2424a5cdf0dbda Our firewall is currently handling an unusually high number of small packets, which may cause some temporary service disruptions and a slowdown in upload and download speeds in US-OR-1. We are working to resolve the issue. Network Volume outage in RO region https://uptime.runpod.io/incident/231767 Wed, 12 Jul 2023 01:25:00 -0000 https://uptime.runpod.io/incident/231767#ab9f609106dbd5ed786435cc420f2829d54f28a5a4cc23313909f128084e1659 We patched the configuration and the problem should now be resolved. We will continue to monitor. Network Volume outage in RO region https://uptime.runpod.io/incident/231767 Wed, 12 Jul 2023 01:00:00 -0000 https://uptime.runpod.io/incident/231767#ca244d30eb80b2577b1fb82c08b4546c4709aa8373db377fb7355e40a148cdf3 We had a configuration issue that prevented network volumes in the RO region from being registered, causing a widespread outage for pods in the region. We have resolved the issue and patched the configuration so that this won't happen again. We are also reviewing this configuration in other regions to bring it in line with this region.