Previous incidents

June 2024
Jun 11, 2024
1 incident

Serverless Workers Unable To Read Environment Variables In Templates

Resolved Jun 11 at 01:32pm PDT

At 9:02 AM PDT, workers in serverless endpoints became unable to read environment variables set in templates. As a result, workers that were not already initialized and relied on environment variables from the template failed to start.

This issue was resolved at 1:32 PM PDT.
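
For workers that depend on template environment variables, a startup check can make this kind of failure easier to spot in logs. The following is a minimal sketch, not part of the platform: the helper names and the variable names `MODEL_NAME` and `API_TOKEN` are hypothetical and stand in for whatever your template defines.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(required: Iterable[str],
                     env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


def require_env(required: Iterable[str],
                env: Mapping[str, str] = os.environ) -> None:
    """Raise a descriptive error if any required variable is missing."""
    missing = missing_env_vars(required, env)
    if missing:
        raise RuntimeError(
            "Missing required environment variables: " + ", ".join(missing)
        )


# Hypothetical usage at worker startup (variable names are assumptions):
# require_env(["MODEL_NAME", "API_TOKEN"])
```

Calling `require_env` before any other initialization makes the worker fail with a message that names the missing variables, rather than failing later for a less obvious reason.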

Jun 10, 2024
1 incident

US-OR-1 Firewall Under Stress

Degraded

Resolved Jun 11 at 12:33pm PDT

We have resolved this incident.

May 2024
May 09, 2024
1 incident

Decreased reliability for GPU workers that need to spawn large numbers of pro...

Downtime

Resolved May 10 at 01:00pm PDT

Summary:

GPU pods were being given a Process ID (PID) limit that was too low, which could cause unexpected failures when they launched more than 1024 processes.
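
To confirm the PID limit a container actually sees, one option is to read the cgroup `pids.max` file from inside it. This is a rough sketch under stated assumptions: it assumes a Linux container where the pids controller is mounted at one of the two common paths below (cgroup v2, then cgroup v1), which may not match every environment. The helper names are illustrative.

```python
from pathlib import Path
from typing import Iterable, Optional

# Common locations of the pids controller limit (assumed, not guaranteed):
# cgroup v2 first, then cgroup v1.
PIDS_MAX_PATHS = (
    Path("/sys/fs/cgroup/pids.max"),
    Path("/sys/fs/cgroup/pids/pids.max"),
)


def parse_pids_max(raw: str) -> Optional[int]:
    """Parse the file contents; the kernel writes 'max' when there is no limit."""
    value = raw.strip()
    return None if value == "max" else int(value)


def read_pid_limit(paths: Iterable[Path] = PIDS_MAX_PATHS) -> Optional[int]:
    """Return the first readable limit, or None if unlimited or not found."""
    for path in paths:
        try:
            return parse_pids_max(path.read_text())
        except (OSError, ValueError):
            continue  # File missing, unreadable, or malformed; try the next path.
    return None
```

A return value of `None` means either no limit or no readable file, so treat it as "unknown" rather than as proof that the pod is unconstrained.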

Source of Bug:

  • Logic error created as part of adding AMD GPU vendor support.

Timeline

  • START: ~12:00 PDT 2024-05-09
  • END: ~13:00 PDT 2024-05-10

Suggested Actions by Category:

Serverless

This should resolve itself automatically if you allow your workers to scale to zero. Alternatively, force-scal...

April 2024
Apr 13, 2024
1 incident

Network Upgrades

Maintenance

Resolved Apr 13 at 10:00am PDT

We'll be running network upgrades for US-OR-1. If you have any issues with the new public IP, please note that it will only work after this upgrade has completed. If you have hard-coded any IP addresses, remember to update them and restart your services after the migration.