Previous incidents

July 2024
No incidents reported
June 2024
Jun 11, 2024
1 incident

Serverless Workers Unable To Read Environment Variables In Templates

Resolved Jun 11 at 01:32pm PDT

Starting at 9:02 AM PDT, workers in serverless endpoints were unable to read environment variables set in templates. As a result, workers that were not already initialized and relied on environment variables from the template failed to start.

This issue was resolved at 1:32 PM PDT.
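
For workers that depend on template environment variables, a startup check can turn this kind of failure into an explicit error message. The following is a minimal sketch, not part of the platform: the helper `missing_env_vars` and the variable names in the usage comment are hypothetical and only illustrate the idea.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty in `environ`.

    `environ` defaults to the process environment; passing a dict makes
    the check easy to exercise without touching real variables.
    """
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Usage at worker startup (variable names are hypothetical):
#
#     missing = missing_env_vars(["MODEL_NAME", "API_TOKEN"])
#     if missing:
#         raise SystemExit("missing environment variables: " + ", ".join(missing))
```

Failing early with the list of missing names makes the cause visible in the worker logs instead of surfacing later as an unrelated startup error.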

Jun 10, 2024
1 incident

US-OR-1 Firewall Under Stress

Degraded

Resolved Jun 11 at 12:33pm PDT

We have resolved this incident.

May 2024
May 09, 2024
1 incident

Decreased reliability for GPU workers that need to spawn large numbers of pro...

Downtime

Resolved May 10 at 01:00pm PDT

Summary:

GPU pods were assigned too low a Process ID (PID) limit, which could cause unexpected failures when a pod launched more than 1024 processes.
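
To check the PID limit visible from inside a pod, the sketch below reads the Linux cgroup `pids.max` file. It assumes the pod's limit is enforced through the cgroup PID controller and that the cgroup filesystem is mounted at the standard paths shown. Neither is confirmed by this report, and the function names are illustrative.

```python
from pathlib import Path

# Standard mount points for the cgroup PID controller (v2 first, then v1).
# Assumption: the container exposes its own cgroup at one of these paths.
CGROUP_PIDS_MAX_PATHS = (
    Path("/sys/fs/cgroup/pids.max"),
    Path("/sys/fs/cgroup/pids/pids.max"),
)


def parse_pids_max(text):
    """Parse the content of a pids.max file; 'max' means no limit (None)."""
    value = text.strip()
    return None if value == "max" else int(value)


def read_pid_limit(paths=CGROUP_PIDS_MAX_PATHS):
    """Return the first readable PID limit, or None if unlimited or unknown."""
    for path in paths:
        try:
            return parse_pids_max(path.read_text())
        except (OSError, ValueError):
            continue
    return None


def limit_is_sufficient(limit, needed_tasks):
    """True when `limit` is absent (None) or at least `needed_tasks`."""
    return limit is None or limit >= needed_tasks
```

Note that the cgroup PID controller counts threads as well as processes, so a workload can reach the limit with fewer processes than the number suggests.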

Source of Bug:

  • A logic error introduced while adding AMD GPU vendor support.

Timeline

  • START: ~12:00 PDT 2024-05-09
  • END: ~13:00 PDT 2024-05-10

Suggested Actions by Category:

Serverless

This should resolve itself automatically if you allow your workers to scale to zero. Alternatively, force-scal...
