Max capacity errors
During training you may see max capacity errors in your rollout logs. There are two ways to address them:
- Increase capacity on the environment. If you own the environment (or can contact the owner), increase the max pods or sessions per pod in the environment’s settings.
- Lower your max concurrency. Reduce the max concurrency in your training settings so fewer sessions are requested at once.
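If your rollouts are launched from your own code rather than through a training setting, the same idea can be applied client-side. The following is a minimal sketch, assuming an async rollout function; `run_rollout`, `tasks`, and `max_concurrency` are hypothetical names used for illustration, not part of any specific training API.

```python
import asyncio


async def run_with_concurrency_limit(tasks, run_rollout, max_concurrency):
    """Run run_rollout(task) for every task, at most max_concurrency at a time.

    run_rollout is a hypothetical async callable that opens one environment
    session and returns the rollout result.
    """
    # The semaphore caps how many sessions are requested at once.
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(task):
        async with semaphore:
            return await run_rollout(task)

    # gather preserves input order, so results[i] corresponds to tasks[i].
    return await asyncio.gather(*(guarded(task) for task in tasks))
```

Lowering `max_concurrency` here trades throughput for fewer simultaneous sessions, which is the same trade-off as lowering the setting in your training configuration.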
Memory allocation errors
You may also see memory allocation errors. There are two ways to handle them:
- Retry the affected rollouts. Your training code should detect these failures and redo the rollouts once the environment server comes back up.
- Reduce memory pressure on the environment. This is the longer-term fix. See Out of memory (OOM) crashes in the environment debugging guide for specific strategies: loading less data, using index-based task access, or increasing the environment’s memory allocation.
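The retry option can be sketched as a small wrapper with exponential backoff. This is only an illustration under stated assumptions: `EnvironmentUnavailableError` and `run_rollout` are hypothetical names, and you would substitute whatever exception your client actually raises when the environment server is down.

```python
import time


class EnvironmentUnavailableError(Exception):
    """Hypothetical error raised when the environment server is down."""


def rollout_with_retry(run_rollout, task, max_attempts=5, base_delay=1.0,
                       sleep=time.sleep):
    """Call run_rollout(task), retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return run_rollout(task)
        except EnvironmentUnavailableError:
            if attempt == max_attempts:
                raise  # give up after the final attempt
            # Wait base_delay, then 2x, 4x, ... so the server has time to restart.
            sleep(base_delay * 2 ** (attempt - 1))
```

Retrying only masks the symptom. If the same rollouts keep failing, the environment is likely still running out of memory and needs the longer-term fix.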

