When running large-scale RL training against OpenReward environment endpoints, you may encounter errors related to capacity limits or resource constraints. This page covers the most common ones and how to handle them.

Max capacity errors

During training you may see errors like this in your rollout logs:
ClientResponseError(429): Environment at maximum capacity.
Ask the environment owner to increase the max pods or sessions per pod.
This means your training run is requesting more concurrent sessions than the environment can serve. There are two ways to address this:
  1. Increase capacity on the environment. If you own the environment (or can contact the owner), increase the max pods or sessions per pod in the environment’s settings.
  2. Lower your max concurrency. Reduce the max concurrency in your training settings so fewer sessions are requested at once.
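
If your rollout loop is your own asyncio code rather than a framework setting, one way to cap concurrency is a semaphore around each rollout. The following is a minimal sketch, not an OpenReward API: `run_rollout`, `run_all`, and `MAX_CONCURRENCY` are hypothetical names, and the value 8 is only a placeholder that you would set at or below what the environment is configured to serve.

```python
import asyncio

# Assumed placeholder value: keep this at or below the number of
# concurrent sessions the environment is configured to serve.
MAX_CONCURRENCY = 8


async def run_rollout(task_id):
    """Hypothetical stand-in for one rollout against the environment."""
    await asyncio.sleep(0.01)  # replace with the real session/rollout call
    return task_id


async def run_all(task_ids, max_concurrency=MAX_CONCURRENCY, rollout=run_rollout):
    """Run every rollout, with at most `max_concurrency` in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(task_id):
        # Rollouts beyond the limit wait here instead of opening a session.
        async with semaphore:
            return await rollout(task_id)

    # gather() returns results in the same order as task_ids.
    return await asyncio.gather(*(bounded(t) for t in task_ids))


if __name__ == "__main__":
    results = asyncio.run(run_all(range(32)))
    print(f"completed {len(results)} rollouts")
```

The semaphore limits sessions requested by this process only; if several training workers share one environment, the limits add up across workers.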

Memory allocation errors

You may also see errors like this:
ClientResponseError(503): Pod unavailable: Container ran out of memory.
Ask the environment owner to increase the memory allocation.
This means the environment server itself ran out of memory (OOM) and crashed during your rollouts. Unlike a max capacity error, the environment has to restart before it can serve requests again. To handle this:
  1. Retry the affected rollouts. Your training code should detect these failures and redo the rollouts once the environment server comes back up.
  2. Reduce memory pressure on the environment. This is the longer-term fix. See Out of memory (OOM) crashes in the environment debugging guide for specific strategies: loading less data, using index-based task access, or increasing the environment’s memory allocation.
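
The retry step can be sketched as a wrapper that re-runs a rollout when it fails with a 503, waiting longer between attempts to give the environment time to restart. This is a sketch under assumptions rather than a prescribed implementation: `EnvResponseError` is a hypothetical stand-in for whatever exception your HTTP client raises (the wrapper only reads a `status` attribute from it), and `rollout_with_retry`, the attempt count, and the delays are illustrative values to tune.

```python
import asyncio
import random

# Statuses treated as "environment restarting, try again later".
RETRYABLE_STATUSES = {503}


class EnvResponseError(Exception):
    """Hypothetical stand-in for the client's HTTP error; only `status` is used."""

    def __init__(self, status, message=""):
        super().__init__(f"{status}: {message}")
        self.status = status


async def rollout_with_retry(rollout, *, max_attempts=5, base_delay=1.0, max_delay=60.0):
    """Await `rollout()` and retry it with exponential backoff on a 503."""
    for attempt in range(1, max_attempts + 1):
        try:
            return await rollout()
        except Exception as exc:
            status = getattr(exc, "status", None)
            # Re-raise anything that is not a 503, and the final failed attempt.
            if status not in RETRYABLE_STATUSES or attempt == max_attempts:
                raise
            # Delay doubles each attempt (capped), with jitter so that many
            # failed rollouts do not all retry at the same instant.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            await asyncio.sleep(delay * random.uniform(0.5, 1.0))


if __name__ == "__main__":
    attempts = {"n": 0}

    async def flaky_rollout():
        # Simulates an environment that is down for the first two attempts.
        attempts["n"] += 1
        if attempts["n"] < 3:
            raise EnvResponseError(503, "Pod unavailable")
        return "rollout result"

    print(asyncio.run(rollout_with_retry(flaky_rollout, base_delay=0.1)))
```

Retrying only hides the crash from the training loop; if the same rollouts keep triggering OOM, the environment will keep restarting until its memory pressure is reduced.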