Overview

OpenReward provides cloud storage integration for both environments and sandboxes.
Key Concept: The same storage is accessible in both environments and sandboxes.

Automatic Setup

Each environment automatically includes isolated cloud storage:
  • Created automatically with your environment
  • Isolated from other environments
  • Accessible from both the environment server and sandboxes

Using Storage in Environments

Accessing Data

Inside your environment server code, access storage at /orwd_data/:
import json
from pathlib import Path

# Storage is mounted at /orwd_data/
data_dir = Path("/orwd_data")

# List files
files = list(data_dir.glob("*.json"))
print(f"Found {len(files)} JSON files")

# Read file
with open(data_dir / "tasks.json") as f:
    tasks = json.load(f)
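
When an environment needs every JSON file rather than one known file, the listing and reading steps above can be combined. The following is a minimal sketch: load_json_files is a hypothetical helper, not part of OpenReward, and it takes the storage root as a parameter so it can be tried against a local directory as well as /orwd_data.

```python
import json
from pathlib import Path
from typing import Any


def load_json_files(root: Path) -> dict[str, Any]:
    """Load every top-level *.json file under root, keyed by file stem.

    Hypothetical helper for illustration; not part of the OpenReward API.
    """
    if not root.is_dir():
        # Fail early with a clear message if storage is not mounted.
        raise FileNotFoundError(f"Storage directory not found: {root}")

    loaded: dict[str, Any] = {}
    for path in sorted(root.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            loaded[path.stem] = json.load(f)
    return loaded
```

Inside an environment server this would be called as load_json_files(Path("/orwd_data")), giving a dictionary with one entry per file, such as a "tasks" key for tasks.json.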

Common Use Cases

1. Large Datasets:
# Don't include large datasets in the Docker image
# Instead, read them from cloud storage
import pandas as pd
dataset = pd.read_csv("/orwd_data/datasets/large_dataset.csv")

2. Configuration Files:
from pathlib import Path

import yaml

# Store environment config in cloud storage
config_path = Path("/orwd_data/config.yaml")
if config_path.exists():
    with open(config_path) as f:
        config = yaml.safe_load(f)
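
Because the config file is treated as optional above, environment code usually needs fallback values for when it is missing. One way to handle that is sketched below. load_config and DEFAULTS are hypothetical names, and the sketch reads JSON so that it depends only on the standard library; the same pattern applies with yaml.safe_load.

```python
import json
from pathlib import Path
from typing import Any

# Hypothetical defaults, for illustration only.
DEFAULTS: dict[str, Any] = {"max_steps": 10, "timeout_s": 60}


def load_config(path: Path, defaults: dict[str, Any]) -> dict[str, Any]:
    """Return defaults overridden by the values in path, if the file exists."""
    config = dict(defaults)  # copy so the caller's defaults are not mutated
    if path.exists():
        with path.open(encoding="utf-8") as f:
            config.update(json.load(f))
    return config
```

With this shape, a missing file in storage yields the defaults unchanged, and a file that sets only some keys overrides just those keys.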

Using Storage in Sandboxes

Configuration

Configure storage access via bucket_config in SandboxSettings:
from openreward import OpenReward, SandboxSettings, SandboxBucketConfig

client = OpenReward(api_key="your-api-key")

settings = SandboxSettings(
    environment="username/env-name",
    image="python:3.11-slim",
    machine_size="1:2",
    bucket_config=SandboxBucketConfig(
        mount_path="/workspace",           # Where to mount in container
        read_only=True,                    # Buckets are always read-only
        only_dir="datasets/subset",        # Optional: mount only subdirectory
        implicit_dirs=False                # Optional: set True to list directories implied by file paths
    )
)

async with client.sandbox(settings) as sandbox:
    # Storage is mounted at /workspace
    output, _ = await sandbox.run("ls -la /workspace")
    print(output)

SandboxBucketConfig Parameters

Parameter      Type           Required  Default  Description
mount_path     str            Yes       -        Path inside the container where storage is mounted
read_only      Literal[True]  No        True     Buckets are always mounted read-only
only_dir       str | None     No        None     Mount only this subdirectory
implicit_dirs  bool           No        False    Show directories implied by file paths in listings

Accessing Data

Inside the sandbox, access files at the configured mount path:
# List files
output, _ = await sandbox.run("ls -la /workspace")

# Read file
output, _ = await sandbox.run("cat /workspace/data.csv")

# Process file with Python
output, _ = await sandbox.run("""
python -c "
import pandas as pd
df = pd.read_csv('/workspace/data.csv')
print(f'Loaded {len(df)} rows')
"
""")
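
The nested quotes in the last command are easy to break once the script itself contains double quotes. A minimal sketch of one way around this uses shlex.quote from the standard library. It assumes that sandbox.run passes the command string to a POSIX shell, which this page does not state, and python_command is a hypothetical helper.

```python
import shlex
import textwrap


def python_command(script: str) -> str:
    """Build a `python -c` command line with the script safely shell-quoted.

    Hypothetical helper; assumes the command is interpreted by a POSIX shell.
    """
    body = textwrap.dedent(script).strip()
    return f"python -c {shlex.quote(body)}"


command = python_command("""
    import pandas as pd
    df = pd.read_csv('/workspace/data.csv')
    print(f"Loaded {len(df)} rows")
""")
```

The resulting string can then be passed to the sandbox in the same way as the commands above, for example output, _ = await sandbox.run(command).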

Mounting Subdirectories

Mount only a specific subdirectory with only_dir:
# Storage structure:
# /orwd_data/
# ├── datasets/
# │   ├── train/
# │   │   └── data.csv
# │   └── test/
# │       └── data.csv
# └── models/
#     └── checkpoint.pt

# Mount only datasets/train
bucket_config=SandboxBucketConfig(
    mount_path="/data",
    only_dir="datasets/train"
)

# Inside sandbox:
# /data/
# └── data.csv  (only files from datasets/train/)

This is useful when a sandbox needs only part of the bucket, for example only the training split of a dataset.
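
To reason about where a given file ends up, the mapping shown in the tree above can be written as a small function. This is a sketch only: sandbox_path is a hypothetical helper, and it assumes that the contents of only_dir appear directly under mount_path, as the example tree suggests.

```python
from pathlib import PurePosixPath
from typing import Optional


def sandbox_path(bucket_path: str, mount_path: str,
                 only_dir: Optional[str] = None) -> Optional[str]:
    """Map a bucket-relative path to its path inside the sandbox.

    Returns None when the file lies outside only_dir and is therefore
    not visible in the sandbox. Hypothetical helper for illustration.
    """
    path = PurePosixPath(bucket_path)
    if only_dir is not None:
        try:
            # Strip the only_dir prefix; raises ValueError if path is outside it.
            path = path.relative_to(PurePosixPath(only_dir))
        except ValueError:
            return None
    return str(PurePosixPath(mount_path) / path)
```

For the structure above, datasets/train/data.csv maps to /data/data.csv, while models/checkpoint.pt maps to None because it sits outside datasets/train.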

Implicit Directories

Control how directory listings work:
# Without implicit_dirs (default, faster):
bucket_config=SandboxBucketConfig(
    mount_path="/workspace",
    implicit_dirs=False  # Default
)
# Shows only directories that explicitly exist
# Better performance

# With implicit_dirs (slower, more complete):
bucket_config=SandboxBucketConfig(
    mount_path="/workspace",
    implicit_dirs=True
)
# Shows all directories implied by file paths
# More complete directory tree
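
The difference between the two modes is easier to see with a small model. In object storage, a directory often exists only as a prefix shared by file keys. The sketch below assumes that an explicit directory is a placeholder object whose key ends in "/". That assumption and the list_directories function are illustrative only, and do not describe how OpenReward implements the mount.

```python
def list_directories(keys: list[str], implicit_dirs: bool) -> set[str]:
    """Return the directories visible for a set of object keys.

    Assumption for illustration: an explicit directory is a placeholder
    object whose key ends in "/"; all other directories are only implied
    by the prefixes of file keys.
    """
    dirs: set[str] = set()
    for key in keys:
        if key.endswith("/"):
            # Explicit directory placeholder object.
            dirs.add(key.rstrip("/"))
        elif implicit_dirs:
            # Add every parent prefix implied by the file's key.
            parts = key.split("/")[:-1]
            for i in range(1, len(parts) + 1):
                dirs.add("/".join(parts[:i]))
    return dirs


keys = ["datasets/", "datasets/train/data.csv", "models/checkpoint.pt"]
explicit_only = list_directories(keys, implicit_dirs=False)
with_implied = list_directories(keys, implicit_dirs=True)
```

In this model, explicit_only contains just datasets, while with_implied also contains datasets/train and models. The extra prefix scanning is one plausible reason the implicit mode is the slower of the two.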

Next Steps

  • Sandboxes: Learn how to use storage in sandboxes
  • Environments: Understand storage access in environments
  • Workspaces: Learn about automatic workspace storage
  • Sandbox API Reference: Complete SandboxBucketConfig API documentation