
Overview

OpenReward provides cloud storage integration for both environments and sandboxes.
Key Concept: The same storage is accessible in both environments and sandboxes.

Automatic Setup

Each environment automatically includes isolated cloud storage:
  • Created automatically with your environment
  • Isolated from other environments
  • Accessible in both the environment server and sandboxes (see the sketch below)
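
The sections below show both sides of this in detail. For a quick sense of how the two views line up, here is a minimal sketch; shared.txt is a hypothetical file assumed to already exist in the bucket:
from pathlib import Path

# In the environment server, the bucket is mounted at /orwd_data/
print((Path("/orwd_data") / "shared.txt").read_text())

# In a sandbox, the same bucket is mounted at BucketConfig.mount_path
# (for example /workspace), so the same file is visible there:
#   output, _ = await sandbox.run("cat /workspace/shared.txt")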

Using Storage in Environments

Accessing Data

Inside your environment server code, access storage at /orwd_data/:
import json
from pathlib import Path

# Storage is mounted at /orwd_data/
data_dir = Path("/orwd_data")

# List files
files = list(data_dir.glob("*.json"))
print(f"Found {len(files)} JSON files")

# Read file
with open(data_dir / "tasks.json") as f:
    tasks = json.load(f)
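
Writing works the same way. The sketch below assumes the /orwd_data mount is writable from the environment server; the file name and contents are only placeholders:
import json
from pathlib import Path

data_dir = Path("/orwd_data")

# Write results back to cloud storage
# (assumes the mount is read-write for the environment server)
results = {"completed": 42, "failed": 3}
with open(data_dir / "results.json", "w") as f:
    json.dump(results, f)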

Common Use Cases

1. Large Datasets:
# Don't include large datasets in the Docker image;
# instead, read them from cloud storage
import pandas as pd
dataset = pd.read_csv("/orwd_data/datasets/large_dataset.csv")
2. Configuration Files:
import yaml
from pathlib import Path

# Store environment config in cloud storage
config_path = Path("/orwd_data/config.yaml")
if config_path.exists():
    with open(config_path) as f:
        config = yaml.safe_load(f)

Using Storage in Sandboxes

Configuration

Configure storage access via bucket_config in SandboxSettings:
from openreward import OpenReward, SandboxSettings, BucketConfig

client = OpenReward(api_key="your-api-key")

settings = SandboxSettings(
    environment="username/env-name",
    image="python:3.11-slim",
    machine_size="1:2",
    bucket_config=BucketConfig(
        mount_path="/workspace",           # Where to mount in container
        read_only=True,                    # Read-only or read-write
        only_dir="datasets/subset",        # Optional: mount only subdirectory
        implicit_dirs=False                # Optional: infer all directories from object paths
    )
)

async with client.sandbox(settings) as sandbox:
    # Storage is mounted at /workspace
    output, _ = await sandbox.run("ls -la /workspace")
    print(output)

BucketConfig Parameters

Parameter      Type        Required  Default  Description
mount_path     str         Yes       -        Path inside the container where storage is mounted
read_only      bool        No        True     Mount in read-only mode
only_dir       str | None  No        None     Mount only this subdirectory
implicit_dirs  bool        No        False    Show all subdirectories in listings
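
Only mount_path is required; the sketch below relies on the defaults listed above (a read-only mount of the entire bucket):
# Minimal configuration: read-only mount of the whole bucket at /workspace
bucket_config = BucketConfig(mount_path="/workspace")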

Accessing Data

Inside the sandbox, access files at the configured mount path:
# List files
output, _ = await sandbox.run("ls -la /workspace")

# Read file
output, _ = await sandbox.run("cat /workspace/data.csv")

# Process file with Python
output, _ = await sandbox.run("""
python -c "
import pandas as pd
df = pd.read_csv('/workspace/data.csv')
print(f'Loaded {len(df)} rows')
"
""")

Mounting Subdirectories

Mount only a specific subdirectory with only_dir:
# Storage structure:
# /orwd_data/
# ├── datasets/
# │   ├── train/
# │   │   └── data.csv
# │   └── test/
# │       └── data.csv
# └── models/
#     └── checkpoint.pt

# Mount only datasets/train
bucket_config=BucketConfig(
    mount_path="/data",
    only_dir="datasets/train"
)

# Inside sandbox:
# /data/
# └── data.csv  (only files from datasets/train/)
This is useful when a sandbox only needs part of the bucket, for example just the training split of a dataset.
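
Putting it together, here is a sketch of a sandbox that mounts only datasets/train; the environment name, image, and machine size are placeholders, exactly as in the earlier configuration example:
from openreward import OpenReward, SandboxSettings, BucketConfig

client = OpenReward(api_key="your-api-key")

settings = SandboxSettings(
    environment="username/env-name",
    image="python:3.11-slim",
    machine_size="1:2",
    bucket_config=BucketConfig(
        mount_path="/data",
        read_only=True,
        only_dir="datasets/train",  # only this subdirectory is visible
    )
)

async with client.sandbox(settings) as sandbox:
    # Only files under datasets/train appear at /data
    output, _ = await sandbox.run("ls -la /data")
    print(output)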

Implicit Directories

Control how directory listings work:
# Without implicit_dirs (default, faster):
bucket_config=BucketConfig(
    mount_path="/workspace",
    implicit_dirs=False  # Default
)
# Shows only directories that explicitly exist
# Better performance

# With implicit_dirs (slower, more complete):
bucket_config=BucketConfig(
    mount_path="/workspace",
    implicit_dirs=True
)
# Shows all directories implied by file paths
# More complete directory tree
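
As a concrete illustration, suppose the bucket contains the object datasets/train/data.csv but no explicit directory entries (whether such entries exist depends on how the files were uploaded, so this layout is only an assumption):
# implicit_dirs=False: directories that exist only as prefixes of object
# paths may not appear, so listing the mount root can look empty
output, _ = await sandbox.run("ls /workspace")

# implicit_dirs=True: datasets/ and datasets/train/ are inferred from the
# object path and show up in listings, at the cost of extra lookups
output, _ = await sandbox.run("ls /workspace")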

Next Steps