Overview
OpenReward provides cloud storage integration for both environments and sandboxes.
Key Concept: The same storage is accessible in both environments and sandboxes.
Automatic Setup
Each environment automatically includes isolated cloud storage:
- Created automatically with your environment
- Isolated from other environments
- Accessible in both environment server and sandboxes
Using Storage in Environments
Accessing Data
Inside your environment server code, access storage at /orwd_data/:
import os
import json
from pathlib import Path
# Storage is mounted at /orwd_data/
data_dir = Path("/orwd_data")
# List files
files = list(data_dir.glob("*.json"))
print(f"Found {len(files)} JSON files")
# Read file
with open(data_dir / "tasks.json") as f:
    tasks = json.load(f)
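The listing and reading steps above can be combined into one helper. This is a minimal sketch, not part of OpenReward: `load_json_files` is a hypothetical name, and it assumes every `*.json` file directly under the directory is valid JSON.

```python
import json
from pathlib import Path


def load_json_files(data_dir: Path) -> dict[str, object]:
    """Return {filename: parsed content} for every *.json file directly in data_dir."""
    loaded = {}
    for path in sorted(data_dir.glob("*.json")):
        with open(path, encoding="utf-8") as f:
            loaded[path.name] = json.load(f)
    return loaded


if __name__ == "__main__":
    # /orwd_data is the mount point described above; it exists only inside an environment.
    data_dir = Path("/orwd_data")
    if data_dir.is_dir():
        for name, content in load_json_files(data_dir).items():
            print(name, type(content).__name__)
```

Taking the directory as a parameter means the same function can be run against a local folder during development.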
Common Use Cases
1. Large Datasets:
# Don't include large datasets in Docker image
# Instead, access from cloud storage
import pandas as pd
dataset = pd.read_csv("/orwd_data/datasets/large_dataset.csv")
2. Configuration Files:
import yaml
from pathlib import Path
# Store environment config in cloud storage
config_path = Path("/orwd_data/config.yaml")
if config_path.exists():
    with open(config_path) as f:
        config = yaml.safe_load(f)
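The snippet above leaves `config` undefined when the file is missing. One way to handle that, sketched here under assumptions, is to overlay the file's values on a set of defaults. `load_config`, `DEFAULT_CONFIG`, and its keys are hypothetical and only illustrate the pattern.

```python
from pathlib import Path

# Hypothetical defaults; the keys are illustrative, not part of OpenReward.
DEFAULT_CONFIG = {"max_steps": 10, "timeout_seconds": 60}


def load_config(path: Path, defaults: dict) -> dict:
    """Return a copy of defaults, overlaid with values from the YAML file at path if it exists."""
    config = dict(defaults)
    if path.exists():
        import yaml  # PyYAML; imported here so the defaults-only path has no dependency

        with open(path, encoding="utf-8") as f:
            overrides = yaml.safe_load(f) or {}  # an empty file parses to None
        config.update(overrides)
    return config


config = load_config(Path("/orwd_data/config.yaml"), DEFAULT_CONFIG)
print(config)
```

Outside an environment, where `/orwd_data/config.yaml` does not exist, this prints the defaults unchanged.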
Using Storage in Sandboxes
Configuration
Configure storage access via bucket_config in SandboxSettings:
from openreward import OpenReward, SandboxSettings, BucketConfig
client = OpenReward(api_key="your-api-key")
settings = SandboxSettings(
    environment="username/env-name",
    image="python:3.11-slim",
    machine_size="1:2",
    bucket_config=BucketConfig(
        mount_path="/workspace",     # Where to mount in container
        read_only=True,              # True for read-only, False for read-write
        only_dir="datasets/subset",  # Optional: mount only this subdirectory
        implicit_dirs=False,         # Optional: set True to list directories implied by file paths
    ),
)
async with client.sandbox(settings) as sandbox:
    # Storage is mounted at /workspace
    output, _ = await sandbox.run("ls -la /workspace")
    print(output)
BucketConfig Parameters
| Parameter | Type | Required | Default | Description |
|---|---|---|---|---|
| mount_path | str | Yes | - | Path inside container where storage is mounted |
| read_only | bool | No | True | Mount in read-only mode |
| only_dir | str \| None | No | None | Mount only this subdirectory |
| implicit_dirs | bool | No | False | Show all subdirectories in listings |
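To make the required and default columns concrete, the sketch below mirrors the table in a plain dataclass. `BucketConfigSketch` is a hypothetical stand-in written for illustration; it is not the real `BucketConfig` and makes no claim about how that class is implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BucketConfigSketch:
    """Hypothetical stand-in mirroring the parameter table; not the real BucketConfig."""

    mount_path: str                 # required, no default
    read_only: bool = True          # read-only unless explicitly disabled
    only_dir: Optional[str] = None  # None mounts the whole bucket
    implicit_dirs: bool = False     # off by default


# Only mount_path has to be supplied; everything else falls back to the table's defaults.
minimal = BucketConfigSketch(mount_path="/workspace")
writable_subset = BucketConfigSketch(
    mount_path="/workspace", read_only=False, only_dir="datasets/subset"
)
print(minimal)
print(writable_subset)
```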
Accessing Data
Inside the sandbox, access files at the configured mount path:
# List files
output, _ = await sandbox.run("ls -la /workspace")
# Read file
output, _ = await sandbox.run("cat /workspace/data.csv")
# Process file with Python
output, _ = await sandbox.run("""
python -c "
import pandas as pd
df = pd.read_csv('/workspace/data.csv')
print(f'Loaded {len(df)} rows')
"
""")
Mounting Subdirectories
Mount only a specific subdirectory with only_dir:
# Storage structure:
# /orwd_data/
# ├── datasets/
# │   ├── train/
# │   │   └── data.csv
# │   └── test/
# │       └── data.csv
# └── models/
#     └── checkpoint.pt
# Mount only datasets/train
bucket_config=BucketConfig(
    mount_path="/data",
    only_dir="datasets/train"
)
# Inside sandbox:
# /data/
# └── data.csv (only files from datasets/train/)
This is useful when a sandbox needs only part of the bucket, such as the training split of a dataset.
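The path mapping implied by `only_dir` and `mount_path` can be modelled in a few lines. This is an illustrative sketch based only on the example above: `sandbox_path` is a hypothetical helper, not an OpenReward function, and it assumes files outside `only_dir` are simply not visible in the sandbox.

```python
from pathlib import PurePosixPath
from typing import Optional


def sandbox_path(
    bucket_path: str, mount_path: str, only_dir: Optional[str] = None
) -> Optional[str]:
    """Map a bucket-relative path to its sandbox location, or None if it is not mounted."""
    path = PurePosixPath(bucket_path)
    if only_dir is not None:
        try:
            # Strip the only_dir prefix; files elsewhere in the bucket are excluded.
            path = path.relative_to(only_dir)
        except ValueError:
            return None
    return str(PurePosixPath(mount_path) / path)


print(sandbox_path("datasets/train/data.csv", "/data", "datasets/train"))  # /data/data.csv
print(sandbox_path("datasets/test/data.csv", "/data", "datasets/train"))   # None
print(sandbox_path("models/checkpoint.pt", "/data"))                       # /data/models/checkpoint.pt
```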
Implicit Directories
Control how directory listings work:
# Without implicit_dirs (default, faster):
bucket_config=BucketConfig(
    mount_path="/workspace",
    implicit_dirs=False  # Default
)
# Shows only directories that explicitly exist
# Better performance
# With implicit_dirs (slower, more complete):
bucket_config=BucketConfig(
    mount_path="/workspace",
    implicit_dirs=True
)
# Shows all directories implied by file paths
# More complete directory tree
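What "directories implied by file paths" means can be shown with a small model. The sketch assumes the storage behaves like a flat object store, where a directory may exist only as a prefix of stored file paths. That is an assumption about the backend, and `implied_directories` is a hypothetical helper, not an OpenReward API.

```python
from pathlib import PurePosixPath


def implied_directories(object_keys: list[str]) -> list[str]:
    """Return every directory implied by the given file paths, sorted."""
    dirs = set()
    for key in object_keys:
        for parent in PurePosixPath(key).parents:
            if str(parent) != ".":  # skip the root of the relative path
                dirs.add(str(parent))
    return sorted(dirs)


keys = ["datasets/train/data.csv", "datasets/test/data.csv", "models/checkpoint.pt"]
print(implied_directories(keys))
# ['datasets', 'datasets/test', 'datasets/train', 'models']
```

Under this model, `implicit_dirs=True` corresponds to listing all four directories even if none was ever created explicitly, which is consistent with the extra cost noted above.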
Next Steps