
Goals

  • Install and configure the orwd CLI
  • Create, deploy, and manage environments without leaving the terminal
  • Use structured output formats for scripting and automation

Prerequisites

Installation

Install the openreward package, which includes the orwd CLI:
pip install openreward
Then set your API key:
export OPENREWARD_API_KEY='your-api-key-here'
Most commands require OPENREWARD_API_KEY. The only exception is orwd init, which scaffolds files locally without touching the API.
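For scripts that wrap the CLI, it can help to fail early when the key is missing. The following is a minimal Python sketch of that check. The helper names `requires_api_key` and `check_api_key` and the error message are illustrative assumptions, not part of the openreward package.

```python
import os

# Per the note above, `orwd init` is the only command that works without a key.
COMMANDS_WITHOUT_KEY = {"init"}


def requires_api_key(command: str) -> bool:
    """Return True if the given orwd subcommand needs OPENREWARD_API_KEY."""
    return command not in COMMANDS_WITHOUT_KEY


def check_api_key(command: str, env=None) -> None:
    """Raise a clear error if the key is required for `command` but unset."""
    env = os.environ if env is None else env
    if requires_api_key(command) and not env.get("OPENREWARD_API_KEY"):
        raise RuntimeError(
            f"'orwd {command}' needs OPENREWARD_API_KEY; export it first."
        )
```

Calling `check_api_key("list")` before shelling out to `orwd list` turns a missing key into an explicit error in your own script.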

Shell alias (for development)

If you’re developing the SDK locally and want orwd to always point at your working copy:
# Add to ~/.zshrc or ~/.bashrc
alias orwd="uv run --project ~/path/to/python-sdk orwd"
Use --project (not --directory) so orwd runs relative to your current directory, not the SDK directory.

Creating an Environment

Interactive wizard

The fastest way to go from zero to a deployed environment:
orwd new
This walks you through every step: naming, template selection, GitHub repo creation, and linking. At each prompt, you can accept the default (shown in parentheses) or type a different value.
  OpenReward — new environment

  > Environment name: my-bench
  > Template
    1. basic (default)
    2. sandbox
    Choice:
  > Description: A benchmark for evaluating X
  > Private (y/N):
  > Harbor (sandbox) mode (y/N):
  > Namespace (thomas):
  > Directory (my-bench):
  > Create a GitHub repo (Y/n):
  > GitHub repo (owner/repo) (thomas/my-bench):
  > Private repo (y/N):

Non-interactive mode

Pass all arguments as flags and add -y to skip prompts:
orwd new my-bench \
  --description "A benchmark for evaluating X" \
  --namespace GeneralReasoning \
  --repo GeneralReasoning/my-bench \
  --template basic \
  -y
This is useful for scripting or CI workflows. Without -y, any argument you omit is still prompted for interactively; with -y, omitted arguments fall back to their defaults.
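When the command is generated from a script, building the argument list programmatically avoids shell-quoting mistakes. Here is a minimal Python sketch. The function `build_new_command` is a hypothetical helper; the flags are the ones shown in the example above.

```python
import shlex


def build_new_command(name, description=None, namespace=None,
                      repo=None, template=None, assume_yes=True):
    """Assemble the argv list for a non-interactive `orwd new` call."""
    argv = ["orwd", "new", name]
    options = [
        ("--description", description),
        ("--namespace", namespace),
        ("--repo", repo),
        ("--template", template),
    ]
    for flag, value in options:
        # Omitted options are left out so the CLI applies its own default.
        if value is not None:
            argv += [flag, value]
    if assume_yes:
        argv.append("-y")
    return argv


argv = build_new_command(
    "my-bench",
    description="A benchmark for evaluating X",
    namespace="GeneralReasoning",
    repo="GeneralReasoning/my-bench",
    template="basic",
)
# Print a copy-pasteable, correctly quoted command line.
print(shlex.join(argv))
```

The resulting list can be passed directly to `subprocess.run(argv, check=True)`, which does not go through a shell.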

Harbor environments

To create an environment that uses the Harbor task specification:
orwd new my-harbor-env --harbor --description "Harbor-based benchmark"

Step by step

If you prefer more control, you can run the individual steps yourself:
# 1. Scaffold local files
orwd init my-env --template basic

# 2. Create the environment on OpenReward
orwd create my-env --description "My environment" --namespace my-org

# 3. Create a GitHub repo and push
cd my-env
git init && git add . && git commit -m "Initial scaffold"
gh repo create my-org/my-env --source . --push

# 4. Link to GitHub (triggers first deployment)
orwd link my-org/my-env my-org/my-env
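The same sequence can be expressed as data and replayed from a script. The sketch below is one possible way to do that in Python; `scaffold_steps` and `run_steps` are hypothetical helpers, and it assumes (as the `cd my-env` step above suggests) that `orwd init` creates a directory named after the environment.

```python
import subprocess


def scaffold_steps(org, env, description, template="basic"):
    """Return (argv, cwd) pairs mirroring the four manual steps above."""
    slug = f"{org}/{env}"
    return [
        # 1. Scaffold local files
        (["orwd", "init", env, "--template", template], None),
        # 2. Create the environment on OpenReward
        (["orwd", "create", env, "--description", description,
          "--namespace", org], None),
        # 3. Create a GitHub repo and push (run inside the new directory)
        (["git", "init"], env),
        (["git", "add", "."], env),
        (["git", "commit", "-m", "Initial scaffold"], env),
        (["gh", "repo", "create", slug, "--source", ".", "--push"], env),
        # 4. Link to GitHub (triggers first deployment)
        (["orwd", "link", slug, slug], env),
    ]


def run_steps(steps, dry_run=True):
    """Print each step; execute it only when dry_run is False."""
    for argv, cwd in steps:
        print(f"[{cwd or '.'}] {' '.join(argv)}")
        if not dry_run:
            # check=True stops the sequence at the first failing step.
            subprocess.run(argv, cwd=cwd, check=True)


# Dry run: only prints the commands that would be executed.
run_steps(scaffold_steps("my-org", "my-env", "My environment"))
```

Keeping the dry run as the default makes it easy to review the commands before anything is created remotely.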

Managing Environments

Listing environments

# List all environments
orwd list

# Filter by owner
orwd list --owner GeneralReasoning

# Search by name or description
orwd list --search "benchmark"

# Your own environments
orwd list --mine
By default, orwd list fetches all results (auto-paginating through the API). Use --limit to cap the number returned.

Getting environment details

orwd get GeneralReasoning/CTF
Name:        GeneralReasoning/CTF
ID:          abc123
Description: Capture the flag challenges
Private:     False
GitHub:      connected
Repo:        https://github.com/GeneralReasoning/CTF
Compute:     1 CPU, 4 GB
Created:     2025-03-15T12:00:00Z
Updated:     2025-04-01T08:30:00Z
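If you only have the human-readable output to work with, the key/value layout shown above can be read with a few lines of Python. This is a sketch that assumes the layout stays exactly as in the sample; for anything durable, the structured output formats (-o json) are likely the more robust choice. `parse_details` is a hypothetical helper name.

```python
SAMPLE = """\
Name:        GeneralReasoning/CTF
ID:          abc123
Description: Capture the flag challenges
Private:     False
GitHub:      connected
Repo:        https://github.com/GeneralReasoning/CTF
Compute:     1 CPU, 4 GB
Created:     2025-03-15T12:00:00Z
Updated:     2025-04-01T08:30:00Z
"""


def parse_details(text):
    """Turn 'Key: value' lines into a dict, splitting on the first colon only."""
    details = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        # Skip blank lines and lines without a key.
        if sep and key.strip():
            details[key.strip()] = value.strip()
    return details


details = parse_details(SAMPLE)
print(details["Repo"])
```

Splitting on the first colon only is what keeps URLs and timestamps, which contain colons themselves, intact.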

Updating an environment

# Change description
orwd update my-org/my-env --description "Updated description"

# Rename
orwd update my-org/my-env --name new-name

# Toggle privacy
orwd update my-org/my-env --private
orwd update my-org/my-env --public

# Enable or disable harbor mode
orwd update my-org/my-env --harbor
orwd update my-org/my-env --no-harbor

# Set external URLs
orwd update my-org/my-env --arxiv-url "https://arxiv.org/abs/..."

GitHub Integration

Linking

orwd link my-org/my-env my-org/my-repo
This connects an environment to a GitHub repository and triggers the first deployment. If you haven’t authorized GitHub yet, the CLI opens your browser to complete OAuth.
Default compute settings (matching the web UI):
Setting         Default
--cpu-memory    1:4 (1 vCPU, 4 GB)
--concurrency   500
--max-scale     10
Override any of these:
orwd link my-org/my-env my-org/my-repo --cpu-memory 2:8 --max-scale 5
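To make the CPU:MEMORY notation concrete, the sketch below splits a value such as 2:8 into vCPUs and GB and merges overrides with the defaults listed above. It is an illustration only: `parse_cpu_memory` and `effective_settings` are hypothetical helpers, and it assumes both halves are whole numbers. Which combinations the service actually accepts is not covered here.

```python
# Defaults as documented for `orwd link`.
DEFAULTS = {"cpu_memory": "1:4", "concurrency": 500, "max_scale": 10}


def parse_cpu_memory(spec):
    """Split a 'CPU:MEMORY' value such as '2:8' into (vCPUs, GB)."""
    cpu_text, sep, memory_text = spec.partition(":")
    if not sep:
        raise ValueError(f"expected CPU:MEMORY, got {spec!r}")
    cpu, memory_gb = int(cpu_text), int(memory_text)
    if cpu <= 0 or memory_gb <= 0:
        raise ValueError(f"CPU and memory must be positive, got {spec!r}")
    return cpu, memory_gb


def effective_settings(**overrides):
    """Combine the documented defaults with any overridden values."""
    merged = {**DEFAULTS, **overrides}
    cpu, memory_gb = parse_cpu_memory(merged["cpu_memory"])
    return {
        "cpu": cpu,
        "memory_gb": memory_gb,
        "concurrency": merged["concurrency"],
        "max_scale": merged["max_scale"],
    }


# Mirrors: orwd link ... --cpu-memory 2:8 --max-scale 5
print(effective_settings(cpu_memory="2:8", max_scale=5))
```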

Unlinking

orwd unlink my-org/my-env

Updating link settings

To change compute or scaling settings on an already-linked environment, use update-link:
orwd update-link my-org/my-env --cpu-memory 4:16 --concurrency 1000 --max-scale 3

Monitoring Deployments

Listing deployments

orwd deployments my-org/my-env
ID         STATUS       BRANCH               COMMIT     CREATED
a1b2c3d4   deployed     main                 f8e9a0b1   2025-04-20T10:00:00Z
e5f6g7h8   failed       main                 c3d4e5f6   2025-04-19T15:30:00Z
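A quick way to act on this listing from a script is to split the table into rows and pick out the deployment you care about. The Python sketch below assumes the column layout shown above and that no column value contains spaces; `parse_table` and `latest_with_status` are hypothetical helpers. The structured output formats are likely a sturdier basis for real automation.

```python
SAMPLE = """\
ID         STATUS       BRANCH               COMMIT     CREATED
a1b2c3d4   deployed     main                 f8e9a0b1   2025-04-20T10:00:00Z
e5f6g7h8   failed       main                 c3d4e5f6   2025-04-19T15:30:00Z
"""


def parse_table(text):
    """Parse a whitespace-aligned table into a list of dicts keyed by header."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = [name.lower() for name in lines[0].split()]
    return [dict(zip(header, line.split())) for line in lines[1:]]


def latest_with_status(rows, status):
    """Return the most recent row with the given status, or None."""
    matches = [row for row in rows if row["status"] == status]
    # UTC ISO 8601 timestamps in the same format sort correctly as strings.
    return max(matches, key=lambda row: row["created"], default=None)


rows = parse_table(SAMPLE)
failed = latest_with_status(rows, "failed")
print(failed["id"] if failed else "no failed deployments")
```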

Viewing logs

# Runtime logs (default)
orwd logs my-org/my-env

# Build logs
orwd logs my-org/my-env --build

# Logs for a specific deployment
orwd logs my-org/my-env --deployment-id a1b2c3d4-full-id-here

# More log entries
orwd logs my-org/my-env --limit 200

Harbor task builds

For Harbor environments, each task gets its own Docker image build. To check their status:
orwd task-builds my-org/my-harbor-env
TASK                           STATUS     ID         UPDATED
task-1                         success    a1b2c3d4   2025-04-20T10:05:00Z
task-2                         building   e5f6g7h8   2025-04-20T10:04:00Z
task-3                         error      i9j0k1l2   2025-04-20T10:03:00Z
                               error: Dockerfile syntax error on line 12
To view the build logs for a specific task:
orwd task-build-logs <task-build-id>
Use orwd task-builds my-org/my-env -o json to get the full task build IDs for use with task-build-logs.
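For a quick status overview, the table output can also be scanned for builds that have not succeeded, attaching the indented error line to the task it belongs to. The sketch below assumes the layout of the sample above, where an error detail appears on an indented line directly under its row. `parse_task_builds` is a hypothetical helper, and the IDs in this table are the short form, so they are not suitable for task-build-logs.

```python
SAMPLE = """\
TASK                           STATUS     ID         UPDATED
task-1                         success    a1b2c3d4   2025-04-20T10:05:00Z
task-2                         building   e5f6g7h8   2025-04-20T10:04:00Z
task-3                         error      i9j0k1l2   2025-04-20T10:03:00Z
                               error: Dockerfile syntax error on line 12
"""


def parse_task_builds(text):
    """Parse the task-builds table, attaching indented error lines to rows."""
    lines = [line for line in text.splitlines() if line.strip()]
    builds = []
    for line in lines[1:]:  # skip the header row
        if line[:1].isspace():
            # Indented line: extra detail for the previous task row.
            detail = line.strip()
            if detail.startswith("error:"):
                detail = detail[len("error:"):].strip()
            if builds:
                builds[-1]["error"] = detail
            continue
        task, status, short_id, updated = line.split()
        builds.append({"task": task, "status": status, "id": short_id,
                       "updated": updated, "error": None})
    return builds


builds = parse_task_builds(SAMPLE)
for build in builds:
    if build["status"] != "success":
        print(build["task"], build["status"], build["error"] or "")
```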

File Management

Environments have a file store for uploading data files (datasets, ground truth, task inputs). These files are available at /orwd_data/ on the deployed environment server. See Where Environment Data Lives for details on how uploaded files are used.

Uploading files

# Upload a single file
orwd upload my-org/my-env data.json

# Upload multiple files
orwd upload my-org/my-env train.csv test.csv labels.json

# Upload an entire directory (preserves structure)
orwd upload my-org/my-env ./data/
By default, files are uploaded using their relative path from the current working directory. Use --dest to set a custom destination path:
# Upload to a specific path
orwd upload my-org/my-env train.csv --dest datasets/train.csv

# Upload files into a directory (note the trailing slash)
orwd upload my-org/my-env a.txt b.txt --dest inputs/

# Upload a local directory to a different remote path
orwd upload my-org/my-env ./local-data/ --dest ground_truth/
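The destination rules for single files can be summarized in a few lines of Python. This is a sketch of the documented behavior, not the CLI's actual implementation: `remote_path` is a hypothetical helper, it assumes the local path is already relative to the current working directory, and it does not cover directory uploads.

```python
import posixpath


def remote_path(local, dest=None):
    """Predict where a single uploaded file lands in the file store."""
    if dest is None:
        # Default: the path relative to the current working directory.
        return posixpath.normpath(local)
    if dest.endswith("/"):
        # Trailing slash: dest is a directory, so keep the file name.
        return posixpath.join(dest, posixpath.basename(local))
    # Otherwise dest is the full destination path.
    return dest


print(remote_path("train.csv", "datasets/train.csv"))
print(remote_path("a.txt", "inputs/"))
```

Since the file store is served at /orwd_data/ on the deployed server, a file uploaded with --dest inputs/ would presumably be readable there under the inputs/ directory.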
Upload concurrency defaults to 10 parallel transfers. Adjust with --concurrency:
orwd upload my-org/my-env ./large-dataset/ --concurrency 20

Listing files

# List all files
orwd files my-org/my-env

# Filter by path prefix
orwd files my-org/my-env --prefix ground_truth/

Deleting files

# Delete a single file
orwd delete-file my-org/my-env data.json

# Delete a folder and all its contents
orwd delete-file my-org/my-env ground_truth/ --folder

Runs and Rollouts

Browse your evaluation runs and their rollouts from the CLI:
# List runs
orwd runs
orwd runs --search "experiment-1"

# Get details for a specific run
orwd run <run-id>

# List rollouts within a run
orwd rollouts <run-id>

Structured Output

Every command (except init and new) supports the -o / --output flag for machine-readable output. This is useful for piping into jq, scripting, or loading into other tools.
Format   Flag       Description
Table    -o table   Human-readable columns (default)
JSON     -o json    Compact JSON array or object
YAML     -o yaml    YAML output
JSONL    -o jsonl   One JSON object per line (for row-based commands)

Examples

# Get environment details as JSON
orwd get my-org/my-env -o json

# Pipe environment list into jq
orwd list --owner my-org -o json | jq '.[].name'

# Stream deployments as JSONL for processing
orwd deployments my-org/my-env -o jsonl

# Export all environments to YAML
orwd list --mine -o yaml > my-environments.yaml
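The JSON and JSONL formats can be consumed from Python with the standard library alone. The sketch below shows both; the sample payloads are made up for illustration. The `name` field follows from the jq example above, while the `id` and `status` fields in the JSONL sample are assumptions about the schema. `names_from_list` and `iter_jsonl` are hypothetical helpers.

```python
import json

# Hypothetical payloads standing in for captured CLI output.
LIST_JSON = '[{"name": "my-env"}, {"name": "my-bench"}]'
DEPLOYMENTS_JSONL = (
    '{"id": "a1b2c3d4", "status": "deployed"}\n'
    '{"id": "e5f6g7h8", "status": "failed"}\n'
)


def names_from_list(json_text):
    """Python equivalent of piping `-o json` output into jq '.[].name'."""
    return [item["name"] for item in json.loads(json_text)]


def iter_jsonl(text):
    """Yield one parsed object per non-empty line of `-o jsonl` output."""
    for line in text.splitlines():
        if line.strip():
            yield json.loads(line)


print(names_from_list(LIST_JSON))
for record in iter_jsonl(DEPLOYMENTS_JSONL):
    print(record["id"], record["status"])
```

In a real script, the text would typically come from `subprocess.run([...], capture_output=True, text=True).stdout` rather than from constants.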

Command Reference

Command                Description
orwd whoami            Show the current authenticated user
orwd new               Interactive wizard (or one-shot with flags) to create and deploy an environment
orwd init              Scaffold local environment files from a template
orwd create            Create an environment on OpenReward
orwd list              List environments
orwd get               Get environment details
orwd update            Update environment metadata
orwd link              Link an environment to a GitHub repository
orwd unlink            Disconnect from GitHub
orwd update-link       Update compute/scaling settings for a linked environment
orwd deployments       List deployments
orwd logs              View build or runtime logs
orwd task-builds       List Harbor task image builds
orwd task-build-logs   View logs for a Harbor task image build
orwd upload            Upload files or directories to an environment’s file store
orwd files             List files in an environment’s file store
orwd delete-file       Delete a file or folder from an environment’s file store
orwd runs              List runs
orwd run               Get run details
orwd rollouts          List rollouts in a run
Run orwd <command> --help for full usage details on any command.