Using the CLI - OpenReward

Goals

Install and configure the orwd CLI
Create, deploy, and manage environments without leaving the terminal
Use structured output formats for scripting and automation

Prerequisites

An OpenReward account and API key
Python 3.11+
GitHub CLI (gh): optional, needed for repo creation via orwd new

Installation

Install the openreward package, which includes the orwd CLI:

pip install openreward

Then set your API key:

export OPENREWARD_API_KEY='your-api-key-here'

Most commands require OPENREWARD_API_KEY. The only exception is orwd init, which scaffolds files locally without touching the API.
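Because nearly every command needs the key, a script can check for it up front instead of failing partway through. The sketch below is one way to do that; require_api_key is a hypothetical helper name, not something the CLI provides.

```shell
# Hypothetical helper (not part of orwd): stop early if the API key is missing.
require_api_key() {
  if [ -z "${OPENREWARD_API_KEY:-}" ]; then
    echo "OPENREWARD_API_KEY is not set; export it before running orwd" >&2
    return 1
  fi
}

# Example: only run the command when the key is present.
# require_api_key && orwd whoami
```

Commands that work offline, such as orwd init, do not need this guard.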

Shell alias (for development)

If you’re developing the SDK locally and want orwd to always point at your working copy:

# Add to ~/.zshrc or ~/.bashrc
alias orwd="uv run --project ~/path/to/python-sdk orwd"

Use --project (not --directory) so orwd runs relative to your current directory, not the SDK directory.

Creating an Environment

Interactive wizard

The fastest way to go from zero to a deployed environment:

orwd new

This walks you through every step: naming, template selection, GitHub repo creation, and linking. At each prompt, you can accept the default (shown in parentheses) or type a different value.

  OpenReward — new environment

  > Environment name: my-bench
  > Template
    1. basic (default)
    2. sandbox
    Choice:
  > Description: A benchmark for evaluating X
  > Private (y/N):
  > Harbor (sandbox) mode (y/N):
  > Namespace (thomas):
  > Directory (my-bench):
  > Create a GitHub repo (Y/n):
  > GitHub repo (owner/repo) (thomas/my-bench):
  > Private repo (y/N):

Non-interactive mode

Pass all arguments as flags and add -y to skip prompts:

orwd new my-bench \
  --description "A benchmark for evaluating X" \
  --namespace GeneralReasoning \
  --repo GeneralReasoning/my-bench \
  --template basic \
  -y

This is useful for scripting or CI workflows. Without -y, any argument you omit is still prompted for interactively; with -y, omitted arguments fall back to their defaults.

Harbor environments

To create an environment that uses the Harbor task specification:

orwd new my-harbor-env --harbor --description "Harbor-based benchmark"

Step by step

If you prefer more control, you can run the individual steps yourself:

# 1. Scaffold local files
orwd init my-env --template basic

# 2. Create the environment on OpenReward
orwd create my-env --description "My environment" --namespace my-org

# 3. Create a GitHub repo and push
#    (gh requires --public, --private, or --internal when run with arguments)
cd my-env
git init && git add . && git commit -m "Initial scaffold"
gh repo create my-org/my-env --source . --public --push

# 4. Link to GitHub (triggers first deployment)
orwd link my-org/my-env my-org/my-env
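If you repeat these steps often, they can be wrapped in a small function. The following is a minimal sketch, not an official script: run and bootstrap_env are hypothetical helper names, the --public flag is an assumption about the visibility you want (gh expects a visibility flag when called with arguments), and DRY_RUN=1 only prints the commands so you can review them first.

```shell
# Print each command; execute it unless DRY_RUN=1.
run() {
  echo "+ $*"
  if [ "${DRY_RUN:-0}" != "1" ]; then
    "$@"
  fi
}

# Hypothetical helper: bootstrap_env ORG NAME
# The body runs in a subshell, so the `cd` does not leak into the caller.
bootstrap_env() (
  org="$1"
  name="$2"
  run orwd init "$name" --template basic &&
  run orwd create "$name" --description "My environment" --namespace "$org" &&
  run cd "$name" &&
  run git init &&
  run git add . &&
  run git commit -m "Initial scaffold" &&
  run gh repo create "$org/$name" --source . --push --public &&
  run orwd link "$org/$name" "$org/$name"
)

# Preview the commands without executing anything:
# DRY_RUN=1 bootstrap_env my-org my-env
```

Chaining with && stops at the first failing step, so a failed orwd create never reaches the link step.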

Managing Environments

Listing environments

# List all environments
orwd list

# Filter by owner
orwd list --owner GeneralReasoning

# Search by name or description
orwd list --search "benchmark"

# Your own environments
orwd list --mine

By default, orwd list fetches all results (auto-paginating through the API). Use --limit to cap the number returned.

Getting environment details

orwd get GeneralReasoning/CTF

Name:        GeneralReasoning/CTF
ID:          abc123
Description: Capture the flag challenges
Private:     False
GitHub:      connected
Repo:        https://github.com/GeneralReasoning/CTF
Compute:     1 CPU, 4 GB
Created:     2025-03-15T12:00:00Z
Updated:     2025-04-01T08:30:00Z

Updating an environment

# Change description
orwd update my-org/my-env --description "Updated description"

# Rename
orwd update my-org/my-env --name new-name

# Toggle privacy
orwd update my-org/my-env --private
orwd update my-org/my-env --public

# Enable or disable harbor mode
orwd update my-org/my-env --harbor
orwd update my-org/my-env --no-harbor

# Set external URLs
orwd update my-org/my-env --arxiv-url "https://arxiv.org/abs/..."

GitHub Integration

Linking

orwd link my-org/my-env my-org/my-repo

This connects an environment to a GitHub repository and triggers the first deployment. If you haven’t authorized GitHub yet, the CLI opens your browser to complete OAuth.

The default compute settings match the web UI:

Setting	Default
`--cpu-memory`	`1:4` (1 vCPU, 4 GB)
`--concurrency`	`500`
`--max-scale`	`10`

Override any of these:

orwd link my-org/my-env my-org/my-repo --cpu-memory 2:8 --max-scale 5
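The --cpu-memory value packs two numbers into one string, vCPUs first and gigabytes second. The sketch below shows how a script might check and read such a value before passing it on. describe_cpu_memory is a hypothetical helper, and it assumes both parts are whole numbers; the CLI itself may accept other forms.

```shell
# Hypothetical helper: turn "2:8" into "2 vCPU, 8 GB".
# ASSUMPTION: the value is two whole numbers separated by a single colon.
describe_cpu_memory() {
  local spec="$1"
  if ! printf '%s\n' "$spec" | grep -Eq '^[0-9]+:[0-9]+$'; then
    echo "expected CPU:GB such as 2:8, got '$spec'" >&2
    return 1
  fi
  # Text before the colon is the vCPU count, text after it is the memory in GB.
  echo "${spec%%:*} vCPU, ${spec#*:} GB"
}

# Example:
# describe_cpu_memory 2:8
```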

Unlinking

orwd unlink my-org/my-env

Updating link settings

Change compute or scaling settings on an already-linked environment:

orwd update-link my-org/my-env --cpu-memory 4:16 --concurrency 1000 --max-scale 3

Monitoring Deployments

Listing deployments

orwd deployments my-org/my-env

ID         STATUS       BRANCH               COMMIT     CREATED
a1b2c3d4   deployed     main                 f8e9a0b1   2025-04-20T10:00:00Z
e5f6g7h8   failed       main                 c3d4e5f6   2025-04-19T15:30:00Z

Viewing logs

# Runtime logs (default)
orwd logs my-org/my-env

# Build logs
orwd logs my-org/my-env --build

# Logs for a specific deployment
orwd logs my-org/my-env --deployment-id a1b2c3d4-full-id-here

# More log entries
orwd logs my-org/my-env --limit 200

Harbor task builds

For Harbor environments, each task gets its own Docker image build. To check their status:

orwd task-builds my-org/my-harbor-env

TASK                           STATUS     ID         UPDATED
task-1                         success    a1b2c3d4   2025-04-20T10:05:00Z
task-2                         building   e5f6g7h8   2025-04-20T10:04:00Z
task-3                         error      i9j0k1l2   2025-04-20T10:03:00Z
                               error: Dockerfile syntax error on line 12

To view the build logs for a specific task:

orwd task-build-logs <task-build-id>

The table shows shortened IDs. Use orwd task-builds my-org/my-env -o json to get the full task build IDs that task-build-logs expects.
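With the JSON output, the two commands can be chained so that the logs of every failed build are printed in one go. This is a sketch under assumptions: it requires jq, errored_task_build_ids is a hypothetical helper, and the field names id and status are guesses based on the table columns, so check them against the real JSON first.

```shell
# Hypothetical helper: read `orwd task-builds ... -o json` on stdin and
# print the full ID of every build whose status is "error".
# ASSUMPTION: the JSON is an array of objects with "id" and "status" fields.
errored_task_build_ids() {
  jq -r '.[] | select(.status == "error") | .id'
}

# Example: show the logs of each failed task build.
# orwd task-builds my-org/my-harbor-env -o json |
#   errored_task_build_ids |
#   while IFS= read -r id; do orwd task-build-logs "$id"; done
```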

File Management

Environments have a file store for uploading data files (datasets, ground truth, task inputs). These files are available at /orwd_data/ on the deployed environment server. See Where Environment Data Lives for details on how uploaded files are used.

Uploading files

# Upload a single file
orwd upload my-org/my-env data.json

# Upload multiple files
orwd upload my-org/my-env train.csv test.csv labels.json

# Upload an entire directory (preserves structure)
orwd upload my-org/my-env ./data/

By default, files are uploaded using their relative path from the current working directory. Use --dest to set a custom destination path:

# Upload to a specific path
orwd upload my-org/my-env train.csv --dest datasets/train.csv

# Upload files into a directory (note the trailing slash)
orwd upload my-org/my-env a.txt b.txt --dest inputs/

# Upload a local directory to a different remote path
orwd upload my-org/my-env ./local-data/ --dest ground_truth/

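The destination rules for single files can be summarized as a small function. This is only a mental model inferred from the examples above, not the CLI's own code: remote_path is a hypothetical name, and the model does not cover directory uploads, where the structure inside the directory is preserved.

```shell
# Hypothetical model of how a destination is chosen for one file.
# Usage: remote_path LOCAL_FILE [DEST]
remote_path() {
  local src="$1" dest="${2:-}"
  case "$dest" in
    "")  echo "${src#./}" ;;                   # no --dest: keep the relative path
    */)  echo "${dest}$(basename "$src")" ;;   # trailing slash: place the file inside
    *)   echo "$dest" ;;                       # otherwise: --dest is the full path
  esac
}

# Examples:
# remote_path train.csv datasets/train.csv
# remote_path a.txt inputs/
```

The trailing slash is what separates "put the file in this folder" from "store the file under this exact name", so orwd files is a quick way to confirm the result after an upload.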
Upload concurrency defaults to 10 parallel transfers. Adjust with --concurrency:

orwd upload my-org/my-env ./large-dataset/ --concurrency 20

Listing files

# List all files
orwd files my-org/my-env

# Filter by path prefix
orwd files my-org/my-env --prefix ground_truth/

Deleting files

# Delete a single file
orwd delete-file my-org/my-env data.json

# Delete a folder and all its contents
orwd delete-file my-org/my-env ground_truth/ --folder

Runs and Rollouts

Browse your evaluation runs and their rollouts from the CLI:

# List runs
orwd runs
orwd runs --search "experiment-1"

# Get details for a specific run
orwd run <run-id>

# List rollouts within a run
orwd rollouts <run-id>

Structured Output

Every command (except init and new) supports the -o / --output flag for machine-readable output. This is useful for piping into jq, scripting, or loading into other tools.

Format	Flag	Description
Table	`-o table`	Human-readable columns (default)
JSON	`-o json`	Compact JSON array or object
YAML	`-o yaml`	YAML output
JSONL	`-o jsonl`	One JSON object per line (for row-based commands)

Examples

# Get environment details as JSON
orwd get my-org/my-env -o json

# Pipe environment list into jq
orwd list --owner my-org -o json | jq '.[].name'

# Stream deployments as JSONL for processing
orwd deployments my-org/my-env -o jsonl

# Export all environments to YAML
orwd list --mine -o yaml > my-environments.yaml

Command Reference

Command	Description
`orwd whoami`	Show the current authenticated user
`orwd new`	Interactive wizard (or one-shot with flags) to create and deploy an environment
`orwd init`	Scaffold local environment files from a template
`orwd create`	Create an environment on OpenReward
`orwd list`	List environments
`orwd get`	Get environment details
`orwd update`	Update environment metadata
`orwd link`	Link an environment to a GitHub repository
`orwd unlink`	Disconnect from GitHub
`orwd update-link`	Update compute/scaling settings for a linked environment
`orwd deployments`	List deployments
`orwd logs`	View build or runtime logs
`orwd task-builds`	List Harbor task image builds
`orwd task-build-logs`	View logs for a Harbor task image build
`orwd upload`	Upload files or directories to an environment’s file store
`orwd files`	List files in an environment’s file store
`orwd delete-file`	Delete a file or folder from an environment’s file store
`orwd runs`	List runs
`orwd run`	Get run details
`orwd rollouts`	List rollouts in a run

Run orwd <command> --help for full usage details on any command.
