Goals
- Install and configure the orwd CLI
- Create, deploy, and manage environments without leaving the terminal
- Use structured output formats for scripting and automation
Prerequisites
Installation
Install the openreward package, which includes the orwd CLI:
Then set your API key:
export OPENREWARD_API_KEY='your-api-key-here'
Most commands require OPENREWARD_API_KEY. The only exception is orwd init, which scaffolds files locally without touching the API.
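Because almost every command needs the key, a script can check for it up front instead of failing partway through. The following is a minimal sketch; `require_api_key` is a hypothetical helper name, not something the orwd CLI provides.

```shell
# Fail fast when OPENREWARD_API_KEY is missing or empty.
# require_api_key is a hypothetical helper, not part of orwd.
require_api_key() {
  if [ -z "${OPENREWARD_API_KEY:-}" ]; then
    echo "OPENREWARD_API_KEY is not set; export it before running orwd" >&2
    return 1
  fi
}

# Usage: require_api_key && orwd whoami
```

The check only confirms that the variable is non-empty. Whether the key is actually valid is something only the API can tell you, for example through orwd whoami.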
Shell alias (for development)
If you’re developing the SDK locally and want orwd to always point at your working copy:
# Add to ~/.zshrc or ~/.bashrc
alias orwd="uv run --project ~/path/to/python-sdk orwd"
Use --project (not --directory) so orwd runs relative to your current directory, not the SDK directory.
Creating an Environment
Interactive wizard
The fastest way to go from zero to a deployed environment:
orwd new
This walks you through every step: naming, template selection, GitHub repo creation, and linking. At each prompt, you can accept the default (shown in parentheses) or type a different value.
OpenReward — new environment
> Environment name: my-bench
> Template
1. basic (default)
2. sandbox
Choice:
> Description: A benchmark for evaluating X
> Private (y/N):
> Harbor (sandbox) mode (y/N):
> Namespace (thomas):
> Directory (my-bench):
> Create a GitHub repo (Y/n):
> GitHub repo (owner/repo) (thomas/my-bench):
> Private repo (y/N):
Non-interactive mode
Pass all arguments as flags and add -y to skip prompts:
orwd new my-bench \
--description "A benchmark for evaluating X" \
--namespace GeneralReasoning \
--repo GeneralReasoning/my-bench \
--template basic \
-y
This is useful for scripting or CI workflows. Any argument you omit will still be prompted for interactively (unless -y is set, in which case it falls back to its default).
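In a CI job, it can help to wrap the non-interactive call in a function so every argument is passed explicitly and nothing silently falls back to a default. This is a sketch under assumptions: `new_env_ci` is a hypothetical wrapper name, the repository is assumed to be named after the namespace and environment, and only flags from the example above are used.

```shell
# new_env_ci is a hypothetical wrapper, not part of orwd.
# It assumes the GitHub repo should be "<namespace>/<name>".
new_env_ci() {
  if [ "$#" -ne 3 ]; then
    echo "usage: new_env_ci <name> <namespace> <description>" >&2
    return 2
  fi
  local name="$1" namespace="$2" description="$3"
  # -y skips all prompts, so every value is supplied here explicitly.
  orwd new "$name" \
    --description "$description" \
    --namespace "$namespace" \
    --repo "$namespace/$name" \
    --template basic \
    -y
}

# Usage: new_env_ci my-bench GeneralReasoning "A benchmark for evaluating X"
```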
Harbor environments
To create an environment that uses the Harbor task specification:
orwd new my-harbor-env --harbor --description "Harbor-based benchmark"
Step by step
If you prefer more control, you can run the individual steps yourself:
# 1. Scaffold local files
orwd init my-env --template basic
# 2. Create the environment on OpenReward
orwd create my-env --description "My environment" --namespace my-org
# 3. Create a GitHub repo and push
cd my-env
git init && git add . && git commit -m "Initial scaffold"
gh repo create my-org/my-env --source . --push
# 4. Link to GitHub (triggers first deployment)
orwd link my-org/my-env my-org/my-env
Managing Environments
Listing environments
# List all environments
orwd list
# Filter by owner
orwd list --owner GeneralReasoning
# Search by name or description
orwd list --search "benchmark"
# Your own environments
orwd list --mine
By default, orwd list fetches all results (auto-paginating through the API). Use --limit to cap the number returned.
Getting environment details
orwd get GeneralReasoning/CTF
Name: GeneralReasoning/CTF
ID: abc123
Description: Capture the flag challenges
Private: False
GitHub: connected
Repo: https://github.com/GeneralReasoning/CTF
Compute: 1 CPU, 4 GB
Created: 2025-03-15T12:00:00Z
Updated: 2025-04-01T08:30:00Z
Updating an environment
# Change description
orwd update my-org/my-env --description "Updated description"
# Rename
orwd update my-org/my-env --name new-name
# Toggle privacy
orwd update my-org/my-env --private
orwd update my-org/my-env --public
# Enable or disable harbor mode
orwd update my-org/my-env --harbor
orwd update my-org/my-env --no-harbor
# Set external URLs
orwd update my-org/my-env --arxiv-url "https://arxiv.org/abs/..."
GitHub Integration
Linking
orwd link my-org/my-env my-org/my-repo
This connects an environment to a GitHub repository and triggers the first deployment. If you haven’t authorized GitHub yet, the CLI opens your browser to complete OAuth.
Default compute settings (matching the web UI):
| Setting | Default |
|---|---|
| --cpu-memory | 1:4 (1 vCPU, 4 GB) |
| --concurrency | 500 |
| --max-scale | 10 |
Override any of these:
orwd link my-org/my-env my-org/my-repo --cpu-memory 2:8 --max-scale 5
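The --cpu-memory value packs two numbers into one string, vCPUs before the colon and gigabytes of memory after it. The sketch below shows one way a script might sanity-check such a value before passing it to orwd. `describe_cpu_memory` is a hypothetical helper, and it assumes both parts are whole numbers, which the CLI may not require.

```shell
# describe_cpu_memory is a hypothetical helper, not an orwd command.
# Assumption: both parts of "<vCPU>:<GB>" are whole numbers.
describe_cpu_memory() {
  case "$1" in
    *[!0-9:]* | *:*:* | :* | *:)
      # Non-digit characters, extra colons, or an empty side: reject below.
      ;;
    *:*)
      # Text before the first colon is vCPUs, text after it is GB.
      echo "${1%%:*} vCPU, ${1#*:} GB"
      return 0
      ;;
  esac
  echo "expected <vCPU>:<GB>, for example 2:8" >&2
  return 1
}

# Usage: describe_cpu_memory 2:8 && orwd link my-org/my-env my-org/my-repo --cpu-memory 2:8
```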
Unlinking
orwd unlink my-org/my-env
Updating link settings
Change compute or scaling settings on an already-linked environment:
orwd update-link my-org/my-env --cpu-memory 4:16 --concurrency 1000 --max-scale 3
Monitoring Deployments
Listing deployments
orwd deployments my-org/my-env
ID STATUS BRANCH COMMIT CREATED
a1b2c3d4 deployed main f8e9a0b1 2025-04-20T10:00:00Z
e5f6g7h8 failed main c3d4e5f6 2025-04-19T15:30:00Z
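For a quick check in a script, the status of the most recent deployment can be read from the table output. This sketch assumes the layout shown in the sample above: one header line, rows listed newest first, and STATUS as the second whitespace-separated column. `latest_deployment_status` is a hypothetical helper name.

```shell
# latest_deployment_status is a hypothetical helper, not part of orwd.
# Assumptions: line 1 is the header, line 2 is the newest deployment,
# and STATUS is the second column of the default table output.
latest_deployment_status() {
  orwd deployments "$1" | awk 'NR == 2 { print $2 }'
}

# Usage:
#   [ "$(latest_deployment_status my-org/my-env)" = "deployed" ] \
#     || orwd logs my-org/my-env --build
```

Parsing the table is convenient but fragile if the column layout changes. For anything long-lived, the structured output formats are likely a safer choice.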
Viewing logs
# Runtime logs (default)
orwd logs my-org/my-env
# Build logs
orwd logs my-org/my-env --build
# Logs for a specific deployment
orwd logs my-org/my-env --deployment-id a1b2c3d4-full-id-here
# More log entries
orwd logs my-org/my-env --limit 200
Harbor task builds
For Harbor environments, each task gets its own Docker image build. To check their status:
orwd task-builds my-org/my-harbor-env
TASK STATUS ID UPDATED
task-1 success a1b2c3d4 2025-04-20T10:05:00Z
task-2 building e5f6g7h8 2025-04-20T10:04:00Z
task-3 error i9j0k1l2 2025-04-20T10:03:00Z
error: Dockerfile syntax error on line 12
To view the build logs for a specific task:
orwd task-build-logs <task-build-id>
Use orwd task-builds my-org/my-env -o json to get the full task build IDs for use with task-build-logs.
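When an environment has many tasks, it can be useful to list only those whose build failed. The sketch below filters the default table output and assumes the layout in the sample above, where STATUS is the second column and a failed build is reported as error. `failed_task_builds` is a hypothetical helper name.

```shell
# failed_task_builds is a hypothetical helper, not part of orwd.
# Assumptions: STATUS is the second column of the table output and
# failed builds carry the status "error", as in the sample above.
failed_task_builds() {
  # NR > 1 skips the header; detail lines such as "error: ..." have a
  # different second field and are not matched.
  orwd task-builds "$1" | awk 'NR > 1 && $2 == "error" { print $1 }'
}

# Usage: failed_task_builds my-org/my-harbor-env
```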
File Management
Environments have a file store for uploading data files (datasets, ground truth, task inputs). These files are available at /orwd_data/ on the deployed environment server. See Where Environment Data Lives for details on how uploaded files are used.
Uploading files
# Upload a single file
orwd upload my-org/my-env data.json
# Upload multiple files
orwd upload my-org/my-env train.csv test.csv labels.json
# Upload an entire directory (preserves structure)
orwd upload my-org/my-env ./data/
By default, files are uploaded using their relative path from the current working directory. Use --dest to set a custom destination path:
# Upload to a specific path
orwd upload my-org/my-env train.csv --dest datasets/train.csv
# Upload files into a directory (note the trailing slash)
orwd upload my-org/my-env a.txt b.txt --dest inputs/
# Upload a local directory to a different remote path
orwd upload my-org/my-env ./local-data/ --dest ground_truth/
Upload concurrency defaults to 10 parallel transfers. Adjust with --concurrency:
orwd upload my-org/my-env ./large-dataset/ --concurrency 20
Listing files
# List all files
orwd files my-org/my-env
# Filter by path prefix
orwd files my-org/my-env --prefix ground_truth/
Deleting files
# Delete a single file
orwd delete-file my-org/my-env data.json
# Delete a folder and all its contents
orwd delete-file my-org/my-env ground_truth/ --folder
Runs and Rollouts
Browse your evaluation runs and their rollouts from the CLI:
# List runs
orwd runs
orwd runs --search "experiment-1"
# Get details for a specific run
orwd run <run-id>
# List rollouts within a run
orwd rollouts <run-id>
Structured Output
Every command (except init and new) supports the -o / --output flag for machine-readable output. This is useful for piping into jq, scripting, or loading into other tools.
| Format | Flag | Description |
|---|---|---|
| Table | -o table | Human-readable columns (default) |
| JSON | -o json | Compact JSON array or object |
| YAML | -o yaml | YAML output |
| JSONL | -o jsonl | One JSON object per line (for row-based commands) |
Examples
# Get environment details as JSON
orwd get my-org/my-env -o json
# Pipe environment list into jq
orwd list --owner my-org -o json | jq '.[].name'
# Stream deployments as JSONL for processing
orwd deployments my-org/my-env -o jsonl
# Export all environments to YAML
orwd list --mine -o yaml > my-environments.yaml
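Because JSONL puts one record on each line, ordinary line-oriented tools can work with it without a JSON parser. As a small sketch, the function below counts deployments by counting lines that begin with an opening brace. `count_deployments` is a hypothetical helper name, and the sketch assumes each record is a single-line JSON object.

```shell
# count_deployments is a hypothetical helper, not part of orwd.
# Assumption: with -o jsonl, each deployment is one line starting with "{".
count_deployments() {
  orwd deployments "$1" -o jsonl | grep -c '^{'
}

# Usage: count_deployments my-org/my-env
```

grep -c prints 0 and exits with a non-zero status when nothing matches, so a caller that runs under set -e may want to handle the empty case explicitly.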
Command Reference
| Command | Description |
|---|---|
| orwd whoami | Show the current authenticated user |
| orwd new | Interactive wizard (or one-shot with flags) to create and deploy an environment |
| orwd init | Scaffold local environment files from a template |
| orwd create | Create an environment on OpenReward |
| orwd list | List environments |
| orwd get | Get environment details |
| orwd update | Update environment metadata |
| orwd link | Link an environment to a GitHub repository |
| orwd unlink | Disconnect from GitHub |
| orwd update-link | Update compute/scaling settings for a linked environment |
| orwd deployments | List deployments |
| orwd logs | View build or runtime logs |
| orwd task-builds | List Harbor task image builds |
| orwd task-build-logs | View logs for a Harbor task image build |
| orwd upload | Upload files or directories to an environment’s file store |
| orwd files | List files in an environment’s file store |
| orwd delete-file | Delete a file or folder from an environment’s file store |
| orwd runs | List runs |
| orwd run | Get run details |
| orwd rollouts | List rollouts in a run |
Run orwd <command> --help for full usage details on any command.