
Goals

  • Understand what Verifiers is and how its environments map to OpenReward.
  • Install and use the verifiers2or CLI tools.
  • Convert a Verifiers environment into a deployable ORS environment.
  • Build, test, and deploy the generated environment to OpenReward.

Prerequisites

What is Verifiers?

Verifiers is Prime Intellect’s library for creating environments to train and evaluate LLMs. Each environment is a self-contained Python module that packages a dataset of task inputs, a harness for tools and context management, and a reward function for scoring. Verifiers environments support reinforcement learning training, capability evaluation, synthetic data generation, and agent experimentation.

The key architectural difference with ORS is Cartesian separation. In Verifiers, the environment and agent are entangled: the environment owns the agent loop, calls the model, parses its raw text output, and scores the episode. Environment and agent are one fused unit, so changing either means rewriting the other.

ORS decouples these completely. The environment is an HTTP server that exposes tools and returns structured feedback. The agent is an independent process that calls those tools. Neither knows the other’s internals. This means any agent can be paired with any environment - they are orthogonal axes, not coupled components.
  • Verifiers: Environment ↔ Agent are entangled. The environment drives the loop, calls the model, and parses raw text.
  • ORS: Environment ⊥ Agent are separated. The environment exposes tools via HTTP; the agent drives the loop and calls tools.
The verifiers2or toolkit bridges these paradigms, letting you convert existing Verifiers environments into ORS-compatible servers.
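The separation can be sketched in plain Python. The classes below are a toy in-process stand-in for the HTTP boundary (this is not the ORS SDK and none of these names come from it): the environment exposes a single tool and returns structured feedback, while the agent drives the loop and only ever sees tool results.

```python
# Toy model of the ORS contract: the environment exposes tools and
# structured feedback; the agent drives the loop. Neither side reads
# the other's internals.
from dataclasses import dataclass

@dataclass
class ToolResult:
    feedback: str
    reward: float
    finished: bool

class GuessEnv:
    """Stands in for an ORS environment server: one tool, structured output."""
    def __init__(self, answer: int):
        self._answer = answer  # hidden state; the agent never reads it

    def guess(self, value: int) -> ToolResult:
        # The only surface the agent sees: a tool call with typed input.
        if value == self._answer:
            return ToolResult("correct", reward=1.0, finished=True)
        hint = "higher" if value < self._answer else "lower"
        return ToolResult(hint, reward=0.0, finished=False)

def binary_search_agent(env: GuessEnv, lo: int = 0, hi: int = 100) -> float:
    """Stands in for an independent agent process: drives the loop via tools."""
    while True:
        mid = (lo + hi) // 2
        result = env.guess(mid)
        if result.finished:
            return result.reward
        if result.feedback == "higher":
            lo = mid + 1
        else:
            hi = mid - 1

print(binary_search_agent(GuessEnv(answer=37)))  # → 1.0
```

Because the agent only depends on the tool surface, swapping in a different agent (or a different environment behind the same tools) requires no changes on the other side.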

Installation

Install verifiers2or from GitHub:
pip install git+https://github.com/OpenRewardAI/verifiers2or.git
You also need the Verifiers library itself, plus the environment package you want to convert. Quote the version spec so the shell does not treat > as a redirect:
pip install "verifiers>=0.1.9"

Quick Start

1. Analyze the environment

Use the analyzer to inspect a Verifiers environment before converting:
python -m converter.analyzer wordle
This outputs the environment type, system prompt, parser fields, reward functions, dataset structure, and more:
=== Verifiers Environment Analysis ===
Class:        TextArenaEnv
Type:         MultiTurnEnv
System prompt: 'You are a competitive game player...'

--- Parser ---
Type:         XMLParser
Fields:       ['guess']
Answer field: guess

--- Reward Functions ---
  correct_answer (weight=1.0)
  partial_answer (weight=1.0)
  length_bonus (weight=1.0)
The analyzer accepts installed packages (wordle, primeintellect/gsm8k) or local files (./path/to/my_env.py).
2. Convert the environment

You have two conversion options.

Option A: Quick wrap (zero-code)

Wrap the environment as an ORS server with one command:
python -m converter.wrap wordle
This starts an ORS server on http://localhost:8080 with tools, tasks, and splits automatically extracted from the Verifiers environment.

To pass arguments to the environment:
python -m converter.wrap wordle --env-args '{"num_train_examples": 500}'
Option B: Generate scaffold (full control)

Generate ORS environment code for customisation:
python -m converter.scaffolder wordle -o ./my_wordle -n wordle
This creates:
  • wordle_env.py — ORS Environment subclass with TODOs for game logic
  • server.py — Server entry point
  • test_wordle.py — Client test script
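The generated wordle_env.py ships with TODOs rather than finished game logic. The sketch below shows roughly the shape such a class takes once filled in. ToolOutput here is a local stand-in named after the concepts used elsewhere in this guide (ToolOutput.reward, finished, get_prompt()), not the real ORS SDK, and the Wordle scoring is deliberately simplified (no duplicate-letter handling).

```python
# Illustrative shape of a filled-in scaffold; not the actual generated code.
from dataclasses import dataclass

@dataclass
class ToolOutput:  # stand-in for the ORS ToolOutput concept
    feedback: str
    reward: float = 0.0
    finished: bool = False

class WordleEnv:
    """One instance per session: per-episode state lives on the instance."""
    def __init__(self, answer: str = "crane", max_guesses: int = 6):
        self.answer = answer
        self.guesses_left = max_guesses

    def get_prompt(self) -> str:
        # Verifiers system_prompt becomes a prompt delivered via the API.
        return "Guess the five-letter word. G=right spot, Y=wrong spot, _=absent."

    def guess(self, word: str) -> ToolOutput:
        # Verifiers env_response() becomes a tool method with typed input.
        self.guesses_left -= 1
        marks = "".join(
            "G" if g == a else ("Y" if g in self.answer else "_")
            for g, a in zip(word, self.answer)
        )
        if word == self.answer:
            return ToolOutput(marks, reward=1.0, finished=True)
        return ToolOutput(marks, finished=self.guesses_left == 0)

env = WordleEnv()
print(env.guess("crate").feedback)  # → GGG_G
print(env.guess("crane").reward)    # → 1.0
```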
3. Test locally

Start the server and run the test script:
# Terminal 1: Start the server
cd ./my_wordle
python server.py

# Terminal 2: Run the test
python test_wordle.py
Verify that splits and tasks are listed, tools appear with correct schemas, and episodes complete with finished=True and valid rewards.
4. Deploy to OpenReward

Push the generated environment to GitHub and connect it via the OpenReward dashboard:
cd my_wordle
git init && git add -A && git commit -m "Initial environment"
git remote add origin https://github.com/yourorg/my-wordle.git
git push -u origin main
Then follow the GitHub deployment guide to connect the repository and trigger your first build on OpenReward.

Conversion Methods

| Method | Use Case | Output |
| --- | --- | --- |
| wrap | Quick prototyping, testing | ORS server at runtime (no code generated) |
| scaffold | Production, customisation | Python files with TODOs for manual completion |
The wrap command reuses all original Verifiers logic (game state, feedback, scoring) at runtime. Use scaffold when you need full control over the ORS implementation.
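Conceptually, wrapping is an adapter: the original Verifiers logic keeps running unchanged, and only the calling convention around it changes. A minimal sketch with stand-in class names (none of these are verifiers2or internals):

```python
# Adapter sketch: reuse Verifiers-style logic as-is, expose it as an
# ORS-style tool call that returns structured output.
from dataclasses import dataclass

class VerifiersStyleEnv:
    """Original logic: owns state, returns (observation, reward, done)."""
    def __init__(self, target: str):
        self.target = target
        self.turns = 0

    def env_response(self, guess: str):
        self.turns += 1
        done = guess == self.target or self.turns >= 3
        reward = 1.0 if guess == self.target else 0.0
        return f"turn {self.turns}", reward, done

@dataclass
class WrappedTool:
    """ORS-facing surface: structured tool output, original logic inside."""
    inner: VerifiersStyleEnv

    def submit(self, guess: str) -> dict:
        observation, reward, done = self.inner.env_response(guess)
        return {"feedback": observation, "reward": reward, "finished": done}

tool = WrappedTool(VerifiersStyleEnv(target="apple"))
print(tool.submit("grape"))  # {'feedback': 'turn 1', 'reward': 0.0, 'finished': False}
print(tool.submit("apple"))  # {'feedback': 'turn 2', 'reward': 1.0, 'finished': True}
```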

Concept Mapping

This table shows how Verifiers concepts translate to ORS:
| Verifiers | ORS | Notes |
| --- | --- | --- |
| load_environment() | __init__() + setup() | ORS creates one instance per session |
| Dataset (HuggingFace) | list_tasks(split) | Each row becomes a task JSON object |
| Rubric + reward funcs | ToolOutput.reward | Reward returned on terminal tool call |
| Parser (XMLParser) | Pydantic BaseModel | Structured tool input replaces text parsing |
| env_response() | @tool method | Agent calls tools instead of env parsing text |
| system_prompt | get_prompt() | Prompt delivered via API |
| State dict | Instance attributes | ORS env is instantiated per session |
| @vf.stop conditions | ToolOutput(finished=True) | Termination signaled via tool output |
| train/eval datasets | list_splits() + Split | Named splits with type annotation |
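The parser row is the most consequential change: instead of extracting a field like guess from raw completion text, an ORS tool receives typed arguments directly (via a Pydantic model in the real scaffold). A simplified, stdlib-only contrast of the two input styles (the XML extraction below is a stand-in, not the actual XMLParser implementation):

```python
# Verifiers-style: pull the answer field out of raw model text.
import re

def parse_guess_from_text(raw: str) -> str:
    match = re.search(r"<guess>(.*?)</guess>", raw, re.DOTALL)
    if match is None:
        raise ValueError("no <guess> field in model output")
    return match.group(1).strip()

# ORS-style: the tool schema guarantees a string argument; no parsing needed.
def guess_tool(guess: str) -> str:
    return guess

raw_completion = "I think the word is <guess>crane</guess>."
print(parse_guess_from_text(raw_completion))  # → crane
```

Moving the parsing burden into the tool schema means malformed model output surfaces as a failed tool call rather than a silently unscored episode.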

Supported Environment Types

| Type | Wrapper Support | Scaffold Support | Notes |
| --- | --- | --- | --- |
| SingleTurnEnv | Full | Full | One submit tool |
| MultiTurnEnv | Full | Full | Tool from parser fields |
| ToolEnv | Full | Full | One ORS tool per Verifiers tool |
| StatefulToolEnv | Full | Full | Same as ToolEnv |
| SandboxEnv | Partial | Template + TODOs | Requires manual sandbox config |

Next Steps