Getting Started#

There are four main ways to use RLtools:

Python Interface

  Pros ✅

  • Easy installation: pip install rltools

  • Compatible with Gym/Gymnasium environments

  • Easiest way to use Intel MKL for acceleration: pip install rltools[mkl]

  Cons ❌

  • Performance limited by Python environments

  • Limited flexibility (fixed algorithms, adjustable hyperparameters)

Native: In-Source

  Pros ✅

  • Full performance

  • Easily reproduces the examples

  • Tuned training configurations maintained in the RLtools Zoo

  Cons ❌

  • Versioning issues (this is basically forking RLtools)

Native: No CMake

  Pros ✅

  • Full performance

  • Great for purists who hate CMake and build-systems in general

  • Shows that RLtools is actually dependency-free

  Cons ❌

  • No automatic interoperability with IDEs for e.g. debugging

Native: As a Submodule/Library

  Pros ✅

  • Full performance

  • Easy versioning by including RLtools as a submodule

  • Great for implementing your own environments and maintaining them over time

  Cons ❌

  • Slightly more initial setup

The No CMake approach is mainly a demonstration: it shows that RLtools has no dependencies and that there is no hidden magic in the CMake configuration.

Between the two remaining native options (In-Source and Submodule/Library) we recommend starting with the In-Source approach to take advantage of the pre-configured examples in the RLtools Zoo. Then, when implementing your own environment, you probably want to switch to using RLtools as a library by including it as a submodule in your project.

This quick-start guide mainly focuses on the In-Source approach because it showcases how RLtools can be set up with minimal dependencies and used to reproduce the RLtools Zoo training runs.

Python Interface#

pip install rltools gymnasium

gymnasium is not generally required but we install it for this example.

from rltools import SAC
import gymnasium as gym
from gymnasium.wrappers import RescaleAction

seed = 0xf00d
def env_factory():
    env = gym.make("Pendulum-v1")
    env = RescaleAction(env, -1, 1)
    env.reset(seed=seed)
    return env

sac = SAC(env_factory)
state = sac.State(seed)

finished = False
while not finished:
    finished = state.step()

For more information please refer to the Python Interface documentation.
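The RescaleAction wrapper in the example above lets the agent act in [-1, 1] while the environment keeps its native action range. The following is a rough, standalone illustration of the affine map behind such a wrapper. It is a sketch, not Gymnasium's implementation: the function name rescale_action is hypothetical, and the bounds of ±2 are the torque limits of Pendulum-v1.

```python
def rescale_action(action, low, high, min_action=-1.0, max_action=1.0):
    """Affinely map `action` from [min_action, max_action] to [low, high]."""
    fraction = (action - min_action) / (max_action - min_action)
    return low + fraction * (high - low)


# Pendulum-v1 accepts torques in [-2, 2]; the agent outputs values in [-1, 1].
for agent_action in (-1.0, 0.0, 0.5, 1.0):
    env_action = rescale_action(agent_action, low=-2.0, high=2.0)
    print(agent_action, "->", env_action)
```

With these bounds the map simply doubles the agent's action, so the end points -1 and 1 land on the environment's limits.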

Native: In-Source#

This guide gets you started using RLtools natively on multiple platforms (Docker / Ubuntu / WSL / macOS).

Step 1: Clone the Repository#

git clone https://github.com/rl-tools/rl-tools.git

Step 2: Run Container (Optional)#

Run a Docker container from the cloned directory:

cd rl-tools
docker run -it --rm -p 8000:8000 --mount type=bind,source=$(pwd),target=/rl_tools,readonly ubuntu:24.04
  • -p 8000:8000: Optional. Exposes port 8000 such that you can use the Experiment Tracking facilities. This exposes a static web server that gives access to a browser-based interface to watch environment rollouts.

  • --mount type=bind,source=$(pwd),target=/rl_tools,readonly: We mount the current directory (checked out repository) to /rl_tools in the container. We use the readonly flag to ensure a clean, out-of-tree build. The files can still be edited on the host machine using your editor or IDE of choice.

  • ubuntu:24.04: We use Ubuntu for this demonstration because of its widespread use and familiarity. Due to the minimal requirements of RLtools (basically only a C++ compiler and CMake, and possibly a BLAS library to speed up matrix multiplications) it can also be used on virtually any other system (Windows, macOS and other Linux distros).

Step 3: Install Dependencies#

Docker & Ubuntu & WSL#

apt update
export DEBIAN_FRONTEND=noninteractive
apt install -y build-essential cmake libopenblas-dev git python3
  • DEBIAN_FRONTEND=noninteractive: Suppresses interactive prompts during package installation (for convenience)

  • apt install -y: Installs several dependencies

    • build-essential: Installs the C++ compiler

    • cmake: CMake to configure the different example targets and call the compiler

    • libopenblas-dev: Optional. Lightweight BLAS library that provides fast matrix multiplication implementations. Required for the -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS=ON option. About 10x faster than the generic implementations that are used by default (when the option is absent). By using a more tailored BLAS library like Intel MKL you might be able to get another ~2x speed improvement.

    • git: Optional. If git is available, CMake’s FetchContent module is used to automatically fetch other optional dependencies for, e.g., JSON, HDF5, Tensorboard logging, command-line argument parsing etc.

    • python3: Optional. Python is used in tools/serve.sh to host a simple static HTTP server that visualizes the environments during and after training.
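Before configuring, it can help to confirm which of the tools listed above are actually on the PATH. The following is a minimal sketch using only the Python standard library. The TOOLS table and the check_tools helper are illustrative names, not part of RLtools, and the required/optional split just mirrors the list above (g++ is the compiler that build-essential provides).

```python
import shutil

# Tools named in the dependency list; True marks the ones the text treats as
# required, False the ones described as optional.
TOOLS = {"g++": True, "cmake": True, "git": False, "python3": False}


def check_tools(tools=TOOLS):
    """Return {tool name: absolute path or None} for each tool looked up on PATH."""
    return {name: shutil.which(name) for name in tools}


for name, path in check_tools().items():
    kind = "required" if TOOLS[name] else "optional"
    print(f"{name:8s} ({kind}): {path or 'not found'}")
```

This does not detect libopenblas-dev, which is a library rather than an executable.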

macOS#

Install the Xcode command line tools and then CMake (here via Homebrew):

xcode-select --install
brew install cmake

Step 4: Configure and Build the Targets#

mkdir build && cd build
cmake ..
cmake --build .
  • cmake ..: Uses CMake to configure the examples contained in the RLtools project. The main suite of (tuned) environment configurations we are using is the RLtools Zoo (see https://zoo.rl.tools for trained agents and learning curves). If you are using Docker, replace .. with /rl_tools since the mounted source tree should be read-only.

  • cmake --build .: Builds the targets. You can append e.g. -j4 to speed up the build by running 4 jobs in parallel.
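The variations mentioned in this step (a different source directory in Docker, the optional OpenBLAS backend, parallel jobs) can be summarized in a small sketch. It only assembles the command lines and does not run them. The helper cmake_commands and its parameters are hypothetical; the flags themselves are the ones named in this guide.

```python
def cmake_commands(source_dir="..", openblas=False, jobs=None):
    """Return the configure and build command lines as lists of arguments."""
    configure = ["cmake", source_dir]
    if openblas:
        # Requires an installed OpenBLAS (libopenblas-dev on Ubuntu).
        configure.append("-DRL_TOOLS_BACKEND_ENABLE_OPENBLAS=ON")
    build = ["cmake", "--build", "."]
    if jobs is not None:
        build.append(f"-j{jobs}")
    return configure, build


# Host build from rl-tools/build vs. Docker build against the read-only mount.
variants = [
    ("host", {}),
    ("docker", {"source_dir": "/rl_tools", "openblas": True, "jobs": 4}),
]
for label, kwargs in variants:
    configure, build = cmake_commands(**kwargs)
    print(label, "|", " ".join(configure), "|", " ".join(build))
```

To actually execute them, each list could be passed to subprocess.run(cmd, check=True) from inside the build directory.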

Step 5: Run an Experiment#

For example, run the RLtools Zoo target that uses SAC to train an agent in the Learning to Fly (l2f) environment.

cd ..
./build/src/rl/zoo/rl_zoo_l2f_sac

You can use cmake --build . --target help (from the build directory) to list the available targets. In Docker the source directory is read-only, so skip the cd .. and run ./src/rl/zoo/rl_zoo_l2f_sac directly from the build directory. The working directory matters because checkpoints and other logs are written to an experiments folder in the current working directory.
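Why the working directory matters can be seen from how a relative path such as experiments is resolved. The sketch below uses a throwaway directory tree (the layout with a build subfolder is only an assumption that mimics the steps above) and shows that the same relative name points to different locations depending on where the process is started.

```python
import os
import tempfile
from pathlib import Path


def experiments_dir():
    """Resolve the relative 'experiments' folder against the current working directory."""
    return Path("experiments").resolve()


original_cwd = os.getcwd()
with tempfile.TemporaryDirectory() as repo_root:
    build_dir = Path(repo_root) / "build"
    build_dir.mkdir()

    os.chdir(repo_root)      # like running ./build/... from the repository root
    from_root = experiments_dir()

    os.chdir(build_dir)      # like running ./src/... from inside build
    from_build = experiments_dir()

    os.chdir(original_cwd)   # leave the temporary tree before it is deleted

print("started in repository root ->", from_root)
print("started in build directory ->", from_build)
```

The second path ends in build/experiments, which is where the output would land if the binary were started from the build directory.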

Step 6: Visualize the Results#

During the experiment the training loop emits checkpoints and recorded trajectories as well as JavaScript rendering instructions in the experiments folder, following the Experiment Tracking conventions. You can browse the experiments folder, which contains the runs, to inspect the checkpoints and other data. RLtools includes a simple web-based UI to visualize these results.

Docker#

To expose the experiment data through the forwarded port of the Docker container, we copy the files that constitute the web interface into the container so that a simple, static HTTP server can serve them together with the experiment data.

cp -r /rl_tools/static /rl_tools/tools /rl_tools/index.html .
./tools/serve.sh

After copying the UI files we run tools/serve.sh which periodically builds an index file containing a list of all experiment files such that the web UI can find them. It also starts a simple Python-based HTTP server on port 8000. Now you should be able to navigate to http://localhost:8000 and view the visualizations of the training runs (EXperiment TRACKing UI).

Ubuntu & WSL & macOS#

Make sure that the target is run with the cloned repository rl-tools as the working directory. This should create an experiments folder inside it.

Now we can run tools/serve.sh which periodically builds an index file containing a list of all experiment files such that the web UI can find them. It also starts a simple Python-based HTTP server on port 8000. Now you should be able to navigate to http://localhost:8000 and view the visualizations of the training runs.

./tools/serve.sh
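Conceptually, the indexing part of this step amounts to listing every file below the experiments folder so that a static web page can discover them. The following sketch illustrates that idea only. It is not the code of tools/serve.sh, and the helper name build_index, the output format, and the run and file names in the demo are all made up.

```python
import json
import tempfile
from pathlib import Path


def build_index(experiments_dir):
    """List every file below experiments_dir as a sorted, POSIX-style relative path."""
    root = Path(experiments_dir)
    return sorted(
        path.relative_to(root).as_posix()
        for path in root.rglob("*")
        if path.is_file()
    )


# Demo on a throwaway directory tree with placeholder files.
with tempfile.TemporaryDirectory() as tmp:
    step_dir = Path(tmp) / "run_a" / "steps" / "000000"
    step_dir.mkdir(parents=True)
    (step_dir / "checkpoint.h5").write_text("placeholder")
    (step_dir / "trajectories.json").write_text("[]")
    index = build_index(tmp)

print(json.dumps(index, indent=2))
```

A directory can then be served with the standard library alone, for instance via python3 -m http.server 8000, although the actual script may do this differently.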

Native: No CMake#

“Never underestimate a man armed with nothing but a C++ compiler.”

—RLtools contributor

The following compiles the RLtools Zoo example for SAC with the Learning to Fly environment:

g++ -I include -std=c++17 src/rl/zoo/l2f/sac.cpp
./a.out
  • -I include: This is run from the root of the cloned repository folder rl-tools, hence the header search path is include. In Docker this should be adjusted to -I /rl_tools/include.

  • -std=c++17: Use the C++17 standard

  • src/rl/zoo/l2f/sac.cpp: The RLtools Zoo source file for SAC with the Learning to Fly environment. In Docker this should be adjusted to /rl_tools/src/rl/zoo/l2f/sac.cpp.

  • ./a.out: Run the compiled binary

This will be quite slow because no optimizations are applied by default. In the following we instruct the compiler to maximally optimize the code using the compile-time knowledge that RLtools provides. In particular, the sizes of all data structures and the bounds of all for-loops are known at compile time, so the compiler can unroll loops and inline functions to maximize performance.

g++ -I include -std=c++17 -O3 -ffast-math -march=native src/rl/zoo/l2f/sac.cpp
./a.out

With this we observe an ~80x speedup. The added options are:

  • -O3 -ffast-math: Maximally optimize the code and use fast math

  • -march=native: Take maximal advantage of the available instructions on the host machine

To further speed up the computations we can use a matrix multiplication backend which we find to give another ~7-10x speedup:

Docker & Ubuntu & WSL#

g++ -I include -std=c++17 -O3 -ffast-math -march=native src/rl/zoo/l2f/sac.cpp -lblas -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS
./a.out
  • -lblas: Link against the BLAS library

  • -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS: Enable the OpenBLAS backend (should actually work with any CBLAS-compatible library)

macOS#

g++ -I include -std=c++17 -O3 -ffast-math -march=native src/rl/zoo/l2f/sac.cpp -framework Accelerate -DRL_TOOLS_BACKEND_ENABLE_ACCELERATE
./a.out
  • -framework Accelerate: Link against the Accelerate framework

  • -DRL_TOOLS_BACKEND_ENABLE_ACCELERATE: Enable the Accelerate backend
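Taken together, the speedups quoted in this section suggest how far the fully optimized build is from the plain one. The sketch below is a back-of-the-envelope calculation that assumes the two factors compose multiplicatively, which the measurements above do not state explicitly. Actual numbers depend on the hardware and the BLAS library used.

```python
def combined_speedup(*factors):
    """Multiply individual speedup factors into one overall factor."""
    total = 1.0
    for factor in factors:
        total *= factor
    return total


compiler_flags = 80            # ~80x from -O3 -ffast-math -march=native
blas_low, blas_high = 7, 10    # ~7-10x from a matrix multiplication backend

low = combined_speedup(compiler_flags, blas_low)
high = combined_speedup(compiler_flags, blas_high)
print(f"roughly {low:.0f}x to {high:.0f}x over the unoptimized build")
```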

Native: As a Submodule/Library#

To use RLtools as a library you can start from the example at https://github.com/rl-tools/example and use it as a template to implement your own environment. The steps to set up the build environment are the same as in Native: In-Source.