Getting Started#
There are three main ways to use RLtools:
| Approach | Description |
|---|---|
| Python Interface | Install via pip and train Gymnasium environments directly from Python |
| Native: In-Source | Build the pre-configured examples (RLtools Zoo) directly inside the cloned repository |
| Native: As a Submodule/Library | Include RLtools as a submodule in your own project; recommended once you implement your own environment |
The No CMake approach is mainly for demonstration purposes: it shows that RLtools has no dependencies and that there is no hidden magic in the CMake configuration.
Between the two native options we recommend starting with the In-Source approach to take advantage of the pre-configured examples in the RLtools Zoo. Then, when implementing your own environment, you will probably want to switch to using RLtools as a library by including it as a submodule in your project.
This quick-start guide mainly focuses on the In-Source approach because it showcases how RLtools can be set up with minimal dependencies and used to reproduce the RLtools Zoo training runs.
Python Interface#
pip install rltools gymnasium
gymnasium is not generally required but we install it for this example.
from rltools import SAC
import gymnasium as gym
from gymnasium.wrappers import RescaleAction

seed = 0xf00d
def env_factory():
    env = gym.make("Pendulum-v1")
    env = RescaleAction(env, -1, 1)
    env.reset(seed=seed)
    return env

sac = SAC(env_factory)
state = sac.State(seed)

finished = False
while not finished:
    finished = state.step()
For more information please refer to the Python Interface documentation.
Native: In-Source#
This guide gets you started using RLtools natively on multiple platforms (Docker / Ubuntu / WSL / macOS).
- Step 1: Clone the Repository
- Step 2: Run Container (Docker only)
- Step 3: Install Dependencies
- Step 4: Configure and Build the Targets
- Step 5: Run an Experiment
- Step 6: Visualize the Results
Step 1: Clone the Repository#
git clone https://github.com/rl-tools/rl-tools.git
- Note:
We don't encourage using the --recursive flag because we maintain RLtools as a monorepo and some of the submodules contain large files (e.g. data for unit tests or redistributable binaries for releases). RLtools is designed as a header-only and dependency-free library, but some convenience features like Tensorboard logging or gzipped checkpointing require additional dependencies. We prefer to vendor them as versioned submodules in ./external, and they can be instantiated selectively using git submodule update --init --recursive -- external/<submodule>.
Step 2: Run Container (Docker only)#
Run a Docker container from the cloned directory:
cd rl-tools
docker run -it --rm -p 8000:8000 --mount type=bind,source=$(pwd),target=/rl_tools,readonly ubuntu:24.04
- -p 8000:8000: Optional. Exposes port 8000 so that you can use the Experiment Tracking facilities. This exposes a static web server that gives access to a browser-based interface to watch environment rollouts.
- --mount type=bind,source=$(pwd),target=/rl_tools,readonly: Mounts the current directory (the checked-out repository) to /rl_tools in the container. We use the readonly flag to ensure a clean, out-of-tree build. The files can still be edited on the host machine using your editor or IDE of choice.
- ubuntu:24.04: We use Ubuntu for this demonstration because of its widespread use and familiarity. Due to the minimal requirements of RLtools (basically only a C++ compiler, CMake, and possibly a BLAS library to speed up matrix multiplications) it can also be used on virtually any other system (Windows, macOS, and other Linux distros).
Step 3: Install Dependencies#
Docker & Ubuntu & WSL#
apt update
export DEBIAN_FRONTEND=noninteractive
apt install -y build-essential cmake libopenblas-dev python3
- DEBIAN_FRONTEND=noninteractive: Suppresses interactive prompts during package installation (for convenience)
- apt install -y: Installs several dependencies
- build-essential: Installs the C++ compiler
- cmake: CMake is used to configure the different example targets and invoke the compiler
- libopenblas-dev: Optional. Lightweight BLAS library that provides fast matrix multiplication implementations. Required for the -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS=ON option. About 10x faster than the generic implementations that are used by default (when the option is absent). By using a more tailored BLAS library like Intel MKL you might be able to get another ~2x speed improvement
- python3: Optional. Python is used in serve.sh to host a simple static HTTP server that visualizes the environments during and after training.
macOS#
Install the Xcode command line tools:
xcode-select --install
Then install CMake (e.g. via Homebrew):
brew install cmake
Step 4: Configure and Build the Targets#
- Note:
For macOS, replace -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS=ON with -DRL_TOOLS_BACKEND_ENABLE_ACCELERATE=ON
mkdir build && cd build
cmake /rl_tools -DCMAKE_BUILD_TYPE=Release -DRL_TOOLS_ENABLE_TARGETS=ON -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS=ON
cmake --build .
- cmake /rl_tools: Uses CMake to configure the examples contained in the RLtools project. The main suite of (tuned) environment configurations we are using is the RLtools Zoo (see https://zoo.rl.tools for trained agents and learning curves).
- -DCMAKE_BUILD_TYPE=Release: Sets the build type to Release to optimize the build for performance (expect a large difference compared to a build without it).
- -DRL_TOOLS_ENABLE_TARGETS=ON: Enables building the example targets. These are turned off by default so that they don't clutter projects that just include RLtools as a library and do not want to build the examples.
- -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS=ON: Enables the OpenBLAS backend, allowing RLtools to use OpenBLAS for matrix multiplications (~10x faster than the generic implementations that are used in the absence of this flag).
- cmake --build .: Builds the targets. You can add e.g. -j4 to speed up the build using 4 parallel threads.
Step 5: Run an Experiment#
Run, for example, the RLtools Zoo example that uses SAC to train on the Learning to Fly (l2f) environment:
./src/rl/zoo/rl_zoo_sac_l2f
You can use cmake --build . --target help
to list the available targets.
Step 6: Visualize the Results#
During the experiment the training loop emits checkpoints and recorded trajectories as well as JavaScript rendering instructions into the experiment folder, following the Experiment Tracking conventions. You can browse the experiments folder, which contains the runs, to inspect the checkpoints and other data. RLtools includes a simple web-based UI to visualize these results.
Docker#
To expose the experiment data through the forwarded port of the docker container we copy the files that constitute the web interface into the docker container such that a simple, static HTTP server can expose them together with the experiment data.
cp -r /rl_tools/static .
/rl_tools/serve.sh
After copying the UI files we run serve.sh, which periodically builds an index file containing a list of all experiment files so that the web UI can find them. It also starts a simple Python-based HTTP server on port 8000. Now you should be able to navigate to http://localhost:8000 and view the visualizations of the training runs.
Ubuntu & WSL & macOS#
Make sure that the target is run with the cloned repository rl-tools as the working directory; this should create an experiments folder inside it.
Now we can run serve.sh, which periodically builds an index file containing a list of all experiment files so that the web UI can find them. It also starts a simple Python-based HTTP server on port 8000. Now you should be able to navigate to http://localhost:8000 and view the visualizations of the training runs.
./serve.sh
Native: No CMake#
“Never underestimate the power of a man armed with nothing but a C++ compiler.”
—RLtools contributor
The following compiles the RLtools Zoo example for SAC with the Learning to Fly environment:
g++ -I include -std=c++17 src/rl/zoo/l2f/sac.cpp
./a.out
- -I include: This is run from the root of the cloned repository folder rl-tools, hence the header search path is include. In Docker this should be adjusted to -I /rl_tools/include.
- -std=c++17: Use the C++17 standard
- src/rl/zoo/l2f/sac.cpp: Compiles the RLtools Zoo example that uses SAC on the Learning to Fly (l2f) environment. In Docker this should be adjusted to /rl_tools/src/rl/zoo/l2f/sac.cpp.
- ./a.out: Run the compiled binary
This will be quite slow because no optimizations are applied by default. In the following we instruct the compiler to maximally optimize the code using the compile-time knowledge that RLtools provides. In particular, the sizes of all data structures and for-loops are known at compile time, so the compiler can unroll loops and inline functions to maximize performance.
g++ -I include -std=c++17 -Ofast -march=native src/rl/zoo/l2f/sac.cpp
./a.out
With this we observe an ~80x speedup. The added options are:
- -Ofast: Maximally optimize the code and use fast math
- -march=native: Take maximal advantage of the instructions available on the host machine
To further speed up the computations we can use a matrix multiplication backend which we find to give another ~7-10x speedup:
Docker & Ubuntu & WSL#
g++ -I include -std=c++17 -Ofast -march=native src/rl/zoo/l2f/sac.cpp -lblas -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS
./a.out
- -lblas: Link against the BLAS library
- -DRL_TOOLS_BACKEND_ENABLE_OPENBLAS: Enable the OpenBLAS backend (should work with any CBLAS-compatible library)
macOS#
g++ -I include -std=c++17 -Ofast -march=native src/rl/zoo/l2f/sac.cpp -framework Accelerate -DRL_TOOLS_BACKEND_ENABLE_ACCELERATE
./a.out
- -framework Accelerate: Link against the Accelerate framework
- -DRL_TOOLS_BACKEND_ENABLE_ACCELERATE: Enable the Accelerate backend
Native: As a Submodule/Library#
To use RLtools as a library you can start from the example at https://github.com/rl-tools/example and use it as a template to implement your own environment. The steps to set up the environment are the same as in Native: In-Source.
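Since RLtools is header-only, consuming it from your own CMake project mainly amounts to adding the vendored submodule's include directory. The following is a minimal sketch under that assumption (project, target, source, and submodule path names are placeholders; consult the example repository above for the authoritative setup):

```cmake
cmake_minimum_required(VERSION 3.16)
project(my_rl_project)               # placeholder project name
set(CMAKE_CXX_STANDARD 17)

add_executable(my_training src/main.cpp)  # placeholder target and source
# RLtools is header-only, so pointing the target at the include directory
# of the vendored submodule (assumed at ./external/rl-tools) is sufficient
target_include_directories(my_training PRIVATE external/rl-tools/include)
```

Optional backends (e.g. OpenBLAS) can then be enabled per target via the same compile definitions and link flags shown in the No CMake section.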