Experiment Tracking Interface Specification#
“Strava but for training runs.”
—RLtools contributor
The experiments follow the hierarchical, filesystem-based experiment tracking interface. We follow the UNIX philosophy and expose it through the filesystem.
This makes it very easy to sort, filter and analyze the experiments with just unix tools like find
, sort
, grep
and jq
. Additionally it also allows easy monitoring through a static website like https://zoo.rl.tools where experiment results are automatically made accessible by our continuous integration.
Run Identifier#
Each run is characterized by the following path/identifier structure:
{TIME}/{COMMIT}_{NAME}_{POPULATION}/{CONFIG}/{SEED}
e.g.
2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000
{TIME}: e.g.
2024-05-26_06-26-52
TIME: The time when the experiment was executed. The time is the most significant component to facilitate lexicographical sorting (e.g.
find . | sort
). On Linux It can e.g. be generated bydate '+%Y-%m-%d_%H-%M-%S'
{COMMIT}_{NAME}_{POPULATION}: e.g.
4f717cb_zoo_algorithm_environment
COMMIT (e.g.
4f717cb
: The first 7 hexadecimal places of the commit hashNAME (e.g.
zoo
): Name of the experiment. Thereturn.json
should be in a consistent per NAMEPOPULATION (e.g.
algorithm_environment
): The variables that are varied/tested in this experiment (separated by underscores)
{CONFIG}: e.g.
sac_pendulum-v1
CONFIG: The values of the previously mentioned POPULATION variables for a particular experiment configuration, separated by underscores. In combination with the previous population setup, this yields e.g.
{"algorithm": "sac", "environment": "pendulum-v1"}
{SEED}: e.g.
1337
SEED: The seed for a particular run of the previously described configuration
Run Contents#
Per Run#
Each run can contain different files that apply to the whole run, e.g.:
2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/return.json
2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/logs.tfevents
return.json: The return value of the experiment. The structure and semantics of this file depend on the {NAME}. The existence of this file signals that the experiment is finished.
config.json (optional): Additional configuration
logs.tfevents (optional): Tensorboard logs
ui.js/ui.esm.js (optional): Render function for the UI (see https://studio.rl.tools for more info). This allows collected trajectories to be rendered
Per Step#
Additionally each run can store files at different stages of the run (corresponding to “steps” of the Loop Interface) e.g.:
2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/steps/000000000012000/checkpoint.h5
2024-05-26_06-26-52/4f717cb_zoo_algorithm_environment/sac_pendulum-v1/0000/steps/000000000012000/trajectories.json
steps: Directory to separate the step contents from the full-run contents
{STEP}: e.g.
000000000012000
checkpoint.h (optional): Checkpoint exported as a C++ header file such that it can be compiled into e.g. the firmware of a microcontroller.
checkpoint.h5 (optional): Checkpoint exported as an HDF5 file.
trajectories.json/trajectories.json.gz (optional): Trajectories from the evaluation loop step. These trajectories contain state, action, reward and termination signal as JSON objects and hence are compatible with the render function from
ui.js
evaluation_results.json (optional): Statistics (episode length and return) of the evaluation step