TycoonLE Ships JAX-Native Logistics Planning That Compiles Rollouts

This RL environment for route, cargo, and financing planning runs entirely in JAX (jit, vmap, scan) and ships with a replay UI and a benchmark suite called TycoonBench.

Tags: tycoonle, jax, reinforcement learning, logistics, planning, openttd

You can now JIT-compile an entire logistics planning RL loop. TycoonLE is written in pure JAX, so jit, vmap, and scan apply across route-building, cargo flow, debt management, and delayed rewards, with no Python-level step loop holding back GPU utilization.

TycoonLE is a fixed-shape, economically grounded reinforcement learning environment. Agents allocate capital, build transport routes, move cargo, manage debt, and optimize returns that take dozens of steps to materialize. The environment is designed for studying action legality, candidate-frontier decision interfaces, financing timing, and procedural variation, all with replayable audit traces.

What Makes TycoonLE Different

Variable-length action spaces and dynamic state shapes are hard to trace under jit, which is why many RL environments cannot be compiled end to end. TycoonLE sidesteps that with a fixed-shape interface: agents choose among valid route, finance, and wait candidates, so a rollout remains a single jit-compiled function.
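To make the fixed-shape idea concrete, here is a minimal sketch of masked candidate selection inside a scanned rollout. It is not TycoonLE's actual API: `toy_step`, `rollout`, `NUM_CANDIDATES`, and the cost and income numbers are all made up for illustration. Only `jax.jit`, `jax.vmap`, `jax.lax.scan`, and `jax.random.categorical` are real JAX calls.

```python
import jax
import jax.numpy as jnp

NUM_CANDIDATES = 8  # hypothetical fixed candidate count; the last slot is "wait"


def toy_step(state, key):
    """One step of a toy environment with a fixed-shape, masked action set."""
    cash, t = state
    # Made-up candidate costs: 10, 20, ..., then a free "wait" slot at the end.
    costs = (jnp.arange(NUM_CANDIDATES, dtype=jnp.float32) + 1.0) * 10.0
    costs = costs.at[-1].set(0.0)
    # Legality is a boolean mask of constant shape, not a variable-length list.
    legal = costs <= cash
    logits = jnp.zeros(NUM_CANDIDATES)  # stand-in for a policy network output
    masked_logits = jnp.where(legal, logits, -jnp.inf)
    action = jax.random.categorical(key, masked_logits)
    cash = cash - costs[action] + 5.0  # made-up income per step
    return (cash, t + 1), (action, legal[action])


def rollout(key, init_cash, num_steps=16):
    """Run num_steps of toy_step with lax.scan instead of a Python loop."""
    keys = jax.random.split(key, num_steps)
    init = (init_cash, jnp.int32(0))
    final_state, (actions, was_legal) = jax.lax.scan(toy_step, init, keys)
    return final_state[0], actions, was_legal


# Batch over environments with vmap and compile the whole rollout once.
batched_rollout = jax.jit(jax.vmap(rollout, in_axes=(0, 0)))

keys = jax.random.split(jax.random.PRNGKey(0), 4)
final_cash, actions, was_legal = batched_rollout(keys, jnp.full((4,), 25.0))
```

Because unaffordable candidates are masked to negative infinity rather than removed, every array keeps the same shape at every step, which is what lets the rollout compile as one function.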

The environment supports Python 3.11 and 3.12, installs with pip, and includes a PPO smoke-train script that runs 1 update with 4 envs, rollout length 4, 1 update epoch, and hidden size 128. That’s enough to verify the pipeline works end-to-end without waiting hours.
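As a rough sense of scale, the smoke-train settings above can be written down as a small config object. The values come from the description; the class and field names are hypothetical and are not taken from the TycoonLE script.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SmokeConfig:
    # Values as stated in the article; field names are illustrative only.
    num_updates: int = 1
    num_envs: int = 4
    rollout_length: int = 4
    update_epochs: int = 1
    hidden_size: int = 128

    @property
    def transitions_per_update(self) -> int:
        # Each update collects one rollout from every parallel environment.
        return self.num_envs * self.rollout_length

    @property
    def total_env_steps(self) -> int:
        return self.num_updates * self.transitions_per_update


cfg = SmokeConfig()
```

With these settings a run collects 4 x 4 = 16 transitions in total, which is why it works as a quick pipeline check rather than a meaningful training run.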

TycoonBench and Replay UI

TycoonBench ships alongside the environment as a companion benchmark report, hosted at vrtnis.github.io/tycoonbench, for comparing agent and model performance. The replay UI, built with TypeScript and Vite, lets you inspect policies through route choices, cargo flow, financing behavior, reward, score, and profit over time. To use it, open the UI in a browser and load a replay JSON file.

Assets come from OpenGFX, the open-source graphics base for OpenTTD, so the sprite work is familiar to anyone who’s played Transport Tycoon. The repository's license is MIT.

What This Enables Next

TycoonLE fills a gap between toy gridworlds and full simulators by giving RL researchers a JAX-native playground for long-horizon, multi-objective planning. Plausible uses include curriculum learning experiments, credit assignment studies, and logistics planning benchmarks.


Source: TycoonLE: A Jax reinforcement learning environment for long-horizon planning
Domain: github.com
