AAAI Conference 2026 System Paper
RL-Studio: A System for Multi-Phase Reinforcement Learning Experimentation
- Whiyoung Jung
- Sunghoon Hong
- Deunsol Yoon
- Jeonghye Kim
- Yongjae Shin
- Suhyun Jung
- Hyundam Yoo
- Youngjin Kim
Reinforcement learning (RL) has evolved beyond monolithic training, yet existing frameworks remain limited to single algorithms or simple offline-to-online transitions. We present multi-phase RL, a framework that orchestrates multiple learning phases for continual policy improvement. Multi-phase RL enables efficient fine-tuning of pretrained policies with new data and smooth adaptation from simulation to real-world environments. To support this paradigm, we introduce RL-Studio, a platform that addresses key implementation barriers, including neural architecture mismatches between phases, the complexity of parameter transfer, and experiment management overhead. RL-Studio provides phase orchestration, transition-point monitoring, and full experiment lineage tracking. We demonstrate the effectiveness of multi-phase RL through representative scenarios and highlight RL-Studio’s capabilities.