AAAI 2026 Conference Paper
Improving Generalization in Offline Meta-Reinforcement Learning via Cross-task Contexts
- Hongcai He
- Zetao Zheng
- Anjie Zhu
- Deqiang Ouyang
- Jie Shao
Context-based offline meta-reinforcement learning (meta-RL) is a paradigm that integrates meta-learning with offline reinforcement learning. It learns a strategy to extract task-specific contexts from trajectories of meta-training tasks and leverages this strategy to adapt to unseen target tasks. However, existing methods struggle to generate generalizable contexts for adaptation due to context shift, which arises from the context-based policy overfitting to offline data. We argue that leveraging the internal relationships among tasks, rather than treating each task in isolation, is crucial for mitigating the impact of context shift. Hence, we propose a framework of cross-task contexts for improving generalization in meta-RL (CTMRL). Specifically, we design a context quantization variational auto-encoder (CQ-VAE), which clusters the task-specific contexts of meta-training tasks into discrete codes based on the internal relationships among tasks. Cross-task contexts are constructed from these codes and reflect information shared across similar tasks. These cross-task contexts not only serve as high-level structures that capture similarity across tasks but also provide a foundation for hard contrastive learning, which enhances the distinguishability of similar yet distinct tasks, thereby improving the generalization of contexts and facilitating adaptation to unseen target tasks. Evaluations in meta-environments confirm the performance advantage of CTMRL over existing methods.
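The core clustering step of CQ-VAE, mapping a continuous task context to a discrete code shared across similar tasks, can be illustrated with a minimal sketch in the style of vector quantization. The function name `quantize_context`, the toy codebook, and the nearest-neighbor lookup are illustrative assumptions, not the paper's actual encoder or training objective:

```python
import numpy as np

def quantize_context(z, codebook):
    """Map a continuous context embedding z to its nearest codebook entry.

    Returns (index, code): the index of the closest code under L2 distance
    and the code vector itself. This mimics the discrete bottleneck of a
    VQ-VAE-style quantizer; tasks whose contexts map to the same code can
    then share a cross-task context built from that code.
    """
    dists = np.linalg.norm(codebook - z, axis=1)  # distance to each code
    idx = int(np.argmin(dists))
    return idx, codebook[idx]

# Toy example: 3 codes in 2-D, and a context embedding close to code 1.
codebook = np.array([[0.0, 0.0], [1.0, 1.0], [-1.0, 2.0]])
z = np.array([0.9, 1.1])
idx, code = quantize_context(z, codebook)  # idx == 1
```

In a full implementation the codebook would be learned jointly with the context encoder (e.g., with a straight-through estimator and commitment loss, as in VQ-VAE), rather than fixed as here.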