
ICRA 2021

Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp

Conference Paper Accepted Paper Artificial Intelligence · Robotics

Abstract

Although deep reinforcement learning (RL) has been successfully applied to a variety of robotic control tasks, it is still challenging to apply it to real-world tasks, due to poor sample efficiency. Attempting to overcome this shortcoming, several works focus on reusing the collected trajectory data during training by decomposing it into a set of policy-irrelevant discrete transitions. However, their improvements are somewhat marginal since i) the amount of transitions is usually small, and ii) the value assignment only happens in the joint states. To address these issues, this paper introduces a concise yet powerful method to construct Continuous Transition, which exploits the potential transitions along the trajectory. Specifically, we propose to synthesize new transitions for training by linearly interpolating consecutive transitions. To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically. Extensive experiments demonstrate that our proposed method achieves a significant improvement in sample efficiency on various complex continuous robotic control problems in MuJoCo and outperforms advanced model-based / model-free RL methods. The source code is available.
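The core construction described above — linearly interpolating two consecutive transitions with a MixUp coefficient — can be sketched as follows. This is a minimal illustration, not the paper's implementation: the transition layout, the `alpha` parameter of the Beta distribution (in the paper, the mixing coefficient is guided by a learned discriminator), and the `continuous_transition` interface are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def continuous_transition(t0, t1, alpha=0.5, rng=rng):
    """Synthesize a new transition by MixUp-style linear interpolation
    of two consecutive transitions (s, a, r, s') from one trajectory.

    t0, t1: dicts with keys 's', 'a', 'r', 's2'; t1 follows t0.
    alpha: Beta-distribution parameter for the mixing coefficient
    (an illustrative stand-in for the paper's discriminator-guided choice).
    """
    lam = rng.beta(alpha, alpha)  # mixing coefficient in [0, 1]
    mix = lambda x, y: lam * np.asarray(x) + (1.0 - lam) * np.asarray(y)
    return {k: mix(t0[k], t1[k]) for k in ("s", "a", "r", "s2")}

# Two consecutive transitions from one trajectory (toy values)
t0 = {"s": np.array([0.0, 1.0]), "a": np.array([0.5]),
      "r": 1.0, "s2": np.array([0.2, 1.1])}
t1 = {"s": np.array([0.2, 1.1]), "a": np.array([0.3]),
      "r": 0.5, "s2": np.array([0.4, 1.3])}

synth = continuous_transition(t0, t1)
```

Because the interpolation is convex, every field of the synthetic transition lies between the corresponding fields of the two real transitions, which is what keeps the augmented data close to the trajectory manifold.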

Authors

Keywords

  • Training
  • Interpolation
  • Temperature distribution
  • Codes
  • Automation
  • Conferences
  • Reinforcement learning
  • Control Problem
  • Sampling Efficiency
  • Continuous Transition
  • Improve Sample Efficiency
  • Continuous Control Problem
  • Deep Learning
  • Linear Interpolation
  • Control Task
  • Poor Efficiency
  • Robot Control
  • Deep Reinforcement Learning
  • Trajectory Data
  • Reinforcement Learning Methods
  • Joint State
  • Assignment Of Values
  • Robotic Tasks
  • Model-free Reinforcement Learning
  • Training Data
  • Manifold
  • Deep Neural Network
  • Beta Distribution
  • Data Augmentation
  • Tolerance Values
  • Cheetah
  • Effect Of Different Values
  • Policy Gradient Method
  • Random Pairs
  • Image Classification

Context

Venue
IEEE International Conference on Robotics and Automation
Archive span
1984-2025
Indexed papers
30179
Paper id
1015174739259579747