Author name cluster

Ashish Malik

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers

TMLR Journal 2025 Journal Article

Low Compute Unlearning via Sparse Representations

  • Vedant Shah
  • Frederik Träuble
  • Ashish Malik
  • Hugo Larochelle
  • Michael Curtis Mozer
  • Sanjeev Arora
  • Yoshua Bengio
  • Anirudh Goyal

Machine unlearning, which involves erasing knowledge about a forget set from a trained model, can prove to be costly and infeasible using existing techniques. We propose a low-compute unlearning technique based on a discrete representational bottleneck. We show that the proposed technique efficiently unlearns the forget set and incurs negligible damage to the model's performance on the rest of the dataset. We evaluate the proposed technique on the problem of class unlearning using four datasets: CIFAR-10, CIFAR-100, LACUNA-100 and ImageNet-1k. We compare the proposed technique to SCRUB, a state-of-the-art approach which uses knowledge distillation for unlearning. Across all four datasets, the proposed technique performs as well as, if not better than, SCRUB while incurring almost no computational cost.

IROS Conference 2024 Conference Paper

Interruptive Language Control of Bipedal Locomotion

  • Ashish Malik
  • Stefan Lee
  • Alan Fern

We study the problem of natural language-based control of dynamic bipedal locomotion from the perspective of operational robustness and hardware safety. Existing work on natural language-based robot control has focused on episodic command execution for stable robot platforms, such as fixed-base manipulators in table-top scenarios. These scenarios feature non-overlapping phases of instruction and execution, with execution mishaps usually posing no threat to the robot's safety. This allows for non-trivial failure rates to be acceptable. In contrast, our work involves indistinguishable instruction and execution stages for a dynamically unstable robot where execution failures can harm the robot. For example, interrupting a bipedal robot with a new instruction in certain states may cause it to fall. Our first contribution is to design and train a natural language-based controller for the bipedal robot Cassie that can take in new language commands at any time. Our second contribution is to introduce a protocol for evaluating the robustness of such controllers to interruptions and to evaluate the learned controller in simulation under different interruption distributions. Our third contribution is to learn a detector for interruptions that are likely to lead to failure and to integrate that detector into a failure mitigation strategy. Overall, our results show that interruptions can lead to non-trivial failure rates for the original controller and that the proposed mitigation strategy can help to significantly reduce that rate.

IROS Conference 2022 Conference Paper

Learning Dynamic Bipedal Walking Across Stepping Stones

  • Helei Duan
  • Ashish Malik
  • Mohitvishnu S. Gadde
  • Jeremy Dao
  • Alan Fern
  • Jonathan W. Hurst

In this work, we propose a learning approach for 3D dynamic bipedal walking when footsteps are constrained to stepping stones. While recent work has shown progress on this problem, real-world demonstrations have been limited to relatively simple open-loop, perception-free scenarios. Our main contribution is a more advanced learning approach that enables real-world demonstrations, using the Cassie robot, of closed-loop dynamic walking over moderately difficult stepping-stone patterns. Our approach first uses reinforcement learning (RL) in simulation to train a controller that maps footstep commands onto joint actions without any reference motion information. We then learn a model of that controller's capabilities, which enables prediction of feasible footsteps given the robot's current dynamic state. The resulting controller and model are then integrated with a real-time overhead camera system for detecting stepping-stone locations. For evaluation, we develop a benchmark set of stepping-stone patterns, which are used to test performance in both simulation and the real world. Overall, we demonstrate that sim-to-real learning is extremely promising for enabling dynamic locomotion over stepping stones. We also identify remaining challenges that motivate important future research directions.

ICRA Conference 2022 Conference Paper

Sim-to-Real Learning of Footstep-Constrained Bipedal Dynamic Walking

  • Helei Duan
  • Ashish Malik
  • Jeremy Dao
  • Aseem Saxena
  • Kevin Green
  • Jonah Siekmann
  • Alan Fern
  • Jonathan W. Hurst

Recently, work on reinforcement learning (RL) for bipedal robots has successfully learned controllers for a variety of dynamic gaits with robust sim-to-real demonstrations. In order to maintain balance, the learned controllers have full freedom of where to place the feet, resulting in highly robust gaits. In the real world, however, the environment will often impose constraints on the feasible footstep locations, typically identified by perception systems. Unfortunately, most demonstrated RL controllers on bipedal robots do not allow for specifying and responding to such constraints. This missing control interface greatly limits the real-world application of current RL controllers. In this paper, we aim to maintain the robust and dynamic nature of learned gaits while also respecting footstep constraints imposed externally. We develop an RL formulation for training dynamic gait controllers that can respond to specified touchdown locations. We then successfully demonstrate simulation and sim-to-real performance on the bipedal robot Cassie. In addition, we use supervised learning to induce a transition model for accurately predicting the next touchdown locations that the controller can achieve given the robot's proprioceptive observations. This model paves the way for integrating the learned controller into a full-order robot locomotion planner that robustly satisfies both balance and environmental constraints.