Author name cluster

Gregory Kahn

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
1 author row

Possible papers

ICRA Conference 2024 Conference Paper

Open X-Embodiment: Robotic Learning Datasets and RT-X Models: Open X-Embodiment Collaboration

  • Abby O'Neill
  • Abdul Rehman
  • Abhiram Maddukuri
  • Abhishek Gupta 0004
  • Abhishek Padalkar
  • Abraham Lee
  • Acorn Pooley
  • Agrim Gupta

Large, high-capacity models trained on diverse datasets have shown remarkable success in efficiently tackling downstream applications. In domains from NLP to Computer Vision, this has led to a consolidation of pretrained models, with general pretrained backbones serving as a starting point for many applications. Can such a consolidation happen in robotics? Conventionally, robotic learning methods train a separate model for every application, every robot, and even every environment. Can we instead train a "generalist" X-robot policy that can be adapted efficiently to new robots, tasks, and environments? In this paper, we provide datasets in standardized data formats and models to make it possible to explore this possibility in the context of robotic manipulation, alongside experimental results that provide an example of effective X-robot policies. We assemble a dataset from 22 different robots collected through a collaboration between 21 institutions, demonstrating 527 skills (160266 tasks). We show that a high-capacity model trained on this data, which we call RT-X, exhibits positive transfer and improves the capabilities of multiple robots by leveraging experience from other platforms. The project website is robotics-transformer-x.github.io.

ICRA Conference 2021 Conference Paper

ViNG: Learning Open-World Navigation with Visual Goals

  • Dhruv Shah
  • Benjamin Eysenbach
  • Gregory Kahn
  • Nicholas Rhinehart
  • Sergey Levine

We propose a learning-based navigation system for reaching visually indicated goals and demonstrate this system on a real mobile robot platform. Learning provides an appealing alternative to conventional methods for robotic navigation: instead of reasoning about environments in terms of geometry and maps, learning can enable a robot to learn about navigational affordances, understand what types of obstacles are traversable (e.g., tall grass) or not (e.g., walls), and generalize over patterns in the environment. However, unlike conventional planning algorithms, it is harder to change the goal for a learned policy during deployment. We propose a method for learning to navigate towards a goal image of the desired destination. By combining a learned policy with a topological graph constructed out of previously observed data, our system can determine how to reach this visually indicated goal even in the presence of variable appearance and lighting. Three key insights, waypoint proposal, graph pruning and negative mining, enable our method to learn to navigate in real-world environments using only offline data, a setting where prior methods struggle. We instantiate our method on a real outdoor ground robot and show that our system, which we call ViNG, outperforms previously-proposed methods for goal-conditioned reinforcement learning, including other methods that incorporate reinforcement learning and search. We also study how ViNG generalizes to unseen environments and evaluate its ability to adapt to such an environment with growing experience. Finally, we demonstrate ViNG on a number of real-world applications, such as last-mile delivery and warehouse inspection. We encourage the reader to visit the project website for videos of our experiments and demonstrations.

ICRA Conference 2019 Conference Paper

Generalization through Simulation: Integrating Simulated and Real Data into Deep Reinforcement Learning for Vision-Based Autonomous Flight

  • Katie Kang
  • Suneel Belkhale
  • Gregory Kahn
  • Pieter Abbeel
  • Sergey Levine

Deep reinforcement learning provides a promising approach for vision-based control of real-world robots. However, the generalization of such models depends critically on the quantity and variety of data available for training. This data can be difficult to obtain for some types of robotic systems, such as fragile, small-scale quadrotors. Simulated rendering and physics can provide for much larger datasets, but such data is inherently of lower quality: many of the phenomena that make the real-world autonomous flight problem challenging, such as complex physics and air currents, are modeled poorly or not at all, and the systematic differences between simulation and the real world are typically impossible to eliminate. In this work, we investigate how data from both simulation and the real world can be combined in a hybrid deep reinforcement learning algorithm. Our method uses real-world data to learn about the dynamics of the system, and simulated data to learn a generalizable perception system that can enable the robot to avoid collisions using only a monocular camera. We demonstrate our approach on a real-world nano aerial vehicle collision avoidance task, showing that with only an hour of real-world data, the quadrotor can avoid collisions in new environments with various lighting conditions and geometry. Code, instructions for building the aerial vehicles, and videos of the experiments can be found at github.com/gkahn13/GtS.

ICRA Conference 2019 Conference Paper

Robustness to Out-of-Distribution Inputs via Task-Aware Generative Uncertainty

  • Rowan McAllister
  • Gregory Kahn
  • Jeff Clune
  • Sergey Levine

Deep learning provides a powerful tool for robotic perception in the open world. However, real-world robotic systems, especially mobile robots, must be able to react intelligently and safely even in unexpected circumstances. This requires a system that knows what it knows, and can estimate its own uncertainty for unfamiliar, out-of-distribution observations. Approximate Bayesian approaches are commonly used to estimate uncertainty for neural network predictions, but struggle with out-of-distribution observations. Generative models can in principle detect out-of-distribution observations as those with a low estimated density, but are overly pessimistic as an uncertainty measure, since the mere presence of an out-of-distribution input does not by itself indicate an unsafe situation. Intuitively, we would like a perception system that can detect when task-salient parts of the image are unfamiliar or uncertain, while ignoring task-irrelevant features. In this paper, we present a method for uncertainty-aware robotic perception that combines generative modeling and model uncertainty. Our method estimates an uncertainty measure about the model's prediction, taking into account an explicit generative model of the observation distribution to handle out-of-distribution inputs. We evaluate our method on an action-conditioned collision prediction task with both simulated and real data, and demonstrate that our approach improves on a variety of Bayesian neural network techniques.

IROS Conference 2018 Conference Paper

Learning Image-Conditioned Dynamics Models for Control of Underactuated Legged Millirobots

  • Anusha Nagabandi
  • Guangzhao Yang
  • Thomas Asmar
  • Ravi Pandya
  • Gregory Kahn
  • Sergey Levine
  • Ronald S. Fearing

Millirobots are a promising robotic platform for many applications due to their small size and low manufacturing costs. Legged millirobots, in particular, can provide increased mobility in complex environments and improved scaling of obstacles. However, controlling these small, highly dynamic, and underactuated legged systems is difficult. Hand-engineered controllers can sometimes control these legged millirobots, but they have difficulties with dynamic maneuvers and complex terrains. We present an approach for controlling a real-world legged millirobot that is based on learned neural network models. Using less than 17 minutes of data, our method can learn a predictive model of the robot's dynamics that can enable effective gaits to be synthesized on the fly for following user-specified waypoints on a given terrain. Furthermore, by leveraging expressive, high-capacity neural network models, our approach allows for these predictions to be directly conditioned on camera images, endowing the robot with the ability to predict how different terrains might affect its dynamics. This enables sample-efficient and effective learning for locomotion of a dynamic legged millirobot on various terrains, including gravel, turf, carpet, and styrofoam. Videos and further details can be found at https://sites.google.com/view/imageconddyn.

ICRA Conference 2018 Conference Paper

Neural Network Dynamics for Model-Based Deep Reinforcement Learning with Model-Free Fine-Tuning

  • Anusha Nagabandi
  • Gregory Kahn
  • Ronald S. Fearing
  • Sergey Levine

Model-free deep reinforcement learning algorithms have been shown to be capable of learning a wide range of robotic skills, but typically require a very large number of samples to achieve good performance. Model-based algorithms, in principle, can provide for much more efficient learning, but have proven difficult to extend to expressive, high-capacity models such as deep neural networks. In this work, we demonstrate that neural network dynamics models can in fact be combined with model predictive control (MPC) to achieve excellent sample complexity in a model-based reinforcement learning algorithm, producing stable and plausible gaits that accomplish various complex locomotion tasks. We further propose using deep neural network dynamics models to initialize a model-free learner, in order to combine the sample efficiency of model-based approaches with the high task-specific performance of model-free methods. We empirically demonstrate on MuJoCo locomotion tasks that our pure model-based approach trained on just random action data can follow arbitrary trajectories with excellent sample efficiency, and that our hybrid algorithm can accelerate model-free learning on high-speed benchmark tasks, achieving sample efficiency gains of 3-5× on swimmer, cheetah, hopper, and ant agents. Videos can be found at https://sites.google.com/view/mbmf.

ICRA Conference 2018 Conference Paper

Self-Supervised Deep Reinforcement Learning with Generalized Computation Graphs for Robot Navigation

  • Gregory Kahn
  • Adam Villaflor
  • Bosen Ding
  • Pieter Abbeel
  • Sergey Levine

Enabling robots to autonomously navigate complex environments is essential for real-world deployment. Prior methods approach this problem by having the robot maintain an internal map of the world, and then use a localization and planning method to navigate through the internal map. However, these approaches often include a variety of assumptions, are computationally intensive, and do not learn from failures. In contrast, learning-based methods improve as the robot acts in the environment, but are difficult to deploy in the real world due to their high sample complexity. To address the need to learn complex policies with few samples, we propose a generalized computation graph that subsumes value-based model-free methods and model-based methods, with specific instantiations interpolating between model-free and model-based. We then instantiate this graph to form a navigation model that learns from raw images and is sample efficient. Our simulated car experiments explore the design decisions of our navigation model, and show our approach outperforms single-step and N-step double Q-learning. We also evaluate our approach on a real-world RC car and show it can learn to navigate through a complex indoor environment with a few hours of fully autonomous, self-supervised training. Videos of the experiments and code can be found at github.com/gkahn13/gcg.

ICRA Conference 2017 Conference Paper

PLATO: Policy learning using adaptive trajectory optimization

  • Gregory Kahn
  • Tianhao Zhang 0001
  • Sergey Levine
  • Pieter Abbeel

Policy search can in principle acquire complex strategies for control of robots and other autonomous systems. When the policy is trained to process raw sensory inputs, such as images and depth maps, it can also acquire a strategy that combines perception and control. However, effectively processing such complex inputs requires an expressive policy class, such as a large neural network. These high-dimensional policies are difficult to train, especially when learning to control safety-critical systems. We propose PLATO, a continuous, reset-free reinforcement learning algorithm that trains complex control policies with supervised learning, using model-predictive control (MPC) to generate the supervision, so that it never needs to run a partially trained and potentially unsafe policy. PLATO uses an adaptive training method to modify the behavior of MPC to gradually match the learned policy in order to generate training samples at states that are likely to be visited by the learned policy. PLATO also maintains the MPC cost as an objective to avoid highly undesirable actions that would result from strictly following the learned policy before it has been fully trained. We prove that this type of adaptive MPC expert produces supervision that leads to good long-horizon performance of the resulting policy. We also empirically demonstrate that MPC can still avoid dangerous on-policy actions in unexpected situations during training. Our empirical results on a set of challenging simulated aerial vehicle tasks demonstrate that, compared to prior methods, PLATO learns faster, experiences substantially fewer catastrophic failures (crashes) during training, and often converges to a better policy.

ICRA Conference 2016 Conference Paper

Learning deep control policies for autonomous aerial vehicles with MPC-guided policy search

  • Tianhao Zhang 0001
  • Gregory Kahn
  • Sergey Levine
  • Pieter Abbeel

Model predictive control (MPC) is an effective method for controlling robotic systems, particularly autonomous aerial vehicles such as quadcopters. However, application of MPC can be computationally demanding, and typically requires estimating the state of the system, which can be challenging in complex, unstructured environments. Reinforcement learning can in principle forego the need for explicit state estimation and acquire a policy that directly maps sensor readings to actions, but is difficult to apply to unstable systems that are liable to fail catastrophically during training before an effective policy has been found. We propose to combine MPC with reinforcement learning in the framework of guided policy search, where MPC is used to generate data at training time, under full state observations provided by an instrumented training environment. This data is used to train a deep neural network policy, which is allowed to access only the raw observations from the vehicle's onboard sensors. After training, the neural network policy can successfully control the robot without knowledge of the full state, and at a fraction of the computational cost of MPC. We evaluate our method by learning obstacle avoidance policies for a simulated quadrotor, using simulated onboard sensors and no explicit state estimation at test time.

IROS Conference 2016 Conference Paper

Occlusion-aware multi-robot 3D tracking

  • Karol Hausman
  • Gregory Kahn
  • Sachin Patil
  • Jörg Müller 0004
  • Ken Goldberg
  • Pieter Abbeel
  • Gaurav S. Sukhatme

We introduce an optimization-based control approach that enables a team of robots to cooperatively track a target using onboard sensing. In this setting, the robots are required to estimate their own positions as well as concurrently track the target. Our probabilistic method generates controls that minimize the expected uncertainty of the target. Additionally, our method efficiently reasons about occlusions between robots and takes them into account for the control generation. We evaluate our approach in a number of experiments in which we simulate a team of quadrotor robots flying in three-dimensional space to track a moving target on the ground. We compare our method to other state-of-the-art approaches represented by the random sampling technique, lattice planning method, and our previous method. Our experimental results indicate that our method achieves up to 8 times smaller maximum tracking error and up to 2 times smaller average tracking error than the next best approach in the presented scenarios.

ICRA Conference 2015 Conference Paper

Active exploration using trajectory optimization for robotic grasping in the presence of occlusions

  • Gregory Kahn
  • Peter Sujan
  • Sachin Patil
  • Shaunak D. Bopardikar
  • Julian Ryde
  • Ken Goldberg
  • Pieter Abbeel

We consider the task of actively exploring unstructured environments to facilitate robotic grasping of occluded objects. Typically, the geometry and locations of these objects are not known a priori. We mount an RGB-D sensor on the robot gripper to maintain a 3D voxel map of the environment during exploration. The objective is to plan the motion of the sensor in order to search for feasible grasp handles that lie within occluded regions of the map. In contrast to prior work that generates exploration trajectories by sampling, we directly optimize the exploration trajectory to find grasp handles. Since it is challenging to optimize over the discrete voxel map, we encode the uncertainty of the positions of the occluded grasp handles as a mixture of Gaussians, one per occluded region. Our trajectory optimization approach encourages exploration by penalizing a measure of the uncertainty. We then plan a collision-free trajectory for the robot arm to the detected grasp handle. We evaluated our approach by actively exploring and attempting 300 grasps. Our experiments suggest that compared to the baseline method of sampling 10 trajectories, which successfully grasped 58% of the objects, our active exploration formulation with trajectory optimization successfully grasped 93% of the objects, was 1.3× faster, and had 3.2× fewer failed grasp attempts.

ICRA Conference 2014 Conference Paper

Autonomous multilateral debridement with the Raven surgical robot

  • Ben Kehoe
  • Gregory Kahn
  • Jeffrey Mahler
  • Jonathan Kim
  • Alex X. Lee
  • Anna Lee
  • Keisuke Nakagawa
  • Sachin Patil

Autonomous robot execution of surgical sub-tasks has the potential to reduce surgeon fatigue and facilitate supervised tele-surgery. This paper considers the sub-task of surgical debridement: removing dead or damaged tissue fragments to allow the remaining healthy tissue to heal. We present an autonomous multilateral surgical debridement system using the Raven, an open-architecture surgical robot with two cable-driven 7 DOF arms. Our system combines stereo vision for 3D perception with trajopt, an optimization-based motion planner, and model predictive control (MPC). Laboratory experiments involving sensing, grasping, and removal of 120 fragments suggest that an autonomous surgical robot can achieve robustness comparable to human performance. Our robot system demonstrated the advantage of multilateral systems, as the autonomous execution was 1.5× faster with two arms than with one; however, it was two to three times slower than a human. Execution speed could be improved with better state estimation that would allow more travel between MPC steps and fewer MPC replanning cycles. The three primary contributions of this paper are: (1) introducing debridement as a sub-task of interest for surgical robotics, (2) demonstrating the first reliable autonomous robot performance of a surgical sub-task using the Raven, and (3) reporting experiments that highlight the importance of accurate state estimation for future research. Further information including code, photos, and video is available at: http://rll.berkeley.edu/raven.