Author name cluster

Daniel Freeman

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

IROS Conference 2024 Conference Paper

The Design of the Barkour Benchmark for Robot Agility

Wenhao Yu 0003
Ken Caluwaerts
Atil Iscen
J. Chase Kew
Tingnan Zhang
Daniel Freeman
Lisa Lee
Stefano Saliceti

In this paper, we describe the design of the Barkour benchmark for measuring robot agility in navigating complex environments. Despite the growing interest in developing agile robot locomotion skills, the field lacks systematic benchmarks to measure the performance of robotic control systems and hardware in agility-focused tasks. This motivated us to propose the Barkour benchmark, an obstacle course designed to quantify agility across various robotic platforms. Inspired by dog agility competitions, the course features diverse obstacles and a time-based scoring mechanism, encouraging researchers to develop controllers that enable robots to move quickly, precisely, and with adaptability. This benchmark is challenging as it demands diverse motion skills and the time-based scoring requires control precision at high speed. Along with the design details presented in the paper, we release our simulated environment setups in MuJoCo-XLA and the CAD model of a custom-designed quadruped robot to facilitate future research to reproduce the Barkour setup (available at sites.google.com/view/barkour). We hope these together will accelerate the pace of robot agility research.


ICML Conference 2022 Conference Paper

Blocks Assemble! Learning to Assemble with Large-Scale Structured Reinforcement Learning

Seyed Kamyar Seyed Ghasemipour
Satoshi Kataoka
Byron David
Daniel Freeman
Shixiang Gu
Igor Mordatch

Assembly of multi-part physical structures is both a valuable end product for autonomous robotics and a valuable diagnostic task for open-ended training of embodied intelligent agents. We introduce a naturalistic physics-based environment with a set of connectable magnet blocks inspired by children’s toy kits. The objective is to assemble blocks into a succession of target blueprints. Despite the simplicity of this objective, the compositional nature of building diverse blueprints from a set of blocks leads to an explosion of complexity in structures that agents encounter. Furthermore, assembly stresses agents’ multi-step planning, physical reasoning, and bimanual coordination. We find that the combination of large-scale reinforcement learning and graph-based policies – surprisingly without any additional complexity – is an effective recipe for training agents that not only generalize to complex unseen blueprints in a zero-shot manner, but even operate in a reset-free setting without being trained to do so. Through extensive experiments, we highlight the importance of large-scale training, structured representations, contributions of multi-task vs. single-task learning, as well as the effects of curriculums, and discuss qualitative behaviors of trained agents. Our accompanying project webpage can be found at: https://sites.google.com/view/learning-direct-assembly/home


NeurIPS Conference 2022 Conference Paper

Multi-Game Decision Transformers

Kuang-Huei Lee
Ofir Nachum
Mengjiao (Sherry) Yang
Lisa Lee
Daniel Freeman
Sergio Guadarrama
Ian Fischer
Winnie Xu

A longstanding goal of the field of AI is a method for learning a highly capable, generalist agent from diverse experience. In the subfields of vision and language, this was largely achieved by scaling up transformer-based models and training them on large, diverse datasets. Motivated by this progress, we investigate whether the same strategy can be used to produce generalist reinforcement learning agents. Specifically, we show that a single transformer-based model – with a single set of weights – trained purely offline can play a suite of up to 46 Atari games simultaneously at close-to-human performance. When trained and evaluated appropriately, we find that the same trends observed in language and vision hold, including scaling of performance with model size and rapid adaptation to new games via fine-tuning. We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning, and find that our Multi-Game Decision Transformer models offer the best scalability and performance. We release the pre-trained models and code to encourage further research in this direction.


NeurIPS Conference 2021 Conference Paper

Brax - A Differentiable Physics Engine for Large Scale Rigid Body Simulation

Daniel Freeman
Erik Frey
Anton Raichuk
Sertan Girgin
Igor Mordatch
Olivier Bachem

We present Brax, an open source library for rigid body simulation with a focus on performance and parallelism on accelerators, written in JAX. We present results on a suite of tasks inspired by the existing reinforcement learning literature, but remade in our engine. Additionally, we provide reimplementations of PPO, SAC, ES, and direct policy optimization in JAX that compile alongside our environments, allowing the learning algorithm and the environment processing to occur on the same device, and to scale seamlessly on accelerators. Finally, we include notebooks that facilitate training of performant policies on common MuJoCo-like tasks in minutes.


NeurIPS Conference 2019 Conference Paper

Learning to Predict Without Looking Ahead: World Models Without Forward Prediction

Daniel Freeman
David Ha
Luke Metz

Much of model-based reinforcement learning involves learning a model of an agent's world, and training an agent to leverage this model to perform a task more efficiently. While these models are demonstrably useful for agents, every naturally occurring model of the world of which we are aware (e.g., a brain) arose as the byproduct of competing evolutionary pressures for survival, not minimization of a supervised forward-predictive loss via gradient descent. That useful models can arise out of the messy and slow optimization process of evolution suggests that forward-predictive modeling can arise as a side-effect of optimization under the right circumstances. Crucially, this optimization process need not explicitly be a forward-predictive loss. In this work, we introduce a modification to traditional reinforcement learning which we call observational dropout, whereby we limit the agent's ability to observe the real environment at each timestep. In doing so, we can coerce an agent into learning a world model to fill in the observation gaps during reinforcement learning. We show that the emerged world model, while not explicitly trained to predict the future, can help the agent learn key skills required to perform well in its environment. Videos of our results are available at https://learningtopredict.github.io/
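
The observational dropout idea described in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `observational_dropout_rollout`, the `peek_prob` parameter, and the toy one-dimensional environment are all hypothetical, chosen only to show the control flow. At each timestep the agent receives the real observation with probability `peek_prob`; otherwise it receives the world model's output, computed from the observation the agent last saw.

```python
import random


def observational_dropout_rollout(env_step, world_model, policy,
                                  init_obs, peek_prob, steps, rng):
    """Roll out a policy that sees the real observation only with
    probability peek_prob (hypothetical names, illustrative only)."""
    real_obs = init_obs
    agent_obs = init_obs  # assumption: the first observation is always real
    peeked = []
    for _ in range(steps):
        action = policy(agent_obs)
        # The real environment always advances, whether or not it is observed.
        real_obs = env_step(real_obs, action)
        if rng.random() < peek_prob:
            # The agent is allowed to look at the real environment.
            agent_obs = real_obs
            peeked.append(True)
        else:
            # The world model fills the gap from what the agent last saw.
            agent_obs = world_model(agent_obs, action)
            peeked.append(False)
    return agent_obs, real_obs, peeked


# Toy setup (assumed): the state is a number and the action is added to it.
env_step = lambda state, action: state + action
perfect_model = lambda obs, action: obs + action
always_one = lambda obs: 1

final_agent_obs, final_real_obs, peeked = observational_dropout_rollout(
    env_step, perfect_model, always_one,
    init_obs=0, peek_prob=0.0, steps=5, rng=random.Random(0))
```

With `peek_prob=0.0` the agent never sees the real environment, so its observations stay accurate only if the world model tracks the true dynamics. In the paper's setting the world model is shaped by the task reward rather than a prediction loss; this sketch covers the rollout only, not the training.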
