Author name cluster

Aaron Walsman

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers

2 author rows

NeurIPS Conference 2025 Conference Paper

Convergent Functions, Divergent Forms

Hyeonseong Jeon
Ainaz Eftekhar
Aaron Walsman
Kuo-Hao Zeng
Ali Farhadi
Ranjay Krishna

We introduce LOKI, a compute-efficient framework for co-designing morphologies and control policies that generalize across unseen tasks. Inspired by biological adaptation, in which animals quickly adjust to morphological changes, our method overcomes the inefficiencies of traditional evolutionary and quality-diversity algorithms. We propose learning convergent functions: shared control policies trained across clusters of morphologically similar designs in a learned latent space, drastically reducing the training cost per design. Simultaneously, we promote divergent forms by replacing mutation with dynamic local search, enabling broader exploration and preventing premature convergence. The policy reuse allows us to explore $\sim 780\times$ more designs using 78% fewer simulation steps and 40% less compute per design. Local competition paired with a broader search results in a diverse set of high-performing final morphologies. Using the UNIMAL design space and a flat-terrain locomotion task, LOKI discovers a rich variety of designs (ranging from quadrupeds to crabs, bipedals, and spinners) far more diverse than those produced by prior work. These morphologies also transfer better to unseen downstream tasks in agility, stability, and manipulation domains (e.g., $2\times$ higher reward on bump and push box incline tasks). Overall, our approach produces designs that are both diverse and adaptable, with substantially greater sample efficiency than existing co-design methods.

TMLR Journal 2023 Journal Article

FLUID: A Unified Evaluation Framework for Flexible Sequential Data

Matthew Wallingford
Aditya Kusupati
Keivan Alizadeh-Vahid
Aaron Walsman
Aniruddha Kembhavi
Ali Farhadi

Modern machine learning methods excel when training data is IID, large-scale, and well labeled. Learning in less ideal conditions remains an open challenge. The sub-fields of few-shot, continual, transfer, and representation learning have made substantial strides in learning under adverse conditions, each affording distinct advantages through methods and insights. These methods address different challenges, such as data arriving sequentially or scarce training examples; however, the difficult conditions an ML system will face over its lifetime often cannot be anticipated prior to deployment. Therefore, general ML systems that can handle the many challenges of learning in practical settings are needed. To foster research towards the goal of general ML methods, we introduce a new unified evaluation framework: FLUID (Flexible Sequential Data). FLUID integrates the objectives of few-shot, continual, transfer, and representation learning while enabling comparison and integration of techniques across these subfields. In FLUID, a learner faces a stream of data and must make sequential predictions while choosing how to update itself, adapt quickly to novel classes, and deal with changing data distributions, all while accounting for the total amount of compute. We conduct experiments on a broad set of methods, which shed new light on the advantages and limitations of current techniques and indicate new research problems to solve. As a starting point towards more general methods, we present two new baselines which outperform other evaluated methods on FLUID.

ICLR Conference 2023 Conference Paper

Impossibly Good Experts and How to Follow Them

Aaron Walsman
Muru Zhang
Sanjiban Choudhury
Dieter Fox
Ali Farhadi

We consider the sequential decision-making problem of learning from an expert that has access to more information than the learner. For many problems, this extra information will enable the expert to achieve greater long-term reward than any policy without this privileged information access. We call these experts "Impossibly Good" because no learning algorithm will be able to reproduce their behavior. However, in these settings it is reasonable to attempt to recover the best policy possible given the agent's restricted access to information. We provide a set of necessary criteria on the expert that will allow a learner to recover the optimal policy in the reduced information space from the expert's advice alone. We also provide a new approach called ELF Distillation (Explorer Learning from Follower) that can be used in cases where these criteria are not met and environmental rewards must be taken into account. We show that this algorithm performs better than a variety of strong baselines on a challenging suite of Minigrid and Vizdoom environments.

IROS Conference 2019 Conference Paper

EARLY FUSION for Goal Directed Robotic Vision

Aaron Walsman
Yonatan Bisk
Saadia Gabriel
Dipendra Misra
Yoav Artzi
Yejin Choi
Dieter Fox

Building perceptual systems for robotics which perform well under tight computational budgets requires novel architectures which rethink the traditional computer vision pipeline. Modern vision architectures require the agent to build a summary representation of the entire scene, even if most of the input is irrelevant to the agent’s current goal. In this work, we flip this paradigm by introducing EARLYFUSION vision models that condition on a goal to build custom representations for downstream tasks. We show that these goal-specific representations can be learned more quickly, are substantially more parameter efficient, and are more robust than existing attention mechanisms in our domain. We demonstrate the effectiveness of these methods on a simulated item retrieval problem that is trained in a fully end-to-end manner via imitation learning.

ICRA Conference 2019 Conference Paper

Part Segmentation for Highly Accurate Deformable Tracking in Occlusions via Fully Convolutional Neural Networks

Weilin Wan
Aaron Walsman
Dieter Fox

Successfully tracking the human body is an important perceptual challenge for robots that must work around people. Existing methods fall into two broad categories: geometric tracking and direct pose estimation using machine learning. While recent work has shown direct estimation techniques can be quite powerful, geometric tracking methods using point clouds can provide a very high level of 3D accuracy, which is necessary for many robotic applications. However, these approaches can have difficulty in clutter when large portions of the subject are occluded. To overcome this limitation, we propose a solution based on fully convolutional neural networks (FCN). We develop an optimized Fast-FCN network architecture for our application which allows us to filter observed point clouds and improve tracking accuracy while maintaining interactive frame rates. We also show that this model can be trained with a limited number of examples and almost no manual labelling by using an existing geometric tracker and data augmentation to automatically generate segmentation maps. We demonstrate the accuracy of our full system by comparing it against an existing geometric tracker, and show significant improvement in these challenging scenarios.
