Arrow Research search

Author name cluster

Harrison Edwards

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers

4

ICLR Conference 2024 Conference Paper

Let's Verify Step by Step

  • Hunter Lightman
  • Vineet Kosaraju
  • Yuri Burda
  • Harrison Edwards
  • Bowen Baker
  • Teddy Lee
  • Jan Leike
  • John Schulman

In recent years, large language models have greatly improved in their ability to perform complex multi-step reasoning. However, even state-of-the-art models still regularly produce logical mistakes. To train more reliable models, we can turn either to outcome supervision, which provides feedback for a final result, or process supervision, which provides feedback for each intermediate reasoning step. Given the importance of training reliable models, and given the high cost of human feedback, it is important to carefully compare the both methods. Recent work has already begun this comparison, but many questions still remain. We conduct our own investigation, finding that process supervision significantly outperforms outcome supervision for training models to solve problems from the challenging MATH dataset. Our process-supervised model solves 78% of problems from a representative subset of the MATH test set. Additionally, we show that active learning significantly improves the efficacy of process supervision. To support related research, we also release PRM800K, the complete dataset of 800,000 step-level human feedback labels used to train our best reward model.

IROS Conference 2021 Conference Paper

Autonomous object harvesting using synchronized optoelectronic microrobots

  • Christopher Bendkowski
  • Laurent Mennillo
  • Tao Xu
  • Mohamed Elsayed 0008
  • Filip Stojic
  • Harrison Edwards
  • Shuailong Zhang
  • Cindi Morshead

Optoelectronic tweezer-driven microrobots (OETdMs) are a versatile micromanipulation technology based on the application of light induced dielectrophoresis to move small dielectric structures (microrobots) across a photoconductive substrate. The microrobots in turn can be used to exert forces on secondary objects and carry out a wide range of micromanipulation operations, including collecting, transporting and depositing microscopic cargos. In contrast to alternative (direct) micromanipulation techniques, OETdMs are relatively gentle, making them particularly well suited to interacting with sensitive objects such as biological cells. However, at present such systems are used exclusively under manual control by a human operator. This limits the capacity for simultaneous control of multiple microrobots, reducing both experimental throughput and the possibility of cooperative multi-robot operations. In this article, we describe an approach to automated targeting and path planning to enable open-loop control of multiple microrobots. We demonstrate the performance of the method in practice, using microrobots to simultaneously collect, transport and deposit silica microspheres. Using computational simulations based on real microscopic image data, we investigate the capacity of microrobots to collect target cells from within a dissociated tissue culture. Our results indicate the feasibility of using OETdMs to autonomously carry out micromanipulation tasks within complex, unstructured environments.

AAMAS Conference 2018 Conference Paper

Evaluating Generalization in Multiagent Systems using Agent-Interaction Graphs

  • Aditya Grover
  • Maruan Al-Shedivat
  • Jayesh K. Gupta
  • Yuri Burda
  • Harrison Edwards

Learning from interactions between agents is a key component for inference in multiagent systems. Depending on the downstream task, there could be multiple criteria for evaluating the generalization performance of learning. In this work, we propose a novel framework for evaluating generalization in multiagent systems based on agent-interaction graphs. An agent-interaction graph models agents as nodes and interactions as hyper-edges between participating agents. Using this abstract data structure, we define three notions of generalization for principled evaluation of learning in multiagent systems.

ICML Conference 2018 Conference Paper

Learning Policy Representations in Multiagent Systems

  • Aditya Grover
  • Maruan Al-Shedivat
  • Jayesh K. Gupta
  • Yuri Burda
  • Harrison Edwards

Modeling agent behavior is central to understanding the emergence of complex phenomena in multiagent systems. Prior work in agent modeling has largely been task-specific and driven by hand-engineering domain-specific prior knowledge. We propose a general learning framework for modeling agent behavior in any multiagent system using only a handful of interaction data. Our framework casts agent modeling as a representation learning problem. Consequently, we construct a novel objective inspired by imitation learning and agent identification and design an algorithm for unsupervised learning of representations of agent policies. We demonstrate empirically the utility of the proposed framework in (i) a challenging high-dimensional competitive environment for continuous control and (ii) a cooperative environment for communication, on supervised predictive tasks, unsupervised clustering, and policy optimization using deep reinforcement learning.