Arrow Research search

Author name cluster

Daniel Urieli

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
1 author row

Possible papers

AAMAS Conference 2021 Conference Paper

Scalable Multiagent Driving Policies for Reducing Traffic Congestion

  • Jiaxun Cui
  • William Macke
  • Harel Yedidsion
  • Aastha Goyal
  • Daniel Urieli
  • Peter Stone

Traffic congestion is a major challenge in modern urban settings. The industry-wide development of autonomous and automated vehicles (AVs) motivates the question of how can AVs contribute to congestion reduction. Past research has shown that in small scale mixed traffic scenarios with both AVs and human-driven vehicles, a small fraction of AVs executing a controlled multiagent driving policy can mitigate congestion. In this paper, we scale up existing approaches and develop new multiagent driving policies for AVs in scenarios with greater complexity. We start by showing that a congestion metric used by past research is manipulable in open road network scenarios where vehicles dynamically join and leave the road. We then propose using a different metric that is robust to manipulation and reflects open network traffic efficiency. Next, we propose a modular transfer reinforcement learning approach, and use it to scale up a multiagent driving policy to outperform human-like traffic and existing approaches in a simulated realistic scenario, which is an order of magnitude larger than past scenarios (hundreds instead of tens of vehicles). Additionally, our modular transfer learning approach saves up to 80% of the training time in our experiments, by focusing its data collection on key locations in the network. Finally, we show for the first time a distributed multiagent policy that improves congestion over human-driven traffic. The distributed approach is more realistic and practical, as it relies solely on existing sensing and actuation capabilities, and does not require adding new communication infrastructure.

AAMAS Conference 2016 Conference Paper

An MDP-Based Winning Approach to Autonomous Power Trading: Formalization and Empirical Analysis

  • Daniel Urieli
  • Peter Stone

With the efforts of moving to sustainable and reliable energy supply, electricity markets are undergoing far-reaching changes. Due to the high-cost of failure in the real-world, it is important to test new market structures in simulation. This is the focus of the Power Trading Agent Competition (Power TAC), which proposes autonomous electricity broker agents as a means for stabilizing the electricity grid. This paper focuses on the question: how should an autonomous electricity broker agent act in competitive electricity markets to maximize its profit. We formalize the electricity trading problem as a continuous, high-dimensional Markov Decision Process (MDP), which is computationally intractable to solve. Our formalization provides a guideline for approximating the MDP’s solution, and for extending existing solutions. We show that a previously champion broker can be viewed as approximating the solution using a lookahead policy. We present TacTex’15, which improves upon this previous approximation and achieves state-of-the-art performance in competitions and controlled experiments. Using thousands of experiments against 2015 finalist brokers, we analyze TacTex’15’s performance and the reasons for its success. We find that lookahead policies can be effective, but their performance can be sensitive to errors in the transition function prediction, specifically demand-prediction.

AAAI Conference 2016 Conference Paper

Autonomous Electricity Trading Using Time-of-Use Tariffs in a Competitive Market

  • Daniel Urieli
  • Peter Stone

This paper studies the impact of Time-Of-Use (TOU) tariffs in a competitive electricity market place. Specifically, it focuses on the question of how should an autonomous broker agent optimize TOU tariffs in a competitive retail market, and what is the impact of such tariffs on the economy. We formalize the problem of TOU tariff optimization and propose an algorithm for approximating its solution. We extensively experiment with our algorithm in a large-scale, detailed electricity retail markets simulation of the Power Trading Agent Competition (Power TAC) and: 1) find that our algorithm results in 15% peak-demand reduction, 2) find that its peak-flattening results in greater profit and/or profit-share for the broker and allows it to win against the 1st and 2nd place brokers from the Power TAC 2014 finals, and 3) analyze several economic implications of using TOU tariffs in competitive retail markets.

AAAI Conference 2014 Conference Paper

TacTex’13: A Champion Adaptive Power Trading Agent

  • Daniel Urieli
  • Peter Stone

Sustainable energy systems of the future will no longer be able to rely on the current paradigm that energy supply follows demand. Many of the renewable energy resources do not produce power on demand, and therefore there is a need for new market structures that motivate sustainable behaviors by participants. The Power Trading Agent Competition (Power TAC) is a new annual competition that focuses on the design and operation of future retail power markets, specifically in smart grid environments with renewable energy production, smart metering, and autonomous agents acting on behalf of customers and retailers. It uses a rich, open-source simulation platform that is based on real-world data and state-of-the-art customer models. Its purpose is to help researchers understand the dynamics of customer and retailer decision-making, as well as the robustness of proposed market designs. This paper introduces TACTEX’13, the champion agent from the inaugural competition in 2013. TACTEX’13 learns and adapts to the environment in which it operates, by heavily relying on reinforcement learning and prediction methods. This paper describes the constituent components of TACTEX’13 and examines its success through analysis of competition results and subsequent controlled experiments.

AAAI Conference 2012 Conference Paper

Design and Optimization of an Omnidirectional Humanoid Walk: A Winning Approach at the RoboCup 2011 3D Simulation Competition

  • Patrick MacAlpine
  • Samuel Barrett
  • Daniel Urieli
  • Victor Vu
  • Peter Stone

This paper presents the design and learning architecture for an omnidirectional walk used by a humanoid robot soccer agent acting in the RoboCup 3D simulation environment. The walk, which was originally designed for and tested on an actual Nao robot before being employed in the 2011 RoboCup 3D simulation competition, was the crucial component in the UT Austin Villa team winning the competition in 2011. To the best of our knowledge, this is the first time that robot behavior has been conceived and constructed on a real robot for the end purpose of being used in simulation. The walk is based on a double linear inverted pendulum model, and multiple sets of its parameters are optimized via a novel framework. The framework optimizes parameters for different tasks in conjunction with one another, a little-understood problem with substantial practical significance. Detailed experiments show that the UT Austin Villa agent significantly outperforms all the other agents in the competition with the optimized walk being the key to its success.

AAMAS Conference 2012 Conference Paper

UT Austin Villa 2011: A Champion Agent in the RoboCup 3D Soccer Simulation Competition

  • Patrick MacAlpine
  • Daniel Urieli
  • Samuel Barrett
  • Shivaram Kalyanakrishnan
  • Francisco Barrera
  • Adrian Lopez-Mobilia
  • Nicolae Ştiurcă
  • Victor Vu

This paper presents the architecture and key components of a simulated humanoid robot soccer team, UT Austin Villa, which was designed to compete in the RoboCup 3D simulation competition. These key components include (1) an omnidirectional walk engine and associated walk parameter optimization framework, (2) an inverse kinematics based kicking architecture, and (3) a dynamic role assignment and positioning system. UT Austin Villa won the RoboCup 2011 3D simulation competition in convincing fashion by winning all 24 games it played. During the course of the competition the team scored 136 goals while conceding none. We analyze the effect of each component in isolation and show through extensive experiments that the complete team significantly outperforms all the other teams from the competition.

AAAI Conference 2011 Conference Paper

Multiagent Patrol Generalized to Complex Environmental Conditions

  • Noa Agmon
  • Daniel Urieli
  • Peter Stone

The problem of multiagent patrol has gained considerable attention during the past decade, with the immediate applicability of the problem being one of its main sources of interest. In this paper we concentrate on frequency-based patrol, in which the agents’ goal is to optimize a frequency criterion, namely, minimizing the time between visits to a set of interest points. We consider multiagent patrol in environments with complex environmental conditions that affect the cost of traveling from one point to another. For example, in marine environments, the travel time of ships depends on parameters such as wind, water currents, and waves. We demonstrate that in such environments there is a need to consider a new multiagent patrol strategy which divides the given area into parts in which more than one agent is active, for improving frequency. We show that in general graphs this problem is intractable, therefore we focus on simplified (yet realistic) cyclic graphs with possible inner edges. Although the problem remains generally intractable in such graphs, we provide a heuristic algorithm that is shown to significantly improve point-visit frequency compared to other patrol strategies. For evaluation of our work we used a custom developed ship simulator that realistically models ship movement constraints such as engine force and drag and reaction of the ship to environmental changes.

AAMAS Conference 2011 Conference Paper

On Optimizing Interdependent Skills: A Case Study in Simulated 3D Humanoid Robot Soccer

  • Daniel Urieli
  • Patrick MacAlpine
  • Shivaram Kalyanakrishnan
  • Yinon Bentor
  • Peter Stone

In several realistic domains an agent's behavior is composed of multiple interdependent skills. For example, consider a humanoid robot that must play soccer, as is the focus of this paper. In order to succeed, it is clear that the robot needs to walk quickly, turn sharply, and kick the ball far. However, these individual skills are ineffective if the robot falls down when switching from walking to turning, or if it cannot position itself behind the ball for a kick. This paper presents a learning architecture for a humanoid robot soccer agent that has been fully deployed and tested within the RoboCup 3D simulation environment. First, we demonstrate that individual skills such as walking and turning can be parameterized and optimized to match the best performance statistics reported in the literature. These results are achieved through effective use of the CMA-ES optimization algorithm. Next, we describe a framework for optimizing skills in conjunction with one another, a little-understood problem with substantial practical significance. Over several phases of learning, a total of roughly 100-150 parameters are optimized. Detailed experiments show that an agent thus optimized performs comparably with the top teams from the RoboCup 2010 competitions, while taking relatively few man-hours for development.

AAMAS Conference 2011 Conference Paper

Ship Patrol: Multiagent Patrol under Complex Environmental Conditions

  • Noa Agmon
  • Daniel Urieli
  • Peter Stone

In the problem of multiagent patrol, a team of agents is required to repeatedly visit a target area in order to monitor possible changes in state. The growing popularity of this problem comes mainly from its immediate applicability to a wide variety of domains. In this paper we concentrate on frequency-based patrol, in which the agents' goal is to optimize a frequency criterion, namely, minimizing the time between visits to a set of interest points. In situations with varying environmental conditions, the influence of changes in the conditions on the cost of travel may be immense. For example, in marine environments, the travel time of ships depends on parameters such as wind, water currents, and waves. Such environments raise the need to consider a new multiagent patrol strategy which divides the given area into regions in which more than one agent is active, for improving frequency. We prove that in general graphs this problem is intractable, therefore we focus on simplified (yet realistic) cyclic graphs with possible inner edges. Although the problem remains generally intractable in such graphs, we provide a heuristic algorithm that is shown to significantly improve point-visit frequency compared to other patrol strategies.