AIJ Journal 2023 Journal Article
Solving zero-sum one-sided partially observable stochastic games
- Karel Horák
- Branislav Bošanský
- Vojtěch Kovařík
- Christopher Kiekintveld
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
AIJ Journal 2023 Journal Article
IJCAI Conference 2019 Conference Paper
Value methods for solving stochastic games with partial observability model the uncertainty of the players as a probability distribution over possible states, where the dimension of the belief space is the number of states. For many practical problems, there are exponentially many states which causes scalability problems. We propose an abstraction technique that addresses this curse of dimensionality by projecting the high-dimensional beliefs onto characteristic vectors of significantly lower dimension (e. g. , marginal probabilities). Our main contributions are (1) a novel compact representation of the uncertainty in partially observable stochastic games and (2) a novel algorithm using this representation that is based on existing state-of-the-art algorithms for solving stochastic games with partial observability. Experimental evaluation confirms that the new algorithm using the compact representation dramatically increases scalability compared to the state of the art.
AAMAS Conference 2019 Conference Paper
IJCAI Conference 2018 Conference Paper
In a Periodic Double Auction (PDA), there are multiple discrete trading periods for a single type of good. PDAs are commonly used in real-world energy markets to trade energy in specific time slots to balance demand on the power grid. Strategically, bidding in a PDA is complicated because the bidder must predict and plan for future auctions that may influence the bidding strategy for the current auction. We present a general bidding strategy for PDAs based on forecasting clearing prices and using Monte Carlo Tree Search (MCTS) to plan a bidding strategy across multiple time periods. In addition, we present a fast heuristic strategy that can be used either as a standalone method or as an initial set of bids to seed the MCTS policy. We evaluate our bidding strategies using a PDA simulator based on the wholesale market implemented in the Power Trading Agent Competition (PowerTAC) competition. We demonstrate that our strategies outperform state-of-the-art bidding strategies designed for that competition.
AAMAS Conference 2018 Conference Paper
Bidding strategies for Periodic Double Auctions (PDAs) are complicated because they need to predict and plan for future auctions, which may affect the bidding strategy in the current auction. We present a general bidding strategy for PDAs based on forecasting clearing prices and using Monte Carlos Tree Search (MCTS) to plan a bidding strategy across multiple time periods. We developed a controlled simulator by isolating Power Trading Agent Competition’s wholesale market to evaluate bidding strategies in a realistic PDA energy market. We show that our MCTS bidding strategy is cost effective in buying energy compared to other baseline and state-ofthe-art strategies and it’s performance improves with increasing number of MCTS simulations.
IJCAI Conference 2018 Conference Paper
The Stackelberg Security Game (SSG) model has been immensely influential in security research since it was introduced roughly a decade ago. Furthermore, deployed SSG-based applications are one of most successful examples of game theory applications in the real world. We present a broad survey of recent technical advances in SSG and related literature, and then look to the future by highlighting the new potential applications and open research problems in SSG.
AAMAS Conference 2017 Conference Paper
Honeypots are decoy cyberdefense systems placed in a network to entice malicious entities into attacking in order to waste attacker resources and learn information about attack behavior or previously unknown exploits. We focus on the strategic selection of various honeypot configurations in order to adapt to an intelligent attacker amidst a dynamic environment. In order to infiltrate networks, attackers leverage various exploits on the system. However, these exploits and the value they provide dynamically change over time as more information is gathered about them. We introduce a model that addresses the combinatorial complexity of the honeypot selection problem and allow for these dynamic exploits. To solve this new problem, we map this model to a Multi-Armed Bandit (MAB) problem, which is a class of machine learning problems that maintain balance between exploration and exploitation. We show empirically that both stochastic and adversarial MAB solutions improve over static defense strategies.
IJCAI Conference 2017 Conference Paper
The Strong Stackelberg Equilibrium (SSE) has drawn extensive attention recently in several security domains. However, the SSE concept neglects the advantage of defender's strategic revelation of her private information, and overestimates the observation ability of the adversaries. In this paper, we overcome these restrictions and analyze the tradeoff between strategic secrecy and commitment in security games. We propose a Disguised-resource Security Game (DSG) where the defender strategically disguises some of her resources. We compare strategic information revelation with public commitment and formally show that they have different advantages depending the payoff structure. To compute the Perfect Bayesian Equilibrium (PBE), several novel approaches are provided, including a novel algorithm based on support set enumeration, and an approximation algorithm for \epsilon-PBE. Extensive experimental evaluation shows that both strategic secrecy and Stackelberg commitment are critical measures in security domain, and our approaches can efficiently solve PBEs for realistic-sized problems.
IJCAI Conference 2017 Conference Paper
In recent years, there have been a number of successful cyber attacks on enterprise networks by malicious actors which have caused severe damage. These networks have Intrusion Detection and Prevention Systems in place to protect them, but they are notorious for producing a high volume of alerts. These alerts must be investigated by cyber analysts to determine whether they are an attack or benign. Unfortunately, there are magnitude more alerts generated than there are cyber analysts to investigate them. This trend is expected to continue into the future creating a need for tools which find optimal assignments of the incoming alerts to analysts in the presence of a strategic adversary. We address this challenge with the four following contributions: (1) a cyber screening game (CSG) model for the cyber network protection domain, (2) an NP-hardness proof for computing the optimal strategy for the defender, (3) an algorithm that finds the optimal allocation of experts to alerts in the CSG, and (4) heuristic improvements for computing allocations in CSGs that accomplishes significant scale-up which we show empirically to closely match the solution quality of the optimal algorithm.
IS Journal 2016 Journal Article
The increasing complexity of securing modern computer networks makes decision support systems an important tool for administrators. A challenge many existing tools fail to address is that attackers react strategically to new security measures, adapting their behaviors in response. Game theory provides a methodology for making decisions that takes into account these reactions, rather than assuming static attackers. The authors present an overview of how game theory can be used to inform one type of security decision: how to optimally place honeypots in a network. They demonstrate this approach on a realistic case study and present initial validation results based on a study comparing their approach with human decision makers.
AAAI Conference 2016 Conference Paper
Highly targeted spear phishing attacks are increasingly common, and have been implicated in many major security breeches. Email filtering systems are the first line of defense against such attacks. These filters are typically configured with uniform thresholds for deciding whether or not to allow a message to be delivered to a user. However, users have very significant differences in both their susceptibility to phishing attacks as well as their access to critical information and credentials that can cause damage. Recent work has considered setting personalized thresholds for individual users based on a Stackelberg game model. We consider two important extensions of the previous model. First, in our model user values can be substitutable, modeling cases where multiple users provide access to the same information or credential. Second, we consider attackers who make sequential attack plans based on the outcome of previous attacks. Our analysis starts from scenarios where there is only one credential and then extends to more general scenarios with multiple credentials. For single-credential scenarios, we demonstrate that the optimal defense strategy can be found by solving a binary combinatorial optimization problem called PEDS. For multiple-credential scenarios, we formulate it as a bilevel optimization problem for finding the optimal defense strategy and then reduce it to a single level optimization problem called PEMS using complementary slackness conditions. Experimental results show that both PEDS and PEMS lead to significant higher defender utilities than two existing benchmarks in different parameter settings. Also, both PEDS and PEMS are more robust than the existing benchmarks considering uncertainties.
AAAI Conference 2016 Conference Paper
AAAI Conference 2016 Conference Paper
Courses in artificial intelligence and related topics often cover methods for reasoning under uncertainty, decision theory, and game theory. However, these methods can seem very abstract when students first encounter them, and they are often taught using simple toy problems. Our goal is to help students to operationalize this knowledge by designing sophisticated autonomous agents that must make complex decisions in games that capture their interest. We describe a tournament-based pedagogy that we have used in two different courses with two different games based on current research topics in artificial intelligence to engage students in designing agents that use strategic reasoning. Many students find this structure very engaging, and we find that students develop a deeper understanding of the abstract strategic reasoning concepts introduced in the courses.
AAAI Conference 2016 Conference Paper
Strong Stackelberg Equilibrium (SSE) is a fundamental solution concept in game theory in which one player commits to a strategy, while the other player observes this commitment and plays a best response. We present a new algorithm for computing SSE for two-player extensive-form general-sum games with imperfect information (EFGs) where computing SSE is an NP-hard problem. Our algorithm is based on a correlated version of SSE, known as Stackelberg Extensive- Form Correlated Equilibrium (SEFCE). Our contribution is therefore twofold: (1) we give the first linear program for computing SEFCE in EFGs without chance, (2) we repeatedly solve and modify this linear program in a systematic search until we arrive to SSE. Our new algorithm outperforms the best previous algorithms by several orders of magnitude.
AAAI Conference 2015 Conference Paper
Many search and security games played on a graph can be modeled as normal-form zero-sum games with strategies consisting of sequences of actions. The size of the strategy space provides a computational challenge when solving these games. This complexity is tackled either by using the compact representation of sequential strategies and linear programming, or by incremental strategy generation of iterative double-oracle methods. In this paper, we present novel hybrid of these two approaches: compact-strategy doubleoracle (CS-DO) algorithm that combines the advantages of the compact representation with incremental strategy generation. We experimentally compare CS-DO with the standard approaches and analyze the impact of the size of the support on the performance of the algorithms. Results show that CS-DO dramatically improves the convergence rate in games with non-trivial support.
IJCAI Conference 2015 Conference Paper
Preventing attacks in a computer network is the core problem in network security. We introduce a new game-theoretic model of the interaction between a network administrator who uses limited resource to harden a network and an attacker who follows a multi-stage plan to attack the network. The possible plans of the attacker are compactly represented using attack graphs, while the defender adds fake targets (honeypots) to the network to deceive the attacker. The compact representation of the attacker’s strategies presents a computational challenge and finding the best response of the attacker is NP-hard. We present a solution method that first translates an attack graph into an MDP and solves it using policy search with a set of pruning techniques. We present an empirical evaluation of the model and solution algorithms, evaluating scalability, the types of solutions that are generated for realistic cases, and sensitivity analysis.
AIJ Journal 2013 Journal Article
AAMAS Conference 2013 Conference Paper
Counterinsurgency, which is the effort to mitigate support for an opposing organization, is one such domain that has been studied recently and past work has modeled the problem as an influence blocking maximization that features an influencer and a mitigator. While past work has introduced scalable heuristic techniques for generating effective strategies using a double oracle algorithm, it has not addressed the issue of uncertainty and asymmetric information, which is the topic of this paper.
ECAI Conference 2012 Conference Paper
We develop and evaluate a new exact algorithm for finding Nash equilibria of two-player zero-sum extensive-form games with imperfect information. Our approach is based on the sequence-form representation of the game, and uses an algorithmic framework of double-oracle methods that have been used successfully in other classes of games. The algorithm uses an iterative decomposition, solving restricted games and exploiting fast best-response algorithms to add additional sequences to the game over time. We demonstrate our algorithm on a class of adversarial graph search games motivated by real world border patrolling scenarios. The results indicate that our framework is a promising way to scale up solutions for extensive-form games, reducing both memory and computation time requirements.
AAMAS Conference 2012 Conference Paper
The burgeoning area of security games has focused on real-world domains where security agencies protect critical infrastructure from a diverse set of adaptive adversaries. There are security domains where the payoffs for preventing the different types of adversaries may take different forms (seized money, reduced crime, saved lives, etc) which are not readily comparable. Thus, it can be difficult to know how to weigh the different payoffs when deciding on a security strategy. To address the challenges of these domains, we propose a fundamentally different solution concept, multi-objective security games (MOSG), which combines security games and multi-objective optimization. Instead of a single optimal solution, MOSGs have a set of Pareto optimal (non-dominated) solutions referred to as the Pareto frontier. The Pareto frontier can be generated by solving a sequence of constrained single-objective optimization problems (CSOP), where one objective is selected to be maximized while lower bounds are specified for the other objectives. Our contributions include: (i) an algorithm, Iterative $\epsilon$-Constraints, for generating the sequence of CSOPs; (ii) an exact approach for solving an MILP formulation of a CSOP (which also applies to multi-objective optimization in more general Stackelberg games); (iii) heuristics that achieve speedup by exploiting the structure of security games to further constrain a CSOP; (iv) an approximate approach for solving an algorithmic formulation of a CSOP, increasing the scalability of our approach with quality guarantees. Additional contributions of this paper include proofs on the level of approximation and detailed experimental evaluation of the proposed approaches.
AAAI Conference 2012 Conference Paper
Randomized first-mover strategies of Stackelberg games are used in several deployed applications to allocate limited resources for the protection of critical infrastructure. Stackelberg games model the fact that a strategic attacker can surveil and exploit the defender’s strategy, and randomization guards against the worst effects by making the defender less predictable. In accordance with the standard game-theoretic model of Stackelberg games, past work has typically assumed that the attacker has perfect knowledge of the defender’s randomized strategy and will react correspondingly. In light of the fact that surveillance is costly, risky, and delays an attack, this assumption is clearly simplistic: attackers will usually act on partial knowledge of the defender’s strategies. The attacker’s imperfect estimate could present opportunities and possibly also threats to a strategic defender. In this paper, we therefore begin a systematic study of security games with limited surveillance. We propose a natural model wherein an attacker forms or updates a belief based on observed actions, and chooses an optimal response. We investigate the model both theoretically and experimentally. In particular, we give mathematical programs to compute optimal attacker and defender strategies for a fixed observation duration, and show how to use them to estimate the attacker’s observation durations. Our experimental results show that the defender can achieve significant improvement in expected utility by taking the attacker’s limited surveillance into account, validating the motivation of our work.
AAMAS Conference 2011 Conference Paper
Game theory is fast becoming a vital tool for reasoning about complex real-world security problems, including critical infrastructure protection. The game models for these applications are constructed using expert analysis and historical data to estimate the values of key parameters, including the preferences and capabilities of terrorists. In many cases, it would be natural to represent uncertainty over these parameters using continuous distributions (such as uniform intervals or Gaussians). However, existing solution algorithms are limited to considering a small, finite number of possible attacker types with different payoffs. We introduce a general model of infinite Bayesian Stackelberg security games that allows payoffs to be represented using continuous payoff distributions. We then develop several techniques for finding approximate solutions for this class of games, and show empirically that our methods offer dramatic improvements over the current state of the art, providing new ways to improve the robustness of security game models.
AAMAS Conference 2011 Conference Paper
Building on research previously reported at AAMAS conferences, this paper describes an innovative application of a novel gametheoretic approach for a national scale security deployment. Working with the United States Transportation Security Administration (TSA), we have developed a new application called GUARDS to assist in resource allocation tasks for airport protection at over 400 United States airports. In contrast with previous efforts such as ARMOR and IRIS, which focused on one-off tailored applications and one security activity (e. g. canine patrol or checkpoints) per application, GUARDS faces three key issues: (i) reasoning about hundreds of heterogeneous security activities; (ii) reasoning over diverse potential threats; (iii) developing a system designed for hundreds of end-users. Since a national deployment precludes tailoring to specific airports, our key ideas are: (i) creating a new game-theoretic framework that allows for heterogeneous defender activities and compact modeling of a large number of threats; (ii) developing an efficient solution technique based on general purpose Stackelberg game solvers; (iii) taking a partially centralized approach for knowledge acquisition and development of the system. In doing so we develop a software scheduling assistant, GUARDS, designed to reason over two agents - the TSA and a potential adversary - and allocate the TSA's limited resources across hundreds of security activities in order to provide protection within airports. The scheduling assistant has been delivered to the TSA and is currently under evaluation and testing for scheduling practices at an undisclosed airport. If successful, the TSA intends to incorporate the system into their unpredictable scheduling practices nationwide. In this paper we discuss the design choices and challenges encountered during the implementation of GUARDS. GUARDS represents promising potential for transitioning years of academic research into a nationally deployed system.
IJCAI Conference 2011 Conference Paper
We describe an innovative application of a novel game-theoretic approach for a \textit{national scale} security deployment. Working with the United States Transportation Security Administration (TSA), we have developed a new application called GUARDS to allocate the TSA's limited resources across hundreds of security activities to provide protection at over 400 United States airports. Similar security applications (e. g. , ARMOR and IRIS) have focused on one-off tailored applications and one security activity (e. g. checkpoints) per application, GUARDS on the other hand faces three new key issues: (i) reasoning about hundreds of heterogeneous security activities; (ii) reasoning over diverse potential threats; (iii) developing a system designed for hundreds of end-users. Since a national deployment precludes tailoring to specific airports, our key ideas are: (i) creating a new game-theoretic framework that allows for heterogeneous defender activities and compact modeling of a large number of threats; (ii) developing an efficient solution technique based on general purpose Stackelberg game solvers; (iii) taking a partially centralized approach for knowledge acquisition. The scheduling assistant has been delivered to the TSA and is currently undergoing evaluation for scheduling practices at an undisclosed airport. If successful, the TSA intends to incorporate the system into their unpredictable scheduling practices nationwide.
AAMAS Conference 2011 Conference Paper
It becomes critical to address human adversaries' bounded rationality in security games as the real-world deployment of such games spreads. To that end, the key contributions of this paper include: (i) new efficient algorithms for computing optimal strategic solutions using Prospect Theory and Quantal Response Equilibrium; (ii) the most comprehensive experiment to date studying the effectiveness of different models against human subjects for security games. Our new techniques outperform the leading contender for modelling human behavior in security games in experiment with human subjects.
IJCAI Conference 2011 Conference Paper
Recent real-world deployments of Stackelberg security games make it critical that we address human adversaries' bounded rationality in computing optimal strategies. To that end, this paper provides three key contributions: (i) new efficient algorithms for computing optimal strategic solutions using Prospect Theory and Quantal Response Equilibrium; (ii) the most comprehensive experiment to date studying the effectiveness of different models against human subjects for security games; and (iii) new techniques for generating representative payoff structures for behavioral experiments in generic classes of games. Our results with human subjects show that our new techniques outperform the leading contender for modeling human behavior in security games.
AAMAS Conference 2011 Conference Paper
The fastest known algorithm for solving General Bayesian Stackelberg games with a finite set of follower (adversary) types have seen direct practical use at the LAX airport for over 3 years; and currently, an (albeit non-Bayesian) algorithm for solving these games is also being used for scheduling air marshals on limited sectors of international flights by the US Federal Air Marshals Service. These algorithms find optimal randomized security schedules to allocate limited security resources to protect targets. As we scale up to larger domains, including the full set of flights covered by the Federal Air Marshals, it is critical to develop newer algorithms that scale-up significantly beyond the limits of the current state-of-theart of Bayesian Stackelberg solvers. In this paper, we present a novel technique based on a hierarchical decomposition and branch and bound search over the follower type space, which may be applied to different Stackelberg game solvers. We have applied this technique to different solvers, resulting in: (i) A new exact algorithm called HBGS that is orders of magnitude faster than the best known previous Bayesian solver for general Stackelberg games; (ii) A new exact algorithm called HBSA which extends the fastest known previous security game solver towards the Bayesian case; and (iii) Approximation versions of HBGS and HBSA that show significant improvements over these newer algorithms with only 12% sacrifice in the practical solution quality.
AAAI Conference 2011 Conference Paper
Given the real-world deployments of attacker-defender Stackelberg security games, robustness to deviations from expected attacker behaviors has now emerged as a critically important issue. This paper provides four key contributions in this context. First, it identifies a fundamentally problematic aspect of current algorithms for security games. It shows that there are many situations where these algorithms face multiple equilibria, and they arbitrarily select one that may hand the defender a significant disadvantage, particularly if the attacker deviates from its equilibrium strategies due to unknown constraints. Second, for important subclasses of security games, it identifies situations where we will face such multiple equilibria. Third, to address these problematic situations, it presents two equilibrium refinement algorithms that can optimize the defender’s utility if the attacker deviates from equilibrium strategies. Finally, it experimentally illustrates that the refinement approach achieved significant robustness in consideration of attackers’ deviation due to unknown constraints.
AAMAS Conference 2010 Conference Paper
Distributed Constraint Optimization (DCOP) is a popular framework for cooperative multi-agent decision making. DCOP is NP-hard, so an important line of work focuses on developing fast incomplete solution algorithms for large-scale applications. One ofthe few incomplete algorithms to provide bounds on solution quality is k-size optimality, which defines a local optimality criterionbased on the size of the group of deviating agents. Unfortunately, the lack of a general-purpose algorithm and the commitment toforming groups based solely on group size has limited the use ofk-size optimality. This paper introduces t-distance optimality which departs fromk-size optimality by using graph distance as an alternative criteriafor selecting groups of deviating agents. This throws open a new research direction into the tradeoffs between different group selectionand coordination mechanisms for incomplete DCOP algorithms. We derive theoretical quality bounds for t-distance optimality thatimprove known bounds for k-size optimality. In addition, we develop a new efficient asynchronous local search algorithm for finding both k-size and t-distance optimal solutions — allowing theseconcepts to be deployed in real applications. Indeed, empirical results show that this algorithm significantly outperforms the only existing algorithm for finding general k-size optimal solutions, whichis also synchronous. Finally, we compare the algorithmic performance of k-size and t-distance optimality using this algorithm. Wefind that t-distance consistently converges to higher-quality solutions in the long run, but results are mixed on convergence speed; we identify cases where k-size and t-distance converge faster.
AAMAS Conference 2010 Conference Paper
We apply game-theoretic techniques to urban security deploymentand propose new algorithms to efficiently solve real-world domainsthat are intractable with existing algorithms.
AAMAS Conference 2010 Conference Paper
Recent work has applied game-theoretic models to real-world security problems at the Los Angeles International Airport (LAX)and Federal Air Marshals Service (FAMS). The analysis of thesedomains is based on input from domain experts intended to capture the best available intelligence information about potential terrorist activities and possible security countermeasures. Nevertheless, these models are subject to significant uncertainty - especiallyin security domains where intelligence about adversary capabilities and preferences is very difficult to gather. This uncertaintypresents significant challenges for applying game-theoretic analysis in these domains. Our experimental results show that standard solution methods based on perfect information assumptionsare very sensitive to payoff uncertainty, resulting in low payoffs forthe defender. We describe a model of Bayesian Stackelberg gamesthat allows for general distributional uncertainty over the attacker'spayoffs. We conduct an experimental analysis of two algorithms forapproximating equilibria of these games, and show that the resulting solutions give much better results than the standard approachwhen there is payoff uncertainty.
AAAI Conference 2010 Conference Paper
Security games, and important class of Stackelberg games, are used in deployed decision-support tools in use by LAX police and the Federal Air Marshals Service. The algorithms used to solve these games find optimal randomized schedules to allocate security resources for infrastructure protection. Unfortunately, the state of the art algorithms either fail to scale or to provide a correct solution for large problems with arbitrary scheduling constraints. We introduce ASPEN, a branch-and-price approach that overcomes these limitations based on two key contributions: (i) A columngeneration approach that exploits a novel network flow representation, avoiding a combinatorial explosion of schedule allocations; (ii) A branch-and-bound algorithm that generates bounds via a fast algorithm for solving security games with relaxed scheduling constraints. ASPEN is the first known method for efficiently solving massive security games with arbitrary schedules.
AAMAS Conference 2010 Conference Paper
There has been significant recent interest in game theoretic approaches to security, with much of the recent research focused onutilizing the leader-follower Stackelberg game model; for example, these games are at the heart of major applications such asthe ARMOR program deployed for security at the LAX airportsince 2007 and the IRIS program in use by the US Federal AirMarshals (FAMS). The foundational assumption for using Stackelberg games is that security forces (leaders), acting first, commit toa randomized strategy; while their adversaries (followers) choosetheir best response after surveillance of this randomized strategy. Yet, in many situations, the followers may act without observation of the leader's strategy, essentially converting the game intoa simultaneous-move game model. Previous work fails to addresshow a leader should compute her strategy given this fundamentaluncertainty about the type of game faced. Focusing on the complex games that are directly inspired by realworld security applications, the paper provides four contributionsin the context of a general class of security games. First, exploiting the structure of these security games, the paper shows that theNash equilibria in security games are interchangeable, thus alleviating the equilibrium selection problem. Second, resolving theleader's dilemma, it shows that under a natural restriction on security games, any Stackelberg strategy is also a Nash equilibriumstrategy; and furthermore, the solution is unique in a class of realworld security games of which ARMOR is a key exemplar. Third, when faced with a follower that can attack multiple targets, manyof these properties no longer hold. Fourth, our experimental resultsemphasize positive properties of games that do not fit our restrictions. Our contributions have major implications for the real-worldapplications.
AAAI Conference 2010 Conference Paper
Law enforcement agencies frequently must allocate limited resources to protect targets embedded in a network, such as important buildings in a city road network. Since intelligent attackers may observe and exploit patterns in the allocation, it is crucial that the allocations be randomized. We cast this problem as an attacker-defender Stackelberg game: the defender’s goal is to obtain an optimal mixed strategy for allocating resources. The defender’s strategy space is exponential in the number of resources, and the attacker’s exponential in the network size. Existing algorithms are therefore useless for all but the smallest networks. We present a solution approach based on two key ideas: (i) A polynomial-sized game model obtained via an approximation of the strategy space, solved efficiently using a linear program; (ii) Two efficient techniques that map solutions from the approximate game to the original, with proofs of correctness under certain assumptions. We present in-depth experimental results, including an evaluation on part of the Mumbai road network.
AAMAS Conference 2009 Conference Paper
Predictable allocations of security resources such as police officers, canine units, or checkpoints are vulnerable to exploitation by attackers. Recent work has applied game-theoretic methods to find optimal randomized security policies, including a fielded application at the Los Angeles International Airport (LAX). This approach has promising applications in many similar domains, including police patrolling for subway and bus systems, randomized baggage screening, and scheduling for the Federal Air Marshal Service (FAMS) on commercial flights. However, the existing methods scale poorly when the security policy requires coordination of many resources, which is central to many of these potential applications. We develop new models and algorithms that scale to much more complex instances of security games. The key idea is to use a compact model of security games, which allows exponential improvements in both memory and runtime relative to the best known algorithms for solving general Stackelberg games. We develop even faster algorithms for security games under payoff restrictions that are natural in many security domains. Finally, introduce additional realistic scheduling constraints while retaining comparable performance improvements. The empirical evaluation comprises both random data and realistic instances of the FAMS and LAX problems. Our new methods scale to problems several orders of magnitude larger than the fastest known algorithm.
AAMAS Conference 2008 Conference Paper
In many complex multi-agent domains it is impractical to compute exact analytic solutions. An alternate means of analysis applies computational tools to derive and analyze empirical game models. These models are noisy approximations, which raises questions about how to account for uncertainty when analyzing the model. We develop a novel experimental framework and apply it to benchmark meta-strategies – general algorithms for selecting strategies based on empirical game models. We demonstrate that modeling noise is important; a naïve approach that disregards noise and plays according to Nash equilibrium yields poor choices. We introduce three parameterized algorithms that factor noise into the analysis by predicting distributions of opponent play. As observation noise increases, rational players generally make less specific outcome predictions. Our comparison of the algorithms identifies logit equilibrium as the best method for making these predictions. Logit equilibrium incorporates a form of noisy decision-making by players. Our evidence shows that this is a robust method for approximating the effects of uncertainty in many contexts. This result has practical relevance for guiding analysis of empirical game models. It also offers an intriguing rationale for behavioral findings that logit equilibrium is a better predictor of human behavior than Nash equilibrium.
AAMAS Conference 2007 Conference Paper
The TAC Supply Chain Management (TAC/SCM) game presents a challenging dynamic environment for autonomous decision-making in a salient application domain. Strategic interactions complicate the analysis of games such as TAC/SCM, since the effectiveness of a given strategy depends on the strategies played by other agents on the supply chain. The TAC tournament generates results from one particular path of combinations, and success in the tournament is rightly regarded as evidence for agent quality. Such results along with post-competition controlled experiments provide useful evaluations of novel techniques employed in the game. We argue that a broader game-theoretic analysis framework can provide a firmer foundation for choice of experimental contexts. Exploiting a repository of agents from the 2005 and 2006 TAC/SCM tournaments, we demonstrate an empirical game-theoretic methodology based on extensive simulation and careful measurement. Our analysis of agents from TAC-05 reveals interesting interactions not seen in the tournament. Extending the analysis to TAC-06 enables us to measure progress from year-to-year, and generates a candidate empirical equilibrium among the best known strategies. We use this equilibrium as a stable background population for comparing relative performance of the 2006 agents, yielding insights complementing the tournament results.
AAAI Conference 2007 Short Paper
AAMAS Conference 2007 Conference Paper
Future market conditions can be a pivotal factor in making business decisions. We present and evaluate methods used by our agent, Deep Maize, to forecast market prices in the Trading Agent Competition Supply Chain Management Game. As a guiding principle we seek to exploit as many sources of available information as possible to inform predictions. Since information comes in several different forms, we integrate well-known methods in a novel way to make predictions. The core of our predictor is a nearest-neighbors machine learning algorithm that identifies historical instances with similar economic indicators. We augment this with an online learning procedure that transforms the predictions by optimizing a scoring rule. This allows us to select more relevant historical contexts using additional information available during individual games. We also explore the advantages of two different representations for predicting price distributions. One uses absolute prices, and the other uses statistics of price time series to exploit local stability. We evaluate these methods using both data from the 2005 tournament final round and additional simulations. We compare several variations of our predictor to one another and a baseline predictor similar to those used by many other tournament agents. We show substantial improvements over the baseline predictor, and demonstrate that each element of our predictor contributes to improved performance.
ICAPS Conference 2004 Conference Paper
Decision makers on supply chains face an uncertain, dynamic, and strategic multiagent environment. We report on Deep Maize, an agent we designed to participate in the 2003 Trading Agent Competition Supply Chain Management (TAC/SCM) game. Our design employs an idealized equilibrium analysis of the SCM game to factor out the strategic aspects of the environment and to define an expected profitable zone of operation. Deep Maize applies distributed feedback control to coordinate its separate functional modules and maintain its environment in the desired zone, despite the uncertainty and dynamism. We evaluate our design with results from the TAC/SCM tournament as well as from controlled experiments conducted after the competition.