Arrow Research search

Author name cluster

Ioannis Panageas

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

33 papers
2 author rows

Possible papers (33)

NeurIPS Conference 2025 Conference Paper

Efficient Kernelized Learning in Polyhedral Games beyond Full Information: From Colonel Blotto to Congestion Games

  • Andreas Kontogiannis
  • Vasilis Pollatos
  • Gabriele Farina
  • Panayotis Mertikopoulos
  • Ioannis Panageas

We examine the problem of efficiently learning coarse correlated equilibria (CCE) in polyhedral games, that is, normal-form games with an exponentially large number of actions per player and an underlying combinatorial structure—such as the classic Colonel Blotto game or congestion games. Achieving computational efficiency in this setting requires learning algorithms whose regret and per-iteration complexity scale at most polylogarithmically with the size of the players’ action sets. This challenge has recently been addressed in the full-information setting, primarily through the use of kernelization; however, in the more realistic partial information setting, the situation is much more challenging, and existing approaches result in suboptimal and impractical runtime complexity to learn CCE. We address this gap via a novel kernelization-based framework for payoff-based learning in polyhedral games, which we then apply to certain key classes of polyhedral games—namely Colonel Blotto, graphic matroid and network congestion games. In so doing, we obtain a range of computationally efficient payoff-based learning algorithms which significantly improve upon prior work in terms of the runtime for learning CCE.

NeurIPS Conference 2025 Conference Paper

The Complexity of Finding Local Optima in Contrastive Learning

  • Jingming Yan
  • Yiyuan Luo
  • Vaggos Chatziafratis
  • Ioannis Panageas
  • Parnian Shahkar
  • Stelios Stavroulakis

Contrastive learning is a powerful technique for discovering meaningful data representations by optimizing objectives based on $\textit{contrastive information}$, often given as a set of weighted triplets $\{(x_i, y_i^+, z_{i}^-)\}_{i = 1}^m$ indicating that an "anchor" $x_i$ is more similar to a "positive" example $y_i^+$ than to a "negative" example $z_i^-$. The goal is to find representations (e.g., embeddings in $\mathbb{R}^d$ or a tree metric) where anchors are placed closer to positive than to negative examples. While finding $\textit{global}$ optima of contrastive objectives is $\mathsf{NP}$-hard, the complexity of finding $\textit{local}$ optima---representations that cannot be improved by local search algorithms such as gradient-based methods---remains open. Our work settles the complexity of finding local optima in various contrastive learning problems by proving $\mathsf{PLS}$-hardness in discrete settings (e.g., maximizing the number of satisfied triplets) and $\mathsf{CLS}$-hardness in continuous settings (e.g., minimizing the Triplet Loss), where $\mathsf{PLS}$ (Polynomial Local Search) and $\mathsf{CLS}$ (Continuous Local Search) are well-studied complexity classes capturing local search dynamics in discrete and continuous optimization, respectively. Our results imply that no polynomial-time algorithm (local search or otherwise) can find a local optimum for various contrastive learning problems, unless $\mathsf{PLS}\subseteq\mathsf{P}$ (or $\mathsf{CLS}\subseteq \mathsf{P}$ for continuous problems). Even in the unlikely scenario that $\mathsf{PLS}\subseteq\mathsf{P}$ (or $\mathsf{CLS}\subseteq \mathsf{P}$), our reductions imply that there exist instances where local search algorithms need exponential time to reach a local optimum, even for $d=1$ (embeddings on a line).
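
A minimal sketch (ours, not the paper's; all names are illustrative) of the continuous objective referenced above: the hinge-style triplet loss on one-dimensional embeddings, the $d=1$ setting where the exponential-time lower bound already applies.

```python
# Illustrative triplet loss on 1-D embeddings (assumed setup, not the paper's code).
import numpy as np

def triplet_loss(emb, triplets, margin=1.0):
    """Sum of hinge losses max(0, d(anchor, pos) - d(anchor, neg) + margin)."""
    total = 0.0
    for a, p, n in triplets:            # indices of anchor, positive, negative
        d_pos = abs(emb[a] - emb[p])
        d_neg = abs(emb[a] - emb[n])
        total += max(0.0, d_pos - d_neg + margin)
    return total

emb = np.random.randn(5)                # 5 points embedded on a line (d = 1)
triplets = [(0, 1, 2), (3, 4, 0)]       # (anchor, positive, negative) triples
print(triplet_loss(emb, triplets))
```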

NeurIPS Conference 2025 Conference Paper

The Complexity of Symmetric Equilibria in Min-Max Optimization and Team Zero-Sum Games

  • Ioannis Anagnostides
  • Ioannis Panageas
  • Tuomas Sandholm
  • Jingming Yan

We consider the problem of computing stationary points in min-max optimization, with a focus on the special case of Nash equilibria in (two-)team zero-sum games. We first show that computing $\epsilon$-Nash equilibria in $3$-player $\text{\emph{adversarial}}$ team games---wherein a team of $2$ players competes against a $\text{\emph{single}}$ adversary---is $\textsf{CLS}$-complete, resolving the complexity of Nash equilibria in such settings. Our proof proceeds by reducing from $\text{\emph{symmetric}}$ $\epsilon$-Nash equilibria in $\text{\emph{symmetric}}$, identical-payoff, two-player games, by suitably leveraging the adversarial player so as to enforce symmetry---without disturbing the structure of the game. In particular, the class of instances we construct comprises solely polymatrix games, thereby also settling a question left open by Hollender, Maystre, and Nagarajan (2024). Moreover, we establish that computing $\text{\emph{symmetric}}$ (first-order) equilibria in $\text{\emph{symmetric}}$ min-max optimization is $\textsf{PPAD}$-complete, even for quadratic functions. Building on this reduction, we show that computing symmetric $\epsilon$-Nash equilibria in symmetric, $6$-player ($3$ vs. $3$) team zero-sum games is also $\textsf{PPAD}$-complete, even for $\epsilon = \text{poly}(1/n)$. As a corollary, this precludes the existence of symmetric dynamics---which includes many of the algorithms considered in the literature---converging to stationary points. Finally, we prove that computing a $\text{\emph{non-symmetric}}$ $\text{poly}(1/n)$-equilibrium in symmetric min-max optimization is $\textsf{FNP}$-hard.

ICLR Conference 2024 Conference Paper

Beating Price of Anarchy and Gradient Descent without Regret in Potential Games

  • Iosif Sakos
  • Stefanos Leonardos
  • Stelios Andrew Stavroulakis
  • Will Overman
  • Ioannis Panageas
  • Georgios Piliouras

Arguably one of the thorniest problems in game theory is that of equilibrium selection. Specifically, in the presence of multiple equilibria, do self-interested learning dynamics typically select the socially optimal ones? We study a rich class of continuous-time no-regret dynamics in potential games (PGs). Our class of dynamics, *Q-Replicator Dynamics* (QRD), includes gradient descent (GD), log-barrier, and replicator dynamics (RD) as special cases. We start by establishing *pointwise convergence* of all QRD to Nash equilibria in almost all PGs. In the case of GD, we show tight average-case performance within a factor of two of optimal for a class of symmetric $2\times2$ potential games with unbounded Price of Anarchy (PoA). Despite this positive result, we show that GD is not always the optimal choice even in this restricted setting. Specifically, GD outperforms RD if and only if *risk-* and *payoff-dominance* equilibria coincide. Finally, we show experimentally how these insights extend to all QRD dynamics, and that unbounded gaps between average-case performance and PoA analysis are common even in larger settings.
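
For concreteness, a hedged sketch of the $q=1$ member of the QRD family, replicator dynamics, discretized with Euler steps on a symmetric $2\times2$ coordination game; the payoff matrix and step size are our own illustrative choices.

```python
# Replicator dynamics (QRD with q = 1), Euler-discretized; illustrative constants.
import numpy as np

A = np.array([[2.0, 0.0],
              [0.0, 1.0]])            # coordination game with two pure equilibria

x = np.array([0.4, 0.6])              # initial mixed strategy on the simplex
dt = 0.01
for _ in range(5000):
    u = A @ x                         # payoff of each pure strategy
    avg = x @ u                       # average payoff under x
    x = x + dt * x * (u - avg)        # replicator vector field, Euler step
    x = np.clip(x, 0.0, None)
    x = x / x.sum()                   # guard against numerical drift

print(x)                              # approaches a pure Nash equilibrium
```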

AAAI Conference 2024 Conference Paper

Computing Nash Equilibria in Potential Games with Private Uncoupled Constraints

  • Nikolas Patris
  • Stelios Stavroulakis
  • Fivos Kalogiannis
  • Rose Zhang
  • Ioannis Panageas

We consider the problem of computing Nash equilibria in potential games where each player's strategy set is subject to private uncoupled constraints. This scenario is frequently encountered in real-world applications like road network congestion games, where individual drivers adhere to personal budget and fuel limitations. Despite the plethora of algorithms that efficiently compute Nash equilibria (NE) in potential games, the domain of constrained potential games remains largely unexplored. We introduce an algorithm that leverages the Lagrangian formulation of NE. The algorithm is implemented independently by each player and runs in polynomial time with respect to the approximation error, the sum of the sizes of the action spaces, and the game's inherent parameters.

UAI Conference 2024 Conference Paper

Last-iterate Convergence Separation between Extra-gradient and Optimism in Constrained Periodic Games

  • Yi Feng
  • Ping Li
  • Ioannis Panageas
  • Xiao Wang 0036

Last-iterate behaviors of learning algorithms in repeated two-player zero-sum games have been extensively studied due to their wide applications in machine learning and related tasks. Typical algorithms that exhibit the last-iterate convergence property include optimistic and extra-gradient methods. However, most existing results establish these properties under the assumption that the game is time-independent. Recently, (Feng et al., 2023) studied the last-iterate behaviors of optimistic and extra-gradient methods in games with a time-varying payoff matrix, and proved that in an unconstrained periodic game, the extra-gradient method converges to the equilibrium while the optimistic method diverges. This finding challenges the conventional wisdom that these two methods are expected to behave similarly, as they do in time-independent games. However, compared to unconstrained games, games with constraints are more common in both practical and theoretical studies. In this paper, we investigate the last-iterate behaviors of optimistic and extra-gradient methods in constrained periodic games, demonstrating that similar separation results for last-iterate convergence also hold in this setting.

NeurIPS Conference 2024 Conference Paper

Learning Equilibria in Adversarial Team Markov Games: A Nonconvex-Hidden-Concave Min-Max Optimization Problem

  • Fivos Kalogiannis
  • Jingming Yan
  • Ioannis Panageas

We study the problem of learning a Nash equilibrium (NE) in Markov games, a cornerstone of multi-agent reinforcement learning (MARL). In particular, we focus on infinite-horizon adversarial team Markov games (ATMGs), in which agents that share a common reward function compete against a single opponent, *the adversary*. These games unify two-player zero-sum Markov games and Markov potential games, resulting in a setting that encompasses both collaboration and competition. Kalogiannis et al. (2023) provided an efficient equilibrium computation algorithm for ATMGs which presumes knowledge of the reward and transition functions and has no sample complexity guarantees. We contribute a learning algorithm that utilizes MARL policy gradient methods with iteration and sample complexity that is polynomial in the approximation error $\epsilon$ and the natural parameters of the ATMG, resolving the main caveats of the solution of Kalogiannis et al. (2023). It is worth noting that previously, the existence of learning algorithms for NE was known for two-player zero-sum and potential Markov games, but not for ATMGs. Seen through the lens of min-max optimization, computing a NE in these games constitutes a nonconvex--nonconcave saddle-point problem. Min-max optimization has received extensive study. Nevertheless, the case of nonconvex--nonconcave landscapes remains elusive: in full generality, finding saddle points is computationally intractable (Daskalakis et al., 2021). We circumvent this intractability by developing techniques that exploit the hidden structure of the objective function via a nonconvex--concave reformulation. However, this introduces the challenge of a feasibility set with coupled constraints. We tackle these challenges by establishing novel techniques for optimizing weakly smooth nonconvex functions, extending the framework of (Devolder et al., 2014).

ICLR Conference 2024 Conference Paper

Learning Nash Equilibria in Rank-1 Games

  • Nikolas Patris
  • Ioannis Panageas

Learning Nash equilibria (NE) in games has garnered significant attention, particularly in the context of training Generative Adversarial Networks (GANs) and multi-agent Reinforcement Learning. The current state of the art in efficiently learning games focuses on landscapes that meet the (weak) Minty property or on games characterized by a single potential function, often referred to as potential games. A significant challenge in this domain is that computing Nash equilibria is a computationally intractable task [Daskalakis et al. 2009]. In this paper we focus on bimatrix games $(A, B)$ called rank-1: games in which the sum of the payoff matrices $A + B$ is a rank-$1$ matrix; note that standard zero-sum games are rank $0$. We show that optimistic gradient descent/ascent converges to an $\epsilon$-approximate NE after $O(1/\epsilon^2 \log(1/\epsilon))$ iterates in rank-1 games. We achieve this by leveraging structural results about the NE landscape of rank-1 games [Adsul et al. 2021]. Notably, our approach bypasses the fact that these games do not satisfy the MVI property.

AAAI Conference 2024 Conference Paper

Optimistic Policy Gradient in Multi-Player Markov Games with a Single Controller: Convergence beyond the Minty Property

  • Ioannis Anagnostides
  • Ioannis Panageas
  • Gabriele Farina
  • Tuomas Sandholm

Policy gradient methods enjoy strong practical performance in numerous tasks in reinforcement learning. Their theoretical understanding in multiagent settings, however, remains limited, especially beyond two-player competitive and potential Markov games. In this paper, we develop a new framework to characterize optimistic policy gradient methods in multi-player Markov games with a single controller. Specifically, under the further assumption that the game exhibits an equilibrium collapse, in that the marginals of coarse correlated equilibria (CCE) induce Nash equilibria (NE), we show convergence to stationary $\epsilon$-NE in $O(1/\epsilon^2)$ iterations, where $O(\cdot)$ suppresses polynomial factors in the natural parameters of the game. Such an equilibrium collapse is well-known to manifest itself in two-player zero-sum Markov games, but also occurs even in a class of multi-player Markov games with separable interactions, as established by recent work. As a result, we bypass known complexity barriers for computing stationary NE when either of our assumptions fails. Our approach relies on a natural generalization of the classical Minty property that we introduce, which we anticipate to have further applications beyond Markov games.

ICML Conference 2024 Conference Paper

The Computational Complexity of Finding Second-Order Stationary Points

  • Andreas Kontogiannis
  • Vasilis Pollatos
  • Sotiris Kanellopoulos
  • Panayotis Mertikopoulos
  • Aris Pagourtzis
  • Ioannis Panageas

Non-convex minimization problems are universally considered hard, and even guaranteeing that a computed solution is locally minimizing is known to be NP-hard. In this general context, our paper focuses on the problem of finding stationary points that satisfy an approximate second-order optimality condition, which serves to exclude strict saddles and other non-minimizing stationary points. Our main result is that the problem of finding approximate second-order stationary points (SOSPs) is PLS-complete, i.e., of the same complexity as the problem of finding first-order stationary points (FOSPs), thus resolving an open question in the field. In particular, our results imply that, under the widely believed complexity conjecture that PLS $\neq$ FNP, finding approximate SOSPs in unconstrained domains is easier than in constrained domains, which is known to be NP-hard. This comes in stark contrast with earlier results which implied that, unless PLS = CLS, finding approximate FOSPs in unconstrained domains is harder than in constrained domains.

ICLR Conference 2023 Conference Paper

Efficiently Computing Nash Equilibria in Adversarial Team Markov Games

  • Fivos Kalogiannis
  • Ioannis Anagnostides
  • Ioannis Panageas
  • Emmanouil V. Vlatakis-Gkaragkounis
  • Vaggos Chatziafratis
  • Stelios Andrew Stavroulakis

Computing Nash equilibrium policies is a central problem in multi-agent reinforcement learning that has received extensive attention both in theory and in practice. However, in light of computational intractability barriers in general-sum games, provable guarantees have thus far either been limited to fully competitive or cooperative scenarios, or have imposed strong assumptions that are difficult to meet in most practical applications. In this work, we depart from those prior results by investigating infinite-horizon \emph{adversarial team Markov games}, a natural and well-motivated class of games in which a team of identically-interested players---in the absence of any explicit coordination or communication---is competing against an adversarial player. This setting allows for a unifying treatment of zero-sum Markov games and Markov potential games, and serves as a step to model more realistic strategic interactions that feature both competing and cooperative interests. Our main contribution is the first algorithm for computing stationary $\epsilon$-approximate Nash equilibria in adversarial team Markov games with computational complexity that is polynomial in all the natural parameters of the game, as well as $1/\epsilon$. The proposed algorithm is based on performing independent policy gradient steps for each player in the team, in tandem with best responses from the side of the adversary; in turn, the policy for the adversary is then obtained by solving a carefully constructed linear program. Our analysis leverages non-standard techniques to establish the KKT optimality conditions for a nonlinear program with nonconvex constraints, thereby leading to a natural interpretation of the induced Lagrange multipliers.

NeurIPS Conference 2023 Conference Paper

Exponential Lower Bounds for Fictitious Play in Potential Games

  • Ioannis Panageas
  • Nikolas Patris
  • Stratis Skoulakis
  • Volkan Cevher

Fictitious Play (FP) is a simple and natural dynamic for repeated play, with many applications in game theory and multi-agent reinforcement learning. It was introduced by Brown, and its convergence properties for two-player zero-sum games were established later by Robinson. Potential games are another class of games which exhibit the FP property [Monderer and Shapley 1996], i.e., FP dynamics converge to a Nash equilibrium if all agents follow them. Nevertheless, except for two-player zero-sum games and for specific instances of payoff matrices [Abernethy et al. 2021] or adversarial tie-breaking rules [Daskalakis and Pan, 2014], the \textit{convergence rate} of FP is unknown. In this work, we focus on the rate of convergence of FP when applied to potential games, and more specifically identical-payoff games. We prove that FP can take exponential time (in the number of strategies) to reach a Nash equilibrium, even if the game is restricted to \textit{two agents}. To prove this, we recursively construct a two-player coordination game with a unique Nash equilibrium. Moreover, every approximate Nash equilibrium in the constructed game must be close to the pure Nash equilibrium in $\ell_1$-distance.
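
To make the dynamic concrete, here is a minimal sketch (illustrative, not the paper's exponential-time construction) of Fictitious Play in a two-player identical-payoff game: each round, every player best responds to the empirical frequency of the opponent's past actions.

```python
# Fictitious Play in a 2x2 identical-payoff game; payoffs and horizon are arbitrary.
import numpy as np

A = np.array([[3.0, 0.0],
              [0.0, 2.0]])            # both players receive payoff A[i, j]

counts1 = np.ones(2)                  # fictitious counts of player 1's actions
counts2 = np.ones(2)
for t in range(100):
    a1 = int(np.argmax(A @ (counts2 / counts2.sum())))    # BR to opponent history
    a2 = int(np.argmax(A.T @ (counts1 / counts1.sum())))
    counts1[a1] += 1
    counts2[a2] += 1

print(counts1 / counts1.sum(), counts2 / counts2.sum())   # locks onto a pure NE
```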

AAAI Conference 2023 Conference Paper

Mean Estimation of Truncated Mixtures of Two Gaussians: A Gradient Based Approach

  • Sai Ganesh Nagarajan
  • Gerasimos Palaiopanos
  • Ioannis Panageas
  • Tushar Vaidya
  • Samson Yu

Even though data is abundant, it is often subjected to some form of censoring or truncation which inherently creates biases. Removing such biases and performing parameter estimation is a classical challenge in Statistics. In this paper, we focus on the problem of estimating the means of a mixture of two balanced $d$-dimensional Gaussians when the samples are prone to truncation. A recent theoretical study of the performance of the Expectation-Maximization (EM) algorithm for this problem showed that EM almost surely converges to the true means for $d=1$ and exhibits local convergence for $d>1$. Nevertheless, the EM algorithm for truncated mixtures of two Gaussians is not easy to implement, as it requires solving a set of nonlinear equations at every iteration, which makes the algorithm impractical. In this work, we propose a gradient-based variant of the EM algorithm that has global convergence guarantees to the true means for $d=1$ and local convergence for $d>1$. Moreover, the update rule at every iteration is easy to compute, which makes the proposed method practical. We also provide numerous experiments to obtain more insights into the effect of truncation on the convergence to the true parameters in high dimensions.

NeurIPS Conference 2023 Conference Paper

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

  • Ioannis Anagnostides
  • Ioannis Panageas
  • Gabriele Farina
  • Tuomas Sandholm

Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multiagent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games, parameterized by natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also apply to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.

NeurIPS Conference 2023 Conference Paper

On the Last-iterate Convergence in Time-varying Zero-sum Games: Extra Gradient Succeeds where Optimism Fails

  • Yi Feng
  • Hu Fu
  • Qun Hu
  • Ping Li
  • Ioannis Panageas
  • Bo Peng
  • Xiao Wang

Last-iterate convergence has received extensive study in two-player zero-sum games, from bilinear and convex-concave settings up to settings that satisfy the MVI condition. Typical methods that exhibit last-iterate convergence for such games include extra-gradient (EG) and optimistic gradient descent ascent (OGDA). However, all the established last-iterate convergence results hold for the restrictive setting where the underlying repeated game does not change over time. Recently, a line of research has focused on regret analysis of OGDA in time-varying games, i.e., games where payoffs evolve with time; the last-iterate behavior of OGDA and EG in time-varying environments, however, remains unclear. In this paper, we study the last-iterate behavior of various algorithms in two types of unconstrained, time-varying, bilinear zero-sum games: periodic games and convergent perturbed games. These models expand upon the usual repeated game formulation and incorporate external environmental factors, such as seasonal effects on species competition and vanishing external noise. In periodic games, we prove that EG converges while OGDA and the momentum method diverge. This is quite surprising, as to the best of our knowledge it is the first result indicating that EG and OGDA have qualitatively different last-iterate behaviors. In convergent perturbed games, we prove that all these algorithms converge as long as the game itself stabilizes at a rate faster than $1/t$.
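
An illustrative sketch of the separation on a one-dimensional unconstrained periodic bilinear game $f_t(x, y) = a(t)\,x\,y$; the period, step size, and horizon are our own assumptions for the demo, not taken from the paper. Per the stated result, EG should approach the equilibrium $(0, 0)$ while OGDA should not.

```python
# EG vs. OGDA on a periodic bilinear game; all constants are illustrative.
import numpy as np

def a(t):                              # periodic payoff coefficient
    return np.sin(0.1 * t)

eta = 0.05
x_eg = y_eg = 1.0                      # extra-gradient iterate
x_og = y_og = 1.0                      # OGDA iterate
gx_prev = gy_prev = 0.0                # previous gradients for OGDA

for t in range(5000):
    c = a(t)
    # EG: gradient step to a midpoint, then a step using the midpoint gradient
    xm, ym = x_eg - eta * c * y_eg, y_eg + eta * c * x_eg
    x_eg, y_eg = x_eg - eta * c * ym, y_eg + eta * c * xm
    # OGDA: step with 2 * (current gradient) - (previous gradient)
    gx, gy = c * y_og, c * x_og
    x_og, y_og = (x_og - eta * (2 * gx - gx_prev),
                  y_og + eta * (2 * gy - gy_prev))
    gx_prev, gy_prev = gx, gy

print("EG   distance to equilibrium:", np.hypot(x_eg, y_eg))
print("OGDA distance to equilibrium:", np.hypot(x_og, y_og))
```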

ICML Conference 2023 Conference Paper

Semi Bandit dynamics in Congestion Games: Convergence to Nash Equilibrium and No-Regret Guarantees

  • Ioannis Panageas
  • Stratis Skoulakis
  • Luca Viano
  • Xiao Wang 0036
  • Volkan Cevher

In this work, we introduce a variant of online stochastic gradient descent and prove that it converges to Nash equilibria while simultaneously attaining sublinear regret for the class of congestion games in the semi-bandit feedback setting. Our proposed method admits convergence rates that depend only polynomially on the number of players and the number of facilities, and not on the size of the action set, which can be exponentially large in the number of facilities. Moreover, the running time of our method has polynomial dependence on the implicit description of the game. Our analysis exploits techniques from convex geometry, in particular Caratheodory’s theorem, and recent advances in non-convex stochastic optimization. This work improves upon and answers an open question from (Cui et al., 2022).

ICLR Conference 2023 Conference Paper

Towards convergence to Nash equilibria in two-team zero-sum games

  • Fivos Kalogiannis
  • Ioannis Panageas
  • Emmanouil V. Vlatakis-Gkaragkounis

Contemporary applications of machine learning raise important and overlooked theoretical questions regarding optimization in two-team games. Formally, two-team zero-sum games are defined as multi-player games where players are split into two competing sets of agents, each experiencing a utility identical to that of their teammates and opposite to that of the opposing team. We focus on the solution concept of Nash equilibria and prove $\textrm{CLS}$-hardness of computing them in this class of games. To further examine the capabilities of online learning algorithms in games with full-information feedback, we propose a benchmark of a simple---yet nontrivial---family of such games. These games do not enjoy the properties used to prove convergence for relevant algorithms. In particular, we use a dynamical systems perspective to demonstrate that gradient descent-ascent, its optimistic variant, optimistic multiplicative weights update, and extra gradient fail to converge (even locally) to a Nash equilibrium. On a brighter note, we propose a first-order method that leverages control theory techniques and, under some conditions, enjoys last-iterate local convergence to a Nash equilibrium. We also believe our proposed method is of independent interest for general min-max optimization.

NeurIPS Conference 2023 Conference Paper

Zero-sum Polymatrix Markov Games: Equilibrium Collapse and Efficient Computation of Nash Equilibria

  • Fivos Kalogiannis
  • Ioannis Panageas

The works of (Daskalakis et al., 2009, 2022; Jin et al., 2022; Deng et al., 2023) indicate that computing Nash equilibria in multi-player Markov games is a computationally hard task. This fact raises the question of whether or not computational intractability can be circumvented if one focuses on specific classes of Markov games. One such example is two-player zero-sum Markov games, for which efficient ways to compute a Nash equilibrium are known. Inspired by zero-sum polymatrix normal-form games (Cai et al., 2016), we define a class of zero-sum multi-agent Markov games in which there are only pairwise interactions, described by a graph that changes per state. For this class of Markov games, we show that an $\epsilon$-approximate Nash equilibrium can be found efficiently. To do so, we generalize the techniques of (Cai et al., 2016) by showing that the set of coarse correlated equilibria collapses to the set of Nash equilibria. Afterwards, any algorithm in the literature that computes Markovian policies forming approximate coarse correlated equilibria can be used to obtain an approximate Nash equilibrium.

IJCAI Conference 2022 Conference Paper

Accelerated Multiplicative Weights Update Avoids Saddle Points Almost Always

  • Yi Feng
  • Ioannis Panageas
  • Xiao Wang

We consider nonconvex optimization problems whose constraint set is a product of simplices. A commonly used algorithm for solving this type of problem is the Multiplicative Weights Update (MWU), an algorithm that is widely used in game theory, machine learning, and multi-agent systems. Although it is known that MWU avoids saddle points, one question remains unaddressed: ``Is there an accelerated version of MWU that provably avoids saddle points?'' In this paper we provide a positive answer to the above question. We propose an accelerated MWU based on Riemannian Accelerated Gradient Descent, and prove that Riemannian Accelerated Gradient Descent, and hence the accelerated MWU, avoids saddle points.

ICLR Conference 2022 Conference Paper

Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games

  • Stefanos Leonardos
  • Will Overman
  • Ioannis Panageas
  • Georgios Piliouras

Potential games are arguably one of the most important and widely studied classes of normal-form games. They define the archetypal setting of multi-agent coordination, in which all agents' utilities are perfectly aligned via a common potential function. Can this intuitive framework be transplanted to the setting of Markov games? What are the similarities and differences between multi-agent coordination with and without state dependence? To answer these questions, we study a natural class of Markov Potential Games (MPGs) that generalize prior attempts at capturing complex stateful multi-agent coordination. Counter-intuitively, insights from normal-form potential games do not carry over, as MPGs involve settings where state-games can be zero-sum games. In the opposite direction, Markov games where every state-game is a potential game are not necessarily MPGs. Nevertheless, MPGs showcase standard desirable properties such as the existence of deterministic Nash policies. In our main technical result, we prove convergence of independent policy gradient and its stochastic counterpart to Nash policies (polynomially fast in the approximation error), by adapting recent gradient dominance arguments developed for single-agent Markov decision processes to multi-agent learning settings.

ICML Conference 2022 Conference Paper

On Last-Iterate Convergence Beyond Zero-Sum Games

  • Ioannis Anagnostides
  • Ioannis Panageas
  • Gabriele Farina
  • Tuomas Sandholm

Most existing results about last-iterate convergence of learning dynamics are limited to two-player zero-sum games, and only apply under rigid assumptions about what dynamics the players follow. In this paper we provide new results and techniques that apply to broader families of games and learning dynamics. First, we show that in a class of games that includes constant-sum polymatrix and strategically zero-sum games, the trajectories of dynamics such as optimistic mirror descent (OMD) exhibit a boundedness property, which holds even when players employ different algorithms and prediction mechanisms. This property enables us to obtain $O(1/\sqrt{T})$ rates and optimal $O(1)$ regret bounds. Our analysis also reveals a surprising property: OMD either reaches arbitrarily close to a Nash equilibrium or it outperforms the robust price of anarchy in efficiency. Moreover, for potential games we establish convergence to an $\epsilon$-equilibrium after $O(1/\epsilon^2)$ iterations for mirror descent under a broad class of regularizers, as well as optimal $O(1)$ regret bounds for OMD variants. Our framework also extends to near-potential games, and unifies known analyses for distributed learning in Fisher’s market model. Finally, we analyze the convergence, efficiency, and robustness of optimistic gradient descent (OGD) in general-sum continuous games.

NeurIPS Conference 2022 Conference Paper

On Scrambling Phenomena for Randomly Initialized Recurrent Networks

  • Vaggos Chatziafratis
  • Ioannis Panageas
  • Clayton Sanford
  • Stelios Stavroulakis

Recurrent Neural Networks (RNNs) frequently exhibit complicated dynamics, and their sensitivity to the initialization process often renders them notoriously hard to train. Recent works have shed light on such phenomena analyzing when exploding or vanishing gradients may occur, either of which is detrimental for training dynamics. In this paper, we point to a formal connection between RNNs and chaotic dynamical systems and prove a qualitatively stronger phenomenon about RNNs than what exploding gradients seem to suggest. Our main result proves that under standard initialization (e.g., He, Xavier, etc.), RNNs will exhibit \textit{Li-Yorke chaos} with \textit{constant} probability \textit{independent} of the network's width. This explains the experimentally observed phenomenon of \textit{scrambling}, under which trajectories of nearby points may appear to be arbitrarily close during some timesteps, yet will be far away in future timesteps. In stark contrast to their feedforward counterparts, we show that chaotic behavior in RNNs is preserved under small perturbations and that their expressive power remains exponential in the number of feedback iterations. Our technical arguments rely on viewing RNNs as random walks under non-linear activations, and studying the existence of certain types of higher-order fixed points called \textit{periodic points} in order to establish phase transitions from order to chaos.
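
A small experiment in the spirit of the scrambling phenomenon (our sketch; the width, seed, and horizon are arbitrary): iterate a randomly initialized tanh RNN from two nearby states and track their distance, which, depending on the random draw, can repeatedly shrink and grow.

```python
# Two nearby trajectories of a randomly initialized tanh RNN; illustrative sizes.
import numpy as np

rng = np.random.default_rng(0)
n = 16
W = rng.normal(0.0, np.sqrt(2.0 / n), size=(n, n))   # He-style initialization

h1 = rng.normal(size=n) * 0.1
h2 = h1 + 1e-8                                       # nearby starting state

dists = []
for t in range(200):
    h1 = np.tanh(W @ h1)
    h2 = np.tanh(W @ h2)
    dists.append(np.linalg.norm(h1 - h2))

print("min distance:", min(dists), "max distance:", max(dists))
```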

NeurIPS Conference 2022 Conference Paper

Optimistic Mirror Descent Either Converges to Nash or to Strong Coarse Correlated Equilibria in Bimatrix Games

  • Ioannis Anagnostides
  • Gabriele Farina
  • Ioannis Panageas
  • Tuomas Sandholm

We show that, for any sufficiently small fixed $\epsilon > 0$, when both players in a general-sum two-player (bimatrix) game employ optimistic mirror descent (OMD) with smooth regularization, learning rate $\eta = O(\epsilon^2)$ and $T = \Omega(poly(1/\epsilon))$ repetitions, either the dynamics reach an $\epsilon$-approximate Nash equilibrium (NE), or the average correlated distribution of play is an $\Omega(poly(\epsilon))$-strong coarse correlated equilibrium (CCE): any unilateral deviation not only leaves the deviating player worse off, but decreases its utility by $\Omega(poly(\epsilon))$. As an immediate consequence, when the iterates of OMD are bounded away from being Nash equilibria in a bimatrix game, we guarantee convergence to an \emph{exact} CCE after only $O(1)$ iterations. Our results reveal that uncoupled no-regret learning algorithms can converge to CCE in general-sum games remarkably faster than to NE in, for example, zero-sum games. To establish this, we show that when OMD does not reach arbitrarily close to a NE, the (cumulative) regret of both players is not only negative, but decays linearly with time. Given that regret is the canonical measure of performance in online learning, our results suggest that the cycling behavior of no-regret learning algorithms in games can be justified in terms of efficiency.

ICML Conference 2020 Conference Paper

Better depth-width trade-offs for neural networks through the lens of dynamical systems

  • Vaggos Chatziafratis
  • Sai Ganesh Nagarajan
  • Ioannis Panageas

The expressivity of neural networks as a function of their depth, width and type of activation units has been an important question in deep learning theory. Recently, depth separation results for ReLU networks were obtained via a new connection with dynamical systems, using a generalized notion of fixed points of a continuous map $f$, called periodic points. In this work, we strengthen the connection with dynamical systems and we improve the existing width lower bounds along several aspects. Our first main result is period-specific width lower bounds that hold under the stronger notion of $L^1$-approximation error, instead of the weaker classification error. Our second contribution is that we provide sharper width lower bounds, still yielding meaningful exponential depth-width separations, in regimes where previous results wouldn’t apply. A byproduct of our results is that there exists a universal constant characterizing the depth-width trade-offs, as long as $f$ has odd periods. Technically, our results follow by unveiling a tighter connection between the following three quantities of a given function: its period, its Lipschitz constant and the growth rate of the number of oscillations arising under compositions of the function $f$ with itself.

ICLR Conference 2020 Conference Paper

Depth-Width Trade-offs for ReLU Networks via Sharkovsky's Theorem

  • Vaggos Chatziafratis
  • Sai Ganesh Nagarajan
  • Ioannis Panageas
  • Xiao Wang 0036

Understanding the representational power of Deep Neural Networks (DNNs), and how their structural properties (e.g., depth, width, type of activation unit) affect the functions they can compute, has been an important yet challenging question in deep learning and approximation theory. In a seminal paper, Telgarsky highlighted the benefits of depth by presenting a family of functions (based on simple triangular waves) for which DNNs achieve zero classification error, whereas shallow networks with fewer than exponentially many nodes incur constant error. Even though Telgarsky’s work reveals the limitations of shallow neural networks, it does not explain why these functions are difficult to represent, and in fact he states it as a tantalizing open question to characterize those functions that cannot be well-approximated by smaller depths. In this work, we point to a new connection between DNN expressivity and Sharkovsky’s Theorem from dynamical systems, which enables us to characterize the depth-width trade-offs of ReLU networks for representing functions based on the presence of a generalized notion of fixed points, called periodic points (a fixed point is a point of period 1). Motivated by our observation that the triangle waves used in Telgarsky’s work contain points of period 3 – a period that is special in that it implies chaotic behavior, by the celebrated result of Li-Yorke – we proceed to give general lower bounds for the width needed to represent periodic functions as a function of the depth. Technically, the crux of our approach is based on an eigenvalue analysis of the dynamical systems associated with such functions.
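
A worked check (ours, for illustration) of the two dynamical-systems ingredients: the tent map $f(x) = 1 - |2x - 1|$ is a triangle wave of the kind used by Telgarsky, the point $x = 2/7$ has period 3, and the $k$-fold composition has about $2^k$ monotone pieces, the oscillation growth behind the depth-width separation.

```python
# Period-3 orbit and oscillation count for the tent map (illustrative check).
from fractions import Fraction
import numpy as np

def tent(x):
    return 1 - abs(2 * x - 1)

# Exact period-3 orbit: 2/7 -> 4/7 -> 6/7 -> 2/7 (printed as Fractions).
x = Fraction(2, 7)
orbit = [x]
for _ in range(3):
    orbit.append(tent(orbit[-1]))
print(orbit)

# The k-fold composition of the tent map has 2^k monotone pieces.
xs = np.linspace(0.0, 1.0, 100_001)
for k in (1, 2, 3, 4):
    ys = xs.copy()
    for _ in range(k):
        ys = tent(ys)
    d = np.diff(ys)
    pieces = int(np.sum(d[1:] * d[:-1] < 0)) + 1   # count slope sign changes
    print(f"k={k}: {pieces} monotone pieces")
```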

NeurIPS Conference 2020 Conference Paper

Fast Convergence of Langevin Dynamics on Manifold: Geodesics meet Log-Sobolev

  • Xiao Wang
  • Qi Lei
  • Ioannis Panageas

Sampling is a fundamental task with numerous applications in Machine Learning. One approach to sampling from a high-dimensional distribution $e^{-f}$ for some function $f$ is the Langevin Algorithm (LA). Recently, there has been a lot of progress in showing fast convergence of LA even in cases where $f$ is non-convex, notably \cite{VW19} and \cite{MoritaRisteski}; the former focuses on functions $f$ defined on $\mathbb{R}^n$, while the latter focuses on functions with symmetries (like matrix completion type objectives) and manifold structure. Our work generalizes the results of \cite{VW19} to the setting where $f$ is defined on a manifold $M$ rather than $\mathbb{R}^n$. From a technical point of view, we show that the KL divergence decreases at a geometric rate whenever the distribution $e^{-f}$ satisfies a log-Sobolev inequality on $M$.
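
For reference, a hedged sketch of the Euclidean Langevin Algorithm that the paper generalizes (the manifold version replaces this update with a geodesic step): $x_{k+1} = x_k - \eta \nabla f(x_k) + \sqrt{2\eta}\,\xi_k$, here sampling from $e^{-f}$ with $f(x) = \|x\|^2/2$, i.e., a standard Gaussian. Step size and horizon are arbitrary choices.

```python
# Unadjusted Langevin Algorithm sampling a standard Gaussian; illustrative constants.
import numpy as np

rng = np.random.default_rng(1)

def grad_f(x):
    return x                           # gradient of f(x) = ||x||^2 / 2

eta = 0.01
x = np.zeros(2)
samples = []
for k in range(50_000):
    x = x - eta * grad_f(x) + np.sqrt(2 * eta) * rng.normal(size=2)
    if k > 10_000:                     # discard burn-in
        samples.append(x.copy())

samples = np.array(samples)
print("sample mean:", samples.mean(axis=0))      # close to 0
print("sample variance:", samples.var(axis=0))   # close to 1
```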

NeurIPS Conference 2019 Conference Paper

First-order methods almost always avoid saddle points: The case of vanishing step-sizes

  • Ioannis Panageas
  • Georgios Piliouras
  • Xiao Wang

In a series of papers [Lee et al 2016], [Panageas and Piliouras 2017], [Lee et al 2019], it was established that some of the most commonly used first-order methods almost surely (under random initializations), and with small enough step-size, avoid strict saddle points, as long as the objective function $f$ is $C^2$ and has Lipschitz gradient. The key observation was that first-order methods can be studied from a dynamical systems perspective, in which instantiations of the Center-Stable Manifold Theorem allow for a global analysis. The results of the aforementioned papers were limited to the case where the step-size $\alpha$ is constant, i.e., does not depend on time (and is typically bounded by the inverse of the Lipschitz constant of the gradient of $f$). It remained an open question whether the results still hold when the step-size is time-dependent and vanishes with time. In this paper, we resolve this question in the affirmative for gradient descent, mirror descent, manifold descent and proximal point. The main technical challenge is that the dynamical system induced by each first-order method is time non-homogeneous, and the stable manifold theorem is not applicable in its classic form. By exploiting the dynamical systems structure of the aforementioned first-order methods, we are able to prove a stable manifold theorem that is applicable to time non-homogeneous dynamical systems, and generalize the results of [Lee et al 2019] to time-dependent step-sizes.

ICML Conference 2019 Conference Paper

Multiplicative Weights Updates as a distributed constrained optimization algorithm: Convergence to second-order stationary points almost always

  • Ioannis Panageas
  • Georgios Piliouras
  • Xiao Wang 0036

Non-concave maximization has been the subject of much recent study in the optimization and machine learning communities, specifically in deep learning. Recent papers ([Ge et al. 2015, Lee et al 2017] and references therein) indicate that first-order methods work well and avoid saddle points. Results as in [Lee et al 2017], however, are limited to the unconstrained case or to cases where the critical points lie in the interior of the feasibility set, which fails to capture some of the most interesting applications. In this paper we focus on constrained non-concave maximization. We analyze a variant of a well-established algorithm in machine learning called Multiplicative Weights Update (MWU) for the maximization problem $\max_{\mathbf{x} \in D} P(\mathbf{x})$, where $P$ is non-concave and twice continuously differentiable, and $D$ is a product of simplices. We show that MWU converges almost always, for small enough stepsizes, to critical points that satisfy the second-order KKT conditions, by combining techniques from dynamical systems and by taking advantage of a recent connection between the Baum-Eagon inequality and MWU [Palaiopanos et al 2017].

NeurIPS Conference 2018 Conference Paper

The Limit Points of (Optimistic) Gradient Descent in Min-Max Optimization

  • Constantinos Daskalakis
  • Ioannis Panageas

Motivated by applications in Optimization, Game Theory, and the training of Generative Adversarial Networks, the convergence properties of first order methods in min-max problems have received extensive study. It has been recognized that they may cycle, and there is no good understanding of their limit points when they do not. When they converge, do they converge to local min-max solutions? We characterize the limit points of two basic first order methods, namely Gradient Descent/Ascent (GDA) and Optimistic Gradient Descent Ascent (OGDA). We show that both dynamics avoid unstable critical points for almost all initializations. Moreover, for small step sizes and under mild assumptions, the set of OGDA-stable critical points is a superset of GDA-stable critical points, which is a superset of local min-max solutions (strict in some cases). The connecting thread is that the behavior of these dynamics can be studied from a dynamical systems perspective.
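
A standard illustration (not the paper's proof) of the contrast on the bilinear saddle $f(x, y) = xy$: GDA spirals away from the critical point at the origin, while OGDA converges to it. Step size and starting point are arbitrary.

```python
# GDA vs. OGDA on f(x, y) = x * y; illustrative constants.
import numpy as np

eta = 0.1
x_g, y_g = 1.0, 1.0                    # GDA iterate
x_o, y_o = 1.0, 1.0                    # OGDA iterate
gx_prev, gy_prev = 0.0, 0.0

for _ in range(200):
    # GDA: simultaneous descent on x and ascent on y
    x_g, y_g = x_g - eta * y_g, y_g + eta * x_g
    # OGDA: adds a correction using the previous gradient
    gx, gy = y_o, x_o
    x_o, y_o = (x_o - eta * (2 * gx - gx_prev),
                y_o + eta * (2 * gy - gy_prev))
    gx_prev, gy_prev = gx, gy

print("GDA  distance from origin:", np.hypot(x_g, y_g))   # grows
print("OGDA distance from origin:", np.hypot(x_o, y_o))   # shrinks
```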

NeurIPS Conference 2017 Conference Paper

Multiplicative Weights Update with Constant Step-Size in Congestion Games: Convergence, Limit Cycles and Chaos

  • Gerasimos Palaiopanos
  • Ioannis Panageas
  • Georgios Piliouras

The Multiplicative Weights Update (MWU) method is a ubiquitous meta-algorithm that works as follows: A distribution is maintained on a certain set, and at each step the probability assigned to action $\gamma$ is multiplied by $(1 -\epsilon C(\gamma))>0$, where $C(\gamma)$ is the ``cost'' of action $\gamma$, and then rescaled to ensure that the new values form a distribution. We analyze MWU in congestion games where agents use \textit{arbitrary admissible constants} as learning rates $\epsilon$ and prove convergence to \textit{exact Nash equilibria}. Interestingly, this convergence result does not carry over to the nearly homologous MWU variant where at each step the probability assigned to action $\gamma$ is multiplied by $(1 -\epsilon)^{C(\gamma)}$, even for the simplest case of two-agent, two-strategy load balancing games, where such dynamics can provably lead to limit cycles or even chaotic behavior.
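
A schematic one-dimensional reduction (our illustration; the cost scaling and learning rates are made-up constants) of the two variants in a symmetric two-agent, two-link load-balancing game, where $x$ is each agent's probability of choosing link 1 and link costs are proportional to expected load. The linear variant uses an admissible step and settles; the exponential variant uses a large step, the regime in which the paper shows non-convergence can occur.

```python
# Linear vs. exponential MWU in a schematic load-balancing game; constants invented.
def cost(x):                           # expected congestion cost of each link
    return 4 * x, 4 * (1 - x)

def linear_mwu(x, eps=0.2):            # p <- p * (1 - eps * C), renormalized
    c1, c2 = cost(x)
    w1, w2 = x * (1 - eps * c1), (1 - x) * (1 - eps * c2)
    return w1 / (w1 + w2)

def exp_mwu(x, eps=0.9):               # p <- p * (1 - eps)^C, renormalized
    c1, c2 = cost(x)
    w1, w2 = x * (1 - eps) ** c1, (1 - x) * (1 - eps) ** c2
    return w1 / (w1 + w2)

x_lin = x_exp = 0.3
for _ in range(200):
    x_lin, x_exp = linear_mwu(x_lin), exp_mwu(x_exp)

print("linear MWU after 200 steps:", round(x_lin, 4))      # settles near 0.5
tail = []
for _ in range(6):
    x_exp = exp_mwu(x_exp)
    tail.append(round(x_exp, 4))
print("exponential MWU tail:", tail)                        # keeps oscillating
```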

SODA Conference 2016 Conference Paper

Evolutionary Dynamics in Finite Populations Mix Rapidly

  • Ioannis Panageas
  • Piyush Srivastava 0001
  • Nisheeth K. Vishnoi

In this paper we prove that the mixing time of a broad class of evolutionary dynamics in finite, unstructured populations is roughly logarithmic in the size of the state space. An important special case of such a stochastic process is the Wright-Fisher model from evolutionary biology (with selection and mutation) on a population of size $N$ over $m$ genotypes. Our main result implies that the mixing time of this process is $O(\log N)$ for all mutation rates and fitness landscapes, and solves the main open problem from [4]. In particular, it significantly extends the main result of [18], which proved this for $m = 2$. Biologically, such models have been used to study the evolution of viral populations, with applications to drug design strategies countering them. Here, the time it takes for the population to reach a steady state matters both for estimating the steady-state structure of the population and for modeling treatment strength and duration. Our result that such populations exhibit rapid mixing makes both of these approaches sound. Technically, we make a novel connection between Markov chains arising in evolutionary dynamics and dynamical systems on the probability simplex. This allows us to use the local and global stability properties of the fixed points of such dynamical systems to construct a contractive coupling in a fairly general setting. We expect that our mixing-time result will be useful beyond the evolutionary biology setting, and that the techniques used here will find applications in bounding the mixing times of Markov chains which have a natural underlying dynamical system.
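
For concreteness, a hedged simulation sketch of the Wright-Fisher special case (selection plus symmetric mutation, multinomial resampling); the population size, fitness values, and mutation rate are arbitrary choices, not parameters from the paper.

```python
# Wright-Fisher dynamics with selection and mutation; illustrative constants.
import numpy as np

rng = np.random.default_rng(0)
N, m = 500, 3                          # population size, number of genotypes
fitness = np.array([1.0, 1.1, 0.9])    # fitness landscape
mu = 0.01                              # mutation rate
Q = (1 - mu) * np.eye(m) + mu / m      # mutation kernel (rows sum to 1)

pop = rng.multinomial(N, [1 / m] * m)  # initial genotype counts
for gen in range(200):
    p = pop * fitness                  # selection re-weighting
    p = (p / p.sum()) @ Q              # apply mutation
    p = p / p.sum()                    # guard against rounding drift
    pop = rng.multinomial(N, p)        # resample the next generation

print(pop / N)                         # genotype frequencies after 200 generations
```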

IROS Conference 2013 Conference Paper

Support-theoretic subgraph preconditioners for large-scale SLAM

  • Yong-Dian Jian
  • Doru Balcan
  • Ioannis Panageas
  • Prasad Tetali
  • Frank Dellaert

Efficiently solving large-scale sparse linear systems is important for robot mapping and navigation. Recently, the subgraph-preconditioned conjugate gradient method has been proposed to combine the advantages of the two reigning paradigms, direct and iterative methods, to improve the efficiency of the solver. Yet the question of how to pick a good subgraph remains open. In this paper, we propose a new metric to measure the quality of a spanning tree preconditioner based on support theory. We use this metric to develop an algorithm for finding good subgraph preconditioners and apply them to solve the SLAM problem. The results show that although the proposed algorithm is not yet fast enough, the new metric is effective, and the resulting subgraph preconditioners significantly improve the efficiency of the state-of-the-art solver.
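
A hedged sketch (ours, not the paper's code) of the general subgraph-preconditioning idea: factor a simpler subgraph matrix exactly with a sparse LU and use it as a preconditioner inside conjugate gradient on the full system, here with SciPy on a synthetic chain-plus-loop-closures Laplacian standing in for a SLAM graph.

```python
# Subgraph-preconditioned CG on a synthetic SPD system; all data is made up.
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

n = 200
rng = np.random.default_rng(0)

# "Spanning tree" part: a chain-graph matrix plus a small diagonal shift.
ones = np.ones(n)
chain = (sp.diags([2 * ones, -ones[:-1], -ones[:-1]], [0, -1, 1], format="csc")
         + 1e-3 * sp.eye(n, format="csc"))

# Extra "loop closure" edges: the Laplacian of a random graph, added on top.
rows = rng.integers(0, n, size=150)
cols = rng.integers(0, n, size=150)
extra = sp.coo_matrix((np.ones(150), (rows, cols)), shape=(n, n))
extra = extra + extra.T
deg = np.asarray(extra.sum(axis=1)).ravel()
A = (chain + sp.diags(deg) - extra).tocsc()     # full system matrix (SPD)

b = rng.normal(size=n)

# Preconditioner: exact sparse LU factorization of the chain part only.
lu = spla.splu(chain)
M = spla.LinearOperator((n, n), matvec=lu.solve)

x, info = spla.cg(A, b, M=M)
print("PCG converged:", info == 0)
```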