Arrow Research search

Author name cluster

Shishen Lin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
1 author row

Possible papers (4)

IJCAI 2025 Conference Paper

Randomised Optimism via Competitive Co-Evolution for Matrix Games with Bandit Feedback

  • Shishen Lin

Learning in games is a fundamental problem in machine learning and artificial intelligence, with numerous applications. This work investigates two-player zero-sum matrix games with an unknown payoff matrix and bandit feedback, where each player observes their actions and the corresponding noisy payoff. Prior studies have proposed algorithms for this setting, demonstrating the effectiveness of deterministic optimism (e.g., UCB for matrix games) in achieving sublinear regret. However, the potential of randomised optimism in matrix games remains theoretically unexplored. We propose Competitive Co-evolutionary Bandit Learning (CoEBL), a novel algorithm that integrates evolutionary algorithms (EAs) into the bandit framework to implement randomised optimism through EA variation operators. We prove that CoEBL achieves sublinear regret, matching the performance of deterministic optimism-based methods. To the best of our knowledge, this is the first theoretical regret analysis of an evolutionary bandit learning algorithm in matrix games. Empirical evaluations on diverse matrix game benchmarks demonstrate that CoEBL not only achieves sublinear regret but also consistently outperforms classical bandit algorithms, including EXP3, the variant EXP3-IX, and UCB. These results highlight the potential of evolutionary bandit learning, particularly the efficacy of randomised optimism via evolutionary algorithms in game-theoretic settings.
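
For readers unfamiliar with the setting, the sketch below simulates the bandit-feedback matrix-game loop described in the abstract, using EXP3 in self-play (one of the classical baselines mentioned above) on a matching-pennies matrix. The payoff matrix, Bernoulli noise model, horizon and exploration rate are illustrative assumptions; this is a minimal sketch of the problem setting, not the CoEBL algorithm itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative 2x2 zero-sum game (matching pennies); unique mixed NE at (1/2, 1/2).
# The row player maximises the payoff, the column player minimises it.
A = np.array([[1.0, 0.0],
              [0.0, 1.0]])
T = 20_000
gamma = 0.05        # EXP3 exploration rate (an assumed tuning)


def exp3_probs(log_w, gamma):
    """Mix the softmax of the log-weights with uniform exploration."""
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return (1 - gamma) * w + gamma / len(log_w)


log_w_row = np.zeros(2)   # row player's log-weights
log_w_col = np.zeros(2)   # column player's log-weights
row_counts = np.zeros(2)

for t in range(T):
    p_row = exp3_probs(log_w_row, gamma)
    p_col = exp3_probs(log_w_col, gamma)
    i = rng.choice(2, p=p_row)
    j = rng.choice(2, p=p_col)

    # Bandit feedback: only a noisy payoff for the played entry is observed.
    payoff = rng.binomial(1, A[i, j])

    # Importance-weighted EXP3 updates (rewards in [0, 1] for both players).
    log_w_row[i] += gamma * (payoff / p_row[i]) / 2
    log_w_col[j] += gamma * ((1 - payoff) / p_col[j]) / 2
    row_counts[i] += 1

print("row player's empirical strategy:", row_counts / T)  # roughly (0.5, 0.5)
```

Because both players minimise regret, their time-averaged play approaches the mixed Nash equilibrium of the game, which is the sense in which sublinear regret matters in this self-play setting.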

AAAI 2025 Conference Paper

Towards Runtime Analysis of Population-Based Co-evolutionary Algorithms on Sparse Binary Zero-Sum Games

  • Per Kristian Lehre
  • Shishen Lin

The maximin optimisation problem, inspired by Von Neumann’s work (von Neumann 1928) and widely applied in adversarial optimisation, has become a key research area in machine learning. Gradient Descent Ascent (GDA) is a common method for solving these problems but requires the payoff function to be differentiable, making it unsuitable for discrete or binary functions that often occur in game-theoretical scenarios. Co-evolutionary algorithms (CoEAs), which are derivative-free, offer an alternative for such problems. However, the theoretical understanding of CoEAs is still limited. This paper provides the first rigorous runtime analysis of a CoEA with pairwise dominance (the PDCoEA) on binary two-player zero-sum games (or maximin problems), specifically focusing on the DIAGONAL game. The mathematical analysis rigorously shows that the PDCoEA can efficiently find the optimum in polynomial runtime with high probability under low mutation rates and large population sizes. Empirical evidence also identifies an error threshold where higher mutation rates lead to inefficiency. In contrast, single-pair-individual algorithms, i.e., RLS-PD and (1+1)-CoEAs, fail to find the optimum in polynomial time. These findings highlight the usefulness of pairwise dominance, low mutation rates, and large populations in maintaining a “co-evolutionary arms race”.
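
As a rough illustration of the algorithmic template analysed here, the following sketch runs a steady-state co-evolutionary loop over a population of (x, y) bitstring pairs with bitwise mutation and a pairwise-dominance-style comparison. The toy payoff g, the exact form of the dominance test, and all parameters are assumptions made for illustration; this is not the DIAGONAL game or the precise PDCoEA definition from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

n = 30            # bitstring length
pop_size = 100    # relatively large population, in the spirit of the analysis
chi = 0.5         # mutation rate chi/n, kept small (low-mutation-rate regime)
steps = 20_000


def g(x, y):
    # Toy zero-sum payoff (NOT the DIAGONAL game): the x-player maximises,
    # the y-player minimises; the maximiser's optimum is x = 1^n.
    return x.sum() - y.sum()


def mutate(z):
    flips = rng.random(n) < chi / n        # flip each bit independently
    return np.where(flips, 1 - z, z)


# Population of (x, y) pairs, initialised uniformly at random.
pop = [(rng.integers(0, 2, n), rng.integers(0, 2, n)) for _ in range(pop_size)]

for _ in range(steps):
    a, b = rng.integers(0, pop_size, 2)
    (x1, y1), (x2, y2) = pop[a], pop[b]
    # Pairwise-dominance-style comparison (one plausible formulation, used
    # here only for illustration): prefer (x1, y1) if x1 does at least as
    # well against y2 as x2 does, and y1 holds x2 to at most what y2 allows.
    if g(x1, y2) >= g(x2, y2) and g(x2, y1) <= g(x2, y2):
        parent = (x1, y1)
    else:
        parent = (x2, y2)
    child = (mutate(parent[0]), mutate(parent[1]))
    pop[rng.integers(0, pop_size)] = child   # steady-state replacement

best_ones = max(p[0].sum() for p in pop)
print(f"best x-player found: {best_ones} ones out of {n}")
```

The design choice mirrored here is the one the abstract emphasises: selection acts on opponent pairs rather than single individuals, and progress relies on a low mutation rate together with a population large enough to sustain the arms race.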

IJCAI 2024 Conference Paper

Concentration Tail-Bound Analysis of Coevolutionary and Bandit Learning Algorithms

  • Per Kristian Lehre
  • Shishen Lin

Runtime analysis, as a branch of the theory of AI, studies how the number of iterations an algorithm takes before finding a solution (its runtime) depends on the design of the algorithm and the problem structure. Drift analysis is a state-of-the-art tool for estimating the runtime of randomised algorithms, such as bandit and evolutionary algorithms. Drift refers roughly to the expected progress towards the optimum per iteration. This paper considers the problem of deriving concentration tail bounds on the runtime of algorithms. It provides a novel drift theorem that gives precise exponential tail bounds under positive, weak, zero, and even negative drift. Previously, such exponential tail bounds were missing in the case of weak, zero, or negative drift. Our drift theorem can be used to prove strong concentration of the runtime/regret of algorithms in AI. For example, we prove that the regret of the RWAB bandit algorithm is highly concentrated, while previous analyses only considered the expected regret. This means that the algorithm obtains the optimum within a given time frame with high probability, i.e., a form of algorithmic reliability. Moreover, our theorem implies that the time needed by the co-evolutionary algorithm RLS-PD to obtain a Nash equilibrium in a Bilinear max-min benchmark problem is highly concentrated. However, we also prove that the algorithm forgets the Nash equilibrium, and the time until this occurs is highly concentrated. This highlights a weakness in RLS-PD that should be addressed by future work.
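
The flavour of such concentration results can be checked numerically. The sketch below simulates a simple random walk with constant positive drift delta towards the target and compares the empirical hitting times with the additive-drift prediction X0/delta; the walk and its parameters are illustrative stand-ins, not the RWAB or RLS-PD processes analysed in the paper.

```python
import numpy as np

rng = np.random.default_rng(2)

x0 = 200        # starting distance from the target state 0
p_down = 0.6    # probability of a step towards 0; the drift is delta = 2*p_down - 1
delta = 2 * p_down - 1
runs = 2000

hitting_times = np.empty(runs, dtype=int)
for r in range(runs):
    x, t = x0, 0
    while x > 0:
        x += -1 if rng.random() < p_down else 1
        t += 1
    hitting_times[r] = t

print(f"additive-drift prediction X0/delta : {x0 / delta:.0f}")
print(f"empirical mean hitting time        : {hitting_times.mean():.0f}")
# Concentration: deviations well above the drift prediction become rare very
# quickly as the threshold grows, illustrating an exponential-looking tail.
for c in (1.25, 1.5, 2.0):
    frac = (hitting_times > c * x0 / delta).mean()
    print(f"P[T > {c:.2f} * X0/delta] ~ {frac:.4f}")
```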

NeurIPS 2024 Conference Paper

No Free Lunch Theorem and Black-Box Complexity Analysis for Adversarial Optimisation

  • Per Kristian Lehre
  • Shishen Lin

Black-box optimisation is an important branch of optimisation. The original No Free Lunch (NFL) theorems highlight the limitations of traditional black-box optimisation and learning algorithms, serving as a theoretical foundation for traditional optimisation. No Free Lunch analysis in adversarial (also called maximin) optimisation is a long-standing problem [45, 46]. This paper first rigorously proves an NFL theorem for general black-box adversarial optimisation when Pure Strategy Nash Equilibrium (NE) is considered as the solution concept. We emphasise that the solution concept (i.e., how optimality is defined in adversarial optimisation) is the key in our NFL theorem. In particular, if Nash Equilibrium is considered as the solution concept and the cost of the algorithm is measured in terms of the number of columns and rows queried in the payoff matrix, then the average performance of all black-box adversarial optimisation algorithms is the same. Moreover, we are the first to introduce black-box complexity to analyse black-box adversarial optimisation algorithms. We employ Yao’s Principle and our new NFL theorem to provide general lower bounds for the query complexity of finding a Nash Equilibrium in adversarial optimisation. Finally, we illustrate the practical ramifications of our results on simple two-player zero-sum games. More specifically, no black-box optimisation algorithm can find the unique Nash equilibrium of a two-player zero-sum game with fewer than logarithmically many queries in the size of the search space. Meanwhile, no black-box algorithm can solve every bimatrix game with a unique NE using fewer than a linear number of queries in the size of the payoff matrix.
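
To make the cost model concrete, the sketch below wraps a payoff matrix in a query-counting oracle and runs the naive exhaustive algorithm for finding a pure-strategy Nash equilibrium (a saddle point). The PayoffOracle class, the helper name, and the example matrix are hypothetical choices for illustration; the exhaustive baseline only shows what "number of entries queried" means as a cost measure and is unrelated to the lower-bound constructions in the paper.

```python
import numpy as np


class PayoffOracle:
    """Hides the payoff matrix behind a query interface and counts queries."""

    def __init__(self, matrix):
        self._A = np.asarray(matrix, dtype=float)
        self.queries = 0

    @property
    def shape(self):
        return self._A.shape

    def query(self, i, j):
        self.queries += 1
        return self._A[i, j]


def find_pure_ne_exhaustively(oracle):
    """Query every entry, then return a saddle point (i, j) if one exists:
    A[i, j] is a maximum of its column and a minimum of its row."""
    m, n = oracle.shape
    A = np.array([[oracle.query(i, j) for j in range(n)] for i in range(m)])
    for i in range(m):
        for j in range(n):
            if A[i, j] >= A[:, j].max() and A[i, j] <= A[i, :].min():
                return (i, j)
    return None


# Example: a 3x3 zero-sum game (row player maximises) with a pure NE at (1, 1).
A = [[3, 1, 4],
     [2, 2, 5],
     [0, 1, 6]]
oracle = PayoffOracle(A)
print("pure NE:", find_pure_ne_exhaustively(oracle))
print("entries queried:", oracle.queries, "out of", 3 * 3)
```

The query counter is the quantity the lower bounds are stated in: the exhaustive baseline spends one query per matrix entry, and the paper's results say how far below that any black-box algorithm can possibly go.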