Arrow Research search

Author name cluster

Xiaotie Deng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

54 papers
2 author rows

Possible papers

54

NeurIPS Conference 2025 Conference Paper

Mechanism Design for LLM Fine-tuning with Multiple Reward Models

  • Haoran Sun
  • Yurong Chen
  • Siwei Wang
  • Chu Xu
  • Wei Chen
  • Xiaotie Deng

Fine-tuning large language models (LLMs) to aggregate multiple preferences has attracted considerable research attention. With aggregation algorithms advancing, a potential economic scenario arises where fine-tuning services are provided to agents with different preferences. In this context, agents may benefit from strategically misreporting their preferences, but this could harm the aggregation performance. This paper addresses such incentive issues by framing it as a mechanism design problem: an LLM provider determines the fine-tuning objective (training rule) and the pricing scheme (payment rule) for agents. We primarily focus on training rules that maximize social welfare subject to certain regularizations, referred to as SW-Max rules. First, we show that under most circumstances, truthful reporting is sub-optimal with simply a SW-Max rule, thereby highlighting the necessity of payments. Second, we extend the VCG payment to implement SW-Max rules in dominant-strategy incentive compatibility (DSIC). We characterize sufficient conditions for payment equivalence and derive the necessary conditions for a payment rule to implement a SW-Max rule in DSIC and other principles. Third, we demonstrate that our mechanism is approximately DSIC with perturbed input, showcasing its robustness against the inevitable errors in real-world applications. Experiments on real LLM training results further confirm the practical implications of our results.

TCS Journal 2025 Journal Article

On the optimal mixing problem of approximate Nash equilibria in bimatrix games

  • Xiaotie Deng
  • Dongchen Li
  • Hanyu Li

This paper introduces the optimal mixing problem, a natural extension of the computation of approximate Nash Equilibria (NE) in bimatrix games. The problem focuses on determining the optimal convex combination of given strategies that minimizes the approximation (i. e. , regret) in NE computation. We develop algorithms for the exact and approximate optimal mixing problems and present new complexity results that bridge both practical and theoretical aspects of NE computation. Practically, our algorithms can be used to enhance and integrate arbitrary existing constant-approximate NE algorithms, offering a powerful tool for the design of approximate NE algorithms. Theoretically, these algorithms allow us to explore the implications of support restrictions on approximate NE and derive the upper-bound separations between approximate NE and exact NE. Consequently, this work contributes to theoretical understandings of the computational complexity of approximate NE under various constraints and practical improvements in multi-agent reinforcement learning (MARL) and other fields where NE computation is involved.

AAMAS Conference 2025 Conference Paper

Optimal Mechanism Design for Crowdfunding of Public Goods

  • Yukun Cheng
  • Xiaotie Deng
  • Baqiao Quan

Mechanisms for crowdfunding public goods are essential for ensuring that societies can collectively benefit from public goods. Unlike previous researches on crowdfunding for public goods, which focused on binary outcomes—either full provision or none at all, this paper proposes an auction framework to examine the partial provision of public goods, based on the funds raised, with the goal of maximizing the final investment amount. We develop truthful investment mechanisms that achieve the (approximate) optimal expected investment amount across different models, taking into account the number of agents.

IJCAI Conference 2025 Conference Paper

Survey on Strategic Mining in Blockchain: A Reinforcement Learning Approach

  • Jichen Li
  • Lijia Xie
  • Hanting Huang
  • Bo Zhou
  • Binfeng Song
  • Wanying Zeng
  • Xiaotie Deng
  • Xiao Zhang

Strategic mining attacks, such as selfish mining, exploit blockchain consensus protocols by deviating from honest behavior to maximize rewards. Markov Decision Process (MDP) analysis faces scalability challenges in modern digital economics, including blockchain. To address these limitations, reinforcement learning (RL) provides a scalable alternative, enabling adaptive strategy optimization in complex dynamic environments. In this survey, we examine RL’s role in strategic mining analysis, comparing it to MDP-based approaches. We begin by reviewing foundational MDP models and their limitations, before exploring RL frameworks that can learn near-optimal strategies across various protocols. Building on this analysis, we compare RL techniques and their effectiveness in deriving security thresholds, such as the minimum attacker power required for profitable attacks. Expanding the discussion further, we classify consensus protocols and propose open challenges, such as multi-agent dynamics and real-world validation. This survey highlights the potential of reinforcement learning to address the challenges of selfish mining, including protocol design, threat detection, and security analysis, while offering a strategic roadmap for researchers in decentralized systems and AI-driven analytics.

AAAI Conference 2024 Conference Paper

Competition among Pairwise Lottery Contests

  • Xiaotie Deng
  • Hangxin Gan
  • Ningyuan Li
  • Weian Li
  • Qi Qi

We investigate a two-stage competitive model involving multiple contests. In this model, each contest designer chooses two participants from a pool of candidate contestants and determines the biases. Contestants strategically distribute their efforts across various contests within their budget. We first show the existence of a pure strategy Nash equilibrium (PNE) for the contestants, and propose a fully polynomial-time approximation scheme to compute an approximate PNE. In the scenario where designers simultaneously decide the participants and biases, the subgame perfect equilibrium (SPE) may not exist. Nonetheless, when designers' decisions are made in two substages, the existence of SPE is established. In the scenario where designers can hold multiple contests, we show that the SPE always exists under mild conditions and can be computed efficiently.

NeurIPS Conference 2024 Conference Paper

Contextual Decision-Making with Knapsacks Beyond the Worst Case

  • Zhaohua Chen
  • Rui Ai
  • Mingwei Yang
  • Yuqi Pan
  • Chang Wang
  • Xiaotie Deng

We study the framework of a dynamic decision-making scenario with resource constraints. In this framework, an agent, whose target is to maximize the total reward under the initial inventory, selects an action in each round upon observing a random request, leading to a reward and resource consumptions that are further associated with an unknown random external factor. While previous research has already established an $\widetilde{O}(\sqrt{T})$ worst-case regret for this problem, this work offers two results that go beyond the worst-case perspective: one for the worst-case gap between benchmarks and another for logarithmic regret rates. We first show that an $\Omega(\sqrt{T})$ distance between the commonly used fluid benchmark and the online optimum is unavoidable when the former has a degenerate optimal solution. On the algorithmic side, we merge the re-solving heuristic with distribution estimation skills and propose an algorithm that achieves an $\widetilde{O}(1)$ regret as long as the fluid LP has a unique and non-degenerate solution. Furthermore, we prove that our algorithm maintains a near-optimal $\widetilde{O}(\sqrt{T})$ regret even in the worst cases and extend these results to the setting where the request and external factor are continuous. Regarding information structure, our regret results are obtained under two feedback models, respectively, where the algorithm accesses the external factor at the end of each round and at the end of a round only when a non-null action is executed.

AAAI Conference 2024 Conference Paper

Dynamic Budget Throttling in Repeated Second-Price Auctions

  • Zhaohua Chen
  • Chang Wang
  • Qian Wang
  • Yuqi Pan
  • Zhuming Shi
  • Zheng Cai
  • Yukun Ren
  • Zhihua Zhu

In today's online advertising markets, a crucial requirement for an advertiser is to control her total expenditure within a time horizon under some budget. Among various budget control methods, throttling has emerged as a popular choice, managing an advertiser's total expenditure by selecting only a subset of auctions to participate in. This paper provides a theoretical panorama of a single advertiser's dynamic budget throttling process in repeated second-price auctions. We first establish a lower bound on the regret and an upper bound on the asymptotic competitive ratio for any throttling algorithm, respectively, when the advertiser's values are stochastic and adversarial. Regarding the algorithmic side, we propose the OGD-CB algorithm, which guarantees a near-optimal expected regret with stochastic values. On the other hand, when values are adversarial, we prove that this algorithm also reaches the upper bound on the asymptotic competitive ratio. We further compare throttling with pacing, another widely adopted budget control method, in repeated second-price auctions. In the stochastic case, we demonstrate that pacing is generally superior to throttling for the advertiser, supporting the well-known result that pacing is asymptotically optimal in this scenario. However, in the adversarial case, we give an exciting result indicating that throttling is also an asymptotically optimal dynamic bidding strategy. Our results bridge the gaps in theoretical research of throttling in repeated auctions and comprehensively reveal the ability of this popular budget-smoothing strategy.

ICLR Conference 2024 Conference Paper

Learning Thresholds with Latent Values and Censored Feedback

  • Jiahao Zhang
  • Tao Lin 0013
  • Weiqiang Zheng
  • Zhe Feng 0004
  • Yifeng Teng
  • Xiaotie Deng

In this paper, we investigate a problem of *actively* learning threshold in latent space, where the *unknown* reward $g(\gamma, v)$ depends on the proposed threshold $\gamma$ and latent value $v$ and it can be $only$ achieved if the threshold is lower than or equal to the *unknown* latent value. This problem has broad applications in practical scenarios, e.g., reserve price optimization in online auctions, online task assignments in crowdsourcing, setting recruiting bars in hiring, etc. We first characterize the query complexity of learning a threshold with the expected reward at most $\epsilon$ smaller than the optimum and prove that the number of queries needed can be infinitely large even when $g(\gamma, v)$ is monotone with respect to both $\gamma$ and $v$. On the positive side, we provide a tight query complexity $\tilde{\Theta}(1/\epsilon^3)$ when $g$ is monotone and the CDF of value distribution is Lipschitz. Moreover, we show a tight $\tilde{\Theta}(1/\epsilon^3)$ query complexity can be achieved as long as $g$ satisfies one-sided Lipschitzness, which provides a complete characterization for this problem. Finally, we extend this model to an online learning setting and demonstrate a tight $\Theta(T^{2/3})$ regret bound using continuous-arm bandit techniques and the aforementioned query complexity results.

NeurIPS Conference 2023 Conference Paper

A Scalable Neural Network for DSIC Affine Maximizer Auction Design

  • Zhijian Duan
  • Haoran Sun
  • Yurong Chen
  • Xiaotie Deng

Automated auction design aims to find empirically high-revenue mechanisms through machine learning. Existing works on multi item auction scenarios can be roughly divided into RegretNet-like and affine maximizer auctions (AMAs) approaches. However, the former cannot strictly ensure dominant strategy incentive compatibility (DSIC), while the latter faces scalability issue due to the large number of allocation candidates. To address these limitations, we propose AMenuNet, a scalable neural network that constructs the AMA parameters (even including the allocation menu) from bidder and item representations. AMenuNet is always DSIC and individually rational (IR) due to the properties of AMAs, and it enhances scalability by generating candidate allocations through a neural network. Additionally, AMenuNet is permutation equivariant, and its number of parameters is independent of auction scale. We conduct extensive experiments to demonstrate that AMenuNet outperforms strong baselines in both contextual and non-contextual multi-item auctions, scales well to larger auctions, generalizes well to different settings, and identifies useful deterministic allocations. Overall, our proposed approach offers an effective solution to automated DSIC auction design, with improved scalability and strong revenue performance in various settings.

AAMAS Conference 2023 Conference Paper

Altruism, Collectivism and Egalitarianism: On a Variety of Prosocial Behaviors in Binary Networked Public Goods Games

  • Jichen Li
  • Xiaotie Deng
  • Yukun Cheng
  • Yuqi Pan
  • Xuanzhi Xia
  • Zongjun Yang
  • Jan Xie

Binary Networked public goods (BNPG) game consists of a network 𝐺 = (𝑉, 𝐸) with n players residing as nodes in a network and making a YES/NO decision to invest a public project. Examples of such public projects include face mask wearing during a pandemic, crime reporting and vaccination, etc. Most of the conventional modes of BNPG games solely posit egoism as the motivation of players: they only care about their own benefits. However, a series of real-world examples show that people have a wide range of prosocial behaviors in making decisions. To address this property, we introduce a novel extension of BNPG games to account for three kinds of prosocial motivations: altruism, collectivism, and egalitarianism. We revise utility functions to reflect different prosocial motivations with respect to the welfare of others, mediated by a prosocial graph. We develop computational complexity results to decide the existence of pure strategy Nash equilibrium in these models, for cases where the prosocial graph is a tree, a clique or a general network. We further discuss the Prosocial Network Modification (PNM) problem, in which a principal can change the network structure within a budget constraint, to induce a given strategy profile with respect to an equilibrium. For all three types of PNM problems, we completely characterize their corresponding computational complexity results.

ICML Conference 2023 Conference Paper

Are Equivariant Equilibrium Approximators Beneficial?

  • Zhijian Duan 0001
  • Yunxuan Ma
  • Xiaotie Deng

Recently, remarkable progress has been made by approximating Nash equilibrium (NE), correlated equilibrium (CE), and coarse correlated equilibrium (CCE) through function approximation that trains a neural network to predict equilibria from game representations. Furthermore, equivariant architectures are widely adopted in designing such equilibrium approximators in normal-form games. In this paper, we theoretically characterize the benefits and limitations of equivariant equilibrium approximators. For the benefits, we show that they enjoy better generalizability than general ones and can achieve better approximations when the payoff distribution is permutation-invariant. For the limitations, we discuss their drawbacks in terms of equilibrium selection and social welfare. Together, our results help to understand the role of equivariance in equilibrium approximators.

ICML Conference 2023 Conference Paper

Coordinated Dynamic Bidding in Repeated Second-Price Auctions with Budgets

  • Yurong Chen 0002
  • Qian Wang 0025
  • Zhijian Duan 0001
  • Haoran Sun
  • Zhaohua Chen 0001
  • Xiang Yan
  • Xiaotie Deng

In online ad markets, a rising number of advertisers are employing bidding agencies to participate in ad auctions. These agencies are specialized in designing online algorithms and bidding on behalf of their clients. Typically, an agency usually has information on multiple advertisers, so she can potentially coordinate bids to help her clients achieve higher utilities than those under independent bidding. In this paper, we study coordinated online bidding algorithms in repeated second-price auctions with budgets. We propose algorithms that guarantee every client a higher utility than the best she can get under independent bidding. We show that these algorithms achieve maximal social welfare and discuss bidders’ incentives to misreport their budgets, in symmetric cases. Our proofs combine the techniques of online learning and equilibrium analysis, overcoming the difficulty of competing with a multi-dimensional benchmark. The performance of our algorithms is further evaluated by experiments on both synthetic and real data. To the best of our knowledge, we are the first to consider bidder coordination in online repeated auctions with constraints.

AAAI Conference 2023 Conference Paper

From Monopoly to Competition: Optimal Contests Prevail

  • Xiaotie Deng
  • Yotam Gafni
  • Ron Lavi
  • Tao Lin
  • Hongyi Ling

We study competition among contests in a general model that allows for an arbitrary and heterogeneous space of contest design and symmetric contestants. The goal of the contest designers is to maximize the contestants' sum of efforts. Our main result shows that optimal contests in the monopolistic setting (i.e., those that maximize the sum of efforts in a model with a single contest) form an equilibrium in the model with competition among contests. Under a very natural assumption these contests are in fact dominant, and the equilibria that they form are unique. Moreover, equilibria with the optimal contests are Pareto-optimal even in cases where other equilibria emerge. In many natural cases, they also maximize the social welfare.

AAMAS Conference 2023 Conference Paper

Is Nash Equilibrium Approximator Learnable?

  • Zhijian Duan
  • Wenhan Huang
  • Dinghuai Zhang
  • Yali Du
  • Jun Wang
  • Yaodong Yang
  • Xiaotie Deng

In this paper, we investigate the learnability of the function approximator that approximates Nash equilibrium (NE) for games generated from a distribution. First, we offer a generalization bound using the Probably Approximately Correct (PAC) learning model. The bound describes the gap between the expected loss and empirical loss of the NE approximator. Afterward, we prove the agnostic PAC learnability of the Nash approximator. In addition to theoretical analysis, we demonstrate an application of NE approximator in experiments. The trained NE approximator can be used to warmstart and accelerate classical NE solvers. Together, our results show the practicability of approximating NE through function approximation.

ICML Conference 2023 Conference Paper

Learning to Bid in Repeated First-Price Auctions with Budgets

  • Qian Wang 0025
  • Zongjun Yang
  • Xiaotie Deng
  • Yuqing Kong

Budget management strategies in repeated auctions have received growing attention in online advertising markets. However, previous work on budget management in online bidding mainly focused on second-price auctions. The rapid shift from second-price auctions to first-price auctions for online ads in recent years has motivated the challenging question of how to bid in repeated first-price auctions while controlling budgets. In this work, we study the problem of learning in repeated first-price auctions with budgets. We design a dual-based algorithm that can achieve a near-optimal $\widetilde{O}(\sqrt{T})$ regret with full information feedback where the maximum competing bid is always revealed after each auction. We further consider the setting with one-sided information feedback where only the winning bid is revealed after each auction. We show that our modified algorithm can still achieve an $\widetilde{O}(\sqrt{T})$ regret with mild assumptions on the bidder’s value distribution. Finally, we complement the theoretical results with numerical experiments to confirm the effectiveness of our budget management policy.

AAMAS Conference 2023 Conference Paper

Sybil-Proof Diffusion Auction in Social Networks

  • Hongyin Chen
  • Xiaotie Deng
  • Ying Wang
  • Yue Wu
  • Dengji Zhao

A diffusion auction is a market to sell commodities over a social network, where the challenge is to incentivize existing buyers to invite their neighbors in the network to join the market. Existing mechanisms have been designed to solve the challenge in various settings, aiming at desirable properties such as non-deficiency, incentive compatibility and social welfare maximization. Since the mechanisms are employed in dynamic networks with ever-changing structures, buyers could easily generate fake nodes in the network to manipulate the mechanisms for their own benefits, which is commonly known as the Sybil attack. We observe that strategic agents may gain an unfair advantage in existing mechanisms through such attacks. To resist this potential attack, we propose two diffusion auction mechanisms, the Sybil tax mechanism (STM) and the Sybil cluster mechanism (SCM), to achieve both Sybil-proofness and incentive compatibility in the single-item setting. Our proposal provides the first mechanisms to protect the interests of buyers against Sybil attacks with a mild sacrifice of social welfare and revenue.

AAAI Conference 2023 Conference Paper

Truthful Mechanisms for Steiner Tree Problems

  • Jinshan Zhang
  • Zhengyang Liu
  • Xiaotie Deng
  • Jianwei Yin

Consider an undirected graph G=(V,E) model for a communication network, where each edge is owned by a selfish agent, who reports the cost for offering the use of her edge. Note that each edge agent may misreport her own cost for the use of the edge for her own benefit. In such a non-cooperative setting, we aim at designing an approximately truthful mechanism for establishing a Steiner tree, a minimum cost tree spanning over all the terminals. We present a truthful-in-expectation mechanism that achieves the approximation ratio ln 4 + ε ≈ 1.39, which matches the current best algorithmic ratio for STP.

ICML Conference 2022 Conference Paper

A Context-Integrated Transformer-Based Neural Network for Auction Design

  • Zhijian Duan 0001
  • Jingwu Tang
  • Yutong Yin
  • Zhe Feng 0004
  • Xiang Yan
  • Manzil Zaheer
  • Xiaotie Deng

One of the central problems in auction design is developing an incentive-compatible mechanism that maximizes the auctioneer’s expected revenue. While theoretical approaches have encountered bottlenecks in multi-item auctions, recently, there has been much progress on finding the optimal mechanism through deep learning. However, these works either focus on a fixed set of bidders and items, or restrict the auction to be symmetric. In this work, we overcome such limitations by factoring public contextual information of bidders and items into the auction learning framework. We propose $\mathtt{CITransNet}$, a context-integrated transformer-based neural network for optimal auction design, which maintains permutation-equivariance over bids and contexts while being able to find asymmetric solutions. We show by extensive experiments that $\mathtt{CITransNet}$ can recover the known optimal solutions in single-item settings, outperform strong baselines in multi-item auctions, and generalize well to cases other than those in training.

AAMAS Conference 2022 Conference Paper

Multiagent Q-learning with Sub-Team Coordination

  • Wenhan Huang
  • Kai Li
  • Kun Shao
  • Tianze Zhou
  • Jun Luo
  • Dongge Wang
  • Hangyu Mao
  • Jianye Hao

For cooperative mutliagent reinforcement learning tasks, we propose a novel value factorization framework in the popular centralized training with decentralized execution paradigm, called multiagent Q-learning with sub-team coordination (QSCAN). This framework could flexibly exploit local coordination within sub-teams for effective factorization while honoring the individual-globalmax (IGM) condition. QSCAN encompasses the full spectrum of sub-team coordination according to sub-team size, ranging from the monotonic value function class to the entire IGM function class, with familiar methods such as QMIX and QPLEX located at the respective extremes of the spectrum. Empirical results show that QSCAN’s performance dominates state-of-the-art methods in predator-prey tasks and the Switch challenge in MA-Gym.

NeurIPS Conference 2022 Conference Paper

Multiagent Q-learning with Sub-Team Coordination

  • Wenhan Huang
  • Kai Li
  • Kun Shao
  • Tianze Zhou
  • Matthew Taylor
  • Jun Luo
  • Dongge Wang
  • Hangyu Mao

In many real-world cooperative multiagent reinforcement learning (MARL) tasks, teams of agents can rehearse together before deployment, but then communication constraints may force individual agents to execute independently when deployed. Centralized training and decentralized execution (CTDE) is increasingly popular in recent years, focusing mainly on this setting. In the value-based MARL branch, credit assignment mechanism is typically used to factorize the team reward into each individual’s reward — individual-global-max (IGM) is a condition on the factorization ensuring that agents’ action choices coincide with team’s optimal joint action. However, current architectures fail to consider local coordination within sub-teams that should be exploited for more effective factorization, leading to faster learning. We propose a novel value factorization framework, called multiagent Q-learning with sub-team coordination (QSCAN), to flexibly represent sub-team coordination while honoring the IGM condition. QSCAN encompasses the full spectrum of sub-team coordination according to sub-team size, ranging from the monotonic value function class to the entire IGM function class, with familiar methods such as QMIX and QPLEX located at the respective extremes of the spectrum. Experimental results show that QSCAN’s performance dominates state-of-the-art methods in matrix games, predator-prey tasks, the Switch challenge in MA-Gym. Additionally, QSCAN achieves comparable performances to those methods in a selection of StarCraft II micro-management tasks.

IJCAI Conference 2022 Conference Paper

On the Convergence of Fictitious Play: A Decomposition Approach

  • Yurong Chen
  • Xiaotie Deng
  • Chenchen Li
  • David Mguni
  • Jun Wang
  • Xiang Yan
  • Yaodong Yang

Fictitious play (FP) is one of the most fundamental game-theoretical learning frameworks for computing Nash equilibrium in n-player games, which builds the foundation for modern multi-agent learning algorithms. Although FP has provable convergence guarantees on zero-sum games and potential games, many real-world problems are often a mixture of both and the convergence property of FP has not been fully studied yet. In this paper, we extend the convergence results of FP to the combinations of such games and beyond. Specifically, we derive new conditions for FP to converge by leveraging game decomposition techniques. We further develop a linear relationship unifying cooperation and competition in the sense that these two classes of games are mutually transferable. Finally, we analyse a non-convergent example of FP, the Shapley game, and develop sufficient conditions for FP to converge.

AAMAS Conference 2022 Conference Paper

Revenue and User Traffic Maximization in Mobile Short-Video Advertising

  • Dezhi Ran
  • Weiqiang Zheng
  • Yunqi Li
  • Kaigui Bian
  • Jie Zhang
  • Xiaotie Deng

A new mobile attention economy has emerged with the explosive growth of short-video apps such as TikTok. In this internet market, three types of agents interact with each other: the platform, influencers, and advertisers. A short-video platform encourages its influencers to attract users by creating appealing content through short-form videos and allows advertisers to display their ads in short-form videos. There are two options for the advertisers: one is to bid for platform advert slots in a similar way to search engine auctions; the other is to pay an influencer to make engaging short videos and promote them through the influencer’s channel. The second option will generate a higher conversion ratio if advertisers choose the right influencers whose followers match their target market. Although displaying influencer ads will generate less revenue, it is more engaging than platform ads, which is better for maintaining user traffic. Therefore, it is crucial for a platform to balance these factors by establishing a sustainable business agreement with its influencers and advertisers. In this paper, we develop a two-stage solution for a platform to maximize short-term revenue and long-term user traffic maintenance. In the first stage, we estimate the impact of user traffic generated by displaying influencer ads and characterize the user traffic the platform should allocate to influencers for overall revenue maximization. In the second stage, we devise an optimal (1 − 1/𝑒)-competitive algorithm for ad slot allocation. To complement this analysis, we examine the ratio of the revenue generated by our online algorithm to the optimal offline revenue. Our simulation results show that this ratio is 0. 94 on average, which is much higher than (1 − 1/𝑒) and outperforms four baseline algorithms.

AAAI Conference 2021 Conference Paper

On the Approximation of Nash Equilibria in Sparse Win-Lose Multi-player Games

  • Zhengyang Liu
  • Jiawei Li
  • Xiaotie Deng

A polymatrix game is a multi-player game over n players, where each player chooses a pure strategy from a list of its own pure strategies. The utility of each player is a sum of payoffs it gains from the two player’s game from all its neighbors, under its chosen strategy and that of its neighbor. As a natural extension to two-player games (a. k. a. bimatrix games), polymatrix games are widely used for multi-agent games in real world scenarios. In this paper we show that the problem of approximating a Nash equilibrium in a polymatrix game within the polynomial precision is PPAD-hard, even in sparse and win-lose ones. This result further challenges the predictability of Nash equilibria as a solution concept in the multi-agent setting. We also propose a simple and efficient algorithm, when the game is further restricted. Together, we establish a new dichotomy theorem for this class of games. It is also of independent interest for exploring the computational and structural properties in Nash equilibria.

NeurIPS Conference 2020 Conference Paper

A Game-Theoretic Analysis of the Empirical Revenue Maximization Algorithm with Endogenous Sampling

  • Xiaotie Deng
  • Ron Lavi
  • Tao Lin
  • Qi Qi
  • Wenwei WANG
  • Xiang Yan

The Empirical Revenue Maximization (ERM) is one of the most important price learning algorithms in auction design: as the literature shows it can learn approximately optimal reserve prices for revenue-maximizing auctioneers in both repeated auctions and uniform-price auctions. However, in these applications the agents who provide inputs to ERM have incentives to manipulate the inputs to lower the outputted price. We generalize the definition of an incentive-awareness measure proposed by Lavi et al (2019), to quantify the reduction of ERM's outputted price due to a change of m>=1 out of N input samples, and provide specific convergence rates of this measure to zero as N goes to infinity for different types of input distributions. By adopting this measure, we construct an efficient, approximately incentive-compatible, and revenue-optimal learning algorithm using ERM in repeated auctions against non-myopic bidders, and show approximate group incentive-compatibility in uniform-price auctions.

AAAI Conference 2019 Conference Paper

Latent Dirichlet Allocation for Internet Price War

  • Chenchen Li
  • Xiang Yan
  • Xiaotie Deng
  • Yuan Qi
  • Wei Chu
  • Le Song
  • Junlong Qiao
  • Jianshan He

Current Internet market makers are facing an intense competitive environment, where personalized price reductions or discounted coupons are provided by their peers to attract more customers. Much investment is spent to catch up with each other’s competitors but participants in such a price cut war are often incapable of winning due to their lack of information about others’ strategies or customers’ preference. We formalize the problem as a stochastic game with imperfect and incomplete information and develop a variant of Latent Dirichlet Allocation (LDA) to infer latent variables under the current market environment, which represents preferences of customers and strategies of competitors. Tests on simulated experiments and an open dataset for real data show that, by subsuming all available market information of the market maker’s competitors, our model exhibits a significant improvement for understanding the market environment and finding the best response strategies in the Internet price war. Our work marks the first successful learning method to infer latent information in the environment of price war by the LDA modeling, and sets an example for related competitive applications to follow.

MFCS Conference 2017 Conference Paper

Smoothed and Average-Case Approximation Ratios of Mechanisms: Beyond the Worst-Case Analysis

  • Xiaotie Deng
  • Yansong Gao
  • Jie Zhang 0008

The approximation ratio has become one of the dominant measures in mechanism design problems. In light of analysis of algorithms, we define the smoothed approximation ratio to compare the performance of the optimal mechanism and a truthful mechanism when the inputs are subject to random perturbations of the worst-case inputs, and define the average-case approximation ratio to compare the performance of these two mechanisms when the inputs follow a distribution. For the one-sided matching problem, Filos-Ratsikas et al. [2014] show that, amongst all truthful mechanisms, random priority achieves the tight approximation ratio bound of Theta(sqrt{n}). We prove that, despite of this worst-case bound, random priority has a constant smoothed approximation ratio. This is, to our limited knowledge, the first work that asymptotically differentiates the smoothed approximation ratio from the worst-case approximation ratio for mechanism design problems. For the average-case, we show that our approximation ratio can be improved to 1+e. These results partially explain why random priority has been successfully used in practice, although in the worst case the optimal social welfare is Theta(sqrt{n}) times of what random priority achieves. These results also pave the way for further studies of smoothed and average-case analysis for approximate mechanism design problems, beyond the worst-case analysis.

AAAI Conference 2016 Conference Paper

Incentives for Strategic Behavior in Fisher Market Games

  • Ning Chen
  • Xiaotie Deng
  • Bo Tang
  • Hongyang Zhang

In a Fisher market game, a market equilibrium is computed in terms of the utility functions and money endowments that agents reported. As a consequence, an individual buyer may misreport his private information to obtain a utility gain. We investigate the extent to which an agent’s utility can be increased by unilateral strategic plays and prove that the percentage of this improvement is at most 2 for markets with weak gross substitute utilities. Equivalently, we show that truthfully reporting is a 0. 5-approximate Nash equilibrium in this game. To identify sufficient conditions for truthfully reporting being close to Nash equilibrium, we conduct a parameterized study on strategic behaviors and further show that the ratio of utility gain decreases linearly as buyer’s initial endowment increases or his maximum share of an item decreases. Finally, we consider collusive behavior of a coalition and prove that the utility gain is bounded by 1/(1 − maximum share of the collusion). Our findings justify the truthful reporting assumption in Fisher markets by a quantitative study on participants incentive, and imply that under large market assumption, the utility gain of a buyer from manipulations diminishes to 0.

AAMAS Conference 2016 Conference Paper

Network Pollution Games

  • Eleftherios Anastasiadis
  • Xiaotie Deng
  • Piotr Krysta
  • Minming Li
  • Han Qiao
  • Jinshan Zhang

We introduce a new network model of the pollution control problem and present two applications of this model. On a high level, our model comprises a graph whose nodes represent the agents, that could be thought of as sources of pollution, and edges between agents represent the effect of spread of pollution. The government as the regulator is responsible to maximize the social welfare while setting bounds on the levels of emitted pollution both locally and globally. Our model is inspired by the existing literature in environmental economics that applies game theoretical methodology to control pollution. We study the social welfare maximization problem in our model. Our main results include hardness results for the problem, and in complement, a constant approximation algorithm on planar graphs. Our approximation algorithm leads to a truthful in expectation mechanism, and it is obtained by a novel decomposition technique of planar graphs to deal with constraints on vertices. We note that no known planar decomposition techniques can be used here and our technique can be of independent interest.

JAAMAS Journal 2016 Journal Article

Pricing ad slots with consecutive multi-unit demand

  • Xiaotie Deng
  • Paul Goldberg
  • Jinshan Zhang

Abstract We consider the optimal pricing problem for a model of the rich media advertisement market, that has other related applications. Our model differs from traditional position auctions in that we consider buyers whose demand might be multiple consecutive slots, which is motivated by modeling buyers who may require these to display a large size ad. We study three major pricing mechanisms, the Bayesian pricing model, the maximum revenue market equilibrium model and an envy-free solution model. Under the Bayesian model, we design a polynomial-time computable truthful mechanism that optimizes the revenue. For the market equilibrium paradigm, we find a polynomial-time algorithm to obtain the maximum revenue market equilibrium solution. In the envy-free setting, an optimal solution is presented for the case where the buyers have the same demand for the number of consecutive slots. We present results of a simulation that compares the revenues from the above schemes.

IJCAI Conference 2016 Conference Paper

Truthfulness of a Proportional Sharing Mechanism in Resource Exchange

  • Yukun Cheng
  • Xiaotie Deng
  • Qi Qi
  • Xiang Yan

In this paper, we consider the popular proportional sharing mechanism and discuss the incentives and opportunities of an agent to lie for personal gains in resource exchange game. The main result is a proof that an agent manipulating the proportional sharing mechanism by misreporting its resource amount will not benefit its own utility eventually. This result establishes a strategic stability property of the resource exchange protocol. We further illustrate and confirm the result via network examples.

TCS Journal 2014 Journal Article

Revenue maximization in a Bayesian double auction market

  • Xiaotie Deng
  • Paul Goldberg
  • Bo Tang
  • Jinshan Zhang

We study double auction market design where the market maker wants to maximize its total revenue by buying low from the sellers and selling high to the buyers. We consider a Bayesian setting where buyers and sellers have independent probability distributions on the values of products on the market. For the simplest setting, each seller has one kind of indivisible good with a bounded (integer) amount that can be sold to a buyer, who may demand a bounded number of copies. We develop a maximum mechanism for the market maker to maximize its own revenue. For the more general case where each seller's product may be different, we consider a number of variants in terms of constraints on supplies and demands. For each of them, we develop a polynomial time computable truthful mechanism for the market maker to achieve a revenue at least a constant α times the revenue of any other truthful mechanism.

AAAI Conference 2014 Conference Paper

The Fisher Market Game: Equilibrium and Welfare

  • Simina Brânzei
  • Yiling Chen
  • Xiaotie Deng
  • Aris Filos-Ratsikas
  • Søren Frederiksen
  • Jie Zhang

The Fisher market model is one of the most fundamental resource allocation models in economics. In a Fisher market, the prices and allocations of goods are determined according to the preferences and budgets of buyers to clear the market. In a Fisher market game, however, buyers are strategic and report their preferences over goods; the marketclearing prices and allocations are then determined based on their reported preferences rather than their real preferences. We show that the Fisher market game always has a pure Nash equilibrium, for buyers with linear, Leontief, and Cobb-Douglas utility functions, which are three representative classes of utility functions in the important Constant Elasticity of Substitution (CES) family. Furthermore, to quantify the social efficiency, we prove Price of Anarchy bounds for the game when the utility functions of buyers fall into these three classes respectively.

FOCS Conference 2006 Conference Paper

Computing Nash Equilibria: Approximation and Smoothed Complexity

  • Xi Chen 0001
  • Xiaotie Deng
  • Shang-Hua Teng

We advance significantly beyond the recent progress on the algorithmic complexity of Nash equilibria by solving two major open problems in the approximation of Nash equilibria and in the smoothed analysis of algorithms. --We show that no algorithm with complexity poly(n, \frac{1} { \in } ) can compute an \in-approximate Nash equilibrium in a two-player game, in which each player has n pure strategies, unless PPAD \subseteq P. In other words, the problem of computing a Nash equilibrium in a twoplayer game does not have a fully polynomial-time approximation scheme unless PPAD \subseteq P. --We prove that no algorithm for computing a Nash equilibrium in a two-player game can have smoothed complexity poly(n, \frac{1} {\sigma } ) under input perturbation of magnitude s, unless PPAD \subseteq RP. In particular, the smoothed complexity of the classic Lemke-Howson algorithm is not polynomial unless PPAD \subseteq RP. Instrumental to our proof, we introduce a new discrete fixed-point problem on a high-dimensional hypergrid with constant side-length, and show that it can host the embedding of the proof structure of any PPAD problem. We prove a key geometric lemma for finding a discrete fixed-point, a new concept defined on n + 1 vertices of a unit hypercube. This lemma enables us to overcome the curse of dimensionality in reasoning about fixed-points in high dimensions.

FOCS Conference 1998 Conference Paper

A TDI System and its Application to Approximation Algorithms

  • Mao-cheng Cai
  • Xiaotie Deng
  • Wenan Zang

We obtain a necessary and sufficient condition for tournaments to possess a min-max relation on packing and covering directed cycles, together with strongly polynomial time algorithms for the feedback vertex set problem and the cycle packing problem in this class of tournaments; the condition and the algorithms are all based on a totally dual integral (TDI) system, a theoretical framework introduced by J. Edmonds and R. Giles (1994) for establishing min-max results. As a consequence, we find a 2. 5-approximation polynomial time algorithm for the feedback vertex set problem in any tournament.

IROS Conference 1993 Conference Paper

Landmark selection for path execution

  • Xiaotie Deng
  • Evangelos E. Milios
  • Andranik Mirzaian

A commonly used approach to self-location is for the robot to use point features or landmarks. Landmarks are typically difficult to detect and track with video or range sensors, and hence it is sensible to try to minimize the number of times the robot abandons the tracking of an already detected landmark to detect and pursue another. The problem addressed is how to select the landmarks that the robot is to detect and track over different parts of a given path. Several algorithms with different amounts of flexibility, generality and complexity are proposed. The authors address the uniform cost case (all landmarks have equal cost of detection and tracking), and the weighted cost case (each landmark has its own cost). The case of different sets of landmarks having different utility measures is also treated. The algorithm complexity is low-order polynomial in the number of landmarks k, the number of straight line segments of the path, and the number of shadows cast on the path by each landmark, except when taking into account the usefulness of landmarks in groups, which is exponential in k.

FOCS Conference 1991 Conference Paper

How to Learn an Unknown Environment (Extended Abstract)

  • Xiaotie Deng
  • Tiko Kameda
  • Christos H. Papadimitriou

The authors consider the problem faced by a newborn that must explore and learn an unknown room with obstacles in it. They seek algorithms that achieve a bounded ratio of the worst-case distance traversed in order to see all visible points of the environment (thus creating a map), divided by the optimum distance needed to verify the map. The situation is complicated by the fact that the latter offline problem (optimally verifying a map) is NP-hard and thus must be solved approximately. Although the authors show that there is no such competitive algorithm for general obstacle courses, they give a competitive algorithm for the case of a polygonal room with a bounded number of obstacles in it. >

FOCS Conference 1990 Conference Paper

Exploring an Unknown Graph (Extended Abstract)

  • Xiaotie Deng
  • Christos H. Papadimitriou

It is desired to explore all edges of an unknown directed, strongly connected graph. At each point one has a map of all nodes and edges visited, one can recognize these nodes and edges upon seeing them again, and it is known how many unexplored edges emanate from each node visited. The goal is to minimize the ratio of the total number of edges traversed to the optimum number of traversals had the graph been known. For Eulerian graphs this ratio cannot be better than 2, and 2 is achievable by a simple algorithm. In contrast, the ratio is unbounded when the deficiency of the graph (the number of edges that have to be added to make it Eulerian) is unbounded. The main result is an algorithm that achieves a bounded ratio when the deficiency is bounded; unfortunately the ratio is exponential in the deficiency. It is also shown that, when partial information about the graph is available, minimizing the worst-case ratio is PSPACE-complete. >