Author name cluster

Shweta Jain

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers

1 author row

AAMAS Conference 2025 Conference Paper

Anytime Fairness Guarantees in Stochastic Combinatorial MABs: A Novel Learning Framework

Subham Pokhriyal
Shweta Jain
Ganesh Ghalme
Vaneet Aggarwal

This paper proposes a novel framework for incorporating anytime fairness guarantees in a general Stochastic Combinatorial Multi- Armed Bandit (CMAB) problem when the time horizon is unknown. Our framework does not make any assumptions about the reward feedback or structure and provides fairness guarantees as long as a sublinear regret algorithm exists to solve the same problem. The framework essentially operates in the episodes of length 𝐻, which is a user-defined parameter. The framework divides each episode of length 𝐻 into fairness rounds and learning rounds. Motivated by preemptive scheduling on uniform machines, we propose Fair-CMAB which prioritizes fairness rounds upfront and uses any existing CMAB algorithm for learning rounds. This helps in generalizing the framework significantly. Theoretically, we prove that for a sufficiently large value of 𝐻, Fair-CMAB achieves anytime fairness guarantees after some initial number of rounds and achieves the regret guarantees of the same order as the learning algorithm.

AAMAS Conference 2025 Conference Paper

Fair Assignment on Multi-Stage Graphs

Vibulan J
Swapnil Dhamal
Shweta Jain
Ojassvi Kumar
Aman Kumar
Harpreet Singh

This paper explores the problem of fair assignment of disjoint paths to agents on multi-stage graphs. We motivate the problem by demonstrating that an assignment minimizing the overall cost of all the agents’ paths may lead to significant envy among the agents. Showing NP-hardness of finding an envy-minimizing assignment, we propose algorithms that achieve a desired degree of envy while also providing a bound on the Cost of Fairness. Our algorithms run several orders of magnitude faster than a suitably formulated ILP.

AAMAS Conference 2025 Conference Paper

FLIGHT: Facility Location Integrating Generalized, Holistic Theory of Welfare

Avyukta Manjunatha Vummintala
Shivam Gupta
Shweta Jain
Sujit Gujar

The Facility Location Problem (FLP) is a well-studied optimization problem with applications in many real-world scenarios. Past literature has explored the solutions from different perspectives to tackle FLPs. These include investigating FLPs under objective functions such as utilitarian, egalitarian, Nash welfare, etc. We propose a unified framework, FLIGHT, to accommodate a broad class of welfare notions. The framework undergoes rigorous theoretical analysis, and we prove some structural properties of the solution to FLP. Additionally, we provide approximation bounds, which (under certain assumptions) provide insight into an interesting fact– as the number of agents arbitrarily increases, the choice of welfare notion is irrelevant. Furthermore, the paper examines a scenario in which the agents are independently and identically distributed (i. i. d.) according to a given probability distribution. In this setting, we derive results concerning the optimal estimator of the welfare and establish an asymptotic result for welfare functions.

AAMAS Conference 2024 Conference Paper

Fairness and Privacy Guarantees in Federated Contextual Bandits

Sambhav Solanki
Sujit Gujar
Shweta Jain

This paper studies the contextual multi-armed bandit problem with fairness and privacy guarantees in a federated setting. It proposes a collaborative algorithm, Fed-FairX-LinUCB that achieves sublinear fairness regret and can be adapted to ensure differential privacy. The key challenge is designing a communication protocol that balances privacy and regret. The proposed protocol achieves both sub-linear fairness regret and effective use of privacy budget. Experiments validates the efficacy of both Fed-FairX-LinUCB and its private counterpart, Priv-FairX-LinUCB

AAMAS Conference 2024 Conference Paper

Fairness of Exposure in Online Restless Multi-armed Bandits

Archit Sood
Shweta Jain
Sujit Gujar

Restless multi-armed bandits (RMABs) generalize the multi-armed bandits where each arm exhibits Markovian behavior and transitions according to their transition dynamics. Solutions to RMAB exist for both offline and online cases. However, they do not consider the distribution of pulls among the arms. Studies have shown that optimal policies lead to unfairness, where some arms are not exposed enough. Existing works in fairness in RMABs focus heavily on the offline case, which diminishes their application in real-world scenarios where the environment is largely unknown. In the online scenario, we propose the first fair RMAB framework, where each arm receives pulls in proportion to its merit. We define the merit of an arm as a function of its stationary reward distribution. We prove that our algorithm achieves sublinear fairness regret in the single pull case 𝑂( √ 𝑇 ln𝑇), with 𝑇 being the total number of episodes. Empirically, we show that our algorithm performs well in the multi-pull scenario as well.

AAMAS Conference 2024 Conference Paper

Simultaneously Achieving Group Exposure Fairness and Within-Group Meritocracy in Stochastic Bandits

Subham Pokhriyal
Shweta Jain
Ganesh Ghalme
Swapnil Dhamal
Sujit Gujar

Existing approaches to fairness in stochastic multi-armed bandits (MAB) primarily focus on exposure guarantee to individual arms. When arms are naturally grouped by certain attribute(s), we propose Bi-Level Fairness, which considers two levels of fairness. At the first level, Bi-Level Fairness guarantees a certain minimum exposure to each group. To address the unbalanced allocation of pulls to individual arms within a group, we consider meritocratic fairness at the second level, which ensures that each arm is pulled according to its merit within the group. Our work shows that we can adapt a UCB-based algorithm to achieve a Bi-Level Fairness by providing (i) anytime Group Exposure Fairness guarantees and (ii) ensuring individual-level Meritocratic Fairness within each group. We first show that one can decompose regret bounds into two components: (a) regret due to anytime group exposure fairness and (b) regret due to meritocratic fairness within each group. Our proposed algorithm BF-UCB balances these two regrets optimally to achieve the upper bound of 𝑂( √ 𝑇) on regret; 𝑇 being the stopping time. With the help of simulated experiments, we further show that BF-UCB achieves sub-linear regret; provides better group and individual exposure guarantees compared to existing algorithms; and does not result in a significant drop in reward with respect to UCB algorithm, which does not impose any fairness constraint.

AAMAS Conference 2023 Conference Paper

A Novel Demand Response Model and Method for Peak Reduction in Smart Grids - PowerTAC

Sanjay Chandlekar
Arthik Boroju
Shweta Jain
Sujit Gujar

We study the Demand Response behavior of smart grid customers in response to the offered discounts for peak reduction. We propose a model that depicts the probability of a customer reducing its load as a function of the discounts offered. This function is parametrized by the rate of reduction (RR). We provide an optimal algorithm, MJS–ExpResponse, that allocates the discounts to each customer by maximizing the expected reduction under a budget constraint. When RRs are unknown, we propose a Multi-Armed Bandit based online algorithm, namely MJSUCB–ExpResponse, to learn RRs. We experimentally show that it exhibits sublinear regret and showcase its efficacy in a real-world smart grid system using the PowerTAC simulator as a test bed.

IJCAI Conference 2023 Conference Paper

A Novel Demand Response Model and Method for Peak Reduction in Smart Grids -- PowerTAC

Sanjay Chandlekar
Shweta Jain
Sujit Gujar

One of the widely used peak reduction methods in smart grids is demand response, where one analyzes the shift in customers' (agents') usage patterns in response to the signal from the distribution company. Often, these signals are in the form of incentives offered to agents. This work studies the effect of incentives on the probabilities of accepting such offers in a real-world smart grid simulator, PowerTAC. We first show that there exists a function that depicts the probability of an agent reducing its load as a function of the discounts offered to them. We call it reduction probability (RP). RP function is further parametrized by the rate of reduction (RR), which can differ for each agent. We provide an optimal algorithm, MJS--ExpResponse, that outputs the discounts to each agent by maximizing the expected reduction under a budget constraint. When RRs are unknown, we propose a Multi-Armed Bandit (MAB) based online algorithm, namely MJSUCB--ExpResponse, to learn RRs. Experimentally we show that it exhibits sublinear regret. Finally, we showcase the efficacy of the proposed algorithm in mitigating demand peaks in a real-world smart grid system using the PowerTAC simulator as a test bed.

PDF Details DOI

AAMAS Conference 2023 Conference Paper

Fairness Driven Efficient Algorithms for Sequenced Group Trip Planning Query Problem

Napendra Solanki
Shweta Jain
Suman Banerjee
Yayathi Pavan Kumar S

The Group Trip Planning Query Problem (GTP) is a well-researched spatial database problem. Given a city road network with Pointof-Interests (PoIs) representing vertices divided into different categories, GTP aims to suggest one PoI from each category to minimize the group’s total distance traveled. This paper focuses on sequenced GTP with pre-determined category visit order, studied under the constraints of fairness, and referred to as sequenced Fair Group Trip Planning Query Problem (Fair-GTP). While GTP aims to minimize the group’s total travel time, Fair-GTP seeks to minimize the maximum time difference between any two agents in the group. Although solving group trip planning queries is NP-hard, we present polynomial time algorithms for finding optimal paths for both sequenced GTP and Fair-GTP. Our second significant result provides a bound on the price of fairness (PoF) representing the ratio of optimal path cost in sequenced Fair-GTP to optimal path cost in sequenced GTP. We show that while the PoF can go arbitrarily bad for general sequenced Fair-GTP solutions, restricting to Paretooptimal solutions bounds the PoF by (2𝑏 − 1), where 𝑏 denotes the number of agents traveling in the group. We further show that this bound is tight. Finally, we present the performance analysis of our algorithms on real-world datasets, demonstrating that our solution approach recommends PoIs within reasonable computational time, and in practice, PoF is below 2.

AAMAS Conference 2023 Conference Paper

Group Fair Clustering Revisited -- Notions and Efficient Algorithm

Shivam Gupta
Ganesh Ghalme
Narayanan C. Krishnan
Shweta Jain

This paper considers the problem of group fairness in clustering. We propose a new fairness notion which strictly generalizes existing notions, and we theoretically analyze the relationships between several existing notions. Finally, we propose a simple and efficient greedy round-robin-based algorithm (FRAC𝑂𝐸) and extensive experiments to validate its efficacy across multiple datasets.

AAMAS Conference 2023 Conference Paper

On Subset Selection of Multiple Humans To Improve Human-AI Team Accuracy

Sagalpreet Singh
Shweta Jain
Shashi Shekhar Jha

There are several classification tasks where neither the human nor the model is perfectly accurate. Some recent works, therefore, focus on the Human-AI team model, where the AI model’s probabilistic output is combined with the human-predicted class label. The combined decision is shown to consistently outperform the model’s or human’s accuracy alone. All the previous works, however, restrict to the setting where they consider a single human to combine with the AI model. Motivated by the crowdsourcing literature, which combines labels from multiple humans, we show that combining multiple human labels with the model’s probabilistic output can lead to significant improvement in accuracy. This paper further shows that while combining multiple humans helps, a naive combination of humans with AI model can lead to poor accuracy. Hence, there is a strong need for an intelligent strategy to select a subset of humans and combine their labels. To this end, we present an approach to merge the predicted labels from multiple humans with the model’s probabilistic output. We then provide an efficient algorithm to find the optimal subset of humans whose combined labels offer the most accurate output. Finally, we empirically demonstrate that the combined model outperforms the AI model or any human alone in terms of accuracy. Besides this, our subset selection algorithm and combination method outperforms the single human model and other naïve combination techniques.

AAMAS Conference 2022 Conference Paper

Budgeted Combinatorial Multi-Armed Bandits

Debojit Das
Shweta Jain
Sujit Gujar

We consider a budgeted combinatorial multi-armed bandit setting where, in every round, the algorithm selects a super-arm consisting of one or more arms. The goal is to minimize the total expected regret after all rounds within a limited budget. Existing techniques in this literature either fix the budget per round or fix the number of arms pulled in each round. Our setting is more general where based on the remaining budget and remaining number of rounds, the algorithm can decide how many arms to be pulled in each round. First, we propose CBwK-Greedy-UCB algorithm, which uses a greedy technique, CBwK-Greedy, to allocate the arms to the rounds. Next, we propose a reduction of this problem to Bandits with Knapsacks (BwK) with a single pull. With this reduction, we propose CBwK-LP- UCB that uses PrimalDualBwK ingeniously. We rigorously prove regret bounds for CBwK-LP-UCB. We experimentally compare the two algorithms and observe that CBwK-Greedy-UCB performs incrementally better than CBwK-LP-UCB. We also show that for very high budgets, the regret goes to zero.

AIJ Journal 2021 Journal Article

Ballooning multi-armed bandits

Ganesh Ghalme
Swapnil Dhamal
Shweta Jain
Sujit Gujar
Y. Narahari

In this paper, we introduce ballooning multi-armed bandits (BL-MAB), a novel extension of the classical stochastic MAB model. In the BL-MAB model, the set of available arms grows (or balloons) over time. In contrast to the classical MAB setting where the regret is computed with respect to the best arm overall, the regret in a BL-MAB setting is computed with respect to the best available arm at each time. We first observe that the existing stochastic MAB algorithms result in linear regret for the BL-MAB model. We prove that, if the best arm is equally likely to arrive at any time instant, a sub-linear regret cannot be achieved. Next, we show that if the best arm is more likely to arrive in the early rounds, one can achieve sub-linear regret. Our proposed algorithm determines (1) the fraction of the time horizon for which the newly arriving arms should be explored and (2) the sequence of arm pulls in the exploitation phase from among the explored arms. Making reasonable assumptions on the arrival distribution of the best arm in terms of the thinness of the distribution's tail, we prove that the proposed algorithm achieves sub-linear instance-independent regret. We further quantify explicit dependence of regret on the arrival distribution parameters. We reinforce our theoretical findings with extensive simulation results. We conclude by showing that our algorithm would achieve sub-linear regret even if (a) the distributional parameters are not exactly known, but are obtained using a reasonable learning mechanism or (b) the best arm is not more likely to arrive early, but a large fraction of arms is likely to arrive relatively early.

AAMAS Conference 2019 Conference Paper

Fair Division of Indivisible Goods Among Strategic Agents

Siddharth Barman
Ganesh Ghalme
Shweta Jain
Pooja Kulkarni
Shivika Narang

We study fair division of indivisible goods among strategic agents in a single-parameter environment. This work specifically considers fairness in terms of envy freeness up to one good (EF1) and maximin share guarantee (MMS). We show that (in a single-parameter environment) the problem of maximizing welfare, subject to the constraint that the allocation of the indivisible goods is EF1, admits a polynomial-time, 1/2-approximate, truthful auction. Under MMS setup, we develop a truthful auction which efficiently finds an allocation wherein each agent gets a bundle of value at least (1/2 − ε) times her maximin share and the welfare of the computed allocation is at least the optimal, here ε > 0 is a fixed constant. Our results for EF1 and MMS are based on establishing interesting majorization inequalities.

AIJ Journal 2018 Journal Article

A quality assuring, cost optimal multi-armed bandit mechanism for expertsourcing

Shweta Jain
Sujit Gujar
Satyanath Bhat
Onno Zoeter
Y. Narahari

There are numerous situations when a service requester wishes to expertsource a series of identical but non-trivial tasks from a pool of experts so as to achieve an assured accuracy level for each task, in a cost optimal way. The experts available are typically heterogeneous with unknown but fixed qualities and different service costs. The service costs are usually private to the experts and the experts could be strategic about their costs. The problem is to select for each task an optimal subset of experts so that the outcome obtained after aggregating the opinions from the selected experts guarantees a target level of accuracy. The problem is a challenging one even in a non-strategic setting since the accuracy of an aggregated outcome depends on unknown qualities. We develop a novel multi-armed bandit (MAB) mechanism for solving this problem. First, we propose a framework, Assured Accuracy Bandit (AAB) framework, which leads to a MAB algorithm, Constrained Confidence Bound for Non-Strategic Setting (CCB-NS). We derive an upper bound on the number of time steps this algorithm chooses a sub-optimal set, which depends on the target accuracy and true qualities. A more challenging situation arises when the requester not only has to learn the qualities of the experts but has to elicit their true service costs as well. We modify the CCB-NS algorithm to obtain an adaptive exploration separated algorithm Constrained Confidence Bound for Strategic Setting (CCB-S). The CCB-S algorithm produces an ex-post monotone allocation rule that can then be transformed into an ex-post incentive compatible and ex-post individually rational mechanism. This mechanism learns the qualities of the experts and guarantees a given target accuracy level in a cost optimal way. We also provide a lower bound on the number of times any algorithm must select a sub-optimal set and we see that the lower bound matches our upper bound up to a constant factor. We provide insights on a practical implementation of this framework through an illustrative example and demonstrate the efficacy of our algorithms through simulations.

AAMAS Conference 2018 Conference Paper

Design of Coalition Resistant Credit Score Functions for Online Discussion Forums

Ganesh Ghalme
Sujit Gujar
Amleshwar Kumar
Shweta Jain
Y. Narahari

We consider the problem of designing a robust credit score function in the context of online discussion forums. Credit score function assigns a real-valued credit score to each participant based on activities on the forum. A credit score of a participant quantifies the usefulness of contribution made by her. However, participants can manipulate a credit score function by forming coalitions, i. e. , by strategically awarding upvotes, likes, etc. among a subset of agents to maximize their credit scores. We propose a coalition resistant credit score function which discourages such strategic endorsements. We use community detection algorithms to identify closeknit communities in the graph of interactions and characterize coalition identifying community detection metric. In particular, we show that modularity is coalition identifying and provide theoretical guarantees on modularity based credit score function. Finally, we validate our theoretical findings with simulations on illustrative datasets.

JBHI Journal 2018 Journal Article

Riemann Liouvelle Fractional Integral Based Empirical Mode Decomposition for ECG Denoising

Shweta Jain
Varun Bajaj
Anil Kumar

Electrocardiograph (ECG) denoising is the most important step in diagnosis of heart-related diseases, as the diagnosis gets influenced with noises. In this paper, a new method for ECG denoising is proposed, which incorporates empirical mode decomposition algorithm with Riegmann Liouvelle (RL) fractional integral filtering and Savitzky-Golay (SG) filtering. In the proposed method, noisy ECG signal is decomposed into its intrinsic mode functions (IMFs), from which noisy IMFs, corrupted with high-frequency (HF) and low-frequency (LF) noises, are identified by proposed noisy-IMFs identification methodologies. To denoise the signal, RL fractional integral filtering and SG filtering are applied on noisy IMFs corrupted with HF and LF noises, respectively; ECG signal is reconstructed with denoised IMFs and remaining signal dominant IMFs to obtain noise-free ECG signal. Proposed methodology is tested with MIT-BIH arrhythmia database. Its performance, in terms of signal-to-noise ratio and mean square error, is compared with other related ECG denoising methods based on fractional integral, empirical mode decomposition, and ensemble empirical mode decomposition. The obtained results by proposed method prove that the proposed method gives efficient noise removal performance.

AAMAS Conference 2017 Conference Paper

Thompson Sampling Based Mechanisms for Stochastic Multi-Armed Bandit Problems

Ganesh Ghalme
Shweta Jain
Sujit Gujar
Y. Narahari

This paper explores Thompson sampling in the context of mechanism design for stochastic multi-armed bandit (MAB) problems. The setting is that of an MAB problem where the reward distribution of each arm consists of a stochastic component as well as a strategic component. Many existing MAB mechanisms use upper confidence bound (UCB) based algorithms for learning the parameters of the reward distribution. The randomized nature of Thompson sampling introduces certain unique, non-trivial challenges for mechanism design, which we address in this paper through a rigorous regret analysis. We first propose a MAB mechanism with deterministic payment rule, namely, TSM-D. We show that in TSM- D, the variance of agent utilities asymptotically approaches zero. However, the game theoretic properties satisfied by TSM-D (incentive compatibility and individual rationality with high probability) are rather weak. As our main contribution, we then propose the mechanism TSM-R, with randomized payment rule, and prove that TSM-R satisfies appropriate, adequate notions of incentive compatibility and individual rationality. For TSM-R, we also establish a theoretical upper bound on the variance in utilities of the agents. We further show, using simulations, that the variance in social welfare incurred by TSM-D or TSM-R is much lower when compared to that of existing UCB based mechanisms. We believe this paper establishes Thompson sampling as an attractive approach to be used in MAB mechanism design.

AAMAS Conference 2016 Conference Paper

A Deterministic MAB Mechanism for Crowdsourcing with Logarithmic Regret and Immediate Payments

Shweta Jain
Ganesh Ghalme
Satyanath Bhat
Sujit Gujar
Y. Narahari

We consider a general crowdsourcing setting with strategic workers whose qualities are unknown and design a multiarmed bandit (MAB) mechanism, CrowdUCB, which is deterministic, regret minimizing, and offers immediate payments to the workers. The problem involves sequentially selecting workers to process tasks in order to maximize the social welfare while learning the qualities of the strategic workers (strategic about their costs). Existing MAB mechanisms are either: (a) deterministic which potentially cause significant loss in social welfare, or (b) randomized which typically lead to high variance in payments. CrowdUCB completely addresses the above problems with the following features: (i) offers deterministic payments, (ii) achieves logarithmic regret in social welfare, (iii) renders allocations more effective by allocating blocks of tasks to a worker instead of a single task, and (iv) offers payment to a worker immediately upon completion of an assigned block of tasks. CrowdUCB is a mechanism with learning that learns the qualities of the workers while eliciting their true costs, irrespective of whether or not the workers know their own qualities. We show that CrowdUCB is ex-post individually rational (EPIR) and ex-post incentive compatible (EPIC) when the workers do not know their own qualities and when they update their beliefs in sync with the requester. When the workers know their own qualities, CrowdUCB is EPIR and ε−EPIC where ε is sub-linear in terms of the number of tasks.

AAMAS Conference 2016 Conference Paper

A Truthful Mechanism with Biparameter Learning for Online Crowdsourcing

Satyanath Bhat
Divya Padmanabhan
Shweta Jain
Yadati Narahari

AAAI Conference 2014 Conference Paper

A Multiarmed Bandit Incentive Mechanism for Crowdsourcing Demand Response in Smart Grids

Shweta Jain
Balakrishnan Narayanaswamy
Y. Narahari

Demand response is a critical part of renewable integration and energy cost reduction goals across the world. Motivated by the need to reduce costs arising from electricity shortage and renewable energy fluctuations, we propose a novel multiarmed bandit mechanism for demand response (MAB-MDR) which makes monetary offers to strategic consumers who have unknown response characteristics, to incetivize reduction in demand. Our work is inspired by a novel connection we make to crowdsourcing mechanisms. The proposed mechanism incorporates realistic features of the demand response problem including time varying and quadratic cost function. The mechanism marries auctions, that allow users to report their preferences, with online algorithms, that allow distribution companies to learn user-specific parameters. We show that MAB-MDR is dominant strategy incentive compatible, individually rational, and achieves sublinear regret. Such mechanisms can be effectively deployed in smart grids using new information and control architecture innovations and lead to welcome savings in energy costs.