Author name cluster

Brian Brubach

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers

2 author rows

AAAI Conference 2024 Conference Paper

Implications of Distance over Redistricting Maps: Central and Outlier Maps

Seyed A. Esmaeili
Darshan Chakrabarti
Hayley Grape
Brian Brubach

In representative democracy, a redistricting map is chosen to partition an electorate into districts which each elects a representative. A valid redistricting map must satisfy a collection of constraints such as being compact, contiguous, and of almost-equal population. However, these constraints are loose enough to enable an enormous ensemble of valid redistricting maps. This enables a partisan legislature to gerrymander by choosing a map which unfairly favors it. In this paper, we introduce an interpretable and tractable distance measure over redistricting maps which does not use election results and study its implications over the ensemble of redistricting maps. Specifically, we define a central map which may be considered "most typical" and give a rigorous justification for it by showing that it mirrors the Kemeny ranking in a scenario where we have a committee voting over a collection of redistricting maps to be drawn. We include running time and sample complexity analysis for our algorithms, including some negative results which hold using any algorithm. We further study outlier detection based on this distance measure and show that our framework can detect some gerrymandered maps. More precisely, we show some maps that are widely considered to be gerrymandered that lie very far away from our central maps in comparison to a large ensemble of valid redistricting maps. Since our distance measure does not rely on election results, this gives a significant advantage in gerrymandering detection which is lacking in all previous methods.
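
The abstract above does not spell out the distance measure, so the following is only a rough sketch under stated assumptions: a map is a list assigning each unit to a district, the distance counts unit pairs that are together in one map but split in the other, and the "central map" is approximated by the medoid of a sampled ensemble. The function names and the toy ensemble are hypothetical and are not taken from the paper.

```python
from itertools import combinations

def same_district_pairs(assignment):
    """Return the set of unit pairs (i, j), i < j, placed in the same district."""
    return {
        (i, j)
        for i, j in combinations(range(len(assignment)), 2)
        if assignment[i] == assignment[j]
    }

def map_distance(a, b):
    """Count unit pairs that are together in one map but split in the other.

    Assumed, illustrative distance; it uses no election results.
    """
    return len(same_district_pairs(a) ^ same_district_pairs(b))

def central_map_index(maps):
    """Index of the medoid: the map with the smallest total distance to all maps."""
    totals = [sum(map_distance(m, other) for other in maps) for m in maps]
    return min(range(len(maps)), key=totals.__getitem__)

def distances_to_center(maps):
    """Distance of every map to the medoid; large values flag candidate outliers."""
    center = maps[central_map_index(maps)]
    return [map_distance(m, center) for m in maps]

# Toy ensemble: 6 units, 2 districts per map (hypothetical data).
ensemble = [
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1],
    [0, 0, 0, 0, 1, 1],
    [0, 1, 0, 1, 0, 1],
]

if __name__ == "__main__":
    print(central_map_index(ensemble), distances_to_center(ensemble))
```

In this toy ensemble the alternating map ends up farthest from the medoid, which is the kind of signal an outlier test would look for.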


NeurIPS Conference 2021 Conference Paper

Fair Clustering Under a Bounded Cost

Seyed Esmaeili
Brian Brubach
Aravind Srinivasan
John Dickerson

Clustering is a fundamental unsupervised learning problem where a dataset is partitioned into clusters that consist of nearby points in a metric space. A recent variant, fair clustering, associates a color with each point representing its group membership and requires that each color has (approximately) equal representation in each cluster to satisfy group fairness. In this model, the cost of the clustering objective increases due to enforcing fairness in the algorithm. The relative increase in the cost, the "price of fairness," can indeed be unbounded. Therefore, in this paper we propose to treat an upper bound on the clustering objective as a constraint on the clustering problem, and to maximize equality of representation subject to it. We consider two fairness objectives: the group utilitarian objective and the group egalitarian objective, as well as the group leximin objective which generalizes the group egalitarian objective. We derive fundamental lower bounds on the approximation of the utilitarian and egalitarian objectives and introduce algorithms with provable guarantees for them. For the leximin objective we introduce an effective heuristic algorithm. We further derive impossibility results for other natural fairness objectives. We conclude with experimental results on real-world datasets that demonstrate the validity of our algorithms.
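
To make "equal representation of each color in each cluster" concrete, here is a small sketch that measures how far a given clustering is from proportional representation. The gap measure and the function names are illustrative assumptions; they are not the utilitarian, egalitarian, or leximin objectives defined in the paper.

```python
from collections import Counter

def color_proportions(labels, colors):
    """Per-cluster fraction of each color: {cluster: {color: fraction}}."""
    all_colors = sorted(set(colors))
    by_cluster = {}
    for cluster, color in zip(labels, colors):
        by_cluster.setdefault(cluster, Counter())[color] += 1
    return {
        cluster: {c: counts[c] / sum(counts.values()) for c in all_colors}
        for cluster, counts in by_cluster.items()
    }

def max_representation_gap(labels, colors):
    """Largest gap between a color's share in a cluster and its share overall.

    Assumed, illustrative measure: 0.0 means every cluster mirrors the dataset.
    """
    overall = Counter(colors)
    n = len(colors)
    return max(
        abs(share - overall[c] / n)
        for shares in color_proportions(labels, colors).values()
        for c, share in shares.items()
    )

# Hypothetical data: 8 points, alternating colors.
colors = ["red", "blue"] * 4
balanced = [0, 0, 0, 0, 1, 1, 1, 1]    # each cluster holds 2 red and 2 blue
segregated = [0, 1, 0, 1, 0, 1, 0, 1]  # cluster 0 is all red, cluster 1 all blue

if __name__ == "__main__":
    print(max_representation_gap(balanced, colors))
    print(max_representation_gap(segregated, colors))
```

A bounded-cost formulation like the one in the abstract would search for the clustering with the best representation among those whose clustering cost stays under a given budget; this sketch only evaluates representation for a fixed clustering.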


AAAI Conference 2021 Conference Paper

Fairness, Semi-Supervised Learning, and More: A General Framework for Clustering with Stochastic Pairwise Constraints

Brian Brubach
Darshan Chakrabarti
John P. Dickerson
Aravind Srinivasan
Leonidas Tsepenekas

Metric clustering is fundamental in areas ranging from Combinatorial Optimization and Data Mining, to Machine Learning and Operations Research. However, in a variety of situations we may have additional requirements or knowledge, distinct from the underlying metric, regarding which pairs of points should be clustered together. To capture and analyze such scenarios, we introduce a novel family of stochastic pairwise constraints, which we incorporate into several essential clustering objectives (radius/median/means). Moreover, we demonstrate that these constraints can succinctly model an intriguing collection of applications, including among others Individual Fairness in clustering and Must-link constraints in semi-supervised learning. Our main result consists of a general framework that yields approximation algorithms with provable guarantees for important clustering objectives, while at the same time producing solutions that respect the stochastic pairwise constraints. Furthermore, for certain objectives we devise improved results in the case of Must-link constraints, which are also the best possible from a theoretical perspective. Finally, we present experimental evidence that validates the effectiveness of our algorithms.


NeurIPS Conference 2021 Conference Paper

Improved Guarantees for Offline Stochastic Matching via new Ordered Contention Resolution Schemes

Brian Brubach
Nathaniel Grammel
Will Ma
Aravind Srinivasan

Matching is one of the most fundamental and broadly applicable problems across many domains. In these diverse real-world applications, there is often a degree of uncertainty in the input which has led to the study of stochastic matching models. Here, each edge in the graph has a known, independent probability of existing derived from some prediction. Algorithms must probe edges to determine existence and match them irrevocably if they exist. Further, each vertex may have a patience constraint denoting how many of its neighboring edges can be probed. We present new ordered contention resolution schemes yielding improved approximation guarantees for some of the foundational problems studied in this area. For stochastic matching with patience constraints in general graphs, we provide a $0.382$-approximate algorithm, significantly improving over the previous best $0.31$-approximation of Baveja et al. (2018). When the vertices do not have patience constraints, we describe a $0.432$-approximate random order probing algorithm with several corollaries such as an improved guarantee for the Prophet Secretary problem under Edge Arrivals. Finally, for the special case of bipartite graphs with unit patience constraints on one of the partitions, we show a $0.632$-approximate algorithm that improves on the recent $1/3$-guarantee of Hikima et al. (2021).
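
The probe-and-commit model described above can be simulated in a few lines. The sketch below uses a naive greedy policy (probe edges in decreasing order of existence probability); it is only a baseline for illustrating the model and is not the ordered contention resolution scheme from the paper. The function name and input layout are assumptions.

```python
import random

def greedy_probe(edges, patience, rng):
    """Simulate probing with patience constraints using a naive greedy order.

    edges: list of (u, v, p), where p is the probability that the edge exists.
    patience: dict mapping each vertex to the max number of probes on its edges.
    rng: a random.Random instance used to realize edge existence.
    Returns the list of matched (u, v) pairs.
    """
    remaining = dict(patience)
    matched = set()
    matching = []
    for u, v, p in sorted(edges, key=lambda e: -e[2]):
        if u in matched or v in matched:
            continue  # an endpoint is already matched
        if remaining[u] == 0 or remaining[v] == 0:
            continue  # an endpoint has run out of patience
        remaining[u] -= 1
        remaining[v] -= 1
        if rng.random() < p:         # the probe reveals whether the edge exists
            matching.append((u, v))  # an existing edge must be matched irrevocably
            matched.update((u, v))
    return matching

if __name__ == "__main__":
    demo_edges = [("a", "x", 0.9), ("a", "y", 0.5), ("b", "x", 0.4)]
    demo_patience = {"a": 1, "b": 1, "x": 2, "y": 1}
    print(greedy_probe(demo_edges, demo_patience, random.Random(0)))
```

Averaging the matching size over many runs gives an empirical estimate of a policy's expected value, which is the quantity the approximation ratios in the abstract are stated about.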


NeurIPS Conference 2021 Conference Paper

It's COMPASlicated: The Messy Relationship between RAI Datasets and Algorithmic Fairness Benchmarks

Michelle Bao
Angela Zhou
Samantha Zottola
Brian Brubach
Sarah Desmarais
Aaron Horowitz
Kristian Lum
Suresh Venkatasubramanian

Risk assessment instrument (RAI) datasets, particularly ProPublica’s COMPAS dataset, are commonly used in algorithmic fairness papers due to benchmarking practices of comparing algorithms on datasets used in prior work. In many cases, this data is used as a benchmark to demonstrate good performance without accounting for the complexities of criminal justice (CJ) processes. However, we show that pretrial RAI datasets can contain numerous measurement biases and errors, and due to disparities in discretion and deployment, algorithmic fairness applied to RAI datasets is limited in making claims about real-world outcomes. These reasons make the datasets a poor fit for benchmarking under assumptions of ground truth and real-world impact. Furthermore, conventional practices of simply replicating previous data experiments may implicitly inherit or edify normative positions without explicitly interrogating value-laden assumptions. Without context of how interdisciplinary fields have engaged in CJ research and context of how RAIs operate upstream and downstream, algorithmic fairness practices are misaligned for meaningful contribution in the context of CJ, and would benefit from transparent engagement with normative considerations and values related to fairness, justice, and equality. These factors prompt questions about whether benchmarks for intrinsically socio-technical systems like the CJ system can exist in a beneficial and ethical way.


ICML Conference 2020 Conference Paper

A Pairwise Fair and Community-preserving Approach to k-Center Clustering

Brian Brubach
Darshan Chakrabarti
John Dickerson
Samir Khuller
Aravind Srinivasan
Leonidas Tsepenekas

Clustering is a foundational problem in machine learning with numerous applications. As machine learning increases in ubiquity as a backend for automated systems, concerns about fairness arise. Much of the current literature on fairness deals with discrimination against protected classes in supervised learning (group fairness). We define a different notion of fair clustering wherein the probability that two points (or a community of points) become separated is bounded by an increasing function of their pairwise distance (or community diameter). We capture the situation where data points represent people who gain some benefit from being clustered together. Unfairness arises when certain points are deterministically separated, either arbitrarily or by someone who intends to harm them as in the case of gerrymandering election districts. In response, we formally define two new types of fairness in the clustering setting, pairwise fairness and community preservation. To explore the practicality of our fairness goals, we devise an approach for extending existing $k$-center algorithms to satisfy these fairness constraints. Analysis of this approach proves that reasonable approximations can be achieved while maintaining fairness. In experiments, we compare the effectiveness of our approach to classical $k$-center algorithms/heuristics and explore the tradeoff between optimal clustering and fairness.
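
For readers unfamiliar with the classical $k$-center algorithms that the abstract uses as a point of comparison, here is a minimal sketch of the standard greedy farthest-first traversal, a well-known 2-approximation for $k$-center. It is a baseline only: it does not implement the pairwise fairness or community preservation constraints introduced in the paper, and the function names are chosen for illustration.

```python
import math

def greedy_k_center(points, k):
    """Farthest-first traversal: return the indices of k chosen centers."""
    centers = [0]  # start from an arbitrary point (here: the first one)
    # dist[i] is the distance from point i to its nearest chosen center.
    dist = [math.dist(p, points[0]) for p in points]
    while len(centers) < k:
        nxt = max(range(len(points)), key=dist.__getitem__)
        centers.append(nxt)
        dist = [min(d, math.dist(p, points[nxt])) for d, p in zip(dist, points)]
    return centers

def clustering_radius(points, centers):
    """k-center objective: the largest distance from a point to its nearest center."""
    return max(min(math.dist(p, points[c]) for c in centers) for p in points)

# Hypothetical data: two well-separated groups on a line.
points = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (11.0, 0.0)]

if __name__ == "__main__":
    chosen = greedy_k_center(points, 2)
    print(chosen, clustering_radius(points, chosen))
```

This procedure is deterministic, so nearby points that fall on opposite sides of a cluster boundary are always separated. That is the behavior the probabilistic separation bound in the abstract is meant to avoid.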


NeurIPS Conference 2020 Conference Paper

Probabilistic Fair Clustering

Seyed Esmaeili
Brian Brubach
Leonidas Tsepenekas
John Dickerson

In clustering problems, a central decision-maker is given a complete metric graph over vertices and must provide a clustering of vertices that minimizes some objective function. In fair clustering problems, vertices are endowed with a color (e.g., membership in a group), and the requirements of a valid clustering might also include the representation of colors in the solution. Prior work in fair clustering assumes complete knowledge of group membership. In this paper, we generalize this by assuming imperfect knowledge of group membership through probabilistic assignments, and present algorithms in this more general setting with approximation ratio guarantees. We also address the problem of "metric membership", where group membership has a notion of order and distance. Experiments are conducted using our proposed algorithms as well as baselines to validate our approach, and also surface nuanced concerns when group membership is not known deterministically.
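
When group membership is only known probabilistically, one natural quantity to track is the expected share of a color in each cluster, which follows from linearity of expectation. The sketch below assumes two colors and a hypothetical per-point probability of being "red"; it illustrates the setting only and is not the paper's algorithm.

```python
def expected_color_share(labels, prob_red):
    """Expected fraction of red points in each cluster.

    labels[i] is the cluster of point i; prob_red[i] is the (assumed known)
    probability that point i belongs to the red group. By linearity of
    expectation, the expected number of red points in a cluster is the sum
    of the probabilities of its points.
    """
    totals, sizes = {}, {}
    for cluster, p in zip(labels, prob_red):
        totals[cluster] = totals.get(cluster, 0.0) + p
        sizes[cluster] = sizes.get(cluster, 0) + 1
    return {cluster: totals[cluster] / sizes[cluster] for cluster in totals}

# Hypothetical data: cluster 0 holds two uncertain points,
# cluster 1 holds one certainly red and one certainly non-red point.
labels = [0, 0, 1, 1]
prob_red = [0.5, 0.5, 1.0, 0.0]

if __name__ == "__main__":
    print(expected_color_share(labels, prob_red))
```

Both clusters have the same expected red share here, yet only the second is balanced with certainty. This gap between expected and realized representation is one example of the concerns that arise when membership is not known deterministically.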


SODA Conference 2018 Conference Paper

Algorithms to Approximate Column-Sparse Packing Problems

Brian Brubach
Karthik Abinav Sankararaman
Aravind Srinivasan
Pan Xu

Column-sparse packing problems arise in several contexts in both deterministic and stochastic discrete optimization. We present two unifying ideas, (non-uniform) attenuation and multiple-chance algorithms, to obtain improved approximation algorithms for some well-known families of such problems. As three main examples, we attain the integrality gap, up to lower-order terms, for known LP relaxations for k-column sparse packing integer programs (Bansal et al., Theory of Computing, 2012) and stochastic k-set packing (Bansal et al., Algorithmica, 2012), and go “half the remaining distance” to optimal for a major integrality-gap conjecture of Füredi, Kahn and Seymour on hypergraph matching (Combinatorica, 1993).


AAMAS Conference 2017 Conference Paper

Attenuate Locally, Win Globally: An Attenuation-based Framework for Online Stochastic Matching with Timeouts

Brian Brubach
Karthik Abinav Sankararaman
Aravind Srinivasan
Pan Xu

Online matching problems have garnered significant attention in recent years due to numerous applications. Many of them capture the uncertainty in the real world by including stochasticity in both the arrival process and the matching process. The Online Stochastic Matching with Timeouts problem introduced by Bansal, Gupta, Li, Mestre, Nagarajan, and Rudra (Algorithmica, 2012) models matching markets (e.g. E-Bay, Amazon). Buyers arrive from an independent and identically distributed (i.i.d.) known distribution on buyer profiles and can be shown a list of items one at a time. Each buyer has some probability of purchasing each item and a limit (timeout) on the number of items they can be shown. Bansal et al. (Algorithmica, 2012) gave a 0.12-competitive algorithm which was improved by Adamczyk, Grandoni, and Mukherjee (ESA, 2015) to 0.24. We present an online attenuation framework that uses an algorithm for offline stochastic matching as a black box. Our main contributions are as follows. On the upper bound side, we show that this framework combined with a black box adapted from Bansal et al. (Algorithmica, 2012) yields an online algorithm which nearly doubles the ratio to 0.46. On the lower bound side, we show that no algorithm can achieve a ratio better than 0.632 using the common LP for this problem. This framework has a high potential for further improvements since new algorithms for offline stochastic matching can lead directly to improvements for the online problem. Our online framework also has the potential for a variety of extensions. For example, we introduce a natural generalization: Online Stochastic Matching with Two-sided Timeouts in which both online and offline vertices have timeouts. Our framework provides the first algorithm for this problem achieving a ratio of 0.31. We accomplish this by proposing a new black box algorithm for offline stochastic matching on star graphs, which may be of independent interest. This new black box improves the approximation ratio for the offline stochastic matching problem on star graphs from 0.5 by Adamczyk et al. (ESA 2015) to 0.56.

∗For detailed proofs, refer to the full version of the paper. Aravind Srinivasan’s research was supported in part by NSF Awards CNS-1010789 and CCF-1422569, and by a research award from Adobe, Inc. The research of Brian Brubach, Karthik Sankararaman, and Pan Xu was supported in part by NSF Awards CNS-1010789 and CCF-1422569. Appears in: Proc. of the 16th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2017), S. Das, E. Durfee, K. Larson, M. Winikoff (eds.), May 8–12, 2017, São Paulo, Brazil. Copyright © 2017, International Foundation for Autonomous Agents and Multiagent Systems (www.ifaamas.org). All rights reserved.
