Arrow Research search

Author name cluster

Rajmohan Rajaraman

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
2 author rows

Possible papers

25

TCS Journal 2026 Journal Article

Stability of peer-to-peer networks under greedy peering

  • Lucianna Kiffer
  • Rajmohan Rajaraman

Major cryptocurrency networks traditionally rely on random peering choices to establish connections in their peer-to-peer networks. While effective for open, permissionless systems, random peering does not account for actors who optimize connections to reduce propagation delays. This paper investigates the dynamics of such “greedy” strategies. We model nodes selecting peers to minimize their average distance to a designated subset of nodes, analyzing factors such as peer selection methods, degree constraints, and the size of the subset-a key aspect in blockchain networks where only a fraction of nodes propagate content or run high-performance hardware. We first analyze an idealized version of the game where each node has full knowledge of the current network and aims to select the d best connections, and prove the existence of equilibria under various model assumptions. Since in reality nodes only have local knowledge based on their peers’ behavior, we also study a greedy protocol which runs in rounds, with each node replacing its worst-performing edge with a new random edge. We exactly characterize stability properties of networks that evolve with this peering rule and derive regimes (based on number of nodes and degree) where stability is possible and even inevitable. We also run extensive simulations with this peering rule examining both how the network evolves and how different network parameters affect the stability properties of the network. Our findings generally show that the only stable networks that arise from greedy peering choices are low-diameter and result in disparate performance for nodes in the network.

ICML Conference 2025 Conference Paper

Optimal Fair Learning Robust to Adversarial Distribution Shift

  • Sushant Agarwal
  • Amit Deshpande 0001
  • Rajmohan Rajaraman
  • Ravi Sundaram

Previous work in fair machine learning has characterised the Fair Bayes Optimal Classifier (BOC) on a given distribution for both deterministic and randomized classifiers. We study the robustness of the Fair BOC to adversarial noise in the data distribution. Kearns & Li (1988) implies that the accuracy of the deterministic BOC without any fairness constraints is robust (Lipschitz) to malicious noise in the data distribution. We demonstrate that their robustness guarantee breaks down when we add fairness constraints. Hence, we consider the randomized Fair BOC, and our central result is that its accuracy is robust to malicious noise in the data distribution. Our robustness result applies to various fairness constraints—Demographic Parity, Equal Opportunity, Predictive Equality. Beyond robustness, we demonstrate that randomization leads to better accuracy and efficiency. We show that the randomized Fair BOC is nearly-deterministic, and gives randomized predictions on at most one data point, hence availing numerous benefits of randomness, while using very little of it.

AAAI Conference 2025 Conference Paper

Sample Complexity of Linear Regression Models for Opinion Formation in Networks

  • Haolin Liu
  • Rajmohan Rajaraman
  • Ravi Sundaram
  • Anil Kumar Vullikanti
  • Omer Wasim
  • Haifeng Xu

Consider public health officials aiming to spread awareness about a new vaccine in a community interconnected by a social network. How can they distribute information with minimal resources, so as to avoid polarization and ensure community-wide convergence of opinion? To tackle such challenges, we initiate the study of sample complexity of opinion formation in networks. Our framework is built on the recognized opinion formation game, where we regard each agent’s opinion as a data-derived model, unlike previous works that treat opinions as data-independent scalars. The opinion model for every agent is initially learned from its local samples and evolves game-theoretically as all agents communicate with neighbors and revise their models towards an equilibrium. Our focus is on the sample complexity needed to ensure that the opinions converge to an equilibrium such that every agent’s final model has low generalization error. Our paper has two main technical results. First, we present a novel polynomial time optimization framework to quantify the total sample complexity for arbitrary networks, when the underlying learning problem is (generalized) linear regression. Second, we leverage this optimization to study the network gain which measures the improvement of sample complexity when learning over a network compared to that in isolation. Towards this end, we derive network gain bounds for various network classes including cliques, star graphs, and random regular graphs. Additionally, our framework provides a method to study sample distribution within the network, suggesting that it is sufficient to allocate samples inversely to the degree. Empirical results on both synthetic and real-world networks strongly support our theoretical findings.

FOCS Conference 2023 Conference Paper

One Tree to Rule Them All: Poly-Logarithmic Universal Steiner Tree

  • Costas Busch
  • Da Qi Chen
  • Arnold Filtser
  • Daniel Hathcock
  • D. Ellis Hershkowitz
  • Rajmohan Rajaraman

A spanning tree T of graph G is a $\rho$-approximate universal Steiner tree (UST) for root vertex r if, for any subset of vertices S containing r, the cost of the minimal subgraph of T connecting S is within a $\rho$ factor of the minimum cost tree connecting S in G. Busch et al. (FOCS 2012) showed that every graph admits $2^{O(\sqrt{\log n})}$-approximate USTs by showing that USTs are equivalent to strong sparse partition hierarchies (up to poly-logs). Further, they posed poly-logarithmic USTs and strong sparse partition hierarchies as open questions. We settle these open questions by giving polynomial-time algorithms for computing both $O\left(\log ^{7} n\right)$-approximate USTs and poly-logarithmic strong sparse partition hierarchies. We reduce the existence of these objects to the previously studied cluster aggregation problem and a class of well-separated point sets which we call dangling nets. For graphs with constant doubling dimension or constant pathwidth we obtain improved bounds by deriving $O(\log n)$-approximate USTs and $O(1)$ strong sparse partition hierarchies. Our doubling dimension result is tight up to second order terms.

SODA Conference 2021 Conference Paper

Competitive Data-Structure Dynamization

  • Claire Mathieu
  • Rajmohan Rajaraman
  • Neal E. Young
  • Arman Yousefi

Data-structure dynamization is a general approach for making static data structures dynamic. It is used extensively in geometric settings and in the guise of so-called merge (or compaction) policies in big-data databases such as Google Bigtable and LevelDB (our focus). Previous theoretical work is based on worst-case analyses for uniform inputs — insertions of one item at a time and constant read rate. In practice, merge policies must not only handle batch insertions and varying read/write ratios, they can take advantage of such non-uniformity to reduce cost on a per-input basis. To model this, we initiate the study of data-structure dynamization through the lens of competitive analysis, via two new online set-cover problems. For each, the input is a sequence of disjoint sets of weighted items. The sets are revealed one at a time. The algorithm must respond to each with a set cover that covers all items revealed so far. It obtains the cover incrementally from the previous cover by adding one or more sets and optionally removing existing sets. For each new set the algorithm incurs build cost equal to the weight of the items in the set. In the first problem the objective is to minimize total build cost plus total query cost, where the algorithm incurs a query cost at each time t equal to the current cover size. In the second problem, the objective is to minimize the build cost while keeping the query cost from exceeding k (a given parameter) at any time. We give deterministic online algorithms for both variants, with competitive ratios of Θ(log∗ n ) and k, respectively. The latter ratio is optimal for the second variant.

FOCS Conference 2020 Conference Paper

Scheduling Precedence-Constrained Jobs on Related Machines with Communication Delay

  • Biswaroop Maiti
  • Rajmohan Rajaraman
  • David Stalfa
  • Zoya Svitkina
  • Aravindan Vijayaraghavan

We consider the problem of scheduling precedence-constrained jobs on uniformly-related machines in the presence of an arbitrary, fixed communication delay. Communication delay is the amount of time that must pass between the completion of a job on one machine and the start of any successor of that job on a different machine. We consider a model that allows job duplication, i. e. processing of the same job on multiple machines, which, as we show, can reduce the length of a schedule (i. e. , its makespan) by a logarithmic factor. Our main result is an approximation algorithm for makespan with approximation ratio polylogarithmic in the number of machines and the length of the communication delay, assuming the minimum makespan is at least the delay. Our algorithm is based on rounding a linear programming relaxation for the problem, which includes carefully designed constraints capturing the interaction among communication delay, precedence requirements, varying speeds, and job duplication. To derive a schedule from a solution to the linear program, we balance the benefits of duplication in satisfying precedence constraints early against its drawbacks in increasing overall system load. Our result builds on two previous lines of work, one with communication delay but identical machines (Lepere, Rapine 2002), and the other with uniformly-related machines but no communication delay (Chudak, Shmoys 1999). We next show that the integrality gap of our mathematical program is polylogarithmic in the communication delay. Our gap construction employs expander graphs and exploits a property of robust expansion and its generalization to paths of longer length, which may be of independent interest. Finally, we quantify the advantage of duplication in scheduling with communication delay. We show that the best schedule without duplication can have a larger makespan than the optimal with duplication by a logarithmic factor. Nevertheless, we present a polynomial time algorithm to transform any schedule to a schedule without duplication at the cost of an increase in makespan polylogarithmic in the number of jobs and machines. Together with our makespan approximation algorithm for schedules allowing duplication, this also yields a polylogarithmic-approximation algorithm for the setting where duplication is not allowed.

SODA Conference 2013 Conference Paper

On the Complexity of Information Spreading in Dynamic Networks

  • Chinmoy Dutta
  • Gopal Pandurangan
  • Rajmohan Rajaraman
  • Zhifeng Sun
  • Emanuele Viola

We study how to spread k tokens of information to every node on an n -node dynamic network, the edges of which are changing at each round. This basic gossip problem can be completed in O ( n + k ) rounds in any static network, and determining its complexity in dynamic networks is central to understanding the algorithmic limits and capabilities of various dynamic network models. Our focus is on token-forwarding algorithms, which do not manipulate tokens in any way other than storing, copying and forwarding them. We first consider the strongly adaptive adversary model where in each round, each node first chooses a token to broadcast to all its neighbors (without knowing who they are), and then an adversary chooses an arbitrary connected communication network for that round with the knowledge of the tokens chosen by each node. We show that Ω( nk /log n + n ) rounds are needed for any randomized (centralized or distributed) token-forwarding algorithm to disseminate the k tokens, thus resolving an open problem raised in [KLO10]. The bound applies to a wide class of initial token distributions, including those in which each token is held by exactly one node and well-mixed ones in which each node has each token independently with a constant probability. Our result for the strongly adaptive adversary model motivates us to study the weakly adaptive adversary model where in each round, the adversary is required to lay down the network first, and then each node sends a possibly distinct token to each of its neighbors. We propose a simple randomized distributed algorithm where in each round, along every edge ( u, v ), a token sampled uniformly at random from the symmetric difference of the sets of tokens held by node u and node v is exchanged. We prove that starting from any well-mixed distribution of tokens where each node has each token independently with a constant probability, this algorithm solves the k -gossip problem in O (( n + k ) log n log k ) rounds with high probability over the initial token distribution and the randomness of the protocol. We then show how the above uniform sampling problem can be solved using Õ (log n ) bits of communication, making the overall algorithm communication-efficient. We next present a centralized algorithm that solves the gossip problem for every initial distribution in O (( n + k ) log 2 n ) rounds in the offline setting where the entire sequence of communication networks is known to the algorithm in advance. Finally, we present an -round centralized offline algorithm in which each node can only broadcast a single token to all of its neighbors in each round.

FOCS Conference 2012 Conference Paper

Split and Join: Strong Partitions and Universal Steiner Trees for Graphs

  • Costas Busch
  • Chinmoy Dutta
  • Jaikumar Radhakrishnan
  • Rajmohan Rajaraman
  • Srinivasagopalan Srivathsan

We study the problem of constructing universal Steiner trees for undirected graphs. Given a graph G and a root node r, we seek a single spanning tree T of minimum stretch, where the stretch of T is defined to be the maximum ratio, over all terminal sets X, of the cost of the minimal sub-tree T X of T that connects X to r to the cost of an optimal Steiner tree connecting X to r in G. Universal Steiner trees (USTs) are important for data aggregation problems where computing the Steiner tree from scratch for every input instance of terminals is costly, as for example in low energy sensor network applications. graphs with 2 O(√log n) -stretch. We also give a polynomial time We provide a polynomial time UST construction for general polylog(n)-stretch construction for minor-free graphs. One basic building block of our algorithms is a hierarchy of graph partitions, each of which guarantees small strong diameter for each cluster and bounded neighbourhood intersections for each node. We show close connections between the problems of constructing USTs and building such graph partitions. Our construction of partition hierarchies for general graphs is based on an iterative cluster merging procedure, while the one for minor-free graphs is based on a separator theorem for such graphs and the solution to a cluster aggregation problem that may be of independent interest even for general graphs. To our knowledge, this is the first subpolynomial-stretch (o(n ε ) for any ε >; 0) UST construction for general graphs, and the first polylogarithmic-stretch UST construction for minor-free graphs.

FOCS Conference 2009 Conference Paper

Reducibility among Fractional Stability Problems

  • Shiva Kintali
  • Laura J. Poplawski
  • Rajmohan Rajaraman
  • Ravi Sundaram
  • Shang-Hua Teng

In a landmark paper, Papadimitriou introduced a number of syntactic subclasses of TFNP based on proof styles that (unlike TFNP) admit complete problems. A recent series of results has shown that finding Nash equilibria is complete for PPAD, a particularly notable subclass of TFNP. A major goal of this work is to expand the universe of known PPAD-complete problems. We resolve the computational complexity of a number of outstanding open problems with practical applications. Here is the list of problems we show to be PPAD-complete, along with the domains of practical significance: Fractional Stable Paths Problem (FSPP) - Internet routing; Core of Balanced Games - Economics and Game theory; Scarf's Lemma - Combinatorics; Hypergraph Matching - Social Choice and Preference Systems; Fractional Bounded Budget Connection Games (FBBC) - Social networks; and Strong Fractional Kernel - Graph Theory. In fact, we show that no fully polynomial-time approximation schemes exist (unless PPAD is in FP). This paper is entirely a series of reductions that build in nontrivial ways on the framework established in previous work. In the course of deriving these reductions, we created two new concepts - preference games and personalized equilibria. The entire set of new reductions can be presented as a lattice with the above problems sandwiched between preference games (at the "easy" end) and personalized equilibria (at the "hard" end). Our completeness results extend to natural approximate versions of most of these problems. On a technical note, we wish to highlight our novel "continuous-to-discrete" reduction from exact personalized equilibria to approximate personalized equilibria using a linear program augmented with an exponential number of "min" constraints of a specific form. In addition to enhancing our repertoire of PPAD-complete problems, we expect the concepts and techniques in this paper to find future use in algorithmic game theory.

FOCS Conference 1999 Conference Paper

Online Scheduling to Minimize Average Stretch

  • S. Muthukrishnan 0001
  • Rajmohan Rajaraman
  • Anthony Shaheen
  • Johannes Gehrke

We consider the classical problem of online job scheduling on uniprocessor and multiprocessor machines. For a given job, we measure the quality of service provided by an algorithm by the stretch of the job, which is defined as the ratio of the amount of time that the job spends in the system to the processing time of the job. For a given sequence of jobs, we measure the performance of an algorithm by the average stretch achieved by the algorithm over all the jobs in the sequence. The average stretch metric has been used to evaluate the performance of scheduling algorithms in many applications arising in databases, networks and systems; however no formal analysis of scheduling algorithms is known for the average stretch metric. The main contribution of the paper is to show that the shortest remaining processing time algorithm (SRPT) is O(l)-competitive with respect to average stretch for both uniprocessors as well as multiprocessors. For uniprocessors, we prove that SRPT is 2-competitive; we also establish an essentially matching lower bound on the competitive ratio of SRPT. For multiprocessors, we show that the competitive ratio of SRPT is at most 14. Furthermore, we establish constant-factor lower bounds on the competitive ratio of any online algorithm for both uniprocessors and multiprocessors.

TCS Journal 1999 Journal Article

Rapid convergence of a local load balancing algorithm for asynchronous rings

  • Johannes E. Gehrke
  • C.Greg Plaxton
  • Rajmohan Rajaraman

We consider the problem of load balancing in a ring network. We present an analysis of the following local algorithm. In each step, each node of the ring examines the number of tokens at its clockwise neighbor and sends a token to the neighbour if the neighbour has fewer tokens. We show that in a synchronous model, for any initial token distribution b on an n-node ring, the algorithm converges to a completely balanced distribution within 4OPT(b) + n steps, where OPT(b) is the time taken by the optimal centralized algorithm to balance b completely. Our main result is an analysis of the algorithm in an asynchronous model in which local computations and messages may be arbitrarily delayed, subject to the constraint that each message is eventually delivered and each computation is eventually performed. By generalizing our analysis for the synchronous model, we show that for any initial token distribution b, the algorithm converges to a completely balanced distribution within 8OPT(b) + 2n rounds, where a round is a minimal sequence of steps in which every component of the network is scheduled at least once. We also show that for every initial token distribution, the message complexity of the algorithm is asymptotically optimal among all algorithms that move tokens in the clockwise direction.

FOCS Conference 1996 Conference Paper

Fast Fault-Tolerant Concurrent Access to Shared Objects

  • C. Greg Plaxton
  • Rajmohan Rajaraman

The authors consider a synchronous model of distributed computation in which n nodes communicate via point-to-point messages, subject to the following constraints: (i) in a single "step", a node can only send or receive O(logn) words, and (ii) communication is unreliable in that a constant fraction of all messages are lost at each step due to node and/or link failures. They design and analyze a simple local protocol for providing fast concurrent access to shared objects in this faulty network environment. In the protocol, clients use a hashing-based method to access shared objects. When a large number of clients attempt to read a given object at the same time, the object is rapidly replicated to an appropriate number of servers. Once the necessary level of replication has been achieved, each remaining request for the object is serviced within O(1) expected steps. The protocol has practical potential for supporting high levels of concurrency in distributed file systems over wide area networks.