Arrow Research search

Author name cluster

Kasper Green Larsen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

49 papers
2 author rows

Possible papers (49)

NeurIPS Conference 2025 Conference Paper

Tight Generalization Bounds for Large-Margin Halfspaces

  • Kasper Green Larsen
  • Natascha Schalburg

We prove the first generalization bound for large-margin halfspaces that is asymptotically tight in the tradeoff between the margin, the fraction of training points with the given margin, the failure probability and the number of training points.

IJCAI Conference 2024 Conference Paper

Bagging is an Optimal PAC Learner (Extended Abstract)

  • Kasper Green Larsen

Determining the optimal sample complexity of PAC learning in the realizable setting was a central open problem in learning theory for decades. Finally, seminal work by Hanneke gave an algorithm with a provably optimal sample complexity. His algorithm is based on a careful and structured sub-sampling of the training data and then returning a majority vote among hypotheses trained on each of the sub-samples. While being a very exciting theoretical result, it has not had much impact in practice, in part due to inefficiency, since it constructs a polynomial number of sub-samples of the training data, each of linear size. In this work, we prove the surprising result that the practical and classic heuristic bagging (a.k.a. bootstrap aggregation), due to Breiman, is in fact also an optimal PAC learner. Bagging pre-dates Hanneke's algorithm by twenty years and is taught in most undergraduate machine learning courses. Moreover, we show that it only requires a logarithmic number of sub-samples to reach optimality.
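
To make the bagging heuristic concrete, here is a minimal sketch of bootstrap aggregation with a decision-stump base learner; the stump learner, the toy data and the number of rounds are illustrative assumptions, not the paper's analysis or experiments.

```python
import numpy as np

def fit_stump(X, y):
    """Exhaustively pick the (feature, threshold, sign) stump with lowest training error."""
    best = (np.inf, 0, 0.0, 1)               # (error, feature, threshold, sign)
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = np.mean(pred != y)
                if err < best[0]:
                    best = (err, j, thr, sign)
    return best[1:]

def predict_stump(stump, X):
    j, thr, sign = stump
    return sign * np.where(X[:, j] <= thr, 1, -1)

def bagging(X, y, n_estimators=25, rng=np.random.default_rng(0)):
    """Bootstrap aggregation: train one stump per bootstrap sample, majority-vote at prediction time."""
    n = len(y)
    stumps = []
    for _ in range(n_estimators):
        idx = rng.integers(0, n, size=n)      # sample n points with replacement
        stumps.append(fit_stump(X[idx], y[idx]))
    def predict(Xq):
        votes = np.sum([predict_stump(s, Xq) for s in stumps], axis=0)
        return np.where(votes >= 0, 1, -1)
    return predict

# toy usage with labels in {-1, +1}
X = np.array([[0.1], [0.4], [0.35], [0.8], [0.9], [0.6]])
y = np.array([-1, -1, -1, 1, 1, 1])
predict = bagging(X, y)
print(predict(np.array([[0.2], [0.7]])))      # expected: [-1, 1]
```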

NeurIPS Conference 2024 Conference Paper

Derandomizing Multi-Distribution Learning

  • Kasper Green Larsen
  • Omar Montasser
  • Nikita Zhivotovskiy

Multi-distribution or collaborative learning involves learning a single predictor that works well across multiple data distributions, using samples from each during training. Recent research on multi-distribution learning, focusing on binary loss and finite VC dimension classes, has shown near-optimal sample complexity that is achieved with oracle efficient algorithms. That is, these algorithms are computationally efficient given an efficient ERM for the class. Unlike in classical PAC learning, where the optimal sample complexity is achieved with deterministic predictors, current multi-distribution learning algorithms output randomized predictors. This raises the question: can these algorithms be derandomized to produce a deterministic predictor for multiple distributions? Through a reduction to discrepancy minimization, we show that derandomizing multi-distribution learning is computationally hard, even when ERM is computationally efficient. On the positive side, we identify a structural condition enabling an efficient black-box reduction, converting existing randomized multi-distribution predictors into deterministic ones.

MFCS Conference 2024 Conference Paper

From TCS to Learning Theory (Invited Paper)

  • Kasper Green Larsen

While machine learning theory and theoretical computer science are both based on a solid mathematical foundation, the two research communities have a smaller overlap than what the proximity of the fields warrants. In this invited abstract, I will argue that traditional theoretical computer scientists have much to offer the learning theory community and vice versa. I will make this argument by telling a personal story of how I broadened my research focus to encompass learning theory, and how my TCS background has been extremely useful in doing so. It is my hope that this personal account may inspire more TCS researchers to tackle the many elegant and important theoretical questions that learning theory has to offer.

NeurIPS Conference 2024 Conference Paper

Optimal Parallelization of Boosting

  • Arthur da Cunha
  • Mikael Møller Høgsgaard
  • Kasper Green Larsen

Recent works on the parallel complexity of Boosting have established strong lower bounds on the tradeoff between the number of training rounds $p$ and the total parallel work per round $t$. These works have also presented highly non-trivial parallel algorithms that shed light on different regions of this tradeoff. Despite these advancements, a significant gap persists between the theoretical lower bounds and the performance of these algorithms across much of the tradeoff space. In this work, we essentially close this gap by providing both improved lower bounds on the parallel complexity of weak-to-strong learners, and a parallel Boosting algorithm whose performance matches these bounds across the entire $p$ vs. $t$ compromise spectrum, up to logarithmic factors. Ultimately, this work settles the parallel complexity of Boosting algorithms that are nearly sample-optimal.

ICML Conference 2024 Conference Paper

Replicable Learning of Large-Margin Halfspaces

  • Alkis Kalavasis
  • Amin Karbasi
  • Kasper Green Larsen
  • Grigoris Velegkas
  • Felix Zhou 0002

We provide an efficient replicable algorithm for the problem of learning large-margin halfspaces. Our results improve upon the algorithms provided by Impagliazzo, Lei, Pitassi, and Sorrell (STOC, 2022). We design the first dimension-independent replicable algorithm for this task which runs in polynomial time, is proper, and has strictly improved sample complexity compared to the one achieved by Impagliazzo et al. (STOC, 2022) with respect to all the relevant parameters. Moreover, our algorithm has sample complexity that is optimal with respect to the accuracy parameter $\epsilon$. Departing from the requirement of polynomial time algorithms, using the DP-to-Replicability reduction of Bun et al. (STOC 2023), we show how to obtain a replicable algorithm for large-margin halfspaces with improved sample complexity with respect to the margin parameter $\tau$, but running time doubly exponential in $1/\tau^2$ and worse sample complexity dependence on $\epsilon$ than our previous algorithm. We then design an improved algorithm with better sample complexity than both of our previous algorithms and running time exponential in $1/\tau^{2}$.

FOCS Conference 2024 Conference Paper

Revisiting Agnostic PAC Learning

  • Steve Hanneke
  • Kasper Green Larsen
  • Nikita Zhivotovskiy

PAC learning, dating back to Valiant'84 and Vapnik and Chervonenkis'64, '74, is a classic model for studying supervised learning. In the agnostic setting, we have access to a hypothesis set $\mathbf{H}$ and a training set of labeled samples drawn i.i.d. from an unknown data distribution D. The goal is to produce a classifier that is competitive with the hypothesis in $\mathbf{H}$ having the least probability of mispredicting the label of a new sample from D. Empirical Risk Minimization (ERM) is a natural learning algorithm, where one simply outputs the hypothesis from $\mathbf{H}$ making the fewest mistakes on the training data. This simple algorithm is known to have an optimal error in terms of the VC-dimension of $\mathbf{H}$ and the number of samples. In this work, we revisit agnostic PAC learning and first show that ERM, and in fact any proper learning algorithm, is sub-optimal if we treat the performance of the best hypothesis in $\mathbf{H}$ as a parameter. We then complement this lower bound with the first learning algorithm achieving an optimal error. Our algorithm introduces several new ideas that we hope may find further applications in learning theory.
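
For reference, the ERM baseline discussed above is easy to state in code. The following is a tiny illustrative sketch of empirical risk minimization over a finite class of one-dimensional threshold classifiers; the hypothesis class, grid and noise level are assumptions chosen for the example, not the paper's construction.

```python
import numpy as np

# Hypothesis class H: threshold functions h_t(x) = +1 if x >= t else -1,
# with thresholds restricted to a small finite grid.
thresholds = np.linspace(0.0, 1.0, 21)

def erm(X, y):
    """Empirical Risk Minimization: return the hypothesis in H with the fewest training mistakes."""
    errors = [np.mean(np.where(X >= t, 1, -1) != y) for t in thresholds]
    return thresholds[int(np.argmin(errors))]

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=200)
y = np.where(X >= 0.6, 1, -1)
y[rng.random(200) < 0.1] *= -1             # agnostic setting: 10% label noise
t_hat = erm(X, y)
print("ERM threshold:", t_hat)             # expected to land near 0.6
```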

ICML Conference 2024 Conference Paper

Sparse Dimensionality Reduction Revisited

  • Mikael Møller Høgsgaard
  • Lior Kamma
  • Kasper Green Larsen
  • Jelani Nelson
  • Chris Schwiegelshohn

The sparse Johnson-Lindenstrauss transform is one of the central techniques in dimensionality reduction. It supports embedding a set of $n$ points in $\mathbb{R}^d$ into $m=O(\varepsilon^{-2} \ln n)$ dimensions while preserving all pairwise distances to within $1 \pm \varepsilon$. Each input point $x$ is embedded to $Ax$, where $A$ is an $m \times d$ matrix having $s$ non-zeros per column, allowing for an embedding time of $O(s \|x\|_0)$. Since the sparsity of $A$ governs the embedding time, much work has gone into improving the sparsity $s$. The current state-of-the-art by Kane and Nelson (2014) shows that $s = O(\varepsilon^{-1} \ln n)$ suffices. This is almost matched by a lower bound of $s = \Omega(\varepsilon^{-1} \ln n/\ln(1/\varepsilon))$ by Nelson and Nguyen (2013) for $d=\Omega(n)$. Previous work thus suggests that we have near-optimal embeddings. In this work, we revisit sparse embeddings and present a sparser embedding for instances in which $d = n^{o(1)}$, which in many applications is realistic. Formally, our embedding achieves $s = O(\varepsilon^{-1}(\ln n/\ln(1/\varepsilon)+\ln^{2/3}n \ln^{1/3} d))$. We also complement our analysis by strengthening the lower bound of Nelson and Nguyen to hold also when $d \ll n$, thereby matching the first term in our new sparsity upper bound. Finally, we also improve the sparsity of the best oblivious subspace embeddings for optimal embedding dimensionality.
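
A minimal sketch of the generic sparse Johnson-Lindenstrauss construction described above, with $s$ random signed nonzeros per column; the parameter choices follow the classical $s = O(\varepsilon^{-1}\ln n)$ regime and are purely illustrative, not the paper's new, sparser construction.

```python
import numpy as np

def sparse_jl_matrix(d, m, s, rng):
    """Random m x d matrix with exactly s nonzero entries (random signs, scaled by 1/sqrt(s)) per column."""
    A = np.zeros((m, d))
    for j in range(d):
        rows = rng.choice(m, size=s, replace=False)      # positions of the s nonzeros in column j
        A[rows, j] = rng.choice([-1.0, 1.0], size=s) / np.sqrt(s)
    return A

rng = np.random.default_rng(0)
n, d, eps = 100, 5000, 0.25
m = int(np.ceil(eps**-2 * np.log(n)))                    # target dimension O(eps^-2 ln n)
s = max(1, int(np.ceil(eps**-1 * np.log(n))))            # sparsity O(eps^-1 ln n), Kane-Nelson style
X = rng.normal(size=(n, d))
A = sparse_jl_matrix(d, m, s, rng)
Y = X @ A.T                                              # embed every point: x -> Ax

# check the distortion of a few random pairs
for _ in range(5):
    i, j = rng.choice(n, size=2, replace=False)
    ratio = np.linalg.norm(Y[i] - Y[j]) / np.linalg.norm(X[i] - X[j])
    print(f"distance ratio: {ratio:.3f}")                # should be within roughly 1 +/- eps
```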

MFCS Conference 2024 Conference Paper

Sublinear Time Shortest Path in Expander Graphs

  • Noga Alon
  • Allan Grønlund Jørgensen
  • Søren Fuglede Jørgensen
  • Kasper Green Larsen

Computing a shortest path between two nodes in an undirected unweighted graph is among the most basic algorithmic tasks. Breadth first search solves this problem in linear time, which is clearly also a lower bound in the worst case. However, several works have shown how to solve this problem in sublinear time in expectation when the input graph is drawn from one of several classes of random graphs. In this work, we extend these results by giving sublinear time shortest path (and short path) algorithms for expander graphs. We thus identify a natural deterministic property of a graph (that is satisfied by typical random regular graphs) which suffices for sublinear time shortest paths. The algorithms are very simple, involving only bidirectional breadth first search and short random walks. We also complement our new algorithms by near-matching lower bounds.
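
The bidirectional breadth first search at the core of such algorithms is simple to sketch. The version below implements only the BFS component (not the short random walks) and uses a random regular graph from networkx as a stand-in for an expander; the graph size and degree are illustrative assumptions.

```python
import networkx as nx

def bidirectional_bfs(G, s, t):
    """Grow BFS balls from both endpoints, layer by layer on the smaller side,
    and stop once the balls intersect. Returns the shortest s-t distance (None if disconnected)."""
    if s == t:
        return 0
    dist = {'s': {s: 0}, 't': {t: 0}}
    frontier = {'s': [s], 't': [t]}
    while frontier['s'] and frontier['t']:
        side = 's' if len(frontier['s']) <= len(frontier['t']) else 't'
        other = 't' if side == 's' else 's'
        nxt = []
        for u in frontier[side]:                         # expand one full BFS layer
            for v in G.neighbors(u):
                if v not in dist[side]:
                    dist[side][v] = dist[side][u] + 1
                    nxt.append(v)
        frontier[side] = nxt
        meet = [v for v in nxt if v in dist[other]]      # the two balls intersect
        if meet:
            return min(dist['s'][v] + dist['t'][v] for v in meet)
    return None

G = nx.random_regular_graph(d=5, n=10000, seed=0)        # typical random regular graphs are expanders
print(bidirectional_bfs(G, 0, 1234))
```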

NeurIPS Conference 2024 Conference Paper

The Many Faces of Optimal Weak-to-Strong Learning

  • Mikael Møller Høgsgaard
  • Kasper Green Larsen
  • Markus Engelund Mathiasen

Boosting is an extremely successful idea, allowing one to combine multiple low accuracy classifiers into a much more accurate voting classifier. In this work, we present a new and surprisingly simple Boosting algorithm that obtains a provably optimal sample complexity. Sample optimal Boosting algorithms have only recently been developed, and our new algorithm has the fastest runtime among all such algorithms and is the simplest to describe: Partition your training data into 5 disjoint pieces of equal size, run AdaBoost on each, and combine the resulting classifiers via a majority vote. In addition to this theoretical contribution, we also perform the first empirical comparison of the proposed sample optimal Boosting algorithms. Our pilot empirical study suggests that our new algorithm might outperform previous algorithms on large data sets.
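
The algorithm described above is simple enough to sketch directly. The following uses scikit-learn's AdaBoostClassifier as the inner booster on synthetic data; the data set, number of estimators and train/test split are illustrative assumptions, not the paper's empirical study.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# synthetic binary classification data (illustrative only)
X, y = make_classification(n_samples=5000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

# partition the training data into 5 disjoint pieces of (roughly) equal size
pieces = np.array_split(np.random.default_rng(0).permutation(len(y_train)), 5)

# run AdaBoost on each piece
boosters = []
for idx in pieces:
    clf = AdaBoostClassifier(n_estimators=100, random_state=0)
    boosters.append(clf.fit(X_train[idx], y_train[idx]))

# combine the five resulting classifiers via a majority vote
votes = np.stack([b.predict(X_test) for b in boosters])      # shape (5, n_test), labels in {0, 1}
majority = (votes.sum(axis=0) >= 3).astype(int)

print("majority-vote test accuracy:", np.mean(majority == y_test))
```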

ICML Conference 2023 Conference Paper

AdaBoost is not an Optimal Weak to Strong Learner

  • Mikael Møller Høgsgaard
  • Kasper Green Larsen
  • Martin Ritzert

AdaBoost is a classic boosting algorithm for combining multiple inaccurate classifiers produced by a weak learner, to produce a strong learner with arbitrarily high accuracy when given enough training data. Determining the optimal number of samples necessary to obtain a given accuracy of the strong learner is a basic learning theoretic question. Larsen and Ritzert (NeurIPS’22) recently presented the first provably optimal weak-to-strong learner. However, their algorithm is somewhat complicated and it remains an intriguing question whether the prototypical boosting algorithm AdaBoost also makes optimal use of training samples. In this work, we answer this question in the negative. Concretely, we show that the sample complexity of AdaBoost, and of other classic variations thereof, is sub-optimal by at least one logarithmic factor in the desired accuracy of the strong learner.
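
For readers who want the weight-update rule that AdaBoost is built on, here is a compact from-scratch sketch with decision stumps; the stump learner, toy data and number of rounds are illustrative assumptions and the code is not tied to the paper's lower bound construction.

```python
import numpy as np

def fit_stump(X, y, w):
    """Weighted decision stump: pick the (feature, threshold, sign) minimizing weighted error."""
    best, best_err = None, np.inf
    for j in range(X.shape[1]):
        for thr in np.unique(X[:, j]):
            for sign in (1, -1):
                pred = sign * np.where(X[:, j] <= thr, 1, -1)
                err = np.sum(w * (pred != y))
                if err < best_err:
                    best, best_err = (j, thr, sign), err
    return best, best_err

def adaboost(X, y, rounds=50):
    n = len(y)
    w = np.full(n, 1.0 / n)                        # uniform initial weights
    ensemble = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)      # classic AdaBoost coefficient
        j, thr, sign = stump
        pred = sign * np.where(X[:, j] <= thr, 1, -1)
        w *= np.exp(-alpha * y * pred)             # up-weight misclassified points
        w /= w.sum()
        ensemble.append((alpha, stump))
    def predict(Xq):
        score = sum(a * s * np.where(Xq[:, j] <= t, 1, -1) for a, (j, t, s) in ensemble)
        return np.where(score >= 0, 1, -1)
    return predict

# toy usage with labels in {-1, +1}
rng = np.random.default_rng(0)
X = rng.uniform(size=(200, 2))
y = np.where(X[:, 0] + X[:, 1] > 1, 1, -1)
predict = adaboost(X, y)
print("training accuracy:", np.mean(predict(X) == y))
```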

FOCS Conference 2023 Conference Paper

Super-Logarithmic Lower Bounds for Dynamic Graph Problems

  • Kasper Green Larsen
  • Huacheng Yu

In this work, we prove a $\tilde{\Omega}(\lg^{3/2} n)$ unconditional lower bound on the maximum of the query time and update time for dynamic data structures supporting reachability queries in n-node directed acyclic graphs under edge insertions. This is the first super-logarithmic lower bound for any natural graph problem. In proving the lower bound, we also make novel contributions to the state-of-the-art data structure lower bound techniques that we hope may lead to further progress in proving lower bounds.

ICML Conference 2023 Conference Paper

The Fast Johnson-Lindenstrauss Transform Is Even Faster

  • Nova Fandina
  • Mikael Møller Høgsgaard
  • Kasper Green Larsen

The Johnson-Lindenstrauss lemma (Johnson & Lindenstrauss, 1984) is a cornerstone result in dimensionality reduction, stating that it is possible to embed a set of $n$ points in $d$-dimensional Euclidean space into optimal $k=O(\varepsilon^{-2} \ln n)$ dimensions, while preserving all pairwise distances to within a factor $(1 \pm \varepsilon)$. The seminal Fast Johnson-Lindenstrauss (Fast JL) transform by Ailon and Chazelle (SICOMP’09) supports computing the embedding of a data point in $O(d \ln d +k \ln^2 n)$ time, where the $d \ln d$ term comes from multiplication with a $d \times d$ Hadamard matrix and the $k \ln^2 n$ term comes from multiplication with a sparse $k \times d$ matrix. Despite the Fast JL transform being more than a decade old, it is one of the fastest dimensionality reduction techniques for many tradeoffs between $\varepsilon, d$ and $n$. In this work, we give a surprising new analysis of the Fast JL transform, showing that the $k \ln^2 n$ term in the embedding time can be improved to $(k \ln^2 n)/\alpha$ for an $\alpha = \Omega(\min\{\varepsilon^{-1}\ln(1/\varepsilon), \ln n\})$. The improvement follows by using an even sparser matrix. We complement our improved analysis with a lower bound showing that our new analysis is in fact tight.
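
A minimal sketch of the Hadamard-based structure underlying the Fast JL transform: random signs, a fast Walsh-Hadamard transform, then a subsampling step standing in for the sparse projection. This is the textbook subsampled randomized Hadamard variant for illustration only, not the paper's sparser matrix or refined analysis; all parameters are assumptions.

```python
import numpy as np

def fwht(a):
    """Fast Walsh-Hadamard transform (returns a new array); len(a) must be a power of two."""
    a = a.copy()
    h = 1
    while h < len(a):
        for i in range(0, len(a), 2 * h):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a

def fast_jl(x, k, rng):
    """Embed x in R^d (d a power of two) into R^k: random sign flip, Hadamard mix, subsample k coordinates."""
    d = len(x)
    signs = rng.choice([-1.0, 1.0], size=d)
    mixed = fwht(signs * x) / np.sqrt(d)             # orthonormal Hadamard spreads the mass of x
    coords = rng.choice(d, size=k, replace=False)    # crude stand-in for the sparse projection
    return mixed[coords] * np.sqrt(d / k)

rng = np.random.default_rng(0)
d, k = 1 << 12, 256
x = rng.normal(size=d)
y = fast_jl(x, k, rng)
print(np.linalg.norm(x), np.linalg.norm(y))          # norms should be close for typical x
```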

NeurIPS Conference 2022 Conference Paper

Improved Coresets for Euclidean $k$-Means

  • Vincent Cohen-Addad
  • Kasper Green Larsen
  • David Saulpic
  • Chris Schwiegelshohn
  • Omar Ali Sheikh-Omar

Given a set of $n$ points in $d$ dimensions, the Euclidean $k$-means problem (resp. Euclidean $k$-median) consists of finding $k$ centers such that the sum of squared distances (resp. sum of distances) from every point to its closest center is minimized. The arguably most popular way of dealing with this problem in the big data setting is to first compress the data by computing a weighted subset known as a coreset and then run any algorithm on this subset. The guarantee of the coreset is that for any candidate solution, the ratio between coreset cost and the cost of the original instance is less than a $(1\pm \varepsilon)$ factor. The current state of the art coreset size is $\tilde O(\min(k^{2} \cdot \varepsilon^{-2}, k\cdot \varepsilon^{-4}))$ for Euclidean $k$-means and $\tilde O(\min(k^{2} \cdot \varepsilon^{-2}, k\cdot \varepsilon^{-3}))$ for Euclidean $k$-median. The best known lower bound for both problems is $\Omega(k\varepsilon^{-2})$. In this paper, we improve these bounds to $\tilde O(\min(k^{3/2} \cdot \varepsilon^{-2}, k\cdot \varepsilon^{-4}))$ for Euclidean $k$-means and $\tilde O(\min(k^{4/3} \cdot \varepsilon^{-2}, k\cdot \varepsilon^{-3}))$ for Euclidean $k$-median. In particular, ours is the first provable bound that breaks through the $k^2$ barrier while retaining an optimal dependency on $\varepsilon$.

NeurIPS Conference 2022 Conference Paper

Optimal Weak to Strong Learning

  • Kasper Green Larsen
  • Martin Ritzert

The classic algorithm AdaBoost allows one to convert a weak learner, that is, an algorithm producing a hypothesis slightly better than chance, into a strong learner achieving arbitrarily high accuracy when given enough training data. We present a new algorithm that constructs a strong learner from a weak learner, but uses less training data than AdaBoost and all other weak to strong learners to achieve the same generalization bounds. A sample complexity lower bound shows that our new algorithm uses the minimum possible amount of training data and is thus optimal. Hence, this work settles the sample complexity of the classic problem of constructing a strong learner from a weak learner.

ICML Conference 2021 Conference Paper

CountSketches, Feature Hashing and the Median of Three

  • Kasper Green Larsen
  • Rasmus Pagh
  • Jakub Tetek

In this paper, we revisit the classic CountSketch method, which is a sparse, random projection that transforms a (high-dimensional) Euclidean vector $v$ to a vector of dimension $(2t-1) s$, where $t, s > 0$ are integer parameters. It is known that a CountSketch allows estimating coordinates of $v$ with variance bounded by $\|v\|_2^2/s$. For $t > 1$, the estimator takes the median of $2t-1$ independent estimates, and the probability that the estimate is off by more than $2 \|v\|_2/\sqrt{s}$ is exponentially small in $t$. This suggests choosing $t$ to be logarithmic in a desired inverse failure probability. However, implementations of CountSketch often use a small, constant $t$. Previous work only predicts a constant factor improvement in this setting. Our main contribution is a new analysis of CountSketch, showing an improvement in variance to $O(\min\{\|v\|_1^2/s^2, \|v\|_2^2/s\})$ when $t > 1$. That is, the variance decreases proportionally to $s^{-2}$, asymptotically for large enough $s$.
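
A small sketch of the CountSketch estimator with the median of $2t-1$ repetitions described above; the hash functions are simulated with explicit random tables and the parameters (and the planted heavy coordinate) are illustrative assumptions.

```python
import numpy as np

class CountSketch:
    """2t-1 independent rows of s buckets each; estimate v[i] as the median of the signed bucket values."""
    def __init__(self, dim, s, t, rng):
        self.reps = 2 * t - 1
        self.bucket = rng.integers(0, s, size=(self.reps, dim))   # bucket hash h_r(i)
        self.sign = rng.choice([-1, 1], size=(self.reps, dim))    # sign hash g_r(i)
        self.table = np.zeros((self.reps, s))

    def update(self, i, delta):
        for r in range(self.reps):
            self.table[r, self.bucket[r, i]] += self.sign[r, i] * delta

    def estimate(self, i):
        vals = [self.sign[r, i] * self.table[r, self.bucket[r, i]] for r in range(self.reps)]
        return float(np.median(vals))

rng = np.random.default_rng(0)
dim, s, t = 10_000, 64, 3
v = np.zeros(dim)
cs = CountSketch(dim, s, t, rng)
for _ in range(2_000):                        # random stream of coordinate updates
    i, delta = int(rng.integers(dim)), float(rng.normal())
    v[i] += delta
    cs.update(i, delta)
v[7] += 100.0                                 # plant one heavy coordinate
cs.update(7, 100.0)

print(v[7], cs.estimate(7))                   # recovered up to roughly ||v||_2 / sqrt(s)
```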

SODA Conference 2021 Conference Paper

Optimal Oblivious Priority Queues

  • Zahra Jafargholi
  • Kasper Green Larsen
  • Mark Simkin 0001

In this work, we present the first asymptotically optimal oblivious priority queue, which matches the lower bound of Jacob, Larsen, and Nielsen (SODA'19). Our construction is conceptually simple and statistically secure. We illustrate the power of our optimal oblivious priority queue by presenting a conceptually equally simple construction of statistically secure offline ORAMs with O(log n) bandwidth overhead.

NeurIPS Conference 2020 Conference Paper

Margins are Insufficient for Explaining Gradient Boosting

  • Allan Grønlund
  • Lior Kamma
  • Kasper Green Larsen

Boosting is one of the most successful ideas in machine learning, achieving great practical performance with little fine-tuning. The success of boosted classifiers is most often attributed to improvements in margins. The focus on margin explanations was pioneered in the seminal work by Schapire et al. (1998) and has culminated in the $k$'th margin generalization bound by Gao and Zhou (2013), which was recently proved to be near-tight for some data distributions (Grønlund et al. 2019). In this work, we first demonstrate that the $k$'th margin bound is inadequate in explaining the performance of state-of-the-art gradient boosters. We then explain the shortcomings of the $k$'th margin bound and prove a stronger and more refined margin-based generalization bound that indeed succeeds in explaining the performance of modern gradient boosters. Finally, we improve upon the recent generalization lower bound by Grønlund et al. (2019).

ICML Conference 2020 Conference Paper

Near-Tight Margin-Based Generalization Bounds for Support Vector Machines

  • Allan Grønlund Jørgensen
  • Lior Kamma
  • Kasper Green Larsen

Support Vector Machines (SVMs) are among the most fundamental tools for binary classification. In its simplest formulation, an SVM produces a hyperplane separating two classes of data using the largest possible margin to the data. The focus on maximizing the margin has been well motivated through numerous generalization bounds. In this paper, we revisit and improve the classic generalization bounds in terms of margins. Furthermore, we complement our new generalization bound by a nearly matching lower bound, thus almost settling the generalization performance of SVMs in terms of margins.

SODA Conference 2019 Conference Paper

Lower Bounds for Oblivious Data Structures

  • Riko Jacob
  • Kasper Green Larsen
  • Jesper Buus Nielsen

An oblivious data structure is a data structure where the memory access pattern reveals no information about the operations performed on it. Such data structures were introduced by Wang et al. [ACM SIGSAC’14] and are intended for situations where one wishes to store the data structure at an untrusted server. One way to obtain an oblivious data structure is simply to run a classic data structure on an oblivious RAM (ORAM). Until very recently, this resulted in an overhead of ω(lg n) for the most natural setting of parameters. Moreover, a recent lower bound for ORAMs by Larsen and Nielsen [CRYPTO’18] shows that they always incur an overhead of at least Ω(lg n) if used in a black box manner. To circumvent the ω(lg n) overhead, researchers have instead studied classic data structure problems more directly and have obtained efficient solutions for many such problems such as stacks, queues, deques, priority queues and search trees. However, none of these data structures process operations faster than Θ(lg n), leaving open the question of whether even faster solutions exist. In this paper, we rule out this possibility by proving Ω(lg n) lower bounds for oblivious stacks, queues, deques, priority queues and search trees.

NeurIPS Conference 2019 Conference Paper

Margin-Based Generalization Lower Bounds for Boosted Classifiers

  • Allan Grønlund
  • Lior Kamma
  • Kasper Green Larsen
  • Alexander Mathiasen
  • Jelani Nelson

Boosting is one of the most successful ideas in machine learning. The most well-accepted explanations for the low generalization error of boosting algorithms such as AdaBoost stem from margin theory. The study of margins in the context of boosting algorithms was initiated by Schapire, Freund, Bartlett and Lee (1998), and has inspired numerous boosting algorithms and generalization bounds. To date, the strongest known generalization (upper bound) is the $k$th margin bound of Gao and Zhou (2013). Despite the numerous generalization upper bounds that have been proved over the last two decades, nothing is known about the tightness of these bounds. In this paper, we give the first margin-based lower bounds on the generalization error of boosted classifiers. Our lower bounds nearly match the $k$th margin bound and thus almost settle the generalization performance of boosted classifiers in terms of margins.

ICML Conference 2019 Conference Paper

Optimal Minimal Margin Maximization with Boosting

  • Alexander Mathiasen
  • Kasper Green Larsen
  • Allan Grønlund Jørgensen

Boosting algorithms iteratively produce linear combinations of more and more base hypotheses, and it has been observed experimentally that the generalization error keeps improving even after achieving zero training error. One popular explanation attributes this to improvements in margins. A common goal in a long line of research is to obtain large margins using as few base hypotheses as possible, culminating with the AdaBoostV algorithm by Rätsch and Warmuth [JMLR’05]. The AdaBoostV algorithm was later conjectured to yield an optimal trade-off between the number of hypotheses trained and the minimal margin over all training points (Nie, Warmuth, Vishwanathan and Zhang [JMLR’13]). Our main contribution is a new algorithm refuting this conjecture. Furthermore, we prove a lower bound which implies that our new algorithm is optimal.

NeurIPS Conference 2018 Conference Paper

Fully Understanding The Hashing Trick

  • Casper Freksen
  • Lior Kamma
  • Kasper Green Larsen

Feature hashing, also known as the hashing trick, introduced by Weinberger et al. (2009), is one of the key techniques used in scaling-up machine learning algorithms. Loosely speaking, feature hashing uses a random sparse projection matrix $A: \mathbb{R}^n \to \mathbb{R}^m$ (where $m \ll n$) in order to reduce the dimension of the data from $n$ to $m$ while approximately preserving the Euclidean norm. Every column of $A$ contains exactly one non-zero entry, equal to either $-1$ or $1$. Weinberger et al. showed tail bounds on $\|Ax\|_2^2$. Specifically they showed that for every $\varepsilon, \delta$, if $\|x\|_{\infty} / \|x\|_2$ is sufficiently small, and $m$ is sufficiently large, then \begin{equation*}\Pr[ \; | \; \|Ax\|_2^2 - \|x\|_2^2\; | < \varepsilon \|x\|_2^2 \; ] \ge 1 - \delta \; .\end{equation*} These bounds were later extended by Dasgupta et al. (2010) and most recently refined by Dahlgaard et al. (2017); however, the true nature of the performance of this key technique, and specifically the correct tradeoff between the pivotal parameters $\|x\|_{\infty} / \|x\|_2, m, \varepsilon, \delta$, remained an open question. We settle this question by giving tight asymptotic bounds on the exact tradeoff between the central parameters, thus providing a complete understanding of the performance of feature hashing. We complement the asymptotic bound with empirical data, which shows that the constants "hiding" in the asymptotic notation are, in fact, very close to $1$, thus further illustrating the tightness of the presented bounds in practice.
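
A minimal sketch of the feature-hashing map itself, with one random $\pm 1$ entry per column as described above; the hash functions are simulated with random index/sign tables and the dimensions are illustrative assumptions.

```python
import numpy as np

def feature_hashing(x, m, rng):
    """Hash an n-dimensional vector down to m dimensions: coordinate i is added to bucket h(i) with sign g(i)."""
    n = len(x)
    h = rng.integers(0, m, size=n)           # bucket hash: the single nonzero row in column i of A
    g = rng.choice([-1.0, 1.0], size=n)      # sign hash: the value of that nonzero entry
    y = np.zeros(m)
    np.add.at(y, h, g * x)                   # y = Ax for the implicit sparse matrix A
    return y

rng = np.random.default_rng(0)
n, m = 100_000, 1_000
x = rng.normal(size=n)                       # a "flat" vector: small ||x||_inf / ||x||_2, the good case
y = feature_hashing(x, m, rng)
print(np.linalg.norm(x)**2, np.linalg.norm(y)**2)   # squared norms should agree up to a small relative error
```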

FOCS Conference 2017 Conference Paper

A Dichotomy for Regular Expression Membership Testing

  • Karl Bringmann
  • Allan Grønlund Jørgensen
  • Kasper Green Larsen

We study regular expression membership testing: Given a regular expression of size m and a string of size n, decide whether the string is in the language described by the regular expression. Its classic O(nm) algorithm is one of the big success stories of the 70s, which allowed pattern matching to develop into the standard tool that it is today. Many special cases of pattern matching have been studied that can be solved faster than in quadratic time. However, a systematic study of tractable cases was made possible only recently, with the first conditional lower bounds reported by Backurs and Indyk [FOCS'16]. Restricted to any “type” of homogeneous regular expressions of depth 2 or 3, they either presented a near-linear time algorithm or a quadratic conditional lower bound, with one exception known as the Word Break problem. In this paper we complete their work as follows: (1) We present two almost-linear time algorithms that generalize all known almost-linear time algorithms for special cases of regular expression membership testing. (2) We classify all types, except for the Word Break problem, into almost-linear time or quadratic time assuming the Strong Exponential Time Hypothesis. This extends the classification from depth 2 and 3 to any constant depth. (3) For the Word Break problem we give an improved Õ(nm^{1/3} + m) algorithm. Surprisingly, we also prove a matching conditional lower bound for combinatorial algorithms. This establishes Word Break as the only intermediate problem. In total, we prove matching upper and lower bounds for any type of bounded-depth homogeneous regular expressions, which yields a full dichotomy for regular expression membership testing.
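
As a concrete baseline for the Word Break problem singled out above, here is the classic dynamic-programming solution, which runs in roughly quadratic time in the worst case; it is not the paper's improved Õ(nm^{1/3} + m) algorithm, and the dictionary and input string are illustrative.

```python
def word_break(s, dictionary):
    """Classic DP: ok[i] is True iff the prefix s[:i] can be segmented into dictionary words."""
    words = set(dictionary)
    max_len = max((len(w) for w in words), default=0)
    ok = [False] * (len(s) + 1)
    ok[0] = True
    for i in range(1, len(s) + 1):
        for j in range(max(0, i - max_len), i):          # try every dictionary word ending at position i
            if ok[j] and s[j:i] in words:
                ok[i] = True
                break
    return ok[len(s)]

print(word_break("theoryofcomputing", ["theory", "of", "computing"]))   # True
print(word_break("theoryofcomputing", ["theory", "computing"]))         # False
```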

STOC Conference 2017 Conference Paper

DecreaseKeys are expensive for external memory priority queues

  • Kasper Eenberg
  • Kasper Green Larsen
  • Huacheng Yu

One of the biggest open problems in external memory data structures is the priority queue problem with DecreaseKey operations. If only Insert and ExtractMin operations need to be supported, one can design a comparison-based priority queue performing $O((N/B)\lg_{M/B} N)$ I/Os over a sequence of $N$ operations, where $B$ is the disk block size in number of words and $M$ is the main memory size in number of words. This matches the lower bound for comparison-based sorting and is hence optimal for comparison-based priority queues. However, if we also need to support DecreaseKeys, the performance of the best known priority queue is only $O((N/B)\lg_2 N)$ I/Os. The big open question is whether a degradation in performance really is necessary. We answer this question affirmatively by proving a lower bound of $\Omega((N/B)\lg\lg_N B)$ I/Os for processing a sequence of $N$ intermixed Insert, ExtractMin and DecreaseKey operations. Our lower bound is proved in the cell probe model and thus holds also for non-comparison-based priority queues.

SODA Conference 2017 Conference Paper

Faster Online Matrix-Vector Multiplication

  • Kasper Green Larsen
  • R. Ryan Williams

We consider the Online Boolean Matrix-Vector Multiplication (OMV) problem studied by Henzinger et al. [STOC'15]: given an n × n Boolean matrix M, we receive n Boolean vectors v_1, …, v_n one at a time, and are required to output Mv_i (over the Boolean semiring) before seeing the vector v_{i+1}, for all i. Previously known algorithms for this problem are combinatorial, running in O(n^3/log^2 n) time. Henzinger et al. conjecture there is no O(n^{3-ε}) time algorithm for OMV, for all ε > 0; their OMV conjecture is shown to imply strong hardness results for many basic dynamic problems. We give a substantially faster randomized method for computing OMV. In fact, after seeing enough vectors, we already achieve improved amortized time per matrix-vector multiplication. Our approach gives a way to reduce matrix-vector multiplication to solving a version of the Orthogonal Vectors problem, which in turn reduces to “small” algebraic matrix-matrix multiplication. Applications include faster independent set detection, partial match retrieval, and 2-CNF evaluation. We also show how a modification of our method gives a cell probe data structure for OMV with improved worst case time per query vector, where w is the word size. This result rules out an unconditional proof of the OMV conjecture using purely information-theoretic arguments.
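
To fix notation, the naive online baseline that the OMV conjecture concerns answers each vector with one Boolean matrix-vector product, costing O(n^2) per query and O(n^3) over the whole sequence. The sketch below shows only this baseline with illustrative sizes, not the paper's faster method.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 512
M = rng.integers(0, 2, size=(n, n), dtype=bool)          # fixed Boolean matrix, known in advance

def answer_query(v):
    """Boolean matrix-vector product over the Boolean semiring (OR of ANDs)."""
    return (M & v).any(axis=1)                           # row i is 1 iff some j has M[i, j] = v[j] = 1

# vectors arrive one at a time; each must be answered before the next is revealed
for _ in range(n):
    v = rng.integers(0, 2, size=n, dtype=bool)
    Mv = answer_query(v)                                 # naive cost: O(n^2) per vector, O(n^3) total
```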

FOCS Conference 2017 Conference Paper

Optimality of the Johnson-Lindenstrauss Lemma

  • Kasper Green Larsen
  • Jelani Nelson

For any $d, n \geq 2$ and $1/(\min\{n, d\})^{0.4999} < \varepsilon < 1$, we show the existence of a set of $n$ points $X \subset \mathbb{R}^d$ such that any embedding $f: X \to \mathbb{R}^m$ satisfying $\forall x, y \in X,\ (1-\varepsilon)\|x-y\|_2^2 \le \|f(x)-f(y)\|_2^2 \le (1+\varepsilon)\|x-y\|_2^2$ must have $m = \Omega(\varepsilon^{-2} \lg n)$. This lower bound matches the upper bound given by the Johnson-Lindenstrauss lemma [JL84]. Furthermore, our lower bound holds for nearly the full range of $\varepsilon$ of interest, since there is always an isometric embedding into dimension $\min\{d, n\}$ (either the identity map, or projection onto $\mathrm{span}(X)$). Previously such a lower bound was only known to hold against linear maps $f$, and not for such a wide range of parameters $\varepsilon, n, d$ [LN16]. The best previously known lower bound for general $f$ was $m = \Omega(\varepsilon^{-2} \lg n / \lg(1/\varepsilon))$ [Wel74], [Alo03], which is suboptimal for any $\varepsilon = o(1)$.

FOCS Conference 2016 Conference Paper

Heavy Hitters via Cluster-Preserving Clustering

  • Kasper Green Larsen
  • Jelani Nelson
  • Huy L. Nguyen 0001
  • Mikkel Thorup

In the turnstile ℓ_p heavy hitters problem with parameter ε, one must maintain a high-dimensional vector x ∈ ℝ^n subject to updates of the form update(i, Δ) causing the change x_i ← x_i + Δ, where i ∈ [n], Δ ∈ ℝ. Upon receiving a query, the goal is to report every "heavy hitter" i ∈ [n] with |x_i| ≥ ε∥x∥_p as part of a list L ⊆ [n] of size O(1/ε^p), i.e. proportional to the maximum possible number of heavy hitters. For any p ∈ (0, 2] the COUNTSKETCH of [CCFC04] solves ℓ_p heavy hitters using O(ε^{-p} lg n) words of space with O(lg n) update time, O(n lg n) query time to output L, and whose output after any query is correct with high probability (whp) 1 - 1/poly(n) [JST11, Section 4.4]. This space bound is optimal even in the strict turnstile model [JST11] in which it is promised that x_i ≥ 0 for all i ∈ [n] at all points in the stream, but unfortunately the query time is very slow. To remedy this, the work [CM05] proposed the "dyadic trick" for the COUNTMIN sketch for p = 1 in the strict turnstile model, which to maintain whp correctness achieves suboptimal space O(ε^{-1} lg^2 n), worse update time O(lg^2 n), but much better query time O(ε^{-1} poly(lg n)). An extension to all p ∈ (0, 2] appears in [KNPW11, Theorem 1], and can be obtained from [Pag13]. We show that this tradeoff between space and update time versus query time is unnecessary. We provide a new algorithm, EXPANDERSKETCH, which in the most general turnstile model achieves optimal O(ε^{-p} log n) space, O(log n) update time, and fast O(ε^{-p} poly(log n)) query time, providing correctness whp. In fact, a simpler version of our algorithm for p = 1 in the strict turnstile model answers queries even faster than the "dyadic trick" by roughly a log n factor, dominating it in all regards. Our main innovation is an efficient reduction from the heavy hitters to a clustering problem in which each heavy hitter is encoded as some form of noisy spectral cluster in a much bigger graph, and the goal is to identify every cluster. Since every heavy hitter must be found, correctness requires that every cluster be found. We thus need a "cluster-preserving clustering" algorithm, that partitions the graph into clusters with the promise of not destroying any original cluster. To do this we first apply standard spectral graph partitioning, and then we use some novel combinatorial techniques to modify the cuts obtained so as to make sure that the original clusters are sufficiently preserved. Our cluster-preserving clustering may be of broader interest much beyond heavy hitters.

SODA Conference 2015 Conference Paper

Approximate Range Emptiness in Constant Time and Optimal Space

  • Mayank Goswami 0001
  • Allan Grønlund Jørgensen
  • Kasper Green Larsen
  • Rasmus Pagh

This paper studies the ε-approximate range emptiness problem, where the task is to represent a set S of n points from {0, …, U − 1} and answer emptiness queries of the form "[a; b] ∩ S ≠ ∅?" with a probability of false positives allowed. This generalizes the functionality of Bloom filters from single point queries to any interval length L. Setting the false positive rate to ε/L and performing L queries, Bloom filters yield a solution to this problem with space O(n lg(L/ε)) bits, false positive probability bounded by ε for intervals of length up to L, using query time O(L lg(L/ε)). Our first contribution is to show that the space/error trade-off cannot be improved asymptotically: Any data structure for answering approximate range emptiness queries on intervals of length up to L with false positive probability ε, must use space Ω(n lg(L/ε)) − O(n) bits. On the positive side we show that the query time can be improved greatly, to constant time, while matching our space lower bound up to a lower order additive term. This result is achieved through a succinct data structure for (non-approximate 1d) range emptiness/reporting queries, which may be of independent interest.
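
A small sketch of the Bloom-filter baseline described above: set the per-point false positive rate to ε/L and answer an emptiness query on an interval of length at most L by probing every point in it. The filter is a plain textbook Bloom filter with illustrative parameters, not the paper's constant-time, optimal-space structure.

```python
import math
import random

class BloomFilter:
    def __init__(self, n, fp_rate, seed=0):
        self.m = max(1, math.ceil(-n * math.log(fp_rate) / (math.log(2) ** 2)))   # bit-array size
        self.k = max(1, round((self.m / n) * math.log(2)))                         # number of hash functions
        self.bits = [False] * self.m
        self.seed = seed

    def _hashes(self, x):
        return [hash((self.seed, j, x)) % self.m for j in range(self.k)]

    def add(self, x):
        for h in self._hashes(x):
            self.bits[h] = True

    def might_contain(self, x):
        return all(self.bits[h] for h in self._hashes(x))

def build(points, L, eps):
    """Per-point false positive rate eps/L, so an interval of length <= L errs with probability <= eps."""
    bf = BloomFilter(len(points), eps / L)
    for p in points:
        bf.add(p)
    return bf

def range_nonempty(bf, a, b):
    return any(bf.might_contain(x) for x in range(a, b + 1))   # up to L probes per interval query

random.seed(0)
S = random.sample(range(1_000_000), 1_000)
bf = build(S, L=64, eps=0.01)
print(range_nonempty(bf, S[0], S[0] + 10))        # True: the interval contains a point of S
print(range_nonempty(bf, 2_000_000, 2_000_063))   # False with high probability: empty interval
```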

FOCS Conference 2015 Conference Paper

New Unconditional Hardness Results for Dynamic and Online Problems

  • Raphaël Clifford
  • Allan Grønlund Jørgensen
  • Kasper Green Larsen

There has been a resurgence of interest in lower bounds whose truth rests on the conjectured hardness of well known computational problems. These conditional lower bounds have become important and popular due to the painfully slow progress on proving strong unconditional lower bounds. Nevertheless, the long term goal is to replace these conditional bounds with unconditional ones. In this paper we make progress in this direction by studying the cell probe complexity of two problems of particular importance that are conjectured to be hard: matrix-vector multiplication and a version of dynamic set disjointness known as Patrascu's Multiphase Problem. We give improved unconditional lower bounds for these problems as well as introducing new proof techniques of independent interest. These include a technique capable of proving strong threshold lower bounds of the following form: If we insist on having a very fast query time, then the update time has to be slow enough to compute a lookup table with the answer to every possible query. This is the first time a lower bound of this type has been proven.

FOCS Conference 2012 Conference Paper

Higher Cell Probe Lower Bounds for Evaluating Polynomials

  • Kasper Green Larsen

In this paper, we study the cell probe complexity of evaluating an n-degree polynomial P over a finite field F of size at least n^{1+Ω(1)}. More specifically, we show that any static data structure for evaluating P(x), where x ∈ F, must use Ω(lg |F| / lg(Sw/n lg |F|)) cell probes to answer a query, where S denotes the space of the data structure in number of cells and w the cell size in bits. This bound holds in expectation for randomized data structures with any constant error probability δ < 1/2. We also prove a lower bound for dynamic data structures for polynomial evaluation over a finite field F of size Ω(n^2), expressed in terms of t_q, the expected query time, and t_u, the worst case update time. This lower bound holds for randomized data structures with any constant error probability δ < 1/2. Previously, no lower bound beyond max{t_u, t_q} = Ω(max{lg n, lg d / lg lg d}) had been achieved for dynamic data structures, where d denotes the number of different queries and updates to the problem. Furthermore, it is the first such lower bound that holds for randomized data structures with a constant probability of error.

SODA Conference 2012 Conference Paper

I/O-efficient data structures for colored range and prefix reporting

  • Kasper Green Larsen
  • Rasmus Pagh

Motivated by information retrieval applications, we consider the one-dimensional colored range reporting problem in rank space. The goal is to build a static data structure for sets C_1, …, C_m ⊆ {1, …, σ} that supports queries of the kind: Given indices a, b, report the set ∪_{a ≤ i ≤ b} C_i. We study the problem in the I/O model, and show that there exists an optimal linear-space data structure that answers queries in O(1 + k/B) I/Os, where k denotes the output size and B the disk block size in words. In fact, we obtain the same bound for the harder problem of three-sided orthogonal range reporting. In this problem, we are to preprocess a set of n two-dimensional points in rank space, such that all points inside a query rectangle of the form [x_1, x_2] × (−∞, y] can be reported. The best previous bounds for this problem are either O(n lg^2_B n) space and O(1 + k/B) query I/Os, or O(n) space and O(lg^{(h)}_B n + k/B) query I/Os, where lg^{(h)}_B n is the base B logarithm iterated h times, for any constant integer h. The previous bounds are both achieved under the indivisibility assumption, while our solution exploits the full capabilities of the underlying machine. Breaking the indivisibility assumption thus provides us with cleaner and optimal bounds. Our results also imply an optimal solution to the following colored prefix reporting problem. Given a set S of strings, each O(1) disk blocks in length, and a function c: S → 2^{{1, …, σ}}, support queries of the kind: Given a string p, report the set ∪_{x ∈ S ∩ p*} c(x), where p* denotes the set of strings with prefix p. Finally, we consider the possibility of top-k extensions of this result, and present a simple solution in a model that allows non-blocked I/O.

STOC Conference 2012 Conference Paper

The cell probe complexity of dynamic range counting

  • Kasper Green Larsen

In this paper we develop a new technique for proving lower bounds on the update time and query time of dynamic data structures in the cell probe model. With this technique, we prove the highest lower bound to date for any explicit problem, namely a lower bound of t_q = Ω((lg n / lg(w t_u))^2). Here n is the number of update operations, w the cell size, t_q the query time and t_u the update time. In the most natural setting of cell size w = Θ(lg n), this gives a lower bound of t_q = Ω((lg n / lg lg n)^2) for any polylogarithmic update time. This bound is almost a quadratic improvement over the highest previous lower bound of Ω(lg n), due to Patrascu and Demaine [SICOMP'06]. We prove our lower bound for the fundamental problem of weighted orthogonal range counting. In this problem, we are to support insertions of two-dimensional points, each assigned a Θ(lg n)-bit integer weight. A query to this problem is specified by a point q = (x, y), and the goal is to report the sum of the weights assigned to the points dominated by q, where a point (x', y') is dominated by q if x' ≤ x and y' ≤ y. In addition to being the highest cell probe lower bound to date, our lower bound is also tight for data structures with update time t_u = Ω(lg^{2+ε} n), where ε > 0 is an arbitrarily small constant.

FOCS Conference 2011 Conference Paper

On Range Searching in the Group Model and Combinatorial Discrepancy

  • Kasper Green Larsen

In this paper we establish an intimate connection between dynamic range searching in the group model and combinatorial discrepancy. Our result states that, for a broad class of range searching data structures (including all known upper bounds), it must hold that t_u t_q = Ω(disc^2 / lg n), where t_u is the worst case update time, t_q the worst case query time and disc is the combinatorial discrepancy of the range searching problem in question. This relation immediately implies a whole range of exceptionally high and near-tight lower bounds for all of the basic range searching problems. We list a few of them in the following: 1. For halfspace range searching in d-dimensional space, we get a lower bound of t_u t_q = Ω(n^{1-1/d} / lg n). This comes within a lg n lg lg n factor of the best known upper bound. 2. For orthogonal range searching in d-dimensional space, we get a lower bound of t_u t_q = Ω(lg^{d-2+μ(d)} n), where μ(d) > 0 is some small but strictly positive function of d. 3. For ball range searching in d-dimensional space, we get a lower bound of t_u t_q = Ω(n^{1-1/d} / lg n). We note that the previous highest lower bound for any explicit problem, due to Patrascu [STOC'07], states that t_q = Ω((lg n / lg(lg n + t_u))^2), which does however hold for a less restrictive class of data structures. Our result also has implications for the field of combinatorial discrepancy. Using textbook range searching solutions, we improve on the best known discrepancy upper bound for axis-aligned rectangles in dimensions d ≥ 3.

SODA Conference 2011 Conference Paper

Range Selection and Median: Tight Cell Probe Lower Bounds and Adaptive Data Structures

  • Allan Grønlund Jørgensen
  • Kasper Green Larsen

Range selection is the problem of preprocessing an input array A of n unique integers, such that given a query (i, j, k), one can report the k'th smallest integer in the subarray A[i], A[i+1], …, A[j]. In this paper we consider static data structures in the word-RAM for range selection and several natural special cases thereof. The first special case is known as range median, which arises when k is fixed to ⌊(j − i + 1)/2⌋. The second case, denoted prefix selection, arises when i is fixed to 0. Finally, we also consider the bounded rank prefix selection problem and the fixed rank range selection problem. In the former, data structures must support prefix selection queries under the assumption that k ≤ κ for some value κ ≤ n given at construction time, while in the latter, data structures must support range selection queries where k is fixed beforehand for all queries. We prove cell probe lower bounds for range selection, prefix selection and range median, stating that any data structure that uses S words of space needs Ω(log n / log(Sw/n)) time to answer a query. In particular, any data structure that uses n log^{O(1)} n space needs Ω(log n / log log n) time to answer a query, and any data structure that supports queries in constant time needs n^{1+Ω(1)} space. For data structures that use n log^{O(1)} n space this matches the best known upper bound. Additionally, we present a linear space data structure that supports range selection queries in O(log k / log log n + log log n) time. Finally, we prove that any data structure that uses S space needs Ω(log κ / log(Sw/n)) time to answer a bounded rank prefix selection query and Ω(log k / log(Sw/n)) time to answer a fixed rank range selection query. This shows that our data structure is optimal except for small values of k.

FOCS Conference 2009 Conference Paper

Orthogonal Range Reporting in Three and Higher Dimensions

  • Peyman Afshani
  • Lars Arge
  • Kasper Green Larsen

In orthogonal range reporting we are to preprocess N points in d-dimensional space so that the points inside a d-dimensional axis-aligned query box can be reported efficiently. This is a fundamental problem in various fields, including spatial databases and computational geometry. In this paper we provide a number of improvements for three and higher dimensional orthogonal range reporting: In the pointer machine model, we improve all the best previous results, some of which have not seen any improvements in almost two decades. In the I/O-model, we improve the previously known three-dimensional structures and provide the first (non-trivial) structures for four and higher dimensions.