Arrow Research search

Author name cluster

Or Zamir

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers
2 author rows

Possible papers

13

TMLR Journal 2024 Journal Article

Undetectable Steganography for Language Models

  • Or Zamir

We introduce a cryptographic method to hide an arbitrary secret payload in the response of a Large Language Model (LLM). A secret key is required to extract the payload from the model's response, and without the key it is provably impossible to distinguish between the responses of the original LLM and the LLM that hides a payload. In particular, the quality of generated text is not affected by the payload. Our approach extends a recent result of Christ, Gunn and Zamir (2023) who introduced an undetectable watermarking scheme for LLMs.
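The flavor of the undetectability claim can be illustrated with a toy sketch. This is not the paper's construction (which works over real LLM token distributions); it assumes a keyed PRF (HMAC-SHA256 here) and a simplified "model" that is indifferent between two tokens at every step, so XOR-ing each payload bit with a key-derived pseudorandom bit yields output that is uniform to anyone without the key:

```python
import hmac
import hashlib

def prf_bit(key: bytes, index: int) -> int:
    """Pseudorandom bit derived from the secret key (HMAC-SHA256 as a PRF)."""
    digest = hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()
    return digest[0] & 1

def embed(key: bytes, payload_bits: list) -> list:
    """Toy 'model' step: token 0 and token 1 are equally likely, so picking
    payload_bit XOR prf_bit costs nothing in text quality and looks uniform
    without the key."""
    return [b ^ prf_bit(key, i) for i, b in enumerate(payload_bits)]

def extract(key: bytes, tokens: list) -> list:
    """With the key, XOR-ing the same PRF stream back recovers the payload."""
    return [t ^ prf_bit(key, i) for i, t in enumerate(tokens)]
```

The one-time-pad-style argument is that each output bit is the payload masked by a pseudorandom bit, so distinguishing the stego output from ordinary sampling would break the PRF; the actual scheme achieves this over arbitrary token distributions.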

FOCS Conference 2022 Conference Paper

Planting Undetectable Backdoors in Machine Learning Models (Extended Abstract)

  • Shafi Goldwasser
  • Michael P. Kim
  • Vinod Vaikuntanathan
  • Or Zamir

Given the computational cost and technical expertise required to train machine learning models, users may delegate the task of learning to a service provider. Delegation of learning has clear benefits, and at the same time raises serious concerns of trust. This work studies possible abuses of power by untrusted learners. We show how a malicious learner can plant an undetectable backdoor into a classifier. On the surface, such a backdoored classifier behaves normally, but in reality, the learner maintains a mechanism for changing the classification of any input, with only a slight perturbation. Importantly, without the appropriate “backdoor key,” the mechanism is hidden and cannot be detected by any computationally-bounded observer. We demonstrate two frameworks for planting undetectable backdoors, with incomparable guarantees.

  • First, we show how to plant a backdoor in any model, using digital signature schemes. The construction guarantees that, given query access to the original model and the backdoored version, it is computationally infeasible to find even a single input where they differ. This property implies that the backdoored model has generalization error comparable with the original model. Moreover, even a distinguisher that can request backdoored inputs of its choice cannot backdoor a new input, a property we call non-replicability.
  • Second, we demonstrate how to insert undetectable backdoors in models trained using the Random Fourier Features (RFF) learning paradigm (Rahimi, Recht; NeurIPS 2007). In this construction, undetectability holds against powerful white-box distinguishers: given a complete description of the network and the training data, no efficient distinguisher can guess whether the model is “clean” or contains a backdoor. The backdooring algorithm executes the RFF algorithm faithfully on the given training data, tampering only with its random coins. We prove this strong guarantee under the hardness of the Continuous Learning With Errors problem (Bruna, Regev, Song, Tang; STOC 2021). We show a similar white-box undetectable backdoor for random ReLU networks based on the hardness of Sparse PCA (Berthet, Rigollet; COLT 2013).

Our construction of undetectable backdoors also sheds light on the related issue of robustness to adversarial examples. In particular, by constructing an undetectable backdoor for an “adversarially-robust” learning algorithm, we can produce a classifier that is indistinguishable from a robust classifier, but in which every input has an adversarial example! In this way, the existence of undetectable backdoors represents a significant theoretical roadblock to certifying adversarial robustness.
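A minimal sketch of the first framework's flavor, using HMAC as a stand-in for the digital signature scheme the paper actually uses (a real instantiation needs a public-key signature so that even someone holding the model cannot forge activating inputs); the classifier and key names here are illustrative:

```python
import hmac
import hashlib

BACKDOOR_KEY = b"attacker-secret"   # hypothetical signing key, held by the learner

def base_classifier(x: bytes) -> int:
    """Stand-in 'clean' model: classify by parity of the byte sum."""
    return sum(x) % 2

def tag(x: bytes) -> bytes:
    """Attacker-only: MAC tag whose presence activates the backdoor."""
    return hmac.new(BACKDOOR_KEY, x, hashlib.sha256).digest()[:8]

def backdoored_classifier(payload: bytes) -> int:
    """Input = message || 8-byte tag.  On a valid tag, flip the label;
    on anything else, behave exactly like the clean model."""
    x, t = payload[:-8], payload[-8:]
    y = base_classifier(x)
    if hmac.compare_digest(t, tag(x)):
        return 1 - y
    return y
```

Without the key, finding an input where the two classifiers differ amounts to forging a tag, which mirrors (in toy form) why black-box undetectability follows from unforgeability.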

SODA Conference 2022 Conference Paper

Simulating a stack using queues

  • Haim Kaplan
  • Robert Endre Tarjan
  • Or Zamir
  • Uri Zwick

It is well known that a queue can be simulated by two stacks using a constant number of stack operations per queue operation. In this paper we consider the forgotten converse problem of simulating a stack using several queues. We consider several variants of this problem. For the offline variant, we obtain tight upper and lower bounds on the worst-case number of queue operations needed to simulate a sequence of n stack operations using k queues. For the online variant, when the number of queues k is constant and n is the maximum number of items in the stack at any given time, we obtain tight Θ(n^{1/k}) upper and lower bounds on the worst-case and amortized number of queue operations needed to simulate one stack operation. When k is allowed to grow with n, we prove an upper bound of O(n^{1/k} + log_k n) and a lower bound on the amortized number of queue operations per stack operation. We also prove an upper bound of O(k·n^{1/k}) and a lower bound of Ω(n^{1/k} + log_k n) on the worst-case number of queue operations per stack operation. We also show that the specific but interesting sequence of n pushes followed by n pops can be implemented much faster, using a total of only Θ(n·log_k n) queue operations for every k ≥ 2, i.e., an amortized Θ(log_k n) queue operations per stack operation, and this bound is tight. On the other hand, we show that the same sequence requires at least Ω(n^{1/k}) queue operations per stack operation in the worst case.
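The base case k = 1 of the simulation can be sketched directly: a single FIFO queue simulates a stack if every push rotates the queue so the newest item sits at the front, at a cost of Θ(n) queue operations per push, consistent with the Θ(n^{1/k}) bound at k = 1 (the k-queue schemes in the paper are considerably more involved):

```python
from collections import deque

class StackViaQueue:
    """Stack simulated by a single FIFO queue.  After each push the queue
    is rotated so the newest item is at the front; pop is then a dequeue."""
    def __init__(self):
        self.q = deque()

    def push(self, x):
        self.q.append(x)                  # enqueue the new item at the back
        for _ in range(len(self.q) - 1):  # rotate the older items behind it
            self.q.append(self.q.popleft())

    def pop(self):
        return self.q.popleft()           # front of the queue is the stack top
```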

ICML Conference 2021 Conference Paper

Randomized Dimensionality Reduction for Facility Location and Single-Linkage Clustering

  • Shyam Narayanan
  • Sandeep Silwal
  • Piotr Indyk
  • Or Zamir

Random dimensionality reduction is a versatile tool for speeding up algorithms for high-dimensional problems. We study its application to two clustering problems: the facility location problem, and the single-linkage hierarchical clustering problem, which is equivalent to computing the minimum spanning tree. We show that if we project the input pointset $X$ onto a random $d = O(d_X)$-dimensional subspace (where $d_X$ is the doubling dimension of $X$), then the optimum facility location cost in the projected space approximates the original cost up to a constant factor. We show an analogous statement for minimum spanning tree, but with the dimension $d$ having an extra $\log \log n$ term and the approximation factor being arbitrarily close to $1$. Furthermore, we extend these results to approximating {\em solutions} instead of just their {\em costs}. Lastly, we provide experimental results to validate the quality of solutions and the speedup due to the dimensionality reduction. Unlike several previous papers studying this approach in the context of $k$-means and $k$-medians, our dimension bound does not depend on the number of clusters but only on the intrinsic dimensionality of $X$.
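The projection step itself is a plain Gaussian random projection; here is a self-contained pure-Python sketch, with the target dimension d left as a free parameter rather than tied to the paper's O(d_X) bound:

```python
import math
import random

def random_projection(points, d, seed=0):
    """Project points (lists of floats) onto d random Gaussian directions,
    scaled by 1/sqrt(d) so squared distances are preserved in expectation."""
    rng = random.Random(seed)
    dim = len(points[0])
    # d x dim matrix of i.i.d. N(0, 1/d) entries
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(d) for _ in range(dim)]
         for _ in range(d)]
    return [[sum(r[j] * p[j] for j in range(dim)) for r in R] for p in points]

def dist(a, b):
    """Euclidean distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))
```

With d large enough, all pairwise distances concentrate around their originals; the paper's contribution is that for facility location and MST costs, d can depend only on the doubling dimension of the pointset rather than on n or the number of clusters.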

FOCS Conference 2021 Conference Paper

Tight Space Complexity of the Coin Problem

  • Mark Braverman
  • Sumegha Garg
  • Or Zamir

In the coin problem we are asked to distinguish, with probability at least 2/3, between $n$ i.i.d. coins which are heads with probability $\frac{1}{2}+\beta$ and coins which are heads with probability $\frac{1}{2}-\beta$. We are interested in the space complexity of the coin problem, corresponding to the width of a read-once branching program solving the problem. The coin problem becomes more difficult as $\beta$ becomes smaller. Statistically, it can be solved whenever $\beta = \Omega(n^{-1/2})$, using counting. It has been previously shown that for $\beta = O(n^{-1/2})$, counting is essentially optimal (equivalently, width $\mathrm{poly}(n)$ is necessary [Braverman-Garg-Woodruff FOCS'20]). On the other hand, the coin problem only requires $O(\log n)$ width for $\beta > n^{-c}$ for any constant $c < \log_{2}(\sqrt{5}-1) \approx 0.306$ (following the low-width simulation of the AND-OR tree of [Valiant Journal of Algorithms'84]). In this paper, we close the gap between the bounds, showing a tight threshold between the values of $\beta = n^{-c}$ where $O(\log n)$ width suffices and the regime where $\mathrm{poly}(n)$ width is needed, with a transition at $c = 1/3$. This gives a complete characterization (up to constant factors) of the memory complexity of solving the coin problem, for all values of bias $\beta$. We introduce new techniques in both bounds. For the upper bound, we give a construction based on recursive majority that does not require a memory stack of size $\log n$ bits. For the lower bound, we introduce new combinatorial techniques for analyzing progression of the success probabilities in read-once branching programs.
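The counting strategy in the β = Ω(n^{-1/2}) regime is easy to simulate. This is a Monte Carlo sketch, not a branching-program implementation; the memory cost of maintaining the counter is exactly what the width lower bound addresses:

```python
import random

def counting_solver(coins):
    """The poly(n)-width strategy: count heads and guess '+' iff a majority."""
    return sum(coins) > len(coins) / 2

def success_rate(n, beta, trials=2000, seed=0):
    """Empirical success probability of counting at bias +/- beta."""
    rng = random.Random(seed)
    ok = 0
    for t in range(trials):
        sign = 1 if t % 2 == 0 else -1        # alternate the true bias direction
        p = 0.5 + sign * beta
        coins = [rng.random() < p for _ in range(n)]
        ok += counting_solver(coins) == (sign == 1)
    return ok / trials
```

At n = 400 and β = 2·n^{-1/2} = 0.1 the head count is roughly 4 standard deviations from the threshold, so counting succeeds almost always, comfortably above the required 2/3.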

SODA Conference 2019 Conference Paper

A sort of an adversary

  • Haim Kaplan
  • Or Zamir
  • Uri Zwick

We describe an efficient deterministic adversary that forces any comparison-based sorting algorithm to perform at least n log n comparisons. This improves on the previous efficient adversaries of Atallah and Kosaraju (1981), Richards and Vaidya (1988), and Brodal et al. (1996), which force any sorting algorithm to perform at least cn log n comparisons for some constant c < 1.
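The classical information-theoretic adversary conveys the idea, though it is exponential-time rather than the efficient construction of the paper: answer each comparison so that the larger half of the still-consistent total orders survives, forcing any sorter to spend at least ⌈log₂ n!⌉ comparisons. A tiny sketch:

```python
from itertools import permutations
from math import ceil, log2

class Adversary:
    """Answers comparisons so that the majority of the still-consistent
    orders survives; each answer shrinks the set by at most half, so any
    correct sorter needs at least ceil(log2(n!)) comparisons."""
    def __init__(self, n):
        self.orders = set(permutations(range(n)))
        self.comparisons = 0

    def less(self, i, j):
        self.comparisons += 1
        yes = {p for p in self.orders if p.index(i) < p.index(j)}
        no = self.orders - yes
        if len(yes) >= len(no):
            self.orders = yes
            return True
        self.orders = no
        return False

def insertion_sort(items, less):
    """A comparison-based sorter driven entirely by the adversary's answers."""
    a = []
    for x in items:
        k = len(a)
        while k > 0 and less(x, a[k - 1]):
            k -= 1
        a.insert(k, x)
    return a
```

Running any sorter against this adversary witnesses the log₂(n!) information bound; the paper's contribution is an adversary achieving the stronger n log n guarantee while remaining efficient.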

STOC Conference 2019 Conference Paper

Faster k-SAT algorithms using biased-PPSZ

  • Thomas Dueholm Hansen
  • Haim Kaplan
  • Or Zamir
  • Uri Zwick

The PPSZ algorithm, due to Paturi, Pudlak, Saks and Zane, is currently the fastest known algorithm for the k-SAT problem for every k > 3. For 3-SAT, a tiny improvement over PPSZ was obtained by Hertli. We introduce a biased version of the PPSZ algorithm with which we obtain an improvement over PPSZ for every k ≥ 3. For k = 3 we also improve on Hertli's result and get a much more noticeable improvement over PPSZ, though still a relatively small one. In particular, for Unique 3-SAT, we improve the current bound from 1.308^n to 1.307^n.
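A hedged sketch of the plain PPSZ core loop, for intuition only (this is neither the biased variant nor the full algorithm, which also uses bounded-resolution preprocessing): process the variables in a random order; when a unit clause forces the current variable, obey it, otherwise guess uniformly; repeat over many rounds.

```python
import random

def ppsz_round(clauses, n, rng):
    """One PPSZ-style round.  Clauses are tuples of non-zero ints: literal v
    means variable v is True, literal -v means variable v is False."""
    assign = {}
    for v in rng.sample(range(1, n + 1), n):   # random variable order
        forced = None
        for clause in clauses:
            if any(assign.get(abs(l)) == (l > 0) for l in clause):
                continue                       # clause already satisfied
            live = [l for l in clause if assign.get(abs(l)) is None]
            if len(live) == 1 and abs(live[0]) == v:
                forced = live[0] > 0           # unit clause forces v
                break
        assign[v] = forced if forced is not None else rng.random() < 0.5
    return assign

def satisfies(clauses, assign):
    return all(any(assign[abs(l)] == (l > 0) for l in clause)
               for clause in clauses)

def ppsz(clauses, n, rounds=200, seed=0):
    """Repeat rounds until a satisfying assignment is found (or give up)."""
    rng = random.Random(seed)
    for _ in range(rounds):
        a = ppsz_round(clauses, n, rng)
        if satisfies(clauses, a):
            return a
    return None
```

The running-time analysis hinges on how often a variable is forced rather than guessed; the biased variant of the paper skews the guessing probabilities away from uniform to gain its improvement.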

FOCS Conference 2019 Conference Paper

Random k-out Subgraph Leaves only O(n/k) Inter-Component Edges

  • Jacob Holm
  • Valerie King
  • Mikkel Thorup
  • Or Zamir
  • Uri Zwick

Each vertex of an arbitrary simple graph on n vertices chooses k random incident edges. What is the expected number of edges in the original graph that connect different connected components of the sampled subgraph? We prove that the answer is O(n/k) when k ≥ c log n, for some large enough c. We conjecture that the same holds for smaller values of k, possibly for any k ≥ 2. Such a result is best possible for any k ≥ 2. As an application, we use this sampling result to obtain a one-way communication protocol with private randomness for finding a spanning forest of a graph in which each vertex sends only O(√n log n) bits to a referee.
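The sampling experiment is easy to reproduce; the following is an illustrative simulation (the paper's contribution is the O(n/k) bound itself, not this procedure):

```python
import random

def k_out_sample(adj, k, seed=0):
    """Each vertex keeps up to k random incident edges (edges are undirected).
    adj maps each vertex 0..n-1 to a list of its neighbors."""
    rng = random.Random(seed)
    kept = set()
    for v, nbrs in adj.items():
        for u in rng.sample(nbrs, min(k, len(nbrs))):
            kept.add(frozenset((u, v)))
    return kept

def components(n, edges):
    """Component label for every vertex of the sampled subgraph (union-find)."""
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for e in edges:
        a, b = tuple(e)
        parent[find(a)] = find(b)
    return [find(v) for v in range(n)]

def inter_component_edges(adj, kept):
    """Original edges whose endpoints lie in different sampled components."""
    comp = components(len(adj), kept)
    return sum(1 for v in adj for u in adj[v] if v < u and comp[v] != comp[u])
```

On a cycle with k = 2 every vertex keeps both of its edges, so the sample is the whole graph and no inter-component edges remain; the interesting regime is sparse k on dense graphs.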

SODA Conference 2015 Conference Paper

The amortized cost of finding the minimum

  • Haim Kaplan
  • Or Zamir
  • Uri Zwick

We obtain an essentially optimal tradeoff between the amortized costs of the three basic priority queue operations insert, delete and find-min in the comparison model. More specifically, for any fixed ε > 0 we prove a tradeoff relating A(insert), A(delete) and A(find-min), where n is the number of items in the priority queue and A(insert), A(delete) and A(find-min) are the amortized costs of the insert, delete and find-min operations, respectively. In particular, if A(insert) + A(delete) = O(1), then A(find-min) = Ω(n), and A(find-min) = O(n^α), for some α < 1, only if A(insert) + A(delete) = Ω(log n). (We can, of course, have A(insert) = O(1) and A(delete) = O(log n), or vice versa, and A(find-min) = O(1).) Our lower bound holds even if randomization is allowed. Surprisingly, such fundamental bounds on the amortized cost of the operations were not known before. Brodal, Chaudhuri and Radhakrishnan obtained similar bounds for the worst-case complexity of find-min.
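One endpoint of this tradeoff is easy to realize concretely: with constant-time insert and delete, find-min is reduced to a linear scan, as in this minimal sketch:

```python
class LazyPQ:
    """Priority queue with O(1) insert and delete; find-min pays Theta(n)
    by scanning, illustrating the A(insert)+A(delete)=O(1) endpoint of the
    tradeoff (the lower bound says find-min must then cost Omega(n))."""
    def __init__(self):
        self.items = {}        # handle -> key, no ordering maintained
        self.next_id = 0

    def insert(self, key):
        i = self.next_id
        self.next_id += 1
        self.items[i] = key    # O(1): just record the key
        return i               # handle for later deletion

    def delete(self, handle):
        self.items.pop(handle, None)   # O(1): no structure to repair

    def find_min(self):
        return min(self.items.values())  # Theta(n) scan
```

A binary heap sits at the other endpoint, with A(insert) + A(delete) = O(log n) and A(find-min) = O(1); the paper pins down the whole curve between them.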