Arrow Research search

Author name cluster

Omer Reingold

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

49 papers
2 author rows

Possible papers

49

JBHI Journal 2026 Journal Article

A Post-Processing Fairness Mitigation Method for Medical Prediction Models

  • Noga Stern
  • Inbal Livni Navon
  • Omer Reingold
  • Eitan Bachmat
  • Noam Barda
  • Noa Dagan

A common challenge in medicine is the allocation of limited resources. Prediction models can assist in making resource allocation decisions, but they may also exhibit unfairness. We introduce a post-processing algorithm tailored to this scenario and compare its performance to modified pre-processing and in-processing algorithms. We developed predictors for in-hospital mortality, utilizing ICU data from two datasets. We then applied a post-processing algorithm we developed to enforce equal opportunity while adhering to a limited resources constraint and compared it to modified existing pre-processing and in-processing algorithms. The results were evaluated in terms of fairness and predictive performance, and the usability of the three algorithms was compared. All algorithms showed substantial improvement in the “equal opportunity” metric, presenting an average decrease of 52% in the span of sensitivity values across subgroups, and none of the algorithms consistently outperformed the others. In some cases enforcing fairness reduced predictive performance, with an average decrease of 4% in sensitivity and 3% in positive predictive value. However, there were usability differences between the algorithms: the pre-processing and post-processing algorithms preserve the numerical risk predictions, and only the post-processing algorithm does not require re-training to change the size of the intervention group. Our results demonstrate that all three methods enhance model fairness, with no single approach consistently outperforming the others. All three methods also achieve similar overall predictive performance, which in some cases is reduced compared to the base model. However, our post-processing algorithm offers practical advantages in usability compared to the alternatives.
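
The post-processing step described here lends itself to a compact sketch: given risk scores, subgroup labels, and a fixed intervention budget, allocate slots so that sensitivity (true-positive rate) stays as even as possible across subgroups. The following is a minimal illustration in Python, not the authors' implementation; the greedy allocation rule and all names are assumptions:

    import numpy as np

    def equal_opportunity_select(scores, groups, y_true, budget):
        """Greedy budgeted selection aiming at equal opportunity (sketch).

        Each intervention slot goes to the subgroup whose true-positive rate
        among already-selected members is currently lowest, and within that
        subgroup to its highest-scoring unselected member. `y_true` stands
        in for historical outcome labels."""
        selected = np.zeros(len(scores), dtype=bool)
        group_ids = np.unique(groups)
        # members of each group, highest score first
        queues = {g: list(np.where(groups == g)[0][np.argsort(-scores[groups == g])])
                  for g in group_ids}
        positives = {g: max(1, int(y_true[groups == g].sum())) for g in group_ids}
        for _ in range(budget):
            # current sensitivity per group among selected individuals
            tpr = {g: y_true[selected & (groups == g)].sum() / positives[g]
                   for g in group_ids if queues[g]}
            if not tpr:
                break
            g = min(tpr, key=tpr.get)          # most under-served group
            selected[queues[g].pop(0)] = True  # its best remaining candidate
        return selected

    # toy usage
    rng = np.random.default_rng(0)
    scores = rng.random(1000)
    groups = rng.integers(0, 2, 1000)
    y_true = (rng.random(1000) < scores).astype(int)
    mask = equal_opportunity_select(scores, groups, y_true, budget=100)
    for g in (0, 1):
        sel = mask & (groups == g)
        print(g, y_true[sel].sum() / y_true[groups == g].sum())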

TMLR Journal 2025 Journal Article

Fairness with respect to Stereotype Predictors: Impossibilities and Best Practices

  • Inbal Rachel Livni Navon
  • Omer Reingold
  • Judy Hanwen Shen

As AI systems increasingly influence decision-making from consumer recommendations to educational opportunities, their accountability becomes paramount. This need for oversight has driven extensive research into algorithmic fairness, a body of work that has examined both allocative and representational harms. However, numerous works examining representational harms such as stereotypes encompass many different concepts measured by different criteria, yielding many, potentially conflicting, characterizations of harm. The abundance of measurement approaches makes the mitigation of stereotypes in downstream machine learning models highly challenging. Our work introduces and unifies a broad class of auditors through the framework of stereotype predictors. We map notions of fairness with respect to these predictors to existing notions of group fairness. We give guidance, with theoretical foundations, for selecting one or a set of stereotype predictors and provide algorithms for achieving fairness with respect to stereotype predictors under various fairness notions. We demonstrate the effectiveness of our algorithms with different stereotype predictors in two empirical case studies.

FOCS Conference 2025 Conference Paper

How Global Calibration Strengthens Multiaccuracy

  • Sílvia Casacuberta
  • Parikshit Gopalan
  • Varun Kanade
  • Omer Reingold

Multiaccuracy and multicalibration are multi-group fairness notions for prediction that have found numerous applications in learning and computational complexity [HKRR18]. They can be achieved from a single learning primitive: weak agnostic learning. A line of work starting from [GKR+22] has shown that multicalibration implies a very strong form of learning. Here we investigate the power of multiaccuracy as a learning primitive, both with and without the additional assumption of calibration. We find that multiaccuracy in itself is rather weak, but that the addition of global calibration (this notion is called calibrated multiaccuracy) boosts its power substantially, enough to recover implications that were previously known only assuming the stronger notion of multicalibration. We give evidence that multiaccuracy might not be as powerful as standard weak agnostic learning, by showing that there is no way to post-process a multiaccurate predictor to get a weak learner, even assuming the best hypothesis has correlation 1/2. Rather, we show that it yields a restricted form of weak agnostic learning, which requires some concept in the class to have correlation greater than 1/2 with the labels. However, by also requiring the predictor to be calibrated, we recover not just weak, but strong agnostic learning. A similar picture emerges when we consider the derivation of hardcore measures from predictors satisfying multigroup fairness notions [TTV09], [CDV24]. On the one hand, while multiaccuracy only yields hardcore measures of density half the optimal, we show that (a weighted version of) calibrated multiaccuracy achieves optimal density. Our results yield new insights into the complementary roles played by multiaccuracy and calibration in each setting. They shed light on why multiaccuracy and global calibration, although not particularly powerful by themselves, together yield considerably stronger notions.
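
For reference, the two properties being combined can be audited directly from data: multiaccuracy asks that the residual y − p(x) be nearly uncorrelated with every test c in the class C, and global calibration asks that E[y | p(x) ≈ v] ≈ v on every level set. A small numpy sketch of both checks (the test matrix and binning scheme are illustrative assumptions):

    import numpy as np

    def multiaccuracy_violation(p, y, tests):
        """Largest average correlation between the residual y - p and any
        auditor; `tests` holds the values c(x) of each c in C as columns."""
        residual = y - p
        return np.max(np.abs(tests.T @ residual) / len(y))

    def calibration_error(p, y, bins=10):
        """Average |E[y | level set] - E[p | level set]| over bins of p."""
        edges = np.linspace(0, 1, bins + 1)
        edges[-1] += 1e-9                      # include p == 1 in the top bin
        err = 0.0
        for lo, hi in zip(edges[:-1], edges[1:]):
            mask = (p >= lo) & (p < hi)
            if mask.any():
                err += mask.mean() * abs(y[mask].mean() - p[mask].mean())
        return err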

ICML Conference 2025 Conference Paper

Representative Language Generation

  • Charlotte Peale
  • Vinod Raman
  • Omer Reingold

We introduce "representative generation," extending the theoretical framework for generation proposed by Kleinberg et al. (2024) and formalized by Li et al. (2024), to additionally address diversity and bias concerns in generative models. Our notion requires outputs of a generative model to proportionally represent groups of interest from the training data. We characterize representative uniform and non-uniform generation, introducing the "group closure dimension" as a key combinatorial quantity. For representative generation in the limit, we analyze both information-theoretic and computational aspects, demonstrating feasibility for countably infinite hypothesis classes and collections of groups under certain conditions, but proving a negative result for computability using only membership queries. This contrasts with Kleinberg et al.'s (2024) positive results for standard generation in the limit. Our findings provide a rigorous foundation for developing more diverse and representative generative models.
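
Operationally, the proportionality requirement says that the generator's outputs should, in aggregate, reflect the group frequencies of the training data. The toy wrapper below enforces this by rejection; it is a heuristic sketch only and makes no claim to match the paper's formal model (all names are hypothetical):

    import random
    from collections import Counter

    def representative_sample(generate, group_of, target_freqs, n, tol=0.02,
                              max_tries=10**6):
        """Rejection-style wrapper (heuristic sketch, not the paper's
        algorithm): draw from `generate`, but skip an output whenever its
        group would become over-represented relative to the training
        frequencies `target_freqs` by more than `tol`."""
        out, counts = [], Counter()
        for _ in range(max_tries):
            if len(out) == n:
                break
            x = generate()
            g = group_of(x)
            if (counts[g] + 1) / (len(out) + 1) <= target_freqs[g] + tol:
                out.append(x)
                counts[g] += 1
        return out

    # toy usage: a biased generator corrected toward 50/50 group frequencies
    gen = lambda: "a" if random.random() < 0.8 else "b"
    sample = representative_sample(gen, lambda x: x, {"a": 0.5, "b": 0.5}, 1000)
    print(Counter(sample))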

AAAI Conference 2024 Conference Paper

Dissenting Explanations: Leveraging Disagreement to Reduce Model Overreliance

  • Omer Reingold
  • Judy Hanwen Shen
  • Aditi Talati

While modern explanation methods have been shown to be inconsistent and contradictory, the explainability of black-box models nevertheless remains desirable. When the role of explanations extends from understanding models to aiding decision making, the semantics of explanations is not always fully understood: to what extent do explanations "explain" a decision and to what extent do they merely advocate for a decision? Can we help humans gain insights from explanations accompanying correct predictions and not over-rely on incorrect predictions advocated for by explanations? With this perspective in mind, we introduce the notion of dissenting explanations: conflicting predictions with accompanying explanations. We first explore the advantage of dissenting explanations in the setting of model multiplicity, where multiple models with similar performance may have different predictions. Through a human study on the task of identifying deceptive reviews, we demonstrate that dissenting explanations reduce overreliance on model predictions, without reducing overall accuracy. Motivated by the utility of dissenting explanations, we present both global and local methods for their generation.
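
The model-multiplicity setting is easy to reproduce: train two models of comparable accuracy and harvest the test points where they disagree; each such point yields a prediction together with a dissenting counterpart. A minimal sketch with scikit-learn, where the model choices and synthetic data are illustrative assumptions:

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.model_selection import train_test_split

    X, y = make_classification(n_samples=2000, n_features=10, random_state=0)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

    m1 = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    m2 = RandomForestClassifier(random_state=0).fit(X_tr, y_tr)
    print("accuracies:", m1.score(X_te, y_te), m2.score(X_te, y_te))

    # Points where comparable-accuracy models disagree: each model's
    # prediction can be paired with the other's as a dissenting explanation.
    disagree = np.where(m1.predict(X_te) != m2.predict(X_te))[0]
    print(f"{len(disagree)} of {len(X_te)} test points admit dissent")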

ICML Conference 2023 Conference Paper

Omnipredictors for Constrained Optimization

  • Lunjia Hu
  • Inbal Livni Navon
  • Omer Reingold
  • Chutong Yang

The notion of omnipredictors (Gopalan, Kalai, Reingold, Sharan and Wieder, ITCS 2022) suggested a new paradigm for loss minimization. Rather than learning a predictor based on a known loss function, omnipredictors can easily be post-processed to minimize any one of a rich family of loss functions compared with the loss of hypotheses in a class $\mathcal C$. It has been shown that such omnipredictors exist and are implied (for all convex and Lipschitz loss functions) by the notion of multicalibration from the algorithmic fairness literature. In this paper, we introduce omnipredictors for constrained optimization and study their complexity and implications. The notion that we introduce allows the learner to be unaware of the loss function that will be later assigned, as well as of the constraints that will be later imposed, as long as the subpopulations that are used to define these constraints are known. We show how to obtain omnipredictors for constrained optimization problems, relying on appropriate variants of multicalibration. We also investigate the implications of this notion when the constraints used are so-called group fairness notions.
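
The post-processing at the heart of omniprediction is simple to state: given a predicted probability p for a binary outcome and a loss ℓ(action, outcome) revealed only later, act by minimizing expected loss under p. A numpy sketch of that single loss-dependent step (the losses and action grid are illustrative):

    import numpy as np

    def post_process(p, loss, actions):
        """Best response to predicted probability p for an arbitrary loss.

        For each example, picks argmin_a p*loss(a,1) + (1-p)*loss(a,0).
        This is the only step that depends on the loss, which is why one
        predictor can later serve many losses."""
        table = np.array([[pi * loss(a, 1) + (1 - pi) * loss(a, 0)
                           for a in actions] for pi in p])
        return np.asarray(actions)[table.argmin(axis=1)]

    # the same predictions post-processed for two different losses
    p = np.array([0.1, 0.4, 0.9])
    grid = np.linspace(0, 1, 101)
    squared = lambda a, y: (a - y) ** 2      # optimal action: a = p
    absolute = lambda a, y: abs(a - y)       # optimal action: round(p)
    print(post_process(p, squared, grid))    # ~[0.1, 0.4, 0.9]
    print(post_process(p, absolute, grid))   # ~[0.0, 0.0, 1.0]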

NeurIPS Conference 2023 Conference Paper

Swap Agnostic Learning, or Characterizing Omniprediction via Multicalibration

  • Parikshit Gopalan
  • Michael Kim
  • Omer Reingold

We introduce and study the notion of Swap Agnostic Learning. The problem can be phrased as a game between a *predictor* and an *adversary*: first, the predictor selects a hypothesis $h$; then, the adversary plays in response, and for each level set of the predictor, selects a loss-minimizing hypothesis $c_v \in \mathcal{C}$; the predictor wins if $h$ competes with the adaptive adversary's loss. Despite the strength of the adversary, our main result demonstrates the feasibility of Swap Agnostic Learning for any convex loss. Somewhat surprisingly, the result follows by proving an *equivalence* between Swap Agnostic Learning and swap variants of the recent notions of Omniprediction (ITCS'22) and Multicalibration (ICML'18). Beyond this equivalence, we establish further connections to the literature on Outcome Indistinguishability (STOC'20, ITCS'23), revealing a unified notion of OI that captures all existing notions of omniprediction and multicalibration.

FOCS Conference 2019 Conference Paper

Learning from Outcomes: Evidence-Based Rankings

  • Cynthia Dwork
  • Michael P. Kim
  • Omer Reingold
  • Guy N. Rothblum
  • Gal Yona

Many selection procedures involve ordering candidates according to their qualifications. For example, a university might order applicants according to a perceived probability of graduation within four years, and then select the top 1000 applicants. In this work, we address the problem of ranking members of a population according to their "probability" of success, based on a training set of historical binary outcome data (e.g., graduated in four years or not). We show how to obtain rankings that satisfy a number of desirable accuracy and fairness criteria, despite the coarseness of the training data. As the task of ranking is global (the rank of every individual depends not only on their own qualifications, but also on every other individual's qualifications), ranking is more subtle and vulnerable to manipulation than standard prediction tasks. Towards mitigating unfair discrimination caused by inaccuracies in rankings, we develop two parallel definitions of evidence-based rankings. The first definition relies on a semantic notion of domination-compatibility: if the training data suggest that members of a set S are more qualified (on average) than the members of T, then a ranking that favors T over S (i.e., where T dominates S) is blatantly inconsistent with the evidence, and likely to be discriminatory. The definition asks for domination-compatibility, not just for a pair of sets, but rather for every pair of sets from a rich collection C of subpopulations. The second definition aims at precluding even more general forms of discrimination; this notion of evidence-consistency requires that the ranking must be justified on the basis of consistency with the expectations for every set in the collection C. Somewhat surprisingly, while evidence-consistency is a strictly stronger notion than domination-compatibility when the collection C is predefined, the two notions are equivalent when the collection C may depend on the ranking in question.

NeurIPS Conference 2018 Conference Paper

Fairness Through Computationally-Bounded Awareness

  • Michael Kim
  • Omer Reingold
  • Guy Rothblum

We study the problem of fair classification within the versatile framework of Dwork et al. [ITCS '12], which assumes the existence of a metric that measures similarity between pairs of individuals. Unlike earlier work, we do not assume that the entire metric is known to the learning algorithm; instead, the learner can query this arbitrary metric a bounded number of times. We propose a new notion of fairness called metric multifairness and show how to achieve this notion in our setting. Metric multifairness is parameterized by a similarity metric d on pairs of individuals to classify and a rich collection C of (possibly overlapping) "comparison sets" over pairs of individuals. At a high level, metric multifairness guarantees that similar subpopulations are treated similarly, as long as these subpopulations are identified within the class C.

STOC Conference 2018 Conference Paper

Improved pseudorandomness for unordered branching programs through local monotonicity

  • Eshan Chattopadhyay
  • Pooya Hatami
  • Omer Reingold
  • Avishay Tal

We present an explicit pseudorandom generator with seed length Õ((log n)^(w+1)) for read-once, oblivious, width-w branching programs that can read their input bits in any order. This improves upon the work of Impagliazzo, Meka and Zuckerman (FOCS'12), which required seed length n^(1/2+o(1)). A central ingredient in our work is the following bound that we prove on the Fourier spectrum of branching programs: for any width-w read-once, oblivious branching program B: {0,1}^n → {0,1} and any k ∈ {1,…,n}, the level-k Fourier mass satisfies ∑_{S ⊆ [n], |S| = k} |B̂(S)| ≤ O(log n)^(wk). This settles a conjecture posed by Reingold, Steinke and Vadhan (RANDOM'13). Our analysis crucially uses a notion of local monotonicity on the edge labeling of the branching program. We carry out critical parts of our proof under the assumption of local monotonicity and show how to deduce our results for unrestricted branching programs.
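
On small instances the Fourier quantity in the bound can be checked by brute force: evaluate B on all of {0,1}^n and sum |B̂(S)| over sets of size k. A toy check in Python; the width-3 mod-3 counter program is an arbitrary example, not one from the paper:

    import itertools
    import numpy as np

    def fourier_level_mass(f, n, k):
        """Sum of |f_hat(S)| over all |S| = k, by full enumeration."""
        xs = list(itertools.product((0, 1), repeat=n))
        vals = np.array([f(x) for x in xs], dtype=float)
        total = 0.0
        for S in itertools.combinations(range(n), k):
            chi = np.array([(-1) ** sum(x[i] for i in S) for x in xs])
            total += abs(np.mean(vals * chi))
        return total

    # a toy width-3, read-once, oblivious program: a counter mod 3
    def mod3_program(x):
        state = 0
        for bit in x:            # each input bit is read once, in order
            state = (state + bit) % 3
        return int(state == 0)

    for k in (1, 2, 3):
        print(k, fourier_level_mass(mod3_program, 8, k))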

ICML Conference 2018 Conference Paper

Multicalibration: Calibration for the (Computationally-Identifiable) Masses

  • Úrsula Hébert-Johnson
  • Michael P. Kim
  • Omer Reingold
  • Guy N. Rothblum

We develop and study multicalibration as a new measure of fairness in machine learning that aims to mitigate inadvertent or malicious discrimination that is introduced at training time (even from ground truth data). Multicalibration guarantees meaningful (calibrated) predictions for every subpopulation that can be identified within a specified class of computations. The specified class can be quite rich; in particular, it can contain many overlapping subgroups of a protected group. We demonstrate that in many settings this strong notion of protection from discrimination is provably attainable and aligned with the goal of obtaining accurate predictions. Along the way, we present algorithms for learning a multicalibrated predictor, study the computational complexity of this task, and illustrate tight connections to the agnostic learning model.
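
Algorithms in this line of work typically follow a "find a violated subgroup, patch it" loop: while predictions on some level set within some subgroup are miscalibrated, shift them toward the subgroup's empirical mean. A condensed sketch of such a loop (binning, tolerance, and iteration cap are illustrative, not the paper's exact parameters):

    import numpy as np

    def multicalibrate(p, y, subgroups, alpha=0.01, bins=10, iters=1000):
        """Iteratively patch predictions until every (subgroup, level set)
        cell is alpha-calibrated. `subgroups` is a list of boolean masks,
        one per set in the class C of computationally identifiable groups."""
        p = p.copy()
        for _ in range(iters):
            updated = False
            for S in subgroups:
                for b in range(bins):
                    cell = S & (np.floor(p * bins).clip(0, bins - 1) == b)
                    if cell.sum() == 0:
                        continue
                    gap = y[cell].mean() - p[cell].mean()
                    if abs(gap) > alpha:
                        p[cell] = np.clip(p[cell] + gap, 0, 1)  # patch cell
                        updated = True
            if not updated:   # every cell within tolerance: multicalibrated
                break
        return p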

FOCS Conference 2017 Conference Paper

Derandomization Beyond Connectivity: Undirected Laplacian Systems in Nearly Logarithmic Space

  • Jack Murtagh
  • Omer Reingold
  • Aaron Sidford
  • Salil P. Vadhan

We give a deterministic Õ(log n)-space algorithm for approximately solving linear systems given by Laplacians of undirected graphs, and consequently also approximating hitting times, commute times, and escape probabilities for undirected graphs. Previously, such systems were known to be solvable by randomized algorithms using O(log n) space (Doron, Le Gall, and Ta-Shma, 2017) and hence by deterministic algorithms using O(log 3/2 n) space (Saks and Zhou, FOCS 1995 and JCSS 1999). Our algorithm combines ideas from time-efficient Laplacian solvers (Spielman and Teng, STOC `04; Peng and Spielman, STOC `14) with ideas used to show that UNDIRECTED S-T CONNECTIVITY is in deterministic logspace (Reingold, STOC `05 and JACM `08; Rozenman and Vadhan, RANDOM `05).

ECAI Conference 2016 Conference Paper

Adaptive Condorcet-Based Stopping Rules Can Be Efficient

  • Omer Reingold
  • Nina Narodytska

A crowdsourcing project usually comprises many unit tasks known as Human Intelligence Tasks (HITs). As answers to each HIT vary between workers, each HIT is often contracted to more than one worker to obtain a sufficiently reliable and consistent answer. When implementing a project, an important design decision is how to formulate HITs and how to aggregate workers' answers. These decisions have a strong impact on the quality of the results and the cost of the elicitation process. One way to design an efficient elicitation procedure is to use adaptive stopping rules, which allow terminating the elicitation process as soon as a high-quality result is guaranteed.
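
A stopping rule of this kind can be as simple as: keep requesting answers for a HIT until the leading answer's margin over the runner-up is large enough, then stop paying for votes. The margin threshold below is an illustrative stand-in for the paper's Condorcet-based criteria:

    import random
    from collections import Counter

    def elicit_until_settled(ask_worker, margin=3, max_votes=25):
        """Request answers one at a time; stop once the top answer leads
        the runner-up by `margin` votes (or the budget runs out)."""
        votes = Counter()
        for _ in range(max_votes):
            votes[ask_worker()] += 1
            ranked = votes.most_common(2)
            lead = ranked[0][1] - (ranked[1][1] if len(ranked) > 1 else 0)
            if lead >= margin:
                break
        return votes.most_common(1)[0][0], sum(votes.values())

    # usage with a simulated noisy worker pool
    truth = "A"
    worker = lambda: truth if random.random() < 0.7 else random.choice("BC")
    answer, cost = elicit_until_settled(worker)
    print(answer, "after", cost, "votes")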

NeurIPS Conference 2015 Conference Paper

Generalization in Adaptive Data Analysis and Holdout Reuse

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toni Pitassi
  • Omer Reingold
  • Aaron Roth

Overfitting is the bane of data analysts, even when data are plentiful. Formal approaches to understanding this problem focus on statistical inference and generalization of individual analysis procedures. Yet the practice of data analysis is an inherently interactive and adaptive process: new analyses and hypotheses are proposed after seeing the results of previous ones, parameters are tuned on the basis of obtained results, and datasets are shared and reused. An investigation of this gap has recently been initiated by the authors in (Dwork et al., 2014), where we focused on the problem of estimating expectations of adaptively chosen functions. In this paper, we give a simple and practical method for reusing a holdout (or testing) set to validate the accuracy of hypotheses produced by a learning algorithm operating on a training set. Reusing a holdout set adaptively multiple times can easily lead to overfitting to the holdout set itself. We give an algorithm that enables the validation of a large number of adaptively chosen hypotheses, while provably avoiding overfitting. We illustrate the advantages of our algorithm over the standard use of the holdout set via a simple synthetic experiment. We also formalize and address the general problem of data reuse in adaptive data analysis. We show how the differential-privacy based approach in (Dwork et al., 2014) is applicable much more broadly to adaptive data analysis. We then show that a simple approach based on description length can also be used to give guarantees of statistical validity in adaptive settings. Finally, we demonstrate that these incomparable approaches can be unified via the notion of approximate max-information that we introduce. This, in particular, allows the preservation of statistical validity guarantees even when an analyst adaptively composes algorithms which have guarantees based on either of the two approaches.
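
The holdout-reuse mechanism the paper describes (Thresholdout) admits a compact rendering: answer adaptive queries from the training set whenever training and holdout estimates agree to within a noisy threshold, and spend holdout information only when they diverge. A sketch following that description; the parameter values are illustrative:

    import numpy as np

    class Thresholdout:
        """Reusable holdout for adaptively chosen statistics phi: x -> [0,1]."""

        def __init__(self, train, holdout, threshold=0.04, sigma=0.01):
            self.train, self.holdout = train, holdout
            self.T, self.sigma = threshold, sigma
            self.rng = np.random.default_rng(0)

        def query(self, phi):
            a = np.mean([phi(x) for x in self.train])
            b = np.mean([phi(x) for x in self.holdout])
            # touch the holdout only when the two estimates visibly diverge
            if abs(a - b) > self.T + self.rng.laplace(0, self.sigma):
                return b + self.rng.laplace(0, self.sigma)
            return a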

STOC Conference 2015 Conference Paper

Preserving Statistical Validity in Adaptive Data Analysis

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toniann Pitassi
  • Omer Reingold
  • Aaron Roth 0001

A great deal of effort has been devoted to reducing the risk of spurious scientific discoveries, from the use of sophisticated validation techniques, to deep statistical methods for controlling the false discovery rate in multiple hypothesis testing. However, there is a fundamental disconnect between the theoretical results and the practice of data analysis: the theory of statistical inference assumes a fixed collection of hypotheses to be tested, or learning algorithms to be applied, selected non-adaptively before the data are gathered, whereas in practice data is shared and reused with hypotheses and new analyses being generated on the basis of data exploration and the outcomes of previous analyses. In this work we initiate a principled study of how to guarantee the validity of statistical inference in adaptive data analysis. As an instance of this problem, we propose and investigate the question of estimating the expectations of m adaptively chosen functions on an unknown distribution given n random samples. We show that, surprisingly, there is a way to estimate an exponential in n number of expectations accurately even if the functions are chosen adaptively. This gives an exponential improvement over standard empirical estimators that are limited to a linear number of estimates. Our result follows from a general technique that counter-intuitively involves actively perturbing and coordinating the estimates, using techniques developed for privacy preservation. We give additional applications of this technique to our question.

FOCS Conference 2012 Conference Paper

Better Pseudorandom Generators from Milder Pseudorandom Restrictions

  • Parikshit Gopalan
  • Raghu Meka
  • Omer Reingold
  • Luca Trevisan 0001
  • Salil P. Vadhan

We present an iterative approach to constructing pseudorandom generators, based on the repeated application of mild pseudorandom restrictions. We use this template to construct pseudorandom generators for combinatorial rectangles and read-once CNFs and a hitting set generator for width-3 branching programs, all of which achieve near-optimal seed length even in the low-error regime: we get seed length Õ(log(n/ε)) for error ε. Previously, only constructions with seed length O(log^(3/2) n) or O(log^2 n) were known for these classes with error ε = 1/poly(n). The (pseudo)random restrictions we use are milder than those typically used for proving circuit lower bounds in that we only set a constant fraction of the bits at a time. While such restrictions do not simplify the functions drastically, we show that they can be derandomized using small-bias spaces.

FOCS Conference 2011 Conference Paper

Balls and Bins: Smaller Hash Families and Faster Evaluation

  • L. Elisa Celis
  • Omer Reingold
  • Gil Segev 0001
  • Udi Wieder

A fundamental fact in the analysis of randomized algorithms is that when n balls are hashed into n bins independently and uniformly at random, with high probability each bin contains at most O(log n / log log n) balls. In various applications, however, the assumption that a truly random hash function is available is not always valid, and explicit functions are required. In this paper we study the size of families (or, equivalently, the description length of their functions) that guarantee a maximal load of O(log n / log log n) with high probability, as well as the evaluation time of their functions. Whereas such functions must be described using Ω(log n) bits, the best upper bound was formerly O(log^2 n / log log n) bits, which is attained by O(log n / log log n)-wise independent functions. Traditional constructions of the latter offer an evaluation time of O(log n / log log n), which according to Siegel's lower bound [FOCS '89] can be reduced only at the cost of significantly increasing the description length. We construct two families that guarantee a maximal load of O(log n / log log n) with high probability. Our constructions are based on two different approaches, and exhibit different trade-offs between the description length and the evaluation time. The first construction shows that O(log n / log log n)-wise independence can in fact be replaced by "gradually increasing independence", resulting in functions that are described using O(log n log log n) bits and evaluated in time O(log n log log n). The second construction is based on derandomization techniques for space-bounded computations combined with a tailored construction of a pseudorandom generator, resulting in functions that are described using O(log^(3/2) n) bits and evaluated in time O(√(log n)). The latter can be compared to Siegel's lower bound stating that O(log n / log log n)-wise independent functions that are evaluated in time O(√(log n)) must be described using 2^(Ω(√(log n))) bits.
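
The O(log n / log log n) benchmark is easy to observe empirically by hashing n balls into n bins with a truly random function and recording the fullest bin; this is the baseline the constructions compete with. A quick simulation (sizes are illustrative):

    import numpy as np

    rng = np.random.default_rng(0)
    for n in (10**3, 10**5, 10**6):
        # throw n balls into n bins uniformly at random; count per-bin loads
        loads = np.bincount(rng.integers(0, n, size=n), minlength=n)
        bound = np.log(n) / np.log(np.log(n))
        print(f"n={n}: max load {loads.max()}, log n / log log n = {bound:.1f}")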

STOC Conference 2010 Conference Paper

Efficiency improvements in constructing pseudorandom generators from one-way functions

  • Iftach Haitner
  • Omer Reingold
  • Salil P. Vadhan

We give a new construction of pseudorandom generators from any one-way function. The construction achieves better parameters and is simpler than that given in the seminal work of Håstad, Impagliazzo, Levin, and Luby [SICOMP '99]. The key to our construction is a new notion of "next-block pseudoentropy", which is inspired by the notion of "inaccessible entropy" recently introduced in [Haitner, Reingold, Vadhan, Wee, STOC '09]. An additional advantage over previous constructions is that our pseudorandom generators are parallelizable and invoke the one-way function in a non-adaptive manner. Using [Applebaum, Ishai, Kushilevitz, SICOMP '06], this implies the existence of pseudorandom generators in NC^0 based on the existence of one-way functions in NC^1.

FOCS Conference 2010 Conference Paper

The Limits of Two-Party Differential Privacy

  • Andrew McGregor 0001
  • Ilya Mironov
  • Toniann Pitassi
  • Omer Reingold
  • Kunal Talwar
  • Salil P. Vadhan

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions).

STOC Conference 2009 Conference Paper

Inaccessible entropy

  • Iftach Haitner
  • Omer Reingold
  • Salil P. Vadhan
  • Hoeteck Wee

We put forth a new computational notion of entropy, which measures the (in)feasibility of sampling high-entropy strings that are consistent with a given protocol. Specifically, we say that the i'th round of a protocol (A,B) has *accessible entropy* at most k if no polynomial-time strategy A* can generate messages for A such that its message in the i'th round has entropy greater than k when conditioned both on prior messages of the protocol and on prior coin tosses of A*. We say that the protocol has *inaccessible entropy* if the total accessible entropy (summed over the rounds) is noticeably smaller than the real entropy of A's messages, conditioned only on prior messages (but not the coin tosses of A). As applications of this notion, we (1) give a much simpler and more efficient construction of statistically hiding commitment schemes from arbitrary one-way functions, and (2) prove that constant-round statistically hiding commitments are necessary for constructing constant-round zero-knowledge proof systems for NP that remain secure under parallel composition (assuming the existence of one-way functions).

FOCS Conference 2008 Conference Paper

Dense Subsets of Pseudorandom Sets

  • Omer Reingold
  • Luca Trevisan 0001
  • Madhur Tulsiani
  • Salil P. Vadhan

A theorem of Green, Tao, and Ziegler can be stated (roughly) as follows: if R is a pseudorandom set, and D is a dense subset of R, then D may be modeled by a set M that is dense in the entire domain such that D and M are indistinguishable. (The precise statement refers to "measures" or distributions rather than sets.) The proof of this theorem is very general, and it applies to notions of pseudorandomness and indistinguishability defined in terms of any family of distinguishers with some mild closure properties. The proof proceeds via iterative partitioning and an energy increment argument, in the spirit of the proof of the weak Szemerédi regularity lemma. The "reduction" involved in the proof has exponential complexity in the distinguishing probability. We present a new proof inspired by Nisan's proof of Impagliazzo's hardcore set theorem. The reduction in our proof has polynomial complexity in the distinguishing probability and provides a new characterization of the notion of "pseudoentropy" of a distribution. A proof similar to ours has also been independently discovered by Gowers [2]. We also follow the connection between the two theorems and obtain a new proof of Impagliazzo's hardcore set theorem via iterative partitioning and energy increment. While our reduction has exponential complexity in some parameters, it has the advantage that the hardcore set is efficiently recognizable.

FOCS Conference 2007 Conference Paper

Finding Collisions in Interactive Protocols - A Tight Lower Bound on the Round Complexity of Statistically-Hiding Commitments

  • Iftach Haitner
  • Jonathan J. Hoch
  • Omer Reingold
  • Gil Segev 0001

We study the round complexity of various cryptographic protocols. Our main result is a tight lower bound on the round complexity of any fully-black-box construction of a statistically-hiding commitment scheme from one-way permutations, and even from trapdoor permutations. This lower bound matches the round complexity of the statistically-hiding commitment scheme due to Naor, Ostrovsky, Venkatesan and Yung (CRYPTO '92). As a corollary, we derive similar tight lower bounds for several other cryptographic protocols, such as single-server private information retrieval, interactive hashing, and oblivious transfer that guarantees statistical security for one of the parties. Our techniques extend the collision-finding oracle due to Simon (EUROCRYPT '98) to the setting of interactive protocols (our extension also implies an alternative proof for the main property of the original oracle). In addition, we substantially extend the reconstruction paradigm of Gennaro and Trevisan (FOCS '00). In both cases, our extensions are quite delicate and may be found useful in proving additional black-box separation results.

STOC Conference 2006 Conference Paper

Pseudorandom walks on regular digraphs and the RL vs. L problem

  • Omer Reingold
  • Luca Trevisan 0001
  • Salil P. Vadhan

We revisit the general RL vs. L question, obtaining the following results. (1) Generalizing Reingold's techniques to directed graphs, we present a deterministic, log-space algorithm that, given a regular directed graph G (or, more generally, a digraph with Eulerian connected components) and two vertices s and t, finds a path between s and t if one exists. (2) If we restrict ourselves to directed graphs that are regular and consistently labelled, then we are able to produce pseudorandom walks for such graphs in logarithmic space (this result has already found an independent application). (3) We prove that if (2) could be generalized to all regular directed graphs (including ones that are not consistently labelled) then L = RL. We do so by exhibiting a new complete promise problem for RL, and showing that such a problem can be solved in deterministic logarithmic space given a log-space pseudorandom walk generator for regular directed graphs.

STOC Conference 2005 Conference Paper

Undirected ST-connectivity in log-space

  • Omer Reingold

We present a deterministic, log-space algorithm that solves st-connectivity in undirected graphs. The previous bound on the space complexity of undirected st-connectivity was O(log^(4/3) n), obtained by Armoni, Ta-Shma, Wigderson and Zhou [9]. As undirected st-connectivity is complete for the class of problems solvable by symmetric, non-deterministic, log-space computations (the class SL), this algorithm implies that SL = L (where L is the class of problems solvable by deterministic log-space computations). Independently of our work (and using different techniques), Trifonov [45] has presented an O(log n log log n)-space, deterministic algorithm for undirected st-connectivity. Our algorithm also implies a way to construct in log-space a fixed sequence of directions that guides a deterministic walk through all of the vertices of any connected graph. Specifically, we give log-space constructible universal-traversal sequences for graphs with restricted labelling and log-space constructible universal-exploration sequences for general graphs.

FOCS Conference 2004 Conference Paper

Assignment Testers: Towards a Combinatorial Proof of the PCP-Theorem

  • Irit Dinur
  • Omer Reingold

In this work, we look back into the proof of the PCP theorem, with the goal of finding new proofs that are "more combinatorial" and arguably simpler. For that, we introduce the notion of an assignment tester, which is a strengthening of the standard PCP verifier, in the following sense. Given a statement and an alleged proof for it, while the PCP verifier checks correctness of the statement, the assignment-tester checks correctness of the statement and the proof. This notion enables composition that is truly modular, i.e., one can compose two assignment-testers without any assumptions on how they are constructed. A related notion was independently introduced in (Ben-Sasson et al., STOC '04). We provide a toolkit of (non-trivial) generic transformations on assignment testers. These transformations may be interesting in their own right, and allow us to present the following two main results: 1. The first is a new proof of the PCP theorem. This proof relies on a rather weak assignment tester given as a "black box". From this, we construct combinatorially the full PCP. An important component of this proof is a new combinatorial aggregation technique (i.e., a new transformation that allows the verifier to read fewer, though possibly longer, "pieces" of the proof). An implementation of the black-box tester can be obtained from the algebraic proof techniques that already appear in L. Babai et al., 1991 and U. Feige et al., 1991. Obtaining a combinatorial implementation of this tester would give a purely combinatorial proof for the PCP theorem, which we view as an interesting open problem. 2. Our second construction is a "standalone" combinatorial construction showing NP ⊆ PCP (S. Arora et al., 1998). This implies, for example, that approximating max-SAT is quasi-NP-hard. This construction relies on a transformation that makes an assignment tester "oblivious": so that the proof locations read are independent of the statement that is being proven. This eliminates, in a rather surprising manner, the need for aggregation in a crucial point in the proof.

STOC Conference 2004 Conference Paper

Completeness in two-party secure computation: a computational view

  • Danny Harnik
  • Moni Naor
  • Omer Reingold
  • Alon Rosen

A Secure Function Evaluation (SFE) of a two-variable function f(·,·) is a protocol that allows two parties with inputs x and y to evaluate f(x,y) in a manner where neither party learns "more than is necessary". A rich body of work deals with the study of completeness for secure two-party computation. A function f is complete for SFE if a protocol for securely evaluating f allows the secure evaluation of all (efficiently computable) functions. The questions investigated are which functions are complete for SFE, which functions have SFE protocols unconditionally, and whether there are functions that are neither complete nor have efficient SFE protocols. The previous study of these questions was mainly conducted from an information-theoretic point of view and provided strong answers in the form of combinatorial properties. However, we show that there are major differences between the information-theoretic and computational settings. In particular, we show functions that are considered as having SFE unconditionally by the combinatorial criteria but are actually complete in the computational setting. We initiate the fully computational study of these fundamental questions. Somewhat surprisingly, we manage to provide an almost full characterization of the complete functions in this model as well. More precisely, we present a computational criterion (called computational row non-transitivity) for a function f to be complete for the asymmetric case. Furthermore, we show a matching criterion called computational row transitivity for f to have a simple SFE (based on no additional assumptions). This criterion is close to the negation of computational row non-transitivity, and thus we essentially characterize all "nice" functions as either complete or having SFE unconditionally.

FOCS Conference 2001 Conference Paper

On the Impossibility of Basing Trapdoor Functions on Trapdoor Predicates

  • Yael Gertner
  • Tal Malkin
  • Omer Reingold

We prove that, somewhat surprisingly, there is no black-box reduction of (poly-to-one) trapdoor functions to trapdoor predicates (equivalently, to public-key encryption schemes). Our proof follows the methodology that was introduced by R. Impagliazzo and S. Rudich (1989), although we use a new, weaker model of separation.

FOCS Conference 2000 Conference Paper

Entropy Waves, the Zig-Zag Graph Product, and New Constant-Degree Expanders and Extractors

  • Omer Reingold
  • Salil P. Vadhan
  • Avi Wigderson

The main contribution is a new type of graph product, which we call the zig-zag product. Taking a product of a large graph with a small graph, the resulting graph inherits (roughly) its size from the large one, its degree from the small one, and its expansion properties from both. Iteration yields simple explicit constructions of constant-degree expanders of every size, starting from one constant-size expander. Crucial to our intuition (and simple analysis) of the properties of this graph product is the view of expanders as functions which act as "entropy wave" propagators: they transform probability distributions in which entropy is concentrated in one area to distributions where that concentration is dissipated. In these terms, the graph product affords the constructive interference of two such waves. A variant of this product can be applied to extractors, giving the first explicit extractors whose seed length depends (poly)logarithmically on only the entropy deficiency of the source (rather than its length) and that extract almost all the entropy of high min-entropy sources. These high min-entropy extractors have several interesting applications, including the first constant-degree explicit expanders which beat the "eigenvalue bound".
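
The zig-zag product has a crisp algorithmic form in terms of rotation maps, where Rot_G(v, i) returns the i-th neighbor of v together with the label under which that edge returns to v. The sketch below transcribes the zig-step-zag definition; the toy input graphs (K5 and the 4-cycle) are assumptions for illustration:

    def zigzag(rot_G, rot_H):
        """Zig-zag product of a D-regular G (rotation map rot_G) with a
        d-regular H on D vertices (rotation map rot_H). Product vertices
        are pairs (v, a); edge labels are pairs (i, j) in [d] x [d]."""
        def rot(vertex, label):
            (v, a), (i, j) = vertex, label
            a1, i1 = rot_H(a, i)   # "zig": small step inside the cloud of v
            w, b1 = rot_G(v, a1)   # the big step along an edge of G
            b, j1 = rot_H(b1, j)   # "zag": small step inside the cloud of w
            return (w, b), (j1, i1)
        return rot

    # toy inputs: G = K5 (4-regular), H = the 4-cycle C4 (2-regular)
    rot_K5 = lambda v, i: ((v + 1 + i) % 5, (v - ((v + 1 + i) % 5) - 1) % 5)
    rot_C4 = lambda a, i: ((a + 1) % 4, 1) if i == 0 else ((a - 1) % 4, 0)

    rot_Z = zigzag(rot_K5, rot_C4)
    start = ((0, 0), (0, 1))
    step = rot_Z(*start)
    assert rot_Z(*step) == start   # a rotation map is an involution
    print(step)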

FOCS Conference 2000 Conference Paper

Extracting Randomness via Repeated Condensing

  • Omer Reingold
  • Ronen Shaltiel
  • Avi Wigderson

On an input probability distribution with some (min-)entropy, an extractor outputs a distribution with a (near) maximum entropy rate (namely the uniform distribution). A natural weakening of this concept is a condenser, whose output distribution has a higher entropy rate than the input distribution (without losing much of the initial entropy). We construct efficient explicit condensers. The condenser constructions combine (variants or more efficient versions of) ideas from several works, including the block extraction scheme of Nisan and Zuckerman (1996), the observation made by Srinivasan and Zuckerman (1994) and Nisan and Ta-Shma (1999) that a failure of the block extraction scheme is also useful, the recursive "win-win" case analysis of Impagliazzo et al. (1999, 2000), and the error correction of random sources used by Trevisan (1999). As a natural byproduct (via repeated iterating of condensers), we obtain new extractor constructions. The new extractors give significant qualitative improvements over previous ones for sources of arbitrary min-entropy; they are nearly optimal simultaneously in the two main parameters: seed length and output length. Specifically, our extractors can make either of these two parameters optimal (up to a constant factor), at only a poly-logarithmic loss in the other. Previous constructions require polynomial loss in both cases for general sources. We also give a simple reduction converting "standard" extractors (which are good for an average seed) to "strong" ones (which are good for most seeds), with essentially the same parameters.

FOCS Conference 2000 Conference Paper

The Relationship between Public Key Encryption and Oblivious Transfer

  • Yael Gertner
  • Sampath Kannan
  • Tal Malkin
  • Omer Reingold
  • Mahesh Viswanathan 0001

In this paper we study the relationships among some of the most fundamental primitives and protocols in cryptography: public-key encryption (i. e. trapdoor predicates), oblivious transfer (which is equivalent to general secure multi-party computation), key agreement and trapdoor permutations. Our main results show that public-key encryption and oblivious transfer are incomparable under black-box reductions. These separations are tightly matched by our positive results where a restricted (strong) version of one primitive does imply the other primitive. We also show separations between oblivious transfer and key agreement. Finally, we conclude that neither oblivious transfer nor trapdoor predicates imply trapdoor permutations. Our techniques for showing negative results follow the oracle separations of R. Impagliazzo and S. Rudich (1989).

FOCS Conference 1999 Conference Paper

Error Reduction for Extractors

  • Ran Raz
  • Omer Reingold
  • Salil P. Vadhan

An extractor is a function which extracts (almost) truly random bits from a weak random source, using a small number of additional random bits as a catalyst. We present a general method to reduce the error of any extractor. Our method works particularly well in the case that the original extractor extracts up to a constant fraction of the source min-entropy and achieves a polynomially small error. In that case, we are able to reduce the error to (almost) any ε, using only O(log(1/ε)) additional truly random bits (while keeping the other parameters of the original extractor more or less the same). In other cases (e.g., when the original extractor extracts all the min-entropy or achieves only a constant error), our method is not optimal but it is still quite efficient and leads to improved constructions of extractors. Using our method, we are able to improve almost all known extractors in the case where the error required is relatively small (e.g., less than a polynomially small error). In particular, we apply our method to the new extractors of L. Trevisan (1999) and R. Raz et al. (1999) to obtain improved constructions in almost all cases. Specifically, we obtain extractors that work for sources of any min-entropy on strings of length n which (a) extract any 1/n^γ fraction of the min-entropy using O(log n + log(1/ε)) truly random bits (for any γ > 0), (b) extract any constant fraction of the min-entropy using O(log^2 n + log(1/ε)) truly random bits, and (c) extract all the min-entropy using O(log^3 n + log n · log(1/ε)) truly random bits.

FOCS Conference 1999 Conference Paper

Magic Functions

  • Cynthia Dwork
  • Moni Naor
  • Omer Reingold
  • Larry J. Stockmeyer

In this paper we show that three apparently unrelated problems are in fact very closely related. We sketch these problems at a high level. The selective decommitment problem first arose in a slightly different form, selective decryption, in the context of Byzantine agreement, no later than 1985. Instead of seeing encryptions of plaintexts, the adversary is given commitments to the plaintexts. This problem is poorly understood even for strong-receiver commitments, which leak no information about the plaintext values information-theoretically. The second problem is in complexity theory: what can be proved in (a possibly weakened form of) zero-knowledge in a 3-round argument (an interactive proof in which the prover is polynomial-time bounded)? The third problem is cryptographic, and addresses a methodology suggested by Fiat and Shamir (1987) to construct a (non-interactive) signature scheme from any 3-round (not necessarily zero-knowledge) public-coin identification scheme.

FOCS Conference 1997 Conference Paper

Number-theoretic Constructions of Efficient Pseudo-random Functions

  • Moni Naor
  • Omer Reingold

We describe efficient constructions for various cryptographic primitives (both in private-key and in public-key cryptography). We show these constructions to be at least as secure as the decisional version of the Diffie-Hellman assumption or as the assumption that factoring is hard. Our major result is a new construction of pseudo-random functions such that computing their value at any given point involves two multiple products. This is much more efficient than previous proposals. Furthermore, these functions have the advantage of being in TC^0 (the class of functions computable by constant-depth circuits consisting of a polynomial number of threshold gates), which has several interesting applications. The simple algebraic structure of the functions implies additional features. In particular, we show a zero-knowledge proof for statements of the form "y = f_s(x)" and "y ≠ f_s(x)" given a commitment to a key s of a pseudo-random function f_s.
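
The pseudo-random function behind this construction is commonly presented as f_a(x) = g^(a_0 · ∏_{x_i=1} a_i) in a group where g has prime order q: evaluating it is one product of key elements followed by a single exponentiation. A toy instantiation (the parameters are far too small to be secure; illustration only):

    import random

    p, q, g = 23, 11, 2        # toy group: g has order q in Z_p* (q | p-1)

    def keygen(n, rng=random.Random(0)):
        """Key is n + 1 nonzero exponents a_0, ..., a_n modulo q."""
        return [rng.randrange(1, q) for _ in range(n + 1)]

    def nr_prf(key, x_bits):
        """f_a(x) = g^(a_0 * prod_{x_i = 1} a_i mod q) mod p."""
        e = key[0]
        for a_i, bit in zip(key[1:], x_bits):
            if bit:                # multiply in a_i only where x_i = 1
                e = e * a_i % q    # exponent arithmetic mod the group order
        return pow(g, e, p)

    key = keygen(8)
    print([nr_prf(key, [int(b) for b in f"{x:08b}"]) for x in range(4)])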

FOCS Conference 1995 Conference Paper

Synthesizers and Their Application to the Parallel Construction of Pseudo-Random Functions

  • Moni Naor
  • Omer Reingold

We present a new cryptographic primitive called a pseudo-random synthesizer and show how to use it in order to get a parallel construction of a pseudo-random function. We show an NC^1 implementation of pseudo-random synthesizers based on the RSA or the Diffie-Hellman assumptions. This yields the first parallel (NC^2) pseudo-random function and the only alternative to the original construction of Goldreich, Goldwasser and Micali (GGM). The security of our constructions is similar to the security of the underlying assumptions. We discuss the connection with problems in computational learning theory.