Arrow Research search

Author name cluster

Ronen Eldan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

9

STOC 2023 Conference Paper

An Optimal "It Ain't Over Till It's Over" Theorem

  • Ronen Eldan
  • Avi Wigderson
  • Pei Wu

We study the probability of Boolean functions with small max influence becoming constant under random restrictions. Let f be a Boolean function such that the variance of f is Ω(1) and all of its individual influences are bounded by τ. We show that when restricting all but a ρ = Ω(1/log(1/τ)) fraction of the coordinates, the restricted function remains nonconstant with overwhelming probability. This bound is essentially optimal, as witnessed by the tribes function, an OR of n/(C log n) disjoint ANDs of width C log n. We extend this to an anti-concentration result, showing that the restricted function has nontrivial variance with probability 1 − o(1). This gives a sharp version of the "it ain't over till it's over" theorem due to Mossel, O'Donnell, and Oleszkiewicz. Our proof is discrete and avoids the use of the invariance principle. We also show two consequences of the above result: (i) As a corollary, we prove that for a uniformly random input x, the block sensitivity of f at x is Ω(log(1/τ)) with probability 1 − o(1). This should be compared with the result of Kahn, Kalai and Linial, which implies that the average block sensitivity of f is Ω(log(1/τ)). (ii) Combining our proof with a well-known result due to O'Donnell, Saks, Schramm and Servedio, one can also conclude that if all but a ρ = Ω(1/√(log(1/τ))) fraction of the coordinates of a monotone function f are restricted, then the restricted function has decision tree complexity Ω(τ^(−Θ(ρ))) with probability Ω(1).
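The random-restriction operation at the heart of the result above can be sketched in a few lines. The snippet below is an illustration, not code from the paper: it restricts all but a ρ-fraction of the coordinates of a Boolean function, using a small tribes function as the example; all function names are ours.

```python
import random

def tribes(bits, width):
    """Tribes function: an OR of disjoint ANDs ("tribes") of the given width."""
    return any(all(bits[i:i + width]) for i in range(0, len(bits), width))

def random_restriction(f, n, rho, rng):
    """Leave each coordinate free with probability rho; fix the rest
    to uniformly random bits. Returns the restricted function together
    with the list of free coordinates."""
    free = [i for i in range(n) if rng.random() < rho]
    fixed = {i: rng.randint(0, 1) for i in range(n) if i not in free}
    def restricted(free_bits):
        bits = [0] * n
        for i, b in fixed.items():
            bits[i] = b
        for i, b in zip(free, free_bits):
            bits[i] = b
        return f(bits)
    return restricted, free
```

The theorem says that for functions with variance Ω(1) and max influence τ, keeping only a ρ = Ω(1/log(1/τ)) fraction of coordinates free already leaves the restricted function nonconstant with overwhelming probability.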

STOC 2023 Conference Paper

Noise Stability on the Boolean Hypercube via a Renormalized Brownian Motion

  • Ronen Eldan
  • Dan Mikulincer
  • Prasad Raghavendra

We consider a variant of the classical notion of noise on the Boolean hypercube which gives rise to a new approach to inequalities regarding noise stability. We use this approach to give a new proof of the Majority is Stablest theorem by Mossel, O'Donnell, and Oleszkiewicz, improving the dependence of the bound on the maximal influence of the function from logarithmic to polynomial. We also show that a variant of the conjecture by Courtade and Kumar regarding the most informative Boolean function, where the classical noise is replaced by our notion, holds true. Our approach is based on a stochastic construction that we call the renormalized Brownian motion, which facilitates the use of inequalities in Gaussian space in the analysis of Boolean functions.

FOCS 2022 Conference Paper

Localization Schemes: A Framework for Proving Mixing Bounds for Markov Chains (extended abstract)

  • Yuansi Chen
  • Ronen Eldan

Two recent and seemingly unrelated techniques for proving mixing bounds for Markov chains are: (i) the framework of Spectral Independence, introduced by Anari, Liu and Oveis Gharan, and its numerous extensions, which have given rise to several breakthroughs in the analysis of mixing times of discrete Markov chains, and (ii) the Stochastic Localization technique, which has proven useful in establishing mixing and expansion bounds both for log-concave measures and for measures on the discrete hypercube. In this paper, we introduce a framework which connects ideas from both techniques. Our framework unifies, simplifies and extends those two techniques. At its center is the concept of a "localization scheme" which, to every probability measure on some space $\Omega$, assigns a martingale of probability measures which "localize" in space as time evolves. As it turns out, to every such scheme corresponds a Markov chain, and many chains of interest appear naturally in this framework. This viewpoint provides tools for deriving mixing bounds for the dynamics through the analysis of the corresponding localization process. Generalizations of the concepts of Spectral Independence and Entropic Independence arise naturally from our definitions, and in particular we recover the main theorems in the spectral and entropic independence frameworks via simple martingale arguments (completely bypassing the need to use the theory of high-dimensional expanders). We demonstrate the strength of our proposed machinery by giving short and (arguably) simpler proofs of many mixing bounds in the recent literature.
In particular, we: (i) give the first O(n log n) bound for the mixing time of the hardcore model (of arbitrary degree) in the tree-uniqueness regime under Glauber dynamics, (ii) give the first optimal mixing bounds for Ising models in the uniqueness regime under any external fields, (iii) prove a KL-divergence decay bound for log-concave sampling via the Restricted Gaussian Oracle, which achieves optimal mixing under any exp(n)-warm start, and (iv) prove a logarithmic Sobolev inequality for near-critical ferromagnetic Ising models, recovering in a simple way a variant of a recent result by Bauerschmidt and Dagallier.
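As a concrete reference point for the Glauber dynamics appearing in results (i) and (ii), here is a minimal sketch of a single heat-bath update for an Ising model. This is the standard textbook chain, not the paper's machinery, and the names are ours.

```python
import math
import random

def glauber_step(spins, neighbors, beta, field, rng):
    """One step of single-site Glauber (heat-bath) dynamics for an Ising
    model: pick a site uniformly at random, then resample its +/-1 spin
    from its conditional distribution given the neighboring spins."""
    i = rng.randrange(len(spins))
    local = field + sum(spins[j] for j in neighbors[i])
    p_plus = 1.0 / (1.0 + math.exp(-2.0 * beta * local))  # P(spin_i = +1 | rest)
    spins[i] = 1 if rng.random() < p_plus else -1
```

The mixing bounds above concern how many such steps are needed before the chain's distribution is close to the Ising measure.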

STOC 2020 Conference Paper

Concentration on the Boolean hypercube via pathwise stochastic analysis

  • Ronen Eldan
  • Renan Gross

We develop a new technique for proving concentration inequalities that relate the variance and influences of Boolean functions. Using this technique, we strengthen several classical inequalities concerning the influences of a Boolean function, showing that near-maximizers must have large vertex boundaries. An inequality due to Talagrand states that for a Boolean function f, Var(f) ≤ C ∑_{i=1}^{n} Inf_i(f) / (1 + log(1/Inf_i(f))). We give a lower bound for the size of the vertex boundary of functions saturating this inequality. As a corollary, we show that for sets that satisfy the edge-isoperimetric inequality or the Kahn-Kalai-Linial inequality up to a constant, a constant proportion of the mass is in the inner vertex boundary. Lastly, we improve a quantitative relation between influences and noise stability given by Keller and Kindler. Our proofs rely on techniques based on stochastic calculus, and bypass the use of hypercontractivity common to previous proofs.
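The influences Inf_i(f) entering Talagrand's inequality above have a direct combinatorial definition; the following brute-force snippet (ours, for illustration on small n) computes them exactly.

```python
from itertools import product

def influences(f, n):
    """Exact influences of f: {0,1}^n -> {0,1}, where Inf_i(f) is the
    probability, over a uniform input x, that flipping bit i changes f(x)."""
    counts = [0] * n
    for x in product((0, 1), repeat=n):
        fx = f(x)
        for i in range(n):
            y = list(x)
            y[i] ^= 1  # flip coordinate i
            if f(tuple(y)) != fx:
                counts[i] += 1
    return [c / 2 ** n for c in counts]
```

For the dictator f(x) = x_1 the first influence is 1 and the rest are 0, while for majority on three bits every influence is 1/2.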

NeurIPS 2020 Conference Paper

Network size and size of the weights in memorization with two-layers neural networks

  • Sebastien Bubeck
  • Ronen Eldan
  • Yin Tat Lee
  • Dan Mikulincer

In 1988, Eric B. Baum showed that two-layer neural networks with threshold activation can perfectly memorize the binary labels of $n$ points in general position in $\mathbb{R}^d$ using only $\lceil n/d \rceil$ neurons. We observe that with ReLU networks, using four times as many neurons one can fit arbitrary real labels. Moreover, for approximate memorization up to error $\epsilon$, the neural tangent kernel can also memorize with only $O\left(\frac{n}{d} \cdot \log(1/\epsilon)\right)$ neurons (assuming that the data is well dispersed). We show, however, that these constructions give rise to networks where the magnitude of the neurons' weights is far from optimal. In contrast, we propose a new training procedure for ReLU networks, based on complex (as opposed to real) recombination of the neurons, for which we show approximate memorization with both $O\left(\frac{n}{d} \cdot \frac{\log(1/\epsilon)}{\epsilon}\right)$ neurons and nearly optimal size of the weights.

STOC 2017 Conference Paper

Kernel-based methods for bandit convex optimization

  • Sébastien Bubeck
  • Yin Tat Lee
  • Ronen Eldan

We consider the adversarial convex bandit problem and build the first poly(T)-time algorithm with poly(n)√T-regret for this problem. To do so we introduce three new ideas in the derivative-free optimization literature: (i) kernel methods, (ii) a generalization of Bernoulli convolutions, and (iii) a new annealing schedule for exponential weights (with increasing learning rate). The basic version of our algorithm achieves Õ(n^{9.5}√T)-regret, and we show that a simple variant of this algorithm can be run in poly(n log(T))-time per step at the cost of an additional poly(n) T^{o(1)} factor in the regret. These results improve upon the Õ(n^{11}√T)-regret and exp(poly(T))-time result of the first two authors, and the log(T)^{poly(n)}√T-regret and log(T)^{poly(n)}-time result of Hazan and Li. Furthermore, we conjecture that another variant of the algorithm could achieve Õ(n^{1.5}√T)-regret, and moreover that this regret is unimprovable (the current best lower bound is Ω(n√T), achieved with linear functions). For the simpler setting of zeroth-order stochastic convex optimization this corresponds to the conjecture that the optimal query complexity is of order n^3/ε².

NeurIPS 2015 Conference Paper

Bandit Smooth Convex Optimization: Improving the Bias-Variance Tradeoff

  • Ofer Dekel
  • Ronen Eldan
  • Tomer Koren

Bandit convex optimization is one of the fundamental problems in the field of online learning. The best algorithm for the general bandit convex optimization problem guarantees a regret of $\widetilde{O}(T^{5/6})$, while the best known lower bound is $\Omega(T^{1/2})$. Many attempts have been made to bridge the huge gap between these bounds. A particularly interesting special case of this problem assumes that the loss functions are smooth. In this case, the best known algorithm guarantees a regret of $\widetilde{O}(T^{2/3})$. We present an efficient algorithm for the bandit smooth convex optimization problem that guarantees a regret of $\widetilde{O}(T^{5/8})$. Our result rules out an $\Omega(T^{2/3})$ lower bound and takes a significant step towards the resolution of this open problem.

NeurIPS 2015 Conference Paper

Finite-Time Analysis of Projected Langevin Monte Carlo

  • Sebastien Bubeck
  • Ronen Eldan
  • Joseph Lehec

We analyze the projected Langevin Monte Carlo (LMC) algorithm, a close cousin of projected Stochastic Gradient Descent (SGD). We show that LMC allows sampling in polynomial time from a posterior distribution restricted to a convex body with concave log-likelihood. This gives the first Markov chain to sample from a log-concave distribution using a first-order oracle, as the existing chains with provable guarantees (lattice walk, ball walk and hit-and-run) require a zeroth-order oracle. Our proof uses elementary concepts from stochastic calculus which could be useful more generally for understanding SGD and its variants.
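A minimal sketch of the projected LMC iteration described above, under the assumption that we can evaluate the gradient of the log-density (the first-order oracle) and project onto the convex body. The function names and the toy projection are ours, not the paper's.

```python
import numpy as np

def projected_lmc(grad_log_density, project, x0, eta, n_steps, rng):
    """Projected Langevin Monte Carlo: a gradient step on the log-density
    plus Gaussian noise of scale sqrt(2*eta), followed by a Euclidean
    projection back onto the convex body."""
    x = np.asarray(x0, dtype=float)
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x + eta * grad_log_density(x) + np.sqrt(2.0 * eta) * noise
        x = project(x)  # keep the iterate inside the convex body
    return x
```

For instance, sampling a standard Gaussian restricted to the unit ball would use grad_log_density(x) = -x and the projection x ↦ x / max(1, ‖x‖).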

FOCS 2015 Conference Paper

Talagrand's Convolution Conjecture on Gaussian Space

  • Ronen Eldan
  • James R. Lee

Smoothing properties of the noise operator on the discrete cube and on Gaussian space have played a pivotal role in many fields. In particular, these smoothing effects have seen a broad range of applications in theoretical computer science. We exhibit new regularization properties of the noise operator on Gaussian space. More specifically, we show that the mass on level sets of a probability density decays uniformly under the Ornstein-Uhlenbeck semigroup. This confirms positively the Gaussian case of Talagrand's convolution conjecture (1989) on the discrete cube. A major theme is our use of an Itô process (the "Föllmer drift") which can be seen as an entropy-optimal coupling between the Gaussian measure and another given measure on Gaussian space. To analyze this process, we employ stochastic calculus and Girsanov's change of measure formula. The ideas and tools employed here provide a new perspective on hypercontractivity in Gaussian space and on the discrete cube. In particular, our work gives a new way of studying "small" sets in product spaces (e.g., sets of size 2^{o(n)} in the discrete cube) using a form of regularized online gradient descent.
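The Ornstein-Uhlenbeck semigroup acting in the result above can be written as P_t f(x) = E[f(e^{-t} x + √(1 − e^{-2t}) Z)] with Z a standard Gaussian. The snippet below is an illustrative Monte Carlo estimate of this formula (ours, not from the paper).

```python
import numpy as np

def ou_semigroup(f, x, t, n_samples, rng):
    """Monte Carlo estimate of the Ornstein-Uhlenbeck semigroup
    P_t f(x) = E[f(exp(-t) * x + sqrt(1 - exp(-2t)) * Z)],
    with Z a standard Gaussian; t = 0 recovers f(x) itself."""
    x = np.asarray(x, dtype=float)
    z = rng.standard_normal((n_samples,) + x.shape)
    pts = np.exp(-t) * x + np.sqrt(1.0 - np.exp(-2.0 * t)) * z
    return float(np.mean([f(p) for p in pts]))
```

As t grows, P_t f(x) tends to the Gaussian mean of f regardless of x, which is the smoothing effect the regularization results quantify.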