Author name cluster

Nima Anari

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers

2 author rows

ICML Conference 2025 Conference Paper

Diffusion Models are Secretly Exchangeable: Parallelizing DDPMs via Auto Speculation

Hengyuan Hu
Aniket Das
Dorsa Sadigh
Nima Anari

Denoising Diffusion Probabilistic Models (DDPMs) have emerged as powerful tools for generative modeling. However, their sequential computation requirements lead to significant inference-time bottlenecks. In this work, we utilize the connection between DDPMs and Stochastic Localization to prove that, under an appropriate reparametrization, the increments of DDPM satisfy an exchangeability property. This general insight enables near-black-box adaptation of various performance optimization techniques from autoregressive models to the diffusion setting. To demonstrate this, we introduce Autospeculative Decoding (ASD), an extension of the widely used speculative decoding algorithm to DDPMs that does not require any auxiliary draft models. Our theoretical analysis shows that ASD achieves a $\tilde{O}(K^{\frac{1}{3}})$ parallel runtime speedup over the $K$ step sequential DDPM. We also demonstrate that a practical implementation of autospeculative decoding accelerates DDPM inference significantly in various domains.

STOC Conference 2024 Conference Paper

Parallel Sampling via Counting

Nima Anari
Ruiquan Gao 0001
Aviad Rubinstein

STOC Conference 2024 Conference Paper

Trickle-Down in Localization Schemes and Applications

Nima Anari
Frederic Koehler
Thuy-Duong Vuong

SODA Conference 2024 Conference Paper

Universality of Spectral Independence with Applications to Fast Mixing in Spin Glasses

Nima Anari
Vishesh Jain
Frederic Koehler
Huy Tuan Pham
Thuy-Duong Vuong

STOC Conference 2023 Conference Paper

Parallel Discrete Sampling via Continuous Walks

Nima Anari
Yizhi Huang
Tianyu Liu
Thuy-Duong Vuong
Brian Xu
Katherine Yu

NeurIPS Conference 2023 Conference Paper

Parallel Sampling of Diffusion Models

Andy Shih
Suneel Belkhale
Stefano Ermon
Dorsa Sadigh
Nima Anari

Diffusion models are powerful generative models but suffer from slow sampling, often taking 1000 sequential denoising steps for one sample. As a result, considerable efforts have been directed toward reducing the number of denoising steps, but these methods hurt sample quality. Instead of reducing the number of denoising steps (trading quality for speed), in this paper we explore an orthogonal approach: can we run the denoising steps in parallel (trading compute for speed)? In spite of the sequential nature of the denoising steps, we show that surprisingly it is possible to parallelize sampling via Picard iterations, by guessing the solution of future denoising steps and iteratively refining until convergence. With this insight, we present ParaDiGMS, a novel method to accelerate the sampling of pretrained diffusion models by denoising multiple steps in parallel. ParaDiGMS is the first diffusion sampling method that enables trading compute for speed and is even compatible with existing fast sampling techniques such as DDIM and DPMSolver. Using ParaDiGMS, we improve sampling speed by 2-4x across a range of robotics and image generation models, giving state-of-the-art sampling speeds of 0. 2s on 100-step DiffusionPolicy and 14. 6s on 1000-step StableDiffusion-v2 with no measurable degradation of task reward, FID score, or CLIP score.

STOC Conference 2022 Conference Paper

Entropic independence: optimal mixing of down-up random walks

Nima Anari
Vishesh Jain
Frederic Koehler
Huy Tuan Pham
Thuy-Duong Vuong

We introduce a notion called entropic independence that is an entropic analog of spectral notions of high-dimensional expansion. Informally, entropic independence of a background distribution µ on k -sized subsets of a ground set of elements says that for any (possibly randomly chosen) set S , the relative entropy of a single element of S drawn uniformly at random carries at most O (1/ k ) fraction of the relative entropy of S . Entropic independence is the analog of the notion of spectral independence, if one replaces variance by entropy. We use entropic independence to derive tight mixing time bounds, overcoming the lossy nature of spectral analysis of Markov chains on exponential-sized state spaces. In our main technical result, we show a general way of deriving entropy contraction, a.k.a. modified log-Sobolev inequalities, for down-up random walks from spectral notions. We show that spectral independence of a distribution under arbitrary external fields automatically implies entropic independence. We furthermore extend our theory to the case where spectral independence does not hold under arbitrary external fields. To do this, we introduce a framework for obtaining tight mixing time bounds for Markov chains based on what we call restricted modified log-Sobolev inequalities, which guarantee entropy contraction not for all distributions, but for those in a sufficiently large neighborhood of the stationary distribution. To derive our results, we relate entropic independence to properties of polynomials: µ is entropically independent exactly when a transformed version of the generating polynomial of µ is upper bounded by its linear tangent; this property is implied by concavity of the said transformation, which was shown by prior work to be locally equivalent to spectral independence. We apply our results to obtain (1) tight modified log-Sobolev inequalities and mixing times for multi-step down-up walks on fractionally log-concave distributions, (2) the tight mixing time of O ( n log n ) for Glauber dynamics on Ising models whose interaction matrix has eigenspectrum lying within an interval of length smaller than 1, improving upon the prior quadratic dependence on n , and (3) nearly-linear time O δ ( n ) samplers for the hardcore and Ising models on n -node graphs that have δ-relative gap to the tree-uniqueness threshold. In the last application, our bound on the running time does not depend on the maximum degree Δ of the graph, and is therefore optimal even for high-degree graphs, and in fact, is sublinear in the size of the graph for high-degree graphs.

FOCS Conference 2022 Conference Paper

Optimal Sublinear Sampling of Spanning Trees and Determinantal Point Processes via Average-Case Entropic Independence

Nima Anari
Yang P. Liu
Thuy-Duong Vuong

We design fast algorithms for repeatedly sampling from strongly Rayleigh distributions, which include as special cases random spanning tree distributions and determinantal point processes. For a graph $G=(V, \ E)$, we show how to approximately sample uniformly random spanning trees from G in $O(|V|)$ 1 time per sample after an initial $O(|E|)$ time preprocessing. This is the first nearly-linear runtime in the output size, which is clearly optimal. For a determinantal point process on k-sized subsets of a ground set of n elements, defined via an $n\times n$ kernel matrix, we show how to approximately sample in ${\widetilde{O}}(k^{\omega})$ time after an initial ${\widetilde{O}}(nk^{\omega-1})$ time preprocessing, where $\omega\lt 2. 372864$ is the matrix multiplication exponent. The time to compute just the weight of the output set is simply $\simeq k^{\omega}$, a natural barrier that suggests our runtime might be optimal for determinantal point processes as well. As a corollary, we even improve the state of the art for obtaining a single sample from a determinantal point process, from the prior runtime of ${\widetilde{O}}(\min\{nk^{2}, \ n^{\omega}\})$ to ${\widetilde{O}}(nk^{\omega-1})$. In our main technical result, we achieve the optimal limit on domain sparsification for strongly Rayleigh distributions. In domain sparsification, sampling from a distribution $\mu$ on $\binom{[n]}{k}$ is reduced to sampling from related distributions on $\binom{[t]}{k}$ for $t\ll n$. We show that for strongly Rayleigh distributions, the domain size can be reduced to nearly linear in the output size $t={\widetilde{O}}(k)$, improving the state of the art from $t={\widetilde{O}}(k^{2})$ for general strongly Rayleigh distributions and the more specialized $t={\widetilde{O}}(k^{15})$ for sBanning tree distributions. Our reduction involves sampling from ${\widetilde{O}}(1)$ domain-sparsified distributions, all of which can be produced efficiently assuming approximate overestimates for marginals of $\mu$ are known and stored in a convenient data structure. Having access to marginals is the discrete analog of having access to the mean and covariance of a continuous distribution, or equivalently knowing “isotropy” for the distribution, the key behind optimal samplers in the continuous setting based on the famous Kannan-Lovász-Simonovits (KLS) conjecture. We view our result as analogous in spirit to the KLS conjecture and its consequences for sampling, but rather for discrete strongly Rayleigh measures. 1 Throughout, ${\widetilde{O}}(\cdot)$ hides polylogarithmic factors in n.

STOC Conference 2021 Conference Paper

Fractionally log-concave and sector-stable polynomials: counting planar matchings and more

Yeganeh Alimohammadi
Nima Anari
Kirankumar Shiragur
Thuy-Duong Vuong

STOC Conference 2021 Conference Paper

Log-concave polynomials in theory and applications (tutorial)

Nima Anari
Cynthia Vinzant

STOC Conference 2021 Conference Paper

Log-concave polynomials IV: approximate exchange, tight mixing times, and near-optimal sampling of forests

Nima Anari
Kuikui Liu
Shayan Oveis Gharan
Cynthia Vinzant
Thuy-Duong Vuong

NeurIPS Conference 2020 Conference Paper

Instance Based Approximations to Profile Maximum Likelihood

Nima Anari
Moses Charikar
Kirankumar Shiragur
Aaron Sidford

In this paper we provide a new efficient algorithm for approximately computing the profile maximum likelihood (PML) distribution, a prominent quantity in symmetric property estimation. We provide an algorithm which matches the previous best known efficient algorithms for computing approximate PML distributions and improves when the number of distinct observed frequencies in the given instance is small. We achieve this result by exploiting new sparsity structure in approximate PML distributions and providing a new matrix rounding algorithm, of independent interest. Leveraging this result, we obtain the first provable computationally efficient implementation of PseudoPML, a general framework for estimating a broad class of symmetric properties. Additionally, we obtain efficient PML-based estimators for distributions with small profile entropy, a natural instance-based complexity measure. Further, we provide a simpler and more practical PseudoPML implementation that matches the best-known theoretical guarantees of such an estimator and evaluate this method empirically.

FOCS Conference 2020 Conference Paper

Isotropy and Log-Concave Polynomials: Accelerated Sampling and High-Precision Counting of Matroid Bases

Nima Anari
Michal Derezinski

We define a notion of isotropy for discrete set distributions. If μ is a distribution over subsets S of a ground set [ n], we say that μ is in isotropic position if \mathbbPS ~ μ[e ∈ S] is the same for all e ∈ [n]. We design a new approximate sampling algorithm that leverages isotropy for the class of distributions μ that have a log-concave generating polynomial; this class includes determinantal point processes, strongly Rayleigh distributions, and uniform distributions over matroid bases. We show that when μ is in approximately isotropic position, the running time of our algorithm depends polynomially on the size of the set S, and only logarithmically on n. When n is much larger than the size of S, this is significantly faster than prior algorithms, and can even be sublinear in n. We then show how to transform a non-isotropic μ into an equivalent approximately isotropic form with a polynomial-time pre-processing step, accelerating subsequent sampling times. The main new ingredient enabling our algorithms is a class of negative dependence inequalities that may be of independent interest. As an application of our results, we show how to approximately count bases of a matroid of rank k over a ground set of n elements to within a factor of 1+ε in time O((n+1/ε 2 ) ·poly(k, logn)). This is the first algorithm that runs in nearly linear time for fixed rank k, and achieves an inverse polynomially low approximation error. The full version of this paper is available at: https: //arxiv. org/abs/2004. 09079.

FOCS Conference 2020 Conference Paper

Spectral Independence in High-Dimensional Expanders and Applications to the Hardcore Model

Nima Anari
Kuikui Liu
Shayan Oveis Gharan

We say a discrete probability distribution over subsets of a finite ground set is spectrally independent if an associated pairwise influence matrix has a bounded largest eigenvalue for the distribution and all of its conditional distributions. We prove that if a distribution is spectrally independent, then the corresponding high dimensional simplicial complex is a local spectral expander. Using a line of recent works on mixing time of high dimensional walks on simplicial complexes [KM17]; [DK17]; [KO18]; [AL20], this implies that the corresponding Glauber dynamics mixes rapidly and generates (approximate) samples from the given distribution. As an application, we show that natural Glauber dynamics mixes rapidly (in polynomial time) to generate a random independent set from the hardcore model up to the uniqueness threshold. This improves the quasi-polynomial running time of Weitz's deterministic correlation decay algorithm [Wei06] for estimating the hardcore partition function, also answering a long-standing open problem of mixing time of Glauber dynamics [LV97]; [LV99]; [DG00]; [Vig01]; [Eft+16].

FOCS Conference 2019 Conference Paper

A Tight Analysis of Bethe Approximation for Permanent

Nima Anari
Alireza Rezaei 0001

We prove that the permanent of nonnegative matrices can be deterministically approximated within a factor of square root 2 n in polynomial time, improving upon the previous deterministic approximations. We show this by proving that the Bethe approximation of the permanent, a quantity computable in polynomial time, is at least as large as the permanent divided by square root 2 n. This resolves a conjecture of Gurvits [21]. Our bound is tight, and when combined with previously known inequalities lower bounding the permanent, fully resolves the quality of Bethe approximation for permanent.

STOC Conference 2019 Conference Paper

Log-concave polynomials II: high-dimensional walks and an FPRAS for counting bases of a matroid

Nima Anari
Kuikui Liu
Shayan Oveis Gharan
Cynthia Vinzant

SODA Conference 2018 Conference Paper

Approximating the Largest Root and Applications to Interlacing Families

Nima Anari
Shayan Oveis Gharan
Amin Saberi
Nikhil Srivastava

We study the problem of approximating the largest root of a real-rooted polynomial of degree n using its top k coefficients and give nearly matching upper and lower bounds. We present algorithms with running time polynomial in k that use the top k coefficients to approximate the maximum root within a factor of n 1/ k and when k ≤ log n and k > log n respectively. We also prove corresponding information-theoretic lower bounds of n Ω(1/ k ) and, and show strong lower bounds for noisy version of the problem in which one is given access to approximate coefficients. This problem has applications in the context of the method of interlacing families of polynomials, which was used for proving the existence of Ramanujan graphs of all degrees, the solution of the Kadison-Singer problem, and bounding the integrality gap of the asymmetric traveling salesman problem. All of these involve computing the maximum root of certain real-rooted polynomials for which the top few coefficients are accessible in subexponential time. Our results yield an algorithm with the running time of for all of them.

FOCS Conference 2018 Conference Paper

Log-Concave Polynomials, Entropy, and a Deterministic Approximation Algorithm for Counting Bases of Matroids

Nima Anari
Shayan Oveis Gharan
Cynthia Vinzant

We give a deterministic polynomial time 2^O(r)-approximation algorithm for the number of bases of a given matroid of rank r and the number of common bases of any two matroids of rank r. To the best of our knowledge, this is the first nontrivial deterministic approximation algorithm that works for arbitrary matroids. Based on a lower bound of Azar, Broder, and Frieze this is almost the best possible assuming oracle access to independent sets of the matroid. There are two main ingredients in our result: For the first, we build upon recent results of Adiprasito, Huh, and Katz and Huh and Wang on combinatorial hodge theory to derive a connection between matroids and log-concave polynomials. We expect that several new applications in approximation algorithms will be derived from this connection in future. Formally, we prove that the multivariate generating polynomial of the bases of any matroid is log-concave as a function over the positive orthant. For the second ingredient, we develop a general framework for approximate counting in discrete problems, based on convex optimization. The connection goes through subadditivity of the entropy. For matroids, we prove that an approximate superadditivity of the entropy holds by relying on the log-concavity of the corresponding polynomials.

SODA Conference 2018 Conference Paper

Nash Social Welfare for Indivisible Items under Separable, Piecewise-Linear Concave Utilities

Nima Anari
Tung Mai
Shayan Oveis Gharan
Vijay V. Vazirani

Recently Cole and Gkatzelis [10] gave the first constant factor approximation algorithm for the problem of allocating indivisible items to agents, under additive valuations, so as to maximize the Nash social welfare (NSW). We give constant factor algorithms for a substantial generalization of their problem – to the case of separable, piecewise-linear concave utility functions. We give two such algorithms, the first using market equilibria and the second using the theory of real stable polynomials. Both approaches require new algorithmic ideas.

FOCS Conference 2018 Conference Paper

Planar Graph Perfect Matching Is in NC

Nima Anari
Vijay V. Vazirani

Is perfect matching in NC? That is, is there a deterministic fast parallel algorithm for it? This has been an outstanding open question in theoretical computer science for over three decades, ever since the discovery of RNC matching algorithms. Within this question, the case of planar graphs has remained an enigma: On the one hand, counting the number of perfect matchings is far harder than finding one (the former is #P-complete and the latter is in P), and on the other, for planar graphs, counting has long been known to be in NC whereas finding one has resisted a solution. In this paper, we give an NC algorithm for finding a perfect matching in a planar graph. Our algorithm uses the above-stated fact about counting matchings in a crucial way. Our main new idea is an NC algorithm for finding a face of the perfect matching polytope at which many new conditions, involving constraints of the polytope, are simultaneously satisfied. Several other ideas are also needed, such as finding a point in the interior of the minimum weight face of this polytope and finding a balanced tight odd set in NC.

NeurIPS Conference 2018 Conference Paper

Smoothed Analysis of Discrete Tensor Decomposition and Assemblies of Neurons

Nima Anari
Constantinos Daskalakis
Wolfgang Maass
Christos Papadimitriou
Amin Saberi
Santosh Vempala

We analyze linear independence of rank one tensors produced by tensor powers of randomly perturbed vectors. This enables efficient decomposition of sums of high-order tensors. Our analysis builds upon [BCMV14] but allows for a wider range of perturbation models, including discrete ones. We give an application to recovering assemblies of neurons. Assemblies are large sets of neurons representing specific memories or concepts. The size of the intersection of two assemblies has been shown in experiments to represent the extent to which these memories co-occur or these concepts are related; the phenomenon is called association of assemblies. This suggests that an animal's memory is a complex web of associations, and poses the problem of recovering this representation from cognitive data. Motivated by this problem, we study the following more general question: Can we reconstruct the Venn diagram of a family of sets, given the sizes of their l-wise intersections? We show that as long as the family of sets is randomly perturbed, it is enough for the number of measurements to be polynomially larger than the number of nonempty regions of the Venn diagram to fully reconstruct the diagram.

STOC Conference 2017 Conference Paper

A generalization of permanent inequalities and applications in counting and optimization

Nima Anari
Shayan Oveis Gharan

FOCS Conference 2017 Conference Paper

Simply Exponential Approximation of the Permanent of Positive Semidefinite Matrices

Nima Anari
Leonid Gurvits
Shayan Oveis Gharan
Amin Saberi

We design a deterministic polynomial time cn approximation algorithm for the permanent of positive semidefinite matrices where c = γ+1 ≃ 4: 84. We write a natural convex relaxation and show that its optimum solution gives a cn approximation of the permanent. We further show that this factor is asymptotically tight by constructing a family of positive semidefinite matrices. We also show that our result implies an approximate version of the permanent-ontop conjecture, which was recently refuted in its original form; we show that the permanent is within a cn factor of the top eigenvalue of the Schur power matrix.

FOCS Conference 2015 Conference Paper

Effective-Resistance-Reducing Flows, Spectrally Thin Trees, and Asymmetric TSP

Nima Anari
Shayan Oveis Gharan

We show that the integrality gap of the natural LP relaxation of the Asymmetric Traveling Salesman Problem is polyloglog(n). In other words, there is a polynomial time algorithm that approximates the value of the optimum tour within a factor of polyloglog(n), where polyloglog(n) is a bounded degree polynomial of loglog(n). We prove this by showing that any k-edge-connected unweighted graph has a polyloglog(n)/k-thin spanning tree. Our main new ingredient is a procedure, albeit an exponentially sized convex program, that “transforms” graphs that do not admit any spectrally thin trees into those that provably have spectrally thin trees. More precisely, given a k-edge-connected graph G = (V, E) where k ≥ 7 log(n), we show that there is a matrix D that “preserves” the structure of all cuts of G such that for a set F ⊆ E that induces an Ω(k)-edge-connected graph, the effective resistance of every edge in F w. r. t. D is at most polylog(k)/k. Then, we use our extension of the seminal work of Marcus, Spielman, and Srivastava [1], fully explained in [2], to prove the existence of a polylog(k)/k-spectrally thin tree with respect to D. Such a tree is polylog(k)/k-combinatorially thin with respect to G as D preserves the structure of cuts of G.

FOCS Conference 2014 Conference Paper

Mechanism Design for Crowdsourcing: An Optimal 1-1/e Competitive Budget-Feasible Mechanism for Large Markets

Nima Anari
Gagan Goel
Afshin Nikzad

In this paper we consider a mechanism design problem in the context of large-scale crowdsourcing markets such as Amazon's Mechanical Turk mturk, ClickWorker clickworker, CrowdFlower crowdflower. In these markets, there is a requester who wants to hire workers to accomplish some tasks. Each worker is assumed to give some utility to the requester on getting hired. Moreover each worker has a minimum cost that he wants to get paid for getting hired. This minimum cost is assumed to be private information of the workers. The question then is -- if the requester has a limited budget, how to design a direct revelation mechanism that picks the right set of workers to hire in order to maximize the requester's utility? We note that although the previous work (Singer (2010) chen et al. (2011)) has studied this problem, a crucial difference in which we deviate from earlier work is the notion of large-scale markets that we introduce in our model. Without the large market assumption, it is known that no mechanism can achieve a competitive ratio better than 0. 414 and 0. 5 for deterministic and randomized mechanisms respectively (while the best known deterministic and randomized mechanisms achieve an approximation ratio of 0. 292 and 0. 33 respectively). In this paper, we design a budget-feasible mechanism for large markets that achieves a competitive ratio of 1 - 1/e ≃ 0. 63. Our mechanism can be seen as a generalization of an alternate way to look at the proportional share mechanism, which is used in all the previous works so far on this problem. Interestingly, we can also show that our mechanism is optimal by showing that no truthful mechanism can achieve a factor better than 1 - 1/e, thus, fully resolving this setting. Finally we consider the more general case of submodular utility functions and give new and improved mechanisms for the case when the market is large.