STOC Conference 2025 Conference Paper
Weak Recovery, Hypothesis Testing, and Mutual Information in Stochastic Block Models and Planted Factor Graphs
- Elchanan Mossel
- Allan Sly
- Youngtak Sohn
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
STOC Conference 2025 Conference Paper
STOC Conference 2024 Conference Paper
STOC Conference 2023 Conference Paper
STOC Conference 2022 Conference Paper
FOCS Conference 2021 Conference Paper
In a broad class of sparse random constraint satisfaction problems (CSP), deep heuristics from statistical physics predict that there is a condensation phase transition before the satisfiability threshold, governed by one-step replica symmetry breaking (1RSB). In fact, in random regular k-NAE-SAT, which is one of such random CSPS, it was verified [1] that its free energy is well-defined and the explicit value follows the 1RSB prediction. However, for any model of sparse random CSP, it has been unknown whether the solution space indeed condensates on O(1) clusters according to the 1RSB prediction. In this paper, we give an affirmative answer to this question for the random regular k-NAE-SAT model. Namely, we prove that with probability close to one, most of the solutions lie inside a bounded number of solution clusters whose sizes are comparable to the scale of the free energy. Furthermore, we establish that the overlap between two independently drawn solutions concentrates precisely at two values. This is the defining property of the one-step replica symmetry breaking class which we establish for the first time in a sparse random CSP, Our proof is based on a detailed moment analysis of a spin system, which has an infinite spin space that encodes the structure of solution clusters. We develop new techniques to study the partition function as well as enhance previous approaches which were only applicable to spin systems with finitely many spins. We believe that our method is applicable to a broad range of random CSPS in the 1RSB universality class.
FOCS Conference 2021 Conference Paper
We consider the symmetric binary perceptron model, a simple model of neural networks that has gathered significant attention in the statistical physics, information theory and probability theory communities, with recent connections made to the performance of learning algorithms in Baldassi et al. '15. We establish that the partition function of this model, normalized by its expected value, converges to a log-normal distribution. As a consequence, this allows us to establish several conjectures for this model: (i) it proves the contiguity conjecture of Aubin et al. '19 between the planted and unplanted models in the satisfiable regime; (ii) it establishes the sharp threshold conjecture; (iii) it proves the frozen 1-RSB conjecture in the symmetric case, conjectured first by Krauth-Mézard '89 in the asymmetric case. In a recent work of Perkins-Xu '21, the last two conjectures were also established by proving that the partition function concentrates on an exponential scale, under an analytical assumption on a real-valued function. This left open the contiguity conjecture and the lognor-mal limit characterization, which are established here unconditionally, with the analytical assumption verified. In particular, our proof technique relies on a dense counter-part of the small graph conditioning method, which was developed for sparse models in the celebrated work of Robinson and Wormald.
FOCS Conference 2016 Conference Paper
Recent work has made substantial progress in understanding the transitions of random constraint satisfaction problems (CSPs). In particular, for several of these models, the exact satisfiability threshold has been rigorously determined, confirming predictions from the statistical physics literature. Here we revisit one of these models, random regular NAE-SAT: knowing the satisfiability threshold, it is natural to study, in the satisfiable regime, the number of solutions in a typical instance. We prove here that these solutions have a well-defined free energy (limiting exponential growth rate), with explicit value matching the one-step replica symmetry breaking prediction. The proof develops new techniques for analyzing a certain "survey propagation model" associated to this problem. We believe that these methods may be applicable in a wide class of related problems.
STOC Conference 2015 Conference Paper
The planted bisection model is a random graph model in which the nodes are divided into two equal-sized communities and then edges are added randomly in a way that depends on the community membership. We establish necessary and sufficient conditions for the asymptotic recoverability of the planted bisection in this model. When the bisection is asymptotically recoverable, we give an efficient algorithm that successfully recovers it. We also show that the planted bisection is recoverable asymptotically if and only if with high probability every node belongs to the same community as the majority of its neighbors.
STOC Conference 2015 Conference Paper
STOC Conference 2014 Conference Paper
FOCS Conference 2012 Conference Paper
The class of two-spin systems contains several important models, including random independent sets and the Ising model of statistical physics. We show that for both the hard-core (independent set) model and the anti-ferromagnetic Ising model with arbitrary external field, it is NP-hard to approximate the partition function or approximately sample from the model on regular graphs when the model has non-uniqueness on the corresponding regular tree. Together with results of Jerrum -- Sinclair, Weitz, and Sinclair -- Srivastava -- Thurley giving FPRAS's for all other two-spin systems except at the uniqueness threshold, this gives an almost complete classification of the computational complexity of two-spin systems on bounded-degree graphs. Our proof establishes that the normalized log-partition function of any two-spin system on bipartite locally tree-like graphs converges to a limiting ``free energy density'' which coincides with the (non-rigorous) Be the prediction of statistical physics. We use this result to characterize the local structure of two-spin systems on locally tree-like bipartite expander graphs, which then become the basic gadgets in a randomized reduction to approximate MAX-CUT. Our approach is novel in that it makes no use of the second moment method employed in previous works on these questions.
FOCS Conference 2010 Conference Paper
The hardcore model is a model of lattice gas systems which has received much attention in statistical physics, probability theory and theoretical computer science. It is the probability distribution over independent sets I of a graph weighted proportionally to λ |I| with fugacity parameter λ. We prove that at the uniqueness threshold of the hardcore model on the d-regular tree, approximating the partition function becomes computationally hard on graphs of maximum degree d. Specifically, we show that unless NP = RP there is no polynomial time approximation scheme for the partition function (the sum of such weighted independent sets) on graphs of maximum degree d for fugacity λ c (d) c (d) + ε(d) where λ c = (d - 1) d-1 /(d - 2) d is the uniqueness threshold on the d-regular tree and ε(d) > 0 is a positive constant. Weitz [36] produced an FPTAS for approximating the partition function when 0 c (d) so this result demonstrates that the computational threshold exactly coincides with the statistical physics phase transition thus confirming the main conjecture of [30]. We further analyze the special case of λ = 1, d = 6 and show there is no polynomial time approximation scheme for approximately counting independent sets on graphs of maximum degree d = 6, which is optimal, improving the previous bound of d = 24. Our proof is based on specially constructed random bipartite graphs which act as gadgets in a reduction to MAX-CUT. Building on the involved second moment method analysis of [30] and combined with an analysis of the reconstruction problem on the tree our proof establishes a strong version of "replica" method heuristics developed by theoretical physicists. The result establishes the first rigorous correspondence between the hardness of approximate counting and sampling with statistical physics phase transitions.
FOCS Conference 2008 Conference Paper
A variety of random graph models have been developed in recent years to study a range of problems on networks, driven by the wide availability of data from many social, telecommunication, biochemical and other networks. A key model, extensively used in the sociology literature, is the exponential random graph model. This model seeks to incorporate in random graphs the notion of reciprocity, that is, the larger than expected number of triangles and other small subgraphs. Sampling from these distributions is crucial for parameter estimation hypothesis testing, and more generally for understanding basic features of the network model itself. In practice sampling is typically carried out using Markov chain Monte Carlo, in particular either the Glauber dynamics or the Metropolis-Hasting procedure. In this paper we characterize the high and low temperature regimes of the exponential random graph model. We establish that in the high temperature regime the mixing time of the Glauber dynamics is Theta(n 2 log n), where n is the number of vertices in the graph; in contrast, we show that in the low temperature regime the mixing is exponentially slow for any local Markov chain. Our results, moreover, give a rigorous basis for criticisms made of such models. In the high temperature regime, where sampling with MCMC is possible, we show that any finite collection of edges are asymptotically independent; thus, the model does not possess the desired reciprocity property, and is not appreciably different from the Erdos-Renyi random graph.
SODA Conference 2008 Conference Paper