Author name cluster

Daniel Stefankovic

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers

2 author rows

SODA Conference 2022 Conference Paper

On Mixing of Markov Chains: Coupling, Spectral Independence, and Entropy Factorization

Antonio Blanca
Pietro Caputo
Zongchen Chen
Daniel Parisi
Daniel Stefankovic
Eric Vigoda

Details

SODA Conference 2022 Conference Paper

Sampling Colorings and Independent Sets of Random Regular Bipartite Graphs in the Non-Uniqueness Region

Zongchen Chen
Andreas Galanis
Daniel Stefankovic
Eric Vigoda

We give an FPRAS for counting q -colorings for even on almost every Δ-regular bipartite graph. This improves significantly upon the previous best bound of by Jenssen, Keevash, and Perkins (SODA'19). Analogously, for the hard-core model on independentsets weighted by λ > 0, we present an FPRAS for estimating the partition function when, which improves upon previous results by an Ω(log Δ) factor. Our results for the colorings and hard-core models follow from a general result that applies to arbitrary spin systems. Our main contribution is to show how to elevate probabilistic/analytic bounds on the marginal probabilities for the typical structure of phases on random bipartite regular graphs into efficient algorithms, using the polymer method. We further show evidence that our results for colorings and independent sets are within a constant factor of best possible using current polymer-method approaches.

Details

SODA Conference 2021 Conference Paper

Rapid Mixing for Colorings via Spectral Independence

Zongchen Chen
Andreas Galanis
Daniel Stefankovic
Eric Vigoda

The spectral independence approach of Anari et al. (2020) utilized recent results on high-dimensional expanders of Alev and Lau (2020) and established rapid mixing of the Glauber dynamics for the hard-core model defined on weighted independent sets. We develop the spectral independence approach for colorings, and obtain new algorithmic results for the corresponding counting/sampling problems. Let α ∗ ≈ 1. 763 denote the solution to exp(1/ x ) = x and let α > α ∗. We prove that, for any triangle-free graph G = ( V, E ) with maximum degree Δ, for all q ≥ α Δ + 1, the mixing time of the Glauber dynamics for q -colorings is polynomial in n = |V|, with the exponent of the polynomial independent of Δ and q. In comparison, previous approximate counting results for colorings held for a similar range of q (asymptotically in Δ) but with larger girth requirement or with a running time where the polynomial exponent depended on Δ and q (exponentially). One further feature of using the spectral independence approach to study colorings is that it avoids many of the technical complications in previous approaches caused by coupling arguments or by passing to the complex plane; the key improvement on the running time is based on relatively simple combinatorial arguments which are then translated into spectral bounds.

Details

UAI Conference 2021 Conference Paper

Statistical mechanical analysis of neural network pruning

Rupam Acharyya
Ankani Chattoraj
Boyu Zhang
Shouman Das
Daniel Stefankovic

Deep learning architectures with a huge number of parameters are often compressed using pruning techniques to ensure computational efficiency of inference during deployment. Despite multitude of empirical advances, there is a lack of theoretical understanding of the effectiveness of different pruning methods. We inspect different pruning techniques under the statistical mechanics formulation of a teacher-student framework and derive their generalization error (GE) bounds. It has been shown that Determinantal Point Process (DPP) based node pruning method is notably superior to competing approaches when tested on real datasets. Using GE bounds in the aforementioned setup we provide theoretical guarantees for their empirical observations. Another consistent finding in literature is that sparse neural networks ( edge pruned ) generalize better than dense neural networks ( node pruned ) for a fixed number of parameters. We use our theoretical setup to prove this finding and show that even the baseline random edge pruning method performs better than the DPP node pruning method. We also validate this empirically on real datasets.

Details

FOCS Conference 2020 Conference Paper

The complexity of approximating averages on bounded-degree graphs

Andreas Galanis
Daniel Stefankovic
Eric Vigoda

We prove that, unless P=NP, there is no polynomial-time algorithm to approximate within some multiplicative constant the average size of an independent set in graphs of maximum degree 6. This is a special case of a more general result for the hard-core model defined on independent sets weighted by a parameter. In the general setting, we prove that, unless P=NP, for all Δ ≥ 3, all, there is no FPTAS which applies to all graphs of maximum degree Δ for computing the average size of the independent set in the Gibbs distribution, where λ c (Δ) is the critical point for the uniqueness/non-uniqueness phase transition on the Δ-regular tree. Moreover, we prove that for λ in a dense set of this non-uniqueness region the problem is NP-hard to approximate within some constant factor. Our work extends to the antiferromagnetic Ising model and generalizes to all 2-spin antiferromagnetic models, establishing hardness of computing the average magnetization in the tree non-uniqueness region. Previously, Schulman, Sinclair and Srivastava (2015) showed that it is #P-hard to compute the average magnetization exactly, but no hardness of approximation results were known. Hardness results of Sly (2010) and Sly and Sun (2014) for approximating the partition function do not imply hardness of computing averages. The new ingredient in our reduction is an intricate construction of pairs of rooted trees whose marginal distributions at the root agree but their derivatives disagree. The main technical contribution is controlling what marginal distributions and derivatives are achievable and using Cauchy's functional equation to argue existence of the gadgets. The full version of this paper with detailed proofs to all lemmas and theorems can be found out at https: //arxiv. org/abs/2004. 09238.

Details

STOC Conference 2018 Conference Paper

Inapproximability of the independent set polynomial in the complex plane

Ivona Bezáková
Andreas Galanis
Leslie Ann Goldberg
Daniel Stefankovic

We study the complexity of approximating the value of the independent set polynomial Z G (λ) of a graph G with maximum degree Δ when the activity λ is a complex number. When λ is real, the complexity picture is well-understood, and is captured by two real-valued thresholds λ * and λ c , which depend on Δ and satisfy 0 0, resolving in the affirmative a conjecture of Harvey, Srivastava and Vondrak. Our proof techniques are based around tools from complex analysis — specifically the study of iterative multivariate rational maps.

Details

SODA Conference 2018 Conference Paper

Sampling Random Colorings of Sparse Random Graphs

Charilaos Efthymiou 0001
Thomas P. Hayes
Daniel Stefankovic
Eric Vigoda

We study the mixing properties of the single-site Markov chain known as the Glauber dynamics for sampling k -colorings of a sparse random graph G ( n, d/n ) for constant d. The best known rapid mixing results for general graphs are in terms of the maximum degree Δ of the input graph G and hold when k > 11 Δ /6 for all G. Improved results hold when k > αΔ for graphs with girth ≥ 5 and Δ sufficiently large where α ≈ 1. 7632 … is the root of α = exp(1/ α ); further improvements on the constant α hold with stronger girth and maximum degree assumptions. For sparse random graphs the maximum degree is a function of n and the goal is to obtain results in terms of the expected degree d. The following rapid mixing results for G ( n, d/n ) hold with high probability over the choice of the random graph for sufficiently large constant d. Mossel and Sly (2009) proved rapid mixing for constant k, and Efthymiou (2014) improved this to k linear in d. The condition was improved to k > 3 d by Yin and Zhang (2016) using non-MCMC methods. Here we prove rapid mixing when k > αd where α ≈ 1. 7632 … is the same constant as above. Moreover we obtain O ( n 3 ) mixing time of the Glauber dynamics, while in previous rapid mixing results the exponent was an increasing function in d. Our proof analyzes an appropriately defined block dynamics to “hide” high-degree vertices. One new aspect in our improved approach is utilizing so-called local uniformity properties for the analysis of block dynamics. To analyze the “burn-in” phase we prove a concentration inequality for the number of disagreements propagating in large blocks.

Details

ICML Conference 2017 Conference Paper

On The Projection Operator to A Three-view Cardinality Constrained Set

Haichuan Yang
Shupeng Gui
Chuyang Ke
Daniel Stefankovic
Ryohei Fujimaki
Ji Liu 0002

The cardinality constraint is an intrinsic way to restrict the solution structure in many domains, for example, sparse learning, feature selection, and compressed sensing. To solve a cardinality constrained problem, the key challenge is to solve the projection onto the cardinality constraint set, which is NP-hard in general when there exist multiple overlapped cardinality constraints. In this paper, we consider the scenario where the overlapped cardinality constraints satisfy a Three-view Cardinality Structure (TVCS), which reflects the natural restriction in many applications, such as identification of gene regulatory networks and task-worker assignment problem. We cast the projection into a linear programming, and show that for TVCS, the vertex solution of this linear programming is the solution for the original projection problem. We further prove that such solution can be found with the complexity proportional to the number of variables and constraints. We finally use synthetic experiments and two interesting applications in bioinformatics and crowdsourcing to validate the proposed TVCS model and method.

Details

FOCS Conference 2016 Conference Paper

Convergence of MCMC and Loopy BP in the Tree Uniqueness Region for the Hard-Core Model

Charilaos Efthymiou 0001
Thomas P. Hayes
Daniel Stefankovic
Eric Vigoda
Yitong Yin

We study the hard-core (gas) model defined on independent sets of an input graph where the independent sets are weighted by a parameter (aka fugacity) λ > 0. For constant Δ, previous work of Weitz (2006) established an FPTAS for the partition function for graphs of maximum degree Δ when λ c (Δ). Sly (2010) showed that there is no FPRAS, unless NP=RP, when λ > λ c (Δ). The threshold λ c (Δ) is the critical point for the statistical physics phase transition for uniqueness/non-uniqueness on the infinite Δ-regular tree. The running time of Weitz's algorithm is exponential in log Δ. Here we present an FPRAS for the partition function whose running time is O* (n 2 ). We analyze the simple single-site Markov chain known as the Glauber dynamics for sampling from the associated Gibbs distribution. We prove there exists a constant Δ 0 such that for all graphs with maximum degree Δ > Δ 0 and girth > 7 (i. e. , no cycles of length ≤ 6), the mixing time of the Glauber dynamics is O(nlog n) when λ c (Δ). Our work complements that of Weitz which applies for small constant Δ whereas our work applies for all Δ at least a sufficiently large constant Δ 0 (this includes Δ depending on n = IVI). Our proof utilizes loopy BP (belief propagation) which is a widely-used algorithm for inference in graphical models. A novel aspect of our work is using the principal eigenvector for the BP operator to design a distance function which contracts in expectation for pairs of states that behave like the BP fixed point. We also prove that the Glauber dynamics behaves locally like loopy BP. As a byproduct we obtain that the Glauber dynamics, after a short burn-in period, converges close to the BP fixed point, and this implies that the fixed point of loopy BP is a close approximation to the Gibbs distribution. Using these connections we establish that loopy BP quickly converges to the Gibbs distribution when the girth ≥ 6 and λ c (Δ).

Details

SODA Conference 2015 Conference Paper

Spatial mixing and the connective constant: Optimal bounds

Alistair Sinclair
Piyush Srivastava 0001
Daniel Stefankovic
Yitong Yin

We study the problem of deterministic approximate counting of matchings and independent sets in graphs of bounded connective constant. More generally, we consider the problem of evaluating the partition functions of the monomer-dimer model (which is defined as a weighted sum over all matchings where each matching is given a weight γ | V | – 2| M | in terms of a fixed parameter γ called the monomer activity ) and the hard core model (which is defined as a weighted sum over all independent sets where an independent set I is given a weight γ | I | in terms of a fixed parameter γ called the vertex activity ). The connective constant is a natural measure of the average degree of a graph which has been studied extensively in combinatorics and mathematical physics, and can be bounded by a constant even for certain unbounded degree graphs such as those sampled from the sparse Erdös-Rényi model ( n, d / n ). Our main technical contribution is to prove the best possible rates of decay of correlations in the natural probability distributions induced by both the hard core model and the monomer-dimer model in graphs with a given bound on the connective constant. These results on decay of correlations are obtained using a new framework based on the so-called message approach that has been extensively used recently to prove such results for bounded degree graphs. We then use these optimal decay of correlations results to obtain FPTASs for the two problems on graphs of bounded connective constant. In particular, for the monomer-dimer model, we give a deterministic FPTAS for the partition function on all graphs of bounded connective constant for any given value of the monomer activity. The best previously known deterministic algorithm was due to Bayati, Gamarnik, Katz, Nair and Tetali [STOC 2007], and gave the same runtime guarantees as our results but only for the case of bounded degree graphs. For the hard core model, we give an FPTAS for graphs of connective constant Δ whenever the vertex activity λ < λ c (Δ), where; this result is optimal in the sense that an FPTAS for any λ > λ c (Δ) would imply that NP=RP [Sly, FOCS 2010]. The previous best known result in this direction was a recent paper by a subset of the current authors [FOCS 2013], where the result was established under the suboptimal condition λ < λ c (Δ + 1). Our techniques also allow us to improve upon known bounds for decay of correlations for the hard core model on various regular lattices, including those obtained by Restrepo, Shin, Vigoda and Tetali [FOCS 11] for the special case of ℤ 2 using sophisticated numerically intensive methods tailored to that special case.

Details

STOC Conference 2014 Conference Paper

Inapproximability for antiferromagnetic spin systems in the tree non-uniqueness region

Andreas Galanis
Daniel Stefankovic
Eric Vigoda

Details

TARK Conference 2013 Conference Paper

Reasoning Under the Principle of Maximum Entropy for Modal Logics K45, KD45, and S5

Tivadar Papai
Henry A. Kautz
Daniel Stefankovic

We propose modal Markov logic as an extension of propositional Markov logic to reason under the principle of maximum entropy for modal logics K45, KD45, and S5. Analogous to propositional Markov logic, the knowledge base consists of weighted formulas, whose weights are learned from data. However, in contrast to Markov logic, in our framework we use the knowledge base to define a probability distribution over non-equivalent epistemic situations (pointed Kripke structures) rather than over atoms, and use this distribution to assign probabilities to modal formulas. As in all probabilistic representations, the central task in our framework is inference. Although the size of the state space grows doubly exponentially in the number of propositions in the domain, we provide an algorithm that scales only exponentially in the size of the knowledge base. Finally, we briefly discuss the case of languages with an infinite number of propositions. 1.

Details

NeurIPS Conference 2012 Conference Paper

Slice Normalized Dynamic Markov Logic Networks

Tivadar Papai
Henry Kautz
Daniel Stefankovic

Markov logic is a widely used tool in statistical relational learning, which uses a weighted first-order logic knowledge base to specify a Markov random field (MRF) or a conditional random field (CRF). In many applications, a Markov logic network (MLN) is trained in one domain, but used in a different one. This paper focuses on dynamic Markov logic networks, where the domain of time points typically varies between training and testing. It has been previously pointed out that the marginal probabilities of truth assignments to ground atoms can change if one extends or reduces the domains of predicates in an MLN. We show that in addition to this problem, the standard way of unrolling a Markov logic theory into a MRF may result in time-inhomogeneity of the underlying Markov chain. Furthermore, even if these representational problems are not significant for a given domain, we show that the more practical problem of generating samples in a sequential conditional random field for the next slice relying on the samples from the previous slice has high computational cost in the general case, due to the need to estimate a normalization factor for each sample. We propose a new discriminative model, slice normalized dynamic Markov logic networks (SN-DMLN), that suffers from none of these issues. It supports efficient online inference, and can directly model influences between variables within a time slice that do not have a causal direction, in contrast with fully directed models (e. g. , DBNs). Experimental results show an improvement in accuracy over previous approaches to online inference in dynamic Markov logic networks.

PDF Details

FOCS Conference 2011 Conference Paper

An FPTAS for #Knapsack and Related Counting Problems

Parikshit Gopalan
Adam R. Klivans
Raghu Meka
Daniel Stefankovic
Santosh S. Vempala
Eric Vigoda

Given $n$ elements with non-negative integer weights $w_1, .. ., w_n$ and an integer capacity $C$, we consider the counting version of the classic knapsack problem: find the number of distinct subsets whose weights add up to at most $C$. We give the first deterministic, fully polynomial-time approximation scheme (FPTAS) for estimating the number of solutions to any knapsack constraint (our estimate has relative error $1 \pm \epsilon$). Our algorithm is based on dynamic programming. Previously, randomized polynomial-time approximation schemes (FPRAS) were known first by Morris and Sinclair via Markov chain Monte Carlo techniques, and subsequently by Dyer via dynamic programming and rejection sampling. In addition, we present a new method for deterministic approximate counting using {\em read-once branching programs. } Our approach yields an FPTAS for several other counting problems, including counting solutions for the multidimensional knapsack problem with a constant number of constraints, the general integer knapsack problem, and the contingency tables problem with a constant number of rows.

Details

SODA Conference 2011 Conference Paper

Phase Transition for Glauber Dynamics for Independent Sets on Regular Trees

Ricardo Restrepo
Daniel Stefankovic
Juan Vera 0001
Eric Vigoda
Linji Yang

We study the effect of boundary conditions on the relaxation time of the Glauber dynamics for the hardcore lattice gas model on the n -vertex regular b -ary tree of height h. The hard-core model is defined on independent sets weighted by an activity (or fugacity) Λ on trees. Reconstruction studies the effect of a ‘typical’ boundary condition, i. e. , fixed assignment to the leaves, on the root. The threshold for when reconstruction occurs (and a typical boundary influences the root in the limit h → ∞) has been of considerable recent interest since it appears to be connected to the efficiency of certain local algorithms on locally tree-like graphs. The reconstruction threshold occurs at ω ≈ ln b/b where Λ = ω(1 + ω) b is a convenient re-parameterization of the model. We prove that for all boundary conditions, the relaxation time τ in the non-reconstruction region is fast, namely τ = O ( n 1+ o b (1) ) for any ω ≤ ln b/b. In the reconstruction region, for all boundary conditions, we prove τ = O ( n 1+δ+ o b (1) ) for ω = (1 + δ) ln b/b, for every δ > 0. In contrast, we construct a boundary condition, for which the Glauber dynamics slows down in the reconstruction region, namely τ = Ω( n 1+δ/2− o b (1) ) for ω = (1 + δ) ln b/b, for every δ > 0. The interesting part of our proof is this lower bound result, which uses a general technique that transforms an algorithm to prove reconstruction into a set in the state space of the Glauber dynamics with poor conductance.

Details

FOCS Conference 2007 Conference Paper

Adaptive Simulated Annealing: A Near-optimal Connection between Sampling and Counting

Daniel Stefankovic
Santosh S. Vempala
Eric Vigoda

We present a near-optimal reduction from approximately counting the cardinality of a discrete set to approximately sampling elements of the set. An important application of our work is to approximating the partition function Z of a discrete system, such as the Ising model, matchings or colorings of a graph. The standard approach to estimating the partition function Z(\beta *) at some desired inverse temperature \beta * is to define a sequence, which we call a cooling schedule, \beta 0 = 0 \le \beta 1 \le \cdots \le \beta \ell = \beta * where Z(0) is trivial to compute and the ratios Z(\beta i + 1)/Z(\beta i) are easy to estimate by sampling from the distribution corresponding to Z(\beta i). Previous approaches required a cooling schedule of length {\rm O}*(1nA) where A = Z(0), thereby ensuring that each ratio Z(\beta i + 1)/Z(\beta i) is bounded. We present a cooling schedule of length \ell = {\rm O}*\left( {\sqrt {1nA} } \right). For well-studied problems such as estimating the partition function of the Ising model, or approximating the number of colorings or matchings of a graph, our cooling schedule is of length {\rm O}*\left( {\sqrt n } \right) and the total number of samples required is {\rm O}*\left( n \right). This implies an overall savings of a factor of roughly n in the running time of the approximate counting algorithm compared to the previous best approach. A similar improvement in the length of the cooling schedule was recently obtained by Lovász and Vempala in the context of estimating the volume of convex bodies. While our reduction is inspired by theirs, the discrete analogue of their result turns out to be significantly more difficult. Whereas a fixed schedule suffices in their setting, we prove that in the discrete setting we need an adaptive schedule, i. e. , the schedule depends on Z. More precisely, we prove any non-adaptive cooling schedule has length at least {\rm O}*\left( {1nA} \right), and we present an algorithm to find an adaptive schedule of length {\rm O}*\left( {\sqrt {1nA} } \right) and a nearly matching lower bound.

Details

SODA Conference 2006 Conference Paper

Accelerating simulated annealing for the permanent and combinatorial counting problems

Ivona Bezáková
Daniel Stefankovic
Vijay V. Vazirani
Eric Vigoda

Details

SODA Conference 2004 Conference Paper

Simultaneous diophantine approximation with excluded primes

László Babai
Daniel Stefankovic

Details

FOCS Conference 2003 Conference Paper

Locally Testable Cyclic Codes

László Babai
Amir Shpilka
Daniel Stefankovic

Cyclic linear codes of block length n over a finite field F/sub q/ are the linear subspaces of F/sub q//sup n/ that are invariant under a cyclic shift of their coordinates. A family of codes is good if all the codes in the family have constant rate and constant normalized distance (distance divided by block length). It is a long-standing open problem whether there exists a good family of cyclic linear codes based on F. J. MacWilliams and N. J. A. Sloane (1977). A code C is r-testable if there exist a randomized algorithm which, given a word x /spl isin/ F/sub q//sup n/, adaptively selects r positions, checks the entries of x in the selected positions, and makes a decision (accept or reject x) based on the positions selected and the numbers found, such that (i) if x /spl isin/ C then x is surely accepted; (ii) if dist(x, C) /spl ges/ /spl epsi/n then x is probably rejected (dist refers to Hamming distance). A family of codes is locally testable if all members of the family are r-testable for some constant r. This concept arose from holographic proofs/PCPs. O. Goldreich and M. Sudan (2002) asked whether there exist good, locally testable families of codes. In this paper we address the intersection of the two questions stated.

Details

STOC Conference 2002 Conference Paper

Recognizing string graphs in NP

Marcus Schaefer 0001
Eric Sedgwick
Daniel Stefankovic

A string graph is the intersection graph of a set of curves in the plane. Each curve is represented by a vertex, and an edge between two vertices means that the corresponding curves intersect. We show that string graphs can be recognized in NP . The recognition problem was not known to be decidable until very recently, when two independent papers established exponential upper bounds on the number of intersections needed to realize a string graph [18, 20]. These results implied that the recognition problem lies in NEXP . In the present paper we improve this by showing that the recognition problem for string graphs is in NP , and therefore NP -complete, since Kratochvíl [12] showed that the recognition problem is NP -hard. The result has consequences for the computational complexity of problems in graph drawing, and topological inference.

Details

STOC Conference 2001 Conference Paper

Decidability of string graphs

Marcus Schaefer 0001
Daniel Stefankovic

Details