Arrow Research

Author name cluster

Toniann Pitassi

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

67 papers
2 author rows

Possible papers

67

FOCS Conference 2025 Conference Paper

Stronger Cell Probe Lower Bounds via Local PRGs

  • Oliver Korten
  • Toniann Pitassi
  • Russell Impagliazzo

In this work we observe a tight connection between three topics: $\mathrm{NC}^{0}$ cryptography, $\mathrm{NC}^{0}$ range avoidance, and static data structure lower bounds. Using this connection, we leverage techniques from the cryptanalysis of $\mathrm{NC}^{0}$ PRGs to prove state-of-the-art results in the latter two subjects. Our main result is an improvement to the best known static data structure lower bounds, breaking a barrier which has stood for several decades. Prior to our work, the best known lower bound for any explicit problem with $M$ inputs and $N$ queries was $S \geq N^{\frac{1}{t}}(\log M)^{1-\frac{1}{t}}$ for any setting of the word length $w$ (where $S$ = space and $t$ = time) [1]. We prove, for the same class of explicit problems considered in [1], a quadratically stronger space lower bound of the form $S \geq \tilde{\Omega}\left(N^{\frac{2}{t}} \cdot(\log M)^{1-\frac{2}{t}} \cdot 2^{-O(w)}\right)$ for all even $t > 0$. Second, for the restricted class of nonadaptive bit probe data structures, we improve on this lower bound polynomially: for all odd constants $t > 1$ we give an explicit problem with $N$ queries and $M \leq N^{O(1)}$ inputs and prove a lower bound $S \geq \Omega\left(N^{\frac{2}{t}+\epsilon_{t}}\right)$ for some constant $\epsilon_{t} > 0$ depending only on $t$. Our results build on an exciting body of work on refuting semi-random CSPs (e.g., [2]–[4]). We then utilize our explicit cell probe lower bounds to obtain the best known unconditional algorithms for $\mathrm{NC}^{0}$ range avoidance: we can solve any instance with stretch $n \mapsto m$ in polynomial time once $m \gg n^{\frac{t}{2}}$ when $t$ is even; with the aid of an NP oracle we can solve any instance with $m > n^{\frac{t}{2}-\epsilon}$ when $t$ is odd, for some constant $\epsilon > 0$. Finally, using our main correspondence we establish some barrier results for obtaining significant improvements to our cell probe lower bounds: (i) near-optimal space lower bounds for an explicit problem with $t=4, w=1$ would imply $\mathrm{EXP}^{\mathrm{NP}} \nsubseteq \mathrm{NC}^{1}$; (ii) under the widely believed assumption that polynomial-stretch $\mathrm{NC}^{0}$ PRGs exist, there is no natural proof of a lower bound of the form $S \geq N^{\Omega(1)}$ when $t=\omega(1), w=1$.
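To see concretely how much stronger the new bound is, one can divide the two expressions (a back-of-the-envelope check, taking $w = O(1)$ so the $2^{-O(w)}$ factor is constant):

```latex
\frac{N^{2/t}\,(\log M)^{1-2/t}}{N^{1/t}\,(\log M)^{1-1/t}}
  \;=\; \left(\frac{N}{\log M}\right)^{1/t}
```

So for fixed query time $t$, the space bound improves by a polynomial factor whenever $N$ is large relative to $\log M$.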

TMLR Journal 2024 Journal Article

Improving Predictor Reliability with Selective Recalibration

  • Thomas P Zollo
  • Zhun Deng
  • Jake Snell
  • Toniann Pitassi
  • Richard Zemel

A reliable deep learning system should be able to accurately express its confidence with respect to its predictions, a quality known as calibration. One of the most effective ways to produce reliable confidence estimates with a pre-trained model is by applying a post-hoc recalibration method. Popular recalibration methods like temperature scaling are typically fit on a small amount of data and work in the model’s output space, as opposed to the more expressive feature embedding space, and thus usually have only one or a handful of parameters. However, the target distribution to which they are applied is often complex and difficult to fit well with such a function. To this end we propose selective recalibration, where a selection model learns to reject some user-chosen proportion of the data in order to allow the recalibrator to focus on regions of the input space that can be well-captured by such a model. We provide theoretical analysis to motivate our algorithm, and test our method through comprehensive experiments on difficult medical imaging and zero-shot classification tasks. Our results show that selective recalibration consistently leads to significantly lower calibration error than a wide range of selection and recalibration baselines.
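As a toy illustration of the two ingredients the abstract combines (the function names and the entropy-based rejection rule below are our own illustrative choices, not the paper's learned selector):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def softmax(z, T=1.0):
    z = z / T
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def fit_temperature(logits, labels):
    # standard post-hoc recalibration: a single scalar fit by minimizing NLL
    def nll(T):
        p = softmax(logits, T)
        return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()
    return minimize_scalar(nll, bounds=(0.05, 20.0), method="bounded").x

def select_then_recalibrate(logits, labels, reject_frac=0.2):
    # toy stand-in for a learned selector: reject the highest-entropy
    # inputs, then fit the temperature only on the retained region
    p = softmax(logits)
    entropy = -(p * np.log(p + 1e-12)).sum(axis=1)
    keep = entropy <= np.quantile(entropy, 1.0 - reject_frac)
    return keep, fit_temperature(logits[keep], labels[keep])
```

The point of the sketch: the recalibrator's one parameter is fit only where the selector keeps the data, mirroring the division of labor the abstract describes.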

ICLR Conference 2024 Conference Paper

Prompt Risk Control: A Rigorous Framework for Responsible Deployment of Large Language Models

  • Thomas P. Zollo
  • Todd Morrill
  • Zhun Deng
  • Jake Snell
  • Toniann Pitassi
  • Richard S. Zemel

With the explosion of the zero-shot capabilities of (and thus interest in) pre-trained large language models, there has come accompanying interest in how best to prompt a language model to perform a given task. While it may be tempting to choose a prompt based on empirical results on a validation set, this can lead to a deployment where an unexpectedly high loss occurs. To mitigate this prospect, we propose a lightweight framework, Prompt Risk Control, for selecting a prompt based on rigorous upper bounds on families of informative risk measures. We provide and compare different methods for producing bounds on a diverse set of risk metrics like mean, CVaR, and the Gini coefficient of the loss distribution. In addition, we extend the underlying statistical bounding techniques to accommodate the possibility of distribution shifts in deployment. Extensive experiments on high-impact applications like chatbots, medical question answering, and news summarization highlight why such a framework is necessary to reduce exposure to the worst outcomes.
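A minimal sketch of the bound-then-select pattern described here, using a one-sided Hoeffding bound for losses in $[0, 1]$ (the simplest member of the family of bounds the paper compares; the function names are ours):

```python
import numpy as np

def hoeffding_upper_bound(losses, delta):
    # with probability >= 1 - delta, the true mean loss is below this value
    losses = np.asarray(losses, dtype=float)  # assumed to lie in [0, 1]
    return losses.mean() + np.sqrt(np.log(1.0 / delta) / (2 * len(losses)))

def select_prompt(validation_losses_per_prompt, delta=0.05):
    # choose the prompt whose *upper bound* on mean loss is smallest,
    # rather than the one with the best empirical average
    bounds = {p: hoeffding_upper_bound(l, delta)
              for p, l in validation_losses_per_prompt.items()}
    return min(bounds, key=bounds.get), bounds
```

Selecting by the bound rather than the empirical mean is exactly what guards against the "unexpectedly high loss in deployment" failure mode the abstract warns about.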

FOCS Conference 2024 Conference Paper

Strong vs. Weak Range Avoidance and the Linear Ordering Principle

  • Oliver Korten
  • Toniann Pitassi

In a pair of recent breakthroughs [1], [2] it was shown that the classes $\mathrm{S}_{2}^{\mathrm{E}}$, $\mathrm{ZPE}^{\mathrm{NP}}$, and $\Sigma_{2}^{\mathrm{E}}$ require exponential circuit complexity, giving the first unconditional improvements to a classical result of Kannan [3]. These results were obtained by designing a surprising new algorithm for the total search problem Range Avoidance: given a circuit $C: \{0,1\}^{n}\rightarrow\{0,1\}^{n+1}$, find an $(n+1)$-bit string outside its range. Range Avoidance is a member of the class $\mathrm{TF}\Sigma_{2}^{\mathrm{P}}$ of total search problems in the second level of the polynomial hierarchy, analogous to its better-known counterpart TFNP in the first level. $\mathrm{TF}\Sigma_{2}^{\mathrm{P}}$ was only recently introduced in [4] and its structure is not well understood. We investigate here the extent to which algorithms of the kind in [1], [2] can be applied to other search problems in this class, and prove a variety of results both positive and negative. On the positive side we show that Li's Range Avoidance algorithm [2] can be improved to give a reduction from Range Avoidance to a natural total search problem we call the Linear Ordering Principle or "LOP": given a circuit $\prec: \{0,1\}^{n}\times\{0,1\}^{n}\rightarrow\{0,1\}$ purportedly defining a total order on $\{0,1\}^{n}$, find either a witness that $\prec$ is not a total order or else a minimal element in the ordering. The problem LOP is quite interesting in its own right, as it defines a natural syntactic subclass $\mathrm{L}_{2}^{\mathrm{P}}$ of $\mathrm{S}_{2}^{\mathrm{P}}$ which nonetheless maintains most of the interesting properties of $\mathrm{S}_{2}^{\mathrm{P}}$; in particular we show that $\mathrm{L}_{2}^{\mathrm{P}}$ contains MA and that its exponential analogue $\mathrm{L}_{2}^{\mathrm{E}}$ requires $2^{n}/n$ size circuits. Both of these are consequences of our reduction from Range Avoidance to LOP. On the negative side we prove that the algorithms developed in [1], [2] cannot be extended to Strong Range Avoidance, a problem considered in the same paper which first introduced Range Avoidance [4]. In this problem we are given a circuit $C: \{0,1\}^{n}\setminus\{0^{n}\}\rightarrow\{0,1\}^{n}$, and once again seek a point outside its range. We give a separation in the decision tree (oracle) model showing that this problem cannot be solved in $\mathrm{FP}^{\Sigma_{2}^{\mathrm{P}}}_{\Vert}$, which in particular rules out all of the new kinds of algorithms considered in [1], [2]. This black box separation is derived from a novel depth-3 $\mathrm{AC}^{0}$ circuit lower bound for a total search problem, which we believe is of independent interest from the perspective of circuit complexity: we show that unlike previous depth-3 lower bounds, ours cannot be proven by reduction from a decision problem, and thus requires new techniques specifically tailored to total search problems. Proving lower bounds of this kind was recently proposed by Vyas and Williams in the context of the original (Weak) Avoid problem [5].

NeurIPS Conference 2023 Conference Paper

Distribution-Free Statistical Dispersion Control for Societal Applications

  • Zhun Deng
  • Thomas Zollo
  • Jake Snell
  • Toniann Pitassi
  • Richard Zemel

Explicit finite-sample statistical guarantees on model performance are an important ingredient in responsible machine learning. Previous work has focused mainly on bounding either the expected loss of a predictor or the probability that an individual prediction will incur a loss value in a specified range. However, for many high-stakes applications it is crucial to understand and control the *dispersion* of a loss distribution, or the extent to which different members of a population experience unequal effects of algorithmic decisions. We initiate the study of distribution-free control of statistical dispersion measures with societal implications and propose a simple yet flexible framework that allows us to handle a much richer class of statistical functionals beyond previous work. Our methods are verified through experiments in toxic comment detection, medical imaging, and film recommendation.
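For concreteness, one widely used dispersion measure of a loss distribution is the Gini coefficient, computable from a sample as below (an illustrative snippet, not code from the paper):

```python
import numpy as np

def gini_coefficient(losses):
    # Gini coefficient of a sample of nonnegative losses:
    # 0 = perfectly equal losses, values near 1 = highly unequal
    x = np.sort(np.asarray(losses, dtype=float))
    n = len(x)
    ranks = np.arange(1, n + 1)
    return (2.0 * np.dot(ranks, x)) / (n * x.sum()) - (n + 1.0) / n
```

Controlling a functional like this one, rather than the mean, is what distinguishes dispersion control from standard risk control.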

ICLR Conference 2023 Conference Paper

Quantile Risk Control: A Flexible Framework for Bounding the Probability of High-Loss Predictions

  • Jake Snell
  • Thomas P. Zollo
  • Zhun Deng
  • Toniann Pitassi
  • Richard S. Zemel

Rigorous guarantees about the performance of predictive algorithms are necessary in order to ensure their responsible use. Previous work has largely focused on bounding the expected loss of a predictor, but this is not sufficient in many risk-sensitive applications where the distribution of errors is important. In this work, we propose a flexible framework to produce a family of bounds on quantiles of the loss distribution incurred by a predictor. Our method takes advantage of the order statistics of the observed loss values rather than relying on the sample mean alone. We show that a quantile is an informative way of quantifying predictive performance, and that our framework applies to a variety of quantile-based metrics, each targeting important subsets of the data distribution. We analyze the theoretical properties of our proposed method and demonstrate its ability to rigorously control loss quantiles on several real-world datasets.
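The core order-statistics idea admits a compact distribution-free construction: the $k$-th smallest observed loss upper-bounds the $q$-quantile with confidence $1-\delta$ once a binomial tail condition holds (a standard construction consistent with the abstract; the paper's bounds may differ in detail):

```python
import numpy as np
from scipy.stats import binom

def quantile_upper_bound(losses, q, delta):
    # smallest order statistic L_(k) such that, for i.i.d. losses,
    # P(L_(k) >= true q-quantile) >= 1 - delta
    x = np.sort(np.asarray(losses, dtype=float))
    n = len(x)
    for k in range(1, n + 1):
        if binom.cdf(k - 1, n, q) >= 1 - delta:
            return x[k - 1]
    return None  # sample too small for this (q, delta) pair
```

No assumption on the loss distribution is needed: the guarantee comes entirely from the binomial law of how many samples can fall below the true quantile.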

STOC Conference 2023 Conference Paper

Stability Is Stable: Connections between Replicability, Privacy, and Adaptive Generalization

  • Mark Bun
  • Marco Gaboardi
  • Max Hopkins
  • Russell Impagliazzo
  • Rex Lei
  • Toniann Pitassi
  • Satchit Sivakumar
  • Jessica Sorrell

The notion of replicable algorithms was introduced by Impagliazzo, Lei, Pitassi, and Sorrell (STOC’22) to describe randomized algorithms that are stable under the resampling of their inputs. More precisely, a replicable algorithm gives the same output with high probability when its randomness is fixed and it is run on a new i.i.d. sample drawn from the same distribution. Using replicable algorithms for data analysis can facilitate the verification of published results by ensuring that the results of an analysis will be the same with high probability, even when that analysis is performed on a new data set. In this work, we establish new connections and separations between replicability and standard notions of algorithmic stability. In particular, we give sample-efficient algorithmic reductions between perfect generalization, approximate differential privacy, and replicability for a broad class of statistical problems. Conversely, we show any such equivalence must break down computationally: there exist statistical problems that are easy under differential privacy, but that cannot be solved replicably without breaking public-key cryptography. Furthermore, these results are tight: our reductions are statistically optimal, and we show that any computational separation between DP and replicability must imply the existence of one-way functions. Our statistical reductions give a new algorithmic framework for translating between notions of stability, which we instantiate to answer several open questions in replicability and privacy. This includes giving sample-efficient replicable algorithms for various PAC learning, distribution estimation, and distribution testing problems, algorithmic amplification of δ in approximate DP, conversions from item-level to user-level privacy, and the existence of private agnostic-to-realizable learning reductions under structured distributions.
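The flavor of a replicable algorithm is easy to convey with the textbook randomized-rounding estimator (our illustrative example in the spirit of Impagliazzo et al., not a construction from this paper): with the random grid offset fixed, two runs on fresh samples round to the same value with high probability.

```python
import numpy as np

def replicable_mean(samples, shared_seed, cell=0.1):
    # the shared randomness chooses a grid offset; the empirical mean is
    # then snapped to the randomly shifted grid. Two executions with the
    # same seed but fresh i.i.d. samples agree whenever both empirical
    # means land in the same cell, which happens w.h.p. if cell >> n^(-1/2).
    offset = np.random.default_rng(shared_seed).uniform(0.0, cell)
    m = float(np.mean(samples))
    return offset + cell * round((m - offset) / cell)
```

The random offset is essential: with a fixed grid, an empirical mean sitting near a cell boundary would flip outputs between runs.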

NeurIPS Conference 2022 Conference Paper

On Learning and Refutation in Noninteractive Local Differential Privacy

  • Alexander Edmonds
  • Aleksandar Nikolov
  • Toniann Pitassi

We study two basic statistical tasks in non-interactive local differential privacy (LDP): *learning* and *refutation*. Learning requires finding a concept that best fits an unknown target function (from labelled samples drawn from a distribution), whereas refutation requires distinguishing between data distributions that are well-correlated with some concept in the class, versus distributions where the labels are random. Our main result is a complete characterization of the sample complexity of agnostic PAC learning for non-interactive LDP protocols. We show that the optimal sample complexity for any concept class is captured by the approximate $\gamma_2$ norm of a natural matrix associated with the class. Combined with previous work, this gives an *equivalence* between agnostic learning and refutation in the agnostic setting.
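For reference, the $\gamma_2$ norm mentioned here is the standard factorization norm (definition included for the reader's convenience; the "approximate" version allows entrywise perturbation of $A$):

```latex
% gamma_2 = smallest achievable (max row norm) x (max column norm)
% over all factorizations A = BC
\gamma_2(A) \;=\; \min_{A = BC}\ \Big(\max_i \|B_{i,*}\|_2\Big)\cdot\Big(\max_j \|C_{*,j}\|_2\Big)
```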

ICLR Conference 2021 Conference Paper

Theoretical bounds on estimation error for meta-learning

  • James Lucas
  • Mengye Ren
  • Irene Raissa Kameni
  • Toniann Pitassi
  • Richard S. Zemel

Machine learning models have traditionally been developed under the assumption that the training and test distributions match exactly. However, recent successes in few-shot learning and related problems are encouraging signs that these models can be adapted to more realistic settings where train and test distributions differ. Unfortunately, there is severely limited theoretical support for these algorithms and little is known about the difficulty of these problems. In this work, we provide novel information-theoretic lower-bounds on minimax rates of convergence for algorithms that are trained on data from multiple sources and tested on novel data. Our bounds depend intuitively on the information shared between sources of data, and characterize the difficulty of learning in this setting for arbitrary algorithms. We demonstrate these bounds on a hierarchical Bayesian model of meta-learning, computing both upper and lower bounds on parameter estimation via maximum-a-posteriori inference.

FOCS Conference 2021 Conference Paper

Tradeoffs for small-depth Frege proofs

  • Toniann Pitassi
  • Prasanna Ramakrishnan
  • Li-Yang Tan

We study the complexity of small-depth Frege proofs and give the first tradeoffs between the size of each line and the number of lines. Existing lower bounds apply to the overall proof size (the sum of the sizes of all lines) and do not distinguish between these notions of complexity. For depth-$d$ Frege proofs of the Tseitin principle where each line is a size-$s$ formula, we prove that $\exp(n/2^{\Omega(d\sqrt{\log s})})$ many lines are necessary. This yields new lower bounds on line complexity that are not implied by Håstad's recent $\exp(n^{\Omega(1/d)})$ lower bound on the overall proof size. For $s = \mathrm{poly}(n)$, for example, our lower bound remains $\exp(n^{1-o(1)})$ for all $d = o(\sqrt{\log n})$, whereas Håstad's lower bound is $\exp(n^{o(1)})$ once $d = \omega(1)$. Our main conceptual contribution is the simple observation that techniques for establishing correlation bounds in circuit complexity can be leveraged to establish such tradeoffs in proof complexity.

ICML Conference 2020 Conference Paper

Causal Modeling for Fairness In Dynamical Systems

  • Elliot Creager
  • David Madras
  • Toniann Pitassi
  • Richard S. Zemel

In many application areas—lending, education, and online recommenders, for example—fairness and equity concerns emerge when a machine learning system interacts with a dynamically changing environment to produce both immediate and long-term effects for individuals and demographic groups. We discuss causal directed acyclic graphs (DAGs) as a unifying framework for the recent literature on fairness in such dynamical systems. We show that this formulation affords several new directions of inquiry to the modeler, where sound causal assumptions can be expressed and manipulated. We emphasize the importance of computing interventional quantities in the dynamical fairness setting, and show how causal assumptions enable simulation (when environment dynamics are known) and estimation by adjustment (when dynamics are unknown) of intervention on short- and long-term outcomes, at both the group and individual levels.

FOCS Conference 2020 Conference Paper

KRW Composition Theorems via Lifting

  • Susanna F. de Rezende
  • Or Meir
  • Jakob Nordström
  • Toniann Pitassi
  • Robert Robere

One of the major open problems in complexity theory is proving super-logarithmic lower bounds on the depth of circuits (i.e., $\mathrm{P} \nsubseteq \mathrm{NC}^1$). Karchmer, Raz, and Wigderson [13] suggested approaching this problem by proving that depth complexity behaves "as expected" with respect to the composition of functions $f \diamond g$. They showed that the validity of this conjecture would imply that $\mathrm{P} \nsubseteq \mathrm{NC}^1$. Several works have made progress toward resolving this conjecture by proving special cases. In particular, these works proved the KRW conjecture for every outer function, but only for a few inner functions. Thus, it is an important challenge to prove the KRW conjecture for a wider range of inner functions. In this work, we extend significantly the range of inner functions that can be handled. First, we consider the monotone version of the KRW conjecture. We prove it for every monotone inner function whose depth complexity can be lower bounded via a query-to-communication lifting theorem. This allows us to handle several new and well-studied functions such as the s-t-connectivity, clique, and generation functions. In order to carry this progress back to the non-monotone setting, we introduce a new notion of semi-monotone composition, which combines the non-monotone complexity of the outer function with the monotone complexity of the inner function. In this setting, we prove the KRW conjecture for a similar selection of inner functions, but only for a specific choice of the outer function $f$.

FOCS Conference 2020 Conference Paper

Lifting with Simple Gadgets and Applications to Circuit and Proof Complexity

  • Susanna F. de Rezende
  • Or Meir
  • Jakob Nordström
  • Toniann Pitassi
  • Robert Robere
  • Marc Vinyals

We significantly strengthen and generalize the theorem lifting Nullstellensatz degree to monotone span program size by Pitassi and Robere (2018) so that it works for any gadget with high enough rank, in particular for useful gadgets such as equality and greater-than. We apply our generalized theorem to solve three open problems:

  • We present the first result that demonstrates a separation in proof power for cutting planes with unbounded versus polynomially bounded coefficients. Specifically, we exhibit CNF formulas that can be refuted in quadratic length and constant line space in cutting planes with unbounded coefficients, but for which there are no refutations in subexponential length and subpolynomial line space if coefficients are restricted to be of polynomial magnitude.
  • We give the first explicit separation between monotone Boolean formulas and monotone real formulas. Specifically, we give an explicit family of functions that can be computed with monotone real formulas of nearly linear size but require monotone Boolean formulas of exponential size. Previously only a non-explicit separation was known.
  • We give the strongest separation to date between monotone Boolean formulas and monotone Boolean circuits. Namely, we show that the classical GEN problem, which has polynomial-size monotone Boolean circuits, requires monotone Boolean formulas of size $2^{\Omega(n/\mathrm{polylog}(n))}$.

An important technical ingredient, which may be of independent interest, is that we show that the Nullstellensatz degree of refuting the pebbling formula over a DAG $G$ over any field coincides exactly with the reversible pebbling price of $G$. In particular, this implies that the standard decision tree complexity and the parity decision tree complexity of the corresponding falsified clause search problem are equal. This is an extended abstract. The full version of the paper is available at https://arxiv.org/abs/2001.02144.

SAT Conference 2020 Conference Paper

Towards a Complexity-Theoretic Understanding of Restarts in SAT Solvers

  • Chunxiao Li 0002
  • Noah Fleming
  • Marc Vinyals
  • Toniann Pitassi
  • Vijay Ganesh 0001

Restarts are a widely-used class of techniques integral to the efficiency of Conflict-Driven Clause Learning (CDCL) Boolean SAT solvers. While the utility of such policies has been well-established empirically, a theoretical understanding of whether restarts are indeed crucial to the power of CDCL solvers is missing. In this paper, we prove a series of theoretical results that characterize the power of restarts for various models of SAT solvers. More precisely, we make the following contributions. First, we prove an exponential separation between a drunk randomized CDCL solver model with restarts and the same model without restarts using a family of satisfiable instances. Second, we show that the configuration of a CDCL solver with VSIDS branching and restarts (with activities erased after restarts) is exponentially more powerful than the same configuration without restarts for a family of unsatisfiable instances. To the best of our knowledge, these are the first separation results involving restarts in the context of SAT solvers. Third, we show that restarts do not add any proof complexity-theoretic power vis-à-vis a number of models of CDCL and DPLL solvers with non-deterministic static variable and value selection.

ICML Conference 2019 Conference Paper

Flexibly Fair Representation Learning by Disentanglement

  • Elliot Creager
  • David Madras
  • Jörn-Henrik Jacobsen
  • Marissa A. Weis
  • Kevin Swersky
  • Toniann Pitassi
  • Richard S. Zemel

We consider the problem of learning representations that achieve group and subgroup fairness with respect to multiple sensitive attributes. Taking inspiration from the disentangled representation learning literature, we propose an algorithm for learning compact representations of datasets that are useful for reconstruction and prediction, but are also flexibly fair, meaning they can be easily modified at test time to achieve subgroup demographic parity with respect to multiple sensitive attributes and their conjunctions. We show empirically that the resulting encoder—which does not require the sensitive attributes for inference—allows for the adaptation of a single representation to a variety of fair classification tasks with new target labels and subgroup definitions.

ICML Conference 2018 Conference Paper

Learning Adversarially Fair and Transferable Representations

  • David Madras
  • Elliot Creager
  • Toniann Pitassi
  • Richard S. Zemel

In this paper, we advocate for representation learning as the key to mitigating unfair prediction outcomes downstream. Motivated by a scenario where learned representations are used by third parties with unknown objectives, we propose and explore adversarial representation learning as a natural method of ensuring those parties act fairly. We connect group fairness (demographic parity, equalized odds, and equal opportunity) to different adversarial objectives. Through worst-case theoretical guarantees and experimental validation, we show that the choice of this objective is crucial to fair prediction. Furthermore, we present the first in-depth experimental demonstration of fair transfer learning and demonstrate empirically that our learned representations admit fair predictions on new tasks while maintaining utility, an essential goal of fair representation learning.

FOCS Conference 2017 Conference Paper

Query-to-Communication Lifting for BPP

  • Mika Göös
  • Toniann Pitassi
  • Thomas Watson 0001

For any $n$-bit Boolean function $f$, we show that the randomized communication complexity of the composed function $f \circ g^n$, where $g$ is an index gadget, is characterized by the randomized decision tree complexity of $f$. In particular, this means that many query complexity separations involving randomized models (e.g., classical vs. quantum) automatically imply analogous separations in communication complexity.
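In symbols, the characterization has the following shape (paraphrased from the abstract; $\mathrm{BPP^{cc}}$ and $\mathrm{BPP^{dt}}$ denote randomized communication and query complexity, and the gadget-size exponent is the paper's choice, not pinned down here):

```latex
\mathrm{BPP^{cc}}\!\left(f \circ g^{\,n}\right) \;=\; \Theta\!\left(\mathrm{BPP^{dt}}(f)\cdot \log n\right),
\qquad g = \mathrm{INDEX}_{m},\ \ m = n^{\Theta(1)}
```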

FOCS Conference 2017 Conference Paper

Random Θ(log n)-CNFs Are Hard for Cutting Planes

  • Noah Fleming
  • Denis Pankratov
  • Toniann Pitassi
  • Robert Robere

The random k-SAT model is the most important and well-studied distribution over k-SAT instances. It is closely connected to statistical physics and is a benchmark for satisfiability algorithms. We show that when k = Θ(log n), any Cutting Planes refutation for random k-SAT requires exponential size in the interesting regime where the number of clauses guarantees that the formula is unsatisfiable with high probability.

FOCS Conference 2016 Conference Paper

Exponential Lower Bounds for Monotone Span Programs

  • Robert Robere
  • Toniann Pitassi
  • Benjamin Rossman
  • Stephen A. Cook

Monotone span programs are a linear-algebraic model of computation which were introduced by Karchmer and Wigderson in 1993 [1]. They are known to be equivalent to linear secret sharing schemes, and have various applications in complexity theory and cryptography. Lower bounds for monotone span programs have been difficult to obtain because they use non-monotone operations to compute monotone functions; in fact, the best known lower bounds are quasipolynomial for a function in (nonmonotone) P [2]. A fundamental open problem is to prove exponential lower bounds on monotone span program size for any explicit function. We resolve this open problem by giving exponential lower bounds on monotone span program size for a function in monotone P. This also implies the first exponential lower bounds for linear secret sharing schemes. Our result is obtained by proving exponential lower bounds using Razborov's rank method [3], a measure that is strong enough to prove lower bounds for many monotone models. As corollaries we obtain new proofs of exponential lower bounds for monotone formula size, monotone switching network size, and the first lower bounds for monotone comparator circuit size for a function in monotone P. We also obtain new polynomial degree lower bounds for Nullstellensatz refutations using an interpolation theorem of Pudlák and Sgall [4]. Finally, we obtain quasipolynomial lower bounds on the rank measure for the st-connectivity function, implying tight bounds for st-connectivity in all of the computational models mentioned above.

STOC Conference 2016 Conference Paper

Poly-logarithmic Frege depth lower bounds via an expander switching lemma

  • Toniann Pitassi
  • Benjamin Rossman
  • Rocco A. Servedio
  • Li-Yang Tan

We show that any polynomial-size Frege refutation of a certain linear-size unsatisfiable 3-CNF formula over $n$ variables must have depth $\Omega(\sqrt{\log n})$. This is an exponential improvement over the previous best results (Pitassi et al. 1993, Krajíček et al. 1995, Ben-Sasson 2002), which give $\Omega(\log\log n)$ lower bounds. The 3-CNF formulas which we use to establish this result are Tseitin contradictions on 3-regular expander graphs. In more detail, our main result is a proof that for every $d$, any depth-$d$ Frege refutation of the Tseitin contradiction over these $n$-node graphs must have size $n^{\Omega((\log n)/d^2)}$. A key ingredient of our approach is a new switching lemma for a carefully designed random restriction process over these expanders. These random restrictions reduce a Tseitin instance on a 3-regular $n$-node expander to a Tseitin instance on a random subgraph which is a topological embedding of a 3-regular $n'$-node expander, for some $n'$ which is not too much less than $n$. Our result involves $\Omega(\sqrt{\log n})$ iterative applications of this type of random restriction.
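For readers unfamiliar with the formula family, a Tseitin contradiction puts a parity constraint at each vertex of a graph (standard definition, stated here for convenience):

```latex
% Tseitin contradiction on G = (V, E): one variable x_e per edge,
% and for each vertex v the parity constraint
\bigoplus_{e \,\ni\, v} x_e \;=\; 1 \pmod 2 \qquad \text{for all } v \in V.
% Summing all |V| constraints counts every edge twice, giving 0 = |V| (mod 2),
% so the system is unsatisfiable whenever |V| is odd.
```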

FOCS Conference 2015 Conference Paper

Deterministic Communication vs. Partition Number

  • Mika Göös
  • Toniann Pitassi
  • Thomas Watson 0001

We show that deterministic communication complexity can be super logarithmic in the partition number of the associated communication matrix. We also obtain near-optimal deterministic lower bounds for the Clique vs. Independent Set problem, which in particular yields new lower bounds for the log-rank conjecture. All these results follow from a simple adaptation of a communication-to-query simulation theorem of Raz and McKenzie (Combinatorica 1999) together with lower bounds for the analogous query complexity questions.

IJCAI Conference 2015 Conference Paper

Inapproximability of Treewidth and Related Problems (Extended Abstract)

  • Yu Wu
  • Per Austrin
  • Toniann Pitassi
  • David Liu

Graphical models, such as Bayesian networks and Markov networks, play an important role in artificial intelligence and machine learning. Inference is a central problem to be solved on these networks. This and other problems on these graph models are often known to be hard to solve in general, but tractable on graphs with bounded treewidth. Therefore, finding or approximating the treewidth of a graph is a fundamental problem related to inference in graphical models. In this paper, we study the approximability of a number of graph problems: Treewidth and Pathwidth of graphs, Minimum Fill-In, and a variety of different graph layout problems such as Minimum Cut Linear Arrangement. We show that, assuming the Small Set Expansion Conjecture, all of these problems are NP-hard to approximate to within any constant factor in polynomial time. This paper is an extended abstract of the Journal of Artificial Intelligence Research article [Wu et al., 2014].

STOC Conference 2015 Conference Paper

Preserving Statistical Validity in Adaptive Data Analysis

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toniann Pitassi
  • Omer Reingold
  • Aaron Roth 0001

A great deal of effort has been devoted to reducing the risk of spurious scientific discoveries, from the use of sophisticated validation techniques, to deep statistical methods for controlling the false discovery rate in multiple hypothesis testing. However, there is a fundamental disconnect between the theoretical results and the practice of data analysis: the theory of statistical inference assumes a fixed collection of hypotheses to be tested, or learning algorithms to be applied, selected non-adaptively before the data are gathered, whereas in practice data is shared and reused with hypotheses and new analyses being generated on the basis of data exploration and the outcomes of previous analyses. In this work we initiate a principled study of how to guarantee the validity of statistical inference in adaptive data analysis. As an instance of this problem, we propose and investigate the question of estimating the expectations of m adaptively chosen functions on an unknown distribution given n random samples. We show that, surprisingly, there is a way to estimate an exponential in n number of expectations accurately even if the functions are chosen adaptively. This gives an exponential improvement over standard empirical estimators that are limited to a linear number of estimates. Our result follows from a general technique that counter-intuitively involves actively perturbing and coordinating the estimates, using techniques developed for privacy preservation. We give additional applications of this technique to our question.
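A toy rendering of the "perturb the estimates" idea (illustrative only; the paper's mechanism and noise calibration are more involved):

```python
import numpy as np

def adaptive_query_answers(data, queries, sigma=0.05, seed=0):
    # answer each adaptively chosen query f: X -> [0, 1] with a noisy
    # empirical mean; the added noise limits how much any one answer can
    # reveal about the sample, which is what protects later, adaptively
    # chosen queries from overfitting to this particular data set
    rng = np.random.default_rng(seed)
    answers = []
    for f in queries:
        est = np.mean([f(x) for x in data]) + rng.normal(0.0, sigma)
        answers.append(float(np.clip(est, 0.0, 1.0)))
    return answers
```

The counterintuitive message of the abstract is that deliberately degrading each answer this way lets exponentially many adaptive queries stay accurate overall.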

FOCS Conference 2014 Conference Paper

Circuit Complexity, Proof Complexity, and Polynomial Identity Testing

  • Joshua A. Grochow
  • Toniann Pitassi

We introduce a new and natural algebraic proof system, which has tight connections to (algebraic) circuit complexity. In particular, we show that any super-polynomial lower bound on any Boolean tautology in our proof system implies that the permanent does not have polynomial-size algebraic circuits (VNP≠VP). As a corollary, super-polynomial lower bounds on the number of lines in Polynomial Calculus proofs (as opposed to the usual measure of number of monomials) imply the Permanent versus Determinant Conjecture. Note that, prior to our work, there was no proof system for which lower bounds on an arbitrary tautology implied any computational lower bound. Our proof system helps clarify the relationships between previous algebraic proof systems, and begins to shed light on why proof complexity lower bounds for various proof systems have been so much harder than lower bounds on the corresponding circuit classes. In doing so, we highlight the importance of polynomial identity testing (PIT) for understanding proof complexity.

FOCS Conference 2013 Conference Paper

A Tight Bound for Set Disjointness in the Message-Passing Model

  • Mark Braverman
  • Faith Ellen
  • Rotem Oshman
  • Toniann Pitassi
  • Vinod Vaikuntanathan

In a multiparty message-passing model of communication, there are k players. Each player has a private input, and they communicate by sending messages to one another over private channels. While this model has been used extensively in distributed computing and in secure multiparty computation, lower bounds on communication complexity in this model and related models have been somewhat scarce. In recent work [25], [29], [30], strong lower bounds of the form Ω(n·k) were obtained for several functions in the message-passing model; however, a lower bound on the classical set disjointness problem remained elusive. In this paper, we prove a tight lower bound of Ω(n · k) for the set disjointness problem in the message passing model. Our bound is obtained by developing information complexity tools for the message-passing model and proving an information complexity lower bound for set disjointness.
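Here, the $k$-party set disjointness problem referred to is the standard one (definition included for convenience): each player $j$ holds $x^j \in \{0,1\}^n$, and together they must decide

```latex
\mathrm{DISJ}_{n,k}(x^1,\dots,x^k) \;=\; 1
\quad\iff\quad
\bigcap_{j=1}^{k} \{\, i \in [n] : x^j_i = 1 \,\} \;=\; \emptyset .
```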

FOCS Conference 2013 Conference Paper

Average Case Lower Bounds for Monotone Switching Networks

  • Yuval Filmus
  • Toniann Pitassi
  • Robert Robere
  • Stephen A. Cook

An approximate computation of a Boolean function by a circuit or switching network is a computation in which the function is computed correctly on the majority of the inputs (rather than on all inputs). Besides being interesting in their own right, lower bounds for approximate computation have proved useful in many subareas of complexity theory, such as cryptography and derandomization. Lower bounds for approximate computation are also known as correlation bounds or average case hardness. In this paper, we obtain the first average case monotone depth lower bounds for a function in monotone P. We tolerate errors that are asymptotically the best possible for monotone circuits. Specifically, we prove average case exponential lower bounds on the size of monotone switching networks for the GEN function. As a corollary, we separate the monotone NC hierarchy in the case of errors -- a result which was previously only known for exact computations. Our proof extends and simplifies the Fourier analytic technique due to Potechin, and further developed by Chan and Potechin. As a corollary of our main lower bound, we prove that the communication complexity approach for monotone depth lower bounds does not naturally generalize to the average case setting.

ICML Conference 2013 Conference Paper

Learning Fair Representations

  • Richard S. Zemel
  • Yu Wu
  • Kevin Swersky
  • Toniann Pitassi
  • Cynthia Dwork

We propose a learning algorithm for fair classification that achieves both group fairness (the proportion of members in a protected group receiving positive classification is identical to the proportion in the population as a whole), and individual fairness (similar individuals should be treated similarly). We formulate fairness as an optimization problem of finding a good representation of the data with two competing goals: to encode the data as well as possible, while simultaneously obfuscating any information about membership in the protected group. We show positive results of our algorithm relative to other known techniques, on three datasets. Moreover, we demonstrate several advantages to our approach. First, our intermediate representation can be used for other classification tasks (i.e., transfer learning is possible); second, we take a step toward learning a distance metric which can find important dimensions of the data for classification.
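The group fairness notion in the first sentence has a simple empirical estimate; for instance, the demographic parity gap of a classifier can be measured as below (a convenience snippet, not the paper's code):

```python
import numpy as np

def demographic_parity_gap(predictions, protected):
    # group fairness: the positive-classification rate inside the
    # protected group should match the rate outside it (gap near 0)
    pred = np.asarray(predictions, dtype=float)   # 0/1 decisions
    grp = np.asarray(protected, dtype=bool)
    return abs(pred[grp].mean() - pred[~grp].mean())
```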

STOC Conference 2010 Conference Paper

Hardness amplification in proof complexity

  • Paul Beame
  • Trinh Huynh
  • Toniann Pitassi

We present a general method for converting any family of unsatisfiable CNF formulas that is hard for one of the simplest proof systems -- tree resolution -- into formulas that require large rank in very strong proof systems, including any proof system that manipulates polynomials of degree at most k (known as Th(k) proofs). These include high degree versions of Lovász-Schrijver and Cutting Planes proofs. We introduce two very general families of these proof systems, denoted Tcc(k) and Rcc(k). The proof lines of Tcc(k) are arbitrary Boolean functions, each of which can be evaluated by an efficient k-party randomized communication protocol. Tcc(k) proofs include Th(k-1) proofs as a special case. Rcc(k) proofs generalize Tcc(k) proofs and require only that each inference be checkable by a short k-party protocol. For all $k$ in $O(\log\log n)$, our main results are as follows: First, any unsatisfiable CNF formula of high resolution rank can be efficiently transformed into another CNF formula requiring high rank in all Rcc(k) systems, and exponential tree size in all Tcc(k) systems. Secondly, there are strict rank hierarchies for all Rcc(k) systems, and strict tree-size hierarchies for all Tcc(k) systems. Finally, we apply our general method to give optimal integrality gaps for low rank Rcc(2) proofs for MAX-2t-SAT, which imply optimal integrality gaps for low rank Cutting Planes and Th(1) proofs.

FOCS Conference 2010 Conference Paper

The Limits of Two-Party Differential Privacy

  • Andrew McGregor 0001
  • Ilya Mironov
  • Toniann Pitassi
  • Omer Reingold
  • Kunal Talwar
  • Salil P. Vadhan

We study differential privacy in a distributed setting where two parties would like to perform analysis of their joint data while preserving privacy for both datasets. Our results imply almost tight lower bounds on the accuracy of such data analyses, both for specific natural functions (such as Hamming distance) and in general. Our bounds expose a sharp contrast between the two-party setting and the simpler client-server setting (where privacy guarantees are one-sided). In addition, those bounds demonstrate a dramatic gap between the accuracy that can be obtained by differentially private data analysis versus the accuracy obtainable when privacy is relaxed to a computational variant of differential privacy. The first proof technique we develop demonstrates a connection between differential privacy and deterministic extraction from Santha-Vazirani sources. A second connection we expose indicates that the ability to approximate a function by a low-error differentially private protocol is strongly related to the ability to approximate it by a low communication protocol. (The connection goes in both directions).

AAAI Conference 2008 Conference Paper

Clause Learning Can Effectively P-Simulate General Propositional Resolution

  • Philipp Hertel
  • Toniann Pitassi

Currently, the most effective complete SAT solvers are based on the DPLL algorithm augmented by clause learning. These solvers can handle many real-world problems from application areas like verification, diagnosis, planning, and design. Without clause learning, however, DPLL loses most of its effectiveness on real world problems. Recently there has been some work on obtaining a deeper understanding of the technique of clause learning. In this paper we utilize the idea of effective p-simulation, which is a new way of comparing clause learning with general resolution and other proof systems. We then show that pool proofs, a previously used characterization of clause learning, can effectively p-simulate general resolution. Furthermore, this result holds even for the more restrictive class of greedy, unit propagating, pool proofs, which more accurately characterize clause learning as it is used in practice. This result is surprising and indicates that clause learning is significantly more powerful than was previously known.

FOCS Conference 2007 Conference Paper

Exponential Time/Space Speedups for Resolution and the PSPACE-completeness of Black-White Pebbling

  • Philipp Hertel
  • Toniann Pitassi

The complexity of the Black-White Pebbling Game has remained open for 30 years. It was devised to capture the power of non-deterministic space bounded computation. Since then it has been applied to problems in diverse areas of computer science including VLSI design and more recently propositional proof complexity. In this paper we show that the Black-White Pebbling Game is PSPACE-complete. We then use similar ideas in a more complicated reduction to prove the PSPACE-completeness of Resolution space. The reduction also yields a surprising exponential time/space speedup for Resolution in which an increase of 3 units of space results in an exponential decrease in proof-size.

FOCS Conference 2007 Conference Paper

Integrality gaps of 2 - o(1) for Vertex Cover SDPs in the Lovász-Schrijver Hierarchy

  • Konstantinos Georgiou
  • Avner Magen
  • Toniann Pitassi
  • Iannis Tourlakis

Linear and semidefinite programming are highly successful approaches for obtaining good approximations for NP-hard optimization problems. For example, breakthrough approximation algorithms for Max Cut and Sparsest Cut use semidefinite programming. Perhaps the most prominent NP-hard problem whose exact approximation factor is still unresolved is Vertex Cover. PCP-based techniques of Dinur and Safra [7] show that it is not possible to achieve a factor better than 1.36; on the other hand, no known algorithm does better than the factor of 2 achieved by the simple greedy algorithm. Furthermore, there is a widespread belief that SDP techniques are the most promising methods available for improving upon this factor of 2. Following a line of study initiated by Arora et al. [3], our aim is to show that a large family of LP- and SDP-based algorithms fail to produce an approximation for Vertex Cover better than 2. Lovász and Schrijver [21] introduced the systems LS and LS+ for systematically tightening LP and SDP relaxations, respectively, over many rounds. These systems naturally capture large classes of LP and SDP relaxations; indeed, LS+ captures the celebrated SDP-based algorithms for Max Cut and Sparsest Cut mentioned above. We rule out polynomial-time $2 - \Omega(1)$ approximations for Vertex Cover using LS+. In particular, we prove an integrality gap of $2 - o(1)$ for Vertex Cover SDPs obtained by tightening the standard LP relaxation with $\Omega(\sqrt{\log n / \log\log n})$ rounds of LS+. While tight integrality gaps were known for Vertex Cover in the weaker LS system [23], previous results did not rule out a $2 - \Omega(1)$ approximation after even two rounds of LS+.

FOCS Conference 2004 Conference Paper

Learnability and Automatizability

  • Michael Alekhnovich
  • Mark Braverman
  • Vitaly Feldman
  • Adam R. Klivans
  • Toniann Pitassi

We consider the complexity of properly learning concept classes, i.e., when the learner must output a hypothesis of the same form as the unknown concept. We present the following upper and lower bounds on well-known concept classes: 1) We show that unless NP = RP, there is no polynomial-time PAC learning algorithm for DNF formulae where the hypothesis is an OR-of-thresholds. Note that as special cases, we show that neither DNF nor OR-of-thresholds are properly learnable unless NP = RP. Previous hardness results have required strong restrictions on the size of the output DNF formula. We also prove that it is NP-hard to learn the intersection of $\ell \geq 2$ halfspaces by the intersection of $k$ halfspaces for any constant $k > 0$. Previous work held for the case when $k = \ell$; 2) Assuming that $\mathrm{NP} \nsubseteq \mathrm{DTIME}(2^{n^{\epsilon}})$ for a certain constant $\epsilon < 1$, we show that it is not possible to learn size-$s$ decision trees by size-$s^k$ decision trees for any $k \geq 0$. Previous hardness results for learning decision trees held for $k \leq 2$; 3) We present the first nontrivial upper bounds on properly learning DNF formulae and decision trees. In particular we show how to learn size-$s$ DNF by DNF in time $2^{\tilde{O}(\sqrt{n \log s})}$, and how to learn size-$s$ decision trees by decision trees in time $n^{O(\log s)}$. The hardness results for DNF formulae and intersections of halfspaces are obtained via specialized graph products for amplifying the hardness of approximating the chromatic number, as well as by applying work on the hardness of approximate hypergraph coloring. The hardness results for decision trees, as well as the upper bounds, are obtained by developing a connection between automatizability in proof complexity and learnability, which may have other applications.

FOCS Conference 2003 Conference Paper

Algorithms and Complexity Results for #SAT and Bayesian Inference

  • Fahiem Bacchus
  • Shannon Dalmao
  • Toniann Pitassi

Bayesian inference is an important problem with numerous applications in probabilistic reasoning. Counting satisfying assignments is a closely related problem of fundamental theoretical importance. In this paper, we show that plain old DPLL equipped with memorization (an algorithm we call #DPLLCache) can solve both of these problems with time complexity that is at least as good as state-of-the-art exact algorithms, and that it can also achieve the best known time-space tradeoff. We then proceed to show that there are instances where #DPLLCache can achieve an exponential speedup over existing algorithms.
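A minimal sketch of the core idea: DPLL search whose recursion is memoized on the residual formula (a toy model of the #DPLLCache idea without component splitting; the clause representation and naming are ours):

```python
from functools import lru_cache

def count_models(clauses, n_vars):
    """Count satisfying assignments of a CNF given as tuples of nonzero ints,
    e.g. [(1, 2), (-1, 2)] over n_vars = 2 variables has 2 models."""
    def assign(cls, lit):
        # set literal `lit` true: drop satisfied clauses, shrink the rest
        out = []
        for c in cls:
            if lit in c:
                continue                      # clause already satisfied
            c2 = tuple(l for l in c if l != -lit)
            if not c2:
                return None                   # empty clause: dead branch
            out.append(c2)
        return frozenset(out)

    @lru_cache(maxsize=None)                  # the "cache" in #DPLLCache
    def count(cls, free):
        if cls is None:
            return 0                          # contradiction reached
        if not cls:
            return 2 ** free                  # all remaining vars are free
        v = abs(next(iter(next(iter(cls)))))  # branch on some variable
        return (count(assign(cls, v), free - 1) +
                count(assign(cls, -v), free - 1))

    return count(frozenset(tuple(sorted(c)) for c in clauses), n_vars)
```

On the docstring's example, `count_models([(1, 2), (-1, 2)], 2)` returns 2, since the formula simplifies to $x_2$. The cache is what lets identical residual subformulas reached along different branches be counted once.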

FOCS Conference 2003 Conference Paper

Rank Bounds and Integrality Gaps for Cutting Planes Procedures

  • Joshua Buresh-Oppenheim
  • Nicola Galesi
  • Shlomo Hoory
  • Avner Magen
  • Toniann Pitassi

We present a new method for proving rank lower bounds for Cutting Planes (CP) and several procedures based on lifting due to Lovász and Schrijver (LS), when viewed as proof systems for unsatisfiability. We apply this method to obtain the following new results: first, we prove near-optimal rank bounds for Cutting Planes and Lovász-Schrijver proofs for several prominent unsatisfiable CNF examples, including random k-CNF formulas and the Tseitin graph formulas. It follows from these lower bounds that a linear number of rounds of CP or LS procedures when applied to relaxations of integer linear programs is not sufficient for reducing the integrality gap. Secondly, we give unsatisfiable examples that have constant rank CP and LS proofs but that require linear rank resolution proofs. Thirdly, we give examples where the CP rank is O(log n) but the LS rank is linear. Finally, we address the question of size versus rank: we show that, for both proof systems, rank does not accurately reflect proof size. Specifically, there are examples with polynomial-size CP/LS proofs, but requiring linear rank.

FOCS Conference 2002 Conference Paper

Bounded-Depth Frege Lower Bounds for Weaker Pigeonhole Principles

  • Joshua Buresh-Oppenheim
  • Paul Beame
  • Toniann Pitassi
  • Ran Raz
  • Ashish Sabharwal

We prove a quasi-polynomial lower bound on the size of bounded-depth Frege proofs of the pigeonhole principle $\mathrm{PHP}^{m}_{n}$ where $m = (1 + 1/\mathrm{polylog}\, n)n$. This lower bound qualitatively matches the known quasipolynomial-size bounded-depth Frege proofs for these principles. Our technique, which uses a switching lemma argument like other lower bounds for bounded-depth Frege proofs, is novel in that the tautology to which this switching lemma is applied remains random throughout the argument.

NeurIPS Conference 2000 Conference Paper

A Gradient-Based Boosting Algorithm for Regression Problems

  • Richard Zemel
  • Toniann Pitassi

In adaptive boosting, several weak learners trained sequentially are combined to boost the overall algorithm performance. Recently, adaptive boosting methods for classification problems have been derived as gradient descent algorithms. This formulation justifies key elements and parameters in the methods, all chosen to optimize a single common objective function. We propose an analogous formulation for adaptive boosting of regression problems, utilizing a novel objective function that leads to a simple boosting algorithm. We prove that this method reduces training error, and compare its performance to other regression methods.
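A skeletal gradient-style boosting loop for regression, for readers who want the general shape (this version fits residuals under squared loss; it illustrates the family of methods discussed, not the paper's specific objective, and `fit_weak` is a hypothetical weak-learner trainer):

```python
import numpy as np

def boost_regression(X, y, fit_weak, rounds=50, lr=0.5):
    # each stage fits a weak learner to the current residuals and adds a
    # damped copy to the ensemble; under squared loss the negative
    # gradient of the objective is exactly the residual y - pred
    y = np.asarray(y, dtype=float)
    pred = np.zeros_like(y)
    ensemble = []
    for _ in range(rounds):
        h = fit_weak(X, y - pred)      # hypothetical: returns a callable
        pred = pred + lr * h(X)
        ensemble.append(h)
    return lambda Xq: lr * sum(h(Xq) for h in ensemble)
```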

MFCS Conference 1998 Conference Paper

Minimum Propositional Proof Length is NP-Hard to Linearly Approximate

  • Michael Alekhnovich
  • Sam Buss
  • Shlomo Moran
  • Toniann Pitassi

We prove that the problem of determining the minimum propositional proof length is NP-hard to approximate within any constant factor. These results hold for all Frege systems, for all extended Frege systems, for resolution and Horn resolution, and for the sequent calculus and the cut-free sequent calculus. Also, if NP is not in $\mathrm{QP} = \mathrm{DTIME}(n^{\log^{O(1)} n})$, then it is impossible to approximate minimum propositional proof length within a factor of $2^{\log^{1 - \varepsilon} n}$ for any $\varepsilon > 0$. All these hardness of approximation results apply to proof length measured either by number of symbols or by number of inferences, for tree-like or dag-like proofs. We introduce the Monotone Minimum (Circuit) Satisfying Assignment problem and prove the same hardness results for Monotone Minimum (Circuit) Satisfying Assignment.

CSL Conference 1998 Conference Paper

Resolution and the Weak Pigeonhole Principle

  • Sam Buss
  • Toniann Pitassi

We give new upper bounds for resolution proofs of the weak pigeonhole principle. We also give lower bounds for tree-like resolution proofs. We present a normal form for resolution proofs of pigeonhole principles based on a new monotone resolution rule.

FOCS Conference 1997 Conference Paper

No Feasible Interpolation for TC0-Frege Proofs

  • Maria Luisa Bonet
  • Toniann Pitassi
  • Ran Raz

The interpolation method has been one of the main tools for proving lower bounds for propositional proof systems. Loosely speaking, if one can prove that a particular proof system has the feasible interpolation property, then a generic reduction can (usually) be applied to prove lower bounds for the proof system, sometimes assuming a (usually modest) complexity-theoretic assumption. In this paper, we show that this method cannot be used to obtain lower bounds for Frege systems, or even for $\mathrm{TC}^0$-Frege systems. More specifically, we show that unless factoring is feasible, neither Frege nor $\mathrm{TC}^0$-Frege has the feasible interpolation property. In order to carry out our argument, we show how to carry out proofs of many elementary axioms/theorems of arithmetic in polynomial-size $\mathrm{TC}^0$-Frege. In particular, we show how to carry out the proof of the Chinese Remainder Theorem, which may be of independent interest. As a corollary, we obtain that $\mathrm{TC}^0$-Frege, as well as any proof system that polynomially simulates it, is not automatizable (under a hardness assumption).

FOCS Conference 1996 Conference Paper

Simplified and Improved Resolution Lower Bounds

  • Paul Beame
  • Toniann Pitassi

We give simple new lower bounds on the lengths of resolution proofs for the pigeonhole principle and for randomly generated formulas. For random formulas, our bounds significantly extend the range of formula sizes for which non-trivial lower bounds are known. For example, we show that with probability approaching 1, any resolution refutation of a randomly chosen 3-CNF formula with at most $n^{6/5-\epsilon}$ clauses requires exponential size. Previous bounds applied only when the number of clauses was at most linear in the number of variables. For the pigeonhole principle our bound is a small improvement over previous bounds. Our proofs are more elementary than previous arguments, and establish a connection between resolution proof size and maximum clause size.

FOCS Conference 1995 Conference Paper

Improved Depth Lower Bounds for Small Distance Connectivity

  • Paul Beame
  • Russell Impagliazzo
  • Toniann Pitassi

We consider the problem of determining, given a graph $G$ and specified nodes $s$ and $t$, whether or not there is a path of at most $k$ edges in $G$ from $s$ to $t$. We show that solving this problem on polynomial-size unbounded fan-in circuits requires depth $\Omega(\log\log k)$, improving on a depth lower bound of $\Omega(\log^{*} k)$ when $k = \log^{O(1)} n$. In addition we show that there is a constant $c$ such that for $k \leq \log n$, any depth-$d$ unbounded fan-in circuit for this problem requires size at least $n^{c k^{\epsilon_d}}$, where $\epsilon_d = \phi^{-2d}/3$ and $\phi$ is the golden mean. This latter result improves on an $n^{\Omega(\log^{(d+3)} k)}$ bound, where $\log^{(i)}$ is the $i$-fold composition of $\log$ with itself. The key to our technique is a new form of switching lemma which combines some of the features of iteratively shortening terms due to Furst, Saxe, and Sipser (1981) and Ajtai (1983) with the kinds of switching lemma arguments introduced by Yao (1985), Håstad (1986), and Cai (1986) that have been the methods of choice for subsequent results.

FOCS Conference 1994 Conference Paper

Lower Bounds on Hilbert's Nullstellensatz and Propositional Proofs

  • Paul Beame
  • Russell Impagliazzo
  • Jan Krajícek
  • Toniann Pitassi
  • Pavel Pudlák

The weak form of Hilbert's Nullstellensatz says that a system of algebraic equations over a field, $Q_i(\bar{x}) = 0$, does not have a solution in the algebraic closure iff $1$ is in the ideal generated by the polynomials $Q_i(\bar{x})$. We shall prove a lower bound on the degrees of polynomials $P_i(\bar{x})$ such that $\sum_i P_i(\bar{x}) Q_i(\bar{x}) = 1$. This result has the following application. The modular counting principle states that no finite set whose cardinality is not divisible by $q$ can be partitioned into $q$-element classes. For each fixed cardinality $N$, this principle can be expressed as a propositional formula $\mathrm{Count}_q^N$. Ajtai (1988) proved recently that, whenever $p, q$ are two different primes, the propositional formulas $\mathrm{Count}_q^{qn+1}$ do not have polynomial-size, constant-depth Frege proofs from instances of $\mathrm{Count}_p^m$, $m \not\equiv 0 \pmod{p}$. We give a new proof of this theorem based on the lower bound for Hilbert's Nullstellensatz. Furthermore, our technique enables us to extend the independence results for counting principles to composite numbers $p$ and $q$. This results in an exact characterization of when $\mathrm{Count}_q$ can be proven efficiently from $\mathrm{Count}_p$, for all $p$ and $q$.

FOCS Conference 1992 Conference Paper

The Complexity of the Hajós Calculus

  • Toniann Pitassi
  • Alasdair Urquhart

The Hajós construction is a simple, nondeterministic procedure for generating the class of graphs that are not 3-colorable. A. J. Mansfield and D. J. A. Welsh have posed the problem of proving whether or not there exists a polynomial-size Hajós construction for every non-3-colorable graph. The main result of this paper is a proof that the Hajós calculus is polynomially bounded if and only if extended Frege proof systems are polynomially bounded. This result links an open problem in graph theory to an important open problem in the complexity of propositional proof systems. In addition, the authors establish an exponential lower bound for a strong subsystem of the Hajós calculus. Lastly, they discuss an interesting graph-theoretical consequence of this result.