Arrow Research search

Author name cluster

A. Pavan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
1 author row

Possible papers

7

IJCAI Conference 2024 Conference Paper

Improved Evolutionary Algorithms for Submodular Maximization with Cost Constraints

  • Yanhui Zhu
  • Samik Basu
  • A. Pavan

We present an evolutionary algorithm evo-SMC for the problem of Submodular Maximization under Cost constraints (SMC). Our algorithm achieves 1/2-approximation with a high probability 1-1/n within O(n2 K) iterations, where K denotes the maximum size of a feasible solution set with cost constraint B. To the best of our knowledge, this is the best approximation guarantee offered by evolutionary algorithms for this problem. We further refine evo-SMC and develop st-evo-SMC. This stochastic version yields a significantly faster algorithm while maintaining the approximation ratio of 1/2, with probability 1-epsilon. The required number of iterations reduces to O(nK log(1/epsilon)/p), where the user defined parameters p represents the stochasticity probability, and epsilon denotes the error threshold. Finally, the empirical evaluations carried out through extensive experimentation substantiate the efficiency and effectiveness of our proposed algorithms. Our algorithms consistently outperform existing methods, producing higher-quality solutions.

NeurIPS Conference 2024 Conference Paper

Replicability in Learning: Geometric Partitions and KKM-Sperner Lemma

  • Jason Vander Woude
  • Peter Dixon
  • A. Pavan
  • Jamie Radcliffe
  • N. V. Vinodchandran

This paper studies replicability in machine learning tasks from a geometric viewpoint. Recent works have revealed the role of geometric partitions and Sperner's lemma (and its variations) in designing replicable learning algorithms and in establishing impossibility results. A partition $\mathcal{P}$ of $\mathbb{R}^d$ is called a $(k, \epsilon)$-secluded partition if for every $\vec{p}\in\mathbb{R}^d$, an $\varepsilon$-radius ball (with respect to the $\ell_{\infty}$ norm) centered at $\vec{p}$ intersects at most $k$ members of $\mathcal{P}$. In relation to replicable learning, the parameter $k$ is closely related to the $\textit{list complexity}$, and the parameter $\varepsilon$ is related to the sample complexity of the replicable learner. Construction of secluded partitions with better parameters (small $k$ and large $\varepsilon$) will lead to replicable learning algorithms with small list and sample complexities. Motivated by this connection, we undertake a comprehensive study of secluded partitions and establish near-optimal relationships between $k$ and $\varepsilon$. 1. We show that for any $(k, \epsilon)$-secluded partition where each member has at most unit measure, it must be that $k \geq(1+2\varepsilon)^d$, and consequently, for the interesting regime $k\in[2^d]$ it must be that $\epsilon\leq\frac{\log_4(k)}{d}$. 2. To complement this upper bound on $\epsilon$, we show that for each $d\in\mathbb{N}$ and each viable $k\in[2^d]$, a construction of a $(k, \epsilon)$-secluded (unit cube) partition with $\epsilon\geq\frac{\log_4(k)}{d}\cdot\frac{1}{8\log_4(d+1)}$. This establishes the optimality of $\epsilon$ within a logarithmic factor. 3. Finally, we adapt our proof techniques to obtain a new ``neighborhood'' variant of the cubical KKM lemma (or cubical Sperner's lemma): For any coloring of $[0, 1]^d$ in which no color is used on opposing faces, it holds for each $\epsilon\in(0, \frac12]$ that there is a point where the open $\epsilon$-radius $\ell_\infty$-ball intersects at least $(1+\frac23\epsilon)^d$ colors. While the classical Sperner/KKM lemma guarantees the existence of a point that is "adjacent" to points with $(d+1)$ distinct colors, the neighborhood version guarantees the existence of a small neighborhood with exponentially many points with distinct colors.

AAAI Conference 2023 Conference Paper

Constraint Optimization over Semirings

  • A. Pavan
  • Kuldeep S. Meel
  • N. V. Vinodchandran
  • Arnab Bhattacharyya

Interpretations of logical formulas over semirings (other than the Boolean semiring) have applications in various areas of computer science including logic, AI, databases, and security. Such interpretations provide richer information beyond the truth or falsity of a statement. Examples of such semirings include Viterbi semiring, min-max or access control semiring, tropical semiring, and fuzzy semiring. The present work investigates the complexity of constraint optimization problems over semirings. The generic optimization problem we study is the following: Given a propositional formula phi over n variable and a semiring (K,+,.,0,1), find the maximum value over all possible interpretations of phi over K. This can be seen as a generalization of the well-known satisfiability problem (a propositional formula is satisfiable if and only if the maximum value over all interpretations/assignments over the Boolean semiring is 1). A related problem is to find an interpretation that achieves the maximum value. In this work, we first focus on these optimization problems over the Viterbi semiring, which we call optConfVal and optConf. We first show that for general propositional formulas in negation normal form, optConfVal and optConf are in FP^NP. We then investigate optConf when the input formula phi is represented in the conjunctive normal form. For CNF formulae, we first derive an upper bound on the value of optConf as a function of the number of maximum satisfiable clauses. In particular, we show that if r is the maximum number of satisfiable clauses in a CNF formula with m clauses, then its optConf value is at most 1/4^(m-r). Building on this we establish that optConf for CNF formulae is hard for the complexity class FP^NP[log]. We also design polynomial-time approximation algorithms and establish an inapproximability for optConfVal. We establish similar complexity results for these optimization problems over other semirings including tropical, fuzzy, and access control semirings.

NeurIPS Conference 2023 Conference Paper

List and Certificate Complexities in Replicable Learning

  • Peter Dixon
  • A. Pavan
  • Jason Vander Woude
  • N. V. Vinodchandran

We investigate replicable learning algorithms. Informally a learning algorithm is replicable if the algorithm outputs the same canonical hypothesis over multiple runs with high probability, even when different runs observe a different set of samples from the unknown data distribution. In general, such a strong notion of replicability is not achievable. Thus we consider two feasible notions of replicability called {\em list replicability} and {\em certificate replicability}. Intuitively, these notions capture the degree of (non) replicability. The goal is to design learning algorithms with optimal list and certificate complexities while minimizing the sample complexity. Our contributions are the following. 1. We first study the learning task of estimating the biases of $d$ coins, up to an additive error of $\varepsilon$, by observing samples. For this task, we design a $(d+1)$-list replicable algorithm. To complement this result, we establish that the list complexity is optimal, i. e there are no learning algorithms with a list size smaller than $d+1$ for this task. We also design learning algorithms with certificate complexity $\tilde{O}(\log d)$. The sample complexity of both these algorithms is $\tilde{O}(\frac{d^2}{\varepsilon^2})$ where $\varepsilon$ is the approximation error parameter (for a constant error probability). 2. In the PAC model, we show that any hypothesis class that is learnable with $d$-nonadaptive statistical queries can be learned via a $(d+1)$-list replicable algorithm and also via a $\tilde{O}(\log d)$-certificate replicable algorithm. The sample complexity of both these algorithms is $\tilde{O}(\frac{d^2}{\nu^2})$ where $\nu$ is the approximation error of the statistical query. We also show that for the concept class \dtep, the list complexity is exactly $d+1$ with respect to the uniform distribution. To establish our upper bound results we use rounding schemes induced by geometric partitions with certain properties. We use Sperner/KKM Lemma to establish the lower bound results.

IJCAI Conference 2023 Conference Paper

On Approximating Total Variation Distance

  • Arnab Bhattacharyya
  • Sutanu Gayen
  • Kuldeep S. Meel
  • Dimitrios Myrisiotis
  • A. Pavan
  • N. V. Vinodchandran

Total variation distance (TV distance) is a fundamental notion of distance between probability distributions. In this work, we introduce and study the problem of computing the TV distance of two product distributions over the domain {0, 1}^n. In particular, we establish the following results. 1. The problem of exactly computing the TV distance of two product distributions is #P-complete. This is in stark contrast with other distance measures such as KL, Chi-square, and Hellinger which tensorize over the marginals leading to efficient algorithms. 2. There is a fully polynomial-time deterministic approximation scheme (FPTAS) for computing the TV distance of two product distributions P and Q where Q is the uniform distribution. This result is extended to the case where Q has a constant number of distinct marginals. In contrast, we show that when P and Q are Bayes net distributions the relative approximation of their TV distance is NP-hard.

TCS Journal 2007 Journal Article

Polylogarithmic-round interactive proofs for coNP collapse the exponential hierarchy

  • A. Pavan
  • Alan L. Selman
  • Samik Sengupta
  • N.V. Vinodchandran

If every language in coNP has a constant-round interactive proof system, then the polynomial-time hierarchy collapses [R. B. Boppana, J. Håstad, S. Zachos, Does co-NP have short interactive proofs? Information Processing Letters 25 (2) (1987) 127–132]. On the other hand, the well-known LFKN protocol gives O ( n ) -round interactive proof systems for all languages in coNP [C. Lund, L. Fortnow, H. Karloff, N. Nisan, Algebraic methods for interactive proof systems, Journal of the Association for Computing Machinery 39 (4) (1992) 859–868]. We consider the question of whether it is possible for coNP to have interactive proof systems with polylogarithmic-round complexity. We show that this is unlikely by proving that if a coNP-complete set has a polylogarithmic-round interactive proof system, then the exponential-time hierarchy collapses. We also consider exponential versions of the Karp–Lipton theorem and Yap’s theorem.

TCS Journal 2000 Journal Article

Complete distributional problems, hard languages, and resource-bounded measure

  • A. Pavan
  • Alan L. Selman

We say that a distribution μ is reasonable if there exists a constant s⩾0 such that μ({x||x|⩾n})=Ω(1/ns). We prove the following result, which suggests that all DistNP-complete problems have reasonable distributions. If NP contains a DTIME(2n)-bi-immune set, then every DistNP-complete set has a reasonable distribution. It follows from work of Mayordomo [19] that the consequent holds if the p-measure of NP is not zero. Cai and Selman [6] defined a modification and extension of Levin's notion of average polynomial time to arbitrary time-bounds and proved that if L is P-bi-immune, then L is distributionally hard, meaning that, for every polynomial-time computable distribution μ, the distributional problem (L, μ) is not polynomial on the μ-average. We prove the following results, which suggest that distributional hardness is closely related to more traditional notions of hardness. 1. If NP contains a distributionally hard set, then NP contains a P-immune set. 2. There exists a language L that is distributionally hard but not P-bi-immune if and only if P contains a set that is immune to all P-printable sets. The following corollaries follow readily 1. If the p-measure of NP is not zero, then there exists a language L that is distributionally hard but not P-bi-immune. 2. If the p2 -measure of NP is not zero, then there exists a language L in NP that is distributionally hard but not P-bi-immune.