Author name cluster

William M. Hoza

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers

2 author rows

NeurIPS Conference 2024 Conference Paper

Provable Tempered Overfitting of Minimal Nets and Typical Nets

Itamar Harel
William M. Hoza
Gal Vardi
Itay Evron
Nathan Srebro
Daniel Soudry

We study the overfitting behavior of fully connected deep Neural Networks (NNs) with binary weights fitted to perfectly classify a noisy training set. We consider interpolation using both the smallest NN (having the minimal number of weights) and a random interpolating NN. For both learning rules, we prove overfitting is tempered. Our analysis rests on a new bound on the size of a threshold circuit consistent with a partial function. To the best of our knowledge, ours are the first theoretical results on benign or tempered overfitting that: (1) apply to deep NNs, and (2) do not require a very high or very low input dimension.

PDF Details DOI

STOC Conference 2023 Conference Paper

Depth-d Threshold Circuits vs. Depth-(d+1) AND-OR Trees

Pooya Hatami
William M. Hoza
Avishay Tal
Roei Tell

For any n ∈ ℕ and d = o (loglog( n )), we prove that there is a Boolean function F on n bits and a value γ = 2 −Θ( d ) such that F can be computed by a uniform depth-( d + 1) AC 0 circuit with O ( n ) wires, but F cannot be computed by any depth- d TC 0 circuit with n 1 + γ wires. This bound matches the current state-of-the-art lower bounds for computing explicit functions by threshold circuits of depth d > 2, which were previously known only for functions outside AC 0 such as the parity function. Furthermore, in our result, the AC 0 circuit computing F is a monotone *read-once formula* (i.e., an AND-OR tree), and the lower bound holds even in the average-case setting with respect to advantage n −γ . At a high level, our proof strategy combines two prominent approaches in circuit complexity from the last decade: The celebrated *random projections* method of Håstad, Rossman, Servedio, and Tan (J. ACM 2017), which was previously used to show a tight average-case depth hierarchy for AC 0 ; and the line of works analyzing the effect of *random restrictions* on threshold circuits. We show that under a modified version of Håstad, Rossman, Servedio, and Tan’s projection procedure, any depth- d threshold circuit with n 1 + γ wires simplifies to a near-trivial function, whereas an appropriately parameterized AND-OR tree of depth d + 1 maintains structure.

Details

FOCS Conference 2023 Conference Paper

Weighted Pseudorandom Generators via Inverse Analysis of Random Walks and Shortcutting

Lijie Chen 0001
William M. Hoza
Xin Lyu 0002
Avishay Tal
Hongxun Wu

A weighted pseudorandom generator (WPRG) is a generalization of a pseudorandom generator (PRG) in which, roughly speaking, probabilities are replaced with weights that are permitted to be positive or negative. We present new explicit constructions of WPRGs that fool certain classes of standard-order read-once branching programs. In particular, our WPRGs fool width-3 programs, constant-width regular programs, and unbounded-width permutation programs with a single accepting vertex. In all three cases, the seed length is $\widetilde{O}(\log n \cdot \sqrt{\log (1 / \varepsilon)}+\log (1 / \varepsilon))$, where n is the length of the program and $\varepsilon$ is the error of the WPRG. For comparison, for all three of these models, the best explicit unweighted PRGs known have seed length $\widetilde{O}(\log n$. $\log (1 / \varepsilon)$) (Meka, Reingold, and Tal STOC 2019; Braverman, Rao, Raz, and Yehudayoff SICOMP 2014; Hoza, Pyne, and Vadhan ITCS 2021). Our WPRG seed length is superior when $\varepsilon$ is small. For the case of unbounded-width permutation programs, Pyne and Vadhan previously constructed a WPRG with a seed length that is similar to ours (CCC 2021), but their seed length has an extra additive $\log ^{3 / 2} n$ term, so our WPRG is superior when $\varepsilon \gg 1 / n$. Our results are based on a new, general framework for error reduction. Our framework builds on the remarkable recent work by Ahmadinejad, Kelner, Murtagh, Peebles, Sidford, and Vadhan (FOCS 2020) that gave a near-logarithmic space algorithm for estimating random walk probabilities in Eulerian digraphs with high precision. Our framework centers around the “inverse analysis” of random walks and a key combinatorial structure termed “shortcut graphs. ” Using our new framework and the recent notion of singular value approximation (Ahmadinejad, Peebles, Pyne, Sidford, and Vadhan arXiv 2023), we also present an alternative, simpler proof of Ahmadinejad, Kelner, Murtagh, Peebles, Sidford, and Vadhan’s main theorem. Compared to the original proof, our new proof avoids much of the sophisticated machinery that was imported from recent work on fast Laplacian solvers.

Details

FOCS Conference 2021 Conference Paper

Fooling Constant-Depth Threshold Circuits (Extended Abstract)

Pooya Hatami
William M. Hoza
Avishay Tal
Roei Tell

We present new constructions of pseudorandom generators (PRGs) for two of the most widely studied non-uniform circuit classes in complexity theory. Our main result is a construction of the first non-trivial PRG for linear threshold (LTF) circuits of arbitrary constant depth and super-linear size. This PRG fools circuits with depth $d\in\mathbb{N}$ and $n^{1+\delta}$ wires, where $\delta=2^{-O(d)}$, using seed length $O(n^{1-\delta})$ and with error $2^{-n^{\delta}}$. This tightly matches the best known lower bounds for this circuit class. As a consequence of our result, all the known hardness for LTF circuits has now effectively been translated into pseudorandomness. This brings the extensive effort in the last decade to construct PRGs and deterministic circuit-analysis algorithms for this class to the point where any subsequent improvement would yield breakthrough lower bounds. Our second contribution is a PRG for De Morgan formulas of size $s$ whose seed length is $s^{1/3+o(1)}\cdot\text{polylog}(1/\epsilon)$ for error $\epsilon$. In particular, our PRG can fool formulas of sub-cubic size $s=n^{3-\Omega(1)}$ with an exponentially small error $\epsilon=\exp(-n^{\Omega(1)})$. This significantly improves the inverse-polynomial error of the previous state-of-the-art for such formulas by Impagliazzo, Meka, and Zuckerman (FOCS 2012, JACM 2019), and again tightly matches the best currently-known lower bounds for this class. In both settings, a key ingredient in our constructions is a pseudorandom restriction procedure that has tiny failure probability, but simplifies the function to a non-natural “hybrid computational model” that combines several computational models. As part of our proofs we also construct “extremely low-error” PRGs for related circuit classes; for example, we construct a PRG for arbitrary functions of $s$ LTFs that can handle even the extreme setting of parameters $s=n/\text{polylog}(n)$ and $\epsilon=2^{-n/\text{polylog}(n)}$.

Details

FOCS Conference 2018 Conference Paper

Simple Optimal Hitting Sets for Small-Success RL

William M. Hoza
David Zuckerman

We give a simple explicit hitting set generator for read-once branching programs of width w and length r with known variable order. When r = w, our generator has seed length O(log^2 r + log(1/ε)). When r = polylog w, our generator has optimal seed length O(log w + log(1/ε)). For intermediate values of r, our generator's seed length smoothly interpolates between these two extremes. Our generator's seed length improves on recent work by Braverman, Cohen, and Garg (STOC '18). In addition, our generator and its analysis are dramatically simpler than the work by Braverman et al. Our generator's seed length improves on all the classic generators for space-bounded computation (Nisan Combinatorica '92; Impagliazzo, Nisan, and Wigderson STOC '94; Nisan and Zuckerman JCSS '96) when eps is small. As a corollary of our construction, we show that every RL algorithm that uses r random bits can be simulated by an NL algorithm that uses only O(r/log^c n) nondeterministic bits, where c is an arbitrarily large constant. Finally, we show that any RL algorithm with small success probability eps can be simulated deterministically in space O(log^3/2 n + log n log log(1/ε)). This improves on work by Saks and Zhou (JCSS '99), who gave an algorithm that runs in space O(log^3/2 n + sqrt(log n) log(1/ε)).

Details

STOC Conference 2017 Conference Paper

Targeted pseudorandom generators, simulation advice generators, and derandomizing logspace

William M. Hoza
Chris Umans

Details

SODA Conference 2016 Conference Paper

The Adversarial Noise Threshold for Distributed Protocols

William M. Hoza
Leonard J. Schulman

We consider the problem of implementing distributed protocols, despite adversarial channel errors, on synchronous-messaging networks with arbitrary topology. In our first result we show that any n -party T -round protocol on an undirected communication network G can be compiled into a robust simulation protocol on a sparse ( ℴ ( n ) edges) subnetwork so that the simulation tolerates an adversarial error rate of; the simulation has a round complexity of, where m is the number of edges in G. (So the simulation is work-preserving up to a log factor.) The adversary's error rate is within a constant factor of optimal. Given the error rate, the round complexity blowup is within a factor of ℴ ( k log n ) of optimal, where k is the edge connectivity of G. We also determine that the maximum tolerable error rate on directed communication networks is ⊝(1/ s ) where s is the number of edges in a minimum equivalent digraph. Next we investigate adversarial per-edge error rates, where the adversary is given an error budget on each edge of the network. We determine the limit for tolerable per-edge error rates on an arbitrary directed graph to within a factor of 2. However, the construction that approaches this limit has exponential round complexity, so we give another compiler, which transforms T -round protocols into ℴ ( mT )-round simulations, and prove that for polynomial-query black box compilers, the per-edge error rate tolerated by this last compiler is within a constant factor of optimal.

Details