Arrow Research search

Author name cluster

Cynthia Dwork

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

43 papers
2 author rows

Possible papers (43)

NeurIPS Conference 2025 Conference Paper

How Many Domains Suffice for Domain Generalization? A Tight Characterization via the Domain Shattering Dimension

  • Cynthia Dwork
  • Lunjia Hu
  • Han Shao

We study a fundamental question of domain generalization: given a family of domains (i.e., data distributions), how many randomly sampled domains do we need to collect data from in order to learn a model that performs reasonably well on every seen and unseen domain in the family? We model this problem in the PAC framework and introduce a new combinatorial measure, which we call the domain shattering dimension. We show that this dimension characterizes the domain sample complexity. Furthermore, we establish a tight quantitative relationship between the domain shattering dimension and the classic VC dimension, demonstrating that every hypothesis class that is learnable in the standard PAC setting is also learnable in our setting.

STOC Conference 2024 Conference Paper

Complexity-Theoretic Implications of Multicalibration

  • Sílvia Casacuberta
  • Cynthia Dwork
  • Salil P. Vadhan

We present connections between the recent literature on multigroup fairness for prediction algorithms and classical results in computational complexity. Multiaccurate predictors are correct in expectation on each member of an arbitrary collection of pre-specified sets. Multicalibrated predictors satisfy a stronger condition: they are calibrated on each set in the collection. Multiaccuracy is equivalent to a regularity notion for functions defined by Trevisan, Tulsiani, and Vadhan (2009). They showed that, given a class F of (possibly simple) functions, an arbitrarily complex function g can be approximated by a low-complexity function h that makes a small number of oracle calls to members of F, where the notion of approximation requires that h cannot be distinguished from g by members of F. This complexity-theoretic Regularity Lemma is known to have implications in a number of areas, including complexity theory, additive number theory, information theory, graph theory, and cryptography. Starting from the stronger notion of multicalibration, we obtain stronger and more general versions of a number of applications of the Regularity Lemma, including the Hardcore Lemma, the Dense Model Theorem, and the equivalence of conditional pseudo-min-entropy and unpredictability. For example, we show that every boolean function (regardless of its hardness) has a small collection of disjoint hardcore sets, where the sizes of those hardcore sets are related to how balanced the function is on corresponding pieces of an efficient partition of the domain.
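
For orientation, the two notions can be stated compactly (our paraphrase in standard notation, not text from the paper; α is an accuracy parameter, and the conditional version glosses over the measure of each level set):

```latex
% Multiaccuracy with respect to a class F:
\big|\,\mathbb{E}_{(x,y)}\big[f(x)\,(y - p(x))\big]\,\big| \le \alpha
\quad \text{for all } f \in F.

% Multicalibration strengthens this to hold on every level set of p:
\big|\,\mathbb{E}_{(x,y)}\big[f(x)\,(y - p(x)) \mid p(x) = v\big]\,\big| \le \alpha
\quad \text{for all } f \in F \text{ and all values } v.
```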

NeurIPS Conference 2024 Conference Paper

Order-Independence Without Fine Tuning

  • Reid McIlroy-Young
  • Katrina Brown
  • Conlan Olson
  • Linjun Zhang
  • Cynthia Dwork

The development of generative language models that can create long and coherent textual outputs via autoregression has led to a proliferation of uses and a corresponding sweep of analyses as researchers work to determine the limitations of this new paradigm. Unlike humans, these 'Large Language Models' (LLMs) are highly sensitive to small changes in their inputs, leading to unwanted inconsistency in their behavior. One problematic inconsistency when LLMs are used to answer multiple-choice questions or analyze multiple inputs is order dependency: the output of an LLM can (and often does) change significantly when sub-sequences are swapped, despite both orderings being semantically identical. In this paper we present Set-Based Prompting, a technique that guarantees the output of an LLM will not have order dependence on a specified set of sub-sequences. We show that this method provably eliminates order dependency, and that it can be applied to any transformer-based LLM to enable text generation that is unaffected by re-orderings. Delving into the implications of our method, we show that, despite our inputs being out of distribution, the impact on expected accuracy is small, where the expectation is over a uniformly chosen ordering of the candidate responses, and usually significantly less in practice. Thus, Set-Based Prompting can be used as a 'dropped-in' method on fully trained models. Finally, we discuss how our method's success suggests that other strong guarantees can be obtained on LLM performance via modifying the input representations. Code is available at github.com/reidmcy/set-based-prompting.
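
The underlying mechanism presents the specified sub-sequences to the model in parallel: they share position encodings and are isolated by a 2D attention mask, so no ordering information reaches the model. A minimal numpy sketch of such a mask (function name and layout are our own illustration; the released code at the repository above is authoritative):

```python
import numpy as np

def order_independent_mask(prefix_len, option_lens, suffix_len):
    """Build an attention mask (1 = may attend) in which parallel option
    spans cannot attend to one another, plus position ids that restart at
    the same offset for every option, so swapping the options leaves the
    model's view of the input unchanged."""
    n = prefix_len + sum(option_lens) + suffix_len
    mask = np.tril(np.ones((n, n), dtype=np.int8))  # causal baseline
    spans, s = [], prefix_len
    for length in option_lens:
        spans.append((s, s + length))
        s += length
    for i, (a0, a1) in enumerate(spans):            # block cross-option
        for j, (b0, b1) in enumerate(spans):        # attention both ways
            if i != j:
                mask[a0:a1, b0:b1] = 0
    pos = list(range(prefix_len))                   # prefix: 0..p-1
    for a0, a1 in spans:                            # options share offsets
        pos.extend(range(prefix_len, prefix_len + (a1 - a0)))
    tail = prefix_len + max(option_lens)            # suffix after longest
    pos.extend(range(tail, tail + suffix_len))
    return mask, np.array(pos)
```

Feeding `mask` and the shared position ids to a transformer in place of the usual causal mask and sequential positions makes the options mutually invisible, which is what removes the order dependence.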

ICLR Conference 2021 Conference Paper

Private Post-GAN Boosting

  • Marcel Neunhoeffer
  • Zhiwei Steven Wu
  • Cynthia Dwork

Differentially private GANs have proven to be a promising approach for generating realistic synthetic data without compromising the privacy of individuals. Due to the privacy-protective noise introduced in the training, the convergence of GANs becomes even more elusive, which often leads to poor utility in the output generator at the end of training. We propose Private post-GAN boosting (Private PGB), a differentially private method that combines samples produced by the sequence of generators obtained during GAN training to create a high-quality synthetic dataset. To that end, our method leverages the Private Multiplicative Weights method (Hardt and Rothblum, 2010) to reweight generated samples. We evaluate Private PGB on two-dimensional toy data, MNIST images, US Census data and a standard machine learning prediction task. Our experiments show that Private PGB improves upon a standard private GAN approach across a collection of quality measures. We also provide a non-private variant of PGB that improves the data quality of standard GAN training.
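
A heavily simplified sketch of the reweighting idea (our abstraction: `scores[t, i]` stands for a privately computed discriminator score for candidate sample i at round t; the paper's private selection of discriminators and its privacy accounting are omitted):

```python
import numpy as np

def pgb_reweight(scores, eta=0.1):
    """Multiplicative-weights reweighting of candidate synthetic samples
    drawn from the saved generators: samples that discriminators judge
    more realistic accumulate weight across rounds. Non-private sketch of
    the reweighting step only."""
    n = scores.shape[1]
    w = np.ones(n) / n
    for round_scores in scores:
        w *= np.exp(eta * round_scores)
        w /= w.sum()
    return w  # sample the final synthetic dataset according to w
```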

ICML Conference 2020 Conference Paper

Interpreting Robust Optimization via Adversarial Influence Functions

  • Zhun Deng
  • Cynthia Dwork
  • Jialiang Wang 0001
  • Linjun Zhang

Robust optimization has been widely used in modern data science, especially in adversarial training. However, little research has been done to quantify how robust optimization changes the optimizers and the prediction losses compared to standard training. In this paper, inspired by the influence function in robust statistics, we introduce the Adversarial Influence Function (AIF) as a tool to investigate the solution produced by robust optimization. The proposed AIF enjoys a closed form and can be calculated efficiently. To illustrate the usage of AIF, we apply it to study model sensitivity, a quantity defined to capture the change of prediction losses on the natural data after implementing robust optimization. We use AIF to analyze how model complexity and randomized smoothing affect the model sensitivity with respect to specific models. We further derive AIF for kernel regressions, with a particular application to neural tangent kernels, and experimentally demonstrate the effectiveness of the proposed AIF. Lastly, we extend the theory of AIF to distributionally robust optimization.

FOCS Conference 2019 Conference Paper

Learning from Outcomes: Evidence-Based Rankings

  • Cynthia Dwork
  • Michael P. Kim
  • Omer Reingold
  • Guy N. Rothblum
  • Gal Yona

Many selection procedures involve ordering candidates according to their qualifications. For example, a university might order applicants according to a perceived probability of graduation within four years, and then select the top 1000 applicants. In this work, we address the problem of ranking members of a population according to their "probability" of success, based on a training set of historical binary outcome data (e.g., graduated in four years or not). We show how to obtain rankings that satisfy a number of desirable accuracy and fairness criteria, despite the coarseness of the training data. As the task of ranking is global (the rank of every individual depends not only on their own qualifications, but also on every other individual's qualifications), ranking is more subtle and vulnerable to manipulation than standard prediction tasks. Towards mitigating unfair discrimination caused by inaccuracies in rankings, we develop two parallel definitions of evidence-based rankings. The first definition relies on a semantic notion of domination-compatibility: if the training data suggest that members of a set S are more qualified (on average) than the members of T, then a ranking that favors T over S (i.e., where T dominates S) is blatantly inconsistent with the evidence, and likely to be discriminatory. The definition asks for domination-compatibility, not just for a pair of sets, but rather for every pair of sets from a rich collection C of subpopulations. The second definition aims at precluding even more general forms of discrimination; this notion of evidence-consistency requires that the ranking must be justified on the basis of consistency with the expectations for every set in the collection C. Somewhat surprisingly, while evidence-consistency is a strictly stronger notion than domination-compatibility when the collection C is predefined, the two notions are equivalent when the collection C may depend on the ranking in question.

STOC Conference 2018 Conference Paper

Composable and versatile privacy via truncated CDP

  • Mark Bun
  • Cynthia Dwork
  • Guy N. Rothblum
  • Thomas Steinke 0002

We propose truncated concentrated differential privacy (tCDP), a refinement of differential privacy and of concentrated differential privacy. This new definition provides robust and efficient composition guarantees, supports powerful algorithmic techniques such as privacy amplification via sub-sampling, and enables more accurate statistical analyses. In particular, we show a central task for which the new definition enables exponential accuracy improvement.
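
For orientation, the definition has the following shape (our recollection, stated informally): M satisfies (ρ, ω)-tCDP if, for all adjacent inputs x and x', the Rényi divergence of order α between output distributions grows at most linearly in α, with the requirement imposed only up to order ω:

```latex
D_{\alpha}\big(M(x)\,\|\,M(x')\big) \;\le\; \rho\,\alpha
\qquad \text{for all } 1 < \alpha < \omega .
```

Truncating the requirement at ω is what admits techniques such as privacy amplification via sub-sampling that concentrated differential privacy alone does not support.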

NeurIPS Conference 2015 Conference Paper

Generalization in Adaptive Data Analysis and Holdout Reuse

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toni Pitassi
  • Omer Reingold
  • Aaron Roth

Overfitting is the bane of data analysts, even when data are plentiful. Formal approaches to understanding this problem focus on statistical inference and generalization of individual analysis procedures. Yet the practice of data analysis is an inherently interactive and adaptive process: new analyses and hypotheses are proposed after seeing the results of previous ones, parameters are tuned on the basis of obtained results, and datasets are shared and reused. An investigation of this gap has recently been initiated by the authors in (Dwork et al., 2014), where we focused on the problem of estimating expectations of adaptively chosen functions. In this paper, we give a simple and practical method for reusing a holdout (or testing) set to validate the accuracy of hypotheses produced by a learning algorithm operating on a training set. Reusing a holdout set adaptively multiple times can easily lead to overfitting to the holdout set itself. We give an algorithm that enables the validation of a large number of adaptively chosen hypotheses, while provably avoiding overfitting. We illustrate the advantages of our algorithm over the standard use of the holdout set via a simple synthetic experiment. We also formalize and address the general problem of data reuse in adaptive data analysis. We show how the differential-privacy-based approach in (Dwork et al., 2014) is applicable much more broadly to adaptive data analysis. We then show that a simple approach based on description length can also be used to give guarantees of statistical validity in adaptive settings. Finally, we demonstrate that these incomparable approaches can be unified via the notion of approximate max-information that we introduce. This, in particular, allows the preservation of statistical validity guarantees even when an analyst adaptively composes algorithms which have guarantees based on either of the two approaches.
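
The holdout mechanism in the paper is Thresholdout. A compact sketch of its core loop (the threshold and noise scale here are illustrative placeholders; the paper's budget accounting and noise refreshing are omitted):

```python
import numpy as np

def thresholdout(train_vals, holdout_vals, threshold=0.04, sigma=0.01,
                 rng=None):
    """For each query, answer with its training-set value unless it
    disagrees with the holdout by more than a noisy threshold, in which
    case return a noise-perturbed holdout value. The noise is what lets
    the holdout survive many adaptive reuses."""
    rng = rng or np.random.default_rng(0)
    answers = []
    for tr, ho in zip(train_vals, holdout_vals):
        if abs(tr - ho) > threshold + rng.laplace(0.0, sigma):
            answers.append(ho + rng.laplace(0.0, sigma))  # noisy holdout
        else:
            answers.append(tr)                            # training value
    return answers
```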

STOC Conference 2015 Conference Paper

Preserving Statistical Validity in Adaptive Data Analysis

  • Cynthia Dwork
  • Vitaly Feldman
  • Moritz Hardt
  • Toniann Pitassi
  • Omer Reingold
  • Aaron Roth 0001

A great deal of effort has been devoted to reducing the risk of spurious scientific discoveries, from the use of sophisticated validation techniques, to deep statistical methods for controlling the false discovery rate in multiple hypothesis testing. However, there is a fundamental disconnect between the theoretical results and the practice of data analysis: the theory of statistical inference assumes a fixed collection of hypotheses to be tested, or learning algorithms to be applied, selected non-adaptively before the data are gathered, whereas in practice data is shared and reused with hypotheses and new analyses being generated on the basis of data exploration and the outcomes of previous analyses. In this work we initiate a principled study of how to guarantee the validity of statistical inference in adaptive data analysis. As an instance of this problem, we propose and investigate the question of estimating the expectations of m adaptively chosen functions on an unknown distribution given n random samples. We show that, surprisingly, there is a way to estimate an exponential in n number of expectations accurately even if the functions are chosen adaptively. This gives an exponential improvement over standard empirical estimators that are limited to a linear number of estimates. Our result follows from a general technique that counter-intuitively involves actively perturbing and coordinating the estimates, using techniques developed for privacy preservation. We give additional applications of this technique to our question.
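
The basic device, answering each adaptively chosen query with a deliberately perturbed empirical estimate, fits in a few lines (a sketch with an illustrative ε; the paper's mechanisms and their analysis are substantially more refined):

```python
import numpy as np

def noisy_answer(sample, query, eps=0.5, rng=None):
    """Answer a [0,1]-valued statistical query with a Laplace-perturbed
    empirical mean. The perturbation is what protects against the analyst
    adapting future queries to the sample's idiosyncrasies."""
    rng = rng or np.random.default_rng(0)
    vals = np.clip(query(sample), 0.0, 1.0)
    return float(vals.mean() + rng.laplace(0.0, 1.0 / (eps * len(sample))))
```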

FOCS Conference 2015 Conference Paper

Robust Traceability from Trace Amounts

  • Cynthia Dwork
  • Adam Smith 0006
  • Thomas Steinke 0002
  • Jonathan R. Ullman
  • Salil P. Vadhan

The privacy risks inherent in the release of a large number of summary statistics were illustrated by Homer et al. (PLoS Genetics, 2008), who considered the case of 1-way marginals of SNP allele frequencies obtained in a genome-wide association study: Given a large number of minor allele frequencies from a case group of individuals diagnosed with a particular disease, together with the genomic data of a single target individual and statistics from a sizable reference dataset independently drawn from the same population, an attacker can determine with high confidence whether or not the target is in the case group. In this work we describe and analyze a simple attack that succeeds even if the summary statistics are significantly distorted, whether due to measurement error or noise intentionally introduced to protect privacy. Our attack only requires that the vector of distorted summary statistics is close to the vector of true marginals in ℓ1 norm. Moreover, the reference pool required by previous attacks can be replaced by a single sample drawn from the underlying population. The new attack, which is not specific to genomics and which handles Gaussian as well as Bernoulli data, significantly generalizes recent lower bounds on the noise needed to ensure differential privacy (Bun, Ullman, and Vadhan, STOC 2014; Steinke and Ullman, 2015), obviating the need for the attacker to control the exact distribution of the data.
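
A toy simulation in the spirit of the attack (sizes, noise level, and the ±1 encoding are our illustrative choices): the statistic correlates the released marginals with the difference between the target's data and a single fresh reference sample, and is typically well above zero only when the target is in the case group.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 20000, 100                                     # illustrative sizes
p = rng.uniform(0.2, 0.8, size=d)                     # population frequencies
case = np.where(rng.random((n, d)) < p, 1.0, -1.0)    # case group, +/-1 coded
released = case.mean(axis=0) + rng.normal(0, 0.1, d)  # distorted marginals
reference = np.where(rng.random(d) < p, 1.0, -1.0)    # one fresh sample
outsider = np.where(rng.random(d) < p, 1.0, -1.0)     # not in the case group

def trace_stat(target):
    # Inner-product statistic: large when target contributed to `released`.
    return float(released @ (target - reference))

print(trace_stat(case[0]))   # member: typically well above zero
print(trace_stat(outsider))  # non-member: concentrates around zero
```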

ICML Conference 2013 Conference Paper

Learning Fair Representations

  • Richard S. Zemel
  • Yu Wu
  • Kevin Swersky
  • Toniann Pitassi
  • Cynthia Dwork

We propose a learning algorithm for fair classification that achieves both group fairness (the proportion of members in a protected group receiving positive classification is identical to the proportion in the population as a whole), and individual fairness (similar individuals should be treated similarly). We formulate fairness as an optimization problem of finding a good representation of the data with two competing goals: to encode the data as well as possible, while simultaneously obfuscating any information about membership in the protected group. We show positive results of our algorithm relative to other known techniques, on three datasets. Moreover, we demonstrate several advantages to our approach. First, our intermediate representation can be used for other classification tasks (i.e., transfer learning is possible); second, we take a step toward learning a distance metric which can find important dimensions of the data for classification.
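
Schematically, the paper's objective combines three weighted losses (our paraphrase of its formulation):

```latex
L \;=\; A_z \, L_z \;+\; A_x \, L_x \;+\; A_y \, L_y ,
```

where L_z penalizes dependence between protected-group membership and the learned representation (group fairness), L_x is the reconstruction error of the data from the representation, and L_y is the prediction loss of the classifier built on top of it; the weights A_z, A_x, A_y trade the goals off against one another.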

FOCS Conference 2012 Conference Paper

The Privacy of the Analyst and the Power of the State

  • Cynthia Dwork
  • Moni Naor
  • Salil P. Vadhan

We initiate the study of "privacy for the analyst" in differentially private data analysis. That is, not only will we be concerned with ensuring differential privacy for the data (i.e., individuals or customers), which is the usual concern of differential privacy, but we also consider (differential) privacy for the set of queries posed by each data analyst. The goal is to achieve privacy with respect to other analysts, or users of the system. This problem arises only in the context of stateful privacy mechanisms, in which the responses to queries depend on other queries posed (a recent wave of results in the area utilized cleverly coordinated noise and state in order to privately answer very large numbers of queries). We argue that the problem is real by proving an exponential gap between the number of queries that can be answered (with non-trivial error) by stateless and stateful differentially private mechanisms. We then give a stateful algorithm for differentially private data analysis that also ensures differential privacy for the analyst and can answer exponentially many queries.

FOCS Conference 2011 Conference Paper

The Promise of Differential Privacy: A Tutorial on Algorithmic Techniques

  • Cynthia Dwork

Differential privacy describes a promise, made by a data curator to a data subject: you will not be affected, adversely or otherwise, by allowing your data to be used in any study, no matter what other studies, data sets, or information from other sources is available. At their best, differentially private database mechanisms can make confidential data widely available for accurate data analysis, without resorting to data clean rooms, institutional review boards, data usage agreements, restricted views, or data protection plans. To enjoy the fruits of the research described in this tutorial, the data analyst must accept that raw data can never be accessed directly and that eventually data utility is consumed: overly accurate answers to too many questions will destroy privacy. The goal of algorithmic research on differential privacy is to postpone this inevitability as long as possible.
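
The promise is made precise by the standard definition: a randomized mechanism M is ε-differentially private if, for all databases x and x' differing in the data of one individual, and for every event S over outputs,

```latex
\Pr[\,M(x) \in S\,] \;\le\; e^{\varepsilon}\, \Pr[\,M(x') \in S\,].
```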

FOCS Conference 2010 Conference Paper

Boosting and Differential Privacy

  • Cynthia Dwork
  • Guy N. Rothblum
  • Salil P. Vadhan

Boosting is a general method for improving the accuracy of learning algorithms. We use boosting to construct improved privacy-preserving synopses of an input database. These are data structures that yield, for a given set Q of queries over an input database, reasonably accurate estimates of the responses to every query in Q, even when the number of queries is much larger than the number of rows in the database. Given a base synopsis generator that takes a distribution on Q and produces a "weak" synopsis that yields "good" answers for a majority of the weight in Q, our Boosting for Queries algorithm obtains a synopsis that is good for all of Q. We ensure privacy for the rows of the database, but the boosting is performed on the queries. We also provide the first synopsis generators for arbitrary sets of arbitrary low-sensitivity queries, i.e., queries whose answers do not vary much under the addition or deletion of a single row. In the execution of our algorithm certain tasks, each incurring some privacy loss, are performed many times. To analyze the cumulative privacy loss, we obtain an O(ε²) bound on the expected privacy loss from a single ε-differentially private mechanism. Combining this with evolution of confidence arguments from the literature, we get stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides ε-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen, databases.
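
The resulting composition statement is the now-standard advanced composition bound: k adaptively chosen ε-differentially private mechanisms together satisfy (ε′, δ)-differential privacy for every δ > 0, with

```latex
\varepsilon' \;=\; \varepsilon \sqrt{2k \ln(1/\delta)} \;+\; k\,\varepsilon\,(e^{\varepsilon} - 1).
```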

SODA Conference 2010 Conference Paper

Differential Privacy in New Settings

  • Cynthia Dwork

Differential privacy is a recent notion of privacy tailored to the problem of statistical disclosure control: how to release statistical information about a set of people without compromising the privacy of any individual [7]. We describe new work [10, 9] that extends differentially private data analysis beyond the traditional setting of a trusted curator operating, in perfect isolation, on a static dataset. We ask: how can we guarantee differential privacy, even against an adversary that has access to the algorithm's internal state, e.g., by subpoena? An algorithm that achieves this is said to be pan-private. How can we guarantee differential privacy when the algorithm must continually produce outputs? We call this differential privacy under continual observation. We also consider these requirements in conjunction.
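
A concrete example from the continual-observation work cited above is the binary-tree counter: each dyadic interval of time steps receives a single Laplace draw, and the running count at time t is assembled from O(log T) noisy intervals. A minimal sketch (noise calibration simplified):

```python
import numpy as np

def tree_counter(bits, eps=1.0, rng=None):
    """Continually release a running count of a 0/1 stream. One Laplace
    draw per dyadic node is reused across time steps, so each bit affects
    only O(log T) noisy values, which is what bounds the privacy loss."""
    rng = rng or np.random.default_rng(0)
    T = len(bits)
    levels = int(np.ceil(np.log2(max(T, 2)))) + 1
    noise = {}   # (level, index) -> Laplace draw, one per dyadic node
    counts = []
    for t in range(1, T + 1):
        total, start = 0.0, 0
        for level in range(levels, -1, -1):   # greedy dyadic decomposition
            size = 1 << level
            if start + size <= t:
                node = (level, start // size)
                if node not in noise:
                    noise[node] = rng.laplace(0.0, levels / eps)
                total += sum(bits[start:start + size]) + noise[node]
                start += size
        counts.append(total)
    return counts
```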

STOC Conference 2009 Conference Paper

Differential privacy and robust statistics

  • Cynthia Dwork
  • Jing Lei

We show by means of several examples that robust statistical estimators present an excellent starting point for differentially private estimators. Our algorithms use a new paradigm for differentially private mechanisms, which we call Propose-Test-Release (PTR), and for which we give a formal definition and general composition theorems.
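
Schematically, PTR proposes a bound on local sensitivity, privately tests that the dataset is far, in Hamming distance, from any dataset where the bound fails, and only then releases a noisy answer. A skeleton (the distance-to-instability function is statistic-specific and supplied by the caller; the threshold follows the usual calibration, stated here without its analysis):

```python
import numpy as np

def propose_test_release(x, f, dist_to_instability, sensitivity_bound,
                         eps=1.0, delta=1e-6, rng=None):
    """PROPOSE a local-sensitivity bound, privately TEST that x is far
    from any dataset violating it, then RELEASE a Laplace-noised answer;
    otherwise refuse. A schematic skeleton, not a calibrated mechanism."""
    rng = rng or np.random.default_rng(0)
    noisy_dist = dist_to_instability(x) + rng.laplace(0.0, 1.0 / eps)
    if noisy_dist <= np.log(1.0 / delta) / eps:
        return None  # "bottom": decline to answer on this dataset
    return f(x) + rng.laplace(0.0, sensitivity_bound / eps)
```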

STOC Conference 2007 Conference Paper

The price of privacy and the limits of LP decoding

  • Cynthia Dwork
  • Frank McSherry
  • Kunal Talwar

This work is at the intersection of two lines of research. One line, initiated by Dinur and Nissim, investigates the price, in accuracy, of protecting privacy in a statistical database. The second, growing from an extensive literature on compressed sensing (see in particular the work of Donoho and collaborators [4,7,13,11]) and explicitly connected to error-correcting codes by Candès and Tao ([4]; see also [5,3]), is the use of linear programming for error correction. Our principal result is the discovery of a sharp threshold ρ* ≈ 0.239, so that if ρ < ρ* and A is a random m × n encoding matrix of independently chosen standard Gaussians, where m = O(n), then with overwhelming probability over the choice of A, for all x ∈ R^n, LP decoding corrects ⌊ρm⌋ arbitrary errors in the encoding Ax, while decoding can be made to fail if the error rate exceeds ρ*. Our bound resolves an open question of Candès, Rudelson, Tao, and Vershynin [3] and (oddly, but explicably) refutes empirical conclusions of Donoho [11] and Candès et al. [3]. By scaling and rounding we can easily transform these results to obtain polynomial-time decodable random linear codes with polynomial-sized alphabets tolerating any ρ < ρ* ≈ 0.239 fraction of arbitrary errors. In the context of privacy-preserving data mining our results say that any privacy mechanism, interactive or non-interactive, providing reasonably accurate answers to a 0.761 fraction of randomly generated weighted subset sum queries, and arbitrary answers on the remaining 0.239 fraction, is blatantly non-private.
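
Here LP decoding means ℓ1-minimization: given y = Ax + e with a few grossly corrupted coordinates e, decode by solving argmin_x ||y − Ax||_1 as a linear program. A small scipy illustration with an error rate far below the threshold (sizes are illustrative):

```python
import numpy as np
from scipy.optimize import linprog

def lp_decode(A, y):
    """Solve argmin_x ||y - A x||_1 as the LP:
    min 1^T t  subject to  -t <= y - A x <= t."""
    m, n = A.shape
    c = np.concatenate([np.zeros(n), np.ones(m)])
    A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
    b_ub = np.concatenate([y, -y])
    res = linprog(c, A_ub=A_ub, b_ub=b_ub,
                  bounds=[(None, None)] * (n + m), method="highs")
    return res.x[:n]

# Toy check: recover x0 despite a few arbitrarily corrupted coordinates.
rng = np.random.default_rng(0)
n, m = 20, 80
A = rng.standard_normal((m, n))
x0 = rng.standard_normal(n)
y = A @ x0
y[:5] += 50 * rng.standard_normal(5)   # 5 gross errors, rate well below rho*
print(np.allclose(lp_decode(A, y), x0, atol=1e-5))
```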

UAI Conference 2005 Conference Paper

On Privacy-Preserving Histograms

  • Shuchi Chawla 0001
  • Cynthia Dwork
  • Frank McSherry
  • Kunal Talwar

We advance the approach initiated by Chawla et al. for sanitizing (census) data so as to preserve the privacy of respondents while simultaneously extracting "useful" statistical information. First, we extend the scope of their techniques to a broad and rich class of distributions, specifically, mixtures of high-dimensional balls, spheres, Gaussians, and other "nice" distributions. Second, we randomize the histogram constructions to preserve spatial characteristics of the data, allowing us to approximate various quantities of interest, e.g., cost of the minimum spanning tree on the data, in a privacy-preserving fashion.

FOCS Conference 2000 Conference Paper

Zaps and Their Applications

  • Cynthia Dwork
  • Moni Naor

A zap is a two-round, witness-indistinguishable protocol in which the first round, consisting of a message from the verifier to the prover, can be fixed "once-and-for-all" and applied to any instance, and where the verifier does not use any private coins. We present a zap for every language in NP, based on the existence of non-interactive zero-knowledge proofs in the shared random string model. The zap is in the standard model, and hence requires no common guaranteed random string. We introduce and construct verifiable pseudo-random bit generators (VPRGs), and give a complete existential characterization of both noninteractive zero-knowledge proofs and zaps in terms of approximate VPRGs. We present several applications of zaps. In the timing model of C. Dwork et al. (1998), using moderately hard functions, we obtain 3-round concurrent zero knowledge and 2-round concurrent deniable authentication (the latter protocol also operates in the resettable model of R. Canetti et al. (2000)). In the standard model we obtain 2-round oblivious transfer using public keys (3-round otherwise). We note that any zap yields resettable 2-round witness-indistinguishability, and we obtain a 3-round timing-based resettable zero-knowledge argument system for any language in NP.

FOCS Conference 1999 Conference Paper

Magic Functions

  • Cynthia Dwork
  • Moni Naor
  • Omer Reingold
  • Larry J. Stockmeyer

In this paper we show that three apparently unrelated problems are in fact very closely related. We sketch these problems at a high level. The selective decommitment problem first arose in a slightly different form, selective decryption, in the context of Byzantine agreement, no later than 1985. Instead of seeing encryptions of plaintexts, the adversary is given commitments to the plaintexts. This problem is poorly understood even for strong-receiver commitments, which information-theoretically leak no information about the plaintext values. The second problem is in complexity theory: what can be proved in (a possibly weakened form of) zero-knowledge in a 3-round argument (an interactive proof in which the prover is polynomial-time bounded)? The third problem is cryptographic: the methodology suggested by Fiat and Shamir (1987) for constructing a (non-interactive) signature scheme from any 3-round (not necessarily zero-knowledge) public-coin identification scheme.

MFCS Conference 1997 Invited Paper

Positive Applications of Lattices to Cryptography

  • Cynthia Dwork

Abstract We describe constructions of several cryptographic primitives, including hash functions, public key cryptosystems, pseudo-random bit generators, and digital signatures, whose security depends on the assumed worst-case or average-case hardness of problems involving lattices.

FOCS Conference 1994 Conference Paper

A Theory of Competitive Analysis for Distributed Algorithms

  • Miklós Ajtai
  • James Aspnes
  • Cynthia Dwork
  • Orli Waarts

We introduce a theory of competitive analysis for distributed algorithms. The first steps in this direction were made in the seminal papers of Y. Bartal et al. (1992) and of B. Awerbuch et al. (1992), in the context of data management and job scheduling. In these papers, as well as in other subsequent work, the cost of a distributed algorithm is compared to the cost of an optimal global-control algorithm. In this paper we introduce a more refined notion of competitiveness for distributed algorithms, one that reflects the performance of distributed algorithms more accurately. In particular, our theory allows one to compare the cost of a distributed on-line algorithm to the cost of an optimal distributed algorithm. We demonstrate our method by studying the cooperative collect primitive, first abstracted by M. Saks, N. Shavit, and H. Woll (1991). We provide the first algorithms that allow processes to cooperate to finish their work in fewer steps. Specifically, we present two algorithms (with different strengths), and provide a competitive analysis for each one.

FOCS Conference 1990 Conference Paper

Perfectly Secure Message Transmission

  • Danny Dolev
  • Cynthia Dwork
  • Orli Waarts
  • Moti Yung

The problem of perfectly secure communication in a general network in which processors and communication lines may be faulty is studied. Lower bounds are obtained on the connectivity required for successful secure communication. Efficient algorithms that operate with this connectivity and rely on no complexity theoretic assumptions are derived. These are the first algorithms for secure communication in a general network to achieve simultaneously the goals of perfect secrecy, perfect resiliency, and a worst case time which is linear in the diameter of the network.

FOCS Conference 1989 Conference Paper

On the Power of 2-Way Probabilistic Finite State Automata (Extended Abstract)

  • Cynthia Dwork
  • Larry J. Stockmeyer

The recognition power of two-way probabilistic finite-state automata (2PFAs) is studied. It is shown that any 2PFA recognizing a nonregular language must use exponential expected time infinitely often. The power of interactive proof systems (IPSs) where the verifier is a 2PFA is also investigated. It is shown that (1) IPSs in which the verifier uses private randomization are strictly more powerful than IPSs in which the random choices of the verifier are made public to the prover; (2) IPSs in which the verifier uses public randomization are strictly more powerful than 2PFAs alone, that is, without a prover; and (3) every language accepted by some deterministic Turing machine in exponential time can be accepted by some IPS. Other results concern IPSs with 2PFA verifiers that run in polynomial expected time.

FOCS Conference 1986 Conference Paper

Flipping Persuasively in Constant Expected Time (Preliminary Version)

  • Cynthia Dwork
  • David B. Shmoys
  • Larry J. Stockmeyer

We present a distributed protocol for achieving a distributed coin in the presence of an extremely powerful adversary in constant time. The protocol can tolerate up to n/log n malicious processor failures where n is the number of processors in the system. The protocol needs only a fixed constant number of rounds of message exchange; no preprocessing is required. As a corollary we obtain an (n/log n)-resilient probabilistic protocol for Byzantine agreement running in constant expected time. Combining this with a generalization of a technique of Bracha, we obtain a probabilistic Byzantine agreement protocol tolerant of almost n/3 failures with O(log log n) expected running time.

TARK Conference 1986 Conference Paper

Knowledge and Common Knowledge in a Byzantine Environment I: Crash Failures

  • Cynthia Dwork
  • Yoram Moses

By analyzing the states of knowledge that the processors attain in an unreliable system of a simple type, we capture some of the basic underlying structure of such systems. The analysis provides us with a better understanding of existing protocols for problems such as Byzantine agreement, generalizes them considerably, and facilitates the design of improved protocols for many related problems.

FOCS Conference 1983 Conference Paper

On the Minimal Synchronism Needed for Distributed Consensus

  • Danny Dolev
  • Cynthia Dwork
  • Larry J. Stockmeyer

Reaching agreement is a primitive of distributed computing. While this poses no problem in an ideal, failure-free environment, it imposes certain constraints on the capabilities of an actual system: a system is viable only if it permits the existence of consensus protocols tolerant to some number of failures. Fischer, Lynch and Paterson [FLP] have shown that in a completely asynchronous model, even one failure cannot be tolerated. In this paper we extend their work, identifying several critical system parameters, including various synchronicity conditions, and examine how varying these affects the number of faults which can be tolerated. Our proofs expose general heuristic principles that explain why consensus is possible in certain models but not possible in others.