
Author name cluster

Úlfar Erlingsson

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers (4)

AAAI Conference 2021 Conference Paper

Tempered Sigmoid Activations for Deep Learning with Differential Privacy

  • Nicolas Papernot
  • Abhradeep Thakurta
  • Shuang Song
  • Steve Chien
  • Úlfar Erlingsson

Because learning sometimes involves sensitive data, machine learning algorithms have been extended to offer differential privacy for training data. In practice, this has been mostly an afterthought, with privacy-preserving models obtained by re-running training with a different optimizer, but using the model architectures that already performed well in a non-privacy-preserving setting. This approach leads to less than ideal privacy/utility tradeoffs, as we show here. To improve these tradeoffs, prior work introduced variants of differential privacy that weaken the privacy guarantee proved in order to increase model utility. We show this is not necessary and instead propose that utility be improved by choosing activation functions designed explicitly for privacy-preserving training. A crucial operation in differentially private SGD is gradient clipping, which, along with modifying the optimization path (at times resulting in not optimizing a single objective function), may also introduce both significant bias and variance to the learning process. We empirically identify that exploding gradients arising from ReLU may be one of the main sources of this. We demonstrate analytically and experimentally how a general family of bounded activation functions, the tempered sigmoids, consistently outperforms the currently established choice: unbounded activation functions like ReLU. Using this paradigm, we achieve new state-of-the-art accuracy on MNIST, FashionMNIST, and CIFAR10 without any modification of the learning procedure fundamentals or differential privacy analysis. While the changes we make are simple in retrospect, the simplicity of our approach facilitates its implementation and adoption to meaningfully improve state-of-the-art machine learning while still providing strong guarantees in the original framework of differential privacy.
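
As a rough illustration of the bounded activation family described in this abstract, here is a minimal NumPy sketch of a tempered sigmoid, assuming the scale/temperature/offset parameterization ψ(x) = s·σ(T·x) − o; the function and argument names are placeholders rather than code from the paper. With s = 2, T = 2, o = 1 the activation reduces to tanh.

```python
import numpy as np

def tempered_sigmoid(x, scale=2.0, temp=2.0, offset=1.0):
    """Bounded activation psi(x) = scale * sigmoid(temp * x) - offset.

    The output stays within [-offset, scale - offset], so activations
    (and hence gradients) cannot explode the way ReLU outputs can,
    which is the property exploited for DP-SGD with gradient clipping.
    """
    return scale / (1.0 + np.exp(-temp * x)) - offset

x = np.linspace(-3.0, 3.0, 7)
# With the default parameters the tempered sigmoid coincides with tanh.
assert np.allclose(tempered_sigmoid(x), np.tanh(x))
```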

SODA Conference 2019 Conference Paper

Amplification by Shuffling: From Local to Central Differential Privacy via Anonymity

  • Úlfar Erlingsson
  • Vitaly Feldman
  • Ilya Mironov
  • Ananth Raghunathan
  • Kunal Talwar
  • Abhradeep Thakurta

Sensitive statistics are often collected across sets of users, with repeated collection of reports done over time. For example, trends in users’ private preferences or software usage may be monitored via such reports. We study the collection of such statistics in the local differential privacy (LDP) model, and describe an algorithm whose privacy cost is polylogarithmic in the number of changes to a user's value. More fundamentally, by building on anonymity of the users’ reports, we also demonstrate how the privacy cost of our LDP algorithm can actually be much lower when viewed in the central model of differential privacy. We show, via a new and general privacy amplification technique, that any permutation-invariant algorithm satisfying ε-local differential privacy will satisfy (O(ε√(log(1/δ)/n)), δ)-central differential privacy, where n is the number of anonymized reports. By this, we explain how the high noise and overhead of LDP protocols is a consequence of them being significantly more private in the central model. As a practical corollary, our results imply that several LDP-based industrial deployments may have much lower privacy cost than their advertised ε would indicate, at least if reports are anonymized.
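
To make the amplification statement concrete, the sketch below evaluates the asymptotic bound ε · √(log(1/δ)/n) for a few report counts. This is only an illustration of how the central-model guarantee shrinks with n: the constant hidden in the O(·) and the theorem's side conditions on ε, n, and δ are ignored, and the function name is a placeholder, not from the paper.

```python
import math

def shuffled_central_epsilon(local_eps, n, delta):
    """Asymptotic amplification-by-shuffling bound, constants omitted:
    an eps-LDP, permutation-invariant protocol over n anonymized reports
    is roughly (local_eps * sqrt(log(1/delta) / n), delta)-DP centrally.
    """
    return local_eps * math.sqrt(math.log(1.0 / delta) / n)

for n in (10_000, 1_000_000, 100_000_000):
    print(n, round(shuffled_central_epsilon(local_eps=2.0, n=n, delta=1e-6), 4))
```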

ICLR Conference 2018 Conference Paper

Scalable Private Learning with PATE

  • Nicolas Papernot
  • Shuang Song 0001
  • Ilya Mironov
  • Ananth Raghunathan
  • Kunal Talwar
  • Úlfar Erlingsson

The rapid adoption of machine learning has increased concerns about the privacy implications of machine learning models trained on sensitive data, such as medical records or other personal information. To address those concerns, one promising approach is Private Aggregation of Teacher Ensembles, or PATE, which transfers to a "student" model the knowledge of an ensemble of "teacher" models, with intuitive privacy provided by training teachers on disjoint data and strong privacy guaranteed by noisy aggregation of teachers’ answers. However, PATE has so far been evaluated only on simple classification tasks like MNIST, leaving unclear its utility when applied to larger-scale learning tasks and real-world datasets. In this work, we show how PATE can scale to learning tasks with large numbers of output classes and uncurated, imbalanced training data with errors. For this, we introduce new noisy aggregation mechanisms for teacher ensembles that are more selective and add less noise, and prove their tighter differential-privacy guarantees. Our new mechanisms build on two insights: the chance of teacher consensus is increased by using more concentrated noise and, lacking consensus, no answer need be given to a student. The consensus answers used are more likely to be correct, offer better intuitive privacy, and incur lower differential-privacy cost. Our evaluation shows our mechanisms improve on the original PATE on all measures, and scale to larger tasks with both high utility and very strong privacy (ε < 1.0).
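
The selective aggregation idea can be sketched roughly as follows: add Gaussian noise to the teachers' vote histogram, answer only when the noisily checked top vote count clears a consensus threshold, and abstain otherwise. This is a simplified stand-in for the mechanisms described above, not the paper's exact aggregators; the function name, noise scales, and threshold are illustrative placeholders.

```python
import numpy as np

def selective_noisy_aggregate(votes, num_classes, sigma_check=150.0,
                              sigma_answer=40.0, threshold=300.0, rng=None):
    """Aggregate teacher votes, answering only on noisy consensus.

    votes: 1-D array of per-teacher predicted labels.
    Returns a label, or None to abstain (no answer given to the student).
    """
    rng = rng or np.random.default_rng()
    hist = np.bincount(votes, minlength=num_classes).astype(float)
    # Step 1: noisy consensus check -- only strong agreement passes.
    if hist.max() + rng.normal(0.0, sigma_check) < threshold:
        return None
    # Step 2: answer via argmax of the histogram with (less) Gaussian noise.
    return int(np.argmax(hist + rng.normal(0.0, sigma_answer, num_classes)))

# Example: 500 teachers, most agreeing on class 3.
votes = np.array([3] * 420 + [1] * 50 + [7] * 30)
print(selective_noisy_aggregate(votes, num_classes=10))
```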

ICLR Conference 2017 Conference Paper

Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data

  • Nicolas Papernot
  • Martín Abadi
  • Úlfar Erlingsson
  • Ian J. Goodfellow
  • Kunal Talwar

Some machine learning applications involve training data that is sensitive, such as the medical histories of patients in a clinical trial. A model may inadvertently and implicitly store some of its training data; careful analysis of the model may therefore reveal sensitive information. To address this problem, we demonstrate a generally applicable approach to providing strong privacy guarantees for training data: Private Aggregation of Teacher Ensembles (PATE). The approach combines, in a black-box fashion, multiple models trained with disjoint datasets, such as records from different subsets of users. Because they rely directly on sensitive data, these models are not published, but instead used as "teachers" for a "student" model. The student learns to predict an output chosen by noisy voting among all of the teachers, and cannot directly access an individual teacher or the underlying data or parameters. The student's privacy properties can be understood both intuitively (since no single teacher and thus no single dataset dictates the student's training) and formally, in terms of differential privacy. These properties hold even if an adversary can not only query the student but also inspect its internal workings. Compared with previous work, the approach imposes only weak assumptions on how teachers are trained: it applies to any model, including non-convex models like DNNs. We achieve state-of-the-art privacy/utility trade-offs on MNIST and SVHN thanks to an improved privacy analysis and semi-supervised learning.
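
A minimal sketch of the two ingredients described above, assuming NumPy: disjoint data shards for the teachers, and a Laplace noisy-max vote that produces the label the student trains on. The helper names and the noise scale are illustrative placeholders, not taken from the paper's code or experiments.

```python
import numpy as np

def partition_disjoint(num_examples, num_teachers, rng=None):
    """Split example indices into disjoint shards, one per teacher."""
    rng = rng or np.random.default_rng()
    idx = rng.permutation(num_examples)
    return np.array_split(idx, num_teachers)

def noisy_vote(votes, num_classes, gamma=0.05, rng=None):
    """Laplace noisy-max over teacher votes: the label the student trains on."""
    rng = rng or np.random.default_rng()
    hist = np.bincount(votes, minlength=num_classes).astype(float)
    hist += rng.laplace(0.0, 1.0 / gamma, num_classes)
    return int(np.argmax(hist))

shards = partition_disjoint(num_examples=60_000, num_teachers=250)
print(len(shards), "teachers,", len(shards[0]), "examples in the first shard")

votes = np.array([3] * 180 + [8] * 70)   # stand-in for 250 teachers' predictions
print(noisy_vote(votes, num_classes=10))
```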