Arrow Research search

Author name cluster

Julia Kempe

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

29 papers
2 author rows

Possible papers (29)

TMLR Journal 2026 Journal Article

On the Geometry of Regularization in Adversarial Training: High-Dimensional Asymptotics and Generalization Bounds

  • Matteo Vilucchio
  • Nikolaos Tsilivis
  • Bruno Loureiro
  • Julia Kempe

Regularization, whether explicit in terms of a penalty in the loss or implicit in the choice of algorithm, is a cornerstone of modern machine learning. Indeed, controlling the complexity of the model class is particularly important when data is scarce, noisy, or contaminated, as it encodes a statistical belief about the underlying structure of the data. This work investigates the question of how to choose the regularization norm $\lVert \cdot \rVert$ in the context of high-dimensional adversarial training for binary classification. To this end, we first derive an exact asymptotic description of the robust, regularized empirical risk minimizer for various types of adversarial attacks and regularization norms (including non-$\ell_p$ norms). We complement this analysis with a uniform convergence analysis, deriving bounds on the Rademacher complexity for this class of problems. Leveraging our theoretical results, we quantitatively characterize the relationship between perturbation size and the optimal choice of $\lVert \cdot \rVert$, confirming the intuition that, in the data-scarce regime, the type of regularization becomes increasingly important for adversarial training as perturbations grow in size.
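
As a point of reference for the formulation studied here (a standard textbook form, not quoted from the paper), the robust, regularized empirical risk minimizer for a linear classifier with margin-based loss $\ell$, attack budget $\varepsilon$, and regularization strength $\lambda$ reads

$$ \hat{w} \in \arg\min_{w}\; \frac{1}{n}\sum_{i=1}^{n}\; \max_{\lVert \delta_i \rVert_{\mathrm{atk}} \le \varepsilon} \ell\big(y_i\, w^{\top}(x_i+\delta_i)\big) \;+\; \lambda\, \lVert w \rVert , $$

and for a loss that is non-increasing in the margin the inner maximum has the closed form $\ell\big(y_i\, w^{\top}x_i - \varepsilon\, \lVert w \rVert_{\mathrm{atk},*}\big)$, where $\lVert \cdot \rVert_{\mathrm{atk},*}$ denotes the dual of the attack norm. The question in the abstract is how the choice of the regularizer $\lVert \cdot \rVert$ should interact with this dual-norm penalty as $\varepsilon$ grows.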

NeurIPS Conference 2025 Conference Paper

Asymmetric REINFORCE for off-Policy Reinforcement Learning: Balancing positive and negative rewards

  • Charles Arnal
  • Gaëtan Narozniak
  • Vivien Cabannes
  • Yunhao Tang
  • Julia Kempe
  • Remi Munos

Reinforcement learning (RL) is increasingly used to align large language models (LLMs). Off-policy methods offer greater implementation simplicity and data efficiency than on-policy techniques, but often result in suboptimal performance. In this work, we study the intermediate range of algorithms between off-policy RL and supervised fine-tuning by analyzing a simple off-policy REINFORCE algorithm, where the advantage is defined as $A=r-V$, with $r$ a reward and $V$ some tunable baseline. Intuitively, lowering $V$ emphasizes high-reward samples, while raising it penalizes low-reward ones more heavily. We first provide a theoretical analysis of this off-policy REINFORCE algorithm, showing that when the baseline $V$ lower-bounds the expected reward, the algorithm enjoys a policy improvement guarantee. Our analysis reveals that while on-policy updates can safely leverage both positive and negative signals, off-policy updates benefit from focusing more on positive rewards than on negative ones. We validate our findings experimentally in a controlled stochastic bandit setting and through fine-tuning state-of-the-art LLMs on reasoning tasks.
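
A minimal sketch (not the authors' code) of the kind of update described above, in a toy multi-armed bandit: actions are drawn from a fixed behaviour policy (off-policy, no importance weights), and the trained policy's logits are updated with REINFORCE weighted by the advantage $A = r - V$ for a tunable baseline $V$. All names below are illustrative.

    import numpy as np

    rng = np.random.default_rng(0)
    K = 5                                    # number of arms
    true_reward = rng.normal(size=K)         # expected reward per arm

    theta = np.zeros(K)                      # logits of the policy being trained
    behaviour_logits = np.zeros(K)           # fixed, data-collecting policy
    V = 0.0                                  # tunable baseline (A = r - V)
    lr = 0.1

    def softmax(z):
        z = z - z.max()
        p = np.exp(z)
        return p / p.sum()

    for step in range(2000):
        # sample an action from the *behaviour* policy (off-policy data)
        a = rng.choice(K, p=softmax(behaviour_logits))
        r = true_reward[a] + rng.normal(scale=0.1)

        # REINFORCE-style update of the *trained* policy, weighted by A = r - V
        p = softmax(theta)
        grad_logp = -p
        grad_logp[a] += 1.0                  # gradient of log pi_theta(a)
        theta += lr * (r - V) * grad_logp

    print("learned policy:", softmax(theta).round(3), "best arm:", int(true_reward.argmax()))

Lowering V puts more weight on high-reward samples; raising it penalizes low-reward ones more heavily, which is the asymmetry the analysis above is concerned with.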

ICLR Conference 2025 Conference Paper

Beyond Model Collapse: Scaling Up with Synthesized Data Requires Verification

  • Yunzhen Feng
  • Elvis Dohmatob
  • Pu Yang
  • François Charton
  • Julia Kempe

Large Language Models (LLMs) are increasingly trained on data generated by other LLMs, either because generated text and images become part of the pre-training corpus, or because synthesized data is used as a replacement for expensive human annotation. This raises concerns about *model collapse*, a drop in model performance when their training sets include generated data. Considering that it is easier for both humans and machines to tell good examples from bad ones than to generate high-quality samples, we investigate the use of verification on synthesized data to prevent model collapse. We provide a theoretical characterization using Gaussian mixtures, linear classifiers, and linear verifiers to derive conditions with measurable proxies to assess whether the verifier can effectively select synthesized data that leads to optimal performance. We experiment with two practical tasks -- computing matrix eigenvalues with transformers and news summarization with LLMs -- which both exhibit model collapse when trained on generated data, and show that verifiers, even imperfect ones, can indeed be harnessed to prevent model collapse and that our proposed proxy measure strongly correlates with performance.
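
A toy sketch, loosely mimicking the Gaussian-mixture / linear-classifier / linear-verifier setting named in the abstract (the threshold and estimators below are invented for illustration, not the paper's): a weak generator labels fresh inputs, an imperfect linear verifier keeps only the samples it is confident about, and a student is fit on what survives.

    import numpy as np

    rng = np.random.default_rng(1)
    d, n = 20, 2000
    mu = np.ones(d) / np.sqrt(d)              # class-mean direction

    def sample(n):
        y = rng.choice([-1, 1], size=n)
        x = y[:, None] * mu + rng.normal(size=(n, d))
        return x, y

    def fit(x, y):
        # class-mean estimator: a simple linear classifier
        return (x * y[:, None]).mean(axis=0)

    def test_err(w):
        x_t, y_t = sample(5000)
        return np.mean(np.sign(x_t @ w) != y_t)

    # "generator": a weak model trained on little data, used to label new inputs
    w_gen = fit(*sample(200))
    x_new, _ = sample(n)
    y_synth = np.sign(x_new @ w_gen)          # synthetic, possibly wrong labels

    # imperfect "verifier": an independent weak model; keep confidently agreeing samples
    w_ver = fit(*sample(200))
    keep = (x_new @ w_ver) * y_synth > 0.5    # arbitrary confidence threshold

    print("student on raw synthetic data:     ", test_err(fit(x_new, y_synth)))
    print("student on verifier-filtered data: ", test_err(fit(x_new[keep], y_synth[keep])))

Whether the filtered student wins on a given seed is not guaranteed; the snippet only illustrates the shape of the data pipeline the abstract analyzes.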

ICLR Conference 2025 Conference Paper

DRoP: Distributionally Robust Data Pruning

  • Artem Vysogorets
  • Kartik Ahuja
  • Julia Kempe

In the era of exceptionally data-hungry models, careful selection of the training data is essential to mitigate the extensive costs of deep learning. Data pruning offers a solution by removing redundant or uninformative samples from the dataset, which yields faster convergence and improved neural scaling laws. However, little is known about its impact on the classification bias of the trained models. We conduct the first systematic study of this effect and reveal that existing data pruning algorithms can produce highly biased classifiers. We present a theoretical analysis of the classification risk in a mixture of Gaussians to argue that choosing appropriate class pruning ratios, coupled with random pruning within classes, has the potential to improve worst-class performance. We thus propose DRoP, a distributionally robust approach to pruning, and empirically demonstrate its performance on standard computer vision benchmarks. In sharp contrast to existing algorithms, our proposed method continues improving distributional robustness at a tolerable drop of average performance as we prune more from the datasets.
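
A schematic sketch of the two ingredients named in the abstract, with the per-class keep ratios left as plain inputs (deriving those ratios from a distributionally robust criterion is the paper's contribution; the numbers below are placeholders):

    import numpy as np

    rng = np.random.default_rng(0)

    def prune_per_class(labels, keep_ratio):
        # labels: (n,) class labels; keep_ratio: dict class -> fraction kept.
        # Random pruning within each class at the prescribed per-class ratio.
        kept = []
        for c, ratio in keep_ratio.items():
            idx = np.flatnonzero(labels == c)
            k = int(round(ratio * len(idx)))
            kept.append(rng.choice(idx, size=k, replace=False))
        return np.sort(np.concatenate(kept))

    labels = rng.integers(0, 3, size=1000)
    kept = prune_per_class(labels, {0: 0.9, 1: 0.5, 2: 0.7})   # placeholder ratios
    print(f"{len(kept)} of {len(labels)} examples kept")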

ICLR Conference 2025 Conference Paper

Flavors of Margin: Implicit Bias of Steepest Descent in Homogeneous Neural Networks

  • Nikolaos Tsilivis 0002
  • Gal Vardi
  • Julia Kempe

We study the implicit bias of the family of steepest descent algorithms with infinitesimal learning rate, including gradient descent, sign gradient descent and coordinate descent, in deep homogeneous neural networks. We prove that an algorithm-dependent geometric margin increases during training and characterize the late-stage bias of the algorithms. In particular, we define a generalized notion of stationarity for optimization problems and show that the algorithms progressively reduce a (generalized) Bregman divergence, which quantifies proximity to such stationary points of a margin-maximization problem. We then experimentally zoom into the trajectories of neural networks optimized with various steepest descent algorithms, highlighting connections to the implicit bias of Adam.
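
For concreteness (a generic sketch, not tied to the paper's experiments): the three named algorithms are the steepest-descent updates with respect to the $\ell_2$, $\ell_\infty$, and $\ell_1$ norms, respectively, and each choice of norm induces its own notion of margin, which is the algorithm-dependent margin referred to above.

    import numpy as np

    def gd_step(w, g, lr):
        # steepest descent w.r.t. the l2 norm: ordinary gradient descent
        return w - lr * g

    def sign_gd_step(w, g, lr):
        # steepest descent w.r.t. the l-infinity norm: sign gradient descent
        return w - lr * np.sign(g)

    def coordinate_step(w, g, lr):
        # steepest descent w.r.t. the l1 norm: move only the largest-|gradient| coordinate
        step = np.zeros_like(w)
        i = np.argmax(np.abs(g))
        step[i] = lr * np.sign(g[i])
        return w - step

    # tiny demo on f(w) = ||w||^2, whose gradient is 2w
    w = np.array([1.0, -2.0, 0.5])
    for _ in range(50):
        w = sign_gd_step(w, 2 * w, lr=0.05)
    print(w)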

ICML Conference 2025 Conference Paper

PILAF: Optimal Human Preference Sampling for Reward Modeling

  • Yunzhen Feng
  • Ariel Kwiatkowski
  • Kunhao Zheng
  • Julia Kempe
  • Yaqi Duan

As large language models increasingly drive real-world applications, aligning them with human values becomes paramount. Reinforcement Learning from Human Feedback (RLHF) has emerged as a key technique, translating preference data into reward models when oracle human values remain inaccessible. In practice, RLHF mostly relies on approximate reward models, which may not consistently guide the policy toward maximizing the underlying human values. We propose Policy-Interpolated Learning for Aligned Feedback (PILAF), a novel response sampling strategy for preference labeling that explicitly aligns preference learning with maximizing the underlying oracle reward. PILAF is theoretically grounded, demonstrating optimality from both an optimization and a statistical perspective. The method is straightforward to implement and demonstrates strong performance in iterative and online RLHF settings where feedback curation is critical.

ICLR Conference 2025 Conference Paper

Strong Model Collapse

  • Elvis Dohmatob
  • Yunzhen Feng
  • Arjun Subramonian
  • Julia Kempe

Within the scaling laws paradigm, which underpins the training of large neural networks like ChatGPT and Llama, we consider a supervised regression setting and establish a strong form of the model collapse phenomenon, a critical performance degradation due to synthetic data in the training corpus. Our results show that even the smallest fraction of synthetic data (e.g., as little as 1 per 1000) can still lead to model collapse: larger and larger training sets do not enhance performance. We further investigate whether increasing model size, an approach aligned with current trends in training large language models, exacerbates or mitigates model collapse. In a simplified regime where neural networks are approximated via random projections of tunable size, we both theoretically and empirically show that larger models can amplify model collapse. Interestingly, our theory also indicates that, beyond the interpolation threshold (which can be extremely high for very large datasets), larger models may mitigate the collapse, although they do not entirely prevent it. Our theoretical findings are empirically verified through experiments on language models and neural networks for images.

ICML Conference 2024 Conference Paper

A Tale of Tails: Model Collapse as a Change of Scaling Laws

  • Elvis Dohmatob
  • Yunzhen Feng
  • Pu Yang
  • François Charton
  • Julia Kempe

As AI model size grows, neural scaling laws have become a crucial tool to predict the improvements of large models when increasing capacity and the size of original (human or natural) training data. Yet, the widespread use of popular models means that the ecosystem of online data and text will co-evolve to progressively contain increased amounts of synthesized data. In this paper we ask: How will the scaling laws change in the inevitable regime where synthetic data makes its way into the training corpus? Will future models still improve, or will they be doomed to degenerate, up to total (model) collapse? We develop a theoretical framework of model collapse through the lens of scaling laws. We discover a wide range of decay phenomena, analyzing loss of scaling, shifted scaling with the number of generations, the "un-learning" of skills, and grokking when mixing human and synthesized data. Our theory is validated by large-scale experiments with a transformer on an arithmetic task and text generation using the large language model Llama2.

TMLR Journal 2024 Journal Article

Attacking Bayes: On the Adversarial Robustness of Bayesian Neural Networks

  • Yunzhen Feng
  • Tim G. J. Rudner
  • Nikolaos Tsilivis
  • Julia Kempe

Adversarial examples have been shown to cause neural networks to fail on a wide range of vision and language tasks, but recent work has claimed that *Bayesian* neural networks (BNNs) are inherently robust to adversarial perturbations. In this work, we examine this claim. To study the adversarial robustness of BNNs, we investigate whether it is possible to successfully break state-of-the-art BNN inference methods and prediction pipelines using even relatively unsophisticated attacks for three tasks: (1) label prediction under the posterior predictive mean, (2) adversarial example detection with Bayesian predictive uncertainty, and (3) semantic shift detection. We find that BNNs trained with state-of-the-art approximate inference methods, and even BNNs trained with Hamiltonian Monte Carlo, are highly susceptible to adversarial attacks. We also identify various conceptual and experimental errors in previous works that claimed inherent adversarial robustness of BNNs and conclusively demonstrate that BNNs and uncertainty-aware Bayesian prediction pipelines are *not* inherently robust against adversarial attacks.

ICML Conference 2024 Conference Paper

Deconstructing the Goldilocks Zone of Neural Network Initialization

  • Artem Vysogorets
  • Anna Dawid
  • Julia Kempe

The second-order properties of the training loss have a massive impact on the optimization dynamics of deep learning models. Fort & Scherlis (2019) discovered that a large excess of positive curvature and local convexity of the loss Hessian is associated with highly trainable initial points located in a region coined the "Goldilocks zone". Only a handful of subsequent studies touched upon this relationship, so it remains largely unexplained. In this paper, we present a rigorous and comprehensive analysis of the Goldilocks zone for homogeneous neural networks. In particular, we derive the fundamental condition resulting in excess of positive curvature of the loss, explaining and refining its conventionally accepted connection to the initialization norm. Further, we relate the excess of positive curvature to model confidence, low initial loss, and a previously unknown type of vanishing cross-entropy loss gradient. To understand the importance of excessive positive curvature for trainability of deep networks, we optimize fully-connected and convolutional architectures outside the Goldilocks zone and analyze the emergent behaviors. We find that strong model performance is not perfectly aligned with the Goldilocks zone, calling for further research into this relationship.

ICLR Conference 2024 Conference Paper

Embarrassingly Simple Dataset Distillation

  • Yunzhen Feng
  • Ramakrishna Vedantam
  • Julia Kempe

Dataset distillation extracts a small set of synthetic training samples from a large dataset with the goal of achieving competitive performance on test data when trained on this sample. In this work, we tackle dataset distillation at its core by treating it directly as a bilevel optimization problem. Re-examining the foundational back-propagation through time method, we study its pronounced gradient variance, computational burden, and long-term dependencies. To address them, we introduce an improved method: Random Truncated Backpropagation Through Time (RaT-BPTT). RaT-BPTT incorporates a truncation coupled with a random window, effectively stabilizing the gradients and speeding up the optimization while covering long dependencies. This allows us to establish a new state of the art for a variety of standard dataset benchmarks. A deeper dive into the nature of distilled data unveils pronounced intercorrelation. In particular, subsets of distilled datasets tend to exhibit much worse performance than directly distilled smaller datasets of the same size. Leveraging RaT-BPTT, we devise a boosting mechanism that generates distilled datasets containing subsets with near-optimal performance across different data budgets.
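
A heavily simplified sketch of one plausible reading of the mechanism described above (truncated backprop through a randomly placed window of the inner unroll), on a linear-regression distillation toy; it is not a faithful reimplementation of RaT-BPTT, and all hyperparameters are illustrative.

    import torch

    torch.manual_seed(0)
    d, n_real, n_distill = 32, 512, 10
    x_real = torch.randn(n_real, d)
    w_true = torch.randn(d, 1)
    y_real = x_real @ w_true + 0.1 * torch.randn(n_real, 1)

    # distilled data: the meta-parameters being optimized
    x_syn = torch.randn(n_distill, d, requires_grad=True)
    y_syn = torch.randn(n_distill, 1, requires_grad=True)
    meta_opt = torch.optim.Adam([x_syn, y_syn], lr=1e-2)

    def inner_loss(w, x, y):
        return ((x @ w - y) ** 2).mean()

    T, K, inner_lr = 40, 10, 0.5            # full unroll, window length, inner step size
    for meta_step in range(200):
        w = torch.zeros(d, 1, requires_grad=True)    # fresh student
        stop = int(torch.randint(K, T + 1, (1,)))    # random placement of the window
        # steps before the window: graph discarded after each step (truncation)
        for _ in range(stop - K):
            g, = torch.autograd.grad(inner_loss(w, x_syn.detach(), y_syn.detach()), w)
            w = (w - inner_lr * g).detach().requires_grad_(True)
        # last K steps: keep the graph so gradients reach the distilled data
        for _ in range(K):
            g, = torch.autograd.grad(inner_loss(w, x_syn, y_syn), w, create_graph=True)
            w = w - inner_lr * g
        meta_loss = inner_loss(w, x_real, y_real)    # how well the student fits real data
        meta_opt.zero_grad()
        meta_loss.backward()
        meta_opt.step()
        if meta_step % 50 == 0:
            print(meta_step, float(meta_loss))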

NeurIPS Conference 2024 Conference Paper

Iteration Head: A Mechanistic Study of Chain-of-Thought

  • Vivien Cabannes
  • Charles Arnal
  • Wassim Bouaziz
  • Alice Yang
  • Francois Charton
  • Julia Kempe

Chain-of-Thought (CoT) reasoning is known to improve Large Language Models both empirically and in terms of theoretical approximation power. However, our understanding of the inner workings and the conditions under which CoT capabilities emerge remains limited. This paper helps fill this gap by demonstrating how CoT reasoning emerges in transformers in a controlled and interpretable setting. In particular, we observe the appearance of a specialized attention mechanism dedicated to iterative reasoning, which we coin "iteration heads". We track both the emergence and the precise workings of these iteration heads down to the attention level, and measure the transferability of the CoT skills to which they give rise between tasks.

NeurIPS Conference 2024 Conference Paper

Mission Impossible: A Statistical Perspective on Jailbreaking LLMs

  • Jingtong Su
  • Julia Kempe
  • Karen Ullrich

Large language models (LLMs) are trained on a deluge of text data with limited quality control. As a result, LLMs can exhibit unintended or even harmful behaviours, such as leaking information, fake news, or hate speech. Countermeasures, commonly referred to as preference alignment, include fine-tuning the pretrained LLMs with carefully crafted text examples of desired behaviour. Even then, empirical evidence shows that preference-aligned LLMs can be enticed into harmful behaviour. This so-called jailbreaking of LLMs is typically achieved by adversarially modifying the input prompt to the LLM. Our paper provides theoretical insights into the phenomenon of preference alignment and jailbreaking from a statistical perspective. Under our framework, we first show that pretrained LLMs will mimic harmful behaviour if it is present in the training corpus. Under that same framework, we then introduce a statistical notion of alignment and lower-bound the jailbreaking probability, showing that jailbreaking is unpreventable under reasonable assumptions. Based on our insights, we propose an alteration to the currently prevalent alignment strategy RLHF. Specifically, we introduce a simple modification to the RLHF objective, which we call E-RLHF, that aims to increase the likelihood of safe responses. E-RLHF brings no additional training cost and is compatible with other methods. Empirically, we demonstrate that E-RLHF outperforms RLHF on all alignment problems put forward by AdvBench (Zou et al., 2023) and HarmBench (Mazeika et al., 2024) without sacrificing model performance as measured by MT-Bench (Zheng et al., 2024).

NeurIPS Conference 2024 Conference Paper

Model Collapse Demystified: The Case of Regression

  • Elvis Dohmatob
  • Yunzhen Feng
  • Julia Kempe

The era of proliferation of large language and image generation models begs the question of what happens if models are trained on the synthesized outputs of other models. The phenomenon of "model collapse" refers to the situation whereby, as a model is trained recursively on data generated from previous generations of itself over time, its performance degrades until the model eventually becomes completely useless, i.e., the model collapses. In this work, we investigate this phenomenon within the context of high-dimensional regression with Gaussian data, considering both low- and high-dimensional asymptotics. We derive analytical formulas that quantitatively describe this phenomenon in both under-parameterized and over-parameterized regimes. We show how the test error increases linearly with the number of model iterations, in terms of all problem hyperparameters (covariance spectrum, regularization, label noise level, dataset size), and further isolate how model collapse affects both the bias and variance terms in our setup. We show that even in the noise-free case, catastrophic (exponentially fast) model collapse can happen in the over-parameterized regime. In the special case of polynomially decaying spectral and source conditions, we obtain modified scaling laws which exhibit new crossover phenomena from fast to slow rates. We also propose a simple strategy based on adaptive regularization to mitigate model collapse. Our theoretical results are validated with experiments.
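
A self-contained toy illustration of the recursive-training setup (not the paper's exact asymptotic setting): ridge regression is refit, generation after generation, on labels produced by the previous generation's model plus fresh label noise, and the test error against the true labels grows with the number of generations.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n, sigma = 50, 200, 0.5
    w_star = rng.normal(size=d) / np.sqrt(d)

    x_test = rng.normal(size=(5000, d))
    y_test = x_test @ w_star

    def fit_ridge(x, y, lam=1e-3):
        return np.linalg.solve(x.T @ x + lam * np.eye(d), x.T @ y)

    labeler = w_star                         # generation 0 trains on real labels
    for gen in range(6):
        x = rng.normal(size=(n, d))
        y = x @ labeler + sigma * rng.normal(size=n)   # labels from the current labeler
        w_hat = fit_ridge(x, y)
        print(f"generation {gen}: test MSE vs. true labels = "
              f"{np.mean((x_test @ w_hat - y_test) ** 2):.3f}")
        labeler = w_hat                      # the next generation trains on this model's outputs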

TMLR Journal 2024 Journal Article

On the Robustness of Neural Collapse and the Neural Collapse of Robustness

  • Jingtong Su
  • Ya Shi Zhang
  • Nikolaos Tsilivis
  • Julia Kempe

Neural Collapse refers to the curious phenomenon at the end of training of a neural network, where feature vectors and classification weights converge to a very simple geometrical arrangement (a simplex). While it has been observed empirically in various cases and has been theoretically motivated, its connection with crucial properties of neural networks, like their generalization and robustness, remains unclear. In this work, we study the stability properties of these simplices. We find that the simplex structure disappears under small adversarial attacks, and that perturbed examples "leap" between simplex vertices. We further analyze the geometry of networks that are optimized to be robust against adversarial perturbations of the input, and find that Neural Collapse is a pervasive phenomenon in these cases as well, with clean and perturbed representations forming aligned simplices and giving rise to a robust simple nearest-neighbor classifier. By studying the propagation of the amount of collapse inside the network, we identify novel properties of both robust and non-robust machine learning models, and show that earlier layers, unlike later ones, maintain reliable simplices on perturbed data.

NeurIPS Conference 2024 Conference Paper

The Price of Implicit Bias in Adversarially Robust Generalization

  • Nikolaos Tsilivis
  • Natalie S. Frank
  • Nathan Srebro
  • Julia Kempe

We study the implicit bias of optimization in robust empirical risk minimization (robust ERM) and its connection with robust generalization. In classification settings under adversarial perturbations with linear models, we study what type of regularization should ideally be applied for a given perturbation set to improve (robust) generalization. We then show that the implicit bias of optimization in robust ERM can significantly affect the robustness of the model and identify two ways this can happen; either through the optimization algorithm or the architecture. We verify our predictions in simulations with synthetic data and experimentally study the importance of implicit bias in robust ERM with deep neural networks.

JMLR Journal 2023 Journal Article

Connectivity Matters: Neural Network Pruning Through the Lens of Effective Sparsity

  • Artem Vysogorets
  • Julia Kempe

Neural network pruning is a fruitful area of research with surging interest in high sparsity regimes. Benchmarking in this domain heavily relies on faithful representation of the sparsity of subnetworks, which has been traditionally computed as the fraction of removed connections (direct sparsity). This definition, however, fails to recognize unpruned parameters that are detached from the input or output layers of the underlying subnetworks, potentially underestimating actual effective sparsity: the fraction of inactivated connections. While this effect might be negligible for moderately pruned networks (up to 10–100 compression rates), we find that it plays an increasing role for sparser subnetworks, greatly distorting comparison between different pruning algorithms. For example, we show that effective compression of a randomly pruned LeNet-300-100 can be orders of magnitude larger than its direct counterpart, while no discrepancy is ever observed when using SynFlow for pruning (Tanaka et al., 2020). In this work, we adopt the lens of effective sparsity to reevaluate several recent pruning algorithms on common benchmark architectures (e.g., LeNet-300-100, VGG-19, ResNet-18) and discover that their absolute and relative performance changes dramatically in this new, and as we argue, more appropriate framework. To aim for effective, rather than direct, sparsity, we develop a low-cost extension to most pruning algorithms. Further, equipped with effective sparsity as a reference frame, we partially reconfirm that random pruning with appropriate sparsity allocation across layers performs as well as or better than more sophisticated algorithms for pruning at initialization (Su et al., 2020). In response to this observation, using an analogy of pressure distribution in coupled cylinders from thermodynamics, we design novel layerwise sparsity quotas that outperform all existing baselines in the context of random pruning.
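
A minimal sketch (not the paper's code) of the direct vs. effective sparsity distinction for a masked fully connected network: a kept weight counts as effective only if its input neuron still receives signal from the network input and its output neuron can still influence the network output.

    import numpy as np

    def sparsities(masks):
        # masks[l] is a boolean array of shape (n_out, n_in); True = weight kept.
        # Returns (direct_sparsity, effective_sparsity).
        total = sum(m.size for m in masks)
        kept = sum(int(m.sum()) for m in masks)

        # forward pass: which neurons still receive signal from the input
        fwd = [np.ones(masks[0].shape[1], dtype=bool)]
        for m in masks:
            fwd.append((m & fwd[-1][None, :]).any(axis=1))

        # backward pass: which neurons can still influence the output
        bwd = [np.ones(masks[-1].shape[0], dtype=bool)]
        for m in reversed(masks):
            bwd.append((m & bwd[-1][:, None]).any(axis=0))
        bwd = bwd[::-1]                      # bwd[l] now aligns with the inputs of layer l

        effective = sum(int((m & fwd[l][None, :] & bwd[l + 1][:, None]).sum())
                        for l, m in enumerate(masks))
        return 1 - kept / total, 1 - effective / total

    # toy example: the first hidden unit keeps an incoming weight but has no outgoing one
    m1 = np.array([[True, False], [True, True]])   # layer 1: 2 inputs -> 2 hidden
    m2 = np.array([[False, True]])                 # layer 2: 2 hidden -> 1 output
    print(sparsities([m1, m2]))                    # direct sparsity 1/3, effective sparsity 1/2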

NeurIPS Conference 2022 Conference Paper

What Can the Neural Tangent Kernel Tell Us About Adversarial Robustness?

  • Nikolaos Tsilivis
  • Julia Kempe

The adversarial vulnerability of neural nets, and subsequent techniques to create robust models, have attracted significant attention; yet we still lack a full understanding of this phenomenon. Here, we study adversarial examples of trained neural networks through analytical tools afforded by recent theory advances connecting neural networks and kernel methods, namely the Neural Tangent Kernel (NTK), following a growing body of work that leverages the NTK approximation to successfully analyze important deep learning phenomena and design algorithms for new applications. We show how NTKs allow us to generate adversarial examples in a "training-free" fashion, and demonstrate that they transfer to fool their finite-width neural net counterparts in the "lazy" regime. We leverage this connection to provide an alternative view on robust and non-robust features, which have been suggested to underlie the adversarial brittleness of neural nets. Specifically, we define and study features induced by the eigendecomposition of the kernel to better understand the role of robust and non-robust features, the reliance on both for standard classification, and the robustness-accuracy trade-off. We find that such features are surprisingly consistent across architectures, and that robust features tend to correspond to the largest eigenvalues of the model, and thus are learned early during training. Our framework allows us to identify and visualize non-robust yet useful features. Finally, we shed light on the robustness mechanism underlying adversarial training of neural nets used in practice: quantifying the evolution of the associated empirical NTK, we demonstrate that its dynamics fall much earlier into the "lazy" regime and manifest a much stronger form of the well-known bias to prioritize learning features within the top eigenspaces of the kernel, compared to standard training.
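
A rough, training-free sketch of the idea (with an RBF kernel standing in for the architecture-dependent NTK actually used in the paper): predict with kernel ridge regression on clean data, then perturb a test input against its own prediction by following the kernel model's input gradient.

    import numpy as np

    rng = np.random.default_rng(0)
    d, n = 10, 200
    w = rng.normal(size=d)
    x_train = rng.normal(size=(n, d))
    y_train = np.sign(x_train @ w)            # toy binary labels

    def kernel(a, b, gamma=0.1):
        # RBF kernel as a stand-in for the NTK of a specific architecture
        d2 = ((a[:, None, :] - b[None, :, :]) ** 2).sum(-1)
        return np.exp(-gamma * d2)

    K = kernel(x_train, x_train)
    alpha = np.linalg.solve(K + 1e-3 * np.eye(n), y_train)   # kernel ridge coefficients

    def predict(x):
        return kernel(x, x_train) @ alpha

    def input_grad(x, eps=1e-4):
        # finite-difference gradient of the kernel prediction w.r.t. the input
        g = np.zeros_like(x)
        for i in range(x.shape[1]):
            e = np.zeros_like(x)
            e[:, i] = eps
            g[:, i] = (predict(x + e) - predict(x - e)) / (2 * eps)
        return g

    x0 = rng.normal(size=(1, d))
    y0 = np.sign(predict(x0))
    x_adv = x0 - 0.5 * np.sign(y0 * input_grad(x0))   # untargeted step against the prediction
    print("clean prediction:", predict(x0), " perturbed prediction:", predict(x_adv))

Whether such a perturbation also fools a finite-width network trained on the same data is the transfer question the abstract studies; this snippet only attacks the kernel model itself.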

FOCS Conference 2008 Conference Paper

Entangled Games are Hard to Approximate

  • Julia Kempe
  • Hirotada Kobayashi
  • Keiji Matsumoto
  • Ben Toner
  • Thomas Vidick

We establish the first hardness results for the problem of computing the value of one-round games played by a referee and a team of players who can share quantum entanglement. In particular, we show that it is NP-hard to approximate within an inverse polynomial the value of a one-round game with (i) quantum referee and two entangled players or (ii) classical referee and three entangled players. Previously it was not even known if computing the value exactly is NP-hard. We also describe a mathematical conjecture, which, if true, would imply hardness of approximation to within a constant. We start our proof by describing two ways to modify classical multi-player games to make them resistant to entangled players. We then show that a strategy for the modified game that uses entanglement can be "rounded" to one that does not. The results then follow from classical inapproximability bounds. Our work implies that, unless P = NP, the values of entangled-player games cannot be computed by semidefinite programs that are polynomial in the size of the referee's system, a method that has been successful for more restricted quantum games.

FOCS Conference 2008 Conference Paper

Unique Games with Entangled Provers are Easy

  • Julia Kempe
  • Oded Regev 0001
  • Ben Toner

We consider one-round games between a classical verifier and two provers who share entanglement. We show that when the constraints enforced by the verifier are "unique" constraints (i.e., permutations), the value of the game can be well approximated by a semidefinite program. Essentially the only algorithm known previously was for the special case of binary answers, as follows from the work of Tsirelson in 1980. Among other things, our result implies that the variant of the unique games conjecture where we allow the provers to share entanglement is false. Our proof is based on a novel "quantum rounding technique", showing how to take a solution to an SDP and transform it into a strategy for entangled provers. Using our approximation by a semidefinite program we also show a parallel repetition theorem for unique entangled games.

STOC Conference 2007 Conference Paper

Exponential separations for one-way quantum communication complexity, with applications to cryptography

  • Dmitry Gavinsky
  • Julia Kempe
  • Iordanis Kerenidis
  • Ran Raz
  • Ronald de Wolf

We give an exponential separation between one-way quantum and classical communication protocols for two partial Boolean functions, both of which are variants of the Boolean Hidden Matching Problem of Bar-Yossef et al. Earlier, such an exponential separation was known only for a relational version of the Hidden Matching Problem. Our proofs use the Fourier coefficients inequality of Kahn, Kalai, and Linial. We give a number of applications of this separation. In particular, in the bounded-storage model of cryptography we exhibit a scheme that is secure against adversaries with a certain amount of classical storage, but insecure against adversaries with a similar (or even much smaller) amount of quantum storage; in the setting of privacy amplification, we show that there are strong extractors that yield a classically secure key, but are insecure against a quantum adversary.

FOCS Conference 2007 Conference Paper

The Power of Quantum Systems on a Line

  • Dorit Aharonov
  • Daniel Gottesman
  • Sandy Irani
  • Julia Kempe

We study the computational strength of quantum particles (each of finite dimensionality) arranged on a line. First, we prove that it is possible to perform universal adiabatic quantum computation using a one-dimensional quantum system (with 9 states per particle). Building on the same construction, but with some additional technical effort and 12 states per particle, we show that the problem of approximating the ground state energy of a system composed of a line of quantum particles is QMA-complete; QMA is a quantum analogue of NP. This is in striking contrast to the analogous classical problem, one-dimensional MAX-2-SAT with nearest neighbor constraints, which is in P. The proof of the QMA-completeness result requires an additional idea beyond the usual techniques in the area: some illegal configurations cannot be ruled out by local checks, and are instead ruled out because they would, in the future, evolve into a state which can be seen locally to be illegal. Assuming BQP ≠ QMA, our construction gives a one-dimensional system which takes an exponential time to relax to its ground state at any temperature. This makes it a candidate for a one-dimensional spin glass.

STOC Conference 2006 Conference Paper

Bounded-error quantum state identification and exponential separations in communication complexity

  • Dmitry Gavinsky
  • Julia Kempe
  • Oded Regev 0001
  • Ronald de Wolf

We consider the problem of bounded-error quantum state identification: given either state α_0 or state α_1, we are required to output '0', '1' or 'DONO' ("don't know"), such that conditioned on outputting '0' or '1', our guess is correct with high probability. The goal is to maximize the probability of not outputting 'DONO'. We prove a direct product theorem: if we're given two such problems, with optimal probabilities a and b, respectively, and the states in the first problem are pure, then the optimal probability for the joint bounded-error state identification problem is O(ab). Our proof is based on semidefinite programming duality and may be of wider interest. Using this result, we present two exponential separations in the simultaneous message passing model of communication complexity. First, we describe a relation that can be computed with O(log n) classical bits of communication in the presence of shared randomness, but needs Ω(n^{1/3}) communication if the parties don't share randomness, even if communication is quantum. This shows the optimality of Yao's recent exponential simulation of shared-randomness protocols by quantum protocols without shared randomness. Second, we describe a relation that can be computed with O(log n) classical bits of communication in the presence of shared entanglement, but needs Ω((n/log n)^{1/3}) communication if the parties share randomness but no entanglement, even if communication is quantum. This is the first example in communication complexity where entanglement buys you much more than quantum communication does.

FOCS Conference 2004 Conference Paper

Adiabatic Quantum Computation is Equivalent to Standard Quantum Computation

  • Dorit Aharonov
  • Wim van Dam
  • Julia Kempe
  • Zeph Landau
  • Seth Lloyd
  • Oded Regev 0001

The model of adiabatic quantum computation has recently attracted attention in the physics and computer science communities, but its exact computational power has been unknown. We settle this question and describe an efficient adiabatic simulation of any given quantum algorithm. This implies that the adiabatic computation model and the standard quantum circuit model are polynomially equivalent. We also describe an extension of this result with implications to physical implementations of adiabatic computation. We believe that our result highlights the potential importance of the adiabatic computation model in the design of quantum algorithms and in their experimental realization.

STOC Conference 2001 Conference Paper

Quantum walks on graphs

  • Dorit Aharonov
  • Andris Ambainis
  • Julia Kempe
  • Umesh V. Vazirani

We set the ground for a theory of quantum walks on graphs: the generalization of random walks on finite graphs to the quantum world. Such quantum walks do not converge to any stationary distribution, as they are unitary and reversible. However, by suitably relaxing the definition, we can obtain a measure of how fast the quantum walk spreads or how confined the quantum walk stays in a small neighborhood. We give definitions of mixing time, filling time, and dispersion time. We show that in all these measures, the quantum walk on the cycle is almost quadratically faster than its classical counterpart. On the other hand, we give a lower bound on the possible speed-up by quantum walks for general graphs, showing that quantum walks can be at most polynomially faster than their classical counterparts.
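
A small numerical illustration (not from the paper): a discrete-time, Hadamard-coined quantum walk on an N-cycle next to the classical lazy random walk. The quantum-classical gap on the cycle shows up here as ballistic vs. diffusive spreading of the position distribution; the paper's mixing-time and filling-time definitions, and its continuous-time walks, are more refined than this toy.

    import numpy as np

    N, T = 101, 50                                 # cycle length, number of steps
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard coin

    # quantum walk state: amplitudes psi[coin, position]
    psi = np.zeros((2, N), dtype=complex)
    psi[:, 0] = np.array([1, 1j]) / np.sqrt(2)     # symmetric initial coin state

    # classical lazy random walk distribution for comparison
    p = np.zeros(N)
    p[0] = 1.0

    for _ in range(T):
        psi = np.tensordot(H, psi, axes=(1, 0))    # coin flip
        psi = np.stack([np.roll(psi[0], 1),        # coin 0 shifts right
                        np.roll(psi[1], -1)])      # coin 1 shifts left
        p = 0.5 * p + 0.25 * np.roll(p, 1) + 0.25 * np.roll(p, -1)

    prob_q = (np.abs(psi) ** 2).sum(axis=0)

    def spread(dist):
        pos = np.arange(N)
        pos = (pos + N // 2) % N - N // 2          # signed distance from the start at 0
        mean = (dist * pos).sum()
        return np.sqrt((dist * (pos - mean) ** 2).sum())

    print("quantum std dev:  ", round(spread(prob_q), 2))
    print("classical std dev:", round(spread(p), 2))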