Arrow Research search

Author name cluster

Jon Schneider

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

25 papers
2 author rows

Possible papers (25)

ICML Conference 2025 Conference Paper

Best of Both Worlds: Regret Minimization versus Minimax Play

  • Adrian Müller 0002
  • Jon Schneider
  • Stratis Skoulakis
  • Luca Viano
  • Volkan Cevher

In this paper, we investigate the existence of online learning algorithms with bandit feedback that simultaneously guarantee $O(1)$ regret compared to a given comparator strategy, and $\tilde{O}(\sqrt{T})$ regret compared to any fixed strategy, where $T$ is the number of rounds. We provide the first affirmative answer to this question whenever the comparator strategy supports every action. In the context of zero-sum games with min-max value zero, in both normal and extensive form, we show that our results make it possible to risk at most an $O(1)$ loss while being able to gain $\Omega(T)$ from exploitable opponents, thereby combining the benefits of both no-regret algorithms and minimax play.

EWRL Workshop 2025 Workshop Paper

Best of Both Worlds: Regret Minimization versus Minimax Play

  • Adrian Müller
  • Jon Schneider
  • Stratis Skoulakis
  • Luca Viano
  • Volkan Cevher

In this paper, we investigate the existence of online learning algorithms with bandit feedback that simultaneously guarantee $O(1)$ regret compared to a given comparator strategy, and $\tilde{O}(\sqrt{T})$ regret compared to any fixed strategy, where $T$ is the number of rounds. We provide the first affirmative answer to this question whenever the comparator strategy supports every action. In the context of zero-sum games with min-max value zero, in both normal and extensive form, we show that our results make it possible to risk at most an $O(1)$ loss while being able to gain $\Omega(T)$ from exploitable opponents, thereby combining the benefits of both no-regret algorithms and minimax play.

NeurIPS Conference 2025 Conference Paper

High-Dimensional Calibration from Swap Regret

  • Maxwell Fishelson
  • Noah Golowich
  • Mehryar Mohri
  • Jon Schneider

We study the online calibration of multi-dimensional forecasts over an arbitrary convex set $\mathcal{P} \subset \mathbb{R}^d$ relative to an arbitrary norm $\Vert\cdot\Vert$. We connect this with the problem of external regret minimization for online linear optimization, showing that if it is possible to guarantee $O(\sqrt{\rho T})$ worst-case regret after $T$ rounds when actions are drawn from $\mathcal{P}$ and losses are drawn from the dual $\Vert \cdot \Vert_*$ unit norm ball, then it is also possible to obtain $\epsilon$-calibrated forecasts after $T = \exp(O(\rho /\epsilon^2))$ rounds. When $\mathcal{P}$ is the $d$-dimensional simplex and $\Vert \cdot \Vert$ is the $\ell_1$-norm, the existence of $O(\sqrt{T\log d})$ algorithms for learning with experts implies that it is possible to obtain $\epsilon$-calibrated forecasts after $T = \exp(O(\log{d}/\epsilon^2)) = d^{O(1/\epsilon^2)}$ rounds, recovering a recent result of Peng 2025. Interestingly, our algorithm obtains this guarantee without requiring access to any online linear optimization subroutine or knowledge of the optimal rate $\rho$ -- in fact, our algorithm is identical for every setting of $\mathcal{P}$ and $\Vert \cdot \Vert$. Instead, we show that the optimal regularizer for the above OLO problem can be used to upper bound the above calibration error by a swap regret, which we then minimize by running the recent TreeSwap algorithm with Follow-The-Leader as a subroutine. The resulting algorithm is highly efficient and plays a distribution over simple averages of past observations in each round. Finally, we prove that any online calibration algorithm that guarantees $\epsilon T$ $\ell_1$-calibration error over the $d$-dimensional simplex requires $T \geq \exp(\mathrm{poly}(1/\epsilon))$ (assuming $d \geq \mathrm{poly}(1/\epsilon)$). This strengthens the corresponding $d^{\Omega(\log{1/\epsilon})}$ lower bound of Peng 2025, and shows that an exponential dependence on $1/\epsilon$ is necessary.
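
As a concrete reference for the calibration notion used in this abstract, here is a minimal sketch of how empirical $\ell_1$ calibration error over the simplex can be measured from a transcript of forecasts and one-hot outcomes. The function name, the bucketing by exact forecast value, and the rounding granularity are illustrative choices, not part of the paper.

```python
import numpy as np
from collections import defaultdict

def l1_calibration_error(forecasts, outcomes):
    """Empirical l1 calibration error over the simplex: group rounds by the
    forecast that was played, and for each group compare the forecast scaled
    by the group size against the accumulated one-hot outcomes."""
    buckets = defaultdict(list)
    for p, y in zip(forecasts, outcomes):
        buckets[tuple(np.round(p, 6))].append(np.asarray(y, dtype=float))
    error = 0.0
    for p, ys in buckets.items():
        ys = np.stack(ys)
        error += np.abs(ys.sum(axis=0) - len(ys) * np.array(p)).sum()
    return error

# Example: forecasts over a 3-outcome simplex, one-hot realized outcomes.
forecasts = [(0.5, 0.3, 0.2)] * 10 + [(0.1, 0.1, 0.8)] * 5
outcomes = [(1, 0, 0)] * 6 + [(0, 1, 0)] * 4 + [(0, 0, 1)] * 5
print(l1_calibration_error(forecasts, outcomes))
```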

NeurIPS Conference 2025 Conference Paper

Robust Contextual Pricing

  • Anupam Gupta
  • Guru Guruganesh
  • Renato Leme
  • Jon Schneider

We provide an algorithm with regret $O(C d \log \log T)$ for contextual pricing with $C$ corrupted rounds, improving over the previous $O(d^3 C \log^2(T))$ bound of Krishnamurthy et al. The result is based on a reduction that calls the uncorrupted algorithm as a black box, unlike the previous approach, which modifies the inner workings of the uncorrupted algorithm. As a result, it leads to a conceptually simpler algorithm. Finally, we provide a lower bound ruling out an $O(C + d\log \log T)$ algorithm. This shows that robustifying contextual pricing is harder than robustifying contextual search with $\epsilon$-ball losses, for which it is possible to design algorithms where corruptions add only an extra additive term $C$ to the regret.

NeurIPS Conference 2024 Conference Paper

Contracting with a Learning Agent

  • Yoav Kolumbus
  • Jon Schneider
  • Inbal Talgam-Cohen
  • Emmanouil-Vasileios Vlatakis-Gkaragkounis
  • Joshua R. Wang
  • S. M. Weinberg

Real-life contractual relations typically involve repeated interactions between the principal and agent, where, despite theoretical appeal, players rarely use complex dynamic strategies and instead manage uncertainty through learning algorithms. In this paper, we initiate the study of repeated contracts with learning agents, focusing on those achieving no-regret outcomes. For the canonical setting where the agent’s actions result in success or failure, we present a simple, optimal solution for the principal: Initially provide a linear contract with scalar $\alpha > 0$, then switch to a zero-scalar contract. This shift causes the agent to “free-fall” through their action space, yielding non-zero rewards for the principal at zero cost. Interestingly, despite the apparent exploitation, there are instances where our dynamic contract can make \emph{both} players better off compared to the best static contract. We then broaden the scope of our results to general linearly-scaled contracts, and, finally, to the best of our knowledge, we provide the first analysis of optimization against learning agents with uncertainty about the time horizon.
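
To make the "free-fall" dynamic concrete, below is a toy simulation of a principal who offers a linear contract with share $\alpha$ for the first half of the horizon and a zero-scalar contract afterwards, played against a Hedge-style learning agent. The instance (action costs, success probabilities, learning rate) is entirely hypothetical and only illustrates the qualitative effect described in the abstract; it is not the paper's construction or analysis.

```python
import numpy as np

# Hypothetical success/failure instance: action i costs c[i] and succeeds
# with probability q[i]. Under a linear contract with share alpha, the
# agent's expected per-round utility for action i is alpha * q[i] - c[i].
c = np.array([0.0, 0.1, 0.3])
q = np.array([0.1, 0.5, 0.9])

def simulate(alpha_schedule, T=10_000, eta=0.1):
    """Principal's total expected payoff against a Hedge agent."""
    w = np.ones(len(c))
    total = 0.0
    for t in range(T):
        alpha = alpha_schedule(t, T)
        p = w / w.sum()
        total += p @ ((1 - alpha) * q)        # principal keeps a (1 - alpha) share
        w *= np.exp(eta * (alpha * q - c))    # Hedge update on the agent's utilities
        w /= w.sum()                          # renormalize for numerical stability
    return total

static = simulate(lambda t, T: 0.5)                                # fixed linear contract
free_fall = simulate(lambda t, T: 0.5 if t < T // 2 else 0.0)      # switch to a zero-scalar contract
print(static, free_fall)
```

In the second phase the agent's accumulated weights keep it on high-effort actions for a while even though it is no longer being paid, which is the "free-fall" period during which the principal collects rewards at zero cost.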

NeurIPS Conference 2024 Conference Paper

Convergence of No-Swap-Regret Dynamics in Self-Play

  • Renato P. Leme
  • Georgios Piliouras
  • Jon Schneider

In this paper, we investigate the question of whether no-swap-regret dynamics have stronger convergence properties in repeated games than regular no-external-regret dynamics. We prove that in almost all symmetric zero-sum games under symmetric initializations of the agents, no-swap-regret dynamics in self-play are guaranteed to converge in a strong ``frequent-iterate'' sense to the Nash equilibrium: in all but a vanishing fraction of the rounds, the players must play a strategy profile close to a symmetric Nash equilibrium. Remarkably, relaxing any of these three constraints, i.e., allowing either (i) asymmetric initial conditions, (ii) an asymmetric game, or (iii) no-external-regret dynamics, suffices to destroy this result and leads to complex non-equilibrating or even chaotic behavior. In a dual type of result, we show that the power of no-swap-regret dynamics comes at the cost of imposing a time-asymmetry on its inputs. While no-external-regret dynamics can be completely determined by the cumulative reward vector received by each player, we show that there does not exist any general no-swap-regret dynamics defined on the same state space. In fact, we prove that any no-swap-regret learning algorithm must play a time-asymmetric function over the set of previously observed rewards, ruling out any dynamics based on a symmetric function of the current set of rewards.

ICML Conference 2024 Conference Paper

Online Learning with Bounded Recall

  • Jon Schneider
  • Kiran Vodrahalli

We study the problem of full-information online learning in the “bounded recall” setting popular in the study of repeated games. An online learning algorithm $\mathcal{A}$ is $M$-bounded-recall if its output at time $t$ can be written as a function of the $M$ previous rewards (and not, e.g., any other internal state of $\mathcal{A}$). We first demonstrate that a natural approach to constructing bounded-recall algorithms from mean-based no-regret learning algorithms (e.g., running Hedge over the last $M$ rounds) fails, and that any such algorithm incurs constant regret per round. We then construct a stationary bounded-recall algorithm that achieves a per-round regret of $\Theta(1/\sqrt{M})$, which we complement with a tight lower bound. Finally, we show that unlike the perfect-recall setting, any low-regret bounded-recall algorithm must be aware of the ordering of the past $M$ losses – any bounded-recall algorithm which plays a symmetric function of the past $M$ losses must incur constant regret per round.
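
For reference, here is a minimal sketch of the natural bounded-recall construction the abstract refers to: Hedge run over only the last $M$ reward vectors. Parameter names and the tiny usage example are illustrative; the abstract's point is precisely that this naive construction incurs constant regret per round.

```python
import numpy as np
from collections import deque

def make_bounded_recall_hedge(M, n_actions, eta=0.5):
    """Naive M-bounded-recall learner: the distribution played at time t is a
    function of only the last M observed reward vectors (Hedge on their sum)."""
    window = deque(maxlen=M)

    def play():
        if not window:
            return np.full(n_actions, 1.0 / n_actions)
        scores = eta * np.sum(list(window), axis=0)
        w = np.exp(scores - scores.max())
        return w / w.sum()

    def observe(rewards):
        window.append(np.asarray(rewards, dtype=float))

    return play, observe

play, observe = make_bounded_recall_hedge(M=5, n_actions=2)
for rewards in ([1, 0], [0, 1]) * 10:     # alternating reward vectors
    observe(rewards)
print(play())                             # depends on the last 5 rounds only
```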

NeurIPS Conference 2023 Conference Paper

Is Learning in Games Good for the Learners?

  • William Brown
  • Jon Schneider
  • Kiran Vodrahalli

We consider a number of questions related to tradeoffs between reward and regret in repeated gameplay between two agents. To facilitate this, we introduce a notion of generalized equilibrium which allows for asymmetric regret constraints, and yields polytopes of feasible values for each agent and pair of regret constraints, where we show that any such equilibrium is reachable by a pair of algorithms which maintain their regret guarantees against arbitrary opponents. As a central example, we highlight the case where one agent is no-swap and the other's regret is unconstrained. We show that this captures an extension of Stackelberg equilibria with a matching optimal value, and that there exists a wide class of games where a player can significantly increase their utility by deviating from a no-swap-regret algorithm against a no-swap learner (in fact, almost any game without pure Nash equilibria is of this form). Additionally, we make use of generalized equilibria to consider tradeoffs in terms of the opponent's algorithm choice. We give a tight characterization for the maximal reward obtainable against some no-regret learner, yet we also show a class of games in which this is bounded away from the value obtainable against the class of common "mean-based" no-regret algorithms. Finally, we consider the question of learning reward-optimal strategies via repeated play with a no-regret agent when the game is initially unknown. Again we show tradeoffs depending on the opponent's learning algorithm: the Stackelberg strategy is learnable in exponential time with any no-regret agent (and in polynomial time with any no-adaptive-regret agent) for any game where it is learnable via queries, and there are games where it is learnable in polynomial time against any no-swap-regret agent but requires exponential time against a mean-based no-regret agent.

NeurIPS Conference 2023 Conference Paper

Optimal cross-learning for contextual bandits with unknown context distributions

  • Jon Schneider
  • Julian Zimmert

We consider the problem of designing contextual bandit algorithms in the ``cross-learning'' setting of Balseiro et al., where the learner observes the loss for the action they play in all possible contexts, not just the context of the current round. We specifically consider the setting where losses are chosen adversarially and contexts are sampled i.i.d. from an unknown distribution. In this setting, we resolve an open problem of Balseiro et al. by providing an efficient algorithm with a nearly tight (up to logarithmic factors) regret bound of $\widetilde{O}(\sqrt{TK})$, independent of the number of contexts. As a consequence, we obtain the first nearly tight regret bounds for the problems of learning to bid in first-price auctions (under unknown value distributions) and sleeping bandits with a stochastic action set. At the core of our algorithm is a novel technique for coordinating the execution of a learning algorithm over multiple epochs in such a way as to remove correlations between estimation of the unknown distribution and the actions played by the algorithm. This technique may be of independent interest for other learning problems involving estimation of an unknown context distribution.

ICML Conference 2023 Conference Paper

Optimal No-Regret Learning for One-Sided Lipschitz Functions

  • Paul Dütting
  • Guru Guruganesh
  • Jon Schneider
  • Joshua R. Wang

Inspired by applications in pricing and contract design, we study the maximization of one-sided Lipschitz functions, which only provide the (weaker) guarantee that they do not grow too quickly in one direction. We show that it is possible to learn a maximizer for such a function while incurring $O(\log \log T)$ total regret (with a universal constant independent of the number of discontinuities / complexity of the function). This regret bound is asymptotically optimal in $T$ due to a lower bound of Kleinberg and Leighton. By applying this algorithm, we show that one can sell digital goods to multiple buyers and learn the optimal linear contract in the principal-agent setting while incurring at most $O(\log \log T)$ regret.

NeurIPS Conference 2022 Conference Paper

Anonymous Bandits for Multi-User Systems

  • Hossein Esfandiari
  • Vahab Mirrokni
  • Jon Schneider

In this work, we present and study a new framework for online learning in systems with multiple users that provide user anonymity. Specifically, we extend the notion of bandits to obey the standard $k$-anonymity constraint by requiring each observation to be an aggregation of rewards for at least $k$ users. This provides a simple yet effective framework where one can learn a clustering of users in an online fashion without observing any user's individual decision. We initiate the study of anonymous bandits and provide the first sublinear regret algorithms and lower bounds for this setting.
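
A minimal sketch of the $k$-anonymity constraint on feedback described above: the learner only ever sees rewards aggregated over groups of at least $k$ users. The grouping rule used here (random groups, leftovers folded into the last group) and the function name are purely illustrative, not the paper's algorithm.

```python
import numpy as np

def k_anonymous_feedback(individual_rewards, k, rng=None):
    """Aggregate per-user rewards into random groups of size >= k, so that no
    individual user's reward or decision is ever revealed to the learner."""
    rng = rng or np.random.default_rng()
    rewards = np.asarray(individual_rewards, dtype=float)
    order = rng.permutation(len(rewards))
    groups = [order[start:start + k] for start in range(0, len(order) - k + 1, k)]
    if groups:
        # Fold any leftover users (fewer than k of them) into the last group.
        leftover = order[len(groups) * k:]
        groups[-1] = np.concatenate([groups[-1], leftover])
    return [rewards[g].sum() for g in groups]

print(k_anonymous_feedback([0.1, 0.9, 0.4, 0.7, 0.3], k=2))
```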

STOC Conference 2021 Conference Paper

Combinatorial Bernoulli factories: matchings, flows, and other polytopes

  • Rad Niazadeh
  • Renato Paes Leme
  • Jon Schneider

A Bernoulli factory is an algorithmic procedure for exact sampling of certain random variables, having only Bernoulli access to their parameters. Bernoulli access to a parameter $p \in [0,1]$ means the algorithm does not know $p$, but has sample access to independent draws of a Bernoulli random variable with mean equal to $p$. In this paper, we study the problem of Bernoulli factories for polytopes: given Bernoulli access to a vector $x \in P$ for a given polytope $P \subset [0,1]^n$, output a randomized vertex such that the expected value of the $i$-th coordinate is exactly equal to $x_i$. For example, for the special case of the perfect matching polytope, one is given Bernoulli access to the entries of a doubly stochastic matrix $[x_{ij}]$ and asked to sample a matching such that the probability of each edge $(i,j)$ being present in the matching is exactly equal to $x_{ij}$. We show that a polytope $P$ admits a Bernoulli factory if and only if $P$ is the intersection of $[0,1]^n$ with an affine subspace. Our construction is based on an algebraic formulation of the problem, involving identifying a family of Bernstein polynomials (one per vertex) that satisfy a certain algebraic identity on $P$. The main technical tool behind our construction is a connection between these polynomials and the geometry of zonotope tilings. We apply these results to construct an explicit factory for the perfect matching polytope. The resulting factory is deeply connected to the combinatorial enumeration of arborescences and may be of independent interest. For the $k$-uniform matroid polytope, we recover a sampling procedure known in statistics as Sampford sampling.
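
To illustrate what "Bernoulli access" and exact vertex sampling mean in the simplest possible case, here is a sketch for the one-dimensional simplex $P = \{(x, 1-x)\}$, where a single coin flip already yields a valid factory. This toy case is my own example and is far simpler than the matching and flow polytopes handled in the paper.

```python
import random

def bernoulli_coin(p):
    """Sample access to a Bernoulli(p) coin; the factory never sees p itself."""
    return random.random() < p

def simplex_factory(coin):
    """Bernoulli factory for P = {(x, 1 - x) : x in [0, 1]}: output a vertex of P
    whose expected coordinates are exactly (x, 1 - x). One flip suffices here."""
    return (1, 0) if coin() else (0, 1)

x = 0.3                                            # hidden parameter (illustrative value)
samples = [simplex_factory(lambda: bernoulli_coin(x)) for _ in range(100_000)]
print(sum(s[0] for s in samples) / len(samples))   # approximately 0.3
```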

NeurIPS Conference 2021 Conference Paper

Contextual Recommendations and Low-Regret Cutting-Plane Algorithms

  • Sreenivas Gollapudi
  • Guru Guruganesh
  • Kostas Kollias
  • Pasin Manurangsi
  • Renato Leme
  • Jon Schneider

We consider the following variant of contextual linear bandits motivated by routing applications in navigational engines and recommendation systems. We wish to learn a hidden $d$-dimensional value $w^*$. Every round, we are presented with a subset $\mathcal{X}_t \subseteq \mathbb{R}^d$ of possible actions. If we choose (i.e., recommend to the user) action $x_t$, we obtain utility $\langle x_t, w^* \rangle$ but only learn the identity of the best action $\arg\max_{x \in \mathcal{X}_t} \langle x, w^* \rangle$. We design algorithms for this problem which achieve regret $O(d\log T)$ and $\exp(O(d \log d))$. To accomplish this, we design novel cutting-plane algorithms with low “regret” -- the total distance between the true point $w^*$ and the hyperplanes the separation oracle returns. We also consider the variant where we are allowed to provide a list of several recommendations. In this variant, we give an algorithm with $O(d^2 \log d)$ regret and list size $\mathrm{poly}(d)$. Finally, we construct nearly tight algorithms for a weaker variant of this problem where the learner only learns the identity of an action that is better than the recommendation. Our results rely on new algorithmic techniques in convex geometry (including a variant of Steiner’s formula for the centroid of a convex set) which may be of independent interest.
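
The feedback model above can be summarized with a tiny helper (hypothetical names, for illustration only): the learner earns $\langle x_t, w^* \rangle$ for the chosen action but is told only which action in $\mathcal{X}_t$ was best; the regret value returned here is for evaluation and is not observed by the learner.

```python
import numpy as np

def recommendation_feedback(X_t, chosen_index, w_star):
    """One round of the feedback model: utility is earned for the chosen action,
    but the only information revealed to the learner is the index of the best
    action in X_t. The regret is returned for evaluation purposes only."""
    utilities = X_t @ w_star
    best_index = int(np.argmax(utilities))
    instantaneous_regret = utilities[best_index] - utilities[chosen_index]
    return best_index, instantaneous_regret

X_t = np.array([[1.0, 0.0], [0.0, 1.0], [0.6, 0.6]])   # illustrative action set
w_star = np.array([0.2, 0.9])                           # hidden vector, unknown to the learner
print(recommendation_feedback(X_t, chosen_index=0, w_star=w_star))
```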

IJCAI Conference 2021 Conference Paper

Jointly Learning Prices and Product Features

  • Ehsan Emamjomeh-Zadeh
  • Renato Paes Leme
  • Jon Schneider
  • Balasubramanian Sivan

Product Design is an important problem in marketing research where a firm tries to learn what features of a product are more valuable to consumers. We study this problem from the viewpoint of online learning: a firm repeatedly interacts with a buyer by choosing a product configuration as well as a price and observing the buyer's purchasing decision. The goal of the firm is to maximize revenue throughout the course of $T$ rounds by learning the buyer's preferences. We study both the case of a set of discrete products and the case of a continuous set of allowable product features. In both cases we provide nearly tight upper and lower regret bounds.

NeurIPS Conference 2021 Conference Paper

Margin-Independent Online Multiclass Learning via Convex Geometry

  • Guru Guruganesh
  • Allen Liu
  • Jon Schneider
  • Joshua Wang

We consider the problem of multi-class classification, where a stream of adversarially chosen queries arrives and must be assigned a label online. Unlike traditional bounds which seek to minimize the misclassification rate, we minimize the total distance from each query to the region corresponding to its assigned label. When the true labels are determined via a nearest neighbor partition -- i.e., the label of a point is given by which of $k$ centers it is closest to in Euclidean distance -- we show that one can achieve a loss that is independent of the total number of queries. We complement this result by showing that learning general convex sets requires an almost linear loss per query. Our results build off of regret guarantees for the problem of contextual search. In addition, we develop a novel reduction technique from multiclass classification to binary classification which may be of independent interest.

ICML Conference 2021 Conference Paper

Reserve Price Optimization for First Price Auctions in Display Advertising

  • Zhe Feng 0004
  • Sébastien Lahaie
  • Jon Schneider
  • Jinchao Ye

The display advertising industry has recently transitioned from second- to first-price auctions as its primary mechanism for ad allocation and pricing. In light of this, publishers need to re-evaluate and optimize their auction parameters, notably reserve prices. In this paper, we propose a gradient-based algorithm to adaptively update and optimize reserve prices based on estimates of bidders’ responsiveness to experimental shocks in reserves. Our key innovation is to draw on the inherent structure of the revenue objective in order to reduce the variance of gradient estimates and improve convergence rates in both theory and practice. We show that revenue in a first-price auction can be usefully decomposed into a \emph{demand} component and a \emph{bidding} component, and introduce techniques to reduce the variance of each component. We characterize the bias-variance trade-offs of these techniques and validate the performance of our proposed algorithm through experiments on synthetic data and real display ad auctions data from a major ad exchange.
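
As a rough illustration of gradient-based reserve updates from experimental shocks, here is a generic finite-difference sketch. It is my own simplification and deliberately omits the paper's demand/bidding decomposition and variance-reduction techniques; the auction simulator, constants, and function names are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def first_price_revenue(reserve, n_bidders=3):
    """Hypothetical first-price auction round: bids are shaded uniform values;
    revenue is the winning bid if it clears the reserve, and zero otherwise."""
    bids = 0.7 * rng.random(n_bidders)
    top = bids.max()
    return top if top >= reserve else 0.0

def reserve_gradient_step(r, shock=0.05, lr=0.05, n_samples=2000):
    """Estimate d(revenue)/d(reserve) from shocked experiments around r and
    take one gradient-ascent step (clipped to [0, 1])."""
    up = np.mean([first_price_revenue(r + shock) for _ in range(n_samples)])
    down = np.mean([first_price_revenue(r - shock) for _ in range(n_samples)])
    grad = (up - down) / (2 * shock)
    return float(np.clip(r + lr * grad, 0.0, 1.0))

r = 0.1
for _ in range(50):
    r = reserve_gradient_step(r)
print(r)
```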

NeurIPS Conference 2020 Conference Paper

Myersonian Regression

  • Allen Liu
  • Renato Leme
  • Jon Schneider

Motivated by pricing applications in online advertising, we study a variant of linear regression with a discontinuous loss function that we term Myersonian regression. In this variant, we wish to find a linear function $f: \mathbb{R}^d \rightarrow \mathbb{R}$ that well approximates a set of points $(x_i, v_i) \in \mathbb{R}^d \times [0, 1]$ in the following sense: we receive a loss of $v_i$ when $f(x_i) > v_i$ and a loss of $v_i - f(x_i)$ when $f(x_i) \leq v_i$. This arises naturally in the economic application of designing a pricing policy for differentiated items (where the loss is the gap between the performance of our policy and the optimal Myerson prices). We show that Myersonian regression is NP-hard to solve exactly and furthermore that no fully polynomial-time approximation scheme exists for Myersonian regression conditioned on the Exponential Time Hypothesis being true. In contrast to this, we demonstrate a polynomial-time approximation scheme for Myersonian regression that obtains an $\epsilon m$ additive approximation to the optimal possible revenue and can be computed in time $O(\exp(\mathrm{poly}(1/\epsilon)) \cdot \mathrm{poly}(m, n))$. We show that this algorithm is stable and generalizes well over distributions of samples.
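
The discontinuous loss defining Myersonian regression is simple to state in code; a minimal sketch (function and parameter names mine):

```python
import numpy as np

def myersonian_loss(predicted_price, value):
    """Loss of a predicted price f(x) against a value v, as defined above:
    overpricing (f(x) > v) loses the whole value v since no sale occurs, while
    underpricing loses the leftover surplus v - f(x)."""
    predicted_price = np.asarray(predicted_price, dtype=float)
    value = np.asarray(value, dtype=float)
    return np.where(predicted_price > value, value, value - predicted_price)

print(myersonian_loss([0.8, 0.4], [0.5, 0.5]))   # -> [0.5, 0.1]
```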

NeurIPS Conference 2019 Conference Paper

Contextual Bandits with Cross-Learning

  • Santiago Balseiro
  • Negin Golrezaei
  • Mohammad Mahdian
  • Vahab Mirrokni
  • Jon Schneider

In the classical contextual bandits problem, in each round $t$, a learner observes some context $c$, chooses some action $a$ to perform, and receives some reward $r_{a, t}(c)$. We consider the variant of this problem where in addition to receiving the reward $r_{a, t}(c)$, the learner also learns the values of $r_{a, t}(c')$ for all other contexts $c'$; i.e., the rewards that would have been achieved by performing that action under different contexts. This variant arises in several strategic settings, such as learning how to bid in non-truthful repeated auctions (in this setting the context is the decision maker's private valuation for each auction). We call this problem the contextual bandits problem with cross-learning. The best algorithms for the classical contextual bandits problem achieve $\tilde{O}(\sqrt{CKT})$ regret against all stationary policies, where $C$ is the number of contexts, $K$ the number of actions, and $T$ the number of rounds. We demonstrate algorithms for the contextual bandits problem with cross-learning that remove the dependence on $C$ and achieve regret $\tilde{O}(\sqrt{KT})$ (when contexts are stochastic with known distribution), $\tilde{O}(K^{1/3}T^{2/3})$ (when contexts are stochastic with unknown distribution), and $\tilde{O}(\sqrt{KT})$ (when contexts are adversarial but rewards are stochastic). We simulate our algorithms on real auction data from an ad exchange running first-price auctions (showing that they outperform traditional contextual bandit algorithms).
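
A hedged sketch of the known-distribution case described above: EXP3-style weights per context, importance weights taken from the marginal probability of playing an action, and the cross-learned rewards updating every context at once. The exploration mixing, learning rate, and interface are illustrative simplifications rather than the exact algorithm from the paper.

```python
import numpy as np

def exp3_cross_learning(T, mu, reward_fn, n_actions, eta=0.05, gamma=0.05, seed=0):
    """Contextual bandits with cross-learning, known context distribution mu.
    reward_fn(a, t) must return the reward of action a in *every* context
    (an array of length len(mu)); that array is the cross-learning feedback."""
    rng = np.random.default_rng(seed)
    n_contexts = len(mu)
    log_w = np.zeros((n_contexts, n_actions))
    for t in range(T):
        # Per-context action distributions, with a little uniform exploration.
        scores = log_w - log_w.max(axis=1, keepdims=True)
        p = np.exp(scores)
        p /= p.sum(axis=1, keepdims=True)
        p = (1 - gamma) * p + gamma / n_actions
        pi = mu @ p                            # marginal probability of playing each action
        c_t = rng.choice(n_contexts, p=mu)     # observed context
        a_t = rng.choice(n_actions, p=p[c_t])
        r = np.asarray(reward_fn(a_t, t))      # rewards for a_t across all contexts
        log_w[:, a_t] += eta * r / pi[a_t]     # unbiased update in every context
    return log_w
```

The marginal probability $\pi(a)$ depends on the context distribution, which is why the unknown-distribution case in this abstract (and the tighter 2023 cross-learning result listed earlier) requires additional ideas.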

NeurIPS Conference 2019 Conference Paper

Prior-Free Dynamic Auctions with Low Regret Buyers

  • Yuan Deng
  • Jon Schneider
  • Balasubramanian Sivan

We study the problem of how to repeatedly sell to a buyer running a no-regret, mean-based algorithm. Previous work [Braverman et al., 2018] shows that it is possible to design effective mechanisms in such a setting that extract almost all of the economic surplus, but these mechanisms require the buyer's values each round to be drawn independently and identically from a fixed distribution. In this work, we do away with this assumption and consider the prior-free setting where the buyer's value each round is chosen adversarially (possibly adaptively). We show that even in this prior-free setting, it is possible to extract a $(1-\varepsilon)$-approximation of the full economic surplus for any $\varepsilon > 0$. The number of options offered to a buyer in any round scales independently of the number of rounds $T$ and polynomially in $1/\varepsilon$. We show that this is optimal up to a polynomial factor; any mechanism achieving this approximation factor, even when values are drawn stochastically, requires at least $\Omega(1/\varepsilon)$ options. Finally, we examine what is possible when we constrain our mechanism to a natural auction format where overbidding is dominated. Braverman et al. [2018] show that even when values are drawn from a known stochastic distribution supported on $[1/H, 1]$, it is impossible in general to extract more than $O(\log\log H / \log H)$ of the economic surplus. We show how to achieve the same approximation factor in the prior-independent setting (where the distribution is unknown to the seller), and an approximation factor of $O(1 / \log H)$ in the prior-free setting (where the values are chosen adversarially).

NeurIPS Conference 2019 Conference Paper

Strategizing against No-regret Learners

  • Yuan Deng
  • Jon Schneider
  • Balasubramanian Sivan

How should a player who repeatedly plays a game against a no-regret learner strategize to maximize his utility? We study this question and show that under some mild assumptions, the player can always guarantee himself a utility of at least what he would get in a Stackelberg equilibrium. When the no-regret learner has only two actions, we show that the player cannot get any higher utility than the Stackelberg equilibrium utility. But when the no-regret learner has more than two actions and plays a mean-based no-regret strategy, we show that the player can get strictly higher than the Stackelberg equilibrium utility. We construct the optimal game-play for the player against a mean-based no-regret learner who has three actions. When the no-regret learner's strategy also guarantees him no swap regret, we show that the player cannot get anything higher than the Stackelberg equilibrium utility.

NeurIPS Conference 2018 Conference Paper

Contextual Pricing for Lipschitz Buyers

  • Jieming Mao
  • Renato Leme
  • Jon Schneider

We investigate the problem of learning a Lipschitz function from binary feedback. In this problem, a learner is trying to learn a Lipschitz function $f: [0, 1]^d \rightarrow [0, 1]$ over the course of $T$ rounds. On round $t$, an adversary provides the learner with an input $x_t$, the learner submits a guess $y_t$ for $f(x_t)$, and learns whether $y_t > f(x_t)$ or $y_t \leq f(x_t)$. The learner's goal is to minimize their total loss $\sum_t\ell(f(x_t), y_t)$ (for some loss function $\ell$). The problem is motivated by \textit{contextual dynamic pricing}, where a firm must sell a stream of differentiated products to a collection of buyers with non-linear valuations for the items and observes only whether the item was sold or not at the posted price. For the symmetric loss $\ell(f(x_t), y_t) = \vert f(x_t) - y_t \vert$, we provide an algorithm for this problem achieving total loss $O(\log T)$ when $d=1$ and $O(T^{(d-1)/d})$ when $d>1$, and show that both bounds are tight (up to a factor of $\sqrt{\log T}$). For the pricing loss function $\ell(f(x_t), y_t) = f(x_t) - y_t \cdot {\bf 1}\{y_t \leq f(x_t)\}$ we show a regret bound of $O(T^{d/(d+1)})$ and show that this bound is tight. We present improved bounds in the special case of a population of linear buyers.
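
For reference, the two loss functions discussed above, written out directly (function names mine):

```python
def symmetric_loss(f_x, y):
    """Symmetric loss |f(x_t) - y_t|."""
    return abs(f_x - y)

def pricing_loss(f_x, y):
    """Pricing loss f(x_t) - y_t * 1{y_t <= f(x_t)}: the full value is lost if
    the guess y_t overshoots (no sale); otherwise the unsold surplus is lost."""
    return f_x - (y if y <= f_x else 0.0)

print(symmetric_loss(0.7, 0.5), pricing_loss(0.7, 0.5), pricing_loss(0.7, 0.9))
```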

FOCS Conference 2018 Conference Paper

Contextual Search via Intrinsic Volumes

  • Renato Paes Leme
  • Jon Schneider

We study the problem of contextual search, a multidimensional generalization of binary search that captures many problems in contextual decision-making. In contextual search, a learner is trying to learn the value of a hidden vector. Every round the learner is provided an adversarially-chosen context vector, submits a guess for the inner product of the context vector with the hidden vector, learns whether their guess was too high or too low, and incurs some loss based on the quality of their guess. The learner's goal is to minimize their total loss over the course of some number of rounds. We present an algorithm for the contextual search problem for the symmetric loss function that achieves constant total loss. We present a new algorithm for the dynamic pricing problem (which can be realized as a special case of the contextual search problem) that achieves doubly-logarithmic total loss, improving exponentially on the previous best known upper bounds and matching the known lower bounds (up to a polynomial dependence on dimension). Both algorithms make significant use of ideas from the field of integral geometry, most notably the notion of intrinsic volumes of a convex set. To the best of our knowledge this is the first application of intrinsic volumes to algorithm design.
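
To make the protocol concrete, here is a small simulator of contextual search with the symmetric loss. It uses a naive "guess the midpoint of the feasible interval" rule in place of the paper's intrinsic-volume-based algorithm, maintains the knowledge set via linear programs (scipy's linprog), and uses illustrative dimensions and random contexts; everything about it is a sketch rather than the paper's method.

```python
import numpy as np
from scipy.optimize import linprog

def feasible_range(u, cuts, d):
    """Min and max of <u, w> over {w in [0,1]^d : a @ w <= b for (a, b) in cuts}."""
    A = [a for a, _ in cuts] or None
    b = [bb for _, bb in cuts] or None
    lo = linprog(u, A_ub=A, b_ub=b, bounds=[(0, 1)] * d).fun
    hi = -linprog(-u, A_ub=A, b_ub=b, bounds=[(0, 1)] * d).fun
    return lo, hi

def contextual_search(w_star, contexts):
    """Naive midpoint strategy for contextual search with the symmetric loss."""
    d, cuts, total_loss = len(w_star), [], 0.0
    for u in contexts:
        lo, hi = feasible_range(u, cuts, d)
        guess = (lo + hi) / 2
        truth = float(u @ w_star)
        total_loss += abs(guess - truth)
        if guess > truth:                     # feedback "too high": <u, w*> <= guess
            cuts.append((u, guess))
        else:                                 # feedback "too low":  <u, w*> >= guess
            cuts.append((-u, -guess))
    return total_loss

rng = np.random.default_rng(0)
w_star = rng.random(3)
contexts = rng.random((200, 3))
print(contextual_search(w_star, contexts))
```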