Author name cluster

Vikas Sindhwani

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

49 papers

2 author rows

ICRA Conference 2025 Conference Paper

Achieving Human Level Competitive Robot Table Tennis

David B. D'Ambrosio
Saminda Abeyruwan
Laura Graesser
Atil Iscen
Heni Ben Amor
Alex Bewley
Barney J. Reed
Krista Reymann

Achieving human-level performance on real world tasks is a north star for the robotics community. We present the first learned robot agent that reaches amateur human-level performance in competitive table tennis. Table tennis is a physically demanding sport that takes humans years to master. We contribute (1) a hierarchical and modular policy architecture consisting of (i) low level controllers with their skill descriptors that model their capabilities and (ii) a high level controller that chooses the low level skills, (2) techniques for enabling zero-shot sim-to-real and curriculum building, including an iterative approach (train in sim, deploy in real), and (3) real time adaptation to unseen opponents. Policy performance was assessed through 29 robot vs. human matches, of which the robot won 45% (13/29). All humans were unseen players and their skill level varied from beginner to tournament level. Whilst the robot lost all matches vs. the most advanced players, it won 100% of matches vs. beginners and 55% of matches vs. intermediate players, demonstrating solidly amateur human-level performance. Videos of the matches can be viewed at https://sites.google.com/view/competitive-robot-table-tennis.

ICML Conference 2025 Conference Paper

Learning the RoPEs: Better 2D and 3D Position Encodings with STRING

Connor Schenck
Isaac Reid
Mithun George Jacob
Alex Bewley
Joshua Ainslie
David Rendleman
Deepali Jain
Mohit Sharma 0001

We introduce $\textbf{STRING}$: Separable Translationally Invariant Position Encodings. STRING extends Rotary Position Encodings, a recently proposed and widely used algorithm in large language models, via a unifying theoretical framework. Importantly, STRING still provides $\textbf{exact}$ translation invariance, including token coordinates of arbitrary dimensionality, whilst maintaining a low computational footprint. These properties are especially important in robotics, where efficient 3D token representation is key. We integrate STRING into Vision Transformers with RGB(-D) inputs (color plus optional depth), showing substantial gains, e.g., in open-vocabulary object detection and for robotics controllers. We complement our experiments with a rigorous mathematical analysis, proving the universality of our methods. Videos of STRING-based robotics controllers can be found here: https://sites.google.com/view/string-robotics.
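As a point of reference for the rotary encodings that STRING extends, the following is a minimal sketch of standard 1D rotary position encoding applied to a single 2D feature pair. It is not STRING itself. It only illustrates the translation-invariance property named in the abstract: the attention score between a rotated query and a rotated key depends on the difference of their positions, not on the positions themselves. The function names and the frequency value are illustrative assumptions.

```python
import math

def rotate(vec, position, freq=0.5):
    """Rotate a 2D feature pair by the angle position * freq (standard 1D rotary encoding)."""
    angle = position * freq  # freq is an arbitrary illustrative value
    c, s = math.cos(angle), math.sin(angle)
    x, y = vec
    return (c * x - s * y, s * x + c * y)

def score(query, key, q_pos, k_pos):
    """Dot product between a position-rotated query and a position-rotated key."""
    q = rotate(query, q_pos)
    k = rotate(key, k_pos)
    return q[0] * k[0] + q[1] * k[1]

query, key = (1.0, 2.0), (0.5, -1.0)

# Shifting both token positions by the same offset should leave the score unchanged,
# because only the relative rotation between query and key survives the dot product.
base = score(query, key, q_pos=3, k_pos=7)
shifted = score(query, key, q_pos=103, k_pos=107)
print(abs(base - shifted) < 1e-9)
```

In a real model the same rotation is applied to many feature pairs with different frequencies; the single pair here is kept only to make the invariance easy to check by hand.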

IROS Conference 2024 Conference Paper

Embodied AI with Two Arms: Zero-shot Learning, Safety and Modularity

Jake Varley
Sumeet Singh
Deepali Jain
Krzysztof Choromanski
Andy Zeng 0001
Somnath Basu Roy Chowdhury
Avinava Dubey
Vikas Sindhwani

We present an embodied AI system which receives open-ended natural language instructions from a human, and controls two arms to collaboratively accomplish potentially long-horizon tasks over a large workspace. Our system is modular: it deploys state-of-the-art Large Language Models for task planning, Vision-Language models for semantic perception, and Point Cloud transformers for grasping. With semantic and physical safety in mind, these modules are interfaced with a real-time trajectory optimizer and a compliant tracking controller to enable human-robot proximity. We demonstrate performance on the following tasks: bi-arm sorting, bottle opening, and trash disposal. These are done zero-shot: the models used have not been trained with any real world data from this bi-arm robot, its scenes, or its workspace. Composing both learning- and non-learning-based components in a modular fashion with interpretable inputs and outputs allows the user to easily debug points of failure and fragilities. One may also swap modules in place to improve the robustness of the overall platform, for instance with imitation-learned policies.

ICRA Conference 2024 Conference Paper

How to Prompt Your Robot: A PromptBook for Manipulation Skills with Code as Policies

Montserrat Gonzalez Arenas
Ted Xiao
Sumeet Singh
Vidhi Jain
Allen Z. Ren
Quan Vuong
Jake Varley
Alexander Herzog

Large Language Models (LLMs) have demonstrated the ability to perform semantic reasoning and planning and to write code for robotics tasks. However, most methods rely on pre-existing primitives (i.e., pick, open drawer) or similar examples of robot code alone, which heavily limits their scalability to new scenarios. We present PromptBook, a collection of different prompting paradigms to generate code for successfully executing new manipulation skills. We demonstrate example-based, instruction-based, and chain-of-thought prompting to write robot code, as well as a method to build the prompt leveraging LLMs and human feedback. We show PromptBook enables LLMs to write code for new low-level manipulation skills in a zero-shot manner: from picking diverse objects and opening/closing drawers to whisking and waving hello. We evaluate the new skills on a mobile manipulator with 83% success rate at picking, 50-71% at opening drawers and 100% at closing them. Notably, the LLM is able to infer gripper orientation for grasping a drawer handle (z-axis aligned) vs. a top-down grasp (x-axis aligned).

TMLR Journal 2024 Journal Article

Revisiting Energy Based Models as Policies: Ranking Noise Contrastive Estimation and Interpolating Energy Models

Sumeet Singh
Stephen Tu
Vikas Sindhwani

A crucial design decision for any robot learning pipeline is the choice of policy representation: what type of model should be used to generate the next set of robot actions? Owing to the inherent multi-modal nature of many robotic tasks, combined with the recent successes in generative modeling, researchers have turned to state-of-the-art probabilistic models such as diffusion models for policy representation. In this work, we revisit the choice of energy-based models (EBM) as a policy class. We show that the prevailing folklore (that energy models in high dimensional continuous spaces are impractical to train) is false. We develop a practical training objective and algorithm for energy models which combines several key ingredients: (i) ranking noise contrastive estimation (R-NCE), (ii) learnable negative samplers, and (iii) non-adversarial joint training. We prove that our proposed objective function is asymptotically consistent and quantify its limiting variance. On the other hand, we show that the Implicit Behavior Cloning (IBC) objective is actually biased even at the population level, providing a mathematical explanation for the poor performance of IBC trained energy policies in several independent follow-up works. We further extend our algorithm to learn a continuous stochastic process that bridges noise and data, modeling this process with a family of EBMs indexed by a scale variable. In doing so, we demonstrate that the core idea behind recent progress in generative modeling is actually compatible with EBMs. Altogether, our proposed training algorithms enable us to train energy-based models as policies which compete with, and even outperform, diffusion models and other state-of-the-art approaches in several challenging multi-modal benchmarks: obstacle avoidance path planning and contact-rich block pushing.

ICRA Conference 2024 Conference Paper

SARA-RT: Scaling up Robotics Transformers with Self-Adaptive Robust Attention

Isabel Leal
Krzysztof Choromanski
Deepali Jain
Avinava Dubey
Jake Varley
Michael S. Ryoo
Yao Lu 0006
Frederick Liu

We present Self-Adaptive Robust Attention for Robotics Transformers (SARA-RT): a new paradigm for addressing the emerging challenge of scaling up Robotics Transformers (RT) for on-robot deployment. SARA-RT relies on a new fine-tuning method that we propose, called up-training. It converts pre-trained or already fine-tuned Transformer-based robotic policies of quadratic time complexity (including massive billion-parameter vision-language-action models, or VLAs) into their efficient linear-attention counterparts while maintaining high quality. We demonstrate the effectiveness of SARA-RT by speeding up: (a) the class of recently introduced RT-2 models [1], the first VLA robotic policies pre-trained on internet-scale data, as well as (b) Point Cloud Transformer (PCT) robotic policies operating on large point clouds. We complement our results with a rigorous mathematical analysis providing deeper insight into the phenomenon of SARA.

NeurIPS Conference 2024 Conference Paper

Structured Unrestricted-Rank Matrices for Parameter Efficient Finetuning

Arijit Sehanobish
Avinava Dubey
Krzysztof Choromanski
Somnath B. Chowdhury
Deepali Jain
Vikas Sindhwani
Snigdha Chaturvedi

Recent efforts to scale Transformer models have demonstrated rapid progress across a wide range of tasks (Wei et al., 2022). However, fine-tuning these models for downstream tasks is quite expensive due to their large parameter counts. Parameter-efficient fine-tuning (PEFT) approaches have emerged as a viable alternative, allowing us to fine-tune models by updating only a small number of parameters. In this work, we propose a general framework for PEFT based on structured unrestricted-rank matrices (SURMs), which can serve as a drop-in replacement for popular approaches such as Adapters and LoRA. Unlike other methods like LoRA, SURMs give us more flexibility in finding the right balance between compactness and expressiveness. This is achieved by using low displacement rank matrices (LDRMs), which haven't been used in this context before. SURMs remain competitive with baselines, often providing significant quality improvements while using a smaller parameter budget. SURMs achieve 5-7% accuracy gains on various image classification tasks while replacing low-rank matrices in LoRA, and up to a 12x reduction in the number of parameters in adapters (with virtually no loss in quality) on the GLUE benchmark.

IROS Conference 2024 Conference Paper

The Design of the Barkour Benchmark for Robot Agility

Wenhao Yu 0003
Ken Caluwaerts
Atil Iscen
J. Chase Kew
Tingnan Zhang
Daniel Freeman
Lisa Lee
Stefano Saliceti

In this paper, we describe the design of the Barkour benchmark for measuring robot agility in navigating complex environments. Despite the growing interest in developing agile robot locomotion skills, the field lacks systematic benchmarks to measure the performance of robotic control systems and hardware in agility-focused tasks. This motivated us to propose the Barkour benchmark, an obstacle course designed to quantify agility across various robotic platforms. Inspired by dog agility competitions, the course features diverse obstacles and a time-based scoring mechanism, encouraging researchers to develop controllers that enable robots to move quickly, precisely, and with adaptability. This benchmark is challenging as it demands diverse motion skills and the time-based scoring requires control precision at high speed. Along with the design details presented in the paper, we release our simulated environment setups in MuJoCo-XLA and the CAD model of a custom-designed quadruped robot to facilitate future research to reproduce the Barkour setup (available at sites.google.com/view/barkour). We hope these together will accelerate the pace of robot agility research.

ICRA Conference 2023 Conference Paper

A Contextual Bandit Approach for Learning to Plan in Environments with Probabilistic Goal Configurations

Sohan Rudra
Saksham Goel
Anirban Santara
Claudio Gentile
Laurent Perron
Fei Xia
Vikas Sindhwani
Carolina Parada

Object-goal navigation (Object-nav) entails searching, recognizing and navigating to a target object. Object-nav has been extensively studied by the Embodied-AI community, but most solutions are restricted to considering static objects (e.g., television, fridge). We propose a modular framework for object-nav that is able to efficiently search indoor environments for not just static objects but also movable objects (e.g., fruits, glasses, phones) that frequently change their positions due to human intervention. Our contextual-bandit agent efficiently explores the environment by showing optimism in the face of uncertainty and learns a model of the likelihood of spotting different objects from each navigable location. The likelihoods are used as rewards in a weighted minimum latency solver to deduce a trajectory for the robot. We evaluate our algorithms in two simulated environments and a real-world setting to demonstrate high sample efficiency and reliability.

NeurIPS Conference 2023 Conference Paper

Mnemosyne: Learning to Train Transformers with Transformers

Deepali Jain
Krzysztof M Choromanski
Kumar Avinava Dubey
Sumeet Singh
Vikas Sindhwani
Tingnan Zhang
Jie Tan

In this work, we propose a new class of learnable optimizers, called Mnemosyne. It is based on the novel spatio-temporal low-rank implicit attention Transformers that can learn to train entire neural network architectures, including other Transformers, without any task-specific optimizer tuning. We show that Mnemosyne: (a) outperforms popular LSTM optimizers (also with new feature engineering to mitigate catastrophic forgetting of LSTMs), (b) can successfully train Transformers while using simple meta-training strategies that require minimal computational resources, (c) matches accuracy-wise SOTA hand-designed optimizers with carefully tuned hyper-parameters (often producing top performing models). Furthermore, Mnemosyne provides space complexity comparable to that of its hand-designed first-order counterparts, which allows it to scale to training larger sets of parameters. We conduct an extensive empirical evaluation of Mnemosyne on: (a) fine-tuning a wide range of Vision Transformers (ViTs) from medium-size architectures to massive ViT-Hs (36 layers, 16 heads), (b) pre-training BERT models and (c) soft prompt-tuning large 11B+ T5XXL models. We complement our results with a comprehensive theoretical analysis of the compact associative memory used by Mnemosyne which we believe was never done before.

ICRA Conference 2023 Conference Paper

Robotic Table Wiping via Reinforcement Learning and Whole-body Trajectory Optimization

Thomas Lew
Sumeet Singh
Mario Prats
Jeffrey T. Bingham
Jonathan Weisz
Benjie Holson
Xiaohan Zhang
Vikas Sindhwani

We propose a framework to enable multipurpose assistive mobile robots to autonomously wipe tables to clean spills and crumbs. This problem is challenging, as it requires planning wiping actions while reasoning over uncertain latent dynamics of crumbs and spills captured via high-dimensional visual observations. Simultaneously, we must guarantee constraint satisfaction to enable safe deployment in unstructured cluttered environments. To tackle this problem, we first propose a stochastic differential equation to model crumbs and spill dynamics and absorption with a robot wiper. Using this model, we train a vision-based policy for planning wiping actions in simulation using reinforcement learning (RL). To enable zero-shot sim-to-real deployment, we dovetail the RL policy with a whole-body trajectory optimization framework to compute base and arm joint trajectories that execute the desired wiping motions while guaranteeing constraint satisfaction. We extensively validate our approach in simulation and on hardware. Video of experiments: https://youtu.be/inORKP4F3EI

ICLR Conference 2023 Conference Paper

Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language

Andy Zeng 0001
Maria Attarian
Brian Ichter
Krzysztof Choromanski
Adrian Wong
Stefan Welker
Federico Tombari
Aveek Purohit

We investigate how multimodal prompt engineering can use language as the intermediate representation to combine complementary knowledge from different pretrained (potentially multimodal) language models for a variety of tasks. This approach is both distinct from and complementary to the dominant paradigm of joint multimodal training. It also recalls a traditional systems-building view as in classical NLP pipelines, but with prompting large pretrained multimodal models. We refer to these as Socratic Models (SMs): a modular class of systems in which multiple pretrained models may be composed zero-shot via multimodal-informed prompting to capture new multimodal capabilities, without additional finetuning. We show that these systems provide competitive state-of-the-art performance for zero-shot image captioning and video-to-text retrieval, and also enable new applications such as (i) answering free-form questions about egocentric video, (ii) engaging in multimodal assistive dialogue with people (e.g., for cooking recipes), and (iii) robot perception and planning. We hope this work provides (a) results for stronger zero-shot baseline performance with analysis also highlighting their limitations, (b) new perspectives for building multimodal systems powered by large pretrained models, and (c) practical application advantages in certain regimes limited by data scarcity, training compute, or model access.

ICLR Conference 2022 Conference Paper

Hybrid Random Features

Krzysztof Choromanski
Han Lin
Haoxian Chen 0002
Arijit Sehanobish
Yuanzhe Ma
Deepali Jain
Jake Varley
Andy Zeng 0001

We propose a new class of random feature methods for linearizing softmax and Gaussian kernels called hybrid random features (HRFs) that automatically adapt the quality of kernel estimation to provide the most accurate approximation in the defined regions of interest. Special instantiations of HRFs lead to well-known methods such as trigonometric (Rahimi & Recht, 2007) or (recently introduced in the context of linear-attention Transformers) positive random features (Choromanski et al., 2021). By generalizing Bochner’s Theorem for softmax/Gaussian kernels and leveraging random features for compositional kernels, the HRF mechanism provides strong theoretical guarantees: unbiased approximation and strictly smaller worst-case relative errors than its counterparts. We conduct an exhaustive empirical evaluation of HRFs, ranging from pointwise kernel estimation experiments, through tests on data admitting clustering structure, to benchmarking implicit-attention Transformers (also for downstream Robotics applications), demonstrating their quality in a wide spectrum of machine learning problems.
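For context on the trigonometric baseline that the abstract cites (Rahimi & Recht, 2007), the following is a minimal sketch of trigonometric random features for the Gaussian kernel $\exp(-\|x-y\|^2/2)$. It is not the hybrid (HRF) estimator from the paper. The feature count, the seed, and the test points are illustrative assumptions, and the estimate is random, so it only approximates the exact kernel value.

```python
import math
import random

def trig_features(x, weights, offsets):
    """Map x to m trigonometric random features: sqrt(2/m) * cos(w.x + b)."""
    m = len(weights)
    scale = math.sqrt(2.0 / m)
    return [
        scale * math.cos(sum(wi * xi for wi, xi in zip(w, x)) + b)
        for w, b in zip(weights, offsets)
    ]

def gaussian_kernel(x, y):
    """Exact Gaussian kernel with unit bandwidth: exp(-||x - y||^2 / 2)."""
    return math.exp(-0.5 * sum((a - b) ** 2 for a, b in zip(x, y)))

rng = random.Random(0)   # fixed seed so the sketch is repeatable
dim, m = 3, 20000        # illustrative sizes
# Frequencies are standard normal; phase offsets are uniform on [0, 2*pi).
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(m)]
offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(m)]

x, y = [0.3, -0.2, 0.5], [0.1, 0.4, -0.3]
zx = trig_features(x, weights, offsets)
zy = trig_features(y, weights, offsets)

# The inner product of the feature maps is an unbiased estimate of the kernel.
estimate = sum(a * b for a, b in zip(zx, zy))
exact = gaussian_kernel(x, y)
print(f"estimate={estimate:.3f} exact={exact:.3f}")
```

The point of the linearization is that the kernel value is recovered from a plain dot product of feature vectors, which is what makes linear-attention Transformers possible.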

IROS Conference 2022 Conference Paper

Multiscale Sensor Fusion and Continuous Control with Neural CDEs

Sumeet Singh
Francis McCann Ramirez
Jacob Varley
Andy Zeng 0001
Vikas Sindhwani

Though robot learning is often formulated in terms of discrete-time Markov decision processes (MDPs), physical robots require near-continuous multiscale feedback control. Machines operate on multiple asynchronous sensing modalities, each with different frequencies, e.g., video frames at 30Hz, proprioceptive state at 100Hz, and force-torque data at 500Hz. While the classic approach is to batch observations into fixed-time windows then pass them through feed-forward encoders (e.g., with deep networks), we show that there exists a more elegant approach, one that treats policy learning as modeling latent state dynamics in continuous time. Specifically, we present InFuser, a unified architecture that trains continuous-time policies with Neural Controlled Differential Equations (CDEs). InFuser evolves a single latent state representation over time by (In)tegrating and (Fus)ing multi-sensory observations (arriving at different frequencies), and inferring actions in continuous time. This enables policies that can react to multi-frequency multi-sensory feedback for truly end-to-end visuomotor control, without discrete-time assumptions. Behavior cloning experiments demonstrate that InFuser learns robust policies for dynamic tasks (e.g., swinging a ball into a cup), notably outperforming several baselines in settings where observations from one sensing modality can arrive at much sparser intervals than others.

ICRA Conference 2022 Conference Paper

Optimizing Trajectories with Closed-Loop Dynamic SQP

Sumeet Singh
Jean-Jacques E. Slotine
Vikas Sindhwani

Indirect trajectory optimization methods such as Differential Dynamic Programming (DDP) have found considerable success when only planning under dynamic feasibility constraints. Meanwhile, nonlinear programming (NLP) has been the state-of-the-art approach when faced with additional constraints (e.g., control bounds, obstacle avoidance). However, a naïve implementation of NLP algorithms, e.g., shooting-based sequential quadratic programming (SQP), may suffer from slow convergence, caused by natural instabilities of the underlying system manifesting as poor numerical stability within the optimization. Re-interpreting the DDP closed-loop rollout policy as a sensitivity-based correction to a second-order search direction, we demonstrate how to compute analogous closed-loop policies (i.e., feedback gains) for constrained problems. Our key theoretical result introduces a novel dynamic programming-based constraint-set recursion that augments the canonical “cost-to-go” backward pass. On the algorithmic front, we develop a hybrid-SQP algorithm incorporating DDP-style closed-loop rollouts, enabled via efficient parallelized computation of the feedback gains. Finally, we validate our theoretical and algorithmic contributions on a set of increasingly challenging benchmarks, demonstrating significant improvements in convergence speed over standard open-loop SQP.

ICRA Conference 2021 Conference Paper

Learning to Rearrange Deformable Cables, Fabrics, and Bags with Goal-Conditioned Transporter Networks

Daniel Seita
Pete Florence
Jonathan Tompson
Erwin Coumans
Vikas Sindhwani
Ken Goldberg
Andy Zeng 0001

Rearranging and manipulating deformable objects such as cables, fabrics, and bags is a long-standing challenge in robotic manipulation. The complex dynamics and high-dimensional configuration spaces of deformables, compared to rigid objects, make manipulation difficult not only for multi-step planning, but even for goal specification. Goals cannot be as easily specified as rigid object poses, and may involve complex relative spatial relations such as "place the item inside the bag". In this work, we develop a suite of simulated benchmarks with 1D, 2D, and 3D deformable structures, including tasks that involve image-based goal-conditioning and multi-step deformable manipulation. We propose embedding goal-conditioning into Transporter Networks, a recently proposed model architecture for learning robotic manipulation that rearranges deep features to infer displacements that can represent pick and place actions. We demonstrate that goal-conditioned Transporter Networks enable agents to manipulate deformable structures into flexibly specified configurations without test-time visual anchors for target locations. We also significantly extend prior results using Transporter Networks for manipulating deformable objects by testing on tasks with 2D and 3D deformables. Supplementary material is available at https://berkeleyautomation.github.io/bags/.

ICRA Conference 2021 Conference Paper

Piecewise-Linear Motion Planning amidst Static, Moving, or Morphing Obstacles

Bachir El Khadir
Jean-Bernard Lasserre
Vikas Sindhwani

We propose a novel method for planning shortest length piecewise-linear motions through complex environments punctured with static, moving, or even morphing obstacles. Using a moment optimization approach, we formulate a hierarchy of semidefinite programs that yield increasingly refined lower bounds converging monotonically to the optimal path length. Our global moment optimization approach natively handles continuous time constraints without any need for time discretization. For computational tractability, we derive an iterative motion planner which compares favorably with sampling-based and nonlinear optimization baselines.

NeurIPS Conference 2020 Conference Paper

Ode to an ODE

Krzysztof M. Choromanski
Jared Quincy Davis
Valerii Likhosherstov
Xingyou Song
Jean-Jacques Slotine
Jacob Varley
Honglak Lee
Adrian Weller

We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the orthogonal group O(d). This nested system of two flows, where the parameter-flow is constrained to lie on the compact manifold, provides stability and effectiveness of training and solves the gradient vanishing-explosion problem which is intrinsically related to training deep neural network architectures such as Neural ODEs. Consequently, it leads to better downstream models, as we show on the example of training reinforcement learning policies with evolution strategies, and in the supervised learning setting, by comparing with previous SOTA baselines. We provide strong convergence results for our proposed mechanism that are independent of the width of the network, supporting our empirical studies. Our results show an intriguing connection between the theory of deep neural networks and the field of matrix flows on compact manifolds.

IROS Conference 2020 Conference Paper

Robotic Table Tennis with Model-Free Reinforcement Learning

Wenbo Gao
Laura Graesser
Krzysztof Choromanski
Xingyou Song
Nevena Lazic
Pannag R. Sanketi
Vikas Sindhwani
Navdeep Jaitly

We propose a model-free algorithm for learning efficient policies capable of returning table tennis balls by controlling robot joints at a rate of 100Hz. We demonstrate that evolutionary search (ES) methods acting on CNN-based policy architectures for non-visual inputs and convolving across time learn compact controllers leading to smooth motions. Furthermore, we show that with appropriately tuned curriculum learning on the task and rewards, policies are capable of developing multi-modal styles, specifically forehand and backhand stroke, whilst achieving 80% return rate on a wide range of ball throws. We observe that multi-modality does not require any architectural priors, such as multi-head architectures or hierarchical policies.

ICML Conference 2020 Conference Paper

Stochastic Flows and Geometric Optimization on the Orthogonal Group

Krzysztof Choromanski
David Cheikhi
Jared Quincy Davis
Valerii Likhosherstov
Achille Nazaret
Achraf Bahamou
Xingyou Song
Mrugank Akarte

We present a new class of stochastic, geometrically-driven optimization algorithms on the orthogonal group O(d) and naturally reductive homogeneous manifolds obtained from the action of the rotation group SO(d). We theoretically and experimentally demonstrate that our methods can be applied in various fields of machine learning including deep, convolutional and recurrent neural networks, reinforcement learning, normalizing flows and metric learning. We show an intriguing connection between efficient stochastic optimization on the orthogonal group and graph theory (e.g., matching problem, partition functions over graphs, graph-coloring). We leverage the theory of Lie groups and provide theoretical results for the designed class of algorithms. We demonstrate broad applicability of our methods by showing strong performance on the seemingly unrelated tasks of learning world models to obtain stable policies for the most difficult Humanoid agent from OpenAI Gym and improving convolutional neural networks.

ICRA Conference 2020 Conference Paper

Unsupervised Anomaly Detection for Self-flying Delivery Drones

Vikas Sindhwani
Hakim Sidahmed
Krzysztof Choromanski
Brandon Jones

We propose a novel anomaly detection framework for a fleet of hybrid aerial vehicles executing high-speed package pickup and delivery missions. The detection is based on machine learning models of normal flight profiles, trained on millions of flight log measurements of control inputs and sensor readings. We develop a new scalable algorithm for robust regression which can simultaneously fit predictive flight dynamics models while identifying and discarding abnormal flight missions from the training set. The resulting unsupervised estimator has a very high breakdown point and can withstand massive contamination of training data to uncover what normal flight patterns look like, without requiring any form of prior knowledge of aircraft aerodynamics or manual labeling of anomalies upfront. Across many different anomaly types, spanning simple 3-sigma statistical thresholds to turbulence and other equipment anomalies, our models achieve high detection rates across the board. Our method consistently outperforms alternative robust detection methods on synthetic benchmark problems. To the best of our knowledge, dynamics modeling of hybrid delivery drones for anomaly detection at the scale of 100 million measurements from 5000 real flight missions in variable flight conditions is unprecedented.

NeurIPS Conference 2019 Conference Paper

From Complexity to Simplicity: Adaptive ES-Active Subspaces for Blackbox Optimization

Krzysztof Choromanski
Aldo Pacchiano
Jack Parker-Holder
Yunhao Tang
Vikas Sindhwani

We present a new algorithm (ASEBO) for optimizing high-dimensional blackbox functions. ASEBO adapts to the geometry of the function and learns optimal sets of sensing directions, which are used to probe it, on-the-fly. It addresses the exploration-exploitation trade-off of blackbox optimization with expensive blackbox queries by continuously learning the bias of the lower-dimensional model used to approximate gradients of smoothings of the function via compressed sensing and contextual bandits methods. To obtain this model, it leverages techniques from the emerging theory of active subspaces in a novel ES blackbox optimization context. As a result, ASEBO learns the dynamically changing intrinsic dimensionality of the gradient space and adapts to the hardness of different stages of the optimization without external supervision. Consequently, it leads to more sample-efficient blackbox optimization than state-of-the-art algorithms. We provide theoretical results and test ASEBO advantages over other methods empirically by evaluating it on the set of reinforcement learning policy optimization tasks as well as functions from the recently open-sourced Nevergrad library.

ICRA Conference 2018 Conference Paper

Optimizing Simulations with Noise-Tolerant Structured Exploration

Krzysztof Choromanski
Atil Iscen
Vikas Sindhwani
Jie Tan 0001
Erwin Coumans

We propose a simple drop-in noise-tolerant replacement for the standard finite difference procedure used ubiquitously in blackbox optimization. In our approach, parameter perturbation directions are defined by a family of structured orthogonal matrices. We show that at the small cost of computing a Fast Walsh-Hadamard/Fourier Transform (FWHT/FFT), such structured finite differences consistently give higher quality approximation of gradients and Jacobians in comparison to vanilla approaches that use coordinate directions or random Gaussian perturbations. We find that trajectory optimizers like Iterative LQR and Differential Dynamic Programming require fewer iterations to solve several classic continuous control tasks when our methods are used to linearize noisy, blackbox dynamics instead of standard finite differences. By embedding structured exploration in a quasi-Newton optimizer (LBFGS), we are able to learn agile walking and turning policies for quadruped locomotion, that successfully transfer from simulation to actual hardware. We theoretically justify our methods via bounds on the quality of gradient reconstruction and provide a basis for applying them also to nonsmooth problems.

ICML Conference 2018 Conference Paper

Structured Evolution with Compact Architectures for Scalable Policy Optimization

Krzysztof Choromanski
Mark Rowland 0001
Vikas Sindhwani
Richard E. Turner
Adrian Weller

We present a new method of blackbox optimization via gradient approximation with the use of structured random orthogonal matrices, providing more accurate estimators than baselines and with provable theoretical guarantees. We show that this algorithm can be successfully applied to learn better quality compact policies than those using standard gradient estimation techniques. The compact policies we learn have several advantages over unstructured ones, including faster training algorithms and faster inference. These benefits are important when the policy is deployed on real hardware with limited resources. Further, compact policies provide more scalable architectures for derivative-free optimization (DFO) in high-dimensional spaces. We show that most robotics tasks from the OpenAI Gym can be solved using neural networks with less than 300 parameters, with almost linear time complexity of the inference phase, with up to 13x fewer parameters relative to the Evolution Strategies (ES) algorithm introduced by Salimans et al. (2017). We do not need heuristics such as fitness shaping to learn good quality policies, resulting in a simple and theoretically motivated training mechanism.

JMLR Journal 2017 Journal Article

Hierarchically Compositional Kernels for Scalable Nonparametric Learning

Jie Chen
Haim Avron
Vikas Sindhwani

We propose a novel class of kernels to alleviate the high computational cost of large-scale nonparametric learning with kernel methods. The proposed kernel is defined based on a hierarchical partitioning of the underlying data domain, where the Nyström method (a globally low-rank approximation) is married with a locally lossless approximation in a hierarchical fashion. The kernel maintains (strict) positive-definiteness. The corresponding kernel matrix admits a recursively off-diagonal low-rank structure, which allows for fast linear algebra computations. Suppressing the factor of data dimension, the memory and arithmetic complexities for training a regression or a classifier are reduced from $O(n^2)$ and $O(n^3)$ to $O(nr)$ and $O(nr^2)$, respectively, where $n$ is the number of training examples and $r$ is the rank on each level of the hierarchy. Although other randomized approximate kernels entail a similar complexity, empirical results show that the proposed kernel achieves a matching performance with a smaller $r$. We demonstrate comprehensive experiments to show the effective use of the proposed kernel on data sizes up to the order of millions.

NeurIPS Conference 2017 Conference Paper

On Blackbox Backpropagation and Jacobian Sensing

Krzysztof Choromanski
Vikas Sindhwani

From a small number of calls to a given “blackbox” on random input perturbations, we show how to efficiently recover its unknown Jacobian, or estimate the left action of its Jacobian on a given vector. Our methods are based on a novel combination of compressed sensing and graph coloring techniques, and provably exploit structural prior knowledge about the Jacobian such as sparsity and symmetry while being noise robust. We demonstrate efficient backpropagation through noisy blackbox layers in a deep neural net, improved data-efficiency in the task of linearizing the dynamics of a rigid body system, and the generic ability to handle a rich class of input-output dependency structures in Jacobian estimation problems.
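As a rough illustration of only the graph coloring ingredient mentioned above (the compressed sensing part and the noise handling are not shown), the sketch below recovers a tridiagonal Jacobian from three perturbed calls plus one baseline call. The `blackbox` function, the assumed tridiagonal sparsity pattern, and the name `colored_jacobian` are hypothetical choices made for this example.

```python
def blackbox(x):
    """Hypothetical blackbox: output i depends only on x[i-1], x[i], x[i+1]."""
    n = len(x)
    out = []
    for i in range(n):
        left = x[i - 1] if i > 0 else 0.0
        right = x[i + 1] if i < n - 1 else 0.0
        out.append(2.0 * x[i] - left + 0.5 * right)
    return out


def colored_jacobian(f, x, num_colors=3, h=1e-6):
    """Recover a tridiagonal Jacobian with num_colors + 1 calls to f.

    Columns j and k share a color when j % 3 == k % 3. Under the assumed
    tridiagonal pattern, same-colored columns never touch the same output
    row, so they can be perturbed together in a single call.
    """
    n = len(x)
    base = f(x)
    J = [[0.0] * n for _ in range(n)]
    for color in range(num_colors):
        perturbed = [
            xi + (h if j % num_colors == color else 0.0)
            for j, xi in enumerate(x)
        ]
        diff = [(a - b) / h for a, b in zip(f(perturbed), base)]
        for j in range(color, n, num_colors):
            # Column j can only affect rows j-1, j, j+1.
            for i in range(max(0, j - 1), min(n, j + 2)):
                J[i][j] = diff[i]
    return J
```

A naive column-by-column finite-difference scheme would need one call per input dimension; here the number of perturbed calls depends on the number of colors rather than on the dimension.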

JMLR Journal 2016 Journal Article

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels

Haim Avron
Vikas Sindhwani
Jiyan Yang
Michael W. Mahoney

We consider the problem of improving the efficiency of randomized Fourier feature maps to accelerate training and testing speed of kernel methods on large data sets. These approximate feature maps arise as Monte Carlo approximations to integral representations of shift-invariant kernel functions (e.g., Gaussian kernel). In this paper, we propose to use Quasi-Monte Carlo (QMC) approximations instead, where the relevant integrands are evaluated on a low-discrepancy sequence of points as opposed to random point sets as in the Monte Carlo approach. We derive a new discrepancy measure called box discrepancy based on theoretical characterizations of the integration error with respect to a given sequence. We then propose to learn QMC sequences adapted to our setting based on explicit box discrepancy minimization. Our theoretical analyses are complemented with empirical results that demonstrate the effectiveness of classical and adaptive QMC techniques for this problem.
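The following sketch illustrates the classical (non-adaptive) QMC variant described above under simple assumptions: a Halton sequence is pushed through the inverse Gaussian CDF to obtain frequencies, which then define cosine and sine features for a Gaussian kernel with bandwidth `sigma`. The helper names are hypothetical, and the learned sequences based on box discrepancy minimization are not shown.

```python
import math
from statistics import NormalDist

PRIMES = [2, 3, 5, 7, 11, 13]  # one Halton base per input dimension


def halton(index, base):
    """Radical inverse of a positive integer index in the given base."""
    result, frac = 0.0, 1.0 / base
    while index > 0:
        result += (index % base) * frac
        index //= base
        frac /= base
    return result


def qmc_frequencies(num_features, dim):
    """Map Halton points to Gaussian frequencies via the inverse CDF."""
    if dim > len(PRIMES):
        raise ValueError("this sketch supports at most %d dimensions" % len(PRIMES))
    inv_cdf = NormalDist().inv_cdf
    # Indices start at 1 so that no coordinate is exactly 0.
    return [
        [inv_cdf(halton(i, PRIMES[k])) for k in range(dim)]
        for i in range(1, num_features + 1)
    ]


def feature_map(x, freqs, sigma=1.0):
    """Cosine and sine features whose inner product approximates the kernel."""
    scale = 1.0 / math.sqrt(len(freqs))
    feats = []
    for w in freqs:
        proj = sum(wi * xi for wi, xi in zip(w, x)) / sigma
        feats.append(scale * math.cos(proj))
        feats.append(scale * math.sin(proj))
    return feats


def approx_kernel(x, y, freqs, sigma=1.0):
    fx, fy = feature_map(x, freqs, sigma), feature_map(y, freqs, sigma)
    return sum(a * b for a, b in zip(fx, fy))


def gaussian_kernel(x, y, sigma=1.0):
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))
```

Swapping `qmc_frequencies` for independent Gaussian draws gives the standard Monte Carlo feature map, which makes the two easy to compare on the same pair of points.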

ICML Conference 2016 Conference Paper

Recycling Randomness with Structure for Sublinear time Kernel Expansions

Krzysztof Choromanski
Vikas Sindhwani

We propose a scheme for recycling Gaussian random vectors into structured matrices to approximate various kernel functions in sublinear time via random embeddings. Our framework includes the Fastfood construction of Le et al. (2013) as a special case, but also extends to Circulant, Toeplitz and Hankel matrices, and the broader family of structured matrices that are characterized by the concept of low-displacement rank. We introduce notions of coherence and graph-theoretic structural constants that control the approximation quality, and prove unbiasedness and low-variance properties of random feature maps that arise within our framework. For the case of low-displacement matrices, we show how the degree of structure and randomness can be controlled to reduce statistical variance at the cost of increased computation and storage requirements. Empirical results strongly support our theory and justify the use of a broader family of structured matrices for scaling up kernel methods using random features.

NeurIPS Conference 2015 Conference Paper

Structured Transforms for Small-Footprint Deep Learning

Vikas Sindhwani
Tara Sainath
Sanjiv Kumar

We consider the task of building compact deep learning pipelines suitable for deployment on storage and power constrained mobile devices. We propose a unified framework to learn a broad family of structured parameter matrices that are characterized by the notion of low displacement rank. Our structured transforms admit fast function and gradient evaluation, and span a rich range of parameter sharing configurations whose statistical modeling capacity can be explicitly tuned along a continuum from structured to unstructured. Experimental results show that these transforms can significantly accelerate inference and forward/backward passes during training, and offer superior accuracy-compactness-speed tradeoffs in comparison to a number of existing techniques. In keyword spotting applications in mobile speech recognition, our methods are much more effective than standard linear low-rank bottleneck layers and nearly retain the performance of state of the art models, while providing more than 3.5-fold compression.

ICML Conference 2014 Conference Paper

Quasi-Monte Carlo Feature Maps for Shift-Invariant Kernels

Jiyan Yang
Vikas Sindhwani
Haim Avron
Michael W. Mahoney

We consider the problem of improving the efficiency of randomized Fourier feature maps to accelerate training and testing speed of kernel methods on large datasets. These approximate feature maps arise as Monte Carlo approximations to integral representations of shift-invariant kernel functions (e.g., Gaussian kernel). In this paper, we propose to use Quasi-Monte Carlo (QMC) approximations instead, where the relevant integrands are evaluated on a low-discrepancy sequence of points as opposed to random point sets as in the Monte Carlo approach. We derive a new discrepancy measure called box discrepancy based on theoretical characterizations of the integration error with respect to a given sequence. We then propose to learn QMC sequences adapted to our setting based on explicit box discrepancy minimization. Our theoretical analyses are complemented with empirical results that demonstrate the effectiveness of classical and adaptive QMC techniques for this problem.

ICML Conference 2013 Conference Paper

Fast Conical Hull Algorithms for Near-separable Non-negative Matrix Factorization

Abhishek Kumar 0001
Vikas Sindhwani
Prabhanjan Kambadur

The separability assumption (Arora et al., 2012; Donoho & Stodden, 2003) turns non-negative matrix factorization (NMF) into a tractable problem. Recently, a new class of provably-correct NMF algorithms has emerged under this assumption. In this paper, we reformulate the separable NMF problem as that of finding the extreme rays of the conical hull of a finite set of vectors. From this geometric perspective, we derive new separable NMF algorithms that are highly scalable and empirically noise robust, and have several favorable properties in relation to existing methods. A parallel implementation of our algorithm scales excellently on shared and distributed-memory machines.

UAI Conference 2013 Conference Paper

Scalable Matrix-valued Kernel Learning for High-dimensional Nonlinear Multivariate Regression and Granger Causality

Vikas Sindhwani
Ha Quang Minh
Aurélie C. Lozano

We propose a general matrix-valued multiple kernel learning framework for high-dimensional nonlinear multivariate regression problems. This framework allows a broad class of mixed norm regularizers, including those that induce sparsity, to be imposed on a dictionary of vector-valued Reproducing Kernel Hilbert Spaces. We develop a highly scalable and eigendecomposition-free algorithm that orchestrates two inexact solvers for simultaneously learning both the input and output components of separable matrix-valued kernels. As a key application enabled by our framework, we show how high-dimensional causal inference tasks can be naturally cast as sparse function estimation problems, leading to novel nonlinear extensions of a class of Graphical Granger Causality techniques. Our algorithmic developments and extensive empirical studies are complemented by theoretical analyses in terms of Rademacher generalization bounds.

NeurIPS Conference 2013 Conference Paper

Sketching Structured Matrices for Faster Nonlinear Regression

Haim Avron
Vikas Sindhwani
David Woodruff

Motivated by the desire to extend fast randomized techniques to nonlinear $l_p$ regression, we consider a class of structured regression problems. These problems involve Vandermonde matrices which arise naturally in various statistical modeling settings, including classical polynomial fitting problems and recently developed randomized techniques for scalable kernel methods. We show that this structure can be exploited to further accelerate the solution of the regression problem, achieving running times that are faster than "input sparsity". We present empirical results confirming both the practical value of our modeling framework, as well as speedup benefits of randomized regression.

ICML Conference 2012 Conference Paper

Efficient and Practical Stochastic Subgradient Descent for Nuclear Norm Regularization

Haim Avron
Satyen Kale
Shiva Prasad Kasiviswanathan
Vikas Sindhwani

IJCAI Conference 2011 Conference Paper

Concept Labeling: Building Text Classifiers with Minimal Supervision

Vijil Chenthamarakshan
Prem Melville
Vikas Sindhwani
Richard D. Lawrence

The rapid construction of supervised text classification models is becoming a pervasive need across many modern applications. To reduce human-labeling bottlenecks, many new statistical paradigms (e.g., active, semi-supervised, transfer and multi-task learning) have been vigorously pursued in recent literature with varying degrees of empirical success. Concurrently, the emergence of Web 2.0 platforms in the last decade has enabled a world-wide, collaborative human effort to construct a massive ontology of concepts with very rich, detailed and accurate descriptions. In this paper we propose a new framework to extract supervisory information from such ontologies and complement it with a shift in human effort from direct labeling of examples in the domain of interest to the much more efficient identification of concept-class associations. Through empirical studies on text categorization problems using the Wikipedia ontology, we show that this shift allows very high-quality models to be immediately induced at virtually no cost.

NeurIPS Conference 2011 Conference Paper

Non-parametric Group Orthogonal Matching Pursuit for Sparse Learning with Multiple Kernels

Vikas Sindhwani
Aurelie Lozano

We consider regularized risk minimization in a large dictionary of Reproducing Kernel Hilbert Spaces (RKHSs) over which the target function has a sparse representation. This setting, commonly referred to as Sparse Multiple Kernel Learning (MKL), may be viewed as the non-parametric extension of group sparsity in linear models. While the two dominant algorithmic strands of sparse learning, namely convex relaxations using l1 norm (e.g., Lasso) and greedy methods (e.g., OMP), have both been rigorously extended for group sparsity, the sparse MKL literature has so far mainly adopted the former with mild empirical success. In this paper, we close this gap by proposing a Group-OMP based framework for sparse multiple kernel learning. Unlike l1-MKL, our approach decouples the sparsity regularizer (via a direct l0 constraint) from the smoothness regularizer (via RKHS norms) which leads to better empirical performance as well as a simpler optimization procedure that only requires a black-box single-kernel solver. The algorithmic development and empirical studies are complemented by theoretical analyses in terms of Rademacher generalization bounds and sparse recovery conditions analogous to those for OMP [27] and Group-OMP [16].

ICML Conference 2011 Conference Paper

Vector-valued Manifold Regularization

Ha Quang Minh
Vikas Sindhwani

NeurIPS Conference 2010 Conference Paper

Block Variable Selection in Multivariate Regression and High-dimensional Causal Inference

Vikas Sindhwani
Aurelie Lozano

We consider multivariate regression problems involving high-dimensional predictor and response spaces. To efficiently address such problems, we propose a variable selection method, Multivariate Group Orthogonal Matching Pursuit, which extends the standard Orthogonal Matching Pursuit technique to account for arbitrary sparsity patterns induced by domain-specific groupings over both input and output variables, while also taking advantage of the correlation that may exist between the multiple outputs. We illustrate the utility of this framework for inferring causal relationships over a collection of high-dimensional time series variables. When applied to time-evolving social media content, our models yield a new family of causality-based influence measures that may be seen as an alternative to PageRank. Theoretical guarantees, extensive simulations and empirical studies confirm the generality and value of our framework.

ICML Conference 2009 Conference Paper

Uncertainty sampling and transductive experimental design for active dual supervision

Vikas Sindhwani
Prem Melville
Richard D. Lawrence

ICML Conference 2008 Conference Paper

An RKHS for multi-view learning and manifold co-regularization

Vikas Sindhwani
David S. Rosenberg

Inspired by co-training, many multi-view semi-supervised kernel methods implement the following idea: find a function in each of multiple Reproducing Kernel Hilbert Spaces (RKHSs) such that (a) the chosen functions make similar predictions on unlabeled examples, and (b) the average prediction given by the chosen functions performs well on labeled examples. In this paper, we construct a single RKHS with a data-dependent "co-regularization" norm that reduces these approaches to standard supervised learning. The reproducing kernel for this RKHS can be explicitly derived and plugged into any kernel method, greatly extending the theoretical and algorithmic scope of co-regularization. In particular, with this development, the Rademacher complexity bound for co-regularization given in (Rosenberg & Bartlett, 2007) follows easily from well-known results. Furthermore, more refined bounds given by localized Rademacher complexity can also be easily applied. We propose a co-regularization based algorithmic alternative to manifold regularization (Belkin et al., 2006; Sindhwani et al., 2005a) that leads to major empirical improvements on semi-supervised tasks. Unlike the recently proposed transductive approach of (Yu et al., 2008), our RKHS formulation is truly semi-supervised and naturally extends to unseen test data.

JMLR Journal 2008 Journal Article

Optimization Techniques for Semi-Supervised Support Vector Machines

Olivier Chapelle
Vikas Sindhwani
Sathiya S. Keerthi

Due to its wide applicability, the problem of semi-supervised classification is attracting increasing attention in machine learning. Semi-Supervised Support Vector Machines (S3VMs) are based on applying the margin maximization principle to both labeled and unlabeled examples. Unlike SVMs, their formulation leads to a non-convex optimization problem. A suite of algorithms has recently been proposed for solving S3VMs. This paper reviews key ideas in this literature. The performance and behavior of various S3VM algorithms is studied together, under a common experimental setting.

NeurIPS Conference 2008 Conference Paper

Regularized Co-Clustering with Dual Supervision

Vikas Sindhwani
Jianying Hu
Aleksandra Mojsilovic

By attempting to simultaneously partition both the rows (examples) and columns (features) of a data matrix, Co-clustering algorithms often demonstrate surprisingly impressive performance improvements over traditional one-sided (row) clustering techniques. A good clustering of features may be seen as a combinatorial transformation of the data matrix, effectively enforcing a form of regularization that may lead to a better clustering of examples (and vice-versa). In many applications, partial supervision in the form of a few row labels as well as column labels may be available to potentially assist co-clustering. In this paper, we develop two novel semi-supervised multi-class classification algorithms motivated respectively by spectral bipartite graph partitioning and matrix approximation (e.g., non-negative matrix factorization) formulations for co-clustering. These algorithms (i) support dual supervision in the form of labels for both examples and/or features, (ii) provide principled predictive capability on out-of-sample test data, and (iii) arise naturally from the classical Representer theorem applied to regularization problems posed on a collection of Reproducing Kernel Hilbert Spaces. Empirical results demonstrate the effectiveness and utility of our algorithms.

IJCAI Conference 2007 Conference Paper

Semi-supervised Gaussian Process Classifiers

Vikas Sindhwani
Wei Chu
Sathiya Keerthi

In this paper, we propose a graph-based construction of semi-supervised Gaussian process classifiers. Our method is based on recently proposed techniques for incorporating the geometric properties of unlabeled data within globally defined kernel functions. The full machinery for standard supervised Gaussian process inference is brought to bear on the problem of learning from labeled and unlabeled data. This approach provides a natural probabilistic extension to unseen test examples. We employ Expectation Propagation procedures for evidence-based model selection. In the presence of few labeled examples, this approach is found to significantly outperform cross-validation techniques. We present empirical results demonstrating the strengths of our approach.

NeurIPS Conference 2006 Conference Paper

An Efficient Method for Gradient-Based Adaptation of Hyperparameters in SVM Models

S. Keerthi
Vikas Sindhwani
Olivier Chapelle

We consider the task of tuning hyperparameters in SVM models based on minimizing a smooth performance validation function, e.g., smoothed k-fold cross-validation error, using non-linear optimization techniques. The key computation in this approach is that of the gradient of the validation function with respect to hyperparameters. We show that for large-scale problems involving a wide choice of kernel-based models and validation functions, this computation can be very efficiently done; often within just a fraction of the training time. Empirical results show that a near-optimal set of hyperparameters can be identified by our approach with very few training rounds and gradient computations.

NeurIPS Conference 2006 Conference Paper

Branch and Bound for Semi-Supervised Support Vector Machines

Olivier Chapelle
Vikas Sindhwani
S. Keerthi

Semi-supervised SVMs (S3VM) attempt to learn low-density separators by maximizing the margin over labeled and unlabeled examples. The associated optimization problem is non-convex. To examine the full potential of S3VMs modulo local minima problems in current implementations, we apply branch and bound techniques for obtaining exact, globally optimal solutions. Empirical evidence suggests that the globally optimal solution can return excellent generalization performance in situations where other implementations fail completely. While our current implementation is only applicable to small datasets, we discuss variants that can potentially lead to practically useful algorithms.

ICML Conference 2006 Conference Paper

Deterministic annealing for semi-supervised kernel machines

Vikas Sindhwani
S. Sathiya Keerthi
Olivier Chapelle

An intuitive approach to utilizing unlabeled data in kernel-based classification algorithms is to simply treat unknown labels as additional optimization variables. For margin-based loss functions, one can view this approach as attempting to learn low-density separators. However, this is a hard optimization problem to solve in typical semi-supervised settings where unlabeled data is abundant. The popular Transductive SVM algorithm is a label-switching-retraining procedure that is known to be susceptible to local minima. In this paper, we present a global optimization framework for semi-supervised Kernel machines where an easier problem is parametrically deformed to the original hard problem and minimizers are smoothly tracked. Our approach is motivated from deterministic annealing techniques and involves a sequence of convex optimization problems that are exactly and efficiently solved. We present empirical results on several synthetic and real world datasets that demonstrate the effectiveness of our approach.

JMLR Journal 2006 Journal Article

Manifold Regularization: A Geometric Framework for Learning from Labeled and Unlabeled Examples

Mikhail Belkin
Partha Niyogi
Vikas Sindhwani

We propose a family of learning algorithms based on a new form of regularization that allows us to exploit the geometry of the marginal distribution. We focus on a semi-supervised framework that incorporates labeled and unlabeled data in a general-purpose learner. Some transductive graph learning algorithms and standard methods including support vector machines and regularized least squares can be obtained as special cases. We use properties of reproducing kernel Hilbert spaces to prove new Representer theorems that provide theoretical basis for the algorithms. As a result (in contrast to purely graph-based approaches) we obtain a natural out-of-sample extension to novel examples and so are able to handle both transductive and truly semi-supervised settings. We present experimental evidence suggesting that our semi-supervised algorithms are able to use unlabeled data effectively. Finally we have a brief discussion of unsupervised and fully supervised learning within our general framework.

NeurIPS Conference 2006 Conference Paper

Relational Learning with Gaussian Processes

Wei Chu
Vikas Sindhwani
Zoubin Ghahramani
S. Keerthi

Correlation between instances is often modelled via a kernel function using input attributes of the instances. Relational knowledge can further reveal additional pairwise correlations between variables of interest. In this paper, we develop a class of models which incorporates both reciprocal relational information and input attributes using Gaussian process techniques. This approach provides a novel non-parametric Bayesian framework with a data-dependent covariance function for supervised learning tasks. We also apply this framework to semi-supervised learning. Experimental results on several real world data sets verify the usefulness of this algorithm.

ICML Conference 2005 Conference Paper

Beyond the point cloud: from transductive to semi-supervised learning

Vikas Sindhwani
Partha Niyogi
Mikhail Belkin