Author name cluster

Satwik Kottur

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers

2 author rows

TMLR Journal 2023 Journal Article

Robustness through Data Augmentation Loss Consistency

Tianjian Huang
Shaunak Ashish Halbe
Chinnadhurai Sankar
Pooyan Amini
Satwik Kottur
Alborz Geramifard
Meisam Razaviyayn
Ahmad Beirami

While deep learning through empirical risk minimization (ERM) has succeeded at achieving human-level performance at a variety of complex tasks, ERM is not robust to distribution shifts or adversarial attacks. Synthetic data augmentation followed by empirical risk minimization (DA-ERM) is a simple and widely used solution to improve robustness in ERM. In addition, consistency regularization can be applied to further improve the robustness of the model by forcing the representation of the original sample and the augmented one to be similar. However, existing consistency regularization methods are not applicable to covariant data augmentation, where the label in the augmented sample is dependent on the augmentation function. For example, dialog state covaries with named entity when we augment data with a new named entity. In this paper, we propose data augmented loss invariant regularization (DAIR), a simple form of consistency regularization that is applied directly at the loss level rather than intermediate features, making it widely applicable to both invariant and covariant data augmentation regardless of network architecture, problem setup, and task. We apply DAIR to real-world learning problems involving covariant data augmentation: robust neural task-oriented dialog state tracking and robust visual question answering. We also apply DAIR to tasks involving invariant data augmentation: robust regression, robust classification against adversarial attacks, and robust ImageNet classification under distribution shift. Our experiments show that DAIR consistently outperforms ERM and DA-ERM with little marginal computational cost and sets new state-of-the-art results in several benchmarks involving covariant data augmentation. Our code of all experiments are available at: https://github.com/optimization-for-data-driven-science/DAIR.

PDF Details

ICML Conference 2017 Conference Paper

Canopy Fast Sampling with Cover Trees

Manzil Zaheer
Satwik Kottur
Amr Ahmed 0001
José M. F. Moura
Alexander J. Smola

Hierarchical Bayesian models often capture distributions over a very large number of distinct atoms. The need for these models arises when organizing huge amount of unsupervised data, for instance, features extracted using deep convnets that can be exploited to organize abundant unlabeled images. Inference for hierarchical Bayesian models in such cases can be rather nontrivial, leading to approximate approaches. In this work, we propose Canopy, a sampler based on Cover Trees that is exact, has guaranteed runtime logarithmic in the number of atoms, and is provably polynomial in the inherent dimensionality of the underlying parameter space. In other words, the algorithm is as fast as search over a hierarchical data structure. We provide theory for Canopy and demonstrate its effectiveness on both synthetic and real datasets, consisting of over 100 million images.

Details

NeurIPS Conference 2017 Conference Paper

Deep Sets

Manzil Zaheer
Satwik Kottur
Siamak Ravanbakhsh
Barnabas Poczos
Russ Salakhutdinov
Alexander Smola

We study the problem of designing models for machine learning tasks defined on sets. In contrast to the traditional approach of operating on fixed dimensional vectors, we consider objective functions defined on sets and are invariant to permutations. Such problems are widespread, ranging from the estimation of population statistics, to anomaly detection in piezometer data of embankment dams, to cosmology. Our main theorem characterizes the permutation invariant objective functions and provides a family of functions to which any permutation invariant objective function must belong. This family of functions has a special structure which enables us to design a deep network architecture that can operate on sets and which can be deployed on a variety of scenarios including both unsupervised and supervised learning tasks. We demonstrate the applicability of our method on population statistic estimation, point cloud classification, set expansion, and outlier detection.

PDF Details

IJCAI Conference 2017 Conference Paper

Exploring Personalized Neural Conversational Models

Satwik Kottur
Xiaoyu Wang
Vitor Carvalho

Modeling dialog systems is currently one of the most active problems in Natural Language Processing. Recent advancement in Deep Learning has sparked an interest in the use of neural networks in modeling language, particularly for personalized conversational agents that can retain contextual information during dialog exchanges. This work carefully explores and compares several of the recently proposed neural conversation models, and carries out a detailed evaluation on the multiple factors that can significantly affect predictive performance, such as pretraining, embedding training, data cleaning, diversity reranking, evaluation setting, etc. Based on the tradeoffs of different models, we propose a new generative dialogue model conditioned on speakers as well as context history that outperforms all previous models on both retrieval and generative metrics. Our findings indicate that pretraining speaker embeddings on larger datasets, as well as bootstrapping word and speaker embeddings, can significantly improve performance (up to 3 points in perplexity), and that promoting diversity in using Mutual Information based techniques has a very strong effect in ranking metrics.

PDF Details