Author name cluster

Divam Gupta

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers

2 author rows

NeurIPS Conference 2024 Conference Paper

Codec Avatar Studio: Paired Human Captures for Complete, Driveable, and Generalizable Avatars

Julieta Martinez
Emily Kim
Javier Romero
Timur Bagautdinov
Shunsuke Saito
Shoou-I Yu
Stuart Anderson
Michael Zollhöfer

To build photorealistic avatars that users can embody, human modelling must be complete (cover the full body), driveable (able to reproduce the current motion and appearance from the user), and generalizable (i.e., easily adaptable to novel identities). Towards these goals, paired captures, that is, captures of the same subject obtained from systems of diverse quality and availability, are crucial. However, paired captures are rarely available to researchers outside of dedicated industrial labs: Codec Avatar Studio is our proposal to close this gap. Towards generalization and driveability, we introduce a dataset of 256 subjects captured in two modalities: high resolution multi-view scans of their heads, and video from the internal cameras of a headset. Towards completeness, we introduce a dataset of 4 subjects captured in eight modalities: high quality relightable multi-view captures of heads and hands, full body multi-view captures with minimal and regular clothes, and corresponding head, hands and body phone captures. Together with our data, we also provide code and pre-trained models for different state-of-the-art human generation models. Our datasets and code are available at https://github.com/facebookresearch/ava-256 and https://github.com/facebookresearch/goliath.


ICLR Conference 2020 Conference Paper

Unsupervised Clustering using Pseudo-semi-supervised Learning

Divam Gupta
Ramachandran Ramjee
Nipun Kwatra
Muthian Sivathanu

In this paper, we propose a framework that leverages semi-supervised models to improve unsupervised clustering performance. To leverage semi-supervised models, we first need to automatically generate labels, called pseudo-labels. We find that prior approaches for generating pseudo-labels hurt clustering performance because of their low accuracy. Instead, we use an ensemble of deep networks to construct a similarity graph, from which we extract high accuracy pseudo-labels. The approach of finding high quality pseudo-labels using ensembles and training the semi-supervised model is iterated, yielding continued improvement. We show that our approach outperforms state of the art clustering results for multiple image and text datasets. For example, we achieve 54.6% accuracy for CIFAR-10 and 43.9% for 20news, outperforming state of the art by 8-12% in absolute terms.
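
The pseudo-labeling idea in this abstract can be illustrated with a minimal sketch. It assumes, for illustration only, that two samples are linked in the similarity graph when every model in the ensemble assigns them to the same cluster; the abstract does not give the paper's actual graph construction or extraction rule. The function name `agreement_pseudo_labels` and the `min_size` parameter are hypothetical.

```python
from collections import defaultdict


def agreement_pseudo_labels(ensemble_assignments, min_size=2):
    """Assign pseudo-labels to samples on which all ensemble members agree.

    ensemble_assignments: one list per model, giving a cluster id per sample.
    Assumption (not from the abstract): two samples are "similar" only if
    every model puts them in the same cluster. That relation is transitive,
    so each connected component of the graph is simply the set of samples
    sharing the same tuple of per-model cluster ids.
    """
    num_samples = len(ensemble_assignments[0])
    groups = defaultdict(list)
    for i in range(num_samples):
        signature = tuple(model[i] for model in ensemble_assignments)
        groups[signature].append(i)

    # Samples in groups smaller than min_size stay unlabeled (None).
    labels = [None] * num_samples
    next_label = 0
    for members in groups.values():
        if len(members) >= min_size:
            for i in members:
                labels[i] = next_label
            next_label += 1
    return labels


# Three hypothetical models clustering five samples.
model_a = [0, 0, 1, 1, 2]
model_b = [5, 5, 7, 7, 7]
model_c = [1, 1, 0, 0, 1]
labels = agreement_pseudo_labels([model_a, model_b, model_c])
```

In this toy run, samples 0 and 1 share one pseudo-label and samples 2 and 3 share another, while sample 4 stays unlabeled because the models disagree about its neighbors. The labeled subset would then feed a semi-supervised model, and the process could be repeated as the abstract describes.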


AAAI Conference 2019 Conference Paper

GIRNet: Interleaved Multi-Task Recurrent State Sequence Models

Divam Gupta
Tanmoy Chakraborty
Soumen Chakrabarti

In several natural language tasks, labeled sequences are available in separate domains (say, languages), but the goal is to label sequences with mixed domain (such as code-switched text). Or, we may have available models for labeling whole passages (say, with sentiments), which we would like to exploit toward better position-specific label inference (say, target-dependent sentiment annotation). A key characteristic shared across such tasks is that different positions in a primary instance can benefit from different ‘experts’ trained from auxiliary data, but labeled primary instances are scarce, and labeling the best expert for each position entails unacceptable cognitive burden. We propose GIRNet, a unified position-sensitive multi-task recurrent neural network (RNN) architecture for such applications. Auxiliary and primary tasks need not share training instances. Auxiliary RNNs are trained over auxiliary instances. A primary instance is also submitted to each auxiliary RNN, but their state sequences are gated and merged into a novel composite state sequence tailored to the primary inference task. Our approach is in sharp contrast to recent multi-task networks like the cross-stitch and sluice networks, which do not control state transfer at such fine granularity. We demonstrate the superiority of GIRNet using three applications: sentiment classification of code-switched passages, part-of-speech tagging of code-switched text, and target position-sensitive annotation of sentiment in monolingual passages. In all cases, we establish new state-of-the-art performance beyond recent competitive baselines.
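
The position-wise gating described in this abstract can be sketched as a weighted merge of auxiliary state sequences. This is an illustration under stated assumptions, not GIRNet's actual formulation: the gate is assumed to be a softmax over per-position scores, and the names `merge_states` and `gate_scores` are hypothetical. In a real model the scores would be produced by a learned network.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def merge_states(aux_states, gate_scores):
    """Merge auxiliary RNN state sequences position by position.

    aux_states[k][t] is the state vector of auxiliary RNN k at position t.
    gate_scores[t][k] is an unnormalized score for auxiliary RNN k at
    position t (assumed here to be given; in practice it would be learned).
    Returns one composite state vector per position.
    """
    num_aux = len(aux_states)
    seq_len = len(aux_states[0])
    dim = len(aux_states[0][0])
    merged = []
    for t in range(seq_len):
        weights = softmax(gate_scores[t])
        merged.append([
            sum(weights[k] * aux_states[k][t][d] for k in range(num_aux))
            for d in range(dim)
        ])
    return merged


# Two hypothetical auxiliary RNNs, a sequence of length 2, state size 2.
aux_states = [
    [[1.0, 0.0], [1.0, 0.0]],  # states from auxiliary RNN 0
    [[0.0, 1.0], [0.0, 1.0]],  # states from auxiliary RNN 1
]
gate_scores = [[0.0, 0.0], [100.0, -100.0]]
composite = merge_states(aux_states, gate_scores)
```

With equal scores at the first position, the composite state is an even blend of both auxiliary states; at the second position the gate strongly favors auxiliary RNN 0, so the composite state is close to that RNN's state. This is the sense in which each position can draw on a different expert.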
