Arrow Research search

Author name cluster

Kanika Narang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

3

NeurIPS Conference 2025 Conference Paper

VisualLens: Personalization through Task-Agnostic Visual History

  • Wang Bill Zhu
  • Deqing Fu
  • Kai Sun
  • Yi Lu
  • Zhaojiang Lin
  • Seungwhan Moon
  • Kanika Narang
  • Mustafa Canim

Existing recommendation systems either rely on user interaction logs, such as online shopping history for shopping recommendations, or focus on text signals. However, item-based histories are not always accessible and generalizable for multimodal recommendation. We hypothesize that a user's visual history --- comprising images from daily life --- can offer rich, task-agnostic insights into their interests and preferences, and thus be leveraged for effective personalization. To this end, we propose VisualLens, a novel framework that leverages multimodal large language models (MLLMs) to enable personalization using task-agnostic visual history. VisualLens extracts, filters, and refines a spectrum user profile from the visual history to support personalized recommendation. We created two new benchmarks, Google-Review-V and Yelp-V, with task-agnostic visual histories, and show that VisualLens improves over state-of-the-art item-based multimodal recommendations by 5-10\% on Hit@3, and outperforms GPT-4o by 2-5\%. Further analysis shows that VisualLens is robust across varying history lengths and excels at adapting to both longer histories and unseen content categories.

AAAI Conference 2024 Conference Paper

Measuring Self-Supervised Representation Quality for Downstream Classification Using Discriminative Features

  • Neha Kalibhat
  • Kanika Narang
  • Hamed Firooz
  • Maziar Sanjabi
  • Soheil Feizi

Self-supervised learning (SSL) has shown impressive results in downstream classification tasks. However, there is limited work in understanding their failure modes and interpreting their learned representations. In this paper, we study the representation space of state-of-the-art self-supervised models including SimCLR, SwaV, MoCo, BYOL, DINO, SimSiam, VICReg and Barlow Twins. Without the use of class label information, we discover discriminative features that correspond to unique physical attributes in images, present mostly in correctly-classified representations. Using these features, we can compress the representation space by up to$40% without significantly affecting linear classification performance. We then propose Self-Supervised Representation Quality Score (or Q-Score), an unsupervised score that can reliably predict if a given sample is likely to be mis-classified during linear evaluation, achieving AUPRC of 91.45 on ImageNet-100 and 78.78 on ImageNet-1K. Q-Score can also be used as a regularization term on pre-trained encoders to remedy low-quality representations. Fine-tuning with Q-Score regularization can boost the linear probing accuracy of SSL models by up to 5.8% on ImageNet-100 and 3.7% on ImageNet-1K compared to their baselines. Finally, using gradient heatmaps and Salient ImageNet masks, we define a metric to quantify the interpretability of each representation. We show that discriminative features are strongly correlated to core attributes and, enhancing these features through Q-score regularization makes SSL representations more interpretable.

TMLR Journal 2024 Journal Article

On the Equivalence of Graph Convolution and Mixup

  • Xiaotian Han
  • Hanqing Zeng
  • Yu Chen
  • Shaoliang Nie
  • Jingzhou Liu
  • Kanika Narang
  • Zahra Shakeri
  • Karthik Abinav Sankararaman

This paper investigates the relationship between graph convolution and Mixup techniques. Graph convolution in a graph neural network involves aggregating features from neighboring samples to learn representative features for a specific node or sample. On the other hand, Mixup is a data augmentation technique that generates new examples by averaging features and one-hot labels from multiple samples. One commonality between these techniques is their utilization of information from multiple samples to derive feature representation. This study aims to explore whether a connection exists between the two. Our investigation reveals that, under two mild modifications, graph convolution can be viewed as a specialized form of Mixup that is applied during both the training and testing phases. The two modifications are 1) \textit{Homophily Relabel} - assigning the target node's label to all its neighbors, and 2) \textit{Test-Time Mixup} - Mixup the feature during the test time. We establish this equivalence mathematically by demonstrating that graph convolution networks and simplified graph convolution can be expressed as a form of Mixup. We also empirically verify the equivalence by training an MLP using the two modifications to achieve comparable performance.