Author name cluster

Procheta Sen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

1 author row

TMLR Journal 2025 Journal Article

Dissecting Bias in LLMs: A Mechanistic Interpretability Perspective

Zubair Bashir
Bhavik Chandna
Procheta Sen

Large Language Models (LLMs) are known to exhibit social, demographic, and gender biases, often as a consequence of the data on which they are trained. In this work, we adopt a mechanistic interpretability approach to analyze how such biases are structurally represented within models such as GPT-2 and Llama2. Focusing on demographic and gender biases, we explore different metrics to identify the internal edges responsible for biased behavior. We then assess the stability, localization, and generalizability of these components across dataset and linguistic variations. Through systematic ablations, we demonstrate that bias-related computations are highly localized, often concentrated in a small subset of layers. Moreover, the identified components change across fine-tuning settings, including those unrelated to bias. Finally, we show that removing these components not only reduces biased outputs but also affects other NLP tasks, such as named entity recognition and linguistic acceptability judgment because of the sharing of important components with these tasks.


TMLR Journal 2025 Journal Article

Regularized Gradient Clipping Provably Trains Wide and Deep Neural Networks

Matteo Tucat
Anirbit Mukherjee
Mingfei Sun
Procheta Sen
Omar Rivasplata

We present and analyze a novel regularized form of the gradient clipping algorithm, proving that it converges to global minima of the loss surface of deep neural networks under the squared loss, provided that the layers are of sufficient width. The algorithm presented here, dubbed $\delta$-GClip, introduces a modification to gradient clipping that leads to a first-of-its-kind example of a step size scheduling for gradient descent that provably minimizes training losses of deep neural nets. We also present empirical evidence that our theoretically founded $\delta$-GClip algorithm is competitive with the state-of-the-art deep learning heuristics on various neural architectures, including modern transformer-based architectures. The modification we make to standard gradient clipping is designed to leverage the PL* condition, a variant of the Polyak-Łojasiewicz inequality which was recently proven to be true for sufficiently wide neural networks at any depth within a neighbourhood of the initialization.
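The abstract above does not state the update rule, so the following is only a minimal sketch of how a regularized clipping step size might look. It assumes standard norm-based clipping, where the step size is eta * min(1, gamma / ||g||), and adds a hypothetical floor `delta` on the clipping ratio as the "regularization". The names `clipped_step_size`, `gradient_descent`, `eta`, `gamma`, and `delta` are illustrative and are not taken from the paper.

```python
import math


def clipped_step_size(grad, eta=0.1, gamma=1.0, delta=None):
    """Step size for gradient descent with norm-based clipping.

    Standard clipping: eta * min(1, gamma / ||grad||).
    With delta set (an assumed form of regularization), the clipping
    ratio is floored at delta, so the step never drops below eta * delta.
    """
    norm = math.sqrt(sum(g * g for g in grad))
    if norm == 0.0:
        return eta  # zero gradient: the update is zero regardless
    ratio = gamma / norm
    if delta is not None:
        ratio = max(delta, ratio)
    return eta * min(1.0, ratio)


def gradient_descent(grad_fn, w, steps, **step_kwargs):
    """Plain gradient descent using the clipped step size above."""
    for _ in range(steps):
        g = grad_fn(w)
        h = clipped_step_size(g, **step_kwargs)
        w = [wi - h * gi for wi, gi in zip(w, g)]
    return w


# Toy usage: minimise 0.5 * ||w||^2, whose gradient is w itself.
w_final = gradient_descent(lambda w: list(w), [10.0, 0.0], steps=200,
                           eta=0.1, gamma=1.0, delta=0.5)
```

In this sketch the floor keeps the effective step size within [eta * delta, eta], which is the kind of bounded step-size schedule the abstract alludes to; whether this matches the paper's exact definition should be checked against the paper itself.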


AAAI Conference 2025 Short Paper

Unraveling the Influence of Training Data and Internal Structures in Large Language Models for Enhanced Explainability (Student Abstract)

Lingfang Li
Procheta Sen

Recent advances in deep learning have expanded the application of large language models (LLMs) across fields such as medicine, finance, and education. Understanding the mechanisms underlying these models is essential to mitigate issues like hallucinations and bias. This study provides deep learning practitioners with insights into how specific training data points and internal structures influence model behaviour. Using influence functions and mechanistic interpretability, we will analyze the impact of data on model predictions across various tasks. Preliminary findings indicate that semantic search techniques, such as FAISS, enable efficient identification of influential training points in GPT-2 small. Future work will extend these methods to additional tasks and more complex models, with a focus on further elucidating LLM structures to improve interpretability.


AILAW Journal 2024 Journal Article

A case study for automated attribute extraction from legal documents using large language models

Subinay Adhikary
Procheta Sen
Dwaipayan Roy
Kripabandhu Ghosh

The escalating number of pending cases is a growing concern worldwide. Recent advancements in digitization have opened up possibilities for leveraging artificial intelligence (AI) tools in the processing of legal documents. Adopting a structured representation for legal documents, as opposed to a mere bag-of-words flat text representation, can significantly enhance processing capabilities. With the aim of achieving this objective, we put forward a set of diverse attributes for criminal case proceedings. To enhance the effectiveness of automatically extracting these attributes from legal documents within a sequence labeling framework, we propose the utilization of a few-shot learning approach based on Large Language Models (LLMs). Moreover, we demonstrate the efficacy of the extracted attributes in downstream tasks, such as legal judgment prediction and legal statute prediction.


AAAI Conference 2023 System Paper

Task2KB: A Public Task-Oriented Knowledge Base

Procheta Sen
Xi Wang
Ruiqing Xu
Emine Yilmaz

Search engines and conversational assistants are commonly used to help users complete their everyday tasks such as booking travel, cooking, etc. While there are some existing datasets that can be used for this purpose, their coverage is limited to very few domains. In this paper, we propose a novel knowledge base, ‘Task2KB’, which is constructed using data crawled from WikiHow, an online knowledge resource offering instructional articles on a wide range of tasks. Task2KB encapsulates various types of task-related information and attributes, such as requirements, detailed step descriptions, and available methods to complete tasks. Due to its higher coverage compared to existing related knowledge graphs, Task2KB can be highly useful in the development of general-purpose task completion assistants.


AAAI Conference 2020 Conference Paper

Towards Socially Responsible AI: Cognitive Bias-Aware Multi-Objective Learning

Procheta Sen
Debasis Ganguly

Human society has a long history of suffering from cognitive biases leading to social prejudices and mass injustice. The prevalent existence of cognitive biases in large volumes of historical data can pose a threat of being manifested as unethical and seemingly inhumane predictions as outputs of AI systems trained on such data. To alleviate this problem, we propose a bias-aware multi-objective learning framework that, given a set of identity attributes (e.g. gender, ethnicity, etc.) and a subset of sensitive categories of the possible classes of prediction outputs, learns to reduce the frequency of predicting certain combinations of them, e.g. predicting stereotypes such as ‘most blacks use abusive language’, or ‘fear is a virtue of women’. Our experiments conducted on an emotion prediction task with balanced class priors show that a set of baseline bias-agnostic models exhibit cognitive biases with respect to gender, such as predicting that women are prone to be afraid whereas men are more prone to be angry. In contrast, our proposed bias-aware multi-objective learning methodology is shown to reduce such biases in the predicted emotions.
