Arrow Research search

Author name cluster

Sunil Gupta

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

32 papers
2 author rows

Possible papers

32

AAAI Conference 2026 Conference Paper

Probabilities Are All You Need: A Probability-Only Approach to Uncertainty Estimation in Large Language Models

  • Manh Nguyen
  • Sunil Gupta
  • Hung Le

Large Language Models (LLMs) exhibit strong performance across various natural language processing (NLP) tasks but remain vulnerable to hallucinations, generating factually incorrect or misleading outputs. Uncertainty estimation, often using predictive entropy estimation, is key to addressing this issue. However, existing methods often require multiple samples or extra computation to assess semantic entropy. This paper proposes an efficient, training-free uncertainty estimation method that approximates predictive entropy using the responses' top-K probabilities. Moreover, we employ an adaptive mechanism to determine K to enhance flexibility and filter out low-confidence probabilities. Experimental results on three free-form question-answering datasets across several LLMs demonstrate that our method outperforms expensive state-of-the-art baselines, contributing to the broader goal of enhancing LLM trustworthiness.
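As a back-of-the-envelope illustration of the probability-only idea, the sketch below approximates predictive entropy from a response's top-K probabilities, choosing K adaptively by discarding probabilities under a low-confidence cutoff. The function name and the simple threshold rule are illustrative assumptions, not the paper's actual mechanism:

```python
import math

def top_k_entropy(probs, tau=0.01):
    """Approximate predictive entropy from top-K probabilities.

    K is set adaptively: probabilities below the (hypothetical)
    low-confidence cutoff `tau` are filtered out, and the survivors
    are renormalised before computing Shannon entropy.
    """
    kept = [p for p in probs if p >= tau]
    if not kept:
        return 0.0
    z = sum(kept)
    return -sum((p / z) * math.log(p / z) for p in kept)

# A peaked distribution (a confident answer) scores low entropy,
# a flat one (an uncertain answer) scores high entropy.
confident = top_k_entropy([0.9, 0.05, 0.03, 0.02])
uncertain = top_k_entropy([0.3, 0.3, 0.2, 0.2])
assert confident < uncertain
```

A single forward pass supplies the top-K probabilities, which is what makes this cheaper than sampling-based semantic-entropy estimates.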

IJCAI Conference 2025 Conference Paper

Beyond the Known: Decision Making with Counterfactual Reasoning Decision Transformer

  • Minh Hoang Nguyen
  • Linh Le Pham Van
  • Thommen George Karimpanal
  • Sunil Gupta
  • Hung Le

Decision Transformers (DT) play a crucial role in modern reinforcement learning, leveraging offline datasets to achieve impressive results across various domains. However, DT requires high-quality, comprehensive data to perform optimally. In real-world applications, the lack of training data and the scarcity of optimal behaviours make training on offline datasets challenging, as suboptimal data can hinder performance. To address this, we propose the Counterfactual Reasoning Decision Transformer (CRDT), a novel framework inspired by counterfactual reasoning. CRDT enhances DT’s ability to reason beyond known data by generating and utilizing counterfactual experiences, enabling improved decision-making in unseen scenarios. Experiments across Atari and D4RL benchmarks, including scenarios with limited data and altered dynamics, demonstrate that CRDT outperforms conventional DT approaches. Additionally, reasoning counterfactually allows the DT agent to obtain stitching abilities, combining suboptimal trajectories, without architectural modifications. These results highlight the potential of counterfactual reasoning to enhance reinforcement learning agents' performance and generalization capabilities.

AAMAS Conference 2025 Conference Paper

Navigating Social Dilemmas with LLM-based Agents via Consideration of Future Consequences

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh
  • Truyen Tran

Agents built on LLMs have shown versatile capabilities but face difficulties in being cooperative in social dilemma situations. When making decisions under the strain of selecting between long-term consequences and short-term benefits in commonly shared resources, LLM-based agents are vulnerable to the tragedy of the commons, i.e., individual greed leads to early depletion of the shared resource. We propose LLM agents that consider future consequences to aid them in navigating intertemporal social dilemmas. We introduce two approaches, prompting and intervention, to equip the agent with the ability to consider future consequences when making a decision, resulting in a new kind of agent: the CFC-Agent. Furthermore, we enable the CFC-Agent to act toward different levels of consideration for future consequences. Our experiments in different settings show that agents that consider future consequences exhibit sustainable behaviour and achieve high common rewards for the population.

IJCAI Conference 2025 Conference Paper

Navigating Social Dilemmas with LLM-based Agents via Consideration of Future Consequences

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh
  • Truyen Tran

Artificial agents with the aid of large language models (LLMs) are effective in various real-world scenarios but struggle to cooperate in social dilemmas. When making decisions under the strain of selecting between long-term consequences and short-term benefits in commonly shared resources, LLM-based agents often exploit the environment, leading to early depletion. Inspired by the concept of consideration of future consequences (CFC), which is well known in social psychology, we propose a framework that enables LLM-based agents to consider future consequences, resulting in a new kind of agent that we term the CFC-Agent. We enable the CFC-Agent to act toward different levels of consideration for future consequences. Our first set of experiments, where the LLM is directly asked to make decisions, shows that agents considering future consequences exhibit sustainable behaviour and achieve high common rewards for the population. Extensive experiments in complex environments show that the CFC-Agent can manage a sequence of calls to the LLM for reasoning and engage in communication to cooperate with others, better resolving the common dilemma. Finally, our analysis shows that considering future consequences not only affects the final decision but also improves the conversations between LLM-based agents toward a better resolution of social dilemmas.

NeurIPS Conference 2025 Conference Paper

Reproducing Kernel Banach Space Models for Neural Networks with Application to Rademacher Complexity Analysis

  • Alistair Shilton
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

This paper explores the use of Hermite-transform-based reproducing kernel Banach space methods to construct exact (un-approximated) models of feedforward neural networks of arbitrary width, depth and topology, including ResNet and Transformer networks, assuming only a feedforward topology, finite-energy activations and finite (spectral-) norm weights and biases. Using this model, two straightforward but surprisingly tight bounds on Rademacher complexity are derived, precisely: (1) a general bound that is width-independent and scales exponentially with depth; and (2) a width- and depth-independent bound for networks with appropriately constrained (below-threshold) weights and biases.

NeurIPS Conference 2024 Conference Paper

Active Set Ordering

  • Quoc Phong Nguyen
  • Sunil Gupta
  • Svetha Venkatesh
  • Bryan Kian Hsiang Low
  • Patrick Jaillet

In this paper, we formalize the active set ordering problem, which involves actively discovering a set of inputs based on their orderings determined by expensive evaluations of a black-box function. We then propose the mean prediction (MP) algorithm and theoretically analyze it in terms of the regret of predicted pairwise orderings between inputs. Notably, as a special case of this framework, we can cast Bayesian optimization as an active set ordering problem by recognizing that maximizers can be identified solely by comparison rather than by precisely estimating the function evaluations. As a result, we are able to construct the popular Gaussian process upper confidence bound (GP-UCB) algorithm through the lens of ordering, with several nuanced insights. We empirically validate the performance of our proposed solution using various synthetic functions and real-world datasets.
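The mean prediction idea can be illustrated with a toy version: order every pair of inputs by posterior mean and count pairwise orderings that disagree with the true function values. The error count below is only a stand-in for the paper's regret notion, and every name is illustrative:

```python
def mp_pairwise_errors(posterior_means, true_values):
    """Order each input pair by posterior mean (the MP rule) and count
    pairs whose predicted ordering disagrees with the ground truth."""
    n = len(posterior_means)
    return sum(
        (posterior_means[i] >= posterior_means[j])
        != (true_values[i] >= true_values[j])
        for i in range(n) for j in range(i + 1, n)
    )

# A posterior that ranks inputs correctly makes no pairwise mistakes,
# even though its scale is completely different from the truth.
assert mp_pairwise_errors([0.1, 0.9, 0.5], [10.0, 30.0, 20.0]) == 0
assert mp_pairwise_errors([0.9, 0.1], [1.0, 2.0]) == 1
```

The Bayesian-optimization connection is visible here: the maximizer is simply the input that wins every pairwise comparison.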

IJCAI Conference 2024 Conference Paper

Diversifying Training Pool Predictability for Zero-shot Coordination: A Theory of Mind Approach

  • Dung Nguyen
  • Hung Le
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh
  • Truyen Tran

The challenge in constructing artificial social agents is to enable adaptation to novel partner agents, a problem known as zero-shot coordination (ZSC). A promising approach is to train adaptive agents by interacting with a diverse pool of collaborators, assuming that the greater the diversity in other agents seen during training, the better the generalisation. In this paper, we explore an alternative procedure by considering the behavioural predictability of collaborators, i.e., whether their actions and intentions are predictable, and use it to select a diverse set of agents for the training pool. More specifically, we develop a pool of agents through self-play training, during which agents' behaviour evolves and exhibits diversity in levels of behavioural predictability (LoBP). We construct an observer to compute the level of behavioural predictability for each version of the collaborators. To do so, the observer is equipped with a theory of mind (ToM) capability to learn to infer the actions and intentions of others. We then use an episodic memory based on the LoBP metric to maintain agents with different levels of behavioural predictability in the pool. Since behaviours that emerge in later training phases are more complex and meaningful, the memory is updated with the latest versions of the training agents. Our extensive experiments demonstrate that LoBP-based diversity training leads to better ZSC than other diversity training methods.

IJCAI Conference 2024 Conference Paper

EMOTE: An Explainable Architecture for Modelling the Other through Empathy

  • Manisha Senadeera
  • Thommen Karimpanal George
  • Stephan Jacobs
  • Sunil Gupta
  • Santu Rana

Empathy allows us to assume others are like us and have goals analogous to our own. This can also at times be applied to multi-agent games, e.g., Agent 1's attraction to green balls is analogous to Agent 2's attraction to red balls. Drawing inspiration from empathy, we propose EMOTE, a simple and explainable inverse reinforcement learning (IRL) approach designed to model another agent's action-value function and, from it, infer a unique reward function. This is done by referencing the learning agent's own action-value function, removing the need to maintain independent action-value estimates for the modelled agents whilst simultaneously addressing the ill-posed nature of IRL by inferring a unique reward function. We experiment on minigrid environments, showing that EMOTE: (a) produces more consistent reward estimates relative to other IRL baselines; (b) is robust in scenarios with composite reward and action-value functions; and (c) produces human-interpretable states, helping to explain how the agent views other agents.

AAMAS Conference 2024 Conference Paper

Policy Learning for Off-Dynamics RL with Deficient Support

  • Linh Le Pham Van
  • Hung The Tran
  • Sunil Gupta

Reinforcement Learning (RL) can effectively learn complex policies. However, learning these policies often demands extensive trial-and-error interactions with the environment. In many real-world scenarios, this approach is not practical due to the high costs of data collection and safety concerns. As a result, a common strategy is to transfer a policy trained in a low-cost, rapid source simulator to a real-world target environment. However, this process poses challenges. Simulators, no matter how advanced, cannot perfectly replicate the intricacies of the real world, leading to dynamics discrepancies between the source and target environments. Past research posited that the source domain must encompass all possible target transitions, a condition we term full support. However, expecting full support is often unrealistic, especially in scenarios where significant dynamics discrepancies arise. In this paper, our emphasis shifts to addressing large dynamics mismatch adaptation. We move away from the stringent full support condition of earlier research, focusing instead on crafting an effective policy for the target domain. Our proposed approach is simple but effective. It is anchored in the central concepts of the skewing and extension of source support towards target support to mitigate support deficiencies. Through comprehensive testing on a varied set of benchmarks, our method's efficacy stands out, showcasing notable improvements over previous techniques.

AAAI Conference 2024 Conference Paper

Root Cause Explanation of Outliers under Noisy Mechanisms

  • Phuoc Nguyen
  • Truyen Tran
  • Sunil Gupta
  • Thin Nguyen
  • Svetha Venkatesh

Identifying the root causes of anomalies in causal processes is vital across disciplines. Once identified, one can isolate the root causes and implement the necessary measures to restore normal operation. Causal processes are often modelled as graphs, with entities as nodes and their paths/interconnections as edges. Existing work considers only the contribution of nodes in the generative process, and thus cannot attribute the outlier score to the edges of a mechanism if the anomaly occurs in the connections. In this paper, we consider both the individual edges and the nodes of each mechanism when identifying the root causes. We introduce a noisy functional causal model for this purpose. Then, we employ Bayesian learning and inference methods to infer the noises of the nodes and edges. We then represent the functional form of a target outlier leaf as a function of the node and edge noises. Finally, we propose an efficient gradient-based attribution method to compute the anomaly attribution scores, which scales linearly with the number of nodes and edges. Experiments on simulated datasets and two real-world scenario datasets show better anomaly attribution performance of the proposed method compared to the baselines. Our method scales to larger graphs with more nodes and edges.

AAAI Conference 2023 Conference Paper

On Instance-Dependent Bounds for Offline Reinforcement Learning with Linear Function Approximation

  • Thanh Nguyen-Tang
  • Ming Yin
  • Sunil Gupta
  • Svetha Venkatesh
  • Raman Arora

Sample-efficient offline reinforcement learning (RL) with linear function approximation has been studied extensively recently. Much of the prior work has yielded instance-independent rates that hold even for the worst-case realization of problem instances. This work seeks to understand instance-dependent bounds for offline RL with linear function approximation. We present an algorithm called Bootstrapped and Constrained Pessimistic Value Iteration (BCP-VI), which leverages data bootstrapping and constrained optimization on top of pessimism. We show that under a partial data coverage assumption, that of concentrability with respect to an optimal policy, the proposed algorithm yields a fast rate for offline RL when there is a positive gap in the optimal Q-value functions, even if the offline data were collected adaptively. Moreover, when the linear features of the optimal actions in the states reachable by an optimal policy span those reachable by the behavior policy and the optimal actions are unique, offline RL achieves absolute zero sub-optimality error when the number of episodes exceeds a (finite) instance-dependent threshold. To the best of our knowledge, these are the first results that give a fast rate bound on the sub-optimality and an absolute zero sub-optimality bound for offline RL with linear function approximation from adaptive data with partial coverage. We also provide instance-agnostic and instance-dependent information-theoretical lower bounds to complement our upper bounds.

NeurIPS Conference 2022 Conference Paper

Expected Improvement for Contextual Bandits

  • Hung Tran-The
  • Sunil Gupta
  • Santu Rana
  • Tuan Truong
  • Long Tran-Thanh
  • Svetha Venkatesh

The expected improvement (EI) is a popular technique to handle the tradeoff between exploration and exploitation under uncertainty. This technique has been widely used in Bayesian optimization, but it is not applicable to the contextual bandit problem, which is a generalization of both the standard bandit and Bayesian optimization. In this paper, we initiate the study of the EI technique for contextual bandits from both theoretical and practical perspectives. We propose two novel EI-based algorithms, one for the case where the reward function is assumed to be linear and the other for more general reward functions. With linear reward functions, we demonstrate that our algorithm achieves a near-optimal regret. Notably, our regret improves on that of LinTS \cite{agrawal13} by a factor of $\sqrt{d}$ while avoiding having to solve an NP-hard problem at each iteration as in LinUCB \cite{Abbasi11}. For more general reward functions, which are modeled by deep neural networks, we prove that our algorithm achieves a $\tilde{\mathcal O} (\tilde{d}\sqrt{T})$ regret, where $\tilde{d}$ is the effective dimension of a neural tangent kernel (NTK) matrix and $T$ is the number of iterations. Our experiments on various benchmark datasets show that both proposed algorithms work well and consistently outperform existing approaches, especially in high dimensions.
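For reference, the standard single-point EI quantity (under a Gaussian posterior, maximisation convention) has the closed form below; this is the textbook formula the paper builds on, not its contextual-bandit algorithm:

```python
import math

def expected_improvement(mu, sigma, best):
    """Closed-form EI of a candidate with posterior mean `mu` and
    standard deviation `sigma`, relative to the incumbent `best`."""
    if sigma <= 0.0:
        return max(mu - best, 0.0)      # no uncertainty: plain improvement
    z = (mu - best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (mu - best) * cdf + sigma * pdf

# Even a candidate whose mean only matches the incumbent has positive EI,
# because its uncertainty leaves room for improvement.
assert expected_improvement(0.0, 1.0, 0.0) > 0.0
```

In a bandit or BO loop, the arm or point with the largest EI is the one played next.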

NeurIPS Conference 2022 Conference Paper

Learning to Constrain Policy Optimization with Virtual Trust Region

  • Thai Hung Le
  • Thommen Karimpanal George
  • Majid Abdolshah
  • Dung Nguyen
  • Kien Do
  • Sunil Gupta
  • Svetha Venkatesh

We introduce a constrained optimization method for policy gradient reinforcement learning, which uses two trust regions to regulate each policy update. In addition to using the proximity of one single old policy as the first trust region as done by prior works, we propose forming a second trust region by constructing another virtual policy that represents a wide range of past policies. We then enforce the new policy to stay closer to the virtual policy, which is beneficial if the old policy performs poorly. We propose a mechanism to automatically build the virtual policy from a memory buffer of past policies, providing a new capability for dynamically selecting appropriate trust regions during the optimization process. Our proposed method, dubbed Memory-Constrained Policy Optimization (MCPO), is examined in diverse environments, including robotic locomotion control, navigation with sparse rewards and Atari games, consistently demonstrating competitive performance against recent on-policy constrained policy gradient methods.

TMLR Journal 2022 Journal Article

On Sample Complexity of Offline Reinforcement Learning with Deep ReLU Networks in Besov Spaces

  • Thanh Nguyen-Tang
  • Sunil Gupta
  • Hung Tran-The
  • Svetha Venkatesh

Offline reinforcement learning (RL) leverages previously collected data for policy optimization without any further active exploration. Despite the recent interest in this problem, its theoretical results in neural network function approximation settings remain elusive. In this paper, we study the statistical theory of offline RL with deep ReLU network function approximation. In particular, we establish the sample complexity of $n = \tilde{\mathcal{O}}( H^{4 + 4 \frac{d}{\alpha}} \kappa_{\mu}^{1 + \frac{d}{\alpha}} \epsilon^{-2 - 2\frac{d}{\alpha}} )$ for offline RL with deep ReLU networks, where $\kappa_{\mu}$ is a measure of distributional shift, $H = (1-\gamma)^{-1}$ is the effective horizon length, $d$ is the dimension of the state-action space, $\alpha$ is a (possibly fractional) smoothness parameter of the underlying Markov decision process (MDP), and $\epsilon$ is a user-specified error. Notably, our sample complexity holds under two novel considerations: the Besov dynamic closure and the correlated structure. While the Besov dynamic closure subsumes the dynamic conditions for offline RL in the prior works, the correlated structure renders the prior works of offline RL with general/neural network function approximation improper or inefficient in long (effective) horizon problems. To the best of our knowledge, this is the first theoretical characterization of the sample complexity of offline RL with deep neural network function approximation under the general Besov regularity condition that goes beyond the linearity regime in the traditional reproducing kernel Hilbert spaces and neural tangent kernels.

AAMAS Conference 2022 Conference Paper

Sympathy-based Reinforcement Learning Agents

  • Manisha Senadeera
  • Thommen George Karimpanal
  • Sunil Gupta
  • Santu Rana

As artificial agents become increasingly prevalent in our daily lives, it becomes imperative to equip them with an awareness of societal norms; specifically, the ability to account for and be considerate towards others they may cohabit with. In this work, we explore the ability for an agent trained through reinforcement learning to exhibit sympathetic behaviours towards another (independent) agent in the environment. We propose to achieve such behaviours by first inferring the reward function of the independent agent, through inverse reinforcement learning, and subsequently learning a policy based on a sympathetic reward function - a convex combination of the inferred rewards and the agent’s own rewards. The corresponding weighting is determined by a sympathy function which is computed based on the estimated return of the agent’s current action relative to that of all possible actions it could have taken. We evaluate our approach on adversarial as well as assistive environment settings, and demonstrate the ability of our sympathetic agent to perform well at its own goal, while simultaneously giving due consideration to another agent in its environment. We also empirically examine and report the sensitivity of our agent’s performance to the hyperparameters introduced in our proposed framework.
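A minimal sketch of the reward combination described above; the normalised sympathy function here is a hypothetical stand-in (the abstract only says the weight depends on the chosen action's estimated return relative to all alternatives, not its exact form):

```python
def sympathy_weight(q_values, action):
    """Hypothetical sympathy function: the weight grows with how good
    the chosen action's estimated return is relative to the best and
    worst alternatives (the paper's exact form may differ)."""
    lo, hi = min(q_values), max(q_values)
    return 0.0 if hi == lo else (q_values[action] - lo) / (hi - lo)

def sympathetic_reward(own_reward, inferred_reward, w):
    """Convex combination of the agent's own reward and the reward
    inferred for the other agent via IRL, weighted by w in [0, 1]."""
    return (1.0 - w) * own_reward + w * inferred_reward

w = sympathy_weight([1.0, 2.0, 4.0], action=2)   # best action: full weight
assert w == 1.0
assert sympathetic_reward(1.0, 0.0, w=0.25) == 0.75
```

The learned policy then optimises `sympathetic_reward` instead of the raw reward, which is what yields the considerate behaviour.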

AAAI Conference 2022 Conference Paper

TRF: Learning Kernels with Tuned Random Features

  • Alistair Shilton
  • Sunil Gupta
  • Santu Rana
  • Arun Kumar Venkatesh
  • Svetha Venkatesh

Random Fourier features (RFF) are a popular set of tools for constructing low-dimensional approximations of translation-invariant kernels, allowing kernel methods to be scaled to big data. Apart from their computational advantages, by working in the spectral domain random Fourier features expose the translation-invariant kernel as a density function that may, in principle, be manipulated directly to tune the kernel. In this paper we propose selecting the density function from a reproducing kernel Hilbert space to allow us to search the space of all translation-invariant kernels. Our approach, which we call tuned random features (TRF), achieves this by approximating the density function as the RKHS-norm-regularised least-squares best fit to an unknown “true” optimal density function, resulting in an RFF formulation where kernel selection is reduced to regularised risk minimisation with a novel regulariser. We derive bounds on the Rademacher complexity for our method, showing that our random features approximation converges to optimal kernel selection in the large N, D limit. Finally, we present experimental results for a variety of real-world learning problems, demonstrating the performance of our approach compared to comparable methods.
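For orientation, plain random Fourier features for the RBF kernel look like the sketch below; the point of TRF is that the sampled spectral density need not be the RBF one, whereas this sketch fixes it there:

```python
import numpy as np

def rff_features(X, n_features=2000, lengthscale=1.0, seed=0):
    """Map X (n, d) to random Fourier features whose inner products
    approximate the RBF kernel k(x, y) = exp(-||x - y||^2 / 2l^2).
    TRF would learn the spectral density; here it is fixed to the
    RBF kernel's Gaussian density."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / lengthscale, size=(X.shape[1], n_features))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    return np.sqrt(2.0 / n_features) * np.cos(X @ W + b)

X = np.random.default_rng(1).normal(size=(5, 3))
Z = rff_features(X)
K_approx = Z @ Z.T                   # feature inner products ...
K_exact = np.exp(-0.5 * ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1))
assert np.abs(K_approx - K_exact).max() < 0.15   # ... track the kernel
```

Downstream, any linear model on `Z` behaves approximately like a kernel machine with the chosen kernel.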

AAAI Conference 2021 Conference Paper

Distributional Reinforcement Learning via Moment Matching

  • Thanh Nguyen-Tang
  • Sunil Gupta
  • Svetha Venkatesh

We consider the problem of learning a set of probability distributions from the empirical Bellman dynamics in distributional reinforcement learning (RL), a class of state-of-the-art methods that estimate the distribution, as opposed to only the expectation, of the total return. We formulate a method that learns a finite set of statistics from each return distribution via neural networks, as in the distributional RL literature. Existing distributional RL methods however constrain the learned statistics to predefined functional forms of the return distribution which is both restrictive in representation and difficult in maintaining the predefined statistics. Instead, we learn unrestricted statistics, i.e., deterministic (pseudo-)samples, of the return distribution by leveraging a technique from hypothesis testing known as maximum mean discrepancy (MMD), which leads to a simpler objective amenable to backpropagation. Our method can be interpreted as implicitly matching all orders of moments between a return distribution and its Bellman target. We establish sufficient conditions for the contraction of the distributional Bellman operator and provide finite-sample analysis for the deterministic samples in distribution approximation. Experiments on the suite of Atari games show that our method outperforms the distributional RL baselines and sets a new record in the Atari games for non-distributed agents.
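The (biased) sample estimator of squared MMD under a Gaussian kernel, which the method uses to pull its deterministic pseudo-samples toward the Bellman target distribution, can be sketched for 1-D samples as:

```python
import numpy as np

def mmd2(x, y, bandwidth=1.0):
    """Biased estimator of squared MMD between 1-D sample sets x and y
    under a Gaussian kernel; small when the two sample sets match."""
    def k(a, b):
        return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2.0 * bandwidth**2))
    return k(x, x).mean() + k(y, y).mean() - 2.0 * k(x, y).mean()

z = np.linspace(-2.0, 2.0, 30)               # deterministic pseudo-samples
assert mmd2(z, z) < 1e-12                    # identical samples: (near) zero
assert mmd2(z, z + 1.5) > mmd2(z, z + 0.1)   # grows with the mismatch
```

In the paper's setting this quantity is differentiable in the pseudo-samples, so it can serve directly as a backpropagation loss.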

AAAI Conference 2021 Conference Paper

High Dimensional Level Set Estimation with Bayesian Neural Network

  • Huong Ha
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

Level Set Estimation (LSE) is an important problem with applications in various fields such as material design, biotechnology and machine operational testing. Existing techniques suffer from a scalability issue: they do not work well with high-dimensional inputs. This paper proposes novel methods to solve high-dimensional LSE problems using Bayesian Neural Networks. In particular, we consider two types of LSE problems: (1) the explicit LSE problem, where the threshold level is a fixed user-specified value, and (2) the implicit LSE problem, where the threshold level is defined as a percentage of the (unknown) maximum of the objective function. For each problem, we derive the corresponding information-theoretic acquisition function to sample data points so as to maximally increase the level set accuracy. Furthermore, we analyse the theoretical time complexity of our proposed acquisition functions, and suggest a practical methodology to efficiently tune the network hyper-parameters to achieve high model accuracy. Numerical experiments on both synthetic and real-world datasets show that our proposed method achieves better results than existing state-of-the-art approaches.
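The two problem variants reduce to two ways of fixing the threshold; a toy labelling rule makes the distinction concrete (the function and its arguments are illustrative, not the paper's method):

```python
def classify_level_set(values, threshold=None, fraction=None):
    """Label which points lie in the super-level set.  Explicit LSE
    uses a user-given `threshold`; implicit LSE derives it as a
    `fraction` of the maximum (known here, estimated in practice)."""
    if threshold is None:
        threshold = fraction * max(values)
    return [v >= threshold for v in values]

# Explicit: fixed threshold 2.5.
assert classify_level_set([1, 2, 3, 4], threshold=2.5) == [False, False, True, True]
# Implicit: threshold = 50% of the maximum (= 2.0).
assert classify_level_set([1, 2, 3, 4], fraction=0.5) == [False, True, True, True]
```

The acquisition functions in the paper choose where to evaluate next so that these labels become accurate with as few evaluations as possible.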

NeurIPS Conference 2021 Conference Paper

Kernel Functional Optimisation

  • Arun Kumar Anjanapura Venkatesh
  • Alistair Shilton
  • Santu Rana
  • Sunil Gupta
  • Svetha Venkatesh

Traditional methods for kernel selection rely on parametric kernel functions or a combination thereof and although the kernel hyperparameters are tuned, these methods often provide sub-optimal results due to the limitations induced by the parametric forms. In this paper, we propose a novel formulation for kernel selection using efficient Bayesian optimisation to find the best fitting non-parametric kernel. The kernel is expressed using a linear combination of functions sampled from a prior Gaussian Process (GP) defined by a hyperkernel. We also provide a mechanism to ensure the positive definiteness of the Gram matrix constructed using the resultant kernels. Our experimental results on GP regression and Support Vector Machine (SVM) classification tasks involving both synthetic functions and several real-world datasets show the superiority of our approach over the state-of-the-art.

AAAI Conference 2020 Conference Paper

Bayesian Optimization for Categorical and Category-Specific Continuous Inputs

  • Dang Nguyen
  • Sunil Gupta
  • Santu Rana
  • Alistair Shilton
  • Svetha Venkatesh

Many real-world functions are defined over both categorical and category-specific continuous variables and thus cannot be optimized by traditional Bayesian optimization (BO) methods. To optimize such functions, we propose a new method that formulates the problem as a multi-armed bandit problem, wherein each category corresponds to an arm with its reward distribution centered around the optimum of the objective function in the continuous variables. Our goal is to identify the best arm and the maximizer of the corresponding continuous function simultaneously. Our algorithm uses a Thompson sampling scheme that helps connect the multi-armed bandit and BO in a unified framework. We extend our method to batch BO to allow parallel optimization when multiple resources are available. We theoretically analyze our method for convergence and prove sub-linear regret bounds. We perform a variety of experiments: optimization of several benchmark functions, hyper-parameter tuning of a neural network, and automatic selection of the best machine learning model along with its optimal hyper-parameters (a.k.a. automated machine learning). Comparisons with other methods demonstrate the effectiveness of our proposed method.
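One Thompson-sampling step over the categorical arms might look like the sketch below, with each arm's belief summarised by a Gaussian (mean, std); this is a hedged illustration of the general scheme, not the paper's algorithm:

```python
import random

def thompson_pick(arm_posteriors, rng):
    """Sample a plausible mean reward per arm (category) from its
    Gaussian posterior and play the arm with the highest draw.  In the
    paper's setting each arm's reward is centred on the optimum of
    that category's continuous sub-problem."""
    draws = {arm: rng.gauss(mean, std)
             for arm, (mean, std) in arm_posteriors.items()}
    return max(draws, key=draws.get)

rng = random.Random(0)
posteriors = {"rbf": (0.9, 0.05), "linear": (0.2, 0.05)}
picks = [thompson_pick(posteriors, rng) for _ in range(200)]
# The clearly better category is played almost exclusively.
assert picks.count("rbf") > picks.count("linear")
```

In the full method, the reward observed for the chosen arm would come from a BO step on that category's continuous variables, and the arm's posterior would then be updated.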

IJCAI Conference 2020 Conference Paper

Randomised Gaussian Process Upper Confidence Bound for Bayesian Optimisation

  • Julian Berk
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

In order to improve the performance of Bayesian optimisation, we develop a modified Gaussian process upper confidence bound (GP-UCB) acquisition function. This is done by sampling the exploration-exploitation trade-off parameter from a distribution. We prove that this allows the expected trade-off parameter to be altered to better suit the problem without compromising a bound on the function's Bayesian regret. We also provide results showing that our method achieves better performance than GP-UCB in a range of real-world and synthetic problems.
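The core change, drawing the exploration parameter from a distribution rather than fixing it, can be sketched as follows; the exponential distribution is an illustrative choice, not necessarily the one analysed in the paper:

```python
import math
import random

def randomised_ucb_scores(mu, sigma, rng, rate=1.0):
    """GP-UCB style acquisition scores with the trade-off parameter
    beta sampled per iteration; tuning `rate` shifts E[beta] to suit
    the problem without fixing beta itself."""
    beta = rng.expovariate(rate)
    return [m + math.sqrt(beta) * s for m, s in zip(mu, sigma)]

rng = random.Random(42)
scores = randomised_ucb_scores([0.1, 0.5, 0.3], [0.2, 0.0, 0.4], rng)
assert scores[1] == 0.5                      # zero std: score equals the mean
assert all(s >= m for s, m in zip(scores, [0.1, 0.5, 0.3]))
```

The next evaluation point is then the argmax of these scores, exactly as in standard GP-UCB.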

NeurIPS Conference 2020 Conference Paper

Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces

  • Hung Tran-The
  • Sunil Gupta
  • Santu Rana
  • Huong Ha
  • Svetha Venkatesh

Bayesian optimisation (BO) is a popular method for the efficient optimisation of expensive black-box functions. Traditionally, BO assumes that the search space is known. However, in many problems this assumption does not hold. To this end, we propose a novel BO algorithm which expands (and shifts) the search space over iterations by controlling the expansion rate through a \emph{hyperharmonic series}. Further, we propose another variant of our algorithm that scales to high dimensions. We show theoretically that for both our algorithms, the cumulative regret grows at a sub-linear rate. Our experiments with synthetic and real-world optimisation tasks demonstrate the superiority of our algorithms over the current state-of-the-art methods for Bayesian optimisation in unknown search spaces.

AAAI Conference 2020 Conference Paper

Trading Convergence Rate with Computational Budget in High Dimensional Bayesian Optimization

  • Hung Tran-The
  • Sunil Gupta
  • Santu Rana
  • Svetha Venkatesh

Scaling Bayesian optimisation (BO) to high-dimensional search spaces is an active and open research problem, particularly when no assumptions are made on the function structure. The main reason is that at each iteration, BO requires finding the global maximum of an acquisition function, which is itself a non-convex optimisation problem in the original search space. With growing dimensions, the computational budget for this maximisation becomes increasingly inadequate, leading to inaccurate solutions of the maximisation. This inaccuracy adversely affects both the convergence and the efficiency of BO. We propose a novel approach where the acquisition function only requires maximisation on a discrete set of low-dimensional subspaces embedded in the original high-dimensional search space. Our method is free of any low-dimensional structure assumption on the function, unlike many recent high-dimensional BO methods. Optimising the acquisition function in low-dimensional subspaces allows our method to obtain accurate solutions within a limited computational budget. We show that in spite of this convenience, our algorithm remains convergent. In particular, the cumulative regret of our algorithm grows only sub-linearly with the number of iterations. More importantly, as evident from our regret bounds, our algorithm provides a way to trade the convergence rate against the number of subspaces used in the optimisation. Finally, when the number of subspaces is "sufficiently large", our algorithm's cumulative regret is at most $O^*(\sqrt{T\gamma_T})$ as opposed to $O^*(\sqrt{DT\gamma_T})$ for the GP-UCB of Srinivas et al. (2012), removing a crucial factor of $\sqrt{D}$, where $D$ is the dimension of the input space. We perform extensive empirical experiments to evaluate our method, showing that its sample efficiency is better than that of existing methods for many optimisation problems involving dimensions up to 5000.

AAAI Conference 2019 Conference Paper

Bayesian Functional Optimisation with Shape Prior

  • Pratibha Vellanki
  • Santu Rana
  • Sunil Gupta
  • David Rubin de Celis Leal
  • Alessandra Sutti
  • Murray Height
  • Svetha Venkatesh

Real-world experiments are expensive, so it is important to reach a target in a minimum number of experiments. Experimental processes often involve control variables that change over time; such problems can be formulated as functional optimisation problems. We develop a novel Bayesian optimisation framework for functional optimisation of expensive black-box processes. We represent the control function using a Bernstein polynomial basis and optimise in the coefficient space. We derive the theory and practice required to dynamically adjust the polynomial degree, and show how prior information about shape can be integrated. We demonstrate the effectiveness of our approach on short polymer fibre design and on optimising learning rate schedules for deep networks.
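
The representation step in this abstract, a control function written in the Bernstein basis, is standard and can be sketched directly; the optimisation over coefficients (the paper's contribution) is not shown here.

```python
import numpy as np
from math import comb

def bernstein_curve(coeffs, t):
    # evaluate a control function c(t) represented in the Bernstein basis:
    # c(t) = sum_k coeffs[k] * C(n, k) * t^k * (1 - t)^(n - k)
    n = len(coeffs) - 1
    basis = np.array([comb(n, k) * t**k * (1 - t) ** (n - k)
                      for k in range(n + 1)])
    return coeffs @ basis

t = np.linspace(0.0, 1.0, 5)
c = np.array([0.0, 1.0, 0.0])  # degree-2 coefficient vector
y = bernstein_curve(c, t)       # curve starts and ends at 0, peaks mid-way
```

A useful property for shape priors: the curve interpolates the first and last coefficients at t = 0 and t = 1, and monotone coefficient sequences yield monotone curves.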

NeurIPS Conference 2019 Conference Paper

Bayesian Optimization with Unknown Search Space

  • Huong Ha
  • Santu Rana
  • Sunil Gupta
  • Thanh Nguyen
  • Hung Tran-The
  • Svetha Venkatesh

Applying Bayesian optimization to problems where the search space is unknown is challenging. To address this problem, we propose a systematic volume expansion strategy for Bayesian optimization. We devise a strategy to guarantee that, through iterative expansions of the search space, our method can find a point whose function value is within epsilon of the objective function's maximum. Without the need to specify any parameters, our algorithm iteratively triggers the minimal expansion required. We derive analytic expressions for when to trigger an expansion and by how much to expand. We also provide theoretical analysis showing that our method achieves epsilon-accuracy after a finite number of iterations. We evaluate our method on both benchmark test functions and machine learning hyper-parameter tuning tasks and demonstrate that it outperforms the baselines.
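
The expand-when-needed idea can be sketched in one dimension. This is a crude stand-in, not the paper's analytic trigger: random search replaces the BO inner loop, and the box grows whenever the incumbent presses against a boundary.

```python
import numpy as np

rng = np.random.default_rng(1)

def objective(x):
    # toy 1-D function whose maximiser (x = 2.5) lies outside the initial box
    return -(x - 2.5) ** 2

lo, hi = 0.0, 1.0  # initial, mis-specified search space
best = -np.inf
for _ in range(10):
    X = rng.uniform(lo, hi, size=64)  # stand-in for one BO iteration
    vals = objective(X)
    best = max(best, vals.max())
    x_inc = X[np.argmax(vals)]        # incumbent of this iteration
    width = hi - lo
    # expand the boundary the incumbent is pushing against
    if hi - x_inc < 0.1 * width:
        hi += 0.5 * width
    if x_inc - lo < 0.1 * width:
        lo -= 0.5 * width
```

In the actual method, the trigger and the expansion amount come from closed-form expressions rather than the hand-tuned thresholds used here.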

NeurIPS Conference 2019 Conference Paper

Multi-objective Bayesian optimisation with preferences over objectives

  • Majid Abdolshah
  • Alistair Shilton
  • Santu Rana
  • Sunil Gupta
  • Svetha Venkatesh

We present a multi-objective Bayesian optimisation algorithm that allows the user to express preference-order constraints on the objectives, of the type "objective A is more important than objective B". These preferences are defined based on the stability of the obtained solutions with respect to the preferred objective functions. Rather than attempting to find a representative subset of the complete Pareto front, our algorithm selects those Pareto-optimal points that satisfy these constraints. We formulate a new acquisition function based on expected improvement in dominated hypervolume (EHI) to ensure that the subset of the Pareto front satisfying the constraints is thoroughly explored. The hypervolume calculation is weighted by the probability of a point satisfying the constraints, obtained from a gradient Gaussian process model. We demonstrate our algorithm on both synthetic and real-world problems.

NeurIPS Conference 2018 Conference Paper

Algorithmic Assurance: An Active Approach to Algorithmic Testing using Bayesian Optimisation

  • Shivapratap Gopakumar
  • Sunil Gupta
  • Santu Rana
  • Vu Nguyen
  • Svetha Venkatesh

We introduce algorithmic assurance, the problem of testing whether machine learning algorithms conform to their intended design goals. We address this problem by proposing an efficient framework for algorithmic testing. To provide assurance, we need to efficiently discover scenarios where an algorithm's decision deviates maximally from its intended gold standard. We mathematically formulate this task as the optimisation of an expensive, black-box function. We use an active learning approach based on Bayesian optimisation to solve this optimisation problem. We extend the framework to algorithms with vector-valued outputs by making appropriate modifications to Bayesian optimisation via the EXP3 algorithm. We theoretically analyse our methods for convergence. Using two real-world applications, we demonstrate the efficiency of our methods. The significance of our problem formulation and initial solutions is that they will serve as a foundation for assuring humans about machines making complex decisions.

JBHI Journal 2017 Journal Article

A Framework for Mixed-Type Multioutcome Prediction With Applications in Healthcare

  • Budhaditya Saha
  • Sunil Gupta
  • Dinh Phung
  • Svetha Venkatesh

Health analysis often involves predicting multiple outcomes of mixed type. Existing work is restricted to either a limited number of outcomes or specific outcome types. We propose a framework for mixed-type multioutcome prediction. Our framework uses a cumulative loss function composed of a specific loss for each outcome type: for example, least squares (continuous outcome), hinge (binary outcome), Poisson (count outcome), and exponential (nonnegative outcome). To model these outcomes jointly, we impose commonality across the prediction parameters through a common matrix-normal prior. The framework is formulated as an iterative optimisation problem and solved using an efficient block-coordinate descent method. We empirically demonstrate both scalability and convergence. We apply the proposed model to a synthetic dataset and then to two real-world cohorts: a cancer cohort and an acute myocardial infarction cohort collected over a two-year period. We predict multiple emergency-related outcomes: for example, future emergency presentations (binary), emergency admissions (count), emergency length of stay in days (nonnegative), and time to next emergency admission in days (nonnegative). We show that the predictive performance of the proposed model is better than that of several state-of-the-art baselines.

IJCAI Conference 2017 Conference Paper

High Dimensional Bayesian Optimization using Dropout

  • Cheng Li
  • Sunil Gupta
  • Santu Rana
  • Vu Nguyen
  • Svetha Venkatesh
  • Alistair Shilton

Scaling Bayesian optimization to high dimensions is a challenging task, as the global optimization of a high-dimensional acquisition function can be expensive and often infeasible. Existing methods depend either on a limited set of "active" variables or on an additive form of the objective function. We propose a new method for high-dimensional Bayesian optimization that uses a dropout strategy to optimize only a subset of variables at each iteration. We derive theoretical bounds for the regret and show how they inform the design of our algorithm. We demonstrate the efficacy of our algorithm on two benchmark functions and two real-world applications: training cascade classifiers and optimizing alloy composition.
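
The dropout idea can be sketched as follows. This is an illustration under assumed names, with random search standing in for the GP acquisition step: each round a random subset of dimensions is optimised, and the dropped dimensions are filled in from the incumbent (one of the fill-in strategies discussed in the dropout-BO literature).

```python
import numpy as np

rng = np.random.default_rng(2)
D, d = 20, 3  # full dimension, number of variables optimised per round

def objective(x):
    # toy objective with maximiser at x = 0.5 in every coordinate
    return -np.sum((x - 0.5) ** 2)

x_best = rng.uniform(0.0, 1.0, D)
f_best = objective(x_best)
for _ in range(50):
    dims = rng.choice(D, size=d, replace=False)  # "drop out" the rest
    # vary only the chosen coordinates; fixed dims copy the incumbent
    cand = np.tile(x_best, (128, 1))
    cand[:, dims] = rng.uniform(0.0, 1.0, (128, d))
    vals = np.array([objective(c) for c in cand])
    if vals.max() > f_best:
        f_best = vals.max()
        x_best = cand[np.argmax(vals)]
```

Each round the inner search is d-dimensional rather than D-dimensional, which is what makes the acquisition maximisation tractable at scale.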

NeurIPS Conference 2017 Conference Paper

Process-constrained batch Bayesian optimisation

  • Pratibha Vellanki
  • Santu Rana
  • Sunil Gupta
  • David Rubin
  • Alessandra Sutti
  • Thomas Dorin
  • Murray Height
  • Paul Sanders

Prevailing batch Bayesian optimisation methods allow all control variables to be freely altered at each iteration. Real-world experiments, however, often have physical limitations that make it time-consuming to alter all settings for each recommendation in a batch. This gives rise to a unique problem in BO: in a recommended batch, a set of variables that are expensive to change experimentally must be held fixed, while the remaining control variables can be varied. We formulate this as a process-constrained batch Bayesian optimisation problem. We propose two algorithms, pc-BO(basic) and pc-BO(nested). pc-BO(basic) is simpler but lacks a convergence guarantee. In contrast, pc-BO(nested) is slightly more complex but admits a convergence analysis. We show that the regret of pc-BO(nested) is sublinear. We demonstrate the performance of both pc-BO(basic) and pc-BO(nested) by optimising benchmark test functions, tuning the hyper-parameters of an SVM classifier, optimising the heat-treatment process for an Al-Sc alloy to achieve a target hardness, and optimising the short polymer fibre production process.
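
The constraint structure in this abstract can be shown in a tiny sketch, not either pc-BO algorithm: one shared setting is chosen for the expensive-to-change variable, and only the remaining variables differ across the recommended batch (all names and the random choices below are illustrative).

```python
import numpy as np

rng = np.random.default_rng(3)
n_batch = 4
constrained = [0]  # index of the variable that is expensive to change
free = [1]         # variables allowed to differ within a batch

def objective(x):
    # toy experiment model
    return -np.sum((x - np.array([0.2, 0.7])) ** 2)

# one setting for the constrained variable, shared by the whole batch,
# while the free variables vary across the batch recommendations
x_c = rng.uniform(0.0, 1.0)
batch = np.empty((n_batch, 2))
batch[:, constrained] = x_c
batch[:, free] = rng.uniform(0.0, 1.0, (n_batch, 1))
vals = np.array([objective(b) for b in batch])
```

In pc-BO proper, both the shared setting and the free-variable recommendations come from acquisition maximisation rather than the random draws used here.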

ICRA Conference 1991 Conference Paper

Closed-loop control of manipulators with redundant joints using the Hamilton-Jacobi-Bellman equation

  • Sunil Gupta
  • J. Y. S. Luh

A closed-loop state-feedback motion-control scheme has been developed for manipulators with redundant joints. The scheme is based on the well-known Hamilton-Jacobi-Bellman equation of optimal control. Self-motion of the linkage is utilized to optimize a quadratic performance criterion over the entire trajectory of the end-effector. A linearized model of the manipulator dynamics is used in the control algorithm. The algorithm requires solving the algebraic matrix Riccati equation. Simulation results are presented.