Author name cluster

Sameep Mehta

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

1 author row

AAAI Conference 2026 System Paper

DFAgent: From Natural Language Data Interactions to Reusable Agent-Ready Tools

Neelamadhav Gantayat
Renuka Sindhgatta
Sambit Ghosh
Sameep Mehta
Soujanya Soni

We present DataFoundry Agent (DFAgent), a system that forges reusable, agent-ready tools from interactive data exploration, quality, and remediation tasks. Users engage with data through natural-language prompts for operations that include inspection, transformation, and visualization. These interactions automatically generate executable code snippets that are logged. From these snippets, DFAgent acts as a foundry, synthesizing a governed catalog of enriched tools exposed via the Model Context Protocol (MCP). In this way, user-derived logic for all data operations is transformed into standardized, composable tools without reimplementation. We demonstrate how diverse interactions accumulate into a reusable toolset, highlighting a paradigm that unifies natural language interaction, executable code generation, and tool foundry processes for agentic data systems.


AAAI Conference 2026 System Paper

ToolSmith: A Multi-Agent Framework for Enterprise Tool Creation

Purna Chandra Sekhar Vakudavathu
Kushal Mukherjee
Jayachandu Bandlamudi
Renuka Sindhgatta
Sameep Mehta

Although LLMs can generate tools for generic domains and tasks, they struggle with enterprise-related domains that involve proprietary APIs and data schemas. We present ToolSmith, a framework for autonomously generating and validating agent-compatible tools. Given an API specification and a Tool Specification Requirement (TSR), ToolSmith produces a tool function and verifies it through a closed-loop process: it creates natural language (NL) tests and executes the tool in a secure agent sandbox for validation. For state-changing tools, ToolSmith confirms outcomes by querying the API with parameters derived from the NL tests. If the tool fails to produce the desired output, ToolSmith generates diagnostic feedback to iteratively regenerate it. By ensuring both functional correctness and agent compatibility, ToolSmith enables reliable automation of enterprise workflows.


AAAI Conference 2025 System Paper

Question-guided Insights Generation for Automated Exploratory Data Analysis

Abhijit Manatkar
Ashlesha Akella
Krishnasuri Narayanam
Sameep Mehta

Exploratory Data Analysis (EDA) derives meaningful insights from extensive and complex datasets. This process typically involves a series of analytical operations to identify the patterns within the data. However, the effectiveness of EDA is often limited by the user's domain knowledge and proficiency in data exploration methods. To overcome these challenges, we developed QUIS, a fully automated EDA system that uncovers insights by generating data-related questions and exploring subspaces in the dataset without prior training. QUIS allows users to control key system parameters such as beam width, beam depth, and expansion factor for subspace selection, the interestingness score for filtering valuable insights, and parameters for managing the quality and quantity of generated questions.


IJCAI Conference 2024 Conference Paper

LLM-powered GraphQL Generator for Data Retrieval

Balaji Ganesan
Sambit Ghosh
Nitin Gupta
Manish Kesarwani
Sameep Mehta
Renuka Sindhgatta

GraphQL offers an efficient, powerful, and flexible alternative to REST APIs. However, application developers writing GraphQL clients need both technical and domain-specific expertise to reap its benefits and avoid over-fetching or under-fetching data. Automated GraphQL generation has so far proven to be a hard problem because of complex GraphQL schemas and the lack of benchmark datasets. To address these issues, our work focuses on building an LLM-powered pipeline that can accept user requirements in natural language along with the complex GraphQL schema and automatically produce the GraphQL query needed to retrieve the necessary data. Automated GraphQL generation helps reduce entry barriers for application developers, broadening GraphQL adoption.


AAAI Conference 2024 System Paper

LLMGuard: Guarding against Unsafe LLM Behavior

Shubh Goyal
Medha Hira
Shubham Mishra
Sukriti Goyal
Arnav Goel
Niharika Dadu
Kirushikesh DB
Sameep Mehta

Although the rise of Large Language Models (LLMs) in enterprise settings brings new opportunities and capabilities, it also brings challenges, such as the risk of generating inappropriate, biased, or misleading content that violates regulations and can have legal concerns. To alleviate this, we present "LLMGuard", a tool that monitors user interactions with an LLM application and flags content against specific behaviours or conversation topics. To do this robustly, LLMGuard employs an ensemble of detectors.
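The ensemble idea in the abstract above can be illustrated with a minimal sketch. This is not LLMGuard's implementation: the detector names, the email pattern, and the block list are all hypothetical, chosen only to show how independent detectors might each inspect the same text and report which of them flagged it.

```python
import re
from typing import Callable, Dict, List

# A detector takes text and returns True when it flags the text.
Detector = Callable[[str], bool]


def pii_detector(text: str) -> bool:
    """Hypothetical detector: flags text containing something shaped like an email address."""
    return re.search(r"[\w.+-]+@[\w-]+\.[\w.]+", text) is not None


def banned_topic_detector(text: str) -> bool:
    """Hypothetical detector: flags text that mentions a word on a small illustrative block list."""
    banned = {"violence", "gambling"}
    words = set(re.findall(r"[a-z]+", text.lower()))
    return not banned.isdisjoint(words)


def run_ensemble(text: str, detectors: Dict[str, Detector]) -> List[str]:
    """Run every detector on the text and return the names of those that flagged it."""
    return [name for name, detect in detectors.items() if detect(text)]


# Illustrative registry; a real system would plug in model-based detectors here.
DETECTORS: Dict[str, Detector] = {
    "pii": pii_detector,
    "banned_topic": banned_topic_detector,
}
```

Keeping each detector behind the same callable signature means a new check can be added to the registry without touching the code that runs the ensemble.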


AAAI Conference 2020 Short Paper

Multidimensional Analysis of Trust in News Articles (Student Abstract)

Avneet Kaur
Maitree Leekha
Utkarsh Chawla
Ayush Agarwal
Mudit Saxena
Nishtha Madaan
Kalapriya Kannan
Sameep Mehta

The advancements in the field of Information Communication Technology have engendered revolutionary changes in the journalism industry, not only on the part of the journalists and the media personnel, but also on the people consuming these news stories, who today, are only a click away from all the updates they need. However, these advances have also exposed the prevailing venality, wearying off the trust of the public in news media. How then, does an individual discern that which, out of the countless news stories for an incident, should be trusted? This work introduces a system that presents the user a multidimensional analysis for trust in news from various media sources based on the textual content of the articles, assessment of the journalists’ perspectives and the temporal diversity of the issues being covered by the media houses publishing the news articles. Our experiments on a self-collected dataset confirm that the system aids in a comprehensive analysis of trust.


AAAI Conference 2018 Conference Paper

Content and Context: Two-Pronged Bootstrapped Learning for Regex-Formatted Entity Extraction

Stanley Simoes
Deepak P
Munu Sairamesh
Deepak Khemani
Sameep Mehta

Regular expressions are an important building block of rule-based information extraction systems. Regexes can encode rules to recognize instances of simple entities which can then feed into the identification of more complex cross-entity relationships. Manually crafting a regex that recognizes all possible instances of an entity is difficult since an entity can manifest in a variety of different forms. Thus, the problem of automatically generalizing manually crafted seed regexes to improve the recall of IE systems has attracted research attention. In this paper, we propose a bootstrapped approach to improve the recall for extraction of regex-formatted entities, with the only source of supervision being the seed regex. Our approach starts from a manually authored high precision seed regex for the entity of interest, and uses the matches of the seed regex and the context around these matches to identify more instances of the entity. These are then used to identify a set of diverse, high recall regexes that are representative of this entity. Through an empirical evaluation over multiple real world document corpora, we illustrate the effectiveness of our approach.
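A rough sketch of the first bootstrapping step described above (seed matches, then context, then new candidate instances) might look as follows. It is a deliberate simplification and not the paper's algorithm: the function name, the whitespace tokenization, the use of only the single preceding token as "context", and the `min_support` threshold are all assumptions made for illustration, and the later step of inducing new regexes from the candidates is omitted.

```python
import re
from collections import Counter
from typing import List, Set


def bootstrap_candidates(seed: str, docs: List[str], min_support: int = 2) -> Set[str]:
    """Return tokens that the seed regex misses but that appear in contexts the seed matches share.

    Assumption: "context" is just the lowercased token immediately before a match.
    """
    seed_re = re.compile(seed)

    # Pass 1: count the left-context token of every seed match.
    contexts: Counter = Counter()
    for doc in docs:
        tokens = doc.split()
        for i, tok in enumerate(tokens):
            if i > 0 and seed_re.fullmatch(tok):
                contexts[tokens[i - 1].lower()] += 1

    # Keep only contexts seen often enough to be trusted.
    trusted = {ctx for ctx, count in contexts.items() if count >= min_support}

    # Pass 2: any non-matching token that follows a trusted context is a candidate instance.
    candidates: Set[str] = set()
    for doc in docs:
        tokens = doc.split()
        for prev, tok in zip(tokens, tokens[1:]):
            if prev.lower() in trusted and not seed_re.fullmatch(tok):
                candidates.add(tok)
    return candidates
```

For example, with the seed `\d{3}-\d{4}` and a corpus where phone-like strings usually follow the word "call", an unhyphenated number after "call" is surfaced as a candidate that a generalized regex should also cover.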


AAAI Conference 2018 Short Paper

Semantic Understanding for Contextual In-Video Advertising

Rishi Madhok
Shashank Mujumdar
Nitin Gupta
Sameep Mehta

With the increasing consumer base of online video content, it is important for advertisers to understand the video context when targeting video ads to consumers. To improve the consumer experience and quality of ads, key factors need to be considered, such as (i) ad relevance to video content, (ii) where and how video ads are placed, and (iii) non-intrusive user experience. We propose a framework to semantically understand the video content for better ad recommendation that satisfies these criteria.


IJCAI Conference 2015 Conference Paper

Tracking Political Elections on Social Media: Applications and Experience

Danish Contractor
Bhupesh Chawda
Sameep Mehta
L Venkata Subramaniam
Tanveer Afzal Faruquie

In recent times, social media has become a popular medium for many election campaigns. It not only allows candidates to reach out to a large section of the electorate, it is also a potent medium for people to express their opinion on the proposed policies and promises of candidates. Analyzing social media data is challenging as the text can be noisy, sparse and even multilingual. In addition, the information may not be completely trustworthy, particularly in the presence of propaganda, promotions and rumors. In this paper we describe our work for analyzing election campaigns using social media data. Using data from the 2012 US presidential elections and the 2013 Philippines General elections, we provide detailed experiments on our methods that use Granger causality to identify topics that were most “causal” for public opinion and which, in turn, give an interpretable insight into “elections topics” that were most important. Our system was deployed by the largest media organization in the Philippines during the 2013 General elections and, using our work, the media house was able to identify and report news stories much faster than competitors and reported higher TRP ratings during the election.


IJCAI Conference 2011 Conference Paper

A System for Providing Differentiated QoS in Retail Banking

Sameep Mehta
Girish Chafle
Gyana Parija
Vikas Kedia

In today's services-driven economic environment, it is imperative for organizations to provide a better quality service experience to differentiate and grow their business. Customer satisfaction (C-SAT) is the key driver for retention and growth in Retail Banking. Wait time, the time spent by a customer at the branch before getting serviced, contributes significantly to C-SAT. Due to high footfall, it is impractical to improve the wait time of every customer walking into the branch. Therefore, banks in developing countries are strategically looking to segment their customers and services and offer differentiated QoS-based service delivery. In this work, we present a system for customer segmentation and scheduling based on the historic value of the customer and characteristics of the current service request. We describe the system and give a mathematical formulation of the scheduling problem and the associated heuristics. We present results and experience of deployment of this solution in multiple branches of a leading bank in India.


AAAI Conference 2011 Conference Paper

Design and Analysis of Value Creation Networks

Sampath Kameshwaran
Sameep Mehta
Vinayaka Pandit
