Author name cluster

Partha Talukdar

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers

2 author rows

ICLR Conference 2024 Conference Paper

LLM Augmented LLMs: Expanding Capabilities through Composition

Rachit Bansal
Bidisha Samanta
Siddharth Dalmia
Nitish Gupta
Sriram Ganapathy
Abhishek Bapna
Prateek Jain
Partha Talukdar

Foundational models with billions of parameters, trained on large corpora of data, have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to augment them or impart new skills. On the other hand, due to their adaptation abilities, several new instances of these models are being trained towards new domains and tasks. In this work, we study the problem of efficient and practical composition of existing foundation models with more specific models to enable newer capabilities. To this end, we propose CALM (Composition to Augment Language Models), which introduces cross-attention between models to compose their representations and enable new capabilities. Salient features of CALM are: (i) it scales up LLMs on new tasks by ‘re-using’ existing LLMs along with a few additional parameters and data, (ii) existing model weights are kept intact, which preserves existing capabilities, and (iii) it applies to diverse domains and settings. We illustrate that augmenting PaLM2-S with a smaller model trained on low-resource languages results in an absolute improvement of up to 13% on tasks like translation into English and arithmetic reasoning for low-resource languages. Similarly, when PaLM2-S is augmented with a code-specific model, we see a relative improvement of 40% over the base model for code generation and explanation tasks, on par with fully fine-tuned counterparts.

ICRA Conference 2021 Conference Paper

Spatial Reasoning from Natural Language Instructions for Robot Manipulation

Sagar Gubbi Venkatesh
Anirban Biswas
Raviteja Upadrashta
Vikram Srinivasan
Partha Talukdar
Bharadwaj Amrutur

Robots that can manipulate objects in unstructured environments and collaborate with humans can benefit immensely from understanding natural language. We propose a pipelined architecture of two stages to perform spatial reasoning on the text input. All the objects in the scene are first localized, and then the natural language instruction for the robot and the localized co-ordinates are mapped to the start and end co-ordinates corresponding to the locations where the robot must pick up and place the object, respectively. We show that representing the localized objects by quantizing their positions to a binary grid is preferable to representing them as a list of 2D co-ordinates. We also show that attention improves generalization and can overcome biases in the dataset. The proposed method is used to pick and place playing cards with a robot arm.
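The binary-grid representation mentioned in this abstract can be illustrated with a minimal sketch. The function name `to_binary_grid`, the square workspace of side `extent`, and the row/column convention are assumptions made for illustration; the abstract does not specify the grid resolution or the layout the authors used.

```python
def to_binary_grid(points, grid_size, extent=1.0):
    """Quantize 2D object positions into a grid_size x grid_size binary grid.

    Assumes (hypothetically) that all coordinates lie in [0, extent].
    A cell is 1 if at least one object falls inside it, otherwise 0.
    """
    grid = [[0] * grid_size for _ in range(grid_size)]
    for x, y in points:
        # Scale each coordinate to a cell index; clamp so that a point
        # exactly on the upper boundary lands in the last cell.
        col = min(int(x / extent * grid_size), grid_size - 1)
        row = min(int(y / extent * grid_size), grid_size - 1)
        grid[row][col] = 1
    return grid


# Two localized objects on a 4 x 4 grid.
grid = to_binary_grid([(0.1, 0.1), (0.9, 0.5)], grid_size=4)
```

Unlike a list of co-ordinates, the grid has a fixed size regardless of how many objects are in the scene, which is one plausible reason such an encoding is convenient as a network input.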

AAAI Conference 2020 Conference Paper

ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations

Ekagra Ranjan
Soumya Sanyal
Partha Talukdar

Graph Neural Networks (GNN) have been shown to work effectively for modeling graph structured data to solve tasks such as node classification, link prediction and graph classification. There has been some recent progress in defining the notion of pooling in graphs whereby the model tries to generate a graph level representation by downsampling and summarizing the information present in the nodes. Existing pooling methods either fail to effectively capture the graph substructure or do not easily scale to large graphs. In this work, we propose ASAP (Adaptive Structure Aware Pooling), a sparse and differentiable pooling method that addresses the limitations of previous graph pooling architectures. ASAP utilizes a novel self-attention network along with a modified GNN formulation to capture the importance of each node in a given graph. It also learns a sparse soft cluster assignment for nodes at each layer to effectively pool the subgraphs to form the pooled graph. Through extensive experiments on multiple datasets and theoretical analysis, we motivate our choice of the components used in ASAP. Our experimental results show that combining existing GNN architectures with ASAP leads to state-of-the-art results on multiple graph classification benchmarks. ASAP has an average improvement of 4% compared to the current sparse hierarchical state-of-the-art method. We make the source code of ASAP available to encourage reproducible research.

AAAI Conference 2020 Conference Paper

InteractE: Improving Convolution-Based Knowledge Graph Embeddings by Increasing Feature Interactions

Shikhar Vashishth
Soumya Sanyal
Vikram Nitin
Nilesh Agrawal
Partha Talukdar

Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to capture rich interactions between their components. However, the number of interactions that ConvE can capture is limited. In this paper, we analyze how increasing the number of these interactions affects link prediction performance, and utilize our observations to propose InteractE. InteractE is based on three key ideas: feature permutation, a novel feature reshaping, and circular convolution. Through extensive experiments, we find that InteractE outperforms state-of-the-art convolutional link prediction baselines on FB15k-237. Further, InteractE achieves an MRR score that is 9%, 7.5%, and 23% better than ConvE on the FB15k-237, WN18RR and YAGO3-10 datasets respectively. The results validate our central hypothesis that increasing feature interaction is beneficial to link prediction performance. We make the source code of InteractE available to encourage reproducible research.
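Of the three ideas named in this abstract, circular convolution is the easiest to show in isolation. The sketch below is a generic 2D circular cross-correlation (the operation deep learning libraries usually call convolution) in which the input wraps around at its borders instead of being zero-padded. The function name and the plain nested-list representation are assumptions for illustration, and this is not the InteractE implementation.

```python
def circular_conv2d(image, kernel):
    """Apply a kernel to a 2D input with wraparound (circular) padding.

    The output has the same shape as the input. Indices that fall off
    one edge re-enter from the opposite edge via the modulo operation.
    """
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    out = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            acc = 0.0
            for i in range(k_rows):
                for j in range(k_cols):
                    # Center the kernel on (r, c) and wrap out-of-range indices.
                    rr = (r + i - k_rows // 2) % rows
                    cc = (c + j - k_cols // 2) % cols
                    acc += kernel[i][j] * image[rr][cc]
            out[r][c] = acc
    return out


image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
ones = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
result = circular_conv2d(image, ones)
```

With wraparound, cells on the border of the reshaped embedding interact with cells on the opposite border, so border positions take part in as many interactions as interior ones, whereas zero-padding would pair them with zeros.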

AAAI Conference 2020 Conference Paper

P-SIF: Document Embeddings Using Partition Averaging

Vivek Gupta
Ankit Saw
Pegah Nokhiz
Praneeth Netrapalli
Piyush Rai
Partha Talukdar

Simple weighted averaging of word vectors often yields effective representations for sentences which outperform sophisticated seq2seq neural models in many tasks. While it is desirable to use the same method to represent documents as well, unfortunately, the effectiveness is lost when representing long documents involving multiple sentences. One of the key reasons is that a longer document is likely to contain words from many different topics; hence, creating a single vector while ignoring all the topical structure is unlikely to yield an effective document representation. This problem is less acute in single sentences and other short text fragments where the presence of a single topic is most likely. To alleviate this problem, we present P-SIF, a partitioned word averaging model to represent long documents. P-SIF retains the simplicity of simple weighted word averaging while taking a document’s topical structure into account. In particular, P-SIF learns topic-specific vectors from a document and finally concatenates them all to represent the overall document. We provide theoretical justifications on the correctness of P-SIF. Through a comprehensive set of experiments, we demonstrate P-SIF’s effectiveness compared to simple weighted averaging and many other baselines.
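The idea of building one weighted average per topic and concatenating the results can be sketched as follows. This is a simplified illustration under stated assumptions: the per-word topic weights and per-word importance weights are taken as given inputs, and all names (`partition_average`, `topic_weights`, `word_weights`) are hypothetical. The abstract does not say how P-SIF learns these quantities, so that step is not shown.

```python
def partition_average(words, word_vectors, topic_weights, word_weights, dim, num_topics):
    """Build a document vector of length num_topics * dim.

    For each topic t, the words are averaged with weight
    word_weights[w] * topic_weights[w][t]; the per-topic averages are
    then laid out side by side (concatenated) in one flat list.
    """
    doc = [0.0] * (num_topics * dim)
    for w in words:
        vec = word_vectors[w]
        for t in range(num_topics):
            scale = word_weights[w] * topic_weights[w][t]
            for d in range(dim):
                doc[t * dim + d] += scale * vec[d]
    n = max(len(words), 1)  # avoid dividing by zero for an empty document
    return [v / n for v in doc]


# Toy inputs: two words, two-dimensional vectors, two topics.
vectors = {"gene": [1.0, 0.0], "court": [0.0, 1.0]}
topics = {"gene": [1.0, 0.0], "court": [0.0, 1.0]}
weights = {"gene": 1.0, "court": 1.0}
doc_vec = partition_average(["gene", "court"], vectors, topics, weights, dim=2, num_topics=2)
```

In this toy case each word contributes only to its own topic's slot, so the two words are kept apart in the final vector instead of being blended into a single average.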

NeurIPS Conference 2019 Conference Paper

HyperGCN: A New Method For Training Graph Convolutional Networks on Hypergraphs

Naganand Yadati
Madhav Nimishakavi
Prateek Yadav
Vikram Nitin
Anand Louis
Partha Talukdar

In many real-world network datasets such as co-authorship, co-citation, email communication, etc., relationships are complex and go beyond pairwise. Hypergraphs provide a flexible and natural modeling tool to model such complex relationships. The obvious existence of such complex relationships in many real-world networks naturally motivates the problem of learning with hypergraphs. A popular learning paradigm is hypergraph-based semi-supervised learning (SSL) where the goal is to assign labels to initially unlabeled vertices in a hypergraph. Motivated by the fact that a graph convolutional network (GCN) has been effective for graph-based SSL, we propose HyperGCN, a novel GCN for SSL on attributed hypergraphs. Additionally, we show how HyperGCN can be used as a learning-based approach for combinatorial optimisation on NP-hard hypergraph problems. We demonstrate HyperGCN's effectiveness through detailed experimentation on real-world hypergraphs. We have made HyperGCN's source code available to foster reproducible research.

AAAI Conference 2019 Conference Paper

ReAl-LiFE: Accelerating the Discovery of Individualized Brain Connectomes on GPUs

Sawan Kumar
Varsha Sreenivasan
Partha Talukdar
Franco Pestilli
Devarajan Sridharan

Diffusion imaging and tractography enable mapping structural connections in the human brain, in vivo. Linear Fascicle Evaluation (LiFE) is a state-of-the-art approach for pruning spurious connections in the estimated structural connectome, by optimizing its fit to the measured diffusion data. Yet, LiFE imposes heavy demands on computing time, precluding its use in analyses of large connectome databases. Here, we introduce a GPU-based implementation of LiFE that achieves 50-100x speedups over conventional CPU-based implementations for connectome sizes of up to several million fibers. Briefly, the algorithm accelerates generalized matrix multiplications on a compressed tensor through efficient GPU kernels, while ensuring favorable memory access patterns. Leveraging these speedups, we advance LiFE’s algorithm by imposing a regularization constraint on estimated fiber weights during connectome pruning. Our regularized, accelerated LiFE algorithm (“ReAl-LiFE”) estimates sparser connectomes that also provide more accurate fits to the underlying diffusion signal. We demonstrate the utility of our approach by classifying pathological signatures of structural connectivity in patients with Alzheimer’s Disease (AD). We estimated million-fiber whole-brain connectomes, followed by pruning with ReAl-LiFE, for 90 individuals (45 AD patients and 45 healthy controls). Linear classifiers, based on support vector machines, achieved over 80% accuracy in distinguishing AD patients from healthy controls based on their ReAl-LiFE pruned structural connectomes alone. Moreover, classification based on the ReAl-LiFE pruned connectome outperformed both the unpruned connectome and the LiFE pruned connectome in terms of accuracy. We propose our GPU-accelerated approach as a widely relevant tool for non-negative least squares optimization, across many domains.

AAAI Conference 2016 Conference Paper

ClaimEval: Integrated and Flexible Framework for Claim Evaluation Using Credibility of Sources

Mehdi Samadi
Partha Talukdar
Manuela Veloso
Manuel Blum

The World Wide Web (WWW) has become a rapidly growing platform consisting of numerous sources which provide supporting or contradictory information about claims (e.g., “Chicken meat is healthy”). In order to decide whether a claim is true or false, one needs to analyze the content of different sources of information on the Web, measure the credibility of information sources, and aggregate all this information. This is a tedious process and Web search engines address only part of the overall problem, viz., producing only a list of relevant sources. In this paper, we present ClaimEval, a novel and integrated approach which, given a set of claims to validate, extracts a set of pro and con arguments from Web information sources, and jointly estimates the credibility of sources and the correctness of claims. ClaimEval uses Probabilistic Soft Logic (PSL), resulting in a flexible and principled framework which makes it easy to state and incorporate different forms of prior knowledge. Through extensive experiments on real-world datasets, we demonstrate ClaimEval’s capability in determining the validity of a set of claims, resulting in improved accuracy compared to state-of-the-art baselines.

IJCAI Conference 2015 Conference Paper

AskWorld: Budget-Sensitive Query Evaluation for Knowledge-on-Demand

Mehdi Samadi
Partha Talukdar
Manuela Veloso
Tom Mitchell

Recently, several Web-scale knowledge harvesting systems have been built, each of which is competent at extracting information from certain types of data (e.g., unstructured text, structured tables on the web, etc.). In order to determine the response to a new query posed to such systems (e.g., is sugar a healthy food?), it is useful to integrate opinions from multiple systems. If a response is desired within a specific time budget (e.g., in less than 2 seconds), then maybe only a subset of these resources can be queried. In this paper, we address the problem of knowledge integration for on-demand time-budgeted query answering. We propose a new method, AskWorld, which learns a policy that chooses which queries to send to which resources, by accommodating varying budget constraints that are available only at query (test) time. Through extensive experiments on real-world datasets, we demonstrate AskWorld’s capability in selecting the most informative resources to query within test-time constraints, resulting in improved performance compared to competitive baselines.

AAAI Conference 2015 Conference Paper

Never-Ending Learning

Tom Mitchell
William Cohen
Estevam Hruschka
Partha Talukdar
Justin Betteridge
Andrew Carlson
Bhavana Dalvi Mishra
Matthew Gardner

Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning performed by humans. As a case study, we describe the Never-Ending Language Learner (NELL), which achieves some of the desired properties of a never-ending learner, and we discuss lessons learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e.g., servedWith(tea, biscuits)), while learning continually to improve its reading competence over time. NELL has also learned to reason over its knowledge base to infer new beliefs from old ones, and is now beginning to extend its ontology by synthesizing new relational predicates. NELL can be tracked online at http://rtw.ml.cmu.edu, and followed on Twitter at @CMUNELL.

NeurIPS Conference 2008 Conference Paper

Regularized Learning with Networks of Features

Ted Sandler
John Blitzer
Partha Talukdar
Lyle Ungar

For many supervised learning problems, we possess prior knowledge about which features yield similar information about the target variable. In predicting the topic of a document, we might know that two words are synonyms, or when performing image recognition, we know which pixels are adjacent. Such synonymous or neighboring features are near-duplicates and should therefore be expected to have similar weights in a good model. Here we present a framework for regularized learning in settings where one has prior knowledge about which features are expected to have similar and dissimilar weights. This prior knowledge is encoded as a graph whose vertices represent features and whose edges represent similarities and dissimilarities between them. During learning, each feature's weight is penalized by the amount it differs from the average weight of its neighbors. For text classification, regularization using graphs of word co-occurrences outperforms manifold learning and compares favorably to other recently proposed semi-supervised learning methods. For sentiment analysis, feature graphs constructed from declarative human knowledge, as well as from auxiliary task learning, significantly improve prediction accuracy.
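The penalty described in this abstract, where each feature's weight is penalized by how far it lies from the average weight of its neighbors in the feature graph, can be sketched as below. The squared form of the penalty, the unweighted neighbor average, and the name `network_penalty` are assumptions for illustration; the abstract does not give the exact functional form, and dissimilarity edges are not handled here.

```python
def network_penalty(weights, neighbors):
    """Sum of squared gaps between each weight and its neighbors' mean.

    weights:   list of feature weights, indexed by feature id.
    neighbors: dict mapping a feature id to the ids of its graph neighbors.
    Features with no neighbors contribute nothing to the penalty.
    """
    total = 0.0
    for j, w_j in enumerate(weights):
        nbrs = neighbors.get(j, [])
        if not nbrs:
            continue
        neighbor_mean = sum(weights[k] for k in nbrs) / len(nbrs)
        total += (w_j - neighbor_mean) ** 2
    return total


# A path graph over three features: 0 - 1 - 2.
graph = {0: [1], 1: [0, 2], 2: [1]}
smooth = network_penalty([1.0, 1.0, 1.0], graph)
rough = network_penalty([1.0, 0.0, 1.0], graph)
```

In training, a term like this would be added to the usual loss with a tunable coefficient, so that weight vectors that vary smoothly over the feature graph are preferred.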
