Arrow Research search

Author name cluster

Partha Talukdar

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

11

ICLR Conference 2024 Conference Paper

LLM Augmented LLMs: Expanding Capabilities through Composition

  • Rachit Bansal
  • Bidisha Samanta
  • Siddharth Dalmia
  • Nitish Gupta
  • Sriram Ganapathy
  • Abhishek Bapna
  • Prateek Jain
  • Partha Talukdar

Foundational models with billions of parameters which have been trained on large corpus of data have demonstrated non-trivial skills in a variety of domains. However, due to their monolithic structure, it is challenging and expensive to augment them or impart new skills. On the other hand, due to their adaptation abilities,several new instances of these models are being trained towards new domains and tasks. In this work, we study the problem of efficient and practical composition of existing foundation models with more specific models to enable newer capabilities. To this end, we propose CALM—Composition to Augment Language Models—which introduces cross-attention between models to compose their representations and enable new capabilities. Salient features of CALM are: (i) Scales up LLMs on new tasks by ‘re-using’ existing LLMs along with a few additional parameters and data, (ii) Existing model weights are kept intact, and hence preserves existing capabilities, and (iii) Applies to diverse domains and settings. We illustrate that augmenting PaLM2-S with a smaller model trained on low-resource languages results in an absolute improvement of up to 13% on tasks like translation into English and arithmetic reasoning for low-resource languages. Similarly,when PaLM2-S is augmented with a code-specific model, we see a relative improvement of 40% over the base model for code generation and explanation tasks—on-par with fully fine-tuned counterparts.

ICRA Conference 2021 Conference Paper

Spatial Reasoning from Natural Language Instructions for Robot Manipulation

  • Sagar Gubbi Venkatesh
  • Anirban Biswas
  • Raviteja Upadrashta
  • Vikram Srinivasan
  • Partha Talukdar
  • Bharadwaj Amrutur

Robots that can manipulate objects in unstructured environments and collaborate with humans can benefit immensely by understanding natural language. We propose a pipelined architecture of two stages to perform spatial reasoning on the text input. All the objects in the scene are first localized, and then the instruction for the robot in natural language and the localized co-ordinates are mapped to the start and end co-ordinates corresponding to the locations where the robot must pick up and place the object respectively. We show that representing the localized objects by quantizing their positions to a binary grid is preferable to representing them as a list of 2D co-ordinates. We also show that attention improves generalization and can overcome biases in the dataset. The proposed method is used to pick-and-place playing cards using a robot arm.

AAAI Conference 2020 Conference Paper

ASAP: Adaptive Structure Aware Pooling for Learning Hierarchical Graph Representations

  • Ekagra Ranjan
  • Soumya Sanyal
  • Partha Talukdar

Graph Neural Networks (GNN) have been shown to work effectively for modeling graph structured data to solve tasks such as node classification, link prediction and graph classi- fication. There has been some recent progress in defining the notion of pooling in graphs whereby the model tries to generate a graph level representation by downsampling and summarizing the information present in the nodes. Existing pooling methods either fail to effectively capture the graph substructure or do not easily scale to large graphs. In this work, we propose ASAP (Adaptive Structure Aware Pooling), a sparse and differentiable pooling method that addresses the limitations of previous graph pooling architectures. ASAP utilizes a novel self-attention network along with a modified GNN formulation to capture the importance of each node in a given graph. It also learns a sparse soft cluster assignment for nodes at each layer to effectively pool the subgraphs to form the pooled graph. Through extensive experiments on multiple datasets and theoretical analysis, we motivate our choice of the components used in ASAP. Our experimental results show that combining existing GNN architectures with ASAP leads to state-of-the-art results on multiple graph classification benchmarks. ASAP has an average improvement of 4%, compared to current sparse hierarchical state-of-the-art method. We make the source code of ASAP available to encourage reproducible research 1.

AAAI Conference 2020 Conference Paper

InteractE: Improving Convolution-Based Knowledge Graph Embeddings by Increasing Feature Interactions

  • Shikhar Vashishth
  • Soumya Sanyal
  • Vikram Nitin
  • Nilesh Agrawal
  • Partha Talukdar

Most existing knowledge graphs suffer from incompleteness, which can be alleviated by inferring missing links based on known facts. One popular way to accomplish this is to generate low-dimensional embeddings of entities and relations, and use these to make inferences. ConvE, a recently proposed approach, applies convolutional filters on 2D reshapings of entity and relation embeddings in order to capture rich interactions between their components. However, the number of interactions that ConvE can capture is limited. In this paper, we analyze how increasing the number of these interactions affects link prediction performance, and utilize our observations to propose InteractE. InteractE is based on three key ideas – feature permutation, a novel feature reshaping, and circular convolution. Through extensive experiments, we find that InteractE outperforms state-of-the-art convolutional link prediction baselines on FB15k-237. Further, InteractE achieves an MRR score that is 9%, 7. 5%, and 23% better than ConvE on the FB15k-237, WN18RR and YAGO3-10 datasets respectively. The results validate our central hypothesis – that increasing feature interaction is beneficial to link prediction performance. We make the source code of InteractE available to encourage reproducible research.

AAAI Conference 2020 Conference Paper

P-SIF: Document Embeddings Using Partition Averaging

  • Vivek Gupta
  • Ankit Saw
  • Pegah Nokhiz
  • Praneeth Netrapalli
  • Piyush Rai
  • Partha Talukdar

Simple weighted averaging of word vectors often yields effective representations for sentences which outperform sophisticated seq2seq neural models in many tasks. While it is desirable to use the same method to represent documents as well, unfortunately, the effectiveness is lost when representing long documents involving multiple sentences. One of the key reasons is that a longer document is likely to contain words from many different topics; hence, creating a single vector while ignoring all the topical structure is unlikely to yield an effective document representation. This problem is less acute in single sentences and other short text fragments where the presence of a single topic is most likely. To alleviate this problem, we present P-SIF, a partitioned word averaging model to represent long documents. P-SIF retains the simplicity of simple weighted word averaging while taking a document’s topical structure into account. In particular, P- SIF learns topic-specific vectors from a document and finally concatenates them all to represent the overall document. We provide theoretical justifications on the correctness of P-SIF. Through a comprehensive set of experiments, we demonstrate P-SIF’s effectiveness compared to simple weighted averaging and many other baselines.

NeurIPS Conference 2019 Conference Paper

HyperGCN: A New Method For Training Graph Convolutional Networks on Hypergraphs

  • Naganand Yadati
  • Madhav Nimishakavi
  • Prateek Yadav
  • Vikram Nitin
  • Anand Louis
  • Partha Talukdar

In many real-world network datasets such as co-authorship, co-citation, email communication, etc. , relationships are complex and go beyond pairwise. Hypergraphs provide a flexible and natural modeling tool to model such complex relationships. The obvious existence of such complex relationships in many real-world networks naturaly motivates the problem of learning with hypergraphs. A popular learning paradigm is hypergraph-based semi-supervised learning (SSL) where the goal is to assign labels to initially unlabeled vertices in a hypergraph. Motivated by the fact that a graph convolutional network (GCN) has been effective for graph-based SSL, we propose HyperGCN, a novel GCN for SSL on attributed hypergraphs. Additionally, we show how HyperGCN can be used as a learning-based approach for combinatorial optimisation on NP-hard hypergraph problems. We demonstrate HyperGCN's effectiveness through detailed experimentation on real-world hypergraphs. We have made HyperGCN's source code available to foster reproducible research.

AAAI Conference 2019 Conference Paper

ReAl-LiFE: Accelerating the Discovery of Individualized Brain Connectomes on GPUs

  • Sawan Kumar
  • Varsha Sreenivasan
  • Partha Talukdar
  • Franco Pestilli
  • Devarajan Sridharan

Diffusion imaging and tractography enable mapping structural connections in the human brain, in-vivo. Linear Fascicle Evaluation (LiFE) is a state-of-the-art approach for pruning spurious connections in the estimated structural connectome, by optimizing its fit to the measured diffusion data. Yet, LiFE imposes heavy demands on computing time, precluding its use in analyses of large connectome databases. Here, we introduce a GPU-based implementation of LiFE that achieves 50-100x speedups over conventional CPU-based implementations for connectome sizes of up to several million fibers. Briefly, the algorithm accelerates generalized matrix multiplications on a compressed tensor through efficient GPU kernels, while ensuring favorable memory access patterns. Leveraging these speedups, we advance LiFE’s algorithm by imposing a regularization constraint on estimated fiber weights during connectome pruning. Our regularized, accelerated, LiFE algorithm (“ReAl-LiFE”) estimates sparser connectomes that also provide more accurate fits to the underlying diffusion signal. We demonstrate the utility of our approach by classifying pathological signatures of structural connectivity in patients with Alzheimer’s Disease (AD). We estimated million fiber whole-brain connectomes, followed by pruning with ReAl-LiFE, for 90 individuals (45 AD patients and 45 healthy controls). Linear classifiers, based on support vector machines, achieved over 80% accuracy in classifying AD patients from healthy controls based on their ReAl-LiFE pruned structural connectomes alone. Moreover, classification based on the ReAl-LiFE pruned connectome outperformed both the unpruned connectome, as well as the LiFE pruned connectome, in terms of accuracy. We propose our GPU-accelerated approach as a widely relevant tool for non-negative least squares optimization, across many domains.

AAAI Conference 2016 Conference Paper

ClaimEval: Integrated and Flexible Framework for Claim Evaluation Using Credibility of Sources

  • Mehdi Samadi
  • Partha Talukdar
  • Manuela Veloso
  • Manuel Blum

The World Wide Web (WWW) has become a rapidly growing platform consisting of numerous sources which provide supporting or contradictory information about claims (e. g. , “Chicken meat is healthy”). In order to decide whether a claim is true or false, one needs to analyze content of different sources of information on the Web, measure credibility of information sources, and aggregate all these information. This is a tedious process and the Web search engines address only part of the overall problem, viz. , producing only a list of relevant sources. In this paper, we present ClaimEval, a novel and integrated approach which given a set of claims to validate, extracts a set of pro and con arguments from the Web information sources, and jointly estimates credibility of sources and correctness of claims. ClaimEval uses Probabilistic Soft Logic (PSL), resulting in a flexible and principled framework which makes it easy to state and incorporate different forms of prior-knowledge. Through extensive experiments on realworld datasets, we demonstrate ClaimEval’s capability in determining validity of a set of claims, resulting in improved accuracy compared to state-of-the-art baselines.

IJCAI Conference 2015 Conference Paper

AskWorld: Budget-Sensitive Query Evaluation for Knowledge-on-Demand

  • Mehdi Samadi
  • Partha Talukdar
  • Manuela Veloso
  • Tom Mitchell

Recently, several Web-scale knowledge harvesting systems have been built, each of which is competent at extracting information from certain types of data (e. g. , unstructured text, structured tables on the web, etc.). In order to determine the response to a new query posed to such systems (e. g. , is sugar a healthy food?), it is useful to integrate opinions from multiple systems. If a response is desired within a specific time budget (e. g. , in less than 2 seconds), then maybe only a subset of these resources can be queried. In this paper, we address the problem of knowledge integration for on-demand time-budgeted query answering. We propose a new method, AskWorld, which learns a policy that chooses which queries to send to which resources, by accommodating varying budget constraints that are available only at query (test) time. Through extensive experiments on real world datasets, we demonstrate AskWorld’s capability in selecting most informative resources to query within test-time constraints, resulting in improved performance compared to competitive baselines.

AAAI Conference 2015 Conference Paper

Never-Ending Learning

  • Tom Mitchell
  • William Cohen
  • Estevam Hruschka
  • Partha Talukdar
  • Justin Betteridge
  • Andrew Carlson
  • Bhavana Dalvi Mishra
  • Matthew Gardner

Whereas people learn many different types of knowledge from diverse experiences over many years, most current machine learning systems acquire just a single function or data model from just a single data set. We propose a never-ending learning paradigm for machine learning, to better reflect the more ambitious and encompassing type of learning performed by humans. As a case study, we describe the Never- Ending Language Learner (NELL), which achieves some of the desired properties of a never-ending learner, and we discuss lessons learned. NELL has been learning to read the web 24 hours/day since January 2010, and so far has acquired a knowledge base with over 80 million confidence-weighted beliefs (e. g. , servedWith(tea, biscuits)), while learning continually to improve its reading competence over time. NELL has also learned to reason over its knowledge base to infer new beliefs from old ones, and is now beginning to extend its ontology by synthesizing new relational predicates. NELL can be tracked online at http: //rtw. ml. cmu. edu, and followed on Twitter at @CMUNELL.

NeurIPS Conference 2008 Conference Paper

Regularized Learning with Networks of Features

  • Ted Sandler
  • John Blitzer
  • Partha Talukdar
  • Lyle Ungar

For many supervised learning problems, we possess prior knowledge about which features yield similar information about the target variable. In predicting the topic of a document, we might know that two words are synonyms, or when performing image recognition, we know which pixels are adjacent. Such synonymous or neighboring features are near-duplicates and should therefore be expected to have similar weights in a good model. Here we present a framework for regularized learning in settings where one has prior knowledge about which features are expected to have similar and dissimilar weights. This prior knowledge is encoded as a graph whose vertices represent features and whose edges represent similarities and dissimilarities between them. During learning, each feature's weight is penalized by the amount it differs from the average weight of its neighbors. For text classification, regularization using graphs of word co-occurrences outperforms manifold learning and compares favorably to other recently proposed semi-supervised learning methods. For sentiment analysis, feature graphs constructed from declarative human knowledge, as well as from auxiliary task learning, significantly improve prediction accuracy.