Interpretable multi-source data fusion through Latent Variable Gaussian Process

Sandipp Krishnan Ravi; Yigitcan Comlek; Arjun Pathak; Vipul Gupta; Rajnikant Umretiya; Andrew Hoffman; Ghanshyam Pilania; Piyush Pandita; Sayan Ghosh; Nathaniel Mckeever; Wei Chen; Liping Wang

doi:10.1016/j.engappai.2025.110033

EAAI 2025

Journal Article · Applied Artificial Intelligence · Artificial Intelligence

Abstract

With the advent of artificial intelligence and machine learning, many science and engineering communities have adopted data-driven surrogates that model complex systems by fusing information (data) from numerous sources, such as published papers, patents, open repositories, and other available resources. However, very little attention has been paid to how these information sources differ in quality and in the comprehensiveness of their known and unknown underlying physical parameters, differences that could have downstream implications during system optimization. To address this issue, an interpretable multi-source data fusion framework based on the Latent Variable Gaussian Process (LVGP) model is proposed. The individual data sources are first labeled as categorical variables and then mapped into a physically meaningful latent space, enabling the development of a source-aware data fusion model. Additionally, a dissimilarity metric based on the learned latent variables of the LVGP is introduced to study and understand the differences between the data sources. The proposed approach is demonstrated and analyzed on two mathematical and two materials engineering case studies. The case studies show that the proposed multi-source data fusion framework provides more accurate predictions in sparse-data scenarios than single-source or source-unaware data fusion models.
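
The source-aware idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the latent positions are fixed by hand rather than learned, the function names `lvgp_correlation` and `source_dissimilarity` are hypothetical, and the dissimilarity is assumed here to be the Euclidean distance between latent positions, since the abstract does not state the exact metric.

```python
import numpy as np

# Hypothetical 2-D latent positions for three data sources. In an LVGP these
# would be estimated together with the other hyperparameters; here they are
# fixed by hand purely for illustration.
latent = np.array([
    [0.0, 0.0],   # source 0, used as the reference at the origin
    [0.1, 0.0],   # source 1, close to source 0
    [1.5, 0.8],   # source 2, far from both
])

def lvgp_correlation(x1, s1, x2, s2, phi):
    """Gaussian correlation over numerical inputs and latent source positions.

    x1, x2: numerical inputs; s1, s2: integer source labels;
    phi: roughness parameters for the numerical inputs (assumed known).
    """
    x1 = np.asarray(x1, dtype=float)
    x2 = np.asarray(x2, dtype=float)
    d_num = np.sum(phi * (x1 - x2) ** 2)            # distance in input space
    d_lat = np.sum((latent[s1] - latent[s2]) ** 2)  # distance in latent space
    return float(np.exp(-d_num - d_lat))

def source_dissimilarity(z):
    """Pairwise Euclidean distances between latent positions (assumed metric)."""
    diff = z[:, None, :] - z[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

D = source_dissimilarity(latent)
phi = np.array([1.0])
r_near = lvgp_correlation([0.5], 0, [0.5], 1, phi)  # same input, similar sources
r_far = lvgp_correlation([0.5], 0, [0.5], 2, phi)   # same input, dissimilar sources
```

In this sketch, sources whose latent positions lie close together are treated as strongly correlated, so `r_near` exceeds `r_far`, and the matrix `D` gives a direct reading of which sources behave alike.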

Keywords

Latent Variable Gaussian Process
Gaussian process regression
Multi-source modeling
Data fusion
Interpretable Artificial Intelligence
Uncertainty quantification
Probabilistic machine learning

Context

Venue: Engineering Applications of Artificial Intelligence