Author name cluster

Deng Pan

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

1 author row

EAAI Journal 2025 Journal Article

A latent-coupled neural network for multiphysics long-term forecasting in reactor transients using sparse observations

Yu-Yan Xu
Jun Luo
Deng Pan
Wei Lu
Ting Liu
Guanghui Yuan
Minxiao Zhong
Qing Li

Complex dynamical systems in safety-critical applications like nuclear reactors involve strongly coupled physical fields evolving over space and time. Accurate prediction of these fields is vital for safety monitoring but is challenged by limited sensor placement and unobservable variables (e.g., xenon and iodine concentrations). This paper proposes the Sparse observation to High-dimensional coupled physical field Prediction Network (SHPNet), a deep learning framework that predicts and reconstructs multiple physical fields directly from sparse observations. SHPNet combines a three-branch autoencoder to extract shared latent representations with a neural operator that models temporal dynamics in latent space, enabling efficient long-term forecasting. Evaluated on Hua-long Pressurized Reactor (HPR1000) under varying power and burnup conditions, SHPNet outperforms traditional frameworks and end-to-end model, achieving higher accuracy, robustness to observation sparsity, and effective reconstruction of unobservable fields. These results demonstrate SHPNet’s potential as a practical tool for real-time monitoring of complex coupled systems.

Details DOI

EAAI Journal 2025 Journal Article

Connected vehicle following control based on gated recurrent unit with attention mechanism

Shengjie Wang
Deng Pan
Xianda Chen
Zexin Duan
Zehao Xu

A delicate balance between safety, efficiency, and fluidity needs to be carefully maintained in vehicle following, in strict accordance with real-time control imperatives. Achieving efficient vehicle-following operations under safe driving conditions, through smooth behavioral adjustments, presents a significant challenge for data-driven vehicle-following models. In response to this challenge, we have developed a deep neural network based on gated recurrent unit (GRU) with attention mechanism, named AGRUNet model, for artificial intelligence (AI) control of vehicle following behavior. Through training and testing on diverse datasets, the AGRUNet model not only establishes a nonlinear mapping relationship between the following vehicle’s acceleration and the speeds of the leading and following vehicles, their distance, and the control strategy of the leading vehicles but also accurately forecasts the future behaviors of following vehicles in complex vehicle-following scenarios in real-time. This capability enables the following vehicle to optimize its behavior based on the current vehicle-following situation and control requirements, thereby improving safety, efficiency, and smoothness. Rigorous simulations of AGRUNet on the Highway Drone (HighD), Next Generation Simulation (NGSIM), Waymo, and Lyft Level 5 (Lyft) datasets demonstrate its superior performance in prediction accuracy and vehicle-following control. Compared to the widely adopted, high-performance Long Short-Term Memory (LSTM) model, AGRUNet achieves prediction accuracy gains of approximately 2%, 7%, 22%, and 3% across these datasets. Extensive testing further indicates that AGRUNet significantly reduces collision rates during sudden emergency braking by the leading vehicle, enhancing safety, and improving the efficiency and smoothness of behavior adjustments, all while ensuring vehicle-following safety.

Details DOI

IJCAI Conference 2025 Conference Paper

Fast Explanations via Policy Gradient-Optimized Explainer

Deng Pan
Nuno Moniz
Nitesh V. Chawla

The challenge of delivering efficient explanations is a critical barrier that prevents the adoption of model explanations in real-world applications. Existing approaches often depend on extensive model queries for sample-level explanations or rely on expert's knowledge of specific model structures that trade general applicability for efficiency. To address these limitations, this paper introduces a novel framework Fast EXplanation (FEX) that represents attribution-based explanations via probability distributions, which are optimized by leveraging the policy gradient method. The proposed framework offers a robust, scalable solution for real-time, large-scale model explanations, bridging the gap between efficiency and applicability. We validate our framework on image and text classification tasks and the experiments demonstrate that our method reduces inference time by over 97 percent and memory usage by 70 percent compared to traditional model-agnostic approaches while maintaining high-quality explanations and broad applicability.

PDF Details DOI

IJCAI Conference 2025 Conference Paper

Leveraging Artificial Intelligence to Bridge Gaps in Pediatric Oncology Care for Marginalized Spanish-Speaking Communities

Grigorii Khvatskii
Angelica Garcia Martinez
Deng Pan
Matthew Belcher
Gerónimo Medrano Loera
Dayana Pineda Pérez
Juan Emmanuel Ferrari Muñoz-Ledo
Horacio Márquez-González

In low- and middle-income countries (LMICs) pediatric cancer patients and their caregivers often suffer from effects of underfunded, fragmented and outdated healthcare systems. One of these effects is a breakdown of communication between hospital staff and caregivers, which is felt stronger among vulnerable populations. Our proposed solution integrates Large Language Models (LLM) and Automatic Speech Recognition (ASR) technologies to enhance communication between caregivers and healthcare providers while integrating community feedback. We combine cutting-edge technology with existing hospital infrastructure to allow for easy deployment and testing. The system will improve access to health, nutrition, and parental care programs, prioritizing caregiver engagement and real-time interaction. Ultimately, our system will pave the way to more equitable access to medical care, and address structural barriers affecting marginalized communities.

PDF Details DOI

JBHI Journal 2024 Journal Article

MultiModRLBP: A Deep Learning Approach for Multi-Modal RNA-Small Molecule Ligand Binding Sites Prediction

Junkai Wang
Lijun Quan
Zhi Jin
Hongjie Wu
Xuhao Ma
Xuejiao Wang
Jingxin Xie
Deng Pan

This study aims to tackle the intricate challenge of predicting RNA-small molecule binding sites to explore the potential value in the field of RNA drug targets. To address this challenge, we propose the MultiModRLBP method, which integrates multi-modal features using deep learning algorithms. These features include 3D structural properties at the nucleotide base level of the RNA molecule, relational graphs based on overall RNA structure, and rich RNA semantic information. In our investigation, we gathered 851 interactions between RNA and small molecule ligand from the RNAglib dataset and RLBind training set. Unlike conventional training sets, this collection broadened its scope by including RNA complexes that have the same RNA sequence but change their respective binding sites due to structural differences or the presence of different ligands. This enhancement enables the MultiModRLBP model to more accurately capture subtle changes at the structural level, ultimately improving its ability to discern nuances among similar RNA conformations. Furthermore, we evaluated MultiModRLBP on two classic test sets, Test18 and Test3, highlighting its performance disparities on small molecules based on metal and non-metal ions. Additionally, we conducted a structural sensitivity analysis on specific complex categories, considering RNA instances with varying degrees of structural changes and whether they share the same ligands. The research results indicate that MultiModRLBP outperforms the current state-of-the-art methods on multiple classic test sets, particularly excelling in predicting binding sites for non-metal ions and instances where the binding sites are widely distributed along the sequence. MultiModRLBP also can be used as a potential tool when the RNA structure is perturbed or the RNA experimental tertiary structure is not available. Most importantly, MultiModRLBP exhibits the capability to distinguish binding characteristics of RNA that are structurally diverse yet exhibit sequence similarity. These advancements hold promise in reducing the costs associated with the development of RNA-targeted drugs.

Details DOI

EAAI Journal 2024 Journal Article

Physics-informed neural network for simulating magnetic field of coaxial magnetic gear

Shubo Hou
Xiuhong Hao
Deng Pan
Wenchao Wu

In the process of performance analysis and structure optimization of coaxial magnetic gear, an emerging method for precise magnetic field simulation remains a focal point in engineering research. In this study, we introduce a physics-informed neural network to model the magnetic field of a magnetic gear. We employed a physics-based loss function to optimize neural network parameters to solve the magnetic field of a Maxwell-controlled magnetic gear. Additionally, we developed a joint training model that leverages the continuity of the medium interface. The feasibility of the model was confirmed by solving the magnetic field of a permanent magnet, with an error margin of less than 5%. The model exhibited excellent precision in simulating magnetic field behavior within magnetic gears. We demonstrate that adjusting model parameters enables the creation of a proxy model, which effectively addresses analogous problems. Furthermore, leveraging transfer learning substantially diminishes training time for similar tasks, resulting in a 43% reduction in training cost. Finally, we propose an enhanced physical information neural network with data-physical drive fusion and use a special Poisson's equation solution in the magnetized region as a data drive during training. The enhanced physics-informed neural network effectively solved the magnetic field of a magnetic gear, resulting in a 50% improvement in solution accuracy. This study establishes the groundwork for analyzing and optimizing magnetic gears, providing new research insight for electromagnetic practitioners.

Details DOI

AAAI Conference 2023 Conference Paper

Learning Compact Features via In-Training Representation Alignment

Xin Li
Xiangrui Li
Deng Pan
Yao Qiang
Dongxiao Zhu

Deep neural networks (DNNs) for supervised learning can be viewed as a pipeline of the feature extractor (i.e., last hidden layer) and a linear classifier (i.e., output layer) that are trained jointly with stochastic gradient descent (SGD) on the loss function (e.g., cross-entropy). In each epoch, the true gradient of the loss function is estimated using a mini-batch sampled from the training set and model parameters are then updated with the mini-batch gradients. Although the latter provides an unbiased estimation of the former, they are subject to substantial variances derived from the size and number of sampled mini-batches, leading to noisy and jumpy updates. To stabilize such undesirable variance in estimating the true gradients, we propose In-Training Representation Alignment (ITRA) that explicitly aligns feature distributions of two different mini-batches with a matching loss in the SGD training process. We also provide a rigorous analysis of the desirable effects of the matching loss on feature representation learning: (1) extracting compact feature representation; (2) reducing over-adaption on mini-batches via an adaptively weighting mechanism; and (3) accommodating to multi-modalities. Finally, we conduct large-scale experiments on both image and text classifications to demonstrate its superior performance to the strong baselines.

PDF Details DOI

IJCAI Conference 2023 Conference Paper

Negative Flux Aggregation to Estimate Feature Attributions

Xin Li
Deng Pan
Chengyin Li
Yao Qiang
Dongxiao Zhu

There are increasing demands for understanding deep neural networks' (DNNs) behavior spurred by growing security and/or transparency concerns. Due to multi-layer nonlinearity of the deep neural network architectures, explaining DNN predictions still remains as an open problem, preventing us from gaining a deeper understanding of the mechanisms. To enhance the explainability of DNNs, we estimate the input feature's attributions to the prediction task using divergence and flux. Inspired by the divergence theorem in vector analysis, we develop a novel Negative Flux Aggregation (NeFLAG) formulation and an efficient approximation algorithm to estimate attribution map. Unlike the previous techniques, ours doesn't rely on fitting a surrogate model nor need any path integration of gradients. Both qualitative and quantitative experiments demonstrate a superior performance of NeFLAG in generating more faithful attribution maps than the competing methods. Our code is available at https://github.com/xinli0928/NeFLAG.

PDF Details DOI

NeurIPS Conference 2022 Conference Paper

AttCAT: Explaining Transformers via Attentive Class Activation Tokens

Yao Qiang
Deng Pan
Chengyin Li
Xin Li
Rhongho Jang
Dongxiao Zhu

Transformers have improved the state-of-the-art in various natural language processing and computer vision tasks. However, the success of the Transformer model has not yet been duly explained. Current explanation techniques, which dissect either the self-attention mechanism or gradient-based attribution, do not necessarily provide a faithful explanation of the inner workings of Transformers due to the following reasons: first, attention weights alone without considering the magnitudes of feature values are not adequate to reveal the self-attention mechanism; second, whereas most Transformer explanation techniques utilize self-attention module, the skip-connection module, contributing a significant portion of information flows in Transformers, has not yet been sufficiently exploited in explanation; third, the gradient-based attribution of individual feature does not incorporate interaction among features in explaining the model's output. In order to tackle the above problems, we propose a novel Transformer explanation technique via attentive class activation tokens, aka, AttCAT, leveraging encoded features, their gradients, and their attention weights to generate a faithful and confident explanation for Transformer's output. Extensive experiments are conducted to demonstrate the superior performance of AttCAT, which generalizes well to different Transformer architectures, evaluation metrics, datasets, and tasks, to the baseline methods. Our code is available at: https://github.com/qiangyao1988/AttCAT.

PDF Details

IJCAI Conference 2021 Conference Paper

Explaining Deep Neural Network Models with Adversarial Gradient Integration

Deng Pan
Xin Li
Dongxiao Zhu

Deep neural networks (DNNs) have become one of the most high performing tools in a broad range of machine learning areas. However, the multilayer non-linearity of the network architectures prevents us from gaining a better understanding of the models’ predictions. Gradient based attribution methods (e.g., Integrated Gradient (IG)) that decipher input features’ contribution to the prediction task have been shown to be highly effective yet requiring a reference input as the anchor for explaining model’s output. The performance of DNN model interpretation can be quite inconsistent with regard to the choice of references. Here we propose an Adversarial Gradient Integration (AGI) method that integrates the gradients from adversarial examples to the target example along the curve of steepest ascent to calculate the resulting contributions from all input features. Our method doesn’t rely on the choice of references, hence can avoid the ambiguity and inconsistency sourced from the reference selection. We demonstrate the performance of our AGI method and compare with competing methods in explaining image classification results. Code is available from https://github.com/pd90506/AGI.

PDF Details DOI

AAAI Conference 2021 Conference Paper

Improving Adversarial Robustness via Probabilistically Compact Loss with Logit Constraints

Xin Li
Xiangrui Li
Deng Pan
Dongxiao Zhu

Convolutional neural networks (CNNs) have achieved state-of-the-art performance on various tasks in computer vision. However, recent studies demonstrate that these models are vulnerable to carefully crafted adversarial samples and suffer from a significant performance drop when predicting them. Many methods have been proposed to improve adversarial robustness (e.g., adversarial training and new loss functions to learn adversarially robust feature representations). Here we offer a unique insight into the predictive behavior of CNNs that they tend to misclassify adversarial samples into the most probable false classes. This inspires us to propose a new Probabilistically Compact (PC) loss with logit constraints which can be used as a drop-in replacement for cross-entropy (CE) loss to improve CNN’s adversarial robustness. Specifically, PC loss enlarges the probability gaps between true class and false classes meanwhile the logit constraints prevent the gaps from being melted by a small perturbation. We extensively compare our method with the state-of-the-art using large scale datasets under both white-box and black-box attacks to demonstrate its effectiveness. The source codes are available at https://github.com/xinli0928/PC-LC.

PDF Details

IJCAI Conference 2020 Conference Paper

Explainable Recommendation via Interpretable Feature Mapping and Evaluation of Explainability

Deng Pan
Xiangrui Li
Xin Li
Dongxiao Zhu

Latent factor collaborative filtering (CF) has been a widely used technique for recommender system by learning the semantic representations of users and items. Recently, explainable recommendation has attracted much attention from research community. However, trade-off exists between explainability and performance of the recommendation where metadata is often needed to alleviate the dilemma. We present a novel feature mapping approach that maps the uninterpretable general features onto the interpretable aspect features, achieving both satisfactory accuracy and explainability in the recommendations by simultaneous minimization of rating prediction loss and interpretation loss. To evaluate the explainability, we propose two new evaluation metrics specifically designed for aspect-level explanation using surrogate ground truth. Experimental results demonstrate a strong performance in both recommendation and explaining explanation, eliminating the need for metadata. Code is available from https://github.com/pd90506/AMCF.

PDF Details DOI

AAAI Conference 2020 Conference Paper

On the Learning Property of Logistic and Softmax Losses for Deep Neural Networks

Xiangrui Li
Xin Li
Deng Pan
Dongxiao Zhu

Deep convolutional neural networks (CNNs) trained with logistic and softmax losses have made significant advancement in visual recognition tasks in computer vision. When training data exhibit class imbalances, the class-wise reweighted version of logistic and softmax losses are often used to boost performance of the unweighted version. In this paper, motivated to explain the reweighting mechanism, we explicate the learning property of those two loss functions by analyzing the necessary condition (e.g., gradient equals to zero) after training CNNs to converge to a local minimum. The analysis immediately provides us explanations for understanding (1) quantitative effects of the class-wise reweighting mechanism: deterministic effectiveness for binary classification using logistic loss yet indeterministic for multi-class classification using softmax loss; (2) disadvantage of logistic loss for single-label multi-class classification via one-vs.-all approach, which is due to the averaging effect on predicted probabilities for the negative class (e.g., non-target classes) in the learning process. With the disadvantage and advantage of logistic loss disentangled, we thereafter propose a novel reweighted logistic loss for multi-class classification. Our simple yet effective formulation improves ordinary logistic loss by focusing on learning hard non-target classes (target vs. non-target class in one-vs.-all) and turned out to be competitive with softmax loss. We evaluate our method on several benchmark datasets to demonstrate its effectiveness.

PDF Details