Author name cluster

Pengfei Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

32 papers

2 author rows

JBHI Journal 2026 Journal Article

BSN With Explicit Noise-Aware Constraint for Self-Supervised Low-Dose CT Denoising

Pengfei Wang
Danyang Li
Yaoduo Zhang
Gaofeng Chen
Yongbo Wang
Jianhua Ma
Ji He

Although supervised deep learning methods have made significant advances in low-dose computed tomography (LDCT) image denoising, these approaches typically require pairs of low-dose and normal-dose CT images for training, which are often unavailable in clinical settings. Self-supervised deep learning (SSDL) has great potential to remove the dependence on paired training datasets. However, existing SSDL methods are limited by the assumption that noise in neighboring pixels is independent, making them ineffective for handling the spatially correlated noise in LDCT images. To address this issue, this paper introduces a novel SSDL approach, named Noise-Aware Blind Spot Network (NA-BSN), for high-quality LDCT imaging that mitigates the dependence on the assumption of neighboring noise independence. NA-BSN achieves high-quality image reconstruction without referencing clean data through its explicit noise-aware constraint mechanism during the self-supervised learning process. Specifically, it is experimentally observed and theoretically proven that the $\ell_1$ norm value of CT images in a downsampled space follows a descending trend as the radiation dose increases, which is then used to construct the explicit noise-aware constraint in the architecture of BSN for self-supervised LDCT image denoising. Various clinical datasets are adopted to validate the performance of the presented NA-BSN method. Experimental results reveal that NA-BSN significantly reduces the spatially correlated CT noise and retains crucial image details in various complex scenarios, such as different types of scanning machines, scanning positions, dose-level settings, and reconstruction kernels.


AAAI Conference 2026 Conference Paper

Fast Multi-view Consistent 3D Editing with Video Priors

Liyi Chen
Ruihuang Li
Guowen Zhang
Pengfei Wang
Lei Zhang

Text-driven 3D editing enables user-friendly 3D object or scene editing with text instructions. Due to the lack of multi-view consistency priors, existing methods typically resort to employing 2D generation or editing models to process each view individually, followed by iterative 2D-3D-2D updating. However, these methods are not only time-consuming but also prone to yielding over-smoothed results, since the iterative process averages the different editing signals gathered from different views. In this paper, we propose generative Video Prior based 3D Editing (ViP3DE), an early work that repurposes the temporal consistency priors from pre-trained video generation models to achieve consistent 3D editing within a single forward pass. Our key insight is to condition the video generation model on a single edited view to generate other consistent edited views for 3D updating directly, thereby bypassing the iterative editing paradigm. First, 3D updating requires edited views to be paired with specific camera poses. To this end, we propose motion-preserved noise blending for the video model to generate edited views at predefined camera poses. In addition, we introduce geometrically aware denoising to further enhance multi-view consistency by integrating 3D geometric priors into video models. Extensive experiments demonstrate that our proposed ViP3DE can achieve high-quality 3D editing results even within a single forward pass, significantly outperforming existing methods in both editing quality and editing time cost.


TIST Journal 2026 Journal Article

Hierarchical Reinforcement Learning for Energy-Efficient Urban Planning

Yanan Xiao
Lu Jiang
Steven Jige Quan
Pengfei Wang
Minghao Yin
Pengyang Wang

Urban planning plays a pivotal role in fostering sustainable, energy-efficient communities amidst the escalating challenges of climate change and energy crises. However, traditional methods often overlook the hierarchical interdependencies inherent in urban structures, land use configurations, and building design, leading to suboptimal outcomes. Here, we introduce a Hierarchical Energy-Efficient Urban Planning (HEEUP) framework, leveraging hierarchical reinforcement learning (HRL) to integrate decision-making across three tiers: urban structure, land use configuration, and building design. Our approach facilitates collaboration between cascading agents, enabling energy-conscious planning decisions that align with functional and contextual urban demands. Evaluations across real-world datasets from Miami and Albuquerque demonstrate a 7.3% improvement in energy efficiency compared to baseline methods. Furthermore, HEEUP achieves an average 24.52% reduction in computational training time compared to traditional optimization algorithms and generative machine learning models. These findings underscore the potential of HEEUP to significantly advance energy-efficient urban planning, providing a scalable, adaptable paradigm for addressing contemporary sustainability challenges.


YNICL Journal 2026 Journal Article

Neurovascular coupling, cognition, and cardiac function in stroke-free atrial fibrillation

Songhong Yue
Jintao Wang
Jiahao Yan
Wanjun Hu
Jun Wang
Yucheng Ding
Laiyang Ma
Pengfei Wang

OBJECTIVES: Atrial fibrillation (AF) is linked to cognitive impairment even without overt stroke, potentially via neurovascular coupling (NVC) dysfunction. We aimed to determine whether alterations in NVC mediate the relationship between cardiac structural remodeling and cognitive deficits in stroke-free AF (sfAF) patients. METHODS: This prospective cohort study evaluated participants who underwent MR brain scans, transthoracic echocardiography, and completed neuropsychological assessments. We used a neurovascular coupling approach, combining fMRI metrics (amplitude of low-frequency fluctuations (ALFF), fractional ALFF (fALFF), regional homogeneity (ReHo), and degree centrality (DC)) with regional cerebral blood flow maps. Mediation analyses were conducted to explore the relationships among NVC metrics, cardiac parameters, and cognitive performance in sfAF. RESULTS: Compared with healthy controls, the sfAF patients exhibited significant cardiac remodeling, marked by increased LA and LVEDd and decreased LVEF, alongside poorer cognitive performance on AVLT-Immediate and MoCA. They also showed widespread disruptions in neurovascular coupling, with notably reduced fALFF-CBF coupling in the orbital inferior frontal gyrus. Moreover, this reduction in fALFF-CBF coupling mediated the relationship between elevated LVEDd and short-term memory impairment. CONCLUSION: NVC dysfunction mediates the adverse effects of cardiac remodeling on cognitive function in sfAF, supporting a heart-brain axis model as a potential target for early intervention.


AAAI Conference 2026 Conference Paper

scCluBench: Comprehensive Benchmarking of Clustering Algorithms for Single-Cell RNA Sequencing

Ping Xu
Zaitian Wang
Zhirui Wang
Pengjiang Li
Jiajia Wang
Ran Zhang
Pengfei Wang
Yuanchun Zhou

Cell clustering is crucial for uncovering cellular heterogeneity in single-cell RNA sequencing (scRNA-seq) data by identifying cell types and marker genes. Despite its importance, existing benchmarks for scRNA-seq clustering remain fragmented, lacking standardized protocols and often omitting recent advances in artificial intelligence. To fill these gaps, we present scCluBench, a comprehensive benchmark of clustering algorithms for scRNA-seq data. scCluBench provides 36 scRNA-seq datasets collected from diverse public sources, covering multiple tissues, which are uniformly processed to ensure consistency for systematic evaluation and downstream analyses. To assess performance, we collect and reproduce a range of scRNA-seq clustering methods, including traditional, deep learning-based, graph-based, and biological foundation models. We comprehensively evaluate each method both quantitatively and qualitatively, using core performance metrics and visualization analyses. Furthermore, we construct representative downstream biological tasks, such as marker gene identification and cell type annotation, to further assess the practical utility. scCluBench then investigates the performance differences and applicability boundaries of various clustering models across diverse analytical tasks, systematically assessing their robustness and scalability in real-world scenarios. Overall, scCluBench offers a standardized and user-friendly benchmark for scRNA-seq clustering, with standardized datasets, unified evaluation protocols, and transparent analyses, facilitating informed method selection and providing valuable insights into model generalizability and application scope.


IJCAI Conference 2025 Conference Paper

Dynamic and Adaptive Feature Generation with LLM

Xinhao Zhang
Jinghan Zhang
Banafsheh Rekabdar
Yuanchun Zhou
Pengfei Wang
Kunpeng Liu

The representation of feature space is a crucial environment where data points get vectorized and embedded for subsequent modeling. Thus, the efficacy of machine learning (ML) algorithms is closely related to the quality of feature engineering. As one of the most important techniques, feature generation transforms raw data into an optimized feature space conducive to model training and further refines the space. Despite the advancements in automated feature engineering and feature generation, current methodologies often suffer from three fundamental issues: lack of explainability, limited applicability, and inflexible strategy. These shortcomings frequently hinder and limit the deployment of ML models across varied scenarios. Our research introduces a novel approach adopting large language models (LLMs) and feature-generating prompts to address these challenges. We propose a dynamic and adaptive feature generation method that enhances the interpretability of the feature generation process. Our approach broadens the applicability across various data types and tasks and offers advantages in terms of strategic flexibility. A broad range of experiments showcases that our approach is significantly superior to existing methods.


ICRA Conference 2025 Conference Paper

EAR-SLAM: Environment-Aware Robust Localization System for Terrestrial-Aerial Bimodal Vehicles

Wenjun He
Xingpeng Wang
Pengfei Wang
Tianfu Zhang
Chao Xu 0001
Fei Gao 0011
Yanjun Cao

Terrestrial-aerial bimodal vehicles (TABVs) can fly to avoid obstacles and move safely on the ground to save energy, offering enhanced adaptability and flexibility in various challenging environments. However, the lack of a robust localization approach is a bottleneck to stably applying TABVs in real-world tasks. Besides the general limitations of visual SLAM methods, large FoV differences between the two modes, abrupt motion strikes in mode transitions, and unstable attitude in ground mode pose great challenges. In this paper, we present an environment-aware robust localization system specifically designed for passive-wheel-based TABVs, which feature two passive wheels alongside a standard quadrotor. The localization system tightly integrates data from multiple sensors, including a stereo camera, Inertial Measurement Units (IMUs), encoders, and single-point laser distance sensors. First, we introduce a terrain-aware odometer model that accurately estimates the terrain slope and the vehicle's velocity. Then, we propose an anomaly-aware method that senses anomalous sensors and dynamically adjusts the optimization weights accordingly. By explicitly estimating the environmental conditions, such as ground terrain slopes and visual information qualities, the robot can achieve accurate and robust localization results on the ground. To validate our localization approach, we conducted extensive experiments across various challenging scenarios, demonstrating the effectiveness and reliability of our system for real-world applications.


JBHI Journal 2025 Journal Article

Hierarchical Graph Transformer With Contrastive Learning for Gene Regulatory Network Inference

Wentao Cui
Qingqing Long
Wenhao Liu
Chen Fang
Xuezhi Wang
Pengfei Wang
Yuanchun Zhou

Gene regulatory networks (GRNs) are crucial for understanding gene regulation and cellular processes. Inferring GRNs helps uncover regulatory pathways, shedding light on the regulation and development of cellular processes. With the rise of high-throughput sequencing and advancements in computational technology, computational models have emerged as cost-effective alternatives to traditional experimental studies. Moreover, the surge in ChIP-seq data for TF-DNA binding has catalyzed the development of graph neural network (GNN)-based methods, greatly advancing GRN inference capabilities. However, most existing GNN-based methods suffer from the inability to capture long-distance structural semantic correlations due to transitive interactions. In this paper, we introduce a novel GNN-based model named Hierarchical Graph Transformer with Contrastive Learning for GRN (HGTCGRN) inference. HGTCGRN excels at capturing structural semantics using a hierarchical graph Transformer, which introduces a series of gene family nodes representing gene functions as virtual nodes to interact with nodes in the GRNs. These semantic-aware virtual-node embeddings are aggregated to produce node representations with varying emphasis. Additionally, we leverage gene ontology information to construct gene interaction networks for contrastive learning optimization of GRNs. Experimental results demonstrate that HGTCGRN achieves superior performance in GRN inference.


ICRA Conference 2025 Conference Paper

Learning Dynamic Weight Adjustment for Spatial-Temporal Trajectory Planning in Crowd Navigation

Muqing Cao
Xinhang Xu
Yizhuo Yang 0001
Jianping Li 0004
Tongxing Jin
Pengfei Wang
Tzu-Yi Hung
Guosheng Lin

Robot navigation in dense human crowds poses a significant challenge due to the complexity of human behavior in dynamic and obstacle-rich environments. In this work, we propose a dynamic weight adjustment scheme using a neural network to predict the optimal weights of objectives in an optimization-based motion planner. We adopt a spatial-temporal trajectory planner and incorporate diverse objectives to achieve a balance among safety, efficiency, and goal achievement in complex and dynamic environments. We design the network structure, observation encoding, and reward function to effectively train the policy network using reinforcement learning, allowing the robot to adapt its behavior in real time based on environmental and pedestrian information. Simulation results show improved safety compared to the fixed-weight planner and the state-of-the-art learning-based methods, and verify the ability of the learned policy to adaptively adjust the weights based on the observed situations. The feasibility of the approach is demonstrated in a navigation task using an autonomous delivery robot across a crowded corridor over a 300 m distance. Video: https://youtu.be/nSCbNaaF_VM


AAAI Conference 2025 Conference Paper

Motif-Oriented Representation Learning with Topology Refinement for Drug-Drug Interaction Prediction

Ran Zhang
Xuezhi Wang
Guannan Liu
Pengyang Wang
Yuanchun Zhou
Pengfei Wang

Drug-Drug Interaction (DDI) prediction has attracted considerable attention in designing multi-drug combination strategies and avoiding adverse reactions. Notably, Artificial Intelligence (AI)-driven DDI prediction methods have emerged as a pivotal research paradigm. However, most AI-driven DDI prediction methods fall short in exploring intra-molecular motifs, and heavily rely on the overly idealized assumption of the complete inter-molecular topology, limiting their expressive capacities. To this end, we propose a Motif-Oriented representation learning with TOpology Refinement for DDI prediction, namely MOTOR, to exploit both the multi-granularity motif information and the topological structure of DDI networks. Specifically, MOTOR effectively captures motif internal structures, motif local contexts, and motif global semantics. Furthermore, MOTOR employs an iterative learning strategy to continuously refine the DDI topology and optimize the corresponding drug representations. Extensive experimental results demonstrate that MOTOR exhibits superior performance with interpretable insights in DDI prediction tasks across three real-world datasets, thereby opening up new avenues in AI-driven DDI prediction.


IJCAI Conference 2025 Conference Paper

Rethinking Graph Contrastive Learning Through Relative Similarity Preservation

Zhiyuan Ning
Pengfei Wang
Ziyue Qiao
Pengyang Wang
Yuanchun Zhou

Graph contrastive learning (GCL) has achieved remarkable success by following the computer vision paradigm of preserving absolute similarity between augmented views. However, this approach faces fundamental challenges in graphs due to their discrete, non-Euclidean nature -- view generation often breaks semantic validity and similarity verification becomes unreliable. Through analyzing 11 real-world graphs, we discover a universal pattern transcending the homophily-heterophily dichotomy: label consistency systematically diminishes as structural distance increases, manifesting as smooth decay in homophily graphs and oscillatory decay in heterophily graphs. We establish theoretical guarantees for this pattern through random walk theory, proving label distribution convergence and characterizing the mechanisms behind different decay behaviors. This discovery reveals that graphs naturally encode relative similarity patterns, where structurally closer nodes exhibit collectively stronger semantic relationships. Leveraging this insight, we propose RELGCL, a novel GCL framework with complementary pairwise and listwise implementations that preserve these inherent patterns through collective similarity objectives. Extensive experiments demonstrate that our method consistently outperforms 20 existing approaches across both homophily and heterophily graphs, validating the effectiveness of leveraging natural relative similarity over artificial absolute similarity.
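
The pattern described in this abstract, label consistency falling as structural distance grows, can be measured on any labeled graph. The following is a minimal sketch of one way to do so, not the authors' code: the function names, the toy path graph, and its labels are all hypothetical, and "structural distance" is assumed here to mean shortest-path hop count.

```python
from collections import deque


def hop_distances(adj, source):
    """Breadth-first search: shortest hop count from source to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def label_consistency_by_hop(adj, labels, max_hop):
    """For each hop k in 1..max_hop, the fraction of ordered node pairs
    at distance exactly k that share the same label."""
    same = [0] * (max_hop + 1)
    total = [0] * (max_hop + 1)
    for u in adj:
        for v, d in hop_distances(adj, u).items():
            if 1 <= d <= max_hop:
                total[d] += 1
                same[d] += labels[u] == labels[v]
    return {k: same[k] / total[k] for k in range(1, max_hop + 1) if total[k]}


# Hypothetical toy data: a path graph 0-1-2-3-4-5 with two label blocks.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
labels = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b", 5: "b"}
consistency = label_consistency_by_hop(adj, labels, max_hop=3)
print(consistency)  # {1: 0.8, 2: 0.5, 3: 0.0}
```

On this homophilous toy graph the consistency decays smoothly with hop distance; the oscillatory decay reported for heterophily graphs would need a different labeling to reproduce.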


IJCAI Conference 2025 Conference Paper

scSiameseClu: A Siamese Clustering Framework for Interpreting Single-cell RNA Sequencing Data

Ping Xu
Zhiyuan Ning
Pengjiang Li
Wenhao Liu
Pengyang Wang
Jiaxu Cui
Yuanchun Zhou
Pengfei Wang

Single-cell RNA sequencing (scRNA-seq) reveals cell heterogeneity, with cell clustering playing a key role in identifying cell types and marker genes. Recent advances, especially graph neural network (GNN)-based methods, have significantly improved clustering performance. However, the analysis of scRNA-seq data remains challenging due to noise, sparsity, and high dimensionality. Compounding these challenges, GNNs often suffer from over-smoothing, limiting their ability to capture complex biological information. In response, we propose scSiameseClu, a novel Siamese Clustering framework for interpreting single-cell RNA-seq data, comprising three key steps: (1) Dual Augmentation Module, which applies biologically informed perturbations to the gene expression matrix and cell graph relationships to enhance representation robustness; (2) Siamese Fusion Module, which combines cross-correlation refinement and adaptive information fusion to capture complex cellular relationships while mitigating over-smoothing; and (3) Optimal Transport Clustering, which utilizes Sinkhorn distance to efficiently align cluster assignments with predefined proportions while maintaining balance. Comprehensive evaluations on seven real-world datasets demonstrate that scSiameseClu outperforms state-of-the-art methods in single-cell clustering, cell type annotation, and cell type classification, providing a powerful tool for scRNA-seq data interpretation.
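
Step (3) relies on entropy-regularized optimal transport to match soft cluster assignments to target proportions. As a rough illustration of that general idea only, the sketch below runs plain Sinkhorn-Knopp scaling in pure Python. It is not the scSiameseClu implementation: the function name, the `epsilon` value, the iteration count, and the toy scores are assumptions made for this example.

```python
import math


def sinkhorn_assign(scores, proportions, epsilon=0.5, n_iters=200):
    """Return a transport plan whose rows (cells) each carry mass 1/n and whose
    columns (clusters) sum to the requested proportions.

    scores[i][j] is a hypothetical affinity of cell i for cluster j.
    """
    n, k = len(scores), len(proportions)
    # Gibbs kernel: a higher affinity gives a larger entry.
    kernel = [[math.exp(s / epsilon) for s in row] for row in scores]
    row_target = 1.0 / n
    u = [1.0] * n
    v = [1.0] * k
    for _ in range(n_iters):
        # Rescale rows so every cell carries total mass 1/n.
        u = [row_target / sum(kernel[i][j] * v[j] for j in range(k)) for i in range(n)]
        # Rescale columns so every cluster receives its target proportion.
        v = [proportions[j] / sum(kernel[i][j] * u[i] for i in range(n)) for j in range(k)]
    return [[u[i] * kernel[i][j] * v[j] for j in range(k)] for i in range(n)]


# Hypothetical toy input: three of four cells prefer cluster 0,
# but the clusters are required to be balanced.
scores = [[0.9, 0.1], [0.8, 0.2], [0.7, 0.3], [0.2, 0.8]]
proportions = [0.5, 0.5]
plan = sinkhorn_assign(scores, proportions)
for row in plan:
    print([round(4 * p, 3) for p in row])  # per-cell soft assignment
```

The alternating row and column rescaling is what enforces the balance constraint: a cluster that too many cells prefer has its column scaled down, pushing the least committed cells toward the under-filled cluster.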


JBHI Journal 2025 Journal Article

SDPR: Prescription Recommendation With Syndrome Differentiation in Traditional Chinese Medicine

Wenjing Yue
Wendi Ji
Xinyu Wang
Xin Ma
Pengfei Wang
Xiaoling Wang

Prescription recommendation is critical for clinical decision support in Traditional Chinese Medicine (TCM), aiming to recommend a herb set based on a patient's symptoms. The core principle of TCM clinical practice, treatment based on syndrome differentiation (SD), follows a four-step progressive process: from symptoms to syndromes, then to therapeutic methods, and finally to herbs. However, existing models oversimplify this process by overlooking therapeutic methods, directly mapping symptoms to herbs or syndromes to herbs, resulting in information loss and reducing the effectiveness of recommended prescriptions. Furthermore, the implicit, sparse, and many-to-many relationships between syndromes and therapeutic methods, coupled with the nonlinear interactions between therapeutic methods and herbs, further hinder the modeling of the complete SD process. To address these challenges, we propose a novel four-partite graph paradigm that explicitly models the four key components of SD and their interactions, preserving critical information at each step and aligning more closely with clinicians' decision-making logic. Building on this, we develop SDPR, an SD-based prescription recommendation model comprising four modules aligned with all SD steps. We then integrate them into a multi-task learning framework to fully capture the progressive prescription process. To handle the implicit and complex relationships among syndromes, therapeutic methods, and herbs, we introduce a syndrome-induced pre-training strategy and a therapeutic method-aware contrastive learning framework. Extensive experiments on public and real-world datasets validate SDPR's effectiveness in herb recommendation and prescription retrieval, confirming the strength of the four-partite graph paradigm. Our broader goal is to advance the intelligent development of TCM in healthcare.


TIST Journal 2025 Journal Article

Self-optimizing Teacher and Auto-matching Student Framework for Change-point Representation Learning in Time Series Forecasting

Jinxiao Fan
Pengfei Wang
Liang Liu
Huadong Ma

Real-world time series data is inherently complex, noisy, and exhibits abrupt changes, posing various challenges in data modeling. Given the ubiquity and importance of time-series data, accurately forecasting change points, rather than optimizing overall predictive performance, has become increasingly attractive as it assists in risk mitigation and loss prevention. In this task, we argue that the past and future interactions involving the target points determine the comprehensive structure contributing to abrupt changes. However, traditional left-to-right auto-regressive approaches only consider the historical sequence, resulting in a flawed learning process and limited performance. In this paper, we extend teacher-student learning and propose a novel Self-optimizing Teacher and Auto-matching Student framework (named ST-AS) to predict change points in time series data. Our framework models change point representations specific to the target points by integrating future knowledge while avoiding data leakage. Specifically, we design a Gumbel-enhanced filter for our self-optimizing teacher, which constructs selected and filtered sub-groups to derive discriminative representations using a positive-unlabeled learning strategy. Given this well-trained teacher, we propose an adaptive pattern matcher for our auto-matching student model, which learns missing information by automatically aligning relevant features. A novel two-stage dual-guided learning process is then designed to mimic the teacher’s decision-making behavior and enhance the student’s excavation capability. Finally, we conduct extensive experiments on four real-world datasets to demonstrate that our proposed ST-AS exhibits significantly better prediction performance compared to existing state-of-the-art alternatives.


IJCAI Conference 2025 Conference Paper

State Feedback Enhanced Graph Differential Equations for Multivariate Time Series Forecasting

Jiaxu Cui
Qipeng Wang
Yiming Zhao
Bingyi Sun
Pengfei Wang
Bo Yang

Multivariate time series forecasting holds significant theoretical and practical importance in various fields, including web analytics and transportation. Recently, graph neural networks and graph differential equations have shown exceptional capabilities in modeling spatio-temporal features. However, existing methods often suffer from over-smoothing, hindering real-world problem-solving. In this work, we analyze the graph propagation process as a dynamical system and propose a novel feedback mechanism to enhance representation power, adaptively adjusting the representations to align with desired performance outcomes, thereby fundamentally mitigating the issue of over-smoothing. Moreover, we introduce an effective multivariate time series forecasting model called SF-GDE, based on the proposed graph propagation with the feedback mechanism. Extensive experiments are conducted on three real-world datasets from diverse fields. Results show that SF-GDE outperforms the state of the art, and the feedback mechanism can serve as a universal booster to improve performance for graph propagation models.


TIST Journal 2025 Journal Article

Towards Predicting Urban Land Use Changes: A Dynamic Graph Alignment Perspective

Yu Fan
Xinjiang Lu
Hao Liu
Pengfei Wang
Liang Liu
Huadong Ma
Jingbo Zhou

Urban land use, intrinsically linked to people’s daily activities, undergoes continuous evolution, presenting a complex interplay that remains partially understood. To bridge this gap, our study leverages fine-grained human mobility data to predict these changes, adopting a novel approach that conceptualizes “community-level” land use shifts as a regression problem and represents citywide changes through dynamic graphs. We harness recent advancements in graph neural networks (GNNs), which, despite their success in various applications, face challenges in directly predicting land use changes due to the temporal mismatch between the slow evolution of urban land and the immediacy of human mobility data. Our research stands out by introducing a temporal skeleton for dynamic GNNs to synchronize human activity graphs with urban land use changes, a dynamic heterogeneous GNN approach for integrating diverse human activity data to capture essential temporal dependencies, and a novel algorithm powered by causal inference to elucidate the primary factors influencing land use predictions at the community level, all of which contribute to a training process informed by the generated causal graph. Empirically validated on three real-world datasets, our model demonstrates a performance leap over state-of-the-art baselines, marking a pivotal step toward understanding and predicting the dynamics of urban land use.


IJCAI Conference 2024 Conference Paper

FedGCS: A Generative Framework for Efficient Client Selection in Federated Learning via Gradient-based Optimization

Zhiyuan Ning
Chunlin Tian
Meng Xiao
Wei Fan
Pengyang Wang
Li Li
Pengfei Wang
Yuanchun Zhou

Federated Learning faces significant challenges in statistical and system heterogeneity, along with high energy consumption, necessitating efficient client selection strategies. Traditional approaches, including heuristic and learning-based methods, fall short of addressing these complexities holistically. In response, we propose FedGCS, a novel generative client selection framework that innovatively recasts the client selection process as a generative task. Drawing inspiration from the methodologies used in large language models, FedGCS efficiently encodes abundant decision-making knowledge within a continuous representation space, enabling efficient gradient-based optimization to search for the optimal client selection, which is finally output via generation. The framework comprises four steps: (1) automatic collection of diverse “selection-score” pair data using classical client selection methods; (2) training an encoder-evaluator-decoder framework on this data to construct a continuous representation space; (3) employing gradient-based optimization in this space for optimal client selection; (4) generating the final optimal client selection by running beam search on the well-trained decoder. FedGCS outperforms traditional methods by being more comprehensive, generalizable, and efficient, simultaneously optimizing for model performance, latency, and energy consumption. The effectiveness of FedGCS is proven through extensive experimental analyses.


IJCAI Conference 2024 Conference Paper

Finite-Time Convergence Rates of Decentralized Local Markovian Stochastic Approximation

Pengfei Wang
Nenggan Zheng

Markovian stochastic approximation has recently aroused a great deal of interest in many fields; however, it is not well understood in decentralized settings. Decentralized Markovian stochastic approximation is far more challenging than its single-agent counterpart due to the complex coupling structure between decentralized communication and Markovian noise-corrupted local updates. In this paper, a decentralized local Markovian stochastic approximation (DLMSA) algorithm is proposed that attains a near-optimal convergence rate. Specifically, we first provide a local variant of decentralized Markovian stochastic approximation so that each agent performs multiple local updates and then periodically communicates with its neighbors. Furthermore, we propose DLMSA with compressed communication (C-DLMSA) to further reduce the communication overhead. In this way, each agent only needs to communicate compressed information (e.g., sign compression) with its neighbors. We show that C-DLMSA enjoys the same convergence rate as that of the original DLMSA. Finally, we verify our theoretical results by applying our methods to solve multi-task reinforcement learning problems over multi-agent systems.
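
The abstract names sign compression as an example of what agents exchange. A common form of that operator, shown below as a minimal sketch, sends only the sign of each coordinate plus one shared magnitude. Whether C-DLMSA uses exactly this scaling is not stated in the abstract, so the mean-absolute-value scale and the function name `sign_compress` are assumptions made for illustration.

```python
def sign_compress(vec):
    """Scaled sign compression (assumed variant): replace each coordinate by its
    sign times the mean absolute value of the vector.

    Only one float (the scale) and one sign per coordinate need to be sent.
    Assumes vec is non-empty.
    """
    scale = sum(abs(x) for x in vec) / len(vec)
    return [scale * (1 if x > 0 else -1 if x < 0 else 0) for x in vec]


# Hypothetical local update that an agent would send to its neighbors.
update = [2.0, -4.0, 0.0, 6.0]
compressed = sign_compress(update)
print(compressed)  # [3.0, -3.0, 0.0, 3.0]
```

The compressed vector keeps the direction of every coordinate and the overall L1 mass of the original, which is why such operators can preserve convergence behavior while cutting the message size to roughly one bit per coordinate.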


IJCAI Conference 2024 Conference Paper

Make Graph Neural Networks Great Again: A Generic Integration Paradigm of Topology-Free Patterns for Traffic Speed Prediction

Yicheng Zhou
Pengfei Wang
Hao Dong
Denghui Zhang
Dingqi Yang
Yanjie Fu
Pengyang Wang

Urban traffic speed prediction aims to estimate the future traffic speed for improving urban transportation services. Enormous efforts have been made to exploit Graph Neural Networks (GNNs) for modeling spatial correlations and temporal dependencies of traffic speed evolving patterns, regularized by graph topology. While achieving promising results, current traffic speed prediction methods still suffer from ignoring topology-free patterns, which cannot be captured by GNNs. To tackle this challenge, we propose a generic model for enabling the current GNN-based methods to preserve topology-free patterns. Specifically, we first develop a Dual Cross-Scale Transformer (DCST) architecture, including a Spatial Transformer and a Temporal Transformer, to preserve the cross-scale topology-free patterns and associated dynamics, respectively. Then, to further integrate both topology-regularized and topology-free patterns, we propose a distillation-style learning framework, in which the existing GNN-based methods are considered as the teacher model, and the proposed DCST architecture is considered as the student model. The teacher model injects the learned topology-regularized patterns into the student model for integrating topology-free patterns. Extensive experimental results demonstrate the effectiveness of our methods.

PDF Details DOI

AAAI Conference 2024 Conference Paper

Question Calibration and Multi-Hop Modeling for Temporal Question Answering

Chao Xue
Di Liang
Pengfei Wang
Jing Zhang

Many models that leverage knowledge graphs (KGs) have recently demonstrated remarkable success in question answering (QA) tasks. In the real world, many facts contained in KGs are time-constrained, and thus temporal KGQA has received increasing attention. Despite the fruitful efforts of previous models in temporal KGQA, they still have several limitations. (I) They adopt pre-trained language models (PLMs) to obtain question representations, while PLMs tend to focus on entity information and ignore entity transfer caused by temporal constraints, and finally fail to learn specific temporal representations of entities. (II) They neither emphasize the graph structure between entities nor explicitly model the multi-hop relationship in the graph, which makes it difficult to solve complex multi-hop question answering. To alleviate these problems, we propose a novel Question Calibration and Multi-Hop Modeling (QC-MHM) network. Specifically, we first calibrate the question representation by fusing the question and the time-constrained concepts in the KG. Then, we construct the GNN layer to complete multi-hop message passing. Finally, the question representation is combined with the embedding output by the GNN to generate the final prediction. Empirical results verify that the proposed model achieves better performance than the state-of-the-art models on the benchmark dataset. Notably, the Hits@1 and Hits@10 results of QC-MHM on the CronQuestions dataset's complex questions are improved by 5.1% and 1.2% in absolute terms compared to the best-performing baseline. Moreover, QC-MHM can generate interpretable and trustworthy predictions.

PDF Details DOI

IJCAI Conference 2024 Conference Paper

SCAT: A Time Series Forecasting with Spectral Central Alternating Transformers

Chengjie Zhou
Chao Che
Pengfei Wang
Qiang Zhang

Time series forecasting has essential applications across various domains. For instance, forecasting power time series can optimize energy usage and bolster grid stability and reliability. Existing models based on transformer architecture are limited to classical design, ignoring the impact of spatial information and noise on model architecture design. Therefore, we propose an atypical design of Transformer-based models for multivariate time series forecasting. This design consists of two critical components: (i) spectral clustering center of time series employed as the focal point for attention computation; (ii) alternating attention mechanism wherein each query transformer is compatible with spectral clustering centers, executing attention at the sequence level instead of the token level. The alternating design has a two-fold benefit: firstly, it eliminates the uncertainty noise present in the dependent variable sequence of the channel input, and secondly, it incorporates the Euclidean distance to mitigate the impact of extreme values on the attention matrix, thereby aligning predictions more closely to the sequence's natural progression. Experiments on ten real-world datasets, encompassing Wind, Electricity, Weather, and others, demonstrate that our Spectral Central Alternating Transformer (SCAT) outperforms state-of-the-art methods (SOTA) by an average of 42.8% in prediction performance in power time series forecasting.

PDF Details DOI

NeurIPS Conference 2024 Conference Paper

Semi-supervised Multi-label Learning with Balanced Binary Angular Margin Loss

Ximing Li
Silong Liang
Changchun Li
Pengfei Wang
Fangming Gu

Semi-supervised multi-label learning (SSMLL) refers to inducing classifiers using a small number of samples with multiple labels and many unlabeled samples. The prevalent solution of SSMLL involves forming pseudo-labels for unlabeled samples and inducing classifiers using both labeled and pseudo-labeled samples in a self-training manner. Unfortunately, with the commonly used binary type of loss and negative sampling, we have empirically found that learning with labeled and pseudo-labeled samples can result in the variance bias problem between the feature distributions of positive and negative samples for each label. To alleviate this problem, we aim to balance the variance bias between positive and negative samples from the perspective of the feature angle distribution for each label. Specifically, we extend the traditional binary angular margin loss to a balanced extension with feature angle distribution transformations under the Gaussian assumption, where the distributions are iteratively updated during classifier training. We also suggest an efficient prototype-based negative sampling method to maintain high-quality negative samples for each label. With this insight, we propose a novel SSMLL method, namely Semi-Supervised Multi-Label Learning with Balanced Binary Angular Margin loss (S$^2$ML$^2$-BBAM). To evaluate the effectiveness of S$^2$ML$^2$-BBAM, we compare it with existing competitors on benchmark datasets. The experimental results validate that S$^2$ML$^2$-BBAM can achieve very competitive performance.

PDF Details DOI

AIJ Journal 2024 Journal Article

Temporal inductive path neural network for temporal knowledge graph reasoning

Hao Dong
Pengyang Wang
Meng Xiao
Zhiyuan Ning
Pengfei Wang
Yuanchun Zhou

Temporal Knowledge Graph (TKG) is an extension of traditional Knowledge Graph (KG) that incorporates the dimension of time. Reasoning on TKGs is a crucial task that aims to predict future facts based on historical occurrences. The key challenge lies in uncovering structural dependencies within historical subgraphs and temporal patterns. Most existing approaches model TKGs relying on entity modeling, as nodes in the graph play a crucial role in knowledge representation. However, the real-world scenario often involves an extensive number of entities, with new entities emerging over time. This makes it challenging for entity-dependent methods to cope with extensive volumes of entities, and effectively handling newly emerging entities also becomes a significant challenge. Therefore, we propose Temporal Inductive Path Neural Network (TiPNN), which models historical information in an entity-independent perspective. Specifically, TiPNN adopts a unified graph, namely history temporal graph, to comprehensively capture and encapsulate information from history. Subsequently, we utilize the defined query-aware temporal paths on a history temporal graph to model historical path information related to queries for reasoning. Extensive experiments illustrate that the proposed model not only attains significant performance enhancements but also handles inductive settings, while additionally facilitating the provision of reasoning evidence through history temporal graphs.

Details DOI

IJCAI Conference 2024 Conference Paper

TFWT: Tabular Feature Weighting with Transformer

Xinhao Zhang
Zaitian Wang
Lu Jiang
Wanfu Gao
Pengfei Wang
Kunpeng Liu

In this paper, we propose a novel feature weighting method to address the limitation of existing feature processing methods for tabular data. Typically, the existing methods assume equal importance across all samples and features in one dataset. These simplified processing methods overlook the unique contributions of each feature, and thus may miss important feature information. As a result, they lead to suboptimal performance in complex datasets with rich features. To address this problem, we introduce Tabular Feature Weighting with Transformer (TFWT), a novel feature weighting approach for tabular data. Our method adopts Transformer to capture complex feature dependencies and contextually assign appropriate weights to discrete and continuous features. Besides, we employ a reinforcement learning strategy to further fine-tune the weighting process. Our extensive experimental results across various real-world datasets and diverse downstream tasks show the effectiveness of TFWT and highlight the potential for enhancing feature weighting in tabular data analysis.

PDF Details DOI

IJCAI Conference 2023 Conference Paper

Adaptive Path-Memory Network for Temporal Knowledge Graph Reasoning

Hao Dong
Zhiyuan Ning
Pengyang Wang
Ziyue Qiao
Pengfei Wang
Yuanchun Zhou
Yanjie Fu

Temporal knowledge graph (TKG) reasoning aims to predict the future missing facts based on historical information and has gained increasing research interest recently. Much work has been done to model the historical structural and temporal characteristics for the reasoning task. Most existing works model the graph structure mainly depending on entity representation. However, the magnitude of TKG entities in real-world scenarios is considerable, and an increasing number of new entities will arise as time goes on. Therefore, we propose a novel architecture modeling with relation feature of TKG, namely aDAptivE path-MemOry Network (DaeMon), which adaptively models the temporal path information between query subject and each object candidate across history time. It models the historical information without depending on entity representation. Specifically, DaeMon uses path memory to record the temporal path information derived from path aggregation unit across timeline considering the memory passing strategy between adjacent timestamps. Extensive experiments conducted on four real-world TKG datasets demonstrate that our proposed model obtains substantial performance improvement and outperforms the state-of-the-art by up to 4.8% absolute in MRR.

PDF Details DOI

TIST Journal 2023 Journal Article

Configure Your Federation: Hierarchical Attention-enhanced Meta-Learning Network for Personalized Federated Learning

Yujia Gao
Pengfei Wang
Liang Liu
Chi Zhang
Huadong Ma

Federated learning, as a distributed machine learning framework, enables clients to conduct model training without transmitting their data to the server, which is used to solve the dilemma of data silos and data privacy. It can work well on clients having similar data characteristics and distribution. However, it has some limitations where the dataset of clients may be different in distribution, quantity, and concept in many application scenarios. Personalized federated learning is a new federated learning paradigm that aims to guarantee client personalized models’ effectiveness when collaborating with the cloud server. Intuitively, providing further facilitated collaborations for the clients with similar data characteristics and distribution can benefit personalized model building. However, due to the invisibility of client data, it is challenging to extract client characteristics and define collaborative relationships among them from a fine-grained view. Moreover, a reasonable collaborative training approach needs to be designed for a distributed server–client framework. In this article, we design a Hierarchical Attention-enhanced Meta-learning Network (HAM) to address this issue. The main advantage of HAM is that it utilizes the meta-learning approach of taking model parameters as features and learns to learn an extra model for each client to analyze similarities according to their local dataset automatically. According to its two-layers framework, HAM can reasonably achieve a tradeoff between clients’ personality and commonality and provides a hybrid model with useful information from all clients. Considering there are two networks (HAM and base network) that need to learn for each client during the federated training process, we then provide an alternative learning approach to train them in an end-to-end fashion. 
To further clarify the approach, we describe the personalized federated learning framework as FedHAM, where the HAM network is deployed in a distributed manner on each client. Extensive experiments based on two datasets show that our method outperforms state-of-the-art baselines under different evaluation metrics.

Details DOI

NeurIPS Conference 2023 Conference Paper

Reinforcement-Enhanced Autoregressive Feature Transformation: Gradient-steered Search in Continuous Space for Postfix Expressions

Dongjie Wang
Meng Xiao
Min Wu
Pengfei Wang
Yuanchun Zhou
Yanjie Fu

Feature transformation aims to generate new pattern-discriminative feature space from original features to improve downstream machine learning (ML) task performances. However, the discrete search space for the optimal feature explosively grows on the basis of combinations of features and operations from low-order forms to high-order forms. Existing methods, such as exhaustive search, expansion reduction, evolutionary algorithms, reinforcement learning, and iterative greedy, suffer from large search space. Overly emphasizing efficiency in algorithm design usually sacrifices stability or robustness. To fundamentally fill this gap, we reformulate discrete feature transformation as a continuous space optimization task and develop an embedding-optimization-reconstruction framework. This framework includes four steps: 1) reinforcement-enhanced data preparation, aiming to prepare high-quality transformation-accuracy training data; 2) feature transformation operation sequence embedding, intending to encapsulate the knowledge of prepared training data within a continuous space; 3) gradient-steered optimal embedding search, dedicated to uncovering potentially superior embeddings within the learned space; 4) transformation operation sequence reconstruction, striving to reproduce the feature transformation solution to pinpoint the optimal feature space. Finally, extensive experiments and case studies are performed to demonstrate the effectiveness and robustness of the proposed method. The code and data are publicly accessible at https://www.dropbox.com/sh/imh8ckui7va3k5u/AACulQegVx0MuywYyoCqSdVPa?dl=0.

PDF Details

AAAI Conference 2021 Conference Paper

Context-Guided Adaptive Network for Efficient Human Pose Estimation

Lei Zhao
Jun Wen
Pengfei Wang
Nenggan Zheng

Although recent work has achieved great progress in human pose estimation (HPE), most methods show limitations in either inference speed or accuracy. In this paper, we propose a fast and accurate end-to-end HPE method, which is specifically designed to overcome the commonly encountered jitter box, defective box and ambiguous box problems of box-based methods, e.g., Mask R-CNN. Concretely, 1) we propose the ROIGuider to aggregate box instance features from all feature levels under the guidance of global context instance information. Further, 2) the proposed Center Line Branch is equipped with a Dichotomy Extended Area algorithm to adaptively expand each instance box area, and an Ambiguity Alleviation strategy to eliminate duplicated keypoints. Finally, 3) to achieve efficient multi-scale feature fusion and real-time inference, we design a novel Trapezoidal Network (TNet) backbone. In experiments on the COCO dataset, our method achieves 68.1 AP at 25.4 fps, and outperforms Mask R-CNN by 8.9 AP at a similar speed. The competitive performance on the HPE and person instance segmentation tasks over the state-of-the-art models shows the promise of the proposed method. The source code will be made available at https://github.com/zlcnup/CGANet.

PDF Details

IJCAI Conference 2021 Conference Paper

Pattern-enhanced Contrastive Policy Learning Network for Sequential Recommendation

Xiaohai Tong
Pengfei Wang
Chenliang Li
Long Xia
Shaozhang Niu

Sequential recommendation aims to predict users’ future behaviors given their historical interactions. However, due to the randomness and diversity of a user’s behaviors, not all historical items are informative about his/her next choice. Identifying relevant items and extracting meaningful sequential patterns are therefore necessary for a better recommendation. Unfortunately, few works have focused on this sequence denoising process. In this paper, we propose a PatteRn-enhanced ContrAstive Policy Learning Network (RAP for short) for sequential recommendation. RAP formalizes the denoising problem in the form of a Markov Decision Process (MDP), and samples actions for each item to determine whether it is relevant to the target item. To tackle the lack of relevance supervision, RAP fuses a series of mined sequential patterns into the policy learning process, which serve as prior knowledge to guide the denoising process. After that, RAP splits the initial item sequence into two disjoint subsequences: a positive subsequence and a negative subsequence. On this basis, a novel contrastive learning mechanism is introduced to guide the sequence denoising and achieve preference estimation from the positive subsequence simultaneously. Extensive experiments on four public real-world datasets demonstrate the effectiveness of our approach for sequential recommendation.

PDF Details DOI

AAAI Conference 2021 Conference Paper

PGNet: Real-time Arbitrarily-Shaped Text Spotting with Point Gathering Network

Pengfei Wang
Chengquan Zhang
Fei Qi
Shanshan Liu
Xiaoqiang Zhang
Pengyuan Lyu
Junyu Han
Jingtuo Liu

The reading of arbitrarily-shaped text has received increasing research attention. However, existing text spotters are mostly built on two-stage frameworks or character-based methods, which suffer from either Non-Maximum Suppression (NMS), Region-of-Interest (RoI) operations, or character-level annotations. In this paper, to address the above problems, we propose a novel fully convolutional Point Gathering Network (PGNet) for reading arbitrarily-shaped text in real-time. The PGNet is a single-shot text spotter, where the pixel-level character classification map is learned with the proposed PG-CTC loss, avoiding the usage of character-level annotations. With a PG-CTC decoder, we gather high-level character classification vectors from two-dimensional space and decode them into text symbols without NMS and RoI operations involved, which guarantees high efficiency. Additionally, by reasoning about the relations between each character and its neighbors, a graph refinement module (GRM) is proposed to optimize the coarse recognition and improve the end-to-end performance. Experiments show that the proposed method achieves competitive accuracy while significantly improving the running speed. In particular, on Total-Text, it runs at 46.7 FPS, surpassing the previous spotters by a large margin.

PDF Details

AAAI Conference 2019 Conference Paper

Asynchronous Proximal Stochastic Gradient Algorithm for Composition Optimization Problems

Pengfei Wang
Risheng Liu
Nenggan Zheng
Zhefeng Gong

In machine learning research, many emerging applications can be (re)formulated as the composition optimization problem with nonsmooth regularization penalty. To solve this problem, the traditional stochastic gradient descent (SGD) algorithm and its variants either have a low convergence rate or are computationally expensive. Recently, several stochastic composition gradient algorithms have been proposed; however, these methods are still inefficient and not scalable to large-scale composition optimization problem instances. To address these challenges, we propose an asynchronous parallel algorithm, named Async-ProxSCVR, which effectively combines asynchronous parallel implementation and variance reduction method. We prove that the algorithm admits the fastest convergence rate for both strongly convex and general nonconvex cases. Furthermore, we analyze the query complexity of the proposed algorithm and prove that linear speedup is achievable when we increase the number of processors. Finally, we evaluate our algorithm Async-ProxSCVR on two representative composition optimization problems including value function evaluation in reinforcement learning and sparse mean-variance optimization problem. Experimental results show that the algorithm achieves significant speedups and is much faster than existing compared methods.

PDF Details

TIST Journal 2017 Journal Article

Spotting Trip Purposes from Taxi Trajectories

Pengfei Wang
Guannan Liu
Yanjie Fu
Yuanchun Zhou
Jianhui Li

What is the purpose of a trip? What are the unique human mobility patterns and spatial contexts in or near the pickup points and delivery points of trajectories for a specific trip purpose? Many prior studies have modeled human mobility patterns in urban regions; however, these analytics mainly focus on interpreting the semantic meanings of geographic topics at an aggregate level. Given the lack of information about human activities at pick-up and dropoff points, it is challenging to convert the prior studies into effective tools for inferring trip purposes. To address this challenge, in this article, we study large-scale taxi trajectories from an unsupervised perspective in light of the following observations. First, the POI configurations of origin and destination regions closely relate to the urban functionality of these regions and further indicate various human activities. Second, with respect to the functionality of neighborhood environments, trip purposes can be discerned from the transitions between regions with different functionality at particular time periods. Along these lines, we develop a general probabilistic framework for spotting trip purposes from massive taxi GPS trajectories. Specifically, we first augment the origin and destination regions of trajectories by attaching neighborhood POIs. Then, we introduce a latent factor, POI Topic, to represent the mixed functionality of the regions, such that each origin or destination point in the city can be modeled as a mixture over POI Topics. In addition, considering the transitions from origins to destinations at specific time periods, the trip time is generated collaboratively from the pairwise POI Topics at both ends of the O-D pairs, constituting POI Links, and hence the trip purpose can be explained semantically by the POI Links. 
Finally, we present extensive experiments with the real-world data of New York City to demonstrate the effectiveness of our proposed method for spotting trip purposes, and moreover, the model is validated to perform well in predicting the destinations and trip time among all the baseline methods.

Details DOI