Arrow Research search

Author name cluster

Senzhang Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

35 papers
2 author rows

Possible papers

35

AAAI Conference 2026 Conference Paper

DiMA: Distinguishing Resident and Tourist Preferences via Multi-Modal LLM Alignment for Out-of-Town Cross-Domain Recommendation

  • Fan Zhang
  • Jinpeng Chen
  • Tao Wang
  • Huan Li
  • Senzhang Wang
  • Feifei Kou
  • Ye Ji
  • Kaimin Wei

Out-of-Town (OOT) recommendation aims to provide personalized suggestions for users in unfamiliar cities. However, OOT recommendation faces two fundamental challenges: the difficulty of reasoning across modalities, as preference signals in disparate formats such as images and text are hard to compare; and the preference deviation problem, since a user's resident and tourist preferences often diverge, rendering simple preference transfer ineffective. To address these challenges, we propose Distinguishing Resident and Tourist Preferences via Multi-Modal LLM Alignment for Out-of-Town Cross-Domain Recommendation (DiMA), a framework for re-ranking Points of Interest (POIs). To tackle the multimodal challenge, DiMA first leverages Multimodal Large Language Models and Large Language Models (LLMs) to transform heterogeneous POI data into unified semantic tags, enabling both cross-modal reasoning and efficient downstream processing. To address preference deviation, a "teacher" LLM executes a custom Chain-of-Thought (CoT) process to disentangle resident and tourist preferences from multi-city histories for re-ranking. Finally, a lightweight student model learns this CoT reasoning via Supervised Fine-Tuning and is then refined with Direct Preference Optimization to align with true user choices, with the potential to surpass the teacher. Extensive experiments on a real-world dataset demonstrate that DiMA significantly enhances the performance of baseline models in the OOT recommendation re-ranking task.

AAAI Conference 2025 Conference Paper

Beyond Text: Fine-Grained Multi-Modal Fact Verification with Hypergraph Transformers

  • Hui Pang
  • Chaozhuo Li
  • Litian Zhang
  • Senzhang Wang
  • Xi Zhang

Fact verification has become increasingly vital in the internet age, driven by the proliferation of false claims and political misinformation. While traditional methods rely predominantly on text-based evidence, multi-modal evidence introduces richer sources of information, offering valuable insights for claim verification. Existing multi-modal verification models often focus on superficial correlations between claims and evidence, neglecting the complex semantic interactions present in fine-grained multi-modal signals. In this paper, we propose a novel framework for multi-modal fact-checking, named Hypergraph Transformer-based Multi-modal Fact-Checking (HGTMFC). Our approach captures high-order relationships between different modalities of evidence and claims by leveraging hypergraphs. HGTMFC models the intricate relationships among evidence across various modalities and enhances information propagation through a transformer-based mechanism embedded within the hypergraph. Moreover, we utilize line graphs to refine this propagation process, further strengthening the model's reasoning capabilities. Experiments on benchmark datasets demonstrate that our model significantly outperforms existing approaches in multi-modal fact verification.

AAAI Conference 2025 Conference Paper

C2F-TP: A Coarse-to-Fine Denoising Framework for Uncertainty-Aware Trajectory Prediction

  • Zichen Wang
  • Hao Miao
  • Senzhang Wang
  • Renzhi Wang
  • Jianxin Wang
  • Jian Zhang

Accurately predicting the trajectory of vehicles is critically important for ensuring safety and reliability in autonomous driving. Although considerable research efforts have been made recently, the inherent trajectory uncertainty caused by various factors, including dynamic driving intents and diverse driving scenarios, still poses significant challenges to accurate trajectory prediction. To address this issue, we propose C2F-TP, a coarse-to-fine denoising framework for uncertainty-aware vehicle trajectory prediction. C2F-TP features an innovative two-stage coarse-to-fine prediction process. Specifically, in the spatial-temporal interaction stage, we propose a spatial-temporal interaction module to capture the inter-vehicle interactions and learn a multimodal trajectory distribution, from which a certain number of noisy trajectories are sampled. Next, in the trajectory refinement stage, we design a conditional denoising model to reduce the uncertainty of the sampled trajectories through a step-wise denoising operation. Extensive experiments are conducted on two real-world datasets, NGSIM and highD, that are widely adopted in trajectory prediction. The results demonstrate the effectiveness of our proposal.

AAAI Conference 2025 Conference Paper

FairTP: A Prolonged Fairness Framework for Traffic Prediction

  • Jiangnan Xia
  • Yu Yang
  • Jiaxing Shen
  • Senzhang Wang
  • Jiannong Cao

Traffic prediction is pivotal in intelligent transportation systems. Existing works focus mainly on improving overall accuracy, overlooking a crucial problem of whether prediction results will lead to biased decisions by transportation authorities. In practice, the uneven deployment of traffic sensors in different urban areas produces imbalanced data, making the traffic prediction model fail in some urban areas and leading to unfair regional decision-making that eventually severely affects the equity and quality of residents’ lives. Existing fairness machine learning models struggle to maintain fair traffic prediction over prolonged periods. Although these models might achieve fairness at certain time slots, this static fairness will break down as traffic conditions change. To fill this research gap, we investigate prolonged fair traffic prediction, introducing two novel fairness metrics, i.e., region-based static fairness and sensor-based dynamic fairness, tailored to fairness fluctuations over time and across areas. An innovative prolonged fairness traffic prediction framework, namely FairTP, is then proposed. FairTP achieves prolonged fairness by alternately “sacrificing” and “benefiting” the prediction accuracy of each traffic sensor or area, ensuring that the numbers of these two actions are balanced over time. Specifically, FairTP incorporates a state identification module to discriminate whether the traffic sensors or areas are in a “sacrifice” or “benefit” state, thereby enabling prolonged fairness-aware traffic predictions. Additionally, we devise a state-guided balanced sampling strategy to select training examples to further enhance prediction fairness by mitigating the performance disparities among areas with uneven sensor distribution over time. Extensive experiments on two real-world datasets show that FairTP significantly improves prediction fairness without causing significant accuracy degradation.

AAAI Conference 2025 Conference Paper

Ghidorah: Towards Robust Multi-Scale Information Diffusion Prediction via Test-Time Training

  • Wenting Zhu
  • Chaozhuo Li
  • Litian Zhang
  • Senzhang Wang
  • Xi Zhang

Information diffusion prediction (IDP) is a pivotal task for understanding the dynamics of information propagation within social networks. Conventional models typically adhere to a fixed learning-based paradigm, where the trained prediction model remains static during the inference phase. This paradigm presupposes that the data is independent and identically distributed, an assumption that may not hold true due to the inherently open nature of social media and the uncertainty and variability in user behavior. In this paper, we address the novel problem of out-of-distribution (OOD) shifts within IDP tasks and propose a new test-time training-based model for multi-scale IDP tasks, named Ghidorah. Our approach focuses on adapting a subset of model parameters to accommodate the unique characteristics of test samples through self-supervised learning (SSL) tasks. Ghidorah comprises three components: the macroscopic prediction branch, the microscopic prediction branch, and the auxiliary SSL branch. The auxiliary SSL task employs a masked autoencoder-based loss to fine-tune the model for specific test samples prior to prediction. Furthermore, Ghidorah integrates invariant learning to capture robust representations while mitigating spurious correlations. To our knowledge, Ghidorah is the first work to introduce a test-time training framework specifically designed to address the critical yet often overlooked OOD challenges in IDP. Experimental results across several benchmark datasets validate the superiority of our approach.

AAAI Conference 2025 Conference Paper

MRR-FV: Unlocking Complex Fact Verification with Multi-Hop Retrieval and Reasoning

  • Liwen Zheng
  • Chaozhuo Li
  • Litian Zhang
  • Haoran Jia
  • Senzhang Wang
  • Zheng Liu
  • Xi Zhang

The pervasive spread of misinformation on social networks highlights the critical necessity for effective fact verification systems. Traditional approaches primarily focus on pairwise correlations between claims and evidence, often neglecting comprehensive multi-hop retrieval and reasoning, which results in suboptimal performance when dealing with complex claims. In this paper, we propose MRR-FV, a generative retrieval-enhanced model designed to address the novel challenge of Multi-hop Retrieval and Reasoning for Fact Verification, which integrates two core modules: the Generative Multi-hop Retriever and the Hierarchical Interaction Reasoner. MRR-FV utilizes an autoregressive model for iterative multi-hop evidence retrieval, complemented by a pre-trained compressor to address the challenge of intention shift across retrieval hops. For claim verification, we propose a hierarchical interaction reasoner that conducts intra-sentence reasoning to capture long-term semantic dependencies and inter-sentence reasoning across multi-hop evidence subgraphs to reveal complex evidence interactions. Experimental evaluations on the FEVER and HOVER datasets demonstrate the superior performance of our model in both claim verification and evidence retrieval tasks.

NeurIPS Conference 2025 Conference Paper

Role-aware Multi-agent Reinforcement Learning for Coordinated Emergency Traffic Control

  • Ming Cheng
  • Hao Chen
  • Zhiqing Li
  • Jia Wang
  • Senzhang Wang

Emergency traffic control presents an increasingly critical challenge, requiring seamless coordination among emergency vehicles, regular vehicles, and traffic lights to ensure efficient passage for all vehicles. Existing models focus primarily on traffic light control, leaving emergency and regular vehicles prone to delays due to the lack of navigation strategies. To address this issue, we propose the Role-aware Multi-agent Traffic Control (RMTC) framework, which dynamically assigns appropriate roles to traffic components for better cooperation by considering their relations with emergency vehicles and adaptively adjusting their policies. Specifically, RMTC introduces a Heterogeneous Temporal Traffic Graph (HTTG) to model the spatial and temporal relationships among all traffic components (traffic lights, regular and emergency vehicles) at each time step. Furthermore, we develop a Dynamic Role Learning model to infer the evolving roles of traffic lights and regular vehicles based on the HTTG. Finally, we present a Role-aware Multi-agent Reinforcement Learning approach that learns traffic policies conditioned on the dynamically learned roles. Extensive experiments across four public traffic scenarios show that RMTC outperforms existing traffic light control methods by significantly reducing emergency vehicle travel time, while effectively preserving traffic efficiency for regular vehicles. The code is released at https://github.com/mingchenghexi/RMTC.

IJCAI Conference 2025 Conference Paper

STAMImputer: Spatio-Temporal Attention MoE for Traffic Data Imputation

  • Yiming Wang
  • Hao Peng
  • Senzhang Wang
  • Haohua Du
  • Chunyang Liu
  • Jia Wu
  • Guanlin Wu

Traffic data imputation is fundamentally important to support various applications in intelligent transportation systems such as traffic flow prediction. However, existing time-to-space sequential methods often fail to effectively extract features in block-wise missing data scenarios. Meanwhile, the static graph structure for spatial feature propagation significantly constrains the model's flexibility in handling the distribution shift issue for the nonstationary traffic data. To address these issues, this paper proposes a Spatio-Temporal Attention Mixture of Experts network named STAMImputer for traffic data imputation. Specifically, we introduce a Mixture of Experts (MoE) framework to capture latent spatio-temporal features and their influence weights, effectively imputing block-wise missing data. A novel Low-rank guided Sampling Graph ATtention (LrSGAT) mechanism is designed to dynamically balance the local and global correlations across road networks. The sampled attention vectors are utilized to generate dynamic graphs that capture real-time spatial correlations. Extensive experiments are conducted on four traffic datasets for evaluation. The results show STAMImputer achieves significant performance improvements compared with existing SOTA approaches. Our codes are available at https://github.com/RingBDStack/STAMImupter.

NeurIPS Conference 2024 Conference Paper

Fine Tuning Out-of-Vocabulary Item Recommendation with User Sequence Imagination

  • Ruochen Liu
  • Hao Chen
  • Yuanchen Bei
  • Qijie Shen
  • Fangwei Zhong
  • Senzhang Wang
  • Jianxin Wang

Recommending out-of-vocabulary (OOV) items is a challenging problem since the in-vocabulary (IV) items have well-trained behavioral embeddings but the OOV items only have content features. Current OOV recommendation models often generate "makeshift" embeddings for OOV items from content features and then jointly recommend with the "makeshift" OOV item embeddings and the behavioral IV item embeddings. However, merely using the "makeshift" embedding will result in suboptimal recommendation performance due to the substantial gap between the content features and the behavioral embeddings. To bridge the gap, we propose a novel User Sequence IMagination (USIM) fine-tuning framework, which first imagines the user sequences and then refines the generated OOV embeddings with the user behavioral embeddings. Specifically, we frame the user sequence imagination as a reinforcement learning problem and develop a recommendation-focused reward function to evaluate to what extent a user can help recommend the OOV items. Besides, we propose an embedding-driven transition function to model the embedding transition after imagining a user. USIM has been deployed on a prominent e-commerce platform for months, offering recommendations for millions of OOV items and billions of users. Extensive experiments demonstrate that USIM outperforms traditional generative models in OOV item recommendation performance across traditional collaborative filtering and GNN-based collaborative filtering models.

NeurIPS Conference 2024 Conference Paper

FuseAnyPart: Diffusion-Driven Facial Parts Swapping via Multiple Reference Images

  • Zheng Yu
  • Yaohua Wang
  • Siying Cui
  • Aixi Zhang
  • Wei-Long Zheng
  • Senzhang Wang

Facial parts swapping aims to selectively transfer regions of interest from the source image onto the target image while keeping the rest of the target image unchanged. Most studies on face swapping, designed specifically for full-face swapping, are either unable to swap individual facial parts or significantly limited in doing so, which hinders fine-grained and customized character designs. However, designing such an approach specifically for facial parts swapping is challenged by the need for a reasonable multiple-reference feature fusion, which must be both efficient and effective. To overcome this challenge, FuseAnyPart is proposed to facilitate the seamless "fuse-any-part" customization of the face. In FuseAnyPart, facial parts from different people are assembled into a complete face in latent space within the Mask-based Fusion Module. Subsequently, the consolidated feature is dispatched to the Addition-based Injection Module for fusion within the UNet of the diffusion model to create novel characters. Extensive experiments qualitatively and quantitatively validate the superiority and robustness of FuseAnyPart. Source codes are available at https://github.com/Thomas-wyh/FuseAnyPart.

IJCAI Conference 2024 Conference Paper

SaSDim: Self-Adaptive Noise Scaling Diffusion Model for Spatial Time Series Imputation

  • Shunyang Zhang
  • Senzhang Wang
  • Xianzhen Tan
  • Renzhi Wang
  • Ruochen Liu
  • Jian Zhang
  • Jianxin Wang

Spatial time series imputation is of great importance to various real-world applications. As state-of-the-art generative models, diffusion models (e.g., CSDI) have outperformed statistical and autoregressive-based models in time series imputation. However, diffusion models may introduce unstable noise owing to the inherent uncertainty in sampling, leading to the generated noise deviating from the intended Gaussian distribution. Consequently, the imputed data may deviate from the real data. To this end, we propose a Self-adaptive noise Scaling Diffusion Model named SaSDim for spatial time series imputation. Specifically, we introduce a novel Probabilistic High-Order SDE Solver Module to scale the noise to follow the standard Gaussian distribution. The noise scaling operation helps the noise prediction module of the diffusion model more accurately estimate the variance of the noise. To effectively learn the spatial and temporal features, we also propose a Spatial-guided Global Convolution Module (SgGConv), which learns multi-periodic temporal dependencies with the Fast Fourier Transform and dynamic spatial dependencies with dynamic graph convolution. Extensive experiments conducted on three real-world spatial time series datasets verify the effectiveness of SaSDim.
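The noise-scaling idea described above can be loosely illustrated outside the diffusion setting: if sampled noise drifts away from the standard Gaussian, recentering and rescaling by its empirical statistics pulls it back toward N(0, 1). A toy stdlib-only sketch of that normalization (hypothetical values, not the paper's Probabilistic High-Order SDE Solver):

```python
import random
import statistics

random.seed(42)

# Hypothetical mis-scaled noise: drawn with a drift and the wrong variance
# instead of the standard Gaussian a diffusion model assumes.
noise = [random.gauss(0.3, 1.7) for _ in range(10_000)]

mu = statistics.fmean(noise)
sigma = statistics.stdev(noise)

# Self-adaptive scaling: recenter and rescale toward N(0, 1).
scaled = [(x - mu) / sigma for x in noise]

print(statistics.fmean(scaled), statistics.stdev(scaled))
```

After scaling, the empirical mean and standard deviation match the standard Gaussian up to floating-point error; the paper's contribution is doing this adaptively inside the sampling process rather than post hoc.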

IJCAI Conference 2024 Conference Paper

Score-CDM: Score-Weighted Convolutional Diffusion Model for Multivariate Time Series Imputation

  • Shunyang Zhang
  • Senzhang Wang
  • Hao Miao
  • Hao Chen
  • Changjun Fan
  • Jian Zhang

Multivariate time series (MTS) data are usually incomplete in real scenarios, and imputing the incomplete MTS is practically important to facilitate various time series mining tasks. Recently, diffusion model-based MTS imputation methods have achieved promising results by utilizing CNNs or attention mechanisms for temporal feature learning. However, it is hard to adaptively trade off the diverse effects of local and global temporal features by simply combining CNNs and attention. To address this issue, we propose a Score-weighted Convolutional Diffusion Model (Score-CDM for short), whose backbone consists of a Score-weighted Convolution Module (SCM) and an Adaptive Reception Module (ARM). SCM adopts a score map to capture the global temporal features in the time domain, while ARM uses a Spectral2Time Window Block (S2TWB) to convolve the local time series data in the spectral domain. Benefiting from the time convolution properties of the Fast Fourier Transform, ARM can adaptively change the receptive field of the score map and thus effectively balance the local and global temporal features. We conduct extensive evaluations on three real MTS datasets from different domains, and the results verify the effectiveness of the proposed Score-CDM.
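The "time convolution property" of the Fourier transform that ARM relies on is the standard convolution theorem: circular convolution in the time domain equals pointwise multiplication in the frequency domain. A minimal stdlib-only sketch of that identity, using a naive DFT for illustration (not the paper's S2TWB block):

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (stdlib-only, for illustration)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
               for k in range(n)) for j in range(n)]
    return [v / n for v in out] if inverse else out

def circular_convolve(a, b):
    """Convolution theorem: conv(a, b) = IDFT(DFT(a) * DFT(b))."""
    fa, fb = dft(a), dft(b)
    return [v.real for v in dft([x * y for x, y in zip(fa, fb)], inverse=True)]

# Direct circular convolution for comparison.
a, b = [1.0, 2.0, 3.0, 4.0], [1.0, 0.0, 1.0, 0.0]
direct = [sum(a[k] * b[(j - k) % 4] for k in range(4)) for j in range(4)]
print(circular_convolve(a, b), direct)
```

Both routes give the same result; in practice an FFT replaces the naive DFT, which is what makes spectral-domain windowing of the receptive field cheap.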

NeurIPS Conference 2023 Conference Paper

A Comprehensive Study on Text-attributed Graphs: Benchmarking and Rethinking

  • Hao Yan
  • Chaozhuo Li
  • Ruosong Long
  • Chao Yan
  • Jianan Zhao
  • Wenwen Zhuang
  • Jun Yin
  • Peiyan Zhang

Text-attributed graphs (TAGs) are prevalent in various real-world scenarios, where each node is associated with a text description. The cornerstone of representation learning on TAGs lies in the seamless integration of textual semantics within individual nodes and the topological connections across nodes. Recent advancements in pre-trained language models (PLMs) and graph neural networks (GNNs) have facilitated effective learning on TAGs, garnering increased research interest. However, the absence of meaningful benchmark datasets and standardized evaluation procedures for TAGs has impeded progress in this field. In this paper, we propose CS-TAG, a comprehensive and diverse collection of challenging benchmark datasets for TAGs. The CS-TAG datasets are notably large in scale and encompass a wide range of domains, spanning from citation networks to purchase graphs. In addition to building the datasets, we conduct extensive benchmark experiments over CS-TAG with various learning paradigms, including PLMs, GNNs, PLM-GNN co-training methods, and the proposed novel topological pre-training of language models. In a nutshell, we provide an overview of the CS-TAG datasets, standardized evaluation procedures, and present baseline experiments. The entire CS-TAG project is publicly accessible at https://github.com/sktsherlock/TAG-Benchmark.

NeurIPS Conference 2023 Conference Paper

Train Once and Explain Everywhere: Pre-training Interpretable Graph Neural Networks

  • Jun Yin
  • Chaozhuo Li
  • Hao Yan
  • Jianxun Lian
  • Senzhang Wang

Intrinsic interpretable graph neural networks aim to provide transparent predictions by identifying the influential fraction of the input graph that guides the model prediction, i.e., the explanatory subgraph. However, current interpretable GNNs are mostly dataset-specific and hard to generalize to different graphs. A more generalizable GNN interpretation model that can effectively distill the universal structural patterns of different graphs has until now been unexplored. Motivated by the great success of recent pre-training techniques, we for the first time propose the Pre-training Interpretable Graph Neural Network ($\pi$-GNN) to distill the universal interpretability of GNNs by pre-training over synthetic graphs with ground-truth explanations. Specifically, we introduce a structural pattern learning module to extract diverse universal structure patterns and integrate them together to comprehensively represent graphs of different types. Next, a hypergraph refining module is proposed to identify the explanatory subgraph by incorporating the universal structure patterns with local edge interactions. Finally, the task-specific predictor is cascaded with the pre-trained $\pi$-GNN model and fine-tuned over downstream tasks. Extensive experiments demonstrate that $\pi$-GNN significantly surpasses the leading interpretable GNN baselines with up to 9.98% interpretation improvement and 16.06% classification accuracy improvement. Meanwhile, $\pi$-GNN pre-trained on the graph classification task also achieves top-tier interpretation performance on the node classification task, which further verifies its promising generalization ability across different downstream tasks. Our code and datasets are available at https://anonymous.4open.science/r/PI-GNN-F86C.

NeurIPS Conference 2023 Conference Paper

V-InFoR: A Robust Graph Neural Networks Explainer for Structurally Corrupted Graphs

  • Senzhang Wang
  • Jun Yin
  • Chaozhuo Li
  • Xing Xie
  • Jianxin Wang

GNN explanation methods aim to identify an explanatory subgraph that contains the most informative components of the full graph. However, a major limitation of existing GNN explainers is that they are not robust to structurally corrupted graphs, e.g., graphs with noisy or adversarial edges. On the one hand, existing GNN explainers mostly explore explanations based on either the raw graph features or the learned latent representations, both of which can be easily corrupted. On the other hand, the corruptions in graphs are irregular in terms of structural properties, e.g., the size or connectivity of graphs, which makes the rigorous constraints used by previous GNN explainers infeasible. To address these issues, we propose a robust GNN explainer called V-InfoR. Specifically, a robust graph representation extractor, which takes insights from variational inference, is proposed to infer the latent distribution of graph representations. Instead of directly using the corrupted raw features or representations of each single graph, we sample the graph representations from the inferred distribution for the downstream explanation generator, which can effectively eliminate minor corruptions. We next formulate the explanation exploration as a graph information bottleneck (GIB) optimization problem. As a more general method that does not need any rigorous structural constraints, our GIB-based method can adaptively capture both the regularity and irregularity of severely corrupted graphs for explanation. Extensive evaluations on both synthetic and real-world datasets indicate that V-InfoR significantly improves GNN explanation performance for structurally corrupted graphs. Code and dataset are available at https://anonymous.4open.science/r/V-InfoR-EF88.

AAAI Conference 2023 Conference Paper

WSiP: Wave Superposition Inspired Pooling for Dynamic Interactions-Aware Trajectory Prediction

  • Renzhi Wang
  • Senzhang Wang
  • Hao Yan
  • Xiang Wang

Predicting the motions of surrounding vehicles is critically important to help autonomous driving systems plan a safe path and avoid collisions. Although recent social pooling-based LSTM models have achieved significant performance gains by considering the motion interactions between vehicles close to each other, vehicle trajectory prediction still remains a challenging research issue due to the dynamic and high-order interactions in real complex driving scenarios. To this end, we propose a wave superposition-inspired social pooling (Wave-pooling for short) method for dynamically aggregating the high-order interactions from both local and global neighbor vehicles. By modeling each vehicle as a wave with an amplitude and a phase, Wave-pooling can more effectively represent the dynamic motion states of vehicles and capture their high-order dynamic interactions through wave superposition. By integrating Wave-pooling, an encoder-decoder-based learning framework named WSiP is also proposed. Extensive experiments conducted on two public highway datasets, NGSIM and highD, verify the effectiveness of WSiP by comparison with current state-of-the-art baselines. More importantly, the result of WSiP is more interpretable as the interaction strength between vehicles can be intuitively reflected by their phase difference. The code of the work is publicly available at https://github.com/Chopin0123/WSiP.
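The wave representation can be loosely illustrated with complex numbers: modeling a vehicle as a wave A·e^{iφ} makes superposition a plain complex sum, and the aggregate strength depends on the phase differences, which is the basis of the interpretability claim above. A toy sketch with hypothetical values (not the authors' Wave-pooling implementation):

```python
import cmath

def superpose(waves):
    """Pool a set of waves (A, phi) as the complex sum of A * exp(j * phi)."""
    return sum(A * cmath.exp(1j * phi) for A, phi in waves)

# Two hypothetical vehicles with equal amplitude:
in_phase = superpose([(1.0, 0.0), (1.0, 0.0)])         # aligned phases reinforce
anti_phase = superpose([(1.0, 0.0), (1.0, cmath.pi)])  # opposite phases cancel

print(abs(in_phase), abs(anti_phase))
```

Aligned phases double the pooled magnitude while opposite phases cancel it, so the phase difference between two vehicles directly modulates how strongly their interaction contributes.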

ICLR Conference 2022 Conference Paper

Ada-NETS: Face Clustering via Adaptive Neighbour Discovery in the Structure Space

  • Yaohua Wang
  • Yaobin Zhang
  • Fangyi Zhang
  • Senzhang Wang
  • Ming Lin 0002
  • YuQi Zhang
  • Xiuyu Sun

Face clustering has attracted rising research interest recently to take advantage of massive amounts of face images on the web. State-of-the-art performance has been achieved by Graph Convolutional Networks (GCN) due to their powerful representation capacity. However, existing GCN-based methods build face graphs mainly according to $k$NN relations in the feature space, which may lead to many noise edges connecting two faces of different classes. The face features will be polluted when messages pass along these noise edges, thus degrading the performance of GCNs. In this paper, a novel algorithm named Ada-NETS is proposed to cluster faces by constructing clean graphs for GCNs. In Ada-NETS, each face is transformed to a new structure space, obtaining robust features by considering face features of the neighbour images. Then, an adaptive neighbour discovery strategy is proposed to determine a proper number of edges connecting to each face image. It significantly reduces the noise edges while maintaining the good ones to build a graph with clean yet rich edges for GCNs to cluster faces. Experiments on multiple public clustering datasets show that Ada-NETS significantly outperforms current state-of-the-art methods, demonstrating its superiority and generalization ability. Code is available at https://github.com/damo-cv/Ada-NETS.

ICML Conference 2022 Conference Paper

HousE: Knowledge Graph Embedding with Householder Parameterization

  • Rui Li 0086
  • Jianan Zhao 0002
  • Chaozhuo Li
  • Di He 0001
  • Yiqi Wang 0001
  • Yuming Liu
  • Hao Sun 0015
  • Senzhang Wang

The effectiveness of knowledge graph embedding (KGE) largely depends on the ability to model intrinsic relation patterns and mapping properties. However, existing approaches can only capture some of them with insufficient modeling capacity. In this work, we propose a more powerful KGE framework named HousE, which involves a novel parameterization based on two kinds of Householder transformations: (1) Householder rotations to achieve superior capacity of modeling relation patterns; (2) Householder projections to handle sophisticated relation mapping properties. Theoretically, HousE is capable of modeling crucial relation patterns and mapping properties simultaneously. Besides, HousE is a generalization of existing rotation-based models while extending the rotations to high-dimensional spaces. Empirically, HousE achieves new state-of-the-art performance on five benchmark datasets. Our code is available at https://github.com/anrep/HousE.
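For intuition, the Householder primitive underlying HousE is the reflection H = I − 2vvᵀ for a unit vector v: applying it twice is the identity, and composing two reflections yields a rotation, which is how Householder rotations reach high-dimensional spaces. A minimal sketch of the primitive itself (not the paper's KGE parameterization):

```python
import math

def householder_reflect(x, v):
    """Reflect x across the hyperplane orthogonal to unit vector v:
    H x = x - 2 (v . x) v, with H = I - 2 v v^T."""
    dot = sum(vi * xi for vi, xi in zip(v, x))
    return [xi - 2 * dot * vi for vi, xi in zip(v, x)]

x = [3.0, 4.0]
v = [1.0, 0.0]                    # normal of the first mirror hyperplane
r = householder_reflect(x, v)     # reflection flips the first coordinate
rr = householder_reflect(r, v)    # H is an involution: reflecting twice returns x

# Composing two reflections gives a rotation: here the mirrors are 45 degrees
# apart, so x is rotated by 90 degrees while its norm is preserved.
w = [math.cos(math.pi / 4), math.sin(math.pi / 4)]
rot = householder_reflect(householder_reflect(x, v), w)
print(r, rr, rot, math.hypot(*rot))
```

Chaining more reflections in higher dimensions yields general orthogonal maps, which is the capacity argument behind Householder-based rotations.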

TIST Journal 2022 Journal Article

Multivariate Correlation-aware Spatio-temporal Graph Convolutional Networks for Multi-scale Traffic Prediction

  • Senzhang Wang
  • Meiyue Zhang
  • Hao Miao
  • Zhaohui Peng
  • Philip S. Yu

Traffic flow prediction based on vehicle trajectories collected from installed GPS devices is critically important to Intelligent Transportation Systems (ITS). One limitation of existing traffic prediction models is that they mostly focus on predicting road-segment-level traffic conditions, which can be considered a fine-grained prediction. In many scenarios, however, a coarse-grained prediction, such as predicting the traffic flows among different urban areas covering multiple road links, is also required to help governments better understand traffic conditions from a macroscopic point of view. This is especially useful in the applications of urban planning and public transportation planning. Another limitation is that the correlations among different types of traffic-related features are largely ignored. For example, traffic flow and traffic speed are usually negatively correlated. Existing works regard these traffic-related features as independent features without considering their correlations. In this article, we for the first time study the novel problem of multivariate correlation-aware multi-scale traffic flow prediction, and we propose a feature-correlation-aware spatio-temporal graph convolutional network named MC-STGCN to effectively address it. Specifically, given a road graph, we first construct a coarse-grained road graph based on both the topological closeness and the traffic flow similarity among the nodes (road links). Then a cross-scale spatial-temporal feature learning and fusion technique is proposed for dealing with both the fine- and coarse-grained traffic data. In the spatial domain, a cross-scale GCN is proposed to learn the multi-scale spatial features jointly and fuse them together. In the temporal domain, a cross-scale temporal network composed of a hierarchical attention mechanism is designed to effectively capture intra- and inter-scale temporal correlations. To effectively capture the feature correlations, a feature correlation learning component is also designed. Finally, a structural constraint is introduced to make the predictions on the two-scale traffic data consistent. We conduct extensive evaluations on two real traffic datasets, and the results demonstrate the superior performance of the proposal on both fine- and coarse-grained traffic predictions.

IJCAI Conference 2022 Conference Paper

Reconstruction Enhanced Multi-View Contrastive Learning for Anomaly Detection on Attributed Networks

  • Jiaqiang Zhang
  • Senzhang Wang
  • Songcan Chen

Detecting abnormal nodes in attributed networks is of great importance in many real applications, such as financial fraud detection and cyber security. The task is challenging due to both the complex interactions between anomalous nodes and their counterparts and their inconsistency in terms of attributes. This paper proposes a self-supervised learning framework that jointly optimizes a multi-view contrastive-learning-based module and an attribute-reconstruction-based module to more accurately detect anomalies on attributed networks. Specifically, two contrastive learning views are first established, which allow the model to better encode rich local and global information related to the abnormality. Motivated by the attribute-consistency principle between neighboring nodes, a masked-autoencoder-based reconstruction module is also introduced to identify the nodes with large reconstruction errors, which are then regarded as anomalies. Finally, the two complementary modules are integrated for more accurate detection of anomalous nodes. Extensive experiments conducted on five benchmark datasets show that our model outperforms current state-of-the-art models.
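
For context (illustrative, not the paper's implementation): the reconstruction-based scoring idea above can be sketched by scoring each node with the error between its attributes and their reconstruction, then blending that score with a contrastive score. The min-max normalization and the blending weight `alpha` below are assumptions for illustration only.

```python
import numpy as np

def reconstruction_anomaly_scores(X, X_hat):
    """Per-node anomaly score: L2 error between attributes and their reconstruction."""
    return np.linalg.norm(X - X_hat, axis=1)

def combine_scores(contrastive, reconstruction, alpha=0.5):
    """Blend contrastive and reconstruction scores after min-max normalization."""
    def norm(s):
        rng = s.max() - s.min()
        return (s - s.min()) / rng if rng > 0 else np.zeros_like(s)
    return alpha * norm(contrastive) + (1 - alpha) * norm(reconstruction)

# Toy example: node 2's attributes are poorly reconstructed, so it scores highest.
X = np.eye(4)
X_hat = X.copy()
X_hat[2] += 0.5
scores = reconstruction_anomaly_scores(X, X_hat)
```

Nodes are then ranked by the combined score, and the top-ranked nodes are flagged as candidate anomalies.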

NeurIPS Conference 2022 Conference Paper

Robust Graph Structure Learning via Multiple Statistical Tests

  • Yaohua Wang
  • Fangyi Zhang
  • Ming Lin
  • Senzhang Wang
  • Xiuyu Sun
  • Rong Jin

Graph structure learning aims to learn connectivity in a graph from data. It is particularly important for many computer vision tasks, since no explicit graph structure is available for images in most cases. A natural way to construct a graph among images is to treat each image as a node and assign pairwise image similarities as weights to the corresponding edges. It is well known that pairwise similarities between images are sensitive to noise in the feature representations, leading to unreliable graph structures. We address this problem from the viewpoint of statistical tests. By viewing the feature vector of each node as an independent sample, the decision of whether to create an edge between two nodes based on the similarity of their feature representations can be thought of as a ${\it single}$ statistical test. To improve the robustness of this decision, multiple samples are drawn and integrated by ${\it multiple}$ statistical tests to generate a more reliable similarity measure, and consequently a more reliable graph structure. A corresponding elegant matrix form named $\mathcal{B}$$\textbf{-Attention}$ is designed for efficiency. The effectiveness of multiple tests for graph structure learning is verified both theoretically and empirically on multiple clustering and ReID benchmark datasets. Source code is available at https://github.com/Thomas-wyh/B-Attention.
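
The multiple-test idea above can be approximated, very roughly, by averaging pairwise similarities over several independent feature samples per node; each pairwise comparison plays the role of one "test". This is an illustrative stand-in, not the paper's $\mathcal{B}$-Attention matrix form.

```python
import numpy as np

def multi_test_similarity(samples_i, samples_j):
    """Aggregate cosine similarities over multiple feature samples per node.

    samples_i, samples_j: (k, d) arrays, k independent feature samples
    for nodes i and j. Each of the k*k pairwise similarities acts as one
    "test"; averaging them yields a more noise-robust similarity.
    """
    a = samples_i / np.linalg.norm(samples_i, axis=1, keepdims=True)
    b = samples_j / np.linalg.norm(samples_j, axis=1, keepdims=True)
    return float(np.mean(a @ b.T))

# Toy example: nodes whose samples cluster around the same direction
# should score near 1; orthogonal directions should score near 0.
rng = np.random.default_rng(0)
si = np.array([1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(5, 3))
sj = np.array([1.0, 0.0, 0.0]) + 0.01 * rng.normal(size=(5, 3))
sk = np.array([0.0, 1.0, 0.0]) + 0.01 * rng.normal(size=(5, 3))
```

An edge would then be created only when the aggregated similarity clears a threshold, rather than relying on a single noisy comparison.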

AAAI Conference 2021 Conference Paper

Adversarial Directed Graph Embedding

  • Shijie Zhu
  • Jianxin Li
  • Hao Peng
  • Senzhang Wang
  • Lifang He

Node representation learning for directed graphs is critically important for facilitating many graph mining tasks. To capture the directed edges between nodes, existing methods mostly learn two embedding vectors for each node, a source vector and a target vector. However, these methods learn the source and target vectors separately. For a node with very low in-degree or out-degree, the corresponding target or source vector cannot be effectively learned. In this paper, we propose a novel Directed Graph embedding framework based on Generative Adversarial Networks, called DGGAN. The main idea is to use adversarial mechanisms to deploy a discriminator and two generators that jointly learn each node's source and target vectors. For a given node, the two generators are trained to generate its fake target and source neighbor nodes from the same underlying distribution, and the discriminator aims to distinguish whether a neighbor node is real or fake. The two generators are formulated into a unified framework and can mutually reinforce each other to learn more robust source and target vectors. Extensive experiments show that DGGAN consistently and significantly outperforms existing state-of-the-art methods across multiple graph mining tasks on directed graphs.

IJCAI Conference 2021 Conference Paper

Multi-Channel Pooling Graph Neural Networks

  • Jinlong Du
  • Senzhang Wang
  • Hao Miao
  • Jiaqiang Zhang

Graph pooling is a critical operation for downsampling a graph in graph neural networks. Existing coarsening pooling methods (e.g., DiffPool) mostly focus on capturing the global topology by assigning the nodes to several coarse clusters, while dropping pooling methods (e.g., SAGPool) try to preserve the local topology by selecting the top-k representative nodes. However, an effective way to integrate the two types of methods so that both the local and the global topology of a graph are well captured has been lacking. To address this issue, we propose a Multi-channel Graph Pooling method named MuchPool, which captures the local structure, the global structure, and node features simultaneously in graph pooling. Specifically, we use two channels to conduct dropping pooling based on the local topology and node features respectively, and one channel to conduct coarsening pooling. A cross-channel convolution operation is then designed to refine the graph representations of the different channels. Finally, the pooling results are aggregated into the final pooled graph. Extensive experiments on six benchmark datasets demonstrate the superior performance of MuchPool. The code of this work is publicly available on GitHub.
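
For context (illustrative, not MuchPool itself): the generic top-k dropping-pooling step referenced above, in the spirit of SAGPool, keeps the highest-scoring nodes and gates their features by a squashed score so that selection remains trainable. The `ratio` parameter and the `tanh` gating are conventional choices assumed here for the sketch.

```python
import numpy as np

def topk_pool(X, scores, ratio=0.5):
    """Generic top-k dropping pooling.

    X:      (n, d) node feature matrix.
    scores: (n,) importance score per node.
    Keeps the top ceil-free k = max(1, int(ratio * n)) nodes and gates
    their features by tanh(score), as in common dropping-pooling layers.
    Returns the pooled features and the indices of the kept nodes.
    """
    k = max(1, int(ratio * len(scores)))
    idx = np.argsort(-scores)[:k]          # highest-scoring nodes first
    pooled = X[idx] * np.tanh(scores[idx])[:, None]
    return pooled, idx

# Toy example: 4 nodes, keep the 2 with the highest scores.
X = np.arange(12, dtype=float).reshape(4, 3)
scores = np.array([0.1, 0.9, 0.5, 0.3])
pooled, kept = topk_pool(X, scores, ratio=0.5)
```

In a full model the kept indices would also be used to slice the adjacency matrix, producing the smaller pooled graph.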

AAAI Conference 2020 Conference Paper

Motif-Matching Based Subgraph-Level Attentional Convolutional Network for Graph Classification

  • Hao Peng
  • Jianxin Li
  • Qiran Gong
  • Yuanxin Ning
  • Senzhang Wang
  • Lifang He

Graph classification is critically important to many real-world applications associated with graph data, such as chemical drug analysis and social network mining. Traditional methods usually require feature engineering to extract graph features that help discriminate graphs of different classes. Although deep-learning-based graph embedding approaches have recently been proposed to automatically learn graph features, they mostly use a few vertex arrangements extracted from the graph for feature learning, which may lose some structural information. In this work, we present a novel motif-based attentional graph convolutional neural network for graph classification, which can learn more discriminative and richer graph features. Specifically, a motif-matching guided subgraph normalization method is developed to better preserve spatial information. A novel subgraph-level self-attention network is also proposed to capture the different impacts or weights of different subgraphs. Experimental results on both bioinformatics and social network datasets show that the proposed models significantly improve graph classification performance over both traditional graph kernel methods and recent deep learning approaches.

AAAI Conference 2019 Conference Paper

Adversarial Learning for Weakly-Supervised Social Network Alignment

  • Chaozhuo Li
  • Senzhang Wang
  • Yukun Wang
  • Philip Yu
  • Yanbo Liang
  • Yun Liu
  • Zhoujun Li

Nowadays, it is common for one natural person to join multiple social networks to enjoy different kinds of services. Linking identical users across multiple social networks, also known as social network alignment, is an important and challenging research problem. Existing methods usually link social identities at the pairwise-sample level, which may lead to undesirable performance when the number of available annotations is limited. Motivated by the isomorphism information, in this paper we consider all the identities in a social network as a whole and perform social network alignment at the distribution level. The insight is that we aim to learn a projection function that not only minimizes the distance between the distributions of user identities in two social networks, but also incorporates the available annotations as learning guidance. We propose three models, SNNAu, SNNAb, and SNNAo, to learn the projection function under a weakly-supervised adversarial learning framework. Empirically, we evaluate the proposed models over multiple datasets, and the results demonstrate the superiority of our proposals.

IJCAI Conference 2019 Conference Paper

Aspect-Based Sentiment Classification with Attentive Neural Turing Machines

  • Qianren Mao
  • Jianxin Li
  • Senzhang Wang
  • Yuanning Zhang
  • Hao Peng
  • Min He
  • Lihong Wang

Aspect-based sentiment classification aims to identify the sentiment polarity expressed towards a given opinion target in a sentence. The sentiment polarity of the target is not only highly determined by the sentiment semantic context but also correlated with the concerned opinion target. Existing works cannot effectively capture and store the inter-dependence between the opinion target and its context. To solve this issue, we propose a novel model, Attentive Neural Turing Machines (ANTM). Via interactive read-write operations between an external memory storage and a recurrent controller, ANTM can learn the dependency of the opinion target on its context and concentrate on crucial sentiment information. Specifically, ANTM separates storage from computation, which extends the controller's capability to learn and store sequential features. The read and write operations enable ANTM to adaptively keep track of the interactive attention history between memory content and controller state. Moreover, we append target entity embeddings to both the input and output of the controller to augment the integration of target information. We evaluate our model on the SemEval-2014 dataset, which contains reviews from the Laptop and Restaurant domains, and on a Twitter review dataset. Experimental results verify that our model achieves state-of-the-art performance on aspect-based sentiment classification.

AAAI Conference 2019 Conference Paper

Deeply Fusing Reviews and Contents for Cold Start Users in Cross-Domain Recommendation Systems

  • Wenjing Fu
  • Zhaohui Peng
  • Senzhang Wang
  • Yang Xu
  • Jin Li

As one promising way to address the challenging issues of data sparsity and cold start in recommender systems, cross-domain recommendation has gained increasing research interest recently. Cross-domain recommendation aims to improve recommendation performance by transferring explicit or implicit feedback from the auxiliary domain to the target domain. Although the side information of review texts and item contents has been proven useful in recommendation, most existing works use only one kind of side information and cannot deeply fuse it with ratings. In this paper, we propose a Review and Content based Deep Fusion Model named RC-DFM for cross-domain recommendation. We first extend Stacked Denoising Autoencoders (SDAE) to effectively fuse review texts and item contents with the rating matrix in both the auxiliary and target domains. In this way, the learned latent factors of users and items in both domains preserve more semantic information for recommendation. Then we utilize a multi-layer perceptron to transfer user latent factors between the two domains to address the data sparsity and cold start issues. Experimental results on real datasets demonstrate the superior performance of RC-DFM compared with state-of-the-art recommendation methods.

IJCAI Conference 2018 Conference Paper

Aspect-Level Deep Collaborative Filtering via Heterogeneous Information Networks

  • Xiaotian Han
  • Chuan Shi
  • Senzhang Wang
  • Philip S. Yu
  • Li Song

Latent factor models have been widely used for recommendation. Most existing latent factor models mainly utilize the rating information between users and items, although some recently extended models add auxiliary information to learn a unified latent factor between users and items. The unified latent factor only represents the latent features of users and items from the aspect of purchase history. However, the latent features of users and items may stem from different aspects, e.g., the brand aspect and category aspect of items. In this paper, we propose a Neural network based Aspect-level Collaborative Filtering model (NeuACF) to exploit different aspect-level latent factors. By modelling the rich objects and relations in a recommender system as a heterogeneous information network, NeuACF first extracts different aspect-level similarity matrices of users and items through different meta-paths, and then feeds these matrices to an elaborately designed deep neural network to learn aspect-level latent factors. Finally, the aspect-level latent factors are effectively fused with an attention mechanism for top-N recommendation. Extensive experiments on three real datasets show that NeuACF significantly outperforms both existing latent factor models and recent neural network models.

IJCAI Conference 2018 Conference Paper

Ensemble Neural Relation Extraction with Adaptive Boosting

  • Dongdong Yang
  • Senzhang Wang
  • Zhoujun Li

Relation extraction has been widely studied to extract new relational facts from open corpora. Previous relation extraction methods face the problems of wrong labels and noisy data, which substantially decrease model performance. In this paper, we propose an ensemble neural network model, Adaptive Boosting LSTMs with Attention, to more effectively perform relation extraction. Specifically, our model first employs the recurrent neural network LSTM to embed each sentence. We then incorporate attention into the LSTMs, considering that the words in a sentence do not contribute equally to its semantic meaning. Next, via adaptive boosting, we strategically build several such neural classifiers and ensemble them into a more effective and robust joint relation extractor. Experimental results on a real dataset demonstrate the superior performance of the proposed model, improving F1-score by about 8% compared to state-of-the-art models.
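
Adaptive boosting itself is a standard algorithm; a self-contained sketch follows, with decision stumps on 1-D data standing in for the paper's attention-LSTM base learners (the stumps and toy data are assumptions for brevity, not the authors' setup).

```python
import numpy as np

def train_stump(X, y, w):
    """Exhaustively pick the threshold/polarity stump with lowest weighted error."""
    best = None
    for thr in np.unique(X):
        for pol in (1, -1):
            pred = np.where(X >= thr, pol, -pol)
            err = np.sum(w[pred != y])
            if best is None or err < best[0]:
                best = (err, thr, pol)
    return best

def adaboost(X, y, rounds=5):
    """AdaBoost: reweight examples toward those the previous learner got wrong."""
    n = len(X)
    w = np.full(n, 1.0 / n)
    ensemble = []
    for _ in range(rounds):
        err, thr, pol = train_stump(X, y, w)
        err = max(err, 1e-10)                      # avoid log(0) on perfect stumps
        alpha = 0.5 * np.log((1 - err) / err)      # learner weight
        pred = np.where(X >= thr, pol, -pol)
        w *= np.exp(-alpha * y * pred)             # upweight mistakes
        w /= w.sum()
        ensemble.append((alpha, thr, pol))
    return ensemble

def predict(ensemble, X):
    """Weighted vote of all stumps."""
    agg = sum(a * np.where(X >= thr, pol, -pol) for a, thr, pol in ensemble)
    return np.sign(agg)

# Toy separable data: labels flip at X >= 3.
X = np.array([0.0, 1, 2, 3, 4, 5])
y = np.array([-1, -1, -1, 1, 1, 1])
ens = adaboost(X, y)
```

In the paper's setting, each stump would be replaced by an attention-LSTM classifier trained on the reweighted sentences, with the same weighted-vote combination.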

IJCAI Conference 2018 Conference Paper

Text Emotion Distribution Learning via Multi-Task Convolutional Neural Network

  • Yuxiang Zhang
  • Jiamei Fu
  • Dongyu She
  • Ying Zhang
  • Senzhang Wang
  • Jufeng Yang

Emotion analysis of online user-generated textual content is important for natural language processing and social media analytics tasks. Most previous emotion analysis approaches focus on identifying users' emotional states from text by classifying emotions into one of a finite set of categories, e.g., joy, surprise, anger, and fear. However, emotion analysis is inherently ambiguous, since a single sentence can evoke multiple emotions with different intensities. To address this problem, we introduce emotion distribution learning and propose a multi-task convolutional neural network for text emotion analysis. The end-to-end framework optimizes the distribution prediction and classification tasks simultaneously, and is able to learn robust representations for the distribution dataset with annotations from different voters. While most work adopts the majority-voting scheme for ground-truth labeling, we also propose a lexicon-based strategy to generate distributions from a single label, which provides prior information for emotion classification. Experiments conducted on five public text datasets (i.e., SemEval, Fairy Tales, ISEAR, TEC, CBET) demonstrate that our proposed method performs favorably against state-of-the-art approaches.

AAAI Conference 2016 Conference Paper

Aggregating Inter-Sentence Information to Enhance Relation Extraction

  • Hao Zheng
  • Zhoujun Li
  • Senzhang Wang
  • Zhao Yan
  • Jianshe Zhou

Previous work for relation extraction from free text is mainly based on intra-sentence information. As relations might be mentioned across sentences, inter-sentence information can be leveraged to improve distantly supervised relation extraction. To effectively exploit inter-sentence information, we propose a ranking-based approach, which first learns a scoring function based on a listwise learning-to-rank model and then uses it for multi-label relation extraction. Experimental results verify the effectiveness of our method for aggregating information across sentences. Additionally, to further improve the ranking of high-quality extractions, we propose an effective method to rank relations from different entity pairs. This method can be easily integrated into our overall relation extraction framework, and boosts the precision significantly.

TIST Journal 2016 Journal Article

Coranking the Future Influence of Multiobjects in Bibliographic Network Through Mutual Reinforcement

  • Senzhang Wang
  • Sihong Xie
  • Xiaoming Zhang
  • Zhoujun Li
  • Philip S. Yu
  • Yueying He

Scientific literature ranking is essential to help researchers find valuable publications in a large literature collection. Recently, with the prevalence of webpage ranking algorithms such as PageRank and HITS, graph-based algorithms have been widely used to iteratively rank papers and researchers through the networks formed by citation and coauthor relationships. However, existing graph-based ranking algorithms mostly focus on ranking the current importance of literature. Researchers entering an emerging research area might be more interested in new papers and young researchers that are likely to become influential in the future, since such papers and researchers are more helpful for quickly catching up on the most recent advances and finding valuable research directions. Meanwhile, although some works have been proposed to rank the prestige of a certain type of object with the help of multiple networks formed of multiobjects, a unified framework to simultaneously rank multiple types of objects in the bibliographic network is still lacking. In this article, we propose a unified ranking framework, MRCoRank, to corank the future popularity of four types of objects: papers, authors, terms, and venues through mutual reinforcement. Specifically, because the citation data of new publications are sparse and insufficient to characterize their innovativeness, we make the first attempt to extract text features to help characterize innovative papers and authors. With the observation that the current trend is more indicative of the future trend of citation and coauthor relationships, we then construct time-aware weighted graphs to quantify the importance of links established at different times on both citation and coauthor graphs. By leveraging both the constructed text features and the time-aware graphs, we finally fuse the rich information in a mutual reinforcement ranking framework to rank the future importance of the multiobjects simultaneously.
We evaluate the proposed model through extensive experiments on the ArnetMiner dataset containing more than 1,500,000 papers. Experimental results verify the effectiveness of MRCoRank in coranking the future influence of multiobjects in a bibliographic network.
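
For context (illustrative, not the paper's exact scheme): time-aware edge weighting of the kind described above, where recently established links count more, is commonly realized with exponential decay. The decay form and rate below are assumptions for the sketch.

```python
import math

def time_aware_weight(t_link, t_now, decay=0.1):
    """Weight of a citation/coauthor link established at t_link, viewed at t_now.

    Exponential decay: a link made just now gets weight 1.0, and older links
    decay smoothly toward 0, so recent activity dominates the ranking graph.
    """
    age = t_now - t_link
    return math.exp(-decay * age)

# Toy example: a 2019 coauthor link weighs more in 2020 than a 2010 one.
w_recent = time_aware_weight(2019, 2020)
w_old = time_aware_weight(2010, 2020)
```

These weights would replace the unweighted adjacency entries of the citation and coauthor graphs before running the mutual-reinforcement iteration.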

AAAI Conference 2015 Conference Paper

Burst Time Prediction in Cascades

  • Senzhang Wang
  • Zhao Yan
  • Xia Hu
  • Philip S. Yu
  • Zhoujun Li

Studying the bursty nature of cascades in social media is practically important in many applications such as product sales prediction, disaster relief, and stock market prediction. Although cascade volume prediction has been extensively studied, predicting when a burst will come remains an open problem. It is challenging to predict the time of the burst due to the “quick rise and fall” pattern and the diverse time spans of cascades. To this end, this paper proposes a classification-based approach for burst time prediction that utilizes and models rich knowledge about information diffusion. In particular, we first propose a time-window-based approach to predict in which time window the burst will appear. This transforms the time prediction task into a classification problem. To address the challenge that the original time-series data of cascade popularity alone are not sufficient for predicting cascades with diverse magnitudes and time spans, we explore rich information-diffusion-related knowledge and model it in a scale-independent manner. Extensive experiments on a Sina Weibo reposting dataset demonstrate the superior performance of the proposed approach in accurately predicting the burst time of posts.
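
The window-based reduction described above (turning a continuous burst time into a classification target) can be sketched in a few lines; the uniform window layout and clamping behavior are assumptions for illustration, not the paper's exact discretization.

```python
def burst_window_label(burst_time, start, window_len, n_windows):
    """Map a continuous burst timestamp to a discrete window index.

    Windows are [start, start+window_len), [start+window_len, ...), etc.
    The index is clamped into [0, n_windows - 1], so any burst outside the
    modeled horizon falls into the first or last window.
    """
    idx = int((burst_time - start) // window_len)
    return min(max(idx, 0), n_windows - 1)

# Toy example: with 60-unit windows starting at t=0, a burst at t=125
# falls into window 2, and the label becomes an ordinary class target.
label = burst_window_label(125, start=0, window_len=60, n_windows=4)
```

Any multi-class classifier can then be trained on the (features, window label) pairs in place of a direct time regressor.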