Arrow Research search

Author name cluster

Bowen Du

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

14 papers
1 author row

Possible papers

14

AAAI Conference 2026 Conference Paper

Global-Lens Transformers: Adaptive Token Mixing for Dynamic Link Prediction

  • Tao Zou
  • Chengfeng Wu
  • Tianxi Liao
  • Junchen Ye
  • Bowen Du

Dynamic graph learning plays a pivotal role in modeling evolving relationships over time, especially for temporal link prediction tasks in domains such as traffic systems, social networks, and recommendation platforms. While Transformer-based models have demonstrated strong performance by capturing long-range temporal dependencies, their reliance on self-attention results in quadratic complexity with respect to sequence length, limiting scalability on high-frequency or large-scale graphs. In this work, we revisit the necessity of self-attention in dynamic graph modeling. Inspired by recent findings that attribute the success of Transformers more to their architectural design than attention itself, we propose GLFormer, a novel attention-free Transformer-style framework for dynamic graphs. GLFormer introduces an adaptive token mixer that performs context-aware local aggregation based on interaction order and time intervals. To capture long-term dependencies, we further design a hierarchical aggregation module that expands the temporal receptive field by stacking local token mixers across layers. Experiments on six widely used dynamic graph benchmarks show that GLFormer achieves competitive or superior performance, revealing that attention-free architectures can match or surpass Transformer baselines in dynamic graph settings with significantly improved efficiency.
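The core idea of the abstract's adaptive token mixer can be illustrated with a minimal sketch. This is not GLFormer's actual (learned) mixer; the function name, the fixed exponential decay, and the `window`/`decay` parameters are all illustrative assumptions standing in for learned, context-aware weights:

```python
import math

def local_token_mix(tokens, times, window=3, decay=0.5):
    """Toy attention-free local aggregation: each position mixes its
    preceding `window` tokens, weighted by an exponential decay over
    the time gap to the current interaction (recency-aware mixing)."""
    mixed = []
    for i in range(len(tokens)):
        lo = max(0, i - window + 1)
        idx = list(range(lo, i + 1))
        # weight each in-window token by how recent it is relative to i
        ws = [math.exp(-decay * (times[i] - times[j])) for j in idx]
        total = sum(ws)
        dim = len(tokens[0])
        out = [sum(w * tokens[j][d] for w, j in zip(ws, idx)) / total
               for d in range(dim)]
        mixed.append(out)
    return mixed
```

Stacking such local mixers across layers widens the effective temporal receptive field (each layer sees the mixed outputs of the previous one), which is the role the abstract assigns to the hierarchical aggregation module.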

AAAI Conference 2026 Conference Paper

UNO! UNified Offline Training Paradigm for Learning Path Recommendation

  • Linzhi Peng
  • Wentao Zhu
  • Ke Cheng
  • Heng Chang
  • Junchen Ye
  • Bowen Du
  • Weifeng Lv

With the wide adoption of online education platforms, adaptive learning systems have become increasingly important. Learning Path Recommendation (LPR) aims to dynamically adjust learning content to optimize learning efficiency based on individual student needs. However, current LPR methods suffer from sparse rewards, which hinder precise assessment, and focus only on anonymous sessions, overlooking more personalized and effective paths. To address these challenges, we propose UNO, a UNified Offline Training Paradigm for Learning Path Recommendation. This approach introduces an offline training paradigm into RL-based LPR that provides dense process rewards via a personalized advantage derived from a reward model, which estimates students' internal knowledge levels on the learning targets. Additionally, we propose the UniLPR model, a personalized recommendation system that unifies the modeling of the implicit relationships between students' long-term accumulation and their evolving requirements for questions, refined through Group Relative Policy Optimization (GRPO). Finally, we design learning tasks that encompass historical reviewing, recent learning, and long-term exploratory learning to simulate the comprehensive and diverse learning needs of students. UNO achieves state-of-the-art performance across all tasks, demonstrating its effectiveness.

IJCAI Conference 2024 Conference Paper

An NCDE-based Framework for Universal Representation Learning of Time Series

  • Zihan Liu
  • Bowen Du
  • Junchen Ye
  • Xianqing Wen
  • Leilei Sun

Exploiting self-supervised learning (SSL) to extract the universal representations of time series could not only capture the natural properties of time series but also offer huge help to the downstream tasks. Nevertheless, existing time series representation learning (TSRL) methods face challenges in attaining universality. Indeed, existing methods relying solely on one SSL strategy (either contrastive learning (CL) or generative) often fall short in capturing rich semantic information for various downstream tasks. Moreover, time series exhibit diverse distributions and inherent characteristics, particularly with the common occurrence of missing values, posing a notable challenge for existing backbones in effectively handling such diverse time series data. To bridge these gaps, we propose CTRL, a framework for universal TSRL. For the first time, we employ Neural Controlled Differential Equation (NCDE) as the backbone for TSRL, which captures the continuous processes and exhibits robustness to missing data. Additionally, a dual-task SSL strategy, integrating both reconstruction and contrasting tasks, is proposed to enrich the semantic information of the learned representations. Furthermore, novel hard negative construction and false negative elimination mechanisms are proposed to improve sampling efficiency and reduce sampling bias in CL. Finally, extensive experiments demonstrate the superiority of CTRL in forecasting, classification, and imputation tasks, particularly its outstanding robustness to missing data.
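The dual-task SSL objective described above can be sketched as a sum of a generative (reconstruction) term and a contrastive term. This is an assumed, simplified form, not CTRL's exact losses: the function name, the MSE reconstruction, the InfoNCE-style contrastive term, and the `temp` parameter are illustrative choices.

```python
import math

def dual_task_loss(repr_a, repr_b, recon, target, negatives, temp=0.1):
    """Illustrative dual-task objective: MSE reconstruction of the input
    plus an InfoNCE-style term that pulls two views (repr_a, repr_b) of
    the same series together and pushes negative samples away."""
    def dot(u, v): return sum(x * y for x, y in zip(u, v))
    def cos(u, v): return dot(u, v) / math.sqrt(dot(u, u) * dot(v, v))

    # generative task: reconstruct the (possibly masked) input
    recon_loss = sum((r - t) ** 2 for r, t in zip(recon, target)) / len(target)

    # contrastive task: positive pair vs. negative samples
    pos = math.exp(cos(repr_a, repr_b) / temp)
    negs = sum(math.exp(cos(repr_a, n) / temp) for n in negatives)
    contrastive_loss = -math.log(pos / (pos + negs))

    return recon_loss + contrastive_loss
```

The paper's hard-negative construction and false-negative elimination would decide what goes into `negatives`; here it is simply a given list.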

AAAI Conference 2023 Conference Paper

Conditional Diffusion Based on Discrete Graph Structures for Molecular Graph Generation

  • Han Huang
  • Leilei Sun
  • Bowen Du
  • Weifeng Lv

Learning the underlying distribution of molecular graphs and generating high-fidelity samples is a fundamental research problem in drug discovery and material science. However, accurately modeling distribution and rapidly generating novel molecular graphs remain crucial and challenging goals. To accomplish these goals, we propose a novel Conditional Diffusion model based on discrete Graph Structures (CDGS) for molecular graph generation. Specifically, we construct a forward graph diffusion process on both graph structures and inherent features through stochastic differential equations (SDE) and derive discrete graph structures as the condition for reverse generative processes. We present a specialized hybrid graph noise prediction model that extracts the global context and the local node-edge dependency from intermediate graph states. We further utilize ordinary differential equation (ODE) solvers for efficient graph sampling, based on the semi-linear structure of the probability flow ODE. We also combine the solvers with gradient guidance from the molecule property predictor for similarity-constrained molecule optimization. Experiments on diverse datasets validate the effectiveness of our framework. Particularly, the proposed method still generates high-quality molecular graphs in a limited number of steps.

IJCAI Conference 2023 Conference Paper

Continuous-Time Graph Learning for Cascade Popularity Prediction

  • Xiaodong Lu
  • Shuo Ji
  • Le Yu
  • Leilei Sun
  • Bowen Du
  • Tongyu Zhu

Information propagation on social networks could be modeled as cascades, and many efforts have been made to predict the future popularity of cascades. However, most of the existing research treats a cascade as an individual sequence. Actually, the cascades might be correlated with each other due to the shared users or similar topics. Moreover, the preferences of users and semantics of a cascade are usually continuously evolving over time. In this paper, we propose a continuous-time graph learning method for cascade popularity prediction, which first connects different cascades via a universal sequence of user-cascade and user-user interactions and then chronologically learns on the sequence by maintaining the dynamic states of users and cascades. Specifically, for each interaction, we present an evolution learning module to continuously update the dynamic states of the related users and cascade based on their currently encoded messages and previous dynamic states. We also devise a cascade representation learning component to embed the temporal information and structural information carried by the cascade. Experiments on real-world datasets demonstrate the superiority and rationality of our approach.
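The evolution learning module's per-interaction state update can be sketched with a toy gated blend. The paper's module is learned (e.g., recurrent); the scalar sigmoid gate and the weight `w` below are hypothetical simplifications:

```python
import math

def evolve_state(state, message, w=0.8):
    """Toy continuous-time state update: on each interaction, a scalar
    gate decides how strongly the encoded message overwrites the node's
    previous dynamic state (stand-in for a learned evolution module)."""
    g = 1 / (1 + math.exp(-w * sum(m - s for m, s in zip(message, state))))
    return [(1 - g) * s + g * m for s, m in zip(state, message)]
```

Replaying the universal interaction sequence chronologically and calling such an update for the user and cascade involved in each interaction is the maintenance loop the abstract describes.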

AAAI Conference 2023 Conference Paper

Generic and Dynamic Graph Representation Learning for Crowd Flow Modeling

  • Liangzhe Han
  • Ruixing Zhang
  • Leilei Sun
  • Bowen Du
  • Yanjie Fu
  • Tongyu Zhu

Many deep spatio-temporal learning methods have been proposed for crowd flow modeling in recent years. However, most of them focus on designing a spatial and temporal convolution mechanism to aggregate information from nearby nodes and historical observations for a pre-defined prediction task. Different from the existing research, this paper aims to provide a generic and dynamic representation learning method for crowd flow modeling. The main idea of our method is to maintain a continuous-time representation for each node, and update the representations of all nodes continuously according to the streaming observed data. Along this line, a particular encoder-decoder architecture is proposed, where the encoder converts the newly happened transactions into a timestamped message, and then the representations of related nodes are updated according to the generated message. The role of the decoder is to guide the representation learning process by reconstructing the observed transactions based on the most recent node representations. Moreover, a number of virtual nodes are added to discover macro-level spatial patterns and also share the representations among spatially-interacted stations. Experiments have been conducted on two real-world datasets for four popular prediction tasks in crowd flow modeling. The result demonstrates that our method could achieve better prediction performance for all the tasks than baseline methods.

AAAI Conference 2023 Conference Paper

Predicting Temporal Sets with Simplified Fully Connected Networks

  • Le Yu
  • Zihang Liu
  • Tongyu Zhu
  • Leilei Sun
  • Bowen Du
  • Weifeng Lv

Given a sequence of sets, where each set contains an arbitrary number of elements, temporal sets prediction aims to predict which elements will appear in the subsequent set. Existing methods for temporal sets prediction are developed on sophisticated components (e.g., recurrent neural networks, attention or gating mechanisms, and graph neural networks), which inevitably increase the model complexity due to more trainable parameters and higher computational costs. Moreover, the involved nonlinear activation may contribute little or even degrade the performance. In this paper, we present a succinct architecture that is solely built on the Simplified Fully Connected Networks (SFCNs) for temporal sets prediction to bring both effectiveness and efficiency together. In particular, given a user's sequence of sets, we employ SFCNs to derive representations of the user by learning inter-set temporal dependencies, intra-set element relationships, and intra-embedding channel correlations. Two families of general functions are introduced to preserve the permutation-invariant property of each set and the permutation-equivariant property of elements in each set. Moreover, we design an adaptive user-representation fusing module to aggregate user representations according to each element for improving the prediction performance. Experiments on four benchmarks show the superiority of our approach over the state-of-the-art under both transductive and inductive settings. We also theoretically and empirically demonstrate that our model has lower space and time complexity than baselines. Codes and datasets are available at https://github.com/yule-BUAA/SFCNTSP.
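The permutation-equivariant property mentioned above has a standard minimal form: transform each element by its own value plus a shared pooled statistic, so permuting the input permutes the output identically. The scalar version below is illustrative (coefficients `a`, `b` stand in for learned weights; it is not the paper's layer):

```python
def equivariant_layer(xs, a=1.0, b=0.5):
    """Toy permutation-equivariant map over a set: each output mixes the
    element's own value with the set mean, a shared statistic that is
    itself permutation-invariant."""
    m = sum(xs) / len(xs)
    return [a * x + b * m for x in xs]
```

A permutation-invariant readout then follows by pooling (e.g., summing) the equivariant outputs, since the sum does not depend on element order.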

NeurIPS Conference 2023 Conference Paper

Towards Better Dynamic Graph Learning: New Architecture and Unified Library

  • Le Yu
  • Leilei Sun
  • Bowen Du
  • Weifeng Lv

We propose DyGFormer, a new Transformer-based architecture for dynamic graph learning. DyGFormer is conceptually simple and only needs to learn from nodes' historical first-hop interactions by: (1) a neighbor co-occurrence encoding scheme that explores the correlations of the source node and destination node based on their historical sequences; (2) a patching technique that divides each sequence into multiple patches and feeds them to the Transformer, allowing the model to effectively and efficiently benefit from longer histories. We also introduce DyGLib, a unified library with standard training pipelines, extensible coding interfaces, and comprehensive evaluating protocols to promote reproducible, scalable, and credible dynamic graph learning research. By performing exhaustive experiments on thirteen datasets for dynamic link prediction and dynamic node classification tasks, we find that DyGFormer achieves state-of-the-art performance on most of the datasets, demonstrating its effectiveness in capturing nodes' correlations and long-term temporal dependencies. Moreover, some results of baselines are inconsistent with previous reports, which may be caused by their diverse but less rigorous implementations, showing the importance of DyGLib. All the used resources are publicly available at https://github.com/yule-BUAA/DyGLib.
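The two ingredients named in the abstract are easy to sketch. The functions below are illustrative simplifications (the real encodings operate on learnable embeddings, not raw ids, and the padding value is an assumption):

```python
from collections import Counter

def patch_sequence(seq, patch_len):
    """Split a node's historical interaction sequence into fixed-length
    patches (zero-padding the tail), so the Transformer attends over
    len(seq)/patch_len patch tokens instead of every raw interaction."""
    padded = list(seq) + [0] * ((-len(seq)) % patch_len)
    return [padded[i:i + patch_len] for i in range(0, len(padded), patch_len)]

def cooccurrence_features(src_hist, dst_hist):
    """Toy neighbor co-occurrence encoding: for each neighbor in the
    source node's history, count how often it appears in the source's
    and the destination's histories; shared neighbors get high counts
    on both sides, signaling a likely link."""
    cs, cd = Counter(src_hist), Counter(dst_hist)
    return [(cs[n], cd[n]) for n in src_hist]
```

Patching trades sequence length for token width, which is how the model affords longer histories at sub-quadratic attention cost over the original interaction count.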

TIST Journal 2022 Journal Article

Graph Sequence Neural Network with an Attention Mechanism for Traffic Speed Prediction

  • Zhilong Lu
  • Weifeng Lv
  • Zhipu Xie
  • Bowen Du
  • Guixi Xiong
  • Leilei Sun
  • Haiquan Wang

Recent years have witnessed the emerging success of Graph Neural Networks (GNNs) for modeling graphical data. A GNN can model the spatial dependencies of nodes in a graph based on message passing through node aggregation. However, in many application scenarios, these spatial dependencies can change over time, and a basic GNN model cannot capture these changes. In this article, we propose a Graph Sequence neural network with an Attention mechanism (GSeqAtt) for processing graph sequences. More specifically, two attention mechanisms are combined: a horizontal mechanism and a vertical mechanism. GTransformer, which is a horizontal attention mechanism for handling time series, is used to capture the correlations between graphs in the input time sequence. The vertical attention mechanism, a Graph Network (GN) block structure with an attention mechanism (GNAtt), acts within the graph structure in each frame of the time series. Experiments show that our proposed model is able to handle information propagation for graph sequences accurately and efficiently. Moreover, results on real-world data from three road intersections show that our GSeqAtt outperforms state-of-the-art baselines on the traffic speed prediction task.

AAAI Conference 2021 Conference Paper

Coupled Layer-wise Graph Convolution for Transportation Demand Prediction

  • Junchen Ye
  • Leilei Sun
  • Bowen Du
  • Yanjie Fu
  • Hui Xiong

Graph Convolutional Network (GCN) has been widely applied in transportation demand prediction due to its excellent ability to capture non-Euclidean spatial dependence among station-level or regional transportation demands. However, in most of the existing research, the graph convolution was implemented on a heuristically generated adjacency matrix, which could neither reflect the real spatial relationships of stations accurately, nor capture the multi-level spatial dependence of demands adaptively. To cope with the above problems, this paper provides a novel graph convolutional network for transportation demand prediction. Firstly, a novel graph convolution architecture is proposed, which has different adjacency matrices in different layers and all the adjacency matrices are self-learned during the training process. Secondly, a layer-wise coupling mechanism is provided, which associates the upper-level adjacency matrix with the lower-level one. It also reduces the scale of parameters in our model. Lastly, a unitary network is constructed to give the final prediction result by integrating the hidden spatial states with gated recurrent unit, which could capture the multi-level spatial dependence and temporal dynamics simultaneously. Experiments have been conducted on two real-world datasets, NYC Citi Bike and NYC Taxi, and the results demonstrate the superiority of our model over the state-of-the-art ones.
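The layer-wise coupling idea (deriving an upper layer's adjacency from the lower layer's instead of learning each independently) can be sketched as follows. This is an assumed, illustrative coupling rule, not the paper's learned mechanism; the two-hop mixing and the `alpha` parameter are hypothetical:

```python
def couple_adjacency(A, alpha=0.5):
    """Toy layer-wise coupling: the next layer's adjacency mixes the
    current adjacency with its two-hop composition (A @ A), then
    row-normalizes, so upper layers see coarser, multi-hop structure
    while staying tied to (and sharing parameters with) the layer below."""
    n = len(A)
    A2 = [[sum(A[i][k] * A[k][j] for k in range(n)) for j in range(n)]
          for i in range(n)]
    mixed = [[(1 - alpha) * A[i][j] + alpha * A2[i][j] for j in range(n)]
             for i in range(n)]
    return [[v / (sum(row) or 1.0) for v in row] for row in mixed]
```

Because each layer's matrix is a function of the previous one, only the base adjacency (and the coupling parameters) needs to be learned, which is how the coupling reduces the parameter scale noted in the abstract.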

IJCAI Conference 2020 Conference Paper

Rebalancing Expanding EV Sharing Systems with Deep Reinforcement Learning

  • Man Luo
  • Wenzhe Zhang
  • Tianyou Song
  • Kun Li
  • Hongming Zhu
  • Bowen Du
  • Hongkai Wen

Electric Vehicle (EV) sharing systems have recently experienced unprecedented growth across the world. One of the key challenges in their operation is vehicle rebalancing, i.e., repositioning the EVs across stations to better satisfy future user demand. This is particularly challenging in the shared EV context, because i) the range of EVs is limited while charging time is substantial, which constrains the rebalancing options; and ii) as a new mobility trend, most of the current EV sharing systems are still continuously expanding their station networks, i.e., the targets for rebalancing can change over time. To tackle these challenges, in this paper we model the rebalancing task as a Multi-Agent Reinforcement Learning (MARL) problem, which directly takes the range and charging properties of the EVs into account. We propose a novel approach of policy optimization with action cascading, which isolates the non-stationarity locally, and use two connected networks to solve the formulated MARL. We evaluate the proposed approach using a simulator calibrated with 1-year operation data from a real EV sharing system. Results show that our approach significantly outperforms the state-of-the-art, offering up to 14% gain in order satisfied rate and 12% increase in net revenue.

AAAI Conference 2018 Conference Paper

S-Net: From Answer Extraction to Answer Synthesis for Machine Reading Comprehension

  • Chuanqi Tan
  • Furu Wei
  • Nan Yang
  • Bowen Du
  • Weifeng Lv
  • Ming Zhou

In this paper, we present a novel approach to machine reading comprehension for the MS-MARCO dataset. Unlike the SQuAD dataset, which aims to answer a question with exact text spans in a passage, the MS-MARCO dataset defines the task as answering a question from multiple passages, where the words in the answer are not necessarily present in the passages. We therefore develop an extraction-then-synthesis framework to synthesize answers from extraction results. Specifically, the answer extraction model is first employed to predict the most important sub-spans from the passage as evidence, and the answer synthesis model takes the evidence as additional features along with the question and passage to further elaborate the final answers. We build the answer extraction model with state-of-the-art neural networks for single-passage reading comprehension, and propose an additional task of passage ranking to help answer extraction in multiple passages. The answer synthesis model is based on sequence-to-sequence neural networks with extracted evidence as features. Experiments show that our extraction-then-synthesis method outperforms state-of-the-art methods.

TIST Journal 2018 Journal Article

SmartTransfer

  • Bowen Du
  • Yifeng Cui
  • Yanjie Fu
  • Runxing Zhong
  • Hui Xiong

In urban transportation systems, transfer stations refer to hubs connecting a variety of bus and subway lines and, thus, are the most important nodes in transportation networks. The pervasive availability of large-scale travel traces of passengers, collected from automated fare collection (AFC) systems, has provided unprecedented opportunities for understanding citywide transfer patterns, which can benefit smart transportation, such as smart route recommendation to avoid crowded lines, and dynamic bus scheduling to enhance transportation efficiency. To this end, in this article, we provide a systematic study of the measurement, patterns, and modeling of spatiotemporal dynamics of passenger transfers. Along this line, we develop a data-driven analytical system for modeling the transfer volumes of each transfer station. More specifically, we first identify and quantify the discriminative patterns of spatiotemporal dynamics of passenger transfers by utilizing heterogeneous sources of transfer related data for each station. Also, we develop a multi-task spatiotemporal learning model for predicting the transfer volumes of a specific station at a specific time period. Moreover, we further leverage the predictive model of passenger transfers to provide crowdedness-aware route recommendations. Finally, we conduct the extensive evaluations with a variety of real-world data. Experimental results demonstrate the effectiveness of our proposed modeling method and its applications for smart transportation.