Arrow Research search

Author name cluster

Cheng Long

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

AAAI Conference 2026 Conference Paper

From Dialogue to Destination: Geography-Aware Large Language Models with Multimodal Fusion for Conversational Recommendation

  • Yeming Li
  • Chenxi Liu
  • Jie Zou
  • Cheng Long
  • Chaoning Zhang
  • Peng Wang
  • Yang Yang

Conversational Recommender Systems (CRS) aim to provide personalized recommendations by interacting with users through natural language dialogue. However, in scenarios requiring deep geospatial awareness, existing methods, including those based on Large Language Models (LLMs), still face significant challenges in effectively fusing heterogeneous, multimodal geographic information with dynamic dialogue context. Simple fusion strategies struggle to resolve the asymmetric dependencies between dynamic user intent and static geographic context and fail to bridge the semantic gap between LLMs and structured geospatial data. To address these issues, we propose a framework for geography-aware CRS, named GeoCRS. Our core idea is to empower a frozen LLM with powerful geospatial reasoning capabilities by conditioning it on a dynamic, multimodal guidance signal generated by an external fusion architecture, all without altering the LLM's internal parameters. Specifically, we first design a hierarchical geographical encoder to uniformly represent heterogeneous geographic data. Subsequently, we introduce a contextual feature modulation module that asymmetrically injects the geographic context into the user's dialogue intent via a novel modulation mechanism to improve conversational recommendation via both geographic and dialogue context. Extensive experiments on public benchmark datasets demonstrate that our proposed GeoCRS significantly outperforms state-of-the-art baselines on the geography-aware conversational recommendation task.
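The "modulation mechanism" described in this abstract can be pictured with a FiLM-style feature-wise modulation, a common conditioning technique. This is a hedged toy sketch, not GeoCRS's actual module: the function name `modulate` and the tanh scale / linear shift parameterization are illustrative assumptions. The key property it demonstrates is the asymmetry: the static geographic context produces scale and shift parameters that condition the dynamic dialogue intent, never the reverse.

```python
import numpy as np

def modulate(dialogue_intent, geo_context, W_gamma, W_beta):
    """FiLM-style feature-wise modulation: the geographic context vector
    generates per-feature scale (gamma) and shift (beta) parameters that
    condition the dialogue-intent representation."""
    gamma = np.tanh(geo_context @ W_gamma)   # per-feature scale from geo context
    beta = geo_context @ W_beta              # per-feature shift from geo context
    return gamma * dialogue_intent + beta
```

Note that with a zero geographic context, both gamma and beta vanish, so the output carries no dialogue signal at all; any real design would add a residual path, which is omitted here for brevity.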

AAAI Conference 2026 Conference Paper

SubGCache: Accelerating Graph-based RAG with Subgraph-level KV Cache

  • Qiuyu Zhu
  • Liang Zhang
  • Qianxiong Xu
  • Cheng Long
  • Jie Zhang

Graph-based retrieval-augmented generation (RAG) enables large language models (LLMs) to incorporate structured knowledge via graph retrieval as contextual input, enabling more accurate and context-aware reasoning. We observe that, for different queries, it can retrieve similar subgraphs as prompts, and thus we propose SubGCache, which aims to reduce inference latency by reusing computation across queries with similar structural prompts (i.e., subgraphs). Specifically, SubGCache clusters queries based on subgraph embeddings, constructs a representative subgraph for each cluster, and pre-computes the key-value (KV) cache of the representative subgraph. For each query with its retrieved subgraph within a cluster, it reuses the pre-computed KV cache of the cluster's representative subgraph without recomputing the KV tensors, thereby saving computation. Extensive experiments on three datasets across multiple LLM backbones and graph-based RAG frameworks demonstrate that SubGCache consistently reduces inference latency with comparable and even improved generation quality, achieving up to 6.68x reduction in time-to-first-token (TTFT).
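The cluster-then-reuse idea in this abstract can be illustrated with a toy cache keyed by cluster. All names here (`cluster_queries`, `SubgraphKVCache`) and the greedy leader clustering are illustrative assumptions, not SubGCache's implementation; the "KV cache" is an opaque value standing in for real KV tensors.

```python
import numpy as np

def cluster_queries(subgraph_embs, radius):
    """Greedy leader clustering as a stand-in for the paper's clustering step:
    a query joins the first cluster whose representative subgraph embedding
    lies within `radius`, otherwise it starts a new cluster."""
    reps, labels = [], []
    for e in subgraph_embs:
        for cid, r in enumerate(reps):
            if np.linalg.norm(e - r) <= radius:
                labels.append(cid)
                break
        else:
            reps.append(e)
            labels.append(len(reps) - 1)
    return labels

class SubgraphKVCache:
    """One precomputed entry per cluster, reused by every query in it."""
    def __init__(self):
        self._cache = {}
        self.prefills = 0
    def get_or_build(self, cluster_id, build_fn):
        if cluster_id not in self._cache:
            self._cache[cluster_id] = build_fn()  # expensive prefill, done once
            self.prefills += 1
        return self._cache[cluster_id]
```

With six queries falling into two embedding clusters, only two prefill computations run; the other four queries hit the cache, which is the source of the latency reduction.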

ICML Conference 2025 Conference Paper

Arrow: Accelerator for Time Series Causal Discovery with Time Weaving

  • Yuanyuan Yao 0002
  • Yuan Dong
  • Lu Chen 0001
  • Kun Kuang 0001
  • Ziquan Fang
  • Cheng Long
  • Yunjun Gao
  • Tianyi Li 0005

Current causal discovery methods for time series data can effectively address a variety of scenarios; however, they remain constrained by inefficiencies. These inefficiencies arise primarily from the high computational costs associated with binning, the uncertainty in selecting appropriate time lags, and the extensive sets of candidate variables. To achieve both high efficiency and effectiveness in causal discovery, we introduce an accelerator termed ARROW. It incorporates an innovative concept termed “Time Weaving” that efficiently encodes time series data to capture dynamic trends well, thereby mitigating computational complexity. We also propose a novel time lag discovery strategy utilizing XOR operations, from which we derive a theorem to obtain the optimal time lag, significantly enhancing efficiency. To optimize the search space for causal relationships, we design an efficient pruning strategy that intelligently identifies the most relevant candidate variables, enhancing the efficiency and accuracy of causal discovery. We applied ARROW to four different types of time series causal discovery algorithms and evaluated it on 25 synthetic and real-world datasets. The results demonstrate that, compared to the original algorithms, ARROW achieves up to 153x speedup while achieving higher accuracy in most cases.
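One way to picture an XOR-based time lag search is the following sketch: binarize each series into up/down moves (a crude stand-in for the paper's "Time Weaving" encoding), then score each candidate lag by the XOR mismatch rate between the shifted cause and the effect. This is illustrative only, not ARROW's actual encoding or theorem.

```python
def binarize(series):
    """Encode a series as up/down moves (1 = increase)."""
    return [1 if b > a else 0 for a, b in zip(series, series[1:])]

def best_lag(cause, effect, max_lag):
    """Score each candidate lag by the XOR mismatch rate between the shifted
    binarized cause and the binarized effect; the lag with the fewest
    mismatches is taken as optimal."""
    c, e = binarize(cause), binarize(effect)
    scores = {}
    for lag in range(1, max_lag + 1):
        pairs = list(zip(c[:-lag], e[lag:]))
        scores[lag] = sum(a ^ b for a, b in pairs) / len(pairs)
    return min(scores, key=scores.get), scores
```

XOR on bit-encoded series makes each lag candidate cheap to score (bitwise operations instead of binning-based statistics), which is where the efficiency gain comes from.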

ICML Conference 2025 Conference Paper

Generalization in Federated Learning: A Conditional Mutual Information Framework

  • Ziqiao Wang
  • Cheng Long
  • Yongyi Mao

Federated learning (FL) is a widely adopted privacy-preserving distributed learning framework, yet its generalization performance remains less explored compared to centralized learning. In FL, the generalization error consists of two components: the out-of-sample gap, which measures the gap between the empirical and true risk for participating clients, and the participation gap, which quantifies the risk difference between participating and non-participating clients. In this work, we apply an information-theoretic analysis via the conditional mutual information (CMI) framework to study FL’s two-level generalization. Beyond the traditional supersample-based CMI framework, we introduce a superclient construction to accommodate the two-level generalization setting in FL. We derive multiple CMI-based bounds, including hypothesis-based CMI bounds, illustrating how privacy constraints in FL can imply generalization guarantees. Furthermore, we propose fast-rate evaluated CMI bounds that recover the best-known convergence rate for two-level FL generalization in the small empirical risk regime. For specific FL model aggregation strategies and structured loss functions, we refine our bounds to achieve improved convergence rates with respect to the number of participating clients. Empirical evaluations confirm that our evaluated CMI bounds are non-vacuous and accurately capture the generalization behavior of FL algorithms.

AAAI Conference 2025 Conference Paper

KITS: Inductive Spatio-Temporal Kriging with Increment Training Strategy

  • Qianxiong Xu
  • Cheng Long
  • Ziyue Li
  • Sijie Ruan
  • Rui Zhao
  • Zhishuai Li

Sensors are commonly deployed to perceive the environment. However, due to their high cost, sensors are usually sparsely deployed. Kriging is the task tailored to inferring the unobserved nodes (without sensors) using the observed nodes (with sensors); the essence of the kriging task is transferability. Recently, several inductive spatio-temporal kriging methods have been proposed based on graph neural networks, trained on a graph built on top of observed nodes via pretext tasks such as masking nodes out and reconstructing them. However, the graph in training is inevitably much sparser than the graph in inference, which includes all the observed and unobserved nodes. The learned pattern cannot be well generalized for inference, an issue we denote as the graph gap. To address this issue, we first present a novel increment training strategy: instead of masking nodes (and reconstructing them), we add virtual nodes into the training graph so as to mitigate the graph gap issue naturally. Nevertheless, the empty-shell virtual nodes without labels could have poorly learned features and lack supervision signals. To solve these issues, we pair each virtual node with its most similar observed node and fuse their features together; to enhance the supervision signal, we construct reliable pseudo labels for virtual nodes. As a result, the learned pattern of virtual nodes can be safely transferred to real unobserved nodes for reliable kriging. We name our new Kriging model with Increment Training Strategy KITS. Extensive experiments demonstrate that KITS consistently outperforms existing methods by large margins, e.g., the improvement in MAE can be as high as 18.33%.
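The virtual-node pairing step described above can be sketched as follows. The function name, the random virtual-node initialization, the cosine-similarity pairing, and the averaging fusion rule are all illustrative assumptions, not KITS's actual design.

```python
import numpy as np

def insert_virtual_nodes(obs_feats, n_virtual, seed=0):
    """Toy version of the increment step: create random 'empty-shell' virtual
    nodes, pair each with its most similar observed node (cosine similarity),
    fuse their features, and use the partner's features as a pseudo label."""
    rng = np.random.default_rng(seed)
    virt = rng.normal(size=(n_virtual, obs_feats.shape[1]))
    unit = lambda x: x / np.linalg.norm(x, axis=1, keepdims=True)
    sim = unit(virt) @ unit(obs_feats).T        # (n_virtual, n_observed)
    partners = sim.argmax(axis=1)               # most similar observed node
    fused = 0.5 * (virt + obs_feats[partners])  # feature fusion
    pseudo_labels = obs_feats[partners]         # supervision for virtual nodes
    return fused, partners, pseudo_labels
```

The point of the pairing is visible in the sketch: each virtual node both borrows informative features and gains a supervision target from its observed partner, so it is no longer an unsupervised empty shell.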

AAAI Conference 2025 Conference Paper

TimeCMA: Towards LLM-Empowered Multivariate Time Series Forecasting via Cross-Modality Alignment

  • Chenxi Liu
  • Qianxiong Xu
  • Hao Miao
  • Sun Yang
  • Lingzheng Zhang
  • Cheng Long
  • Ziyue Li
  • Rui Zhao

Multivariate time series forecasting (MTSF) aims to learn temporal dynamics among variables to forecast future time series. Existing statistical and deep learning-based methods suffer from limited learnable parameters and small-scale training data. Recently, large language models (LLMs) combining time series with textual prompts have achieved promising performance in MTSF. However, we discovered that current LLM-based solutions fall short in learning disentangled embeddings. We introduce TimeCMA, an intuitive yet effective framework for MTSF via cross-modality alignment. Specifically, we present a dual-modality encoding with two branches: the time series encoding branch extracts disentangled yet weak time series embeddings, and the LLM-empowered encoding branch wraps the same time series with text as prompts to obtain entangled yet robust prompt embeddings. As a result, such a cross-modality alignment retrieves both disentangled and robust time series embeddings, "the best of both worlds", from the prompt embeddings based on time series and prompt modality similarities. As another key design, to reduce the computational costs of time series with their lengthy textual prompts, we design an effective prompt to encourage the most essential temporal information to be encapsulated in the last token: only the last token is passed to downstream prediction. We further store the last token embeddings to accelerate inference speed. Extensive experiments on eight real datasets demonstrate that TimeCMA outperforms state-of-the-art methods.
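The last-token design above can be pictured with a small sketch: discard all prompt-token embeddings except the last one, then use similarity to that token to retrieve a time-series embedding. The softmax-weighted retrieval and the function name are illustrative assumptions, not TimeCMA's actual alignment module.

```python
import numpy as np

def retrieve_from_last_token(ts_embs, prompt_token_embs):
    """Keep only the final prompt-token embedding (assumed to encapsulate the
    essential temporal information) and retrieve a time-series embedding by
    similarity-weighted pooling over the disentangled TS embeddings."""
    last = prompt_token_embs[-1]            # only the last token is kept
    logits = ts_embs @ last                 # similarity of each TS embedding
    w = np.exp(logits - logits.max())
    w /= w.sum()
    return w @ ts_embs                      # retrieved (aligned) embedding
```

Because only one token per prompt survives, the cached state per series shrinks from prompt-length to a single embedding, which is what makes storing last-token embeddings a cheap way to accelerate inference.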

IJCAI Conference 2025 Conference Paper

Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era

  • Chenxi Liu
  • Shaowen Zhou
  • Qianxiong Xu
  • Hao Miao
  • Cheng Long
  • Ziyue Li
  • Rui Zhao

The proliferation of edge devices has generated an unprecedented volume of time series data across different domains, motivating a variety of well-customized methods. Recently, Large Language Models (LLMs) have emerged as a new paradigm for time series analytics by leveraging the shared sequential nature of textual data and time series. However, a fundamental cross-modality gap between time series and LLMs exists, as LLMs are pre-trained on textual corpora and are not inherently optimized for time series. Many recent proposals are designed to address this issue. In this survey, we provide an up-to-date overview of LLMs-based cross-modality modeling for time series analytics. We first introduce a taxonomy that classifies existing approaches into four groups based on the type of textual data employed for time series modeling. We then summarize key cross-modality strategies, e.g., alignment and fusion, and discuss their applications across a range of downstream tasks. Furthermore, we conduct experiments on multimodal datasets from different application domains to investigate effective combinations of textual data and cross-modality strategies for enhancing time series analytics. Finally, we suggest several promising directions for future research. This survey is designed for a range of professionals, researchers, and practitioners interested in LLM-based time series modeling.

AAMAS Conference 2024 Conference Paper

2D-Ptr: 2D Array Pointer Network for Solving the Heterogeneous Capacitated Vehicle Routing Problem

  • Qidong Liu
  • Chaoyue Liu
  • Shaoyao Niu
  • Cheng Long
  • Jie Zhang
  • Mingliang Xu

The heterogeneous capacitated vehicle routing problem (HCVRP) aims to optimize the routes of heterogeneous vehicles with capacity constraints to serve a set of customers with demands. Existing learning-based methods for solving HCVRP suffer from weak generalization ability, meaning that a well-trained model cannot adapt well to new scenarios with different vehicle or customer numbers. To address this issue, by modeling the simultaneous decision-making of multiple agents as a sequence of consecutive actions in real time, we propose a pointer network extension model, which includes a static encoder and a dynamic encoder to map the current situation to node embeddings and vehicle embeddings, respectively. For each element in the sequence of consecutive actions, the decoder of our model uses the probability distribution obtained from node embeddings and vehicle embeddings as a 2D array pointer to select a tuple from the combinations of vehicles and nodes (customers and depot). We call this architecture a 2D Array Pointer network (2D-Ptr). Instead of planning paths based on the priority order of vehicles, 2D-Ptr plans paths based on the priority order of actions. In addition, 2D-Ptr consists of a series of carefully designed attention modules, entitling the model to generalize to scenarios where additional vehicles (or customers) are introduced or existing vehicles (or customers) are removed. We empirically test 2D-Ptr and show its capability for producing near-optimal solutions through cooperative actions. 2D-Ptr delivers competitive performance against the state-of-the-art baselines and can solve arbitrary instances of the HCVRP without requiring re-training.
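The 2D array pointer described in this abstract can be sketched as a greedy decoding step: score every (vehicle, node) pair, softmax over the flattened grid, and select the most likely tuple. The plain dot-product scorer here is an illustrative assumption standing in for the model's attention modules.

```python
import numpy as np

def pick_action(vehicle_embs, node_embs):
    """Greedy-decoding sketch of a 2D array pointer: the joint distribution
    over all (vehicle, node) tuples is one softmax over a V x N score grid,
    so one decoding step selects both which vehicle moves and where it goes."""
    scores = vehicle_embs @ node_embs.T           # (V, N) compatibility grid
    flat = scores.ravel()
    probs = np.exp(flat - flat.max())
    probs /= probs.sum()
    v, n = divmod(int(probs.argmax()), node_embs.shape[0])
    return v, n, probs.reshape(scores.shape)
```

Selecting a full (vehicle, node) tuple per step is what lets the decoder order *actions* rather than vehicles, and since the grid is rebuilt from whatever embeddings are present, the same decoder handles instances with different vehicle or customer counts.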

NeurIPS Conference 2024 Conference Paper

Hybrid Mamba for Few-Shot Segmentation

  • Qianxiong Xu
  • Xuanyi Liu
  • Lanyun Zhu
  • Guosheng Lin
  • Cheng Long
  • Ziyue Li
  • Rui Zhao

Many few-shot segmentation (FSS) methods use cross attention to fuse support foreground (FG) into query features, despite its quadratic complexity. Mamba, a recent advance, can also capture intra-sequence dependencies well, yet with only linear complexity. Hence, we aim to devise a cross (attention-like) Mamba to capture inter-sequence dependencies for FSS. A simple idea is to scan support features to selectively compress them into the hidden state, which is then used as the initial hidden state for sequentially scanning query features. Nevertheless, this suffers from (1) a support forgetting issue: query features are also gradually compressed during the scan, so the support features in the hidden state keep diminishing, and many query pixels cannot fuse sufficient support features; and (2) an intra-class gap issue: query FG is essentially more similar to itself than to support FG, i.e., the query may prefer to fuse its own features from the hidden state rather than support features, yet the success of FSS relies on the effective use of support information. To tackle these, we design a hybrid Mamba network (HMNet), including (1) a support-recapped Mamba that periodically recaps the support features when scanning the query, so the hidden state always contains rich support information; and (2) a query-intercepted Mamba that forbids mutual interactions among query pixels and encourages them to fuse more support features from the hidden state. Consequently, the support information is better utilized, leading to better performance. Extensive experiments have been conducted on two public benchmarks, showing the superiority of HMNet. The code is available at https://github.com/Sam1224/HMNet.
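The support-forgetting problem and the recap fix can both be seen in a drastically simplified scalar scan. This is a toy analogy, not Mamba's selective state-space recurrence: the decay constant, the additive recap, and the function name are all illustrative assumptions.

```python
def hybrid_scan(support, query, recap_every=2, decay=0.5):
    """Toy scalar linear-state scan: compress support features into a hidden
    state, then scan query features, periodically re-injecting ('recapping')
    the saved support state so late query positions still see support
    information instead of letting it decay away."""
    h = 0.0
    for s in support:                 # compress support into the hidden state
        h = decay * h + s
    support_state = h                 # saved for periodic recaps
    out = []
    for i, q in enumerate(query):
        if i % recap_every == 0:
            h = h + support_state     # recap: restore support information
        h = decay * h + q
        out.append(h)
    return out
```

Without the recap, the support contribution shrinks geometrically with every query step, which is the scalar analogue of the support forgetting issue the abstract describes.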

AAAI Conference 2023 Conference Paper

Deep Graph Structural Infomax

  • Wenting Zhao
  • Gongping Xu
  • Zhen Cui
  • Siqiang Luo
  • Cheng Long
  • Tong Zhang

In the setting of self-supervised graph learning, Mutual Information (MI) was recently introduced into graph encoding to generate robust node embeddings. A successful representative is Deep Graph Infomax (DGI), which essentially operates on the space of node features but ignores topological structures and considers only a global graph summary. In this paper, we present an effective model called Deep Graph Structural Infomax (DGSI) to learn node representations. We derive the structural mutual information from the perspective of the Information Bottleneck (IB), which defines a trade-off between the sufficiency and minimality of the representation on the condition that the topological structure is preserved. Intuitively, the derived constraints formally maximize the structural mutual information both edge-wise and local-neighborhood-wise. In addition, we develop a general framework that incorporates global representational mutual information, local representational mutual information, and sufficient structural information into the node representation. Essentially, DGSI extends DGI and can capture more fine-grained semantic information as well as beneficial structural information in a self-supervised manner, thereby improving node representation and further boosting learning performance. Extensive experiments on different types of datasets demonstrate the effectiveness and superiority of the proposed method.

IJCAI Conference 2022 Conference Paper

A Survey on Neural Open Information Extraction: Current Status and Future Directions

  • Shaowen Zhou
  • Bowen Yu
  • Aixin Sun
  • Cheng Long
  • Jingyang Li
  • Jian Sun

Open Information Extraction (OpenIE) facilitates domain-independent discovery of relational facts from large corpora. The technique is well suited to many open-world natural language understanding scenarios, such as automatic knowledge base construction, open-domain question answering, and explicit reasoning. Thanks to rapid developments in deep learning, numerous neural OpenIE architectures have been proposed and have achieved considerable performance improvements. In this survey, we provide an extensive overview of state-of-the-art neural OpenIE models, their key design decisions, and their strengths and weaknesses. Then, we discuss the limitations of current solutions and the open issues in the OpenIE problem itself. Finally, we list recent trends that could help expand its scope and applicability, setting up promising directions for future research in OpenIE. To the best of our knowledge, this paper is the first review of neural OpenIE.

AAAI Conference 2020 Conference Paper

Learning to Generate Maps from Trajectories

  • Sijie Ruan
  • Cheng Long
  • Jie Bao
  • Chunyang Li
  • Zisheng Yu
  • Ruiyuan Li
  • Yuxuan Liang
  • Tianfu He

Accurate and updated road network data is vital in many urban applications, such as car-sharing and logistics. The traditional approach to identifying the road network, i.e., field survey, requires a significant amount of time and effort. With the wide usage of GPS-embedded devices, a huge amount of trajectory data has been generated by different types of mobile objects, providing a new opportunity to extract the underlying road network. However, existing trajectory-based map recovery approaches require many empirical parameters and do not utilize the prior knowledge in existing maps, which over-simplifies or over-complicates the reconstructed road network. To this end, we propose a deep learning-based map generation framework, i.e., DeepMG, which learns the structure of the existing road network to overcome noisy GPS positions. More specifically, DeepMG extracts features from trajectories in both a spatial view and a transition view and uses a convolutional deep neural network, T2RNet, to infer road centerlines. After that, a trajectory-based post-processing algorithm is proposed to refine the topological connectivity of the recovered map. Extensive experiments on two real-world trajectory datasets confirm that DeepMG significantly outperforms the state-of-the-art methods.