Arrow Research search

Author name cluster

Fan Zhou

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

86 papers
2 author rows

Possible papers

86

AAAI Conference 2026 Conference Paper

Beyond Graph Priors: A Co-Evolving Framework Under Uncertainty for Enterprise Resilience Assessment

  • Yanzhe Xie
  • Li Huang
  • Qiang Gao
  • Xueqin Chen
  • Fan Zhou
  • Kunpeng Zhang

Assessing enterprise resilience under uncertainty necessitates capturing both intrinsic attributes and evolving inter-enterprise dependencies. However, real-world enterprise systems pose substantial structural challenges: redundant or loosely correlated links can trigger spurious relational inferences, while missing or latent dependencies often hinder the propagation of informative signals. Moreover, most existing approaches adopt static graph priors or decouple structural refinement from semantic learning, lacking a co-evolutionary paradigm that allows structure and representation to inform one another. We propose CFU, a novel Co-evolving Framework under Uncertainty, which reconceptualizes graph structure as a dynamic and learnable component evolving alongside node semantics. Specifically, CFU begins with a structure-aware contrastive pretraining phase to distill latent relational semantics without supervision. It then performs bidirectional structural refinement, filtering structurally redundant edges through semantic agreement scoring, and uncovering temporally contingent, task-relevant dependencies via similarity-guided inference. These operations are integrated through a dynamic fusion procedure that continuously aligns the evolving topology with the resilience objective. By embedding structural adaptation within the learning loop, CFU enables context-aware resilience assessment across incomplete, ambiguous, and structurally volatile enterprise environments. Ultimately, extensive experiments conducted on real-world datasets demonstrate its superior performance across diverse evaluation scenarios.
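
A minimal sketch of the edge-filtering idea described above, assuming cosine similarity between node embeddings serves as the semantic agreement score (the function, threshold, and names below are hypothetical illustrations, not CFU's actual design):

```python
# Illustrative semantic-agreement edge filtering (not the paper's exact scoring).
import numpy as np

def filter_edges(emb: np.ndarray, edges: list[tuple[int, int]], tau: float = 0.3):
    """Keep an edge (u, v) only if the cosine similarity of the two node
    embeddings exceeds the threshold `tau` (hypothetical criterion)."""
    norms = np.linalg.norm(emb, axis=1, keepdims=True)
    unit = emb / np.clip(norms, 1e-12, None)
    return [(u, v) for u, v in edges if float(unit[u] @ unit[v]) > tau]

# Toy usage: 4 nodes with 2-d embeddings, 3 candidate edges.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
print(filter_edges(emb, [(0, 1), (0, 2), (0, 3)]))  # -> [(0, 1)]
```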

AAAI Conference 2026 Conference Paper

Causally-Grounded Dual-Path Attention Intervention for Object Hallucination Mitigation in LVLMs

  • Liu Yu
  • Zhonghao Chen
  • Ping Kuang
  • Zhikun Feng
  • Fan Zhou
  • Lan Wang
  • Gillian Dobbie

Object hallucination remains a critical challenge in Large Vision-Language Models (LVLMs), where models generate content inconsistent with visual inputs. Existing language-decoder based mitigation approaches often regulate visual or textual attention independently, overlooking their interaction as two key causal factors. To address this, we propose Owl (Bi-mOdal attention reWeighting for Layer-wise hallucination mitigation), a causally-grounded framework that models the hallucination process via a structural causal graph, treating decomposed visual and textual attentions as mediators. We introduce VTACR (Visual-to-Textual Attention Contribution Ratio), a novel metric that quantifies the modality contribution imbalance during decoding. Our analysis reveals that hallucinations frequently occur in low-VTACR scenarios, where textual priors dominate and visual grounding is weakened. To mitigate this, we design a fine-grained attention intervention mechanism that dynamically adjusts token- and layer-wise attention guided by VTACR signals. Finally, we propose a dual-path contrastive decoding strategy: one path emphasizes visually grounded predictions, while the other amplifies hallucinated ones -- letting visual truth shine and hallucination collapse. Experimental results on the POPE and CHAIR benchmarks show that Owl achieves significant hallucination reduction, setting a new SOTA in faithfulness while preserving vision-language understanding capability. Our code is available at https://github.com/CikZ2023/OWL
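
A toy reading of the VTACR metric, assuming it is the ratio of attention mass placed on visual tokens to that placed on textual tokens for the token being decoded (the paper's exact normalization may differ):

```python
import numpy as np

def vtacr(attn: np.ndarray, visual_idx: np.ndarray, textual_idx: np.ndarray) -> float:
    """Toy Visual-to-Textual Attention Contribution Ratio: attention mass on
    visual tokens divided by mass on textual tokens. `attn` is a
    (heads, seq_len) attention row for the token being decoded."""
    v = attn[:, visual_idx].sum()
    t = attn[:, textual_idx].sum()
    return float(v / (t + 1e-12))

# Toy usage: 2 heads over 5 tokens; tokens 0-2 visual, 3-4 textual.
attn = np.array([[0.10, 0.10, 0.10, 0.40, 0.30],
                 [0.05, 0.05, 0.10, 0.50, 0.30]])
print(vtacr(attn, np.arange(3), np.arange(3, 5)))  # low ratio -> text-dominated
```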

AAAI Conference 2026 Conference Paper

Learning to Curate Context: Jointly Optimizing Retrieval and Prediction for Multimodal Social Media Popularity

  • Xovee Xu
  • Shuojun Lin
  • Fan Zhou
  • Jingkuan Song

Predicting the popularity of user-generated content (UGC) is a crucial but challenging task in social media analysis. While existing retrieval-augmented models enhance predictions by supplying rich contextual information, they remain limited by a fundamental precision-recall dilemma: enlarging the retrieval set increases coverage but introduces noisy, irrelevant context that harms prediction. In this work, we propose a unified framework that learns to retrieve, filter, and predict. Central to our approach is a Mixture-of-Logits-based retrieval module that replaces static similarity metrics with a dynamic, multi-faceted scoring function, enabling the retriever to be directly optimized by the prediction objective. Then an uncertainty-aware filter is designed to perform differentiable subset selection and refine the selected representations using the information bottleneck principle. Finally, to enhance predictive robustness, we introduce a confidence-weighted test-time perturbation strategy. By learning to retrieve UGCs that are beneficial for prediction and filtering out uncertainty, our framework provides more relevant and reliable context. Extensive experiments demonstrate that the proposed framework achieves state-of-the-art performance, consistently outperforming strong baselines.
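
A hedged sketch of what a Mixture-of-Logits score can look like: several similarity "experts" combined by softmax gates which, in the framework above, would be learned end-to-end from the prediction objective (the experts and gating below are illustrative assumptions, not the paper's exact design):

```python
import numpy as np

def mol_score(q: np.ndarray, d: np.ndarray, gates: np.ndarray) -> float:
    """Mixture-of-Logits sketch: three similarity experts (dot product,
    cosine, negative L2) mixed by softmax-normalized gates. In practice the
    gates would come from a learned network conditioned on the query/item."""
    cos = q @ d / (np.linalg.norm(q) * np.linalg.norm(d) + 1e-12)
    experts = np.array([q @ d, cos, -np.linalg.norm(q - d)])
    w = np.exp(gates - gates.max())
    w /= w.sum()
    return float(w @ experts)

q, d = np.array([1.0, 0.5]), np.array([0.8, 0.4])
print(mol_score(q, d, gates=np.array([0.2, 1.0, -0.5])))
```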

AAAI Conference 2026 Conference Paper

Modality-Balanced Collaborative Distillation for Multi-Modal Domain Generalization

  • Xiaohan Wang
  • Zhangtao Cheng
  • Ting Zhong
  • Leiting Chen
  • Fan Zhou

Weight Averaging (WA) has emerged as a powerful technique for enhancing generalization by promoting convergence to a flat loss landscape, which correlates with stronger out-of-distribution performance. However, applying WA directly to multi-modal domain generalization (MMDG) is challenging: differences in optimization speed across modalities lead WA to overfit to faster-converging ones in early stages, suppressing the contribution of slower yet complementary modalities, thereby hindering effective modality fusion and skewing the loss surface toward sharper, less generalizable minima. To address this issue, we propose MBCD, a unified collaborative distillation framework that retains WA's flatness-inducing advantages while overcoming its shortcomings in multi-modal contexts. MBCD begins with adaptive modality dropout in the student model to curb early-stage bias toward dominant modalities. A gradient consistency constraint then aligns learning signals between uni-modal branches and the fused representation, encouraging coordinated and smoother optimization. Finally, a WA-based teacher conducts cross-modal distillation by transferring fused knowledge to each uni-modal branch, which strengthens cross-modal interactions and steers convergence toward flatter solutions. Extensive experiments on MMDG benchmarks show that MBCD consistently outperforms existing methods, achieving superior accuracy and robustness across diverse unseen domains.
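
For reference, plain Weight Averaging is just a parameter-wise mean over checkpoints; in MBCD such an average would play the teacher role (a minimal sketch, not the paper's training code):

```python
import numpy as np

def average_weights(checkpoints: list[dict]) -> dict:
    """Plain Weight Averaging: parameter-wise mean over checkpoint dicts."""
    return {k: np.mean([c[k] for c in checkpoints], axis=0)
            for k in checkpoints[0].keys()}

# Toy usage: two checkpoints of a one-layer model.
c1 = {"w": np.array([1.0, 2.0]), "b": np.array([0.0])}
c2 = {"w": np.array([3.0, 4.0]), "b": np.array([1.0])}
print(average_weights([c1, c2]))  # {'w': array([2., 3.]), 'b': array([0.5])}
```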

AAAI Conference 2026 Conference Paper

Shedding the Facades, Connecting the Domains: Detecting Shifting Multimodal Hate Video with Test-Time Adaptation

  • Jiao Li
  • Jian Lang
  • Xikai Tang
  • Wenzheng Shu
  • Ting Zhong
  • Qiang Gao
  • Yong Wang
  • Leiting Chen

Hate Video Detection (HVD) is crucial for online ecosystems. Existing methods assume identical distributions between training (source) and inference (target) data. However, hateful content often evolves into irregular and ambiguous forms to evade censorship, resulting in substantial semantic drift and rendering previously trained models ineffective. Test-Time Adaptation (TTA) offers a solution by adapting models during inference to narrow the cross-domain gap, while conventional TTA methods target mild distribution shifts and struggle with the severe semantic drift in HVD. To tackle these challenges, we propose SCANNER, the first TTA framework tailored for HVD. Motivated by the insight that, despite the evolving nature of hateful manifestations, their underlying cores remain largely invariant (i.e., targeting is still based on characteristics like gender, race, etc), we leverage these stable cores as a bridge to connect the source and target domains. Specifically, SCANNER initially reveals the stable cores from the ambiguous layout in evolving hateful content via a principled centroid-guided alignment mechanism. To alleviate the impact of outlier-like samples that are weakly correlated with centroids during the alignment process, SCANNER enhances the prior by incorporating a sample-level adaptive centroid alignment strategy, promoting more stable adaptation. Furthermore, to mitigate semantic collapse from overly uniform outputs within clusters, SCANNER introduces an intra-cluster diversity regularization that encourages the cluster-wise semantic richness. Experiments show that SCANNER outperforms all baselines, with an average gain of 4.69% in Macro-F1 over the best.

NeurIPS Conference 2025 Conference Paper

Breaking the Order Barrier: Off-Policy Evaluation for Confounded POMDPs

  • Qi Kuang
  • Jiayi Wang
  • Fan Zhou
  • Zhengling Qi

We consider off-policy evaluation (OPE) in Partially Observable Markov Decision Processes (POMDPs) with unobserved confounding. Recent advances have introduced bridge functions to circumvent unmeasured confounding and develop estimators for the policy value, yet their statistical error bounds with respect to the horizon length $T$ and the size of the observation-action space $|\mathcal{O}||\mathcal{A}|$ remain largely unexplored. In this paper, we systematically investigate the finite-sample error bounds of OPE estimators in finite-horizon tabular confounded POMDPs. Specifically, we show that under certain rank conditions, the estimation error for the policy value can achieve a rate of $\mathcal{O}(T^{1.5}/\sqrt{n})$, excluding the cardinality of the observation space $|\mathcal{O}|$ and the action space $|\mathcal{A}|$. With an additional mild condition on the concentrability coefficients in confounded POMDPs, the rate of estimation error can be improved to $\mathcal{O}(T/\sqrt{n})$. We also show that for a fully history-dependent policy, the estimation error scales as $\mathcal{O}\big(T/\sqrt{n}(|\mathcal{O}| |\mathcal{A}|)^{\frac{T}{2}}\big)$, highlighting the exponential error dependence introduced by history-based proxies to infer hidden states. Furthermore, when the target policy is a memoryless policy, the error bound improves to $\mathcal{O}\big(T/\sqrt{n}\sqrt{|\mathcal{O}| |\mathcal{A}|}\big)$, which matches the optimal rate known for tabular MDPs. To the best of our knowledge, this is the first work to provide a comprehensive finite-sample analysis of OPE in confounded POMDPs.
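
Restating the abstract's four regimes side by side for easier comparison ($n$ = sample size, $T$ = horizon):

```latex
\begin{align*}
\text{rank conditions:} &\quad \mathcal{O}\!\big(T^{1.5}/\sqrt{n}\big) \\
\text{+ concentrability condition:} &\quad \mathcal{O}\!\big(T/\sqrt{n}\big) \\
\text{fully history-dependent policy:} &\quad \mathcal{O}\!\big(T/\sqrt{n}\,(|\mathcal{O}||\mathcal{A}|)^{T/2}\big) \\
\text{memoryless policy:} &\quad \mathcal{O}\!\big(T/\sqrt{n}\,\sqrt{|\mathcal{O}||\mathcal{A}|}\big)
\end{align*}
```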

ICML Conference 2025 Conference Paper

Diving into Self-Evolving Training for Multimodal Reasoning

  • Wei Liu 0131
  • Junlong Li
  • Xiwen Zhang
  • Fan Zhou
  • Yu Cheng 0001
  • Junxian He

Self-evolving training—where models iteratively learn from their own outputs—has emerged as a key approach for complex reasoning tasks, addressing the scarcity of high-quality chain-of-thought data. However, its effectiveness in multimodal reasoning, a domain more intricate than text-only reasoning, remains underexplored, and the understanding of critical factors in this training paradigm remains limited. Furthermore, a central challenge for this training method is performance saturation, which impedes further improvements and scalability. Inspired by reinforcement learning (RL), in this paper, we reframe self-evolving training for multimodal reasoning through the lens of RL, identifying three pivotal factors: $\textit{Training Method}$, $\textit{Reward Model}$, and $\textit{Prompt Variation}$. Through systematic analysis, we establish relatively optimal design principles that significantly enhance multimodal reasoning capabilities. Moreover, delving deeper into training dynamics, we uncover the roots of saturation and propose a new automatic balancing mechanism to mitigate this limitation. Building on these insights, we propose M-STaR (Multimodal Self-evolving Training for Reasoning), a framework that achieves consistent performance gains across models of varying sizes and diverse benchmarks. All resources will be made publicly available.

ICLR Conference 2025 Conference Paper

Enhancing Prediction Performance through Influence Measure

  • Shuguang Yu
  • Wenqian Xu
  • Xinyi Zhou
  • Xuechun Wang
  • Hongtu Zhu
  • Fan Zhou

In the field of machine learning, the pursuit of accurate models is ongoing. A key aspect of improving prediction performance lies in identifying which data points in the training set should be excluded and which high-quality, potentially unlabeled data points outside the training set should be incorporated to improve the model's performance on unseen data. To accomplish this, an effective metric is needed to evaluate the contribution of each data point toward enhancing overall model performance. This paper proposes the use of an influence measure as a metric to assess the impact of training data on test set performance. Additionally, we introduce a data selection method to optimize the training set as well as a dynamic active learning algorithm driven by the influence measure. The effectiveness of these methods is demonstrated through extensive simulations and real-world datasets.
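
The influence-measure idea can be made concrete with the classic influence-function approximation, sketched here for ridge regression (an illustration of the general recipe, not necessarily the paper's exact estimator):

```python
import numpy as np

def influence_on_test(X, y, x_t, y_t, lam=1e-2):
    """Influence-function sketch for ridge regression: I_i = -g_test^T H^{-1} g_i,
    so up-weighting training point i by eps changes the test loss by ~eps * I_i
    (points with positive I_i hurt this test point; removing them should help)."""
    n, d = X.shape
    H = X.T @ X / n + lam * np.eye(d)        # Hessian of the ridge objective
    theta = np.linalg.solve(H, X.T @ y / n)  # fitted parameters
    g_test = (x_t @ theta - y_t) * x_t       # gradient of the test squared loss
    g_train = (X @ theta - y)[:, None] * X   # per-point training gradients
    return -(g_train @ np.linalg.solve(H, g_test))

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
scores = influence_on_test(X, y, x_t=X[0], y_t=y[0] + 5.0)  # shifted test label
print(scores[:5])
```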

AAAI Conference 2025 Conference Paper

Improving Multimodal Social Media Popularity Prediction via Selective Retrieval Knowledge Augmentation

  • Xovee Xu
  • Yifan Zhang
  • Fan Zhou
  • Jingkuan Song

Understanding and predicting the popularity of online User-Generated Content (UGC) is critical for various social and recommendation systems. Existing efforts have focused on extracting predictive features and using pre-trained deep models to learn and fuse multimodal UGC representations. However, the dissemination of social UGCs is not an isolated process in social networks; rather, it is influenced by contextually relevant UGCs and various exogenous factors, including social ties, trends, user interests, and platform algorithms. In this work, we propose a retrieval-based framework to enhance the popularity prediction of multimodal UGCs. Our framework extends beyond simple semantic retrieval, incorporating a meta retrieval strategy that queries a diverse set of relevant UGCs by considering multimodal content semantics and metadata from users and posts. Moreover, to eliminate irrelevant and noisy UGCs in retrieval, we introduce a new measure called Relative Retrieval Contribution to Prediction (RRCP), which selectively refines the retrieved UGCs. We then aggregate the contextual UGC knowledge using vision-language graph neural networks, and fuse them with an RRCP-Attention-based prediction network. Extensive experiments on three large-scale social media datasets demonstrate significant improvements ranging from 26.68% to 48.19% across all metrics compared to strong baselines.

AAAI Conference 2025 Conference Paper

In-context Prompt-augmented Micro-video Popularity Prediction

  • Zhangtao Cheng
  • Jiao Li
  • Jian Lang
  • Ting Zhong
  • Fan Zhou

Micro-video popularity prediction (MVPP) plays a crucial role in various downstream applications. Recently, multimodal methods that integrate multiple modalities to predict the popularity have exhibited impressive performance. However, these methods face several unresolved issues: (1) limited contextual information and (2) incomplete modal semantics. Incorporating relevant videos and performing full fine-tuning on pre-trained models typically achieves powerful capabilities in addressing these issues. However, this paradigm is not optimal due to its weak transferability and scarce downstream data. Inspired by prompt learning, we propose ICPF, a novel In-Context Prompt-augmented Framework to enhance popularity prediction. ICPF maintains a model-agnostic design, facilitating seamless integration with various multimodal fusion models. Specifically, the multi-branch retriever first retrieves similar modal content through within-modality similarities. Next, the in-context prompt generator extracts semantic prior features from retrieved videos and generates in-context prompts, enriching pre-trained models with valuable contextual knowledge. Finally, the knowledge-augmented predictor captures complementary features including modal semantics and popularity information. Extensive experiments conducted on three real-world datasets demonstrate the superiority of ICPF compared to 14 competitive baselines.

NeurIPS Conference 2025 Conference Paper

JavisGPT: A Unified Multi-modal LLM for Sounding-Video Comprehension and Generation

  • Kai Liu
  • Jungang Li
  • Yuchong Sun
  • Shengqiong Wu
  • Jianzhang Gao
  • Daoan Zhang
  • Wei Zhang
  • Sheng Jin

This paper presents JavisGPT, the first unified multimodal large language model (MLLM) for joint audio-video (JAV) comprehension and generation. JavisGPT has a concise encoder-LLM-decoder architecture, which has a SyncFusion module for spatio-temporal audio-video fusion and synchrony-aware learnable queries to bridge a pretrained JAV-DiT generator. This design enables temporally coherent video-audio understanding and generation from multimodal instructions. We design an effective three-stage training pipeline consisting of multimodal pretraining, audio-video fine-tuning, and large-scale instruction-tuning, to progressively build multimodal comprehension and generation from existing vision-language models. For instruction tuning, we construct JavisInst-Omni, a high-quality instruction dataset with over 200K GPT-4o-curated audio-video-text dialogues that cover diverse and multi-level comprehension and generation scenarios. On JAV comprehension and generation benchmarks, our experiments show that JavisGPT outperforms existing MLLMs, particularly in complex and temporally synchronized settings.

ICML Conference 2025 Conference Paper

Programming Every Example: Lifting Pre-training Data Quality Like Experts at Scale

  • Fan Zhou
  • Zengzhi Wang
  • Qian Liu 0033
  • Junlong Li
  • Pengfei Liu 0003

Large language model pre-training has traditionally relied on human experts to craft heuristics for improving corpus quality, resulting in numerous rules developed to date. However, these fixed rules lack the flexibility to address the unique characteristics of individual examples, yet crafting sample-wise rules is impractical for human experts. In this paper, we show that even small language models, with only 0.3B parameters, can exhibit substantial data refining capabilities. We propose Programming Every Example (ProX), a novel framework that treats data refinement as a programming task, and enables the model to refine corpora by generating and executing fine-grained operations, such as string normalization, for each individual example at scale. Experiments show that models trained on ProX-refined data consistently outperform other baselines across 10 benchmarks, demonstrating effectiveness across model sizes (up to 1.7B) and pre-training corpora (C4, RedPajama-V2, FineWeb, FineWeb-Edu, and DCLM). ProX also shows great potential in continual pre-training: in the math domain, ProX boosts 7B models by up to 20% within 10B tokens—results typically achieved with much larger scale training (e.g., 200B tokens). We believe ProX offers a way to curate high-quality pre-training data, and ultimately contributes to efficient LLM development.
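
A minimal sketch of the "refinement as programming" idea: the language model emits a small program of named operations that is then executed over each example (the operation names below are hypothetical illustrations, not ProX's actual API):

```python
# Sketch of data refinement as program execution, in the spirit of ProX.
import unicodedata

OPS = {
    "normalize_unicode": lambda text: unicodedata.normalize("NFKC", text),
    "strip_lines": lambda text: "\n".join(l.strip() for l in text.splitlines()),
    "drop_empty_lines": lambda text: "\n".join(l for l in text.splitlines() if l),
}

def refine(text: str, program: list[str]) -> str:
    """Apply a model-generated sequence of named refining operations."""
    for op in program:
        text = OPS[op](text)
    return text

doc = "  Hello\u00a0World  \n\n   trailing spaces   \n"
print(repr(refine(doc, ["normalize_unicode", "strip_lines", "drop_empty_lines"])))
```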

AAAI Conference 2025 Conference Paper

Retrieval-Augmented Dynamic Prompt Tuning for Incomplete Multimodal Learning

  • Jian Lang
  • Zhangtao Cheng
  • Ting Zhong
  • Fan Zhou

Multimodal learning with incomplete modality is practical and challenging. Recently, researchers have focused on enhancing the robustness of pre-trained MultiModal Transformers (MMTs) under missing modality conditions by applying learnable prompts. However, these prompt-based methods face several limitations: (1) incomplete modalities provide restricted modal cues for task-specific inference, (2) dummy imputation for missing content causes information loss and introduces noise, and (3) static prompts are instance-agnostic, offering limited knowledge for instances with various missing conditions. To address these issues, we propose RAGPT, a novel Retrieval-AuGmented dynamic Prompt Tuning framework. RAGPT comprises three modules: (I) the multi-channel retriever, which identifies similar instances through a within-modality retrieval strategy, (II) the missing modality generator, which recovers missing information using retrieved contexts, and (III) the context-aware prompter, which captures contextual knowledge from relevant instances and generates dynamic prompts to largely enhance the MMT’s robustness. Extensive experiments conducted on three real-world datasets show that RAGPT consistently outperforms all competitive baselines in handling incomplete modality problems.
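
A hedged sketch of module (II)'s retrieval-based recovery, assuming the missing image embedding is approximated by a similarity-weighted average of retrieved neighbors' image embeddings (a simplification of RAGPT's generator, not its exact design):

```python
import numpy as np

def impute_missing(text_emb, neighbor_text, neighbor_image, k=3):
    """Retrieval-based recovery of a missing image modality: find the k most
    similar neighbors by the observed text modality, then return a
    similarity-weighted average of their image embeddings."""
    sims = neighbor_text @ text_emb / (
        np.linalg.norm(neighbor_text, axis=1) * np.linalg.norm(text_emb) + 1e-12)
    top = np.argsort(sims)[-k:]
    w = np.exp(sims[top])
    w /= w.sum()
    return w @ neighbor_image[top]

rng = np.random.default_rng(1)
t = rng.normal(size=8)                                 # observed text embedding
nt, ni = rng.normal(size=(20, 8)), rng.normal(size=(20, 16))
print(impute_missing(t, nt, ni).shape)                 # (16,) recovered embedding
```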

NeurIPS Conference 2025 Conference Paper

Revisiting Reinforcement Learning for LLM Reasoning from A Cross-Domain Perspective

  • Jorge (Zhoujun) Cheng
  • Shibo Hao
  • Tianyang Liu
  • Fan Zhou
  • Yutao Xie
  • Feng Yao
  • Yuexin Bian
  • Nilabjo Dey

Reinforcement learning (RL) has shown promise in enhancing large language model (LLM) reasoning, yet progress towards broader capabilities is limited by the availability of high-quality, multi-domain datasets. This work introduces \ours, a 92K RL-for-reasoning dataset designed to address this gap, covering six reasoning domains: Math, Code, Science, Logic, Simulation, and Tabular, each with corresponding verifiers. We build \ours via a careful data-curation pipeline, including sourcing, deduplication, reward design, and domain-specific and difficulty-based filtering, to facilitate the systematic investigation of cross-domain RL generalization. Our study using \ours suggests the efficacy of a simple mixed-domain RL training approach and reveals several key aspects affecting cross-domain transferability. We further train two models {\ours}-7B and {\ours}-32B purely with RL on our curated data and observe largely improved performance over leading open RL reasoning model baselines, with gains of 7.3\% and 7.8\% respectively on an extensive 17-task, six-domain evaluation suite. We are releasing our dataset, code, and evaluation suite to the community, aiming to support further research and development of more general RL-enhanced reasoning models.

NeurIPS Conference 2025 Conference Paper

Versatile Transferable Unlearnable Example Generator

  • Zhihao Li
  • Jiale Cai
  • Gezheng Xu
  • Hao Zheng
  • Qiuyue Li
  • Fan Zhou
  • Shichun Yang
  • Charles Ling

The rapid growth of publicly available data has fueled deep learning advancements but also raises concerns about unauthorized data usage. Unlearnable Examples (UEs) have emerged as a data protection strategy that introduces imperceptible perturbations to prevent unauthorized learning. However, most existing UE methods produce perturbations strongly tied to specific training sets, leading to a significant drop in unlearnability when applied to unseen data or tasks. In this paper, we argue that for broad applicability, UEs should maintain their effectiveness across diverse application scenarios. To this end, we conduct the first comprehensive study on the transferability of UEs across diverse and practical yet demanding settings. Specifically, we identify key scenarios that pose significant challenges for existing UE methods, including varying styles, out-of-distribution classes, resolutions, and architectures. Moreover, we propose $\textbf{Versatile Transferable Generator}$ (VTG), a transferable generator designed to safeguard data across various conditions. Specifically, VTG integrates Adversarial Domain Augmentation (ADA) into the generator’s training process to synthesize out-of-distribution samples, thereby improving its generalizability to unseen scenarios. Furthermore, we propose a Perturbation-Label Coupling (PLC) mechanism that leverages contrastive learning to directly align perturbations with class labels. This approach reduces the generator’s reliance on data semantics, allowing VTG to produce unlearnable perturbations in a distribution-agnostic manner. Extensive experiments demonstrate the effectiveness and broad applicability of our approach. Code is available at https://github.com/zhli-cs/VTG.

AAAI Conference 2024 Short Paper

Amplifying Diversity and Quality in Commonsense Knowledge Graph Completion (Student Abstract)

  • Liu Yu
  • Fenghui Tian
  • Ping Kuang
  • Fan Zhou

Conventional commonsense knowledge graph completion (CKGC) methods provide inadequate sequences at the fine-tuning and generation stages and rely on full fine-tuning, which fails to align with the autoregressive model's pre-training patterns and offers insufficient parameter efficiency. Moreover, decoding through beam or greedy search produces low diversity and high similarity in generated tail entities. Hence, we resort to prefix-tuning and propose a lightweight, effective pipeline to enhance the quality and diversity of extracted commonsense knowledge. Precisely, we measure head-entity similarity to select top-k tuples and concatenate them before each target tuple for prefix-tuning the source LM, thereby improving the efficiency and speed of pretrained models; then, we design a penalty-tailored diverse beam search (p-DBS) for decoding tail entities, producing a greater quantity and diversity of generated commonsense tuples; besides, a filter strategy is utilized to filter out invalid commonsense knowledge. Through extensive automatic evaluations, including ChatGPT scoring, our method can extract diverse, novel, and accurate commonsense knowledge (CK).

AAAI Conference 2024 Short Paper

Biases Mitigation and Expressiveness Preservation in Language Models: A Comprehensive Pipeline (Student Abstract)

  • Liu Yu
  • Ludie Guo
  • Ping Kuang
  • Fan Zhou

Pre-trained language models (PLMs) have greatly transformed various downstream tasks, yet frequently display social biases from training data, raising fairness concerns. Recent efforts to debias PLMs come with limitations: they either fine-tune the entire parameters in PLMs, which is time-consuming and disregards the expressiveness of PLMs, or ignore the reintroduction of biases from downstream tasks when applying debiased models to them. Hence, we propose a two-stage pipeline to mitigate biases from both internal and downstream contexts while preserving expressiveness in language models. Specifically, for the debiasing procedure, we resort to continuous prefix-tuning, not fully fine-tuning the PLM, in which we design a debiasing term for optimization and an alignment term to keep words’ relative distances and ensure the model's expressiveness. For downstream tasks, we perform causal intervention across different demographic groups for invariant predictions. Results on three GLUE tasks show our method alleviates biases from internal and downstream contexts, while keeping PLM expressiveness intact.

AAAI Conference 2024 Short Paper

Decoupling User Relationships Guides Information Diffusion Prediction (Student Abstract)

  • Wenxue Ye
  • Shichong Li
  • Zhangtao Cheng
  • Xovee Xu
  • Ting Zhong
  • Bei Hui
  • Fan Zhou

Information diffusion prediction is a critical task for many social network applications. However, current methods are mainly limited in one key aspect: the user relationships behind resharing behaviors are complex and entangled. To address this issue, we propose MHGFormer, a novel multi-channel hypergraph transformer framework, to better decouple complex user relations and obtain fine-grained user representations. First, we employ designed triangular motifs to decouple user relations into three hypergraphs at different levels. Second, a position-aware hypergraph transformer is used to refine user relations and obtain high-quality user representations. Extensive experiments conducted on two social datasets demonstrate that MHGFormer outperforms state-of-the-art diffusion models across several settings.

AAAI Conference 2024 Short Paper

Disentanglement-Guided Spatial-Temporal Graph Neural Network for Metro Flow Forecasting (Student Abstract)

  • Jinyu Hong
  • Ping Kuang
  • Qiang Gao
  • Fan Zhou

In recent intelligent transportation applications, metro flow forecasting has received much attention from researchers. Most prior arts endeavor to explore spatial or temporal dependencies while ignoring the key characteristic patterns underlying historical flows, e.g., trend and periodicity. Although multi-granularity distillation and spatial dependency correlation can promote flow estimation, the potential noise and spatial dynamics remain under-explored. To this end, we propose a novel Disentanglement-Guided Spatial-Temporal Graph Neural Network, or DGST, to address the above concerns. It contains a Disentanglement Pre-training procedure for characteristic pattern disentanglement learning, a Characteristic Pattern Prediction module for exploring different future characteristics, and a Spatial-Temporal Correlation module for spatial-temporal dynamic learning. Experiments on a real-world dataset demonstrate the superiority of our DGST.

IJCAI Conference 2024 Conference Paper

Enhancing Fine-Grained Urban Flow Inference via Incremental Neural Operator

  • Qiang Gao
  • Xiaolong Song
  • Li Huang
  • Goce Trajcevski
  • Fan Zhou
  • Xueqin Chen

Fine-grained urban flow inference (FUFI), which involves inferring fine-grained flow maps from their coarse-grained counterparts, is of tremendous interest in the realm of sustainable urban traffic services. To address FUFI, existing solutions mainly concentrate on investigating spatial dependencies, introducing external factors, reducing excessive memory costs, etc. -- while rarely considering the catastrophic forgetting (CF) problem. Motivated by recent operator learning, we present an Urban Neural Operator solution with Incremental learning (UNOI), primarily seeking to learn grained-invariant solutions for FUFI in addition to addressing CF. Specifically, we devise an urban neural operator (UNO) in UNOI that learns mappings between approximation spaces by treating the different-grained flows as continuous functions, allowing a more flexible capture of spatial correlations. Furthermore, the phenomenon of CF behind time-related flows could hinder the capture of flow dynamics. Thus, UNOI mitigates CF concerns as well as privacy issues by placing UNO blocks in two incremental settings, i.e., flow-related and task-related. Experimental results on large-scale real-world datasets demonstrate the superiority of our proposed solution against the baselines.

NeurIPS Conference 2024 Conference Paper

Enhancing LLM’s Cognition via Structurization

  • Kai Liu
  • Zhihang Fu
  • Chao Chen
  • Wei Zhang
  • Rongxin Jiang
  • Fan Zhou
  • Yaowu Chen
  • Yue Wu

When reading long-form text, human cognition is complex and structurized. While large language models (LLMs) process input contexts through a causal and sequential perspective, this approach can potentially limit their ability to handle intricate and complex inputs effectively. To enhance LLM’s cognition capability, this paper presents a novel concept of context structurization. Specifically, we transform the plain, unordered contextual sentences into well-ordered and hierarchically structurized elements. By doing so, LLMs can better grasp intricate and extended contexts through precise attention and information-seeking along the organized structures. Extensive evaluations are conducted across various model architectures and sizes (including a series of auto-regressive LLMs as well as BERT-like masking models) on a diverse set of NLP tasks (e.g., context-based question-answering, exhaustive hallucination evaluation, and passage-level dense retrieval). Empirical results show consistent and significant performance gains afforded by a single-round structurization. In particular, we boost the open-sourced LLaMA2-70B model to achieve comparable performance against GPT-3.5-Turbo as the hallucination evaluator. Besides, we show the feasibility of distilling advanced LLMs’ language processing abilities to a smaller yet effective StruXGPT-7B to execute structurization, addressing the practicality of our approach. Code is available at https://github.com/alibaba/struxgpt.

AAAI Conference 2024 Conference Paper

Generalizing across Temporal Domains with Koopman Operators

  • Qiuhao Zeng
  • Wei Wang
  • Fan Zhou
  • Gezheng Xu
  • Ruizhi Pu
  • Changjian Shui
  • Christian Gagné
  • Shichun Yang

In the field of domain generalization, the task of constructing a predictive model capable of generalizing to a target domain without access to target data remains challenging. This problem becomes further complicated when considering evolving dynamics between domains. While various approaches have been proposed to address this issue, a comprehensive understanding of the underlying generalization theory is still lacking. In this study, we contribute novel theoretical results showing that aligning conditional distributions leads to a reduction of the generalization bound. Our analysis serves as a key motivation for solving the Temporal Domain Generalization (TDG) problem through the application of Koopman Neural Operators, resulting in Temporal Koopman Networks (TKNets). By employing Koopman Neural Operators, we effectively address the time-evolving distributions encountered in TDG using the principles of Koopman theory, where measurement functions are sought to establish linear transition relations between evolving domains. Through empirical evaluations conducted on synthetic and real-world datasets, we validate the effectiveness of our proposed approach.
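
The Koopman idea behind TKNets in one line: seek measurement functions $g$ under which the transition between evolving domains becomes (approximately) linear via an operator $\mathcal{K}$:

```latex
g(x_{t+1}) \;=\; \mathcal{K}\, g(x_t),
\qquad
\mathcal{K} \;\approx\; \arg\min_{K} \sum_t \big\| g(x_{t+1}) - K\, g(x_t) \big\|^2 .
```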

AAAI Conference 2024 Conference Paper

GMP-AR: Granularity Message Passing and Adaptive Reconciliation for Temporal Hierarchy Forecasting

  • Fan Zhou
  • Chen Pan
  • Lintao Ma
  • Yu Liu
  • Siqiao Xue
  • James Zhang
  • Jun Zhou
  • Hongyuan Mei

Time series forecasts of different temporal granularity are widely used in real-world applications, e.g., sales prediction in days and weeks for making different inventory plans. However, these tasks are usually solved separately without ensuring coherence, which is crucial for aligning downstream decisions. Previous works mainly focus on ensuring coherence with some straightforward methods, e.g., aggregation from the forecasts of fine granularity to the coarse ones, and allocation from the coarse granularity to the fine ones. These methods merely take the temporal hierarchical structure to maintain coherence without improving the forecasting accuracy. In this paper, we propose a novel granularity message-passing mechanism (GMP) that leverages temporal hierarchy information to improve forecasting performance and also utilizes an adaptive reconciliation (AR) strategy to maintain coherence without performance loss. Furthermore, we introduce an optimization module to achieve task-based targets while adhering to more real-world constraints. Experiments on real-world datasets demonstrate that our framework (GMP-AR) achieves superior performance on temporal hierarchical forecasting tasks compared to state-of-the-art methods. In addition, our framework has been successfully applied to a real-world task of payment traffic management in Alipay by integrating with the task-based optimization module.
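
For intuition, the simplest non-adaptive way to enforce coherence is proportional allocation from coarse to fine granularity; GMP-AR's adaptive reconciliation replaces this fixed rule with a learned one (a toy sketch with hypothetical numbers):

```python
import numpy as np

def reconcile(daily: np.ndarray, weekly: np.ndarray) -> np.ndarray:
    """Toy coherence via proportional allocation: rescale each week's 7 daily
    forecasts so they sum to that week's forecast."""
    days = daily.reshape(-1, 7)
    scale = weekly / days.sum(axis=1)
    return (days * scale[:, None]).ravel()

daily = np.array([10.0, 12, 9, 11, 13, 8, 7, 9, 9, 10, 12, 11, 8, 6])
weekly = np.array([77.0, 60.0])
coherent = reconcile(daily, weekly)
print(coherent.reshape(2, 7).sum(axis=1))  # [77. 60.] -> coherent with weekly
```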

AAAI Conference 2024 Short Paper

Graph Anomaly Detection with Diffusion Model-Based Graph Enhancement (Student Abstract)

  • Shikang Pang
  • Chunjing Xiao
  • Wenxin Tai
  • Zhangtao Cheng
  • Fan Zhou

Graph anomaly detection has gained significant research interest across various domains. Due to the lack of labeled data, contrastive learning has been applied to anomaly detection, and contrastive strategies at various scales have been proposed. However, these methods might force two instances (e.g., node-level and subgraph-level representations) with different category labels to be consistent during model training, which can adversely impact the model robustness. To tackle this problem, we present a novel contrastive learning framework with a Diffusion model-based graph Enhancement module for Graph Anomaly Detection, DEGAD. In this framework, we design a diffusion model-based graph enhancement module to manipulate neighbors to generate enhanced graphs, which can efficiently alleviate the inconsistency problem. Further, based on the enhanced graphs, we present a multi-scale contrastive module to discriminate anomalies. Experimental results demonstrate the superiority of our model.

TIST Journal 2024 Journal Article

Inferring Real Mobility in Presence of Fake Check-ins Data

  • Qiang Gao
  • Hongzhu Fu
  • Kunpeng Zhang
  • Goce Trajcevski
  • Xu Teng
  • Fan Zhou

Understanding human mobility has become an important aspect of location-based services in tasks such as personalized recommendation and individual moving pattern recognition, enabled by the large volumes of data from geo-tagged social media (GTSM). Prior studies mainly focus on analyzing human historical footprints collected by GTSM and assuming the veracity of the data, which need not hold when some users are not willing to share their real footprints due to privacy concerns—thereby affecting reliability/authenticity. In this study, we address the problem of Inferring Real Mobility (IRMo) of users, from their unreliable historical traces. Tackling IRMo is a non-trivial task due to the: (1) sparsity of check-in data; (2) suspicious counterfeit check-in behaviors; and (3) unobserved dependencies in human trajectories. To address these issues, we develop a novel Graph-enhanced Attention model called IRMoGA, which attempts to capture underlying mobility patterns and check-in correlations by exploiting the unreliable spatio-temporal data. Specifically, we incorporate the attention mechanism (rather than solely relying on traditional recursive models) to understand the regularity of human mobility, while employing a graph neural network to understand the mutual interactions from human historical check-ins and leveraging prior knowledge to alleviate the inferring bias. Our experiments conducted on four real-world datasets demonstrate the superior performance of IRMoGA over several state-of-the-art baselines, e.g., up to 39.16% improvement regarding the Recall score on Foursquare.

AAAI Conference 2024 Short Paper

Interpreting Temporal Knowledge Graph Reasoning (Student Abstract)

  • Bin Chen
  • Kai Yang
  • Wenxin Tai
  • Zhangtao Cheng
  • Leyuan Liu
  • Ting Zhong
  • Fan Zhou

Temporal knowledge graph reasoning is an essential task that holds immense value in diverse real-world applications. Existing studies mainly focus on leveraging structural and sequential dependencies, excelling in tasks like entity and link prediction. However, they confront a notable interpretability gap in their predictions, a pivotal facet for comprehending model behavior. In this study, we propose an innovative method, LSGAT, which not only exhibits remarkable precision in entity predictions but also enhances interpretability by identifying pivotal historical events influencing event predictions. LSGAT enables concise explanations for prediction outcomes, offering valuable insights into the otherwise enigmatic "black box" reasoning process. Through an exploration of the implications of the most influential events, it facilitates a deeper understanding of the underlying mechanisms governing predictions.

ICLR Conference 2024 Conference Paper

Lemur: Harmonizing Natural Language and Code for Language Agents

  • Yiheng Xu
  • Hongjin Su
  • Chen Xing
  • Boyu Mi
  • Qian Liu 0033
  • Weijia Shi
  • Binyuan Hui
  • Fan Zhou

We introduce Lemur and Lemur-Chat, openly accessible language models optimized for both natural language and coding capabilities to serve as the backbone of versatile language agents. The evolution from language chat models to functional language agents demands that models not only master human interaction, reasoning, and planning but also ensure grounding in the relevant environments. This calls for a harmonious blend of language and coding capabilities in the models. Lemur and Lemur-Chat are proposed to address this necessity, demonstrating balanced proficiencies in both domains, unlike existing open-source models that tend to specialize in either. Through meticulous pretraining using a code-intensive corpus and instruction fine-tuning on text and code data, our models achieve state-of-the-art averaged performance across diverse text and coding benchmarks. Comprehensive experiments demonstrate Lemur’s superiority over existing open-source models and its proficiency across various agent tasks involving human communication, tool usage, and interaction under fully and partially observable environments. The harmonization between natural and programming languages enables Lemur-Chat to significantly narrow the gap with proprietary models on agent abilities, providing key insights into developing advanced open-source agents adept at reasoning, planning, and operating seamlessly across environments. Our model and code have been open-sourced at https://github.com/OpenLemur/Lemur.

NeurIPS Conference 2024 Conference Paper

OlympicArena: Benchmarking Multi-discipline Cognitive Reasoning for Superintelligent AI

  • Zhen Huang
  • Zengzhi Wang
  • Shijie Xia
  • Xuefeng Li
  • Haoyang Zou
  • Ruijie Xu
  • Run-Ze Fan
  • Lyumanshan Ye

The evolution of Artificial Intelligence (AI) has been significantly accelerated by advancements in Large Language Models (LLMs) and Large Multimodal Models (LMMs), gradually showcasing potential cognitive reasoning abilities in problem-solving and scientific discovery (i.e., AI4Science) once exclusive to human intellect. To comprehensively evaluate current models' performance in cognitive reasoning abilities, we introduce OlympicArena, which includes 11,163 bilingual problems across both text-only and interleaved text-image modalities. These challenges encompass a wide range of disciplines spanning seven fields and 62 international Olympic competitions, rigorously examined for data leakage. We argue that the challenges in Olympic competition problems are ideal for evaluating AI's cognitive reasoning due to their complexity and interdisciplinary nature, which are essential for tackling complex scientific challenges and facilitating discoveries. Beyond evaluating performance across various disciplines using answer-only criteria, we conduct detailed experiments and analyses from multiple perspectives. We delve into the models' cognitive reasoning abilities, their performance across different modalities, and their outcomes in process-level evaluations, which are vital for tasks requiring complex reasoning with lengthy solutions. Our extensive evaluations reveal that even advanced models like GPT-4o only achieve a 39.97\% overall accuracy (28.67\% for mathematics and 29.71\% for physics), illustrating current AI limitations in complex reasoning and multimodal integration. Through the OlympicArena, we aim to advance AI towards superintelligence, equipping it to address more complex challenges in science and beyond. We also provide a comprehensive set of resources to support AI research, including a benchmark dataset, an open-source annotation platform, a detailed evaluation tool, and a leaderboard with automatic submission features.

NeurIPS Conference 2024 Conference Paper

Rethinking Out-of-Distribution Detection on Imbalanced Data Distribution

  • Kai Liu
  • Zhihang Fu
  • Sheng Jin
  • Chao Chen
  • Ze Chen
  • Rongxin Jiang
  • Fan Zhou
  • Yaowu Chen

Detecting and rejecting unknown out-of-distribution (OOD) samples is critical for deployed neural networks to avoid unreliable predictions. In real-world scenarios, however, the efficacy of existing OOD detection methods is often impeded by the inherent imbalance of in-distribution (ID) data, which causes significant performance decline. Through statistical observations, we have identified two common challenges faced by different OOD detectors: misidentifying tail class ID samples as OOD, while erroneously predicting OOD samples as ID head classes. To explain this phenomenon, we introduce a generalized statistical framework, termed ImOOD, to formulate the OOD detection problem on imbalanced data distribution. Consequently, the theoretical analysis reveals that there exists a class-aware bias term between balanced and imbalanced OOD detection, which contributes to the performance gap. Building upon this finding, we present a unified training-time regularization technique to mitigate the bias and boost imbalanced OOD detectors across architecture designs. Our theoretically grounded method translates into consistent improvements on the representative CIFAR10-LT, CIFAR100-LT, and ImageNet-LT benchmarks against several state-of-the-art OOD detection approaches. Code is available at https://github.com/alibaba/imood.
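
One generic way to counteract a class-aware bias term is a logit-adjustment-style correction using the ID class prior, sketched below; this illustrates the idea only and is not ImOOD's exact regularizer:

```python
import numpy as np

def prior_adjusted_ood_score(logits: np.ndarray, class_prior: np.ndarray) -> float:
    """Subtract log class priors before computing an energy-style OOD score,
    so head classes stop dominating the score (higher output = more OOD-like).
    Illustrative logit adjustment, not the paper's method."""
    adjusted = logits - np.log(class_prior)
    return -np.log(np.exp(adjusted).sum())   # energy of the adjusted logits

logits = np.array([4.0, 1.0, 0.5])
prior = np.array([0.90, 0.07, 0.03])         # long-tailed ID class distribution
print(prior_adjusted_ood_score(logits, prior))
```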

IJCAI Conference 2024 Conference Paper

Revitalizing Real Image Deraining via a Generic Paradigm towards Multiple Rainy Patterns

  • Xin Li
  • Yuxin Feng
  • Fan Zhou
  • Yun Liang
  • Zhuo Su

Synthetic data-driven methods perform well on image rain removal task, but they still face many challenges in real rainfall scenarios due to the complexity and diversity of rainy patterns. In this paper, we propose a new generic paradigm for real image deraining from the perspective of synthesizing data covering more rainy patterns and constructing image rain removal networks with strong generalization performance. Firstly, instead of simply superimposing rain layers, we integrate various rainy patterns and design a phenomenal pipeline that incorporates multiple degradation types. Secondly, we construct a Patterns-aware Rain Removal Network (PRRN), which learns from both synthetic and real data simultaneously. In addition, to eliminate the inevitable distribution differences between synthetic and real data, we design a new Multi-representation Inter-domain Alignment Module (MIAM) in PRRN. By using multiple parallel submodules, MIAM achieves alignment of data domains in multiple feature subspaces. Based on several authoritative objective evaluation metrics, we successfully validate the effectiveness and robustness of the proposed method in real scenarios through extensive experiments carried out on five challenging real datasets.

TIST Journal 2024 Journal Article

Score-based Graph Learning for Urban Flow Prediction

  • Pengyu Wang
  • Xucheng Luo
  • Wenxin Tai
  • Kunpeng Zhang
  • Goce Trajcevski
  • Fan Zhou

Accurate urban flow prediction (UFP) is crucial for a range of smart city applications such as traffic management, urban planning, and risk assessment. To capture the intrinsic characteristics of urban flow, recent efforts have utilized spatial and temporal graph neural networks to deal with the complex dependence between the traffic in adjacent areas. However, existing graph neural network based approaches suffer from several critical drawbacks, including improper graph representation of urban traffic data, lack of semantic correlation modeling among graph nodes, and coarse-grained exploitation of external factors. To address these issues, we propose DiffUFP, a novel probabilistic graph-based framework for UFP. DiffUFP consists of two key designs: (1) a semantic region dynamic extraction method that effectively captures the underlying traffic network topology, and (2) a conditional denoising score-based adjacency matrix generator that takes spatial, temporal, and external factors into account when constructing the adjacency matrix, rather than the simple concatenation used in existing studies. Extensive experiments conducted on real-world datasets demonstrate the superiority of DiffUFP over the state-of-the-art UFP models and the effect of the two specific modules.

AAAI Conference 2024 Short Paper

Shallow Diffusion for Fast Speech Enhancement (Student Abstract)

  • Yue Lei
  • Bin Chen
  • Wenxin Tai
  • Ting Zhong
  • Fan Zhou

Recently, the field of Speech Enhancement has witnessed the success of diffusion-based generative models. However, these diffusion-based methods typically take many iterations to generate high-quality samples, leading to high computational costs and inefficiency. In this paper, we propose SDFEN (Shallow Diffusion for Fast spEech eNhancement), a novel approach for addressing the inefficiency problem while enhancing the quality of generated samples by reducing the iterative steps in the reverse process of the diffusion method. Specifically, we introduce a shallow diffusion strategy that initiates the reverse process with an adaptive time step to accelerate inference. In addition, a dedicated noisy predictor is further proposed to guide the adaptive selection of the time step. Experiment results demonstrate the superiority of the proposed SDFEN in effectiveness and efficiency.

AAAI Conference 2024 Short Paper

Spatial-Temporal Augmentation for Crime Prediction (Student Abstract)

  • Hongzhu Fu
  • Fan Zhou
  • Qing Guo
  • Qiang Gao

Crime prediction stands as a pivotal concern within the realm of urban management due to its potential threats to public safety. While prior research has predominantly focused on unraveling the intricate dependencies among urban regions and temporal dynamics, the challenges posed by the scarcity and uncertainty of historical crime data have not been thoroughly investigated. This study introduces an innovative spatial-temporal augmented learning framework for crime prediction, namely STAug. In STAug, we devise CrimeMix to improve generalization ability. Furthermore, we harness a spatial-temporal aggregation to capture and incorporate multiple correlations covering the temporal, spatial, and crime-type aspects. Experiments on two real-world datasets underscore the superiority of STAug over several baselines.

NeurIPS Conference 2024 Conference Paper

Two-way Deconfounder for Off-policy Evaluation in Causal Reinforcement Learning

  • Shuguang Yu
  • Shuxing Fang
  • Ruixin Peng
  • Zhengling Qi
  • Fan Zhou
  • Chengchun Shi

This paper studies off-policy evaluation (OPE) in the presence of unmeasured confounders. Inspired by the two-way fixed effects regression model widely used in the panel data literature, we propose a two-way unmeasured confounding assumption to model the system dynamics in causal reinforcement learning and develop a two-way deconfounder algorithm that devises a neural tensor network to simultaneously learn both the unmeasured confounders and the system dynamics, based on which a model-based estimator can be constructed for consistent policy value estimation. We illustrate the effectiveness of the proposed estimator through theoretical results and numerical experiments.
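
For reference, the two-way fixed effects regression from the panel-data literature that the paper cites as inspiration (standard model, with unit effects $\alpha_i$ and time effects $\lambda_t$ entering additively, mirroring the two-way unmeasured confounding assumption):

```latex
y_{it} \;=\; \alpha_i \;+\; \lambda_t \;+\; \beta^{\top} x_{it} \;+\; \varepsilon_{it},
\qquad i = 1,\dots,N, \quad t = 1,\dots,T .
```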

AAAI Conference 2023 Short Paper

A Probabilistic Graph Diffusion Model for Source Localization (Student Abstract)

  • Tangjiang Qian
  • Xovee Xu
  • Zhe Xiao
  • Ting Zhong
  • Fan Zhou

Source localization, as a reverse problem of graph diffusion, is important for many applications such as rumor tracking, detecting computer viruses, and finding epidemic spreaders. However, it is still under-explored due to the inherent uncertainty of the diffusion process: after a long period of propagation, the same diffusion process may start with diverse sources. Most existing solutions utilize deterministic models and therefore cannot describe the diffusion uncertainty of sources. Moreover, current probabilistic approaches struggle to conduct smooth transformations with variational inference. To overcome these limitations, we propose a probabilistic framework using continuous normalizing flows with invertible transformations and graph neural networks to explicitly model the uncertainty of the diffusion source. Experimental results on two real-world datasets demonstrate the effectiveness of our model over strong baselines.

AAAI Conference 2023 Short Paper

CasODE: Modeling Irregular Information Cascade via Neural Ordinary Differential Equations (Student Abstract)

  • Zhangtao Cheng
  • Xovee Xu
  • Ting Zhong
  • Fan Zhou
  • Goce Trajcevski

Predicting information cascade popularity is a fundamental problem for understanding the nature of information propagation on social media. However, existing works fail to capture an essential aspect of information propagation: the temporal irregularity of cascade events -- i.e., users' re-tweets occurring at random, non-periodic time instants. In this work, we present a novel framework CasODE for information cascade prediction with neural ordinary differential equations (ODEs). CasODE generalizes the discrete state transitions in RNNs to continuous-time dynamics for modeling the irregularly-sampled events in information cascades. Experimental evaluations on real-world datasets demonstrate the advantages of CasODE over baseline approaches.

AAAI Conference 2023 Conference Paper

De-biased Teacher: Rethinking IoU Matching for Semi-supervised Object Detection

  • Kuo Wang
  • Jingyu Zhuang
  • Guanbin Li
  • Chaowei Fang
  • Lechao Cheng
  • Liang Lin
  • Fan Zhou

Most of the recent research in semi-supervised object detection follows the pseudo-labeling paradigm evolved from the semi-supervised image classification task. However, the training paradigm of the two-stage object detector inevitably makes the pseudo-label learning process for unlabeled images full of bias. Specifically, the IoU matching scheme used for selecting and labeling candidate boxes is based on the assumption that the matching source (ground truth) is accurate enough in terms of the number of objects, object position, and object category. Obviously, pseudo-labels generated for unlabeled images cannot satisfy such a strong assumption, which makes the produced training proposals extremely unreliable and thus severely spoils the follow-up training. To de-bias the training proposals generated by the pseudo-label-based IoU matching, we propose a general framework -- De-biased Teacher, which abandons both the IoU matching and pseudo labeling processes by directly generating favorable training proposals for consistency regularization between the weak/strong augmented image pairs. Moreover, a distribution-based refinement scheme is designed to eliminate the scattered class predictions of significantly low values for higher efficiency. Extensive experiments demonstrate that the proposed De-biased Teacher consistently outperforms other state-of-the-art methods on the MS-COCO and PASCAL VOC benchmarks. Source codes are available at https://github.com/wkfdb/De-biased-Teracher.
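
For context, the IoU matching score at the center of the paper's argument (standard definition; the point above is that it becomes unreliable when the "ground truth" side is a noisy pseudo-label):

```python
def iou(box_a, box_b):
    """Intersection-over-Union of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))   # intersection width
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))   # intersection height
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

print(iou((0, 0, 2, 2), (1, 1, 3, 3)))  # 1 / 7 ≈ 0.143
```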

AAAI Conference 2023 Short Paper

Debiasing Intrinsic Bias and Application Bias Jointly via Invariant Risk Minimization (Student Abstract)

  • Yuzhou Mao
  • Liu Yu
  • Yi Yang
  • Fan Zhou
  • Ting Zhong

Demographic biases and social stereotypes are common in pretrained language models (PLMs), while the fine-tuning in downstream applications can also produce new biases or amplify the impact of the original biases. Existing works separate the debiasing from the fine-tuning procedure, which results in a gap between intrinsic bias and application bias. In this work, we propose a debiasing framework CauDebias to eliminate both biases, which directly combines debiasing with fine-tuning and can be applied to any PLM in downstream tasks. We distinguish the bias-relevant (non-causal factors) and label-relevant (causal factors) parts in sentences from a causal invariant perspective. Specifically, we perform intervention on non-causal factors in different demographic groups, and then devise an invariant risk minimization loss to trade off performance between bias mitigation and task accuracy. Experimental results on three downstream tasks show that our CauDebias can remarkably reduce biases in PLMs while minimizing the impact on downstream tasks.

NeurIPS Conference 2023 Conference Paper

Directional diffusion models for graph representation learning

  • Run Yang
  • Yuling Yang
  • Fan Zhou
  • Qiang Sun

Diffusion models have achieved remarkable success in diverse domains such as image synthesis, super-resolution, and 3D molecule generation. Surprisingly, the application of diffusion models in graph learning has garnered little attention. In this paper, we aim to bridge this gap by exploring the use of diffusion models for unsupervised graph representation learning. Our investigation commences with the identification of anisotropic structures within graphs and the recognition of a crucial limitation in the vanilla forward diffusion process when dealing with these anisotropic structures. The original forward diffusion process continually adds isotropic Gaussian noise to the data, which may excessively dilute anisotropic signals, leading to rapid signal-to-noise conversion. This rapid conversion poses challenges for training denoising neural networks and obstructs the acquisition of semantically meaningful representations during the reverse process. To overcome this challenge, we introduce a novel class of models termed {\it directional diffusion models}. These models adopt data-dependent, anisotropic, and directional noises in the forward diffusion process. In order to assess the effectiveness of our proposed models, we conduct extensive experiments on 12 publicly available datasets, with a particular focus on two distinct graph representation learning tasks. The experimental results unequivocally establish the superiority of our models over state-of-the-art baselines, underscoring their effectiveness in capturing meaningful graph representations. Our research not only sheds light on the intricacies of the forward process in diffusion models but also underscores the vast potential of these models in addressing a wide spectrum of graph-related tasks. Our code is available at \url{https://github.com/statsle/DDM}.

NeurIPS Conference 2023 Conference Paper

DOSE: Diffusion Dropout with Adaptive Prior for Speech Enhancement

  • Wenxin Tai
  • Yue Lei
  • Fan Zhou
  • Goce Trajcevski
  • Ting Zhong

Speech enhancement (SE) aims to improve the intelligibility and quality of speech in the presence of non-stationary additive noise. Deterministic deep learning models have traditionally been used for SE, but recent studies have shown that generative approaches, such as denoising diffusion probabilistic models (DDPMs), can also be effective. However, incorporating condition information into DDPMs for SE remains a challenge. We propose a model-agnostic method called DOSE that employs two efficient condition-augmentation techniques to address this challenge, based on two key insights: (1) We force the model to prioritize the condition factor when generating samples by training it with dropout operation; (2) We inject the condition information into the sampling process by providing an informative adaptive prior. Experiments demonstrate that our approach yields substantial improvements in high-quality and stable speech generation, consistency with the condition factor, and inference efficiency. Codes are publicly available at https://github.com/ICDM-UESTC/DOSE.
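
The two condition-augmentation ideas admit a compact sketch; shapes and schedules below are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

def dose_train_step(model, clean, noisy, t, alphas_cumprod, p_drop=0.2):
    # Insight (1): dropout on the condition forces the model to use it
    # robustly rather than memorizing it.
    cond = F.dropout(noisy, p=p_drop)
    eps = torch.randn_like(clean)
    a = alphas_cumprod[t].sqrt().view(-1, 1, 1)
    s = (1.0 - alphas_cumprod[t]).sqrt().view(-1, 1, 1)
    x_t = a * clean + s * eps
    return ((model(x_t, t, cond) - eps) ** 2).mean()

def adaptive_prior(noisy, t0, alphas_cumprod):
    # Insight (2): start reverse sampling from a diffused version of the
    # noisy speech (an informative prior) instead of pure Gaussian noise.
    a = alphas_cumprod[t0].sqrt()
    s = (1.0 - alphas_cumprod[t0]).sqrt()
    return a * noisy + s * torch.randn_like(noisy)
```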

AAAI Conference 2023 Short Paper

DyCVAE: Learning Dynamic Causal Factors for Non-stationary Series Domain Generalization (Student Abstract)

  • Weifeng Zhang
  • Zhiyuan Wang
  • Kunpeng Zhang
  • Ting Zhong
  • Fan Zhou

Learning domain-invariant representations is a major task of out-of-distribution generalization. To address this issue, recent efforts have taken causality into account, aiming to learn the causal factors with regard to tasks. However, extending existing generalization methods to non-stationary time series may be ineffective, because they fail to model the underlying causal factors under temporal-domain shifts in addition to source-domain shifts, as pointed out by recent studies. To this end, we propose a novel model DyCVAE to learn dynamic causal factors. The results on synthetic and real datasets demonstrate the effectiveness of our proposed model for the task of generalization in time series domains.

NeurIPS Conference 2023 Conference Paper

Enhancing Knowledge Transfer for Task Incremental Learning with Data-free Subnetwork

  • Qiang Gao
  • Xiaojun Shan
  • Yuchen Zhang
  • Fan Zhou

Since competitive subnetworks exist within a dense network, in concert with the Lottery Ticket Hypothesis, we introduce a novel neuron-wise task-incremental learning method, namely Data-free Subnetworks (DSN), which attempts to enhance elastic knowledge transfer across sequentially arriving tasks. Specifically, DSN transfers knowledge from the learned tasks to a newly arriving one by selecting the affiliated weights of a small set of neurons to be activated, including neurons reused from prior tasks via neuron-wise masks. It also transfers possibly valuable knowledge back to earlier tasks via data-free replay. Notably, DSN inherently relieves catastrophic forgetting and sidesteps the unavailability of past data and possible privacy concerns. Comprehensive experiments conducted on four benchmark datasets demonstrate the effectiveness of the proposed DSN in the context of task-incremental learning by comparing it to several state-of-the-art baselines. In particular, DSN enables knowledge transfer to earlier tasks, which is often overlooked by prior efforts.
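
A simplified sketch of the neuron-wise masking idea follows; the top-k-by-magnitude selection rule is an assumption, not the paper's exact criterion.

```python
import torch

def select_neuron_mask(layer_weight, k, prior_mask=None):
    """Pick a small set of neurons to activate for the new task."""
    score = layer_weight.abs().sum(dim=1)            # one score per neuron
    mask = torch.zeros(score.shape[0], dtype=torch.bool)
    mask[score.topk(k).indices] = True
    if prior_mask is not None:
        mask |= prior_mask                           # reuse prior-task neurons
    return mask

def masked_forward(x, weight, bias, mask):
    # Only the selected neurons contribute to the new task's forward pass.
    h = torch.relu(x @ weight.t() + bias)
    return h * mask.float()
```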

AAAI Conference 2023 Short Paper

Exploring Hypergraph of Earnings Call for Risk Prediction (Student Abstract)

  • Yi He
  • Wenxin Tai
  • Fan Zhou
  • Yi Yang

In financial economics, studies have shown that the textual content in the earnings conference call transcript has predictive power for a firm's future risk. However, the conference call transcript is very long and contains diverse non-relevant content, which poses challenges for text-based risk forecasting. This study investigates the structural dependency within a conference call transcript by explicitly modeling the dialogue between managers and analysts. Specifically, we utilize TextRank to extract information and exploit the semantic correlation within a discussion using hypergraph learning. This novel design can improve transcript representation performance and reduce the risk of forecast errors. Experimental results on a large-scale dataset show that our approach can significantly improve prediction performance compared to state-of-the-art text-based models.

YNIMG Journal 2023 Journal Article

Exploring the neural mechanisms underlying achalasia: A study of functional connectivity and regional brain activity

  • Nina Zhang
  • Binyu Teng
  • Xinyi Lu
  • Liangliang Shi
  • Li Liu
  • Fan Zhou
  • Ni Jiang
  • Xin Zhang

BACKGROUND AND AIMS: The pathophysiology of achalasia, which involves central nuclei abnormalities, remains unknown. We investigated the resting-state functional MRI (rs-fMRI) features of patients with achalasia. METHODS: We applied rs-fMRI to investigate the brain features in patients with achalasia (n = 27), compared to healthy controls (n = 29). Focusing on three regions of interest (ROIs): the dorsal motor nucleus of the vagus (DMV), the nucleus ambiguus (NA), and the nucleus of the solitary tract (NTS), we analyzed variations in resting-state functional connectivity (rs-FC), fractional amplitude of low-frequency fluctuations (fALFF), and regional homogeneity (ReHo). RESULTS: Achalasia patients demonstrated stronger functional connectivity between the NA and the right precentral gyrus, left postcentral gyrus, and left insula. No significant changes were found in the DMV or NTS. The fMRI analysis showed higher rs-FC values for NA-DMV and NA-NTS connections in achalasia patients. Achalasia patients exhibited decreased fALFF values in the NA, DMV, and NTS regions, as well as increased ReHo values in the NA and DMV regions. A positive correlation was observed between fALFF values in all six ROIs and the width of the barium meal. The NTS fALFF value and NA ReHo value displayed a positive correlation with integrated relaxation pressure (IRP), while the ReHo value in the right precentral gyrus showed an inverse correlation with the height of the barium meal. CONCLUSIONS: Abnormal rs-FC and regional brain activity were found in patients with achalasia. Our study provides new insights into the pathophysiology of achalasia and highlights the potential of rs-fMRI in improving the diagnosis and treatment of this condition.

AAAI Conference 2023 Conference Paper

Foresee What You Will Learn: Data Augmentation for Domain Generalization in Non-stationary Environment

  • Qiuhao Zeng
  • Wei Wang
  • Fan Zhou
  • Charles Ling
  • Boyu Wang

Existing domain generalization aims to learn a generalizable model to perform well even on unseen domains. For many real-world machine learning applications, the data distribution often shifts gradually along domain indices. For example, a self-driving car with a vision system drives from dawn to dusk, with the sky gradually darkening. Therefore, the system must be able to adapt to changes in ambient illuminations and continue to drive safely on the road. In this paper, we formulate such problems as Evolving Domain Generalization, where a model aims to generalize well on a target domain by discovering and leveraging the evolving pattern of the environment. We then propose Directional Domain Augmentation (DDA), which simulates the unseen target features by mapping source data as augmentations through a domain transformer. Specifically, we formulate DDA as a bi-level optimization problem and solve it through a novel meta-learning approach in the representation space. We evaluate the proposed method on both synthetic datasets and real-world datasets, and empirical results show that our approach can outperform other existing methods.

JMLR Journal 2023 Journal Article

Gap Minimization for Knowledge Sharing and Transfer

  • Boyu Wang
  • Jorge A. Mendez
  • Changjian Shui
  • Fan Zhou
  • Di Wu
  • Gezheng Xu
  • Christian Gagné
  • Eric Eaton

Learning from multiple related tasks by knowledge sharing and transfer has become increasingly relevant over the last two decades. In order to successfully transfer information from one task to another, it is critical to understand the similarities and differences between the domains. In this paper, we introduce the notion of performance gap, an intuitive and novel measure of the distance between learning tasks. Unlike existing measures which are used as tools to bound the difference of expected risks between tasks (e.g., $\mathcal{H}$-divergence or discrepancy distance), we theoretically show that the performance gap can be viewed as a data- and algorithm-dependent regularizer, which controls the model complexity and leads to finer guarantees. More importantly, it also provides new insights and motivates a novel principle for designing strategies for knowledge sharing and transfer: gap minimization. We instantiate this principle with two algorithms: 1. gapBoost, a novel and principled boosting algorithm that explicitly minimizes the performance gap between source and target domains for transfer learning; and 2. gapMTNN, a representation learning algorithm that reformulates gap minimization as semantic conditional matching for multitask learning. Our extensive evaluation on both transfer learning and multitask learning benchmark data sets shows that our methods outperform existing baselines.

NeurIPS Conference 2023 Conference Paper

Language Models Can Improve Event Prediction by Few-Shot Abductive Reasoning

  • Xiaoming Shi
  • Siqiao Xue
  • Kangrui Wang
  • Fan Zhou
  • James Zhang
  • Jun Zhou
  • Chenhao Tan
  • Hongyuan Mei

Large language models have shown astonishing performance on a wide range of reasoning tasks. In this paper, we investigate whether they could reason about real-world events and help improve the prediction performance of event sequence models. We design LAMP, a framework that integrates a large language model in event prediction. Particularly, the language model performs abductive reasoning to assist an event sequence model: the event model proposes predictions on future events given the past; instructed by a few expert-annotated demonstrations, the language model learns to suggest possible causes for each proposal; a search module finds out the previous events that match the causes; a scoring function learns to examine whether the retrieved events could actually cause the proposal. Through extensive experiments on several challenging real-world datasets, we demonstrate that our framework---thanks to the reasoning capabilities of large language models---could significantly outperform the state-of-the-art event sequence models.

AAAI Conference 2023 Short Paper

Learning Dynamic Temporal Relations with Continuous Graph for Multivariate Time Series Forecasting (Student Abstract)

  • Zhiyuan Wang
  • Fan Zhou
  • Goce Trajcevski
  • Kunpeng Zhang
  • Ting Zhong

The recent advance in graph neural networks (GNNs) has inspired a few studies to leverage the dependencies of variables for time series prediction. Despite the promising results, existing GNN-based models cannot capture the global dynamic relations between variables owing to the inherent limitation of their graph learning module. Besides, multi-scale temporal information is usually ignored or simply concatenated in prior methods, resulting in inaccurate predictions. To overcome these limitations, we present CGMF, a Continuous Graph learning method for Multivariate time series Forecasting. CGMF consists of a continuous graph module incorporating differential equations to capture the long-range intra- and inter-relations of the temporal embedding sequence. We also introduce a controlled differential equation-based fusion mechanism that efficiently exploits multi-scale representations to form continuous evolutional dynamics and learn rich relations and patterns shared across different scales. Comprehensive experiments demonstrate the effectiveness of our method on a variety of datasets.
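
As a self-contained sketch of a continuous graph module in this spirit (explicit Euler integration stands in for the paper's ODE solver; the adjacency rule is an assumption):

```python
import torch
import torch.nn as nn

class ContinuousGraph(nn.Module):
    def __init__(self, n_vars, dim):
        super().__init__()
        self.emb = nn.Parameter(torch.randn(n_vars, dim))
        self.lin = nn.Linear(dim, dim)

    def dynamics(self, h):
        # Global dynamic relations: adjacency is recomputed from the current
        # states rather than fixed in advance.
        adj = torch.softmax(h @ h.t() / h.shape[-1] ** 0.5, dim=-1)
        return torch.tanh(self.lin(adj @ h)) - h

    def forward(self, steps=10, dt=0.1):
        h = self.emb
        for _ in range(steps):          # explicit Euler ODE integration
            h = h + dt * self.dynamics(h)
        return h
```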

AAAI Conference 2023 Short Paper

Mobility Prediction via Sequential Trajectory Disentanglement (Student Abstract)

  • Jinyu Hong
  • Fan Zhou
  • Qiang Gao
  • Ping Kuang
  • Kunpeng Zhang

Accurately predicting human mobility is a critical task in location-based recommendation. Most prior approaches focus on fusing multiple semantic trajectories to forecast people's future movements and fail to consider the distinct relations in the underlying context of human mobility, resulting in a narrow perspective on human motions. Inspired by recent advances in disentanglement learning, we propose a novel self-supervised method called SelfMove for next POI prediction. SelfMove seeks to disentangle the potential time-invariant and time-varying factors from massive trajectories, which provides an interpretable view for understanding the complex semantics underlying human mobility representations. To address the data sparsity issue, we present two realistic trajectory augmentation approaches that help capture both the intrinsic periodicity and the constantly changing intents of humans. In addition, a POI-centric graph structure is proposed to explore both homogeneous and heterogeneous collaborative signals behind historical trajectories. Experiments on two real-world datasets demonstrate the superiority of SelfMove compared to the state-of-the-art baselines.

IJCAI Conference 2023 Conference Paper

Open Anomalous Trajectory Recognition via Probabilistic Metric Learning

  • Qiang Gao
  • Xiaohan Wang
  • Chaoran Liu
  • Goce Trajcevski
  • Li Huang
  • Fan Zhou

Typically, trajectories considered anomalous are the ones deviating from usual (e.g., traffic-dictated) driving patterns. However, this closed-set context fails to recognize the unknown anomalous trajectories, resulting in an insufficient self-motivated learning paradigm. In this study, we investigate the novel Anomalous Trajectory Recognition problem in an Open-world scenario (ATRO) and introduce a novel probabilistic Metric learning model, namely ATROM, to address it. Specifically, ATROM can detect the presence of unknown anomalous behavior in addition to identifying known behavior. It has a Mutual Interaction Distillation that uses contrastive metric learning to explore the interactive semantics regarding the diverse behavioral intents and a Probabilistic Trajectory Embedding that forces the trajectories with distinct behaviors to follow different Gaussian priors. More importantly, ATROM offers a probabilistic metric rule to discriminate between known and unknown behavioral patterns by taking advantage of the approximation of multiple priors. Experimental results on two large-scale trajectory datasets demonstrate the superiority of ATROM in addressing both known and unknown anomalous patterns.

NeurIPS Conference 2023 Conference Paper

Optimal Treatment Allocation for Efficient Policy Evaluation in Sequential Decision Making

  • Ting Li
  • Chengchun Shi
  • Jianing Wang
  • Fan Zhou
  • Hongtu Zhu

A/B testing is critical for modern technological companies to evaluate the effectiveness of newly developed products against standard baselines. This paper studies optimal designs that aim to maximize the amount of information obtained from online experiments to estimate treatment effects accurately. We propose three optimal allocation strategies in a dynamic setting where treatments are sequentially assigned over time. These strategies are designed to minimize the variance of the treatment effect estimator when data follow a non-Markov decision process or a (time-varying) Markov decision process. We further develop estimation procedures based on existing off-policy evaluation (OPE) methods and conduct extensive experiments in various environments to demonstrate the effectiveness of the proposed methodologies. In theory, we prove the optimality of the proposed treatment allocation design and establish upper bounds for the mean squared errors of the resulting treatment effect estimators.

JMLR Journal 2023 Journal Article

Over-parameterized Deep Nonparametric Regression for Dependent Data with Its Applications to Reinforcement Learning

  • Xingdong Feng
  • Yuling Jiao
  • Lican Kang
  • Baqun Zhang
  • Fan Zhou

In this paper, we provide statistical guarantees for over-parameterized deep nonparametric regression in the presence of dependent data. By decomposing the error, we establish non-asymptotic error bounds for deep estimation, which is achieved by effectively balancing the approximation and generalization errors. We have derived an approximation result for Hölder functions with constrained weights. Additionally, the generalization error is bounded by the weight norm, allowing for a neural network parameter number that is much larger than the training sample size. Furthermore, we address the issue of the curse of dimensionality by assuming that the samples originate from distributions with low intrinsic dimensions. Under this assumption, we are able to overcome the challenges posed by high-dimensional spaces. By incorporating an additional error propagation mechanism, we derive oracle inequalities for the over-parameterized deep fitted $Q$-iteration.

AAAI Conference 2023 Conference Paper

Overcoming Forgetting in Fine-Grained Urban Flow Inference via Adaptive Knowledge Replay

  • Haoyang Yu
  • Xovee Xu
  • Ting Zhong
  • Fan Zhou

The fine-grained urban flow inference (FUFI) problem aims at inferring high-resolution flow maps from coarse-grained ones, which plays an important role in sustainable and economic urban computing and traffic management. Previous models addressed the FUFI problem from the perspectives of spatial constraints, external factors, and memory cost. However, utilizing new urban flow maps to calibrate the learned model is very challenging due to the "catastrophic forgetting" problem and remains under-explored. In this paper, we take the first step toward continual learning in FUFI and present CUFAR -- Continual Urban Flow inference with Adaptive knowledge Replay -- a novel framework for inferring fine-grained citywide traffic flows. Specifically, (1) we design a spatial-temporal inference network that can extract better flow map features from both local and global levels; (2) we then present an adaptive knowledge replay (AKR) training algorithm to selectively replay the learned knowledge to facilitate the model's learning of new knowledge without forgetting. In addition, we propose a knowledge discriminator to avoid the "negative replaying" issue introduced by noisy urban flow maps. Extensive experiments on four large-scale real-world FUFI datasets demonstrate that our proposed model consistently outperforms strong baselines and effectively mitigates the forgetting problem. Source code is available at: https://github.com/PattonYu/CUFAR.

AAAI Conference 2023 Conference Paper

Revisiting Denoising Diffusion Probabilistic Models for Speech Enhancement: Condition Collapse, Efficiency and Refinement

  • Wenxin Tai
  • Fan Zhou
  • Goce Trajcevski
  • Ting Zhong

Recent literature has shown that denoising diffusion probabilistic models (DDPMs) can be used to synthesize high-fidelity samples with quality competitive with (or sometimes better than) previous state-of-the-art approaches. However, few attempts have been made to apply DDPMs to the speech enhancement task. The reported performance of existing works is relatively poor and significantly inferior to other generative methods. In this work, we first reveal the difficulties in applying existing diffusion models to the field of speech enhancement. Then we introduce DR-DiffuSE, a simple and effective framework for speech enhancement using conditional diffusion models. We present three strategies (two in diffusion training and one in reverse sampling) to tackle the condition collapse problem and guarantee the sufficient use of condition information. For efficiency, we introduce a fast sampling technique to reduce the sampling process to several steps and exploit a refinement network to calibrate the defective speech. Our proposed method achieves performance comparable to the state-of-the-art GAN-based model and shows a significant improvement over existing DDPM-based algorithms.

AAAI Conference 2023 Conference Paper

SLOTH: Structured Learning and Task-Based Optimization for Time Series Forecasting on Hierarchies

  • Fan Zhou
  • Chen Pan
  • Lintao Ma
  • Yu Liu
  • Shiyu Wang
  • James Zhang
  • Xinxin Zhu
  • Xuanwei Hu

Multivariate time series forecasting with hierarchical structure is widely used in real-world applications, e.g., sales predictions for the geographical hierarchy formed by cities, states, and countries. Hierarchical time series (HTS) forecasting includes two sub-tasks, i.e., forecasting and reconciliation. In previous works, hierarchical information is only integrated in the reconciliation step to maintain coherency, but not in the forecasting step for accuracy improvement. In this paper, we propose two novel tree-based feature integration mechanisms, i.e., top-down convolution and bottom-up attention, to leverage the information of the hierarchical structure to improve the forecasting performance. Moreover, unlike most previous reconciliation methods which either rely on strong assumptions or focus on coherent constraints only, we utilize deep neural optimization networks, which not only achieve coherency without any assumptions, but also allow more flexible and realistic constraints to achieve task-based targets, e.g., lower under-estimation penalty and meaningful decision-making loss to facilitate the subsequent downstream tasks. Experiments on real-world datasets demonstrate that our tree-based feature integration mechanism achieves superior performance on hierarchical forecasting tasks compared to the state-of-the-art methods, and our neural optimization networks can be applied to real-world tasks effectively without any additional effort under coherence and task-based constraints.

ICML Conference 2023 Conference Paper

Variance Control for Distributional Reinforcement Learning

  • Qi Kuang
  • Zhoufan Zhu
  • Liwen Zhang
  • Fan Zhou

Although distributional reinforcement learning (DRL) has been widely examined in the past few years, very few studies investigate the validity of the obtained Q-function estimator in the distributional setting. To fully understand how the approximation errors of the Q-function affect the whole training process, we conduct an error analysis and theoretically show how to reduce both the bias and the variance of the error terms. With this new understanding, we construct a new estimator, Quantiled Expansion Mean (QEM), and introduce a new DRL algorithm (QEMRL) from the statistical perspective. We extensively evaluate our QEMRL algorithm on a variety of Atari and Mujoco benchmark tasks and demonstrate that QEMRL achieves significant improvement over baseline algorithms in terms of sample efficiency and convergence performance.

AAAI Conference 2022 Short Paper

A Probabilistic Framework for Land Deformation Prediction (Student Abstract)

  • Rongfan Li
  • Fan Zhou
  • Goce Trajcevski
  • Kunpeng Zhang
  • Ting Zhong

The development of InSAR (satellite Interferometric Synthetic Aperture Radar) enables accurate monitoring of land surface deformations and has led to advances in deformation forecasting for preventing landslides, one of the most severe geological disasters. Despite this unparalleled success, existing spatio-temporal models typically make predictions based on static adjacency relationships, simplifying the conditional dependencies and neglecting the distributions of variables. To overcome those limitations, we propose a Distribution Aware Probabilistic Framework (DAPF), which learns manifold embeddings while maintaining the distribution of deformations. We obtain a dynamic adjacency matrix upon which we approximate the true posterior while emphasizing the spatio-temporal characteristics. Experimental results on a real-world dataset validate the superior performance of our method.

AAAI Conference 2022 Short Paper

Conditional Collaborative Filtering Process for Top-K Recommender System (Student Abstract)

  • Guanyu Wang
  • Xovee Xu
  • Ting Zhong
  • Fan Zhou

Conditional neural process (CNP) has been extensively applied to data analysis tasks due to its excellent ability to make accurate predictions for incomplete data points. However, in the literature only a few works have studied the CNP in recommendation systems. In this work, we propose CCFP, a collaborative filtering method that differs from other CF models by incorporating CNP into an encoder-decoder architecture. By analyzing the complete user-item interaction data, our model fits a global representation that better represents the features of users and items. CCFP can significantly improve recommendation performance compared to baselines by predicting items for the target users from their incomplete observation data.

IJCAI Conference 2022 Conference Paper

Double-Check Soft Teacher for Semi-Supervised Object Detection

  • Kuo Wang
  • Yuxiang Nie
  • Chaowei Fang
  • Chengzhi Han
  • Xuewen Wu
  • Xiaohui Wang
  • Liang Lin
  • Fan Zhou

In the semi-supervised object detection task, due to the scarcity of labeled data and the diversity and complexity of objects to be detected, the quality of pseudo-labels generated by existing methods for unlabeled data is relatively low, which severely restricts the performance of semi-supervised object detection. In this paper, we revisit the pseudo-labeling based Teacher-Student mutual learning framework for semi-supervised object detection and identify that the inconsistency of the locations and features of candidate object proposals between the Teacher and Student branches is the root cause of the low quality of the pseudo labels. To address this issue, we propose a simple yet effective technique within the mainstream teacher-student framework, called Double Check Soft Teacher, to overcome the harm caused by insufficient quality of pseudo labels. Specifically, our method leverages the teacher model to generate pseudo labels for the student model. In particular, candidate boxes generated by the student model based on the pseudo labels are sent to the teacher model for a "double check", and the teacher model then outputs probabilistic soft labels with a background class for those candidate boxes, which are used to train the student model. Together with a pseudo-labeling mechanism based on the sum of the TOP-K prediction scores, which improves the recall rate of pseudo labels, Double Check Soft Teacher consistently surpasses state-of-the-art methods by significant margins on the MS-COCO benchmark, establishing a new state of the art. Source codes are available at https://github.com/wkfdb/DCST.
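
The TOP-K score-sum rule for keeping pseudo boxes is simple to state; K and the threshold below are illustrative values, not the paper's tuned settings.

```python
import torch

def topk_sum_keep(class_probs, k=3, thresh=0.9):
    """class_probs: (num_boxes, num_classes) teacher softmax scores."""
    topk_sum = class_probs.topk(k, dim=-1).values.sum(dim=-1)
    return topk_sum >= thresh      # boolean keep-mask over candidate boxes
```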

AAAI Conference 2022 Conference Paper

Dynamic Manifold Learning for Land Deformation Forecasting

  • Fan Zhou
  • Rongfan Li
  • Qiang Gao
  • Goce Trajcevski
  • Kunpeng Zhang
  • Ting Zhong

Landslides refer to occurrences of massive ground movements due to geological (and meteorological) factors, and can have disastrous impact on property, economy, and even lead to loss of life. The advances of remote sensing provide accurate and continuous terrain monitoring, enabling the study and analysis of land deformation which, in turn, can be used for possible landslide forecasting. Prior studies either rely on independent observations for displacement prediction or model static land characteristics without considering the subtle interactions between different locations and the dynamic changes of the surface conditions. We present DyLand – Dynamic Manifold Learning with Normalizing Flows for Land deformation prediction – a novel framework for learning dynamic structures of the terrain surface and improving the performance of land deformation prediction. DyLand models the spatial connections of InSAR measurements and estimates conditional distributions of deformations on the terrain manifold with a novel normalizing flow-based method. Instead of modeling the stable terrains, it incorporates surface permutations and captures the innate dynamics of the land surface while allowing for tractable likelihood estimates on the manifold. Our extensive evaluations on curated InSAR datasets from continuous monitoring of slopes prone to landslides show that DyLand outperforms existing benchmark models.

AAAI Conference 2022 Short Paper

Fine-Grained Urban Flow Inference via Normalizing Flow (Student Abstract)

  • Haoyang Yu
  • Xovee Xu
  • Ting Zhong
  • Fan Zhou

Fine-grained urban flow inference (FUFI) aims to infer the fine-grained (FG) urban flow map from the corresponding coarse-grained (CG) one, which plays an important role in efficient traffic monitoring and management in smart cities. In FUFI, the CG map can be obtained with only a small number of monitoring devices, greatly reducing the overhead of deploying devices and the costs of maintenance, labor, and electricity. Existing FUFI methods are mainly based on techniques from image super-resolution (SR) models, which cannot fully consider the influence of external factors and face the ill-posed problem in SR tasks. In this paper, we propose UFI-Flow, a novel approach for addressing the FUFI problem by learning the conditional distributions of CG and FG map pairs. Given the CG map and the latent variables, the corresponding FG map is inferred by invertible transformations. In addition, an augmented distribution fusion mechanism is further proposed to constrain the urban flow distribution under the influence of external factors. We provide a new large-scale real-world FUFI dataset and show that UFI-Flow significantly outperforms the strong baselines.

AAAI Conference 2022 Short Paper

Large-Scale IP Usage Identification via Deep Ensemble Learning (Student Abstract)

  • Zhiyuan Wang
  • Fan Zhou
  • Kunpeng Zhang
  • Yong Wang

Understanding users’ behavior via IP addresses is essential for numerous practical IP-based applications such as online content delivery and fraud prevention. Profiling IP addresses has been extensively studied, e.g., for IP geolocation and anomaly detection. However, less is known about the usage scenario of an IP address, e.g., a dedicated enterprise network or home broadband. In this work, we make the first attempt to address a large-scale IP scenario prediction problem. Specifically, we collect IP scenario data from four regions and propose a novel deep ensemble learning-based model to learn IP assignment rules and complex feature interactions. Extensive experiments support that our method can make accurate IP scenario identifications and generalize from data in one region to another.

AAAI Conference 2022 Short Paper

Learning Contrastive Multi-View Graphs for Recommendation (Student Abstract)

  • Zhangtao Cheng
  • Ting Zhong
  • Kunpeng Zhang
  • Joojo Walker
  • Fan Zhou

This paper exploits self-supervised learning (SSL) to learn more accurate and robust representations from the user-item interaction graph. In particular, we propose a novel SSL model that effectively leverages contrastive multi-view learning and a pseudo-siamese network to construct a pre-training and post-training framework. Moreover, we present three graph augmentation techniques for the pre-training stage and explore the effects of combining different augmentations, which allows us to learn general and robust representations for GNN-based recommendation. Experimental evaluations on real-world datasets show that the proposed solution significantly improves recommendation accuracy, especially for sparse data, and is also noise resistant.

NeurIPS Conference 2022 Conference Paper

Learning Latent Seasonal-Trend Representations for Time Series Forecasting

  • Zhiyuan Wang
  • Xovee Xu
  • Weifeng Zhang
  • Goce Trajcevski
  • Ting Zhong
  • Fan Zhou

Forecasting complex time series is ubiquitous and vital in a range of applications but challenging. Recent advances endeavor to achieve progress by incorporating various deep learning techniques (e.g., RNN and Transformer) into sequential models. However, clear patterns are still hard to extract since time series are often composed of several intricately entangled components. Motivated by the success of disentangled variational autoencoders in computer vision and classical time series decomposition, we aim to infer a pair of representations that depict the seasonal and trend components of a time series. To achieve this goal, we propose LaST, which, based on variational inference, aims to disentangle the seasonal-trend representations in the latent space. Furthermore, LaST supervises and disassociates the representations from the perspectives of themselves and input reconstruction, and introduces a series of auxiliary objectives. Extensive experiments demonstrate that LaST achieves state-of-the-art performance on the time series forecasting task against the most advanced representation learning and end-to-end forecasting models. For reproducibility, our implementation is publicly available on GitHub.

AAAI Conference 2022 Short Paper

Linking Transformer to Hawkes Process for Information Cascade Prediction (Student Abstract)

  • Liu Yu
  • Xovee Xu
  • Ting Zhong
  • Goce Trajcevski
  • Fan Zhou

Information cascade is typically formalized as a process of (simplified) discrete sequence of events, and recent approaches have tackled its prediction via variants of recurrent neural networks. However, the information diffusion process is essentially an evolving directed acyclic graph (DAG) in the continuous-time domain. In this paper, we propose a transformer enhanced Hawkes process (Hawkesformer), which links the hierarchical attention mechanism with the Hawkes process to model the arrival stream of discrete events continuously. A two-level attention architecture is used to parameterize the intensity function of Hawkesformer, which captures the long-term dependencies between nodes in the graph and better embeds the cascade evolution rate for modeling short-term outbreaks. Experimental results demonstrate the significant improvements of Hawkesformer over the state-of-the-art.

AAAI Conference 2022 Conference Paper

PrEF: Probabilistic Electricity Forecasting via Copula-Augmented State Space Model

  • Zhiyuan Wang
  • Xovee Xu
  • Goce Trajcevski
  • Kunpeng Zhang
  • Ting Zhong
  • Fan Zhou

Electricity forecasting has important implications for key decisions in modern electricity systems, ranging from power generation to transmission and distribution. In the literature, traditional statistical approaches, machine-learning methods, and deep learning models (e.g., recurrent neural networks) are utilized to model the trends and patterns in electricity time-series data. However, they are restricted either by their deterministic forms or by independence in probabilistic assumptions, thereby neglecting the uncertainty of electricity data or the significant correlations between its distributions. Ignoring these may yield error accumulation, especially when relying on historical data and aiming at multi-step prediction. To overcome these limitations, we propose a novel method named Probabilistic Electricity Forecasting (PrEF), which incorporates a copula-augmented mechanism into a non-linear neural state space model (SSM), enabling it to learn uncertainty-dependency knowledge and understand the interactive relationships between various factors from large-scale electricity time-series data. Our method distinguishes itself from existing models by its tractable inference procedure and its capability of providing high-quality probabilistic distribution predictions. Extensive experiments on two real-world electricity datasets demonstrate that our method consistently outperforms the alternatives.

ICML Conference 2022 Conference Paper

Quantification and Analysis of Layer-wise and Pixel-wise Information Discarding

  • Haotian Ma
  • Hao Zhang 0063
  • Fan Zhou
  • Yinqing Zhang
  • Quanshi Zhang

This paper presents a method to explain how the information of each input variable is gradually discarded during the forward propagation in a deep neural network (DNN), which provides new perspectives to explain DNNs. We define two types of entropy-based metrics, i.e., (1) the discarding of pixel-wise information used in the forward propagation, and (2) the uncertainty of the input reconstruction, to measure the input information contained by a specific layer from two perspectives. Unlike previous attribution metrics, the proposed metrics ensure the fairness of comparisons between different layers of different DNNs. We can use these metrics to analyze the efficiency of information processing in DNNs, which exhibits strong connections to the performance of DNNs. We analyze information discarding in a pixel-wise manner, which is different from the information bottleneck theory measuring feature information w.r.t. the sample distribution. Experiments have shown the effectiveness of our metrics in analyzing classic DNNs and explaining existing deep-learning techniques. The code is available at https://github.com/haotianSustc/deepinfo.

IJCAI Conference 2022 Conference Paper

Table Pre-training: A Survey on Model Architectures, Pre-training Objectives, and Downstream Tasks

  • Haoyu Dong
  • Zhoujun Cheng
  • Xinyi He
  • Mengyu Zhou
  • Anda Zhou
  • Fan Zhou
  • Ao Liu
  • Shi Han

Following the success of pre-training techniques in the natural language domain, a flurry of table pre-training frameworks have been proposed and have achieved new state-of-the-art results on various downstream tasks such as table question answering, table type recognition, column relation classification, table search, and formula prediction. Various model architectures have been explored to best capture the characteristics of (semi-)structured tables, especially specially-designed attention mechanisms. Moreover, to fully leverage the supervision signals in unlabeled tables, diverse pre-training objectives have been designed and evaluated, for example, denoising cell values, predicting numerical relationships, and learning a neural SQL executor. This survey aims to provide a comprehensive review of model designs, pre-training objectives, and downstream tasks for table pre-training, and we further share our thoughts on existing challenges and future opportunities.

TIST Journal 2021 Journal Article

Contrastive Trajectory Learning for Tour Recommendation

  • Fan Zhou
  • Pengyu Wang
  • Xovee Xu
  • Wenxin Tai
  • Goce Trajcevski

The main objective of Personalized Tour Recommendation (PTR) is to generate a sequence of points-of-interest (POIs) for a particular tourist, according to user-specific constraints such as duration time, start and end points, the number of attractions planned to visit, and so on. Previous PTR solutions are based on either heuristics for solving the orienteering problem to maximize a global reward with a specified budget or approaches attempting to learn user visiting preferences and transition patterns with the stochastic process or recurrent neural networks. However, existing learning methodologies rely on historical trips to train the model and use the next visited POI as the supervised signal, which may not fully capture the coherence of preferences and thus recommend similar trips to different users, primarily due to the data sparsity problem and long-tailed distribution of POI popularity. This work presents a novel tour recommendation model by distilling knowledge and supervision signals from the trips in a self-supervised manner. We propose Contrastive Trajectory Learning for Tour Recommendation (CTLTR), which utilizes the intrinsic POI dependencies and traveling intent to discover extra knowledge and augments the sparse data via pre-training auxiliary self-supervised objectives. CTLTR provides a principled way to characterize the inherent data correlations while tackling the implicit feedback and weak supervision problems by learning robust representations applicable for tour planning. We introduce a hierarchical recurrent encoder-decoder to identify tourists’ intentions and use the contrastive loss to discover subsequence semantics and their sequential patterns through maximizing the mutual information. Additionally, we observe that a data augmentation step as the preliminary of contrastive learning can solve the overfitting issue resulting from data sparsity. We conduct extensive experiments on a range of real-world datasets and demonstrate that our model can significantly improve the recommendation performance over the state-of-the-art baselines in terms of both recommendation accuracy and visiting orders.
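
The mutual-information maximization step typically reduces to an InfoNCE-style contrastive loss; the generic form (not CTLTR's exact objective) is:

```python
import torch
import torch.nn.functional as F

def info_nce(z1, z2, temperature=0.1):
    """z1, z2: (B, d) embeddings of two augmented views of the same trips."""
    z1, z2 = F.normalize(z1, dim=-1), F.normalize(z2, dim=-1)
    logits = z1 @ z2.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(z1.shape[0], device=z1.device)
    return F.cross_entropy(logits, targets)     # positives on the diagonal
```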

AAAI Conference 2021 Conference Paper

Forecasting Reservoir Inflow via Recurrent Neural ODEs

  • Fan Zhou
  • Liang Li

Forecasting reservoir inflow is critical for many policies, ranging from flood control and agricultural irrigation to water ecology management, hydropower generation, and landslide prevention. Prior studies mainly exploit autoregressive models, e.g., recurrent neural networks (RNNs) and their many variants, to model the temporal patterns of the inflow time series. However, existing approaches rely on regular and accurate inflow observations, and either fail to predict multiscale inflow (e.g., an hour, a day, or a month ahead) or ignore the uncertainty of observations due to confounding factors such as snowmelt and precipitation. To address these limitations, we propose a novel inflow forecasting model that incorporates the uncertainty of the observations into the RNN model and captures the continuous-time dynamics of the latent states with neural ordinary differential equations (ODEs). Our method, called FlowODE, explicitly encodes the stochasticity of hidden conditions in addition to the temporal dependencies among inflow observations. Moreover, FlowODE explores a continuum of layers instead of discrete RNNs to model the hidden states' dynamics, allowing us to infer the inflow at any time horizon flexibly. We conduct extensive experiments on real-world datasets collected from two large-scale hydropower dams. The results show that our method consistently outperforms previous inflow forecasting models while providing adaptable predictions and a flexible balance between prediction accuracy and computational cost.

AAAI Conference 2021 Conference Paper

Land Deformation Prediction via Slope-Aware Graph Neural Networks

  • Fan Zhou
  • Rongfan Li
  • Goce Trajcevski
  • Kunpeng Zhang

We introduce a slope-aware graph neural network (SA-GNN) to leverage continuously monitored data and predict land displacement. Unlike general GNNs tackling tasks on plain graphs, our method is capable of generalizing 3D spatial knowledge from InSAR point clouds. Specifically, we exploit the 3D structure of the land surface while preserving the spatial correlations among adjacent points. The point cloud can then be efficiently converted to a near-neighbor graph where general GNN methods can be applied to predict the displacement of the slope surface. We conducted experiments on real-world datasets and the results demonstrate that SA-GNN outperforms existing 3D CNN and point GNN methods.

AAAI Conference 2021 Conference Paper

Multi-task Learning by Leveraging the Semantic Information

  • Fan Zhou
  • Brahim Chaib-draa
  • Boyu Wang

One crucial objective of multi-task learning is to align distributions across tasks so that the information between them can be transferred and shared. However, existing approaches only focused on matching the marginal feature distribution while ignoring the semantic information, which may hinder the learning performance. To address this issue, we propose to leverage the label information in multi-task learning by exploring the semantic conditional relations among tasks. We first theoretically analyze the generalization bound of multi-task learning based on the notion of Jensen-Shannon divergence, which provides new insights into the value of label information in multi-task learning. Our analysis also leads to a concrete algorithm that jointly matches the semantic distribution and controls label distribution divergence. To confirm the effectiveness of the proposed method, we first compare the algorithm with several baselines on some benchmarks and then test the algorithms under label space shift conditions. Empirical results demonstrate that the proposed method could outperform most baselines and achieve state-of-the-art performance, particularly showing the benefits under the label shift conditions.

IJCAI Conference 2021 Conference Paper

Non-decreasing Quantile Function Network with Efficient Exploration for Distributional Reinforcement Learning

  • Fan Zhou
  • Zhoufan Zhu
  • Qi Kuang
  • Liwen Zhang

Although distributional reinforcement learning (DRL) has been widely examined in the past few years, two open questions remain. One is how to ensure the validity of the learned quantile function; the other is how to efficiently utilize the distribution information. This paper attempts to provide some new perspectives to encourage future in-depth studies of these two questions. We first propose a non-decreasing quantile function network (NDQFN) to guarantee the monotonicity of the obtained quantile estimates and then design a general exploration framework called distributional prediction error (DPE) for DRL, which utilizes the entire distribution of the quantile function. In this paper, we not only discuss the theoretical necessity of our method but also show the performance gain it achieves in practice by comparing with several competitors on Atari 2600 games, especially on some hard-exploration games.

ICML Conference 2021 Conference Paper

Optimal Estimation of High Dimensional Smooth Additive Function Based on Noisy Observations

  • Fan Zhou
  • Ping Li 0001

Given $\bx_j = \btheta + \bepsilon_j$, $j=1,\ldots,n$, where $\btheta \in \RR^d$ is an unknown parameter and $\bepsilon_j$ are i.i.d. Gaussian noise vectors, we study the estimation of $f(\btheta)$ for a given smooth function $f: \RR^d \rightarrow \RR$ equipped with an additive structure. We inherit the idea from a recent work that introduced an effective bias reduction technique through iterative bootstrap and derive a bias-reducing estimator. By establishing its normal approximation results, we show that the proposed estimator can achieve asymptotic normality with a looser constraint on smoothness compared with a general smooth function, thanks to the additive structure. Such results further imply that the proposed estimator is asymptotically efficient. Both upper and lower bounds on the mean squared error are proved, showing that the proposed estimator is minimax optimal for the smooth class considered. Numerical simulation results are presented to validate our analysis and show the superior performance of the proposed estimator over the plug-in approach in terms of bias reduction and building confidence intervals.
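
A toy, one-step version of bootstrap bias reduction for a smooth additive functional (the paper iterates this idea and works in high dimensions; everything below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
theta = np.array([0.5, -1.0, 2.0])
x = theta + 0.8 * rng.standard_normal((200, 3))   # x_j = theta + eps_j
f = lambda t: np.sum(t ** 2)                      # smooth additive function

theta_hat = x.mean(axis=0)
plug_in = f(theta_hat)
# Parametric bootstrap estimate of the plug-in bias, then subtract it.
se_hat = x.std(axis=0, ddof=1) / np.sqrt(len(x))
boot = np.array([f(theta_hat + se_hat * rng.standard_normal(3))
                 for _ in range(2000)])
debiased = plug_in - (boot.mean() - plug_in)
print(plug_in, debiased, f(theta))   # debiased lands closer to f(theta)
```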

AAAI Conference 2021 Conference Paper

Overcoming Catastrophic Forgetting in Graph Neural Networks with Experience Replay

  • Fan Zhou
  • Chengtai Cao

Graph Neural Networks (GNNs) have recently received significant research attention due to their superior performance on a variety of graph-related learning tasks. Most of the current works focus on either static or dynamic graph settings, addressing a single particular task, e.g., node/graph classification or link prediction. In this work, we investigate the question: can GNNs be applied to continually learn a sequence of tasks? Towards that end, we explore the Continual Graph Learning (CGL) paradigm and present the Experience Replay based framework ER-GNN for CGL to alleviate the catastrophic forgetting problem in existing GNNs. ER-GNN stores knowledge from previous tasks as experiences and replays them when learning new tasks to mitigate catastrophic forgetting. We propose three experience node selection strategies: mean of feature, coverage maximization, and influence maximization, to guide the process of selecting experience nodes. Extensive experiments on three benchmark datasets demonstrate the effectiveness of our ER-GNN and shed light on incremental graph (non-Euclidean) structure learning.
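
The "mean of feature" selection strategy has a natural reading: per class, replay the nodes closest to the class centroid. A hedged sketch:

```python
import torch

def mean_of_feature_select(feats, labels, per_class=10):
    """Return node indices to store in the experience buffer."""
    buffer = []
    for c in labels.unique():
        idx = (labels == c).nonzero(as_tuple=True)[0]
        center = feats[idx].mean(dim=0, keepdim=True)
        dist = (feats[idx] - center).norm(dim=1)
        buffer.append(idx[dist.argsort()[:per_class]])
    return torch.cat(buffer)   # replayed alongside future tasks' nodes
```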

NeurIPS Conference 2021 Conference Paper

Rate-Optimal Subspace Estimation on Random Graphs

  • Zhixin Zhou
  • Fan Zhou
  • Ping Li
  • Cun-Hui Zhang

We study the theory of random bipartite graphs whose adjacency matrices are generated according to a connectivity matrix $M$. We consider the bipartite graph to be sparse, i.e., the entries of $M$ are upper bounded by a certain sparsity parameter. We show that the performance of estimating the connectivity matrix $M$ depends on the sparsity of the graph. We focus on two measures of estimation performance: the error of estimating $M$ and the error of estimating the column space of $M$. In the first case, we consider the operator norm and Frobenius norm of the difference between the estimate and the true connectivity matrix. In the second case, the performance is measured by the difference between the estimated projection matrix and the true projection matrix in operator norm and Frobenius norm. We show that the estimators we propose achieve the minimax optimal rate.

IJCAI Conference 2020 Conference Paper

Enhancing Urban Flow Maps via Neural ODEs

  • Fan Zhou
  • Liang Li
  • Ting Zhong
  • Goce Trajcevski
  • Kunpeng Zhang
  • Jiahao Wang

Flow super-resolution (FSR) enables inferring fine-grained urban flows from coarse-grained observations and plays an important role in traffic monitoring and prediction. Existing FSR solutions rely on deep CNN models (e.g., ResNet) for learning spatial correlation, incurring excessive memory cost and numerous parameter updates. We propose to tackle urban flow inference using the dynamic systems paradigm and present a new method, FODE -- FSR with Ordinary Differential Equations (ODEs). FODE extends neural ODEs by introducing an affine coupling layer to overcome the problem of numerically unstable gradient computation, which allows more accurate and efficient spatial correlation estimation without extra memory cost. In addition, FODE provides a flexible balance between flow inference accuracy and computational efficiency. A FODE-based augmented normalization mechanism is further introduced to constrain the flow distribution under the influence of external factors. Experimental evaluations on two real-world datasets demonstrate that FODE significantly outperforms several baseline approaches.
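
The affine coupling layer referenced above follows the generic RealNVP-style construction; this sketch shows the invertible form, not FODE's full architecture.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        self.net = nn.Sequential(nn.Linear(self.half, hidden), nn.ReLU(),
                                 nn.Linear(hidden, 2 * (dim - self.half)))

    def forward(self, x):
        x1, x2 = x.split([self.half, x.shape[-1] - self.half], dim=-1)
        s, t = self.net(x1).chunk(2, dim=-1)
        # Bounded log-scale keeps the transform (and its gradients) stable.
        y2 = x2 * torch.exp(torch.tanh(s)) + t
        return torch.cat([x1, y2], dim=-1)   # invertible given x1
```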

NeurIPS Conference 2020 Conference Paper

Non-Crossing Quantile Regression for Distributional Reinforcement Learning

  • Fan Zhou
  • Jianing Wang
  • Xingdong Feng

Distributional reinforcement learning (DRL) estimates the distribution over future returns instead of the mean to more efficiently capture the intrinsic uncertainty of MDPs. However, batch-based DRL algorithms cannot guarantee the non-decreasing property of learned quantile curves, especially at the early training stage, leading to abnormal distribution estimates and reduced model interpretability. To address these issues, we introduce a general DRL framework using non-crossing quantile regression to ensure the monotonicity constraint within each sampled batch, which can be incorporated with any well-known DRL algorithm. We demonstrate the validity of our method from both the theory and model implementation perspectives. Experiments on Atari 2600 games show that several state-of-the-art DRL algorithms with the non-crossing modification can significantly outperform their baselines in terms of faster convergence speed and better testing performance. In particular, our method can effectively recover the distribution information and thus dramatically increase the exploration efficiency when the reward space is extremely sparse.
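
Monotone quantile estimates can be enforced by construction: predict the lowest quantile plus non-negative increments. A minimal head in this spirit (architecture details are illustrative):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class NonCrossingQuantileHead(nn.Module):
    def __init__(self, dim, n_quantiles):
        super().__init__()
        self.base = nn.Linear(dim, 1)                  # lowest quantile
        self.deltas = nn.Linear(dim, n_quantiles - 1)  # gaps between quantiles

    def forward(self, h):
        base = self.base(h)
        inc = F.softplus(self.deltas(h))               # non-negative increments
        # Cumulative sums guarantee q_1 <= q_2 <= ... <= q_N (no crossing).
        return torch.cat([base, base + inc.cumsum(dim=-1)], dim=-1)
```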

NeurIPS Conference 2019 Conference Paper

Graph-Based Semi-Supervised Learning with Non-ignorable Non-response

  • Fan Zhou
  • Tengfei Li
  • Haibo Zhou
  • Hongtu Zhu
  • Ye Jieping

Graph-based semi-supervised learning is a very powerful tool for classification tasks, while most existing literature assumes that the labelled nodes are randomly sampled. When the labelling status depends on the unobserved node response, ignoring the missingness can lead to significant estimation bias and handicap the classifiers. This situation is called non-ignorable non-response. To solve the problem, we propose a Graph-based joint model with Non-ignorable Non-response (GNN), followed by a joint inverse weighting estimation procedure incorporated with a sampling imputation approach. Our method is proved to outperform some state-of-the-art models in both regression and classification problems, through simulations and real-data analysis on the Cora dataset.

JBHI Journal 2019 Journal Article

Identifying Brain Networks at Multiple Time Scales via Deep Recurrent Neural Network

  • Yan Cui
  • Shijie Zhao
  • Han Wang
  • Li Xie
  • Yaowu Chen
  • Junwei Han
  • Lei Guo
  • Fan Zhou

For decades, task functional magnetic resonance imaging (fMRI) has been a powerful noninvasive tool to explore the organizational architecture of human brain function. Researchers have developed a variety of brain network analysis methods for task fMRI data, including the general linear model, independent component analysis, and sparse representation methods. However, these shallow models are limited in faithful reconstruction and modeling of the hierarchical and temporal structures of brain networks, as demonstrated in more and more studies. Recently, recurrent neural networks (RNNs) have exhibited a great ability to model hierarchical and temporal dependence features in the machine learning field, which might be suitable for task fMRI data modeling. To explore such possible advantages of RNNs for task fMRI data, we propose a novel framework of a deep recurrent neural network (DRNN) to model the functional brain networks from task fMRI data. Experimental results on the motor task fMRI data of the Human Connectome Project 900-subject release demonstrate that the proposed DRNN can not only faithfully reconstruct functional brain networks, but also identify more meaningful brain networks with multiple time scales that are overlooked by traditional shallow models. In general, this work provides an effective and powerful approach to identifying functional brain networks at multiple time scales from task fMRI data.

JMLR Journal 2019 Journal Article

The Sup-norm Perturbation of HOSVD and Low Rank Tensor Denoising

  • Dong Xia
  • Fan Zhou

The higher order singular value decomposition (HOSVD) of tensors is a generalization of matrix SVD. The perturbation analysis of HOSVD under random noise is more delicate than its matrix counterpart. Recently, polynomial time algorithms have been proposed where statistically optimal estimates of the singular subspaces and the low rank tensors are attainable in the Euclidean norm. In this article, we analyze the sup-norm perturbation bounds of HOSVD and introduce estimators of the singular subspaces with sharp deviation bounds in the sup-norm. We also investigate a low rank tensor denoising estimator and demonstrate its fast convergence rate with respect to the entry-wise errors. The sup-norm perturbation bounds reveal unconventional phase transitions for statistical learning applications such as the exact clustering in high dimensional Gaussian mixture model and the exact support recovery in sub-tensor localizations. In addition, the bounds established for HOSVD also elaborate the one-sided sup-norm perturbation bounds for the singular subspaces of unbalanced (or fat) matrices.

IJCAI Conference 2018 Conference Paper

On the Convergence Properties of a K-step Averaging Stochastic Gradient Descent Algorithm for Nonconvex Optimization

  • Fan Zhou
  • Guojing Cong

We adopt and analyze a synchronous K-step averaging stochastic gradient descent algorithm, which we call K-AVG, for solving large-scale machine learning problems. We establish the convergence results of K-AVG for nonconvex objectives. Our analysis of K-AVG applies to many existing variants of synchronous SGD. We explain why the K-step delay is necessary and leads to better performance than traditional parallel stochastic gradient descent, which is equivalent to K-AVG with $K=1$. We also show that K-AVG scales better with the number of learners than asynchronous stochastic gradient descent (ASGD). Another advantage of K-AVG over ASGD is that it allows larger stepsizes and facilitates faster convergence. On a cluster of $128$ GPUs, K-AVG is faster than ASGD implementations and achieves better accuracies and faster convergence for training with the CIFAR-10 dataset.
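
K-AVG is easy to state as a toy, sequentially simulated loop (in practice the learners run in parallel; the noisy quadratic objective below is a stand-in for minibatch gradients):

```python
import numpy as np

def kavg(grad_fn, w0, n_learners=4, k=8, rounds=50, lr=0.1):
    w = np.array(w0, dtype=float)
    for _ in range(rounds):
        local_ws = []
        for p in range(n_learners):
            wp = w.copy()
            for _ in range(k):             # K local SGD steps per learner
                wp -= lr * grad_fn(wp, p)
            local_ws.append(wp)
        w = np.mean(local_ws, axis=0)      # synchronous averaging step
    return w

rng = np.random.default_rng(0)
# Noisy gradient of (w - 3)^2; the learner index p is unused in this toy.
grad = lambda w, p: 2.0 * (w - 3.0) + 0.1 * rng.standard_normal()
print(kavg(grad, [0.0]))   # converges near the optimum at 3.0
```

Setting k=1 recovers traditional parallel SGD, which is the comparison the analysis above makes precise.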

IJCAI Conference 2018 Conference Paper

Trajectory-User Linking via Variational AutoEncoder

  • Fan Zhou
  • Qiang Gao
  • Goce Trajcevski
  • Kunpeng Zhang
  • Ting Zhong
  • Fengli Zhang

Trajectory-User Linking (TUL) is an essential task in geo-tagged social media (GTSM) applications, enabling personalized Point of Interest (POI) recommendation and activity identification. Existing works on mining mobility patterns often model trajectories using Markov chains (MC) or recurrent neural networks (RNN), either assuming independence between non-adjacent locations or following a shallow generation process. However, most of them ignore the fact that human trajectories are often sparse, high-dimensional, and may contain embedded hierarchical structures. We tackle the TUL problem with a semi-supervised learning framework, called TULVAE (TUL via Variational AutoEncoder), which learns human mobility in a neural generative architecture with stochastic latent variables that span the hidden states of an RNN. TULVAE alleviates the data sparsity problem by leveraging large-scale unlabeled data, and represents the hierarchical and structural semantics of trajectories with high-dimensional latent variables. Our experiments demonstrate that TULVAE improves both efficiency and linking performance on real GTSM datasets in comparison to existing methods.
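As a rough sketch of the generative component only, a sequence VAE over POI tokens in PyTorch might look like the following; TULVAE's semi-supervised user classifier and hierarchical latent structure are omitted, and all module names and sizes are assumptions.

```python
import torch
import torch.nn as nn

class TrajectoryVAE(nn.Module):
    """Minimal sequence VAE over POI-token trajectories (sketch).

    GRU encoder -> Gaussian latent z -> GRU decoder. The actual
    TULVAE additionally classifies users from z in a
    semi-supervised fashion.
    """
    def __init__(self, n_pois, emb=64, hid=128, z_dim=32):
        super().__init__()
        self.emb = nn.Embedding(n_pois, emb)
        self.enc = nn.GRU(emb, hid, batch_first=True)
        self.mu = nn.Linear(hid, z_dim)
        self.logvar = nn.Linear(hid, z_dim)
        self.z2h = nn.Linear(z_dim, hid)
        self.dec = nn.GRU(emb, hid, batch_first=True)
        self.out = nn.Linear(hid, n_pois)

    def forward(self, traj):                 # (batch, seq) int POI ids
        x = self.emb(traj)
        _, h = self.enc(x)                   # h: (1, batch, hid)
        mu, logvar = self.mu(h[-1]), self.logvar(h[-1])
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparam.
        h0 = self.z2h(z).unsqueeze(0)
        # Teacher forcing (a full model would shift inputs by one step).
        dec_out, _ = self.dec(x, h0)
        return self.out(dec_out), mu, logvar

model = TrajectoryVAE(n_pois=1000)
traj = torch.randint(0, 1000, (16, 20))
logits, mu, logvar = model(traj)
recon = nn.functional.cross_entropy(logits.transpose(1, 2), traj)
kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
loss = recon + kl                            # negative ELBO
```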

AAMAS Conference 2017 Conference Paper

Dynamic Generalization Kanerva Coding in Reinforcement Learning for TCP Congestion Control Design

  • Wei Li
  • Fan Zhou
  • Waleed Meleis
  • Kaushik Chowdhury

Traditional reinforcement learning (RL) techniques often encounter limitations when solving large or continuous state-action spaces. Training times needed to explore the very large space are impractically long, and it can be difficult to generalize learned knowledge. A compact representation of the state space is usually generated to solve both problems. However, simple state abstraction often cannot achieve the desired learning quality, while expert state representations usually involve costly hand-crafted strategies. We propose a new technique, generalization-based Kanerva coding, that automatically generates and optimizes state abstractions for learning. When applied to adapting the congestion window of the highly complex TCP congestion control protocol, a standard Internet protocol, this technique outperforms the current standard, TCP New Reno, by 59.5% in throughput and 6.5% in delay. Our technique also achieves a 35.2% improvement in throughput over the best previously proposed Kanerva coding technique when applied in the same context.
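For readers unfamiliar with Kanerva coding, the following numpy sketch shows the basic mechanism the paper builds on: a state activates the prototypes within a fixed radius, and Q-values are linear in the resulting sparse binary feature vector. The paper's actual contribution, dynamically generating and optimizing the prototype set, is deliberately omitted, and the state features named in the usage comment are hypothetical.

```python
import numpy as np

class KanervaCoder:
    """Plain Kanerva coding: sparse binary features from prototype
    proximity, with a linear Q-value estimate per action."""
    def __init__(self, n_prototypes, state_dim, n_actions,
                 radius=0.5, seed=0):
        rng = np.random.default_rng(seed)
        self.prototypes = rng.uniform(0, 1, (n_prototypes, state_dim))
        self.theta = np.zeros((n_actions, n_prototypes))
        self.radius = radius

    def features(self, state):
        # A prototype is active if the state falls within its radius
        # (Chebyshev distance here; any proximity measure works).
        d = np.abs(self.prototypes - state).max(axis=1)
        return (d <= self.radius).astype(float)

    def q(self, state):
        return self.theta @ self.features(state)

    def update(self, state, action, target, alpha=0.1):
        # One semi-gradient step toward a TD target.
        phi = self.features(state)
        self.theta[action] += alpha * (target - self.theta[action] @ phi) * phi

coder = KanervaCoder(n_prototypes=200, state_dim=3, n_actions=2)
s = np.array([0.4, 0.7, 0.2])   # e.g. normalized cwnd, RTT, loss rate
coder.update(s, action=1, target=1.0)
print(coder.q(s))
```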

IJCAI Conference 2017 Conference Paper

Identifying Human Mobility via Trajectory Embeddings

  • Qiang Gao
  • Fan Zhou
  • Kunpeng Zhang
  • Goce Trajcevski
  • Xucheng Luo
  • Fengli Zhang

Understanding human trajectory patterns is an important task in many location-based social networks (LBSNs) applications, such as personalized recommendation and preference-based route planning. Most of the existing methods classify a trajectory (or its segments), based on spatio-temporal values and activities, into some predefined categories, e.g., walking or jogging. We tackle a novel trajectory classification problem: we identify and link trajectories to the users who generate them in LBSNs, a problem called Trajectory-User Linking (TUL). Solving the TUL problem is not trivial because: (1) the number of classes (i.e., users) is much larger than the number of motion patterns in common trajectory classification problems; and (2) location-based trajectory data, especially check-ins, are often extremely sparse. To address these challenges, we propose an RNN-based semi-supervised learning model, called TULER (TUL via Embedding and RNN), which exploits spatio-temporal data to capture the underlying semantics of user mobility patterns. Experiments conducted on real-world datasets demonstrate that TULER achieves better accuracy than existing methods.
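Stripped to its core, the TULER idea is a many-class sequence classifier: embed check-in tokens, run an RNN over the sequence, and map the final hidden state to user logits. A skeletal PyTorch version follows; names and sizes are illustrative assumptions, and the paper's embedding pretraining and semi-supervised elements are omitted.

```python
import torch
import torch.nn as nn

class TrajectoryUserLinker(nn.Module):
    """Skeleton of the TULER idea: POI embeddings -> RNN -> user class.

    Shows only the supervised linking path; the real model also
    pretrains embeddings and exploits unlabeled trajectories.
    """
    def __init__(self, n_pois, n_users, emb=64, hid=128):
        super().__init__()
        self.emb = nn.Embedding(n_pois, emb)
        self.rnn = nn.GRU(emb, hid, batch_first=True)
        self.cls = nn.Linear(hid, n_users)   # one class per user

    def forward(self, traj):                 # (batch, seq) POI ids
        _, h = self.rnn(self.emb(traj))      # h: (1, batch, hid)
        return self.cls(h[-1])               # user logits

model = TrajectoryUserLinker(n_pois=5000, n_users=300)
traj = torch.randint(0, 5000, (32, 15))      # 32 check-in sequences
users = torch.randint(0, 300, (32,))
loss = nn.functional.cross_entropy(model(traj), users)
loss.backward()
```

Note how challenge (1) from the abstract shows up directly: the classifier head has one output per user, so the class count scales with the user base rather than with a handful of motion categories.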