Arrow Research search

Author name cluster

Lei Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

64 papers
1 author row

Possible papers

64

AAAI Conference 2026 Conference Paper

Knowledge Graph Guided Heterogeneity-Informed Diffusion Model for Spatio-Temporal Generation

  • Zi'ang Wang
  • Lei Chen
  • Yuanchang Jin
  • Pan Deng
  • Shuangshuang Pang
  • Junting Liu
  • Yu Zhao

Spatio-temporal data generation aims to synthesize realistic urban data across graph nodes by learning spatial and temporal dependencies. This task plays a crucial role in urban planning by enabling the simulation of unobserved nodes. However, existing approaches face critical limitations: time series generation methods fail to generalize to unseen nodes, while spatio-temporal generative models are either restricted to the trajectory generation task or dependent on auxiliary data inputs. To bridge these gaps, we propose a Knowledge Graph Guided Heterogeneity-Informed Diffusion Model (KGDiff) in this paper through the following key innovations. First, we design a geometry-aware mixture of experts integrating Euclidean, hyperbolic, and hyperspherical representations to comprehensively encode urban structural knowledge. Next, we present a learnable meta spatio-temporal pattern module that normalizes node-specific heterogeneity before the generation process, and a conditional denoising process that progressively transforms random noise into realistic samples under structural guidance. Finally, extensive experiments across real-world urban datasets demonstrate that KGDiff achieves state-of-the-art performance in generating realistic urban spatio-temporal data.

AAAI Conference 2026 Conference Paper

Learning from Scoring Disagreements: Contrastive Error Mining for Efficient and Robust LLM-based Assessment

  • Lei Chen
  • Tengteng Cheng
  • BoYu Gao
  • Zitao Liu
  • Weiqi Luo

Automated grading of student responses still faces numerous challenges, particularly when dealing with complex and ambiguous answers. In particular, large models are prone to scoring bias when handling uncertain responses, and few-shot reasoning methods often lack stability, which limits their applicability in real educational scenarios. To tackle these challenges, we propose the Contrastive Error Mining and Fine-Tuning (CEM-FT) framework, which automatically identifies high-value hard samples by analyzing scoring disagreements between a fully fine-tuned model and a few-shot model. A lightweight LoRA adapter is then trained on these samples to refine model performance with minimal computational overhead. Experiments on the SciEntsbank, Beetle, and Mohler datasets show that CEM-FT improves QWK by up to 3.9% over the fine-tuned Qwen model on the SciEntsbank dataset, a significant improvement over the few-shot baseline. The proposed framework substantially enhances both scoring accuracy and consistency, providing a practical, robust solution for reliable automated assessment with large language models.
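The disagreement-based sample selection the abstract describes can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: the function name, score scale, and threshold value are all assumptions.

```python
def mine_hard_samples(ft_scores, fs_scores, threshold=1.0):
    """Return indices of samples where two graders disagree strongly.

    ft_scores: per-sample scores from a fully fine-tuned model.
    fs_scores: per-sample scores from a few-shot model.
    threshold: minimum absolute score gap to count a sample as "hard".
    """
    return [
        i
        for i, (a, b) in enumerate(zip(ft_scores, fs_scores))
        if abs(a - b) >= threshold
    ]

# Only the second sample shows a large scoring disagreement.
hard = mine_hard_samples([3.0, 2.0, 5.0], [3.0, 0.5, 4.5], threshold=1.0)
# hard == [1]
```

In the framework described above, such mined indices would then select the training set for the lightweight LoRA adapter.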

AAAI Conference 2026 Conference Paper

Token Painter: Training-Free Text-Guided Image Inpainting via Mask Autoregressive Models

  • Longtao Jiang
  • Jie Huang
  • Mingfei Han
  • Lei Chen
  • Yongqiang Yu
  • Feng Zhao
  • Xiaojun Chang
  • Zhihui Li

Text-guided image inpainting aims to inpaint masked image regions based on a textual prompt while preserving the background. Although diffusion-based methods have become dominant, their property of modeling the entire image in latent space makes it challenging for the results to align well with prompt details and maintain a consistent background. To address these issues, we explore Mask AutoRegressive (MAR) models for this task. MAR naturally supports image inpainting by generating latent tokens corresponding to mask regions, enabling better local controllability without altering the background. However, directly applying MAR to this task makes the inpainting content either ignore the prompts or be disharmonious with the background context. Through analysis of the attention maps of the inpainted images, we identify the impact of background tokens on text tokens during MAR generation, and leverage this to design Token Painter, a training-free text-guided image inpainting method based on MAR. Our approach introduces two key components: (1) Dual-Stream Encoder Information Fusion (DEIF), which fuses the semantic and context information from text and background in the frequency domain to produce novel guidance tokens, allowing MAR to generate text-faithful inpainting content while keeping it harmonious with the background context. (2) Adaptive Decoder Attention Score Enhancing (ADAE), which adaptively enhances attention scores on guidance tokens and inpainting tokens to further improve the alignment of prompt details and the visual quality of the content. Extensive experiments demonstrate that our training-free method outperforms prior state-of-the-art methods across almost all metrics.

TMLR Journal 2025 Journal Article

A Survey on Large Language Model Acceleration based on KV Cache Management

  • Haoyang Li
  • Yiming Li
  • Anxin Tian
  • Tianhao Tang
  • Zhanchao Xu
  • Xuejia Chen
  • Nicole HU
  • Wei Dong

Large Language Models (LLMs) have revolutionized a wide range of domains such as natural language processing, computer vision, and multi-modal tasks due to their ability to comprehend context and perform logical reasoning. However, the computational and memory demands of LLMs, particularly during inference, pose significant challenges when scaling them to real-world, long-context, and real-time applications. Key-Value (KV) cache management has emerged as a critical optimization technique for accelerating LLM inference by reducing redundant computations and improving memory utilization. This survey provides a comprehensive overview of KV cache management strategies for LLM acceleration, categorizing them into token-level, model-level, and system-level optimizations. Token-level strategies include KV cache selection, budget allocation, merging, quantization, and low-rank decomposition, while model-level optimizations focus on architectural innovations and attention mechanisms to enhance KV reuse. System-level approaches address memory management, scheduling, and hardware-aware designs to improve efficiency across diverse computing environments. Additionally, the survey provides an overview of both text and multimodal datasets and benchmarks used to evaluate these strategies. By presenting detailed taxonomies and comparative analyses, this work aims to offer useful insights for researchers and practitioners to support the development of efficient and scalable KV cache management techniques, contributing to the practical deployment of LLMs in real-world applications.

NeurIPS Conference 2025 Conference Paper

AttentionPredictor: Temporal Patterns Matter for KV Cache Compression

  • Qingyue Yang
  • Jie Wang
  • Xing Li
  • Zhihai Wang
  • Chen Chen
  • Lei Chen
  • Xianzhi Yu
  • Wulong Liu

With the development of large language models (LLMs), efficient inference through Key-Value (KV) cache compression has attracted considerable attention, especially for long-context generation. To compress the KV cache, recent methods identify critical KV tokens through static modeling of attention scores. However, these methods often struggle to accurately determine critical tokens as they neglect the *temporal patterns* in attention scores, resulting in a noticeable degradation in LLM performance. To address this challenge, we propose **AttentionPredictor**, which is the **first learning-based method to directly predict attention patterns for KV cache compression and critical token identification**. Specifically, AttentionPredictor learns a lightweight, unified convolution model to dynamically capture spatiotemporal patterns and predict the next-token attention scores. An appealing feature of AttentionPredictor is that it accurately predicts the attention score and shares the unified prediction model, which consumes negligible memory, among all transformer layers. Moreover, we propose a cross-token critical cache prefetching framework that hides the token estimation time overhead to accelerate the decoding stage. By retaining most of the attention information, AttentionPredictor achieves **13$\times$** KV cache compression and **5.6$\times$** speedup in a cache offloading scenario with comparable LLM performance, significantly outperforming state-of-the-art methods. The code is available at https://github.com/MIRALab-USTC/LLM-AttentionPredictor.
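The critical-token identification step that such predicted scores feed into can be illustrated with a minimal top-k sketch. This is a simplified illustration of budget-based KV selection in general, not the paper's implementation; the function name and budget parameter are invented for illustration.

```python
def select_critical_tokens(predicted_scores, budget):
    """Keep the indices of the `budget` tokens with the highest
    predicted attention scores; all other KV entries would be evicted
    (or offloaded) under the compression budget."""
    ranked = sorted(range(len(predicted_scores)),
                    key=lambda i: predicted_scores[i], reverse=True)
    # Return kept indices in positional order for cache lookup.
    return sorted(ranked[:budget])

keep = select_critical_tokens([0.05, 0.40, 0.10, 0.30, 0.15], budget=2)
# keep == [1, 3]
```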

NeurIPS Conference 2025 Conference Paper

FuXi-Ocean: A Global Ocean Forecasting System with Sub-Daily Resolution

  • Qiusheng Huang
  • Yuan Niu
  • Xiaohui Zhong
  • Lei Chen
  • dianjun zhang
  • Xuefeng Zhang
  • Hao Li

Accurate, high-resolution ocean forecasting is crucial for maritime operations and environmental monitoring. While traditional numerical models are capable of producing sub-daily, eddy-resolving forecasts, they are computationally intensive and face challenges in maintaining accuracy at fine spatial and temporal scales. In contrast, recent data-driven approaches offer improved computational efficiency and emerging potential, yet typically operate at daily resolution and struggle with sub-daily predictions due to error accumulation over time. We introduce FuXi-Ocean, the first data-driven global ocean forecasting model achieving six-hourly predictions at eddy-resolving 1/12° spatial resolution, reaching depths of up to 1500 meters. The model architecture integrates a context-aware feature extraction module with a predictive network employing stacked attention blocks. The core innovation is the Mixture-of-Time (MoT) module, which adaptively integrates predictions from multiple temporal contexts by learning variable-specific reliability, mitigating cumulative errors in sequential forecasting. Through comprehensive experimental evaluation, FuXi-Ocean demonstrates superior skill in predicting key variables, including temperature, salinity, and currents, across multiple depths.

NeurIPS Conference 2025 Conference Paper

GoRA: Gradient-driven Adaptive Low Rank Adaptation

  • Haonan He
  • Peng Ye
  • Yuchen Ren
  • Yuan Yuan
  • Lei Chen

Low-Rank Adaptation (LoRA) is a crucial method for efficiently fine-tuning large language models (LLMs), with its effectiveness influenced by two key factors: rank selection and weight initialization. While numerous LoRA variants have been proposed to improve performance by addressing one of these aspects, they often compromise usability or computational efficiency. In this paper, we analyze and identify the core limitations of existing approaches and propose a novel framework—GoRA (Gradient-driven Adaptive Low-Rank Adaptation)—that simultaneously adapts both the rank and initialization strategy within a unified framework. GoRA leverages gradient information during training to dynamically assign optimal ranks and initialize low-rank adapter weights in an adaptive manner. To our knowledge, GoRA is the first method that not only addresses the limitations of prior approaches—which often focus on either rank selection or initialization in isolation—but also unifies both aspects within a single framework, enabling more effective and efficient adaptation. Extensive experiments across various architectures and modalities show that GoRA consistently outperforms existing LoRA-based methods while preserving the efficiency of vanilla LoRA. For example, when fine-tuning Llama3.1-8B-Base for mathematical reasoning, GoRA achieves a 5.13-point improvement over standard LoRA and even outperforms full fine-tuning by 2.05 points under high-rank settings. Code is available at: https://github.com/hhnqqq/MyTransformers.
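For context, the standard LoRA reparameterization that such variants build on replaces a frozen weight W with W + (α/r)·BA, where B and A are low-rank factors. A minimal pure-Python sketch of that forward pass follows; shapes and names are illustrative, and this is the vanilla LoRA formula, not GoRA's adaptive rank/initialization scheme.

```python
def matmul(A, B):
    """Plain nested-list matrix multiply (no external dependencies)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def lora_forward(x, W, A, B, alpha, r):
    """y = x @ (W + (alpha / r) * B @ A) -- the vanilla LoRA update.

    W: frozen (d_in x d_out) weight; B: (d_in x r); A: (r x d_out).
    alpha / r is the standard LoRA scaling factor.
    """
    scale = alpha / r
    BA = matmul(B, A)  # low-rank delta, same shape as W
    W_eff = [[w + scale * d for w, d in zip(w_row, d_row)]
             for w_row, d_row in zip(W, BA)]
    return matmul([x], W_eff)[0]

# Identity W with a rank-1 adapter that adds 4 to the off-diagonal term.
y = lora_forward([1, 1], W=[[1, 0], [0, 1]],
                 A=[[0, 2]], B=[[1], [0]], alpha=2, r=1)
# y == [1.0, 5.0]
```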

NeurIPS Conference 2025 Conference Paper

Leader360V: A Large-scale, Real-world 360 Video Dataset for Multi-task Learning in Diverse Environment

  • Weiming Zhang
  • Dingwen Xiao
  • Aobotao DAI
  • Yexin Liu
  • Tianbo Pan
  • Shiqi Wen
  • Lei Chen
  • Lin Wang

360 video captures the complete surrounding scene with an ultra-large field of view of 360°×180°. This makes 360 scene understanding tasks, e.g., segmentation and tracking, crucial for applications such as autonomous driving and robotics. With the recent emergence of foundation models, the community is, however, impeded by the lack of large-scale, labelled real-world datasets. This is caused by the inherent spherical properties, e.g., severe distortion in polar regions and content discontinuities, which render annotation costly yet complex. This paper introduces Leader360V, the first large-scale (10K+) labeled real-world 360 video dataset for instance segmentation and tracking. Our dataset enjoys high scene diversity, ranging from indoor and urban settings to natural and dynamic outdoor scenes. To automate annotation, we design an automatic labeling pipeline, which subtly coordinates pre-trained 2D segmentors and large language models (LLMs) to facilitate the labeling. The pipeline operates in three novel stages. Specifically, in the Initial Annotation Phase, we introduce a Semantic- and Distortion-aware Refinement (SDR) module, which combines object mask proposals from multiple 2D segmentors with LLM-verified semantic labels. These are then converted into mask prompts to guide SAM2 in generating distortion-aware masks for subsequent frames. In the Auto-Refine Annotation Phase, missing or incomplete regions are corrected either by applying the SDR module again or by resolving the discontinuities near the horizontal borders. The Manual Revision Phase finally incorporates LLMs and human annotators to further refine and validate the annotations. Extensive user studies and evaluations demonstrate the effectiveness of our labeling pipeline. Meanwhile, experiments confirm that Leader360V significantly enhances model performance for 360 video segmentation and tracking, paving the way for more scalable 360 scene understanding.
We release our dataset and code at https://leader360v.github.io/Leader360V_HomePage/.

NeurIPS Conference 2025 Conference Paper

Optimizing Retrieval for RAG via Reinforcement Learning

  • Jiawei Zhou
  • Lei Chen

As retrieval-augmented generation (RAG) becomes more widespread, the role of retrieval is shifting from retrieving information for human browsing to retrieving context for AI reasoning. This shift creates more complex search environments, where relevance is difficult to pre-define. Existing retrievers rely on supervised fine-tuning (SFT) with human labels or synthetic data, resulting in static relevance that struggles to adapt to diverse RAG environments. To address this challenge, we propose R3, a Retrieval framework optimized for RAG through Reinforcement learning (RL). Specifically, we adopt an RL training paradigm that enables the retriever to explore and self-improve within given RAG environments, automating the learning process with minimal manual experimentation or tuning effort. Extensive experiments across diverse tasks demonstrate that R3 improves RAG performance by 5.2% over the original retriever and surpasses state-of-the-art retrievers by 4.9%, while achieving comparable results to LLM-augmented retrieval and RAG systems built on post-trained or instruction-tuned LLMs. It is both efficient and practical, requiring only 4 GPUs and completing training within a single day.

NeurIPS Conference 2024 Conference Paper

Benchmarking PtO and PnO Methods in the Predictive Combinatorial Optimization Regime

  • Haoyu Geng
  • Hang Ruan
  • Runzhong Wang
  • Yang Li
  • Yang Wang
  • Lei Chen
  • Junchi Yan

Predictive combinatorial optimization, where the parameters of combinatorial optimization (CO) are unknown at decision-making time, precisely models many real-world applications, including energy-cost-aware scheduling and budget allocation in advertising. Tackling such a problem usually involves a prediction model and a CO solver. These two modules are integrated into the predictive CO pipeline following one of two design principles: "Predict-then-Optimize (PtO)", which learns predictions by supervised training and subsequently solves the CO problem using the predicted coefficients, and "Predict-and-Optimize (PnO)", which directly optimizes towards the ultimate decision quality and claims to yield better decisions than traditional PtO approaches. However, a systematic benchmark of both approaches is lacking, covering the specific design choices at the module level as well as an evaluation dataset that spans representative real-world scenarios. To this end, we develop a modular framework to benchmark 11 existing PtO/PnO methods on 8 problems, including a new industrial dataset for combinatorial advertising that will be released. Our study shows that PnO approaches are better than PtO on 7 out of 8 benchmarks, but there is no silver bullet for the specific design choices of PnO. A comprehensive categorization of current approaches and integration of typical scenarios are provided under a unified benchmark. Therefore, this paper can serve as a comprehensive benchmark for future PnO approach development and also offer fast prototyping for application-focused development. The code is available at \url{https://github.com/Thinklab-SJTU/PredictiveCO-Benchmark}.

AAAI Conference 2024 Conference Paper

Compositional Generalization for Multi-Label Text Classification: A Data-Augmentation Approach

  • Yuyang Chai
  • Zhuang Li
  • Jiahui Liu
  • Lei Chen
  • Fei Li
  • Donghong Ji
  • Chong Teng

Despite significant advancements in multi-label text classification, the ability of existing models to generalize to novel and seldom-encountered complex concepts, which are compositions of elementary ones, remains underexplored. This research addresses this gap. By creating unique data splits across three benchmarks, we assess the compositional generalization ability of existing multi-label text classification models. Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training, leading to inferior performance on tests with these new combinations. To address this, we introduce a data augmentation method that leverages two innovative text generation models designed to enhance the classification models' capacity for compositional generalization. Our experiments show that this data augmentation approach significantly improves the compositional generalization capabilities of classification models on our benchmarks, with both generation models surpassing other text generation baselines. Our code is available at https://github.com/yychai74/LD-VAE.

NeurIPS Conference 2024 Conference Paper

CRAG - Comprehensive RAG Benchmark

  • Xiao Yang
  • Kai Sun
  • Hao Xin
  • Yushi Sun
  • Nikita Bhalla
  • Xiangsen Chen
  • Sajal Choudhary
  • Rongze D. Gui

Retrieval-Augmented Generation (RAG) has recently emerged as a promising solution to alleviate Large Language Models' (LLMs') lack of knowledge. Existing RAG datasets, however, do not adequately represent the diverse and dynamic nature of real-world Question Answering (QA) tasks. To bridge this gap, we introduce the Comprehensive RAG Benchmark (CRAG), a factual question answering benchmark of 4,409 question-answer pairs and mock APIs to simulate web and Knowledge Graph (KG) search. CRAG is designed to encapsulate a diverse array of questions across five domains and eight question categories, reflecting varied entity popularity from popular to long-tail, and temporal dynamisms ranging from years to seconds. Our evaluation on this benchmark highlights the gap to fully trustworthy QA. Whereas most advanced LLMs achieve $\le 34\%$ accuracy on CRAG, adding RAG in a straightforward manner improves accuracy only to 44%. State-of-the-art industry RAG solutions answer only 63% of questions without any hallucination. CRAG also reveals much lower accuracy in answering questions regarding facts with higher dynamism, lower popularity, or higher complexity, suggesting future research directions. The CRAG benchmark laid the groundwork for a KDD Cup 2024 challenge, which attracted thousands of participants and submissions. We commit to maintaining CRAG to serve research communities in advancing RAG solutions and general QA solutions. CRAG is available at https://github.com/facebookresearch/CRAG/.

AAAI Conference 2024 Conference Paper

EG-NAS: Neural Architecture Search with Fast Evolutionary Exploration

  • Zicheng Cai
  • Lei Chen
  • Peng Liu
  • Tongtao Ling
  • Yutao Lai

Differentiable Architecture Search (DARTS) has achieved a rapid search for excellent architectures by optimizing architecture parameters through gradient descent. However, this efficiency comes with a significant challenge: the risk of premature convergence to local optima, resulting in subpar performance that falls short of expectations. To address this issue, we propose a novel and effective method called Evolutionary Gradient-Based Neural Architecture Search (EG-NAS). Our approach combines the strengths of both gradient descent and evolutionary strategy, allowing for the exploration of various optimization directions during the architecture search process. To begin with, we continue to employ gradient descent for updating network parameters to ensure efficiency. Subsequently, to mitigate the risk of premature convergence, we introduce an evolutionary strategy with global search capabilities to optimize the architecture parameters. By leveraging the best of both worlds, our method strikes a balance between efficient exploration and exploitation of the search space. Moreover, we have redefined the fitness function to not only consider accuracy but also account for individual similarity. This inclusion enhances the diversity and accuracy of the optimized directions identified by the evolutionary strategy. Extensive experiments on various datasets and search spaces demonstrate that EG-NAS achieves highly competitive performance at significantly low search costs compared to state-of-the-art methods. The code is available at https://github.com/caicaicheng/EG-NAS.

AAAI Conference 2024 Conference Paper

Fine-Tuning Graph Neural Networks by Preserving Graph Generative Patterns

  • Yifei Sun
  • Qi Zhu
  • Yang Yang
  • Chunping Wang
  • Tianyu Fan
  • Jiajun Zhu
  • Lei Chen

Recently, the paradigm of pre-training and fine-tuning graph neural networks has been intensively studied and applied in a wide range of graph mining tasks. Its success is generally attributed to the structural consistency between pre-training and downstream datasets, which, however, does not hold in many real-world scenarios. Existing works have shown that the structural divergence between pre-training and downstream graphs significantly limits the transferability when using the vanilla fine-tuning strategy. This divergence leads to model overfitting on pre-training graphs and causes difficulties in capturing the structural properties of the downstream graphs. In this paper, we identify the fundamental cause of structural divergence as the discrepancy of generative patterns between the pre-training and downstream graphs. Furthermore, we propose G-Tuning to preserve the generative patterns of downstream graphs. Given a downstream graph G, the core idea is to tune the pre-trained GNN so that it can reconstruct the generative patterns of G, the graphon W. However, the exact reconstruction of a graphon is known to be computationally expensive. To overcome this challenge, we provide a theoretical analysis that establishes the existence of a set of alternative graphons called graphon bases for any given graphon. By utilizing a linear combination of these graphon bases, we can efficiently approximate W. This theoretical finding forms the basis of our model, as it enables effective learning of the graphon bases and their associated coefficients. Compared with existing algorithms, G-Tuning demonstrates consistent performance improvement in 7 in-domain and 7 out-of-domain transfer learning experiments.

AAAI Conference 2024 Conference Paper

Learning Multi-Scale Video-Text Correspondence for Weakly Supervised Temporal Article Grounding

  • Wenjia Geng
  • Yong Liu
  • Lei Chen
  • Sujia Wang
  • Jie Zhou
  • Yansong Tang

Weakly Supervised temporal Article Grounding (WSAG) is a challenging and practical task in video understanding. Specifically, given a video and a relevant article, whose sentences are at different semantic scales, WSAG aims to localize corresponding video segments for all “groundable” sentences. Compared to other grounding tasks, e.g., localizing one target segment with respect to a given sentence query, WSAG confronts an essential obstacle rooted in the intricate multi-scale information inherent within both textual and visual modalities. Existing methods overlook the modeling and alignment of such structured information present in multi-scale video segments and hierarchical textual content. To this end, we propose a Multi-Scale Video-Text Correspondence Learning (MVTCL) framework, which enhances the grounding performance in complex scenes by modeling multi-scale semantic correspondence both within and between modalities. Specifically, MVTCL initially aggregates video content spanning distinct temporal scales and leverages hierarchical textual relationships in both temporal and semantic dimensions via a semantic calibration module. Then, a multi-scale contrastive learning module is introduced to generate more discriminative representations by selecting typical contexts and performing inter-video contrastive learning. Through the multi-scale semantic calibration architecture and supervision design, our method achieves new state-of-the-art performance on existing WSAG benchmarks.

TIST Journal 2024 Journal Article

Mitigating Recommendation Biases via Group-Alignment and Global-Uniformity in Representation Learning

  • Miaomiao Cai
  • Min Hou
  • Lei Chen
  • Le Wu
  • Haoyue Bai
  • Yong Li
  • Meng Wang

Collaborative Filtering (CF) plays a crucial role in modern recommender systems, leveraging historical user-item interactions to provide personalized suggestions. However, CF-based methods often encounter biases due to imbalances in training data. As a result, CF-based methods tend to prioritize recommending popular items and perform unsatisfactorily for inactive users. Existing works address this issue by rebalancing training samples, reranking recommendation results, or making the modeling process robust to the bias. Despite their effectiveness, these approaches can compromise accuracy or be sensitive to weighting strategies, making them challenging to train. Therefore, exploring how to mitigate these biases remains an urgent need. In this article, we deeply analyze the causes and effects of the biases and propose a framework to alleviate biases in recommendation from the perspective of representation distribution, namely Group-Alignment and Global-Uniformity Enhanced Representation Learning for Debiasing Recommendation (AURL). Specifically, we identify two significant problems in the representation distribution of users and items, namely group-discrepancy and global-collapse. These two problems directly lead to biases in the recommendation results. To this end, we propose two simple but effective regularizers in the representation space, respectively named group-alignment and global-uniformity. The goal of group-alignment is to bring the representation distribution of long-tail entities closer to that of popular entities, while global-uniformity aims to preserve the information of entities as much as possible by evenly distributing representations. Our method directly optimizes both the group-alignment and global-uniformity regularization terms to mitigate recommendation biases. Please note that AURL applies to arbitrary CF-based recommendation backbones.
Extensive experiments on three real datasets and various recommendation backbones verify the superiority of our proposed framework. The results show that AURL not only outperforms existing debiasing models in mitigating biases but also improves recommendation performance to some extent.
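The two regularizers described above can be illustrated with standard formulations from the representation alignment/uniformity literature. This is a sketch of how such terms are commonly computed, not AURL's actual implementation; the choice of centroid distance for group-alignment and the Gaussian-potential uniformity term are assumptions.

```python
import math

def group_alignment(popular_embs, tail_embs):
    """Squared distance between the mean embeddings of the two groups;
    minimizing it pulls the long-tail distribution toward the popular one."""
    dim = len(popular_embs[0])
    mean = lambda group: [sum(v[k] for v in group) / len(group)
                          for k in range(dim)]
    m_pop, m_tail = mean(popular_embs), mean(tail_embs)
    return sum((a - b) ** 2 for a, b in zip(m_pop, m_tail))

def global_uniformity(embs, t=2.0):
    """Log of the mean pairwise Gaussian potential; lower values mean
    representations are spread more evenly (less collapsed)."""
    pairs = [(i, j) for i in range(len(embs)) for j in range(i + 1, len(embs))]
    potentials = [
        math.exp(-t * sum((a - b) ** 2 for a, b in zip(embs[i], embs[j])))
        for i, j in pairs
    ]
    return math.log(sum(potentials) / len(potentials))
```

A debiasing objective in this spirit would add both terms, suitably weighted, to the base recommendation loss.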

IS Journal 2024 Journal Article

Model Predictive Control-Based Value Estimation for Efficient Reinforcement Learning

  • Qizhen Wu
  • Kexin Liu
  • Lei Chen

Reinforcement learning (RL) suffers from limitations in real-world practice, primarily due to the number of required interactions with virtual environments. This poses a challenging problem because many learning methods cannot obtain even a locally optimal strategy within only a few attempts. We therefore design an improved RL method based on model predictive control that models the environment through a data-driven approach. Based on the learned environment model, it performs multistep prediction to estimate the value function and optimize the policy. The method demonstrates higher learning efficiency, faster convergence of strategies toward the local optimum, and a smaller sample capacity required by experience replay buffers. Experimental results, both on classic benchmarks and in a dynamic obstacle-avoidance scenario for an unmanned aerial vehicle, validate the proposed approach.

AAAI Conference 2024 Conference Paper

Towards Fair Graph Federated Learning via Incentive Mechanisms

  • Chenglu Pan
  • Jiarong Xu
  • Yue Yu
  • Ziqi Yang
  • Qingbiao Wu
  • Chunping Wang
  • Lei Chen
  • Yang Yang

Graph federated learning (FL) has emerged as a pivotal paradigm enabling multiple agents to collaboratively train a graph model while preserving local data privacy. Yet, current efforts overlook a key issue: agents are self-interested and would hesitate to share data without fair and satisfactory incentives. This paper is the first endeavor to address this issue by studying the incentive mechanism for graph federated learning. We identify a unique phenomenon in graph federated learning: the presence of agents posing potential harm to the federation and agents contributing with delays. This stands in contrast to previous FL incentive mechanisms that assume all agents contribute positively and in a timely manner. In view of this, this paper presents a novel incentive mechanism tailored for fair graph federated learning, integrating incentives derived from both model gradient and payoff. To achieve this, we first introduce an agent valuation function aimed at quantifying agent contributions through the introduction of two criteria: gradient alignment and graph diversity. Moreover, due to the high heterogeneity in graph federated learning, striking a balance between accuracy and fairness becomes particularly crucial. We introduce motif prototypes to enhance accuracy, communicated between the server and agents, enhancing global model aggregation and aiding agents in local model optimization. Extensive experiments show that our model achieves the best trade-off between accuracy and the fairness of model gradient, as well as superior payoff fairness.

NeurIPS Conference 2024 Conference Paper

Towards Next-Generation Logic Synthesis: A Scalable Neural Circuit Generation Framework

  • Zhihai Wang
  • Jie Wang
  • Qingyue Yang
  • Yinqi Bai
  • Xing Li
  • Lei Chen
  • Jianye Hao
  • Mingxuan Yuan

Logic Synthesis (LS) aims to generate an optimized logic circuit satisfying a given functionality, which generally consists of circuit translation and optimization. It is a challenging and fundamental combinatorial optimization problem in integrated circuit design. Traditional LS approaches rely on manually designed heuristics to tackle the LS task, while machine learning recently offers a promising approach towards next-generation logic synthesis through neural circuit generation and optimization. In this paper, we first revisit the application of differentiable neural architecture search (DNAS) methods to circuit generation and find from extensive experiments that existing DNAS methods struggle to exactly generate circuits, scale poorly to large circuits, and exhibit high sensitivity to hyper-parameters. Then we provide three major insights for these challenges from extensive empirical analysis: 1) DNAS tends to overfit to too many skip-connections, consequently wasting a significant portion of the network's expressive capabilities; 2) DNAS suffers from the structure bias between the network architecture and the circuit's inherent structure, leading to inefficient search; 3) the learning difficulty of different input-output examples varies significantly, leading to severely imbalanced learning. To address these challenges in a systematic way, we propose a novel regularized triangle-shaped circuit network generation framework, which leverages our key insights for completely accurate and scalable circuit generation. Furthermore, we propose an evolutionary algorithm assisted by a reinforcement learning agent restarting technique for efficient and effective neural circuit optimization. Extensive experiments on four different circuit benchmarks demonstrate that our method can precisely generate circuits with up to 1200 nodes.
Moreover, our synthesized circuits significantly outperform the state-of-the-art results from several competitive winners in IWLS 2022 and 2023 competitions.

NeurIPS Conference 2023 Conference Paper

Disentangling Cognitive Diagnosis with Limited Exercise Labels

  • Xiangzhi Chen
  • Le Wu
  • Fei Liu
  • Lei Chen
  • Kun Zhang
  • Richang Hong
  • Meng Wang

Cognitive diagnosis is an important task in intelligence education, which aims at measuring students’ proficiency in specific knowledge concepts. Given a fully labeled exercise-concept matrix, most existing models focus on mining students' response records for cognitive diagnosis. Despite their success, due to the huge cost of labeling exercises, a more practical scenario is that limited exercises are labeled with concepts. Performing cognitive diagnosis with limited exercise labels is under-explored and remains largely open. In this paper, we propose Disentanglement based Cognitive Diagnosis (DCD) to address the challenges of limited exercise labels. Specifically, we utilize students' response records to model student proficiency, exercise difficulty and exercise label distribution. Then, we introduce two novel modules - group-based disentanglement and limited-labeled alignment modules - to disentangle the factors relevant to concepts and align them with real limited labels. Particularly, we introduce the tree-like structure of concepts with negligible cost for group-based disentangling, as concepts of different levels exhibit different independence relationships. Extensive experiments on widely used benchmarks demonstrate the superiority of our proposed model.

AAAI Conference 2023 Conference Paper

Flora: Dual-Frequency LOss-Compensated ReAl-Time Monocular 3D Video Reconstruction

  • Likang Wang
  • Yue Gong
  • Qirui Wang
  • Kaixuan Zhou
  • Lei Chen

In this work, we propose a real-time monocular 3D video reconstruction approach named Flora for reconstructing delicate and complete 3D scenes from RGB video sequences in an end-to-end manner. Specifically, we introduce a novel method with two main contributions. Firstly, the proposed feature aggregation module retains both color and reliability in a dual-frequency form. Secondly, the loss compensation module solves missing structure by correcting losses for falsely pruned voxels. The dual-frequency feature aggregation module enhances reconstruction quality in both precision and recall, and the loss compensation module benefits the recall. Notably, both proposed contributions achieve great results with negligible inferencing overhead. Our state-of-the-art experimental results on real-world datasets demonstrate Flora's leading performance in both effectiveness and efficiency. The code is available at https://github.com/NoOneUST/Flora.

NeurIPS Conference 2023 Conference Paper

Joint Feature and Differentiable $k$-NN Graph Learning using Dirichlet Energy

  • Lei Xu
  • Lei Chen
  • Rong Wang
  • Feiping Nie
  • Xuelong Li

Feature selection (FS) plays an important role in machine learning, which extracts important features and accelerates the learning process. In this paper, we propose a deep FS method that simultaneously conducts feature selection and differentiable $k$-NN graph learning based on the Dirichlet Energy. The Dirichlet Energy identifies important features by measuring their smoothness on the graph structure, and facilitates the learning of a new graph that reflects the inherent structure in the new feature subspace. We employ Optimal Transport theory to address the non-differentiability issue of learning $k$-NN graphs in neural networks, which theoretically makes our method applicable to other graph neural networks for dynamic graph learning. Furthermore, the proposed framework is interpretable, since all modules are designed algorithmically. We validate the effectiveness of our model with extensive experiments on both synthetic and real-world datasets.
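The smoothness criterion the abstract describes can be illustrated with a minimal numpy sketch (not the paper's implementation; the graph, feature vectors, and `dirichlet_energy` helper are invented for illustration):

```python
import numpy as np

def dirichlet_energy(adj, f):
    """f^T L f with the unnormalized Laplacian L = D - A; on an undirected
    graph this equals the sum of (f_i - f_j)^2 over edges, so features that
    vary slowly across the graph score lower (i.e., are "smoother")."""
    lap = np.diag(adj.sum(axis=1)) - adj
    return float(f @ lap @ f)

# Toy 4-node path graph 0-1-2-3.
adj = np.array([[0., 1., 0., 0.],
                [1., 0., 1., 0.],
                [0., 1., 0., 1.],
                [0., 0., 1., 0.]])
smooth = np.array([1.0, 1.1, 1.2, 1.3])   # varies slowly along the path
noisy = np.array([1.0, -1.0, 1.0, -1.0])  # flips sign across every edge
e_smooth = dirichlet_energy(adj, smooth)  # ≈ 0.03
e_noisy = dirichlet_energy(adj, noisy)    # 12.0
```

A feature-selection score would rank `smooth` above `noisy` here; the paper additionally learns the $k$-NN graph itself, which this sketch does not attempt.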

AAAI Conference 2023 Conference Paper

Loan Fraud Users Detection in Online Lending Leveraging Multiple Data Views

  • Sha Zhao
  • Yongrui Huang
  • Ling Chen
  • Chunping Wang
  • Shijian Li
  • Lei Chen
  • Gang Pan

In recent years, online lending platforms have become attractive for micro-financing and popular in financial industries. However, such online lending platforms face a high risk of failure due to the lack of expertise on borrowers' creditworthiness. Thus, risk forecasting is important to avoid economic loss. Detecting loan fraud users in advance is at the heart of risk forecasting. The purpose of fraud user (borrower) detection is to predict whether one user will fail to make required payments in the future. Detecting fraud users depends on historical loan records. However, a large proportion of users lack such information, especially new users. In this paper, we attempt to detect loan fraud users from cross-domain heterogeneous data views, including user attributes, installed app lists, app installation behaviors, and app-in logs, which compensate for the lack of historical loan records. However, it is difficult to effectively fuse the multiple heterogeneous data views. Moreover, some samples miss one or even more data views, increasing the difficulty of fusion. To address the challenges, we propose a novel end-to-end deep multiview learning approach, which encodes heterogeneous data views into homogeneous ones, generates the missing views based on the learned relationship among all the views, and then fuses all the views together into a comprehensive view for identifying fraud users. Our model is evaluated on a real-world large-scale dataset consisting of 401,978 loan records of 228,117 users from January 1, 2019, to September 30, 2019, achieving the state-of-the-art performance.

IJCAI Conference 2023 Conference Paper

Sancus: Staleness-Aware Communication-Avoiding Full-Graph Decentralized Training in Large-Scale Graph Neural Networks (Extended Abstract)

  • Jingshu Peng
  • Zhao Chen
  • Yingxia Shao
  • Yanyan Shen
  • Lei Chen
  • Jiannong Cao

Graph neural networks (GNNs) have emerged due to their success at modeling graph data. Yet, it is challenging for GNNs to efficiently scale to large graphs. Thus, distributed GNNs come into play. To avoid communication caused by expensive data movement between workers, we propose SANCUS, a staleness-aware communication-avoiding decentralized GNN system. By introducing a set of novel bounded embedding staleness metrics and adaptively skipping broadcasts, SANCUS abstracts decentralized GNN processing as sequential matrix multiplication and uses historical embeddings via cache. Theoretically, we show bounded approximation errors of embeddings and gradients with a convergence guarantee. Empirically, we evaluate SANCUS with common GNN models via different system setups on large-scale benchmark datasets. Compared to SOTA works, SANCUS can avoid up to 74% communication with at least 1.86× faster throughput on average without accuracy loss.
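The broadcast-skipping idea can be sketched in a few lines of numpy (an illustrative reading of the staleness criterion, not SANCUS's actual metric; `should_broadcast` and the threshold are invented):

```python
import numpy as np

def should_broadcast(h_new, h_cached, eps):
    """Re-broadcast an embedding only when its relative drift from the
    cached copy exceeds the staleness bound eps; otherwise peers keep
    reading the (bounded-stale) historical embedding from cache."""
    drift = np.linalg.norm(h_new - h_cached) / (np.linalg.norm(h_cached) + 1e-12)
    return bool(drift > eps)

h_cached = np.ones(8)
small = should_broadcast(h_cached + 1e-3, h_cached, eps=0.05)  # False: reuse cache
large = should_broadcast(h_cached + 1.0, h_cached, eps=0.05)   # True: re-broadcast
```

Skipped broadcasts are where the communication savings come from; the paper's contribution is bounding the resulting approximation error.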

AAAI Conference 2023 Conference Paper

SwinRDM: Integrate SwinRNN with Diffusion Model towards High-Resolution and High-Quality Weather Forecasting

  • Lei Chen
  • Fei Du
  • Yuan Hu
  • Zhibin Wang
  • Fan Wang

Data-driven medium-range weather forecasting has attracted much attention in recent years. However, forecasting accuracy at high resolution is currently unsatisfactory. Pursuing high-resolution and high-quality weather forecasting, we develop a data-driven model SwinRDM which integrates an improved version of SwinRNN with a diffusion model. SwinRDM performs predictions at 0.25-degree resolution and achieves superior forecasting accuracy to IFS (Integrated Forecast System), the state-of-the-art operational NWP model, on representative atmospheric variables including 500 hPa geopotential (Z500), 850 hPa temperature (T850), 2-m temperature (T2M), and total precipitation (TP), at lead times of up to 5 days. We propose to leverage a two-step strategy to achieve high-resolution predictions at 0.25-degree considering the trade-off between computation memory and forecasting accuracy. Recurrent predictions for future atmospheric fields are first performed at 1.40625-degree resolution, and then a diffusion-based super-resolution model is leveraged to recover the high spatial resolution and finer-scale atmospheric details. SwinRDM pushes forward the performance and potential of data-driven models by a large margin towards operational applications.

NeurIPS Conference 2023 Conference Paper

Universal Prompt Tuning for Graph Neural Networks

  • Taoran Fang
  • Yunchao Zhang
  • Yang Yang
  • Chunping Wang
  • Lei Chen

In recent years, prompt tuning has sparked a research surge in adapting pre-trained models. Unlike the unified pre-training strategy employed in the language field, the graph field exhibits diverse pre-training strategies, posing challenges in designing appropriate prompt-based tuning methods for graph neural networks. While some pioneering work has devised specialized prompting functions for models that employ edge prediction as their pre-training tasks, these methods are limited to specific pre-trained GNN models and lack broader applicability. In this paper, we introduce a universal prompt-based tuning method called Graph Prompt Feature (GPF) for pre-trained GNN models under any pre-training strategy. GPF operates on the input graph's feature space and can theoretically achieve an equivalent effect to any form of prompting function. Consequently, we no longer need to illustrate the prompting function corresponding to each pre-training strategy explicitly. Instead, we employ GPF to obtain the prompted graph for the downstream task in an adaptive manner. We provide rigorous derivations to demonstrate the universality of GPF and guarantee its effectiveness. The experimental results under various pre-training strategies indicate that our method performs better than fine-tuning, with an average improvement of about 1.4% in full-shot scenarios and about 3.2% in few-shot scenarios. Moreover, our method significantly outperforms existing specialized prompt-based tuning methods when applied to models utilizing the pre-training strategy they specialize in. These numerous advantages position our method as a compelling alternative to fine-tuning for downstream adaptations.
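Operating "on the input graph's feature space" amounts to adding a learnable prompt vector to every node's features before the frozen GNN sees them. A minimal numpy sketch of that step (not the paper's code; sizes and names are invented, and in practice `p` is trained by backpropagation through the frozen model):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))   # input node features of a 5-node graph
p = rng.normal(size=4) * 0.1  # shared prompt vector (learnable in practice)

# The same prompt is broadcast to every node; graph structure is untouched.
X_prompted = X + p
```

The prompted features `X_prompted` then replace `X` as input to the pre-trained GNN, regardless of which pre-training strategy produced it.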

IJCAI Conference 2022 Conference Paper

A Universal PINNs Method for Solving Partial Differential Equations with a Point Source

  • Xiang Huang
  • Hongsheng Liu
  • Beiji Shi
  • Zidong Wang
  • Kang Yang
  • Yang Li
  • Min Wang
  • Haotian Chu

In recent years, deep learning technology has been used to solve partial differential equations (PDEs), among which the physics-informed neural networks (PINNs) method has emerged as a promising approach for solving both forward and inverse PDE problems. PDEs with a point source, expressed as a Dirac delta function in the governing equations, are mathematical models of many physical processes. However, they cannot be solved directly by the conventional PINNs method due to the singularity brought by the Dirac delta function. In this paper, we propose a universal solution to tackle this problem with three novel techniques. First, the Dirac delta function is modeled as a continuous probability density function to eliminate the singularity at the point source; second, a lower-bound-constrained uncertainty weighting algorithm is proposed to balance the physics-informed loss terms of the point source area and the remaining areas; and third, a multi-scale deep neural network with a periodic activation function is used to improve accuracy and convergence speed. We evaluate the proposed method on three representative PDEs, and the experimental results show that our method outperforms existing deep learning based methods in accuracy, efficiency, and versatility.
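The first technique - replacing the Dirac delta with a continuous density - can be illustrated with a narrow Gaussian (a standard choice; this sketch is not the paper's code, and the width `sigma` is an invented hyperparameter):

```python
import numpy as np

def smoothed_delta(x, x0, sigma):
    """Gaussian probability density standing in for the Dirac delta at x0:
    finite everywhere, but with unit total mass like the delta it replaces."""
    return np.exp(-(x - x0) ** 2 / (2 * sigma ** 2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-1.0, 1.0, 20001)
dx = x[1] - x[0]
vals = smoothed_delta(x, 0.0, 0.05)     # smooth, so a PINN residual can use it
mass = float(vals.sum() * dx)           # numerically ≈ 1, as a delta requires
```

The smoothed source can be evaluated at PINN collocation points without the singularity that breaks the residual loss at `x0`.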

IJCAI Conference 2022 Conference Paper

Beyond Homophily: Structure-aware Path Aggregation Graph Neural Network

  • Yifei Sun
  • Haoran Deng
  • Yang Yang
  • Chunping Wang
  • Jiarong Xu
  • Renhong Huang
  • Linfeng Cao
  • Yang Wang

Graph neural networks (GNNs) have been intensively studied in various real-world tasks. However, the homophily assumption of GNNs' aggregation function limits their representation learning ability in heterophily graphs. In this paper, we shed light on the path-level patterns in graphs that can explicitly reflect rich semantic and structural information. We therefore propose a novel Structure-aware Path Aggregation Graph Neural Network (PathNet) aiming to generalize GNNs for both homophily and heterophily graphs. Specifically, we first introduce a maximal entropy path sampler, which helps us sample a number of paths containing structural context. Then, we introduce a structure-aware recurrent cell consisting of order-preserving and distance-aware components to learn the semantic information of neighborhoods. Finally, we model the preference of different paths for the target node after path encoding. Experimental results demonstrate that our model achieves superior performance in node classification on both heterophily and homophily graphs.

NeurIPS Conference 2022 Conference Paper

DGraph: A Large-Scale Financial Dataset for Graph Anomaly Detection

  • Xuanwen Huang
  • Yang Yang
  • Yang Wang
  • Chunping Wang
  • Zhisheng Zhang
  • Jiarong Xu
  • Lei Chen
  • Michalis Vazirgiannis

Graph Anomaly Detection (GAD) has recently become a hot research spot due to its practicability and theoretical value. Since GAD emphasizes the application and the rarity of anomalous samples, enriching the varieties of its datasets is fundamental. Thus, this paper presents DGraph, a real-world dynamic graph in the finance domain. DGraph overcomes many limitations of current GAD datasets. It contains about 3M nodes, 4M dynamic edges, and 1M ground-truth nodes. We provide a comprehensive observation of DGraph, revealing that anomalous nodes and normal nodes generally have different structures, neighbor distributions, and temporal dynamics. Moreover, it suggests that 2M background nodes are also essential for detecting fraudsters. Furthermore, we conduct extensive experiments on DGraph. Observations and experiments demonstrate that DGraph can advance GAD research and enable in-depth exploration of anomalous nodes.

NeurIPS Conference 2022 Conference Paper

Meta-Auto-Decoder for Solving Parametric Partial Differential Equations

  • Xiang Huang
  • Zhanhong Ye
  • Hongsheng Liu
  • Shi Ji
  • Zidong Wang
  • Kang Yang
  • Yang Li
  • Min Wang

Many important problems in science and engineering require solving the so-called parametric partial differential equations (PDEs), i.e., PDEs with different physical parameters, boundary conditions, shapes of computation domains, etc. Recently, building learning-based numerical solvers for parametric PDEs has become an emerging new field. One category of methods such as the Deep Galerkin Method (DGM) and Physics-Informed Neural Networks (PINNs) aim to approximate the solution of the PDEs. They are typically unsupervised and mesh-free, but require going through the time-consuming network training process from scratch for each set of parameters of the PDE. Another category of methods such as Fourier Neural Operator (FNO) and Deep Operator Network (DeepONet) try to approximate the solution mapping directly. Being fast with only one forward inference for each PDE parameter without retraining, they often require a large corpus of paired input-output observations drawn from numerical simulations, and most of them need a predefined mesh as well. In this paper, we propose Meta-Auto-Decoder (MAD), a mesh-free and unsupervised deep learning method that enables the pre-trained model to be quickly adapted to equation instances by implicitly encoding (possibly heterogeneous) PDE parameters as latent vectors. The proposed method MAD can be interpreted by manifold learning in infinite-dimensional spaces, granting it a geometric insight. Extensive numerical experiments show that the MAD method converges faster than other deep learning-based methods without losing accuracy.

NeurIPS Conference 2022 Conference Paper

Revisiting Injective Attacks on Recommender Systems

  • Haoyang Li
  • Shimin Di
  • Lei Chen

Recent studies have demonstrated that recommender systems (RecSys) are vulnerable to injective attacks. Given a limited fake user budget, attackers can inject fake users with carefully designed behaviors into the open platforms, making RecSys recommend a target item to more real users for profits. In this paper, we first revisit existing attackers and reveal that they suffer from the difficulty-agnostic and diversity-deficit issues. Existing attackers concentrate their efforts on difficult users who have low tendencies toward the target item, thus reducing their effectiveness. Moreover, they are incapable of affecting the target RecSys to recommend the target item to real users in a diverse manner, because their generated fake user behaviors are dominated by large communities. To alleviate these two issues, we propose a difficulty and diversity aware attacker, namely DADA. We design the difficulty-aware and diversity-aware objectives to enable easy users from various communities to contribute more weights when optimizing attackers. By incorporating these two objectives, the proposed attacker DADA can concentrate on easy users while also affecting a broader range of real users simultaneously, thereby boosting the effectiveness. Extensive experiments on three real-world datasets demonstrate the effectiveness of our proposed attacker.

IS Journal 2022 Journal Article

Robust Precipitation Bias Correction Through an Ordinal Distribution Autoencoder

  • Youcheng Luo
  • Xiaoyang Xu
  • Yiqun Liu
  • Hanqing Chao
  • Hai Chu
  • Lei Chen
  • Junping Zhang
  • Leiming Ma

Numerical precipitation prediction plays a crucial role in weather forecasting and has broad applications in public services including aviation management and urban disaster early-warning systems. However, numerical weather prediction (NWP) models are often constrained by a systematic bias due to coarse spatial resolution, lack of parameterizations, and limitations of observation and conventional meteorological models, including constrained sample size and long-tail distribution. To address these issues, we present a data-driven deep learning model, named the ordinal distribution autoencoder (ODA), which principally includes a precipitation confidence network and a combinatorial network that contains two blocks, i.e., a denoising autoencoder block and an ordinal distribution regression block. As an expert-free model for bias correction of precipitation, it can effectively correct numerical precipitation prediction based on meteorological data from the European Centre for Medium-Range Weather Forecasts (ECMWF) and SMS-WARMS, an NWP model used in East China. Experiments in the two NWP models demonstrate that, compared with several classical machine-learning algorithms and deep learning models, our proposed ODA generally performs better in bias correction.

NeurIPS Conference 2022 Conference Paper

Tenrec: A Large-scale Multipurpose Benchmark Dataset for Recommender Systems

  • Guanghu Yuan
  • Fajie Yuan
  • Yudong Li
  • Beibei Kong
  • Shujie Li
  • Lei Chen
  • Min Yang
  • Chenyun Yu

Existing benchmark datasets for recommender systems (RS) either are created at a small scale or involve very limited forms of user feedback. RS models evaluated on such datasets often lack practical values for large-scale real-world applications. In this paper, we describe Tenrec, a novel and publicly available data collection for RS that records various user feedback from four different recommendation scenarios. To be specific, Tenrec has the following five characteristics: (1) it is large-scale, containing around 5 million users and 140 million interactions; (2) it has not only positive user feedback, but also true negative feedback (vs. one-class recommendation); (3) it contains overlapped users and items across four different scenarios; (4) it contains various types of user positive feedback, in forms of clicking, liking, sharing, and following, etc; (5) it contains additional features beyond the user IDs and item IDs. We verify Tenrec on ten diverse recommendation tasks by running several classical baseline models per task. Tenrec has the potential to become a useful benchmark dataset for a majority of popular recommendation tasks. Our source codes and datasets will be included in supplementary materials.

IJCAI Conference 2022 Conference Paper

Uncertainty-Aware Representation Learning for Action Segmentation

  • Lei Chen
  • Muheng Li
  • Yueqi Duan
  • Jie Zhou
  • Jiwen Lu

In this paper, we propose an Uncertainty-Aware Representation Learning (UARL) method for action segmentation. Most existing action segmentation methods exploit continuity information of the action period to predict frame-level labels, which ignores the temporal ambiguity of the transition region between two actions. Moreover, similar periods of different actions, e.g., the beginning of some actions, will confuse the network if they are annotated with different labels, which causes spatial ambiguity. To address this, we design UARL to exploit the transitional expression between two action periods by uncertainty learning. Specifically, we model every frame of actions with an active distribution that represents the probabilities of different actions, which captures the uncertainty of the action and exploits the tendency during the action. We evaluate our method on three popular action prediction datasets: Breakfast, Georgia Tech Egocentric Activities (GTEA), and 50Salads. The experimental results demonstrate that our method achieves state-of-the-art performance.

AAAI Conference 2021 Conference Paper

A User-Adaptive Layer Selection Framework for Very Deep Sequential Recommender Models

  • Lei Chen
  • Fajie Yuan
  • Jiaxi Yang
  • Xiang Ao
  • Chengming Li
  • Min Yang

Sequential recommender systems (SRS) have become a research hotspot in recent studies. Because of the requirement of capturing users’ dynamic interests, sequential neural network based recommender models often need to be stacked with more hidden layers (e.g., up to 100 layers) compared with standard collaborative filtering methods. However, the high network latency has become the main obstacle when deploying very deep recommender models into a production environment. In this paper, we argue that the typical prediction framework that treats all users equally during the inference phase is inefficient in running time, as well as sub-optimal in accuracy. To resolve such an issue, we present SkipRec, an adaptive inference framework that learns to skip inactive hidden layers on a per-user basis. Specifically, we devise a policy network to automatically determine which layers should be retained and which layers are allowed to be skipped, so as to achieve user-specific decisions. To derive the optimal skipping policy, we propose using Gumbel-Softmax and reinforcement learning to solve the non-differentiable problem during backpropagation. We perform extensive experiments on three real-world recommendation datasets, and demonstrate that SkipRec attains comparable or better accuracy with much less inference time.
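The Gumbel-Softmax trick the abstract mentions turns a discrete keep/skip choice into a sampled, softmax-shaped vector. A minimal numpy sketch of the sampling step (illustrative only, not SkipRec's policy network; the logits and temperature are invented, and the paper uses this inside backpropagation):

```python
import numpy as np

def gumbel_softmax(logits, tau, rng):
    """Sample a relaxed one-hot vector: add Gumbel(0, 1) noise to the
    logits, divide by temperature tau, and apply a stable softmax.
    Lower tau pushes the output closer to a hard one-hot decision."""
    g = -np.log(-np.log(rng.uniform(size=logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + g) / tau
    y = np.exp(y - y.max())
    return y / y.sum()

rng = np.random.default_rng(0)
logits = np.array([2.0, 0.0])              # per-layer "keep" vs "skip" scores
probs = gumbel_softmax(logits, tau=0.5, rng=rng)
decision = int(np.argmax(probs))           # 0 = run the layer, 1 = skip it
```

Because the relaxed sample is a differentiable function of the logits, gradients can flow to the policy network even though the deployed decision is discrete.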

NeurIPS Conference 2021 Conference Paper

AutoGEL: An Automated Graph Neural Network with Explicit Link Information

  • Zhili Wang
  • Shimin Di
  • Lei Chen

Recently, Graph Neural Networks (GNNs) have gained popularity in a variety of real-world scenarios. Despite the great success, the architecture design of GNNs heavily relies on manual labor. Thus, automated graph neural network (AutoGNN) has attracted interest and attention from the research community, which has made significant performance improvements in recent years. However, existing AutoGNN works mainly adopt an implicit way to model and leverage the link information in the graphs, which is not well regularized to the link prediction task on graphs, and limits the performance of AutoGNN for other graph tasks. In this paper, we present a novel AutoGNN work that explicitly models the link information, abbreviated as AutoGEL. In such a way, AutoGEL can handle the link prediction task and improve the performance of AutoGNNs on the node classification and graph classification tasks. Moreover, AutoGEL proposes a novel search space containing various design dimensions at both intra-layer and inter-layer designs and adopts a more robust differentiable search algorithm to further improve efficiency and effectiveness. Experimental results on benchmark data sets demonstrate the superiority of AutoGEL on several tasks.

NeurIPS Conference 2021 Conference Paper

MixSeq: Connecting Macroscopic Time Series Forecasting with Microscopic Time Series Data

  • Zhibo Zhu
  • Ziqi Liu
  • Ge Jin
  • Zhiqiang Zhang
  • Lei Chen
  • Jun Zhou
  • Jianyong Zhou

Time series forecasting is widely used in business intelligence, e.g., to forecast stock market prices and sales, and to help analyze data trends. Most time series of interest are macroscopic time series that are aggregated from microscopic data. However, instead of directly modeling the macroscopic time series, little prior work has studied forecasting macroscopic time series by leveraging data on the microscopic level. In this paper, we assume that the microscopic time series follow some unknown mixture of probabilistic distributions. We theoretically show that as we identify the ground-truth latent mixture components, the estimation of time series from each component could be improved because of lower variance, thus benefitting the estimation of macroscopic time series as well. Inspired by the power of Seq2seq and its variants on the modeling of time series data, we propose Mixture of Seq2seq (MixSeq), an end-to-end mixture model to cluster microscopic time series, where all the components come from a family of Seq2seq models parameterized by different parameters. Extensive experiments on both synthetic and real-world data show the superiority of our approach.

AAAI Conference 2021 Conference Paper

THOR, Trace-based Hardware-driven Layer-Oriented Natural Gradient Descent Computation

  • Mengyun Chen
  • Kaixin Gao
  • Xiaolei Liu
  • Zidong Wang
  • Ningxi Ni
  • Qian Zhang
  • Lei Chen
  • Chao Ding

It is well-known that second-order optimizers can accelerate the training of deep neural networks; however, the huge computation cost of second-order optimization makes it impractical to apply in practice. In order to reduce the cost, many methods have been proposed to approximate a second-order matrix. Inspired by KFAC, we propose a novel Trace-based Hardware-driven layer-ORiented Natural Gradient Descent Computation method, called THOR, to make second-order optimization applicable to real application models. Specifically, we gradually increase the update interval and use the matrix trace to determine which blocks of the Fisher Information Matrix (FIM) need to be updated. Moreover, by leveraging the power of hardware, we have designed a hardware-driven approximation method for computing the FIM to achieve better performance. To demonstrate the effectiveness of THOR, we have conducted extensive experiments. The results show that training ResNet-50 on ImageNet with THOR takes only 66.7 minutes to achieve a top-1 accuracy of 75.9% under an 8 Ascend 910 environment with MindSpore, a new deep learning computing framework. Moreover, with more computational resources, THOR takes only 2.7 minutes to reach 75.9% with 256 Ascend 910 processors.
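One plausible reading of the trace-based criterion is: recompute a FIM block only when its trace has drifted enough relative to the stale cached copy (an invented `needs_update` helper and threshold, sketched in numpy; not THOR's actual rule):

```python
import numpy as np

def needs_update(fim_new, fim_old, tol):
    """Refresh a Fisher Information Matrix block only when its trace
    moved by more than a relative tolerance tol; otherwise keep reusing
    the stale block and save the expensive recomputation."""
    t_old = np.trace(fim_old)
    return bool(abs(np.trace(fim_new) - t_old) / (abs(t_old) + 1e-12) > tol)

F_old = np.eye(3)
big = needs_update(2.0 * F_old, F_old, tol=0.1)    # trace doubled: recompute
tiny = needs_update(1.01 * F_old, F_old, tol=0.1)  # 1% drift: keep stale block
```

The trace is cheap to compute relative to the full block, which is what makes such a gate attractive for deciding when second-order statistics need refreshing.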

IS Journal 2021 Journal Article

Toward Aspect-Level Sentiment Modification Without Parallel Data

  • Qingnan Jiang
  • Lei Chen
  • Wei Zhao
  • Min Yang

This article is the first to study aspect-level sentiment modification (ALSM) without parallel data. Given a sentence, the task of ALSM needs to reverse the sentiment with respect to the given aspect while preserving other content. The main challenge is reversing the sentiment of the given aspect without affecting the sentiments of other aspects in the sentence. To handle this problem, we propose a joint aspect-level sentiment modification (called JASM) model. JASM is a multitask system, which jointly trains two coupled modules: aspect-specific sentiment words extraction and aspect-level sentiment transformation. Besides, we propose a novel memory mechanism to learn aspect-aware sentiment representation and a gating mechanism to dynamically select aspect-aware sentiment information or content information for generating the next words. Experiments show that the proposed model substantially outperforms the compared methods in both aspect-level sentiment transformation and content preservation. For applications, we conduct data augmentation for aspect-based sentiment analysis (ABSA) by generating plausible training data with the trained ALSM model. Experiments show that data augmentation with generated data boosts the performance of a broad range of ABSA models.

AAAI Conference 2020 Conference Paper

Block Hankel Tensor ARIMA for Multiple Short Time Series Forecasting

  • Qiquan Shi
  • Jiaming Yin
  • Jiajun Cai
  • Andrzej Cichocki
  • Tatsuya Yokota
  • Lei Chen
  • Mingxuan Yuan
  • Jia Zeng

This work proposes a novel approach for multiple time series forecasting. First, multi-way delay embedding transform (MDT) is employed to represent time series as low-rank block Hankel tensors (BHT). Then, the higher-order tensors are projected to compressed core tensors by applying Tucker decomposition. At the same time, the generalized tensor Autoregressive Integrated Moving Average (ARIMA) is explicitly used on consecutive core tensors to predict future samples. In this manner, the proposed approach tactically incorporates the unique advantages of MDT tensorization (to exploit mutual correlations) and tensor ARIMA coupled with low-rank Tucker decomposition into a unified framework. This framework exploits the low-rank structure of block Hankel tensors in the embedded space and captures the intrinsic correlations among multiple time series, which can thus improve the forecasting results, especially for multiple short time series. Experiments conducted on three public datasets and two industrial datasets verify that the proposed BHT-ARIMA effectively improves forecasting accuracy and reduces computational cost compared with the state-of-the-art methods.
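The 1-D special case of the delay embedding step can be sketched in numpy (illustrative only; the paper's MDT operates on multiple series and produces higher-order block Hankel tensors, and `delay_embed` is an invented helper):

```python
import numpy as np

def delay_embed(series, window):
    """Stack sliding windows of a 1-D series into a Hankel matrix:
    each anti-diagonal repeats one sample, which is the low-rank
    structure the embedded-space model exploits."""
    n = len(series) - window + 1
    return np.stack([series[i:i + window] for i in range(n)])

ts = np.arange(1.0, 7.0)          # [1, 2, 3, 4, 5, 6]
H = delay_embed(ts, window=3)     # shape (4, 3); rows are overlapping windows
```

BHT-ARIMA then applies Tucker decomposition and a tensor ARIMA on top of such embeddings rather than forecasting the raw series directly.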

NeurIPS Conference 2020 Conference Paper

Can Graph Neural Networks Count Substructures?

  • Zhengdao Chen
  • Lei Chen
  • Soledad Villar
  • Joan Bruna

The ability to detect and count certain substructures in graphs is important for solving many tasks on graph-structured data, especially in the contexts of computational chemistry and biology as well as social network analysis. Inspired by this, we propose to study the expressive power of graph neural networks (GNNs) via their ability to count attributed graph substructures, extending recent works that examine their power in graph isomorphism testing and function approximation. We distinguish between two types of substructure counting: induced-subgraph-count and subgraph-count, and establish both positive and negative answers for popular GNN architectures. Specifically, we prove that Message Passing Neural Networks (MPNNs), 2-Weisfeiler-Lehman (2-WL) and 2-Invariant Graph Networks (2-IGNs) cannot perform induced-subgraph-count of substructures consisting of 3 or more nodes, while they can perform subgraph-count of star-shaped substructures. As an intermediary step, we prove that 2-WL and 2-IGNs are equivalent in distinguishing non-isomorphic graphs, partly answering an open problem raised in Maron et al. (2019). We also prove positive results for k-WL and k-IGNs as well as negative results for k-WL with a finite number of iterations. We then conduct experiments that support the theoretical results for MPNNs and 2-IGNs. Moreover, motivated by substructure counting and inspired by Murphy et al. (2019), we propose the Local Relational Pooling model and demonstrate that it is not only effective for substructure counting but also able to achieve competitive performance on molecular prediction tasks.
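The distinction between the two counting notions can be made concrete with a tiny brute-force example (a stdlib sketch, not code from the paper; `count_paths3` is an invented helper that counts 3-node paths, one of the substructures where the two notions diverge):

```python
from itertools import combinations

def count_paths3(edges, n, induced):
    """Count 3-node paths u - c - v centered at c. As plain subgraphs any
    such pair of edges counts; for the induced count, the closing edge
    u - v must additionally be absent from the graph."""
    E = {frozenset(e) for e in edges}
    total = 0
    for c in range(n):
        nbrs = [u for u in range(n) if frozenset((c, u)) in E]
        for u, v in combinations(nbrs, 2):
            if not induced or frozenset((u, v)) not in E:
                total += 1
    return total

triangle = [(0, 1), (1, 2), (0, 2)]
sub = count_paths3(triangle, 3, induced=False)  # 3: one path centered at each vertex
ind = count_paths3(triangle, 3, induced=True)   # 0: any 3 nodes induce the full triangle
```

The paper's results are about which of these quantities GNN architectures can in principle compute, not about brute-force enumeration, but the example shows why the two counts must be analyzed separately.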

NeurIPS Conference 2020 Conference Paper

Interstellar: Searching Recurrent Architecture for Knowledge Graph Embedding

  • Yongqi Zhang
  • Quanming Yao
  • Lei Chen

Knowledge graph (KG) embedding is a well-established approach to learning representations of KGs. Many models have been proposed to learn the interactions between the entities and relations of triplets. However, long-term information spanning multiple triplets is also important for KGs. In this work, based on relational paths, which are composed of sequences of triplets, we define Interstellar as a recurrent neural architecture search problem over the short-term and long-term information along the paths. First, we analyze the difficulty of using a single unified model as the Interstellar. Then, we propose to search for recurrent architectures as the Interstellar for different KG tasks. A case study on synthetic data illustrates the importance of the defined search problem. Experiments on real datasets demonstrate the effectiveness of the searched models and the efficiency of the proposed hybrid-search algorithm.

AAAI Conference 2020 Conference Paper

Revisiting Graph Based Collaborative Filtering: A Linear Residual Graph Convolutional Network Approach

  • Lei Chen
  • Le Wu
  • Richang Hong
  • Kun Zhang
  • Meng Wang

Graph Convolutional Networks (GCNs) are state-of-the-art graph-based representation learning models that iteratively stack multiple layers of convolution aggregation operations and non-linear activation operations. Recently, in Collaborative Filtering (CF) based Recommender Systems (RS), by treating user-item interaction behavior as a bipartite graph, some researchers have modeled higher-layer collaborative signals with GCNs. These GCN-based recommender models show superior performance compared to traditional works. However, these models suffer from training difficulty with non-linear activations on large user-item graphs. Besides, most GCN-based models cannot model deeper layers due to the over-smoothing effect of the graph convolution operation. In this paper, we revisit GCN-based CF models from two aspects. First, we empirically show that removing non-linearities enhances recommendation performance, which is consistent with the theory of simple graph convolutional networks. Second, we propose a residual network structure specifically designed for CF with user-item interaction modeling, which alleviates the over-smoothing problem of the graph convolution aggregation operation on sparse user-item interaction data. The proposed model is linear, easy to train, scales to large datasets, and yields better efficiency and effectiveness on two real datasets. We publish the source code at https://github.com/newlei/LR-GCCF.
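The linearized-propagation idea can be sketched as follows, assuming row-normalized adjacency and layer-wise concatenation as the residual prediction. This is a simplified reading of the approach, not the authors' exact formulation:

```python
import numpy as np

def linear_residual_prop(A_hat, E0, num_layers=3):
    """Linear graph convolution without non-linear activations:
    E^{k+1} = A_hat @ E^k, keeping every layer's embeddings
    (a residual-style concatenation to counter over-smoothing)."""
    layers = [E0]
    for _ in range(num_layers):
        layers.append(A_hat @ layers[-1])
    return np.concatenate(layers, axis=1)

rng = np.random.default_rng(0)
n, d = 5, 4
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)            # symmetrize the random graph
A_hat = A + np.eye(n)             # add self-loops
A_hat = A_hat / A_hat.sum(1)[:, None]  # row-normalized propagation
E = linear_residual_prop(A_hat, rng.standard_normal((n, d)))
print(E.shape)  # (5, 16): initial + 3 propagated layers, concatenated
```

Because every layer is linear, the whole pipeline is a single sparse matrix polynomial applied to the initial embeddings, which is what makes training easy and scalable.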

IJCAI Conference 2020 Conference Paper

Simultaneous Arrival Matching for New Spatial Crowdsourcing Platforms

  • Boyang Li
  • Yurong Cheng
  • Ye Yuan
  • Guoren Wang
  • Lei Chen

In recent years, 3D spatial crowdsourcing platforms, in which users and workers travel together to their assigned workplaces for services, have become popular; examples include InterestingSport and Nanguache. A typical problem on 3D spatial crowdsourcing platforms is to match users with suitable workers and workplaces. Existing studies have ignored the practical requirement that workers and users assigned to the same workplace should arrive at almost the same time. Thus, in this paper, we propose a new Simultaneous Arrival Matching (SAM) problem, which requires workers and users to arrive at their assigned workplace within a given tolerance time. We find that the newly introduced arrival-time constraint breaks the monotonic additivity of the result set, which poses a significant challenge for designing effective and efficient SAM algorithms. We design a Sliding Window algorithm and a Threshold Scanning algorithm to solve SAM. Experiments on real and synthetic datasets demonstrate the effectiveness and efficiency of our algorithms.
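A greedy two-pointer sweep over sorted arrival times gives a flavor of matching under an arrival-tolerance constraint. This is a hypothetical simplification (`matches_within` is an illustrative name); the paper's Sliding Window and Threshold Scanning algorithms handle much richer constraints:

```python
def matches_within(workers, users, tol):
    """Greedily pair each user with a worker whose arrival time
    differs by at most `tol`, sweeping both sorted lists once."""
    workers, users = sorted(workers), sorted(users)
    i = j = 0
    pairs = []
    while i < len(workers) and j < len(users):
        if abs(workers[i] - users[j]) <= tol:
            pairs.append((workers[i], users[j]))
            i += 1
            j += 1
        elif workers[i] < users[j]:
            i += 1  # this worker arrives too early for anyone left
        else:
            j += 1  # this user arrives too early for anyone left
    return pairs

print(matches_within([10, 25, 40], [12, 50], tol=5))  # [(10, 12)]
```

The non-monotonicity the abstract mentions shows up even here: admitting one extra arrival can invalidate a previously feasible pairing, so a simple greedy pass is not optimal in general.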

TIST Journal 2020 Journal Article

Travel Recommendation via Fusing Multi-Auxiliary Information into Matrix Factorization

  • Lei Chen
  • Zhiang Wu
  • Jie Cao
  • Guixiang Zhu
  • Yong Ge

As an e-commerce feature, personalized recommendation is highly valued by both consumers and merchants, and e-tourism has become one of the hottest industries adopting recommendation systems. Several lines of evidence confirm that travel-product recommendation is quite different from traditional recommendation. Travel products are browsed and purchased relatively infrequently compared with traditional products (e.g., books and food), which gives rise to extremely sparse travel data. Meanwhile, the choice of a suitable travel product is affected by many factors, such as departure, destination, and financial and time budgets. To address these challenges, in this article we propose a Probabilistic Matrix Factorization with Multi-Auxiliary Information (PMF-MAI) model for travel-product recommendation. In particular, PMF-MAI fuses probabilistic matrix factorization on the user-item interaction matrix with linear regression on a suite of features constructed from the multiple sources of auxiliary information. To fit the sparse data, PMF-MAI is trained with a whole-data-based learning approach that utilizes unobserved data to increase the coupling between the probabilistic matrix factorization and the linear regression. Extensive experiments are conducted on a real-world dataset provided by a large tourism e-commerce company, on which PMF-MAI clearly outperforms all competitive baselines. We also examine feature importance to reveal the auxiliary information that most influences the adoption of travel products.
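The fusion of matrix factorization with a linear feature term can be illustrated by a hypothetical prediction rule r_hat[i, j] = U[i]·V[j] + w·x_ij; the paper's actual probabilistic objective and training procedure differ, and all names below are illustrative:

```python
import numpy as np

def pmf_mai_predict(U, V, w, X_feat):
    """Hypothetical fused prediction: a latent-factor term U[i].V[j]
    plus a linear-regression term w.x_ij on auxiliary features."""
    return U @ V.T + X_feat @ w

rng = np.random.default_rng(1)
n_users, n_items, k, f = 4, 3, 2, 5
U = rng.standard_normal((n_users, k))        # user latent factors
V = rng.standard_normal((n_items, k))        # item latent factors
w = rng.standard_normal(f)                   # feature weights
X_feat = rng.standard_normal((n_users, n_items, f))  # per-pair features
R_hat = pmf_mai_predict(U, V, w, X_feat)
print(R_hat.shape)  # (4, 3)
```

Training both components jointly on observed and unobserved entries is what couples them, per the whole-data learning approach the abstract describes.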

IJCAI Conference 2019 Conference Paper

Knowledge-enhanced Hierarchical Attention for Community Question Answering with Multi-task and Adaptive Learning

  • Min Yang
  • Lei Chen
  • Xiaojun Chen
  • Qingyao Wu
  • Wei Zhou
  • Ying Shen

In this paper, we propose Knowledge-enhanced Hierarchical Attention for community question answering (CQA) with Multi-task learning and Adaptive learning (KHAMA). First, we propose a hierarchical attention network that fully fuses knowledge from the input documents and a knowledge base (KB) by exploiting the semantic compositionality of the input sequences. The external factual knowledge helps recognize background knowledge (entity mentions and their relationships) and eliminate noisy information from long documents with sophisticated syntactic and semantic structures. In addition, we build multiple CQA models with adaptive boosting and combine them into a more effective and robust CQA system. Furthermore, KHAMA is a multi-task learning model: it treats CQA as the primary task and question categorization as an auxiliary task, aiming to learn a category-aware document encoder and to better identify essential information in long questions. Extensive experiments on two benchmarks demonstrate that KHAMA achieves substantial improvements over the compared methods.

IJCAI Conference 2019 Conference Paper

Label distribution learning with label-specific features

  • Tingting Ren
  • Xiuyi Jia
  • Weiwei Li
  • Lei Chen
  • Zechao Li

Label distribution learning (LDL) is a novel machine learning paradigm to deal with label ambiguity issues by placing more emphasis on how relevant each label is to a particular instance. Many LDL algorithms have been proposed and most of them concentrate on the learning models, while few of them focus on the feature selection problem. All existing LDL models are built on a simple feature space in which all features are shared by all the class labels. However, this kind of traditional data representation strategy tends to select features that are distinguishable for all labels, but ignores label-specific features that are pertinent and discriminative for each class label. In this paper, we propose a novel LDL algorithm by leveraging label-specific features. The common features for all labels and specific features for each label are simultaneously learned to enhance the LDL model. Moreover, we also exploit the label correlations in the proposed LDL model. The experimental results on several real-world data sets validate the effectiveness of our method.

NeurIPS Conference 2019 Conference Paper

On the equivalence between graph isomorphism testing and function approximation with GNNs

  • Zhengdao Chen
  • Soledad Villar
  • Lei Chen
  • Joan Bruna

Graph neural networks (GNNs) have achieved considerable success on graph-structured data, prompting increasing interest in their representation power. One line of work focuses on the universal approximation of permutation-invariant functions by certain classes of GNNs; another demonstrates the limitations of GNNs via graph isomorphism tests. Our work connects these two perspectives and proves their equivalence. We further develop a framework for the representation power of GNNs in the language of sigma-algebras, which incorporates both viewpoints. Using this framework, we compare the expressive power of different classes of GNNs as well as other methods on graphs. In particular, we prove that order-2 graph G-invariant networks fail to distinguish non-isomorphic regular graphs with the same degree. We then extend them to a new architecture, Ring-GNN, which succeeds in distinguishing these graphs and also performs well on real-world tasks.
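The graph-isomorphism-test viewpoint is easy to demonstrate with classical 1-WL color refinement, which (like the order-2 networks discussed above) fails on non-isomorphic regular graphs of the same degree. A minimal sketch, not the paper's own construction:

```python
from collections import Counter

def wl_colors(adj, rounds=3):
    """1-WL color refinement: repeatedly re-color each node by its own
    color plus the multiset of neighbor colors. The final histogram is
    a necessary (but not sufficient) isomorphism invariant."""
    colors = {v: 0 for v in adj}
    for _ in range(rounds):
        sigs = {v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
                for v in adj}
        relabel = {s: i for i, s in enumerate(sorted(set(sigs.values())))}
        colors = {v: relabel[sigs[v]] for v in adj}
    return Counter(colors.values())

# a 6-cycle vs two disjoint triangles: both 2-regular on 6 nodes
C6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
two_C3 = {0: [1, 2], 1: [0, 2], 2: [0, 1],
          3: [4, 5], 4: [3, 5], 5: [3, 4]}
print(wl_colors(C6) == wl_colors(two_C3))  # True: 1-WL cannot tell them apart
```

Every node in both graphs has degree 2, so refinement never splits the single color class, which is exactly the regular-graph failure mode the abstract refers to.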

IJCAI Conference 2019 Conference Paper

Personalized Multimedia Item and Key Frame Recommendation

  • Le Wu
  • Lei Chen
  • Yonghui Yang
  • Richang Hong
  • Yong Ge
  • Xing Xie
  • Meng Wang

When recommending or advertising items to users, an emerging trend is to present each multimedia item with a key frame image (e.g., the poster of a movie). As each multimedia item can be represented by multiple fine-grained visual images (e.g., related images of the movie), personalized key frame recommendation is necessary in these applications to cater to users' unique visual preferences. However, previous personalized key frame recommendation models relied on users' fine-grained image behavior on multimedia items (e.g., user-image interaction behavior), which is often unavailable in real scenarios. In this paper, we study the general problem of joint multimedia item and key frame recommendation in the absence of fine-grained user-image behavior. We argue that the key challenge lies in discovering users' visual profiles for key frame recommendation, as most recommendation models would fail without any fine-grained image behavior. To tackle this challenge, we leverage users' item behavior by projecting users (items) into two latent spaces: a collaborative latent space and a visual latent space. We further design a model that discerns both the collaborative and visual dimensions of users and models how users form decisive item preferences from these two spaces. As a result, the learned user visual profiles can be directly applied to key frame recommendation. Finally, experimental results on a real-world dataset clearly show the effectiveness of our proposed model on the two recommendation tasks.

JBHI Journal 2016 Journal Article

A Novel Brain Networks Enhancement Model (BNEM) for BOLD fMRI Data Analysis With Highly Spatial Reproducibility

  • Nizhuan Wang
  • Weiming Zeng
  • Dongtailang Chen
  • Jun Yin
  • Lei Chen

Independent component analysis (ICA), which aims to detect functional connectivity among discrete cortical brain regions, has been used extensively to explore functional magnetic resonance imaging (fMRI) data. Although the resulting independent components (ICs) are of relatively high quality, the noise embedded in them strongly affects true active/inactive region inference and reproducibility in the postprocessing stage, e.g., the extraction of statistical parametrical maps (SPMs). In this paper, a novel brain network enhancement model (BNEM) is proposed, which consists of two key techniques: 1) a 3-D wavelet noise filter (3DWNF) for the meaningful ICs, which greatly suppresses noise and reinforces the real activation inference of SPMs; and 2) a spatial reproducibility enhancement algorithm (SREA), which aims to improve the reproducibility of SPMs. A simulated experiment demonstrated that the signals filtered by 3DWNF had higher correlation and lower normalized mean square error with respect to the ground truth than the unfiltered ones, and that SREA could further enhance the quality of most filtered signals while preserving consistency with 3DWNF. The real-data experiments also revealed that 1) 3DWNF preserves true positive voxels more accurately by correctly identifying the highly misclassified voxels of the non-enhanced SPMs; 2) SREA further improves the classification accuracy of the active/inactive voxels of SPMs corresponding to the 3DWNF-denoised ICs; and 3) both 3DWNF and SREA contribute to the reproducibility of the SPMs reproduced by BNEM. Thus, BNEM is expected to have wide applicability in neuroscience and clinical domains.

IJCAI Conference 2015 Conference Paper

VRCA: A Clustering Algorithm for Massive Amount of Texts

  • Ming Liu
  • Lei Chen
  • Bingquan Liu
  • Xiaolong Wang

Large numbers of texts appear on the web every day, causing the volume of web text to explode, so handling large-scale text collections is increasingly important. Clustering is a generally accepted solution for text organization: thanks to its unsupervised nature, users can easily extract the information they need. However, traditional clustering algorithms can only handle small-scale text collections; as the collection grows, their performance degrades, mainly because of the high-dimensional vectors generated from texts. To cluster texts at scale, this paper proposes a novel clustering algorithm in which only the features that represent a cluster are preserved in the cluster's vector. The clustering process is separated into two parts. In one part, feature weights are fine-tuned so that the cluster partition optimizes an objective function. In the other part, features are reordered and only the useful features that represent the cluster are kept in the cluster's vector. Experimental results demonstrate that our algorithm achieves high performance on both small-scale and large-scale text collections.
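The idea of keeping only cluster-representative features can be sketched as a sparse centroid. `sparse_centroid` and the simple top-k rule below are hypothetical stand-ins for the paper's feature-reordering step:

```python
import numpy as np

def sparse_centroid(vectors, k=3):
    """Average the member vectors, then keep only the k highest-weight
    features, zeroing the rest so the cluster vector stays sparse."""
    c = np.mean(vectors, axis=0)
    keep = np.argsort(c)[-k:]      # indices of the top-k weights
    sparse = np.zeros_like(c)
    sparse[keep] = c[keep]
    return sparse

# two documents over a 5-term vocabulary
docs = np.array([[0.9, 0.1, 0.0, 0.8, 0.05],
                 [0.7, 0.0, 0.1, 0.9, 0.00]])
print(sparse_centroid(docs, k=2))
```

Truncating the cluster vector this way keeps similarity computations cheap as the number of documents (and hence the vocabulary) grows, which is the scalability problem the abstract targets.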