
Author name cluster

Yafei Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
1 author row

Possible papers


JBHI Journal 2026 Journal Article

NoTAC: A Noise-Tolerance Automatic Cleaning Framework for Bone Marrow Karyotyping Data

  • Rihan Huang
  • Siyuan Chen
  • Yafei Li
  • Chunling Zhang
  • Yilan Zhang
  • Changchun Yang
  • Na Li
  • Jingdong Hu

Deep neural networks have advanced chromosome classification, a critical procedure in karyotyping for disease diagnosis. However, training an effective DNN requires clean and reliable data, whereas real-world clinical chromosome data often contain label errors and outliers, which degrade DNN performance and limit their clinical applicability. In this work, we propose a Noise-Tolerance Automatic Cleaning framework, named NoTAC, to address potential labeling errors and outliers to enhance the performance of chromosome classification. The framework consists of two branches: KaryoCleanse for label noise detection and KaryoDrift for outlier identification. First, it identifies potential label errors by leveraging the DNN’s self-confidence, estimating the latent label distribution, and ranking probabilities to prune mislabeled data. Second, it scores out-of-distribution samples based on the average K-nearest neighbor distances, enabling the identification and removal of outlier data. We conducted comprehensive comparative experiments against state-of-the-art noise-handling methods on a real-world R-band bone marrow chromosome dataset. Our results demonstrate that NoTAC achieves superior performance with an accuracy of 93.99%, which represents a 6.25% relative improvement over the baseline and outperforms the best competing method by 0.92%. Furthermore, our qualitative analysis of NoTAC revealed reliable data issues in a real-world R-band bone marrow chromosome dataset, offering insights into how these issues impair DNN prediction capabilities. These findings demonstrate NoTAC’s potential to enhance both the performance and reliability of DNNs in practical medical datasets. The proposed method has also been applied to assist clinical karyotype diagnosis.
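The outlier-scoring step described in the abstract above (average K-nearest neighbor distance) can be illustrated with a minimal sketch. This is not the NoTAC or KaryoDrift implementation: the function names, the Euclidean metric, the toy 2-D points, and the fixed threshold are all assumptions made for illustration, and a real pipeline would presumably score learned DNN feature embeddings rather than raw coordinates.

```python
import math


def knn_outlier_scores(features, k=3):
    """Return, for each sample, its mean Euclidean distance to its k nearest neighbours.

    Hypothetical helper: a larger score suggests the sample lies far from the
    bulk of the data and may be out of distribution.
    """
    if not 1 <= k < len(features):
        raise ValueError("k must satisfy 1 <= k < number of samples")
    scores = []
    for i, x in enumerate(features):
        # Distances from sample i to every other sample, smallest first.
        dists = sorted(math.dist(x, y) for j, y in enumerate(features) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_outliers(features, k=3, threshold=1.0):
    """Return indices whose score exceeds an assumed, hand-picked threshold."""
    scores = knn_outlier_scores(features, k)
    return [i for i, s in enumerate(scores) if s > threshold]


# Toy data: four points on a unit square plus one distant point.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (10.0, 10.0)]
scores = knn_outlier_scores(points, k=2)
outliers = flag_outliers(points, k=2, threshold=5.0)
```

In this toy setup the four square corners each have two neighbours at distance 1, so only the distant point is flagged. In practice the threshold would more likely be derived from the score distribution (for example a percentile) than fixed by hand.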

IJCAI Conference 2025 Conference Paper

Credit Assignment and Fine-Tuning Enhanced Reinforcement Learning for Collaborative Spatial Crowdsourcing

  • Wei Chen
  • Yafei Li
  • Baolong Mei
  • Guanglei Zhu
  • Jiaqi Wu
  • Mingliang Xu

Collaborative spatial crowdsourcing leverages distributed workers' collective intelligence to accomplish spatial tasks. A central challenge is to efficiently assign suitable workers to collaborate on these tasks. Although mainstream reinforcement learning (RL) methods have proven effective in task allocation, they face two key obstacles: delayed reward feedback and non-stationary data distributions, both hindering optimal allocation and collaborative efficiency. To address these limitations, we propose CAFE (credit assignment and fine-tuning enhanced), a novel multi-agent RL framework for spatial crowdsourcing. CAFE introduces a credit assignment mechanism that distributes rewards based on workers' contributions and spatiotemporal constraints, coupled with bi-level meta-optimization to jointly optimize credit assignment and RL policy. To handle non-stationary spatial task distributions, CAFE employs an adaptive fine-tuning procedure that efficiently adjusts credit assignment parameters while preserving collaborative knowledge. Experiments on two real-world datasets validate the effectiveness of our framework, demonstrating superior performance in terms of task completion and equitable reward redistribution.

AAAI Conference 2025 Conference Paper

Gradient-Guided Credit Assignment and Joint Optimization for Dependency-Aware Spatial Crowdsourcing

  • Yafei Li
  • Wei Chen
  • Jinxing Yan
  • Huiling Li
  • Lei Gao
  • Mingliang Xu

Dependency-aware spatial crowdsourcing (DASC) addresses the unique challenges posed by subtask dependencies in spatial task assignment. This paper investigates the task assignment problem in DASC and proposes a two-stage Recommend and Match Optimization (RMO) framework, leveraging multi-agent reinforcement learning for subtask recommendation and a multi-dimensional utility function for subtask matching. The RMO framework primarily addresses two key challenges: credit assignment for subtasks with interdependencies and maintaining overall coherence between subtask recommendation and matching. Specifically, we employ meta-gradients to construct auxiliary policies and establish a gradient connection between two stages, which can effectively address credit assignment and joint optimization of subtask recommendation and matching, while concurrently accelerating network training. We further establish a unified gradient descent process through gradient synchronization across recommendation networks, auxiliary policies, and the matching utility evaluation function. Experiments on two real-world datasets validate the effectiveness and feasibility of our proposed approach.

IJCAI Conference 2025 Conference Paper

HLMTrans: A Sim-to-Real Transfer Framework for Spatial Crowdsourcing with Human-Guided Language Models

  • Qingshun Wu
  • Yafei Li
  • Lulu Li
  • Yuanyuan Jin
  • Shuo He
  • Mingliang Xu

Reinforcement Learning (RL), trained via trial and error in simulators, has been proven to be an effective approach for addressing task assignment problems in spatial crowdsourcing. However, a performance gap still exists when transferring the simulator-trained RL Models (RLMs) to real-world settings due to the misalignment of travel time. Existing works mostly focus on using data-driven and learning-based methods to predict travel time; unfortunately, these approaches are limited in achieving accurate predictions because they require a large amount of real-world data covering the entire state distribution. In this paper, we propose a Sim-to-Real Transfer with Human-guided Language Models framework called HLMTrans, which comprises three core modules: RLM decision-making for task assignment, sim-to-real transfer with Large Language Models (LLMs), and preference learning from human feedback. HLMTrans first leverages the zero-shot chain-of-thought reasoning capability of LLMs to estimate travel time by capturing the real-world dynamics. This estimation is then input as domain knowledge into the forward model of Grounded Action Transformation (GAT) to enhance the action transformation of RLMs. Further, we design a human preference learning mechanism to fine-tune LLMs, improving their generation quality and enabling RLMs to learn a more realistic policy. We evaluate the proposed HLMTrans on two real-world datasets, and the experimental results demonstrate that HLMTrans outperforms the SOTA methods in terms of effectiveness and efficiency.

AAAI Conference 2024 Short Paper

MapLE: Matching Molecular Analogues Promptly with Low Computational Resources by Multi-Metrics Evaluation (Student Abstract)

  • Xiaojian Chen
  • Chuyue Liao
  • Yanhui Gu
  • Yafei Li
  • Jinlan Wang
  • Yi Chen
  • Masaru Kitsuregawa

Matching molecular analogues is a research problem in computational chemistry and bioinformatics that aims to identify molecules that are structurally or functionally similar to a target molecule. Recent studies on matching analogous molecules have predominantly concentrated on enhancing effectiveness, often sidelining computational efficiency, particularly in contexts of low computational resources. This oversight poses challenges in many real applications (e.g., drug discovery, catalyst generation, and so forth). To tackle this issue, we propose a general strategy named MapLE, aiming to promptly match analogous molecules with low computational resources by multi-metrics evaluation. Experimental evaluation conducted on a public biomolecular dataset validates the excellent and efficient performance of the proposed strategy.

AAAI Conference 2023 Short Paper

HaPPy: Harnessing the Wisdom from Multi-Perspective Graphs for Protein-Ligand Binding Affinity Prediction (Student Abstract)

  • Xianfeng Zhang
  • Yanhui Gu
  • Guandong Xu
  • Yafei Li
  • Jinlan Wang
  • Zhenglu Yang

Gathering information from multi-perspective graphs is an essential issue for many applications, especially for protein-ligand binding affinity prediction. Most traditional approaches obtain such information individually, with low interpretability. In this paper, we harness the rich information from multi-perspective graphs with a general model, which abstractly represents protein-ligand complexes with better interpretability while achieving excellent predictive performance. In addition, we specifically analyze the protein-ligand binding affinity problem, taking into account the heterogeneity of proteins and ligands. Experimental evaluations demonstrate the effectiveness of our data representation strategy on public datasets by fusing information from different perspectives.

TIST Journal 2018 Journal Article

Traffic Simulation and Visual Verification in Smog

  • Mingliang Xu
  • Hua Wang
  • Shili Chu
  • Yong Gan
  • Xiaoheng Jiang
  • Yafei Li
  • Bing Zhou

Smog reduces visibility on the road and can compromise traffic safety. Modeling traffic in smog will have a significant impact on realistic traffic simulations. Most existing traffic models assume that drivers have optimal vision in the simulations, making these simulations unsuitable for modeling smog weather conditions. In this article, we introduce the Smog Full Velocity Difference Model (SMOG-FVDM) for a realistic simulation of traffic in smog weather conditions. In this model, we present a stadia model for drivers in smog conditions. We introduce it into a car-following traffic model using both psychological force and body force concepts, and then we introduce the SMOG-FVDM. Because the SMOG-FVDM has many parameters, we design a visual verification system based on SMOG-FVDM to arrive at an adequate solution, which can show visual simulation results under different road scenarios and different degrees of smog by reconciling the parameters. Experimental results show that our model can give a realistic and efficient traffic simulation of smog weather conditions.