Arrow Research search

Author name cluster

Fan Lin

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers
2 author rows

Possible papers

6

TIST Journal 2026 Journal Article

TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs

  • Shuyi Xie
  • Wenlin Yao
  • Yong Dai
  • Shaobo Wang
  • Zishan Xu
  • Fan Lin
  • Donglin Zhou
  • Lifeng Jin

Large language models (LLMs) have shown impressive capabilities across various natural language tasks. However, evaluating their alignment with human preferences remains a challenge. To this end, we propose a comprehensive human evaluation framework to assess LLMs’ proficiency in following instructions on diverse real-world tasks. We construct a hierarchical task tree spanning seven major areas, over 200 categories, and over 800 tasks, covering diverse capabilities such as question answering, reasoning, multi-turn dialogue, and text generation, to evaluate LLMs in a comprehensive and in-depth manner. We also design detailed evaluation standards and processes to facilitate consistent, unbiased judgments from human evaluators. A test set of over 3,000 instances is released, spanning different difficulty levels and knowledge domains. Our work provides a standardized methodology to evaluate human alignment in LLMs for both English and Chinese. We also analyze the feasibility of automating parts of the evaluation with a strong LLM (GPT-4). Our framework supports a thorough assessment of LLMs as they are integrated into real-world applications. We have made the task tree, the TencentLLMEval dataset, and the evaluation methodology publicly available; they have proven effective in assessing the performance of Tencent Hunyuan LLMs. By doing so, we aim to facilitate the benchmarking of advances in the development of safe and human-aligned LLMs.
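
The hierarchical task tree described in the abstract (major areas containing categories containing tasks) can be sketched as a nested mapping. The area, category, and task names below are illustrative placeholders, not taken from the paper:

```python
# Hypothetical miniature of a hierarchical task tree: major areas ->
# categories -> lists of leaf tasks (all names are illustrative).
task_tree = {
    "question answering": {
        "factual QA": ["open-domain QA", "reading comprehension"],
    },
    "reasoning": {
        "math": ["arithmetic", "word problems"],
        "logic": ["deduction"],
    },
}

def count_tasks(tree):
    """Count leaf tasks in a nested area -> category -> tasks mapping."""
    return sum(len(tasks) for cats in tree.values() for tasks in cats.values())
```

At the paper's reported scale, such a tree would hold seven top-level areas, over 200 categories, and over 800 leaf tasks.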

JBHI Journal 2025 Journal Article

Unsupervised Feature Selection-Driven Active Learning for Semi-Supervised Automatic ECG Analysis

  • Xiao Li
  • Yongkang Zhou
  • Songyang An
  • Yu Zeng
  • Xinqi Zhang
  • Jun Wang
  • Yizhe Huang
  • Fan Lin

Automatic analysis methods for electrocardiograms (ECGs) usually require large-scale annotated training data, but the annotation process is extremely time-consuming. While semi-supervised learning can leverage unlabeled data, its performance depends heavily on the quality of the initial labeled subset. Active learning has been used to identify the most informative samples for annotation, but conventional approaches face three critical limitations: (1) dependency on manual intervention for iterative query design, (2) prohibitive computational costs during sample selection, and (3) limited compatibility with semi-supervised learning frameworks. To address these limitations, we propose an Unsupervised Active Feature-selective Semi-Supervised Learning (UAFSSL) framework for ECG analysis, comprising an unsupervised feature selection-based active learning module and a semi-supervised learning module. UAFSSL captures latent data distributions via unsupervised feature extraction, selects diverse and representative samples using pseudo-label clustering, and integrates seamlessly with semi-supervised learning to eliminate human intervention. We validated our algorithm on an ECG waveform segmentation task and an atrial fibrillation detection task. In the waveform segmentation task, our method improved the F1-score for P-wave delineation by 2.4% compared to random sampling, using only 5% of labeled samples. For the atrial fibrillation detection task, we evaluated our method on both the AFDB and a 24-hour dataset collected from 500 atrial fibrillation patients. Using only 200 labeled samples for model training, our method achieved AUC improvements of 2.5% and 2.2% over random sampling in five-fold cross-validation. This is the first study to integrate unsupervised active learning with semi-supervised learning for automatic ECG analysis, offering a robust, automated solution that reduces annotation costs while enhancing clinical applicability.
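
The "pseudo-label clustering" sample-selection step can be illustrated with a minimal sketch: cluster unlabeled samples in an unsupervised feature space, then query the sample nearest each cluster centroid, so the queried set is both diverse (one per cluster) and representative. This is a generic k-means-style illustration of the idea, assuming NumPy, and is not the authors' implementation:

```python
import numpy as np

def select_representative_samples(features, n_select, n_iter=20, seed=0):
    """Cluster unlabeled samples in a feature space and return the index
    of the sample closest to each cluster centroid."""
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    # Initialize centroids from randomly chosen distinct samples.
    centroids = features[rng.choice(n, size=n_select, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign each sample to its nearest centroid (its pseudo-label).
        dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Recompute centroids; keep the old centroid for empty clusters.
        for k in range(n_select):
            members = features[labels == k]
            if len(members):
                centroids[k] = members.mean(axis=0)
    # For each cluster, pick the sample nearest its final centroid.
    dists = np.linalg.norm(features[:, None, :] - centroids[None, :, :], axis=2)
    return [int(dists[:, k].argmin()) for k in range(n_select)]
```

In an active-learning loop, the returned indices would be sent to annotators, and the labeled set would then seed the semi-supervised module.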

NeurIPS Conference 2024 Conference Paper

IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation

  • Fan Lin
  • Shuyi Xie
  • Yong Dai
  • Wenlin Yao
  • Tianjiao Lang
  • Yu Zhang

As Large Language Models (LLMs) become more capable of handling increasingly complex tasks, the evaluation set must keep pace with these advancements to ensure it remains sufficiently discriminative. Item Discrimination (ID) theory, which is widely used in educational assessment, measures the ability of individual test items to differentiate between high and low performers. Inspired by this theory, we propose an ID-induced prompt synthesis framework for evaluating LLMs so that the evaluation set continually updates and refines according to model abilities. Our data synthesis framework prioritizes both breadth and specificity. It can generate prompts that comprehensively evaluate the capabilities of LLMs while revealing meaningful performance differences between models, allowing for effective discrimination of their relative strengths and weaknesses across various tasks and domains. To produce high-quality data, we incorporate a self-correction mechanism into our generalization framework and develop two models to predict prompt discrimination and difficulty scores to facilitate our data synthesis framework, contributing valuable tools to evaluation data synthesis research. We apply our generated data to evaluate five SOTA models. Our data achieves an average score of 51.92, accompanied by a variance of 10.06. By contrast, previous works (i.e., SELF-INSTRUCT and WizardLM) obtain an average score exceeding 67, with a variance below 3.2. The results demonstrate that the data generated by our framework is more challenging and discriminative compared to previous works. We will release a dataset of over 3,000 carefully crafted prompts to facilitate evaluation research of LLMs.
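
The item-discrimination index that motivates this framework has a classical form in educational assessment: the item pass rate among the top-scoring examinees minus the pass rate among the bottom-scoring ones. A minimal sketch of that classical index follows; the 27% group fraction is the conventional choice in assessment practice, not a detail taken from the paper:

```python
def discrimination_index(item_scores, total_scores, frac=0.27):
    """Classic item-discrimination index: pass rate of the item in the
    top-scoring group minus its pass rate in the bottom-scoring group.

    item_scores  -- per-examinee 0/1 outcomes on one item
    total_scores -- per-examinee total test scores
    """
    n = len(total_scores)
    k = max(1, int(n * frac))
    order = sorted(range(n), key=lambda i: total_scores[i])
    low, high = order[:k], order[-k:]
    p_high = sum(item_scores[i] for i in high) / k
    p_low = sum(item_scores[i] for i in low) / k
    return p_high - p_low
```

An index near 1 means the item cleanly separates strong from weak performers; an index near 0 means it discriminates poorly, which is the property the paper's synthesized prompts aim to maximize across models.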

ECAI Conference 2023 Conference Paper

Diffusion Model for Camouflaged Object Detection

  • Zhennan Chen
  • Rongrong Gao
  • Tian-Zhu Xiang
  • Fan Lin

Camouflaged object detection is a challenging task that aims to identify objects that are highly similar to their background. Motivated by the powerful noise-to-image denoising capability of denoising diffusion models, in this paper we propose a diffusion-based framework for camouflaged object detection, termed diffCOD, which treats the camouflaged object segmentation task as a denoising diffusion process from noisy masks to object masks. Specifically, the object mask diffuses from the ground-truth masks to a random distribution, and the designed model learns to reverse this noising process. To strengthen the denoising learning, the input image prior is encoded and integrated into the denoising diffusion model to guide the diffusion process. Furthermore, we design an injection attention module (IAM) to interact conditional semantic features extracted from the image with the diffusion noise embedding via the cross-attention mechanism to enhance denoising learning. Extensive experiments on four widely used COD benchmark datasets demonstrate that the proposed method achieves favorable performance compared to 11 existing state-of-the-art methods, especially in the detailed texture segmentation of camouflaged objects. Our code will be made publicly available at: https://github.com/ZNan-Chen/diffCOD.
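
The forward ("noising") half of the mask diffusion described here follows the standard DDPM closed form x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps. A minimal NumPy sketch under a generic linear noise schedule; the schedule parameters are illustrative assumptions, not values from the paper:

```python
import numpy as np

def forward_diffuse_mask(mask, t, T=1000, beta_start=1e-4, beta_end=0.02, seed=0):
    """Forward (noising) step of a DDPM-style process on a binary mask:
    x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps."""
    rng = np.random.default_rng(seed)
    betas = np.linspace(beta_start, beta_end, T)     # linear schedule
    alpha_bar = np.cumprod(1.0 - betas)              # cumulative signal rate
    x0 = mask.astype(float) * 2.0 - 1.0              # scale {0, 1} mask to [-1, 1]
    eps = rng.standard_normal(x0.shape)              # Gaussian noise
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps
```

Training then asks a network, conditioned on image features, to invert this process, recovering the clean object mask from a noised one.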

JBHI Journal 2022 Journal Article

Semi-Supervised Learning for Automatic Atrial Fibrillation Detection in 24-Hour Holter Monitoring

  • Peng Zhang
  • Yuting Chen
  • Fan Lin
  • Sifan Wu
  • Xiaoyun Yang
  • Qiang Li

Paroxysmal atrial fibrillation (AF) is generally diagnosed by long-term dynamic electrocardiogram (ECG) monitoring. Identifying AF episodes from long-term ECG data can place a heavy burden on clinicians. Many machine-learning-based automatic AF detection methods have been proposed to solve this issue. However, these methods require numerous annotated data to train the model, and the annotation of AF in long-term ECG is extremely time-consuming. Reducing the demand for labeled data can effectively improve the clinical practicability of automatic AF detection methods. In this study, we developed a novel semi-supervised learning method that generates modified low-entropy labels of unlabeled samples for training a deep learning model to automatically detect paroxysmal AF in 24-hour Holter monitoring data. Our method employs a 1D CNN-LSTM neural network with RR intervals as input and uses few labeled training data with numerous unlabeled data to train the neural network. This method was evaluated using a 24-hour Holter monitoring dataset collected from 1000 paroxysmal AF patients. Using labeled samples from only 10 patients for model training, our method achieved a sensitivity of 97.8%, specificity of 97.9%, and accuracy of 97.9% in five-fold cross-validation. Compared to the supervised learning method with complete labeled samples, the detection accuracy of our method was only 0.5% lower, while the workload of data annotation was reduced by more than 98%. To the best of our knowledge, this is the first study to apply semi-supervised learning techniques to automatic AF detection using ECG. Our method can effectively reduce the demand for AF data annotations and improve the clinical practicability of automatic AF detection.
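
One common way to obtain low-entropy pseudo-labels from model predictions is temperature sharpening, as popularized in the semi-supervised learning literature; whether the paper uses exactly this transform is an assumption, so treat the sketch as illustrative rather than as the authors' method:

```python
import numpy as np

def sharpen(probs, T=0.5):
    """Temperature sharpening: raise class probabilities to the power 1/T
    and renormalize, lowering the entropy of the distribution so it can
    serve as a confident pseudo-label for an unlabeled sample."""
    p = np.asarray(probs, dtype=float) ** (1.0 / T)
    return p / p.sum(axis=-1, keepdims=True)
```

With T < 1 the dominant class is amplified (e.g. [0.6, 0.4] becomes roughly [0.69, 0.31] at T = 0.5); as T approaches 0 the output approaches a hard one-hot label.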

JBHI Journal 2021 Journal Article

Self-Ensembling Co-Training Framework for Semi-Supervised COVID-19 CT Segmentation

  • Caizi Li
  • Li Dong
  • Qi Dou
  • Fan Lin
  • Kebao Zhang
  • Zuxin Feng
  • Weixin Si
  • Xuesong Deng

The coronavirus disease 2019 (COVID-19) has become a severe worldwide health emergency and is spreading at a rapid rate. Segmentation of COVID lesions from computed tomography (CT) scans is of great importance for monitoring disease progression and further clinical treatment. As labeling COVID-19 CT scans is labor-intensive and time-consuming, it is essential to develop a segmentation method based on limited labeled data to conduct this task. In this paper, we propose a self-ensembled co-training framework, which is trained by limited labeled data and large-scale unlabeled data, to automatically extract COVID lesions from CT scans. Specifically, to enrich the diversity of unsupervised information, we build a co-training framework consisting of two collaborative models, in which the two models teach each other during training by using their respective predicted pseudo-labels of unlabeled data. Moreover, to alleviate the adverse impact of noisy pseudo-labels on each model, we propose a self-ensembling strategy to perform consistency regularization on the up-to-date predictions of unlabeled data, in which the predictions of unlabeled data are gradually ensembled via a moving average at the end of every training epoch. We evaluate our framework on a COVID-19 dataset containing 103 CT scans. Experimental results show that, with only 4 labeled CT scans, our proposed method achieves better performance than state-of-the-art semi-supervised segmentation networks.
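
The per-epoch moving-average ensembling of predictions described above resembles temporal ensembling: an exponential moving average of each sample's predictions across epochs, with bias correction for the early epochs. A minimal sketch under that assumption (the momentum alpha is an illustrative hyperparameter, not a value from the paper):

```python
import numpy as np

def temporal_ensemble(pred_history, alpha=0.6):
    """Self-ensemble per-epoch predictions for one sample via an
    exponential moving average with bias correction, returning the
    ensembled (pseudo-label) target after each epoch."""
    Z = np.zeros_like(pred_history[0], dtype=float)
    ensembled = []
    for t, z in enumerate(pred_history, start=1):
        Z = alpha * Z + (1 - alpha) * np.asarray(z, dtype=float)
        ensembled.append(Z / (1 - alpha ** t))  # bias-corrected target
    return ensembled
```

The smoothed targets fluctuate less than any single epoch's prediction, which is what damps the effect of noisy pseudo-labels in the consistency-regularization loss.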