Author name cluster

Pengfei Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

13 papers

1 author row

EAAI Journal 2026 Journal Article

An efficient and interpretable clustering-based framework for large-scale maritime traffic pattern recognition

Shaoqing Guo
Victor Bolbot
Liangliang Lu
Pengfei Chen
Osiris A. Valdez Banda

AAAI Conference 2026 Conference Paper

ConInstruct: Evaluating Large Language Models on Conflict Detection and Resolution in Instructions

Xingwei He
Qianru Zhang
Pengfei Chen
Guanhua Chen
Linlin Yu
Yuan Yuan
Siu-Ming Yiu

Instruction-following is a critical capability of Large Language Models (LLMs). While existing works primarily focus on assessing how well LLMs adhere to user instructions, they often overlook scenarios where instructions contain conflicting constraints—a common occurrence in complex prompts. The behavior of LLMs under such conditions remains under-explored. To bridge this gap, we introduce ConInstruct, a benchmark specifically designed to assess LLMs' ability to detect and resolve conflicts within user instructions. Using this dataset, we evaluate LLMs' conflict detection performance and analyze their conflict resolution behavior. Our experiments reveal two key findings: (1) Most proprietary LLMs exhibit strong conflict detection capabilities, whereas among open-source models, only DeepSeek-R1 demonstrates similarly strong performance. DeepSeek-R1 and Claude-4.5-Sonnet achieve the highest average F1-scores at 91.5% and 87.3%, respectively, ranking first and second overall. (2) Despite their strong conflict detection abilities, LLMs rarely explicitly notify users about the conflicts or request clarification when faced with conflicting constraints. These results underscore a critical shortcoming in current LLMs and highlight an important area for future improvement when designing instruction-following LLMs.

AAAI Conference 2026 Conference Paper

Fine-grained Image Quality Assessment for Perceptual Image Restoration

Xiangfei Sheng
Xiaofeng Pan
Zhichao Yang
Pengfei Chen
Leida Li

Recent years have witnessed remarkable achievements in perceptual image restoration (IR), creating an urgent demand for accurate image quality assessment (IQA), which is essential for both performance comparison and algorithm optimization. Unfortunately, the existing IQA metrics exhibit inherent weakness for IR task, particularly when distinguishing fine-grained quality differences among restored images. To address this dilemma, we contribute the first-of-its-kind fine-grained image quality assessment dataset for image restoration, termed FGRestore, comprising 18,408 restored images across six common IR tasks. Beyond conventional scalar quality scores, FGRestore was also annotated with 30,886 fine-grained pairwise preferences. Based on FGRestore, a comprehensive benchmark was conducted on the existing IQA metrics, which reveal significant inconsistencies between score-based IQA evaluations and the fine-grained restoration quality. Motivated by these findings, we further propose FGResQ, a new IQA model specifically designed for image restoration, which features both coarse-grained score regression and fine-grained quality ranking. Extensive experiments and comparisons demonstrate that FGResQ significantly outperforms state-of-the-art IQA metrics.

TAAS Journal 2026 Journal Article

Fine-grained Tracing for Performance Anomaly Diagnosis of Serverless Functions

Runan Wang
Guangba Yu
Giuliano Casale
Pengfei Chen
Antonio Filieri

Serverless function compositions subject to unpredictable faults are challenging to evaluate for root cause analysis. Even though distributed tracing provides observations at multiple levels of granularity for troubleshooting, excessive code instrumentation increases the tracing overheads in terms of both computation and storage. Therefore, developers face the challenge of where and how to instrument serverless functions to maximize the likelihood of locating faults based on tracing data while minimizing tracing overhead and costs. In this article, we propose a methodology to instrument an application with code-level tracing to infer the location of faults, taking into account constraints in terms of the maximum cost of the instrumentation and testing. We encode the tracing probe placement based on the control flow graph of the application and devise heuristics-based tracing data collection strategies to relate possible probe placements with their ability to locate a fault. Then we train novelty detection models to identify the internal anomalies and present an enhanced global search algorithm that automatically computes a probe placement with optimal fault localization ability versus cost. Experimental results show high performance in locating single and multiple faults with over 90% recall score for up to 15% latency anomalies, with minimal instrumentation overhead.

EAAI Journal 2026 Journal Article

Joint embedding for multi-structural hypergraph based dimensionality reduction in rotor fault diagnosis

Yongfei Zhang
Qibo Liang
Yuqiao Zheng
Rongzhen Zhao
Linfeng Deng
Mingkuan Shi
Pengfei Chen
Kongyuan Wei

AAAI Conference 2026 Conference Paper

LongT2IBench: A Benchmark for Evaluating Long Text-to-Image Generation with Graph-structured Annotations

Zhichao Yang
Tianjiao Gu
Jianjie Wang
Feiyu Lin
Xiangfei Sheng
Pengfei Chen
Leida Li

The increasing popularity of long Text-to-Image (T2I) generation has created an urgent need for automatic and interpretable models that can evaluate the image-text alignment in long prompt scenarios. However, the existing T2I alignment benchmarks predominantly focus on short prompt scenarios and only provide MOS or Likert scale annotations. This inherent limitation hinders the development of long T2I evaluators, particularly in terms of the interpretability of alignment. In this study, we contribute LongT2IBench, which comprises 14K long text-image pairs accompanied by graph-structured human annotations. Given the detail-intensive nature of long prompts, we first design a Generate-Refine-Qualify annotation protocol to convert them into textual graph structures that encompass entities, attributes, and relations. Through this transformation, fine-grained alignment annotations are achieved based on these granular elements. Finally, the graph-structured annotations are converted into alignment scores and interpretations to facilitate the design of T2I evaluation models. Based on LongT2IBench, we further propose LongT2IExpert, a LongT2I evaluator that enables multi-modal large language models (MLLMs) to provide both quantitative scores and structured interpretations through an instruction-tuning process with Hierarchical Alignment Chain-of-Thought (CoT). Extensive experiments and comparisons demonstrate the superiority of the proposed LongT2IExpert in alignment evaluation and interpretation.

AAAI Conference 2026 Conference Paper

TuningIQA: Fine-Grained Blind Image Quality Assessment for Livestreaming Camera Tuning

Xiangfei Sheng
Zhichao Duan
Xiaofeng Pan
Yipo Huang
Zhichao Yang
Pengfei Chen
Leida Li

Livestreaming has become increasingly prevalent in modern visual communication, where automatic camera quality tuning is essential for delivering superior user Quality of Experience (QoE). Such tuning requires accurate blind image quality assessment (BIQA) to guide parameter optimization decisions. Unfortunately, the existing BIQA models typically only predict an overall coarse-grained quality score, which cannot provide fine-grained perceptual guidance for precise camera parameter tuning. To bridge this gap, we first establish FGLive-10K, a comprehensive fine-grained BIQA database containing 10,185 high-resolution images captured under varying camera parameter configurations across diverse livestreaming scenarios. The dataset features 50,925 multi-attribute quality annotations and 19,234 fine-grained pairwise preference annotations. Based on FGLive-10K, we further develop TuningIQA, a fine-grained BIQA metric for livestreaming camera tuning, which integrates human-aware feature extraction and graph-based camera parameter fusion. Extensive experiments and comparisons demonstrate that TuningIQA significantly outperforms state-of-the-art BIQA methods in both score regression and fine-grained quality ranking, achieving superior performance when deployed for livestreaming camera tuning.

EAAI Journal 2025 Journal Article

Dimensionality reduction of rolling bearing fault data based on graph-embedded semi-supervised deep auto-encoders

Kongyuan Wei
Rongzhen Zhao
Haixia Kou
Pengfei Chen
Yongyong Cao
Yuqiao Zheng
Linfeng Deng

EAAI Journal 2025 Journal Article

Multi-source contrastive cluster center method for cross-domain bearing fault identification

Pengfei Chen
Lizhen Wu
Rongzhen Zhao
Kongyuan Wei
Yuqiao Zheng
Linfeng Deng
Yongfei Zhang
Mingkuan Shi

EAAI Journal 2023 Journal Article

Unsupervised structure subdomain adaptation based the Contrastive Cluster Center for bearing fault diagnosis

Pengfei Chen
Rongzhen Zhao
Tianjing He
Kongyuan Wei
Jianhui Yuan

AAAI Conference 2021 Conference Paper

Beyond Class-Conditional Assumption: A Primary Attempt to Combat Instance-Dependent Label Noise

Pengfei Chen
Junjie Ye
Guangyong Chen
Jingwei Zhao
Pheng-Ann Heng

Supervised learning under label noise has seen numerous advances recently, while existing theoretical findings and empirical results broadly build up on the class-conditional noise (CCN) assumption that the noise is independent of input features given the true label. In this work, we present a theoretical hypothesis testing and prove that noise in real-world dataset is unlikely to be CCN, which confirms that label noise should depend on the instance and justifies the urgent need to go beyond the CCN assumption. The theoretical results motivate us to study the more general and practical-relevant instance-dependent noise (IDN). To stimulate the development of theory and methodology on IDN, we formalize an algorithm to generate controllable IDN and present both theoretical and empirical evidence to show that IDN is semantically meaningful and challenging. As a primary attempt to combat IDN, we present a tiny algorithm termed self-evolution average label (SEAL), which not only stands out under IDN with various noise fractions, but also improves the generalization on real-world noise benchmark Clothing1M. Our code is released. Notably, our theoretical analysis in Section 2 provides rigorous motivations for studying IDN, which is an important topic that deserves more research attention in future.

AAAI Conference 2021 Conference Paper

Foresee then Evaluate: Decomposing Value Estimation with Latent Future Prediction

Hongyao Tang
Zhaopeng Meng
Guangyong Chen
Pengfei Chen
Chen Chen
Yaodong Yang
Luo Zhang
Wulong Liu

Value function is the central notion of Reinforcement Learning (RL). Value estimation, especially with function approximation, can be challenging since it involves the stochasticity of environmental dynamics and reward signals that can be sparse and delayed in some cases. A typical model-free RL algorithm usually estimates the values of a policy by Temporal Difference (TD) or Monte Carlo (MC) algorithms directly from rewards, without explicitly taking dynamics into consideration. In this paper, we propose Value Decomposition with Future Prediction (VDFP), providing an explicit two-step understanding of the value estimation process: 1) first foresee the latent future, 2) and then evaluate it. We analytically decompose the value function into a latent future dynamics part and a policy-independent trajectory return part, inducing a way to model latent dynamics and returns separately in value estimation. Further, we derive a practical deep RL algorithm, consisting of a convolutional model to learn compact trajectory representation from past experiences, a conditional variational auto-encoder to predict the latent future dynamics and a convex return model that evaluates trajectory representation. In experiments, we empirically demonstrate the effectiveness of our approach for both off-policy and on-policy RL in several OpenAI Gym continuous control tasks as well as a few challenging variants with delayed reward.

AAAI Conference 2021 Conference Paper

Robustness of Accuracy Metric and its Inspirations in Learning with Noisy Labels

Pengfei Chen
Junjie Ye
Guangyong Chen
Jingwei Zhao
Pheng-Ann Heng

For multi-class classification under class-conditional label noise, we prove that the accuracy metric itself can be robust. We concretize this finding’s inspiration in two essential aspects: training and validation, with which we address critical issues in learning with noisy labels. For training, we show that maximizing training accuracy on sufficiently many noisy samples yields an approximately optimal classifier. For validation, we prove that a noisy validation set is reliable, addressing the critical demand of model selection in scenarios like hyperparameter-tuning and early stopping. Previously, model selection using noisy validation samples has not been theoretically justified. We verify our theoretical results and additional claims with extensive experiments. We show characterizations of models trained with noisy labels, motivated by our theoretical results, and verify the utility of a noisy validation set by showing the impressive performance of a framework termed noisy best teacher and student (NTS). Our code is released.
