Arrow Research search

Author name cluster

Fei Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

40 papers
2 author rows

Possible papers

40

AAAI Conference 2026 Conference Paper

Generating Attribute-Aware Human Motions from Textual Prompt

  • Xinghan Wang
  • Kun Xu
  • Fei Li
  • Cao Sheng
  • JiaZhong Yu
  • Yadong Mu

Text-driven human motion generation has recently attracted considerable attention, allowing models to generate human motions based on textual descriptions. However, current methods neglect the influence of human attributes—such as age, gender, weight, and height—which are key factors shaping human motion patterns. This work represents a pilot exploration for bridging this gap. We conceptualize each motion as comprising both attribute information and action semantics, where textual descriptions align exclusively with action semantics. To achieve this, a new framework inspired by Structural Causal Models is proposed to decouple action semantics from human attributes, enabling text-to-semantics prediction and attribute-controlled generation. The resulting model is capable of generating attribute-aware motion aligned with the user's text and attribute inputs. For evaluation, we introduce a comprehensive dataset containing attribute annotations for text-motion pairs, setting the first benchmark for attribute-aware motion generation. Extensive experiments validate our model's effectiveness.

AAAI Conference 2026 Conference Paper

Interest-Shift-Aware Logical Reasoning for Efficient Long-Sequence Recommendation

  • Fei Li
  • Qingyun Gao
  • Enneng Yang
  • Jianzhe Zhao
  • Guibing Guo

Logical reasoning-based recommendation methods formulate logical expressions to characterize user-item interaction patterns, incorporating regularization constraints to ensure consistency with logical rules. However, these methods face two critical challenges: (1) As sequence length increases, they cannot effectively capture the dynamic transfer of user interests across subsequences (i.e., subsequence interest drift), thereby degenerating logical expressions to single-subsequence inference. (2) The time complexity of logical reasoning and rule learning scales quadratically with the sequence length, severely constraining computational efficiency in long-sequence recommendation. To address these challenges, we propose ELECTOR, an intErest-shift-aware Logical reasoning method for EffiCienT lOng-sequence Recommendation. Specifically, we design a Subsequence Interest Learning Module (SIL) to model cross-subsequence interest drifts in long sequences. SIL employs a local attention mechanism to extract subsequence interests effectively and a global attention mechanism to capture the correlations among subsequence interests. Subsequently, we propose an Interest-aware Logical Reasoning (ILR) mechanism that performs logical reasoning using a limited set of subsequence and short-term interests, rather than reasoning over the entire sequence, significantly reducing time complexity. Additionally, ILR employs an interest logical reasoning contrastive loss to ensure the model simultaneously considers multiple interests. Experiments on four real-world datasets demonstrate that our method significantly outperforms all baselines in both computational efficiency and recommendation accuracy, confirming its effectiveness.

AAAI Conference 2026 Conference Paper

KVmix: Gradient-Based Layer Importance-Aware Mixed-Precision Quantization for KV Cache

  • Fei Li
  • Song Liu
  • Weiguo Wu
  • Shiqiang Nie
  • Jinyu Wang

The high memory demands of the Key-Value (KV) Cache during the inference of Large Language Models (LLMs) severely restrict their deployment on resource-constrained platforms. Quantization can effectively alleviate the memory pressure caused by the KV Cache. However, existing methods either rely on static one-size-fits-all precision allocation or fail to dynamically prioritize critical KV in long-context tasks, forcing memory-accuracy-throughput tradeoffs. In this work, we propose a novel mixed-precision quantization method for the KV Cache named KVmix. KVmix leverages gradient-based importance analysis to evaluate how individual Key and Value projection matrices affect the model loss, enabling layer-specific bit-width allocation for mixed-precision quantization. It dynamically prioritizes higher precision for important layers while aggressively quantizing less influential ones, achieving a tunable balance between accuracy and efficiency. KVmix also introduces a dynamic long-context optimization strategy that adaptively keeps full-precision KV pairs for recent pivotal tokens and compresses older ones, achieving high-quality sequence generation with low memory usage. Additionally, KVmix provides efficient low-bit quantization and CUDA kernels to optimize computational overhead. On LLMs such as Llama and Mistral, KVmix achieves near-lossless inference performance with an extremely low-bit quantization configuration (Key 2.19 bits, Value 2.38 bits), while delivering a remarkable 4.9× memory compression and a 5.3× speedup in inference throughput.
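
The layer-wise allocation step described in this abstract can be sketched as follows. This is a hypothetical illustration, not the authors' code: the importance scores, fraction of high-precision layers, and bit-widths are all illustrative stand-ins for KVmix's gradient-based analysis.

```python
# Hypothetical sketch of gradient-based mixed-precision bit allocation:
# layers whose Key/Value projections have higher importance scores (e.g.
# derived from gradients of the model loss) get more bits; the rest are
# quantized aggressively. Thresholds and bit-widths are illustrative.

def allocate_bits(importance, low_bits=2, high_bits=4, high_frac=0.25):
    """Give high_bits to the top high_frac fraction of layers by
    importance, and low_bits to all remaining layers."""
    n = len(importance)
    n_high = max(1, int(n * high_frac))
    # Rank layer indices by importance, descending.
    ranked = sorted(range(n), key=lambda i: importance[i], reverse=True)
    bits = [low_bits] * n
    for i in ranked[:n_high]:
        bits[i] = high_bits
    return bits

# Toy example: 8 layers with made-up importance scores.
scores = [0.9, 0.1, 0.3, 0.05, 0.7, 0.2, 0.15, 0.4]
bits = allocate_bits(scores)  # layers 0 and 4 get 4 bits, the rest 2
```

Averaging the allocation over layers yields a fractional effective bit-width, which is how figures like "Key 2.19 bits" in the abstract can arise.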

AAAI Conference 2026 Conference Paper

PaSE: Prototype-aligned Calibration and Shapley-based Equilibrium for Multimodal Sentiment Analysis

  • Kang He
  • Boyu Chen
  • Yuzhe Ding
  • Fei Li
  • Chong Teng
  • Donghong Ji

Multimodal Sentiment Analysis (MSA) seeks to understand human emotions by integrating textual, acoustic, and visual signals. Although multimodal fusion is designed to leverage cross-modal complementarity, real-world scenarios often exhibit modality competition: dominant modalities tend to overshadow weaker ones, leading to suboptimal performance. In this paper, we propose PaSE, a novel Prototype-aligned Calibration and Shapley-optimized Equilibrium framework, which enhances collaboration while explicitly mitigating modality competition. PaSE first applies Prototype-guided Calibration Learning (PCL) to refine unimodal representations and align them through an Entropic Optimal Transport mechanism that ensures semantic consistency. To further stabilize optimization, we introduce a Dual-Phase Optimization strategy. A prototype-gated fusion module is first used to extract shared representations, followed by Shapley-based Gradient Modulation (SGM), which adaptively adjusts gradients according to the contribution of each modality. Extensive experiments on IEMOCAP, MOSI, and MOSEI confirm that PaSE achieves superior performance and effectively alleviates modality competition.

EAAI Journal 2025 Journal Article

A lightweight deep learning framework for wild berry detection in complex natural environments

  • Xiaorong Zhang
  • Fei Li
  • XuTing Hu
  • Juan Fang

Wild berries (WildB) play a crucial role in Nordic forest ecosystems’ ecological and economic balance. However, research on wild berry detection has remained scarce in recent years. Driven by recent advancements in Artificial Intelligence (AI), particularly in deep learning-based computer vision applications, we propose a detection model based on the You Only Look Once version 11n (YOLOv11n) architecture (WildB-YOLO). WildB-YOLO integrates multiple innovations: the Frog Feature Pyramid Network (FrogFPN) improves multi-scale feature fusion, facilitating the detection of objects at various scales. The Scale-Aware Context Module (SACM) enhances contextual modeling, improving target discrimination. The Weighted Exponential Moving Average Loss (WEMA Loss) mitigates class imbalance. Additionally, Soft Non-Maximum Suppression (Soft-NMS) refines bounding box selection and reduces false positives, enhancing overall detection performance. The model employs Layer Adaptive Magnitude-Based Pruning (LAMP) to further enhance efficiency, achieving lightweight optimization while maintaining high detection precision. Experimental results demonstrate that WildB-YOLO achieves a mean Average Precision (mAP) of 59.5% at Intersection over Union (IoU) thresholds ranging from 50% to 95% (mAP50-95), outperforming the original YOLOv11n by 1.9%. Furthermore, WildB-YOLO’s optimized model size is 1.5 MB, with 2.5G floating-point operations (FLOPs) and 2.6M parameters, representing reductions of 71.1%, 60.3%, and 76.9%, respectively, compared to the baseline model. This reduction in complexity facilitates deployment on resource-constrained devices, significantly enhancing the applicability of AI-driven berry detection in practical field scenarios. This study pioneers a dedicated solution for wild berry detection and contributes novel strategies for small object detection in complex natural environments. WildB-YOLO is open-source at: https://github.com/zxr0826/WildB-YOLO.
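
Of the components above, Soft-NMS is a standard published technique (Bodla et al.) and can be sketched independently of this paper. The sketch below shows the linear-decay variant: overlapping boxes have their confidence decayed rather than being discarded outright, which is what "refines bounding box selection and reduces false positives" refers to. The thresholds are conventional defaults, not values from this paper.

```python
def iou(a, b):
    # Boxes as (x1, y1, x2, y2); returns intersection-over-union.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, iou_thresh=0.5, score_thresh=0.001):
    """Linear Soft-NMS: decay (rather than discard) overlapping boxes."""
    boxes, scores = list(boxes), list(scores)
    keep = []
    while boxes:
        # Pick the highest-scoring remaining box.
        m = max(range(len(scores)), key=scores.__getitem__)
        best_box, best_score = boxes.pop(m), scores.pop(m)
        if best_score < score_thresh:
            break
        keep.append((best_box, best_score))
        # Decay the scores of boxes that overlap the chosen one.
        for i, b in enumerate(boxes):
            o = iou(best_box, b)
            if o > iou_thresh:
                scores[i] *= (1.0 - o)
    return keep
```

Unlike hard NMS, a heavily overlapped box survives with a reduced score, which helps when two berries genuinely overlap in the image.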

EAAI Journal 2025 Journal Article

A two-stage model for unified sentence- and document-level biomedical event extraction

  • Fangfang Su
  • Yue Zhang
  • Pengfei Jiao
  • Zhidong Zhao
  • Bobo Li
  • Fei Li
  • Donghong Ji

Biomedical event extraction, a cornerstone of information extraction, has increasingly attracted attention within the biomedical research community. It is also a highly complex task, which not only involves many sub-tasks but also nested events. Currently, research on biomedical event extraction, whether based on pipelined models or joint methods, must handle each sub-task separately, and processing the sub-tasks one by one degrades event extraction performance. In addition, most studies focus on extracting sentence-level events and ignore cross-sentence event information. To solve these problems, we simplify the event extraction process, reduce the processing steps, and merge the two sub-tasks of relation extraction and argument combination into one. We also consider document-level event extraction, which not only extracts cross-sentence events but also incorporates broader context information. Experimental results indicate that our novel approach outperforms prior studies. Additionally, the document-level event extraction model attains the top performance on the BioNLP’11 test data and achieves near-leading performance on the BioNLP’13 test data.

EAAI Journal 2025 Journal Article

An attention-guided multi-scale feature cascade network for underwater fish counting

  • Hanyu Zhang
  • Mengping Dong
  • Fei Li
  • Zhenbo Li
  • Ping Hu

Visual counting is essential for advancing fisheries intelligence, but fish scale variation in open underwater environments has made underwater fish counting a constant challenge. Therefore, we propose an Attention-guided Multi-scale Feature Cascade Network, named AMFCNet, which resolves scale variation and improves the accuracy of fish counting in complex underwater environments. AMFCNet utilizes a multi-scale attention gate for multi-scale feature fusion, and integrates a multi-scale convolution module to capture complex spatial relationships. It also employs a multi-head supervision fusion strategy to mask irrelevant regions, ensuring targeted learning for each scale and generating high-quality multi-scale density maps. Experimental results demonstrate that the proposed method achieves state-of-the-art performance on the proposed dataset with the lowest computational cost, significantly outperforming 11 mainstream counting methods. It also achieves excellent results on other publicly available underwater datasets, with Mean Absolute Error (MAE), Root Mean Square Error (RMSE), and Normalized Absolute Error (NAE) values of 1.26, 1.71, and 0.08, respectively. This method shows significant potential for practical applications in aquaculture, such as in marine ranching and pond farming, to assess fish growth conditions and adjust feeding strategies accordingly.
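
The three error metrics quoted above (MAE, RMSE, NAE) are standard for counting tasks and can be computed directly from per-image predicted and ground-truth counts. The sketch below uses their usual definitions; it is an illustration, not code from the paper.

```python
import math

def counting_metrics(pred, true):
    """MAE, RMSE and NAE over per-image object counts.
    NAE normalizes the total absolute error by the total true count."""
    errs = [p - t for p, t in zip(pred, true)]
    mae = sum(abs(e) for e in errs) / len(errs)
    rmse = math.sqrt(sum(e * e for e in errs) / len(errs))
    nae = sum(abs(e) for e in errs) / sum(true)
    return mae, rmse, nae

# Toy example: predicted vs. true fish counts on three images.
mae, rmse, nae = counting_metrics([10, 12, 8], [11, 10, 8])
```

In density-map counting methods like the one described, the predicted count per image is the sum over the predicted density map, after which these metrics apply unchanged.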

EAAI Journal 2025 Journal Article

Contrastive prototype learning with semantic patchmix for few-shot image classification

  • Mengping Dong
  • Fei Li
  • Zhenbo Li
  • Xue Liu

Few-shot image classification aims to learn unseen classes with only a few training samples for each class. However, most existing models still suffer from weak feature representation due to data scarcity. To this end, a novel contrastive learning framework is proposed for few-shot image classification that utilizes patch-wise and class-wise features. Concretely, a semantic patchmix scheme is designed to effectively capture patch-wise features with more discriminative representation. Specifically, a new information noise contrastive estimation loss with a modulating factor is proposed to adjust the weights of samples, which is adaptive and trades off different samples. For class-wise features, contrastive prototype learning on two correlated views is leveraged to enhance the generalization of representations. Experiments demonstrate that our method achieves competitive performance on five popular datasets for few-shot image classification. In particular, our method brings a 1.74% improvement in accuracy over state-of-the-art methods in the 5-way 1-shot setting.

NeurIPS Conference 2025 Conference Paper

EVOREFUSE: Evolutionary Prompt Optimization for Evaluation and Mitigation of LLM Over-Refusal to Pseudo-Malicious Instructions

  • Xiaorui Wu
  • Fei Li
  • Xiaofeng Mao
  • Xin Zhang
  • Li Zheng
  • Yuxiang Peng
  • Chong Teng
  • Donghong Ji

Large language models (LLMs) frequently refuse to respond to pseudo-malicious instructions: semantically harmless input queries triggering unnecessary LLM refusals due to conservative safety alignment, significantly impairing user experience. Collecting such instructions is crucial for evaluating and mitigating over-refusals, but existing instruction curation methods, like manual creation or instruction rewriting, either lack scalability or fail to produce sufficiently diverse and effective refusal-inducing prompts. To address these limitations, we introduce EVOREFUSE, a prompt optimization approach that generates diverse pseudo-malicious instructions consistently eliciting confident refusals across LLMs. EVOREFUSE employs an evolutionary algorithm exploring the instruction space in more diverse directions than existing methods via mutation strategies and recombination, and iteratively evolves seed instructions to maximize evidence lower bound on LLM refusal probability. Using EVOREFUSE, we create two novel datasets: EVOREFUSE-TEST, a benchmark of 582 pseudo-malicious instructions that outperforms the next-best benchmark with 85.34% higher average refusal triggering rate across 9 LLMs without a safety-prior system prompt, 34.86% greater lexical diversity, and 40.03% improved LLM response confidence scores; and EVOREFUSE-ALIGN, which provides 3,000 pseudo-malicious instructions with responses for supervised and preference-based alignment training. With supervised fine-tuning on EVOREFUSE-ALIGN, LLAMA3.1-8B-INSTRUCT achieves up to 29.85% fewer over-refusals than models trained on the second-best alignment dataset, without compromising safety. Our analysis with EVOREFUSE-TEST reveals models trigger over-refusals by overly focusing on sensitive keywords while ignoring broader context. Our code and datasets are available at https://github.com/FishT0ucher/EVOREFUSE.
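
The evolutionary loop this abstract describes (mutate and recombine seed instructions, keep the candidates a refusal scorer rates highest) can be sketched generically. Everything here is a toy stand-in: the real system queries LLMs and maximizes an evidence lower bound on refusal probability, whereas this sketch accepts any scoring function and uses illustrative mutation strings.

```python
import random

# Toy evolutionary prompt-optimization loop. HEDGES, the mutation and
# recombination operators, and the scorer are all illustrative; they are
# not EVOREFUSE's actual operators.
HEDGES = ["hypothetically speaking", "for a short story", "in general terms"]

def mutate(prompt, rng):
    # Append a randomly chosen rephrasing fragment.
    return prompt + ", " + rng.choice(HEDGES)

def recombine(a, b):
    # One-point crossover on words.
    wa, wb = a.split(), b.split()
    return " ".join(wa[: len(wa) // 2] + wb[len(wb) // 2 :])

def evolve(seeds, score, generations=5, pop_size=8, seed=0):
    """Keep the pop_size highest-scoring candidates each generation."""
    rng = random.Random(seed)
    pop = list(seeds)
    for _ in range(generations):
        children = [mutate(rng.choice(pop), rng) for _ in range(pop_size)]
        children += [recombine(rng.choice(pop), rng.choice(pop))
                     for _ in range(pop_size)]
        pop = sorted(pop + children, key=score, reverse=True)[:pop_size]
    return pop[0]
```

With `score` replaced by an estimate of an LLM's refusal probability, the same loop selects for instructions that reliably trigger refusals, which is the evaluation-set construction the abstract describes.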

AAAI Conference 2025 Conference Paper

Multi-Granular Multimodal Clue Fusion for Meme Understanding

  • Li Zheng
  • Hao Fei
  • Ting Dai
  • Zuquan Peng
  • Fei Li
  • Huisheng Ma
  • Chong Teng
  • Donghong Ji

With the continuous emergence of various social media platforms frequently used in daily life, the multimodal meme understanding (MMU) task has been garnering increasing attention. MMU aims to explore and comprehend the meanings of memes from various perspectives by performing tasks such as metaphor recognition, sentiment analysis, intention detection, and offensiveness detection. Despite making progress, limitations persist due to the loss of fine-grained metaphorical visual clues and the neglect of weak multimodal text-image correlation. To overcome these limitations, we propose a multi-granular multimodal clue fusion model (MGMCF) to advance MMU. Firstly, we design an object-level semantic mining module to extract object-level image feature clues, achieving fine-grained feature clue extraction and enhancing the model's ability to capture metaphorical details and semantics. Secondly, we propose a brand-new global-local cross-modal interaction model to address the weak correlation between text and images. This model facilitates effective interaction between global multimodal contextual clues and local unimodal feature clues, strengthening their representations through a bidirectional cross-modal attention mechanism. Finally, we devise a dual-semantic guided training strategy to enhance the model's understanding and alignment of multimodal representations in the semantic space. Experiments conducted on the widely-used MET-MEME bilingual dataset demonstrate significant improvements over state-of-the-art baselines. Specifically, there is an 8.14% increase in precision for the offensiveness detection task, and respective accuracy enhancements of 3.53%, 3.89%, and 3.52% for the metaphor recognition, sentiment analysis, and intention detection tasks. These results, underpinned by in-depth analyses, underscore the effectiveness and potential of our approach for advancing MMU.

EAAI Journal 2025 Journal Article

Towards salient object detection via parallel dual-decoder network

  • Chaojun Cen
  • Fei Li
  • Zhenbo Li
  • Yun Wang

Salient object detection, an important preprocessing step in computer vision, segments the most prominent objects in an image. However, existing research in this field utilizes transformer-based methods to capture global context information, failing to effectively obtain local spatial features. To solve this issue, we propose a parallel dual-decoder network, which consists of a novel semantic decoder and a modified salient decoder. Specifically, the proposed semantic decoder is designed to learn the local spatial details, and the salient decoder utilizes the learnable queries to establish global saliency dependencies among objects. Moreover, the two decoders establish correlations between saliency and multi-scale semantic representations through cross-attention interaction, significantly enhancing the performance of salient object detection. In other words, we obtain global context information in the decoder to prevent discriminative features from being diluted during information propagation. Extensive experiments on 15 benchmark datasets demonstrate that our model significantly outperforms other comparison methods and shows promising potential for real-world applications such as challenging optical remote sensing, underwater, low-light, and other open scenarios. In addition, our method shows excellent performance in other downstream tasks such as camouflaged object detection, transparent object detection, shadow detection, and semantic segmentation.

EAAI Journal 2024 Journal Article

A two-level game theoretic approach for task offloading in mobile edge computing

  • Fei Li
  • Erqian Ge
  • Wanyue Hu
  • Rongsheng Xia

In a mobile edge computing system subject to wireless interference, the Edge Server Provider (ESP) aims to offer profitable computing resources to Device Managers (DMs), who make optimal strategies based on the provided prices. However, the existence of mixed variables typically constitutes an NP-hard problem, posing a significant challenge for optimization. To address this issue, we formulate a bi-level optimization problem, where the upper level is devoted to optimizing the pricing of computing resources. The lower level optimizes DM migration strategies and resource allocation at the specified prices. Leveraging game theory, we achieve distributed and efficient computation offloading by formulating the distributed computing offloading strategy among lower-level DMs as a task offloading game. Our analysis identifies Nash equilibrium and finite improvement properties within the game. Based on these insights, we present a bi-level distributed computation offloading algorithm capable of reaching a Nash equilibrium, thus optimizing the profit for both DMs and the ESP. Experiments demonstrate the algorithm’s effectiveness in reducing system costs and maximizing the profits of the ESP and DMs across diverse scenarios.
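
The finite-improvement property mentioned above means that if each DM in turn switches to its own best response, the process cannot cycle and must reach a Nash equilibrium. A minimal sketch of that dynamic, with a deliberately simplified congestion-style cost model (not the paper's actual cost functions or pricing), looks like this:

```python
# Toy best-response dynamics for a task offloading game. Each DM chooses
# 0 (compute locally) or 1 (offload). Offloading costs a base price plus a
# congestion term that grows with the number of offloaders; this cost model
# is illustrative only. Congestion games of this form have a potential
# function, so sequential best responses terminate at a Nash equilibrium.

def best_response_dynamics(n_dms, local_cost, offload_base, congestion):
    """Iterate best responses until no DM wants to switch."""
    choice = [0] * n_dms  # start with everyone computing locally
    changed = True
    while changed:
        changed = False
        for i in range(n_dms):
            others = sum(choice) - choice[i]
            # Offload cost counts this DM itself among the offloaders.
            cost_offload = offload_base + congestion * (others + 1)
            best = 1 if cost_offload < local_cost[i] else 0
            if best != choice[i]:
                choice[i] = best
                changed = True
    return choice

# Example: only the DM with an expensive local computation offloads.
eq = best_response_dynamics(4, [10, 6, 4, 3], offload_base=2, congestion=2)
```

In the paper's bi-level setting, the ESP would additionally adjust `offload_base` (the price) at the upper level to maximize its own profit given the equilibrium the DMs reach.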

AAAI Conference 2024 Conference Paper

Compositional Generalization for Multi-Label Text Classification: A Data-Augmentation Approach

  • Yuyang Chai
  • Zhuang Li
  • Jiahui Liu
  • Lei Chen
  • Fei Li
  • Donghong Ji
  • Chong Teng

Despite significant advancements in multi-label text classification, the ability of existing models to generalize to novel and seldom-encountered complex concepts, which are compositions of elementary ones, remains underexplored. This research addresses this gap. By creating unique data splits across three benchmarks, we assess the compositional generalization ability of existing multi-label text classification models. Our results show that these models often fail to generalize to compositional concepts encountered infrequently during training, leading to inferior performance on tests with these new combinations. To address this, we introduce a data augmentation method that leverages two innovative text generation models designed to enhance the classification models' capacity for compositional generalization. Our experiments show that this data augmentation approach significantly improves the compositional generalization capabilities of classification models on our benchmarks, with both generation models surpassing other text generation baselines. Our code is available at https://github.com/yychai74/LD-VAE.

AAAI Conference 2024 Conference Paper

Harnessing Holistic Discourse Features and Triadic Interaction for Sentiment Quadruple Extraction in Dialogues

  • Bobo Li
  • Hao Fei
  • Lizi Liao
  • Yu Zhao
  • Fangfang Su
  • Fei Li
  • Donghong Ji

Dialogue Aspect-based Sentiment Quadruple (DiaASQ) is a newly-emergent task aiming to extract the sentiment quadruple (i.e., targets, aspects, opinions, and sentiments) from conversations. While showing promising performance, the prior DiaASQ approach unfortunately falls short on the key cruxes of DiaASQ, namely insufficient modeling of discourse features and a lack of interaction within quadruple extraction, which hinders further task improvement. To this end, we introduce a novel framework that not only capitalizes on comprehensive discourse feature modeling, but also captures the intrinsic interaction for optimal quadruple extraction. On the one hand, drawing upon multiple discourse features, our approach constructs a token-level heterogeneous graph and enhances token interactions through a heterogeneous attention network. We further propose a novel triadic scorer, strengthening weak token relations within a quadruple, thereby enhancing the cohesion of the quadruple extraction. Experimental results on the DiaASQ benchmark showcase that our model significantly outperforms existing baselines across both English and Chinese datasets. Our code is available at https://bit.ly/3v27pqA.

JBHI Journal 2024 Journal Article

Improving Tumor Classification by Reusing Self-Predicted Segmentation of Medical Images as Guiding Knowledge

  • Xiaoyi Lin
  • Mingyu Wang
  • Fei Li
  • Ziyue Xu
  • Jia Chen
  • Xin Chen
  • Chenglang Yuan
  • Songxiong Wu

Differential diagnosis of tumors is important for computer-aided diagnosis. In computer-aided diagnosis systems, expert knowledge from lesion segmentation masks is used only in a limited way, during preprocessing or as supervision to guide feature extraction. To improve the utilization of lesion segmentation masks, this study proposes a simple and effective multitask learning network that improves medical image classification using self-predicted segmentation as guiding knowledge; we call this network RS²-net. In RS²-net, the predicted segmentation probability map obtained from the initial segmentation inference is added to the original image to form a new input, which is then fed back into the network for the final classification inference. We validated the proposed RS²-net using three datasets: the pNENs-Grade dataset, which tests the prediction of pancreatic neuroendocrine neoplasm grading; the HCC-MVI dataset, which tests the prediction of microvascular invasion of hepatocellular carcinoma; and the public ISIC 2017 skin lesion dataset. The experimental results indicate that the proposed strategy of reusing self-predicted segmentation is effective, and RS²-net outperforms other popular networks and existing state-of-the-art studies. Interpretive analytics based on feature visualization demonstrate that the improved classification performance of our reuse strategy is due to semantic information that can be acquired in advance in a shallow network.

AAAI Conference 2024 Conference Paper

MindMap: Constructing Evidence Chains for Multi-Step Reasoning in Large Language Models

  • Yangyu Wu
  • Xu Han
  • Wei Song
  • Miaomiao Cheng
  • Fei Li

Large language models (LLMs) have demonstrated remarkable performance in various natural language processing tasks. However, they still face significant challenges in automated reasoning, particularly in scenarios involving multi-step reasoning. In this paper, we focus on the logical reasoning problem, where the main task is to answer a question based on a set of available facts and rules. Much prior work has focused on guiding LLMs to think logically by generating reasoning paths, ignoring the structure among the available facts. In this paper, we propose a simple approach, MindMap, which introduces evidence chains to support reasoning. An evidence chain refers to a set of facts that involve the same subject. In this way, we can organize related facts together to avoid missing important information. MindMap can be integrated with existing reasoning frameworks, such as Chain-of-Thought (CoT) and Selection-Inference (SI), by letting the model select relevant evidence chains instead of independent facts. The experimental results on the bAbI and ProofWriterOWA datasets demonstrate the effectiveness of MindMap. It can significantly improve CoT and SI, especially in multi-step reasoning tasks.
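
The evidence-chain construction defined in this abstract (group all facts sharing a subject) can be sketched in a few lines. The triple representation of facts is an assumption for illustration; the paper works with natural-language facts from bAbI/ProofWriter.

```python
# Minimal sketch of building evidence chains: facts that share a subject
# are grouped so a reasoning framework (e.g. CoT or SI) can select a whole
# chain at once instead of scanning independent facts. The
# (subject, relation, object) fact format is an illustrative assumption.

def build_evidence_chains(facts):
    """Group (subject, relation, object) triples into per-subject chains."""
    chains = {}
    for subj, rel, obj in facts:
        chains.setdefault(subj, []).append((subj, rel, obj))
    return chains

facts = [
    ("Anne", "is", "kind"),
    ("Bob", "likes", "Anne"),
    ("Anne", "visits", "the park"),
]
chains = build_evidence_chains(facts)  # {"Anne": [...2 facts...], "Bob": [...]}
```

A question about Anne can then be answered by feeding the model the "Anne" chain as a unit, which is the mechanism the abstract credits for avoiding missed information in multi-step reasoning.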

AAAI Conference 2024 Conference Paper

Reverse Multi-Choice Dialogue Commonsense Inference with Graph-of-Thought

  • Li Zheng
  • Hao Fei
  • Fei Li
  • Bobo Li
  • Lizi Liao
  • Donghong Ji
  • Chong Teng

With the proliferation of dialogic data across the Internet, the Dialogue Commonsense Multi-choice Question Answering (DC-MCQ) task has emerged as a response to the challenge of comprehending user queries and intentions. Although prevailing methodologies exhibit effectiveness in addressing single-choice questions, they encounter difficulties in handling multi-choice queries due to the heightened intricacy and informational density. In this paper, inspired by the human cognitive process of progressively excluding options, we propose a three-step Reverse Exclusion Graph-of-Thought (ReX-GoT) framework, comprising Option Exclusion, Error Analysis, and Combine Information. Specifically, our ReX-GoT mimics human reasoning by gradually excluding irrelevant options and learning the reasons for option errors to choose the optimal path of the GoT and ultimately infer the correct answer. By progressively integrating intricate clues, our method effectively reduces the difficulty of multi-choice reasoning and provides a novel solution for DC-MCQ. Extensive experiments on the CICERO and CICERO_v2 datasets validate the significant improvement of our approach on the DC-MCQ task. In the zero-shot setting, our model outperforms the best baseline by 17.67% in terms of F1 score on the multi-choice task. Most strikingly, our GPT3.5-based ReX-GoT framework achieves a remarkable 39.44% increase in F1 score.

YNIMG Journal 2024 Journal Article

Structural and functional alterations in MRI-negative drug-resistant epilepsy and associated gene expression features

  • Ting Liu
  • Sheng Wang
  • Yingjie Tang
  • Sisi Jiang
  • Huixia Lin
  • Fei Li
  • Dezhong Yao
  • Xian Zhu

Neuroimaging techniques have been widely used in the study of epilepsy. However, structural and functional changes in MRI-negative drug-resistant epilepsy (DRE) and the genetic mechanisms behind the structural alterations remain poorly understood. Using structural and functional MRI, we analyzed gray matter volume (GMV) and regional homogeneity (ReHo) in DRE, drug-sensitive epilepsy (DSE), and healthy controls. Gene expression data from the Allen Human Brain Atlas and GMV/ReHo were evaluated to obtain drug resistance-related and epilepsy-associated gene expression, which was compared with real transcriptional data in blood. We found structural and functional alterations in the cerebellum of DRE patients, which may be related to the mechanisms of drug resistance in DRE. Our study confirms that changes in brain morphology and regional activity in DRE patients may be associated with abnormal expression of genes related to nervous system development, and that SP1, an important transcription factor, plays a key role in the drug-resistance mechanism.

AAAI Conference 2023 Conference Paper

Dialogue State Distillation Network with Inter-slot Contrastive Learning for Dialogue State Tracking

  • Jing Xu
  • Dandan Song
  • Chong Liu
  • Siu Cheung Hui
  • Fei Li
  • Qiang Ju
  • Xiaonan He
  • Jian Xie

In task-oriented dialogue systems, Dialogue State Tracking (DST) aims to extract users' intentions from the dialogue history. Currently, most existing approaches suffer from error propagation and are unable to dynamically select relevant information when utilizing previous dialogue states. Moreover, the relations between the updates of different slots provide vital clues for DST. However, existing approaches rely only on predefined graphs to indirectly capture these relations. In this paper, we propose a Dialogue State Distillation Network (DSDN) to utilize relevant information from previous dialogue states and bridge the utilization gap between training and testing. Thus, it can dynamically exploit previous dialogue states while avoiding error propagation. Further, we propose an inter-slot contrastive learning loss to effectively capture slot co-update relations from the dialogue context. Experiments are conducted on the widely used MultiWOZ 2.0 and MultiWOZ 2.1 datasets. The experimental results show that our proposed model achieves state-of-the-art performance for DST.

EAAI Journal 2023 Journal Article

MOIT: A Novel task for mining opinions towards implicit targets

  • Jun Zhou
  • Fei Li
  • Chong Teng
  • Yijiang Liu
  • Chunli Xiang
  • Donghong Ji

The extraction of opinions and their corresponding targets has gained significant interest recently, as it offers valuable insights into Opinion Mining (OM) at a granular level. Existing OM tasks require the opinion and target terms to be extracted to appear explicitly in reviews. Targets that are not present but are implied by contextual semantics are neglected by existing OM tasks, even though one investigation reported that about 60% of reviews contain implicit targets. To enable implicit target extraction, a novel task named Mining Opinions towards Implicit Targets (MOIT), under fine-grained OM, is proposed to extract both opinions and their corresponding implicit targets, enabling a more comprehensive analysis of reviews. To set up the basis for follow-up research on MOIT, two large-scale datasets were constructed as resources in two languages, where the Chinese dataset was built from scratch via a standard human annotation process, and the English dataset was built semi-automatically through machine translation and manual checking. Furthermore, three baseline models adapting three representative paradigms of information extraction, namely sequence labeling, question answering, and text generation, were proposed to solve MOIT. Extensive experiments demonstrated the effectiveness of the models. The proposed MOIT task extends the field of OM research, and the datasets and models establish a foundation for future studies in this area.

IJCAI Conference 2022 Conference Paper

Global Inference with Explicit Syntactic and Discourse Structures for Dialogue-Level Relation Extraction

  • Hao Fei
  • Jingye Li
  • Shengqiong Wu
  • Chenliang Li
  • Donghong Ji
  • Fei Li

Recently, research attention in relation extraction has turned to the dialogue scenario, i.e., dialogue-level relation extraction (DiaRE). Existing DiaRE methods either simply concatenate the utterances in a dialogue into a long piece of text, or employ naive words, sentences or entities to build dialogue graphs, while the structural characteristics of dialogues have not been fully utilized. In this work, we investigate a novel dialogue-level mixed dependency graph (D2G) and an argument reasoning graph (ARG) for DiaRE with a global relation reasoning mechanism. First, we model the entire dialogue as a unified and coherent D2G by explicitly integrating both syntactic and discourse structures, which enables richer semantic and feature learning for relation extraction. Second, we stack an ARG on top of the D2G to further focus on argument inter-dependency learning and argument representation refinement, for sufficient argument relation inference. In our global reasoning framework, D2G and ARG work collaboratively, iteratively performing lexical, syntactic and semantic information exchange and representation learning over the entire dialogue context. On two DiaRE benchmarks, our framework shows considerable improvements over the current state-of-the-art baselines. Further analyses show that the model effectively solves the long-range dependency issue while giving explainable predictions.

IJCAI Conference 2022 Conference Paper

Inheriting the Wisdom of Predecessors: A Multiplex Cascade Framework for Unified Aspect-based Sentiment Analysis

  • Hao Fei
  • Fei Li
  • Chenliang Li
  • Shengqiong Wu
  • Jingye Li
  • Donghong Ji

So far, aspect-based sentiment analysis (ABSA) has involved a total of seven subtasks, yet the interactions among them have not been sufficiently explored. This work presents a novel multiplex cascade framework for unified ABSA that maintains such interactions. First, we model all seven subtasks as a hierarchical dependency in easy-to-hard order, based on which we then propose a multiplex decoding mechanism, transferring the sentiment layouts and clues from lower tasks to upper ones. The multiplex strategy enables highly efficient subtask interflows and avoids repetitive training; meanwhile, it sufficiently utilizes the existing data without requiring any further annotation. Further, based on the characteristics of aspect-opinion term extraction and pairing, we enhance our multiplex framework by integrating POS tag and syntactic dependency information for term boundary and pairing identification. The proposed Syntax-aware Multiplex (SyMux) framework enhances ABSA performance on 28 subtasks (7×4 datasets) by large margins.

NeurIPS Conference 2022 Conference Paper

LasUIE: Unifying Information Extraction with Latent Adaptive Structure-aware Generative Language Model

  • Hao Fei
  • Shengqiong Wu
  • Jingye Li
  • Bobo Li
  • Fei Li
  • Libo Qin
  • Meishan Zhang
  • Min Zhang

Universally modeling all typical information extraction tasks (UIE) with one generative language model (GLM) has revealed great potential in recent work, where various IE predictions are unified into a linearized hierarchical expression under a GLM. Syntactic structure information, an effective type of feature that has been extensively utilized in the IE community, should also be beneficial to UIE. In this work, we propose a novel structure-aware GLM, fully unleashing the power of syntactic knowledge for UIE. A heterogeneous structure inductor is explored to induce, without supervision, rich heterogeneous structural representations by post-training an existing GLM. In particular, a structural broadcaster is devised to compact various latent trees into explicit high-order forests, helping to guide better generation during decoding. We finally introduce a task-oriented structure fine-tuning mechanism, further adjusting the learned structures to best coincide with the end task's needs. Over 12 IE benchmarks across 7 tasks, our system shows significant improvements over the baseline UIE system. Further in-depth analyses show that our GLM learns rich task-adaptive structural bias that greatly alleviates the two cruxes of UIE: the long-range dependency issue and boundary identification.

AAAI Conference 2022 Conference Paper

Mastering the Explicit Opinion-Role Interaction: Syntax-Aided Neural Transition System for Unified Opinion Role Labeling

  • Shengqiong Wu
  • Hao Fei
  • Fei Li
  • Meishan Zhang
  • Yijiang Liu
  • Chong Teng
  • Donghong Ji

Unified opinion role labeling (ORL) aims to detect all possible 'opinion-holder-target' structures in one shot, given a text. The existing transition-based unified method, unfortunately, struggles with longer opinion terms and fails to solve the term-overlap issue. The current top performance has been achieved with a span-based graph model, which, however, still suffers from both high model complexity and insufficient interaction among opinions and roles. In this work, we investigate a novel solution that revisits the transition architecture and augments it with a pointer network (PointNet). The framework parses out all opinion structures in linear-time complexity, while PointNet removes the limitation on term length. To achieve explicit opinion-role interactions, we further propose a unified dependency-opinion graph (UDOG), co-modeling the syntactic dependency structure and the partial opinion-role structure. We then devise a relation-centered graph aggregator (RCGA) to encode the multi-relational UDOG, where the resulting high-order representations are used to promote the predictions in the vanilla transition system. Our model achieves new state-of-the-art results on the MPQA benchmark. Analyses further demonstrate the superiority of our method in both efficacy and efficiency.

EAAI Journal 2022 Journal Article

Towards fusing fuzzy discriminative projection and representation learning for image classification

  • Yun Wang
  • Zhenbo Li
  • Fei Li
  • Pu Yang
  • Jun Yue

Fuzzy, indistinguishable data, affected by complex and variable factors that are hard to avoid during data acquisition, lead to inferior recognition performance. Subspace projection is widely used for extracting low-dimensional important features in image processing tasks. However, many existing methods rarely explore the fuzziness and uncertainty of visual data, and lack sufficient mining of prior knowledge. In this work, we propose a novel fuzzy discriminative projection and representation learning (FDPR) method for image classification. Specifically, a fuzzy weight matrix with label information is designed in the data reconstruction to generate a more specific sparse constraint on the representation coefficients. In addition, low-rank and ℓ2,1-norm constraints are introduced to enhance the robustness of the algorithm. Finally, we combine a classification regression term with the representation coefficients carrying discriminative information for the subspace projection learning, thus fully utilizing data label information and eventually making the subspace more distinguishable. Experimental results on several datasets show that our proposed model performs effectively and robustly, surpassing other state-of-the-art approaches.
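For readers unfamiliar with the notation, the ℓ2,1 norm referenced in this abstract is the sum of the ℓ2 norms of a matrix's rows, a standard regularizer for encouraging row-sparsity. The snippet below is our own plain-Python illustration, not the paper's code.

```python
# The l2,1 norm: take the l2 norm of each row, then sum over rows.
# Penalizing it drives entire rows of the coefficient matrix toward zero.
import math

def l21_norm(M):
    return sum(math.sqrt(sum(x * x for x in row)) for row in M)

M = [[3.0, 4.0],    # row norm 5
     [0.0, 0.0],    # row norm 0 (a "switched-off" row)
     [6.0, 8.0]]    # row norm 10
print(l21_norm(M))  # 15.0
```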

AAAI Conference 2022 Conference Paper

Unified Named Entity Recognition as Word-Word Relation Classification

  • Jingye Li
  • Hao Fei
  • Jiang Liu
  • Shengqiong Wu
  • Meishan Zhang
  • Chong Teng
  • Donghong Ji
  • Fei Li

So far, named entity recognition (NER) has involved three major types, including flat, overlapped (aka. nested), and discontinuous NER, which have mostly been studied individually. Recently, growing interest has been built for unified NER, tackling the above three jobs concurrently with one single model. The current best-performing methods mainly include span-based and sequence-to-sequence models, where unfortunately the former merely focuses on boundary identification and the latter may suffer from exposure bias. In this work, we present a novel alternative by modeling unified NER as word-word relation classification, namely W2NER. The architecture resolves the kernel bottleneck of unified NER by effectively modeling the neighboring relations between entity words with Next-Neighboring-Word (NNW) and Tail-Head-Word-* (THW-*) relations. Based on the W2NER scheme we develop a neural framework, in which unified NER is modeled as a 2D grid of word pairs. We then propose multi-granularity 2D convolutions for better refining the grid representations. Finally, a co-predictor is used to sufficiently reason over the word-word relations. We perform extensive experiments on 14 widely-used benchmark datasets for flat, overlapped, and discontinuous NER (8 English and 6 Chinese datasets), where our model beats all the current top-performing baselines, pushing the state-of-the-art performance of unified NER.
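As a rough illustration of the word-word relation scheme described above, the sketch below decodes entity mentions from NNW and THW-* relations. The data layout and decoding procedure are our own simplification for illustration, not the authors' released framework.

```python
# Toy decoding sketch: NNW(i, j) says word j is the next word of an entity
# containing word i; THW-t(tail, head) says the span from head word to tail
# word bounds an entity of type t. Entities (flat or discontinuous) are
# recovered by walking NNW links from each head to its tail.
def decode_entities(nnw, thw):
    successors = {}
    for i, j in sorted(nnw):
        successors.setdefault(i, []).append(j)

    def walk(word, tail, path):
        # Depth-first search along NNW links until the tail word is reached.
        if word == tail:
            return path
        for nxt in successors.get(word, []):
            if nxt <= tail and nxt not in path:
                found = walk(nxt, tail, path + [nxt])
                if found:
                    return found
        return None

    entities = []
    for tail, head, etype in thw:
        path = walk(head, tail, [head])
        if path is not None:
            entities.append((tuple(path), etype))
    return entities

# Words: "aching in legs and shoulders" -> flat entity "aching in legs" and
# discontinuous entity "aching in shoulders".
nnw = {(0, 1), (1, 2), (1, 4)}
thw = [(2, 0, "symptom"), (4, 0, "symptom")]
print(decode_entities(nnw, thw))
```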

AAAI Conference 2021 Conference Paper

Encoder-Decoder Based Unified Semantic Role Labeling with Label-Aware Syntax

  • Hao Fei
  • Fei Li
  • Bobo Li
  • Donghong Ji

Currently, unified semantic role labeling (SRL), which achieves predicate identification and argument role labeling in an end-to-end manner, has received growing interest. Recent works show that leveraging syntax knowledge significantly enhances SRL performance. In this paper, we investigate a novel unified SRL framework based on the sequence-to-sequence architecture with double enhancement on both the encoder and decoder sides. On the encoder side, we propose a novel label-aware graph convolutional network (LA-GCN) to encode both the syntactic dependency arcs and labels into BERT-based word representations. On the decoder side, we creatively design a pointer-network-based model for detecting predicates, arguments and roles jointly. Our pointer-net decoder is able to make decisions by consulting all the input elements in a global view, and meanwhile it is syntax-aware by incorporating the syntax information from LA-GCN. Besides, a high-order interacted attention is introduced into the decoder for leveraging previously recognized triplets to help the current decision. Empirical experiments show that our framework significantly outperforms all existing graph-based methods on the CoNLL09 and Universal Proposition Bank datasets. In-depth analysis demonstrates that our model can effectively capture the correlations between syntactic and SRL structures.

ICRA Conference 2021 Conference Paper

Open-set Intersection Intention Prediction for Autonomous Driving

  • Fei Li
  • Xiangxu Li
  • Jun Luo 0009
  • Shiwei Fan
  • Hongbo Zhang

Intention prediction is a crucial task for Autonomous Driving (AD). Due to the variety of intersection sizes and layouts, it is challenging to predict the intention of a human driver at different intersections, especially unseen and irregular ones. In this paper, we formulate intention prediction at intersections as an open-set prediction problem that requires context-specific matching of the target vehicle state and the diverse intersection configurations, which are in principle unbounded. We capture map-centric features that correspond to intersection structures under a spatial-temporal graph representation, and use two MAAMs (mutually auxiliary attention modules), covering lane-level and exit-level intentions respectively, to predict a target that best matches intersection elements in map-centric feature space. Under our model, attention scores estimate the probability distribution of the open-set intentions that are contextually defined by the structure of the current intersection. The proposed model is trained and evaluated on a simulated dataset. Furthermore, the model, trained on the simulated dataset without any fine-tuning, is directly validated on an in-house real-world dataset collected at 98 real-world intersections and exhibits satisfactory performance, demonstrating the practical viability of our approach.

AAAI Conference 2021 Conference Paper

Rethinking Boundaries: End-To-End Recognition of Discontinuous Mentions with Pointer Networks

  • Hao Fei
  • Donghong Ji
  • Bobo Li
  • Yijiang Liu
  • Yafeng Ren
  • Fei Li

A majority of research interest in irregular (e.g., nested or discontinuous) named entity recognition (NER) has been paid to nested entities, while discontinuous entities have received limited attention. Existing work on discontinuous NER, however, either suffers from decoding ambiguity or predicts using only token-level local features. In this work, we present an innovative model for discontinuous NER based on pointer networks, where the pointer simultaneously decides whether a token at each decoding frame constitutes an entity mention and where the next constituent token is. Our model has three major merits compared with previous work: (1) The pointer mechanism is memory-augmented, which enhances mention boundary detection and the interactions between the current decision and prior recognized mentions. (2) The encoder-decoder architecture can linearize the complexity of structure prediction, and thus reduce search costs. (3) The model makes every decision using global information, i.e., by consulting all the input, encoder and previous decoder output in a global view. Experimental results on the CADEC and ShARe13 datasets show that our model outperforms flat and hypergraph models as well as a state-of-the-art transition-based model for discontinuous NER. Further in-depth analysis demonstrates that our model performs well in recognizing various entities including flat, overlapping and discontinuous ones. More crucially, our model is effective at boundary detection, which is the kernel issue of NER.

JBHI Journal 2020 Journal Article

Automatic Segmentation and Visualization of Choroid in OCT with Knowledge Infused Deep Learning

  • Huihong Zhang
  • Jianlong Yang
  • Kang Zhou
  • Fei Li
  • Yan Hu
  • Yitian Zhao
  • Ce Zheng
  • Xiulan Zhang

The choroid provides oxygen and nourishment to the outer retina and is thus related to the pathology of various ocular diseases. Optical coherence tomography (OCT) is advantageous in visualizing and quantifying the choroid in vivo. However, its application in the study of the choroid is still limited for two reasons. (1) The lower boundary of the choroid (choroid-sclera interface) in OCT is fuzzy, which makes automatic segmentation difficult and inaccurate. (2) The visualization of the choroid is hindered by the vessel shadows from the superficial layers of the inner retina. In this paper, we propose to incorporate medical and imaging prior knowledge with deep learning to address these two problems. We propose a biomarker-infused global-to-local network (Bio-Net) for choroid segmentation, which not only regularizes the segmentation via the predicted choroid thickness, but also leverages a global-to-local segmentation strategy to provide global structure information and suppress overfitting. For eliminating the retinal vessel shadows, we propose a deep-learning pipeline that first locates the shadows using their projection on the retinal pigment epithelium layer, and then predicts the contents of the choroidal vasculature at the shadow locations with an edge-to-texture generative adversarial inpainting network. The results show our method outperforms the existing methods on both tasks. We further apply the proposed method in a clinical prospective study for understanding the pathology of glaucoma, which demonstrates its capacity for detecting the structural and vascular changes of the choroid related to the elevation of intra-ocular pressure.

AAAI Conference 2020 Conference Paper

ICD Coding from Clinical Text Using Multi-Filter Residual Convolutional Neural Network

  • Fei Li
  • Hong Yu

Automated ICD coding, which assigns International Classification of Disease codes to patient visits, has attracted much research attention since it can save time and labor for billing. The previous state-of-the-art model utilized one convolutional layer to build document representations for predicting ICD codes. However, the lengths and grammar of the text fragments closely related to ICD coding vary a lot across documents. Therefore, a flat and fixed-length convolutional architecture may not be capable of learning good document representations. In this paper, we propose a Multi-Filter Residual Convolutional Neural Network (MultiResCNN) for ICD coding. The innovations of our model are two-fold: it utilizes a multi-filter convolutional layer to capture various text patterns with different lengths, and a residual convolutional layer to enlarge the receptive field. We evaluated the effectiveness of our model on the widely-used MIMIC dataset. On the full code set of MIMIC-III, our model outperformed the state-of-the-art model in 4 out of 6 evaluation metrics. On the top-50 code set of MIMIC-III and the full code set of MIMIC-II, our model outperformed all the existing and state-of-the-art models in all evaluation metrics. The code is available at https://github.com/foxlf823/Multi-Filter-Residual-Convolutional-Neural-Network.
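To make the multi-filter idea concrete, here is a toy plain-Python sketch (not the authors' released code, which is linked in the abstract): convolutions of several widths run over the same sequence and their pooled outputs are concatenated, so text patterns of different lengths are captured. A learned filter is stood in for by simple window averaging.

```python
# Toy stand-in for a multi-filter convolutional layer: filters of several
# widths slide over one sequence, each feature map is max-pooled, and the
# pooled features are concatenated into the final representation.
def conv1d(seq, width):
    """Average each window of `width` items (a stand-in for a learned filter)."""
    return [sum(seq[i:i + width]) / width for i in range(len(seq) - width + 1)]

def multi_filter(seq, widths=(2, 3, 5)):
    # One feature map per filter width; max-pool each map, then concatenate.
    return [max(conv1d(seq, w)) for w in widths]

scores = [0, 2, 1, 4, 3, 1]    # toy per-token activation scores
print(multi_filter(scores))
```

The residual layer in the paper additionally adds each layer's input back to its output to enlarge the receptive field; that part is omitted here for brevity.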

YNICL Journal 2019 Journal Article

Adolescent binge drinking disrupts normal trajectories of brain functional organization and personality maturation

  • Hongtao Ruan
  • Yunyi Zhou
  • Qiang Luo
  • Gabriel H. Robert
  • Sylvane Desrivières
  • Erin Burke Quinlan
  • ZhaoWen Liu
  • Tobias Banaschewski

Adolescent binge drinking has been associated with higher risks for the development of many health problems throughout the lifespan. Adolescents undergo multiple changes that involve the co-development processes of brain, personality and behavior; therefore, certain behavior, such as alcohol consumption, can have disruptive effects on both brain development and personality maturation. However, these effects remain unclear due to the scarcity of longitudinal studies. In the current study, we used multivariate approaches to explore discriminative features in brain functional architecture, personality traits, and genetic variants in 19-year-old individuals (n = 212). Taking advantage of a longitudinal design, we selected features that were more drastically altered in drinkers with an earlier onset of binge drinking. With the selected features, we trained a hierarchical model of support vector machines using a training sample (n = 139). Using an independent sample (n = 73), we tested the model and achieved a classification accuracy of 71.2%. We demonstrated longitudinally that after the onset of binge drinking the developmental trajectory of improvement in impulsivity slowed down. This study identified the disrupting effects of adolescent binge drinking on the developmental trajectories of both brain and personality.

YNICL Journal 2019 Journal Article

Disturbed neurovascular coupling in type 2 diabetes mellitus patients: Evidence from a comprehensive fMRI analysis

  • Bo Hu
  • Lin-Feng Yan
  • Qian Sun
  • Ying Yu
  • Jin Zhang
  • Yu-Jie Dai
  • Yang Yang
  • Yu-Chuan Hu

BACKGROUND: Previous studies presumed disturbed neurovascular coupling to be a critical risk factor for cognitive impairment in type 2 diabetes mellitus (T2DM), but distinct clinical evidence was lacking. Consequently, we investigated neurovascular coupling in T2DM patients by exploring the MRI relationship between neuronal activity and the corresponding cerebral blood perfusion. METHODS: Degree centrality (DC) maps and amplitude of low-frequency fluctuation (ALFF) maps were used to represent neuronal activity. Cerebral blood flow (CBF) maps were used to represent cerebral blood perfusion. Correlation coefficients were calculated to reflect the relationship between neuronal activity and cerebral blood perfusion. RESULTS: At the whole gray matter level, the manifestation of neurovascular coupling was investigated using 4 neurovascular biomarkers. We compared these biomarkers and found no significant changes. However, at the brain region level, neurovascular biomarkers in T2DM patients were significantly decreased in 10 brain regions. ALFF-CBF in the left hippocampus and fractional ALFF-CBF in the left amygdala were positively associated with executive function, while ALFF-CBF in the right fusiform gyrus was negatively related to executive function. Disease severity was negatively related to memory and executive function. A longer duration of T2DM was related to milder depression, suggesting that T2DM-related depression may be a psychological rather than a physiological condition. CONCLUSION: Correlations between neuronal activity and cerebral perfusion maps may serve as a method for detecting neurovascular coupling abnormalities, which could be used for diagnosis in the future. Trial registry number: This study was registered in ClinicalTrials.gov (NCT02420470) on April 2, 2015 and published on July 29, 2015.

TCS Journal 2019 Journal Article

Online packet scheduling with bounded delay and lookahead

  • Martin Böhm
  • Marek Chrobak
  • Łukasz Jeż
  • Fei Li
  • Jiří Sgall
  • Pavel Veselý

We study the online bounded-delay packet scheduling problem (PacketScheduling), where packets of unit size arrive at a router over time and need to be transmitted over a network link. Each packet has two attributes: a non-negative weight and a deadline for its transmission. The objective is to maximize the total weight of the transmitted packets. This problem has been well studied in the literature; yet currently the best published upper bound is 1.828 [8], still quite far from the best lower bound of ϕ ≈ 1.618 [11, 2, 6]. In the variant of PacketScheduling with s-bounded instances, each packet can be scheduled in at most s consecutive slots, starting at its release time. The lower bound of ϕ applies even to the special case of 2-bounded instances, and a ϕ-competitive algorithm for 3-bounded instances was given in [5]. Improving that result, and addressing a question posed by Goldwasser [9], we present a ϕ-competitive algorithm for 4-bounded instances. We also study a variant of PacketScheduling where an online algorithm has the additional power of 1-lookahead, knowing at time t which packets will arrive at time t + 1. For PacketScheduling with 1-lookahead restricted to 2-bounded instances, we present an online algorithm with competitive ratio (√13 − 1)/2 ≈ 1.303 and we prove a nearly tight lower bound of (1 + √17)/4 ≈ 1.281. In fact, our lower bound result is more general: using only 2-bounded instances, for any integer ℓ ≥ 0 we prove a lower bound of (1 + √(5 + 8ℓ + 4ℓ²))/(2(ℓ + 1)) for online algorithms with ℓ-lookahead, i.e., algorithms that at time t can see all packets arriving by time t + ℓ. Finally, for non-restricted instances we show a lower bound of 1.25 for randomized algorithms with ℓ-lookahead, for any ℓ ≥ 0.

IJCAI Conference 2016 Conference Paper

Joint Models for Extracting Adverse Drug Events from Biomedical Text

  • Fei Li
  • Yue Zhang
  • Meishan Zhang
  • Donghong Ji

Extracting adverse drug events receives much research attention in the biomedical community. Previous work adopts pipeline models, firstly recognizing drug/disease entity mentions and then identifying adverse drug events from drug/disease pairs. In this paper, we investigate joint models for simultaneously extracting drugs, diseases and adverse drug events. Compared with pipeline models, joint models have two main advantages. First, they make use of information integration to facilitate performance improvement; second, they reduce error propagation in pipeline methods. We compare a discrete model and a deep neural model for extracting drugs, diseases and adverse drug events jointly. Experimental results on a standard ADE corpus show that the discrete joint model outperforms a state-of-the-art baseline pipeline significantly. In addition, when discrete features are replaced by neural features, the recall is further improved.

TCS Journal 2013 Journal Article

A comprehensive study of an online packet scheduling algorithm

  • Fei Li

We study the bounded-delay model for Quality-of-Service buffer management. Time is discrete. There is a buffer. Unit-length jobs (also called packets) arrive at the buffer over time. Each packet has an integer release time, an integer deadline, and a positive real value. A packet's characteristics are not known to an online algorithm until the packet actually arrives. In each time step, at most one packet can be sent out of the buffer. The objective is to maximize the total value of the packets sent by their respective deadlines in an online manner. An online algorithm's performance is usually measured in terms of competitive ratio, when this online algorithm is compared with a clairvoyant algorithm achieving the maximum total value. In this paper, we study a simple and intuitive online algorithm. We analyze its performance in terms of competitive ratio for the general model and a few important variants.
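The bounded-delay model described above can be simulated in a few lines. The greedy rule below (send the maximum-value pending packet in each step) is a classic baseline for this model, not necessarily the algorithm analyzed in the paper.

```python
# Minimal simulation of the bounded-delay model: in each time step, among
# pending packets whose deadlines have not passed, send the one with the
# largest value.
def greedy_schedule(packets, horizon):
    """packets: list of (release, deadline, value); returns total value sent."""
    sent, total = set(), 0.0
    for t in range(horizon):
        pending = [(v, i) for i, (r, d, v) in enumerate(packets)
                   if r <= t <= d and i not in sent]
        if pending:
            v, i = max(pending)
            sent.add(i)
            total += v
    return total

# Two packets released at time 0 with deadline 0 force a choice; greedy
# keeps the heavier one and necessarily loses the other.
print(greedy_schedule([(0, 0, 3.0), (0, 0, 1.0), (1, 2, 2.0)], horizon=3))
```

Instances like this, where any algorithm must drop some value, are what competitive analysis of this problem quantifies.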

TCS Journal 2013 Journal Article

A near-optimal memoryless online algorithm for FIFO buffering two packet classes

  • Fei Li

We consider scheduling packets with values in a capacity-bounded buffer in an online setting. In this model, there is a buffer with limited capacity B. At any time, the buffer cannot accommodate more than B packets. Packets arrive over time. Each packet has a non-negative value. Packets leave the buffer only because they are either sent or dropped. Those packets that have left the buffer will not be reconsidered for delivery any more. In each time step, at most one packet in the buffer can be sent. The order in which the packets are sent should comply with the order of their arrival time. The objective is to maximize the total value of the packets sent in an online manner. In this paper, we study a variant of this FIFO buffering model in which a packet's value is either 1 or α > 1. We present a deterministic memoryless 1.304-competitive algorithm. This algorithm has the same competitive ratio as the one presented in Lotker and Patt-Shamir [Z. Lotker, B. Patt-Shamir, Nearly optimal FIFO buffer management for DiffServ, in: Proceedings of the 21st Annual ACM Symposium on Principles of Distributed Computing, PODC, 2002, pp. 134–142; Z. Lotker, B. Patt-Shamir, Nearly optimal FIFO buffer management for DiffServ, Computer Networks 17 (1) (2003) 77–89]. However, our algorithm is simpler and does not employ any marking bits. The idea used in our algorithm is novel and different from all previous approaches that have been applied for the general model and its variants. We do not proactively preempt one packet when a new packet arrives. Instead, we may preempt more than one 1-value packet at the time when the buffer contains sufficiently many α-value packets.
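A minimal simulation of the two-value FIFO model, under our own naive preemption rule. This is illustrative only and is not the paper's 1.304-competitive algorithm.

```python
# Toy FIFO-buffer simulation for the two-value model: the buffer holds at
# most B packets, one head-of-line packet is sent per step, and an arriving
# alpha-packet may preempt (drop) a buffered 1-value packet when full.
from collections import deque

def simulate(arrivals, B, alpha):
    """arrivals: per-step lists of packet values (each 1 or alpha)."""
    buf, sent_value = deque(), 0.0
    for step_packets in arrivals:
        for v in step_packets:
            if len(buf) < B:
                buf.append(v)
            elif v == alpha and 1 in buf:
                buf.remove(1)        # preempt one 1-value packet
                buf.append(v)
        if buf:
            sent_value += buf.popleft()   # FIFO: send the head-of-line packet
    return sent_value

# Two 1-packets arrive, then two alpha-packets (alpha = 4) that displace
# the remaining queued 1-packet.
print(simulate([[1, 1], [4, 4], [], []], B=2, alpha=4))  # 9.0
```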

NeurIPS Conference 2011 Conference Paper

Fast and Balanced: Efficient Label Tree Learning for Large Scale Object Recognition

  • Jia Deng
  • Sanjeev Satheesh
  • Alexander Berg
  • Fei Li

We present a novel approach to efficiently learn a label tree for large scale classification with many classes. The key contribution of the approach is a technique to simultaneously determine the structure of the tree and learn the classifiers for each node in the tree. This approach also allows fine grained control over the efficiency vs accuracy trade-off in designing a label tree, leading to more balanced trees. Experiments are performed on large scale image classification with 10184 classes and 9 million images. We demonstrate significant improvements in test accuracy and efficiency with less training time and more balanced trees compared to the previous state of the art by Bengio et al.

NeurIPS Conference 2011 Conference Paper

Large-Scale Category Structure Aware Image Categorization

  • Bin Zhao
  • Fei Li
  • Eric Xing

Most previous research on image categorization has focused on medium-scale data sets, while large-scale image categorization with millions of images from thousands of categories remains a challenge. With the emergence of structured large-scale datasets such as ImageNet, rich information about the conceptual relationships between images, such as a tree hierarchy among various image categories, becomes available. As human cognition of the complex visual world benefits from underlying semantic relationships between object classes, we believe a machine learning system can and should leverage such information as well for better performance. In this paper, we employ such semantic relatedness among image categories for large-scale image categorization. Specifically, a category hierarchy is utilized to properly define the loss function and select a common set of features for related categories. An efficient optimization method based on proximal approximation and an accelerated parallel gradient method is introduced. Experimental results on a subset of ImageNet containing 1.2 million images from 1000 categories demonstrate the effectiveness and promise of our proposed approach.

YNIMG Journal 2010 Journal Article

Localization of cerebral functional deficits in treatment-naive, first-episode schizophrenia using resting-state fMRI

  • Xiao-Qi Huang
  • Su Lui
  • Wei Deng
  • Raymond C.K. Chan
  • Qi-Zhu Wu
  • Li-Jun Jiang
  • Jun-Ran Zhang
  • Zhi-Yun Jia

Background Spontaneous low-frequency fluctuations (LFF) in the blood oxygen level-dependent (BOLD) functional magnetic resonance imaging (fMRI) signal have been shown to reflect cerebral spontaneous neural activity, and the present study attempts to explore regional functional changes in the brains of patients with schizophrenia using the amplitude of the BOLD signals. Methods A total of 66 treatment-naïve, first-episode schizophrenia (FES) patients and 66 normal age- and sex-matched controls were recruited. Resting-state fMRIs were obtained using a gradient-echo echo-planar imaging sequence. The amplitude of LFF (ALFF) was calculated using REST software. Voxel-based analysis of the ALFF maps between control and patient groups was performed with two-sample t-tests using SPM2. Results Compared to the controls, the FES group showed significantly decreased ALFF in the medial prefrontal cortex (MPFC) and significant increases in ALFF in the left and right putamen. Significant positive correlations were observed between ALFF values in the bilateral putamen in both the patient and control groups. Conclusions The alterations of ALFF in the MPFC and putamen in FES observed in the present study suggest that functional abnormalities of those areas are present at an early stage of the disease.