Arrow Research search

Author name cluster

Yi Lu

Papers possibly associated with this exact author name in Arrow. This page groups case-insensitive exact-name matches; it is not a full identity-disambiguation profile.

16 papers
2 author rows

Possible papers

TMLR Journal 2026 Journal Article

StructEval: Benchmarking LLMs' Capabilities to Generate Structural Outputs

  • Jialin Yang
  • Dongfu Jiang
  • Tony He
  • Sherman Siu
  • Yuxuan Zhang
  • Disen Liao
  • Zhuofeng Li
  • Huaye Zeng

As Large Language Models (LLMs) become integral to software development workflows, their ability to generate structured outputs has become critically important. We introduce StructEval, a comprehensive benchmark for evaluating LLMs' capabilities in producing both non-renderable (JSON, YAML, CSV) and renderable (HTML, React, SVG) structured formats. Unlike prior benchmarks, StructEval systematically evaluates structural fidelity across diverse formats through two paradigms: 1) generation tasks, producing structured output from natural language prompts, and 2) conversion tasks, translating between structured formats. Our benchmark encompasses 18 formats and 44 task types, with novel metrics for format adherence and structural correctness. Results reveal significant performance gaps: even state-of-the-art models like o1-mini achieve an average score of only 75.58, with open-source alternatives lagging approximately 10 points behind. We find generation tasks more challenging than conversion tasks, and producing correct visual content more difficult than generating text-only structures.
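
Format adherence for the non-renderable formats can be approximated with simple parse checks. A minimal sketch, where the function name and the CSV column-consistency rule are my own illustration rather than StructEval's actual metric:

```python
import csv
import io
import json


def format_adheres(output: str, fmt: str) -> bool:
    """Return True if `output` parses as the requested structured format."""
    try:
        if fmt == "json":
            json.loads(output)
        elif fmt == "csv":
            rows = list(csv.reader(io.StringIO(output)))
            # Illustrative extra rule: require a consistent column count.
            widths = {len(r) for r in rows if r}
            if len(widths) > 1:
                return False
        else:
            return False  # formats like YAML would need extra dependencies
        return True
    except (json.JSONDecodeError, csv.Error):
        return False
```

Renderable formats (HTML, React, SVG) would need a renderer rather than a parser, which is where the benchmark's visual-content tasks get harder.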

AAAI Conference 2026 Conference Paper

Unlearning in Cross-Modal Retrieval via Prior-Prototype Guided Partitioned Dampening

  • Yi Lu
  • Shu Li
  • Yurong Qian

Selective deletion of data from deep models, known as unlearning, has become crucial for enforcing the right to be forgotten, while also mitigating the negative impact of flawed training data. Retraining deep models is often impractical due to data access restrictions and computational overhead. Existing retraining-free methods are typically based on the Fisher Information Matrix (FIM), which quantifies the importance of model parameters with respect to forgetting classes, applying equal dampening to these parameters. This approach implicitly assumes a semantically uniform representation space, where all retained classes are equidistant from the forgetting classes. However, this assumption often fails in real-world cross-modal retrieval scenarios characterized by multi-label and non-orthogonal semantics. To overcome this limitation, we propose Prior-Prototype guided Partitioned dampening (PPP), an effective strategy for selective forgetting in cross-modal retrieval. First, PPP defines prior-prototypes, which are semantic centers derived from well-trained models, to identify neighbor classes semantically close to the forgetting set. Then, PPP uses Fisher information to identify parameters sensitive to forgetting and partitions them into buffer and core regions based on their relative importance to the neighbor and retained sets. Finally, PPP applies a hierarchical dampening strategy, where core parameters receive stronger suppression guided by prototype-based semantic disparities. Comprehensive evaluations on four large-scale benchmarks show that PPP performs competitively with retraining-based baselines, highlighting its effectiveness and generalizability in selective unlearning for cross-modal retrieval.
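
The partitioned dampening step can be sketched as follows; the top-fraction selection, the median split, and the dampening factors are illustrative placeholders, not PPP's actual formulation:

```python
def partitioned_dampening(weights, fisher_forget, fisher_neighbor,
                          top_frac=0.2, buffer_factor=0.5, core_factor=0.1):
    """Dampen the parameters most important to the forgetting classes,
    split into a buffer region (also important to semantically close
    neighbor classes, dampened gently) and a core region (suppressed
    more strongly)."""
    w = list(weights)
    k = max(1, int(top_frac * len(w)))
    # Indices of the k parameters most sensitive to the forgetting classes.
    sensitive = sorted(range(len(w)), key=lambda i: fisher_forget[i])[-k:]
    vals = sorted(fisher_neighbor[i] for i in sensitive)
    threshold = vals[len(vals) // 2]  # median-ish buffer/core split
    for i in sensitive:
        if fisher_neighbor[i] >= threshold:
            w[i] *= buffer_factor  # buffer: neighbor classes still rely on it
        else:
            w[i] *= core_factor    # core: safe to suppress harder
    return w
```

The point of the partition is precisely the abstract's observation: equal dampening treats all retained classes as equidistant from the forgetting set, while splitting by neighbor-class importance does not.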

NeurIPS Conference 2025 Conference Paper

Understanding Parametric and Contextual Knowledge Reconciliation within Large Language Models

  • Jun Zhao
  • Yongzhuo Yang
  • Xiang Hu
  • Jingqi Tong
  • Yi Lu
  • Wei Wu
  • Tao Gui
  • Qi Zhang

Retrieval-Augmented Generation (RAG) provides additional contextual knowledge to complement the parametric knowledge in Large Language Models (LLMs). These two knowledge sources interweave to enhance the accuracy and timeliness of LLM responses. However, the internal mechanisms by which LLMs utilize this knowledge remain unclear. We propose modeling the forward propagation of knowledge as an entity flow, employing this framework to trace LLMs' internal behaviors when processing mixed-source knowledge. Linear probing utilizes a trainable linear classifier to detect specific attributes in hidden layers. However, once trained, a probe cannot adapt to dynamically specified entities. To address this challenge, we construct an entity-aware probe, which introduces special tokens to mark probing targets and employs a small trainable rank-8 LoRA update to process these special markers. We first verify this approach through an attribution experiment, demonstrating that it can accurately detect information about ad-hoc entities from complex hidden states. Next, we trace entity flows across layers to understand how LLMs reconcile conflicting knowledge internally. Our probing results reveal that contextual and parametric knowledge are routed between tokens through distinct sets of attention heads, supporting attention competition only within knowledge types. While conflicting knowledge maintains a residual presence across layers, aligned knowledge from multiple sources gradually accumulates, with the magnitude of this accumulation directly determining its influence on final outputs.
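
The rank-8 low-rank probe update can be sketched as follows; names and shapes are my own illustration, and the special marker tokens and training loop from the paper are omitted:

```python
def lora_probe_score(h, w, A, B):
    """Score a hidden state `h` (length d) with a frozen probe weight `w`
    plus a trainable low-rank correction: w' = w + B @ A, where A is
    r x d and B has length r (rank r, e.g. r = 8 as in the abstract)."""
    d, r = len(w), len(B)
    # Effective weight: frozen probe plus the low-rank delta B @ A.
    w_eff = [w[j] + sum(B[i] * A[i][j] for i in range(r)) for j in range(d)]
    # Probe logit: dot product of the hidden state with the adapted weight.
    return sum(h[j] * w_eff[j] for j in range(d))
```

Only A and B are trained, so the probe stays cheap while the low-rank delta lets it react to the dynamically specified entity markers.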

AAAI Conference 2025 Conference Paper

Video Repurposing from User Generated Content: A Large-scale Dataset and Benchmark

  • Yongliang Wu
  • Wenbo Zhu
  • Jiawang Cao
  • Yi Lu
  • Bozheng Li
  • Weiheng Chi
  • Zihan Qiu
  • Lirian Su

The demand for producing short-form videos for sharing on social media platforms has experienced significant growth in recent times. Despite notable advancements in the fields of video summarization and highlight detection, which can create partially usable short films from raw videos, these approaches are often domain-specific and require an in-depth understanding of real-world video content. To tackle this predicament, we propose Repurpose-10K, an extensive dataset comprising over 10,000 videos with more than 120,000 annotated clips aimed at resolving the video long-to-short task. Recognizing the inherent constraints posed by untrained human annotators, which can result in inaccurate annotations for repurposed videos, we propose a two-stage solution to obtain annotations from real-world user-generated content. Furthermore, we offer a baseline model to address this challenging task by integrating audio, visual, and caption aspects through a cross-modal fusion and alignment framework. We aspire for our work to ignite groundbreaking research in the lesser-explored realms of video repurposing.

NeurIPS Conference 2025 Conference Paper

VisualLens: Personalization through Task-Agnostic Visual History

  • Wang Bill Zhu
  • Deqing Fu
  • Kai Sun
  • Yi Lu
  • Zhaojiang Lin
  • Seungwhan Moon
  • Kanika Narang
  • Mustafa Canim

Existing recommendation systems either rely on user interaction logs, such as online shopping history for shopping recommendations, or focus on text signals. However, item-based histories are not always accessible and generalizable for multimodal recommendation. We hypothesize that a user's visual history, comprising images from daily life, can offer rich, task-agnostic insights into their interests and preferences, and thus be leveraged for effective personalization. To this end, we propose VisualLens, a novel framework that leverages multimodal large language models (MLLMs) to enable personalization using task-agnostic visual history. VisualLens extracts, filters, and refines a spectrum of user profiles from the visual history to support personalized recommendation. We created two new benchmarks, Google-Review-V and Yelp-V, with task-agnostic visual histories, and show that VisualLens improves over state-of-the-art item-based multimodal recommendations by 5-10% on Hit@3, and outperforms GPT-4o by 2-5%. Further analysis shows that VisualLens is robust across varying history lengths and excels at adapting to both longer histories and unseen content categories.
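
Hit@3, the metric reported above, has a standard definition, sketched here with function names of my own:

```python
def hit_at_k(ranked, relevant, k=3):
    """Hit@k for one user: 1.0 if any relevant item is in the top-k."""
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mean_hit_at_k(rankings, relevants, k=3):
    """Average Hit@k across users, the form typically reported."""
    pairs = list(zip(rankings, relevants))
    return sum(hit_at_k(r, rel, k) for r, rel in pairs) / len(pairs)
```

A 5-10% improvement on Hit@3 therefore means the relevant item lands in the top three recommendations for noticeably more users.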

NeurIPS Conference 2024 Conference Paper

APDDv2: Aesthetics of Paintings and Drawings Dataset with Artist Labeled Scores and Comments

  • Xin Jin
  • Qianqian Qiao
  • Yi Lu
  • Huaye Wang
  • Heng Huang
  • Shan Gao
  • Jianfei Liu
  • Rui Li

Datasets play a pivotal role in training visual models, facilitating the development of abstract understandings of visual features through diverse image samples and multidimensional attributes. However, in the realm of aesthetic evaluation of artistic images, datasets remain relatively scarce. Existing painting datasets are often characterized by limited scoring dimensions and insufficient annotations, thereby constraining the advancement and application of automatic aesthetic evaluation methods in the domain of painting. To bridge this gap, we introduce the Aesthetics of Paintings and Drawings Dataset (APDD), the first comprehensive collection of paintings encompassing 24 distinct artistic categories and 10 aesthetic attributes. Building upon the initial release of APDDv1, our ongoing research has identified opportunities for enhancement in data scale and annotation precision. Consequently, APDDv2 boasts an expanded image corpus and improved annotation quality, featuring detailed language comments to better cater to the needs of both researchers and practitioners seeking high-quality painting datasets. Furthermore, we present an updated version of the Art Assessment Network for Specific Painting Styles, denoted as ArtCLIP. Experimental validation demonstrates the superior performance of this revised model in the realm of aesthetic evaluation, surpassing its predecessor in accuracy and efficacy. The dataset and model are available at https://github.com/BestiVictory/APDDv2.git.

IJCAI Conference 2024 Conference Paper

Paintings and Drawings Aesthetics Assessment with Rich Attributes for Various Artistic Categories

  • Xin Jin
  • Qianqian Qiao
  • Yi Lu
  • Huaye Wang
  • Shan Gao
  • Heng Huang
  • Guangdong Li

Image aesthetic evaluation is a highly prominent research domain in the field of computer vision. In recent years, there has been a proliferation of datasets and corresponding evaluation methodologies for assessing the aesthetic quality of photographic works, leading to the establishment of a relatively mature research environment. However, in contrast to the extensive research in photographic aesthetics, the field of aesthetic evaluation for paintings and drawings has seen limited attention until the introduction of the BAID dataset in March 2023. This dataset solely comprises overall scores for high-quality artistic images. Our research marks the pioneering introduction of a multi-attribute, multi-category dataset specifically tailored to the field of painting: Aesthetics of Paintings and Drawings Dataset (APDD). The construction of APDD received active participation from 28 professional artists worldwide, along with dozens of students specializing in the field of art. This dataset encompasses 24 distinct artistic categories and 10 different aesthetic attributes. Each image in APDD has been evaluated by six professionally trained experts in the field of art, including assessments for both total aesthetic scores and aesthetic attribute scores. The final APDD dataset comprises a total of 4985 images, with an annotation count exceeding 31100 entries. Concurrently, we propose an innovative approach: Art Assessment Network for Specific Painting Styles (AANSPS), designed for the assessment of aesthetic attributes in mixed-attribute art datasets. Through this research, our goal is to catalyze advancements in the field of aesthetic evaluation for paintings and drawings, while enriching the available resources and methodologies for its further development and application. Dataset is available at https://github.com/BestiVictory/APDD.git

JBHI Journal 2021 Journal Article

Multiple Embeddings Enhanced Multi-Graph Neural Networks for Chinese Healthcare Named Entity Recognition

  • Lung-Hao Lee
  • Yi Lu

Named Entity Recognition (NER) is a natural language processing task for recognizing named entities in a given sentence. Chinese NER is difficult due to the lack of delimited spaces and conventional features for determining named entity boundaries and categories. This study proposes the ME-MGNN (Multiple Embeddings enhanced Multi-Graph Neural Networks) model for Chinese NER in the healthcare domain. We integrate multiple embeddings at different granularities, from the radical and character to the word level, for an extended character representation, which is fed into multiple gated graph sequence neural networks to identify named entities and classify their types. The experimental datasets were collected from health-related news, digital health magazines and medical question/answer forums. Manual annotation was conducted for a total of 68,460 named entities across 10 entity types (body, symptom, instrument, examination, chemical, disease, drug, supplement, treatment and time) in 30,692 sentences. Experimental results indicated our ME-MGNN model achieved an F1-score of 75.69, outperforming previous methods. In practice, a series of model analyses indicated that our method is effective and efficient for Chinese healthcare NER.
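
The multi-granularity input step can be sketched as a per-character concatenation; this illustrates only the embedding-integration idea, and the gated graph sequence networks are omitted:

```python
def extended_char_representations(radicals, chars, words):
    """Build one extended representation per character by concatenating
    its radical-, character-, and word-level embeddings (the
    multiple-embedding input idea; downstream graph networks omitted)."""
    return [list(r) + list(c) + list(w)
            for r, c, w in zip(radicals, chars, words)]
```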

IROS Conference 2017 Conference Paper

Preliminary study on magnetic tracking based navigation for wire-driven flexible robot

  • Changchun Zhang
  • Yi Lu
  • Xiaoxiao Qiu
  • Shuang Song 0002
  • Li Liu 0017
  • Max Q.-H. Meng

Flexible manipulators enable curvilinear access through small incisions or natural orifices, making them a good choice for minimally invasive surgery and diagnosis. To control such a robot precisely and safely, its real-time position and shape must be measured accurately. In this paper, we propose a magnetic-tracking-based tip pose and shape detection method for wire-driven flexible robots. A permanent magnet is mounted at the distal end of the robot, and its magnetic field is sensed by a sensor array, so the position and orientation of the tip can be estimated with the tracking method. A shape sensing algorithm then estimates the real-time shape from the tip pose. With the tip pose and shape displayed in the reconstructed visual environment, navigation can be achieved. This method requires no sensors to be mounted on the robot and has no line-of-sight problem. Experimental results verified the feasibility of the proposed method, achieving a navigation error of 1.9 mm.
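
A magnetic tracker of this kind typically inverts the standard point-dipole forward model, sketched below; the inverse least-squares solve over the sensor array is omitted:

```python
MU0_OVER_4PI = 1e-7  # magnetic constant / (4*pi), in T*m/A


def dipole_field(sensor, magnet, moment):
    """Flux density of a point magnetic dipole at a sensor position:
    B = (mu0 / 4pi) * (3 (m . r_hat) r_hat - m) / |r|^3."""
    r = [s - m for s, m in zip(sensor, magnet)]
    d = sum(c * c for c in r) ** 0.5
    r_hat = [c / d for c in r]
    m_dot_r = sum(mc * rc for mc, rc in zip(moment, r_hat))
    return [MU0_OVER_4PI * (3.0 * m_dot_r * rh - mc) / d ** 3
            for rh, mc in zip(r_hat, moment)]
```

Given readings from several sensors, the magnet's position and orientation are the pose that best reproduces the measured fields under this model.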

JBHI Journal 2014 Journal Article

The Sensitive and Efficient Detection of Quadriceps Muscle Thickness Changes in Cross-Sectional Plane Using Ultrasonography: A Feasibility Investigation

  • Jizhou Li
  • Yongjin Zhou
  • Yi Lu
  • Guangquan Zhou
  • Lei Wang
  • Yong-Ping Zheng

As a direct determinant parameter to quantify muscle activity, the muscle thickness (MT) has been investigated in many aspects and for various purposes. Ultrasonography (US) is a promising modality to detect muscle morphological changes during contractions since it is portable, noninvasive, and real time. However, there are few reports on sensitive and efficient estimation of changes of MT in a cross-sectional plane. In this feasibility investigation, we proposed a coarse-to-fine method based on a compressive-tracking algorithm for estimation of MT changes during an example task of isometric knee extension using ultrasound images. The sensitivity and efficiency are evaluated with 1920 US images of the quadriceps muscle (QM) in eight subjects. The detection results were compared with those obtained from both traditional manual measurement and the well-known normalized cross-correlation method, and the effect of the size of the tracking window on detection performance was evaluated as well. It is demonstrated that the proposed method agrees well with the manual measurement. Meanwhile, it is not only sensitive to relatively small changes of MT but also computationally efficient.
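
The normalized cross-correlation baseline mentioned above has a standard definition, sketched here for flattened, equal-length patches (the function name is my own):

```python
def ncc(patch, template):
    """Normalized cross-correlation between two equal-length pixel lists:
    +1 for a perfect match, -1 for a perfectly inverted one."""
    n = len(patch)
    mp = sum(patch) / n
    mt = sum(template) / n
    p = [x - mp for x in patch]      # zero-mean patch
    t = [x - mt for x in template]   # zero-mean template
    denom = (sum(x * x for x in p) * sum(x * x for x in t)) ** 0.5
    return sum(a * b for a, b in zip(p, t)) / denom if denom else 0.0
```

Scanning the template across candidate positions and taking the NCC maximum yields the matched displacement from which a thickness change is read off.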

IJCAI Conference 2013 Conference Paper

Fault-Tolerant Planning under Uncertainty

  • Luis Pineda
  • Yi Lu
  • Shlomo Zilberstein
  • Claudia V. Goldman

A fault represents some erroneous operation of a system that could result from an action selection error or some abnormal condition. We formally define error models that characterize the likelihood of various faults and consider the problem of fault-tolerant planning, which optimizes performance given an error model. We show that factoring in the possibility of errors significantly degrades the performance of stochastic planning algorithms such as LAO*, because the number of reachable states grows dramatically. We introduce an approach to plan for a bounded number of faults and analyze its theoretical properties. When combined with a continual planning paradigm, the k-fault-tolerant planning method can produce near-optimal performance, even when the number of faults exceeds the bound. Empirical results in two challenging domains confirm the effectiveness of the approach in handling different types of runtime errors.
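
Bounding the number of faults can be modeled by folding a fault budget into the state, roughly as below; the flat fault probability and the fault-outcome interface are simplified placeholders for the paper's error models:

```python
def k_fault_successors(state, action, budget, intended, faults, p_fault):
    """Successor distribution in a fault-augmented MDP. The remaining
    fault budget is part of the state, so a planner such as LAO*
    explores at most (k + 1) copies of the base state space rather than
    all error interleavings. With the budget exhausted, faults are no
    longer modeled, matching the bounded-fault assumption."""
    if budget == 0:
        return [((intended(state, action), 0), 1.0)]
    wrong = faults(state, action)
    succ = [((intended(state, action), budget), 1.0 - p_fault)]
    succ += [((s2, budget - 1), p_fault / len(wrong)) for s2 in wrong]
    return succ
```

Replanning whenever an actual fault drains the budget recovers the continual-planning behavior the abstract describes.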