Arrow Research search

Author name cluster

Da Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

3 papers
1 author row

Possible papers

3

AAAI 2026 Conference Paper

EchoBat: Echo-Vision Enhancement and Echo-Layered Sampling for Video LLMs Hallucination Mitigation

  • Shuai Liu
  • Da Chen
  • Yiheng Pan
  • Chenwei Tian
  • Qian Li
  • Chenhao Lin

Recent advancements in multimodal large language models (MLLMs) have shown remarkable progress in video understanding. However, video MLLMs (VideoMLLMs) still suffer from hallucinations, generating nonsensical or irrelevant content. This issue partly stems from over-reliance on pre-trained knowledge, sometimes neglecting the rich visual information present in the video. Additionally, many existing methods rely on uniform frame sampling, which can overlook critical visual cues. To address these challenges, we present EchoBat, a novel approach that leverages audio information as well as video temporal and logical consistency to improve preference data construction and keyframe extraction. Our method integrates Direct Preference Optimization (DPO) to mitigate hallucinations by leveraging high-quality, contextually rich preference feedback. Specifically, we use GPT-4o to generate high-quality video descriptions and integrate visually relevant segments from Whisper-derived transcripts to construct preferred responses. Correspondingly, we use the reference model itself to describe the reversed video, then use GPT-4o to reverse the text back and fill in hallucinations to produce non-preferred responses. This strategy enhances the model's ability to understand visual content and the temporal and logical relationships within videos. Furthermore, we propose an echo-layered sampling strategy for keyframe extraction, which provides more precise visual supervision than uniform sampling. Experimental results on the three latest video hallucination benchmarks demonstrate the effectiveness of our approach.

JBHI 2025 Journal Article

$\text{MR}^{2}$-Net: Retinal OCTA Image Stitching via Multi-Scale Representation Learning and Dynamic Location Guidance

  • Haiting Mao
  • Yuhui Ma
  • Dan Zhang
  • Yanda Meng
  • Shaodong Ma
  • Yuchuan Qiao
  • Huazhu Fu
  • Caifeng Shan

Optical coherence tomography angiography (OCTA) plays a crucial role in quantifying and analyzing retinal vascular diseases. However, the limited field of view (FOV) inherent in most commercial OCTA imaging systems poses a significant challenge for clinicians, restricting the ability to analyze larger retinal regions at high resolution. Automatic stitching of OCTA scans of adjacent regions offers a promising solution to extend the region of interest. However, commonly used stitching algorithms struggle to achieve effective alignment due to the noise, artifacts, and dense vasculature present in OCTA images. To address these challenges, we propose a novel retinal OCTA image stitching network, named $\text{MR}^{2}$-Net, which integrates multi-scale representation learning and dynamic location guidance. In the first stage, an image registration network with progressive multi-resolution feature fusion is proposed to derive deep semantic information effectively. Additionally, we introduce a dynamic guidance strategy to locate the foveal avascular zone (FAZ) and constrain registration errors in overlapping vascular regions. In the second stage, an image fusion network based on multiple mask constraints and adjacent image aggregation (AIA) strategies is developed to further eliminate artifacts in the overlapping areas of stitched images, thereby achieving precise vessel alignment. To validate the effectiveness of our method, we conduct a series of experiments on two carefully constructed datasets, i.e., OPTOVUE-OCTA and SVision-OCTA. Experimental results demonstrate that our method outperforms other image stitching methods and effectively generates high-quality wide-field OCTA images, achieving structural similarity index (SSIM) scores of 0.8264 and 0.8014 on the two datasets, respectively.