Arrow Research search

Author name cluster

Dongyang Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

12 papers
2 author rows

Possible papers

12

AAAI Conference 2026 Conference Paper

DCTR: Dual-Constraint Subgraph Optimization for Knowledge Graph-based Retrieval-Augmented Generation

  • Yukun Cao
  • Zirui Xu
  • Dongyang Li
  • Zhihao Guo
  • Luobin Huang
  • Lisheng Wang

Knowledge Graph (KG)-based Retrieval-Augmented Generation (RAG) shifts the contents of retrieval from narrative text to a relational knowledge network, empowering large language models (LLMs) to harness structured relationships between entities. However, conventional KG-RAG approaches are resource-intensive, requiring either query decomposition with multiple LLM rounds or parameterized static knowledge injection to update the model. Although subgraph reasoning aims to address these issues, most current methods are based on heuristic shortest-path and multi-hop graph traversal algorithms. The retrieved subgraphs suffer from incompleteness and semantic drift, and these methods neglect the fine-grained structural-semantic interaction between subgraphs and LLMs. We propose a dual-constraint subgraph optimization for KG-RAG (DCTR). It improves subgraph retrieval and generates high-quality subgraphs with structural integrity and information salience for LLMs. Specifically, it formulates subgraph generation as a two-stage graph-theoretic constrained optimization problem to create compact and complete pseudo-labels. Since these pseudo-labels are discrete, a smooth approximation is employed to convert them into a differentiable representation, thereby optimizing the retriever to highlight key information while extracting subgraphs. On two benchmark datasets, DCTR significantly enhances subgraph quality, achieving state-of-the-art performance in LLM reasoning.

AAAI Conference 2026 Conference Paper

Mnemosyne: Accelerating Multi-Hop Question Answering via Cache Hit Order Fitting

  • Haizhou Du
  • Jiujiu Li
  • Dongyang Li
  • Luobin Huang
  • Lisheng Wang

Multi-Hop Question Answering (MHQA) requires step-by-step reasoning across multiple pieces of information to answer complex questions. Cache-aided Retrieval-Augmented Generation (RAG) can accelerate the retrieval of external knowledge at each reasoning step for MHQA. However, existing methods focus on the cache's internal structure and ignore the misalignment between the queries' arrival order and the cache hit order. To tackle this, we propose Mnemosyne, a cache hit order fitting method designed to accelerate the RAG process for MHQA. Specifically, our cache-aware order fitting strategy adjusts the queries' arrival order via graph reordering to better align with the cache hit order, thereby reducing the likelihood of failed or unproductive retrieval attempts. A multi-granularity cache storage mechanism relaxes the strict hit condition into multiple semantically similar matching modes, so that relevant documents can still be retrieved. Experiments conducted on four multi-hop QA datasets demonstrate that Mnemosyne effectively reduces retrieval latency while enhancing the answer F1 score, achieving a superior trade-off between efficiency and effectiveness.

AAAI Conference 2024 Conference Paper

CIDR: A Cooperative Integrated Dynamic Refining Method for Minimal Feature Removal Problem

  • Qian Chen
  • Taolin Zhang
  • Dongyang Li
  • Xiaofeng He

The minimal feature removal problem in the post-hoc explanation area aims to identify the minimal feature set (MFS). Prior studies compute the minimal feature set with greedy algorithms under a monotonicity assumption that cannot be satisfied in general scenarios, and they do not explore feature interactions. To address these limitations, we propose a Cooperative Integrated Dynamic Refining method (CIDR) to efficiently discover minimal feature sets. Specifically, we design Cooperative Integrated Gradients (CIG) to detect interactions between features. By incorporating CIG and characteristics of the minimal feature set, we transform the minimal feature removal problem into a knapsack problem. Additionally, we devise an auxiliary Minimal Feature Refinement algorithm to determine the minimal feature set from numerous candidate sets. To the best of our knowledge, our work is the first to address the minimal feature removal problem in the field of natural language processing. Extensive experiments demonstrate that CIDR is capable of tracing representative minimal feature sets with improved interpretability across various models and datasets.

AAAI Conference 2024 Conference Paper

Enhancing Hyperspectral Images via Diffusion Model and Group-Autoencoder Super-resolution Network

  • Zhaoyang Wang
  • Dongyang Li
  • Mingyang Zhang
  • Hao Luo
  • Maoguo Gong

Existing hyperspectral image (HSI) super-resolution (SR) methods struggle to effectively capture the complex spectral-spatial relationships and low-level details, while diffusion models are a promising class of generative models known for their exceptional performance in modeling complex relations and learning high- and low-level visual features. The direct application of diffusion models to HSI SR is hampered by challenges such as difficulties in model convergence and protracted inference time. In this work, we introduce a novel Group-Autoencoder (GAE) framework that synergistically combines with the diffusion model to construct a highly effective HSI SR model (DMGASR). Our proposed GAE framework encodes high-dimensional HSI data into a low-dimensional latent space where the diffusion model works, thereby alleviating the difficulty of training the diffusion model while maintaining band correlation and considerably reducing inference time. Experimental results on both natural and remote sensing hyperspectral datasets demonstrate that the proposed method is superior to other state-of-the-art methods both visually and metrically.

ECAI Conference 2024 Conference Paper

R4: Reinforced Retriever-Reorder-Responder for Retrieval-Augmented Large Language Models

  • Taolin Zhang 0001
  • Dongyang Li
  • Qizhou Chen
  • Chengyu Wang 0001
  • Longtao Huang
  • Hui Xue 0001
  • Xiaofeng He
  • Jun Huang 0007

Retrieval-augmented large language models (LLMs) leverage relevant content retrieved by information retrieval systems to generate correct responses, aiming to alleviate the hallucination problem. However, existing retriever-responder methods typically append relevant documents to the prompt of LLMs to perform text generation tasks without considering the interaction of fine-grained structural semantics between the retrieved documents and the LLMs. This issue is particularly important for accurate response generation, as LLMs tend to get "lost in the middle" when dealing with input prompts augmented with lengthy documents. In this work, we propose a new pipeline named "Reinforced Retriever-Reorder-Responder" (R4) to learn document orderings for retrieval-augmented LLMs, thereby further enhancing their generation abilities while the LLMs' parameters remain frozen. The reordering learning process is divided into two steps according to the quality of the generated responses: document order adjustment and document representation enhancement. Specifically, document order adjustment aims to organize retrieved document orderings into beginning, middle, and end positions based on graph attention learning, which maximizes the reinforced reward of response quality. Document representation enhancement further refines the representations of retrieved documents for responses of poor quality via document-level gradient adversarial learning. Extensive experiments demonstrate that our proposed pipeline achieves better factual question-answering performance on knowledge-intensive tasks compared to strong baselines across various public datasets. The source codes and trained models will be released upon paper acceptance.

NeurIPS Conference 2024 Conference Paper

Visual Decoding and Reconstruction via EEG Embeddings with Guided Diffusion

  • Dongyang Li
  • Chen Wei
  • Shiying Li
  • Jiachen Zou
  • Quanying Liu

How to decode human vision through neural signals has attracted long-standing interest in neuroscience and machine learning. Modern contrastive learning and generative models have improved the performance of visual decoding and reconstruction based on functional Magnetic Resonance Imaging (fMRI). However, the high cost and low temporal resolution of fMRI limit its applications in brain-computer interfaces (BCIs), creating a pressing need for visual decoding based on electroencephalography (EEG). In this study, we present an end-to-end zero-shot EEG-based visual reconstruction framework, consisting of a tailored brain encoder, called the Adaptive Thinking Mapper (ATM), which projects neural signals from different sources into a shared subspace as the CLIP embedding, and a two-stage multi-pipe EEG-to-image generation strategy. In stage one, EEG is embedded to align with the high-level CLIP embedding, and then the prior diffusion model refines the EEG embedding into image priors. A blurry image is also decoded from EEG to preserve low-level features. In stage two, we feed the high-level CLIP embedding, the blurry image, and the caption from the EEG latent into a pre-trained diffusion model. Furthermore, we analyzed the impacts of different time windows and brain regions on decoding and reconstruction. The versatility of our framework is demonstrated on the magnetoencephalogram (MEG) data modality. The experimental results indicate that our EEG-based zero-shot visual framework achieves SOTA performance in classification, retrieval, and reconstruction, highlighting the portability, low cost, and high temporal resolution of EEG, enabling a wide range of BCI applications. Our code is available at https://github.com/ncclab-sustech/EEG_Image_decode.

JBHI Journal 2023 Journal Article

Dual-Input Transformer: An End-to-End Model for Preoperative Assessment of Pathological Complete Response to Neoadjuvant Chemotherapy in Breast Cancer Ultrasonography

  • Tong Tong
  • Dongyang Li
  • Jionghui Gu
  • Guo Chen
  • Guotao Bai
  • Xin Yang
  • Kun Wang
  • Tianan Jiang

Neoadjuvant chemotherapy (NAC) is the primary method to reduce the burden of tumor and metastasis; in the treatment of breast cancer, it may provide additional opportunities for breast-conserving surgery. Preoperative assessment of pathological complete response (PCR) to NAC is important for developing individualized treatment approaches and predicting patient prognosis. Compared to magnetic resonance imaging (MRI) and mammography, ultrasonography (US) has the advantages of simplicity, flexibility, and real-time imaging. Moreover, it does not require radiation and allows multi-time acquisition of the tumor during NAC treatment. Recently, deep learning radiomics models based on multi-time-point US images for the prediction of NAC effectiveness have been proposed. To further improve the prediction performance, we carefully designed four supporting modules for our proposed dual-input transformer (DiT): an isolated tokens-to-token patch embedding module, shared position embedding, time embedding, and weighted average pooling feature representation modules. The design of each module considers the characteristics of the US images at multiple time points. We validated our model on our retrospective US dataset comprising 484 cases from two centers with limited cross-center consistency. Patients were allocated to training (n = 297), validation (n = 99), and external test (n = 88) sets. The results show that our model can achieve better performance than the Siamese CNN and the standard tokens-to-token vision transformer without using multi-time-point images. The ablation study also proved the effectiveness of each module designed for DiT.

AAAI Conference 2023 Conference Paper

Frequency Domain Disentanglement for Arbitrary Neural Style Transfer

  • Dongyang Li
  • Hao Luo
  • Pichao Wang
  • Zhibin Wang
  • Shang Liu
  • Fan Wang

Arbitrary neural style transfer has been a popular research topic due to its rich application scenarios. Effective disentanglement of content and style is the critical factor for synthesizing an image with arbitrary style. Existing methods focus on disentangling feature representations of content and style in the spatial domain, where the content and style components are innately entangled and difficult to disentangle cleanly. Therefore, these methods often suffer from low-quality results because of the sub-optimal disentanglement. To address this challenge, this paper proposes the frequency mixer (FreMixer) module that disentangles and re-entangles the frequency spectra of content and style components in the frequency domain. Since content and style components have different frequency-domain characteristics (frequency bands and frequency patterns), the FreMixer can disentangle these two components well. Based on the FreMixer module, we design a novel Frequency Domain Disentanglement (FDD) framework for arbitrary neural style transfer. Qualitative and quantitative experiments verify that the proposed method renders better stylized results than state-of-the-art methods.

ICLR Conference 2021 Conference Paper

Learning Accurate Entropy Model with Global Reference for Image Compression

  • Yichen Qian
  • Zhiyu Tan
  • Xiuyu Sun
  • Ming Lin 0002
  • Dongyang Li
  • Zhenhong Sun
  • Hao Li 0030
  • Rong Jin 0001

In recent deep image compression neural networks, the entropy model plays a critical role in estimating the prior distribution of deep image encodings. Existing methods combine a hyperprior with local context in the entropy estimation function, which greatly limits their performance due to the absence of a global view. In this work, we propose a novel Global Reference Model for image compression to effectively leverage both local and global context information, leading to an enhanced compression rate. The proposed method scans decoded latents and then finds the most relevant latent to assist in estimating the distribution of the current latent. A by-product of this work is a mean-shifting GDN module that further improves the performance. Experimental results demonstrate that the proposed model outperforms most state-of-the-art methods in rate-distortion performance.