
Author name cluster

Dongdong Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

15 papers

Possible papers


AAAI 2026 · Conference Paper

Dream-IF: Dynamic Relative EnhAnceMent for Image Fusion

  • Xingxin Xu
  • Bing Cao
  • Dongdong Li
  • Qinghua Hu
  • Pengfei Zhu

Image fusion aims to integrate comprehensive information from images acquired through multiple sources. However, images captured by diverse sensors often encounter various degradations that can negatively affect fusion quality. Traditional fusion methods generally treat image enhancement and fusion as separate processes, overlooking the inherent correlation between them; notably, the dominant regions in one modality of a fused image often indicate areas where the other modality might benefit from enhancement. Inspired by this observation, we introduce the concept of dominant regions for image enhancement and present a Dynamic Relative EnhAnceMent framework for Image Fusion (Dream-IF). This framework quantifies the relative dominance of each modality across different layers and leverages this information to facilitate reciprocal cross-modal enhancement. By integrating the relative dominance derived from image fusion, our approach supports not only image restoration but also a broader range of image enhancement applications. Furthermore, we employ prompt-based encoding to capture degradation-specific details, which dynamically steer the restoration process and promote coordinated enhancement in both multi-modal image fusion and image enhancement scenarios. Extensive experimental results demonstrate that Dream-IF consistently outperforms its counterparts.
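As a rough illustration of the layer-wise relative dominance described above, here is a minimal PyTorch sketch in which each modality's per-location dominance gates how strongly it enhances the other. The module name, the 1x1 scoring convolutions, and the softmax-based weighting are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch of layer-wise relative dominance gating cross-modal
# enhancement. All names and layer choices here are illustrative
# assumptions, not the Dream-IF reference code.
import torch
import torch.nn as nn

class RelativeDominanceFusion(nn.Module):
    """Fuse two modality feature maps, letting the locally dominant
    modality drive enhancement of the weaker one (assumed design)."""

    def __init__(self, channels: int):
        super().__init__()
        # 1x1 convs score how "dominant" each modality is per location.
        self.score_a = nn.Conv2d(channels, 1, kernel_size=1)
        self.score_b = nn.Conv2d(channels, 1, kernel_size=1)
        self.enhance = nn.Conv2d(channels, channels, kernel_size=3, padding=1)

    def forward(self, feat_a: torch.Tensor, feat_b: torch.Tensor):
        # Relative dominance: softmax over the two per-modality scores.
        scores = torch.cat([self.score_a(feat_a), self.score_b(feat_b)], dim=1)
        dom = torch.softmax(scores, dim=1)             # (B, 2, H, W)
        dom_a, dom_b = dom[:, :1], dom[:, 1:]
        # Where A dominates, push an enhancement signal into B, and vice versa.
        feat_b = feat_b + dom_a * self.enhance(feat_a)
        feat_a = feat_a + dom_b * self.enhance(feat_b)
        # Dominance-weighted fusion of the mutually enhanced features.
        return dom_a * feat_a + dom_b * feat_b

fused = RelativeDominanceFusion(64)(torch.randn(1, 64, 32, 32),
                                    torch.randn(1, 64, 32, 32))
print(fused.shape)  # torch.Size([1, 64, 32, 32])
```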

IJCAI 2025 · Conference Paper

Exploring Efficient and Effective Sequence Learning for Visual Object Tracking

  • Dongdong Li
  • Zhinan Gao
  • Yangliu Kuai
  • Rui Chen

Sequence learning based tracking frameworks are popular in the tracking community. In practice, their auto-regressive sequence generation leads to inferior performance and higher latency compared with the latest advanced trackers. In this paper, to mitigate this issue, we propose an efficient and effective sequence-to-sequence tracking framework named FastSeqTrack. FastSeqTrack differs from previous sequence learning based trackers in its token initialization and sequence generation manner. Four tracking tokens are appended to the patch embeddings and generated in the encoder as initial guesses for the bounding box sequence, which improves tracking accuracy compared with randomly initialized tokens. The tracking tokens are then fed into the decoder in parallel in a single pass, greatly boosting forward inference speed compared with auto-regressive generation. Inspired by the early-exit mechanism, we inject internal classifiers after each decoder layer to terminate forward inference early once the softmax confidence is sufficiently reliable. On easy tracking frames, early exits avoid network overthinking and unnecessary computation. Extensive experiments on multiple benchmarks demonstrate that FastSeqTrack runs at over 100 fps and shows superior performance against state-of-the-art trackers. Code and models are available at https://github.com/vision4drones/FastSeqTrack.
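The one-pass decoding with early exits can be sketched as follows: four tracking tokens are refined in parallel through stacked decoder layers, and an internal classifier after each layer halts inference once every token is sufficiently confident. The layer configuration, coordinate-bin vocabulary, and threshold `tau` are assumptions for illustration, not the released FastSeqTrack code.

```python
# Hedged sketch of early-exit, one-pass sequence decoding for tracking.
# Layer/head configuration and the confidence threshold are assumptions.
import torch
import torch.nn as nn

class EarlyExitDecoder(nn.Module):
    def __init__(self, dim: int, num_layers: int, num_bins: int, tau: float = 0.95):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerDecoderLayer(d_model=dim, nhead=8, batch_first=True)
            for _ in range(num_layers)
        )
        # One internal classifier per layer predicts the coordinate bins.
        self.heads = nn.ModuleList(nn.Linear(dim, num_bins) for _ in range(num_layers))
        self.tau = tau  # assumed confidence threshold for exiting early

    @torch.no_grad()
    def forward(self, track_tokens: torch.Tensor, memory: torch.Tensor):
        # track_tokens: (B, 4, dim) -- four box tokens decoded in one pass,
        # not auto-regressively. memory: encoder patch embeddings (B, N, dim).
        for layer, head in zip(self.layers, self.heads):
            track_tokens = layer(track_tokens, memory)
            logits = head(track_tokens)                    # (B, 4, num_bins)
            conf = logits.softmax(-1).max(-1).values       # per-token confidence
            if conf.min() >= self.tau:  # all four tokens confident: exit early
                break
        return logits.argmax(-1)  # predicted bin index per box coordinate

dec = EarlyExitDecoder(dim=256, num_layers=6, num_bins=1000)
boxes = dec(torch.randn(1, 4, 256), torch.randn(1, 196, 256))
print(boxes.shape)  # torch.Size([1, 4])
```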

IJCAI 2022 · Conference Paper

Self-Guided Hard Negative Generation for Unsupervised Person Re-Identification

  • Dongdong Li
  • Zhigang Wang
  • Jian Wang
  • Xinyu Zhang
  • Errui Ding
  • Jingdong Wang
  • Zhaoxiang Zhang

Recent unsupervised person re-identification (re-ID) methods mostly use pseudo labels from clustering algorithms as supervision signals. Despite great success, this approach is very likely to aggregate different identities with similar appearances into the same cluster. As a result, the hard negative samples, which play an important role in training re-ID models, are significantly reduced. To alleviate this problem, we propose a self-guided hard negative generation method for unsupervised person re-ID. Specifically, we develop a joint framework that incorporates a hard negative generation network (HNGN) and a re-ID network. To continuously generate harder negative samples that provide effective supervision for contrastive learning, the two networks are trained alternately in an adversarial manner so that each improves the other: the re-ID network guides the HNGN to generate challenging data, and the HNGN forces the re-ID network to enhance its discrimination ability. During inference, the performance of the re-ID network is improved without introducing any extra parameters. Extensive experiments demonstrate that the proposed method significantly outperforms a strong baseline and also achieves better results than state-of-the-art methods.
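The alternating adversarial schedule can be sketched as below. Here `reid_net` and `hngn` are placeholder networks mapping inputs to embeddings, and the losses are a generic InfoNCE objective for the re-ID step plus a similarity-maximizing objective for the generator; the paper's actual losses and constraints may differ.

```python
# Schematic of the alternating adversarial update described above.
# reid_net / hngn and both losses are placeholders illustrating the
# training schedule, not the paper's actual architectures or objectives.
import torch
import torch.nn.functional as F

def alternating_step(reid_net, hngn, anchors, positives,
                     opt_reid, opt_hngn, temperature=0.07):
    # --- Step 1: update the re-ID network against generated hard negatives.
    hard_neg = hngn(anchors).detach()                 # HNGN frozen this step
    f_a = F.normalize(reid_net(anchors), dim=1)
    f_p = F.normalize(reid_net(positives), dim=1)
    f_n = F.normalize(hard_neg, dim=1)
    pos = (f_a * f_p).sum(1, keepdim=True) / temperature   # (B, 1)
    neg = (f_a @ f_n.t()) / temperature                     # (B, B)
    # InfoNCE-style loss: the positive pair competes with generated negatives.
    loss_reid = F.cross_entropy(torch.cat([pos, neg], dim=1),
                                torch.zeros(len(f_a), dtype=torch.long))
    opt_reid.zero_grad(); loss_reid.backward(); opt_reid.step()

    # --- Step 2: update HNGN adversarially so its negatives stay hard,
    # i.e. close to the anchors in the current embedding space. A real
    # implementation would regularize this to avoid degenerate collapse.
    f_a = F.normalize(reid_net(anchors), dim=1).detach()    # re-ID net frozen
    f_n = F.normalize(hngn(anchors), dim=1)
    loss_hngn = -(f_a * f_n).sum(1).mean()
    opt_hngn.zero_grad(); loss_hngn.backward(); opt_hngn.step()
    return loss_reid.item(), loss_hngn.item()
```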

JBHI 2021 · Journal Article

FLDNet: Frame-Level Distilling Neural Network for EEG Emotion Recognition

  • Zhe Wang
  • Tianhao Gu
  • Yiwen Zhu
  • Dongdong Li
  • Hai Yang
  • Wenli Du

Current research on EEG emotion recognition has several limitations, such as hand-engineered features, redundant and meaningless signal frames, and the loss of frame-to-frame correlation. In this paper, a novel deep learning framework, the frame-level distilling neural network (FLDNet), is proposed for learning distilled features from the correlations of different frames. A layer named the frame gate is designed to integrate weighted semantic information over multiple frames and remove redundant and meaningless signal frames. A triple-net structure is introduced to distill the learned features net by net, replacing hand-engineered features that require professional knowledge. Specifically, one neural network is first trained normally for several epochs. Then, a second network with the same structure is initialized to learn, based on the output of the first net, the features extracted by the first network's frame gate. Similarly, the third net refines the features based on the second network's frame gate. To exploit the representation ability of the triple neural network, an ensemble layer integrates the discriminative outputs of the three nets for the final decision. Consequently, the proposed FLDNet provides an effective method for capturing the correlation between different frames and automatically learning distilled high-level features for emotion recognition. Experiments on a subject-independent emotion recognition task on the public DEAP and DREAMER benchmarks demonstrate the effectiveness and robustness of the proposed FLDNet.
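One possible reading of the frame gate is a learned per-frame weight that suppresses redundant frames before temporal aggregation, as in the sketch below; the layer shapes and sigmoid gating are assumptions, and the triple-net distillation step (retraining a fresh copy of the network on the previous net's gated features) is omitted for brevity.

```python
# Illustrative sketch of a "frame gate": a learned per-frame weight that
# down-weights redundant frames before temporal aggregation. Shapes and
# the sigmoid gating are assumptions, not the FLDNet reference code.
import torch
import torch.nn as nn

class FrameGate(nn.Module):
    def __init__(self, feat_dim: int):
        super().__init__()
        self.gate = nn.Linear(feat_dim, 1)  # one scalar weight per frame

    def forward(self, frames: torch.Tensor):
        # frames: (B, T, feat_dim) -- per-frame EEG features.
        w = torch.sigmoid(self.gate(frames))      # (B, T, 1)
        gated = frames * w                        # suppress low-weight frames
        # Weighted average over time gives a clip-level feature.
        clip_feat = gated.sum(dim=1) / w.sum(dim=1).clamp_min(1e-6)
        return gated, clip_feat

gate = FrameGate(feat_dim=128)
gated_frames, clip_feat = gate(torch.randn(8, 30, 128))
print(gated_frames.shape, clip_feat.shape)  # (8, 30, 128) (8, 128)
```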