Arrow Research search

Author name cluster

Li Zhao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

38 papers
1 author row

Possible papers (38)

AAAI Conference 2026 Conference Paper

IGIANet: Illumination Guided Implicit Alignment Network for Infrared–Visible UAV Detection

  • Xiangqi Chen
  • Dawei Zhang
  • Li Zhao
  • Chengzhuan Yang
  • Zhongyu Chen
  • Jungang Lou
  • Zhonglong Zheng
  • Sang-Woon Jeon

Visible-Infrared (RGB-IR) Unmanned Aerial Vehicle (UAV) object detection integrates complementary cues from visible and infrared sensors, offering broad application potential. However, due to sensor parallax, it still faces the challenge of weak spatial misalignment, which significantly limits its performance in UAV-based object detection. Existing methods emphasize strict alignment, overlooking spectral heterogeneity under varying illumination. To address these issues, we propose the Illumination Guided Implicit Alignment Network (IGIANet) to mitigate modality heterogeneity without explicit alignment. Specifically, we integrate three novel modules. First, we propose an illumination-guided frequency modulation module that adaptively allocates fusion weights to visible and infrared features based on global illumination estimation, effectively alleviating modality imbalance under varying lighting conditions. Second, we introduce a frequency-guided cross-modality differential enhancement module, which computes differential cues across frequency domains to enhance complementary information and highlight weakly aligned and low-contrast regions. Finally, we introduce an implicit alignment-driven dynamic fusion module that actively estimates offsets and generates dynamic, position-adaptive fusion kernels to align and fuse modalities. Extensive experiments demonstrate that IGIANet outperforms state-of-the-art models on various benchmarks, achieving 80.9% mAP on DroneVehicle, 57.1% mAP on VEDAI, and 49.4% mAP on FLIR.
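
As a rough illustration of the illumination-guided weighting idea described above (a sketch of my own reading of the abstract, not the authors' module), a global illumination estimate from the visible image can gate how much each modality contributes to the fused feature:

```python
# Minimal sketch: a tiny CNN regresses a scalar illumination score from the
# RGB input, which then weights the RGB vs. IR feature maps. All layer sizes
# and the gating form are illustrative assumptions.
import torch
import torch.nn as nn

class IlluminationGate(nn.Module):
    def __init__(self):
        super().__init__()
        self.estimator = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(16, 1), nn.Sigmoid(),
        )

    def forward(self, rgb_img, feat_rgb, feat_ir):
        w = self.estimator(rgb_img).view(-1, 1, 1, 1)  # bright scene -> trust RGB
        return w * feat_rgb + (1.0 - w) * feat_ir

gate = IlluminationGate()
rgb = torch.rand(2, 3, 128, 128)
f_rgb, f_ir = torch.rand(2, 64, 32, 32), torch.rand(2, 64, 32, 32)
fused = gate(rgb, f_rgb, f_ir)   # (2, 64, 32, 32)
```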

AAAI Conference 2026 Conference Paper

MaskAnyNet: Rethinking Masked Image Regions as Valuable Information in Supervised Learning

  • Jingshan Hong
  • Haigen Hu
  • Huihuang Zhang
  • Qianwei Zhou
  • Li Zhao

In supervised learning, traditional image masking faces two key issues: (i) discarded pixels are underutilized, leading to a loss of valuable contextual information; (ii) masking may remove small or critical features, especially in fine-grained tasks. In contrast, masked image modeling (MIM) has demonstrated that masked regions can be reconstructed from partial input, revealing that even incomplete data can exhibit strong contextual consistency with the original image. This highlights the potential of masked regions as sources of semantic diversity. Motivated by this, we revisit the image masking approach, proposing to treat masked content as auxiliary knowledge rather than discarding it. Based on this, we propose MaskAnyNet, which combines masking with a relearning mechanism to exploit both visible and masked information. It can be easily extended to any model with an additional branch that jointly learns from the recomposed masked region. This approach leverages the semantic diversity of masked regions to enrich features and preserve fine-grained details. Experiments on CNN and Transformer backbones show consistent gains across multiple benchmarks. Further analysis confirms that the proposed method improves semantic diversity through the reuse of masked content.
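
A minimal sketch of the masking-plus-relearning idea, under my own assumptions about the architecture (patch size, branch weighting, and the two heads are all illustrative, not the paper's design):

```python
# One backbone sees the visibly-masked image; a second head relearns from the
# recomposed masked patches; both contribute to the supervised loss.
import torch
import torch.nn as nn

def random_patch_mask(x, patch=16, ratio=0.5):
    b, c, h, w = x.shape
    gh, gw = h // patch, w // patch
    keep = torch.rand(b, 1, gh, gw, device=x.device) > ratio
    mask = keep.float().repeat_interleave(patch, -2).repeat_interleave(patch, -1)
    return x * mask, x * (1 - mask)  # visible part, masked-out part

backbone = nn.Sequential(nn.Conv2d(3, 32, 3, 2, 1), nn.ReLU(),
                         nn.AdaptiveAvgPool2d(1), nn.Flatten())
head_main, head_mask = nn.Linear(32, 10), nn.Linear(32, 10)

x, y = torch.rand(4, 3, 224, 224), torch.randint(0, 10, (4,))
visible, masked = random_patch_mask(x)
loss = (nn.functional.cross_entropy(head_main(backbone(visible)), y)
        + 0.5 * nn.functional.cross_entropy(head_mask(backbone(masked)), y))
loss.backward()
```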

YNIMG Journal 2025 Journal Article

Age and gender-related patterns of arterial transit time and cerebral blood flow in healthy adults

  • Zongpai Zhang
  • Elizabeth Riley
  • Shichun Chen
  • Li Zhao
  • Adam K. Anderson
  • Eve DeRosa
  • Weiying Dai

Normal aging has been associated with increased arterial transit time (ATT) and reduced cerebral blood flow (CBF). However, age-related patterns of ATT and CBF and their relationship remain unclear. This is partly due to the lengthy scan times required for ATT measurements, which caused previous age-related CBF studies to not fully account for transit time. In this work, we aimed to elucidate age-related ATT and ATT-corrected CBF patterns. We examined 131 healthy subjects aged 19 to 82 years old using two pseudo-continuous arterial spin labeling (PCASL) MRI scans: one to measure fast low-resolution ATT maps with five post-labeling delays and the other to measure high-resolution perfusion-weighted maps with a single post-labeling delay. Vessel suppression was applied to both the ATT and perfusion-weighted acquisitions. We found that ATT increases with age in the frontal, temporoparietal, and occipital regions, with a more pronounced elongation in males compared to females in the middle temporal gyrus. ATT-corrected CBF decreases with age in several brain regions, including the anterior cingulate, insula, posterior cingulate, angular, precuneus, supramarginal, frontal, parietal, superior and middle temporal, occipital, and cerebellar regions, while remaining stable in the inferior temporal and subcortical regions. In contrast, without ATT correction, we detected artifactual decreases in the inferior temporal and precentral regions. These findings suggest that ATT provides valuable and independent insights into microvascular deficits and should be incorporated into CBF measurements for studies involving aging populations.

NeurIPS Conference 2025 Conference Paper

Dyn-O: Building Structured World Models with Object-Centric Representations

  • Zizhao Wang
  • Kaixin Wang
  • Li Zhao
  • Peter Stone
  • Jiang Bian

World models aim to capture the dynamics of the environment, enabling agents to predict and plan for future states. In most scenarios of interest, the dynamics are highly centered on interactions among objects within the environment. This motivates the development of world models that operate on object-centric rather than monolithic representations, with the goal of more effectively capturing environment dynamics and enhancing compositional generalization. However, the development of object-centric world models has largely been explored in environments with limited visual complexity (such as basic geometries). It remains underexplored whether such models can be effective in more challenging settings. In this paper, we fill this gap by introducing Dyn-O, an enhanced structured world model built upon object-centric representations. Compared to prior work on object-centric representations, Dyn-O improves in both learning representations and modeling dynamics. On the challenging Procgen games, we demonstrate that our method can learn object-centric world models directly from pixel observations, outperforming DreamerV3 in rollout prediction accuracy. Furthermore, by decoupling object-centric features into dynamics-agnostic and dynamics-aware components, we enable finer-grained manipulation of these features and generate more diverse imagined trajectories. The code of Dyn-O can be found at https://github.com/wangzizhao/dyn-O.

YNIMG Journal 2025 Journal Article

Morphological changes of the choroid plexus in the lateral ventricle across the lifespan: 5551 subjects from fetus to elderly

  • Jiaxin Li
  • Yuxuan Gao
  • Yunzhi Xu
  • Weiying Dai
  • Yueqin Hu
  • Xue Feng
  • Dan Wu
  • Li Zhao

BACKGROUND: The developmental trajectory and aging process of the choroid plexus (ChP) in humans remain largely unexplored, and normative growth standards for the ChP across the lifespan are lacking. METHODS: High-resolution magnetic resonance images were collected from cohorts of 5551 subjects, ranging in age from 21 gestational weeks to 90 years. ChP segmentation was performed using a combination of an automated pipeline and manual annotations. The ChP volume, the ratio of ChP to brain parenchyma, and the ratio of ChP to brain ventricle were modeled using linear and quadratic regression. Additional morphological features of the ChP were investigated. RESULTS: The absolute and relative volumes of the ChP throughout the lifespan were provided, including growth charts and a normative reference table. In addition, the morphological features, including maximum 3D diameter, flatness, and elongation of the ChP, reveal the turning points of fetal brain development in the third trimester. Furthermore, the ratio of ChP to lateral ventricle, the ratio of ChP to brain parenchyma, and the flatness and elongation of the ChP reveal characteristics of brain aging beginning at 30 years old. The enhanced ChP segmentation pipeline is available on GitHub: https://github.com/princeleeee/ChP-Seg. CONCLUSIONS: This study provides a baseline measurement of the ChP across the lifespan, which reveals ChP characteristics in brain development and aging.
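
The curve-fitting step the abstract describes (linear and quadratic regression of volume on age) is straightforward; a hedged numpy sketch on synthetic data, not the study's, looks like this:

```python
# Fit linear and quadratic growth models of ChP volume vs. age; the fake
# coefficients and noise level below are placeholders, not study values.
import numpy as np

rng = np.random.default_rng(0)
age = rng.uniform(0, 90, 300)                                        # years
vol = 1.2 + 0.02 * age - 0.0001 * age**2 + rng.normal(0, 0.1, 300)   # fake mL

lin = np.polyfit(age, vol, deg=1)    # volume ~ a*age + b
quad = np.polyfit(age, vol, deg=2)   # volume ~ a*age^2 + b*age + c
print("linear coeffs:", lin, "quadratic coeffs:", quad)
# Percentile growth charts would then be read off the fitted curve plus
# residual quantiles at each age.
```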

AAAI Conference 2025 Conference Paper

One-Shot Reference-based Structure-Aware Image to Sketch Synthesis

  • Rui Yang
  • Honghong Yang
  • Li Zhao
  • Qin Lei
  • Mianxiong Dong
  • Kaoru Ota
  • Xiaojun Wu

Generating sketches that accurately reflect the content of reference images presents numerous challenges. Current methods either require paired training data or fail to accommodate a wider range and diversity of sketch styles. While pre-trained diffusion models have shown strong text-based control capabilities, state-of-the-art methods still struggle with reference-based sketch generation for a given content image. The main difficulties lie in (1) balancing content preservation with style enhancement, and (2) representing content image textures at varying levels of abstraction to approximate the reference sketch style. In this paper, we propose a method (Ref2Sketch-SA) that transforms a given content image into a sketch based on a reference sketch. The core strategies include (1) using DDIM Inversion to enhance structural consistency in the sketch generation of content images; and (2) injecting noise into the input image during the denoising process to produce a sketch that retains content attributes while aligning with, yet differing in texture from, the reference. Our model demonstrates superior performance across multiple evaluation metrics, including user style preference.

NeurIPS Conference 2025 Conference Paper

What Do Latent Action Models Actually Learn?

  • Chuheng Zhang
  • Tim Pearce
  • Pushi Zhang
  • Kaixin Wang
  • Xiaoyu Chen
  • Wei Shen
  • Li Zhao
  • Jiang Bian

Latent action models (LAMs) aim to learn action-relevant changes from unlabeled videos by compressing changes between frames as latents. However, differences between video frames can be caused by controllable changes as well as exogenous noise, leading to an important concern: do the latents capture the changes caused by actions, or irrelevant noise? This paper studies this issue analytically, presenting a linear model that encapsulates the essence of LAM learning while remaining tractable. This provides several insights, including connections between LAMs and principal component analysis (PCA), desiderata for the data-generating policy, and justification of strategies to encourage learning controllable changes using data augmentation, data cleaning, and auxiliary action prediction. We also provide illustrative results based on numerical simulation, shedding light on the specific structure of observations, actions, and noise in data that influences LAM learning.
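
The claimed connection to PCA admits a compact numerical illustration. The construction below is mine, not the paper's: latent action directions are recovered as principal components of frame differences under isotropic exogenous noise.

```python
# Synthetic linear LAM: frame differences = action subspace + noise.
# PCA on the differences should recover the action subspace when noise is small.
import numpy as np

rng = np.random.default_rng(0)
d, n, k = 32, 5000, 2
A = rng.normal(size=(d, k))            # ground-truth action directions
actions = rng.normal(size=(k, n))      # controllable changes
noise = 0.1 * rng.normal(size=(d, n))  # exogenous noise
deltas = A @ actions + noise           # frame differences o_{t+1} - o_t

u, s, _ = np.linalg.svd(deltas - deltas.mean(1, keepdims=True),
                        full_matrices=False)
recovered = u[:, :k]                   # top-k principal components

# Principal-angle check: singular values near 1 mean the subspaces align.
overlap = np.linalg.svd(recovered.T @ np.linalg.qr(A)[0], compute_uv=False)
print("subspace alignment:", overlap)  # ~[1., 1.] when noise is small
```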

YNIMG Journal 2024 Journal Article

An improved spectral clustering method for accurate detection of brain resting-state networks

  • Jason Barrett
  • Haomiao Meng
  • Zongpai Zhang
  • Song M. Chen
  • Li Zhao
  • David C. Alsop
  • Xingye Qiao
  • Weiying Dai

This paper proposes a data-driven analysis method to accurately partition large-scale resting-state functional brain networks from fMRI data. The method is based on a spectral clustering algorithm and combines eigenvector direction selection with Pearson correlation clustering in the spectral space. The method is an improvement on available spectral clustering methods, capable of robustly identifying active brain networks consistent with those from model-driven methods at different noise levels, even at the noise level of real fMRI data.
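
A hedged sketch of the two ingredients the abstract names (spectral embedding followed by Pearson-correlation clustering in the spectral space) on synthetic data; the eigenvector direction selection step is the paper's contribution and is not reproduced here.

```python
# Spectral embedding of a kNN affinity graph, then correlation-based
# assignment in the spectral space. Seed rows are a toy stand-in for the
# paper's clustering step.
import numpy as np
from sklearn.neighbors import kneighbors_graph
from scipy.sparse.csgraph import laplacian

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(m, 0.3, size=(50, 5)) for m in (0, 2, 4)])

W = kneighbors_graph(X, n_neighbors=10, mode="connectivity")
W = 0.5 * (W + W.T)                      # symmetric affinity
L = laplacian(W, normed=True).toarray()
vals, vecs = np.linalg.eigh(L)
emb = vecs[:, :3]                        # spectral embedding for 3 clusters
emb /= np.linalg.norm(emb, axis=1, keepdims=True) + 1e-12

seeds = emb[[0, 50, 100]]                # one seed row per cluster (toy)
corr = np.corrcoef(np.vstack([emb, seeds]))[:len(emb), len(emb):]
labels = corr.argmax(axis=1)             # Pearson-correlation assignment
print(np.bincount(labels))               # roughly 50/50/50
```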

IJCAI Conference 2024 Conference Paper

Diversification of Adaptive Policy for Effective Offline Reinforcement Learning

  • Yunseon Choi
  • Li Zhao
  • Chuheng Zhang
  • Lei Song
  • Jiang Bian
  • Kee-Eung Kim

Offline Reinforcement Learning (RL) aims to learn policies from pre-collected datasets that capture only a subset of the environment's dynamics. The predominant approach has been to solve a constrained optimization formulation, which ensures that the policy visits state-action pairs within the support of the offline dataset. However, this approach limits the ability to make decisions when the agent faces unknown parts of the environment at deployment time. To address the challenge of decision-making in out-of-support regions, model-based Bayes-adaptive approaches have been proposed that consider all dynamics models that could potentially be the true environment. Since it is generally infeasible to compute the posterior over all dynamics models from the offline dataset, these approaches usually approximate the posterior with a finite ensemble of highly probable dynamics models. Hence, the diversity of these models is the key to obtaining good policies. In this work, we propose MoDAP (Model-based Diverse Adaptive Policy Learning), an algorithm that enables the adaptive policy to make informed decisions in previously unexplored states. MoDAP adopts an iterative strategy that simultaneously trains the policy and the dynamics models. The policy optimization seeks to maximize expected returns across the dynamics models, while the dynamics models are trained to promote policy diversification through the proposed information-theoretic objective. We evaluate MoDAP through experiments on the D4RL and NeoRL benchmarks, showcasing its performance superiority over state-of-the-art algorithms.

AAAI Conference 2024 Conference Paper

VSFormer: Visual-Spatial Fusion Transformer for Correspondence Pruning

  • Tangfei Liao
  • Xiaoqin Zhang
  • Li Zhao
  • Tao Wang
  • Guobao Xiao

Correspondence pruning aims to find correct matches (inliers) from an initial set of putative correspondences, which is a fundamental task for many applications. Finding them is challenging, given the varying inlier ratios between scenes/image pairs due to significant visual differences. Moreover, the performance of existing methods is usually limited by a lack of visual cues (e.g., texture, illumination, structure) about the scene. In this paper, we propose a Visual-Spatial Fusion Transformer (VSFormer) to identify inliers and recover camera poses accurately. First, we obtain highly abstract visual cues of a scene with cross attention between local features of the two-view images. Then, we model these visual cues and correspondences with a joint visual-spatial fusion module, simultaneously embedding visual cues into correspondences for pruning. Additionally, to mine the consistency of correspondences, we design a novel module that combines a KNN-based graph and a transformer, effectively capturing both local and global contexts. Extensive experiments demonstrate that the proposed VSFormer outperforms state-of-the-art methods on outdoor and indoor benchmarks. Our code is provided at the following repository: https://github.com/sugar-fly/VSFormer.

AAMAS Conference 2023 Conference Paper

Curriculum Offline Reinforcement Learning

  • Yuanying Cai
  • Chuheng Zhang
  • Hanye Zhao
  • Li Zhao
  • Jiang Bian

Offline reinforcement learning holds the promise of obtaining powerful agents from large datasets. To achieve this, a good algorithm should always benefit from (or at least not be hurt by) adding more samples, even if the samples are not collected by expert policies. However, we observe that many popular offline RL algorithms do not possess such a property and sometimes suffer from adding heterogeneous or poor samples to the dataset. Empirically, we show that, given a stage in the learning process, not all samples are useful for these algorithms. Specifically, the agent can learn more efficiently with only the samples collected by a policy similar to the current policy. This indicates that different samples may contribute to different stages of the training process, and we therefore propose Curriculum Offline Reinforcement Learning (CUORL) to equip previous methods with such a favorable property. In CUORL, we select the samples that are likely to be generated by the current policy to train the agent. Empirically, we show that CUORL can prevent the negative impact of adding samples from poor policies and always improves performance with more samples (even from random policies). Moreover, CUORL also achieves state-of-the-art performance on standard D4RL datasets, which indicates the potential of curriculum learning for offline RL.

NeurIPS Conference 2023 Conference Paper

Distributional Pareto-Optimal Multi-Objective Reinforcement Learning

  • Xin-Qiang Cai
  • Pushi Zhang
  • Li Zhao
  • Jiang Bian
  • Masashi Sugiyama
  • Ashley Llorens

Multi-objective reinforcement learning (MORL) has been proposed to learn control policies over multiple competing objectives with each possible preference over returns. However, current MORL algorithms fail to account for distributional preferences over the multi-variate returns, which are particularly important in real-world scenarios such as autonomous driving. To address this issue, we extend the concept of Pareto-optimality in MORL into distributional Pareto-optimality, which captures the optimality of return distributions, rather than the expectations. Our proposed method, called Distributional Pareto-Optimal Multi-Objective Reinforcement Learning (DPMORL), is capable of learning distributional Pareto-optimal policies that balance multiple objectives while considering the return uncertainty. We evaluated our method on several benchmark problems and demonstrated its effectiveness in discovering distributional Pareto-optimal policies and satisfying diverse distributional preferences compared to existing MORL methods.

AAAI Conference 2023 Conference Paper

H-TSP: Hierarchically Solving the Large-Scale Traveling Salesman Problem

  • Xuanhao Pan
  • Yan Jin
  • Yuandong Ding
  • Mingxiao Feng
  • Li Zhao
  • Lei Song
  • Jiang Bian

We propose an end-to-end learning framework based on hierarchical reinforcement learning, called H-TSP, for addressing the large-scale Traveling Salesman Problem (TSP). The proposed H-TSP constructs a solution to a TSP instance from scratch, relying on two components: the upper-level policy chooses a small subset of nodes (up to 200 in our experiments) from all nodes that are to be traversed, while the lower-level policy takes the chosen nodes as input and outputs a tour connecting them to the existing partial route (initially containing only the depot). After jointly training the upper-level and lower-level policies, our approach can directly generate solutions for given TSP instances without relying on any time-consuming search procedures. To demonstrate the effectiveness of the proposed approach, we have conducted extensive experiments on randomly generated TSP instances with different numbers of nodes. We show that H-TSP can achieve results comparable to SOTA search-based approaches (gap 3.42% vs. 7.32%), and more importantly, reduce the time consumption by up to two orders of magnitude (3.32s vs. 395.85s). To the best of our knowledge, H-TSP is the first end-to-end deep reinforcement learning approach that can scale to TSP instances of up to 10000 nodes. Although there are still gaps to SOTA results with respect to solution quality, we believe H-TSP will be useful for practical applications, particularly time-sensitive ones such as on-call routing and ride-hailing services.
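
The hierarchical decomposition can be sketched with simple heuristics standing in for the two learned policies (everything below is an illustrative stand-in, not H-TSP's networks):

```python
# "Upper level": repeatedly pick up to 200 unvisited nodes near the open end
# of the route. "Lower level": order them and splice them in; here a greedy
# nearest-neighbor sub-solver replaces the learned policy.
import numpy as np

def lower_level(start, subset, coords):
    tour, cur, remaining = [], start, set(subset)
    while remaining:
        nxt = min(remaining, key=lambda j: np.linalg.norm(coords[cur] - coords[j]))
        tour.append(nxt); remaining.discard(nxt); cur = nxt
    return tour

rng = np.random.default_rng(0)
coords = rng.random((1000, 2))
route, unvisited = [0], set(range(1, 1000))
while unvisited:
    end = route[-1]
    subset = sorted(unvisited,
                    key=lambda j: np.linalg.norm(coords[end] - coords[j]))[:200]
    segment = lower_level(end, subset, coords)
    route.extend(segment); unvisited.difference_update(segment)
print("route length:", len(route))  # all 1000 nodes visited once
```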

AAAI Conference 2023 Conference Paper

Pointerformer: Deep Reinforced Multi-Pointer Transformer for the Traveling Salesman Problem

  • Yan Jin
  • Yuandong Ding
  • Xuanhao Pan
  • Kun He
  • Li Zhao
  • Tao Qin
  • Lei Song
  • Jiang Bian

Traveling Salesman Problem (TSP), a classic routing optimization problem originally arising in transportation and logistics, has become a critical task in broader domains, such as manufacturing and biology. Recently, Deep Reinforcement Learning (DRL) has been increasingly employed to solve TSP due to its high inference efficiency. Nevertheless, most existing end-to-end DRL algorithms only perform well on small TSP instances and can hardly generalize to large scale, as memory consumption and computation time soar drastically with problem size. In this paper, we propose a novel end-to-end DRL approach, referred to as Pointerformer, based on a multi-pointer Transformer. In particular, Pointerformer adopts both a reversible residual network in the encoder and a multi-pointer network in the decoder to effectively contain the memory consumption of the encoder-decoder architecture. To further improve the quality of TSP solutions, Pointerformer employs a feature augmentation method to explore the symmetries of TSP at both training and inference stages, as well as an enhanced context embedding approach that includes more comprehensive context information in the query. Extensive experiments on a randomly generated benchmark and a public benchmark show that, while achieving results comparable to state-of-the-art DRL approaches on most small-scale TSP instances, Pointerformer also generalizes well to large-scale TSPs.
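
A skeletal multi-pointer decoding head, as I read it from the abstract (the head count, logit pooling, and scaling are my guesses, not the released model):

```python
# Several pointer heads score candidate cities; their logits are pooled
# before the visit mask and softmax over the next city.
import torch
import torch.nn as nn

class MultiPointer(nn.Module):
    def __init__(self, dim: int, heads: int = 8):
        super().__init__()
        self.q = nn.Linear(dim, dim * heads)
        self.heads, self.dim = heads, dim

    def forward(self, query, node_emb, visited_mask):
        # query: (B, D); node_emb: (B, N, D); visited_mask: (B, N) bool
        q = self.q(query).view(-1, self.heads, self.dim)      # (B, H, D)
        logits = torch.einsum("bhd,bnd->bhn", q, node_emb)    # (B, H, N)
        logits = logits.mean(dim=1) / self.dim ** 0.5         # pool heads
        logits = logits.masked_fill(visited_mask, float("-inf"))
        return torch.log_softmax(logits, dim=-1)              # next-city log-probs

ptr = MultiPointer(dim=128)
logp = ptr(torch.rand(4, 128), torch.rand(4, 50, 128),
           torch.zeros(4, 50, dtype=torch.bool))
```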

IJCAI Conference 2023 Conference Paper

Towards Generalizable Reinforcement Learning for Trade Execution

  • Chuheng Zhang
  • Yitong Duan
  • Xiaoyu Chen
  • Jianyu Chen
  • Jian Li
  • Li Zhao

Optimized trade execution aims to sell (or buy) a given amount of assets within a given time at the lowest possible trading cost. Recently, reinforcement learning (RL) has been applied to optimized trade execution to learn smarter policies from market data. However, we find that many existing RL methods exhibit considerable overfitting, which prevents them from real deployment. In this paper, we provide an extensive study of the overfitting problem in optimized trade execution. First, we model optimized trade execution as offline RL with dynamic context (ORDC), where the context represents market variables that cannot be influenced by the trading policy and are collected in an offline manner. Under this framework, we derive the generalization bound and find that the overfitting issue is caused by the large context space and limited context samples in the offline setting. Accordingly, we propose to learn compact representations of the context to address the overfitting problem, either by leveraging prior knowledge or in an end-to-end manner. To evaluate our algorithms, we also implement a carefully designed simulator based on historical limit order book (LOB) data to provide a high-fidelity benchmark for different algorithms. Our experiments on the high-fidelity simulator demonstrate that our algorithms can effectively alleviate overfitting and achieve better performance.

NeurIPS Conference 2022 Conference Paper

An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context

  • Xiaoyu Chen
  • Xiangming Zhu
  • Yufeng Zheng
  • Pushi Zhang
  • Li Zhao
  • Wenxue Cheng
  • Peng Cheng
  • Yongqiang Xiong

One of the key challenges in deploying RL to real-world applications is adapting to variations of unknown environment contexts, such as changing terrains in robotic tasks and fluctuating bandwidth in congestion control. Existing works on adaptation to unknown environment contexts either assume the context is the same for the whole episode or assume the context variables are Markovian. However, in many real-world applications, the environment context usually stays stable for a stochastic period and then changes in an abrupt and unpredictable manner within an episode, resulting in a segment structure that existing works fail to address. To leverage the segment structure of piecewise stable context in real-world applications, in this paper, we propose a Segmented Context Belief Augmented Deep (SeCBAD) RL method. Our method can jointly infer the belief distribution over the latent context with the posterior over segment length, and perform more accurate belief context inference with observed data within the current context segment. The inferred belief context can be leveraged to augment the state, leading to a policy that can adapt to abrupt variations in context. We demonstrate empirically that SeCBAD can infer context segment length accurately and outperform existing methods on a toy grid world environment and MuJoCo tasks with piecewise-stable context.

NeurIPS Conference 2022 Conference Paper

Tiered Reinforcement Learning: Pessimism in the Face of Uncertainty and Constant Regret

  • Jiawei Huang
  • Li Zhao
  • Tao Qin
  • Wei Chen
  • Nan Jiang
  • Tie-Yan Liu

We propose a new learning framework that captures the tiered structure of many real-world user-interaction applications, where the users can be divided into two groups based on their different tolerance for exploration risks and should be treated separately. In this setting, we simultaneously maintain two policies $\pi^{\text{O}}$ and $\pi^{\text{E}}$: $\pi^{\text{O}}$ ("O" for "online") interacts with more risk-tolerant users from the first tier and minimizes regret by balancing exploration and exploitation as usual, while $\pi^{\text{E}}$ ("E" for "exploit") exclusively focuses on exploitation for risk-averse users from the second tier, utilizing the data collected so far. An important question is whether such a separation yields advantages over the standard online setting (i.e., $\pi^{\text{E}}=\pi^{\text{O}}$) for the risk-averse users. We consider the gap-independent and gap-dependent settings individually. For the former, we prove that the separation is indeed not beneficial from a minimax perspective. For the latter, we show that if we choose Pessimistic Value Iteration as the exploitation algorithm to produce $\pi^{\text{E}}$, we can achieve constant regret for risk-averse users independent of the number of episodes $K$, which is in sharp contrast to the $\Omega(\log K)$ regret of any online RL algorithm in the same setting, while the regret of $\pi^{\text{O}}$ (almost) maintains its online regret optimality and does not need to compromise for the success of $\pi^{\text{E}}$.
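
The exploitation side builds on Pessimistic Value Iteration; a tabular toy version (all quantities synthetic, and the bonus form is a standard assumption, not taken from the paper) shows the mechanism of subtracting a count-based uncertainty penalty:

```python
# Pessimistic value iteration on a random toy MDP: backups penalize rarely
# visited state-action pairs, so pi^E only commits where the data supports it.
import numpy as np

S, A, gamma, beta = 5, 2, 0.9, 1.0
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))   # empirical transitions (S, A, S)
R = rng.random((S, A))                       # empirical rewards
N = rng.integers(1, 50, size=(S, A))         # visit counts from logged data

V = np.zeros(S)
for _ in range(200):
    bonus = beta / np.sqrt(N)                # pessimism: penalize rare pairs
    Q = R - bonus + gamma * P @ V
    V = np.clip(Q.max(axis=1), 0.0, None)    # keep values nonnegative
pi_exploit = Q.argmax(axis=1)
```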

NeurIPS Conference 2021 Conference Paper

Curriculum Offline Imitating Learning

  • Minghuan Liu
  • Hanye Zhao
  • Zhengyu Yang
  • Jian Shen
  • Weinan Zhang
  • Li Zhao
  • Tie-Yan Liu

Offline reinforcement learning (RL) tasks require the agent to learn from a pre-collected dataset with no further interaction with the environment. Despite the potential to surpass the behavioral policies, RL-based methods are generally impractical due to training instability and the bootstrapping of extrapolation errors, which always require careful hyperparameter tuning via online evaluation. In contrast, offline imitation learning (IL) has no such issues, since it learns the policy directly without estimating the value function by bootstrapping. However, IL is usually limited by the capability of the behavioral policy and tends to learn a mediocre behavior from a dataset collected by a mixture of policies. In this paper, we aim to take advantage of IL while mitigating this drawback. Observing that behavior cloning is able to imitate neighboring policies with less data, we propose Curriculum Offline Imitation Learning (COIL), which utilizes an experience-picking strategy to make the agent imitate adaptive neighboring policies with higher returns, and improves the current policy along curriculum stages. On continuous control benchmarks, we compare COIL against both imitation-based and RL-based methods, showing that COIL not only avoids learning a mediocre behavior on mixed datasets but is even competitive with state-of-the-art offline RL methods.
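
A simplified rendering of the experience-picking rule (the scoring and thresholding below are my own simplification of COIL, not the paper's exact criterion):

```python
# Pick trajectories that (a) beat the current policy's return and (b) the
# current policy already assigns high likelihood to, then behavior-clone.
def pick_stage(dataset, logp_fn, current_return, k=16):
    candidates = [t for t in dataset if t["return"] > current_return]
    candidates.sort(key=lambda t: -logp_fn(t["states"], t["actions"]))
    return candidates[:k]

# Toy demo: "log-likelihood" prefers trajectories near index 40.
data = [{"states": i, "actions": i, "return": float(i)} for i in range(100)]
batch = pick_stage(data, lambda s, a: -abs(s - 40), current_return=30.0)

# Training loop shape: behavior-clone on the picked batch, re-estimate the
# policy's return, and move to the next curriculum stage.
# for stage in range(n_stages):
#     batch = pick_stage(offline_data, policy.log_prob, eval_return(policy))
#     behavior_clone(policy, batch)
```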

NeurIPS Conference 2021 Conference Paper

Distributional Reinforcement Learning for Multi-Dimensional Reward Functions

  • Pushi Zhang
  • Xiaoyu Chen
  • Li Zhao
  • Wei Xiong
  • Tao Qin
  • Tie-Yan Liu

A growing trend for value-based reinforcement learning (RL) algorithms is to capture more information than scalar value functions in the value network. One of the most well-known methods in this branch is distributional RL, which models the return distribution instead of the scalar value. In another line of work, hybrid reward architectures (HRA) in RL have been studied to model source-specific value functions for each source of reward, which has also been shown to be beneficial for performance. To fully inherit the benefits of distributional RL and hybrid reward architectures, we introduce Multi-Dimensional Distributional DQN (MD3QN), which extends distributional RL to model the joint return distribution from multiple reward sources. As a by-product of joint distribution modeling, MD3QN can capture not only the randomness in returns for each source of reward, but also the rich reward correlations between the randomness of different sources. We prove convergence for the joint distributional Bellman operator and build our empirical algorithm by minimizing the Maximum Mean Discrepancy between the joint return distribution and its Bellman target. In experiments, our method accurately models the joint return distribution in environments with richly correlated reward functions, and outperforms previous RL methods utilizing multi-dimensional reward functions in the control setting.
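
The training signal is a Maximum Mean Discrepancy between sampled joint returns and their Bellman targets; a compact Gaussian-kernel MMD in PyTorch (sample shapes and bandwidths are assumptions of mine) conveys the loss:

```python
# Multi-bandwidth Gaussian-kernel MMD between two sample sets, usable as a
# differentiable loss between predicted and target joint return samples.
import torch

def mmd(x, y, bandwidths=(1.0, 4.0, 16.0)):
    # x, y: (n, d) samples from the predicted / target joint return dists.
    def k(a, b):
        d2 = torch.cdist(a, b).pow(2)
        return sum(torch.exp(-d2 / (2 * h)) for h in bandwidths)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

pred = torch.randn(64, 3, requires_grad=True)   # 3 reward sources
target = torch.randn(64, 3) + 0.5
loss = mmd(pred, target)
loss.backward()
```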

IJCAI Conference 2021 Conference Paper

Independence-aware Advantage Estimation

  • Pushi Zhang
  • Li Zhao
  • Guoqing Liu
  • Jiang Bian
  • Minlie Huang
  • Tao Qin
  • Tie-Yan Liu

Most existing advantage function estimation methods in reinforcement learning suffer from high variance, which scales unfavorably with the time horizon. To address this challenge, we propose to identify the independence property between the current action and future states in environments, which can be further leveraged to effectively reduce the variance of the advantage estimation. In particular, the recognized independence property can be naturally utilized to construct a novel importance sampling advantage estimator with close-to-zero variance, even when the Monte-Carlo return signal yields a large variance. To further remove the risk of the high variance introduced by the new estimator, we combine it with the existing Monte-Carlo estimator via a reward decomposition model learned by minimizing the estimation variance. Experiments demonstrate that our method achieves higher sample efficiency than existing advantage estimation methods in complex environments.

NeurIPS Conference 2021 Conference Paper

Object-Aware Regularization for Addressing Causal Confusion in Imitation Learning

  • Jongjin Park
  • Younggyo Seo
  • Chang Liu
  • Li Zhao
  • Tao Qin
  • Jinwoo Shin
  • Tie-Yan Liu

Behavioral cloning has proven to be effective for learning sequential decision-making policies from expert demonstrations. However, behavioral cloning often suffers from the causal confusion problem, where a policy relies on the noticeable effect of expert actions due to the strong correlation but not the cause we desire. This paper presents Object-aware REgularizatiOn (OREO), a simple technique that regularizes an imitation policy in an object-aware manner. Our main idea is to encourage a policy to uniformly attend to all semantic objects, in order to prevent the policy from exploiting nuisance variables strongly correlated with expert actions. To this end, we introduce a two-stage approach: (a) we extract semantic objects from images by utilizing discrete codes from a vector-quantized variational autoencoder, and (b) we randomly drop the units that share the same discrete code together, i.e., masking out semantic objects. Our experiments demonstrate that OREO significantly improves the performance of behavioral cloning, outperforming various other regularization and causality-based methods on a variety of Atari environments and a self-driving CARLA environment. We also show that our method even outperforms inverse reinforcement learning methods trained with a considerable amount of environment interaction.

NeurIPS Conference 2020 Conference Paper

RD$^2$: Reward Decomposition with Representation Decomposition

  • Zichuan Lin
  • Derek Yang
  • Li Zhao
  • Tao Qin
  • Guangwen Yang
  • Tie-Yan Liu

Reward decomposition, which aims to decompose the full reward into multiple sub-rewards, has been proven beneficial for improving sample efficiency in reinforcement learning. Existing works on discovering reward decompositions are mostly policy dependent, which constrains diverse or disentangled behavior between different policies induced by different sub-rewards. In this work, we propose a set of novel reward decomposition principles by constraining the uniqueness and compactness of the different state features/representations relevant to different sub-rewards. Our principles encourage sub-rewards with minimal relevant features, while maintaining the uniqueness of each sub-reward. We derive a deep learning algorithm based on our principles, and term our method RD$^2$, since we learn reward decomposition and representation decomposition jointly. RD$^2$ is evaluated on a toy case where the true reward structure is known, and on some Atari environments where a reward structure exists but is unknown to the agent, demonstrating the effectiveness of RD$^2$ against existing reward decomposition methods.

NeurIPS Conference 2019 Conference Paper

Distributional Reward Decomposition for Reinforcement Learning

  • Zichuan Lin
  • Li Zhao
  • Derek Yang
  • Tao Qin
  • Tie-Yan Liu
  • Guangwen Yang

Many reinforcement learning (RL) tasks have specific properties that can be leveraged to modify existing RL algorithms to adapt to those tasks and further improve performance, and a general class of such properties is the multiple reward channel. In those environments the full reward can be decomposed into sub-rewards obtained from different channels. Existing work on reward decomposition either requires prior knowledge of the environment to decompose the full reward, or decomposes reward without prior knowledge but with degraded performance. In this paper, we propose Distributional Reward Decomposition for Reinforcement Learning (DRDRL), a novel reward decomposition algorithm which captures the multiple reward channel structure under distributional setting. Empirically, our method captures the multi-channel structure and discovers meaningful reward decomposition, without any requirements on prior knowledge. Consequently, our agent achieves better performance than existing methods on environments with multiple reward channels.

NeurIPS Conference 2019 Conference Paper

Fully Parameterized Quantile Function for Distributional Reinforcement Learning

  • Derek Yang
  • Li Zhao
  • Zichuan Lin
  • Tao Qin
  • Jiang Bian
  • Tie-Yan Liu

Distributional Reinforcement Learning (RL) differs from traditional RL in that, rather than the expectation of total returns, it estimates distributions, and it has achieved state-of-the-art performance on Atari games. The key challenge in practical distributional RL algorithms lies in how to parameterize estimated distributions so as to better approximate the true continuous distribution. Existing distributional RL algorithms parameterize either the probability side or the return value side of the distribution function, leaving the other side uniformly fixed as in C51 and QR-DQN, or randomly sampled as in IQN. In this paper, we propose a fully parameterized quantile function that parameterizes both the quantile fraction axis (i.e., the x-axis) and the value axis (i.e., the y-axis) for distributional RL. Our algorithm contains a fraction proposal network that generates a discrete set of quantile fractions and a quantile value network that gives the corresponding quantile values. The two networks are jointly trained to find the best approximation of the true distribution. Experiments on 55 Atari games show that our algorithm significantly outperforms existing distributional RL algorithms and creates a new record for the Atari Learning Environment for non-distributed agents.
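
The two networks can be sketched as follows (the dimensions, cosine embedding, and wiring are my guesses in the IQN style, not the reference implementation):

```python
# Fraction proposal net: monotone quantile fractions via a cumulative softmax.
# Quantile value net: maps (state embedding, fraction) to a quantile value.
import torch
import torch.nn as nn

class FractionProposal(nn.Module):
    def __init__(self, emb_dim, n_quantiles=32):
        super().__init__()
        self.fc = nn.Linear(emb_dim, n_quantiles)

    def forward(self, state_emb):
        probs = torch.softmax(self.fc(state_emb), dim=-1)
        taus = torch.cumsum(probs, dim=-1)                    # monotone in (0, 1]
        return torch.cat([torch.zeros_like(taus[:, :1]), taus], dim=-1)

class QuantileValue(nn.Module):
    def __init__(self, emb_dim, n_cos=64):
        super().__init__()
        self.cos_emb = nn.Linear(n_cos, emb_dim)
        self.out = nn.Linear(emb_dim, 1)
        self.register_buffer("freqs", torch.arange(1, n_cos + 1).float() * torch.pi)

    def forward(self, state_emb, taus):
        cos = torch.cos(taus.unsqueeze(-1) * self.freqs)            # (B, T, n_cos)
        phi = torch.relu(self.cos_emb(cos))                         # (B, T, emb)
        return self.out(phi * state_emb.unsqueeze(1)).squeeze(-1)   # (B, T)

emb = torch.rand(8, 128)
taus = FractionProposal(128)(emb)
values = QuantileValue(128)(emb, taus[:, 1:])   # value at each proposed tau
```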

AAAI Conference 2019 Conference Paper

Trust Region Evolution Strategies

  • Guoqing Liu
  • Li Zhao
  • Feidiao Yang
  • Jiang Bian
  • Tao Qin
  • Nenghai Yu
  • Tie-Yan Liu

Evolution Strategies (ES), a class of black-box optimization algorithms, has recently been demonstrated to be a viable alternative to popular MDP-based RL techniques such as Q-learning and Policy Gradients. ES achieves fairly good performance on challenging reinforcement learning problems and is easier to scale in a distributed setting. However, standard ES algorithms perform one gradient update per data sample, which is not very efficient. In this paper, with the aim of using sampled data more efficiently, we propose a novel iterative procedure that optimizes a surrogate objective function, enabling data samples to be reused for multiple epochs of updates. We prove a monotonic improvement guarantee for this procedure. By making several approximations to the theoretically-justified procedure, we further develop a practical algorithm called Trust Region Evolution Strategies (TRES). Our experiments demonstrate the effectiveness of TRES on a range of popular MuJoCo locomotion tasks in the OpenAI Gym, achieving better performance than the ES algorithm.
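
The sample-reuse idea admits a small numpy sketch: one batch of Gaussian perturbations is reused for several surrogate updates, each importance-weighted by the density ratio between the updated and the original sampling distribution (the trust-region machinery and the guarantees are in the paper, not here):

```python
# Importance-weighted reuse of ES perturbations on a toy quadratic objective.
import numpy as np

def f(theta):                                    # black-box objective
    return -np.sum((theta - 1.0) ** 2)

rng = np.random.default_rng(0)
theta, sigma, lr = np.zeros(10), 0.1, 0.05
eps = rng.normal(size=(64, 10))                  # sampled once...
returns = np.array([f(theta + sigma * e) for e in eps])

mu0 = theta.copy()
for _ in range(5):                               # ...reused for 5 updates
    # density ratio N(eps; (theta - mu0)/sigma, I) / N(eps; 0, I)
    shift = (theta - mu0) / sigma
    logw = eps @ shift - 0.5 * np.sum(shift ** 2)
    w = np.exp(logw - logw.max()); w /= w.sum()
    grad = (w * (returns - returns.mean())) @ eps / sigma
    theta = theta + lr * grad
print(f(theta))                                  # closer to 0 than f(zeros)
```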

AAAI Conference 2018 Conference Paper

Dual Transfer Learning for Neural Machine Translation with Marginal Distribution Regularization

  • Yijun Wang
  • Yingce Xia
  • Li Zhao
  • Jiang Bian
  • Tao Qin
  • Guiquan Liu
  • Tie-Yan Liu

Neural machine translation (NMT) heavily relies on parallel bilingual data for training. Since large-scale, high-quality parallel corpora are usually costly to collect, it is appealing to exploit monolingual corpora to improve NMT. Inspired by the law of total probability, which connects the probability of a given target-side monolingual sentence to the conditional probability of translating from a source sentence to the target one, we propose to explicitly exploit this connection to learn from and regularize the training of NMT models using monolingual data. The key technical challenge of this approach is that there are exponentially many source sentences for a target monolingual sentence when computing the sum of the conditional probabilities given each possible source sentence. We address this challenge by leveraging the dual translation model (target-to-source translation) to sample several most likely source-side sentences and avoid enumerating all possible candidate source sentences. That is, we transfer the knowledge contained in the dual model to boost the training of the primal model (source-to-target translation), and we call such an approach dual transfer learning. Experimental results on English→French and German→English tasks demonstrate that dual transfer learning achieves significant improvement over several strong baselines and obtains new state-of-the-art results.
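
The marginal-probability estimate can be illustrated numerically (all distributions below are made-up stand-ins for the actual models): sample likely source sentences x from the dual model q(x|y) and importance-weight to approximate P(y) = Σ_x P(x) P(y|x).

```python
# Importance-weighted estimate of log P(y) using samples from a dual model.
import numpy as np

rng = np.random.default_rng(0)
n_src = 1000
log_p_x = np.log(rng.dirichlet(np.ones(n_src)))           # source LM (toy)
log_p_y_given_x = np.log(rng.uniform(0, 1e-3, n_src))     # primal model (toy)
q_x_given_y = rng.dirichlet(np.ones(n_src))               # dual model (toy)

idx = rng.choice(n_src, size=8, p=q_x_given_y)            # likely sources
logw = log_p_x[idx] + log_p_y_given_x[idx] - np.log(q_x_given_y[idx])
log_marginal = np.log(np.mean(np.exp(logw - logw.max()))) + logw.max()
# Training would penalize the gap between this estimate and a language
# model's log P(y), regularizing the primal translation model.
print(log_marginal)
```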

AAAI Conference 2018 Conference Paper

Learning Structured Representation for Text Classification via Reinforcement Learning

  • Tianyang Zhang
  • Minlie Huang
  • Li Zhao

Representation learning is a fundamental problem in natural language processing. This paper studies how to learn a structured representation for text classification. Unlike most existing representation models that either use no structure or rely on pre-specified structures, we propose a reinforcement learning (RL) method to learn sentence representations by discovering optimized structures automatically. We demonstrate two attempts to build structured representations: Information Distilled LSTM (ID-LSTM) and Hierarchically Structured LSTM (HS-LSTM). ID-LSTM selects only important, task-relevant words, and HS-LSTM discovers phrase structures in a sentence. Structure discovery in the two representation models is formulated as a sequential decision problem: the current decision of structure discovery affects subsequent decisions, which can be addressed by policy gradient RL. Results show that our method can learn task-friendly representations by identifying important words or task-relevant structures without explicit structure annotations, and thus yields competitive performance.

AAAI Conference 2018 Conference Paper

Reinforcement Learning for Relation Classification From Noisy Data

  • Jun Feng
  • Minlie Huang
  • Li Zhao
  • Yang Yang
  • Xiaoyan Zhu

Existing relation classification methods that rely on distant supervision assume that a bag of sentences mentioning an entity pair all describe a relation for the entity pair. Such methods, performing classification at the bag level, cannot identify the mapping between a relation and a sentence, and largely suffer from the noisy labeling problem. In this paper, we propose a novel model for relation classification at the sentence level from noisy data. The model has two modules: an instance selector and a relation classifier. The instance selector chooses high-quality sentences with reinforcement learning and feeds the selected sentences into the relation classifier, and the relation classifier makes sentence-level predictions and provides rewards to the instance selector. The two modules are trained jointly to optimize the instance selection and relation classification processes. Experiment results show that our model can deal with the noise of data effectively and obtains better performance for relation classification at the sentence level.

AAAI Conference 2018 Conference Paper

Word Attention for Sequence to Sequence Text Understanding

  • Lijun Wu
  • Fei Tian
  • Li Zhao
  • Jianhuang Lai
  • Tie-Yan Liu

Attention mechanism has been a key component in Recurrent Neural Network (RNN) based sequence to sequence learning frameworks, which have been adopted in many text understanding tasks, such as neural machine translation and abstractive summarization. In these tasks, the attention mechanism models how important each part of the source sentence is for generating a target-side word. To compute such importance scores, the attention mechanism summarizes the source-side information in the encoder RNN hidden states (i.e., h_t), and then builds a context vector for a target-side word upon a subsequence representation of the source sentence, since h_t actually summarizes the information of the subsequence containing the first t words of the source sentence. In this paper, we show that an additional attention mechanism, called word attention, that builds itself upon word-level representations significantly enhances the performance of sequence to sequence learning. Our word attention can enrich the source-side contextual representation by directly promoting clean word-level information at each step. Furthermore, we propose to use contextual gates to dynamically combine the subsequence-level and word-level contextual information. Experimental results on abstractive summarization and neural machine translation show that word attention significantly improves over strong baselines. In particular, we achieve the state-of-the-art result on the WMT’14 English-French translation task with 12M training data.
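
One plausible wiring of word attention with a contextual gate (my sketch, not the paper's code): one attention over encoder states, one directly over word embeddings, and a learned gate mixing the two context vectors.

```python
# Dot-product attention over RNN states and over raw word embeddings, with a
# sigmoid gate combining the two resulting context vectors.
import torch
import torch.nn as nn

class GatedWordAttention(nn.Module):
    def __init__(self, dim):
        super().__init__()
        self.gate = nn.Linear(2 * dim, dim)

    def attend(self, query, keys):
        # query: (B, D); keys: (B, T, D) -> context (B, D)
        scores = torch.softmax(torch.einsum("bd,btd->bt", query, keys), -1)
        return torch.einsum("bt,btd->bd", scores, keys)

    def forward(self, dec_state, enc_states, word_embs):
        c_seq = self.attend(dec_state, enc_states)     # subsequence context
        c_word = self.attend(dec_state, word_embs)     # word-level context
        g = torch.sigmoid(self.gate(torch.cat([c_seq, c_word], -1)))
        return g * c_seq + (1 - g) * c_word            # gated combination

attn = GatedWordAttention(256)
ctx = attn(torch.rand(4, 256), torch.rand(4, 20, 256), torch.rand(4, 20, 256))
```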

IJCAI Conference 2017 Conference Paper

Sequence Prediction with Unlabeled Data by Reward Function Learning

  • Lijun Wu
  • Li Zhao
  • Tao Qin
  • Jianhuang Lai
  • Tie-Yan Liu

Reinforcement learning (RL), which has been successfully applied to sequence prediction, introduces reward as a sequence-level supervision signal to evaluate the quality of a generated sequence. Existing RL approaches use the ground-truth sequence to define the reward, which limits the application of RL techniques to labeled data. Since labeled data is usually scarce and/or costly to collect, it is desirable to leverage large-scale unlabeled data. In this paper, we extend existing RL methods for sequence prediction to exploit unlabeled data. We propose to learn the reward function from labeled data and use the predicted reward as a pseudo reward for unlabeled data, so that we can learn from unlabeled data using the pseudo reward. To obtain a good pseudo reward on unlabeled data, we propose an RNN-based reward network with an attention mechanism, trained with a purposely biased data distribution. Experiments show that the pseudo reward can provide good supervision and guide the learning process on unlabeled data. We observe significant improvements on both neural machine translation and text summarization.

AAAI Conference 2016 Conference Paper

Semi-Supervised Multinomial Naive Bayes for Text Classification by Leveraging Word-Level Statistical Constraint

  • Li Zhao
  • Minlie Huang
  • Ziyu Yao
  • Rongwei Su
  • Yingying Jiang
  • Xiaoyan Zhu

Multinomial Naive Bayes with Expectation Maximization (MNB-EM) is a standard semi-supervised learning method to augment Multinomial Naive Bayes (MNB) for text classification. Despite its success, MNB-EM is not stable, and may succeed or fail to improve MNB. We believe this is because MNB-EM lacks the ability to preserve the class distribution on words. In this paper, we propose a novel method to augment MNB-EM by leveraging a word-level statistical constraint to preserve the class distribution on words. The word-level statistical constraints are further converted into constraints on document posteriors generated by MNB-EM. Experiments demonstrate that our method can consistently improve MNB-EM, and outperforms state-of-the-art baselines remarkably.
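
For reference, a plain semi-supervised MNB-EM loop in numpy; the paper's word-level constraint would be enforced on the E-step posteriors, a projection omitted here.

```python
# EM for semi-supervised Multinomial Naive Bayes: labeled posteriors stay
# fixed, unlabeled posteriors are re-estimated each iteration.
import numpy as np

def mnb_em(X_lab, y_lab, X_unlab, n_classes, n_iter=10, alpha=1.0):
    X = np.vstack([X_lab, X_unlab])                    # doc-term counts
    post = np.zeros((len(X), n_classes))
    post[np.arange(len(X_lab)), y_lab] = 1.0           # labeled: fixed
    post[len(X_lab):] = 1.0 / n_classes                # unlabeled: uniform
    for _ in range(n_iter):
        # M-step: class priors and word distributions from soft counts.
        prior = post.sum(0) / post.sum()
        word = (post.T @ X) + alpha                    # Laplace smoothing
        word /= word.sum(1, keepdims=True)
        # E-step: recompute posteriors for the unlabeled documents only.
        logp = np.log(prior) + X_unlab @ np.log(word.T)
        logp -= logp.max(1, keepdims=True)
        post[len(X_lab):] = np.exp(logp) / np.exp(logp).sum(1, keepdims=True)
    return prior, word

rng = np.random.default_rng(0)
prior, word = mnb_em(rng.integers(0, 5, (20, 50)), rng.integers(0, 2, 20),
                     rng.integers(0, 5, (80, 50)), n_classes=2)
```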