EAAI Journal 2026 Journal Article
Enhanced graph neural network for rapid multi-field seismic prediction in shield tunnels with contact loss defects
- Xianlong Wu
- Jun Shen
- Xiaohua Bao
- Xiangsheng Chen
- Hongzhi Cui
Author name cluster
Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.
EAAI Journal 2026 Journal Article
AAAI Conference 2026 Conference Paper
With the widespread adoption of multi-view data in numerous fields, multi-view unsupervised feature selection (MUFS) has made notable strides in both feature pruning and missing-view completion. Nonetheless, existing MUFS methods typically rely on centralized servers, which cannot meet real-world demands for privacy preservation and distributed learning, and they often suffer from suboptimal solutions and weak convergence guarantees. To address these challenges, IMUFFS, an incomplete multi-view unsupervised federated feature selection method via cooperative particle swarm optimization (CPSO) and tensor-aligned learning (TAL), is proposed. Specifically, each client executes CPSO-TAL in two stages: (i) an external optimization phase in which a CPSO, inspired by the co-evolutionary mechanism of the hybrid breeding optimization algorithm, performs a global search in the feature space, and (ii) an internal optimization phase that leverages TAL with imputation and CP decomposition, where CP decomposition reduces dimensionality by decomposing the original tensor into a sum of core components, to learn low-dimensional embeddings while simultaneously updating anchor graphs and view preference weights, thereby harmonizing imputation and representation learning. On the server side, a federated aggregation strategy using adaptive normalized mutual information (NMI) weighting combines the locally optimized feature selection (FS) weights and NMI scores from clients, ensuring privacy while improving the quality of FS and convergence. Extensive experiments on multiple datasets demonstrate that IMUFFS consistently outperforms state-of-the-art methods, yielding more effective and robust FS and better missing-view completion.
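The server-side step described in this abstract can be illustrated with a minimal sketch. It assumes that each client's FS weight vector is weighted in proportion to its reported NMI score; the abstract does not give the actual adaptive weighting rule, and the function name `aggregate_fs_weights` is hypothetical.

```python
def aggregate_fs_weights(client_weights, client_nmi):
    """Combine per-client feature-selection weights using NMI-proportional coefficients.

    Assumption: coefficients are the NMI scores normalized to sum to 1.
    The rule used by IMUFFS itself is not specified in the abstract.
    """
    if not client_weights or len(client_weights) != len(client_nmi):
        raise ValueError("need one NMI score per client")
    total = sum(client_nmi)
    if total <= 0:
        # Fall back to a plain average when no client reports a positive score.
        coeffs = [1.0 / len(client_nmi)] * len(client_nmi)
    else:
        coeffs = [score / total for score in client_nmi]
    n_features = len(client_weights[0])
    # Weighted sum of the clients' weight vectors, feature by feature.
    return [
        sum(c * w[j] for c, w in zip(coeffs, client_weights))
        for j in range(n_features)
    ]


# Two clients, two features; the first client has the higher NMI score.
aggregated = aggregate_fs_weights([[1.0, 0.0], [0.0, 1.0]], [0.75, 0.25])
```

Only weight vectors and scalar scores reach the server in this sketch, which is consistent with the abstract's point that raw client data stays local.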
AAAI Conference 2026 Conference Paper
As one of the primary causes of visual impairment, Diabetic Retinopathy (DR) requires accurate and robust grading to facilitate timely diagnosis and intervention. Unlike conventional DR grading methods that utilize single-view images, recent clinical studies have revealed that multi-view fundus images can significantly enhance DR grading performance by expanding the field of view (FOV). However, there is a long-tailed distribution problem in fundus image analysis, i.e., a high prevalence of mild DR grades and a low prevalence of rare ones (e.g., cases of high severity), which presents a significant challenge to developing a unified model capable of detecting rare or unseen DR grades not encountered during training. In this paper, we propose ProME-DR, a Prompt-driven zero-shot DR grading framework, which leverages prompt Matching and Emulating to recognize the unseen DR categories and views beyond the training set. ProME-DR disentangles the training process into two stages to learn generalized knowledge for novel DR disease grading. Initially, ProME-DR leverages two sets of prompt units to capture semantic and inter-view consistency knowledge in a split-and-mask manner, gathering instance-level DR visual clues. Subsequently, it constructs a concept-aware emulator to generate context prompt units, linking extensible knowledge learned from the previously seen DR attributes for zero-shot DR grading. Extensive experiments conducted on eight datasets and various scenarios confirm the superiority of ProME-DR.
NeurIPS Conference 2025 Conference Paper
Language-guided object recognition in remote sensing imagery is crucial for large-scale mapping and automated data annotation. However, existing open-vocabulary and visual grounding methods rely on explicit category cues, limiting their ability to handle complex or implicit queries that require advanced reasoning. To address this issue, we introduce a new suite of tasks, including Instruction-Oriented Object Counting, Detection, and Segmentation (InstructCDS), covering open-vocabulary, open-ended, and open-subclass scenarios. We further present EarthInstruct, the first InstructCDS benchmark for earth observation. It is constructed from two diverse remote sensing datasets with varying spatial resolutions and annotation rules across 20 categories, necessitating models to interpret dataset-specific instructions. Given the scarcity of semantically rich labeled data in remote sensing, we propose InstructSAM, a training-free framework for instruction-driven object recognition. InstructSAM leverages large vision-language models to interpret user instructions and estimate object counts, employs SAM2 for mask proposal, and formulates mask-label assignment as a binary integer programming problem. By integrating semantic similarity with counting constraints, InstructSAM efficiently assigns categories to predicted masks without relying on confidence thresholds. Experiments demonstrate that InstructSAM matches or surpasses specialized baselines across multiple tasks while maintaining near-constant inference time regardless of object count, reducing output tokens by 89% and overall runtime by over 32% compared to direct generation approaches. We believe the contributions of the proposed tasks, benchmark, and effective approach will advance future research in developing versatile object recognition systems. The code is available at https://VoyagerXvoyagerx.github.io/InstructSAM.
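To make the mask-label assignment idea concrete, here is a toy sketch of a binary integer program with counting constraints, solved by exhaustive enumeration. It is not the InstructSAM implementation: the exact objective and constraints are assumptions (each mask gets at most one label, and each category must receive exactly its estimated count), and `assign_labels` is a hypothetical name. Enumeration is only practical for a handful of masks; a real system would use an integer programming solver.

```python
from itertools import product


def assign_labels(similarity, counts):
    """Pick labels for masks to maximize total similarity under count constraints.

    similarity[i][j] is the (assumed) semantic similarity of mask i to category j.
    counts[j] is the estimated number of objects in category j.
    Returns (labels, score); labels[i] is a category index or None.
    Returns (None, -inf) when no assignment satisfies the counts.
    """
    n_masks, n_cats = len(similarity), len(counts)
    best_score, best_labels = float("-inf"), None
    # Each mask takes one category index or None (unlabeled). This encodes binary
    # variables x[i][j] with at most one 1 per mask.
    for choice in product([None, *range(n_cats)], repeat=n_masks):
        used = [sum(1 for c in choice if c == j) for j in range(n_cats)]
        if used != list(counts):
            continue  # violates the counting constraint
        score = sum(similarity[i][c] for i, c in enumerate(choice) if c is not None)
        if score > best_score:
            best_score, best_labels = score, list(choice)
    return best_labels, best_score


# Three mask proposals, two categories, one object expected in each category.
sim = [[0.9, 0.1], [0.8, 0.3], [0.2, 0.7]]
labels, score = assign_labels(sim, [1, 1])
```

In this example the second mask is left unlabeled because the counts allow only two objects, so no confidence threshold is needed to discard it.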
IS Journal 2025 Journal Article
With widespread adoption globally, micromobility, such as bikes, e-scooters, and e-bikes, has attracted increasing attention due to its ability to complement existing transportation modes and promote sustainable transportation. Understanding micromobility user behaviors in urban areas is essential for improving safety and comfort, as well as for informing infrastructure development and policy. Prior investigations on micromobility user behaviors primarily relied on statistical and kinematic modeling approaches. Although these methods have proven effective in characterizing user behaviors at both macroscopic and microscopic levels, the advent of artificial intelligence (AI)-powered data analytics and behavioral modeling is revolutionizing the field. Recently, advanced machine learning models, such as gradient boosting decision trees, graph convolutional networks, and inverse reinforcement learning, have introduced new momentum into micromobility user behavior research. This article explores recent developments, research opportunities, and future directions in this field, leveraging the power of more generic AI approaches.
TIST Journal 2025 Journal Article
Learning resources in online learning systems typically adhere to uniform formats and settings, lacking the flexibility and personalization to meet diverse learning needs and preferences. This inability to meet individualized learning needs and preferences has spurred research interest in personalized learning path recommendation. Many researchers have explored recommending learning paths by leveraging users' historical learning resource sequences to model personalized characteristics. However, these methods overlook the time information in the learning process and fail to interpret the dynamic shifts in learning preferences during recommendation. Therefore, we propose a method, termed TA-RL, for learning path recommendation, based on a time-aware attention mechanism and reinforcement learning. First, we propose a novel time-aware attention mechanism to trace the evolving learning preferences of users, in which attention weights are computed using a context-aware time distance measure and the similarity between historical learning resources. Then, we employ a Monte Carlo policy gradient reinforcement learning method to generate learning path recommendations based on learning preferences. We validate the effectiveness of our proposed method through comprehensive experiments on two real-world datasets.
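A rough sketch of how attention weights might combine similarity with a time distance is given below. The abstract does not state the formula, so everything here is an assumption: the score is the similarity minus a linear time penalty, a plain elapsed-time gap stands in for the paper's context-aware time distance, and `time_aware_attention` and `decay` are hypothetical names.

```python
import math


def time_aware_attention(similarities, time_gaps, decay=0.1):
    """Softmax attention over history items, penalizing items that are far back in time.

    similarities[i]: similarity between history resource i and the current one.
    time_gaps[i]: elapsed time since resource i was studied (any consistent unit).
    decay: assumed penalty per unit of elapsed time.
    """
    if not similarities or len(similarities) != len(time_gaps):
        raise ValueError("need one time gap per similarity")
    scores = [s - decay * dt for s, dt in zip(similarities, time_gaps)]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


# Equal similarity, but the second resource was studied much more recently.
weights = time_aware_attention([0.5, 0.5], [10.0, 1.0])
```

With equal similarities, the more recent resource receives the larger weight, which is the behavior a time-aware mechanism is meant to capture.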
EAAI Journal 2025 Journal Article
AAAI Conference 2025 Conference Paper
Cryo-Electron Tomography (cryo-ET) is a 3D imaging technology that facilitates the study of macromolecular structures at near-atomic resolution. Recent volumetric segmentation approaches on cryo-ET images have drawn widespread interest in the biological sector. However, existing methods heavily rely on manually labeled data, which requires highly professional skills, thereby hindering the adoption of fully-supervised approaches for cryo-ET images. Some unsupervised domain adaptation (UDA) approaches have been designed to enhance the segmentation network performance using unlabeled data. However, applying these methods directly to cryo-ET image segmentation tasks remains challenging due to two main issues: 1) the source dataset, usually obtained through simulation, contains a fixed level of noise, while the target dataset, directly collected from raw data in real-world scenarios, has unpredictable noise levels; 2) the source data used for training typically consist of known macromolecules, whereas the target domain data are often unknown, causing the model to be biased towards the known macromolecules and leading to a domain shift problem. To address such challenges, in this work, we introduce a voxel-wise unsupervised domain adaptation approach, termed Vox-UDA, specifically for cryo-ET subtomogram segmentation. Vox-UDA incorporates a noise generation module to simulate target-like noise in the source dataset for cross-noise level adaptation. Additionally, we propose a denoised pseudo-labeling strategy based on the improved Bilateral Filter to alleviate the domain shift problem. More importantly, we construct the first UDA cryo-ET subtomogram segmentation benchmark on three experimental datasets. Extensive experimental results on multiple benchmarks and newly curated real-world datasets demonstrate the superiority of our proposed approach compared to state-of-the-art UDA methods.
NeurIPS Conference 2024 Conference Paper
Monocular depth estimation (MDE) is fundamental for deriving 3D scene structures from 2D images. While state-of-the-art monocular relative depth estimation (MRDE) excels in estimating relative depths for in-the-wild images, current monocular metric depth estimation (MMDE) approaches still face challenges in handling unseen scenes. Since MMDE can be viewed as the composition of MRDE and metric scale recovery, we attribute this difficulty to scene dependency, where MMDE models rely on scenes observed during supervised training for predicting scene scales during inference. To address this issue, we propose to use humans as landmarks for distilling scene-independent metric scale priors from generative painting models. Our approach, Metric from Human (MfH), bridges from generalizable MRDE to zero-shot MMDE in a generate-and-estimate manner. Specifically, MfH generates humans on the input image with generative painting and estimates human dimensions with an off-the-shelf human mesh recovery (HMR) model. Based on MRDE predictions, it propagates the metric information from painted humans to the contexts, resulting in metric depth estimations for the original input. Through this annotation-free test-time adaptation, MfH achieves superior zero-shot performance in MMDE, demonstrating its strong generalization ability.
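The step of propagating metric information from painted humans to the rest of the scene can be sketched as a least-squares fit. This is an illustration under stated assumptions, not the MfH implementation: it assumes metric depth is an affine function of relative depth (one scale and one shift), fitted on pixels covered by the generated humans, and `fit_scale_shift` is a hypothetical name.

```python
def fit_scale_shift(relative, metric):
    """Least-squares fit of metric ≈ scale * relative + shift.

    relative: relative depths (from an MRDE model) at human pixels.
    metric: metric depths at the same pixels (e.g. from human mesh recovery).
    The affine relationship is an assumption made for this sketch.
    """
    n = len(relative)
    if n < 2 or n != len(metric):
        raise ValueError("need at least two matched samples")
    mean_r = sum(relative) / n
    mean_d = sum(metric) / n
    var_r = sum((r - mean_r) ** 2 for r in relative)
    if var_r == 0:
        raise ValueError("relative depths must not all be equal")
    cov = sum((r - mean_r) * (d - mean_d) for r, d in zip(relative, metric))
    scale = cov / var_r
    shift = mean_d - scale * mean_r
    return scale, shift


# Samples on the painted humans, then applied to other pixels of the image.
scale, shift = fit_scale_shift([0.2, 0.4, 0.6], [2.0, 3.0, 4.0])
metric_background = [scale * r + shift for r in [0.1, 0.8]]
```

Once the scale and shift are known, the same transform converts every relative depth in the original image to a metric estimate, which is what lets the human regions act as landmarks for the whole scene.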
AAAI Conference 2018 Conference Paper
The technique called splitting sets has proven useful in simplifying the investigation of Answer Set Programming (ASP). In this paper, we investigate the splitting set theorem for LPMLN, a new extension of ASP created by combining the ideas of ASP and Markov Logic Networks (MLN). First, we extend the notion of splitting sets to LPMLN programs and present the splitting set theorem for LPMLN. Then, we illustrate the use of the theorem for simplifying several LPMLN inference tasks. After that, we give two parallel approaches for solving LPMLN programs using the theorem. The preliminary experimental results show that these approaches are alternative ways to improve an LPMLN solver.
YNIMG Journal 2006 Journal Article
YNIMG Journal 2006 Journal Article
YNIMG Journal 2005 Journal Article