Arrow Research search

Author name cluster

Bo Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

78 papers
2 author rows

Possible papers

78

AAAI Conference 2026 Conference Paper

CCFQA: A Benchmark for Cross-Lingual and Cross-Modal Speech and Text Factuality Evaluation

  • Yexing Du
  • Kaiyuan Liu
  • Youcheng Pan
  • Zheng Chu
  • Bo Yang
  • Xiaocheng Feng
  • Ming Liu
  • Yang Xiang

As Large Language Models (LLMs) are increasingly popularized in the multilingual world, ensuring hallucination-free factuality becomes markedly crucial. However, existing benchmarks for evaluating the reliability of Multimodal Large Language Models (MLLMs) predominantly focus on textual or visual modalities with a primary emphasis on English, which creates a gap in evaluation when processing multilingual input, especially in speech. To bridge this gap, we propose a novel Cross-lingual and Cross-modal Factuality benchmark (CCFQA). Specifically, the CCFQA benchmark contains parallel speech-text factual questions across 8 languages, designed to systematically evaluate MLLMs' cross-lingual and cross-modal factuality capabilities. Our experimental results demonstrate that current MLLMs still face substantial challenges on the CCFQA benchmark. Furthermore, we propose a few-shot transfer learning strategy that effectively transfers the Question Answering (QA) capabilities of LLMs in English to multilingual Spoken Question Answering (SQA) tasks, achieving competitive performance with GPT-4o-mini-Audio using just 5-shot training. We release CCFQA as a foundational research resource to promote the development of MLLMs with more robust and reliable speech understanding capabilities.

AAAI Conference 2026 Conference Paper

MISF: MLLM Guided Iterative Sample Filtering for Data Fault Detection

  • Guoying Chen
  • Ruizhuo Zhao
  • Zhewei Xu
  • Bo Yang
  • Kunlong Wang

High-quality datasets are critical for training reliable machine learning models, yet data faults caused by insufficient annotation expertise or malicious poisoning attacks remain prevalent. Traditional classifier-based methods rely on manually curated subsets for fault detection, but their limited scale frequently leads to model overfitting. While methods based on multimodal large language models (MLLMs) offer promising detection capabilities, their few-shot learning limitations hinder generalization in domain-specific tasks. To address these challenges, we propose MLLM Guided Iterative Sample Filtering (MISF), a novel framework that combines the strengths of MLLM-based initialization and iterative data refinement. Our framework initializes the detection model with MLLM-generated synthetic images and a curated clean subset, then iteratively refines it by progressively selecting high-certainty clean samples, improving both domain adaptation and detection accuracy. Extensive experiments on the RESISC45 and Oxford-IIIT Pets datasets demonstrate that MISF effectively identifies data faults, outperforming existing approaches. MISF provides a robust, scalable solution for improving dataset quality in specialized domains.

AAAI Conference 2026 Conference Paper

Single-Stage fMRI-to-3D Reconstruction via Viewpoint-Aware Embedding and Hierarchical Guidance

  • Xun Zhang
  • Weihao Xia
  • Yulong Liu
  • Bo Yang
  • Alessandro Bozzon
  • Pan Wang

Understanding the neural basis of three-dimensional (3D) perception is a fundamental objective in cognitive neuroscience. Despite advances in decoding 2D visual stimuli from neural data, reconstructing high-fidelity 3D objects with detailed texture and geometry remains largely unexplored. In this work, we introduce NeuroSculptor3D, the first single-stage, end-to-end framework for reconstructing textured 3D shapes directly from brain activity. NeuroSculptor3D integrates a viewpoint-aware brain embedding module that captures fine-grained spatial variations across visual perspectives, and a hierarchical guidance mechanism that aligns brain-derived features with perceptual, semantic, and structural priors. Together, these components facilitate the generation of consistent multi-view embeddings, which are then decoded via TRELLIS to produce high-quality textured 3D reconstructions. Experiments on the fMRI-Shape dataset demonstrate that NeuroSculptor3D outperforms existing baselines across multiple settings, achieving significant improvements in both structural accuracy and semantic consistency. Code will be released to facilitate further research.

AAAI Conference 2026 Conference Paper

SkyMoE: A Vision-Language Foundation Model for Enhancing Geospatial Interpretation with Mixture of Experts

  • Jiaqi Liu
  • Ronghao Fu
  • Lang Sun
  • Haoran Liu
  • Xiao Yang
  • Weipeng Zhang
  • Xu Na
  • Zhuoran Duan

The emergence of large vision-language models (VLMs) has significantly enhanced the efficiency and flexibility of geospatial interpretation. However, general-purpose VLMs remain suboptimal for remote sensing (RS) tasks. Existing geospatial VLMs typically adopt a unified modeling strategy and struggle to differentiate between task types and interpretation granularities, limiting their ability to balance local detail perception and global contextual understanding. In this paper, we present SkyMoE, a Mixture-of-Experts (MoE) vision-language model tailored for multimodal, multi-task RS interpretation. SkyMoE employs an adaptive router that generates task- and granularity-aware routing instructions, enabling specialized large language model experts to handle diverse sub-tasks. To further promote expert decoupling and granularity sensitivity, we introduce a context-disentangled augmentation strategy that creates contrastive pairs between local and global features, guiding experts toward level-specific representation learning. We also construct MGRS-Bench, a comprehensive benchmark covering multiple RS interpretation tasks and granularity levels, to evaluate generalization in complex scenarios. Extensive experiments on 21 public datasets demonstrate that SkyMoE achieves state-of-the-art performance across tasks, validating its adaptability, scalability, and superior multi-granularity understanding in remote sensing.

ICRA Conference 2025 Conference Paper

Brain-Inspired Spatial Continuous State Encoding for Efficient Spiking-Based Navigation

  • Qingao Chai
  • Jiashuo Wang
  • Runhao Jiang
  • Bo Yang
  • Rui Yan
  • Huajin Tang

Spiking neural networks (SNNs) show great potential in mapless navigation tasks due to their low power consumption, but the continuous representation of spatial information poses a challenge to SNN training. Neuroscience findings reveal that spatial cognition cells encode spatial information through population spike patterns. Inspired by this, we propose a navigation method based on SNNs, leveraging spatial cognition cells, which include grid cells (GCs), head direction cells (HDCs), and boundary vector cells (BVCs). Our method integrates spike-based information to achieve precise navigation goal encoding and egocentric environment perception, significantly improving SNN navigation capabilities in complex environments. Simulation and real-world experiments demonstrate that our method achieves significant improvements in navigation success rate and energy efficiency, showcasing superior adaptability across environments. Our work provides a novel approach to developing efficient brain-inspired navigation systems.

IJCAI Conference 2025 Conference Paper

Distilling A Universal Expert from Clustered Federated Learning

  • Zeqi Leng
  • Chunxu Zhang
  • Guodong Long
  • Riting Xia
  • Bo Yang

Clustered Federated Learning (CFL) addresses the challenges posed by non-IID data by training multiple group- or cluster-specific expert models. However, existing methods often overlook the shared information across clusters, which represents the generalizable knowledge valuable to all participants in the Federated Learning (FL) system. To overcome this limitation, this paper introduces a novel FL framework that distills a universal expert model from the knowledge of multiple clusters. This universal expert captures globally shared information across all clients and is subsequently distributed to each client as the initialization for the next round of model training. The proposed FL framework operates in three iterative steps: (1) local model training at each client, (2) cluster-specific model aggregation, and (3) universal expert distillation. This three-step learning paradigm ensures the preservation of fine-grained non-IID characteristics while effectively incorporating shared knowledge across clusters. Compared to traditional gradient-based aggregation methods, the distillation-based model aggregation introduces greater flexibility in handling model heterogeneity and reduces conflicts among cluster-specific experts. Extensive experimental results demonstrate the superior performance of the proposed method across various scenarios, highlighting its potential to advance the state of CFL by balancing personalized and shared knowledge more effectively.

IROS Conference 2025 Conference Paper

Heterogeneous Graph Network-Based UWB Localization for Complex Indoor Environments

  • Bo Yang
  • Luyang Li
  • Sizhen He
  • Weinan Chen
  • Hong Zhang

Accurate indoor location-based services are important for mobile robots, especially in complex indoor environments. In this paper, we propose a heterogeneous graph network-based ultra-wideband (UWB) localization method to provide accurate and robust localization results for mobile robots in complex indoor scenarios. The core of our approach lies in constructing the anchors, ranging measurements, and tags into a heterogeneous graph structure according to the topological structure of the UWB localization system, and then designing a spatial-temporal heterogeneous graph attention neural network to extract high-level features and estimate the tag locations from the graph. In this way, the geometric relationships contained in the UWB localization system are comprehensively established, while the spatial and temporal information contained in the ranging measurements can also be extracted. We validate the proposed method through real-world experiments. The results demonstrate that, compared to existing deep learning-based methods, the constructed heterogeneous graph better represents the geometric structure of the UWB localization system, and the designed heterogeneous graph neural network effectively extracts the spatial-temporal and geometric features. Consequently, the accuracy and robustness of UWB localization are significantly improved.

ICRA Conference 2025 Conference Paper

HSRL: A Hierarchical Control System Based on Spiking Deep Reinforcement Learning for Robot Navigation

  • Bo Yang
  • Shibo Zhou
  • Chaohui Lin
  • Qingao Chai
  • Rui Yan 0005
  • De Ma
  • Gang Pan 0001
  • Huajin Tang

Reinforcement Learning (RL) has shown promise in robotic navigation tasks, yet applying it to real-world environments remains challenging due to dynamic complexities and the need for dynamically feasible actions. We propose a hierarchical control framework based on Spiking Deep Reinforcement Learning (SDRL) for robust robot navigation in real environments. Our approach utilizes a two-layer architecture: a high-level decision layer powered by a Spiking GRU network for handling partially observable environments, and a low-level executive layer employing Continuous Attractor Neural Networks (CANNs) to ensure precise and continuous actions. This hierarchical structure allows real-time decision-making that respects the physical constraints of the robot. Experimental results show that our method adapts effectively to new environments without fine-tuning and surpasses existing methods in performance. We also explore the implementation on the Darwin3 chip, paving the way for biologically inspired motion control in future robotic applications.

AAAI Conference 2025 Conference Paper

Improving Integrated Gradient-based Transferable Adversarial Examples by Refining the Integration Path

  • Yuchen Ren
  • Zhengyu Zhao
  • Chenhao Lin
  • Bo Yang
  • Lu Zhou
  • Zhe Liu
  • Chao Shen

Transferable adversarial examples are known to cause threats in practical, black-box attack scenarios. A notable approach to improving transferability is using integrated gradients (IG), originally developed for model interpretability. In this paper, we find that existing IG-based attacks have limited transferability due to their naive adoption of IG in model interpretability. To address this limitation, we focus on the IG integration path and refine it in three aspects: multiplicity, monotonicity, and diversity, supported by theoretical analyses. We propose the Multiple Monotonic Diversified Integrated Gradients (MuMoDIG) attack, which can generate highly transferable adversarial examples on different CNN and ViT models and defenses. Experiments validate that MuMoDIG outperforms the latest IG-based attack by up to 37.3% and other state-of-the-art attacks by 8.4%. In general, our study reveals that migrating established techniques to improve transferability may require non-trivial efforts.

AAAI Conference 2025 Conference Paper

Multifaceted User Modeling in Recommendation: A Federated Foundation Models Approach

  • Chunxu Zhang
  • Guodong Long
  • Hongkuan Guo
  • Zhaojie Liu
  • Guorui Zhou
  • Zijian Zhang
  • Yang Liu
  • Bo Yang

Multifaceted user modeling aims to uncover fine-grained patterns and learn representations from user data, revealing their diverse interests and characteristics, such as profile, preference, and personality. Recent studies on foundation model-based recommendation have emphasized the Transformer architecture's remarkable ability to capture complex, non-linear user-item interaction relationships. This paper aims to advance foundation model-based recommender systems by introducing enhancements to multifaceted user modeling capabilities. We propose a novel Transformer layer designed specifically for recommendation, using the self-attention mechanism to capture sequential user-item interaction patterns. Specifically, we design a group gating network to identify user groups, enabling hierarchical discovery across different layers, thereby capturing the multifaceted nature of user interests through multiple Transformer layers. Furthermore, to broaden the data scope and further enhance multifaceted user modeling, we extend the framework to a federated setting, enabling the use of private datasets while ensuring privacy. Experimental validations on benchmark datasets demonstrate the superior performance of our proposed method.

JAIR Journal 2025 Journal Article

Rumor Detection with Adaptive Data Augmentation and Adversarial Training

  • Ying Wang
  • Fuyuan Ma
  • Zhaoqi Yang
  • Yaodi Zhu
  • Bo Yang
  • Pengfei Shen
  • Lei Yun

Rumors are widely spread on social media, which has a negative impact on social stability. To address this problem, many rumor detection methods have been proposed. However, most existing methods overlook the potential impact of noise and adversarial attacks on their detection performance, which could compromise their effectiveness when applied in an unknown environment. To overcome these challenges and improve the framework's robustness to noise and adversarial attacks, we propose a novel rumor detection framework with Adaptive Data Augmentation and Adversarial Training, named ADAAT. Our framework utilizes the adaptive data augmentation module to calculate the importance of edges and features and adaptively modify the less important ones with greater probability. In addition, it contains a hard-sample generation module that generates adversarial representations through adversarial training. These adversarial representations are treated as hard samples and utilized in contrastive learning to learn essential features, thereby improving the robustness of the framework. Our framework demonstrates its superiority on rumor detection tasks, increasing accuracy by an average of 3.6%, 4.5% and 2.5% over the state-of-the-art methods on Twitter15, Twitter16 and PHEME, respectively. When the ADAAT framework is applied to attacked test data, the detection accuracy decreases by only 1.3%, 1.4%, and 1.2%. This paper appears in the AI & Society Track.

IJCAI Conference 2025 Conference Paper

State Feedback Enhanced Graph Differential Equations for Multivariate Time Series Forecasting

  • Jiaxu Cui
  • Qipeng Wang
  • Yiming Zhao
  • Bingyi Sun
  • Pengfei Wang
  • Bo Yang

Multivariate time series forecasting holds significant theoretical and practical importance in various fields, including web analytics and transportation. Recently, graph neural networks and graph differential equations have shown exceptional capabilities in modeling spatio-temporal features. However, existing methods often suffer from over-smoothing, hindering real-world problem-solving. In this work, we analyze the graph propagation process as a dynamical system and propose a novel feedback mechanism to enhance representation power, adaptively adjusting the representations to align with desired performance outcomes, thereby fundamentally mitigating the issue of over-smoothing. Moreover, we introduce an effective multivariate time series forecasting model called SF-GDE, based on the proposed graph propagation with the feedback mechanism. Intensive experiments are conducted on three real-world datasets from diverse fields. Results show that SF-GDE outperforms the state of the art, and the feedback mechanism can serve as a universal booster to improve performance for graph propagation models.

NeurIPS Conference 2025 Conference Paper

TimeEmb: A Lightweight Static-Dynamic Disentanglement Framework for Time Series Forecasting

  • Mingyuan Xia
  • Chunxu Zhang
  • Zijian Zhang
  • Hao Miao
  • Qidong Liu
  • Yuanshao Zhu
  • Bo Yang

Temporal non-stationarity, the phenomenon that time series distributions change over time, poses fundamental challenges to reliable time series forecasting. Intuitively, a complex time series can be decomposed into two factors, i.e., time-invariant and time-varying components, which indicate static and dynamic patterns, respectively. Nonetheless, existing methods often conflate the time-varying and time-invariant components, and jointly learn the combined long-term patterns and short-term fluctuations, leading to suboptimal performance in the face of distribution shifts. To address this issue, we propose a lightweight static-dynamic decomposition framework, TimeEmb, for time series forecasting. TimeEmb innovatively separates time series into two complementary components: (1) a time-invariant component, captured by a novel global embedding module that learns persistent representations across time series, and (2) a time-varying component, processed by an efficient frequency-domain filtering mechanism inspired by full-spectrum analysis in signal processing. Experiments on real-world datasets demonstrate that TimeEmb outperforms state-of-the-art baselines and requires fewer computational resources. We conduct comprehensive quantitative and qualitative analyses to verify the efficacy of static-dynamic disentanglement. This lightweight framework can also improve existing time-series forecasting methods with simple integration. To ease reproducibility, our code is available at https://github.com/showmeon/TimeEmb.

IJCAI Conference 2025 Conference Paper

Towards Generalizable Neural Simulators: Addressing Distribution Shifts Induced by Environmental and Temporal Variations

  • Jiaqi Liu
  • Jiaxu Cui
  • Shiang Sun
  • Yizhu Zhao
  • Bo Yang

With advancements in deep learning, neural simulators have become increasingly important for improving the efficiency and effectiveness of simulating complex dynamical systems in various scientific and technological fields. This paper presents a novel neural simulator called Context-informed Polymorphic Neural ODE Processes (CoPoNDP), aimed at addressing the challenges of modeling dynamical systems encountering concurrent environmental and temporal distribution shifts, which are common in real-world scenarios. CoPoNDP employs a context-driven neural stochastic process governed by a combination of basic differential equations in a time-sensitive manner to adaptively modulate the evolution of system states. This allows for flexible adaptation to changing temporal dynamics and generalization across different environments. Extensive experiments conducted on dynamical systems from ecology, chemistry, physics, and energy demonstrate that by effectively utilizing contextual information, CoPoNDP outperforms the state-of-the-art models in handling joint distribution shifts. It also shows robustness in sparse and noisy settings, making it a promising approach for modeling dynamical systems in complex real-world applications.

IROS Conference 2024 Conference Paper

CASRL: Collision Avoidance with Spiking Reinforcement Learning Among Dynamic, Decision-Making Agents

  • Chengjun Zhang
  • Ka-Wa Yip
  • Bo Yang
  • Zhiyong Zhang
  • Mengwen Yuan
  • Rui Yan 0005
  • Huajin Tang

Developing an efficient collision avoidance policy with Spiking Reinforcement Learning for dynamic, decision-making agents remains challenging. Moreover, the implementation of energy-efficient collision avoidance is important for mobile robots that operate with limited on-board computing resources. Most existing energy-efficient methods via spiking reinforcement learning are predominantly concerned with the navigational capabilities of a single agent and are unable to handle a large and possibly varying number of agents. To overcome these limitations, we propose a model called collision avoidance with spiking reinforcement learning (CASRL), based on proximal policy optimization algorithms. This proposed model consists of an actor with spiking neural networks (SNNs) and a critic with deep neural networks (DNNs). Our spiking reinforcement learning algorithm can handle an arbitrary number of other agents by virtue of a spiking-gated transformer (SpikeGTr) architecture and an accumulate-to-fire (ATF) module. Extensive experimental results demonstrate that CASRL obtains a competitive navigation success rate and exhibits higher time-efficiency in crowded scenarios compared to traditional DNN-based methods.

NeurIPS Conference 2024 Conference Paper

Decision Mamba: Reinforcement Learning via Hybrid Selective Sequence Modeling

  • Sili Huang
  • Jifeng Hu
  • Zhejian Yang
  • Liwei Yang
  • Tao Luo
  • Hechang Chen
  • Lichao Sun
  • Bo Yang

Recent works have shown the remarkable superiority of transformer models in reinforcement learning (RL), where the decision-making problem is formulated as sequential generation. Transformer-based agents could emerge with self-improvement in online environments by providing task contexts, such as multiple trajectories, called in-context RL. However, due to the quadratic computation complexity of attention in transformers, current in-context RL methods suffer from huge computational costs as the task horizon increases. In contrast, the Mamba model is renowned for its efficient ability to process long-term dependencies, which provides an opportunity for in-context RL to solve tasks that require long-term memory. To this end, we first implement Decision Mamba (DM) by replacing the backbone of Decision Transformer (DT). Then, we propose a Decision Mamba-Hybrid (DM-H) with the merits of transformers and Mamba in high-quality prediction and long-term memory. Specifically, DM-H first generates high-value sub-goals from long-term memory through the Mamba model. Then, we use sub-goals to prompt the transformer, establishing high-quality predictions. Experimental results demonstrate that DM-H achieves state-of-the-art performance in long- and short-term tasks, such as the D4RL, Grid World, and Tmaze benchmarks. Regarding efficiency, the online testing of DM-H in the long-term task is 28$\times$ faster than the transformer-based baselines.

IJCAI Conference 2024 Conference Paper

Federated Adaptation for Foundation Model-based Recommendations

  • Chunxu Zhang
  • Guodong Long
  • Hongkuan Guo
  • Xiao Fang
  • Yang Song
  • Zhaojie Liu
  • Guorui Zhou
  • Zijian Zhang

With the recent success of large language models, particularly foundation models with generalization abilities, applying foundation models to recommendation has become a new paradigm for improving existing recommendation systems. It remains an open challenge to enable the foundation model to capture user preference changes in a timely manner with reasonable communication and computation costs while preserving privacy. This paper proposes a novel federated adaptation mechanism to enhance foundation model-based recommendation systems in a privacy-preserving manner. Specifically, each client learns a lightweight personalized adapter using its private data. The adapter then collaborates with pre-trained foundation models to provide recommendation services efficiently and in a fine-grained manner. Importantly, users' private behavioral data remains secure as it is not shared with the server. This data localization-based privacy preservation is embodied via the federated learning framework. The model ensures that shared knowledge is incorporated into all adapters while simultaneously preserving each user's personal preferences. Experimental results on four benchmark datasets demonstrate our method's superior performance. The code is available.

AAAI Conference 2024 Conference Paper

Fine-Grained Prototypes Distillation for Few-Shot Object Detection

  • Zichen Wang
  • Bo Yang
  • Haonan Yue
  • Zhenghao Ma

Few-shot object detection (FSOD) aims at extending a generic detector for novel object detection with only a few training examples. It has attracted considerable attention recently due to its practical significance. Meta-learning has been demonstrated to be an effective paradigm for this task. In general, methods based on meta-learning employ an additional support branch to encode novel examples (a.k.a. support images) into class prototypes, which are then fused with the query branch to facilitate the model prediction. However, the class-level prototypes are difficult to precisely generate, and they also lack detailed information, leading to instability in performance. New methods are required to capture the distinctive local context for more robust novel object detection. To this end, we propose to distill the most representative support features into fine-grained prototypes. These prototypes are then assigned into query feature maps based on the matching results, modeling the detailed feature relations between two branches. This process is realized by our Fine-Grained Feature Aggregation (FFA) module. Moreover, in terms of high-level feature fusion, we propose a Balanced Class-Agnostic Sampling (B-CAS) strategy and a Non-Linear Fusion (NLF) module from different perspectives. They are complementary to each other and depict the high-level feature relations more effectively. Extensive experiments on PASCAL VOC and MS COCO benchmarks show that our method sets a new state-of-the-art performance in most settings. Our code is available at https://github.com/wangchen1801/FPD.

IJCAI Conference 2024 Conference Paper

Stochastic Neural Simulator for Generalizing Dynamical Systems across Environments

  • Liu Jiaqi
  • Jiaxu Cui
  • Jiayi Yang
  • Bo Yang

Neural simulators for modeling complex dynamical systems have been extensively studied for various real-world applications, such as weather forecasting, ocean current prediction, and computational fluid dynamics simulation. Although they have demonstrated powerful fitting and prediction capabilities, most existing models are built to learn only single-system dynamics. Several recent studies have considered learning dynamics across environments, which can exploit the potential commonalities among the dynamics across environments and adapt to new environments. However, these methods are still prone to data-scarcity problems, where per-environment data is sparse or limited. Therefore, we propose a novel CoNDP (Context-Informed Neural ODE Processes) to learn system dynamics from sparse observations across environments. It can fully use the contextual information of each environment to better capture the intrinsic commonalities across environments and the distinguishable differences among them while modeling the uncertainty of system evolution, producing more accurate predictions. Intensive experiments are conducted on five complex dynamical systems in various fields. Results show that the proposed CoNDP achieves optimal results compared with common neural simulators and state-of-the-art cross-environmental models.

IJCAI Conference 2023 Conference Paper

Do We Need an Encoder-Decoder to Model Dynamical Systems on Networks?

  • Bing Liu
  • Wei Luo
  • Gang Li
  • Jing Huang
  • Bo Yang

As deep learning gains popularity in modelling dynamical systems, we expose an underappreciated misunderstanding relevant to modelling dynamics on networks. Strongly influenced by graph neural networks, latent vertex embeddings are naturally adopted in many neural dynamical network models. However, we show that embeddings tend to induce a model that fits observations well but simultaneously has incorrect dynamical behaviours. Recognising that previous studies narrowly focus on short-term predictions during the transient phase of a flow, we propose three tests for correct long-term behaviour, and illustrate how an embedding-based dynamical model fails these tests, and analyse the causes, particularly through the lens of topological conjugacy. In doing so, we show that the difficulties can be avoided by not using embedding. We propose a simple embedding-free alternative based on parametrising two additive vector-field components. Through extensive experiments, we verify that the proposed model can reliably recover a broad class of dynamics on different network topologies from time series data.

ICRA Conference 2023 Conference Paper

DriveIRL: Drive in Real Life with Inverse Reinforcement Learning

  • Tung Phan-Minh
  • Forbes Howington
  • Ting-Sheng Chu
  • Momchil S. Tomov
  • Robert E. Beaudoin
  • Sang Uk Lee
  • Nanxiang Li
  • Caglayan Dicle

In this paper, we introduce the first published planner to drive a car in dense, urban traffic using Inverse Reinforcement Learning (IRL). Our planner, DriveIRL, generates a diverse set of trajectory proposals and scores them with a learned model. The best trajectory is tracked by our self-driving vehicle's low-level controller. We train our trajectory scoring model on a 500+ hour real-world dataset of expert driving demonstrations in Las Vegas within the maximum entropy IRL framework. DriveIRL's benefits include: a simple design due to only learning the trajectory scoring function, a flexible and relatively interpretable feature engineering approach, and strong real-world performance. We validated DriveIRL on the Las Vegas Strip and demonstrated fully autonomous driving in heavy traffic, including scenarios involving cut-ins, abrupt braking by the lead vehicle, and hotel pickup/dropoff zones. Our dataset, a part of nuPlan, has been released to the public to help further research in this area.

IJCAI Conference 2023 Conference Paper

Dual Personalization on Federated Recommendation

  • Chunxu Zhang
  • Guodong Long
  • Tianyi Zhou
  • Peng Yan
  • Zijian Zhang
  • Chengqi Zhang
  • Bo Yang

Federated recommendation is a new Internet service architecture that aims to provide privacy-preserving recommendation services in federated settings. Existing solutions typically combine distributed recommendation algorithms with privacy-preserving mechanisms, and thus inherently take the form of heavyweight models at the server, hindering the deployment of on-device intelligent models to end-users. This paper proposes a novel Personalized Federated Recommendation (PFedRec) framework to learn many user-specific lightweight models to be deployed on smart devices rather than a heavyweight model on a server. Moreover, we propose a new dual personalization mechanism to effectively learn fine-grained personalization on both users and items. The overall learning process is formulated into a unified federated optimization framework. Specifically, unlike previous methods that share exactly the same item embeddings across users in a federated system, dual personalization allows mild fine-tuning of item embeddings for each user to generate user-specific views of item representations, which can be integrated into existing federated recommendation methods to gain immediate improvements. Experiments on multiple benchmark datasets have demonstrated the effectiveness of PFedRec and the dual personalization mechanism. Moreover, we provide visualizations and in-depth analysis of the personalization techniques in item embedding, which shed new insights on the design of recommender systems in federated settings. The code is available.

NeurIPS Conference 2023 Conference Paper

Learning Generalizable Agents via Saliency-guided Features Decorrelation

  • Sili Huang
  • Yanchao Sun
  • Jifeng Hu
  • Siyuan Guo
  • Hechang Chen
  • Yi Chang
  • Lichao Sun
  • Bo Yang

In visual-based Reinforcement Learning (RL), agents often struggle to generalize well to environmental variations in the state space that were not observed during training. The variations can arise in both task-irrelevant features, such as background noise, and task-relevant features, such as robot configurations, that are related to the optimal decisions. To achieve generalization in both situations, agents are required to accurately understand the impact of changed features on the decisions, i.e., establishing the true associations between changed features and decisions in the policy model. However, due to the inherent correlations among features in the state space, the associations between features and decisions become entangled, making it difficult for the policy to distinguish them. To this end, we propose Saliency-Guided Features Decorrelation (SGFD) to eliminate these correlations through sample reweighting. Concretely, SGFD consists of two core techniques: Random Fourier Functions (RFF) and the saliency map. RFF is utilized to estimate the complex non-linear correlations in high-dimensional images, while the saliency map is designed to identify the changed features. Under the guidance of the saliency map, SGFD employs sample reweighting to minimize the estimated correlations related to changed features, thereby achieving decorrelation in visual RL tasks. Our experimental results demonstrate that SGFD can generalize well on a wide range of test environments and significantly outperforms state-of-the-art methods in handling both task-irrelevant variations and task-relevant variations.

JBHI Journal 2023 Journal Article

Lung Sound Recognition Method Based on Multi-Resolution Interleaved Net and Time-Frequency Feature Enhancement

  • Lukui Shi
  • Jingye Zhang
  • Bo Yang
  • Yingjie Gao

Air pollution and an aging population have caused rates of lung disease, particularly among the elderly, to increase year by year. At the same time, the outbreak of COVID-19 has brought challenges to the medical system, placing higher demands on preventing lung diseases and improving diagnostic efficiency. Artificial intelligence can alleviate the burden on the medical system by analyzing lung sound signals to help diagnose lung diseases. Existing models for lung sound recognition struggle to capture the correlation between time and frequency information: convolutional neural networks have difficulty capturing multi-scale features across different resolutions, and feature fusion ignores the differing influences of time and frequency features. To address these issues, a lung sound recognition model based on a multi-resolution interleaved net and time-frequency feature enhancement was proposed, which consists of a heterogeneous dual-branch time-frequency feature extractor (TFFE), a time-frequency feature enhancement module based on branch attention (FEBA), and a fusion semantic classifier based on semantic mapping (FSC). TFFE independently extracts the time and frequency information of lung sounds through a multi-resolution interleaved net and a Transformer, which maintains the correlation between time-frequency features. FEBA focuses on the differences in the influence of time and frequency information on prediction results via branch attention. The proposed model achieved an accuracy of 91.56% on the combined dataset, an improvement of over 2.13% compared to other models.

JBHI Journal 2023 Journal Article

Natural Grasp Intention Recognition Based on Gaze in Human–Robot Interaction

  • Bo Yang
  • Jian Huang
  • Xinxing Chen
  • Xiaolong Li
  • Yasuhisa Hasegawa

Objective: While neuroscience research has established a link between vision and intention, studies on gaze data features for intention recognition are absent. The majority of existing gaze-based intention recognition approaches are based on deliberate long-term fixation and suffer from insufficient accuracy. In order to address the lack of features and insufficient accuracy in previous studies, the primary objective of this study is to suppress noise from human gaze data and extract useful features for recognizing grasp intention. Methods: We conduct gaze movement evaluation experiments to investigate the characteristics of gaze motion. The target-attracted gaze movement model (TAGMM) is proposed as a quantitative description of gaze movement based on the findings. A Kalman filter (KF) is used to reduce the noise in the gaze data based on TAGMM. We conduct gaze-based natural grasp intention recognition evaluation experiments to collect the subject's gaze data. Four types of features describing gaze point dispersion ($f_{var}$), gaze point movement ($f_{gm}$), head movement ($f_{hm}$), and distance from the gaze points to objects ($f_{d_{j}}$) are then proposed to recognize the subject's grasp intentions. With the proposed features, we perform intention recognition experiments, employing various classifiers, and the results are compared with different methods. Results: The statistical analysis reveals that the proposed features differ significantly across intentions, offering the possibility of employing these features to recognize grasp intentions. We demonstrated the intention recognition performance utilizing the TAGMM and the proposed features in within-subject and cross-subject experiments. The results indicate that the proposed method can recognize the intention with accuracy improvements of 44.26% (within-subject) and 30.67% (cross-subject) over the fixation-based method. The proposed method also consumes less time (34.87 ms) to recognize the intention than the fixation-based method (about 1 s). Conclusion: This work introduces a novel TAGMM for modeling gaze movement and a variety of practical features for recognizing grasp intentions. Experiments confirm the effectiveness of our approach. Significance: The proposed TAGMM is capable of modeling gaze movements and can be utilized to process gaze data, and the proposed features can reveal the user's intentions. These results contribute to the development of gaze-based human-robot interaction.

NeurIPS Conference 2023 Conference Paper

NVFi: Neural Velocity Fields for 3D Physics Learning from Dynamic Videos

  • Jinxi Li
  • Ziyang Song
  • Bo Yang

In this paper, we aim to model 3D scene dynamics from multi-view videos. Unlike the majority of existing works which usually focus on the common task of novel view synthesis within the training time period, we propose to simultaneously learn the geometry, appearance, and physical velocity of 3D scenes only from video frames, such that multiple desirable applications can be supported, including future frame extrapolation, unsupervised 3D semantic scene decomposition, and dynamic motion transfer. Our method consists of three major components, 1) the keyframe dynamic radiance field, 2) the interframe velocity field, and 3) a joint keyframe and interframe optimization module which is the core of our framework to effectively train both networks. To validate our method, we further introduce two dynamic 3D datasets: 1) Dynamic Object dataset, and 2) Dynamic Indoor Scene dataset. We conduct extensive experiments on multiple datasets, demonstrating the superior performance of our method over all baselines, particularly in the critical tasks of future frame extrapolation and unsupervised 3D semantic scene decomposition.

AAAI Conference 2023 Conference Paper

OPT-GAN: A Broad-Spectrum Global Optimizer for Black-Box Problems by Learning Distribution

  • Minfang Lu
  • Shuai Ning
  • Shuangrong Liu
  • Fengyang Sun
  • Bo Zhang
  • Bo Yang
  • Lin Wang

Black-box optimization (BBO) algorithms are concerned with finding the best solutions for problems with missing analytical details. Most classical methods for such problems are based on strong and fixed a priori assumptions, such as Gaussianity. However, complex real-world problems, especially when the global optimum is desired, can be very far from these a priori assumptions because of their diversity, causing unexpected obstacles. In this study, we propose a generative adversarial net-based broad-spectrum global optimizer (OPT-GAN) which gradually estimates the distribution of the optimum, with strategies to balance the exploration-exploitation trade-off. It has the potential to better adapt to the regularity and structure of diversified landscapes than other methods with a fixed prior, e.g., a Gaussian assumption or separability. Experiments on diverse BBO benchmarks and high-dimensional real-world applications show that OPT-GAN outperforms other traditional and neural net-based BBO algorithms. The code and Appendix are available at https://github.com/NBICLAB/OPT-GAN

NeurIPS Conference 2023 Conference Paper

RayDF: Neural Ray-surface Distance Fields with Multi-view Consistency

  • Zhuoman Liu
  • Bo Yang
  • Yan Luximon
  • Ajay Kumar
  • Jinxi Li

In this paper, we study the problem of continuous 3D shape representations. The majority of existing successful methods are coordinate-based implicit neural representations. However, they are inefficient to render novel views or recover explicit surface points. A few works start to formulate 3D shapes as ray-based neural functions, but the learned structures are inferior due to the lack of multi-view geometry consistency. To tackle these challenges, we propose a new framework called RayDF. It consists of three major components: 1) the simple ray-surface distance field, 2) the novel dual-ray visibility classifier, and 3) a multi-view consistency optimization module to drive the learned ray-surface distances to be multi-view geometry consistent. We extensively evaluate our method on three public datasets, demonstrating remarkable performance in 3D surface point reconstruction on both synthetic and challenging real-world 3D scenes, clearly surpassing existing coordinate-based and ray-based baselines. Most notably, our method achieves a 1000x faster speed than coordinate-based methods to render an 800x800 depth image, showing the superiority of our method for 3D shape representation. Our code and data are available at https://github.com/vLAR-group/RayDF

IROS Conference 2023 Conference Paper

Spiking Reinforcement Learning with Memory Ability for Mapless Navigation

  • Bo Yang
  • Mengwen Yuan
  • Chengjun Zhang
  • Chaofei Hong
  • Gang Pan 0001
  • Huajin Tang

Our study focuses on mapless navigation in robotics, which involves navigating without an established obstacle map of the environment. Spiking Neural Networks (SNNs) have recently been applied to this task using Deep Reinforcement Learning (DRL), but face challenges in dynamic and partially observable environments, as well as inaccuracies in transmitted data. To overcome these issues, we propose a Multi-Critic DDPG with Spiking Memory (MC-DDPGSM) framework. Our approach introduces a spiking Gated Recurrent Unit layer (Spiking-GRU) to provide a memory function and evaluates the state-action value with multi-critic networks. The experimental results demonstrate that our method achieves better performance (success rate, navigation distance, navigation time spent, and power consumption) in complex navigation tasks compared to state-of-the-art approaches. Furthermore, our model can be transferred to unseen environments without the need for fine-tuning.

IJCAI Conference 2023 Conference Paper

STS-GAN: Can We Synthesize Solid Texture with High Fidelity from Arbitrary 2D Exemplar?

  • Xin Zhao
  • Jifeng Guo
  • Lin Wang
  • Fanqi Li
  • Jiahao Li
  • Junteng Zheng
  • Bo Yang

Solid texture synthesis (STS), an effective way to extend a 2D exemplar to a 3D solid volume, exhibits advantages in computational photography. However, existing methods generally fail to accurately learn arbitrary textures, which may result in the failure to synthesize solid textures with high fidelity. In this paper, we propose a novel generative adversarial nets-based framework (STS-GAN) to extend the given 2D exemplar to arbitrary 3D solid textures. In STS-GAN, multi-scale 2D texture discriminators evaluate the similarity between the given 2D exemplar and slices from the generated 3D texture, driving the 3D texture generator to synthesize realistic solid textures. Finally, experiments demonstrate that the proposed method can generate high-fidelity solid textures with similar visual characteristics to the 2D exemplar.

JBHI Journal 2022 Journal Article

An Expectation Maximization Based Adaptive Group Testing Method for Improving Efficiency and Sensitivity of Large-Scale Screening of COVID-19

  • Xiaofang Xia
  • Yang Liu
  • Bo Yang
  • Yingfan Liu
  • Jiangtao Cui
  • Yinlong Zhang

The pathogen of the ongoing coronavirus disease 2019 (COVID-19) pandemic is a newly discovered virus called severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). Testing individuals for SARS-CoV-2 plays a critical role in containing COVID-19. To save medical personnel and consumables, many countries are implementing group testing against SARS-CoV-2. However, existing group testing methods have the following limitations: (1) The group size is determined without theoretical analysis, and hence is usually not optimal. This adversely impacts the screening efficiency. (2) These methods neglect the fact that mixing samples together usually leads to substantial dilution of the SARS-CoV-2 virus, which seriously impacts the sensitivity of tests. In this paper, we aim to screen individuals infected with COVID-19 with as few tests as possible, under the premise that the sensitivity of tests remains high enough. We propose an eXpectation Maximization based Adaptive Group Testing (XMAGT) method. The basic idea is to adaptively switch between a group testing strategy and an individual testing strategy such that the expected number of samples identified by a single test is larger. During the screening process, the XMAGT method estimates the ratio of positive samples. With this ratio, the XMAGT method can determine a group size under which the group testing strategy achieves a maximal expected number of negative samples while the sensitivity of tests stays above a user-specified threshold. Experimental results show that the XMAGT method outperforms existing methods in terms of both efficiency and sensitivity.
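The efficiency-versus-group-size trade-off that XMAGT optimizes can be illustrated with classic Dorfman pooling: for positive ratio p and group size n, the expected number of tests per sample is 1/n + 1 - (1-p)^n. The sketch below is a simplified illustration, not XMAGT itself (which additionally estimates p during screening and enforces a sensitivity threshold):

```python
def expected_tests_per_sample(p, n):
    """Dorfman pooling: one pooled test shared by n samples, plus n
    individual retests whenever the pool is positive (prob. 1 - (1-p)**n)."""
    return 1.0 / n + 1.0 - (1.0 - p) ** n

def best_group_size(p, max_n=64):
    """Group size minimizing the expected number of tests per sample."""
    return min(range(2, max_n + 1), key=lambda n: expected_tests_per_sample(p, n))

n = best_group_size(0.01)                  # at a 1% positive ratio, n == 11
rate = expected_tests_per_sample(0.01, n)  # about 0.196 tests per sample
```

At low prevalence this already identifies most negative samples with roughly a fifth of the tests that individual testing would need, which is the gain adaptive group testing tries to preserve without sacrificing sensitivity.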

NeurIPS Conference 2022 Conference Paper

HSDF: Hybrid Sign and Distance Field for Modeling Surfaces with Arbitrary Topologies

  • Li Wang
  • Jie Yang
  • Weikai Chen
  • Xiaoxu Meng
  • Bo Yang
  • Jintao Li
  • Lin Gao

Neural implicit function based on signed distance field (SDF) has achieved impressive progress in reconstructing 3D models with high fidelity. However, such approaches can only represent closed shapes. Recent works based on the unsigned distance function (UDF) have been proposed to handle both watertight and open surfaces. Nonetheless, as UDF is signless, its direct output is limited to point clouds, which imposes an additional challenge on extracting high-quality meshes from discrete points. To address this issue, we present a new learnable implicit representation, coded HSDF, that connects the good ends of SDF and UDF. In particular, HSDF is able to represent arbitrary topologies containing both closed and open surfaces while being compatible with existing iso-surface extraction techniques for easy field-to-mesh conversion. In addition to predicting a UDF, we propose to learn an additional sign field via a simple classifier. Unlike traditional SDF, HSDF is able to locate the surface of interest before level surface extraction by generating surface points following NDF (Chibane et al., 2020). We are then able to obtain open surfaces via an adaptive meshing approach that only instantiates regions containing surface into a polygon mesh. We also propose HSDF-Net, a dedicated learning framework that factorizes the learning of HSDF into two easier problems. Experiments on multiple datasets show that HSDF outperforms state-of-the-art techniques both qualitatively and quantitatively.

NeurIPS Conference 2022 Conference Paper

OGC: Unsupervised 3D Object Segmentation from Rigid Dynamics of Point Clouds

  • Ziyang Song
  • Bo Yang

In this paper, we study the problem of 3D object segmentation from raw point clouds. Unlike all existing methods which usually require a large amount of human annotations for full supervision, we propose the first unsupervised method, called OGC, to simultaneously identify multiple 3D objects in a single forward pass, without needing any type of human annotations. The key to our approach is to fully leverage the dynamic motion patterns over sequential point clouds as supervision signals to automatically discover rigid objects. Our method consists of three major components, 1) the object segmentation network to directly estimate multi-object masks from a single point cloud frame, 2) the auxiliary self-supervised scene flow estimator, and 3) our core object geometry consistency component. By carefully designing a series of loss functions, we effectively take into account the multi-object rigid consistency and the object shape invariance in both temporal and spatial scales. This allows our method to truly discover the object geometry even in the absence of annotations. We extensively evaluate our method on five datasets, demonstrating the superior performance for object part instance segmentation and general object segmentation in both indoor and the challenging outdoor scenarios.

NeurIPS Conference 2022 Conference Paper

Promising or Elusive? Unsupervised Object Segmentation from Real-world Single Images

  • Yafei Yang
  • Bo Yang

In this paper, we study the problem of unsupervised object segmentation from single images. We do not introduce a new algorithm, but systematically investigate the effectiveness of existing unsupervised models on challenging real-world images. We firstly introduce four complexity factors to quantitatively measure the distributions of object- and scene-level biases in appearance and geometry for datasets with human annotations. With the aid of these factors, we empirically find that, not surprisingly, existing unsupervised models catastrophically fail to segment generic objects in real-world images, although they can easily achieve excellent performance on numerous simple synthetic datasets, due to the vast gap in objectness biases between synthetic and real images. By conducting extensive experiments on multiple groups of ablated real-world datasets, we ultimately find that the key factors underlying the colossal failure of existing unsupervised models on real-world images are the challenging distributions of object- and scene-level biases in appearance and geometry. Because of this, the inductive biases introduced in existing unsupervised models can hardly capture the diverse object distributions. Our research results suggest that future work should exploit more explicit objectness biases in the network design.

AAAI Conference 2022 Conference Paper

Towards Robust Off-Policy Learning for Runtime Uncertainty

  • Da Xu
  • Yuting Ye
  • Chuanwei Ruan
  • Bo Yang

Off-policy learning plays a pivotal role in optimizing and evaluating policies prior to online deployment. However, during real-time serving, we observe a variety of interventions and constraints that cause inconsistency between the online and offline settings, which we summarize and term runtime uncertainty. Such uncertainty cannot be learned from the logged data due to its abnormal and rare nature. To assert a certain level of robustness, we perturb the off-policy estimators along an adversarial direction in view of the runtime uncertainty. This allows the resulting estimators to be robust not only to observed but also to unexpected runtime uncertainties. Leveraging this idea, we bring runtime-uncertainty robustness to three major off-policy learning methods: the inverse propensity score method, the reward-model method, and the doubly robust method. We theoretically justify the robustness of our methods to runtime uncertainty, and demonstrate their effectiveness using both simulations and real-world online experiments.
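Two of the three estimators named above (inverse propensity score and doubly robust, the latter built on a reward model) are compact enough to sketch. The toy logged data, target policy, and reward model below are hypothetical, and no adversarial robustification is shown:

```python
def ips(logs, policy):
    """Inverse propensity score estimate of the target policy's value.
    Each log entry is (context, logged_action, logging_propensity, reward)."""
    return sum((a == policy(x)) / p * r for x, a, p, r in logs) / len(logs)

def doubly_robust(logs, policy, q_hat):
    """DR estimate: reward-model baseline q_hat plus an IPS correction term."""
    total = 0.0
    for x, a, p, r in logs:
        total += q_hat(x, policy(x)) + (a == policy(x)) / p * (r - q_hat(x, a))
    return total / len(logs)

logs = [(0, 1, 0.5, 1.0), (1, 0, 0.5, 0.0), (0, 1, 0.5, 1.0)]
policy = lambda x: 1           # hypothetical target policy: always take action 1
q_hat = lambda x, a: 0.5 * a   # hypothetical reward model

v_ips = ips(logs, policy)                  # 4/3
v_dr = doubly_robust(logs, policy, q_hat)  # 3.5/3
```

The DR estimator falls back on the reward model wherever the logged action disagrees with the target policy, which is the kind of baseline the paper then perturbs adversarially for robustness.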

AAAI Conference 2021 Conference Paper

Cost-aware Graph Generation: A Deep Bayesian Optimization Approach

  • Jiaxu Cui
  • Bo Yang
  • Bingyi Sun
  • Jiming Liu

Graph-structured data is ubiquitous throughout the natural and social sciences, ranging from complex drug molecules to artificial neural networks. Evaluating their functional properties, e.g., drug effectiveness and prediction accuracy, is usually costly in terms of time, money, energy, or environment, becoming a bottleneck for the graph generation task. In this work, from the perspective of saving cost, we propose a novel Cost-Aware Graph Generation (CAGG) framework to generate graphs with optimal properties at as low a cost as possible. By introducing a robust Bayesian graph neural network as the surrogate model and a goal-oriented training scheme for the generation model, CAGG can approximate the real expensive evaluation function and generate a search space close to the optimal property, to avoid unnecessary evaluations. Intensive experiments conducted on two challenging real-world applications, including molecular discovery and neural architecture search, demonstrate its effectiveness and applicability. The results show that it can generate the optimal graphs and reduce evaluation costs significantly compared to the state-of-the-art.

AAMAS Conference 2021 Conference Paper

Drone Formation Control via Belief-Correlated Imitation Learning

  • Bo Yang
  • Chaofan Ma
  • Xiaofang Xia

The proliferation of unmanned aerial vehicles (UAVs) has given rise to various intelligent services, in which effective coordination plays a significant role in enhancing swarm execution efficiency. However, due to unreliable communication in the air as well as heterogeneity in operation modes, it is challenging to achieve highly coordinated actions, particularly in a fully distributed environment with incomplete observations. In this paper, we leverage the generative adversarial imitation learning (GAIL) technique to coordinate the drones' actions by directly imitating peers' demonstrations. In order to characterize the true environment state under local incomplete observations, we transform historical observation-action trajectories into belief representations, which are trained in conjunction with the imitation policies. We also obtain regularized belief representations by correlating the prediction of future states, the trace of historical contexts, and the action-assisted guidance information, which contribute to more accurate imitation policies. We evaluate the proposed algorithm on a drone formation control scenario. Evaluation results show its advantages in imitation accuracy, teamwork execution time, and energy cost.

AAAI Conference 2021 Conference Paper

Graph and Temporal Convolutional Networks for 3D Multi-person Pose Estimation in Monocular Videos

  • Yu Cheng
  • Bo Wang
  • Bo Yang
  • Robby T. Tan

Despite the recent progress, 3D multi-person pose estimation from monocular videos is still challenging due to the commonly encountered problem of missing information caused by occlusion, partially out-of-frame target persons, and inaccurate person detection. To tackle this problem, we propose a novel framework integrating graph convolutional networks (GCNs) and temporal convolutional networks (TCNs) to robustly estimate camera-centric multi-person 3D poses that does not require camera parameters. In particular, we introduce a human-joint GCN, which unlike the existing GCN, is based on a directed graph that employs the 2D pose estimator's confidence scores to improve the pose estimation results. We also introduce a human-bone GCN, which models the bone connections and provides more information beyond human joints. The two GCNs work together to estimate the spatial frame-wise 3D poses, and can make use of both visible joint and bone information in the target frame to estimate the occluded or missing human-part information. To further refine the 3D pose estimation, we use our temporal convolutional networks (TCNs) to enforce the temporal and human-dynamics constraints. We use a joint-TCN to estimate person-centric 3D poses across frames, and propose a velocity-TCN to estimate the speed of 3D joints to ensure the consistency of the 3D pose estimation in consecutive frames. Finally, to estimate the 3D human poses for multiple persons, we propose a root-TCN that estimates camera-centric 3D poses without requiring camera parameters. Quantitative and qualitative evaluations demonstrate the effectiveness of the proposed method. Our code and models are available at https://github.com/3dpose/GnTCN.

NeurIPS Conference 2021 Conference Paper

Learning Interpretable Decision Rule Sets: A Submodular Optimization Approach

  • Fan Yang
  • Kai He
  • Linxiao Yang
  • Hongxia Du
  • Jingbang Yang
  • Bo Yang
  • Liang Sun

Rule sets are highly interpretable logical models in which the predicates for decision are expressed in disjunctive normal form (DNF, OR-of-ANDs), or, equivalently, the overall model comprises an unordered collection of if-then decision rules. In this paper, we consider a submodular optimization based approach for learning rule sets. The learning problem is framed as a subset selection task in which a subset of all possible rules needs to be selected to form an accurate and interpretable rule set. We employ an objective function that exhibits submodularity and thus is amenable to submodular optimization techniques. To overcome the difficulty arising from the exponential-sized ground set of rules, the subproblem of searching for a rule is cast as another subset selection task that asks for a subset of features. We show it is possible to write the induced objective function for the subproblem as a difference of two submodular (DS) functions, making it approximately solvable by DS optimization algorithms. Overall, the proposed approach is simple, scalable, and likely to benefit from further research on submodular optimization. Experiments on real datasets demonstrate the effectiveness of our method.
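As a concrete reading of "DNF, OR-of-ANDs": the model predicts positive exactly when at least one rule fires, and a rule fires when all of its predicates hold. A minimal sketch (the predicates and feature names are invented for illustration):

```python
def rule_set_predict(x, rules):
    """DNF rule set: positive iff any rule fires; a rule fires iff all of
    its predicates hold on the example x."""
    return any(all(pred(x) for pred in rule) for rule in rules)

# Two hypothetical if-then rules over a feature dictionary:
rules = [
    [lambda x: x["age"] > 60, lambda x: x["bp"] == "high"],  # rule 1 (AND)
    [lambda x: x["smoker"]],                                 # rule 2
]
hit = rule_set_predict({"age": 70, "bp": "high", "smoker": False}, rules)  # True
miss = rule_set_predict({"age": 40, "bp": "low", "smoker": False}, rules)  # False
```

The subset selection task in the paper is choosing which such rules (and, within each rule, which feature predicates) to include so the set stays both accurate and small.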

JBHI Journal 2021 Journal Article

Neuron Image Segmentation via Learning Deep Features and Enhancing Weak Neuronal Structures

  • Bo Yang
  • Weixun Chen
  • Huiqiong Luo
  • Yinghui Tan
  • Min Liu
  • Yaonan Wang

Neuron morphology reconstruction (tracing) in 3D volumetric images is critical for neuronal research. However, most existing neuron tracing methods are not applicable in challenging datasets where the neuron images are contaminated by noise or contain weak filament signals. In this paper, we present a two-stage 3D neuron segmentation approach via learning deep features and enhancing weak neuronal structures, to reduce the impact of image noise in the data and enhance the weak-signal neuronal structures. In the first stage, we train a voxel-wise multi-level fully convolutional network (FCN), which specializes in learning deep features, to obtain the 3D neuron image segmentation maps in an end-to-end manner. In the second stage, a ray-shooting model is employed to detect the discontinued segments in the segmentation results of the first stage; the local neuron diameter at the broken point is estimated and the direction of the filamentary fragment is detected by a rayburst sampling algorithm. Then, a Hessian-repair model is built to repair the broken structures by enhancing weak neuronal structures in a fibrous structure determined by the estimated local neuron diameter and the filamentary fragment direction. Experimental results demonstrate that our proposed segmentation approach achieves better segmentation performance than other state-of-the-art methods for 3D neuron segmentation. Compared with the neuron reconstruction results on the segmented images produced by other segmentation methods, the proposed approach gains 47.83% and 34.83% improvement in the average distance scores. The average Precision and Recall rates of the branch point detection with our proposed method are 38.74% and 22.53% higher than the detection results without segmentation.

NeurIPS Conference 2021 Conference Paper

OctField: Hierarchical Implicit Functions for 3D Modeling

  • Jia-Heng Tang
  • Weikai Chen
  • Jie Yang
  • Bo Wang
  • Songrun Liu
  • Bo Yang
  • Lin Gao

Recent advances in localized implicit functions have enabled neural implicit representation to be scalable to large scenes. However, the regular subdivision of 3D space employed by these approaches fails to take into account the sparsity of the surface occupancy and the varying granularities of geometric details. As a result, its memory footprint grows cubically with the input volume, leading to a prohibitive computational cost even at a moderately dense decomposition. In this work, we present a learnable hierarchical implicit representation for 3D surfaces, coded OctField, that allows high-precision encoding of intricate surfaces with low memory and computational budget. The key to our approach is an adaptive decomposition of 3D scenes that only distributes local implicit functions around the surface of interest. We achieve this goal by introducing a hierarchical octree structure to adaptively subdivide the 3D space according to the surface occupancy and the richness of part geometry. As octree is discrete and non-differentiable, we further propose a novel hierarchical network that models the subdivision of octree cells as a probabilistic process and recursively encodes and decodes both octree structure and surface geometry in a differentiable manner. We demonstrate the value of OctField for a range of shape modeling and reconstruction tasks, showing superiority over alternative approaches.

AAAI Conference 2020 Conference Paper

3D Human Pose Estimation Using Spatio-Temporal Networks with Explicit Occlusion Training

  • Yu Cheng
  • Bo Yang
  • Bo Wang
  • Robby T. Tan

Estimating 3D poses from a monocular video is still a challenging task, despite the significant progress that has been made in recent years. Generally, the performance of existing methods drops when the target person is too small/large, or the motion is too fast/slow relative to the scale and speed of the training data. Moreover, to our knowledge, many of these methods are not designed or trained under severe occlusion explicitly, making their performance on handling occlusion compromised. Addressing these problems, we introduce a spatio-temporal network for robust 3D human pose estimation. As humans in videos may appear in different scales and have various motion speeds, we apply multi-scale spatial features for 2D joints or keypoints prediction in each individual frame, and multi-stride temporal convolutional networks (TCNs) to estimate 3D joints or keypoints. Furthermore, we design a spatio-temporal discriminator based on body structures as well as limb motions to assess whether the predicted pose forms a valid pose and a valid movement. During training, we explicitly mask out some keypoints to simulate various occlusion cases, from minor to severe occlusion, so that our network can learn better and become robust to various degrees of occlusion. As there are limited 3D ground truth data, we further utilize 2D video data to inject a semi-supervised learning capability into our network. Experiments on public data sets validate the effectiveness of our method, and our ablation studies show the strengths of our network's individual submodules.

NeurIPS Conference 2020 Conference Paper

Curvature Regularization to Prevent Distortion in Graph Embedding

  • Hongbin Pei
  • Bingzhe Wei
  • Kevin Chang
  • Chunxu Zhang
  • Bo Yang

Recent research on graph embedding has achieved success in various applications. Most graph embedding methods preserve the proximity in a graph into a manifold in an embedding space. We identify an important but neglected problem with this proximity-preserving strategy: graph topology patterns, while preserved well into an embedding manifold, may be distorted in the ambient embedding Euclidean space, making them difficult for machine learning models to detect. To address this problem, we propose curvature regularization, which enforces flatness of embedding manifolds, thereby preventing the distortion. We present a novel angle-based sectional curvature, termed ABS curvature, and accordingly three kinds of curvature regularization to induce flat embedding manifolds during graph embedding. We integrate curvature regularization into five popular proximity-preserving embedding methods, and empirical results in two applications show significant improvements on a wide range of open graph datasets.

NeurIPS Conference 2020 Conference Paper

Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems

  • Songtao Lu
  • Meisam Razaviyayn
  • Bo Yang
  • Kejun Huang
  • Mingyi Hong

This paper proposes two efficient algorithms for computing approximate second-order stationary points (SOSPs) of problems with generic smooth non-convex objective functions and generic linear constraints. While finding (approximate) SOSPs for the class of smooth non-convex linearly constrained problems is computationally intractable, we show that generic problem instances in this class can be solved efficiently. Specifically, for a generic problem instance, we show that a certain strict complementarity (SC) condition holds for all Karush-Kuhn-Tucker (KKT) solutions. Based on this condition, we design an algorithm named Successive Negative-curvature grAdient Projection (SNAP), which performs either conventional gradient projection or negative-curvature-based projection steps to find SOSPs. SNAP is a second-order algorithm that requires $\widetilde{\mathcal{O}}(\max\{1/\epsilon^2_G, 1/\epsilon^3_H\})$ iterations to compute an $(\epsilon_G, \epsilon_H)$-SOSP, where $\widetilde{\mathcal{O}}$ hides the iteration complexity for eigenvalue decomposition. Building on SNAP, we propose a first-order algorithm, named SNAP$^+$, that requires $\mathcal{O}(1/\epsilon^{2.5})$ iterations to compute an $(\epsilon, \sqrt{\epsilon})$-SOSP. The per-iteration computational complexities of our algorithms are polynomial in the number of constraints and the problem dimension. To the best of our knowledge, this is the first time that first-order algorithms with polynomial per-iteration complexity and a global sublinear rate have been designed to find SOSPs of this important class of non-convex problems with linear constraints (almost surely).

AAMAS Conference 2019 Conference Paper

Attack-Resilient Connectivity Game for UAV Networks using Generative Adversarial Learning

  • Bo Yang
  • Min Liu

Continuous link connectivity is critical for the efficient collaboration of multiple unmanned aerial vehicles (UAVs). However, UAV communication environments are not only harsh but also confronted with the threats of smart attackers, which pose great barriers to keeping the links unblocked. In this paper, we leverage the paradigm of the Generative Adversarial Network (GAN) to formulate an attack-resilient connectivity game between a pair of neighboring UAVs and an attacker. In this three-agent adversarial game, the attacker acts as the generator, which attempts to generate information closely approximating that of the UAVs so as to maximize its jamming capability, while the pairwise UAVs act as the discriminators, which attempt to enhance their capability of rejecting the fake information (i.e., the opponent's attack). As state-of-the-art GAN learning algorithms suffer from an instability dilemma (i.e., either failing to converge or achieving low generation/discrimination performance), we incorporate the conditional GAN with the least-squares objective loss function as well as the mean squared error, so that the attacker can improve its detection capability from the UAVs' historical activity patterns and the UAVs can adjust their connectivity strategy accordingly. We validate the effectiveness of the proposed algorithm through extensive evaluations. Results demonstrate that the proposed algorithm can improve convergence efficiency, reduce connection latency, and significantly enhance attack resilience.

AAAI Conference 2019 Conference Paper

Deep Bayesian Optimization on Attributed Graphs

  • Jiaxu Cui
  • Bo Yang
  • Xia Hu

Attributed graphs, which contain rich contextual features beyond network structure, are ubiquitous and have been observed to benefit various network analytics applications. Graph structure optimization, which aims to find optimal graphs in terms of specific measures, has become an effective computational tool in complex network analysis. However, traditional model-free methods suffer from the expensive computational cost of evaluating graphs, and existing vectorial Bayesian optimization methods cannot be directly applied to attributed graphs and have scalability issues due to their use of Gaussian processes (GPs). To bridge the gap, in this paper we propose a novel scalable Deep Graph Bayesian Optimization (DGBO) method on attributed graphs. The proposed DGBO avoids the cubic complexity of GPs by adopting a deep graph neural network to surrogate black-box functions, and can scale linearly with the number of observations. Intensive experiments are conducted on both artificial and real-world problems, including molecular discovery and urban road network design, and demonstrate the effectiveness of DGBO compared with the state-of-the-art.

NeurIPS Conference 2019 Conference Paper

Learning Object Bounding Boxes for 3D Instance Segmentation on Point Clouds

  • Bo Yang
  • Jianan Wang
  • Ronald Clark
  • Qingyong Hu
  • Sen Wang
  • Andrew Markham
  • Niki Trigoni

We propose a novel, conceptually simple and general framework for instance segmentation on 3D point clouds. Our method, called 3D-BoNet, follows the simple design philosophy of per-point multilayer perceptrons (MLPs). The framework directly regresses 3D bounding boxes for all instances in a point cloud, while simultaneously predicting a point-level mask for each instance. It consists of a backbone network followed by two parallel network branches for 1) bounding box regression and 2) point mask prediction. 3D-BoNet is single-stage, anchor-free and end-to-end trainable. Moreover, it is remarkably computationally efficient as, unlike existing approaches, it does not require any post-processing steps such as non-maximum suppression, feature sampling, clustering or voting. Extensive experiments show that our approach surpasses existing work on both ScanNet and S3DIS datasets while being approximately 10x more computationally efficient. Comprehensive ablation studies demonstrate the effectiveness of our design.

IS Journal 2019 Journal Article

Parallel Cyber-Physical-Social Systems Based Smart Energy Robotic Dispatcher and Knowledge Automation: Concepts, Architectures, and Challenges

  • Lefeng Cheng
  • Tao Yu
  • Xiaoshun Zhang
  • Bo Yang

We propose a novel concept of a robot for energy control based on parallel cyber-physical-social systems for the next generation of energy and electric power systems. We thoroughly investigate its knowledge automation technologies and discuss its main challenges in preparation for the promising evolution from Energy 4.0 to Energy 5.0 in China.

IJCAI Conference 2019 Conference Paper

Topology Optimization based Graph Convolutional Network

  • Liang Yang
  • Zesheng Kang
  • Xiaochun Cao
  • Di Jin
  • Bo Yang
  • Yuanfang Guo

In the past few years, semi-supervised node classification in attributed networks has developed rapidly. Inspired by the success of deep learning, researchers have adopted the convolutional neural network to develop Graph Convolutional Networks (GCNs), achieving surprising classification accuracy by considering the topological information and employing a fully connected network (FCN). However, the given network topology may also induce performance degradation if it is employed directly in classification, because it may possess high sparsity and certain noise. Besides, the lack of learnable filters in GCN also limits performance. In this paper, we propose a novel Topology Optimization based Graph Convolutional Network (TO-GCN) to fully utilize the potential information by jointly refining the network topology and learning the parameters of the FCN. According to our derivations, TO-GCN is more flexible than GCN, in which the filters are fixed and only the classifier can be updated during the learning process. Extensive experiments on real attributed networks demonstrate the superiority of the proposed TO-GCN over state-of-the-art approaches.

IJCAI Conference 2018 Conference Paper

3D-PhysNet: Learning the Intuitive Physics of Non-Rigid Object Deformations

  • Zhihua Wang
  • Stefano Rosa
  • Bo Yang
  • Sen Wang
  • Niki Trigoni
  • Andrew Markham

The ability to interact with and understand the environment is a fundamental prerequisite for a wide range of applications, from robotics to augmented reality. In particular, predicting how deformable objects will react to applied forces in real time is a significant challenge. This is further confounded by the fact that shape information about objects encountered in the real world is often impaired by occlusions, noise and missing regions; e.g., a robot manipulating an object will only be able to observe a partial view of the entire solid. In this work we present a framework, 3D-PhysNet, which is able to predict how a three-dimensional solid will deform under an applied force using intuitive physics modelling. In particular, we propose a new method to encode the physical properties of the material and the applied force, enabling generalisation over materials. The key is to combine deep variational autoencoders with adversarial training, conditioned on the applied force and the material properties. We further propose a cascaded architecture that takes a single 2.5D depth view of the object and predicts its deformation. Training data is provided by a physics simulator. The network is fast enough to be used in real-time applications from partial views. Experimental results show the viability and the generalisation properties of the proposed architecture.

AAAI Conference 2018 Conference Paper

Group Sparse Bayesian Learning for Active Surveillance on Epidemic Dynamics

  • Hongbin Pei
  • Bo Yang
  • Jiming Liu
  • Lei Dong

Predicting epidemic dynamics is of great value in understanding and controlling diffusion processes, such as infectious disease spread and information propagation. This task is intractable, especially when surveillance resources are very limited. To address the challenge, we study the problem of active surveillance, i.e., how to identify a small portion of system components as sentinels for monitoring, such that the epidemic dynamics of an entire system can be readily predicted from the partial data collected by such sentinels. We propose a novel measure, the γ value, to identify the sentinels by modeling a sentinel network with a row-sparsity structure. We design a flexible group sparse Bayesian learning algorithm to mine the sentinel network, suitable for handling both linear and non-linear dynamical systems, by using the expectation maximization method and variational approximation. The efficacy of the proposed algorithm is theoretically analyzed and empirically validated using both synthetic and real-world data.

IJCAI Conference 2018 Conference Paper

Keeping in Touch with Collaborative UAVs: A Deep Reinforcement Learning Approach

  • Bo Yang
  • Min Liu

Effective collaborations among autonomous unmanned aerial vehicles (UAVs) rely on timely information sharing. However, the time-varying flight environment and the intermittent link connectivity pose great challenges to message delivery. In this paper, we leverage the deep reinforcement learning (DRL) technique to address the UAVs' optimal links discovery and selection problem in uncertain environments. As the multi-agent learning efficiency is constrained by the high-dimensional and continuous action spaces, we slice the whole action spaces into a number of tractable fractions to achieve efficient convergences of optimal policies in continuous domains. Moreover, for the nonstationarity issue that particularly challenges the multi-agent DRL with local perceptions, we present a multi-agent mutual sampling method that jointly interacts the intra-agent and inter-agent state-action information to stabilize and expedite the training procedure. We evaluate the proposed algorithm on the UAVs' continuous network connection task. Results show that the associated UAVs can quickly select the optimal connected links, which facilitate the UAVs' teamwork significantly.

AAAI Conference 2015 Conference Paper

Bayesian Approach to Modeling and Detecting Communities in Signed Network

  • Bo Yang
  • Xuehua Zhao
  • Xueyan Liu

There has been an increasing interest in exploring signed networks with positive and negative links in that they contain more information than unsigned networks. As fundamental problems of signed network analysis, community detection and sign (or attitude) prediction are still primary challenges. To address them, we propose a generative Bayesian approach, in which 1) a signed stochastic blockmodel is proposed to characterize the community structure in context of signed networks, by means of explicitly formulating the distributions of both density and frustration of signed links from a stochastic perspective, and 2) a model learning algorithm is proposed by theoretically deriving a variational Bayes EM for parameter estimation and a variation based approximate evidence for model selection. Through the comparisons with state-of-the-art methods on synthetic and real-world networks, the proposed approach shows its superiority in both community detection and sign prediction for exploratory networks.

AAAI Conference 2015 Conference Paper

On the Scalable Learning of Stochastic Blockmodel

  • Bo Yang
  • Xuehua Zhao

Stochastic blockmodel (SBM) enables us to decompose and analyze an exploratory network without a priori knowledge about its intrinsic structure. However, the task of effectively and efficiently learning a SBM from a large-scale network is still challenging due to the high computational cost of its model selection and parameter estimation. To address this issue, we present a novel SBM learning algorithm referred to as BLOS (BLOckwise Sbm learning). Distinct from the literature, the model selection and parameter estimation of SBM are concurrently, rather than alternately, executed in BLOS by embedding the minimum message length criterion into a block-wise EM algorithm, which greatly reduces the time complexity of SBM learning without losing learning accuracy and modeling flexibility. Its effectiveness and efficiency have been tested through rigorous comparisons with the state-of-the-art methods on both synthetic and real-world networks.

AAAI Conference 2014 Conference Paper

Modeling and Mining Spatiotemporal Patterns of Infection Risk from Heterogeneous Data for Active Surveillance Planning

  • Bo Yang
  • Hua Guo
  • Yi Yang
  • Benyun Shi
  • Xiaonong Zhou
  • Jiming Liu

Active surveillance is a desirable way to prevent the spread of infectious diseases in that it aims to timely discover individual incidences through an active searching for patients. However, in practice active surveillance is difficult to implement especially when monitoring space is large but available resources are limited. Therefore, it is extremely important for public health authorities to know how to distribute their very sparse resources to high-priority regions so as to maximize the outcomes of active surveillance. In this paper, we raise the problem of active surveillance planning and provide an effective method to address it via modeling and mining spatiotemporal patterns of infection risks from heterogeneous data sources. Taking malaria as an example, we perform an empirical study on real-world data to validate our method and provide our new findings.

IROS Conference 2013 Conference Paper

Path-generating regulator along a straight passage for two-wheeled mobile robots

  • Bo Yang
  • Naohiko Hanajima
  • Atsushi Yamamoto
  • Mototada Ayamura
  • Jun Dai

In this paper, the path-generating regulator is extended to the tracking problem along a straight passage for two-wheeled mobile robots. As most mobile robots are subject to nonholonomic constraints, it is difficult to make them converge to the target state with a single control law. Many methods have been proposed to solve this problem. One of them is the Path-Generating Regulator (PGR), which designs a nonlinear regulator that achieves asymptotic convergence to a given trajectory family. However, the original method is not well suited for passages. In this paper, we present the extended PGR for the tracking problem along a straight passage. Numerical simulations and experiments are also performed to show the effectiveness of this method.

IJCAI Conference 2013 Conference Paper

Social Collaborative Filtering by Trust

  • Bo Yang
  • Yu Lei
  • Dayou Liu
  • Jiming Liu

Accurately and proactively providing users with information or services of potential interest is the main task of a recommender system. Collaborative filtering is one of the most widely adopted recommender algorithms, but it suffers from the issues of data sparsity and cold start, which severely degrade the quality of recommendations. To address these issues, this article proposes a novel method that improves the performance of collaborative filtering recommendation by carefully integrating two kinds of sparse information: the conventional rating data given by users and the social trust network among the same users. It is a model-based method that adopts the matrix factorization technique to map users into low-dimensional latent feature spaces in terms of their trust relationships, aiming to reflect users' reciprocal influence on their own opinions more reasonably. Validations against a real-world dataset show that the proposed method performs much better than state-of-the-art recommendation algorithms for social collaborative filtering by trust.

ICRA Conference 2011 Conference Paper

Design and implementation of a pneumatically-actuated robot for breast biopsy under continuous MRI

  • Bo Yang
  • U-Xuan Tan
  • Alan B. McMillan
  • Rao P. Gullapalli
  • Jaydev P. Desai

Magnetic Resonance Imaging (MRI) is superior to other imaging modalities such as Ultrasound and Computed Tomography and is used for both diagnostic and therapeutic procedures. However, current breast biopsy procedures based on MR images obtained a priori use a blind targeting approach, which can be long and painful. Current approaches, due to possible patient motion, can lead to tool-tip positioning errors, thereby affecting diagnostic accuracy and causing significant patient discomfort if repeated procedures are required. Hence, it is desirable to develop an MRI-compatible robot for breast biopsy procedures that operates without removing the patient from the MRI bore. This approach could potentially avoid multiple biopsy needle insertions and minimize sampling errors. Due to the working principle of MRI, material, actuation, and sensing techniques are limited, as the MR images must not be affected significantly during the procedure. In addition, the limited space of the MRI bore requires the robot to be compact. This paper presents a four-degree-of-freedom robot with a compact parallel mechanism, of which three degrees of freedom are pneumatically actuated while the needle driver mechanism is actuated by a piezo motor. A fiber-optic force sensor is also designed, developed, and mounted on the top mobile platform of the parallel mechanism to sense the needle-tissue interaction forces. Position control of the pneumatic cylinders is implemented using PI control with a modified integration term to achieve slow and smooth motion.

ICRA Conference 2010 Conference Paper

Design and development of a 3-axis MRI-compatible force sensor

  • U-Xuan Tan
  • Bo Yang
  • Rao P. Gullapalli
  • Jaydev P. Desai

Magnetic resonance imaging (MRI) has been gaining popularity over standard imaging modalities like ultrasound and CT because of its ability to provide excellent soft-tissue contrast. However, due to the working principle of MRI, a number of conventional force sensors are not compatible. One popular solution is to develop a fiber-optic force sensor. However, the measurements along the principal axes of a number of these force sensors are highly cross-coupled. One of the objectives of this paper is to minimize this coupling effect. In addition, this paper describes the design of an elastic frame structure that is obtained systematically by an algorithm and not purely based on design intuition. We used a topology optimization technique, which has two major advantages: 1) it aids engineers in design when given a constrained boundary, and 2) it optimizes the displacement amplification, which in turn increases stiffness and bandwidth and improves sensing resolution. To ensure that the frames are linked from the input to the output, a solution for topology optimization is proposed. The sensor is then fabricated using a plastic material (ABS), as it is one of the ideal materials for the MRI environment. However, the hysteresis effect seen in the displacement-load graph of plastic materials is known to affect accuracy. Hence, this paper also proposes modeling and addressing this hysteretic effect using Prandtl-Ishlinskii play operators. Finally, experiments are conducted to evaluate the sensor's performance, as well as its compatibility in MRI under continuous imaging.

JAAMAS Journal 2009 Journal Article

An autonomy-oriented computing approach to community mining in distributed and dynamic networks

  • Bo Yang
  • Jiming Liu
  • Dayou Liu

A network community refers to a special type of network structure that contains a group of nodes connected based on certain relationships or similar properties. Our ability to mine communities hidden inside networks will readily enable us to effectively understand and exploit such networks. So far, various methods and algorithms have been developed to perform the task of community mining, where it is often required that the networks be processed in a centralized manner and that their structures not change dynamically. However, in the real world, many applications involve distributed and dynamically evolving networks, in which resources and controls are not only decentralized but also updated frequently. It would be difficult for the existing methods to deal with these types of networks, since their global topological representations are either not available or too hard to obtain due to their huge size, decentralization, and/or dynamic updates. The aim of our work is to address the problem of mining communities from a distributed and dynamic network. It differs from previous work in that here we introduce the notion of self-organizing agent networks, and provide an autonomy-oriented computing (AOC) approach to distributed and incremental mining of network communities. The AOC-based method utilizes reactive agents that can collectively detect and update community structures in a distributed and dynamically evolving network, based only on their local views and interactions. While providing detailed formulations, we present the results of our systematic validations using real-world benchmark networks as well as synthetic networks, including a distributed intelligent Portable Digital Assistant (iPDA) network example.