Arrow Research search

Author name cluster

Jiamin Wu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

9 papers
2 author rows

Possible papers

NeurIPS Conference 2025 Conference Paper

CSBrain: A Cross-scale Spatiotemporal Brain Foundation Model for EEG Decoding

  • Yuchen Zhou
  • Jiamin Wu
  • Zichen Ren
  • Zhouheng Yao
  • Weiheng Lu
  • Kunyu Peng
  • Qihao Zheng
  • Chunfeng Song

Understanding and decoding human brain activity from electroencephalography (EEG) signals is a fundamental problem in neuroscience and artificial intelligence, with applications ranging from cognition and emotion recognition to clinical diagnosis and brain–computer interfaces. While recent EEG foundation models have made progress in generalized brain decoding by leveraging unified architectures and large-scale pretraining, they inherit a scale-agnostic dense modeling paradigm from NLP and vision. This design overlooks an intrinsic property of neural activity—cross-scale spatiotemporal structure. Different EEG task patterns span a broad range of temporal and spatial scales, from brief neural activations to slow-varying rhythms, and from localized cortical activations to large-scale distributed interactions. Ignoring this diversity may lead to suboptimal representations and weakened generalization ability. To address these limitations, we propose CSBrain, a Cross-scale Spatiotemporal Brain foundation model for generalized EEG decoding. CSBrain introduces two key components: (i) Cross-scale Spatiotemporal Tokenization (CST), which aggregates multi-scale features within localized temporal windows and anatomical brain regions into compact scale-aware token representations; and (ii) Structured Sparse Attention (SSA), which models cross-window and cross-region dependencies for diverse decoding tasks, further enriching scale diversities while eliminating the spurious dependencies. CST and SSA are alternately stacked to progressively integrate cross-scale spatiotemporal dependencies. Extensive experiments across 11 representative EEG tasks and 16 datasets demonstrate that CSBrain consistently outperforms both task-specific models and strong foundation baselines. These results establish cross-scale modeling as a key inductive bias for generalized EEG decoding and highlight CSBrain as a robust backbone for future brain–AI research.

NeurIPS Conference 2025 Conference Paper

HopaDIFF: Holistic-Partial Aware Fourier Conditioned Diffusion for Referring Human Action Segmentation in Multi-Person Scenarios

  • Kunyu Peng
  • Junchao Huang
  • Xiangsheng Huang
  • Di Wen
  • Junwei Zheng
  • Yufan Chen
  • Kailun Yang
  • Jiamin Wu

Action segmentation is a core challenge in high-level video understanding, aiming to partition untrimmed videos into segments and assign each a label from a predefined action set. Existing methods primarily address single-person activities with fixed action sequences, overlooking multi-person scenarios. In this work, we pioneer textual reference-guided human action segmentation in multi-person settings, where a textual description specifies the target person for segmentation. We introduce the first dataset for Referring Human Action Segmentation, i.e., RHAS133, built from 133 movies and annotated with 137 fine-grained actions with 33h video data, together with textual descriptions for this new task. Benchmarking existing action segmentation methods on RHAS133 using VLM-based feature extractors reveals limited performance and poor aggregation of visual cues for the target person. To address this, we propose a holistic-partial aware Fourier-conditioned diffusion framework, i.e., HopaDIFF, leveraging a novel cross-input gate attentional xLSTM to enhance holistic-partial long-range reasoning and a novel Fourier condition to introduce more fine-grained control to improve the action segmentation generation. HopaDIFF achieves state-of-the-art results on RHAS133 in diverse evaluation settings. The dataset and code are available at https://github.com/KPeng9510/HopaDIFF.

ICML Conference 2025 Conference Paper

MindAligner: Explicit Brain Functional Alignment for Cross-Subject Visual Decoding from Limited fMRI Data

  • Yuqin Dai
  • Zhouheng Yao
  • Chunfeng Song
  • Qihao Zheng
  • Weijian Mai
  • Kunyu Peng
  • Shuai Lu
  • Wanli Ouyang

Brain decoding aims to reconstruct visual perception of human subject from fMRI signals, which is crucial for understanding brain’s perception mechanisms. Existing methods are confined to the single-subject paradigm due to substantial brain variability, which leads to weak generalization across individuals and incurs high training costs, exacerbated by limited availability of fMRI data. To address these challenges, we propose MindAligner, an explicit functional alignment framework for cross-subject brain decoding from limited fMRI data. The proposed MindAligner enjoys several merits. First, we learn a Brain Transfer Matrix (BTM) that projects the brain signals of an arbitrary new subject to one of the known subjects, enabling seamless use of pre-trained decoding models. Second, to facilitate reliable BTM learning, a Brain Functional Alignment module is proposed to perform soft cross-subject brain alignment under different visual stimuli with a multi-level brain alignment loss, uncovering fine-grained functional correspondences with high interpretability. Experiments indicate that MindAligner not only outperforms existing methods in visual decoding under data-limited conditions, but also provides valuable neuroscience insights in cross-subject functional analysis. The code will be made publicly available.

NeurIPS Conference 2025 Conference Paper

PaceLLM: Brain-Inspired Large Language Models for Long-Context Understanding

  • Kangcong Li
  • Peng Ye
  • Chongjun Tu
  • Lin Zhang
  • Chunfeng Song
  • Jiamin Wu
  • Tao Yang
  • Qihao Zheng

While Large Language Models (LLMs) demonstrate strong performance across domains, their long-context capabilities are limited by transient neural activations causing information decay and unstructured feed-forward network (FFN) weights leading to semantic fragmentation. Inspired by the brain’s working memory and cortical modularity, we propose PaceLLM, featuring two innovations: (1) a Persistent Activity (PA) Mechanism that mimics prefrontal cortex (PFC) neurons’ persistent firing by introducing an activation-level memory bank to dynamically retrieve, reuse, and update critical FFN states, addressing contextual decay; and (2) Cortical Expert (CE) Clustering that emulates task-adaptive neural specialization to reorganize FFN weights into semantic modules, establishing cross-token dependencies and mitigating fragmentation. Extensive evaluations show that PaceLLM achieves 6% improvement on LongBench’s Multi-document QA and 12.5–17.5% performance gains on $\infty$-Bench tasks, while extending measurable context length to 200K tokens in Needle-In-A-Haystack (NIAH) tests. This work pioneers brain-inspired LLM optimization and is complementary to other works. Besides, it can be generalized to any model and enhance their long-context performance and interpretability without structural overhauls.

NeurIPS Conference 2025 Conference Paper

SynBrain: Enhancing Visual-to-fMRI Synthesis via Probabilistic Representation Learning

  • Weijian Mai
  • Jiamin Wu
  • Yu Zhu
  • Zhouheng Yao
  • Dongzhan Zhou
  • Andrew Luo
  • Qihao Zheng
  • Wanli Ouyang

Deciphering how visual stimuli are transformed into cortical responses is a fundamental challenge in computational neuroscience. This visual-to-neural mapping is inherently a one-to-many relationship, as identical visual inputs reliably evoke variable hemodynamic responses across trials, contexts, and subjects. However, existing deterministic methods struggle to simultaneously model this biological variability while capturing the underlying functional consistency that encodes stimulus information. To address these limitations, we propose SynBrain, a generative framework that simulates the transformation from visual semantics to neural responses in a probabilistic and biologically interpretable manner. SynBrain introduces two key components: (i) BrainVAE models neural representations as continuous probability distributions via probabilistic learning while maintaining functional consistency through visual semantic constraints; (ii) A Semantic-to-Neural Mapper acts as a semantic transmission pathway, projecting visual semantics into the neural response manifold to facilitate high-fidelity fMRI synthesis. Experimental results demonstrate that SynBrain surpasses state-of-the-art methods in subject-specific visual-to-fMRI encoding performance. Furthermore, SynBrain adapts efficiently to new subjects with few-shot data and synthesizes high-quality fMRI signals that are effective in improving data-limited fMRI-to-image decoding performance. Beyond that, SynBrain reveals functional consistency across trials and subjects, with synthesized signals capturing interpretable patterns shaped by biological neural variability. Our code is available at https://github.com/MichaelMaiii/SynBrain.

JBHI Journal 2025 Journal Article

ToothAxis: Generalizable Tooth Axis Estimation Network from CBCT or IOS Models

  • Nan Bao
  • Qingyao Luo
  • Jiamin Wu
  • Zhiming Cui
  • Yue Zhao

Tooth axes, indicating the orientation of teeth, are crucial in orthodontics and dental implants. The precise and automated estimation of tooth axes in 3D dental models is of significant importance. In clinical settings, Cone-beam computed tomography (CBCT) images and intraoral scanning (IOS) models are the two primary forms of digital data, providing 3D volumetric and surface information of the oral cavity, respectively. However, the detection of tooth axes remains largely manual annotation due to the complexities associated with geometric definitions and the variations among different tooth types and individuals. In this paper, we propose a novel two-stage network, named ToothAxis, for tooth axis estimation using either CBCT or IOS models. Given that IOS models only capture the tooth crown surface and lack information about the tooth roots, we initially employ an implicit-function tooth completion module for 3D tooth completion in the first stage. Subsequently, with the 3D tooth models segmented from CBCT images or completed from IOS models, a point-wise offset-based module is proposed in the second stage to accurately estimate the tooth axes. This design aims to encode tooth orientation into a dense representation, which is better suited for sparse information regression tasks, such as tooth axis estimation. Additionally, we incorporate a class-specific feature attention module to integrate global context representation, thereby enhancing robustness in managing diverse tooth shapes. We evaluated ToothAxis on a dataset obtained from real-world dental clinics, comprising 529 tooth models with corresponding CBCT images and paired IOS models. Finally, the ToothAxis achieves angle errors of LA ($2.921^{\circ}$), PSA ($4.801^{\circ}$), and LSA ($5.074^{\circ}$) on tooth models extracted from CBCT images, and LA ($5.326^{\circ}$), PSA ($6.360^{\circ}$), and LSA ($6.520^{\circ}$) on partial crowns extracted from IOS models. Extensive evaluations, ablation studies, and comparative analyses demonstrate that our method achieves accurate tooth axis estimations and surpasses state-of-the-art approaches.
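
The abstract above reports results as angle errors in degrees but does not say how the angle is computed. As a rough illustration only, the sketch below computes the angle between a predicted and a reference 3D axis vector from their normalized dot product. The function name `axis_angle_error_deg` and the `sign_invariant` option are assumptions made for this example and are not taken from the paper.

```python
import math


def axis_angle_error_deg(pred, gt, sign_invariant=False):
    """Angle in degrees between two non-zero 3D axis vectors.

    Hypothetical helper for illustration; not the paper's evaluation code.
    If sign_invariant is True, an axis and its negation count as identical.
    """
    dot = sum(p * g for p, g in zip(pred, gt))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_g = math.sqrt(sum(g * g for g in gt))
    if norm_p == 0 or norm_g == 0:
        raise ValueError("axis vectors must be non-zero")
    cos_theta = dot / (norm_p * norm_g)
    if sign_invariant:
        cos_theta = abs(cos_theta)
    # Clamp to [-1, 1] so rounding error cannot push acos out of its domain.
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))
```

Whether an evaluation should treat opposite directions as the same axis depends on how the axes are defined, which is why the sketch leaves that as an explicit option.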

AAAI Conference 2024 Conference Paper

Task-Adaptive Prompted Transformer for Cross-Domain Few-Shot Learning

  • Jiamin Wu
  • Xin Liu
  • Xiaotian Yin
  • Tianzhu Zhang
  • Yongdong Zhang

Cross-Domain Few-Shot Learning (CD-FSL) aims at recognizing samples in novel classes from unseen domains that are vastly different from training classes, with few labeled samples. However, the large domain gap between training and novel classes makes previous FSL methods perform poorly. To address this issue, we propose MetaPrompt, a Task-adaptive Prompted Transformer model for CD-FSL, by jointly exploiting prompt learning and the parameter generation framework. The proposed MetaPrompt enjoys several merits. First, a task-conditioned prompt generator is established upon attention mechanisms. It can flexibly produce a task-adaptive prompt with arbitrary length for unseen tasks, by selectively gathering task characteristics from the contextualized support embeddings. Second, the task-adaptive prompt is attached to Vision Transformer to facilitate fast task adaptation, steering the task-agnostic representation to incorporate task knowledge. To our best knowledge, this is the first work to exploit a prompt-based parameter generation mechanism for CD-FSL. Extensive experimental results on the Meta-Dataset benchmark demonstrate that our method achieves superior results against state-of-the-art methods.

EAAI Journal 2022 Journal Article

TSN-based routing and scheduling scheme for Industrial Internet of Things in underground mining

  • Yinghui Zhang
  • Jiamin Wu
  • Mingli Liu
  • Aiping Tan

Recently, there has been a growing interest in Time-Sensitive Networking (TSN) research in the Industrial Internet of Things (IIoT). The research on real-time and reliable IIoT transmission technology based on TSN has received extensive attention. Especially in some industrial scenarios, such as the safety monitoring of the underground construction environment, the real-time and reliability of data transmission is very important due to the long tunnel and complex structure in underground mining. It is a huge challenge to ensure that the data collected by the sensor has a slight end-to-end delay to the receiving end. A routing scheduling method is proposed based on TSN for IIoT applications in underground mining. Firstly, according to the characteristics of large-scale and distributed hybrid topology in mines and the delay constraints of TSN, the multi-hop routing cooperative scheduling problem S is defined. It is proved that the problem is NP-Hard. Secondly, two algorithms are proposed for problem S: (1) greedy algorithm based on local shortest delay; (2) heuristic algorithm based on an ant colony. Finally, a comparative experimental analysis of the proposed algorithm is carried out, proving that this paper’s algorithm performs better in terms of delay and jitter.
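
The abstract above names a greedy algorithm based on local shortest delay without giving its details. The sketch below shows only the generic idea of a greedy next-hop rule: at each node, forward to the unvisited neighbor with the smallest link delay. It is not the paper's algorithm and ignores TSN scheduling constraints entirely. The function `greedy_route` and the adjacency-dictionary graph format are assumptions made for this example.

```python
def greedy_route(graph, source, target):
    """Greedy next-hop routing by smallest local link delay.

    graph: dict mapping node -> {neighbor: link_delay} (hypothetical format).
    Returns (path, total_delay), or (None, inf) if the walk hits a dead end.
    """
    path = [source]
    visited = {source}
    total_delay = 0.0
    node = source
    while node != target:
        # Only consider neighbors not yet on the path, to avoid loops.
        candidates = [
            (delay, nxt)
            for nxt, delay in graph.get(node, {}).items()
            if nxt not in visited
        ]
        if not candidates:
            return None, float("inf")
        delay, nxt = min(candidates)  # locally shortest delay
        path.append(nxt)
        visited.add(nxt)
        total_delay += delay
        node = nxt
    return path, total_delay


# Toy topology: the locally cheapest first hop leads to a slower route overall.
toy_graph = {
    "s": {"a": 1, "b": 5},
    "a": {"t": 10},
    "b": {"t": 1},
    "t": {},
}
route, delay = greedy_route(toy_graph, "s", "t")
```

On this toy topology the greedy rule picks s, a, t with total delay 11, although s, b, t has total delay 6. A purely local rule can miss the globally best route in this way.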

EAAI Journal 2021 Journal Article

Bayesian retinex underwater image enhancement

  • Peixian Zhuang
  • Chongyi Li
  • Jiamin Wu

This paper develops a Bayesian retinex algorithm for enhancing single underwater image with multiorder gradient priors of reflectance and illumination. First, a simple yet effective color correction approach is adopted to remove color casts and recover naturalness. Then a maximum a posteriori formulation for underwater image enhancement is established on the color-corrected image by imposing multiorder gradient priors on reflectance and illumination. The $\ell_1$ norm is appropriately used to model piecewise and piecewise linear approximations on the reflectance, and the $\ell_2$ norm is used to enforce spatial smoothness and spatial linear smoothness on the illumination. Meanwhile, a complex underwater image enhancement issue is turned into two simple denoising subproblems where their convergence analyses are mathematically provided, and their solutions can be derived by an efficient optimization algorithm. Besides, the proposed model is fast implemented on pixelwise operations while not requiring additional prior knowledge about underwater imaging conditions. Final experiments demonstrate the effectiveness of the proposed method in color correction, naturalness preservation, structures and details promotion, artifacts or noise suppression. Compared with several traditional and leading enhancement approaches, the proposed method yields better results on qualitative and quantitative assessments. The superiority of the proposed method can be extended to several challenging applications.