Arrow Research search

Author name cluster

Bohan Zhang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

YNIMG Journal 2026 Journal Article

Nonlinear shift along the sensorimotor-association-axis in brain responses to task performance

  • Yuqi Yuan
  • Bohan Zhang
  • Kyle Perkins
  • Fan Cao

In the cognitive neuroscience literature, researchers tend to assume a linear relationship between brain activation level and task performance; however, conflicting findings have been reported in different studies. Therefore, there may be a non-linear relationship between task performance and brain activation if a full range of task performance is considered. In the current study, using the Human Connectome Project (HCP) dataset, we examined the relationship between brain activation (i.e., beta values) and working memory performance in four conditions (i.e., faces, body parts, tools and places). We found a gradual change along the sensorimotor-association (S-A) axis, with the higher-rank regions showing greater concavity (an inverted U-shaped curve) than the lower-rank regions only in the face and body part conditions. In the tool and place conditions, very few high-order regions showed a relationship with performance; therefore, the association with S-A ranking is missing. Instead, in the place condition, many regions showed a convex pattern with task performance. Moreover, the inflection point is above the average performance in the concave regions and below the average in the convex regions. In summary, our study revealed a novel functional property of the brain in response to task performance along the S-A axis.

YNICL Journal 2026 Journal Article

The specific linear or curved boundaries between WHO grade II–III insular gliomas and the basal ganglia indicate distinct biological features, survival outcomes, and surgical strategies: evidence from 330 cases

  • Hongfang Zhao
  • Bohan Zhang
  • Bowen Xue
  • Jianfeng Liang
  • Changyu Lu
  • Zonggang Hou
  • Zhenye Li
  • Jian Xie

BACKGROUND: WHO grade II-III insular glioma (InG) can displace the basal ganglia, resulting in distinct linear or curved boundary shapes. This study retrospectively explored this morphological distinction and its clinical relevance. METHODS: test, Cox regression, Kaplan-Meier analysis, and correlation analysis. Regression models were constructed using the stepwise Wald method. Model performance was evaluated through internal validation, including bootstrap resampling and 5-fold cross-validation. RESULTS: A total of 330 patients were included. Based on the FD, patients were classified into linear ('L'; 49%) and curved ('C'; 51%) subgroups. Compared with the L subgroup, the C subgroup exhibited more favorable biological features and progression-free survival (PFS) (p < 0.001). Using routine pathological variables, regression models were constructed to identify crucial factors in different subgroups and perform disease stratification for Statistical explanation. Internal validation supported model stability. With regard to surgical responses, although the rate of gross total resection (GTR) was higher in the L subgroup, the PFS was not increased, whereas that of the C subgroup demonstrated an increase (p < 0.001). CONCLUSIONS: The boundary shape between WHO grade II-III InG and the basal ganglia may reflect differences in biological features, survival outcomes, and surgical responses. Thus, the boundary shape should be considered in clinical practice.

NeurIPS Conference 2025 Conference Paper

PHYBench: Holistic Evaluation of Physical Perception and Reasoning in Large Language Models

  • Shi Qiu
  • Shaoyang Guo
  • Zhuo-Yang Song
  • Yunbo Sun
  • Zeyu Cai
  • Jiashen Wei
  • Tianyu Luo
  • Yixuan Yin

Current benchmarks for evaluating the reasoning capabilities of Large Language Models (LLMs) face significant limitations: task oversimplification, data contamination, and flawed evaluation items. These deficiencies necessitate more rigorous assessment methods. To address these limitations, we introduce PHYBench, a benchmark of 500 original physics problems ranging from high school to Physics Olympiad difficulty. PHYBench addresses data contamination through original content and employs a systematic curation pipeline to eliminate flawed items. Evaluations show that PHYBench activates more tokens and provides stronger differentiation between reasoning models compared to other baselines like AIME 2024, OlympiadBench and GPQA. Even the best-performing model, Gemini 2.5 Pro, achieves only 36.9% accuracy compared to human experts' 61.9%. To further enhance evaluation precision, we introduce the Expression Edit Distance (EED) Score for mathematical expression assessment, which improves sample efficiency by 204% over binary scoring. Moreover, PHYBench effectively elicits multi-step and multi-condition reasoning, providing a platform for examining models' reasoning robustness, preferences, and deficiencies. The benchmark results and dataset are publicly available at https://www.phybench.cn/.

NeurIPS Conference 2024 Conference Paper

SpreadsheetBench: Towards Challenging Real World Spreadsheet Manipulation

  • Zeyao Ma
  • Bohan Zhang
  • Jing Zhang
  • Jifan Yu
  • Xiaokang Zhang
  • Xiaohan Zhang
  • Sijia Luo
  • Xi Wang

We introduce SpreadsheetBench, a challenging spreadsheet manipulation benchmark exclusively derived from real-world scenarios, designed to immerse current large language models (LLMs) in the actual workflow of spreadsheet users. Unlike existing benchmarks that rely on synthesized queries and simplified spreadsheet files, SpreadsheetBench is built from 912 real questions gathered from online Excel forums, which reflect the intricate needs of users. The associated spreadsheets from the forums contain a variety of tabular data such as multiple tables, non-standard relational tables, and abundant non-textual elements. Furthermore, we propose a more reliable evaluation metric akin to online judge platforms, where multiple spreadsheet files are created as test cases for each instruction, ensuring the evaluation of robust solutions capable of handling spreadsheets with varying values. Our comprehensive evaluation of various LLMs under both single-round and multi-round inference settings reveals a substantial gap between the state-of-the-art (SOTA) models and human performance, highlighting the benchmark's difficulty.
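The online-judge-style evaluation described in this abstract can be illustrated with a minimal sketch. It is not the benchmark's actual code: the `Sheet` type, the `judge` function, the two example solutions, and the two scoring rules (fraction of test cases passed, and all test cases passed) are hypothetical stand-ins chosen only to show why several test spreadsheets per instruction expose solutions that hard-code values.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical stand-in for a spreadsheet: cell address -> numeric value.
Sheet = Dict[str, float]
TestCase = Tuple[Sheet, Sheet]  # (input sheet, expected output sheet)


def judge(solution: Callable[[Sheet], Sheet],
          test_cases: List[TestCase]) -> Tuple[float, bool]:
    """Run one solution on every test case for an instruction.

    Returns (fraction of cases passed, whether all cases passed).
    Both scoring rules are assumptions made for this illustration.
    """
    passed = sum(1 for inp, expected in test_cases
                 if solution(dict(inp)) == expected)
    return passed / len(test_cases), passed == len(test_cases)


def robust_sum(sheet: Sheet) -> Sheet:
    """Writes the sum of column A into B1, whatever the values are."""
    out = dict(sheet)
    out["B1"] = sum(v for k, v in sheet.items() if k.startswith("A"))
    return out


def hardcoded_sum(sheet: Sheet) -> Sheet:
    """Writes a constant that only happens to fit the first test case."""
    out = dict(sheet)
    out["B1"] = 6.0
    return out


# Two test spreadsheets for the same instruction, with different values.
test_cases: List[TestCase] = [
    ({"A1": 1.0, "A2": 2.0, "A3": 3.0},
     {"A1": 1.0, "A2": 2.0, "A3": 3.0, "B1": 6.0}),
    ({"A1": 4.0, "A2": 5.0, "A3": 6.0},
     {"A1": 4.0, "A2": 5.0, "A3": 6.0, "B1": 15.0}),
]

print(judge(robust_sum, test_cases))     # (1.0, True)
print(judge(hardcoded_sum, test_cases))  # (0.5, False)
```

With a single test file the hard-coded solution would look correct; the second file with different values is what separates it from the robust one.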

AAAI Conference 2022 Conference Paper

A Unified Framework for Real Time Motion Completion

  • Yinglin Duan
  • Yue Lin
  • Zhengxia Zou
  • Yi Yuan
  • Zhehui Qian
  • Bohan Zhang

Motion completion, as a challenging and fundamental problem, is of great significance in film and game applications. For different motion completion application scenarios (inbetweening, in-filling, and blending), most previous methods deal with the completion problems with case-by-case methodology designs. In this work, we propose a simple but effective method to solve multiple motion completion problems under a unified framework and achieve a new state-of-the-art accuracy on LaFAN1 (+17% better than the previous SoTA) under multiple evaluation settings. Inspired by the recent great success of self-attention-based transformer models, we consider the completion as a sequence-to-sequence prediction problem. Our method consists of three modules: a standard transformer encoder with self-attention that learns long-range dependencies of input motions, a trainable mixture embedding module that models temporal information and encodes different key-frame combinations in a unified form, and a new motion perceptual loss for better capturing high-frequency movements. Our method can predict multiple missing frames within a single forward propagation in real-time without post-processing. We also introduce a novel large-scale dance movement dataset for exploring the scaling capability of our method and its effectiveness in complex motion applications.