Arrow Research search

Author name cluster

Yue Peng

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
1 author row

Possible papers (4)

AAAI 2026 · Conference Paper

StepFun-Formalizer: Unlocking the Autoformalization Potential of LLMs Through Knowledge-Reasoning Fusion

  • Yutong Wu
  • Di Huang
  • Ruosi Wan
  • Yue Peng
  • Shijie Shang
  • Chenrui Cao
  • Lei Qi
  • Rui Zhang

Autoformalization aims to translate natural-language mathematical statements into a formal language. While LLMs have accelerated progress in this area, existing methods still suffer from low accuracy. We identify two key abilities for effective autoformalization: comprehensive mastery of formal-language domain knowledge, and the capability to reason about natural-language problems and align informal with formal expressions. Without the former, a model cannot identify the correct formal objects; without the latter, it struggles to interpret real-world contexts and map them precisely into formal expressions. To address these gaps, we introduce ThinkingF, a data synthesis and training pipeline that improves both abilities. First, we construct two datasets: one by distilling and selecting large-scale examples rich in formal knowledge, and another by generating informal-to-formal reasoning trajectories guided by expert-designed templates. We then apply SFT and RLVR with these datasets to further fuse and refine the two abilities. The resulting 7B and 32B models exhibit both comprehensive formal knowledge and strong informal-to-formal reasoning. Notably, StepFun-Formalizer-32B achieves SOTA BEq@1 scores of 40.5% on FormalMATH-Lite and 26.7% on ProverBench, surpassing all prior general-purpose and specialized models.
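As a toy illustration of the informal-to-formal mapping this abstract describes (not an example from the paper), the natural-language statement "for every natural number n, n + 0 = n" could be autoformalized into Lean 4 as:

```lean
-- Informal statement: "for every natural number n, n + 0 = n".
-- Formal counterpart in Lean 4 (core library only); `rfl` closes the
-- goal because `Nat.add` recurses on its second argument, so `n + 0`
-- reduces to `n` definitionally.
theorem add_zero_example (n : Nat) : n + 0 = n := rfl
```

A correct autoformalizer needs both abilities the abstract identifies: formal-language knowledge to pick the right objects (`Nat`, `+`, `=`) and reasoning to align the informal quantifier "for every" with the binder `(n : Nat)`.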

JBHI 2026 · Journal Article

UTADC-Net: Unsupervised Topological-Aware Diffusion Condensation Network for Medical Image Segmentation

  • Yue Peng
  • Ruodai Wu
  • Bing Xiong
  • Fuqiang Chen
  • Jun Ma
  • Yaoqin Xie
  • Jing Cai
  • Wenjian Qin

Medical image segmentation plays a crucial role in computer-aided diagnosis and treatment planning. Unsupervised segmentation methods that can effectively leverage unlabeled data hold significant promise for clinical application. However, maintaining the topological consistency of anatomical structures remains challenging: such methods often produce structural breaks, connectivity errors, or boundary discontinuities. To address these issues, we propose a novel Unsupervised Topological-Aware Diffusion Condensation Network (UTADC-Net) for medical image segmentation. Specifically, we design a diffusion condensation-based framework that achieves structural consistency in segmentation results by effectively modeling long-range dependencies between pixels and incorporating topological constraints. First, to effectively fuse local details and global semantic information, we employ a pixel-centric patch embedding module that simultaneously models local structural features and inter-region interactions. Second, to enhance the topological consistency of segmentation results, we introduce an adaptive topological constraint mechanism that guides the network to learn anatomically aligned structural representations through pixel-level topological relationships and corresponding loss functions. Extensive experiments conducted on three public medical image datasets demonstrate that our proposed UTADC-Net significantly outperforms existing unsupervised methods in terms of segmentation accuracy and topological structure preservation. Notably, our method produces segmentation results with excellent anatomical structural consistency. These results indicate that our framework provides a novel and practical solution for unsupervised medical image segmentation.
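The topological failure modes the abstract names (structural breaks, connectivity errors) can be made concrete with a minimal, non-differentiable sketch. This is an illustration only, not UTADC-Net's adaptive topological constraint, which works through pixel-level topological relationships and loss functions:

```python
from collections import deque

def count_components(mask):
    """Count 4-connected foreground components in a binary mask.

    A toy topological summary of a segmentation: a ring-shaped organ
    boundary that splits into two components indicates a connectivity
    error of the kind topology-aware methods try to prevent.
    """
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    count = 0
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                count += 1
                queue = deque([(i, j)])
                seen[i][j] = True
                while queue:  # flood-fill one component
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count

# A ring-like boundary with two missing pixels: the top-left pixel is
# cut off from the rest, so the mask has 2 components instead of 1.
broken = [
    [1, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
]
print(count_components(broken))
```

A differentiable topological constraint would turn a discrepancy like this (expected one component, observed two) into a loss term; here it is only counted, to show what "topological consistency" measures.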

AAAI 2025 · Conference Paper

Unpaired Multi-Domain Histopathology Virtual Staining Using Dual Path Prompted Inversion

  • Bing Xiong
  • Yue Peng
  • Ranran Zhang
  • Fuqiang Chen
  • Jiaye He
  • Wenjian Qin

Virtual staining leverages computer-aided techniques to transfer the style of histochemically stained tissue samples to other staining types. In virtual staining of pathological images, maintaining strict structural consistency is crucial, as these images emphasize structural integrity more than natural images. Even slight structural alterations can lead to deviations in diagnostic semantic information. Furthermore, the unpaired characteristic of virtual staining data may compromise the preservation of pathological diagnostic content. To address these challenges, we propose a dual-path inversion virtual staining method using prompt learning, which optimizes visual prompts to control content and style while preserving complete pathological diagnostic content. Our proposed inversion technique comprises two key components: (1) Dual Path Prompted Strategy: we utilize a feature adapter function to generate reference images that provide style templates for input-image inversion, called the Style Target Path. We use the inversion of the input image as the Structural Target Path, employing visual prompt images to maintain structural consistency in this path while preserving style information from the Style Target Path. During the deterministic sampling process, we achieve complete content-style disentanglement through a plug-and-play embedding visual prompt approach. (2) StainPrompt Optimization: we optimize only the null visual prompt as the "operator" for dual-path inversion, rather than fine-tuning the pre-trained model. We optimize the null visual prompt along the structural and style trajectories around the pivotal noise at each timestep, ensuring accurate dual-path inversion reconstruction. Extensive evaluations on publicly available multi-domain unpaired staining datasets demonstrate high structural consistency and accurate style transfer results.
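To give a feel for the idea of optimizing only a null prompt while the pre-trained model stays frozen, here is a deliberately tiny sketch with made-up linear stand-ins (`W` and `f` are hypothetical toys, not the paper's diffusion model or its StainPrompt procedure):

```python
import numpy as np

# Toy analogue of null-prompt optimization: a frozen "model" f(z, p)
# is conditioned on a prompt vector p, and we tune ONLY p (never W)
# so that f reconstructs a fixed target latent.
rng = np.random.default_rng(0)
W = rng.standard_normal((4, 4))   # frozen weights: never updated

def f(z, p):
    """Toy frozen model conditioned on prompt p."""
    return W @ z + p

z = rng.standard_normal(4)        # input latent
z_target = rng.standard_normal(4) # pivotal reconstruction target

p = np.zeros(4)                   # null prompt: the only free parameter
for _ in range(200):
    residual = f(z, p) - z_target           # reconstruction error
    p -= 0.1 * residual                     # grad of 0.5*||residual||^2 w.r.t. p
print(float(np.linalg.norm(f(z, p) - z_target)))
```

After optimization the frozen model reconstructs the target through the prompt alone, which is the appeal the abstract points at: no fine-tuning of pre-trained weights, only a lightweight prompt per inversion.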