Arrow Research search

Author name cluster

An Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

4 papers
2 author rows

Possible papers


AAAI Conference 2026 Conference Paper

Binary-Gaussian: Compact and Progressive Representation for 3D Gaussian Segmentation

  • An Yang
  • Chenyu Liu
  • Jun Du
  • Jianqing Gao
  • Jia Pan
  • Jinshui Hu
  • Baocai Yin
  • Bing Yin

3D Gaussian Splatting (3D-GS) has emerged as an efficient 3D representation and a promising foundation for semantic tasks like segmentation. However, existing 3D-GS-based segmentation methods typically rely on high-dimensional category features, which introduce substantial memory overhead. Moreover, fine-grained segmentation remains challenging due to label space congestion and the lack of stable multi-granularity control mechanisms. To address these limitations, we propose a coarse-to-fine binary encoding scheme for per-Gaussian category representation, which compresses each feature into a single integer via the binary-to-decimal mapping, drastically reducing memory usage. We further design a progressive training strategy that decomposes panoptic segmentation into a series of independent sub-tasks, reducing inter-class conflicts and thereby enhancing fine-grained segmentation capability. Additionally, we fine-tune opacity during segmentation training to address the incompatibility between photometric rendering and semantic segmentation, which often leads to foreground-background confusion. Extensive experiments on multiple benchmarks demonstrate that our method achieves state-of-the-art segmentation performance while significantly reducing memory consumption and accelerating inference.
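As a rough illustration of the binary-to-decimal mapping mentioned in the abstract, the sketch below packs a coarse-to-fine sequence of binary decisions into one integer and reads a coarser label back out by dropping low-order bits. The bit ordering (coarsest decision in the most significant bit) and the function names `pack_bits`, `unpack_bits`, and `coarse_prefix` are assumptions made for this example; the listing does not describe the paper's actual encoding.

```python
from typing import List


def pack_bits(bits: List[int]) -> int:
    """Pack 0/1 decisions into one integer (assumed order: coarsest bit first)."""
    value = 0
    for b in bits:
        if b not in (0, 1):
            raise ValueError("each entry must be 0 or 1")
        value = (value << 1) | b  # shift existing bits up, append the new one
    return value


def unpack_bits(value: int, width: int) -> List[int]:
    """Recover the `width` binary decisions from a packed integer."""
    return [(value >> shift) & 1 for shift in range(width - 1, -1, -1)]


def coarse_prefix(value: int, width: int, level: int) -> int:
    """Keep only the first `level` (coarsest) bits of a `width`-bit code."""
    if not 0 <= level <= width:
        raise ValueError("level must be between 0 and width")
    return value >> (width - level)


if __name__ == "__main__":
    code = pack_bits([1, 0, 1, 1])      # binary 1011
    print(code)                         # 11
    print(unpack_bits(code, 4))         # [1, 0, 1, 1]
    print(coarse_prefix(code, 4, 2))    # binary 10 -> 2
```

Under this assumed layout, a 4-bit code occupies one integer per Gaussian instead of a multi-dimensional feature vector, and truncating the code gives a coarser grouping without retraining.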

NeurIPS Conference 2025 Conference Paper

Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning

  • Shenzhi Wang
  • Le Yu
  • Chang Gao
  • Chujie Zheng
  • Shixuan Liu
  • Rui Lu
  • Kai Dang
  • Xiong-Hui Chen

Reinforcement Learning with Verifiable Rewards (RLVR) has emerged as a powerful approach to enhancing the reasoning capabilities of Large Language Models (LLMs), yet its underlying mechanisms remain insufficiently understood. In this work, we undertake a pioneering exploration of RLVR through the novel perspective of token entropy patterns, comprehensively analyzing how different tokens influence reasoning performance. By examining token entropy patterns in Chain-of-Thought (CoT) reasoning, we observe that only a small fraction (approximately 20%) of tokens exhibit high entropy, and these tokens semantically act as critical forks that steer the model toward diverse reasoning pathways. We further demonstrate that moderately increasing the entropy of these high-entropy tokens via decoding temperature adjustments leads to improved performance, quantitatively confirming their role as decision points in reasoning. We ultimately refine RLVR by restricting policy gradient updates to these forking tokens. Despite utilizing only 20% of tokens, our approach achieves comparable performance to full-gradient updates on the Qwen3-8B base model. Moreover, it demonstrates remarkable improvements on the larger Qwen3-32B base model, boosting AIME'25 scores by 11.04 and AIME'24 scores by 7.71. In contrast, training exclusively on the 80% lowest-entropy tokens leads to a marked decline in performance. These findings indicate that the efficacy of RLVR primarily arises from optimizing the high-entropy tokens that dictate key reasoning directions. Collectively, our results suggest promising avenues for optimizing RLVR algorithms by strategically leveraging the potential of these high-entropy minority tokens to further enhance the reasoning abilities of LLMs.
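The idea of restricting updates to high-entropy tokens can be sketched in a few lines. The example below computes per-token Shannon entropy, keeps the top 20% of tokens by entropy, and averages a loss only over those tokens. Several parts are assumptions for illustration: the helper names, the per-sequence selection of the top fraction, and the plain REINFORCE-style surrogate `-log_prob * advantage`, which stands in for whatever RLVR objective the paper actually uses.

```python
import math
from typing import List, Sequence


def token_entropy(probs: Sequence[float]) -> float:
    """Shannon entropy (in nats) of one token's next-token distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def high_entropy_mask(entropies: Sequence[float],
                      keep_fraction: float = 0.2) -> List[bool]:
    """Mark the top `keep_fraction` of positions by entropy (at least one)."""
    n = len(entropies)
    if n == 0:
        return []
    k = max(1, math.ceil(keep_fraction * n))
    ranked = sorted(range(n), key=lambda i: entropies[i], reverse=True)
    keep = set(ranked[:k])
    return [i in keep for i in range(n)]


def masked_policy_loss(log_probs: Sequence[float],
                       advantages: Sequence[float],
                       mask: Sequence[bool]) -> float:
    """Average a REINFORCE-style surrogate over the kept tokens only
    (an illustrative stand-in, not the paper's objective)."""
    kept = [-lp * adv for lp, adv, m in zip(log_probs, advantages, mask) if m]
    return sum(kept) / len(kept) if kept else 0.0


if __name__ == "__main__":
    entropies = [0.1, 2.0, 0.3, 0.05, 1.5, 0.2, 0.0, 0.4, 0.01, 0.02]
    mask = high_entropy_mask(entropies)
    print([i for i, m in enumerate(mask) if m])  # [1, 4]
```

In a real training loop the mask would multiply the per-token loss tensor before reduction, so the low-entropy tokens contribute no gradient.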

ICML Conference 2022 Conference Paper

OFA: Unifying Architectures, Tasks, and Modalities Through a Simple Sequence-to-Sequence Learning Framework

  • Peng Wang 0028
  • An Yang
  • Rui Men
  • Junyang Lin
  • Shuai Bai
  • Zhikang Li
  • Jianxin Ma
  • Chang Zhou 0005

In this work, we pursue a unified paradigm for multimodal pretraining to break the shackles of complex task/modality-specific customization. We propose OFA, a Task-Agnostic and Modality-Agnostic framework that supports Task Comprehensiveness. OFA unifies a diverse set of cross-modal and unimodal tasks, including image generation, visual grounding, image captioning, image classification, language modeling, etc., in a simple sequence-to-sequence learning framework. OFA follows the instruction-based learning in both pretraining and finetuning stages, requiring no extra task-specific layers for downstream tasks. In comparison with the recent state-of-the-art vision & language models that rely on extremely large cross-modal datasets, OFA is pretrained on only 20M publicly available image-text pairs. Despite its simplicity and relatively small-scale training data, OFA achieves new SOTAs in a series of cross-modal tasks while attaining highly competitive performances on uni-modal tasks. Our further analysis indicates that OFA can also effectively transfer to unseen tasks and unseen domains. Our code and models are publicly available at https://github.com/OFA-Sys/OFA.

AAAI Conference 2020 Conference Paper

A Robust Adversarial Training Approach to Machine Reading Comprehension

  • Kai Liu
  • Xin Liu
  • An Yang
  • Jing Liu
  • Jinsong Su
  • Sujian Li
  • Qiaoqiao She

Lacking robustness is a serious problem for Machine Reading Comprehension (MRC) models. To alleviate this problem, one of the most promising ways is to augment the training dataset with sophisticatedly designed adversarial examples. Generally, those examples are created by rules according to the observed patterns of successful adversarial attacks. Since the types of adversarial examples are innumerable, it is not adequate to manually design and enrich training data to defend against all types of adversarial attacks. In this paper, we propose a novel robust adversarial training approach to improve the robustness of MRC models in a more generic way. Given an MRC model well-trained on the original dataset, our approach dynamically generates adversarial examples based on the parameters of the current model and further trains the model by using the generated examples in an iterative schedule. When applied to the state-of-the-art MRC models, including QANET, BERT and ERNIE2.0, our approach obtains significant and comprehensive improvements on 5 adversarial datasets constructed in different ways, without sacrificing the performance on the original SQuAD development set. Moreover, when coupled with other data augmentation strategies, our approach further boosts the overall performance on adversarial datasets and outperforms the state-of-the-art methods.