Arrow Research search

Author name cluster

Yang Luo

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers

AAAI Conference 2026 Conference Paper

SGS-3D: High-Fidelity 3D Instance Segmentation via Reliable Semantic Mask Splitting and Growing

  • Chaolei Wang
  • Yang Luo
  • Jing Du
  • Siyu Chen
  • Yiping Chen
  • Ting Han

Accurate 3D instance segmentation is crucial for high-quality scene understanding in the 3D vision domain. However, 3D instance segmentation based on 2D-to-3D lifting approaches struggles to produce precise instance-level segmentation due to accumulated errors introduced during the lifting process from ambiguous semantic guidance and insufficient depth constraints. To tackle these challenges, we propose Splitting and Growing reliable Semantic mask for high-fidelity 3D instance segmentation (SGS-3D), a novel "split-then-grow" framework that first purifies and splits ambiguous lifted masks using geometric primitives, and then grows them into complete instances within the scene. Unlike existing approaches that directly rely on raw lifted masks and sacrifice segmentation accuracy, SGS-3D serves as a training-free refinement method that jointly fuses semantic and geometric information, enabling effective cooperation between the two levels of representation. Specifically, for semantic guidance, we introduce a mask filtering strategy that leverages the co-occurrence of 3D geometry primitives to identify and remove ambiguous masks, thereby ensuring more reliable semantic consistency with the 3D object instances. For geometric refinement, we construct fine-grained object instances by exploiting both spatial continuity and high-level features, particularly in cases of semantic ambiguity between distinct objects. Experimental results on ScanNet200, ScanNet++, and KITTI-360 demonstrate that SGS-3D substantially improves segmentation accuracy and robustness against inaccurate masks from pre-trained models, yielding high-fidelity object instances while maintaining strong generalization across diverse indoor and outdoor environments.
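The "grow" step relies on spatial continuity. The abstract does not give the exact algorithm, but a minimal region-growing sketch over a point cloud (the function name and fixed radius are illustrative assumptions, not the paper's implementation, and the real method also uses high-level features) could look like:

```python
def grow_instance(points, seed_idx, radius=0.6):
    """Region growing by spatial continuity: starting from a seed point,
    collect every point reachable through gaps of at most `radius`."""
    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

    grown, frontier = {seed_idx}, [seed_idx]
    while frontier:
        i = frontier.pop()
        for j, p in enumerate(points):
            if j not in grown and dist(points[i], p) <= radius:
                grown.add(j)
                frontier.append(j)
    return sorted(grown)
```

Points separated by a gap larger than the radius stay outside the grown instance, which is how spatial continuity separates adjacent objects.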

IJCAI Conference 2025 Conference Paper

Can We Translate Code Better with LLMs and Call Graph Analysis?

  • Yang Luo

This paper proposes an innovative code translation method aimed at addressing the accuracy issues encountered by large language models (LLMs) in translating code of complex large-scale software projects. The method utilizes the Language Server Protocol to obtain the call graph of the entire codebase, and optimizes the input prompt of the LLM accordingly, significantly improving the correctness of translation at the compilation stage. Moreover, this method introduces the bridged debuggers technique based on the Debug Adapter Protocol and dynamic test case generation, effectively fixing runtime errors. Experiments on multiple mainstream datasets demonstrate that, compared to existing code translation methods and LLMs, this method achieves a significant improvement in translation accuracy.
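One natural way to use a call graph for translation is to translate callees before their callers, so each function's translated dependencies already exist when it is prompted. The sketch below (a plain post-order DFS over an adjacency-list call graph; the paper's actual prompt-optimization strategy is not specified in the abstract, and cycles are not handled here) illustrates the idea:

```python
def translation_order(call_graph):
    """Return functions in an order where every callee precedes its
    callers, via post-order DFS over the call graph (assumed acyclic)."""
    order, seen = [], set()

    def visit(fn):
        if fn in seen:
            return
        seen.add(fn)
        for callee in call_graph.get(fn, ()):
            visit(callee)
        order.append(fn)

    for fn in call_graph:
        visit(fn)
    return order
```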

ICML Conference 2025 Conference Paper

Info-Coevolution: An Efficient Framework for Data Model Coevolution

  • Ziheng Qin
  • Hailun Xu
  • Wei Chee Yew
  • Qi Jia
  • Yang Luo
  • Kanchan Sarkar
  • Danhui Guan
  • Kai Wang 0036

Machine learning relies heavily on data, yet the continuous growth of real-world data poses challenges for efficient dataset construction and training. A fundamental yet unsolved question is: given our current model and data, does a new data sample (or batch) need annotation and learning? Conventional approaches retain all available data, leading to suboptimal data and training efficiency. Active learning aims to reduce data redundancy by selecting a subset of samples to annotate, but it increases pipeline complexity and introduces bias. In this work, we propose Info-Coevolution, a novel framework that efficiently enables models and data to coevolve through online selective annotation without introducing bias. Leveraging task-specific models (and open-source models), it selectively annotates and integrates online and web data to improve datasets efficiently. For real-world datasets like ImageNet-1K, Info-Coevolution reduces annotation and training costs by 32% without performance loss, and it determines the saving ratio automatically rather than requiring it to be tuned. With semi-supervised learning, it can further reduce the annotation ratio to 50%. We also explore retrieval-based dataset enhancement using unlabeled open-source data. Code is available at https://github.com/NUS-HPC-AI-Lab/Info-Coevolution/.
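The abstract does not spell out the selection rule. As an assumed illustration only, online selective annotation can be sketched as an uncertainty gate that sends a sample to annotation only when the current model is unsure about it (the entropy threshold is a made-up stand-in for the paper's actual criterion):

```python
import math

def predictive_entropy(probs):
    """Shannon entropy of a model's predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_for_annotation(batch_probs, threshold=0.5):
    """Return indices of samples uncertain enough to be worth annotating;
    confident predictions are skipped, saving annotation cost."""
    return [i for i, probs in enumerate(batch_probs)
            if predictive_entropy(probs) > threshold]
```

A confidently classified sample passes through unlabeled, while a near-uniform prediction triggers annotation.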

IJCAI Conference 2025 Conference Paper

LPDetective: Dusting the LLM Chats for Prompt Template Abusers

  • Yang Luo
  • Qingni Shen
  • Zhonghai Wu

The abuse of LLM Chatbot interfaces by web robots leads to a significant waste of GPU and server resources, posing a serious security challenge. To address this issue, we propose LPDetective, an unsupervised method for detecting robot prompt templates. This method is based on the assumption that robot-generated text repeatedly uses the same or highly similar phrases and sentence structures across multiple sessions, differing from human natural conversations. We design a multi-stage workflow, including message grouping, text similarity measurement, hierarchical clustering analysis, and regular expression extraction, to automatically extract potential robot behavior patterns from chat logs. LPDetective does not require predefined templates or rely on training data, enabling it to adaptively discover new, unknown patterns. We conduct systematic experiments on three large-scale real-world datasets: Bing Copilot, Wildchat, and ChatLog. The results show that LPDetective can efficiently and accurately detect robot prompt templates in various scenarios, achieving a 7.5% improvement in F1 score compared to the state-of-the-art XLNet method and reducing detection latency by 178 times on the Bing Copilot dataset.
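A stripped-down version of the workflow's grouping and clustering stages can be sketched with word-level Jaccard similarity and a greedy single-pass clustering; the threshold, similarity measure, and minimum cluster size here are illustrative assumptions, not LPDetective's actual choices:

```python
def jaccard(a, b):
    """Word-level Jaccard similarity between two messages."""
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def find_template_clusters(messages, sim_threshold=0.8, min_size=3):
    """Greedily cluster near-duplicate messages; clusters recurring at
    least `min_size` times are flagged as likely robot prompt templates."""
    clusters = []
    for i, msg in enumerate(messages):
        for cluster in clusters:
            if jaccard(msg, messages[cluster[0]]) >= sim_threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return [c for c in clusters if len(c) >= min_size]
```

Human conversations rarely repeat verbatim across sessions, so only templated robot traffic accumulates clusters large enough to be flagged.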

IJCAI Conference 2025 Conference Paper

MA-RAG: Automating Role Engineering for RESTful APIs with Multi-Head Attention and Retrieval-Augmented Generation

  • Yang Luo
  • Qingni Shen
  • Zhonghai Wu

This paper addresses the role engineering problem for RESTful applications and proposes a role engineering method based on multi-head attention and Retrieval Augmented Generation called MA-RAG. The method first performs fine-grained control flow analysis on the system source code to extract permission information of API handlers. Then, using basic blocks as units, it employs pre-trained code models to convert the source code into semantic vectors, which are stored in the retrieval augmented generation model. On this basis, a call chain structure tree is constructed with permissions as the center, utilizing the multi-head attention mechanism to aggregate semantic information of different code granularities from bottom to top, with each attention head corresponding to a role engineering objective. Finally, the root vectors of each permission tree are subjected to self-supervised clustering to adaptively determine the number of roles and perform division. We evaluated MA-RAG on 284 real-world software systems, and the results show that compared with other methods, MA-RAG can significantly save time overhead, reduce the number of generated roles, lower the role permission overlap rate, and improve the interpretability score.

ICML Conference 2025 Conference Paper

MERIT: Maximum-normalized Element-wise Ratio for Language Model Large-batch Training

  • Yang Luo
  • Zangwei Zheng
  • Ziheng Qin
  • Zirui Zhu
  • Yong Liu 0020
  • Yang You 0001

Large-batch training has become a cornerstone in accelerating the training of deep neural networks, yet it poses challenges in optimization and generalization. Existing optimizers like AdamW present performance degradation during language models’ large-batch training, due to the information bottleneck in attention layers caused by the sharp increase of max attention logit. While the LAMB optimizer partially addresses this issue, some attention layers still face this issue. The reason is that $l_2$-norm-based trust ratios in LAMB are less effective in directly influencing the max value of query/key weights. Furthermore, the weight-wise trust ratio in LAMB is error-prone as it overlooks relationships of weight values within rows or columns. Building on these observations, we propose a novel optimizer, MERIT, which leverages the max-norm to calculate the trust ratio to constrain the max attention logit more effectively. Moreover, we further construct element-wise trust ratios to provide more robust update scaling by focusing on local weight structures. Extensive experiments of large-batch training across various sizes of GPT-2 models demonstrate the superior performance of MERIT. Notably, during the training of GPT-2 Medium, MERIT enables a 6k batch size without any performance degradation compared to the standard batch size (480) with 48B training tokens. This work highlights the importance of considering the max attention logit and finer-granularity trust ratio in large-batch training. It successfully improves the training stability and paves the way for larger batch usage, enabling faster development and iteration of large language models. Code is available at https://github.com/NUS-HPC-AI-Lab/MERIT.
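As a rough illustration of the core idea, a max-norm trust ratio can be computed per weight row and used to scale that row's update (the element-wise refinement, momentum, and full optimizer state are omitted; this simplified form is an assumption for exposition, not the paper's exact update rule):

```python
def max_norm_trust_ratio(weight_row, update_row, eps=1e-8):
    """Ratio of the row's max weight magnitude to its max update
    magnitude, bounding how much the update can move the largest entry."""
    w_max = max(abs(w) for w in weight_row)
    u_max = max(abs(u) for u in update_row)
    return w_max / (u_max + eps)

def merit_like_step(weights, updates, lr=0.1):
    """Apply one trust-ratio-scaled update per row (simplified sketch)."""
    return [[w - lr * max_norm_trust_ratio(wr, ur) * u
             for w, u in zip(wr, ur)]
            for wr, ur in zip(weights, updates)]
```

Because the ratio is built from max-norms rather than $l_2$-norms, it directly constrains the largest query/key weight values, which is what drives the max attention logit.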

ICML Conference 2025 Conference Paper

SeedLoRA: A Fusion Approach to Efficient LLM Fine-Tuning

  • Yong Liu 0020
  • Di Fu
  • Shenggan Cheng
  • Zirui Zhu
  • Yang Luo
  • Minhao Cheng
  • Cho-Jui Hsieh
  • Yang You 0001

Despite Low-Rank Adaptation (LoRA)’s popularity for fine-tuning large models, it often exhibits a noticeable performance gap compared to full fine-tuning, particularly in complex tasks such as mathematical reasoning and code generation. Motivated by this discrepancy, we propose a novel fusion approach for LoRA fine-tuned models. Our key insight is that LoRA models trained with different random seeds on the same task often exhibit complementary strengths. In contrast to existing research that typically focuses on fusing models trained on diverse tasks, we explore the potential of combining multiple LoRA models fine-tuned on the same task with different random seeds. This intra-task fusion method aims to leverage the strengths of various fine-tuned models to create a more robust and effective adaptation. To validate our approach, we conducted comprehensive experiments across three key areas: mathematical reasoning, code generation, and general instruction-tuning tasks. The results demonstrate that our fusion method significantly enhances LoRA’s performance, outperforming both standalone LoRA models and current fusion methods. Notably, this advancement substantially narrows the gap between LoRA and full fine-tuning, thus offering a more effective approach to model adaptation without the GPU memory burden of full parameter fine-tuning.
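The abstract does not state the fusion rule. The simplest instantiation of intra-task fusion is a uniform average of the adapters' weight deltas, sketched here as an assumed baseline rather than SeedLoRA's actual method:

```python
def fuse_lora_deltas(deltas):
    """Uniformly average the weight deltas (e.g. the B @ A products) of
    several LoRA runs trained on the same task with different seeds."""
    n = len(deltas)
    return {name: [sum(d[name][i] for d in deltas) / n
                   for i in range(len(deltas[0][name]))]
            for name in deltas[0]}
```

Averaging same-task adapters is cheap at inference time: the fused delta is merged into the base weights once, so serving costs match a single LoRA model.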

JBHI Journal 2023 Journal Article

A Benchmark Dataset of Endoscopic Images and Novel Deep Learning Method to Detect Intestinal Metaplasia and Gastritis Atrophy

  • Jie Yang
  • Yan Ou
  • Zhiqian Chen
  • Juan Liao
  • Wenjian Sun
  • Yang Luo
  • Chunbo Luo

Endoscopy has been routinely used to diagnose stomach diseases including intestinal metaplasia (IM) and gastritis atrophy (GA). Such routine examination usually demands a highly skilled radiologist to spend substantial time on each patient, causing two key challenges: 1) dependency on the radiologist's experience, leading to inconsistent diagnosis results across radiologists; 2) limited examination efficiency due to the time and energy demanded of the radiologist. This paper proposes to address these two issues in endoscopy with novel machine learning methods in three folds. Firstly, we build a novel and relatively large endoscopy dataset of 21,420 images from the widely used White Light Imaging (WLI) endoscopy and the more recent Linked Color Imaging (LCI) endoscopy, annotated by experienced radiologists and validated with biopsy results, presenting a benchmark dataset. Secondly, we propose a novel machine learning model inspired by the human visual system, named local attention grouping, to effectively extract key visual features, further improved by learning from multiple randomly selected regional images via ensemble learning. This method avoids a significant problem in deep learning methods that decrease the resolution of original images to reduce input size, which removes smaller lesions from endoscopy images. Finally, we propose a dual transfer learning strategy to train the model with co-distributed features between WLI and LCI images to further improve performance. The experimental results, measured by accuracy, specificity, sensitivity, positive detection rate and negative detection rate, are 99.18%, 98.90%, 99.45%, 99.45% and 98.91% on IM, and 97.12%, 95.34%, 98.90%, 98.86% and 95.50% on GA, respectively, achieving state-of-the-art performance that outperforms current mainstream deep learning models.

NeurIPS Conference 2023 Conference Paper

Response Length Perception and Sequence Scheduling: An LLM-Empowered LLM Inference Pipeline

  • Zangwei Zheng
  • Xiaozhe Ren
  • Fuzhao Xue
  • Yang Luo
  • Xin Jiang
  • Yang You

Large language models (LLMs) have revolutionized the field of AI, demonstrating unprecedented capacity across various tasks. However, the inference process for LLMs comes with significant computational costs. In this paper, we propose an efficient LLM inference pipeline that harnesses the power of LLMs. Our approach begins by tapping into the potential of LLMs to accurately perceive and predict the response length with minimal overhead. By leveraging this information, we introduce an efficient sequence scheduling technique that groups queries with similar response lengths into micro-batches. We evaluate our approach on real-world instruction datasets using the LLaMA-based model, and our results demonstrate an impressive 86% improvement in inference throughput without compromising effectiveness. Notably, our method is orthogonal to other inference acceleration techniques, making it a valuable addition to many existing toolkits (e.g., FlashAttention, Quantization) for LLM inference.
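The scheduling step can be sketched as sort-then-slice: once response lengths are predicted, queries are ordered by predicted length and cut into micro-batches so each batch wastes little padding (the function and parameter names are illustrative, not the paper's API):

```python
def schedule_by_predicted_length(predicted_lens, micro_batch_size):
    """Group query indices with similar predicted response lengths into
    micro-batches, so padding within each batch stays small."""
    order = sorted(range(len(predicted_lens)),
                   key=lambda i: predicted_lens[i])
    return [order[i:i + micro_batch_size]
            for i in range(0, len(order), micro_batch_size)]
```

Without this grouping, a single long-response query forces every other sequence in its batch to be padded (and decoded) to the longest length, which is where the throughput gain comes from.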

JBHI Journal 2022 Journal Article

A Fully Deep Learning Paradigm for Pneumoconiosis Staging on Chest Radiographs

  • Wenjian Sun
  • Dongsheng Wu
  • Yang Luo
  • Lu Liu
  • Hongjing Zhang
  • Shuang Wu
  • Yan Zhang
  • Chenglong Wang

Pneumoconiosis staging has been a very challenging task, both for certified radiologists and computer-aided detection algorithms. Although deep learning has shown proven advantages in the detection of pneumoconiosis, it remains challenging in pneumoconiosis staging due to the stage ambiguity of pneumoconiosis and noisy samples caused by misdiagnosis when they are used in training deep learning models. In this article, we propose a fully deep learning pneumoconiosis staging paradigm that comprises a segmentation procedure and a staging procedure. The segmentation procedure extracts lung fields in chest radiographs through an Asymmetric Encoder-Decoder Network (AED-Net) that can mitigate the domain shift between multiple datasets. The staging procedure classifies the lung fields into four stages through our proposed deep log-normal label distribution learning and focal staging loss. The two cascaded procedures can effectively solve the problem of model overfitting caused by stage ambiguity and noisy labels of pneumoconiosis. Besides, we collect a clinical chest radiograph dataset of pneumoconiosis from the certified radiologist's diagnostic reports. The experimental results on this novel pneumoconiosis dataset confirm that the proposed deep pneumoconiosis staging paradigm achieves an Accuracy of 90.4%, a Precision of 84.8%, a Sensitivity of 78.4%, a Specificity of 95.6%, an F1-score of 80.9% and an Area Under the Curve (AUC) of 96%. In particular, we achieve 68.4% Precision, 76.5% Sensitivity, 95% Specificity, 72.2% F1-score and 89% AUC on the early pneumoconiosis ‘stage-1’.
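The log-normal label distribution idea can be illustrated by converting a hard stage label into a soft distribution over the four stages (the σ value is a free parameter chosen here for illustration; the paper's exact parameterization is not given in the abstract):

```python
import math

def lognormal_label_distribution(true_stage, n_stages=4, sigma=0.5):
    """Soft label over stages: a log-normal density centred on the true
    stage, normalized to sum to 1, encoding stage ambiguity so that
    adjacent stages receive nonzero probability mass."""
    mu = math.log(true_stage)
    density = [math.exp(-(math.log(s) - mu) ** 2 / (2 * sigma ** 2))
               / (s * sigma * math.sqrt(2 * math.pi))
               for s in range(1, n_stages + 1)]
    total = sum(density)
    return [d / total for d in density]
```

Training against such a distribution instead of a one-hot label penalizes a prediction one stage off far less than one three stages off, which matches the clinical ambiguity between neighboring stages.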