Author name cluster

Bin Gao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

2 author rows

AAAI Conference 2025 Conference Paper

AdaSkip: Adaptive Sublayer Skipping for Accelerating Long-Context LLM Inference

Zhuomin He
Yizhen Yao
Pengfei Zuo
Bin Gao
Qinya Li
Zhenzhe Zheng
Fan Wu

Long-context large language model (LLM) inference is increasingly critical, motivating a number of studies devoted to alleviating the substantial storage and computational costs in such scenarios. Layer-wise skipping methods are promising optimizations but are rarely explored in long-context inference. We observe that existing layer-wise skipping strategies have several limitations when applied to long-context inference, including the inability to adapt to model and context variability, disregard for sublayer significance, and inapplicability to the prefilling phase. This paper proposes AdaSkip, an adaptive sublayer skipping method specifically designed for long-context inference. AdaSkip adaptively identifies less important layers by leveraging on-the-fly similarity information, enables sublayer-wise skipping, and accelerates both the prefilling and decoding phases. The effectiveness of AdaSkip is demonstrated through extensive experiments on various long-context benchmarks and models, showcasing its superior inference performance over existing baselines.
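
The abstract above does not spell out the similarity measure, so the following is only a minimal sketch of the general idea, assuming that a sublayer whose output is nearly parallel to its input (high cosine similarity) changes the hidden state little and is therefore a candidate to skip. The function names, the sublayer names, and the toy vectors are all hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def pick_sublayers_to_skip(io_pairs, num_to_skip):
    """Return the names of the sublayers whose output is most similar to their input.

    io_pairs maps a sublayer name to an (input_vector, output_vector) pair.
    Assumption: high input/output similarity means low importance.
    """
    scores = {name: cosine_similarity(inp, out) for name, (inp, out) in io_pairs.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:num_to_skip]


# Toy hidden states for four hypothetical sublayers.
io_pairs = {
    "layer0.attn": ([1.0, 0.0], [0.0, 1.0]),  # output orthogonal to input
    "layer0.ffn": ([1.0, 2.0], [1.1, 2.1]),   # output almost parallel to input
    "layer1.attn": ([1.0, 1.0], [1.0, 1.0]),  # output identical to input
    "layer1.ffn": ([1.0, 0.0], [1.0, 1.0]),   # output at 45 degrees to input
}
skipped = pick_sublayers_to_skip(io_pairs, num_to_skip=2)
print(skipped)
```

Scoring attention and feed-forward sublayers separately, rather than whole layers, is what makes the ranking sublayer-wise in this sketch.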

ICML Conference 2025 Conference Paper

Distributed Retraction-Free and Communication-Efficient Optimization on the Stiefel Manifold

Yilong Song
Peijin Li
Bin Gao
Kun Yuan

Optimization problems on the Stiefel manifold, ranging from principal component analysis to enhancing neural network robustness, are ubiquitous in machine learning. The Landing algorithm avoids computationally expensive retraction operations on manifolds, making it highly competitive for large-scale problems. This paper extends this method to distributed settings, introducing EF-Landing, the first retraction-free and communication-efficient algorithm for distributed stochastic optimization on the Stiefel manifold. By incorporating communication compression and error feedback, EF-Landing ensures convergence and constraint feasibility while significantly reducing communication overhead. We provide sharp convergence guarantees, demonstrating that EF-Landing achieves the same asymptotic linear speedup convergence rate as existing methods without communication compression. Furthermore, our analysis is highly versatile, applying to both deterministic and stochastic settings and encompassing algorithms based on gradient descent or momentum-based gradient descent. We also generalize EF-Landing to operate on block-wise Stiefel manifolds, enabling greater flexibility for structured constraints. Extensive numerical experiments validate our theoretical results.
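
As a rough illustration of the "communication compression and error feedback" ingredient, here is a minimal sketch of the classic error-feedback scheme with a top-k compressor. It is a generic textbook variant, not necessarily the one used in EF-Landing, and it leaves out the manifold part entirely; the class name, `top_k`, and the toy gradient are assumptions made for this example.

```python
def top_k(vec, k):
    """Keep the k largest-magnitude entries of vec and zero out the rest."""
    keep = set(sorted(range(len(vec)), key=lambda i: abs(vec[i]), reverse=True)[:k])
    return [v if i in keep else 0.0 for i, v in enumerate(vec)]


class ErrorFeedbackCompressor:
    """Classic error feedback: whatever compression drops is carried to the next round."""

    def __init__(self, dim, k):
        self.k = k
        self.residual = [0.0] * dim  # accumulated compression error

    def compress(self, grad):
        # Add back the error left over from earlier rounds before compressing.
        corrected = [g + r for g, r in zip(grad, self.residual)]
        message = top_k(corrected, self.k)
        # Store what was not transmitted so it is not lost.
        self.residual = [c - m for c, m in zip(corrected, message)]
        return message


worker = ErrorFeedbackCompressor(dim=3, k=1)
grad = [3.0, 2.0, -0.5]           # hypothetical local gradient, reused for two rounds
first = worker.compress(grad)     # only the largest entry is sent
second = worker.compress(grad)    # the previously dropped entry now wins
print(first, second, worker.residual)
```

The invariant worth noticing is that the transmitted messages plus the stored residual always add up to the sum of the gradients seen so far, so compression delays information rather than discarding it.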

IS Journal 2024 Journal Article

Improved Small Object Detection Algorithm Based on YOLOv5

Bo Xu
Bin Gao
Yunhu Li

YOLOv5 is a popular object detection algorithm that is widely used in various industrial fields, especially in the field of autonomous driving. However, this algorithm has problems, such as false positives and false negatives when detecting small targets. The article proposes an improved method for small object detection using YOLOv5s. First, a multilevel feature fusion detection head is proposed to extract larger feature maps from the backbone of the model, improving the ability to extract features of small objects. Second, a decoupled attention mechanism is introduced at each detection head, which separates the detection of object box position, object box confidence, and class probability to reduce confusion between different feature information. Finally, the focal minimum points distance intersection over union loss function is adopted to mitigate the effects of class imbalance and poor-quality object pixels.

JMLR Journal 2024 Journal Article

Infeasible Deterministic, Stochastic, and Variance-Reduction Algorithms for Optimization under Orthogonality Constraints

Pierre Ablin
Simon Vary
Bin Gao
Pierre-Antoine Absil

Orthogonality constraints naturally appear in many machine learning problems, from principal component analysis to robust neural network training. They are usually solved using Riemannian optimization algorithms, which minimize the objective function while enforcing the constraint. However, enforcing the orthogonality constraint can be the most time-consuming operation in such algorithms. Recently, Ablin and Peyré (2022) proposed the landing algorithm, a method with cheap iterations that does not enforce the orthogonality constraints but is attracted towards the manifold in a smooth manner. This article provides new practical and theoretical developments for the landing algorithm. First, the method is extended to the Stiefel manifold, the set of rectangular orthogonal matrices. We also consider stochastic and variance reduction algorithms when the cost function is an average of many functions. We demonstrate that all these methods have the same rate of convergence as their Riemannian counterparts that exactly enforce the constraint, and converge to the manifold. Finally, our experiments demonstrate the promise of our approach to an array of machine-learning problems that involve orthogonality constraints.
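
To make "cheap iterations that are attracted towards the manifold" concrete, the sketch below runs a landing-style iteration on a tiny PCA-like problem. It assumes the commonly cited form of the landing field, skew(G Xᵀ) X + λ X (XᵀX − I) with G the Euclidean gradient; that formula is recalled from the literature rather than stated on this page, and the matrix `A`, the step size, and `lam` are illustrative choices only.

```python
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def sub(a, b):
    return [[x - y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def scale(a, s):
    return [[s * x for x in row] for row in a]


def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def landing_field(x, grad, lam):
    """Assumed landing field: skew(G X^T) X + lam * X (X^T X - I)."""
    g_xt = matmul(grad, transpose(x))
    psi = scale(sub(g_xt, transpose(g_xt)), 0.5)          # skew-symmetric part
    gram_err = sub(matmul(transpose(x), x), identity(len(x[0])))
    return add(matmul(psi, x), scale(matmul(x, gram_err), lam))


def orthogonality_error(x):
    """Frobenius norm of X^T X - I, i.e. the distance-like constraint violation."""
    err = sub(matmul(transpose(x), x), identity(len(x[0])))
    return math.sqrt(sum(v * v for row in err for v in row))


A = [[3.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]  # toy symmetric matrix


def objective(x):
    """f(X) = -1/2 * trace(X^T A X)."""
    m = matmul(transpose(x), matmul(A, x))
    return -0.5 * sum(m[i][i] for i in range(len(m)))


def gradient(x):
    """Euclidean gradient of the objective: -A X."""
    return scale(matmul(A, x), -1.0)


s = 1.0 / math.sqrt(2.0)
x = [[s, 0.0], [s, 0.0], [0.0, 1.0]]   # 3 x 2 starting point with orthonormal columns
f_start = objective(x)
for _ in range(500):
    # No retraction or re-orthogonalization: just a plain step along the field.
    x = sub(x, scale(landing_field(x, gradient(x), lam=1.0), 0.1))
f_end = objective(x)
print(f_start, f_end, orthogonality_error(x))
```

Each step uses only matrix products, which is the cost advantage over a retraction based on a QR or polar factorization; the λ term is what pulls iterates back toward XᵀX = I.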

IS Journal 2018 Journal Article

Building a Globally Optimized Computational Intelligent Image Processing Algorithm for On-Site Inference of Nitrogen in Plants

Susanto B. Sulistyo
W.L. Woo
S.S. Dlay
Bin Gao

Estimating nutrient content in plants is a crucial task in the application of precision farming. This task is more challenging when it is conducted nondestructively from plant images captured in the field, because lighting conditions vary. This paper proposes a computational intelligence image processing approach to analyze nitrogen status in wheat plants. We developed an ensemble of deep learning multilayer perceptrons using committee machines for color normalization and image segmentation. This paper also focuses on building a genetic-algorithm-based global optimization to fine-tune the color normalization and nitrogen estimation results. We found that the proposed method can successfully normalize plant images by reducing color variability compared with other color normalization techniques. Furthermore, this algorithm improves the nitrogen estimation results compared with other non-global optimization methods as well as the widely used SPAD-meter-based nitrogen measurement.

AAAI Conference 2014 Conference Paper

Learning Deep Representations for Graph Clustering

Fei Tian
Bin Gao
Qing Cui
Enhong Chen
Tie-Yan Liu

Recently, deep learning has been successfully adopted in many applications such as speech recognition and image classification. In this work, we explore the possibility of employing deep learning in graph clustering. We propose a simple method, which first learns a nonlinear embedding of the original graph with a stacked autoencoder, and then runs the k-means algorithm on the embedding to obtain the clustering result. We show that this simple method has a solid theoretical foundation, due to the similarity between autoencoders and spectral clustering in terms of what they actually optimize. Then, we demonstrate that the proposed method is more efficient and flexible than spectral clustering. First, the computational complexity of the autoencoder is much lower than that of spectral clustering: the former can be linear in the number of nodes in a sparse graph, while the latter is superquadratic due to eigenvalue decomposition. Second, when an additional sparsity constraint is imposed, we can simply employ the sparse autoencoder developed in the deep learning literature; however, it is not straightforward to implement a sparse spectral method. The experimental results on various graph datasets show that the proposed method significantly outperforms conventional spectral clustering, which clearly indicates the effectiveness of deep learning in graph clustering.
