TIST 2026 Journal Article
BUZZ: Beehive-structured Sparse KV Cache with Segmented Heavy Hitters for Efficient LLM Inference
- Junqi Zhao
- Zhijin Fang
- Shu Li
- Shaohui Yang
- Shichao He
Large language models (LLMs) are critical to natural language processing but face challenges in inference speed and computational efficiency, hindering real-time applications. The key-value (KV) cache mechanism reduces computational overhead in transformer models; however, efficient contextual understanding remains problematic. In this paper, we introduce BUZZ, an innovative KV caching algorithm that leverages structured contextual information to optimize cache memory usage while enhancing inference speed. The core idea of BUZZ is interval sampling of historically significant tokens to preserve sentence-structure information, ensuring that the historical tokens retained in the KV cache remain distributed at nearly equal intervals. Tokens recently evicted from the sliding window undergo local-max sampling based on their attention values, preserving crucial contextual information. Additionally, we propose BUZZ with \(\log n\), an extension that improves performance under extreme compression and long-context settings. Evaluations on five real-world datasets (CNN/Daily Mail, XSUM, LongBench, Wikitext, and 10-QA) demonstrate that BUZZ (1) achieves a 2.5\(\times\) reduction in cache memory usage for LLM inference while maintaining over 99% accuracy in long-text summarization, and (2) surpasses the state of the art in multi-document question answering by 7.69% under equivalent memory constraints, avoiding the out-of-memory issues faced by full-cache approaches. Furthermore, BUZZ achieves substantial inference speed improvements with \(O(\log n)\) time complexity. The implementation of BUZZ is available at: https://github.com/JunqiZhao888/buzz-llm.
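The cache-selection mechanism summarized in the abstract (keeping attention-sink tokens, a sliding window of recent tokens, and local-max "heavy hitter" tokens sampled per fixed-size segment so that retained history stays spread at near-equal intervals) can be sketched as follows. This is a minimal illustrative sketch under simplified assumptions, not the released implementation: the function name, segment and window sizes, and the per-token accumulated-attention array are all hypothetical.

```python
# Illustrative BUZZ-style KV cache token selection (assumption: per-token
# accumulated attention scores are available as a 1-D array).
import numpy as np

def buzz_keep_indices(attn_scores, n_sink=4, window=128, segment=16):
    """Return indices of historical tokens to keep in the KV cache.

    attn_scores : accumulated attention each historical token has received.
    n_sink      : always keep the first few "sink" tokens.
    window      : always keep the most recent tokens (sliding window).
    segment     : middle tokens are split into equal-sized cells; the locally
                  strongest token in each cell is kept, so retained heavy
                  hitters stay distributed at nearly equal intervals.
    """
    n = len(attn_scores)
    keep = set(range(min(n_sink, n)))              # attention sinks
    keep |= set(range(max(n - window, 0), n))      # recent sliding window

    # Middle region: local-max sampling within each fixed-size segment.
    for start in range(n_sink, max(n - window, n_sink), segment):
        end = min(start + segment, n - window)
        if end > start:
            local = start + int(np.argmax(attn_scores[start:end]))
            keep.add(local)
    return sorted(keep)

# Toy usage: 512 historical tokens with random accumulated attention.
scores = np.random.rand(512)
kept = buzz_keep_indices(scores)
print(f"kept {len(kept)} of {len(scores)} tokens")
```

Under these example settings the cache shrinks to roughly the sink tokens, the recent window, and one heavy hitter per segment, which is the kind of structured sparsity the paper attributes its memory savings to; the exact segmentation and scoring policy in BUZZ should be taken from the linked repository.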