Arrow Research search

Author name cluster

Zhiteng Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

5

ICLR 2025 Conference Paper

ARB-LLM: Alternating Refined Binarizations for Large Language Models

  • Zhiteng Li
  • Xianglong Yan
  • Tianao Zhang
  • Haotong Qin
  • Dong Xie
  • Jiang Tian
  • Zhongchao Shi
  • Linghe Kong

Large Language Models (LLMs) have greatly pushed forward advancements in natural language processing, yet their high memory and computational demands hinder practical deployment. Binarization, as an effective compression technique, can shrink model weights to just 1 bit, significantly reducing the high demands on computation and memory. However, current binarization methods struggle to narrow the distribution gap between binarized and full-precision weights, while also overlooking the column deviation in LLM weight distribution. To tackle these issues, we propose ARB-LLM, a novel 1-bit post-training quantization (PTQ) technique tailored for LLMs. To narrow the distribution shift between binarized and full-precision weights, we first design an alternating refined binarization (ARB) algorithm to progressively update the binarization parameters, which significantly reduces the quantization error. Moreover, considering the pivotal role of calibration data and the column deviation in LLM weights, we further extend ARB to ARB-X and ARB-RC. In addition, we refine the weight partition strategy with a column-group bitmap (CGB), which further enhances performance. Equipping ARB-X and ARB-RC with CGB, we obtain ARB-LLM$_{\text{X}}$ and ARB-LLM$_{\text{RC}}$ respectively, which significantly outperform state-of-the-art (SOTA) binarization methods for LLMs. As a binary PTQ method, our ARB-LLM$_{\text{RC}}$ is the first to surpass FP16 models of the same size. Code: https://github.com/ZHITENGLI/ARB-LLM.
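The alternating refinement idea described in the abstract can be illustrated with a small coordinate-descent loop; the function name, the row-wise scale/shift parameterization, and the iteration count below are illustrative assumptions, not the paper's actual ARB algorithm.

```python
import numpy as np

def binarize_alternating(W, iters=10):
    # Approximate W ~ alpha * B + mu with B in {-1, +1}, alternately
    # refining the row-wise scale alpha and shift mu (coordinate descent).
    mu = W.mean(axis=1, keepdims=True)
    alpha = np.abs(W - mu).mean(axis=1, keepdims=True)
    for _ in range(iters):
        B = np.sign(W - mu)
        B[B == 0] = 1.0
        # refit alpha, then mu; each is an exact least-squares update
        alpha = ((W - mu) * B).mean(axis=1, keepdims=True)
        mu = (W - alpha * B).mean(axis=1, keepdims=True)
    return alpha, B, mu

W = np.random.randn(4, 16)
alpha, B, mu = binarize_alternating(W)
residual = np.linalg.norm(W - (alpha * B + mu))
```

Because each update exactly minimizes the reconstruction error in one variable while the others are fixed, the quantization error is non-increasing over iterations, which is the behavior the abstract attributes to progressive parameter refinement.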

ICML 2025 Conference Paper

BiMaCoSR: Binary One-Step Diffusion Model Leveraging Flexible Matrix Compression for Real Super-Resolution

  • Kai Liu 0034
  • Kaicheng Yang
  • Zheng Chen 0014
  • Zhiteng Li
  • Yong Guo
  • Wenbo Li 0001
  • Linghe Kong
  • Yulun Zhang 0001

While super-resolution (SR) methods based on diffusion models (DMs) have demonstrated inspiring performance, their deployment is impeded by heavy memory and computation requirements. Recent work applies two kinds of methods to compress or accelerate DMs. One compresses the DM to 1 bit, aka binarization, alleviating the storage and computation pressure. The other distills the multi-step DM into a single step, significantly speeding up the inference process. Nonetheless, deploying DMs on resource-limited edge devices remains infeasible. To address this problem, we propose BiMaCoSR, which combines binarization and one-step distillation to obtain extreme compression and acceleration. To prevent the catastrophic collapse of the model caused by binarization, we propose a sparse matrix branch (SMB) and a low-rank matrix branch (LRMB). Both auxiliary branches pass full-precision (FP) information, but in different ways. SMB absorbs the extreme values, and its output is high rank, carrying abundant FP information. The design of LRMB, in contrast, is inspired by LoRA and is initialized with the top-r SVD components, outputting a low-rank representation. The computation and storage overhead of our proposed branches is negligible. Comprehensive comparison experiments show that BiMaCoSR outperforms current state-of-the-art binarization methods and achieves competitive performance compared with the FP one-step model. Moreover, we achieve excellent compression and acceleration: BiMaCoSR achieves a 23.8x compression ratio and a 27.4x speedup over its FP counterpart. Our code and model are available at https://github.com/Kai-Liu001/BiMaCoSR.
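A rough sketch of the two auxiliary branches described above: a sparse branch that absorbs extreme weights and a LoRA-style low-rank branch initialized from the top-r SVD components of the remainder, with the residual binarized. The function name, the outlier threshold `tau`, and the single-scale binarization are assumptions for illustration, not BiMaCoSR's actual construction.

```python
import numpy as np

def split_branches(W, r=4, tau=2.5):
    # Sparse branch: absorb extreme values beyond tau standard deviations.
    mask = np.abs(W) > tau * W.std()
    S = np.where(mask, W, 0.0)
    R = W - S
    # Low-rank branch: LoRA-style, initialized with the top-r SVD components.
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    L = U[:, :r] @ np.diag(s[:r]) @ Vt[:r]
    # Binarize what is left with a single full-precision scale.
    D = R - L
    alpha = np.abs(D).mean()
    B = np.sign(D)
    B[B == 0] = 1.0
    return S, L, alpha, B
```

Since the two branches remove the outlier energy and the dominant low-rank structure before binarization, the reconstruction `S + L + alpha * B` is noticeably closer to `W` than plain one-scale binarization, which mirrors the collapse-prevention role the abstract assigns to SMB and LRMB.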

ICLR 2025 Conference Paper

GenDataAgent: On-the-fly Dataset Augmentation with Synthetic Data

  • Zhiteng Li
  • Lele Chen
  • Jerone T. A. Andrews
  • Yunhao Ba
  • Yulun Zhang 0001
  • Alice Xiang

We propose a generative agent that augments training datasets with synthetic data for model fine-tuning. Unlike prior work, which uniformly samples synthetic data, our agent iteratively generates relevant samples on-the-fly, aligning with the target distribution. It prioritizes synthetic data that complements difficult training samples, focusing on those with high variance in gradient updates. Experiments across several image classification tasks demonstrate the effectiveness of our approach.
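The prioritization criterion mentioned above, favoring samples with high variance in gradient updates, could be sketched as follows; the function names, the use of per-sample gradient norms as the tracked statistic, and the top-k selection are assumptions for illustration, not GenDataAgent's actual procedure.

```python
import numpy as np

def select_difficult(per_step_grads, k):
    # per_step_grads: array of shape (steps, num_samples) holding each
    # sample's gradient norm recorded over the last few update steps.
    scores = per_step_grads.var(axis=0)   # high variance ~ "difficult"
    return np.argsort(scores)[::-1][:k]   # indices of the top-k samples
```

The returned indices would then point the generator at the training samples whose synthetic complements are most likely to help.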

ICRA 2025 Conference Paper

LE-Object: Language Embedded Object-Level Neural Radiance Fields for Open-Vocabulary Scene

  • Mengting Wang
  • Yunzhou Zhang
  • Xingshuo Wang
  • Zhiyao Zhang
  • Zhiteng Li

Recent advancements in Visual Language Models (VLMs) have significantly driven research in open-vocabulary 3D scene reconstruction, showcasing strong potential in open-set retrieval and semantic understanding. However, existing approaches face challenges in open-world environments: they either suffer from insufficient precision in semantic segmentation, leading to inadequate fine-grained scene understanding, or they are limited to object-level reconstruction, failing to capture intricate object details and lacking applicability in open-world settings. To address these issues, we introduce LE-Object, an object-centric Neural Implicit Radiance Field (NeRF) method for open-world scenarios that achieves fine-grained scene understanding and high-fidelity object reconstruction. LE-Object integrates spatial features (SF) from object point clouds with visual features (VF) from VLMs to perform object association, ensuring spatiotemporal consistency in object mask segmentation, and extends VLM features from 2D images into 3D space, enabling precise open-world semantic inference and detailed object reconstruction. Experimental results demonstrate that LE-Object excels in zero-shot semantic segmentation and open-world object reconstruction, offering innovative solutions for global navigation and local object manipulation in open-world applications.
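The SF/VF object association described above might be scored as a weighted mix of two feature similarities; the function names, cosine similarity, and the equal weighting `w=0.5` are purely illustrative assumptions, not LE-Object's actual association rule.

```python
import numpy as np

def cosine(a, b):
    # Cosine similarity with a small epsilon for numerical safety.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def associate(obj_sf, obj_vf, cand_sf, cand_vf, w=0.5):
    # Score each candidate by mixing spatial-feature (SF) and
    # visual-feature (VF) similarity; return the best-matching index.
    scores = [w * cosine(obj_sf, s) + (1 - w) * cosine(obj_vf, v)
              for s, v in zip(cand_sf, cand_vf)]
    return int(np.argmax(scores))
```

Matching on both cues is what gives the spatiotemporal consistency the abstract claims: a candidate must agree with the tracked object both geometrically and visually.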

IROS 2024 Conference Paper

FI-SLAM: Feature Fusion and Instance Reconstruction for Neural Implicit SLAM

  • Xingshuo Wang
  • Yunzhou Zhang
  • Zhiyao Zhang
  • Mengting Wang
  • Zhiteng Li
  • Xuanhua Chen

Recent advancements in neural implicit fields for Simultaneous Localization and Mapping (SLAM) have provided breakthroughs. However, the reconstruction results contribute little to the perception ability of the robot. Therefore, we propose FI-SLAM, a dense semantic instance SLAM system based on neural implicit representation, which significantly helps robots better understand the scene. FI-SLAM employs a joint coordinate-and-plane encoding method, which reduces the difficulty of feature storage by flattening the feature space. Furthermore, to improve representation efficiency, we describe features by linear interpolation between adjacent feature levels. We propose a feature fusion (FF) method to merge the object features with the scene features. The fused feature vector enhances the reconstruction accuracy of the local scene while preserving the global reconstruction quality, improving both the global reconstruction of the scene and the accuracy of camera tracking. Numerous experiments on synthetic and real-world datasets demonstrate that our method ensures accurate tracking, high-fidelity reconstruction, and complete semantic instance maps. In summary, the proposed algorithm substantially augments the scene perception capabilities of the robot.
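The interpolation between adjacent feature levels mentioned above can be sketched as a one-dimensional blend; the function name, the clamping behavior, and the list-of-arrays representation are illustrative assumptions rather than FI-SLAM's actual encoding.

```python
import numpy as np

def interp_feature_levels(levels, t):
    # Blend the two stored feature levels adjacent to a continuous
    # level coordinate t (clamped so both neighbors exist).
    lo = max(0, min(int(np.floor(t)), len(levels) - 2))
    w = t - lo
    return (1 - w) * levels[lo] + w * levels[lo + 1]
```

Storing only a few discrete levels and interpolating between them is one common way to get a continuous feature description without a dense grid, which matches the efficiency motivation in the abstract.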