Author name cluster

Yeqing Li

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

6 papers

2 author rows

TMLR Journal 2024 Journal Article

Greedy Growing Enables High-Resolution Pixel-Based Diffusion Models

Cristina Nader Vasconcelos
Abdullah Rashwan
Austin Waters
Trevor Walker
Keyang Xu
Jimmy Yan
Rui Qian
Yeqing Li

We address the long-standing problem of how to learn effective pixel-based image diffusion models at scale, introducing a remarkably simple greedy method for stable training of large-scale, high-resolution models without the need for cascaded super-resolution components. The key insight stems from careful pre-training of core components, namely, those responsible for text-to-image alignment vs. high-resolution rendering. We first demonstrate the benefits of scaling a Shallow UNet, with no down(up)-sampling enc(dec)oder. Scaling its deep core layers is shown to improve alignment, object structure, and composition. Building on this core model, we propose a greedy algorithm that grows the architecture into high-resolution end-to-end models, while preserving the integrity of the pre-trained representation, stabilizing training, and reducing the need for large high-resolution datasets. This enables a single-stage model capable of generating high-resolution images without the need for a super-resolution cascade. Our key results rely on public datasets and show that we are able to train non-cascaded models up to 8B parameters with no further regularization schemes. Vermeer, our full pipeline model trained with internal datasets to produce 1024×1024 images without cascades, is preferred over SDXL by 44.0% vs. 21.4% of human evaluators.


NeurIPS Conference 2024 Conference Paper

Subject-driven Text-to-Image Generation via Preference-based Reinforcement Learning

Yanting Miao
William Loh
Suraj Kothawade
Pascal Poupart
Abdullah Rashwan
Yeqing Li

Text-to-image generative models have recently attracted considerable interest, enabling the synthesis of high-quality images from textual prompts. However, these models often lack the capability to generate specific subjects from given reference images or to synthesize novel renditions under varying conditions. Methods like DreamBooth and Subject-driven Text-to-Image (SuTI) have made significant progress in this area. Yet, both approaches primarily focus on enhancing similarity to reference images and require expensive setups, often overlooking the need for efficient training and avoiding overfitting to the reference images. In this work, we present the $\lambda$-Harmonic reward function, which provides a reliable reward signal and enables early stopping for faster training and effective regularization. By combining the Bradley-Terry preference model, the $\lambda$-Harmonic reward function also provides preference labels for subject-driven generation tasks. We propose Reward Preference Optimization (RPO), which offers a simpler setup (requiring only 3% of the negative samples used by DreamBooth) and fewer gradient steps for fine-tuning. Unlike most existing methods, our approach does not require training a text encoder or optimizing text embeddings and achieves text-image alignment by fine-tuning only the U-Net component. Empirically, $\lambda$-Harmonic proves to be a reliable approach for model selection in subject-driven generation tasks. Based on preference labels and early stopping validation from the $\lambda$-Harmonic reward function, our algorithm achieves a state-of-the-art CLIP-I score of 0.833 and a CLIP-T score of 0.314 on DreamBench.
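As a rough illustration of how scalar rewards can be turned into preference labels with a Bradley-Terry model, the sketch below computes the probability that one sample is preferred over another. The function names and the reward values are hypothetical, and the sketch does not reproduce the $\lambda$-Harmonic reward or the RPO objective from the paper.

```python
import math


def bradley_terry_preference(reward_a: float, reward_b: float) -> float:
    """Probability that sample A is preferred over sample B.

    Under Bradley-Terry with scores exp(r):
    P(A > B) = exp(r_a) / (exp(r_a) + exp(r_b)) = sigmoid(r_a - r_b).
    """
    diff = reward_a - reward_b
    # Numerically stable sigmoid: never exponentiate a large positive number.
    if diff >= 0:
        return 1.0 / (1.0 + math.exp(-diff))
    e = math.exp(diff)
    return e / (1.0 + e)


def preference_label(reward_a: float, reward_b: float) -> int:
    """Hypothetical helper: 1 if A is at least as likely to be preferred, else 0."""
    return 1 if bradley_terry_preference(reward_a, reward_b) >= 0.5 else 0


# Hypothetical rewards for two generated images.
p = bradley_terry_preference(0.9, 0.2)
label = preference_label(0.9, 0.2)
```

Because the probability depends only on the reward difference, adding a constant to both rewards leaves the preference unchanged.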


AAAI Conference 2016 Conference Paper

Scalable Sequential Spectral Clustering

Yeqing Li
Junzhou Huang
Wei Liu

In the past decades, Spectral Clustering (SC) has become one of the most effective clustering approaches. Although it has been widely used, one significant drawback of SC is its expensive computation cost. Many efforts have been devoted to accelerating SC algorithms and promising results have been achieved. However, most of the existing algorithms rely on the assumption that data can be stored in the computer memory. When data cannot fit in the memory, these algorithms will suffer severe performance degradations. In order to overcome this issue, we propose a novel sequential SC algorithm for tackling large-scale clustering with limited computational resources, e.g., memory. We begin with investigating an effective way of approximating the graph affinity matrix via leveraging a bipartite graph. Then we choose a smart graph construction and optimization strategy to avoid random access to data. These efforts lead to an efficient SC algorithm whose memory usage is independent of the number of input data points. Extensive experiments carried out on large datasets demonstrate that the proposed sequential SC algorithm is up to a thousand times faster than the state-of-the-art methods.


ECAI Conference 2016 Conference Paper

Semi-Supervised Group Sparse Representation: Model, Algorithm and Applications

Longwen Gao
Yeqing Li
Junzhou Huang
Shuigeng Zhou

Group sparse representation (GSR) exploits group structure in data and works well on many problems. However, the group structure must be manually given in advance. In many practical scenarios such as classification, samples are grouped according to their labels. Constructing a consistent group structure in such cases is not easy. The reasons are: 1) samples may be incorrectly labeled; and 2) label assigning in big data is time-consuming and expensive. In this paper, we propose and formulate a new problem, semi-supervised group sparse representation (SS-GSR) to support group sparse representation among both labeled and unlabeled data, while learning a more robust group structure, which can be further exploited to more effectively represent other unlabeled data. We develop a model to tackle the SS-GSR problem, based on the manifold assumption in subspace segmentation that samples in the same group lie close in feature space and span the same subspace. We also propose an alternating algorithm to solve the model. Finally, we validate the model via extensive experiments.


AAAI Conference 2015 Conference Paper

Large-Scale Multi-View Spectral Clustering via Bipartite Graph

Yeqing Li
Feiping Nie
Heng Huang
Junzhou Huang

In this paper, we address the problem of large-scale multi-view spectral clustering. In many real-world applications, data can be represented in various heterogeneous features or views. Different views often provide different aspects of information that are complementary to each other. Several previous clustering methods have demonstrated that better accuracy can be achieved by integrating information from all the views than by using each view individually. One important class of such methods is multi-view spectral clustering, which is based on the graph Laplacian. However, existing methods are not applicable to large-scale problems because of their high computational complexity. To this end, we propose a novel large-scale multi-view spectral clustering approach based on the bipartite graph. Our method uses local manifold fusion to integrate heterogeneous features. To improve efficiency, we approximate the similarity graphs using bipartite graphs. Furthermore, we show that our method can be easily extended to handle the out-of-sample problem. Extensive experimental results on five benchmark datasets demonstrate the effectiveness and efficiency of the proposed method, where our method runs up to nearly 3000 times faster than the state-of-the-art methods.
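To make the idea of approximating a similarity graph with a bipartite graph more concrete, here is a minimal single-view sketch in the style of anchor-graph constructions. It is an assumption about the general technique, not the paper's exact method: the anchor points, the Gaussian kernel, and the parameters k and sigma are all hypothetical choices, and the multi-view fusion step is not shown.

```python
import math


def bipartite_affinity(points, anchors, k=2, sigma=1.0):
    """Build Z (n x m): each point links to its k nearest anchors; rows sum to 1."""
    Z = []
    for p in points:
        # Squared Euclidean distance from this point to every anchor.
        d2 = [sum((pi - ai) ** 2 for pi, ai in zip(p, a)) for a in anchors]
        nearest = sorted(range(len(anchors)), key=lambda j: d2[j])[:k]
        d_min = d2[nearest[0]]
        # Shifting by d_min avoids underflow and does not change the normalized row.
        weights = {j: math.exp(-(d2[j] - d_min) / (2.0 * sigma ** 2)) for j in nearest}
        total = sum(weights.values())
        Z.append([weights.get(j, 0.0) / total for j in range(len(anchors))])
    return Z


def approx_similarity(Z):
    """W = Z diag(column sums of Z)^-1 Z^T, a low-rank n x n similarity matrix."""
    n, m = len(Z), len(Z[0])
    col = [sum(Z[i][j] for i in range(n)) for j in range(m)]
    return [
        [sum(Z[i][j] * Z[l][j] / col[j] for j in range(m) if col[j] > 0) for l in range(n)]
        for i in range(n)
    ]


# Toy data: two tight groups of points and three hypothetical anchors.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
anchors = [(0.0, 0.0), (5.0, 5.0), (2.5, 2.5)]
Z = bipartite_affinity(points, anchors)
W = approx_similarity(Z)
```

The full n x n matrix W is formed here only for inspection. The efficiency gain comes from working with the n x m matrix Z directly, where the number of anchors m is much smaller than the number of points n.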


AAAI Conference 2014 Conference Paper

Sub-Selective Quantization for Large-Scale Image Search

Yeqing Li
Chen Chen
Wei Liu
Junzhou Huang

Recently, with the explosive growth of visual content on the Internet, large-scale image search has attracted intensive attention. It has been shown that mapping high-dimensional image descriptors to compact binary codes can lead to considerable efficiency gains in both storage and similarity computation of images. However, most existing methods still suffer from expensive training devoted to large-scale binary code learning. To address this issue, we propose a sub-selection based matrix manipulation algorithm which can significantly reduce the computational cost of code learning. As case studies, we apply the sub-selection algorithm to two popular quantization techniques: PCA Quantization (PCAQ) and Iterative Quantization (ITQ). Crucially, we can justify the resulting sub-selective quantization by proving its theoretic properties. Extensive experiments are carried out on three image benchmarks with up to one million samples, corroborating the efficacy of the sub-selective quantization method in terms of image retrieval.
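The storage and similarity-computation gains from binary codes can be illustrated with a small sketch. It assumes the descriptors have already been projected to a few real-valued dimensions and simply thresholds them at zero. The function names are hypothetical, and the sketch covers neither PCAQ, ITQ, nor the sub-selection step itself.

```python
def to_binary_code(vector):
    """Pack the signs of a real-valued vector into an int, one bit per dimension."""
    code = 0
    for value in vector:
        # Shift in a 1 for positive entries and a 0 otherwise.
        code = (code << 1) | (1 if value > 0 else 0)
    return code


def hamming_distance(code_a, code_b):
    """Number of bit positions where two packed codes differ."""
    return bin(code_a ^ code_b).count("1")


# Hypothetical 4-dimensional projected descriptors for a query and a database image.
query_code = to_binary_code([0.5, -1.0, 2.0, -0.1])
database_code = to_binary_code([-0.3, -0.7, 1.2, 0.4])
distance = hamming_distance(query_code, database_code)
```

Each descriptor is stored as a single integer, and comparing two images reduces to one XOR followed by a bit count.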
