Arrow Research search

Author name cluster

Wei Ke 0003

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
1 author row

Possible papers

5

ICML 2025 Conference Paper

Are High-Quality AI-Generated Images More Difficult for Models to Detect?

  • Yao Xiao
  • Binbin Yang
  • Weiyan Chen
  • Jiahao Chen
  • Zijie Cao
  • ZiYi Dong
  • Xiangyang Ji
  • Liang Lin

The remarkable evolution of generative models has enabled the generation of high-quality, visually attractive images, often perceptually indistinguishable from real photographs to human eyes. This has spurred significant attention on AI-generated image (AIGI) detection. Intuitively, higher image quality should increase detection difficulty. However, our systematic study on cutting-edge text-to-image generators reveals a counterintuitive finding: AIGIs with higher quality scores, as assessed by human preference models, tend to be more easily detected by existing models. To investigate this, we examine how the text prompts for generation and image characteristics influence both quality scores and detector accuracy. We observe that images from short prompts tend to achieve higher preference scores while being easier to detect. Furthermore, through clustering and regression analyses, we verify that image characteristics like saturation, contrast, and texture richness collectively impact both image quality and detector accuracy. Finally, we demonstrate that the performance of off-the-shelf detectors can be enhanced across diverse generators and datasets by selecting input patches based on the predicted scores of our regression models, thus substantiating the broader applicability of our findings. Code and data are available at https://github.com/Coxy7/AIGI-Detection-Quality-Paradox.
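The patch-selection idea in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: `select_patches` and `contrast_score` are hypothetical names, and the simple contrast heuristic stands in for the regression model the authors actually train.

```python
import numpy as np

def select_patches(image, patch_size, k, score_fn):
    """Split an image into non-overlapping patches, score each one with a
    regression model, and return the k highest-scoring patches as detector input."""
    h, w = image.shape[:2]
    patches = []
    for y in range(0, h - patch_size + 1, patch_size):
        for x in range(0, w - patch_size + 1, patch_size):
            patches.append(image[y:y + patch_size, x:x + patch_size])
    scores = np.array([score_fn(p) for p in patches])
    top = np.argsort(scores)[::-1][:k]  # indices of the k best-scoring patches
    return [patches[i] for i in top]

# Stand-in scorer: local contrast (the paper learns a regression model instead).
def contrast_score(patch):
    return float(patch.std())
```

The off-the-shelf detector is then run only on the selected patches rather than the full image.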

ICLR 2025 Conference Paper

Refining CLIP's Spatial Awareness: A Visual-Centric Perspective

  • Congpei Qiu
  • Yanhao Wu
  • Wei Ke 0003
  • Xiuxiu Bai
  • Tong Zhang 0023

Contrastive Language-Image Pre-training (CLIP) excels in global alignment with language but exhibits limited sensitivity to spatial information, leading to strong performance in zero-shot classification tasks but underperformance in tasks requiring precise spatial understanding. Recent approaches have introduced Region-Language Alignment (RLA) to enhance CLIP's performance in dense multimodal tasks by aligning regional visual representations with corresponding text inputs. However, we find that CLIP ViTs fine-tuned with RLA suffer from a notable loss in spatial awareness, which is crucial for dense prediction tasks. To address this, we propose the Spatial Correlation Distillation (SCD) framework, which preserves CLIP's inherent spatial structure and mitigates the above degradation. To further enhance spatial correlations, we introduce a lightweight Refiner that extracts refined correlations directly from CLIP before feeding them into SCD, based on the intriguing finding that CLIP naturally captures high-quality dense features. Together, these components form a robust distillation framework that enables CLIP ViTs to integrate both visual-language and visual-centric improvements, achieving state-of-the-art results across various open-vocabulary dense prediction benchmarks.
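The core distillation objective described above can be sketched in a few lines. This is an assumed formulation inferred from the abstract, not the paper's code: it compares pairwise cosine-similarity ("spatial correlation") matrices of dense features from a frozen teacher and a fine-tuned student, and penalizes their divergence.

```python
import numpy as np

def spatial_correlation(feats):
    """Pairwise cosine-similarity matrix over N dense feature vectors, shape (N, D)."""
    normed = feats / np.linalg.norm(feats, axis=1, keepdims=True)
    return normed @ normed.T

def scd_loss(student_feats, teacher_feats):
    """Mean squared error between the spatial correlation matrices of the
    student's and the (frozen) teacher's dense features."""
    c_student = spatial_correlation(student_feats)
    c_teacher = spatial_correlation(teacher_feats)
    return float(np.mean((c_student - c_teacher) ** 2))
```

Because the loss acts on relative similarities between locations rather than on raw features, the student is free to adapt to region-language alignment while keeping the teacher's spatial structure.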

ICML 2024 Conference Paper

Kepler codebook

  • Junrong Lian
  • Ziyue Dong
  • Pengxu Wei
  • Wei Ke 0003
  • Chang Liu 0030
  • Qixiang Ye
  • Xiangyang Ji
  • Liang Lin

A codebook designed for learning discrete distributions in latent space has demonstrated state-of-the-art results on generation tasks. This inspires us to explore which codebook distribution is better. Following the spirit of Kepler's Conjecture, we cast codebook training as solving the sphere packing problem and derive a Kepler codebook with a compact and structured distribution for image representations. Furthermore, we implement Kepler codebook training by simply employing this derived distribution as a regularization term and using a codebook partition method. We conduct extensive experiments to evaluate our trained codebook for image reconstruction and generation on natural and human face datasets, respectively, achieving significant performance improvement. Besides, our Kepler codebook has demonstrated superior performance when evaluated across datasets and even for reconstructing images at different resolutions. Our trained models and source codes will be publicly released.
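The sphere-packing intuition can be illustrated with a toy regularizer. This is a loose analogue of the paper's derived-distribution regularization, not its actual objective: `packing_regularizer` is a hypothetical name, and the penalty simply discourages any two code vectors from sitting closer than a minimum separation, as sphere packing would require.

```python
import numpy as np

def packing_regularizer(codebook, d_min):
    """Penalize pairs of code vectors closer than d_min, nudging the
    codebook (K, D) toward a sphere-packing-like spread in latent space."""
    K = codebook.shape[0]
    # All pairwise Euclidean distances, shape (K, K).
    dists = np.linalg.norm(codebook[:, None, :] - codebook[None, :, :], axis=-1)
    mask = ~np.eye(K, dtype=bool)          # ignore each vector's distance to itself
    violation = np.maximum(0.0, d_min - dists[mask])
    return float(np.mean(violation ** 2))
```

Added to a VQ reconstruction loss, a term like this pushes codes apart until the minimum-separation constraint is met, after which it contributes zero gradient.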

ICLR 2024 Conference Paper

Mind Your Augmentation: The Key to Decoupling Dense Self-Supervised Learning

  • Congpei Qiu
  • Tong Zhang 0023
  • Yanhao Wu
  • Wei Ke 0003
  • Mathieu Salzmann
  • Sabine Süsstrunk

Dense Self-Supervised Learning (SSL) constructs positive pairs from paired regions or points, thereby aiming to preserve local features, for example those of individual objects. However, existing approaches tend to couple objects by leaking information from neighboring contextual regions when the pairs have limited overlap. In this paper, we first quantitatively identify and confirm the existence of such a coupling phenomenon. We then address it by developing a remarkably simple yet highly effective solution comprising a novel augmentation method, Region Collaborative Cutout (RCC), and a corresponding decoupling branch. Importantly, our design is versatile and can be seamlessly integrated into existing SSL frameworks, whether based on Convolutional Neural Networks (CNNs) or Vision Transformers (ViTs). We conduct extensive experiments, incorporating our solution into two CNN-based and two ViT-based methods, with results confirming the effectiveness of our approach. Moreover, we provide empirical evidence that our method significantly contributes to the disentanglement of feature representations among objects, both in quantitative and qualitative terms.
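A coordinated cutout can be sketched as follows. This is only an illustrative guess at the spirit of RCC, not the paper's method: applying the same cutout box to both augmented views removes a context region from both at once, so neither view can rely on information the other has lost.

```python
import numpy as np

def collaborative_cutout(view_a, view_b, box):
    """Apply one shared cutout box (y0, y1, x0, x1) to two augmented views,
    so the masked context region is removed from both views consistently."""
    y0, y1, x0, x1 = box
    a, b = view_a.copy(), view_b.copy()
    a[y0:y1, x0:x1] = 0.0
    b[y0:y1, x0:x1] = 0.0
    return a, b
```

In a real pipeline the box would be drawn in a shared coordinate frame before each view's geometric augmentation, so that the erased region corresponds to the same image content in both views.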

IROS 2021 Conference Paper

Kohonen Self-Organizing Map based Route Planning: A Revisit

  • Qingshu Guan
  • Xiaopeng Hong
  • Wei Ke 0003
  • Liangfei Zhang
  • Guanghui Sun
  • Yihong Gong

In this paper, we revisit the long-standing Traveling Salesman Problem (TSP) and focus on the challenging yet practical route planning problem under limited computational resources. We contribute to TSP, one of the most famous NP-hard problems, by providing a new improved approximate solution, which we term TOpology Preserving Self-Organizing Map (TOPSOM). TOPSOM preserves the topology of the node map to be traversed by maintaining the continuity of nodes and the distances between them. In addition, to satisfy the convex-hull requirement, we design an elastic competitive Hebbian learning rule. TOPSOM can solve large-scale TSPs with high precision and high efficiency at limited computational cost. Extensive experimental results on mainstream route planning benchmarks, including TSPLIB and the National TSPs, show that our method consistently outperforms baseline methods by up to 7.7% in terms of the Percent Deviation of the Mean solution from the best known solution.
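The classic Kohonen SOM baseline that TOPSOM builds on can be sketched in a few lines. This is the textbook elastic-ring SOM for TSP, not the paper's TOPSOM (which adds topology-preserving distance constraints and an elastic competitive Hebbian rule); `som_tsp` and its hyperparameters are illustrative choices.

```python
import numpy as np

def som_tsp(cities, n_nodes=None, iters=2000, seed=0):
    """Elastic-ring Kohonen SOM for TSP: ring nodes migrate toward randomly
    sampled cities; each winner drags its ring neighbors along, so the ring
    gradually settles into a short closed tour."""
    rng = np.random.default_rng(seed)
    n_nodes = n_nodes or 4 * len(cities)
    # Initialize the ring as a small cloud around the city centroid.
    nodes = cities.mean(axis=0) + 0.1 * rng.standard_normal((n_nodes, 2))
    for t in range(iters):
        lr = 0.8 * (1 - t / iters) + 0.01                 # decaying learning rate
        radius = max(1.0, (n_nodes / 8) * (1 - t / iters))  # shrinking neighborhood
        city = cities[rng.integers(len(cities))]
        winner = np.argmin(np.linalg.norm(nodes - city, axis=1))
        ring = np.arange(n_nodes)
        # Circular distance of every node to the winner along the ring.
        d = np.minimum(np.abs(ring - winner), n_nodes - np.abs(ring - winner))
        influence = np.exp(-(d ** 2) / (2 * radius ** 2))
        nodes += lr * influence[:, None] * (city - nodes)
    # Read off the tour: order cities by their nearest ring node.
    return np.argsort([np.argmin(np.linalg.norm(nodes - c, axis=1)) for c in cities])
```

Because updates only ever move nodes along the ring's neighborhood structure, the node map keeps the continuity property the abstract emphasizes, which is what TOPSOM strengthens further.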