Arrow Research search

Author name cluster

Cheng Liu

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

20 papers
2 author rows

Possible papers

20

AAAI 2026 Conference Paper

Learning Whom to Align With: Progressive Anomaly Combination Detection for Partially View-Aligned Clustering

  • Hang Gao
  • Zuosong Cai
  • Yuze Li
  • Cheng Liu
  • Gaoyang Li
  • Ying Li
  • Wei Du
  • You Zhou

Partially View-aligned Clustering (PVC) addresses the challenge of partial view alignment in multi-view learning by leveraging complementary and consistent information. While existing PVC methods show promise, most rely on distance-based strategies that are sensitive to view-specific details and noise, limiting their robustness. In this work, we propose a novel view alignment strategy that reformulates the alignment task as an anomaly detection problem. Rather than learning a view-alignment matrix that enforces strict one-to-one correspondences across views, we adopt a progressive approach to identify well-aligned samples. Specifically, we sample subsets of data by generating random view combinations from unaligned samples and propose an anomaly combination detection module to evaluate the alignment consistency of these combinations. In addition, our progressive training framework alternates between updating model parameters and selecting high-confidence view combinations for subsequent optimization. By reformulating view alignment as an anomaly detection task, our approach provides a more robust and effective solution to partial view alignment. Experiments on benchmark datasets demonstrate that our method outperforms state-of-the-art approaches in the PVC problem.

AAAI 2026 Conference Paper

LoopLLM: Transferable Energy-Latency Attacks in LLMs via Repetitive Generation

  • Xingyu Li
  • Xiaolei Liu
  • Cheng Liu
  • Yixiao Xu
  • Kangyi Ding
  • Bangzhou Xin
  • Jia-Li Yin

As large language models (LLMs) scale, their inference consumes substantial computational resources, exposing them to energy-latency attacks, where crafted prompts induce high energy and latency costs. Existing attack methods aim to prolong output by delaying the generation of termination symbols. However, as the output grows longer, controlling the termination symbols through input becomes difficult, making these methods less effective. Therefore, we propose LoopLLM, an energy-latency attack framework based on the observation that repetitive generation can trigger low-entropy decoding loops, reliably compelling LLMs to generate until their output limits. LoopLLM introduces (1) a repetition-inducing prompt optimization that exploits autoregressive vulnerabilities to induce repetitive generation, and (2) a token-aligned ensemble optimization that aggregates gradients to improve cross-model transferability. Extensive experiments on 12 open-source and 2 commercial LLMs show that LoopLLM significantly outperforms existing methods, achieving over 90% of the maximum output length, compared to 20% for baselines, and improving transferability by around 40% to DeepSeek-V3 and Gemini 2.5 Flash.

AAAI 2026 Conference Paper

Refinement Contrastive Learning of Cell–Gene Associations for Unsupervised Cell Type Identification

  • Liang Peng
  • Haopeng Liu
  • Yixuan Ye
  • Cheng Liu
  • Wenjun Shen
  • Si Wu
  • Hau-San Wong

Unsupervised cell type identification is crucial for uncovering and characterizing heterogeneous populations in single-cell omics studies. Although a range of clustering methods have been developed, most focus exclusively on intrinsic cellular structure and ignore the pivotal role of cell-gene associations, which limits their ability to distinguish closely related cell types. To this end, we propose a Refinement Contrastive Learning framework (scRCL) that explicitly incorporates cell-gene interactions to derive more informative representations. Specifically, we introduce two contrastive distribution alignment components that reveal reliable intrinsic cellular structures by effectively exploiting cell-cell structural relationships. Additionally, we develop a refinement module that integrates gene-correlation structure learning to enhance cell embeddings by capturing underlying cell-gene associations. This module strengthens connections between cells and their associated genes, refining representation learning to exploit biologically meaningful relationships. Extensive experiments on several single-cell RNA-seq and spatial transcriptomics benchmark datasets demonstrate that our method consistently outperforms state-of-the-art baselines in cell-type identification accuracy. Moreover, downstream biological analyses confirm that the recovered cell populations exhibit coherent gene-expression signatures, further validating the biological relevance of our approach.

AAAI 2026 Conference Paper

TTA-Bench: A Comprehensive Benchmark for Evaluating Text-to-Audio Models

  • Hui Wang
  • Cheng Liu
  • Junyang Chen
  • Haoze Liu
  • Yuhang Jia
  • Shiwan Zhao
  • Jiaming Zhou
  • Haoqin Sun

Text-to-Audio (TTA) generation has made rapid progress, but current evaluation methods remain narrow, focusing mainly on perceptual quality while overlooking robustness, generalization, and ethical concerns. We present TTA-Bench, a comprehensive benchmark for evaluating TTA models across functional performance, reliability, and social responsibility. It covers seven dimensions including accuracy, robustness, fairness, and toxicity, and includes 2,999 diverse prompts generated through automated and manual methods. We introduce a unified evaluation protocol that combines objective metrics with over 118,000 human annotations from both experts and general users. Ten state-of-the-art models are benchmarked under this framework, offering detailed insights into their strengths and limitations. TTA-Bench establishes a new standard for holistic evaluation of TTA systems.

JBHI 2025 Journal Article

An Unsupervised Correlation Learning-Based Clustering Model for Multiple Complex Lesions Evaluation

  • Wenfeng Xu
  • Cong Lai
  • Zefeng Mo
  • Cheng Liu
  • Maoyuan Li
  • Gansen Zhao
  • Kewei Xu

Lesion morphology and quantity evaluation in computed tomography (CT) images are critical for precise disease diagnosis. Most existing methods employ machine learning to separately evaluate the morphology and quantity of individual lesions, neglecting the synergy between morphological structure and quantitative distribution. This limitation presents challenges when handling multiple complex lesions. This paper proposes an unsupervised correlation learning-based clustering model for evaluating lesion morphology and quantity in scenarios involving multiple complex lesions, without predefined task-specific logic. Specifically, the model utilizes clinical knowledge and changes in the in- or out-degree of lesion regions to learn their interdependencies, automatically recognizing domain-specific morphological features. These morphological features serve as key representations for morphology estimation and provide essential contextual information for quantity analysis. Furthermore, the model treats quantity evaluation as a density-based clustering process. By interacting with domain-specific morphological features, the model dynamically adjusts the search objects, followed by designing morphology-specific parameter search strategies to autonomously learn spatial relationships between lesion regions. This approach facilitates the exploration of optimal parameters for accurate lesion evaluation without manual intervention. Experiments conducted on a kidney stone dataset of 53 samples and a kidney tumor dataset of 300 samples indicate that the proposed model achieves 92.45% and 95.33% accuracy in morphology analysis, respectively. For quantity analysis, the proposed model achieves 79.25% and 94.33% accuracy, outperforming the well-performing AR-DBSCAN method by +30.19% and the DRL-DBSCAN method by +6%. The proposed model is demonstrated to be effective in handling morphology and quantity estimation for multiple complex lesions.

JBHI 2025 Journal Article

Deep Self-Reinforced Multi-View Subspace Clustering for Cancer Subtyping

  • Cheng Liu
  • Baoyuan Zheng
  • Jiaojiao Wang
  • Xibiao Wang
  • Hang Gao
  • Fei Wang
  • Wenjun Shen
  • Si Wu

Identifying cancer subtypes is crucial for understanding disease progression and guiding precision medicine. With advances in high-throughput experimental technologies, the integration of multiple types of omics data for cancer subtype identification has become increasingly feasible. However, despite the promising performance of existing integrative cancer subtyping methods, efficiently integrating and clustering multi-omics datasets remains challenging due to the high levels of noise inherent in omics data, which impede the accurate characterization of relationships among samples. To address these challenges, we propose a novel deep multi-view subspace clustering model that incorporates a self-reinforced learning strategy. This strategy iteratively improves the quality of self-representation, which is critical for accurately capturing sample relationships and enabling effective clustering. Specifically, during model training, the proposed method learns a highly reliable self-representation through a good-neighbor learning mechanism, allowing it to model more accurate and robust inter-sample relationships. Building upon this reliable self-representation, we further develop a learnable view-graph fusion framework that integrates complementary information across multiple omics views to derive a consensus representation for clustering, thereby guiding the overall learning process. In addition, we introduce a local graph-guided learning mechanism based on an initial graph constructed from the raw data. This mechanism serves as an effective regularization strategy to prevent the model from converging to suboptimal solutions, thereby enhancing stability and robustness during training. Extensive experimental results demonstrate that the proposed method consistently outperforms several state-of-the-art approaches, validating its effectiveness and robustness for cancer subtype identification.

AAAI 2025 Conference Paper

Graph Coarsening via Supervised Granular-Ball for Scalable Graph Neural Network Training

  • Shuyin Xia
  • Xinjun Ma
  • Zhiyuan Liu
  • Cheng Liu
  • Sen Zhao
  • Guoyin Wang

Graph Neural Networks (GNNs) have demonstrated significant achievements in processing graph data, yet scalability remains a substantial challenge. To address this, numerous graph coarsening methods have been developed. However, most existing coarsening methods are training-dependent, leading to lower efficiency, and they all require a predefined coarsening rate, lacking an adaptive approach. In this paper, we employ granular-ball computing to effectively compress graph data. We construct a coarsened graph network by iteratively splitting the graph into granular-balls based on a purity threshold and using these granular-balls as super vertices. This granulation process significantly reduces the size of the original graph, thereby greatly enhancing the training efficiency and scalability of GNNs. Additionally, our algorithm can adaptively perform splitting without requiring a predefined coarsening rate. Experimental results demonstrate that our method achieves accuracy comparable to training on the original graph. Noise injection experiments further indicate that our method exhibits robust performance. Moreover, our approach can reduce the graph size by up to 20 times without compromising test accuracy, substantially enhancing the scalability of GNNs.
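
The purity-threshold splitting the abstract describes can be sketched in a few lines. The following is an illustrative simplification (2-means-style splits; the `purity` and `min_size` parameters are hypothetical), not the authors' implementation:

```python
import numpy as np

def granular_balls(X, y, purity=0.9, min_size=4, rng=None):
    """Recursively split labeled data into 'granular balls' until each
    ball's label purity reaches a threshold -- a simplified sketch of the
    splitting idea, not the paper's code.

    Returns a list of index arrays, one per ball; each ball could then
    serve as a super vertex of a coarsened graph.
    """
    rng = np.random.default_rng(0) if rng is None else rng
    balls, stack = [], [np.arange(len(X))]
    while stack:
        idx = stack.pop()
        _, counts = np.unique(y[idx], return_counts=True)
        if counts.max() / len(idx) >= purity or len(idx) <= min_size:
            balls.append(idx)  # pure (or tiny) ball: stop splitting
            continue
        # 2-means-style split around two randomly chosen seed points.
        seeds = X[rng.choice(idx, 2, replace=False)]
        d = np.linalg.norm(X[idx, None, :] - seeds[None, :, :], axis=2)
        assign = d.argmin(1)
        if assign.min() == assign.max():  # degenerate split: keep as one ball
            balls.append(idx)
            continue
        stack += [idx[assign == 0], idx[assign == 1]]
    return balls
```

Each returned index set would then be contracted into one super vertex, with edges aggregated between balls; note that splitting stops adaptively once purity is met, so no coarsening rate needs to be fixed in advance.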

NeurIPS 2025 Conference Paper

OmniConsistency: Learning Style-Agnostic Consistency from Paired Stylization Data

  • Yiren Song
  • Cheng Liu
  • Mike Zheng Shou

Diffusion models have advanced image stylization significantly, yet two core challenges persist: (1) maintaining consistent stylization in complex scenes, particularly identity, composition, and fine details, and (2) preventing style degradation in image-to-image pipelines with style LoRAs. GPT-4o's exceptional stylization consistency highlights the performance gap between open-source methods and proprietary models. To bridge this gap, we propose OmniConsistency, a universal consistency plugin leveraging large-scale Diffusion Transformers (DiTs). OmniConsistency contributes: (1) an in-context consistency learning framework trained on aligned image pairs for robust generalization; (2) a two-stage progressive learning strategy decoupling style learning from consistency preservation to mitigate style degradation; and (3) a fully plug-and-play design compatible with arbitrary style LoRAs under the Flux framework. Extensive experiments show that OmniConsistency significantly enhances visual coherence and aesthetic quality, achieving performance comparable to the commercial state-of-the-art model GPT-4o.

JBHI 2024 Journal Article

Clinical-Inspired Framework for Automatic Kidney Stone Recognition and Analysis on Transverse CT Images

  • Wenfeng Xu
  • Cong Lai
  • Zefeng Mo
  • Cheng Liu
  • Maoyuan Li
  • Gansen Zhao
  • Kewei Xu

Stone recognition and analysis in CT images are essential for automatic kidney stone diagnosis. Although certain contributions have been made, existing methods overlook the promoting effect of clinical knowledge on model performance and clinical interpretation. It is therefore attractive to establish methods for detecting and evaluating kidney stones that originate from the practical diagnostic process. Inspired by this, a novel clinical-inspired framework is proposed that incorporates the diagnostic process of urologists for better analysis. The diagnostic process contains three main steps: localization, identification, and evaluation. Three modules integrating the decision-making mode of urologists are designed to mimic this process. The object attention module simulates the localization step to provide the position of kidneys by embedding a weight feature factor and an angle loss. The feature-driven discriminative module mimics the identification step to detect stones by extracting geometric and positional features. The analysis module, based on the principle of clustering and graphic combination, is a quantitative analysis strategy for simulating the evaluation step. This work constructed a clinical dataset of 27,885 transverse CT images with stones and/or clinical interference. Experiments on the dataset show that the object attention module outperforms the well-performing YOLOv7 model by +1% mAP@.5:.95, and the analysis module outperforms the well-performing AR-DBSCAN model and the formula method by +21.9% average cluster accuracy and -17.35% average error. Experiments demonstrate that the proposed framework is currently the most effective solution for recognizing and evaluating kidney stones.

NeurIPS 2024 Conference Paper

LinNet: Linear Network for Efficient Point Cloud Representation Learning

  • Hao Deng
  • Kunlei Jing
  • Shengmei Cheng
  • Cheng Liu
  • Jiawei Ru
  • Jiang Bo
  • Lin Wang

Point-based methods have made significant progress, but improving their scalability in large-scale 3D scenes is still a challenging problem. In this paper, we delve into the point-based method and develop a simpler, faster, stronger variant model, dubbed LinNet. In particular, we first propose the disassembled set abstraction (DSA) module, which is more effective than the previous version of set abstraction. It achieves more efficient local aggregation by leveraging spatial anisotropy and channel anisotropy separately. Additionally, by mapping 3D point clouds onto 1D space-filling curves, we enable parallelization of downsampling and neighborhood queries on GPUs with linear complexity. LinNet, as a purely point-based method, outperforms most previous methods in both indoor and outdoor scenes without any extra attention or sparse convolution, relying merely on simple MLPs. It achieves mIoU of 73.7%, 81.4%, and 69.1% on the S3DIS Area5, NuScenes, and SemanticKITTI validation benchmarks, respectively, while speeding up almost 10x over PointNeXt. Our work further reveals both the efficacy and efficiency potential of vanilla point-based models in large-scale representation learning. Our code will be available upon publication.
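
Mapping 3D points onto a 1D space-filling curve, as the abstract mentions, is commonly done with Morton (Z-order) codes. The sketch below illustrates that general idea under simplifying assumptions (a fixed quantization grid, a hypothetical `bits` parameter); it is not LinNet's actual code:

```python
import numpy as np

def morton_code(points, bits=10):
    """Map 3D points to 1D Morton (Z-order) codes by bit interleaving.

    Points are quantized to a 2**bits grid per axis; nearby points in 3D
    tend to receive nearby codes, so sorting by code yields a
    locality-preserving 1D ordering.
    """
    # Normalize to [0, 1) and quantize to integer grid coordinates.
    mins, maxs = points.min(0), points.max(0)
    grid = ((points - mins) / (maxs - mins + 1e-9) * (2**bits - 1)).astype(np.uint64)
    codes = np.zeros(len(points), dtype=np.uint64)
    for b in range(bits):
        for axis in range(3):
            bit = (grid[:, axis] >> np.uint64(b)) & np.uint64(1)
            codes |= bit << np.uint64(3 * b + axis)
    return codes

rng = np.random.default_rng(0)
pts = rng.random((1000, 3)).astype(np.float32)
order = np.argsort(morton_code(pts))  # locality-preserving 1D ordering
```

Once points are sorted this way, downsampling and neighborhood queries become contiguous 1D array operations, which is what makes them easy to parallelize on GPUs.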

NeurIPS 2024 Conference Paper

Ordering-Based Causal Discovery for Linear and Nonlinear Relations

  • Zhuopeng Xu
  • Yujie Li
  • Cheng Liu
  • Ning Gui

Identifying causal relations from purely observational data typically requires additional assumptions on relations and/or noise. Most current methods restrict their analysis to datasets that are assumed to have pure linear or nonlinear relations, which is often not reflective of real-world datasets that contain a combination of both. This paper presents CaPS, an ordering-based causal discovery algorithm that effectively handles linear and nonlinear relations. CaPS introduces a novel identification criterion for topological ordering and incorporates the concept of "parent score" during the post-processing optimization stage. These scores quantify the strength of the average causal effect, helping to accelerate the pruning process and correct inaccurate predictions in the pruning step. Experimental results demonstrate that our proposed solutions outperform state-of-the-art baselines on synthetic data with varying ratios of linear and nonlinear relations. The results obtained from real-world data also support the competitiveness of CaPS. Code and datasets are available at https://github.com/E2real/CaPS.

AAAI 2024 Conference Paper

RetouchFormer: Semi-supervised High-Quality Face Retouching Transformer with Prior-Based Selective Self-Attention

  • Xue Wen
  • Lianxin Xie
  • Le Jiang
  • Tianyi Chen
  • Si Wu
  • Cheng Liu
  • Hau-San Wong

Face retouching is to beautify a face image, while preserving the image content as much as possible. It is a promising yet challenging task to remove face imperfections and fill with normal skin. Generic image enhancement methods are hampered by the lack of imperfection localization, which often results in incomplete removal of blemishes at large scales. To address this issue, we propose a transformer-based approach, RetouchFormer, which simultaneously identifies imperfections and synthesizes realistic content in the corresponding regions. Specifically, we learn a latent dictionary to capture the clean face priors, and predict the imperfection regions via a reconstruction-oriented localization module. Based on this, we can realize face retouching by explicitly suppressing imperfections in our selective self-attention computation, such that local content will be synthesized from normal skin. On the other hand, multi-scale feature tokens lead to increased flexibility in dealing with imperfections at various scales. These design elements bring greater effectiveness and efficiency. RetouchFormer outperforms advanced face retouching methods and synthesizes clean face images with high fidelity in our extensive experiments.

IJCAI 2024 Conference Paper

SCTrans: Multi-scale scRNA-seq Sub-vector Completion Transformer for Gene-selective Cell Type Annotation

  • Lu Lin
  • Wen Xue
  • Xindian Wei
  • Wenjun Shen
  • Cheng Liu
  • Si Wu
  • Hau San Wong

Cell type annotation is pivotal to single-cell RNA sequencing (scRNA-seq) data-based biological and medical analysis, e.g., identifying biomarkers, exploring cellular heterogeneity, and understanding disease mechanisms. Previous annotation methods typically learn a nonlinear mapping to infer cell type from gene expression vectors, and thus fall short in discovering and associating salient genes with specific cell types. To address this issue, we propose a multi-scale scRNA-seq Sub-vector Completion Transformer, and our model is referred to as SCTrans. Considering that the expressiveness of gene sub-vectors is richer than that of individual genes, we perform multi-scale partitioning on gene vectors followed by masked sub-vector completion, conditioned on unmasked ones. Toward this end, the multi-scale sub-vectors are tokenized, and the intrinsic contextual relationships are modeled via self-attention computation and conditional contrastive regularization imposed on an encoding transformer. By performing mutual learning between the encoder and an additional lightweight counterpart, the salient tokens can be distinguished from the others. As a result, we can perform gene-selective cell type annotation, which contributes to our superior performance over state-of-the-art annotation methods.

ICRA 2023 Conference Paper

Adaptive Risk-Tendency: Nano Drone Navigation in Cluttered Environments with Distributional Reinforcement Learning

  • Cheng Liu
  • Erik-Jan van Kampen
  • Guido C. H. E. de Croon

Enabling the capability of assessing risk and making risk-aware decisions is essential to applying reinforcement learning to safety-critical robots like drones. In this paper, we investigate a specific case where a nano quadcopter robot learns to navigate an a priori unknown cluttered environment under partial observability. We present a distributional reinforcement learning framework to generate adaptive risk-tendency policies. Specifically, we propose to use the lower tail conditional variance of the learnt return distribution as an intrinsic uncertainty estimate, and use exponentially weighted average forecasting (EWAF) to adapt the risk-tendency in accordance with the estimated uncertainty. In simulation and real-world empirical results, we show that (1) the most effective risk-tendency varies across states, and (2) the agent with adaptive risk-tendency achieves superior performance compared to risk-neutral and risk-averse policy baselines. Code and video can be found in this repository: https://github.com/tudelft/risk-sensitive-rl.git
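
As a rough illustration of the exponentially weighted averaging idea, the sketch below adapts a risk-tendency parameter (modeled here as an interpolated CVaR-style level) from an uncertainty signal. All names, defaults, and the linear interpolation are illustrative assumptions, not taken from the paper:

```python
class AdaptiveRiskTendency:
    """Sketch: adapt a risk-tendency parameter from an uncertainty signal
    via exponentially weighted averaging (parameters are hypothetical)."""

    def __init__(self, beta=0.9, risk_averse=0.25, risk_neutral=1.0):
        self.beta = beta                  # EWAF smoothing factor
        self.ewa = 0.0                    # running uncertainty estimate
        self.risk_averse = risk_averse    # level used under high uncertainty
        self.risk_neutral = risk_neutral  # level used under low uncertainty

    def update(self, uncertainty):
        # Exponentially weighted average forecast of the uncertainty signal.
        self.ewa = self.beta * self.ewa + (1 - self.beta) * uncertainty
        # Interpolate: higher estimated uncertainty -> more risk-averse.
        u = min(max(self.ewa, 0.0), 1.0)
        return self.risk_neutral + u * (self.risk_averse - self.risk_neutral)
```

Calling `update` once per decision step would let the policy drift toward risk-averse behavior when the return distribution's lower tail variance stays high, and relax back toward risk-neutral when it subsides.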

ICRA 2021 Conference Paper

Sensing via Collisions: a Smart Cage for Quadrotors with Applications to Self-Localization

  • Cheng Liu
  • Roberto Tron

Applications of micro unmanned aerial vehicles (UAVs) are gradually expanding into complex urban and natural environments. Despite noticeable progress, flying robots in obstacle-rich environments is still challenging. On-board processing for detecting and avoiding obstacles is possible, but at a significant computational expense, and with significant limitations (e.g., for obstacles with small cross sections, such as wires). A low-cost alternative is to mitigate physical contacts through a cage or other similar protective devices. In this paper, we propose to transform these passive protective devices into functional sensors: we introduce a suspended rim combined with a central base measuring the relative displacement of the rim; we provide a full mechanical design, and derive solutions to the inverse kinematics for recovering the collision direction in real time. As a proof of concept, we show the benefits of this novel form of sensing by embedding it in a traditional particle filter for self-localization in a known environment; our experiments show that localization is possible with a minimal sacrifice in payload capacity.

IJCAI 2019 Conference Paper

Improving representation learning in autoencoders via multidimensional interpolation and dual regularizations

  • Sheng Qian
  • Guanyue Li
  • Wen-Ming Cao
  • Cheng Liu
  • Si Wu
  • Hau San Wong

Autoencoders enjoy a remarkable ability to learn data representations. Research on autoencoders shows that the effectiveness of data interpolation can reflect the performance of representation learning. However, existing interpolation methods in autoencoders do not have enough capability of traversing a possible region between two datapoints on a data manifold, and the distribution of interpolated latent representations is not considered. To address these issues, we aim to fully exert the potential of data interpolation and further improve representation learning in autoencoders. Specifically, we propose multidimensional interpolation, which increases the capability of data interpolation by randomly setting interpolation coefficients for each dimension of the latent representations. In addition, we regularize autoencoders in both the latent and the data spaces by imposing a prior on latent representations in the Maximum Mean Discrepancy (MMD) framework and encouraging generated datapoints to be realistic in the Generative Adversarial Network (GAN) framework. Compared to representative models, our proposed model empirically achieves better representation learning performance on downstream tasks across multiple benchmarks.
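
The multidimensional interpolation described above, with a separate random coefficient per latent dimension, can be sketched as follows (an illustrative simplification, not the authors' code):

```python
import numpy as np

def multidimensional_interpolate(z1, z2, rng=None):
    """Interpolate two latent vectors with a separate random coefficient
    per dimension (a sketch of the idea, not the paper's implementation).

    Standard interpolation uses one scalar alpha for every dimension;
    sampling alpha per dimension lets the interpolant traverse a whole
    axis-aligned region between the two latent points instead of the
    single line segment connecting them.
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = rng.uniform(0.0, 1.0, size=z1.shape)  # one coefficient per dim
    return alpha * z1 + (1.0 - alpha) * z2

z1, z2 = np.zeros(8), np.ones(8)
z_mix = multidimensional_interpolate(z1, z2)  # inside the box spanned by z1, z2
```

Decoding many such `z_mix` samples would exercise the region between two training points, which is what the MMD prior and GAN-style realism term then regularize.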