Author name cluster

Rex Ying

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

34 papers

2 author rows

AAAI Conference 2025 Conference Paper

An Item Is Worth a Prompt: Versatile Image Editing with Disentangled Control

Aosong Feng
Weikang Qiu
Jinbin Bai
Zhen Dong
Kaicheng Zhou
Xiao Zhang
Rex Ying
Leandros Tassiulas

Building on the success of text-to-image diffusion models (DPMs), image editing is an important application to enable human interaction with AI-generated content. Among various editing methods, editing within the prompt space has gained more attention due to its capacity and simplicity of controlling semantics. However, since diffusion models are commonly pretrained on descriptive text captions, direct editing of words in text prompts usually leads to completely different generated images, violating the requirements for image editing. On the other hand, existing editing methods usually consider introducing spatial masks to preserve the identity of unedited regions, which are usually ignored by DPMs and therefore lead to inharmonious editing results. Targeting these two challenges, in this work, we propose to disentangle the comprehensive image-prompt interaction into several item-prompt interactions, with each item linked to a special learned prompt. The resulting framework, named D-Edit, is based on pretrained diffusion models with cross-attention layers disentangled and adopts a two-step optimization to build item-prompt associations. Versatile image editing can then be applied to specific items by manipulating the corresponding prompts. We demonstrate state-of-the-art results in four types of editing operations including image-based, text-based, mask-based editing, and item removal, covering most types of editing applications, all within a single unified framework. Notably, D-Edit is the first framework that can (1) achieve item editing through mask editing and (2) combine image and text-based editing. We demonstrate the quality and versatility of the editing results for a diverse collection of images through both qualitative and quantitative evaluations.


NeurIPS Conference 2025 Conference Paper

HELM: Hyperbolic Large Language Models via Mixture-of-Curvature Experts

Neil He
Rishabh Anand
Hiren Madhu
Ali Maatouk
Smita Krishnaswamy
Leandros Tassiulas
Menglin Yang
Rex Ying

Frontier large language models (LLMs) have shown great success in text modeling and generation tasks across domains. However, natural language exhibits inherent semantic hierarchies and nuanced geometric structure, which current LLMs do not capture completely owing to their reliance on Euclidean operations such as dot-products and norms. Furthermore, recent studies have shown that not respecting the underlying geometry of token embeddings leads to training instabilities and degradation of generative capabilities. These findings suggest that shifting to non-Euclidean geometries can better align language models with the underlying geometry of text. We thus propose to operate fully in hyperbolic space, known for its expansive, scale-free, and low-distortion properties. To this end, we introduce HELM, a family of HypErbolic Large Language Models, offering a geometric rethinking of the Transformer-based LLM that addresses the representational inflexibility, missing set of necessary operations, and poor scalability of existing hyperbolic LMs. We additionally introduce a Mixture-of-Curvature Experts model, HELM-MiCE, where each expert operates in a distinct curvature space to encode more fine-grained geometric structure from text, as well as a dense model, HELM-D. For HELM-MiCE, we further develop hyperbolic Multi-Head Latent Attention (HMLA) for efficient, reduced-KV-cache training and inference. For both models, we further develop essential hyperbolic equivalents of rotary positional encodings and root mean square normalization. We are the first to train fully hyperbolic LLMs at billion-parameter scale, and evaluate them on well-known benchmarks such as MMLU and ARC, spanning STEM problem-solving, general knowledge, and commonsense reasoning. Our results show consistent gains from our HELM architectures (up to 4%) over popular Euclidean architectures used in LLaMA and DeepSeek with superior semantic hierarchy modeling capabilities, highlighting the efficacy and enhanced reasoning afforded by hyperbolic geometry in large-scale language model pretraining.
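As a rough illustration of what "a distinct curvature space" means, the sketch below computes geodesic distance between the same two points under two different curvatures. It assumes the Poincaré ball model purely for illustration; the abstract does not say which hyperbolic model HELM uses, and the function name `poincare_distance` is hypothetical, not taken from the paper.

```python
import math

def poincare_distance(x, y, c=1.0):
    """Geodesic distance in the Poincare ball of curvature -c (c > 0).

    Illustrative only: the choice of the Poincare ball is an assumption,
    not something stated in the abstract.
    """
    def sq_norm(v):
        return sum(t * t for t in v)

    fx = 1.0 - c * sq_norm(x)
    fy = 1.0 - c * sq_norm(y)
    if fx <= 0.0 or fy <= 0.0:
        raise ValueError("points must lie strictly inside the ball of radius 1/sqrt(c)")
    diff = [a - b for a, b in zip(x, y)]
    return math.acosh(1.0 + 2.0 * c * sq_norm(diff) / (fx * fy)) / math.sqrt(c)

origin, p = [0.0, 0.0], [0.5, 0.0]
d_strong = poincare_distance(origin, p, c=1.0)   # stronger negative curvature
d_weak = poincare_distance(origin, p, c=0.25)    # flatter space, shorter distance
```

For a point at Euclidean radius r from the origin, this distance reduces to (2/√c)·artanh(√c·r), so the same pair of points sits farther apart as the curvature magnitude grows.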


NeurIPS Conference 2025 Conference Paper

HiPoNet: A Multi-View Simplicial Complex Network for High Dimensional Point-Cloud and Single-Cell data

Siddharth Viswanath
Hiren Madhu
Dhananjay Bhaskar
Jake Kovalic
Dave Johnson
Christopher Tape
Ian Adelstein
Rex Ying

In this paper, we propose HiPoNet, an end-to-end differentiable neural network for regression, classification, and representation learning on high-dimensional point clouds. Our work is motivated by single-cell data, which can have very high dimensionality, exceeding the capabilities of existing methods for point clouds, which are mostly tailored for 3D data. Moreover, modern single-cell and spatial experiments now yield entire cohorts of datasets (i.e., one dataset for every patient), necessitating models that can process large, high-dimensional point clouds at scale. Most current approaches build a single nearest-neighbor graph, discarding important geometric and topological information. In contrast, HiPoNet models the point cloud as a set of higher-order simplicial complexes, with each particular complex being created using a reweighting of features. This method thus generates multiple constructs corresponding to different views of high-dimensional data, which in biology offers the possibility of disentangling distinct cellular processes. It then employs simplicial wavelet transforms to extract multiscale features, capturing both local and global topology from each view. We show that geometric and topological information is preserved in this framework both theoretically and empirically. We showcase the utility of HiPoNet on point-cloud-level tasks, involving classification and regression of entire point clouds in data cohorts. Experimentally, we find that HiPoNet outperforms other point-cloud and graph-based models on single-cell data. We also apply HiPoNet to spatial transcriptomics datasets using spatial coordinates as one of the views. Overall, HiPoNet offers a robust and scalable solution for high-dimensional data analysis.
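To make the idea of "views created by reweighting features" concrete, here is a minimal sketch showing how two different feature weightings produce different neighbourhood structure on the same points. It is a simplification under stated assumptions: it builds plain k-nearest-neighbour lists rather than simplicial complexes, the weights are fixed by hand rather than learned, and `knn_under_view` is a hypothetical helper, not part of HiPoNet.

```python
import math

def knn_under_view(points, weights, k):
    """For each point, return the indices of its k nearest neighbours
    under a feature-weighted Euclidean distance (one 'view')."""
    def dist(p, q):
        return math.sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))

    neighbours = []
    for i, p in enumerate(points):
        others = sorted((j for j in range(len(points)) if j != i),
                        key=lambda j: dist(p, points[j]))
        neighbours.append(others[:k])
    return neighbours

# Toy data: three points with two features each (assumed values).
points = [(0.0, 0.0), (1.0, 5.0), (4.0, 0.5)]
view_a = knn_under_view(points, weights=(1.0, 0.0), k=1)  # only feature 0 counts
view_b = knn_under_view(points, weights=(0.0, 1.0), k=1)  # only feature 1 counts
```

In this toy case point 0 is closest to point 1 in the first view and to point 2 in the second, which is the kind of disagreement between views that a single nearest-neighbour graph would hide.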


NeurIPS Conference 2025 Conference Paper

Hyperbolic Fine-Tuning for Large Language Models

Menglin Yang
Ram B
Aosong Feng
Bo Xiong
Jiahong Liu
Irwin King
Rex Ying

Large language models (LLMs) have demonstrated remarkable performance across various tasks. However, it remains an open question whether the default Euclidean space is the most suitable choice for LLMs. In this study, we investigate the geometric characteristics of LLMs, focusing specifically on tokens and their embeddings. Our findings reveal that token frequency follows a power-law distribution, where high-frequency tokens (e.g., the, that) constitute the minority, while low-frequency tokens (e.g., apple, dog) constitute the majority. Furthermore, high-frequency tokens cluster near the origin, whereas low-frequency tokens are positioned farther away in the embedding space. Additionally, token embeddings exhibit hyperbolic characteristics, indicating a latent tree-like structure within the embedding space. Motivated by these observations, we propose HypLoRA, an efficient fine-tuning approach that operates in hyperbolic space to exploit these underlying hierarchical structures better. HypLoRA performs low-rank adaptation directly in hyperbolic space, thereby preserving hyperbolic modeling capabilities throughout the fine-tuning process. Extensive experiments across various base models and reasoning benchmarks, specifically arithmetic and commonsense reasoning tasks, demonstrate that HypLoRA substantially improves LLM performance.
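One standard way to quantify how tree-like a set of points is uses Gromov's four-point condition: for every quadruple, form the three pairwise distance sums and take half the gap between the two largest. A tree metric gives 0 for every quadruple. The sketch below is a brute-force version of that check; the abstract does not say which hyperbolicity measure the authors use, so treating it as δ-hyperbolicity is an assumption, and `four_point_delta` is a hypothetical name.

```python
import math
from itertools import combinations

def four_point_delta(dist, points):
    """Largest four-point-condition violation over all quadruples.

    Returns 0 for tree metrics; larger values mean less tree-like.
    Brute force over all quadruples, so only suitable for small samples.
    """
    worst = 0.0
    for x, y, z, w in combinations(points, 4):
        sums = sorted([dist(x, y) + dist(z, w),
                       dist(x, z) + dist(y, w),
                       dist(x, w) + dist(y, z)])
        worst = max(worst, (sums[2] - sums[1]) / 2.0)
    return worst

# Leaves of a star-shaped tree: every pair of distinct leaves is 2 apart.
def star_dist(a, b):
    return 0.0 if a == b else 2.0

delta_tree = four_point_delta(star_dist, ["a", "b", "c", "d"])

# Corners of a unit square in the Euclidean plane: not a tree metric.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
delta_square = four_point_delta(math.dist, square)
```

The star tree yields exactly 0, while the square yields √2 − 1, since its two diagonals sum to 2√2 and each pair of opposite sides sums to 2.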


ICML Conference 2025 Conference Paper

Learning Along the Arrow of Time: Hyperbolic Geometry for Backward-Compatible Representation Learning

Ngoc Bui
Menglin Yang
Runjin Chen
Leonardo Neves
Mingxuan Ju
Rex Ying
Neil Shah
Tong Zhao

Backward-compatible representation learning enables updated models to integrate seamlessly with existing ones, avoiding the need to reprocess stored data. Despite recent advances, existing compatibility approaches in Euclidean space neglect the uncertainty in the old embedding models and force the new model to replicate outdated representations regardless of their quality, thereby hindering the learning process. In this paper, we switch perspectives to hyperbolic geometry, where we treat time as a natural axis for capturing a model’s confidence and evolution. By lifting embeddings into hyperbolic space and constraining updated embeddings to lie within the entailment cone of the old ones, we maintain generational consistency across models while accounting for uncertainties in the representations. To further enhance compatibility, we introduce a robust contrastive alignment loss that dynamically adjusts alignment weights based on the uncertainty of the old embeddings. Experiments validate the superiority of the proposed method in achieving compatibility, paving the way for more resilient and adaptable machine learning systems.