Arrow Research search

Author name cluster

Rui Jiao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
2 author rows

Possible papers

ICLR Conference 2025 Conference Paper

DenoiseVAE: Learning Molecule-Adaptive Noise Distributions for Denoising-based 3D Molecular Pre-training

  • Yurou Liu
  • Jiahao Chen
  • Rui Jiao
  • Jiangmeng Li
  • Wenbing Huang 0001
  • Bing Su 0001

Denoising learning of 3D molecules learns molecular representations by adding noise to the equilibrium conformation and predicting the added noise to recover the equilibrium conformation, which essentially captures the information of molecular force fields. Due to the specificity of Potential Energy Surfaces, the probabilities of physically reasonable noises for each atom in different molecules are different. However, existing methods apply the same heuristic, hand-crafted noise sampling strategy to all molecules, resulting in inaccurate force field learning. In this paper, we propose a novel 3D molecular pre-training method, namely DenoiseVAE, which employs a Noise Generator to acquire atom-specific noise distributions for different molecules. It utilizes the stochastic reparameterization technique to sample noisy conformations from the generated distributions, which are fed into a Denoising Module for denoising. The Noise Generator and the Denoising Module are jointly learned in a manner conforming to the paradigm of the Variational Auto Encoder. Consequently, the sampled noisy conformations can be more diverse, adaptive, and informative, and thus DenoiseVAE can learn representations that better reveal the molecular force fields. Extensive experiments show that DenoiseVAE outperforms the current state-of-the-art methods on various molecular property prediction tasks, demonstrating its effectiveness.
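As a rough illustration of the reparameterization step mentioned in this abstract, the sketch below draws a noisy conformation as x + sigma * eps with eps sampled from a standard normal, using one noise scale per atom. The function name `sample_noisy_conformation` and the fixed `sigmas` values are hypothetical; in DenoiseVAE the scales would come from the learned Noise Generator, which is not reproduced here.

```python
import random

def sample_noisy_conformation(coords, sigmas, rng):
    """Reparameterized sample: x_noisy = x + sigma_i * eps, with eps ~ N(0, 1).

    coords: list of (x, y, z) equilibrium positions, one per atom.
    sigmas: list of per-atom noise scales (assumed given here; a learned
            generator would predict them in practice).
    Returns the noisy coordinates and the noise that was added.
    """
    noisy, noise = [], []
    for (x, y, z), sigma in zip(coords, sigmas):
        eps = [rng.gauss(0.0, 1.0) for _ in range(3)]
        delta = [sigma * e for e in eps]
        noise.append(tuple(delta))
        noisy.append((x + delta[0], y + delta[1], z + delta[2]))
    return noisy, noise

rng = random.Random(0)
coords = [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)]
sigmas = [0.0, 0.05]  # hypothetical: first atom frozen, second atom perturbed
noisy, noise = sample_noisy_conformation(coords, sigmas, rng)
```

Because the randomness enters only through `eps`, the sample is a deterministic function of the scales, which is what would let gradients reach a scale-predicting network in an autodiff framework. A denoising model would then be trained to predict `noise` from `noisy`.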

NeurIPS Conference 2025 Conference Paper

Learning 3D Anisotropic Noise Distributions Improves Molecular Force Fields

  • Xixian Liu
  • Rui Jiao
  • Zhiyuan Liu
  • Yurou Liu
  • Yang Liu
  • Ziheng Lu
  • Wenbing Huang
  • Yang Zhang

Coordinate denoising has emerged as a promising method for 3D molecular pretraining due to its theoretical connection to learning molecular force fields. However, existing denoising methods rely on oversimplified molecular dynamics that assume atomic motions to be isotropic and homoscedastic. To address these limitations, we propose a novel denoising framework AniDS: Anisotropic Variational Autoencoder for 3D Molecular Denoising. AniDS introduces a structure-aware anisotropic noise generator that can produce atom-specific, full covariance matrices for Gaussian noise distributions to better reflect directional and structural variability in molecular systems. These covariances are derived from pairwise atomic interactions as anisotropic corrections to an isotropic base. Our design ensures that the resulting covariance matrices are symmetric, positive semi-definite, and SO(3)-equivariant, while providing greater capacity to model complex molecular dynamics. Extensive experiments show that AniDS outperforms prior isotropic and homoscedastic denoising models and other leading methods on the MD17 and OC22 benchmarks, achieving average relative improvements of 8.9% and 6.2% in force prediction accuracy. Our case study on a crystal and molecule structure shows that AniDS adaptively suppresses noise along the bonding direction, consistent with physicochemical principles. Our code is available at https://github.com/ZeroKnighting/AniDS.
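One simple way to obtain a covariance with the three properties listed in this abstract is sketched below: start from an isotropic base and add, for each neighbor, a non-negative multiple of the projector onto the plane perpendicular to the bond direction. This is an illustrative construction chosen for this page, not the parameterization used in AniDS; the function names and the numeric weights are hypothetical.

```python
import math

def anisotropic_covariance(center, neighbors, weights, base_var):
    """Sigma = base_var * I + sum_j w_j * (I - u_j u_j^T).

    u_j is the unit vector from the center atom to neighbor j. Each
    correction term is a projector (symmetric, positive semi-definite),
    so non-negative weights keep Sigma symmetric PSD, and rotating all
    positions by R maps Sigma to R Sigma R^T.
    """
    sigma = [[base_var if r == c else 0.0 for c in range(3)] for r in range(3)]
    for nb, w in zip(neighbors, weights):
        if w < 0:
            raise ValueError("weights must be non-negative to stay PSD")
        d = [nb[k] - center[k] for k in range(3)]
        norm = math.sqrt(sum(x * x for x in d))
        u = [x / norm for x in d]
        for r in range(3):
            for c in range(3):
                identity = 1.0 if r == c else 0.0
                sigma[r][c] += w * (identity - u[r] * u[c])
    return sigma

def quadratic_form(sigma, v):
    """v^T Sigma v: the noise variance along a unit direction v."""
    return sum(v[r] * sigma[r][c] * v[c] for r in range(3) for c in range(3))

sigma = anisotropic_covariance(
    center=(0.0, 0.0, 0.0),
    neighbors=[(1.5, 0.0, 0.0)],  # one bonded neighbor along the x axis
    weights=[0.03],               # hypothetical correction strength
    base_var=0.01,
)
var_along = quadratic_form(sigma, (1.0, 0.0, 0.0))
var_across = quadratic_form(sigma, (0.0, 1.0, 0.0))
```

In this toy case the variance along the bond stays at the base value while the variance across it grows, so noise is relatively suppressed along the bonding direction.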

NeurIPS Conference 2025 Conference Paper

MOF-BFN: Metal-Organic Frameworks Structure Prediction via Bayesian Flow Networks

  • Rui Jiao
  • Hanlin Wu
  • Wenbing Huang
  • Yuxuan Song
  • Yawen Ouyang
  • Yu Rong
  • Tingyang Xu
  • Pengju Wang

Metal-Organic Frameworks (MOFs) have attracted considerable attention due to their unique properties including high surface area and tunable porosity, and promising applications in catalysis, gas storage, and drug delivery. Structure prediction for MOFs is a challenging task, as these frameworks are intrinsically periodic and hierarchically organized, where the entire structure is assembled from building blocks like metal nodes and organic linkers. To address this, we introduce MOF-BFN, a novel generative model for MOF structure prediction based on Bayesian Flow Networks (BFNs). Given the local geometry of building blocks, MOF-BFN jointly predicts the lattice parameters, as well as the positions and orientations of all building blocks within the unit cell. In particular, the positions are modeled in the fractional coordinate system to naturally incorporate the periodicity. Meanwhile, the orientations are modeled as unit quaternions sampled from learned Bingham distributions via the proposed Bingham BFN, enabling effective orientation generation on the 4D unit hypersphere. Experimental results demonstrate that MOF-BFN achieves state-of-the-art performance across multiple tasks, including structure prediction, geometric property evaluation, and de novo generation, offering a promising tool for designing complex MOF materials.

ICML Conference 2025 Conference Paper

UniMoMo: Unified Generative Modeling of 3D Molecules for De Novo Binder Design

  • Xiangzhe Kong
  • Zishen Zhang
  • Ziting Zhang
  • Rui Jiao
  • Jianzhu Ma
  • Wenbing Huang 0001
  • Kai Liu
  • Yang Liu 0005

The design of target-specific molecules such as small molecules, peptides, and antibodies is vital for biological research and drug discovery. Existing generative methods are restricted to single-domain molecules, failing to address versatile therapeutic needs or utilize cross-domain transferability to enhance model performance. In this paper, we introduce Unified generative Modeling of 3D Molecules (UniMoMo), the first framework capable of designing binders of multiple molecular domains using a single model. In particular, UniMoMo unifies the representations of different molecules as graphs of blocks, where each block corresponds to either a standard amino acid or a molecular fragment. Based on these unified representations, UniMoMo utilizes a geometric latent diffusion model for 3D molecular generation, featuring an iterative full-atom autoencoder to compress blocks into latent space points, followed by an E(3)-equivariant diffusion process. Extensive benchmarks across peptides, antibodies, and small molecules demonstrate the superiority of our unified framework over existing domain-specific models, highlighting the benefits of multi-domain training.

ICML Conference 2025 Conference Paper

Zero-Shot Cyclic Peptide Design via Composable Geometric Constraints

  • Dapeng Jiang
  • Xiangzhe Kong
  • Jiaqi Han
  • Mingyu Li
  • Rui Jiao
  • Wenbing Huang 0001
  • Stefano Ermon
  • Jianzhu Ma

Cyclic peptides, characterized by geometric constraints absent in linear peptides, offer enhanced biochemical properties, presenting new opportunities to address unmet medical needs. However, designing target-specific cyclic peptides remains underexplored due to limited training data. To bridge the gap, we propose CP-Composer, a novel generative framework that enables zero-shot cyclic peptide generation via composable geometric constraints. Our approach decomposes complex cyclization patterns into unit constraints, which are incorporated into a diffusion model through geometric conditioning on nodes and edges. During training, the model learns from unit constraints and their random combinations in linear peptides, while at inference, novel constraint combinations required for cyclization are imposed as input. Experiments show that our model, despite being trained with linear peptides, is capable of generating diverse target-binding cyclic peptides, reaching success rates from 38% to 84% on different cyclization strategies.

NeurIPS Conference 2024 Conference Paper

3D Structure Prediction of Atomic Systems with Flow-based Direct Preference Optimization

  • Rui Jiao
  • Xiangzhe Kong
  • Wenbing Huang
  • Yang Liu

Predicting high-fidelity 3D structures of atomic systems is a fundamental yet challenging problem in scientific domains. While recent work demonstrates the advantage of generative models in this realm, the exploration of different probability paths is still insufficient, and hallucinations during sampling persist. To address these pitfalls, we introduce FlowDPO, a novel framework that explores various probability paths with flow matching models and further suppresses hallucinations using Direct Preference Optimization (DPO) for structure generation. Our approach begins with a pre-trained flow matching model to generate multiple candidate structures for each training sample. These structures are then evaluated and ranked based on their distance to the ground truth, resulting in an automatic preference dataset. Using this dataset, we apply DPO to optimize the original model, improving its performance in generating structures closely aligned with the desired reference distribution. As confirmed by our theoretical analysis, this paradigm and objective function are compatible with arbitrary Gaussian paths, exhibiting favorable universality. Extensive experimental results on antibodies and crystals demonstrate substantial benefits of our FlowDPO, highlighting its potential to advance the field of 3D structure prediction with generative models.
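The ranking step described in this abstract can be sketched as follows: candidates are sorted by their distance to the ground truth, and the closest and farthest ones form a (chosen, rejected) pair. This is a minimal sketch under assumptions: the distance here is a plain RMSD without alignment, and taking only the best and worst candidate is one possible pairing rule. The names `rmsd` and `build_preference_pair` are hypothetical, and the DPO objective itself is not shown.

```python
import math

def rmsd(a, b):
    """Root-mean-square deviation between two equally sized coordinate lists (no alignment)."""
    sq = sum((p - q) ** 2 for pa, pb in zip(a, b) for p, q in zip(pa, pb))
    return math.sqrt(sq / len(a))

def build_preference_pair(candidates, ground_truth):
    """Rank candidates by distance to the ground truth; return (chosen, rejected)."""
    ranked = sorted(candidates, key=lambda c: rmsd(c, ground_truth))
    return ranked[0], ranked[-1]

truth = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
candidates = [
    [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (1.1, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
]
chosen, rejected = build_preference_pair(candidates, truth)
```

Repeating this for every training sample would yield a preference dataset without human labels, since the ground-truth structure supplies the ranking signal.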

ICML Conference 2024 Conference Paper

Equivariant Diffusion for Crystal Structure Prediction

  • Peijia Lin
  • Pin Chen
  • Rui Jiao
  • Qing Mo
  • Jianhuan Cen
  • Wenbing Huang 0001
  • Yang Liu 0005
  • Dan Huang 0001

In addressing the challenge of Crystal Structure Prediction (CSP), symmetry-aware deep learning models, particularly diffusion models, have been extensively studied, which treat CSP as a conditional generation task. However, ensuring permutation, rotation, and periodic translation equivariance during the diffusion process remains incompletely addressed. In this work, we propose EquiCSP, a novel equivariant diffusion-based generative model. We not only address the overlooked issue of lattice permutation equivariance in existing models, but also develop a unique noising algorithm that rigorously maintains periodic translation equivariance throughout both training and inference processes. Our experiments indicate that EquiCSP significantly surpasses existing models in terms of generating accurate structures and demonstrates faster convergence during the training process.

NeurIPS Conference 2024 Conference Paper

Learning Superconductivity from Ordered and Disordered Material Structures

  • Pin Chen
  • Luoxuan Peng
  • Rui Jiao
  • Qing Mo
  • Zhen Wang
  • Wenbing Huang
  • Yang Liu
  • Yutong Lu

Superconductivity is a fascinating phenomenon observed in certain materials under certain conditions. However, some critical aspects of it, such as the relationship between superconductivity and materials' chemical/structural features, still need to be understood. Recent successes of data-driven approaches in material science strongly inspire researchers to study this relationship with them, but a corresponding dataset is still lacking. Hence, we present a new dataset for data-driven approaches, namely SuperCon3D, containing both 3D crystal structures and experimental superconducting transition temperature (Tc) for the first time. Based on SuperCon3D, we propose two deep learning methods for designing high Tc superconductors. The first is SODNet, a novel equivariant graph attention model for screening known structures, which differs from existing models in incorporating both ordered and disordered geometric content. The second is a diffusion generative model DiffCSP-SC for creating new structures, which enables high Tc-targeted generation. Extensive experiments demonstrate that both our proposed dataset and models are advantageous for designing new high Tc superconducting candidates.

ICLR Conference 2024 Conference Paper

Space Group Constrained Crystal Generation

  • Rui Jiao
  • Wenbing Huang 0001
  • Yu Liu
  • Deli Zhao
  • Yang Liu 0005

Crystals are the foundation of numerous scientific and industrial applications. While various learning-based approaches have been proposed for crystal generation, existing methods neglect the spacegroup constraint, which is crucial in describing the geometry of crystals and closely related to many desirable properties. However, considering the spacegroup constraint is challenging owing to its diverse and nontrivial forms. In this paper, we reduce the spacegroup constraint to an equivalent formulation that is more tractable to build into the generation process. In particular, we translate the spacegroup constraint into two cases: the basis constraint of the invariant exponential space of the lattice matrix and the Wyckoff position constraint of the fractional coordinates. Upon the derived constraints, we then propose DiffCSP++, a novel diffusion model that enhances the previous work DiffCSP by further taking the spacegroup constraint into account. Experiments on several popular datasets verify the benefit of incorporating the spacegroup constraint, and show that our DiffCSP++ achieves the best or comparable performance on crystal structure prediction and ab initio crystal generation.

NeurIPS Conference 2023 Conference Paper

Crystal Structure Prediction by Joint Equivariant Diffusion

  • Rui Jiao
  • Wenbing Huang
  • Peijia Lin
  • Jiaqi Han
  • Pin Chen
  • Yutong Lu
  • Yang Liu

Crystal Structure Prediction (CSP) is crucial in various scientific disciplines. While CSP can be addressed by employing currently prevailing generative models (e.g., diffusion models), this task encounters unique challenges owing to the symmetric geometry of crystal structures, namely the invariance of translation, rotation, and periodicity. To incorporate the above symmetries, this paper proposes DiffCSP, a novel diffusion model to learn the structure distribution from stable crystals. To be specific, DiffCSP jointly generates the lattice and atom coordinates for each crystal by employing a periodic-E(3)-equivariant denoising model, to better model the crystal geometry. Notably, different from related equivariant generative approaches, DiffCSP leverages fractional coordinates rather than Cartesian coordinates to represent crystals, remarkably promoting the diffusion and the generation process of atom positions. Extensive experiments verify that our DiffCSP remarkably outperforms existing CSP methods, with a much lower computation cost in contrast to DFT-based methods. Moreover, the superiority of DiffCSP is still observed when it is extended for ab initio crystal generation.
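To make the role of fractional coordinates concrete, the sketch below expresses an atom position relative to the lattice vectors and shows that shifting it by whole lattice translations returns the same site after wrapping into [0, 1). The helper names `wrap` and `frac_to_cart`, the row-vector lattice convention, and the example cell are assumptions made for this illustration and are not taken from the DiffCSP code.

```python
def wrap(frac):
    """Map fractional coordinates into [0, 1) so periodic images coincide."""
    return tuple(f % 1.0 for f in frac)

def frac_to_cart(frac, lattice):
    """Cartesian position = sum_k frac[k] * lattice[k], with lattice rows as cell vectors."""
    return tuple(
        sum(frac[k] * lattice[k][axis] for k in range(3)) for axis in range(3)
    )

# Hypothetical orthorhombic cell with edge lengths 4, 4, and 6.
lattice = [(4.0, 0.0, 0.0), (0.0, 4.0, 0.0), (0.0, 0.0, 6.0)]
atom = (0.25, 0.5, 0.75)

# Shift by integer multiples of the lattice vectors: a different periodic image.
shifted = tuple(f + t for f, t in zip(atom, (1.0, -2.0, 3.0)))
same_site = wrap(shifted)

cart = frac_to_cart(atom, lattice)
```

In this representation periodicity reduces to arithmetic modulo 1 on each coordinate, independent of the cell shape, which is one reason a diffusion process on atom positions is convenient to define in fractional space.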

AAAI Conference 2023 Conference Paper

Energy-Motivated Equivariant Pretraining for 3D Molecular Graphs

  • Rui Jiao
  • Jiaqi Han
  • Wenbing Huang
  • Yu Rong
  • Yang Liu

Pretraining molecular representation models without labels is fundamental to various applications. Conventional methods mainly process 2D molecular graphs and focus solely on 2D tasks, making their pretrained models incapable of characterizing 3D geometry and thus defective for downstream 3D tasks. In this work, we tackle 3D molecular pretraining in a complete and novel sense. In particular, we first propose to adopt an equivariant energy-based model as the backbone for pretraining, which enjoys the merits of fulfilling the symmetry of 3D space. Then we develop a node-level pretraining loss for force prediction, where we further exploit the Riemann-Gaussian distribution to ensure that the loss is E(3)-invariant, improving robustness. Moreover, a graph-level noise scale prediction task is also leveraged to further improve the final performance. We evaluate our model pretrained from a large-scale 3D dataset GEOM-QM9 on two challenging 3D benchmarks: MD17 and QM9. Experimental results demonstrate the efficacy of our method against current state-of-the-art pretraining approaches, and verify the validity of our design for each proposed component. Code is available at https://github.com/jiaor17/3D-EMGP.