Arrow Research search

Author name cluster

Pin Chen

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

5 papers
2 author rows

Possible papers

5

AAAI Conference 2026 Conference Paper

FUSION: Dataset Pruning via Fusing Uncertainty with Structural Information for Optimal Neural Training in Crystal Property Prediction

  • Xiean Wang
  • Pin Chen
  • Liqin Tan
  • Yutong Lu
  • Qingsong Zou

The rapid expansion of materials databases offers unprecedented opportunities for accelerating materials discovery via machine learning. However, the widespread assumption that larger datasets inherently produce better models does not hold in practice. We propose FUSION (Fusing Uncertainty with Structural Information for Optimal Neural training), an offline dataset pruning strategy that synergistically combines uncertainty quantification with crystallographic structure analysis via geometric fingerprinting, framing dataset pruning as a discrete optimization problem. Through evaluation across 3 benchmark datasets, FUSION consistently outperforms baselines, including random pruning, uncertainty sampling, weighting factor pruning, diversity sampling, and active learning. It demonstrates robust transferability across 11 diverse architectures, outperforming random pruning by 1.91–13.65% across different datasets, with an average improvement of 6.36%. Moreover, our analysis suggests that different models exhibit varying robustness characteristics when faced with pruned training data, highlighting the importance of model selection tailored to dataset composition. We identify optimal pruning points where removing just 0–8% of training data improves model performance, yielding gains up to 12.67% in specific model–dataset combinations. These results establish a new paradigm for materials informatics that prioritizes data quality over quantity, offering a pathway toward more efficient and sustainable machine learning workflows in computational materials science.

ICLR Conference 2025 Conference Paper

ECD: A Machine Learning Benchmark for Predicting Enhanced-Precision Electronic Charge Density in Crystalline Inorganic Materials

  • Pin Chen
  • Zexin Xu
  • Qing Mo
  • Hongjin Zhong
  • Fengyang Xu
  • Yutong Lu

Supervised machine learning techniques are increasingly being adopted to speed up electronic structure predictions, serving as alternatives to first-principles methods like Density Functional Theory (DFT). Although current DFT datasets mainly emphasize chemical properties and atomic forces, the precise prediction of electronic charge density is essential for accurately determining a system's total energy and ground state properties. In this study, we introduce a novel electronic charge density dataset named ECD, which encompasses 140,646 stable crystal geometries with medium-precision Perdew–Burke–Ernzerhof (PBE) functional data. Within this dataset, a subset of 7,147 geometries includes high-precision electronic charge density data calculated using the Heyd–Scuseria–Ernzerhof (HSE) functional in DFT. By designing various benchmark tasks for crystalline materials and emphasizing training with large-scale PBE data while fine-tuning with a smaller subset of high-precision HSE data, we demonstrate the efficacy of current machine learning models in predicting electronic charge densities. The ECD dataset and baseline models are open-sourced to support community efforts in developing new methodologies and accelerating materials design and applications.

ICML Conference 2024 Conference Paper

Equivariant Diffusion for Crystal Structure Prediction

  • Peijia Lin
  • Pin Chen
  • Rui Jiao
  • Qing Mo
  • Jianhuan Cen
  • Wenbing Huang 0001
  • Yang Liu 0005
  • Dan Huang 0001

In addressing the challenge of Crystal Structure Prediction (CSP), symmetry-aware deep learning models, particularly diffusion models, have been extensively studied, which treat CSP as a conditional generation task. However, ensuring permutation, rotation, and periodic translation equivariance during diffusion process remains incompletely addressed. In this work, we propose EquiCSP, a novel equivariant diffusion-based generative model. We not only address the overlooked issue of lattice permutation equivariance in existing models, but also develop a unique noising algorithm that rigorously maintains periodic translation equivariance throughout both training and inference processes. Our experiments indicate that EquiCSP significantly surpasses existing models in terms of generating accurate structures and demonstrates faster convergence during the training process.

NeurIPS Conference 2024 Conference Paper

Learning Superconductivity from Ordered and Disordered Material Structures

  • Pin Chen
  • Luoxuan Peng
  • Rui Jiao
  • Qing Mo
  • Zhen Wang
  • Wenbing Huang
  • Yang Liu
  • Yutong Lu

Superconductivity is a fascinating phenomenon observed in certain materials under certain conditions. However, some critical aspects of it, such as the relationship between superconductivity and materials' chemical/structural features, still need to be understood. Recent successes of data-driven approaches in material science strongly inspire researchers to study this relationship with them, but a corresponding dataset is still lacking. Hence, we present a new dataset for data-driven approaches, namely SuperCon3D, containing both 3D crystal structures and experimental superconducting transition temperature (Tc) for the first time. Based on SuperCon3D, we propose two deep learning methods for designing high Tc superconductors. The first is SODNet, a novel equivariant graph attention model for screening known structures, which differs from existing models in incorporating both ordered and disordered geometric content. The second is a diffusion generative model DiffCSP-SC for creating new structures, which enables high Tc-targeted generation. Extensive experiments demonstrate that both our proposed dataset and models are advantageous for designing new high Tc superconducting candidates.

NeurIPS Conference 2023 Conference Paper

Crystal Structure Prediction by Joint Equivariant Diffusion

  • Rui Jiao
  • Wenbing Huang
  • Peijia Lin
  • Jiaqi Han
  • Pin Chen
  • Yutong Lu
  • Yang Liu

Crystal Structure Prediction (CSP) is crucial in various scientific disciplines. While CSP can be addressed by employing currently-prevailing generative models ( e. g. diffusion models), this task encounters unique challenges owing to the symmetric geometry of crystal structures---the invariance of translation, rotation, and periodicity. To incorporate the above symmetries, this paper proposes DiffCSP, a novel diffusion model to learn the structure distribution from stable crystals. To be specific, DiffCSP jointly generates the lattice and atom coordinates for each crystal by employing a periodic-E(3)-equivariant denoising model, to better model the crystal geometry. Notably, different from related equivariant generative approaches, DiffCSP leverages fractional coordinates other than Cartesian coordinates to represent crystals, remarkably promoting the diffusion and the generation process of atom positions. Extensive experiments verify that our DiffCSP remarkably outperforms existing CSP methods, with a much lower computation cost in contrast to DFT-based methods. Moreover, the superiority of DiffCSP is still observed when it is extended for ab initio crystal generation.