Arrow Research search

Author name cluster

Nianzu Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

10 papers
2 author rows

Possible papers

10

NeurIPS 2025 · Conference Paper

CAPability: A Comprehensive Visual Caption Benchmark for Evaluating Both Correctness and Thoroughness

  • Zhihang Liu
  • Chen-Wei Xie
  • Bin Wen
  • Feiwu Yu
  • Pandeng Li
  • Boqiang Zhang
  • Nianzu Yang
  • Zuan Gao

Visual captioning benchmarks have become outdated with the emergence of modern multimodal large language models (MLLMs), as the brief ground-truth sentences and traditional metrics fail to assess detailed captions effectively. While recent benchmarks attempt to address this by focusing on keyword extraction or object-centric evaluation, they remain limited to vague-view or object-view analyses and incomplete visual element coverage. In this paper, we introduce CAPability, a comprehensive multi-view benchmark for evaluating visual captioning across 12 dimensions spanning six critical views. We curate nearly 11K human-annotated images and videos with visual element annotations to evaluate the generated captions. CAPability stably assesses both the correctness and thoroughness of captions with precision and hit metrics. By converting annotations to QA pairs, we further introduce a heuristic metric, know but cannot tell (KT̄), indicating a significant performance gap between QA and caption capabilities. Our work provides a holistic analysis of MLLMs' captioning abilities, as we identify their strengths and weaknesses across various dimensions, guiding future research to enhance specific aspects of their capabilities.
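Stripped of the benchmark's actual pipeline, precision/hit-style caption scoring reduces to comparing the set of visual elements extracted from a generated caption against the human-annotated elements. The sketch below is purely illustrative: the function and element names are hypothetical and are not CAPability's code.

```python
# Hypothetical sketch of precision/hit-style caption scoring.
# Element sets and names are illustrative, not CAPability's actual pipeline.

def caption_scores(predicted_elements, annotated_elements):
    """Score one caption against human annotations for a single dimension.

    predicted_elements: set of visual elements a judge extracted from the caption
    annotated_elements: set of ground-truth elements annotated for the image/video
    """
    correct = predicted_elements & annotated_elements
    # precision: how much of what the caption says is correct (correctness)
    precision = len(correct) / len(predicted_elements) if predicted_elements else 0.0
    # hit: how much of the annotated content the caption covers (thoroughness)
    hit = len(correct) / len(annotated_elements)
    return precision, hit

p, h = caption_scores({"red car", "two people"},
                      {"red car", "two people", "rainy street"})
```

A caption can thus be fully correct (precision 1.0) while still incomplete (hit below 1.0), which is exactly the correctness-vs-thoroughness distinction the abstract draws.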

ICML 2025 · Conference Paper

DSBRouter: End-to-end Global Routing via Diffusion Schrödinger Bridge

  • Liangliang Shi
  • Shenhui Zhang
  • Xingbo Du
  • Nianzu Yang
  • Junchi Yan

Global routing (GR) is a fundamental task in modern chip design and various learning techniques have been devised. However, a persistent challenge is the inherent lack of a mechanism to guarantee routing connectivity in the network's prediction results, necessitating post-processing search or reinforcement learning (RL) to enforce connectivity. In this paper, we propose a neural GR solver called DSBRouter, leveraging the Diffusion Schrödinger Bridge (DSB) model for GR. During training, unlike previous works that learn the mapping from noise to routes, we establish a bridge between the initial pins and the routing via DSB, which learns the forward and backward mapping between them. For inference, based on the evaluation metric (e.g., low overflow), we further introduce a sampling scheme with evaluation-based guidance to enhance the routing predictions. Note that DSBRouter is an end-to-end model that does not require a post-step to ensure connectivity. Empirical results show that it achieves SOTA performance on overflow reduction in ISPD98 and part of ISPD07. In some cases, DSBRouter can even generate routes with zero overflow.

NeurIPS 2025 · Conference Paper

Repurposing AlphaFold3-like Protein Folding Models for Antibody Sequence and Structure Co-design

  • Nianzu Yang
  • Songlin Jiang
  • Jian Ma
  • Huaijin Wu
  • Shuangjia Zheng
  • Wengong Jin
  • Junchi Yan

Diffusion models hold great potential for accelerating antibody design, but their performance is so far limited by the number of antibody-antigen complexes used for model training. Meanwhile, AlphaFold3-like protein folding models, pre-trained on a large corpus of crystal structures, have acquired a broad understanding of biomolecular interaction. Based on this insight, we develop a new antigen-conditioned antibody design model by adapting the diffusion module of AlphaFold3-like models for sequence-structure co-diffusion. Specifically, we extend their structure diffusion module with a sequence diffusion head and fine-tune the entire protein folding model for antibody sequence-structure co-design. Our benchmark results show that sequence-structure co-diffusion models not only surpass state-of-the-art antibody design methods in performance but also maintain structure prediction accuracy comparable to the original folding model. Notably, in the antibody co-design task, our method achieves a CDR-H3 recovery rate of 65% for typical antibodies, outperforming the baselines by 87%, and attains a remarkable 63% recovery rate for nanobodies.

ICLR 2025 · Conference Paper

Unify ML4TSP: Drawing Methodological Principles for TSP and Beyond from Streamlined Design Space of Learning and Search

  • Yang Li 0197
  • Jiale Ma
  • Wenzheng Pan
  • Runzhong Wang
  • Haoyu Geng
  • Nianzu Yang
  • Junchi Yan

Despite the rich works on machine learning (ML) for combinatorial optimization (CO), a unified, principled framework remains lacking. This study utilizes the Travelling Salesman Problem (TSP) as a major case study, with adaptations demonstrated for other CO problems, dissecting established mainstream learning-based solvers to outline a comprehensive design space. We present ML4TSPBench, which advances a unified modular streamline incorporating existing technologies in both learning and search for transparent ablation, aiming to reassess the role of learning and discern which parts of existing techniques are genuinely beneficial and which are not. This further leads to the investigation of desirable principles of learning designs and the exploration of concepts guiding method designs. We demonstrate the desirability of principles such as joint probability estimation, symmetry solution representation, and online optimization for learning-based designs. Leveraging the findings, we propose enhancements to existing methods to compensate for their missing attributes, thereby advancing performance and enriching the technique library. From a higher viewpoint, we also uncover a performance advantage in non-autoregressive and supervised paradigms compared to their counterparts. The strategic decoupling and organic recompositions yield a factory of new TSP solvers, where we investigate synergies across various method combinations and pinpoint the optimal design choices to create more powerful ML4TSP solvers, thereby facilitating and offering a reference for future research and engineering endeavors.

ICLR 2024 · Conference Paper

EBMDock: Neural Probabilistic Protein-Protein Docking via a Differentiable Energy Model

  • Huaijin Wu
  • Wei Liu 0005
  • Yatao An Bian
  • Jiaxiang Wu 0001
  • Nianzu Yang
  • Junchi Yan

Protein complex formation, a pivotal challenge in contemporary biology, has recently gained interest from the machine learning community, particularly concerning protein-ligand docking tasks. In this paper, we delve into the equally crucial but comparatively under-investigated domain of protein-protein docking. Specifically, we propose a geometric deep learning framework, termed EBMDock, which employs statistical potential as its energy function. This approach produces a probability distribution over docking poses, such that the identified docking pose aligns with a minimum point in the energy landscape. We employ a differentiable algorithm grounded in Langevin dynamics to efficiently sample from the docking pose distribution. Additionally, we incorporate energy-based training using contrastive divergence, enhancing both performance and stability. Empirical results demonstrate that our approach achieves superior performance on two benchmark datasets, DIPS and DB5.5. Furthermore, the results suggest EBMDock can serve as an orthogonal enhancement to existing methods.
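Langevin-dynamics sampling of the kind the abstract describes can be sketched generically: repeatedly take a step against the energy gradient plus Gaussian noise, so samples concentrate in low-energy regions. The toy quadratic energy below stands in for EBMDock's learned statistical potential; the function names and hyperparameters are illustrative only.

```python
import numpy as np

def langevin_sample(energy_grad, x0, steps=500, step_size=1e-2, rng=None):
    # Unadjusted Langevin update: x <- x - (eta/2) * grad E(x) + sqrt(eta) * noise.
    # For small eta this approximately samples from p(x) ∝ exp(-E(x)).
    rng = np.random.default_rng(0) if rng is None else rng
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        noise = rng.standard_normal(x.shape)
        x = x - 0.5 * step_size * energy_grad(x) + np.sqrt(step_size) * noise
    return x

# Toy energy E(x) = ||x||^2 / 2 (gradient is x), so the chain is pulled
# from the far-away start toward the low-energy region near the origin.
x = langevin_sample(lambda v: v, np.ones(3) * 5.0)
```

In the docking setting, `x` would parameterize a pose and `energy_grad` would come from backpropagating through the learned energy model.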

ICML 2024 · Conference Paper

MorphGrower: A Synchronized Layer-by-layer Growing Approach for Plausible Neuronal Morphology Generation

  • Nianzu Yang
  • Kaipeng Zeng
  • Haotian Lu 0009
  • Yexin Wu
  • Zexin Yuan
  • Danni Chen
  • Shengdian Jiang
  • Jiaxiang Wu 0001

Neuronal morphology is essential for studying brain functioning and understanding neurodegenerative disorders. As acquiring real-world morphology data is expensive, computational approaches for morphology generation have been studied. Traditional methods heavily rely on expert-set rules and parameter tuning, making it difficult to generalize across different types of morphologies. Recently, MorphVAE was introduced as the sole learning-based method, but its generated morphologies lack plausibility, i.e., they do not appear realistic enough and most of the generated samples are topologically invalid. To fill this gap, this paper proposes MorphGrower, which mimics the natural growth mechanism of neurons for generation. Specifically, MorphGrower generates morphologies layer by layer, with each subsequent layer conditioned on the previously generated structure. During each layer generation, MorphGrower utilizes a pair of sibling branches as the basic generation block and generates branch pairs synchronously. This approach ensures topological validity and allows for fine-grained generation, thereby enhancing the realism of the final generated morphologies. Results on four real-world datasets demonstrate that MorphGrower outperforms MorphVAE by a notable margin. Importantly, the electrophysiological response simulation demonstrates the plausibility of our generated samples from a neuroscience perspective. Our code is available at https://github.com/Thinklab-SJTU/MorphGrower.

ICML 2024 · Conference Paper

SSL4Q: Semi-Supervised Learning of Quantum Data with Application to Quantum State Classification

  • Yehui Tang 0002
  • Nianzu Yang
  • Mabiao Long
  • Junchi Yan

The accurate classification of quantum states is crucial for advancing quantum computing, as it allows for the effective analysis and correct functioning of quantum devices by analyzing the statistics of the data from quantum measurements. Traditional supervised methods, which rely on extensive labeled measurement outcomes, are used to categorize unknown quantum states with different properties. However, the labeling process demands computational and memory resources that increase exponentially with the number of qubits. We propose SSL4Q, which achieves (for the first time) semi-supervised learning specifically designed for quantum state classification. SSL4Q's architecture is tailored to ensure permutation invariance for unordered quantum measurements and to maintain robustness in the face of measurement uncertainties. Our empirical studies encompass simulations on two types of quantum systems: the Heisenberg Model and the Variational Quantum Circuit (VQC) Model, with system sizes reaching up to 50 qubits. The numerical results demonstrate SSL4Q's superiority over traditional supervised models in scenarios with limited labels, highlighting its potential in efficiently classifying quantum states with reduced computational and resource overhead.
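Permutation invariance over an unordered set of measurement shots is commonly obtained with a DeepSets-style encode-pool-decode pattern: embed each shot independently, pool with an order-insensitive reduction such as the mean, then map the pooled vector to the output. The sketch below illustrates that general pattern only, not SSL4Q's actual architecture; `phi` and `rho` are placeholder feature maps.

```python
import numpy as np

def permutation_invariant_embed(measurements, phi, rho):
    # DeepSets-style set encoder: per-element map phi, order-insensitive
    # mean pooling, then a set-level head rho.
    pooled = np.mean([phi(m) for m in measurements], axis=0)
    return rho(pooled)

phi = lambda m: np.concatenate([m, m ** 2])  # toy per-shot feature map
rho = lambda z: np.tanh(z)                   # toy set-level head

shots = [np.array([0.0, 1.0]), np.array([1.0, 0.0]), np.array([1.0, 1.0])]
a = permutation_invariant_embed(shots, phi, rho)
b = permutation_invariant_embed(shots[::-1], phi, rho)
# a and b coincide because mean pooling ignores the order of the shots
```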

ICLR 2024 · Conference Paper

Towards LLM4QPE: Unsupervised Pretraining of Quantum Property Estimation and A Benchmark

  • Yehui Tang 0002
  • Hao Xiong 0003
  • Nianzu Yang
  • Tailong Xiao
  • Junchi Yan

Estimating the properties of quantum systems, such as quantum phase, has been critical in addressing essential quantum many-body problems in physics and chemistry. Deep learning models have recently been introduced to property estimation, surpassing conventional statistical approaches. However, these methods are tailored to the specific task and quantum data at hand. It remains an open and attractive question to devise a more universal, task-agnostic pretraining model for quantum property estimation. In this paper, we propose LLM4QPE, a large language model style quantum task-agnostic pretraining and finetuning paradigm that 1) performs unsupervised pretraining on diverse quantum systems with different physical conditions; 2) uses the pretrained model for supervised finetuning and delivers high performance on downstream tasks with limited training data. It mitigates the cost of quantum data collection and speeds up convergence. Extensive experiments show the promising efficacy of LLM4QPE in various tasks, including classifying quantum phases of matter on the Rydberg atom model and predicting the two-body correlation function on the anisotropic Heisenberg model.

NeurIPS 2022 · Conference Paper

Learning Substructure Invariance for Out-of-Distribution Molecular Representations

  • Nianzu Yang
  • Kaipeng Zeng
  • Qitian Wu
  • Xiaosong Jia
  • Junchi Yan

Molecule representation learning (MRL) has been extensively studied and current methods have shown promising power for various tasks, e.g., molecular property prediction and target identification. However, a common hypothesis of existing methods is that either the model development or experimental evaluation is mostly based on i.i.d. data across training and testing. Such a hypothesis can be violated in real-world applications where testing molecules could come from new environments, bringing about serious performance degradation or unexpected predictions. We propose a new representation learning framework entitled MoleOOD to enhance the robustness of MRL models against such distribution shifts, motivated by the observation that the (bio)chemical properties of molecules are usually invariantly associated with certain privileged molecular substructures across different environments (e.g., scaffolds, sizes, etc.). Specifically, we introduce an environment inference model to identify the latent factors that impact data generation from different distributions in a fully data-driven manner. We also propose a new learning objective to guide the molecule encoder to leverage environment-invariant substructures that relate more stably to the labels across environments. Extensive experiments on ten real-world datasets demonstrate that our model has stronger generalization ability than existing methods under various out-of-distribution (OOD) settings, despite the absence of manual specifications of environments. Particularly, our method achieves up to 5.9% and 3.9% improvement over the strongest baselines on the OGB and DrugOOD benchmarks in terms of ROC-AUC, respectively. Our source code is publicly available at https://github.com/yangnianzu0515/MoleOOD.

ICML 2021 · Conference Paper

Learning Self-Modulating Attention in Continuous Time Space with Applications to Sequential Recommendation

  • Chao Chen 0016
  • Haoyu Geng
  • Nianzu Yang
  • Junchi Yan
  • Daiyue Xue
  • Jianping Yu
  • Xiaokang Yang 0001

User interests are usually dynamic in the real world, which poses both theoretical and practical challenges for learning accurate preferences from rich behavior data. Among existing user behavior modeling solutions, attention networks are widely adopted for their effectiveness and relative simplicity. Despite being extensively studied, existing attentions still suffer from two limitations: i) conventional attentions mainly take into account the spatial correlation between user behaviors, regardless of the distance between those behaviors in the continuous time space; and ii) these attentions mostly provide a dense and undistinguished distribution over all past behaviors and then attentively encode them into the output latent representations. This, however, is not suitable in practical scenarios where a user's future actions are relevant to only a small subset of her/his historical behaviors. In this paper, we propose a novel attention network, named self-modulating attention, that models the complex and non-linearly evolving dynamic user preferences. We empirically demonstrate the effectiveness of our method on top-N sequential recommendation tasks, and the results on three large-scale real-world datasets show that our model can achieve state-of-the-art performance.
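A minimal way to make attention sensitive to continuous-time gaps, rather than only to content similarity, is to penalize each attention score by the elapsed time before the softmax. The toy sketch below illustrates that general idea only; it is not the paper's self-modulating formulation, and the decay form and names are hypothetical.

```python
import numpy as np

def time_aware_attention(query, keys, values, timestamps, t_now, decay=0.1):
    # Standard dot-product scores, then subtract a penalty proportional to
    # the continuous-time gap so older behaviors receive less weight.
    scores = keys @ query
    scores = scores - decay * (t_now - np.asarray(timestamps))
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights = weights / weights.sum()
    return weights @ values

# Two behaviors with identical content but different recency: the output
# leans toward the value of the more recent behavior.
query = np.array([1.0, 0.0])
keys = np.array([[1.0, 0.0], [1.0, 0.0]])
values = np.array([[0.0], [1.0]])          # old behavior -> 0, recent -> 1
out = time_aware_attention(query, keys, values, timestamps=[0.0, 10.0], t_now=10.0)
```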