Author name cluster

Zichen Wang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

21 papers

2 author rows

AAAI Conference 2026 Conference Paper

SITA: A Framework for Structure-to-Instance Theorem Autoformalization

Chenyi Li
Wanli Ma
Zichen Wang
Zaiwen Wen

While large language models (LLMs) have shown progress in mathematical reasoning, they still face challenges in formalizing theorems that arise from instantiating abstract structures in concrete settings. With the goal of auto-formalizing mathematical results at the research level, we develop a framework for structure-to-instance theorem autoformalization (SITA), which systematically bridges the gap between abstract mathematical theories and their concrete applications in Lean proof assistant. Formalized abstract structures are treated as modular templates that contain definitions, assumptions, operations, and theorems. These templates serve as reusable guides for the formalization of concrete instances. Given a specific instantiation, we generate corresponding Lean definitions and instance declarations, integrate them using Lean’s typeclass mechanism, and construct verified theorems by checking structural assumptions. We incorporate LLM-based generation with feedback-guided refinement to ensure both automation and formal correctness. Experiments on a dataset of optimization problems demonstrate that SITA effectively formalizes diverse instances grounded in abstract structures.

PDF Details DOI

ICML Conference 2025 Conference Paper

AffinityFlow: Guided Flows for Antibody Affinity Maturation

Can Chen
Karla-Luise Herpoldt
Chenchao Zhao
Zichen Wang
Marcus D. Collins
Shang Shang
Ron Benson

Antibodies are widely used as therapeutics, but their development requires costly affinity maturation, involving iterative mutations to enhance binding affinity. This paper explores a sequence-only scenario for affinity maturation, using solely antibody and antigen sequences. Recently AlphaFlow wraps AlphaFold within flow matching to generate diverse protein structures, enabling a sequence-conditioned generative model of structure. Building on this, we propose an alternating optimization framework that (1) fixes the sequence to guide structure generation toward high binding affinity using a structure-based predictor, then (2) applies inverse folding to create sequence mutations, refined by a sequence-based predictor. A key challenge is the lack of labeled data for training both predictors. To address this, we develop a co-teaching module that incorporates valuable information from noisy biophysical energies into predictor refinement. The sequence-based predictor selects consensus samples to teach the structure-based predictor, and vice versa. Our method, AffinityFlow, achieves state-of-the-art performance in proof-of-concept affinity maturation experiments.

Details

AAAI Conference 2025 Conference Paper

C2F-TP: A Coarse-to-Fine Denoising Framework for Uncertainty-Aware Trajectory Prediction

Zichen Wang
Hao Miao
Senzhang Wang
Renzhi Wang
Jianxin Wang
Jian Zhang

Accurately predicting the trajectory of vehicles is critically important for ensuring safety and reliability in autonomous driving. Although considerable research efforts have been made recently, the inherent trajectory uncertainty caused by various factors including the dynamic driving intends and the diverse driving scenarios still poses significant challenges to accurate trajectory prediction. To address this issue, we propose C2F-TP, a coarse-to-fine denoising framework for uncertainty-aware vehicle trajectory prediction. C2F-TP features an innovative two-stage coarse-to-fine prediction process. Specifically, in the spatial-temporal interaction stage, we propose a spatial-temporal interaction module to capture the inter-vehicle interactions and learn a multimodal trajectory distribution, from which a certain number of noisy trajectories are sampled. Next, in the trajectory refinement stage, we design a conditional denoising model to reduce the uncertainty of the sampled trajectories through a step-wise denoising operation. Extensive experiments are conducted on two real datasets NGSIM and highD that are widely adopted in trajectory prediction. The result demonstrates the effectiveness of our proposal.

PDF Details DOI

NeurIPS Conference 2025 Conference Paper

Design-Based Bandits Under Network Interference: Trade-Off Between Regret and Statistical Inference

Zichen Wang
Haoyang Hong
Chuanhao Li
Haoxuan Li
Zhiheng Zhang
Huazheng Wang

In multi-armed bandits with network interference (MABNI), the action taken by one node can influence the rewards of others, creating complex interdependence. While existing research on MABNI largely concentrates on minimizing regret, it often overlooks the crucial concern that an excessive emphasis on the optimal arm can undermine the inference accuracy for sub-optimal arms. Although initial efforts have been made to address this trade-off in single-unit scenarios, these challenges have become more pronounced in the context of MABNI. In this paper, we establish, for the first time, a theoretical Pareto frontier characterizing the trade-off between regret minimization and inference accuracy in adversarial (design-based) MABNI. We further introduce an anytime-valid asymptotic confidence sequence along with a corresponding algorithm, $\texttt{EXP3-N-CS}$, specifically designed to balance the trade-off between regret minimization and inference accuracy in this setting.

PDF Details

AAAI Conference 2025 Conference Paper

Fed-DFA: Federated Distillation for Heterogeneous Model Fusion Through the Adversarial Lens

Zichen Wang
Feng Yan
Tianyi Wang
Cong Wang
Yuanchao Shu
Peng Cheng
Jiming Chen

Most of the federated learning techniques are limited to homogeneous model fusion. With the rapid growth of smart applications on resource-constrained edge devices, it becomes a barrier to accommodate their heterogeneous computing power and memory in the real world. Federated Distillation is a promising alternative to enable aggregation from heterogeneous models. However, the effectiveness of knowledge transfer still remains elusive under the shadow of distinct representation power from heterogeneous models. In this paper, we approach from an adversarial perspective to characterize the decision boundaries during distillation. By leveraging K-step PGD attacks, we successfully model the dynamics of the closest boundary points and establish a quantitative connection between the predictive uncertainty and boundary margin. Based on these findings, we further propose a new loss function to make the distillation attend to samples close to the decision boundaries, thus learning from more informed logit distributions. The extensive experiments over CIFAR-10/100 and Tiny-ImageNet demonstrate about 0.5-3.5% improvement of accuracy under different IID and non-IID settings, with only a small increment of computational overhead.

PDF Details DOI

TMLR Journal 2025 Journal Article

LC-PLM: Long-context Protein Language Modeling Using Bidirectional Mamba with Shared Projection Layers

Yingheng Wang
Zichen Wang
Gil Sadeh
Luca Zancato
Alessandro Achille
George Karypis
Huzefa Rangwala

Self-supervised training of language models (LMs) has seen great success for protein sequences in learning meaningful representations and for generative drug design. Most protein LMs are based on the Transformer architecture trained on individual proteins with short context lengths. Such protein LMs cannot extrapolate to longer proteins and protein complexes well. They also fail to account for the underlying biological mechanisms carried out by biomolecular interactions and dynamics i.e., proteins often interact with other proteins, molecules, and pathways in complex biological systems. In this work, we propose LC-PLM based on an alternative protein LM architecture, BiMamba-S, built upon selective structured state-space models, to learn high-quality universal protein representations at the amino acid token level using masked language modeling. We also introduce its graph-contextual variant, LC-PLM, which contextualizes protein-protein interaction (PPI) graphs for a second stage of training. LC-PLM demonstrates favorable neural scaling laws, better length extrapolation capability, and up to 30% and 16% improvements on protein downstream tasks compared to Transformer-based ESM-2 when trained with 100B and 1T tokens, respectively. LC-PLM-G further trained within the context of PPI graphs shows promising results on protein structure and function prediction tasks. Our study demonstrates the benefit of increasing the context size with computationally efficient LM architecture (e.g., structured state space models) in learning universal protein representations and incorporating molecular interaction contexts contained in biological graphs. Model is available at github.com/amazon-science/LC-PLM.

PDF Details

NeurIPS Conference 2025 Conference Paper

Online Experimental Design With Estimation-Regret Trade-off Under Network Interference

Zhiheng Zhang
Zichen Wang

Network interference has attracted significant attention in the field of causal inference, encapsulating various sociological behaviors in which the treatment assigned to one individual within a network may affect the outcomes of others, such as their neighbors. A key challenge in this setting is that standard causal inference methods often assume independent treatment effects among individuals, which may not hold in networked environments. To estimate interference-aware causal effects, a traditional approach is to inherit the independent settings, where practitioners randomly assign experimental participants to different groups and compare their outcomes. Although effective in offline settings, this strategy becomes problematic in sequential experiments, where suboptimal decisions persist, leading to substantial regret. To address this issue, we introduce a unified interference-aware framework for online experimental design. Compared to existing studies, we extend the definition of arm space using the statistical concept of exposure mapping, which allows for a more flexible and context-aware representation of treatment effects in network settings. Crucially, we establish a Pareto-optimal trade-off between estimation accuracy and regret under the network concerning both time period and arm space, which remains superior to baseline models even without network interference. Furthermore, we propose an algorithmic implementation and discuss its generalization in different learning settings and network topology.

PDF Details

ICML Conference 2025 Conference Paper

Provably Efficient Algorithm for Best Scoring Rule Identification in Online Principal-Agent Information Acquisition

Zichen Wang
Chuanhao Li 0002
Huazheng Wang

We investigate the problem of identifying the optimal scoring rule within the principal-agent framework for online information acquisition problem. We focus on the principal’s perspective, seeking to determine the desired scoring rule through interactions with the agent. To address this challenge, we propose two algorithms: OIAFC and OIAFB, tailored for fixed confidence and fixed budget settings, respectively. Our theoretical analysis demonstrates that OIAFC can extract the desired $(\epsilon, \delta)$-scoring rule with a efficient instance-dependent sample complexity or an instance-independent sample complexity. Our analysis also shows that OIAFB matches the instance-independent performance bound of OIAFC, while both algorithms share the same complexity across fixed confidence and fixed budget settings.

Details

ICLR Conference 2025 Conference Paper

Retrieval Augmented Diffusion Model for Structure-informed Antibody Design and Optimization

Zichen Wang
Yaokun Ji
Jianing Tian
Shuangjia Zheng

Antibodies are essential proteins responsible for immune responses in organisms, capable of specifically recognizing antigen molecules of pathogens. Recent advances in generative models have significantly enhanced rational antibody design. However, existing methods mainly create antibodies from scratch without template constraints, leading to model optimization challenges and unnatural sequences. To address these issues, we propose a retrieval-augmented diffusion framework, termed RADAb, for efficient antibody design. Our method leverages a set of structural homologous motifs that align with query structural constraints to guide the generative model in inversely optimizing antibodies according to desired design criteria. Specifically, we introduce a structure-informed retrieval mechanism that integrates these exemplar motifs with the input backbone through a novel dual-branch denoising module, utilizing both structural and evolutionary information. Additionally, we develop a conditional diffusion model that iteratively refines the optimization process by incorporating both global context and local evolutionary conditions. Our approach is agnostic to the choice of generative models. Empirical experiments demonstrate that our method achieves state-of-the-art performance in multiple antibody inverse folding and optimization tasks, offering a new perspective on biomolecular generative models.

Details

NeurIPS Conference 2025 Conference Paper

RiboFlow: Conditional De Novo RNA Co-Design via Synergistic Flow Matching

Runze Ma
Zhongyue Zhang
Zichen Wang
Chenqing Hua
Jiahua Rao
Zhuomin Zhou
Shuangjia Zheng

Ribonucleic acid (RNA) binds to molecules to achieve specific biological functions. While generative models are advancing biomolecule design, existing methods for designing RNA that target specific ligands face limitations in capturing RNA’s conformational flexibility, ensuring structural validity, and overcoming data scarcity. To address these challenges, we introduce RiboFlow, a synergistic flow matching model to co-design RNA structures and sequences based on target molecules. By integrating RNA backbone frames, torsion angles, and sequence features in an unified architecture, RiboFlow explicitly models RNA’s dynamic conformations while enforcing sequence-structure consistency to improve validity. Additionally, we curate RiboBind, a large-scale dataset of RNA-molecule interactions, to resolve the scarcity of high-quality structural data. Extensive experiments reveal that RiboFlow not only outperforms state-of-the-art RNA design methods by a large margin but also showcases controllable capabilities for achieving high binding affinity to target ligands. Our work bridges critical gaps in controllable RNA design, offering a framework for structure-aware, data-efficient generation.

PDF Details

NeurIPS Conference 2025 Conference Paper

Toward Engineering AGI: Benchmarking the Engineering Design Capabilities of LLMs

Xingang Guo
Yaxin Li
XiangYi Kong
YILAN JIANG
Xiayu Zhao
Zhihua Gong
Yufan Zhang
Daixuan Li

Modern engineering, spanning electrical, mechanical, aerospace, civil, and computer disciplines, stands as a cornerstone of human civilization and the foundation of our society. However, engineering design poses a fundamentally different challenge for large language models (LLMs) compared with traditional textbook-style problem solving or factual question answering. Although existing benchmarks have driven progress in areas such as language understanding, code synthesis, and scientific problem solving, real-world engineering design demands the synthesis of domain knowledge, navigation of complex trade-offs, and management of the tedious processes that consume much of practicing engineers' time. Despite these shared challenges across engineering disciplines, no benchmark currently captures the unique demands of engineering design work. In this work, we introduce EngDesign, an Engineering Design benchmark that evaluates LLMs' abilities to perform practical design tasks across nine engineering domains. Unlike existing benchmarks that focus on factual recall or question answering, EngDesign uniquely emphasizes LLMs' ability to synthesize domain knowledge, reason under constraints, and generate functional, objective-oriented engineering designs. Each task in EngDesign represents a real-world engineering design problem, accompanied by a detailed task description specifying design goals, constraints, and performance requirements. EngDesign pioneers a simulation-based evaluation paradigm that moves beyond textbook knowledge to assess genuine engineering design capabilities and shifts evaluation from static answer checking to dynamic, simulation-driven functional verification, marking a crucial step toward realizing the vision of engineering Artificial General Intelligence (AGI).

PDF Details

NeurIPS Conference 2025 Conference Paper

Tree-Based Premise Selection for Lean4

Zichen Wang
Anjie Dong
Zaiwen Wen

Premise selection is a critical bottleneck in interactive theorem proving, particularly with large libraries. Existing methods, primarily relying on semantic embeddings, often fail to effectively leverage the rich structural information inherent in mathematical expressions. This paper proposes a novel framework for premise selection based on the structure of expression trees. The framework enhances premise selection ability by explicitly utilizing the structural information of Lean expressions and by means of the simplified tree representation obtained via common subexpression elimination. Our method employs a multi-stage filtering pipeline, incorporating structure-aware similarity measures including the Weisfeiler-Lehman kernel, tree edit distance, $\texttt{Const}$ node Jaccard similarity, and collapse-match similarity. An adaptive fusion strategy combines these metrics for refined ranking. To handle large-scale data efficiently, we incorporate cluster-based search space optimization and structural compatibility constraints. Comprehensive evaluation on a large theorem library extracted from Mathlib4 demonstrates that our method significantly outperforms existing premise retrieval tools across various metrics. Experimental analysis, including ablation studies and parameter sensitivity analysis, validates the contribution of individual components and highlights the efficacy of our structure-aware approach and multi-metric fusion.

PDF Details

TMLR Journal 2024 Journal Article

Adversarial Attacks on Online Learning to Rank with Stochastic Click Models

Zichen Wang
Rishab Balasubramanian
Hui Yuan
Chenyu Song
Mengdi Wang
Huazheng Wang

We propose the first study of adversarial attacks on online learning to rank. The goal of the attacker it to misguide the online learning to rank algorithm to place the target item on top of the ranking list linear times to time horizon $T$ with a sublinear attack cost. We propose generalized list poisoning attacks that perturb the ranking list presented to the user. This strategy can efficiently attack any no-regret ranker in general stochastic click models. Furthermore, we propose a click poisoning-based strategy named attack-then-quit that can efficiently attack two representative OLTR algorithms for stochastic click models. We theoretically analyze the success and cost upper bound of the two proposed methods. Experimental results based on synthetic and real-world data further validate the effectiveness and cost-efficiency of the proposed attack strategies.

PDF Details

NeurIPS Conference 2024 Conference Paper

DiffusionPDE: Generative PDE-Solving under Partial Observation

Jiahe Huang
Guandao Yang
Zichen Wang
Jeong Joon Park

We introduce a general framework for solving partial differential equations (PDEs) using generative diffusion models. In particular, we focus on the scenarios where we do not have the full knowledge of the scene necessary to apply classical solvers. Most existing forward or inverse PDE approaches perform poorly when the observations on the data or the underlying coefficients are incomplete, which is a common assumption for real-world measurements. In this work, we propose DiffusionPDE that can simultaneously fill in the missing information and solve a PDE by modeling the joint distribution of the solution and coefficient spaces. We show that the learned generative priors lead to a versatile framework for accurately solving a wide range of PDEs under partial observation, significantly outperforming the state-of-the-art methods for both forward and inverse directions.

PDF Details DOI

AAAI Conference 2024 Conference Paper

Fine-Grained Prototypes Distillation for Few-Shot Object Detection

Zichen Wang
Bo Yang
Haonan Yue
Zhenghao Ma

Few-shot object detection (FSOD) aims at extending a generic detector for novel object detection with only a few training examples. It attracts great concerns recently due to the practical meanings. Meta-learning has been demonstrated to be an effective paradigm for this task. In general, methods based on meta-learning employ an additional support branch to encode novel examples (a.k.a. support images) into class prototypes, which are then fused with query branch to facilitate the model prediction. However, the class-level prototypes are difficult to precisely generate, and they also lack detailed information, leading to instability in performance. New methods are required to capture the distinctive local context for more robust novel object detection. To this end, we propose to distill the most representative support features into fine-grained prototypes. These prototypes are then assigned into query feature maps based on the matching results, modeling the detailed feature relations between two branches. This process is realized by our Fine-Grained Feature Aggregation (FFA) module. Moreover, in terms of high-level feature fusion, we propose Balanced Class-Agnostic Sampling (B-CAS) strategy and Non-Linear Fusion (NLF) module from differenct perspectives. They are complementary to each other and depict the high-level feature relations more effectively. Extensive experiments on PASCAL VOC and MS COCO benchmarks show that our method sets a new state-of-the-art performance in most settings. Our code is available at https://github.com/wangchen1801/FPD.

PDF Details DOI

AAAI Conference 2024 Conference Paper

Graph Neural Prompting with Large Language Models

Yijun Tian
Huan Song
Zichen Wang
Haozhu Wang
Ziqing Hu
Fang Wang
Nitesh V. Chawla
Panpan Xu

Large language models (LLMs) have shown remarkable generalization capability with exceptional performance in various language modeling tasks. However, they still exhibit inherent limitations in precisely capturing and returning grounded knowledge. While existing work has explored utilizing knowledge graphs (KGs) to enhance language modeling via joint training and customized model architectures, applying this to LLMs is problematic owing to their large number of parameters and high computational cost. Therefore, how to enhance pre-trained LLMs using grounded knowledge, e.g., retrieval-augmented generation, remains an open question. In this work, we propose Graph Neural Prompting (GNP), a novel plug-and-play method to assist pre-trained LLMs in learning beneficial knowledge from KGs. GNP encompasses various designs, including a standard graph neural network encoder, a cross-modality pooling module, a domain projector, and a self-supervised link prediction objective. Extensive experiments on multiple datasets demonstrate the superiority of GNP on both commonsense and biomedical reasoning tasks across different LLM sizes and settings. Code is available at https://github.com/meettyj/GNP.

PDF Details DOI

AAAI Conference 2024 Conference Paper

Learn How to See: Collaborative Embodied Learning for Object Detection and Camera Adjusting

Lingdong Shen
Chunlei Huo
Nuo Xu
Chaowei Han
Zichen Wang

Passive object detectors, trained on large-scale static datasets, often overlook the feedback from object detection to image acquisition. Embodied vision and active detection mitigate this issue by interacting with the environment. Nevertheless, the materialization of activeness hinges on resource-intensive data collection and annotation. To tackle these challenges, we propose a collaborative student-teacher framework. Technically, a replay buffer is built based on the trajectory data to encapsulate the relationship of state, action, and reward. In addition, the student network diverges from reinforcement learning by redefining sequential decision pathways using a GPT structure enriched with causal self-attention. Moreover, the teacher network establishes a subtle state-reward mapping based on adjacent benefit differences, providing reliable rewards for student adaptively self-tuning with the vast unlabeled replay buffer data. Additionally, an innovative yet straightforward benefit reference value is proposed within the teacher network, adding to its effectiveness and simplicity. Leveraging a flexible replay buffer and embodied collaboration between teacher and student, the framework learns to see before detection with shallower features and shorter inference steps. Experiments highlight significant advantages of our algorithm over state-of-the-art detectors. The code is released at https://github.com/lydonShen/STF.

PDF Details DOI

UAI Conference 2024 Conference Paper

Pure Exploration in Asynchronous Federated Bandits

Zichen Wang
Chuanhao Li 0002
Chenyu Song
Lianghui Wang
Quanquan Gu
Huazheng Wang

We study the federated pure exploration problem of multi-armed bandits and linear bandits, where $M$ agents cooperatively identify the best arm via communicating with the central server. To enhance the robustness against latency and unavailability of agents that are common in practice, we propose the first federated asynchronous multi-armed bandit and linear bandit algorithms for pure exploration with fixed confidence. Our theoretical analysis shows the proposed algorithms achieve near-optimal sample complexities and efficient communication costs in a fully asynchronous environment. Moreover, experimental results based on synthetic and real-world data empirically elucidate the effectiveness and communication cost-efficiency of the proposed algorithms.

Details

AAAI Conference 2024 Conference Paper

Simultaneous Optimization of Bid Shading and Internal Auction for Demand-Side Platforms

Yadong Xu
Bonan Ni
Weiran Shen
Xun Wang
Zichen Wang
Yinsong Xue
Pingzhong Tang

Online advertising has been one of the most important sources for industry's growth, where the demand-side platforms (DSP) play an important role via bidding to the ad exchanges on behalf of their advertiser clients. Since more and more ad exchanges have shifted from second to first price auctions, it is challenging for DSPs to adjust bidding strategy in the volatile environment. Recent studies on bid shading in first-price auctions may have limited performance due to relatively strong hypotheses about winning probability distribution. Moreover, these studies do not consider the incentive of advertiser clients, which can be crucial for a reliable advertising platform. In this work, we consider both the optimization of bid shading technique and the design of internal auction which is ex-post incentive compatible (IC) for the management of a DSP. Firstly, we prove that the joint design of bid shading and ex-post IC auction can be reduced to choosing one monotone bid function for each advertiser without loss of optimality. Then we propose a parameterized neural network to implement the monotone bid functions. With well-designed surrogate loss, the objective can be optimized in an end-to-end manner. Finally, our experimental results demonstrate the effectiveness and superiority of our algorithm.

PDF Details DOI

TCS Journal 2023 Journal Article

Collaborative coalitions-based joint service caching and task offloading for edge networks

Zichen Wang
Hongwei Du

Details DOI

ICRA Conference 2022 Conference Paper

Multi-Dimensional Proprioception and Stiffness Tuning for Soft Robotic Joints

Zhonggui Fang
Chaoyi Huang
Yaxi Wang
Jiahao Xu
Jiyong Tan
Bin Li
Zichen Wang
Yige Wu

Proprioception and variable stiffness are two trending topics in soft robotics research. The former could endow soft robots with the ability to perceive the environment as well as their internal states without the need of dedicated sensors, while the latter could strengthen the otherwise excessive compliance, enabling soft robots for tasks which require a higher force. Both directions have been extensively reported in existing literature, achieving both concurrently was even more challenging. The major limiting factor was the limited stiffness due to the hyper elasticity of conventional soft robots, which increases the difficulties in capturing the continues deformation. In this work, we proposed an alternative approach to tackle these two challenges, a novel “tune-down” approach, combining proprioception with stiffness regulation and implemented over-constrained soft robotic joint designs to further strengthen this spirit. As a result, the soft robotic joint could achieve multi-directional proprioception, as well as variable stiffness tuning, concurrently, using merely an on-board sensor for basic pneumatic control. The concept, design, modeling, actuation/control, and experimental validation were presented in detail, demonstrating the efficacy and potential of the proposed approach.

Details