Arrow Research search

Author name cluster

Yadi Cao

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

7 papers
2 author rows

Possible papers (7)

TMLR · 2026 · Journal Article

Discovering Symbolic Differential Equations with Symmetry Invariants

  • Jianke Yang
  • Manu Bhat
  • Bryan Hu
  • Yadi Cao
  • Nima Dehmamy
  • Robin Walters
  • Rose Yu

Discovering symbolic differential equations from data uncovers fundamental dynamical laws underlying complex systems. However, existing methods often struggle with the vast search space of equations and may produce equations that violate known physical laws. In this work, we address these problems by introducing the concept of symmetry invariants in equation discovery. We leverage the fact that differential equations admitting a symmetry group can be expressed in terms of differential invariants of symmetry transformations. Thus, we propose to use these invariants as atomic entities in equation discovery, ensuring the discovered equations satisfy the specified symmetry. Our approach integrates seamlessly with existing equation discovery methods such as sparse regression and genetic programming, improving their accuracy and efficiency. We validate the proposed method through applications to various physical systems, such as Darcy flow and reaction-diffusion, demonstrating its ability to recover parsimonious and interpretable equations that respect the laws of physics.

TMLR · 2026 · Journal Article

VICON: Vision In-Context Operator Networks for Multi-Physics Fluid Dynamics Prediction

  • Yadi Cao
  • Yuxuan Liu
  • Liu Yang
  • Rose Yu
  • Hayden Schaeffer
  • Stanley Osher

In-Context Operator Networks (ICONs) have demonstrated the ability to learn operators across diverse partial differential equations using few-shot, in-context learning. However, existing ICONs process each spatial point as an individual token, severely limiting computational efficiency when handling dense data in higher spatial dimensions. We propose Vision In-Context Operator Networks (VICON), which integrate vision transformer architectures to efficiently process 2D data through patch-wise operations while preserving ICON's adaptability to multi-physics systems and varying timesteps. Evaluated across three fluid dynamics benchmarks, VICON significantly outperforms state-of-the-art baselines DPOT and MPP, reducing the average last-step rollout error by 37.9% compared to DPOT and 44.7% compared to MPP, while requiring only 72.5% and 34.8% of their respective inference times. VICON naturally supports flexible rollout strategies with varying timestep strides, enabling immediate deployment in imperfect measurement systems where sampling frequencies may differ or frames might be dropped (common challenges in real-world settings) without requiring retraining or interpolation. In these realistic scenarios, VICON exhibits remarkable robustness, experiencing only 24.41% relative performance degradation compared to 71.37%-74.49% degradation in baseline methods, demonstrating its versatility for deployment in realistic applications. Our scripts for processing datasets and code are publicly available at https://github.com/Eydcao/VICON.
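The patch-wise tokenization the abstract contrasts with per-point tokens can be illustrated with a minimal sketch: splitting a 2D field into non-overlapping patches, each flattened into one token. This is an illustrative reshaping only, not VICON's actual implementation; the function name and patch size are assumptions.

```python
import numpy as np

def to_patches(field, patch=4):
    """Split a 2D field of shape (H, W) into non-overlapping patch tokens.

    Each token is one flattened patch of patch*patch values, so a dense grid
    yields H*W/patch**2 tokens instead of H*W per-point tokens.
    Illustrative sketch only, not the paper's code.
    """
    H, W = field.shape
    assert H % patch == 0 and W % patch == 0, "grid must tile evenly"
    tokens = (field.reshape(H // patch, patch, W // patch, patch)
                   .transpose(0, 2, 1, 3)          # group by patch row/col
                   .reshape(-1, patch * patch))    # one row per patch
    return tokens
```

For a 4x4 grid with `patch=2`, this produces 4 tokens of 4 values each, a 4x reduction in sequence length.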

ICML · 2025 · Conference Paper

Adapting While Learning: Grounding LLMs for Scientific Problems with Tool Usage Adaptation

  • Bohan Lyu 0001
  • Yadi Cao
  • Duncan Watson-Parris
  • Leon Bergen
  • Taylor Berg-Kirkpatrick
  • Rose Yu

Large Language Models (LLMs) demonstrate promising capabilities in solving scientific problems but often suffer from the issue of hallucination. While integrating LLMs with tools can mitigate this issue, models fine-tuned on tool usage become overreliant on them and incur unnecessary costs. Inspired by how human experts assess problem complexity before selecting solutions, we propose a novel two-component fine-tuning method, Adapting while Learning (AWL). In the first component, World Knowledge Learning (WKL), LLMs internalize scientific knowledge by learning from tool-generated solutions. In the second component, Tool Usage Adaptation (TUA), we categorize problems as easy or hard based on the model's accuracy, and train it to maintain direct reasoning for easy problems while switching to tools for hard ones. We validate our method on 6 scientific benchmark datasets across climate science, epidemiology, physics, and other domains. Compared to the original instruct model (8B), models post-trained with AWL achieve 29.11% higher answer accuracy and 12.72% better tool usage accuracy, even surpassing state-of-the-art models including GPT-4o and Claude-3.5 on 4 custom-created datasets. Our code is open-source at https://github.com/Rose-STL-Lab/Adapting-While-Learning.
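The easy/hard split driving Tool Usage Adaptation can be sketched as a simple threshold on measured per-problem accuracy. The function name, dictionary shape, and threshold value below are assumptions for illustration, not the paper's implementation.

```python
def split_by_difficulty(problems, model_accuracy, threshold=0.5):
    """Categorize problems as easy or hard from the model's measured accuracy.

    problems: list of problem ids.
    model_accuracy: dict mapping problem id -> accuracy in [0, 1]
    (hypothetical shape). Easy problems would keep direct-reasoning training
    targets; hard ones would get tool-call targets.
    """
    easy, hard = [], []
    for p in problems:
        (easy if model_accuracy[p] >= threshold else hard).append(p)
    return easy, hard
```

The two buckets then receive different fine-tuning targets: direct answers for the easy set, tool invocations for the hard set.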

TMLR · 2025 · Journal Article

DELTA: Dual Consistency Delving with Topological Uncertainty for Active Graph Domain Adaptation

  • Pengyun Wang
  • Yadi Cao
  • Chris Russell
  • Yanxin Shen
  • Junyu Luo
  • Ming Zhang
  • Siyu Heng
  • Xiao Luo

Graph domain adaptation has recently enabled knowledge transfer across different graphs. However, without the semantic information on target graphs, the performance on target graphs is still far from satisfactory. To address the issue, we study the problem of active graph domain adaptation, which selects a small number of informative nodes on the target graph for extra annotation. This problem is highly challenging due to the complicated topological relationships and the distribution discrepancy across graphs. In this paper, we propose a novel approach named Dual Consistency Delving with Topological Uncertainty (DELTA) for active graph domain adaptation. Our DELTA consists of an edge-oriented graph subnetwork and a path-oriented graph subnetwork, which can explore topological semantics from complementary perspectives. In particular, our edge-oriented graph subnetwork utilizes the message passing mechanism to learn neighborhood information, while our path-oriented graph subnetwork explores high-order relationships from substructures. To jointly learn from two subnetworks, we roughly select informative candidate nodes with the consideration of consistency across the two subnetworks. Then, we aggregate local semantics from each candidate node's K-hop subgraph based on node degrees for topological uncertainty estimation. To overcome potential distribution shifts, we compare target nodes and their corresponding source nodes for discrepancy scores as an additional component for fine selection. Extensive experiments on benchmark datasets demonstrate that DELTA outperforms various state-of-the-art approaches. The code implementation of DELTA is available at https://github.com/goose315/DELTA.

NeurIPS · 2024 · Conference Paper

Physics-Informed Regularization for Domain-Agnostic Dynamical System Modeling

  • Zijie Huang
  • Wanjia Zhao
  • Jingdong Gao
  • Ziniu Hu
  • Xiao Luo
  • Yadi Cao
  • Yuanzhou Chen
  • Yizhou Sun

Learning complex physical dynamics purely from data is challenging due to the intrinsic properties of systems to be satisfied. Incorporating physics-informed priors, such as in Hamiltonian Neural Networks (HNNs), achieves high-precision modeling for energy-conservative systems. However, real-world systems often deviate from strict energy conservation and follow different physical priors. To address this, we present a framework that achieves high-precision modeling for a wide range of dynamical systems from a numerical standpoint, by enforcing Time-Reversal Symmetry (TRS) via a novel regularization term. It helps preserve energies for conservative systems while serving as a strong inductive bias for non-conservative, reversible systems. While TRS is a domain-specific physical prior, we present the first theoretical proof that the TRS loss can universally improve modeling accuracy by minimizing higher-order Taylor terms in ODE integration, which is numerically beneficial to various systems regardless of their properties, even for irreversible systems. By integrating the TRS loss within neural ordinary differential equation models, the proposed model TREAT demonstrates superior performance on diverse physical systems. It achieves a significant 11.5% MSE improvement in a challenging chaotic triple-pendulum scenario, underscoring TREAT's broad applicability and effectiveness.
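A time-reversal-symmetry penalty of the kind the abstract describes can be sketched as the mismatch between a forward rollout and a reversed backward rollout of the same dynamics. Assumptions: a (position, momentum) state, the reversal operator R(q, p) = (q, -p), and a plain explicit-Euler integrator standing in for the neural ODE solver; the function names are hypothetical.

```python
import numpy as np

def reverse(state):
    """Time-reversal operator for a (position, momentum) state: flip momentum."""
    q, p = state
    return (q, -p)

def rollout(f, state, dt, steps):
    """Explicit-Euler rollout of d(state)/dt = f(state) (placeholder integrator)."""
    traj = [state]
    for _ in range(steps):
        q, p = traj[-1]
        dq, dp = f((q, p))
        traj.append((q + dt * dq, p + dt * dp))
    return traj

def trs_loss(f, state0, dt, steps):
    """Mean squared mismatch between the forward trajectory and the
    time-reversed backward trajectory; a sketch of a TRS-style penalty,
    not the paper's exact loss."""
    fwd = rollout(f, state0, dt, steps)
    # integrate forward from the reversed final state, then map back
    bwd = rollout(f, reverse(fwd[-1]), dt, steps)
    bwd = [reverse(s) for s in reversed(bwd)]
    return float(np.mean([(qf - qb) ** 2 + (pf - pb) ** 2
                          for (qf, pf), (qb, pb) in zip(fwd, bwd)]))
```

For exactly reversible dynamics integrated without error the two trajectories coincide and the loss vanishes; in training, the nonzero residual acts as the regularizer added to the prediction loss.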

ICML · 2023 · Conference Paper

Efficient Learning of Mesh-Based Physical Simulation with Bi-Stride Multi-Scale Graph Neural Network

  • Yadi Cao
  • Menglei Chai
  • Minchen Li
  • Chenfanfu Jiang

Learning the long-range interactions on large-scale mesh-based physical systems with flat Graph Neural Networks (GNNs) and stacked Message Passings (MPs) is challenging due to the scaling complexity w.r.t. the number of nodes and over-smoothing. Therefore, there has been growing interest in the community in introducing multi-scale structures to GNNs for physics simulation. However, current state-of-the-art methods are limited by their reliance on the labor-heavy drawing of coarser meshes or building coarser levels based on spatial proximity, which can introduce wrong edges across geometry boundaries. Inspired by bipartite graph determination, we propose a novel pooling strategy, bi-stride, to tackle the aforementioned limitations. Bi-stride pools nodes on every other frontier of Breadth-First Search (BFS), without the need for manual drawing of coarser meshes, and avoids wrong edges introduced by spatial proximity. Additionally, it enables a reduced number of MP steps at each level and non-parametrized pooling and unpooling by interpolation, similar to Convolutional Neural Networks (CNNs), which significantly reduces computational requirements. Experiments show that the proposed framework, BSMS-GNN, significantly outperforms existing methods in terms of both accuracy and computational efficiency in representative physics-based simulation scenarios.
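The core pooling rule, keeping nodes on every other BFS frontier, can be sketched in a few lines. This is a minimal illustration on an adjacency-dict graph with a single assumed seed node, not the BSMS-GNN implementation.

```python
from collections import deque

def bi_stride_pool(adj, seed=0):
    """Select every other BFS frontier as the coarser level's node set.

    adj: dict mapping node -> iterable of neighbors (undirected graph,
    assumed connected). Returns the set of pooled (kept) nodes.
    Minimal sketch of the bi-stride idea, not the paper's code.
    """
    dist = {seed: 0}
    q = deque([seed])
    while q:  # standard BFS to label each node with its frontier depth
        u = q.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                q.append(v)
    # keep nodes on even frontiers, i.e. every other BFS layer
    return {u for u, d in dist.items() if d % 2 == 0}
```

On a path graph 0-1-2-3-4 seeded at node 0, the even frontiers {0, 2, 4} survive, halving the node count while every dropped node stays adjacent to a kept one.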

NeurIPS · 2022 · Conference Paper

PlasticityNet: Learning to Simulate Metal, Sand, and Snow for Optimization Time Integration

  • Xuan Li
  • Yadi Cao
  • Minchen Li
  • Yin Yang
  • Craig Schroeder
  • Chenfanfu Jiang

In this paper, we propose a neural network-based approach for learning to represent the behavior of plastic solid materials ranging from rubber and metal to sand and snow. Unlike elastic forces such as spring forces, these plastic forces do not result from the positional gradient of any potential energy, imposing great challenges on the stability and flexibility of their simulation. Our method effectively resolves this issue by learning a generalizable plastic energy whose derivative closely matches the analytical behavior of plastic forces. Our method, for the first time, enables the simulation of a wide range of arbitrary elasticity-plasticity combinations using time step-independent, unconditionally stable optimization-based time integrators. We demonstrate the efficacy of our method by learning and producing challenging 2D and 3D effects of metal, sand, and snow with complex dynamics.