Arrow Research search

Author name cluster

Yin Yang

Possible papers associated with this exact author name in Arrow. This page groups case-insensitive exact name matches and is not a full identity disambiguation profile.

11 papers
1 author row

Possible papers

11

AAAI Conference 2026 Conference Paper

Analysing Satellite Imagery Classification Under Spatial Domain Shift Across Geographic Regions (Abstract Reprint)

  • Sara A. Al-Emadi
  • Yin Yang
  • Ferda Ofli

Deep learning models are designed based on the i.i.d. assumption; consequently, they experience a significant performance drop due to the distribution shifts when deployed in real environments. Domain Generalisation (DG) aims to bridge the distribution shift between the source and target domains by improving the generalisability of the model to Out-Of-Distribution (OOD) data. This challenge is prominent in satellite imagery classification due to the scarcity of data from underrepresented regions such as Africa and Oceania. In this paper, we address the limitations of existing datasets in capturing distribution shifts caused by geospatial differences between geographic regions by constructing a new, large-scale dataset called Domain Shift across Geographic Regions (DSGR). This dataset aims to help researchers better understand the impact of distribution shifts on satellite imagery classification. Furthermore, we perform rigorous experiments on DSGR to investigate and benchmark the robustness of existing DG techniques under single- and multi-source domain settings and the role of foundation models in enhancing the DG techniques. Our evaluations reveal that recent DG techniques have a comparable, yet weak, performance on DSGR. However, when combined with a foundation model like CLIP, ERM (introduced in 1999) achieves highly competitive results, surpassing even recent state-of-the-art DG solutions in enhancing the generalisability of deep learning models across different geographic regions. Our dataset and code are available at https://github.com/RWGAI/DSGR.
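
As a concrete reading of the "ERM + foundation model" result, ERM over frozen CLIP features is just a linear probe trained with pooled-source cross-entropy. A minimal sketch of that recipe, assuming the open_clip package and an illustrative class count (not the paper's exact protocol):

```python
import torch
import torch.nn as nn
import open_clip

# Frozen CLIP image encoder + trainable linear head, trained with plain ERM:
# minimize average cross-entropy over all source-domain samples pooled together.
clip_model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="openai")
clip_model.eval()
for p in clip_model.parameters():
    p.requires_grad_(False)

num_classes = 10  # assumed; DSGR's real label space may differ
head = nn.Linear(clip_model.visual.output_dim, num_classes)  # 512 for ViT-B-32
opt = torch.optim.AdamW(head.parameters(), lr=1e-3)
ce = nn.CrossEntropyLoss()

def erm_step(images, labels):
    # images: a batch already transformed by `preprocess`
    with torch.no_grad():
        feats = clip_model.encode_image(images).float()
    loss = ce(head(feats), labels)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```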

AAAI Conference 2026 Conference Paper

ElastoGen: 4D Generative Elastodynamics

  • Yutao Feng
  • Yintong Shang
  • Xiang Feng
  • Lei Lan
  • Shandian Zhe
  • Tianjia Shao
  • Hongzhi Wu
  • Kun Zhou
  • Yin Yang

We present ElastoGen, a knowledge-driven AI model that generates physically accurate 4D elastodynamics. Unlike deep models that learn from video- or image-based observations, ElastoGen leverages the principles of physics and learns from established mathematical and optimization procedures. The core idea of ElastoGen is to convert the differential equation corresponding to the nonlinear force equilibrium into a series of iterative, local, convolution-like operations, which naturally fit deep architectures. We carefully build our network module following this overarching design philosophy. ElastoGen is much more lightweight, in terms of both training requirements and network scale, than deep generative models. Because of its alignment with actual physical procedures, ElastoGen efficiently generates accurate dynamics for a wide range of hyperelastic materials and can be easily integrated with upstream and downstream deep modules to enable end-to-end 4D generation.
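
The "iterative local convolution-like operations" can be pictured with a classic relaxation solver: a Jacobi iteration for a discrete equilibrium system is exactly a repeated local convolution. A toy 1D linear sketch of that structure (ElastoGen's actual operators are nonlinear and learned, which this does not reproduce):

```python
import torch
import torch.nn.functional as F

# Jacobi-style relaxation of a 1D force-equilibrium system K u = f, written as
# a repeated local convolution: the structural analogue of ElastoGen's
# iterative convolution-like operations. The linear stencil [-1, 2, -1] is a
# toy stand-in for the paper's nonlinear operators.
n = 128
f = torch.zeros(1, 1, n)
f[0, 0, n // 2] = 1.0          # a point load at the center
u = torch.zeros(1, 1, n)       # displacement field, pinned at both ends

# Jacobi update for the stencil: u_i <- (u_{i-1} + u_{i+1} + f_i) / 2
neighbor_avg = torch.tensor([[[0.5, 0.0, 0.5]]])
for _ in range(500):
    u = F.conv1d(F.pad(u, (1, 1)), neighbor_avg) + 0.5 * f
    u[..., 0] = 0.0
    u[..., -1] = 0.0           # Dirichlet boundary: endpoints stay fixed
```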

NeurIPS Conference 2024 Conference Paper

Atlas3D: Physically Constrained Self-Supporting Text-to-3D for Simulation and Fabrication

  • Yunuo Chen
  • Tianyi Xie
  • Zeshun Zong
  • Xuan Li
  • Feng Gao
  • Yin Yang
  • Ying Nian Wu
  • Chenfanfu Jiang

Existing diffusion-based text-to-3D generation methods primarily focus on producing visually realistic shapes and appearances, often neglecting the physical constraints necessary for downstream tasks. Generated models frequently fail to maintain balance when placed in physics-based simulations or when 3D printed. This balance is crucial for satisfying user design intentions in interactive gaming, embodied AI, and robotics, where stable models are needed for reliable interaction. Additionally, stable models ensure that 3D-printed objects, such as figurines for home decoration, can stand on their own without requiring additional supports. To fill this gap, we introduce Atlas3D, an automatic and easy-to-implement method that enhances existing Score Distillation Sampling (SDS)-based text-to-3D tools. Atlas3D ensures the generation of self-supporting 3D models that adhere to the physical laws of stability under gravity, contact, and friction. Our approach combines a novel differentiable simulation-based loss function with physically inspired regularization, serving as either a refinement or a post-processing module for existing frameworks. We verify Atlas3D's efficacy through extensive generation tasks and validate the resulting 3D models in both simulated and real-world environments.
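
The abstract pins down the integration point: a differentiable simulation-based stability loss plus a physically inspired regularizer added to the SDS objective. A schematic of that composition, with all callables as hypothetical placeholders rather than Atlas3D's API:

```python
def total_loss(shape_params, sds_loss_fn, stability_loss_fn, reg_fn,
               lambda_stab=1.0, lambda_reg=0.1):
    """Hypothetical composition of the losses the abstract describes.

    sds_loss_fn:       standard Score Distillation Sampling loss
    stability_loss_fn: differentiable-simulation loss penalizing toppling
                       under gravity, contact, and friction
    reg_fn:            physically inspired regularizer; the exact form is
                       the paper's and is not reproduced here
    """
    return (sds_loss_fn(shape_params)
            + lambda_stab * stability_loss_fn(shape_params)
            + lambda_reg * reg_fn(shape_params))

# Used either during generation (refinement) or afterwards (post-processing):
# opt = torch.optim.Adam([shape_params], lr=1e-2)
# loss = total_loss(shape_params, sds, stability, reg); loss.backward(); opt.step()
```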

AAAI Conference 2022 Conference Paper

Active Boundary Loss for Semantic Segmentation

  • Chi Wang
  • Yunke Zhang
  • Miaomiao Cui
  • Peiran Ren
  • Yin Yang
  • Xuansong Xie
  • Xian-Sheng Hua
  • Hujun Bao

This paper proposes a novel active boundary loss for semantic segmentation. It progressively encourages alignment between predicted boundaries and ground-truth boundaries during end-to-end training, which is not explicitly enforced by the commonly used cross-entropy loss. Based on the predicted boundaries detected from the segmentation results under the current network parameters, we formulate the boundary alignment problem as a differentiable direction-vector prediction problem that guides the movement of predicted boundaries in each iteration. Our loss is model-agnostic and can be plugged into the training of segmentation networks to improve boundary details. Experimental results show that training with the active boundary loss can effectively improve the boundary F-score and mean Intersection-over-Union on challenging image and video object segmentation datasets.
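
Because the loss is model-agnostic, the integration is a single added term next to cross-entropy; a sketch with a hypothetical `active_boundary_loss` callable (the real term supervises per-pixel direction vectors toward ground-truth boundaries):

```python
import torch.nn as nn

ce = nn.CrossEntropyLoss()

def training_loss(logits, labels, active_boundary_loss, weight=0.1):
    # Cross-entropy supervises per-pixel classes; the ABL term (a placeholder
    # callable here) additionally pushes boundaries detected in the current
    # predictions toward ground-truth boundaries via differentiable
    # direction-vector targets. `weight` is an illustrative trade-off factor.
    return ce(logits, labels) + weight * active_boundary_loss(logits, labels)
```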

AAAI Conference 2022 Conference Paper

HoD-Net: High-Order Differentiable Deep Neural Networks and Applications

  • Siyuan Shen
  • Tianjia Shao
  • Kun Zhou
  • Chenfanfu Jiang
  • Feng Luo
  • Yin Yang

We introduce a deep architecture named HoD-Net to enable high-order differentiability for deep learning. HoD-Net is based on, and generalizes, the complex-step finite difference (CSFD) method. While similar to classic finite differences, CSFD approaches the derivative of a function from a higher-dimensional complex domain, leading to highly accurate and robust differentiation without numerical stability issues. This method can be coupled with backpropagation and adjoint perturbation methods for efficient calculation of high-order derivatives. We show how this numerical scheme can be leveraged in challenging deep learning problems, such as high-order network training, deep learning-based physics simulation, and neural differential equations.
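
The complex-step idea underlying HoD-Net is compact enough to demonstrate directly: for a real-analytic f, f'(x) ≈ Im(f(x + ih))/h, and because no nearby numbers are subtracted, h can be made extremely small. A first-order demonstration (the paper's contribution, extending this to high-order derivatives inside deep networks, is not covered here):

```python
import numpy as np

def csfd_derivative(f, x, h=1e-30):
    # Complex-step finite difference: f'(x) ~= Im(f(x + i*h)) / h.
    # Unlike (f(x + h) - f(x)) / h there is no subtractive cancellation,
    # so h can be tiny and the result is accurate to machine precision.
    return np.imag(f(x + 1j * h)) / h

f = lambda x: np.exp(x) * np.sin(x)
x0 = 0.7
exact = np.exp(x0) * (np.sin(x0) + np.cos(x0))   # analytic derivative
print(csfd_derivative(f, x0), exact)             # agree to ~16 digits
```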

NeurIPS Conference 2022 Conference Paper

PlasticityNet: Learning to Simulate Metal, Sand, and Snow for Optimization Time Integration

  • Xuan Li
  • Yadi Cao
  • Minchen Li
  • Yin Yang
  • Craig Schroeder
  • Chenfanfu Jiang

In this paper, we propose a neural network-based approach for learning to represent the behavior of plastic solid materials ranging from rubber and metal to sand and snow. Unlike elastic forces such as spring forces, these plastic forces do not result from the positional gradient of any potential energy, imposing great challenges on the stability and flexibility of their simulation. Our method effectively resolves this issue by learning a generalizable plastic energy whose derivative closely matches the analytical behavior of plastic forces. Our method, for the first time, enables the simulation of a wide range of arbitrary elasticity-plasticity combinations using time step-independent, unconditionally stable optimization-based time integrators. We demonstrate the efficacy of our method by learning and producing challenging 2D and 3D effects of metal, sand, and snow with complex dynamics.
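
The key move, as described, is to replace non-conservative plastic forces with the gradient of a learned energy, so that an implicit step becomes a well-posed minimization. A schematic incremental-potential step under that reading (the energy callables, optimizer settings, and omission of external forces are placeholders, not the paper's setup):

```python
import torch

def implicit_step(x_prev, v_prev, masses, dt, elastic_energy, plastic_energy_net):
    # Optimization-based time integration: minimize the incremental potential
    #   E(x) = 1/(2 dt^2) ||x - x_hat||_M^2 + elastic(x) + learned_plastic(x),
    # where the learned plastic energy stands in for forces that otherwise
    # have no potential. Gravity and other external forces are omitted here.
    x_hat = x_prev + dt * v_prev
    x = x_prev.clone().requires_grad_(True)
    opt = torch.optim.LBFGS([x], max_iter=50)

    def closure():
        opt.zero_grad()
        inertia = 0.5 / dt**2 * (masses * (x - x_hat) ** 2).sum()
        energy = inertia + elastic_energy(x) + plastic_energy_net(x).sum()
        energy.backward()
        return energy

    opt.step(closure)
    v = (x.detach() - x_prev) / dt
    return x.detach(), v
```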

AAAI Conference 2022 Conference Paper

Pose Guided Image Generation from Misaligned Sources via Residual Flow Based Correction

  • Jiawei Lu
  • He Wang
  • Tianjia Shao
  • Yin Yang
  • Kun Zhou

Generating new images with desired properties (e.g., new views/poses) from source images has been enthusiastically pursued recently due to its wide range of potential applications. One way to ensure high-quality generation is to use multiple sources with complementary information, such as different views of the same object. However, as source images are often misaligned due to large disparities among camera settings, strong assumptions have been made in the past with respect to the camera(s) and/or the object of interest, limiting the application of such techniques. Therefore, we propose a new general approach that models multiple types of variation among sources, such as view angles, poses, and facial expressions, in a unified framework, so that it can be employed on datasets of vastly different nature. We verify our approach on a variety of data, including human bodies, faces, city scenes, and 3D objects. Both the qualitative and quantitative results demonstrate that our method outperforms the state of the art.
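
One plausible reading of the "residual flow based correction" in the title: warp each misaligned source with a coarse alignment flow plus a learned residual. A sketch under that assumption (the residual network and grid conventions are illustrative, not the paper's architecture):

```python
import torch.nn.functional as F

def warp_with_residual_flow(source, base_grid, residual_net, features):
    # base_grid: coarse alignment as a sampling grid in [-1, 1], shape (B, H, W, 2).
    # residual_net (a placeholder) predicts a small correction to that grid from
    # image features, compensating misalignment the coarse model misses.
    residual = residual_net(features).permute(0, 2, 3, 1)  # (B, 2, H, W) -> (B, H, W, 2)
    grid = base_grid + residual
    return F.grid_sample(source, grid, align_corners=True)
```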

JBHI Journal 2021 Journal Article

DMC-Fusion: Deep Multi-Cascade Fusion With Classifier-Based Feature Synthesis for Medical Multi-Modal Images

  • Qing Zuo
  • Jianping Zhang
  • Yin Yang

Multi-modal medical image fusion is a challenging yet important task for precision diagnosis and surgical planning in clinical practice. Although single-feature fusion strategies such as DenseFuse have achieved inspiring performance, they tend not to fully preserve the source image features. In this paper, a deep multi-cascade fusion framework with classifier-based feature synthesis is proposed to automatically fuse multi-modal medical images. It consists of a pre-trained autoencoder based on dense connections, a feature classifier, and a multi-cascade fusion decoder that fuses high-frequency and low-frequency features separately. The encoder and decoder are transferred from the MS-COCO dataset and pre-trained simultaneously on public multi-modal medical image datasets to extract features. Feature classification is conducted through Gaussian high-pass filtering and peak signal-to-noise ratio thresholding; the feature maps in each layer of the pre-trained Dense-Block and decoder are then divided into high-frequency and low-frequency sequences. Specifically, in the proposed feature fusion block, a parameter-adaptive pulse-coupled neural network and an l1-weighted rule are employed to fuse the high-frequency and low-frequency components, respectively. Finally, we design a novel multi-cascade fusion decoder over the entire decoding stage to selectively fuse useful information from different modalities. We also validate our approach on brain disease classification using the fused images, and a statistical significance test illustrates that the improvement in classification performance is due to the fusion. Experimental results demonstrate that the proposed method achieves state-of-the-art performance in both qualitative and quantitative evaluations.
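
The routing and low-frequency fusion rules described above can be sketched directly; the Gaussian filter width, the PSNR threshold, and the per-pixel l1 weighting are simplifying assumptions (the high-frequency branch's parameter-adaptive PCNN is omitted):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def is_high_frequency(feat, sigma=2.0, psnr_thresh=30.0):
    # Gaussian high-pass routing: a feature map that keeps a large residual
    # after low-pass filtering (low PSNR vs. its smoothed version) carries
    # strong detail, so it goes to the high-frequency branch. Assumes a
    # non-negative, normalized map; the threshold is illustrative.
    low = gaussian_filter(feat, sigma)
    mse = np.mean((feat - low) ** 2) + 1e-12
    psnr = 10 * np.log10(feat.max() ** 2 / mse)
    return psnr < psnr_thresh

def l1_weighted_fuse(a, b):
    # Low-frequency rule: weights from the l1 activity of each source map
    # (simplified here to per-pixel absolute values).
    wa, wb = np.abs(a), np.abs(b)
    return (wa * a + wb * b) / (wa + wb + 1e-12)
```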

AAAI Conference 2021 Conference Paper

In-game Residential Home Planning via Visual Context-aware Global Relation Learning

  • Lijuan Liu
  • Yin Yang
  • Yi Yuan
  • Tianjia Shao
  • He Wang
  • Kun Zhou

In this paper, we propose an effective global relation learning algorithm to recommend an appropriate location for a building unit during in-game customization of a residential home complex. Given a construction layout, we propose a visual context-aware graph generation network that learns the implicit global relations among the scene components and infers the location of a new building unit. The proposed network takes as input the scene graph and the corresponding top-view depth image. It provides location recommendations for a newly added building unit by learning an auto-regressive edge distribution conditioned on existing scenes. We also introduce a global graph-image matching loss to enhance awareness of the essential geometric semantics of the site. Qualitative and quantitative experiments demonstrate that the recommended locations reflect the implicit spatial rules of components in residential estates well, and that the approach is instructive and practical for locating building units in the 3D scene of a complex construction.
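
Conditioning a new unit's location on both the scene graph and the top-view depth image might be wired as below; the encoders and grid head are placeholders, and the paper's actual model predicts an auto-regressive edge distribution rather than grid logits:

```python
import torch
import torch.nn as nn

class LocationRecommender(nn.Module):
    # Placeholder sketch: fuse a graph embedding of the existing units with a
    # CNN embedding of the top-view depth image, then score candidate cells.
    def __init__(self, graph_encoder, depth_encoder, hidden=256, grid=32):
        super().__init__()
        self.graph_encoder = graph_encoder    # e.g. a GNN over the scene graph
        self.depth_encoder = depth_encoder    # e.g. a small CNN
        self.head = nn.Linear(2 * hidden, grid * grid)
        self.grid = grid

    def forward(self, scene_graph, depth_image):
        g = self.graph_encoder(scene_graph)   # (B, hidden)
        d = self.depth_encoder(depth_image)   # (B, hidden)
        logits = self.head(torch.cat([g, d], dim=-1))
        return logits.view(-1, self.grid, self.grid)  # location scores per cell
```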

AAAI Conference 2021 Conference Paper

Online 3D Bin Packing with Constrained Deep Reinforcement Learning

  • Hang Zhao
  • Qijin She
  • Chenyang Zhu
  • Yin Yang
  • Kai Xu

We solve a challenging yet practically useful variant of the 3D Bin Packing Problem (3D-BPP). In our problem, the agent has limited information about the items to be packed into a single bin, and an item must be packed immediately after its arrival, without buffering or readjusting. The item's placement is also subject to the constraints of order dependence and physical stability. We formulate this online 3D-BPP as a constrained Markov decision process (CMDP). To solve the problem, we propose an effective and easy-to-implement constrained deep reinforcement learning (DRL) method under the actor-critic framework. In particular, we introduce a prediction-and-projection scheme: the agent first predicts a feasibility mask for the placement actions as an auxiliary task and then uses the mask to modulate the action probabilities output by the actor during training. Such supervision and projection help the agent learn feasible policies very efficiently. Our method can be easily extended to handle lookahead items, multi-bin packing, and item re-orienting. We have conducted extensive evaluations showing that the learned policy significantly outperforms state-of-the-art methods. A preliminary user study even suggests that our method might attain human-level performance.
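
The prediction-and-projection scheme is concrete enough to sketch: an auxiliary head predicts per-action feasibility, and the actor's logits are suppressed wherever the predicted mask says infeasible (shapes and the suppression constant are illustrative):

```python
import torch

def masked_policy(actor_logits, feasibility_logits):
    # The auxiliary head predicts which placement actions are feasible; the
    # predicted mask then modulates the actor's distribution so probability
    # mass concentrates on feasible placements during training.
    mask = (torch.sigmoid(feasibility_logits) > 0.5).float()
    projected = actor_logits.masked_fill(mask == 0, -1e9)  # suppress infeasible
    return torch.distributions.Categorical(logits=projected)
```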

IS Journal 2020 Journal Article

HIGnet: Hierarchical and Interactive Gate Networks for Item Recommendation

  • Mingyang Zhong
  • Chaojie Li
  • Jiahui Wen
  • Liangchen Liu
  • Jingwei Ma
  • Guangda Zhang
  • Yin Yang

Existing research exploits the semantic information in reviews to complement user-item interactions for item recommendation. However, because these approaches either defer the user-item interactions until the prediction layer or simply concatenate all the reviews of a user/item into a single review, they fail to capture the complex correlations between each user-item pair or introduce noise. Thus, we propose a novel Hierarchical and Interactive Gate Network (HIGnet) model for rating prediction. Modeling local word informativeness and global review semantics in a hierarchical manner enables us to exploit the textual features of users/items and capture complex semantic user-item correlations at different levels of granularity. Experiments on five challenging real-world datasets demonstrate the state-of-the-art performance of the proposed HIGnet model. To facilitate community research, the implementation of the proposed model is publicly available (https://github.com/uqjwen/higan).
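
The interactive-gate idea, letting the user-item pair decide how much of a word- or review-level feature passes through, reduces to a sigmoid gate conditioned on both embeddings; a minimal sketch with assumed dimensions (not HIGnet's full hierarchy):

```python
import torch
import torch.nn as nn

class InteractiveGate(nn.Module):
    # Gate a text feature (word- or review-level) by the user-item pair, so
    # the same review is weighted differently for different pairs.
    def __init__(self, text_dim, id_dim):
        super().__init__()
        self.gate = nn.Linear(2 * id_dim, text_dim)

    def forward(self, text_feat, user_emb, item_emb):
        g = torch.sigmoid(self.gate(torch.cat([user_emb, item_emb], dim=-1)))
        return g * text_feat  # element-wise gating of the text feature
```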