Sike Wang Papers

EAAI Journal 2026 Journal Article

Buckling deformation reconstruction from strain distributions via U-shaped networks and knowledge distillation

Sike Wang
Xingyu Wang
Junyi Duan
Huaixiao Yan
Ying Huang
Chengcheng Tao

This paper presents a novel framework for reconstructing structural buckling deformation from strain distribution. The reconstruction was based on a U-shaped network (UNet), a convolutional neural network (CNN) that takes the strain field of the buckling structure as input and predicts deformation. Two neural network architectures, UNet and Nested UNet (UNet++), were trained to reconstruct buckling deformation. A knowledge distillation approach was designed to transfer layer features from the larger teacher model (UNet++) to the smaller student model (UNet). This approach can improve the accuracy of the student model without increasing model size. To improve knowledge distillation, we replaced uniform weights for feature transformation with adaptive weights. The developed method was validated on a mixed strain-deformation dataset from finite element analysis and distributed strain measurement, which provided real-world relevance with diverse information. The trained UNet and UNet++ achieved normalized mean absolute errors (NMAE) of 2.74% and 1.76%, respectively. According to the training results, the best UNet model trained with the proposed knowledge distillation method achieved an NMAE of 1.84%, demonstrating a 31.75% improvement. A parametric study was conducted to investigate the effect of transferring weights in the proposed framework. In addition, the effect of the framework for deformation reconstruction under varying conditions was evaluated, which indicated a general improvement. This study provides a tool to advance the capability of identifying structural buckling by leveraging CNN, smart sensing, numerical modeling, and knowledge distillation, which contributes to health monitoring and anomaly detection of structures.
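The two quantities the abstract relies on, the NMAE metric and a feature-level distillation loss with non-uniform layer weights, can be illustrated with a minimal NumPy sketch. Everything here is an assumption for illustration: the abstract does not state how NMAE is normalized (the target's value range is used below, one common convention), nor how the adaptive weights are computed (a softmax over hypothetical per-layer scores `layer_logits` stands in for them). The function names are hypothetical and are not taken from the paper.

```python
import numpy as np


def nmae(pred, target):
    """Mean absolute error divided by the target's value range.

    Assumption: range normalization; the paper may normalize differently.
    """
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return np.mean(np.abs(pred - target)) / (target.max() - target.min())


def distillation_loss(student_feats, teacher_feats, layer_logits):
    """Weighted sum of per-layer feature MSEs between student and teacher.

    Assumption: the layer weights are a softmax over `layer_logits`
    (hypothetical scores). Equal logits reduce to uniform weights.
    """
    logits = np.asarray(layer_logits, dtype=float)
    weights = np.exp(logits - logits.max())  # shift for numerical stability
    weights /= weights.sum()
    per_layer = np.array(
        [np.mean((s - t) ** 2) for s, t in zip(student_feats, teacher_feats)]
    )
    return float(np.sum(weights * per_layer)), weights


# Toy usage: two feature maps of different sizes, student all zeros, teacher all ones.
student = [np.zeros((4, 8, 8)), np.zeros((8, 4, 4))]
teacher = [np.ones((4, 8, 8)), np.ones((8, 4, 4))]
loss, weights = distillation_loss(student, teacher, layer_logits=[0.0, 0.0])
```

In a real training loop the student and teacher feature maps would first be brought to matching shapes, and this loss would be added to the ordinary reconstruction loss; both steps are omitted here.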


NeurIPS Conference 2024 Conference Paper

4-bit Shampoo for Memory-Efficient Network Training

Sike Wang
Pan Zhou
Jia Li
Hua Huang

Second-order optimizers, maintaining a matrix termed a preconditioner, are superior to first-order optimizers in both theory and practice. The states forming the preconditioner and its inverse root restrict the maximum size of models trained by second-order optimizers. To address this, compressing 32-bit optimizer states to lower bitwidths has shown promise in reducing memory usage. However, current approaches only pertain to first-order optimizers. In this paper, we propose the first 4-bit second-order optimizers, exemplified by 4-bit Shampoo, maintaining performance similar to that of 32-bit ones. We show that quantizing the eigenvector matrix of the preconditioner in 4-bit Shampoo is remarkably better than quantizing the preconditioner itself both theoretically and experimentally. By rectifying the orthogonality of the quantized eigenvector matrix, we enhance the approximation of the preconditioner's eigenvector matrix, which also benefits the computation of its inverse 4-th root. Besides, we find that linear square quantization slightly outperforms dynamic tree quantization when quantizing second-order optimizer states. Evaluation on various networks for image classification and natural language modeling demonstrates that our 4-bit Shampoo achieves comparable performance to its 32-bit counterpart while being more memory-efficient.
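The pipeline described above (quantize the eigenvector matrix, restore its orthogonality, then form the inverse fourth root) can be sketched in NumPy under several simplifying assumptions. The quantizer below is a plain uniform 16-level mapping with one max-abs scale, not the linear square or dynamic tree quantization named in the abstract. The orthogonality step uses the Björck iteration V ← 1.5V − 0.5V(VᵀV), which is one standard choice and is assumed here rather than taken from the abstract. All function names are hypothetical.

```python
import numpy as np


def quantize_4bit(x):
    """Map x to 16 uniform levels (codes 0..15) using one max-abs scale."""
    scale = np.abs(x).max()
    codes = np.round((x / scale + 1.0) * 7.5).astype(np.uint8)
    return codes, scale


def dequantize_4bit(codes, scale):
    """Inverse of quantize_4bit, up to rounding error."""
    return (codes.astype(np.float64) / 7.5 - 1.0) * scale


def rectify_orthogonality(v, steps=2):
    """Bjorck iteration: pushes a nearly orthogonal matrix toward orthogonality."""
    for _ in range(steps):
        v = 1.5 * v - 0.5 * v @ (v.T @ v)
    return v


def inverse_4th_root(eigvals, eigvecs):
    """A^(-1/4) = V diag(lambda^(-1/4)) V^T for symmetric positive definite A."""
    return (eigvecs * eigvals ** -0.25) @ eigvecs.T


def orth_error(v):
    """Frobenius distance of V^T V from the identity."""
    return np.linalg.norm(v.T @ v - np.eye(v.shape[1]))


# Toy preconditioner: a random symmetric positive definite 8x8 matrix.
rng = np.random.default_rng(0)
g = rng.standard_normal((8, 8))
precond = g @ g.T + np.eye(8)
eigvals, eigvecs = np.linalg.eigh(precond)

codes, scale = quantize_4bit(eigvecs)      # stored as 4-bit codes
v_deq = dequantize_4bit(codes, scale)      # lossy reconstruction
v_fix = rectify_orthogonality(v_deq)       # closer to orthogonal
root_approx = inverse_4th_root(eigvals, v_fix)
```

Comparing `orth_error(v_deq)` with `orth_error(v_fix)` shows how much orthogonality the iteration recovers on this toy matrix. A practical implementation would also pack two 4-bit codes per byte and quantize block by block, which this sketch leaves out.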

