Adaptive Knowledge Driven Regularization for Deep Neural Networks

Zhaojing Luo; Shaofeng Cai; Can Cui; Beng Chin Ooi; Yang Yang

AAAI 2021

Conference Paper AAAI Technical Track on Machine Learning III Artificial Intelligence

Abstract

In many real-world applications, the amount of data available for training is often limited, and thus inductive bias and auxiliary knowledge are much needed for regularizing model training. One popular regularization method is to impose prior distribution assumptions on model parameters, and many recent works also attempt to regularize training by integrating external knowledge into specific neurons. However, existing regularization methods fail to take account of the interaction between connected neuron pairs, which is invaluable internal knowledge for adaptive regularization for better representation learning as training progresses. In this paper, we explicitly take into account the interaction between connected neurons, and propose an adaptive internal knowledge driven regularization method, CORR-Reg. The key idea of CORR-Reg is to give a higher significance weight to connections of more correlated neuron pairs. The significance weights adaptively identify more important input neurons for each neuron. Instead of regularizing connection model parameters with a static strength such as weight decay, CORR-Reg imposes weaker regularization strength on more significant connections. As a consequence, neurons attend to more informative input features and thus learn more diversified and discriminative representation. We derive CORR-Reg with the Bayesian inference framework and propose a novel optimization algorithm with the Lagrange multiplier method and Stochastic Gradient Descent. Extensive evaluations on diverse benchmark datasets and neural network structures show that CORR-Reg achieves significant improvement over state-of-the-art regularization methods.
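
The key idea described in the abstract (higher significance for more correlated neuron pairs, weaker regularization on more significant connections) can be illustrated with a small NumPy sketch. This is not the paper's algorithm: the abstract does not give the Bayesian derivation or the Lagrange multiplier update, so the sketch below swaps in a simple heuristic. The function names `correlation_significance` and `adaptive_l2_penalty`, the per-neuron normalization, the `1 - significance` scaling, and the `base_strength` parameter are all assumptions made for illustration.

```python
import numpy as np

def correlation_significance(x, y, eps=1e-8):
    """Hypothetical significance weights from activation correlations.

    x: (n_samples, n_in) input-neuron activations
    y: (n_samples, n_out) output-neuron activations
    Returns an (n_in, n_out) array; each column sums to about 1.
    """
    xc = x - x.mean(axis=0)
    yc = y - y.mean(axis=0)
    cov = xc.T @ yc / x.shape[0]                   # pairwise covariance
    std = np.outer(x.std(axis=0), y.std(axis=0))   # product of std devs
    corr = np.abs(cov / (std + eps))               # |Pearson correlation|
    # Assumption: normalize over the inputs of each output neuron.
    return corr / (corr.sum(axis=0, keepdims=True) + eps)

def adaptive_l2_penalty(w, significance, base_strength=1e-3):
    """Per-connection L2 penalty, weaker where significance is higher.

    Assumption: strength = base_strength * (1 - significance), a simple
    monotone stand-in for the strengths the paper derives.
    """
    strength = base_strength * (1.0 - significance)
    return float(np.sum(strength * w ** 2)), strength

# Toy data: output 0 tracks input 0, output 1 tracks input 2.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 4))
y = np.stack(
    [x[:, 0] + 0.1 * rng.normal(size=200),
     x[:, 2] + 0.1 * rng.normal(size=200)],
    axis=1,
)
w = rng.normal(size=(4, 2))  # connection weights, shape (n_in, n_out)

sig = correlation_significance(x, y)
penalty, strength = adaptive_l2_penalty(w, sig)
```

In this toy setup the connection from input 0 to output 0 receives the largest significance in its column and therefore the smallest regularization strength, whereas plain weight decay would apply the same strength to every connection. In a training loop, the significance weights would be recomputed periodically from current activations so that the penalty adapts as training progresses.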