Distributed Estimation on Semi-Supervised Generalized Linear Model

Jiyuan Tu; Weidong Liu; Xiaojun Mao

JMLR 2024

Journal Article · Artificial Intelligence · Machine Learning

Abstract

Semi-supervised learning aims to use unlabeled data to improve the performance of machine learning algorithms. In this paper, we study the semi-supervised generalized linear model (GLM) in the distributed setup. For the cases where a single machine or multiple machines hold unlabeled data, we propose two distributed semi-supervised algorithms based on the distributed approximate Newton method. When the labeled local sample size is small, our algorithms still give consistent estimates, whereas fully supervised methods fail to converge. Moreover, we prove that the convergence rate is greatly improved when sufficient unlabeled data exists. Therefore, the proposed method requires far fewer rounds of communication to achieve the optimal rate than its fully supervised counterpart. For the linear model, we prove a lower bound on the rate after one round of communication, which shows that the rate improvement is essential. Finally, several simulation analyses and real data studies demonstrate the effectiveness of our method.
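
To make the idea of a distributed approximate Newton round concrete, the following is a minimal sketch and not the authors' algorithm. It assumes logistic regression as the GLM, assumes all unlabeled covariates sit on the first machine, and assumes unlabeled data enters only through the Hessian estimate (for a GLM with canonical link the Hessian depends on the covariates but not on the labels). The abstract does not spell out these details, and all function names here (`local_gradient`, `hessian`, `approx_newton_round`) are hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def local_gradient(theta, X, y):
    # Gradient of the average logistic negative log-likelihood on one machine.
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

def hessian(theta, X):
    # Logistic-loss Hessian. It uses covariates only, so unlabeled rows
    # can be stacked into X (assumption of this sketch).
    p = sigmoid(X @ theta)
    w = p * (1.0 - p)
    return (X * w[:, None]).T @ X / X.shape[0]

def approx_newton_round(theta, labeled_parts, X_unlabeled):
    # One communication round (illustrative):
    # 1. every machine sends its local gradient at the current theta;
    # 2. the first machine averages them (weighted by labeled sample size);
    # 3. the first machine takes a Newton-type step with a Hessian estimated
    #    from its own labeled covariates plus the unlabeled covariates.
    grads = [local_gradient(theta, X, y) for X, y in labeled_parts]
    sizes = np.array([len(y) for _, y in labeled_parts], dtype=float)
    g = np.average(grads, axis=0, weights=sizes)
    X1 = labeled_parts[0][0]
    H = hessian(theta, np.vstack([X1, X_unlabeled]))
    return theta - np.linalg.solve(H, g)
```

Calling `approx_newton_round` repeatedly, starting from an initial estimate, mimics several rounds of communication. In this sketch a larger unlabeled sample gives a more accurate Hessian, which is one plausible route to the faster convergence the abstract describes.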
