Arrow Research search
Back to ICML

ICML 2023

Bayesian Progressive Deep Topic Model with Knowledge Informed Textual Data Coarsening Process

Conference Paper Accepted Paper Artificial Intelligence ยท Machine Learning

Abstract

Deep topic models have shown an impressive ability to extract multi-layer document latent representations and discover hierarchical semantically meaningful topics. However, most deep topic models are limited to the single-step generative process, despite the fact that the progressive generative process has achieved impressive performance in modeling image data. To this end, in this paper, we propose a novel progressive deep topic model that consists of a knowledge-informed textural data coarsening process and a corresponding progressive generative model. The former is used to build multi-level observations ranging from concrete to abstract, while the latter is used to generate more concrete observations gradually. Additionally, we incorporate a graph-enhanced decoder to capture the semantic relationships among words at different levels of observation. Furthermore, we perform a theoretical analysis of the proposed model based on the principle of information theory and show how it can alleviate the well-known "latent variable collapse" problem. Finally, extensive experiments demonstrate that our proposed model effectively improves the ability of deep topic models, resulting in higher-quality latent document representations and topics.

Authors

Keywords

No keywords are indexed for this paper.

Context

Venue
International Conference on Machine Learning
Archive span
1993-2025
Indexed papers
16471
Paper id
953450759421081671