Factorial LDA: Sparse Multi-Dimensional Text Models

Michael Paul; Mark Dredze

NeurIPS 2012

Conference Paper Artificial Intelligence · Machine Learning

Abstract

Multi-dimensional latent variable models can capture the many latent factors in a text corpus, such as topic, author perspective, and sentiment. We introduce factorial LDA, a multi-dimensional latent variable model in which a document is influenced by K different factors and each word token depends on a K-dimensional vector of latent variables. Our model incorporates structured word priors and learns a sparse product of factors. Experiments on research abstracts show that our model can learn latent factors such as research topic, scientific discipline, and focus (e.g., methods vs. applications). Our modeling improvements reduce test perplexity and improve human interpretability of the discovered factors.
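
To make the "K-dimensional vector of latent variables" concrete, the following is a minimal toy sketch, not the authors' model or inference procedure. It assumes each token first draws one component per factor (a tuple such as topic, discipline, focus) and then draws a word from a distribution tied to that tuple. The function name `sample_document`, the `active_tuples` argument used to mimic sparsity over factor combinations, and the random weights are all hypothetical; the structured word priors mentioned in the abstract are not modeled here.

```python
import itertools
import random


def sample_document(n_tokens, factor_sizes, vocab, active_tuples=None, seed=0):
    """Toy sampler: each token gets a K-dimensional latent tuple, then a word.

    factor_sizes has one entry per factor (K = len(factor_sizes)).
    active_tuples is a hypothetical stand-in for a sparse product of factors:
    only the listed combinations are allowed to generate words.
    """
    rng = random.Random(seed)

    # Every combination of one component per factor.
    tuples = list(itertools.product(*(range(n) for n in factor_sizes)))
    if active_tuples is not None:
        tuples = [t for t in tuples if t in active_tuples]

    # Assumption: random unnormalized weights stand in for learned distributions.
    doc_weights = [rng.random() for _ in tuples]
    word_weights = {t: [rng.random() for _ in vocab] for t in tuples}

    tokens = []
    for _ in range(n_tokens):
        # Draw the latent tuple for this token, then a word given that tuple.
        z = rng.choices(tuples, weights=doc_weights, k=1)[0]
        w = rng.choices(vocab, weights=word_weights[z], k=1)[0]
        tokens.append((z, w))
    return tokens
```

Calling `sample_document(20, [3, 2, 2], ["model", "data", "gene", "cell"])` would yield 20 `(tuple, word)` pairs with three-element tuples, one entry per factor. Restricting `active_tuples` shrinks the set of usable combinations, which is the intuition behind sparsity in this sketch.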

Context

Venue: Annual Conference on Neural Information Processing Systems