AAAI 1998

Speech Recognition with Dynamic Bayesian Networks

Conference Paper Graphical Probabilistic Models Artificial Intelligence

Abstract

Dynamic Bayesian networks (DBNs) are a useful tool for representing complex stochastic processes. Recent developments in inference and learning in DBNs allow their use in real-world applications. In this paper, we apply DBNs to the problem of speech recognition. The factored state representation enabled by DBNs allows us to explicitly represent long-term articulatory and acoustic context in addition to the phonetic-state information maintained by hidden Markov models (HMMs). Furthermore, it enables us to model the short-term correlations among multiple observation streams within single time-frames. Given a DBN structure capable of representing these long- and short-term correlations, we applied the EM algorithm to learn models with up to 500,000 parameters. The use of structured DBN models decreased the error rate by 12 to 29% on a large-vocabulary isolated-word recognition task, compared to a discrete HMM; it also improved significantly on other published results for the same task. This is the first successful application of DBNs to a large-scale speech recognition problem. Investigation of the learned models indicates that the hidden state variables are strongly correlated with acoustic properties of the speech signal.
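
As a rough illustration of what a factored state representation means, the sketch below runs the forward algorithm over a hidden state split into a phonetic-state variable and a separate context variable, with each observation depending on both. Everything in it is an assumption made for illustration: the variable sizes, the probability tables, the independent transitions for the two variables, and the name `forward_likelihood` are hypothetical and are not the network structure or parameters used in the paper.

```python
from itertools import product

# Hypothetical toy sizes: 2 phonetic states, 2 context values, 2 observation symbols.
Q, C = 2, 2

prior_q = [0.6, 0.4]                      # P(q_1)
prior_c = [0.5, 0.5]                      # P(c_1)
trans_q = [[0.7, 0.3], [0.2, 0.8]]        # P(q_t | q_{t-1})
trans_c = [[0.9, 0.1], [0.1, 0.9]]        # P(c_t | c_{t-1}), a slowly changing context
emit = {                                  # P(o_t | q_t, c_t)
    (0, 0): [0.9, 0.1],
    (0, 1): [0.6, 0.4],
    (1, 0): [0.3, 0.7],
    (1, 1): [0.1, 0.9],
}

STATES = list(product(range(Q), range(C)))


def forward_likelihood(obs):
    """Return P(obs) by summing over all joint (q, c) hidden-state paths."""
    # Initialise with the factored prior and the first observation.
    alpha = {
        (q, c): prior_q[q] * prior_c[c] * emit[(q, c)][obs[0]]
        for q, c in STATES
    }
    for o in obs[1:]:
        # The joint transition is a product of the two per-variable transitions.
        alpha = {
            (q, c): emit[(q, c)][o] * sum(
                alpha[(qp, cp)] * trans_q[qp][q] * trans_c[cp][c]
                for qp, cp in STATES
            )
            for q, c in STATES
        }
    return sum(alpha.values())


if __name__ == "__main__":
    print(forward_likelihood([0, 1, 1]))
```

In this toy setup the factored transition model stores Q\*Q + C\*C entries, whereas a flat HMM over the same joint state space would need (Q\*C)^2, which is one reason a factored representation can keep parameter counts manageable as more context variables are added.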

Context

Venue
AAAI Conference on Artificial Intelligence