AAAI 1994
A Statistical Method for Handling Unknown Words
Abstract
Robust Natural Language Processing systems must be able to handle words that are not in their lexicon. We created a classifier that was trained on tagged text to find the most likely parts of speech for unknown words. The classifier uses a contingency table to count the observed features, and a loglinear model to smooth the cell counts. After smoothing, the contingency table is used to obtain the conditional probability distribution for classification.
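The pipeline described in the abstract (count features in a contingency table, smooth the cells with a loglinear model, then read off a conditional distribution over tags) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation. The two features (capitalization and suffix), the three tags, the toy samples, and all function names are hypothetical. The loglinear model used here contains only the feature-by-tag interactions, so the features are treated as independent given the tag, and it is fitted by iterative proportional fitting. The abstract does not specify the features, the model terms, or the fitting procedure.

```python
from collections import Counter
from itertools import product

# Hypothetical feature levels and tags, chosen only for illustration.
CAPITAL = ("cap", "lower")
SUFFIX = ("-ed", "-s", "none")
TAGS = ("NN", "NNP", "VBD")

# Toy tagged training events: (capitalization, suffix, tag). Invented data.
SAMPLES = (
    [("cap", "none", "NNP")] * 3
    + [("cap", "-s", "NNP")] * 1
    + [("lower", "-ed", "VBD")] * 4
    + [("lower", "none", "NN")] * 3
    + [("lower", "-s", "NN")] * 2
    + [("cap", "none", "NN")] * 1
)


def count_table(samples):
    """Contingency table: observed count for every (capital, suffix, tag) cell."""
    counts = Counter(samples)
    return {cell: float(counts[cell]) for cell in product(CAPITAL, SUFFIX, TAGS)}


def margin(table, axes):
    """Sum the table over every axis that is not listed in `axes`."""
    out = Counter()
    for cell, value in table.items():
        out[tuple(cell[a] for a in axes)] += value
    return out


def fit_loglinear(observed, margins=((0, 2), (1, 2)), iterations=20):
    """Smooth cell counts by iterative proportional fitting.

    Each entry of `margins` names the axes of one interaction term whose
    observed marginal totals the fitted table must reproduce. The default
    keeps capital-by-tag and suffix-by-tag (an assumed model choice).
    """
    fitted = {cell: 1.0 for cell in observed}
    for _ in range(iterations):
        for axes in margins:
            target = margin(observed, axes)
            current = margin(fitted, axes)
            for cell in fitted:
                key = tuple(cell[a] for a in axes)
                if current[key] > 0:  # cells already at zero stay at zero
                    fitted[cell] *= target[key] / current[key]
    return fitted


def tag_distribution(fitted, capital, suffix):
    """Conditional distribution P(tag | capital, suffix) from the smoothed table."""
    row = {tag: fitted[(capital, suffix, tag)] for tag in TAGS}
    total = sum(row.values())
    if total == 0:
        raise ValueError("no smoothed mass for this feature combination")
    return {tag: value / total for tag, value in row.items()}


observed = count_table(SAMPLES)
fitted = fit_loglinear(observed)
# A capitalized word ending in "-s" never occurs as NN in the toy data,
# yet the smoothed table still assigns that cell some probability mass.
distribution = tag_distribution(fitted, "cap", "-s")
```

In this sketch the smoothing comes from fitting fewer parameters than there are cells: a cell with zero observed count borrows strength from the marginal totals it shares with observed cells. Richer models can be tried by passing additional axis pairs in `margins`.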
Authors
Context
- Venue: AAAI Conference on Artificial Intelligence
- Archive span: 1980-2026
- Indexed papers: 28718
- Paper id: 252143293646888644