DocumentCode :
1076679
Title :
Probabilistic Modeling of Korean Morphology
Author :
Lee, Do-Gil ; Rim, Hae-Chang
Author_Institution :
Inst. of Korean Culture, Korea Univ., Seoul
Volume :
17
Issue :
5
Year :
2009
Date :
7/1/2009
Firstpage :
945
Lastpage :
955
Abstract :
This paper proposes new probabilistic models for analyzing Korean morphology. In order to take advantage of the characteristics of Korean morphology, the proposed models are based on three linguistic units: eojeol (a Korean spacing unit), morpheme, and syllable. Unlike previous approaches that are based on rules and dictionaries, the probabilistic approach proposed in this study can automatically acquire complete linguistic knowledge from part-of-speech (POS) tagged corpora. In addition, this approach, without any system modification, is easily applicable to other corpora with different tag sets and annotation guidelines. The three different models and their combinations are evaluated on three corpora over a wide range of conditions. The eojeol-unit and syllable-unit models compensate for the weaknesses of the morpheme-unit model. The eojeol-unit model performed efficiently, and improved the precision. The syllable-unit model improved in precision as well, showing a particularly robust performance in treating unknown words. The proposed approach is also proven to outperform the previous approaches.
Keywords :
probability; speech processing; Korean morphology; Korean spacing unit; eojeol unit; linguistic knowledge; morpheme unit; part-of-speech tagged corpora; probabilistic modeling; syllable unit; Dictionaries; Educational programs; Educational technology; Guidelines; Helium; Machine learning; Morphology; Natural languages; Robustness; morphological analysis; probabilistic model
Language :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2019922
Filename :
5075772