DocumentCode
2361717
Title
Hybrid baseform builder for phonetic languages
Author
Kumar, Mohit ; Rajput, Nitendra ; Verma, Ashish
Author_Institution
IBM India Res. Lab., New Delhi, India
fYear
2005
fDate
4-7 Jan. 2005
Firstpage
382
Lastpage
386
Abstract
We present a novel technique for automatically building baseforms from spellings in phonetic languages. For such languages, rule-based techniques give fairly accurate baseforms, but they leave some language-dependent ambiguities. To handle these, we apply a statistical method that improves the correctness of phonetic spelling builders, using the rule-based baseforms as the training corpus. We also present an alternative method that builds decision trees over the phone context to modify the rule-based baseforms. The novel framework, which generates baseforms by applying spelling-to-sound rules and then statistics, requires only a small amount of training data. We report correction results for a Hindi baseform builder and recognition results obtained with the generated baseforms in a Hindi speech recognition task.
Keywords
decision trees; natural languages; speech processing; speech recognition; statistical analysis; Hindi language baseform builder; Hindi speech recognition task; hybrid baseform builder; phonetic languages; phonetic spelling builders; rule-based baseforms; spelling-to-sound rules; statistical method; training corpus; Acoustics; Decision trees; Loudspeakers; Natural languages; Speech recognition; Speech synthesis; Statistical analysis; Tiles; Tires; Vocabulary
fLanguage
English
Publisher
IEEE
Conference_Titel
Proceedings of 2005 International Conference on Intelligent Sensing and Information Processing
Print_ISBN
0-7803-8840-2
Type
conf
DOI
10.1109/ICISIP.2005.1529481
Filename
1529481