DocumentCode
3244384
Title
Pronunciation variation analysis based on acoustic and phonemic distance measures with application examples on Mandarin Chinese
Author
Tsai, Ming-Yi ; Lee, Lin-shan
Author_Institution
Graduate Inst. of Commun. Eng., Nat. Taiwan Univ., Taipei, Taiwan
fYear
2003
fDate
30 Nov.-3 Dec. 2003
Firstpage
117
Lastpage
122
Abstract
In this paper, two conceptually different statistical distance metrics are defined and analyzed. First, the asymmetric acoustic distance measures how close the acoustic properties of one phoneme are to those of another, and is defined here based on the Mahalanobis distance between two hidden Markov models. Second, the asymmetric phonemic distance measures how likely a phoneme is to be realized as another, based on the aligned canonical and surface phonemic transcriptions of speech corpora. These two mutually dependent distance measures are used to construct an abstract acoustic/phonemic distance plane, which is helpful in quantitatively analyzing pronunciation variations. In addition, the clear distinction and the complicated correlation between these two distances are discussed, which is believed to be helpful in many application areas involving lexical design and acoustic modelling. Preliminary analyses were performed on the LDC Hub-4NE Mandarin Broadcast News database, and a possible application in pronunciation modelling is discussed.
Keywords
hidden Markov models; speech processing; speech recognition; statistical analysis; LDC Hub-4NE Mandarin Broadcast News database; Mahalanobis distance; Mandarin Chinese language; acoustic distance measures; asymmetric acoustic distance; asymmetric phonemic distance; distance plane; phoneme; pronunciation modelling; pronunciation variation analysis; speech corpora; statistical distance metrics; Acoustic applications; Acoustic measurements; Acoustical engineering; Broadcasting; Databases; Dictionaries; Performance analysis; Speech analysis
fLanguage
English
Publisher
IEEE
Conference_Titel
Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on
Print_ISBN
0-7803-7980-2
Type
conf
DOI
10.1109/ASRU.2003.1318414
Filename
1318414