DocumentCode
1336394
Title
Integration of phonetic and prosodic information for robust utterance verification
Author
Wu, C.-H. ; Chen, Y.-J. ; Yan, G.-L.
Author_Institution
Dept. of Comput. Sci. & Inf. Eng., Nat. Cheng Kung Univ., Tainan, Taiwan
Volume
147
Issue
1
fYear
2000
fDate
2/1/2000 12:00:00 AM
Firstpage
55
Lastpage
61
Abstract
Mandarin speech is known for its tonal characteristic, and prosodic information plays an important role in Mandarin speech recognition. Motivated by this property, phonetic and prosodic information are integrated and used for keyword spotting in Mandarin telephone speech. A two-stage strategy, with recognition followed by verification, is adopted. For keyword recognition, 132 subsyllable models, two general acoustic filler models and one background/silence model are separately trained and used as the basic recognition units. For utterance verification, 12 anti-subsyllable models, 175 context-dependent prosodic models and five anti-prosodic models are constructed. A keyword verification function combining phonetic-phase and prosodic-phase verification is investigated. On a test set of 3088 conversational speech utterances from 33 speakers (20 male and 13 female) with a vocabulary of 2583 faculty names, the proposed verification method yields an 18.3% false alarm rate at 8.5% false rejection. Furthermore, the method correctly rejects 90.9% of non-keywords. Comparison with a baseline system without prosodic-phase verification shows that prosodic information can improve verification performance.
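The abstract does not state the form of the combined verification function, so the following is only a minimal sketch of one common way such a two-phase decision could be set up. It assumes, hypothetically, that each phase yields a log-likelihood ratio (model versus anti-model) and that the two are combined by a weighted sum compared against a threshold. The names `Hypothesis`, `combined_score`, `accept` and `error_rates`, the weight and the threshold are all illustrative and are not taken from the paper. The sketch also shows how false rejection and false alarm rates are computed from labelled hypotheses.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Hypothesis:
    # Assumed inputs (not from the paper): per-phase log-likelihood ratios.
    phonetic_llr: float  # subsyllable model vs. anti-subsyllable model
    prosodic_llr: float  # prosodic model vs. anti-prosodic model
    is_keyword: bool     # ground-truth label, used only for scoring


def combined_score(h: Hypothesis, weight: float = 0.5) -> float:
    """Weighted sum of the two phase scores (an assumed combination rule)."""
    return weight * h.phonetic_llr + (1.0 - weight) * h.prosodic_llr


def accept(h: Hypothesis, threshold: float = 0.0, weight: float = 0.5) -> bool:
    """Accept the keyword hypothesis when the combined score reaches the threshold."""
    return combined_score(h, weight) >= threshold


def error_rates(
    hyps: List[Hypothesis], threshold: float = 0.0, weight: float = 0.5
) -> Tuple[float, float]:
    """Return (false_rejection_rate, false_alarm_rate) over labelled hypotheses."""
    keywords = [h for h in hyps if h.is_keyword]
    non_keywords = [h for h in hyps if not h.is_keyword]
    # False rejection: a true keyword that the verifier rejects.
    fr = sum(not accept(h, threshold, weight) for h in keywords)
    # False alarm: a non-keyword that the verifier accepts.
    fa = sum(accept(h, threshold, weight) for h in non_keywords)
    fr_rate = fr / len(keywords) if keywords else 0.0
    fa_rate = fa / len(non_keywords) if non_keywords else 0.0
    return fr_rate, fa_rate


# Toy, made-up scores purely to exercise the functions.
toy = [
    Hypothesis(2.0, 1.0, True),
    Hypothesis(0.5, -2.0, True),
    Hypothesis(-1.0, -1.0, False),
    Hypothesis(1.0, 0.5, False),
]
combined_rates = error_rates(toy, threshold=0.0, weight=0.5)
phonetic_only_rates = error_rates(toy, threshold=0.0, weight=1.0)
```

Setting `weight=1.0` reduces the sketch to phonetic-only verification, which mirrors the kind of baseline comparison the abstract mentions. Sweeping `threshold` trades false rejections against false alarms.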
Keywords
feature extraction; natural languages; speech processing; speech recognition; Mandarin speech; Mandarin speech recognition; Mandarin telephone speech keyword spotting; acoustic filler models; anti-prosodic models; anti-subsyllable models; background/silence model; baseline system; context-dependent prosodic models; keyword recognition; phonetic information; prosodic information; prosodic-phase verification; robust utterance verification; tonal characteristic; two-stage strategy; verification performance
fLanguage
English
Journal_Title
Vision, Image and Signal Processing, IEE Proceedings -
Publisher
IET
ISSN
1350-245X
Type
jour
DOI
10.1049/ip-vis:20000099
Filename
842718