Title :
Tone and pitch accent classification using auditory attention cues
Author_Institution :
US R&D, Sony Comput. Entertainment, Foster City, CA, USA
Abstract :
A detailed description of tone and intonation is beneficial for many spoken language processing applications. Traditional methods for tone and pitch accent modeling rely on prosodic features such as pitch, energy, and duration. Here, a novel system that uses auditory attention cues is proposed for tone and fine-grained pitch accent classification. The auditory attention cues are biologically inspired and are extracted by mimicking the processing stages in the human auditory system. When tested on the Boston University Radio News Corpus, the proposed method achieves 64.6% pitch accent and 89.7% boundary tone classification accuracy. In addition, the model successfully recognizes lexical tones in Mandarin, with 79.0% accuracy on a continuous Mandarin Chinese speech database. These results compare very well to the reported human performance on these tasks.
Keywords :
speech recognition; Mandarin Chinese speech database; auditory attention cues; fine-grained pitch accent classification; pitch accent modeling; tone accent classification; tone accent modeling; Accuracy; Biological system modeling; Computational modeling; Databases; Feature extraction; Speech; auditory attention; auditory gist; boundary tone; lexical tone; pitch accent; tone recognition
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2011 IEEE International Conference on
Conference_Location :
Prague
Print_ISBN :
978-1-4577-0538-0
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2011.5947531