DocumentCode :
1694293
Title :
Accurate speech segmentation by mimicking human auditory processing
Author :
King, Simon ; Hasegawa-Johnson, Mark
Author_Institution :
Dept. of Electr. & Comput. Eng., Univ. of Illinois, Urbana, IL, USA
fYear :
2013
Firstpage :
8096
Lastpage :
8100
Abstract :
This paper addresses the problem of locating phone boundaries without prior knowledge of the text of an utterance. A biomimetic model of human auditory processing is used to calculate the neural features of frequency synchrony and average signal level. Frequency synchrony and average signal level are used as input to a two-layered support vector machine (SVM)-based system to detect phone boundaries. Phone boundaries are detected with 87.0% precision and 84.8% recall when the automatic segmentation system has no prior knowledge of the phone sequence in the utterance.
Keywords :
speech processing; synchronisation; automatic segmentation system; biomimetic model; frequency synchrony; human auditory processing; neural features; phone boundaries location; phone sequence; signal level; speech segmentation; two-layered support vector machine-based system; two-layered-based system; Computational modeling; Frequency synchronization; Ice; Speech; Speech recognition; Support vector machines; Training; Automatic segmentation; auditory modeling; average signal level; frequency synchrony
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
ISSN :
1520-6149
Type :
conf
DOI :
10.1109/ICASSP.2013.6639242
Filename :
6639242