DocumentCode
1694293
Title
Accurate speech segmentation by mimicking human auditory processing
Author
King, Simon; Hasegawa-Johnson, Mark
Author_Institution
Dept. of Electr. & Comput. Eng., Univ. of Illinois, Urbana, IL, USA
fYear
2013
Firstpage
8096
Lastpage
8100
Abstract
This paper addresses the problem of locating phone boundaries without prior knowledge of the text of an utterance. A biomimetic model of human auditory processing is used to calculate two neural features: frequency synchrony and average signal level. These features are the input to a two-layered support vector machine (SVM)-based system that detects phone boundaries. When the automatic segmentation system has no prior knowledge of the phone sequence in the utterance, it detects phone boundaries with 87.0% precision and 84.8% recall.
Keywords
speech processing; synchronisation; automatic segmentation system; biomimetic model; frequency synchrony; human auditory processing; neural features; phone boundaries location; phone sequence; signal level; speech segmentation; two-layered support vector machine-based system; two-layered-based system; Computational modeling; Frequency synchronization; Ice; Speech; Speech recognition; Support vector machines; Training; Automatic segmentation; auditory modeling; average signal level; frequency synchrony
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location
Vancouver, BC
ISSN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2013.6639242
Filename
6639242