  • DocumentCode
    1694293
  • Title
    Accurate speech segmentation by mimicking human auditory processing
  • Author
    King, Simon ; Hasegawa-Johnson, Mark
  • Author_Institution
    Dept. of Electr. & Comput. Eng., Univ. of Illinois, Urbana, IL, USA
  • fYear
    2013
  • Firstpage
    8096
  • Lastpage
    8100
  • Abstract
    This paper addresses the problem of locating phone boundaries without prior knowledge of the text of an utterance. A biomimetic model of human auditory processing is used to calculate the neural features of frequency synchrony and average signal level. Frequency synchrony and average signal level are used as input to a two-layered support vector machine (SVM)-based system to detect phone boundaries. Phone boundaries are detected with 87.0% precision and 84.8% recall when the automatic segmentation system has no prior knowledge of the phone sequence in the utterance.
  • Keywords
    speech processing; synchronisation; automatic segmentation system; biomimetic model; frequency synchrony; human auditory processing; neural features; phone boundaries location; phone sequence; signal level; speech segmentation; two-layered support vector machine-based system; two-layered-based system; Computational modeling; Frequency synchronization; Ice; Speech; Speech recognition; Support vector machines; Training; Automatic segmentation; auditory modeling; average signal level; frequency synchrony
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
  • Conference_Location
    Vancouver, BC
  • ISSN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2013.6639242
  • Filename
    6639242