• DocumentCode
    417276
  • Title
    Fusion based speech segmentation in DARPA SPINE2 task
  • Author
    Zheng, Chengyi; Yan, Yonghong
  • Author_Institution
    Comput. Sci. & Eng. Dept., Oregon Health & Sci. Univ., Beaverton, OR, USA
  • Volume
    1
  • fYear
    2004
  • fDate
    17-21 May 2004
  • Abstract
    We report a new fusion-based segmentation approach using multiple filter bank coefficients. This approach takes advantage of the current feature extraction procedure and adds little computational cost. Another level of fusion was performed by combining several segmentation systems. Evaluation was conducted on the second Speech In Noisy Environments (SPINE2) task. Experiments show that our fusion-based approaches significantly reduced the word error rate (WER) compared with two classifier-based approaches. Compared with manual segmentation, our approach increases WER by only 0.3%.
  • Keywords
    channel bank filters; error statistics; feature extraction; sensor fusion; speech recognition; ASR; DARPA SPINE2 task; Speech In Noisy Environments task; WER reduction; automatic speech recognition; fusion based segmentation; multiple filter bank coefficients; speech segmentation; Acoustic noise; Computer science; Decoding; Hidden Markov models; Loudspeakers; Military aircraft; Streaming media; Working environment noise
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 2004. Proceedings. (ICASSP '04). IEEE International Conference on
  • ISSN
    1520-6149
  • Print_ISBN
    0-7803-8484-9
  • Type
    conf
  • DOI
    10.1109/ICASSP.2004.1326128
  • Filename
    1326128