• DocumentCode
    117906
  • Title
    Multi-view features in a DNN-CRF model for improved sentence unit detection on English broadcast news
  • Author
    Guangpu Huang ; Chenglin Xu ; Xiong Xiao ; Lei Xie ; Eng Siong Chng ; Haizhou Li
  • Author_Institution
    Temasek Labs. @NTU, Singapore, Singapore
  • fYear
    2014
  • fDate
    9-12 Dec. 2014
  • Firstpage
    1
  • Lastpage
    9
  • Abstract
    This paper presents a deep neural network-conditional random field (DNN-CRF) system with multi-view features for sentence unit detection on English broadcast news. We propose a set of multi-view features extracted from the acoustic, articulatory, and linguistic domains and use them jointly in the DNN-CRF model to predict sentence boundaries. We evaluate the multi-view features on the standard NIST RT-04 English broadcast news speech data. Experiments show that, on the reference transcription, the best system significantly outperforms the state-of-the-art sentence unit detection system, reducing the NIST sentence error rate by 13.2% absolute. However, the gain is limited on the recognized transcription, partly because of its high word error rate.
  • Keywords
    feature extraction; natural language processing; neural nets; speech recognition; DNN-CRF model; NIST RT-04 English broadcast news speech data; NIST sentence error rate reduction; acoustic domain; articulatory domain; deep neural network conditional random field system; linguistic domain; multiview feature extraction; sentence unit detection improvement; word error rate; Acoustics; Feature extraction; Hidden Markov models; Pragmatics; Speech; Tongue; Training
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Asia-Pacific Signal and Information Processing Association, 2014 Annual Summit and Conference (APSIPA)
  • Conference_Location
    Siem Reap
  • Type
    conf
  • DOI
    10.1109/APSIPA.2014.7041543
  • Filename
    7041543