• DocumentCode
    694559
  • Title

    Experimental comparison of text information based punctuation recovery algorithms in real data

  • Author

    Xiao Chen ; Dengfeng Ke ; Bo Xu

  • Author_Institution
    Interactive Digital Media Technol. Res. Center (IDMTech), Inst. of Autom., Beijing, China
  • fYear
    2013
  • fDate
    12-13 Oct. 2013
  • Firstpage
    1199
  • Lastpage
    1202
  • Abstract
    Punctuation recovery is very important for automatic speech recognition (ASR). It greatly improves the readability of transcripts and the user experience, and it facilitates downstream natural language processing tasks. Text information based methods are one of the basic approaches to punctuation recovery. To analyze the characteristics of these algorithms, improve them, and use them to develop practical systems, this paper evaluates four text information based punctuation recovery algorithms (HELM, CRF, RNNLM and GTI) on real data. Results show that GTI outperforms the other algorithms for punctuation recovery, and that ASR errors are the main cause of performance degradation for all punctuation recovery algorithms. Finally, some suggestions are given.
  • Keywords
    speech recognition; text analysis; ASR; CRF algorithm; GTI algorithm; HELM algorithm; RNNLM algorithm; automatic speech recognition; natural language processing; punctuation recovery algorithm; text information; transcript readability; user experience; Computer science; punctuation recovery; real data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Network Technology (ICCSNT), 2013 3rd International Conference on
  • Conference_Location
    Dalian
  • Type

    conf

  • DOI
    10.1109/ICCSNT.2013.6967317
  • Filename
    6967317