Title :
Experimental comparison of text information based punctuation recovery algorithms in real data
Author :
Xiao Chen ; Dengfeng Ke ; Bo Xu
Author_Institution :
Interactive Digital Media Technology Research Center (IDMTech), Institute of Automation, Beijing, China
Abstract :
Punctuation recovery is important for automatic speech recognition (ASR): it greatly improves the readability of transcripts and the user experience, and it facilitates downstream natural language processing tasks. Text-information-based methods are one of the basic approaches to punctuation recovery. To analyze the characteristics of these algorithms, improve them, and use them to build practical systems, this paper evaluates four text-information-based punctuation recovery algorithms (HELM, CRF, RNNLM and GTI) on real data. Results show that GTI outperforms the other algorithms, and that ASR errors are the main cause of performance degradation for all of them. Finally, some suggestions are given.
Keywords :
speech recognition; text analysis; ASR; CRF algorithm; GTI algorithm; HELM algorithm; RNNLM algorithm; automatic speech recognition; natural language processing; punctuation recovery algorithm; text information; transcript readability; user experience; Computer science; punctuation recovery; real data
Conference_Title :
Computer Science and Network Technology (ICCSNT), 2013 3rd International Conference on
Conference_Location :
Dalian
DOI :
10.1109/ICCSNT.2013.6967317