DocumentCode
694559
Title
Experimental comparison of text information based punctuation recovery algorithms in real data
Author
Xiao Chen ; Dengfeng Ke ; Bo Xu
Author_Institution
Interactive Digital Media Technol. Res. Center (IDMTech), Inst. of Autom., Beijing, China
fYear
2013
fDate
12-13 Oct. 2013
Firstpage
1199
Lastpage
1202
Abstract
Punctuation recovery is very important for automatic speech recognition (ASR). It greatly improves the readability of transcripts and the user experience, and facilitates downstream natural language processing tasks. The text information based method is one of the basic approaches to punctuation recovery. To analyze the characteristics of these algorithms, improve them, and use them to develop practical systems, this paper evaluates text information based punctuation recovery algorithms (HELM, CRF, RNNLM and GTI) on real data. Results show that GTI outperforms the other algorithms for punctuation recovery, and that ASR errors are the main cause of performance degradation for all punctuation recovery algorithms. Finally, some suggestions are given.
Keywords
speech recognition; text analysis; ASR; CRF algorithm; GTI algorithm; HELM algorithm; RNNLM algorithm; automatic speech recognition; natural language processing; punctuation recovery algorithm; text information; transcript readability; user experience; Computer science; punctuation recovery; real data; text information
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Science and Network Technology (ICCSNT), 2013 3rd International Conference on
Conference_Location
Dalian
Type
conf
DOI
10.1109/ICCSNT.2013.6967317
Filename
6967317