Regional Information Center for Science and Technology (RICEST) - Syntactically-informed models for comma prediction

DocumentCode :

3530609

Title :

Syntactically-informed models for comma prediction

Author :

Favre, Benoit ; Hakkani-Tür, Dilek ; Shriberg, Elizabeth

Author_Institution :

Int. Comput. Sci. Inst., Berkeley, CA

fYear :

2009

fDate :

19-24 April 2009

Firstpage :

4697

Lastpage :

4700

Abstract :

Providing punctuation in speech transcripts not only improves readability but also helps downstream text processing such as information extraction or machine translation. In this paper, we improve the accuracy of comma prediction in English broadcast news by 7% by introducing syntactic features inspired by the role of commas as described in linguistics studies. We analyze the impact of those features on other subsets of features (prosody, words, etc.) when combined through CRFs. The syntactic cues can help characterize large syntactic patterns, such as appositions and lists, which are not necessarily marked by prosody.

Keywords :

linguistics; natural language processing; speech recognition; text analysis; English broadcast news; automatic speech recognition systems; downstream text processing; information extraction; linguistics; machine translation; speech transcription; Boosting; Broadcasting; Classification tree analysis; Computer science; Data mining; Decision trees; Neural networks; Predictive models; Speech processing; Testing; Machine Learning; Punctuation; Speech Processing;

fLanguage :

English

Publisher :

IEEE

Conference_Title :

Acoustics, Speech and Signal Processing, 2009. ICASSP 2009. IEEE International Conference on

Conference_Location :

Taipei

ISSN :

1520-6149

Print_ISBN :

978-1-4244-2353-8

Electronic_ISBN :

1520-6149

Type :

conf

DOI :

10.1109/ICASSP.2009.4960679

Filename :

4960679

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3530609