Regional Information Center for Science and Technology - A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization

DocumentCode :

3559906

Title :

A Probabilistic Generative Framework for Extractive Broadcast News Speech Summarization

Author :

Chen, Yi-Ting ; Chen, Berlin ; Wang, Hsin-Min

Author_Institution :

Dept. of Comput. Sci. & Inf. Eng., Nat. Taiwan Normal Univ., Taipei

Volume :

Issue :

fYear :

2009

Firstpage :

Lastpage :

106

Abstract :

In this paper, we consider extractive summarization of broadcast news speech and propose a unified probabilistic generative framework that combines the sentence generative probability and the sentence prior probability for sentence ranking. Each sentence of a spoken document to be summarized is treated as a probabilistic generative model for predicting the document. Two matching strategies, namely literal term matching and concept matching, are thoroughly investigated. We explore the use of the language model (LM) and the relevance model (RM) for literal term matching, while the sentence topical mixture model (STMM) and the word topical mixture model (WTMM) are used for concept matching. In addition, the lexical and prosodic features, as well as the relevance information of spoken sentences, are properly incorporated for the estimation of the sentence prior probability. An elegant feature of our proposed framework is that both the sentence generative probability and the sentence prior probability can be estimated in an unsupervised manner, without the need for handcrafted document-summary pairs. The experiments were performed on Chinese broadcast news collected in Taiwan, and very encouraging results were obtained.
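The ranking idea in the abstract, scoring each sentence S by the product of its generative probability P(D|S) and its prior P(S), can be illustrated with a minimal sketch of the literal term matching (LM) case. This is not the authors' implementation. The function name `rank_sentences`, the interpolation weight `lam`, the use of the document itself as the background model, and the uniform default prior are all assumptions made here for illustration; the abstract does not specify any of them.

```python
import math
from collections import Counter


def rank_sentences(sentences, lam=0.5, priors=None):
    """Rank sentences by log P(D|S) + log P(S), best first.

    sentences: list of token lists; the document D is their concatenation.
    lam: weight of the sentence unigram model in the interpolation
         (assumed value; must satisfy 0 <= lam < 1 to avoid log(0)).
    priors: optional list of positive P(S) values; uniform if omitted.
    """
    if not 0.0 <= lam < 1.0:
        raise ValueError("lam must be in [0, 1)")

    doc_tokens = [w for sent in sentences for w in sent]
    doc_counts = Counter(doc_tokens)
    doc_len = len(doc_tokens)

    if priors is None:
        priors = [1.0 / len(sentences)] * len(sentences)

    scores = []
    for sent, prior in zip(sentences, priors):
        sent_counts = Counter(sent)
        sent_len = len(sent)
        log_score = math.log(prior)
        # P(D|S) = product over document words of P(w|S)^count(w, D),
        # with P(w|S) smoothed by a background unigram model.
        for word, count in doc_counts.items():
            p_sent = sent_counts[word] / sent_len if sent_len else 0.0
            p_background = count / doc_len  # assumed: document as background
            log_score += count * math.log(lam * p_sent + (1.0 - lam) * p_background)
        scores.append(log_score)

    # Indices of sentences, highest score first.
    return sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)


if __name__ == "__main__":
    doc = [
        ["taiwan", "news", "report"],
        ["weather"],
        ["taiwan", "news", "news", "taiwan"],
    ]
    print(rank_sentences(doc))
```

With a uniform prior, the ranking depends only on how well each sentence's word distribution predicts the whole document. Passing a non-uniform `priors` list shows how a sentence prior, which the abstract says is estimated from lexical, prosodic, and relevance information, can shift the ranking.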

Keywords :

broadcasting; pattern matching; speech processing; Chinese broadcast news; Taiwan; concept matching; extractive broadcast news speech summarization; language model; literal term matching; probabilistic generative framework; relevance model; sentence generative probability; sentence prior probability; sentence ranking; sentence topical mixture model; word topical mixture model; Extractive spoken document summarization; language model (LM); probabilistic generative framework; relevance model (RM); topical mixture model;

fLanguage :

English

Journal_Title :

Audio, Speech, and Language Processing, IEEE Transactions on

Publisher :

IEEE

Conference_Location :

12/16/2008 12:00:00 AM

ISSN :

1558-7916

Type :

jour

DOI :

10.1109/TASL.2008.2005031

Filename :

4717223

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=3559906