Title :
A content-based Chinese speech document retrieval system design and implementation
Author :
Zhong, Cencen ; Miao, Zhenjiang ; Zhang, Jie ; Du, Luyan ; Kang, Dandan
Author_Institution :
Inst. of Inf. Sci., Beijing Jiaotong Univ., Beijing, China
Abstract :
Rapid advances in speech processing technology have opened up the potential for speech retrieval. This paper designs and implements a content-based Chinese speech document retrieval system that uses keyword spotting and text classification. In this system, a segment of unknown spontaneous speech is converted into a series of keywords and then classified into a category, called a topic. The aim is a retrieval model with two levels of semantic information, which lets users search for the desired speech by keyword or by topic query. In addition, based on the theory of mutual information, the text classification result is fed back to the keywords to remove some false alarms. The paper describes the structure, principles, and implementation status of the retrieval system, and then presents experimental results and discussion.
Keywords :
content-based retrieval; natural languages; speech processing; text analysis; content-based Chinese speech document retrieval system; keyword spotting; speech processing technology; text classification; Automatic speech recognition; Content based retrieval; Explosions; Information retrieval; Information science; Mutual information; Sampling methods; Signal processing; Speech processing; Text categorization;
Conference_Title :
Communications, Computers and Signal Processing, 2009. PacRim 2009. IEEE Pacific Rim Conference on
Conference_Location :
Victoria, BC
Print_ISBN :
978-1-4244-4560-8
Electronic_ISBN :
978-1-4244-4561-5
DOI :
10.1109/PACRIM.2009.5291386