DocumentCode :
2560815
Title :
Taiwanese TV news-to-document index system
Author :
Lyu, Dau-Cheng ; Yang, Bo-Hou ; Lyu, Ren-Yuan ; Hsu, Chun-Nan
Author_Institution :
Dept. of Electr. Eng., Chang Gung Univ., Taoyuan, Taiwan
fYear :
2005
fDate :
28-30 May 2005
Firstpage :
182
Lastpage :
185
Abstract :
This paper describes a system that indexes Taiwanese TV news speech to Chinese text documents on the World Wide Web. The system is based on two main techniques: automatic speech recognition (ASR) and bilingual text alignment. For the former, we used a speech-to-text approach to recognize the anchors' utterances in the TV news as Taiwanese tonal syllable sequences. We then translated the Chinese text documents, obtained from the corresponding news website, into Taiwanese tonal syllables using a bilingual pronunciation lexicon. Afterward, a dynamic programming algorithm performs syllable-level alignment to link the TV news to the documents. A speech corpus from about 100 speakers and text data of 840k Chinese characters were used to train the acoustic and language models for ASR. A bilingual lexicon containing 70k vocabulary entries serves as the resource for both the pronunciation model in ASR and the statistical translation model in bilingual text alignment. Finally, the document index system was evaluated on TV news comprising 40 stories, and the indexing accuracy exceeds 82% on average.
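The syllable-level linking step described in the abstract can be illustrated with a minimal sketch. It assumes plain edit distance as the dynamic programming criterion and picks the document whose syllable sequence is closest to the recognized one; the abstract does not specify the actual cost function or scoring, so this is not the authors' method. The function names and the romanized syllable labels are hypothetical placeholders.

```python
from typing import List, Sequence


def syllable_edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Edit distance between two syllable sequences (assumed unit costs)."""
    # prev[j] holds the distance between the first i-1 syllables of a
    # and the first j syllables of b.
    prev = list(range(len(b) + 1))
    for i, syl_a in enumerate(a, start=1):
        curr = [i]
        for j, syl_b in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if syl_a == syl_b else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


def best_document(recognized: Sequence[str],
                  documents: List[Sequence[str]]) -> int:
    """Index of the document closest to the recognized syllables.

    Assumes a non-empty document list. Distances are normalized by the
    longer sequence length so long documents are not penalized unfairly
    (a choice made for this sketch only).
    """
    def score(doc: Sequence[str]) -> float:
        return syllable_edit_distance(recognized, doc) / max(
            len(recognized), len(doc), 1)

    return min(range(len(documents)), key=lambda k: score(documents[k]))


# Hypothetical tonal-syllable sequences, for illustration only.
recognized = ["tai5", "uan5", "sin1", "bun5"]
documents = [
    ["tai5", "pak4", "sin1", "bun5"],
    ["thinn1", "khi3", "po3", "ko3"],
]
chosen = best_document(recognized, documents)
```

Here the first document differs from the recognized sequence by one syllable and the second by all four, so the first is selected. A real system would likely weight substitutions by acoustic confusability instead of using unit costs.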
Keywords :
indexing; speech synthesis; television; text analysis; Taiwanese TV; World Wide Web Chinese text documents; automatic speech recognition; bilingual pronunciation lexicon; bilingual text alignment; dynamic programming; news-to-document index system; speech-to-text approach; syllable-level alignment; Automatic speech recognition; Computer science; Dynamic programming; Information science; Multimedia databases; Natural languages; Speech recognition; TV; Vocabulary; Web sites;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Cellular Neural Networks and Their Applications, 2005 9th International Workshop on
Print_ISBN :
0-7803-9185-3
Type :
conf
DOI :
10.1109/CNNA.2005.1543191
Filename :
1543191