DocumentCode
3426761
Title
Using textual information from LVCSR transcripts for phonetic-based spoken term detection
Author
Dubois, Corentin ; Charlet, Delphine
Author_Institution
TECH/SSTP/RVA, France Telecom R&D, Lannion
fYear
2008
fDate
March 31 2008-April 4 2008
Firstpage
4961
Lastpage
4964
Abstract
This paper presents a spoken term detection method based on automatic speech recognition and phonetic representation. The proposed method combines textual search in word transcripts produced by a large-vocabulary continuous speech recognition (LVCSR) system with phonetic search in the phonetization of these transcripts, in order to accurately locate occurrences of a list of keywords in a broadcast corpus. Textual information from the transcripts and an efficient rescoring scheme are used to improve the performance of the phonetic search. Our experiments show that the proposed method outperforms the baseline textual and phonetic searches, owing to its ability to separate correct detections from false alarms.
Keywords
natural language processing; search problems; speech processing; speech recognition; LVCSR transcripts; automatic speech recognition; broadcast corpus; keyword spotting; large vocabulary continuous speech recognizer system; phonetic representation; phonetic search; phonetic-based spoken term detection; phonetization; rescoring scheme; textual information; textual search; word transcripts; Audio recording; Automatic speech recognition; Broadcasting; Decoding; Dictionaries; Error analysis; Indexing; Research and development; Speech recognition; Vocabulary; Automatic Speech Recognition; OOV keyword; Spoken Term Detection; phonetic representation;
fLanguage
English
Publisher
IEEE
Conference_Titel
Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on
Conference_Location
Las Vegas, NV
ISSN
1520-6149
Print_ISBN
978-1-4244-1483-3
Electronic_ISBN
1520-6149
Type
conf
DOI
10.1109/ICASSP.2008.4518771
Filename
4518771