Title :
Enhancing query expansion for semantic retrieval of spoken content with automatically discovered acoustic patterns
Author :
Hung-yi Lee ; Yun-Chiao Li ; Cheng-Tao Chung ; Lin-Shan Lee
Author_Institution :
Res. Center for Inf. Technol. Innovation, Acad. Sinica, Taipei, Taiwan
Abstract :
Query expansion techniques were originally developed for text information retrieval, to retrieve documents that do not contain the query terms but are semantically related to the query. This is achieved by assuming that the terms occurring frequently in the top-ranked documents of the first-pass retrieval results are query-related, and using them to expand the query for a second-pass retrieval. However, when this approach is applied to spoken content retrieval, the inevitable recognition errors and out-of-vocabulary (OOV) problems in automatic speech recognition (ASR) make it difficult for many query-related terms to be included in the expanded query, and much of the information carried by the speech signal is lost during recognition and cannot be recovered. In this paper, we propose to use a second ASR engine based on acoustic patterns automatically discovered from the spoken archive used for retrieval. These acoustic patterns are discovered directly from the signal characteristics, and can therefore compensate to a considerable extent for the information lost during recognition. When a text query is entered, the system generates the first-pass retrieval results based on the transcriptions of the spoken segments obtained via the conventional ASR. The acoustic patterns occurring frequently in the spoken segments ranked at the top of the first-pass results are considered query-related, and the spoken segments containing these query-related acoustic patterns are retrieved. In this way, even if some query-related terms are OOV or incorrectly recognized, the segments containing these terms can still be retrieved through the acoustic patterns corresponding to them. Preliminary experiments on Mandarin broadcast news yielded very encouraging results.
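The two-pass procedure described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names (`first_pass`, `related_patterns`, `second_pass`), the term-count scoring, the raw frequency criterion for selecting patterns, the weighted-sum combination, and the toy data are all assumptions made for illustration. The abstract does not specify how patterns are scored or how the two passes are combined, and a real system would operate on ASR lattices and decoded acoustic-pattern sequences rather than short lists.

```python
from collections import Counter

def first_pass(query_terms, transcripts):
    """Score each segment by counting query-term occurrences in its ASR transcript."""
    return {seg: sum(words.count(t) for t in query_terms)
            for seg, words in transcripts.items()}

def related_patterns(first_scores, pattern_seqs, top_n=2, top_k=2):
    """Pick the top_k acoustic patterns found in the most top-ranked segments."""
    ranked = sorted(first_scores, key=first_scores.get, reverse=True)
    top = [seg for seg in ranked[:top_n] if first_scores[seg] > 0]
    counts = Counter()
    for seg in top:
        counts.update(set(pattern_seqs[seg]))  # count each pattern once per segment
    # Sort by descending count, then by name, so ties are broken deterministically.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [pattern for pattern, _ in ordered[:top_k]]

def second_pass(first_scores, pattern_seqs, related, weight=0.5):
    """Add a bonus for each query-related pattern a segment contains (assumed weighting)."""
    return {seg: first_scores[seg] + weight * len(set(pattern_seqs[seg]) & set(related))
            for seg in first_scores}

# Toy data (hypothetical): s3 is relevant, but its query term was misrecognized by the ASR.
transcripts = {
    "s1": ["taipei", "election", "results"],
    "s2": ["election", "campaign"],
    "s3": ["weather", "report"],
    "s4": ["sports", "news"],
}
# Hypothetical acoustic-pattern labels decoded by the second, pattern-based engine.
pattern_seqs = {
    "s1": ["p3", "p7", "p9"],
    "s2": ["p7", "p9", "p2"],
    "s3": ["p7", "p9", "p5"],
    "s4": ["p1", "p4"],
}

first = first_pass(["election"], transcripts)
related = related_patterns(first, pattern_seqs)
final = second_pass(first, pattern_seqs, related)
ranking = sorted(final, key=final.get, reverse=True)
print(related, ranking)
```

In this toy example, segment `s3` receives no first-pass score because the query term is absent from its transcript, yet it is still retrieved in the second pass because it shares acoustic patterns with the top-ranked segments. Raw frequency would also favor patterns that are common across the whole archive; a practical system would likely need some normalization against that, which the sketch omits.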
Keywords :
acoustic signal processing; content-based retrieval; query processing; speech recognition; text analysis; ASR engine; Mandarin broadcast news; OOV problems; automatic acoustic pattern discovery; document retrieval; first-pass retrieval results; query expansion techniques; query terms; query-related acoustic pattern retrieval; recognition errors; second-pass retrieval; semantic retrieval; signal characteristics; speech signal; spoken archive; spoken content retrieval; spoken segment transcriptions; text information retrieval; text query; top-ranked documents; Acoustics; Engines; Hidden Markov models; Lattices; Semantics; Speech; Speech recognition; Acoustic Pattern Discovery; Query Expansion;
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2013 IEEE International Conference on
Conference_Location :
Vancouver, BC
DOI :
10.1109/ICASSP.2013.6639283