Language independent query-by-example spoken term detection using N-best phone sequences and partial matching

Author

Haihua Xu ; Peng Yang ; Xiong Xiao ; Lei Xie ; Cheung-Chi Leung ; Hongjie Chen ; Yu Jia ; Hang, L.V. ; Lei Wang ; Su Jun Leow ; Bin Ma ; Eng Siong Chng ; Haizhou Li

Author_Institution

Temasek Lab., Nanyang Technol. Univ., Singapore, Singapore

fYear

2015

fDate

19-24 April 2015

Firstpage

5191

Lastpage

5195

Abstract

In this paper, we propose a symbolic search (SS) method based on partial sequence matching for the task of language-independent query-by-example spoken term detection. One main drawback of the conventional SS approach is its high miss rate for long queries. This is due to the high variation in the symbolic representations of the query and the search audio, especially in the language-independent scenario. Successfully matching a query with its instances in the search audio becomes exponentially more difficult as the query grows longer. To reduce the miss rate, we propose a partial matching strategy in which all partial phone sequences of a query are used to search for query instances. Partial matching is also suitable for real-life applications, where an exact match is usually not necessary and word prefixes, suffixes, and order should not affect the search result. When applied to the QUESST 2014 task, the results show that partial matching of phone sequences reduces the miss rate of long queries significantly compared with the conventional full matching method. In addition, for the most challenging inexact matching queries (type 3), it also shows a clear advantage over DTW-based methods.
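
The partial matching idea in the abstract can be illustrated with a toy sketch. This is not the paper's implementation: it assumes a single 1-best phone string per utterance (the paper uses N-best phone sequences), exact string comparison of phone labels, and a hypothetical minimum partial length `min_len`. The function names and the coverage-based score are likewise assumptions made for illustration only.

```python
from typing import List, Sequence, Tuple


def partial_sequences(query: Sequence[str], min_len: int) -> List[Tuple[str, ...]]:
    """Return every contiguous sub-sequence of `query` with at least `min_len` phones."""
    n = len(query)
    return [tuple(query[i:j]) for i in range(n) for j in range(i + min_len, n + 1)]


def contains(utterance: Sequence[str], part: Tuple[str, ...]) -> bool:
    """True if `part` occurs as a contiguous run of phones in `utterance`."""
    k = len(part)
    return any(tuple(utterance[s:s + k]) == part for s in range(len(utterance) - k + 1))


def partial_match_score(query: Sequence[str], utterance: Sequence[str], min_len: int = 3) -> float:
    """Fraction of the query covered by the longest partial sequence found in the utterance.

    A score of 1.0 corresponds to a conventional full match; 0.0 means no
    partial sequence of at least `min_len` phones was found.
    """
    if not query:
        return 0.0
    best = 0
    for part in partial_sequences(query, min_len):
        if len(part) > best and contains(utterance, part):
            best = len(part)
    return best / len(query)


# Hypothetical phone strings: the query is only partly present in the utterance.
query = ["k", "ae", "t", "s", "ih", "t"]
utterance = ["dh", "ax", "k", "ae", "t", "r", "ae", "n"]

full_match = contains(utterance, tuple(query))   # conventional full matching misses
score = partial_match_score(query, utterance)    # partial matching still finds "k ae t"
```

In this example full matching fails because the whole six-phone query never appears, while the three-phone partial sequence is still found. A real system would also need to weigh shorter partial sequences against the higher false-alarm risk they carry, which this sketch does not attempt.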

Keywords

query formulation; query processing; speech recognition; QUESST 2014 task; inexact matching queries; language independent query-by-example spoken term detection; language independent scenario; miss rate; partial matching strategy; partial phone sequences; partial sequence matching based symbolic search method; search audios; symbol representation; Acoustics; Audio databases; Indexing; Keyword search; Lattices; Search problems; Speech; keyword search; partial matching; phone tokenizer; query-by-example; spoken term detection

fLanguage

English

Publisher

IEEE

Conference_Titel

Acoustics, Speech and Signal Processing (ICASSP), 2015 IEEE International Conference on

Conference_Location

South Brisbane, QLD

Type

conf

DOI

10.1109/ICASSP.2015.7178961

Filename

7178961