DocumentCode
3377882
Title
Uniterm Voice Indexing and Search for Mobile Devices
Author
Ma, Changxue
Author_Institution
Applic. & Software Res. Center, Motorola Inc, Algonquin, IL
fYear
2008
fDate
3-7 Aug. 2008
Firstpage
1
Lastpage
6
Abstract
In this paper we present two novel approaches to voice indexing and search. The first is a Uniterm-based voice indexing and search scheme for fast retrieval of voice-tagged multimedia content on mobile devices. In the indexing stage, Uniterms, which are high-scoring strings of phonemes, are extracted from the phoneme lattice. For retrieval, the Uniterms are scored against the latent lattice model derived from the query voice. A candidate Uniterm list is selected and used to identify the audio segments whose best phoneme paths are retrieved. The best paths of the candidate audio segments are compared against the best path of the query voice, and the search results are drawn from the best matches. The second approach, presented for comparison, extracts indices from the lattice in the form of unigram and bigram feature vectors. Each audio segment is represented by a tf-idf modulated feature vector. The search process involves two stages: a coarse search looks up the index and quickly returns a set of candidates; a fine search then compares the best paths of the query voice to the phone lattices of the candidates using dynamic programming. Experimental results show that the Uniterm approach is significantly better and that such voice search is feasible for voice-tagged multimedia content on mobile devices. Finally, we introduce a real-time news broadcasting search system.
Keywords
dynamic programming; information retrieval; mobile handsets; multimedia communication; query processing; voice communication; Uniterm voice indexing and search; audio segments; bigram feature vectors; dynamic programming; latent lattice model; mobile devices; phonemes; query voice; real-time news broadcasting search system; unigram feature vectors; voice tagged multimedia contents retrieval; Application software; Content based retrieval; Dictionaries; Dynamic programming; Indexing; Lattices; Multimedia communication; Real time systems; Speech recognition; User interfaces;
fLanguage
English
Publisher
IEEE
Conference_Titel
Computer Communications and Networks, 2008. ICCCN '08. Proceedings of 17th International Conference on
Conference_Location
St. Thomas, US Virgin Islands
ISSN
1095-2055
Print_ISBN
978-1-4244-2389-7
Electronic_ISBN
1095-2055
Type
conf
DOI
10.1109/ICCCN.2008.ECP.156
Filename
4674316