Title :
Large-scale speaker identification
Author :
Schmidt, L. ; Sharifi, Morteza ; Lopez Moreno, Ignacio
Author_Institution :
MIT, Cambridge, MA, USA
Abstract :
Speaker identification is one of the main tasks in speech processing. In addition to identification accuracy, large-scale applications of speaker identification give rise to another challenge: fast search in the database of speakers. In this paper, we propose a system based on i-vectors, a current approach for speaker identification, and locality sensitive hashing, an algorithm for fast nearest neighbor search in high dimensions. The connection between the two techniques is the cosine distance: on the one hand, we use the cosine distance to compare i-vectors; on the other hand, locality sensitive hashing allows us to quickly approximate the cosine distance in our retrieval procedure. We evaluate our approach on a realistic data set from YouTube with about 1,000 speakers. The results show that our algorithm is approximately one to two orders of magnitude faster than a linear search while maintaining the identification accuracy of an i-vector-based system.
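Illustration :
The retrieval idea described in the abstract can be sketched with random-hyperplane hashing, one common locality sensitive hashing scheme for cosine similarity. The abstract does not say which LSH variant the paper uses, so this is only a minimal sketch under that assumption. The names HyperplaneLSH, cosine_similarity, num_bits and num_tables are hypothetical, and plain Python lists stand in for real i-vectors.

```python
import math
import random
from collections import defaultdict


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class HyperplaneLSH:
    """Random-hyperplane LSH index for cosine similarity (illustrative only)."""

    def __init__(self, dim, num_bits=8, num_tables=8, seed=0):
        rng = random.Random(seed)
        # Each hash table gets its own set of num_bits Gaussian hyperplanes.
        self.planes = [
            [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]
            for _ in range(num_tables)
        ]
        self.tables = [defaultdict(list) for _ in range(num_tables)]
        self.vectors = {}

    def _key(self, planes, vec):
        # One bit per hyperplane: the side of the plane the vector falls on.
        return tuple(
            sum(p * v for p, v in zip(plane, vec)) >= 0.0 for plane in planes
        )

    def add(self, speaker_id, vec):
        """Store a speaker vector and register it in every hash table."""
        self.vectors[speaker_id] = vec
        for planes, table in zip(self.planes, self.tables):
            table[self._key(planes, vec)].append(speaker_id)

    def query(self, vec):
        """Return the best-matching speaker id among bucket candidates, or None."""
        # Gather speakers sharing a bucket with the query in any table,
        # then rank only those candidates by exact cosine similarity.
        candidates = set()
        for planes, table in zip(self.planes, self.tables):
            candidates.update(table.get(self._key(planes, vec), []))
        if not candidates:
            return None
        return max(
            candidates, key=lambda s: cosine_similarity(vec, self.vectors[s])
        )
```

Vectors separated by a small angle are likely to share a bucket in at least one table, so a query compares against a short candidate list instead of every enrolled speaker. Raising num_bits shrinks the buckets (faster, but more missed neighbors), while raising num_tables recovers recall at the cost of memory. The query is approximate: it can return None or a non-optimal speaker when the true nearest neighbor shares no bucket with the query.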
Keywords :
search problems; speaker recognition; vectors; YouTube; cosine distance; fast nearest neighbor search; i-vector-based system; identification accuracy; large-scale speaker identification; linear search; locality sensitive hashing; realistic data set; speech processing; Accuracy; Acoustics; Speech; Videos; i-vectors; indexing; kd-tree; speaker identification
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
Conference_Location :
Florence
DOI :
10.1109/ICASSP.2014.6853878