DocumentCode :
868663
Title :
Efficient and Robust Music Identification With Weighted Finite-State Transducers
Author :
Mohri, Mehryar ; Moreno, Pedro J. ; Weinstein, Eugene
Author_Institution :
Courant Inst. of Math. Sci., New York Univ., New York, NY, USA
Volume :
18
Issue :
1
fYear :
2010
Firstpage :
197
Lastpage :
207
Abstract :
We present an approach to music identification based on weighted finite-state transducers and Gaussian mixture models, inspired by techniques used in large-vocabulary speech recognition. Our modeling approach is based on learning a set of elementary music sounds in a fully unsupervised manner. While the space of possible music sound sequences is very large, our method enables the construction of a compact and efficient representation for the song collection using finite-state transducers. This paper gives a novel and substantially faster algorithm for the construction of factor transducers, the key representation of song snippets supporting our music identification technique. The complexity of our algorithm is linear with respect to the size of the suffix automaton constructed. Our experiments further show that it helps speed up the construction of the weighted suffix automaton in our task by a factor of 17 with respect to our previous method using the intermediate steps of determinization and minimization. We show that, using these techniques, a large-scale music identification system can be constructed for a database of over 15 000 songs while achieving an identification accuracy of 99.4% on undistorted test data, and performing robustly in the presence of noise and distortions.
Keywords :
Gaussian processes; acoustic transducers; computational complexity; finite state machines; music; sequences; unsupervised learning; Gaussian mixture model; elementary music sound sequence; factor transducer; large-scale robust music identification system; large-vocabulary speech recognition; song snippet; suffix automaton; weighted finite-state transducer; Content-based information retrieval; factor automata; finite-state transducers; music identification; suffix automata
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2009.2023170
Filename :
4926217