Title :
Multimodal Sparsity-Eager Support Vector Machines for Music Classification
Author :
Aryafar, Kamelia ; Shokoufandeh, Ali
Author_Institution :
Comput. Sci. Dept., Drexel Univ., Philadelphia, PA, USA
Abstract :
As the demand for multimedia grows, the development of information retrieval systems that use all available data modalities becomes of paramount importance. The use of multiple modalities is motivated by usability, the presence of noise in one modality, and the non-universality of any single modality. Radio stations and music TV channels hold archives of millions of music tapes and lyrics. Gigabytes of music files are also spread over the web, along with the lyrics and metadata for each file. Searching and organizing such large-scale multimodal datasets is a challenging task. Supervised methods such as support vector machines (SVMs) achieve state-of-the-art performance for music classification on a single modality, but they suffer from over-fitting to training examples and from the limitations of single-modality approaches. In this paper, we introduce a classifier fusion of multimodal audio and lyrics data to address these limitations of single-modality classification. We introduce the multimodal l1-SVM classifier, which uses sparse methods to deal with over-fitting in music classification. We compare the classification accuracy of the fusion classifier with that of the single-modality l1-SVM on a genre classification task using a large public dataset.
Keywords :
information retrieval; multimedia computing; music; support vector machines; SVM classifier fusion; World Wide Web; classification accuracy; data modalities; fusion classifier; genre classification task; gigabytes; information retrieval systems; large scale multimodal datasets; lyrics data; multimedia; multimodal audio; multimodal sparsity-eager support vector machines; music TV channels; music classification; music files; public dataset; radio stations; single modality classification limitations; sparse methods; supervised methods; usability; Accuracy; Multimedia communication; Multiple signal classification; Music; Support vector machines; Training; Vectors; audio; classification; multimodal;
Conference_Title :
Machine Learning and Applications (ICMLA), 2014 13th International Conference on
Conference_Location :
Detroit, MI
DOI :
10.1109/ICMLA.2014.72