DocumentCode
259706
Title
Multimodal Music and Lyrics Fusion Classifier for Artist Identification
Author
Aryafar, Kamelia; Shokoufandeh, Ali
Author_Institution
Computer Science Department, Drexel University, Philadelphia, PA, USA
fYear
2014
fDate
3-6 Dec. 2014
Firstpage
506
Lastpage
509
Abstract
Humans interact with each other through multiple communication modalities, including speech, gestures, and written documents. When one modality is absent or noisy, the remaining modalities can improve the precision of a system. HCI systems can likewise exploit these multimodal communication models for a range of machine learning tasks. The use of multiple modalities is motivated by usability, by noise in any single modality, and by the fact that no single modality is universal. Combining multimodal information, however, introduces new challenges to machine learning, such as the design of fusion classifiers. In this paper we explore the multimodal fusion of audio and lyrics for music artist identification. We compare our results with a single-modality artist classifier and introduce new directions for designing fusion classifiers.
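The abstract describes fusing the audio and lyrics modalities for artist identification but does not spell out a fusion scheme here. Below is a minimal, hypothetical Python sketch of one common strategy, late fusion by weighted averaging of per-modality class probabilities, using stand-in MFCC-style feature vectors and TF-IDF lyrics features; the toy data, the `fused_predict` helper, and the equal weighting are all illustrative assumptions, not the paper's actual classifier.

```python
# Minimal late-fusion sketch: train one classifier per modality, then
# average their predicted class probabilities. Toy data throughout;
# this illustrates the fusion idea only, not the paper's method.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Toy corpus: per-track lyrics and artist labels (two artists).
lyrics = [
    "love you baby tonight", "tonight baby love",
    "rain falls on empty streets", "empty rain streets fall",
]
labels = np.array([0, 0, 1, 1])

# Stand-in for per-track mean MFCC vectors (e.g. 13 coefficients),
# with a weak class-dependent offset so the toy example is learnable.
mfcc = rng.normal(size=(4, 13)) + labels[:, None]

# Independent per-modality classifiers.
audio_clf = LogisticRegression(max_iter=1000).fit(mfcc, labels)
vec = TfidfVectorizer()
lyrics_clf = LogisticRegression(max_iter=1000).fit(
    vec.fit_transform(lyrics), labels
)

def fused_predict(mfcc_vec, text, w=0.5):
    """Late fusion: weighted average of the two modalities' probabilities."""
    p_audio = audio_clf.predict_proba(mfcc_vec.reshape(1, -1))
    p_lyrics = lyrics_clf.predict_proba(vec.transform([text]))
    return np.argmax(w * p_audio + (1 - w) * p_lyrics, axis=1)

print(fused_predict(mfcc[0], lyrics[0]))  # expect artist 0
```

Late fusion keeps each modality's classifier independent, so a missing or noisy modality can simply be down-weighted, which matches the motivation given in the abstract.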
Keywords
audio signal processing; human computer interaction; learning (artificial intelligence); music; pattern classification; HCI systems; artist identification; communication modalities; human-computer interaction; machine learning tasks; multimodal communication models; multimodal music lyrics fusion classifier; noisy modality; single modality artist classifier; Accuracy; Kernel; Mel frequency cepstral coefficient; Music; Music information retrieval; Semantics; Sparse matrices; audio; classification; multimodal; sparse methods
fLanguage
English
Publisher
IEEE
Conference_Title
2014 13th International Conference on Machine Learning and Applications (ICMLA)
Conference_Location
Detroit, MI
Type
conf
DOI
10.1109/ICMLA.2014.88
Filename
7033167