  • DocumentCode
    259706
  • Title
    Multimodal Music and Lyrics Fusion Classifier for Artist Identification
  • Author
    Aryafar, Kamelia; Shokoufandeh, Ali
  • Author_Institution
    Comput. Sci. Dept., Drexel Univ., Philadelphia, PA, USA
  • fYear
    2014
  • fDate
    3-6 Dec. 2014
  • Firstpage
    506
  • Lastpage
    509
  • Abstract
    Humans interact with each other using different communication modalities, including speech, gestures, and written documents. When one modality is absent or noisy, the remaining modalities can improve system precision. Human-computer interaction (HCI) systems can likewise exploit these multimodal communication models for a variety of machine learning tasks. The use of multiple modalities is motivated by usability, by the presence of noise in any single modality, and by the non-universality of any single modality. Combining multimodal information, however, introduces new challenges to machine learning, such as the design of fusion classifiers. In this paper, we explore the multimodal fusion of audio and lyrics for music artist identification. We compare our results with a single-modality artist classifier and introduce new directions for designing a fusion classifier.
  • Keywords
    audio signal processing; human computer interaction; learning (artificial intelligence); music; pattern classification; HCI systems; artist identification; communication modalities; human-computer interaction; machine learning tasks; multimodal communication models; multimodal music lyrics fusion classifier; noisy modality; single modality artist classifier; Accuracy; Kernel; Mel frequency cepstral coefficient; Music; Music information retrieval; Semantics; Sparse matrices; audio; classification; multimodal; sparse methods
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    2014 13th International Conference on Machine Learning and Applications (ICMLA)
  • Conference_Location
    Detroit, MI
  • Type
    conf
  • DOI
    10.1109/ICMLA.2014.88
  • Filename
    7033167
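
As a rough illustration of the fusion idea described in the abstract, here is a minimal sketch that concatenates per-song audio features (synthetic stand-ins for MFCC summaries) with sparse TF-IDF lyrics features and trains a single kernel classifier over the fused representation. The feature and classifier choices are inferred from the paper's keywords (Mel frequency cepstral coefficient, sparse matrices, kernel), not from its actual pipeline, and all data below is synthetic.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n_songs = 40

# Synthetic stand-ins for per-song audio features, e.g. the mean of 13 MFCCs
# over all frames (real MFCCs would come from an audio library such as librosa).
audio_feats = rng.normal(size=(n_songs, 13))

# Synthetic lyrics and artist labels (4 hypothetical artists).
lyrics = ["word%d tune%d" % (w, t) for w, t in rng.integers(0, 50, size=(n_songs, 2))]
artists = rng.integers(0, 4, size=n_songs)

# Lyrics modality: sparse TF-IDF bag-of-words representation.
lyric_feats = TfidfVectorizer().fit_transform(lyrics).toarray()

# Early fusion: scale each modality separately, then concatenate feature vectors.
fused = np.hstack([
    StandardScaler().fit_transform(audio_feats),
    StandardScaler().fit_transform(lyric_feats),
])

# A single classifier over the fused representation; the paper compares this
# kind of fusion classifier against a single-modality baseline.
clf = SVC(kernel="linear").fit(fused, artists)
print("training accuracy:", clf.score(fused, artists))
```

Note that this sketch evaluates on the training set only; a faithful comparison would hold out songs per artist and also report each single-modality baseline separately.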