• DocumentCode
    2285332
  • Title
    Homogeneous segmentation and classifier ensemble for audio tag annotation and retrieval
  • Author
    Lo, Hung-Yi ; Wang, Ju-Chiang ; Wang, Hsin-Min
  • Author_Institution
    Inst. of Inf. Sci., Acad. Sinica, Taipei, Taiwan
  • fYear
    2010
  • fDate
    19-23 July 2010
  • Firstpage
    304
  • Lastpage
    309
  • Abstract
    Audio tags describe different types of musical information, such as genre, mood, and instrument. This paper aims to automatically annotate audio clips with tags and to retrieve relevant clips from a music database by tag. Given an audio clip, we divide it into several homogeneous segments using an audio novelty curve, and then extract audio features from each segment that capture various kinds of musical information, such as dynamics, rhythm, timbre, pitch, and tonality. Features that take the form of a frame-based feature vector sequence are summarized by their mean and standard deviation, so that they can be combined with other segment-based features to form a fixed-dimensional feature vector for each segment. For each tag, we train an ensemble classifier consisting of SVM and AdaBoost classifiers. For the audio annotation task, the individual classifier outputs are transformed into calibrated probability scores so that a probability ensemble can be employed. For the audio retrieval task, we propose using a ranking ensemble. We participated in the MIREX 2009 audio tag classification task, and our system was ranked first in terms of F-measure and the area under the ROC curve given a tag.
  • Keywords
    audio signal processing; feature extraction; information retrieval; probability; signal classification; support vector machines; AdaBoost classifier; F-measure; MIREX 2009 audio tag classification task; ROC curve; SVM classifier; audio feature extraction; audio novelty curve; audio tag annotation; audio tag retrieval; calibrated probability scores; classifier ensemble; ensemble classifier; frame-based feature vector sequence format; homogeneous segmentation; music database; probability ensemble; ranking ensemble; Accuracy; Classification algorithms; Feature extraction; Measurement; Support vector machine classification; Training; audio segmentation; ensemble method
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia and Expo (ICME), 2010 IEEE International Conference on
  • Conference_Location
    Suntec City
  • ISSN
    1945-7871
  • Print_ISBN
    978-1-4244-7491-2
  • Type
    conf
  • DOI
    10.1109/ICME.2010.5583009
  • Filename
    5583009
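
The abstract describes three steps that a short sketch may make more concrete: summarizing frame-based features by their mean and standard deviation, combining calibrated probabilities for annotation, and combining rankings for retrieval. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names are hypothetical, and the abstract does not say how the ensembles combine their inputs; plain averaging of two classifiers' probabilities and of their ranks is assumed here, as is the use of the population standard deviation.

```python
from statistics import mean, pstdev


def pool_frames(frames):
    """Summarize a sequence of frame-level feature vectors as one
    fixed-length vector: per-dimension means, then per-dimension
    standard deviations (population form is an assumption)."""
    dims = list(zip(*frames))  # transpose: one tuple per feature dimension
    return [mean(d) for d in dims] + [pstdev(d) for d in dims]


def probability_ensemble(prob_a, prob_b):
    """Average two classifiers' calibrated probabilities per clip.
    Averaging is an assumed combination rule."""
    return [(a + b) / 2 for a, b in zip(prob_a, prob_b)]


def ranks(scores):
    """Return the rank of each clip, where rank 1 is the highest score.
    Ties are broken by input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def ranking_ensemble(scores_a, scores_b):
    """Average the two classifiers' ranks per clip (assumed rule).
    A lower averaged rank means the clip is judged more relevant."""
    return [(x + y) / 2 for x, y in zip(ranks(scores_a), ranks(scores_b))]
```

A ranking ensemble of this kind needs only the ordering each classifier produces, so it does not depend on the scores being calibrated, whereas averaging probabilities is meaningful only after calibration.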