• DocumentCode
    67904
  • Title
    A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing
  • Author
    Müller, Meinard ; Jiang, Nanzhu ; Grosche, Peter
  • Author_Institution
    Int. Audio Labs., Univ. of Erlangen-Nuremberg, Erlangen, Germany
  • Volume
    21
  • Issue
    3
  • fYear
    2013
  • fDate
    March 2013
  • Firstpage
    531
  • Lastpage
    543
  • Abstract
    The automatic extraction of structural information from music recordings constitutes a central research topic. In this paper, we deal with a subproblem of audio structure analysis called audio thumbnailing, whose goal is to determine the audio segment that best represents a given music recording. Typically, such a segment has many (approximate) repetitions covering large parts of the recording. As the main technical contribution, we introduce a novel fitness measure that assigns a fitness value to each segment that expresses how much and how well the segment “explains” the repetitive structure of the entire recording. The thumbnail is then defined to be the fitness-maximizing segment. To compute the fitness measure, we describe an optimization scheme that jointly performs two error-prone steps, path extraction and grouping, which are usually performed successively. As a result, our approach is even able to cope with strong musical and acoustic variations that may occur within and across related segments. As a further contribution, we introduce the concept of fitness scape plots that reveal global structural properties of an entire recording. Finally, to show the robustness and practicability of our thumbnailing approach, we present various experiments based on different audio collections that comprise popular music, classical music, and folk song field recordings.
  • Keywords
    audio recording; audio signal processing; music; optimisation; audio segment; audio structure analysis; audio thumbnailing; automatic extraction; classical music; fitness-maximizing segment; folk song field recordings; grouping method; music recording; optimization scheme; path extraction; repetitive structure; robust fitness measure; Abstracts; Educational institutions; Instruments; Music; Robustness; Speech; Speech processing; Structure analysis; alignment; audio; fitness; music; path; repetition; thumbnail;
  • fLanguage
    English
  • Journal_Title
    Audio, Speech, and Language Processing, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1558-7916
  • Type
    jour
  • DOI
    10.1109/TASL.2012.2227732
  • Filename
    6353546