DocumentCode
67904
Title
A Robust Fitness Measure for Capturing Repetitions in Music Recordings With Applications to Audio Thumbnailing
Author
Müller, Meinard ; Jiang, Nanzhu ; Grosche, Peter
Author_Institution
Int. Audio Labs., Univ. of Erlangen-Nuremberg, Erlangen, Germany
Volume
21
Issue
3
fYear
2013
fDate
March 2013
Firstpage
531
Lastpage
543
Abstract
The automatic extraction of structural information from music recordings constitutes a central research topic. In this paper, we deal with a subproblem of audio structure analysis called audio thumbnailing, whose goal is to determine the audio segment that best represents a given music recording. Typically, such a segment has many (approximate) repetitions covering large parts of the recording. As the main technical contribution, we introduce a novel fitness measure that assigns to each segment a fitness value expressing how much and how well the segment “explains” the repetitive structure of the entire recording. The thumbnail is then defined to be the fitness-maximizing segment. To compute the fitness measure, we describe an optimization scheme that jointly performs two error-prone steps, path extraction and grouping, which are usually performed successively. As a result, our approach can cope even with strong musical and acoustic variations that may occur within and across related segments. As a further contribution, we introduce the concept of fitness scape plots that reveal global structural properties of an entire recording. Finally, to show the robustness and practicability of our thumbnailing approach, we present various experiments based on different audio collections that comprise popular music, classical music, and folk song field recordings.
Keywords
audio recording; audio signal processing; music; optimisation; audio segment; audio structure analysis; audio thumbnailing; automatic extraction; classical music; fitness-maximizing segment; folk song field recordings; grouping method; music recording; optimization scheme; path extraction; repetitive structure; robust fitness measure; Abstracts; Educational institutions; Instruments; Music; Robustness; Speech; Speech processing; Structure analysis; alignment; audio; fitness; music; path; repetition; thumbnail;
fLanguage
English
Journal_Title
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher
IEEE
ISSN
1558-7916
Type
jour
DOI
10.1109/TASL.2012.2227732
Filename
6353546