DocumentCode
3391051
Title
Multifeature audio segmentation for browsing and annotation
Author
Tzanetakis, George ; Cook, Perry
Author_Institution
Dept. of Comput. Sci., Princeton Univ., NJ, USA
fYear
1999
fDate
1999
Firstpage
103
Lastpage
106
Abstract
Indexing and content-based retrieval are necessary to handle the large amounts of audio and multimedia data that are becoming available on the Web and elsewhere. Since manual indexing using existing audio editors is extremely time-consuming, a number of automatic content analysis systems have been proposed. Most of these systems rely on speech recognition techniques to create text indices. On the other hand, very few systems have been proposed for automatic indexing of music and general audio. Typically, these systems rely on classification and similarity-retrieval techniques and work in restricted audio domains. A somewhat different, more general approach for fast indexing of arbitrary audio data is the use of segmentation based on multiple temporal features combined with automatic or semi-automatic annotation. In this paper, a general methodology for audio segmentation is proposed. A number of experiments were performed to evaluate the proposed methodology and compare different segmentation schemes. Finally, a prototype audio browsing and annotation tool based on segmentation combined with existing classification techniques was implemented.
Keywords
audio signal processing; content-based retrieval; database indexing; feature extraction; multimedia databases; online front-ends; signal classification; audio browsing/annotation tool; audio data; automatic annotation; automatic content analysis systems; automatic indexing; classification techniques; content-based retrieval; experiments; multifeature audio segmentation; multimedia data; multiple temporal features; similarity-retrieval techniques; speech recognition; text indices; Computer science; Content based retrieval; Humans; Image edge detection; Image segmentation; Indexing; Information retrieval; Prototypes; Sections; Speech;
fLanguage
English
Publisher
IEEE
Conference_Titel
Applications of Signal Processing to Audio and Acoustics, 1999 IEEE Workshop on
Conference_Location
New Paltz, NY
Print_ISBN
0-7803-5612-8
Type
conf
DOI
10.1109/ASPAA.1999.810860
Filename
810860