Title :
Perceptual similarity between audio clips and feature selection for its measurement
Author :
Qinghua Wu ; Xiaolei Zhang ; Ping Lv ; Ji Wu
Author_Institution :
Dept. of Electron. Eng., Tsinghua Univ., Beijing, China
Abstract :
In this paper, we explore the retrieval of perceptually similar audio, which focuses on finding sounds according to human perception. Such retrieval is therefore more “human-centered” [1] than previous audio retrieval approaches, which aim to find homologous sounds. We make comprehensive use of various acoustic features to measure perceptual similarity. Since some acoustic features may be redundant or even detrimental to the similarity measurement, we propose to find a complementary and effective combination of acoustic features via the Sequential Floating Forward Selection (SFFS) method. Experimental results show that LSP, MFCC, and PLP are the three most effective acoustic features. Moreover, the optimal combination of features improves the accuracy of similarity classification by about 2% compared with the best-performing single acoustic feature.
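The abstract names SFFS but does not spell out the procedure. The following is a minimal sketch of the general SFFS idea, not the authors' implementation: a forward step adds the feature that helps most, then a conditional backward step drops features as long as doing so beats the best subset previously found at that smaller size. The function name `sffs`, the `score` callback, the toy weights, and the `NOISE` feature are all assumptions made for illustration; in practice `score` would be a measured similarity-classification accuracy.

```python
from typing import Callable, Dict, FrozenSet, Iterable, Tuple


def sffs(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
    max_size: int,
) -> Tuple[float, FrozenSet[str]]:
    """Sequential Floating Forward Selection (illustrative sketch).

    `score` is a hypothetical callback that returns a quality value
    (higher is better) for a candidate feature subset.
    """
    all_features = sorted(features)
    best: Dict[int, Tuple[float, FrozenSet[str]]] = {}  # size -> (score, subset)
    current: FrozenSet[str] = frozenset()

    while len(current) < max_size:
        candidates = [f for f in all_features if f not in current]
        if not candidates:
            break

        # Forward step: add the single feature that gives the best score.
        add = max(candidates, key=lambda f: score(current | {f}))
        current = current | {add}
        s = score(current)
        size = len(current)
        if size not in best or s > best[size][0]:
            best[size] = (s, current)
        else:
            # Fall back to the best subset already known at this size.
            current = best[size][1]

        # Floating backward step: drop features while that strictly
        # improves on the best subset recorded for the smaller size.
        while len(current) > 2:
            drop = max(sorted(current), key=lambda f: score(current - {f}))
            reduced = current - {drop}
            s_reduced = score(reduced)
            if s_reduced > best[len(reduced)][0]:
                best[len(reduced)] = (s_reduced, reduced)
                current = reduced
            else:
                break

    # Return the best (score, subset) pair over all sizes visited.
    return max(best.values(), key=lambda pair: pair[0])


# Toy scoring function with made-up integer weights (an assumption,
# not data from the paper); NOISE stands in for an adverse feature.
TOY_WEIGHTS = {"LSP": 5, "MFCC": 4, "PLP": 3, "NOISE": -2}


def toy_score(subset: FrozenSet[str]) -> float:
    return sum(TOY_WEIGHTS[f] for f in subset)


best_score, best_subset = sffs(TOY_WEIGHTS, toy_score, max_size=4)
small_score, small_subset = sffs(TOY_WEIGHTS, toy_score, max_size=2)
print(best_score, sorted(best_subset))
```

With this toy scorer the adverse feature is tried at the largest size but does not end up in the returned subset, since the best-scoring subset over all sizes is kept. Replacing `toy_score` with a cross-validated classifier accuracy is the usual way such a wrapper is applied to real features.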
Keywords :
acoustic signal processing; audio signal processing; cepstral analysis; content-based retrieval; hearing; signal classification; LSP; MFCC; PLP; SFFS method; acoustic features; audio clips; audio retrieval; complementary combination; feature selection; homologous sounds; human perceptions; human-centered; optimal combination; perceptual similarity; sequential floating forward selection method; similarity classification; similarity measurement; Accuracy; Acoustic measurements; Humans; Mel frequency cepstral coefficient; Multimedia communication; Speech; content-based analysis; feature selection; perceptual similarity;
Conference_Title :
Chinese Spoken Language Processing (ISCSLP), 2012 8th International Symposium on
Conference_Location :
Kowloon
Print_ISBN :
978-1-4673-2506-6
Electronic_ISBN :
978-1-4673-2505-9
DOI :
10.1109/ISCSLP.2012.6423476