Author :
Chen, Chih-Chang; Chen, Chien-Hung; Lu, Ping-Tsung; Chen, Oscal T.-C.
Author_Institution :
Department of Electrical Engineering, National Chung Cheng University, Chiayi, Taiwan
Abstract :
Given the large amount of multimedia content available on websites, effectively retrieving useful data from such a huge media pool is important. In this work, an affective computing scheme is proposed to classify online songs and speeches into six affective moods. To this end, a coarse-to-fine classification scheme is developed that computes the affective moods of song and speech clips using a two-layer hierarchical classifier. By analyzing the normalized intensity mean, rhythm regularity, Zero Crossing Rate (ZCR), ZCR peak duration, standard deviation of the spectral centroid, and tempo, songs are classified into the emotions happy, angry, sad, calm, fretful, and excited. Additionally, the moods of speeches are identified as happy, angry, sad, calm, fearful, or surprised using the fundamental-frequency standard deviation, averaged spectral centroid, averaged spectral spread, and ZCR. Experimental results show that average accuracies of 55.2% and 60.2% are reached for songs and speeches, respectively. Therefore, the proposed system can help users understand the affective moods of online songs and speeches before listening to them.
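Illustrative_Example :
The following Python sketch is not from the paper; it only illustrates how a few of the clip-level features named in the abstract (ZCR, spectral centroid, intensity) might be computed and fed to a two-layer, coarse-to-fine SVM classifier of the general kind described. The frame size, hop size, feature aggregation, and the coarse grouping of the six moods are all assumptions; every function name and parameter here is hypothetical.

    import numpy as np
    from sklearn.svm import SVC

    def frame_signal(x, frame_len=1024, hop=512):
        """Split a mono signal into overlapping frames (sizes assumed)."""
        n_frames = 1 + max(0, (len(x) - frame_len) // hop)
        return np.stack([x[i * hop : i * hop + frame_len] for i in range(n_frames)])

    def zero_crossing_rate(frames):
        """Fraction of adjacent samples with differing sign, per frame."""
        return np.mean(np.abs(np.diff(np.sign(frames), axis=1)) > 0, axis=1)

    def spectral_centroid(frames, sr):
        """Magnitude-weighted mean frequency of each frame's spectrum."""
        mag = np.abs(np.fft.rfft(frames, axis=1))
        freqs = np.fft.rfftfreq(frames.shape[1], d=1.0 / sr)
        return (mag @ freqs) / (mag.sum(axis=1) + 1e-12)

    def clip_features(x, sr):
        """Aggregate frame-level features into one clip-level vector."""
        frames = frame_signal(x)
        zcr = zero_crossing_rate(frames)
        sc = spectral_centroid(frames, sr)
        rms = np.sqrt(np.mean(frames**2, axis=1))  # per-frame intensity
        return np.array([
            rms.mean() / (rms.max() + 1e-12),  # normalized intensity mean
            zcr.mean(),                        # averaged ZCR
            sc.std(),                          # std of spectral centroid
        ])

    class CoarseToFineClassifier:
        """Two-layer hierarchy: a coarse SVM picks a mood group, then a
        group-specific SVM picks the final mood (grouping is assumed)."""
        def __init__(self, groups):
            self.groups = groups  # e.g. {"active": [...], "passive": [...]}
            self.coarse = SVC()
            self.fine = {g: SVC() for g in groups}

        def fit(self, X, y):
            to_group = {m: g for g, ms in self.groups.items() for m in ms}
            g_y = np.array([to_group[m] for m in y])
            self.coarse.fit(X, g_y)
            for g in self.groups:
                mask = g_y == g
                self.fine[g].fit(X[mask], np.asarray(y)[mask])
            return self

        def predict(self, X):
            g_pred = self.coarse.predict(X)
            return np.array([self.fine[g].predict(x[None])[0]
                             for g, x in zip(g_pred, X)])

As a usage sketch, one might call clip_features on each decoded clip, stack the vectors into X, and fit CoarseToFineClassifier({"active": ["happy", "angry", "excited"], "passive": ["sad", "calm", "fretful"]}) on labeled data; this grouping is purely illustrative, as the abstract does not state how the two layers partition the six moods.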
Keywords :
Web sites; multimedia communication; affective computing; affective understanding; averaged spectral centroid; averaged spectral spread; coarse-to-fine classification; fundamental-frequency standard deviation; intensity mean; multimedia resources; online songs; online speeches; rhythm regularity; two-layer hierarchical classifier; zero crossing rate; cepstral analysis; emotion recognition; feature extraction; Mel frequency cepstral coefficient; mood; music; rhythm; speech analysis; support vector machine classification; support vector machines