DocumentCode
589321
Title
Classification, Segmentation and Chronological Prediction of Cinematic Sound
Author
Silva, P.M.
Author_Institution
Dept. of Inf. Eng., Univ. do Porto, Porto, Portugal
Volume
2
fYear
2012
fDate
12-15 Dec. 2012
Firstpage
369
Lastpage
374
Abstract
This paper presents work on classification, segmentation and chronological prediction of cinematic sound employing support vector machines (SVM) with sequential minimal optimization (SMO). Speech, music, environmental sound and silence, plus all pairwise combinations excluding silence, are considered as classes. A model combining simple adjacency rules with probabilistic output from logistic regression is used for segmenting fixed-length parts into auditory scenes. Evaluation of the proposed methods on a 44-film dataset against k-nearest neighbor, Naive Bayes and standard SVM classifiers shows that the SMO classifier performs best on all performance metrics. Subsequently, we propose sample size optimizations for building similar datasets. Finally, we use meta-features built from classification as descriptors in a chronological model for predicting the period of production of a given soundtrack. A decision table classifier is able to estimate the year of production of an unknown soundtrack with a mean absolute error of approximately five years.
Keywords
audio signal processing; cinematography; decision tables; decision trees; music; optimisation; regression analysis; signal classification; speech processing; support vector machines; SMO classifier; SVM classifiers; adjacency rules; auditory scenes; chronological model; cinematic sound chronological prediction; cinematic sound classification prediction; cinematic sound segmentation prediction; decision table classifier; decision trees; fixed-length part segmentation; logistic regression; machine learning; mean absolute error; performance metrics; probabilistic output; sample size optimizations; sequential minimal optimization; support vector machines; Music; Optimization; Production; Sociology; Speech; Statistics; Support vector machines; Audio databases; Cinematography; Classification algorithms; Decision trees; Machine learning; Regression analysis; Support vector machines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Applications (ICMLA), 2012 11th International Conference on
Conference_Location
Boca Raton, FL
Print_ISBN
978-1-4673-4651-1
Type
conf
DOI
10.1109/ICMLA.2012.172
Filename
6406764