• DocumentCode
    589321
  • Title
    Classification, Segmentation and Chronological Prediction of Cinematic Sound
  • Author
    Silva, P.M.
  • Author_Institution
    Dept. of Inf. Eng., Univ. do Porto, Porto, Portugal
  • Volume
    2
  • fYear
    2012
  • fDate
    12-15 Dec. 2012
  • Firstpage
    369
  • Lastpage
    374
  • Abstract
    This paper presents work on the classification, segmentation and chronological prediction of cinematic sound using support vector machines (SVM) trained with sequential minimal optimization (SMO). The classes considered are speech, music, environmental sound and silence, plus all pairwise combinations excluding silence. A model combining simple adjacency rules with probabilistic output from logistic regression is used to segment fixed-length parts into auditory scenes. Evaluation of the proposed methods on a 44-film dataset against k-nearest neighbor, Naive Bayes and standard SVM classifiers shows that the SMO classifier gives superior results on all performance metrics. Subsequently, we propose sample size optimizations for building similar datasets. Finally, we use meta-features built from classification as descriptors in a chronological model for predicting the period of production of a given soundtrack. A decision table classifier is able to estimate the year of production of an unknown soundtrack with a mean absolute error of approximately five years.
  • Keywords
    audio signal processing; cinematography; decision tables; decision trees; music; optimisation; regression analysis; signal classification; speech processing; support vector machines; SMO classifier; SVM classifiers; adjacency rules; auditory scenes; chronological model; cinematic sound chronological prediction; cinematic sound classification prediction; cinematic sound segmentation prediction; decision table classifier; fixed-length part segmentation; logistic regression; machine learning; mean absolute error; performance metrics; probabilistic output; sample size optimizations; sequential minimal optimization; Optimization; Production; Sociology; Speech; Statistics; Audio databases; Classification algorithms
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Applications (ICMLA), 2012 11th International Conference on
  • Conference_Location
    Boca Raton, FL
  • Print_ISBN
    978-1-4673-4651-1
  • Type
    conf
  • DOI
    10.1109/ICMLA.2012.172
  • Filename
    6406764