• DocumentCode
    2988157
  • Title
    Detecting and labeling folk literature in spoken cultural heritage archives using structural and prosodic features
  • Author
    Valente, Fabio ; Motlicek, Petr
  • Author_Institution
    Idiap Res. Inst., Martigny, Switzerland
  • fYear
    2012
  • fDate
    27-29 June 2012
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    Spoken cultural heritage can present considerably heterogeneous content, such as tales, stories, recitals, poems, theatrical representations and other forms of folk literature. This work investigates the automatic detection and classification of those data types in large spoken audio archives. The corpus used for this study consists of 90 radio broadcast shows collected for preserving a large variety of Swiss French dialects. Given the variability of the language spoken in the recordings, the paper proposes a language-independent system based on structural features obtained using a speaker diarization system and various acoustic/prosodic features. Results reveal that such a system can achieve an F-measure equal to 0.85 (Precision 0.88/Recall 0.84) in retrieving folk literature in those archives. Prosodic features appear more effective than, and complementary to, structural features. Furthermore, the paper investigates whether the same approach can be used to label speech segments into five large classes (Storytelling, Poetry, Theatre, Interviews, Functionals), showing F-measures ranging from 0.52 to 0.88. As a last contribution, prosodic features for disambiguating between spoken prose and spoken poetry are investigated. In summary, the study shows that simple structural and acoustic/prosodic features can be used to effectively retrieve and label folk literature in broadcast archives.
  • Keywords
    classification; history; literature; F-measure; Swiss French dialects; acoustic/prosodic features; automatic detection; broadcast archives; classification; data type; folk literature labeling; folk literature retrieval; heterogeneous content; language-independent system; poems; radio broadcast; recitals; speaker diarization system; speech segments; spoken audio archives; spoken cultural heritage archives; stories; structural features; tales; theatrical representation; Acoustics; Boosting; Cultural differences; Feature extraction; Hidden Markov models; Interviews; Speech;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Content-Based Multimedia Indexing (CBMI), 2012 10th International Workshop on
  • Conference_Location
    Annecy
  • ISSN
    1949-3983
  • Print_ISBN
    978-1-4673-2368-0
  • Electronic_ISBN
    1949-3983
  • Type
    conf
  • DOI
    10.1109/CBMI.2012.6269839
  • Filename
    6269839