  • DocumentCode
    2806662
  • Title
    Perceptual audio features for unsupervised key-phrase detection
  • Author
    Von Zeddelmann, Dirk; Kurth, Frank; Müller, Meinard
  • Author_Institution
    KOM Dept., Fraunhofer-FKIE, Wachtberg, Germany
  • fYear
    2010
  • fDate
    14-19 March 2010
  • Firstpage
    257
  • Lastpage
    260
  • Abstract
    We propose a new type of audio feature (HFCC-ENS) as well as an unsupervised method for detecting short sequences of spoken words (key-phrases) within long speech recordings. Our technical contributions are threefold: Firstly, we propose to use bandwidth-adapted filterbanks instead of classical MFCC-style filters in the feature extraction step. Secondly, the time resolution of the resulting features is adapted to account for the temporal characteristics of the spoken phrases. Thirdly, the key-phrase detection step is performed by matching sequences of the resulting HFCC-ENS features with features extracted from a target speech recording. We evaluate the proposed method using the German Kiel Corpus and furthermore investigate speech-related properties of the proposed feature.
  • Keywords
    cepstral analysis; feature extraction; speech recognition; German Kiel Corpus; MFCC-style filters; feature extraction step; perceptual audio features; speech recordings; spoken word sequences; unsupervised key-phrase detection; Audio recording; Bandwidth; Feature extraction; Filters; Frequency; Hidden Markov models; Humans; Robustness; Speech processing; Statistics; HFCC; Speech features; key-phrase detection; key-phrase spotting
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics Speech and Signal Processing (ICASSP), 2010 IEEE International Conference on
  • Conference_Location
    Dallas, TX
  • ISSN
    1520-6149
  • Print_ISBN
    978-1-4244-4295-9
  • Electronic_ISBN
    1520-6149
  • Type
    conf
  • DOI
    10.1109/ICASSP.2010.5495974
  • Filename
    5495974