• DocumentCode
    312046
  • Title

    Keyword spotting enhancement for video soundtrack indexing

  • Author

    Gelin, Philippe ; Wellekens, Chris J.

  • Author_Institution
    Dept. of Multimedia Commun., Inst. Eurecom, Sophia Antipolis, France
  • Volume
    2
  • fYear
    1996
  • fDate
    3-6 Oct 1996
  • Firstpage
    586
  • Abstract
    Multimedia databases contain an increasing number of videos that are not easily semantically accessed. Among the useful indices that can be extracted from the soundtrack, the presence of a keyword at some place plays a prominent role. This paper deals with the specificities of such a keyword spotter and the enhancements brought to our previous technique (1996) based on frame labeling. To be useful, such a keyword spotter has to be speaker-independent. Moreover, it has to be able to detect any word from an open vocabulary. This directly implies the use of a phonemic representation of the word. These constraints usually lead to an excessively time-consuming tool. The division of the indexing process into two parts (the first one off-line, the second one at query time) allows a faster response.
  • Keywords
    audio coding; audio recording; audio systems; indexing; multimedia computing; speech coding; speech recognition; video recording; visual databases; vocabulary; frame labeling; multimedia databases; off-line process; open vocabulary; phonemic representation; query time; response speed; semantic access; speaker-independent keyword spotting; video soundtrack indexing; word detection; Hidden Markov models; Indexing; Labeling; Lattices; Loudspeakers; Multimedia communication; Multimedia databases; Speech; Viterbi algorithm; Vocabulary;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Spoken Language, 1996. ICSLP 96. Proceedings., Fourth International Conference on
  • Conference_Location
    Philadelphia, PA
  • Print_ISBN
    0-7803-3555-4
  • Type

    conf

  • DOI
    10.1109/ICSLP.1996.607429
  • Filename
    607429