• DocumentCode
    310650
  • Title
    REMAP for video soundtrack indexing
  • Author
    Gelin, Philippe ; Wellekens, Chris J.
  • Author_Institution
    Dept. of Multimedia Commun., Inst. Eurecom, Sophia Antipolis, France
  • Volume
    2
  • fYear
    1997
  • fDate
    21-24 Apr 1997
  • Firstpage
    1423
  • Abstract
    Indexing of video soundtracks is an important issue for navigation in multimedia databases. Based on wordspotting techniques, it must meet very demanding specifications: fast response to queries, concise processed speech information to limit storage memory, speaker-independent operation, and easy characterization of any word by its phonemic spelling. A solution based on phonemic lattices and on a division of the indexing process into an off-line part and an on-line part is proposed. Previous work based on frame labelling and a maximum likelihood criterion is modified to take into account this new approach, which is based on a maximum a posteriori (MAP) criterion. The REMAP algorithm implements this MAP criterion for training. It has several advantages, such as maximizing the global discriminant criterion, avoiding the difficult problem of phoneme transition detection during the training process, and being well suited to a hybrid hidden Markov model (HMM) and neural network (NN) approach.
  • Keywords
    hidden Markov models; maximum likelihood estimation; multimedia communication; neural nets; speech processing; speech recognition; video signal processing; visual databases; HMM; REMAP algorithm; concise processed speech information; fast response; frame labelling; global discriminant criterion; hybrid hidden Markov model; maximum a posteriori criterion; maximum likelihood criterion; multimedia database navigation; neural network; phonemic lattices; phonemic spelling; speaker independent mode; storage memory; training; video soundtrack indexing; wordspotting techniques; Hidden Markov models; Indexing; Labeling; Lattices; Loudspeakers; Maximum likelihood detection; Multimedia databases; Navigation; Neural networks; Speech processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech, and Signal Processing, 1997. ICASSP-97., 1997 IEEE International Conference on
  • Conference_Location
    Munich
  • ISSN
    1520-6149
  • Print_ISBN
    0-8186-7919-0
  • Type
    conf
  • DOI
    10.1109/ICASSP.1997.596215
  • Filename
    596215