• DocumentCode
    330291
  • Title
    Enhancing recognition systems through an integrated processing of visual and audio information
  • Author
    Postma, Eric ; Kasabov, Nikola ; van den Herik, Jaap
  • Author_Institution
    Dept. of Comput. Sci., Maastricht Univ., Netherlands
  • Volume
    2
  • fYear
    1998
  • fDate
    11-14 Oct 1998
  • Firstpage
    1591
  • Abstract
    The AVIS framework for integrated audio and visual information processing is applied to the problem of person identification. An instantiation of the AVIS framework, called PIAVI, is based on a fuzzy neural network (FuNN) model of audio-visual person identification. In PIAVI's unimodal (visual) mode of operation, only dynamic visual features are used, whereas in the bimodal mode of operation, dynamic auditory and dynamic visual features are integrated at an early level of processing. Using a new dataset of dynamic features, a comparative study is performed with PIAVI in its two modes of operation. The results show that, with a large training set, perfect performance is achieved in the unimodal case. With a smaller training set, online application of the person identification system becomes feasible. Using this smaller set, unimodal identification performance is unsatisfactory. However, in the bimodal case, early integration raises identification performance to a satisfactory level. It is concluded that, by using dynamic audio-visual features and FuNNs, an adequate online application of PIAVI in person identification tasks is within reach.
  • Keywords
    fuzzy neural nets; image recognition; performance evaluation; speaker recognition; AVIS framework; PIAVI; audio visual information processing; dataset; dynamic auditory features; dynamic visual features; fuzzy neural network; large training set; performance; person identification; speech recognition; Computer science; Frequency synchronization; Fuzzy neural networks; Humans; Image processing; Information processing; Information science; Signal processing; Speech processing; Speech recognition
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Systems, Man, and Cybernetics, 1998. 1998 IEEE International Conference on
  • Conference_Location
    San Diego, CA
  • ISSN
    1062-922X
  • Print_ISBN
    0-7803-4778-1
  • Type
    conf
  • DOI
    10.1109/ICSMC.1998.728115
  • Filename
    728115