• DocumentCode
    534740
  • Title

    Lip feature fusion in speech synthesis system driven by visual-speech for speech impaired

  • Author

    Wang, Mengjun ; Li, Gang

  • Author_Institution
    Sch. of Inf. Eng., HeBei Univ. of Technol., Tianjin, China
  • Volume
    5
  • fYear
    2010
  • fDate
    16-18 Oct. 2010
  • Firstpage
    1836
  • Lastpage
    1839
  • Abstract
    Lipreading is applied to synthesize speech for speech-impaired people. To improve the recognition rate, concatenative feature fusion with weighting coefficients is used to integrate the geometrical feature vector of the lip region with the Discrete Cosine Transform (DCT) descriptors of the lip contours into a new discriminative vector. Experiments are carried out using HMMs with different numbers of states and Gaussian mixture components on a small database for the speaker-dependent case. The results show that the fused discriminative vector retains the information from both the geometrical feature vector of the lip region and the DCT descriptors of the lip contours. With the best weighting coefficients, m:n = 1.5:1, the recognition rate is improved by as much as 3.92% and 29.91%.
  • Keywords
    discrete cosine transforms; feature extraction; hidden Markov models; sensor fusion; speech recognition; speech synthesis; DCT descriptors; Gaussian mixture component; HMM; concatenative feature fusion; discrete cosine transform; geometrical feature vector; lip contours; lip feature fusion; lip reading; lip region; speaker-dependent case; speech impaired person; speech synthesis system; speech-impaired people; visual speech; weighting coefficients; Image sequences; Speech; Visualization; Hidden Markov Model; feature fusion; weighting combination
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Biomedical Engineering and Informatics (BMEI), 2010 3rd International Conference on
  • Conference_Location
    Yantai
  • Print_ISBN
    978-1-4244-6495-1
  • Type

    conf

  • DOI
    10.1109/BMEI.2010.5639968
  • Filename
    5639968
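
The weighted concatenative fusion described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `dct_ii` and `fuse_features`, the unnormalized DCT-II, the number of coefficients kept, and the choice to apply `m` to the geometrical features and `n` to the DCT descriptors are all assumptions. Only the ratio m:n = 1.5:1 comes from the abstract.

```python
import math


def dct_ii(samples, num_coeffs):
    """Unnormalized DCT-II of a 1-D sequence, keeping the first num_coeffs terms.

    Assumption: the lip contour has already been sampled into a 1-D sequence;
    the record does not say how the contours are sampled or normalized.
    """
    n = len(samples)
    return [
        sum(x * math.cos(math.pi / n * (i + 0.5) * k) for i, x in enumerate(samples))
        for k in range(num_coeffs)
    ]


def fuse_features(geometric, dct_descriptors, m=1.5, n=1.0):
    """Concatenate two feature vectors after scaling each by its weight.

    Assumption: m scales the geometrical features and n scales the DCT
    descriptors; the abstract only gives the ratio m:n = 1.5:1.
    """
    return [m * g for g in geometric] + [n * d for d in dct_descriptors]


# Hypothetical per-frame inputs, for illustration only.
geometric = [2.0, 4.0]            # e.g. lip width and height
contour = [1.0, 1.0, 1.0, 1.0]    # e.g. sampled contour values
fused = fuse_features(geometric, dct_ii(contour, 2))
```

Under these assumptions, one fused vector per video frame would form the observation sequence passed to the HMM recognizer. Scaling before concatenation lets one feature group carry more weight in the Gaussian mixture likelihoods.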