• DocumentCode
    465469
  • Title
    Frame-Based SEMG-to-Speech Conversion
  • Author
    Lam, Yuet-Ming ; Leong, Philip Heng-Wai ; Mak, Man-Wai
  • Author_Institution
    Chinese Univ. of Hong Kong, Shatin
  • Volume
    1
  • fYear
    2006
  • fDate
    6-9 Aug. 2006
  • Firstpage
    240
  • Lastpage
    244
  • Abstract
    This paper presents a methodology that uses surface electromyogram (SEMG) signals recorded from the cheek and chin to synthesize speech. A neural network is trained to map the SEMG features (short-time Fourier transform coefficients) into vector-quantized codebook indices of speech features (linear prediction coefficients, pitch, and energy). To synthesize a word, the SEMG signals recorded while the word is pronounced are blocked into frames; SEMG features are then extracted from each SEMG frame and presented to the neural network to obtain a sequence of speech feature indices. The waveform of the word is then constructed by concatenating the pre-recorded speech segments corresponding to the feature indices. Experimental evaluations based on the synthesis of eight words show that, on average, over 70% of the words can be synthesized correctly, and that the neural network can classify SEMG frames into seven phonemes and silence at a rate of 77.8%. The rate can be further improved to 88.3% by assuming medium-time stationarity of the speech signals. The experimental results demonstrate the feasibility of synthesizing words based on SEMG signals only.
  • Keywords
    Fourier transforms; electromyography; feature extraction; medical signal processing; neural nets; speech synthesis; vector quantisation; SEMG signal; SEMG-to-speech conversion; feature extraction; neural network; short-time Fourier transform coefficient; speech segment; speech signal; speech synthesis; surface electromyogram signal; vector-quantized codebook index; word pronunciation; Computer science; Feature extraction; Fourier transforms; Humans; Network synthesis; Neural networks; Oral communication; Signal synthesis; Speech recognition; Speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Circuits and Systems, 2006. MWSCAS '06. 49th IEEE International Midwest Symposium on
  • Conference_Location
    San Juan
  • ISSN
    1548-3746
  • Print_ISBN
    1-4244-0172-0
  • Electronic_ISBN
    1548-3746
  • Type
    conf
  • DOI
    10.1109/MWSCAS.2006.382042
  • Filename
    4267119
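
The frame-based steps named in the abstract (blocking the SEMG signal into frames, extracting short-time Fourier transform features, and exploiting medium-time stationarity) can be illustrated with a minimal sketch. Everything specific here is an assumption and not taken from the paper: the frame length, hop size, Hann window, the function names, and in particular the sliding majority vote, which is only one plausible way to use stationarity across neighbouring frames. The neural network classifier and the vector-quantized codebook lookup are omitted.

```python
import numpy as np


def block_into_frames(signal, frame_len, hop):
    """Split a 1-D signal into (possibly overlapping) frames.

    Trailing samples that do not fill a whole frame are dropped.
    frame_len and hop are illustrative parameters, not values from the paper.
    """
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:
        return np.empty((0, frame_len))
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return signal[idx]


def stft_features(frames):
    """Magnitude spectrum of each Hann-windowed frame (one row per frame)."""
    window = np.hanning(frames.shape[1])
    return np.abs(np.fft.rfft(frames * window, axis=1))


def smooth_labels(labels, width=5):
    """Sliding majority vote over per-frame class labels.

    Assumed stand-in for a medium-time stationarity constraint:
    each frame takes the most common label in a window of about
    `width` frames centred on it. Ties go to the smallest label.
    """
    labels = np.asarray(labels)
    half = width // 2
    out = np.empty_like(labels)
    for i in range(len(labels)):
        segment = labels[max(0, i - half): i + half + 1]
        values, counts = np.unique(segment, return_counts=True)
        out[i] = values[np.argmax(counts)]
    return out


# Toy usage: a synthetic signal and a hand-made label sequence.
frames = block_into_frames(np.arange(10), frame_len=4, hop=2)
features = stft_features(frames)
smoothed = smooth_labels([1, 1, 2, 1, 1, 3, 3, 3, 1, 3, 3], width=3)
```

In this toy run the isolated labels 2 and 1 inside longer runs are replaced by their neighbours' label, which is the kind of frame-level correction a stationarity assumption would permit.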