• DocumentCode
    3282117
  • Title

    Nonlinear information fusion in multi-sensor processing - extracting and exploiting hidden dynamics of speech captured by a bone-conductive microphone

  • Author

    Deng, Li ; Liu, Zicheng ; Zhang, Zhengyou ; Acero, Alex

  • Author_Institution
    Microsoft Res., Redmond, WA, USA
  • fYear
    2004
  • fDate
    29 Sept.-1 Oct. 2004
  • Firstpage
    19
  • Lastpage
    22
  • Abstract
    One well-known difficulty in creating an effective human-machine interface via speech input is the adverse effect of concurrent acoustic noise. To overcome this challenge, we have developed a joint hardware and software solution. A novel bone-conductive microphone is integrated with a regular air-conductive one in a single headset. These two simultaneous sensors capture distinct signal properties of the speech embedded in acoustic noise. The focus of this paper is the exploration of the type of dynamic properties that are relatively invariant between the bone-conductive sensor's signal and the clean speech signal; the latter would not be available to the recognizer. Our approach is based on a nonlinear processing technique that estimates the unobserved (hidden) vocal tract resonances, as a representation of such invariant hidden dynamics, from the available bone-sensor signal. The information about these dynamic aspects of the clean speech is then fused with the other noisy measurements, with the aim of improving the recognition system's robustness to acoustic distortion. The fusion technique is based on a combination of three sets of signals, including the speech signal synthesized from the vocal tract resonance dynamics extracted nonlinearly from the bone-sensor signal.
  • Keywords
    acoustic noise; man-machine systems; microphone arrays; sensor fusion; speech recognition; speech-based user interfaces; acoustic distortion; acoustic noise; bone-conductive microphone; hidden vocal tract resonance; human-machine interface; information fusion; invariant hidden dynamic; multisensor processing; nonlinear processing technique; speech recognition; Acoustic noise; Data mining; Joints; Man machine systems; Microphones; Resonance; Speech enhancement; Speech processing; Speech recognition; Speech synthesis;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Multimedia Signal Processing, 2004 IEEE 6th Workshop on
  • Print_ISBN
    0-7803-8578-0
  • Type

    conf

  • DOI
    10.1109/MMSP.2004.1436405
  • Filename
    1436405