  • DocumentCode
    179324
  • Title
    Visual-only discrimination between native and non-native speech
  • Author
    Georgakis, Christos ; Petridis, S. ; Pantic, Maja
  • Author_Institution
    Dept. of Comput., Imperial Coll. London, London, UK
  • fYear
    2014
  • fDate
    4-9 May 2014
  • Firstpage
    4828
  • Lastpage
    4832
  • Abstract
    Accent is an important biometric characteristic that is defined by the presence of specific traits in the speaking style of an individual. These are identified by patterns in the speech production system, such as those present in the vocal tract or in lip movements. Evidence from linguistics and speech processing research suggests that visual information enhances speech recognition. Intrigued by these findings, along with the assumption that visually perceivable accent-related patterns are transferred from the mother tongue to a foreign language, we investigate the task of discriminating native from non-native speech in English, employing visual features only. Training and evaluation are performed on segments of continuous visual speech, captured by mobile phones, where all speakers read the same text. We apply various appearance descriptors to represent the mouth region at each video frame. Vocabulary-based histograms, being the final representation of dynamic features for all utterances, are used for recognition. Binary classification experiments, discriminating native and non-native speakers, are conducted in a subject-independent manner. Our results show that this task can be addressed by means of an automated approach that uses visual features only.
  • Keywords
    linguistics; mobile radio; speaker recognition; speech enhancement; vocabulary; English; appearance descriptors; binary classification; biometric characteristic; continuous visual speech; foreign language; linguistics; lip movements; mobile phones; mother tongue; nonnative speech; speaking style; specific traits; speech processing; speech production system; speech recognition; video frame; visual features; visual only discrimination; vocabulary-based histograms; vocal tract; Hidden Markov models; Mouth; Speech; Speech processing; Speech recognition; Vectors; Visualization; Accent Classification; Non-Native Speech Identification; Visual Speech Processing;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Acoustics, Speech and Signal Processing (ICASSP), 2014 IEEE International Conference on
  • Conference_Location
    Florence
  • Type
    conf
  • DOI
    10.1109/ICASSP.2014.6854519
  • Filename
    6854519