• DocumentCode
    187638
  • Title
    Sub-segmental, segmental and supra-segmental analysis of linear prediction residual signal for language identification
  • Author
    Nandi, Dipanjan ; Pati, Debadatta ; Rao, K. Sreenivasa
  • Author_Institution
    Sch. of Inf. Technol., Indian Inst. of Technol., Kharagpur, Kharagpur, India
  • fYear
    2014
  • fDate
    22-25 July 2014
  • Firstpage
    1
  • Lastpage
    6
  • Abstract
    In this work, excitation source information is explored for the language identification (LID) task. The excitation signal is represented by the linear prediction (LP) residual. Different aspects of the excitation source information can be captured by processing the LP residual signal at the sub-segmental, segmental and supra-segmental levels. Gaussian mixture modelling (GMM) is used to build the language models. The present LID study has been carried out on the IITKGP-MLILSC speech database. Individually, segmental-level information gives the highest LID accuracy, followed by sub-segmental and supra-segmental level information. Combined evidence from all three levels represents the complete excitation source information. Finally, a comparative study has been carried out between the vocal tract and excitation source features, which shows the distinct nature of these two feature types. Combining both features yields an improvement of 10.01% in LID accuracy over excitation source information alone. This observation indicates the significance of excitation source information for the LID task.
  • Keywords
    Gaussian processes; mixture models; natural language processing; prediction theory; speech processing; GMM; Gaussian mixture modelling; IITKGP-MLILSC speech database; LID task; LP residual signal; excitation signal; excitation source features; excitation source information; language identification; language models; linear prediction residual signal; subsegmental analysis; subsegmental level information; supra-segmental analysis; supra-segmental level information; vocal tract; Accuracy; Correlation; Feature extraction; Mel frequency cepstral coefficient; Production; Speech; IITKGP-MLILSC; LP residual; MFCC; segmental; sub-segmental; suprasegmental
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Signal Processing and Communications (SPCOM), 2014 International Conference on
  • Conference_Location
    Bangalore
  • Print_ISBN
    978-1-4799-4666-2
  • Type
    conf
  • DOI
    10.1109/SPCOM.2014.6983974
  • Filename
    6983974