• DocumentCode
    3376915
  • Title
    Gender prediction of Indian names
  • Author
    Tripathi, Anshuman ; Faruqui, Manaal
  • Author_Institution
    Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur, India
  • fYear
    2011
  • fDate
    14-16 Jan. 2011
  • Firstpage
    137
  • Lastpage
    141
  • Abstract
    We present a Support Vector Machine (SVM) based classification approach for gender prediction of Indian names. We first identify various features, based upon morphological analysis, that can be useful for such classification and evaluate them. We then propose a novel approach of using n-gram-suffixes along with these features, which gives us a significant advantage over the baseline approach. We believe that we are the first to use n-grams of suffixes instead of the whole word for predictor systems. Our system reports a top F1 score of 94.9%, which is expected to improve further with an increase in training data size.
  • Keywords
    classification; gender issues; support vector machines; Indian name gender prediction; morphological analysis; n-gram-suffixes; predictor systems; support vector machine based classification; Computer science; Electronic mail; Kernel; Natural language processing; Support vector machines; Training; Training data
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Students' Technology Symposium (TechSym), 2011 IEEE
  • Conference_Location
    Kharagpur
  • Print_ISBN
    978-1-4244-8941-1
  • Type
    conf
  • DOI
    10.1109/TECHSYM.2011.5783842
  • Filename
    5783842
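
The abstract's n-gram-suffix idea can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the suffixes are extracted or encoded, so the function names `suffix_ngrams` and `suffix_features`, the default `max_n=3`, and the reading of "n-gram-suffix" as "the last 1..n characters of a name" are all assumptions made for illustration. The resulting feature dictionaries are the kind of sparse input that could be vectorized and passed to an SVM classifier.

```python
def suffix_ngrams(name, max_n=3):
    """Return the trailing character sequences of length 1..max_n.

    Assumption: an "n-gram-suffix" is taken to be the last k characters
    of the lower-cased name, for k = 1..max_n.
    """
    cleaned = name.strip().lower()
    limit = min(max_n, len(cleaned))  # short names yield fewer suffixes
    return [cleaned[-k:] for k in range(1, limit + 1)]


def suffix_features(name, max_n=3):
    """Encode the suffixes as a sparse binary feature dictionary.

    Hypothetical encoding: one indicator feature per suffix, keyed by
    its length and text, suitable for a dictionary vectorizer.
    """
    return {f"suffix{len(s)}={s}": 1 for s in suffix_ngrams(name, max_n)}


if __name__ == "__main__":
    for example in ["Anita", "Rajesh", "Om"]:
        print(example, suffix_features(example))
```

Only the end of each name contributes features in this sketch, in contrast to a whole-word n-gram scheme, which is the distinction the abstract draws.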