DocumentCode
3376915
Title
Gender prediction of Indian names
Author
Tripathi, Anshuman ; Faruqui, Manaal
Author_Institution
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur, India
fYear
2011
fDate
14-16 Jan. 2011
Firstpage
137
Lastpage
141
Abstract
We present a Support Vector Machine (SVM) based classification approach for gender prediction of Indian names. We first identify and evaluate various features, derived from morphological analysis, that can be useful for such classification. We then propose a novel approach that uses n-gram-suffixes along with these features, which gives a significant advantage over the baseline approach. We believe that we are the first to use n-grams of suffixes instead of the whole word for predictor systems. Our system reports a top F1 score of 94.9%, which is expected to improve further as the training data size increases.
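The n-gram-suffix idea in the abstract can be illustrated with a minimal sketch. The abstract does not define the features precisely, so this assumes that an "n-gram suffix" is the trailing n characters of a name, taken for n = 1 up to a chosen maximum. The function names `suffix_ngrams` and `suffix_features`, the `max_n` parameter, the feature-key format, and the example names are all hypothetical and are not taken from the paper. The resulting sparse features could then be passed to an SVM classifier, which is not shown here.

```python
def suffix_ngrams(name, max_n=3):
    """Return the trailing character n-grams of a name, for n = 1..max_n.

    Assumption: an "n-gram suffix" is the last n characters of the name.
    Names shorter than max_n yield fewer suffixes.
    """
    cleaned = name.strip().lower()
    limit = min(max_n, len(cleaned))
    return [cleaned[-n:] for n in range(1, limit + 1)]


def suffix_features(name, max_n=3):
    """Map a name to a sparse binary feature dict keyed by suffix length."""
    return {f"suf{len(s)}={s}": 1 for s in suffix_ngrams(name, max_n)}


if __name__ == "__main__":
    # Illustrative names only; they do not come from the paper's data set.
    for example in ["Anita", "Rajesh"]:
        print(example, suffix_features(example))
```

Keying each feature by suffix length keeps, for instance, the one-character suffix "a" distinct from a longer suffix that happens to end in the same letter, so a linear classifier can weight them separately.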
Keywords
classification; gender issues; support vector machines; Indian name gender prediction; morphological analysis; n-gram-suffixes; predictor systems; support vector machine based classification; Computer science; Electronic mail; Kernel; Natural language processing; Support vector machines; Training; Training data;
fLanguage
English
Publisher
IEEE
Conference_Titel
Students' Technology Symposium (TechSym), 2011 IEEE
Conference_Location
Kharagpur
Print_ISBN
978-1-4244-8941-1
Type
conf
DOI
10.1109/TECHSYM.2011.5783842
Filename
5783842