DocumentCode :
3376915
Title :
Gender prediction of Indian names
Author :
Tripathi, Anshuman ; Faruqui, Manaal
Author_Institution :
Dept. of Comput. Sci. & Eng., Indian Inst. of Technol., Kharagpur, India
fYear :
2011
fDate :
14-16 Jan. 2011
Firstpage :
137
Lastpage :
141
Abstract :
We present a Support Vector Machine (SVM) based classification approach for gender prediction of Indian names. We first identify and evaluate various features, derived from morphological analysis, that can be useful for such classification. We then propose a novel approach that uses n-gram-suffixes along with these features, which gives a significant advantage over the baseline approach. We believe we are the first to use n-grams of suffixes, rather than of the whole word, in predictor systems. Our system reports a top F1 score of 94.9%, which is expected to improve further as the training data size increases.
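The suffix-based features mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define "n-gram-suffixes" precisely, so this assumes they are the trailing character sequences of length 1 to n of a name. The function names, the `max_n` parameter, the feature-key format, and the sample names are hypothetical and not taken from the paper.

```python
def suffix_ngrams(name, max_n=3):
    """Return the trailing character n-grams of a name for n = 1..max_n.

    Assumption: a "suffix n-gram" is the last n characters of the
    lowercased name; the paper may define it differently.
    """
    cleaned = name.strip().lower()
    longest = min(max_n, len(cleaned))
    return [cleaned[-n:] for n in range(1, longest + 1)]


def suffix_features(name, max_n=3):
    """Map each suffix to a binary indicator feature.

    The resulting dict could be vectorized and passed to any
    classifier, for example an SVM.
    """
    return {f"suffix_{len(s)}={s}": 1 for s in suffix_ngrams(name, max_n)}


# Illustrative name only, not from the paper's data set.
print(suffix_ngrams("Anita"))
```

Sparse indicator features of this kind are a common input format for linear classifiers; the feature set and kernel actually used by the authors are not stated in this record.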
Keywords :
classification; gender issues; support vector machines; Indian name gender prediction; morphological analysis; n-gram-suffixes; predictor systems; support vector machine based classification; Computer science; Electronic mail; Kernel; Natural language processing; Support vector machines; Training; Training data;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Students' Technology Symposium (TechSym), 2011 IEEE
Conference_Location :
Kharagpur
Print_ISBN :
978-1-4244-8941-1
Type :
conf
DOI :
10.1109/TECHSYM.2011.5783842
Filename :
5783842