DocumentCode
1797455
Title
Dimension reduction techniques for accessing Chinese readability
Author
Yaw-Huei Chen ; Ting-Chia Lin
Author_Institution
Dept. of Comput. Sci. & Inf. Eng., Nat. Chiayi Univ., Chiayi, Taiwan
Volume
1
fYear
2014
fDate
13-16 July 2014
Firstpage
434
Lastpage
438
Abstract
Recent studies have used machine learning-based techniques to assess document readability. An important issue in machine learning-based text classification is reducing the dimension of the document vectors. We compare feature selection and feature extraction methods, including mutual information, the chi-square test, information gain, PCA, and LSA, for assessing Chinese readability. We also compare two classification techniques, SVM and LDA. The experimental results indicate that the combination of the chi-square feature selection method and SVM performs well.
Keywords
feature extraction; feature selection; learning (artificial intelligence); natural language processing; pattern classification; principal component analysis; support vector machines; text analysis; Chinese readability; LDA; LSA; PCA; SVM; chi-square feature selection method; chi-square test; dimension reduction techniques; document readability assessment; document vectors; information gain; machine learning-based techniques; mutual information; text classification techniques; Abstracts; Classification; Machine learning
fLanguage
English
Publisher
IEEE
Conference_Titel
Machine Learning and Cybernetics (ICMLC), 2014 International Conference on
Conference_Location
Lanzhou
ISSN
2160-133X
Print_ISBN
978-1-4799-4216-9
Type
conf
DOI
10.1109/ICMLC.2014.7009154
Filename
7009154