• DocumentCode
    1797455
  • Title

    Dimension reduction techniques for accessing Chinese readability

  • Author

    Yaw-Huei Chen ; Ting-Chia Lin

  • Author_Institution
    Dept. of Comput. Sci. & Inf. Eng., Nat. Chiayi Univ., Chiayi, Taiwan
  • Volume
    1
  • fYear
    2014
  • fDate
    13-16 July 2014
  • Firstpage
    434
  • Lastpage
    438
  • Abstract
    Machine learning-based techniques have been used to assess document readability in recent studies. An important issue in machine learning-based text classification is reducing the dimension of the document vectors. Different feature selection and feature extraction methods, such as mutual information, chi-square test, information gain, PCA, and LSA, are compared for assessing Chinese readability. We also compare the classification techniques SVM and LDA. The experimental results indicate that the combination of the chi-square feature selection method and SVM performs well.
  • Keywords
    feature extraction; feature selection; learning (artificial intelligence); natural language processing; pattern classification; principal component analysis; support vector machines; text analysis; Chinese readability; LDA; LSA; PCA; SVM; chi-square feature selection method; chi-square test; dimension reduction techniques; document readability assessment; document vectors; feature extraction; information gain; machine learning-based techniques; mutual information; text classification techniques; Abstracts; Feature extraction; Principal component analysis; Support vector machines; Chinese readability; Classification; Feature extraction; Feature selection; Machine learning;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Machine Learning and Cybernetics (ICMLC), 2014 International Conference on
  • Conference_Location
    Lanzhou
  • ISSN
    2160-133X
  • Print_ISBN
    978-1-4799-4216-9
  • Type

    conf

  • DOI
    10.1109/ICMLC.2014.7009154
  • Filename
    7009154