• DocumentCode
    3759430
  • Title
    Based on Rough Sets and the Associated Analysis of KNN Text Classification Research
  • Author
    Guo Aizhang;Yang Tao
  • Author_Institution
    Qilu Univ. of Technol., Jinan, China
  • fYear
    2015
  • Firstpage
    485
  • Lastpage
    488
  • Abstract
    With the rapid development of network information technology, text, as a basic information carrier, is growing exponentially. Existing text classification methods cannot extract information from these vast resources in a timely and accurate manner. To solve this problem, the paper proposes a new text categorization method: a KNN algorithm based on rough sets and correlation analysis. First, we introduce the concept of rough sets. In the text vector space of the training set, we divide the vector space of each class into certain and uncertain areas. For a text vector in a certain area, we can judge its category directly. For a text vector in an uncertain area, we determine its category using a KNN text classification algorithm based on correlation analysis. Experimental results show that the KNN text classification algorithm based on rough sets and correlation analysis greatly improves the efficiency and accuracy of text categorization and can meet the requirements of processing large amounts of text data.
  • Keywords
    "Text categorization","Classification algorithms","Algorithm design and analysis","Training","Approximation algorithms","Rough sets","Correlation"
  • Publisher
    IEEE
  • Conference_Titel
    Distributed Computing and Applications for Business Engineering and Science (DCABES), 2015 14th International Symposium on
  • Type
    conf
  • DOI
    10.1109/DCABES.2015.127
  • Filename
    7429661
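
The two-stage idea described in the abstract (label vectors in a certain area directly, fall back to KNN in an uncertain area) can be illustrated with a minimal sketch. This is not the authors' implementation: the record does not say how the certain and uncertain areas are built or which correlation measure is used. The sketch assumes that each class region is a sphere around the class centroid, that a vector lying inside exactly one sphere is "certain", and that cosine similarity stands in for the correlation measure. The names fit_regions and classify are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity, used here as a stand-in for the paper's correlation measure."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fit_regions(train):
    """Assumed region model: one sphere per class.

    train is a list of (vector, label) pairs. Returns {label: (centroid, radius)},
    where radius is the largest distance from the centroid to a class member.
    """
    by_label = {}
    for vec, label in train:
        by_label.setdefault(label, []).append(vec)
    regions = {}
    for label, vecs in by_label.items():
        centroid = [sum(col) / len(vecs) for col in zip(*vecs)]
        radius = max(euclidean(v, centroid) for v in vecs)
        regions[label] = (centroid, radius)
    return regions


def classify(vec, train, regions, k=3):
    """Return (label, area), where area is 'certain' or 'uncertain'."""
    hits = [label for label, (c, r) in regions.items() if euclidean(vec, c) <= r]
    if len(hits) == 1:
        # Inside exactly one class sphere: treat as the certain area.
        return hits[0], "certain"
    # Inside several spheres or none: uncertain area, so vote among the
    # k most similar training vectors.
    neighbours = sorted(train, key=lambda t: cosine(vec, t[0]), reverse=True)[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0], "uncertain"
```

Under these assumptions, the KNN search over the whole training set runs only for vectors that fall outside every sphere or inside more than one, which is one plausible reading of the efficiency gain the abstract reports.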