Title :
Research of improved IF-IDF Weighting algorithm
Author :
Jie, Gan ; Li-chao, Chen
Author_Institution :
Institute of Computer Science and Technology, Taiyuan University of Science and Technology, Shanxi, China
Abstract :
TF-IDF, the traditional feature weighting algorithm of the Vector Space Model (VSM), does not consider how similar words are distributed in the text. To address this problem, an improved TF-IDF weighting algorithm is proposed that takes a semantic view and incorporates optimization techniques. The algorithm can effectively reduce the subjective factors in faceted classification and further improve the performance of most current VSM-based text clustering algorithms. Experiments show that the algorithm is feasible and effective, and that it improves, to some extent, the precision and recall of text clustering.
Keywords :
Algorithm design and analysis; Classification algorithms; Clustering algorithms; Computational modeling; Semantics; Software; Time frequency analysis; HowNet; clustering; semantic similarity; term weighting algorithm; text clustering;
Conference_Title :
Information Science and Engineering (ICISE), 2010 2nd International Conference on
Conference_Location :
Hangzhou, China
Print_ISBN :
978-1-4244-7616-9
DOI :
10.1109/ICISE.2010.5690286