Title :
A Novel Document Analysis Method Using Compressibility Vector
Author :
Zhang, Nuo ; Watanabe, Toshinori ; Matsuzaki, Daisuke ; Koga, Hisashi
Author_Institution :
Univ. of Electro-Commun., Tokyo
Abstract :
Similarity analysis and keyword extraction are widely used document relation analysis techniques. These methods rely on dictionary-based morphological analysis. However, they fall short when the Internet grows quickly and new words appear faster than dictionaries can be updated. In this study, we propose a new document relation analysis method based on the document's compressibility. The effectiveness of the proposed method is examined in simulations.
Keywords :
data compression; dictionaries; document handling; dictionary-base morphological analysis; document compressibility vector; document relation analysis method; keyword extraction; similarity analysis; Algorithm design and analysis; Data compression; Data mining; Data privacy; Dictionaries; Information analysis; Information systems; Internet; Text analysis; Web pages
Conference_Title :
Data, Privacy, and E-Commerce, 2007. ISDPE 2007. The First International Symposium on
Conference_Location :
Chengdu
Print_ISBN :
978-0-7695-3016-1
DOI :
10.1109/ISDPE.2007.93