Title :
Text Categorization with Considering Temporal Patterns of Term Usages
Author :
Abe, Hidenao ; Tsumoto, Shusaku
Author_Institution :
Dept. of Med. Inf., Shimane Univ., Izumo, Japan
Abstract :
In document categorization methods that use similarity measures based on word vectors, it is important to determine the key words that characterize each document. However, conventional methods select the key words based on their frequency and/or a particular importance index such as tf-idf. In this paper, we propose a method to characterize each document by using temporal clusters of technical term usages. The method obtains document clusters based on the similarity between documents that are characterized by the temporal patterns of an importance index, in order to take temporal differences in term usage into account. In the experiment, we compare document categorization results based on document clustering using the two types of feature sets on two sets of bibliographical documents. Based on the experimental results, we discuss the usefulness of the temporal patterns of term usages for characterizing the documents.
Keywords :
pattern classification; pattern clustering; text analysis; bibliographical document; document categorization method; keyword determination; similarity measure; technical term usage; temporal cluster; temporal pattern; text categorization; word vector; Document Clustering; Temporal Patterns; Term Usage Index; Text Mining
Conference_Title :
Data Mining Workshops (ICDMW), 2010 IEEE International Conference on
Conference_Location :
Sydney, NSW
Print_ISBN :
978-1-4244-9244-2
Electronic_ISBN :
978-0-7695-4257-7
DOI :
10.1109/ICDMW.2010.186