DocumentCode :
2724918
Title :
Classification of XML Documents
Author :
Bouchachia, Abdelhamid ; Hassler, Marcus
Author_Institution :
Dept. of Informatics-Syst., Alpen-Adria-Univ., Klagenfurt
fYear :
2007
fDate :
March 1 2007-April 5 2007
Firstpage :
390
Lastpage :
396
Abstract :
With the explosion of XML-based online documents, the task of knowledge discovery from the Web has become highly significant. Classification is a suitable tool for this task because it organizes documents into categories. This paper introduces a classification approach based on the k-nearest neighborhood algorithm, which relies on an edit distance measure. The originality of the work lies in combining both the content and the structure of XML documents to compute the edit distance. The approach is empirically evaluated using real-world XML collections.
Keywords :
XML; classification; data mining; XML document classification; XML-based online documents; edit distance measure; k-nearest neighborhood algorithm; knowledge discovery; Computational intelligence; Data mining; Explosions; Information retrieval; Machinery; Software libraries; Standards development; Text categorization; Web mining; XML;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computational Intelligence and Data Mining, 2007. CIDM 2007. IEEE Symposium on
Conference_Location :
Honolulu, HI
Print_ISBN :
1-4244-0705-2
Type :
conf
DOI :
10.1109/CIDM.2007.368901
Filename :
4221325