DocumentCode
3222683
Title
Text mining in 'Request for Comments Document Series'
Author
Gurusamy, Siva ; Manjula, D. ; Geetha, T.V.
Author_Institution
Sch. of Comput. Sci. & Eng., Anna Univ., India
fYear
2002
fDate
13-15 Dec. 2002
Firstpage
147
Lastpage
155
Abstract
This paper discusses a knowledge discovery in text (KDT) system for the 'Request for Comments (RFC) Document Series'. It proposes a versatile system architecture for text mining in RFCs that maintains both the structured and the unstructured data components of each document. Documents are represented by keywords, and knowledge discovery is performed by analysing the co-occurrence frequencies of the keywords that represent each document. Documents are clustered using the extracted knowledge, which can reduce the search space. The relevant documents retrieved for a query are ranked by the relevance of the topic within each document. The paper also describes RFC Viewer, our tool for viewing an RFC document in rich text format rather than plain text, which also presents the knowledge extracted from the document and supports various KDD operations on it.
Keywords
data mining; text analysis; RFC Viewer; Request for Comments Document Series; document clustering; keyword co-occurrence frequencies; knowledge discovery in text system; query; rich text format; search space; structured data components; system architecture; text mining; topic relevance; unstructured data components; Computer architecture; Computer science; Data mining; Databases; Frequency; Knowledge engineering; Labeling; Maintenance engineering; Performance analysis; Text mining;
fLanguage
English
Publisher
IEEE
Conference_Titel
Language Engineering Conference, 2002. Proceedings
Print_ISBN
0-7695-1885-0
Type
conf
DOI
10.1109/LEC.2002.1182302
Filename
1182302