• DocumentCode
    3222683
  • Title

    Text mining in 'Request for Comments Document Series'

  • Author

    Gurusamy, Siva ; Manjula, D. ; Geetha, T.V.

  • Author_Institution
    Sch. of Comput. Sci. & Eng., Anna Univ., India
  • fYear
    2002
  • fDate
    13-15 Dec. 2002
  • Firstpage
    147
  • Lastpage
    155
  • Abstract
    This paper presents a knowledge discovery in text (KDT) system for the 'Request for Comments (RFC) Document Series'. It proposes a versatile system architecture for text mining in RFCs that maintains both the structured and the unstructured data components of each document. Documents are represented by keywords, and knowledge discovery is performed by analysing the co-occurrence frequencies of those keywords. Documents are clustered using the extracted knowledge, which can reduce the search space. The documents retrieved for a query are ranked by how relevant the topic is within each document. The paper also describes RFC Viewer, our tool for viewing an RFC document in rich text format rather than plain text; the tool also presents the knowledge extracted from the document and supports various KDD operations on it.
  • Keywords
    data mining; text analysis; RFC Viewer; Request for Comments Document Series; document clustering; keyword co-occurrence frequencies; knowledge discovery in text system; query; rich text format; search space; structured data components; system architecture; text mining; topic relevance; unstructured data components; Computer architecture; Computer science; Data mining; Databases; Frequency; Knowledge engineering; Labeling; Maintenance engineering; Performance analysis; Text mining;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Language Engineering Conference, 2002. Proceedings
  • Print_ISBN
    0-7695-1885-0
  • Type

    conf

  • DOI
    10.1109/LEC.2002.1182302
  • Filename
    1182302