• DocumentCode
    2093330
  • Title
    Content-Oriented Automatic Text Categorization with the Cognitive Situation Models
  • Author
    Guo, Yi ; Shao, Zhiqing ; Nan, Hua
  • Author_Institution
    Dept. of Comput. Sci. & Eng., East China Univ. of Sci. & Technol., Shanghai, China
  • Volume
    1
  • fYear
    2008
  • fDate
    20-22 Dec. 2008
  • Firstpage
    512
  • Lastpage
    516
  • Abstract
    Text categorization is an important research field within text mining. Its original objective is to recognize, understand, and organize large volumes of texts or documents. Categorization is generally treated as a supervised learning problem, in which similarity is inferred from a collection of pre-categorized training texts. Typical approaches, however, are restricted to the single-word level and are not content-oriented. This paper introduces a content-oriented automatic text categorization algorithm (CogCate), inspired by cognitive situation models, that simulates the human cognitive procedure in the text categorization task. CogCate is not limited to traditional word-level statistical analysis; it also includes a lexical and semantic analysis step, which supports the accuracy of categorization. Evaluation experiments confirm the precision of CogCate. CogCate also substantially reduces the time and effort spent on training and corpus maintenance, and shows that text categorization can benefit from interdisciplinary research.
  • Keywords
    classification; cognition; computational linguistics; data mining; learning (artificial intelligence); text analysis; cognitive situation model; content-oriented automatic text categorization algorithm; human cognitive procedure; innovative research work; lexical analysis; semantics analysis; supervised learning; text mining; Cognitive science; Computer science; Data mining; Humans; Supervised learning; Testing; Text categorization; Text mining; Text recognition; Visualization; Cognitive; Content-Oriented; Situation Models; Text Categorization;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer Science and Computational Technology, 2008. ISCSCT '08. International Symposium on
  • Conference_Location
    Shanghai
  • Print_ISBN
    978-1-4244-3746-7
  • Type
    conf
  • DOI
    10.1109/ISCSCT.2008.63
  • Filename
    4731480