• DocumentCode
    2594519
  • Title

    CASIT: Content Based Identification of Textual Information in a Large Database

  • Author

    Guezouli, Larbi ; Essafi, Hassane

  • Author_Institution
    Comput. Sci. Dept., Batna Univ., Batna, Algeria
  • fYear
    2010
  • fDate
    20-23 April 2010
  • Firstpage
    621
  • Lastpage
    625
  • Abstract
    This paper describes the CASIT model (CAlculation of SImilarity of Text). Starting from a coarse comparison of text documents based on the Latent Semantic Indexing (LSI) model, the CASIT method calculates, at a finer level, the degree of similarity between model text documents and the documents compared against them. Our approach takes into account the neighbourhood of the words, which makes it possible to weight the words in the calculation of the score.
  • Keywords
    text analysis; CASIT model; calculation of similarity of text; content based identification; latent semantic indexing model; text documents; textual information; Application software; Computer science; Conferences; Databases; Filters; Frequency; Indexing; Information retrieval; Large scale integration; Matrix decomposition; CASIT; Component; LSI; textual research; vectorial model
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Advanced Information Networking and Applications Workshops (WAINA), 2010 IEEE 24th International Conference on
  • Conference_Location
    Perth, WA
  • Print_ISBN
    978-1-4244-6701-3
  • Type

    conf

  • DOI
    10.1109/WAINA.2010.133
  • Filename
    5480625