• DocumentCode
    2219717
  • Title

    Identifying variable-length meaningful phrases with correlation functions

  • Author

    Kim, Hyoung-Rae ; Chan, Philip K.

  • Author_Institution
    Dept. of Comput. Sci., Florida Inst. of Technol., Melbourne, FL, USA
  • fYear
    2004
  • fDate
    15-17 Nov. 2004
  • Firstpage
    30
  • Lastpage
    38
  • Abstract
    Finding meaningful phrases in a document has been studied in various information retrieval systems in order to improve the performance. Many previous statistical phrase-finding methods had a different aim such as document classification. Some are hybridized with statistical and syntactic grammatical methods; others use correlation heuristics between words. We propose a new phrase-finding algorithm that adds correlated words one by one to the phrases found in the previous stage, maintaining high correlation within a phrase. Our results indicate that our algorithm finds more meaningful phrases than an existing algorithm. Furthermore, the previous algorithm could be improved by applying different correlation functions.
  • Keywords
    computational complexity; document handling; information retrieval; information retrieval systems; statistical analysis; correlation heuristics; information retrieval system; statistical method; syntactic grammatical method; time complexity; variable-length phrase-finding algorithm; Algorithm design and analysis; Artificial intelligence; Clustering algorithms; Data mining; Frequency; Humans; Information retrieval; Performance analysis; Probability; Robustness;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Tools with Artificial Intelligence, 2004. ICTAI 2004. 16th IEEE International Conference on
  • ISSN
    1082-3409
  • Print_ISBN
    0-7695-2236-X
  • Type

    conf

  • DOI
    10.1109/ICTAI.2004.70
  • Filename
    1374167