• DocumentCode
    3630274
  • Title

    A genetic algorithm for logical topic text segmentation

  • Author

    Alin Mihaila;Andreea Mihis;Cristina Mihaila

  • Author_Institution
    Babeș-Bolyai University, Cluj-Napoca, Romania
  • fYear
    2008
  • Firstpage
    500
  • Lastpage
    505
  • Abstract
    Topic text segmentation is an important problem in information retrieval and summarization. The segmentation process tries to split a text into thematic clusters (segments) such that every cluster has high cohesion and contiguous clusters are connected as little as possible. The originality of this work is twofold. First, we propose new segmentation criteria based on text entailment for interpreting the cohesion and connectivity of segments. Second, we use a genetic algorithm with a text-entailment-based measure to determine the topic boundaries and identify a predefined number of segments. The obtained results are compared against two manually segmented texts.
  • Keywords
    "Genetic algorithms","Information retrieval","Context modeling","Frequency","Dynamic programming","Decision trees","Proposals","Trade agreements","Natural languages"
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management, 2008. ICDIM 2008. Third International Conference on
  • Print_ISBN
    978-1-4244-2916-5
  • Type

    conf

  • DOI
    10.1109/ICDIM.2008.4746783
  • Filename
    4746783