DocumentCode
3630274
Title
A genetic algorithm for logical topic text segmentation
Author
Alin Mihaila;Andreea Mihis;Cristina Mihaila
Author_Institution
Babeș-Bolyai University, Cluj-Napoca, Romania
fYear
2008
Firstpage
500
Lastpage
505
Abstract
Topic text segmentation is an important problem in information retrieval and summarization. The segmentation process tries to split a text into thematic clusters (segments) such that every cluster has high cohesion and contiguous clusters are connected as little as possible. The originality of this work is twofold. First, we propose new segmentation criteria based on text entailment for interpreting the cohesion and connectivity of segments. Second, we use a genetic algorithm with a measure based on text entailment to determine the topic boundaries and identify a predefined number of segments. The obtained results are compared against two manually segmented texts.
Keywords
"Genetic algorithms","Information retrieval","Context modeling","Frequency","Dynamic programming","Decision trees","Proposals","Trade agreements","Natural languages"
Publisher
ieee
Conference_Titel
Digital Information Management, 2008. ICDIM 2008. Third International Conference on
Print_ISBN
978-1-4244-2916-5
Type
conf
DOI
10.1109/ICDIM.2008.4746783
Filename
4746783