DocumentCode :
3318270
Title :
Improvement of the dotplotting method for linear text segmentation
Author :
Ye, Na ; Zhu, Jingbo ; Luo, Haitao ; Wang, Huizhen ; Zhang, Bin
Author_Institution :
Natural Language Process. Lab., Inst. of Comput. Software & Theor., China
fYear :
2005
fDate :
30 Oct.-1 Nov. 2005
Firstpage :
636
Lastpage :
641
Abstract :
The dotplotting method, employed by Reynar (1994), is a state-of-the-art algorithm for automatic linear text segmentation. However, its density measure, which represents topical coherence, has several problems: the density function is asymmetric, leading to the apparently false conclusion that a forward scan may yield a different segmentation from a backward scan; moreover, when determining the next boundary, the assessment strategy doesn't adequately take the previously located boundaries into account. In this paper we propose modified models that remedy these problems. We also make use of segment length to improve segmentation performance. Experimental results show that the modified models achieve considerable improvement in Pk value, precision, and recall over the original dotplotting method.
Keywords :
text analysis; automatic linear text segmentation; dotplotting method; topical coherence; Coherence; Computer applications; Density functional theory; Density measurement; Electronic mail; Information resources; Laboratories; Natural language processing; Software algorithms; Text processing
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
Print_ISBN :
0-7803-9361-9
Type :
conf
DOI :
10.1109/NLPKE.2005.1598814
Filename :
1598814