• DocumentCode
    3318270
  • Title
    Improvement of the dotplotting method for linear text segmentation
  • Author
    Ye, Na ; Zhu, Jingbo ; Luo, Haitao ; Wang, Huizhen ; Zhang, Bin
  • Author_Institution
    Natural Language Process. Lab., Inst. of Comput. Software & Theor., China
  • fYear
    2005
  • fDate
    30 Oct.-1 Nov. 2005
  • Firstpage
    636
  • Lastpage
    641
  • Abstract
    The dotplotting method, employed by Reynar (1994), is a state-of-the-art algorithm for automatic linear text segmentation. However, its density measure, which represents topical coherence, has several problems. The density function is asymmetric, leading to the apparently false conclusion that a forward scan may produce a different segmentation from a backward scan. Moreover, when determining the next boundary, the assessment strategy does not adequately take previously located boundaries into account. In this paper we propose modified models that remedy these problems. We also make use of segment length to improve segmentation performance. Experimental results show that the modified models achieve considerable improvement in Pk value, precision, and recall over the original dotplotting method.
  • Keywords
    text analysis; automatic linear text segmentation; dotplotting method; topical coherence; Coherence; Computer applications; Density functional theory; Density measurement; Electronic mail; Information resources; Laboratories; Natural language processing; Software algorithms; Text processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Natural Language Processing and Knowledge Engineering, 2005. IEEE NLP-KE '05. Proceedings of 2005 IEEE International Conference on
  • Print_ISBN
    0-7803-9361-9
  • Type
    conf
  • DOI
    10.1109/NLPKE.2005.1598814
  • Filename
    1598814