Title :
An Efficient Linear Text Segmentation Algorithm Using Hierarchical Agglomerative Clustering
Author :
Wu, Ji-Wei ; Tseng, Judy C R ; Tsai, Wen-Nung
Author_Institution :
Dept. of Comput. Sci., Nat. Chiao Tung Univ., Hsinchu, Taiwan
Abstract :
Linear text segmentation aims to divide a long text into several topical segments. It benefits many natural language processing tasks, such as information retrieval and document summarization. In this article, an efficient linear text segmentation algorithm based on hierarchical agglomerative clustering is presented. The proposed algorithm requires no auxiliary knowledge base, parameter setting, or user involvement. Experimental results show that the proposed algorithm not only runs in linear time, but also achieves segmentation accuracy comparable to that of several well-known linear text segmentation algorithms.
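The abstract does not describe the algorithm's internals, so the following is only a minimal sketch of the general idea of applying agglomerative clustering to linear segmentation: start with one segment per sentence and repeatedly merge the most similar pair of adjacent segments, so every segment stays contiguous. The bag-of-words cosine similarity, the function names `cosine` and `segment`, and the target segment count `k` are all assumptions made for illustration. Unlike the method described in the abstract, this naive version needs the parameter `k` and does not run in linear time.

```python
from collections import Counter
import math


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two bag-of-words vectors (assumed measure)."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def segment(sentences: list[str], k: int) -> list[list[str]]:
    """Merge the most similar pair of adjacent segments until k remain.

    k is a hypothetical stopping criterion used only in this sketch.
    """
    segments = [[s] for s in sentences]
    bags = [Counter(s.lower().split()) for s in sentences]
    while len(segments) > max(k, 1):
        # Only neighbours may merge, which keeps every segment contiguous.
        i = max(range(len(segments) - 1),
                key=lambda j: cosine(bags[j], bags[j + 1]))
        segments[i:i + 2] = [segments[i] + segments[i + 1]]
        bags[i:i + 2] = [bags[i] + bags[i + 1]]
    return segments
```

Calling `segment(sentences, 2)` on a list of sentences returns two contiguous groups that preserve the original sentence order. Restricting merges to neighbours is what distinguishes linear segmentation from general clustering, where any two clusters may be merged.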
Keywords :
computational complexity; knowledge based systems; natural language processing; pattern clustering; text analysis; auxiliary knowledge base; efficient linear text segmentation algorithm; hierarchical agglomerative clustering; linear time computational complexity; natural language processing task; segmentation accuracy; Accuracy; Algorithm design and analysis; Clustering algorithms; Error probability; Heuristic algorithms; Merging; NLP application; computational intelligence; text segmentation
Conference_Title :
Computational Intelligence and Security (CIS), 2011 Seventh International Conference on
Conference_Location :
Hainan
Print_ISBN :
978-1-4577-2008-6
DOI :
10.1109/CIS.2011.240