Title :
Mapping texts into graphs: An improved text similarity algorithm
Author :
Zuoguo Liu ; Xiaorong Chen
Author_Institution :
Coll. of Comput. Sci. & Inf., Guizhou Univ., Guiyang, China
Abstract :
An improved graph-based text similarity (GBTS) algorithm, which is based on the original GBTS, is presented in this paper. A text is mapped into a graph whose nodes are terms and whose undirected edges are term sequences. The Maximum Common Subgraph (MCS) of two graphs is useful for analyzing their similarity. Moreover, the similarity of two texts is divided into two parts: node similarity and edge similarity. Each part is calculated separately, and the text similarity is the sum of the two parts. Finally, the improved algorithm is compared with the original one.
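The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the exact formulas, so several choices here are assumptions. An edge is taken to link two adjacent terms, each part is normalized by the larger graph's size, and the two parts are combined with a hypothetical weight `node_weight` so that the sum stays in [0, 1]. Because every node carries a unique term label, the maximum common subgraph is approximated here by intersecting the node sets and the edge sets. All function names are illustrative.

```python
import re


def text_to_graph(text):
    """Map a text to (nodes, edges).

    Nodes are the distinct terms; edges are undirected pairs of
    adjacent terms (an assumed reading of "term sequences").
    """
    terms = re.findall(r"[a-z0-9]+", text.lower())
    nodes = set(terms)
    # frozenset makes each edge undirected; self-loops are skipped.
    edges = {frozenset((a, b)) for a, b in zip(terms, terms[1:]) if a != b}
    return nodes, edges


def overlap(common, a, b):
    """Share of the larger set covered by the common part (assumed normalization)."""
    denom = max(len(a), len(b))
    return len(common) / denom if denom else 0.0


def graph_text_similarity(text_a, text_b, node_weight=0.5):
    """Weighted sum of node similarity and edge similarity."""
    nodes_a, edges_a = text_to_graph(text_a)
    nodes_b, edges_b = text_to_graph(text_b)
    # With uniquely labelled nodes, the common subgraph is the
    # intersection of the node sets and of the edge sets.
    common_nodes = nodes_a & nodes_b
    common_edges = edges_a & edges_b
    node_sim = overlap(common_nodes, nodes_a, nodes_b)
    edge_sim = overlap(common_edges, edges_a, edges_b)
    return node_weight * node_sim + (1 - node_weight) * edge_sim
```

Under these assumptions, two texts with the same terms in a different order share all nodes but only some edges, so the edge part is what separates them: `graph_text_similarity("a b c", "a c b")` scores lower than `graph_text_similarity("a b c", "a b c")`.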
Keywords :
graph theory; text analysis; GBTS algorithm; MCS; edge similarity; graph-based text similarity algorithm; maximum common subgraph; node similarity; term sequences; text mapping; undirected edges; mapped graph
Conference_Title :
Computer Science and Network Technology (ICCSNT), 2012 2nd International Conference on
Conference_Location :
Changchun
Print_ISBN :
978-1-4673-2963-7
DOI :
10.1109/ICCSNT.2012.6526173