DocumentCode
2923476
Title
Preserving Patterns in Bipartite Graph Partitioning
Author
Hu, Tianming ; Qu, Chao ; Tan, Chew Lim ; Sung, Sam Yuan ; Zhou, Wenjun
Author_Institution
DongGuan Univ. of Technol.
fYear
2006
fDate
Nov. 2006
Firstpage
489
Lastpage
496
Abstract
This paper describes a new bipartite formulation for word-document co-clustering in which hyperclique patterns (here, sets of strongly affiliated documents) are guaranteed not to be split across clusters. Our approach to pattern-preserving clustering consists of three steps: mine maximal hyperclique patterns, form the bipartite graph, and partition it. With hyperclique patterns of documents preserved, the topic of each cluster can be represented by both the top words from that cluster and the documents in the patterns, which are expected to be more compact and representative than those in the standard bipartite formulation. Experiments with real-world datasets show that, with hyperclique patterns as starting points, the clustering results improve in terms of various external clustering criteria. In addition, the partitioned bipartite graph with preserved topical sets of documents naturally lends itself to different functions in search engines.
Keywords
document handling; graph theory; pattern clustering; bipartite formulation; bipartite graph partitioning; clustering criteria; document topical set; maximal document hyperclique pattern; pattern preservation; pattern preserving clustering; search engine; word document coclustering; Artificial intelligence; Bipartite graph; Chaos; Clustering algorithms; Computational efficiency; Educational institutions; Joining processes; Partitioning algorithms; Pattern analysis; Search engines;
fLanguage
English
Publisher
IEEE
Conference_Titel
Tools with Artificial Intelligence, 2006. ICTAI '06. 18th IEEE International Conference on
Conference_Location
Arlington, VA
ISSN
1082-3409
Print_ISBN
0-7695-2728-0
Type
conf
DOI
10.1109/ICTAI.2006.97
Filename
4031935