DocumentCode
3213529
Title
Dynamic peer-to-peer distributed document clustering and cluster summarization
Author
Meena, S.M.
Author_Institution
Dept. of Inf. Technol., Rajalakshmi Eng. Coll., Chennai, India
fYear
2011
fDate
20-22 July 2011
Firstpage
815
Lastpage
819
Abstract
The main objective of this paper is to provide cluster summarization of large text documents. The mining process involves sharing large volumes of data from various sources, which culminates in the mined data. In distributed data mining, adopting a flat node distribution model can affect scalability, modularity, and flexibility; these limitations are overcome by using dynamic peer-to-peer document clustering and cluster summarization. The Dynamic P2P document clustering and cluster summarization (DP2PCS) architecture is based on bonus words and stigma words. For document clustering applications, the system summarizes the distributed document clusters using a distributed key-phrase extraction algorithm, thus providing an interpretation of the clusters. Document summarization enables fast information retrieval. Compared to the existing system, the dynamic nature of the proposed system facilitates scalable clusters in which peers may join or leave the group at will. On average, the summarization process reduces the original document content by 63 percent, based on word count.
Keywords
data mining; information retrieval; pattern clustering; peer-to-peer computing; text analysis; bonus words; distributed data mining process; distributed key-phrase extraction algorithm; dynamic peer-to-peer distributed document clustering; flat node distribution model; stigma words; text document cluster summarization; Distributed data mining; distributed document clustering; document summarization; hierarchical peer-to-peer networks
fLanguage
English
Publisher
iet
Conference_Titel
Sustainable Energy and Intelligent Systems (SEISCON 2011), International Conference on
Conference_Location
Chennai
Type
conf
DOI
10.1049/cp.2011.0478
Filename
6143427