DocumentCode :
166031
Title :
HFRECCA for clustering of text data from travel guide articles
Author :
Wazarkar, Seema V. ; Manjrekar, Amrita A.
Author_Institution :
Dept. of Technol., Shivaji Univ., Kolhapur, India
fYear :
2014
fDate :
24-27 Sept. 2014
Firstpage :
1486
Lastpage :
1489
Abstract :
Text clustering is useful for extracting text data from web applications such as e-newspapers, collections of research papers, blogs, and news feeds on social networks. This paper presents a text clustering algorithm, the Hierarchical Fuzzy Relational Eigenvector Centrality-based Clustering Algorithm (HFRECCA). The algorithm combines fuzzy clustering, divisive hierarchical clustering, and the PageRank algorithm. Travel guide articles are pre-processed by removing stop words and applying stemming. A similarity matrix is then generated using word distance computation. HFRECCA applies divisive hierarchical clustering, using the Fuzzy Relational Eigenvector Centrality-based Clustering Algorithm (FRECCA) as a subroutine. FRECCA computes cluster membership values from PageRank scores and generates clusters accordingly. HFRECCA has features of both hierarchical and fuzzy clustering: it creates a hierarchy of clusters, and an object can belong to multiple clusters. Because the structure of information in text documents is hierarchical, HFRECCA is useful for clustering data from natural language documents.
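As an illustration of the PageRank scoring step mentioned in the abstract, the following is a minimal sketch of power-iteration PageRank over a weighted similarity matrix. It is not the authors' implementation: the function name, the damping factor of 0.85, the convergence tolerance, and the toy matrix are all assumptions made here, and the FRECCA membership update and the hierarchical splitting are not reproduced because the abstract does not specify them.

```python
def pagerank_scores(similarity, damping=0.85, max_iter=100, tol=1e-10):
    """Power-iteration PageRank on a weighted similarity matrix.

    `similarity` is a square list of lists of non-negative weights.
    The damping factor and tolerance are assumed defaults, not values
    taken from the paper.
    """
    n = len(similarity)
    # Total outgoing weight of each node, used to normalise its row.
    out_weight = [sum(row) for row in similarity]
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        new_scores = []
        for i in range(n):
            incoming = 0.0
            for j in range(n):
                if out_weight[j] > 0:
                    incoming += similarity[j][i] / out_weight[j] * scores[j]
                else:
                    # A node with no weight spreads its score uniformly.
                    incoming += scores[j] / n
            new_scores.append((1.0 - damping) / n + damping * incoming)
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Hypothetical similarity matrix for four text segments; the last one
# is only weakly similar to the others.
toy_similarity = [
    [0.0, 0.9, 0.8, 0.1],
    [0.9, 0.0, 0.7, 0.1],
    [0.8, 0.7, 0.0, 0.2],
    [0.1, 0.1, 0.2, 0.0],
]
toy_scores = pagerank_scores(toy_similarity)
```

In this sketch the scores sum to one, and the weakly connected fourth segment receives the lowest score, which is the kind of centrality signal a membership computation could build on.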
Keywords :
eigenvalues and eigenfunctions; natural language processing; text analysis; text detection; Web applications; e-news papers; fuzzy clustering; hierarchical clustering algorithm; hierarchical fuzzy relational eigenvector centrality-based clustering algorithm; natural language documents; page rank algorithm; social networks; sub routine algorithm; text clustering; text data extraction; travel guide articles; Algorithm design and analysis; Clustering algorithms; Computational modeling; Data mining; Data models; Partitioning algorithms; Semantics; Fuzzy clustering; Hierarchical clustering; Similarity Measure; Text Clustering;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Advances in Computing, Communications and Informatics (ICACCI), 2014 International Conference on
Conference_Location :
New Delhi
Print_ISBN :
978-1-4799-3078-4
Type :
conf
DOI :
10.1109/ICACCI.2014.6968349
Filename :
6968349