Title of article :
Frequent Term Based Clustering of Stories with Semantic Analysis for Searching and Retrieval
Author/Authors :
Amrut Nagasunder, Bharath Boregowda, Madhu Venkatesha, Ananthanarayana V. S.
Issue Information :
Periodical with serial number, year 2010
Pages :
9
From page :
219
To page :
227
Abstract :
Effective document organizations are often those which provide a concise representation of text content in a large collection of documents. We consider the task of clustering stories (documents) as a means of arranging documents effectively for searching and retrieval. We propose a novel representation for a story, based on the essential parts of speech: the nouns, verbs and adjectives. We then cluster these story representations, resulting in a graph structure where the story representations are conjoined at nodes having the same or synonymous noun. Such a structure can be queried for stories by giving a search string. We employ a knowledge bank throughout the system as a step towards semantic analysis of the text. To test the goodness of the clusters, we carry out a classification test on two data sets. We achieve a significantly high quality of clustering, with promising results in regard to memory compaction.
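The idea in the abstract (stories reduced to nouns, verbs and adjectives, joined at nodes for the same or synonymous noun, then queried by a search string) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for the knowledge bank, the part-of-speech tags are supplied by hand rather than by a tagger, the function names are invented for illustration, and the frequent-term step named in the title is omitted.

```python
from collections import defaultdict

# Hypothetical stand-in for the knowledge bank: maps a noun to a canonical synonym.
SYNONYMS = {"automobile": "car", "auto": "car", "physician": "doctor"}


def canonical(noun):
    """Lowercase a noun and map it to its canonical synonym, if one is known."""
    noun = noun.lower()
    return SYNONYMS.get(noun, noun)


def represent(tagged_story):
    """Keep only nouns, verbs and adjectives from (word, tag) pairs."""
    keep = {"NOUN", "VERB", "ADJ"}
    return [(word.lower(), tag) for word, tag in tagged_story if tag in keep]


def build_index(stories):
    """Map each canonical noun (a node) to the set of story ids joined there."""
    index = defaultdict(set)
    for story_id, tagged in stories.items():
        for word, tag in represent(tagged):
            if tag == "NOUN":
                index[canonical(word)].add(story_id)
    return index


def search(index, query):
    """Return ids of stories attached to the node of any query word."""
    hits = set()
    for word in query.split():
        hits |= index.get(canonical(word), set())
    return hits


# Toy, hand-tagged stories (assumed data, for illustration only).
stories = {
    "s1": [("The", "DET"), ("doctor", "NOUN"), ("drove", "VERB"),
           ("a", "DET"), ("red", "ADJ"), ("car", "NOUN")],
    "s2": [("An", "DET"), ("automobile", "NOUN"), ("crashed", "VERB")],
    "s3": [("The", "DET"), ("physician", "NOUN"), ("slept", "VERB")],
}
index = build_index(stories)
print(sorted(search(index, "auto")))
```

In this sketch a query for "auto" resolves to the "car" node, so it retrieves both the story that mentions "car" and the one that mentions "automobile".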
Keywords :
Document clustering, semantic analysis, natural language processing, text mining
Journal title :
International Journal of Advanced Research in Computer Science
Serial Year :
2010
Record number :
668399