DocumentCode :
2877894
Title :
Using a Competitive Clustering Algorithm to Comprehend Web Applications
Author :
De Lucia, Andrea ; Scanniello, Giuseppe ; Tortora, Genoveffa
Author_Institution :
Dipt. di Matematica e Informatica, Salerno Univ., Fisciano
fYear :
2006
fDate :
23-24 Sept. 2006
Firstpage :
33
Lastpage :
40
Abstract :
We propose an approach based on winner-takes-all (WTA), a competitive clustering algorithm, to support the comprehension of static and dynamic Web applications. The process first computes the distances between the Web pages and then identifies similar pages through the WTA clustering algorithm. Two instances of the process are presented, identifying similar pages at the structural level and at the content level, respectively. The first instance encodes the page structure into a string and then uses the Levenshtein algorithm to compute the distance between each pair of pages. To group similar pages at the content level, the second instance uses latent semantic indexing to represent each page as a vector in the concept space; the Euclidean distance between these vectors then gives the page distances passed as input to the clustering algorithm. A prototype that automates the identification of groups of similar pages has been implemented. The approach and the prototype have been assessed in a case study.
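The structural-level pipeline described in the abstract (encode pages as strings, compute Levenshtein distances, cluster with winner-takes-all) can be illustrated with a minimal Python sketch. Several details here are assumptions for illustration and are not taken from the paper: the one-character-per-tag page encoding, the use of each page's row of the distance matrix as its feature vector, the learning rate, the number of epochs, and the choice of initial prototypes. The function names are hypothetical as well.

```python
import math


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def winner_takes_all(vectors, prototypes, rate=0.5, epochs=20):
    """Competitive clustering: only the closest prototype (the winner)
    is moved toward each input vector. Returns (labels, prototypes)."""
    protos = [list(p) for p in prototypes]

    def winner(v):
        return min(range(len(protos)), key=lambda k: euclidean(v, protos[k]))

    for _ in range(epochs):
        for v in vectors:
            w = winner(v)
            protos[w] = [p + rate * (x - p) for p, x in zip(protos[w], v)]
    return [winner(v) for v in vectors], protos


# Hypothetical structure strings: one character per HTML tag (assumed encoding).
pages = ["hbtrd", "hbtrdd", "hbfi", "hbfii"]

# Pairwise Levenshtein distance matrix between the pages.
dist = [[levenshtein(p, q) for q in pages] for p in pages]

# Assumption: each page is represented by its row of the distance matrix,
# and the rows of pages 0 and 2 seed the two prototypes.
labels, _ = winner_takes_all(dist, [dist[0], dist[2]])
print(labels)
```

For the content-level instance, the same `winner_takes_all` function could be applied directly to the concept-space vectors produced by latent semantic indexing, since it only relies on Euclidean distance between vectors.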
Keywords :
Web sites; indexing; pattern clustering; Euclidean distance; Levenshtein algorithm; Web application comprehension; Web pages; competitive clustering; latent semantic indexing; winner takes all clustering; Application software; Cloning; Clustering algorithms; HTML; Prototypes; Reverse engineering; Software prototyping
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Web Site Evolution, 2006. WSE '06. Eighth IEEE International Symposium on
Conference_Location :
Philadelphia, PA
ISSN :
1550-4441
Print_ISBN :
0-7695-2696-9
Type :
conf
DOI :
10.1109/WSE.2006.19
Filename :
4027204