Title :
An exploratory analysis of the novelty of a news Web site
Author :
Calzarossa, Maria Carla ; Tessera, Daniele
Author_Institution :
Dipt. di Inf. e Sist., Univ. di Pavia, Pavia, Italy
Abstract :
The growing amount of information published on the Web, combined with its dynamic nature, raises many challenging issues concerning the management and retrieval of information and the provisioning of the underlying infrastructures. Search engines have to meet two conflicting requirements: minimizing the number of downloads and providing up-to-date information. In this paper, we present the results of an exploratory analysis aimed at investigating the novelty of the content of a news Web site. We analyzed the Web site from a horizontal perspective, by focusing on the content of the individual articles, and from a vertical perspective, by focusing on the entire collection of articles published on the site. These two perspectives allowed us to study how fast and to what extent articles were modified and to model the evolution of the Web site.
Keywords :
Web sites; electronic publishing; information management; information retrieval; search engines; article publishing; exploratory analysis; HTML; Markov processes; Monitoring; Multimedia communication; Streaming media; Web pages
Conference_Title :
Performance Evaluation of Computer and Telecommunication Systems (SPECTS), 2010 International Symposium on
Conference_Location :
Ottawa, ON
Print_ISBN :
978-1-56555-340-8