DocumentCode
2296509
Title
Beyond the Web Graph: Mining the Information Architecture of the WWW with Navigation Structure Graphs
Author
Keller, Matthias ; Nussbaumer, Martin
Author_Institution
Steinbuch Centre for Computing (SCC), Karlsruhe Institute of Technology (KIT), Karlsruhe, Germany
fYear
2011
fDate
7-9 Sept. 2011
Firstpage
99
Lastpage
106
Abstract
Large Web sites contain a plethora of different menus and navigation aids, which implement systems of content organization as hierarchies, linear structures or matrices. Humans are able to decode the fine-grained content organization because they are aware of the different access methods provided by navigation systems and understand the higher-level information architecture. In contrast, current methods of link analysis cannot extract such a detailed model of the information architecture and are not able to recognize site boundaries and content hierarchies the way humans do. In this paper we present a new approach to mining navigation systems that increases the precision of Web structure mining. Instead of analyzing the complete Web graph spanned by pages and hyperlinks, subgraphs called Navigation Structure Graphs (NSGs) are analyzed. An NSG represents the hyperlinks belonging to a certain navigation system. We demonstrate the capabilities of NSGs for analyzing the organization of Web sites and present our research on mining NSGs.
Keywords
Web sites; data mining; graph theory; WWW; Web graph; Web structure mining; content organization; information architecture mining; link analysis; navigation structure graph; navigation system mining; Cascading style sheets; Humans; Information architecture; Navigation; Organizations; Visualization; hierarchy extraction
fLanguage
English
Publisher
IEEE
Conference_Titel
Emerging Intelligent Data and Web Technologies (EIDWT), 2011 International Conference on
Conference_Location
Tirana
Print_ISBN
978-1-4577-0840-4
Type
conf
DOI
10.1109/EIDWT.2011.23
Filename
6076427