Title :
Analyzing Anchor-Links to Extract Semantic Inferences of a Web Page
Author :
Chauhan, Naresh ; Sharma, Arvind Kumar
Author_Institution :
YMCA Inst. of Eng., Faridabad
Abstract :
Because an anchor in an HTML document points to a related document, picture, or media application, existing approaches [3,4,5] for finding information about the associated Web page rely on the anchor-text contained in the anchor tag. The problem with these approaches is that the anchor-text is sometimes absent altogether, or the anchor tag contains only a single word or an image. In this paper, a dataset of about a hundred Web pages of different categories from the Open Directory Project (ODP) has been surveyed and analyzed. The results show that cohesive text surrounding the anchor and non-cohesive text present elsewhere in the Web pages provide rich semantic cues about a target Web page.
Keywords :
Web sites; hypermedia markup languages; inference mechanisms; semantic Web; text analysis; HTML document; Web page; anchor-links analysis; anchor-text tag; open directory project; semantic inference extraction; Application software; Data mining; HTML; Head; Indexing; Information analysis; Information resources; Information technology; Robots; Web pages
Conference_Title :
Information Technology, (ICIT 2007). 10th International Conference on
Conference_Location :
Orissa
Print_ISBN :
0-7695-3068-0
DOI :
10.1109/ICIT.2007.46