Title :
A novel approach for link context extraction using Bison parser
Author :
Gupta, Swastik ; Yadav, Suneel
Author_Institution :
AKGEC, Ghaziabad, India
Abstract :
Since the advent of the World Wide Web, link context has been widely used to infer the theme of a target web page. Many approaches have tried to exploit link context to determine the precise context of a link, but they were not very efficient. Link context has been used in many areas, such as web page classification, search engines, and topical crawlers. In this paper, we derive the link context using an LALR parser (the Bison parser). To do so, different web pages were collected and their concepts were identified with the help of a tag tree. The link context was then derived using the Bison parser. We also compare the technique with the anchor-text-based method using the Jaccard coefficient.
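The Jaccard coefficient mentioned in the abstract measures the overlap of two sets A and B as |A ∩ B| / |A ∪ B|. The sketch below is only an illustration of that measure and not the authors' evaluation code: the tokenizer, the helper names (tokenize, jaccard), and the two sample strings are assumptions made up for this example, and the abstract does not state which texts were actually compared.

```python
import re


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric word tokens.

    Assumption: a simple regex tokenizer; the paper does not specify one.
    """
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def jaccard(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


# Hypothetical strings, invented for illustration only.
link_context = "Bison parser manual for LALR grammars"
target_page = "GNU Bison manual: writing LALR grammars for a parser"

score = jaccard(tokenize(link_context), tokenize(target_page))
print(round(score, 3))
```

A score near 1 means the extracted text shares most of its terms with the text it is compared against, and a score near 0 means little overlap. Under the same assumptions, an anchor-text baseline could be scored the same way and the two values compared.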
Keywords :
Internet; classification; context-free grammars; indexing; information retrieval; search engines; text analysis; Bison parser; Jaccard coefficient; LALR parser; Web page classification; World Wide Web; anchor text based method; link context extraction; search engine classification; topical crawler classification; Conferences; Context; Crawlers; Flexible printed circuits; Grammar; HTML; Web pages; Anchor text; Crawling; Indexing; LALR parsing; Link Context; Tag tree;
Conference_Title :
Advance Computing Conference (IACC), 2014 IEEE International
Conference_Location :
Gurgaon
Print_ISBN :
978-1-4799-2571-1
DOI :
10.1109/IAdCC.2014.6779449