Title :
Enhancing automatic extraction of top-k list from web
Author :
Patil, Dipali S. ; Dhawas, N.A.
Author_Institution :
STES's Sinhgad Inst. of Technol., Lonavala, India
Abstract :
Nowadays the World Wide Web is considered the largest source of information. This large database contains information in every area, but finding particular information or extracting accurate data from the web is difficult. The main reason is that the data available in this huge database is not in a uniform format. When data follows a particular format, information can be extracted without difficulty; for example, when extracting data from HTML pages, the data can be selected easily with the help of tags. This paper addresses the extraction of top-k lists from all available web databases, which contain data in either structured or unstructured format. An algorithm is implemented for this purpose that provides accurate and faster generation of top-k lists.
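The tag-based selection mentioned in the abstract can be illustrated with a minimal sketch. This is not the algorithm of the paper, which the abstract does not describe. It assumes that the page title contains a phrase such as "Top 3" and that the list items appear as `<li>` elements inside a non-nested `<ol>` or `<ul>`. The names `ListItemParser` and `extract_top_k` are hypothetical and chosen only for illustration.

```python
from html.parser import HTMLParser
import re


class ListItemParser(HTMLParser):
    """Collects the text of every <li>, grouped by the most recently opened <ol>/<ul>."""

    def __init__(self):
        super().__init__()
        self.lists = []        # one list of item strings per <ol>/<ul>
        self._in_item = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("ol", "ul"):
            self.lists.append([])
        elif tag == "li" and self.lists:
            self._in_item = True
            self._buffer = []

    def handle_endtag(self, tag):
        if tag == "li" and self._in_item:
            self.lists[-1].append("".join(self._buffer).strip())
            self._in_item = False

    def handle_data(self, data):
        if self._in_item:
            self._buffer.append(data)


def extract_top_k(html, title):
    """Return the first list whose length equals the k named in the title, else None.

    Assumption: k is written as digits after the word "top" in the title.
    """
    match = re.search(r"\btop\s+(\d+)\b", title, re.IGNORECASE)
    if match is None:
        return None
    k = int(match.group(1))
    parser = ListItemParser()
    parser.feed(html)
    parser.close()
    for items in parser.lists:
        if len(items) == k:
            return items
    return None
```

Matching the list length against k is one simple way to skip navigation menus and other unrelated lists on the same page. Unstructured pages, where the items are not marked up with list tags, would need a different approach.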
Keywords :
Internet; database management systems; hypermedia markup languages; HTML pages; World Wide Web; automatic extraction enhancement; data extraction; database; structured format; top-k list; unstructured format; Classification algorithms; Convergence; Data mining; Databases; Feature extraction; HTML; Web pages; Classifier; Content Processor; Database; Parser; Top-k list;
Conference_Title :
Convergence of Technology (I2CT), 2014 International Conference for
Print_ISBN :
978-1-4799-3758-5
DOI :
10.1109/I2CT.2014.7092331