DocumentCode
3570138
Title
Discover Information and Knowledge from Websites Using an Integrated Summarization and Visualization Framework
Author
Fung, Chun Che ; Thanadechteemapat, Wigrai
Author_Institution
Sch. of Inf. Technol., Murdoch Univ., Perth, WA, Australia
fYear
2010
Firstpage
232
Lastpage
235
Abstract
The number of Web sites has grown noticeably over the last ten years, to roughly 225 million, which means that knowledge and information on the Internet are growing rapidly. Although search engines help users filter for the information they want based on keywords, search results are normally presented as a list, and users have to visit each Web page to determine whether a result is appropriate. A considerable amount of time therefore has to be spent on finding the required information. To address this issue, this paper proposes a knowledge discovery approach for the Web that provides an overview of the information on a Web site by integrating summarization and visualization techniques. These include text summarization, a tag cloud, a Document Type View, and interactive features such as drill-down and thumbnails. The approach can reduce the time required to identify and search for information or knowledge on the Web.
Keywords
Internet; Web sites; data mining; data visualisation; text analysis; document type view; drill down feature; information discovery; integrated summarization; interactive features; knowledge discovery; tag cloud; text summarization; thumbnails feature; visualization framework; Information technology; Machine learning; Natural language processing; Search engines; Web pages; Visualization; Web assessment
fLanguage
English
Publisher
IEEE
Conference_Titel
Knowledge Discovery and Data Mining, 2010. WKDD '10. Third International Conference on
Print_ISBN
978-1-4244-5397-9
Electronic_ISBN
978-1-4244-5398-6
Type
conf
DOI
10.1109/WKDD.2010.109
Filename
5432653