Title :
A text mining approach on automatic generation of Web directories and hierarchies
Author :
Yang, Hsin-Chang ; Lee, Chung-Hong
Author_Institution :
Dept. of Inf. Manage., Chang Jung Univ., Tainan, Taiwan
Abstract :
There is an enormous number of Web pages in the world, so retrieving required information from the WWW is an arduous task. The WWW community has used different models for retrieving Web pages. One of the most widely used models is traversing a predefined Web directory hierarchy to reach a user's goal. Web directories are compiled or classified folders of Web pages and are usually organized into a hierarchical structure. The classification of Web pages into proper directories and the organization of directory hierarchies are generally performed by human experts. We provide a method that applies text mining techniques to a set of Web pages to automatically create Web directories and organize them into hierarchies. The method is based on the self-organizing map learning algorithm and requires no human intervention during the construction of Web directories and hierarchies. The experiments show that our method can produce comprehensible and reasonable Web directories and hierarchies.
Keywords :
Web sites; data mining; information retrieval; learning (artificial intelligence); self-organising feature maps; Web directory; Web page; human expert; self-organizing map learning algorithm; text mining; Clustering algorithms; Humans; Information management; Labeling; Neural networks; Neurons; Web pages; World Wide Web
Conference_Title :
Web Intelligence, 2003. WI 2003. Proceedings. IEEE/WIC International Conference on
Print_ISBN :
0-7695-1932-6
DOI :
10.1109/WI.2003.1241282