Author_Institution :
Coll. of Comput. Sci. & Technol., Taiyuan Univ. of Technol., Taiyuan, China
Abstract :
With the development of Internet technology, the amount of information on websites has increased sharply. Users have a strong demand for intranet information retrieval, yet the intranet retrieval service provided by general web search engines is inefficient. To address this problem, we study the architecture, key technologies, and implementation of an intranet search engine system, and we design and implement such a system based on Lucene. The system comprises an information gathering module, an indexing module, a searching module, and a system interface module, and it can index and search many document formats, such as HTML, Word, Excel, and PDF. Experiments show that the system achieves good indexing and retrieval efficiency and performance and can effectively provide intranet information retrieval service to users.
Keywords :
Internet; Web sites; indexing; information retrieval; intranets; search engines; Internet technology; Intranet information retrieval service; Intranet search engine system; Lucene; Website; document formats; indexing module; information gathering module; searching module; system interface module; Crawlers; Engines