Title :
PyThinSearch: A Simple Web Search Engine
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo
Abstract :
We describe a simple, functioning Web search engine, written in the Python programming language, for indexing and searching online documents. Python was chosen because it is an elegant language with simple syntax, is easy to learn and debug, and is supported almost equally well on the major operating systems. The program has two notable features: an adjustable search function that lets users rank documents with several combinations of score functions, and a focus on anchor text analysis, for which we provide four additional schemes that calculate scores based on anchor text. As an experimental tool, we also provide an additional ranking algorithm, motivated by PageRank and HITS, that is based on a link addition process in a network. This algorithm is the original contribution of this paper.
Keywords :
Internet; indexing; information retrieval; programming languages; search engines; PyThinSearch; Python programming language; Web search engine; link addition process; ranking algorithm; Books; Crawlers; Neural networks; Open source software; Operating systems; Robots; Text analysis; Web pages; Web search; anchor text analysis; search engine
Conference_Title :
International Conference on Complex, Intelligent and Software Intensive Systems, 2009 (CISIS '09)
Conference_Location :
Fukuoka
Print_ISBN :
978-1-4244-3569-2
Electronic_ISBN :
978-0-7695-3575-3
DOI :
10.1109/CISIS.2009.12