• DocumentCode
    1927979
  • Title
    PyThinSearch: A Simple Web Search Engine
  • Author
    Mirzal, Andri
  • Author_Institution
    Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo
  • fYear
    2009
  • fDate
    16-19 March 2009
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    We describe a simple, functioning Web search engine for indexing and searching online documents using the Python programming language. Python was chosen because it is an elegant language with simple syntax, is easy to learn and debug, and supports the major operating systems almost equally well. The program has two notable features: an adjustable search function that lets users rank documents with several combinations of score functions, and a focus on anchor text analysis, for which we provide four additional schemes to calculate scores based on anchor text. As an experimental tool, we also provide an additional ranking algorithm based on a link addition process in networks, motivated by PageRank and HITS. This algorithm is the original contribution of this paper.
  • Keywords
    Internet; indexing; information retrieval; programming languages; search engines; PyThinSearch; Python programming language; Web search engine; link addition process; ranking algorithm; Books; Crawlers; Neural networks; Open source software; Operating systems; Robots; Text analysis; Web pages; Web search; anchor text analysis; search engine
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Complex, Intelligent and Software Intensive Systems, 2009. CISIS '09. International Conference on
  • Conference_Location
    Fukuoka
  • Print_ISBN
    978-1-4244-3569-2
  • Electronic_ISBN
    978-0-7695-3575-3
  • Type
    conf
  • DOI
    10.1109/CISIS.2009.12
  • Filename
    5066762