• DocumentCode
    119407
  • Title

    A Reference Framework for the Automated Exploration of Web Applications

  • Author

    Le Breton, Gabriel ; Bergeron, Nicolas ; Halle, Sylvain

  • Author_Institution
    Dept. d'Inf. et de Math., Univ. du Quebec a Chicoutimi, Chicoutimi, QC, Canada
  • fYear
    2014
  • fDate
    4-7 Aug. 2014
  • Firstpage
    81
  • Lastpage
    90
  • Abstract
    Web crawling is the process of exhaustively exploring the contents of a web site or application through automated means. While the results of such a crawl can be put to numerous uses, ranging from a simple backup to comprehensive testing and analysis, features of modern-day applications prevent crawlers from exploring them properly. We provide an in-depth analysis of 15 such features, and report on their presence in a study of 16 real-world web sites. Based on that study, we develop a configurable web application where each such feature can be turned on or off, intended as a test bench where existing crawlers can be compared in a uniform way. Our results, which are the first exhaustive comparison of available crawlers, indicate areas where future work should be aimed.
  • Keywords
    Internet; information retrieval; Web application exploration; Web crawling; Web site; Browsers; Crawlers; HTML; Navigation; Servers; Testing; Web sites; benchmark; crawlers; web applications;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Engineering of Complex Computer Systems (ICECCS), 2014 19th International Conference on
  • Conference_Location
    Tianjin
  • Print_ISBN
    978-1-4799-5481-0
  • Type

    conf

  • DOI
    10.1109/ICECCS.2014.20
  • Filename
    6923122