• DocumentCode
    2115087
  • Title

    Original and Normalized Web Log Metrics as Functions of Controllable Variables of Log Study

  • Author

    Buzikashvili, N.

  • Author_Institution
    Russian Acad. of Sci., Moscow
  • fYear
    2007
  • fDate
    Oct. 31 2007-Nov. 2 2007
  • Firstpage
    3
  • Lastpage
    12
  • Abstract
    Different studies of Web search engine query logs calculate the same metrics from logs sampled during different periods and processed under different values of two controllable variables: a client discriminator, used to exclude clients that are agents, and a temporal cutoff, used to segment client transactions into sessions. These variables, which are peculiar to Web log analysis, markedly affect the resulting metrics, so metrics calculated under different values of the variables cannot be compared directly. To isolate the contribution of the controllable variables, we introduce metrics normalized by their values at the unit values of the variables. Whilst the values of the original metric may differ enormously among logs, the normalized metrics are relatively similar functions of the controllable variables. As a result, one can use available logs as a learning collection to determine these functions and then use them to approximate metric values for unavailable logs from the reported values of the metrics and controllable variables. The Yandex (2005, 2007) and Excite (2001) logs were used as a learning collection to estimate log-free normalized metrics as functions of the controllable variables. The resulting functions were applied to previously reported AltaVista (1998, 2002) results to extrapolate values reported for one combination of controllable variables to all possible combinations.
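    The two controllable variables and the normalization described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the function names, the toy log, the choice of mean session length as the metric, the form of the client discriminator (a cap on transactions per client), and the unit values (cutoff 1.0, cap 1000) are all assumptions made for illustration.

```python
from typing import Dict, List

Log = Dict[str, List[float]]  # hypothetical layout: client id -> transaction times


def exclude_agents(log: Log, max_transactions: int) -> Log:
    """Client discriminator (assumed form): drop clients with too many transactions."""
    return {c: ts for c, ts in log.items() if len(ts) <= max_transactions}


def split_sessions(timestamps: List[float], cutoff: float) -> List[List[float]]:
    """Temporal cutoff: start a new session when the gap to the previous
    transaction exceeds the cutoff."""
    sessions: List[List[float]] = []
    for t in sorted(timestamps):
        if sessions and t - sessions[-1][-1] <= cutoff:
            sessions[-1].append(t)
        else:
            sessions.append([t])
    return sessions


def mean_session_length(log: Log, max_transactions: int, cutoff: float) -> float:
    """Example metric: mean number of transactions per session."""
    kept = exclude_agents(log, max_transactions)
    sessions = [s for ts in kept.values() for s in split_sessions(ts, cutoff)]
    return sum(len(s) for s in sessions) / len(sessions)


def normalized_metric(log: Log, max_transactions: int, cutoff: float,
                      unit_max_transactions: int = 1000,
                      unit_cutoff: float = 1.0) -> float:
    """Metric divided by its value at the (assumed) unit values of the variables."""
    base = mean_session_length(log, unit_max_transactions, unit_cutoff)
    return mean_session_length(log, max_transactions, cutoff) / base


# Toy data, invented for illustration only.
toy_log: Log = {
    "client_a": [0.0, 0.5, 3.0, 3.4, 20.0],
    "client_b": [1.0, 10.0, 10.8],
}

if __name__ == "__main__":
    for cutoff in (1.0, 5.0, 30.0):
        print(cutoff, normalized_metric(toy_log, 1000, cutoff))
```

    By construction the normalized metric equals 1 at the unit values, so curves obtained from different logs share a common reference point and can be compared as functions of the cutoff and the discriminator.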
  • Keywords
    Internet; search engines; Web log metrics; Web search engine query log; client discriminator; Control systems; Search engines; Time measurement; Web search
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Conference, 2007. LA-WEB 2007. Latin American
  • Conference_Location
    Santiago
  • Print_ISBN
    978-0-7695-3008-6
  • Type

    conf

  • DOI
    10.1109/LA-Web.2007.20
  • Filename
    4383153