• DocumentCode
    3559050
  • Title

    Statistical Analysis of Bayes Optimal Subset Ranking

  • Author

    Cossock, David ; Zhang, Tong

  • Author_Institution
    Yahoo Inc., Sunnyvale, CA
  • Volume
    54
  • Issue
    11
  • fYear
    2008
  • Firstpage
    5140
  • Lastpage
    5154
  • Abstract
    The ranking problem has become increasingly important in modern applications of statistical methods in automated decision-making systems. In particular, we consider a formulation of the statistical ranking problem which we call subset ranking, and focus on the discounted cumulated gain (DCG) criterion that measures the quality of items near the top of the rank-list. Similar to error minimization for binary classification, direct optimization of natural ranking criteria such as DCG leads to nonconvex optimization problems that can be NP-hard. Therefore, a computationally more tractable approach is needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top portion of the rank-list. We further investigate the asymptotic statistical behavior of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.
  • Keywords
    Bayes methods; concave programming; convex programming; decision making; learning (artificial intelligence); pattern classification; query formulation; search engines; set theory; statistical analysis; Bayes optimal subset ranking; DCG criterion; NP-hard problem; Web search query; automated decision making system; binary classification; convex learning formulation; discounted cumulated gain; nonconvex optimization; statistical analysis; Application software; Decision making; Electronic commerce; Gain measurement; Internet; Particle measurements; Search engines; Statistical analysis; Web pages; Web search; Bayes optimal; consistency; convex surrogate; ranking;
  • fLanguage
    English
  • Journal_Title
    Information Theory, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    0018-9448
  • Type

    jour

  • DOI
    10.1109/TIT.2008.929939
  • Filename
    4655444