DocumentCode
3559050
Title
Statistical Analysis of Bayes Optimal Subset Ranking
Author
Cossock, David ; Zhang, Tong
Author_Institution
Yahoo Inc., Sunnyvale, CA
Volume
54
Issue
11
fYear
2008
Firstpage
5140
Lastpage
5154
Abstract
The ranking problem has become increasingly important in modern applications of statistical methods in automated decision-making systems. In particular, we consider a formulation of the statistical ranking problem which we call subset ranking, and focus on the discounted cumulated gain (DCG) criterion that measures the quality of items near the top of the rank-list. Similar to error minimization for binary classification, direct optimization of natural ranking criteria such as DCG leads to nonconvex optimization problems that can be NP-hard. Therefore, a computationally more tractable approach is needed. We present bounds that relate the approximate optimization of DCG to the approximate minimization of certain regression errors. These bounds justify the use of convex learning formulations for solving the subset ranking problem. The resulting estimation methods are not conventional, in that we focus on the estimation quality in the top portion of the rank-list. We further investigate the asymptotic statistical behavior of these formulations. Under appropriate conditions, the consistency of the estimation schemes with respect to the DCG metric can be derived.
Keywords
Bayes methods; concave programming; convex programming; decision making; learning (artificial intelligence); pattern classification; query formulation; search engines; set theory; statistical analysis; Bayes optimal subset ranking; DCG criterion; NP-hard problem; Web search query; automated decision making system; binary classification; convex learning formulation; discounted cumulated gain; nonconvex optimization; statistical analysis; Application software; Decision making; Electronic commerce; Gain measurement; Internet; Particle measurements; Search engines; Statistical analysis; Web pages; Web search; Bayes optimal; consistency; convex surrogate; ranking
fLanguage
English
Journal_Title
Information Theory, IEEE Transactions on
Publisher
IEEE
ISSN
0018-9448
Type
jour
DOI
10.1109/TIT.2008.929939
Filename
4655444