Regional Information Center for Science and Technology - An effective and economic bi-level approach to ranking and rating spam detection

Abstract:

Rating and ranking of items are important parts of modern electronic commerce. As a result, dishonest business owners spam these ecosystems in return for favorable product rankings, and consumers can be misled into purchasing low-quality products. To protect the interests of consumers, it is critical to spot spamming activities and maintain the ecosystem's health. Existing spam detection methods take either a microscopic or a macroscopic view of the problem, but not both. On the one hand, microscopic methods work on the scale of individual ratings and can be trapped in ratings that are less harmful to the ecosystem's health, leading to sub-optimal allocation of human effort. On the other hand, macroscopic approaches focus on ratings that can manipulate the ecosystem on a larger scale. However, the macroscopic signals they inspect can be only tangentially connected to the most critical system health statuses, leading to hard-to-measure spam detection outcomes. Further, these macroscopic methods lack a consistent way to drill down to the microscopic scale and detect actual spam. To address these drawbacks, we propose a bi-level framework that unifies both perspectives to pinpoint suspicious ratings that affect the ecosystem more directly and significantly, so that limited human effort is allocated to maintaining ecosystem health effectively and economically. The framework revolves around the notion of ranking regularity. It first constructs a system health signal from an approximation of the ground-truth ranking, obtained by aggregating multiple noisy crowdsourced rankings with only a minimum of expensive expert input. This signal helps the framework drill down to critical regions, where a microscopic method pinpoints suspicious individual ratings for human investigation. We obtain promising experimental results on datasets from mainstream restaurant rating websites.
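
The abstract does not say which aggregation rule or health signal the framework uses, so the following is only a minimal sketch of the general idea under stated assumptions: a Borda count stands in for the aggregation of noisy crowdsourced rankings, the normalized Kendall tau distance stands in for the system health signal, and positional displacement stands in for the drill-down step. The function names and the toy rankings are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def borda_aggregate(rankings):
    """Aggregate several rankings (best item first) into one consensus ranking.

    Assumption: Borda count is used here as a stand-in aggregation rule.
    """
    items = sorted(rankings[0])
    n = len(items)
    scores = {item: 0 for item in items}
    for ranking in rankings:
        for position, item in enumerate(ranking):
            # The top item earns n - 1 points, the last item earns 0.
            scores[item] += n - 1 - position
    # Sort by descending score; ties are broken by item name for determinism.
    return sorted(items, key=lambda item: (-scores[item], item))


def kendall_tau_distance(ranking_a, ranking_b):
    """Fraction of item pairs that the two rankings order differently.

    Assumption: this serves as an illustrative health signal (0 = identical).
    """
    pos_a = {item: i for i, item in enumerate(ranking_a)}
    pos_b = {item: i for i, item in enumerate(ranking_b)}
    pairs = list(combinations(ranking_a, 2))
    discordant = sum(
        1
        for x, y in pairs
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )
    return discordant / len(pairs)


def most_displaced_items(observed, reference, top_k=1):
    """Return the items whose observed position differs most from the reference.

    Assumption: a crude proxy for choosing regions to inspect more closely.
    """
    pos_ref = {item: i for i, item in enumerate(reference)}
    displacement = {
        item: abs(i - pos_ref[item]) for i, item in enumerate(observed)
    }
    return sorted(displacement, key=lambda item: (-displacement[item], item))[:top_k]


# Hypothetical toy data: three noisy crowd rankings of four restaurants.
crowd_rankings = [
    ["A", "B", "C", "D"],
    ["A", "C", "B", "D"],
    ["B", "A", "C", "D"],
]
# Hypothetical ranking shown on the site, where "D" may have been promoted.
site_ranking = ["D", "A", "B", "C"]

consensus = borda_aggregate(crowd_rankings)
health_signal = kendall_tau_distance(site_ranking, consensus)
suspects = most_displaced_items(site_ranking, consensus, top_k=1)
print(consensus, health_signal, suspects)
```

In this sketch, a large distance between the site ranking and the consensus would suggest the ecosystem deserves attention, and the most displaced items mark where a rating-level (microscopic) detector could be applied first.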