Title of article :
Are Raw RSS Feeds Suitable for Broad Issue Scanning?
A Science Concern Case Study
Author/Authors :
Mike Thelwall (author), Rudy Prabowo (author), Ruth Fairclough (author)
Issue Information :
Monthly journal, serial-numbered issue, year 2006
Abstract :
Broad issue scanning is the task of identifying important
public debates arising within a given broad issue; Really
Simple Syndication (RSS) feeds are a natural information
source for investigating broad issues. RSS, as originally
conceived, is a method for publishing timely and
concise information on the Internet, for example, about
the main stories in a news site or the latest postings in
a blog. RSS feeds are potentially a nonintrusive source
of high-quality data about public opinion: Monitoring a
large number may allow quantitative methods to extract
information relevant to a given need. In this article we
describe an RSS feed-based coword frequency method
to identify bursts of discussion relevant to a given
broad issue. A case study of public science concerns is
used to demonstrate the method and assess the suitability
of raw RSS feeds for broad issue scanning (i.e.,
without data cleansing). However, an attempt to identify genuine
science concern debates from the corpus by investigating
the top 1,000 “burst” words found only two.
The low success rate was
mainly caused by a few pathological feeds that dominated
the results and obscured any significant debates.
The results point to the need to develop effective data
cleansing procedures for RSS feeds, particularly if there
is not a large quantity of discussion about the broad
issue, and a range of potential techniques is suggested.
Finally, the analysis confirmed that the time series information
generated by real-time monitoring of RSS feeds
could usefully illustrate the evolution of new debates
relevant to a broad issue.
Journal title :
Journal of the American Society for Information Science and Technology