Title of article :
RetriBlog: An architecture-centered framework for developing blog crawlers
Author/Authors :
Ferreira, Rafael and Freitas, Fred and Brito, Patrick and Melo, Jean and Lima, Rinaldo and Costa, Evandro
Issue Information :
Journal with serial issue number, year 2013
Abstract :
Blogs have become an important social tool. They allow users to share their tastes, express their opinions, report news, and form groups around particular subjects, among other things. The information obtained from the blogosphere may be used to build applications in various fields. However, due to the growing number of blogs posted every day, as well as the dynamic nature of the blogosphere, extracting relevant information from blogs has become difficult and time-consuming. In this paper, we use information retrieval and extraction techniques to deal with this problem. Furthermore, because blogs have many variation points, applications that process them must be easy to adapt. Faced with this scenario, this work proposes RetriBlog, an architecture-centered framework for the development of blog crawlers. Finally, it presents an evaluation of the proposed algorithms and three case studies.
Keywords :
Social web, Blog crawler, Tag recommendation, Content extraction
Journal title :
Expert Systems with Applications