DocumentCode
2206147
Title
SubSift Web Services and Workflows for Profiling and Comparing Scientists and Their Published Works
Author
Price, Simon ; Flach, Peter A. ; Spiegler, Sebastian ; Bailey, Christopher ; Rogers, Nikki
Author_Institution
Inst. for Learning & Res. Technol., Univ. of Bristol, Bristol, UK
fYear
2010
fDate
7-10 Dec. 2010
Firstpage
182
Lastpage
189
Abstract
Scientific researchers, laboratories and organisations can be profiled and compared by analysing their published works, including documents ranging from academic papers to web sites, blog posts and Twitter feeds. This paper describes how the vector space model from information retrieval, more normally associated with full text search, has been employed in the open source SubSift software to support workflows to profile and compare such collections of documents. SubSift was originally designed to match submitted conference or journal papers to potential peer reviewers based on the similarity between the paper's abstract and the reviewer's publications as found in online bibliographic databases. The software is implemented as a family of RESTful web services that, composed into a re-usable workflow, have already been used to support several major data mining conferences. Alternative workflows and service compositions are now enabling other interesting applications.
Keywords
Web services; bibliographic systems; information retrieval; text analysis; SubSift Web services; Twitter; Web sites; blog posts; data mining; online bibliographic databases; text search; vector space model
fLanguage
English
Publisher
IEEE
Conference_Titel
e-Science (e-Science), 2010 IEEE Sixth International Conference on
Conference_Location
Brisbane, QLD
Print_ISBN
978-1-4244-8957-2
Electronic_ISBN
978-0-7695-4290-4
Type
conf
DOI
10.1109/eScience.2010.29
Filename
5693916