  • DocumentCode
    2139047
  • Title
    Moving Text Analysis Tools to the Cloud
  • Author
    Vashishtha, Himanshu; Smit, Michael; Stroulia, Eleni
  • Author_Institution
    Dept. of Comput. Sci., Univ. of Alberta, Edmonton, AB, Canada
  • fYear
    2010
  • fDate
    5-10 July 2010
  • Firstpage
    107
  • Lastpage
    114
  • Abstract
    Text analysis is an important computational task, as unstructured data, including text, abound and can provide interesting information and knowledge in a variety of areas. In our collaboration with Digital Humanists, we have started to examine the opportunities that the cloud offers for improving the response times of text-analysis tools, so that users can comparatively analyze large text corpora across a variety of dimensions. To that end, we have started migrating existing text analysis tools to the cloud, beginning with TAPoR, the Text Analysis Portal for Research. In this paper, we discuss our experience redesigning and re-implementing four basic TAPoR operations on Hadoop, and we report on the performance improvements enabled by the migration.
  • Keywords
    Internet; humanities; portals; text analysis; Hadoop; TAPoR; cloud computing; computational task; digital humanists; text analysis portal for research; text analysis tools; text-analysis response time; Clouds; HTML; Indexes; Java; Simple object access protocol; MapReduce; web services
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Title
    Services (SERVICES-1), 2010 6th World Congress on
  • Conference_Location
    Miami, FL
  • Print_ISBN
    978-1-4244-8199-6
  • Electronic_ISBN
    978-0-7695-4129-7
  • Type
    conf
  • DOI
    10.1109/SERVICES.2010.91
  • Filename
    5575783