• DocumentCode
    2053744
  • Title
    Event Extraction from Turkish Football Web-casting Texts Using Hand-crafted Templates
  • Author
    Tunaoglu, D. ; Alan, Özgür ; Sabuncu, Orkunt ; Akpinar, Samet ; Çiçekli, Nihan K. ; Alpaslan, Ferda N.
  • Author_Institution
    METU Technopolis, Orbim Corp., Ankara, Turkey
  • fYear
    2009
  • fDate
    14-16 Sept. 2009
  • Firstpage
    466
  • Lastpage
    472
  • Abstract
    In this paper, we present a domain-specific information extraction approach. We use manually crafted templates to extract information from unstructured documents in which grammatical and syntactical errors occur frequently. We applied our approach primarily to unstructured Turkish soccer Web-casting texts. Compared to automated approaches, we achieve high precision and recall (97% and 85%, respectively). In addition, unlike automated approaches, we do not use part-of-speech taggers, parsers, phrase chunkers, or similar linguistic tools. As a result, our approach can be applied to any domain or language without requiring effective linguistic tools for it. The drawback of our approach is the time spent crafting the templates. We also propose means of reducing that time.
  • Keywords
    Internet; information retrieval; sport; text analysis; Turkish football Web-casting text; domain specific information extraction; event extraction; grammatical error; hand-crafted template; syntactical error; Buildings; Cities and towns; Computer errors; Data mining; Games; Information analysis; Intelligent systems; Ontologies; Search engines; Semantic Web
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Semantic Computing, 2009. ICSC '09. IEEE International Conference on
  • Conference_Location
    Berkeley, CA
  • Print_ISBN
    978-1-4244-4962-0
  • Electronic_ISBN
    978-0-7695-3800-6
  • Type
    conf
  • DOI
    10.1109/ICSC.2009.16
  • Filename
    5298635