• DocumentCode
    3716486
  • Title

    SSIE: An Automatic Data Extractor for Sports Management in Athletics Modality

  • Author

    Simões;Fabio Matsunaga;Armando Toda;Jacques Brancher;Abdallah Junior;Rosangela Busto

  • Author_Institution
    Comput. Sci. Dept., Londrina State Univ., Londrina, Brazil
  • fYear
    2015
  • Firstpage
    144
  • Lastpage
    151
  • Abstract
    Sports management concerns the organization of information on sport results and modalities, and its statistical analysis by professionals. However, this information is scattered around the web or organized by sport event, which makes it difficult to prospect for sport talents, and the textual information is unstructured or semi-structured. This work proposes a Summary Sport Information Extraction System (SSIE) that generates a statistical summary of the athletics modality through automatic information extraction from documents retrieved from the web. These documents are converted to text and classified by sport type using the Naive Bayes learning method. After document retrieval and classification, text segmentation/tokenization, corpus annotation, and entity/subset recognition by chunking are used to generate data frames in a parse-tree structure. The parse-tree information is stored in a database, which makes summary projection and big data analysis over the web possible. The main contribution of this work is the clustering of a huge amount of data spread across the web, which is useful for sports management.
  • Keywords
    "Data mining","Portable document format","Feature extraction","Web pages","Knowledge based systems","Learning systems"
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Technology; Ubiquitous Computing and Communications; Dependable, Autonomic and Secure Computing; Pervasive Intelligence and Computing (CIT/IUCC/DASC/PICOM), 2015 IEEE International Conference on
  • Type

    conf

  • DOI
    10.1109/CIT/IUCC/DASC/PICOM.2015.23
  • Filename
    7363064