• DocumentCode
    3565634
  • Title

    Rule-based and machine learning approach for event sentence extraction in Indonesian online news articles

  • Author

    Abidin, Taufik Fuadi ; Dimyathi, Rahmad ; Ferdhiana, Ridha

  • Author_Institution
    Dept. of Inf., Syiah Kuala Univ., Banda Aceh, Indonesia
  • fYear
    2014
  • Firstpage
    25
  • Lastpage
    28
  • Abstract
    With the rapid maturation of Internet and web technology over the last decades, the number of Indonesian online news articles on the web is growing at an unprecedented pace. In this paper, we introduce a combined rule-based and machine learning approach for finding sentences that contain tropical disease information, such as the incidence date and the number of casualties, and we measure its accuracy. Given a set of web pages on tropical disease topics, we first use a rule-based algorithm to extract the sentences that match contextual and morphological patterns for a date and a number of casualties. We then classify the extracted sentences with a Support Vector Machine and collect those that contain tropical disease information. The results show that the proposed method works well and has good accuracy.
  • Keywords
    Internet; Web sites; diseases; knowledge based systems; learning (artificial intelligence); medical computing; pattern classification; support vector machines; Indonesian online news articles; Web pages; Web technology; contextual pattern; event sentence extraction; machine learning approach; morphological pattern; rule-based approach; sentence classification; sentence extraction; support vector machine; tropical disease information; Accuracy; Data mining; Dictionaries; Feature extraction; Kernel; accuracy measure
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology Systems and Innovation (ICITSI), 2014 International Conference on
  • Type

    conf

  • DOI
    10.1109/ICITSI.2014.7048232
  • Filename
    7048232
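
The rule-based first stage described in the abstract can be illustrated with a minimal sketch. This record does not give the paper's actual rules, so everything below is an assumption made for illustration: the names `DATE_RE`, `CASUALTY_RE`, `split_sentences`, and `candidate_sentences`, the Indonesian month list, the casualty keywords, and the choice to keep a sentence when it matches either pattern.

```python
import re

# Hypothetical patterns: the paper's real contextual and morphological
# rules are not listed in this record.
MONTHS = ("Januari|Februari|Maret|April|Mei|Juni|Juli|"
          "Agustus|September|Oktober|November|Desember")

# A day number followed by an Indonesian month name and an optional year.
DATE_RE = re.compile(
    rf"\b\d{{1,2}}\s+(?:{MONTHS})(?:\s+\d{{4}})?\b", re.IGNORECASE
)

# A number followed by a word that commonly denotes affected people or cases.
CASUALTY_RE = re.compile(
    r"\b\d+\s+(?:orang|warga|korban|pasien|penderita|kasus)\b", re.IGNORECASE
)


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def candidate_sentences(text):
    """Keep sentences that match the date pattern or the casualty pattern."""
    return [
        s for s in split_sentences(text)
        if DATE_RE.search(s) or CASUALTY_RE.search(s)
    ]


if __name__ == "__main__":
    sample = (
        "Demam berdarah menyerang Banda Aceh. "
        "Pada 12 Maret 2014, sebanyak 15 orang dirawat di rumah sakit. "
        "Cuaca hari ini cerah."
    )
    for sentence in candidate_sentences(sample):
        print(sentence)
```

In the two-stage approach the abstract describes, sentences kept by a filter of this kind would then be passed to a Support Vector Machine classifier; that second stage is not shown here.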