• DocumentCode
    1773946
  • Title

    PEWP: Process extraction based on word position in documents

  • Author

    Yuchen Chen ; ZhiJun Ding ; Haichun Sun

  • Author_Institution
    Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
  • fYear
    2014
  • fDate
    Sept. 29 2014-Oct. 1 2014
  • Firstpage
    135
  • Lastpage
    140
  • Abstract
    Search engines are popular and useful tools that help people quickly find the information they need. However, some sequence information, such as “What should be prepared when applying for a visa”, “Where can one apply for a visa” and “How long does it take to get a visa”, often cannot be obtained in full from a traditional search engine. Yet this sequence information is helpful for guiding people through the steps of a task. In this paper, the PEWP method automatically obtains step sequence information based on the ideas of process and text mining, considering both word position and word frequency at the same time. The experiment compares PEWP with topic extraction, and the results show that PEWP performs better: its output is almost strictly sorted, and its recall rate is nearly 71% on average.
  • Keywords
    data mining; feature extraction; search engines; text analysis; word processing; PEWP; process extraction based on word position in documents; search engine; text mining; Cleaning; Context; Licenses; Registers; Standards; Text mining; process extraction; text mining; word position;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management (ICDIM), 2014 Ninth International Conference on
  • Conference_Location
    Phitsanulok
  • Type

    conf

  • DOI
    10.1109/ICDIM.2014.6991399
  • Filename
    6991399