DocumentCode :
1773946
Title :
PEWP: Process extraction based on word position in documents
Author :
Yuchen Chen ; ZhiJun Ding ; Haichun Sun
Author_Institution :
Dept. of Comput. Sci. & Technol., Tongji Univ., Shanghai, China
fYear :
2014
fDate :
Sept. 29 2014-Oct. 1 2014
Firstpage :
135
Lastpage :
140
Abstract :
Search engines are popular and useful tools that help people quickly find the information they need. However, some sequential information, such as “What should be prepared when applying for a visa”, “Where can one apply for a visa” and “How long does it take to get a visa”, often cannot be obtained in full from a traditional search engine. Yet this sequential information is valuable because it helps people understand the steps needed to get something done. In this paper, the PEWP method automatically obtains step sequence information based on the ideas of process and text mining, considering both word position and word frequency. The experiment compares PEWP with topic extraction, and the results show that PEWP performs better: its output is almost strictly ordered, and its recall rate is nearly 71% on average.
Keywords :
data mining; feature extraction; search engines; text analysis; word processing; PEWP; process extraction based on word position in documents; search engine; text mining; Cleaning; Context; Licenses; Registers; Standards; Text mining; process extraction; text mining; word position;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Digital Information Management (ICDIM), 2014 Ninth International Conference on
Conference_Location :
Phitsanulok
Type :
conf
DOI :
10.1109/ICDIM.2014.6991399
Filename :
6991399