Title :
Person name identification in Chinese documents using finite state automata
Author :
Shen, Bing ; Zhongfei ; Yuan, Chunfa
Author_Institution :
Comput. Sci. Dept., Binghamton Univ., NY, USA
Abstract :
This research addresses the automatic identification and extraction of person names in Chinese text documents. Solutions to this problem have immediate and extensive applications in many areas, especially in Web intelligent agent applications such as Web search engines, Web data mining, and automatic Web information analysis. We have noted that while finite state automata (FSA) based techniques have been used extensively in natural language processing (NLP) and information extraction (IE) for English, they have not yet been used extensively in processing Chinese text; in particular, to our knowledge, no work has been reported on using FSA for person name identification and extraction. Motivated by this need, we propose an FSA-based person name identification method called NICF. Evaluations show that NICF performs very well in terms of identification recall and accuracy, as well as processing speed, and thus holds great promise for future applications.
Keywords :
Web sites; automata theory; data mining; finite state machines; search engines; text analysis; Chinese document; Chinese text document; FSA; IE; NICF; NLP; Web information analysis; Web intelligent agents; Web search engine; automatic extraction; automatic identification; finite state automata; person name identification; Application software; Automata; Computer science; Data mining; Information analysis; Intelligent agent; Internet; Robustness; Search engines; Web search;
Conference_Title :
Intelligent Agent Technology, 2003. IAT 2003. IEEE/WIC International Conference on
Print_ISBN :
0-7695-1931-8
DOI :
10.1109/IAT.2003.1241125