Regional Information Center for Science and Technology (RICEST)

DocumentCode :

1586644

Title :

Extracting company names from text

Author :

Rau, Lisa F.

Author_Institution :

GE Res. & Dev. Center, Schenectady, NY, USA

Year :

1991

Firstpage :

Lastpage :

Abstract :

A detailed description is given of an implemented algorithm that automatically extracts company names from financial news. Extracting company names from text is one problem; recognizing subsequent references to a company is another. The author addresses both problems in an implemented, well-tested module that operates as a process detachable from a set of natural language processing tools. The algorithm combines heuristics, exception lists, and extensive corpus analysis, and it generates the most likely variations that each company name may go by, for use in subsequent retrieval. Tested on over one million words of naturally occurring financial news, the system extracted thousands of company names with over 95% accuracy (precision) relative to a human indexer, and it extracted 25% more companies than a human had indexed.
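
The abstract names the ingredients (heuristics, exception lists, name variations) but this record does not give the rules themselves. The following is only a minimal sketch of how such ingredients might fit together, not the author's algorithm. The suffix list, the exception list, and the function names `extract_company_names` and `name_variations` are all hypothetical, chosen for illustration.

```python
# Illustrative sketch only: the lists and rules below are assumptions,
# not taken from the paper described in this record.

# Hypothetical company-suffix cues (heuristic trigger words).
COMPANY_SUFFIXES = {"Inc", "Corp", "Co", "Ltd"}

# Hypothetical exception list: capitalized words that should not start a name.
EXCEPTIONS = {"The", "A", "An", "In", "At", "For", "But", "And"}

PUNCT = ".,;:"


def extract_company_names(text):
    """Find a suffix cue, then scan left over capitalized words."""
    tokens = text.split()
    names = []
    for i, tok in enumerate(tokens):
        suffix = tok.strip(PUNCT)
        if suffix not in COMPANY_SUFFIXES:
            continue
        start = i
        while start > 0:
            prev = tokens[start - 1]
            # Stop at punctuation (likely a clause boundary) or a lowercase word.
            if prev[-1] in PUNCT or not (prev[0].isupper() or prev == "&"):
                break
            start -= 1
        # Drop leading words that appear on the exception list.
        while start < i and tokens[start] in EXCEPTIONS:
            start += 1
        if start < i:
            names.append(" ".join(tokens[start:i] + [suffix]))
    return names


def name_variations(name):
    """Generate likely shorter forms of a name for later matching."""
    words = name.split()
    core = words[:-1]  # name without its suffix
    variants = [name, " ".join(core)]
    if len(core) > 1:
        variants.append(core[0])  # first word alone
        variants.append("".join(w[0] for w in core if w[0].isupper()))  # acronym
    return variants


if __name__ == "__main__":
    sample = "The General Electric Co. said it will buy Acme Widget Corp. for cash."
    for company in extract_company_names(sample):
        print(company, "->", name_variations(company))
```

In this sketch the variations returned by `name_variations` could be matched against later text to recognize subsequent references to the same company, which is the second problem the abstract mentions. A real system would need far larger lists and corpus-derived rules.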

Keywords :

computerised pattern recognition; financial data processing; information retrieval; natural languages; word processing; company names; corpus analysis; detachable process; exception lists; financial news; heuristics; natural language processing tools; naturally occurring financial news; retrieval; well-tested module; Artificial intelligence; Databases; Frequency; Humans; Laboratories; Natural language processing; Natural languages; Research and development; Testing; Text recognition

Language :

English

Publisher :

IEEE

Conference_Title :

Artificial Intelligence Applications, 1991. Proceedings., Seventh IEEE Conference on

Conference_Location :

Miami Beach, FL

Print_ISBN :

0-8186-2135-4

Type :

conf

DOI :

10.1109/CAIA.1991.120841

Filename :

120841

Link To Document :

https://search.ricest.ac.ir/dl/search/defaultta.aspx?DTC=49&DC=1586644