Title of article :
Matchsimile: A flexible approximate matching tool for searching proper names
Author/Authors :
Gonzalo Navarro, Ricardo Baeza-Yates, João Marcelo Azevedo Arcoverde
Issue Information :
Monthly, serial issue, 2003
Pages :
13
From page :
3
To page :
15
Abstract :
We present the architecture and algorithms behind Matchsimile, an approximate string matching lookup tool especially designed for extracting person and company names from large texts. Part of a larger information extraction environment, this engine receives a large set of proper names to search for, a text to search, and search options, and outputs all occurrences of the names found in the text. Beyond the similarity search capabilities applied at the intraword level, the tool applies a set of person name formation rules at the word level, such as combination, abbreviation, duplicate detection, ordering, and word omission and insertion, among others. This engine is used in a successful commercial application (also named Matchsimile), which allows searching for lawyer names in official law publications.
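The two levels of matching described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the function names, the error threshold, and the simplified rules (initials as abbreviations, a bounded number of omitted words, fixed word order) are assumptions chosen only for illustration, and rules such as reordering or word insertion are left out.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def words_match(pattern_word: str, text_word: str, max_errors: int = 1) -> bool:
    """Intraword test (hypothetical rule set): accept an initial such as 'R.'
    or a word within max_errors edits of the pattern word."""
    p = pattern_word.lower()
    t = text_word.lower().rstrip(".")
    if len(t) == 1:
        # Assumed abbreviation rule: a single letter stands for the whole word.
        return p.startswith(t)
    return edit_distance(p, t) <= max_errors


def name_matches(name: str, candidate: str,
                 max_errors: int = 1, max_omitted: int = 1) -> bool:
    """Word-level test (hypothetical): candidate words must match name words
    in order, and at most max_omitted name words may be missing."""
    name_words = name.split()
    cand_words = candidate.split()
    if not cand_words:
        return False
    i = 0          # index of the next unmatched word in the name
    omitted = 0    # name words skipped so far
    for cw in cand_words:
        # Greedily skip name words that do not match the current text word.
        while i < len(name_words) and not words_match(name_words[i], cw, max_errors):
            i += 1
            omitted += 1
        if i == len(name_words):
            return False  # a text word matched no remaining name word
        i += 1
    omitted += len(name_words) - i  # trailing name words absent from the text
    return omitted <= max_omitted
```

For example, under these assumptions `name_matches("Ricardo Baeza Yates", "R. Baeza Yates")` accepts the abbreviated form, while `name_matches("Gonzalo Navarro", "Ricardo Navarro")` rejects a different first name. The greedy left-to-right scan keeps the sketch short; a production engine searching a large name set would need indexed and more carefully tuned matching.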
Journal title :
Journal of the American Society for Information Science and Technology
Serial Year :
2003
Record number :
993320