• DocumentCode
    1122772
  • Title

    An approach to designing very fast approximate string matching algorithms

  • Author

    Du, M.-W. ; Chang, S.C.

  • Author_Institution
    Comput. and Intelligent Syst. Lab., GTE Labs. Inc., Waltham, MA, USA
  • Volume
    6
  • Issue
    4
  • fYear
    1994
  • fDate
    8/1/1994 12:00:00 AM
  • Firstpage
    620
  • Lastpage
    633
  • Abstract
    An approach to designing very fast algorithms for approximate string matching in a dictionary is proposed. Multiple spelling errors corresponding to insert, delete, change, and transpose operations on character strings are considered in the fault model. The design of very fast approximate string matching algorithms through a four-step reduction procedure is described. The final and most effective step uses hashing techniques to avoid comparing the given word with words at large distances. The technique has been applied to a library book catalog textbase. The experiments show that performing approximate string matching for a large dictionary in real-time on an ordinary sequential computer under our multiple fault model is feasible
  • Keywords
    database theory; error correction; query processing; search problems; character strings; dictionary; error correction; fast approximate string matching algorithms; fault model; four-step reduction procedure; hashing techniques; information retrieval; large dictionary; library book catalog textbase; multiple fault model; nearest neighbor search; sequential computer; spelling errors; Algorithm design and analysis; Books; Computer applications; Computer errors; Databases; Dictionaries; Error correction; Information retrieval; Libraries; Nearest neighbor searches;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.298177
  • Filename
    298177