• DocumentCode
    3208073
  • Title
    Supervised and unsupervised automatic spelling correction algorithms
  • Author
    Van Delden, Sebastian ; Bracewell, David ; Gomez, Fernando
  • Author_Institution
    Dept. of Math. & Comput. Sci., South Carolina Univ., Spartanburg, SC, USA
  • fYear
    2004
  • fDate
    8-10 Nov. 2004
  • Firstpage
    530
  • Lastpage
    535
  • Abstract
    We present two algorithms for automatically improving the quality of texts which contain a large number of spelling errors. A supervised algorithm, which automatically corrects unknown words that are generated primarily from typing errors, is presented first. The second algorithm is an unsupervised approach to automatically correcting typing errors, individual words that have been split, multiple words which have been concatenated, and a combination of these errors. The algorithms have been developed and tested on a large source of real-world, human- and machine-generated spelling errors.
  • Keywords
    natural languages; text analysis; word processing; automatic spelling correction algorithms; spelling errors; supervised algorithm; unsupervised approach; Computer errors; Computer science; Concatenated codes; Databases; Error correction; Filtering algorithms; Humans; Information retrieval; NASA; Natural languages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Reuse and Integration, 2004. IRI 2004. Proceedings of the 2004 IEEE International Conference on
  • Print_ISBN
    0-7803-8819-4
  • Type
    conf
  • DOI
    10.1109/IRI.2004.1431515
  • Filename
    1431515