DocumentCode
3208073
Title
Supervised and unsupervised automatic spelling correction algorithms
Author
Van Delden, Sebastian ; Bracewell, David ; Gomez, Fernando
Author_Institution
Dept. of Math. & Comput. Sci., South Carolina Univ., Spartanburg, SC, USA
fYear
2004
fDate
8-10 Nov. 2004
Firstpage
530
Lastpage
535
Abstract
We present two algorithms for automatically improving the quality of texts which contain a large number of spelling errors. A supervised algorithm, which automatically corrects unknown words that are generated primarily from typing errors, is presented first. The second algorithm is an unsupervised approach to automatically correcting typing errors, individual words that have been split, multiple words which have been concatenated, and a combination of these errors. The algorithms have been developed and tested on a large source of real-world, human- and machine-generated spelling errors.
Keywords
natural languages; text analysis; word processing; automatic spelling correction algorithms; spelling errors; supervised algorithm; unsupervised approach; Computer errors; Computer science; Concatenated codes; Databases; Error correction; Filtering algorithms; Humans; Information retrieval; NASA; Natural languages
fLanguage
English
Publisher
IEEE
Conference_Titel
Information Reuse and Integration, 2004. IRI 2004. Proceedings of the 2004 IEEE International Conference on
Print_ISBN
0-7803-8819-4
Type
conf
DOI
10.1109/IRI.2004.1431515
Filename
1431515