DocumentCode
3334515
Title
A non word error spell checker for Indonesian using morphologically analyzer and HMM
Author
Soleh, M.Y. ; Purwarianti, Ayu
Author_Institution
Dept. of Inf., Bandung Inst. of Technol., Bandung, Indonesia
fYear
2011
fDate
17-19 July 2011
Firstpage
1
Lastpage
6
Abstract
A spell checker involves two main tasks: error detection and error correction. In this study, a spell checker is built using a morphological analyzer and dictionary lookup for error detection, with two alternative optimizations: binary search and hashing. For error correction, two alternative methods are used: forward reversed dictionary and probability of similarity. Forward reversed dictionary corrects a misspelled word by considering the edit distance between the misspelled word and its candidates. Probability of similarity, the main proposed method for error correction, corrects a misspelled word by calculating its similarity to a candidate word, based on the value of the optimum subsequence between them. Candidates are sorted using an HMM (Hidden Markov Model), in which the word is treated as the observed state and the candidates as hidden states. With the HMM, the system considers not only the similarity of a candidate word to the misspelled word but also the sequence of words in the sentence where the word occurs. The experimental results show that sorting candidates with the HMM increases precision. For the correction method, the results show that probability of similarity achieves higher correction accuracy than forward reversed dictionary.
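The detection and similarity steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not give the similarity formula, so the score below assumes a normalized longest common subsequence (LCS) as a stand-in for the "optimum subsequence" value. The lexicon, function names, and `top_n` parameter are hypothetical, and the morphological analysis and HMM re-ranking steps are omitted.

```python
from bisect import bisect_left


def lcs_length(a: str, b: str) -> int:
    """Length of the longest common subsequence of a and b (row-wise DP)."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def similarity(word: str, candidate: str) -> float:
    """Assumed score: LCS length normalized by the two word lengths."""
    if not word or not candidate:
        return 0.0
    return 2 * lcs_length(word, candidate) / (len(word) + len(candidate))


def in_sorted_lexicon(word: str, sorted_lexicon: list[str]) -> bool:
    """Dictionary lookup by binary search over a sorted word list."""
    i = bisect_left(sorted_lexicon, word)
    return i < len(sorted_lexicon) and sorted_lexicon[i] == word


def rank_candidates(word: str, lexicon: list[str], top_n: int = 3) -> list[str]:
    """Sort lexicon entries by descending similarity; ties break alphabetically."""
    return sorted(lexicon, key=lambda c: (-similarity(word, c), c))[:top_n]


# Hypothetical toy lexicon of Indonesian words, for illustration only.
LEXICON = ["makan", "makanan", "minum", "main", "malam"]
SORTED_LEXICON = sorted(LEXICON)   # for the binary-search variant
HASHED_LEXICON = set(LEXICON)      # for the hash variant

if __name__ == "__main__":
    typo = "makn"
    if typo not in HASHED_LEXICON:  # non-word error detected
        print(rank_candidates(typo, LEXICON))
```

Both lookup variants give the same answer. The hash set offers average constant-time membership tests, while binary search needs only a sorted list. In the paper, the candidate order is further refined by an HMM that takes the surrounding words into account, which this sketch does not attempt.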
Keywords
hidden Markov models; natural language processing; probability; HMM; Indonesian nonword error spell checker; binary search optimization; candidate sorting; dictionary lookup; edit distance; error correction; error detection method; forward reversed dictionary method; hash optimization; hidden Markov model; hidden state; morphologically analyzer; observed state; similarity probability method; Accuracy; Dictionaries; Equations; Forward error correction; Hidden Markov models; Mathematical model; Probability; HMM; Indonesian spell checker; forward reverse dictionary; morphologically analyzer; non-word error; probability of similarity;
fLanguage
English
Publisher
IEEE
Conference_Title
Electrical Engineering and Informatics (ICEEI), 2011 International Conference on
Conference_Location
Bandung
ISSN
2155-6822
Print_ISBN
978-1-4577-0753-7
Type
conf
DOI
10.1109/ICEEI.2011.6021514
Filename
6021514