DocumentCode
1984085
Title
A novel technique for words reordering based on N-grams
Author
Athanaselis, Theologos ; Bakamidis, Stelios ; Dologlou, Ioannis
Author_Institution
Inst. for Language & Speech Process., Athens
fYear
2007
fDate
12-15 Feb. 2007
Firstpage
1
Lastpage
4
Abstract
This paper presents an approach for repairing word order errors in English text by reordering the words in a sentence and choosing the version that maximizes the number of trigram hits according to a language model. The novelty of this method lies in its use of an efficient confusion matrix technique for reordering the words. Unigram probabilities are used to further reduce the number of permutations. The comparative advantage of this method is that it works with a large set of words and avoids the laborious and costly process of collecting word order errors to create error patterns.
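The selection criterion in the abstract (choose the reordering with the most trigram hits) can be illustrated with a minimal sketch. This is not the authors' implementation: it enumerates every permutation by brute force rather than using the confusion matrix technique or the unigram-based pruning, whose details the abstract does not give. The names `trigram_hits`, `best_reordering`, and `TOY_TRIGRAMS` are hypothetical, and the small trigram set stands in for a real language model.

```python
from itertools import permutations


def trigram_hits(words, known_trigrams):
    """Count the consecutive word triples that appear in the known trigram set."""
    return sum(
        1
        for i in range(len(words) - 2)
        if tuple(words[i:i + 3]) in known_trigrams
    )


def best_reordering(words, known_trigrams):
    """Return the permutation of `words` with the most trigram hits.

    Exhaustive search over all n! orderings, so this is only practical
    for short sentences (an assumption of this sketch, not of the paper).
    """
    return max(
        permutations(words),
        key=lambda candidate: trigram_hits(candidate, known_trigrams),
    )


# Hypothetical toy data standing in for a trigram language model.
TOY_TRIGRAMS = {
    ("the", "cat", "sat"),
    ("cat", "sat", "down"),
}

if __name__ == "__main__":
    scrambled = ["sat", "the", "down", "cat"]
    print(" ".join(best_reordering(scrambled, TOY_TRIGRAMS)))
```

Because the number of permutations grows factorially with sentence length, a pruning step such as the one the abstract mentions would be needed for anything beyond short inputs.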
Keywords
natural languages; probability; text analysis; English text; confusion matrix; language model; trigram hits; word order errors; words reordering; Computer errors; Internet; Machine learning algorithms; Natural languages; Probability; Search engines; Speech processing; Testing; Text recognition; Writing;
fLanguage
English
Publisher
IEEE
Conference_Titel
Signal Processing and Its Applications, 2007. ISSPA 2007. 9th International Symposium on
Conference_Location
Sharjah
Print_ISBN
978-1-4244-0778-1
Electronic_ISBN
978-1-4244-1779-8
Type
conf
DOI
10.1109/ISSPA.2007.4555284
Filename
4555284