DocumentCode :
2971341
Title :
Analyzing the performance differences between pattern matching and compressed pattern matching on texts
Author :
Erdogan, Can ; Nusret Bulus, H. ; Diri, B.
fYear :
2013
fDate :
7-9 Nov. 2013
Firstpage :
135
Lastpage :
138
Abstract :
In this study, the statistics of pattern matching on text data are compared with the statistics of compressed pattern matching on the compressed form of the same text. A new application has been developed to count the number of character comparisons in compressed and uncompressed texts separately. A new text compression algorithm is also presented that allows compressed pattern matching with classical pattern matching algorithms, without any modification. The presented algorithm, based on digram and trigram substitution, achieves a compression factor of about 30-35%, and compressed pattern matching on the compressed text takes less time than pattern matching on the uncompressed text. It is also confirmed that compressed pattern matching on compressed text performs fewer character comparisons than pattern matching on uncompressed text. The aim of the developed compression algorithm is thus to highlight the difference in text processing between compressed and uncompressed text and to inform other applications.
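The idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details are not given in this record. It assumes ASCII input, substitutes digrams only (no trigrams), and maps the most frequent digrams to code points from 128 upward. The names `build_digram_table`, `compress`, and `naive_search` are hypothetical. The pattern is compressed with the same table as the text, and an unmodified brute-force search counts character comparisons on both forms.

```python
from collections import Counter


def build_digram_table(text, max_codes=128):
    """Map the most frequent digrams to spare code points (hypothetical scheme).

    Assumes ASCII input, so code points 128 and above are unused.
    """
    counts = Counter(text[i:i + 2] for i in range(len(text) - 1))
    ranked = counts.most_common(max_codes)
    return {digram: chr(128 + k) for k, (digram, _) in enumerate(ranked)}


def compress(s, table):
    """Greedy left-to-right digram substitution."""
    out = []
    i = 0
    while i < len(s):
        digram = s[i:i + 2]
        if len(digram) == 2 and digram in table:
            out.append(table[digram])
            i += 2
        else:
            out.append(s[i])
            i += 1
    return "".join(out)


def naive_search(text, pattern):
    """Brute-force search; returns (match positions, character comparisons)."""
    comparisons = 0
    hits = []
    for start in range(len(text) - len(pattern) + 1):
        j = 0
        while j < len(pattern):
            comparisons += 1
            if text[start + j] != pattern[j]:
                break
            j += 1
        else:
            # Inner loop finished without a mismatch: full match at `start`.
            hits.append(start)
    return hits, comparisons


if __name__ == "__main__":
    text = "abababab"
    table = build_digram_table(text, max_codes=1)
    packed = compress(text, table)
    print("compression factor:", 1 - len(packed) / len(text))
    print("uncompressed:", naive_search(text, "abab"))
    print("compressed:  ", naive_search(packed, compress("abab", table)))
```

A caveat of this simplified sketch: greedy substitution only finds occurrences whose token boundaries line up in the text and the pattern, so it can miss matches that start in the middle of a substituted digram. How the paper handles this is not stated in the abstract.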
Keywords :
data compression; pattern matching; statistical analysis; text analysis; character matching numbers; classical pattern matching algorithms; compressed pattern matching; compression factor; digram substitution; statistics; text compression algorithm; text data; text processing; trigram substitution; uncompressed texts; Compression algorithms; Dictionaries; Encoding; Force; Indexes; Pattern Substitution
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Electronics, Computer and Computation (ICECCO), 2013 International Conference on
Conference_Location :
Ankara
Type :
conf
DOI :
10.1109/ICECCO.2013.6718247
Filename :
6718247