DocumentCode :
606014
Title :
Quasi-linear-time substring searching by q-gram distance
Author :
Hanada, Hirotoshi ; Kudo, Motoi ; Nakamura, A.
Author_Institution :
Grad. Sch. of Inf. Sci. & Technol., Hokkaido Univ., Sapporo, Japan
fYear :
2012
fDate :
23-25 Oct. 2012
Firstpage :
544
Lastpage :
549
Abstract :
The q-gram distance dq(x, y) between two strings x and y is a string similarity measure correlated with a famous string distance: the edit distance. In addition, it can be computed much faster, in linear (O(|x|+|y|)) time, than the edit distance in quadratic (O(|x||y|)) time, where | · | denotes the string length. However, it does not mean that we can find all substrings of a text t similar to a pattern p in linear time. In this paper we will propose a searching algorithm achieving quasi-linear (O(|t| log |p| + |p|)) time by the q-gram distance.
Keywords :
computational complexity; string matching; text analysis; edit distance; linear (O(|x|+|y|)) time; q-gram distance; quadratic (O(|x||y|)) time; quasilinear (O(|t| log |p| + |p|)) time; quasilinear-time substring searching; searching algorithm; string length; string matching; string similarity measure; text substrings; approximate string matching; q-gram distance; string searching;
fLanguage :
English
Publisher :
ieee
Conference_Titel :
Information Science and Service Science and Data Mining (ISSDM), 2012 6th International Conference on New Trends in
Conference_Location :
Taipei
Print_ISBN :
978-1-4673-0876-2
Type :
conf
Filename :
6528694
Link To Document :
بازگشت