Title :
Efficient and accurate approach for approximate string search in spatial dataset
Author :
Nikam, Pratiksha Praful
Author_Institution :
Dept. of Comput. Eng., GSMCOE, Pune, India
Abstract :
This paper proposes a new index and method for approximate string search in spatial databases. Specifically, the candidate generation task is as follows: given a misspelled location name, the system finds the locations in the OSM dataset that are most similar to it. The proposed approach uses a log-linear model, defined as a conditional probability distribution over a corrected word and a set of correction rules, conditioned on the misspelled location name. The correction rules are stored and applied using an Aho-Corasick tree, referred to as the rule index, and an efficient algorithm based on Aho-Corasick is guaranteed to find the top-k candidates. Experiments on a large real OSM dataset demonstrate that the proposed method is more accurate than existing methods.
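The rule-index idea in the abstract can be illustrated with a minimal Python sketch. This is not the paper's implementation: the rule table, the weights, and the function names (`build_automaton`, `find_matches`, `top_k_candidates`) are hypothetical. The sketch also applies only one correction rule per candidate, whereas the abstract describes a log-linear model that scores sets of rules. An Aho-Corasick automaton locates every rule left-hand side in the misspelled name in a single pass, and candidates are ranked by rule weight.

```python
from collections import deque

def build_automaton(patterns):
    """Build an Aho-Corasick automaton (goto, failure, output tables)."""
    goto, fail, out = [{}], [0], [[]]
    for p in patterns:
        s = 0
        for ch in p:
            if ch not in goto[s]:
                goto.append({})
                fail.append(0)
                out.append([])
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]
        out[s].append(p)
    # Breadth-first traversal sets failure links; depth-1 states fail to root.
    queue = deque(goto[0].values())
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            queue.append(t)
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            # Inherit patterns that end at the failure target (proper suffixes).
            out[t] = out[t] + out[fail[t]]
    return goto, fail, out

def find_matches(text, automaton):
    """Return (start_index, pattern) for every pattern occurrence in text."""
    goto, fail, out = automaton
    s, hits = 0, []
    for i, ch in enumerate(text):
        while s and ch not in goto[s]:
            s = fail[s]
        s = goto[s].get(ch, 0)
        for p in out[s]:
            hits.append((i - len(p) + 1, p))
    return hits

def top_k_candidates(word, rules, k):
    """Apply one rule at each match site and keep the k best-weighted results.

    `rules` maps a left-hand side to a list of (replacement, weight) pairs;
    a higher weight means a more plausible correction (hypothetical values).
    """
    automaton = build_automaton(rules.keys())
    scored = {}
    for start, lhs in find_matches(word, automaton):
        for rhs, weight in rules[lhs]:
            cand = word[:start] + rhs + word[start + len(lhs):]
            scored[cand] = max(weight, scored.get(cand, float("-inf")))
    return sorted(scored.items(), key=lambda kv: -kv[1])[:k]

# Hypothetical rule table for illustration only.
example_rules = {"gl": [("gal", -0.4)], "ore": [("or", -1.2)], "b": [("m", -2.0)]}
print(top_k_candidates("banglore", example_rules, 2))
# → [('bangalore', -0.4), ('banglor', -1.2)]
```

A fuller version would presumably keep only candidates that occur in the location vocabulary and combine several rules per candidate. Those steps are omitted here because the abstract does not specify them.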
Keywords :
search problems; statistical distributions; string matching; text analysis; trees (mathematics); visual databases; Aho-Corasick algorithm; Aho-Corasick tree; OSM dataset; approximate string search; candidate generation; conditional probability distribution; correction rules; log linear model; rule index; spatial databases; spatial dataset; Accuracy; Algorithm design and analysis; Approximation algorithms; Data structures; Indexes; Probabilistic logic
Conference_Title :
Advance Computing Conference (IACC), 2015 IEEE International
Conference_Location :
Bangalore
Print_ISBN :
978-1-4799-8046-8
DOI :
10.1109/IADCC.2015.7154721