Title :
An Efficient Word Searching Algorithm through Splitting and Hashing the Offline Text
Author :
Singh, Bharat ; Yadav, Ishadutta ; Agarwal, Suneeta ; Prasad, Rajesh
Author_Institution :
Dept. of Comput. Sci. & Eng., Motilal Nehru Nat. Inst. of Technol., Allahabad, India
Abstract :
The word matching problem is to find all occurrences of a pattern P[0...m-1] in a text T[0...n-1], where P contains no white space and is neither preceded nor followed by a space. In this paper, we assume that the text is offline. Ibrahiem et al. (2008) proposed an algorithm (WSA) that solves the word matching problem by splitting the offline text into a number of tables in the preprocessing phase. The main drawback of this algorithm is that, after splitting the text into tables, it searches for each occurrence of the pattern within a table by brute force. In this paper, we improve the algorithm by using SDBM, an efficient hash function proposed by R. J. Enbody et al. in 1988. In this technique, after splitting the text into a number of tables, we match the hash value of the pattern P against the hash values of the words of the same length in the text T. We call this algorithm the modified word searching algorithm (MWSA). Experimental results show that MWSA is much faster than the previously proposed WSA.
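The split-then-hash idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `sdbm`, `build_tables`, and `search`, the choice of one table per word length, the 32-bit mask, the use of word indices as positions, and the final string comparison that guards against hash collisions are all assumptions made for illustration.

```python
from collections import defaultdict


def sdbm(word: str) -> int:
    """SDBM string hash, truncated to 32 bits (the mask is an assumption)."""
    h = 0
    for ch in word:
        h = (ord(ch) + (h << 6) + (h << 16) - h) & 0xFFFFFFFF
    return h


def build_tables(text: str) -> dict:
    """Preprocessing: split the offline text into one table per word length.

    Each table is a list of (hash, word, word_index) entries. Grouping by
    length is an assumption based on the abstract's "words of same length".
    """
    tables = defaultdict(list)
    for index, word in enumerate(text.split()):
        tables[len(word)].append((sdbm(word), word, index))
    return tables


def search(tables: dict, pattern: str) -> list:
    """Return the word indices at which `pattern` occurs as a whole word."""
    target = sdbm(pattern)
    hits = []
    # Only the table holding words of the pattern's length is scanned.
    for h, word, index in tables.get(len(pattern), []):
        # Compare hash values first; confirm with a string comparison
        # so that a hash collision cannot produce a false match.
        if h == target and word == pattern:
            hits.append(index)
    return hits
```

With these hypothetical helpers, `search(build_tables("the cat sat on the mat"), "the")` returns `[0, 4]`, while a query for `"at"` returns an empty list because the two-letter table contains only `"on"`.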
Keywords :
cryptography; pattern matching; text analysis; word processing; MWSA algorithm; SDBM hash function; offline text hashing; offline text splitting; word matching; word searching; Communications technology; Computer science; Space technology; White spaces; Algorithm; hashing; offline searching; string matching
Conference_Title :
Advances in Recent Technologies in Communication and Computing, 2009. ARTCom '09. International Conference on
Conference_Location :
Kottayam, Kerala
Print_ISBN :
978-1-4244-5104-3
Electronic_ISBN :
978-0-7695-3845-7
DOI :
10.1109/ARTCom.2009.210