Title :
An evaluation of Lucene for keywords search in large-scale short text storage
Author :
Qian Liping ; Wang Lidong
Author_Institution :
Dept. of Comput., Beijing Univ. of Civil Eng. & Archit., Beijing, China
Abstract :
Some popular Internet applications such as instant messaging, blogs, Twitter, and Google Buzz generate huge volumes of short text. These data can then be summarized, mined, and queried by other applications. To this end, a storage design with strong performance is needed to support real-time full-text indexing and searching. This paper studies Lucene's indexing and searching performance for short text. It presents a comparison test between Lucene and Oracle Text, and then identifies the factors that affect Lucene's performance for short-text search. The experimental results show that Lucene meets these needs and is far superior to Oracle Text in this typical scenario.
Keywords :
Internet; indexing; information retrieval; Google buzz; Internet application; Lucene indexing; Oracle text; blog; instant message; keywords search; large-scale short text storage; twitter; Application software; Computer networks; Electronic mail; Indexing; Information services; Internet; Keyword search; Large-scale systems; Relational databases; Web sites; information retrieval; lucene; search; short text;
Conference_Title :
Computer Design and Applications (ICCDA), 2010 International Conference on
Conference_Location :
Qinhuangdao
Print_ISBN :
978-1-4244-7164-5
Electronic_ISBN :
978-1-4244-7164-5
DOI :
10.1109/ICCDA.2010.5541219