• DocumentCode
    140798
  • Title

    Efficient instant-fuzzy search with proximity ranking

  • Author

    Cetindil, Inci ; Esmaelnezhad, Jamshid ; Taewoo Kim ; Chen Li

  • Author_Institution
    Dept. of Comput. Sci., Univ. of California, Irvine, Irvine, CA, USA
  • fYear
    2014
  • fDate
    March 31 2014-April 4 2014
  • Firstpage
    328
  • Lastpage
    339
  • Abstract
    Instant search is an emerging information-retrieval paradigm in which a system finds answers to a query instantly while a user types in keywords character-by-character. Fuzzy search further improves user search experiences by finding relevant answers with keywords similar to query keywords. A main computational challenge in this paradigm is the high-speed requirement, i.e., each query needs to be answered within milliseconds to achieve an instant response and a high query throughput. At the same time, we also need good ranking functions that consider the proximity of keywords to compute relevance scores. In this paper, we study how to integrate proximity information into ranking in instant-fuzzy search while achieving efficient time and space complexities. We adapt existing solutions on proximity ranking to instant-fuzzy search. A naïve solution is computing all answers then ranking them, but it cannot meet this high-speed requirement on large data sets when there are too many answers, so there are studies of early-termination techniques to efficiently compute relevant answers. To overcome the space and time limitations of these solutions, we propose an approach that focuses on common phrases in the data and queries, assuming records with these phrases are ranked higher. We study how to index these phrases and develop an incremental-computation algorithm for efficiently segmenting a query into phrases and computing relevant answers. We conducted a thorough experimental study on real data sets to show the tradeoffs between time, space, and quality of these solutions.
  • Keywords
    computational complexity; fuzzy set theory; query formulation; query processing; early-termination techniques; incremental-computation algorithm; information-retrieval paradigm; instant-fuzzy search; proximity ranking; query answering; query segmentation; space complexity; time complexity; Dictionaries; Heart; Indexing; Servers; Surgery; Tumors;
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Data Engineering (ICDE), 2014 IEEE 30th International Conference on
  • Conference_Location
    Chicago, IL
  • Type

    conf

  • DOI
    10.1109/ICDE.2014.6816662
  • Filename
    6816662