• DocumentCode
    3120743
  • Title
    Scalable full-text search for petascale file systems
  • Author
    Leung, Andrew W. ; Miller, Ethan L.
  • Author_Institution
    Storage Syst. Res. Center, Univ. of California, Santa Cruz, Santa Cruz, CA
  • fYear
    2008
  • fDate
    17 Nov. 2008
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    As file system capacities reach the petascale, it is becoming increasingly difficult for users to organize, find, and manage their data. File system search has the potential to greatly improve how users manage and access files. Unfortunately, existing file system search is designed for smaller-scale systems, making it difficult for existing solutions to scale to petascale file systems. In this paper, we motivate the importance of file system search in petascale file systems and present a new full-text file system search design for petascale file systems. Unlike existing solutions, our design exploits file system properties. Using a novel index partitioning mechanism that utilizes file system namespace locality, we are able to improve search scalability and performance, and we discuss how such a design can potentially improve search security and ranking. We describe how our design can be implemented within the Ceph petascale file system.
  • Keywords
    file organisation; information retrieval; text analysis; file system namespace locality; full text file system search design; index partitioning; petascale file systems; scalable full-text search; search ranking; search security; Dictionaries; File systems; Government; Hardware; Indexing; Navigation; Organizing; Productivity; Scalability; Security
  • fLanguage
    English
  • Publisher
    ieee
  • Conference_Titel
    Petascale Data Storage Workshop, 2008. PDSW '08. 3rd
  • Conference_Location
    Austin, TX
  • Print_ISBN
    978-1-4244-4208-9
  • Type
    conf
  • DOI
    10.1109/PDSW.2008.4811884
  • Filename
    4811884