• DocumentCode
    1921094
  • Title
    An Efficient SSD-based Hybrid Storage Architecture for Large-Scale Search Engines
  • Author
    Li, Ruixuan ; Li, Chengzhou ; Xiao, Weijun ; Jin, Hai ; He, Heng ; Gu, Xiwu ; Wen, Kunmei ; Xu, Zhiyong
  • Author_Institution
    Sch. of Comput. Sci. & Technol., Huazhong Univ. of Sci. & Technol., Wuhan, China
  • fYear
    2012
  • fDate
    10-13 Sept. 2012
  • Firstpage
    450
  • Lastpage
    459
  • Abstract
    Large-scale search engines store their massive index data on hard disk drives (HDDs) for their capacity, but retrieval performance is limited by the relatively low I/O performance of HDDs. Caching is an effective optimization, and many caching algorithms have been proposed to improve retrieval performance. However, given the high cost of memory and the huge volume of data, a capacity-limited in-memory cache cannot fully resolve this problem. In this paper, we adopt a solid state disk (SSD) based storage architecture, which uses the SSD as a secondary cache for memory. We analyze the I/O patterns of search engines and propose SSD-based data management policies for the hybrid storage architecture, including data selection, data placement, and data replacement. Our main goal is to improve the performance of search engines while reducing the operation cost inside the SSD. The experimental results demonstrate that the proposed architecture improves the hit ratio by 13.31%, the performance by 41.05%, and the average access time inside the SSD by 43.83%, and reduces block erasure operations by 71.52%.
  • Keywords
    cache storage; data handling; disc drives; hard discs; input-output programs; memory architecture; performance evaluation; search engines; HDD; I-O patterns; SSD-based data management policies; SSD-based hybrid storage architecture; block erasure operation reduction; caching algorithms; data placement; data replacement; data selection; hard disk drives; large-scale search engines; operation cost reduction; performance improvement; solid state disk; Educational institutions; Indexes; Memory management; Search engines; Servers; Solids; caching; hybrid storage architecture; search engine; solid state disk;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel Processing (ICPP), 2012 41st International Conference on
  • Conference_Location
    Pittsburgh, PA
  • ISSN
    0190-3918
  • Print_ISBN
    978-1-4673-2508-0
  • Type
    conf
  • DOI
    10.1109/ICPP.2012.17
  • Filename
    6337606