• DocumentCode
    588165
  • Title

    Mining hidden mixture context with ADIOS-P to improve predictive pre-fetcher accuracy

  • Author

    Choi, Jong Youl ; Abbasi, Hasan ; Pugmire, David ; Podhorszki, Norbert ; Klasky, Scott ; Capdevila, C. ; Parashar, Manish ; Wolf, Michael ; Qiu, Jian ; Fox, G.

  • Author_Institution
    Sci. Data Group, Oak Ridge Nat. Lab., Oak Ridge, TN, USA
  • fYear
    2012
  • fDate
    8-12 Oct. 2012
  • Firstpage
    1
  • Lastpage
    8
  • Abstract
    Predictive pre-fetchers, which predict future data access events and load the data before users request it, have been widely studied as a way to reduce data load latency, especially in file systems and web content servers. In scientific data visualization in particular, pre-fetching can reduce I/O waiting time. To increase prediction accuracy, we apply a data mining technique that discovers the hidden contexts in data access patterns and makes predictions based on the inferred context. Specifically, we use Probabilistic Latent Semantic Analysis (PLSA), a mixture-model-based algorithm popular in text mining, to mine hidden contexts from collected user access patterns, and then run a predictor within the discovered context. We further improve PLSA by applying the Deterministic Annealing (DA) method to overcome the local optimum problem. In this paper, we demonstrate how PLSA and DA optimization can be applied to mine hidden contexts from users' data access patterns and improve predictive pre-fetcher performance.
  • Keywords
    data mining; information retrieval; probability; storage management; text analysis; ADIOS-P; DA method; IO waiting time; Web contents servers; data access events; data load latency; data mining technique; data visualization; deterministic annealing method; file systems; hidden information extraction; hidden mixture context; local optimum problem; mixture model based algorithm; pre-fetching; predictive pre-fetcher accuracy; probabilistic latent semantic analysis; text mining area; Accuracy; Algorithm design and analysis; Clustering algorithms; Context; Data mining; Data visualization; Prediction algorithms; hidden context mining; prefetch;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    E-Science (e-Science), 2012 IEEE 8th International Conference on
  • Conference_Location
    Chicago, IL
  • Print_ISBN
    978-1-4673-4467-8
  • Type

    conf

  • DOI
    10.1109/eScience.2012.6404418
  • Filename
    6404418