• DocumentCode
    704758
  • Title

    Performance and energy evaluation of data prefetching on Intel Xeon Phi

  • Author

    Guttman, Diana ; Kandemir, Mahmut Taylan ; Arunachalam, Meenakshi ; Calina, Vlad

  • Author_Institution
    Comput. Sci. & Eng., Pennsylvania State Univ., University Park, PA, USA
  • fYear
    2015
  • fDate
    29-31 March 2015
  • Firstpage
    288
  • Lastpage
    297
  • Abstract
    There is an urgent need to evaluate the existing parallelism and data locality-oriented techniques on emerging manycore machines using multithreaded applications. Data prefetching is a well-known latency hiding technique that comes with various hardware- and software-based implementations in almost all commercial machines. A well-tuned prefetcher can reduce the observed data access latencies significantly by bringing the soon-to-be-requested data into the cache ahead of time, eventually improving application execution time. Motivated by this, we present in this paper a detailed performance and power characterization of software (compiler-guided) and hardware data prefetching on an Intel Xeon Phi-based system. Our main contributions are (i) an analysis of the interactions between hardware and software prefetching, showing how hardware prefetching can throttle itself in response to software; (ii) results on the power and energy behavior of prefetching, showing how performance and energy gains outweigh the increased power cost of prefetching; and (iii) an evaluation of the use of intrinsic prefetch instructions to prefetch for applications with difficult-to-detect access patterns.
  • Keywords
    data encapsulation; multi-threading; parallel processing; performance evaluation; program compilers; storage management; Intel Xeon Phi; application execution time; compiler-guided data prefetching; data locality-oriented technique; energy evaluation; hardware data prefetching; hardware-based implementation; intrinsic prefetch instructions; latency hiding technique; manycore machines; multithreaded applications; parallelism technique; performance characterization; power characterization; software data prefetching; software-based implementation; Benchmark testing; Coprocessors; Hardware; Measurement; Microwave integrated circuits; Prefetching
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Performance Analysis of Systems and Software (ISPASS), 2015 IEEE International Symposium on
  • Conference_Location
    Philadelphia, PA
  • Type

    conf

  • DOI
    10.1109/ISPASS.2015.7095814
  • Filename
    7095814