• DocumentCode
    2958106
  • Title
    Efficient Resource Oblivious Algorithms for Multicores with False Sharing
  • Author
    Cole, Richard ; Ramachandran, Vijaya
  • Author_Institution
    Comput. Sci. Dept., NYU, New York, NY, USA
  • fYear
    2012
  • fDate
    21-25 May 2012
  • Firstpage
    201
  • Lastpage
    214
  • Abstract
    We consider algorithms for a multicore environment in which each core has its own private cache and false sharing can occur. False sharing happens when two or more processors access the same block (i.e., cache-line) in parallel, and at least one processor writes into a location in the block. False sharing causes different processors to have inconsistent views of the data in the block, and many of the methods currently used to resolve these inconsistencies can cause large delays. We analyze the cost of false sharing both for variables stored on the execution stacks of the parallel tasks and for output variables. Our main technical contribution is to establish a low cost for this overhead for the class of multithreaded block-resilient HBP (Hierarchical Balanced Parallel) computations. Using this and other techniques, we develop block-resilient HBP algorithms with low false sharing costs for several fundamental problems including scans, matrix multiplication, FFT, sorting, and hybrid block-resilient HBP algorithms for list ranking and graph connected components. Most of these algorithms are derived from known multicore algorithms, but are further refined to achieve a low false sharing overhead. Our algorithms make no mention of machine parameters, and our analysis of the false sharing overhead is mostly in terms of the number of tasks generated in parallel during the computation, and thus applies to a variety of schedulers.
  • Keywords
    cache storage; graph theory; matrix multiplication; multi-threading; multiprocessing systems; parallel algorithms; parallel architectures; resource allocation; FFT; block location; execution stacks; false sharing; graph connected components; hybrid block-resilient HBP algorithms; inconsistent data views; list ranking; machine parameters; matrix multiplication; multicore algorithms; multicore environment; multithreaded block-resilient HBP computations; multithreaded block-resilient hierarchical balanced parallel computations; parallel block; parallel tasks generation; private cache; resource oblivious algorithms; Algorithm design and analysis; Computational modeling; Delay; Instruction sets; Multicore processing; Upper bound; cache-efficiency; false-sharing; multicores;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Parallel & Distributed Processing Symposium (IPDPS), 2012 IEEE 26th International
  • Conference_Location
    Shanghai
  • ISSN
    1530-2075
  • Print_ISBN
    978-1-4673-0975-2
  • Type
    conf
  • DOI
    10.1109/IPDPS.2012.28
  • Filename
    6267836