• DocumentCode
    2075287
  • Title

    No sorting? Better searching! [optimal array organization]

  • Author

    Franceschini, Gianni ; Grossi, Roberto

  • Author_Institution
    Dipt. di Informatica, Università di Pisa, Italy
  • fYear
    2004
  • fDate
    17-19 Oct. 2004
  • Firstpage
    491
  • Lastpage
    498
  • Abstract
    Sorting is commonly meant as the task of arranging keys in increasing or decreasing order (or small variations of this order). Given n keys underlying a total order, the best organization in an array is maintaining them in sorted order. Searching requires Θ(log n) comparisons in the worst case, which is optimal. We demonstrate that this basic fact in data structures does not hold for the general case of multidimensional keys, whose comparison cost is proportional to their length. In two papers (1994 and 1995) and the full version (2001), Andersson et al. study the complexity of searching a sorted array of n keys, each of length k, arranged in lexicographic (or alphabetic) order for an arbitrary, possibly unbounded, ordered alphabet. They give sophisticated arguments for proving a tight bound in the worst case for this basic data organization, up to a constant factor, obtaining Θ((k log log n)/(log log (4 + (k log log n)/log n)) + k + log n) character comparisons (or probes). Note that the bound is Θ(log n) when k = 1, which is the case that is well known in algorithmics. We describe a permutation of the n keys that is different from the sorted order; sorting is just the starting point for describing our preprocessing. When keys are stored according to this "unsorted" order in the array, the complexity of searching drops to Θ(k + log n) character comparisons (or probes) in the worst case, which is optimal among all possible permutations of the n keys in the array, up to a constant factor. Again, the bound is Θ(log n) when k = 1. Jointly with the aforementioned result of Andersson et al., our finding provably shows that keeping k-dimensional keys sorted in an array is not the best data organization for searching. This fact was not observable before by just considering k = O(1), as sorting is an optimal organization in this case. More implications of our result are discussed in the introduction.
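To make the cost model in the abstract concrete, the following is a minimal sketch of the naive baseline only: plain binary search over a lexicographically sorted array of equal-length keys, counting character probes. It is neither the algorithm of Andersson et al. nor the permuted layout of the paper; the names `compare_keys` and `naive_search` are hypothetical and chosen for illustration. Each of the O(log n) key comparisons may inspect up to k characters, so this baseline uses O(k log n) probes.

```python
def compare_keys(a, b, counter):
    """Lexicographically compare two equal-length keys.

    Increments counter[0] once per character position inspected.
    Returns -1, 0, or 1.
    """
    for x, y in zip(a, b):
        counter[0] += 1
        if x != y:
            return -1 if x < y else 1
    return 0


def naive_search(sorted_keys, target):
    """Plain binary search over a sorted list of equal-length keys.

    Returns (index of target or -1, number of character probes used).
    This is the naive baseline, not the method described in the paper.
    """
    counter = [0]
    lo, hi = 0, len(sorted_keys) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        c = compare_keys(target, sorted_keys[mid], counter)
        if c == 0:
            return mid, counter[0]
        if c < 0:
            hi = mid - 1
        else:
            lo = mid + 1
    return -1, counter[0]
```

With keys that share long common prefixes, every comparison scans almost the whole key, which is the behaviour that more careful search strategies and array layouts aim to avoid.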
  • Keywords
    arrays; computational complexity; data structures; search problems; algorithmics; array organization; array searching complexity; character comparisons; data organization; key sorting; multidimensional keys; ordered alphabet; unsorted array ordering; Algorithm design and analysis; Books; Computer science; Costs; Data structures; Dictionaries; Probes; Sorting; Time measurement;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Foundations of Computer Science, 2004. Proceedings. 45th Annual IEEE Symposium on
  • ISSN
    0272-5428
  • Print_ISBN
    0-7695-2228-9
  • Type

    conf

  • DOI
    10.1109/FOCS.2004.43
  • Filename
    1366269