DocumentCode :
2049341
Title :
Cache Friendly Burrows-Wheeler Inversion
Author :
Kärkkäinen, Juha ; Puglisi, Simon J.
Author_Institution :
Dept. of Comput. Sci., Univ. of Helsinki, Helsinki, Finland
fYear :
2011
fDate :
21-24 June 2011
Firstpage :
38
Lastpage :
42
Abstract :
The Burrows-Wheeler transform permutes the symbols of a string such that the permuted string can be compressed effectively with fast, simple techniques. Inversion of the transform is a bottleneck in practice. Inversion takes linear time, but, for each symbol decoded, folklore says that a random access into the transformed string (and so a CPU cache miss) is necessary. In this paper we show how to mitigate cache misses and so speed up inversion. Our main idea is to modify the standard inversion algorithm to detect and record repeated substrings in the original string as it is recovered. Subsequent occurrences of these repetitions are then copied in a cache-friendly way from the already recovered portion of the string, shortcutting a series of random accesses by the standard inversion algorithm. We show experimentally that this approach leads to faster runtimes in general, and can drastically reduce inversion time for highly repetitive data.
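For orientation, the standard inversion that the abstract refers to can be sketched as below. This is a minimal illustration of the textbook LF-mapping algorithm, not the cache-friendly variant proposed in the paper. The function names `bwt_forward` and `bwt_inverse` and the sentinel convention (a unique end marker smaller than every other symbol) are assumptions made for this example. Each `i = lf[i]` step jumps to an effectively arbitrary position in the transformed string, which is the per-symbol random access the abstract describes.

```python
def bwt_forward(text, sentinel="\0"):
    """Naive BWT via sorted rotations (for illustration only, not linear time).

    Assumes `sentinel` does not occur in `text` and sorts before every
    symbol of `text`.
    """
    s = text + sentinel
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rotation[-1] for rotation in rotations)


def bwt_inverse(bwt, sentinel="\0"):
    """Standard linear-time inversion using the LF mapping."""
    n = len(bwt)

    # rank[i] = number of earlier occurrences of bwt[i] in bwt[0:i]
    counts = {}
    rank = [0] * n
    for i, c in enumerate(bwt):
        rank[i] = counts.get(c, 0)
        counts[c] = rank[i] + 1

    # start[c] = index of the first row that begins with symbol c
    start = {}
    total = 0
    for c in sorted(counts):
        start[c] = total
        total += counts[c]

    # lf[i] = row whose first symbol is the same occurrence as bwt[i]
    lf = [start[c] + rank[i] for i, c in enumerate(bwt)]

    # Row 0 begins with the sentinel, so bwt[0] is the last symbol of the
    # text. Following lf recovers the text from back to front.
    out = []
    i = 0
    for _ in range(n - 1):
        out.append(bwt[i])
        i = lf[i]  # random access into the transformed string
    return "".join(reversed(out))
```

A round trip such as `bwt_inverse(bwt_forward("banana"))` should return the original string. The sketch keeps `rank` and `lf` as plain Python lists for clarity; a practical implementation would use compact integer arrays.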
Keywords :
cache storage; data compression; transforms; CPU cache misses; cache friendly Burrows-Wheeler transform inversion; permuted string; speed inversion; standard inversion algorithm; transformed string; Arrays; DNA; Data compression; Electronic mail; Pattern matching; Runtime; Transforms; BWT; Burrows-Wheeler transform; cache misses; data compression; suffix array;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Data Compression, Communications and Processing (CCP), 2011 First International Conference on
Conference_Location :
Palinuro
Print_ISBN :
978-1-4577-1458-0
Electronic_ISBN :
978-0-7695-4528-8
Type :
conf
DOI :
10.1109/CCP.2011.15
Filename :
6061025