DocumentCode
3312070
Title
Data matrix compression by using co-clustering
Author
Bo Han ; Zhenyu Yang
Author_Institution
Int. Sch. of Software, Wuhan Univ., Wuhan, China
Volume
4
fYear
2011
fDate
26-28 July 2011
Firstpage
2600
Lastpage
2604
Abstract
Two-dimensional data matrices are widely used in many applications. Lossless compression of a data matrix benefits both storage and network transmission. In this paper, we propose a novel data-mining-based compression approach consisting of three steps: reordering and grouping the columns and rows of the data matrix by co-clustering; post-processing to further expose redundancy in the data matrix; and compressing the data with a standard compressor. The inverse transform of co-clustering is fast and simple, which facilitates matrix decompression. We tested the approach on a synthetic dataset and five real-life UCI datasets. The experimental results suggest that our approach can improve compression rates by at least 24% and up to 68%. The results also show that the time cost of the approach scales linearly with the size of the data matrix, making it faster than competing methods.
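The reorder-then-compress idea in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: lexicographic sorting of rows and columns stands in for co-clustering, the post-processing step is omitted because the abstract does not describe it, and the function names (`reorder`, `restore`, `compress`, `decompress`) are hypothetical. Matrix entries are assumed to be integers in the range 0..255 so that each fits in one byte, and `zlib` plays the role of the standard compressor.

```python
import zlib


def reorder(matrix):
    """Group similar rows and columns; return the reordered matrix and both permutations."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    # Assumption: lexicographic sorting is a simple stand-in for co-clustering.
    row_perm = sorted(range(n_rows), key=lambda i: matrix[i])
    rows = [matrix[i] for i in row_perm]
    col_perm = sorted(range(n_cols), key=lambda j: [r[j] for r in rows])
    reordered = [[r[j] for j in col_perm] for r in rows]
    return reordered, row_perm, col_perm


def restore(reordered, row_perm, col_perm):
    """Inverse transform: put every entry back at its original position."""
    original = [[0] * len(col_perm) for _ in range(len(row_perm))]
    for new_i, old_i in enumerate(row_perm):
        for new_j, old_j in enumerate(col_perm):
            original[old_i][old_j] = reordered[new_i][new_j]
    return original


def compress(matrix):
    """Reorder, flatten row by row, then hand the bytes to a standard compressor."""
    reordered, row_perm, col_perm = reorder(matrix)
    payload = bytes(v for row in reordered for v in row)  # entries must be 0..255
    return zlib.compress(payload, 9), row_perm, col_perm


def decompress(blob, row_perm, col_perm):
    """Decompress, rebuild the reordered matrix, then undo the permutations."""
    flat = zlib.decompress(blob)
    n_cols = len(col_perm)
    reordered = [list(flat[i * n_cols:(i + 1) * n_cols]) for i in range(len(row_perm))]
    return restore(reordered, row_perm, col_perm)


matrix = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
]
blob, row_perm, col_perm = compress(matrix)
recovered = decompress(blob, row_perm, col_perm)
```

The inverse transform is a single pass over the matrix using the stored permutations, which is consistent with the abstract's remark that undoing the reordering is cheap. In this sketch the two permutations must be stored alongside the compressed payload, so any real size comparison would need to count them as well.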
Keywords
data compression; data mining; inverse transforms; pattern clustering; data matrix column grouping; data matrix compression; data matrix redundancy; data matrix row coclustering; data-mining-based compression approach; inverse transform; lossless compression; matrix uncompression; network transmission; reordering; standard compressor; two dimensional data matrix; Data compression; Data mining; Educational institutions; Image coding; Redundancy; Software; Transforms; co-clustering; data matrix; lossless compression; redundancy;
fLanguage
English
Publisher
IEEE
Conference_Titel
Fuzzy Systems and Knowledge Discovery (FSKD), 2011 Eighth International Conference on
Conference_Location
Shanghai
Print_ISBN
978-1-61284-180-9
Type
conf
DOI
10.1109/FSKD.2011.6019940
Filename
6019940