DocumentCode :
2938194
Title :
An Efficient Cross-Match Implementation Based on Directed Join Algorithm in MapReduce
Author :
Cuncang Mi ; Qian Chen ; Taoying Liu
Author_Institution :
Inst. of Comput. Technol., Beijing, China
fYear :
2011
fDate :
5-8 Dec. 2011
Firstpage :
41
Lastpage :
48
Abstract :
In astronomy, "Cross-Match" is a common operation used to mine useful information by joining different star catalogues. Star catalogues obtained through astronomical telescopes are now far larger than ever before, which drives us to consider implementing Cross-Match in a distributed computing environment. Although computer hardware is now cheap and resizable compute capacity in the cloud is available from some web services, we conduct our experiments in a restricted environment to conserve resources as much as possible. In our work, we first use Hive from Facebook, but find it less efficient than expected when joining two big catalogues. We then analyze Hive's join process and carry out some optimization; however, the result is still not satisfactory. Finally, we design our own Cross-Match program, which is based on the directed join algorithm in MapReduce, takes advantage of the characteristics of astronomical data, and runs on top of Hadoop. Our program improves performance by 86% compared with the common join in Hive when performing Cross-Match between USNOA and 2MASS.
Keywords :
Web services; astronomical catalogues; astronomy computing; cloud computing; data handling; data mining; 2MASS; Cross-Match program; Facebook; Hadoop; Hive; MapReduce; USNOA; Web service; astronomical data; astronomical telescope; astronomy; cloud computing; directed join algorithm; distributed computing environment; star catalogue; useful information mining; Astronomy; Computational modeling; Data processing; Distributed databases; Indexes; Telescopes; Big star catalogues; Cross-Match; Directed Join; Hive; MapReduce;
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on
Conference_Location :
Melbourne, Victoria, Australia
Print_ISBN :
978-1-4577-2116-8
Type :
conf
DOI :
10.1109/UCC.2011.16
Filename :
6123479