DocumentCode
2938194
Title
An Efficient Cross-Match Implementation Based on Directed Join Algorithm in MapReduce
Author
Mi, Cuncang ; Qian Chen ; Taoying Liu
Author_Institution
Inst. of Comput. Technol., Beijing, China
fYear
2011
fDate
5-8 Dec. 2011
Firstpage
41
Lastpage
48
Abstract
In the field of astronomy, "Cross-Match" is a common operation used to mine useful information by joining different star catalogues. Star catalogues obtained through astronomical telescopes are now much larger than ever before, which drives us to consider implementing Cross-Match in a distributed computing environment. Although computer hardware is now cheap and resizable compute capacity in the cloud is available from some web services, we conduct our experiments in a restricted environment to conserve resources as much as possible. In our work, we first use Hive from Facebook, but find it less efficient than we expected when facing two big catalogues. We then analyze Hive's join process and carry out some optimization; however, the result is still not satisfactory. Finally, we design our own Cross-Match program, which is based on the directed join algorithm in MapReduce, takes advantage of the characteristics of astronomical data, and runs on top of Hadoop. Our program improves performance by 86% compared with the common join in Hive when performing Cross-Match between USNOA and 2MASS.
Keywords
Web services; astronomical catalogues; astronomy computing; cloud computing; data handling; data mining; 2MASS; Cross-Match program; Facebook; Hadoop; Hive; MapReduce; USNOA; Web service; astronomical data; astronomical telescope; astronomy; cloud computing; directed join algorithm; distributed computing environment; star catalogue; useful information mining; Astronomy; Computational modeling; Data processing; Distributed databases; Indexes; Telescopes; Big star catalogues; Cross-Match; Directed Join; Hive; MapReduce;
fLanguage
English
Publisher
IEEE
Conference_Titel
Utility and Cloud Computing (UCC), 2011 Fourth IEEE International Conference on
Conference_Location
Victoria, NSW
Print_ISBN
978-1-4577-2116-8
Type
conf
DOI
10.1109/UCC.2011.16
Filename
6123479