DocumentCode
2494199
Title
Effective standards for metadata in the GCMD data access system
Author
Bukhres, Omran ; Miled, Z.B. ; Lynch, Eric ; Olsen, Lola ; Tari, Zahir
Author_Institution
Dept. of Comput. Sci., Indiana Univ., Indianapolis, IN, USA
fYear
2000
fDate
2000
Firstpage
155
Lastpage
161
Abstract
The paper presents an information retrieval system for use by the Global Change Master Directory (GCMD). The GCMD is a repository that catalogs Earth Science data collected by various agencies worldwide. The GCMD does not house the actual data; it contains descriptions of the data, including the location of the actual data set. The GCMD also provides search services to locate these data descriptor files. For data to be included in the GCMD database, a description must be submitted to the GCMD in the Directory Interchange Format (DIF). Currently, data collectors submit DIFs to the GCMD manually, but this manual process cannot keep pace with the amount of data being collected. Our proposed solution is to design and develop a data access system for the GCMD that automates the DIF creation process. Our data access system will be capable of autonomously searching Web sites for Earth Science data sets, extracting the metadata from these data sets, and creating a DIF for each file. The paper describes our prototype system, which uses a URL pool to direct its search for Hierarchical Data Format (HDF) files. HDF is a self-describing format: each file contains metadata describing its contents. This metadata is extracted and mapped to the DIF. We present examples of DIFs created by our prototype to demonstrate that our approach is feasible, and discuss the need for a metadata standard among scientific data sets and how such a standard would enhance the effectiveness of our system and others in the Earth Science community.
Keywords
geography; geophysics computing; information resources; information retrieval; information retrieval systems; meta data; standards; DIF creation process; DIF format; DIF submission; Directory Interchange Format; Earth Science community; Earth Science data; Earth Science data sets; GCMD data access system; GCMD database; Global Change Master Directory; HDF file; Hierarchical Data Format; URL pool; Web sites; data access system; data descriptor files; information retrieval system; metadata; metadata standards; scientific data sets; search services; self-describing format; Computer science; Data mining; Databases; Earth; Geoscience; Information retrieval; Internet; Prototypes; Search engines; Space technology
fLanguage
English
Publisher
IEEE
Conference_Titel
Distributed Objects and Applications, 2000. Proceedings. DOA '00. International Symposium on
Conference_Location
Antwerp
Print_ISBN
0-7695-0819-7
Type
conf
DOI
10.1109/DOA.2000.874187
Filename
874187