  • DocumentCode
    1931608
  • Title
    Improving Data Discovery for Metadata Repositories through Semantic Search
  • Author
    Berkley, Chad ; Bowers, Shawn ; Jones, Matthew B. ; Madin, Joshua S. ; Schildhauer, Mark
  • Author_Institution
    Nat. Center for Ecological Anal. & Synthesis, UC-Santa Barbara, Santa Barbara, CA
  • fYear
    2009
  • fDate
    16-19 March 2009
  • Firstpage
    1152
  • Lastpage
    1159
  • Abstract
    The amount of ecological data available electronically is increasing at a rapid rate, e.g., over 15,000 data sets are available today in the Knowledge Network for Biocomplexity (KNB) alone. Using the existing search capabilities of these online data repositories, however, scientists struggle to quickly locate data that are relevant to their needs or that will integrate with their current data sets. Semantic technologies aim at addressing many of these problems and hold the promise of enabling more powerful "smart" searches of online data archives. We describe new semantic search features within the Metacat metadata system, which is used by many ecological research sites around the world for archiving their data using a standardized metadata format. Our semantic search system adds to Metacat the ability to store OWL-DL ontologies in addition to semantic annotations that link data set attributes to ontology terms. Our approach also extends Metacat to improve metadata search in multiple ways: (i) by expanding standard keyword searches with ontology term hierarchies; (ii) by allowing keyword searches to be applied to annotations in addition to traditional metadata; and (iii) by allowing more structured searches over annotations via ontology terms. We describe our implementation of these extensions, and compare and contrast these different types of search for a corpus of annotated documents. As data repositories continue to grow, these tools will be instrumental in helping scientists precisely locate and then interpret data for their research needs.
  • Keywords
    data mining; ecology; information retrieval systems; knowledge representation languages; meta data; ontologies (artificial intelligence); semantic Web; Knowledge Network for Biocomplexity; Metacat metadata system; OWL-DL ontologies; annotated documents; data archives; data discovery; metadata repositories; online data repositories; semantic search; Biology; Competitive intelligence; Data analysis; Instruments; Intelligent networks; Keyword search; Network synthesis; Ontologies; Software systems; Soil; data; discovery; metadata; ontology; semantic; storage;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Complex, Intelligent and Software Intensive Systems, 2009. CISIS '09. International Conference on
  • Conference_Location
    Fukuoka
  • Print_ISBN
    978-1-4244-3569-2
  • Electronic_ISBN
    978-0-7695-3575-3
  • Type
    conf
  • DOI
    10.1109/CISIS.2009.122
  • Filename
    5066940