• DocumentCode
    3410029
  • Title
    How biological source capabilities may affect the data collection process
  • Author
    Lacroix, Zoé ; Edupuganti, Vidyadhari
  • Author_Institution
    Arizona State Univ., Tempe, AZ, USA
  • fYear
    2004
  • fDate
    16-19 Aug. 2004
  • Firstpage
    596
  • Lastpage
    597
  • Abstract
    Scientific discovery relies in part on the collection of information related to multiple scientific objects (e.g., "retrieve all genes involved in brain cancer", "retrieve all citations related to diabetes"). Scientists use multiple data sources to explore relationships between scientific objects. Each data source provides specific capabilities that allow scientists to access, navigate, and analyze the data. This work addresses the impact of resource selection (data source and capability) on the data collection process, as it may significantly affect the quality and completeness of the data. We present preliminary research that demonstrates that the data collection process depends on two orthogonal variables: the data sources involved in the process, and the selection of capabilities available at these resources. We report the results for four commonly used biological resources: the NCBI Nucleotide, Protein, PubMed and OMIM databases.
  • Keywords
    biology computing; data analysis; information resources; information retrieval; NCBI Nucleotide databases; OMIM databases; Protein databases; PubMed databases; biological resources; biological source capabilities; data access; data analysis; data collection; data completeness; data navigation; data quality; multiple data sources; resource selection; scientific discovery; Access protocols; Cancer; Data analysis; Databases; Diabetes; Diseases; Information retrieval; Navigation; Proteins; User interfaces
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computational Systems Bioinformatics Conference, 2004. CSB 2004. Proceedings. 2004 IEEE
  • Print_ISBN
    0-7695-2194-0
  • Type
    conf
  • DOI
    10.1109/CSB.2004.1332511
  • Filename
    1332511