Title :
PDB Data Curation
Author :
Wang, Yanchao ; Sunderraman, Rajshekhar
Author_Institution :
Dept. of Comput. Sci., Georgia State Univ., GA
Date :
Aug. 30 2006-Sept. 3 2006
Abstract :
In this paper, we propose two architectures for curating PDB data to improve its quality. The first, PDB Data Curation System, adds two components, a Checking Filter and a Curation Engine, between the User Interface and the Database, and supports basic PDB data curation. The second, PDB Data Curation System with XCML, is designed for further curation and adds four more components, PDB-XML, PDB, OODB, and Protin-OODB, to the first architecture. It uses the XCML language to check PDB data for errors automatically, which makes the data more consistent and accurate. Both systems can be used to clean existing PDB files and to create new ones. We also outline how constraints and assertions can be added with XCML to obtain better data. In addition, we discuss data provenance, which may affect data accuracy and consistency.
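The first architecture in the abstract can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe the actual checks, the order in which the two components run, or the database, so the names `AtomRecord`, `checking_filter`, `curation_engine`, and `submit`, the two validation rules, and the in-memory list standing in for the Database are all hypothetical.

```python
from dataclasses import dataclass, replace

# Illustrative allow-list of the 20 standard amino acid residue names.
# Assumption: real PDB files also contain other valid residue names,
# so a production filter would need a much richer rule set.
STANDARD_RESIDUES = {
    "ALA", "ARG", "ASN", "ASP", "CYS", "GLN", "GLU", "GLY", "HIS", "ILE",
    "LEU", "LYS", "MET", "PHE", "PRO", "SER", "THR", "TRP", "TYR", "VAL",
}


@dataclass(frozen=True)
class AtomRecord:
    """Hypothetical, simplified stand-in for one PDB ATOM record."""
    serial: int
    atom_name: str
    res_name: str
    chain_id: str
    occupancy: float


def curation_engine(rec: AtomRecord) -> AtomRecord:
    """Apply conservative normalizations (trim whitespace, upper-case names)."""
    return replace(
        rec,
        atom_name=rec.atom_name.strip().upper(),
        res_name=rec.res_name.strip().upper(),
        chain_id=rec.chain_id.strip().upper(),
    )


def checking_filter(rec: AtomRecord) -> list[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    if rec.res_name not in STANDARD_RESIDUES:
        problems.append("residue name not in allow-list")
    if not 0.0 <= rec.occupancy <= 1.0:
        problems.append("occupancy out of range")
    return problems


def submit(rec: AtomRecord, database: list) -> list[str]:
    """Sit between the user interface and the database.

    Assumed order: normalize first, then check; store only clean records.
    """
    curated = curation_engine(rec)
    problems = checking_filter(curated)
    if not problems:
        database.append(curated)
    return problems
```

In this sketch a record such as `AtomRecord(1, " n ", "met", "a", 1.0)` is normalized and stored, while a record with an unknown residue name or an occupancy outside the range 0 to 1 is rejected with a list of reasons and never reaches the database.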
Keywords :
XML; biochemistry; biology computing; molecular biophysics; object-oriented databases; proteins; PDB data curation; PDB-XML; Protin-OODB; XCML language; checking filter; curation engine; Bioinformatics; Bonding; Cities and towns; Cleaning; Databases; Engines; Filters; Genomics; Protein engineering; User interfaces;
Conference_Title :
Engineering in Medicine and Biology Society, 2006. EMBS '06. 28th Annual International Conference of the IEEE
Conference_Location :
New York, NY
Print_ISBN :
1-4244-0032-5
Electronic_ISBN :
1557-170X
DOI :
10.1109/IEMBS.2006.259891