• DocumentCode
    984400
  • Title

    Generating compact redundancy-free XML documents from conceptual-model hypergraphs

  • Author

    Mok, Wai Yin ; Embley, David W.

  • Author_Institution
    Dept. of Econ. & Inf. Syst., Alabama Univ., Huntsville, AL
  • Volume
    18
  • Issue
    8
  • fYear
    2006
  • Firstpage
    1082
  • Lastpage
    1096
  • Abstract
    As XML data becomes more and more prevalent and as larger quantities of data find their way into XML documents, the need for quality XML data organization only increase. One standard way of structuring data well is to reduce and, if possible, eliminate redundancy, while at the same time making the storage structures as compact as possible. In this paper, we present a methodology to generate XML storage structures where conforming XML documents are redundancy-free, and for most practical cases, are also fully compact. Our methodology assumes the input is a conceptual-model hypergraph. For the special case that every edge in the hypergraph is binary, we present a simple algorithm, guaranteed to always generate redundancy-free storage structures. We show, however, that generating a minimum number of redundancy-free storage structures is NP-hard. We therefore provide heuristics to guide the process and observe that these heuristics result in satisfactory solutions, which are often optimal. We then present a general algorithm for n-ary edges and show that it generates redundancy-free storage structures. The general algorithm must overcome several problems that do not arise in the special case
  • Keywords
    XML; computational complexity; document handling; entity-relationship modelling; redundancy; storage management; NP-hard; XML storage structure; conceptual-model hypergraph; n-ary edge algorithm; redundancy-free XML document; Computer science; Feedback; Information systems; Proposals; Storage automation; XML; XML data redundancy; XML scheme generation.; compact XML storage structures;
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    ieee
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/TKDE.2006.125
  • Filename
    1644731