Abstract :
Unranked trees, that is, trees with no restriction on the number of children of a node, have recently attracted much attention, primarily as an abstraction of XML (Extensible Markup Language) documents. In this paper, we study logical definability over unranked trees, as well as over collections of unranked trees, which can be viewed as databases of XML documents. The traditional approach to definability is to view each tree as a structure over a fixed vocabulary and to study the expressive power of various logics on trees. A different approach, based on model theory, considers a structure whose universe is the set of all trees, and studies definable sets and relations; this approach extends smoothly to the setting of definability over collections of trees. We study the latter, model-theoretic approach. We find sets of operations on unranked trees that define regular tree languages, and show that some natural restrictions correspond to logics studied in the context of XML pattern languages. We then look at relational calculi over collections of unranked trees, and obtain quantifier-restriction results that give us bounds on their expressive power and complexity. As unrestricted relational calculi can express problems complete for each level of the polynomial hierarchy, we look at restrictions of these calculi and find several with low (NC1) data complexity that can express important XML properties such as DTD validation and XPath evaluation.
Keywords :
hypermedia markup languages; query languages; tree data structures; DTD validation; Extensible Markup Language; NC1 data complexity; XML database; XML document; XML pattern language; XML property; XPath; definable relation; definable sets; document type definition; expressive complexity; expressive power; fixed vocabulary; logical definability; model theory; model-theoretic approach; natural restriction; polynomial hierarchy; quantifier-restriction result; query language; relational calculi; tree language; unranked tree; various logic; Automata; Database languages; Logic; Navigation; Polynomials; Relational databases; Spatial databases; Vocabulary; Web sites; XML;
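
As a concrete illustration of two notions named in the abstract, the following minimal sketch models an unranked tree (a node may have any number of children) and checks it against a toy DTD, where each element label is mapped to a regular expression over the sequence of its children's labels. The names `Node`, `validates`, and `TOY_DTD`, as well as the element names, are hypothetical and chosen only for illustration. This is a direct, per-tree check and is not the model-theoretic calculus studied in the paper.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Node:
    """A node of an unranked tree: a label plus any number of children."""
    label: str
    children: list["Node"] = field(default_factory=list)


def validates(node: Node, dtd: dict[str, str]) -> bool:
    """Return True if every node's child-label sequence matches its DTD rule.

    Assumption: `dtd` maps a label to a regex over child labels, where each
    child label in the encoded word is followed by a single space.
    """
    pattern = dtd.get(node.label)
    if pattern is None:
        # Labels without a rule are rejected in this sketch.
        return False
    # Encode the ordered children as a word, e.g. "title author author ".
    word = "".join(child.label + " " for child in node.children)
    if re.fullmatch(pattern, word) is None:
        return False
    return all(validates(child, dtd) for child in node.children)


# Hypothetical DTD: a book has one title, one or more authors, then chapters.
TOY_DTD = {
    "book": r"title (author )+(chapter )*",
    "title": r"",
    "author": r"",
    "chapter": r"(section )*",
    "section": r"",
}
```

Because the content models are regular expressions over child sequences, the number of children of a node is unbounded, which is what distinguishes the unranked setting from ranked trees.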