DocumentCode
470052
Title
An XML environment for multistructured textual documents
Author
Bruno, Emmanuel ; Murisasco, Elisabeth
Author_Institution
Univ. du Sud Toulon, La Garde
Volume
1
fYear
2007
fDate
28-31 Oct. 2007
Firstpage
230
Lastpage
235
Abstract
XML is the de facto standard for describing structured data. Several information system applications are based on it: electronic publishing, technical documentation, digital libraries, the web, etc. An XML document is mainly hierarchical. In some applications, however, several concurrent hierarchical structures may be associated with the same textual data. This paper presents an XML environment dedicated to representing and querying such documents, which we call multistructured textual documents. Our work aims to propose a method, based on segmentation, for compactly representing multiple trees over a single text. Segmentation encoding allows querying overlap and containment relations between markup belonging to different structures. This paper focuses in particular on the architecture of the XML environment implementing our proposals.
Keywords
XML; information systems; tree data structures; XML environment; concurrent hierarchical structures; data structures; multiple trees; multistructured textual documents; segmentation encoding; Data analysis; Data mining; Documentation; Electronic publishing; Encoding; Large scale integration; Proposals; Software libraries
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Information Management, 2007. ICDIM '07. 2nd International Conference on
Conference_Location
Lyon
Print_ISBN
978-1-4244-1475-8
Electronic_ISBN
978-1-4244-1476-5
Type
conf
DOI
10.1109/ICDIM.2007.4444228
Filename
4444228