• DocumentCode
    470052
  • Title
    An XML environment for multistructured textual documents
  • Author
    Bruno, Emmanuel ; Murisasco, Elisabeth
  • Author_Institution
    Univ. du Sud Toulon, La Garde
  • Volume
    1
  • fYear
    2007
  • fDate
    28-31 Oct. 2007
  • Firstpage
    230
  • Lastpage
    235
  • Abstract
    XML is the de facto standard for describing structured data. Several applications in the context of information systems are based on its use: electronic publishing, technical documentation, digital libraries, the web, etc. An XML document is mainly hierarchical. In some applications, however, several concurrent hierarchical structures can be associated with the same textual data. This paper presents an XML environment dedicated to the representation and querying of such documents, which we call multistructured textual documents. Our work aims to propose a method, based on segmentation, for a compact representation of multiple trees over a single text. Segmentation encoding allows querying the overlap and containment relations between markup elements belonging to different structures. This paper focuses in particular on the architecture of the XML environment implementing our proposals.
  • Keywords
    XML; information systems; tree data structures; XML environment; concurrent hierarchical structures; data structures; multiple trees; multistructured textual documents; segmentation encoding; Data analysis; Data mining; Documentation; Electronic publishing; Encoding; Large scale integration; Proposals; Software libraries
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Digital Information Management, 2007. ICDIM '07. 2nd International Conference on
  • Conference_Location
    Lyon
  • Print_ISBN
    978-1-4244-1475-8
  • Electronic_ISBN
    978-1-4244-1476-5
  • Type
    conf
  • DOI
    10.1109/ICDIM.2007.4444228
  • Filename
    4444228
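
The abstract mentions a segmentation-based encoding that supports overlap and containment queries across concurrent structures. The following is only a minimal sketch of that general idea, not the paper's actual encoding or architecture: the sample text, the two structures (`physical`, `syntactic`), the character offsets, and all function names are hypothetical and chosen purely for illustration. It assumes the shared text is cut at every markup boundary and that each element is then described by a half-open range of segment indices.

```python
from bisect import bisect_left

# Hypothetical shared text and two concurrent structures over it.
# Each element is (name, start, end) in character offsets, end exclusive.
text = "The quick brown fox jumps"
physical = [("line1", 0, 15), ("line2", 15, 25)]
syntactic = [("np", 0, 19), ("vp", 20, 25)]

def segment_boundaries(*structures):
    """Collect every start/end offset used by any structure, sorted."""
    cuts = set()
    for structure in structures:
        for _, start, end in structure:
            cuts.update((start, end))
    return sorted(cuts)

def encode(structure, cuts):
    """Map each element to a half-open range of segment indices."""
    return {
        name: (bisect_left(cuts, start), bisect_left(cuts, end))
        for name, start, end in structure
    }

def contains(a, b):
    """True if segment range a fully covers segment range b."""
    return a[0] <= b[0] and b[1] <= a[1]

def overlaps(a, b):
    """True if a and b share segments but neither contains the other."""
    shares = a[0] < b[1] and b[0] < a[1]
    return shares and not contains(a, b) and not contains(b, a)

cuts = segment_boundaries(physical, syntactic)
phys = encode(physical, cuts)
synt = encode(syntactic, cuts)

print(contains(synt["np"], phys["line1"]))   # noun phrase covers line 1
print(overlaps(synt["np"], phys["line2"]))   # noun phrase straddles the line break
```

Because both structures are expressed over the same segment indices, relations between elements of different hierarchies reduce to integer range comparisons, without merging the hierarchies into a single tree.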