• DocumentCode
    3734073
  • Title

    From text to XML by structural information extraction

  • Author

    Yong Piao;Tianyu Wang;He Jiang

  • Author_Institution
    School of Software, Dalian University of Technology, Dalian 116620, China
  • fYear
    2015
  • Firstpage
    448
  • Lastpage
    452
  • Abstract
    Faced with a tremendous volume of semi-structured XML and unstructured free text, network information retrieval has become one of the most active research topics for handling such data more efficiently, precisely, and uniformly. Many traditional IR methods ignore text semantics, and their labeling results usually have only one level and lack contextual expression. Therefore, structure extraction from free text and its conversion to XML format are studied, and a CRF-based algorithm, SIECRF, is proposed. Experimental results are analyzed, showing that the algorithm is efficient at extracting text structure and has good application prospects.
  • Keywords
    "Hidden Markov models","Information retrieval","Labeling","Semantics","Data mining","Entropy","XML"
  • Publisher
    ieee
  • Conference_Titel
    Computer and Communications (ICCC), 2015 IEEE International Conference on
  • Print_ISBN
    978-1-4673-8125-3
  • Type

    conf

  • DOI
    10.1109/CompComm.2015.7387613
  • Filename
    7387613