• DocumentCode
    2014132
  • Title

    Bibliographic Attributes Extraction with Layer-upon-Layer Tagging

  • Author

    Wei, Wei ; King, Irwin ; Lee, Jimmy Ho-Man

  • Author_Institution
    R. Inst. of Technol. KTH, Stockholm
  • Volume
    2
  • fYear
    2007
  • fDate
    23-26 Sept. 2007
  • Firstpage
    804
  • Lastpage
    808
  • Abstract
    Bibliographic attributes extraction is an important research topic for digital libraries. In this paper we propose a rule-based method for bibliographic attributes extraction with Layer-upon-Layer Tagging (LLT). The method analyzes the appearance and punctuation of bibliographic attributes to perform format and semantic tagging on two defined parsing layers. The method also resorts to specifically constructed lexicons to achieve high accuracy in semantic tagging. In the experimental evaluation on 1,000 reference strings, the accuracy of author tagging reaches 96.8% and the accuracy of whole-reference tagging is 82.9%. The experimental results demonstrate that the proposed LLT method can tag bibliographic attributes in reference strings with a high degree of accuracy.
  • Keywords
    bibliographic systems; digital libraries; knowledge based systems; bibliographic attributes extraction; digital libraries; layer-upon-layer tagging; rule-based method; Data mining; Entropy; Hidden Markov models; Information resources; Optical character recognition software; Performance analysis; Software libraries; Standardization; Tagging; Training data;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis and Recognition, 2007. ICDAR 2007. Ninth International Conference on
  • Conference_Location
    Parana
  • ISSN
    1520-5363
  • Print_ISBN
    978-0-7695-2822-9
  • Type

    conf

  • DOI
    10.1109/ICDAR.2007.4377026
  • Filename
    4377026