• DocumentCode
    2012399
  • Title

    A Strategy for Automatically Extracting References from PDF Documents

  • Author

    Alves, Neide Ferreira ; Lins, Rafael Dueire ; Lencastre, Maria

  • Author_Institution
    Univ. do Estado do Amazonas, Manaus, Brazil
  • fYear
    2012
  • fDate
    27-29 March 2012
  • Firstpage
    435
  • Lastpage
    439
  • Abstract
    The number of citations an author receives is steadily becoming more important than the length of the author's list of publications. The automatic extraction of bibliographic references from scientific articles is still a difficult problem in Document Engineering, even when the document is originally in digital form. This paper presents a strategy for extracting references from scientific documents in PDF format. The proposed scheme was validated on the LiveMemory platform, which was developed to generate digital libraries of proceedings of technical events.
  • Keywords
    bibliographic systems; digital libraries; document image processing; image retrieval; scientific information systems; LiveMemory platform; PDF document; automatic bibliographic reference extraction; digital document; digital libraries; document engineering; scientific articles; scientific documents; Accuracy; Classification algorithms; Data mining; Portable document format; Proposals; Support vector machine classification; Training; bibliographic references; document processing; information extraction; learning; regular expression;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
  • Conference_Location
    Gold Coast, QLD
  • Print_ISBN
    978-1-4673-0868-7
  • Type

    conf

  • DOI
    10.1109/DAS.2012.12
  • Filename
    6195409