DocumentCode
2012399
Title
A Strategy for Automatically Extracting References from PDF Documents
Author
Alves, Neide Ferreira ; Lins, Rafael Dueire ; Lencastre, Maria
Author_Institution
Univ. do Estado do Amazonas, Manaus, Brazil
fYear
2012
fDate
27-29 March 2012
Firstpage
435
Lastpage
439
Abstract
The number of citations an author receives is steadily becoming more important than the length of their publication list. Automatically extracting bibliographic references from scientific articles remains a difficult problem in Document Engineering, even when the document is originally in digital form. This paper presents a strategy for extracting references from scientific documents in PDF format. The proposed scheme was validated on the LiveMemory platform, which was developed to generate digital libraries of the proceedings of technical events.
Keywords
bibliographic systems; digital libraries; document image processing; image retrieval; scientific information systems; LiveMemory platform; PDF document; automatic bibliographic reference extraction; digital document; digital libraries; document engineering; scientific articles; scientific documents; Accuracy; Classification algorithms; Data mining; Portable document format; Proposals; Support vector machine classification; Training; bibliographic references; document processing; information extraction; learning; regular expression;
fLanguage
English
Publisher
IEEE
Conference_Titel
Document Analysis Systems (DAS), 2012 10th IAPR International Workshop on
Conference_Location
Gold Coast, QLD
Print_ISBN
978-1-4673-0868-7
Type
conf
DOI
10.1109/DAS.2012.12
Filename
6195409