DocumentCode
2337877
Title
Robust recognition of complex entities in text exploiting enterprise data and NLP-techniques
Author
Brauer, Falk ; Schramm, Marcus ; Barczynski, Wojciech ; Löser, Alexander ; Do, Hong-Hai
Author_Institution
SAP Res., SAP AG, Dresden
fYear
2008
fDate
13-16 Nov. 2008
Firstpage
551
Lastpage
558
Abstract
Data transactions between business partners often include unstructured data such as invoices or purchase orders. To process such data automatically, complex business entities need to be identified. Examples of complex entities are products, business partners and purchase orders that are stored in a supplier relationship management system. Both structured records in the enterprise system and text data describe these complex entities. A major challenge is to correctly associate entities recognized in unstructured data with entities stored in structured data, such as enterprise databases. We address this problem and propose a robust process methodology comprising three phases: candidate extraction from unstructured text, generation of initial mappings to structured data, and disambiguation of the mappings by exploiting relationships among the entities in the enterprise data and the documents' structure. We describe each step in detail, propose a common architecture, and introduce our data model and algorithms.
Keywords
business data processing; database management systems; text analysis; NLP-techniques; data transactions; enterprise data; enterprise databases; robust recognition; supplier relationship management system; text data; Costs; Current supplies; Data mining; Data models; Databases; Identity management systems; Robustness; Supply chain management; Supply chains; Text recognition;
fLanguage
English
Publisher
IEEE
Conference_Titel
Digital Information Management, 2008. ICDIM 2008. Third International Conference on
Conference_Location
London
Print_ISBN
978-1-4244-2916-5
Electronic_ISBN
978-1-4244-2917-2
Type
conf
DOI
10.1109/ICDIM.2008.4746780
Filename
4746780
Link To Document