DocumentCode
1981797
Title
Language-independent information extraction based on formal concept analysis
Author
Mironczuk, Marcin ; Czerski, Dariusz ; Sydow, Marcin ; Klopotek, Mieczyslaw A.
Author_Institution
Inst. of Comput. Sci., Warsaw, Poland
fYear
2013
fDate
23-25 Sept. 2013
Firstpage
323
Lastpage
329
Abstract
This paper proposes applying Formal Concept Analysis (FCA) to create character-level information extraction patterns and presents BigGrams, a prototype of a language-independent information extraction system. The main goal of the system is to recognise and extract named entities belonging to particular semantic classes (e.g. cars, actors, pop stars) from semi-structured text (web page documents).
Keywords
formal concept analysis; information retrieval; text analysis; BigGrams; FCA; Web page documents; character-level information extraction patterns; language-independent information extraction; named entity extraction; named entity recognition; semistructured text; Context; Data mining; HTML; Lattices; Seals
fLanguage
English
Publisher
IEEE
Conference_Titel
Informatics and Applications (ICIA),2013 Second International Conference on
Conference_Location
Lodz
Print_ISBN
978-1-4673-5255-0
Type
conf
DOI
10.1109/ICoIA.2013.6650277
Filename
6650277