• DocumentCode
    1981797
  • Title

    Language-independent information extraction based on formal concept analysis

  • Author

    Mironczuk, Marcin ; Czerski, Dariusz ; Sydow, Marcin ; Klopotek, Mieczyslaw A.

  • Author_Institution
    Inst. of Comput. Sci., Warsaw, Poland
  • fYear
    2013
  • fDate
    23-25 Sept. 2013
  • Firstpage
    323
  • Lastpage
    329
  • Abstract
    This paper proposes applying Formal Concept Analysis (FCA) to create character-level information extraction patterns and presents BigGrams, a prototype of a language-independent information extraction system. The main goal of the system is to recognise and extract named entities belonging to given semantic classes (e.g. cars, actors, pop stars) from semi-structured text (web page documents).
  • Keywords
    formal concept analysis; information retrieval; text analysis; BigGrams; FCA; Web page documents; character-level information extraction patterns; language-independent information extraction; named entity extraction; named entity recognition; semistructured text; Context; Data mining; HTML; Lattices; Seals
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Informatics and Applications (ICIA),2013 Second International Conference on
  • Conference_Location
    Lodz
  • Print_ISBN
    978-1-4673-5255-0
  • Type

    conf

  • DOI
    10.1109/ICoIA.2013.6650277
  • Filename
    6650277