• DocumentCode
    1021887
  • Title

    Acquisition of linguistic patterns for knowledge-based information extraction

  • Author

    Kim, Jun-Tae ; Moldovan, Dan I.

  • Author_Institution
    Dept. of Comput. Eng., Dongguk Univ., Seoul, South Korea
  • Volume
    7
  • Issue
    5
  • fYear
    1995
  • fDate
    10/1/1995 12:00:00 AM
  • Firstpage
    713
  • Lastpage
    724
  • Abstract
    The paper presents the automatic acquisition of linguistic patterns that can be used for knowledge-based information extraction from texts. In knowledge-based information extraction, linguistic patterns play a central role in the recognition and classification of input texts. Although the knowledge-based approach has proved effective for information extraction in limited domains, it is difficult to construct the large number of domain-specific linguistic patterns it requires. Manual creation of patterns is time-consuming and error-prone, even for a small application domain. To solve the scalability and portability problems, automatic acquisition of patterns must be provided. We present the PALKA (Parallel Automatic Linguistic Knowledge Acquisition) system, which acquires linguistic patterns from a set of domain-specific training texts and their desired outputs. A specialized representation of patterns, called FP structures, has been defined. Patterns are constructed in the form of FP structures from training texts, and the acquired patterns are tuned further through the generalization of semantic constraints. An inductive learning mechanism is applied in the generalization step. The PALKA system has been used to generate patterns for our information extraction system developed for the fourth Message Understanding Conference (MUC-4).
  • Keywords
    knowledge acquisition; knowledge based systems; learning by example; linguistics; natural languages; pattern recognition; word processing; FP structures; PALKA; Parallel Automatic Linguistic Knowledge Acquisition; automatic acquisition; domain specific linguistic patterns; domain specific training text; input text; knowledge based information extraction; knowledge based natural language processing; knowledge-based information extraction; linguistic pattern acquisition; semantic constraints; Data mining; Instruments; Knowledge acquisition; Learning systems; Natural language processing; Pattern analysis; Scalability; Terrorism; Text analysis; Text processing
  • fLanguage
    English
  • Journal_Title
    Knowledge and Data Engineering, IEEE Transactions on
  • Publisher
    IEEE
  • ISSN
    1041-4347
  • Type

    jour

  • DOI
    10.1109/69.469825
  • Filename
    469825