• DocumentCode
    3082220
  • Title

    Entity Relation Extraction from geological text using Conditional Random Fields and subsequence kernels

  • Author

    Sobhana, N.V. ; Ghosh, Soumya K. ; Mitra, Pinaki

  • Author_Institution
    Indian Inst. of Technol., Kharagpur, Kharagpur, India
  • fYear
    2012
  • fDate
    7-9 Dec. 2012
  • Firstpage
    832
  • Lastpage
    840
  • Abstract
    Entity Relation Extraction is an important research field in text mining. Extracting relations between geological entities is of great benefit in developing intelligent search tools for geology researchers. In this paper, Conditional Random Fields (CRFs) and subsequence kernels are used to extract relations between entities from a geological corpus. The corpus was developed from a collection of scientific reports and articles on the geology of the Indian subcontinent. The training set, consisting of more than 200K words, has been annotated with a named entity tag set of seventeen tags and with labeled instances of part-of and near-by relations. The system recognizes part-of and near-by relations with F-measure values of 71.57% and 77.27% for T-CRF, and 78.25% and 83.71% for subsequence kernels. The extracted relations were used for query expansion in a retrieval system, achieving a gain of 10.86% for T-CRF and 10.58% for subsequence kernels over the baseline Mean Average Precision.
  • Keywords
    data mining; geographic information systems; query processing; text analysis; F-measure values; Indian subcontinent geology; T-CRF; baseline mean average precision; conditional random fields; entity relation extraction; geological corpus; geological text; intelligent search tools; query expansion; retrieval system; scientific reports collection; sequence kernels; subsequence kernels; text mining; Feature extraction; Geology; Kernel; Labeling; Semantics; Training; Weight measurement; F-measure; Geological corpus; Mean Average Precision; Precision; Recall
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    India Conference (INDICON), 2012 Annual IEEE
  • Conference_Location
    Kochi
  • Print_ISBN
    978-1-4673-2270-6
  • Type

    conf

  • DOI
    10.1109/INDCON.2012.6420733
  • Filename
    6420733