• DocumentCode
    3516146
  • Title
    Hierarchical FCA-based conceptual model of text documents used in information retrieval system
  • Author
    Butka, P. ; Pócsová, J.
  • Author_Institution
    Fac. of Econ., Tech. Univ. of Kosice, Kosice, Slovakia
  • fYear
    2011
  • fDate
    19-21 May 2011
  • Firstpage
    199
  • Lastpage
    204
  • Abstract
    Searching for relevant documents in large document collections is one of the key tasks in the areas of the semantic web and knowledge technologies. This paper deals with the analysis and design of an improvement to information retrieval (IR) that uses a specific conceptual model created automatically from a set of text documents without semantic annotation. The conceptual model combines locally applied Formal Concept Analysis (FCA) with agglomerative clustering of the resulting local models into a single structure, which is suitable for supporting the information retrieval process and can be combined with standard full-text search. FCA is one of the approaches that can be applied to conceptual modeling in the domain of text documents. An extension of classic FCA (which works on binary table data) is the one-sided fuzzy version, which works with real values in the object-attribute table (the document-term matrix in the case of a vector representation of text documents). In our approach, the starting set of documents is decomposed into smaller sets of similar documents using a partitional clustering algorithm. One concept lattice is then built for every cluster using the FCA method, and these FCA-based models are combined into a hierarchy of concept lattices using an agglomerative clustering algorithm. Finally, we define the basic details and methods of an IR system that combines standard full-text search with conceptual search (using the extracted concept hierarchy).
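    The one-sided fuzzy FCA step named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used one-sided fuzzy formulation (crisp sets of documents, fuzzy sets of terms, with the per-term minimum as the aggregation). The function names, the brute-force enumeration over document subsets, and the toy matrix are illustrative assumptions, not the authors' implementation.

```python
from itertools import combinations


def up(objects, table, n_attrs):
    """Fuzzy intent of a crisp document set: per-term minimum weight."""
    if not objects:
        # The empty set of documents is mapped to the top fuzzy set.
        return tuple(1.0 for _ in range(n_attrs))
    return tuple(min(table[o][a] for o in objects) for a in range(n_attrs))


def down(intent, table):
    """Crisp extent: documents whose weights reach the intent on every term."""
    return frozenset(
        o for o, row in enumerate(table)
        if all(value >= bound for value, bound in zip(row, intent))
    )


def concepts(table):
    """Enumerate all one-sided fuzzy concepts by brute force.

    Only feasible for small tables (for example one cluster of documents),
    because it visits every subset of documents.
    """
    n_objs, n_attrs = len(table), len(table[0])
    found = {}
    for size in range(n_objs + 1):
        for subset in combinations(range(n_objs), size):
            intent = up(subset, table, n_attrs)
            extent = down(intent, table)  # closure of the subset
            found[extent] = intent
    return found


# Hypothetical toy document-term matrix: rows are documents, columns are
# term weights in [0, 1].
TABLE = [
    [0.8, 0.1, 0.0],
    [0.6, 0.5, 0.0],
    [0.0, 0.4, 0.9],
]

if __name__ == "__main__":
    for extent, intent in sorted(concepts(TABLE).items(),
                                 key=lambda item: len(item[0])):
        print(sorted(extent), intent)
```

    In the pipeline the abstract describes, a routine of this kind would be run once per cluster produced by the partitional clustering step, and the resulting lattices would then be merged by agglomerative clustering. How those two steps are configured is not specified in this record.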
  • Keywords
    formal concept analysis; fuzzy set theory; information retrieval; pattern clustering; semantic Web; text analysis; FCA; agglomerative clustering algorithm; hierarchical conceptual model; information retrieval system; knowledge technologies; object attribute table; one sided fuzzy version; text documents; Analytical models; Clustering algorithms; Computational modeling; Indexes; Information retrieval; Lattices; Merging
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Applied Computational Intelligence and Informatics (SACI), 2011 6th IEEE International Symposium on
  • Conference_Location
    Timisoara
  • Print_ISBN
    978-1-4244-9108-7
  • Type
    conf
  • DOI
    10.1109/SACI.2011.5872999
  • Filename
    5872999