• DocumentCode
    476751
  • Title
    Using linguistic patterns in FCA-based approach for automatic acquisition of taxonomies from Malay text
  • Author
    Nazri, Mohd Zakree Ahmad ; Shamsudin, Siti Mariyam ; Bakar, Azuraliza Abu ; Ghani, Tarmizi Abd
  • Author_Institution
    Center for Artificial Intelligence Technology, FTSM, Universiti Kebangsaan Malaysia, 43600 Bangi, Selangor, Malaysia
  • Volume
    2
  • fYear
    2008
  • fDate
    26-28 Aug. 2008
  • Firstpage
    1
  • Lastpage
    7
  • Abstract
    Previous work has shown that Formal Concept Analysis (FCA) can be used to automatically acquire taxonomies from Indo-European text. The taxonomies are built via FCA using syntactic dependencies such as verb/head-object, verb/head-subject and verb/prepositional phrase-complement as attributes. This paper discusses the overall process of learning a taxonomy using FCA with the same syntactic dependencies as those used for English, applied to Malay texts. Malay, an Austronesian language, follows the same Subject-Verb-Object sentence structure as English but is syntactically different. The results show lower recall and precision than related work in other languages. The poor results are caused by several factors, such as the choice of smoothing technique. The experimental results indicate that the current smoothing technique does not produce good results with FCA. Therefore, in addition to the syntactic dependencies, we used linguistic patterns such as Hearst’s patterns to find similarities between terms. We compare the results of our technique against the cosine measure used in the FCA-based taxonomy learning approach. The proposed technique attains both higher precision and higher recall than the previous technique.
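The two similarity signals the abstract contrasts can be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the term/attribute counts in `CONTEXTS`, the single English "such as" pattern, and the function names `cosine` and `hearst_pairs` are all assumptions made for the example, and the paper itself works on Malay text.

```python
import math
import re
from collections import Counter

# Hypothetical term -> attribute counts, standing in for verb/object
# dependencies extracted from a corpus. The numbers are invented.
CONTEXTS = {
    "car": Counter({"drive": 3, "rent": 2, "park": 1}),
    "bike": Counter({"ride": 3, "rent": 2, "park": 1}),
    "hotel": Counter({"book": 4, "rent": 1}),
}


def cosine(a, b):
    """Cosine similarity between two sparse attribute-count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# One Hearst-style pattern (assumed, English only):
# "<hypernym> such as <hyponym>, <hyponym> and <hyponym>"
HEARST_SUCH_AS = re.compile(r"(\w+) such as ((?:\w+(?:, | and | or )?)+)")


def hearst_pairs(text):
    """Return (hyponym, hypernym) pairs matched by the 'such as' pattern."""
    pairs = []
    for match in HEARST_SUCH_AS.finditer(text):
        hypernym = match.group(1)
        for hyponym in re.split(r", | and | or ", match.group(2)):
            if hyponym:
                pairs.append((hyponym, hypernym))
    return pairs


if __name__ == "__main__":
    print(cosine(CONTEXTS["car"], CONTEXTS["bike"]))
    print(hearst_pairs("vehicles such as cars, bikes and buses are rented"))
```

The cosine measure depends on how often terms share attributes, so sparse counts give weak evidence. A pattern match gives direct evidence of a relation from a single sentence, which may be why it helps when dependency data is sparse.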
  • Keywords
    Artificial intelligence; Computer graphics; Data mining; Machine learning; Natural language processing; Natural languages; Ontologies; Pattern analysis; Smoothing methods; Taxonomy;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Information Technology, 2008. ITSim 2008. International Symposium on
  • Conference_Location
    Kuala Lumpur, Malaysia
  • Print_ISBN
    978-1-4244-2327-9
  • Electronic_ISBN
    978-1-4244-2328-6
  • Type
    conf
  • DOI
    10.1109/ITSIM.2008.4631709
  • Filename
    4631709