• DocumentCode
    3659937
  • Title
    Discovering the Thematic Structure of the Quran using Probabilistic Topic Model
  • Author
    Muazzam Ahmed Siddiqui;Syed Muhammad Faraz;Sohail Abdul Sattar
  • Author_Institution
    Dept. of Inf. Syst., King Abdulaziz Univ., Jeddah, Saudi Arabia
  • fYear
    2013
  • Firstpage
    234
  • Lastpage
    239
  • Abstract
    Topic modeling refers to extracting topics from text. A topic model is a statistical model that aims to discover topics in a large collection of documents. A topic is a collection of words that are likely to occur together in the context of a given theme. This paper applies a topic model to discover the thematic structure of the Quran. For centuries, the Quran has been widely studied for the topics it contains and the relationships among them. The Holy Quran holds a tremendous amount of information addressing various aspects of human life, social as well as individual. This information is conceptually related, although its individual pieces may appear unstructured and scattered. This paper uses a computational method to identify this hidden thematic structure automatically. We treated each surah in the Quran as a document and used Latent Dirichlet Allocation, a probabilistic topic modeling algorithm, to discover the topics/themes. The original Arabic text of the Quran was used as the corpus, rather than a transliteration or translation. The results are promising: we were able to discover the major themes in the surahs, along with the most important terms that describe these themes.
  • Keywords
    "Probabilistic logic","Computational modeling","Hidden Markov models","Data models","Vocabulary","Resource management","Text mining"
  • Publisher
    IEEE
  • Conference_Title
    Advances in Information Technology for the Holy Quran and Its Sciences (32519), 2013 Taibah University International Conference on
  • Type
    conf
  • DOI
    10.1109/NOORIC.2013.55
  • Filename
    7277252
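
The abstract describes treating each surah as one document and fitting Latent Dirichlet Allocation to recover themes and their most descriptive terms. The record does not say which implementation, inference method, preprocessing, or hyperparameters the authors used, so the following is only a minimal sketch of LDA with collapsed Gibbs sampling in pure Python. The function names `fit_lda` and `top_terms`, the values of `alpha`, `beta`, and `iters`, and the toy corpus are all assumptions made for illustration; the placeholder tokens stand in for the tokenized Arabic text of each surah and are not drawn from the Quran.

```python
import random


def fit_lda(docs, num_topics, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling (hypothetical helper, not the paper's code)."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)
    doc_topic = [[0] * num_topics for _ in docs]        # tokens in doc d assigned to topic k
    topic_word = [[0] * V for _ in range(num_topics)]   # times word v is assigned to topic k
    topic_total = [0] * num_topics                      # total tokens assigned to topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove this token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Unnormalized conditional probability of each topic for this token.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates: theta is per-document topic mix, phi is per-topic word distribution.
    theta = [
        [(c + alpha) / (len(doc) + num_topics * alpha) for c in row]
        for row, doc in zip(doc_topic, docs)
    ]
    phi = [
        [(c + beta) / (topic_total[k] + V * beta) for c in topic_word[k]]
        for k in range(num_topics)
    ]
    return vocab, theta, phi


def top_terms(vocab, phi, n=3):
    """Return the n highest-probability terms of each topic."""
    return [
        [vocab[i] for i in sorted(range(len(vocab)), key=lambda i: -row[i])[:n]]
        for row in phi
    ]


# Toy corpus (assumption): each inner list plays the role of one tokenized surah.
docs = [
    ["a1", "a2", "a3", "a1", "a2"],
    ["a2", "a3", "a1", "a3"],
    ["b1", "b2", "b3", "b1", "b2"],
    ["b3", "b2", "b1", "b3"],
]
vocab, theta, phi = fit_lda(docs, num_topics=2)
themes = top_terms(vocab, phi)
for k, terms in enumerate(themes):
    print(f"topic {k}: {terms}")
```

Each row of `theta` gives the estimated mix of topics in one document, and each row of `phi` gives a topic's distribution over the vocabulary, which is where the "most important terms" per theme come from. On a real corpus the number of topics, the priors, and any Arabic-specific preprocessing (tokenization, stop-word removal, stemming) would need to be chosen deliberately; none of those choices are stated in this record.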