DocumentCode :
3659964
Title :
KSUCCA the Corner Stone for Studying the Meanings of the Holy Quran Words in the Light of Distributional Semantic Models
Author :
AbdulMalik Al-Salman;Maha Sulaiman Alrabiah;Eric Atwell
Author_Institution :
Dept. of Comput. Sci., King Saud Univ., Riyadh, Saudi Arabia
fYear :
2013
Firstpage :
652
Lastpage :
659
Abstract :
Distributional semantic models are one of the empiricist approaches to studying language structure and design. They are based mainly on building semantic models of words' meanings through statistical analysis of their distribution in very large corpora. In this paper, we present the King Saud University Corpus of Classical Arabic (KSUCCA), which is intended as the cornerstone for studying distributional lexical semantic models of the words of the Holy Quran. It is a free corpus of more than 50 million words, containing texts dating from the pre-Islamic era to the fourth Hijri century. We describe the design guidelines for KSUCCA, including its aim, balance, representation, text sampling, copyright, character encoding, and file organization. We also report some preliminary experiments carried out on KSUCCA and their results.
Keywords :
"Semantics","Computational modeling","Computer science","Information technology","Buildings","Analytical models","Statistical analysis"
Publisher :
IEEE
Conference_Title :
Advances in Information Technology for the Holy Quran and Its Sciences (32519), 2013 Taibah University International Conference on
Type :
conf
DOI :
10.1109/NOORIC.2013.103
Filename :
7277300