DocumentCode :
3055386
Title :
Zipf’s law of burstiness in Turkish: The length of intervals between repetitions
Author :
Kocabas, Ilker ; Kisla, Tarik ; Karaoglan, Bahar
Author_Institution :
Ege Univ., Izmir
fYear :
2007
fDate :
7-9 Nov. 2007
Firstpage :
1
Lastpage :
3
Abstract :
Zipf's law of burstiness for content words has been studied less than his laws describing the relation between the rank and the frequency of words. Zipf counted the number of intervals of the same length between repetitions of words belonging to the same frequency class and, on a 260,000-word English corpus, showed empirically that the number of intervals, F, having a given size is inversely related to the interval size, I, between successive occurrences of a word: F ∝ I^{-p}, where p varied between 1 and 1.3. In this study we investigated the validity of the law of burstiness on a Turkish corpus of size 55,000 and found p varying between 0.5 and 0.8.
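The counting procedure outlined in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the function names `interval_counts` and `fit_exponent` are hypothetical, an interval is assumed to be the difference in token positions between successive occurrences of the same word, and p is assumed to be estimated by an ordinary least-squares fit on log-log axes, which the record does not confirm.

```python
import math
from collections import Counter


def interval_counts(tokens, words=None):
    """Count how many repetition intervals of each length occur.

    `words` optionally restricts counting to one set of words, for
    example a single frequency class (an assumption of this sketch).
    """
    last_seen = {}
    counts = Counter()
    for pos, tok in enumerate(tokens):
        if words is not None and tok not in words:
            continue
        if tok in last_seen:
            # Interval = distance in tokens since the previous occurrence.
            counts[pos - last_seen[tok]] += 1
        last_seen[tok] = pos
    return counts


def fit_exponent(counts):
    """Estimate p in F ∝ I^(-p) from the slope of log F against log I."""
    if len(counts) < 2:
        raise ValueError("need at least two distinct interval lengths")
    xs = [math.log(i) for i in counts]
    ys = [math.log(f) for f in counts.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is -p, so negate it.
    return -sxy / sxx
```

Calling `interval_counts` on a tokenized corpus and passing the result to `fit_exponent` gives one estimate of p; repeating this per frequency class would mirror the grouping the abstract mentions.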
Keywords :
natural language processing; English corpus; Turkish word burstiness; Zipf law; content word; Books; Differential equations; Frequency measurement; Indexing; Information retrieval; Length measurement; Mathematical model; Measurement standards; Measurement units; Natural language processing;
fLanguage :
English
Publisher :
IEEE
Conference_Titel :
Computer and Information Sciences, 2007. ISCIS 2007. 22nd International Symposium on
Conference_Location :
Ankara
Print_ISBN :
978-1-4244-1363-8
Electronic_ISBN :
978-1-4244-1364-5
Type :
conf
DOI :
10.1109/ISCIS.2007.4456847
Filename :
4456847