• DocumentCode
    3055386
  • Title

    Zipf’s law of burstiness in Turkish: The length of intervals between repetitions

  • Author

    Kocabas, Ilker ; Kisla, Tarik ; Karaoglan, Bahar

  • Author_Institution
    Ege Univ., Izmir
  • fYear
    2007
  • fDate
    7-9 Nov. 2007
  • Firstpage
    1
  • Lastpage
    3
  • Abstract
    Zipf's law of burstiness of content words has been studied less than his laws describing the relation between the rank and the frequency of words. Zipf counted the number of intervals of the same length between repetitions of words belonging to the same frequency class and, on a 260,000-word English corpus, showed empirically that the number of intervals of a given size, F, falls off as an inverse power of the interval size, I, between successive occurrences of a word: F ∝ I^(-p), where p varied between 1 and 1.3. In this study we investigated the validity of the law of burstiness on a Turkish corpus of size 55,000 and found p varying between 0.5 and 0.8.
  • Keywords
    natural language processing; English corpus; Turkish word burstiness; Zipf law; content word; Books; Differential equations; Frequency measurement; Indexing; Information retrieval; Length measurement; Mathematical model; Measurement standards; Measurement units; Natural language processing
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Computer and Information Sciences, 2007. ISCIS 2007. 22nd International Symposium on
  • Conference_Location
    Ankara
  • Print_ISBN
    978-1-4244-1363-8
  • Electronic_ISBN
    978-1-4244-1364-5
  • Type

    conf

  • DOI
    10.1109/ISCIS.2007.4456847
  • Filename
    4456847
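
The measurement described in the abstract can be illustrated with a minimal sketch. It is not the authors' code, and the function names are hypothetical. It assumes the corpus is already a list of tokens, that an interval is the difference in token positions between successive occurrences of a word, and that p is estimated by an ordinary least-squares fit on log-log axes; the record does not state how the authors defined intervals or fitted p.

```python
import math
from collections import Counter, defaultdict


def repetition_intervals(tokens):
    """Map each word to the gaps between its successive occurrences.

    Assumption: a gap is the difference in token positions.
    """
    last_seen = {}
    gaps = defaultdict(list)
    for pos, word in enumerate(tokens):
        if word in last_seen:
            gaps[word].append(pos - last_seen[word])
        last_seen[word] = pos
    return gaps


def interval_histogram(tokens, frequency):
    """Count intervals of each size, pooled over all words that occur
    exactly `frequency` times (one frequency class)."""
    word_counts = Counter(tokens)
    gaps = repetition_intervals(tokens)
    hist = Counter()
    for word, n in word_counts.items():
        if n == frequency:
            hist.update(gaps[word])
    return hist


def fit_exponent(hist):
    """Estimate p in F proportional to I**(-p) as the negated
    least-squares slope of log F against log I."""
    if len(hist) < 2:
        raise ValueError("need at least two distinct interval sizes")
    xs = [math.log(size) for size in hist]
    ys = [math.log(count) for count in hist.values()]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx
```

Calling `interval_histogram(tokens, k)` for each frequency class k and passing the result to `fit_exponent` yields one estimate of p per class, which is one way to arrive at a range of exponents like the ones the abstract reports.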