• DocumentCode
    3721175
  • Title

    Language tweet characteristics of Indonesian citizens

  • Author

    Ahmad Fathan Hidayatullah

  • Author_Institution
    Department of Informatics, Universitas Islam Indonesia (UII), Yogyakarta, Indonesia
  • fYear
    2015
  • Firstpage
    397
  • Lastpage
    401
  • Abstract
    Indonesia is a large country with thousands of islands and hundreds of languages and dialects. These conditions give rise to many different habits and behaviours among its people, including in their social media activity. Twitter and other social media impose no language rules on users, so people are free to write whatever they like, without restriction, when posting tweets. In general, five types of writing are present in the dataset: tweets written in the standard form of Bahasa, tweets mixing Bahasa with a local language, tweets mixing Bahasa with a foreign language, tweets containing abbreviations, and tweets containing slang words. Moreover, this investigation found sixteen characteristics of Indonesian tweets, some of which are combinations of the five writing styles. Based on an understanding of the characteristics of writing style in Twitter messages, we propose an algorithm for the pre-processing step that converts non-standard words into their standard form in Bahasa Indonesia.
  • Keywords
    "Twitter","Media","Standards","Pragmatics","Dictionaries","Speech"
  • Publisher
    IEEE
  • Conference_Titel
    Science and Technology (TICST), 2015 International Conference on
  • Type

    conf

  • DOI
    10.1109/TICST.2015.7369393
  • Filename
    7369393