DocumentCode
3721175
Title
Language tweet characteristics of Indonesian citizens
Author
Ahmad Fathan Hidayatullah
Author_Institution
Department of Informatics, Universitas Islam Indonesia (UII), Yogyakarta, Indonesia
fYear
2015
Firstpage
397
Lastpage
401
Abstract
Indonesia is a large country with thousands of islands and hundreds of languages and dialects. This diversity shapes people's habits and behaviour, including their activity on social media. Twitter and other social media impose no language rules on users, so people can write freely, without any constraints, when posting tweets. In general, five types of writing appear in the dataset: tweets written in the normal form of Bahasa, Bahasa mixed with a local language, Bahasa mixed with a foreign language, tweets containing abbreviations, and tweets containing slang words. Moreover, this investigation found sixteen characteristics of Indonesian tweets, some of which are combinations of the five writing styles. Based on these writing-style characteristics of Twitter messages, we propose a pre-processing algorithm that converts non-standard words into their standard form in Bahasa Indonesia.
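The abstract does not describe the pre-processing algorithm itself, so the following is only a minimal sketch of one common approach, dictionary-based normalization, and not the author's method. The function name `normalize_tweet`, the lookup table `NORMALIZATION_MAP`, and its entries are assumptions chosen for illustration.

```python
# Hypothetical lookup table: a few common Indonesian abbreviations and
# informal spellings mapped to standard forms. These entries are
# illustrative only and are not taken from the paper.
NORMALIZATION_MAP = {
    "yg": "yang",
    "tdk": "tidak",
    "gak": "tidak",
    "dgn": "dengan",
    "utk": "untuk",
}


def normalize_tweet(text, mapping=NORMALIZATION_MAP):
    """Lowercase the text, split on whitespace, and replace each token
    that appears in the mapping with its standard form."""
    tokens = text.lower().split()
    return " ".join(mapping.get(token, token) for token in tokens)


if __name__ == "__main__":
    print(normalize_tweet("Saya tdk pergi dgn dia"))
```

A lookup of this kind only handles tokens that match a dictionary key exactly; punctuation attached to a word, elongated spellings, and mixed-language text would need additional steps.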
Keywords
"Twitter","Media","Standards","Pragmatics","Dictionaries","Speech"
Publisher
ieee
Conference_Titel
Science and Technology (TICST), 2015 International Conference on
Type
conf
DOI
10.1109/TICST.2015.7369393
Filename
7369393