Title :
Twitter vs. printed English: An information-theoretic comparison
Author :
Glennon, Emma ; Sankar, Lalitha ; Poor, H. Vincent
Author_Institution :
Dept. of Electr. Eng., Princeton Univ., Princeton, NJ, USA
Abstract :
The popular social networking and microblogging service Twitter contains language that is very different from what is considered proper. This paper quantifies those linguistic differences between printed English and Tweetspeak using information-theoretic concepts. Letter-based n-gram entropies are calculated and compared to analogous data from two corpora of printed English to demonstrate that 1) Twitter's entropy is overall higher than that of printed English, and 2) individual users' entropies are, on average, higher the less conventional their language use is. The implications for digitally-mediated communication in general are also discussed.
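As an illustration of the quantity named in the abstract, the following is a minimal sketch of a letter-based n-gram entropy calculation. It uses the simple plug-in (relative-frequency) estimator and the Shannon-style difference of successive block entropies; the abstract does not state which estimator, alphabet, or text normalization the authors used, so the function names `ngram_entropy` and `conditional_entropy` and all of those choices are assumptions made here for illustration only.

```python
import math
from collections import Counter


def ngram_entropy(text: str, n: int) -> float:
    """Plug-in entropy, in bits, of the overlapping letter n-grams in `text`.

    Assumption: `text` is already normalized (case, punctuation, alphabet)
    in whatever way the comparison requires; no cleaning is done here.
    """
    if n < 1:
        raise ValueError("n must be >= 1")
    # Slide a window of width n over the text to collect overlapping n-grams.
    grams = [text[i:i + n] for i in range(len(text) - n + 1)]
    if not grams:
        return 0.0
    counts = Counter(grams)
    total = len(grams)
    # H = -sum p * log2(p), with p estimated by relative frequency.
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def conditional_entropy(text: str, n: int) -> float:
    """Difference of block entropies, H(n-grams) - H((n-1)-grams).

    This approximates the entropy of a letter given the n-1 letters before
    it. On short texts the plug-in estimate is biased and can be unreliable.
    """
    if n == 1:
        return ngram_entropy(text, 1)
    return ngram_entropy(text, n) - ngram_entropy(text, n - 1)
```

With a corpus loaded as a single string, calling `ngram_entropy(corpus, 1)`, `ngram_entropy(corpus, 2)`, and so on gives values that can be compared across corpora, provided every corpus is normalized the same way and is large enough for the n-gram counts to be meaningful.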
Keywords :
computer mediated communication; entropy; linguistics; social networking (online); Tweetspeak; Twitter; digitally-mediated communication; information-theoretic comparison; letter-based n-gram entropies; linguistic differences; microblogging service; printed English; social networking service; Educational institutions; Handicapped aids; Radio access networks; Redundancy; Standards; information entropy; information theory
Conference_Title :
Acoustics, Speech and Signal Processing (ICASSP), 2012 IEEE International Conference on
Conference_Location :
Kyoto
Print_ISBN :
978-1-4673-0045-2
ISSN :
1520-6149
DOI :
10.1109/ICASSP.2012.6288563