DocumentCode :
2915682
Title :
Online calculation of word-clouds for efficient label summarization
Author :
Carmona-Cejudo, José M. ; Baena-García, Manuel ; Castillo, Gladys ; Morales-Bueno, Rafael
Author_Institution :
Dept. Lenguajes y Cienc. de la Comput., Univ. de Malaga, Malaga, Spain
fYear :
2011
fDate :
22-24 Nov. 2011
Firstpage :
1056
Lastpage :
1061
Abstract :
Large amounts of information are available on the Internet in the form of natural language text that can be processed as a stream of documents. Users need solutions that summarize this vast volume of data. Word clouds are a popular graphical representation that gives them a quick visual summary. However, computing word clouds exactly has high memory requirements and does not scale as the collection grows. In this work, we provide a method for approximate online computation of word clouds that summarize the contents of each label or category of a given text stream, using only the most relevant terms of each document according to some weighting function. We show experimentally that our method, which is based on sketching techniques, performs well while using a bounded amount of memory.
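The idea in the abstract can be illustrated with a minimal Python sketch, assuming a count-min sketch per label (the keywords name this structure, but the abstract does not give the authors' exact algorithm). The names `CountMinSketch` and `LabelWordClouds`, the parameters `width`, `depth`, `terms_per_doc`, and `capacity`, and the use of raw in-document term frequency as the weighting function are all hypothetical choices made for illustration; the paper's keywords suggest tf-idf as the weighting.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Fixed-size approximate counter. With non-negative weights,
    estimates can overcount because of collisions but never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.rows = [[0.0] * width for _ in range(depth)]

    def _index(self, row, term):
        # One salted hash per row stands in for independent hash functions.
        digest = hashlib.blake2b(
            term.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, term, weight=1.0):
        for row in range(self.depth):
            self.rows[row][self._index(row, term)] += weight

    def estimate(self, term):
        return min(
            self.rows[row][self._index(row, term)] for row in range(self.depth)
        )


class LabelWordClouds:
    """Hypothetical wrapper: one sketch plus a bounded candidate list per label."""

    def __init__(self, terms_per_doc=2, capacity=50):
        self.terms_per_doc = terms_per_doc  # relevant terms kept per document
        self.capacity = capacity            # candidate terms kept per label
        self.sketches = {}
        self.candidates = {}

    def update(self, label, tokens):
        if label not in self.sketches:
            self.sketches[label] = CountMinSketch()
            self.candidates[label] = {}
        sketch = self.sketches[label]
        cand = self.candidates[label]
        # Assumption: term frequency in the document is the weighting function.
        for term, weight in Counter(tokens).most_common(self.terms_per_doc):
            sketch.add(term, weight)
            cand[term] = sketch.estimate(term)
            if len(cand) > self.capacity:
                # Evict the candidate with the smallest estimated weight.
                del cand[min(cand, key=cand.get)]

    def cloud(self, label, n=10):
        cand = self.candidates.get(label, {})
        return sorted(cand.items(), key=lambda kv: kv[1], reverse=True)[:n]


clouds = LabelWordClouds(terms_per_doc=2)
clouds.update("sports", ["goal", "goal", "match", "goal"])
clouds.update("sports", ["match", "goal", "team"])
print(clouds.cloud("sports", n=2))
```

Memory per label is fixed by `width`, `depth`, and `capacity` regardless of how many documents arrive, which is the trade-off the abstract describes: approximate weights in exchange for a bounded footprint.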
Keywords :
Internet; functions; natural language processing; text analysis; word processing; approximate online computation; graphical representation approach; natural language text stream; quick visual summary; sketching technique; weighting function; word cloud; Approximation methods; Data structures; Intelligent systems; Memory management; Radiation detectors; Tag clouds; Visualization; count-min sketch; text mining; tfidf
fLanguage :
English
Publisher :
IEEE
Conference_Title :
Intelligent Systems Design and Applications (ISDA), 2011 11th International Conference on
Conference_Location :
Cordoba
ISSN :
2164-7143
Print_ISBN :
978-1-4577-1676-8
Type :
conf
DOI :
10.1109/ISDA.2011.6121798
Filename :
6121798