• DocumentCode
    2915682
  • Title
    Online calculation of word-clouds for efficient label summarization
  • Author
    Carmona-Cejudo, José M. ; Baena-García, Manuel ; Castillo, Gladys ; Morales-Bueno, Rafael
  • Author_Institution
    Dept. Lenguajes y Cienc. de la Comput., Univ. de Málaga, Málaga, Spain
  • fYear
    2011
  • fDate
    22-24 Nov. 2011
  • Firstpage
    1056
  • Lastpage
    1061
  • Abstract
    Large amounts of information are available on the Internet in the form of natural language text that can be processed as a stream of documents. Users need solutions that summarize this vast volume of data. Word clouds are a popular graphical representation that gives users a quick visual summary. However, computing them exactly has high memory requirements and does not scale as the collection grows. In this work, we provide a method for approximate online computation of word clouds that summarize the contents of each label or category of a given text stream, using only the most relevant terms of each document according to some weighting function. We show experimentally that our method, based on sketching techniques, performs well while using a limited amount of memory.
  • Keywords
    Internet; functions; natural language processing; text analysis; word processing; approximate online computation; graphical representation approach; natural language text stream; quick visual summary; sketching technique; weighting function; word cloud; Approximation methods; Data structures; Intelligent systems; Memory management; Radiation detectors; Tag clouds; Visualization; count-min sketch; text mining; tfidf
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Intelligent Systems Design and Applications (ISDA), 2011 11th International Conference on
  • Conference_Location
    Cordoba
  • ISSN
    2164-7143
  • Print_ISBN
    978-1-4577-1676-8
  • Type
    conf
  • DOI
    10.1109/ISDA.2011.6121798
  • Filename
    6121798
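
The abstract names sketching techniques, and the keywords mention a count-min sketch, as the basis for approximate per-label term counts. The record gives no implementation details, so the following is only a minimal illustrative sketch and not the authors' method. The class `CountMinSketch`, the helper `update_label_sketch`, the parameters `width`, `depth`, and `top_k`, and the use of raw term frequency in place of the paper's weighting function are all assumptions made for this example.

```python
import hashlib
from collections import Counter


class CountMinSketch:
    """Fixed-size table of counters; estimates never fall below the true count."""

    def __init__(self, width=1024, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, row, key):
        # One salted hash per row, so each row behaves like a different hash function.
        digest = hashlib.blake2b(
            key.encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, key, count=1):
        for row in range(self.depth):
            self.table[row][self._index(row, key)] += count

    def estimate(self, key):
        # Collisions can only inflate a counter, so the minimum is the best estimate.
        return min(
            self.table[row][self._index(row, key)] for row in range(self.depth)
        )


def update_label_sketch(sketch, label, tokens, top_k=3):
    """Add only the top_k terms of one document to the sketch for its label.

    Assumption: raw term frequency stands in for the weighting function
    (the record mentions tfidf but does not specify how it is applied).
    """
    for term, freq in Counter(tokens).most_common(top_k):
        sketch.add(f"{label}\x00{term}", freq)


# Hypothetical two-document stream for a single label.
sketch = CountMinSketch()
update_label_sketch(sketch, "sports", ["goal", "goal", "match", "team", "goal"])
update_label_sketch(sketch, "sports", ["match", "goal"])
goal_estimate = sketch.estimate("sports\x00goal")
match_estimate = sketch.estimate("sports\x00match")
```

Memory use is fixed at `width * depth` counters regardless of vocabulary size, which is the trade-off the abstract describes: bounded memory in exchange for estimates that may overcount when terms collide.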