Title : 
Extracting interesting related context-dependent concepts from social media streams using temporal distributions
         
        
            Author : 
Sayers, C.P. ; Meichun Hsu
         
        
            Author_Institution : 
Hewlett-Packard Labs., Palo Alto, CA, USA
         
        
        
        
        
        
            Abstract : 
To enable the interactive exploration of large social media datasets, we exploit the temporal distributions of word n-grams within the message stream to discover “interesting” concepts, determine “relatedness” between concepts, and find representative examples for display. We present a new algorithm for context-dependent “interestingness” using the coefficient of variation of the temporal distribution, apply the well-known technique of Pearson's correlation to tweets using equi-height histogramming to determine correlation, and employ an asymmetric variant for computing “relatedness” to encourage exploration. We further introduce techniques using interestingness, correlation, and relatedness to automatically discover concepts and select preferred word n-grams for display. These techniques are demonstrated on an 800,000-tweet dataset from the Academy Awards.
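
Illustrative note (not from the paper): the abstract names three building blocks, namely the coefficient of variation of a temporal distribution, equi-height histogramming, and Pearson's correlation. The sketch below shows one plausible way these pieces could fit together. It assumes that bin edges are chosen so each bin holds roughly the same number of messages from the overall stream, so that a term tracking overall volume looks flat and a bursty term stands out. That reading, the function names, and the toy timestamps are all assumptions; the paper's actual algorithm and its asymmetric relatedness variant are not reproduced here.

```python
import bisect
import math
import statistics


def equi_height_edges(stream_times, n_bins):
    """Bin edges holding roughly equal numbers of stream messages per bin.

    Assumption: bins are derived from the whole stream (the "context").
    """
    ts = sorted(stream_times)
    edges = [ts[(i * len(ts)) // n_bins] for i in range(n_bins)]
    return edges + [ts[-1]]


def bin_counts(term_times, edges):
    """Count a term's timestamps per bin; the last bin includes its right edge."""
    counts = [0] * (len(edges) - 1)
    for t in term_times:
        idx = bisect.bisect_right(edges, t) - 1
        idx = max(0, min(idx, len(counts) - 1))
        counts[idx] += 1
    return counts


def coefficient_of_variation(counts):
    """Population standard deviation divided by the mean (0.0 if the mean is 0)."""
    mean = statistics.fmean(counts)
    return statistics.pstdev(counts) / mean if mean else 0.0


def pearson(xs, ys):
    """Pearson's correlation of two equal-length count vectors (0.0 if either is constant)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy) if sx and sy else 0.0


# Hypothetical toy data: one message per time unit in the overall stream.
stream = list(range(100))
edges = equi_height_edges(stream, 4)

steady = bin_counts(range(0, 100, 5), edges)             # follows overall volume
bursty_a = bin_counts(list(range(50, 60)) + [10], edges)  # concentrated burst
bursty_b = bin_counts(list(range(52, 62)) + [12], edges)  # burst at the same time

cv_steady = coefficient_of_variation(steady)
cv_bursty = coefficient_of_variation(bursty_a)
r_bursts = pearson(bursty_a, bursty_b)
```

In this toy setup the steady term has a flat histogram and therefore a coefficient of variation of zero, while the bursty term scores well above it; the two terms that burst together correlate strongly. Whether the paper ranks or thresholds these scores in the same way is not stated in the abstract.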
         
        
            Keywords : 
Internet; information analysis; information retrieval; social networking (online); Pearson correlation; coefficient of variation; context dependent concepts; equiheight histogramming; interactive exploration; interesting concepts extraction; social media datasets; social media streams; temporal distribution; temporal distributions; word n-grams; Awards activities; Context; Correlation; Histograms; Media; Twitter; Visualization
         
        
        
        
            Conference_Title : 
Data Engineering (ICDE), 2013 IEEE 29th International Conference on
         
        
            Conference_Location : 
Brisbane, QLD
         
        
        
            Print_ISBN : 
978-1-4673-4909-3
         
        
            Electronic_ISBN : 
1063-6382
         
        
        
            DOI : 
10.1109/ICDE.2013.6544931