Title :
Maximizing the reliability of two-state automaton for burst feature detection in news streams
Author :
Du, Gang ; Guo, Jun ; Xu, Wei Ran ; Yang, Zhen
Author_Institution :
Sch. of Inf. & Commun. Eng., Beijing Univ. of Posts & Telecommun., Beijing, China
Abstract :
The capture of temporal dynamics of news streams has drawn increasing attention in recent work on sequential data mining. Most of this work rests on the intuition that a “burst” of a topic is signaled by a high-intensity rise in the frequency of relevant words over a period of time. Such “burst features” can be identified efficiently by Kleinberg's two-state automaton model. The resolution is an important parameter of the model and strongly affects the reliability of the results. This paper maximizes the reliability of the results by estimating an adaptive resolution for each word with the EM algorithm. Experiments on public news corpora show that a unified resolution is a performance bottleneck, and that the results obtained with word-adaptive resolutions closely approximate the maximum reliability.
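For readers unfamiliar with the two-state automaton mentioned in the abstract, the following is a minimal sketch of the batched form of Kleinberg's model, decoded with the Viterbi algorithm. It is not the authors' implementation and does not reproduce their EM estimation. The function name detect_bursts, the parameter names s and gamma, and the reading of "resolution" as the ratio s between the burst rate and the base rate are assumptions made for illustration only.

```python
import math


def detect_bursts(r, d, s=2.0, gamma=1.0):
    """Label each batch as non-burst (0) or burst (1) for one word.

    r[t]  : number of documents in batch t that contain the word
    d[t]  : total number of documents in batch t
    s     : assumed "resolution", the ratio of burst rate to base rate
    gamma : scales the cost of entering the burst state
    """
    n = len(r)
    eps = 1e-9
    # Base rate over the whole stream, clamped so that the logs are defined.
    p0 = min(max(sum(r) / sum(d), eps), 1.0 - eps)
    # Burst rate is s times the base rate, capped below 1.
    p1 = min(s * p0, 1.0 - eps)
    # Moving from state 0 to state 1 costs gamma * ln(n); moving back is free.
    enter_cost = gamma * math.log(n)

    def emit_cost(p, rt, dt):
        # Negative binomial log-likelihood; the binomial coefficient is
        # dropped because it is the same for both states.
        return -(rt * math.log(p) + (dt - rt) * math.log(1.0 - p))

    best = [0.0, float("inf")]  # the automaton starts in state 0
    back = []                   # back[t][q] = best previous state
    for rt, dt in zip(r, d):
        new_best, ptr = [], []
        for q, p in enumerate((p0, p1)):
            from0 = best[0] + (enter_cost if q == 1 else 0.0)
            from1 = best[1]
            prev, c = (0, from0) if from0 <= from1 else (1, from1)
            new_best.append(c + emit_cost(p, rt, dt))
            ptr.append(prev)
        best = new_best
        back.append(ptr)

    # Backtrack from the cheaper final state.
    q = 0 if best[0] <= best[1] else 1
    states = []
    for ptr in reversed(back):
        states.append(q)
        q = ptr[q]
    states.reverse()
    return states
```

Calling detect_bursts with one value of s for every word corresponds to the unified setting the abstract criticizes; a word-adaptive variant would pass a separately chosen s for each word, however that value is estimated.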
Keywords :
automata theory; data mining; expectation-maximisation algorithm; information resources; text analysis; EM algorithm; burst feature detection; maximum reliability; news streams; sequential data mining works; temporal dynamics; two-state automaton; Analytical models; Bridges; Cyclones; Mouth; Variable speed drives; automaton; temporal data mining; text mining
Conference_Title :
Progress in Informatics and Computing (PIC), 2010 IEEE International Conference on
Conference_Location :
Shanghai
Print_ISBN :
978-1-4244-6788-4
DOI :
10.1109/PIC.2010.5687459