Title :
Distributed Web mining using Bayesian networks from multiple data streams
Author :
Chen, R. ; Sivakumar, K. ; Kargupta, H.
Author_Institution :
Washington State Univ., Pullman, WA, USA
Abstract :
We present a collective approach to mining Bayesian networks from distributed heterogeneous Web-log data streams. In this approach, we first learn a local Bayesian network at each site using the local data. Each site then identifies the observations that are most likely to be evidence of coupling between local and non-local variables and transmits a subset of these observations to a central site. Another Bayesian network is learned at the central site using the data transmitted from the local sites. The local and central Bayesian networks are combined to obtain a collective Bayesian network that models the entire data. We applied this technique to mining multiple data streams, where data centralization is difficult because of long response times and scalability issues. We present experimental results and theoretical justification that demonstrate the feasibility of our approach.
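The selection step described in the abstract (each site picks the observations most likely to show coupling with non-local variables and sends only those to the central site) can be illustrated with a minimal sketch. The abstract does not state the selection criterion, so this sketch assumes one: an observation that the local model explains poorly (low log-likelihood) is treated as a candidate for transmission. The local model here is a stand-in, an edgeless Bayesian network over binary variables, and the names fit_marginals, log_likelihood, select_for_transmission, and the fraction parameter are all hypothetical, not taken from the paper.

```python
import math


def fit_marginals(rows):
    """Fit a stand-in local model: Laplace-smoothed P(X_j = 1) per binary column.

    Assumption: an edgeless Bayesian network replaces the learned local
    network purely to keep the sketch short.
    """
    n = len(rows)
    d = len(rows[0])
    return [(sum(r[j] for r in rows) + 1) / (n + 2) for j in range(d)]


def log_likelihood(row, probs):
    """Log-likelihood of one observation under the factorized local model."""
    return sum(math.log(p if x == 1 else 1.0 - p) for x, p in zip(row, probs))


def select_for_transmission(rows, probs, fraction):
    """Return indices of the lowest-likelihood observations at this site.

    Assumption: low likelihood under the local model is used as a proxy for
    "evidence of coupling between local and non-local variables".
    At least one observation is always selected.
    """
    k = max(1, int(len(rows) * fraction))
    ranked = sorted(range(len(rows)), key=lambda i: log_likelihood(rows[i], probs))
    return sorted(ranked[:k])
```

A site would call fit_marginals on its local rows, then select_for_transmission with a small fraction, and send only the selected rows to the central site. The fraction controls the trade-off between communication cost and how much cross-site evidence the central model sees.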
Keywords :
belief networks; data mining; distributed algorithms; information resources; learning (artificial intelligence); collective Bayesian network; collective approach; data centralization; distributed Web mining; distributed heterogeneous Web-log data streams; local Bayesian network; local variables; multiple data streams; nonlocal variables; response time; scalability issues; Bayesian methods; Data mining; Delay; Design optimization; Network servers; Web design; Web mining; Web server; Web sites; World Wide Web
Conference_Title :
Data Mining, 2001. ICDM 2001, Proceedings IEEE International Conference on
Conference_Location :
San Jose, CA
Print_ISBN :
0-7695-1119-8
DOI :
10.1109/ICDM.2001.989503