DocumentCode
3301068
Title
Distributed Latent Dirichlet allocation for objects-distributed cluster ensemble
Author
Wang, Hongjun ; Li, Zhishu ; Cheng, Yang
Author_Institution
Sch. of Comput. Sci., Sichuan Univ., Chengdu
fYear
2008
fDate
19-22 Oct. 2008
Firstpage
1
Lastpage
7
Abstract
This paper introduces distributed latent Dirichlet allocation (D-LDA), a model for objects-distributed cluster ensembles that addresses privacy preservation, distributed computing, and knowledge reuse. First, the latent variables in D-LDA and the relevant terminology are defined for cluster ensembles. Second, Markov chain Monte Carlo (MCMC) approximate inference for D-LDA is described in detail. Third, experiments are run on several UCI datasets. Compared with the cluster-based similarity partitioning algorithm (CSPA), the hyper-graph partitioning algorithm (HGPA), and the meta-clustering algorithm (MCLA), D-LDA performs better in these experiments. Furthermore, as a soft clustering model, D-LDA not only clusters the data points but also reveals their structure.
Keywords
Markov processes; Monte Carlo methods; approximation theory; data privacy; distributed processing; pattern clustering; Markov chain Monte Carlo approximation inference; UCI; distributed computing; distributed latent Dirichlet allocation; knowledge reusing; objects-distributed cluster ensemble; privacy preservation; soft cluster model; Clustering algorithms; Computer science; Data mining; Data privacy; Distributed computing; Inference algorithms; Machine learning algorithms; Partitioning algorithms; Robust stability; Terminology
fLanguage
English
Publisher
IEEE
Conference_Titel
Natural Language Processing and Knowledge Engineering, 2008. NLP-KE '08. International Conference on
Conference_Location
Beijing
Print_ISBN
978-1-4244-4515-8
Electronic_ISBN
978-1-4244-2780-2
Type
conf
DOI
10.1109/NLPKE.2008.4906792
Filename
4906792