• DocumentCode
    2861983
  • Title

    Mining Local Data Sources For Learning Global Cluster Models

  • Author

    Lam, Chak-Man ; Zhang, Xiao-Feng ; Cheung, William K.

  • Author_Institution
    Hong Kong Baptist University
  • fYear
    2004
  • fDate
    20-24 Sept. 2004
  • Firstpage
    748
  • Lastpage
    751
  • Abstract
    Distributed data mining is becoming increasingly important, as there are many cases where physically sharing data is prohibited, e.g., due to huge data volume or data privacy. In this paper, we are interested in learning a global cluster model by exploring data in distributed sources. A methodology based on periodic model exchange and merging is proposed and applied to the analysis of hyperlinked Web pages. In addition, we have tested a number of variations of the basic idea, including putting more emphasis on the privacy concern and testing the effect of having different numbers of distributed sources. Experimental results show that the proposed distributed learning scheme is effective, with accuracy close to that obtained when all the data are physically shared for learning.
  • Keywords
    Computer science; Data analysis; Data mining; Data privacy; Frequency; Machine learning; Machine learning algorithms; Testing; Training data; Web pages
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Web Intelligence, 2004. WI 2004. Proceedings. IEEE/WIC/ACM International Conference on
  • Print_ISBN
    0-7695-2100-2
  • Type

    conf

  • DOI
    10.1109/WI.2004.10044
  • Filename
    1410912