  • DocumentCode
    243599
  • Title
    Mining Interesting Meta-Paths from Complex Heterogeneous Information Networks
  • Author
    Shi, Baoxu ; Weninger, Tim
  • Author_Institution
    Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA
  • fYear
    2014
  • fDate
    14 Dec. 2014
  • Firstpage
    488
  • Lastpage
    495
  • Abstract
    Meta-paths in heterogeneous information networks are almost always created by hand and have, so far, only been attempted on data sets with very small type systems, such as DBLP and IMDB. Most real-world heterogeneous information networks have large and complex type systems. As the size and complexity of the type system grow, it becomes increasingly difficult for humans to form reasonable meta-path queries. This work introduces a new technique for discovering interesting meta-paths from complex heterogeneous information networks. Our interestingness measure is based on classical knowledge discovery principles, but has been applied in such a way that only interesting meta-paths are mined from the hundreds of thousands of possible choices. As in the classical pattern mining literature, precision and recall statistics are difficult to obtain; instead, we evaluate the effectiveness of our results using a quantitative node-similarity analysis as well as a large user study. Finally, we apply the newly discovered interesting meta-paths to find similar nodes in the Wikipedia heterogeneous information network.
  • Keywords
    Web sites; complex networks; data mining; information networks; pattern clustering; query processing; statistics; DBLP; IMDB; Wikipedia; complex heterogeneous information networks; complex type systems; knowledge discovery; meta-path queries; meta-paths mining; pattern mining; recall statistics; Data mining; Educational institutions; Electronic publishing; Encyclopedias; Gold; Internet; information networks; meta-paths; similarity;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
  • Conference_Location
    Shenzhen
  • Print_ISBN
    978-1-4799-4275-6
  • Type
    conf
  • DOI
    10.1109/ICDMW.2014.25
  • Filename
    7022636