DocumentCode
243599
Title
Mining Interesting Meta-Paths from Complex Heterogeneous Information Networks
Author
Baoxu Shi ; Weninger, Tim
Author_Institution
Comput. Sci. & Eng., Univ. of Notre Dame, Notre Dame, IN, USA
fYear
2014
fDate
14-14 Dec. 2014
Firstpage
488
Lastpage
495
Abstract
Meta-paths in heterogeneous information networks are almost always hand-created and have, so far, only been attempted on data sets with very small type systems such as DBLP and IMDB. Most real-world heterogeneous information networks have large and complex type systems. As the size and complexity of the type system grow, it becomes increasingly difficult for humans to form reasonable meta-path queries. This work introduces a new technique to discover interesting meta-paths from complex heterogeneous information networks. Our interestingness measure is based on classical knowledge discovery principles, but has been applied in such a way that only interesting meta-paths are mined from the hundreds of thousands of possible choices. As in the classical pattern mining literature, precision and recall statistics are difficult to obtain; instead, we evaluate the effectiveness of our results using a quantitative node-similarity analysis as well as a large user study. Finally, we apply the newly discovered interesting meta-paths to find similar nodes on the Wikipedia heterogeneous information network.
Keywords
Web sites; complex networks; data mining; information networks; pattern clustering; query processing; statistics; DBLP; IMDB; Wikipedia; complex heterogeneous information networks; complex type systems; knowledge discovery; meta-path queries; meta-paths mining; pattern mining; recall statistics; Data mining; Educational institutions; Electronic publishing; Encyclopedias; Gold; Internet; information networks; meta-paths; similarity;
fLanguage
English
Publisher
IEEE
Conference_Titel
Data Mining Workshop (ICDMW), 2014 IEEE International Conference on
Conference_Location
Shenzhen
Print_ISBN
978-1-4799-4275-6
Type
conf
DOI
10.1109/ICDMW.2014.25
Filename
7022636