• DocumentCode
    3576332
  • Title
    Large-scale factorization of type-constrained multi-relational data
  • Author
    Krompass, Denis ; Nickel, Maximilian ; Tresp, Volker
  • Author_Institution
    Ludwig Maximilian Univ., Munich, Germany
  • fYear
    2014
  • Firstpage
    18
  • Lastpage
    24
  • Abstract
    The statistical modeling of large multi-relational datasets has gained increasing attention in recent years. Typical applications involve large knowledge bases like DBpedia, Freebase, YAGO and the recently introduced Google Knowledge Graph, which contain millions of entities, hundreds and thousands of relations, and billions of relational tuples. Collective factorization methods have been shown to scale up to these large multi-relational datasets, in particular in the form of tensor approaches that can exploit the highly scalable alternating least squares (ALS) algorithms for calculating the factors. In this paper, we extend the recently proposed state-of-the-art RESCAL tensor factorization to consider relational type-constraints. Relational type-constraints explicitly define the logic of relations by excluding entities from the subject or object role. In addition, we show that in the absence of prior knowledge about type-constraints, local closed-world assumptions can be approximated for each relation by ignoring unobserved subject or object entities in a relation. In our experiments on representative large datasets (Cora, DBpedia) that contain up to millions of entities and hundreds of type-constrained relations, we show that the proposed approach is scalable. It further significantly outperforms RESCAL without type-constraints in both runtime and prediction quality.
  • Keywords
    least squares approximations; matrix decomposition; relational databases; tensors; ALS algorithms; Cora dataset; DBpedia dataset; RESCAL tensor factorization; alternating least squares algorithms; collective factorization method; large multirelational datasets; large-scale factorization; object role; prediction quality; relation logic; relational type-constraints; runtime analysis; statistical modeling; tensor approaches; type-constrained multirelational data; Lead;
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Data Science and Advanced Analytics (DSAA), 2014 International Conference on
  • Type
    conf
  • DOI
    10.1109/DSAA.2014.7058046
  • Filename
    7058046
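
The abstract's idea of approximating local closed-world assumptions can be illustrated with a minimal sketch. This is not the authors' implementation and does not perform any RESCAL factorization. It only shows, under the assumption that the data is given as (subject, relation, object) triples, how the entities observed in each role could be collected per relation so that all other entities are ignored for that relation. The function names and the toy triples are hypothetical.

```python
from collections import defaultdict


def local_closed_world(triples):
    """Collect, per relation, the entities observed as subject and as object.

    Hypothetical helper: entities never seen in a role for a relation are
    treated as excluded from that role (approximate type-constraint).
    """
    subjects = defaultdict(set)
    objects = defaultdict(set)
    for subj, rel, obj in triples:
        subjects[rel].add(subj)
        objects[rel].add(obj)
    return {rel: (subjects[rel], objects[rel]) for rel in subjects}


def candidate_pairs(constraints, rel):
    """Enumerate the (subject, object) pairs still allowed for one relation."""
    subj_set, obj_set = constraints[rel]
    return {(s, o) for s in subj_set for o in obj_set}


# Toy data, invented for illustration only.
triples = [
    ("munich", "locatedIn", "germany"),
    ("berlin", "locatedIn", "germany"),
    ("alice", "worksIn", "munich"),
]
constraints = local_closed_world(triples)
allowed = candidate_pairs(constraints, "locatedIn")
```

In this toy example, "alice" never appears as a subject of "locatedIn", so pairs such as ("alice", "germany") are not considered for that relation. Restricting each relation to its observed subjects and objects shrinks the set of candidate pairs, which is the intuition behind the runtime benefit the abstract reports.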