• DocumentCode
    1783784
  • Title
    Just compress and relax: Handling missing values in big tensor analysis
  • Author
    Marcos, J.H. ; Sidiropoulos, Nicholas
  • Author_Institution
    Dept. of ECE, Univ. of Minnesota, Minneapolis, MN, USA
  • fYear
    2014
  • fDate
    21-23 May 2014
  • Firstpage
    218
  • Lastpage
    221
  • Abstract
    In applications of tensor analysis, missing data is an important issue that is usually handled via weighted least-squares fitting, imputation, or iterative expectation-maximization. The resulting algorithms are often cumbersome, and tend to fail when the percentage of missing samples is large. This paper proposes a novel and refreshingly simple approach for handling randomly missing values in big tensor analysis. The stepping stone is random multi-way tensor compression, which enables indirect tensor factorization via analysis of compressed 'replicas' of the big tensor. A Bernoulli model for the misses, and two opposite ends of the tensor modeling spectrum are considered: independent and identically distributed (i.i.d.) tensor elements, and low-rank (and in particular rank-one) tensors whose latent factors are i.i.d. In both cases, analytical results are established, showing that the tensor approximation error variance is inversely proportional to the number of available elements. Coupled with recent developments in robust CP decomposition, these results show that it is possible to ignore missing values without losing the ability to identify the underlying model.
  • Keywords
    data structures; expectation-maximisation algorithm; least squares approximations; matrix decomposition; tensors; Bernoulli model; big tensor analysis; cumbersome; imputation; indirect tensor factorization; iterative expectation-maximization; missing data; random multiway tensor compression; robust CP decomposition; tensor approximation error variance; tensor modeling spectrum; weighted least-squares fitting; Computational modeling; Data models; Loading; Matrix decomposition; Signal to noise ratio; Tensile stress; Vectors; CANDECOMP / PARAFAC; Tensor decomposition; big data; imputation; missing elements; missing values; multi-way arrays; tensor completion
  • fLanguage
    English
  • Publisher
    IEEE
  • Conference_Titel
    Communications, Control and Signal Processing (ISCCSP), 2014 6th International Symposium on
  • Conference_Location
    Athens
  • Type
    conf
  • DOI
    10.1109/ISCCSP.2014.6877854
  • Filename
    6877854