SWIFT: Scalable Wasserstein Factorization for Sparse Nonnegative Tensors
- URL: http://arxiv.org/abs/2010.04081v2
- Date: Tue, 15 Dec 2020 10:26:56 GMT
- Title: SWIFT: Scalable Wasserstein Factorization for Sparse Nonnegative Tensors
- Authors: Ardavan Afshar, Kejing Yin, Sherry Yan, Cheng Qian, Joyce C. Ho,
Haesun Park, Jimeng Sun
- Abstract summary: We introduce SWIFT, which minimizes the Wasserstein distance between the input tensor and its reconstruction.
SWIFT achieves up to 9.65% and 11.31% relative improvement over baselines for downstream prediction tasks.
- Score: 42.154795547748165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing tensor factorization methods assume that the input tensor follows
some specific distribution (e.g., Poisson, Bernoulli, or Gaussian), and solve
the factorization by minimizing an empirical loss function defined based on
the corresponding distribution. However, this approach suffers from several drawbacks: 1)
In reality, the underlying distributions are complicated and unknown, making it
infeasible to approximate them with a simple distribution. 2) The correlation
across dimensions of the input tensor is not well utilized, leading to
sub-optimal performance. Although heuristics have been proposed to incorporate such
correlation as side information under a Gaussian distribution, they cannot
easily be generalized to other distributions. Thus, a more principled way of
utilizing the correlation in tensor factorization models remains an open
challenge. Without assuming any explicit distribution, we formulate tensor
factorization as an optimal transport problem with a Wasserstein distance, which
can handle non-negative inputs.
We introduce SWIFT, which minimizes the Wasserstein distance between the input
tensor and its reconstruction. In particular, we define the N-th order tensor
Wasserstein loss for the widely used tensor CP factorization and derive an
optimization algorithm that minimizes it. By leveraging the sparsity structure
and different equivalent formulations to improve computational efficiency,
SWIFT is as scalable as other well-known CP algorithms. Using the factor
matrices as features, SWIFT achieves up to 9.65% and 11.31% relative
improvement over baselines on downstream prediction tasks. Under noisy
conditions, SWIFT achieves up to 15% and 17% relative improvement over the
best competitors on the prediction tasks.
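To make the core idea concrete, below is a minimal, hypothetical sketch (not the authors' SWIFT implementation) of measuring the discrepancy between a nonnegative tensor and its CP reconstruction with an entropic-regularized (Sinkhorn) Wasserstein loss. The fiber-wise comparison, the squared-distance ground cost, and names such as sinkhorn_loss and cp_reconstruct are illustrative assumptions, not details taken from the paper.

import numpy as np

def sinkhorn_loss(p, q, C, eps=1.0, n_iter=200):
    """Approximate Wasserstein cost between histograms p and q under ground
    cost C, computed as the transport cost of the entropic-regularized plan."""
    p = p / p.sum()
    q = q / q.sum()
    K = np.exp(-C / eps)                 # Gibbs kernel
    u = np.ones_like(p)
    for _ in range(n_iter):              # Sinkhorn fixed-point iterations
        v = q / (K.T @ u)
        u = p / (K @ v)
    P = np.diag(u) @ K @ np.diag(v)      # transport plan
    return np.sum(P * C)

def cp_reconstruct(A, B, C_f):
    """Rebuild a 3rd-order tensor from CP factor matrices A, B, C_f."""
    return np.einsum('ir,jr,kr->ijk', A, B, C_f)

rng = np.random.default_rng(0)
I, J, K_, R = 5, 4, 3, 2
X = rng.random((I, J, K_))                       # nonnegative input tensor
A, B, C_f = (rng.random((d, R)) for d in (I, J, K_))
X_hat = cp_reconstruct(A, B, C_f)

# Compare mode-1 fibers of X and X_hat under a simple squared-distance
# ground cost over fiber coordinates (an assumed choice for illustration).
coords = np.arange(I, dtype=float)
C = (coords[:, None] - coords[None, :]) ** 2
loss = np.mean([sinkhorn_loss(X[:, j, k], X_hat[:, j, k], C)
                for j in range(J) for k in range(K_)])
print(f"average fiber-wise Sinkhorn loss: {loss:.4f}")

In an actual factorization loop one would differentiate such a loss with respect to the factor matrices (or use the transport plan in multiplicative updates); this sketch only shows how the Wasserstein objective replaces an element-wise loss like squared error.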
Related papers
- Training normalizing flows with computationally intensive target
probability distributions [0.018416014644193065]
We propose an estimator for normalizing flows based on the REINFORCE algorithm.
It is up to ten times faster in terms of wall-clock time and requires up to 30% less memory.
arXiv Detail & Related papers (2023-08-25T10:40:46Z) - Learning Unnormalized Statistical Models via Compositional Optimization [73.30514599338407]
Noise-contrastive estimation (NCE) has been proposed by formulating the objective as the logistic loss between the real data and the artificial noise.
In this paper, we study a direct approach for optimizing the negative log-likelihood of unnormalized models.
arXiv Detail & Related papers (2023-06-13T01:18:16Z) - Error Analysis of Tensor-Train Cross Approximation [88.83467216606778]
We provide accuracy guarantees in terms of the entire tensor for both exact and noisy measurements.
Results are verified by numerical experiments, and may have important implications for the usefulness of cross approximations for high-order tensors.
arXiv Detail & Related papers (2022-07-09T19:33:59Z) - Nonparametric Sparse Tensor Factorization with Hierarchical Gamma
Processes [16.79618682556073]
We propose a nonparametric factorization approach for sparsely observed tensors.
We use hierarchical Gamma processes and Poisson random measures to construct a tensor-valued process.
For efficient inference, we use Dirichlet process properties over finite sample partitions, density transformations, and random features.
arXiv Detail & Related papers (2021-10-19T16:17:26Z) - Augmented Tensor Decomposition with Stochastic Optimization [46.16865811396394]
Real-world tensor data are usually high-ordered and have large dimensions with millions or billions of entries.
It is expensive to decompose the whole tensor with traditional algorithms.
This paper proposes augmented tensor decomposition, which effectively incorporates data augmentations to boost downstream classification.
arXiv Detail & Related papers (2021-06-15T06:29:05Z) - MTC: Multiresolution Tensor Completion from Partial and Coarse
Observations [49.931849672492305]
Existing completion formulations mostly rely on partial observations from a single tensor.
We propose an efficient Multi-resolution Completion model (MTC) to solve the problem.
arXiv Detail & Related papers (2021-06-14T02:20:03Z) - Scaling and Scalability: Provable Nonconvex Low-Rank Tensor Estimation
from Incomplete Measurements [30.395874385570007]
A fundamental task is to faithfully recover tensors from highly incomplete measurements.
We develop an algorithm to directly recover the tensor factors in the Tucker decomposition.
We show that it provably converges at a linear rate independent of the condition number of the ground truth tensor for two canonical problems.
arXiv Detail & Related papers (2021-04-29T17:44:49Z) - Alternating linear scheme in a Bayesian framework for low-rank tensor
approximation [5.833272638548154]
We find a low-rank representation for a given tensor by solving a Bayesian inference problem.
We present an algorithm that performs the unscented transform in tensor train format.
arXiv Detail & Related papers (2020-12-21T10:15:30Z) - Linear Optimal Transport Embedding: Provable Wasserstein classification
for certain rigid transformations and perturbations [79.23797234241471]
Discriminating between distributions is an important problem in a number of scientific fields.
The Linear Optimal Transportation (LOT) embeds the space of distributions into an $L^2$-space.
We demonstrate the benefits of LOT on a number of distribution classification problems.
arXiv Detail & Related papers (2020-08-20T19:09:33Z) - Uncertainty quantification for nonconvex tensor completion: Confidence
intervals, heteroscedasticity and optimality [92.35257908210316]
We study the problem of estimating a low-rank tensor given incomplete and corrupted observations.
We find that it attains unimprovable rates of $\ell_2$ accuracy.
arXiv Detail & Related papers (2020-06-15T17:47:13Z)