Coarse-To-Fine Tensor Trains for Compact Visual Representations
- URL: http://arxiv.org/abs/2406.04332v1
- Date: Thu, 6 Jun 2024 17:59:23 GMT
- Title: Coarse-To-Fine Tensor Trains for Compact Visual Representations
- Authors: Sebastian Loeschcke, Dan Wang, Christian Leth-Espensen, Serge Belongie, Michael J. Kastoryano, Sagie Benaim
- Abstract summary: 'Prolongation Upsampling Tensor Train (PuTT)' is a novel method for learning tensor train representations in a coarse-to-fine manner.
We evaluate our representation along three axes: (1) compression, (2) denoising capability, and (3) image completion capability.
- Score: 19.216356079910533
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to learn compact, high-quality, and easy-to-optimize representations for visual data is paramount to many applications such as novel view synthesis and 3D reconstruction. Recent work has shown substantial success in using tensor networks to design such compact and high-quality representations. However, the ability to optimize tensor-based representations, and in particular, the highly compact tensor train representation, is still lacking. This has prevented practitioners from deploying the full potential of tensor networks for visual data. To this end, we propose 'Prolongation Upsampling Tensor Train (PuTT)', a novel method for learning tensor train representations in a coarse-to-fine manner. Our method involves the prolonging or 'upsampling' of a learned tensor train representation, creating a sequence of 'coarse-to-fine' tensor trains that are incrementally refined. We evaluate our representation along three axes: (1) compression, (2) denoising capability, and (3) image completion capability. To assess these axes, we consider the tasks of image fitting, 3D fitting, and novel view synthesis, where our method shows an improved performance compared to state-of-the-art tensor-based methods. For full results see our project webpage: https://sebulo.github.io/PuTT_website/
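The coarse-to-fine scheme described in the abstract is easy to prototype. The sketch below is a minimal, hypothetical 1D illustration, not the authors' implementation: a signal of length 2^L is fitted with a quantized tensor train (one binary mode per scale), the fit is prolonged to the next resolution by upsampling its reconstruction and re-truncating to TT form, and each level is then refined. PuTT applies its prolongation operator directly in TT format and refines by gradient descent on images and volumes; for a self-contained example we refine with a few alternating-least-squares sweeps instead, and the names `tt_svd`, `als_refine`, and `fit_coarse_to_fine` are our own.

```python
import numpy as np

def tt_svd(x, max_rank):
    """Quantized TT-SVD: split a length-2^L vector into L cores of mode size 2."""
    L = int(np.log2(x.size))
    cores, mat, r = [], x.reshape(1, -1), 1
    for _ in range(L - 1):
        mat = mat.reshape(r * 2, -1)
        u, s, vt = np.linalg.svd(mat, full_matrices=False)
        k = min(max_rank, s.size)
        cores.append(u[:, :k].reshape(r, 2, k))
        mat, r = s[:k, None] * vt[:k], k
    cores.append(mat.reshape(r, 2, 1))
    return cores

def tt_reconstruct(cores):
    """Contract the TT cores back into a dense vector of length 2^L."""
    out = cores[0]
    for c in cores[1:]:
        out = np.tensordot(out, c, axes=([-1], [0]))
    return out.reshape(-1)

def als_refine(cores, target, sweeps=3):
    """Refine the TT by alternating least squares on the fitting loss."""
    for _ in range(sweeps):
        for i in range(len(cores)):
            # interfaces: contraction of everything to the left / right of core i
            left = np.ones((1, 1))
            for c in cores[:i]:
                left = np.tensordot(left, c, axes=([-1], [0])).reshape(-1, c.shape[-1])
            right = np.ones((1, 1))
            for c in reversed(cores[i + 1:]):
                right = np.tensordot(c, right, axes=([-1], [0])).reshape(c.shape[0], -1)
            # least-squares update of core i, one slice per binary mode value
            T = target.reshape(left.shape[0], 2, right.shape[1])
            Lp, Rp = np.linalg.pinv(left), np.linalg.pinv(right)
            cores[i] = np.stack([Lp @ T[:, j, :] @ Rp for j in range(2)], axis=1)
    return cores

def fit_coarse_to_fine(signal, max_rank, levels, sweeps=3):
    """Fit at a coarse resolution, prolong by upsampling, refine, repeat."""
    cores = None
    for l in levels:
        coarse = signal.reshape(2 ** l, -1).mean(axis=1)   # target at this level
        if cores is not None:
            # "prolongation": upsample the current reconstruction and
            # re-truncate it to TT form to initialize the finer level
            prev = tt_reconstruct(cores)
            xs_new = np.linspace(0.0, 1.0, coarse.size)
            xs_old = np.linspace(0.0, 1.0, prev.size)
            cores = tt_svd(np.interp(xs_new, xs_old, prev), max_rank)
        else:
            cores = tt_svd(coarse, max_rank)                # coarsest-level init
        cores = als_refine(cores, coarse, sweeps)
    return cores

rng = np.random.default_rng(0)
signal = np.cumsum(rng.standard_normal(1024))               # length 2^10
cores = fit_coarse_to_fine(signal, max_rank=8, levels=(6, 8, 10))
recon = tt_reconstruct(cores)
print("relative error:", np.linalg.norm(recon - signal) / np.linalg.norm(signal))
```

Note that PuTT's prolongation is a linear interpolation operator applied directly to the TT cores, so the fine-level representation is never densified; the dense round trip through `tt_reconstruct` above is purely for brevity.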
Related papers
- FewViewGS: Gaussian Splatting with Few View Matching and Multi-stage Training [15.634646420318731]
We present a 3D Gaussian-based novel view synthesis method using sparse input images.
We propose a multi-stage training scheme with matching-based consistency constraints imposed on the novel views.
This is achieved by using the matches of the available training images to supervise the generation of the novel views.
arXiv Detail & Related papers (2024-11-04T16:21:00Z)
- Efficient-3DiM: Learning a Generalizable Single-image Novel-view Synthesizer in One Day [63.96075838322437]
We propose a framework to learn a single-image novel-view synthesizer.
Our framework is able to reduce the total training time from 10 days to less than 1 day.
arXiv Detail & Related papers (2023-10-04T17:57:07Z)
- RangeViT: Towards Vision Transformers for 3D Semantic Segmentation in Autonomous Driving [80.14669385741202]
Vision transformers (ViTs) have achieved state-of-the-art results in many image-based benchmarks.
ViTs are notoriously hard to train and require a lot of training data to learn powerful representations.
We show that our method, called RangeViT, outperforms existing projection-based methods on nuScenes and Semantic KITTI.
arXiv Detail & Related papers (2023-01-24T18:50:48Z)
- Low-Rank Tensor Function Representation for Multi-Dimensional Data Recovery [52.21846313876592]
Low-rank tensor function representation (LRTFR) can continuously represent data beyond the meshgrid with infinite resolution.
We develop two fundamental concepts for tensor functions, i.e., the tensor function rank and low-rank tensor function factorization.
Experiments substantiate the superiority and versatility of our method compared with state-of-the-art methods.
arXiv Detail & Related papers (2022-12-01T04:00:38Z)
- EfficientTrain: Exploring Generalized Curriculum Learning for Training Visual Backbones [80.662250618795]
This paper presents a new curriculum learning approach for the efficient training of visual backbones (e.g., vision Transformers).
As an off-the-shelf method, it reduces the wall-time training cost of a wide variety of popular models by >1.5x on ImageNet-1K/22K without sacrificing accuracy.
arXiv Detail & Related papers (2022-11-17T17:38:55Z)
- T4DT: Tensorizing Time for Learning Temporal 3D Visual Data [19.418308324435916]
We show that low-rank tensor compression provides an extremely compact way to store and query time-varying signed distance functions.
Unlike existing iterative learning-based approaches like DeepSDF and NeRF, our method uses a closed-form algorithm with theoretical guarantees.
arXiv Detail & Related papers (2022-08-02T12:57:08Z)
- Patch-based medical image segmentation using Quantum Tensor Networks [1.5899411215927988]
We formulate image segmentation in a supervised setting with tensor networks.
The key idea is to first lift the pixels in image patches to exponentially high dimensional feature spaces.
The performance of the proposed model is evaluated on three 2D- and one 3D- biomedical imaging datasets.
arXiv Detail & Related papers (2021-09-15T07:54:05Z)
- Cherry-Picking Gradients: Learning Low-Rank Embeddings of Visual Data via Differentiable Cross-Approximation [53.95297550117153]
We propose an end-to-end trainable framework that processes large-scale visual data tensors by looking at only a fraction of their entries.
The proposed approach is particularly useful for large-scale multidimensional grid data, and for tasks that require context over a large receptive field.
arXiv Detail & Related papers (2021-05-29T08:39:57Z)
- Adaptive Learning of Tensor Network Structures [6.407946291544721]
We leverage the TN formalism to develop a generic and efficient adaptive algorithm to learn the structure and the parameters of a TN from data.
Our algorithm can adaptively identify TN structures with a small number of parameters that effectively optimize any differentiable objective function.
arXiv Detail & Related papers (2020-08-12T16:41:56Z)
- T-Basis: a Compact Representation for Neural Networks [89.86997385827055]
We introduce T-Basis, a concept for a compact representation of a set of tensors, each of an arbitrary shape, as is common in neural networks.
We evaluate the proposed approach on the task of neural network compression and demonstrate that it reaches high compression rates at acceptable performance drops; a sketch of the shared-basis idea follows this entry.
arXiv Detail & Related papers (2020-07-13T19:03:22Z)
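As a companion to the entry above: a minimal, hypothetical sketch of the shared-basis idea, assuming (as the T-Basis paper proposes) that every tensor is stored as a tensor train whose cores are linear combinations of one globally shared, learned basis of cores. The shapes, the name `cores_from_coeffs`, and the fixed mode size are our simplifications; handling tensors of genuinely arbitrary shape would require per-tensor core counts and mode sizes.

```python
import numpy as np

# All TT cores in the network are mixed from one shared core basis, so each
# tensor is stored as a small coefficient matrix plus the (shared) basis.
rng = np.random.default_rng(0)
num_basis, rank, mode = 16, 4, 3                     # basis of 4x3x4 cores
basis = rng.standard_normal((num_basis, rank, mode, rank))

def cores_from_coeffs(coeffs):
    """Mix basis cores into TT cores; coeffs has shape (num_cores, num_basis)."""
    mixed = np.einsum('cb,brmq->crmq', coeffs, basis)
    # boundary cores are sliced to outer rank 1 so the train contracts to a tensor
    return [mixed[0][:1]] + list(mixed[1:-1]) + [mixed[-1][..., :1]]

# two different weight tensors share `basis`; only their coefficients differ
w1 = cores_from_coeffs(rng.standard_normal((5, num_basis)))
w2 = cores_from_coeffs(rng.standard_normal((5, num_basis)))
print(w1[0].shape, w1[2].shape, w1[-1].shape)        # (1, 3, 4) (4, 3, 4) (4, 3, 1)
```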