Accelerating two-dimensional tensor network contractions using QR-decompositions
- URL: http://arxiv.org/abs/2505.00494v1
- Date: Thu, 01 May 2025 12:48:26 GMT
- Title: Accelerating two-dimensional tensor network contractions using QR-decompositions
- Authors: Yining Zhang, Qi Yang, Philippe Corboz
- Abstract summary: We propose a contraction scheme for $C_{4v}$-symmetric tensor networks based on combining the corner transfer matrix renormalization group with QR-decompositions. Our approach achieves up to two orders of magnitude speedup compared to standard CTMRG and yields state-of-the-art results.
- Score: 3.6498714804297387
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Infinite projected entangled-pair states (iPEPS) provide a powerful tool for studying strongly correlated systems directly in the thermodynamic limit. A core component of the algorithm is the approximate contraction of the iPEPS, where the computational bottleneck typically lies in the singular value or eigenvalue decompositions involved in the renormalization step. This is particularly true on GPUs, where tensor contractions are substantially faster than these decompositions. Here we propose a contraction scheme for $C_{4v}$-symmetric tensor networks based on combining the corner transfer matrix renormalization group (CTMRG) with QR-decompositions which are substantially faster -- especially on GPUs. Our approach achieves up to two orders of magnitude speedup compared to standard CTMRG and yields state-of-the-art results for the Heisenberg and $J_1$-$J_2$ models in about one hour on an H100 GPU.
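As a rough illustration of why the decompositions, rather than the tensor contractions, dominate the cost, the following NumPy sketch times an eigendecomposition, an SVD, and a QR factorization on a symmetric matrix of the size a $C_{4v}$-symmetric enlarged corner would have. The dimensions chi and D are made-up placeholders and this is not the authors' implementation; on an H100 with a GPU backend such as CuPy or JAX the gap in favour of QR is considerably larger, consistent with the speedups reported in the abstract.

```python
# Rough timing illustration only (not the paper's algorithm or code).
# chi and D are hypothetical boundary / iPEPS bond dimensions; n = chi * D**2
# is the linear size of an enlarged corner matrix in CTMRG.
import time
import numpy as np

chi, D = 64, 4
n = chi * D**2
A = np.random.rand(n, n)
A = A + A.T  # C4v-symmetric corners can be chosen symmetric/Hermitian

def timed(label, fn):
    t0 = time.perf_counter()
    fn()
    print(f"{label:>4s}: {time.perf_counter() - t0:.3f} s")

timed("eigh", lambda: np.linalg.eigh(A))  # decomposition used in standard C4v CTMRG
timed("svd",  lambda: np.linalg.svd(A))   # decomposition used in generic CTMRG
timed("qr",   lambda: np.linalg.qr(A))    # the cheaper factorization the paper exploits
```

All three factorizations scale as $O(n^3)$, but QR has a smaller prefactor and parallelizes much better on GPUs, which is where the reported speedup comes from.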
Related papers
- Variationally optimizing infinite projected entangled-pair states at large bond dimensions: A split corner transfer matrix renormalization group approach [0.2796197251957244]
We introduce an alternative "split-CTMRG" algorithm, which maintains separate PEPS layers and leverages new environment tensors, reducing computational complexity while preserving accuracy. Benchmarks on quantum lattice models demonstrate substantial speedups for variational energy optimization, rendering this method valuable for large-scale PEPS simulations.
arXiv Detail & Related papers (2025-02-14T16:59:33Z) - Spectral functions with infinite projected entangled-pair states [0.0]
We extend the iPEPS toolbox by a method to efficiently evaluate non-equal time two-point correlators.
It is based on an iPEPS ansatz of the ground state in a large unit cell, with an operator applied in the center of the cell.
At every time step, the two-point correlators within a cell are computed based on the corner transfer matrix renormalization group method.
arXiv Detail & Related papers (2024-05-17T08:43:55Z) - TCCT-Net: Two-Stream Network Architecture for Fast and Efficient Engagement Estimation via Behavioral Feature Signals [58.865901821451295]
We present a novel two-stream feature fusion "Tensor-Convolution and Convolution-Transformer Network" (TCCT-Net) architecture.
To better learn the meaningful patterns in the temporal-spatial domain, we design a "CT" stream that integrates a hybrid convolutional-transformer.
In parallel, to efficiently extract rich patterns from the temporal-frequency domain, we introduce a "TC" stream that uses Continuous Wavelet Transform (CWT) to represent information in a 2D tensor form.
arXiv Detail & Related papers (2024-04-15T06:01:48Z) - Fast Time-Evolution of Matrix-Product States using the QR decomposition [0.0]
We propose and benchmark a modified time evolution block decimation algorithm that uses a truncation scheme based on the QR decomposition instead of the singular value decomposition (SVD).
The modification reduces the scaling with the dimension of the physical Hilbert space $d$ from $d^3$ down to $d^2$.
In a benchmark simulation of a global quench in a quantum clock model, we observe a speedup of up to three orders of magnitude comparing QR- and SVD-based updates on an A100 GPU. (A hedged code sketch of such a QR-based truncation follows the related-papers list below.)
arXiv Detail & Related papers (2022-12-19T19:00:05Z) - Efficient Dataset Distillation Using Random Feature Approximation [109.07737733329019]
We propose a novel algorithm that uses a random feature approximation (RFA) of the Neural Network Gaussian Process (NNGP) kernel.
Our algorithm provides at least a 100-fold speedup over KIP and can run on a single GPU.
Our new method, termed an RFA Distillation (RFAD), performs competitively with KIP and other dataset condensation algorithms in accuracy over a range of large-scale datasets.
arXiv Detail & Related papers (2022-10-21T15:56:13Z) - Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications.
We propose a QR-based ED method dedicated to the application scenarios of computer vision.
arXiv Detail & Related papers (2022-07-09T09:14:12Z) - A Fast Parallel Tensor Decomposition with Optimal Stochastic Gradient
Descent: an Application in Structural Damage Identification [1.536989504296526]
We propose a novel algorithm, FP-CPD, to parallelize the CANDECOMP/PARAFAC (CP) decomposition of a tensor $\mathcal{X} \in \mathbb{R}^{I_1 \times \dots \times I_N}$.
arXiv Detail & Related papers (2021-11-04T05:17:07Z) - Nesterov Accelerated ADMM for Fast Diffeomorphic Image Registration [63.15453821022452]
Recent developments in approaches based on deep learning have achieved sub-second runtimes for DiffIR.
We propose a simple iterative scheme that functionally composes intermediate non-stationary velocity fields.
We then propose a convex optimisation model that uses a regularisation term of arbitrary order to impose smoothness on these velocity fields.
arXiv Detail & Related papers (2021-09-26T19:56:45Z) - Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch [75.69506249886622]
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate the models on resource-constrained environments.
In this paper, we are the first to study training from scratch an N:M fine-grained structured sparse network.
arXiv Detail & Related papers (2021-02-08T05:55:47Z) - Kronecker CP Decomposition with Fast Multiplication for Compressing RNNs [11.01184134911405]
Recurrent neural networks (RNNs) are powerful in the tasks oriented to sequential data, such as natural language processing and video recognition.
In this paper, we consider compressing RNNs based on a novel Kronecker CANDECOMP/PARAFAC (KCP) decomposition.
arXiv Detail & Related papers (2020-08-21T07:29:45Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study distributed algorithms for large-scale stochastic AUC maximization with a deep neural network as the predictive model.
Our method requires far fewer communication rounds while retaining its theoretical convergence guarantees.
Experiments on several datasets demonstrate the effectiveness of the method and corroborate the theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)