Related papers: Fast Time-Evolution of Matrix-Product States using the QR decomposition

Fast Time-Evolution of Matrix-Product States using the QR decomposition

URL: http://arxiv.org/abs/2212.09782v1
Date: Mon, 19 Dec 2022 19:00:05 GMT
Title: Fast Time-Evolution of Matrix-Product States using the QR decomposition
Authors: Jakob Unfried, Johannes Hauschild and Frank Pollmann
Abstract summary: We propose and benchmark a modified time evolution block decimation algorithm that uses a truncation scheme based on the QR decomposition instead of the singular value decomposition (SVD) The modification reduces the scaling with the dimension of the physical Hilbert space $d$ from $d3$ down to $d2$. In a benchmark simulation of a global quench in a quantum clock model, we observe a speedup of up to three orders of magnitude comparing QR and SVD based updates on an A100 GPU.
Score: 0.0
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We propose and benchmark a modified time evolution block decimation (TEBD) algorithm that uses a truncation scheme based on the QR decomposition instead of the singular value decomposition (SVD). The modification reduces the scaling with the dimension of the physical Hilbert space $d$ from $d^3$ down to $d^2$. Moreover, the QR decomposition has a lower computational complexity than the SVD and allows for highly efficient implementations on GPU hardware. In a benchmark simulation of a global quench in a quantum clock model, we observe a speedup of up to three orders of magnitude comparing QR and SVD based updates on an A100 GPU.

Related papers

QuantVSR: Low-Bit Post-Training Quantization for Real-World Video Super-Resolution [53.13952833016505]
We propose a low-bit quantization model for real-world video super-resolution (VSR)<n>We use a calibration dataset to measure both spatial and temporal complexity for each layer.<n>We refine the FP and low-bit branches to achieve simultaneous optimization.
arXiv Detail & Related papers (2025-08-06T14:35:59Z)
Speedy Deformable 3D Gaussian Splatting: Fast Rendering and Compression of Dynamic Scenes [57.69608119350651]
Recent extensions of 3D Gaussian Splatting (3DGS) to dynamic scenes achieve high-quality novel view synthesis by using neural networks to predict the time-varying deformation of each Gaussian.<n>However, performing per-Gaussian neural inference at every frame poses a significant bottleneck, limiting rendering speed and increasing memory and compute requirements.<n>We present Speedy Deformable 3D Gaussian Splatting (SpeeDe3DGS), a general pipeline for accelerating the rendering speed of dynamic 3DGS and 4DGS representations by reducing neural inference through two complementary techniques.
arXiv Detail & Related papers (2025-06-09T16:30:48Z)
TC-GS: A Faster Gaussian Splatting Module Utilizing Tensor Cores [9.744829716477627]
This paper proposes TC-GS, an algorithm-independent universal module that expands Core (TCU) applicability for 3DGS.<n>The key innovation lies in mapping alpha to matrix multiplication, fully utilizing otherwise idle TCUs in existing 3DGS implementations.
arXiv Detail & Related papers (2025-05-30T16:58:18Z)
Accelerating two-dimensional tensor network contractions using QR-decompositions [3.6498714804297387]
We propose a contraction scheme for $C_4v$-symmetric tensor networks based on combining the corner transfer matrix renormalization group with QR-decompositions. Our approach achieves up to two orders of magnitude speedup compared to standard CTMRG and yields state-of-the-art results.
arXiv Detail & Related papers (2025-05-01T12:48:26Z)
3DGS-LM: Faster Gaussian-Splatting Optimization with Levenberg-Marquardt [65.25603275491544]
We present 3DGS-LM, a new method that accelerates the reconstruction of 3D Gaussian Splatting (3DGS) Our method is 30% faster than the original 3DGS while obtaining the same reconstruction quality optimization.
arXiv Detail & Related papers (2024-09-19T16:31:44Z)
Reduce Computational Complexity for Convolutional Layers by Skipping Zeros [9.833821501774596]
We propose an efficient algorithm for convolutional neural networks. The C-K-S algorithm is accompanied by efficient GPU implementations. Experiments show that C-K-S offers good performance in terms of speed and convergence.
arXiv Detail & Related papers (2023-06-28T06:21:22Z)
Quick Adaptive Ternary Segmentation: An Efficient Decoding Procedure For Hidden Markov Models [70.26374282390401]
Decoding the original signal (i.e., hidden chain) from the noisy observations is one of the main goals in nearly all HMM based data analyses. We present Quick Adaptive Ternary (QATS), a divide-and-conquer procedure which decodes the hidden sequence in polylogarithmic computational complexity.
arXiv Detail & Related papers (2023-05-29T19:37:48Z)
Scalable Quantum Error Correction for Surface Codes using FPGA [67.74017895815125]
A fault-tolerant quantum computer must decode and correct errors faster than they appear. We report a distributed version of the Union-Find decoder that exploits parallel computing resources for further speedup. The implementation employs a scalable architecture called Helios that organizes parallel computing resources into a hybrid tree-grid structure.
arXiv Detail & Related papers (2023-01-20T04:23:00Z)
Rapid Person Re-Identification via Sub-space Consistency Regularization [51.76876061721556]
Person Re-Identification (ReID) matches pedestrians across disjoint cameras. Existing ReID methods adopting real-value feature descriptors have achieved high accuracy, but they are low in efficiency due to the slow Euclidean distance computation. We propose a novel Sub-space Consistency Regularization (SCR) algorithm that can speed up the ReID procedure by 0.25$ times.
arXiv Detail & Related papers (2022-07-13T02:44:05Z)
Batch-efficient EigenDecomposition for Small and Medium Matrices [65.67315418971688]
EigenDecomposition (ED) is at the heart of many computer vision algorithms and applications. We propose a QR-based ED method dedicated to the application scenarios of computer vision.
arXiv Detail & Related papers (2022-07-09T09:14:12Z)
Fast 2D Convolutions and Cross-Correlations Using Scalable Architectures [2.2940141855172027]
The basic idea is to map 2D convolutions and cross-correlations to a collection of 1D convolutions and cross-correlations in the transform domain. The approach uses scalable architectures that can be fitted into modern FPGA and Zynq-SOC devices.
arXiv Detail & Related papers (2021-12-24T22:34:51Z)
Efficient GPU implementation of randomized SVD and its applications [17.71779625877989]
Matrix decompositions are ubiquitous in machine learning, including applications in dimensionality data compression and deep learning algorithms. Typical solutions for matrix decompositions have complexity which significantly increases their computational cost and time. We leverage efficient processing operations that can be run in parallel on modern Graphical Processing Units (GPUs) to reduce the computational burden of computing matrix decompositions.
arXiv Detail & Related papers (2021-10-05T07:42:41Z)
Why Approximate Matrix Square Root Outperforms Accurate SVD in Global Covariance Pooling? [59.820507600960745]
We propose a new GCP meta-layer that uses SVD in the forward pass, and Pad'e Approximants in the backward propagation to compute the gradients. The proposed meta-layer has been integrated into different CNN models and achieves state-of-the-art performances on both large-scale and fine-grained datasets.
arXiv Detail & Related papers (2021-05-06T08:03:45Z)

This list is automatically generated from the titles and abstracts of the papers in this site.