Memory-Efficient Differentiable Programming for Quantum Optimal Control
of Discrete Lattices
- URL: http://arxiv.org/abs/2210.08378v1
- Date: Sat, 15 Oct 2022 20:59:23 GMT
- Title: Memory-Efficient Differentiable Programming for Quantum Optimal Control
of Discrete Lattices
- Authors: Xian Wang, Paul Kairys, Sri Hari Krishna Narayanan, Jan Hückelheim,
Paul Hovland
- Abstract summary: Quantum optimal control problems are typically solved by gradient-based algorithms such as GRAPE.
Employing QOC for discrete lattices reveals that memory requirements are a barrier to simulating large models or long time spans.
We employ a nonstandard differentiable programming approach that significantly reduces the memory requirements at the cost of a reasonable amount of recomputation.
- Score: 1.5012666537539614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantum optimal control problems are typically solved by gradient-based
algorithms such as GRAPE, which suffer from exponential growth in storage with
increasing number of qubits and linear growth in memory requirements with
increasing number of time steps. Employing QOC for discrete lattices reveals
that these memory requirements are a barrier for simulating large models or
long time spans. We employ a nonstandard differentiable programming approach
that significantly reduces the memory requirements at the cost of a reasonable
amount of recomputation. The approach exploits invertibility properties of the
unitary matrices to reverse the computation during back-propagation. We utilize
QOC software written in the differentiable programming framework JAX that
implements this approach, and demonstrate its effectiveness for lattice gauge
theory.
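The paper's released software is not reproduced here, but the reversal idea described in the abstract can be made concrete. The following is a minimal, hypothetical JAX sketch, not the authors' implementation: the function names (make_evolve, propagator), the eigendecomposition-based propagator, and the single-qubit toy data are illustrative assumptions. The forward pass applies the piecewise-constant propagators with lax.scan and saves only the controls and the final state; the custom backward pass regenerates each intermediate state through U_k^{-1} = U_k^†, so memory stays constant in the number of time steps at the price of recomputing one propagator per step.

```python
import jax
import jax.numpy as jnp


def make_evolve(h0, h_ctrl, dt):
    """Build evolve(controls, psi0) whose backward pass replays the trajectory
    in reverse using U_k^{-1} = U_k^dagger instead of storing every state."""

    def propagator(u_k):
        # One piecewise-constant step U_k = exp(-i*dt*(H0 + u_k*Hc)), built via
        # an eigendecomposition (assumes a non-degenerate spectrum; a matrix
        # exponential routine could be substituted here).
        evals, evecs = jnp.linalg.eigh(h0 + u_k * h_ctrl)
        return (evecs * jnp.exp(-1j * dt * evals)) @ evecs.conj().T

    @jax.custom_vjp
    def evolve(controls, psi0):
        def step(psi, u_k):
            return propagator(u_k) @ psi, None
        psi_final, _ = jax.lax.scan(step, psi0, controls)
        return psi_final

    def fwd(controls, psi0):
        psi_final = evolve(controls, psi0)
        # Residuals: only the controls and the final state -- O(1) in the
        # number of time steps, unlike standard reverse-mode checkpointing.
        return psi_final, (controls, psi_final)

    def bwd(res, psi_bar):
        controls, psi_final = res

        def back_step(carry, u_k):
            psi_k, psi_bar_k = carry
            U_k = propagator(u_k)                   # recomputed, not stored
            psi_prev = U_k.conj().T @ psi_k         # invert the step via U^dagger
            # Exact VJP of the single step psi_k = U(u_k) @ psi_prev.
            _, step_vjp = jax.vjp(lambda u, p: propagator(u) @ p, u_k, psi_prev)
            u_bar, psi_prev_bar = step_vjp(psi_bar_k)
            return (psi_prev, psi_prev_bar), u_bar

        (_, psi0_bar), u_bars = jax.lax.scan(
            back_step, (psi_final, psi_bar), controls, reverse=True)
        return u_bars, psi0_bar

    evolve.defvjp(fwd, bwd)
    return evolve


# Toy single-qubit usage (hypothetical data, only to exercise the sketch).
sx = jnp.array([[0.0, 1.0], [1.0, 0.0]], dtype=jnp.complex64)
sz = jnp.array([[1.0, 0.0], [0.0, -1.0]], dtype=jnp.complex64)
evolve = make_evolve(h0=sz, h_ctrl=sx, dt=0.1)
psi0 = jnp.array([1.0, 0.0], dtype=jnp.complex64)
target = jnp.array([0.0, 1.0], dtype=jnp.complex64)
controls = jnp.zeros(200)

def infidelity(c):
    return 1.0 - jnp.abs(jnp.vdot(target, evolve(c, psi0))) ** 2

grads = jax.grad(infidelity)(controls)  # memory stays flat as len(controls) grows
```

Standard reverse-mode AD through the same loop would instead store every intermediate state, whose dimension grows as 2^n with the qubit count; that cumulative storage is the barrier the abstract describes.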
Related papers
- Progressive Mixed-Precision Decoding for Efficient LLM Inference [49.05448842542558]
We introduce Progressive Mixed-Precision Decoding (PMPD) to address the memory-boundedness of decoding.
PMPD achieves a 1.4x to 12.2x speedup in matrix-vector multiplications over fp16 models.
Our approach delivers a throughput gain of 3.8x to 8.0x over fp16 models and up to 1.54x over uniform quantization approaches.
arXiv Detail & Related papers (2024-10-17T11:46:33Z)
- Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x for the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z)
- Memory-Augmented Quantum Reservoir Computing [0.0]
We present a hybrid quantum-classical approach that implements memory through classical post-processing of quantum measurements.
We tested our model on two physical platforms: a fully connected Ising model and a Rydberg atom array.
arXiv Detail & Related papers (2024-09-15T22:44:09Z)
- ThinK: Thinner Key Cache by Query-Driven Pruning [63.13363917871414]
Large Language Models (LLMs) have revolutionized the field of natural language processing, achieving unprecedented performance across a variety of applications.
This paper focuses on the long-context scenario, addressing the inefficiencies in KV cache memory consumption during inference.
We propose ThinK, a novel query-dependent KV cache pruning method designed to minimize attention weight loss while selectively pruning the least significant channels.
arXiv Detail & Related papers (2024-07-30T17:59:08Z)
- Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles.
Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query.
Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z)
- Memory-Efficient Optimization with Factorized Hamiltonian Descent [11.01832755213396]
We introduce H-Fac, a novel adaptive optimizer that incorporates a memory-efficient factorization approach to address this challenge.
By employing a rank-1 parameterization for both momentum and scaling parameter estimators, H-Fac reduces memory costs to a sublinear level.
We develop our algorithms based on principles derived from Hamiltonian dynamics, providing robust theoretical underpinnings in optimization dynamics and convergence guarantees.
arXiv Detail & Related papers (2024-06-14T12:05:17Z)
- Optimal control of large quantum systems: assessing memory and runtime performance of GRAPE [0.0]
GRAPE is a popular technique in quantum optimal control, and can be combined with automatic differentiation.
We show that the convenience of AD comes at a significant memory cost due to the cumulative storage of a large number of states and propagators.
We revisit the strategy of hard-coding gradients in a scheme that fully avoids propagator storage and significantly reduces memory requirements (a standard form of the GRAPE gradient is sketched after this list).
arXiv Detail & Related papers (2023-04-13T00:24:40Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
The proposed adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Memory Safe Computations with XLA Compiler [14.510796427699459]
An XLA compiler extension adjusts the representation of an algorithm according to a user-specified memory limit.
We show that k-nearest neighbour and sparse Gaussian process regression methods can be run at a much larger scale on a single device.
arXiv Detail & Related papers (2022-06-28T16:59:28Z)
- Reducing Memory Requirements of Quantum Optimal Control [0.0]
Gradient-based algorithms such as GRAPE suffer from exponential growth in storage with increasing number of qubits and linear growth in memory requirements with increasing number of time steps.
We have created a nonstandard automatic differentiation technique that can compute gradients needed by GRAPE by exploiting the fact that the inverse of a unitary matrix is its conjugate transpose.
Our approach significantly reduces the memory requirements for GRAPE, at the cost of a reasonable amount of recomputation.
arXiv Detail & Related papers (2022-03-23T20:42:54Z) - Space-efficient binary optimization for variational computing [68.8204255655161]
We show that it is possible to greatly reduce the number of qubits needed for the Traveling Salesman Problem.
We also propose encoding schemes which smoothly interpolate between the qubit-efficient and the circuit depth-efficient models.
arXiv Detail & Related papers (2020-09-15T18:17:27Z)
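As referenced in the GRAPE runtime/memory entry above, a standard first-order form of the GRAPE state-transfer gradient makes the memory trade-off explicit. This is a textbook-style sketch following the original GRAPE formulation, not text taken from either GRAPE-related entry:

```latex
% First-order GRAPE gradient for state transfer with piecewise-constant
% controls u_{j,k} over N steps on n qubits.
\[
  U_k = \exp\!\Big(-i\,\Delta t\,\big(H_0 + \textstyle\sum_j u_{j,k} H_j\big)\Big),
  \qquad
  |\psi_k\rangle = U_k \cdots U_1 |\psi_0\rangle,
  \qquad
  |\lambda_k\rangle = U_{k+1}^{\dagger} \cdots U_N^{\dagger} |\psi_{\mathrm{tgt}}\rangle .
\]
\[
  c = \langle \psi_{\mathrm{tgt}} | \psi_N \rangle,
  \qquad
  \frac{\partial c}{\partial u_{j,k}} \approx
    \langle \lambda_k |\, \big(-i\,\Delta t\, H_j\big) \,| \psi_k \rangle,
  \qquad
  \frac{\partial |c|^2}{\partial u_{j,k}} =
    2\,\mathrm{Re}\!\left[ c^{*}\, \frac{\partial c}{\partial u_{j,k}} \right].
\]
% Storing every |psi_k> (or every U_k) costs O(N 2^n) (or O(N 4^n)) memory;
% unitarity lets the backward pass regenerate the states instead,
\[
  |\psi_{k-1}\rangle = U_k^{\dagger}\, |\psi_k\rangle ,
\]
% trading that storage for one extra propagator application per step.
```

The final identity is the same reversal exploited in the JAX sketch after the main abstract: instead of storing all N states, the backward pass walks the trajectory back with one conjugate-transpose application per step.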
This list is automatically generated from the titles and abstracts of the papers on this site.