Memory-Efficient Differentiable Programming for Quantum Optimal Control
of Discrete Lattices
- URL: http://arxiv.org/abs/2210.08378v1
- Date: Sat, 15 Oct 2022 20:59:23 GMT
- Title: Memory-Efficient Differentiable Programming for Quantum Optimal Control
of Discrete Lattices
- Authors: Xian Wang, Paul Kairys, Sri Hari Krishna Narayanan, Jan Hückelheim,
Paul Hovland
- Abstract summary: Quantum optimal control problems are typically solved by gradient-based algorithms such as GRAPE.
Employing QOC for discrete lattices reveals that memory requirements are a barrier to simulating large models or long time spans.
We employ a nonstandard differentiable programming approach that significantly reduces the memory requirements at the cost of a reasonable amount of recomputation.
- Score: 1.5012666537539614
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Quantum optimal control problems are typically solved by gradient-based
algorithms such as GRAPE, which suffer from exponential growth in storage with
increasing number of qubits and linear growth in memory requirements with
increasing number of time steps. Employing QOC for discrete lattices reveals
that these memory requirements are a barrier for simulating large models or
long time spans. We employ a nonstandard differentiable programming approach
that significantly reduces the memory requirements at the cost of a reasonable
amount of recomputation. The approach exploits invertibility properties of the
unitary matrices to reverse the computation during back-propagation. We utilize
QOC software written in the differentiable programming framework JAX that
implements this approach, and demonstrate its effectiveness for lattice gauge
theory.
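The paper's released software is not reproduced here, but the reversal idea described in the abstract can be made concrete. The following is a minimal, hypothetical JAX sketch, not the authors' implementation: the function names (make_evolve, propagator), the eigendecomposition-based propagator, and the single-qubit toy data are illustrative assumptions. The forward pass applies the piecewise-constant propagators with lax.scan and saves only the controls and the final state; the custom backward pass regenerates each intermediate state through U_k^{-1} = U_k^†, so memory stays constant in the number of time steps at the price of recomputing one propagator per step.

```python
import jax
import jax.numpy as jnp


def make_evolve(h0, h_ctrl, dt):
    """Build evolve(controls, psi0) whose backward pass replays the trajectory
    in reverse using U_k^{-1} = U_k^dagger instead of storing every state."""

    def propagator(u_k):
        # One piecewise-constant step U_k = exp(-i*dt*(H0 + u_k*Hc)), built via
        # an eigendecomposition (assumes a non-degenerate spectrum; a matrix
        # exponential routine could be substituted here).
        evals, evecs = jnp.linalg.eigh(h0 + u_k * h_ctrl)
        return (evecs * jnp.exp(-1j * dt * evals)) @ evecs.conj().T

    @jax.custom_vjp
    def evolve(controls, psi0):
        def step(psi, u_k):
            return propagator(u_k) @ psi, None
        psi_final, _ = jax.lax.scan(step, psi0, controls)
        return psi_final

    def fwd(controls, psi0):
        psi_final = evolve(controls, psi0)
        # Residuals: only the controls and the final state -- O(1) in the
        # number of time steps, unlike standard reverse-mode checkpointing.
        return psi_final, (controls, psi_final)

    def bwd(res, psi_bar):
        controls, psi_final = res

        def back_step(carry, u_k):
            psi_k, psi_bar_k = carry
            U_k = propagator(u_k)                   # recomputed, not stored
            psi_prev = U_k.conj().T @ psi_k         # invert the step via U^dagger
            # Exact VJP of the single step psi_k = U(u_k) @ psi_prev.
            _, step_vjp = jax.vjp(lambda u, p: propagator(u) @ p, u_k, psi_prev)
            u_bar, psi_prev_bar = step_vjp(psi_bar_k)
            return (psi_prev, psi_prev_bar), u_bar

        (_, psi0_bar), u_bars = jax.lax.scan(
            back_step, (psi_final, psi_bar), controls, reverse=True)
        return u_bars, psi0_bar

    evolve.defvjp(fwd, bwd)
    return evolve


# Toy single-qubit usage (hypothetical data, only to exercise the sketch).
sx = jnp.array([[0.0, 1.0], [1.0, 0.0]], dtype=jnp.complex64)
sz = jnp.array([[1.0, 0.0], [0.0, -1.0]], dtype=jnp.complex64)
evolve = make_evolve(h0=sz, h_ctrl=sx, dt=0.1)
psi0 = jnp.array([1.0, 0.0], dtype=jnp.complex64)
target = jnp.array([0.0, 1.0], dtype=jnp.complex64)
controls = jnp.zeros(200)

def infidelity(c):
    return 1.0 - jnp.abs(jnp.vdot(target, evolve(c, psi0))) ** 2

grads = jax.grad(infidelity)(controls)  # memory stays flat as len(controls) grows
```

Standard reverse-mode AD through the same loop would instead store every intermediate state, whose dimension grows as 2^n with the qubit count; that cumulative storage is the barrier the abstract describes.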
Related papers
- Progressive Mixed-Precision Decoding for Efficient LLM Inference [49.05448842542558]
We introduce Progressive Mixed-Precision Decoding (PMPD) to address the memory-boundedness of decoding.
PMPD achieves a 1.4x to 12.2x speedup in matrix-vector multiplications over fp16 models.
Our approach delivers a throughput gain of 3.8x to 8.0x over fp16 models and up to 1.54x over uniform quantization approaches.
arXiv Detail & Related papers (2024-10-17T11:46:33Z)
- Q-VLM: Post-training Quantization for Large Vision-Language Models [73.19871905102545]
We propose a post-training quantization framework of large vision-language models (LVLMs) for efficient multi-modal inference.
We mine the cross-layer dependency that significantly influences discretization errors of the entire vision-language model, and embed this dependency into optimal quantization strategy.
Experimental results demonstrate that our method compresses memory by 2.78x and increases generation speed by 1.44x for the 13B LLaVA model without performance degradation.
arXiv Detail & Related papers (2024-10-10T17:02:48Z)
- Memory-Augmented Quantum Reservoir Computing [0.0]
We present a hybrid quantum-classical approach that implements memory through classical post-processing of quantum measurements.
We tested our model on two physical platforms: a fully connected Ising model and a Rydberg atom array.
arXiv Detail & Related papers (2024-09-15T22:44:09Z)
- ThinK: Thinner Key Cache by Query-Driven Pruning [63.13363917871414]
Large Language Models (LLMs) have revolutionized the field of natural language processing, achieving unprecedented performance across a variety of applications.
This paper focuses on the long-context scenario, addressing the inefficiencies in KV cache memory consumption during inference.
We propose ThinK, a novel query-dependent KV cache pruning method designed to minimize attention weight loss while selectively pruning the least significant channels.
arXiv Detail & Related papers (2024-07-30T17:59:08Z)
- Sparser is Faster and Less is More: Efficient Sparse Attention for Long-Range Transformers [58.5711048151424]
We introduce SPARSEK Attention, a novel sparse attention mechanism designed to overcome computational and memory obstacles.
Our approach integrates a scoring network and a differentiable top-k mask operator, SPARSEK, to select a constant number of KV pairs for each query.
Experimental results reveal that SPARSEK Attention outperforms previous sparse attention methods.
arXiv Detail & Related papers (2024-06-24T15:55:59Z)
- Memory-Efficient Optimization with Factorized Hamiltonian Descent [11.01832755213396]
We introduce H-Fac, a novel adaptive optimizer that incorporates a memory-efficient factorization approach to address this challenge.
By employing a rank-1 parameterization for both momentum and scaling parameter estimators, H-Fac reduces memory costs to a sublinear level.
We develop our algorithms based on principles derived from Hamiltonian dynamics, providing robust theoretical underpinnings in optimization dynamics and convergence guarantees.
arXiv Detail & Related papers (2024-06-14T12:05:17Z)
- Optimal control of large quantum systems: assessing memory and runtime performance of GRAPE [0.0]
GRAPE is a popular technique in quantum optimal control, and can be combined with automatic differentiation.
We show that the convenience of AD comes at a significant memory cost due to the cumulative storage of a large number of states and propagators.
We revisit the strategy of hard-coding gradients in a scheme that fully avoids propagator storage and significantly reduces memory requirements (a standard form of the GRAPE gradient is sketched after this list).
arXiv Detail & Related papers (2023-04-13T00:24:40Z)
- Energy-efficient Task Adaptation for NLP Edge Inference Leveraging Heterogeneous Memory Architectures [68.91874045918112]
The proposed adapter-ALBERT is an efficient model optimization for maximal data reuse across different tasks.
We demonstrate the advantage of mapping the model to a heterogeneous on-chip memory architecture by performing simulations on a validated NLP edge accelerator.
arXiv Detail & Related papers (2023-03-25T14:40:59Z)
- Memory Safe Computations with XLA Compiler [14.510796427699459]
An XLA compiler extension adjusts the representation of an algorithm according to a user-specified memory limit.
We show that k-nearest neighbour and sparse Gaussian process regression methods can be run at a much larger scale on a single device.
arXiv Detail & Related papers (2022-06-28T16:59:28Z)
- Reducing Memory Requirements of Quantum Optimal Control [0.0]
Gradient-based algorithms such as GRAPE suffer from exponential growth in storage with increasing number of qubits and linear growth in memory requirements with increasing number of time steps.
We have created a nonstandard automatic differentiation technique that can compute gradients needed by GRAPE by exploiting the fact that the inverse of a unitary matrix is its conjugate transpose.
Our approach significantly reduces the memory requirements for GRAPE, at the cost of a reasonable amount of recomputation.
arXiv Detail & Related papers (2022-03-23T20:42:54Z) - Space-efficient binary optimization for variational computing [68.8204255655161]
We show that it is possible to greatly reduce the number of qubits needed for the Traveling Salesman Problem.
We also propose encoding schemes which smoothly interpolate between the qubit-efficient and the circuit depth-efficient models.
arXiv Detail & Related papers (2020-09-15T18:17:27Z)
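As referenced in the GRAPE runtime/memory entry above, a standard first-order form of the GRAPE state-transfer gradient makes the memory trade-off explicit. This is a textbook-style sketch following the original GRAPE formulation, not text taken from either GRAPE-related entry:

```latex
% First-order GRAPE gradient for state transfer with piecewise-constant
% controls u_{j,k} over N steps on n qubits.
\[
  U_k = \exp\!\Big(-i\,\Delta t\,\big(H_0 + \textstyle\sum_j u_{j,k} H_j\big)\Big),
  \qquad
  |\psi_k\rangle = U_k \cdots U_1 |\psi_0\rangle,
  \qquad
  |\lambda_k\rangle = U_{k+1}^{\dagger} \cdots U_N^{\dagger} |\psi_{\mathrm{tgt}}\rangle .
\]
\[
  c = \langle \psi_{\mathrm{tgt}} | \psi_N \rangle,
  \qquad
  \frac{\partial c}{\partial u_{j,k}} \approx
    \langle \lambda_k |\, \big(-i\,\Delta t\, H_j\big) \,| \psi_k \rangle,
  \qquad
  \frac{\partial |c|^2}{\partial u_{j,k}} =
    2\,\mathrm{Re}\!\left[ c^{*}\, \frac{\partial c}{\partial u_{j,k}} \right].
\]
% Storing every |psi_k> (or every U_k) costs O(N 2^n) (or O(N 4^n)) memory;
% unitarity lets the backward pass regenerate the states instead,
\[
  |\psi_{k-1}\rangle = U_k^{\dagger}\, |\psi_k\rangle ,
\]
% trading that storage for one extra propagator application per step.
```

The final identity is the same reversal exploited in the JAX sketch after the main abstract: instead of storing all N states, the backward pass walks the trajectory back with one conjugate-transpose application per step.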
This list is automatically generated from the titles and abstracts of the papers on this site.