Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
- URL: http://arxiv.org/abs/2411.03900v1
- Date: Wed, 06 Nov 2024 13:24:34 GMT
- Title: Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
- Authors: Oliver Knitter, Dan Zhao, James Stokes, Martin Ganahl, Stefan Leichenauer, Shravan Veerapaneni
- Abstract summary: We explore the use of the retentive network (RetNet) as an ansatz for solving electronic ground state problems in quantum chemistry.
We show that the RetNet overcomes the transformer's time complexity bottleneck by processing data in parallel during training and recurrently during inference.
- Score: 10.423935999935315
- Abstract: Neural-network quantum states (NQS) have emerged as a powerful application of quantum-inspired deep learning for variational Monte Carlo methods, offering a competitive alternative to existing techniques for identifying ground states of quantum problems. A significant advancement toward improving the practical scalability of NQS has been the incorporation of autoregressive models, most recently transformers, as variational ansätze. Transformers learn sequence information with greater expressiveness than recurrent models, but at the cost of increased time complexity with respect to sequence length. We explore the use of the retentive network (RetNet), a recurrent alternative to transformers, as an ansatz for solving electronic ground state problems in $\textit{ab initio}$ quantum chemistry. Unlike transformers, RetNets overcome this time complexity bottleneck by processing data in parallel during training and recurrently during inference. We give a simple computational cost estimate of the RetNet and directly compare it with similar estimates for transformers, establishing a clear threshold ratio of problem-to-model size past which the RetNet's time complexity outperforms that of the transformer. Though this efficiency can come at the expense of decreased expressiveness relative to the transformer, we overcome this gap through training strategies that leverage the autoregressive structure of the model, namely variational neural annealing. Our findings support the RetNet as a means of improving the time complexity of NQS without sacrificing accuracy. We provide further evidence that the improvements from neural annealing observed in our ablations extend beyond the RetNet architecture, suggesting it would serve as an effective general training strategy for autoregressive NQS.
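For intuition about the parallel-training/recurrent-inference duality described in the abstract, here is a minimal single-head retention sketch. It is not the authors' implementation: the sequence length, head dimension, decay value `gamma`, and the omission of the RetNet's gating and normalization layers are all illustrative assumptions.

```python
# Minimal sketch: one retention head evaluated two equivalent ways.
# Parallel form (used during training) vs. recurrent form (used during
# autoregressive inference/sampling). Dimensions and gamma are arbitrary.
import numpy as np

rng = np.random.default_rng(0)
T, d = 6, 4                      # sequence length (e.g. number of orbitals) and head dimension
gamma = 0.9                      # exponential decay factor of the retention head

X = rng.normal(size=(T, d))      # input embeddings for one configuration
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
Q, K, V = X @ Wq, X @ Wk, X @ Wv

# Parallel form: all positions at once, roughly O(T^2 d) work.
n = np.arange(T)
D = np.where(n[:, None] >= n[None, :], gamma ** (n[:, None] - n[None, :]), 0.0)
out_parallel = (Q @ K.T * D) @ V

# Recurrent form: one position at a time with a d x d state, roughly O(T d^2) work.
S = np.zeros((d, d))
out_recurrent = np.zeros_like(out_parallel)
for t in range(T):
    S = gamma * S + np.outer(K[t], V[t])   # update the running state
    out_recurrent[t] = Q[t] @ S            # read out for position t

assert np.allclose(out_parallel, out_recurrent)
print("parallel and recurrent retention agree:", np.allclose(out_parallel, out_recurrent))
```

Roughly, the parallel form costs O(T^2 d) per configuration while the recurrent form costs O(T d^2), which is the kind of trade-off behind the problem-to-model-size threshold the abstract refers to; the exact cost estimate is given in the paper.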
Related papers
- UDiTQC: U-Net-Style Diffusion Transformer for Quantum Circuit Synthesis [13.380226276791818]
Current diffusion model approaches based on U-Net architectures, while promising, encounter challenges related to computational efficiency and modeling global context.
We propose UDiT, a novel U-Net-style Diffusion Transformer architecture, which combines U-Net's strengths in multi-scale feature extraction with the Transformer's ability to model global context.
arXiv Detail & Related papers (2025-01-24T15:15:50Z)
- Regression and Classification with Single-Qubit Quantum Neural Networks [0.0]
We use a resource-efficient and scalable Single-Qubit Quantum Neural Network (SQQNN) for both regression and classification tasks.
For classification, we introduce a novel training method inspired by the Taylor series, which can efficiently find a global minimum in a single step.
The SQQNN exhibits virtually error-free and strong performance in regression and classification tasks, including the MNIST dataset.
arXiv Detail & Related papers (2024-12-12T17:35:36Z)
- Deep-Unrolling Multidimensional Harmonic Retrieval Algorithms on Neuromorphic Hardware [78.17783007774295]
This paper explores the potential of conversion-based neuromorphic algorithms for highly accurate and energy-efficient single-snapshot multidimensional harmonic retrieval.
A novel method for converting the complex-valued convolutional layers and activations into spiking neural networks (SNNs) is developed.
The converted SNNs achieve almost five-fold power efficiency at moderate performance loss compared to the original CNNs.
arXiv Detail & Related papers (2024-12-05T09:41:33Z)
- Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer [29.970200877158764]
We investigate the influence of recurrent structures in neural models on their reasoning abilities and computability.
We shed light on how the CoT approach can mimic recurrent computation and act as a bridge between autoregression and recurrence.
arXiv Detail & Related papers (2024-09-14T00:30:57Z)
- Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers [10.566264033360282]
Post-training quantization (PTQ) has emerged as a promising solution for deploying hyper-scale models on edge devices such as mobile phones and TVs.
In this paper, we propose a novel PTQ algorithm that balances accuracy and efficiency.
arXiv Detail & Related papers (2024-02-14T05:58:43Z)
- Split-Boost Neural Networks [1.1549572298362787]
We propose an innovative training strategy for feed-forward architectures - called split-boost.
Such a novel approach ultimately allows us to avoid explicitly modeling the regularization term.
The proposed strategy is tested on a real-world (anonymized) dataset within a benchmark medical insurance design problem.
arXiv Detail & Related papers (2023-09-06T17:08:57Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformer architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
We scale our models as large as 14 billion parameters, by far the largest dense RNN ever trained, and find RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
- Optimizing Tensor Network Contraction Using Reinforcement Learning [86.05566365115729]
We propose a Reinforcement Learning (RL) approach combined with Graph Neural Networks (GNN) to address the contraction ordering problem.
The problem is extremely challenging due to the huge search space, the heavy-tailed reward distribution, and the difficulty of credit assignment.
We show how a carefully implemented RL-agent that uses a GNN as the basic policy construct can address these challenges.
arXiv Detail & Related papers (2022-04-18T21:45:13Z) - Simulating the Mott transition on a noisy digital quantum computer via
Cartan-based fast-forwarding circuits [62.73367618671969]
Dynamical mean-field theory (DMFT) maps the local Green's function of the Hubbard model to that of the Anderson impurity model.
Quantum and hybrid quantum-classical algorithms have been proposed to efficiently solve impurity models.
This work presents the first computation of the Mott phase transition using noisy digital quantum hardware.
arXiv Detail & Related papers (2021-12-10T17:32:15Z) - Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation.
A linear-complexity recurrent variant has proven well suited for autoregressive generation.
This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
arXiv Detail & Related papers (2021-03-24T10:50:43Z)