Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
- URL: http://arxiv.org/abs/2411.03900v1
- Date: Wed, 06 Nov 2024 13:24:34 GMT
- Title: Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
- Authors: Oliver Knitter, Dan Zhao, James Stokes, Martin Ganahl, Stefan Leichenauer, Shravan Veerapaneni
- Abstract summary: We explore the use of the retentive network (RetNet) as an ansatz for solving electronic ground state problems in quantum chemistry.
We show that the RetNet overcomes the transformer's time-complexity bottleneck by processing data in parallel during training and recurrently during inference.
- Score: 10.423935999935315
- License:
- Abstract: Neural-network quantum states (NQS) have emerged as a powerful application of quantum-inspired deep learning for variational Monte Carlo methods, offering a competitive alternative to existing techniques for identifying ground states of quantum problems. A significant advancement toward improving the practical scalability of NQS has been the incorporation of autoregressive models, most recently transformers, as variational ansätze. Transformers learn sequence information with greater expressiveness than recurrent models, but at the cost of increased time complexity with respect to sequence length. We explore the use of the retentive network (RetNet), a recurrent alternative to transformers, as an ansatz for solving electronic ground state problems in $\textit{ab initio}$ quantum chemistry. Unlike transformers, RetNets overcome this time complexity bottleneck by processing data in parallel during training and recurrently during inference. We give a simple computational cost estimate of the RetNet and directly compare it with similar estimates for transformers, establishing a clear threshold ratio of problem-to-model size past which the RetNet's time complexity outperforms that of the transformer. Though this efficiency can come at the expense of decreased expressiveness relative to the transformer, we close this gap through training strategies that leverage the autoregressive structure of the model -- namely, variational neural annealing. Our findings support the RetNet as a means of improving the time complexity of NQS without sacrificing accuracy. We provide further evidence that the improvements attributable to neural annealing extend beyond the RetNet architecture, suggesting it would serve as an effective general training strategy for autoregressive NQS.
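The efficiency claim in the abstract rests on retention's dual formulation: a parallel form for training and a recurrent form for autoregressive sampling. Below is a minimal NumPy sketch (not the authors' code; the sequence length, head dimension, and decay constant are illustrative placeholders) showing that a single retention head gives identical outputs in both forms.

```python
# Minimal sketch of a single retention head: the parallel form used during
# training agrees with the constant-memory recurrent form used when sampling
# an occupation string autoregressively. Values of L, d, gamma are assumptions.
import numpy as np

rng = np.random.default_rng(0)
L, d = 6, 4          # sequence length (e.g. spin orbitals), head dimension
gamma = 0.9          # per-head exponential decay used by retention

Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))

# --- Parallel form (training): O = (Q K^T * D) V with causal decay mask D ---
n, m = np.arange(L)[:, None], np.arange(L)[None, :]
D = np.where(n >= m, gamma ** (n - m), 0.0)
O_parallel = (Q @ K.T * D) @ V

# --- Recurrent form (inference): S_t = gamma * S_{t-1} + k_t^T v_t, o_t = q_t S_t ---
S = np.zeros((d, d))
O_recurrent = np.zeros((L, d))
for t in range(L):
    S = gamma * S + np.outer(K[t], V[t])   # update running d x d state
    O_recurrent[t] = Q[t] @ S              # read out current token

assert np.allclose(O_parallel, O_recurrent)  # both forms produce the same output
```

In this rough accounting, the recurrent update costs on the order of d^2 per generated token, whereas transformer-style attention accumulates work on the order of L*d per token over the prefix; a comparison of this kind underlies the paper's problem-to-model-size threshold, whose exact value is not reproduced here.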
Related papers
- QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models [2.6663666678221376]
Convolutional neural networks (CNNs) have made significant advances in computer vision tasks, yet their high inference times and latency limit real-world applicability.
We introduce QIANets: a novel approach that redesigns the traditional GoogLeNet, DenseNet, and ResNet-18 architectures to process more parameters and computations while maintaining low inference times.
Despite experimental limitations, the method was tested and evaluated, demonstrating reductions in inference times while effectively preserving accuracy.
arXiv Detail & Related papers (2024-10-14T09:24:48Z)
- Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer [29.970200877158764]
We investigate the influence of recurrent structures in neural models on their reasoning abilities and computability.
We shed light on how the CoT approach can mimic recurrent computation and act as a bridge between autoregression and recurrence.
arXiv Detail & Related papers (2024-09-14T00:30:57Z)
- Learning to Program Variational Quantum Circuits with Fast Weights [3.6881738506505988]
This paper introduces the Quantum Fast Weight Programmers (QFWP) as a solution to the temporal or sequential learning challenge.
The proposed QFWP model learns temporal dependencies without requiring quantum recurrent neural networks.
Numerical simulations conducted in this study showcase the efficacy of the proposed QFWP model in both time-series prediction and RL tasks.
arXiv Detail & Related papers (2024-02-27T18:53:18Z)
- Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers [10.566264033360282]
Post-training quantization (PTQ) has emerged as a promising solution for deploying hyper-scale models on edge devices such as mobile phones and TVs.
In this paper, we propose a novel PTQ algorithm that balances accuracy and efficiency.
arXiv Detail & Related papers (2024-02-14T05:58:43Z)
- Split-Boost Neural Networks [1.1549572298362787]
We propose an innovative training strategy for feed-forward architectures - called split-boost.
Such a novel approach ultimately allows us to avoid explicitly modeling the regularization term.
The proposed strategy is tested on a real-world (anonymized) dataset within a benchmark medical insurance design problem.
arXiv Detail & Related papers (2023-09-06T17:08:57Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformer architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
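As a rough illustration of the incremental-computing idea summarized above (a toy sketch only, not the cited paper's algorithm, which also has to handle attention layers), caching position-wise work and recomputing only the positions whose input changed already yields cost proportional to the fraction of modified inputs:

```python
# Toy illustration of incremental inference: cache per-position computations and
# redo work only where the input changed. The layer, embedding, and dimensions
# below are stand-ins invented for this sketch.
import numpy as np

d = 8
W = np.random.default_rng(1).standard_normal((d, d))   # stand-in for a layer's weights

def embed(token: int) -> np.ndarray:
    # deterministic stand-in for a token embedding
    return np.cos(token * np.arange(1, d + 1, dtype=float))

class IncrementalLayer:
    def __init__(self):
        self.cache = {}   # position -> (token, cached output)

    def __call__(self, tokens):
        recomputed, outputs = 0, []
        for pos, tok in enumerate(tokens):
            hit = self.cache.get(pos)
            if hit is None or hit[0] != tok:        # input at this position changed
                out = np.tanh(embed(tok) @ W)       # recompute only here
                self.cache[pos] = (tok, out)
                recomputed += 1
            outputs.append(self.cache[pos][1])
        return np.stack(outputs), recomputed

layer = IncrementalLayer()
_, n_first = layer([3, 1, 4, 1, 5, 9])   # first call: all 6 positions computed
_, n_edit = layer([3, 1, 4, 1, 8, 9])    # one token edited: only 1 recomputation
print(n_first, n_edit)                   # 6 1
```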
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
We scale our models up to 14 billion parameters, by far the largest dense RNN ever trained, and find that RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Optimizing Tensor Network Contraction Using Reinforcement Learning [86.05566365115729]
We propose a Reinforcement Learning (RL) approach combined with Graph Neural Networks (GNN) to address the contraction ordering problem.
The problem is extremely challenging due to the huge search space, the heavy-tailed reward distribution, and the difficulty of credit assignment.
We show how a carefully implemented RL-agent that uses a GNN as the basic policy construct can address these challenges.
arXiv Detail & Related papers (2022-04-18T21:45:13Z)
- Simulating the Mott transition on a noisy digital quantum computer via Cartan-based fast-forwarding circuits [62.73367618671969]
Dynamical mean-field theory (DMFT) maps the local Green's function of the Hubbard model to that of the Anderson impurity model.
Quantum and hybrid quantum-classical algorithms have been proposed to efficiently solve impurity models.
This work presents the first computation of the Mott phase transition using noisy digital quantum hardware.
arXiv Detail & Related papers (2021-12-10T17:32:15Z)
- Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation.
A linear-complexity recurrent variant has proven well suited for autoregressive generation.
This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
arXiv Detail & Related papers (2021-03-24T10:50:43Z)