Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
- URL: http://arxiv.org/abs/2411.03900v1
- Date: Wed, 06 Nov 2024 13:24:34 GMT
- Title: Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry
- Authors: Oliver Knitter, Dan Zhao, James Stokes, Martin Ganahl, Stefan Leichenauer, Shravan Veerapaneni
- Abstract summary: We explore the use of the retentive network (RetNet) as an ansatz for solving electronic ground state problems in quantum chemistry.
We show that the RetNet overcomes the transformer's time-complexity bottleneck by processing data in parallel during training and recurrently during inference.
- Score: 10.423935999935315
- License:
- Abstract: Neural-network quantum states (NQS) have emerged as a powerful application of quantum-inspired deep learning for variational Monte Carlo methods, offering a competitive alternative to existing techniques for identifying ground states of quantum problems. A significant advancement toward improving the practical scalability of NQS has been the incorporation of autoregressive models, most recently transformers, as variational ansätze. Transformers learn sequence information with greater expressiveness than recurrent models, but at the cost of increased time complexity with respect to sequence length. We explore the use of the retentive network (RetNet), a recurrent alternative to transformers, as an ansatz for solving electronic ground state problems in $\textit{ab initio}$ quantum chemistry. Unlike transformers, RetNets overcome this time complexity bottleneck by processing data in parallel during training and recurrently during inference. We give a simple computational cost estimate of the RetNet and directly compare it with similar estimates for transformers, establishing a clear threshold ratio of problem-to-model size past which the RetNet's time complexity outperforms that of the transformer. Though this efficiency can come at the expense of decreased expressiveness relative to the transformer, we close this gap through training strategies that leverage the autoregressive structure of the model -- namely, variational neural annealing. Our findings support the RetNet as a means of improving the time complexity of NQS without sacrificing accuracy. We provide further evidence that the improvements attributable to neural annealing extend beyond the RetNet architecture, suggesting it would serve as an effective general training strategy for autoregressive NQS.
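The efficiency claim in the abstract rests on retention's dual formulation: a parallel form for training and a recurrent form for autoregressive sampling. Below is a minimal NumPy sketch (not the authors' code; the sequence length, head dimension, and decay constant are illustrative placeholders) showing that a single retention head gives identical outputs in both forms.

```python
# Minimal sketch of a single retention head: the parallel form used during
# training agrees with the constant-memory recurrent form used when sampling
# an occupation string autoregressively. Values of L, d, gamma are assumptions.
import numpy as np

rng = np.random.default_rng(0)
L, d = 6, 4          # sequence length (e.g. spin orbitals), head dimension
gamma = 0.9          # per-head exponential decay used by retention

Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))

# --- Parallel form (training): O = (Q K^T * D) V with causal decay mask D ---
n, m = np.arange(L)[:, None], np.arange(L)[None, :]
D = np.where(n >= m, gamma ** (n - m), 0.0)
O_parallel = (Q @ K.T * D) @ V

# --- Recurrent form (inference): S_t = gamma * S_{t-1} + k_t^T v_t, o_t = q_t S_t ---
S = np.zeros((d, d))
O_recurrent = np.zeros((L, d))
for t in range(L):
    S = gamma * S + np.outer(K[t], V[t])   # update running d x d state
    O_recurrent[t] = Q[t] @ S              # read out current token

assert np.allclose(O_parallel, O_recurrent)  # both forms produce the same output
```

In this rough accounting, the recurrent update costs on the order of d^2 per generated token, whereas transformer-style attention accumulates work on the order of L*d per token over the prefix; a comparison of this kind underlies the paper's problem-to-model-size threshold, whose exact value is not reproduced here.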
Related papers
- QIANets: Quantum-Integrated Adaptive Networks for Reduced Latency and Improved Inference Times in CNN Models [2.6663666678221376]
Convolutional neural networks (CNNs) have made significant advances in computer vision tasks, yet their high inference times and latency limit real-world applicability.
We introduce QIANets: a novel approach that redesigns the traditional GoogLeNet, DenseNet, and ResNet-18 architectures to process more parameters and computations while maintaining low inference times.
Despite experimental limitations, the method was tested and evaluated, demonstrating reductions in inference times while effectively preserving accuracy.
arXiv Detail & Related papers (2024-10-14T09:24:48Z)
- Autoregressive + Chain of Thought = Recurrent: Recurrence's Role in Language Models' Computability and a Revisit of Recurrent Transformer [29.970200877158764]
We investigate the influence of recurrent structures in neural models on their reasoning abilities and computability.
We shed light on how the CoT approach can mimic recurrent computation and act as a bridge between autoregression and recurrence.
arXiv Detail & Related papers (2024-09-14T00:30:57Z)
- Learning to Program Variational Quantum Circuits with Fast Weights [3.6881738506505988]
This paper introduces the Quantum Fast Weight Programmers (QFWP) as a solution to the temporal or sequential learning challenge.
The proposed QFWP model learns temporal dependencies without requiring quantum recurrent neural networks.
Numerical simulations conducted in this study showcase the efficacy of the proposed QFWP model in both time-series prediction and RL tasks.
arXiv Detail & Related papers (2024-02-27T18:53:18Z)
- Towards Next-Level Post-Training Quantization of Hyper-Scale Transformers [10.566264033360282]
Post-training quantization (PTQ) has emerged as a promising solution for deploying hyper-scale models on edge devices such as mobile phones and TVs.
In this paper, we propose a novel PTQ algorithm that balances accuracy and efficiency.
arXiv Detail & Related papers (2024-02-14T05:58:43Z)
- Split-Boost Neural Networks [1.1549572298362787]
We propose an innovative training strategy for feed-forward architectures - called split-boost.
Such a novel approach ultimately allows us to avoid explicitly modeling the regularization term.
The proposed strategy is tested on a real-world (anonymized) dataset within a benchmark medical insurance design problem.
arXiv Detail & Related papers (2023-09-06T17:08:57Z)
- Incrementally-Computable Neural Networks: Efficient Inference for Dynamic Inputs [75.40636935415601]
Deep learning often faces the challenge of efficiently processing dynamic inputs, such as sensor data or user inputs.
We take an incremental computing approach, looking to reuse calculations as the inputs change.
We apply this approach to the transformer architecture, creating an efficient incremental inference algorithm with complexity proportional to the fraction of modified inputs.
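As a rough illustration of the incremental-computing idea summarized above (a toy sketch only, not the cited paper's algorithm, which also has to handle attention layers), caching position-wise work and recomputing only the positions whose input changed already yields cost proportional to the fraction of modified inputs:

```python
# Toy illustration of incremental inference: cache per-position computations and
# redo work only where the input changed. The layer, embedding, and dimensions
# below are stand-ins invented for this sketch.
import numpy as np

d = 8
W = np.random.default_rng(1).standard_normal((d, d))   # stand-in for a layer's weights

def embed(token: int) -> np.ndarray:
    # deterministic stand-in for a token embedding
    return np.cos(token * np.arange(1, d + 1, dtype=float))

class IncrementalLayer:
    def __init__(self):
        self.cache = {}   # position -> (token, cached output)

    def __call__(self, tokens):
        recomputed, outputs = 0, []
        for pos, tok in enumerate(tokens):
            hit = self.cache.get(pos)
            if hit is None or hit[0] != tok:        # input at this position changed
                out = np.tanh(embed(tok) @ W)       # recompute only here
                self.cache[pos] = (tok, out)
                recomputed += 1
            outputs.append(self.cache[pos][1])
        return np.stack(outputs), recomputed

layer = IncrementalLayer()
_, n_first = layer([3, 1, 4, 1, 5, 9])   # first call: all 6 positions computed
_, n_edit = layer([3, 1, 4, 1, 8, 9])    # one token edited: only 1 recomputation
print(n_first, n_edit)                   # 6 1
```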
arXiv Detail & Related papers (2023-07-27T16:30:27Z)
- RWKV: Reinventing RNNs for the Transformer Era [54.716108899349614]
We propose a novel model architecture that combines the efficient parallelizable training of transformers with the efficient inference of RNNs.
We scale our models up to 14 billion parameters, by far the largest dense RNN ever trained, and find that RWKV performs on par with similarly sized Transformers.
arXiv Detail & Related papers (2023-05-22T13:57:41Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Optimizing Tensor Network Contraction Using Reinforcement Learning [86.05566365115729]
We propose a Reinforcement Learning (RL) approach combined with Graph Neural Networks (GNN) to address the contraction ordering problem.
The problem is extremely challenging due to the huge search space, the heavy-tailed reward distribution, and the difficulty of credit assignment.
We show how a carefully implemented RL-agent that uses a GNN as the basic policy construct can address these challenges.
arXiv Detail & Related papers (2022-04-18T21:45:13Z)
- Simulating the Mott transition on a noisy digital quantum computer via Cartan-based fast-forwarding circuits [62.73367618671969]
Dynamical mean-field theory (DMFT) maps the local Green's function of the Hubbard model to that of the Anderson impurity model.
Quantum and hybrid quantum-classical algorithms have been proposed to efficiently solve impurity models.
This work presents the first computation of the Mott phase transition using noisy digital quantum hardware.
arXiv Detail & Related papers (2021-12-10T17:32:15Z)
- Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation.
A linear-complexity recurrent variant has proven well suited for autoregressive generation.
This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
arXiv Detail & Related papers (2021-03-24T10:50:43Z)