Transformer variational wave functions for frustrated quantum spin systems
- URL: http://arxiv.org/abs/2211.05504v2
- Date: Sun, 11 Jun 2023 09:47:38 GMT
- Title: Transformer variational wave functions for frustrated quantum spin systems
- Authors: Luciano Loris Viteritti, Riccardo Rende and Federico Becca
- Abstract summary: We propose an adaptation of the ViT architecture with complex parameters to define a new class of variational neural-network states.
The success of the ViT wave function relies on mixing both local and global operations.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Transformer architecture has become the state-of-the-art model for natural language processing tasks and, more recently, also for computer vision tasks, thus defining the Vision Transformer (ViT) architecture. Its key feature is the ability to describe long-range correlations among the elements of the input sequences through the so-called self-attention mechanism. Here, we propose an adaptation of the ViT architecture with complex parameters to define a new class of variational neural-network states for quantum many-body systems, the ViT wave function. We apply this idea to the one-dimensional $J_1$-$J_2$ Heisenberg model, demonstrating that a relatively simple parametrization achieves excellent results for both gapped and gapless phases. These accuracies are obtained with a relatively shallow architecture containing a single layer of self-attention, thus largely simplifying the original architecture. Still, the optimization of a deeper structure is possible and can be used for more challenging models, most notably highly frustrated systems in two dimensions. The success of the ViT wave function relies on mixing both local and global operations, thus enabling the study of large systems with high accuracy.
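To make the construction concrete, here is a minimal, self-contained Python sketch of a complex-parameter, single-attention-layer ansatz that maps a spin configuration $\sigma$ to a complex log-amplitude $\log\psi(\sigma)$. All layer sizes, the patching, and the pooling choice are illustrative assumptions, not the authors' exact parametrization.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

class ViTWaveFunction:
    """Hypothetical minimal ViT-style ansatz: spins are grouped into
    patches, mixed by one complex-parameter self-attention layer (global
    operation), passed through a local nonlinearity, and pooled into a
    single complex log-amplitude."""

    def __init__(self, n_sites, patch=2, d=8, seed=0):
        assert n_sites % patch == 0
        rng = np.random.default_rng(seed)
        def cplx(*shape):  # complex Gaussian initialization
            return (rng.normal(size=shape) + 1j * rng.normal(size=shape)) / np.sqrt(d)
        self.patch, self.d = patch, d
        self.embed = cplx(patch, d)                      # patch of spins -> token
        self.Wq, self.Wk, self.Wv = (cplx(d, d) for _ in range(3))
        self.w_out = cplx(d)                             # pooled features -> scalar

    def log_psi(self, sigma):
        """sigma: length-n_sites array of +-1; returns complex log psi."""
        tokens = sigma.reshape(-1, self.patch).astype(complex) @ self.embed
        q, k, v = tokens @ self.Wq, tokens @ self.Wk, tokens @ self.Wv
        # real part of the overlaps so the softmax stays well defined
        attn = softmax((q @ k.conj().T).real / np.sqrt(self.d))
        mixed = attn @ v                   # global mixing across all patches
        return np.tanh(mixed).sum(axis=0) @ self.w_out   # complex log psi

sigma = np.random.default_rng(1).choice([-1, 1], size=16)
print(ViTWaveFunction(n_sites=16).log_psi(sigma))
```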
Related papers
- Learnable Multi-Scale Wavelet Transformer: A Novel Alternative to Self-Attention [0.0]
Learnable Multi-Scale Wavelet Transformer (LMWT) is a novel architecture that replaces standard dot-product self-attention with a learnable multi-scale Haar wavelet module.
We present the detailed mathematical formulation of the learnable Haar wavelet module and its integration into the transformer framework.
Our results indicate that the LMWT achieves competitive performance while offering substantial computational advantages.
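The replacement of dot-product attention by a wavelet transform can be illustrated with a one-level Haar analysis/synthesis pair used as a token mixer; the per-band gains below stand in for the paper's learnable wavelet weights and are purely hypothetical.

```python
import numpy as np

def haar_mix(x, gains):
    """One-level Haar analysis/synthesis pair used as a token mixer.
    x: (seq_len, d) with even seq_len; gains: (2, d) learnable per-band
    weights (a stand-in for the paper's learnable wavelet parameters)."""
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)   # approximation (low-pass) band
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)   # detail (high-pass) band
    lo, hi = lo * gains[0], hi * gains[1]   # learnable re-weighting per band
    y = np.empty_like(x)
    y[0::2] = (lo + hi) / np.sqrt(2)        # inverse Haar transform
    y[1::2] = (lo - hi) / np.sqrt(2)
    return y

x = np.random.default_rng(0).normal(size=(8, 4))
print(np.allclose(haar_mix(x, np.ones((2, 4))), x))  # identity at unit gains
```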
arXiv Detail & Related papers (2025-04-08T22:16:54Z)
- Instruction-Guided Autoregressive Neural Network Parameter Generation [49.800239140036496]
We propose IGPG, an autoregressive framework that unifies parameter synthesis across diverse tasks and architectures.
By autoregressively generating tokens of neural-network weights, IGPG ensures inter-layer coherence and enables efficient adaptation across models and datasets.
Experiments on multiple datasets demonstrate that IGPG consolidates diverse pretrained models into a single, flexible generative framework.
arXiv Detail & Related papers (2025-04-02T05:50:19Z)
- WaveFormer: A 3D Transformer with Wavelet-Driven Feature Representation for Efficient Medical Image Segmentation [0.5312470855079862]
We present WaveFormer, a novel 3D transformer for medical image segmentation, inspired by the top-down mechanism of the human visual recognition system.
It preserves both global context and high-frequency details while replacing heavy upsampling layers with efficient wavelet-based summarization and reconstruction.
arXiv Detail & Related papers (2025-03-31T06:28:41Z)
- Unifying Dimensions: A Linear Adaptive Approach to Lightweight Image Super-Resolution [6.857919231112562]
Window-based transformers have demonstrated outstanding performance in super-resolution tasks, but they exhibit higher computational complexity and inference latency than convolutional neural networks.
We construct a convolution-based Transformer framework named the linear adaptive mixer network (LAMNet).
arXiv Detail & Related papers (2024-09-26T07:24:09Z)
- Learning with SASQuaTCh: a Novel Variational Quantum Transformer Architecture with Kernel-Based Self-Attention [0.464982780843177]
We present a variational quantum circuit architecture named Self-Attention Sequential Quantum Transformer Channel (SASQuaTCh).
Our approach leverages recent insights from kernel-based operator learning in the context of predicting vision transformer networks, using simple gate operations and a set of multi-dimensional quantum Fourier transforms.
To validate our approach, we consider image classification tasks in simulation and on hardware, where, with only 9 qubits and a handful of parameters, we simultaneously embed and classify grayscale images of handwritten digits with high accuracy.
arXiv Detail & Related papers (2024-03-21T18:00:04Z)
- Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling [4.190836962132713]
This paper introduces Orchid, a novel architecture designed to address the quadratic complexity of traditional attention mechanisms.
At the core of this architecture lies a new data-dependent global convolution layer, which contextually adapts its kernel conditioned on the input sequence.
We evaluate the proposed model across multiple domains, including language modeling and image classification, to highlight its performance and generality.
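A hedged sketch of the underlying idea: generate a length-$L$ kernel from the input itself and apply it as a circular convolution in $O(L \log L)$ time via the FFT. The kernel-generating map and all shapes are illustrative assumptions, not Orchid's exact design.

```python
import numpy as np

def orchid_like_conv(x, W):
    """Data-dependent global convolution: derive a length-L kernel from a
    pooled summary of the input, then apply it as a circular convolution
    via the FFT in O(L log L). W maps the d-dim summary to an L-dim kernel."""
    L, _ = x.shape
    k = np.tanh(x.mean(axis=0) @ W)            # (L,) kernel conditioned on x
    Xf = np.fft.rfft(x, axis=0)                # per-channel input spectrum
    Kf = np.fft.rfft(k)[:, None]               # kernel spectrum, broadcast
    return np.fft.irfft(Xf * Kf, n=L, axis=0)  # circular convolution with k

rng = np.random.default_rng(0)
x, W = rng.normal(size=(16, 4)), rng.normal(size=(4, 16))
print(orchid_like_conv(x, W).shape)            # (16, 4)
```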
arXiv Detail & Related papers (2024-02-28T17:36:45Z)
- Hiformer: Heterogeneous Feature Interactions Learning with Transformers for Recommender Systems [27.781785405875084]
We propose to leverage a Transformer-based architecture with attention layers to automatically capture feature interactions.
We identify two key challenges for applying the vanilla Transformer architecture to web-scale recommender systems.
arXiv Detail & Related papers (2023-11-10T05:57:57Z)
- Optimizing Design Choices for Neural Quantum States [0.0]
We present a comparison of a selection of popular network architectures and symmetrization schemes employed for ground state searches of spin Hamiltonians.
In the presence of a non-trivial sign structure of the ground states, we find that the details of symmetrization crucially influence the performance.
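As an example of such a symmetrization scheme, the sketch below projects an arbitrary amplitude onto a momentum sector by averaging over translations; the toy amplitude and the plain (non-logsumexp) average are simplifications for clarity.

```python
import numpy as np

def momentum_projected_log_psi(log_psi, sigma, k=0.0):
    """Project an amplitude onto a momentum sector by averaging over all
    translations: psi_k(s) = (1/L) * sum_t exp(-i k t) psi(T_t s), with
    k a multiple of 2*pi/L. `log_psi` is any callable returning the
    complex log-amplitude of a configuration."""
    L = len(sigma)
    chars = np.exp(-1j * k * np.arange(L))          # momentum characters
    amps = np.array([chars[t] * np.exp(log_psi(np.roll(sigma, t)))
                     for t in range(L)])
    return np.log(amps.sum() / L)                   # log of projected amplitude

# toy amplitude depending on nearest-neighbor alignment (translation invariant)
toy = lambda s: complex(0.3 * np.sum(s * np.roll(s, 1)), 0.0)
print(momentum_projected_log_psi(toy, np.array([1, -1, 1, -1, 1, 1, -1, -1])))
```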
arXiv Detail & Related papers (2023-01-17T10:30:05Z)
- Vision Transformer with Convolutions Architecture Search [72.70461709267497]
We propose an architecture search method, Vision Transformer with Convolutions Architecture Search (VTCAS).
The high-performance backbone network searched by VTCAS introduces the desirable features of convolutional neural networks into the Transformer architecture.
It enhances the robustness of the neural network for object recognition, especially in low-illumination indoor scenes.
arXiv Detail & Related papers (2022-03-20T02:59:51Z)
- CSformer: Bridging Convolution and Transformer for Compressive Sensing [65.22377493627687]
This paper proposes a hybrid framework that integrates the detailed spatial information captured by CNNs with the global context provided by transformers for enhanced representation learning.
The proposed approach is an end-to-end compressive image sensing method, composed of adaptive sampling and recovery.
The experimental results demonstrate the effectiveness of the dedicated transformer-based architecture for compressive sensing.
arXiv Detail & Related papers (2021-12-31T04:37:11Z)
- Global Vision Transformer Pruning with Hessian-Aware Saliency [93.33895899995224]
This work challenges the common design philosophy of the Vision Transformer (ViT) model, namely a uniform dimension across all the stacked blocks in a model stage.
We derive a novel Hessian-based structural pruning criterion comparable across all layers and structures, with latency-aware regularization for direct latency reduction.
Performing iterative pruning on the DeiT-Base model leads to a new architecture family called NViT (Novel ViT), with a novel parameter redistribution that uses parameters more efficiently.
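A minimal sketch of the generic second-order recipe behind such criteria, approximating the Hessian diagonal with the empirical Fisher, $H_{ii} \approx \mathbb{E}[g_i^2]$; this is the standard heuristic, not NViT's exact latency-aware criterion.

```python
import numpy as np

def hessian_aware_saliency(w, per_sample_grads):
    """Second-order importance score with an empirical-Fisher Hessian
    approximation, H_ii ~ E[g_i^2]: removing parameter i costs roughly
    0.5 * H_ii * w_i^2. Low-scoring groups are pruning candidates."""
    fisher_diag = np.mean(np.square(per_sample_grads), axis=0)
    return 0.5 * fisher_diag * np.square(w)

rng = np.random.default_rng(0)
w = rng.normal(size=64)                    # weights of one prunable group
g = rng.normal(size=(32, 64))              # gradients from 32 samples
print(np.argsort(hessian_aware_saliency(w, g))[:8])  # prune these first
```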
arXiv Detail & Related papers (2021-10-10T18:04:59Z)
- Global Filter Networks for Image Classification [90.81352483076323]
We present a conceptually simple yet computationally efficient architecture that learns long-term spatial dependencies in the frequency domain with log-linear complexity.
Our results demonstrate that GFNet can be a very competitive alternative to transformer-style models and CNNs in efficiency, generalization ability and robustness.
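The mechanism admits a very short sketch: take the 2D FFT of the token grid, multiply element-wise by a learnable complex filter, and transform back, giving log-linear cost in the number of tokens. Shapes below are illustrative.

```python
import numpy as np

def global_filter_layer(x, K):
    """GFNet-style token mixing: 2D FFT over the token grid, element-wise
    product with a learnable complex filter K, inverse FFT back. Cost is
    log-linear in the number of tokens instead of quadratic attention."""
    Xf = np.fft.rfft2(x, axes=(0, 1))                  # (H, W//2+1, d)
    return np.fft.irfft2(Xf * K, s=x.shape[:2], axes=(0, 1))

H, W, d = 8, 8, 4
rng = np.random.default_rng(0)
x = rng.normal(size=(H, W, d))                         # token grid features
K = rng.normal(size=(H, W // 2 + 1, d)) + 1j * rng.normal(size=(H, W // 2 + 1, d))
print(global_filter_layer(x, K).shape)                 # (8, 8, 4)
```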
arXiv Detail & Related papers (2021-07-01T17:58:16Z)
- Vision Transformer Architecture Search [64.73920718915282]
Current vision transformers (ViTs) are simply inherited from natural language processing (NLP) tasks.
We propose an architecture search method, dubbed ViTAS, to search for the optimal architecture with similar hardware budgets.
Our searched architecture achieves $74.7\%$ top-$1$ accuracy on ImageNet and is $2.5\%$ better than the current baseline ViT architecture.
arXiv Detail & Related papers (2021-06-25T15:39:08Z)
- Transformers Solve the Limited Receptive Field for Monocular Depth Prediction [82.90445525977904]
We propose TransDepth, an architecture which benefits from both convolutional neural networks and transformers.
This is the first paper to apply transformers to pixel-wise prediction problems involving continuous labels.
arXiv Detail & Related papers (2021-03-22T18:00:13Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information and is not responsible for any consequences arising from its use.