Learning Physical Simulation with Message Passing Transformer
- URL: http://arxiv.org/abs/2406.06060v1
- Date: Mon, 10 Jun 2024 07:14:56 GMT
- Title: Learning Physical Simulation with Message Passing Transformer
- Authors: Zeyi Xu, Yifei Li
- Abstract summary: We propose a new universal architecture based on Graph Neural Networks, the Message Passing Transformer, which incorporates a Message Passing framework.
Our architecture achieves significant accuracy improvements in long-term rollouts for both Lagrangian and Eulerian dynamical systems.
- Score: 5.431396242057807
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning methods for physical simulation have achieved significant success in recent years. We propose a new universal architecture based on Graph Neural Networks, the Message Passing Transformer, which incorporates a Message Passing framework, employs an Encoder-Processor-Decoder structure, and applies Graph Fourier Loss as the loss function for model optimization. To take advantage of past message passing state information, we propose Hadamard-Product Attention to update the node attributes in the Processor. Hadamard-Product Attention is a variant of Dot-Product Attention that focuses on more fine-grained semantics and emphasizes assigning attention weights over each feature dimension rather than over each position in the sequence. We further introduce Graph Fourier Loss (GFL) to balance high-energy and low-energy components. To improve time performance, we precompute the graph's Laplacian eigenvectors before the training process. Our architecture achieves significant accuracy improvements in long-term rollouts for both Lagrangian and Eulerian dynamical systems over current methods.
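The abstract describes Hadamard-Product Attention and Graph Fourier Loss only at a high level, so the following PyTorch sketches are illustrative readings of those descriptions rather than the authors' implementation; the module names, the stacking of past message-passing states, the softmax over the history axis, and the frequency-band weighting scheme are all assumptions.

```python
import torch
import torch.nn as nn

class HadamardProductAttention(nn.Module):
    """Sketch (assumed details): attention over a history of past
    message-passing states in which every feature dimension receives its
    own weight, instead of one weight per position as in Dot-Product
    Attention."""

    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)

    def forward(self, query_state: torch.Tensor, past_states: torch.Tensor) -> torch.Tensor:
        # query_state: current node attributes, shape (N, dim)
        # past_states: stacked past message-passing states, shape (T, N, dim)
        q = self.q_proj(query_state)                            # (N, dim)
        k = self.k_proj(past_states)                            # (T, N, dim)
        v = self.v_proj(past_states)                            # (T, N, dim)
        # Hadamard product keeps a separate score per feature dimension.
        scores = q.unsqueeze(0) * k / (q.shape[-1] ** 0.5)      # (T, N, dim)
        # Normalize over the history axis independently for every dimension.
        weights = torch.softmax(scores, dim=0)                  # (T, N, dim)
        return (weights * v).sum(dim=0)                         # (N, dim)
```

A Graph Fourier Loss along the lines sketched in the abstract would transform the prediction error into the graph Fourier basis and reweight its spectral components; the fixed 50/50 frequency split and scalar weights below are placeholders, since the abstract does not state how the high- and low-energy components are balanced.

```python
import torch

def precompute_laplacian_eigenvectors(adj: torch.Tensor) -> torch.Tensor:
    """Run once before training (as the abstract suggests): eigenvectors of
    the combinatorial Laplacian L = D - A of an undirected graph, which form
    the graph Fourier basis (columns ordered by ascending frequency)."""
    deg = torch.diag(adj.sum(dim=1))
    _, eigvecs = torch.linalg.eigh(deg - adj)
    return eigvecs                                              # (N, N)

def graph_fourier_loss(pred, target, eigvecs, low_weight=1.0, high_weight=1.0, split=0.5):
    # pred, target: node signals of shape (N, dim)
    err_hat = eigvecs.T @ (pred - target)                       # spectral error, (N, dim)
    n = err_hat.shape[0]
    cut = int(split * n)
    # Placeholder balancing of low- and high-frequency (energy) bands.
    weights = torch.cat([torch.full((cut,), low_weight),
                         torch.full((n - cut,), high_weight)]).to(err_hat.dtype)
    return (weights.unsqueeze(-1) * err_hat.pow(2)).mean()
```

In use, the eigenvectors would be computed once per simulation mesh or grid and reused at every loss evaluation, which matches the time-performance precomputation the abstract mentions.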
Related papers
- Physics-Informed Graph Neural Networks for Transverse Momentum Estimation in CMS Trigger Systems [0.0]
Real-time particle transverse momentum ($p_T$) estimation in high-energy physics demands efficient algorithms under strict hardware constraints. We propose a physics-informed graph neural network (GNN) framework that systematically encodes detector geometry and physical observables. Our co-design methodology yields superior accuracy-efficiency trade-offs compared to existing baselines.
arXiv Detail & Related papers (2025-07-25T12:19:57Z) - Sliding Window Attention Training for Efficient Large Language Models [55.56483740523027]
We introduce SWAT, which enables efficient long-context handling via Sliding Window Attention Training. This paper first attributes the inefficiency of Transformers to the attention sink phenomenon. We replace softmax with the sigmoid function and utilize balanced ALiBi and Rotary Position Embedding for efficient information compression and retention.
arXiv Detail & Related papers (2025-02-26T05:31:44Z) - Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z) - A Pure Transformer Pretraining Framework on Text-attributed Graphs [50.833130854272774]
We introduce a feature-centric pretraining perspective by treating graph structure as a prior.
Our framework, Graph Sequence Pretraining with Transformer (GSPT), samples node contexts through random walks.
GSPT can be easily adapted to both node classification and link prediction, demonstrating promising empirical success on various datasets.
arXiv Detail & Related papers (2024-06-19T22:30:08Z) - Interpretable Lightweight Transformer via Unrolling of Learned Graph Smoothness Priors [16.04850782310842]
We build interpretable and lightweight transformer-like neural networks by unrolling iterative optimization algorithms.
A normalized signal-dependent graph learning module amounts to a variant of the basic self-attention mechanism in conventional transformers.
arXiv Detail & Related papers (2024-06-06T14:01:28Z) - Gegenbauer Graph Neural Networks for Time-varying Signal Reconstruction [4.6210788730570584]
Reconstructing time-varying graph signals is a critical problem in machine learning and signal processing, with broad applications.
We propose a novel approach that incorporates a learning module to enhance the accuracy of the downstream task.
We conduct extensive experiments on real datasets to evaluate the effectiveness of our proposed approach.
arXiv Detail & Related papers (2024-03-28T19:29:17Z) - An end-to-end attention-based approach for learning on graphs [8.552020965470113]
Transformer-based architectures for learning on graphs are motivated by attention as an effective learning mechanism.
We propose a purely attention-based approach consisting of an encoder and an attention pooling mechanism.
Despite its simplicity, the approach outperforms fine-tuned message passing baselines and recently proposed transformer-based methods on more than 70 node and graph-level tasks.
arXiv Detail & Related papers (2024-02-16T16:20:11Z) - Dynamic Graph Message Passing Networks for Visual Recognition [112.49513303433606]
Modelling long-range dependencies is critical for scene understanding tasks in computer vision.
A fully-connected graph is beneficial for such modelling, but its computational overhead is prohibitive.
We propose a dynamic graph message passing network, that significantly reduces the computational complexity.
arXiv Detail & Related papers (2022-09-20T14:41:37Z) - Model-Architecture Co-Design for High Performance Temporal GNN Inference on FPGA [5.575293536755127]
Real-world applications require high performance inference on real-time streaming dynamic graphs.
We present a novel model-architecture co-design for inference in memory-based TGNNs on FPGAs.
We train our simplified models using knowledge distillation to ensure similar accuracy vis-à-vis the original model.
arXiv Detail & Related papers (2022-03-10T00:24:47Z) - Functional Regularization for Reinforcement Learning via Learned Fourier Features [98.90474131452588]
We propose a simple architecture for deep reinforcement learning by embedding inputs into a learned Fourier basis.
We show that it improves the sample efficiency of both state-based and image-based RL.
arXiv Detail & Related papers (2021-12-06T18:59:52Z) - Adaptive Fourier Neural Operators: Efficient Token Mixers for Transformers [55.90468016961356]
We propose an efficient token mixer that learns to mix in the Fourier domain.
AFNO is based on a principled foundation of operator learning.
It can handle a sequence size of 65k and outperforms other efficient self-attention mechanisms.
arXiv Detail & Related papers (2021-11-24T05:44:31Z) - Spectral Transform Forms Scalable Transformer [1.19071399645846]
Drawing on the philosophy of self-attention, this work proposes an efficient spectral-based neural unit that employs informative long-range temporal interactions.
The developed spectral window unit (SW) model predicts dynamic graphs at scale with assured efficiency.
arXiv Detail & Related papers (2021-11-15T08:46:01Z) - Rethinking Graph Transformers with Spectral Attention [13.068288784805901]
We present the Spectral Attention Network (SAN), which uses a learned positional encoding (LPE) to learn the position of each node in a given graph.
By leveraging the full spectrum of the Laplacian, our model is theoretically powerful in distinguishing graphs, and can better detect similar sub-structures from their resonance.
Our model performs on par or better than state-of-the-art GNNs, and outperforms any attention-based model by a wide margin.
arXiv Detail & Related papers (2021-06-07T18:11:11Z) - GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training [59.160154997555956]
We present GradInit, an automated and architecture-agnostic method for initializing neural networks.
It is based on a simple heuristic: the variance of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value.
It also enables training the original Post-LN Transformer for machine translation without learning rate warmup.
arXiv Detail & Related papers (2021-02-16T11:45:35Z)