Sequence modeling of higher-order wave modes of binary black hole mergers
- URL: http://arxiv.org/abs/2409.03833v2
- Date: Tue, 03 Jun 2025 23:32:28 GMT
- Title: Sequence modeling of higher-order wave modes of binary black hole mergers
- Authors: Victoria Tiki, Kiet Pham, Eliu Huerta
- Abstract summary: Higher-order gravitational wave modes from quasi-circular, spinning, non-precessing binary black hole mergers encode key information about these systems' nonlinear dynamics. We model these waveforms using transformer architectures, targeting the evolution from late inspiral through ringdown. Our results demonstrate that transformer-based models can capture the nonlinear dynamics of binary black hole mergers with high accuracy, even outside the surrogate training domain.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Higher-order gravitational wave modes from quasi-circular, spinning, non-precessing binary black hole mergers encode key information about these systems' nonlinear dynamics. We model these waveforms using transformer architectures, targeting the evolution from late inspiral through ringdown. Our data is derived from the \texttt{NRHybSur3dq8} surrogate model, which includes spherical harmonic modes up to $\ell \leq 4$ (excluding $(4,0)$, $(4,\pm1)$ and including $(5,5)$ modes). These waveforms span mass ratios $q \leq 8$, spin components $s^z_{{1,2}} \in [-0.8, 0.8]$, and inclination angles $\theta \in [0, \pi]$. The model processes input data over the time interval $t \in [-5000\textrm{M}, -100\textrm{M})$ and generates predictions for the plus and cross polarizations, $(h_{+}, h_{\times})$, over the interval $t \in [-100\textrm{M}, 130\textrm{M}]$. Utilizing 16 NVIDIA A100 GPUs on the Delta supercomputer, we trained the transformer model in 15 hours on over 14 million samples. The model's performance was evaluated on a test dataset of 840,000 samples, achieving mean and median overlap scores of 0.996 and 0.997, respectively, relative to the surrogate-based ground truth signals. We further benchmark the model on numerical relativity waveforms from the SXS catalog, finding that it generalizes well to out-of-distribution systems, capable of reproducing the dynamics of systems with mass ratios up to $q=15$ and spin magnitudes up to 0.998, with a median overlap of 0.969 across 521 NR waveforms and up to 0.998 in face-on/off configurations. These results demonstrate that transformer-based models can capture the nonlinear dynamics of binary black hole mergers with high accuracy, even outside the surrogate training domain, enabling fast sequence modeling of higher-order wave modes.
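The overlap scores reported above measure agreement between predicted and ground-truth waveforms. As a minimal sketch, the overlap can be computed as a normalized inner product between two time series; note the paper's metric additionally maximizes over time and phase shifts (and a detector-noise weighting is used in full matched-filter analyses), which this white-noise simplification omits. The toy waveform below is illustrative, not drawn from the surrogate model.

```python
import numpy as np

def overlap(h1, h2):
    """Normalized inner product between two real, discretely sampled waveforms.

    Flat-noise (white-PSD) simplification of the gravitational-wave match;
    the paper's overlap also maximizes over relative time and phase shifts.
    """
    inner = lambda a, b: np.sum(a * b)
    return inner(h1, h2) / np.sqrt(inner(h1, h1) * inner(h2, h2))

# Toy check: overlap is insensitive to an overall amplitude rescaling.
t = np.linspace(-100.0, 130.0, 2048)                   # time in units of M
h_true = np.sin(2 * np.pi * 0.05 * t) * np.exp(-(t / 80.0) ** 2)
h_pred = 3.0 * h_true                                  # rescaled copy
print(round(overlap(h_true, h_pred), 6))               # → 1.0
```

An overlap of 1 indicates perfect agreement up to amplitude normalization; the paper's mean test-set overlap of 0.996 corresponds to a mismatch (1 − overlap) of 0.004.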
Related papers
- IT$^3$: Idempotent Test-Time Training [95.78053599609044]
This paper introduces Idempotent Test-Time Training (IT$^3$), a novel approach to addressing the challenge of distribution shift.
IT$^3$ is based on the universal property of idempotence.
We demonstrate the versatility of our approach across various tasks, including corrupted image classification.
arXiv Detail & Related papers (2024-10-05T15:39:51Z) - Inertial Confinement Fusion Forecasting via Large Language Models [48.76222320245404]
In this study, we introduce $\textbf{LPI-LLM}$, a novel integration of Large Language Models (LLMs) with classical reservoir computing paradigms.
We propose the $\textit{LLM-anchored Reservoir}$, augmented with a $\textit{Fusion-specific Prompt}$, enabling accurate forecasting of $\texttt{LPI}$-generated hot-electron dynamics during implosion.
We also present $\textbf{LPI4AI}$, the first $\texttt{LPI}$ benchmark based
arXiv Detail & Related papers (2024-07-15T05:46:44Z) - Transformer In-Context Learning for Categorical Data [51.23121284812406]
We extend research on understanding Transformers through the lens of in-context learning with functional data by considering categorical outcomes, nonlinear underlying models, and nonlinear attention.
We present what is believed to be the first real-world demonstration of this few-shot-learning methodology, using the ImageNet dataset.
arXiv Detail & Related papers (2024-05-27T15:03:21Z) - Towards Faster Non-Asymptotic Convergence for Diffusion-Based Generative Models [49.81937966106691]
We develop a suite of non-asymptotic theory towards understanding the data generation process of diffusion models.
In contrast to prior works, our theory is developed based on an elementary yet versatile non-asymptotic approach.
arXiv Detail & Related papers (2023-06-15T16:30:08Z) - Efficient Sampling of Stochastic Differential Equations with Positive Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Better Diffusion Models Further Improve Adversarial Training [97.44991845907708]
It has been recognized that the data generated by the diffusion probabilistic model (DDPM) improves adversarial training.
This paper gives an affirmative answer by employing the most recent diffusion model which has higher efficiency.
Our adversarially trained models achieve state-of-the-art performance on RobustBench using only generated data.
arXiv Detail & Related papers (2023-02-09T13:46:42Z) - PFGM++: Unlocking the Potential of Physics-Inspired Generative Models [14.708385906024546]
We introduce a new family of physics-inspired generative models termed PFGM++.
These models realize generative trajectories for $N$-dimensional data by embedding paths in $(N+D)$-dimensional space.
We show that models with finite $D$ can be superior to previous state-of-the-art diffusion models.
arXiv Detail & Related papers (2023-02-08T18:58:02Z) - Neural Inference of Gaussian Processes for Time Series Data of Quasars [72.79083473275742]
We introduce a new model that can describe quasar spectra completely.
We also introduce a new method of inference of Gaussian process parameters, which we call $\textit{Neural Inference}$.
The combination of both the CDRW model and Neural Inference significantly outperforms the baseline DRW and MLE.
arXiv Detail & Related papers (2022-11-17T13:01:26Z) - Denoising MCMC for Accelerating Diffusion-Based Generative Models [54.06799491319278]
Diffusion models are powerful generative models that simulate the reverse of diffusion processes using score functions to synthesize data from noise.
Here, we propose an approach to accelerating score-based sampling: Denoising MCMC.
We show that Denoising Langevin Gibbs (DLG), an instance of DMCMC, successfully accelerates all six reverse-S/ODE computation tasks.
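The score functions mentioned above are the core ingredient of score-based samplers. As an illustrative sketch (not the DMCMC/DLG algorithm itself), unadjusted Langevin dynamics can draw samples from a target density using only its score $\nabla \log p(x)$, here taken to be the analytically known score of a standard normal:

```python
import numpy as np

rng = np.random.default_rng(0)

def score_standard_normal(x):
    # Score of the target: grad log p(x) for p = N(0, 1)
    return -x

# Start the chain far from the target distribution.
x = rng.normal(size=10_000) * 5.0
step = 0.01
for _ in range(2_000):
    noise = rng.normal(size=x.shape)
    # Unadjusted Langevin update: drift along the score plus injected noise.
    x = x + step * score_standard_normal(x) + np.sqrt(2 * step) * noise

print(round(float(np.std(x)), 1))   # → 1.0, the target's standard deviation
```

Diffusion samplers replace the analytic score with a learned, time-dependent score network; methods like DMCMC aim to reduce the number of such update steps.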
arXiv Detail & Related papers (2022-09-29T07:16:10Z) - Poisson Flow Generative Models [9.843778728210427]
The "Poisson flow" generative model (PFGM) maps a uniform distribution on a high-dimensional hemisphere into any data distribution.
PFGM achieves current state-of-the-art performance among the normalizing flow models on CIFAR-10.
arXiv Detail & Related papers (2022-09-22T17:26:58Z) - Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models [12.746888269949407]
We consider a high-dimensional mean estimation problem over a binary hidden Markov model.
We establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of $\|\theta_*\|, \delta, d, n$.
arXiv Detail & Related papers (2022-06-06T09:34:04Z) - Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z) - AI and extreme scale computing to learn and infer the physics of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers [1.7056768055368385]
We learn and infer the physics of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers.
We train AI models using 14 million waveforms, produced with the surrogate model NRHybSur3dq8.
We obtain deterministic and probabilistic estimates of the mass-ratio, individual spins, effective spin, and inclination angle of numerical relativity waveforms.
arXiv Detail & Related papers (2021-12-13T19:00:00Z) - Interpretable AI forecasting for numerical relativity waveforms of quasi-circular, spinning, non-precessing binary black hole mergers [1.4438155481047366]
We present a deep-learning artificial intelligence model capable of learning and forecasting the late-inspiral, merger and ringdown of numerical relativity waveforms.
We harnessed the Theta supercomputer at the Argonne Leadership Computing Facility to train our AI model using a training set of 1.5 million waveforms.
Our findings show that artificial intelligence can accurately forecast the dynamical evolution of numerical relativity waveforms.
arXiv Detail & Related papers (2021-10-13T18:14:52Z) - Denoising modulo samples: k-NN regression and tightness of SDP relaxation [5.025654873456756]
We derive a two-stage algorithm that recovers estimates of the samples $f(x_i)$ with a uniform error rate $O((\frac{\log n}{n})^{\frac{1}{d+2}})$ holding with high probability.
The estimates of the samples $f(x_i)$ can be subsequently utilized to construct an estimate of the function $f$.
arXiv Detail & Related papers (2020-09-10T13:32:46Z) - Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number, and $\epsilon$ has bounded variance; and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z) - Ballistic propagation of a local impact in the one-dimensional $XY$ model [0.0]
Light-cone-like propagation of information is a universal phenomenon of nonequilibrium dynamics of integrable spin systems.
We numerically observe various types of light-cone-like propagation in the parameter region $0 \leq \gamma \leq 1$ and $0 \leq h \leq 2$ of the model.
arXiv Detail & Related papers (2020-07-03T04:07:10Z) - Quantum Algorithms for Simulating the Lattice Schwinger Model [63.18141027763459]
We give scalable, explicit digital quantum algorithms to simulate the lattice Schwinger model in both NISQ and fault-tolerant settings.
In lattice units, we find a Schwinger model on $N/2$ physical sites with coupling constant $x^{-1/2}$ and electric field cutoff $x^{-1/2}\Lambda$.
We estimate observables which we cost in both the NISQ and fault-tolerant settings by assuming a simple target observable---the mean pair density.
arXiv Detail & Related papers (2020-02-25T19:18:36Z) - Gravitational-wave parameter estimation with autoregressive neural network flows [0.0]
We introduce the use of autoregressive normalizing flows for rapid likelihood-free inference of binary black hole system parameters from gravitational-wave data with deep neural networks.
A normalizing flow is an invertible mapping on a sample space that can be used to induce a transformation from a simple probability distribution to a more complex one.
We build a more powerful latent variable model by incorporating autoregressive flows within the variational autoencoder framework.
arXiv Detail & Related papers (2020-02-18T15:44:04Z)
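The normalizing-flow idea in the last entry above rests on the change-of-variables formula. As a minimal sketch (a single one-dimensional affine map with hand-picked parameters, where real autoregressive flows stack many learned invertible maps), the density induced by pushing a standard normal through an invertible transform can be written as:

```python
import math

# Toy parameters of the invertible map x = a*z + b (learned in a real flow).
a, b = 2.0, 1.0

def forward(z):
    # Push a base sample z ~ N(0, 1) through the invertible affine map.
    return a * z + b

def log_prob(x):
    # Change of variables: log p(x) = log N(f^{-1}(x); 0, 1) - log |df/dz|
    z = (x - b) / a
    log_base = -0.5 * z ** 2 - 0.5 * math.log(2 * math.pi)
    return log_base - math.log(abs(a))

# Sanity check: the flow's density must equal N(b, a^2) evaluated directly.
x = 2.5
direct = -0.5 * ((x - b) / a) ** 2 - 0.5 * math.log(2 * math.pi) - math.log(a)
print(abs(log_prob(x) - direct) < 1e-12)   # → True
```

Conditioning such maps on observed strain data is what enables the rapid likelihood-free inference of binary black hole parameters described above.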
This list is automatically generated from the titles and abstracts of the papers in this site.