AI forecasting of higher-order wave modes of spinning binary black hole mergers
- URL: http://arxiv.org/abs/2409.03833v1
- Date: Thu, 5 Sep 2024 18:00:11 GMT
- Title: AI forecasting of higher-order wave modes of spinning binary black hole mergers
- Authors: Victoria Tiki, Kiet Pham, Eliu Huerta
- Abstract summary: The model forecasts the waveform evolution from the pre-merger phase through the ringdown.
We trained the model on 14,440,761 waveforms, completing the training in 15 hours using 16 NVIDIA A100 GPUs in the Delta supercomputer.
We conducted interpretability studies to elucidate the waveform features utilized by our transformer model to produce accurate predictions.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a physics-inspired transformer model that predicts the non-linear dynamics of higher-order wave modes emitted by quasi-circular, spinning, non-precessing binary black hole mergers. The model forecasts the waveform evolution from the pre-merger phase through the ringdown, starting with an input time-series spanning $ t \in [-5000\textrm{M}, -100\textrm{M}) $. The merger event, defined as the peak amplitude of waveforms that include the $l = |m| = 2$ modes, occurs at $ t = 0\textrm{M} $. The transformer then generates predictions over the time range $ t \in [-100\textrm{M}, 130\textrm{M}] $. We produced training, evaluation and test sets using the NRHybSur3dq8 model, considering a signal manifold defined by mass ratios $ q \in [1, 8] $; spin components $ s^z_{\{1,2\}} \in [-0.8, 0.8] $; modes up to $l \leq 4$, including the $(5,5)$ mode but excluding the $(4,0)$ and $(4,1)$ modes; and inclination angles $\theta \in [0, \pi]$. We trained the model on 14,440,761 waveforms, completing the training in 15 hours using 16 NVIDIA A100 GPUs in the Delta supercomputer. We used 4 H100 GPUs in the DeltaAI supercomputer to compute, within 7 hours, the overlap between ground truth and predicted waveforms using a test set of 840,000 waveforms, finding that the mean and median overlaps over the test set are 0.996 and 0.997, respectively. Additionally, we conducted interpretability studies to elucidate the waveform features utilized by our transformer model to produce accurate predictions. The scientific software used for this work is released with this manuscript.
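The time windows and the overlap metric described in the abstract can be illustrated with a minimal sketch. It is not the released software: the function names `split_windows` and `overlap` are hypothetical, and the overlap is assumed here to be a flat-noise, time-domain normalized inner product maximized over a circular time shift and a constant phase offset, which may differ from the definition used in the paper.

```python
import numpy as np


def split_windows(t, h, t_start=-5000.0, t_split=-100.0, t_end=130.0):
    """Split a waveform into the input and forecast windows (times in units of M).

    Input window:    t in [t_start, t_split)
    Forecast window: t in [t_split, t_end]
    """
    t = np.asarray(t)
    h = np.asarray(h)
    input_mask = (t >= t_start) & (t < t_split)
    target_mask = (t >= t_split) & (t <= t_end)
    return h[input_mask], h[target_mask]


def overlap(h_true, h_pred):
    """Normalized overlap of two equally sampled complex waveforms.

    Assumption: flat noise spectrum; the match is maximized over a circular
    time shift (via FFT cross-correlation) and a constant phase rotation
    (via the modulus of the complex correlation).
    """
    h_true = np.asarray(h_true, dtype=complex)
    h_pred = np.asarray(h_pred, dtype=complex)
    # Circular cross-correlation evaluated at every time shift.
    corr = np.fft.ifft(np.fft.fft(h_true) * np.conj(np.fft.fft(h_pred)))
    norm = np.sqrt(np.vdot(h_true, h_true).real * np.vdot(h_pred, h_pred).real)
    return float(np.max(np.abs(corr)) / norm)
```

With this definition, identical waveforms give an overlap of 1, and so do copies that differ only by a phase rotation or a circular time shift. A mean overlap such as the 0.996 reported above would be the average of this kind of quantity over the test set.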
Related papers
- IT$^3$: Idempotent Test-Time Training [95.78053599609044]
This paper introduces Idempotent Test-Time Training (IT$^3$), a novel approach to addressing the challenge of distribution shift.
IT$^3$ is based on the universal property of idempotence.
We demonstrate the versatility of our approach across various tasks, including corrupted image classification.
arXiv Detail & Related papers (2024-10-05T15:39:51Z)
- Transformer In-Context Learning for Categorical Data [51.23121284812406]
We extend research on understanding Transformers through the lens of in-context learning with functional data by considering categorical outcomes, nonlinear underlying models, and nonlinear attention.
We present what is believed to be the first real-world demonstration of this few-shot-learning methodology, using the ImageNet dataset.
arXiv Detail & Related papers (2024-05-27T15:03:21Z)
- Efficient Sampling of Stochastic Differential Equations with Positive Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $\varepsilon$ with a cost that is $m^2 d \log(1/\varepsilon)$.
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z)
- Neural Inference of Gaussian Processes for Time Series Data of Quasars [72.79083473275742]
We introduce a new model that enables it to describe quasar spectra completely.
We also introduce a new method of inference of Gaussian process parameters, which we call $\textit{Neural Inference}$.
The combination of both the CDRW model and Neural Inference significantly outperforms the baseline DRW and MLE.
arXiv Detail & Related papers (2022-11-17T13:01:26Z)
- Mean Estimation in High-Dimensional Binary Markov Gaussian Mixture Models [12.746888269949407]
We consider a high-dimensional mean estimation problem over a binary hidden Markov model.
We establish a nearly minimax optimal (up to logarithmic factors) estimation error rate, as a function of $\|\theta_*\|,\delta,d,n$.
arXiv Detail & Related papers (2022-06-06T09:34:04Z)
- Approximate Function Evaluation via Multi-Armed Bandits [51.146684847667125]
We study the problem of estimating the value of a known smooth function $f$ at an unknown point $\boldsymbol{\mu} \in \mathbb{R}^n$, where each component $\mu_i$ can be sampled via a noisy oracle.
We design an instance-adaptive algorithm that learns to sample according to the importance of each coordinate, and with probability at least $1-\delta$ returns an $\epsilon$-accurate estimate of $f(\boldsymbol{\mu})$.
arXiv Detail & Related papers (2022-03-18T18:50:52Z)
- AI and extreme scale computing to learn and infer the physics of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers [1.7056768055368385]
We learn and infer the physics of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers.
We train AI models using 14 million waveforms, produced with the surrogate model NRHybSur3dq8.
We obtain deterministic and probabilistic estimates of the mass-ratio, individual spins, effective spin, and inclination angle of numerical relativity waveforms.
arXiv Detail & Related papers (2021-12-13T19:00:00Z)
- Interpretable AI forecasting for numerical relativity waveforms of quasi-circular, spinning, non-precessing binary black hole mergers [1.4438155481047366]
We present a deep-learning artificial intelligence model capable of learning and forecasting the late-inspiral, merger and ringdown of numerical relativity waveforms.
We harnessed the Theta supercomputer at the Argonne Leadership Computing Facility to train our AI model using a training set of 1.5 million waveforms.
Our findings show that artificial intelligence can accurately forecast the dynamical evolution of numerical relativity waveforms.
arXiv Detail & Related papers (2021-10-13T18:14:52Z)
- Denoising modulo samples: k-NN regression and tightness of SDP relaxation [5.025654873456756]
We derive a two-stage algorithm that recovers estimates of the samples $f(x_i)$ with a uniform error rate $O((\frac{\log n}{n})^{\frac{1}{d+2}})$ holding with high probability.
The estimates of the samples $f(x_i)$ can be subsequently utilized to construct an estimate of the function $f$.
arXiv Detail & Related papers (2020-09-10T13:32:46Z)
- Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$.
We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
- Ballistic propagation of a local impact in the one-dimensional $XY$ model [0.0]
Light-cone-like propagation of information is a universal phenomenon of nonequilibrium dynamics of integrable spin systems.
We numerically observe various types of light-cone-like propagation in the parameter region $0\leq\gamma\leq1$ and $0\leq2$ of the model.
arXiv Detail & Related papers (2020-07-03T04:07:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.