Related papers: Enhancing the reliability of machine learning for gravitational wave parameter estimation with attention-based models

URL: http://arxiv.org/abs/2501.10486v2
Date: Tue, 14 Oct 2025 01:33:35 GMT
Title: Enhancing the reliability of machine learning for gravitational wave parameter estimation with attention-based models
Authors: Hibiki Iwanaga, Mahoro Matsuyama, Yousuke Itoh
Abstract summary: We develop two independent machine learning models to estimate effective spin and chirp mass from spectrograms of gravitational wave signals. We utilize attention maps to visualize the areas our models focus on when making predictions. We show that as the models focus more on glitches, the parameter estimation results become more strongly biased.
Score: 0.0
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We introduce a technique to enhance the reliability of gravitational wave parameter estimation results produced by machine learning. We develop two independent machine learning models based on the Vision Transformer to estimate effective spin and chirp mass from spectrograms of gravitational wave signals from binary black hole mergers. To enhance the reliability of these models, we utilize attention maps to visualize the areas our models focus on when making predictions. This approach lets us demonstrate that both models perform parameter estimation based on physically meaningful information. Furthermore, by leveraging these attention maps, we demonstrate a method to quantify the impact of glitches on parameter estimation. We show that as the models focus more on glitches, the parameter estimation results become more strongly biased. This suggests that attention maps could potentially be used to distinguish between cases where the results produced by the machine learning model are reliable and cases where they are not.
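
Illustrative sketch: the abstract does not say how attention on glitches is quantified, so the following is only one plausible measure, not the authors' method. It assumes a 2D attention map already resampled onto the spectrogram grid and a boolean mask marking glitch pixels; the function name `glitch_attention_fraction` and both inputs are hypothetical.

```python
import numpy as np


def glitch_attention_fraction(attention_map, glitch_mask):
    """Return the share of total attention mass that lies on glitch pixels.

    Hypothetical helper: `attention_map` is a non-negative 2D array on the
    spectrogram grid, `glitch_mask` is a boolean array of the same shape.
    """
    attention = np.asarray(attention_map, dtype=float)
    mask = np.asarray(glitch_mask, dtype=bool)
    if attention.shape != mask.shape:
        raise ValueError("attention_map and glitch_mask must have the same shape")
    total = attention.sum()
    if total <= 0.0:
        # No attention mass at all: report zero rather than dividing by zero.
        return 0.0
    return float(attention[mask].sum() / total)


# Toy example: uniform attention over a 2x2 grid, one pixel flagged as glitch.
toy_attention = np.ones((2, 2))
toy_mask = np.array([[True, False], [False, False]])
toy_fraction = glitch_attention_fraction(toy_attention, toy_mask)
```

Under these assumptions, a larger fraction would flag an estimate as more likely to be affected by the glitch; any threshold would need to be calibrated against measured bias.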

Related papers

Learning robust parameter inference and density reconstruction in flyer plate impact experiments [0.0]
Estimating physical parameters or material properties from experimental observations is a common objective in many areas of physics and material science. Radiography does not provide direct access to key state variables, such as density. We propose an observable data set consisting of low and high impact velocity experiments/simulations that capture different regimes of compaction and shock propagation. We show that the obtained estimates of EoS and crush model parameters can then be used in hydrodynamic simulations to obtain accurate and physically admissible density reconstructions.
arXiv Detail & Related papers (2025-06-30T14:43:33Z)
Integrating Physics and Data-Driven Approaches: An Explainable and Uncertainty-Aware Hybrid Model for Wind Turbine Power Prediction [1.1270209626877075]
The rapid growth of the wind energy sector underscores the urgent need to optimize turbine operations. Traditional empirical and physics-based models offer approximate predictions of power generation based on wind speed. Data-driven machine learning methods present a promising avenue for improving wind turbine modeling.
arXiv Detail & Related papers (2025-02-11T08:16:48Z)
Large Scale Evaluation of Deep Learning-based Explainable Solar Flare Forecasting Models with Attribution-based Proximity Analysis [0.0]
We propose a novel framework for assessing the interpretability of deep learning models for solar flare prediction. Our study compares two models trained on full-disk line-of-sight (LoS) magnetogram images to predict flares within a 24-hour window. Our findings indicate that the models' predictions align with active region characteristics to varying degrees, offering valuable insights into their behavior.
arXiv Detail & Related papers (2024-11-27T05:43:34Z)
Rethinking Weight-Averaged Model-merging [15.2881959315021]
Weight-averaged model-merging has emerged as a powerful approach in deep learning, capable of enhancing model performance without fine-tuning or retraining. We investigate this technique from three novel perspectives to provide deeper insights into how and why weight-averaged model-merging works. Our findings shed light on the "black box" of weight-averaged model-merging, offering valuable insights and practical recommendations.
arXiv Detail & Related papers (2024-11-14T08:02:14Z)
Explanatory Model Monitoring to Understand the Effects of Feature Shifts on Performance [61.06245197347139]
We propose a novel approach to explain the behavior of a black-box model under feature shifts. We refer to our method that combines concepts from Optimal Transport and Shapley Values as Explanatory Performance Estimation.
arXiv Detail & Related papers (2024-08-24T18:28:19Z)
SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models [85.67096251281191]
We present an innovative approach to model fusion called zero-shot Sparse MIxture of Low-rank Experts (SMILE) construction. SMILE allows for the upscaling of source models into an MoE model without extra data or further training. We conduct extensive experiments across diverse scenarios, such as image classification and text generation tasks, using full fine-tuning and LoRA fine-tuning.
arXiv Detail & Related papers (2024-08-19T17:32:15Z)
CogDPM: Diffusion Probabilistic Models via Cognitive Predictive Coding [62.075029712357]
This work introduces Cognitive Diffusion Probabilistic Models (CogDPM). CogDPM features a precision estimation method based on the hierarchical sampling capabilities of diffusion models and weights the guidance with precision weights estimated from the inherent properties of diffusion models. We apply CogDPM to real-world prediction tasks using the United Kingdom precipitation and surface wind datasets.
arXiv Detail & Related papers (2024-05-03T15:54:50Z)
Beyond mirkwood: Enhancing SED Modeling with Conformal Predictions [0.0]
We propose an advanced machine learning-based approach that enhances flexibility and uncertainty quantification in SED fitting. We incorporate conformalized quantile regression to convert point predictions into error bars, enhancing interpretability and reliability.
arXiv Detail & Related papers (2023-12-21T11:27:20Z)
Interpreting a Machine Learning Model for Detecting Gravitational Waves [6.139541666440539]
We apply interpretability techniques developed for computer vision to machine learning models used to search for and find gravitational waves. The models we study are trained to detect black hole merger events in non-Gaussian and non-stationary advanced Laser Interferometer Gravitational-wave Observatory (LIGO) data.
arXiv Detail & Related papers (2022-02-15T13:49:13Z)
Beyond Trivial Counterfactual Explanations with Diverse Valuable Explanations [64.85696493596821]
In computer vision applications, generative counterfactual methods indicate how to perturb a model's input to change its prediction. We propose a counterfactual method that learns a perturbation in a disentangled latent space that is constrained using a diversity-enforcing loss. Our model improves the success rate of producing high-quality valuable explanations when compared to previous state-of-the-art methods.
arXiv Detail & Related papers (2021-03-18T12:57:34Z)
Leveraging Global Parameters for Flow-based Neural Posterior Estimation [90.21090932619695]
Inferring the parameters of a model based on experimental observations is central to the scientific method. A particularly challenging setting is when the model is strongly indeterminate, i.e., when distinct sets of parameters yield identical observations. We present a method for cracking such indeterminacy by exploiting additional information conveyed by an auxiliary set of observations sharing global parameters.
arXiv Detail & Related papers (2021-02-12T12:23:13Z)
Wave Propagation of Visual Stimuli in Focus of Attention [77.4747032928547]
Fast reactions to changes in the surrounding visual environment require efficient attention mechanisms to reallocate computational resources to the most relevant locations in the visual field. We present a biologically plausible model of focus of attention that achieves the effectiveness and efficiency exhibited by foveated animals.
arXiv Detail & Related papers (2020-06-19T09:33:21Z)
Machine learning for causal inference: on the use of cross-fit estimators [77.34726150561087]
Doubly-robust cross-fit estimators have been proposed to yield better statistical properties. We conducted a simulation study to assess the performance of several estimators for the average causal effect (ACE). When used with machine learning, the doubly-robust cross-fit estimators substantially outperformed all of the other estimators in terms of bias, variance, and confidence interval coverage.
arXiv Detail & Related papers (2020-04-21T23:09:55Z)

This list is automatically generated from the titles and abstracts of the papers on this site.