Related papers: Adaptive Epsilon Adversarial Training for Robust Gravitational Wave Parameter Estimation Using Normalizing Flows

URL: http://arxiv.org/abs/2412.07559v2
Date: Tue, 17 Dec 2024 13:43:16 GMT
Title: Adaptive Epsilon Adversarial Training for Robust Gravitational Wave Parameter Estimation Using Normalizing Flows
Authors: Yiqian Yang, Xihua Zhu, Fan Zhang
Abstract summary: Adversarial training with Normalizing Flow (NF) models is an emerging research area aimed at improving model robustness through adversarial samples. We propose an adaptive epsilon method for Fast Gradient Sign Method (FGSM) adversarial training, which dynamically adjusts perturbation strengths based on gradient magnitudes using logarithmic scaling. Our hybrid architecture, combining ResNet and Inverse Autoregressive Flow, reduces the Negative Log Likelihood (NLL) loss by 47% under FGSM attacks compared to the baseline model. Under stronger Projected Gradient Descent attacks with perturbation strength of 0.05, our model maintains an NLL of 6.4, demonstrating superior robustness while avoiding catastrophic overfitting.
Score: 2.4184866684341473
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Adversarial training with Normalizing Flow (NF) models is an emerging research area aimed at improving model robustness through adversarial samples. In this study, we focus on applying adversarial training to NF models for gravitational wave parameter estimation. We propose an adaptive epsilon method for Fast Gradient Sign Method (FGSM) adversarial training, which dynamically adjusts perturbation strengths based on gradient magnitudes using logarithmic scaling. Our hybrid architecture, combining ResNet and Inverse Autoregressive Flow, reduces the Negative Log Likelihood (NLL) loss by 47% under FGSM attacks compared to the baseline model, while maintaining an NLL of 4.2 on clean data (only 5% higher than the baseline). For perturbation strengths between 0.01 and 0.1, our model achieves an average NLL of 5.8, outperforming both fixed-epsilon (NLL: 6.7) and progressive-epsilon (NLL: 7.2) methods. Under stronger Projected Gradient Descent attacks with perturbation strength of 0.05, our model maintains an NLL of 6.4, demonstrating superior robustness while avoiding catastrophic overfitting.
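The adaptive epsilon idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact scaling rule, so everything here is an assumption for illustration only: the function names, the rule `eps = scale * log(1 + ||grad||)`, the clipping bounds (chosen to match the 0.01 to 0.1 range mentioned above), and the toy quadratic loss that stands in for the flow's NLL. It is not the authors' implementation.

```python
import math


def adaptive_epsilon(grad, eps_min=0.01, eps_max=0.1, scale=0.02):
    """Hypothetical rule: epsilon grows with log(1 + ||grad||_2),
    then is clipped to [eps_min, eps_max]. All constants are assumptions."""
    grad_norm = math.sqrt(sum(g * g for g in grad))
    eps = scale * math.log1p(grad_norm)
    return min(max(eps, eps_min), eps_max)


def sign(v):
    """Return -1, 0, or 1 according to the sign of v."""
    return (v > 0) - (v < 0)


def fgsm_perturb(x, grad, eps):
    """Standard FGSM step: move each input coordinate by eps in the
    direction of the sign of the loss gradient."""
    return [xi + eps * sign(gi) for xi, gi in zip(x, grad)]


def toy_loss_and_grad(x, target):
    """Toy stand-in for the model's NLL: 0.5 * ||x - target||^2.
    Its gradient with respect to x is simply x - target."""
    diff = [xi - ti for xi, ti in zip(x, target)]
    loss = 0.5 * sum(d * d for d in diff)
    return loss, diff


# One adversarial example generated with the adaptive step size.
x = [0.5, -1.0, 2.0]
target = [0.0, 0.0, 0.0]
clean_loss, grad = toy_loss_and_grad(x, target)
eps = adaptive_epsilon(grad)
x_adv = fgsm_perturb(x, grad, eps)
adv_loss, _ = toy_loss_and_grad(x_adv, target)
```

In an adversarial training loop, `x_adv` would be fed back to the model alongside the clean input. Whether the perturbation should grow or shrink with the gradient magnitude is a design choice the abstract does not settle; this sketch assumes it grows.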

Related papers

Function-Space Decoupled Diffusion for Forward and Inverse Modeling in Carbon Capture and Storage [65.51149575007149]
We present Fun-DDPS, a generative framework that combines function-space diffusion models with differentiable neural operator surrogates for both forward and inverse modeling. Fun-DDPS produces physically consistent realizations free from the high-frequency artifacts observed in joint-state baselines.
arXiv Detail & Related papers (2026-02-12T18:58:12Z)
MSign: An Optimizer Preventing Training Instability in Large Language Models via Stable Rank Restoration [48.446476072756276]
Training instability remains a critical challenge in large language model pretraining. We study training failures in a 5M NanoGPT model scaled via $\mu$P. We propose MSign, a new optimizer that periodically applies matrix sign operations to restore stable rank.
arXiv Detail & Related papers (2026-02-02T07:18:45Z)
N-EIoU-YOLOv9: A Signal-Aware Bounding Box Regression Loss for Lightweight Mobile Detection of Rice Leaf Diseases [0.6280530476948474]
We propose N-EIoU-YOLOv9, a lightweight detection framework based on a signal-aware bounding box regression loss. The proposed loss reshapes localization gradients by combining non-monotonic focusing with decoupled width and height optimization. This design is particularly effective for small and low-contrast targets commonly observed in agricultural disease imagery.
arXiv Detail & Related papers (2026-01-14T05:13:36Z)
DiffusionNFT: Online Diffusion Reinforcement with Forward Process [99.94852379720153]
Diffusion Negative-aware FineTuning (DiffusionNFT) is a new online RL paradigm that optimizes diffusion models directly on the forward process via flow matching. DiffusionNFT is up to $25\times$ more efficient than FlowGRPO in head-to-head comparisons, while being CFG-free.
arXiv Detail & Related papers (2025-09-19T16:09:33Z)
TGLF-SINN: Deep Learning Surrogate Model for Accelerating Turbulent Transport Modeling in Fusion [18.028061388104963]
We propose TGLF-SINN (Spectra-Informed Neural Network) with three key innovations. Our approach achieves superior performance with significantly less training data. In downstream flux matching applications, our NN surrogate provides 45x speedup over TGLF while maintaining comparable accuracy.
arXiv Detail & Related papers (2025-09-07T09:36:51Z)
When Punctuation Matters: A Large-Scale Comparison of Prompt Robustness Methods for LLMs [55.20230501807337]
We present the first systematic evaluation of 5 methods for improving prompt robustness within a unified experimental framework. We benchmark these techniques on 8 models from the Llama, Qwen and Gemma families across 52 tasks from the Natural Instructions dataset.
arXiv Detail & Related papers (2025-08-15T10:32:50Z)
Physics-Based Machine Learning Closures and Wall Models for Hypersonic Transition-Continuum Boundary Layer Predictions [0.9320657506524149]
We develop a physics-constrained machine learning framework that augments transport models and boundary conditions. We evaluate these for two-dimensional supersonic flat-plate flows across a range of Mach and Knudsen numbers. Our results show that a trace-free anisotropic viscosity model, paired with the skewed-Gaussian distribution function wall model, achieves significantly improved accuracy.
arXiv Detail & Related papers (2025-07-11T19:40:00Z)
Optimality and Adaptivity of Deep Neural Features for Instrumental Variable Regression [57.40108516085593]
Deep feature instrumental variable (DFIV) regression is a nonparametric approach to IV regression using data-adaptive features learned by deep neural networks. We prove that the DFIV algorithm achieves the minimax optimal learning rate when the target structural function lies in a Besov space.
arXiv Detail & Related papers (2025-01-09T01:22:22Z)
Efficient Gravitational Wave Parameter Estimation via Knowledge Distillation: A ResNet1D-IAF Approach [2.4184866684341473]
This study presents a novel approach using knowledge distillation techniques to enhance computational efficiency in gravitational wave analysis. We develop a framework combining ResNet1D and Inverse Autoregressive Flow (IAF) architectures, where knowledge from a complex teacher model is transferred to a lighter student model. Our experimental results show that the student model achieves a validation loss of 3.70 with optimal configuration (40,100,0.75), compared to the teacher model's 4.09, while reducing the number of parameters by 43%.
arXiv Detail & Related papers (2024-12-11T03:56:46Z)
FLoCoRA: Federated learning compression with low-rank adaptation [0.0]
Low-Rank Adaptation (LoRA) methods have gained popularity in efficient parameter fine-tuning of models containing hundreds of billions of parameters. In this work, we demonstrate the application of LoRA methods to train small-vision models in Federated Learning.
arXiv Detail & Related papers (2024-06-20T07:59:29Z)
Efficient Adversarial Training in LLMs with Continuous Attacks [99.5882845458567]
Large language models (LLMs) are vulnerable to adversarial attacks that can bypass their safety guardrails. We propose a fast adversarial training algorithm (C-AdvUL) composed of two losses. C-AdvIPO is an adversarial variant of IPO that does not require utility data for adversarially robust alignment.
arXiv Detail & Related papers (2024-05-24T14:20:09Z)
Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training. Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $\mathcal{O}(\ln(T) / T^{1 - \frac{1}{\alpha}})$.
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
Data-Agnostic Model Poisoning against Federated Learning: A Graph Autoencoder Approach [65.2993866461477]
This paper proposes a data-agnostic model poisoning attack on Federated Learning (FL). The attack requires no knowledge of FL training data and achieves both effectiveness and undetectability. Experiments show that the FL accuracy drops gradually under the proposed attack and existing defense mechanisms fail to detect it.
arXiv Detail & Related papers (2023-11-30T12:19:10Z)
Understanding the robustness difference between stochastic gradient descent and adaptive gradient methods [11.895321856533934]
Stochastic gradient descent (SGD) and adaptive gradient methods have been widely used in training deep neural networks. We empirically show that while the difference between the standard generalization performance of models trained using these methods is small, those trained using SGD exhibit far greater robustness under input perturbations.
arXiv Detail & Related papers (2023-08-13T07:03:22Z)
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been demonstrated to be effective in solving forward and inverse differential equation problems. However, PINNs suffer training failures when the target functions to be approximated exhibit high-frequency or multi-scale features. In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
Manifold Interpolating Optimal-Transport Flows for Trajectory Inference [64.94020639760026]
We present a method called Manifold Interpolating Optimal-Transport Flow (MIOFlow). MIOFlow learns continuous population dynamics from static snapshot samples taken at sporadic timepoints. We evaluate our method on simulated data with bifurcations and merges, as well as scRNA-seq data from embryoid body differentiation and acute myeloid leukemia treatment.
arXiv Detail & Related papers (2022-06-29T22:19:03Z)
Guided Diffusion Model for Adversarial Purification from Random Noise [0.0]
We propose a novel guided diffusion purification approach to provide a strong defense against adversarial attacks. Our model achieves 89.62% robust accuracy under PGD-L_inf attack (eps = 8/255) on the CIFAR-10 dataset.
arXiv Detail & Related papers (2022-06-22T06:55:03Z)

This list is automatically generated from the titles and abstracts of the papers on this site.