On the Robustness and Generalization of Deep Learning Driven Full Waveform Inversion
- URL: http://arxiv.org/abs/2111.14220v1
- Date: Sun, 28 Nov 2021 19:27:59 GMT
- Title: On the Robustness and Generalization of Deep Learning Driven Full Waveform Inversion
- Authors: Chengyuan Deng, Youzuo Lin
- Abstract summary: Full Waveform Inversion (FWI) is commonly epitomized as an image-to-image translation task.
Despite being trained with synthetic data, the deep learning-driven FWI is expected to perform well when evaluated with sufficient real-world data.
We study such properties by asking: how robust are these deep neural networks and how do they generalize?
- Score: 2.5382095320488665
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The data-driven approach has been demonstrated as a promising technique to
solve complicated scientific problems. Full Waveform Inversion (FWI) is
commonly epitomized as an image-to-image translation task, which motivates the
use of deep neural networks as an end-to-end solution. Despite being trained
with synthetic data, the deep learning-driven FWI is expected to perform well
when evaluated with sufficient real-world data. In this paper, we study such
properties by asking: how robust are these deep neural networks and how do they
generalize? For robustness, we prove the upper bounds of the deviation between
the predictions from clean and noisy data. Moreover, we demonstrate an
interplay between the noise level and the resulting increase in loss. For
generalization, we prove a norm-based generalization error upper bound via a
stability-generalization framework. Experimental results on seismic FWI
datasets corroborate the theoretical results, shedding light on how deep
learning can be better utilized for complicated scientific applications.
Related papers
- Towards Robust Out-of-Distribution Generalization: Data Augmentation and Neural Architecture Search Approaches [4.577842191730992]
We study ways toward robust OoD generalization for deep learning.
We first propose a novel and effective approach to disentangle the spurious correlation between features that are not essential for recognition.
We then study the problem of strengthening neural architecture search in OoD scenarios.
arXiv Detail & Related papers (2024-10-25T20:50:32Z)
- Improving Robustness via Tilted Exponential Layer: A Communication-Theoretic Perspective [22.062492862286025]
A communication-theoretic perspective suggests enhancing the signal-to-noise ratio at the output of a neural network layer via neural competition.
TEXP learning can be interpreted as maximum likelihood estimation of matched filters.
TEXP inference enhances robustness against noise and other common corruptions.
arXiv Detail & Related papers (2023-11-02T07:47:42Z)
- Optimization dependent generalization bound for ReLU networks based on sensitivity in the tangent bundle [0.0]
We propose a PAC type bound on the generalization error of feedforward ReLU networks.
The obtained bound does not explicitly depend on the depth of the network.
arXiv Detail & Related papers (2023-10-26T13:14:13Z)
- Deep Neural Networks Tend To Extrapolate Predictably [51.303814412294514]
Conventional wisdom holds that neural network predictions tend to be unpredictable and overconfident when faced with out-of-distribution (OOD) inputs.
We observe that neural network predictions often tend towards a constant value as input data becomes increasingly OOD.
We show how one can leverage our insights in practice to enable risk-sensitive decision-making in the presence of OOD inputs.
arXiv Detail & Related papers (2023-10-02T03:25:32Z)
- An Information-Theoretic Framework for Supervised Learning [22.280001450122175]
We propose a novel information-theoretic framework with its own notions of regret and sample complexity.
We study the sample complexity of learning from data generated by deep neural networks with ReLU activation units.
We conclude by corroborating our theoretical results with experimental analysis of random single-hidden-layer neural networks.
arXiv Detail & Related papers (2022-03-01T05:58:28Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Towards an Understanding of Benign Overfitting in Neural Networks [104.2956323934544]
Modern machine learning models often employ a huge number of parameters and are typically optimized to have zero training loss.
We examine how these benign overfitting phenomena occur in a two-layer neural network setting.
We show that it is possible for the two-layer ReLU network interpolator to achieve a near minimax-optimal learning rate.
arXiv Detail & Related papers (2021-06-06T19:08:53Z)
- A neural anisotropic view of underspecification in deep learning [60.119023683371736]
We show that the way neural networks handle the underspecification of problems is highly dependent on the data representation.
Our results highlight that understanding the architectural inductive bias in deep learning is fundamental to address the fairness, robustness, and generalization of these systems.
arXiv Detail & Related papers (2021-04-29T14:31:09Z)
- Dropout: Explicit Forms and Capacity Control [57.36692251815882]
We investigate capacity control provided by dropout in various machine learning problems.
In deep learning, we show that the data-dependent regularizer due to dropout directly controls the Rademacher complexity of the underlying class of deep neural networks.
We evaluate our theoretical findings on real-world datasets, including MovieLens, MNIST, and Fashion-MNIST.
arXiv Detail & Related papers (2020-03-06T19:10:15Z)
- Understanding Generalization in Deep Learning via Tensor Methods [53.808840694241]
We advance the understanding of the relations between the network's architecture and its generalizability from the compression perspective.
We propose a series of intuitive, data-dependent and easily measurable properties that tightly characterize the compressibility and generalizability of neural networks.
arXiv Detail & Related papers (2020-01-14T22:26:57Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.