Optimizing a Transformer-based network for a deep learning seismic
processing workflow
- URL: http://arxiv.org/abs/2308.04739v1
- Date: Wed, 9 Aug 2023 07:11:42 GMT
- Title: Optimizing a Transformer-based network for a deep learning seismic
processing workflow
- Authors: Randy Harsuko and Tariq Alkhalifah
- Abstract summary: StorSeismic is a recently introduced model based on the Transformer to adapt to various seismic processing tasks.
We observe faster pretraining and competitive results on the fine-tuning tasks and, additionally, fewer parameters to train compared to the vanilla model.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: StorSeismic is a recently introduced model based on the Transformer to adapt
to various seismic processing tasks through its pretraining and fine-tuning
training strategy. In the original implementation, StorSeismic utilized a
sinusoidal positional encoding and a conventional self-attention mechanism,
both borrowed from natural language processing (NLP) applications. For
seismic processing, these components yielded good results, but also hinted at
limitations in efficiency and expressiveness. We propose modifications to these
two key components, utilizing relative positional encoding and low-rank
attention matrices as replacements for the vanilla ones. The proposed changes
are tested on processing tasks applied to realistic Marmousi data and offshore
field data in a sequential strategy, starting with denoising, then direct arrival removal,
multiple attenuation, and finally root-mean-squared velocity ($V_{RMS}$)
prediction for normal moveout (NMO) correction. We observe faster pretraining
and competitive results on the fine-tuning tasks and, additionally, fewer
parameters to train compared to the vanilla model.
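The abstract does not come with code, but the two proposed replacements can be illustrated with a short PyTorch sketch. Everything below (the class name, the rank-`r` query/key projections, the learned relative-position bias table) is an illustrative assumption, not the StorSeismic implementation; the paper's exact low-rank factorization may differ (for example, projecting along the sequence dimension instead).

```python
# Hypothetical sketch (not the StorSeismic code): a self-attention layer that
# combines rank-r query/key projections with a learned relative-position bias,
# as a stand-in for "low-rank attention" and "relative positional encoding".
import torch
import torch.nn as nn
import torch.nn.functional as F

class LowRankRelativeAttention(nn.Module):
    def __init__(self, d_model=256, n_heads=8, rank=32, max_rel_dist=128):
        super().__init__()
        assert d_model % n_heads == 0
        self.n_heads, self.d_head, self.rank = n_heads, d_model // n_heads, rank
        # Low-rank factorization: queries/keys live in a small rank-r space, so
        # the q/k weight matrices shrink from d_model*d_model to d_model*(h*r).
        self.q_proj = nn.Linear(d_model, n_heads * rank)
        self.k_proj = nn.Linear(d_model, n_heads * rank)
        self.v_proj = nn.Linear(d_model, d_model)
        self.out = nn.Linear(d_model, d_model)
        # Learned bias per head for each clipped relative offset, replacing the
        # sinusoidal absolute positional encoding of the vanilla model.
        self.rel_bias = nn.Embedding(2 * max_rel_dist - 1, n_heads)
        self.max_rel_dist = max_rel_dist

    def forward(self, x):                                    # x: (batch, seq, d_model)
        b, n, _ = x.shape
        q = self.q_proj(x).view(b, n, self.n_heads, self.rank).transpose(1, 2)
        k = self.k_proj(x).view(b, n, self.n_heads, self.rank).transpose(1, 2)
        v = self.v_proj(x).view(b, n, self.n_heads, self.d_head).transpose(1, 2)
        logits = q @ k.transpose(-2, -1) / self.rank ** 0.5   # (b, h, n, n)
        pos = torch.arange(n, device=x.device)
        rel = (pos[None, :] - pos[:, None]).clamp(
            -self.max_rel_dist + 1, self.max_rel_dist - 1) + self.max_rel_dist - 1
        logits = logits + self.rel_bias(rel).permute(2, 0, 1) # add per-head bias
        attn = F.softmax(logits, dim=-1)
        out = (attn @ v).transpose(1, 2).reshape(b, n, -1)
        return self.out(out)
```

Because the bias depends only on the offset between positions, the layer is position-aware without absolute encodings, and the smaller query/key projections are consistent with the reported reduction in trainable parameters.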
Related papers
- FiRST: Finetuning Router-Selective Transformers for Input-Adaptive Latency Reduction [11.146015814220858]
FiRST is an algorithm that reduces inference latency by using layer-specific routers to select a subset of transformer layers adaptively for each input sequence.
Our approach reveals that input adaptivity is critical: different task-specific middle layers play a crucial role in evolving hidden representations depending on the task.
arXiv Detail & Related papers (2024-10-16T12:45:35Z)
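A minimal, hypothetical sketch of the routing idea summarized in the entry above (not the authors' implementation; the router design, mean pooling, and threshold are assumptions made for illustration):

```python
# Illustrative sketch of input-adaptive layer skipping in the spirit of FiRST:
# a tiny router per layer decides, from the pooled hidden state, whether to
# run or skip that transformer layer for the current sequence.
import torch
import torch.nn as nn

class RoutedEncoder(nn.Module):
    def __init__(self, d_model=256, n_layers=6, n_heads=8):
        super().__init__()
        self.layers = nn.ModuleList(
            nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
            for _ in range(n_layers))
        # One scalar router per layer: sigmoid(score) > 0.5 -> execute layer.
        self.routers = nn.ModuleList(nn.Linear(d_model, 1) for _ in range(n_layers))

    def forward(self, x):                                   # x: (batch, seq, d_model)
        for layer, router in zip(self.layers, self.routers):
            keep = torch.sigmoid(router(x.mean(dim=1))) > 0.5   # (batch, 1)
            if keep.any():                                  # run layer only if needed
                y = layer(x)
                x = torch.where(keep.unsqueeze(-1), y, x)   # per-sample skip
        return x
```

The hard skip decision shown here is only the inference-time behaviour; training such routers would need a differentiable surrogate (e.g. a straight-through estimator), which the sketch omits.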
- A convolutional neural network approach to deblending seismic data [1.5488464287814563]
We present a data-driven deep learning-based method for fast and efficient seismic deblending.
A convolutional neural network (CNN) is designed according to the special character of seismic data.
After training and validation of the network, seismic deblending can be performed in near real time.
arXiv Detail & Related papers (2024-09-12T10:54:35Z)
- Enhancing Robustness of Vision-Language Models through Orthogonality Learning and Self-Regularization [77.62516752323207]
We introduce an orthogonal fine-tuning method for efficiently fine-tuning pretrained weights and enabling enhanced robustness and generalization.
A self-regularization strategy is further exploited to maintain the stability in terms of zero-shot generalization of VLMs, dubbed OrthSR.
For the first time, we revisit CLIP and CoOp with our method to effectively improve the model in the few-shot image classification scenario.
arXiv Detail & Related papers (2024-07-11T10:35:53Z)
- Uncovering mesa-optimization algorithms in Transformers [61.06055590704677]
Some autoregressive models can learn as an input sequence is processed, without undergoing any parameter changes, and without being explicitly trained to do so.
We show that standard next-token prediction error minimization gives rise to a subsidiary learning algorithm that adjusts the model as new inputs are revealed.
Our findings explain in-context learning as a product of autoregressive loss minimization and inform the design of new optimization-based Transformer layers.
arXiv Detail & Related papers (2023-09-11T22:42:50Z)
- End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method by optimizing the sparse structure of a randomly initialized network at each iteration and tweaking unimportant weights by a small amount proportional to the magnitude scale on-the-fly.
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
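A loose sketch of the soft-shrinkage idea described in the entry above; the function name, percentage, and shrink factor are assumptions for illustration, not the ISS-P schedule itself:

```python
# Instead of hard-zeroing pruned weights, shrink the smallest fraction of
# weights by a small amount proportional to their own magnitude each iteration.
import torch

@torch.no_grad()
def soft_shrink_step(model, prune_pct=0.3, shrink=0.1):
    for param in model.parameters():
        if param.dim() < 2:                      # skip biases and norm parameters
            continue
        flat = param.abs().flatten()
        k = max(1, int(prune_pct * flat.numel()))
        threshold = flat.kthvalue(k).values      # magnitude cut-off for "unimportant"
        mask = param.abs() <= threshold
        # Shrink, rather than zero, the unimportant weights, proportionally to
        # their magnitude, so they can recover in later iterations.
        param[mask] -= shrink * param[mask]
```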
- Deep Preconditioners and their application to seismic wavefield processing [0.0]
Sparsity-promoting inversion, coupled with fixed-basis sparsifying transforms, represents the go-to approach for many processing tasks.
We propose to train an AutoEncoder network to learn a direct mapping between the input seismic data and a representative latent manifold.
The trained decoder is subsequently used as a nonlinear preconditioner for the physics-driven inverse problem at hand.
arXiv Detail & Related papers (2022-07-20T14:25:32Z)
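The entry above describes training an autoencoder and reusing its decoder as a nonlinear preconditioner. A schematic sketch of that pattern might look as follows; the operator `A`, the data `d`, and the network sizes are placeholders, not the authors' setup:

```python
# Train an autoencoder on clean seismic patches, then solve the inverse
# problem in the decoder's latent space: optimize z so that A * dec(z) fits d.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, n=1024, latent=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(n, 256), nn.ReLU(), nn.Linear(256, latent))
        self.dec = nn.Sequential(nn.Linear(latent, 256), nn.ReLU(), nn.Linear(256, n))

    def forward(self, x):
        return self.dec(self.enc(x))

def precondition_inversion(decoder, A, d, latent=64, steps=200, lr=1e-2):
    # The decoder constrains the solution to the learned latent manifold,
    # acting as a nonlinear preconditioner for the physics-driven problem.
    z = torch.zeros(1, latent, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((A @ decoder(z).squeeze(0) - d) ** 2).mean()
        loss.backward()
        opt.step()
    return decoder(z).detach()
```

After training an `AutoEncoder` on clean patches, one would pass its `dec` module, a linear modelling operator `A`, and observed data `d` to `precondition_inversion`.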
- StorSeismic: A new paradigm in deep learning for seismic processing [0.0]
StorSeismic is a framework for seismic data processing.
We pre-train the network on seismic data, along with synthetically generated data, in the self-supervised step.
Then, we use the labeled synthetic data to fine-tune the pre-trained network in a supervised fashion to perform various seismic processing tasks.
arXiv Detail & Related papers (2022-04-30T09:55:00Z)
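The StorSeismic entry above describes a two-stage workflow: self-supervised pretraining followed by supervised fine-tuning. A schematic sketch of that structure is shown below; the real model is BERT-like, and the masking scheme, data loaders, and task head here are hypothetical placeholders:

```python
# Stage 1: reconstruct randomly masked traces (self-supervised pretraining).
# Stage 2: train a small task head on labeled synthetic data (fine-tuning).
import torch

def mask_traces(section, frac=0.15):
    # Randomly hide a fraction of traces (second dimension) by zeroing them.
    mask = torch.rand(section.shape[:2]) < frac      # (batch, n_traces)
    masked = section.clone()
    masked[mask] = 0.0
    return masked, mask

def pretrain(model, unlabeled_loader, epochs=10, lr=1e-4):
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for section in unlabeled_loader:             # (batch, n_traces, n_samples)
            masked, mask = mask_traces(section)
            loss = ((model(masked) - section) ** 2)[mask].mean()
            opt.zero_grad()
            loss.backward()
            opt.step()

def finetune(model, head, labeled_loader, epochs=10, lr=1e-4):
    opt = torch.optim.Adam(list(model.parameters()) + list(head.parameters()), lr=lr)
    for _ in range(epochs):
        for x, y in labeled_loader:                  # e.g. noisy input, clean target
            loss = ((head(model(x)) - y) ** 2).mean()
            opt.zero_grad()
            loss.backward()
            opt.step()
```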
- Finetuning Pretrained Transformers into RNNs [81.72974646901136]
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation.
A linear-complexity recurrent variant has proven well suited for autoregressive generation.
This work aims to convert a pretrained transformer into its efficient recurrent counterpart.
arXiv Detail & Related papers (2021-03-24T10:50:43Z)
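The recurrent reformulation hinted at in the entry above can be illustrated with a generic linear-attention recurrence; this is a common construction, not necessarily the authors' exact conversion, and the feature map below is an assumption:

```python
# With a feature map phi, attention outputs can be accumulated step by step in
# constant memory instead of attending over the full history at every step.
import torch

def phi(x):
    # Simple positive feature map (illustrative; the paper learns its own).
    return torch.nn.functional.elu(x) + 1.0

def linear_attention_recurrent(q, k, v):
    # q, k: (seq, d_k); v: (seq, d_v). Returns outputs of shape (seq, d_v).
    d_k, d_v = k.shape[-1], v.shape[-1]
    S = torch.zeros(d_k, d_v)                 # running sum of phi(k_t) v_t^T
    z = torch.zeros(d_k)                      # running sum of phi(k_t)
    outs = []
    for t in range(q.shape[0]):
        qt, kt = phi(q[t]), phi(k[t])
        S = S + torch.outer(kt, v[t])         # recurrent state update
        z = z + kt
        outs.append((qt @ S) / (qt @ z + 1e-6))
    return torch.stack(outs)
```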
- Dynamic Scale Training for Object Detection [111.33112051962514]
We propose a Dynamic Scale Training paradigm (abbreviated as DST) to mitigate the scale variation challenge in object detection.
Experimental results demonstrate the efficacy of our proposed DST towards scale variation handling.
It does not introduce inference overhead and could serve as a free lunch for general detection configurations.
arXiv Detail & Related papers (2020-04-26T16:48:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.