Related papers: Seismic Fault SAM: Adapting SAM with Lightweight Modules and 2.5D Strategy for Fault Detection

URL: http://arxiv.org/abs/2407.14121v1
Date: Fri, 19 Jul 2024 08:38:48 GMT
Title: Seismic Fault SAM: Adapting SAM with Lightweight Modules and 2.5D Strategy for Fault Detection
Authors: Ran Chen, Zeren Zhang, Jinwen Ma
Abstract summary: This paper proposes Seismic Fault SAM, which applies a general pre-trained foundation model, the Segment Anything Model (SAM), to seismic fault interpretation. Its contributions include designing lightweight Adapter modules, freezing most of the pre-trained weights, and updating only a small number of parameters. Experimental results on the largest publicly available seismic dataset, Thebe, show that the method surpasses existing 3D models on both OIS and ODS metrics.
Score: 11.868792440783054
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Seismic fault detection holds significant geographical and practical application value, aiding experts in subsurface structure interpretation and resource exploration. Despite some progress made by automated methods based on deep learning, research in the seismic domain faces significant challenges, particularly because it is difficult to obtain high-quality, large-scale, open-source, and diverse datasets, which hinders the development of general foundation models. Therefore, this paper proposes Seismic Fault SAM, which, for the first time, applies a general pre-trained foundation model, the Segment Anything Model (SAM), to seismic fault interpretation. This method aligns the universal knowledge learned from a vast number of images with seismic domain tasks through an Adapter design. Specifically, our innovations include designing lightweight Adapter modules, freezing most of the pre-trained weights, and updating only a small number of parameters so that the model converges quickly and learns fault features effectively; combining a 2.5D input strategy to capture 3D spatial patterns with 2D models; and integrating geological constraints into the model through prior-based data augmentation techniques to enhance the model's generalization capability. Experimental results on the largest publicly available seismic dataset, Thebe, show that our method surpasses existing 3D models on both OIS and ODS metrics, achieving state-of-the-art performance and providing an effective extension scheme for other seismic domain downstream tasks that lack labeled data.
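
The abstract does not spell out how the 2.5D input is built. One common reading of "2.5D" is that each 2D slice is stacked with its neighbouring slices along the channel axis, so a 2D model sees some 3D context. The sketch below illustrates that reading only; the function name `make_25d_input`, the `half_width` parameter, the (n_slices, height, width) layout, and the edge-clamping rule are all assumptions made for illustration, not details taken from the paper.

```python
import numpy as np


def make_25d_input(volume: np.ndarray, index: int, half_width: int = 1) -> np.ndarray:
    """Stack the slice at `index` with its neighbours as channels.

    Assumptions (hypothetical, not from the paper):
    - `volume` has shape (n_slices, height, width);
    - neighbours that fall outside the volume are replaced by the
      nearest valid slice (edge clamping).
    """
    n_slices = volume.shape[0]
    offsets = range(index - half_width, index + half_width + 1)
    # Clamp every neighbour index into the valid range [0, n_slices - 1].
    clamped = [min(max(i, 0), n_slices - 1) for i in offsets]
    # Integer-list indexing returns shape (2 * half_width + 1, height, width).
    return volume[clamped]


# Toy volume: 4 slices of size 2 x 3.
volume = np.arange(4 * 2 * 3, dtype=np.float32).reshape(4, 2, 3)
first = make_25d_input(volume, index=0)  # uses slices 0, 0, 1
last = make_25d_input(volume, index=3)   # uses slices 2, 3, 3
```

With `half_width=1` the result has three channels, which happens to match the three-channel input that image models pre-trained on RGB data expect; whether the paper uses this exact channel count is not stated in the abstract.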

Related papers

Improving Full Waveform Inversion in Large Model Era [25.26004497243484]
We show that a model trained entirely on simulated and relatively simple data can generalize remarkably well to challenging geological benchmarks. Our model achieves state-of-the-art performance on OpenFWI and significantly narrows the generalization gap in data-driven FWI.
arXiv Detail & Related papers (2026-02-27T23:33:06Z)
A Survey on Efficient Vision-Language-Action Models [153.11669266922993]
Vision-Language-Action models (VLAs) represent a significant frontier in embodied intelligence, aiming to bridge digital knowledge with physical-world interaction. Motivated by the urgent need to address these challenges, this survey presents the first comprehensive review of Efficient Vision-Language-Action models.
arXiv Detail & Related papers (2025-10-27T17:57:33Z)
Topology-Aware Modeling for Unsupervised Simulation-to-Reality Point Cloud Recognition [63.55828203989405]
We introduce a novel Topology-Aware Modeling (TAM) framework for Sim2Real UDA on object point clouds. Our approach mitigates the domain gap by leveraging global spatial topology, characterized by low-level, high-frequency 3D structures. We propose an advanced self-training strategy that combines cross-domain contrastive learning with self-training.
arXiv Detail & Related papers (2025-06-26T11:53:59Z)
A Large-scale Benchmark on Geological Fault Delineation Models: Domain Shift, Training Dynamics, Generalizability, Evaluation and Inferential Behavior [11.859145373647474]
We present the first large-scale benchmarking study designed to provide guidelines for domain shift strategies in seismic interpretation. Our benchmark spans over 200 combinations of model architectures, datasets and training strategies, across three datasets. Our analysis shows that common fine-tuning practices can lead to catastrophic forgetting when source and target datasets are disjoint.
arXiv Detail & Related papers (2025-05-13T13:56:43Z)
From Dataset to Real-world: General 3D Object Detection via Generalized Cross-domain Few-shot Learning [13.282416396765392]
We introduce the first generalized cross-domain few-shot (GCFS) task in 3D object detection. Our solution integrates multi-modal fusion and contrastive-enhanced prototype learning within one framework. To effectively capture domain-specific representations for each class from limited target data, we propose contrastive-enhanced prototype learning.
arXiv Detail & Related papers (2025-03-08T17:05:21Z)
SeisMoLLM: Advancing Seismic Monitoring via Cross-modal Transfer with Pre-trained Large Language Model [69.74609763584449]
This work presents SeisMoLLM, the first foundation model that utilizes cross-modal transfer for seismic monitoring. It achieves state-of-the-art performance on the DiTing and STEAD datasets across five critical tasks. In addition to its superior performance, SeisMoLLM maintains efficiency comparable to or even better than lightweight models in both training and inference.
arXiv Detail & Related papers (2025-02-27T10:35:53Z)
RW-Net: Enhancing Few-Shot Point Cloud Classification with a Wavelet Transform Projection-based Network [6.305913808037513]
This work introduces RW-Net, a novel framework designed to address the challenges above by integrating Rate-Distortion Explanation (RDE) and wavelet transform. By emphasizing low-frequency components of the input data, the wavelet transform captures fundamental geometric and structural attributes of 3D objects. The results demonstrate that our approach achieves state-of-the-art performance and exhibits superior generalization and robustness in few-shot learning scenarios.
arXiv Detail & Related papers (2025-01-06T18:55:59Z)
Efficient Feature Aggregation and Scale-Aware Regression for Monocular 3D Object Detection [40.14197775884804]
MonoASRH is a novel monocular 3D detection framework composed of an Efficient Hybrid Feature Aggregation Module (EH-FAM) and an Adaptive Scale-Aware 3D Regression Head (ASRH). EH-FAM employs multi-head attention with a global receptive field to extract semantic features for small-scale objects. ASRH encodes 2D bounding box dimensions and then fuses scale features with the semantic features aggregated by EH-FAM.
arXiv Detail & Related papers (2024-11-05T02:33:25Z)
GeoWizard: Unleashing the Diffusion Priors for 3D Geometry Estimation from a Single Image [94.56927147492738]
We introduce GeoWizard, a new generative foundation model designed for estimating geometric attributes from single images. We show that leveraging diffusion priors can markedly improve generalization, detail preservation, and efficiency in resource usage. We propose a simple yet effective strategy to segregate the complex data distribution of various scenes into distinct sub-distributions.
arXiv Detail & Related papers (2024-03-18T17:50:41Z)
SeisFusion: Constrained Diffusion Model with Input Guidance for 3D Seismic Data Interpolation and Reconstruction [26.02191880837226]
We propose a novel diffusion model reconstruction framework tailored for 3D seismic data. We introduce a 3D neural network architecture into the diffusion model, successfully extending the 2D diffusion model to 3D space. Our method exhibits superior reconstruction accuracy when applied to both field datasets and synthetic datasets.
arXiv Detail & Related papers (2024-03-18T05:10:13Z)
Segment Anything Model Can Not Segment Anything: Assessing AI Foundation Model's Generalizability in Permafrost Mapping [19.307294875969827]
This paper introduces AI foundation models and their defining characteristics. We evaluate the performance of large AI vision models, especially Meta's Segment Anything Model (SAM). The results show that although promising, SAM still has room for improvement to support AI-augmented terrain mapping.
arXiv Detail & Related papers (2024-01-16T19:10:09Z)
Self-supervised Feature Adaptation for 3D Industrial Anomaly Detection [59.41026558455904]
We focus on multi-modal anomaly detection. Specifically, we investigate early multi-modal approaches that attempted to utilize models pre-trained on large-scale visual datasets. We propose a Local-to-global Self-supervised Feature Adaptation (LSFA) method to finetune the adaptors and learn task-oriented representation toward anomaly detection.
arXiv Detail & Related papers (2024-01-06T07:30:41Z)
FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data. We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC). Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
FaultSeg Swin-UNETR: Transformer-Based Self-Supervised Pretraining Model for Fault Recognition [13.339333273943842]
This paper introduces an approach to enhance seismic fault recognition through self-supervised pretraining. We employ the Swin Transformer model as the core network and use the SimMIM pretraining task to capture unique features related to discontinuities in seismic data. Experimental results demonstrate that our proposed method attains state-of-the-art performance on the Thebe dataset, as measured by the OIS and ODS metrics.
arXiv Detail & Related papers (2023-10-27T08:38:59Z)
Domain Adaptive 3D Pose Augmentation for In-the-wild Human Mesh Recovery [32.73513554145019]
Domain Adaptive 3D Pose Augmentation (DAPA) is a data augmentation method that enhances the model's generalization ability in in-the-wild scenarios. We show quantitatively that finetuning with DAPA effectively improves results on benchmarks 3DPW and AGORA.
arXiv Detail & Related papers (2022-06-21T15:02:31Z)
Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection. We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via an instance-level augment. Our method called DGMono3D achieves remarkable performance on all evaluated datasets and surpasses the SoTA unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z)
Unsupervised Domain Adaptation for Monocular 3D Object Detection via Self-Training [57.25828870799331]
We propose STMono3D, a new self-teaching framework for unsupervised domain adaptation on Mono3D. We develop a teacher-student paradigm to generate adaptive pseudo labels on the target domain. STMono3D achieves remarkable performance on all evaluated datasets and even surpasses fully supervised results on the KITTI 3D object detection dataset.
arXiv Detail & Related papers (2022-04-25T12:23:07Z)
Secrets of 3D Implicit Object Shape Reconstruction in the Wild [92.5554695397653]
Reconstructing high-fidelity 3D objects from sparse, partial observations is crucial for various applications in computer vision, robotics, and graphics. Recent neural implicit modeling methods show promising results on synthetic or dense datasets, but they perform poorly on real-world data that is sparse and noisy. This paper analyzes the root cause of such deficient performance of a popular neural implicit model.
arXiv Detail & Related papers (2021-01-18T03:24:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.