DINOv2 Rocks Geological Image Analysis: Classification, Segmentation, and Interpretability
- URL: http://arxiv.org/abs/2407.18100v1
- Date: Thu, 25 Jul 2024 15:03:36 GMT
- Title: DINOv2 Rocks Geological Image Analysis: Classification, Segmentation, and Interpretability
- Authors: Florent Brondolo, Samuel Beaussant,
- Abstract summary: This study investigates the interpretability, classification, and segmentation of CT-scan images of rock samples.
We compare various segmentation techniques to evaluate their efficacy, efficiency, and adaptability in geological image analysis.
We find that a LoRA fine-tuned DINOv2 excels in out-of-distribution segmentation and significantly outperforms other methods in multi-class segmentation.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study investigates the interpretability, classification, and segmentation of CT-scan images of rock samples, with a particular focus on the application of DINOv2 within Geosciences. We compared various segmentation techniques to evaluate their efficacy, efficiency, and adaptability in geological image analysis. The methods assessed include the Otsu thresholding method, clustering techniques (K-means and fuzzy C-means), a supervised machine learning approach (Random Forest), and deep learning methods (UNet and DINOv2). We tested these methods using ten binary sandstone datasets and three multi-class calcite datasets. To begin, we provide a thorough interpretability analysis of DINOv2's features in the geoscientific context, discussing its suitability and inherent ability to process CT-scanned rock data. In terms of classification, the out-of-the-box DINOv2 demonstrates an impressive capability to perfectly classify rock images, even when the CT scans are out of its original training set. Regarding segmentation, thresholding and unsupervised methods, while fast, perform poorly despite image preprocessing, whereas supervised methods show better results. We underscore the computational demands of deep learning but highlight its minimal intervention, superior generalization, and performance without additional image preprocessing. Additionally, we observe a lack of correlation between a network's depth or the number of parameters and its performance. Our results show that a LoRA fine-tuned DINOv2 excels in out-of-distribution segmentation and significantly outperforms other methods in multi-class segmentation. By systematically comparing these methods, we identify the most efficient strategy for meticulous and laborious segmentation tasks. DINOv2 proves advantageous, achieving segmentations that could be described as "better than ground-truth" against relatively small training sets.
Related papers
- A Comparison of Deep Learning Methods for Cell Detection in Digital Cytology [1.607370483729741]
We evaluate the performance of several Deep Learning (DL) methods for cell detection in Papanicolaou-stained cytological Whole Slide Images (WSIs)
We examine recentoff-the-shelf algorithms as well as custom-designed detectors, applying them to two datasets.
Results show that centroid-based methods, particularly the Improved Fully Convolutional Regression Network (IFCRN) method, outperform segmentation-based methods in terms of both detection accuracy and computational efficiency.
arXiv Detail & Related papers (2025-04-09T15:08:12Z) - Leveraging Semi-Supervised Learning to Enhance Data Mining for Image Classification under Limited Labeled Data [35.431340001608476]
Traditional data mining methods are inadequate when faced with large-scale, high-dimensional and complex data.
This study introduces semi-supervised learning methods, aiming to improve the algorithm's ability to utilize unlabeled data.
Specifically, we adopt a self-training method and combine it with a convolutional neural network (CNN) for image feature extraction and classification.
arXiv Detail & Related papers (2024-11-27T18:59:50Z) - Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z) - Comparative Evaluation of Traditional and Deep Learning-Based
Segmentation Methods for Spoil Pile Delineation Using UAV Images [0.0]
This study refines and juxtaposes various segmentation approaches, specifically colour-based and morphology-based techniques.
The objective is to enhance and evaluate avenues for object-based analysis for spoil characterisation within the context of mining environments.
Among the diverse segmentation approaches evaluated, the morphology-based deep learning segmentation approach, Segment Anything Model (SAM), exhibited superior performance in comparison to other approaches.
arXiv Detail & Related papers (2024-02-01T02:54:49Z) - Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based
Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage.
We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets.
By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z) - Inducing Gaussian Process Networks [80.40892394020797]
We propose inducing Gaussian process networks (IGN), a simple framework for simultaneously learning the feature space as well as the inducing points.
The inducing points, in particular, are learned directly in the feature space, enabling a seamless representation of complex structured domains.
We report on experimental results for real-world data sets showing that IGNs provide significant advances over state-of-the-art methods.
arXiv Detail & Related papers (2022-04-21T05:27:09Z) - Learning Fuzzy Clustering for SPECT/CT Segmentation via Convolutional
Neural Networks [5.3123694982708365]
Quantitative bone single-photon emission computed tomography (QBSPECT) has the potential to provide a better quantitative assessment of bone metastasis than planar bone scintigraphy.
The segmentation of anatomical regions-of-interests (ROIs) still relies heavily on the manual delineation by experts.
This work proposes a fast and robust automated segmentation method for partitioning a QBSPECT image into lesion, bone, and background.
arXiv Detail & Related papers (2021-04-17T19:03:52Z) - Detecting micro fractures with X-ray computed tomography [4.855026133182103]
We present a data-set produced by the successful visualization of a fracture network in Carrara marble with XRCT.
Three conventional and two machine-learning-based methods are evaluated.
The output of the 2D U-net model is one of the adopted machine-learning-based segmentation methods.
arXiv Detail & Related papers (2021-03-23T20:20:24Z) - Sparse Signal Models for Data Augmentation in Deep Learning ATR [0.8999056386710496]
We propose a data augmentation approach to incorporate domain knowledge and improve the generalization power of a data-intensive learning algorithm.
We exploit the sparsity of the scattering centers in the spatial domain and the smoothly-varying structure of the scattering coefficients in the azimuthal domain to solve the ill-posed problem of over-parametrized model fitting.
arXiv Detail & Related papers (2020-12-16T21:46:33Z) - Dense Contrastive Learning for Self-Supervised Visual Pre-Training [102.15325936477362]
We present dense contrastive learning, which implements self-supervised learning by optimizing a pairwise contrastive (dis)similarity loss at the pixel level between two views of input images.
Compared to the baseline method MoCo-v2, our method introduces negligible computation overhead (only 1% slower)
arXiv Detail & Related papers (2020-11-18T08:42:32Z) - Region Comparison Network for Interpretable Few-shot Image
Classification [97.97902360117368]
Few-shot image classification has been proposed to effectively use only a limited number of labeled examples to train models for new classes.
We propose a metric learning based method named Region Comparison Network (RCN), which is able to reveal how few-shot learning works.
We also present a new way to generalize the interpretability from the level of tasks to categories.
arXiv Detail & Related papers (2020-09-08T07:29:05Z) - Supervision and Source Domain Impact on Representation Learning: A
Histopathology Case Study [6.762603053858596]
In this work, we explored the performance of a deep neural network and triplet loss in the area of representation learning.
We investigated the notion of similarity and dissimilarity in pathology whole-slide images and compared different setups from unsupervised and semi-supervised to supervised learning.
We achieved high accuracy and generalization when the learned representations were applied to two different pathology datasets.
arXiv Detail & Related papers (2020-05-10T21:27:38Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.