Related papers: MatSSL: Robust Self-Supervised Representation Learning for Metallographic Image Segmentation

URL: http://arxiv.org/abs/2507.18184v1
Date: Thu, 24 Jul 2025 08:32:41 GMT
Title: MatSSL: Robust Self-Supervised Representation Learning for Metallographic Image Segmentation
Authors: Hoang Hai Nam Nguyen, Phan Nguyen Duc Hieu, Ho Won Lee,
Abstract summary: MatSSL is a streamlined self-supervised learning architecture that employs Gated Feature Fusion at each stage of the backbone to integrate multi-level representations effectively. We first perform self-supervised pretraining on a small-scale, unlabeled dataset and then fine-tune the model on multiple benchmark datasets.
Score: 0.2799243500184682
License: http://creativecommons.org/licenses/by/4.0/
Abstract: MatSSL is a streamlined self-supervised learning (SSL) architecture that employs Gated Feature Fusion at each stage of the backbone to integrate multi-level representations effectively. Current micrograph analysis of metallic materials relies on supervised methods, which require retraining for each new dataset and often perform inconsistently with only a few labeled samples. While SSL offers a promising alternative by leveraging unlabeled data, most existing methods still depend on large-scale datasets to be effective. MatSSL is designed to overcome this limitation. We first perform self-supervised pretraining on a small-scale, unlabeled dataset and then fine-tune the model on multiple benchmark datasets. The resulting segmentation models achieve 69.13% mIoU on MetalDAM, outperforming the 66.73% achieved by an ImageNet-pretrained encoder, and consistently deliver improvements of up to nearly 40% in average mIoU on the Environmental Barrier Coating (EBC) benchmark dataset compared to models pretrained with MicroNet. This suggests that MatSSL enables effective adaptation to the metallographic domain using only a small amount of unlabeled data, while preserving the rich and transferable features learned from large-scale pretraining on natural images.
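For readers unfamiliar with the metric quoted in the abstract, the following is a minimal sketch of how mean intersection-over-union (mIoU) is commonly computed for a segmentation prediction. It is not taken from the MatSSL paper: the function name `mean_iou`, the toy label lists, and the choice to skip classes absent from both prediction and ground truth are all assumptions made for illustration, and the paper's exact evaluation protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Average per-class IoU over classes present in pred or target.

    pred and target are flattened label maps of equal length.
    Skipping classes absent from both is an assumption of this sketch.
    """
    ious = []
    for c in range(num_classes):
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Hypothetical toy label maps with three classes (not real data).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)

# Gap between the two MetalDAM figures quoted in the abstract,
# in percentage points.
gap = round(69.13 - 66.73, 2)
```

On the toy data the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5; the gap between the two quoted MetalDAM results is 2.4 percentage points.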

Related papers

Unlabeled Data vs. Pre-trained Knowledge: Rethinking SSL in the Era of Large Models [24.291082472792905]
Semi-supervised learning (SSL) alleviates the cost of the data labeling process by exploiting unlabeled data. Exploiting pre-trained models has become a promising way to address label scarcity in downstream tasks. This raises a natural yet critical question: When labeled data is limited, should we rely on unlabeled data or pre-trained models?
arXiv Detail & Related papers (2025-05-19T16:29:20Z)
A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels. We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
On Pretraining Data Diversity for Self-Supervised Learning [57.91495006862553]
We explore the impact of training with more diverse datasets on the performance of self-supervised learning (SSL) under a fixed computational budget. Our findings consistently demonstrate that increasing pretraining data diversity enhances SSL performance, albeit only when the distribution distance to the downstream data is minimal.
arXiv Detail & Related papers (2024-03-20T17:59:58Z)
Rethinking Transformers Pre-training for Multi-Spectral Satellite Imagery [78.43828998065071]
Recent advances in unsupervised learning have demonstrated the ability of large vision models to achieve promising results on downstream tasks. Such pre-training techniques have also been explored recently in the remote sensing domain due to the availability of large amounts of unlabelled data. In this paper, we revisit transformer pre-training and leverage multi-scale information that is effectively utilized with multiple modalities.
arXiv Detail & Related papers (2024-03-08T16:18:04Z)
Visual Self-supervised Learning Scheme for Dense Prediction Tasks on X-ray Images [3.782392436834913]
Self-supervised learning (SSL) has led to considerable progress in natural language processing (NLP). However, the incorporation of contrastive learning into existing visual SSL models has led to considerable progress, often surpassing supervised counterparts. Here, we focus on dense prediction tasks using security inspection X-ray images to evaluate our proposed model, Segment localization (SegLoc). Based upon the Instance localization (InsLoc) model, SegLoc addresses one of the key challenges of contrastive learning, i.e., false negative pairs of query embeddings.
arXiv Detail & Related papers (2023-10-12T15:42:17Z)
CroSSL: Cross-modal Self-Supervised Learning for Time-series through Latent Masking [11.616031590118014]
CroSSL allows for handling missing modalities and end-to-end cross-modal learning. We evaluate our method on a wide range of data, including motion sensors.
arXiv Detail & Related papers (2023-07-31T17:10:10Z)
Pseudo-Labeling Based Practical Semi-Supervised Meta-Training for Few-Shot Learning [93.63638405586354]
We propose a simple and effective meta-training framework, called pseudo-labeling based meta-learning (PLML). Firstly, we train a classifier via common semi-supervised learning (SSL) and use it to obtain the pseudo-labels of unlabeled data. We build few-shot tasks from labeled and pseudo-labeled data and design a novel finetuning method with feature smoothing and noise suppression.
arXiv Detail & Related papers (2022-07-14T10:53:53Z)
DATA: Domain-Aware and Task-Aware Pre-training [94.62676913928831]
We present DATA, a simple yet effective NAS approach specialized for self-supervised learning (SSL). Our method achieves promising results across a wide range of computation costs on downstream tasks, including image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2022-03-17T02:38:49Z)
Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed as Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets. This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets. In order to better model the relationship among images and classes from different datasets, we extend the pixel level embeddings via cross dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
Remote Sensing Image Scene Classification with Self-Supervised Paradigm under Limited Labeled Samples [11.025191332244919]
We introduce a new self-supervised learning (SSL) mechanism to obtain a high-performance pre-training model for RSI scene classification from large unlabeled data. Experiments on three commonly used RSI scene classification datasets demonstrated that this new learning paradigm outperforms the traditional dominant ImageNet pre-trained model. The insights distilled from our studies can help to foster the development of SSL in the remote sensing community.
arXiv Detail & Related papers (2020-10-02T09:27:19Z)
Tackling the Problem of Limited Data and Annotations in Semantic Segmentation [1.0152838128195467]
To tackle the problem of limited data annotations in image segmentation, different pre-trained models and CRF based methods are applied. To this end, RotNet, DeeperCluster, and Semi&Weakly Supervised Learning (SWSL) pre-trained models are transferred and finetuned in a DeepLab-v2 baseline. The results of my study show that, on this small dataset, using a pre-trained ResNet50 SWSL model gives results that are 7.4% better than applying an ImageNet pre-trained model.
arXiv Detail & Related papers (2020-07-14T21:11:11Z)

This list is automatically generated from the titles and abstracts of the papers on this site.