SegLoc: Visual Self-supervised Learning Scheme for Dense Prediction
Tasks of Security Inspection X-ray Images
- URL: http://arxiv.org/abs/2310.08421v3
- Date: Sat, 21 Oct 2023 10:55:31 GMT
- Title: SegLoc: Visual Self-supervised Learning Scheme for Dense Prediction
Tasks of Security Inspection X-ray Images
- Authors: Shervin Halat, Mohammad Rahmati, Ehsan Nazerfard
- Abstract summary: Self-supervised learning (SSL) in computer vision has not kept pace with its counterpart in natural language processing.
In this paper, we evaluate dense prediction tasks on security inspection x-ray images.
Our model addresses one of the most challenging drawbacks of contrastive learning: false negative pairs of query embeddings.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Lately, remarkable advancements in artificial intelligence have been attributed to the integration of self-supervised learning (SSL) schemes. Despite impressive achievements within natural language processing (NLP), SSL in computer vision has not kept pace. Recently, integrating contrastive learning on top of existing visual SSL models has yielded considerable progress, even outperforming supervised counterparts. Nevertheless, the improvements were mostly limited to classification tasks; moreover, few studies have evaluated visual SSL models in real-world scenarios, while the majority considered datasets of iconic, single-object images, notably ImageNet. Here, we therefore consider dense prediction tasks on security inspection x-ray images to evaluate our proposed model, Segmentation Localization (SegLoc). Built upon Instance Localization (InsLoc), our model addresses one of the most challenging drawbacks of contrastive learning: false negative pairs of query embeddings. To do so, our pre-training dataset is synthesized by cutting, transforming, and then pasting labeled segments, as foregrounds, from an existing labeled dataset (PIDray) onto instances, as backgrounds, of an unlabeled dataset (SIXray); further, we fully harness the labels by integrating the notion of one queue per class into the MoCo-v2 memory bank, avoiding false negative pairs. On the task in question, our approach outperforms random initialization by 3% to 6%, while underperforming supervised initialization, in AR and AP metrics at different IoU thresholds for 20 to 30 pre-training epochs.
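The two ideas described in the abstract lend themselves to short, hedged sketches. First, a minimal sketch of the cut-transform-paste synthesis, assuming a NumPy-style image pipeline: a labeled segment cropped from PIDray is lightly transformed and pasted onto an unlabeled SIXray background, and the pasted mask plus its class label become the free annotation used for pre-training. The function name `paste_segment`, the binary-mask representation, and the particular transformation are illustrative assumptions, not the authors' released code.

```python
import random
import numpy as np

def paste_segment(background, segment, mask, label, rng=random):
    """Paste one labeled foreground segment onto an unlabeled background.

    background: H x W x C uint8 array (e.g., an unlabeled SIXray image)
    segment:    h x w x C uint8 array cropped from a labeled PIDray image
    mask:       h x w binary array marking the segment's pixels
    label:      integer class id carried over from the PIDray annotation
    """
    H, W = background.shape[:2]
    h, w = segment.shape[:2]
    assert h <= H and w <= W, "segment must fit inside the background"

    # One simple transformation: random horizontal flip (rotation, scaling,
    # etc. could be added at the same point).
    if rng.random() < 0.5:
        segment, mask = segment[:, ::-1], mask[:, ::-1]

    # Random paste location that keeps the segment fully inside the image.
    y, x = rng.randint(0, H - h), rng.randint(0, W - w)

    out = background.copy()
    region = out[y:y + h, x:x + w]
    out[y:y + h, x:x + w] = np.where(mask[..., None] > 0, segment, region)

    # The pasted mask and its class label serve as the "free" annotation
    # used during pre-training.
    full_mask = np.zeros((H, W), dtype=np.int64)
    full_mask[y:y + h, x:x + w][mask > 0] = label
    return out, full_mask
```

Second, a minimal sketch of the one-queue-per-class idea on top of a MoCo-v2 style memory bank, again under assumptions (embedding dimension, queue length, temperature, and all names here are hypothetical): keys are enqueued into the queue of their own class, and a query draws negatives only from the other classes' queues, so it is never contrasted against a key of its own class.

```python
import torch
import torch.nn.functional as F

class PerClassQueue:
    """MoCo-v2 style memory bank with one FIFO queue per class."""

    def __init__(self, num_classes, dim=128, queue_len=1024):
        self.queues = F.normalize(torch.randn(num_classes, queue_len, dim), dim=-1)
        self.ptr = torch.zeros(num_classes, dtype=torch.long)
        self.num_classes, self.queue_len = num_classes, queue_len

    @torch.no_grad()
    def enqueue(self, keys, labels):
        # Push every key into the queue of its own class.
        for key, c in zip(keys, labels.tolist()):
            i = int(self.ptr[c])
            self.queues[c, i] = key
            self.ptr[c] = (i + 1) % self.queue_len

    def negatives_for(self, label):
        # Negatives come only from the *other* classes' queues, so a query is
        # never contrasted against keys of its own class (no false negatives).
        keep = [c for c in range(self.num_classes) if c != label]
        return self.queues[keep].reshape(-1, self.queues.shape[-1])

def info_nce(query, positive_key, bank, label, tau=0.2):
    """InfoNCE loss for a single query with class-aware negatives."""
    q = F.normalize(query, dim=-1)
    k = F.normalize(positive_key, dim=-1)
    negatives = bank.negatives_for(label)                   # (N, dim)
    l_pos = (q * k).sum(-1, keepdim=True)                   # (1,)
    l_neg = negatives @ q                                    # (N,)
    logits = torch.cat([l_pos, l_neg]).unsqueeze(0) / tau    # (1, 1 + N)
    target = torch.zeros(1, dtype=torch.long)                # positive at index 0
    return F.cross_entropy(logits, target)
```

In standard MoCo-v2 the memory bank is a single class-agnostic queue, so a stored key may share the query's class and act as a false negative; splitting the bank by class and excluding the query's own class from the negative set is what the abstract refers to as avoiding false negative pairs.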
Related papers
- A Closer Look at Benchmarking Self-Supervised Pre-training with Image Classification [51.35500308126506]
Self-supervised learning (SSL) is a machine learning approach where the data itself provides supervision, eliminating the need for external labels.
We study how classification-based evaluation protocols for SSL correlate and how well they predict downstream performance on different dataset types.
arXiv Detail & Related papers (2024-07-16T23:17:36Z)
- Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework, which disentangles the learning of the feature representation and the classifier in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z)
- In-Domain Self-Supervised Learning Improves Remote Sensing Image Scene Classification [5.323049242720532]
Self-supervised learning has emerged as a promising approach for remote sensing image classification.
We present a study of different self-supervised pre-training strategies and evaluate their effect across 14 downstream datasets.
arXiv Detail & Related papers (2023-07-04T10:57:52Z)
- Masked Unsupervised Self-training for Zero-shot Image Classification [98.23094305347709]
Masked Unsupervised Self-Training (MUST) is a new approach which leverages two different and complementary sources of supervision: pseudo-labels and raw images.
MUST improves upon CLIP by a large margin and narrows the performance gap between unsupervised and supervised classification.
arXiv Detail & Related papers (2022-06-07T02:03:06Z)
- Open-Set Semi-Supervised Learning for 3D Point Cloud Understanding [62.17020485045456]
It is commonly assumed in semi-supervised learning (SSL) that the unlabeled data are drawn from the same distribution as that of the labeled ones.
We propose to selectively utilize unlabeled data through sample weighting, so that only conducive unlabeled data would be prioritized.
arXiv Detail & Related papers (2022-05-02T16:09:17Z)
- UniVIP: A Unified Framework for Self-Supervised Visual Pre-training [50.87603616476038]
We propose a novel self-supervised framework to learn versatile visual representations on either single-centric-object or non-iconic datasets.
Extensive experiments show that UniVIP pre-trained on non-iconic COCO achieves state-of-the-art transfer performance.
Our method can also exploit single-centric-object datasets such as ImageNet and outperforms BYOL by 2.5% with the same pre-training epochs in linear probing.
arXiv Detail & Related papers (2022-03-14T10:04:04Z)
- Hierarchical Self-Supervised Learning for Medical Image Segmentation Based on Multi-Domain Data Aggregation [23.616336382437275]
We propose Hierarchical Self-Supervised Learning (HSSL) for medical image segmentation.
We first aggregate a dataset from several medical challenges, then pre-train the network in a self-supervised manner, and finally fine-tune on labeled data.
Compared to learning from scratch, our new method yields better performance on various tasks.
arXiv Detail & Related papers (2021-07-10T18:17:57Z)
- Remote Sensing Image Scene Classification with Self-Supervised Paradigm under Limited Labeled Samples [11.025191332244919]
We introduce a new self-supervised learning (SSL) mechanism to obtain a high-performance pre-training model for remote sensing image (RSI) scene classification from large unlabeled data.
Experiments on three commonly used RSI scene classification datasets demonstrate that this new learning paradigm outperforms the traditional dominant ImageNet pre-trained model.
The insights distilled from our studies can help to foster the development of SSL in the remote sensing community.
arXiv Detail & Related papers (2020-10-02T09:27:19Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
We further apply a dataset distillation strategy to compress the created dataset into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)