Learning to "Segment Anything" in Thermal Infrared Images through
Knowledge Distillation with a Large Scale Dataset SATIR
- URL: http://arxiv.org/abs/2304.07969v1
- Date: Mon, 17 Apr 2023 03:27:10 GMT
- Title: Learning to "Segment Anything" in Thermal Infrared Images through
Knowledge Distillation with a Large Scale Dataset SATIR
- Authors: Junzhang Chen and Xiangzhi Bai
- Abstract summary: The Segment Anything Model (SAM) is a promptable segmentation model recently introduced by Meta AI.
We propose a framework that utilizes SAM to generate pseudo labels for pretraining models for thermal infrared image segmentation.
Our framework presents a novel way to leverage models trained on large-scale data, such as SAM, to address problems in specialized fields.
- Score: 15.198798677908615
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The Segment Anything Model (SAM) is a promptable segmentation model recently
introduced by Meta AI that has demonstrated its prowess across various fields
beyond image segmentation alone. SAM can accurately segment images from
diverse domains and generate a wide variety of masks. We discovered that this
ability of SAM can be leveraged to pretrain models for specific fields.
Accordingly, we have proposed a framework that utilizes SAM to generate pseudo
labels for pretraining thermal infrared image segmentation models. Our proposed
framework effectively improves segmentation accuracy on specific categories
beyond that of the state-of-the-art ImageNet-pretrained model. It presents a
novel way to leverage models trained on large-scale data, such as SAM, to
address problems in specialized fields. We also generated SATIR, a large-scale
thermal infrared segmentation dataset for pretraining, which contains over
100,000 images with pixel-level annotation labels. This approach offers an
effective solution for working with large models in specialized fields where
label annotation is challenging. Our code is available at https://github.com/chenjzBUAA/SATIR
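
As a concrete illustration of the pseudo-labeling step, here is a minimal sketch using the public segment-anything package, assuming a ViT-H checkpoint; the grayscale-to-RGB conversion and the mask rasterization scheme are illustrative choices, not necessarily the authors' exact pipeline.

```python
# Sketch: SAM-based pseudo-label generation for thermal infrared images.
# Checkpoint path and rasterization scheme are illustrative assumptions.
import numpy as np
import cv2
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
sam.to("cuda")
mask_generator = SamAutomaticMaskGenerator(sam)

def pseudo_label(thermal_path: str) -> np.ndarray:
    """Return an HxW int32 label map where each SAM mask gets its own id."""
    # SAM expects a 3-channel uint8 image, so replicate the thermal channel.
    gray = cv2.imread(thermal_path, cv2.IMREAD_GRAYSCALE)
    image = cv2.cvtColor(gray, cv2.COLOR_GRAY2RGB)

    masks = mask_generator.generate(image)  # dicts with 'segmentation', 'area', ...
    label = np.zeros(gray.shape, dtype=np.int32)
    # Paint large masks first so smaller, finer masks remain visible on top.
    for idx, m in enumerate(sorted(masks, key=lambda d: d["area"], reverse=True), 1):
        label[m["segmentation"]] = idx
    return label
```

A segmentation backbone pretrained on such class-agnostic label maps can then be fine-tuned on the labeled target thermal categories.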
Related papers
- SAM 2: Segment Anything in Images and Videos [63.44869623822368]
We present Segment Anything Model 2 (SAM 2), a foundation model towards solving promptable visual segmentation in images and videos.
We build a data engine, which improves model and data via user interaction, to collect the largest video segmentation dataset to date.
Our model is a simple transformer architecture with streaming memory for real-time video processing.
arXiv Detail & Related papers (2024-08-01T17:00:08Z)
- IRSAM: Advancing Segment Anything Model for Infrared Small Target Detection [55.554484379021524]
The Infrared Small Target Detection (IRSTD) task falls short of satisfactory performance due to a notable domain gap between natural and infrared images.
We propose the IRSAM model for IRSTD, which improves SAM's encoder-decoder architecture to learn better feature representation of infrared small objects.
arXiv Detail & Related papers (2024-07-10T10:17:57Z)
- MAS-SAM: Segment Any Marine Animal with Aggregated Features [55.91291540810978]
We propose a novel feature learning framework named MAS-SAM for marine animal segmentation.
Our method extracts richer marine information, from global contextual cues down to fine-grained local details.
arXiv Detail & Related papers (2024-04-24T07:38:14Z)
- Performance Evaluation of Segment Anything Model with Variational Prompting for Application to Non-Visible Spectrum Imagery [15.748043194987075]
This work assesses Segment Anything Model capabilities in segmenting objects of interest in the X-ray/infrared modalities.
Our results show that SAM can segment objects in the X-ray modality when given a box prompt, but its performance varies for point prompts.
We find that infrared objects are also challenging to segment with point prompts given the low-contrast nature of this modality.
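
To make the two prompting modes concrete, the following sketch shows box versus single-point prompting with the public segment-anything SamPredictor API; the vit_b checkpoint, the placeholder image, and the coordinates are assumptions for illustration, not data from the paper.

```python
# Box vs. point prompting with the public segment-anything SamPredictor API.
import numpy as np
from segment_anything import sam_model_registry, SamPredictor

sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b_01ec64.pth")
predictor = SamPredictor(sam)

image = np.zeros((480, 640, 3), dtype=np.uint8)  # placeholder for an X-ray/IR frame
predictor.set_image(image)

# Box prompt (x1, y1, x2, y2): reported to be more reliable on these modalities.
box_masks, box_scores, _ = predictor.predict(
    box=np.array([100, 100, 300, 260]), multimask_output=False)

# Single foreground point prompt: more sensitive to low contrast and ambiguity.
pt_masks, pt_scores, _ = predictor.predict(
    point_coords=np.array([[200, 180]]),
    point_labels=np.array([1]),  # 1 = foreground, 0 = background
    multimask_output=False)
```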
arXiv Detail & Related papers (2024-04-18T16:04:14Z)
- Semantic-SAM: Segment and Recognize Anything at Any Granularity [83.64686655044765]
We introduce Semantic-SAM, a universal image segmentation model that can segment and recognize anything at any desired granularity.
We consolidate multiple datasets across three granularities and introduce decoupled classification for objects and parts.
For the multi-granularity capability, we propose a multi-choice learning scheme during training, enabling each click to generate masks at multiple levels.
arXiv Detail & Related papers (2023-07-10T17:59:40Z)
- Input Augmentation with SAM: Boosting Medical Image Segmentation with Segmentation Foundation Model [36.015065439244495]
The Segment Anything Model (SAM) is a recently developed large model for general-purpose segmentation for computer vision tasks.
SAM was trained using 11 million images with over 1 billion masks and can produce segmentation results for a wide range of objects in natural scene images.
This paper shows that although SAM does not immediately give high-quality segmentation for medical image data, its generated masks, features, and stability scores are useful for building and training better medical image segmentation models.
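
One simple instance of this idea, sketched below, concatenates a SAM-derived mask with the image as an extra input channel for a downstream segmentation network; the 4-channel layout and the toy network are assumptions for illustration, not the paper's exact design.

```python
# Sketch: using a SAM-generated mask as an extra input channel.
import torch
import torch.nn as nn

class SegWithSAMPrior(nn.Module):
    def __init__(self, num_classes: int):
        super().__init__()
        # Any encoder-decoder works; a tiny conv stack keeps the sketch short.
        self.net = nn.Sequential(
            nn.Conv2d(4, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, num_classes, 1),
        )

    def forward(self, image: torch.Tensor, sam_mask: torch.Tensor) -> torch.Tensor:
        # image: (B, 3, H, W); sam_mask: (B, 1, H, W) in [0, 1]
        return self.net(torch.cat([image, sam_mask], dim=1))

model = SegWithSAMPrior(num_classes=2)
logits = model(torch.rand(1, 3, 256, 256), torch.rand(1, 1, 256, 256))
```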
arXiv Detail & Related papers (2023-04-22T07:11:53Z)
- SAM Fails to Segment Anything? -- SAM-Adapter: Adapting SAM in Underperformed Scenes: Camouflage, Shadow, Medical Image Segmentation, and More [13.047310918166762]
We propose SAM-Adapter, which incorporates domain-specific information or visual prompts into the segmentation network by using simple yet effective adapters.
We can even outperform task-specific network models and achieve state-of-the-art performance in the task we tested: camouflaged object detection.
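
As a rough picture of the adapter idea, the sketch below shows a generic residual bottleneck adapter of the kind inserted alongside frozen backbone blocks; the widths are arbitrary and this is not the exact SAM-Adapter module.

```python
# Generic residual bottleneck adapter; dimensions are illustrative.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Down-project, nonlinearity, up-project, plus a residual connection."""
    def __init__(self, dim: int, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(dim, bottleneck)
        self.act = nn.GELU()
        self.up = nn.Linear(bottleneck, dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x + self.up(self.act(self.down(x)))

# Only the adapter is trained; the backbone block it wraps stays frozen.
adapter = Adapter(dim=768)
tokens = torch.rand(1, 196, 768)  # (batch, tokens, dim) from a frozen ViT block
out = adapter(tokens)
```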
arXiv Detail & Related papers (2023-04-18T17:38:54Z)
- Segment Anything [108.16489338211093]
We build the largest segmentation dataset to date, with over 1 billion masks on 11M licensed and privacy-respecting images.
The model is designed and trained to be promptable, so it can transfer zero-shot to new image distributions and tasks.
We evaluate its capabilities on numerous tasks and find that its zero-shot performance is impressive.
arXiv Detail & Related papers (2023-04-05T17:59:46Z)
- Semantic Segmentation with Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization [112.68171734288237]
We propose a novel framework for discriminative pixel-level tasks using a generative model of both images and labels.
We learn a generative adversarial network that captures the joint image-label distribution and is trained efficiently using a large set of unlabeled images.
We demonstrate strong in-domain performance compared to several baselines, and are the first to showcase extreme out-of-domain generalization.
arXiv Detail & Related papers (2021-04-12T21:41:25Z)