Related papers: WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation

URL: http://arxiv.org/abs/2407.09288v1
Date: Fri, 12 Jul 2024 14:20:12 GMT
Title: WSESeg: Introducing a Dataset for the Segmentation of Winter Sports Equipment with a Baseline for Interactive Segmentation
Authors: Robin Schön, Daniel Kienzle, Rainer Lienhart
Abstract summary: We introduce a new dataset containing instance segmentation masks for ten different categories of winter sports equipment. We carry out interactive segmentation experiments on said dataset to explore possibilities for efficient further labeling.
Score: 13.38174941551702
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: In this paper we introduce a new dataset containing instance segmentation masks for ten different categories of winter sports equipment, called WSESeg (Winter Sports Equipment Segmentation). Furthermore, we carry out interactive segmentation experiments on said dataset to explore possibilities for efficient further labeling. The SAM and HQ-SAM models are conceptualized as foundation models for performing user guided segmentation. In order to measure their claimed generalization capability we evaluate them on WSESeg. Since interactive segmentation offers the benefit of creating easily exploitable ground truth data during test-time, we are going to test various online adaptation methods for the purpose of exploring potentials for improvements without having to fine-tune the models explicitly. Our experiments show that our adaptation methods drastically reduce the Failure Rate (FR) and Number of Clicks (NoC) metrics, which generally leads faster to better interactive segmentation results.
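The two metrics named in the abstract, Number of Clicks (NoC) and Failure Rate (FR), can be illustrated with a minimal sketch. It follows the usual convention in click-based interactive segmentation: NoC is the mean number of clicks needed to reach a target IoU, capped at a click budget, and FR is the fraction of instances that never reach that IoU within the budget. The 0.85 threshold and 20-click budget are common defaults assumed here, not values stated in this listing, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def clicks_to_reach(ious: Sequence[float], threshold: float,
                    max_clicks: int) -> Tuple[int, bool]:
    """Return (clicks used, success) for one instance.

    ious[k] is the IoU between prediction and ground truth after click k+1.
    """
    for k, iou in enumerate(ious[:max_clicks], start=1):
        if iou >= threshold:
            return k, True
    # Threshold never reached: count the full budget and mark as failure.
    return max_clicks, False


def noc_and_fr(per_instance_ious: List[Sequence[float]],
               threshold: float = 0.85,   # assumed default, not from the source
               max_clicks: int = 20       # assumed default, not from the source
               ) -> Tuple[float, float]:
    """Compute (NoC, FR) over a set of instances."""
    if not per_instance_ious:
        raise ValueError("need at least one instance")
    results = [clicks_to_reach(c, threshold, max_clicks)
               for c in per_instance_ious]
    n = len(results)
    noc = sum(clicks for clicks, _ in results) / n
    fr = sum(1 for _, ok in results if not ok) / n
    return noc, fr
```

Lower values are better for both metrics: a lower NoC means fewer user clicks per object, and a lower FR means fewer objects that cannot be segmented adequately within the click budget.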

Related papers

ConformalSAM: Unlocking the Potential of Foundational Segmentation Models in Semi-Supervised Semantic Segmentation with Conformal Prediction [57.930531826380836]
This work explores whether a foundational segmentation model can address label scarcity in the pixel-level vision task as an annotator for unlabeled images. We propose ConformalSAM, a novel SSSS framework which first calibrates the foundation model using the target domain's labeled data and then filters out unreliable pixel labels of unlabeled data.
arXiv Detail & Related papers (2025-07-21T17:02:57Z)
Stepping Out of Similar Semantic Space for Open-Vocabulary Segmentation [34.00709332072491]
Open-vocabulary segmentation aims to achieve segmentation of arbitrary categories given unlimited text inputs as guidance. We present a new benchmark named OpenBench that differs significantly from the training semantics. We also propose a method named OVSNet to improve the segmentation performance for diverse and open scenarios.
arXiv Detail & Related papers (2025-06-19T06:32:53Z)
Semantic Library Adaptation: LoRA Retrieval and Fusion for Open-Vocabulary Semantic Segmentation [72.28364940168092]
Open-vocabulary semantic segmentation models associate vision and text to label pixels from an undefined set of classes using textual queries. We introduce Semantic Library Adaptation (SemLA), a novel framework for training-free, test-time domain adaptation.
arXiv Detail & Related papers (2025-03-27T17:59:58Z)
Optimizing against Infeasible Inclusions from Data for Semantic Segmentation through Morphology [58.17907376475596]
State-of-the-art semantic segmentation models are typically optimized in a data-driven fashion. InSeIn extracts explicit inclusion constraints that govern spatial class relations from the semantic segmentation training set at hand. It then enforces a morphological yet differentiable loss that penalizes violations of these constraints during training to promote prediction feasibility.
arXiv Detail & Related papers (2024-08-26T22:39:08Z)
Scale Disparity of Instances in Interactive Point Cloud Segmentation [15.865365305312174]
We propose ClickFormer, an innovative interactive point cloud segmentation model that accurately segments instances of both thing and stuff categories. We employ global attention in the query-voxel transformer to mitigate the risk of generating false positives. Experiments demonstrate that ClickFormer outperforms existing interactive point cloud segmentation methods across both indoor and outdoor datasets.
arXiv Detail & Related papers (2024-07-19T03:45:48Z)
Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals. Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars. Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
Interactive segmentation in aerial images: a new benchmark and an open access web-based tool [2.729446374377189]
In recent years, interactive semantic segmentation proposed in computer vision has achieved an ideal state of human-computer interaction segmentation. This study aims to bridge the gap between interactive segmentation and remote sensing analysis by conducting a benchmark study on various interactive segmentation models.
arXiv Detail & Related papers (2023-08-25T04:49:49Z)
RefSAM: Efficiently Adapting Segmenting Anything Model for Referring Video Object Segmentation [53.4319652364256]
This paper presents the RefSAM model, which explores the potential of SAM for referring video object segmentation. Our proposed approach adapts the original SAM model to enhance cross-modality learning by employing a lightweight Cross-Modal MLP. We employ a parameter-efficient tuning strategy to align and fuse the language and vision features effectively.
arXiv Detail & Related papers (2023-07-03T13:21:58Z)
Multi-body SE(3) Equivariance for Unsupervised Rigid Segmentation and Motion Estimation [49.56131393810713]
We present an SE(3) equivariant architecture and a training strategy to tackle this task in an unsupervised manner. Our method excels in both model performance and computational efficiency, with only 0.25M parameters and 0.92G FLOPs.
arXiv Detail & Related papers (2023-06-08T22:55:32Z)
Open-vocabulary Panoptic Segmentation with Embedding Modulation [71.15502078615587]
Open-vocabulary image segmentation is attracting increasing attention due to its critical applications in the real world. Traditional closed-vocabulary segmentation methods are not able to characterize novel objects, whereas several recent open-vocabulary attempts obtain unsatisfactory results. We propose OPSNet, an omnipotent and data-efficient framework for Open-vocabulary Panoptic Segmentation.
arXiv Detail & Related papers (2023-03-20T17:58:48Z)
ISIM: Iterative Self-Improved Model for Weakly Supervised Segmentation [0.34265828682659694]
Weakly Supervised Semantic Segmentation (WSSS) is a challenging task aiming to learn the segmentation labels from class-level labels. We propose a framework that employs an iterative approach in a modified encoder-decoder-based segmentation model. Experiments performed with DeepLabv3 and UNet models show a significant gain on the Pascal VOC12 dataset.
arXiv Detail & Related papers (2022-11-22T18:14:06Z)
RAIS: Robust and Accurate Interactive Segmentation via Continual Learning [16.382862088005087]
We propose RAIS, a robust and accurate architecture for interactive segmentation with continuous learning. For efficient learning on the test set, we propose a novel optimization strategy to update global and local parameters. Our method also shows its robustness in the datasets of remote sensing and medical imaging.
arXiv Detail & Related papers (2022-10-20T03:05:44Z)
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision [54.16430358203348]
We propose a simple but effective slimmable semantic segmentation (SlimSeg) method, which can be executed at different capacities during inference. We show that our proposed SlimSeg with various mainstream networks can produce flexible models that provide dynamic adjustment of computational cost and better performance.
arXiv Detail & Related papers (2022-07-13T14:41:05Z)
Reviving Iterative Training with Mask Guidance for Interactive Segmentation [8.271859911016719]
Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes. We propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps. We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations show performance superior to all existing models.
arXiv Detail & Related papers (2021-02-12T15:44:31Z)
FAIRS -- Soft Focus Generator and Attention for Robust Object Segmentation from Extreme Points [70.65563691392987]
We present a new approach to generate object segmentation from user inputs in the form of extreme points and corrective clicks. We demonstrate our method's ability to generate high-quality training data as well as its scalability in incorporating extreme points, guiding clicks, and corrective clicks in a principled manner.
arXiv Detail & Related papers (2020-04-04T22:25:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.