Related papers: Chair Segments: A Compact Benchmark for the Study of Object Segmentation

URL: http://arxiv.org/abs/2012.01250v1
Date: Wed, 2 Dec 2020 14:54:03 GMT
Title: Chair Segments: A Compact Benchmark for the Study of Object Segmentation
Authors: Leticia Pinto-Alva, Ian K. Torres, Rosangel Garcia, Ziyan Yang, Vicente Ordonez
Abstract summary: ChairSegments is a novel and compact semi-synthetic dataset for object segmentation. We show empirical findings in transfer learning that mirror recent findings for image classification.
Score: 12.16129964498819
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Over the years, datasets and benchmarks have had an outsized influence on the design of novel algorithms. In this paper, we introduce ChairSegments, a novel and compact semi-synthetic dataset for object segmentation. We also show empirical findings in transfer learning that mirror recent findings for image classification. We particularly show that models that are fine-tuned from a pretrained set of weights lie in the same basin of the optimization landscape. ChairSegments consists of a diverse set of prototypical images of chairs with transparent backgrounds composited into a diverse array of backgrounds. We aim for ChairSegments to be the equivalent of the CIFAR-10 dataset but for quickly designing and iterating over novel model architectures for segmentation. On Chair Segments, a U-Net model can be trained to full convergence in only thirty minutes using a single GPU. Finally, while this dataset is semi-synthetic, it can be a useful proxy for real data, leading to state-of-the-art accuracy on the Object Discovery dataset when used as a source of pretraining.
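The compositing step described in the abstract (chair images with transparent backgrounds pasted onto varied backgrounds) can be illustrated with the standard alpha "over" operator. This is a minimal sketch, not the authors' pipeline: the function name `composite`, the nested-list image layout, and the 0.5 alpha threshold used to derive the mask are all assumptions made for illustration.

```python
def composite(fg_rgba, bg_rgb, threshold=0.5):
    """Alpha-composite an RGBA foreground over an RGB background.

    Images are nested lists indexed [row][col]; each pixel is a tuple of
    channel values in [0, 1]. Returns (image, mask), where mask holds 1
    for pixels treated as foreground. The threshold is an assumed value.
    """
    image, mask = [], []
    for fg_row, bg_row in zip(fg_rgba, bg_rgb):
        img_row, mask_row = [], []
        for (r, g, b, a), bg_px in zip(fg_row, bg_row):
            # "Over" operator per channel: out = a * fg + (1 - a) * bg
            img_row.append(
                tuple(a * f + (1 - a) * k for f, k in zip((r, g, b), bg_px))
            )
            # The alpha channel doubles as the segmentation label.
            mask_row.append(1 if a >= threshold else 0)
        image.append(img_row)
        mask.append(mask_row)
    return image, mask


# Hypothetical 1x2 example: an opaque red pixel and a fully transparent one.
fg = [[(1.0, 0.0, 0.0, 1.0), (0.0, 0.0, 0.0, 0.0)]]
bg = [[(0.0, 0.0, 1.0), (0.0, 1.0, 0.0)]]
image, mask = composite(fg, bg)
```

Because the mask comes directly from the alpha channel, every composited image has a pixel-level label without manual annotation, which is what makes a semi-synthetic dataset of this kind cheap to build.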

Related papers

Segment Any Vehicle: Semantic and Visual Context Driven SAM and A Benchmark [12.231630639022335]
We propose SAV, a novel framework comprising three core components: a SAM-based encoder-decoder, a vehicle part knowledge graph, and a context sample retrieval encoding module. The knowledge graph explicitly models the spatial and geometric relationships among vehicle parts through a structured ontology, effectively encoding prior structural knowledge. We introduce a new large-scale benchmark dataset for vehicle part segmentation, named VehicleSeg10K, which contains 11,665 high-quality pixel-level annotations.
arXiv Detail & Related papers (2025-08-06T09:46:49Z)
Data Analysis Prediction over Multiple Unseen Datasets: A Vector Embedding Approach [0.3683202928838613]
We propose a novel methodology that infers the outcome of analytics operators by creating a model from datasets similar to the queried one. Our model can project different real-world scenarios to a lower vector embedding representation and distinguish between them.
arXiv Detail & Related papers (2025-02-24T11:21:08Z)
Few-shot Structure-Informed Machinery Part Segmentation with Foundation Models and Graph Neural Networks [1.5293427903448022]
We propose a novel approach to few-shot semantic segmentation for machinery with multiple parts that exhibit spatial and hierarchical relationships. Our method integrates the foundation models CLIPSeg and Segment Anything Model (SAM) with the interest point detector SuperPoint and a graph convolutional network (GCN) to accurately segment machinery parts. Our model, evaluated on a purely synthetic dataset depicting a truck-mounted loading crane, achieves effective segmentation across various levels of detail.
arXiv Detail & Related papers (2025-01-17T09:55:05Z)
Exploiting Local Features and Range Images for Small Data Real-Time Point Cloud Semantic Segmentation [4.02235104503587]
In this paper, we harness the information from the three-dimensional representation to proficiently capture local features. A GPU-based KDTree allows for rapid building, querying, and enhancing projection with straightforward operations. We show that a reduced version of our model not only demonstrates strong competitiveness against full-scale state-of-the-art models but also operates in real-time.
arXiv Detail & Related papers (2024-10-14T13:49:05Z)
Segmenting Object Affordances: Reproducibility and Sensitivity to Scale [27.277739855754447]
Methods re-use and adapt learning-based architectures for semantic segmentation to the affordance segmentation task. We benchmark these methods under a reproducible setup on two single-object scenarios. Our analysis shows that models are not robust to scale variations when object resolutions differ from those in the training set.
arXiv Detail & Related papers (2024-09-03T11:54:36Z)
Contrastive Lift: 3D Object Instance Segmentation by Slow-Fast Contrastive Fusion [110.84357383258818]
We propose a novel approach to lift 2D segments to 3D and fuse them by means of a neural field representation. The core of our approach is a slow-fast clustering objective function, which is scalable and well-suited for scenes with a large number of objects. Our approach outperforms the state-of-the-art on challenging scenes from the ScanNet, Hypersim, and Replica datasets.
arXiv Detail & Related papers (2023-06-07T17:57:45Z)
MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training. First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects. Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
Learning from Temporal Spatial Cubism for Cross-Dataset Skeleton-based Action Recognition [88.34182299496074]
Action labels are only available on a source dataset, but unavailable on a target dataset in the training stage. We utilize a self-supervision scheme to reduce the domain shift between two skeleton-based action datasets. By segmenting and permuting temporal segments or human body parts, we design two self-supervised learning classification tasks.
arXiv Detail & Related papers (2022-07-17T07:05:39Z)
Object Pose Estimation using Mid-level Visual Representations [5.220940151628735]
This work proposes a novel pose estimation model for object categories that can be effectively transferred to previously unseen environments. Deep convolutional network models (CNNs) for pose estimation are typically trained and evaluated on datasets curated for object detection, pose estimation, or 3D reconstruction. We show that the approach is favorable when it comes to generalization and transfer to novel environments.
arXiv Detail & Related papers (2022-03-02T22:49:17Z)
MSeg: A Composite Dataset for Multi-domain Semantic Segmentation [100.17755160696939]
We present MSeg, a composite dataset that unifies semantic segmentation datasets from different domains. We reconcile the taxonomies and bring the pixel-level annotations into alignment by relabeling more than 220,000 object masks in more than 80,000 images. A model trained on MSeg ranks first on the WildDash-v1 leaderboard for robust semantic segmentation, with no exposure to WildDash data during training.
arXiv Detail & Related papers (2021-12-27T16:16:35Z)
Multi-dataset Pretraining: A Unified Model for Semantic Segmentation [97.61605021985062]
We propose a unified framework, termed Multi-Dataset Pretraining, to take full advantage of the fragmented annotations of different datasets. This is achieved by first pretraining the network via the proposed pixel-to-prototype contrastive loss over multiple datasets. To better model the relationship among images and classes from different datasets, we extend the pixel-level embeddings via cross-dataset mixing.
arXiv Detail & Related papers (2021-06-08T06:13:11Z)
Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets. This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets. We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
Reviving Iterative Training with Mask Guidance for Interactive Segmentation [8.271859911016719]
Recent works on click-based interactive segmentation have demonstrated state-of-the-art results by using various inference-time optimization schemes. We propose a simple feedforward model for click-based interactive segmentation that employs the segmentation masks from previous steps. We find that the models trained on a combination of COCO and LVIS with diverse and high-quality annotations show performance superior to all existing models.
arXiv Detail & Related papers (2021-02-12T15:44:31Z)
SVIRO: Synthetic Vehicle Interior Rear Seat Occupancy Dataset and Benchmark [11.101588888002045]
We release SVIRO, a synthetic dataset for sceneries in the passenger compartment of ten different vehicles. We analyze machine learning-based approaches for their generalization capacities and reliability when trained on a limited number of variations.
arXiv Detail & Related papers (2020-01-10T14:44:23Z)

This list is automatically generated from the titles and abstracts of the papers on this site.