Salient Objects in Clutter
- URL: http://arxiv.org/abs/2105.03053v1
- Date: Fri, 7 May 2021 03:49:26 GMT
- Title: Salient Objects in Clutter
- Authors: Deng-Ping Fan, Jing Zhang, Gang Xu, Ming-Ming Cheng, Ling Shao
- Abstract summary: This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
- Score: 130.63976772770368
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper identifies and addresses a serious design bias of existing salient
object detection (SOD) datasets, which unrealistically assume that each image
should contain at least one clear and uncluttered salient object. This design
bias has led to a saturation in performance for state-of-the-art SOD models
when evaluated on existing datasets. However, these models are still far from
satisfactory when applied to real-world scenes. Based on our analyses, we
propose a new high-quality dataset and update the previous saliency benchmark.
Specifically, our dataset, called Salient Objects in Clutter (SOC), includes
images with both salient and non-salient objects from several common object
categories. In addition to object category annotations, each salient image is
accompanied by attributes that reflect common challenges in real-world scenes,
which can help provide deeper insight into the SOD problem. Further, with a
given saliency encoder, e.g., the backbone network, existing saliency models
are designed to achieve mapping from the training image set to the training
ground-truth set. We, therefore, argue that improving the dataset can yield
higher performance gains than focusing only on the decoder design. With this in
mind, we investigate several dataset-enhancement strategies, including label
smoothing to implicitly emphasize salient boundaries, random image augmentation
to adapt saliency models to various scenarios, and self-supervised learning as
a regularization strategy to learn from small datasets. Our extensive results
demonstrate the effectiveness of these tricks. We also provide a comprehensive
benchmark for SOD, which can be found in our repository:
http://dpfan.net/SOCBenchmark.
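The abstract names label smoothing and random image augmentation as dataset-enhancement strategies but does not spell out how either is formulated. The sketch below is only an illustration under stated assumptions, not the authors' method: it uses the standard uniform label-smoothing rule for binary targets and a plain random horizontal flip as one example of augmentation. The function names `smooth_saliency_labels` and `random_horizontal_flip` and the parameters `epsilon` and `p` are hypothetical.

```python
import numpy as np


def smooth_saliency_labels(mask: np.ndarray, epsilon: float = 0.1) -> np.ndarray:
    """Apply uniform label smoothing to a binary saliency mask.

    Assumption: the standard two-class smoothing rule
    y_smooth = (1 - epsilon) * y + epsilon / 2, which maps 0 to epsilon / 2
    and 1 to 1 - epsilon / 2. The paper may use a different scheme.
    """
    if not 0.0 <= epsilon < 1.0:
        raise ValueError("epsilon must be in [0, 1)")
    mask = mask.astype(np.float32)
    return (1.0 - epsilon) * mask + 0.5 * epsilon


def random_horizontal_flip(image: np.ndarray, mask: np.ndarray,
                           rng: np.random.Generator, p: float = 0.5):
    """Flip an image and its ground-truth mask together with probability p.

    Both arrays are flipped along the width axis (axis 1) so that the
    saliency annotation stays aligned with the image content.
    """
    if rng.random() < p:
        return image[:, ::-1].copy(), mask[:, ::-1].copy()
    return image, mask


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.random((4, 6, 3))            # toy H x W x C image
    mask = np.zeros((4, 6), dtype=np.uint8)  # toy binary ground truth
    mask[1:3, 0:2] = 1                       # salient region on the left

    image_aug, mask_aug = random_horizontal_flip(image, mask, rng, p=0.5)
    target = smooth_saliency_labels(mask_aug, epsilon=0.1)
    print(target.min(), target.max())
```

Flipping the image and mask in the same call is the important design point: applying a geometric augmentation to only one of them would silently corrupt the training pair.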
Related papers
- Boosting Salient Object Detection with Knowledge Distillated from Large Foundation Models [7.898092154590899]
Salient Object Detection aims to identify and segment prominent regions within a scene.
Traditional models rely on manually annotated pseudo labels with precise pixel-level accuracy.
We develop a low-cost, high-precision annotation method to address the challenges.
arXiv Detail & Related papers (2025-01-08T15:56:21Z)
- Oriented Tiny Object Detection: A Dataset, Benchmark, and Dynamic Unbiased Learning [51.170479006249195]
We introduce a new dataset, benchmark, and a dynamic coarse-to-fine learning scheme in this study.
Our proposed dataset, AI-TOD-R, features the smallest object sizes among all oriented object detection datasets.
We present a benchmark spanning a broad range of detection paradigms, including both fully-supervised and label-efficient approaches.
arXiv Detail & Related papers (2024-12-16T09:14:32Z)
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Few-shot Object Localization [37.347898735345574]
This paper defines a novel task named Few-Shot Object Localization (FSOL).
It aims to achieve precise localization with limited samples.
This task achieves generalized object localization by leveraging a small number of labeled support samples to query the positional information of objects within corresponding images.
Experimental results demonstrate a significant performance improvement of our approach in the FSOL task, establishing an efficient benchmark for further research.
arXiv Detail & Related papers (2024-03-19T05:50:48Z)
- Innovative Horizons in Aerial Imagery: LSKNet Meets DiffusionDet for Advanced Object Detection [55.2480439325792]
We present an in-depth evaluation of an object detection model that integrates the LSKNet backbone with the DiffusionDet head.
The proposed model achieves a mean average precision (mAP) of approximately 45.7%, which is a significant improvement.
This advancement underscores the effectiveness of the proposed modifications and sets a new benchmark in aerial image analysis.
arXiv Detail & Related papers (2023-11-21T19:49:13Z)
- Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models [4.157013247909771]
We propose to leverage the recent advancements in state-of-the-art models for bottom-up segmentation (SAM), object detection (Detic), and semantic segmentation (MaskFormer).
We aim to develop a cost-effective labeling approach to obtain pseudo-labels for semantic segmentation and object instance detection in indoor environments.
We demonstrate the effectiveness of the proposed approach on the Active Vision dataset and the ADE20K dataset.
arXiv Detail & Related papers (2023-11-17T21:58:26Z)
- Weakly-supervised Contrastive Learning for Unsupervised Object Discovery [52.696041556640516]
Unsupervised object discovery is promising due to its ability to discover objects in a generic manner.
We design a semantic-guided self-supervised learning model to extract high-level semantic features from images.
We introduce Principal Component Analysis (PCA) to localize object regions.
arXiv Detail & Related papers (2023-07-07T04:03:48Z)
- MegaPose: 6D Pose Estimation of Novel Objects via Render & Compare [84.80956484848505]
MegaPose is a method to estimate the 6D pose of novel objects, that is, objects unseen during training.
First, we present a 6D pose refiner based on a render&compare strategy which can be applied to novel objects.
Second, we introduce a novel approach for coarse pose estimation which leverages a network trained to classify whether the pose error between a synthetic rendering and an observed image of the same object can be corrected by the refiner.
arXiv Detail & Related papers (2022-12-13T19:30:03Z)
This list is automatically generated from the titles and abstracts of the papers on this site.