Highly Accurate Dichotomous Image Segmentation
- URL: http://arxiv.org/abs/2203.03041v2
- Date: Tue, 8 Mar 2022 19:13:10 GMT
- Title: Highly Accurate Dichotomous Image Segmentation
- Authors: Xuebin Qin, Hang Dai, Xiaobin Hu, Deng-Ping Fan, Ling Shao, and Luc Van Gool
- Abstract summary: A new task called dichotomous image segmentation (DIS) aims to segment highly accurate objects from natural images.
We collect the first large-scale dataset, DIS5K, which contains 5,470 high-resolution (e.g., 2K, 4K or larger) images.
We also introduce a simple intermediate supervision baseline (IS-Net) using both feature-level and mask-level guidance for DIS model training.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a systematic study on a new task called dichotomous image
segmentation (DIS), which aims to segment highly accurate objects from natural
images. To this end, we collected the first large-scale dataset, called DIS5K,
which contains 5,470 high-resolution (e.g., 2K, 4K or larger) images covering
camouflaged, salient, or meticulous objects in various backgrounds. All images
are annotated with extremely fine-grained labels. In addition, we introduce a
simple intermediate supervision baseline (IS-Net) using both feature-level and
mask-level guidance for DIS model training. Without tricks, IS-Net outperforms
various cutting-edge baselines on the proposed DIS5K, making it a general
self-learned supervision network that can help facilitate future research in
DIS. Further, we design a new metric called human correction efforts (HCE)
which approximates the number of mouse clicking operations required to correct
the false positives and false negatives. HCE is utilized to measure the gap
between models and real-world applications and thus can complement existing
metrics. Finally, we conduct the largest-scale benchmark, evaluating 16
representative segmentation models, providing a more insightful discussion
regarding object complexities, and showing several potential applications
(e.g., background removal, art design, 3D reconstruction). We hope these efforts
can open up promising directions for both academia and industry. We will
release our DIS5K dataset, IS-Net baseline, HCE metric, and the complete
benchmark results.
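The abstract describes HCE only at a high level: it approximates the number of mouse clicks needed to correct false positives and false negatives. The sketch below is not the paper's HCE algorithm. It is a deliberately crude stand-in that assumes one click per connected error region, and the function names (`error_regions`, `count_components`, `click_proxy`) are hypothetical. It is meant only to illustrate the idea of scoring a mask by correction effort rather than by pixel overlap.

```python
from collections import deque


def error_regions(pred, gt):
    """Return (fp, fn) 0/1 grids for two same-sized binary masks."""
    fp = [[int(p and not g) for p, g in zip(prow, grow)]
          for prow, grow in zip(pred, gt)]
    fn = [[int(g and not p) for p, g in zip(prow, grow)]
          for prow, grow in zip(pred, gt)]
    return fp, fn


def count_components(mask):
    """Count 4-connected regions of 1s with a breadth-first search."""
    h = len(mask)
    w = len(mask[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    count = 0
    for y in range(h):
        for x in range(w):
            if mask[y][x] and not seen[y][x]:
                count += 1
                seen[y][x] = True
                queue = deque([(y, x)])
                while queue:
                    cy, cx = queue.popleft()
                    for ny, nx in ((cy - 1, cx), (cy + 1, cx),
                                   (cy, cx - 1), (cy, cx + 1)):
                        if (0 <= ny < h and 0 <= nx < w
                                and mask[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
    return count


def click_proxy(pred, gt):
    """Assumed proxy: one correction click per FP or FN region."""
    fp, fn = error_regions(pred, gt)
    return count_components(fp) + count_components(fn)


gt = [
    [0, 1, 1, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
]
pred = [
    [1, 1, 0, 0],
    [0, 1, 1, 0],
    [0, 0, 0, 1],
]
print(click_proxy(pred, gt))
```

In this toy example the prediction has two isolated false-positive pixels and one false-negative pixel, so the proxy counts three regions. A region-count proxy like this ignores the shape and size of each error, which the abstract's click-based description suggests a real effort metric should account for.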
Related papers
- Semantic Segmentation in Satellite Hyperspectral Imagery by Deep Learning [54.094272065609815]
We propose a lightweight 1D-CNN model, 1D-Justo-LiuNet, which outperforms state-of-the-art models in the hyperspectral domain.
1D-Justo-LiuNet achieves the highest accuracy (0.93) with the smallest model size (4,563 parameters) among all tested models.
(2023-10-24T21:57:59Z)
- Self-Supervised Masked Digital Elevation Models Encoding for Low-Resource Downstream Tasks [0.6374763930914523]
GeoAI is uniquely poised to take advantage of the self-supervised methodology due to the decades of data collected.
The proposed architecture is the Masked Autoencoder pre-trained on ImageNet.
(2023-09-06T21:20:10Z)
- Dense Depth Distillation with Out-of-Distribution Simulated Images [30.79756881887895]
We study data-free knowledge distillation (KD) for monocular depth estimation (MDE).
KD learns a lightweight model for real-world depth perception tasks by compressing it from a trained teacher model while lacking training data in the target domain.
We show that our method outperforms the baseline KD by a good margin and even achieves slightly better performance with as few as 1/6 of the training images.
(2022-08-26T07:10:01Z)
- Large-scale Unsupervised Semantic Segmentation [163.3568726730319]
We propose a new problem of large-scale unsupervised semantic segmentation (LUSS) with a newly created benchmark dataset to track the research progress.
Based on the ImageNet dataset, we propose the ImageNet-S dataset with 1.2 million training images and 40k high-quality semantic segmentation annotations for evaluation.
(2021-06-06T15:02:11Z)
- Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets.
This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets.
We propose a new high-quality dataset and update the previous saliency benchmark.
(2021-05-07T03:49:26Z)
- Group-Wise Semantic Mining for Weakly Supervised Semantic Segmentation [49.90178055521207]
This work addresses weakly supervised semantic segmentation (WSSS), with the goal of bridging the gap between image-level annotations and pixel-level segmentation.
We formulate WSSS as a novel group-wise learning task that explicitly models semantic dependencies in a group of images to estimate more reliable pseudo ground-truths.
In particular, we devise a graph neural network (GNN) for group-wise semantic mining, wherein input images are represented as graph nodes.
(2020-12-09T12:40:13Z)
- Re-thinking Co-Salient Object Detection [170.44471050548827]
Co-salient object detection (CoSOD) aims to detect the co-occurring salient objects in a group of images.
Existing CoSOD datasets often have a serious data bias, assuming that each group of images contains salient objects of similar visual appearances.
We introduce a new benchmark, called CoSOD3k in the wild, which requires a large amount of semantic context.
(2020-07-07T12:20:51Z)