ObjectAug: Object-level Data Augmentation for Semantic Image
Segmentation
- URL: http://arxiv.org/abs/2102.00221v1
- Date: Sat, 30 Jan 2021 12:46:20 GMT
- Authors: Jiawei Zhang, Yanchun Zhang, Xiaowei Xu
- Abstract summary: Semantic image segmentation aims to obtain object labels with precise boundaries.
Current augmentation strategies operate at the image level, so objects and the background remain coupled.
We propose ObjectAug to perform object-level augmentation for semantic image segmentation.
- Score: 22.91204798022379
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic image segmentation aims to obtain object labels with precise
boundaries, a task that usually suffers from overfitting. Recently, various data
augmentation strategies, such as regional dropout and mix-based methods, have been
proposed to address the problem. These strategies have proved effective
for guiding the model to attend to less discriminative parts. However, current
strategies operate at the image level, and objects and the background are
coupled. Thus, the boundaries are not well augmented due to the fixed semantic
scenario. In this paper, we propose ObjectAug to perform object-level
augmentation for semantic image segmentation. ObjectAug first decouples the
image into individual objects and the background using the semantic labels.
Next, each object is augmented individually with commonly used augmentation
methods (e.g., scaling, shifting, and rotation). Then, the black areas left
by object augmentation are restored using image inpainting. Finally, the
augmented objects and background are assembled as an augmented image. In this
way, the boundaries can be fully explored in various semantic scenarios. In
addition, ObjectAug can support category-aware augmentation that gives various
possibilities to objects in each category, and can be easily combined with
existing image-level augmentation methods to further boost performance.
Comprehensive experiments are conducted on both natural image and medical image
datasets. Experimental results demonstrate that ObjectAug clearly improves
segmentation performance.
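The pipeline is concrete enough to sketch. Below is a minimal, illustrative Python/OpenCV version, assuming each foreground class id in the label map marks one object; the paper restores holes with a learned inpainting model, for which cv2.inpaint is only a crude stand-in, and the affine parameter ranges are placeholders (category-aware augmentation would simply vary them per class).

```python
# Illustrative sketch of the ObjectAug pipeline: decouple objects from the
# background via the label map, inpaint the holes, augment each object with a
# random affine transform, and reassemble. cv2.inpaint stands in for the
# paper's learned inpainting model; parameter ranges are placeholders.
import cv2
import numpy as np

def random_affine(h, w, scale=(0.8, 1.2), shift=0.1, max_angle=15):
    """Random scale/shift/rotation matrix around the image center."""
    angle = np.random.uniform(-max_angle, max_angle)
    M = cv2.getRotationMatrix2D((w / 2, h / 2), angle, np.random.uniform(*scale))
    M[0, 2] += np.random.uniform(-shift, shift) * w  # horizontal shift
    M[1, 2] += np.random.uniform(-shift, shift) * h  # vertical shift
    return M

def object_aug(image, label, ignore_ids=(0,)):
    """image: HxWx3 uint8, label: HxW int map (one id per object/class)."""
    h, w = label.shape
    obj_ids = [i for i in np.unique(label) if i not in ignore_ids]

    # 1) Decouple: remove all object pixels from the image.
    fg_mask = np.isin(label, obj_ids).astype(np.uint8)
    background = image.copy()
    background[fg_mask == 1] = 0

    # 2) Restore the black holes with inpainting (crude stand-in for the
    #    learned inpainting network used in the paper).
    background = cv2.inpaint(background, fg_mask * 255, 3, cv2.INPAINT_TELEA)

    out_img, out_lbl = background, np.zeros_like(label)
    for obj_id in obj_ids:
        # 3) Augment each object individually (later objects overwrite
        #    earlier ones where warped masks overlap).
        mask = (label == obj_id).astype(np.uint8)
        M = random_affine(h, w)
        warped_img = cv2.warpAffine(image, M, (w, h))
        warped_mask = cv2.warpAffine(mask, M, (w, h), flags=cv2.INTER_NEAREST)

        # 4) Reassemble objects and background into the augmented pair.
        out_img = np.where(warped_mask[..., None] == 1, warped_img, out_img)
        out_lbl = np.where(warped_mask == 1, obj_id, out_lbl)
    return out_img, out_lbl
```

The augmented (out_img, out_lbl) pair then feeds the training loop exactly like an original image/label pair.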
Related papers
- ResVG: Enhancing Relation and Semantic Understanding in Multiple Instances for Visual Grounding [42.10086029931937]
Visual grounding aims to localize the object referred to in an image based on a natural language query.
Existing methods suffer a significant performance drop when an image contains multiple distractor objects.
We propose a novel approach, the Relation and Semantic-sensitive Visual Grounding (ResVG) model, to address this issue.
arXiv Detail & Related papers (2024-08-29T07:32:01Z)
- High-Quality Entity Segmentation [110.55724145851725]
CropFormer is designed to tackle the intractability of instance-level segmentation on high-resolution images.
It improves mask prediction by fusing the full image with high-resolution crops that provide more fine-grained image details.
With CropFormer, we achieve a significant AP gain of 1.9 on the challenging entity segmentation task.
arXiv Detail & Related papers (2022-11-10T18:58:22Z)
- SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding [5.715548995729382]
We propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes.
Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by identifying semantically appropriate new objects to inject into a scene.
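As a rough illustration of that selection step (not the paper's actual procedure), one can rank candidate categories by word-embedding similarity to the objects already present in the scene; the tiny embedding table below is purely hypothetical:

```python
# Toy sketch of language-grounded object selection: score each candidate
# category by cosine similarity to the mean embedding of the scene's existing
# objects. SemAug uses pretrained language embeddings (e.g., GloVe); the 3-d
# vectors below are hypothetical placeholders, not real embeddings.
import numpy as np

TOY_EMBED = {
    "road": np.array([0.9, 0.1, 0.0]),
    "car": np.array([0.8, 0.2, 0.1]),
    "bicycle": np.array([0.7, 0.3, 0.1]),
    "whale": np.array([0.0, 0.1, 0.9]),
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def pick_object(scene_objects, candidates):
    """Return the candidate most similar, on average, to the scene context."""
    context = np.mean([TOY_EMBED[o] for o in scene_objects], axis=0)
    return max(candidates, key=lambda c: cosine(TOY_EMBED[c], context))

print(pick_object(["road", "car"], ["bicycle", "whale"]))  # -> bicycle
```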
arXiv Detail & Related papers (2022-08-15T19:00:56Z)
- Panoptic-based Object Style-Align for Image-to-Image Translation [2.226472061870956]
We propose panoptic-based object style-align generative adversarial networks (POSA-GANs) for image-to-image translation.
The proposed method was systematically compared with competing methods and achieved significant improvements in both image quality and object recognition performance for translated images.
arXiv Detail & Related papers (2021-12-03T14:28:11Z)
- SemIE: Semantically-aware Image Extrapolation [1.5588799679661636]
We propose a semantically-aware novel paradigm to perform image extrapolation.
The proposed approach focuses not only on (i) extending the already present objects but also on (ii) adding new objects in the extended region based on the context.
We conduct experiments on the Cityscapes and ADE20K-bedroom datasets and show that our method outperforms all baselines in terms of FID and similarity in object co-occurrence statistics.
arXiv Detail & Related papers (2021-08-31T09:31:27Z)
- Open-World Entity Segmentation [70.41548013910402]
We introduce a new image segmentation task, termed Entity Segmentation (ES), which aims to segment all visual entities in an image without considering semantic category labels.
All semantically meaningful segments are treated equally as categoryless entities, with no thing-stuff distinction.
ES enables the following: (1) merging multiple datasets to form a large training set without the need to resolve label conflicts; (2) any model trained on one dataset can generalize exceptionally well to other datasets with unseen domains.
arXiv Detail & Related papers (2021-07-29T17:59:05Z)
- Segmenter: Transformer for Semantic Segmentation [79.9887988699159]
We introduce Segmenter, a transformer model for semantic segmentation.
We build on the recent Vision Transformer (ViT) and extend it to semantic segmentation.
It outperforms the state of the art on the challenging ADE20K dataset and performs on par with it on Pascal Context and Cityscapes.
arXiv Detail & Related papers (2021-05-12T13:01:44Z)
- Improving Semantic Segmentation via Decoupled Body and Edge Supervision [89.57847958016981]
Existing semantic segmentation approaches either aim to improve objects' inner consistency by modeling the global context, or refine object details along their boundaries by multi-scale feature fusion.
In this paper, a new paradigm for semantic segmentation is proposed.
Our insight is that appealing semantic segmentation performance requires explicitly modeling the object body and edge, which correspond to the low- and high-frequency components of the image.
We show that the proposed framework, with various baselines or backbone networks, leads to better object inner consistency and sharper object boundaries.
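A minimal sketch of this frequency view, assuming Gaussian smoothing as the low-pass filter and equal body/edge loss weighting (both are illustrative choices, not the paper's design):

```python
# Sketch of frequency-based body/edge decoupling: the smoothed (low-frequency)
# part of a map acts as the "body", the residual (high-frequency) part as the
# "edge", and the two receive separate supervision. Kernel size and the 1:1
# L2 weighting are placeholders.
import cv2
import numpy as np

def decouple_body_edge(seg_map, ksize=15):
    """seg_map: HxW float32 map; returns (body, edge) with body + edge == seg_map."""
    body = cv2.GaussianBlur(seg_map, (ksize, ksize), 0)  # low-frequency component
    edge = seg_map - body                                # high-frequency residual
    return body, edge

def decoupled_loss(pred, target):
    """Separate L2 terms for the body and edge parts of prediction vs. target."""
    pred_b, pred_e = decouple_body_edge(pred)
    tgt_b, tgt_e = decouple_body_edge(target)
    return float(np.mean((pred_b - tgt_b) ** 2) + np.mean((pred_e - tgt_e) ** 2))
```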
arXiv Detail & Related papers (2020-07-20T12:11:22Z)
- Mining Cross-Image Semantics for Weakly Supervised Semantic Segmentation [128.03739769844736]
Two neural co-attentions are incorporated into the classifier to capture cross-image semantic similarities and differences.
In addition to boosting object pattern learning, the co-attention can leverage context from other related images to improve localization map inference.
Our algorithm sets a new state of the art in all these settings, demonstrating its efficacy and generalizability.
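A rough sketch of what a single cross-image co-attention step can look like, with one learnable affinity matrix W assumed for illustration (the paper's exact formulation may differ):

```python
# Rough sketch of cross-image co-attention: an affinity matrix between the
# flattened features of two images lets each image aggregate context from the
# other. W and the plain softmax normalization are illustrative assumptions.
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def co_attention(f1, f2, W):
    """f1: (C, N1) and f2: (C, N2) flattened feature maps; W: (C, C) learnable."""
    S = f1.T @ W @ f2                    # (N1, N2) pairwise position affinities
    f1_ctx = f2 @ softmax(S, axis=1).T   # (C, N1): f1 positions attend to f2
    f2_ctx = f1 @ softmax(S, axis=0)     # (C, N2): f2 positions attend to f1
    return f1_ctx, f2_ctx                # context features to fuse back in
```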
arXiv Detail & Related papers (2020-07-03T21:53:46Z)
- Object-Centric Image Generation from Layouts [93.10217725729468]
We develop a layout-to-image-generation method to generate complex scenes with multiple objects.
Our method learns representations of the spatial relationships between objects in the scene, which lead to improved layout fidelity.
We introduce SceneFID, an object-centric adaptation of the popular Fréchet Inception Distance metric that is better suited to multi-object images.
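A sketch of how such an object-centric FID might be computed, assuming objects are cropped from layout boxes and embedded by an Inception-style feature extractor (crop protocol and network are assumptions, not the paper's specification):

```python
# Sketch of an object-centric FID in the spirit of SceneFID: compute standard
# FID statistics over per-object crops (taken from layout boxes) instead of
# whole images. The feature extractor (e.g., an Inception embedding of each
# crop) is assumed and not shown.
import numpy as np
from scipy.linalg import sqrtm

def crops_from_layout(image, boxes):
    """boxes: iterable of (x0, y0, x1, y1) pixel coords; one crop per object."""
    return [image[y0:y1, x0:x1] for (x0, y0, x1, y1) in boxes]

def fid(feats_a, feats_b):
    """Frechet distance between two sets of feature vectors, each (N, D)."""
    mu_a, mu_b = feats_a.mean(axis=0), feats_b.mean(axis=0)
    cov_a = np.cov(feats_a, rowvar=False)
    cov_b = np.cov(feats_b, rowvar=False)
    covmean = sqrtm(cov_a @ cov_b)
    if np.iscomplexobj(covmean):      # discard tiny numerical imaginary parts
        covmean = covmean.real
    diff = mu_a - mu_b
    return float(diff @ diff + np.trace(cov_a + cov_b - 2 * covmean))
```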
arXiv Detail & Related papers (2020-03-16T21:40:09Z)