Related papers: Dataset Enhancement with Instance-Level Augmentations

URL: http://arxiv.org/abs/2406.08249v1
Date: Wed, 12 Jun 2024 14:18:07 GMT
Title: Dataset Enhancement with Instance-Level Augmentations
Authors: Orest Kupyn, Christian Rupprecht
Abstract summary: We present a method for expanding a dataset by incorporating knowledge from the wide distribution of pre-trained latent diffusion models. We go beyond simple pixel transformations and introduce the concept of instance-level data augmentation by repainting parts of the image at the level of object instances.
Score: 20.935062361595197
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We present a method for expanding a dataset by incorporating knowledge from the wide distribution of pre-trained latent diffusion models. Data augmentations typically incorporate inductive biases about the image formation process into the training (e.g. translation, scaling, colour changes, etc.). Here, we go beyond simple pixel transformations and introduce the concept of instance-level data augmentation by repainting parts of the image at the level of object instances. The method combines a conditional diffusion model with depth and edge maps control conditioning to seamlessly repaint individual objects inside the scene, being applicable to any segmentation or detection dataset. Used as a data augmentation method, it improves the performance and generalization of the state-of-the-art salient object detection, semantic segmentation and object detection models. By redrawing all privacy-sensitive instances (people, license plates, etc.), the method is also applicable for data anonymization. We also release fully synthetic and anonymized expansions for popular datasets: COCO, Pascal VOC and DUTS.
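
The instance-level idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It only shows the compositing step: each object instance is repainted inside its own mask while the rest of the scene, and therefore the existing annotations, stay unchanged. The function names (`repaint_instance`, `augment_instances`, `placeholder_repaint`) are hypothetical, and `placeholder_repaint` is a trivial stand-in (colour inversion) for the depth- and edge-conditioned diffusion model described in the abstract.

```python
import numpy as np


def repaint_instance(image, mask, repaint_fn):
    """Replace only the pixels inside `mask` with the output of `repaint_fn`.

    image: (H, W, 3) uint8 array; mask: (H, W) boolean array for one instance.
    """
    repainted = repaint_fn(image, mask)
    inside = mask.astype(bool)[..., None]  # broadcast the mask over channels
    return np.where(inside, repainted, image)


def augment_instances(image, instance_masks, repaint_fn):
    """Repaint every instance in turn; pixels outside all masks are untouched."""
    out = image.copy()
    for mask in instance_masks:
        out = repaint_instance(out, mask, repaint_fn)
    return out


def placeholder_repaint(image, mask):
    # Hypothetical stand-in for the generative model: invert the colours.
    # A real pipeline would call a conditional diffusion model here.
    return 255 - image


if __name__ == "__main__":
    image = np.full((4, 4, 3), 10, dtype=np.uint8)
    mask = np.zeros((4, 4), dtype=bool)
    mask[:2, :2] = True  # one toy "instance" in the top-left corner
    augmented = augment_instances(image, [mask], placeholder_repaint)
    print(augmented[0, 0], augmented[3, 3])
```

Because the instance masks are reused unchanged, the original segmentation or detection labels remain valid for the augmented image, which is what makes this kind of augmentation applicable to existing annotated datasets.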

Related papers

GenMix: Effective Data Augmentation with Generative Diffusion Model Image Editing [60.101097709212716]
This paper introduces GenMix, a generalizable prompt-guided generative data augmentation approach. Our technique leverages image editing to generate augmented images based on custom conditional prompts. Our approach mitigates unrealistic images and label ambiguity, improving the performance and adversarial robustness of the resulting models.
arXiv Detail & Related papers (2024-12-03T10:45:34Z)
Erase, then Redraw: A Novel Data Augmentation Approach for Free Space Detection Using Diffusion Model [5.57325257338134]
Traditional data augmentation methods cannot alter high-level semantic attributes. We propose a text-to-image diffusion model to parameterize image-to-image transformations. We achieve this goal by erasing instances of real objects from the original dataset and generating new instances with similar semantics in the erased regions.
arXiv Detail & Related papers (2024-09-30T10:21:54Z)
A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance. We propose a simple yet effective data augmentation approach by leveraging advancements in generative models. Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
Distribution-Aware Data Expansion with Diffusion Models [55.979857976023695]
We propose DistDiff, a training-free data expansion framework based on the distribution-aware diffusion model. DistDiff consistently enhances accuracy across a diverse range of datasets compared to models trained solely on original data.
arXiv Detail & Related papers (2024-03-11T14:07:53Z)
Generalizing Dataset Distillation via Deep Generative Prior [75.9031209877651]
We propose to distill an entire dataset's knowledge into a few synthetic images. The idea is to synthesize a small number of synthetic data points that, when given to a learning algorithm as training data, result in a model approximating one trained on the original data. We present a new optimization algorithm that distills a large number of images into a few intermediate feature vectors in the generative model's latent space.
arXiv Detail & Related papers (2023-05-02T17:59:31Z)
SemAug: Semantically Meaningful Image Augmentations for Object Detection Through Language Grounding [5.715548995729382]
We propose an effective technique for image augmentation by injecting contextually meaningful knowledge into the scenes. Our method of semantically meaningful image augmentation for object detection via language grounding, SemAug, starts by calculating semantically appropriate new objects.
arXiv Detail & Related papers (2022-08-15T19:00:56Z)
Salient Objects in Clutter [130.63976772770368]
This paper identifies and addresses a serious design bias of existing salient object detection (SOD) datasets. This design bias has led to a saturation in performance for state-of-the-art SOD models when evaluated on existing datasets. We propose a new high-quality dataset and update the previous saliency benchmark.
arXiv Detail & Related papers (2021-05-07T03:49:26Z)
Learning Representational Invariances for Data-Efficient Action Recognition [52.23716087656834]
We show that our data augmentation strategy leads to promising performance on the Kinetics-100, UCF-101, and HMDB-51 datasets. We also validate our data augmentation strategy in the fully supervised setting and demonstrate improved performance.
arXiv Detail & Related papers (2021-03-30T17:59:49Z)
Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce. Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data. We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings [22.889059874754242]
Generation of stroke-based non-photorealistic imagery is an important problem in the computer vision community. Previous methods have been limited to datasets with little variation in position, scale and saliency of the foreground object. We propose a Semantic Guidance pipeline with 1) a bi-level painting procedure for learning the distinction between foreground and background brush strokes at training time.
arXiv Detail & Related papers (2020-11-25T09:00:04Z)

This list is automatically generated from the titles and abstracts of the papers on this site.