A general approach to bridge the reality-gap
- URL: http://arxiv.org/abs/2009.01865v1
- Date: Thu, 3 Sep 2020 18:19:28 GMT
- Title: A general approach to bridge the reality-gap
- Authors: Michael Lomnitz, Zigfried Hampel-Arias, Nina Lopatina, Felipe A. Mejia
- Abstract summary: Collecting large labelled data-sets is time consuming and costly; a common approach to circumvent this is to leverage existing, similar data-sets with large amounts of labelled data.
We propose learning a general transformation to bring arbitrary images towards a canonical distribution.
This transformation is trained in an unsupervised regime, leveraging data augmentation to generate off-canonical examples of images.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Employing machine learning models in the real world requires
collecting large amounts of data, which is both time-consuming and costly. A common
approach to circumvent this is to leverage existing, similar data-sets with
large amounts of labelled data. However, models trained on these canonical
distributions do not readily transfer to real-world ones. Domain adaptation and
transfer learning are often used to bridge this "reality gap", though both
require a substantial amount of real-world data. In this paper we discuss a
more general approach: we propose learning a general transformation to bring
arbitrary images towards a canonical distribution where we can naively apply
the trained machine learning models. This transformation is trained in an
unsupervised regime, leveraging data augmentation to generate off-canonical
examples of images and training a Deep Learning model to recover their original
counterpart. We quantify the performance of this transformation using
pre-trained ImageNet classifiers, demonstrating that this procedure can recover
half of the loss in performance on the distorted data-set. We then validate the
effectiveness of this approach on a series of pre-trained ImageNet models on a
real world data set collected by printing and photographing images in different
lighting conditions.
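The unsupervised training regime described above can be illustrated with a toy stand-in: augment canonical data to produce off-canonical examples, then fit a model to map them back to their originals. A minimal NumPy sketch follows; the global gain/bias distortion and the single affine corrector are illustrative assumptions, not the paper's actual augmentations or deep architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

def distort(x, gain, bias):
    # hypothetical off-canonical augmentation: a global gain/bias shift
    return gain * x + bias

# canonical "images" (flattened, values in [0, 1])
X = rng.random((256, 64))

# generate off-canonical examples with random augmentation parameters
gains = rng.uniform(0.6, 0.9, size=(256, 1))
biases = rng.uniform(0.0, 0.1, size=(256, 1))
X_off = distort(X, gains, biases)

# minimal "transformation": a single affine map trained to recover X
w, b = 1.0, 0.0
lr = 0.5
for _ in range(500):
    err = w * X_off + b - X          # reconstruction error
    w -= lr * np.mean(err * X_off)   # gradient of the mean squared error
    b -= lr * np.mean(err)

mse_before = np.mean((X_off - X) ** 2)
mse_after = np.mean((w * X_off + b - X) ** 2)
```

With these distortion parameters the learned map removes most of the reconstruction error; the paper's transformation plays the same role as a deep network, and its outputs are then scored by frozen pre-trained ImageNet classifiers.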
Related papers
- CycleMix: Mixing Source Domains for Domain Generalization in Style-Dependent Data [5.124256074746721]
In the case of image classification, one frequent reason that algorithms fail to generalize is that they rely on spurious correlations present in training data.
These associations may not be present in the unseen test data, leading to significant degradation of their effectiveness.
In this work, we attempt to mitigate this Domain Generalization problem by training a robust feature extractor which disregards features attributed to image-style but infers based on style-invariant image representations.
arXiv Detail & Related papers (2024-07-18T11:43:26Z) - Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Data Attribution for Text-to-Image Models by Unlearning Synthesized Images [71.23012718682634]
The goal of data attribution for text-to-image models is to identify the training images that most influence the generation of a new image.
We propose a new approach that efficiently identifies highly-influential images.
arXiv Detail & Related papers (2024-06-13T17:59:44Z) - Semantic Augmentation in Images using Language [6.642383216055697]
We propose a technique to utilize generated images to augment existing datasets.
This paper explores various strategies for effective data augmentation to improve the out-of-domain generalization capabilities of deep learning models.
arXiv Detail & Related papers (2024-04-02T22:54:24Z) - Data-efficient Event Camera Pre-training via Disentangled Masked Modeling [20.987277885575963]
We present a new data-efficient voxel-based self-supervised learning method for event cameras.
Our method overcomes the limitations of previous methods, which either sacrifice temporal information or directly employ paired image data.
It exhibits excellent generalization performance and demonstrates significant improvements across various tasks with fewer parameters and lower computational costs.
arXiv Detail & Related papers (2024-03-01T10:02:25Z) - Synthetic-to-Real Domain Adaptation using Contrastive Unpaired Translation [28.19031441659854]
We propose a multi-step method to obtain training data without manual annotation effort.
From 3D object meshes, we generate images using a modern synthesis pipeline.
We utilize a state-of-the-art image-to-image translation method to adapt the synthetic images to the real domain.
arXiv Detail & Related papers (2022-03-17T17:13:23Z) - Meta Internal Learning [88.68276505511922]
Internal learning for single-image generation is a framework in which a generator is trained to produce novel images based on a single image.
We propose a meta-learning approach that enables training over a collection of images, in order to model the internal statistics of the sample image more effectively.
Our results show that the models obtained are as suitable as single-image GANs for many common image applications.
arXiv Detail & Related papers (2021-10-06T16:27:38Z) - Leveraging Self-Supervision for Cross-Domain Crowd Counting [71.75102529797549]
State-of-the-art methods for counting people in crowded scenes rely on deep networks to estimate crowd density.
We train our network to recognize upside-down real images from regular ones and incorporate into it the ability to predict its own uncertainty.
This yields an algorithm that consistently outperforms state-of-the-art cross-domain crowd counting ones without any extra computation at inference time.
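The upside-down pretext task mentioned above can be illustrated with a toy example: flip half of the samples and train a classifier to tell flipped from regular. A minimal NumPy sketch; the 1-D gradient "images" and the single hand-crafted orientation feature are illustrative assumptions, not the paper's network.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy "images": noisy vertical gradients, so orientation is learnable
base = np.linspace(0.0, 1.0, 16)
imgs = base[None, :] + 0.05 * rng.standard_normal((200, 16))

# self-supervised labels: 1 = flipped upside-down, 0 = regular
labels = rng.integers(0, 2, size=200)
data = np.where(labels[:, None] == 1, imgs[:, ::-1], imgs)

# one orientation-sensitive feature: top-half mean minus bottom-half mean
feat = data[:, :8].mean(axis=1) - data[:, 8:].mean(axis=1)

# logistic regression on that feature
w, b = 0.0, 0.0
lr = 1.0
for _ in range(300):
    p = 1.0 / (1.0 + np.exp(-(w * feat + b)))
    grad = p - labels
    w -= lr * np.mean(grad * feat)
    b -= lr * np.mean(grad)

p = 1.0 / (1.0 + np.exp(-(w * feat + b)))
accuracy = np.mean((p > 0.5) == labels)
```

The labels come for free from the flipping itself, which is what makes the pretext task usable on unlabelled real-world images.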
arXiv Detail & Related papers (2021-03-30T12:37:55Z) - SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses [82.56853587380168]
We propose a novel self-supervised image rectification (SIR) method based on an important insight that the rectified results of distorted images of the same scene from different lens should be the same.
We leverage a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters.
Our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art methods.
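The cross-lens consistency idea above can be sketched as a loss on 2-D points: two views of the same scene under different distortions should rectify to the same result. A minimal NumPy sketch; the one-parameter radial distortion and its approximate inverse are illustrative assumptions, not the paper's differentiable warping module.

```python
import numpy as np

def radial_distort(pts, k):
    # hypothetical one-parameter radial lens distortion on 2-D points
    r2 = np.sum(pts ** 2, axis=-1, keepdims=True)
    return pts * (1.0 + k * r2)

def rectify(pts, k_est):
    # approximate inverse of the distortion, valid for small k_est
    r2 = np.sum(pts ** 2, axis=-1, keepdims=True)
    return pts / (1.0 + k_est * r2)

def consistency_loss(scene, k1, k2, k1_est, k2_est):
    # SIR insight: rectifications of two differently distorted views
    # of the same scene should agree with each other
    view1 = radial_distort(scene, k1)
    view2 = radial_distort(scene, k2)
    return np.mean((rectify(view1, k1_est) - rectify(view2, k2_est)) ** 2)

rng = np.random.default_rng(2)
scene = rng.uniform(-0.5, 0.5, size=(500, 2))

loss_good = consistency_loss(scene, 0.1, 0.3, 0.1, 0.3)  # correct estimates
loss_bad = consistency_loss(scene, 0.1, 0.3, 0.0, 0.0)   # no rectification
```

Minimizing such a consistency loss over the estimated distortion parameters needs no ground-truth rectified images, which is what makes the method self-supervised.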
arXiv Detail & Related papers (2020-11-30T08:23:25Z) - Automated Synthetic-to-Real Generalization [142.41531132965585]
We propose a learning-to-optimize (L2O) strategy to automate the selection of layer-wise learning rates.
We demonstrate that the proposed framework can significantly improve the synthetic-to-real generalization performance without seeing and training on real data.
arXiv Detail & Related papers (2020-07-14T10:57:34Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.