Exploiting the Distortion-Semantic Interaction in Fisheye Data
- URL: http://arxiv.org/abs/2305.00079v2
- Date: Sat, 6 May 2023 18:15:21 GMT
- Title: Exploiting the Distortion-Semantic Interaction in Fisheye Data
- Authors: Kiran Kokilepersaud, Mohit Prabhushankar, Yavuz Yarici, Ghassan
AlRegib, Armin Parchami
- Abstract summary: Fisheye data offers a wider field of view than other camera types, but this comes at the expense of high distortion.
Objects further from the center exhibit deformations that make it difficult for a model to identify their semantic context.
We introduce an approach to exploit this relationship by first extracting distortion class labels based on an object's distance from the center of the image.
We then shape a backbone's representation space with a weighted contrastive loss that constrains objects of the same semantic class and distortion class to be close to each other.
- Score: 12.633032175875865
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this work, we present a methodology to shape a fisheye-specific
representation space that reflects the interaction between distortion and
semantic context present in this data modality. Fisheye data offers a wider
field of view than other camera types, but this comes at the expense of high
radial distortion. As a result, objects further from the center
exhibit deformations that make it difficult for a model to identify their
semantic context. While previous work has attempted architectural and training
augmentation changes to alleviate this effect, no work has attempted to guide
the model towards learning a representation space that reflects this
interaction between distortion and semantic context inherent to fisheye data.
We introduce an approach to exploit this relationship by first extracting
distortion class labels based on an object's distance from the center of the
image. We then shape a backbone's representation space with a weighted
contrastive loss that constrains objects of the same semantic class and
distortion class to be close to each other within a lower dimensional embedding
space. This backbone trained with both semantic and distortion information is
then fine-tuned within an object detection setting to empirically evaluate the
quality of the learnt representation. We show this method leads to performance
improvements of as much as 1.1% mean average precision over standard object
detection strategies and 0.6% over other state-of-the-art representation
learning approaches.
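The two steps described above, deriving a distortion class from an object's radial distance to the image center and weighting a supervised contrastive loss by agreement on both labels, could be sketched as follows. This is a minimal PyTorch sketch rather than the authors' implementation; the number of distortion bins, the extra weight `lam`, and the temperature are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def distortion_class(cx, cy, img_w, img_h, num_bins=3):
    """Bin objects by normalized radial distance from the image center.
    cx, cy: 1-D tensors of object center coordinates in pixels."""
    dx = (cx - img_w / 2) / (img_w / 2)
    dy = (cy - img_h / 2) / (img_h / 2)
    r = torch.sqrt(dx ** 2 + dy ** 2).clamp(max=1.0)  # radius in [0, 1]
    return (r * num_bins).long().clamp(max=num_bins - 1)

def weighted_contrastive_loss(z, sem, dist, temperature=0.1, lam=0.5):
    """Supervised contrastive loss over embeddings z (N x D). Positives share
    the semantic class; pairs that also share the distortion class receive
    extra weight, pulling same-class, same-distortion objects closer."""
    z = F.normalize(z, dim=1)
    sim = z @ z.t() / temperature
    sim = sim - sim.max(dim=1, keepdim=True).values.detach()  # numerical stability
    self_mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    sem_pos = (sem[:, None] == sem[None, :]) & ~self_mask
    both_pos = sem_pos & (dist[:, None] == dist[None, :])
    w = sem_pos.float() + lam * both_pos.float()  # pairwise positive weights
    exp_sim = sim.exp().masked_fill(self_mask, 0.0)
    log_prob = sim - exp_sim.sum(dim=1, keepdim=True).log()
    per_anchor = -(w * log_prob).sum(dim=1) / w.sum(dim=1).clamp(min=1e-8)
    return per_anchor[sem_pos.any(dim=1)].mean()  # skip anchors with no positives
```

In use, each object crop would contribute its embedding, its semantic label, and the bin returned by `distortion_class`; the backbone trained with this loss is then fine-tuned for object detection.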
Related papers
- Zero-Shot Object-Centric Representation Learning [72.43369950684057]
We study current object-centric methods through the lens of zero-shot generalization.
We introduce a benchmark comprising eight different synthetic and real-world datasets.
We find that training on diverse real-world images improves transferability to unseen scenarios.
arXiv Detail & Related papers (2024-08-17T10:37:07Z)
- Deep Domain Adaptation: A Sim2Real Neural Approach for Improving Eye-Tracking Systems [80.62854148838359]
Eye image segmentation is a critical step in eye tracking that has great influence over the final gaze estimate.
We use dimensionality-reduction techniques to measure the overlap between the target eye images and synthetic training data.
Our methods result in robust, improved performance when tackling the discrepancy between simulation and real-world data samples.
arXiv Detail & Related papers (2024-03-23T22:32:06Z)
- GS-Pose: Category-Level Object Pose Estimation via Geometric and Semantic Correspondence [5.500735640045456]
Category-level pose estimation is a challenging task with many potential applications in computer vision and robotics.
We propose to utilize both geometric and semantic features obtained from a pre-trained foundation model.
This requires significantly less data to train than prior methods since the semantic features are robust to object texture and appearance.
arXiv Detail & Related papers (2023-11-23T02:35:38Z)
- Uncovering the Background-Induced bias in RGB based 6-DoF Object Pose Estimation [5.30320006562872]
In recent years, there has been a growing trend of using data-driven methods in industrial settings.
It becomes critical to understand how the manipulation of video and images can impact the effectiveness of a machine learning method.
Our case study analyzes the Linemod dataset, which is considered the state of the art in the 6D pose estimation context.
arXiv Detail & Related papers (2023-04-17T12:54:20Z)
- CbwLoss: Constrained Bidirectional Weighted Loss for Self-supervised Learning of Depth and Pose [13.581694284209885]
Photometric differences are used to train neural networks for estimating depth and camera pose from unlabeled monocular videos; a generic form of this objective is sketched after this entry.
In this paper, we handle moving objects and occlusions by utilizing the differences between the flow fields and depth structures generated by affine transformation and view synthesis.
We mitigate the effect of textureless regions on model optimization by measuring differences between features with more semantic and contextual information without adding networks.
arXiv Detail & Related papers (2022-12-12T12:18:24Z)
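As a rough illustration of the photometric objective this line of work builds on (a generic sketch, not CbwLoss's constrained bidirectional formulation), the per-pixel difference between a target frame and a source frame warped into the target view might look like:

```python
import torch

def photometric_loss(target, warped, mask=None):
    # L1 difference between the target frame and a source frame warped into
    # the target view using predicted depth and camera pose (view synthesis).
    diff = (target - warped).abs().mean(dim=1)  # average over color channels
    if mask is not None:
        diff = diff * mask  # e.g., down-weight occluded or moving regions
    return diff.mean()
```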
- Contrastive Object-level Pre-training with Spatial Noise Curriculum Learning [12.697842097171119]
We present a curriculum learning mechanism that adaptively augments the generated regions, which allows the model to consistently acquire a useful learning signal.
Our experiments show that our approach improves on the MoCo v2 baseline by a large margin on multiple object-level tasks when pre-training on multi-object scene image datasets.
arXiv Detail & Related papers (2021-11-26T18:29:57Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning; the mixup idea is sketched after this entry.
arXiv Detail & Related papers (2021-07-30T19:24:07Z)
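A minimal sketch of the background-mixup idea follows; `object_mask` is assumed to come from a localization step such as ContraCAM, and `beta` is an illustrative mixing coefficient rather than the paper's setting:

```python
import torch

def background_mixup(image, other_image, object_mask, beta=0.5):
    # Keep the localized object region intact and blend only the background
    # with another image, weakening background cues during contrastive training.
    mixed_background = beta * image + (1 - beta) * other_image
    return object_mask * image + (1 - object_mask) * mixed_background
```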
- SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses [82.56853587380168]
We propose a novel self-supervised image rectification (SIR) method based on an important insight: the rectified results of distorted images of the same scene from different lenses should be the same (see the sketch after this entry).
We leverage a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters.
Our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art methods.
arXiv Detail & Related papers (2020-11-30T08:23:25Z)
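The consistency insight behind SIR, that rectified outputs of the same scene seen through different lenses should agree, suggests a loss of roughly this shape (a sketch only; the paper's full objective also involves a differentiable warping module):

```python
import torch

def rectification_consistency(rectified_views):
    # Penalize pairwise disagreement between rectified images of the same
    # scene produced from differently distorted inputs.
    n = len(rectified_views)
    loss = sum((rectified_views[i] - rectified_views[j]).abs().mean()
               for i in range(n) for j in range(i + 1, n))
    return loss / max(n * (n - 1) // 2, 1)
```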
- Combining Semantic Guidance and Deep Reinforcement Learning For Generating Human Level Paintings [22.889059874754242]
Generation of stroke-based non-photorealistic imagery is an important problem in the computer vision community.
Previous methods have been limited to datasets with little variation in position, scale and saliency of the foreground object.
We propose a Semantic Guidance pipeline with a bi-level painting procedure for learning the distinction between foreground and background brush strokes at training time.
arXiv Detail & Related papers (2020-11-25T09:00:04Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Distilling Localization for Self-Supervised Representation Learning [82.79808902674282]
Contrastive learning has revolutionized unsupervised representation learning.
Current contrastive models are ineffective at localizing the foreground object.
We propose a data-driven approach for learning invariance to backgrounds.
arXiv Detail & Related papers (2020-04-14T16:29:42Z)