Related papers: Adaptable Deformable Convolutions for Semantic Segmentation of Fisheye Images in Autonomous Driving Systems

URL: http://arxiv.org/abs/2102.10191v1
Date: Fri, 19 Feb 2021 22:47:44 GMT
Title: Adaptable Deformable Convolutions for Semantic Segmentation of Fisheye Images in Autonomous Driving Systems
Authors: Clément Playout, Ola Ahmad, Freddy Lecue and Farida Cheriet
Abstract summary: We show that a CNN trained on standard images can be readily adapted to fisheye images. Our adaptation protocol mainly relies on modifying the support of the convolutions by using their deformable equivalents on top of pre-existing layers.
Score: 4.231909978425546
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Advanced Driver-Assistance Systems rely heavily on perception tasks such as semantic segmentation where images are captured from large field of view (FoV) cameras. State-of-the-art works have made considerable progress toward applying Convolutional Neural Networks (CNNs) to standard (rectilinear) images. However, the large FoV cameras used in autonomous vehicles produce fisheye images characterized by strong geometric distortion. This work demonstrates that a CNN trained on standard images can be readily adapted to fisheye images, which is crucial in real-world applications where time-consuming real-time data transformation must be avoided. Our adaptation protocol mainly relies on modifying the support of the convolutions by using their deformable equivalents on top of pre-existing layers. We prove that tuning an optimal support only requires a limited amount of labeled fisheye images, as a small number of training samples is sufficient to significantly improve an existing model's performance on wide-angle images. Furthermore, we show that finetuning the weights of the network is not necessary to achieve high performance once the deformable components are learned. Finally, we provide an in-depth analysis of the effect of the deformable convolutions, bringing elements of discussion on the behavior of CNN models.
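Illustration: the abstract describes changing the support of a convolution (where the kernel samples the input) while leaving the pre-trained weights untouched. The sketch below is a minimal pure-Python illustration of that idea, not the authors' implementation. The function names (bilinear, deform_conv2d), the 3x3 kernel size, the zero padding and the hand-set offsets are all assumptions made for the example; in the paper the offsets would be learned from labeled fisheye images rather than fixed by hand.

```python
import math

def bilinear(img, y, x):
    """Sample img at real-valued (y, x); points outside the image read as 0."""
    h, w = len(img), len(img[0])
    y0, x0 = math.floor(y), math.floor(x)
    dy, dx = y - y0, x - x0
    total = 0.0
    for yy, wy in ((y0, 1.0 - dy), (y0 + 1, dy)):
        for xx, wx in ((x0, 1.0 - dx), (x0 + 1, dx)):
            if 0 <= yy < h and 0 <= xx < w:
                total += wy * wx * img[yy][xx]
    return total

def deform_conv2d(img, weights, offsets=None):
    """3x3 deformable convolution (cross-correlation form), zero padding, stride 1.

    weights: 3x3 list, standing in for the frozen kernel of a pre-trained layer.
    offsets: offsets[i][j][k] = (dy, dx) shift of kernel tap k at output (i, j);
             None means all-zero offsets, i.e. an ordinary convolution.
    """
    h, w = len(img), len(img[0])
    taps = [(ky, kx) for ky in (-1, 0, 1) for kx in (-1, 0, 1)]
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            acc = 0.0
            for k, (ky, kx) in enumerate(taps):
                dy, dx = offsets[i][j][k] if offsets is not None else (0.0, 0.0)
                # Same weight as the ordinary layer; only the sampling point moves.
                acc += weights[ky + 1][kx + 1] * bilinear(img, i + ky + dy, j + kx + dx)
            out[i][j] = acc
    return out

# Toy 4x4 input and an identity kernel (assumed values, for illustration only).
img = [[float(4 * r + c) for c in range(4)] for r in range(4)]
identity = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]

# Zero offsets: the deformable layer reproduces the ordinary convolution.
plain = deform_conv2d(img, identity)

# Shift every tap half a pixel to the right: same weights, different support.
half_right = [[[(0.0, 0.5)] * 9 for _ in range(4)] for _ in range(4)]
shifted = deform_conv2d(img, identity, half_right)
```

With zero offsets the output matches the ordinary convolution, which is why a deformable layer can be placed on top of a pre-existing one without changing its behavior at initialization. Only the offsets then need to be tuned, consistent with the abstract's statement that finetuning the weights is not necessary.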

Related papers

DarSwin-Unet: Distortion Aware Encoder-Decoder Architecture [13.412728770638465]
We present an encoder-decoder model that adapts to distortions in wide-angle lenses by leveraging the physical characteristics defined by the radial distortion profile. In contrast to the original model, which only performs classification tasks, we introduce a U-Net architecture, DarSwin-Unet, designed for pixel-level tasks. Our approach enhances the model's capability to handle pixel-level tasks in wide-angle fisheye images, making it more effective for real-world applications.
arXiv Detail & Related papers (2024-07-24T14:52:18Z)
Deformable Convolution Based Road Scene Semantic Segmentation of Fisheye Images in Autonomous Driving [4.720434481945155]
This study investigates the effectiveness of modern Deformable Convolutional Neural Networks (DCNNs) for semantic segmentation tasks. Our experiments focus on segmenting the WoodScape fisheye image dataset into ten distinct classes, assessing the Deformable Networks' ability to capture intricate spatial relationships. The significant improvement in mIoU score resulting from integrating Deformable CNNs demonstrates their effectiveness in handling the geometric distortions present in fisheye imagery.
arXiv Detail & Related papers (2024-07-23T17:02:24Z)
Efficient Visual State Space Model for Image Deblurring [83.57239834238035]
Convolutional neural networks (CNNs) and Vision Transformers (ViTs) have achieved excellent performance in image restoration. We propose a simple yet effective visual state space model (EVSSM) for image deblurring.
arXiv Detail & Related papers (2024-05-23T09:13:36Z)
Adapting CNNs for Fisheye Cameras without Retraining [3.683202928838613]
In many applications it is beneficial to use non-conventional cameras, such as fisheye cameras, that have a larger field of view. The issue arises that these large-FOV images can't be rectified to a perspective projection without significant cropping of the original image. We propose Rectified Convolutions, a new approach for adapting pre-trained convolutional networks to operate with new non-perspective images.
arXiv Detail & Related papers (2024-04-12T01:36:00Z)
Convolution kernel adaptation to calibrated fisheye [45.90423821963144]
Convolution kernels are the basic structural component of convolutional neural networks (CNNs). We propose a method that leverages the calibration of cameras to deform the convolution kernel accordingly and adapt to the distortion. We show how, with just a brief fine-tuning stage in a small dataset, we improve the performance of the network for the calibrated fisheye.
arXiv Detail & Related papers (2024-02-02T14:44:50Z)
Learning Robust Multi-Scale Representation for Neural Radiance Fields from Unposed Images [65.41966114373373]
We present an improved solution to the neural image-based rendering problem in computer vision. The proposed approach could synthesize a realistic image of the scene from a novel viewpoint at test time.
arXiv Detail & Related papers (2023-11-08T08:18:23Z)
Distance Weighted Trans Network for Image Completion [52.318730994423106]
We propose a new architecture that relies on Distance-based Weighted Transformer (DWT) to better understand the relationships between an image's components. CNNs are used to augment the local texture information of coarse priors. DWT blocks are used to recover certain coarse textures and coherent visual structures.
arXiv Detail & Related papers (2023-10-11T12:46:11Z)
SimFIR: A Simple Framework for Fisheye Image Rectification with Self-supervised Representation Learning [105.01294305972037]
We introduce SimFIR, a framework for fisheye image rectification based on self-supervised representation learning. To learn fine-grained distortion representations, we first split a fisheye image into multiple patches and extract their representations with a Vision Transformer. The transfer performance on the downstream rectification task is remarkably boosted, which verifies the effectiveness of the learned representations.
arXiv Detail & Related papers (2023-08-17T15:20:17Z)
Spherical Image Inpainting with Frame Transformation and Data-driven Prior Deep Networks [13.406134708071345]
In this work, we focus on the challenging task of spherical image inpainting with a deep learning-based regularizer. We employ a fast directional spherical Haar framelet transform and develop a novel optimization framework based on a sparsity assumption of the framelet transform. We show that the proposed algorithms can effectively recover damaged spherical images and outperform approaches that use only a deep learning denoiser or a plug-and-play model.
arXiv Detail & Related papers (2022-09-29T07:51:27Z)
SIR: Self-supervised Image Rectification via Seeing the Same Scene from Multiple Different Lenses [82.56853587380168]
We propose a novel self-supervised image rectification (SIR) method based on an important insight that the rectified results of distorted images of the same scene from different lenses should be the same. We leverage a differentiable warping module to generate the rectified images and re-distorted images from the distortion parameters. Our method achieves comparable or even better performance than the supervised baseline method and representative state-of-the-art methods.
arXiv Detail & Related papers (2020-11-30T08:23:25Z)

This list is automatically generated from the titles and abstracts of the papers on this site.