Distortion-aware Transformer in 360° Salient Object Detection
- URL: http://arxiv.org/abs/2308.03359v1
- Date: Mon, 7 Aug 2023 07:28:24 GMT
- Title: Distortion-aware Transformer in 360° Salient Object Detection
- Authors: Yinjie Zhao, Lichen Zhao, Qian Yu, Jing Zhang, Lu Sheng, Dong Xu
- Abstract summary: We propose a Transformer-based model called DATFormer to address the distortion problem.
To exploit the unique characteristics of 360° data, we present a learnable relation matrix.
Our model outperforms existing 2D SOD (salient object detection) and 360° SOD methods.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: With the emergence of VR and AR, 360° data attracts increasing attention
from the computer vision and multimedia communities. Typically, 360° data
is projected into 2D ERP (equirectangular projection) images for feature
extraction. However, existing methods cannot handle the distortions that result
from the projection, hindering the development of 360-data-based tasks.
Therefore, in this paper, we propose a Transformer-based model called DATFormer
to address the distortion problem. We tackle this issue from two perspectives.
Firstly, we introduce two distortion-adaptive modules. The first is a
Distortion Mapping Module, which guides the model to pre-adapt to distorted
features globally. The second module is a Distortion-Adaptive Attention Block
that reduces local distortions on multi-scale features. Secondly, to exploit
the unique characteristics of 360° data, we present a learnable relation
matrix and use it as part of the positional embedding to further improve
performance. Extensive experiments are conducted on three public datasets, and
the results show that our model outperforms existing 2D SOD (salient object
detection) and 360° SOD methods.
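
The abstract does not give implementation details, but one common way to realize a learnable relation matrix used as part of the positional embedding is as a per-head bias added to the attention logits. The following is a minimal PyTorch sketch under that assumption; the names (RelationBiasedAttention, relation) are hypothetical, and the actual DATFormer design may differ.

```python
# Hedged sketch: self-attention with a learnable token-to-token relation matrix
# added to the attention logits, one plausible reading of the abstract.
import torch
import torch.nn as nn

class RelationBiasedAttention(nn.Module):
    """Multi-head self-attention whose logits are offset by a learnable
    relation matrix (a stand-in for a relation-based positional embedding)."""

    def __init__(self, dim: int, num_tokens: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.scale = (dim // num_heads) ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)
        # One learnable bias per head over all token pairs; for ERP inputs this
        # lets the model learn latitude-dependent (distortion-aware) relations.
        self.relation = nn.Parameter(torch.zeros(num_heads, num_tokens, num_tokens))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, N, C)
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, C // self.num_heads)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)              # each: (B, H, N, d)
        attn = (q @ k.transpose(-2, -1)) * self.scale     # (B, H, N, N)
        attn = (attn + self.relation).softmax(dim=-1)     # add relation bias
        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)

tokens = torch.randn(2, 196, 256)                         # e.g. 14x14 ERP patches
print(RelationBiasedAttention(256, 196)(tokens).shape)    # torch.Size([2, 196, 256])
```

Because every token pair gets its own learnable offset per head, the model is free to learn relations that vary with latitude, which is one plausible way to compensate for ERP distortion.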
Related papers
- R3D-AD: Reconstruction via Diffusion for 3D Anomaly Detection [12.207437451118036]
3D anomaly detection plays a crucial role in monitoring parts for localized inherent defects in precision manufacturing.
Embedding-based and reconstruction-based approaches are among the most popular and successful methods.
We propose R3D-AD, which reconstructs anomalous point clouds with a diffusion model for precise 3D anomaly detection.
arXiv Detail & Related papers (2024-07-15T16:10:58Z) - GEOcc: Geometrically Enhanced 3D Occupancy Network with Implicit-Explicit Depth Fusion and Contextual Self-Supervision [49.839374549646884]
This paper presents GEOcc, a Geometric-Enhanced Occupancy network tailored for vision-only surround-view perception.
Our approach achieves state-of-the-art performance on the Occ3D-nuScenes dataset while requiring the lowest image resolution and the lightest image backbone.
arXiv Detail & Related papers (2024-05-17T07:31:20Z) - OPDN: Omnidirectional Position-aware Deformable Network for
Omnidirectional Image Super-Resolution [18.138867445188293]
We propose a two-stage framework for 360° omnidirectional image super-resolution.
Our proposed method achieves superior performance and won the NTIRE 2023 challenge on 360° omnidirectional image super-resolution.
arXiv Detail & Related papers (2023-04-26T11:47:40Z) - View-aware Salient Object Detection for 360{\deg} Omnidirectional Image [33.43250302656753]
We construct a large-scale 360° ISOD dataset with object-level, pixel-wise annotations on equirectangular projection (ERP) images.
Inspired by humans' observing process, we propose a view-aware salient object detection method based on a Sample Adaptive View Transformer (SAVT) module.
arXiv Detail & Related papers (2022-09-27T07:44:08Z) - Towards Model Generalization for Monocular 3D Object Detection [57.25828870799331]
We present an effective unified camera-generalized paradigm (CGP) for Mono3D object detection.
We also propose the 2D-3D geometry-consistent object scaling strategy (GCOS) to bridge the gap via instance-level augmentation.
Our method, called DGMono3D, achieves remarkable performance on all evaluated datasets and surpasses the state-of-the-art unsupervised domain adaptation scheme.
arXiv Detail & Related papers (2022-05-23T23:05:07Z) - Homography Loss for Monocular 3D Object Detection [54.04870007473932]
A differentiable loss function, termed Homography Loss, is proposed; it exploits both 2D and 3D information.
Our method outperforms the other state-of-the-art methods by a large margin on the KITTI 3D dataset.
arXiv Detail & Related papers (2022-04-02T03:48:03Z) - Wide-angle Image Rectification: A Survey [86.36118799330802]
Wide-angle images contain distortions that violate the assumptions underlying pinhole camera models.
Image rectification, which aims to correct these distortions, can solve these problems.
We present a detailed description and discussion of the camera models used in different approaches.
Next, we review both traditional geometry-based image rectification methods and deep learning-based methods.
arXiv Detail & Related papers (2020-10-30T17:28:40Z) - Synthetic Training for Monocular Human Mesh Recovery [100.38109761268639]
This paper aims to estimate 3D mesh of multiple body parts with large-scale differences from a single RGB image.
The main challenge is the lack of training data with complete 3D annotations of all body parts in 2D images.
We propose a depth-to-scale (D2S) projection to incorporate the depth difference into the projection function to derive per-joint scale variants.
arXiv Detail & Related papers (2020-10-27T03:31:35Z) - Distortion-aware Monocular Depth Estimation for Omnidirectional Images [26.027353545874522]
We propose a Distortion-Aware Monocular Omnidirectional (DAMO) dense depth estimation network to address this challenge on indoor panoramas.
First, we introduce a distortion-aware module to extract calibrated semantic features from omnidirectional images.
Second, we introduce a plug-and-play spherical-aware weight matrix for our objective function to handle the uneven distribution of areas projected from a sphere.
arXiv Detail & Related papers (2020-10-18T08:47:57Z)