Weakly But Deeply Supervised Occlusion-Reasoned Parametric Layouts
- URL: http://arxiv.org/abs/2104.06730v1
- Date: Wed, 14 Apr 2021 09:32:29 GMT
- Authors: Buyu Liu, Bingbing Zhuang, Manmohan Chandraker
- Abstract summary: We propose an end-to-end network that takes a single perspective RGB image of a complex road scene as input, to produce occlusion-reasoned layouts in perspective space.
The only human annotations required by our method are for parametric attributes that are cheaper and less ambiguous to obtain.
We validate our approach on two public datasets, KITTI and NuScenes, to achieve state-of-the-art results with considerably lower human supervision.
- Score: 87.370534321618
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose an end-to-end network that takes a single perspective RGB image of
a complex road scene as input, to produce occlusion-reasoned layouts in
perspective space as well as a top-view parametric space. In contrast to prior
works that require dense supervision such as semantic labels in perspective
view, the only human annotations required by our method are for parametric
attributes that are cheaper and less ambiguous to obtain. To solve this
challenging task, our design comprises modules that incorporate inductive
biases to learn occlusion reasoning, geometric transformation and semantic
abstraction, where each module may be supervised by appropriately transforming
the parametric annotations. We demonstrate how our design choices and proposed
deep supervision help achieve accurate predictions and meaningful
representations. We validate our approach on two public datasets, KITTI and
NuScenes, to achieve state-of-the-art results with considerably lower human
supervision.
Related papers
- Combining Optimal Transport and Embedding-Based Approaches for More Expressiveness in Unsupervised Graph Alignment [19.145556156889064]
Unsupervised graph alignment finds the one-to-one node correspondence between a pair of attributed graphs by only exploiting graph structure and node features.
We propose a principled approach to combine their advantages motivated by theoretical analysis of model expressiveness.
We are the first to guarantee the one-to-one matching constraint by reducing the problem to maximum weight matching.
arXiv Detail & Related papers (2024-06-19T04:57:35Z) - 360 Layout Estimation via Orthogonal Planes Disentanglement and Multi-view Geometric Consistency Perception [56.84921040837699]
Existing panoramic layout estimation solutions tend to recover room boundaries from a vertically compressed sequence, yielding imprecise results.
We propose an orthogonal plane disentanglement network (termed DOPNet) to distinguish ambiguous semantics.
We also present an unsupervised adaptation technique tailored for horizon-depth and ratio representations.
Our solution outperforms other SoTA models on both monocular layout estimation and multi-view layout estimation tasks.
arXiv Detail & Related papers (2023-12-26T12:16:03Z) - Improving Semantic Correspondence with Viewpoint-Guided Spherical Maps [39.00415825387414]
We propose a new approach for semantic correspondence estimation that supplements discriminative features with 3D understanding via a weak geometric spherical prior.
Compared to more involved 3D pipelines, our model only requires weak viewpoint information, and the simplicity of our spherical representation enables us to inject informative geometric priors into the model during training.
We present results on the challenging SPair-71k dataset, where our approach demonstrates it is capable of distinguishing between symmetric views and repeated parts across many object categories.
arXiv Detail & Related papers (2023-12-20T17:35:24Z) - Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection [84.52197307286681]
We propose a novel multitask auto-encoding transformation (MAET) model to enhance object detection in a dark environment.
In a self-supervised manner, the MAET learns the intrinsic visual structure by encoding and decoding the realistic illumination-degrading transformation.
We have achieved the state-of-the-art performance using synthetic and real-world datasets.
arXiv Detail & Related papers (2022-05-06T16:27:14Z) - Learning Oriented Remote Sensing Object Detection via Naive Geometric Computing [38.508709334835316]
We propose a mechanism that learns the regression of horizontal proposals, oriented proposals, and rotation angles of objects in a consistent manner.
Our proposed idea is simple and intuitive and can be readily implemented.
arXiv Detail & Related papers (2021-12-01T13:58:42Z) - An Adaptive Framework for Learning Unsupervised Depth Completion [59.17364202590475]
We present a method to infer a dense depth map from a color image and associated sparse depth measurements.
We show that regularization and co-visibility are related via the fitness of the model to data and can be unified into a single framework.
arXiv Detail & Related papers (2021-06-06T02:27:55Z) - Self-supervised Geometric Perception [96.89966337518854]
Self-supervised geometric perception is a framework to learn a feature descriptor for correspondence matching without any ground-truth geometric model labels.
We show that SGP achieves state-of-the-art performance that is on par with or superior to the supervised oracles trained using ground-truth labels.
arXiv Detail & Related papers (2021-03-04T15:34:43Z) - Example-Guided Image Synthesis across Arbitrary Scenes using Masked Spatial-Channel Attention and Self-Supervision [83.33283892171562]
Example-guided image synthesis aims to synthesize an image from a semantic label map and an exemplary image.
In this paper, we tackle a more challenging and general task, where the exemplar is an arbitrary scene image that is semantically different from the given label map.
We propose an end-to-end network for joint global and local feature alignment and synthesis.
arXiv Detail & Related papers (2020-04-18T18:17:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.