Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
- URL: http://arxiv.org/abs/2011.03428v2
- Date: Mon, 9 Nov 2020 15:43:42 GMT
- Title: Illumination Normalization by Partially Impossible Encoder-Decoder Cost Function
- Authors: Steve Dias Da Cruz, Bertram Taetz, Thomas Stifter, Didier Stricker
- Abstract summary: We introduce a new strategy for the cost function formulation of encoder-decoder networks to average out all the unimportant information in the input images.
Our method exploits the availability of identical sceneries under different illumination and environmental conditions.
Its applicability is assessed on three publicly available datasets.
- Score: 13.618797548020462
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Images recorded during the lifetime of computer vision based systems undergo
a wide range of illumination and environmental conditions affecting the
reliability of previously trained machine learning models. Image normalization
is hence a valuable preprocessing component to enhance the models' robustness.
To this end, we introduce a new strategy for the cost function formulation of
encoder-decoder networks to average out all the unimportant information in the
input images (e.g. environmental features and illumination changes) to focus on
the reconstruction of the salient features (e.g. class instances). Our method
exploits the availability of identical sceneries under different illumination
and environmental conditions for which we formulate a partially impossible
reconstruction target: the input image will not convey enough information to
reconstruct the target in its entirety. Its applicability is assessed on three
publicly available datasets. We combine the triplet loss as a regularizer in
the latent space representation and a nearest neighbour search to improve the
generalization to unseen illuminations and class instances. The importance of
the aforementioned post-processing is highlighted on an automotive application.
To this end, we release a synthetic dataset of sceneries from three different
passenger compartments where each scenery is rendered under ten different
illumination and environmental conditions: see https://sviro.kl.dfki.de
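The two ingredients of the abstract can be illustrated with a minimal NumPy sketch: the reconstruction target is drawn from a *different* illumination variant of the same scene (so the input can never convey enough information to reconstruct it exactly), and at test time a nearest-neighbour search in latent space maps unseen illuminations onto known scenes. All names, shapes, and the identity "encoder" below are illustrative assumptions, not the paper's actual code:

```python
import numpy as np

rng = np.random.default_rng(0)

def partially_impossible_target(variants, input_idx):
    """Draw the reconstruction target from the *other* illumination
    variants of the same scene: the input cannot convey the target's
    illumination, so only the shared scene content is reconstructible."""
    candidates = [i for i in range(len(variants)) if i != input_idx]
    return variants[rng.choice(candidates)]

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Toy data: one scene rendered under 10 illuminations (flattened 8x8 images).
variants = rng.random((10, 64))
x = variants[0]                                   # network input
target = partially_impossible_target(variants, 0)
# In training one would minimize mse(decoder(encoder(x)), target);
# here we just evaluate the cost between input and target directly.
loss = mse(x, target)

# Nearest-neighbour search in a (here: identity) latent space, as a
# stand-in for mapping an unseen illumination onto a known scene.
train_latents = variants                          # stand-in for encoder outputs
query = variants[3] + 0.01 * rng.standard_normal(64)
nn_idx = int(np.argmin(np.linalg.norm(train_latents - query, axis=1)))
```

Because the target's illumination is unpredictable from the input, the optimal reconstruction tends toward the average over all illumination variants, which is exactly the illumination-normalized image the method aims for.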
Related papers
- Drive-1-to-3: Enriching Diffusion Priors for Novel View Synthesis of Real Vehicles [81.29018359825872]
This paper consolidates a set of good practices to finetune large pretrained models for a real-world task.
Specifically, we develop several strategies to account for discrepancies between the synthetic data and real driving data.
Our insights lead to effective finetuning that results in a 68.8% reduction in FID for novel view synthesis over prior art.
arXiv Detail & Related papers (2024-12-19T03:39:13Z)
- IDArb: Intrinsic Decomposition for Arbitrary Number of Input Views and Illuminations [64.07859467542664]
Capturing geometric and material information from images remains a fundamental challenge in computer vision and graphics.
Traditional optimization-based methods often require hours of computational time to reconstruct geometry, material properties, and environmental lighting from dense multi-view inputs.
We introduce IDArb, a diffusion-based model designed to perform intrinsic decomposition on an arbitrary number of images under varying illuminations.
arXiv Detail & Related papers (2024-12-16T18:52:56Z)
- KRONC: Keypoint-based Robust Camera Optimization for 3D Car Reconstruction [58.04846444985808]
This paper introduces KRONC, a novel approach aimed at inferring view poses by leveraging prior knowledge about the object to reconstruct and its representation through semantic keypoints.
With a focus on vehicle scenes, KRONC is able to estimate the position of the views as a solution to a light optimization problem targeting the convergence of keypoints' back-projections to a singular point.
arXiv Detail & Related papers (2024-09-09T08:08:05Z)
- UpFusion: Novel View Diffusion from Unposed Sparse View Observations [66.36092764694502]
UpFusion can perform novel view synthesis and infer 3D representations for an object given a sparse set of reference images.
We show that this mechanism allows generating high-fidelity novel views while improving the synthesis quality given additional (unposed) images.
arXiv Detail & Related papers (2023-12-11T18:59:55Z)
- Neural Radiance Transfer Fields for Relightable Novel-view Synthesis with Global Illumination [63.992213016011235]
We propose a method for scene relighting under novel views by learning a neural precomputed radiance transfer function.
Our method can be solely supervised on a set of real images of the scene under a single unknown lighting condition.
Results show that the recovered disentanglement of scene parameters improves significantly over the current state of the art.
arXiv Detail & Related papers (2022-07-27T16:07:48Z)
- Spatio-Temporal Outdoor Lighting Aggregation on Image Sequences using Transformer Networks [23.6427456783115]
In this work, we focus on outdoor lighting estimation by aggregating individual noisy estimates from images.
Recent work based on deep neural networks has shown promising results for single-image lighting estimation, but suffers from limited robustness.
We tackle this problem by combining lighting estimates from several image views sampled in the angular and temporal domain of an image sequence.
arXiv Detail & Related papers (2022-02-18T14:11:16Z)
- Learning Efficient Photometric Feature Transform for Multi-view Stereo [37.26574529243778]
We learn to convert the per-pixel photometric information at each view into spatially distinctive and view-invariant low-level features.
Our framework automatically adapts to and makes efficient use of the geometric information available in different forms of input data.
arXiv Detail & Related papers (2021-03-27T02:53:15Z)
- Scene relighting with illumination estimation in the latent space on an encoder-decoder scheme [68.8204255655161]
In this report we present the methods we explored to achieve that goal.
Our models are trained on a rendered dataset of artificial locations with varied scene content, light source location and color temperature.
With this dataset, we used a network with illumination estimation component aiming to infer and replace light conditions in the latent space representation of the concerned scenes.
arXiv Detail & Related papers (2020-06-03T15:25:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.