RefinedFields: Radiance Fields Refinement for Unconstrained Scenes
- URL: http://arxiv.org/abs/2312.00639v3
- Date: Fri, 19 Apr 2024 14:14:59 GMT
- Title: RefinedFields: Radiance Fields Refinement for Unconstrained Scenes
- Authors: Karim Kassab, Antoine Schnepf, Jean-Yves Franceschi, Laurent Caraffa, Jeremie Mary, Valérie Gouet-Brunet
- Abstract summary: We propose RefinedFields, to the best of our knowledge, the first method leveraging pre-trained models to improve in-the-wild scene modeling.
We employ pre-trained networks to refine K-Planes representations via optimization guidance.
We carry out extensive experiments and verify the merit of our method on synthetic data and real tourism photo collections.
- Score: 7.421845364041002
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modeling large scenes from unconstrained images has proven to be a major challenge in computer vision. Existing methods tackling in-the-wild scene modeling operate in closed-world settings, where no conditioning on priors acquired from real-world images is present. We propose RefinedFields, which is, to the best of our knowledge, the first method leveraging pre-trained models to improve in-the-wild scene modeling. We employ pre-trained networks to refine K-Planes representations via optimization guidance using an alternating training procedure. We carry out extensive experiments and verify the merit of our method on synthetic data and real tourism photo collections. RefinedFields enhances rendered scenes with richer details and improves upon its base representation on the task of novel view synthesis in the wild. Our project page can be found at https://refinedfields.github.io.
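The alternating procedure lends itself to a short sketch. Here `kplanes` (a factored scene representation with a differentiable `render`) and `prior` (a frozen pre-trained network exposing a `refine` call) are hypothetical stand-ins for the paper's components, not its actual interfaces:

```python
import torch

def refine_scene(kplanes, prior, rays, target_rgb, steps=1000, cycle=100):
    # Alternating optimization: a photometric phase fits the K-Planes
    # representation to the input photos, then a prior-guided phase pulls
    # renders toward the output of the frozen pre-trained network.
    opt = torch.optim.Adam(kplanes.parameters(), lr=1e-3)
    for step in range(steps):
        opt.zero_grad()
        rgb = kplanes.render(rays)                    # volumetric render
        if step % cycle < cycle // 2:                 # photometric phase
            loss = torch.nn.functional.mse_loss(rgb, target_rgb)
        else:                                         # prior-guided phase
            with torch.no_grad():
                refined = prior.refine(rgb)           # e.g. a denoising pass
            loss = torch.nn.functional.mse_loss(rgb, refined)
        loss.backward()
        opt.step()
```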
Related papers
- Image-Editing Specialists: An RLAIF Approach for Diffusion Models [28.807572302899004]
We present a novel approach to training specialized instruction-based image-editing diffusion models.
We introduce an online reinforcement learning framework that aligns the diffusion model with human preferences.
Experimental results demonstrate that our models can perform intricate edits in complex scenes, after just 10 training steps.
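A rough illustration of such an online alignment loop, using a REINFORCE-style objective; `model.sample` (returning images with their log-probabilities) and `reward_fn` are assumed interfaces, and the paper's actual algorithm may differ:

```python
import torch

def rlaif_step(model, reward_fn, prompts, optimizer):
    # One hypothetical online RL step: sample edits, score them with an
    # AI-feedback reward, and reinforce samples that beat the batch mean.
    images, log_probs = model.sample(prompts)     # assumed interface
    with torch.no_grad():
        rewards = reward_fn(images, prompts)      # preference scores
    advantages = rewards - rewards.mean()         # simple baseline
    loss = -(advantages * log_probs).mean()       # REINFORCE objective
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```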
arXiv Detail & Related papers (2025-04-17T10:46:39Z)
- FoundIR: Unleashing Million-scale Training Data to Advance Foundation Models for Image Restoration [66.61201445650323]
Existing methods suffer from a generalization bottleneck in real-world scenarios.
We contribute a million-scale dataset with two notable advantages over existing training data.
We propose a robust model, FoundIR, to better address a broader range of restoration tasks in real-world scenarios.
arXiv Detail & Related papers (2024-12-02T12:08:40Z)
- PrefPaint: Aligning Image Inpainting Diffusion Model with Human Preference [62.72779589895124]
We make the first attempt to align diffusion models for image inpainting with human aesthetic standards via a reinforcement learning framework.
We train a reward model with a dataset we construct, consisting of nearly 51,000 images annotated with human preferences.
Experiments on inpainting comparison and downstream tasks, such as image extension and 3D reconstruction, demonstrate the effectiveness of our approach.
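The abstract does not state the reward model's training objective; a common choice for pairwise preference data is a Bradley-Terry loss, sketched here with a hypothetical `reward_model`:

```python
import torch.nn.functional as F

def preference_loss(reward_model, preferred, rejected):
    # Bradley-Terry pairwise objective: the learned reward should rank the
    # human-preferred inpainting above the rejected alternative.
    r_pos = reward_model(preferred)
    r_neg = reward_model(rejected)
    return -F.logsigmoid(r_pos - r_neg).mean()
```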
arXiv Detail & Related papers (2024-10-29T11:49:39Z)
- Reconstructing Training Data From Real World Models Trained with Transfer Learning [29.028185455223785]
We present a novel approach enabling data reconstruction in realistic settings for models trained on high-resolution images.
Our method adapts the reconstruction scheme of arXiv:2206.07758 to real-world scenarios.
We introduce a novel clustering-based method to identify good reconstructions from thousands of candidates.
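One plausible instantiation of such a selection step, with assumed details (k-means over flattened candidates, keeping the medoid of each non-singleton cluster):

```python
import numpy as np
from sklearn.cluster import KMeans

def select_reconstructions(candidates, n_clusters=50):
    # Cluster thousands of candidate reconstructions; tight clusters hint
    # that many runs converged to the same underlying training image, so
    # keep one representative (the medoid) per populated cluster.
    X = np.asarray(candidates).reshape(len(candidates), -1)
    km = KMeans(n_clusters=n_clusters, n_init=10).fit(X)
    picks = []
    for c in range(n_clusters):
        members = np.where(km.labels_ == c)[0]
        if len(members) < 2:                       # skip singleton clusters
            continue
        dists = np.linalg.norm(X[members] - km.cluster_centers_[c], axis=1)
        picks.append(int(members[np.argmin(dists)]))
    return picks                                   # indices into candidates
```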
arXiv Detail & Related papers (2024-07-22T17:59:10Z)
- MaRINeR: Enhancing Novel Views by Matching Rendered Images with Nearby References [49.71130133080821]
MaRINeR is a refinement method that leverages information of a nearby mapping image to improve the rendering of a target viewpoint.
We show improved renderings in quantitative metrics and qualitative examples from both explicit and implicit scene representations.
arXiv Detail & Related papers (2024-07-18T17:50:03Z)
- TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-based Scenes [58.180556221044235]
We present a new approach to bridge the domain gap between synthetic and real-world data for unmanned aerial vehicle (UAV)-based perception.
Our formulation is designed for dynamic scenes containing small moving objects or human actions.
We evaluate its performance on challenging datasets, including Okutama Action and UG2.
arXiv Detail & Related papers (2024-05-04T21:55:33Z)
- Boosting Self-Supervision for Single-View Scene Completion via Knowledge Distillation [39.08243715525956]
Inferring scene geometry from images via Structure from Motion is a long-standing and fundamental problem in computer vision.
With the rise of neural radiance fields (NeRFs), implicit representations have also become popular for scene completion.
We propose to fuse the scene reconstruction from multiple images and distill this knowledge into a more accurate single-view scene reconstruction.
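A minimal sketch of the distillation idea, with hypothetical `teacher` (the multi-view reconstruction) and `student` (the single-view model) interfaces rather than the paper's actual architecture:

```python
import torch
import torch.nn.functional as F

def distill_loss(student, teacher, image, query_points):
    # The multi-view teacher supplies density/colour targets at sampled
    # 3D points; the student must predict them from a single view.
    with torch.no_grad():
        t_density, t_rgb = teacher(query_points)         # fused multi-view
    s_density, s_rgb = student(image, query_points)      # one input view
    return F.mse_loss(s_density, t_density) + F.mse_loss(s_rgb, t_rgb)
```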
arXiv Detail & Related papers (2024-04-11T17:30:24Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
- TensoIR: Tensorial Inverse Rendering [51.57268311847087]
TensoIR is a novel inverse rendering approach based on tensor factorization and neural fields.
It builds on TensoRF, a state-of-the-art approach for radiance field modeling.
arXiv Detail & Related papers (2023-04-24T21:39:13Z)
- Progressively Optimized Local Radiance Fields for Robust View Synthesis [76.55036080270347]
We present an algorithm for reconstructing the radiance field of a large-scale scene from a single casually captured video.
For handling unknown poses, we jointly estimate the camera poses and the radiance field in a progressive manner.
For handling large unbounded scenes, we dynamically allocate new local radiance fields trained with frames within a temporal window.
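The allocation scheme can be sketched as below; `make_field` and the `register` call (joint pose estimation plus field fitting) are hypothetical stand-ins for the actual optimization:

```python
def reconstruct(frames, make_field, window=30, overlap=10):
    # Progressively register frames; when the temporal window of the
    # current local radiance field is exhausted, freeze it and spawn a
    # new one, warm-started on the overlapping frames.
    fields, poses = [], []
    current, start = make_field(), 0
    for i, frame in enumerate(frames):
        poses.append(current.register(frame))   # joint pose + field fit
        if i - start >= window:                 # window exhausted
            fields.append(current)
            current, start = make_field(), max(i - overlap, 0)
            for f in frames[start:i + 1]:       # warm-start on overlap
                current.register(f)
    fields.append(current)
    return fields, poses
```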
arXiv Detail & Related papers (2023-03-24T04:03:55Z)
- Neural Radiance Transfer Fields for Relightable Novel-view Synthesis with Global Illumination [63.992213016011235]
We propose a method for scene relighting under novel views by learning a neural precomputed radiance transfer function.
Our method can be solely supervised on a set of real images of the scene under a single unknown lighting condition.
Results show that the recovered disentanglement of scene parameters improves significantly over the current state of the art.
arXiv Detail & Related papers (2022-07-27T16:07:48Z)
- Modeling Image Composition for Complex Scene Generation [77.10533862854706]
We present a method that achieves state-of-the-art results on layout-to-image generation tasks.
After compressing RGB images into patch tokens, we propose the Transformer with Focal Attention (TwFA) to explore object-to-object, object-to-patch, and patch-to-patch dependencies.
arXiv Detail & Related papers (2022-06-02T08:34:25Z)
- ERF: Explicit Radiance Field Reconstruction From Scratch [12.254150867994163]
We propose a novel explicit dense 3D reconstruction approach that processes a set of images of a scene with sensor poses and calibrations and estimates a photo-real digital model.
One of the key innovations is that the underlying volumetric representation is completely explicit.
We show that our method is general and practical: it does not require a highly controlled lab setup for capture, and can reconstruct scenes containing a wide variety of objects.
arXiv Detail & Related papers (2022-02-28T19:37:12Z)
- Compositional Transformers for Scene Generation [13.633811200719627]
We introduce the GANformer2 model, an iterative object-oriented transformer, explored for the task of generative modeling.
We show it achieves state-of-the-art performance in terms of visual quality, diversity and consistency.
Further experiments demonstrate the model's disentanglement and provide a deeper insight into its generative process.
arXiv Detail & Related papers (2021-11-17T08:11:42Z)
- Semantic Palette: Guiding Scene Generation with Class Proportions [34.746963256847145]
We introduce a conditional framework with novel architecture designs and learning objectives, which effectively accommodates class proportions to guide the scene generation process.
Thanks to the semantic control, we can produce layouts close to the real distribution, helping enhance the whole scene generation process.
We demonstrate the merit of our approach for data augmentation: semantic segmenters trained on real and synthesized layout-image pairs outperform models trained only on real pairs.
arXiv Detail & Related papers (2021-06-03T07:04:00Z)
- Free View Synthesis [100.86844680362196]
We present a method for novel view synthesis from input images that are freely distributed around a scene.
Our method does not rely on a regular arrangement of input views, can synthesize images for free camera movement through the scene, and works for general scenes with unconstrained geometric layouts.
arXiv Detail & Related papers (2020-08-12T18:16:08Z)
- A Generative Model for Generic Light Field Reconstruction [15.394019131959096]
We present for the first time a generative model for 4D light field patches using variational autoencoders.
We develop a generative model conditioned on the central view of the light field and incorporate this as a prior in an energy minimization framework.
Our proposed method demonstrates good reconstruction, with performance approaching end-to-end trained networks.
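A minimal sketch of plugging the trained generative model into energy minimization; `decoder`, its `latent_dim` attribute, and the degradation operator `forward_op` are assumed interfaces:

```python
import torch

def reconstruct_patch(decoder, observation, forward_op, steps=200, lam=0.1):
    # Optimize a latent code z so the decoded light-field patch matches
    # the observation under forward_op, with a Gaussian prior on z
    # standing in for the learned VAE prior.
    z = torch.zeros(decoder.latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=1e-2)
    for _ in range(steps):
        opt.zero_grad()
        x = decoder(z)                               # generated patch
        energy = ((forward_op(x) - observation) ** 2).sum() + lam * (z ** 2).sum()
        energy.backward()
        opt.step()
    return decoder(z).detach()
```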
arXiv Detail & Related papers (2020-05-13T18:27:42Z)
- Two-shot Spatially-varying BRDF and Shape Estimation [89.29020624201708]
We propose a novel deep learning architecture with a stage-wise estimation of shape and SVBRDF.
We create a large-scale synthetic training dataset with domain-randomized geometry and realistic materials.
Experiments on both synthetic and real-world datasets show that our network trained on a synthetic dataset can generalize well to real-world images.
arXiv Detail & Related papers (2020-04-01T12:56:13Z)
- Deep CG2Real: Synthetic-to-Real Translation via Image Disentanglement [78.58603635621591]
Training an unpaired synthetic-to-real translation network in image space is severely under-constrained.
We propose a semi-supervised approach that operates on the disentangled shading and albedo layers of the image.
Our two-stage pipeline first learns to predict accurate shading in a supervised fashion using physically-based renderings as targets.
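Stage one reduces to plain supervised regression; the log-space L1 loss and the `net` interface here are assumptions for illustration, not the paper's stated choices:

```python
import torch
import torch.nn.functional as F

def shading_loss(net, synthetic_rgb, gt_shading):
    # Supervised shading prediction: regress the shading layer of a
    # synthetic image against the physically-based render's ground truth,
    # in log space to compress the dynamic range (an assumed choice).
    pred = net(synthetic_rgb).clamp(min=0)
    return F.l1_loss(torch.log1p(pred), torch.log1p(gt_shading))
```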
arXiv Detail & Related papers (2020-03-27T21:45:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.