Auto-Weighted Layer Representation Based View Synthesis Distortion
Estimation for 3-D Video Coding
- URL: http://arxiv.org/abs/2201.02420v1
- Date: Fri, 7 Jan 2022 12:12:41 GMT
- Title: Auto-Weighted Layer Representation Based View Synthesis Distortion
Estimation for 3-D Video Coding
- Authors: Jian Jin, Xingxing Zhang, Lili Meng, Weisi Lin, Jie Liang, Huaxiang
Zhang, Yao Zhao
- Abstract summary: In this paper, an auto-weighted layer representation based view synthesis distortion estimation model is developed.
The proposed method outperforms the relevant state-of-the-art methods in both accuracy and efficiency.
- Score: 78.53837757673597
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, various view synthesis distortion estimation models have been
studied to better serve 3-D video coding. However, they can hardly model the
quantitative relationship among different levels of depth change, texture
degeneration, and the view synthesis distortion (VSD), which is crucial
for rate-distortion optimization and rate allocation. In this paper, an
auto-weighted layer representation based view synthesis distortion estimation
model is developed. Firstly, the sub-VSD (S-VSD) is defined according to the
level of depth changes and their associated texture degeneration. After that, a
set of theoretical derivations demonstrates that the VSD can be approximately
decomposed into the S-VSDs multiplied by their associated weights. To obtain
the S-VSDs, a layer-based representation of the S-VSD is developed, where all the
pixels with the same level of depth change are represented by one layer,
enabling efficient S-VSD calculation at the layer level. Meanwhile, a nonlinear
mapping function is learnt to accurately represent the relationship between the
VSD and the S-VSDs, automatically providing weights for the S-VSDs during VSD
estimation. To learn such a function, a dataset of VSD values and their associated
S-VSDs is built. Experimental results show that the VSD can be accurately estimated
with the weights learnt by the nonlinear mapping function once its associated
S-VSDs are available. The proposed method outperforms the relevant
state-of-the-art methods in both accuracy and efficiency. The dataset and
source code of the proposed method will be available at
https://github.com/jianjin008/.
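Read literally, the decomposition described in the abstract states that the overall distortion is a weighted sum over depth-change levels. As a hedged reconstruction (the exact symbols are not given in this listing), it can be written as VSD ≈ Σ_{l=1}^{L} w_l · S-VSD_l, where l indexes the level of depth change, S-VSD_l is the sub-distortion accumulated over the layer of pixels with that level, and the weights w_l come from the learned nonlinear mapping. The Python sketch below is a hypothetical illustration, not the released code at the GitHub link above: the per-layer distortion proxy, the level grouping, and the one-hidden-layer mapping are stand-ins for the paper's exact S-VSD definition and learned function.

```python
# Minimal sketch (not the authors' code) of auto-weighted, layer-based
# VSD estimation as described in the abstract. The per-layer distortion
# proxy and the tiny network below are illustrative assumptions.
import numpy as np

def layer_based_svsd(texture_ref, texture_deg, depth_change_level, num_levels):
    """Group pixels by their level of depth change and accumulate one
    sub-VSD (S-VSD) value per layer.

    texture_ref, texture_deg : (H, W) original and degenerated texture.
    depth_change_level       : (H, W) integer level of depth change per pixel.
    """
    svsd = np.zeros(num_levels)
    sq_err = (texture_ref.astype(np.float64) - texture_deg.astype(np.float64)) ** 2
    for level in range(num_levels):
        mask = depth_change_level == level   # one "layer" per depth-change level
        svsd[level] = sq_err[mask].sum()     # per-layer distortion proxy
    return svsd

def estimate_vsd(svsd, W1, b1, w2, b2):
    """Nonlinear mapping from the S-VSD vector to the estimated VSD.
    A one-hidden-layer network stands in for the learned mapping that
    auto-weights the S-VSDs."""
    h = np.tanh(svsd @ W1 + b1)
    return float(h @ w2 + b2)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    H, W, L = 64, 64, 4
    tex_ref = rng.random((H, W))
    tex_deg = tex_ref + 0.05 * rng.standard_normal((H, W))
    levels = rng.integers(0, L, size=(H, W))

    svsd = layer_based_svsd(tex_ref, tex_deg, levels, L)
    # Randomly initialised mapping; in the paper these parameters are learnt
    # from a dataset of (S-VSDs, VSD) pairs.
    W1 = rng.standard_normal((L, 8)) * 0.1
    b1 = np.zeros(8)
    w2 = rng.standard_normal(8) * 0.1
    b2 = 0.0
    print("S-VSDs per layer:", svsd)
    print("Estimated VSD   :", estimate_vsd(svsd, W1, b1, w2, b2))
```

The grouping step is what would make the per-layer computation efficient: each pixel contributes to exactly one layer, so all S-VSDs are obtained in a single pass over the image before the learned mapping combines them.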
Related papers
- EVolSplat: Efficient Volume-based Gaussian Splatting for Urban View Synthesis [61.1662426227688]
Existing NeRF and 3DGS-based methods show promising results in achieving photorealistic renderings but require slow, per-scene optimization.
We introduce EVolSplat, an efficient 3D Gaussian Splatting model for urban scenes that works in a feed-forward manner.
arXiv Detail & Related papers (2025-03-26T02:47:27Z) - RewardSDS: Aligning Score Distillation via Reward-Weighted Sampling [14.725841457150414]
RewardSDS weights noise samples based on alignment scores from a reward model, producing a weighted SDS loss.
This loss prioritizes gradients from noise samples that yield aligned high-reward output.
We evaluate RewardSDS and RewardVSD on text-to-image, 2D editing, and text-to-3D generation tasks.
arXiv Detail & Related papers (2025-03-12T17:59:47Z) - A Lesson in Splats: Teacher-Guided Diffusion for 3D Gaussian Splats Generation with 2D Supervision [65.33043028101471]
We introduce a diffusion model for Gaussian Splats, SplatDiffusion, to enable generation of three-dimensional structures from single images.
Existing methods rely on deterministic, feed-forward predictions, which limit their ability to handle the inherent ambiguity of 3D inference from 2D data.
arXiv Detail & Related papers (2024-12-01T00:29:57Z) - Semantic Score Distillation Sampling for Compositional Text-to-3D Generation [28.88237230872795]
Generating high-quality 3D assets from textual descriptions remains a pivotal challenge in computer graphics and vision research.
We introduce a novel SDS approach, designed to improve the expressiveness and accuracy of compositional text-to-3D generation.
Our approach integrates new semantic embeddings that maintain consistency across different rendering views.
By leveraging explicit semantic guidance, our method unlocks the compositional capabilities of existing pre-trained diffusion models.
arXiv Detail & Related papers (2024-10-11T17:26:00Z) - DreamMapping: High-Fidelity Text-to-3D Generation via Variational Distribution Mapping [20.7584503748821]
Score Distillation Sampling (SDS) has emerged as a prevalent technique for text-to-3D generation, enabling 3D content creation by distilling view-dependent information from text-to-2D guidance.
We conduct a thorough analysis of SDS and refine its formulation, finding that the core design is to model the distribution of rendered images.
We introduce a novel strategy called Variational Distribution Mapping (VDM), which expedites the distribution modeling process by regarding the rendered images as instances of degradation from diffusion-based generation.
arXiv Detail & Related papers (2024-09-08T14:04:48Z) - Optimizing 3D Gaussian Splatting for Sparse Viewpoint Scene Reconstruction [11.840097269724792]
3D Gaussian Splatting (3DGS) has emerged as a promising approach for 3D scene representation, offering a reduction in computational overhead compared to Neural Radiance Fields (NeRF).
We introduce SVS-GS, a novel framework for Sparse Viewpoint Scene reconstruction that integrates a 3D Gaussian smoothing filter to suppress artifacts.
arXiv Detail & Related papers (2024-09-05T03:18:04Z) - SDL-MVS: View Space and Depth Deformable Learning Paradigm for Multi-View Stereo Reconstruction in Remote Sensing [12.506628755166814]
We re-examine the deformable learning method in the Multi-View Stereo task and propose a novel paradigm based on view Space and Depth deformable Learning (SDL-MVS).
Our SDL-MVS aims to learn deformable interactions of features in different view spaces and to deformably model the depth ranges and intervals to enable highly accurate depth estimation.
Experiments on LuoJia-MVS and WHU datasets show that our SDL-MVS reaches state-of-the-art performance.
arXiv Detail & Related papers (2024-05-27T12:59:46Z) - SAGS: Structure-Aware 3D Gaussian Splatting [53.6730827668389]
We propose a structure-aware Gaussian Splatting method (SAGS) that implicitly encodes the geometry of the scene.
SAGS achieves state-of-the-art rendering performance and reduced storage requirements on benchmark novel-view synthesis datasets.
arXiv Detail & Related papers (2024-04-29T23:26:30Z) - Consistent3D: Towards Consistent High-Fidelity Text-to-3D Generation with Deterministic Sampling Prior [87.55592645191122]
Score distillation sampling (SDS) and its variants have greatly boosted the development of text-to-3D generation, but remain vulnerable to geometry collapse and poor textures.
We propose a novel and effective "Consistent3D" method that explores the ODE deterministic sampling prior for text-to-3D generation.
Experimental results show the efficacy of our Consistent3D in generating high-fidelity and diverse 3D objects and large-scale scenes.
arXiv Detail & Related papers (2024-01-17T08:32:07Z) - RadOcc: Learning Cross-Modality Occupancy Knowledge through Rendering
Assisted Distillation [50.35403070279804]
3D occupancy prediction is an emerging task that aims to estimate the occupancy states and semantics of 3D scenes using multi-view images.
We propose RadOcc, a Rendering assisted distillation paradigm for 3D Occupancy prediction.
arXiv Detail & Related papers (2023-12-19T03:39:56Z) - StableDreamer: Taming Noisy Score Distillation Sampling for Text-to-3D [88.66678730537777]
We present StableDreamer, a methodology incorporating three advances.
First, we formalize the equivalence of the SDS generative prior and a simple supervised L2 reconstruction loss.
Second, our analysis shows that while image-space diffusion contributes to geometric precision, latent-space diffusion is crucial for vivid color rendition.
arXiv Detail & Related papers (2023-12-02T02:27:58Z) - Stable View Synthesis [100.86844680362196]
We present Stable View Synthesis (SVS).
Given a set of source images depicting a scene from freely distributed viewpoints, SVS synthesizes new views of the scene.
SVS outperforms state-of-the-art view synthesis methods both quantitatively and qualitatively on three diverse real-world datasets.
arXiv Detail & Related papers (2020-11-14T07:24:43Z)