Exploiting Radiance Fields for Grasp Generation on Novel Synthetic Views
- URL: http://arxiv.org/abs/2505.11467v1
- Date: Fri, 16 May 2025 17:23:09 GMT
- Title: Exploiting Radiance Fields for Grasp Generation on Novel Synthetic Views
- Authors: Abhishek Kashyap, Henrik Andreasson, Todor Stoyanov
- Abstract summary: We show initial results which indicate that novel view synthesis can provide additional context in generating grasp poses. Our experiments on the Graspnet-1billion dataset show that novel views contributed force-closure grasps. In the future we hope this work can be extended to improve grasp extraction from radiance fields constructed with a single input image.
- Score: 7.305342793164903
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Vision-based robot manipulation uses cameras to capture one or more images of a scene containing the objects to be manipulated. Taking multiple images can help if any object is occluded from one viewpoint but more visible from another viewpoint. However, the camera has to be moved to a sequence of suitable positions for capturing multiple images, which requires time and may not always be possible due to reachability constraints. So while additional images can produce more accurate grasp poses due to the extra information available, the time cost goes up with the number of additional views sampled. Scene representations like Gaussian Splatting are capable of rendering accurate photorealistic virtual images from user-specified novel viewpoints. In this work, we show initial results which indicate that novel view synthesis can provide additional context in generating grasp poses. Our experiments on the Graspnet-1billion dataset show that novel views contributed force-closure grasps in addition to the force-closure grasps obtained from sparsely sampled real views, while also improving grasp coverage. In the future we hope this work can be extended to improve grasp extraction from radiance fields constructed with a single input image, using for example diffusion models or generalizable radiance fields.
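The abstract evaluates grasps by force closure. The paper does not give its exact test, but for a two-finger gripper a standard simplified criterion is the antipodal condition: the line connecting the two contact points must lie inside both Coulomb friction cones. A minimal sketch, assuming point contacts with known inward surface normals and a friction coefficient `mu` (both hypothetical parameters, not taken from the paper):

```python
import numpy as np

def antipodal_force_closure(p1, n1, p2, n2, mu=0.5):
    """Two-finger antipodal force-closure check (simplified model).

    The contact pair is force closed under Coulomb friction when the
    line joining the contacts makes an angle of at most arctan(mu)
    with each inward-pointing unit surface normal.
    """
    d = p2 - p1
    d = d / np.linalg.norm(d)        # unit vector from contact 1 to 2
    half_angle = np.arctan(mu)       # friction-cone half angle
    # Angle between the connecting line and each friction-cone axis.
    ang1 = np.arccos(np.clip(np.dot(d, n1), -1.0, 1.0))
    ang2 = np.arccos(np.clip(np.dot(-d, n2), -1.0, 1.0))
    return bool(ang1 <= half_angle and ang2 <= half_angle)
```

For example, two contacts on opposite faces of a box with normals facing each other pass the check, while a contact whose normal is perpendicular to the grasp axis fails it.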
Related papers
- Improving Novel view synthesis of 360$^\circ$ Scenes in Extremely Sparse Views by Jointly Training Hemisphere Sampled Synthetic Images [6.273625958279926]
Novel view synthesis in 360$^\circ$ scenes from extremely sparse input views is essential for applications like virtual reality and augmented reality. This paper presents a novel framework for novel view synthesis in extremely sparse-view cases.
arXiv Detail & Related papers (2025-05-25T18:42:34Z)
- Novel View Extrapolation with Video Diffusion Priors [98.314893665023]
ViewExtrapolator is a novel view synthesis approach that leverages the generative priors of Stable Video Diffusion (SVD) for realistic novel view extrapolation.
ViewExtrapolator can work with different types of 3D rendering such as views rendered from point clouds when only a single view or monocular video is available.
arXiv Detail & Related papers (2024-11-21T15:16:48Z)
- Sampling for View Synthesis: From Local Light Field Fusion to Neural Radiance Fields and Beyond [27.339452004523082]
Local light field fusion proposes an algorithm for practical view synthesis from an irregular grid of sampled views.
We achieve the perceptual quality of Nyquist rate view sampling while using up to 4000x fewer views.
We reprise some of the recent results on sparse and even single image view synthesis.
arXiv Detail & Related papers (2024-08-08T16:56:03Z)
- MetaCap: Meta-learning Priors from Multi-View Imagery for Sparse-view Human Performance Capture and Rendering [91.76893697171117]
We propose a method for efficient and high-quality geometry recovery and novel view synthesis given very sparse or even a single view of the human.
Our key idea is to meta-learn the radiance field weights solely from potentially sparse multi-view videos.
We collect a new dataset, WildDynaCap, which contains subjects captured in, both, a dense camera dome and in-the-wild sparse camera rigs.
arXiv Detail & Related papers (2024-03-27T17:59:54Z)
- SAMPLING: Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image [60.52991173059486]
We introduce SAMPLING, a Scene-adaptive Hierarchical Multiplane Images Representation for Novel View Synthesis from a Single Image.
Our method demonstrates considerable performance gains in large-scale unbounded outdoor scenes using a single image on the KITTI dataset.
arXiv Detail & Related papers (2023-09-12T15:33:09Z)
- SPARF: Neural Radiance Fields from Sparse and Noisy Poses [58.528358231885846]
We introduce Sparse Pose Adjusting Radiance Field (SPARF) to address the challenge of novel-view synthesis.
Our approach exploits multi-view geometry constraints in order to jointly learn the NeRF and refine the camera poses.
arXiv Detail & Related papers (2022-11-21T18:57:47Z)
- Ray Priors through Reprojection: Improving Neural Radiance Fields for Novel View Extrapolation [35.47411859184933]
We study the novel view extrapolation setting that (1) the training images can well describe an object, and (2) there is a notable discrepancy between the training and test viewpoints' distributions.
We propose a random ray casting policy that allows training unseen views using seen views.
A ray atlas pre-computed from the observed rays' viewing directions could further enhance the rendering quality for extrapolated views.
arXiv Detail & Related papers (2022-05-12T07:21:17Z)
- Crowdsampling the Plenoptic Function [56.10020793913216]
We present a new approach to novel view synthesis under time-varying illumination from such data.
We introduce a new DeepMPI representation, motivated by observations on the sparsity structure of the plenoptic function.
Our method can synthesize the same compelling parallax and view-dependent effects as previous MPI methods, while simultaneously interpolating along changes in reflectance and illumination with time.
arXiv Detail & Related papers (2020-07-30T02:52:10Z)
- NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis [78.5281048849446]
We present a method that achieves state-of-the-art results for synthesizing novel views of complex scenes.
Our algorithm represents a scene using a fully-connected (non-convolutional) deep network.
Because volume rendering is naturally differentiable, the only input required to optimize our representation is a set of images with known camera poses.
arXiv Detail & Related papers (2020-03-19T17:57:23Z)
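The NeRF entry above notes that volume rendering is naturally differentiable, which is what lets the representation be optimized from posed images alone. The compositing step it refers to is the standard quadrature C = Σᵢ Tᵢ (1 − exp(−σᵢ δᵢ)) cᵢ with transmittance Tᵢ = exp(−Σⱼ<ᵢ σⱼ δⱼ). A minimal NumPy sketch of that quadrature for one ray (function name and array shapes are illustrative, not from the paper):

```python
import numpy as np

def composite_ray(sigmas, colors, deltas):
    """Volume-rendering quadrature along a single ray.

    sigmas: (N,) per-sample densities
    colors: (N, 3) per-sample RGB values
    deltas: (N,) distances between adjacent samples
    Returns the (3,) composited pixel colour.
    """
    alphas = 1.0 - np.exp(-sigmas * deltas)                 # per-sample opacity
    # Transmittance: product of (1 - alpha) over all earlier samples.
    trans = np.cumprod(np.concatenate(([1.0], 1.0 - alphas[:-1])))
    weights = trans * alphas                                # contribution weights
    return weights @ colors
```

A single fully opaque sample returns its own colour, and zero density everywhere composites to black, which matches the behaviour the formula predicts.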
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.