Cross-modal Learning for Image-Guided Point Cloud Shape Completion
- URL: http://arxiv.org/abs/2209.09552v1
- Date: Tue, 20 Sep 2022 08:37:05 GMT
- Title: Cross-modal Learning for Image-Guided Point Cloud Shape Completion
- Authors: Emanuele Aiello, Diego Valsesia, Enrico Magli
- Abstract summary: We show how it is possible to combine the information from the two modalities in a localized latent space.
We also investigate a novel weakly-supervised setting where the auxiliary image provides a supervisory signal.
Experiments show significant improvements over state-of-the-art supervised methods for both unimodal and multimodal completion.
- Score: 23.779985842891705
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In this paper we explore the recent topic of point cloud completion, guided
by an auxiliary image. We show how it is possible to effectively combine the
information from the two modalities in a localized latent space, thus avoiding
the need for complex point cloud reconstruction methods from single views used
by the state-of-the-art. We also investigate a novel weakly-supervised setting
where the auxiliary image provides a supervisory signal to the training process
by using a differentiable renderer on the completed point cloud to measure
fidelity in the image space. Experiments show significant improvements over
state-of-the-art supervised methods for both unimodal and multimodal
completion. We also show the effectiveness of the weakly-supervised approach,
which outperforms a number of supervised methods and is competitive with the
latest supervised models that exploit only point cloud information.
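The weakly-supervised signal described above can be illustrated with a minimal sketch: render the completed point cloud into the image plane and penalize disagreement with the auxiliary image's object mask. This is only a toy sketch assuming an orthographic projection and a soft Gaussian splat per point; the function names and parameters are illustrative, not the authors' implementation (which uses a full differentiable renderer).

```python
import math

def project_ortho(points, size=8):
    """Orthographically project 3D points (x, y assumed in [0, 1]) onto a
    size x size grid, splatting each point as a soft Gaussian so the
    rendered silhouette varies smoothly with the point coordinates."""
    image = [[0.0] * size for _ in range(size)]
    sigma = 1.0  # splat width in pixels (illustrative choice)
    for x, y, _z in points:  # depth is ignored for a silhouette
        px, py = x * (size - 1), y * (size - 1)
        for r in range(size):
            for c in range(size):
                d2 = (c - px) ** 2 + (r - py) ** 2
                image[r][c] += math.exp(-d2 / (2 * sigma ** 2))
    # squash the accumulated density into a [0, 1] soft silhouette
    return [[1.0 - math.exp(-v) for v in row] for row in image]

def image_space_loss(completed_points, target_mask):
    """Mean squared error between the rendered silhouette of the completed
    cloud and the auxiliary image's object mask: the image-space fidelity
    term that supervises training when no complete cloud is available."""
    size = len(target_mask)
    rendered = project_ortho(completed_points, size=size)
    return sum((rendered[r][c] - target_mask[r][c]) ** 2
               for r in range(size) for c in range(size)) / (size * size)
```

A completion that matches the mask drives the loss to zero, while a misplaced cloud is penalized, which is all the gradient signal this weak setting needs.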
Related papers
- HVDistill: Transferring Knowledge from Images to Point Clouds via Unsupervised Hybrid-View Distillation [106.09886920774002]
We present a hybrid-view-based knowledge distillation framework, termed HVDistill, to guide the feature learning of a point cloud neural network.
Our method achieves consistent improvements over the baseline trained from scratch and significantly outperforms the existing schemes.
arXiv Detail & Related papers (2024-03-18T14:18:08Z) - Cross-Modal Information-Guided Network using Contrastive Learning for Point Cloud Registration [17.420425069785946]
We present a novel Cross-Modal Information-Guided Network (CMIGNet) for point cloud registration.
We first incorporate the projected images from the point clouds and fuse the cross-modal features using the attention mechanism.
We employ two contrastive learning strategies, namely overlapping contrastive learning and cross-modal contrastive learning.
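Cross-modal contrastive learning of the kind this summary mentions is commonly built on an InfoNCE-style objective: the i-th point-cloud feature is pulled toward its paired image feature and pushed away from the others. The sketch below is a generic, single-direction illustration of that objective, not CMIGNet's actual loss; all names and the temperature value are assumptions.

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def cross_modal_info_nce(point_feats, image_feats, temperature=0.1):
    """InfoNCE-style loss over paired point-cloud and image features:
    the i-th pair is the positive, all other images are negatives
    (one direction only, for brevity)."""
    loss, n = 0.0, len(point_feats)
    for i in range(n):
        sims = [cosine(point_feats[i], f) / temperature for f in image_feats]
        # numerically stable log-sum-exp over the similarity row
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        loss += -(sims[i] - log_denom)  # -log softmax of the positive
    return loss / n
```

Correctly paired features yield a much lower loss than mismatched ones, which is what drives the two modalities into a shared embedding space.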
arXiv Detail & Related papers (2023-11-02T12:56:47Z) - Point Contrastive Prediction with Semantic Clustering for Self-Supervised Learning on Point Cloud Videos [71.20376514273367]
We propose a unified point cloud video self-supervised learning framework for object-centric and scene-centric data.
Our method outperforms supervised counterparts on a wide range of downstream tasks.
arXiv Detail & Related papers (2023-08-18T02:17:47Z) - Few-Shot Point Cloud Semantic Segmentation via Contrastive Self-Supervision and Multi-Resolution Attention [6.350163959194903]
We propose a contrastive self-supervision framework for few-shot learning pretrain.
Specifically, we implement a novel contrastive learning approach with a learnable augmentor for a 3D point cloud.
We develop a multi-resolution attention module using both the nearest and farthest points to extract the local and global point information more effectively.
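The "nearest and farthest points" idea in the summary above reduces to selecting two neighbor sets per query point: a local one and a global one for the attention module to attend over. A minimal sketch, with hypothetical names and squared Euclidean distance assumed as the metric:

```python
def nearest_and_farthest(query, points, k):
    """Return the k nearest and k farthest points to `query` by squared
    Euclidean distance: the local and global neighbor sets a
    multi-resolution attention module could operate on."""
    def d2(p):
        return sum((a - b) ** 2 for a, b in zip(query, p))
    ranked = sorted(points, key=d2)  # nearest first, farthest last
    return ranked[:k], ranked[-k:]
```

In practice such selection runs per query over a batched distance matrix rather than a Python sort, but the split into local and global context is the same.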
arXiv Detail & Related papers (2023-02-21T07:59:31Z) - Single Stage Virtual Try-on via Deformable Attention Flows [51.70606454288168]
Virtual try-on aims to generate a photo-realistic fitting result given an in-shop garment and a reference person image.
We develop a novel Deformable Attention Flow (DAFlow) which applies the deformable attention scheme to multi-flow estimation.
Our proposed method achieves state-of-the-art performance both qualitatively and quantitatively.
arXiv Detail & Related papers (2022-07-19T10:01:31Z) - Self-Supervised Arbitrary-Scale Point Clouds Upsampling via Implicit Neural Representation [79.60988242843437]
We propose a novel approach that achieves self-supervised and magnification-flexible point clouds upsampling simultaneously.
Experimental results demonstrate that our self-supervised learning based scheme achieves competitive or even better performance than supervised learning based state-of-the-art methods.
arXiv Detail & Related papers (2022-04-18T07:18:25Z) - Self-Supervised Feature Learning from Partial Point Clouds via Pose Disentanglement [35.404285596482175]
We propose a novel self-supervised framework to learn informative representations from partial point clouds.
We leverage partial point clouds scanned by LiDAR that contain both content and pose attributes.
Our method not only outperforms existing self-supervised methods, but also shows a better generalizability across synthetic and real-world datasets.
arXiv Detail & Related papers (2022-01-09T14:12:50Z) - Towards Unsupervised Sketch-based Image Retrieval [126.77787336692802]
We introduce a novel framework that simultaneously performs unsupervised representation learning and sketch-photo domain alignment.
Our framework achieves excellent performance in the new unsupervised setting, and performs comparably or better than state-of-the-art in the zero-shot setting.
arXiv Detail & Related papers (2021-05-18T02:38:22Z) - View-Guided Point Cloud Completion [43.139758470826806]
ViPC (view-guided point cloud completion) recovers the missing global structure information from an extra single-view image.
Our method achieves significantly superior results over typical existing solutions on a new large-scale dataset.
arXiv Detail & Related papers (2021-04-12T17:35:45Z) - MAF: Multimodal Alignment Framework for Weakly-Supervised Phrase Grounding [74.33171794972688]
We present algorithms to model phrase-object relevance by leveraging fine-grained visual representations and visually-aware language representations.
Experiments conducted on the widely-adopted Flickr30k dataset show a significant improvement over existing weakly-supervised methods.
arXiv Detail & Related papers (2020-10-12T00:43:52Z) - Single Image Cloud Detection via Multi-Image Fusion [23.641624507709274]
A primary challenge in developing algorithms is the cost of collecting annotated training data.
We demonstrate how recent advances in multi-image fusion can be leveraged to bootstrap single image cloud detection.
We collect a large dataset of Sentinel-2 images along with a per-pixel semantic labelling for land cover.
arXiv Detail & Related papers (2020-07-29T22:52:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented (including all generated summaries) and is not responsible for any consequences of its use.