Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges
- URL: http://arxiv.org/abs/2004.07392v1
- Date: Wed, 15 Apr 2020 23:34:03 GMT
- Title: Joint Supervised and Self-Supervised Learning for 3D Real-World Challenges
- Authors: Antonio Alliegro, Davide Boscaini, Tatiana Tommasi
- Abstract summary: Point cloud processing and 3D shape understanding are challenging tasks for which deep learning techniques have demonstrated great potential.
Here we consider several possible scenarios involving synthetic and real-world point clouds where supervised learning fails due to data scarcity and large domain gaps.
We propose to enrich standard feature representations by leveraging self-supervision through a multi-task model that can solve a 3D puzzle while learning the main task of shape classification or part segmentation.
- Score: 16.328866317851187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Point cloud processing and 3D shape understanding are very
challenging tasks for which deep learning techniques have demonstrated great
potential. Still, further progress is essential to allow artificially
intelligent agents to interact with the real world, where the amount of
annotated data may be limited and integrating new sources of knowledge
becomes crucial to support autonomous learning. Here we consider several
possible scenarios involving synthetic and real-world point clouds where
supervised learning fails due to data scarcity and large domain gaps. We
propose to enrich standard feature representations by leveraging
self-supervision through a multi-task model that can solve a 3D puzzle while
learning the main task of shape classification or part segmentation. An
extensive analysis investigating few-shot, transfer-learning and cross-domain
settings shows the effectiveness of our approach, with state-of-the-art
results for 3D shape classification and part segmentation.
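To make the joint setup concrete, the following is a minimal PyTorch sketch (an assumption for illustration, not the authors' implementation): a shared point-cloud encoder feeds a supervised classification head and a self-supervised 3D-puzzle head that, after the cloud is cut into a 3x3x3 grid of pieces and the pieces are shuffled, predicts each point's original grid cell.

    import torch
    import torch.nn as nn

    class SharedEncoder(nn.Module):
        """PointNet-style per-point encoder: (B, N, 3) -> (B, N, C)."""
        def __init__(self, feat_dim=128):
            super().__init__()
            self.mlp = nn.Sequential(
                nn.Linear(3, 64), nn.ReLU(),
                nn.Linear(64, feat_dim), nn.ReLU(),
            )
        def forward(self, pts):
            return self.mlp(pts)

    class JointModel(nn.Module):
        def __init__(self, num_classes=40, num_cells=27, feat_dim=128):
            super().__init__()
            self.encoder = SharedEncoder(feat_dim)
            self.cls_head = nn.Linear(feat_dim, num_classes)   # supervised head
            self.puzzle_head = nn.Linear(feat_dim, num_cells)  # self-supervised head
        def forward(self, pts):
            feats = self.encoder(pts)               # (B, N, C) per-point features
            global_feat = feats.max(dim=1).values   # permutation-invariant pooling
            return self.cls_head(global_feat), self.puzzle_head(feats)

    def shuffle_pieces(pts, grid=3):
        """Partition each cloud (assumed in [-1, 1]^3) into grid^3 cells,
        permute the cells' positions, and return the displaced points plus
        each point's original cell index (the puzzle label)."""
        cells = ((pts + 1) * 0.5 * grid).clamp(0, grid - 1).long()
        labels = cells[..., 0] * grid * grid + cells[..., 1] * grid + cells[..., 2]
        perm = torch.randperm(grid ** 3)
        offsets = torch.stack(torch.meshgrid(
            *([torch.arange(grid)] * 3), indexing="ij"), -1).reshape(-1, 3).float()
        shift = (offsets[perm[labels]] - offsets[labels]) * (2.0 / grid)
        return pts + shift, labels

    model = JointModel()
    pts = torch.rand(8, 1024, 3) * 2 - 1   # toy batch of 8 clouds in [-1, 1]^3
    shuffled, cell_labels = shuffle_pieces(pts)
    cls_logits, puzzle_logits = model(shuffled)
    loss = nn.functional.cross_entropy(cls_logits, torch.randint(0, 40, (8,))) \
         + nn.functional.cross_entropy(puzzle_logits.reshape(-1, 27),
                                       cell_labels.reshape(-1))

Because both heads share the encoder, gradients from the puzzle task regularize the features the main task uses, which is the intended benefit in the low-data and cross-domain regimes the paper studies.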
Related papers
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments robustly displays our method's consistent superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models [113.18524940863841]
This survey provides a comprehensive overview of the methodologies enabling large language models to process, understand, and generate 3D data.
Our investigation spans various 3D data representations, from point clouds to Neural Radiance Fields (NeRFs).
It examines their integration with LLMs for tasks such as 3D scene understanding, captioning, question-answering, and dialogue.
arXiv Detail & Related papers (2024-05-16T16:59:58Z)
- Explore In-Context Learning for 3D Point Cloud Understanding [71.20912026561484]
We introduce a novel framework, named Point-In-Context, designed especially for in-context learning in 3D point clouds.
We propose the Joint Sampling module, carefully designed to work in tandem with the general point sampling operator.
We conduct extensive experiments to validate the versatility and adaptability of our proposed methods in handling a wide range of tasks.
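The Joint Sampling module above is only named, not specified, in this summary; as a purely hypothetical sketch, one common way to keep paired clouds aligned around a generic sampling operator is to compute the indices once (e.g., with farthest point sampling) and reuse them on both clouds:

    import torch

    def farthest_point_sampling(pts, k):
        """Greedy FPS on one cloud: pts (N, 3) -> k well-spread indices."""
        idx = torch.zeros(k, dtype=torch.long)  # start from point 0
        dist = torch.full((pts.shape[0],), float("inf"))
        for i in range(1, k):
            dist = torch.minimum(dist, ((pts - pts[idx[i - 1]]) ** 2).sum(-1))
            idx[i] = dist.argmax()
        return idx

    def joint_sample(input_pts, target_pts, k):
        """Sample indices on the input cloud and reuse them on the target,
        keeping the two clouds point-wise aligned for in-context prediction."""
        idx = farthest_point_sampling(input_pts, k)
        return input_pts[idx], target_pts[idx]

    inp, tgt = torch.rand(2048, 3), torch.rand(2048, 3)
    inp_s, tgt_s = joint_sample(inp, tgt, 512)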
arXiv Detail & Related papers (2023-06-14T17:53:21Z)
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
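As a generic sketch of the noise-contrastive idea (the paper's exact objective may differ), an InfoNCE-style loss can be placed on (state, next-state) pairs, with the other next-states in the batch acting as noise samples; the encoder and dimensions below are placeholders:

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 128))

    def nce_loss(states, next_states, temperature=0.1):
        """Positives are (s_t, s_{t+1}) pairs on the diagonal; every other
        next-state in the batch serves as a noise sample."""
        z = F.normalize(encoder(states), dim=-1)        # (B, D)
        z_next = F.normalize(encoder(next_states), dim=-1)
        logits = z @ z_next.t() / temperature           # (B, B) similarities
        return F.cross_entropy(logits, torch.arange(z.shape[0]))

    loss = nce_loss(torch.randn(32, 64), torch.randn(32, 64))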
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
- Self-Ensembling for 3D Point Cloud Domain Adaptation [29.330315360307374]
We propose an end-to-end self-ensembling network (SEN) for 3D point cloud domain adaptation tasks.
Our SEN draws on the strengths of Mean Teacher and semi-supervised learning, and introduces a soft classification loss and a consistency loss (the Mean Teacher component is sketched below).
Our SEN outperforms the state-of-the-art methods on both classification and segmentation tasks.
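A minimal sketch of the Mean Teacher ingredient, assuming a standard EMA teacher and a KL consistency term (SEN's actual losses, inputs, and schedules differ):

    import copy
    import torch
    import torch.nn.functional as F

    student = torch.nn.Sequential(torch.nn.Linear(3, 64), torch.nn.ReLU(),
                                  torch.nn.Linear(64, 10))
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)

    def ema_update(decay=0.99):
        """teacher <- decay * teacher + (1 - decay) * student, per parameter."""
        with torch.no_grad():
            for t, s in zip(teacher.parameters(), student.parameters()):
                t.mul_(decay).add_(s, alpha=1 - decay)

    feats = torch.randn(16, 3)  # stand-in for per-cloud features
    soft_targets = F.softmax(teacher(feats), dim=-1)   # teacher's soft labels
    log_probs = F.log_softmax(student(feats), dim=-1)
    consistency = F.kl_div(log_probs, soft_targets, reduction="batchmean")
    consistency.backward()  # trains the student only
    ema_update()            # teacher slowly tracks the student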
arXiv Detail & Related papers (2021-12-10T02:18:09Z)
- Spatio-temporal Self-Supervised Representation Learning for 3D Point Clouds [96.9027094562957]
We introduce a spatio-temporal representation learning (STRL) framework capable of learning from unlabeled 3D point clouds.
Inspired by how infants learn from visual data in the wild, we explore rich cues derived from the 3D data.
STRL takes two temporally related frames from a 3D point cloud sequence as input, transforms them with spatial data augmentation, and learns an invariant representation in a self-supervised manner.
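A minimal sketch of this invariance objective, assuming a placeholder encoder and augmentation rather than STRL's actual pipeline:

    import math
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 128))

    def augment(pts):
        """Placeholder spatial augmentation: random z-rotation plus jitter."""
        a = torch.rand(()).item() * 2 * math.pi
        rot = torch.tensor([[math.cos(a), -math.sin(a), 0.0],
                            [math.sin(a),  math.cos(a), 0.0],
                            [0.0, 0.0, 1.0]])
        return pts @ rot.t() + 0.01 * torch.randn_like(pts)

    frame_t = torch.randn(1024, 3)    # two temporally adjacent frames
    frame_t1 = torch.randn(1024, 3)
    z1 = encoder(augment(frame_t)).max(dim=0).values    # pooled global features
    z2 = encoder(augment(frame_t1)).max(dim=0).values
    loss = 1 - F.cosine_similarity(z1, z2, dim=0)       # pull the two together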
arXiv Detail & Related papers (2021-09-01T04:17:11Z)
- Multi-task learning from fixed-wing UAV images for 2D/3D city modeling [0.0]
Multi-task learning is an approach to scene understanding that involves multiple related tasks, each with potentially limited training data.
In urban management applications such as infrastructure development, traffic monitoring, smart 3D cities, and change detection, automated multi-task data analysis is required.
In this study, a common framework for the performance assessment of multi-task learning methods from fixed-wing UAV images for 2D/3D city modeling is presented.
arXiv Detail & Related papers (2021-08-25T14:45:42Z)
- Point Discriminative Learning for Unsupervised Representation Learning on 3D Point Clouds [54.31515001741987]
We propose a point discriminative learning method for unsupervised representation learning on 3D point clouds.
We achieve this by imposing a novel point discrimination loss on middle-level and global-level point features.
Our method learns powerful representations and achieves new state-of-the-art performance.
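A hedged sketch of a point-discrimination objective of this flavor (the paper's exact loss differs): per-point features of an augmented cloud should match the features of the same points in the original cloud, with every other point serving as a negative.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    encoder = nn.Sequential(nn.Linear(3, 64), nn.ReLU(), nn.Linear(64, 64))

    def point_discrimination_loss(pts, temperature=0.07):
        aug = pts + 0.01 * torch.randn_like(pts)   # simple jitter augmentation
        f = F.normalize(encoder(pts), dim=-1)      # (N, D) per-point features
        f_aug = F.normalize(encoder(aug), dim=-1)
        logits = f_aug @ f.t() / temperature       # (N, N) point similarities
        targets = torch.arange(pts.shape[0])       # point i matches point i
        return F.cross_entropy(logits, targets)

    loss = point_discrimination_loss(torch.rand(512, 3))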
arXiv Detail & Related papers (2021-08-04T15:11:48Z)
- Sense and Learn: Self-Supervision for Omnipresent Sensors [9.442811508809994]
We present a framework named Sense and Learn for representation or feature learning from raw sensory data.
It consists of several auxiliary tasks that can learn high-level and broadly useful features entirely from unannotated data without any human involvement in the tedious labeling process.
Our methodology achieves results competitive with supervised approaches and, in most cases, closes the gap by fine-tuning the network while learning the downstream tasks.
arXiv Detail & Related papers (2020-09-28T11:57:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.