Few-shot Class-incremental Learning for 3D Point Cloud Objects
- URL: http://arxiv.org/abs/2205.15225v1
- Date: Mon, 30 May 2022 16:33:53 GMT
- Title: Few-shot Class-incremental Learning for 3D Point Cloud Objects
- Authors: Townim Chowdhury, Ali Cheraghian, Sameera Ramasinghe, Sahar Ahmadi,
Morteza Saberi, Shafin Rahman
- Abstract summary: Few-shot class-incremental learning (FSCIL) aims to incrementally fine-tune a model trained on base classes for a novel set of classes.
Recent FSCIL efforts address this problem primarily on 2D image data.
Due to the advancement of camera technology, 3D point cloud data has become more available than ever.
- Score: 11.267975876074706
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Few-shot class-incremental learning (FSCIL) aims to incrementally fine-tune a
model trained on base classes for a novel set of classes using a few examples
without forgetting the previous training. Recent FSCIL efforts address this
problem primarily on 2D image data. However, due to the advancement of camera
technology, 3D point cloud data has become more available than ever, which
warrants considering FSCIL on 3D data. In this paper, we address FSCIL in the
3D domain. In addition to well-known problems of catastrophic forgetting of
past knowledge and overfitting of few-shot data, 3D FSCIL can bring newer
challenges. For example, base classes may contain many synthetic instances in a
realistic scenario. In contrast, only a few real-scanned samples (from RGBD
sensors) of novel classes are available in incremental steps. Due to this
variation from synthetic to real data, FSCIL faces additional challenges that
degrade performance in later incremental steps. We attempt to solve this
problem with Microshapes (orthogonal basis vectors) that describe any 3D
object through a pre-defined set of rules. This supports incremental training
with few-shot examples while minimizing the synthetic-to-real data variation.
We propose new
test protocols for 3D FSCIL using the popular synthetic datasets ModelNet and
ShapeNet, and the 3D real-scanned datasets ScanObjectNN and Common Objects in 3D
(CO3D). By comparing against state-of-the-art methods, we establish the effectiveness
of our approach in the 3D domain.
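
The abstract describes Microshapes as orthogonal basis vectors that describe 3D objects through a pre-defined set of rules, bridging synthetic and real-scanned data. As a rough illustration only (not the paper's actual method), the sketch below builds an orthonormal basis via QR decomposition and projects backbone features from both domains onto it; the feature dimension, basis size, and additive noise model are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
feat_dim, n_micro = 256, 64  # assumed backbone feature size and basis size

# Build an orthonormal basis via reduced QR; columns play the role of
# "Microshapes" shared by synthetic and real-scanned domains.
basis, _ = np.linalg.qr(rng.standard_normal((feat_dim, n_micro)))  # (256, 64)

def microshape_code(feature: np.ndarray) -> np.ndarray:
    """Project a backbone feature onto the Microshape basis (64 coefficients)."""
    return basis.T @ feature

# Simulate the same object seen in each domain: the real scan is modeled
# (purely for illustration) as the synthetic feature plus small domain noise.
synthetic_feat = rng.standard_normal(feat_dim)
real_feat = synthetic_feat + 0.1 * rng.standard_normal(feat_dim)

code_syn = microshape_code(synthetic_feat)
code_real = microshape_code(real_feat)

# Cosine similarity of the two codes: high when the shared basis absorbs
# most of the synthetic-to-real variation.
cos = code_syn @ code_real / (np.linalg.norm(code_syn) * np.linalg.norm(code_real))
print(f"cross-domain code similarity: {cos:.3f}")
```

Projecting both domains onto one fixed orthonormal basis is the intuition being illustrated: incremental few-shot classes are then compared in the shared coefficient space rather than in the raw, domain-sensitive feature space.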
Related papers
- P3P: Pseudo-3D Pre-training for Scaling 3D Masked Autoencoders [32.85484320025852]
We propose a novel self-supervised pre-training framework utilizing the real 3D data and the pseudo-3D data lifted from images by a large depth estimation model.
Our method achieves state-of-the-art performance in 3D classification and few-shot learning while maintaining high pre-training and downstream fine-tuning efficiency.
arXiv Detail & Related papers (2024-08-19T13:59:53Z)
- DIRECT-3D: Learning Direct Text-to-3D Generation on Massive Noisy 3D Data [50.164670363633704]
We present DIRECT-3D, a diffusion-based 3D generative model for creating high-quality 3D assets from text prompts.
Our model is directly trained on extensive noisy and unaligned 'in-the-wild' 3D assets.
We achieve state-of-the-art performance in both single-class generation and text-to-3D generation.
arXiv Detail & Related papers (2024-06-06T17:58:15Z)
- FILP-3D: Enhancing 3D Few-shot Class-incremental Learning with Pre-trained Vision-Language Models [62.663113296987085]
Few-shot class-incremental learning aims to mitigate the catastrophic forgetting issue when a model is incrementally trained on limited data.
We introduce two novel components: the Redundant Feature Eliminator (RFE) and the Spatial Noise Compensator (SNC).
Considering the imbalance in existing 3D datasets, we also propose new evaluation metrics that offer a more nuanced assessment of a 3D FSCIL model.
arXiv Detail & Related papers (2023-12-28T14:52:07Z)
- PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm [114.47216525866435]
We introduce a novel universal 3D pre-training framework designed to facilitate the acquisition of efficient 3D representation.
For the first time, PonderV2 achieves state-of-the-art performance on 11 indoor and outdoor benchmarks, implying its effectiveness.
arXiv Detail & Related papers (2023-10-12T17:59:57Z)
- ULIP: Learning a Unified Representation of Language, Images, and Point Clouds for 3D Understanding [110.07170245531464]
Current 3D models are limited by datasets with a small number of annotated data and a pre-defined set of categories.
Recent advances have shown that similar problems can be significantly alleviated by employing knowledge from other modalities, such as language.
We learn a unified representation of images, texts, and 3D point clouds by pre-training with object triplets from the three modalities.
arXiv Detail & Related papers (2022-12-10T01:34:47Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training fails when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Exemplar Fine-Tuning for 3D Human Model Fitting Towards In-the-Wild 3D Human Pose Estimation [107.07047303858664]
Large-scale human datasets with 3D ground-truth annotations are difficult to obtain in the wild.
We address this problem by augmenting existing 2D datasets with high-quality 3D pose fits.
The resulting annotations are sufficient to train from scratch 3D pose regressor networks that outperform the current state-of-the-art on in-the-wild benchmarks.
arXiv Detail & Related papers (2020-04-07T20:21:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences.