3D objects and scenes classification, recognition, segmentation, and
reconstruction using 3D point cloud data: A review
- URL: http://arxiv.org/abs/2306.05978v1
- Date: Fri, 9 Jun 2023 15:45:23 GMT
- Title: 3D objects and scenes classification, recognition, segmentation, and
reconstruction using 3D point cloud data: A review
- Authors: Omar Elharrouss, Kawther Hassine, Ayman Zayyan, Zakariyae Chatri, Noor
Almaadeed, Somaya Al-Maadeed and Khalid Abualsaud
- Abstract summary: Three-dimensional (3D) point cloud analysis has become one of the most attractive subjects in realistic imaging and machine vision.
A significant effort has recently been devoted to developing novel strategies, using different techniques such as deep learning models.
Various tasks performed on 3D point cloud data are investigated, including object and scene detection, recognition, segmentation, and reconstruction.
- Score: 5.85206759397617
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Three-dimensional (3D) point cloud analysis has become one of the most
attractive subjects in realistic imaging and machine vision due to its
simplicity, flexibility, and powerful capacity for visualization. The
representation of scenes and buildings using 3D shapes and formats has enabled
many applications, including autonomous driving and scene and object
reconstruction. Nevertheless, working with this emerging type of data remains
challenging for object representation, scene recognition, segmentation, and
reconstruction. In this regard, a significant effort has recently been devoted
to developing novel strategies using different techniques, such as deep
learning models. To that end, we present in this paper a comprehensive review
of existing tasks on 3D point clouds: a well-defined taxonomy of existing
techniques is constructed based on the nature of the adopted algorithms,
application scenarios, and main objectives. Various tasks performed on 3D point
cloud data are investigated, including object and scene detection, recognition,
segmentation, and reconstruction. In addition, we introduce a list of used
datasets, discuss the respective evaluation metrics, and compare the
performance of existing solutions to better inform the state of the art and
identify their limitations and strengths. Lastly, we elaborate on current
challenges facing the subject and future trends attracting considerable
interest, which could serve as a starting point for upcoming research studies.
Related papers
- Automatic Scene Generation: State-of-the-Art Techniques, Models, Datasets, Challenges, and Future Prospects [0.94371657253557]
This survey focuses on techniques that leverage machine learning, deep learning, embedded systems, and natural language processing (NLP).
We categorize the models into four main types: Variational Autoencoders (VAEs), Generative Adversarial Networks (GANs), Transformers, and Diffusion Models.
We also review the most commonly used datasets, such as COCO-Stuff, Visual Genome, and MS-COCO, which are critical for training and evaluating these models.
arXiv Detail & Related papers (2024-09-14T19:09:10Z)
- Enhancing Generalizability of Representation Learning for Data-Efficient 3D Scene Understanding [50.448520056844885]
We propose a generative Bayesian network to produce diverse synthetic scenes with real-world patterns.
A series of experiments consistently demonstrates our method's superiority over existing state-of-the-art pre-training approaches.
arXiv Detail & Related papers (2024-06-17T07:43:53Z)
- Deep Models for Multi-View 3D Object Recognition: A Review [16.500711021549947]
Multi-view 3D representations for object recognition have thus far demonstrated the most promising results for achieving state-of-the-art performance.
This review paper comprehensively covers recent progress in multi-view 3D object recognition methods for 3D classification and retrieval tasks.
arXiv Detail & Related papers (2024-04-23T16:54:31Z)
- A Comprehensive Survey of 3D Dense Captioning: Localizing and Describing Objects in 3D Scenes [80.20670062509723]
3D dense captioning is an emerging vision-language bridging task that aims to generate detailed descriptions for 3D scenes.
It presents significant potential and challenges due to its closer representation of the real world compared to 2D visual captioning.
Despite the popularity and success of existing methods, there is a lack of comprehensive surveys summarizing the advancements in this field.
arXiv Detail & Related papers (2024-03-12T10:04:08Z)
- Multi-Modal Dataset Acquisition for Photometrically Challenging Object [56.30027922063559]
This paper addresses the limitations of current datasets for 3D vision tasks in terms of accuracy, size, realism, and suitable imaging modalities for photometrically challenging objects.
We propose a novel annotation and acquisition pipeline that enhances existing 3D perception and 6D object pose datasets.
arXiv Detail & Related papers (2023-08-21T10:38:32Z)
- SketchANIMAR: Sketch-based 3D Animal Fine-Grained Retrieval [17.286320102183502]
We introduce a novel SHREC challenge track that focuses on retrieving relevant 3D animal models from a dataset using sketch queries.
Our contest requires participants to retrieve 3D models based on complex and detailed sketches.
We receive satisfactory results from eight teams and 204 runs.
arXiv Detail & Related papers (2023-04-12T09:40:38Z)
- RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection [138.2892824662943]
A promising solution is to make better use of the synthetic dataset, which consists of CAD object models, to boost the learning on real datasets.
Recent work on 3D pre-training exhibits failure when transferring features learned on synthetic objects to other real-world applications.
In this work, we put forward a new method called RandomRooms to accomplish this objective.
arXiv Detail & Related papers (2021-08-17T17:56:12Z)
- Active 3D Shape Reconstruction from Vision and Touch [66.08432412497443]
Humans build 3D understandings of the world through active object exploration, using jointly their senses of vision and touch.
In 3D shape reconstruction, most recent progress has relied on static datasets of limited sensory data such as RGB images, depth maps or haptic readings.
We introduce a system composed of: 1) a haptic simulator leveraging high spatial resolution vision-based tactile sensors for active touching of 3D objects; 2) a mesh-based 3D shape reconstruction model that relies on tactile or visuotactile priors to guide the shape exploration; and 3) a set of data-driven solutions with either tactile or visuo
arXiv Detail & Related papers (2021-07-20T15:56:52Z)
- Unsupervised Learning of 3D Object Categories from Videos in the Wild [75.09720013151247]
We focus on learning a model from multiple views of a large collection of object instances.
We propose a new neural network design, called warp-conditioned ray embedding (WCR), which significantly improves reconstruction.
Our evaluation demonstrates performance improvements over several deep monocular reconstruction baselines on existing benchmarks.
arXiv Detail & Related papers (2021-03-30T17:57:01Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.