Related papers: Fine-grained 3D object recognition: an approach and experiments

Fine-grained 3D object recognition: an approach and experiments

URL: http://arxiv.org/abs/2306.15919v1
Date: Wed, 28 Jun 2023 04:48:21 GMT
Title: Fine-grained 3D object recognition: an approach and experiments
Authors: Junhyung Jo, Hamidreza Kasaei
Abstract summary: Three-dimensional (3D) object recognition technology is being used as a core technology in advanced technologies such as autonomous driving of automobiles. There are two sets of approaches for 3D object recognition: (i) hand-crafted approaches like Global Orthographic Object Descriptor (GOOD), and (ii) deep learning-based approaches such as MobileNet and VGG. In this paper, we first implemented an offline 3D object recognition system that takes an object view as input and generates category labels as output. In the offline stage, instance-based learning (IBL) is used to form a new
Score: 0.0
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Three-dimensional (3D) object recognition technology is being used as a core technology in advanced technologies such as autonomous driving of automobiles. There are two sets of approaches for 3D object recognition: (i) hand-crafted approaches like Global Orthographic Object Descriptor (GOOD), and (ii) deep learning-based approaches such as MobileNet and VGG. However, it is needed to know which of these approaches works better in an open-ended domain where the number of known categories increases over time, and the system should learn about new object categories using few training examples. In this paper, we first implemented an offline 3D object recognition system that takes an object view as input and generates category labels as output. In the offline stage, instance-based learning (IBL) is used to form a new category and we use K-fold cross-validation to evaluate the obtained object recognition performance. We then test the proposed approach in an online fashion by integrating the code into a simulated teacher test. As a result, we concluded that the approach using deep learning features is more suitable for open-ended fashion. Moreover, we observed that concatenating the hand-crafted and deep learning features increases the classification accuracy.

Related papers

C3D-AD: Toward Continual 3D Anomaly Detection via Kernel Attention with Learnable Advisor [9.917394249928092]
3D Anomaly Detection (AD) has shown great potential in detecting anomalies or defects of high-precision industrial products.<n>Existing methods are typically trained in a class-specific manner and also lack the capability of learning from emerging classes.<n>We propose Continual 3D Anomaly Detection (C3D-AD), which can not only learn generalized representations for multi-class point clouds but also handle new classes emerging over time.
arXiv Detail & Related papers (2025-08-02T10:54:55Z)
Open Vocabulary Monocular 3D Object Detection [10.424711580213616]
We pioneer the study of open-vocabulary monocular 3D object detection, a novel task that aims to detect and localize objects in 3D space from a single RGB image. We introduce a class-agnostic approach that leverages open-vocabulary 2D detectors and lifts 2D bounding boxes into 3D space. Our approach decouples the recognition and localization of objects in 2D from the task of estimating 3D bounding boxes, enabling generalization across unseen categories.
arXiv Detail & Related papers (2024-11-25T18:59:17Z)
Object-Oriented Material Classification and 3D Clustering for Improved Semantic Perception and Mapping in Mobile Robots [6.395242048226456]
We propose a complement-aware deep learning approach for RGB-D-based material classification built on top of an object-oriented pipeline. We show a significant improvement in material classification and 3D clustering accuracy compared to state-of-the-art approaches for 3D semantic scene mapping.
arXiv Detail & Related papers (2024-07-08T16:25:01Z)
Deep Models for Multi-View 3D Object Recognition: A Review [16.500711021549947]
Multi-view 3D representations for object recognition has thus far demonstrated the most promising results for achieving state-of-the-art performance. This review paper comprehensively covers recent progress in multi-view 3D object recognition methods for 3D classification and retrieval tasks.
arXiv Detail & Related papers (2024-04-23T16:54:31Z)
OV-Uni3DETR: Towards Unified Open-Vocabulary 3D Object Detection via Cycle-Modality Propagation [67.56268991234371]
OV-Uni3DETR achieves the state-of-the-art performance on various scenarios, surpassing existing methods by more than 6% on average. Code and pre-trained models will be released later.
arXiv Detail & Related papers (2024-03-28T17:05:04Z)
Generalized Label-Efficient 3D Scene Parsing via Hierarchical Feature Aligned Pre-Training and Region-Aware Fine-tuning [55.517000360348725]
This work presents a framework for dealing with 3D scene understanding when the labeled scenes are quite limited. To extract knowledge for novel categories from the pre-trained vision-language models, we propose a hierarchical feature-aligned pre-training and knowledge distillation strategy. Experiments with both indoor and outdoor scenes demonstrated the effectiveness of our approach in both data-efficient learning and open-world few-shot learning.
arXiv Detail & Related papers (2023-12-01T15:47:04Z)
I3DOD: Towards Incremental 3D Object Detection via Prompting [31.75287371048825]
We present a novel Incremental 3D Object Detection framework with the guidance of prompting, i.e., I3DOD. Specifically, we propose a task-shared prompts mechanism to learn the matching relationships between the object localization information and category semantic information. Our method outperforms the state-of-the-art object detection methods by 0.6% - 2.7% in terms of mAP@0.25.
arXiv Detail & Related papers (2023-08-24T02:54:38Z)
ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds. The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled. The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
SL3D: Self-supervised-Self-labeled 3D Recognition [89.19932178712065]
We propose a Self-supervised-Self-Labeled 3D Recognition (SL3D) framework. SL3D simultaneously solves two coupled objectives, i.e., clustering and learning feature representation. It can be applied to solve different 3D recognition tasks, including classification, object detection, and semantic segmentation.
arXiv Detail & Related papers (2022-10-30T11:08:25Z)
CMR3D: Contextualized Multi-Stage Refinement for 3D Object Detection [57.44434974289945]
We propose Contextualized Multi-Stage Refinement for 3D Object Detection (CMR3D) framework. Our framework takes a 3D scene as input and strives to explicitly integrate useful contextual information of the scene. In addition to 3D object detection, we investigate the effectiveness of our framework for the problem of 3D object counting.
arXiv Detail & Related papers (2022-09-13T05:26:09Z)
Lifelong Ensemble Learning based on Multiple Representations for Few-Shot Object Recognition [6.282068591820947]
We present a lifelong ensemble learning approach based on multiple representations to address the few-shot object recognition problem. To facilitate lifelong learning, each approach is equipped with a memory unit for storing and retrieving object information instantly. We have performed extensive sets of experiments to assess the performance of the proposed approach in offline, and open-ended scenarios.
arXiv Detail & Related papers (2022-05-04T10:29:10Z)
Learnable Online Graph Representations for 3D Multi-Object Tracking [156.58876381318402]
We propose a unified and learning based approach to the 3D MOT problem. We employ a Neural Message Passing network for data association that is fully trainable. We show the merit of the proposed approach on the publicly available nuScenes dataset by achieving state-of-the-art performance of 65.6% AMOTA and 58% fewer ID-switches.
arXiv Detail & Related papers (2021-04-23T17:59:28Z)

This list is automatically generated from the titles and abstracts of the papers in this site.