Open-Ended Fine-Grained 3D Object Categorization by Combining Shape and
Texture Features in Multiple Colorspaces
- URL: http://arxiv.org/abs/2009.09235v3
- Date: Fri, 28 May 2021 19:54:03 GMT
- Title: Open-Ended Fine-Grained 3D Object Categorization by Combining Shape and
Texture Features in Multiple Colorspaces
- Authors: Nils Keunecke and S. Hamidreza Kasaei
- Abstract summary: In this work, shape information encodes the common patterns of all categories, while texture information is used to describe the appearance of each instance in detail.
The proposed network architecture outperformed the selected state-of-the-art approaches in terms of object classification accuracy and scalability.
- Score: 5.89118432388542
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a consequence of an ever-increasing number of service robots, there is a
growing demand for highly accurate real-time 3D object recognition. Considering
the expansion of robot applications into more complex and dynamic environments, it
is evident that it is not possible to pre-program all object categories and
anticipate all exceptions in advance. Therefore, robots should have the
functionality to learn about new object categories in an open-ended fashion
while working in the environment. Towards this goal, we propose a deep transfer
learning approach to generate a scale- and pose-invariant object representation
by considering shape and texture information in multiple colorspaces. The
obtained global object representation is then fed to an instance-based object
category learning and recognition module, where a non-expert human user is in
the learning loop and can interactively guide the process of experience
acquisition by teaching new object categories, or by correcting insufficient or
erroneous categories. In this work, shape information encodes the common
patterns of all categories, while texture information is used to describe the
appearance of each instance in detail. Multiple colorspace combinations and
network architectures are evaluated to find the most descriptive system.
Experimental results showed that the proposed network architecture outperformed
the selected state-of-the-art approaches in terms of object classification
accuracy and scalability. Furthermore, we performed a real robot experiment in
the context of a serve-a-beer scenario to show the real-time performance of the
proposed approach.
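The abstract describes the pipeline only at a high level. The sketch below is a minimal illustration of that idea, not the authors' implementation: the MobileNetV2 backbone, the RGB/HSV/LAB colorspace set, the toy distance-histogram shape descriptor, and the nearest-instance matcher are all illustrative assumptions standing in for whatever the paper actually evaluates.

```python
# Minimal sketch (assumptions, not the authors' code): texture features from a
# pretrained CNN in several colorspaces are concatenated with a simple shape
# descriptor, and the global representation is matched against taught instances.
import cv2
import numpy as np
import torch
from torchvision import models, transforms

# Pretrained backbone used as a fixed feature extractor (deep transfer learning).
backbone = models.mobilenet_v2(weights="DEFAULT").features.eval()
pool = torch.nn.AdaptiveAvgPool2d(1)

# Per-colorspace normalization is omitted for brevity in this sketch.
to_tensor = transforms.Compose([transforms.ToTensor(),
                                transforms.Resize((224, 224))])

COLORSPACES = {
    "RGB": lambda bgr: cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB),
    "HSV": lambda bgr: cv2.cvtColor(bgr, cv2.COLOR_BGR2HSV),
    "LAB": lambda bgr: cv2.cvtColor(bgr, cv2.COLOR_BGR2LAB),
}

def texture_features(bgr_image: np.ndarray) -> np.ndarray:
    """Concatenate CNN features computed in each colorspace."""
    feats = []
    with torch.no_grad():
        for convert in COLORSPACES.values():
            img = to_tensor(convert(bgr_image)).unsqueeze(0)
            feats.append(pool(backbone(img)).flatten().numpy())
    return np.concatenate(feats)

def shape_features(points: np.ndarray, bins: int = 30) -> np.ndarray:
    """Toy scale-invariant shape descriptor: histogram of normalized point
    distances from the object centroid (a stand-in, not the paper's descriptor)."""
    d = np.linalg.norm(points - points.mean(axis=0), axis=1)
    d = d / (d.max() + 1e-9)
    hist, _ = np.histogram(d, bins=bins, range=(0.0, 1.0))
    return hist / (hist.sum() + 1e-9)

def global_representation(bgr_image: np.ndarray, points: np.ndarray) -> np.ndarray:
    return np.concatenate([shape_features(points), texture_features(bgr_image)])

class InstanceBasedLearner:
    """Open-ended learner: a user can teach a new category at any time by
    providing a labelled view; recognition is nearest-instance matching."""
    def __init__(self):
        self.instances, self.labels = [], []

    def teach(self, representation: np.ndarray, category: str) -> None:
        self.instances.append(representation)
        self.labels.append(category)

    def recognize(self, representation: np.ndarray) -> str:
        if not self.instances:
            return "unknown"
        dists = [np.linalg.norm(representation - x) for x in self.instances]
        return self.labels[int(np.argmin(dists))]
```

In this reading, adding a new category or correcting a mistake reduces to calling `teach` with one or a few labelled views, which is what makes the learning open-ended rather than requiring retraining of the backbone.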
Related papers
- Variational Inference for Scalable 3D Object-centric Learning [19.445804699433353]
We tackle the task of scalable unsupervised object-centric representation learning on 3D scenes.
Existing approaches to object-centric representation learning show limitations in generalizing to larger scenes.
We propose to learn view-invariant 3D object representations in localized object coordinate systems.
arXiv Detail & Related papers (2023-09-25T10:23:40Z) - Object Scene Representation Transformer [56.40544849442227]
We introduce Object Scene Representation Transformer (OSRT), a 3D-centric model in which individual object representations naturally emerge through novel view synthesis.
OSRT scales to significantly more complex scenes with larger diversity of objects and backgrounds than existing methods.
It is multiple orders of magnitude faster at compositional rendering thanks to its light field parametrization and the novel Slot Mixer decoder.
arXiv Detail & Related papers (2022-06-14T15:40:47Z) - Lifelong Ensemble Learning based on Multiple Representations for
Few-Shot Object Recognition [6.282068591820947]
We present a lifelong ensemble learning approach based on multiple representations to address the few-shot object recognition problem.
To facilitate lifelong learning, each approach is equipped with a memory unit for storing and retrieving object information instantly.
We have performed extensive sets of experiments to assess the performance of the proposed approach in both offline and open-ended scenarios.
arXiv Detail & Related papers (2022-05-04T10:29:10Z) - Complex-Valued Autoencoders for Object Discovery [62.26260974933819]
We propose a distributed approach to object-centric representations: the Complex AutoEncoder.
We show that this simple and efficient approach achieves better reconstruction performance than an equivalent real-valued autoencoder on simple multi-object datasets.
We also show that it achieves competitive unsupervised object discovery performance to a SlotAttention model on two datasets, and manages to disentangle objects in a third dataset where SlotAttention fails - all while being 7-70 times faster to train.
arXiv Detail & Related papers (2022-04-05T09:25:28Z) - MetaGraspNet: A Large-Scale Benchmark Dataset for Vision-driven Robotic
Grasping via Physics-based Metaverse Synthesis [78.26022688167133]
We present a large-scale benchmark dataset for vision-driven robotic grasping via physics-based metaverse synthesis.
The proposed dataset contains 100,000 images and 25 different object types.
We also propose a new layout-weighted performance metric alongside the dataset for evaluating object detection and segmentation performance.
arXiv Detail & Related papers (2021-12-29T17:23:24Z) - Contrastive Object Detection Using Knowledge Graph Embeddings [72.17159795485915]
We compare the error statistics of the class embeddings learned from a one-hot approach with semantically structured embeddings from natural language processing or knowledge graphs.
We propose a knowledge-embedded design for keypoint-based and transformer-based object detection architectures.
arXiv Detail & Related papers (2021-12-21T17:10:21Z) - ObjectFolder: A Dataset of Objects with Implicit Visual, Auditory, and
Tactile Representations [52.226947570070784]
We present ObjectFolder, a dataset of 100 objects that addresses both challenges with two key innovations.
First, Object encodes the visual, auditory, and tactile sensory data for all objects, enabling a number of multisensory object recognition tasks.
Second, ObjectFolder employs a uniform, object-centric, and implicit representation for each object's visual textures, acoustic simulations, and tactile readings, making the dataset flexible to use and easy to share.
arXiv Detail & Related papers (2021-09-16T14:00:59Z) - VAT-Mart: Learning Visual Action Trajectory Proposals for Manipulating
3D ARTiculated Objects [19.296344218177534]
The space of 3D articulated objects is exceptionally rich in their myriad semantic categories, diverse shape geometry, and complicated part functionality.
Previous works mostly abstract kinematic structure with estimated joint parameters and part poses as the visual representations for manipulating 3D articulated objects.
We propose object-centric actionable visual priors as a novel perception-interaction handshaking point, where the perception system outputs more actionable guidance than kinematic structure estimation.
arXiv Detail & Related papers (2021-06-28T07:47:31Z) - Simultaneous Multi-View Object Recognition and Grasping in Open-Ended
Domains [0.0]
We propose a deep learning architecture with augmented memory capacities to handle open-ended object recognition and grasping simultaneously.
We demonstrate the ability of our approach to grasp never-seen-before objects and to rapidly learn new object categories using very few examples on-site in both simulation and real-world settings.
arXiv Detail & Related papers (2021-06-03T14:12:11Z) - Look-into-Object: Self-supervised Structure Modeling for Object
Recognition [71.68524003173219]
We propose to "look into object" (explicitly yet intrinsically model the object structure) through incorporating self-supervisions.
We show the recognition backbone can be substantially enhanced for more robust representation learning.
Our approach achieves large performance gains on a number of benchmarks, including generic object recognition (ImageNet) and fine-grained object recognition tasks (CUB, Cars, Aircraft).
arXiv Detail & Related papers (2020-03-31T12:22:51Z) - Investigating the Importance of Shape Features, Color Constancy, Color
Spaces and Similarity Measures in Open-Ended 3D Object Recognition [4.437005770487858]
We study the importance of shape information, color constancy, color spaces, and various similarity measures in open-ended 3D object recognition.
Experimental results show that all of the combinations of color and shape yield significant improvements over the shape-only and color-only approaches.
arXiv Detail & Related papers (2020-02-10T14:24:09Z)