Grasp-type Recognition Leveraging Object Affordance
- URL: http://arxiv.org/abs/2009.09813v1
- Date: Wed, 26 Aug 2020 08:40:27 GMT
- Title: Grasp-type Recognition Leveraging Object Affordance
- Authors: Naoki Wake, Kazuhiro Sasabuchi, Katsushi Ikeuchi
- Abstract summary: A key challenge in robot teaching is grasp-type recognition with a single RGB image and a target object name.
We propose a simple yet effective pipeline to enhance learning-based recognition by leveraging a prior distribution of grasp types for each object.
- Score: 7.227058013536165
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A key challenge in robot teaching is grasp-type recognition with a single RGB
image and a target object name. Here, we propose a simple yet effective
pipeline to enhance learning-based recognition by leveraging a prior
distribution of grasp types for each object. In the pipeline, a convolutional
neural network (CNN) recognizes the grasp type from an RGB image. The
recognition result is further corrected using the prior distribution (i.e.,
affordance), which is associated with the target object name. Experimental
results showed that the proposed method outperforms both a CNN-only and an
affordance-only method. The results highlight the effectiveness of
linguistically-driven object affordance for enhancing grasp-type recognition in
robot teaching.
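The correction step described above can be sketched as a simple product-then-normalize fusion of the CNN's softmax scores with the object's grasp-type prior. The function name, the three-label grasp taxonomy, and the exact fusion rule below are illustrative assumptions, not the paper's formulation:

```python
import numpy as np

# Hypothetical grasp-type labels; the paper's actual taxonomy may differ.
GRASP_TYPES = ["power", "precision", "lateral"]

def correct_with_affordance(cnn_probs: np.ndarray, prior: np.ndarray) -> np.ndarray:
    """Fuse CNN softmax scores with an object-specific grasp-type prior.

    Multiplying the two distributions and renormalizing is one plausible
    reading of "the recognition result is further corrected using the
    prior distribution"; the paper may use a different rule.
    """
    posterior = cnn_probs * prior      # element-wise reweighting by affordance
    total = posterior.sum()
    if total == 0.0:                   # prior rules out every CNN hypothesis
        return prior / prior.sum()     # fall back to the affordance alone
    return posterior / total

# Example: the CNN favors "power", but the affordance looked up from the
# target object name strongly favors "lateral"; the fusion flips the result.
cnn_out = np.array([0.6, 0.3, 0.1])
affordance = np.array([0.1, 0.1, 0.8])
corrected = correct_with_affordance(cnn_out, affordance)
```

This kind of late fusion keeps the CNN and the linguistically-derived prior independent, which matches the abstract's comparison against CNN-only and affordance-only baselines.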
Related papers
- LAC-Net: Linear-Fusion Attention-Guided Convolutional Network for Accurate Robotic Grasping Under the Occlusion [79.22197702626542]
This paper introduces a framework that explores amodal segmentation for robotic grasping in cluttered scenes.
We propose a Linear-fusion Attention-guided Convolutional Network (LAC-Net)
The results on different datasets show that our method achieves state-of-the-art performance.
arXiv Detail & Related papers (2024-08-06T14:50:48Z)
- Cycle-Correspondence Loss: Learning Dense View-Invariant Visual Features from Unlabeled and Unordered RGB Images [8.789674502390378]
We introduce Cycle-Correspondence Loss (CCL) for view-invariant dense descriptor learning.
The key idea is to autonomously detect valid pixel correspondences by attempting to use a prediction over a new image.
Our evaluation shows that we outperform other self-supervised RGB-only methods, and approach performance of supervised methods.
arXiv Detail & Related papers (2024-06-18T09:44:56Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used for mitigating the greedy needs of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results on low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- Learning Common Rationale to Improve Self-Supervised Representation for Fine-Grained Visual Recognition Problems [61.11799513362704]
We propose learning an additional screening mechanism to identify discriminative clues commonly seen across instances and classes.
We show that a common rationale detector can be learned by simply exploiting the GradCAM induced from the SSL objective.
arXiv Detail & Related papers (2023-03-03T02:07:40Z)
- ALSO: Automotive Lidar Self-supervision by Occupancy estimation [70.70557577874155]
We propose a new self-supervised method for pre-training the backbone of deep perception models operating on point clouds.
The core idea is to train the model on a pretext task which is the reconstruction of the surface on which the 3D points are sampled.
The intuition is that if the network is able to reconstruct the scene surface, given only sparse input points, then it probably also captures some fragments of semantic information.
arXiv Detail & Related papers (2022-12-12T13:10:19Z)
- DcnnGrasp: Towards Accurate Grasp Pattern Recognition with Adaptive Regularizer Learning [13.08779945306727]
Current state-of-the-art methods ignore the category information of objects, which is crucial for grasp pattern recognition.
This paper presents a novel dual-branch convolutional neural network (DcnnGrasp) to achieve joint learning of object category classification and grasp pattern recognition.
arXiv Detail & Related papers (2022-05-11T00:34:27Z)
- Learning Consistency from High-quality Pseudo-labels for Weakly Supervised Object Localization [7.602783618330373]
We propose a two-stage approach to learn more consistent localization.
In the first stage, we propose a mask-based pseudo label generator algorithm, and use the pseudo-supervised learning method to initialize an object localization network.
In the second stage, we propose a simple and effective method for evaluating the confidence of pseudo-labels based on classification discrimination.
arXiv Detail & Related papers (2022-03-18T09:05:51Z)
- Object Localization Through a Single Multiple-Model Convolutional Neural Network with a Specific Training Approach [0.0]
A special training approach is proposed for a light convolutional neural network (CNN) to determine the region of interest in an image.
Almost all CNN-based detectors utilize a fixed input size image, which may yield poor performance when dealing with various object sizes.
arXiv Detail & Related papers (2021-03-24T16:52:01Z)
- Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection [4.683939045230724]
Current object detectors struggle to strike a balance between high accuracy and fast inference speed.
We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation.
The network is an order of magnitude smaller than other excellent algorithms.
arXiv Detail & Related papers (2021-01-25T16:36:53Z)
- Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection.
The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.