Related papers: Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection

URL: http://arxiv.org/abs/2101.10226v1
Date: Mon, 25 Jan 2021 16:36:53 GMT
Title: Lightweight Convolutional Neural Network with Gaussian-based Grasping Representation for Robotic Grasping Detection
Authors: Hu Cao, Guang Chen, Zhijun Li, Jianjie Lin, Alois Knoll
Abstract summary: Current object detectors struggle to strike a balance between high accuracy and fast inference speed. We present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation. The network is an order of magnitude smaller than other excellent algorithms.
Score: 4.683939045230724
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Deep learning methods have achieved excellent results in improving the performance of robotic grasping detection. However, the deep learning methods used in general object detection are not suitable for robotic grasping detection. Current object detectors struggle to strike a balance between high accuracy and fast inference speed. In this paper, we present an efficient and robust fully convolutional neural network model to perform robotic grasping pose estimation from an n-channel input image of the real grasping scene. The proposed network is a lightweight generative architecture for one-stage grasping detection. Specifically, a grasping representation based on a Gaussian kernel is introduced to encode training samples, which embodies the principle of maximum grasping confidence at the central point. Meanwhile, to extract multi-scale information and enhance feature discriminability, a receptive field block (RFB) is integrated into the bottleneck of our grasping detection architecture. In addition, pixel attention and channel attention are combined to automatically learn to focus on fusing context information of varying shapes and sizes by suppressing noise features and highlighting grasping object features. Extensive experiments on two public grasping datasets, Cornell and Jacquard, demonstrate the state-of-the-art performance of our method in balancing accuracy and inference speed. The network is an order of magnitude smaller than other excellent algorithms while achieving better performance, with accuracies of 98.9% and 95.6% on the Cornell and Jacquard datasets, respectively.
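The Gaussian-based grasping representation mentioned in the abstract can be illustrated with a minimal sketch. It assumes a single grasp center and an isotropic 2D Gaussian, so that confidence is highest at the center and decays with distance. The function names, grid size, and sigma value are hypothetical choices made for illustration; the paper's exact encoding may differ.

```python
import math


def gaussian_quality_map(height, width, center, sigma=2.0):
    """Return a height x width grid of grasp-confidence values in (0, 1].

    The value is exactly 1.0 at `center` (row, col) and decays with the
    squared distance from it, so the grasp center has the highest confidence.
    NOTE: illustrative assumption, not the authors' published code.
    """
    cy, cx = center
    return [
        [
            math.exp(-((y - cy) ** 2 + (x - cx) ** 2) / (2.0 * sigma ** 2))
            for x in range(width)
        ]
        for y in range(height)
    ]


def argmax_2d(grid):
    """Return the (row, col) position of the largest value in a 2D list."""
    _, position = max(
        ((value, (y, x)) for y, row in enumerate(grid) for x, value in enumerate(row)),
        key=lambda item: item[0],
    )
    return position


# Encode one hypothetical grasp label centered at row 3, column 5.
quality = gaussian_quality_map(height=7, width=9, center=(3, 5), sigma=1.5)
peak = argmax_2d(quality)
```

Here `peak` recovers the labeled center, which is the property the abstract describes as maximum grasping confidence at the central point. A hard binary rectangle label would instead give every pixel inside the grasp region the same score.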

Related papers

Understanding and Improving Training-Free AI-Generated Image Detections with Vision Foundation Models [68.90917438865078]
Deepfake techniques for facial synthesis and editing pose serious risks for generative models. In this paper, we investigate how detection performance varies across model backbones, types, and datasets. We introduce Contrastive Blur, which enhances performance on facial images, and MINDER, which addresses noise type bias, balancing performance across domains.
arXiv Detail & Related papers (2024-11-28T13:04:45Z)
SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds. With the development of Transformer, the scale of SIRST models is constantly increasing. With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
Global Context Aggregation Network for Lightweight Saliency Detection of Surface Defects [70.48554424894728]
We develop a Global Context Aggregation Network (GCANet) for lightweight saliency detection of surface defects based on the encoder-decoder structure. First, we introduce a novel transformer encoder on the top layer of the lightweight backbone, which captures global context information through a novel Depth-wise Self-Attention (DSA) module. The experimental results on three public defect datasets demonstrate that the proposed network achieves a better trade-off between accuracy and running efficiency compared with 17 other state-of-the-art methods.
arXiv Detail & Related papers (2023-09-22T06:19:11Z)
New SAR target recognition based on YOLO and very deep multi-canonical correlation analysis [0.1503974529275767]
This paper proposes a robust feature extraction method for SAR image target classification by adaptively fusing effective features from different CNN layers. Experiments on the MSTAR dataset demonstrate that the proposed method outperforms the state-of-the-art methods.
arXiv Detail & Related papers (2021-10-28T18:10:26Z)
Convolutional Deep Denoising Autoencoders for Radio Astronomical Images [0.0]
We apply a Machine Learning technique known as Convolutional Denoising Autoencoder to denoise synthetic images of state-of-the-art radio telescopes. Our autoencoder can effectively denoise complex images identifying and extracting faint objects at the limits of the instrumental sensitivity.
arXiv Detail & Related papers (2021-10-16T17:08:30Z)
SignalNet: A Low Resolution Sinusoid Decomposition and Estimation Network [79.04274563889548]
We propose SignalNet, a neural network architecture that detects the number of sinusoids and estimates their parameters from quantized in-phase and quadrature samples. We introduce a worst-case learning threshold for comparing the results of our network relative to the underlying data distributions. In simulation, we find that our algorithm is always able to surpass the threshold for three-bit data but often cannot exceed the threshold for one-bit data.
arXiv Detail & Related papers (2021-06-10T04:21:20Z)
Hyperspectral Images Classification Based on Multi-scale Residual Network [5.166817530813299]
Hyperspectral remote sensing images contain a lot of redundant information and the data structure is non-linear. Deep convolutional neural networks achieve high accuracy, but when only a small amount of data is used for training, the classification accuracy of deep learning methods is greatly reduced. In order to solve the problem of low classification accuracy of existing algorithms on small samples of hyperspectral images, a multi-scale residual network is proposed.
arXiv Detail & Related papers (2020-04-26T13:46:52Z)
Resolution Adaptive Networks for Efficient Inference [53.04907454606711]
We propose a novel Resolution Adaptive Network (RANet), which is inspired by the intuition that low-resolution representations are sufficient for classifying "easy" inputs. In RANet, the input images are first routed to a lightweight sub-network that efficiently extracts low-resolution representations. High-resolution paths in the network maintain the capability to recognize the "hard" samples.
arXiv Detail & Related papers (2020-03-16T16:54:36Z)
BiDet: An Efficient Binarized Object Detector [96.19708396510894]
We propose a binarized neural network learning method called BiDet for efficient object detection. Our BiDet fully utilizes the representational capacity of the binary neural networks for object detection by redundancy removal. Our method outperforms the state-of-the-art binary neural networks by a sizable margin.
arXiv Detail & Related papers (2020-03-09T08:16:16Z)
Depthwise Non-local Module for Fast Salient Object Detection Using a Single Thread [136.2224792151324]
We propose a new deep learning algorithm for fast salient object detection. The proposed algorithm achieves competitive accuracy and high inference efficiency simultaneously with a single CPU thread.
arXiv Detail & Related papers (2020-01-22T15:23:48Z)

This list is automatically generated from the titles and abstracts of the papers on this site.