A New Dataset, Poisson GAN and AquaNet for Underwater Object Grabbing
- URL: http://arxiv.org/abs/2003.01446v2
- Date: Wed, 28 Jul 2021 01:32:42 GMT
- Title: A New Dataset, Poisson GAN and AquaNet for Underwater Object Grabbing
- Authors: Chongwei Liu, Zhihui Wang, Shijie Wang, Tao Tang, Yulong Tao, Caifei
Yang, Haojie Li, Xing Liu, and Xin Fan
- Abstract summary: We propose a new dataset (UDD) consisting of three categories (seacucumber, seaurchin, and scallop) with 2,227 images.
We also propose a novel Poisson-blending Generative Adversarial Network (Poisson GAN) and an efficient object detection network (AquaNet) to address two common issues within related datasets.
- Score: 33.580474181751676
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: To boost the object grabbing capability of underwater robots for open-sea
farming, we propose a new dataset (UDD) consisting of three categories
(seacucumber, seaurchin, and scallop) with 2,227 images. To the best of our
knowledge, it is the first 4K HD dataset collected in a real open-sea farm. We
also propose a novel Poisson-blending Generative Adversarial Network (Poisson
GAN) and an efficient object detection network (AquaNet) to address two common
issues within related datasets: the class-imbalance problem and the problem of
massive small objects, respectively. Specifically, Poisson GAN combines Poisson
blending into its generator and employs a new loss called Dual Restriction loss
(DR loss), which supervises both implicit space features and image-level
features during training to generate more realistic images. By utilizing
Poisson GAN, objects of minority classes such as seacucumber or scallop can be
added to an image naturally and annotated automatically, which increases the
loss contribution of minority classes when training detectors and thus
alleviates the class-imbalance problem. AquaNet is a high-efficiency detector
that addresses the problem of detecting massive small objects in cloudy
underwater images. Within
it, we design two efficient components: a depth-wise-convolution-based
Multi-scale Contextual Features Fusion (MFF) block and a Multi-scale
Blursampling (MBP) module to reduce the parameters of the network to 1.3
million. Both components provide multi-scale features of small
objects under a short backbone configuration without any loss of accuracy. In
addition, we construct a large-scale augmented dataset (AUDD) and a
pre-training dataset via Poisson GAN from UDD. Extensive experiments show the
effectiveness of the proposed Poisson GAN, AquaNet, UDD, AUDD, and pre-training
dataset.
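The generator of Poisson GAN builds on classical Poisson blending: a source object is pasted into a target image by matching the source's gradients inside the paste region while keeping the target's values on the boundary. The paper's GAN architecture and DR loss are not detailed here, but the underlying compositing step can be sketched with a simple Jacobi solver for the discrete Poisson equation (a minimal grayscale sketch, assuming the mask does not touch the image border):

```python
import numpy as np

def poisson_blend(src, dst, mask, iters=2000):
    """Blend `src` into `dst` inside `mask` by solving the discrete
    Poisson equation with Jacobi iterations: the source's Laplacian is
    the guidance field, the destination supplies the boundary values."""
    out = dst.astype(np.float64).copy()
    src = src.astype(np.float64)
    inside = mask > 0
    # Discrete Laplacian of the source (4-neighbour stencil).
    lap = (np.roll(src, 1, 0) + np.roll(src, -1, 0) +
           np.roll(src, 1, 1) + np.roll(src, -1, 1) - 4.0 * src)
    for _ in range(iters):
        # Sum of the four neighbours of every pixel.
        nb = (np.roll(out, 1, 0) + np.roll(out, -1, 0) +
              np.roll(out, 1, 1) + np.roll(out, -1, 1))
        # Jacobi update only inside the mask; the rest stays `dst`.
        out[inside] = (nb[inside] - lap[inside]) / 4.0
    return out

# Paste a bright 2x2 patch into a uniform background: the patch's
# intensity structure is carried over, the seam stays smooth, and the
# paste rectangle doubles as an automatic bounding-box annotation.
src = np.zeros((8, 8)); src[3:5, 3:5] = 10.0
dst = np.full((8, 8), 5.0)
mask = np.zeros((8, 8)); mask[2:6, 2:6] = 1.0
out = poisson_blend(src, dst, mask)
```

Because the paste location is chosen by the augmentation pipeline, the bounding box of the blended object is known exactly, which is what makes the automatic annotation of minority-class objects possible.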
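The abstract attributes AquaNet's small footprint (1.3 million parameters) in part to its depth-wise-convolution-based MFF block. The exact block layout is not given here, but the parameter saving of a depth-wise separable convolution over a standard convolution, the standard trick such blocks rely on, is easy to sketch (illustrative channel counts, not the paper's actual configuration):

```python
def conv_params(c_in, c_out, k):
    """Parameters of a standard k x k convolution (bias omitted)."""
    return c_in * c_out * k * k

def depthwise_separable_params(c_in, c_out, k):
    """Depth-wise k x k conv (one filter per input channel)
    followed by a 1 x 1 point-wise conv (bias omitted)."""
    return c_in * k * k + c_in * c_out

# Example: a 3x3 layer mapping 128 channels to 128 channels.
std = conv_params(128, 128, 3)                  # 147,456 parameters
dws = depthwise_separable_params(128, 128, 3)   # 17,536 parameters
ratio = std / dws                               # roughly 8.4x fewer
```

For 3x3 kernels the saving approaches 9x as the channel count grows, which is how a multi-branch fusion block can stay within a very small parameter budget.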
Related papers
- PVAFN: Point-Voxel Attention Fusion Network with Multi-Pooling Enhancing for 3D Object Detection [59.355022416218624]
The integration of point and voxel representations is becoming more common in LiDAR-based 3D object detection.
We propose a novel two-stage 3D object detector, called the Point-Voxel Attention Fusion Network (PVAFN).
PVAFN uses a multi-pooling strategy to integrate both multi-scale and region-specific information effectively.
arXiv Detail & Related papers (2024-08-26T19:43:01Z)
- YOLC: You Only Look Clusters for Tiny Object Detection in Aerial Images [33.80392696735718]
YOLC (You Only Look Clusters) is an efficient and effective framework that builds on an anchor-free object detector, CenterNet.
To overcome the challenges posed by large-scale images and non-uniform object distribution, we introduce a Local Scale Module (LSM) that adaptively searches cluster regions for zooming in for accurate detection.
We perform extensive experiments on two aerial image datasets, including Visdrone 2019 and UAVDT, to demonstrate the effectiveness and superiority of our proposed approach.
arXiv Detail & Related papers (2024-04-09T10:03:44Z)
- SIRST-5K: Exploring Massive Negatives Synthesis with Self-supervised Learning for Robust Infrared Small Target Detection [53.19618419772467]
Single-frame infrared small target (SIRST) detection aims to recognize small targets from clutter backgrounds.
With the development of Transformer, the scale of SIRST models is constantly increasing.
With a rich diversity of infrared small target data, our algorithm significantly improves the model performance and convergence speed.
arXiv Detail & Related papers (2024-03-08T16:14:54Z)
- Small Object Detection via Coarse-to-fine Proposal Generation and Imitation Learning [52.06176253457522]
We propose a two-stage framework tailored for small object detection based on the Coarse-to-fine pipeline and Feature Imitation learning.
CFINet achieves state-of-the-art performance on the large-scale small object detection benchmarks, SODA-D and SODA-A.
arXiv Detail & Related papers (2023-08-18T13:13:09Z)
- Class Anchor Margin Loss for Content-Based Image Retrieval [97.81742911657497]
We propose a novel repeller-attractor loss that falls in the metric learning paradigm, yet directly optimizes for the L2 metric without the need to generate pairs.
We evaluate the proposed objective in the context of few-shot and full-set training on the CBIR task, by using both convolutional and transformer architectures.
arXiv Detail & Related papers (2023-06-01T12:53:10Z)
- Boosting R-CNN: Reweighting R-CNN Samples by RPN's Error for Underwater Object Detection [12.291063755824585]
We propose a two-stage underwater detector named boosting R-CNN, which comprises three key components.
First, a new region proposal network named RetinaRPN is proposed, which provides high-quality proposals.
Second, the probabilistic inference pipeline is introduced to combine the first-stage prior uncertainty and the second-stage classification score.
arXiv Detail & Related papers (2022-06-28T03:29:20Z)
- SALISA: Saliency-based Input Sampling for Efficient Video Object Detection [58.22508131162269]
We propose SALISA, a novel non-uniform SALiency-based Input SAmpling technique for video object detection.
We show that SALISA significantly improves the detection of small objects.
arXiv Detail & Related papers (2022-04-05T17:59:51Z)
- SWIPENET: Object detection in noisy underwater images [41.35601054297707]
We propose a novel Sample-WeIghted hyPEr Network (SWIPENET), and a robust training paradigm named Curriculum Multi-Class Adaboost (CMA) to address these two problems.
The backbone of SWIPENET produces multiple high resolution and semantic-rich Hyper Feature Maps, which significantly improve small object detection.
Inspired by the human education process that drives the learning from easy to hard concepts, we here propose the CMA training paradigm that first trains a clean detector which is free from the influence of noisy data.
arXiv Detail & Related papers (2020-10-19T16:41:20Z)
- Reinforced Axial Refinement Network for Monocular 3D Object Detection [160.34246529816085]
Monocular 3D object detection aims to extract the 3D position and properties of objects from a 2D input image.
Conventional approaches sample 3D bounding boxes from the space and infer the relationship between the target object and each of them; however, the probability of effective samples is relatively small in 3D space.
We propose to start with an initial prediction and refine it gradually towards the ground truth, with only one 3D parameter changed in each step.
This requires designing a policy which gets a reward after several steps, and thus we adopt reinforcement learning to optimize it.
arXiv Detail & Related papers (2020-08-31T17:10:48Z)
- Underwater object detection using Invert Multi-Class Adaboost with deep learning [37.14538666012363]
We propose a novel neural network architecture, namely Sample-WeIghted hyPEr Network (SWIPENet), for small object detection.
We show that the proposed SWIPENet+IMA framework achieves better performance in detection accuracy against several state-of-the-art object detection approaches.
arXiv Detail & Related papers (2020-05-23T15:30:38Z)
- PENet: Object Detection using Points Estimation in Aerial Images [9.33900415971554]
A novel network structure, Points Estimated Network (PENet), is proposed in this work to answer these challenges.
PENet uses a Mask Resampling Module (MRM) to augment the imbalanced datasets, a coarse anchor-free detector (CPEN) to effectively predict the center points of the small object clusters, and a fine anchor-free detector FPEN to locate the precise positions of the small objects.
Our experiments on the aerial datasets VisDrone and UAVDT showed that PENet achieves higher precision than existing state-of-the-art approaches.
arXiv Detail & Related papers (2020-01-22T19:43:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.