Gram-SLD: Automatic Self-labeling and Detection for Instance Objects
- URL: http://arxiv.org/abs/2112.03641v1
- Date: Tue, 7 Dec 2021 11:34:55 GMT
- Title: Gram-SLD: Automatic Self-labeling and Detection for Instance Objects
- Authors: Rui Wang, Chengtun Wu, Jiawen Xin, and Liang Zhang
- Abstract summary: We propose a new framework based on co-training called Gram Self-Labeling and Detection (Gram-SLD).
Gram-SLD can automatically annotate a large amount of data with very limited manually labeled key data and achieve competitive performance.
- Score: 6.512856940779818
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Instance object detection plays an important role in intelligent monitoring,
visual navigation, human-computer interaction, intelligent services and other
fields. Inspired by the great success of Deep Convolutional Neural Network
(DCNN), DCNN-based instance object detection has become a promising research
topic. To address the problem that DCNN always requires a large-scale annotated
dataset to supervise its training while manual annotation is exhausting and
time-consuming, we propose a new framework based on co-training called Gram
Self-Labeling and Detection (Gram-SLD). The proposed Gram-SLD can automatically
annotate a large amount of data with very limited manually labeled key data and
achieve competitive performance. In our framework, a gram loss is defined and
used to construct two fully redundant and independent views, and a key-sample
selection strategy, together with an automatic annotating strategy that jointly
considers precision and recall, is proposed to generate high-quality
pseudo-labels. Experiments on the public GMU Kitchen Dataset, the Active
Vision Dataset and the self-made BHID-ITEM Dataset demonstrate that, with only
5% labeled training data, our Gram-SLD achieves competitive performance in
object detection (less than 2% mAP loss), compared with the fully supervised
methods. In practical applications with complex and changing environments, the
proposed method can satisfy the real-time and accuracy requirements on instance
object detection.
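The summary does not specify how the gram loss is computed. A common choice for comparing feature representations is the Gram matrix of channel-wise correlations; the sketch below is an illustrative assumption of how such a loss between two feature views could be implemented, not the authors' actual method (function names, shapes, and the normalization factor are all hypothetical):

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of a (C, H, W) feature map: channel-wise correlations."""
    c, h, w = features.shape
    flat = features.reshape(c, h * w)
    # (C, C) matrix of inner products between channels, normalized by size.
    return flat @ flat.T / (c * h * w)

def gram_loss(feat_a, feat_b):
    """Mean squared difference between the Gram matrices of two views.

    In a co-training setting, such a term could be used to measure how
    similar (or, if maximized, how independent) two views' feature
    statistics are.
    """
    return float(np.mean((gram_matrix(feat_a) - gram_matrix(feat_b)) ** 2))
```

By construction the loss is zero for identical views and grows as the channel correlation structure of the two views diverges, which is one plausible way to enforce the "fully redundant and independent views" condition co-training requires.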
Related papers
- Web-Scale Visual Entity Recognition: An LLM-Driven Data Approach [56.55633052479446]
Web-scale visual entity recognition presents significant challenges due to the lack of clean, large-scale training data.
We propose a novel methodology to curate such a dataset, leveraging a multimodal large language model (LLM) for label verification, metadata generation, and rationale explanation.
Experiments demonstrate that models trained on this automatically curated data achieve state-of-the-art performance on web-scale visual entity recognition tasks.
arXiv Detail & Related papers (2024-10-31T06:55:24Z) - Semi-Supervised Object Detection: A Survey on Progress from CNN to Transformer [12.042768320132694]
This paper presents a review of 27 cutting-edge developments in semi-supervised learning for object detection.
It covers data augmentation techniques, pseudo-labeling strategies, consistency regularization, and adversarial training methods.
We aim to ignite further research interest in overcoming existing challenges and exploring new directions in semi-supervised learning for object detection.
arXiv Detail & Related papers (2024-07-11T12:58:13Z) - Self-Supervised Learning for User Localization [8.529237718266042]
Machine learning techniques have shown remarkable accuracy in localization tasks.
Their dependency on vast amounts of labeled data, particularly Channel State Information (CSI) and corresponding coordinates, remains a bottleneck.
We propose a pioneering approach that leverages self-supervised pretraining on unlabeled data to boost the performance of supervised learning for user localization based on CSI.
arXiv Detail & Related papers (2024-04-19T21:49:10Z) - Run-time Introspection of 2D Object Detection in Automated Driving Systems Using Learning Representations [13.529124221397822]
We introduce a novel introspection solution for 2D object detection based on Deep Neural Networks (DNNs).
We implement several state-of-the-art (SOTA) introspection mechanisms for error detection in 2D object detection, using one-stage and two-stage object detectors evaluated on KITTI and BDD datasets.
Our performance evaluation shows that the proposed introspection solution outperforms SOTA methods, achieving an absolute reduction in the missed error ratio of 9% to 17% in the BDD dataset.
arXiv Detail & Related papers (2024-03-02T10:56:14Z) - Attentive Prototypes for Source-free Unsupervised Domain Adaptive 3D Object Detection [85.11649974840758]
3D object detection networks tend to be biased towards the data they are trained on.
We propose a single-frame approach for source-free, unsupervised domain adaptation of lidar-based 3D object detectors.
arXiv Detail & Related papers (2021-11-30T18:42:42Z) - Self-Supervised Object Detection via Generative Image Synthesis [106.65384648377349]
We present the first end-to-end analysis-by-synthesis framework with controllable GANs for the task of self-supervised object detection.
We use collections of real world images without bounding box annotations to learn to synthesize and detect objects.
Our work advances the field of self-supervised object detection by introducing a successful new paradigm of using controllable GAN-based image synthesis for it.
arXiv Detail & Related papers (2021-10-19T11:04:05Z) - Unsupervised Domain Adaptive Learning via Synthetic Data for Person Re-identification [101.1886788396803]
Person re-identification (re-ID) has gained more and more attention due to its widespread applications in video surveillance.
Unfortunately, the mainstream deep learning methods still need a large quantity of labeled data to train models.
In this paper, we develop a data collector to automatically generate synthetic re-ID samples in a computer game, and construct a data labeler to simultaneously annotate them.
arXiv Detail & Related papers (2021-09-12T15:51:41Z) - On the Use of Interpretable Machine Learning for the Management of Data Quality [13.075880857448059]
We propose the use of interpretable machine learning to identify the features on which any data processing activity should be based.
Our aim is to secure data quality, at least, for those features that are detected as significant in the collected datasets.
arXiv Detail & Related papers (2020-07-25T21:59:17Z) - Federated Self-Supervised Learning of Multi-Sensor Representations for Embedded Intelligence [8.110949636804772]
Smartphones, wearables, and Internet of Things (IoT) devices produce a wealth of data that cannot be accumulated in a centralized repository for learning supervised models.
We propose a self-supervised approach termed scalogram-signal correspondence learning based on wavelet transform to learn useful representations from unlabeled sensor inputs.
We extensively assess the quality of learned features with our multi-view strategy on diverse public datasets, achieving strong performance in all domains.
arXiv Detail & Related papers (2020-07-25T21:59:17Z) - One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z) - Stance Detection Benchmark: How Robust Is Your Stance Detection? [65.91772010586605]
Stance Detection (StD) aims to detect an author's stance towards a certain topic or claim.
We introduce a StD benchmark that learns from ten StD datasets of various domains in a multi-dataset learning setting.
Within this benchmark setup, we are able to present new state-of-the-art results on five of the datasets.
arXiv Detail & Related papers (2020-01-06T13:37:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.