Training-based Model Refinement and Representation Disagreement for
Semi-Supervised Object Detection
- URL: http://arxiv.org/abs/2307.13755v4
- Date: Thu, 26 Oct 2023 18:13:05 GMT
- Title: Training-based Model Refinement and Representation Disagreement for
Semi-Supervised Object Detection
- Authors: Seyed Mojtaba Marvasti-Zadeh, Nilanjan Ray, Nadir Erbilgin
- Abstract summary: Semi-supervised object detection (SSOD) aims to improve the performance and generalization of existing object detectors.
Recent SSOD methods are still challenged by inadequate model refinement using the classical exponential moving average (EMA) strategy.
This paper proposes a novel training-based model refinement stage and a simple yet effective representation disagreement (RD) strategy.
- Score: 8.096382537967637
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semi-supervised object detection (SSOD) aims to improve the performance and
generalization of existing object detectors by utilizing limited labeled data
and extensive unlabeled data. Despite many advances, recent SSOD methods are
still challenged by inadequate model refinement using the classical exponential
moving average (EMA) strategy, the consensus of Teacher-Student models in the
latter stages of training (i.e., losing their distinctiveness), and
noisy/misleading pseudo-labels. This paper proposes a novel training-based
model refinement (TMR) stage and a simple yet effective representation
disagreement (RD) strategy to address the limitations of classical EMA and the
consensus problem. The TMR stage of the Teacher-Student models optimizes a
lightweight scaling operation to refine the models' weights and prevent
overfitting to, or forgetting of, patterns learned from unlabeled data.
Meanwhile, the RD strategy keeps these models diverged, encouraging the student
model to explore additional patterns in unlabeled data. Our approach can be integrated
into established SSOD methods and is empirically validated using two baseline
methods, with and without cascade regression, to generate more reliable
pseudo-labels. Extensive experiments demonstrate the superior performance of
our approach over state-of-the-art SSOD methods. Specifically, the proposed
approach outperforms the baseline Unbiased-Teacher-v2 (& Unbiased-Teacher-v1)
method by an average mAP margin of 2.23, 2.1, and 3.36 (& 2.07, 1.9, and 3.27)
on COCO-standard, COCO-additional, and Pascal VOC datasets, respectively.
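To make the mechanism concrete, here is a minimal PyTorch sketch contrasting the classical EMA refinement the abstract criticizes with a TMR-flavoured alternative. The scaled_refinement helper, its quadratic objective, and its target argument are illustrative assumptions for exposition, not the authors' actual formulation.

```python
import torch
import torch.nn as nn

@torch.no_grad()
def ema_update(teacher: nn.Module, student: nn.Module, decay: float = 0.999):
    # Classical EMA refinement: each teacher weight drifts toward the
    # corresponding student weight; the paper argues this alone is inadequate.
    for t, s in zip(teacher.parameters(), student.parameters()):
        t.mul_(decay).add_(s, alpha=1.0 - decay)

def scaled_refinement(weight: torch.Tensor, target: torch.Tensor,
                      lr: float = 1e-3, steps: int = 5) -> torch.Tensor:
    # Hypothetical TMR-style step: rather than overwriting the weights,
    # optimize a lightweight per-element scale g so that g * weight improves
    # a refinement objective (target is a stand-in for the real loss signal
    # computed on unlabeled data), preserving previously learned patterns.
    weight = weight.detach()
    g = torch.ones_like(weight, requires_grad=True)
    opt = torch.optim.SGD([g], lr=lr)
    for _ in range(steps):
        loss = ((g * weight - target) ** 2).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return (g * weight).detach()
```

Training only the scales, and only for a handful of steps, is what makes such a refinement "lightweight" compared with retraining the full weight set.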
Related papers
- LPLgrad: Optimizing Active Learning Through Gradient Norm Sample Selection and Auxiliary Model Training [2.762397703396293]
Loss Prediction Loss with Gradient Norm (LPLgrad) is designed to quantify model uncertainty effectively and improve the accuracy of image classification tasks.
LPLgrad operates in two distinct phases: (i) the Training Phase predicts the loss for input features by jointly training a main model and an auxiliary model.
This dual-model approach enhances the ability to extract complex input features and learn intrinsic patterns from the data effectively.
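A generic sketch of the dual-model idea, assuming a simple classifier and an MSE surrogate for the loss-prediction objective (the architectures, dimensions, and loss here are illustrative, not LPLgrad's exact design):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MainModel(nn.Module):
    def __init__(self, in_dim: int = 784, hidden: int = 256, classes: int = 10):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.head = nn.Linear(hidden, classes)

    def forward(self, x):
        feats = self.backbone(x)
        return self.head(feats), feats

class LossPredictor(nn.Module):
    # Auxiliary model: regresses the main model's per-sample loss from features.
    def __init__(self, hidden: int = 256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(hidden, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, feats):
        return self.net(feats).squeeze(-1)

def joint_step(main, aux, opt, x, y, lam: float = 1.0):
    logits, feats = main(x)
    per_sample = F.cross_entropy(logits, y, reduction="none")
    pred = aux(feats.detach())  # detach so the aux loss doesn't steer the backbone
    loss = per_sample.mean() + lam * F.mse_loss(pred, per_sample.detach())
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```

Here opt would be built over both models' parameters, e.g. torch.optim.Adam(list(main.parameters()) + list(aux.parameters())), so the two are trained jointly.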
arXiv Detail & Related papers (2024-11-20T18:12:59Z)
- ACTRESS: Active Retraining for Semi-supervised Visual Grounding [52.08834188447851]
A previous study, RefTeacher, makes the first attempt to tackle this task by adopting the teacher-student framework to provide pseudo confidence supervision and attention-based supervision.
This approach is incompatible with current state-of-the-art visual grounding models, which follow the Transformer-based pipeline.
Our paper proposes the ACTive REtraining approach for Semi-Supervised Visual Grounding, abbreviated as ACTRESS.
arXiv Detail & Related papers (2024-07-03T16:33:31Z)
- STAR: Constraint LoRA with Dynamic Active Learning for Data-Efficient Fine-Tuning of Large Language Models [21.929902181609936]
We propose a novel approach to integrate uncertainty-based active learning and LoRA.
For the uncertainty gap, we introduce a dynamic uncertainty measurement that combines the uncertainty of the base model and the uncertainty of the full model.
For poor model calibration, we incorporate a regularization method during LoRA training to keep the model from being over-confident.
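One way to read "combining the uncertainty of the base model and the uncertainty of the full (LoRA-adapted) model" is a weighted mixture of predictive entropies; the entropy choice, the mixing weight alpha, and the top-k selection below are assumptions, not the paper's exact dynamic measurement:

```python
import torch
import torch.nn.functional as F

def entropy(logits: torch.Tensor) -> torch.Tensor:
    # Predictive entropy per example: higher means more uncertain.
    p = F.softmax(logits, dim=-1)
    return -(p * p.clamp_min(1e-12).log()).sum(dim=-1)

def combined_uncertainty(base_logits, full_logits, alpha: float = 0.5):
    # Illustrative mixture of base-model and full-model uncertainty;
    # a dynamic scheme would adjust alpha as fine-tuning progresses.
    return alpha * entropy(base_logits) + (1.0 - alpha) * entropy(full_logits)

def select_for_labeling(base_logits, full_logits, k: int = 16):
    # Active-learning query: pick the k most uncertain unlabeled examples.
    return combined_uncertainty(base_logits, full_logits).topk(k).indices
```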
arXiv Detail & Related papers (2024-03-02T10:38:10Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Robust Training of Federated Models with Extremely Label Deficiency [84.00832527512148]
Federated semi-supervised learning (FSSL) has emerged as a powerful paradigm for collaboratively training machine learning models using distributed data with label deficiency.
We propose a novel twin-model paradigm, called Twin-sight, designed to enhance mutual guidance by providing insights from different perspectives of labeled and unlabeled data.
Our comprehensive experiments on four benchmark datasets provide substantial evidence that Twin-sight can significantly outperform state-of-the-art methods across various experimental settings.
arXiv Detail & Related papers (2024-02-22T10:19:34Z)
- Universal Domain Adaptation from Foundation Models: A Baseline Study [58.51162198585434]
We conduct empirical studies of state-of-the-art UniDA methods using foundation models.
We introduce CLIP distillation, a parameter-free method specifically designed to distill target knowledge from CLIP models.
Although simple, our method outperforms previous approaches in most benchmark tasks.
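One plausible reading of a parameter-free "CLIP distillation" is to use CLIP's zero-shot class probabilities on target-domain images as soft labels for the task model; the class names, prompt template, temperature, and KL objective below are assumptions rather than the paper's exact recipe (the clip package is OpenAI's reference implementation):

```python
import torch
import torch.nn.functional as F
import clip  # https://github.com/openai/CLIP

device = "cuda" if torch.cuda.is_available() else "cpu"
clip_model, preprocess = clip.load("ViT-B/32", device=device)
class_names = ["dog", "cat", "car"]  # placeholder target classes
text_tokens = clip.tokenize([f"a photo of a {c}" for c in class_names]).to(device)

@torch.no_grad()
def clip_soft_targets(images: torch.Tensor, temperature: float = 0.01):
    # Zero-shot CLIP probabilities over the target classes, used as soft
    # pseudo-labels; no new parameters are introduced, hence "parameter-free".
    # images must be preprocessed with `preprocess` and moved to `device`.
    img = F.normalize(clip_model.encode_image(images).float(), dim=-1)
    txt = F.normalize(clip_model.encode_text(text_tokens).float(), dim=-1)
    return F.softmax(img @ txt.t() / temperature, dim=-1)

def distill_loss(student_logits: torch.Tensor, images: torch.Tensor):
    # KL divergence between the student's distribution and CLIP's soft targets.
    targets = clip_soft_targets(images)
    return F.kl_div(F.log_softmax(student_logits, dim=-1), targets,
                    reduction="batchmean")
```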
arXiv Detail & Related papers (2023-05-18T16:28:29Z)
- Meta-tuning Loss Functions and Data Augmentation for Few-shot Object Detection [7.262048441360132]
Few-shot object detection is an emerging topic in the area of few-shot learning and object detection.
We propose a training scheme that allows learning inductive biases that can boost few-shot detection.
The proposed approach yields interpretable loss functions, as opposed to highly parametric and complex few-shot meta-models.
arXiv Detail & Related papers (2023-04-24T15:14:16Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Model Agnostic Sample Reweighting for Out-of-Distribution Learning [38.843552982739354]
We propose a principled method, Model Agnostic samPLe rEweighting (MAPLE), to effectively address the OOD problem.
Our key idea is to find an effective reweighting of the training samples so that the standard empirical risk minimization training of a large model leads to superior OOD generalization performance.
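The key idea reads as a bilevel problem: an outer loop searches for per-sample weights such that inner, plain ERM training on the reweighted loss generalizes out of distribution. A minimal sketch of the inner weighted objective (the softmax parameterization is an assumption used to keep weights positive and normalized):

```python
import torch
import torch.nn.functional as F

def weighted_erm_loss(logits: torch.Tensor, targets: torch.Tensor,
                      weight_logits: torch.Tensor) -> torch.Tensor:
    # Inner objective: standard ERM over a reweighted training distribution.
    # The outer loop of a MAPLE-style method would update weight_logits so
    # that a model trained on this loss performs well on held-out OOD data.
    per_sample = F.cross_entropy(logits, targets, reduction="none")
    w = torch.softmax(weight_logits, dim=0)  # positive, sums to one
    return (w * per_sample).sum()
```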
arXiv Detail & Related papers (2023-01-24T05:11:03Z)
- DEALIO: Data-Efficient Adversarial Learning for Imitation from Observation [57.358212277226315]
In imitation learning from observation (IfO), a learning agent seeks to imitate a demonstrating agent using only observations of the demonstrated behavior, without access to the control signals generated by the demonstrator.
Recent methods based on adversarial imitation learning have led to state-of-the-art performance on IfO problems, but they typically suffer from high sample complexity due to a reliance on data-inefficient, model-free reinforcement learning algorithms.
This issue makes them impractical to deploy in real-world settings, where gathering samples can incur high costs in terms of time, energy, and risk.
We propose a more data-efficient IfO algorithm.
arXiv Detail & Related papers (2021-03-31T23:46:32Z)