Interactron: Embodied Adaptive Object Detection
- URL: http://arxiv.org/abs/2202.00660v1
- Date: Tue, 1 Feb 2022 18:56:14 GMT
- Title: Interactron: Embodied Adaptive Object Detection
- Authors: Klemen Kotar, Roozbeh Mottaghi
- Abstract summary: We propose Interactron, a method for adaptive object detection in an interactive setting.
Our idea is to continue training during inference and adapt the model at test time without any explicit supervision via interacting with the environment.
- Score: 18.644357684104662
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Over the years various methods have been proposed for the problem of object
detection. Recently, we have witnessed great strides in this domain owing to
the emergence of powerful deep neural networks. However, there are typically
two main assumptions common among these approaches. First, the model is trained
on a fixed training set and is evaluated on a pre-recorded test set. Second,
the model is kept frozen after the training phase, so no further updates are
performed after the training is finished. These two assumptions limit the
applicability of these methods to real-world settings. In this paper, we
propose Interactron, a method for adaptive object detection in an interactive
setting, where the goal is to perform object detection in images observed by an
embodied agent navigating in different environments. Our idea is to continue
training during inference and adapt the model at test time without any explicit
supervision via interacting with the environment. Our adaptive object detection
model provides an 11.8 point improvement in AP (and 19.1 points in AP50) over
DETR, a recent, high-performance object detector. Moreover, we show that our
object detection model adapts to environments with completely different
appearance characteristics, and its performance is on par with a model trained
with full supervision for those environments.
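The idea of updating a model at test time without labels can be illustrated with a toy example. The sketch below is not the Interactron method, which adapts through interaction with the environment; it swaps in a simpler, generic label-free objective (minimizing the entropy of the model's own prediction) on a small linear softmax classifier. The function names, weights, input, and learning rate are all hypothetical and chosen only for illustration.

```python
import math


def softmax(logits):
    # Subtract the max logit for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    # Shannon entropy of a probability vector (natural log).
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def predict(weights, x):
    # Linear classifier: one weight row per class.
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def adapt_step(weights, x, lr=0.1):
    # One unsupervised gradient step that lowers prediction entropy.
    # For softmax outputs p and entropy H: dH/dz_j = -p_j * (log p_j + H),
    # and dH/dW_jk = dH/dz_j * x_k for a linear layer.
    probs = predict(weights, x)
    h = entropy(probs)
    grad_logits = [-p * (math.log(p) + h) for p in probs]
    return [
        [w - lr * g * xi for w, xi in zip(row, x)]
        for row, g in zip(weights, grad_logits)
    ]


# Hypothetical 3-class model and a single unlabeled test input.
weights = [[0.2, -0.1], [0.0, 0.1], [-0.1, 0.3]]
x = [1.0, 2.0]

before = entropy(predict(weights, x))
adapted = adapt_step(weights, x, lr=0.1)
after = entropy(predict(adapted, x))
print("entropy before:", before)
print("entropy after: ", after)
```

No ground-truth label appears anywhere in the update: the only training signal is the model's own output on the test input, which is the sense in which such adaptation is "without any explicit supervision".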
Related papers
- Weakly Supervised Test-Time Domain Adaptation for Object Detection [23.89166024655107]
In some applications such as surveillance, there is usually a human operator overseeing the system's operation.
We propose to involve the operator in test-time domain adaptation to raise the performance of object detection beyond what is achievable by fully automated adaptation.
We show that the proposed method outperforms existing works, demonstrating a great benefit of human-in-the-loop test-time domain adaptation.
arXiv Detail & Related papers (2024-07-08T04:44:42Z) - Adaptive Retention & Correction for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and ImageNet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z) - STFAR: Improving Object Detection Robustness at Test-Time by
Self-Training with Feature Alignment Regularization [35.16122933158808]
Domain adaptation helps generalize object detection models to target-domain data with distribution shift.
We explore adapting an object detection model at test time, a.k.a. test-time adaptive object detection (TTAOD).
Our proposed method sets the state of the art on the test-time adaptive object detection task.
arXiv Detail & Related papers (2023-03-31T10:04:44Z) - TTA-COPE: Test-Time Adaptation for Category-Level Object Pose Estimation [86.80589902825196]
We propose a method of test-time adaptation for category-level object pose estimation called TTA-COPE.
We design a pose ensemble approach with a self-training loss using pose-aware confidence.
Our approach processes the test data in a sequential, online manner, and it does not require access to the source domain at runtime.
arXiv Detail & Related papers (2023-03-29T14:34:54Z) - Self-improving object detection via disagreement reconciliation [30.971936386281275]
This paper studies how to automatically fine-tune a pre-existing object detector while exploring and acquiring images in a new environment.
By assuming that pseudo-labels for the same object must be consistent across different views, we devise a novel mechanism for producing refined predictions from the consensus among observations.
Our approach improves the off-the-shelf object detector by 2.66% in terms of mAP and outperforms the current state of the art without relying on ground-truth annotations.
arXiv Detail & Related papers (2023-02-21T12:20:46Z) - Suspected Object Matters: Rethinking Model's Prediction for One-stage
Visual Grounding [93.82542533426766]
We propose a Suspected Object Transformation mechanism (SOT) to encourage the target object selection among the suspected ones.
SOT can be seamlessly integrated into existing CNN and Transformer-based one-stage visual grounders.
Extensive experiments demonstrate the effectiveness of our proposed method.
arXiv Detail & Related papers (2022-03-10T06:41:07Z) - Robust Object Detection via Instance-Level Temporal Cycle Confusion [89.1027433760578]
We study the effectiveness of auxiliary self-supervised tasks to improve the out-of-distribution generalization of object detectors.
Inspired by the principle of maximum entropy, we introduce a novel self-supervised task, instance-level temporal cycle confusion (CycConf).
For each object, the task is to find the most different object proposals in the adjacent frame in a video and then cycle back to itself for self-supervision.
arXiv Detail & Related papers (2021-04-16T21:35:08Z) - Reformulating HOI Detection as Adaptive Set Prediction [25.44630995307787]
We reformulate HOI detection as an adaptive set prediction problem.
We propose an Adaptive Set-based one-stage framework (AS-Net) with parallel instance and interaction branches.
Our method outperforms previous state-of-the-art methods without any extra human pose and language features.
arXiv Detail & Related papers (2021-03-10T10:40:33Z) - Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of 18.9% mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z) - Unsupervised Domain Adaptation for Spatio-Temporal Action Localization [69.12982544509427]
Spatio-temporal action localization is an important problem in computer vision.
We propose an end-to-end unsupervised domain adaptation algorithm.
We show that significant performance gain can be achieved when spatial and temporal features are adapted separately or jointly.
arXiv Detail & Related papers (2020-10-19T04:25:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.