Enhanced Training of Query-Based Object Detection via Selective Query
Recollection
- URL: http://arxiv.org/abs/2212.07593v3
- Date: Wed, 22 Mar 2023 00:23:39 GMT
- Authors: Fangyi Chen, Han Zhang, Kai Hu, Yu-kai Huang, Chenchen Zhu, Marios
Savvides
- Abstract summary: This paper investigates a phenomenon where query-based object detectors mispredict at the last decoding stage while predicting correctly at an intermediate stage.
We design and present Selective Query Recollection, a simple and effective training strategy for query-based object detectors.
- Score: 35.3219210570517
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper investigates a phenomenon where query-based object detectors
mispredict at the last decoding stage while predicting correctly at an
intermediate stage. We review the training process and attribute the overlooked
phenomenon to two limitations: lack of training emphasis and cascading errors
from decoding sequence. We design and present Selective Query Recollection
(SQR), a simple and effective training strategy for query-based object
detectors. It cumulatively collects intermediate queries as decoding stages go
deeper and selectively forwards the queries to the downstream stages aside from
the sequential structure. In this way, SQR places training emphasis on later
stages and allows later stages to work with intermediate queries from earlier
stages directly. SQR can be easily plugged into various query-based object
detectors and significantly enhances their performance while leaving the
inference pipeline unchanged. We apply SQR to AdaMixer, DAB-DETR,
and Deformable-DETR across various settings (backbone, number of queries,
schedule), and it consistently brings a 1.4-2.8 AP improvement.
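The collect-and-forward idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not say which intermediate queries are selected, so the rule used here (forward the outputs of the `window` most recent stages, with the initial queries counted as stage 0) is an assumption, and `recollect_forward`, `Queries`, and `Stage` are hypothetical names. Decoder stages are modeled as plain functions on lists of floats so the example runs without any deep learning library.

```python
from typing import Callable, List, Sequence

Queries = List[float]                  # stand-in for one set of object queries
Stage = Callable[[Queries], Queries]   # stand-in for one decoder stage


def recollect_forward(
    stages: Sequence[Stage], init: Queries, window: int = 2
) -> List[List[Queries]]:
    """Training-time pass that feeds collected queries to later stages.

    Assumption (not stated in the abstract): each stage receives the outputs
    of the `window` most recent stages, the initial queries counting as
    stage 0. Returns, per stage, every output that would receive a loss.
    """
    history: List[List[Queries]] = [[init]]
    per_stage: List[List[Queries]] = []
    for stage in stages:
        # Newest group first, so index 0 always follows the sequential path.
        inputs = [q for group in reversed(history[-window:]) for q in group]
        outputs = [stage(q) for q in inputs]
        per_stage.append(outputs)
        history.append(outputs)
    return per_stage


# Six toy stages; stage k simply adds k to every query value.
stages = [lambda q, k=k: [x + k for x in q] for k in range(1, 7)]
outs = recollect_forward(stages, [0.0])
print([len(o) for o in outs])  # [1, 2, 3, 5, 8, 13]
print(outs[-1][0])             # [21.0], same as plain sequential decoding
```

Under this assumed rule, later stages process more query sets and therefore receive more supervision, which matches the "training emphasis on later stages" described above. Index 0 at every stage is the ordinary sequential path, so inference can keep using only that path, and setting `window=1` reduces the sketch to a plain sequential decoder.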
Related papers
- DQ-LoRe: Dual Queries with Low Rank Approximation Re-ranking for
In-Context Learning [66.85379279041128]
In this study, we introduce a framework that leverages Dual Queries and Low-rank approximation Re-ranking to automatically select exemplars for in-context learning.
DQ-LoRe significantly outperforms prior state-of-the-art methods in the automatic selection of exemplars for GPT-4, enhancing performance from 92.5% to 94.2%.
arXiv Detail & Related papers (2023-10-04T16:44:37Z)
- Temporal-aware Hierarchical Mask Classification for Video Semantic
Segmentation [62.275143240798236]
Video semantic segmentation datasets have limited categories per video.
Less than 10% of queries could be matched to receive meaningful gradient updates during VSS training.
Our method achieves state-of-the-art performance on the latest challenging VSS benchmark VSPW without bells and whistles.
arXiv Detail & Related papers (2023-09-14T20:31:06Z)
- Deep Equilibrium Object Detection [24.69829309391189]
We present a new query-based object detector (DEQDet) by designing a deep equilibrium decoder.
Our experiments demonstrate DEQDet converges faster, consumes less memory, and achieves better results than the baseline counterpart.
arXiv Detail & Related papers (2023-08-18T13:56:03Z)
- StageInteractor: Query-based Object Detector with Cross-stage
Interaction [21.84964476813102]
We propose a new query-based object detector with cross-stage interaction, coined as StageInteractor.
Our model improves the baseline by 2.2 AP, and achieves 44.8 AP with ResNet-50 as backbone.
With longer training time and 300 queries, StageInteractor achieves 51.1 AP and 52.2 AP with ResNeXt-101-DCN and Swin-S, respectively.
arXiv Detail & Related papers (2023-04-11T04:50:13Z)
- Noise-Robust Dense Retrieval via Contrastive Alignment Post Training [89.29256833403167]
Contrastive Alignment POst Training (CAPOT) is a highly efficient finetuning method that improves model robustness without requiring index regeneration.
CAPOT enables robust retrieval by freezing the document encoder while the query encoder learns to align noisy queries with their unaltered root.
We evaluate CAPOT on noisy variants of MSMARCO, Natural Questions, and Trivia QA passage retrieval, finding CAPOT has a similar impact as data augmentation with none of its overhead.
arXiv Detail & Related papers (2023-04-06T22:16:53Z)
- ReAct: Temporal Action Detection with Relational Queries [84.76646044604055]
This work aims at advancing temporal action detection (TAD) using an encoder-decoder framework with action queries.
We first propose a relational attention mechanism in the decoder, which guides the attention among queries based on their relations.
Lastly, we propose to predict the localization quality of each action query at inference in order to distinguish high-quality queries.
arXiv Detail & Related papers (2022-07-14T17:46:37Z)
- UP-DETR: Unsupervised Pre-training for Object Detection with
Transformers [11.251593386108189]
We propose a novel pretext task named random query patch detection in Unsupervised Pre-training DETR (UP-DETR).
Specifically, we randomly crop patches from the given image and then feed them as queries to the decoder.
UP-DETR significantly boosts the performance of DETR with faster convergence and higher average precision on object detection, one-shot detection and panoptic segmentation.
arXiv Detail & Related papers (2020-11-18T05:16:11Z)
- Query Resolution for Conversational Search with Limited Supervision [63.131221660019776]
We propose QuReTeC (Query Resolution by Term Classification), a neural query resolution model based on bidirectional transformers.
We show that QuReTeC outperforms state-of-the-art models, and furthermore, that our distant supervision method can be used to substantially reduce the amount of human-curated data required to train QuReTeC.
arXiv Detail & Related papers (2020-05-24T11:37:22Z)