End-to-End Lane detection with One-to-Several Transformer
- URL: http://arxiv.org/abs/2305.00675v4
- Date: Sat, 13 May 2023 04:42:42 GMT
- Title: End-to-End Lane detection with One-to-Several Transformer
- Authors: Kunyang Zhou and Rui Zhou
- Abstract summary: O2SFormer converges 12.5x faster than DETR for the ResNet18 backbone.
O2SFormer with ResNet50 backbone achieves 77.83% F1 score on CULane dataset, outperforming existing Transformer-based and CNN-based detectors.
- Score: 6.79236957488334
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although lane detection methods have shown impressive performance in
real-world scenarios, most methods require post-processing, which is not robust
enough. Therefore, end-to-end detectors such as the DEtection TRansformer (DETR)
have been introduced in lane detection. However, one-to-one label assignment in
DETR can degrade training efficiency due to label semantic conflicts. Besides, the
positional query in DETR is unable to provide an explicit positional prior, making
it difficult to optimize. In this paper, we present the One-to-Several
Transformer (O2SFormer). We first propose one-to-several label assignment, which
combines one-to-many and one-to-one label assignment to resolve label semantic
conflicts while keeping end-to-end detection. To overcome the difficulty in
optimizing one-to-one assignment, we further propose the layer-wise soft label,
which dynamically adjusts the positive weight of positive lane anchors across
decoder layers. Finally, we design the dynamic anchor-based positional query to
exploit positional priors by incorporating lane anchors into the positional query.
Experimental results show that O2SFormer with a ResNet50 backbone achieves a
77.83% F1 score on the CULane dataset, outperforming existing Transformer-based
and CNN-based detectors. Furthermore, O2SFormer converges 12.5x faster than DETR
with a ResNet18 backbone.
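The assignment and weighting details are specified in the paper; the following is a minimal NumPy/SciPy sketch of how a one-to-several scheme with layer-wise soft labels could be organized: early decoder layers assign each ground-truth lane to its k lowest-cost anchors (one-to-many), the final layer uses Hungarian one-to-one matching so detection stays end-to-end, and the classification target of each positive anchor is softened by a layer-dependent weight. The cost matrix, the top-k rule, and the linear weight schedule are illustrative assumptions, not the authors' exact formulation.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def one_to_several_assignment(cost, num_layers, k_many=3):
    """Toy one-to-several label assignment over decoder layers.

    cost: (num_anchors, num_gt) matching cost, lower is better; a real cost
    would combine classification and lane-regression terms.
    Early layers assign each ground-truth lane to its k_many lowest-cost
    anchors (one-to-many); the final layer uses one-to-one Hungarian matching
    so the detector stays end-to-end (no NMS needed at inference).
    Returns one list of (anchor_idx, gt_idx) positive pairs per layer.
    """
    per_layer = []
    for layer in range(num_layers):
        if layer < num_layers - 1:                      # one-to-many layers
            pairs = []
            for gt in range(cost.shape[1]):
                for a in np.argsort(cost[:, gt])[:k_many]:
                    pairs.append((int(a), gt))
        else:                                           # final one-to-one layer
            rows, cols = linear_sum_assignment(cost)
            pairs = list(zip(rows.tolist(), cols.tolist()))
        per_layer.append(pairs)
    return per_layer

def layerwise_soft_label(layer, num_layers, quality, base=0.5):
    """Soft classification target for a positive anchor in a given layer.

    quality: localization quality of the anchor w.r.t. its lane (e.g. an
    IoU-like overlap in [0, 1]). Deeper layers weight the target more toward
    this quality; the linear ramp is an assumed schedule, not the paper's.
    """
    alpha = base + (1.0 - base) * layer / max(num_layers - 1, 1)
    return (1.0 - alpha) + alpha * quality

# Usage: 20 lane anchors, 3 ground-truth lanes, 6 decoder layers.
rng = np.random.default_rng(0)
cost = rng.random((20, 3))
assignments = one_to_several_assignment(cost, num_layers=6)
print(len(assignments[0]), "positives in layer 0 vs", len(assignments[-1]), "in the last layer")
print("soft target in layer 5 for quality 0.8:", round(layerwise_soft_label(5, 6, 0.8), 3))
```

In this sketch the one-to-many layers simply give each lane more positive anchors to learn from, while the single one-to-one layer keeps predictions non-redundant, which is the property that lets end-to-end detectors drop NMS.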
Related papers
- Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement [19.277560848076984]
Two-stage selection strategies result in scale bias and redundancy due to mismatch between selected queries and objects.
We propose hierarchical salience filtering refinement, which performs transformer encoding only on filtered discriminative queries.
The proposed Salience DETR achieves significant improvements of +4.0% AP, +0.2% AP, +4.4% AP on three challenging task-specific detection datasets.
arXiv Detail & Related papers (2024-03-24T13:01:57Z)
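Reading the Salience DETR entry above loosely, the filtering idea can be sketched as scoring every feature token with a small salience head and running the transformer encoder only on the top-scoring subset. The module below (SalienceFilteredEncoder, its scoring head, and the keep ratio) is a hypothetical PyTorch illustration, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class SalienceFilteredEncoder(nn.Module):
    """Hypothetical sketch: encode only the most salient feature tokens."""

    def __init__(self, dim=256, nhead=8, num_layers=2, keep_ratio=0.3):
        super().__init__()
        self.salience_head = nn.Linear(dim, 1)      # per-token salience score
        layer = nn.TransformerEncoderLayer(d_model=dim, nhead=nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.keep_ratio = keep_ratio

    def forward(self, tokens):                      # tokens: (B, N, dim)
        scores = self.salience_head(tokens).squeeze(-1)           # (B, N)
        k = max(1, int(tokens.shape[1] * self.keep_ratio))
        idx = scores.topk(k, dim=1).indices                       # top-k salient tokens
        gather_idx = idx.unsqueeze(-1).expand(-1, -1, tokens.shape[-1])
        picked = torch.gather(tokens, 1, gather_idx)
        encoded = self.encoder(picked)              # transformer runs on the subset only
        out = tokens.clone()
        out.scatter_(1, gather_idx, encoded)        # write refined tokens back in place
        return out, scores

# Usage: 2 images, 100 flattened feature tokens of width 256.
feats = torch.randn(2, 100, 256)
refined, salience = SalienceFilteredEncoder()(feats)
print(refined.shape, salience.shape)
```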
- ASAG: Building Strong One-Decoder-Layer Sparse Detectors via Adaptive Sparse Anchor Generation [50.01244854344167]
We bridge the performance gap between sparse and dense detectors by proposing Adaptive Sparse Anchor Generator (ASAG)
ASAG predicts dynamic anchors on patches rather than grids in a sparse way so that it alleviates the feature conflict problem.
Our method outperforms dense-initialized ones and achieves a better speed-accuracy trade-off.
arXiv Detail & Related papers (2023-08-18T02:06:49Z)
- Semi-DETR: Semi-Supervised Object Detection with Detection Transformers [105.45018934087076]
We analyze the DETR-based framework on semi-supervised object detection (SSOD)
We present Semi-DETR, the first transformer-based end-to-end semi-supervised object detector.
Our method outperforms all state-of-the-art methods by clear margins.
arXiv Detail & Related papers (2023-07-16T16:32:14Z)
- Semi-Supervised and Long-Tailed Object Detection with CascadeMatch [91.86787064083012]
We propose a novel pseudo-labeling-based detector called CascadeMatch.
Our detector features a cascade network architecture, which has multi-stage detection heads with progressive confidence thresholds.
We show that CascadeMatch surpasses existing state-of-the-art semi-supervised approaches in handling long-tailed object detection.
arXiv Detail & Related papers (2023-05-24T07:09:25Z)
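A rough illustration of the progressive-confidence-threshold idea from the CascadeMatch entry above: each detection stage applies a stricter confidence cut-off to candidate pseudo-labels, and only boxes that survive every stage are used as pseudo ground truth. The stage count and threshold values below are assumed for the example, not the paper's settings.

```python
import numpy as np

def cascade_pseudo_labels(stage_scores, thresholds=(0.5, 0.6, 0.7)):
    """Keep pseudo-labels that survive every stage's confidence threshold.

    stage_scores: (num_stages, num_boxes) classification confidences produced
    by the multi-stage detection heads on an unlabeled image.
    thresholds: progressively stricter per-stage cut-offs (illustrative values).
    Returns the indices of boxes accepted as pseudo-labels.
    """
    keep = np.ones(stage_scores.shape[1], dtype=bool)
    for scores, thr in zip(stage_scores, thresholds):
        keep &= scores >= thr          # a box must clear each stage in turn
    return np.nonzero(keep)[0]

scores = np.array([[0.55, 0.80, 0.40],   # stage 1 confidences for 3 candidate boxes
                   [0.62, 0.75, 0.65],   # stage 2
                   [0.58, 0.90, 0.72]])  # stage 3
print(cascade_pseudo_labels(scores))     # -> [1], only the consistently confident box
```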
- Bridging the Gap Between End-to-end and Non-End-to-end Multi-Object Tracking [27.74953961900086]
Existing end-to-end Multi-Object Tracking (e2e-MOT) methods have not surpassed non-end-to-end tracking-by-detection methods.
We present Co-MOT, a simple and effective method to facilitate e2e-MOT by a novel coopetition label assignment with a shadow concept.
arXiv Detail & Related papers (2023-05-22T05:18:34Z)
- StageInteractor: Query-based Object Detector with Cross-stage Interaction [21.84964476813102]
We propose a new query-based object detector with cross-stage interaction, coined as StageInteractor.
Our model improves the baseline by 2.2 AP, and achieves 44.8 AP with ResNet-50 as backbone.
With longer training time and 300 queries, StageInteractor achieves 51.1 AP and 52.2 AP with ResNeXt-101-DCN and Swin-S, respectively.
arXiv Detail & Related papers (2023-04-11T04:50:13Z)
- Detection Transformer with Stable Matching [48.963171068785435]
We show that the most important design is to use and only use positional metrics to supervise classification scores of positive examples.
Under the principle, we propose two simple yet effective modifications by integrating positional metrics to DETR's classification loss and matching cost.
We achieve 50.4 and 51.5 AP on the COCO detection benchmark using ResNet-50 backbones under 12 epochs and 24 epochs training settings.
arXiv Detail & Related papers (2023-04-10T17:55:37Z)
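The principle quoted in the Stable Matching entry above (use positional metrics to supervise the classification scores of positives) can be sketched as an IoU-valued classification target; the loss below is a hedged simplification and may differ from the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def positional_supervised_cls_loss(logits, ious, pos_mask):
    """Binary classification loss whose positive targets come from a
    positional metric (here IoU), a sketch of the stable-matching principle.

    logits:   (N,) raw classification logits for N queries
    ious:     (N,) IoU between each query's box and its matched ground truth
    pos_mask: (N,) bool, True for queries matched to a ground-truth object
    """
    targets = torch.where(pos_mask, ious, torch.zeros_like(ious))
    return F.binary_cross_entropy_with_logits(logits, targets)

logits = torch.tensor([2.0, -1.0, 0.5])
ious = torch.tensor([0.85, 0.10, 0.60])
pos = torch.tensor([True, False, True])
print(positional_supervised_cls_loss(logits, ious, pos))
```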
- Cross-domain Speech Recognition with Unsupervised Character-level Distribution Matching [60.8427677151492]
We propose CMatch, a Character-level distribution matching method to perform fine-grained adaptation between each character in two domains.
Experiments on the Libri-Adapt dataset show that our proposed approach achieves 14.39% and 16.50% relative Word Error Rate (WER) reduction on both cross-device and cross-environment ASR.
arXiv Detail & Related papers (2021-04-15T14:36:54Z)
- End-to-End Object Detection with Transformers [88.06357745922716]
We present a new method that views object detection as a direct set prediction problem.
Our approach streamlines the detection pipeline, effectively removing the need for many hand-designed components.
The main ingredients of the new framework, called DEtection TRansformer or DETR, are a set-based global loss that forces unique predictions via bipartite matching, and a transformer encoder-decoder architecture.
arXiv Detail & Related papers (2020-05-26T17:06:38Z)
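For reference, the set-prediction idea behind DETR can be sketched in a few lines: predictions and ground-truth boxes are matched one-to-one with the Hungarian algorithm, matched pairs are penalized by a box regression term, and unmatched predictions are pushed toward "no object". The toy loss below omits the class-probability and generalized-IoU terms and the tuned weights of the real DETR loss.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def detr_style_matching_loss(pred_boxes, pred_scores, gt_boxes):
    """Toy set-prediction loss: match predictions to ground truths one-to-one
    with the Hungarian algorithm, then penalize box error of matched pairs and
    objectness of unmatched predictions.

    pred_boxes: (Q, 4), pred_scores: (Q,) objectness in [0, 1], gt_boxes: (G, 4)
    """
    cost = np.abs(pred_boxes[:, None, :] - gt_boxes[None, :, :]).sum(-1)  # L1 box cost (Q, G)
    cost = cost - pred_scores[:, None]              # prefer confident predictions
    rows, cols = linear_sum_assignment(cost)        # unique one-to-one matching
    box_loss = np.abs(pred_boxes[rows] - gt_boxes[cols]).sum()
    matched = np.zeros(len(pred_scores), dtype=bool)
    matched[rows] = True
    noobj_loss = pred_scores[~matched].sum()        # push unmatched scores toward 0
    return box_loss + noobj_loss

preds = np.array([[0.1, 0.1, 0.4, 0.4], [0.5, 0.5, 0.9, 0.9], [0.0, 0.0, 0.2, 0.2]])
scores = np.array([0.9, 0.8, 0.3])
gts = np.array([[0.12, 0.1, 0.42, 0.4], [0.5, 0.52, 0.88, 0.9]])
print(round(detr_style_matching_loss(preds, scores, gts), 3))
```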