Custom Object Detection via Multi-Camera Self-Supervised Learning
- URL: http://arxiv.org/abs/2102.03442v1
- Date: Fri, 5 Feb 2021 23:11:14 GMT
- Title: Custom Object Detection via Multi-Camera Self-Supervised Learning
- Authors: Yan Lu and Yuanchao Shu
- Abstract summary: MCSSL is a self-supervised learning approach for building custom object detection models in multi-camera networks.
Our evaluation shows that compared with legacy self-training methods, MCSSL improves average mAP by 5.44% and 6.76% on the WildTrack and CityFlow datasets.
- Score: 15.286868970188223
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper proposes MCSSL, a self-supervised learning approach for building
custom object detection models in multi-camera networks. MCSSL associates
bounding boxes between cameras with overlapping fields of view by leveraging
epipolar geometry and state-of-the-art tracking and reID algorithms, and
prudently generates two sets of pseudo-labels to fine-tune backbone and
detection networks respectively in an object detection model. To train
effectively on pseudo-labels,a powerful reID-like pretext task with consistency
loss is constructed for model customization. Our evaluation shows that compared
with legacy self-training methods, MCSSL improves average mAP by 5.44% and 6.76%
on the WildTrack and CityFlow datasets, respectively.
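The cross-camera association step described in the abstract can be pictured with a small sketch. The code below is not the authors' implementation: it assumes a known 3x3 fundamental matrix F_ab between two overlapping views and a made-up pixel threshold, and it applies only the epipolar-distance filter, whereas MCSSL additionally leverages tracking and reID cues before turning matches into pseudo-labels.
```python
import numpy as np

def box_center(box):
    """Center (u, v) of an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    return (x1 + x2) / 2.0, (y1 + y2) / 2.0

def epipolar_line(F_ab, pt):
    """Epipolar line in camera B for a pixel (u, v) seen in camera A."""
    a, b, c = F_ab @ np.array([pt[0], pt[1], 1.0])
    return a, b, c  # line a*u + b*v + c = 0 in camera B's image plane

def associate_boxes(F_ab, boxes_a, boxes_b, max_dist=20.0):
    """Greedy cross-camera box association under the epipolar constraint.

    Each camera-A box is matched to the camera-B box whose center lies
    closest to the epipolar line of the camera-A box center, provided that
    distance is below max_dist pixels (a hypothetical threshold).
    """
    matches = []
    centers_b = [box_center(bb) for bb in boxes_b]
    for i, ba in enumerate(boxes_a):
        a, b, c = epipolar_line(F_ab, box_center(ba))
        norm = np.hypot(a, b)
        dists = [abs(a * u + b * v + c) / norm for (u, v) in centers_b]
        if dists and min(dists) < max_dist:
            matches.append((i, int(np.argmin(dists))))
    return matches  # candidate (index_in_A, index_in_B) pairs for pseudo-labeling
```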
Related papers
- MTP: Advancing Remote Sensing Foundation Model via Multi-Task Pretraining [73.81862342673894]
Foundation models have reshaped the landscape of Remote Sensing (RS) by enhancing various image interpretation tasks.
However, transferring the pretrained models to downstream tasks may encounter task discrepancy, since pretraining is typically formulated as an image classification or object discrimination task.
We conduct multi-task supervised pretraining on the SAMRS dataset, encompassing semantic segmentation, instance segmentation, and rotated object detection.
Our models are finetuned on various RS downstream tasks, such as scene classification, horizontal and rotated object detection, semantic segmentation, and change detection.
arXiv Detail & Related papers (2024-03-20T09:17:22Z)
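As a rough illustration of what the multi-task supervised pretraining in the MTP entry above looks like in code, here is a toy PyTorch-style sketch of a shared backbone with one head per pretraining task and a weighted sum of task losses. The class and argument names are hypothetical and do not reflect MTP's actual architecture.
```python
import torch.nn as nn

class MultiTaskPretrainer(nn.Module):
    """Toy multi-task pretraining model: one shared encoder, one head per task."""

    def __init__(self, encoder, heads):
        super().__init__()
        self.encoder = encoder             # e.g. a ViT or ResNet backbone
        self.heads = nn.ModuleDict(heads)  # {"semseg": ..., "insseg": ..., "rotdet": ...}

    def forward(self, images):
        features = self.encoder(images)
        # One prediction per pretraining task (segmentation, detection, ...).
        return {task: head(features) for task, head in self.heads.items()}

def pretraining_loss(outputs, targets, criteria, weights):
    """Weighted sum of per-task losses; the weights here are illustrative."""
    return sum(weights[t] * criteria[t](outputs[t], targets[t]) for t in outputs)
```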
- S$^3$Track: Self-supervised Tracking with Soft Assignment Flow [45.77333923477176]
We study self-supervised multiple object tracking without using any video-level association labels.
We propose differentiable soft object assignment for object association.
We evaluate our proposed model on the KITTI, nuScenes, and Argoverse datasets.
arXiv Detail & Related papers (2023-05-17T06:25:40Z)
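The "differentiable soft object assignment" mentioned in the S$^3$Track entry above can be made concrete with a small sketch: pairwise cosine similarities between object embeddings are turned into a soft correspondence matrix by a temperature-scaled softmax, so the association stays differentiable, unlike a hard Hungarian matching. The feature shapes and temperature below are assumptions for illustration, not the paper's settings.
```python
import torch
import torch.nn.functional as F

def soft_assignment(feat_t, feat_t1, temperature=0.1):
    """Soft correspondence between objects in consecutive frames.

    feat_t:  (N, D) object embeddings at frame t
    feat_t1: (M, D) object embeddings at frame t+1
    Returns an (N, M) matrix of assignment probabilities.
    """
    sim = F.normalize(feat_t, dim=1) @ F.normalize(feat_t1, dim=1).T  # cosine similarity
    # Row-wise softmax: for every object at t, a distribution over objects at t+1.
    return torch.softmax(sim / temperature, dim=1)
```
A Sinkhorn-style alternating row/column normalization would make the matrix approximately doubly stochastic, which is closer to a one-to-one assignment while remaining differentiable.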
- De-coupling and De-positioning Dense Self-supervised Learning [65.56679416475943]
Dense Self-Supervised Learning (SSL) methods address the limitations of using image-level feature representations when handling images with multiple objects.
We show that they suffer from coupling and positional bias, which arise from the receptive field increasing with layer depth and zero-padding.
We demonstrate the benefits of our method on COCO and on a new challenging benchmark, OpenImage-MINI, for object classification, semantic segmentation, and object detection.
arXiv Detail & Related papers (2023-03-29T18:07:25Z)
- Cut and Learn for Unsupervised Object Detection and Instance Segmentation [65.43627672225624]
Cut-and-LEaRn (CutLER) is a simple approach for training unsupervised object detection and segmentation models.
CutLER is a zero-shot unsupervised detector and improves detection performance (AP50) by over 2.7 times on 11 benchmarks.
arXiv Detail & Related papers (2023-01-26T18:57:13Z)
- Tracking Passengers and Baggage Items using Multiple Overhead Cameras at Security Checkpoints [2.021502591596062]
We introduce a novel framework to track multiple objects in overhead camera videos for airport checkpoint security scenarios.
We propose a Self-Supervised Learning (SSL) technique to provide the model information about instance segmentation uncertainty from overhead images.
Our results show that self-supervision improves object detection accuracy by up to 42% without increasing the inference time of the model.
arXiv Detail & Related papers (2022-12-31T12:57:09Z)
- 3DMODT: Attention-Guided Affinities for Joint Detection & Tracking in 3D Point Clouds [95.54285993019843]
We propose a method for joint detection and tracking of multiple objects in 3D point clouds.
Our model exploits temporal information employing multiple frames to detect objects and track them in a single network.
arXiv Detail & Related papers (2022-11-01T20:59:38Z)
- Know Your Surroundings: Panoramic Multi-Object Tracking by Multimodality Collaboration [56.01625477187448]
We propose a MultiModality PAnoramic multi-object Tracking framework (MMPAT).
It takes both 2D panorama images and 3D point clouds as input and then infers target trajectories using the multimodality data.
We evaluate the proposed method on the JRDB dataset, where the MMPAT achieves the top performance in both the detection and tracking tasks.
arXiv Detail & Related papers (2021-05-31T03:16:38Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.