Ensembling object detectors for image and video data analysis
- URL: http://arxiv.org/abs/2102.04798v1
- Date: Tue, 9 Feb 2021 12:38:16 GMT
- Title: Ensembling object detectors for image and video data analysis
- Authors: Kateryna Chumachenko, Jenni Raitoharju, Alexandros Iosifidis, Moncef Gabbouj
- Abstract summary: We propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data.
We extend it to video data by proposing a two-stage tracking-based scheme for detection refinement.
- Score: 98.26061123111647
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a method for ensembling the outputs of multiple
object detectors for improving detection performance and precision of bounding
boxes on image data. We further extend it to video data by proposing a
two-stage tracking-based scheme for detection refinement. The proposed method
can be used as a standalone approach for improving object detection
performance, or as a part of a framework for faster bounding box annotation in
unseen datasets, assuming that the objects of interest are those present in
some common public datasets.
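The abstract does not say how the detector outputs are combined. As a rough illustration only, and not the authors' method, one common way to ensemble detections is to group overlapping boxes from different detectors by IoU and average each group weighted by confidence. The function name `ensemble_detections`, the `iou_thr` and `min_votes` parameters, and the greedy grouping rule in the sketch below are all assumptions made for this example.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)
Detection = Tuple[Box, float]            # (box, confidence score)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ensemble_detections(per_detector: List[List[Detection]],
                        iou_thr: float = 0.5,
                        min_votes: int = 2) -> List[Detection]:
    """Hypothetical scheme: greedy IoU grouping, then score-weighted averaging."""
    # Pool detections from all detectors, highest confidence first.
    pooled = sorted((d for dets in per_detector for d in dets),
                    key=lambda d: d[1], reverse=True)
    clusters: List[List[Detection]] = []
    for det in pooled:
        for cluster in clusters:
            # Compare against the highest-scoring box of the cluster.
            if iou(det[0], cluster[0][0]) >= iou_thr:
                cluster.append(det)
                break
        else:
            clusters.append([det])

    merged: List[Detection] = []
    for cluster in clusters:
        total = sum(score for _, score in cluster)
        # Drop groups with too few supporting boxes or no usable weight.
        if len(cluster) < min_votes or total <= 0:
            continue
        box = tuple(sum(b[i] * s for b, s in cluster) / total for i in range(4))
        merged.append((box, total / len(cluster)))
    return merged


# Example: two detectors agree on one object; a lone box is discarded.
dets_a = [((10.0, 10.0, 50.0, 50.0), 0.9)]
dets_b = [((12.0, 12.0, 52.0, 52.0), 0.6), ((200.0, 200.0, 240.0, 240.0), 0.8)]
result = ensemble_detections([dets_a, dets_b])
```

Requiring at least `min_votes` boxes per group trades recall for precision: a detection seen by only one detector is dropped, which may or may not suit a given annotation workflow.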
Related papers
- Linear Object Detection in Document Images using Multiple Object Tracking [58.720142291102135]
Linear objects convey substantial information about document structure.
Many approaches can recover some vector representation, but only one closed-source technique, introduced in 1994, can perform pixel-accurate instance segmentation of linear objects.
We propose a framework for accurate instance segmentation of linear objects in document images using Multiple Object Tracking.
arXiv Detail & Related papers (2023-05-26T14:22:03Z)
- Uncertainty Aware Active Learning for Reconfiguration of Pre-trained Deep Object-Detection Networks for New Target Domains [0.0]
Object detection is one of the most important and fundamental aspects of computer vision tasks.
To obtain training data for object detection models efficiently, many datasets collect their unannotated data in video format.
Annotating every frame of a video is costly and inefficient, since many frames contain very similar information for the model to learn from.
In this paper, we propose a novel active learning algorithm for object detection models to tackle this problem.
arXiv Detail & Related papers (2023-03-22T17:14:10Z)
- Image Segmentation-based Unsupervised Multiple Objects Discovery [1.7674345486888503]
Unsupervised object discovery aims to localize objects in images.
We propose a fully unsupervised, bottom-up approach for multiple object discovery.
We provide state-of-the-art results for both unsupervised class-agnostic object detection and unsupervised image segmentation.
arXiv Detail & Related papers (2022-12-20T09:48:24Z)
- Recent Trends in 2D Object Detection and Applications in Video Event Recognition [0.76146285961466]
We discuss the pioneering works in object detection, followed by the recent breakthroughs that employ deep learning.
We highlight recent datasets for 2D object detection both in images and videos, and present a comparative performance summary of various state-of-the-art object detection techniques.
arXiv Detail & Related papers (2022-02-07T14:15:11Z)
- Target-Aware Object Discovery and Association for Unsupervised Video Multi-Object Segmentation [79.6596425920849]
This paper addresses the task of unsupervised video multi-object segmentation.
We introduce a novel approach for more accurate and efficient spatio-temporal segmentation.
We evaluate the proposed approach on DAVIS$_{17}$ and YouTube-VIS, and the results demonstrate that it outperforms state-of-the-art methods both in segmentation accuracy and inference speed.
arXiv Detail & Related papers (2021-04-10T14:39:44Z)
- Slender Object Detection: Diagnoses and Improvements [74.40792217534]
In this paper, we are concerned with the detection of a particular type of object with extreme aspect ratios, namely slender objects.
For a classical object detection method, a drastic drop of $18.9\%$ mAP on COCO is observed if it is evaluated solely on slender objects.
arXiv Detail & Related papers (2020-11-17T09:39:42Z)
- End-to-End Multi-Object Tracking with Global Response Map [23.755882375664875]
We present a completely end-to-end approach that takes image-sequence/video as input and outputs directly the located and tracked objects of learned types.
Specifically, with our introduced multi-object representation strategy, a global response map can be accurately generated over frames.
Experimental results based on the MOT16 and MOT17 benchmarks show that our proposed on-line tracker achieved state-of-the-art performance on several tracking metrics.
arXiv Detail & Related papers (2020-07-13T12:30:49Z)
- Black-box Explanation of Object Detectors via Saliency Maps [66.745167677293]
We propose D-RISE, a method for generating visual explanations for the predictions of object detectors.
We show that D-RISE can be easily applied to different object detectors including one-stage detectors such as YOLOv3 and two-stage detectors such as Faster-RCNN.
arXiv Detail & Related papers (2020-06-05T02:13:35Z)
- Adaptive Object Detection with Dual Multi-Label Prediction [78.69064917947624]
We propose a novel end-to-end unsupervised deep domain adaptation model for adaptive object detection.
The model exploits multi-label prediction to reveal the object category information in each image.
We introduce a prediction consistency regularization mechanism to assist object detection.
arXiv Detail & Related papers (2020-03-29T04:23:22Z)