Context-Matched Collage Generation for Underwater Invertebrate Detection
- URL: http://arxiv.org/abs/2211.08479v1
- Date: Tue, 15 Nov 2022 20:08:16 GMT
- Title: Context-Matched Collage Generation for Underwater Invertebrate Detection
- Authors: R. Austin McEver, Bowen Zhang, B.S. Manjunath
- Abstract summary: We introduce Context Matched Collages, which leverage explicit context labels to combine unused background examples with existing annotated data to synthesize additional training samples.
By combining a set of our generated collage images with the original training set, we see improved performance using three different object detectors on DUSIA.
- Score: 12.255951530970249
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The quality and size of training sets often limit the performance of many
state-of-the-art object detectors. However, in many scenarios it can be
difficult to collect images for training, not to mention the costs associated
with collecting annotations suitable for training these object detectors. For
these reasons, on challenging video datasets such as the Dataset for Underwater
Substrate and Invertebrate Analysis (DUSIA), budgets may only allow for
collecting and providing partial annotations. To address the challenges of
training with limited and partial annotations, we introduce
Context Matched Collages, which leverage explicit context labels to combine
unused background examples with existing annotated data to synthesize
additional training samples that ultimately improve object detection
performance. By combining a set of our generated collage images with the
original training set, we see improved performance using three different object
detectors on DUSIA, ultimately achieving state-of-the-art object detection
performance on the dataset.
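The core idea in the abstract, pairing unused background frames with annotated foreground crops that carry the same explicit context label, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class names, the random placement policy, and the `per_background` parameter are all assumptions for the sketch.

```python
import random
from dataclasses import dataclass

@dataclass
class Annotation:
    label: str     # species label, e.g. "urchin" (hypothetical)
    context: str   # explicit context (substrate) label, e.g. "rock"

@dataclass
class Background:
    image_id: str
    context: str   # context label of an unused background frame

def make_collages(backgrounds, annotations, per_background=2, seed=0):
    """Pair each unused background with annotated crops sharing its
    context label, emitting synthetic training samples. Placement is
    random here; the paper's actual placement policy may differ."""
    rng = random.Random(seed)
    # Index foreground annotations by their context label.
    by_context = {}
    for ann in annotations:
        by_context.setdefault(ann.context, []).append(ann)
    collages = []
    for bg in backgrounds:
        candidates = by_context.get(bg.context, [])
        if not candidates:
            continue  # no context-matched foregrounds for this background
        chosen = [rng.choice(candidates) for _ in range(per_background)]
        # Each pasted crop gets a (hypothetical) random top-left position,
        # which becomes a new bounding-box annotation for the collage.
        boxes = [(ann.label, rng.randint(0, 100), rng.randint(0, 100))
                 for ann in chosen]
        collages.append({"background": bg.image_id, "objects": boxes})
    return collages
```

Matching on the context label is what keeps the synthesized samples plausible: a crop annotated over rocky substrate is only pasted onto rocky backgrounds, so the detector never sees foregrounds in contexts where they would not occur.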
Related papers
- BOOTPLACE: Bootstrapped Object Placement with Detection Transformers [23.300369070771836]
We introduce BOOTPLACE, a novel paradigm that formulates object placement as a placement-by-detection problem.
Experimental results on established benchmarks demonstrate BOOTPLACE's superior performance in object repositioning.
arXiv Detail & Related papers (2025-03-27T21:21:20Z)
- PEEKABOO: Hiding parts of an image for unsupervised object localization [7.161489957025654]
Localizing objects in an unsupervised manner poses significant challenges due to the absence of key visual information.
We propose a single-stage learning framework, dubbed PEEKABOO, for unsupervised object localization.
The key idea is to selectively hide parts of an image and leverage the remaining image information to infer the location of objects without explicit supervision.
arXiv Detail & Related papers (2024-07-24T20:35:20Z)
- Efficiently Collecting Training Dataset for 2D Object Detection by Online Visual Feedback [5.015678820698308]
Training deep-learning-based vision systems requires the manual annotation of a significant number of images.
We propose a human-in-the-loop dataset collection method that uses a web application.
To balance workload and performance while encouraging the collection of multi-view object image datasets in an enjoyable manner, we propose three types of online visual feedback features.
arXiv Detail & Related papers (2023-04-11T00:17:28Z)
- ComplETR: Reducing the cost of annotations for object detection in dense scenes with vision transformers [73.29057814695459]
ComplETR is designed to explicitly complete missing annotations in partially annotated dense scene datasets.
This reduces the need to annotate every object instance in the scene thereby reducing annotation cost.
We show performance improvement for several popular detectors such as Faster R-CNN, Cascade R-CNN, CenterNet2, and Deformable DETR.
arXiv Detail & Related papers (2022-09-13T00:11:16Z)
- Free Lunch for Co-Saliency Detection: Context Adjustment [14.688461235328306]
We propose a "cost-free" group-cut-paste (GCP) procedure to leverage images from off-the-shelf saliency detection datasets and synthesize new samples.
We collect a novel dataset called Context Adjustment Training. The two variants of our dataset, i.e., CAT and CAT+, consist of 16,750 and 33,500 images, respectively.
arXiv Detail & Related papers (2021-08-04T14:51:37Z)
- Learning to Track Instances without Video Annotations [85.9865889886669]
We introduce a novel semi-supervised framework by learning instance tracking networks with only a labeled image dataset and unlabeled video sequences.
We show that even when only trained with images, the learned feature representation is robust to instance appearance variations.
In addition, we integrate this module into single-stage instance segmentation and pose estimation frameworks.
arXiv Detail & Related papers (2021-04-01T06:47:41Z)
- Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection [18.04185751827619]
Few-shot object detection is challenging since the fine-grained features of novel objects can be easily overlooked with only a few data samples available.
We propose Dense Relation Distillation with Context-aware Aggregation (DCNet) to tackle the few-shot detection problem.
arXiv Detail & Related papers (2021-03-30T05:34:49Z)
- Improving filling level classification with adversarial training [90.01594595780928]
We investigate the problem of classifying - from a single image - the level of content in a cup or a drinking glass.
We use adversarial training in a generic source dataset and then refine the training with a task-specific dataset.
We show that transfer learning with adversarial training in the source domain consistently improves the classification accuracy on the test set.
arXiv Detail & Related papers (2021-02-08T08:32:56Z)
- Learning Object Detection from Captions via Textual Scene Attributes [70.90708863394902]
We argue that captions contain much richer information about the image, including attributes of objects and their relations.
We present a method that uses the attributes in this "textual scene graph" to train object detectors.
We empirically demonstrate that the resulting model achieves state-of-the-art results on several challenging object detection datasets.
arXiv Detail & Related papers (2020-09-30T10:59:20Z)
- One-Shot Object Detection without Fine-Tuning [62.39210447209698]
We introduce a two-stage model consisting of a first stage Matching-FCOS network and a second stage Structure-Aware Relation Module.
We also propose novel training strategies that effectively improve detection performance.
Our method exceeds the state-of-the-art one-shot performance consistently on multiple datasets.
arXiv Detail & Related papers (2020-05-08T01:59:23Z)
- Gradient-Induced Co-Saliency Detection [81.54194063218216]
Co-saliency detection (Co-SOD) aims to segment the common salient foreground in a group of relevant images.
In this paper, inspired by human behavior, we propose a gradient-induced co-saliency detection method.
arXiv Detail & Related papers (2020-04-28T08:40:55Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences.