Bounding Box Annotation with Visible Status
- URL: http://arxiv.org/abs/2304.04901v1
- Date: Tue, 11 Apr 2023 00:17:28 GMT
- Title: Bounding Box Annotation with Visible Status
- Authors: Takuya Kiyokawa, Naoki Shirakura, Hiroki Katayama, Keita Tomochika, Jun Takamatsu
- Abstract summary: This study presents a mobile application-based free-viewpoint image-capturing method.
With the proposed application, users can automatically collect multi-view image datasets annotated with bounding boxes simply by moving the camera.
Our experiments demonstrated that using the gamified mobile application for bounding box annotation, with visible collection progress status, can motivate users to collect multi-view object image datasets.
- Score: 6.69350212746025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training deep-learning-based vision systems requires the manual annotation of
a significant amount of data to optimize several parameters of the deep
convolutional neural networks. Such manual annotation is highly time-consuming
and labor-intensive. To reduce this burden, a previous study presented a fully
automated annotation approach that does not require any manual intervention.
The proposed method associates a visual marker with an object and captures it
in the same image. However, because the previous method relied on moving the
object within the capturing range using a fixed-point camera, the collected
image dataset was limited in terms of capturing viewpoints. To overcome this
limitation, this study presents a mobile application-based free-viewpoint
image-capturing method. With the proposed application, users can automatically
collect multi-view image datasets annotated with bounding boxes simply by
moving the camera. However, capturing images through human involvement is
laborious and monotonous. Therefore, we propose gamified application features
that track and display the collection progress. Our experiments demonstrated
that using the gamified mobile application for bounding box annotation, with
visible collection progress status, can motivate users to collect multi-view
object image datasets with less mental workload and time pressure in an
enjoyable manner, leading to increased engagement.
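To make the mechanism concrete, the following is a minimal sketch, assuming a marker-based setup similar to the one described: a fiducial marker is detected, the object's 3D bounding box (expressed in the marker's coordinate frame) is projected into the image to obtain the 2D annotation, and the camera viewpoint is binned into an azimuth/elevation grid so collection progress can be shown to the user. All concrete values (the intrinsics K, marker size, the BOX_3D offsets, grid resolution) are placeholder assumptions, as is the use of the classic cv2.aruco API from opencv-contrib-python; this is an illustration, not the authors' implementation.

```python
import cv2
import numpy as np

# Placeholder assumptions: real values come from camera calibration and
# from measuring the object's box relative to the printed marker.
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])          # camera intrinsics
DIST = np.zeros(5)                        # lens distortion coefficients
MARKER_LEN = 0.05                         # marker side length [m]
BOX_3D = np.array([[x, y, z]              # object box corners in the
                   for x in (0.06, 0.16)  # marker's coordinate frame
                   for y in (-0.05, 0.05)
                   for z in (0.00, 0.12)], dtype=np.float32)

N_AZ, N_EL = 12, 4                        # viewpoint bins for progress
coverage = np.zeros((N_AZ, N_EL), dtype=bool)

def annotate(frame):
    """Detect the marker, project the object's 3D box into the image to
    get a 2D bounding box, and update the viewpoint-coverage status."""
    dic = cv2.aruco.getPredefinedDictionary(cv2.aruco.DICT_4X4_50)
    corners, ids, _ = cv2.aruco.detectMarkers(frame, dic)
    if ids is None:
        return None                       # marker not visible in this view
    rvecs, tvecs, _ = cv2.aruco.estimatePoseSingleMarkers(
        corners, MARKER_LEN, K, DIST)
    pts, _ = cv2.projectPoints(BOX_3D, rvecs[0], tvecs[0], K, DIST)
    pts = pts.reshape(-1, 2)
    (x0, y0), (x1, y1) = pts.min(axis=0), pts.max(axis=0)

    # Visible status: bin the camera position (in the marker frame) into
    # an azimuth/elevation grid and report how much has been covered.
    R, _ = cv2.Rodrigues(rvecs[0])
    cam = (-R.T @ tvecs[0].reshape(3, 1)).ravel()
    az = int((np.arctan2(cam[1], cam[0]) + np.pi) / (2 * np.pi) * N_AZ) % N_AZ
    el_ang = np.arcsin(np.clip(cam[2] / np.linalg.norm(cam), -1.0, 1.0))
    el = min(int((el_ang + np.pi / 2) / np.pi * N_EL), N_EL - 1)
    coverage[az, el] = True
    print(f"viewpoint coverage: {coverage.mean():.0%}")
    return int(x0), int(y0), int(x1), int(y1)
```

Because the box is anchored to the marker rather than drawn by hand, every frame in which the marker is visible yields a bounding box for free, which is what removes the manual-annotation burden the abstract describes.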
Related papers
- Feedback-driven object detection and iterative model improvement [2.3700911865675187]
We present the development and evaluation of a platform designed to interactively improve object detection models.
The platform allows uploading and annotating images as well as fine-tuning object detection models.
We show evidence for a significant time reduction of up to 53% for semi-automatic compared to manual annotation.
arXiv Detail & Related papers (2024-11-29T16:45:25Z)
- Harnessing Diffusion Models for Visual Perception with Meta Prompts [68.78938846041767]
We propose a simple yet effective scheme to harness a diffusion model for visual perception tasks.
We introduce learnable embeddings (meta prompts) to the pre-trained diffusion models to extract proper features for perception.
Our approach achieves new performance records in depth estimation on NYU Depth V2 and KITTI, and in semantic segmentation on Cityscapes.
arXiv Detail & Related papers (2023-12-22T14:40:55Z)
- Labeling Indoor Scenes with Fusion of Out-of-the-Box Perception Models [4.157013247909771]
We propose to leverage recent advancements in state-of-the-art models for bottom-up segmentation (SAM), object detection (Detic), and semantic segmentation (MaskFormer).
We aim to develop a cost-effective labeling approach to obtain pseudo-labels for semantic segmentation and object instance detection in indoor environments.
We demonstrate the effectiveness of the proposed approach on the Active Vision dataset and the ADE20K dataset.
arXiv Detail & Related papers (2023-11-17T21:58:26Z)
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
- Context-Matched Collage Generation for Underwater Invertebrate Detection [12.255951530970249]
We introduce Context Matched Collages, which leverage explicit context labels to combine unused background examples with existing annotated data to synthesize additional training samples.
By combining a set of our generated collage images with the original training set, we see improved performance using three different object detectors on DUSIA.
arXiv Detail & Related papers (2022-11-15T20:08:16Z)
- CoDo: Contrastive Learning with Downstream Background Invariance for Detection [10.608660802917214]
We propose a novel object-level self-supervised learning method, called Contrastive learning with Downstream background invariance (CoDo).
The pretext task is converted to focus on instance location modeling for various backgrounds, especially for downstream datasets.
Experiments on MSCOCO demonstrate that the proposed CoDo with common backbones, ResNet50-FPN, yields strong transfer learning results for object detection.
arXiv Detail & Related papers (2022-05-10T01:26:15Z)
- Crop-Transform-Paste: Self-Supervised Learning for Visual Tracking [137.26381337333552]
In this work, we develop the Crop-Transform-Paste operation, which can synthesize sufficient training data (a rough sketch of the idea appears after this list).
Since the object state is known in all synthesized data, existing deep trackers can be trained in routine ways without human annotation.
arXiv Detail & Related papers (2021-06-21T07:40:34Z)
- Few-Cost Salient Object Detection with Adversarial-Paced Learning [95.0220555274653]
This paper proposes to learn an effective salient object detection model based on manual annotation of only a few training images.
We name this task as the few-cost salient object detection and propose an adversarial-paced learning (APL)-based framework to facilitate the few-cost learning scenario.
arXiv Detail & Related papers (2021-04-05T14:15:49Z)
- Data Augmentation for Object Detection via Differentiable Neural Rendering [71.00447761415388]
It is challenging to train a robust object detector when annotated data is scarce.
Existing approaches to tackle this problem include semi-supervised learning that interpolates labeled data from unlabeled data.
We introduce an offline data augmentation method for object detection, which semantically interpolates the training data with novel views.
arXiv Detail & Related papers (2021-03-04T06:31:06Z)
- Embodied Visual Active Learning for Semantic Segmentation [33.02424587900808]
We study the task of embodied visual active learning, where an agent is set to explore a 3D environment with the goal of acquiring visual scene understanding.
We develop a battery of agents, both learnt and pre-specified, with different levels of knowledge of the environment.
We extensively evaluate the proposed models using the Matterport3D simulator and show that a fully learnt method outperforms comparable pre-specified counterparts.
arXiv Detail & Related papers (2020-12-17T11:02:34Z)
- From ImageNet to Image Classification: Contextualizing Progress on Benchmarks [99.19183528305598]
We study how specific design choices in the ImageNet creation process impact the fidelity of the resulting dataset.
Our analysis pinpoints how a noisy data collection pipeline can lead to a systematic misalignment between the resulting benchmark and the real-world task it serves as a proxy for.
arXiv Detail & Related papers (2020-05-22T17:39:16Z)
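As referenced in the Crop-Transform-Paste entry above, here is a rough sketch of that operation, assuming nothing beyond the summary: crop the annotated target, apply a simple random transform (scale and horizontal flip here; the paper's operation is richer), and paste it onto another image so the new bounding box is known by construction. Function and variable names are invented for the example; this is not the authors' code.

```python
import cv2
import numpy as np

rng = np.random.default_rng(0)

def crop_transform_paste(frame, box, background):
    """Synthesize a training sample from one annotated frame: crop the
    target given by box = (x, y, w, h), randomly transform it, paste it
    onto `background`, and return the image with its known new box."""
    x, y, w, h = box
    patch = frame[y:y + h, x:x + w]

    # Transform: random scale plus optional horizontal flip.
    s = rng.uniform(0.7, 1.3)
    patch = cv2.resize(patch, (max(1, int(w * s)), max(1, int(h * s))))
    if rng.random() < 0.5:
        patch = patch[:, ::-1]
    ph, pw = patch.shape[:2]

    # Paste at a random location that keeps the patch inside the image;
    # the object "state" (its box) is known by construction.
    H, W = background.shape[:2]
    nx = int(rng.integers(0, max(1, W - pw)))
    ny = int(rng.integers(0, max(1, H - ph)))
    out = background.copy()
    out[ny:ny + ph, nx:nx + pw] = patch
    return out, (nx, ny, pw, ph)
```

A tracker can then be trained on the returned image/box pairs in the usual supervised way, which is the sense in which the summary says no human annotation is needed.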
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information listed here and is not responsible for any consequences arising from its use.