ImageSubject: A Large-scale Dataset for Subject Detection
- URL: http://arxiv.org/abs/2201.03101v1
- Date: Sun, 9 Jan 2022 22:49:59 GMT
- Title: ImageSubject: A Large-scale Dataset for Subject Detection
- Authors: Xin Miao, Jiayi Liu, Huayan Wang, Jun Fu
- Abstract summary: Main subjects usually exist in the images or videos, as they are the objects that the photographer wants to highlight.
Detecting the main subjects is an important technique to help machines understand the content of images and videos.
We present a new dataset with the goal of training models to understand the layout of the objects and then to find the main subjects among them.
- Score: 9.430492045581534
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Main subjects usually exist in the images or videos, as they are the objects
that the photographer wants to highlight. Human viewers can easily identify
them but algorithms often confuse them with other objects. Detecting the main
subjects is an important technique to help machines understand the content of
images and videos. We present a new dataset with the goal of training models to
understand the layout of the objects and the context of the image, and then to find
the main subjects among them. This is achieved in three aspects. By gathering
images from movie shots created by directors with professional shooting skills,
we collect the dataset with strong diversity, specifically, it contains
107,700 images from 21,540 movie shots. We labeled them with the bounding box
labels for two classes: subject and non-subject foreground object. We present a
detailed analysis of the dataset and compare the task with saliency detection
and object detection. ImageSubject is the first dataset that tries to localize
the subject in an image that the photographer wants to highlight. Moreover, we
find the transformer-based detection model offers the best result among other
popular model architectures. Finally, we discuss the potential applications and
conclude with the importance of the dataset.
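As a rough illustration of the labeling scheme described in the abstract (bounding boxes in two classes), the sketch below shows one way such annotations might be represented. The class name strings, the `BoxLabel` type, and the `main_subjects` helper are hypothetical and are not taken from the paper; only the two counts come from the abstract.

```python
from dataclasses import dataclass

# Hypothetical label strings: the abstract only states that there are two
# classes, "subject" and "non-subject foreground object".
CLASSES = ("subject", "non_subject_foreground")


@dataclass(frozen=True)
class BoxLabel:
    """One bounding-box annotation (assumed layout, not the paper's format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float
    label: str

    def __post_init__(self):
        # Reject unknown classes and degenerate boxes.
        if self.label not in CLASSES:
            raise ValueError(f"unknown label: {self.label!r}")
        if not (self.x_min < self.x_max and self.y_min < self.y_max):
            raise ValueError("box must have positive width and height")


def main_subjects(boxes):
    """Return only the boxes labeled as main subjects."""
    return [b for b in boxes if b.label == "subject"]


# Counts quoted in the abstract.
NUM_IMAGES = 107_700
NUM_SHOTS = 21_540
images_per_shot = NUM_IMAGES / NUM_SHOTS

example = [
    BoxLabel(10, 20, 110, 220, "subject"),
    BoxLabel(150, 40, 200, 90, "non_subject_foreground"),
]
subjects = main_subjects(example)
```

The two counts in the abstract work out to an average of exactly five images per movie shot.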
Related papers
- Structuring Quantitative Image Analysis with Object Prominence [0.0]
We suggest carefully considering objects' prominence as an essential step in analyzing images as data.
Our approach combines qualitative analyses with the scalability of quantitative approaches.
arXiv Detail & Related papers (2024-08-30T19:05:28Z)
- 360 in the Wild: Dataset for Depth Prediction and View Synthesis [66.58513725342125]
We introduce a large scale 360° videos dataset in the wild.
This dataset has been carefully scraped from the Internet and has been captured from various locations worldwide.
Each of the 25K images constituting our dataset is provided with its respective camera's pose and depth map.
arXiv Detail & Related papers (2024-06-27T05:26:38Z)
- Salient Object Detection for Images Taken by People With Vision Impairments [13.157939981657886]
We introduce a new salient object detection dataset using images taken by people who are visually impaired.
VizWiz-SalientObject is the largest (i.e., 32,000 human-annotated images) and contains unique characteristics.
We benchmarked seven modern salient object detection methods on our dataset and found they struggle most with images featuring salient objects that are large, have less complex boundaries, and lack text.
arXiv Detail & Related papers (2023-01-12T22:33:01Z)
- Automatic dataset generation for specific object detection [6.346581421948067]
We present a method to synthesize object-in-scene images, which can preserve the objects' detailed features without bringing irrelevant information.
Our result shows that in the synthesized image, the boundaries of objects blend very well with the background.
arXiv Detail & Related papers (2022-07-16T07:44:33Z)
- FewSOL: A Dataset for Few-Shot Object Learning in Robotic Environments [21.393674766169543]
We introduce the Few-Shot Object Learning dataset for object recognition with a few images per object.
We captured 336 real-world objects with 9 RGB-D images per object from different views.
The evaluation results show that there is still a large margin to be improved for few-shot object classification in robotic environments.
arXiv Detail & Related papers (2022-07-06T05:57:24Z)
- Object-aware Contrastive Learning for Debiased Scene Representation [74.30741492814327]
We develop a novel object-aware contrastive learning framework that localizes objects in a self-supervised manner.
We also introduce two data augmentations based on ContraCAM, object-aware random crop and background mixup, which reduce contextual and background biases during contrastive self-supervised learning.
arXiv Detail & Related papers (2021-07-30T19:24:07Z)
- FAIR1M: A Benchmark Dataset for Fine-grained Object Recognition in High-Resolution Remote Sensing Imagery [21.9319970004788]
We propose a novel benchmark dataset with more than 1 million instances and more than 15,000 images for Fine-grAined object recognItion in high-Resolution remote sensing imagery.
All objects in the FAIR1M dataset are annotated with respect to 5 categories and 37 sub-categories by oriented bounding boxes.
arXiv Detail & Related papers (2021-03-09T17:20:15Z)
- A Simple and Effective Use of Object-Centric Images for Long-Tailed Object Detection [56.82077636126353]
We take advantage of object-centric images to improve object detection in scene-centric images.
We present a simple yet surprisingly effective framework to do so.
Our approach can improve the object detection (and instance segmentation) accuracy of rare objects by 50% (and 33%) relatively.
arXiv Detail & Related papers (2021-02-17T17:27:21Z)
- Learning Object Detection from Captions via Textual Scene Attributes [70.90708863394902]
We argue that captions contain much richer information about the image, including attributes of objects and their relations.
We present a method that uses the attributes in this "textual scene graph" to train object detectors.
We empirically demonstrate that the resulting model achieves state-of-the-art results on several challenging object detection datasets.
arXiv Detail & Related papers (2020-09-30T10:59:20Z)
- Counting from Sky: A Large-scale Dataset for Remote Sensing Object Counting and A Benchmark Method [52.182698295053264]
We are interested in counting dense objects from remote sensing images. Compared with object counting in a natural scene, this task is challenging in the following factors: large scale variation, complex cluttered background, and orientation arbitrariness.
To address these issues, we first construct a large-scale object counting dataset with remote sensing images, which contains four important geographic objects.
We then benchmark the dataset by designing a novel neural network that can generate a density map of an input image.
arXiv Detail & Related papers (2020-08-28T03:47:49Z)
- Improving Object Detection with Selective Self-supervised Self-training [62.792445237541145]
We study how to leverage Web images to augment human-curated object detection datasets.
We retrieve Web images by image-to-image search, which incurs less domain shift from the curated data than other search methods.
We propose a novel learning method motivated by two parallel lines of work that explore unlabeled data for image classification.
arXiv Detail & Related papers (2020-07-17T18:05:01Z)
This list is automatically generated from the titles and abstracts of the papers in this site.