Indiscernible Object Counting in Underwater Scenes
- URL: http://arxiv.org/abs/2304.11677v1
- Date: Sun, 23 Apr 2023 15:09:02 GMT
- Title: Indiscernible Object Counting in Underwater Scenes
- Authors: Guolei Sun, Zhaochong An, Yun Liu, Ce Liu, Christos Sakaridis,
Deng-Ping Fan, Luc Van Gool
- Abstract summary: Indiscernible object counting (IOC) aims to count objects that are blended with respect to their surroundings.
We present a large-scale dataset IOCfish5K which contains a total of 5,637 high-resolution images and 659,024 annotated center points.
- Score: 91.86044762367945
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, indiscernible scene understanding has attracted a lot of attention
in the vision community. We further advance the frontier of this field by
systematically studying a new challenge named indiscernible object counting
(IOC), the goal of which is to count objects that are blended with respect to
their surroundings. Due to a lack of appropriate IOC datasets, we present a
large-scale dataset IOCfish5K which contains a total of 5,637 high-resolution
images and 659,024 annotated center points. Our dataset consists of a large
number of indiscernible objects (mainly fish) in underwater scenes, making the
annotation process all the more challenging. IOCfish5K is superior to existing
datasets with indiscernible scenes because of its larger scale, higher image
resolutions, more annotations, and denser scenes. All these aspects make it the
most challenging dataset for IOC so far, supporting progress in this area. For
benchmarking purposes, we select 14 mainstream methods for object counting and
carefully evaluate them on IOCfish5K. Furthermore, we propose IOCFormer, a new
strong baseline that combines density and regression branches in a unified
framework and can effectively tackle object counting under concealed scenes.
Experiments show that IOCFormer achieves state-of-the-art scores on IOCfish5K.
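The abstract mentions center-point annotations and a density branch without giving details. As a rough illustration only, the sketch below shows the density-map formulation that is commonly used in object counting: each annotated center point becomes a normalized Gaussian, so the map sums to the object count. This is not IOCFormer and not the authors' code; the function names, the example points, the image size, and sigma are all hypothetical, and mean absolute error is assumed here as a typical counting metric.

```python
import numpy as np


def density_map(points, height, width, sigma=4.0):
    """Build a density map from (x, y) center points (hypothetical helper).

    Each point contributes one Gaussian normalized to sum to 1 over the
    image grid, so the whole map sums to the number of points.
    """
    ys, xs = np.mgrid[0:height, 0:width]
    dmap = np.zeros((height, width), dtype=np.float64)
    for x, y in points:
        g = np.exp(-((xs - x) ** 2 + (ys - y) ** 2) / (2.0 * sigma ** 2))
        total = g.sum()
        if total > 0:
            dmap += g / total  # one object adds exactly 1 to the total mass
    return dmap


def mean_absolute_error(pred_counts, true_counts):
    """Average absolute difference between predicted and true counts."""
    pred = np.asarray(pred_counts, dtype=np.float64)
    true = np.asarray(true_counts, dtype=np.float64)
    return float(np.mean(np.abs(pred - true)))


# Hypothetical annotations: three fish centers in a 64x64 image.
points = [(10, 12), (40, 25), (41, 27)]
dmap = density_map(points, height=64, width=64)

# The estimated count is the integral (sum) of the density map.
estimated_count = float(dmap.sum())
```

A counting network with a density branch would be trained to predict a map like `dmap` from the image, and its count estimate would be the sum of the predicted map.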
Related papers
- A Density-Guided Temporal Attention Transformer for Indiscernible Object
Counting in Underwater Video [27.329015161325962]
Indiscernible object counting, which aims to count the number of targets that are blended with respect to their surroundings, has been a challenge.
We propose a large-scale dataset called YoutubeFish-35, which contains a total of 35 sequences of high-definition videos.
We propose TransVidCount, a new strong baseline that combines density and regression branches along the temporal domain in a unified framework.
arXiv Detail & Related papers (2024-03-06T04:54:00Z)
- Improving Underwater Visual Tracking With a Large Scale Dataset and
Image Enhancement [70.2429155741593]
This paper presents a new dataset and general tracker enhancement method for Underwater Visual Object Tracking (UVOT).
It poses distinct challenges; the underwater environment exhibits non-uniform lighting conditions, low visibility, lack of sharpness, low contrast, camouflage, and reflections from suspended particles.
We propose a novel underwater image enhancement algorithm designed specifically to boost tracking quality.
The method has resulted in a significant performance improvement, of up to 5.0% AUC, of state-of-the-art (SOTA) visual trackers.
arXiv Detail & Related papers (2023-08-30T07:41:26Z)
- LaRS: A Diverse Panoptic Maritime Obstacle Detection Dataset and
Benchmark [9.864996020621701]
We present the first maritime panoptic obstacle detection benchmark LaRS, featuring scenes from Lakes, Rivers and Seas.
LaRS is composed of over 4000 per-pixel labeled key frames with nine preceding frames to allow utilization of the temporal texture.
We report the results of 27 semantic and panoptic segmentation methods, along with several performance insights and future research directions.
arXiv Detail & Related papers (2023-08-18T15:21:15Z)
- KOLOMVERSE: Korea open large-scale image dataset for object detection in the maritime universe [0.5732204366512352]
We present KOLOMVERSE, an open large-scale image dataset for object detection in the maritime domain by KRISO.
We collected 5,845 hours of video data captured from 21 territorial waters of South Korea.
The dataset has images of 3840$\times$2160 pixels and to our knowledge, it is by far the largest publicly available dataset for object detection in the maritime domain.
arXiv Detail & Related papers (2022-06-20T16:45:12Z)
- Highly Accurate Dichotomous Image Segmentation [139.79513044546]
A new task called dichotomous image segmentation (DIS) aims to segment highly accurate objects from natural images.
We collect the first large-scale dataset, DIS5K, which contains 5,470 high-resolution (e.g., 2K, 4K or larger) images.
We also introduce a simple intermediate supervision baseline (IS-Net) using both feature-level and mask-level guidance for DIS model training.
arXiv Detail & Related papers (2022-03-06T20:09:19Z)
- ASOD60K: Audio-Induced Salient Object Detection in Panoramic Videos [79.05486554647918]
We propose PV-SOD, a new task that aims to segment salient objects from panoramic videos.
In contrast to existing fixation-level or object-level saliency detection tasks, we focus on multi-modal salient object detection (SOD).
We collect the first large-scale dataset, named ASOD60K, which contains 4K-resolution video frames annotated with a six-level hierarchy.
arXiv Detail & Related papers (2021-07-24T15:14:20Z)
- Concealed Object Detection [140.98738087261887]
We present the first systematic study on concealed object detection (COD).
COD aims to identify objects that are "perfectly" embedded in their background.
To better understand this task, we collect a large-scale dataset called COD10K.
arXiv Detail & Related papers (2021-02-20T06:49:53Z)
- Counting from Sky: A Large-scale Dataset for Remote Sensing Object
Counting and A Benchmark Method [52.182698295053264]
We are interested in counting dense objects from remote sensing images. Compared with object counting in a natural scene, this task is challenging because of the following factors: large scale variation, complex cluttered background, and orientation arbitrariness.
To address these issues, we first construct a large-scale object counting dataset with remote sensing images, which contains four important geographic objects.
We then benchmark the dataset by designing a novel neural network that can generate a density map of an input image.
arXiv Detail & Related papers (2020-08-28T03:47:49Z)
- RPT: Learning Point Set Representation for Siamese Visual Tracking [15.04182251944942]
We propose an efficient visual tracking framework to accurately estimate the target state with a finer representation as a set of representative points.
Our method achieves new state-of-the-art performance while running at over 20 FPS.
arXiv Detail & Related papers (2020-08-08T07:42:58Z)