Multi-label Video Classification for Underwater Ship Inspection
- URL: http://arxiv.org/abs/2305.17338v1
- Date: Sat, 27 May 2023 02:38:54 GMT
- Title: Multi-label Video Classification for Underwater Ship Inspection
- Authors: Md Abulkalam Azad, Ahmed Mohammed, Maryna Waszak, Brian Elvesæter and Martin Ludvigsen
- Abstract summary: We propose an automatic video analysis system using deep learning and computer vision to improve upon existing methods.
Our proposed method has demonstrated promising results and can serve as a benchmark for future research and development in underwater video hull inspection applications.
- Score: 2.537406035246369
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Today, ship hull inspection, including the examination of the external
coating, detection of defects, and other types of external degradation such as
corrosion and marine growth, is conducted underwater by means of Remotely
Operated Vehicles (ROVs). The inspection process consists of a manual video
analysis, which is time-consuming and labor-intensive. To address this, we
propose an automatic video analysis system using deep learning and computer
vision to improve upon existing methods that only consider spatial information
on individual frames in underwater ship hull video inspection. By exploring the
benefits of adding temporal information and analyzing frame-based classifiers,
we propose a multi-label video classification model that exploits the
self-attention mechanism of transformers to capture spatiotemporal attention in
consecutive video frames. Our proposed method has demonstrated promising
results and can serve as a benchmark for future research and development in
underwater video inspection applications.
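The abstract describes the architecture only at a high level. As a rough illustration, the sketch below (PyTorch) shows one way such a model can be assembled: a CNN backbone extracts per-frame features, a transformer encoder applies self-attention across consecutive frames to capture spatiotemporal context, and a sigmoid head scores each label independently. The label names, backbone, and hyperparameters are illustrative assumptions, not the authors' configuration.

```python
# A minimal sketch (not the authors' released code) of a transformer-based
# multi-label video classifier in this spirit.
import torch
import torch.nn as nn
from torchvision.models import resnet18

LABELS = ["corrosion", "marine_growth", "paint_damage", "other_defect"]  # assumed

class VideoMultiLabelClassifier(nn.Module):
    def __init__(self, num_labels=len(LABELS), d_model=512, num_frames=16):
        super().__init__()
        backbone = resnet18(weights=None)
        backbone.fc = nn.Identity()                  # keep the 512-d pooled features
        self.backbone = backbone
        self.pos_embed = nn.Parameter(torch.zeros(1, num_frames, d_model))
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=8, batch_first=True)
        self.temporal_encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_labels)

    def forward(self, clip):                         # clip: (B, T, C, H, W)
        b, t = clip.shape[:2]
        feats = self.backbone(clip.flatten(0, 1))    # per-frame features: (B*T, 512)
        feats = feats.view(b, t, -1) + self.pos_embed[:, :t]
        feats = self.temporal_encoder(feats)         # self-attention over frames
        return self.head(feats.mean(dim=1))          # temporal pooling -> logits

model = VideoMultiLabelClassifier()
probs = torch.sigmoid(model(torch.randn(2, 16, 3, 224, 224)))  # per-label scores
```

Training such a model would use a multi-label objective such as nn.BCEWithLogitsLoss, since a single clip can exhibit several conditions (e.g., corrosion and marine growth) at once.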
Related papers
- Appearance-Based Refinement for Object-Centric Motion Segmentation [85.2426540999329]
We introduce an appearance-based refinement method that leverages temporal consistency in video streams to correct inaccurate flow-based proposals.
Our approach involves a sequence-level selection mechanism that identifies accurate flow-predicted masks as exemplars.
Its performance is evaluated on multiple video segmentation benchmarks, including DAVIS, YouTube, SegTrackv2, and FBMS-59.
arXiv Detail & Related papers (2023-12-18T18:59:51Z)
- Video Dynamics Prior: An Internal Learning Approach for Robust Video Enhancements [83.5820690348833]
We present a framework for low-level vision tasks that does not require any external training data corpus.
Our approach learns the weights of neural modules by optimizing over a corrupted sequence, leveraging the spatio-temporal coherence and internal statistics of the video.
arXiv Detail & Related papers (2023-12-13T01:57:11Z)
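The internal-learning idea in the Video Dynamics Prior entry above, i.e. optimizing a network on the single corrupted sequence itself rather than on an external corpus, can be sketched roughly as follows. The tiny 3D-conv network, fixed noise input, and masked loss are assumptions for illustration, not the paper's actual modules.

```python
# A minimal internal-learning loop: a small network with a fixed random input
# is fitted to the observed pixels of one corrupted sequence; the network's
# inductive bias fills in the corrupted regions (no external training data).
import torch
import torch.nn as nn

corrupted = torch.rand(1, 3, 8, 64, 64)            # (B, C, T, H, W) toy clip
mask = (torch.rand_like(corrupted) > 0.2).float()  # 1 = observed, 0 = missing

net = nn.Sequential(                               # tiny 3D-conv video prior
    nn.Conv3d(3, 16, 3, padding=1), nn.ReLU(),
    nn.Conv3d(16, 3, 3, padding=1), nn.Sigmoid(),
)
z = torch.randn(1, 3, 8, 64, 64)                   # fixed noise input
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(200):
    restored = net(z)                              # candidate restored video
    loss = ((restored - corrupted) * mask).pow(2).mean()  # fit observed pixels only
    opt.zero_grad(); loss.backward(); opt.step()
```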
- SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles [3.2716252389196288]
This paper outlines our approach to detecting dynamic scene changes in video from Unmanned Surface Vehicles (USVs).
Our objective is to identify significant changes in the dynamic scenes of maritime video data, particularly those scenes that exhibit a high degree of resemblance.
In our system for dynamic scene change detection, we propose a completely unsupervised learning method.
arXiv Detail & Related papers (2023-11-20T07:34:01Z)
- Evaluating Deep Learning Assisted Automated Aquaculture Net Pens Inspection Using ROV [0.27309692684728615]
Fish escape from fish farms into the open sea due to net damage.
Traditional inspection systems rely on visual inspection by expert divers or ROVs.
This article presents a robotic-based automatic net defect detection system for aquaculture net pens.
arXiv Detail & Related papers (2023-08-26T09:35:49Z)
- Context-Driven Detection of Invertebrate Species in Deep-Sea Video [11.38215488702246]
We present a benchmark suite to train, validate, and test methods for temporally localizing four underwater substrates and 59 underwater invertebrate species.
DUSIA currently includes over ten hours of footage across 25 videos captured in 1080p at 30 fps by an ROV.
Some frames are annotated with precise bounding box locations for invertebrate species of interest.
arXiv Detail & Related papers (2022-06-01T18:59:46Z)
- Flow-Guided Sparse Transformer for Video Deblurring [124.11022871999423]
Flow-Guided Sparse Transformer (FGST) is a framework for video deblurring.
FGSW-MSA enjoys the guidance of the estimated optical flow to globally sample spatially sparse elements corresponding to the same scene patch in neighboring frames.
Our proposed FGST outperforms state-of-the-art methods on both DVD and GOPRO datasets and even yields more visually pleasing results in real video deblurring.
arXiv Detail & Related papers (2022-01-06T02:05:32Z)
- Video Salient Object Detection via Contrastive Features and Attention Modules [106.33219760012048]
We propose a network with attention modules to learn contrastive features for video salient object detection.
A co-attention formulation is utilized to combine the low-level and high-level features.
We show that the proposed method requires less computation, and performs favorably against the state-of-the-art approaches.
arXiv Detail & Related papers (2021-11-03T17:40:32Z)
- Self-supervised Video Object Segmentation by Motion Grouping [79.13206959575228]
We develop a computer vision system able to segment objects by exploiting motion cues.
We introduce a simple variant of the Transformer to segment optical flow frames into primary objects and the background.
We evaluate the proposed architecture on public benchmarks (DAVIS2016, SegTrackv2, and FBMS59).
arXiv Detail & Related papers (2021-04-15T17:59:32Z)
- Robust Unsupervised Video Anomaly Detection by Multi-Path Frame Prediction [61.17654438176999]
We propose a novel and robust unsupervised video anomaly detection method by frame prediction with proper design.
Our proposed method obtains the frame-level AUROC score of 88.3% on the CUHK Avenue dataset.
arXiv Detail & Related papers (2020-11-05T11:34:12Z)
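The frame-prediction scoring idea in the entry above can be illustrated with a small sketch: a predictor estimates each frame from its predecessors, the prediction error serves as the anomaly score, and frame-level AUROC summarizes detection quality. The repeat-last-frame predictor and toy data below are stand-ins, not the paper's multi-path model.

```python
# Toy frame-prediction anomaly scoring with frame-level AUROC.
import numpy as np
from sklearn.metrics import roc_auc_score

def anomaly_scores(frames, predict_next):
    """Score frame t by the error of predicting it from frames[:t]."""
    return np.array([
        float(np.mean((frames[t] - predict_next(frames[:t])) ** 2))
        for t in range(1, len(frames))
    ])

frames = np.tile(np.random.rand(64, 64), (50, 1, 1))   # static toy video
frames[30:35] += np.random.rand(5, 64, 64)             # injected anomalies
scores = anomaly_scores(frames, lambda past: past[-1])  # naive predictor
labels = np.zeros(49)
labels[29:34] = 1                  # scores index t-1, so frames 30..34
print("frame-level AUROC:", roc_auc_score(labels, scores))
```

In this toy setup the first normal frame after the anomalous burst also scores high (it is predicted from an anomalous frame), which slightly depresses the AUROC; stronger learned predictors mitigate such boundary effects.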
- Anomaly Detection in Video Data Based on Probabilistic Latent Space Models [7.269230232703388]
A Variational Autoencoder (VAE) is used for reducing the dimensionality of video frames.
An Adapted Markov Jump Particle Filter defined by discrete and continuous inference levels is employed to predict the following frames.
Our method is evaluated on different video scenarios where a semi-autonomous vehicle performs a set of tasks in a closed environment.
arXiv Detail & Related papers (2020-03-17T10:32:22Z)
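The dimensionality-reduction stage named in the entry above can be sketched as a small frame-level VAE; the downstream Adapted Markov Jump Particle Filter is not shown, and the layer sizes, latent dimension, and loss are illustrative assumptions rather than the paper's configuration.

```python
# Minimal VAE that compresses flattened video frames to low-dimensional
# latents, on which a downstream predictor could operate.
import torch
import torch.nn as nn
import torch.nn.functional as F

class FrameVAE(nn.Module):
    def __init__(self, in_dim=64 * 64, z_dim=16):
        super().__init__()
        self.enc = nn.Linear(in_dim, 256)
        self.mu, self.logvar = nn.Linear(256, z_dim), nn.Linear(256, z_dim)
        self.dec = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(),
                                 nn.Linear(256, in_dim), nn.Sigmoid())

    def forward(self, x):                      # x: (B, 64*64) frames in [0, 1]
        h = F.relu(self.enc(x))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()  # reparameterize
        return self.dec(z), mu, logvar

def vae_loss(x, recon, mu, logvar):            # reconstruction + KL divergence
    rec = F.binary_cross_entropy(recon, x, reduction="sum")
    kld = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return rec + kld

frames = torch.rand(32, 64 * 64)               # a batch of flattened frames
recon, mu, logvar = FrameVAE()(frames)
loss = vae_loss(frames, recon, mu, logvar)     # mu is the 16-d frame code
```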