SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles
- URL: http://arxiv.org/abs/2311.11580v1
- Date: Mon, 20 Nov 2023 07:34:01 GMT
- Title: SeaDSC: A video-based unsupervised method for dynamic scene change detection in unmanned surface vehicles
- Authors: Linh Trinh, Ali Anwar, Siegfried Mercelis
- Abstract summary: This paper outlines our approach to detecting dynamic scene changes in Unmanned Surface Vehicles (USVs).
Our objective is to identify significant changes in the dynamic scenes of maritime video data, particularly those scenes that exhibit a high degree of resemblance.
For dynamic scene change detection, we propose a completely unsupervised learning method.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, there has been an upsurge in research on maritime vision, where
much of the work is driven by applications of computer vision to Unmanned
Surface Vehicles (USVs). Various sensor modalities such as camera, radar, and
lidar have been used to perform tasks such as object detection, segmentation,
object tracking, and motion planning. A large subset of this research focuses
on video analysis, since most current vessel fleets carry onboard cameras for
various surveillance tasks. Given the vast abundance of video data, video scene
change detection is an initial and crucial stage in scene understanding for
USVs. This paper outlines our approach to detecting dynamic scene changes in
USVs. To the best of our knowledge, this work represents the first
investigation of scene change detection in maritime vision applications. Our
objective is to identify significant changes in the dynamic scenes of maritime
video data, particularly those scenes that exhibit a high degree of
resemblance. For dynamic scene change detection, we propose a completely
unsupervised learning method. In contrast to earlier studies, we utilize a
modified state-of-the-art generative image model, VQ-VAE-2, trained on
multiple marine datasets to enhance feature extraction. Next, we introduce a
novel similarity scoring technique that directly calculates the level of
similarity between consecutive frames via grid-based computation on the
extracted features. Experiments were conducted on a nautical video dataset
called RoboWhaler to showcase the efficient performance of our technique.
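The pipeline described in the abstract (encoder features followed by grid-based similarity scoring on consecutive frames) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the feature maps are assumed to come from a pretrained encoder such as VQ-VAE-2, and the grid size and similarity threshold are hypothetical parameters.

```python
import numpy as np

def grid_similarity(feat_a, feat_b, grid=4):
    """Average per-cell cosine similarity between two (H, W, C) feature
    maps, computed over a grid x grid partition of the spatial dims."""
    H, W, _ = feat_a.shape
    hs, ws = H // grid, W // grid
    sims = []
    for i in range(grid):
        for j in range(grid):
            a = feat_a[i * hs:(i + 1) * hs, j * ws:(j + 1) * ws].ravel()
            b = feat_b[i * hs:(i + 1) * hs, j * ws:(j + 1) * ws].ravel()
            denom = np.linalg.norm(a) * np.linalg.norm(b)
            sims.append(float(a @ b / denom) if denom > 0 else 1.0)
    return float(np.mean(sims))

def detect_scene_changes(frame_features, threshold=0.8):
    """Return indices of frames whose feature similarity to the previous
    frame drops below the threshold, i.e. candidate scene changes."""
    return [t for t in range(1, len(frame_features))
            if grid_similarity(frame_features[t - 1],
                               frame_features[t]) < threshold]
```

In this sketch, a low average cosine similarity across grid cells signals a scene change; the per-cell averaging makes the score sensitive to localized changes that a single global comparison might wash out.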
Related papers
- TK-Planes: Tiered K-Planes with High Dimensional Feature Vectors for Dynamic UAV-based Scenes (2024-05-04)
  We present a new approach to bridge the domain gap between synthetic and real-world data for unmanned aerial vehicle (UAV)-based perception.
  Our formulation is designed for dynamic scenes, consisting of small moving objects or human actions.
  We evaluate its performance on challenging datasets, including Okutama Action and UG2.
- V-MAD: Video-based Morphing Attack Detection in Operational Scenarios (2024-04-10)
  This paper introduces and explores the potential of Video-based Morphing Attack Detection (V-MAD) systems in real-world operational scenarios.
  V-MAD is based on video sequences, exploiting the video streams that are often already acquired by face verification tools.
  We show for the first time the advantages that the availability of multiple probe frames can bring to the morphing attack detection task.
- SGDViT: Saliency-Guided Dynamic Vision Transformer for UAV Tracking (2023-03-08)
  This work presents a novel saliency-guided dynamic vision Transformer (SGDViT) for UAV tracking.
  The proposed method designs a new task-specific object saliency mining network to refine the cross-correlation operation.
  A lightweight saliency filtering Transformer further refines saliency information and increases the focus on appearance information.
- Deep Learning Computer Vision Algorithms for Real-time UAVs On-board Camera Image Processing (2022-11-02)
  This paper describes how advanced deep learning based computer vision algorithms are applied to enable real-time on-board sensor processing for small UAVs.
  All algorithms have been developed using state-of-the-art image processing methods based on deep neural networks.
- Implicit Motion Handling for Video Camouflaged Object Detection (2022-03-14)
  We propose a new video camouflaged object detection (VCOD) framework.
  It can exploit both short-term and long-term temporal consistency to detect camouflaged objects from video frames.
- Recent Trends in 2D Object Detection and Applications in Video Event Recognition (2022-02-07)
  We discuss the pioneering works in object detection, followed by the recent breakthroughs that employ deep learning.
  We highlight recent datasets for 2D object detection both in images and videos, and present a comparative performance summary of various state-of-the-art object detection techniques.
- PreViTS: Contrastive Pretraining with Video Tracking Supervision (2021-12-01)
  PreViTS is an unsupervised SSL framework for selecting clips containing the same object.
  PreViTS spatially constrains the frame regions to learn from and trains the model to locate meaningful objects.
  We train a momentum contrastive (MoCo) encoder on VGG-Sound and Kinetics-400 datasets with PreViTS.
- Multi-Object Tracking with Deep Learning Ensemble for Unmanned Aerial System Applications (2021-10-05)
  Multi-object tracking (MOT) is a crucial component of situational awareness in military defense applications.
  We present a robust object tracking architecture designed to accommodate noise in real-time situations.
  We propose a kinematic prediction model, called Deep Extended Kalman Filter (DeepEKF), in which a sequence-to-sequence architecture is used to predict entity trajectories in latent space.
- A Flow Base Bi-path Network for Cross-scene Video Crowd Understanding in Aerial View (2020-09-29)
  In this paper, we strive to tackle the challenges and automatically understand the crowd from the visual data collected from drones.
  To alleviate the background noise generated in cross-scene testing, a double-stream crowd counting model is proposed.
  To tackle the crowd density estimation problem under extreme dark environments, we introduce synthetic data generated by the game Grand Theft Auto V (GTA V).
- Perceiving Traffic from Aerial Images (2020-09-16)
  We propose an object detection method called Butterfly Detector that is tailored to detect objects in aerial images.
  We evaluate our Butterfly Detector on two publicly available UAV datasets (UAVDT and VisDrone 2019) and show that it outperforms previous state-of-the-art methods while remaining real-time.
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.