Multi-Modal Domain Fusion for Multi-modal Aerial View Object
Classification
- URL: http://arxiv.org/abs/2212.07039v1
- Date: Wed, 14 Dec 2022 05:14:02 GMT
- Title: Multi-Modal Domain Fusion for Multi-modal Aerial View Object
Classification
- Authors: Sumanth Udupa, Aniruddh Sikdar, Suresh Sundaram
- Abstract summary: A novel Multi-Modal Domain Fusion(MDF) network is proposed to learn the domain invariant features from multi-modal data.
The network achieves top-10 performance in Track-1 with an accuracy of 25.3% and top-5 performance in Track-2 with an accuracy of 34.26%.
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Object detection and classification using aerial images is a challenging
task, as information about targets is not abundant. Synthetic Aperture
Radar (SAR) images can be used for Automatic Target Recognition (ATR) systems,
since SAR can operate in all weather conditions and in low-light settings.
However, SAR images contain salt-and-pepper noise (speckle noise) that hinders
deep learning models from extracting meaningful features. Using only aerial-view
Electro-Optical (EO) images for ATR systems may also not yield high accuracy,
as these images are of low resolution and do not provide ample information
in extreme weather conditions. Therefore, information from multiple sensors can
be combined to enhance the performance of ATR systems. In this paper, we explore
a methodology that uses both EO and SAR sensor information to effectively improve
the performance of ATR systems by compensating for the shortcomings of each
sensor. A novel Multi-Modal Domain Fusion (MDF) network is proposed to learn
domain-invariant features from multi-modal data and use them to accurately
classify aerial-view objects. The proposed MDF network achieves top-10
performance in Track-1 with an accuracy of 25.3% and top-5 performance in
Track-2 with an accuracy of 34.26% in the test phase on the PBVS MAVOC
Challenge dataset [18].
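The paper's MDF architecture is not reproduced here, but the general idea it builds on, encoding each modality with its own branch and fusing the features into one shared representation for classification, can be illustrated with a minimal NumPy sketch. All layer sizes, the concatenation-style fusion, and the function names below are illustrative assumptions, not the authors' network:

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

def branch(x, w, b):
    """One sensor-specific encoder: a single dense layer with ReLU.
    (Illustrative stand-in for a full CNN backbone.)"""
    return relu(x @ w + b)

# Assumed dimensions: 64-d EO features, 64-d SAR features,
# each encoded to 32-d, fused, then scored over 10 classes.
w_eo,  b_eo  = rng.normal(size=(64, 32)) * 0.1, np.zeros(32)
w_sar, b_sar = rng.normal(size=(64, 32)) * 0.1, np.zeros(32)
w_cls, b_cls = rng.normal(size=(64, 10)) * 0.1, np.zeros(10)

def fuse_and_classify(eo, sar):
    """Encode each modality separately, concatenate the embeddings
    (feature-level fusion), and produce per-class logits."""
    z = np.concatenate([branch(eo, w_eo, b_eo),
                        branch(sar, w_sar, b_sar)], axis=-1)
    return z @ w_cls + b_cls

eo_feat  = rng.normal(size=(4, 64))  # batch of 4 EO feature vectors
sar_feat = rng.normal(size=(4, 64))  # paired SAR feature vectors
print(fuse_and_classify(eo_feat, sar_feat).shape)  # (4, 10)
```

In a trained system, each branch would be a deep backbone and the fusion would be learned so that the shared embedding is domain-invariant; the sketch only shows the data flow.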
Related papers
- Performance Assessment of Feature Detection Methods for 2-D FS Sonar Imagery [11.23455335391121]
Key challenges include non-uniform lighting and poor visibility in turbid environments.
High-frequency forward-look sonar cameras address these issues.
We evaluate a number of feature detectors using real sonar images from five different sonar devices.
arXiv: 2024-09-11
- Multi-Stage Fusion Architecture for Small-Drone Localization and Identification Using Passive RF and EO Imagery: A Case Study [0.1872664641238533]
This work develops a multi-stage fusion architecture using passive radio frequency (RF) and electro-optic (EO) imagery data.
Supervised deep learning based techniques and unsupervised foreground/background separation techniques are explored to cope with challenging environments.
The proposed fusion architecture is tested, and tracking performance is quantified over range.
arXiv: 2024-03-30
- Fewer is More: Efficient Object Detection in Large Aerial Images [59.683235514193505]
This paper presents an Objectness Activation Network (OAN) to help detectors focus on fewer patches but achieve more efficient inference and more accurate results.
Using OAN, all five detectors acquire more than 30.0% speed-up on three large-scale aerial image datasets.
We extend our OAN to driving-scene object detection and 4K video object detection, boosting the detection speed by 112.1% and 75.0%, respectively.
arXiv: 2022-12-26
- Bridging the View Disparity of Radar and Camera Features for Multi-modal Fusion 3D Object Detection [6.959556180268547]
This paper focuses on how to utilize millimeter-wave (MMW) radar and camera sensor fusion for 3D object detection.
A novel method that realizes feature-level fusion under the bird's-eye view (BEV) for a better feature representation is proposed.
arXiv: 2022-03-30
- Target-aware Dual Adversarial Learning and a Multi-scenario Multi-Modality Benchmark to Fuse Infrared and Visible for Object Detection [65.30079184700755]
This study addresses the issue of fusing infrared and visible images that appear differently for object detection.
Previous approaches discover commonalities underlying the two modalities and fuse in the common space, either by iterative optimization or by deep networks.
This paper proposes a bilevel optimization formulation for the joint problem of fusion and detection, then unrolls it into a target-aware Dual Adversarial Learning (TarDAL) network for fusion and a commonly used detection network.
arXiv: 2022-01-22
- Learning Efficient Representations for Enhanced Object Detection on Large-scene SAR Images [16.602738933183865]
It is a challenging problem to detect and recognize targets on complex large-scene Synthetic Aperture Radar (SAR) images.
Recently developed deep learning algorithms can automatically learn the intrinsic features of SAR images.
We propose an efficient and robust deep learning based target detection method.
arXiv: 2022-01-18
- Attentional Feature Refinement and Alignment Network for Aircraft Detection in SAR Imagery [24.004052923372548]
Aircraft detection in Synthetic Aperture Radar (SAR) imagery is a challenging task due to aircraft's discrete appearance, obvious intra-class variation, small size, and severe background interference.
In this paper, a single-shot detector, namely the Attentional Feature Refinement and Alignment Network (AFRAN), is proposed for detecting aircraft in SAR images with competitive accuracy and speed.
arXiv: 2021-11-17
- Rethinking Drone-Based Search and Rescue with Aerial Person Detection [79.76669658740902]
The visual inspection of aerial drone footage is an integral part of land search and rescue (SAR) operations today.
We propose a novel deep learning algorithm to automate this aerial person detection (APD) task.
We present the novel Aerial Inspection RetinaNet (AIR) algorithm as the combination of these contributions.
arXiv: 2021-10-07
- Meta-UDA: Unsupervised Domain Adaptive Thermal Object Detection using Meta-Learning [64.92447072894055]
Infrared (IR) cameras are robust under adverse illumination and lighting conditions.
We propose an algorithm-agnostic meta-learning framework to improve existing UDA methods.
We produce a state-of-the-art thermal detector for the KAIST and DSIAC datasets.
arXiv: 2020-08-19
- Anchor-free Small-scale Multispectral Pedestrian Detection [88.7497134369344]
We propose a method for effective and efficient multispectral fusion of the two modalities in an adapted single-stage anchor-free base architecture.
We aim at learning pedestrian representations based on object center and scale rather than direct bounding box predictions.
Results show our method's effectiveness in detecting small-scale pedestrians.
arXiv: 2020-06-01
- Exploring Thermal Images for Object Detection in Underexposure Regions for Autonomous Driving [67.69430435482127]
Underexposure regions are vital to construct a complete perception of the surroundings for safe autonomous driving.
The availability of thermal cameras has provided an essential alternative for exploring regions where other optical sensors fail to capture interpretable signals.
This work proposes a domain adaptation framework which employs a style transfer technique for transfer learning from visible spectrum images to thermal images.
This list is automatically generated from the titles and abstracts of the papers on this site.