Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft
- URL: http://arxiv.org/abs/2405.05574v1
- Date: Thu, 9 May 2024 06:48:42 GMT
- Title: Vision-Language Modeling with Regularized Spatial Transformer Networks for All Weather Crosswind Landing of Aircraft
- Authors: Debabrata Pal, Anvita Singh, Saumya Saumya, Shouvik Das,
- Abstract summary: In harsh weather, a pilot must have a clear view of runway elements before the minimum decision altitude.
A vision-based system tailored to localize runway elements is likewise affected by such weather, especially during crosswind.
We propose to integrate a prompt-based climatic diffusion network with a weather distillation model using a novel diffusion-distillation loss.
- Score: 0.3749861135832073
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The intrinsic capability of the Human Vision System (HVS) to perceive depth of field and extract salient information stimulates a pilot to prefer a manual landing over an autoland approach. However, harsh weather creates visibility hindrances, and a pilot must have a clear view of runway elements before the minimum decision altitude. A vision-based system tailored to help a pilot localize runway elements during manual landing is likewise affected, especially during crosswind, owing to the projective distortion of aircraft camera images. To combat this, we propose to integrate a prompt-based climatic diffusion network with a weather distillation model using a novel diffusion-distillation loss. Precisely, the diffusion model synthesizes climatic-conditioned landing images, and the weather distillation model learns the inverse mapping by clearing those visual degradations. Then, to tackle the crosswind landing scenario, a novel Regularized Spatial Transformer Network (RuSTaN) learns to accurately calibrate for projective distortion using self-supervised learning, which minimizes the localization error of the downstream runway object detector. Finally, we have simulated a clear-day landing scenario at the busiest airport globally to curate an image-based Aircraft Landing Dataset (AIRLAD), and we experimentally validate our contributions using this dataset to benchmark performance.
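To make the self-supervised calibration idea concrete, below is a minimal PyTorch sketch of a spatial transformer with an identity-anchored regularizer, trained by undoing randomly injected warps. The network layout, the affine (rather than fully projective) warp parameterization, and the loss weighting are illustrative assumptions, not the paper's RuSTaN implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class RegularizedSTN(nn.Module):
    """Predicts a 2x3 affine warp from the input image and applies it."""
    def __init__(self):
        super().__init__()
        self.loc = nn.Sequential(
            nn.Conv2d(3, 8, 7, stride=2, padding=3), nn.ReLU(),
            nn.Conv2d(8, 16, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            nn.Linear(16 * 4 * 4, 32), nn.ReLU(),
            nn.Linear(32, 6),
        )
        # Start at the identity warp so early training is stable.
        self.loc[-1].weight.data.zero_()
        self.loc[-1].bias.data.copy_(torch.tensor([1., 0., 0., 0., 1., 0.]))

    def forward(self, x):
        theta = self.loc(x).view(-1, 2, 3)
        grid = F.affine_grid(theta, x.size(), align_corners=False)
        return F.grid_sample(x, grid, align_corners=False), theta

def self_supervised_step(model, clean, reg_weight=0.01):
    """Inject a random warp into a clean frame, then train the STN to undo it."""
    b = clean.size(0)
    identity = torch.tensor([[1., 0., 0.], [0., 1., 0.]]).expand(b, 2, 3)
    distort = identity + 0.1 * torch.randn(b, 2, 3)   # simulated crosswind skew
    grid = F.affine_grid(distort, clean.size(), align_corners=False)
    warped = F.grid_sample(clean, grid, align_corners=False)
    rectified, theta = model(warped)
    recon = F.mse_loss(rectified, clean)              # undo the distortion
    reg = F.mse_loss(theta, identity)                 # keep warps near identity
    return recon + reg_weight * reg

loss = self_supervised_step(RegularizedSTN(), torch.rand(2, 3, 64, 64))
loss.backward()
```

Initializing the localization head at the identity and penalizing departures from it is a common way to keep a spatial transformer from drifting early in training.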
Related papers
- Power Line Aerial Image Restoration under Adverse Weather: Datasets and Baselines [17.3009572002435]
Power Line Autonomous Inspection (PLAI) plays a crucial role in the construction of smart grids.
PLAI is accomplished by accurately detecting electrical components and defects in the aerial images captured by UAVs.
The visual quality of aerial images is inevitably degraded by adverse weather such as haze, rain, or snow, which our research finds drastically decreases detection accuracy.
We propose a new task of Power Line Aerial Image Restoration under Adverse Weather (PLAIR-AW), which aims to recover clean and high-quality images from degraded images with bad weather.
arXiv Detail & Related papers (2024-09-07T12:53:05Z)
- Genuine Knowledge from Practice: Diffusion Test-Time Adaptation for Video Adverse Weather Removal [53.15046196592023]
We introduce test-time adaptation into adverse weather removal in videos.
We propose the first framework that integrates test-time adaptation into the iterative diffusion reverse process.
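As a schematic of what adapting during the reverse process can look like, the sketch below takes one gradient step on a temporal-consistency proxy loss at every reverse step before continuing the denoising trajectory; the tiny denoiser, the proxy loss, and the crude update rule are placeholders, not the paper's framework.

```python
import torch
import torch.nn as nn

class TinyDenoiser(nn.Module):
    """Placeholder video denoiser; timestep conditioning omitted for brevity."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(16, 3, 3, padding=1))
    def forward(self, x, t):
        return self.net(x)

def reverse_with_tta(model, frames, steps=10, lr=1e-4):
    """frames: (T, 3, H, W) degraded clip, treated as a batch of frames."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    x = frames.clone()
    for t in reversed(range(steps)):
        # Test-time adaptation: consecutive frames should denoise consistently.
        pred = model(x, t)
        tta_loss = (pred[1:] - pred[:-1]).abs().mean()
        opt.zero_grad(); tta_loss.backward(); opt.step()
        # One (highly simplified) reverse update using the adapted weights.
        with torch.no_grad():
            x = x - 0.1 * model(x, t)
    return x

restored = reverse_with_tta(TinyDenoiser(), torch.randn(4, 3, 32, 32))
```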
arXiv Detail & Related papers (2024-03-12T14:21:30Z)
- Aircraft Landing Time Prediction with Deep Learning on Trajectory Images [18.536109188450876]
In this study, a trajectory image-based deep learning method is proposed to predict aircraft landing times (ALTs) for aircraft entering the research airspace that covers the Terminal Maneuvering Area (TMA).
The trajectory images encode various kinds of information, including the aircraft position, speed, heading, relative distances, and arrival traffic flows.
We also use real-time runway usage obtained from the trajectory data, together with external information such as aircraft types and weather conditions, as additional inputs.
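As a hedged sketch of this input encoding, the snippet below rasterizes trajectory samples into image channels (position trace, speed, heading) and regresses a landing time with a small CNN; the channel layout, resolution, and network are guesses, and the paper's richer inputs (relative distances, arrival flows, runway usage, aircraft types, weather) are omitted.

```python
import numpy as np
import torch
import torch.nn as nn

def rasterize_trajectory(xy, speed, heading, size=64):
    """xy in [0,1]^2; paints position, speed, and heading into 3 channels."""
    img = np.zeros((3, size, size), dtype=np.float32)
    cols = np.clip((xy[:, 0] * (size - 1)).astype(int), 0, size - 1)
    rows = np.clip((xy[:, 1] * (size - 1)).astype(int), 0, size - 1)
    img[0, rows, cols] = 1.0       # occupancy / position trace
    img[1, rows, cols] = speed     # normalized ground speed
    img[2, rows, cols] = heading   # normalized heading
    return img

class ALTRegressor(nn.Module):
    """Tiny CNN mapping a trajectory image to a landing-time estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 1))
    def forward(self, x):
        return self.net(x)

n = 100
img = rasterize_trajectory(np.random.rand(n, 2), np.random.rand(n), np.random.rand(n))
alt_seconds = ALTRegressor()(torch.from_numpy(img).unsqueeze(0))
```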
arXiv Detail & Related papers (2024-01-02T07:56:05Z)
- Exploring the Application of Large-scale Pre-trained Models on Adverse Weather Removal [97.53040662243768]
We propose a CLIP embedding module to make the network handle different weather conditions adaptively.
This module integrates the sample-specific weather prior extracted by the CLIP image encoder with distribution-specific information learned by a set of parameters.
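A minimal sketch of such conditioning, with a generic frozen image embedding standing in for the CLIP image encoder and learned per-weather tokens playing the role of the distribution-specific parameters; the FiLM-style fusion and all dimensions are assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class WeatherPriorModule(nn.Module):
    def __init__(self, embed_dim=512, feat_ch=64, n_weather=4):
        super().__init__()
        # Distribution-specific information: one learned vector per weather type.
        self.weather_tokens = nn.Parameter(torch.randn(n_weather, embed_dim))
        self.to_film = nn.Linear(embed_dim, 2 * feat_ch)  # scale and shift

    def forward(self, clip_embed, feats):
        # Soft-select a weather token by similarity to the sample's embedding.
        attn = torch.softmax(clip_embed @ self.weather_tokens.t(), dim=-1)
        prior = attn @ self.weather_tokens + clip_embed  # sample + distribution info
        scale, shift = self.to_film(prior).chunk(2, dim=-1)
        # Modulate restoration features with the fused weather prior.
        return feats * (1 + scale[..., None, None]) + shift[..., None, None]

mod = WeatherPriorModule()
out = mod(torch.randn(2, 512), torch.randn(2, 64, 32, 32))
```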
arXiv Detail & Related papers (2023-06-15T10:06:13Z)
- ScatterNeRF: Seeing Through Fog with Physically-Based Inverse Neural Rendering [83.75284107397003]
We introduce ScatterNeRF, a neural rendering method which renders scenes and decomposes the fog-free background.
We propose a disentangled representation for the scattering volume and the scene objects, and learn the scene reconstruction with physics-inspired losses.
We validate our method by capturing multi-view In-the-Wild data and controlled captures in a large-scale fog chamber.
arXiv Detail & Related papers (2023-05-03T13:24:06Z)
- Inferring Traffic Models in Terminal Airspace from Flight Tracks and Procedures [52.25258289718559]
We propose a probabilistic model that can learn the variability from procedural data and flight tracks collected from radar surveillance data.
We show how a pairwise model can be used to generate traffic involving an arbitrary number of aircraft.
arXiv Detail & Related papers (2023-03-17T13:58:06Z)
- Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models [72.76182801289497]
We present a novel method, Aerial Diffusion, for generating aerial views from a single ground-view image using text guidance.
We address two main challenges corresponding to the domain gap between the ground view and the aerial view.
Aerial Diffusion is the first approach that performs ground-to-aerial translation in an unsupervised manner.
arXiv Detail & Related papers (2023-03-15T22:26:09Z)
- Monocular BEV Perception of Road Scenes via Front-to-Top View Projection [57.19891435386843]
We present a novel framework that reconstructs a local map formed by road layout and vehicle occupancy in the bird's-eye view.
Our model runs at 25 FPS on a single GPU, making it efficient enough for real-time panoramic HD map reconstruction.
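For intuition about front-to-top view projection, here is a classical inverse-perspective-mapping baseline that warps the front view onto the ground plane with a fixed homography; the point correspondences below are hypothetical, and the paper learns this projection rather than assuming a planar scene.

```python
import numpy as np
import cv2

# Hypothetical correspondences: four road-plane points in the front view (px)
# and where they should land in a 640x640 top-down map (px).
src = np.float32([[420, 600], [860, 600], [1100, 720], [180, 720]])
dst = np.float32([[200, 0], [440, 0], [440, 640], [200, 640]])
H, _ = cv2.findHomography(src, dst)

front = np.zeros((720, 1280, 3), dtype=np.uint8)  # placeholder camera frame
bev = cv2.warpPerspective(front, H, (640, 640))   # top-down road-plane view
```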
arXiv Detail & Related papers (2022-11-15T13:52:41Z)
- PlaneDepth: Self-supervised Depth Estimation via Orthogonal Planes [41.517947010531074]
Depth estimation based on multiple near-fronto-parallel planes has demonstrated impressive results in self-supervised monocular depth estimation (MDE).
We propose PlaneDepth, a novel plane-based representation that includes vertical planes and ground planes.
Our method can extract the ground plane in an unsupervised manner, which is important for autonomous driving.
arXiv Detail & Related papers (2022-10-04T13:51:59Z)
- Phased Flight Trajectory Prediction with Deep Learning [8.898269198985576]
The unprecedented increase in commercial airline and private jet traffic over the past ten years presents a challenge for air traffic control.
Precise flight trajectory prediction is of great significance in air transportation management, which contributes to the decision-making for safe and orderly flights.
We propose a phased flight trajectory prediction framework that can outperform state-of-the-art methods for flight trajectory prediction for large passenger/transport airplanes.
arXiv Detail & Related papers (2022-03-17T02:16:02Z)
- Predicting Flight Delay with Spatio-Temporal Trajectory Convolutional Network and Airport Situational Awareness Map [20.579487904188802]
We propose a vision-based solution that achieves high forecasting accuracy and is applicable at the airport level.
We propose an end-to-end deep learning architecture, TrajCNN, which captures both the spatial and temporal information from the situational awareness map.
Our proposed framework obtained a good result (an error of around 18 minutes) when predicting flight departure delay at Los Angeles International Airport.
arXiv Detail & Related papers (2021-05-19T07:38:57Z)