AviationLMM: A Large Multimodal Foundation Model for Civil Aviation
- URL: http://arxiv.org/abs/2601.09105v2
- Date: Fri, 16 Jan 2026 04:56:15 GMT
- Title: AviationLMM: A Large Multimodal Foundation Model for Civil Aviation
- Authors: Wenbin Li, Jingling Wu, Xiaoyong Lin, Jing Chen, Cong Chen,
- Abstract summary: This paper introduces the vision of AviationLMM, a Large Multimodal foundation Model for civil aviation. We describe the model architecture that ingests multimodal inputs such as air-ground voice, surveillance, on-board telemetry, video and structured texts. We identify key research opportunities to address, including data acquisition, alignment and fusion, pretraining, reasoning, trustworthiness, privacy, robustness to missing modalities, and synthetic scenario generation.
- Score: 4.416746793380407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Civil aviation is a cornerstone of global transportation and commerce, and ensuring its safety, efficiency and customer satisfaction is paramount. Yet conventional Artificial Intelligence (AI) solutions in aviation remain siloed and narrow, focusing on isolated tasks or single modalities. They struggle to integrate heterogeneous data such as voice communications, radar tracks, sensor streams and textual reports, which limits situational awareness, adaptability, and real-time decision support. This paper introduces the vision of AviationLMM, a Large Multimodal foundation Model for civil aviation, designed to unify the heterogeneous data streams of civil aviation and enable understanding, reasoning, generation and agentic applications. First, we identify the gaps between existing AI solutions and requirements. Second, we describe a model architecture that ingests multimodal inputs such as air-ground voice, surveillance, on-board telemetry, video and structured texts, performs cross-modal alignment and fusion, and produces flexible outputs ranging from situation summaries and risk alerts to predictive diagnostics and multimodal incident reconstructions. To fully realize this vision, we identify key research opportunities, including data acquisition, alignment and fusion, pretraining, reasoning, trustworthiness, privacy, robustness to missing modalities, and synthetic scenario generation. By articulating the design and challenges of AviationLMM, we aim to advance foundation models for civil aviation and catalyze coordinated research efforts toward an integrated, trustworthy and privacy-preserving aviation AI ecosystem.
Related papers
- Forging Spatial Intelligence: A Roadmap of Multi-Modal Data Pre-Training for Autonomous Systems [75.78934957242403]
Self-driving vehicles and drones require true Spatial Intelligence from multi-modal onboard sensor data. This paper presents a framework for multi-modal pre-training, identifying the core set of techniques driving progress toward this goal.
arXiv Detail & Related papers (2025-12-30T17:58:01Z)
- AerialMind: Towards Referring Multi-Object Tracking in UAV Scenarios [64.51320327698231]
We introduce AerialMind, the first large-scale RMOT benchmark in UAV scenarios. We develop an innovative semi-automated collaborative agent-based labeling assistant framework. We also propose HawkEyeTrack, a novel method that collaboratively enhances vision-language representation learning.
arXiv Detail & Related papers (2025-11-26T04:44:27Z)
- All You Need for Object Detection: From Pixels, Points, and Prompts to Next-Gen Fusion and Multimodal LLMs/VLMs in Autonomous Vehicles [7.863490977061713]
Autonomous Vehicles (AVs) are transforming the future of transportation through advances in intelligent perception, decision-making, and control systems. Their success is tied to one core capability, reliable object detection in complex and multimodal environments. Recent breakthroughs in Computer Vision (CV) and Artificial Intelligence (AI) have driven remarkable progress. This survey bridges that gap by delivering a forward-looking analysis of object detection in AVs.
arXiv Detail & Related papers (2025-10-30T16:08:25Z)
- Integrating Neurosymbolic AI in Advanced Air Mobility: A Comprehensive Survey [19.989015008002056]
Neurosymbolic AI combines neural network adaptability with symbolic reasoning. This survey reviews its applications across key Advanced Air Mobility domains. We classify current advancements, present relevant case studies, and outline future research directions.
arXiv Detail & Related papers (2025-08-10T03:30:06Z)
- Agentic Satellite-Augmented Low-Altitude Economy and Terrestrial Networks: A Survey on Generative Approaches [76.12691010182802]
This survey focuses on enabling agentic artificial intelligence (AI) in satellite-augmented low-altitude economy and terrestrial networks (SLAETNs). We introduce the architecture and characteristics of SLAETNs, and analyze the challenges that arise in integrating satellite, aerial, and terrestrial components. We examine how these models empower agentic functions across three domains: communication enhancement, security and privacy protection, and intelligent satellite tasks.
arXiv Detail & Related papers (2025-07-19T14:07:05Z)
- UAVs Meet Agentic AI: A Multidomain Survey of Autonomous Aerial Intelligence and Agentic UAVs [0.36868085124383626]
Agentic UAVs surpass traditional UAVs by exhibiting goal-driven behavior, contextual reasoning, and interactive autonomy. This study explores seven high-impact application domains: precision agriculture, construction & mining, disaster response, environmental monitoring, infrastructure inspection, logistics, security, and wildlife conservation.
arXiv Detail & Related papers (2025-06-08T01:39:51Z)
- Generative AI for Autonomous Driving: Frontiers and Opportunities [145.6465312554513]
This survey delivers a comprehensive synthesis of the emerging role of GenAI across the autonomous driving stack. We begin by distilling the principles and trade-offs of modern generative modeling, encompassing VAEs, GANs, Diffusion Models, and Large Language Models. We categorize practical applications, such as synthetic data generalization, end-to-end driving strategies, high-fidelity digital twin systems, smart transportation networks, and cross-domain transfer to embodied AI.
arXiv Detail & Related papers (2025-05-13T17:59:20Z)
- Probabilistic Mission Design in Neuro-Symbolic Systems [19.501311018760177]
Probabilistic Mission Design (ProMis) is a system architecture that links geospatial and sensory data with declarative, Hybrid Probabilistic Logic Programs (HPLP). ProMis generates Probabilistic Mission Landscapes (PML), which quantify the agent's belief that a set of mission conditions is satisfied across its navigation space. We show its integration with potent machine learning models such as Large Language Models (LLM) and Transformer-based vision models.
arXiv Detail & Related papers (2024-12-25T11:04:00Z)
- Aligning Cyber Space with Physical World: A Comprehensive Survey on Embodied AI [116.8199519880327]
Embodied Artificial Intelligence (Embodied AI) is crucial for achieving Artificial General Intelligence (AGI). In this survey, we give a comprehensive exploration of the latest advancements in Embodied AI.
arXiv Detail & Related papers (2024-07-09T14:14:47Z)
- Large Language Models for UAVs: Current State and Pathways to the Future [6.85423435360359]
Unmanned Aerial Vehicles (UAVs) have emerged as a transformative technology across diverse sectors.
This work explores the significant potential of integrating UAVs and Large Language Models (LLMs) to propel the development of autonomous systems.
arXiv Detail & Related papers (2024-05-02T21:30:10Z)
- Data-Driven Aerospace Engineering: Reframing the Industry with Machine Learning [49.367020832638794]
The aerospace industry is poised to capitalize on big data and machine learning.
Recent trends will be explored in the context of critical challenges in design, manufacturing, verification and services.
arXiv Detail & Related papers (2020-08-24T22:40:26Z)