Speech and Natural Language Processing Technologies for Pseudo-Pilot
Simulator
- URL: http://arxiv.org/abs/2212.07164v1
- Date: Wed, 14 Dec 2022 11:34:59 GMT
- Title: Speech and Natural Language Processing Technologies for Pseudo-Pilot
Simulator
- Authors: Amrutha Prasad, Juan Zuluaga-Gomez, Petr Motlicek, Saeed Sarfjoo,
Iuliia Nigmatulina, Karel Vesely
- Abstract summary: This paper describes a simple yet efficient repetition-based modular system for speeding up air-traffic controller (ATCo) training.
For example, a human pilot is still required in EUROCONTROL's ESCAPE lite simulator (see https://www.eurocontrol.int/simulator/escape) during ATCo training.
This human pilot can be replaced by an automatic system that acts as a pilot.
- Score: 0.5480546613836199
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper describes a simple yet efficient repetition-based modular system
for speeding up air-traffic controller (ATCo) training. For example, a human pilot
is still required in EUROCONTROL's ESCAPE lite simulator (see
https://www.eurocontrol.int/simulator/escape) during ATCo training. However,
this human pilot can be replaced by an automatic system that acts as a pilot.
In this paper, we aim to develop and integrate a pseudo-pilot agent into the
ATCo training pipeline by merging diverse artificial intelligence (AI) powered
modules. The system understands the voice communications issued by the ATCo
and, in turn, generates a spoken reply to the initial communication that
follows the pilot's phraseology. Our system relies mainly on open-source AI tools
and air traffic control (ATC) databases, which keeps it simple and easy
to replicate. The overall pipeline is composed of the following: (1) a
submodule that receives and pre-processes the input stream of raw audio; (2) an
automatic speech recognition (ASR) system that transforms audio into a sequence
of words; (3) a high-level ATC-related entity parser, which extracts relevant
information from the communication, i.e., callsigns and commands; and finally,
(4) a speech synthesizer submodule that generates responses based on the
high-level ATC entities previously extracted. Overall, we show that this system
could pave the way toward developing a real proof-of-concept pseudo-pilot
system, speeding up the training of ATCos while drastically reducing its
overall cost.
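The four-stage pipeline named in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: every function name, the `KNOWN_CALLSIGNS` and `COMMAND_KEYWORDS` tables, and the keyword-matching parser are hypothetical stand-ins. The ASR stage simply decodes bytes as text, and the synthesizer stage returns the reply as text instead of audio, so that the sketch stays runnable without any speech models. Placing the callsign at the end of the reply is likewise an assumption about the readback format.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical lookup tables; a real system would draw on ATC databases.
KNOWN_CALLSIGNS = ("lufthansa three two one", "swiss four five six")
COMMAND_KEYWORDS = ("descend", "climb", "turn left", "turn right", "contact")


@dataclass
class AtcEntities:
    """High-level entities extracted from one ATCo communication."""
    callsign: str
    commands: List[str] = field(default_factory=list)


def preprocess_audio(raw: bytes) -> bytes:
    """Stage 1 (stub): receive and pre-process the raw input stream."""
    return raw.strip()


def recognize_speech(audio: bytes) -> str:
    """Stage 2 (stub): stand-in for ASR; decodes the bytes as text."""
    return audio.decode("utf-8").lower().strip()


def parse_entities(transcript: str) -> AtcEntities:
    """Stage 3 (toy parser): extract the callsign and the commands."""
    callsign = ""
    for candidate in KNOWN_CALLSIGNS:
        if transcript.startswith(candidate):
            callsign = candidate
            break
    rest = transcript[len(callsign):].strip()
    # Each command starts where one of the keywords first occurs.
    starts = []
    for keyword in COMMAND_KEYWORDS:
        index = rest.find(keyword)
        if index != -1:
            starts.append(index)
    starts.sort()
    ends = starts[1:] + [len(rest)]
    commands = [rest[s:e].strip() for s, e in zip(starts, ends)]
    return AtcEntities(callsign=callsign, commands=commands)


def generate_readback(entities: AtcEntities) -> str:
    """Stage 4 (stub): build the reply text that a synthesizer would speak."""
    parts = entities.commands + [entities.callsign]
    return ", ".join(part for part in parts if part)


def pseudo_pilot(raw_audio: bytes) -> str:
    """Run the four stages in sequence on one ATCo communication."""
    audio = preprocess_audio(raw_audio)
    transcript = recognize_speech(audio)
    entities = parse_entities(transcript)
    return generate_readback(entities)


if __name__ == "__main__":
    utterance = (b"Lufthansa three two one descend flight level eight zero "
                 b"turn left heading two seven zero")
    print(pseudo_pilot(utterance))
```

Keeping each stage behind its own function mirrors the modular design the abstract describes: any stub could be swapped for a real ASR model, entity parser, or speech synthesizer without touching the others.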
Related papers
- Joint vs Sequential Speaker-Role Detection and Automatic Speech Recognition for Air-traffic Control [60.35553925189286]
We propose a transformer-based joint ASR-SRD system that solves both tasks jointly while relying on a standard ASR architecture.
We compare this joint system against two cascaded approaches for ASR and SRD on multiple ATC datasets.
arXiv Detail & Related papers (2024-06-19T21:11:01Z)
- LLM4Drive: A Survey of Large Language Models for Autonomous Driving [62.10344445241105]
Large language models (LLMs) have demonstrated abilities including understanding context, logical reasoning, and generating answers.
In this paper, we systematically review a research line about Large Language Models for Autonomous Driving (LLM4AD).
arXiv Detail & Related papers (2023-11-02T07:23:33Z)
- A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers [0.797970449705065]
We propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training.
The engine receives spoken communications from ATCo trainees, and it performs automatic speech recognition and understanding.
To the best of our knowledge, this is the first work fully based on open-source ATC resources and AI tools.
arXiv Detail & Related papers (2023-04-16T17:45:21Z)
- ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications [51.24043482906732]
We introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging air traffic control (ATC) field.
The ATCO2 corpus is split into three subsets.
We expect the ATCO2 corpus will foster research on robust ASR and NLU.
arXiv Detail & Related papers (2022-11-08T07:26:45Z)
- Call-sign recognition and understanding for noisy air-traffic transcripts using surveillance information [72.20674534231314]
Air traffic control (ATC) relies on communication via speech between pilot and air-traffic controller (ATCO).
The call-sign, as the unique identifier for each flight, is used to address a specific pilot by the ATCO.
We propose a new call-sign recognition and understanding (CRU) system that addresses this issue.
The recognizer is trained to identify call-signs in noisy ATC transcripts and convert them into the standard International Civil Aviation Organization (ICAO) format.
arXiv Detail & Related papers (2022-04-13T11:30:42Z)
- Deliberation Model for On-Device Spoken Language Understanding [69.5587671262691]
We propose a novel deliberation-based approach to end-to-end (E2E) spoken language understanding (SLU).
We show that our approach can significantly reduce the degradation when moving from natural speech to synthetic speech training.
arXiv Detail & Related papers (2022-04-04T23:48:01Z)
- BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications [2.270534915073284]
When a Speech Activity Detection (SAD) or diarization system fails, two or more single-speaker segments end up in the same recording.
We developed a system that combines the segmentation of a SAD module with a BERT-based model that performs Speaker Change Detection (SCD) and Speaker Role Identification (SRI) based on ASR transcripts (i.e., diarization + SRI).
The proposed model reaches up to 0.90/0.95 F1-score on ATCO/pilot for SRI on several test sets.
arXiv Detail & Related papers (2021-10-12T07:25:12Z)
- Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems [0.6465251961564605]
The callsign used to address an airplane is an essential part of all ATCo-pilot communications.
We propose a two-step approach to add contextual knowledge during semi-supervised training to reduce the ASR system error rates.
arXiv Detail & Related papers (2021-04-08T09:53:54Z)
- A question-answering system for aircraft pilots' documentation [58.720142291102135]
The aerospace industry relies on massive collections of complex and technical documents covering system descriptions, manuals or procedures.
This paper presents a question answering (QA) system that would help aircraft pilots access information by naturally interacting with the system and asking questions in natural language.
arXiv Detail & Related papers (2020-11-26T13:33:47Z)
- Automatic Speech Recognition Benchmark for Air-Traffic Communications [1.175956452196938]
CleanSky EC-H2020 ATCO2 aims to develop an ASR-based platform to collect, organize and automatically pre-process ATCo speech-data from air space.
Recognition errors caused by speakers' accents are minimized thanks to the amount of data, making the system feasible for ATC environments.
arXiv Detail & Related papers (2020-06-18T06:49:22Z)