A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers
- URL: http://arxiv.org/abs/2304.07842v1
- Date: Sun, 16 Apr 2023 17:45:21 GMT
- Title: A Virtual Simulation-Pilot Agent for Training of Air Traffic Controllers
- Authors: Juan Zuluaga-Gomez, Amrutha Prasad, Iuliia Nigmatulina, Petr Motlicek,
Matthias Kleinert
- Abstract summary: We propose a novel virtual simulation-pilot engine for speeding up air traffic controller (ATCo) training.
The engine receives spoken communications from ATCo trainees, and it performs automatic speech recognition and understanding.
To the best of our knowledge, this is the first work fully based on open-source ATC resources and AI tools.
- Score: 0.797970449705065
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper we propose a novel virtual simulation-pilot engine for speeding
up air traffic controller (ATCo) training by integrating different
state-of-the-art artificial intelligence (AI) based tools. The virtual
simulation-pilot engine receives spoken communications from ATCo trainees, and
it performs automatic speech recognition and understanding. Thus, it goes
beyond only transcribing the communication and can also understand its meaning.
The output is subsequently sent to a response generator system, which resembles
the spoken read back that pilots give to the ATCo trainees. The overall
pipeline is composed of the following submodules: (i) an automatic speech
recognition (ASR) system that transforms audio into a sequence of words; (ii)
a high-level air traffic control (ATC) related entity parser that understands the
transcribed voice communication; and (iii) a text-to-speech submodule that
generates a spoken utterance that resembles a pilot based on the situation of
the dialogue. Our system employs state-of-the-art AI-based tools such as
Wav2Vec 2.0, Conformer, BERT and Tacotron models. To the best of our knowledge,
this is the first work fully based on open-source ATC resources and AI tools.
In addition, we have developed a robust and modular system with optional
submodules that can enhance the system's performance by incorporating real-time
surveillance data, metadata related to exercises (such as sectors or runways),
or even introducing deliberate read-back errors that ATCo trainees must learn
to identify. Our ASR system reaches word error rates (WER) as low as 5.5% and
15.9% on high- and low-quality ATC audio, respectively. We also demonstrate
that adding surveillance data into the ASR can yield callsign detection
accuracy of more than 96%.
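The three-stage flow described in the abstract (recognize, parse, respond) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration: the function names, the tiny lookup tables, and the rule-based parsing stand in for the Wav2Vec 2.0/Conformer, BERT and Tacotron models the paper actually uses, and the ASR stage is a fixed placeholder rather than a real recognizer. Audio synthesis is omitted; the sketch stops at the read-back text.

```python
from dataclasses import dataclass

# Hypothetical lookup tables for this sketch only. A real system would use
# trained ASR and entity-tagging models instead.
DIGITS = {"zero": "0", "one": "1", "two": "2", "three": "3", "four": "4",
          "five": "5", "six": "6", "seven": "7", "eight": "8", "nine": "9"}
AIRLINES = {"lufthansa": "DLH", "speedbird": "BAW"}  # telephony name -> ICAO code
COMMANDS = ("descend", "climb", "turn left", "turn right", "contact")


@dataclass
class ParsedUtterance:
    callsign_words: str  # callsign as spoken, e.g. "lufthansa three two one"
    callsign_icao: str   # compact form, e.g. "DLH321"
    command: str
    value: str


def recognize(audio_path: str) -> str:
    """Stage (i) placeholder: returns a fixed transcript instead of running ASR."""
    return "lufthansa three two one descend flight level eight zero"


def parse(transcript: str) -> ParsedUtterance:
    """Stage (ii): rule-based stand-in for the entity parser."""
    words = transcript.lower().split()
    if not words or words[0] not in AIRLINES:
        raise ValueError("no known airline telephony name at start of transcript")
    end = 1
    while end < len(words) and words[end] in DIGITS:
        end += 1  # consume the spoken digits of the flight number
    callsign_words = " ".join(words[:end])
    callsign_icao = AIRLINES[words[0]] + "".join(DIGITS[w] for w in words[1:end])
    rest = " ".join(words[end:])
    for command in COMMANDS:
        if rest.startswith(command):
            value = rest[len(command):].strip()
            return ParsedUtterance(callsign_words, callsign_icao, command, value)
    raise ValueError("no known command found after the callsign")


def generate_readback(parsed: ParsedUtterance, inject_error: bool = False) -> str:
    """Stage (iii), text only: instruction first, callsign last.

    With inject_error=True the last spoken digit of the value is replaced by
    the next digit, imitating a deliberate read-back error.
    """
    value_words = parsed.value.split()
    if inject_error:
        names = list(DIGITS)
        for i in range(len(value_words) - 1, -1, -1):
            if value_words[i] in DIGITS:
                position = names.index(value_words[i])
                value_words[i] = names[(position + 1) % len(names)]
                break
    return " ".join([parsed.command, *value_words, parsed.callsign_words])


if __name__ == "__main__":
    parsed = parse(recognize("trainee_utterance.wav"))  # hypothetical file name
    print(parsed.callsign_icao)
    print(generate_readback(parsed))
    print(generate_readback(parsed, inject_error=True))
```

Keeping each stage behind its own function mirrors the modular design the abstract emphasizes: any stage could be swapped for a trained model without touching the others.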
Related papers
- Joint vs Sequential Speaker-Role Detection and Automatic Speech Recognition for Air-traffic Control [60.35553925189286]
We propose a transformer-based joint ASR-SRD system that solves both tasks jointly while relying on a standard ASR architecture.
We compare this joint system against two cascaded approaches for ASR and SRD on multiple ATC datasets.
arXiv Detail & Related papers (2024-06-19T21:11:01Z)
- Lessons Learned in ATCO2: 5000 hours of Air Traffic Control Communications for Robust Automatic Speech Recognition and Understanding [3.4713477325880464]
The ATCO2 project aimed to develop a unique platform to collect and preprocess large amounts of ATC data from airspace in real time.
This paper reviews previous work from ATCO2 partners, including robust automatic speech recognition.
We believe that the pipeline developed during the ATCO2 project, along with the open-sourcing of its data, will encourage research in the ATC field.
arXiv Detail & Related papers (2023-05-02T02:04:33Z)
- Speech and Natural Language Processing Technologies for Pseudo-Pilot Simulator [0.5480546613836199]
This paper describes a simple yet efficient repetition-based modular system for speeding up air-traffic controller (ATCo) training.
E.g., a human pilot is still required in EUROCONTROL's ESCAPE lite simulator (see https://www.eurocontrol.int/simulator/escape) during ATCo training.
This need can be met by an automatic system that acts as a pilot.
arXiv Detail & Related papers (2022-12-14T11:34:59Z)
- ATCO2 corpus: A Large-Scale Dataset for Research on Automatic Speech Recognition and Natural Language Understanding of Air Traffic Control Communications [51.24043482906732]
We introduce the ATCO2 corpus, a dataset that aims at fostering research on the challenging air traffic control (ATC) field.
The ATCO2 corpus is split into three subsets.
We expect the ATCO2 corpus will foster research on robust ASR and NLU.
arXiv Detail & Related papers (2022-11-08T07:26:45Z)
- Call-sign recognition and understanding for noisy air-traffic transcripts using surveillance information [72.20674534231314]
Air traffic control (ATC) relies on speech communication between pilots and air-traffic controllers (ATCOs).
The call-sign, as the unique identifier for each flight, is used by the ATCO to address a specific pilot.
We propose a new call-sign recognition and understanding (CRU) system that addresses this issue.
The recognizer is trained to identify call-signs in noisy ATC transcripts and convert them into the standard International Civil Aviation Organization (ICAO) format.
arXiv Detail & Related papers (2022-04-13T11:30:42Z)
- CI-AVSR: A Cantonese Audio-Visual Speech Dataset for In-car Command Recognition [91.33781557979819]
We introduce a new dataset, Cantonese In-car Audio-Visual Speech Recognition (CI-AVSR).
It consists of 4,984 samples (8.3 hours) of 200 in-car commands recorded by 30 native Cantonese speakers.
We provide detailed statistics of both the clean and the augmented versions of our dataset.
arXiv Detail & Related papers (2022-01-11T06:32:12Z)
- BERTraffic: A Robust BERT-Based Approach for Speaker Change Detection and Role Identification of Air-Traffic Communications [2.270534915073284]
When a Speech Activity Detection (SAD) or diarization system fails, two or more single-speaker segments can end up in the same recording.
We developed a system that combines the segmentation of a SAD module with a BERT-based model that performs Speaker Change Detection (SCD) and Speaker Role Identification (SRI) based on ASR transcripts (i.e., diarization + SRI).
The proposed model reaches up to 0.90/0.95 F1-score on ATCO/pilot for SRI on several test sets.
arXiv Detail & Related papers (2021-10-12T07:25:12Z)
- Contextual Semi-Supervised Learning: An Approach To Leverage Air-Surveillance and Untranscribed ATC Data in ASR Systems [0.6465251961564605]
The callsign used to address an airplane is an essential part of all ATCo-pilot communications.
We propose a two-step approach to add contextual knowledge during semi-supervised training to reduce the ASR system error rates.
arXiv Detail & Related papers (2021-04-08T09:53:54Z)
- Automatic Speech Recognition Benchmark for Air-Traffic Communications [1.175956452196938]
The CleanSky EC-H2020 ATCO2 project aims to develop an ASR-based platform to collect, organize and automatically pre-process ATCo speech data from airspace.
Cross-accent flaws caused by speakers' accents are minimized thanks to the amount of data, making the system feasible for ATC environments.
arXiv Detail & Related papers (2020-06-18T06:49:22Z)
- You Do Not Need More Data: Improving End-To-End Speech Recognition by Text-To-Speech Data Augmentation [59.31769998728787]
We build our TTS system on an ASR training database and then extend the data with synthesized speech to train a recognition model.
Our system establishes a competitive result for end-to-end ASR trained on LibriSpeech train-clean-100 set with WER 4.3% for test-clean and 13.5% for test-other.
arXiv Detail & Related papers (2020-05-14T17:24:57Z)
- Improving Readability for Automatic Speech Recognition Transcription [50.86019112545596]
We propose a novel NLP task called ASR post-processing for readability (APR).
APR aims to transform the noisy ASR output into a readable text for humans and downstream tasks while maintaining the semantic meaning of the speaker.
We compare fine-tuned models based on several open-sourced and adapted pre-trained models with the traditional pipeline method.
arXiv Detail & Related papers (2020-04-09T09:26:42Z)
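
Several of the entries on this page report word error rate (WER). As a reminder of what that number measures, WER is conventionally computed as (S + D + I) / N, where S, D and I are the word substitutions, deletions and insertions needed to turn the reference transcript into the ASR hypothesis, and N is the number of reference words. The following is a minimal sketch of that standard definition using a word-level edit distance; the function name is an assumption, and the listed papers may apply text normalization before scoring that is not reproduced here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # previous[j] holds the edit distance between the reference words seen so
    # far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            current.append(min(
                previous[j] + 1,                      # deletion
                current[j - 1] + 1,                   # insertion
                previous[j - 1] + substitution_cost,  # match or substitution
            ))
        previous = current
    return previous[-1] / len(ref)


if __name__ == "__main__":
    reference = "descend flight level eight zero"
    hypothesis = "descend flight level eight"
    print(word_error_rate(reference, hypothesis))  # one deletion over five words
```

Because insertions count as errors, WER can exceed 1.0 when the hypothesis is much longer than the reference.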