Training Data Augmentation for Deep Learning Radio Frequency Systems
- URL: http://arxiv.org/abs/2010.00178v4
- Date: Mon, 4 Jan 2021 15:50:24 GMT
- Title: Training Data Augmentation for Deep Learning Radio Frequency Systems
- Authors: William H. Clark IV, Steven Hauser, William C. Headley, and Alan J.
Michaels
- Abstract summary: This work focuses on the data used during training.
In general, the examined data types each have useful contributions to a final application.
Despite the benefit of captured data, the difficulties and costs that arise from live collection often make the quantity of data needed to achieve peak performance impractical.
- Score: 1.1199585259018459
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Applications of machine learning are subject to three major components that
contribute to the final performance metrics. Within the category of neural
networks, and deep learning specifically, the first two are the architecture
for the model being trained and the training approach used. This work focuses
on the third component, the data used during training. The primary questions
that arise are "what is in the data" and "what within the data matters?"
Looking into the Radio Frequency Machine Learning (RFML) field of Automatic
Modulation Classification (AMC) as an example of a tool used for situational
awareness, the use of synthetic, captured, and augmented data are examined and
compared to provide insights about the quantity and quality of the available
data necessary to achieve desired performance levels. There are three questions
discussed within this work: (1) how useful a synthetically trained system is
expected to be when deployed without considering the environment within the
synthesis, (2) how can augmentation be leveraged within the RFML domain, and
lastly, (3) what impact knowledge of the signal degradations caused by the
transmission channel has on the performance of a system. In general,
the examined data types each have useful contributions to a final application,
but captured data germane to the intended use case will always provide more
significant information and enable the greatest performance. Despite the
benefit of captured data, the difficulties and costs that arise from live
collection often make the quantity of data needed to achieve peak performance
impractical. This paper helps quantify the balance between real and synthetic
data, offering concrete examples where training data is parametrically varied
in size and source.
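To make the augmentation theme concrete, the sketch below applies channel-inspired impairments (additive white Gaussian noise, a random carrier frequency offset, and a random phase rotation) to complex baseband IQ samples to generate additional AMC training examples. This is a minimal illustration of the kind of augmentation discussed, not the authors' pipeline; the impairment set and parameter ranges are assumptions.

```python
import numpy as np

def augment_iq(iq, snr_db=10.0, max_cfo=0.01, rng=None):
    """Apply channel-inspired impairments to a complex baseband IQ vector.

    Illustrative augmentation for AMC training data: a random phase rotation,
    a random carrier frequency offset (as a fraction of the sample rate), and
    additive white Gaussian noise at a target SNR. Parameter ranges are
    assumptions for demonstration, not values from the paper.
    """
    rng = np.random.default_rng() if rng is None else rng
    n = iq.shape[0]

    # Random phase rotation and carrier frequency offset.
    phase = rng.uniform(0.0, 2.0 * np.pi)
    cfo = rng.uniform(-max_cfo, max_cfo)
    t = np.arange(n)
    out = iq * np.exp(1j * (2.0 * np.pi * cfo * t + phase))

    # AWGN scaled to the requested SNR relative to the signal power.
    sig_power = np.mean(np.abs(out) ** 2)
    noise_power = sig_power / (10.0 ** (snr_db / 10.0))
    noise = np.sqrt(noise_power / 2.0) * (rng.standard_normal(n) + 1j * rng.standard_normal(n))
    return out + noise

# Example: turn one QPSK burst (hypothetical data) into several training variants.
bits = np.random.randint(0, 2, (2, 1024))
symbols = ((2 * bits[0] - 1) + 1j * (2 * bits[1] - 1)) / np.sqrt(2)
augmented = [augment_iq(symbols, snr_db=s) for s in (0.0, 5.0, 10.0)]
```

Sweeping snr_db (and the offset ranges) over the conditions expected in deployment is one simple way to fold the channel knowledge referenced by the third question into a synthetically trained system.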
Related papers
- Data Augmentation for Traffic Classification [54.92823760790628]
Data Augmentation (DA) is a technique widely adopted in Computer Vision (CV) and Natural Language Processing (NLP) tasks.
DA has struggled to gain traction in networking contexts, particularly in Traffic Classification (TC) tasks.
arXiv Detail & Related papers (2024-01-19T15:25:09Z)
- D3A-TS: Denoising-Driven Data Augmentation in Time Series [0.0]
This work focuses on studying and analyzing the use of different techniques for data augmentation in time series for classification and regression problems.
The proposed approach involves the use of diffusion probabilistic models, which have recently achieved successful results in the field of Image Processing.
The results highlight the high utility of this methodology in creating synthetic data to train classification and regression models.
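The summary gives no implementation details, so the sketch below shows only the standard forward (noising) half of a diffusion process on a toy series; in a denoising-driven pipeline such as D3A-TS, a trained denoiser would map the corrupted series back toward the data manifold to produce synthetic training examples. The noise schedule and step index are assumed values.

```python
import numpy as np

def forward_diffuse(series, t, betas):
    """Forward (noising) step of a diffusion process on a 1-D time series.

    q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).
    Only the corruption half is shown here as an illustrative assumption;
    the generative half would require a trained denoising model.
    """
    alphas = 1.0 - betas
    alpha_bar = np.cumprod(alphas)[t]
    noise = np.random.standard_normal(series.shape)
    return np.sqrt(alpha_bar) * series + np.sqrt(1.0 - alpha_bar) * noise

# Example: perturb a toy sine-wave series at a mid-range diffusion step.
T = 100
betas = np.linspace(1e-4, 0.02, T)   # linear noise schedule (assumed)
x0 = np.sin(np.linspace(0.0, 8.0 * np.pi, 256))
x_noisy = forward_diffuse(x0, t=40, betas=betas)
```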
arXiv Detail & Related papers (2023-12-09T11:37:07Z)
- A New Benchmark: On the Utility of Synthetic Data with Blender for Bare Supervised Learning and Downstream Domain Adaptation [42.2398858786125]
Deep learning in computer vision has achieved great success with the price of large-scale labeled training data.
The uncontrollable data collection process produces non-IID training and test data, where undesired duplication may exist.
To circumvent them, an alternative is to generate synthetic data via 3D rendering with domain randomization.
arXiv Detail & Related papers (2023-03-16T09:03:52Z)
- Automatic Data Augmentation via Invariance-Constrained Learning [94.27081585149836]
Underlying data structures are often exploited to improve the solution of learning tasks.
Data augmentation induces these symmetries during training by applying multiple transformations to the input data.
This work tackles these issues by automatically adapting the data augmentation while solving the learning task.
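As a minimal sketch of what "applying multiple transformations to the input data" can look like, the snippet below samples one transformation per example from a weighted policy; in an invariance-constrained scheme the weights would be adapted jointly with the model during training rather than held fixed as here. The transformation set and weights are illustrative assumptions.

```python
import numpy as np

# Candidate input transformations (assumed examples, not the paper's set).
TRANSFORMS = [
    lambda x, rng: x + 0.05 * rng.standard_normal(x.shape),  # additive jitter
    lambda x, rng: x * rng.uniform(0.8, 1.2),                # random scaling
    lambda x, rng: x[::-1].copy(),                           # time reversal
]

def augment(x, weights, rng=None):
    """Apply one randomly chosen transformation from a weighted policy."""
    rng = np.random.default_rng() if rng is None else rng
    idx = rng.choice(len(TRANSFORMS), p=weights / weights.sum())
    return TRANSFORMS[idx](x, rng)

weights = np.ones(len(TRANSFORMS))   # uniform policy to start; adapted in practice
x_aug = augment(np.sin(np.linspace(0.0, 6.28, 128)), weights)
```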
arXiv Detail & Related papers (2022-09-29T18:11:01Z)
- Online Data Selection for Federated Learning with Limited Storage [53.46789303416799]
Federated Learning (FL) has been proposed to achieve distributed machine learning among networked devices.
The impact of on-device storage on the performance of FL is still not explored.
In this work, we take the first step to consider the online data selection for FL with limited on-device storage.
arXiv Detail & Related papers (2022-09-01T03:27:33Z)
- Deep Feature Learning for Medical Acoustics [78.56998585396421]
The purpose of this paper is to compare different learnables in medical acoustics tasks.
A framework has been implemented to classify human respiratory sounds and heartbeats in two categories, i.e. healthy or affected by pathologies.
arXiv Detail & Related papers (2022-08-05T10:39:37Z)
- Training from Zero: Radio Frequency Machine Learning Data Quantity Forecasting [0.0]
The data used during training in any given application space is directly tied to the performance of the system once deployed.
One of the underlying rules of thumb in the machine learning space is that more data leads to better models.
This work examines a modulation classification problem in the Radio Frequency domain space.
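One common way to turn the "more data is better" rule of thumb into a forecast is to fit a power-law learning curve to small pilot runs and extrapolate to larger dataset sizes. The sketch below does that with hypothetical error measurements; the functional form, numbers, and forecast target are assumptions, not results from the paper.

```python
import numpy as np
from scipy.optimize import curve_fit

def power_law(n, a, b, c):
    """Classification error as a function of training-set size: a * n**(-b) + c."""
    return a * np.power(n, -b) + c

# Hypothetical (dataset size, error rate) measurements from small pilot runs.
sizes = np.array([500, 1000, 2000, 4000, 8000], dtype=float)
errors = np.array([0.42, 0.35, 0.29, 0.25, 0.22])

params, _ = curve_fit(power_law, sizes, errors, p0=(1.0, 0.3, 0.1), maxfev=10000)
a, b, c = params

# Forecast the error expected at a much larger, not-yet-collected dataset size.
print("predicted error at 100k examples:", power_law(1e5, a, b, c))
```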
arXiv Detail & Related papers (2022-05-07T18:45:06Z)
- Pushing the Limits of Simple Pipelines for Few-Shot Learning: External Data and Fine-Tuning Make a Difference [74.80730361332711]
Few-shot learning is an important and topical problem in computer vision.
We show that a simple transformer-based pipeline yields surprisingly good performance on standard benchmarks.
arXiv Detail & Related papers (2022-04-15T02:55:58Z)
- Deep Reinforcement Learning Assisted Federated Learning Algorithm for Data Management of IIoT [82.33080550378068]
The continuously expanding scale of the industrial Internet of Things (IIoT) leads to IIoT equipment generating massive amounts of user data every moment.
How to manage these time series data in an efficient and safe way in the field of IIoT is still an open issue.
This paper studies the FL technology applications to manage IIoT equipment data in wireless network environments.
arXiv Detail & Related papers (2022-02-03T07:12:36Z)
- MLReal: Bridging the gap between training on synthetic data and real data applications in machine learning [1.9852463786440129]
We describe a novel approach to enhance supervised training on synthetic data with real data features.
In the training stage, the input data are from the synthetic domain and the auto-correlated data are from the real domain.
In the inference/application stage, the input data are from the real subset domain and the mean of the autocorrelated sections are from the synthetic data subset domain.
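Read literally, the summary describes pairing inputs from one domain with autocorrelation information from the other during training, then swapping roles (with the mean synthetic autocorrelation) at inference. The numpy sketch below mirrors that pairing with hypothetical 1-D traces; it omits MLReal's actual preprocessing and is only a loose reading of the summary.

```python
import numpy as np

def autocorr(x):
    """Full autocorrelation of a 1-D trace, normalized by its zero-lag value."""
    ac = np.correlate(x, x, mode="full")
    return ac / ac[len(x) - 1]

# Hypothetical synthetic and real 1-D traces standing in for the two domains.
rng = np.random.default_rng(0)
synthetic = rng.standard_normal((32, 256))
real = rng.standard_normal((32, 256))

# Training stage: each synthetic input is paired with the autocorrelation of a real trace.
train_pairs = [(s, autocorr(r)) for s, r in zip(synthetic, real)]

# Inference stage: each real input is paired with the mean autocorrelation of the synthetic set.
mean_synth_ac = np.mean([autocorr(s) for s in synthetic], axis=0)
test_pairs = [(r, mean_synth_ac) for r in real]
```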
arXiv Detail & Related papers (2021-09-11T14:43:34Z)
- Improving the Performance of Fine-Grain Image Classifiers via Generative Data Augmentation [0.5161531917413706]
We develop Data Augmentation from Proficient Pre-Training of Robust Generative Adversarial Networks (DAPPER GAN).
DAPPER GAN is an ML analytics support tool that automatically generates novel views of training images.
We experimentally evaluate this technique on the Stanford Cars dataset, demonstrating improved vehicle make and model classification accuracy.
arXiv Detail & Related papers (2020-08-12T15:29:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.