Research Trends and Applications of Data Augmentation Algorithms
- URL: http://arxiv.org/abs/2207.08817v1
- Date: Mon, 18 Jul 2022 11:38:32 GMT
- Title: Research Trends and Applications of Data Augmentation Algorithms
- Authors: Joao Fonseca, Fernando Bacao
- Abstract summary: We identify the main areas of application of data augmentation algorithms, the types of algorithms used, significant research trends, their progression over time and research gaps in data augmentation literature.
We expect readers to understand the potential of data augmentation, as well as identify future research directions and open questions within data augmentation research.
- Score: 77.34726150561087
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: In the Machine Learning research community, there is a consensus regarding
the relationship between model complexity and the required amount of data and
computation power. In real world applications, these computational requirements
are not always available, motivating research on regularization methods. In
addition, current and past research has shown that simpler classification
algorithms can reach state-of-the-art performance on computer vision tasks
given a robust method to artificially augment the training dataset. Because of
this, data augmentation techniques became a popular research topic in recent
years. However, existing data augmentation methods are generally less
transferable than other regularization methods. In this paper we identify the
main areas of application of data augmentation algorithms, the types of
algorithms used, significant research trends, their progression over time and
research gaps in data augmentation literature. To do this, the related
literature was collected through the Scopus database. Its analysis was done
following network science, text mining and exploratory analysis approaches. We
expect readers to understand the potential of data augmentation, as well as
identify future research directions and open questions within data augmentation
research.
Related papers
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- A Comprehensive Survey on Data Augmentation [55.355273602421384]
Data augmentation is a technique that generates high-quality artificial data by manipulating existing data samples.
Existing literature surveys only focus on a certain type of specific modality data.
We propose a more enlightening taxonomy that encompasses data augmentation techniques for different common data modalities.
arXiv Detail & Related papers (2024-05-15T11:58:08Z)
- A Survey on Data Selection for Language Models [148.300726396877]
Data selection methods aim to determine which data points to include in a training dataset.
Deep learning is mostly driven by empirical evidence and experimentation on large-scale data is expensive.
Few organizations have the resources for extensive data selection research.
arXiv Detail & Related papers (2024-02-26T18:54:35Z)
- Mapping Computer Science Research: Trends, Influences, and Predictions [0.0]
We employ advanced machine learning techniques, including Decision Tree and Logistic Regression models, to predict trending research areas.
Our analysis reveals that the number of references cited in research papers (Reference Count) plays a pivotal role in determining trending research areas.
The Logistic Regression model outperforms the Decision Tree model in predicting trends, exhibiting higher accuracy, precision, recall, and F1 score.
arXiv Detail & Related papers (2023-08-01T16:59:25Z)
- Advanced Data Augmentation Approaches: A Comprehensive Survey and Future directions [57.30984060215482]
We provide a background of data augmentation, a novel and comprehensive taxonomy of reviewed data augmentation techniques, and the strengths and weaknesses (wherever possible) of each technique.
We also provide comprehensive results of the data augmentation effect on three popular computer vision tasks: image classification, object detection and semantic segmentation.
arXiv Detail & Related papers (2023-01-07T11:37:32Z)
- Data Augmentation techniques in time series domain: A survey and taxonomy [0.20971479389679332]
Deep neural networks used to work with time series heavily depend on the size and consistency of the datasets used in training.
This work systematically reviews the current state-of-the-art in the area to provide an overview of all available algorithms.
The ultimate aim of this study is to provide a summary of the evolution and performance of areas that produce better results to guide future researchers in this field.
arXiv Detail & Related papers (2022-06-25T17:09:00Z)
- Deep Generative Modeling in Network Science with Applications to Public Policy Research [0.0]
Network data is increasingly being used in quantitative, data-driven public policy research.
Deep generative methods can be used to generate realistic synthetic networks useful for microsimulation and agent-based models.
We develop a new generative framework which applies to large social contact networks commonly used in epidemiological modeling.
arXiv Detail & Related papers (2020-10-15T16:47:34Z)
- Deep Learning for Community Detection: Progress, Challenges and Opportunities [79.26787486888549]
This article summarizes the contributions of the various frameworks, models, and algorithms in deep neural networks.
arXiv Detail & Related papers (2020-05-17T11:22:11Z)
This list is automatically generated from the titles and abstracts of the papers on this site.