Related papers: Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting

Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting

URL: http://arxiv.org/abs/2407.19784v1
Date: Mon, 29 Jul 2024 08:27:21 GMT
Title: Survey and Taxonomy: The Role of Data-Centric AI in Transformer-Based Time Series Forecasting
Authors: Jingjing Xu, Caesar Wu, Yuan-Fang Li, Gregoire Danoy, Pascal Bouvry,
Abstract summary: We argue that data-centric AI is essential for training AI models, particularly for transformer-based TSF models efficiently. We review the previous research works from a data-centric AI perspective and we intend to lay the foundation work for the future development of transformer-based architecture and data-centric AI.
Score: 36.31269406067809
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Alongside the continuous process of improving AI performance through the development of more sophisticated models, researchers have also focused their attention to the emerging concept of data-centric AI, which emphasizes the important role of data in a systematic machine learning training process. Nonetheless, the development of models has also continued apace. One result of this progress is the development of the Transformer Architecture, which possesses a high level of capability in multiple domains such as Natural Language Processing (NLP), Computer Vision (CV) and Time Series Forecasting (TSF). Its performance is, however, heavily dependent on input data preprocessing and output data evaluation, justifying a data-centric approach to future research. We argue that data-centric AI is essential for training AI models, particularly for transformer-based TSF models efficiently. However, there is a gap regarding the integration of transformer-based TSF and data-centric AI. This survey aims to pin down this gap via the extensive literature review based on the proposed taxonomy. We review the previous research works from a data-centric AI perspective and we intend to lay the foundation work for the future development of transformer-based architecture and data-centric AI.

Related papers

Data Readiness for Scientific AI at Scale [0.0]
This paper examines how Data Readiness for AI (DRAI) principles apply to leadership-scale scientific datasets used to train foundation models.<n>We analyze archetypal across four representative domains - climate, nuclear fusion, bio/health, and materials.<n>We introduce a two-dimensional readiness framework composed of Data Readiness Levels (raw to AI-ready) and Data Processing Stages (ingest to shard)
arXiv Detail & Related papers (2025-07-30T18:30:37Z)
Multimodal Generative AI for Story Point Estimation in Software Development [0.9831489366502301]
This research explores the application of Multimodal Generative AI to enhance story point estimation in Agile software development.<n>By integrating text, image, and categorical data using advanced models like BERT, CNN, and XGBoost, our approach surpasses the limitations of traditional single-modal estimation methods.
arXiv Detail & Related papers (2025-05-22T06:40:41Z)
Data Analysis in the Era of Generative AI [56.44807642944589]
This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps.
arXiv Detail & Related papers (2024-09-27T06:31:03Z)
Enhancing Feature Selection and Interpretability in AI Regression Tasks Through Feature Attribution [38.53065398127086]
This study investigates the potential of feature attribution methods to filter out uninformative features in input data for regression problems. We introduce a feature selection pipeline that combines Integrated Gradients with k-means clustering to select an optimal set of variables from the initial data space. To validate the effectiveness of this approach, we apply it to a real-world industrial problem - blade vibration analysis in the development process of turbo machinery.
arXiv Detail & Related papers (2024-09-25T09:50:51Z)
AI Foundation Models in Remote Sensing: A Survey [6.036426846159163]
This paper provides a comprehensive survey of foundation models in the remote sensing domain. We categorize these models based on their applications in computer vision and domain-specific tasks. We highlight emerging trends and the significant advancements achieved by these foundation models.
arXiv Detail & Related papers (2024-08-06T22:39:34Z)
Data-Centric Long-Tailed Image Recognition [49.90107582624604]
Long-tail models exhibit a strong demand for high-quality data. Data-centric approaches aim to enhance both the quantity and quality of data to improve model performance. There is currently a lack of research into the underlying mechanisms explaining the effectiveness of information augmentation.
arXiv Detail & Related papers (2023-11-03T06:34:37Z)
On Responsible Machine Learning Datasets with Fairness, Privacy, and Regulatory Norms [56.119374302685934]
There have been severe concerns over the trustworthiness of AI technologies. Machine and deep learning algorithms depend heavily on the data used during their development. We propose a framework to evaluate the datasets through a responsible rubric.
arXiv Detail & Related papers (2023-10-24T14:01:53Z)
Data-centric Artificial Intelligence: A Survey [47.24049907785989]
Recently, the role of data in AI has been significantly magnified, giving rise to the emerging concept of data-centric AI. In this survey, we discuss the necessity of data-centric AI, followed by a holistic view of three general data-centric goals. We believe this is the first comprehensive survey that provides a global view of a spectrum of tasks across various stages of the data lifecycle.
arXiv Detail & Related papers (2023-03-17T17:44:56Z)
Data-centric AI: Perspectives and Challenges [51.70828802140165]
Data-centric AI (DCAI) advocates a fundamental shift from model advancements to ensuring data quality and reliability. We bring together three general missions: training data development, inference data development, and data maintenance.
arXiv Detail & Related papers (2023-01-12T05:28:59Z)
Data-Centric Artificial Intelligence [2.5874041837241304]
Data-centric artificial intelligence (data-centric AI) represents an emerging paradigm emphasizing that the systematic design and engineering of data is essential for building effective and efficient AI-based systems. We define relevant terms, provide key characteristics to contrast the data-centric paradigm to the model-centric one, and introduce a framework for data-centric AI.
arXiv Detail & Related papers (2022-12-22T16:41:03Z)
The Principles of Data-Centric AI (DCAI) [9.211953610948862]
Data-centric AI (DCAI) as an emerging concept brings data, its quality and its dynamism to the forefront. This article brings together data-centric perspectives and concepts to outline the foundations of DCAI.
arXiv Detail & Related papers (2022-11-26T16:43:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.