Federated Learning for Big Data: A Survey on Opportunities, Applications, and Future Directions
- URL: http://arxiv.org/abs/2110.04160v3
- Date: Mon, 07 Jul 2025 15:45:16 GMT
- Title: Federated Learning for Big Data: A Survey on Opportunities, Applications, and Future Directions
- Authors: Thippa Reddy Gadekallu, Quoc-Viet Pham, Thien Huynh-The, Hailin Feng, Kai Fang, Sharnil Pandya, Madhusanka Liyanage, Wei Wang, Thanh Thi Nguyen
- Abstract summary: Federated Learning (FL) emerges as a sub-field of machine learning. This paper reviews the potential of FL in big data acquisition, storage, big data analytics, and privacy preservation. The potential of FL in big data applications, such as smart city, smart healthcare, smart transportation, smart grid, and social media, is also explored.
- Score: 18.95670953718066
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: In recent years, the generation of data has escalated to extensive dimensions, and big data has emerged as a propelling force in the development of various machine learning advances and internet-of-things (IoT) devices. In this regard, the analytical and learning tools that transport data from several sources to a central cloud for processing, training, and storage enable realization of the potential of big data. Nevertheless, since the data may contain sensitive information such as bank account information, government information, and personal information, these traditional techniques often raise serious privacy concerns. To overcome such challenges, Federated Learning (FL) has emerged as a sub-field of machine learning that focuses on scenarios where several entities (commonly termed clients) work together to train a model while keeping their data decentralised. Although enormous effort has been channelled into such studies, there is still a gap in the literature: an extensive review of FL in the realm of big data services is missing. The present paper thus focuses on the use of FL in handling big data and related services, with a comprehensive review of the potential of FL in big data acquisition, storage, big data analytics, and privacy preservation. Subsequently, the potential of FL in big data applications, such as smart city, smart healthcare, smart transportation, smart grid, and social media, is also explored. The paper also highlights various projects pertaining to FL and big data and discusses the challenges associated with such implementations. This provides a direction for further research and encourages the development of plausible solutions.
Related papers
- Federated Large Language Models: Current Progress and Future Directions [63.68614548512534]
This paper surveys Federated learning for LLMs (FedLLM), highlighting recent advances and future directions.
We focus on two key aspects: fine-tuning and prompt learning in a federated setting, discussing existing work and associated research challenges.
arXiv Detail & Related papers (2024-09-24T04:14:33Z)
- Data-Centric AI in the Age of Large Language Models [51.20451986068925]
This position paper proposes a data-centric viewpoint of AI research, focusing on large language models (LLMs).
We make the key observation that data is instrumental in the developmental (e.g., pretraining and fine-tuning) and inferential stages (e.g., in-context learning) of LLMs.
We identify four specific scenarios centered around data, covering data-centric benchmarks and data curation, data attribution, knowledge transfer, and inference contextualization.
arXiv Detail & Related papers (2024-06-20T16:34:07Z)
- Federated Learning for 6G: Paradigms, Taxonomy, Recent Advances and Insights [52.024964564408]
This paper examines the added-value of implementing Federated Learning throughout all levels of the protocol stack.
It presents important FL applications, addresses hot topics, and provides valuable insights and explicit guidance for future research and developments.
Our concluding remarks aim to leverage the synergy between FL and future 6G, while highlighting FL's potential to revolutionize the wireless industry.
arXiv Detail & Related papers (2023-12-07T20:39:57Z)
- Federated Learning: A Cutting-Edge Survey of the Latest Advancements and Applications [6.042202852003457]
Federated learning (FL) is a technique for developing robust machine learning (ML) models.
To protect user privacy, FL requires users to send model updates rather than transmitting large quantities of raw and potentially confidential data.
This survey provides a comprehensive analysis and comparison of the most recent FL algorithms.
arXiv Detail & Related papers (2023-10-08T19:54:26Z)
- LargeST: A Benchmark Dataset for Large-Scale Traffic Forecasting [65.71129509623587]
Road traffic forecasting plays a critical role in smart city initiatives and has experienced significant advancements thanks to the power of deep learning.
However, the promising results achieved on current public datasets may not be applicable to practical scenarios.
We introduce the LargeST benchmark dataset, which includes a total of 8,600 sensors in California with a 5-year time coverage.
arXiv Detail & Related papers (2023-06-14T05:48:36Z)
- Federated Learning and Meta Learning: Approaches, Applications, and Directions [94.68423258028285]
In this tutorial, we present a comprehensive review of FL, meta learning, and federated meta learning (FedMeta).
Unlike other tutorial papers, our objective is to explore how FL, meta learning, and FedMeta methodologies can be designed, optimized, and evolved, and their applications over wireless networks.
arXiv Detail & Related papers (2022-10-24T10:59:29Z)
- Big Data and Analytics Implementation in Tertiary Institutions to Predict Students Performance in Nigeria [0.0]
The term Big Data has been coined to refer to the gargantuan bulk of data that cannot be dealt with by traditional data-handling techniques.
This paper explores the attributes of big data that are relevant to educational institutions.
It investigates the factors influencing the adoption of big data and analytics in learning institutions.
arXiv Detail & Related papers (2022-07-29T13:52:24Z)
- DataPerf: Benchmarks for Data-Centric AI Development [81.03754002516862]
DataPerf is a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We provide an open, online platform with multiple rounds of challenges to support this iterative development.
The benchmarks, online evaluation platform, and baseline implementations are open source.
arXiv Detail & Related papers (2022-07-20T17:47:54Z)
- Towards Federated Long-Tailed Learning [76.50892783088702]
Data privacy and class imbalance are the norm rather than the exception in many machine learning tasks.
Recent attempts have been launched to, on one side, address the problem of learning from pervasive private data, and on the other side, learn from long-tailed data.
This paper focuses on learning with long-tailed (LT) data distributions under the context of the popular privacy-preserved federated learning (FL) framework.
arXiv Detail & Related papers (2022-06-30T02:34:22Z)
- Edge-Native Intelligence for 6G Communications Driven by Federated Learning: A Survey of Trends and Challenges [14.008159759350264]
A new technique, coined as federated learning (FL), arose to bring machine learning to the edge of wireless networks.
FL exploits both decentralised datasets and computing resources of participating clients to develop a generalised ML model without compromising data privacy.
The purpose of this survey is to provide an overview of the state-of-the-art of FL applications in key wireless technologies.
arXiv Detail & Related papers (2021-11-14T17:13:34Z)
- "If we didn't solve small data in the past, how can we solve Big Data today?" [0.0]
We aim to research terms such as 'small' and 'big' data, understand their attributes, and look at ways in which they can add value.
Based on the research, it can be inferred that, regardless of how small data might have been used, organizations can still leverage big data with the right technology and business vision.
arXiv Detail & Related papers (2021-11-08T16:31:01Z)
- DID-eFed: Facilitating Federated Learning as a Service with Decentralized Identities [0.11470070927586015]
Federated learning (FL) emerges as a functional solution to build high-performance models shared among multiple parties.
We present DID-eFed, where FL is facilitated by decentralized identities (DID) and a smart contract.
We describe particularly the scenario where our DID-eFed enables the FL among hospitals and research institutions.
arXiv Detail & Related papers (2021-05-18T16:55:34Z)
- Data Mining with Big Data in Intrusion Detection Systems: A Systematic Literature Review [68.15472610671748]
Cloud computing has become a powerful and indispensable technology for complex, high performance and scalable computation.
The rapid rate and volume of data creation have begun to pose significant challenges for data management and security.
The design and deployment of intrusion detection systems (IDS) in the big data setting has, therefore, become a topic of importance.
arXiv Detail & Related papers (2020-05-23T20:57:12Z)
- A Review of Privacy-preserving Federated Learning for the Internet-of-Things [3.3517146652431378]
This work reviews federated learning as an approach for performing machine learning on distributed data.
We aim to protect the privacy of user-generated data and to reduce the communication costs associated with data transfer.
We identify the strengths and weaknesses of different methods applied to federated learning.
arXiv Detail & Related papers (2020-04-24T15:27:23Z)
- Evaluating the Communication Efficiency in Federated Learning Algorithms [3.713348568329249]
Recently, in light of new privacy legislation in many countries, the concept of Federated Learning (FL) has been introduced.
In FL, mobile users are empowered to learn a global model by aggregating their local models, without sharing the privacy-sensitive data.
This raises the challenge of communication cost when implementing FL at large scale.
arXiv Detail & Related papers (2020-04-06T15:31:54Z)
This list is automatically generated from the titles and abstracts of the papers in this site.