The OARF Benchmark Suite: Characterization and Implications for
Federated Learning Systems
- URL: http://arxiv.org/abs/2006.07856v4
- Date: Wed, 2 Mar 2022 05:22:17 GMT
- Title: The OARF Benchmark Suite: Characterization and Implications for
Federated Learning Systems
- Authors: Sixu Hu, Yuan Li, Xu Liu, Qinbin Li, Zhaomin Wu, Bingsheng He
- Abstract summary: Open Application Repository for Federated Learning (OARF) is a benchmark suite for federated machine learning systems.
OARF mimics more realistic application scenarios by using publicly available datasets as different data silos across image, text and structured data.
- Score: 41.90546696412147
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents and characterizes an Open Application Repository for
Federated Learning (OARF), a benchmark suite for federated machine learning
systems. Previously available benchmarks for federated learning have focused
mainly on synthetic datasets and cover only a limited number of applications.
OARF mimics more realistic application scenarios by using publicly available
datasets as different data silos across image, text and structured data. Our
characterization shows that the benchmark suite is diverse in data size, data
distribution, feature distribution and learning task complexity. We have
developed reference implementations and evaluated important aspects of
federated learning, including model accuracy, communication cost, throughput
and convergence time. These extensive evaluations highlight future research
opportunities for federated learning systems and reveal some interesting
findings, such as that federated learning can effectively increase end-to-end
throughput.
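The evaluated aspects (model accuracy, communication cost, convergence) can be illustrated with a minimal sketch. This is not the OARF reference implementation; the linear least-squares model, single local gradient step, and byte accounting are all illustrative assumptions:

```python
import numpy as np

def fedavg_round(global_w, client_data, lr=0.1):
    """One FedAvg-style round on a linear model.

    Returns the new global weights and the total bytes exchanged
    (one download and one upload of the weight vector per client).
    """
    updates, comm_bytes = [], 0
    for X, y in client_data:
        w = global_w.copy()                      # download global model
        grad = X.T @ (X @ w - y) / len(y)        # one local gradient step
        w -= lr * grad
        updates.append((w, len(y)))              # upload local model
        comm_bytes += 2 * global_w.nbytes        # download + upload
    total = sum(n for _, n in updates)
    new_w = sum(w * (n / total) for w, n in updates)  # weighted average
    return new_w, comm_bytes

# Hypothetical usage: three data silos with synthetic data.
rng = np.random.default_rng(0)
d = 4
clients = [(rng.normal(size=(20, d)), rng.normal(size=20)) for _ in range(3)]
w, cost = fedavg_round(np.zeros(d), clients)
```

Tracking `comm_bytes` per round against a target accuracy is one simple way to compare the communication-cost and convergence-time trade-offs the paper measures.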
Related papers
- FedLLM-Bench: Realistic Benchmarks for Federated Learning of Large Language Models [48.484485609995986]
Federated learning has enabled multiple parties to collaboratively train large language models without directly sharing their data (FedLLM).
However, there are currently no realistic datasets or benchmarks for FedLLM.
We propose FedLLM-Bench, which involves 8 training methods, 4 training datasets, and 6 evaluation metrics.
arXiv Detail & Related papers (2024-06-07T11:19:30Z)
- On the Cross-Dataset Generalization of Machine Learning for Network Intrusion Detection [50.38534263407915]
Network Intrusion Detection Systems (NIDS) are a fundamental tool in cybersecurity.
Their ability to generalize across diverse networks is a critical factor in their effectiveness and a prerequisite for real-world applications.
In this study, we conduct a comprehensive analysis on the generalization of machine-learning-based NIDS through an extensive experimentation in a cross-dataset framework.
arXiv Detail & Related papers (2024-02-15T14:39:58Z)
- Factor-Assisted Federated Learning for Personalized Optimization with Heterogeneous Data [6.024145412139383]
Federated learning is an emerging distributed machine learning framework aimed at protecting data privacy.
Data in different clients contain both common knowledge and personalized knowledge.
We develop a novel personalized federated learning framework for heterogeneous data, which we refer to as FedSplit.
arXiv Detail & Related papers (2023-12-07T13:05:47Z)
- Exploring Machine Learning Models for Federated Learning: A Review of Approaches, Performance, and Limitations [1.1060425537315088]
Federated learning is a distributed learning framework enhanced to preserve the privacy of individuals' data.
In times of crisis, when real-time decision-making is critical, federated learning allows multiple entities to work collectively without sharing sensitive data.
This paper is a systematic review of the literature on privacy-preserving machine learning in the last few years.
arXiv Detail & Related papers (2023-11-17T19:23:21Z)
- DataPerf: Benchmarks for Data-Centric AI Development [81.03754002516862]
DataPerf is a community-led benchmark suite for evaluating ML datasets and data-centric algorithms.
We provide an open, online platform with multiple rounds of challenges to support this iterative development.
The benchmarks, online evaluation platform, and baseline implementations are open source.
arXiv Detail & Related papers (2022-07-20T17:47:54Z)
- Fair and efficient contribution valuation for vertical federated learning [49.50442779626123]
Federated learning is a popular technology for training machine learning models on distributed data sources without sharing data.
The Shapley value (SV) is a provably fair contribution valuation metric originated from cooperative game theory.
We propose a contribution valuation metric called vertical federated Shapley value (VerFedSV) based on SV.
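The Shapley value underlying VerFedSV averages each participant's marginal contribution over all possible coalitions. A minimal exact-computation sketch follows; the additive toy utility and the `contrib` figures are hypothetical stand-ins, not the VerFedSV algorithm, whose utility would come from vertical federated training:

```python
from itertools import combinations
from math import factorial

def shapley_values(players, value):
    """Exact Shapley values: each player's marginal contribution,
    averaged over all coalitions with the classic |S|!(n-|S|-1)!/n! weight."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                s = set(coalition)
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi

# Toy utility: accuracy gain contributed by a coalition of data silos.
contrib = {"A": 0.10, "B": 0.25, "C": 0.05}
value = lambda s: sum(contrib[p] for p in s)  # additive example
print(shapley_values(["A", "B", "C"], value))
```

For an additive utility like this one, each player's Shapley value reduces to its own contribution, which makes the fairness property easy to sanity-check; the exact computation is exponential in the number of players, which is why practical metrics like VerFedSV rely on approximations.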
arXiv Detail & Related papers (2022-01-07T19:57:15Z)
- MLPerf™ HPC: A Holistic Benchmark Suite for Scientific Machine Learning on HPC Systems [32.621917787044396]
We introduce MLPerf™ HPC, a benchmark suite of scientific machine learning training applications driven by the MLCommons™ Association.
We develop a systematic framework for their joint analysis and compare them in terms of data staging, algorithmic convergence, and compute performance.
We conclude by characterizing each benchmark with respect to low-level memory, I/O, and network behavior.
arXiv Detail & Related papers (2021-10-21T20:30:12Z)
- FedScale: Benchmarking Model and System Performance of Federated Learning [4.1617240682257925]
FedScale is a set of challenging and realistic benchmark datasets for federated learning (FL) research.
FedScale is open-source with permissive licenses and actively maintained.
arXiv Detail & Related papers (2021-05-24T15:55:27Z)
- Learning summary features of time series for likelihood free inference [93.08098361687722]
We present a data-driven strategy for automatically learning summary features from time series data.
Our results indicate that learning summary features from data can compete with and even outperform LFI methods based on hand-crafted values.
arXiv Detail & Related papers (2020-12-04T19:21:37Z)
- Resource-Constrained Federated Learning with Heterogeneous Labels and Models [1.4824891788575418]
We propose a framework with simple α-weighted federated aggregation of scores, which leverages overlapping information gain across labels.
We also demonstrate the on-device capabilities of our proposed framework by experimenting with federated learning and inference across different iterations on a Raspberry Pi 2.
arXiv Detail & Related papers (2020-11-06T06:23:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.