Enriching the Machine Learning Workloads in BigBench
- URL: http://arxiv.org/abs/2406.10843v1
- Date: Sun, 16 Jun 2024 08:32:28 GMT
- Title: Enriching the Machine Learning Workloads in BigBench
- Authors: Matthias Polag, Todor Ivanov, Timo Eichhorn
- Abstract summary: This work enriches the improved BigBench V2 with three new workloads and expands the coverage of machine learning algorithms.
Our workloads utilize multiple algorithms and compare different implementations for the same algorithm across several popular libraries like MLlib, SystemML, Scikit-learn and Pandas.
- Score: 0.4178382980763478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In the era of Big Data and the growing support for Machine Learning, Deep Learning and Artificial Intelligence algorithms in the current software systems, there is an urgent need of standardized application benchmarks that stress test and evaluate these new technologies. Relying on the standardized BigBench (TPCx-BB) benchmark, this work enriches the improved BigBench V2 with three new workloads and expands the coverage of machine learning algorithms. Our workloads utilize multiple algorithms and compare different implementations for the same algorithm across several popular libraries like MLlib, SystemML, Scikit-learn and Pandas, demonstrating the relevance and usability of our benchmark extension.
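As a rough illustration of the "same algorithm, different implementations" comparison described in the abstract, the sketch below times a Scikit-learn k-means fit against a plain NumPy implementation on Pandas-held synthetic data. This is a minimal sketch under stated assumptions, not the benchmark's actual workload: the choice of k-means, the synthetic dataset, and the sizes are illustrative, since the abstract does not list the concrete algorithms or data used in the new workloads.

```python
# Illustrative sketch only; the real BigBench V2 extension workloads are not
# specified here. It compares two implementations of the same algorithm
# (k-means) and times each on identical data, mirroring the cross-library idea.
import time

import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

rng = np.random.default_rng(42)
df = pd.DataFrame(rng.normal(size=(100_000, 8)))  # synthetic stand-in data


def time_it(label, fn):
    """Run fn once and print how long it took."""
    start = time.perf_counter()
    result = fn()
    print(f"{label}: {time.perf_counter() - start:.2f}s")
    return result


def numpy_kmeans(X, k=5, iters=20, seed=0):
    """Naive Lloyd's algorithm using only NumPy, as a second implementation."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        # assign every point to its nearest center
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # recompute each center as the mean of its cluster (keep old center if empty)
        centers = np.stack([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
    return labels, centers


# Implementation 1: Scikit-learn
time_it("scikit-learn KMeans",
        lambda: KMeans(n_clusters=5, n_init=10, random_state=0).fit(df.values))

# Implementation 2: plain NumPy on the same Pandas-held data
time_it("NumPy k-means", lambda: numpy_kmeans(df.values))
```

A real workload would additionally report accuracy or clustering quality alongside runtime so the implementations can be compared on equal footing.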
Related papers
- Benchmarking Predictive Coding Networks -- Made Simple [48.652114040426625]
We tackle the problems of efficiency and scalability for predictive coding networks (PCNs) in machine learning.
We propose a library, called PCX, that focuses on performance and simplicity, and use it to implement a large set of standard benchmarks.
We perform extensive tests on such benchmarks using both existing algorithms for PCNs, as well as adaptations of other methods popular in the bio-plausible deep learning community.
arXiv Detail & Related papers (2024-07-01T10:33:44Z)
- Machine Learning Augmented Branch and Bound for Mixed Integer Linear Programming [11.293025183996832]
Mixed Integer Linear Programming (MILP) offers a powerful modeling language for a wide range of applications.
In recent years, there has been an explosive development in the use of machine learning algorithms for enhancing all main tasks involved in the branch-and-bound algorithm.
In particular, we give detailed attention to machine learning algorithms that automatically optimize some metric of branch-and-bound efficiency.
arXiv Detail & Related papers (2024-02-08T09:19:26Z)
- Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver.
We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror the multifaceted structures of real-world problems.
We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
- Using Machine Learning To Identify Software Weaknesses From Software Requirement Specifications [49.1574468325115]
This research focuses on finding an efficient machine learning algorithm to identify software weaknesses from requirement specifications.
Keywords extracted using latent semantic analysis help map the CWE categories to PROMISE_exp. Naive Bayes, support vector machine (SVM), decision trees, neural network, and convolutional neural network (CNN) algorithms were tested.
arXiv Detail & Related papers (2023-08-10T13:19:10Z)
- Tree-Based Adaptive Model Learning [62.997667081978825]
We extend the Kearns-Vazirani learning algorithm to handle systems that change over time.
We present a new learning algorithm that can reuse and update previously learned behavior, implement it in the LearnLib library, and evaluate it on large examples.
arXiv Detail & Related papers (2022-08-31T21:24:22Z)
- MQBench: Towards Reproducible and Deployable Model Quantization Benchmark [53.12623958951738]
MQBench is a first attempt to evaluate, analyze, and benchmark the reproducibility and deployability of model quantization algorithms.
We choose multiple platforms for real-world deployment, including CPU, GPU, ASIC, and DSP, and evaluate an extensive set of state-of-the-art quantization algorithms.
We conduct a comprehensive analysis and report both intuitive and counter-intuitive insights.
arXiv Detail & Related papers (2021-11-05T23:38:44Z)
- Benchmarking Processor Performance by Multi-Threaded Machine Learning Algorithms [0.0]
In this paper, I will make a performance comparison of multi-threaded machine learning algorithms.
I will be working on Linear Regression, Random Forest, and K-Nearest Neighbors to determine the performance characteristics of the algorithms (a minimal thread-scaling sketch in this spirit appears after this list).
arXiv Detail & Related papers (2021-09-11T13:26:58Z)
- Generative and reproducible benchmarks for comprehensive evaluation of machine learning classifiers [6.605210393590192]
DIverse and GENerative ML Benchmark (DIGEN) is a collection of synthetic datasets for benchmarking machine learning algorithms.
The resource with extensive documentation and analyses is open-source and available on GitHub.
arXiv Detail & Related papers (2021-07-14T03:58:02Z)
- A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions.
Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data.
Large-scale Machine Learning aims to learn patterns from big data efficiently while maintaining comparable performance.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
- Guidelines for enhancing data locality in selected machine learning algorithms [0.0]
We analyze one way to increase the performance of machine learning algorithms: exploiting data locality.
Repeated data access can be seen as redundancy in data movement.
This work also identifies some of the opportunities for avoiding these redundancies by directly reusing results.
arXiv Detail & Related papers (2020-01-09T14:16:40Z)
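As referenced in the multi-threaded benchmarking entry above, the following is a minimal thread-scaling sketch, not the cited paper's actual harness: it fits the same Scikit-learn random forest with different `n_jobs` settings and times each fit. The dataset, model size, and worker counts are arbitrary assumptions for illustration.

```python
# Minimal sketch of thread-scaling measurement with Scikit-learn; sizes and
# worker counts are illustrative assumptions, not the cited paper's setup.
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic classification data as a stand-in workload.
X, y = make_classification(n_samples=20_000, n_features=20, random_state=0)

for n_jobs in (1, 2, 4, 8):  # worker counts to compare
    clf = RandomForestClassifier(n_estimators=200, n_jobs=n_jobs, random_state=0)
    start = time.perf_counter()
    clf.fit(X, y)  # training is parallelized across n_jobs workers
    print(f"n_jobs={n_jobs}: fit took {time.perf_counter() - start:.2f}s")
```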