Related papers: Hard-ODT: Hardware-Friendly Online Decision Tree Learning Algorithm and System

URL: http://arxiv.org/abs/2012.06272v1
Date: Fri, 11 Dec 2020 12:06:44 GMT
Title: Hard-ODT: Hardware-Friendly Online Decision Tree Learning Algorithm and System
Authors: Zhe Lin, Sharad Sinha, Wei Zhang
Abstract summary: In the era of big data, traditional decision tree induction algorithms are not suitable for learning large-scale datasets. We introduce a new quantile-based algorithm to improve the induction of the Hoeffding tree, one of the state-of-the-art online learning models. We present Hard-ODT, a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques.
Score: 17.55491405857204
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Decision trees are machine learning models commonly used in various application scenarios. In the era of big data, traditional decision tree induction algorithms are not suitable for learning large-scale datasets due to their stringent data storage requirement. Online decision tree learning algorithms have been devised to tackle this problem by concurrently training with incoming samples and providing inference results. However, even the most up-to-date online tree learning algorithms still suffer from either high memory usage or high computational intensity with dependency and long latency, making them challenging to implement in hardware. To overcome these difficulties, we introduce a new quantile-based algorithm to improve the induction of the Hoeffding tree, one of the state-of-the-art online learning models. The proposed algorithm is light-weight in terms of both memory and computational demand, while still maintaining high generalization ability. A series of optimization techniques dedicated to the proposed algorithm have been investigated from the hardware perspective, including coarse-grained and fine-grained parallelism, dynamic and memory-based resource sharing, pipelining with data forwarding. Following this, we present Hard-ODT, a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array (FPGA) with system-level optimization techniques. Performance and resource utilization are modeled for the complete learning system for early and fast analysis of the trade-off between various design metrics. Finally, we propose a design flow in which the proposed learning system is applied to FPGA run-time power monitoring as a case study.

Related papers

A Comparative Study of OpenMP Scheduling Algorithm Selection Strategies [4.068270792140994]
We propose and evaluate learning-based approaches for selecting scheduling algorithms in OpenMP. Our results show that RL methods are capable of learning high-performing scheduling decisions. The approach can also be extended to MPI-based programs, enabling optimization of scheduling decisions across multiple levels of parallelism.
arXiv Detail & Related papers (2025-07-27T15:10:30Z)
Efficient Implementation of LinearUCB through Algorithmic Improvements and Vector Computing Acceleration for Embedded Learning Systems [0.10470286407954035]
This paper presents algorithmic and hardware techniques to implement two LinearUCB Contextual Bandits algorithms on resource-constrained embedded devices. Results show notable improvements in execution time and energy consumption.
arXiv Detail & Related papers (2025-01-22T13:39:44Z)
Deep Symbolic Optimization for Combinatorial Optimization: Accelerating Node Selection by Discovering Potential Heuristics [10.22111332588471]
We propose a novel deep symbolic optimization learning framework that combines their advantages. Dso4NS guides the search for mathematical expressions within the high-dimensional discrete symbolic space and then incorporates the highest-performing mathematical expressions into a solver. Experiments demonstrate the effectiveness of Dso4NS in learning high-quality expressions, outperforming existing approaches on a CPU machine.
arXiv Detail & Related papers (2024-06-14T06:02:14Z)
Edge Intelligence Optimization for Large Language Model Inference with Batching and Quantization [20.631476379056892]
Large Language Models (LLMs) are at the forefront of this movement. LLMs require cloud hosting, which raises issues regarding privacy, latency, and usage limitations. We present an edge intelligence optimization problem tailored for LLM inference.
arXiv Detail & Related papers (2024-05-12T02:38:58Z)
Performance and Energy Consumption of Parallel Machine Learning Algorithms [0.0]
Machine learning models have achieved remarkable success in various real-world applications. Model training in machine learning requires large-scale data sets and multiple iterations before it can work properly. Parallelization of training algorithms is a common strategy to speed up the process of training.
arXiv Detail & Related papers (2023-05-01T13:04:39Z)
Biologically Plausible Learning on Neuromorphic Hardware Architectures [27.138481022472]
Neuromorphic computing is an emerging paradigm that confronts this imbalance by performing computations directly in analog memories. This work is the first to compare the impact of different learning algorithms on Compute-In-Memory-based hardware and vice versa.
arXiv Detail & Related papers (2022-12-29T15:10:59Z)
Simple Stochastic and Online Gradient Descent Algorithms for Pairwise Learning [65.54757265434465]
Pairwise learning refers to learning tasks where the loss function depends on a pair of instances. Online gradient descent (OGD) is a popular approach to handle streaming data in pairwise learning. In this paper, we propose simple stochastic and online gradient descent methods for pairwise learning.
arXiv Detail & Related papers (2021-11-23T18:10:48Z)
Ranking Cost: Building An Efficient and Scalable Circuit Routing Planner with Evolution-Based Optimization [49.207538634692916]
We propose a new algorithm for circuit routing, named Ranking Cost, to form an efficient and trainable router. In our method, we introduce a new set of variables called cost maps, which can help the A* router to find out proper paths. Our algorithm is trained in an end-to-end manner and does not use any artificial data or human demonstration.
arXiv Detail & Related papers (2021-10-08T07:22:45Z)
Towards Optimally Efficient Tree Search with Deep Learning [76.64632985696237]
This paper investigates the classical integer least-squares problem, which estimates integer signals from linear models. The problem is NP-hard and often arises in diverse applications such as signal processing, bioinformatics, communications and machine learning. We propose a general hyper-accelerated tree search (HATS) algorithm by employing a deep neural network to estimate the optimal estimation for the underlying simplified memory-bounded A* algorithm.
arXiv Detail & Related papers (2021-01-07T08:00:02Z)
Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
Towards Efficient and Scalable Acceleration of Online Decision Tree Learning on FPGA [20.487660974785943]
In the era of big data, traditional decision tree induction algorithms are not suitable for learning large-scale datasets. We introduce a new quantile-based algorithm to improve the induction of the Hoeffding tree, one of the state-of-the-art online learning models. We present a high-performance, hardware-efficient and scalable online decision tree learning system on a field-programmable gate array.
arXiv Detail & Related papers (2020-09-03T03:23:43Z)
A Survey on Large-scale Machine Learning [67.6997613600942]
Machine learning can provide deep insights into data, allowing machines to make high-quality predictions. Most sophisticated machine learning approaches suffer from huge time costs when operating on large-scale data. Large-scale Machine Learning aims to learn patterns from big data with comparable performance efficiently.
arXiv Detail & Related papers (2020-08-10T06:07:52Z)
Spiking Neural Networks Hardware Implementations and Challenges: a Survey [53.429871539789445]
Spiking Neural Networks are cognitive algorithms mimicking neuron and synapse operational principles. We present the state of the art of hardware implementations of spiking neural networks. We discuss the strategies employed to leverage the characteristics of these event-driven algorithms at the hardware level.
arXiv Detail & Related papers (2020-05-04T13:24:00Z)

This list is automatically generated from the titles and abstracts of the papers on this site.