Boosted Trees on a Diet: Compact Models for Resource-Constrained Devices
- URL: http://arxiv.org/abs/2510.26557v1
- Date: Thu, 30 Oct 2025 14:47:57 GMT
- Title: Boosted Trees on a Diet: Compact Models for Resource-Constrained Devices
- Authors: Jan Stenkamp, Nina Herrmann, Benjamin Karic, Stefan Oehmcke, Fabian Gieseke
- Abstract summary: We present a compression scheme for boosted decision trees, addressing the growing need for lightweight machine learning models. We show that our models achieve the same performance as LightGBM models at a compression ratio of 4-16x. This capability opens the door to a wide range of IoT applications, including remote monitoring, edge analytics, and real-time decision making in isolated or power-limited environments.
- Score: 1.2483467287071346
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deploying machine learning models on compute-constrained devices has become a key building block of modern IoT applications. In this work, we present a compression scheme for boosted decision trees, addressing the growing need for lightweight machine learning models. Specifically, we provide techniques for training compact boosted decision tree ensembles that exhibit a reduced memory footprint by rewarding, among other things, the reuse of features and thresholds during training. Our experimental evaluation shows that, using an adapted training process and an alternative memory layout, our models achieve the same performance as LightGBM models at a compression ratio of 4-16x. Once deployed, the corresponding IoT devices can operate independently of constant communication or external energy supply, and thus autonomously, requiring only minimal computing power and energy. This capability opens the door to a wide range of IoT applications, including remote monitoring, edge analytics, and real-time decision making in isolated or power-limited environments.
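The abstract's key idea is that reusing features and thresholds across splits shrinks the ensemble, since repeated splits can be stored once in a shared table and referenced by small indices. The following is a minimal Python sketch of that storage idea only, under an illustrative flat-array tree format of our own; it is not the paper's actual training scheme or memory layout, and the bit widths are hypothetical.

```python
import numpy as np  # not strictly needed here, but typical for tree-ensemble code

# Hypothetical flat representation of a boosted-tree ensemble: each tree is a
# dict with parallel arrays `feature` and `threshold` for its internal nodes.
# This is an illustrative format, not LightGBM's actual model dump layout.

def build_shared_threshold_table(trees):
    """Collect the distinct (feature, threshold) pairs used across all trees.

    Deduplicating splits into one shared table is one way the reuse rewarded
    during training can pay off at storage time.
    """
    pairs = sorted({(int(f), float(t))
                    for tree in trees
                    for f, t in zip(tree["feature"], tree["threshold"])})
    index = {p: i for i, p in enumerate(pairs)}
    return pairs, index

def compression_ratio(trees, pairs, bits_threshold=32, bits_index=8):
    """Naive size estimate: per-node storage of an 8-bit index into the shared
    table, versus a full 32-bit threshold stored in every node."""
    n_nodes = sum(len(t["feature"]) for t in trees)
    dense = n_nodes * bits_threshold
    shared = n_nodes * bits_index + len(pairs) * bits_threshold
    return dense / shared

# Two toy trees that reuse the same (feature, threshold) splits.
trees = [
    {"feature": [0, 1, 0], "threshold": [0.5, 1.5, 0.5]},
    {"feature": [0, 1], "threshold": [0.5, 1.5]},
]
pairs, index = build_shared_threshold_table(trees)
print(len(pairs))                              # 2 distinct splits across 5 nodes
print(round(compression_ratio(trees, pairs), 2))  # 1.54 for this toy ensemble
```

The more often the training procedure picks an already-used split, the smaller the shared table grows relative to the node count, which is where the 4-16x ratios reported above would come from in practice.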
Related papers
- SnapGen++: Unleashing Diffusion Transformers for Efficient High-Fidelity Image Generation on Edge Devices [72.0937240883345]
Recent advances in diffusion transformers (DiTs) have set new standards in image generation, yet remain impractical for on-device deployment. We present an efficient DiT framework tailored for mobile and edge devices that achieves transformer-level generation quality under strict resource constraints.
arXiv Detail & Related papers (2026-01-13T07:46:46Z) - XBTorch: A Unified Framework for Modeling and Co-Design of Crossbar-Based Deep Learning Accelerators [0.5834731599084116]
This paper introduces XBTorch, a novel simulation framework that integrates seamlessly with PyTorch. XBTorch provides specialized tools for accurately and efficiently modeling crossbar-based systems based on emerging memory technologies.
arXiv Detail & Related papers (2026-01-11T22:35:30Z) - Split Knowledge Distillation for Large Models in IoT: Architecture, Challenges, and Solutions [16.25411682771788]
Large models (LMs) have immense potential in Internet of Things (IoT) systems, enabling applications such as intelligent voice assistants, predictive maintenance, and healthcare monitoring. Training LMs on edge servers raises data privacy concerns, while deploying them directly on IoT devices is constrained by limited computational and memory resources. We propose a split knowledge distillation framework to efficiently distill LMs into smaller, deployable versions for IoT devices while ensuring raw data remains local.
arXiv Detail & Related papers (2024-12-17T02:31:31Z) - E-QUARTIC: Energy Efficient Edge Ensemble of Convolutional Neural Networks for Resource-Optimized Learning [9.957458251671486]
Ensembling models like Convolutional Neural Networks (CNNs) results in high memory and computing overhead, preventing their deployment in embedded systems.
We propose E-QUARTIC, a novel Energy Efficient Edge Ensembling framework to build ensembles of CNNs targeting Artificial Intelligence (AI)-based embedded systems.
arXiv Detail & Related papers (2024-09-12T19:30:22Z) - Training Neural Networks from Scratch with Parallel Low-Rank Adapters [46.764982726136054]
We introduce LoRA-the-Explorer (LTE), a novel bi-level optimization algorithm designed to enable parallel training of multiple low-rank heads across computing nodes.
Our approach includes extensive experimentation on vision transformers using various vision datasets, demonstrating that LTE is competitive with standard pre-training.
arXiv Detail & Related papers (2024-02-26T18:55:13Z) - Heterogeneous Decentralized Machine Unlearning with Seed Model Distillation [47.42071293545731]
Information security legislation has endowed users with the unconditional right to be forgotten by trained machine learning models.
We design a decentralized unlearning framework called HDUS, which uses distilled seed models to construct erasable ensembles for all clients.
arXiv Detail & Related papers (2023-08-25T09:42:54Z) - EVE: Environmental Adaptive Neural Network Models for Low-power Energy Harvesting System [8.16411986220709]
Energy harvesting technology, which harvests energy from the ambient environment, is a promising alternative to batteries for powering such devices.
This paper proposes EVE, an automated machine learning framework to search for desired multi-models with shared weights for energy harvesting IoT devices.
Experimental results show that the neural network models generated by EVE are on average 2.5x faster than baseline models without pruning and shared weights.
arXiv Detail & Related papers (2022-07-14T20:53:46Z) - Real-time Neural-MPC: Deep Learning Model Predictive Control for Quadrotors and Agile Robotic Platforms [59.03426963238452]
We present Real-time Neural MPC, a framework to efficiently integrate large, complex neural network architectures as dynamics models within a model-predictive control pipeline.
We show the feasibility of our framework on real-world problems by reducing the positional tracking error by up to 82% when compared to state-of-the-art MPC approaches without neural network dynamics.
arXiv Detail & Related papers (2022-03-15T09:38:15Z) - Computational Intelligence and Deep Learning for Next-Generation Edge-Enabled Industrial IoT [51.68933585002123]
We investigate how to deploy computational intelligence and deep learning (DL) in edge-enabled industrial IoT networks.
In this paper, we propose a novel multi-exit-based federated edge learning (ME-FEEL) framework.
In particular, the proposed ME-FEEL can achieve an accuracy gain of up to 32.7% in industrial IoT networks with severely limited resources.
arXiv Detail & Related papers (2021-10-28T08:14:57Z) - Differentiable Network Pruning for Microcontrollers [14.864940447206871]
We present a differentiable structured network pruning method for convolutional neural networks.
It integrates a model's MCU-specific resource usage and parameter importance feedback to obtain highly compressed yet accurate classification models.
arXiv Detail & Related papers (2021-10-15T20:26:15Z) - LCS: Learning Compressible Subspaces for Adaptive Network Compression at Inference Time [57.52251547365967]
We propose a method for training a "compressible subspace" of neural networks that contains a fine-grained spectrum of models.
We present results for achieving arbitrarily fine-grained accuracy-efficiency trade-offs at inference time for structured and unstructured sparsity.
Our algorithm extends to quantization at variable bit widths, achieving accuracy on par with individually trained networks.
arXiv Detail & Related papers (2021-10-08T17:03:34Z) - Model of the Weak Reset Process in HfOx Resistive Memory for Deep Learning Frameworks [0.6745502291821955]
We present a model of the weak RESET process in hafnium oxide RRAM.
We integrate this model within the PyTorch deep learning framework.
We use this tool to train Binarized Neural Networks for the MNIST handwritten digit recognition task.
arXiv Detail & Related papers (2021-07-02T08:50:35Z) - Learning Discrete Energy-based Models via Auxiliary-variable Local Exploration [130.89746032163106]
We propose ALOE, a new algorithm for learning conditional and unconditional EBMs for discrete structured data.
We show that the energy function and sampler can be trained efficiently via a new variational form of power iteration.
We present an energy model guided fuzzer for software testing that achieves comparable performance to well engineered fuzzing engines like libfuzzer.
arXiv Detail & Related papers (2020-11-10T19:31:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.