Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
- URL: http://arxiv.org/abs/2411.14860v1
- Date: Fri, 22 Nov 2024 11:18:20 GMT
- Title: Ex Uno Pluria: Insights on Ensembling in Low Precision Number Systems
- Authors: Giung Nam, Juho Lee
- Abstract summary: Ensembling deep neural networks has shown promise in improving generalization performance.
We propose low precision ensembling, where ensemble members are derived from a single model within low precision number systems.
Our empirical analysis demonstrates the effectiveness of our proposed low precision ensembling method compared to existing ensemble approaches.
- Abstract: While ensembling deep neural networks has shown promise in improving generalization performance, scaling current ensemble methods for large models remains challenging. Given that recent progress in deep learning is largely driven by scale, exemplified by the widespread adoption of large-scale neural network architectures, scalability emerges as an increasingly critical issue for machine learning algorithms in the era of large-scale models. In this work, we first showcase the potential of low precision ensembling, where ensemble members are derived from a single model within low precision number systems in a training-free manner. Our empirical analysis demonstrates the effectiveness of our proposed low precision ensembling method compared to existing ensemble approaches.
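The abstract does not say how ensemble members are derived from the single model. As a rough illustration only, the sketch below assumes one plausible mechanism: stochastically rounding the same full-precision weights onto a coarse uniform grid several times, then averaging the members' predicted probabilities. The grid spacing `step`, the toy linear classifier, and the names `stochastic_round` and `low_precision_ensemble` are hypothetical and are not taken from the paper.

```python
import numpy as np

def stochastic_round(w, step, rng):
    """Round each entry of w to a multiple of `step` (assumed uniform grid).

    An entry rounds up with probability equal to its fractional remainder,
    so the rounded value equals the original value in expectation.
    """
    scaled = w / step
    floor = np.floor(scaled)
    round_up = rng.random(w.shape) < (scaled - floor)
    return (floor + round_up) * step

def softmax(z):
    # Subtract the row maximum for numerical stability.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def low_precision_ensemble(w, x, num_members, step, seed=0):
    """Average the predictions of several low precision copies of one model.

    No training is involved: every member is an independent stochastic
    rounding of the same weight matrix w.
    """
    rng = np.random.default_rng(seed)
    member_probs = [
        softmax(x @ stochastic_round(w, step, rng)) for _ in range(num_members)
    ]
    return np.mean(member_probs, axis=0)

# Toy usage: a random linear classifier with 8 features and 3 classes.
data_rng = np.random.default_rng(42)
w = data_rng.normal(size=(8, 3))
x = data_rng.normal(size=(5, 8))
ensemble_probs = low_precision_ensemble(w, x, num_members=4, step=0.25)
```

Because each member uses a different random rounding, the members disagree slightly, and that disagreement is what the average exploits. A real setup would apply the same idea to the weights of a pretrained network in an actual low precision format rather than to a toy matrix.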
Related papers
- Fast training of large kernel models with delayed projections [14.459817519150997]
We present a new methodology for building kernel machines that can scale efficiently with both data size and model size.
Our algorithm introduces delayed projections to Preconditioned Stochastic Gradient Descent (PSGD), allowing the training of much larger models than was previously feasible.
We validate our algorithm, EigenPro4, demonstrating drastic training speed up over the existing methods while maintaining comparable or better classification accuracy.
arXiv Detail & Related papers (2024-11-25T18:42:13Z) - Towards Scalable and Versatile Weight Space Learning [51.78426981947659]
This paper introduces the SANE approach to weight-space learning.
Our method extends the idea of hyper-representations towards sequential processing of subsets of neural network weights.
arXiv Detail & Related papers (2024-06-14T13:12:07Z) - LoRA-Ensemble: Efficient Uncertainty Modelling for Self-attention Networks [52.46420522934253]
We introduce LoRA-Ensemble, a parameter-efficient deep ensemble method for self-attention networks.
By employing a single pre-trained self-attention network with weights shared across all members, we train member-specific low-rank matrices for the attention projections.
Our method exhibits superior calibration compared to explicit ensembles and achieves similar or better accuracy across various prediction tasks and datasets.
arXiv Detail & Related papers (2024-05-23T11:10:32Z) - Accurate Neural Network Pruning Requires Rethinking Sparse Optimization [87.90654868505518]
We show the impact of high sparsity on model training using the standard computer vision and natural language processing sparsity benchmarks.
We provide new approaches for mitigating this issue for both sparse pre-training of vision models and sparse fine-tuning of language models.
arXiv Detail & Related papers (2023-08-03T21:49:14Z) - On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances.
We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, AttSolver, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z) - Precision Machine Learning [5.15188009671301]
We compare various function approximation methods and study how they scale with increasing parameters and data.
We find that neural networks can often outperform classical approximation methods on high-dimensional examples.
We develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.
arXiv Detail & Related papers (2022-10-24T17:58:30Z) - Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z) - Firearm Detection via Convolutional Neural Networks: Comparing a Semantic Segmentation Model Against End-to-End Solutions [68.8204255655161]
Threat detection of weapons and aggressive behavior from live video can be used for rapid detection and prevention of potentially deadly incidents.
One way of achieving this is through the use of artificial intelligence and, in particular, machine learning for image analysis.
We compare a traditional monolithic end-to-end deep learning model and a previously proposed model based on an ensemble of simpler neural networks detecting fire-weapons via semantic segmentation.
arXiv Detail & Related papers (2020-12-17T15:19:29Z) - The Self-Simplifying Machine: Exploiting the Structure of Piecewise Linear Neural Networks to Create Interpretable Models [0.0]
We introduce novel methodology toward simplification and increased interpretability of Piecewise Linear Neural Networks for classification tasks.
Our methods include the use of a trained, deep network to produce a well-performing, single-hidden-layer network without further training.
On these methods, we conduct preliminary studies of model performance, as well as a case study on Wells Fargo's Home Lending dataset.
arXiv Detail & Related papers (2020-12-02T16:02:14Z) - Neural-iLQR: A Learning-Aided Shooting Method for Trajectory Optimization [17.25824905485415]
We present Neural-iLQR, a learning-aided shooting method over the unconstrained control space.
It is shown to outperform the conventional iLQR significantly in the presence of inaccuracies in system models.
arXiv Detail & Related papers (2020-11-21T07:17:28Z)