Training with reduced precision of a support vector machine model for
text classification
- URL: http://arxiv.org/abs/2007.08657v1
- Date: Fri, 17 Jul 2020 11:59:30 GMT
- Title: Training with reduced precision of a support vector machine model for
text classification
- Authors: Dominik Żurek and Marcin Pietroń
- Abstract summary: This work focuses on comparing the efficiency of an SVM model trained with reduced precision against its original full-precision form.
The main advantage of quantization is a decrease in computation time and memory footprint on dedicated hardware platforms.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents the impact of quantization on the efficiency of
multi-class text classification in the training process of a support vector
machine (SVM). The work compares the efficiency of an SVM model trained with
reduced precision against its original full-precision form. The main advantage
of quantization is a decrease in computation time and memory footprint on
dedicated hardware platforms that support low-precision computation, such as
GPUs (16-bit) or FPGAs (any bit-width). The paper examines the impact of
reducing the precision of the SVM training process on text classification
accuracy. The CPU implementation was parallelized using the OpenMP library.
Additionally, results of the GPU implementation using double, single, and half
precision are presented.
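As a rough illustration of the comparison described above (not the authors' OpenMP/CUDA implementation), the sketch below trains a linear SVM on the same data cast to double, single, and half precision and reports accuracy and training time. Note that scikit-learn upcasts float16 internally, so the cast only emulates quantizing the input representation; true half-precision arithmetic needs hardware support such as GPU fp16.

```python
# A rough sketch, not the authors' code: compare an SVM trained on the same
# features stored in double, single, and half precision.
import time
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# stand-in for a TF-IDF text-classification matrix (hypothetical data)
X, y = make_classification(n_samples=5000, n_features=300, n_informative=50,
                           n_classes=4, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

for dtype in (np.float64, np.float32, np.float16):
    t0 = time.perf_counter()
    clf = LinearSVC(max_iter=5000).fit(X_tr.astype(dtype), y_tr)
    elapsed = time.perf_counter() - t0
    acc = clf.score(X_te.astype(dtype), y_te)
    print(f"{dtype.__name__:>8}: accuracy={acc:.4f}  train_time={elapsed:.2f}s")
```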
Related papers
- Scaling Probabilistic Circuits via Monarch Matrices [109.65822339230853]
Probabilistic Circuits (PCs) are tractable representations of probability distributions.
We propose a novel sparse and structured parameterization for the sum blocks in PCs.
arXiv Detail & Related papers (2025-06-14T07:39:15Z)
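A rough sketch of what a Monarch-style structured linear map looks like (my assumption of the general construction; the paper applies it to PC sum blocks specifically): a dense n x n matrix is replaced by two batches of small block matrices applied along interleaved axes.

```python
# A toy Monarch-style matvec (assumed, not the paper's code): cuts parameters
# from n^2 to roughly 2 * n * sqrt(n) via block-diagonal factors plus a permute.
import numpy as np

n, b = 64, 8                     # n = b * b, block size b = sqrt(n)
rng = np.random.default_rng(0)
L = rng.normal(size=(b, b, b))   # b blocks of shape (b, b)
R = rng.normal(size=(b, b, b))

def monarch_matvec(x):
    """y = M @ x with M implicitly factored into block-diagonal pieces."""
    z = x.reshape(b, b)                   # view input as a b x b grid
    z = np.einsum("kij,kj->ki", L, z)     # block multiply along one axis
    z = z.T                               # permute (interleave) coordinates
    z = np.einsum("kij,kj->ki", R, z)     # block multiply along the other axis
    return z.reshape(n)

y = monarch_matvec(rng.normal(size=n))
print(y.shape)  # (64,)
```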
- Automatic mixed precision for optimizing gained time with constrained loss mean-squared-error based on model partition to sequential sub-graphs [0.8999666725996975]
Mixed Precision (MP) mitigates the accuracy-efficiency tradeoff by varying numerical precision across network layers.
This study focuses on automatically selecting an optimal MP configuration within Post-Training Quantization (PTQ) for inference.
arXiv Detail & Related papers (2025-05-19T12:51:02Z)
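The selection loop at the heart of such a scheme can be sketched as follows (my assumption of the general mechanism, not the paper's partition-based algorithm): for each layer, take the cheapest candidate bit-width whose quantization MSE on calibration data stays under a budget.

```python
# A simplified sketch of MP configuration search under an MSE constraint
# (assumed, not the paper's algorithm).
import numpy as np

def quantize(x, bits):
    """Uniform symmetric quantization to the given bit-width."""
    scale = np.abs(x).max() / (2 ** (bits - 1) - 1)
    return np.round(x / scale).clip(-(2 ** (bits - 1)), 2 ** (bits - 1) - 1) * scale

def choose_precision(calib_acts, candidate_bits=(4, 8, 16), mse_budget=1e-3):
    config = {}
    for name, acts in calib_acts.items():
        for bits in sorted(candidate_bits):          # try cheapest first
            mse = np.mean((acts - quantize(acts, bits)) ** 2)
            if mse <= mse_budget:
                config[name] = bits
                break
        else:
            config[name] = max(candidate_bits)       # fall back to full precision
    return config

rng = np.random.default_rng(0)
calib = {"layer1": rng.normal(size=1000), "layer2": 10 * rng.normal(size=1000)}
print(choose_precision(calib))  # wider activations get more bits
```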
- Low-Bit Integerization of Vision Transformers using Operand Reordering for Efficient Hardware [0.7136205674624813]
We analyze the computation graph and propose an integerization process based on operation reordering.
This enables integerized matrix multiplication and linear modules by directly processing the quantized input.
Experimental results show that our low-bit inference reduces per-PE power consumption for linear layers and matrix multiplication.
arXiv Detail & Related papers (2025-04-11T16:09:54Z)
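The underlying principle can be sketched as follows (a minimal version, not the paper's reordering scheme): quantize both operands, accumulate the matmul in pure integer arithmetic, and defer the floating-point rescale to a single step at the end.

```python
# A minimal integerized matmul sketch (assumed, not the paper's method):
# int8 operands, int32 accumulation, one float rescale at the end.
import numpy as np

def quantize_int8(x):
    scale = np.abs(x).max() / 127.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

rng = np.random.default_rng(0)
A, B = rng.normal(size=(16, 32)), rng.normal(size=(32, 8))

qA, sA = quantize_int8(A)
qB, sB = quantize_int8(B)
acc = qA.astype(np.int32) @ qB.astype(np.int32)  # integer-only inner loop
C = acc * (sA * sB)                              # single float rescale

print(np.abs(C - A @ B).max())  # quantization error vs. float reference
```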
- Scalify: scale propagation for efficient low-precision LLM training [1.4999444543328293]
Low-precision formats such as float8 have been introduced in machine learning accelerator hardware to improve the computational efficiency of large language model training and inference.
We present Scalify, an end-to-end scale propagation paradigm for computational graphs.
arXiv Detail & Related papers (2024-07-24T15:26:01Z)
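A toy version of scale propagation (assumed; Scalify itself rewrites whole computational graphs) carries each tensor as a (values, scale) pair so the low-precision values stay in a representable range while the scale is tracked alongside.

```python
# A toy scaled-tensor sketch (assumed, not the Scalify API): keep float16
# values near unit range and carry the scale factor separately through ops.
import numpy as np

class ScaledTensor:
    def __init__(self, values, scale=1.0):
        self.values, self.scale = values, scale   # tensor ~= values * scale

    def matmul(self, other):
        v = self.values @ other.values
        out_scale = self.scale * other.scale      # scales combine analytically
        norm = np.abs(v).max() or 1.0             # renormalize the payload
        return ScaledTensor((v / norm).astype(np.float16), out_scale * norm)

rng = np.random.default_rng(0)
a = ScaledTensor(rng.normal(size=(8, 8)).astype(np.float16))
b = ScaledTensor(rng.normal(size=(8, 8)).astype(np.float16))
c = a.matmul(b)
print(c.values.dtype, c.scale)  # float16 payload, scale carried alongside
```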
- On-Chip Hardware-Aware Quantization for Mixed Precision Neural Networks [52.97107229149988]
We propose an On-Chip Hardware-Aware Quantization framework that performs hardware-aware mixed-precision quantization on deployed edge devices.
For efficiency metrics, we build an On-Chip Quantization Aware pipeline, which allows the quantization process to perceive the actual hardware efficiency of the quantization operator.
For accuracy metrics, we propose Mask-Guided Quantization Estimation technology to effectively estimate the accuracy impact of operators in the on-chip scenario.
arXiv Detail & Related papers (2023-09-05T04:39:34Z)
- DeepGEMM: Accelerated Ultra Low-Precision Inference on CPU Architectures using Lookup Tables [49.965024476651706]
DeepGEMM is a lookup-table-based approach for executing ultra low-precision convolutional neural networks on SIMD hardware.
Our implementation outperforms corresponding 8-bit integer kernels by up to 1.74x on x86 platforms.
arXiv Detail & Related papers (2023-04-18T15:13:10Z)
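The core trick can be sketched as follows (a toy version; the real kernels use SIMD shuffle instructions rather than array indexing): with 2-bit operands there are only 4 x 4 possible products, so multiplication becomes a table lookup.

```python
# A toy lookup-table dot product for 2-bit operands (assumed, not DeepGEMM's
# kernels): all 16 possible products are precomputed once.
import numpy as np

levels = np.array([-1.5, -0.5, 0.5, 1.5])        # assumed 2-bit codebook
lut = np.outer(levels, levels)                    # 4x4 table of all products

def lut_dot(a_codes, b_codes):
    """Dot product of two 2-bit-coded vectors via lookup + accumulate."""
    return lut[a_codes, b_codes].sum()

rng = np.random.default_rng(0)
a_codes = rng.integers(0, 4, size=256)
b_codes = rng.integers(0, 4, size=256)

ref = levels[a_codes] @ levels[b_codes]            # float reference
print(np.isclose(lut_dot(a_codes, b_codes), ref))  # True
```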
- Incremental Online Learning Algorithms Comparison for Gesture and Visual Smart Sensors [68.8204255655161]
This paper compares four state-of-the-art algorithms in two real applications: gesture recognition based on accelerometer data and image classification.
Our results confirm these systems' reliability and the feasibility of deploying them in tiny-memory MCUs.
arXiv Detail & Related papers (2022-09-01T17:05:20Z)
- LUT-GEMM: Quantized Matrix Multiplication based on LUTs for Efficient Inference in Large-Scale Generative Language Models [9.727062803700264]
We introduce LUT-GEMM, an efficient kernel for quantized matrix multiplication.
LUT-GEMM eliminates the resource-intensive dequantization process and reduces computational costs.
We show experimentally that, when applied to the OPT-175B model with 3-bit quantization, LUT-GEMM substantially reduces token generation latency.
arXiv Detail & Related papers (2022-06-20T03:48:17Z)
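One way to picture how a LUT can replace dequantization (my assumption of the general idea; the real kernel targets GPUs with per-group scales and multi-bit binary coding): precompute dot products of each activation sub-vector with all 2^mu sign patterns once, then every 1-bit-coded weight row reuses the tables with pure lookups and adds.

```python
# A toy LUT-based matvec over sign-quantized weights (assumed, not LUT-GEMM).
import numpy as np

mu = 8                                         # sub-vector / table index width
rng = np.random.default_rng(0)
x = rng.normal(size=64)                        # activations, 8 chunks of mu
W_signs = rng.choice([-1, 1], size=(16, 64))   # 1-bit weights (sign matrix)
alpha = rng.random(16)                         # per-row scale factors

# build one 2^mu-entry table per activation chunk
patterns = np.array([[(p >> i & 1) * 2 - 1 for i in range(mu)]
                     for p in range(2 ** mu)])
tables = [patterns @ x[c:c + mu] for c in range(0, len(x), mu)]

def lut_matvec(row):
    signs = W_signs[row]
    total = 0.0
    for t, c in zip(tables, range(0, len(x), mu)):
        idx = sum(((signs[c + i] + 1) // 2) << i for i in range(mu))
        total += t[idx]               # one lookup replaces mu multiply-adds
    return alpha[row] * total

ref = (W_signs * alpha[:, None]) @ x  # dequantized float reference
print(np.allclose([lut_matvec(r) for r in range(16)], ref))  # True
```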
- Large-Margin Representation Learning for Texture Classification [67.94823375350433]
This paper presents a novel approach combining convolutional layers (CLs) and large-margin metric learning for training supervised models on small datasets for texture classification.
Experimental results on texture and histopathologic image datasets show that the proposed approach achieves competitive accuracy with lower computational cost and faster convergence compared to equivalent CNNs.
arXiv Detail & Related papers (2022-06-17T04:07:45Z)
- FBGEMM: Enabling High-Performance Low-Precision Deep Learning Inference [1.1292678337479967]
FBGEMM is a kernel library for high-performance quantized inference on current-generation CPUs.
It achieves efficiency by fusing common quantization operations with a high-performance GEMM implementation and by shape- and size-specific kernel code generation at runtime.
The library has been deployed at Facebook, where it delivers greater than 2x performance gains with respect to the previous production baseline.
arXiv Detail & Related papers (2021-01-13T00:34:04Z)
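Schematically (in NumPy, so not actually fused the way FBGEMM's generated kernels are), fusing requantization into the GEMM means applying the rescale-shift-saturate epilogue as the accumulators are produced, instead of writing the int32 output to memory and requantizing it in a separate pass.

```python
# A schematic sketch of a GEMM with a requantization epilogue (assumed shapes
# and parameters, not the FBGEMM API).
import numpy as np

def gemm_with_fused_requant(qA, qB, scale, zero_point):
    acc = qA.astype(np.int32) @ qB.astype(np.int32)     # int32 accumulation
    # epilogue: rescale + shift + saturate back to int8 in one pass
    return np.clip(np.round(acc * scale) + zero_point, -128, 127).astype(np.int8)

rng = np.random.default_rng(0)
qA = rng.integers(-128, 128, size=(4, 64), dtype=np.int8)
qB = rng.integers(-128, 128, size=(64, 4), dtype=np.int8)
print(gemm_with_fused_requant(qA, qB, scale=1e-3, zero_point=0))
```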
- MC-BERT: Efficient Language Pre-Training via a Meta Controller [96.68140474547602]
Large-scale pre-training is computationally expensive.
ELECTRA, an early attempt to accelerate pre-training, trains a discriminative model that predicts whether each input token was replaced by a generator.
We propose a novel meta-learning framework, MC-BERT, to achieve better efficiency and effectiveness.
arXiv Detail & Related papers (2020-06-10T09:22:19Z)
- Accelerating Neural Network Inference by Overflow Aware Quantization [16.673051600608535]
The inherently heavy computation of deep neural networks prevents their widespread application.
We propose an overflow-aware quantization method by designing a trainable adaptive fixed-point representation.
With the proposed method, we are able to fully utilize the available computing power to minimize the quantization loss and obtain optimized inference performance.
arXiv Detail & Related papers (2020-05-27T11:56:22Z)
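A minimal sketch of the no-overflow constraint behind such a representation (the paper's version is trainable; the arithmetic below is my simplification): pick the fractional bit-width so that accumulating a dot product of known length can never overflow the integer accumulator.

```python
# A minimal overflow-aware fixed-point sketch (assumed, not the paper's method).
import numpy as np

def overflow_aware_frac_bits(x, acc_bits=32, acc_len=256):
    int_bits = max(0, int(np.ceil(np.log2(np.abs(x).max()))) + 1)  # incl. sign
    headroom = int(np.ceil(np.log2(acc_len)))                      # sum growth
    # a product uses 2*(int_bits + frac_bits) bits; summing acc_len of them
    # adds `headroom` more, and the total must fit in acc_bits
    return (acc_bits - headroom) // 2 - int_bits

def to_fixed(x, frac_bits):
    return np.round(x * 2.0 ** frac_bits).astype(np.int64)

rng = np.random.default_rng(0)
a, b = rng.normal(size=256), rng.normal(size=256)
fb = overflow_aware_frac_bits(np.concatenate([a, b]))
acc = to_fixed(a, fb) @ to_fixed(b, fb)      # integer accumulation, no overflow
print(acc / 2.0 ** (2 * fb), a @ b)          # close to the float result
```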
- On Coresets for Support Vector Machines [61.928187390362176]
A coreset is a small, representative subset of the original data points.
We show that our algorithm can be used to extend the applicability of any off-the-shelf SVM solver to streaming, distributed, and dynamic data settings.
arXiv Detail & Related papers (2020-02-15T23:25:12Z)
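An illustrative sketch of the coreset workflow with plain uniform subsampling (the paper constructs importance-sampled coresets with provable guarantees, which uniform sampling does not give): train an SVM on a small subset and compare it with training on the full data.

```python
# A toy coreset-style experiment (uniform subsampling stand-in, not the
# paper's construction).
import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=20000, n_features=50, random_state=0)

rng = np.random.default_rng(0)
idx = rng.choice(len(X), size=1000, replace=False)   # the "coreset"

full = LinearSVC(max_iter=5000).fit(X, y)
core = LinearSVC(max_iter=5000).fit(X[idx], y[idx])
print(full.score(X, y), core.score(X, y))  # coreset model should be close
```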
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.