On Error and Compression Rates for Prototype Rules
- URL: http://arxiv.org/abs/2206.08014v1
- Date: Thu, 16 Jun 2022 09:15:15 GMT
- Title: On Error and Compression Rates for Prototype Rules
- Authors: Omer Kerem and Roi Weiss
- Abstract summary: We study the interplay between error and compression in the non-parametric multiclass classification setting.
We first show that OptiNet achieves non-trivial compression rates while enjoying near minimax-optimal error rates.
We then proceed to study a novel general compression scheme for further compressing prototype rules.
- Score: 1.116812194101501
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We study the close interplay between error and compression in the
non-parametric multiclass classification setting in terms of prototype learning
rules. We focus in particular on a close variant of a recently proposed
compression-based learning rule termed OptiNet. Beyond its computational
merits, this rule has been recently shown to be universally consistent in any
metric instance space that admits a universally consistent rule -- the first
learning algorithm known to enjoy this property. However, its error and
compression rates have been left open. Here we derive such rates in the case
where instances reside in Euclidean space under commonly posed smoothness and
tail conditions on the data distribution. We first show that OptiNet achieves
non-trivial compression rates while enjoying near minimax-optimal error rates.
We then proceed to study a novel general compression scheme for further
compressing prototype rules that locally adapts to the noise level without
sacrificing accuracy. Applying it to OptiNet, we show that under a geometric
margin condition, further gain in the compression rate is achieved.
Experimental results comparing the performance of the various methods are
presented.
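For intuition, here is a minimal sketch of a net-based prototype rule in the spirit of OptiNet: greedily select a gamma-net of the sample, label each net point by the majority vote of the sample points in its Voronoi cell, and classify new points by their nearest prototype. The construction below is a simplified illustration under these assumptions, not the paper's exact algorithm or its adaptive compression scheme.

```python
import numpy as np

def build_gamma_net(X, gamma):
    """Greedy gamma-net: every sample ends up within gamma of some center.
    (Simplified; the paper's net construction may differ.)"""
    centers = []
    for i, x in enumerate(X):
        if not centers or np.min(np.linalg.norm(X[centers] - x, axis=1)) > gamma:
            centers.append(i)
    return np.array(centers)

def fit_prototype_rule(X, y, gamma):
    """Compress (X, y) into prototypes: net points labeled by the majority
    label of the sample points falling in their Voronoi cells."""
    centers = build_gamma_net(X, gamma)
    dists = np.linalg.norm(X[:, None, :] - X[centers][None, :, :], axis=2)
    cell = np.argmin(dists, axis=1)                      # Voronoi cell of each sample
    labels = np.array([np.bincount(y[cell == j]).argmax()
                       for j in range(len(centers))])
    return X[centers], labels

def predict(prototypes, proto_labels, Xq):
    """1-NN over the retained prototypes."""
    dists = np.linalg.norm(Xq[:, None, :] - prototypes[None, :, :], axis=2)
    return proto_labels[np.argmin(dists, axis=1)]

# Example: two Gaussian blobs compressed to a handful of prototypes.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (200, 2)), rng.normal(4, 1, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
protos, plabels = fit_prototype_rule(X, y, gamma=1.0)
print(len(protos), "prototypes retained out of", len(X), "samples")
```

The number of retained prototypes relative to the sample size is the compression rate the paper analyzes.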
Related papers
- Neural Normalized Compression Distance and the Disconnect Between Compression and Classification [42.98054061480037]
We develop a Neural NCD and compare LLMs to classic general-purpose algorithms like gzip.
We find that classification accuracy is not predictable by compression rate alone.
Our results imply that our intuition about what it means for a neural network to "compress" is not yet well understood.
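For context, the classic (non-neural) NCD uses an off-the-shelf compressor such as gzip; a minimal sketch is below. The paper's Neural NCD replaces the compressor's code lengths with those induced by a neural model, which is not reproduced here.

```python
import gzip

def C(b: bytes) -> int:
    """Compressed length in bytes under gzip (a stand-in for an ideal compressor)."""
    return len(gzip.compress(b))

def ncd(x: bytes, y: bytes) -> float:
    """Classic Normalized Compression Distance."""
    cx, cy, cxy = C(x), C(y), C(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)

# Smaller values suggest the two strings share compressible structure.
print(ncd(b"the quick brown fox" * 20, b"the quick brown fox jumps" * 20))
print(ncd(b"the quick brown fox" * 20, b"zyxwvutsrqponm" * 20))
```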
arXiv Detail & Related papers (2024-10-20T04:31:56Z) - Differential error feedback for communication-efficient decentralized learning [48.924131251745266]
We propose a new decentralized communication-efficient learning approach that blends differential quantization with error feedback.
We show that the resulting communication-efficient strategy is stable both in terms of mean-square error and average bit rate.
The results establish that, in the small step-size regime and with a finite number of bits, it is possible to attain the performance achievable in the absence of compression.
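As a point of reference, the standard error-feedback mechanism that such schemes build on can be sketched in a few lines: compress the gradient plus the accumulated residual, transmit the compressed part, and carry the residual forward. The quantizer and the single-agent view below are simplifying assumptions; the paper's differential quantization and decentralized network topology are not reproduced.

```python
import numpy as np

def uniform_quantize(v, num_bits=4):
    """Generic uniform scalar quantizer (a stand-in for the paper's quantizer)."""
    scale = np.max(np.abs(v)) + 1e-12
    levels = 2 ** (num_bits - 1)
    return np.round(v / scale * levels) / levels * scale

def error_feedback_step(grad, memory, num_bits=4):
    """One error-feedback step: quantize gradient + carried-over error,
    send the quantized vector, keep the residual for the next round."""
    corrected = grad + memory
    q = uniform_quantize(corrected, num_bits)
    return q, corrected - q          # (message to send, new memory)
```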
arXiv Detail & Related papers (2024-06-26T15:11:26Z) - Flattened one-bit stochastic gradient descent: compressed distributed optimization with controlled variance [55.01966743652196]
We propose a novel algorithm for distributed stochastic gradient descent (SGD) with compressed gradient communication in the parameter-server framework.
Our gradient compression technique, named flattened one-bit gradient descent (FO-SGD), relies on two simple algorithmic ideas.
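The abstract does not spell out the two algorithmic ideas, so the sketch below only illustrates the generic one-bit budget such methods target: send the coordinate signs plus a single scale. It is not FO-SGD itself.

```python
import numpy as np

def one_bit_compress(g):
    """Generic one-bit compressor: one sign per coordinate plus one float scale.
    Illustrates the bit budget only; FO-SGD's flattening and variance control
    are not reproduced here."""
    return np.mean(np.abs(g)), np.sign(g).astype(np.int8)

def one_bit_decompress(scale, signs):
    return scale * signs.astype(np.float64)

g = np.random.default_rng(0).normal(size=8)
print(one_bit_decompress(*one_bit_compress(g)))
```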
arXiv Detail & Related papers (2024-05-17T21:17:27Z) - Compression with Exact Error Distribution for Federated Learning [33.74795273515338]
We present and analyze different aggregation schemes based on layered quantizers achieving exact error distribution.
We provide different methods to leverage the proposed compression schemes to obtain compression-for-free in differential privacy applications.
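A standard building block behind quantizers with an exact, input-independent error law is subtractive dithering: with a shared uniform dither, the reconstruction error is exactly uniform regardless of the input. The one-dimensional sketch below illustrates that property only; the paper's layered quantizers and aggregation schemes are not reproduced, and the step size is an arbitrary choice.

```python
import numpy as np

def subtractive_dither_quantize(x, step, rng):
    """Subtractive dithered quantization in 1D: with a dither shared by encoder
    and decoder, the error is exactly Uniform(-step/2, step/2), independent of x."""
    u = rng.uniform(-step / 2, step / 2, size=x.shape)   # shared dither
    q = step * np.round((x + u) / step)                  # transmitted value
    return q - u                                         # receiver subtracts dither

rng = np.random.default_rng(0)
x = rng.normal(size=100000)
err = subtractive_dither_quantize(x, step=0.5, rng=rng) - x
print(err.min(), err.max())   # error is uniform on [-0.25, 0.25], independent of x
```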
arXiv Detail & Related papers (2023-10-31T17:48:22Z) - Optimal Compression of Unit Norm Vectors in the High Distortion Regime [30.6205706348233]
We investigate how to compress a unit-norm vector into the minimum number of bits while still allowing an acceptable level of distortion in recovery.
Our study considers both biased and unbiased compression methods and determines the optimal compression rates.
While the results are a mix of new and known, they are compiled in this paper for completeness.
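As a simple concrete case, one well-known unbiased scheme maps each coordinate of a unit-norm vector to a ternary symbol: sign(x_i) with probability |x_i|, else 0, so the expectation recovers x. This QSGD-style sketch only illustrates the biased-versus-unbiased distinction; it is not the optimal scheme derived in the paper.

```python
import numpy as np

def unbiased_ternary(x, rng):
    """Unbiased compressor for a unit-norm x: coordinate i is sign(x_i)
    with probability |x_i| and 0 otherwise, so E[output] = x."""
    assert abs(np.linalg.norm(x) - 1.0) < 1e-8   # requires |x_i| <= 1
    keep = rng.random(x.shape) < np.abs(x)
    return np.sign(x) * keep

rng = np.random.default_rng(1)
x = rng.normal(size=50)
x /= np.linalg.norm(x)
avg = np.mean([unbiased_ternary(x, rng) for _ in range(20000)], axis=0)
print(np.max(np.abs(avg - x)))   # close to 0: the compressor is unbiased
```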
arXiv Detail & Related papers (2023-07-16T04:23:57Z) - Unrolled Compressed Blind-Deconvolution [77.88847247301682]
Sparse multichannel blind deconvolution (S-MBD) arises frequently in many engineering applications such as radar/sonar/ultrasound imaging.
We propose a compression method that enables blind recovery from far fewer measurements relative to the full received signal in time.
arXiv Detail & Related papers (2022-09-28T15:16:58Z) - Lower Bounds and Nearly Optimal Algorithms in Distributed Learning with
Communication Compression [33.217552987061474]
Recent advances in distributed optimization and learning have shown that communication compression is one of the most effective means of reducing communication.
arXiv Detail & Related papers (2022-06-08T03:36:34Z) - Learning, compression, and leakage: Minimising classification error via
meta-universal compression principles [87.054014983402]
A promising group of compression techniques for learning scenarios is normalised maximum likelihood (NML) coding.
Here we consider a NML-based decision strategy for supervised classification problems, and show that it attains PAC learning when applied to a wide variety of models.
We show that the misclassification rate of our method is upper bounded by the maximal leakage, a recently proposed metric to quantify the potential of data leakage in privacy-sensitive scenarios.
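To make the NML idea concrete, here is the textbook definition instantiated for the Bernoulli model class: the maximized likelihood of the observed sequence, normalized over all sequences of the same length, turned into a code length in bits. This is only a toy illustration of the principle, not the paper's decision strategy.

```python
from math import comb, log2

def nml_code_length(x):
    """NML code length in bits of a binary sequence under the Bernoulli class:
    -log2 of the maximized likelihood divided by its normalizer."""
    n, k = len(x), sum(x)
    ml = (k / n) ** k * ((n - k) / n) ** (n - k)          # note 0**0 == 1 in Python
    norm = sum(comb(n, j) * (j / n) ** j * ((n - j) / n) ** (n - j)
               for j in range(n + 1))
    return -log2(ml / norm)

# A constant sequence gets a shorter code than a balanced one:
print(nml_code_length([1] * 8), nml_code_length([1, 0, 1, 0, 1, 0, 1, 0]))
```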
arXiv Detail & Related papers (2020-10-14T20:03:58Z) - Linear Convergent Decentralized Optimization with Compression [50.44269451541387]
Existing decentralized algorithms with compression mainly focus on compressing DGD-type algorithms.
Motivated by primal-dual algorithms, this paper proposes the first LinEAr convergent Decentralized algorithm with compression, LEAD.
arXiv Detail & Related papers (2020-07-01T04:35:00Z) - On Compression Principle and Bayesian Optimization for Neural Networks [0.0]
We propose a compression principle that states that an optimal predictive model is the one that minimizes the total compressed message length of all data and the model definition while guaranteeing decodability.
We show that dropout can be used for continuous dimensionality reduction that allows finding optimal network dimensions as required by the compression principle.
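As a toy rendering of the stated principle, a two-part description length adds the bits needed to transmit the model parameters to the bits needed to encode the data under that model. The Gaussian model and the fixed parameter precision below are illustrative assumptions, not the paper's actual criterion or its dropout-based dimension search.

```python
import numpy as np

def gaussian_nll_bits(data, mu, sigma):
    """Bits needed to encode the data under a N(mu, sigma^2) model
    (negative log-likelihood in bits, discretization constant ignored)."""
    nll_nats = 0.5 * np.sum(((data - mu) / sigma) ** 2 + np.log(2 * np.pi * sigma ** 2))
    return nll_nats / np.log(2)

def two_part_length(num_params, data_bits, bits_per_param=32):
    """Toy total description length: model bits + data bits."""
    return num_params * bits_per_param + data_bits

rng = np.random.default_rng(0)
data = rng.normal(3.0, 1.0, size=1000)
fitted = two_part_length(2, gaussian_nll_bits(data, data.mean(), data.std()))
default = two_part_length(0, gaussian_nll_bits(data, 0.0, 1.0))
print(fitted, default)   # paying 64 bits for two parameters saves far more on the data
```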
arXiv Detail & Related papers (2020-06-23T03:23:47Z) - On Biased Compression for Distributed Learning [55.89300593805943]
We show for the first time that biased compressors can lead to linear convergence rates both in the single node and distributed settings.
We propose several new biased compressors with promising theoretical guarantees and practical performance.
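For readers unfamiliar with the term, the canonical example of a biased compressor is top-k sparsification, sketched below; the new compressors proposed in the paper are not reproduced here.

```python
import numpy as np

def top_k(g, k):
    """Top-k sparsification: keep the k largest-magnitude coordinates.
    It is biased because its (deterministic) output generally differs from g."""
    out = np.zeros_like(g)
    idx = np.argsort(np.abs(g))[-k:]
    out[idx] = g[idx]
    return out

g = np.array([0.1, -2.0, 0.05, 1.5, -0.3])
print(top_k(g, 2))   # [0., -2., 0., 1.5, 0.]
```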
arXiv Detail & Related papers (2020-02-27T19:52:24Z)
This list is automatically generated from the titles and abstracts of the papers on this site.