Generalization Analysis on Learning with a Concurrent Verifier
- URL: http://arxiv.org/abs/2210.05331v1
- Date: Tue, 11 Oct 2022 10:51:55 GMT
- Title: Generalization Analysis on Learning with a Concurrent Verifier
- Authors: Masaaki Nishino, Kengo Nakamura, Norihito Yasuda
- Abstract summary: We analyze how the learnability of a machine learning model changes with a CV.
We show that typical error bounds based on Rademacher complexity for a model with a CV will be no larger than those of the original model.
- Score: 16.298786827265673
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Machine learning technologies have been used in a wide range of practical
systems. In practical situations, it is natural to expect the input-output
pairs of a machine learning model to satisfy some requirements. However, it is
difficult to obtain a model that satisfies requirements by just learning from
examples. A simple solution is to add a module that checks whether the
input-output pairs meet the requirements and then modifies the model's outputs.
Such a module, which we call a {\em concurrent verifier} (CV), can certify the
model's outputs, although it is unclear how using a CV changes the
generalizability of the machine learning model. This paper gives a generalization analysis of
learning with a CV. We analyze how the learnability of a machine learning model
changes with a CV and show a condition under which we can obtain a guaranteed
hypothesis by using a verifier only at inference time. We also show that, in
multi-class classification and structured prediction settings, typical error
bounds based on Rademacher complexity for a model with a CV are no larger than
those for the original model.
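
Operationally, the CV idea is a wrapper: a checker tests each candidate output against the requirements, and the prediction falls back to the best-scoring output that passes. The following is a minimal sketch of that wrapper pattern for inference-time use only, not the paper's formal construction; the scorer and the requirement predicate are hypothetical stand-ins.

```python
# Minimal sketch of a concurrent verifier (CV) applied at inference time,
# assuming a score-based multi-class classifier. All names here are
# illustrative stand-ins, not the paper's formal construction.
from typing import Callable, Sequence


def cv_predict(
    x: Sequence[float],
    score: Callable[[Sequence[float], int], float],
    labels: Sequence[int],
    satisfies_requirements: Callable[[Sequence[float], int], bool],
) -> int:
    """Return the highest-scoring label whose (input, output) pair
    passes the verifier's requirement check."""
    # Rank candidate labels by model score, best first.
    ranked = sorted(labels, key=lambda y: score(x, y), reverse=True)
    for y in ranked:
        if satisfies_requirements(x, y):  # the verifier's check
            return y
    raise ValueError("no label satisfies the requirements for this input")


# Toy usage: a linear scorer over three labels, with a requirement
# forbidding label 0 whenever the first feature is negative.
if __name__ == "__main__":
    weights = {0: [1.0, 0.0], 1: [0.5, 0.5], 2: [0.0, 1.0]}
    score = lambda x, y: sum(w * xi for w, xi in zip(weights[y], x))
    ok = lambda x, y: not (y == 0 and x[0] < 0)
    print(cv_predict([-1.0, 2.0], score, [0, 1, 2], ok))  # prints 2
```

The paper's question is how such a check interacts with learnability and with Rademacher-complexity-based error bounds; the wrapper above corresponds to the inference-time-only use discussed in the abstract.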
Related papers
- Attribute-to-Delete: Machine Unlearning via Datamodel Matching [65.13151619119782]
Machine unlearning -- efficiently removing a small "forget set" of training data from a pre-trained machine learning model -- has recently attracted interest.
Recent research shows that machine unlearning techniques do not hold up in such a challenging setting.
arXiv Detail & Related papers (2024-10-30T17:20:10Z)
- Increasing Trust in Language Models through the Reuse of Verified Circuits [1.8434042562191815]
Language Models (LMs) are increasingly used for a wide range of prediction tasks, but their training can often neglect rare edge cases.
We show that a model can be trained to meet this standard if built using mathematically and logically specified frameworks.
We find extensive reuse of the addition circuits for both tasks, easing verification of the more complex subtractor model.
arXiv Detail & Related papers (2024-02-04T21:33:18Z)
- Is K-fold cross validation the best model selection method for Machine Learning? [0.0]
K-fold cross-validation is the most common approach to ascertaining the likelihood that a machine learning outcome is generated by chance.
A novel test based on K-fold CV and the Upper Bound of the actual error (K-fold CUBV) is proposed (the plain K-fold baseline is sketched below).
arXiv Detail & Related papers (2024-01-29T18:46:53Z)
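
For reference, the baseline the entry above starts from is plain K-fold cross-validation. Below is a minimal, dependency-free sketch of the K-fold split and the averaged held-out error; the fit and error callables are hypothetical stand-ins, and this is not the proposed K-fold CUBV test.

```python
# Minimal sketch of plain K-fold cross-validation (the baseline,
# not the proposed K-fold CUBV test). fit/error are hypothetical.
from typing import Callable, List, Sequence, Tuple


def k_fold_indices(n: int, k: int) -> List[Tuple[List[int], List[int]]]:
    """Split indices 0..n-1 into k (train, test) pairs."""
    folds = [list(range(i, n, k)) for i in range(k)]
    return [
        ([j for f in folds[:i] + folds[i + 1:] for j in f], folds[i])
        for i in range(k)
    ]


def cross_val_error(
    data: Sequence,
    k: int,
    fit: Callable[[Sequence], object],
    error: Callable[[object, Sequence], float],
) -> float:
    """Average held-out error over k folds."""
    errors = []
    for train_idx, test_idx in k_fold_indices(len(data), k):
        model = fit([data[i] for i in train_idx])
        errors.append(error(model, [data[i] for i in test_idx]))
    return sum(errors) / len(errors)
```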
- Generalization Properties of Retrieval-based Models [50.35325326050263]
Retrieval-based machine learning methods have enjoyed success on a wide range of problems.
Despite growing literature showcasing the promise of these models, the theoretical underpinning for such models remains underexplored.
We present a formal treatment of retrieval-based models to characterize their generalization ability.
arXiv Detail & Related papers (2022-10-06T00:33:01Z)
- Learning from aggregated data with a maximum entropy model [73.63512438583375]
We show how a new model, similar to logistic regression, may be learned from aggregated data alone by approximating the unobserved feature distribution with a maximum entropy hypothesis.
We present empirical evidence on several public datasets that the model learned this way can achieve performances comparable to those of a logistic model trained with the full unaggregated data.
arXiv Detail & Related papers (2022-10-05T09:17:27Z)
- QuantifyML: How Good is my Machine Learning Model? [0.0]
QuantifyML aims to quantify the extent to which machine learning models have learned and generalized from the given data.
The learned model is encoded as a logical formula, which is analyzed with off-the-shelf model counters to obtain precise counts of different model behaviors.
arXiv Detail & Related papers (2021-10-25T01:56:01Z)
- Task-agnostic Continual Learning with Hybrid Probabilistic Models [75.01205414507243]
We propose HCL, a Hybrid generative-discriminative approach to Continual Learning for classification.
A normalizing flow is used to learn the data distribution, perform classification, identify task changes, and avoid forgetting.
We demonstrate the strong performance of HCL on a range of continual learning benchmarks such as split-MNIST, split-CIFAR, and SVHN-MNIST.
arXiv Detail & Related papers (2021-06-24T05:19:26Z)
- A Note on High-Probability versus In-Expectation Guarantees of Generalization Bounds in Machine Learning [95.48744259567837]
Statistical machine learning theory often tries to give generalization guarantees of machine learning models.
Statements made about the performance of machine learning models have to take the sampling process into account.
We show how one may transform a statement of one kind into the other (the textbook step is sketched below).
arXiv Detail & Related papers (2020-10-06T09:41:35Z)
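
A textbook bridge between the two kinds of guarantee, and presumably the starting point for such a note, is Markov's inequality: an in-expectation bound on a nonnegative generalization gap becomes a high-probability bound at the cost of a $1/\delta$ factor.

```latex
% In-expectation to high-probability via Markov's inequality, assuming
% a nonnegative generalization gap G (illustrative, not the note's result).
\mathbb{E}[G] \le B
\quad\Longrightarrow\quad
\Pr\left[\, G \ge \frac{B}{\delta} \,\right] \le \delta
\qquad \text{for all } \delta \in (0, 1].
```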
- Deducing neighborhoods of classes from a fitted model [68.8204255655161]
In this article, a new kind of interpretable machine learning method is presented.
It can help to understand the partitioning of the feature space into predicted classes in a classification model using quantile shifts.
Real data points (or specific points of interest) are used, and the change in the prediction after slightly raising or lowering specific features is observed (a generic probing loop is sketched below).
arXiv Detail & Related papers (2020-09-11T16:35:53Z)
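
The quantile-shift idea above lends itself to a simple probing loop: nudge one feature of a real data point up or down and record whether the predicted class changes. The sketch below is a generic illustration of that probing pattern, not the article's exact procedure; the predictor is a hypothetical stand-in.

```python
# Generic sketch of probing a fitted classifier by perturbing one
# feature at a time, in the spirit of the quantile-shift method above.
# The predict callable is a hypothetical stand-in, not the article's model.
from typing import Callable, Dict, List, Tuple


def probe_feature_shifts(
    x: List[float],
    predict: Callable[[List[float]], int],
    eps: float = 0.1,
) -> Dict[int, Tuple[int, int]]:
    """For each feature i, return the predicted classes after
    raising and lowering x[i] by eps."""
    shifts = {}
    for i in range(len(x)):
        up, down = list(x), list(x)
        up[i] += eps
        down[i] -= eps
        shifts[i] = (predict(up), predict(down))
    return shifts


# Toy usage: a threshold classifier on the first feature.
if __name__ == "__main__":
    clf = lambda x: int(x[0] > 0.0)
    print(probe_feature_shifts([0.05, 1.0], clf, eps=0.1))
    # feature 0 flips the class when lowered: {0: (1, 0), 1: (1, 1)}
```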
- Query Training: Learning a Worse Model to Infer Better Marginals in Undirected Graphical Models with Hidden Variables [11.985433487639403]
Probabilistic graphical models (PGMs) provide a compact representation of knowledge that can be queried in a flexible way.
We introduce query training (QT), a mechanism to learn a PGM that is optimized for the approximate inference algorithm that will be paired with it.
We demonstrate experimentally that QT can be used to learn a challenging 8-connected grid Markov random field with hidden variables.
arXiv Detail & Related papers (2020-06-11T20:34:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.