On Inductive Biases for Machine Learning in Data Constrained Settings
- URL: http://arxiv.org/abs/2302.10692v1
- Date: Tue, 21 Feb 2023 14:22:01 GMT
- Title: On Inductive Biases for Machine Learning in Data Constrained Settings
- Authors: Grégoire Mialon
- Abstract summary: This thesis explores a different answer to the problem of learning expressive models in data constrained settings.
Instead of relying on big datasets to learn neural networks, we replace some modules with known functions reflecting the structure of the data.
Our approach falls under the umbrella of "inductive biases", which can be defined as hypotheses about the data at hand that restrict the space of models to explore.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning with limited data is one of the biggest problems of machine
learning. Current approaches to this issue consist of learning general
representations from huge amounts of data before fine-tuning the model on a
small dataset of interest. While this technique, known as transfer learning, is
very effective in domains such as computer vision or natural language
processing, it does not yet solve common problems of deep learning such as
model interpretability or the overall need for data. This thesis explores a
different answer to the problem of learning expressive models in data
constrained settings: instead of relying on big datasets to learn neural
networks, we replace some modules with known functions reflecting the
structure of the data. Very often, these functions are drawn from the rich
literature of kernel methods. Indeed, many kernels can reflect the underlying
structure of the data, thus reducing the number of parameters that must be
learned. Our approach falls under the umbrella of "inductive biases", which can
be defined as hypotheses about the data at hand that restrict the space of
models to explore
during learning. We demonstrate the effectiveness of this approach in the
context of sequences, such as sentences in natural language or protein
sequences, and graphs, such as molecules. We also highlight the relationship
between our work and recent advances in deep learning. Additionally, we study
convex machine learning models. Here, rather than proposing new models, we
ask what proportion of the samples in a dataset is really needed to learn a
"good" model. More precisely, we study the problem of safe sample screening,
i.e., executing simple tests to discard uninformative samples from a dataset
even before fitting a machine learning model, without affecting the optimal
model. Such techniques can be used to prune datasets or mine for rare samples.
Related papers
- An Information Theoretic Approach to Machine Unlearning [45.600917449314444]
A key challenge in unlearning is forgetting the necessary data in a timely manner while preserving model performance.
In this work, we address the zero-shot unlearning scenario, whereby an unlearning algorithm must be able to remove data given only a trained model and the data to be forgotten.
We derive a simple but principled zero-shot unlearning method based on the geometry of the model.
arXiv Detail & Related papers (2024-02-02T13:33:30Z) - Learn to Unlearn for Deep Neural Networks: Minimizing Unlearning
Interference with Gradient Projection [56.292071534857946]
Recent data-privacy laws have sparked interest in machine unlearning.
The challenge is to discard information about the "forget" data without altering knowledge about the remaining dataset.
We adopt a projected-gradient-based learning method, named Projected-Gradient Unlearning (PGU); a minimal sketch of the gradient-projection idea appears after this list.
We provide empirical evidence that our unlearning method can produce models that behave similarly to models retrained from scratch across various metrics, even when the training dataset is no longer accessible.
arXiv Detail & Related papers (2023-12-07T07:17:24Z) - Learning Defect Prediction from Unrealistic Data [57.53586547895278]
Pretrained models of code have become popular choices for code understanding and generation tasks.
Such models tend to be large and require commensurate volumes of training data.
It has become popular to train models with far larger but less realistic datasets, such as functions with artificially injected bugs.
Models trained on such data tend to perform well only on similar data, while underperforming on real-world programs.
arXiv Detail & Related papers (2023-11-02T01:51:43Z) - Data Models for Dataset Drift Controls in Machine Learning With Optical
Images [8.818468649062932]
A primary failure mode is a performance drop due to differences between the training and deployment data.
Existing approaches do not account for explicit models of the primary object of interest: the data.
We demonstrate how such data models can be constructed for image data and used to control downstream machine learning model performance related to dataset drift.
arXiv Detail & Related papers (2022-11-04T16:50:10Z) - Synthetic Model Combination: An Instance-wise Approach to Unsupervised
Ensemble Learning [92.89846887298852]
Consider making a prediction over new test data without any opportunity to learn from a training set of labelled data.
We are given access to a set of expert models and their predictions, alongside some limited information about the data used to train them.
arXiv Detail & Related papers (2022-10-11T10:20:31Z) - A Survey of Learning on Small Data: Generalization, Optimization, and
Challenge [101.27154181792567]
Learning on small data that approximates the generalization ability of big data is one of the ultimate purposes of AI.
This survey follows the active sampling theory under a PAC framework to analyze the generalization error and label complexity of learning on small data.
Multiple data applications that may benefit from efficient small data representation are surveyed.
arXiv Detail & Related papers (2022-07-29T02:34:19Z) - ALT-MAS: A Data-Efficient Framework for Active Testing of Machine
Learning Algorithms [58.684954492439424]
We propose a novel framework to efficiently test a machine learning model using only a small amount of labeled test data.
The idea is to estimate the metrics of interest for a model under test using a Bayesian neural network (BNN).
arXiv Detail & Related papers (2021-04-11T12:14:04Z) - Synthesizing Irreproducibility in Deep Networks [2.28438857884398]
Modern-day deep networks suffer from irreproducibility (also referred to as nondeterminism or underspecification).
We show that even with a single nonlinearity and for very simple data and models, irreproducibility occurs.
Model complexity and the choice of nonlinearity also play significant roles in making deep models irreproducible.
arXiv Detail & Related papers (2021-02-21T21:51:28Z) - Adversarial Vulnerability of Active Transfer Learning [0.0]
Two widely used techniques for training supervised machine learning models on small datasets are Active Learning and Transfer Learning.
We show that the combination of these techniques is particularly susceptible to a new kind of data poisoning attack.
We show that a model trained on such a poisoned dataset has a significantly deteriorated performance, dropping from 86% to 34% test accuracy.
arXiv Detail & Related papers (2021-01-26T14:07:09Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By seeking to embed samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
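As referenced in the Projected-Gradient Unlearning entry above, the following is a minimal, hypothetical sketch of how a projected-gradient update can limit interference during unlearning: updates driven by the forget data are restricted to the orthogonal complement of the subspace spanned by gradients on the retained data. The SVD-based subspace estimate, the energy threshold, and the ascent step are illustrative assumptions, not the paper's exact procedure.

    import numpy as np

    def retain_complement_projector(retain_grads, energy=0.99):
        # retain_grads: (k, p) array, each row a flattened gradient on retained
        # data. Estimate its span by SVD and keep enough right singular vectors
        # to capture `energy` of the squared spectrum (a common heuristic).
        _, s, vt = np.linalg.svd(retain_grads, full_matrices=False)
        cum = np.cumsum(s ** 2) / np.sum(s ** 2)
        r = int(np.searchsorted(cum, energy)) + 1
        basis = vt[:r]                                  # (r, p), orthonormal rows
        return np.eye(retain_grads.shape[1]) - basis.T @ basis

    def unlearning_step(params, forget_grad, projector, lr=1e-2):
        # Ascend the loss on the forget data, but only along directions that are
        # (approximately) orthogonal to the retained data's gradients, so the
        # update interferes as little as possible with what must be remembered.
        return params + lr * (projector @ forget_grad)

    # Toy usage with random stand-ins for real gradients.
    rng = np.random.default_rng(0)
    params = rng.normal(size=64)
    retain_grads = rng.normal(size=(10, 64))
    forget_grad = rng.normal(size=64)
    P = retain_complement_projector(retain_grads)
    params = unlearning_step(params, forget_grad, P)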