Related papers: Active Learning for Regression based on Wasserstein distance and GroupSort Neural Networks

URL: http://arxiv.org/abs/2403.15108v1
Date: Fri, 22 Mar 2024 10:51:55 GMT
Title: Active Learning for Regression based on Wasserstein distance and GroupSort Neural Networks
Authors: Benjamin Bobbia, Matthias Picard
Abstract summary: The Wasserstein active regression model is based on the principles of distribution-matching to measure the representativeness of the labeled dataset. The Wasserstein distance is computed using GroupSort Neural Networks.
Score: 0.0
License: http://creativecommons.org/publicdomain/zero/1.0/
Abstract: This paper presents a new active learning strategy for regression problems. The presented Wasserstein active regression model is based on the principles of distribution-matching to measure the representativeness of the labeled dataset. The Wasserstein distance is computed using GroupSort Neural Networks. The use of such networks provides theoretical foundations giving a way to quantify errors with explicit bounds for their size and depth. To complete the query strategy, this solution is combined with another, more outlier-tolerant, uncertainty-based approach. Finally, this method is compared with other classical and recent solutions. The study empirically shows the relevance of such a representativity-uncertainty approach, which provides good estimation throughout the query procedure. Moreover, the Wasserstein active regression often achieves more precise estimations and tends to improve accuracy faster than other models.
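
Illustration (not from the paper): the abstract names two building blocks, the GroupSort activation and the Wasserstein distance. The sketch below is a minimal NumPy version of each, assuming a group size of 2 and equal-size one-dimensional samples; the function names `group_sort` and `wasserstein_1d` are hypothetical and this is not the authors' implementation. GroupSort only permutes entries within each group, so it preserves the norm of its input, which is why such networks are often paired with norm-constrained weights to represent 1-Lipschitz functions. The closed-form 1-D distance is the kind of quantity a trained network would be estimating in higher dimensions.

```python
import numpy as np


def group_sort(x, group_size=2):
    """Sort entries within consecutive groups along the last axis.

    Hypothetical helper: a plain NumPy version of the GroupSort activation.
    """
    x = np.asarray(x, dtype=float)
    n = x.shape[-1]
    if n % group_size != 0:
        raise ValueError("feature dimension must be divisible by group_size")
    # Split the last axis into (number of groups, group_size), sort each group.
    grouped = x.reshape(*x.shape[:-1], n // group_size, group_size)
    return np.sort(grouped, axis=-1).reshape(x.shape)


def wasserstein_1d(a, b):
    """Exact Wasserstein-1 distance between two equal-size 1-D samples.

    In one dimension the optimal coupling matches sorted values, so the
    distance is the mean absolute difference of the order statistics.
    """
    a = np.sort(np.asarray(a, dtype=float))
    b = np.sort(np.asarray(b, dtype=float))
    if a.shape != b.shape:
        raise ValueError("this sketch assumes samples of equal size")
    return float(np.mean(np.abs(a - b)))


if __name__ == "__main__":
    # Assumed toy data: a labeled subset versus a shifted unlabeled pool.
    rng = np.random.default_rng(0)
    labeled = rng.normal(0.0, 1.0, size=200)
    pool = rng.normal(0.5, 1.0, size=200)
    print(group_sort([3.0, 1.0, 2.0, 5.0]))
    print(wasserstein_1d(labeled, pool))
```

A smaller distance between the labeled sample and the pool would indicate a more representative labeled set, which is the distribution-matching idea the abstract describes.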

Related papers

Contextual Similarity Distillation: Ensemble Uncertainties with a Single Model [5.624791703748109]
Uncertainty quantification is a critical aspect of reinforcement learning and deep learning. We propose contextual similarity distillation, a novel approach that explicitly estimates the variance of an ensemble of deep neural networks with a single model. We empirically validate our method across a variety of out-of-distribution detection benchmarks and sparse-reward reinforcement learning environments.
arXiv Detail & Related papers (2025-03-14T12:09:58Z)
An Optimal Transport Approach for Network Regression [0.6238182916866519]
We build upon recent developments in generalized regression models on metric spaces based on Fréchet means. We propose a network regression method using the Wasserstein metric.
arXiv Detail & Related papers (2024-06-18T02:03:07Z)
Distributed High-Dimensional Quantile Regression: Estimation Efficiency and Support Recovery [0.0]
We focus on distributed estimation and support recovery for high-dimensional linear quantile regression. We transform the original quantile regression into the least-squares optimization. An efficient algorithm is developed, which enjoys high computation and communication efficiency.
arXiv Detail & Related papers (2024-05-13T08:32:22Z)
Learning a Gaussian Mixture for Sparsity Regularization in Inverse Problems [2.375943263571389]
In inverse problems, the incorporation of a sparsity prior yields a regularization effect on the solution. We propose a probabilistic sparsity prior formulated as a mixture of Gaussians, capable of modeling sparsity with respect to a generic basis. We put forth both a supervised and an unsupervised training strategy to estimate the parameters of this network.
arXiv Detail & Related papers (2024-01-29T22:52:57Z)
Structured Radial Basis Function Network: Modelling Diversity for Multiple Hypotheses Prediction [51.82628081279621]
Multi-modal regression is important in forecasting nonstationary processes or with a complex mixture of distributions. A Structured Radial Basis Function Network is presented as an ensemble of multiple hypotheses predictors for regression problems. It is proved that this structured model can efficiently interpolate this tessellation and approximate the multiple hypotheses target distribution.
arXiv Detail & Related papers (2023-09-02T01:27:53Z)
Making Look-Ahead Active Learning Strategies Feasible with Neural Tangent Kernels [6.372625755672473]
We propose a new method for approximating active learning acquisition strategies that are based on retraining with hypothetically-labeled candidate data points. Although this is usually infeasible with deep networks, we use the neural tangent kernel to approximate the result of retraining.
arXiv Detail & Related papers (2022-06-25T06:13:27Z)
Federated Learning Aggregation: New Robust Algorithms with Guarantees [63.96013144017572]
Federated learning has been recently proposed for distributed model training at the edge. This paper presents a complete general mathematical convergence analysis to evaluate aggregation strategies in a federated learning framework. We derive novel aggregation algorithms which are able to modify their model architecture by differentiating client contributions according to the value of their losses.
arXiv Detail & Related papers (2022-05-22T16:37:53Z)
Wasserstein Iterative Networks for Barycenter Estimation [80.23810439485078]
We present an algorithm to approximate the Wasserstein-2 barycenters of continuous measures via a generative model. Based on the celebrity faces dataset, we construct the Ave, celeba! dataset, which can be used for quantitative evaluation of barycenter algorithms.
arXiv Detail & Related papers (2022-01-28T16:59:47Z)
Sparsely constrained neural networks for model discovery of PDEs [0.0]
We present a modular framework that determines the sparsity pattern of a deep-learning based surrogate using any sparse regression technique. We show how a different network architecture and sparsity estimator improve model discovery accuracy and convergence on several benchmark examples.
arXiv Detail & Related papers (2020-11-09T11:02:40Z)
Good Classifiers are Abundant in the Interpolating Regime [64.72044662855612]
We develop a methodology to compute precisely the full distribution of test errors among interpolating classifiers. We find that test errors tend to concentrate around a small typical value $\varepsilon^*$, which deviates substantially from the test error of the worst-case interpolating model. Our results show that the usual style of analysis in statistical learning theory may not be fine-grained enough to capture the good generalization performance observed in practice.
arXiv Detail & Related papers (2020-06-22T21:12:31Z)
Path Sample-Analytic Gradient Estimators for Stochastic Binary Networks [78.76880041670904]
In neural networks with binary activations and/or binary weights, training by gradient descent is complicated. We propose a new method for this estimation problem combining sampling and analytic approximation steps. We experimentally show higher accuracy in gradient estimation and demonstrate a more stable and better performing training in deep convolutional models.
arXiv Detail & Related papers (2020-06-04T21:51:21Z)
Continual Learning using a Bayesian Nonparametric Dictionary of Weight Factors [75.58555462743585]
Naively trained neural networks tend to experience catastrophic forgetting in sequential task settings. We propose a principled nonparametric approach based on the Indian Buffet Process (IBP) prior, letting the data determine how much to expand the model complexity. We demonstrate the effectiveness of our method on a number of continual learning benchmarks and analyze how weight factors are allocated and reused throughout the training.
arXiv Detail & Related papers (2020-04-21T15:20:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.