Kernel-based function learning in dynamic and non stationary
environments
- URL: http://arxiv.org/abs/2310.02767v1
- Date: Wed, 4 Oct 2023 12:31:31 GMT
- Title: Kernel-based function learning in dynamic and non stationary
environments
- Authors: Alberto Giaretta, Mauro Bisiacco, Gianluigi Pillonetto
- Abstract summary: One central theme in machine learning is function estimation from sparse and noisy data.
In this work, we consider kernel-based ridge regression and derive convergence conditions under non-stationary distributions.
This includes important exploration-exploitation problems where, e.g., a set of agents/robots has to monitor an environment to reconstruct a sensorial field.
- Score: 0.6138671548064355
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: One central theme in machine learning is function estimation from sparse and noisy data. An example is supervised learning, where the elements of the training set are pairs, each containing an input location and an output response. In the last decades, a substantial amount of work has been devoted to designing estimators for the unknown function and to studying their convergence to the optimal predictor, also characterizing the learning rate. These results typically rely on stationarity assumptions where input locations are drawn from a probability distribution that does not change in time. In this work, we consider kernel-based ridge regression and derive convergence conditions under non-stationary distributions, addressing also cases where stochastic adaptation may happen infinitely often. This includes important exploration-exploitation problems where, e.g., a set of agents/robots has to monitor an environment to reconstruct a sensorial field, and their movement rules are continuously updated on the basis of the knowledge acquired about the field and/or the surrounding environment.
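To make the setting concrete, the following is a minimal sketch (not the authors' code) of kernel ridge regression with a Gaussian kernel, where the input locations are drawn from a sampling distribution whose mean drifts over time, mimicking agents that progressively explore the field. The kernel width, regularization constant, noise level, drift schedule, and target field are illustrative assumptions.

```python
import numpy as np

# Minimal sketch: kernel ridge regression with a Gaussian kernel on inputs
# drawn from a distribution whose mean drifts over time (non-stationary sampling).

def gaussian_kernel(X1, X2, width=0.2):
    return np.exp(-(X1[:, None] - X2[None, :]) ** 2 / (2 * width ** 2))

def kernel_ridge_fit(X, y, lam=1e-2, width=0.2):
    # Representer-theorem form: solve (K + n*lam*I) alpha = y
    K = gaussian_kernel(X, X, width)
    alpha = np.linalg.solve(K + lam * len(X) * np.eye(len(X)), y)
    return lambda Xq: gaussian_kernel(Xq, X, width) @ alpha

rng = np.random.default_rng(0)
field = lambda x: np.sin(2 * np.pi * x)       # unknown sensorial field to reconstruct
X, y = np.empty(0), np.empty(0)
center = 0.2                                  # mean of the (non-stationary) sampling distribution
for t in range(20):
    Xt = np.clip(rng.normal(center, 0.05, size=10), 0.0, 1.0)
    yt = field(Xt) + 0.1 * rng.standard_normal(10)
    X, y = np.concatenate([X, Xt]), np.concatenate([y, yt])
    center = min(center + 0.03, 0.8)          # exploration: the sampling region moves over time

predict = kernel_ridge_fit(X, y)
print(np.round(predict(np.linspace(0, 1, 5)), 2))
```

The linear system (K + nλI)α = y is the standard representer-theorem form of kernel ridge regression; only the time-varying sampling of the input locations reflects the non-stationary setting studied in the paper.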
Related papers
- Localized Gaussians as Self-Attention Weights for Point Clouds Correspondence [92.07601770031236]
We investigate semantically meaningful patterns in the attention heads of an encoder-only Transformer architecture.
We find that fixing the attention weights not only accelerates the training process but also enhances the stability of the optimization.
arXiv Detail & Related papers (2024-09-20T07:41:47Z)
- A Self-Organizing Clustering System for Unsupervised Distribution Shift Detection [1.0436203990235575]
We propose a continual learning framework for monitoring and detecting distribution changes.
In particular, we investigate the projections made by two topology-preserving maps: the Self-Organizing Map and the Scale Invariant Map.
Our method can be applied in both a supervised and an unsupervised context.
arXiv Detail & Related papers (2024-04-25T14:48:29Z)
- A Unifying Perspective on Non-Stationary Kernels for Deeper Gaussian Processes [0.9558392439655016]
We show a variety of kernels in action using representative datasets, carefully study their properties, and compare their performances.
Based on our findings, we propose a new kernel that combines some of the identified advantages of existing kernels.
arXiv Detail & Related papers (2023-09-18T18:34:51Z)
- Out of Distribution Detection via Domain-Informed Gaussian Process State Space Models [22.24457254575906]
In order for robots to safely navigate in unseen scenarios, it is important to accurately detect out-of-training-distribution (OoD) situations online.
We propose (i) a novel approach to embed existing domain knowledge in the kernel and (ii) an OoD online runtime monitor based on receding-horizon predictions.
arXiv Detail & Related papers (2023-09-13T01:02:42Z)
- Continual Test-Time Domain Adaptation [94.51284735268597]
Test-time domain adaptation aims to adapt a source pre-trained model to a target domain without using any source data.
CoTTA is easy to implement and can be readily incorporated in off-the-shelf pre-trained models.
arXiv Detail & Related papers (2022-03-25T11:42:02Z)
- On Generalizing Beyond Domains in Cross-Domain Continual Learning [91.56748415975683]
Deep neural networks often suffer from catastrophic forgetting of previously learned knowledge after learning a new task.
Our proposed approach learns new tasks under domain shift with accuracy boosts up to 10% on challenging datasets such as DomainNet and OfficeHome.
arXiv Detail & Related papers (2022-03-08T09:57:48Z)
- Learning Operators with Coupled Attention [9.715465024071333]
We propose a novel operator learning method, LOCA, motivated from the recent success of the attention mechanism.
In our architecture the input functions are mapped to a finite set of features which are then averaged with attention weights that depend on the output query locations.
By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions.
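As an illustration of the coupled-attention idea summarized above, the toy sketch below (an assumption-laden reconstruction, not the LOCA implementation) averages a finite set of input-function features with query-dependent weights, coupling the weights across query locations through a discretized kernel integral transform applied to the scores before normalization; the score matrix and kernel width are placeholders.

```python
import numpy as np

# Toy reconstruction of the coupled-attention idea (illustrative assumptions only).
rng = np.random.default_rng(0)
n_query, n_feat, d = 5, 16, 8
queries = np.linspace(0.0, 1.0, n_query)[:, None]    # output query locations
features = rng.standard_normal((n_feat, d))          # finite set of input-function features
raw_scores = rng.standard_normal((n_query, n_feat))  # stand-in for a learned score network

# Discretized integral transform over query locations couples nearby queries.
K = np.exp(-(queries - queries.T) ** 2 / (2 * 0.1 ** 2))
K /= K.sum(axis=1, keepdims=True)
coupled_scores = K @ raw_scores

weights = np.exp(coupled_scores)
weights /= weights.sum(axis=1, keepdims=True)         # attention weights per query
outputs = weights @ features                          # (n_query, d) query-dependent averages
print(outputs.shape)
```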
arXiv Detail & Related papers (2022-01-04T08:22:03Z)
- Decentralized Local Stochastic Extra-Gradient for Variational Inequalities [125.62877849447729]
We consider distributed variational inequalities (VIs) on domains with the problem data that is heterogeneous (non-IID) and distributed across many devices.
We make a very general assumption on the computational network that covers the settings of fully decentralized calculations.
We theoretically analyze its convergence rate in the strongly-monotone, monotone, and non-monotone settings.
arXiv Detail & Related papers (2021-06-15T17:45:51Z)
- What training reveals about neural network complexity [80.87515604428346]
This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced by how fast its weights change during training.
Our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
arXiv Detail & Related papers (2021-06-08T08:58:00Z)
- Contrastive learning of strong-mixing continuous-time stochastic processes [53.82893653745542]
Contrastive learning is a family of self-supervised methods where a model is trained to solve a classification task constructed from unlabeled data.
We show that a properly constructed contrastive learning task can be used to estimate the transition kernel for small-to-mid-range intervals in the diffusion case.
arXiv Detail & Related papers (2021-03-03T23:06:47Z)
- The Traveling Observer Model: Multi-task Learning Through Spatial Variable Embeddings [28.029643109302715]
We frame a general prediction system as an observer traveling around a continuous space, measuring values at some locations, and predicting them at others.
This perspective leads to a machine learning framework in which seemingly unrelated tasks can be solved by a single model.
In experiments, the approach is shown to (1) recover intuitive locations of variables in space and time, (2) exploit regularities across related datasets with completely disjoint input and output spaces, and (3) exploit regularities across seemingly unrelated tasks.
arXiv Detail & Related papers (2020-10-05T21:51:37Z)