Self-Supervised Learning for Covariance Estimation
- URL: http://arxiv.org/abs/2403.08662v1
- Date: Wed, 13 Mar 2024 16:16:20 GMT
- Title: Self-Supervised Learning for Covariance Estimation
- Authors: Tzvi Diskin and Ami Wiesel
- Abstract summary: We propose to globally learn a neural network that will then be applied locally at inference time.
The architecture is based on the popular attention mechanism.
It can be pre-trained as a foundation model and then be repurposed for various downstream tasks, e.g., adaptive target detection in radar or hyperspectral imagery.
- Score: 3.04585143845864
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider the use of deep learning for covariance estimation. We propose to
globally learn a neural network that will then be applied locally at inference
time. Leveraging recent advancements in self-supervised foundational models, we
train the network without any labeling by simply masking different samples and
learning to predict their covariance given their surrounding neighbors. The
architecture is based on the popular attention mechanism. Its main advantage
over classical methods is the automatic exploitation of global characteristics
without any distributional assumptions or regularization. It can be pre-trained
as a foundation model and then be repurposed for various downstream tasks,
e.g., adaptive target detection in radar or hyperspectral imagery.
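The masking-based training idea can be sketched as follows. This is a minimal illustration assuming PyTorch; the layer sizes, masking scheme, Cholesky parameterization, and loss below are guesses for illustration, not the paper's exact design.

```python
# Illustrative sketch only: masked covariance prediction with attention.
# Dimensions, masking scheme, and loss are assumptions, not the paper's exact design.
import torch
import torch.nn as nn

class MaskedCovarianceNet(nn.Module):
    def __init__(self, dim, embed=64, heads=4):
        super().__init__()
        self.dim = dim
        self.embed = nn.Linear(dim, embed)
        self.attn = nn.MultiheadAttention(embed, heads, batch_first=True)
        # Predict a Cholesky factor so the output covariance is positive semidefinite.
        self.head = nn.Linear(embed, dim * dim)

    def forward(self, neighbors):
        # neighbors: (batch, n_neighbors, dim) -- samples surrounding the masked one
        h = self.embed(neighbors)
        h, _ = self.attn(h, h, h)                 # global interaction across neighbors
        pooled = h.mean(dim=1)                    # summarize the neighborhood
        L = self.head(pooled).view(-1, self.dim, self.dim)
        L = torch.tril(L)                         # lower-triangular factor
        return L @ L.transpose(-1, -2)            # covariance estimate

def masked_covariance_loss(model, window):
    # window: (batch, n_samples, dim); hold out sample 0, predict its covariance
    target, neighbors = window[:, 0], window[:, 1:]
    cov = model(neighbors)
    # Outer product of the held-out sample is a noisy covariance target (zero-mean data assumed).
    outer = target.unsqueeze(-1) @ target.unsqueeze(-2)
    return ((cov - outer) ** 2).mean()

model = MaskedCovarianceNet(dim=8)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
window = torch.randn(32, 17, 8)                   # toy data: 32 neighborhoods of 17 samples
loss = masked_covariance_loss(model, window)
loss.backward()
opt.step()
```

Parameterizing the output through a lower-triangular factor keeps every predicted covariance positive semidefinite without explicit constraints or regularization.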
Related papers
- A separability-based approach to quantifying generalization: which layer is best? [0.0]
Generalization to unseen data remains poorly understood for deep learning classification and foundation models.
We provide a new method for evaluating the capacity of networks to represent a sampled domain.
We find that (i) high classification accuracy does not imply high generalizability; and (ii) deeper layers in a model do not always generalize the best.
arXiv Detail & Related papers (2024-05-02T17:54:35Z)
- From Reactive to Proactive Volatility Modeling with Hemisphere Neural Networks [0.0]
We reinvigorate maximum likelihood estimation (MLE) for macroeconomic density forecasting through a novel neural network architecture with dedicated mean and variance hemispheres.
Our Hemisphere Neural Network (HNN) provides proactive volatility forecasts based on leading indicators when it can, and reactive volatility based on the magnitude of previous prediction errors when it must.
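As a rough illustration of the mean/variance split, a two-headed network trained by maximum likelihood with a Gaussian negative log-likelihood could look like the sketch below; the hidden sizes and softplus variance link are assumptions, not the HNN architecture itself.

```python
# Illustrative two-headed ("hemisphere") density forecaster; sizes and links are assumptions.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TwoHemisphereNet(nn.Module):
    def __init__(self, n_features, hidden=32):
        super().__init__()
        self.mean_head = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))
        self.var_head = nn.Sequential(nn.Linear(n_features, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, x):
        mu = self.mean_head(x)                              # conditional mean
        sigma2 = F.softplus(self.var_head(x)) + 1e-6        # conditional variance, kept positive
        return mu, sigma2

def gaussian_nll(mu, sigma2, y):
    # Maximum-likelihood objective: Gaussian negative log-likelihood.
    return (0.5 * torch.log(sigma2) + 0.5 * (y - mu) ** 2 / sigma2).mean()

net = TwoHemisphereNet(n_features=10)
x, y = torch.randn(64, 10), torch.randn(64, 1)
mu, sigma2 = net(x)
gaussian_nll(mu, sigma2, y).backward()
```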
arXiv Detail & Related papers (2023-11-27T21:37:50Z)
- Activate and Reject: Towards Safe Domain Generalization under Category Shift [71.95548187205736]
We study a practical problem of Domain Generalization under Category Shift (DGCS).
It aims to simultaneously detect unknown-class samples and classify known-class samples in the target domains.
Compared to prior DG works, we face two new challenges: 1) how to learn the concept of "unknown" during training with only source known-class samples, and 2) how to adapt the source-trained model to unseen environments.
arXiv Detail & Related papers (2023-10-07T07:53:12Z)
- Chaos to Order: A Label Propagation Perspective on Source-Free Domain Adaptation [8.27771856472078]
We present Chaos to Order (CtO), a novel approach for source-free domain adaptation (SFDA).
CtO strives to constrain semantic credibility and propagate label information among target subpopulations.
Empirical evidence demonstrates that CtO outperforms the state of the art on three public benchmarks.
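For context, a textbook label-propagation step over a kNN graph of target features is sketched below; this is a generic illustration of the idea, not CtO's actual procedure, and the affinity construction and hyperparameters are assumptions.

```python
# Generic label-propagation step on a kNN graph of target features;
# a textbook illustration, not CtO's actual algorithm.
import numpy as np

def propagate_labels(features, probs, k=5, alpha=0.8, iters=10):
    # features: (n, d) L2-normalized target features; probs: (n, c) initial pseudo-label probabilities
    sim = features @ features.T                        # cosine similarity
    idx = np.argsort(-sim, axis=1)[:, 1:k + 1]         # k nearest neighbors, excluding self
    W = np.zeros_like(sim)
    rows = np.arange(sim.shape[0])[:, None]
    W[rows, idx] = sim[rows, idx]
    W = (W + W.T) / 2                                  # symmetric affinity
    W = W / (W.sum(axis=1, keepdims=True) + 1e-8)      # row-normalize
    y = probs.copy()
    for _ in range(iters):
        y = alpha * (W @ y) + (1 - alpha) * probs      # diffuse labels among subpopulations
    return y

feats = np.random.randn(100, 16)
feats /= np.linalg.norm(feats, axis=1, keepdims=True)
init = np.random.dirichlet(np.ones(5), size=100)
refined = propagate_labels(feats, init)
```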
arXiv Detail & Related papers (2023-01-20T03:39:35Z)
- Learning Classifiers of Prototypes and Reciprocal Points for Universal Domain Adaptation [79.62038105814658]
Universal Domain Adaptation aims to transfer knowledge between datasets while handling two shifts: domain-shift and category-shift.
The main challenge is correctly distinguishing unknown target samples while adapting the distribution of known-class knowledge from source to target.
Most existing methods approach this problem by first training a classifier adapted to the known target classes and then relying on a single threshold to distinguish unknown target samples.
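That single-threshold baseline can be illustrated with a small, hypothetical rejection rule; the softmax confidence score and threshold value are assumptions.

```python
# Minimal illustration of the single-threshold rule: target samples whose maximum
# softmax confidence falls below a threshold are declared "unknown".
import numpy as np

def predict_with_rejection(logits, threshold=0.5, unknown_label=-1):
    # logits: (n, num_known_classes) from a source-trained classifier
    exp = np.exp(logits - logits.max(axis=1, keepdims=True))
    probs = exp / exp.sum(axis=1, keepdims=True)
    conf, labels = probs.max(axis=1), probs.argmax(axis=1)
    labels[conf < threshold] = unknown_label           # reject low-confidence samples
    return labels

preds = predict_with_rejection(np.random.randn(10, 6))
```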
arXiv Detail & Related papers (2022-12-16T09:01:57Z)
- Divide and Contrast: Source-free Domain Adaptation via Adaptive Contrastive Learning [122.62311703151215]
Divide and Contrast (DaC) aims to connect the good ends of both worlds while bypassing their limitations.
DaC divides the target data into source-like and target-specific samples, where either group of samples is treated with tailored goals.
We further align the source-like domain with the target-specific samples using a memory bank-based Maximum Mean Discrepancy (MMD) loss to reduce the distribution mismatch.
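Such a memory-bank MMD term might look like the sketch below; the RBF kernel and bandwidth are assumptions and may differ from DaC's exact formulation.

```python
# Sketch of an RBF-kernel MMD loss between memory-bank (source-like) features and
# target-specific features; kernel choice and bandwidth are assumptions.
import torch

def mmd_rbf(x, y, sigma=1.0):
    # x: (n, d) source-like features from the memory bank, y: (m, d) target-specific features
    def kernel(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return kernel(x, x).mean() + kernel(y, y).mean() - 2 * kernel(x, y).mean()

memory_bank = torch.randn(256, 128)                    # running store of source-like features
target_feats = torch.randn(64, 128, requires_grad=True)
loss = mmd_rbf(memory_bank, target_feats)
loss.backward()
```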
arXiv Detail & Related papers (2022-11-12T09:21:49Z)
- Semantic Self-adaptation: Enhancing Generalization with a Single Sample [45.111358665370524]
We propose a self-adaptive approach for semantic segmentation.
It fine-tunes the parameters of convolutional layers to the input image using consistency regularization.
Our empirical study suggests that self-adaptation may complement the established practice of model regularization at training time.
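A minimal sketch of single-sample adaptation via consistency regularization is given below, assuming a convolutional segmentation model in PyTorch; which parameters to tune and which augmentation to use are assumptions, not necessarily the paper's choices.

```python
# Sketch of single-sample test-time adaptation via consistency regularization.
# Assumes a convolutional segmentation model returning (N, C, H, W) logits;
# which parameters to tune and which augmentation to use are assumptions.
import torch
import torch.nn.functional as F

def self_adapt(model, image, steps=5, lr=1e-4):
    # image: (1, 3, H, W); adapt only convolutional weights, keep everything else frozen
    conv_params = [p for m in model.modules()
                   if isinstance(m, torch.nn.Conv2d) for p in m.parameters()]
    opt = torch.optim.SGD(conv_params, lr=lr)
    for _ in range(steps):
        flipped = torch.flip(image, dims=[-1])                        # cheap augmentation: horizontal flip
        p1 = F.log_softmax(model(image), dim=1)
        p2 = F.softmax(torch.flip(model(flipped), dims=[-1]), dim=1)  # undo the flip on the prediction
        loss = F.kl_div(p1, p2, reduction='batchmean')                # consistency between the two views
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model
```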
arXiv Detail & Related papers (2022-08-10T12:29:01Z)
- Towards Disentangling Information Paths with Coded ResNeXt [11.884259630414515]
We take a novel approach to enhancing the transparency of the whole network's function.
We propose a neural network architecture for classification, in which the information that is relevant to each class flows through specific paths.
arXiv Detail & Related papers (2022-02-10T21:45:49Z)
- Predicting Deep Neural Network Generalization with Perturbation Response Curves [58.8755389068888]
We propose a new framework for evaluating the generalization capabilities of trained networks.
Specifically, we introduce two new measures for accurately predicting generalization gaps.
We attain better predictive scores than the current state-of-the-art measures on a majority of tasks in the Predicting Generalization in Deep Learning (PGDL) NeurIPS 2020 competition.
arXiv Detail & Related papers (2021-06-09T01:37:36Z)
- Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
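The pairing idea can be sketched roughly as follows; the specific biased-model loss and the error-based weighting rule are assumptions, not necessarily the paper's exact scheme.

```python
# Rough sketch of the pairing idea: a deliberately biased model is trained with a loss
# that emphasizes samples it already gets right (easy, possibly spurious cues), and its
# per-sample errors up-weight the debiased model. Losses and weighting are assumptions.
import torch
import torch.nn.functional as F

def debias_step(biased_net, debiased_net, opt_b, opt_d, x, y):
    # Biased network: amplify easy samples, down-weight its own failures.
    ce_b = F.cross_entropy(biased_net(x), y, reduction='none')
    loss_b = (torch.exp(-ce_b.detach()) * ce_b).mean()
    opt_b.zero_grad(); loss_b.backward(); opt_b.step()

    # Debiased network: up-weight the samples the biased network fails on.
    with torch.no_grad():
        w = F.cross_entropy(biased_net(x), y, reduction='none')
        w = w / (w.mean() + 1e-8)
    ce_d = F.cross_entropy(debiased_net(x), y, reduction='none')
    loss_d = (w * ce_d).mean()
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

biased_net, debiased_net = torch.nn.Linear(16, 3), torch.nn.Linear(16, 3)
opt_b = torch.optim.SGD(biased_net.parameters(), lr=0.1)
opt_d = torch.optim.SGD(debiased_net.parameters(), lr=0.1)
debias_step(biased_net, debiased_net, opt_b, opt_d,
            torch.randn(32, 16), torch.randint(0, 3, (32,)))
```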
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
- Regularizing Class-wise Predictions via Self-knowledge Distillation [80.76254453115766]
We propose a new regularization method that penalizes the predictive distribution between similar samples.
This results in regularizing the dark knowledge (i.e., the knowledge on wrong predictions) of a single network.
Our experimental results on various image classification tasks demonstrate that the simple yet powerful method can significantly improve the generalization ability.
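A sketch of such a class-wise regularizer, matching the predictive distributions of two samples that share a label, is given below; the temperature and weighting are assumptions.

```python
# Sketch of a class-wise self-distillation regularizer: match the predictive distributions
# of two different samples that share a label; temperature and weighting are assumptions.
import torch
import torch.nn.functional as F

def class_wise_self_distillation(model, x_a, x_b, y, T=4.0, lam=1.0):
    # x_a, x_b: batches paired so that x_a[i] and x_b[i] have the same class label y[i]
    logits_a = model(x_a)
    with torch.no_grad():
        logits_b = model(x_b)                           # "teacher" view of the same class: no gradient
    kd = F.kl_div(F.log_softmax(logits_a / T, dim=1),
                  F.softmax(logits_b / T, dim=1),
                  reduction='batchmean') * (T ** 2)     # penalize mismatch between class-mates
    return F.cross_entropy(logits_a, y) + lam * kd

model = torch.nn.Linear(16, 5)
loss = class_wise_self_distillation(model, torch.randn(8, 16), torch.randn(8, 16),
                                    torch.randint(0, 5, (8,)))
loss.backward()
```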
arXiv Detail & Related papers (2020-03-31T06:03:51Z)
This list is automatically generated from the titles and abstracts of the papers in this site.