LSFSL: Leveraging Shape Information in Few-shot Learning
- URL: http://arxiv.org/abs/2304.06672v1
- Date: Thu, 13 Apr 2023 16:59:22 GMT
- Title: LSFSL: Leveraging Shape Information in Few-shot Learning
- Authors: Deepan Chakravarthi Padmanabhan, Shruthi Gowda, Elahe Arani, Bahram
Zonooz
- Abstract summary: Few-shot learning techniques seek to learn the underlying patterns in data using fewer samples, analogous to how humans learn from limited experience.
In this limited-data scenario, the challenges associated with deep neural networks, such as shortcut learning and texture bias behaviors, are further exacerbated.
We propose LSFSL, which enforces the model to learn more generalizable features utilizing the implicit prior information present in the data.
- Score: 11.145085584637746
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Few-shot learning (FSL) techniques seek to learn the underlying patterns in
data using fewer samples, analogous to how humans learn from limited
experience. In this limited-data scenario, the challenges associated with deep
neural networks, such as shortcut learning and texture bias behaviors, are
further exacerbated. Moreover, the significance of addressing shortcut learning
is not yet fully explored in the few-shot setup. To address these issues, we
propose LSFSL, which enforces the model to learn more generalizable features
utilizing the implicit prior information present in the data. Through
comprehensive analyses, we demonstrate that LSFSL-trained models are less
vulnerable to alteration in color schemes, statistical correlations, and
adversarial perturbations leveraging the global semantics in the data. Our
findings highlight the potential of incorporating relevant priors in few-shot
approaches to increase robustness and generalization.
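One common way to expose the "implicit shape prior" the abstract refers to is to derive an edge map from each image, since edges retain global object shape while discarding the texture and color cues that drive shortcut learning. The sketch below is a hypothetical illustration of that idea (a Sobel edge extractor), not the paper's actual architecture:

```python
import numpy as np

def sobel_edge_map(image):
    """Extract a crude shape (edge) view of a grayscale image.

    Hypothetical illustration of a shape prior: the edge map keeps
    object outlines while discarding texture and color information.
    """
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
    ky = kx.T
    h, w = image.shape
    padded = np.pad(image, 1, mode="edge")  # replicate border pixels
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + 3, j:j + 3]
            gx[i, j] = np.sum(kx * patch)  # horizontal gradient
            gy[i, j] = np.sum(ky * patch)  # vertical gradient
    return np.hypot(gx, gy)  # gradient magnitude per pixel

# A flat image has no edges; a step image has a strong vertical edge.
flat = np.ones((8, 8))
step = np.concatenate([np.zeros((8, 4)), np.ones((8, 4))], axis=1)
print(sobel_edge_map(flat).max(), sobel_edge_map(step).max())
```

A shape-aware model could consume such an edge view alongside the RGB input, encouraging features tied to global semantics rather than local texture.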
Related papers
- Leveraging Task-Specific Knowledge from LLM for Semi-Supervised 3D Medical Image Segmentation [9.778201925906913]
We introduce LLM-SegNet, which exploits a large language model (LLM) to integrate task-specific knowledge into our co-training framework.
Experiments on publicly available Left Atrium, Pancreas-CT, and Brats-19 datasets demonstrate the superior performance of LLM-SegNet compared to the state-of-the-art.
arXiv Detail & Related papers (2024-07-06T14:23:16Z) - Understanding Privacy Risks of Embeddings Induced by Large Language Models [75.96257812857554]
Large language models show early signs of artificial general intelligence but struggle with hallucinations.
One promising solution is to store external knowledge as embeddings, aiding LLMs in retrieval-augmented generation.
Recent studies experimentally showed that the original text can be partially reconstructed from text embeddings by pre-trained language models.
arXiv Detail & Related papers (2024-04-25T13:10:48Z) - Can We Break Free from Strong Data Augmentations in Self-Supervised Learning? [18.83003310612038]
Self-supervised learning (SSL) has emerged as a promising solution for addressing the challenge of limited labeled data in deep neural networks (DNNs).
We explore SSL behavior across a spectrum of augmentations, revealing their crucial role in shaping SSL model performance and learning mechanisms.
We propose a novel learning approach that integrates prior knowledge, with the aim of curtailing the need for extensive data augmentations.
arXiv Detail & Related papers (2024-04-15T12:53:48Z) - Making Self-supervised Learning Robust to Spurious Correlation via
Learning-speed Aware Sampling [26.444935219428036]
Self-supervised learning (SSL) has emerged as a powerful technique for learning rich representations from unlabeled data.
In real-world settings, spurious correlations between some attributes (e.g. race, gender and age) and labels for downstream tasks often exist.
We propose a learning-speed aware SSL (LA-SSL) approach, in which we sample each training data with a probability that is inversely related to its learning speed.
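The sampling rule can be pictured as follows; this is a toy sketch of "probability inversely related to learning speed" (a softmax over negated speeds), not the paper's exact estimator:

```python
import numpy as np

def la_ssl_sampling_probs(learning_speeds, temperature=1.0):
    """Toy sketch of learning-speed aware sampling: slowly-learned
    samples (often those conflicting with spurious correlations)
    receive higher sampling probability."""
    speeds = np.asarray(learning_speeds, dtype=float)
    logits = -speeds / temperature  # slower learning -> larger logit
    logits -= logits.max()          # subtract max for numerical stability
    p = np.exp(logits)
    return p / p.sum()              # normalize to a distribution

# Three samples: fast, medium, and slow learners.
probs = la_ssl_sampling_probs([0.9, 0.5, 0.1])
print(probs)
```

The slowest-learned sample gets the largest probability, so minibatches over-represent examples the model has not yet fit, counteracting the pull of spurious attribute-label correlations.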
arXiv Detail & Related papers (2023-11-27T22:52:45Z) - Representation Learning Dynamics of Self-Supervised Models [7.289672463326423]
Self-Supervised Learning (SSL) is an important paradigm for learning representations from unlabelled data.
We study the learning dynamics of SSL models, specifically representations obtained by minimising contrastive and non-contrastive losses.
We derive the exact learning dynamics of the SSL models trained using gradient descent on the Grassmannian manifold.
arXiv Detail & Related papers (2023-09-05T07:48:45Z) - To Repeat or Not To Repeat: Insights from Scaling LLM under Token-Crisis [50.31589712761807]
Large language models (LLMs) are notoriously token-hungry during pre-training, and high-quality text data on the web is approaching its scaling limit for LLMs.
We investigate the consequences of repeating pre-training data, revealing that the model is susceptible to overfitting.
We also examine the key factors contributing to multi-epoch degradation, finding that significant factors include dataset size, model parameters, and training objectives.
arXiv Detail & Related papers (2023-05-22T17:02:15Z) - Mitigating Forgetting in Online Continual Learning via Contrasting
Semantically Distinct Augmentations [22.289830907729705]
Online continual learning (OCL) aims to enable model learning from a non-stationary data stream to continuously acquire new knowledge as well as retain the learnt one.
The main challenge comes from the "catastrophic forgetting" issue: the inability to retain previously learnt knowledge while learning new tasks.
arXiv Detail & Related papers (2022-11-10T05:29:43Z) - Semi-Supervised and Unsupervised Deep Visual Learning: A Survey [76.2650734930974]
Semi-supervised learning and unsupervised learning offer promising paradigms to learn from an abundance of unlabeled visual data.
We review the recent advanced deep learning algorithms on semi-supervised learning (SSL) and unsupervised learning (UL) for visual recognition from a unified perspective.
arXiv Detail & Related papers (2022-08-24T04:26:21Z) - Collaborative Intelligence Orchestration: Inconsistency-Based Fusion of
Semi-Supervised Learning and Active Learning [60.26659373318915]
Active learning (AL) and semi-supervised learning (SSL) are two effective, but often isolated, means to alleviate the data-hungry problem.
We propose an innovative Inconsistency-based virtual aDvErsarial algorithm to further investigate SSL-AL's potential superiority.
Two real-world case studies visualize the practical industrial value of applying and deploying the proposed data sampling algorithm.
arXiv Detail & Related papers (2022-06-07T13:28:43Z) - On Data-Augmentation and Consistency-Based Semi-Supervised Learning [77.57285768500225]
Recently proposed consistency-based Semi-Supervised Learning (SSL) methods have advanced the state of the art in several SSL tasks.
Despite these advances, the understanding of these methods is still relatively limited.
arXiv Detail & Related papers (2021-01-18T10:12:31Z) - Understanding Self-supervised Learning with Dual Deep Networks [74.92916579635336]
We propose a novel framework to understand contrastive self-supervised learning (SSL) methods that employ dual pairs of deep ReLU networks.
We prove that in each SGD update of SimCLR with various loss functions, the weights at each layer are updated by a covariance operator.
To further study what role the covariance operator plays and which features are learned in such a process, we model data generation and augmentation processes through a hierarchical latent tree model (HLTM).
arXiv Detail & Related papers (2020-10-01T17:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.