A Maximum Log-Likelihood Method for Imbalanced Few-Shot Learning Tasks
- URL: http://arxiv.org/abs/2211.14668v1
- Date: Sat, 26 Nov 2022 21:31:00 GMT
- Title: A Maximum Log-Likelihood Method for Imbalanced Few-Shot Learning Tasks
- Authors: Samuel Hess and Gregory Ditzler
- Abstract summary: We propose a new maximum log-likelihood metric for few-shot architectures.
We demonstrate that the proposed metric achieves superior accuracy compared to conventional similarity metrics.
We also show that our algorithm achieves state-of-the-art transductive few-shot performance when the evaluation data is imbalanced.
- Score: 3.2895195535353308
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Few-shot learning is a rapidly evolving area of research in machine learning
where the goal is to classify unlabeled data with only one or "a few" labeled
exemplary samples. Neural networks are typically trained to minimize a distance
metric between labeled exemplary samples and a query set. Early few-shot
approaches use an episodic training process to sub-sample the training data
into few-shot batches, matching the sub-sampling done at evaluation. Recently,
conventional supervised training coupled with a cosine distance has achieved
superior few-shot performance. Despite the diversity of few-shot approaches
over the past decade, most methods still rely on a cosine or Euclidean distance
layer between the latent features of the trained network. In this work, we
investigate the distributions of trained few-shot features and demonstrate that
they can be roughly approximated as exponential distributions. Under this
exponential-distribution assumption, we propose a new maximum log-likelihood
metric for few-shot architectures. We demonstrate that the proposed metric
achieves superior accuracy relative to conventional similarity metrics (e.g.,
cosine and Euclidean distance) and achieves state-of-the-art inductive few-shot
performance. Further, additional gains can be achieved by carefully combining
multiple metrics, and neither of our methods requires the post-processing
feature transformations that are common to many algorithms. Finally, we
demonstrate a novel iterative algorithm designed around our maximum
log-likelihood approach that achieves state-of-the-art transductive few-shot
performance when the evaluation data is imbalanced. We have made our code
publicly available at https://github.com/samuelhess/MLL_FSL/.
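
For intuition, the sketch below illustrates the central idea in the abstract: if the element-wise deviations between a query feature and a class prototype are modeled as exponentially distributed, classification can maximize the resulting log-likelihood instead of a cosine or Euclidean score. This is a minimal illustration under assumed choices (mean prototypes, a simple per-class rate estimator, toy random features), not the authors' exact formulation; their implementation is available at the repository linked above.

```python
import numpy as np

def class_prototypes(support, labels, n_classes):
    """Mean feature vector per class from the labeled support set."""
    return np.stack([support[labels == c].mean(axis=0) for c in range(n_classes)])

def exp_rates(support, labels, prototypes, eps=1e-6):
    """Per-class, per-dimension exponential rates via the MLE
    lambda = 1 / mean(|x - prototype|). The eps guard keeps the 1-shot
    case (zero deviation) finite; this estimator is an assumption for
    illustration, not necessarily the paper's exact procedure."""
    rates = np.empty_like(prototypes)
    for c in range(prototypes.shape[0]):
        dev = np.abs(support[labels == c] - prototypes[c]).mean(axis=0)
        rates[c] = 1.0 / (dev + eps)
    return rates

def mll_scores(query, prototypes, rates):
    """Total exponential log-likelihood of one query under each class:
    sum over dimensions k of [log(lambda_ck) - lambda_ck * |q_k - p_ck|].
    The class with the highest score is the prediction, replacing the
    usual cosine/Euclidean similarity layer."""
    diffs = np.abs(prototypes - query)                  # (n_classes, dim)
    return (np.log(rates) - rates * diffs).sum(axis=1)  # (n_classes,)

# Toy 5-way, 5-shot task with random non-negative 64-d "features".
rng = np.random.default_rng(0)
support = np.abs(rng.normal(size=(25, 64)))
labels = np.repeat(np.arange(5), 5)
protos = class_prototypes(support, labels, 5)
rates = exp_rates(support, labels, protos)
query = np.abs(rng.normal(size=64))
predicted_class = int(np.argmax(mll_scores(query, protos, rates)))
```

Note that in the 1-shot case the per-class deviation estimate collapses to zero, which is why the sketch includes an eps guard; a practical estimator would need shrinkage or pooling across classes.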
Related papers
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Intra-class Adaptive Augmentation with Neighbor Correction for Deep Metric Learning [99.14132861655223]
We propose a novel intra-class adaptive augmentation (IAA) framework for deep metric learning.
We estimate intra-class variations for every class and generate adaptive synthetic samples to support hard-sample mining.
Our method significantly outperforms state-of-the-art methods, improving retrieval performance by 3%-6%.
arXiv Detail & Related papers (2022-11-29T14:52:38Z)
- Compare learning: bi-attention network for few-shot learning [6.559037166322981]
Metric learning, one family of few-shot learning methods, addresses this challenge by first learning a deep distance metric to determine whether a pair of images belongs to the same category.
In this paper, we propose a novel approach named Bi-attention network to compare instances, which can measure the similarity between embeddings of instances precisely, globally, and efficiently.
arXiv Detail & Related papers (2022-03-25T07:39:10Z)
- BatchFormer: Learning to Explore Sample Relationships for Robust Representation Learning [93.38239238988719]
We propose to enable deep neural networks with the ability to learn the sample relationships from each mini-batch.
BatchFormer is applied to the batch dimension of each mini-batch to implicitly explore sample relationships during training.
We perform extensive experiments on over ten datasets and the proposed method achieves significant improvements on different data scarcity applications.
arXiv Detail & Related papers (2022-03-03T05:31:33Z)
- Engineering the Neural Automatic Passenger Counter [0.0]
We explore and exploit various aspects of machine learning to increase reliability, performance, and counting quality.
We show how aggregation techniques such as ensemble quantiles can reduce bias, and we give an idea of the overall spread of the results.
arXiv Detail & Related papers (2022-03-02T14:56:11Z)
- Squeezing Backbone Feature Distributions to the Max for Efficient Few-Shot Learning [3.1153758106426603]
Few-shot classification is a challenging problem due to the uncertainty caused by using few labelled samples.
We propose a novel transfer-based method that processes the feature vectors so that they become closer to Gaussian-like distributions (a generic sketch of such a transform appears after this list).
In the case of transductive few-shot learning, where unlabelled test samples are available during training, we also introduce an optimal-transport-inspired algorithm to further boost performance.
arXiv Detail & Related papers (2021-10-18T16:29:17Z)
- An Empirical Comparison of Instance Attribution Methods for NLP [62.63504976810927]
We evaluate the degree to which different instance attribution methods agree with respect to the importance of training samples.
We find that simple retrieval methods yield training instances that differ from those identified via gradient-based methods.
arXiv Detail & Related papers (2021-04-09T01:03:17Z)
- Frustratingly Simple Few-Shot Object Detection [98.42824677627581]
We find that fine-tuning only the last layer of existing detectors on rare classes is crucial to the few-shot object detection task.
Such a simple approach outperforms the meta-learning methods by roughly 2-20 points on current benchmarks.
arXiv Detail & Related papers (2020-03-16T00:29:14Z)
- An end-to-end approach for the verification problem: learning the right distance [15.553424028461885]
We augment the metric learning setting by introducing a parametric pseudo-distance, trained jointly with the encoder.
We first show it approximates a likelihood ratio which can be used for hypothesis tests.
We observe training is much simplified under the proposed approach compared to metric learning with actual distances.
arXiv Detail & Related papers (2020-02-21T18:46:06Z)
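
Several entries above, notably "Squeezing Backbone Feature Distributions to the Max for Efficient Few-Shot Learning", rely on post-processing transformations that reshape backbone features toward Gaussian-like distributions, precisely the kind of step the main paper reports its methods do not need. The sketch below shows a generic power transform of that flavor; the exponent beta = 0.5 and the eps offset are assumed illustrative values, not parameters taken from that paper.

```python
import numpy as np

def power_transform(features, beta=0.5, eps=1e-6):
    """Map non-negative backbone features x -> (x + eps)**beta and then
    L2-normalize each vector. Compressing large activations pulls the
    heavily skewed per-dimension distributions toward a more symmetric,
    Gaussian-like shape."""
    transformed = np.power(features + eps, beta)
    return transformed / np.linalg.norm(transformed, axis=-1, keepdims=True)

# Skewed (exponential-like) raw features become noticeably less skewed.
rng = np.random.default_rng(0)
raw = rng.exponential(scale=1.0, size=(100, 64))
feats = power_transform(raw)
```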
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences.