VL4Pose: Active Learning Through Out-Of-Distribution Detection For Pose
Estimation
- URL: http://arxiv.org/abs/2210.06028v1
- Date: Wed, 12 Oct 2022 09:03:55 GMT
- Title: VL4Pose: Active Learning Through Out-Of-Distribution Detection For Pose
Estimation
- Authors: Megh Shukla, Roshan Roy, Pankaj Singh, Shuaib Ahmed, Alexandre Alahi
- Abstract summary: We introduce VL4Pose, a first principles approach for active learning through out-of-distribution detection.
Our solution involves modelling the pose through a simple parametric Bayesian network trained via maximum likelihood estimation.
We perform qualitative and quantitative experiments on three datasets: MPII, LSP and ICVL, spanning human and hand pose estimation.
- Score: 79.50280069412847
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Advances in computing have enabled widespread access to pose estimation,
creating new sources of data streams. Unlike mock set-ups for data collection,
tapping into these data streams through on-device active learning allows us to
directly sample from the real world to improve the spread of the training
distribution. However, on-device computing power is limited, implying that any
candidate active learning algorithm should have a low compute footprint while
also being reliable. Although multiple algorithms cater to pose estimation,
they either use extensive compute to power state-of-the-art results or are not
competitive in low-resource settings. We address this limitation with VL4Pose
(Visual Likelihood For Pose Estimation), a first principles approach for active
learning through out-of-distribution detection. We begin with a simple premise:
pose estimators often predict incoherent poses for out-of-distribution samples.
Hence, can we identify a distribution of poses the model has been trained on,
to identify incoherent poses the model is unsure of? Our solution involves
modelling the pose through a simple parametric Bayesian network trained via
maximum likelihood estimation. Therefore, poses incurring a low likelihood
within our framework are out-of-distribution samples, making them suitable
candidates for annotation. We also observe two useful side-outcomes: VL4Pose
in principle yields better uncertainty estimates by unifying joint- and
pose-level ambiguity, as well as the unintentional but welcome ability of
VL4Pose to perform pose refinement in limited scenarios. We perform qualitative and
quantitative experiments on three datasets: MPII, LSP and ICVL, spanning human
and hand pose estimation. Finally, we note that VL4Pose is simple,
computationally inexpensive and competitive, making it suitable for challenging
tasks such as on-device active learning.
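To make the core idea concrete, here is a minimal sketch, assuming (purely for illustration; it does not reproduce the paper's actual Bayesian network or parameterization) that each skeletal limb length follows a univariate Gaussian fit by maximum likelihood on the training poses, and that unlabelled images whose predicted poses score the lowest likelihood are treated as out-of-distribution and proposed for annotation. The skeleton edges, function names and 2D keypoint format are illustrative assumptions.
```python
import numpy as np

# Illustrative skeleton: pairs of joint indices forming limbs (not the paper's exact topology).
SKELETON_EDGES = [(0, 1), (1, 2), (2, 3), (3, 4)]

def limb_lengths(pose, edges=SKELETON_EDGES):
    """Euclidean length of each limb for a pose of shape (num_joints, 2)."""
    return np.array([np.linalg.norm(pose[i] - pose[j]) for i, j in edges])

def fit_limb_gaussians(train_poses):
    """Maximum-likelihood Gaussian (mean, std) per limb, estimated from annotated training poses."""
    lengths = np.stack([limb_lengths(p) for p in train_poses])  # (num_poses, num_limbs)
    return lengths.mean(axis=0), lengths.std(axis=0) + 1e-6

def pose_log_likelihood(pose, mu, sigma):
    """Sum of per-limb Gaussian log-densities; incoherent poses score low."""
    l = limb_lengths(pose)
    return float(np.sum(-0.5 * ((l - mu) / sigma) ** 2 - np.log(sigma * np.sqrt(2.0 * np.pi))))

def select_for_annotation(predicted_poses, mu, sigma, budget):
    """Active-learning step: return indices of the `budget` lowest-likelihood (most OOD) predictions."""
    scores = np.array([pose_log_likelihood(p, mu, sigma) for p in predicted_poses])
    return np.argsort(scores)[:budget]
```
The sketch only conveys the ranking principle: a pose whose configuration is unlikely under parameters fit on the training distribution is a strong candidate for annotation, and scoring costs only a handful of arithmetic operations per pose, which is what makes this style of approach attractive under on-device compute budgets.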
Related papers
- Data Adaptive Traceback for Vision-Language Foundation Models in Image Classification [34.37262622415682]
We propose a new adaptation framework called Data Adaptive Traceback.
Specifically, we utilize a zero-shot-based method to extract the most downstream task-related subset of the pre-training data.
We adopt a pseudo-label-based semi-supervised technique to reuse the pre-training images and a vision-language contrastive learning method to address the confirmation bias issue in semi-supervised learning.
arXiv Detail & Related papers (2024-07-11T18:01:58Z)
- Probabilistic Contrastive Learning for Long-Tailed Visual Recognition [78.70453964041718]
Long-tailed distributions frequently emerge in real-world data, where a large number of minority categories contain a limited number of samples.
Recent investigations have revealed that supervised contrastive learning exhibits promising potential in alleviating the data imbalance.
We propose a novel probabilistic contrastive (ProCo) learning algorithm that estimates the data distribution of the samples from each class in the feature space.
arXiv Detail & Related papers (2024-03-11T13:44:49Z)
- Towards Seamless Adaptation of Pre-trained Models for Visual Place Recognition [72.35438297011176]
We propose a novel method to realize seamless adaptation of pre-trained models for visual place recognition (VPR).
Specifically, to obtain both global and local features that focus on salient landmarks for discriminating places, we design a hybrid adaptation method.
Experimental results show that our method outperforms the state-of-the-art methods with less training data and training time.
arXiv Detail & Related papers (2024-02-22T12:55:01Z)
- ProbVLM: Probabilistic Adapter for Frozen Vision-Language Models [69.50316788263433]
We propose ProbVLM, a probabilistic adapter that estimates probability distributions for the embeddings of pre-trained vision-language models.
We quantify the calibration of embedding uncertainties in retrieval tasks and show that ProbVLM outperforms other methods.
We present a novel technique for visualizing the embedding distributions using a large-scale pre-trained latent diffusion model.
arXiv Detail & Related papers (2023-07-01T18:16:06Z)
- Gaussian Switch Sampling: A Second Order Approach to Active Learning [11.775252660867285]
In active learning, acquisition functions often define informativeness directly in terms of a sample's representation position within the model manifold.
We propose a grounded second-order definition of information content and sample importance within the context of active learning.
We show that our definition produces highly accurate importance scores even when the model representations are constrained by the lack of training data.
arXiv Detail & Related papers (2023-02-16T15:24:56Z)
- Uncertainty Estimation for Language Reward Models [5.33024001730262]
Language models can learn a range of capabilities from unsupervised training on text corpora.
It is often easier for humans to choose between options than to provide labeled data, and prior work has achieved state-of-the-art performance by training a reward model from such preference comparisons.
We seek to address these problems via uncertainty estimation, which can improve sample efficiency and robustness using active learning and risk-averse reinforcement learning.
arXiv Detail & Related papers (2022-03-14T20:13:21Z)
- Knowledge-driven Active Learning [70.37119719069499]
Active learning strategies aim at minimizing the amount of labelled data required to train a Deep Learning model.
Most active strategies are based on uncertain sample selection, and are often restricted to samples lying close to the decision boundary.
Here we propose to take common domain knowledge into consideration and enable non-expert users to train a model with fewer samples.
arXiv Detail & Related papers (2021-10-15T06:11:53Z)
- EGL++: Extending Expected Gradient Length to Active Learning for Human Pose Estimation [2.0305676256390934]
State of the art human pose estimation models rely on large quantities of labelled data for robust performance.
EGL++ is a novel algorithm that extends expected gradient length to tasks where discrete labels are not available.
arXiv Detail & Related papers (2021-04-19T17:56:59Z)
- Fast Uncertainty Quantification for Deep Object Pose Estimation [91.09217713805337]
Deep learning-based object pose estimators are often unreliable and overconfident.
In this work, we propose a simple, efficient, and plug-and-play UQ method for 6-DoF object pose estimation.
arXiv Detail & Related papers (2020-11-16T06:51:55Z)
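Several of the entries above (Gaussian Switch Sampling, Knowledge-driven Active Learning, EGL++, and the plug-and-play UQ work) share the same pattern: score each unlabelled sample by some notion of model uncertainty, then query labels for the highest-scoring ones. The sketch below illustrates that generic acquisition step using ensemble disagreement as the uncertainty proxy; the disagreement measure and function names are assumptions for illustration and do not reproduce the procedure of any single paper listed above.
```python
import numpy as np

def ensemble_disagreement(predictions):
    """Uncertainty proxy: mean per-joint variance across an ensemble of pose predictions.

    `predictions` has shape (num_models, num_joints, 2); larger spread across
    the ensemble is read as higher uncertainty for the sample.
    """
    return float(predictions.var(axis=0).mean())

def acquire(unlabelled_ids, predict_fns, budget):
    """Generic uncertainty-sampling loop: query labels for the most uncertain samples."""
    scores = []
    for sample_id in unlabelled_ids:
        preds = np.stack([predict(sample_id) for predict in predict_fns])  # one pose per model
        scores.append(ensemble_disagreement(preds))
    order = np.argsort(scores)[::-1]  # most uncertain first
    return [unlabelled_ids[i] for i in order[:budget]]
```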
This list is automatically generated from the titles and abstracts of the papers on this site.