Towards Speaker Age Estimation with Label Distribution Learning
- URL: http://arxiv.org/abs/2202.11424v1
- Date: Wed, 23 Feb 2022 11:11:58 GMT
- Title: Towards Speaker Age Estimation with Label Distribution Learning
- Authors: Shijing Si, Jianzong Wang, Junqing Peng, Jing Xiao
- Abstract summary: We utilize the ambiguous information among the age labels, convert each age label into a discrete label distribution and leverage the label distribution learning (LDL) method to fit the data.
Our method naturally combines the age classification and regression approaches, which enhances the robustness of our method.
We conduct experiments on the public NIST SRE08-10 dataset and a real-world dataset, which show that our method outperforms baseline methods by a relatively large margin.
- Score: 26.12240876065871
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Existing methods for speaker age estimation usually treat it as a multi-class
classification or a regression problem. However, precise age identification
remains a challenge due to label ambiguity, i.e., utterances from
adjacent ages of the same person are often indistinguishable. To address this,
we utilize the ambiguous information among the age labels, convert each age
label into a discrete label distribution and leverage the label distribution
learning (LDL) method to fit the data. For each audio data sample, our method
produces an age distribution of its speaker, and on top of the distribution we
also perform two other tasks: age prediction and age uncertainty minimization.
Therefore, our method naturally combines the age classification and regression
approaches, which enhances the robustness of our method. We conduct experiments
on the public NIST SRE08-10 dataset and a real-world dataset, which show
that our method outperforms baseline methods by a relatively large margin,
yielding a 10% reduction in terms of mean absolute error (MAE) on a real-world
dataset.
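The abstract's pipeline (age label to discrete distribution, distribution fitting, then age prediction and uncertainty minimization on top of the predicted distribution) can be illustrated with a minimal sketch. The abstract does not state the shape of the target distribution, the loss terms, or their weights, so everything specific here is an assumption: a discretized Gaussian target with `sigma=2.0`, a KL-divergence fitting term, an absolute-error term on the expected age, a variance penalty, the weights `w_l1` and `w_var`, and the 10 to 90 age range. All function names are hypothetical.

```python
import math

def label_distribution(age, ages, sigma=2.0):
    """Turn a scalar age into a discrete distribution over `ages`.
    Assumption: a discretized Gaussian; the abstract does not name the shape."""
    weights = [math.exp(-((a - age) ** 2) / (2 * sigma ** 2)) for a in ages]
    total = sum(weights)
    return [w / total for w in weights]

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def kl_divergence(target, pred, eps=1e-12):
    """KL(target || pred), used here as the distribution-fitting term."""
    return sum(t * math.log((t + eps) / (p + eps)) for t, p in zip(target, pred))

def expected_age(pred, ages):
    """Age prediction: the mean of the predicted distribution."""
    return sum(p * a for p, a in zip(pred, ages))

def age_variance(pred, ages):
    """Age uncertainty: the variance of the predicted distribution."""
    mean = expected_age(pred, ages)
    return sum(p * (a - mean) ** 2 for p, a in zip(pred, ages))

def combined_loss(logits, age, ages, sigma=2.0, w_l1=1.0, w_var=0.05):
    """Sum of the three terms; the weights are illustrative assumptions."""
    target = label_distribution(age, ages, sigma)
    pred = softmax(logits)
    return (kl_divergence(target, pred)
            + w_l1 * abs(expected_age(pred, ages) - age)
            + w_var * age_variance(pred, ages))

ages = list(range(10, 91))             # assumed age bins, one per year
target = label_distribution(35, ages)  # soft label for a 35-year-old speaker
uniform_logits = [0.0] * len(ages)     # an uninformed prediction
matched_logits = [math.log(t) for t in target]  # a prediction equal to the target
loss_uniform = combined_loss(uniform_logits, 35, ages)
loss_matched = combined_loss(matched_logits, 35, ages)
```

In this sketch the classification view enters through the distribution-fitting term and the regression view through the expected-age term, which is one plausible reading of how the method "combines the age classification and regression approaches".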
Related papers
- Extracting Clean and Balanced Subset for Noisy Long-tailed Classification [66.47809135771698]
We develop a novel pseudo labeling method using class prototypes from the perspective of distribution matching.
By setting a manually-specific probability measure, we can reduce the side-effects of noisy and long-tailed data simultaneously.
Our method can extract this class-balanced subset with clean labels, which brings effective performance gains for long-tailed classification with label noise.
arXiv Detail & Related papers (2024-04-10T07:34:37Z)
- Soft Curriculum for Learning Conditional GANs with Noisy-Labeled and Uncurated Unlabeled Data [70.25049762295193]
We introduce a novel conditional image generation framework that accepts noisy-labeled and uncurated data during training.
We propose soft curriculum learning, which assigns instance-wise weights for adversarial training while assigning new labels for unlabeled data.
Our experiments show that our approach outperforms existing semi-supervised and label-noise robust methods in terms of both quantitative and qualitative performance.
arXiv Detail & Related papers (2023-07-17T08:31:59Z)
- Partial-Label Regression [54.74984751371617]
Partial-label learning is a weakly supervised learning setting that allows each training example to be annotated with a set of candidate labels.
Previous studies on partial-label learning only focused on the classification setting where candidate labels are all discrete.
In this paper, we provide the first attempt to investigate partial-label regression, where each training example is annotated with a set of real-valued candidate labels.
arXiv Detail & Related papers (2023-06-15T09:02:24Z)
- SVLDL: Improved Speaker Age Estimation Using Selective Variance Label Distribution Learning [24.57668015470307]
We propose the selective variance label distribution learning (SVLDL) method to adapt the variance of different age distributions.
The model uses WavLM as the speech feature extractor and adds the auxiliary task of gender recognition to further improve performance.
Experiments show that the model achieves state-of-the-art performance on all aspects of the NIST SRE08-10 dataset and a real-world dataset.
arXiv Detail & Related papers (2022-10-18T01:34:31Z)
- Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration [18.59803726676361]
Instance-dependent label noise is realistic but rather challenging, where the label-corruption process depends on instances directly.
It causes a severe distribution shift between the distributions of training and test data, which impairs the generalization of trained models.
In this paper, to address the distribution shift in learning with instance-dependent label noise, a dynamic distribution-calibration strategy is adopted.
arXiv Detail & Related papers (2022-10-11T03:50:52Z)
- Re-distributing Biased Pseudo Labels for Semi-supervised Semantic Segmentation: A Baseline Investigation [30.688753736660725]
We present a simple and yet effective Distribution Alignment and Random Sampling (DARS) method to produce unbiased pseudo labels.
Our method performs favorably in comparison with state-of-the-art approaches.
arXiv Detail & Related papers (2021-07-23T14:45:14Z)
- Using multiple losses for accurate facial age estimation [6.851375622634309]
We propose a simple yet effective approach for age estimation, which improves the performance compared to classification-based methods.
We validate the Age-Granularity-Net framework on the CVPR Chalearn 2016 dataset, and extensive experiments show that the proposed approach can reduce the prediction error compared to any individual loss.
arXiv Detail & Related papers (2021-06-17T11:18:16Z)
- Disentangling Sampling and Labeling Bias for Learning in Large-Output Spaces [64.23172847182109]
We show that different negative sampling schemes implicitly trade off performance on dominant versus rare labels.
We provide a unified means to explicitly tackle both sampling bias, arising from working with a subset of all labels, and labeling bias, which is inherent to the data due to label imbalance.
arXiv Detail & Related papers (2021-05-12T15:40:13Z)
- Semi-supervised Long-tailed Recognition using Alternate Sampling [95.93760490301395]
The main challenges in long-tailed recognition come from the imbalanced data distribution and sample scarcity in its tail classes.
We propose a new recognition setting, namely semi-supervised long-tailed recognition.
We demonstrate significant accuracy improvements over other competitive methods on two datasets.
arXiv Detail & Related papers (2021-05-01T00:43:38Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
In particular, face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.