SVLDL: Improved Speaker Age Estimation Using Selective Variance Label
Distribution Learning
- URL: http://arxiv.org/abs/2210.09524v1
- Date: Tue, 18 Oct 2022 01:34:31 GMT
- Title: SVLDL: Improved Speaker Age Estimation Using Selective Variance Label
Distribution Learning
- Authors: Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao
- Abstract summary: We propose a selective variance label distribution learning (SVLDL) method to adapt the variance of different age distributions.
The model uses WavLM as the speech feature extractor and adds the auxiliary task of gender recognition to further improve performance.
Experiments show that the model achieves state-of-the-art performance on all aspects of the NIST SRE08-10 dataset and a real-world dataset.
- Score: 24.57668015470307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating age from a single speech utterance is a classic and challenging topic.
Although Label Distribution Learning (LDL) can represent adjacent
indistinguishable ages well, the uncertainty of the age estimate for each
utterance varies from person to person, i.e., the variance of the age
distribution is different. To address this issue, we propose a selective variance
label distribution learning (SVLDL) method to adapt the variance of different
age distributions. Furthermore, the model uses WavLM as the speech feature
extractor and adds the auxiliary task of gender recognition to further improve
the performance. Two tricks are applied on the loss function to enhance the
robustness of the age estimation and improve the quality of the fitted age
distribution. Extensive experiments show that the model achieves
state-of-the-art performance on all aspects of the NIST SRE08-10 dataset and a
real-world dataset.
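The label distribution idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian shape, the age range 0-100, the two sigma values, and all function names are assumptions chosen for illustration, and the way SVLDL selects a variance per utterance is not reproduced here.

```python
import math

def label_distribution(age, sigma, ages):
    """Turn a scalar age label into a discrete distribution over `ages`.

    Assumption: a discretized Gaussian centred on `age`, normalized to sum to 1.
    """
    weights = [math.exp(-((a - age) ** 2) / (2.0 * sigma ** 2)) for a in ages]
    total = sum(weights)
    return [w / total for w in weights]

def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), a common loss for fitting label distributions."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted))

def expected_age(dist, ages):
    """Point estimate read off a distribution: its expectation."""
    return sum(p * a for p, a in zip(dist, ages))

ages = list(range(0, 101))                   # hypothetical age bins
narrow = label_distribution(30, 1.0, ages)   # low-uncertainty label
wide = label_distribution(30, 5.0, ages)     # high-uncertainty label

print(round(expected_age(narrow, ages), 3))
print(round(expected_age(wide, ages), 3))
print(kl_divergence(narrow, wide) > 0.0)
```

Both distributions are centred on the same age but spread their mass differently, which is the property a per-utterance variance is meant to capture.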
Related papers
- Generalizable Low-Resource Activity Recognition with Diverse and
Discriminative Representation Learning [24.36351102003414]
Human activity recognition (HAR) is a time series classification task that focuses on identifying the motion patterns from human sensor readings.
We propose a novel approach called Diverse and Discriminative representation Learning (DDLearn) for generalizable low-resource HAR.
Our method significantly outperforms state-of-the-art methods by an average accuracy improvement of 9.5%.
arXiv Detail & Related papers (2023-05-25T08:24:22Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective
Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
- Fairness Improves Learning from Noisily Labeled Long-Tailed Data [119.0612617460727]
Long-tailed and noisily labeled data frequently appear in real-world applications and impose significant challenges for learning.
We introduce the Fairness Regularizer (FR), inspired by regularizing the performance gap between any two sub-populations.
We show that the introduced fairness regularizer improves the performances of sub-populations on the tail and the overall learning performance.
arXiv Detail & Related papers (2023-03-22T03:46:51Z)
- Adaptive Mean-Residue Loss for Robust Facial Age Estimation [7.667560350473354]
We propose a loss function for robust facial age estimation via distribution learning.
Experimental results on the FG-NET and CLAP2016 datasets have validated the effectiveness of the proposed loss.
arXiv Detail & Related papers (2022-03-31T16:28:34Z)
- Towards Speaker Age Estimation with Label Distribution Learning [26.12240876065871]
We utilize the ambiguous information among the age labels, convert each age label into a discrete label distribution and leverage the label distribution learning (LDL) method to fit the data.
Our method naturally combines the age classification and regression approaches, which enhances the robustness of our method.
We conduct experiments on the public NIST SRE08-10 dataset and a real-world dataset, which show that our method outperforms baseline methods by a relatively large margin.
arXiv Detail & Related papers (2022-02-23T11:11:58Z) - using multiple losses for accurate facial age estimation [6.851375622634309]
We propose a simple yet effective approach for age estimation, which improves the performance compared to classification-based methods.
We validate the Age-Granularity-Net framework on the CVPR Chalearn 2016 dataset, and extensive experiments show that the proposed approach can reduce the prediction error compared to any individual loss.
arXiv Detail & Related papers (2021-06-17T11:18:16Z) - Balancing Biases and Preserving Privacy on Balanced Faces in the Wild [50.915684171879036]
There are demographic biases present in current facial recognition (FR) models.
We introduce our Balanced Faces in the Wild dataset to measure these biases across different ethnic and gender subgroups.
We find that relying on a single score threshold to differentiate between genuine and imposter sample pairs leads to suboptimal results.
We propose a novel domain adaptation learning scheme that uses facial features extracted from state-of-the-art neural networks.
arXiv Detail & Related papers (2021-03-16T15:05:49Z) - PML: Progressive Margin Loss for Long-tailed Age Classification [9.020103398777653]
We propose a progressive margin loss (PML) approach for unconstrained facial age classification.
Our PML aims to adaptively refine the age label pattern by enforcing a couple of margins.
arXiv Detail & Related papers (2021-03-03T02:47:09Z)
- Unsupervised neural adaptation model based on optimal transport for
spoken language identification [54.96267179988487]
Due to the mismatch of statistical distributions of acoustic speech between training and testing sets, the performance of spoken language identification (SLID) could be drastically degraded.
We propose an unsupervised neural adaptation model to deal with the distribution mismatch problem for SLID.
arXiv Detail & Related papers (2020-12-24T07:37:19Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
Face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.