SVLDL: Improved Speaker Age Estimation Using Selective Variance Label
Distribution Learning
- URL: http://arxiv.org/abs/2210.09524v1
- Date: Tue, 18 Oct 2022 01:34:31 GMT
- Title: SVLDL: Improved Speaker Age Estimation Using Selective Variance Label
Distribution Learning
- Authors: Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao
- Abstract summary: We propose a selective variance label distribution learning (SVLDL) method to adapt the variance of different age distributions.
The model uses WavLM as the speech feature extractor and adds the auxiliary task of gender recognition to further improve the performance.
Experiments show that the model achieves state-of-the-art performance on all aspects of the NIST SRE08-10 and a real-world dataset.
- Score: 24.57668015470307
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Estimating age from a single speech utterance is a classic and challenging topic.
Although Label Distribution Learning (LDL) can represent adjacent
indistinguishable ages well, the uncertainty of the age estimate for each
utterance varies from person to person, i.e., the variance of the age
distribution is different. To address this issue, we propose a selective variance
label distribution learning (SVLDL) method to adapt the variance of different
age distributions. Furthermore, the model uses WavLM as the speech feature
extractor and adds the auxiliary task of gender recognition to further improve
the performance. Two tricks are applied on the loss function to enhance the
robustness of the age estimation and improve the quality of the fitted age
distribution. Extensive experiments show that the model achieves
state-of-the-art performance on all aspects of the NIST SRE08-10 and a
real-world dataset.
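The idea of a label distribution with a per-utterance variance can be illustrated with a minimal sketch. It turns an integer age label into a discretized Gaussian over ages and then picks one of several candidate standard deviations. This is not the authors' implementation: the function names, the candidate values (1.0, 2.0, 4.0), the age range 0 to 100, and the selection rule (keep the candidate whose target gives the lowest KL divergence to the predicted distribution) are all assumptions made for illustration, since the abstract does not describe how the variance is selected.

```python
import math


def gaussian_label_distribution(age, sigma, min_age=0, max_age=100):
    """Discretize a Gaussian centred on `age` over integer ages and normalize it."""
    weights = [math.exp(-0.5 * ((a - age) / sigma) ** 2)
               for a in range(min_age, max_age + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted) for two discrete distributions of equal length."""
    return sum(t * math.log((t + eps) / (p + eps))
               for t, p in zip(target, predicted) if t > 0.0)


def select_variance_target(age, predicted, sigmas=(1.0, 2.0, 4.0), min_age=0):
    """Assumed rule (hypothetical): return (loss, sigma) for the candidate
    sigma whose target distribution has the lowest KL loss."""
    max_age = min_age + len(predicted) - 1
    losses = [
        (kl_divergence(gaussian_label_distribution(age, s, min_age, max_age),
                       predicted), s)
        for s in sigmas
    ]
    return min(losses)


def expected_age(dist, min_age=0):
    """Point estimate of age: the mean of the distribution."""
    return sum((min_age + i) * p for i, p in enumerate(dist))


# Stand-in for a model output: a broad distribution centred on age 40.
predicted = gaussian_label_distribution(40, 4.0)
loss, sigma = select_variance_target(40, predicted)
print(sigma, round(loss, 6), round(expected_age(predicted), 3))
```

In this sketch a confident prediction would match a narrow target and an uncertain one a broad target, which is one plausible reading of "adapting the variance of different age distributions".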
Related papers
- Navigating Semantic Drift in Task-Agnostic Class-Incremental Learning [51.177789437682954]
Class-incremental learning (CIL) seeks to enable a model to sequentially learn new classes while retaining knowledge of previously learned ones.
Balancing flexibility and stability remains a significant challenge, particularly when the task ID is unknown.
We propose a novel semantic drift calibration method that incorporates mean shift compensation and covariance calibration.
arXiv Detail & Related papers (2025-02-11T13:57:30Z)
- From Age Estimation to Age-Invariant Face Recognition: Generalized Age Feature Extraction Using Order-Enhanced Contrastive Learning [23.817867981093382]
Generalized age feature extraction is crucial for age-related facial analysis tasks.
We propose Order-Enhanced Contrastive Learning (OrdCon) to minimize the domain gap across different datasets and scenarios.
We demonstrate that our proposed method achieves comparable results to state-of-the-art methods on various benchmark datasets.
arXiv Detail & Related papers (2025-01-03T11:23:52Z)
- ASPEST: Bridging the Gap Between Active Learning and Selective Prediction [56.001808843574395]
Selective prediction aims to learn a reliable model that abstains from making predictions when uncertain.
Active learning aims to lower the overall labeling effort, and hence human dependence, by querying the most informative examples.
In this work, we introduce a new learning paradigm, active selective prediction, which aims to query more informative samples from the shifted target domain.
arXiv Detail & Related papers (2023-04-07T23:51:07Z)
- Robust Outlier Rejection for 3D Registration with Variational Bayes [70.98659381852787]
We develop a novel variational non-local network-based outlier rejection framework for robust alignment.
We propose a voting-based inlier searching strategy to cluster the high-quality hypothetical inliers for transformation estimation.
arXiv Detail & Related papers (2023-04-04T03:48:56Z)
- Fairness Improves Learning from Noisily Labeled Long-Tailed Data [119.0612617460727]
Long-tailed and noisily labeled data frequently appear in real-world applications and impose significant challenges for learning.
We introduce the Fairness Regularizer (FR), inspired by regularizing the performance gap between any two sub-populations.
We show that the introduced fairness regularizer improves the performances of sub-populations on the tail and the overall learning performance.
arXiv Detail & Related papers (2023-03-22T03:46:51Z)
- Adaptive Mean-Residue Loss for Robust Facial Age Estimation [7.667560350473354]
We propose a loss function for robust facial age estimation via distribution learning.
Experimental results in the datasets FG-NET and CLAP2016 have validated the effectiveness of the proposed loss.
arXiv Detail & Related papers (2022-03-31T16:28:34Z)
- Towards Speaker Age Estimation with Label Distribution Learning [26.12240876065871]
We utilize the ambiguous information among the age labels, convert each age label into a discrete label distribution and leverage the label distribution learning (LDL) method to fit the data.
Our method naturally combines the age classification and regression approaches, which enhances the robustness of our method.
We conduct experiments on the public NIST SRE08-10 dataset and a real-world dataset, which exhibit that our method outperforms baseline methods by a relatively large margin.
arXiv Detail & Related papers (2022-02-23T11:11:58Z)
- Using Multiple Losses for Accurate Facial Age Estimation [6.851375622634309]
We propose a simple yet effective approach for age estimation, which improves the performance compared to classification-based methods.
We validate the Age-Granularity-Net framework on the CVPR Chalearn 2016 dataset, and extensive experiments show that the proposed approach can reduce the prediction error compared to any individual loss.
arXiv Detail & Related papers (2021-06-17T11:18:16Z)
- PML: Progressive Margin Loss for Long-tailed Age Classification [9.020103398777653]
We propose a progressive margin loss (PML) approach for unconstrained facial age classification.
Our PML aims to adaptively refine the age label pattern by enforcing a couple of margins.
arXiv Detail & Related papers (2021-03-03T02:47:09Z)
- Unsupervised neural adaptation model based on optimal transport for spoken language identification [54.96267179988487]
Due to the mismatch of statistical distributions of acoustic speech between training and testing sets, the performance of spoken language identification (SLID) could be drastically degraded.
We propose an unsupervised neural adaptation model to deal with the distribution mismatch problem for SLID.
arXiv Detail & Related papers (2020-12-24T07:37:19Z)
- Enhancing Facial Data Diversity with Style-based Face Aging [59.984134070735934]
In particular, face datasets are typically biased in terms of attributes such as gender, age, and race.
We propose a novel, generative style-based architecture for data augmentation that captures fine-grained aging patterns.
We show that the proposed method outperforms state-of-the-art algorithms for age transfer.
arXiv Detail & Related papers (2020-06-06T21:53:44Z)
This list is automatically generated from the titles and abstracts of the papers on this site.