Audio-based Kinship Verification Using Age Domain Conversion
- URL: http://arxiv.org/abs/2410.11120v1
- Date: Mon, 14 Oct 2024 22:08:57 GMT
- Title: Audio-based Kinship Verification Using Age Domain Conversion
- Authors: Qiyang Sun, Alican Akman, Xin Jing, Manuel Milling, Björn W. Schuller
- Abstract summary: A key challenge in the task arises from differences in age across samples from different individuals.
We utilise the optimised CycleGAN-VC3 network to perform age-audio conversion to generate the in-domain audio.
The generated audio dataset is employed to extract a range of features, which are then fed into a metric learning architecture to verify kinship.
- Score: 39.4890403254022
- License:
- Abstract: Audio-based kinship verification (AKV) is important in many domains, such as home security monitoring, forensic identification, and social network analysis. A key challenge in the task arises from differences in age across samples from different individuals, which can be interpreted as a domain bias in a cross-domain verification task. To address this issue, we introduce the notion of an "age-standardised domain", wherein we utilise the optimised CycleGAN-VC3 network to perform age-audio conversion and generate in-domain audio. The generated audio dataset is employed to extract a range of features, which are then fed into a metric learning architecture to verify kinship. Experiments are conducted on the KAN_AV audio dataset, which contains age and kinship labels. The results demonstrate that the method markedly enhances the accuracy of kinship verification, while also offering novel insights for future kinship verification research.
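The abstract does not specify the verification architecture in detail, but the pipeline it describes (features extracted from age-standardised audio, fed into a metric-learning model) maps naturally onto a siamese embedding network trained with a contrastive objective. The PyTorch sketch below is a hypothetical illustration of that verification stage only; the feature dimension, layer sizes, margin, and decision threshold are assumptions rather than values from the paper, and the CycleGAN-VC3 age conversion is assumed to have been applied upstream.

```python
# Hypothetical sketch of a metric-learning kinship verifier (not the authors' code).
# Assumes fixed-length feature vectors extracted from age-standardised audio.
import torch
import torch.nn as nn
import torch.nn.functional as F

class EmbeddingNet(nn.Module):
    """Maps an audio feature vector to an L2-normalised embedding."""
    def __init__(self, feat_dim: int = 512, emb_dim: int = 128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feat_dim, 256), nn.ReLU(),
            nn.Linear(256, emb_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return F.normalize(self.net(x), dim=-1)

def contrastive_loss(z1, z2, label, margin: float = 0.5):
    """label = 1 for kin pairs, 0 for non-kin pairs (margin is illustrative)."""
    d = 1.0 - F.cosine_similarity(z1, z2)          # cosine distance
    return (label * d.pow(2) +
            (1 - label) * F.relu(margin - d).pow(2)).mean()

def verify(model, feat_a, feat_b, threshold: float = 0.3) -> bool:
    """Declare kinship if the embedding distance falls below a tuned threshold."""
    with torch.no_grad():
        d = 1.0 - F.cosine_similarity(model(feat_a), model(feat_b))
    return bool(d.item() < threshold)

# Toy usage with random stand-ins for extracted features.
model = EmbeddingNet()
fa, fb = torch.randn(1, 512), torch.randn(1, 512)
loss = contrastive_loss(model(fa), model(fb), label=torch.tensor([1.0]))
print(loss.item(), verify(model, fa, fb))
```

In practice the decision threshold would be tuned on a held-out set of labelled kin and non-kin pairs; the toy values above are placeholders.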
Related papers
- Benchmarking Cross-Domain Audio-Visual Deception Detection [45.342156006617394]
We present the first cross-domain audio-visual deception detection benchmark.
We compare single-to-single and multi-to-single domain generalization performance.
We propose an algorithm to enhance the generalization performance.
arXiv Detail & Related papers (2024-05-11T12:06:31Z)
- Explaining Cross-Domain Recognition with Interpretable Deep Classifier [100.63114424262234]
The Interpretable Deep Classifier (IDC) learns the nearest source samples of a target sample as evidence upon which the classifier makes its decision.
Our IDC leads to a more explainable model with almost no accuracy degradation and effectively calibrates classification for optimum reject options.
arXiv Detail & Related papers (2022-11-15T15:58:56Z)
- Cross-domain Voice Activity Detection with Self-Supervised Representations [9.02236667251654]
Voice Activity Detection (VAD) aims at detecting speech segments in an audio signal.
Current state-of-the-art methods focus on training a neural network exploiting features directly contained in the acoustics.
We show that representations based on Self-Supervised Learning (SSL) can adapt well to different domains.
arXiv Detail & Related papers (2022-09-22T14:53:44Z)
- Dual Domain-Adversarial Learning for Audio-Visual Saliency Prediction [17.691475370621]
Deep convolutional neural networks (CNNs) show strong capacity in coping with the audio-visual saliency prediction task.
Due to various factors such as shooting scenes and weather, there often exists a moderate distribution discrepancy between source training data and target testing data.
We propose a dual domain-adversarial learning algorithm to mitigate the domain discrepancy between source and target data.
arXiv Detail & Related papers (2022-08-10T08:50:32Z)
- Frequency Spectrum Augmentation Consistency for Domain Adaptive Object Detection [107.52026281057343]
We introduce a Frequency Spectrum Augmentation Consistency (FSAC) framework with four different low-frequency filter operations.
In the first stage, we utilize all the original and augmented source data to train an object detector.
In the second stage, augmented source and target data with pseudo labels are adopted to perform the self-training for prediction consistency.
arXiv Detail & Related papers (2021-12-16T04:07:01Z)
- TASK3 DCASE2021 Challenge: Sound event localization and detection using squeeze-excitation residual CNNs [4.4973334555746]
This study builds on the one carried out by the same team last year, examining how the squeeze-excitation technique improves results on each of the datasets.
The modification improves system performance over the baseline on the MIC dataset.
arXiv Detail & Related papers (2021-07-30T11:34:15Z)
- PILOT: Introducing Transformers for Probabilistic Sound Event Localization [107.78964411642401]
This paper introduces a novel transformer-based sound event localization framework, where temporal dependencies in the received multi-channel audio signals are captured via self-attention mechanisms.
The framework is evaluated on three publicly available multi-source sound event localization datasets and compared against state-of-the-art methods in terms of localization error and event detection accuracy.
arXiv Detail & Related papers (2021-06-07T18:29:19Z)
- Cross-domain Adaptation with Discrepancy Minimization for Text-independent Forensic Speaker Verification [61.54074498090374]
This study introduces a CRSS-Forensics audio dataset collected in multiple acoustic environments.
We pre-train a CNN-based network on the VoxCeleb data, then fine-tune part of the high-level network layers with clean speech from CRSS-Forensics.
arXiv Detail & Related papers (2020-09-05T02:54:33Z)
- Unsupervised Domain Adaptation for Acoustic Scene Classification Using Band-Wise Statistics Matching [69.24460241328521]
Machine learning algorithms can be negatively affected by mismatches between training (source) and test (target) data distributions.
We propose an unsupervised domain adaptation method that aligns the first- and second-order sample statistics of each frequency band of target-domain acoustic scenes to those of the source-domain training dataset (see the sketch after this entry).
We show that the proposed method outperforms the state-of-the-art unsupervised methods found in the literature in terms of both source- and target-domain classification accuracy.
arXiv Detail & Related papers (2020-04-30T23:56:05Z)
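The band-wise statistics matching summarised in the entry above reduces to a simple per-band shift and scale: standardise each frequency band of the target-domain features and rescale it to the source-domain mean and standard deviation. The NumPy snippet below is an illustrative sketch under that reading, not the original implementation; the log-mel feature layout (frames × bands) and the toy statistics are assumptions.

```python
# Illustrative reconstruction of band-wise first/second-order statistics matching
# (not the original implementation). Features are assumed to be log-mel matrices
# of shape (frames, bands); statistics are computed over the time axis.
import numpy as np

def band_stats(features: np.ndarray, eps: float = 1e-8):
    """Per-band mean and standard deviation over all frames."""
    return features.mean(axis=0), features.std(axis=0) + eps

def match_band_stats(target: np.ndarray,
                     src_mean: np.ndarray,
                     src_std: np.ndarray) -> np.ndarray:
    """Shift and scale each target band to the source-domain statistics."""
    tgt_mean, tgt_std = band_stats(target)
    return (target - tgt_mean) / tgt_std * src_std + src_mean

# Toy usage: source statistics from a hypothetical training set,
# then adaptation of one target-domain recording.
rng = np.random.default_rng(0)
source = rng.normal(loc=-3.0, scale=2.0, size=(10_000, 64))   # stand-in log-mels
target = rng.normal(loc=-1.0, scale=1.0, size=(500, 64))
adapted = match_band_stats(target, *band_stats(source))
print(adapted.mean(axis=0)[:3], adapted.std(axis=0)[:3])      # approx. source stats
```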