DASEE A Synthetic Database of Domestic Acoustic Scenes and Events in
Dementia Patients Environment
- URL: http://arxiv.org/abs/2104.13423v1
- Date: Tue, 27 Apr 2021 18:51:44 GMT
- Title: DASEE A Synthetic Database of Domestic Acoustic Scenes and Events in
Dementia Patients Environment
- Authors: Abigail Copiaco, Christian Ritz, Stefano Fasciani, Nidhal Abdulaziz
- Abstract summary: We generate an unbiased synthetic domestic audio database, consisting of sound scenes and events, emulated in both quiet and noisy environments.
Data is carefully curated such that it reflects issues commonly faced in a dementia patient's environment.
We present an 11-class database containing excerpts of clean and noisy signals, each 5 seconds in duration, uniformly sampled at 16 kHz.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Access to informative databases is a crucial part of notable research
developments. In the field of domestic audio classification, there have been
significant advances in recent years. Although several audio databases exist,
these can be limited in terms of the amount of information they provide, such
as the exact location of the sound sources, and the associated noise levels. In
this work, we detail our approach to generating an unbiased synthetic domestic
audio database, consisting of sound scenes and events, emulated in both quiet
and noisy environments. Data is carefully curated so that it reflects issues
commonly faced in a dementia patient's environment and recreates scenarios that
could occur in real-world settings. Additionally, the room impulse responses
are generated based on a typical one-bedroom apartment at Hebrew SeniorLife
Facility. As a result, we present an 11-class database containing excerpts of
clean and noisy signals, each 5 seconds in duration, uniformly sampled at 16
kHz. Our baseline model, which uses Continuous Wavelet Transform scalograms and
AlexNet, yielded a weighted F1-score of 86.24 percent.
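The pipeline the abstract describes, dry sound events convolved with a room impulse response, noise mixed in at a controlled SNR, and Continuous Wavelet Transform scalograms computed for classification, can be sketched as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' code: the sine-tone "event", the synthetic exponential-decay RIR, the 10 dB SNR, and the scale grid are all stand-ins, and the AlexNet classification step is omitted.

```python
import numpy as np

SR = 16_000          # sampling rate used by DASEE
CLIP_LEN = 5 * SR    # 5-second excerpts

def mix_at_snr(clean, noise, snr_db):
    """Scale `noise` so the clean-to-noise power ratio equals snr_db, then add."""
    noise = np.resize(noise, clean.shape)
    p_clean = np.mean(clean ** 2)
    p_noise = np.mean(noise ** 2)
    gain = np.sqrt(p_clean / (p_noise * 10 ** (snr_db / 10)))
    return clean + gain * noise

def morlet_scalogram(x, scales, w0=6.0):
    """CWT magnitudes via FFT-domain Morlet filtering (one filter per scale)."""
    n = len(x)
    X = np.fft.fft(x)
    ang = 2 * np.pi * np.fft.fftfreq(n)   # angular frequency per sample
    out = np.empty((len(scales), n))
    for i, s in enumerate(scales):
        # Fourier transform of a Morlet wavelet dilated to scale s
        psi_hat = np.pi ** -0.25 * np.exp(-0.5 * (s * ang - w0) ** 2)
        out[i] = np.abs(np.fft.ifft(X * psi_hat * np.sqrt(s)))
    return out

# Emulate one noisy reverberant clip: dry event -> toy RIR -> noise at 10 dB SNR.
rng = np.random.default_rng(0)
t = np.arange(CLIP_LEN) / SR
dry = np.sin(2 * np.pi * 440 * t)                      # stand-in sound event
decay = np.exp(-np.arange(SR // 4) / (0.05 * SR))      # ~50 ms decay envelope
rir = rng.standard_normal(SR // 4) * decay             # toy room impulse response
wet = np.convolve(dry, rir)[:CLIP_LEN]                 # reverberant event
noisy = mix_at_snr(wet, rng.standard_normal(CLIP_LEN), snr_db=10.0)

scales = np.geomspace(2, 512, 64)
scalogram = morlet_scalogram(noisy, scales)            # 64 x 80000 "image"
```

In a classification setup like the baseline, the `scalogram` array would then be resized to the CNN's expected input resolution before being passed to AlexNet.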
Related papers
- NoisyAG-News: A Benchmark for Addressing Instance-Dependent Noise in Text Classification [7.464154519547575]
Existing research on learning with noisy labels predominantly focuses on synthetic noise patterns.
We constructed a benchmark dataset to better understand label noise in real-world text classification settings.
Our findings reveal that while pre-trained models are resilient to synthetic noise, they struggle against instance-dependent noise.
arXiv Detail & Related papers (2024-07-09T06:18:40Z)
- Sound Tagging in Infant-centric Home Soundscapes [30.76025173544015]
We explore the performance of a large pre-trained model on infant-centric noise soundscapes in the home.
Our results show that fine-tuning the model by combining our collected dataset with public datasets increases the F1-score.
arXiv Detail & Related papers (2024-06-25T00:15:54Z)
- AV-GS: Learning Material and Geometry Aware Priors for Novel View Acoustic Synthesis [62.33446681243413]
Novel view acoustic synthesis aims to render audio at any target viewpoint, given mono audio emitted by a sound source in a 3D scene.
Existing methods have proposed NeRF-based implicit models to exploit visual cues as a condition for synthesizing audio.
We propose a novel Audio-Visual Gaussian Splatting (AV-GS) model to characterize the entire scene environment.
Experiments validate the superiority of our AV-GS over existing alternatives on the real-world RWAS and simulation-based SoundSpaces datasets.
arXiv Detail & Related papers (2024-06-13T08:34:12Z)
- Real Acoustic Fields: An Audio-Visual Room Acoustics Dataset and Benchmark [65.79402756995084]
Real Acoustic Fields (RAF) is a new dataset that captures real acoustic room data from multiple modalities.
RAF is the first dataset to provide densely captured room acoustic data.
arXiv Detail & Related papers (2024-03-27T17:59:56Z)
- AGS: An Dataset and Taxonomy for Domestic Scene Sound Event Recognition [1.5106201893222209]
This paper proposes a dataset (called AGS) for home environment sounds.
The dataset covers various types of overlapping audio in the scene as well as background noise.
arXiv Detail & Related papers (2023-08-30T03:03:47Z)
- Self-Supervised Visual Acoustic Matching [63.492168778869726]
Acoustic matching aims to re-synthesize an audio clip to sound as if it were recorded in a target acoustic environment.
We propose a self-supervised approach to visual acoustic matching where training samples include only the target scene image and audio.
Our approach jointly learns to disentangle room acoustics and re-synthesize audio into the target environment, via a conditional GAN framework and a novel metric.
arXiv Detail & Related papers (2023-07-27T17:59:59Z)
- Analysing the Impact of Audio Quality on the Use of Naturalistic Long-Form Recordings for Infant-Directed Speech Research [62.997667081978825]
Modelling of early language acquisition aims to understand how infants bootstrap their language skills.
Recent developments have enabled the use of more naturalistic training data for computational models.
It is currently unclear how the sound quality could affect analyses and modelling experiments conducted on such data.
arXiv Detail & Related papers (2023-05-03T08:25:37Z)
- Learning with Noisy Labels Revisited: A Study Using Real-World Human Annotations [54.400167806154535]
Existing research on learning with noisy labels mainly focuses on synthetic label noise.
This work presents two new benchmark datasets (CIFAR-10N, CIFAR-100N)
We show that real-world noisy labels follow an instance-dependent pattern rather than the classically adopted class-dependent ones.
arXiv Detail & Related papers (2021-10-22T22:42:11Z)
- Bridging the Gap Between Clean Data Training and Real-World Inference for Spoken Language Understanding [76.89426311082927]
Existing models are trained on clean data, which causes a gap between clean-data training and real-world inference.
We propose a method from the perspective of domain adaptation, by which both high- and low-quality samples are embedded into a similar vector space.
Experiments on the widely used Snips dataset and a large-scale in-house dataset (10 million training examples) demonstrate that this method not only outperforms baseline models on a real-world (noisy) corpus but also enhances robustness, producing high-quality results in noisy environments.
arXiv Detail & Related papers (2021-04-13T17:54:33Z)
- Deep Sound Field Reconstruction in Real Rooms: Introducing the ISOBEL Sound Field Dataset [0.0]
This paper extends evaluations of sound field reconstruction at low frequencies by introducing a dataset with measurements from four real rooms.
The paper advances on a recent deep learning-based method for sound field reconstruction using a very low number of microphones.
arXiv Detail & Related papers (2021-02-12T11:34:18Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.