Related papers: On Multitask Loss Function for Audio Event Detection and Localization

On Multitask Loss Function for Audio Event Detection and Localization

URL: http://arxiv.org/abs/2009.05527v1
Date: Fri, 11 Sep 2020 16:59:03 GMT
Title: On Multitask Loss Function for Audio Event Detection and Localization
Authors: Huy Phan, Lam Pham, Philipp Koch, Ngoc Q. K. Duong, Ian McLoughlin, Alfred Mertins
Abstract summary: We propose a multitask regression model, in which both (multi-label) event detection and localization are formulated as regression problems. We show that the common combination of heterogeneous loss functions causes the network to underfit the data whereas the homogeneous mean squared error loss leads to better convergence and performance.
Score: 37.91770573529112
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Audio event localization and detection (SELD) have been commonly tackled using multitask models. Such a model usually consists of a multi-label event classification branch with sigmoid cross-entropy loss for event activity detection and a regression branch with mean squared error loss for direction-of-arrival estimation. In this work, we propose a multitask regression model, in which both (multi-label) event detection and localization are formulated as regression problems and use the mean squared error loss homogeneously for model training. We show that the common combination of heterogeneous loss functions causes the network to underfit the data whereas the homogeneous mean squared error loss leads to better convergence and performance. Experiments on the development and validation sets of the DCASE 2020 SELD task demonstrate that the proposed system also outperforms the DCASE 2020 SELD baseline across all the detection and localization metrics, reducing the overall SELD error (the combined metric) by approximately 10% absolute.

Related papers

CLIP Meets Diffusion: A Synergistic Approach to Anomaly Detection [54.85000884785013]
Anomaly detection is a complex problem due to the ambiguity in defining anomalies, the diversity of anomaly types, and the scarcity of training data.<n>We propose CLIPfusion, a method that leverages both discriminative and generative foundation models.<n>We believe that our method underscores the effectiveness of multi-modal and multi-model fusion in tackling the multifaceted challenges of anomaly detection.
arXiv Detail & Related papers (2025-06-13T13:30:15Z)
FUN-AD: Fully Unsupervised Learning for Anomaly Detection with Noisy Training Data [1.0650780147044159]
We propose a novel learning-based approach for fully unsupervised anomaly detection with unlabeled and potentially contaminated training data. Our method is motivated by two observations, that i) the pairwise feature distances between the normal samples are on average likely to be smaller than those between the anomaly samples or heterogeneous samples and ii) pairs of features mutually closest to each other are likely to be homogeneous pairs. Building on the first observation that nearest-neighbor distances can distinguish between confident normal samples and anomalies, we propose a pseudo-labeling strategy using an iteratively reconstructed memory bank.
arXiv Detail & Related papers (2024-11-25T05:51:38Z)
MIM-GAN-based Anomaly Detection for Multivariate Time Series Data [7.734588574138427]
The loss function of Generative adversarial network(GAN) is an important factor that affects the quality and diversity of the generated samples for anomaly detection. We propose an unsupervised multiple time series anomaly detection algorithm based on the GAN with message importance measure(MIM-GAN) Experimental results show that the proposed MIM-GAN-based anomaly detection algorithm has superior performance in terms of precision, recall, and F1 score.
arXiv Detail & Related papers (2023-10-26T02:09:39Z)
Anomaly Detection with Score Distribution Discrimination [4.468952886990851]
We propose to optimize the anomaly scoring function from the view of score distribution. We design a novel loss function called Overlap loss that minimizes the overlap area between the score distributions of normal and abnormal samples.
arXiv Detail & Related papers (2023-06-26T03:32:57Z)
Robust Multimodal Failure Detection for Microservice Systems [32.25907616511765]
AnoFusion is an unsupervised failure detection approach for microservice systems. It learns the correlation of the heterogeneous multimodal data and integrates a Graph Attention Network (GAT) and Gated Recurrent Unit (GRU) It achieves the F1-score of 0.857 and 0.922, respectively, outperforming state-of-the-art failure detection approaches.
arXiv Detail & Related papers (2023-05-30T12:39:42Z)
The KFIoU Loss for Rotated Object Detection [115.334070064346]
In this paper, we argue that one effective alternative is to devise an approximate loss who can achieve trend-level alignment with SkewIoU loss. Specifically, we model the objects as Gaussian distribution and adopt Kalman filter to inherently mimic the mechanism of SkewIoU. The resulting new loss called KFIoU is easier to implement and works better compared with exact SkewIoU.
arXiv Detail & Related papers (2022-01-29T10:54:57Z)
Discriminative-Generative Dual Memory Video Anomaly Detection [81.09977516403411]
Recently, people tried to use a few anomalies for video anomaly detection (VAD) instead of only normal data during the training process. We propose a DiscRiminative-gEnerative duAl Memory (DREAM) anomaly detection model to take advantage of a few anomalies and solve data imbalance.
arXiv Detail & Related papers (2021-04-29T15:49:01Z)
Shaping Deep Feature Space towards Gaussian Mixture for Visual Classification [74.48695037007306]
We propose a Gaussian mixture (GM) loss function for deep neural networks for visual classification. With a classification margin and a likelihood regularization, the GM loss facilitates both high classification performance and accurate modeling of the feature distribution. The proposed model can be implemented easily and efficiently without using extra trainable parameters.
arXiv Detail & Related papers (2020-11-18T03:32:27Z)
Optimized Loss Functions for Object detection: A Case Study on Nighttime Vehicle Detection [0.0]
In this paper, we optimize both two loss functions for classification and localization simultaneously. Compared to the existing studies, in which the correlation is only applied to improve the localization accuracy for positive samples, this paper utilizes the correlation to obtain the really hard negative samples. A novel localization loss named MIoU is proposed by incorporating a Mahalanobis distance between predicted box and target box, which eliminate the gradients inconsistency problem in the DIoU loss.
arXiv Detail & Related papers (2020-11-11T03:00:49Z)
Learning by Minimizing the Sum of Ranked Range [58.24935359348289]
We introduce the sum of ranked range (SoRR) as a general approach to form learning objectives. A ranked range is a consecutive sequence of sorted values of a set of real numbers. We explore two applications in machine learning of the minimization of the SoRR framework, namely the AoRR aggregate loss for binary classification and the TKML individual loss for multi-label/multi-class classification.
arXiv Detail & Related papers (2020-10-05T01:58:32Z)
SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples. We propose a modular acceleration system, called SUOD, to address it.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)

This list is automatically generated from the titles and abstracts of the papers in this site.