Related papers: Streaming Anchor Loss: Augmenting Supervision with Temporal Significance

URL: http://arxiv.org/abs/2310.05886v2
Date: Thu, 18 Apr 2024 06:11:43 GMT
Title: Streaming Anchor Loss: Augmenting Supervision with Temporal Significance
Authors: Utkarsh Oggy Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed Tewfik
Abstract summary: Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms. We propose a new loss, Streaming Anchor Loss (SAL), to better utilize the given learning capacity by encouraging the model to learn more from essential frames.
Score: 5.7654216719335105
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Streaming neural network models for fast frame-wise responses to various speech and sensory signals are widely adopted on resource-constrained platforms. Hence, increasing the learning capacity of such streaming models (i.e., by adding more parameters) to improve the predictive power may not be viable for real-world tasks. In this work, we propose a new loss, Streaming Anchor Loss (SAL), to better utilize the given learning capacity by encouraging the model to learn more from essential frames. More specifically, our SAL and its focal variations dynamically modulate the frame-wise cross entropy loss based on the importance of the corresponding frames so that a higher loss penalty is assigned for frames within the temporal proximity of semantically critical events. Therefore, our loss ensures that the model training focuses on predicting the relatively rare but task-relevant frames. Experimental results with standard lightweight convolutional and recurrent streaming networks on three different speech based detection tasks demonstrate that SAL enables the model to learn the overall task more effectively with improved accuracy and latency, without any additional data, model parameters, or architectural changes.
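
The abstract describes modulating the frame-wise cross entropy so that frames near semantically critical events carry a higher penalty. The sketch below illustrates that general idea only; it is not the paper's formulation. The function names, the exponential weighting `1 + alpha * exp(-distance / tau)`, and the parameters `alpha` and `tau` are all assumptions made for illustration, and at least one anchor frame is assumed to be given.

```python
import math

def proximity_weights(num_frames, anchors, alpha=4.0, tau=3.0):
    """Hypothetical per-frame weights: larger near the closest anchor frame.

    The form 1 + alpha * exp(-distance / tau) is an illustrative assumption,
    not the weighting defined in the paper. Requires a non-empty anchor list.
    """
    weights = []
    for t in range(num_frames):
        dist = min(abs(t - a) for a in anchors)  # distance to nearest anchor
        weights.append(1.0 + alpha * math.exp(-dist / tau))
    return weights

def frame_cross_entropy(logits, label):
    """Cross entropy for one frame, using a numerically stable log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[label]

def weighted_streaming_loss(frame_logits, labels, weights):
    """Weighted average of frame-wise cross entropy over a sequence."""
    total = sum(
        w * frame_cross_entropy(logits, y)
        for logits, y, w in zip(frame_logits, labels, weights)
    )
    return total / sum(weights)

# Example: 10 frames, one hypothetical anchor (critical event) at frame 5.
weights = proximity_weights(10, anchors=[5])
frame_logits = [[0.0, 0.0] for _ in range(10)]  # uninformative two-class logits
labels = [0] * 10
loss = weighted_streaming_loss(frame_logits, labels, weights)
```

Normalizing by the sum of the weights keeps the loss on the same scale as plain cross entropy, so the weighting only changes which frames dominate the average.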

Related papers

Enhancing material behavior discovery using embedding-oriented Physically-Guided Neural Networks with Internal Variables [0.0]
Physically Guided Neural Networks with Internal Variables are SciML tools that use only observable data for training and unravel internal state relations. Despite their potential, these models face challenges in scalability when applied to high-dimensional data such as fine-grid spatial fields or time-evolving systems. We propose some enhancements to the PGNNIV framework that address these scalability limitations through reduced-order modeling techniques.
arXiv Detail & Related papers (2025-08-01T12:33:21Z)
FlowDistill: Scalable Traffic Flow Prediction via Distillation from LLMs [5.6685153523382015]
FlowDistill is a lightweight traffic prediction framework based on knowledge distillation from large language models (LLMs). Despite its simplicity, FlowDistill consistently outperforms state-of-the-art models in prediction accuracy while requiring significantly less training data.
arXiv Detail & Related papers (2025-04-02T19:54:54Z)
ODEStream: A Buffer-Free Online Learning Framework with ODE-based Adaptor for Streaming Time Series Forecasting [11.261457967759688]
ODEStream is a buffer-free continual learning framework that incorporates a temporal isolation layer that integrates temporal dependencies within the data. Our approach focuses on learning how the dynamics and distribution of historical data change with time, facilitating the direct processing of streaming sequences. Evaluations on benchmark real-world datasets demonstrate that ODEStream outperforms the state-of-the-art online learning and streaming analysis baselines.
arXiv Detail & Related papers (2024-11-11T22:36:33Z)
TDS-CLIP: Temporal Difference Side Network for Image-to-Video Transfer Learning [6.329214318116305]
We propose a memory-efficient Temporal Difference Side Network (TDS-CLIP) to balance knowledge transferring and temporal modeling. Specifically, we introduce a Temporal Difference Adapter (TD-Adapter), which can effectively capture local temporal differences in motion features. We also designed a Side Motion Enhancement Adapter (SME-Adapter) to guide the proposed side network in efficiently learning the rich motion information in videos.
arXiv Detail & Related papers (2024-08-20T09:40:08Z)
Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has been conventionally believed to be a challenging property to encode for neural networks. We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
Interference Cancellation GAN Framework for Dynamic Channels [74.22393885274728]
We introduce an online training framework that can adapt to any changes in the channel. Our framework significantly outperforms recent neural network models on highly dynamic channels.
arXiv Detail & Related papers (2022-08-17T02:01:18Z)
From Environmental Sound Representation to Robustness of 2D CNN Models Against Adversarial Attacks [82.21746840893658]
This paper investigates the impact of different standard environmental sound representations (spectrograms) on the recognition performance and adversarial attack robustness of a victim residual convolutional neural network. We show that while the ResNet-18 model trained on DWT spectrograms achieves a high recognition accuracy, attacking this model is relatively more costly for the adversary.
arXiv Detail & Related papers (2022-04-14T15:14:08Z)
Real-time Object Detection for Streaming Perception [84.2559631820007]
Streaming perception is proposed to jointly evaluate latency and accuracy in a single metric for online video perception. We build a simple and effective framework for streaming perception. Our method achieves competitive performance on the Argoverse-HD dataset and improves the AP by 4.9% compared to the strong baseline.
arXiv Detail & Related papers (2022-03-23T11:33:27Z)
Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting. FSNet balances fast adaptation to recent changes and retrieving similar old knowledge. Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z)
Enabling Continual Learning with Differentiable Hebbian Plasticity [18.12749708143404]
Continual learning is the problem of sequentially learning new tasks or knowledge while protecting previously acquired knowledge. Catastrophic forgetting poses a grand challenge for neural networks performing such a learning process. We propose a Differentiable Hebbian Consolidation model which is composed of a Differentiable Hebbian Plasticity.
arXiv Detail & Related papers (2020-06-30T06:42:19Z)
Network Diffusions via Neural Mean-Field Dynamics [52.091487866968286]
We propose a novel learning framework for inference and estimation problems of diffusion on networks. Our framework is derived from the Mori-Zwanzig formalism to obtain an exact evolution of the node infection probabilities. Our approach is versatile and robust to variations of the underlying diffusion network models.
arXiv Detail & Related papers (2020-06-16T18:45:20Z)

This list is automatically generated from the titles and abstracts of the papers on this site.