Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs
- URL: http://arxiv.org/abs/2506.09215v1
- Date: Tue, 10 Jun 2025 20:18:32 GMT
- Title: Robust Noise Attenuation via Adaptive Pooling of Transformer Outputs
- Authors: Greyson Brothers
- Abstract summary: This work considers problems where a subset of the input vectors contains requisite information for a downstream task (signal) while the rest are distractors (noise). Standard methods used to aggregate transformer outputs, AvgPool, MaxPool, and ClsToken, are vulnerable to performance collapse as the signal-to-noise ratio (SNR) of inputs fluctuates. We show that an attention-based adaptive pooling method can approximate the signal-optimal vector quantizer within derived error bounds for any SNR.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We investigate the design of pooling methods used to summarize the outputs of transformer embedding models, primarily motivated by reinforcement learning and vision applications. This work considers problems where a subset of the input vectors contains requisite information for a downstream task (signal) while the rest are distractors (noise). By framing pooling as vector quantization with the goal of minimizing signal loss, we demonstrate that the standard methods used to aggregate transformer outputs, AvgPool, MaxPool, and ClsToken, are vulnerable to performance collapse as the signal-to-noise ratio (SNR) of inputs fluctuates. We then show that an attention-based adaptive pooling method can approximate the signal-optimal vector quantizer within derived error bounds for any SNR. Our theoretical results are first validated by supervised experiments on a synthetic dataset designed to isolate the SNR problem, then generalized to standard relational reasoning, multi-agent reinforcement learning, and vision benchmarks with noisy observations, where transformers with adaptive pooling display superior robustness across tasks.
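For concreteness, the contrast between the fixed aggregators and an attention-based adaptive pool can be sketched in a few lines of PyTorch. This is a minimal illustration of the general technique, not the paper's implementation: the module name `AttentionPool`, the single learned query, and the single-head design are assumptions made for brevity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class AttentionPool(nn.Module):
    """Pools a set of transformer output tokens into one vector via a
    learned query, so the weights can adapt to which tokens carry signal."""

    def __init__(self, dim: int):
        super().__init__()
        self.query = nn.Parameter(torch.randn(dim) / dim ** 0.5)  # learned query (illustrative design)
        self.key_proj = nn.Linear(dim, dim)
        self.value_proj = nn.Linear(dim, dim)

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, n_tokens, dim) -> pooled summary: (batch, dim)
        keys = self.key_proj(tokens)                         # (B, N, D)
        values = self.value_proj(tokens)                     # (B, N, D)
        scores = keys @ self.query / keys.shape[-1] ** 0.5   # (B, N) attention logits
        weights = F.softmax(scores, dim=-1)                  # content-dependent weights
        return torch.einsum("bn,bnd->bd", weights, values)   # weighted sum of tokens

# The fixed baselines, for comparison: AvgPool weights every token 1/N and
# MaxPool takes a coordinate-wise max, regardless of which tokens are signal.
def avg_pool(tokens: torch.Tensor) -> torch.Tensor:
    return tokens.mean(dim=1)

def max_pool(tokens: torch.Tensor) -> torch.Tensor:
    return tokens.max(dim=1).values

# Usage: pool 16 tokens of width 64 for a batch of 8 sets.
pool = AttentionPool(64)
summary = pool(torch.randn(8, 16, 64))  # shape: (8, 64)
```

Because the softmax weights depend on token content, the pooled vector can concentrate mass on the few signal tokens even as distractors are added, whereas AvgPool dilutes the signal by exactly a 1/N factor.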
Related papers
- Adaptive folding and noise filtering for robust quantum error mitigation [0.0]
This paper presents noise-adaptive folding, a technique that enhances zero-noise extrapolation (ZNE). We introduce two filtering methods: one relies on measuring error strength, while the other utilizes statistical filtering to improve the extrapolation process. Our findings demonstrate that these adaptive methods effectively strengthen error mitigation against noise fluctuations, thereby enhancing the precision and reliability of quantum computations.
arXiv Detail & Related papers (2025-05-07T14:35:01Z)
- Robust semi-parametric signal detection in particle physics with classifiers decorrelated via optimal transport [0.20971479389679337]
In particle physics, a supervised classifier is used to separate a signal model from the known Standard Model physics. But errors in the background model might adversely affect the signal detection procedure. This paper shows how to use the (possibly misspecified) classifier only to perform a preliminary signal-enrichment step. We then carry out a bump hunt on the signal-rich sample using only the real experimental data.
arXiv Detail & Related papers (2024-09-10T10:32:21Z)
- Locality-Aware Generalizable Implicit Neural Representation [54.93702310461174]
Generalizable implicit neural representation (INR) enables a single continuous function to represent multiple data instances.
We propose a novel framework for generalizable INR that combines a transformer encoder with a locality-aware INR decoder.
Our framework significantly outperforms previous generalizable INRs and validates the usefulness of the locality-aware latents for downstream tasks.
arXiv Detail & Related papers (2023-10-09T11:26:58Z)
- Transformers as Meta-Learners for Implicit Neural Representations [10.673855995948736]
Implicit Neural Representations (INRs) have emerged and shown their benefits over discrete representations in recent years.
We propose a formulation that uses Transformers as hypernetworks for INRs, which can directly build the whole set of INR weights.
We demonstrate the effectiveness of our method for building INRs in different tasks and domains, including 2D image regression and view synthesis for 3D objects.
arXiv Detail & Related papers (2022-08-04T17:54:38Z)
- Treatment Learning Causal Transformer for Noisy Image Classification [62.639851972495094]
In this work, we incorporate the binary information of "existence of noise" as a treatment into image classification tasks to improve prediction accuracy.
Motivated by causal variational inference, we propose a transformer-based architecture that uses a latent generative model to estimate robust feature representations for noisy image classification.
We also create new noisy image datasets incorporating a wide range of noise factors for performance benchmarking.
arXiv Detail & Related papers (2022-03-29T13:07:53Z)
- Deliberation of Streaming RNN-Transducer by Non-autoregressive Decoding [21.978994865937786]
The method performs a few refinement steps, where each step shares a transformer decoder that attends to both text features and audio features.
We show that, conditioned on hypothesis alignments of a streaming RNN-T model, our method obtains significantly more accurate recognition results than the first-pass RNN-T.
arXiv Detail & Related papers (2021-12-01T01:34:28Z)
- Adaptive Low-Pass Filtering using Sliding Window Gaussian Processes [71.23286211775084]
We propose an adaptive low-pass filter based on Gaussian process regression.
We show that the estimation error of the proposed method is uniformly bounded.
arXiv Detail & Related papers (2021-11-05T17:06:59Z)
- Bayesian Transformer Language Models for Speech Recognition [59.235405107295655]
State-of-the-art neural language models (LMs) represented by Transformers are highly complex.
This paper proposes a full Bayesian learning framework for Transformer LM estimation.
arXiv Detail & Related papers (2021-02-09T10:55:27Z)
- Adaptive filters for the moving target indicator system [10.152838128195468]
Two approaches to improve the convergence of adaptive algorithms are presented.
The proposed approach is based on an empirical signal-to-interference-plus-noise ratio (SINR).
Its effectiveness is demonstrated using simulated data.
arXiv Detail & Related papers (2020-12-31T04:22:55Z)
- Rethinking Transformer-based Set Prediction for Object Detection [57.7208561353529]
Experimental results show that the proposed methods not only converge much faster than the original DETR, but also significantly outperform DETR and other baselines in terms of detection accuracy.
arXiv Detail & Related papers (2020-11-21T21:59:42Z)
- Neural Control Variates [71.42768823631918]
We show that a set of neural networks can address the challenge of finding a good approximation of the integrand.
We derive a theoretically optimal, variance-minimizing loss function, and propose an alternative, composite loss for stable online training in practice.
Specifically, we show that the learned light-field approximation is of sufficient quality for high-order bounces, allowing us to omit the error correction and thereby dramatically reduce the noise at the cost of negligible visible bias.
arXiv Detail & Related papers (2020-06-02T11:17:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the content (including all information) and is not responsible for any consequences of its use.