Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders
- URL: http://arxiv.org/abs/2411.13117v2
- Date: Thu, 30 Jan 2025 09:15:26 GMT
- Title: Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders
- Authors: Charles O'Neill, Alim Gumran, David Klindt
- Abstract summary: A recent line of work has shown promise in using sparse autoencoders (SAEs) to uncover interpretable features in neural network representations.
However, the simple linear-nonlinear encoding mechanism in SAEs limits their ability to perform accurate sparse inference.
We prove that an SAE encoder is inherently insufficient for accurate sparse inference, even in solvable cases.
- Abstract: A recent line of work has shown promise in using sparse autoencoders (SAEs) to uncover interpretable features in neural network representations. However, the simple linear-nonlinear encoding mechanism in SAEs limits their ability to perform accurate sparse inference. Using compressed sensing theory, we prove that an SAE encoder is inherently insufficient for accurate sparse inference, even in solvable cases. We then decouple encoding and decoding processes to empirically explore conditions where more sophisticated sparse inference methods outperform traditional SAE encoders. Our results reveal substantial performance gains with minimal compute increases in correct inference of sparse codes. We demonstrate this generalises to SAEs applied to large language models, where more expressive encoders achieve greater interpretability. This work opens new avenues for understanding neural network representations and analysing large language model activations.
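The gap is easy to see in a toy setting. Below is a minimal numpy sketch contrasting a one-step SAE encoder with iterative sparse inference via ISTA, the kind of more expressive encoder that decoupling encoding from decoding enables; the dimensions, step count, and soft-threshold parameter are illustrative assumptions, not values from the paper.

```python
# A minimal numpy sketch: one-step SAE encoder vs. iterative sparse inference
# (ISTA). All dimensions and hyperparameters are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
n, m, k = 64, 256, 4                        # signal dim, dictionary size, sparsity

D = rng.normal(size=(n, m)) / np.sqrt(n)    # decoder dictionary (columns = features)
z_true = np.zeros(m)
z_true[rng.choice(m, k, replace=False)] = rng.normal(size=k)
x = D @ z_true                              # observation = sparse code through dictionary

def sae_encode(x, W, b):
    """Standard SAE encoder: a single linear map plus ReLU."""
    return np.maximum(W @ x + b, 0.0)

def ista(x, D, lam=0.05, steps=200):
    """Iterative soft-thresholding: many cheap steps of sparse inference."""
    L = np.linalg.norm(D, 2) ** 2           # Lipschitz constant of the gradient
    z = np.zeros(D.shape[1])
    for _ in range(steps):
        z = z - D.T @ (D @ z - x) / L       # gradient step on the reconstruction
        z = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return z

z_sae = sae_encode(x, D.T, np.zeros(m))     # tied-weight encoder as a crude baseline
z_ista = ista(x, D)
print("SAE  recon error:", np.linalg.norm(D @ z_sae - x))
print("ISTA recon error:", np.linalg.norm(D @ z_ista - x))
```

The tied-weight encoder here is untrained, so the numbers only illustrate the structural point: one linear-nonlinear step cannot in general solve the sparse inverse problem that iterative methods handle with a modest amount of extra compute.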
Related papers
- A Theoretical Perspective for Speculative Decoding Algorithm [60.79447486066416]
One effective way to accelerate inference is Speculative Decoding, which employs a small model to sample a sequence of draft tokens and a large model to validate them.
This paper tackles this gap by conceptualizing the decoding problem via a Markov chain abstraction and studying the key properties, output quality and inference acceleration, from a theoretical perspective.
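For concreteness, here is a toy sketch of the standard speculative accept/reject rule that this kind of analysis studies; the "models" are fixed categorical distributions standing in for a small draft LM and a large target LM, and the vocabulary size is an assumption.

```python
# Toy sketch of the speculative-decoding accept/reject rule. The output of
# speculative_step is distributed exactly as the target distribution p.
import numpy as np

rng = np.random.default_rng(0)
V = 8
q = rng.dirichlet(np.ones(V))               # draft model's next-token distribution
p = rng.dirichlet(np.ones(V))               # target model's next-token distribution

def speculative_step(p, q, rng):
    """Sample from the draft q; accept with prob min(1, p/q); on rejection,
    resample from the residual max(p - q, 0)."""
    x = rng.choice(len(q), p=q)
    if rng.random() < min(1.0, p[x] / q[x]):
        return x, True
    residual = np.maximum(p - q, 0.0)
    residual /= residual.sum()
    return rng.choice(len(p), p=residual), False

token, accepted = speculative_step(p, q, rng)
print("token:", token, "draft accepted:", accepted)
```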
arXiv Detail & Related papers (2024-10-30T01:53:04Z) - Interpretability as Compression: Reconsidering SAE Explanations of Neural Activations with MDL-SAEs [0.0]
We present an information-theoretic framework for interpreting SAEs as lossy compression algorithms.
We argue that using MDL rather than sparsity may avoid potential pitfalls with naively maximising sparsity.
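A hedged sketch of what MDL-style accounting could look like for an SAE code: total description length = bits to name the active latents + bits for their values + bits for the residual. The bit estimates below are illustrative proxies, not the paper's exact scheme; value_bits and the unit-variance residual model are assumptions.

```python
# Illustrative MDL-style description length for a sparse code z explaining
# activation x via reconstruction x_hat. All cost models are assumptions.
import numpy as np

def description_length(z, x, x_hat, value_bits=16):
    m = z.size
    active = np.flatnonzero(z)
    index_bits = active.size * np.log2(m)            # which latents fire
    value_bits_total = active.size * value_bits      # their (quantised) magnitudes
    residual_bits = 0.5 * np.sum((x - x_hat) ** 2) / np.log(2)  # Gaussian NLL in bits
    return index_bits + value_bits_total + residual_bits

z = np.array([0.0, 1.2, 0.0, 0.7])                   # toy sparse code
x, x_hat = np.ones(3), 0.9 * np.ones(3)              # toy activation and reconstruction
print(description_length(z, x, x_hat))
```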
arXiv Detail & Related papers (2024-10-15T01:38:03Z) - Sample what you can't compress [6.24979299238534]
We show how to learn a continuous encoder and decoder under a diffusion-based loss.
This approach yields better reconstruction quality than GAN-based autoencoders.
We also show that the resulting representation is easier to model with a latent diffusion model than the representation obtained from a state-of-the-art GAN-based loss.
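A minimal torch sketch of the setup this suggests: a continuous encoder produces a latent, the decoder is a latent-conditioned denoiser, and the loss is diffusion-style noise prediction. The architectures, the linear noising schedule, and all sizes are assumptions for illustration.

```python
# Sketch: the latent carries what the encoder can compress; the conditional
# denoiser samples what it can't. Everything here is an illustrative stand-in.
import torch
import torch.nn as nn

enc = nn.Linear(784, 32)                                 # continuous encoder
den = nn.Sequential(nn.Linear(784 + 32 + 1, 256), nn.ReLU(),
                    nn.Linear(256, 784))                 # latent-conditioned denoiser
opt = torch.optim.Adam(list(enc.parameters()) + list(den.parameters()), lr=1e-3)

x = torch.randn(16, 784)                                 # stand-in data batch
t = torch.rand(16, 1)                                    # random diffusion times in [0, 1]
noise = torch.randn_like(x)
x_noisy = (1 - t) * x + t * noise                        # simple linear noising (assumption)

z = enc(x)
pred = den(torch.cat([x_noisy, z, t], dim=1))            # predict the injected noise
loss = ((pred - noise) ** 2).mean()
loss.backward()
opt.step()
print(float(loss))
```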
arXiv Detail & Related papers (2024-09-04T08:42:42Z) - Speculative Contrastive Decoding [55.378200871224074]
Large language models (LLMs) exhibit exceptional performance on language tasks, yet their auto-regressive inference is limited by high computational requirements and is sub-optimal due to exposure bias.
Inspired by speculative decoding and contrastive decoding, we introduce Speculative Contrastive Decoding (SCD), a straightforward yet powerful decoding approach.
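A toy sketch of the contrastive half of SCD: the small model's logits are subtracted from the large model's under a plausibility mask, while the same small model would also supply the speculative drafts. The alpha and beta values are illustrative, not the paper's.

```python
# Contrastive next-token distribution from two logit vectors; plausibility
# masking (alpha) follows contrastive decoding. Values are illustrative.
import numpy as np

def contrastive_probs(logits_large, logits_small, alpha=0.1, beta=0.5):
    p = np.exp(logits_large - logits_large.max())
    p /= p.sum()
    mask = p >= alpha * p.max()                      # keep only plausible tokens
    scores = (1 + beta) * logits_large - beta * logits_small
    scores[~mask] = -np.inf                          # contrast only within the mask
    e = np.exp(scores - scores[mask].max())
    return e / e.sum()

rng = np.random.default_rng(0)
logits_large, logits_small = rng.normal(size=16), rng.normal(size=16)
print(contrastive_probs(logits_large, logits_small).round(3))
```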
arXiv Detail & Related papers (2023-11-15T14:15:30Z) - Machine Learning-Aided Efficient Decoding of Reed-Muller Subcodes [59.55193427277134]
Reed-Muller (RM) codes achieve the capacity of general binary-input memoryless symmetric channels.
However, RM codes only admit limited sets of rates.
Efficient decoders are nonetheless available for RM codes at finite lengths.
arXiv Detail & Related papers (2023-01-16T04:11:14Z) - Benign Autoencoders [0.0]
We formalize the problem of finding the optimal encoder-decoder pair and characterize its solution, which we name the "benign autoencoder" (BAE).
We prove that BAE projects data onto a manifold whose dimension is the optimal compressibility dimension of the generative problem.
As an illustration, we show how BAE can find optimal, low-dimensional latent representations that improve the performance of a discriminator under a distribution shift.
arXiv Detail & Related papers (2022-10-02T21:36:27Z) - Adversarial Neural Networks for Error Correcting Codes [76.70040964453638]
We introduce a general framework to boost the performance and applicability of machine learning (ML) models.
We propose to combine ML decoders with a competing discriminator network that tries to distinguish between codewords and noisy words.
Our framework is game-theoretic, motivated by generative adversarial networks (GANs).
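A hedged torch sketch of such a game: a neural decoder maps noisy words to codeword estimates while a discriminator tries to tell them apart from true codewords. The +/-1 "code", noise model, and sizes are toy assumptions.

```python
# One adversarial training step: discriminator separates codewords from
# decoded noisy words; decoder reconstructs and tries to fool it.
import torch
import torch.nn as nn

n = 16                                               # toy codeword length
dec = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, n), nn.Tanh())
disc = nn.Sequential(nn.Linear(n, 64), nn.ReLU(), nn.Linear(64, 1))
bce = nn.BCEWithLogitsLoss()
opt_d = torch.optim.Adam(disc.parameters(), lr=1e-3)
opt_g = torch.optim.Adam(dec.parameters(), lr=1e-3)

codewords = torch.sign(torch.randn(32, n))           # stand-in +/-1 codewords
noisy = codewords + 0.5 * torch.randn_like(codewords)

# Discriminator step: real codewords vs. decoded noisy words.
d_loss = bce(disc(codewords), torch.ones(32, 1)) + \
         bce(disc(dec(noisy).detach()), torch.zeros(32, 1))
opt_d.zero_grad(); d_loss.backward(); opt_d.step()

# Decoder step: reconstruct the codeword and fool the discriminator.
est = dec(noisy)
g_loss = ((est - codewords) ** 2).mean() + bce(disc(est), torch.ones(32, 1))
opt_g.zero_grad(); g_loss.backward(); opt_g.step()
print(float(d_loss), float(g_loss))
```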
arXiv Detail & Related papers (2021-12-21T19:14:44Z) - Dynamic Neural Representational Decoders for High-Resolution Semantic Segmentation [98.05643473345474]
We propose a novel decoder, termed dynamic neural representational decoder (NRD).
As each location on the encoder's output corresponds to a local patch of the semantic labels, in this work, we represent these local patches of labels with compact neural networks.
This neural representation enables our decoder to leverage the smoothness prior in the semantic label space, and thus makes our decoder more efficient.
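A minimal torch sketch of the idea: a controller predicts, for each coarse location, the weights of a tiny network that paints that location's high-resolution label patch from pixel coordinates. The shapes, patch size, and tiny-net architecture are assumptions, not the paper's design.

```python
# Per-location dynamic tiny networks generated by a 1x1-conv controller.
# All sizes below are hypothetical.
import torch
import torch.nn as nn

C, K, classes, patch = 32, 4, 21, 4          # feature channels, hidden width, classes, patch side
feats = torch.randn(1, C, 8, 8)              # stand-in encoder output (coarse grid)

# Controller: per location, emit the weights of a 2-layer net (x, y) -> class logits.
n_params = 2 * K + K + K * classes + classes
ctrl = nn.Conv2d(C, n_params, kernel_size=1)
w = ctrl(feats)                              # (1, n_params, 8, 8)

coords = torch.stack(torch.meshgrid(
    torch.linspace(0, 1, patch), torch.linspace(0, 1, patch), indexing="ij"),
    dim=-1).reshape(-1, 2)                   # (patch*patch, 2) positions within a patch

def paint(wv):
    """Run one location's dynamic tiny network over its patch coordinates."""
    i = 0
    W1 = wv[i:i + 2 * K].reshape(K, 2); i += 2 * K
    b1 = wv[i:i + K]; i += K
    W2 = wv[i:i + K * classes].reshape(classes, K); i += K * classes
    b2 = wv[i:i + classes]
    h = torch.relu(coords @ W1.T + b1)
    return h @ W2.T + b2                     # (patch*patch, classes) logits

logits_00 = paint(w[0, :, 0, 0])             # label logits for the patch at coarse cell (0, 0)
print(logits_00.shape)
```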
arXiv Detail & Related papers (2021-07-30T04:50:56Z) - Variational Autoencoders: A Harmonic Perspective [79.49579654743341]
We study Variational Autoencoders (VAEs) from the perspective of harmonic analysis.
We show that the encoder variance of a VAE controls the frequency content of the functions parameterised by the VAE encoder and decoder neural networks.
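The starting point is the Gaussian-encoder ELBO, sketched below in standard notation rather than the paper's exact statement: the reconstruction term averages the decoder over a Gaussian of width sigma(x), i.e. convolves it with a Gaussian kernel, which is why larger encoder variance suppresses high-frequency content.

```latex
% Gaussian-encoder ELBO (standard notation; a sketch, not the paper's statement).
\mathcal{L}(x) =
  \mathbb{E}_{z \sim \mathcal{N}(\mu(x),\, \sigma^2(x) I)}
    \big[ \log p_\theta(x \mid z) \big]
  - \mathrm{KL}\!\left( \mathcal{N}(\mu(x),\, \sigma^2(x) I) \,\big\|\, \mathcal{N}(0, I) \right)
```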
arXiv Detail & Related papers (2021-05-31T10:39:25Z) - The Interpretable Dictionary in Sparse Coding [4.205692673448206]
In our work, we illustrate that an ANN trained using sparse coding, under specific sparsity constraints, yields a more interpretable model than a standard deep learning model.
The dictionary learned by sparse coding is more easily understood, and the activations of its elements create a selective feature output.
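The objective behind such a dictionary, in standard sparse coding notation, with lambda trading sparsity against reconstruction error:

```latex
% Standard sparse coding objective: reconstruct x as a sparse combination of
% the atoms (columns) of the dictionary D.
\min_{D,\, z} \; \lVert x - D z \rVert_2^2 + \lambda \lVert z \rVert_1
```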
arXiv Detail & Related papers (2020-11-24T00:26:40Z) - A New Modal Autoencoder for Functionally Independent Feature Extraction [6.690183908967779]
A new modal autoencoder (MAE) is proposed by orthogonalising the columns of the readout weight matrix.
The results were validated on the MNIST variations and USPS classification benchmark suite.
The new MAE introduces a very simple training principle for autoencoders and could be promising for the pre-training of deep neural networks.
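A hedged torch sketch of the training principle as summarised above: penalise non-zero inner products between columns of the readout weight matrix so they become approximately orthogonal. The exact formulation in the paper may differ.

```python
# Off-diagonal Gram penalty on the readout weights; add it to the
# reconstruction loss during training. Sizes are illustrative.
import torch

W = torch.randn(64, 10, requires_grad=True)          # readout weights (features x outputs)

def orthogonality_penalty(W):
    gram = W.T @ W                                   # pairwise column inner products
    off_diag = gram - torch.diag(torch.diagonal(gram))
    return (off_diag ** 2).sum()                     # drive off-diagonals to zero

penalty = orthogonality_penalty(W)
penalty.backward()
print(float(penalty))
```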
arXiv Detail & Related papers (2020-06-25T13:25:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.