BatchTopK Sparse Autoencoders
- URL: http://arxiv.org/abs/2412.06410v1
- Date: Mon, 09 Dec 2024 11:39:00 GMT
- Title: BatchTopK Sparse Autoencoders
- Authors: Bart Bussmann, Patrick Leask, Neel Nanda
- Abstract summary: BatchTopK is a training method that improves upon TopK SAEs by relaxing the top-k constraint to the batch level. We show that BatchTopK SAEs consistently outperform TopK SAEs in reconstructing activations from GPT-2 Small and Gemma 2 2B.
- Score: 1.8754113193437074
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse autoencoders (SAEs) have emerged as a powerful tool for interpreting language model activations by decomposing them into sparse, interpretable features. A popular approach is the TopK SAE, which uses a fixed number of the most active latents per sample to reconstruct the model activations. We introduce BatchTopK SAEs, a training method that improves upon TopK SAEs by relaxing the top-k constraint to the batch level, allowing a variable number of latents to be active per sample. As a result, BatchTopK adaptively allocates more or fewer latents depending on the sample, improving reconstruction without sacrificing average sparsity. We show that BatchTopK SAEs consistently outperform TopK SAEs in reconstructing activations from GPT-2 Small and Gemma 2 2B, and achieve performance comparable to state-of-the-art JumpReLU SAEs. However, an advantage of BatchTopK is that the average number of latents can be directly specified, rather than approximately tuned through a costly hyperparameter sweep. We provide code for training and evaluating BatchTopK SAEs at https://github.com/bartbussmann/BatchTopK
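As a rough illustration of the difference between the two selection rules described in the abstract (a minimal NumPy sketch, not the paper's actual SAE forward pass, which the linked repository implements), assume a matrix of post-ReLU latent activations with one row per sample:

```python
import numpy as np

def topk_per_sample(acts, k):
    """TopK SAE rule: keep each sample's k largest activations, zero the rest."""
    out = np.zeros_like(acts)
    idx = np.argsort(acts, axis=1)[:, -k:]            # indices of k largest per row
    rows = np.arange(acts.shape[0])[:, None]
    out[rows, idx] = acts[rows, idx]
    return out

def batchtopk(acts, k):
    """BatchTopK rule: keep the (batch_size * k) largest activations across the
    whole batch, so individual samples may use more or fewer than k latents
    while the batch average stays exactly k."""
    n_keep = acts.shape[0] * k
    flat = acts.ravel()
    keep_idx = np.argsort(flat)[-n_keep:]             # globally largest activations
    out = np.zeros_like(flat)
    out[keep_idx] = flat[keep_idx]
    return out.reshape(acts.shape)
```

By construction, the average number of active latents per sample under `batchtopk` equals `k`, which is why the average sparsity can be specified directly rather than tuned via a hyperparameter sweep.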
Related papers
- AbsTopK: Rethinking Sparse Autoencoders For Bidirectional Features [19.58274892471746]
Sparse autoencoders (SAEs) have emerged as powerful techniques for interpretability of large language models. We introduce such a framework by unrolling the proximal gradient method for sparse coding. We show that a single-step update naturally recovers common SAE variants, including ReLU, JumpReLU, and TopK.
arXiv Detail & Related papers (2025-10-01T01:29:31Z) - Distribution-Aware Feature Selection for SAEs [1.2396474483677118]
TopK SAE reconstructs each token from its K most active latents. BatchTopK addresses this limitation by selecting top activations across a batch of tokens. This improves average reconstruction but risks an "activation lottery".
arXiv Detail & Related papers (2025-08-29T04:42:17Z) - TreeRPO: Tree Relative Policy Optimization [55.97385410074841]
TreeRPO is a novel method that estimates the mathematical expectations of rewards at various reasoning steps using tree sampling. Building on the group-relative reward training mechanism of GRPO, TreeRPO innovatively computes rewards based on step-level groups generated during tree sampling.
arXiv Detail & Related papers (2025-06-05T15:56:38Z) - Ensembling Sparse Autoencoders [10.81463830315253]
Sparse autoencoders (SAEs) are used to decompose neural network activations into human-interpretable features. We propose to ensemble multiple SAEs through naive bagging and boosting. Our empirical results demonstrate that ensembling SAEs can improve the reconstruction of language model activations, diversity of features, and SAE stability.
arXiv Detail & Related papers (2025-05-21T23:31:21Z) - Breaking the Batch Barrier (B3) of Contrastive Learning via Smart Batch Mining [57.352097333505476]
'Breaking the Batch Barrier' (B3) is a novel batch construction strategy designed to curate high-quality batches for Contrastive Learning (CL). Our approach begins by using a pretrained teacher embedding model to rank all examples in the dataset, from which a similarity graph is built. A community detection algorithm is then applied to this graph to identify clusters of examples that serve as strong negatives for one another. The clusters are then used to construct batches that are rich in in-batch negatives.
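The B3 pipeline above (teacher embeddings, similarity graph, community detection, cluster-based batches) can be sketched roughly as follows. This is a simplified illustration under stated assumptions: the abstract does not name the community detection algorithm, so connected components over a thresholded cosine-similarity graph stand in for it, and the `sim_threshold` parameter is hypothetical:

```python
import numpy as np

def b3_style_batches(embs, batch_size, sim_threshold=0.5):
    """Sketch of B3-style batch mining: link examples whose teacher embeddings
    are highly similar (strong negatives for one another), find connected
    components as a stand-in for community detection, then fill batches
    cluster by cluster so each batch is rich in hard in-batch negatives."""
    embs = embs / np.linalg.norm(embs, axis=1, keepdims=True)
    sim = embs @ embs.T                      # cosine similarity graph
    n = len(embs)

    # Union-find over edges above the similarity threshold.
    parent = list(range(n))
    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x
    for i in range(n):
        for j in range(i + 1, n):
            if sim[i, j] >= sim_threshold:
                parent[find(i)] = find(j)

    # Group indices by cluster, then emit batches cluster by cluster.
    clusters = {}
    for i in range(n):
        clusters.setdefault(find(i), []).append(i)
    order = [i for members in clusters.values() for i in members]
    return [order[s:s + batch_size] for s in range(0, n, batch_size)]
```

Because batches are filled cluster by cluster, most examples in a batch come from the same community and thus act as strong in-batch negatives for each other, which is the property B3 is after.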
arXiv Detail & Related papers (2025-05-16T14:25:43Z) - Finding Needles in Emb(a)dding Haystacks: Legal Document Retrieval via Bagging and SVR Ensembles [51.0691253204425]
We introduce a retrieval approach leveraging Support Vector Regression ensembles, bootstrap aggregation (bagging), and embedding spaces on the German dataset for Legal Information Retrieval (GerDaLIR).
We show improved recall over the baselines using our voting ensemble, suggesting promising initial results, without training or fine-tuning any deep learning models.
arXiv Detail & Related papers (2025-01-09T07:21:44Z) - Prefixing Attention Sinks can Mitigate Activation Outliers for Large Language Model Quantization [13.475050661770796]
We develop a simple yet effective strategy to facilitate per-tensor activation quantization by preventing the generation of problematic tokens.
We tune the token cache to regularize the activations of subsequent tokens to be more quantization-friendly.
We thoroughly evaluate our method over a wide range of models and benchmarks and find that it significantly surpasses the established baseline of per-tensor W8A8 quantization.
arXiv Detail & Related papers (2024-06-17T18:33:44Z) - CapS-Adapter: Caption-based MultiModal Adapter in Zero-Shot Classification [3.594351309950969]
CapS-Adapter is an innovative method that harnesses both image and caption features to exceed existing state-of-the-art techniques in training-free scenarios.
Our method achieves outstanding zero-shot classification results across 19 benchmark datasets, improving accuracy by 2.19% over the previous leading method.
arXiv Detail & Related papers (2024-05-26T14:50:40Z) - TS-RSR: A provably efficient approach for batch bayesian optimization [4.622871908358325]
This paper presents a new approach for batch Bayesian Optimization (BO) called Thompson Sampling-Regret to Sigma Ratio directed sampling.
Our sampling objective is able to coordinate the actions chosen in each batch in a way that minimizes redundancy between points.
We demonstrate that our method attains state-of-the-art performance on a range of challenging synthetic and realistic test functions.
arXiv Detail & Related papers (2024-03-07T18:58:26Z) - Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits approach (LORT) without the requirement of prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - How to Prune Your Language Model: Recovering Accuracy on the "Sparsity
May Cry'' Benchmark [60.72725673114168]
We revisit the question of accurate BERT-pruning during fine-tuning on downstream datasets.
We propose a set of general guidelines for successful pruning, even on the challenging SMC benchmark.
arXiv Detail & Related papers (2023-12-21T03:11:30Z) - BatchGFN: Generative Flow Networks for Batch Active Learning [80.73649229919454]
BatchGFN is a novel approach for pool-based active learning that uses generative flow networks to sample sets of data points proportional to a batch reward.
We show that our approach enables principled sampling of near-optimal-utility batches at inference time, with a single forward pass per point in the batch, on toy regression problems.
arXiv Detail & Related papers (2023-06-26T20:41:36Z) - Contextual Bandits with Packing and Covering Constraints: A Modular Lagrangian Approach via Regression [65.8785736964253]
We consider contextual bandits with linear constraints (CBwLC), a variant of contextual bandits in which the algorithm consumes multiple resources subject to linear constraints on total consumption.
This problem generalizes contextual bandits with knapsacks (CBwK), allowing for packing and covering constraints, as well as positive and negative resource consumption.
We provide the first algorithm for CBwLC (or CBwK) that is based on regression oracles. The algorithm is simple, computationally efficient, and statistically optimal under mild assumptions.
arXiv Detail & Related papers (2022-11-14T16:08:44Z) - Stochastic Batch Acquisition: A Simple Baseline for Deep Active Learning [48.19646855997791]
We examine a simple strategy for adapting well-known single-point acquisition functions to allow batch active learning.
This strategy can perform just as well as compute-intensive state-of-the-art batch acquisition functions, like BatchBALD or BADGE, while using orders of magnitude less compute.
In addition to providing a practical option for machine learning practitioners, the surprising success of the proposed method in a wide range of experimental settings raises a difficult question for the field.
arXiv Detail & Related papers (2021-06-22T21:07:50Z) - PowerEvaluationBALD: Efficient Evaluation-Oriented Deep (Bayesian) Active Learning with Stochastic Acquisition Functions [2.0305676256390934]
We develop BatchEvaluationBALD, a new acquisition function for deep active learning.
We also develop a variant for the non-Bayesian setting, which we call Evaluation Information Gain.
To reduce computational requirements and allow these methods to scale to larger batch sizes, we introduce acquisition functions that use importance-sampling of tempered acquisition scores.
arXiv Detail & Related papers (2021-01-10T13:46:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.