From Knots to Knobs: Towards Steerable Collaborative Filtering Using Sparse Autoencoders
- URL: http://arxiv.org/abs/2601.11182v1
- Date: Fri, 16 Jan 2026 10:58:21 GMT
- Title: From Knots to Knobs: Towards Steerable Collaborative Filtering Using Sparse Autoencoders
- Authors: Martin Spišák, Ladislav Peška, Petr Škoda, Vojtěch Vančura, Rodrigo Alves
- Abstract summary: This paper is the first to apply sparse autoencoders to collaborative filtering. We propose suitable mapping functions between semantic concepts and individual neurons. We also evaluate a simple yet effective method that utilizes this representation to steer the recommendations in a desired direction.
- Score: 8.744951561204507
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sparse autoencoders (SAEs) have recently emerged as pivotal tools for introspection into large language models. SAEs can uncover high-quality, interpretable features at different levels of granularity and enable targeted steering of the generation process by selectively activating specific neurons in their latent activations. Our paper is the first to apply this approach to collaborative filtering, aiming to extract similarly interpretable features from representations learned purely from interaction signals. In particular, we focus on a widely adopted class of collaborative autoencoders (CFAEs) and augment them by inserting an SAE between their encoder and decoder networks. We demonstrate that such representation is largely monosemantic and propose suitable mapping functions between semantic concepts and individual neurons. We also evaluate a simple yet effective method that utilizes this representation to steer the recommendations in a desired direction.
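The pipeline the abstract describes (CFAE encoder → SAE → CFAE decoder, with steering by manipulating SAE neurons) can be sketched as follows. This is a minimal, hypothetical NumPy sketch, not the paper's implementation: the random weights stand in for trained models, the tanh activation and the `recommend` helper are assumptions, and steering is illustrated by clamping one SAE neuron's activation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_items, d, m = 100, 16, 64  # items, CFAE latent dim, overcomplete SAE latent dim

# Frozen CFAE pieces (random stand-ins for trained weights).
W_enc = rng.normal(0, 0.1, (n_items, d))   # CFAE encoder
W_dec = rng.normal(0, 0.1, (d, n_items))   # CFAE decoder

# SAE inserted between encoder and decoder: ReLU gives a sparse code.
W_sae_in = rng.normal(0, 0.1, (d, m))
b_sae = np.zeros(m)
W_sae_out = rng.normal(0, 0.1, (m, d))

def recommend(x, steer_neuron=None, alpha=1.0):
    """Score items for interaction vector x, optionally steering one SAE neuron."""
    z = np.tanh(x @ W_enc)                    # CFAE latent
    h = np.maximum(z @ W_sae_in + b_sae, 0.0) # sparse SAE code
    if steer_neuron is not None:
        h = h.copy()
        h[steer_neuron] = alpha               # clamp the chosen concept neuron
    z_hat = h @ W_sae_out                     # SAE reconstruction of the latent
    return z_hat @ W_dec                      # item scores

x = (rng.random(n_items) < 0.1).astype(float)  # a user's binary interactions
base = recommend(x)
steered = recommend(x, steer_neuron=3, alpha=5.0)
```

If the clamped neuron corresponds to an interpretable concept, the steered scores shift the ranking toward items expressing that concept while leaving the rest of the sparse code untouched.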
Related papers
- Evaluating Sparse Autoencoders: From Shallow Design to Matching Pursuit [23.806945495163774]
Sparse autoencoders (SAEs) have recently become central tools for interpretability. This paper evaluates SAEs in a controlled setting using MNIST. We compare them with an iterative SAE that unrolls Matching Pursuit (MP-SAE).
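Matching Pursuit, the greedy sparse-coding procedure that MP-SAE unrolls into its encoder, can be sketched in its textbook form (this is the generic algorithm, not the paper's exact architecture; the dictionary and signal below are synthetic):

```python
import numpy as np

def matching_pursuit(x, D, k):
    """Greedy sparse coding: explain x with at most k greedy picks from the
    columns of dictionary D (assumed unit-norm), one atom per iteration."""
    residual = x.copy()
    coeffs = np.zeros(D.shape[1])
    for _ in range(k):
        scores = D.T @ residual            # correlation of each atom with residual
        j = np.argmax(np.abs(scores))      # best-matching atom
        coeffs[j] += scores[j]             # accumulate (an atom may be re-picked)
        residual = residual - scores[j] * D[:, j]
    return coeffs, residual

rng = np.random.default_rng(1)
D = rng.normal(size=(32, 128))
D /= np.linalg.norm(D, axis=0)             # unit-norm atoms
x = 2.0 * D[:, 5] - 1.0 * D[:, 40]         # signal built from two atoms
coeffs, r = matching_pursuit(x, D, k=8)
```

Each iteration strictly shrinks the residual, which is the kind of iterative inference a single linear-nonlinear SAE encoder pass cannot replicate.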
arXiv Detail & Related papers (2025-06-05T16:57:58Z)
- Sparse Autoencoder Insights on Voice Embeddings [3.2377830280631468]
This study applies sparse autoencoders to speaker embeddings generated from a Titanet model. The extracted features exhibit characteristics similar to those found in Large Language Model embeddings, including feature splitting and steering. The analysis reveals that the autoencoder can identify and manipulate features such as language and music, which are not evident in the original embedding.
arXiv Detail & Related papers (2025-01-31T19:21:43Z)
- Compute Optimal Inference and Provable Amortisation Gap in Sparse Autoencoders [0.0]
A recent line of work has shown promise in using sparse autoencoders (SAEs) to uncover interpretable features in neural network representations. However, the simple linear-nonlinear encoding mechanism in SAEs limits their ability to perform accurate sparse inference. We prove that an SAE encoder is inherently insufficient for accurate sparse inference, even in solvable cases.
arXiv Detail & Related papers (2024-11-20T08:21:53Z)
- Automatically Interpreting Millions of Features in Large Language Models [1.8035046415192353]
Sparse autoencoders (SAEs) can be used to transform activations into a higher-dimensional latent space. We build an open-source pipeline to generate and evaluate natural language explanations for SAE features. Our large-scale analysis confirms that SAE latents are indeed much more interpretable than neurons.
arXiv Detail & Related papers (2024-10-17T17:56:01Z)
- Disentangling Dense Embeddings with Sparse Autoencoders [0.0]
Sparse autoencoders (SAEs) have shown promise in extracting interpretable features from complex neural networks.
We present one of the first applications of SAEs to dense text embeddings from large language models.
We show that the resulting sparse representations maintain semantic fidelity while offering interpretability.
arXiv Detail & Related papers (2024-08-01T15:46:22Z)
- Triple-Encoders: Representations That Fire Together, Wire Together [51.15206713482718]
Contrastive Learning is a representation learning method that encodes relative distances between utterances into the embedding space via a bi-encoder.
This study introduces triple-encoders, which efficiently compute distributed utterance mixtures from these independently encoded utterances.
We find that triple-encoders lead to a substantial improvement over bi-encoders, and even to better zero-shot generalization than single-vector representation models.
arXiv Detail & Related papers (2024-02-19T18:06:02Z)
- Improving Deep Representation Learning via Auxiliary Learnable Target Coding [69.79343510578877]
This paper introduces a novel learnable target coding as an auxiliary regularization of deep representation learning.
Specifically, a margin-based triplet loss and a correlation consistency loss on the proposed target codes are designed to encourage more discriminative representations.
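A margin-based triplet loss of the kind mentioned above can be sketched in its standard form (the paper's exact formulation over target codes may differ; the function and names here are illustrative):

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard margin-based triplet loss: pull the anchor toward the positive
    and push it away from the negative by at least `margin`."""
    d_pos = np.sum((anchor - positive) ** 2)   # squared distance to positive
    d_neg = np.sum((anchor - negative) ** 2)   # squared distance to negative
    return max(d_pos - d_neg + margin, 0.0)    # hinge: zero once margin is met
```

The loss is zero whenever the negative is already at least `margin` farther away than the positive, so gradients focus on violating triplets.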
arXiv Detail & Related papers (2023-05-30T01:38:54Z)
- Wav2Seq: Pre-training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages [58.43299730989809]
We introduce Wav2Seq, the first self-supervised approach to pre-train both parts of encoder-decoder models for speech data.
We induce a pseudo language as a compact discrete representation, and formulate a self-supervised pseudo speech recognition task.
This process stands on its own, or can be applied as low-cost second-stage pre-training.
arXiv Detail & Related papers (2022-05-02T17:59:02Z)
- Multi-Modal Zero-Shot Sign Language Recognition [51.07720650677784]
We propose a multi-modal Zero-Shot Sign Language Recognition model.
A Transformer-based model along with a C3D model is used for hand detection and deep features extraction.
A semantic space is used to map the visual features to the lingual embedding of the class labels.
arXiv Detail & Related papers (2021-09-02T09:10:39Z)
- MetaSDF: Meta-learning Signed Distance Functions [85.81290552559817]
Generalizing across shapes with neural implicit representations amounts to learning priors over the respective function space.
We formalize learning of a shape space as a meta-learning problem and leverage gradient-based meta-learning algorithms to solve this task.
arXiv Detail & Related papers (2020-06-17T05:14:53Z)
- Improve Variational Autoencoder for Text Generation with Discrete Latent Bottleneck [52.08901549360262]
Variational autoencoders (VAEs) are essential tools in end-to-end representation learning.
When paired with a strong auto-regressive decoder, VAEs tend to ignore their latent variables.
We propose a principled approach to enforce an implicit latent feature matching in a more compact latent space.
arXiv Detail & Related papers (2020-04-22T14:41:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.