Slow Feature Analysis on Markov Chains from Goal-Directed Behavior
- URL: http://arxiv.org/abs/2506.01145v1
- Date: Sun, 01 Jun 2025 19:57:41 GMT
- Title: Slow Feature Analysis on Markov Chains from Goal-Directed Behavior
- Authors: Merlin Schüler, Eddie Seabrook, Laurenz Wiskott
- Abstract summary: This work investigates the effects of goal-directed behavior on value-function approximation in an idealized setting. Three correction routes, which can potentially alleviate detrimental scaling effects, are evaluated and discussed.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Slow Feature Analysis is an unsupervised representation learning method that extracts slowly varying features from temporal data and can be used as a basis for subsequent reinforcement learning. Often, the behavior that generates the data on which the representation is learned is assumed to be a uniform random walk. Less research has focused on learning a representation from samples generated by goal-directed behavior, as is commonly the case in a reinforcement learning setting. In a spatial setting, goal-directed behavior typically leads to significant differences in state occupancy between states close to a reward location and states far from it. Through the perspective of optimal slow features on ergodic Markov chains, this work investigates the effects of these differences on value-function approximation in an idealized setting. Furthermore, three correction routes, which can potentially alleviate detrimental scaling effects, are evaluated and discussed. In addition, the special case of goal-averse behavior is considered.
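For context, the following is a minimal, self-contained sketch of linear Slow Feature Analysis applied to a trajectory from a uniform random walk on a ring of states, i.e., the kind of idealized ergodic Markov chain the abstract refers to. It illustrates the general technique only; it is not the authors' code, and all names (`linear_sfa`, the ring size, the number of samples) are illustrative choices.

```python
import numpy as np

def linear_sfa(X, n_features=2):
    """Minimal linear Slow Feature Analysis (illustrative sketch).

    X : (T, d) array of observations ordered in time.
    Returns the n_features slowest-varying projections of the whitened data.
    """
    # Center and whiten so every projection has zero mean and unit variance.
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    keep = eigval > 1e-10                        # drop degenerate directions
    W = eigvec[:, keep] / np.sqrt(eigval[keep])
    Z = Xc @ W

    # Slowness objective: minimize the variance of the temporal derivative.
    dZ = np.diff(Z, axis=0)
    dcov = np.cov(dZ, rowvar=False)
    _, slow_vec = np.linalg.eigh(dcov)           # ascending: smallest = slowest
    return Z @ slow_vec[:, :n_features]

# Toy ergodic Markov chain: a lazy uniform random walk on a ring of 20 states,
# observed through one-hot encodings of the state.
rng = np.random.default_rng(0)
n_states, T = 20, 5000
states = np.zeros(T, dtype=int)
for t in range(1, T):
    states[t] = (states[t - 1] + rng.choice([-1, 0, 1])) % n_states
X = np.eye(n_states)[states]
slow_features = linear_sfa(X, n_features=2)      # roughly sine/cosine over the ring
```

For a uniform random walk such as this, the leading slow features approximate the slowest eigenfunctions of the chain, which is the idealized baseline against which goal-directed behavior is compared in the paper.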
Related papers
- Stochastic Encodings for Active Feature Acquisition [100.47043816019888]
Active Feature Acquisition is an instance-wise, sequential decision making problem. The aim is to dynamically select which feature to measure based on current observations, independently for each test instance. Common approaches either use Reinforcement Learning, which experiences training difficulties, or greedily maximize the conditional mutual information of the label and unobserved features, which makes the acquisitions myopic. We introduce a latent variable model, trained in a supervised manner. Acquisitions are made by reasoning about the features across many possible unobserved realizations in a latent space.
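To make the greedy baseline mentioned above concrete (the one the summary characterizes as myopic), here is a toy sketch of conditional-mutual-information-driven feature acquisition on discrete data. It is not the paper's latent-variable method; `empirical_mi`, `greedy_cmi_acquisition`, and the plug-in estimator are illustrative assumptions.

```python
import numpy as np

def empirical_mi(a, b):
    """Plug-in estimate of the mutual information between two discrete arrays."""
    mi = 0.0
    for va in np.unique(a):
        for vb in np.unique(b):
            p_ab = np.mean((a == va) & (b == vb))
            p_a, p_b = np.mean(a == va), np.mean(b == vb)
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (p_a * p_b))
    return mi

def greedy_cmi_acquisition(x_true, train_X, train_y, budget=3):
    """Greedily acquire features by conditional mutual information (toy sketch).

    At each step, condition on the feature values observed so far by filtering
    the training set, then measure the unobserved feature whose empirical MI
    with the label is largest. This is the myopic baseline, not the paper's
    latent-variable approach.
    """
    observed = {}
    for _ in range(min(budget, train_X.shape[1])):
        # Keep only training rows consistent with what has been observed so far.
        mask = np.ones(len(train_X), dtype=bool)
        for j, v in observed.items():
            mask &= train_X[:, j] == v
        Xs, ys = train_X[mask], train_y[mask]
        if len(ys) == 0:
            break
        candidates = [j for j in range(train_X.shape[1]) if j not in observed]
        best = max(candidates, key=lambda j: empirical_mi(Xs[:, j], ys))
        observed[best] = x_true[best]   # "measure" the selected feature
    return observed
```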
arXiv Detail & Related papers (2025-08-03T23:48:46Z)
- Anomalous Decision Discovery using Inverse Reinforcement Learning [3.3675535571071746]
Anomaly detection plays a critical role in Autonomous Vehicles (AVs) by identifying unusual behaviors through perception systems. Current approaches, which often rely on predefined thresholds or supervised learning paradigms, exhibit reduced efficacy when confronted with unseen scenarios. We present Trajectory-Reward Guided Adaptive Pre-training (TRAP), a novel IRL framework for anomaly detection.
arXiv Detail & Related papers (2025-07-06T17:01:02Z)
- Inverse Reinforcement Learning using Revealed Preferences and Passive Stochastic Optimization [15.878313629774269]
The first two chapters view inverse reinforcement learning (IRL) through the lens of revealed preferences from microeconomics. The third chapter studies adaptive gradient algorithms.
arXiv Detail & Related papers (2025-07-06T13:56:02Z)
- Spatial regularisation for improved accuracy and interpretability in keypoint-based registration [5.286949071316761]
Recent approaches based on unsupervised keypoint detection stand out as very promising for interpretability. Here, we propose a three-fold loss to regularise the spatial distribution of the features. Our loss considerably improves the interpretability of the features, which now correspond to precise and anatomically meaningful landmarks.
arXiv Detail & Related papers (2025-03-06T14:48:25Z)
- Unlearning-based Neural Interpretations [51.99182464831169]
We show that current baselines defined using static functions are biased, fragile and manipulable. We propose UNI to compute an (un)learnable, debiased and adaptive baseline by perturbing the input towards an unlearning direction of steepest ascent.
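The core idea of perturbing an input along an unlearning direction can be sketched roughly as below. This is an illustrative gradient-ascent construction of an input-specific attribution baseline, not the paper's exact UNI procedure; the step count and step size are arbitrary assumptions.

```python
import torch
import torch.nn.functional as F

def unlearning_baseline(model, x, label, steps=20, step_size=0.01):
    """Perturb an input toward an 'unlearning' direction (illustrative sketch).

    Repeated gradient ascent on the loss of the true label progressively removes
    the model's evidence for that label; the resulting point can then serve as an
    adaptive, input-specific baseline for attribution methods.
    """
    baseline = x.detach().clone()
    target = torch.tensor([label])
    for _ in range(steps):
        baseline.requires_grad_(True)
        loss = F.cross_entropy(model(baseline.unsqueeze(0)), target)
        grad, = torch.autograd.grad(loss, baseline)
        baseline = (baseline + step_size * grad).detach()  # ascend the loss
    return baseline
```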
arXiv Detail & Related papers (2024-10-10T16:02:39Z)
- Bidirectional Decoding: Improving Action Chunking via Guided Test-Time Sampling [51.38330727868982]
We show how action chunking impacts the divergence between a learner and a demonstrator. We propose Bidirectional Decoding (BID), a test-time inference algorithm that bridges action chunking with closed-loop adaptation. Our method boosts the performance of two state-of-the-art generative policies across seven simulation benchmarks and two real-world tasks.
arXiv Detail & Related papers (2024-08-30T15:39:34Z)
- On the Dynamics Under the Unhinged Loss and Beyond [104.49565602940699]
We introduce the unhinged loss, a concise loss function that offers more mathematical opportunities to analyze closed-form dynamics.
The unhinged loss allows for considering more practical techniques, such as time-varying learning rates and feature normalization.
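For reference, the unhinged loss in its commonly cited binary form is a linear function of the score, as sketched below. Whether the paper adopts exactly this formulation or a multiclass variant is not stated in the summary above, so treat this as an assumption.

```python
import numpy as np

def unhinged_loss(y, score):
    """Unhinged loss for binary labels y in {-1, +1} (commonly cited form).

    l(y, s) = 1 - y * s is linear in the score, so its gradient with respect
    to the score is the constant -y; this linearity is what makes the training
    dynamics tractable in closed form for simple models.
    """
    return 1.0 - np.asarray(y) * np.asarray(score)
```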
arXiv Detail & Related papers (2023-12-13T02:11:07Z)
- Improving Estimation of the Koopman Operator with Kolmogorov-Smirnov Indicator Functions [0.0]
Key to the practical success of the approach is the identification of a set of observables that form a good basis in which to expand the slow relaxation modes.
We propose a simple and computationally efficient clustering procedure to infer surrogate observables that form a good basis for slow modes.
We consistently demonstrate that the inferred indicator functions can significantly improve the estimation of the leading eigenvalues of the Koopman operators.
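As a rough illustration of the generic recipe behind such indicator-function bases (not the paper's Kolmogorov-Smirnov construction), one can cluster the sampled states, treat cluster membership as one-hot observables, and read the leading Koopman eigenvalues off the resulting empirical transition matrix. The function name, `n_clusters`, and `lag` below are arbitrary assumptions.

```python
import numpy as np
from sklearn.cluster import KMeans

def koopman_eigs_from_indicators(traj, n_clusters=10, lag=1, n_eigs=3):
    """Estimate leading Koopman eigenvalues with indicator observables (sketch).

    traj: (T, d) trajectory sampled from the dynamics. Cluster labels define
    one-hot indicator observables; the EDMD matrix then reduces to an empirical
    transition matrix between clusters.
    """
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit_predict(traj)
    # Count transitions between clusters separated by `lag` steps.
    counts = np.zeros((n_clusters, n_clusters))
    for a, b in zip(labels[:-lag], labels[lag:]):
        counts[a, b] += 1
    row_sums = counts.sum(axis=1, keepdims=True)
    P = np.divide(counts, row_sums, out=np.zeros_like(counts), where=row_sums > 0)
    # The eigenvalues of largest modulus govern the slowest relaxation modes.
    eigvals = np.linalg.eigvals(P)
    return np.sort(np.abs(eigvals))[::-1][:n_eigs]
```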
arXiv Detail & Related papers (2023-06-09T15:01:43Z)
- How Does Data Freshness Affect Real-time Supervised Learning? [15.950108699395077]
We show that the performance of real-time supervised learning degrades monotonically as the feature becomes stale.
To minimize the inference error in real-time, we propose a new "selection-from-buffer" model for sending the features.
Data-driven evaluations are presented to illustrate the benefits of the proposed scheduling algorithms.
arXiv Detail & Related papers (2022-08-15T00:14:13Z)
- Locality-aware Attention Network with Discriminative Dynamics Learning for Weakly Supervised Anomaly Detection [0.8883733362171035]
We propose a Discriminative Dynamics Learning (DDL) method with two objective functions, i.e., dynamics ranking loss and dynamics alignment loss.
A Locality-aware Attention Network (LA-Net) is constructed to capture global correlations and re-calibrate the location preference across snippets, followed by a multilayer perceptron with causal convolution to obtain anomaly scores.
arXiv Detail & Related papers (2022-08-11T04:27:33Z)
- Interpretable Deep Feature Propagation for Early Action Recognition [39.966828592322315]
In this study, we address action prediction by investigating how action patterns evolve over time in a spatial feature space.
We work with intermediate-layer ConvNet features, which allow for abstraction from raw data, while retaining spatial layout.
We employ a Kalman filter to combat error build-up and unify across prediction start times.
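As a reminder of the mechanism being referenced, one predict/update cycle of a generic linear Kalman filter looks like the sketch below. This is not the paper's specific state and observation model; F, H, Q, and R stand for whatever dynamics and noise models one assumes for the feature trajectory.

```python
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle of a standard linear Kalman filter (sketch).

    x, P : current state estimate and its covariance
    z    : new measurement (e.g., freshly observed features)
    F, H : state-transition and observation models
    Q, R : process and measurement noise covariances
    """
    # Predict: propagate the estimate through the dynamics model.
    x_pred = F @ x
    P_pred = F @ P @ F.T + Q
    # Update: correct the prediction with the measurement.
    innovation = z - H @ x_pred
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ innovation
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```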
arXiv Detail & Related papers (2021-07-11T19:40:19Z)
- Extreme Memorization via Scale of Initialization [72.78162454173803]
We construct an experimental setup in which changing the scale of initialization strongly impacts the implicit regularization induced by SGD.
We find that the extent and manner in which generalization ability is affected depends on the activation and loss function used.
In the case of the homogeneous ReLU activation, we show that this behavior can be attributed to the loss function.
arXiv Detail & Related papers (2020-08-31T04:53:11Z)
- Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
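One plausible instantiation of such a counterfactual training signal is to align the model's input gradient with the edit that turns one example of a pair into the other; the paper's exact auxiliary objective may differ, so the sketch below is only an assumption-laden illustration.

```python
import torch
import torch.nn.functional as F

def gradient_supervision_loss(model, x, x_cf, y, y_cf):
    """Auxiliary loss aligning an input gradient with a counterfactual edit (sketch).

    x and x_cf are a minimally different pair with labels y != y_cf. The gradient
    of the score margin (counterfactual label minus original label) with respect
    to x is encouraged, via cosine similarity, to point along the edit x_cf - x.
    """
    x = x.detach().clone().requires_grad_(True)
    scores = model(x.unsqueeze(0)).squeeze(0)
    margin = scores[y_cf] - scores[y]
    grad = torch.autograd.grad(margin, x, create_graph=True)[0]
    edit = (x_cf - x).detach()
    cos = F.cosine_similarity(grad.flatten(), edit.flatten(), dim=0)
    return 1.0 - cos  # zero when the gradient is perfectly aligned with the edit
```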
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
- Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.