Trajectory Volatility for Out-of-Distribution Detection in Mathematical Reasoning
- URL: http://arxiv.org/abs/2405.14039v1
- Date: Wed, 22 May 2024 22:22:25 GMT
- Title: Trajectory Volatility for Out-of-Distribution Detection in Mathematical Reasoning
- Authors: Yiming Wang, Pei Zhang, Baosong Yang, Derek F. Wong, Zhuosheng Zhang, Rui Wang
- Abstract summary: We propose a trajectory-based method, TV score, which uses trajectory volatility for OOD detection in mathematical reasoning.
Our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios.
Our method can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
- Score: 50.84938730450622
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Real-world data deviating from the independent and identically distributed (i.i.d.) assumption of in-distribution training data poses security threats to deep networks, thus advancing out-of-distribution (OOD) detection algorithms. Detection methods in generative language models (GLMs) mainly focus on uncertainty estimation and embedding distance measurement, with the latter proven to be most effective in traditional linguistic tasks like summarization and translation. However, another complex generative scenario, mathematical reasoning, poses significant challenges to embedding-based methods due to the high-density feature of its output spaces, but this feature causes larger discrepancies in the embedding shift trajectory between different samples in latent spaces. Hence, we propose a trajectory-based method, TV score, which uses trajectory volatility for OOD detection in mathematical reasoning. Experiments show that our method outperforms all traditional algorithms on GLMs under mathematical reasoning scenarios and can be extended to more applications with high-density features in output spaces, such as multiple-choice questions.
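The abstract does not spell out how the TV score is computed, so the following Python sketch is only an illustration of the general idea of scoring a sample by the volatility of its layer-wise embedding trajectory. It assumes (these are assumptions, not details from the abstract) that each sample yields one embedding per layer, that in-distribution (ID) embeddings at each layer are summarized by a Gaussian, and that volatility is the mean absolute change in Mahalanobis distance between adjacent layers. The names `fit_id_statistics`, `layer_distances`, and `volatility_score` are hypothetical.

```python
import numpy as np


def fit_id_statistics(id_trajectories, ridge=1e-3):
    """Fit a per-layer Gaussian to ID embeddings.

    id_trajectories: array of shape (n_samples, n_layers, dim).
    Returns per-layer means (n_layers, dim) and precision matrices
    (n_layers, dim, dim). The ridge term is an assumption added to keep
    the covariance invertible.
    """
    _, n_layers, dim = id_trajectories.shape
    means = id_trajectories.mean(axis=0)
    precisions = np.empty((n_layers, dim, dim))
    for layer in range(n_layers):
        cov = np.cov(id_trajectories[:, layer, :], rowvar=False)
        cov = np.atleast_2d(cov) + ridge * np.eye(dim)
        precisions[layer] = np.linalg.inv(cov)
    return means, precisions


def layer_distances(trajectory, means, precisions):
    """Mahalanobis distance of each layer's embedding to that layer's ID Gaussian."""
    dists = []
    for h, mu, prec in zip(trajectory, means, precisions):
        diff = h - mu
        dists.append(float(np.sqrt(diff @ prec @ diff)))
    return np.array(dists)


def volatility_score(trajectory, means, precisions):
    """Mean absolute change in distance between adjacent layers (assumed definition)."""
    dists = layer_distances(trajectory, means, precisions)
    return float(np.mean(np.abs(np.diff(dists))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Synthetic stand-in for ID embeddings: 200 samples, 6 layers, 8 dimensions.
    id_data = rng.normal(size=(200, 6, 8))
    means, precisions = fit_id_statistics(id_data)
    sample = rng.normal(size=(6, 8))
    print(volatility_score(sample, means, precisions))
```

Under these assumptions, a sample would be flagged as OOD by comparing its score against a threshold chosen on held-out ID data; which direction indicates OOD, and the threshold itself, would have to be determined empirically.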
Related papers
- Trajectory Anomaly Detection with Language Models [21.401931052512595]
This paper presents a novel approach for trajectory anomaly detection using an autoregressive causal-attention model, termed LM-TAD.
By treating trajectories as sequences of tokens, our model learns the probability distributions over trajectories, enabling the identification of anomalous locations with high precision.
Our experiments demonstrate the effectiveness of LM-TAD on both synthetic and real-world datasets.
arXiv Detail & Related papers (2024-09-18T17:33:31Z)
- Scalable and reliable deep transfer learning for intelligent fault detection via multi-scale neural processes embedded with knowledge [7.730457774728478]
This paper proposes a novel deep transfer learning (DTL) method known as Neural Processes-based deep transfer learning with graph convolution network (GTNP).
The validation of the proposed method is conducted across 3 IFD tasks, consistently showing the superior detection performance of GTNP compared to the other DTL-based methods.
arXiv Detail & Related papers (2024-02-20T05:39:32Z)
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Few-Shot Anomaly Detection with Adversarial Loss for Robust Feature Representations [8.915958745269442]
Anomaly detection is a critical and challenging task that aims to identify data points deviating from normal patterns and distributions within a dataset.
Various methods have been proposed using a one-class-one-model approach, but these techniques often face practical problems such as memory inefficiency and the requirement of sufficient data for training.
We propose a few-shot anomaly detection method that integrates adversarial training loss to obtain more robust and generalized feature representations.
arXiv Detail & Related papers (2023-12-04T09:45:02Z)
- Distributionally Robust Model-based Reinforcement Learning with Large State Spaces [55.14361269378122]
Three major challenges in reinforcement learning are complex dynamical systems with large state spaces, costly data acquisition processes, and the deviation of real-world dynamics from the training environment at deployment.
We study distributionally robust Markov decision processes with continuous state spaces under the widely used Kullback-Leibler, chi-square, and total variation uncertainty sets.
We propose a model-based approach that utilizes Gaussian Processes and the maximum variance reduction algorithm to efficiently learn multi-output nominal transition dynamics.
arXiv Detail & Related papers (2023-09-05T13:42:11Z)
- Consistency and Diversity induced Human Motion Segmentation [231.36289425663702]
We propose a novel Consistency and Diversity induced human Motion Segmentation (CDMS) algorithm.
Our model factorizes the source and target data into distinct multi-layer feature spaces.
A multi-mutual learning strategy is carried out to reduce the domain gap between the source and target data.
arXiv Detail & Related papers (2022-02-10T06:23:56Z)
- A hypothesis-driven method based on machine learning for neuroimaging data analysis [0.0]
Machine learning approaches for discrimination of spatial patterns of brain images have limited their operation to feature extraction and linear classification tasks.
We show that the estimation of the conventional General Linear Model (GLM) has been connected to a univariate classification task.
We derive a refined statistical test with the GLM based on the parameters obtained by a linear Support Vector Regression (SVR) in the inverse problem (SVR-iGLM).
Using real data from a multisite initiative the proposed MLE-based inference demonstrates statistical power and the control of false positives, outperforming the regular G
arXiv Detail & Related papers (2022-02-09T11:13:02Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta learn either diagonal or diagonal plus low-rank factors to efficiently construct task specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Learning while Respecting Privacy and Robustness to Distributional Uncertainties and Adversarial Data [66.78671826743884]
The distributionally robust optimization framework is considered for training a parametric model.
The objective is to endow the trained model with robustness against adversarially manipulated input data.
Proposed algorithms offer robustness with little overhead.
arXiv Detail & Related papers (2020-07-07T18:25:25Z)
- Anomaly Detection in Trajectory Data with Normalizing Flows [0.0]
We propose an approach based on normalizing flows that enables complex density estimation from data with neural networks.
Our proposal computes exact model likelihood values, an important feature of normalizing flows, for each segment of the trajectory.
We evaluate our methodology, named aggregated anomaly detection with normalizing flows (GRADINGS), using real world trajectory data and compare it with more traditional anomaly detection techniques.
arXiv Detail & Related papers (2020-04-13T14:16:40Z)