On the Sequence Evaluation based on Stochastic Processes
- URL: http://arxiv.org/abs/2405.17764v3
- Date: Thu, 03 Oct 2024 03:03:24 GMT
- Title: On the Sequence Evaluation based on Stochastic Processes
- Authors: Tianhao Zhang, Zhexiao Lin, Zhecheng Sheng, Chen Jiang, Dongyeop Kang
- Abstract summary: We propose a novel approach to learn the dynamics of long text sequences, utilizing a negative log-likelihood-based encoder.
We also introduce a likelihood-based evaluation metric for long-text assessment, which measures sequence coherence.
- Score: 17.497842325320825
- Abstract: Generative models have gained significant prominence in Natural Language Processing (NLP), especially in tackling the complex task of modeling and evaluating long text sequences. This task is crucial for advancing various downstream applications, such as text generation and machine translation. Recent methods that utilize stochastic processes to capture the intrinsic dynamics of sequences have shown superior performance in generative modeling. However, the accurate encoding of both temporal and structural dependencies from text datasets, as well as leveraging this encoded information for sequence evaluation, remains an open area of research. In this paper, we propose a novel approach to learn the stochastic dynamics of long text sequences, utilizing a negative log-likelihood-based encoder that outperforms contrastive learning methods. We also introduce a likelihood-based evaluation metric for long-text assessment, which measures sequence coherence and can be applied to downstream tasks such as Human-AI discrimination. Our encoder preserves sequence coherence effectively and performs robustly on out-of-domain datasets. Additionally, the proposed evaluation metric captures both temporal and structural information comprehensively. Theoretical analysis demonstrates the superiority of our metric in sequence evaluation, and experimental results highlight its flexibility and exceptional performance across a variety of tasks, showcasing its utility in diverse NLP applications.
Related papers
- How Hard is this Test Set? NLI Characterization by Exploiting Training Dynamics [49.9329723199239]
We propose a method for the automated creation of a challenging test set without relying on the manual construction of artificial and unrealistic examples.
We categorize the test set of popular NLI datasets into three difficulty levels by leveraging methods that exploit training dynamics.
When our characterization method is applied to the training set, models trained with only a fraction of the data achieve comparable performance to those trained on the full dataset.
arXiv Detail & Related papers (2024-10-04T13:39:21Z)
- Pointer-Guided Pre-Training: Infusing Large Language Models with Paragraph-Level Contextual Awareness [3.2925222641796554]
"Pointer-guided segment ordering" (SO) is a novel pre-training technique aimed at enhancing the contextual understanding of paragraph-level text representations.
Our experiments show that pointer-guided pre-training significantly enhances the model's ability to understand complex document structures.
arXiv Detail & Related papers (2024-06-06T15:17:51Z)
- State Sequences Prediction via Fourier Transform for Representation Learning [111.82376793413746]
We propose State Sequences Prediction via Fourier Transform (SPF), a novel method for learning expressive representations efficiently.
We theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity.
Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
arXiv Detail & Related papers (2023-10-24T14:47:02Z)
- Effective Long-Context Scaling of Foundation Models [90.57254298730923]
We present a series of long-context LLMs that support effective context windows of up to 32,768 tokens.
Our models achieve consistent improvements on most regular tasks and significant improvements on long-context tasks over Llama 2.
arXiv Detail & Related papers (2023-09-27T21:41:49Z)
- Instruction Position Matters in Sequence Generation with Large Language Models [67.87516654892343]
Large language models (LLMs) are capable of performing conditional sequence generation tasks, such as translation or summarization.
We propose enhancing the instruction-following capability of LLMs by shifting the position of task instructions after the input sentences.
arXiv Detail & Related papers (2023-08-23T12:36:57Z)
- Scalable Learning of Latent Language Structure With Logical Offline Cycle Consistency [71.42261918225773]
Conceptually, LOCCO can be viewed as a form of self-learning where the semantic parser being trained is used to generate annotations for unlabeled text.
As an added bonus, the annotations produced by LOCCO can be trivially repurposed to train a neural text generation model.
arXiv Detail & Related papers (2023-05-31T16:47:20Z)
- FineDiving: A Fine-grained Dataset for Procedure-aware Action Quality Assessment [93.09267863425492]
We argue that understanding both high-level semantics and internal temporal structures of actions in competitive sports videos is the key to making predictions accurate and interpretable.
We construct a new fine-grained dataset, called FineDiving, developed on diverse diving events with detailed annotations on action procedures.
arXiv Detail & Related papers (2022-04-07T17:59:32Z)
- Learning Temporal Point Processes for Efficient Retrieval of Continuous Time Event Sequences [24.963828650935913]
We propose NEUROSEQRET which learns to retrieve and rank a relevant set of continuous-time event sequences for a given query sequence.
We develop two variants of the relevance model which offer a tradeoff between accuracy and efficiency.
Our experiments with several datasets show the significant accuracy boost of NEUROSEQRET beyond several baselines.
arXiv Detail & Related papers (2022-02-17T11:16:31Z)
- Contrastively Disentangled Sequential Variational Autoencoder [20.75922928324671]
We propose a novel sequence representation learning method, named Contrastively Disentangled Sequential Variational Autoencoder (C-DSVAE).
We use a novel evidence lower bound which maximizes the mutual information between the input and the latent factors, while penalizing the mutual information between the static and dynamic factors.
Our experiments show that C-DSVAE significantly outperforms the previous state-of-the-art methods on multiple metrics.
arXiv Detail & Related papers (2021-10-22T23:00:32Z)
- Interpretable Feature Construction for Time Series Extrinsic Regression [0.028675177318965035]
In some application domains, the target variable is numerical, and the problem is known as time series extrinsic regression (TSER).
We suggest an extension of a Bayesian method for robust and interpretable feature construction and selection in the context of TSER.
Our approach takes a relational route to TSER: (i) we build various simple representations of the time series, which are stored in a relational data scheme; then (ii) a propositionalisation technique is applied to build interpretable features from secondary tables to "flatten" the data.
arXiv Detail & Related papers (2021-03-15T08:12:19Z)
This list is automatically generated from the titles and abstracts of the papers on this site.