On the long-term learning ability of LSTM LMs
- URL: http://arxiv.org/abs/2106.08927v1
- Date: Wed, 16 Jun 2021 16:34:37 GMT
- Title: On the long-term learning ability of LSTM LMs
- Authors: Wim Boes, Robbe Van Rompaey, Lyan Verwimp, Joris Pelemans, Hugo Van hamme, Patrick Wambacq
- Abstract summary: We evaluate a contextual extension based on the Continuous Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs.
Sentence-level models using the long-term contextual module perform comparably to vanilla discourse-level LSTM LMs.
On the other hand, the extension does not provide gains for discourse-level models.
- Score: 17.700860670640015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We inspect the long-term learning ability of Long Short-Term Memory language
models (LSTM LMs) by evaluating a contextual extension based on the Continuous
Bag-of-Words (CBOW) model for both sentence- and discourse-level LSTM LMs and
by analyzing its performance. We evaluate on text and speech. Sentence-level
models using the long-term contextual module perform comparably to vanilla
discourse-level LSTM LMs. On the other hand, the extension does not provide
gains for discourse-level models. These findings indicate that discourse-level
LSTM LMs already rely on contextual information to perform long-term learning.
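The abstract does not spell out how the CBOW-based contextual module is wired into the LSTM, but the pattern it names is well known: average the embeddings of the words seen so far into a single long-term context vector and feed it to the LSTM alongside the current word embedding. The sketch below illustrates that pattern only; the class name, dimensions, and wiring are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class ContextualLSTMLM(nn.Module):
    """LSTM LM with a CBOW-style long-term context vector (illustrative).

    The context vector is the mean of the embeddings of all preceding
    words, concatenated to the current word embedding at every step.
    Names and wiring are assumptions, not the paper's exact architecture.
    """

    def __init__(self, vocab_size=10000, emb_dim=128, hidden_dim=256):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, emb_dim)
        # LSTM input = current word embedding + CBOW context vector
        self.lstm = nn.LSTM(2 * emb_dim, hidden_dim, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)

    def forward(self, tokens):
        emb = self.embedding(tokens)                  # (batch, seq, emb)
        # Running mean of embeddings up to each position (CBOW context).
        cumsum = emb.cumsum(dim=1)
        counts = torch.arange(1, emb.size(1) + 1, device=emb.device)
        running_mean = cumsum / counts.view(1, -1, 1)
        # Shift right so position t only sees words before t.
        context = torch.zeros_like(emb)
        context[:, 1:] = running_mean[:, :-1]
        lstm_in = torch.cat([emb, context], dim=-1)   # (batch, seq, 2*emb)
        hidden, _ = self.lstm(lstm_in)
        return self.out(hidden)                       # next-word logits

model = ContextualLSTMLM()
logits = model(torch.randint(0, 10000, (2, 20)))
print(logits.shape)  # torch.Size([2, 20, 10000])
```

A discourse-level LSTM LM instead carries its recurrent state across sentence boundaries, which is consistent with the abstract's conclusion that such models already exploit long-term context and gain little from the extra module.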
Related papers
- LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios [16.72802527902692]
We introduce the Long-context Instruction-Following Benchmark (LIFBench), a scalable dataset designed to evaluate Large Language Models (LLMs) in long-context scenarios.
LIFBench comprises three long-context scenarios and eleven diverse tasks, supported by 2,766 instructions generated through an automated expansion method across three dimensions: length, expression, and variables.
For evaluation, we propose LIFEval, a rubric-based assessment framework that provides precise, automated scoring of complex LLM responses without relying on LLM-assisted evaluations or human judgments.
arXiv Detail & Related papers (2024-11-11T14:43:51Z)
- Efficiently Exploring Large Language Models for Document-Level Machine Translation with In-context Learning [38.89119606657543]
In contrast to sentence-level translation, document-level translation (DOCMT) by large language models (LLMs) based on in-context learning faces two major challenges.
We propose a Context-Aware Prompting method (CAP) to generate more accurate, cohesive, and coherent translations via in-context learning.
We conduct extensive experiments across various DOCMT tasks, and the results demonstrate the effectiveness of our approach.
arXiv Detail & Related papers (2024-06-11T09:11:17Z)
- LongSkywork: A Training Recipe for Efficiently Extending Context Length in Large Language Models [61.12177317970258]
LongSkywork is a long-context Large Language Model capable of processing up to 200,000 tokens.
We develop two novel methods for creating synthetic data.
LongSkywork achieves outstanding performance on a variety of long-context benchmarks.
arXiv Detail & Related papers (2024-06-02T03:34:41Z)
- Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks [76.43527940649939]
We introduce Ada-LEval, a benchmark for evaluating the long-context understanding of large language models (LLMs).
Ada-LEval includes two challenging subsets, TSort and BestAnswer, which enable a more reliable evaluation of LLMs' long context capabilities.
We evaluate 4 state-of-the-art closed-source API models and 6 open-source models with Ada-LEval.
arXiv Detail & Related papers (2024-04-09T17:30:48Z)
- Large Language Models can Contrastively Refine their Generation for Better Sentence Representation Learning [57.74233319453229]
Large language models (LLMs) have emerged as a groundbreaking technology and their unparalleled text generation capabilities have sparked interest in their application to the fundamental sentence representation learning task.
We propose MultiCSR, a multi-level contrastive sentence representation learning framework that decomposes the process of prompting LLMs to generate a corpus.
Our experiments reveal that MultiCSR enables a less advanced LLM to surpass ChatGPT, while applying it to ChatGPT yields new state-of-the-art results.
arXiv Detail & Related papers (2023-10-17T03:21:43Z)
- BAMBOO: A Comprehensive Benchmark for Evaluating Long Text Modeling Capacities of Large Language Models [141.21603469555225]
Large language models (LLMs) have achieved strong proficiency on NLP tasks of normal length.
We propose BAMBOO, a multi-task long context benchmark.
It consists of 10 datasets from 5 different long text understanding tasks.
arXiv Detail & Related papers (2023-09-23T11:36:15Z)
- L-Eval: Instituting Standardized Evaluation for Long Context Language Models [91.05820785008527]
We propose L-Eval to institute a more standardized evaluation for long context language models (LCLMs).
We build a new evaluation suite containing 20 sub-tasks, 508 long documents, and over 2,000 human-labeled query-response pairs.
Results show that popular n-gram matching metrics generally do not correlate well with human judgment.
arXiv Detail & Related papers (2023-07-20T17:59:41Z)
- Deep Learning Approaches to Lexical Simplification: A Survey [19.079916794185642]
Lexical Simplification (LS) is the task of replacing complex words in a sentence with simpler ones.
LS is the lexical component of Text Simplification (TS).
Recent advances in deep learning have sparked renewed interest in LS.
arXiv Detail & Related papers (2023-05-19T20:56:22Z)
- Future Vector Enhanced LSTM Language Model for LVCSR [67.03726018635174]
This paper proposes a novel enhanced long short-term memory (LSTM) LM using the future vector.
Experiments show that the proposed LSTM LM achieves better BLEU scores for long-term sequence prediction.
Rescoring with both the new and conventional LSTM LMs yields a large improvement in word error rate (a generic sketch of this rescoring step follows below).
arXiv Detail & Related papers (2020-07-31T08:38:56Z)
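The rescoring step in the last entry is standard n-best rescoring: each recognition hypothesis is rescored with an interpolated combination of the two LM scores, and the highest-scoring hypothesis is selected. A minimal sketch under that assumption follows; the tuple layout, score names, and interpolation weight are illustrative, not the paper's exact recipe.

```python
def rescore_nbest(hypotheses, lam=0.5):
    """Pick the best hypothesis by interpolating two LM log-probabilities.

    hypotheses: list of (text, score_a, score_b) tuples, where score_a and
    score_b are total log-probabilities from two LMs (e.g. the future
    vector LSTM LM and a conventional LSTM LM); lam weights the first.
    A generic sketch of n-best rescoring, not the paper's exact setup.
    """
    def combined(hyp):
        _, score_a, score_b = hyp
        return lam * score_a + (1.0 - lam) * score_b

    return max(hypotheses, key=combined)

# Toy 3-best list with made-up log-probabilities.
nbest = [
    ("the cat sat on the mat", -12.3, -11.8),
    ("the cat sat on a mat",   -12.9, -11.4),
    ("the cat sat on the map", -13.1, -13.0),
]
best_text, _, _ = rescore_nbest(nbest)
print(best_text)  # "the cat sat on the mat"
```

In practice the interpolation weight lam would be tuned on a development set to minimize word error rate.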