Related papers: Shifting Long-Context LLMs Research from Input to Output

Shifting Long-Context LLMs Research from Input to Output

URL: http://arxiv.org/abs/2503.04723v2
Date: Fri, 07 Mar 2025 03:14:02 GMT
Title: Shifting Long-Context LLMs Research from Input to Output
Authors: Yuhao Wu, Yushi Bai, Zhiqing Hu, Shangqing Tu, Ming Shan Hee, Juanzi Li, Roy Ka-Wei Lee,
Abstract summary: This paper advocates for a paradigm shift in NLP research toward addressing the challenges of long-form generation.<n> Tasks such as novel writing, long-term planning, and complex reasoning require models to understand extensive contexts and produce coherent, contextually rich, and logically consistent extended text.<n>We underscore the importance of this under-explored domain and call for focused efforts to develop foundational LLMs tailored for generating high-quality, long-form outputs.
Score: 32.227507695283144
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent advancements in long-context Large Language Models (LLMs) have primarily concentrated on processing extended input contexts, resulting in significant strides in long-context comprehension. However, the equally critical aspect of generating long-form outputs has received comparatively less attention. This paper advocates for a paradigm shift in NLP research toward addressing the challenges of long-output generation. Tasks such as novel writing, long-term planning, and complex reasoning require models to understand extensive contexts and produce coherent, contextually rich, and logically consistent extended text. These demands highlight a critical gap in current LLM capabilities. We underscore the importance of this under-explored domain and call for focused efforts to develop foundational LLMs tailored for generating high-quality, long-form outputs, which hold immense potential for real-world applications.

Related papers

Long-Short Alignment for Effective Long-Context Modeling in LLMs [32.13785291956956]
Large language models (LLMs) have exhibited impressive performance and surprising emergent properties.<n>Length generalization -- the ability to generalize to sequences longer than those seen during training -- is a classical and fundamental problem.<n>We highlight the critical role of textbflong-short alignment -- the consistency of output distributions across sequences of varying lengths.
arXiv Detail & Related papers (2025-06-13T13:25:39Z)
LLM$\times$MapReduce-V2: Entropy-Driven Convolutional Test-Time Scaling for Generating Long-Form Articles from Extremely Long Resources [65.41986915457058]
Long-form generation is crucial for a wide range of practical applications. While short-to-long generations have received considerable attention, generating long texts from extremely long resources remains relatively underexplored. We propose LLM$times$MapReduce-V2, a novel test-time scaling strategy designed to enhance the ability of large language models to process extremely long inputs.
arXiv Detail & Related papers (2025-04-08T07:03:48Z)
Reducing Distraction in Long-Context Language Models by Focused Learning [6.803882766744194]
We propose a novel training method that enhances Large Language Models' ability to discern relevant information. During fine-tuning with long contexts, we employ a retriever to extract the most relevant segments. We then introduce an auxiliary contrastive learning objective to explicitly ensure that outputs from the original context and the retrieved sub-context are closely aligned.
arXiv Detail & Related papers (2024-11-08T19:27:42Z)
FltLM: An Intergrated Long-Context Large Language Model for Effective Context Filtering and Understanding [32.197113821638936]
We propose a novel integrated Long-Context Large Language Model (FltLM) FltLM incorporates a context filter with a soft mask mechanism, identifying and dynamically excluding irrelevant content to concentrate on pertinent information. Experimental results demonstrate that FltLM significantly outperforms supervised fine-tuning and retrieval-based methods in complex QA scenarios.
arXiv Detail & Related papers (2024-10-09T13:47:50Z)
KV Cache Compression, But What Must We Give in Return? A Comprehensive Benchmark of Long Context Capable Approaches [52.02764371205856]
Long context capability is a crucial competency for large language models (LLMs) This work provides a taxonomy of current methods and evaluating 10+ state-of-the-art approaches across seven categories of long context tasks.
arXiv Detail & Related papers (2024-07-01T17:59:47Z)
Leave No Document Behind: Benchmarking Long-Context LLMs with Extended Multi-Doc QA [71.04146366608904]
Long-context modeling capabilities have garnered widespread attention, leading to the emergence of Large Language Models (LLMs) with ultra-context windows. We propose a novel long-context benchmark, Loong, aligning with realistic scenarios through extended multi-document question answering (QA) Loong introduces four types of tasks with a range of context lengths: Spotlight Locating, Comparison, Clustering, and Chain of Reasoning.
arXiv Detail & Related papers (2024-06-25T09:42:56Z)
Long Context Alignment with Short Instructions and Synthesized Positions [56.1267385315404]
This paper introduces Step-Skipping Alignment (SkipAlign) It is a new technique designed to enhance the long-context capabilities of Large Language Models (LLMs) With a careful selection of the base model and alignment datasets, SkipAlign with only 6B parameters achieves it's best performance and comparable with strong baselines like GPT-3.5-Turbo-16K on LongBench.
arXiv Detail & Related papers (2024-05-07T01:56:22Z)
NovelQA: Benchmarking Question Answering on Documents Exceeding 200K Tokens [63.7488938083696]
NovelQA is a benchmark designed to test the capabilities of Large Language Models with extended texts. This paper presents the design and construction of NovelQA, highlighting its manual annotation, and diverse question types. Our evaluation of Long-context LLMs on NovelQA reveals significant insights into the models' performance.
arXiv Detail & Related papers (2024-03-18T17:32:32Z)
LooGLE: Can Long-Context Language Models Understand Long Contexts? [46.143956498529796]
LooGLE is a benchmark for large language models' long context understanding. It features relatively new documents post-2022, with over 24,000 tokens per document and 6,000 newly generated questions spanning diverse domains. The evaluation of eight state-of-the-art LLMs on LooGLE revealed key findings.
arXiv Detail & Related papers (2023-11-08T01:45:37Z)

This list is automatically generated from the titles and abstracts of the papers in this site.