Related papers: QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism

URL: http://arxiv.org/abs/2406.13167v1
Date: Wed, 19 Jun 2024 02:46:18 GMT
Title: QRMeM: Unleash the Length Limitation through Question then Reflection Memory Mechanism
Authors: Bo Wang, Heyan Huang, Yixin Cao, Jiahao Ying, Wei Tang, Chong Feng
Abstract summary: Memory mechanisms offer a flexible solution for managing long contexts. We introduce a novel strategy, Question then Reflection Memory Mechanism (QRMeM), incorporating a dual-structured memory pool. Our evaluation across multiple-choice questions (MCQ) and multi-document question answering (Multi-doc QA) benchmarks showcases QRMeM's enhanced performance compared to existing approaches.
Score: 46.441032033076034
License: http://creativecommons.org/licenses/by/4.0/
Abstract: While large language models (LLMs) have made notable advancements in natural language processing, they continue to struggle with processing extensive text. Memory mechanisms offer a flexible solution for managing long contexts, utilizing techniques such as compression, summarization, and structuring to facilitate nuanced and efficient handling of large volumes of text. However, existing techniques face challenges with static knowledge integration, leading to insufficient adaptation to task-specific needs and missing multi-segmentation relationships, which hinders the dynamic reorganization and logical combination of relevant segments during the response process. To address these issues, we introduce a novel strategy, Question then Reflection Memory Mechanism (QRMeM), incorporating a dual-structured memory pool. This pool synergizes static textual content with structured graph guidance, fostering a reflective trial-and-error approach for navigating and identifying relevant segments. Our evaluation across multiple-choice questions (MCQ) and multi-document question answering (Multi-doc QA) benchmarks showcases QRMeM's enhanced performance compared to existing approaches.

Related papers

Resource-Friendly Dynamic Enhancement Chain for Multi-Hop Question Answering [21.077964610022313]
This work proposes a novel framework called DEC (Dynamic Enhancement Chain). DEC first decomposes complex questions into logically coherent subquestions to form a hallucination-free reasoning chain. It then iteratively refines these subquestions through context-aware rewriting to generate effective query formulations.
arXiv Detail & Related papers (2025-06-21T11:55:27Z)
MEM1: Learning to Synergize Memory and Reasoning for Efficient Long-Horizon Agents [84.62985963113245]
We introduce MEM1, an end-to-end reinforcement learning framework that enables agents to operate with constant memory across long multi-turn tasks. At each turn, MEM1 updates a compact shared internal state that jointly supports memory consolidation and reasoning. We show that MEM1-7B improves performance by 3.5x while reducing memory usage by 3.7x compared to Qwen2.5-14B-Instruct on a 16-objective multi-hop QA task.
arXiv Detail & Related papers (2025-06-18T19:44:46Z)
Structured Memory Mechanisms for Stable Context Representation in Large Language Models [16.929937978584917]
The model integrates explicit memory units, gated writing mechanisms, and attention-based reading modules. A forgetting function is introduced to enable dynamic updates of memory content. The model achieves clear advantages in text generation consistency, stability in multi-turn question answering, and accuracy in cross-context reasoning.
arXiv Detail & Related papers (2025-05-28T22:49:04Z)
Towards Multi-Granularity Memory Association and Selection for Long-Term Conversational Agents [73.77930932005354]
We propose MemGAS, a framework that enhances memory consolidation by constructing multi-granularity association, adaptive selection, and retrieval. MemGAS is based on multi-granularity memory units and employs Gaussian Mixture Models to cluster and associate new memories with historical ones. Experiments on four long-term memory benchmarks demonstrate that MemGAS outperforms state-of-the-art methods on both question answering and retrieval tasks.
arXiv Detail & Related papers (2025-05-26T06:13:07Z)
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents [70.12342024019044]
Large Language Models (LLMs) have made significant progress in open-ended dialogue, yet their inability to retain and retrieve relevant information limits their effectiveness. We propose Reflective Memory Management (RMM), a novel mechanism for long-term dialogue agents, integrating forward- and backward-looking reflections. RMM shows more than 10% accuracy improvement over the baseline without memory management on the LongMemEval dataset.
arXiv Detail & Related papers (2025-03-11T04:15:52Z)
Think Before You Segment: High-Quality Reasoning Segmentation with GPT Chain of Thoughts [64.93416171745693]
ThinkFirst is a training-free reasoning segmentation framework. Our approach allows GPT-4o or other powerful MLLMs to generate a detailed, chain-of-thought description of an image. This summarized description is then passed to a language-instructed segmentation assistant to aid the segmentation process.
arXiv Detail & Related papers (2025-03-10T16:26:11Z)
Emulating Retrieval Augmented Generation via Prompt Engineering for Enhanced Long Context Comprehension in LLMs [23.960451986662996]
This paper proposes a method that emulates Retrieval Augmented Generation (RAG) through specialized prompt engineering and chain-of-thought reasoning. We evaluate our approach on selected tasks from BABILong, which interleaves standard bAbI QA problems with large amounts of distractor text.
arXiv Detail & Related papers (2025-02-18T02:49:40Z)
Autonomous Structural Memory Manipulation for Large Language Models Using Hierarchical Embedding Augmentation [0.0]
This study introduces hierarchical embedding augmentation as a means to redefine the representation of tokens through multi-level semantic structures. Results reveal substantial improvements in computational efficiency, with marked reductions in processing overhead for longer input sequences. The ability to dynamically adjust token representations and memory configurations contributed to the model's robustness under varied and unpredictable input conditions.
arXiv Detail & Related papers (2025-01-23T22:20:36Z)
Meta-Chunking: Learning Text Segmentation and Semantic Completion via Logical Perception [10.614437503578856]
This paper proposes the Meta-Chunking framework, which specifically enhances chunking quality. We design two adaptive chunking techniques based on uncertainty, namely Perplexity Chunking and Margin Sampling Chunking. We establish the global information compensation mechanism, encompassing a two-stage hierarchical summary generation process and a three-stage text chunk rewriting procedure.
arXiv Detail & Related papers (2024-10-16T17:59:32Z)
Enhancing Multi-Step Reasoning Abilities of Language Models through Direct Q-Function Optimization [50.485788083202124]
Reinforcement Learning (RL) plays a crucial role in aligning large language models with human preferences and improving their ability to perform complex tasks. We introduce Direct Q-function Optimization (DQO), which formulates the response generation process as a Markov Decision Process (MDP) and utilizes the soft actor-critic (SAC) framework to optimize a Q-function directly parameterized by the language model. Experimental results on two math problem-solving datasets, GSM8K and MATH, demonstrate that DQO outperforms previous methods, establishing it as a promising offline reinforcement learning approach for aligning language models.
arXiv Detail & Related papers (2024-10-11T23:29:20Z)
Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding [28.191029786204624]
We introduce the Long Question Coreference Adaptation (LQCA) method to enhance the performance of large language models (LLMs). This framework focuses on coreference resolution tailored to long contexts, allowing the model to identify and manage references effectively. The framework provides easier-to-handle partitions for LLMs, promoting better understanding.
arXiv Detail & Related papers (2024-10-02T15:39:55Z)
Do LLMs suffer from Multi-Party Hangover? A Diagnostic Approach to Addressee Recognition and Response Selection in Conversations [11.566214724241798]
We propose a methodological pipeline to investigate model performance across specific structural attributes of conversations. We focus on Response Selection and Addressee Recognition tasks, to diagnose model weaknesses. Results show that response selection relies more on the textual content of conversations, while addressee recognition requires capturing their structural dimension.
arXiv Detail & Related papers (2024-09-27T10:07:33Z)
AQA: Adaptive Question Answering in a Society of LLMs via Contextual Multi-Armed Bandit [59.10281630985958]
In question answering (QA), different questions can be effectively addressed with different answering strategies. We develop a dynamic method that adaptively selects the most suitable QA strategy for each question. Our experiments show that the proposed solution is viable for adaptive orchestration of a QA system with multiple modules.
arXiv Detail & Related papers (2024-09-20T12:28:18Z)
Retrieve, Summarize, Plan: Advancing Multi-hop Question Answering with an Iterative Approach [6.549143816134531]
We propose a novel iterative RAG method called ReSP, equipped with a dual-function summarizer. Experimental results on the multi-hop question-answering datasets HotpotQA and 2WikiMultihopQA demonstrate that our method significantly outperforms the state-of-the-art.
arXiv Detail & Related papers (2024-07-18T02:19:00Z)
Enhancing Textbook Question Answering Task with Large Language Models and Retrieval Augmented Generation [3.948068081583197]
This paper proposes a methodology that handles the out-of-domain scenario in Textbook Question Answering (TQA). Through supervised fine-tuning of the LLM Llama-2 and the incorporation of RAG, our architecture outperforms the baseline, achieving a 4.12% accuracy improvement on the validation set and 9.84% on the test set for non-diagram multiple-choice questions.
arXiv Detail & Related papers (2024-02-05T11:58:56Z)
Thread of Thought Unraveling Chaotic Contexts [133.24935874034782]
The "Thread of Thought" (ThoT) strategy draws inspiration from human cognitive processes. In experiments, ThoT significantly improves reasoning performance compared to other prompting techniques.
arXiv Detail & Related papers (2023-11-15T06:54:44Z)
Walking Down the Memory Maze: Beyond Context Limit through Interactive Reading [63.93888816206071]
We introduce MemWalker, a method that processes the long context into a tree of summary nodes. Upon receiving a query, the model navigates this tree in search of relevant information, and responds once it gathers sufficient information. We show that, beyond effective reading, MemWalker enhances explainability by highlighting the reasoning steps as it interactively reads the text, pinpointing the relevant text segments related to the query.
arXiv Detail & Related papers (2023-10-08T06:18:14Z)
RET-LLM: Towards a General Read-Write Memory for Large Language Models [53.288356721954514]
RET-LLM is a novel framework that equips large language models with a general write-read memory unit. Inspired by Davidsonian semantics theory, we extract and save knowledge in the form of triplets. Our framework exhibits robust performance in handling temporal-based question answering tasks.
arXiv Detail & Related papers (2023-05-23T17:53:38Z)

This list is automatically generated from the titles and abstracts of the papers on this site.