Filling Memory Gaps: Enhancing Continual Semantic Parsing via SQL Syntax Variance-Guided LLMs without Real Data Replay
- URL: http://arxiv.org/abs/2412.07246v1
- Date: Tue, 10 Dec 2024 07:11:49 GMT
- Title: Filling Memory Gaps: Enhancing Continual Semantic Parsing via SQL Syntax Variance-Guided LLMs without Real Data Replay
- Authors: Ruiheng Liu, Jinyu Zhang, Yanqi Song, Yu Zhang, Bailong Yang
- Abstract summary: Continual Semantic Parsing (CSP) aims to train parsers to convert natural language questions into SQL across tasks with limited annotated examples.
Previous studies mitigate this challenge by replaying historical data or employing parameter-efficient tuning (PET).
We propose a new Large Language Model (LLM)-Enhanced Continual Semantic Parsing method, named LECSP, which alleviates forgetting while encouraging generalization.
- Score: 5.308585520353363
- Abstract: Continual Semantic Parsing (CSP) aims to train parsers to convert natural language questions into SQL across tasks with limited annotated examples, adapting to the real-world scenario of dynamically updated databases. Previous studies mitigate this challenge by replaying historical data or employing parameter-efficient tuning (PET), but they often violate data privacy or rely on ideal continual learning settings. To address these problems, we propose a new Large Language Model (LLM)-Enhanced Continual Semantic Parsing method, named LECSP, which alleviates forgetting while encouraging generalization, without requiring real data replay or ideal settings. Specifically, it first analyzes the commonalities and differences between tasks from the SQL syntax perspective to guide LLMs in reconstructing key memories and improving memory accuracy through a calibration strategy. Then, it uses a task-aware dual-teacher distillation framework to promote the accumulation and transfer of knowledge during sequential training. Experimental results on two CSP benchmarks show that our method significantly outperforms existing methods, even those utilizing data replay or ideal settings. Additionally, we achieve generalization performance beyond the upper limits, better adapting to unseen tasks.
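Since the abstract names dual-teacher distillation as a key mechanism, a minimal sketch may help make it concrete. The snippet below illustrates a generic dual-teacher distillation loss in PyTorch, assuming one frozen teacher carrying accumulated past-task knowledge and one for the current task; the mixing weight `alpha`, temperature `T`, and the overall formulation are illustrative assumptions, not the exact LECSP objective.
```python
import torch
import torch.nn.functional as F

def dual_teacher_distill_loss(student_logits, prev_teacher_logits,
                              task_teacher_logits, alpha=0.5, T=2.0):
    """Hypothetical dual-teacher distillation loss.

    Combines soft targets from a 'previous tasks' teacher and a
    'current task' teacher; alpha and T are illustrative choices,
    not values from the LECSP paper.
    """
    student_log_probs = F.log_softmax(student_logits / T, dim=-1)
    prev_probs = F.softmax(prev_teacher_logits / T, dim=-1)
    task_probs = F.softmax(task_teacher_logits / T, dim=-1)
    # Weighted mixture of the two teachers' soft targets.
    target = alpha * prev_probs + (1.0 - alpha) * task_probs
    return F.kl_div(student_log_probs, target, reduction="batchmean") * T * T
```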
Related papers
- Continual LLaVA: Continual Instruction Tuning in Large Vision-Language Models [93.5327725085853]
Continual LLaVA is a rehearsal-free method tailored for continual instruction tuning in LVLMs.
Experiments indicate that the proposed Continual LLaVA outperforms previous methods by significantly reducing the forgetting during the continual instruction tuning process.
arXiv Detail & Related papers (2024-11-04T19:55:32Z)
- An Actor-Critic Approach to Boosting Text-to-SQL Large Language Model [7.01795534825797]
We propose a simple, general, and performance-guaranteed T2S enhancement approach called Actor-Critic (AC).
We design two roles using the same large language model (LLM): an Actor to produce SQL queries and a Critic to evaluate the produced SQL.
If the Critic believes the produced SQL is wrong, it notifies the Actor to regenerate the SQL and evaluates it again.
We conducted extensive experiments on the Spider and related datasets with eleven LLMs, and demonstrated that the Actor-Critic method consistently improves the performance of T2S.
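The generate-evaluate-regenerate loop described above is easy to sketch. Below is a minimal, hypothetical Python version in which a single placeholder callable `llm` plays both roles; the prompts, verdict format, and round limit are assumptions, not the paper's exact protocol.
```python
def actor_critic_text_to_sql(question, schema, llm, max_rounds=3):
    """Minimal sketch of the Actor-Critic loop: the same LLM plays
    both roles. `llm` is a placeholder callable (prompt -> text)."""
    sql = llm(f"Schema: {schema}\nQuestion: {question}\nWrite a SQL query:")
    for _ in range(max_rounds):
        verdict = llm(f"Schema: {schema}\nQuestion: {question}\n"
                      f"Candidate SQL: {sql}\nIs this SQL correct? "
                      "Answer CORRECT or explain the error:")
        if verdict.strip().startswith("CORRECT"):
            break  # Critic accepts the query.
        # Critic found a problem: Actor regenerates using the feedback.
        sql = llm(f"Schema: {schema}\nQuestion: {question}\n"
                  f"Previous SQL: {sql}\nCritic feedback: {verdict}\n"
                  "Write a corrected SQL query:")
    return sql
```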
arXiv Detail & Related papers (2024-10-28T15:22:35Z)
- P-RAG: Progressive Retrieval Augmented Generation For Planning on Embodied Everyday Task [94.08478298711789]
Embodied Everyday Task is a popular task in the embodied AI community.
Natural language instructions often lack explicit task planning.
Extensive training is required to equip models with knowledge of the task environment.
arXiv Detail & Related papers (2024-09-17T15:29:34Z)
- Adaptive Retention & Correction: Test-Time Training for Continual Learning [114.5656325514408]
A common problem in continual learning is the classification layer's bias towards the most recent task.
We name our approach Adaptive Retention & Correction (ARC).
ARC achieves an average performance increase of 2.7% and 2.6% on the CIFAR-100 and Imagenet-R datasets, respectively.
arXiv Detail & Related papers (2024-05-23T08:43:09Z)
- PARMESAN: Parameter-Free Memory Search and Transduction for Dense Prediction Tasks [5.5127111704068374]
This work addresses flexibility in deep learning by means of transductive reasoning.
We propose PARMESAN, a scalable method which leverages a memory module for solving dense prediction tasks.
Our method is compatible with commonly used architectures and canonically transfers to 1D, 2D, and 3D grid-based data.
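As a rough illustration of parameter-free memory search and transduction, the sketch below predicts query labels by cosine nearest-neighbor lookup over a feature memory; the distance choice, majority vote, and array shapes are generic assumptions rather than PARMESAN's actual design.
```python
import numpy as np

def knn_transduce(query_feats, memory_feats, memory_labels, k=5):
    """Generic sketch of parameter-free transduction: predict each
    query's label by nearest-neighbor search over a feature memory.
    query_feats: (Q, D), memory_feats: (M, D),
    memory_labels: (M,) non-negative integer class labels."""
    q = query_feats / np.linalg.norm(query_feats, axis=1, keepdims=True)
    m = memory_feats / np.linalg.norm(memory_feats, axis=1, keepdims=True)
    sims = q @ m.T                          # (Q, M) cosine similarities
    topk = np.argsort(-sims, axis=1)[:, :k] # k nearest memory entries
    # Majority vote among the k nearest neighbors.
    preds = [np.bincount(memory_labels[idx]).argmax() for idx in topk]
    return np.array(preds)
```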
arXiv Detail & Related papers (2024-03-18T12:55:40Z)
- InsCL: A Data-efficient Continual Learning Paradigm for Fine-tuning Large Language Models with Instructions [29.682289142922752]
InsCL dynamically replays previous data based on task similarity, calculated by Wasserstein Distance with instructions.
InsCL achieves performance gains of 3.0 Relative Gain compared with Random Replay, and 27.96 Relative Gain compared with No Replay.
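A hedged sketch of similarity-weighted replay in this spirit: embed each task's instructions, approximate inter-task distance with per-dimension 1D Wasserstein distances (a simplification of the paper's computation), and give closer tasks larger replay budgets. The helper below is illustrative only.
```python
import numpy as np
from scipy.stats import wasserstein_distance

def replay_weights(curr_emb, prev_task_embs):
    """Sketch of similarity-weighted replay in the spirit of InsCL.
    curr_emb: (N, D) instruction embeddings of the current task;
    prev_task_embs: list of (N_i, D) arrays, one per previous task."""
    dists = []
    for emb in prev_task_embs:
        # Average 1D Wasserstein distance across embedding dimensions.
        per_dim = [wasserstein_distance(curr_emb[:, d], emb[:, d])
                   for d in range(curr_emb.shape[1])]
        dists.append(np.mean(per_dim))
    # Closer tasks (smaller distance) get larger replay budgets.
    inv = 1.0 / (np.array(dists) + 1e-8)
    return inv / inv.sum()
```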
arXiv Detail & Related papers (2024-03-18T03:10:36Z)
- Continual Referring Expression Comprehension via Dual Modular Memorization [133.46886428655426]
Referring Expression Comprehension (REC) aims to localize an image region of a given object described by a natural-language expression.
Existing REC algorithms make a strong assumption that the training data fed into a model are given upfront, which degrades their practicality for real-world scenarios.
In this paper, we propose Continual Referring Expression Comprehension (CREC), a new setting for REC, where a model learns on a stream of incoming tasks.
In order to continuously improve the model on sequential tasks without forgetting prior learned knowledge and without repeatedly re-training from scratch, we propose an effective baseline method named Dual Modular Memorization.
arXiv Detail & Related papers (2023-11-25T02:58:51Z)
- ReLLa: Retrieval-enhanced Large Language Models for Lifelong Sequential Behavior Comprehension in Recommendation [43.270424225285105]
We focus on adapting and empowering a pure large language model for zero-shot and few-shot recommendation tasks.
We propose Retrieval-enhanced Large Language models (ReLLa) for recommendation tasks in both zero-shot and few-shot settings.
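The retrieval-enhancement idea can be sketched as follows: rather than truncating a user's history to the most recent behaviors, retrieve the ones most semantically relevant to the target item and build the prompt from those. In the sketch below, `embed` is a hypothetical text-embedding callable and the prompt wording is an assumption, not ReLLa's actual template.
```python
import numpy as np

def build_retrieval_enhanced_prompt(target_item, behaviors, embed, top_k=10):
    """Sketch of retrieval-enhanced prompting in the spirit of ReLLa:
    select the past behaviors most relevant to the target item.
    `embed` is a placeholder callable: text -> 1D numpy vector."""
    target_vec = embed(target_item)
    scores = [float(np.dot(embed(b), target_vec)) for b in behaviors]
    # Keep the top-k most relevant behaviors, highest score first.
    top = [behaviors[i] for i in np.argsort(scores)[::-1][:top_k]]
    history = "\n".join(f"- {b}" for b in top)
    return (f"User's most relevant past behaviors:\n{history}\n"
            f"Will the user like {target_item}? Answer yes or no.")
```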
arXiv Detail & Related papers (2023-08-22T02:25:04Z)
- SQL-PaLM: Improved Large Language Model Adaptation for Text-to-SQL (extended) [53.95151604061761]
This paper introduces a framework for enhancing Text-to-SQL using large language models (LLMs).
With few-shot prompting, we explore the effectiveness of consistency decoding with execution-based error analyses.
With instruction fine-tuning, we delve deep into understanding the critical paradigms that influence the performance of tuned LLMs.
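One common reading of execution-based consistency decoding is: sample several SQL candidates, execute each against the database, and pick the candidate whose execution result is most common. The sketch below implements that generic idea with `sqlite3`; it is not SQL-PaLM's exact procedure, and error handling and tie-breaking are simplified.
```python
import sqlite3
from collections import Counter

def execution_consistency(candidates, db_path):
    """Sketch of execution-based consistency decoding: run each sampled
    SQL candidate and return the one whose execution result is most
    common among the candidates."""
    conn = sqlite3.connect(db_path)
    results = {}
    for sql in candidates:
        try:
            # Use a frozenset of rows so result sets are hashable/comparable.
            results[sql] = frozenset(conn.execute(sql).fetchall())
        except sqlite3.Error:
            continue  # Discard candidates that fail to execute.
    conn.close()
    if not results:
        return candidates[0]  # Fall back to the first sample.
    majority = Counter(results.values()).most_common(1)[0][0]
    return next(s for s, r in results.items() if r == majority)
```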
arXiv Detail & Related papers (2023-05-26T21:39:05Z)
- Learn from Yesterday: A Semi-Supervised Continual Learning Method for Supervision-Limited Text-to-SQL Task Streams [18.010095381310972]
This paper proposes integrating semi-supervised learning (SSL) and continual learning (CL) in a stream of text-to-SQL tasks.
The experiments on two datasets show that the proposed SFNet outperforms the widely-used SSL-only and CL-only baselines on multiple metrics.
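A generic sketch of the SSL+CL combination: pseudo-label confident unlabeled questions, mix them with labeled data and replayed memories, and train on the union. All callables below are placeholders, and this illustrates the paradigm rather than SFNet's specific architecture.
```python
def ssl_cl_step(model, labeled, unlabeled, replay_buffer,
                pseudo_label_fn, confidence=0.9):
    """Generic sketch of one semi-supervised continual-learning step.
    `model.train_on` and `pseudo_label_fn` are placeholder callables;
    labeled is a list of (question, sql) pairs, unlabeled a list of
    questions, replay_buffer a list of past-task (question, sql) pairs."""
    batch = list(labeled) + list(replay_buffer)
    for question in unlabeled:
        sql, score = pseudo_label_fn(model, question)
        if score >= confidence:        # Keep only confident pseudo-labels.
            batch.append((question, sql))
    model.train_on(batch)              # Placeholder training call.
    replay_buffer.extend(labeled)      # Retain current-task examples.
    return model
```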
arXiv Detail & Related papers (2022-11-21T07:40:28Z)
- Improving Meta-learning for Low-resource Text Classification and Generation via Memory Imitation [87.98063273826702]
We propose a memory imitation meta-learning (MemIML) method that enhances the model's reliance on support sets for task adaptation.
A theoretical analysis is provided to prove the effectiveness of our method.
arXiv Detail & Related papers (2022-03-22T12:41:55Z)