Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient
Framework for Multi-level Implicit Discourse Relation Recognition
- URL: http://arxiv.org/abs/2402.15080v1
- Date: Fri, 23 Feb 2024 03:53:39 GMT
- Title: Infusing Hierarchical Guidance into Prompt Tuning: A Parameter-Efficient
Framework for Multi-level Implicit Discourse Relation Recognition
- Authors: Haodong Zhao, Ruifang He, Mengnan Xiao and Jing Xu
- Abstract summary: Multi-level implicit discourse relation recognition (MIDRR) aims at identifying hierarchical discourse relations among arguments.
In this paper, we propose a prompt-based Parameter-Efficient Multi-level IDRR
(PEMI) framework to solve the above problems.
- Score: 16.647413058592125
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Multi-level implicit discourse relation recognition (MIDRR) aims at
identifying hierarchical discourse relations among arguments. Previous methods
achieve improvements by fine-tuning PLMs. However, due to data scarcity and the
task gap, the pre-trained feature space cannot be accurately tuned to the
task-specific space, which can even aggravate the collapse of the original
pre-trained space. Moreover, the need to comprehend hierarchical semantics makes
this conversion even harder for MIDRR. In this paper, we propose a prompt-based
Parameter-Efficient Multi-level IDRR (PEMI) framework to address these problems.
First, we leverage parameter-efficient prompt tuning to steer the input
arguments toward the pre-trained space and achieve this approximation with few
parameters. Furthermore, we propose a hierarchical label refining (HLR) method
for the prompt verbalizer to deeply integrate hierarchical guidance into prompt
tuning. Finally, our model achieves results comparable to baselines on PDTB 2.0
and 3.0 while using only about 0.1% of their trainable parameters, and
visualizations demonstrate the effectiveness of our HLR method.
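As a rough illustration of the recipe the abstract describes (trainable soft prompts in front of a frozen PLM, plus a verbalizer whose top-level label embeddings are refined from second-level ones), here is a minimal PyTorch sketch. The class names, the mean-pooling refinement, and the toy label counts are assumptions for illustration, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): soft prompt tuning with a
# hierarchy-aware verbalizer in the spirit of PEMI. The PLM itself is
# assumed frozen; only the objects below hold trainable parameters.
import torch
import torch.nn as nn


class SoftPromptEncoder(nn.Module):
    """Trainable prompt vectors prepended to the frozen PLM's input embeddings."""

    def __init__(self, prompt_len: int, hidden_size: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(prompt_len, hidden_size) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, hidden) token embeddings of the argument pair.
        prompt = self.prompt.unsqueeze(0).expand(input_embeds.size(0), -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)


class HierarchicalVerbalizer(nn.Module):
    """Scores the [MASK] hidden state against label embeddings at two levels.

    Fine-grained (second-level) label embeddings are trainable; each coarse
    (top-level) embedding is refined from its children by mean pooling, so the
    label hierarchy guides the prompt verbalizer.
    """

    def __init__(self, hidden_size: int, n_fine: int, n_coarse: int, parent_of: list):
        super().__init__()
        self.fine_emb = nn.Parameter(torch.randn(n_fine, hidden_size) * 0.02)
        self.register_buffer("parent_of", torch.tensor(parent_of))  # fine index -> coarse index
        self.n_coarse = n_coarse

    def forward(self, mask_hidden: torch.Tensor):
        # mask_hidden: (batch, hidden) hidden state at the [MASK] position.
        coarse_emb = torch.zeros(self.n_coarse, self.fine_emb.size(1),
                                 device=mask_hidden.device)
        coarse_emb = coarse_emb.index_add(0, self.parent_of, self.fine_emb)
        counts = torch.bincount(self.parent_of, minlength=self.n_coarse).clamp(min=1)
        coarse_emb = coarse_emb / counts.unsqueeze(1).float()
        return mask_hidden @ coarse_emb.T, mask_hidden @ self.fine_emb.T


# Illustrative hierarchy: 4 top-level senses, 11 second-level senses (toy numbers).
verbalizer = HierarchicalVerbalizer(hidden_size=768, n_fine=11, n_coarse=4,
                                    parent_of=[0, 0, 0, 1, 1, 2, 2, 2, 3, 3, 3])
```

Only the soft prompt vectors and the fine-grained label embeddings carry gradients while the PLM stays frozen, which is how a setup like this can keep the trainable-parameter budget near 0.1% of full fine-tuning.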
Related papers
- SpaLLM: Unified Compressive Adaptation of Large Language Models with Sketching [32.4599581528901]
"Two-tower" architecture is used for compressing pre-trained LLM parameters into compact representations and fine-tuning the additive full-precision adapter.
We propose SpaLLM (Sketched Adapting of LLMs), a novel compressive adaptation approach for LLMs.
We show that SpaLLM sketches pre-trained LLM weights into lookup tables and directly fine-tunes the values in these tables (a rough lookup-table illustration appears after this list).
arXiv Detail & Related papers (2024-10-08T20:58:24Z)
- Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting [22.025533583703126]
We propose a novel Prompt Recursive Search (PRS) framework for large language models (LLMs).
The PRS framework incorporates an assessment of problem complexity and an adjustable structure, reducing the likelihood of errors.
Compared with the Chain of Thought (CoT) method, PRS raises accuracy on the BBH dataset by 8 points with the Llama3-7B model, a 22% relative improvement.
arXiv Detail & Related papers (2024-08-02T17:59:42Z)
- Pareto Low-Rank Adapters: Efficient Multi-Task Learning with Preferences [49.14535254003683]
PaLoRA is a novel parameter-efficient method that augments the original model with task-specific low-rank adapters (a generic low-rank-adapter sketch appears after this list).
Our experimental results show that PaLoRA outperforms MTL and PFL baselines across various datasets.
arXiv Detail & Related papers (2024-07-10T21:25:51Z)
- SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning [63.93193829913252]
We propose an innovative METL strategy called SHERL for resource-limited scenarios.
In the early route, intermediate outputs are consolidated via an anti-redundancy operation.
In the late route, a minimal number of late pre-trained layers is used to reduce the peak memory overhead.
arXiv Detail & Related papers (2024-07-10T10:22:35Z)
- Guiding Language Model Reasoning with Planning Tokens [122.43639723387516]
Large language models (LLMs) have recently attracted considerable interest for their ability to perform complex reasoning tasks.
We propose a hierarchical generation scheme to encourage a more structured generation of chain-of-thought steps.
Our approach requires a negligible increase in trainable parameters (0.001%) and can be applied through either full fine-tuning or a more parameter-efficient scheme.
arXiv Detail & Related papers (2023-10-09T13:29:37Z)
- Parameter-Efficient Fine-Tuning without Introducing New Latency [7.631596468553607]
We introduce a novel adapter technique that applies the adapter directly to the pre-trained parameters instead of to the hidden representations.
Our proposed method attains a new state-of-the-art in both performance and storage efficiency, storing only 0.03% of the parameters of full fine-tuning.
arXiv Detail & Related papers (2023-05-26T08:44:42Z)
- Parameter-efficient Tuning of Large-scale Multimodal Foundation Model [68.24510810095802]
We propose a graceful prompt framework for cross-modal transfer (Aurora) to overcome these challenges.
Considering the redundancy in existing architectures, we first use mode approximation to generate 0.1M trainable parameters for multimodal prompt tuning.
A thorough evaluation on six cross-modal benchmarks shows that it not only outperforms the state-of-the-art but even outperforms the full fine-tuning approach.
arXiv Detail & Related papers (2023-05-15T06:40:56Z)
- Towards Robust Low-Resource Fine-Tuning with Multi-View Compressed Representations [51.75960511842552]
Fine-tuning of pretrained language models (PLMs) is prone to overfitting in low-resource scenarios.
We present a novel method that operates on the hidden representations of a PLM to reduce overfitting.
arXiv Detail & Related papers (2022-11-16T09:39:29Z)
- GRASP: Guiding model with RelAtional Semantics using Prompt [3.1275060062551208]
We propose a Guiding model with RelAtional Semantics using Prompt (GRASP)
We adopt a prompt-based fine-tuning approach and capture relational semantic clues of a given dialogue with an argument-aware prompt marker strategy.
In experiments, GRASP achieves state-of-the-art performance in terms of both F1 and F1c scores on the DialogRE dataset.
arXiv Detail & Related papers (2022-08-26T08:19:28Z)
- KECP: Knowledge Enhanced Contrastive Prompting for Few-shot Extractive Question Answering [28.18555591429343]
We propose a novel framework named Knowledge Enhanced Contrastive Prompt-tuning (KECP)
Instead of adding pointer heads to PLMs, we transform the task into a non-autoregressive Masked Language Modeling (MLM) generation problem.
Our method consistently outperforms state-of-the-art approaches in few-shot settings by a large margin.
arXiv Detail & Related papers (2022-05-06T08:31:02Z)
- Meta Reinforcement Learning with Autonomous Inference of Subtask Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
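The SpaLLM entry above describes sketching pre-trained weights into lookup tables and fine-tuning only the table values. The following is a rough, hypothetical illustration of that general idea (a fixed random assignment of weight positions to a small trainable table); it is not SpaLLM's actual sketching scheme.

```python
# Hypothetical sketch of lookup-table weight compression (not SpaLLM's algorithm):
# every entry of a frozen weight matrix is tied to one of `table_size` shared,
# trainable values, so only the small table is updated during fine-tuning.
import torch
import torch.nn as nn


class LookupLinear(nn.Module):
    def __init__(self, weight: torch.Tensor, table_size: int = 4096):
        super().__init__()
        weight = weight.detach().float()
        out_f, in_f = weight.shape
        # Fixed (non-trainable) assignment of every weight position to a table slot.
        self.register_buffer("slots", torch.randint(0, table_size, (out_f, in_f)))
        # Initialize each table value with the mean of the original weights mapped to it.
        sums = torch.zeros(table_size).index_add(0, self.slots.flatten(), weight.flatten())
        counts = torch.bincount(self.slots.flatten(), minlength=table_size).clamp(min=1)
        self.table = nn.Parameter(sums / counts)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.table[self.slots]  # reconstruct the full weight from the shared table
        return x @ w.T


# e.g. compress a 768x768 projection into 4096 trainable scalars:
layer = LookupLinear(torch.randn(768, 768), table_size=4096)
```

With a purely random assignment the reconstruction is lossy; the point of this toy is only the parameter-sharing mechanism, not the quality of the sketch itself.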
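Several other entries above (Pareto Low-Rank Adapters, the latency-free adapter, SHERL) revolve around small additive adapters attached to frozen weights. For reference, here is a minimal generic low-rank (LoRA-style) adapter; it shows only the shared mechanism and is not any single paper's exact method.

```python
# Generic LoRA-style adapter sketch: a frozen base layer is augmented with a
# trainable low-rank update, y = W x + scale * B A x, so only A and B are tuned.
import torch
import torch.nn as nn


class LowRankAdapter(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, scale: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():  # keep the pre-trained weights frozen
            p.requires_grad = False
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))  # zero init: no change at step 0
        self.scale = scale

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)


# e.g. wrap one projection of a frozen transformer block:
adapted = LowRankAdapter(nn.Linear(768, 768), rank=8)
```

A multi-task variant in the spirit of PaLoRA would presumably hold one (A, B) pair per task and select or weight them according to task preferences.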