DEEM: Dynamic Experienced Expert Modeling for Stance Detection
- URL: http://arxiv.org/abs/2402.15264v3
- Date: Fri, 26 Apr 2024 01:06:31 GMT
- Title: DEEM: Dynamic Experienced Expert Modeling for Stance Detection
- Authors: Xiaolong Wang, Yile Wang, Sijie Cheng, Peng Li, Yang Liu,
- Abstract summary: We propose a Dynamic Experienced Expert Modeling (DEEM) method which can leverage the generated experts and let LLMs reason in a semi-parametric way.
Experimental results demonstrate that DEEM consistently achieves the best results on three standard benchmarks.
- Score: 22.826544082557316
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent work has made a preliminary attempt to use large language models (LLMs) to solve the stance detection task, showing promising results. However, considering that stance detection usually requires detailed background knowledge, the vanilla reasoning method may neglect the domain knowledge to make a professional and accurate analysis. Thus, there is still room for improvement of LLMs reasoning, especially in leveraging the generation capability of LLMs to simulate specific experts (i.e., multi-agents) to detect the stance. In this paper, different from existing multi-agent works that require detailed descriptions and use fixed experts, we propose a Dynamic Experienced Expert Modeling (DEEM) method which can leverage the generated experienced experts and let LLMs reason in a semi-parametric way, making the experts more generalizable and reliable. Experimental results demonstrate that DEEM consistently achieves the best results on three standard benchmarks, outperforms methods with self-consistency reasoning, and reduces the bias of LLMs.
Related papers
- Let the Expert Stick to His Last: Expert-Specialized Fine-Tuning for Sparse Architectural Large Language Models [24.915387910764082]
Expert-Specialized Fine-Tuning, or ESFT, tunes the experts most relevant to downstream tasks while freezing the other experts and modules.
MoE models with finer-grained experts are more advantageous in selecting the combination of experts that are most relevant to downstream tasks.
arXiv Detail & Related papers (2024-07-02T03:11:13Z) - Large Language Models can Deliver Accurate and Interpretable Time Series Anomaly Detection [34.40206965758026]
Time series anomaly detection (TSAD) plays a crucial role in various industries by identifying atypical patterns that deviate from standard trends.
Traditional TSAD models, which often rely on deep learning, require extensive training data and operate as black boxes.
We propose LLMAD, a novel TSAD method that employs Large Language Models (LLMs) to deliver accurate and interpretable TSAD results.
arXiv Detail & Related papers (2024-05-24T09:07:02Z) - Beyond the Known: Investigating LLMs Performance on Out-of-Domain Intent
Detection [34.135738700682055]
This paper conducts a comprehensive evaluation of large language models (LLMs) represented by ChatGPT.
We find that LLMs exhibit strong zero-shot and few-shot capabilities, but is still at a disadvantage compared to models fine-tuned with full resource.
arXiv Detail & Related papers (2024-02-27T07:02:10Z) - Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models [90.14693869269519]
MoE LLMs can achieve higher performance with fewer parameters, but it is still hard to deploy them due to their immense parameter sizes.
This paper mainly aims to enhance the deployment efficiency of MoE LLMs by introducing plug-and-play expert-level sparsification techniques.
arXiv Detail & Related papers (2024-02-22T18:56:07Z) - On Least Square Estimation in Softmax Gating Mixture of Experts [78.3687645289918]
We investigate the performance of the least squares estimators (LSE) under a deterministic MoE model.
We establish a condition called strong identifiability to characterize the convergence behavior of various types of expert functions.
Our findings have important practical implications for expert selection.
arXiv Detail & Related papers (2024-02-05T12:31:18Z) - Omni-SMoLA: Boosting Generalist Multimodal Models with Soft Mixture of Low-rank Experts [74.40198929049959]
Large multi-modal models (LMMs) exhibit remarkable performance across numerous tasks.
generalist LMMs often suffer from performance degradation when tuned over a large collection of tasks.
We propose Omni-SMoLA, an architecture that uses the Soft MoE approach to mix many multimodal low rank experts.
arXiv Detail & Related papers (2023-12-01T23:04:27Z) - ExpeL: LLM Agents Are Experiential Learners [60.54312035818746]
We introduce the Experiential Learning (ExpeL) agent to allow learning from agent experiences without requiring parametric updates.
Our agent autonomously gathers experiences and extracts knowledge using natural language from a collection of training tasks.
At inference, the agent recalls its extracted insights and past experiences to make informed decisions.
arXiv Detail & Related papers (2023-08-20T03:03:34Z) - Separate the Wheat from the Chaff: Model Deficiency Unlearning via
Parameter-Efficient Module Operation [25.6335380561493]
Large language models (LLMs) have been widely used in various applications but are known to suffer from issues related to untruthfulness and toxicity.
We propose a PEMs operation approach, namely Extraction-before-Subtraction (Ext-Sub), to enhance the truthfulness and detoxification of LLMs.
arXiv Detail & Related papers (2023-08-16T01:46:01Z) - Unlocking the Potential of User Feedback: Leveraging Large Language
Model as User Simulator to Enhance Dialogue System [65.93577256431125]
We propose an alternative approach called User-Guided Response Optimization (UGRO) to combine it with a smaller task-oriented dialogue model.
This approach uses LLM as annotation-free user simulator to assess dialogue responses, combining them with smaller fine-tuned end-to-end TOD models.
Our approach outperforms previous state-of-the-art (SOTA) results.
arXiv Detail & Related papers (2023-06-16T13:04:56Z) - Soft Expert Reward Learning for Vision-and-Language Navigation [94.86954695912125]
Vision-and-Language Navigation (VLN) requires an agent to find a specified spot in an unseen environment by following natural language instructions.
We introduce a Soft Expert Reward Learning (SERL) model to overcome the reward engineering designing and generalisation problems of the VLN task.
arXiv Detail & Related papers (2020-07-21T14:17:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.