Quartet Logic: A Four-Step Reasoning (QLFR) framework for advancing Short Text Classification
- URL: http://arxiv.org/abs/2401.03158v1
- Date: Sat, 6 Jan 2024 08:28:20 GMT
- Title: Quartet Logic: A Four-Step Reasoning (QLFR) framework for advancing Short Text Classification
- Authors: Hui Wu, Yuanben Zhang, Zhonghe Han, Yingyan Hou, Lei Wang, Siye Liu, Qihang Gong and Yunping Ge
- Abstract summary: Short Text Classification (STC) is crucial for processing and comprehending the brief but substantial content prevalent on contemporary digital platforms.
The emergence of Large Language Models (LLMs) and Chain-of-Thought (CoT) has significantly improved the performance of complex reasoning tasks.
This study introduces Quartet Logic: A Four-Step Reasoning (QLFR) framework.
- Score: 5.561563686684933
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Short Text Classification (STC) is crucial for processing and comprehending
the brief but substantial content prevalent on contemporary digital platforms.
STC encounters difficulties in grasping semantic and syntactic intricacies,
an issue apparent in traditional pre-trained language models. Although
Graph Convolutional Networks enhance performance by integrating external
knowledge bases, these methods are limited by the quality and extent of the
knowledge applied. Recently, the emergence of Large Language Models (LLMs) and
Chain-of-Thought (CoT) has significantly improved the performance of complex
reasoning tasks. However, some studies have highlighted the limitations of
their application in fundamental NLP tasks. Consequently, this study sought to
employ CoT to investigate the capabilities of LLMs in STC tasks. This study
introduces Quartet Logic: A Four-Step Reasoning (QLFR) framework. This
framework primarily incorporates Syntactic and Semantic Enrichment CoT,
effectively decomposing the STC task into four distinct steps: (i) essential
concept identification, (ii) common-sense knowledge retrieval, (iii) text
rewriting, and (iv) classification. This elicits the inherent knowledge and
abilities of LLMs to address the challenges in STC. Surprisingly, we found that
QLFR can also improve the performance of smaller models. Therefore, we
developed a CoT-Driven Multi-task learning (QLFR-CML) method to facilitate the
knowledge transfer from LLMs to smaller models. Extensive experimentation
across six short-text benchmarks validated the efficacy of the proposed
methods. Notably, QLFR achieved state-of-the-art performance on all datasets,
with significant improvements, particularly on the Ohsumed and TagMyNews
datasets.
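
The four steps chain naturally into a staged prompting pipeline, each prompt consuming the previous step's output. The Python sketch below is a minimal illustration of that flow; the prompt wording and the `complete` callable (any function that sends a prompt to an LLM and returns its text reply) are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of the QLFR four-step reasoning pipeline described in the
# abstract. Prompt wording and the `complete` callable are illustrative
# assumptions, not the authors' released code.
from typing import Callable

def qlfr_classify(text: str, labels: list[str],
                  complete: Callable[[str], str]) -> str:
    # Step (i): essential concept identification
    concepts = complete(
        f"Identify the essential concepts in this short text:\n{text}")
    # Step (ii): common-sense knowledge retrieval, elicited from the LLM itself
    knowledge = complete(
        f"List common-sense knowledge relevant to these concepts:\n{concepts}")
    # Step (iii): text rewriting, enriched with the retrieved knowledge
    rewritten = complete(
        "Rewrite the text so its meaning is explicit, using this background "
        f"knowledge.\nText: {text}\nKnowledge: {knowledge}")
    # Step (iv): classification over the knowledge-enriched rewriting
    label = complete(
        f"Classify the text into one of {labels}. Answer with the label only.\n"
        f"Text: {rewritten}")
    return label.strip()
```

Because step (iv) operates on the knowledge-enriched rewriting rather than the original sparse text, the pipeline directly targets the semantic and syntactic sparsity the abstract identifies as STC's core difficulty.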
Related papers
- Bridging Context Gaps: Leveraging Coreference Resolution for Long Contextual Understanding [28.191029786204624]
We introduce the Long Question Coreference Adaptation (LQCA) method to enhance the performance of large language models (LLMs).
This framework focuses on coreference resolution tailored to long contexts, allowing the model to identify and manage references effectively.
The framework provides easier-to-handle partitions for LLMs, promoting better understanding.
arXiv Detail & Related papers (2024-10-02T15:39:55Z)
- Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives [54.14429346914995]
Chain-of-Thought (CoT) has become a pivotal method for solving complex problems.
However, large language models (LLMs) often struggle to accurately decompose domain-specific tasks.
This paper introduces the Re-TASK framework, a novel theoretical model that revisits LLM tasks from the perspectives of capability, skill, and knowledge.
arXiv Detail & Related papers (2024-08-13T13:58:23Z)
- On the Hardness of Faithful Chain-of-Thought Reasoning in Large Language Models [25.029579061612456]
Large Language Models (LLMs) are increasingly being employed in real-world applications in critical domains such as healthcare.
It is important to ensure that the Chain-of-Thought (CoT) reasoning generated by these models faithfully captures their underlying behavior.
arXiv Detail & Related papers (2024-06-15T13:16:44Z)
- Token-Efficient Leverage Learning in Large Language Models [13.830828529873056]
Large Language Models (LLMs) have excelled in various tasks but perform better in high-resource scenarios.
Data scarcity and the inherent difficulty of adapting LLMs to specific tasks compound the challenge.
We present a streamlined implementation of this methodology, called Token-Efficient Leverage Learning (TELL).
arXiv Detail & Related papers (2024-04-01T04:39:44Z)
- TriSum: Learning Summarization Ability from Large Language Models with Structured Rationale [66.01943465390548]
We introduce TriSum, a framework for distilling large language models' text summarization abilities into a compact, local model.
Our method enhances local model performance on various benchmarks.
It also improves interpretability by providing insights into the summarization rationale.
arXiv Detail & Related papers (2024-03-15T14:36:38Z)
- ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis [20.24915029448926]
Large language models (LLMs) have achieved commendable accomplishments in various natural language processing tasks, yet they still face challenges in complex scenarios involving multiple entities.
These challenges arise from implicit relationships that demand multi-step reasoning.
We propose a novel approach ERA-CoT, which aids LLMs in understanding context by capturing relationships between entities.
arXiv Detail & Related papers (2024-03-11T17:18:53Z)
- Evaluating LLMs' Mathematical Reasoning in Financial Document Question Answering [53.56653281752486]
This study explores Large Language Models' mathematical reasoning on four financial question-answering datasets.
We focus on sensitivity to table complexity and performance variations with an increasing number of arithmetic reasoning steps.
We introduce a novel prompting technique tailored to semi-structured documents, matching or outperforming other baselines in performance.
arXiv Detail & Related papers (2024-02-17T05:10:18Z)
- InFoBench: Evaluating Instruction Following Ability in Large Language Models [57.27152890085759]
Decomposed Requirements Following Ratio (DRFR) is a new metric for evaluating Large Language Models' (LLMs) ability to follow instructions.
We present InFoBench, a benchmark comprising 500 diverse instructions and 2,250 decomposed questions across multiple constraint categories.
arXiv Detail & Related papers (2024-01-07T23:01:56Z)
- MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning [63.80739044622555]
We introduce MuSR, a dataset for evaluating language models on soft reasoning tasks specified in a natural language narrative.
This dataset has two crucial features. First, it is created through a novel neurosymbolic synthetic-to-natural generation algorithm.
Second, our dataset instances are free text narratives corresponding to real-world domains of reasoning.
arXiv Detail & Related papers (2023-10-24T17:59:20Z)
- Knowledge-Augmented Reasoning Distillation for Small Language Models in Knowledge-Intensive Tasks [90.11273439036455]
Large Language Models (LLMs) have shown promising performance in knowledge-intensive reasoning tasks.
We propose Knowledge-Augmented Reasoning Distillation (KARD), a novel method that fine-tunes small LMs to generate rationales from LLMs with augmented knowledge retrieved from an external knowledge base.
We empirically show that KARD significantly improves the performance of small T5 and GPT models on the challenging knowledge-intensive reasoning datasets.
arXiv Detail & Related papers (2023-05-28T13:00:00Z)
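
The KARD entry above is concrete enough to sketch: the small model is fine-tuned on supervised pairs whose input couples a question with retrieved knowledge and whose target is an LLM-generated rationale. The sketch below assembles one such training pair under those assumptions; `retrieve` and `teacher_rationale` are hypothetical helpers standing in for the knowledge-base retriever and the teacher LLM, not KARD's actual code.

```python
# Hypothetical sketch of one KARD-style training pair, based only on the
# summary above: a small LM is fine-tuned to generate the teacher LLM's
# rationale given a question plus knowledge retrieved from an external base.
# `retrieve` and `teacher_rationale` are assumed helpers, not KARD's code.
from typing import Callable

def build_kard_example(question: str,
                       retrieve: Callable[[str], list[str]],
                       teacher_rationale: Callable[[str], str]) -> dict[str, str]:
    passages = retrieve(question)             # knowledge from the external base
    rationale = teacher_rationale(question)   # distillation target from the teacher LLM
    prompt = f"{question}\nKnowledge:\n" + "\n".join(passages)
    return {"input": prompt, "target": rationale}  # supervised pair for the small LM
```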