Think from Words(TFW): Initiating Human-Like Cognition in Large Language
Models Through Think from Words for Japanese Text-level Classification
- URL: http://arxiv.org/abs/2312.03458v1
- Date: Wed, 6 Dec 2023 12:34:46 GMT
- Title: Think from Words(TFW): Initiating Human-Like Cognition in Large Language
Models Through Think from Words for Japanese Text-level Classification
- Authors: Chengguang Gan, Qinghao Zhang, Tatsunori Mori
- Abstract summary: "Think from Words" (TFW) initiates the comprehension process at the word level and then extends it to encompass the entire text.
"TFW with Extra word-level information" (TFW Extra) augmenting comprehension with additional word-level data.
Our findings shed light on the impact of various word-level information types on LLMs' text comprehension.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The proliferation of Large Language Models (LLMs) has spurred extensive
research into LLM-related Prompt investigations, such as Instruction Learning
(IL), In-context Learning (ICL), and Chain-of-Thought (CoT). These approaches
aim to improve LLMs' responses by enabling them to provide concise statements
or examples for deeper contemplation when addressing questions. However,
independent thinking by LLMs can introduce variability in their thought
processes, leading to potential inaccuracies. In response, our study seeks to
bridge the gap between LLM and human-like thinking processes, recognizing that
text comprehension begins with understanding individual words. To tackle this
challenge, we have expanded the CoT method to cater to a specific domain. Our
approach, known as "Think from Words" (TFW), initiates the comprehension
process at the word level and then extends it to encompass the entire text. We
also propose "TFW with Extra word-level information" (TFW Extra), augmenting
comprehension with additional word-level data. To assess our methods, we employ
text classification on six Japanese datasets comprising text-level and
word-level elements. Our findings not only validate the effectiveness of TFW
but also shed light on the impact of various word-level information types on
LLMs' text comprehension, offering insights into their potential to cause
misinterpretations and errors in the overall comprehension of the final text.
Related papers
- Can LLMs Understand the Implication of Emphasized Sentences in Dialogue? [64.72966061510375]
Emphasis is a crucial component in human communication, which indicates the speaker's intention and implication beyond pure text in dialogue.
This paper introduces Emphasized-Talk, a benchmark with emphasis-annotated dialogue samples capturing the implications of emphasis.
We evaluate various Large Language Models (LLMs), both open-source and commercial, to measure their performance in understanding emphasis.
arXiv Detail & Related papers (2024-06-16T20:41:44Z) - Can large language models understand uncommon meanings of common words? [30.527834781076546]
Large language models (LLMs) have shown significant advancements across diverse natural language understanding (NLU) tasks.
Yet, lacking widely acknowledged testing mechanisms, answering whether LLMs are parrots or genuinely comprehend the world' remains unclear.
This paper presents innovative construction of a Lexical Semantic dataset with novel evaluation metrics.
arXiv Detail & Related papers (2024-05-09T12:58:22Z) - FAC$^2$E: Better Understanding Large Language Model Capabilities by Dissociating Language and Cognition [56.76951887823882]
Large language models (LLMs) are primarily evaluated by overall performance on various text understanding and generation tasks.
We present FAC$2$E, a framework for Fine-grAined and Cognition-grounded LLMs' Capability Evaluation.
arXiv Detail & Related papers (2024-02-29T21:05:37Z) - When LLMs Meet Cunning Texts: A Fallacy Understanding Benchmark for Large Language Models [59.84769254832941]
We propose a FaLlacy Understanding Benchmark (FLUB) containing cunning texts that are easy for humans to understand but difficult for models to grasp.
Specifically, the cunning texts that FLUB focuses on mainly consist of the tricky, humorous, and misleading texts collected from the real internet environment.
Based on FLUB, we investigate the performance of multiple representative and advanced LLMs.
arXiv Detail & Related papers (2024-02-16T22:12:53Z) - How Well Do Large Language Models Understand Syntax? An Evaluation by
Asking Natural Language Questions [25.39259677000101]
This study seeks to explore the question through the lens of syntax.
We craft questions targeting nine syntactic knowledge points that are most closely related to sentence comprehension.
Experiments conducted on 24 large language models (LLMs) suggest that most have a limited grasp of syntactic knowledge.
arXiv Detail & Related papers (2023-11-14T16:30:36Z) - Improving Factual Consistency of Text Summarization by Adversarially
Decoupling Comprehension and Embellishment Abilities of LLMs [67.56087611675606]
Large language models (LLMs) generate summaries that are factually inconsistent with original articles.
These hallucinations are challenging to detect through traditional methods.
We propose an adversarially DEcoupling method to disentangle the abilities of LLMs (DECENT)
arXiv Detail & Related papers (2023-10-30T08:40:16Z) - MAGNIFICo: Evaluating the In-Context Learning Ability of Large Language
Models to Generalize to Novel Interpretations [37.13707912132472]
Humans possess a remarkable ability to assign novel interpretations to linguistic expressions.
Large Language Models (LLMs) have a knowledge cutoff and are costly to finetune repeatedly.
We systematically analyse the ability of LLMs to acquire novel interpretations using in-context learning.
arXiv Detail & Related papers (2023-10-18T00:02:38Z) - Translate to Disambiguate: Zero-shot Multilingual Word Sense
Disambiguation with Pretrained Language Models [67.19567060894563]
Pretrained Language Models (PLMs) learn rich cross-lingual knowledge and can be finetuned to perform well on diverse tasks.
We present a new study investigating how well PLMs capture cross-lingual word sense with Contextual Word-Level Translation (C-WLT)
We find that as the model size increases, PLMs encode more cross-lingual word sense knowledge and better use context to improve WLT performance.
arXiv Detail & Related papers (2023-04-26T19:55:52Z) - Prompting Language Models for Linguistic Structure [73.11488464916668]
We present a structured prompting approach for linguistic structured prediction tasks.
We evaluate this approach on part-of-speech tagging, named entity recognition, and sentence chunking.
We find that while PLMs contain significant prior knowledge of task labels due to task leakage into the pretraining corpus, structured prompting can also retrieve linguistic structure with arbitrary labels.
arXiv Detail & Related papers (2022-11-15T01:13:39Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.