Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts
- URL: http://arxiv.org/abs/2410.20763v2
- Date: Wed, 06 Nov 2024 00:20:32 GMT
- Title: Evaluating LLMs for Targeted Concept Simplification for Domain-Specific Texts
- Authors: Sumit Asthana, Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata
- Abstract summary: Lack of context and unfamiliarity with difficult concepts is a major reason for adult readers' difficulty with domain-specific text.
We introduce "targeted concept simplification," a simplification task for rewriting text to help readers comprehend text containing unfamiliar concepts.
We benchmark the performance of open-source and commercial LLMs and a simple dictionary baseline on this task.
- Score: 53.421616210871704
- Abstract: One useful application of NLP models is to support people in reading complex text from unfamiliar domains (e.g., scientific articles). Simplifying the entire text makes it understandable but sometimes removes important details. In contrast, helping adult readers understand difficult concepts in context can enhance their vocabulary and knowledge. In a preliminary human study, we first identify that lack of context and unfamiliarity with difficult concepts is a major reason for adult readers' difficulty with domain-specific text. We then introduce "targeted concept simplification," a simplification task for rewriting text to help readers comprehend text containing unfamiliar concepts. We also introduce WikiDomains, a new dataset of 22k definitions from 13 academic domains paired with a difficult concept within each definition. We benchmark the performance of open-source and commercial LLMs and a simple dictionary baseline on this task across human judgments of ease of understanding and meaning preservation. Interestingly, our human judges preferred explanations about the difficult concept more than simplification of the concept phrase. Further, no single model achieved superior performance across all quality dimensions, and automated metrics also show low correlations with human evaluations of concept simplification ($\sim0.2$), opening up rich avenues for research on personalized human reading comprehension support.
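As a rough illustration of the last point, the sketch below (not the paper's evaluation code; all scores and ratings are placeholder values) computes a Spearman rank correlation between a hypothetical automatic metric and human ease-of-understanding ratings, the kind of agreement the authors report to be around 0.2.

```python
# Minimal sketch: estimating how well an automatic metric tracks human
# judgments of concept simplification. Scores below are illustrative only.
from scipy.stats import spearmanr

# Hypothetical per-example scores: an automatic metric (e.g., a readability
# or similarity score) and human ratings of "ease of understanding" (1-5).
metric_scores = [0.62, 0.71, 0.55, 0.80, 0.64, 0.58, 0.77, 0.69]
human_ratings = [3, 4, 2, 4, 5, 2, 3, 4]

rho, p_value = spearmanr(metric_scores, human_ratings)
print(f"Spearman correlation: {rho:.2f} (p={p_value:.3f})")
# The paper reports correlations around 0.2 between automated metrics and
# human evaluations, i.e., weak agreement.
```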
Related papers
- Can large language models understand uncommon meanings of common words? [30.527834781076546]
Large language models (LLMs) have shown significant advancements across diverse natural language understanding (NLU) tasks.
Yet, lacking widely acknowledged testing mechanisms, it remains unclear whether LLMs are parrots or genuinely comprehend the world.
This paper presents an innovative construction of a Lexical Semantic dataset with novel evaluation metrics.
arXiv Detail & Related papers (2024-05-09T12:58:22Z) - Digital Comprehensibility Assessment of Simplified Texts among Persons with Intellectual Disabilities [2.446971913303003]
We conducted an evaluation study of text comprehensibility including participants with and without intellectual disabilities reading German texts on a tablet computer.
We explored four different approaches to measuring comprehensibility: multiple-choice comprehension questions, perceived difficulty ratings, response time, and reading speed.
For the target group of persons with intellectual disabilities, comprehension questions emerged as the most reliable measure, while analyzing reading speed provided valuable insights into participants' reading behavior.
arXiv Detail & Related papers (2024-02-20T15:37:08Z) - Interpreting Pretrained Language Models via Concept Bottlenecks [55.47515772358389]
Pretrained language models (PLMs) have made significant strides in various natural language processing tasks.
The lack of interpretability due to their "black-box" nature poses challenges for responsible implementation.
We propose a novel approach to interpreting PLMs by employing high-level, meaningful concepts that are easily understandable for humans.
arXiv Detail & Related papers (2023-11-08T20:41:18Z) - Are NLP Models Good at Tracing Thoughts: An Overview of Narrative Understanding [21.900015612952146]
Narrative understanding involves capturing the author's cognitive processes, providing insights into their knowledge, intentions, beliefs, and desires.
Although large language models (LLMs) excel in generating grammatically coherent text, their ability to comprehend the author's thoughts remains uncertain.
This hinders the practical applications of narrative understanding.
arXiv Detail & Related papers (2023-10-28T18:47:57Z) - LC-Score: Reference-less estimation of Text Comprehension Difficulty [0.0]
We present LC-Score, a simple approach for training a reference-less text comprehension metric for any French text.
Our objective is to quantitatively capture the extent to which a text conforms to the Langage Clair (LC, Clear Language) guidelines.
We explore two approaches: (i) using linguistically motivated indicators used to train statistical models, and (ii) neural learning directly from text leveraging pre-trained language models.
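A minimal sketch of approach (i), assuming a small set of surface indicators and an off-the-shelf regressor; the feature set, labels, and French example sentences are illustrative assumptions, not the authors' implementation:

```python
# Sketch: surface linguistic indicators fed to a statistical regressor that
# predicts a reference-less comprehension-difficulty score.
import re
from sklearn.linear_model import Ridge

def indicators(text: str) -> list[float]:
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"\w+", text)
    avg_sent_len = len(words) / max(len(sentences), 1)    # words per sentence
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    long_word_ratio = sum(len(w) > 8 for w in words) / max(len(words), 1)
    return [avg_sent_len, avg_word_len, long_word_ratio]

# Hypothetical training pairs: (text, difficulty label on a 0-1 scale).
train = [
    ("Le chat dort sur le canapé.", 0.1),
    ("La jurisprudence administrative encadre strictement cette procédure.", 0.8),
    ("Nous vous répondrons rapidement.", 0.2),
    ("Les dispositions réglementaires susmentionnées demeurent applicables.", 0.9),
]
X = [indicators(t) for t, _ in train]
y = [label for _, label in train]

model = Ridge(alpha=1.0).fit(X, y)
print(model.predict([indicators("Votre demande sera traitée sous huit jours.")]))
```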
arXiv Detail & Related papers (2023-10-04T11:49:37Z) - Text Simplification of Scientific Texts for Non-Expert Readers [3.4761212729163318]
Simplification of scientific abstracts helps non-experts to access the core information.
This is especially relevant for, e.g., cancer patients reading about novel treatment options.
arXiv Detail & Related papers (2023-07-07T13:05:11Z) - PromptRobust: Towards Evaluating the Robustness of Large Language Models on Adversarial Prompts [76.18347405302728]
This study uses a plethora of adversarial textual attacks targeting prompts across multiple levels: character, word, sentence, and semantic.
The adversarial prompts are then employed in diverse tasks including sentiment analysis, natural language inference, reading comprehension, machine translation, and math problem-solving.
Our findings demonstrate that contemporary Large Language Models are not robust to adversarial prompts.
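For illustration, a minimal character-level perturbation of a prompt, one of the attack granularities the paper lists; the function and its typo-style swap strategy are assumptions for this sketch, not the PromptRobust toolkit:

```python
# Sketch: character-level adversarial noise applied to a task prompt.
import random

def char_level_attack(prompt: str, n_swaps: int = 3, seed: int = 0) -> str:
    """Randomly swap adjacent characters to emulate typo-style perturbations."""
    rng = random.Random(seed)
    chars = list(prompt)
    for _ in range(n_swaps):
        i = rng.randrange(len(chars) - 1)
        chars[i], chars[i + 1] = chars[i + 1], chars[i]
    return "".join(chars)

print(char_level_attack("Classify the sentiment of the following review:"))
```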
arXiv Detail & Related papers (2023-06-07T15:37:00Z) - Elaborative Simplification as Implicit Questions Under Discussion [51.17933943734872]
This paper proposes to view elaborative simplification through the lens of the Question Under Discussion (QUD) framework.
We show that explicitly modeling QUD provides essential understanding of elaborative simplification and how the elaborations connect with the rest of the discourse.
arXiv Detail & Related papers (2023-05-17T17:26:16Z) - HALMA: Humanlike Abstraction Learning Meets Affordance in Rapid Problem Solving [104.79156980475686]
Humans learn compositional and causal abstraction, i.e., knowledge, in response to the structure of naturalistic tasks.
We argue there should be three levels of generalization in how an agent represents its knowledge: perceptual, conceptual, and algorithmic.
This benchmark is centered around a novel task domain, HALMA, for visual concept development and rapid problem-solving.
arXiv Detail & Related papers (2021-02-22T20:37:01Z) - Concept Learners for Few-Shot Learning [76.08585517480807]
We propose COMET, a meta-learning method that improves generalization ability by learning to learn along human-interpretable concept dimensions.
We evaluate our model on few-shot tasks from diverse domains, including fine-grained image classification, document categorization and cell type annotation.
arXiv Detail & Related papers (2020-07-14T22:04:17Z)