Fine-tuning and aligning question answering models for complex
information extraction tasks
- URL: http://arxiv.org/abs/2309.14805v1
- Date: Tue, 26 Sep 2023 10:02:21 GMT
- Title: Fine-tuning and aligning question answering models for complex
information extraction tasks
- Authors: Matthias Engelbach, Dennis Klau, Felix Scheerer, Jens Drawehn,
Maximilien Kintz
- Abstract summary: Extractive language models such as question answering (QA) or passage retrieval models guarantee that query results are found within the boundaries of a given context document.
We show that fine-tuning existing German QA models boosts performance on tailored extraction tasks for complex linguistic features.
We deduce a combined metric from Levenshtein distance, F1-Score, Exact Match, and ROUGE-L to mimic the assessment criteria of human experts.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: The emergence of Large Language Models (LLMs) has boosted performance
and possibilities in various NLP tasks. While the use of generative AI models
like ChatGPT opens up new opportunities for several business use cases, their
current tendency to hallucinate content strongly limits their applicability to
document analysis tasks such as information retrieval from documents. In
contrast, extractive language models like question answering (QA) or passage
retrieval models guarantee that query results are found within the boundaries
of a given context document, which makes them candidates for more reliable
information extraction in corporate production environments. In this work we
propose an approach that integrates extractive QA models into a document
analysis solution for improved feature extraction from German business
documents such as insurance reports or medical leaflets. We further show that
fine-tuning existing German QA models boosts performance on tailored extraction
tasks for complex linguistic features like damage cause explanations or
descriptions of medication appearance, even when using only a small set of
annotated data. Finally, we discuss the relevance of scoring metrics for
evaluating information extraction tasks and deduce a combined metric from
Levenshtein distance, F1-Score, Exact Match, and ROUGE-L to mimic the
assessment criteria of human experts.
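
For illustration, a minimal sketch of such a combined metric in Python. The unweighted mean of the four components, the helper names, and the example strings are assumptions for illustration; the summary does not state the authors' exact weighting.

```python
from collections import Counter


def levenshtein_similarity(pred: str, gold: str) -> float:
    """Normalized Levenshtein similarity: 1 - edit_distance / max length."""
    m, n = len(pred), len(gold)
    if max(m, n) == 0:
        return 1.0
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        curr = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if pred[i - 1] == gold[j - 1] else 1
            curr[j] = min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost)
        prev = curr
    return 1.0 - prev[n] / max(m, n)


def token_f1(pred: str, gold: str) -> float:
    """Token-level F1, as in SQuAD-style QA evaluation."""
    pred_tokens, gold_tokens = pred.split(), gold.split()
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def exact_match(pred: str, gold: str) -> float:
    return float(pred.strip() == gold.strip())


def rouge_l_f1(pred: str, gold: str) -> float:
    """ROUGE-L F1 from the longest common subsequence of tokens."""
    a, b = pred.split(), gold.split()
    if not a or not b:
        return 0.0
    # LCS length via dynamic programming.
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i - 1][j - 1] + 1 if x == y else max(dp[i - 1][j], dp[i][j - 1])
    precision, recall = dp[len(a)][len(b)] / len(a), dp[len(a)][len(b)] / len(b)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def combined_score(pred: str, gold: str) -> float:
    """Unweighted mean of the four component scores (assumed weighting)."""
    scores = [
        levenshtein_similarity(pred, gold),
        token_f1(pred, gold),
        exact_match(pred, gold),
        rouge_l_f1(pred, gold),
    ]
    return sum(scores) / len(scores)


print(combined_score("Wasserschaden durch Rohrbruch",
                     "Wasserschaden durch einen Rohrbruch"))
```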
Related papers
- Likelihood as a Performance Gauge for Retrieval-Augmented Generation [78.28197013467157]
We show that likelihoods serve as an effective gauge for language model performance.
We propose two methods that use question likelihood as a gauge for selecting and constructing prompts that lead to better performance.
arXiv Detail & Related papers (2024-11-12T13:14:09Z)
- Leveraging Large Language Models for Mobile App Review Feature Extraction [4.879919005707447]
This study explores the hypothesis that encoder-only large language models can enhance feature extraction from mobile app reviews.
By leveraging crowdsourced annotations from an industrial context, we redefine feature extraction as a supervised token classification task.
Empirical evaluations demonstrate that this method improves the precision and recall of extracted features and enhances performance efficiency.
arXiv Detail & Related papers (2024-08-02T07:31:57Z)
- generAItor: Tree-in-the-Loop Text Generation for Language Model Explainability and Adaptation [28.715001906405362]
Large language models (LLMs) are widely deployed in various downstream tasks, e.g., auto-completion, aided writing, or chat-based text generation.
We address the limited explainability of such models by proposing a tree-in-the-loop approach, where a visual representation of the beam search tree is the central component for analyzing, explaining, and adapting the generated outputs.
We present generAItor, a visual analytics technique, augmenting the central beam search tree with various task-specific widgets, providing targeted visualizations and interaction possibilities.
arXiv Detail & Related papers (2024-03-12T13:09:15Z)
- A Question Answering Based Pipeline for Comprehensive Chinese EHR Information Extraction [3.411065529290054]
We propose a novel approach that automatically generates training data for transfer learning of question answering models.
Our pipeline incorporates a preprocessing module to handle challenges posed by extraction types.
The obtained QA model exhibits excellent performance on subtasks of information extraction in EHRs.
arXiv Detail & Related papers (2024-02-17T02:55:35Z)
- Assessing Privacy Risks in Language Models: A Case Study on Summarization Tasks [65.21536453075275]
We focus on the summarization task and investigate the membership inference (MI) attack.
We exploit text similarity and the model's resistance to document modifications as potential MI signals.
We discuss several safeguards for training summarization models to protect against MI attacks and examine the inherent trade-off between privacy and utility.
arXiv Detail & Related papers (2023-10-20T05:44:39Z)
- Peek Across: Improving Multi-Document Modeling via Cross-Document Question-Answering [49.85790367128085]
We pre-train a generic multi-document model with a novel cross-document question answering pre-training objective.
This novel multi-document QA formulation directs the model to better recover cross-text informational relations.
Unlike prior multi-document models that focus on either classification or summarization tasks, our pre-training objective formulation enables the model to perform tasks that involve both short text generation and long text generation.
arXiv Detail & Related papers (2023-05-24T17:48:40Z)
- Discover, Explanation, Improvement: An Automatic Slice Detection Framework for Natural Language Processing [72.14557106085284]
Slice detection models (SDMs) automatically identify underperforming groups of data points.
This paper proposes a benchmark named "Discover, Explain, Improve (DEIM)" for NLP classification tasks.
Our evaluation shows that Edisa can accurately select error-prone datapoints with informative semantic features.
arXiv Detail & Related papers (2022-11-08T19:00:00Z)
- Recitation-Augmented Language Models [85.30591349383849]
We show that RECITE is a powerful paradigm for knowledge-intensive NLP tasks.
Specifically, by using recitation as the intermediate step, a recite-and-answer scheme achieves new state-of-the-art performance.
arXiv Detail & Related papers (2022-10-04T00:49:20Z)
- SAIS: Supervising and Augmenting Intermediate Steps for Document-Level Relation Extraction [51.27558374091491]
We propose to explicitly teach the model to capture relevant contexts and entity types by supervising and augmenting intermediate steps (SAIS) for relation extraction.
Based on a broad spectrum of carefully designed tasks, our proposed SAIS method not only extracts relations of better quality due to more effective supervision, but also retrieves the corresponding supporting evidence more accurately.
arXiv Detail & Related papers (2021-09-24T17:37:35Z)
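
Returning to the main paper's setup, a minimal sketch of fine-tuning an existing German extractive QA model on a small annotated set, assuming a Hugging Face Transformers workflow and SQuAD-style span annotations. The checkpoint name, example data, and hyperparameters are illustrative assumptions, not the authors' exact configuration.

```python
from datasets import Dataset
from transformers import (
    AutoModelForQuestionAnswering,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

# Assumed starting checkpoint: a public German extractive QA model.
checkpoint = "deepset/gelectra-base-germanquad"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForQuestionAnswering.from_pretrained(checkpoint)

# A tiny SQuAD-style annotated set (illustrative example data).
examples = Dataset.from_list([{
    "question": "Was war die Schadenursache?",
    "context": "Der Schaden entstand durch einen Rohrbruch im Keller.",
    "answer_start": 27,
    "answer_text": "einen Rohrbruch im Keller",
}])

def preprocess(ex):
    enc = tokenizer(
        ex["question"], ex["context"],
        truncation="only_second", max_length=384,
        padding="max_length", return_offsets_mapping=True,
    )
    # Map the character-level answer span to token start/end positions.
    start_char = ex["answer_start"]
    end_char = start_char + len(ex["answer_text"])
    enc["start_positions"], enc["end_positions"] = 0, 0
    seq_ids = enc.sequence_ids()
    for idx, (s, e) in enumerate(enc["offset_mapping"]):
        if seq_ids[idx] != 1:  # skip question and special tokens
            continue
        if s <= start_char < e:
            enc["start_positions"] = idx
        if s < end_char <= e:
            enc["end_positions"] = idx
    enc.pop("offset_mapping")  # the model does not accept this key
    return enc

train_data = examples.map(preprocess, remove_columns=examples.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="qa-finetuned",
        num_train_epochs=3,
        per_device_train_batch_size=8,
        learning_rate=3e-5,
    ),
    train_dataset=train_data,
)
trainer.train()
```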
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences arising from its use.