Domain Specific Fine-tuning of Denoising Sequence-to-Sequence Models for
Natural Language Summarization
- URL: http://arxiv.org/abs/2204.09716v1
- Date: Wed, 6 Apr 2022 18:17:14 GMT
- Title: Domain Specific Fine-tuning of Denoising Sequence-to-Sequence Models for
Natural Language Summarization
- Authors: Brydon Parker, Alik Sokolov, Mahtab Ahmed, Matt Kalebic, Sedef Akinli
Kocak, Ofer Shai
- Abstract summary: We explore applications of a state-of-the-art NLP model (BART) and strategies for tuning it to optimal performance.
We show that our end-to-end fine-tuning approach can result in a 5-6% absolute ROUGE-1 improvement over an out-of-the-box pre-trained BART summarizer.
- Score: 2.9360071145551068
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Summarization of long-form text data is a problem especially pertinent in
knowledge economy jobs such as medicine and finance, which require continuously
staying informed on a sophisticated and evolving body of knowledge. As such,
isolating and summarizing key content automatically using Natural Language
Processing (NLP) techniques holds the potential for extensive time savings in
these industries. We explore applications of a state-of-the-art NLP model
(BART) and strategies for tuning it to optimal performance using data
augmentation and various fine-tuning strategies. We show that our end-to-end
fine-tuning approach can result in a 5-6% absolute ROUGE-1 improvement over an
out-of-the-box pre-trained BART summarizer when tested on domain-specific data,
and we make our end-to-end pipeline available to reproduce these results on
finance, medical, or other user-specified domains.
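A minimal sketch of the underlying recipe (not the authors' released
pipeline): fine-tuning a pre-trained BART summarizer on in-domain
(document, summary) pairs with Hugging Face Transformers. The data file,
column names, and hyperparameters below are illustrative assumptions.

```python
# Minimal sketch (not the authors' released pipeline): fine-tune an
# out-of-the-box BART summarizer on domain-specific (document, summary)
# pairs. File name, column names, and hyperparameters are assumptions.
from datasets import load_dataset
from transformers import (BartForConditionalGeneration, BartTokenizerFast,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

checkpoint = "facebook/bart-large-cnn"  # pre-trained summarization checkpoint
tokenizer = BartTokenizerFast.from_pretrained(checkpoint)
model = BartForConditionalGeneration.from_pretrained(checkpoint)

# Hypothetical in-domain corpus with "document" and "summary" columns.
data = load_dataset("csv", data_files={"train": "finance_train.csv"})["train"]

def preprocess(batch):
    enc = tokenizer(batch["document"], max_length=1024, truncation=True)
    enc["labels"] = tokenizer(
        text_target=batch["summary"], max_length=128, truncation=True
    )["input_ids"]
    return enc

train_ds = data.map(preprocess, batched=True, remove_columns=data.column_names)

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(
        output_dir="bart-domain-summarizer",
        per_device_train_batch_size=2,
        learning_rate=3e-5,
        num_train_epochs=3,
    ),
    train_dataset=train_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```

In this sketch, pointing the loader at a medical corpus instead of
finance_train.csv is the only change needed to target another domain.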
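The reported gain is measured in absolute ROUGE-1 points. A toy comparison
using the `evaluate` library (an assumption about tooling, not the authors'
scorer) shows how such a difference is computed:

```python
# Toy ROUGE-1 comparison between a baseline and a fine-tuned summary.
# An "absolute" improvement is a simple difference of ROUGE-1 scores.
import evaluate

rouge = evaluate.load("rouge")
reference = ["quarterly revenue rose 12 percent on strong card spending"]
baseline = ["the company reported results for the quarter"]
finetuned = ["revenue rose 12 percent driven by card spending"]

r_base = rouge.compute(predictions=baseline, references=reference)["rouge1"]
r_ft = rouge.compute(predictions=finetuned, references=reference)["rouge1"]
print(f"absolute ROUGE-1 gain: {100 * (r_ft - r_base):.1f} points")
```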
Related papers
- IGOT: Information Gain Optimized Tokenizer on Domain Adaptive Pretraining [2.009700777745832]
Pretrained Large Language Models (LLMs) have demonstrated strong capabilities in various fields of natural language generation.
When using generative AI to process downstream tasks, a common approach is to add new knowledge through continued training or fine-tuning.
In this article, we proposed the Information Gain Optimized Tokenizer (IGOT), which analyzes the special token set of downstream tasks and constructs a new subset using a heuristic function $\phi$ of each special token and its information gain.
We explored the many positive effects of this method's customized tokenizer on domain-adaptive pretraining and verified that it can perform better than the baseline (a toy scoring sketch follows the list below).
arXiv Detail & Related papers (2024-05-16T07:25:10Z)
- Dial-insight: Fine-tuning Large Language Models with High-Quality Domain-Specific Data Preventing Capability Collapse [4.98050508891467]
We propose a two-stage approach for the construction of production prompts designed to yield high-quality data.
This method involves the generation of a diverse array of prompts that encompass a broad spectrum of tasks and exhibit a rich variety of expressions.
We introduce a cost-effective, multi-dimensional quality assessment framework to ensure the integrity of the generated labeling data.
arXiv Detail & Related papers (2024-03-14T08:27:32Z)
- When Parameter-efficient Tuning Meets General-purpose Vision-language Models [65.19127815275307]
PETAL revolutionizes the training process by requiring only 0.5% of the total parameters, achieved through a unique mode approximation technique.
Our experiments reveal that PETAL not only outperforms current state-of-the-art methods in most scenarios but also surpasses full fine-tuning models in effectiveness.
arXiv Detail & Related papers (2023-12-16T17:13:08Z)
- State Sequences Prediction via Fourier Transform for Representation Learning [111.82376793413746]
We propose State Sequences Prediction via Fourier Transform (SPF), a novel method for learning expressive representations efficiently.
We theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity.
Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
arXiv Detail & Related papers (2023-10-24T14:47:02Z)
- Controlled Randomness Improves the Performance of Transformer Models [4.678970068275123]
We introduce controlled randomness, i.e. noise, into the training process to improve the fine-tuning of language models.
We find that adding such noise can improve performance on our two downstream tasks: joint named entity recognition and relation extraction, and text summarization (a noise-injection sketch follows the list below).
arXiv Detail & Related papers (2023-10-20T14:12:55Z)
- FinGPT: Instruction Tuning Benchmark for Open-Source Large Language Models in Financial Datasets [9.714447724811842]
This paper introduces a distinctive approach anchored in the Instruction Tuning paradigm for open-source large language models.
We capitalize on the interoperability of open-source models, ensuring a seamless and transparent integration.
The paper presents a benchmarking scheme designed for end-to-end training and testing, employing a cost-effective progression.
arXiv Detail & Related papers (2023-10-07T12:52:58Z)
- "FIJO": a French Insurance Soft Skill Detection Dataset [0.0]
This article proposes a new public dataset, FIJO, containing insurance job offers, including many soft skill annotations.
We present the results of skill detection algorithms using a named entity recognition approach and show that transformer-based models achieve good token-wise performance on this dataset (a token-classification sketch follows the list below).
arXiv Detail & Related papers (2022-04-11T15:54:22Z)
- Fine-Tuning Large Neural Language Models for Biomedical Natural Language Processing [55.52858954615655]
We conduct a systematic study on fine-tuning stability in biomedical NLP.
We show that fine-tuning performance may be sensitive to pretraining settings, especially in low-resource domains.
We show that stabilization techniques can substantially improve fine-tuning performance for low-resource biomedical NLP applications (a layer-freezing sketch follows the list below).
arXiv Detail & Related papers (2021-12-15T04:20:35Z)
- SDA: Improving Text Generation with Self Data Augmentation [88.24594090105899]
We propose to improve the standard maximum likelihood estimation (MLE) paradigm by incorporating a self-imitation-learning phase for automatic data augmentation.
Unlike most existing sentence-level augmentation strategies, our method is more general and could be easily adapted to any MLE-based training procedure.
arXiv Detail & Related papers (2021-01-02T01:15:57Z)
- DAGA: Data Augmentation with a Generation Approach for Low-resource Tagging Tasks [88.62288327934499]
We propose a novel augmentation method using language models trained on linearized labeled sentences.
Our method is applicable to both supervised and semi-supervised settings (a linearization sketch follows the list below).
arXiv Detail & Related papers (2020-11-03T07:49:15Z)
- Domain-Specific Language Model Pretraining for Biomedical Natural Language Processing [73.37262264915739]
We show that for domains with abundant unlabeled text, such as biomedicine, pretraining language models from scratch results in substantial gains.
Our experiments show that domain-specific pretraining serves as a solid foundation for a wide range of biomedical NLP tasks.
arXiv Detail & Related papers (2020-07-31T00:04:15Z)
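For the IGOT entry above: the snippet does not spell out the heuristic
$\phi$, so the following is only a toy illustration of ranking candidate
tokens by information gain over a domain/general document split:

```python
# Toy illustration (not the IGOT paper's exact phi heuristic): score each
# candidate token by the information gain of "document contains token"
# with respect to a domain-vs-general label.
import math
from collections import Counter

def entropy(ps):
    return -sum(p * math.log2(p) for p in ps if p > 0)

def information_gain(token, domain_docs, general_docs):
    docs = [(d.split(), 1) for d in domain_docs] + \
           [(d.split(), 0) for d in general_docs]
    n = len(docs)
    base = entropy([len(domain_docs) / n, len(general_docs) / n])
    cond = 0.0
    for group in ([y for words, y in docs if token in words],
                  [y for words, y in docs if token not in words]):
        if group:
            p = sum(group) / len(group)
            cond += len(group) / n * entropy([p, 1 - p])
    return base - cond

domain = ["ebitda margin guidance raised", "ebitda beat consensus estimates"]
general = ["the cat sat on the mat", "rain is expected later today"]
vocab = Counter(w for d in domain + general for w in d.split())
print(sorted(vocab, key=lambda t: -information_gain(t, domain, general))[:3])
```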
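For the controlled-randomness entry: one common way to inject such noise is
to perturb input embeddings at training time. A minimal sketch (the paper's
exact placement and schedule may differ):

```python
# Minimal sketch: add zero-mean Gaussian noise to input embeddings during
# fine-tuning only; sigma controls the amount of "controlled randomness".
import torch

def add_training_noise(embeddings: torch.Tensor, sigma: float = 0.01):
    return embeddings + sigma * torch.randn_like(embeddings)

emb = torch.randn(2, 16, 768, requires_grad=True)  # (batch, seq_len, hidden)
loss = add_training_noise(emb).pow(2).mean()  # placeholder loss
loss.backward()  # gradients flow through the noisy embeddings
```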
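For the FIJO entry: soft-skill detection via named entity recognition is a
token classification task. A sketch with a French encoder (the tag set and
model choice are assumptions, not the dataset's actual schema):

```python
# Sketch of NER-style soft-skill detection; fine-tune on FIJO before use,
# since the classification head below starts untrained.
import torch
from transformers import AutoModelForTokenClassification, AutoTokenizer

labels = ["O", "B-SKILL", "I-SKILL"]  # hypothetical soft-skill tag set
tok = AutoTokenizer.from_pretrained("camembert-base")
model = AutoModelForTokenClassification.from_pretrained(
    "camembert-base", num_labels=len(labels))

enc = tok("Capacité à travailler en équipe", return_tensors="pt")
with torch.no_grad():
    logits = model(**enc).logits            # (1, seq_len, num_labels)
print([labels[i] for i in logits.argmax(-1)[0].tolist()])  # per-token tags
```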
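For the biomedical fine-tuning-stability entry: one stabilization trick often
studied in low-resource settings is freezing the lower encoder layers so
fewer weights move on tiny datasets. A sketch (the paper's exact recipe
differs):

```python
# Freeze the embedding layer and the lower half of the encoder before
# fine-tuning; only the upper layers are updated on the small dataset.
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")

for module in [model.embeddings, *model.encoder.layer[:6]]:
    for p in module.parameters():
        p.requires_grad = False  # keep lower layers fixed

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable fraction: {trainable / total:.1%}")
```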
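For the DAGA entry: the method trains a language model on labeled sentences
linearized so that tags become ordinary tokens; sampling from that model then
yields synthetic labeled data. A sketch of the linearization step (the paper
inserts each tag before the word it labels and drops "O" tags):

```python
def linearize(tokens, tags):
    """Fold token-level tags into the token stream; 'O' tags are dropped."""
    out = []
    for tok, tag in zip(tokens, tags):
        if tag != "O":
            out.append(tag)  # tag token precedes the word it labels
        out.append(tok)
    return " ".join(out)

print(linearize(["John", "lives", "in", "Paris"],
                ["B-PER", "O", "O", "B-LOC"]))
# -> B-PER John lives in B-LOC Paris
```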
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.