Automatic Generation of Factual News Headlines in Finnish
- URL: http://arxiv.org/abs/2212.02170v1
- Date: Mon, 5 Dec 2022 11:12:14 GMT
- Title: Automatic Generation of Factual News Headlines in Finnish
- Authors: Maximilian Koppatz, Khalid Alnajjar, Mika Hämäläinen, Thierry Poibeau
- Abstract summary: We model this as a summarization task where a model is given a news article and its task is to produce a concise headline describing the main topic of the article.
Because there are no openly available GPT-2 models for Finnish, we will first build such a model using several corpora.
The model is then fine-tuned for the headline generation task using a massive news corpus.
- Score: 1.6918354618189375
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present a novel approach to generating news headlines in Finnish for a
given news story. We model this as a summarization task where a model is given
a news article, and its task is to produce a concise headline describing the
main topic of the article. Because there are no openly available GPT-2 models
for Finnish, we will first build such a model using several corpora. The model
is then fine-tuned for the headline generation task using a massive news
corpus. The system is evaluated by 3 expert journalists working in a Finnish
media house. The results showcase the usability of the presented approach as a
headline suggestion tool to facilitate the news production process.
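
The abstract frames headline generation as summarization with a GPT-2-style (causal) language model. One common way to set this up, sketched below, is to join each article and its headline into a single training sequence and, at inference time, prompt with the article alone. The abstract does not give the authors' actual data format, so the separator token, the end-of-sequence token, the character limit, and all function names here are illustrative assumptions.

```python
from dataclasses import dataclass

# Hypothetical special tokens; the abstract does not specify the real ones.
SEP = "<|headline|>"
EOS = "<|endoftext|>"


@dataclass
class Example:
    article: str
    headline: str


def _normalize(text: str) -> str:
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return " ".join(text.split())


def to_training_text(ex: Example, max_article_chars: int = 2000) -> str:
    """Build one fine-tuning sequence: article, separator, headline, end token."""
    article = _normalize(ex.article)[:max_article_chars]
    return f"{article}{SEP}{_normalize(ex.headline)}{EOS}"


def to_prompt(article: str, max_article_chars: int = 2000) -> str:
    """Build the inference prompt: the model is expected to continue with a headline."""
    return f"{_normalize(article)[:max_article_chars]}{SEP}"


def extract_headline(generated: str) -> str:
    """Return the text between the separator and the end token ('' if no separator)."""
    _, _, tail = generated.partition(SEP)
    return tail.split(EOS, 1)[0].strip()
```

With this layout, a fine-tuned model would be given `to_prompt(article)` and its continuation passed through `extract_headline`; a journalist could then accept or edit the suggestion, in line with the headline suggestion use case described above.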
Related papers
- NewsEdits 2.0: Learning the Intentions Behind Updating News [74.84017890548259]
As events progress, news articles often update with new information: if we are not cautious, we risk propagating outdated facts.
In this work, we hypothesize that linguistic features indicate factual fluidity, and that we can predict which facts in a news article will update using solely the text of a news article.
arXiv Detail & Related papers (2024-11-27T23:35:23Z)
- Identifying Informational Sources in News Articles [109.70475599552523]
We build the largest and widest-ranging annotated dataset of informational sources used in news writing.
We introduce a novel task, source prediction, to study the compositionality of sources in news articles.
arXiv Detail & Related papers (2023-05-24T08:56:35Z)
- Framing the News: From Human Perception to Large Language Model Inferences [8.666172545138272]
Identifying the frames of news is important to understand the articles' vision, intention, message to be conveyed, and which aspects of the news are emphasized.
We develop a protocol for human labeling of frames for 1786 headlines of No-Vax movement articles of European newspapers from 5 countries.
We investigate two approaches for frame inference of news headlines: first with a GPT-3.5 fine-tuning approach, and second with GPT-3.5 prompt-engineering.
arXiv Detail & Related papers (2023-04-27T18:30:18Z)
- Beyond Discrete Genres: Mapping News Items onto a Multidimensional Framework of Genre Cues [0.0]
We propose a non-discrete framework for mapping news items in terms of genre cues.
To automatically analyze a large amount of news items, we deliver two computational models for predicting news sentences.
This proposed approach helps in deepening our insight into the evolving nature of news genres.
arXiv Detail & Related papers (2022-12-08T10:54:31Z)
- NewsEdits: A News Article Revision Dataset and a Document-Level Reasoning Challenge [122.37011526554403]
NewsEdits is the first publicly available dataset of news revision histories.
It contains 1.2 million articles with 4.6 million versions from over 22 English- and French-language newspaper sources.
arXiv Detail & Related papers (2022-06-14T18:47:13Z)
- Faking Fake News for Real Fake News Detection: Propaganda-loaded Training Data Generation [105.20743048379387]
We propose a novel framework for generating training examples informed by the known styles and strategies of human-authored propaganda.
Specifically, we perform self-critical sequence training guided by natural language inference to ensure the validity of the generated articles.
Our experimental results show that fake news detectors trained on PropaNews are better at detecting human-written disinformation by 3.62 - 7.69% F1 score on two public datasets.
arXiv Detail & Related papers (2022-03-10T14:24:19Z)
- End-to-End Segmentation-based News Summarization [15.549631631269198]
We introduce the task of segmenting a news article into multiple sections and generating the corresponding summary to each section.
First, we create and make available a dataset, SegNews, consisting of 27k news articles with sections and aligned heading-style section summaries.
Second, we propose a novel segmentation-based language generation model adapted from pre-trained language models.
arXiv Detail & Related papers (2021-10-15T04:17:26Z)
- Modeling "Newsworthiness" for Lead-Generation Across Corpora [85.92467549469147]
We train models on automatically labeled corpora to predict whether each article was a front-page article.
We rank documents in unlabeled corpora on "newsworthiness".
A fine-tuned RoBERTa model achieves 0.93 AUC on held-out labeled documents and 0.88 AUC on expert-validated unlabeled corpora.
arXiv Detail & Related papers (2021-04-19T21:48:15Z)
- NewsBERT: Distilling Pre-trained Language Model for Intelligent News Application [56.1830016521422]
We propose NewsBERT, which can distill pre-trained language models for efficient and effective news intelligence.
In our approach, we design a teacher-student joint learning and distillation framework to collaboratively learn both teacher and student models.
In our experiments, NewsBERT can effectively improve the model performance in various intelligent news applications with much smaller models.
arXiv Detail & Related papers (2021-02-09T15:41:12Z)
- Viable Threat on News Reading: Generating Biased News Using Natural Language Models [49.90665530780664]
We show that publicly available language models can reliably generate biased news content from an original news article given as input.
We also show that a large number of high-quality biased news articles can be generated using controllable text generation.
arXiv Detail & Related papers (2020-10-05T16:55:39Z)
- Generating Representative Headlines for News Stories [31.67864779497127]
Grouping articles that are reporting the same event into news stories is a common way of assisting readers in their news consumption.
It remains a challenging research problem to efficiently and effectively generate a representative headline for each story.
We develop a distant supervision approach to train large-scale generation models without any human annotation.
arXiv Detail & Related papers (2020-01-26T02:08:22Z)
This list is automatically generated from the titles and abstracts of the papers on this site.