Research on Information Extraction of LCSTS Dataset Based on an Improved BERTSum-LSTM Model
- URL: http://arxiv.org/abs/2406.18364v1
- Date: Wed, 26 Jun 2024 14:04:15 GMT
- Title: Research on Information Extraction of LCSTS Dataset Based on an Improved BERTSum-LSTM Model
- Authors: Yiming Chen, Haobin Chen, Simin Liu, Yunyun Liu, Fanhao Zhou, Bing Wei
- Abstract summary: This paper studies the information extraction method of the LCSTS dataset based on an improved BERTSum-LSTM model.
We improve the BERTSum-LSTM model to make it perform better in generating Chinese news summaries.
- Score: 3.942479021508835
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: With the continuous advancement of artificial intelligence, natural language processing technology has become widely utilized in various fields. At the same time, creating Chinese news summaries presents many challenges. First, the semantics of Chinese news are complex and the amount of information is enormous, so extracting critical information from Chinese news is a significant challenge. Second, a news summary should be concise and clear, focusing on the main content and avoiding redundancy. In addition, particularities of the Chinese language, such as polysemy and word segmentation, make it challenging to generate Chinese news summaries. Based on the above, this paper studies the information extraction method of the LCSTS dataset based on an improved BERTSum-LSTM model. We improve the BERTSum-LSTM model so that it performs better at generating Chinese news summaries. The experimental results show that the proposed method is effective at creating news summaries, which is of great significance for news summarization.
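The BERTSum-LSTM idea, sentence representations scored by a recurrent layer to pick summary-worthy sentences, can be illustrated with a minimal sketch. This is not the paper's actual architecture: the scalar per-sentence features stand in for BERT [CLS] embeddings, and the single-unit LSTM with shared scalar weights is a deliberate simplification.

```python
import math
import random

random.seed(42)

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_scores(features, params):
    """Run a toy single-unit LSTM over per-sentence features and
    emit an extraction probability for each sentence."""
    wi, wf, wo, wg, wy = params
    h = c = 0.0
    scores = []
    for x in features:
        i = sigmoid(wi * x + h)    # input gate
        f = sigmoid(wf * x + h)    # forget gate
        o = sigmoid(wo * x + h)    # output gate
        g = math.tanh(wg * x + h)  # candidate cell state
        c = f * c + i * g
        h = o * math.tanh(c)
        scores.append(sigmoid(wy * h))  # sentence-level extraction score
    return scores

# Hypothetical stand-ins for BERT [CLS] features of three sentences.
features = [0.9, 0.1, 0.5]
params = [random.uniform(-1, 1) for _ in range(5)]
scores = lstm_scores(features, params)
# Pick the top-scoring sentence as a one-sentence extractive "summary".
best = max(range(len(features)), key=lambda i: scores[i])
print(scores, best)
```

In a real BERTSum-style system the features would be full embedding vectors, the LSTM would be multi-dimensional and trained end-to-end, and several top-scoring sentences would be selected.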
Related papers
- Assessing the Performance of Chinese Open Source Large Language Models in Information Extraction Tasks [12.400599440431188]
Information Extraction (IE) plays a crucial role in Natural Language Processing (NLP).
Recent experiments focusing on English IE tasks have shed light on the challenges faced by Large Language Models (LLMs) in achieving optimal performance.
arXiv Detail & Related papers (2024-06-04T08:00:40Z) - Dynamic data sampler for cross-language transfer learning in large language models [34.464472766868106]
ChatFlow is a cross-language transfer-based Large Language Model (LLM).
We employ a mix of Chinese, English, and parallel corpora to continuously train the LLaMA2 model.
Experimental results demonstrate that our approach accelerates model convergence and achieves superior performance.
arXiv Detail & Related papers (2024-05-17T08:40:51Z) - COIG-CQIA: Quality is All You Need for Chinese Instruction Fine-tuning [57.600941792026006]
We introduce COIG-CQIA, a high-quality Chinese instruction tuning dataset.
Our aim is to build a diverse, wide-ranging instruction-tuning dataset to better align model behavior with human interactions.
We train models of various scales on different subsets of CQIA and conduct in-depth evaluations and analyses.
arXiv Detail & Related papers (2024-03-26T19:24:18Z) - Simultaneous Machine Translation with Large Language Models [51.470478122113356]
We investigate the possibility of applying Large Language Models to SimulMT tasks.
We conducted experiments using the Llama2-7b-chat model on nine different languages from the MuST-C dataset.
The results show that the LLM outperforms dedicated MT models in terms of BLEU and LAAL metrics.
arXiv Detail & Related papers (2023-09-13T04:06:47Z) - Enhancing LLM with Evolutionary Fine Tuning for News Summary Generation [2.1828601975620257]
We propose a new paradigm for news summary generation using LLM with powerful natural language understanding and generative capabilities.
We use LLM to extract multiple structured event patterns from the events contained in news paragraphs, evolve the event pattern population with genetic algorithm, and select the most adaptive event pattern to input into the LLM to generate news summaries.
A News Summary Generator (NSG) is designed to select and evolve the event pattern populations and generate news summaries.
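The evolve-and-select loop described for the NSG can be sketched as a toy genetic algorithm. The field names and the coverage-based fitness function here are hypothetical illustrations, not the paper's actual event-pattern representation:

```python
import random

random.seed(0)

# Hypothetical key fields an event pattern should cover.
KEY_FIELDS = {"who", "what", "when", "where"}

def fitness(pattern):
    """Toy adaptivity: how many key fields the pattern covers."""
    return len(set(pattern) & KEY_FIELDS)

def mutate(pattern, fields):
    """Replace one randomly chosen slot with a random field."""
    p = list(pattern)
    p[random.randrange(len(p))] = random.choice(fields)
    return p

def evolve(population, fields, generations=20):
    """Keep the fitter half each generation, refill via mutation,
    and return the most adaptive pattern found."""
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        survivors = population[: len(population) // 2]
        children = [mutate(p, fields) for p in survivors]
        population = survivors + children
    return max(population, key=fitness)

fields = ["who", "what", "when", "where", "noise1", "noise2"]
population = [[random.choice(fields) for _ in range(4)] for _ in range(8)]
best = evolve(population, fields)
print(best, fitness(best))
```

In the actual paper the population consists of LLM-extracted structured event patterns and the selected pattern is fed back into the LLM to generate the summary; this sketch only mirrors the selection-and-evolution loop.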
arXiv Detail & Related papers (2023-07-06T08:13:53Z) - Information Screening whilst Exploiting! Multimodal Relation Extraction with Feature Denoising and Multimodal Topic Modeling [96.75821232222201]
Existing research on multimodal relation extraction (MRE) faces two co-existing challenges: internal-information over-utilization and external-information under-exploitation.
We propose a novel framework that simultaneously implements the idea of internal-information screening and external-information exploiting.
arXiv Detail & Related papers (2023-05-19T14:56:57Z) - Dynamic Multi-View Fusion Mechanism For Chinese Relation Extraction [12.818297160055584]
We propose a mixture-of-view-experts framework (MoVE) to dynamically learn multi-view features for Chinese relation extraction.
With both the internal and external knowledge of Chinese characters, our framework can better capture the semantic information of Chinese characters.
arXiv Detail & Related papers (2023-03-09T07:35:31Z) - Reinforced Iterative Knowledge Distillation for Cross-Lingual Named Entity Recognition [54.92161571089808]
Cross-lingual NER transfers knowledge from a rich-resource language to low-resource languages.
Existing cross-lingual NER methods do not make good use of rich unlabeled data in target languages.
We develop a novel approach based on the ideas of semi-supervised learning and reinforcement learning.
arXiv Detail & Related papers (2021-06-01T05:46:22Z) - Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language modeling.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
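The ROUGE-1 metric cited above measures unigram overlap between a candidate summary and a reference. A minimal F1 variant can be computed in a few lines; the example sentences below are made up for illustration:

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """ROUGE-1 F1: unigram-overlap F-score between two token lists."""
    cand, ref = Counter(candidate), Counter(reference)
    overlap = sum((cand & ref).values())  # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

cand = "the model generates a short summary".split()
ref = "the model produces a short news summary".split()
print(round(rouge1_f(cand, ref), 3))  # 5 overlapping unigrams -> 0.769
```

Published ROUGE scores are typically computed with the official toolkit (with stemming and bootstrap resampling), so this sketch will not reproduce reported numbers exactly.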
arXiv Detail & Related papers (2020-10-18T00:21:53Z) - Multi-Task Learning for Cross-Lingual Abstractive Summarization [26.41478399867083]
We introduce existing genuine data such as translation pairs and monolingual abstractive summarization data into training.
Our proposed method, Transum, attaches a special token to the beginning of the input sentence to indicate the target task.
The experimental results show that Transum achieves better performance than the model trained with only pseudo cross-lingual summarization data.
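The special-token mechanism Transum uses can be sketched simply: prepend a task marker so one shared model can route between translation, monolingual summarization, and cross-lingual summarization. The token strings below are hypothetical, not the paper's actual vocabulary:

```python
# Hypothetical task tokens, following the special-token idea.
TASK_TOKENS = {
    "translation": "<2trans>",
    "monolingual_summarization": "<2sum>",
    "cross_lingual_summarization": "<2xsum>",
}

def tag_input(sentence, task):
    """Prepend the task token so one model can condition on the target task."""
    return f"{TASK_TOKENS[task]} {sentence}"

print(tag_input("news article text", "cross_lingual_summarization"))
# -> "<2xsum> news article text"
```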
arXiv Detail & Related papers (2020-10-15T04:03:00Z) - InfoBERT: Improving Robustness of Language Models from An Information Theoretic Perspective [84.78604733927887]
Large-scale language models such as BERT have achieved state-of-the-art performance across a wide range of NLP tasks.
Recent studies show that such BERT-based models are vulnerable to textual adversarial attacks.
We propose InfoBERT, a novel learning framework for robust fine-tuning of pre-trained language models.
arXiv Detail & Related papers (2020-10-05T20:49:26Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.