Proposal Report for the 2nd SciCAP Competition 2024
- URL: http://arxiv.org/abs/2407.01897v1
- Date: Tue, 2 Jul 2024 02:42:29 GMT
- Title: Proposal Report for the 2nd SciCAP Competition 2024
- Authors: Pengpeng Li, Tingmin Li, Jingyuan Wang, Boyuan Wang, Yang Yang
- Abstract summary: We propose a method for document summarization using auxiliary information.
Our experiments demonstrate that leveraging high-quality OCR data enables efficient summarization of the content related to described objects.
Our method achieved top scores of 4.33 and 4.66 in the long caption and short caption tracks, respectively, of the 2024 SciCAP competition.
- Score: 20.58804817441756
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we propose a method for document summarization using auxiliary information. This approach effectively summarizes descriptions related to specific images, tables, and appendices within lengthy texts. Our experiments demonstrate that leveraging high-quality OCR data and initially extracted information from the original text enables efficient summarization of the content related to described objects. Based on these findings, we enhanced popular text generation models by incorporating additional auxiliary branches to improve summarization performance. Our method achieved top scores of 4.33 and 4.66 in the long caption and short caption tracks, respectively, of the 2024 SciCAP competition, ranking highest in both categories.
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- The Solution for the CVPR2024 NICE Image Captioning Challenge [2.614188906122931]
This report introduces a solution to Topic 1, Zero-shot Image Captioning, of the 2024 NICE challenge (New frontiers for zero-shot Image Captioning Evaluation).
arXiv Detail & Related papers (2024-04-19T09:32:16Z)
- The Solution for the ICCV 2023 1st Scientific Figure Captioning Challenge [19.339645217996235]
We propose a solution for improving the quality of captions generated for figures in papers.
Our approach ranked first in the final test with a score of 4.49.
arXiv Detail & Related papers (2024-03-26T03:03:50Z)
- Rank Your Summaries: Enhancing Bengali Text Summarization via Ranking-based Approach [0.0]
This paper aims to identify the most accurate and informative summary for a given text by utilizing a simple but effective ranking-based approach.
We utilize four pre-trained summarization models to generate summaries, followed by applying a text ranking algorithm to identify the most suitable summary.
Experimental results suggest that by leveraging the strengths of each pre-trained transformer model, our methodology significantly improves the accuracy and effectiveness of Bengali text summarization.
arXiv Detail & Related papers (2023-07-14T15:07:20Z)
- ICDAR 2023 Competition on Structured Text Extraction from Visually-Rich Document Images [198.35937007558078]
The competition opened on 30th December, 2022 and closed on 24th March, 2023.
Track 1 received 91 valid submissions from 35 participants, and Track 2 received 26 valid submissions from 15 participants.
Based on the performance of the submissions, we believe there is still a large gap to the expected information extraction performance in complex and zero-shot scenarios.
arXiv Detail & Related papers (2023-06-05T22:20:52Z)
- Exploiting Summarization Data to Help Text Simplification [50.0624778757462]
We analyzed the similarity between text summarization and text simplification and exploited summarization data to help simplify.
We named these pairs Sum4Simp (S4S) and conducted human evaluations to show that S4S is high-quality.
arXiv Detail & Related papers (2023-02-14T15:32:04Z)
- COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO.
COLO boosts the extractive and abstractive results of one-stage systems on the CNN/DailyMail benchmark to ROUGE-1 scores of 44.58 and 46.33, respectively.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
- Comparing Methods for Extractive Summarization of Call Centre Dialogue [77.34726150561087]
We experimentally compare several such methods by using them to produce summaries of calls, and evaluating these summaries objectively.
We found that TopicSum and Lead-N outperform the other summarisation methods, whilst BERTSum received comparatively lower scores in both subjective and objective evaluations.
arXiv Detail & Related papers (2022-09-06T13:16:02Z)
- Long Document Summarization with Top-down and Bottom-up Inference [113.29319668246407]
We propose a principled inference framework to improve summarization models on two aspects.
Our framework assumes a hierarchical latent structure of a document in which the top level captures long-range dependencies.
We demonstrate the effectiveness of the proposed framework on a diverse set of summarization datasets.
arXiv Detail & Related papers (2022-03-15T01:24:51Z)
- RetrievalSum: A Retrieval Enhanced Framework for Abstractive Summarization [25.434558112121778]
We propose a novel retrieval enhanced abstractive summarization framework consisting of a dense Retriever and a Summarizer.
We validate our method on a wide range of summarization datasets across multiple domains and two backbone models: BERT and BART.
Results show that our framework improves the ROUGE-1 score by 1.38 to 4.66 points over strong pre-trained models.
arXiv Detail & Related papers (2021-09-16T12:52:48Z)
- Topic Modeling Based Extractive Text Summarization [0.0]
We propose a novel method to summarize a text document by clustering its contents based on latent topics.
We use the lesser-used and challenging WikiHow dataset in our approach to text summarization.
arXiv Detail & Related papers (2021-06-29T12:28:19Z)