Extracting Information from Scientific Literature via Visual Table Question Answering Models
- URL: http://arxiv.org/abs/2508.18661v1
- Date: Tue, 26 Aug 2025 04:08:16 GMT
- Title: Extracting Information from Scientific Literature via Visual Table Question Answering Models
- Authors: Dongyoun Kim, Hyung-do Choi, Youngsun Jang, John Kim
- Abstract summary: This study explores three approaches to processing table data in scientific papers to enhance extractive question answering. The methods evaluated include: (1) Optical Character Recognition (OCR) for extracting information from documents, (2) Pre-trained models for document visual question answering, and (3) Table detection and structure recognition to extract and merge key information from tables with textual content to answer extractive questions.
- Score: 1.6411967992595455
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study explores three approaches to processing table data in scientific papers to enhance extractive question answering and develop a software tool for the systematic review process. The methods evaluated include: (1) Optical Character Recognition (OCR) for extracting information from documents, (2) Pre-trained models for document visual question answering, and (3) Table detection and structure recognition to extract and merge key information from tables with textual content to answer extractive questions. In exploratory experiments, we augmented ten sample test documents containing tables and relevant content against RF-EMF-related scientific papers with seven predefined extractive question-answer pairs. The results indicate that approaches preserving table structure outperform the others, particularly in representing and organizing table content. Accurately recognizing specific notations and symbols within the documents emerged as a critical factor for improved results. Our study concludes that preserving the structural integrity of tables is essential for enhancing the accuracy and reliability of extractive question answering in scientific documents.
Related papers
- Towards Knowledge-Aware Document Systems: Modeling Semantic Coverage Relations via Answerability Detection [40.12543056558646]
We introduce a novel framework for modelling Semantic Coverage Relations (SCR), which classifies document pairs based on how their informational content aligns. We define three core relation types: equivalence, inclusion, and semantic overlap. We use a question answering (QA)-based approach, using the answerability of shared questions across documents as an indicator of semantic coverage.
arXiv Detail & Related papers (2025-09-10T06:00:01Z)
- TalentMine: LLM-Based Extraction and Question-Answering from Multimodal Talent Tables [5.365164774382722]
We introduce TalentMine, a novel framework that transforms extracted tables into semantically enriched representations. TalentMine achieves 100% accuracy in query answering tasks compared to 0% for standard AWS Textract extraction. Our comparative analysis also reveals that the Claude v3 Haiku model achieves optimal performance for talent management applications.
arXiv Detail & Related papers (2025-06-22T22:17:42Z)
- Evaluation of Table Representations to Answer Questions from Tables in Documents : A Case Study using 3GPP Specifications [0.650923326742559]
The representation of a table in terms of what is a relevant chunk is not obvious.
Row-level representations that include the corresponding table header information in every cell improve retrieval performance.
arXiv Detail & Related papers (2024-08-30T04:40:35Z)
- Large-Scale Knowledge Synthesis and Complex Information Retrieval from Biomedical Documents [0.33249867230903685]
Recent advances in the healthcare industry have led to an abundance of unstructured data.
Our work offers an all-in-one scalable solution for extracting and exploring complex information from large-scale research documents.
arXiv Detail & Related papers (2023-02-14T06:03:43Z)
- ReSel: N-ary Relation Extraction from Scientific Text and Tables by Learning to Retrieve and Select [53.071352033539526]
We study the problem of extracting N-ary relations from scientific articles.
Our proposed method ReSel decomposes this task into a two-stage procedure.
Our experiments on three scientific information extraction datasets show that ReSel outperforms state-of-the-art baselines significantly.
arXiv Detail & Related papers (2022-10-26T02:28:02Z)
- Mixed-modality Representation Learning and Pre-training for Joint Table-and-Text Retrieval in OpenQA [85.17249272519626]
An optimized OpenQA Table-Text Retriever (OTTeR) is proposed.
We conduct retrieval-centric mixed-modality synthetic pre-training.
OTTeR substantially improves the performance of table-and-text retrieval on the OTT-QA dataset.
arXiv Detail & Related papers (2022-10-11T07:04:39Z)
- Towards Complex Document Understanding By Discrete Reasoning [77.91722463958743]
Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language.
We introduce a new Document VQA dataset, named TAT-DQA, which consists of 3,067 document pages and 16,558 question-answer pairs.
We develop a novel model named MHST that takes into account the information in multi-modalities, including text, layout and visual image, to intelligently address different types of questions.
arXiv Detail & Related papers (2022-07-25T01:43:19Z)
- Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration [75.47708732473586]
We propose a layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents.
LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents.
Empirical results show that layout is critical for VRD-based extraction, and system demonstration also verifies that the extracted knowledge can help locate the answers that users care about.
arXiv Detail & Related papers (2022-07-14T07:59:45Z)
- Neural Content Extraction for Poster Generation of Scientific Papers [84.30128728027375]
The problem of poster generation for scientific papers is under-investigated.
Previous studies focus mainly on poster layout and panel composition, while neglecting the importance of content extraction.
To get both textual and visual elements of a poster panel, a neural extractive model is proposed to extract text, figures and tables of a paper section simultaneously.
arXiv Detail & Related papers (2021-12-16T01:19:37Z)
- Tab.IAIS: Flexible Table Recognition and Semantic Interpretation System [84.39812458417246]
We develop two rule-based algorithms that perform the complete table recognition process and support the most frequent table formats.
To incorporate the extraction of semantic information into the table recognition process, we develop a graph-based table interpretation method.
Our table recognition approach achieves results competitive with state-of-the-art approaches.
arXiv Detail & Related papers (2021-05-25T12:31:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.