Information Extraction from Unstructured data using Augmented-AI and Computer Vision
- URL: http://arxiv.org/abs/2312.09880v2
- Date: Fri, 25 Jul 2025 08:32:49 GMT
- Title: Information Extraction from Unstructured data using Augmented-AI and Computer Vision
- Authors: Aditya Parikh
- Abstract summary: This paper presents a framework for information extraction that combines Augmented Intelligence (A2I) with computer vision and natural language processing techniques. Our approach addresses the limitations of conventional methods by leveraging deep learning architectures for object detection. The proposed methodology demonstrates improved accuracy and efficiency in extracting structured information from diverse document formats.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Information extraction (IE) from unstructured documents remains a critical challenge in data processing pipelines. Traditional optical character recognition (OCR) methods and conventional parsing engines demonstrate limited effectiveness when processing large-scale document datasets. This paper presents a comprehensive framework for information extraction that combines Augmented Intelligence (A2I) with computer vision and natural language processing techniques. Our approach addresses the limitations of conventional methods by leveraging deep learning architectures for object detection, particularly for tabular data extraction, and integrating cloud-based services for scalable document processing. The proposed methodology demonstrates improved accuracy and efficiency in extracting structured information from diverse document formats including PDFs, images, and scanned documents. Experimental validation shows significant improvements over traditional OCR-based approaches, particularly in handling complex document layouts and multi-modal content extraction.
Related papers
- Structured Attention Matters to Multimodal LLMs in Document Understanding [52.37530640460363]
We investigate how input format influences document comprehension performance. We discover that raw OCR text often impairs rather than improves MLLMs' performance. We propose a novel structure-preserving approach that encodes document elements using the LaTeX paradigm.
arXiv Detail & Related papers (2025-06-19T07:16:18Z) - Digitization of Document and Information Extraction using OCR [0.0]
This document presents a framework for text extraction that merges Optical Character Recognition (OCR) techniques with Large Language Models (LLMs). Scanned files are processed using OCR engines, while digital files are interpreted through layout-aware libraries. The extracted raw text is then analyzed by an LLM to identify key-value pairs and resolve ambiguities.
arXiv Detail & Related papers (2025-06-11T16:03:01Z) - Towards a scalable AI-driven framework for data-independent Cyber Threat Intelligence Information Extraction [0.0]
This paper introduces 0-CTI, a scalable AI-based framework designed for efficient CTI Information Extraction.
The proposed system processes complete text sequences of CTI reports to extract a cyber ontology of named entities and their relationships.
Our contribution is the development of 0-CTI, the first modular framework for CTI Information Extraction that supports both supervised and zero-shot learning.
arXiv Detail & Related papers (2025-01-08T12:35:17Z) - Advanced ingestion process powered by LLM parsing for RAG system [0.0]
This paper introduces a novel multi-strategy parsing approach using LLM-powered OCR to extract content from diverse document types. The methodology employs a node-based extraction technique that creates relationships between different information types and generates context-aware metadata.
arXiv Detail & Related papers (2024-12-16T20:33:33Z) - Contextual Document Embeddings [77.22328616983417]
We propose two complementary methods for contextualized document embeddings.
First, an alternative contrastive learning objective that explicitly incorporates the document neighbors into the intra-batch contextual loss.
Second, a new contextual architecture that explicitly encodes neighbor document information into the encoded representation.
arXiv Detail & Related papers (2024-10-03T14:33:34Z) - GPT-3 Powered Information Extraction for Building Robust Knowledge Bases [0.0]
This work uses the state-of-the-art language model GPT-3 to offer a novel method of information extraction for knowledge base development.
We conduct experiments on a huge corpus of text from diverse fields to assess the performance of our suggested technique.
arXiv Detail & Related papers (2024-07-31T14:59:29Z) - Assessing the quality of information extraction [0.0]
We introduce an automatic framework to assess the quality of the information extraction/retrieval and its completeness.
We discuss how to handle the input/output size limitations of the large language models and analyze their performance.
arXiv Detail & Related papers (2024-04-05T12:51:48Z) - View-Dependent Octree-based Mesh Extraction in Unbounded Scenes for
Procedural Synthetic Data [71.22495169640239]
Procedural signed distance functions (SDFs) are a powerful tool for modeling large-scale detailed scenes.
We propose OcMesher, a mesh extraction algorithm that efficiently handles high-detail unbounded scenes with perfect view-consistency.
arXiv Detail & Related papers (2023-12-13T18:56:13Z) - Data Efficient Training of a U-Net Based Architecture for Structured
Documents Localization [0.0]
We propose SDL-Net: a novel U-Net like encoder-decoder architecture for the localization of structured documents.
Our approach allows pre-training the encoder of SDL-Net on a generic dataset containing samples of various document classes.
arXiv Detail & Related papers (2023-10-02T07:05:19Z) - STAR: Boosting Low-Resource Information Extraction by Structure-to-Text
Data Generation with Large Language Models [56.27786433792638]
STAR is a data generation method that leverages Large Language Models (LLMs) to synthesize data instances.
We design fine-grained step-by-step instructions to obtain the initial data instances.
Our experiments show that the data generated by STAR significantly improve the performance of low-resource event extraction and relation extraction tasks.
arXiv Detail & Related papers (2023-05-24T12:15:19Z) - Visual Information Extraction in the Wild: Practical Dataset and
End-to-end Solution [48.693941280097974]
We propose a large-scale dataset consisting of camera images for visual information extraction (VIE).
We propose a novel framework for end-to-end VIE that combines the stages of OCR and information extraction in an end-to-end learning fashion.
We evaluate the existing end-to-end methods for VIE on the proposed dataset and observe that the performance of these methods has a distinguishable drop from SROIE to our proposed dataset due to the larger variance of layout and entities.
arXiv Detail & Related papers (2023-05-12T14:11:47Z) - More From Less: Self-Supervised Knowledge Distillation for Routine
Histopathology Data [3.93181912653522]
We show that it is possible to distil knowledge during training from information-dense data into models which only require information-sparse data for inference.
This improves downstream classification accuracy on information-sparse data, making it comparable with the fully-supervised baseline.
This approach enables the design of models which require only routine images, but contain insights from state-of-the-art data, allowing better use of the available resources.
arXiv Detail & Related papers (2023-03-19T13:41:59Z) - A Multi-Format Transfer Learning Model for Event Argument Extraction via
Variational Information Bottleneck [68.61583160269664]
Event argument extraction (EAE) aims to extract arguments with given roles from texts.
We propose a multi-format transfer learning model with variational information bottleneck.
We conduct extensive experiments on three benchmark datasets, and obtain new state-of-the-art performance on EAE.
arXiv Detail & Related papers (2022-08-27T13:52:01Z) - Layout-Aware Information Extraction for Document-Grounded Dialogue:
Dataset, Method and Demonstration [75.47708732473586]
We propose a layout-aware document-level Information Extraction dataset, LIE, to facilitate the study of extracting both structural and semantic knowledge from visually rich documents.
LIE contains 62k annotations of three extraction tasks from 4,061 pages in product and official documents.
Empirical results show that layout is critical for VRD-based extraction, and system demonstration also verifies that the extracted knowledge can help locate the answers that users care about.
arXiv Detail & Related papers (2022-07-14T07:59:45Z) - Deep Reinforcement Learning Assisted Federated Learning Algorithm for
Data Management of IIoT [82.33080550378068]
The continuously expanding scale of the industrial Internet of Things (IIoT) leads to IIoT equipment generating massive amounts of user data every moment.
How to manage these time series data in an efficient and safe way in the field of IIoT is still an open issue.
This paper studies the FL technology applications to manage IIoT equipment data in wireless network environments.
arXiv Detail & Related papers (2022-02-03T07:12:36Z) - One-shot Key Information Extraction from Document with Deep Partial
Graph Matching [60.48651298832829]
Key Information Extraction (KIE) from documents improves efficiency, productivity, and security in many industrial scenarios.
Existing supervised learning methods for the KIE task need to feed a large number of labeled samples and learn separate models for different types of documents.
We propose a deep end-to-end trainable network for one-shot KIE using partial graph matching.
arXiv Detail & Related papers (2021-09-26T07:45:53Z) - Efficient Learning of Pinball TWSVM using Privileged Information and its
applications [0.0]
We propose a privileged-information-based Twin Pinball Support Vector Machine classifier (Pin-TWSVMPI).
The proposed Pin-TWSVMPI incorporates privileged information by using a correcting function so as to obtain two nonparallel decision hyperplanes.
For UCI datasets, we first implement a procedure which extracts privileged information from the features of the dataset which are then further utilized by Pin-TWSVMPI.
arXiv Detail & Related papers (2021-07-14T14:42:07Z) - Evaluation of a Region Proposal Architecture for Multi-task Document
Layout Analysis [0.685316573653194]
Mask-RCNN architecture is designed to address the problem of baseline detection and region segmentation.
We present experimental results on two handwritten text datasets and one handwritten music dataset.
The analyzed architecture yields promising results, outperforming state-of-the-art techniques in all three datasets.
arXiv Detail & Related papers (2021-06-22T14:07:27Z) - Knowledge Graph Anchored Information-Extraction for Domain-Specific
Insights [1.6308268213252761]
We use a task-based approach for fulfilling specific information needs within a new domain.
A pipeline constructed of state-of-the-art NLP technologies is used to automatically extract an instance-level semantic structure.
arXiv Detail & Related papers (2021-04-18T19:28:10Z) - TRIE: End-to-End Text Reading and Information Extraction for Document
Understanding [56.1416883796342]
We propose a unified end-to-end text reading and information extraction network.
Multimodal visual and textual features of text reading are fused for information extraction.
Our proposed method significantly outperforms the state-of-the-art methods in both efficiency and accuracy.
arXiv Detail & Related papers (2020-05-27T01:47:26Z) - Privileged Information Dropout in Reinforcement Learning [56.82218103971113]
Using privileged information during training can improve the sample efficiency and performance of machine learning systems.
In this work, we investigate Privileged Information Dropout (pid) for achieving the latter which can be applied equally to value-based and policy-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-05-19T05:32:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.