Towards an efficient framework for Data Extraction from Chart Images
- URL: http://arxiv.org/abs/2105.02039v1
- Date: Wed, 5 May 2021 13:18:53 GMT
- Title: Towards an efficient framework for Data Extraction from Chart Images
- Authors: Weihong Ma, Hesuo Zhang, Shuang Yan, Guangshun Yao, Yichao Huang, Hui Li, Yaqiang Wu, Lianwen Jin
- Abstract summary: We adopt state-of-the-art computer vision techniques for the data extraction stage in a data mining system.
To build a robust point detector, a fully convolutional network with a feature fusion module is adopted.
For data conversion, we translate the detected element into data with semantic value.
- Score: 27.114170963444074
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we fill the research gap by adopting state-of-the-art computer
vision techniques for the data extraction stage in a data mining system. As
shown in Fig.1, this stage contains two subtasks, namely, plot element
detection and data conversion. To build a robust box detector, we
comprehensively compare different deep learning-based methods and identify a
suitable method that detects boxes with high precision. To build a robust point
detector, we adopt a fully convolutional network with a feature fusion module,
which distinguishes closely spaced points better than traditional methods. The
proposed system can effectively handle various chart data without making
heuristic assumptions. For data conversion, we translate the detected element
into data with semantic value. A network is proposed to measure feature
similarities between legends and detected elements in the legend matching
phase. Furthermore, we provide a baseline for the competition on Harvesting raw
tables from Infographics. We also identify key factors that improve the
performance of each stage. Experimental results demonstrate the effectiveness
of the proposed system.
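The two data conversion ideas named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation. It assumes a linear axis calibrated from two tick marks, and it uses plain cosine similarity as a stand-in for the learned similarity network. The function names (`pixel_to_value`, `match_legend`) and the toy feature vectors are hypothetical.

```python
import math


def pixel_to_value(pixel, tick_a, tick_b):
    """Map a pixel coordinate to a data value, assuming a linear axis.

    tick_a and tick_b are (pixel_position, labeled_value) pairs taken
    from two detected axis ticks (hypothetical inputs).
    """
    (p0, v0), (p1, v1) = tick_a, tick_b
    if p0 == p1:
        raise ValueError("ticks must lie at distinct pixel positions")
    return v0 + (pixel - p0) * (v1 - v0) / (p1 - p0)


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def match_legend(element_feature, legend_features):
    """Return the legend name whose feature is most similar to the element.

    legend_features maps a legend name to its feature vector. Cosine
    similarity is an assumption here, standing in for the learned network.
    """
    return max(
        legend_features,
        key=lambda name: cosine_similarity(element_feature, legend_features[name]),
    )


# A y-axis with tick "0" at pixel row 400 and tick "30" at pixel row 100.
value = pixel_to_value(250, (400, 0.0), (100, 30.0))

# Toy 2-D features for two legend entries and one detected element.
legend = match_legend([0.9, 0.1], {"series A": [1.0, 0.0], "series B": [0.0, 1.0]})
```

In this toy case the point at pixel row 250 lies halfway between the two ticks, so it converts to 15.0, and the element feature is closest to "series A". Logarithmic axes or rotated tick labels would need a different calibration step.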
Related papers
- Leveraging Mixture of Experts for Improved Speech Deepfake Detection [53.69740463004446]
Speech deepfakes pose a significant threat to personal security and content authenticity.
We introduce a novel approach for enhancing speech deepfake detection performance using a Mixture of Experts architecture.
arXiv Detail & Related papers (2024-09-24T13:24:03Z)
- Efficient Segmentation with Texture in Ore Images Based on Box-supervised Approach [6.6773975364173]
A box-supervised technique with texture features is proposed to identify complete and independent ores.
The proposed method achieves over 50 frames per second with a small model size of 21.6 MB.
The method maintains a high level of accuracy compared with the state-of-the-art approaches on ore image dataset.
arXiv Detail & Related papers (2023-11-10T08:28:22Z)
- DiffusionEngine: Diffusion Model is Scalable Data Engine for Object Detection [41.436817746749384]
Diffusion Model is a scalable data engine for object detection.
DiffusionEngine (DE) provides high-quality detection-oriented training pairs in a single stage.
arXiv Detail & Related papers (2023-09-07T17:55:01Z)
- Knowledge Distillation for Object Detection: from generic to remote sensing datasets [7.872075562968697]
We evaluate various off-the-shelf object knowledge distillation methods that were originally developed on generic computer vision datasets.
In particular, methods covering both logit and feature imitation approaches are applied to vehicle detection using well-known benchmarks such as the xView and VEDAI datasets.
arXiv Detail & Related papers (2023-07-18T13:49:00Z)
- Detection Hub: Unifying Object Detection Datasets via Query Adaptation on Language Embedding [137.3719377780593]
A new design (named Detection Hub) is dataset-aware and category-aligned.
It mitigates the dataset inconsistency and provides coherent guidance for the detector to learn across multiple datasets.
The categories across datasets are semantically aligned into a unified space by replacing one-hot category representations with word embedding.
arXiv Detail & Related papers (2022-06-07T17:59:44Z)
- Real-Time Scene Text Detection with Differentiable Binarization and Adaptive Scale Fusion [62.269219152425556]
Segmentation-based scene text detection methods have drawn extensive attention in the scene text detection field.
We propose a Differentiable Binarization (DB) module that integrates the binarization process into a segmentation network.
An efficient Adaptive Scale Fusion (ASF) module is proposed to improve the scale robustness by fusing features of different scales adaptively.
arXiv Detail & Related papers (2022-02-21T15:30:14Z)
- Weakly Supervised Change Detection Using Guided Anisotropic Difusion [97.43170678509478]
We propose original ideas that help us to leverage such datasets in the context of change detection.
First, we propose the guided anisotropic diffusion (GAD) algorithm, which improves semantic segmentation results.
We then show its potential in two weakly-supervised learning strategies tailored for change detection.
arXiv Detail & Related papers (2021-12-31T10:03:47Z)
- Pretrained equivariant features improve unsupervised landmark discovery [69.02115180674885]
We formulate a two-step unsupervised approach that overcomes this challenge by first learning powerful pixel-based features.
Our method produces state-of-the-art results in several challenging landmark detection datasets.
arXiv Detail & Related papers (2021-04-07T05:42:11Z)
- Ensembling object detectors for image and video data analysis [98.26061123111647]
We propose a method for ensembling the outputs of multiple object detectors for improving detection performance and precision of bounding boxes on image data.
We extend it to video data by proposing a two-stage tracking-based scheme for detection refinement.
arXiv Detail & Related papers (2021-02-09T12:38:16Z)
- Efficient and accurate object detection with simultaneous classification and tracking [1.4620086904601473]
We propose a detection framework based on simultaneous classification and tracking in the point stream.
In this framework, a tracker performs data association in sequences of the point cloud, guiding the detector to avoid redundant processing.
Experiments were conducted on the benchmark dataset, and the results showed that the proposed method outperforms original tracking-by-detection approaches.
arXiv Detail & Related papers (2020-07-04T10:22:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.