VizExtract: Automatic Relation Extraction from Data Visualizations
- URL: http://arxiv.org/abs/2112.03485v1
- Date: Tue, 7 Dec 2021 04:27:08 GMT
- Authors: Dale Decatur, Sanjay Krishnan
- Abstract summary: This paper presents a framework for automatically extracting compared variables from statistical charts.
We leverage a computer vision based framework to automatically identify and localize visualization facets in line graphs, scatter plots, or bar graphs.
In controlled experiments, our framework is able to classify, with 87.5% accuracy, the correlation between variables for graphs with 1-3 series per graph, varying colors, and solid line styles.
- Score: 7.2241069295727955
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Visual graphics, such as plots, charts, and figures, are widely used to
communicate statistical conclusions. Extracting information directly from such
visualizations is a key sub-problem for effective search through scientific
corpora, fact-checking, and data extraction. This paper presents a framework
for automatically extracting compared variables from statistical charts. Due to
the diversity and variation of charting styles, libraries, and tools, we
leverage a computer-vision-based framework to automatically identify and
localize visualization facets in line graphs, scatter plots, or bar graphs,
each of which can include multiple series. The framework is trained on a large
synthetically generated corpus of matplotlib charts and we evaluate the trained
model on other chart datasets. In controlled experiments, our framework is able
to classify, with 87.5% accuracy, the correlation between variables for graphs
with 1-3 series per graph, varying colors, and solid line styles. When deployed
on real-world graphs scraped from the internet, it achieves 72.8% accuracy
(81.2% accuracy when excluding "hard" graphs). When deployed on the FigureQA
dataset, it achieves 84.7% accuracy.
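The abstract describes training on synthetically generated charts with 1-3 series and classifying the correlation between variables. The following is a minimal sketch of how labeled synthetic series could be produced before rendering. It is not the authors' pipeline: the function names, the three label classes ("positive", "negative", "neutral"), the 0.4 threshold, and the noise level are all illustrative assumptions, and the rendering step with matplotlib is omitted.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant series has no defined correlation
    return cov / math.sqrt(var_x * var_y)


def label_correlation(r, threshold=0.4):
    """Map a coefficient to a class label (threshold is an assumption)."""
    if r >= threshold:
        return "positive"
    if r <= -threshold:
        return "negative"
    return "neutral"


def make_series(slope, noise, n_points, rng):
    """Noisy linear series on x in [0, 1]."""
    xs = [i / (n_points - 1) for i in range(n_points)]
    ys = [slope * x + rng.gauss(0.0, noise) for x in xs]
    return xs, ys


def make_chart_spec(rng, max_series=3):
    """Hypothetical chart description: 1..max_series labeled series."""
    series = []
    for _ in range(rng.randint(1, max_series)):
        # Direction is -1, 0, or +1; magnitude is drawn from [0.5, 2.0].
        slope = rng.choice([-1.0, 0.0, 1.0]) * rng.uniform(0.5, 2.0)
        xs, ys = make_series(slope, noise=0.05, n_points=50, rng=rng)
        label = label_correlation(pearson(xs, ys))
        series.append({"x": xs, "y": ys, "label": label})
    return series


if __name__ == "__main__":
    chart = make_chart_spec(random.Random(0))
    print([s["label"] for s in chart])
```

Each series in the returned list could then be drawn with a plotting library and the image paired with its label as one training example. Deriving the label from the measured coefficient rather than from the sampled slope keeps the label consistent with what is actually plotted.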
Related papers
- RAGraph: A General Retrieval-Augmented Graph Learning Framework [35.25522856244149]
We introduce a novel framework called General Retrieval-Augmented Graph Learning (RAGraph).
RAGraph brings external graph data into the general graph foundation model to improve model generalization on unseen scenarios.
During inference, the RAGraph adeptly retrieves similar toy graphs based on key similarities in downstream tasks.
(2024-10-31T12:05:21Z)
- Parametric Graph Representations in the Era of Foundation Models: A Survey and Position [69.48708136448694]
Graphs have been widely used in the past decades of big data and AI to model comprehensive relational data.
Identifying meaningful graph laws can significantly enhance the effectiveness of various applications.
(2024-10-16T00:01:31Z)
- PlasmoData.jl -- A Julia Framework for Modeling and Analyzing Complex Data as Graphs [0.0]
We present PlasmoData.jl, an open-source, Julia framework that uses concepts of graph theory to facilitate the modeling and analysis of complex datasets.
The core of our framework is a general data modeling abstraction, which we call a DataGraph.
We show how the abstraction and software implementation can be used to represent diverse data objects as graphs.
(2024-01-21T05:04:38Z)
- Graph Out-of-Distribution Generalization with Controllable Data Augmentation [51.17476258673232]
Graph Neural Network (GNN) has demonstrated extraordinary performance in classifying graph properties.
Due to the selection bias of training and testing data, distribution deviation is widespread.
We propose OOD calibration to measure the distribution deviation of virtual samples.
(2023-08-16T13:10:27Z)
- GenPlot: Increasing the Scale and Diversity of Chart Derendering Data [0.0]
We propose GenPlot, a plot generator that can generate billions of additional plots for chart-derendering using synthetic data.
OCR-free chart-to-text translation has achieved state-of-the-art results on visual language tasks.
(2023-06-20T17:25:53Z)
- Bures-Wasserstein Means of Graphs [60.42414991820453]
We propose a novel framework for defining a graph mean via embeddings in the space of smooth graph signal distributions.
By finding a mean in this embedding space, we can recover a mean graph that preserves structural information.
We establish the existence and uniqueness of the novel graph mean, and provide an iterative algorithm for computing it.
(2023-05-31T11:04:53Z)
- CGMN: A Contrastive Graph Matching Network for Self-Supervised Graph Similarity Learning [65.1042892570989]
We propose a contrastive graph matching network (CGMN) for self-supervised graph similarity learning.
We employ two strategies, namely cross-view interaction and cross-graph interaction, for effective node representation learning.
We transform node representations into graph-level representations via pooling operations for graph similarity computation.
(2022-05-30T13:20:26Z)
- Graph Contrastive Learning Automated [94.41860307845812]
Graph contrastive learning (GraphCL) has emerged with promising representation learning performance.
The effectiveness of GraphCL hinges on ad-hoc data augmentations, which have to be manually picked per dataset.
This paper proposes a unified bi-level optimization framework to automatically, adaptively and dynamically select data augmentations when performing GraphCL on specific graph data.
(2021-06-10T16:35:27Z)
- Tensor Fields for Data Extraction from Chart Images: Bar Charts and Scatter Plots [0.0]
Automated chart reading involves data extraction and contextual understanding of the data from chart images.
We identify an appropriate tensor field as the model and propose a methodology for the use of its degenerate point extraction for data extraction from chart images.
Our results show that tensor voting is effective for data extraction from bar charts and scatter plots, and histograms, as a special case of bar charts.
(2020-10-05T20:19:40Z)
- Multilevel Graph Matching Networks for Deep Graph Similarity Learning [79.3213351477689]
We propose a multi-level graph matching network (MGMN) framework for computing the graph similarity between any pair of graph-structured objects.
To compensate for the lack of standard benchmark datasets, we have created and collected a set of datasets for both the graph-graph classification and graph-graph regression tasks.
Comprehensive experiments demonstrate that MGMN consistently outperforms state-of-the-art baseline models on both the graph-graph classification and graph-graph regression tasks.
(2020-07-08T19:48:19Z)
- Graph Partitioning and Graph Neural Network based Hierarchical Graph Matching for Graph Similarity Computation [5.710312846460821]
Graph similarity aims to predict a similarity score between one pair of graphs to facilitate downstream applications.
We propose a graph partitioning and graph neural network-based model, called PSimGNN, to effectively resolve this issue.
PSimGNN outperforms state-of-the-art methods in graph similarity computation tasks using approximate Graph Edit Distance (GED) as the graph similarity metric.
(2020-05-16T15:01:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.