Genetic Programming for Evolving a Front of Interpretable Models for
Data Visualisation
- URL: http://arxiv.org/abs/2001.09578v1
- Date: Mon, 27 Jan 2020 04:03:19 GMT
- Title: Genetic Programming for Evolving a Front of Interpretable Models for
Data Visualisation
- Authors: Andrew Lensen, Bing Xue, Mengjie Zhang
- Abstract summary: We propose a genetic programming approach named GP-tSNE for evolving interpretable mappings from a dataset to high-quality visualisations.
A multi-objective approach is designed that produces a variety of visualisations in a single run which give different trade-offs between visual quality and model complexity.
- Score: 4.4181317696554325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data visualisation is a key tool in data mining for understanding big
datasets. Many visualisation methods have been proposed, including the
well-regarded state-of-the-art method t-Distributed Stochastic Neighbour
Embedding. However, the most powerful visualisation methods have a significant
limitation: the manner in which they create their visualisation from the
original features of the dataset is completely opaque. Many domains require an
understanding of the data in terms of the original features; there is hence a
need for powerful visualisation methods which use understandable models. In
this work, we propose a genetic programming approach named GP-tSNE for evolving
interpretable mappings from a dataset to high-quality visualisations. A
multi-objective approach is designed that produces a variety of visualisations
in a single run which give different trade-offs between visual quality and
model complexity. Testing against baseline methods on a variety of datasets
shows the clear potential of GP-tSNE to allow deeper insight into data than
that provided by existing visualisation methods. We further highlight the
benefits of a multi-objective approach through an in-depth analysis of a
candidate front, which shows how multiple models can
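The front idea described in the abstract, a set of models trading visual quality against model complexity, can be illustrated with a minimal sketch. The model names, quality scores, and complexity values below are invented for illustration and are not from the paper; only the non-dominated-front concept is taken from the text:

```python
# Hypothetical sketch of a Pareto front over (quality, complexity) trade-offs,
# as produced by a multi-objective run. Quality is maximised; complexity
# (e.g. evolved-tree size) is minimised.

def pareto_front(models):
    """Return the models not dominated by any other model.

    models: list of (name, quality, complexity) tuples. One model dominates
    another if it is at least as good on both objectives and strictly
    better on at least one.
    """
    front = []
    for name, q, c in models:
        dominated = any(
            (q2 >= q and c2 <= c) and (q2 > q or c2 < c)
            for _, q2, c2 in models
        )
        if not dominated:
            front.append((name, q, c))
    return front

# Invented candidates: each offers a different quality/complexity balance.
candidates = [
    ("tiny-tree", 0.60, 5),    # low quality, very interpretable
    ("mid-tree", 0.80, 20),    # balanced trade-off
    ("big-tree", 0.92, 120),   # high quality, complex
    ("wasteful", 0.75, 60),    # dominated by mid-tree
]
print(pareto_front(candidates))
```

A practitioner would then pick a model from the front whose complexity they can still interpret, rather than accepting a single opaque mapping.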
Related papers
- Interactive dense pixel visualizations for time series and model attribution explanations [8.24039921933289]
DAVOTS is an interactive visual analytics approach to explore raw time series data, activations of neural networks, and attributions in a dense-pixel visualization.
We apply clustering approaches to the visualized data domains to highlight groups and present ordering strategies for individual and combined data exploration.
arXiv Detail & Related papers (2024-08-27T14:02:21Z)
- A Simple Background Augmentation Method for Object Detection with Diffusion Model [53.32935683257045]
In computer vision, it is well-known that a lack of data diversity will impair model performance.
We propose a simple yet effective data augmentation approach by leveraging advancements in generative models.
Background augmentation, in particular, significantly improves the models' robustness and generalization capabilities.
arXiv Detail & Related papers (2024-08-01T07:40:00Z)
- Diffusion Models as Data Mining Tools [87.77999285241219]
This paper demonstrates how to use generative models trained for image synthesis as tools for visual data mining.
We show that after finetuning conditional diffusion models to synthesize images from a specific dataset, we can use these models to define a typicality measure.
This measure assesses how typical visual elements are for different data labels, such as geographic location, time stamps, semantic labels, or even the presence of a disease.
arXiv Detail & Related papers (2024-07-20T17:14:31Z)
- Multi-modal Auto-regressive Modeling via Visual Words [96.25078866446053]
We propose the concept of visual tokens, which map the visual features to probability distributions over Large Multi-modal Models' vocabulary.
We further explore the distribution of visual features in the semantic space within LMM and the possibility of using text embeddings to represent visual information.
arXiv Detail & Related papers (2024-03-12T14:58:52Z)
- StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data [129.92449761766025]
We propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning.
This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models.
Our research includes comprehensive experiments conducted on various datasets.
arXiv Detail & Related papers (2023-08-20T12:43:52Z)
- Multi-Objective Genetic Algorithm for Multi-View Feature Selection [0.23343923880060582]
We propose a novel genetic algorithm strategy to overcome limitations of traditional feature selection methods for multi-view data.
Our proposed approach, called the multi-view multi-objective feature selection genetic algorithm (MMFS-GA), simultaneously selects the optimal subset of features within a view and between views.
The results of our evaluations on three benchmark datasets, including synthetic and real data, show improvement over the best baseline methods.
arXiv Detail & Related papers (2023-05-26T13:25:20Z)
- A Spectral Method for Assessing and Combining Multiple Data Visualizations [13.193958370464683]
We propose an efficient spectral method for assessing and combining multiple visualizations of a given dataset.
The proposed method provides a quantitative measure -- the visualization eigenscore -- of the relative performance of the visualizations for preserving the structure around each data point.
We analyze multiple simulated and real-world datasets to demonstrate the effectiveness of the eigenscores for evaluating visualizations and the superiority of the proposed consensus visualization.
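The property the eigenscore quantifies, how well a visualisation preserves the structure around each data point, can be illustrated with a much simpler stand-in. The k-NN overlap measure below is an illustrative assumption, not the paper's spectral eigenscore; the toy data is invented:

```python
import numpy as np

# Illustrative sketch only (not the paper's eigenscore): a k-NN overlap
# score for how well an embedding preserves each point's local structure.

def knn_sets(points, k):
    """Index sets of each point's k nearest neighbours (itself excluded)."""
    d = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    return [set(np.argsort(row)[:k]) for row in d]

def preservation_score(original, embedding, k=2):
    """Mean Jaccard overlap between k-NN sets in the original space and in
    the embedding; 1.0 means local neighbourhoods are fully preserved."""
    pairs = zip(knn_sets(original, k), knn_sets(embedding, k))
    return float(np.mean([len(a & b) / len(a | b) for a, b in pairs]))

# Toy data: two well-separated 1-D groups.
X = np.array([[0.0], [1.0], [2.0], [10.0], [11.0], [12.0]])
good = 2.0 * X  # uniform scaling keeps every neighbourhood intact
bad = np.array([[0.0], [10.0], [1.0], [11.0], [2.0], [12.0]])  # interleaved
print(preservation_score(X, good), preservation_score(X, bad))
```

Scoring several candidate embeddings this way, point by point, is the same comparison the eigenscore makes with spectral machinery before combining the visualizations.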
arXiv Detail & Related papers (2022-10-25T02:13:19Z)
- Auto-weighted Multi-view Feature Selection with Graph Optimization [90.26124046530319]
We propose a novel unsupervised multi-view feature selection model based on graph learning.
The contributions are threefold: (1) during the feature selection procedure, the consensus similarity graph shared by different views is learned.
Experiments on various datasets demonstrate the superiority of the proposed method compared with the state-of-the-art methods.
arXiv Detail & Related papers (2021-04-11T03:25:25Z)
- Multi-view Data Visualisation via Manifold Learning [0.03222802562733786]
This manuscript proposes extensions of Student's t-distributed SNE, LLE and ISOMAP, to allow for dimensionality reduction and visualisation of multi-view data.
We show that by incorporating the low-dimensional embeddings obtained via the multi-view manifold learning approaches into the K-means algorithm, clusters of the samples are accurately identified.
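The pipeline sketched in this entry, low-dimensional per-view embeddings fed into K-means, can be illustrated as follows. The toy embeddings, the concatenation step, and the naive initialisation are illustrative assumptions, not the paper's exact procedure:

```python
import numpy as np

# Hedged sketch: combine low-dimensional embeddings from several views,
# then cluster with a minimal Lloyd's-algorithm K-means.

def kmeans(X, k, iters=20):
    """Minimal Lloyd's algorithm with a naive deterministic init."""
    centers = X[:k].copy()
    labels = np.zeros(len(X), dtype=int)
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = X[labels == j].mean(axis=0)
    return labels

# Two toy "views": each a 2-D embedding of the same 6 samples.
view1 = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
                  [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
view2 = np.array([[1.0, 1.0], [1.1, 1.0], [1.0, 1.1],
                  [9.0, 9.0], [9.1, 9.0], [9.0, 9.1]])
joint = np.hstack([view1, view2])  # combine the views before clustering
labels = kmeans(joint, k=2)
print(labels)  # samples 0-2 and 3-5 fall into the two clusters
```

Because the views agree on the group structure, clustering the combined embedding recovers the two sample groups; the paper's contribution is producing such embeddings via multi-view extensions of t-SNE, LLE, and ISOMAP.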
arXiv Detail & Related papers (2021-01-17T19:54:36Z)
- Generative Partial Multi-View Clustering [133.36721417531734]
We propose a generative partial multi-view clustering model, named as GP-MVC, to address the incomplete multi-view problem.
First, multi-view encoder networks are trained to learn common low-dimensional representations, followed by a clustering layer to capture the consistent cluster structure across multiple views.
Second, view-specific generative adversarial networks are developed to generate the missing data of one view conditioning on the shared representation given by other views.
arXiv Detail & Related papers (2020-03-29T17:48:27Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.