NCVis: Noise Contrastive Approach for Scalable Visualization
- URL: http://arxiv.org/abs/2001.11411v1
- Date: Thu, 30 Jan 2020 15:43:50 GMT
- Title: NCVis: Noise Contrastive Approach for Scalable Visualization
- Authors: Aleksandr Artemenkov and Maxim Panov
- Abstract summary: NCVis is a high-performance dimensionality reduction method built on a sound statistical basis of noise contrastive estimation.
We show that NCVis outperforms state-of-the-art techniques in terms of speed while preserving the representation quality of other methods.
- Score: 79.44177623781043
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Modern methods for data visualization via dimensionality reduction, such as
t-SNE, usually have performance issues that prohibit their application to large
amounts of high-dimensional data. In this work, we propose NCVis -- a
high-performance dimensionality reduction method built on a sound statistical
basis of noise contrastive estimation. We show that NCVis outperforms
state-of-the-art techniques in terms of speed while preserving the
representation quality of other methods. In particular, the proposed approach
successfully processes a large dataset of more than 1 million news headlines in
several minutes and presents the underlying structure in a human-readable way.
Moreover, it provides results consistent with classical methods like t-SNE on
more straightforward datasets like images of hand-written digits. We believe
that the broader usage of such software can significantly simplify
large-scale data analysis and lower the entry barrier to this area.
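For intuition, the noise-contrastive objective that the abstract refers to can be illustrated as follows: for every observed neighbor pair, the embedding is trained to discriminate that pair from randomly drawn "noise" pairs, with the normalizing constant treated as a free parameter. The sketch below is only a conceptual illustration under generic assumptions (a Gaussian similarity kernel, a flat noise distribution, and the function name are ours), not the authors' NCVis implementation.

```python
import numpy as np

def nce_pair_loss(y_i, y_j, y_noise, log_Z, nu=5.0):
    """Noise-contrastive loss for one observed neighbor pair in the embedding.

    y_i, y_j : (d,) low-dimensional coordinates of a pair that should stay close
    y_noise  : (k, d) coordinates of k randomly sampled "noise" points
    log_Z    : free parameter standing in for the log normalizing constant
    nu       : noise-to-data ratio (k noise pairs per observed pair)

    A Gaussian similarity kernel is assumed; the actual NCVis kernel and
    sampling scheme may differ.
    """
    def log_sim(a, b):
        # Unnormalized log-similarity of embedded points, shifted by log_Z.
        return -np.sum((a - b) ** 2, axis=-1) - log_Z

    # The observed pair should be classified as coming from the data.
    s_pos = log_sim(y_i, y_j)
    loss = np.log1p(nu * np.exp(-s_pos))

    # Noise pairs should be classified as coming from the noise distribution.
    s_neg = log_sim(y_i, y_noise)
    loss += np.sum(np.log1p(np.exp(s_neg) / nu))
    return loss
```

Minimizing such a loss with stochastic gradient descent over many sampled neighbor and noise pairs never requires computing the partition function explicitly, which is what makes this family of estimators attractive for large datasets.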
Related papers
- Inferring Neural Signed Distance Functions by Overfitting on Single Noisy Point Clouds through Finetuning Data-Driven based Priors [53.6277160912059]
We propose a method that combines the advantages of data-driven and overfitting-based methods for better generalization, faster inference, and higher accuracy in learning neural SDFs.
We introduce a novel statistical reasoning algorithm in local regions which is able to fine-tune data-driven priors without signed distance supervision, clean point clouds, or point normals.
arXiv Detail & Related papers (2024-10-25T16:48:44Z) - Noisy Data Visualization using Functional Data Analysis [14.255424476694946]
We propose a new data visualization method called Functional Information Geometry (FIG) for dynamical processes.
We experimentally demonstrate that the resulting method outperforms a variant of EIG designed for visualization.
We then use our method to visualize EEG brain measurements of sleep activity.
arXiv Detail & Related papers (2024-06-05T15:53:25Z) - Enhancing Representation Learning on High-Dimensional, Small-Size
Tabular Data: A Divide and Conquer Method with Ensembled VAEs [7.923088041693465]
We present an ensemble of lightweight VAEs to learn posteriors over subsets of the feature space, which are aggregated into a joint posterior in a novel divide-and-conquer approach (an illustrative aggregation rule is sketched after this list).
We show that our approach is robust to partial features at inference, exhibiting little performance degradation even with most features missing.
arXiv Detail & Related papers (2023-06-27T17:55:31Z) - Laplacian-based Cluster-Contractive t-SNE for High Dimensional Data
Visualization [20.43471678277403]
We propose LaptSNE, a new graph-based dimensionality reduction method based on t-SNE.
Specifically, LaptSNE leverages the eigenvalue information of the graph Laplacian to shrink the potential clusters in the low-dimensional embedding.
We show how to calculate the gradient analytically, which may be of broad interest when considering optimization with Laplacian-composited objective.
arXiv Detail & Related papers (2022-07-25T14:10:24Z) - Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity regularized models and apply it on shared-memory and distributed-memory architecture respectively.
We prove that the proposed method achieves a linear convergence rate with lower overall complexity and can eliminate almost all inactive features in a finite number of iterations almost surely.
arXiv Detail & Related papers (2022-04-23T02:45:55Z) - Hierarchical Nearest Neighbor Graph Embedding for Efficient
Dimensionality Reduction [25.67957712837716]
We introduce a novel method based on a hierarchy built on 1-nearest neighbor graphs in the original space.
The proposal is an optimization-free projection that is competitive with the latest versions of t-SNE and UMAP.
In the paper, we argue for the soundness of the proposed method and evaluate it on a diverse collection of datasets with sizes ranging from 1K to 11M samples and dimensions from 28 to 16K.
arXiv Detail & Related papers (2022-03-24T11:41:16Z) - MANet: Improving Video Denoising with a Multi-Alignment Network [72.93429911044903]
We present a multi-alignment network, which generates multiple flow proposals followed by attention-based averaging.
Experiments on a large-scale video dataset demonstrate that our method improves the denoising baseline model by 0.2 dB.
arXiv Detail & Related papers (2022-02-20T00:52:07Z) - Scalable semi-supervised dimensionality reduction with GPU-accelerated
EmbedSOM [0.0]
BlosSOM is a high-performance semi-supervised dimensionality reduction software for interactive user-steerable visualization of high-dimensional datasets.
We show the application of BlosSOM on realistic datasets, where it helps to produce high-quality visualizations that incorporate user-specified layout and focus on certain features.
arXiv Detail & Related papers (2022-01-03T15:06:22Z) - Revisiting Point Cloud Simplification: A Learnable Feature Preserving
Approach [57.67932970472768]
Mesh and Point Cloud simplification methods aim to reduce the complexity of 3D models while retaining visual quality and relevant salient features.
We propose a fast point cloud simplification method by learning to sample salient points.
The proposed method relies on a graph neural network architecture trained to select an arbitrary, user-defined, number of points from the input space and to re-arrange their positions so as to minimize the visual perception error.
arXiv Detail & Related papers (2021-09-30T10:23:55Z) - Visualising Deep Network's Time-Series Representations [93.73198973454944]
Despite the popularisation of machine learning models, they often still operate as black boxes with no insight into what is happening inside the model.
In this paper, a method addressing this issue is proposed, with a focus on visualising multi-dimensional time-series data.
Experiments on a high-frequency stock market dataset show that the method provides fast and discernible visualisations.
arXiv Detail & Related papers (2021-03-12T09:53:34Z)
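The divide-and-conquer VAE entry above aggregates diagonal-Gaussian posteriors learned on feature subsets into a joint posterior. As a purely illustrative sketch (the paper's exact aggregation rule is not given here), one standard way to combine such Gaussians is a product-of-experts, precision-weighted rule:

```python
import numpy as np

def aggregate_gaussian_posteriors(mus, sigmas):
    """Combine per-subset diagonal-Gaussian posteriors q_k(z) = N(mu_k, sigma_k^2)
    into a single Gaussian via a product-of-experts (precision-weighted) rule.

    mus, sigmas : arrays of shape (n_experts, latent_dim)
    Returns the mean and standard deviation of the aggregated Gaussian.
    This is a generic illustration, not necessarily the paper's aggregation.
    """
    precisions = 1.0 / (sigmas ** 2)                    # per-expert precisions
    agg_var = 1.0 / precisions.sum(axis=0)              # combined variance
    agg_mu = agg_var * (precisions * mus).sum(axis=0)   # precision-weighted mean
    return agg_mu, np.sqrt(agg_var)
```

Experts with tighter (higher-precision) posteriors dominate the combined estimate, which is why such a rule degrades gracefully when some feature subsets are missing at inference time.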
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this content (including all information) and is not responsible for any consequences.