Related papers: DiRe-JAX: A JAX based Dimensionality Reduction Algorithm for Large-scale Data

URL: http://arxiv.org/abs/2503.03156v2
Date: Thu, 06 Mar 2025 04:40:27 GMT
Title: DiRe-JAX: A JAX based Dimensionality Reduction Algorithm for Large-scale Data
Authors: Alexander Kolpakov, Igor Rivin
Abstract summary: DiRe is a new dimensionality reduction toolkit designed to address some of the challenges faced by traditional methods like UMAP and tSNE. The toolkit shows considerable promise in preserving both local and global structures within the data as compared to state-of-the-art UMAP and tSNE implementations.
Score: 49.84018914962972
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: DiRe-JAX is a new dimensionality reduction toolkit designed to address some of the challenges faced by traditional methods like UMAP and tSNE such as loss of global structure and computational efficiency. Built on the JAX framework, DiRe leverages modern hardware acceleration to provide an efficient, scalable, and interpretable solution for visualizing complex data structures, and for quantitative analysis of lower-dimensional embeddings. The toolkit shows considerable promise in preserving both local and global structures within the data as compared to state-of-the-art UMAP and tSNE implementations. This makes it suitable for a wide range of applications in machine learning, bio-informatics, and data science.

Related papers

Scaling Linear Attention with Sparse State Expansion [58.161410995744596]
Transformer architecture struggles with long-context scenarios due to quadratic computation and linear memory growth. We introduce a row-sparse update formulation for linear attention by conceptualizing state updating as information classification. Second, we present Sparse State Expansion (SSE) within the sparse framework, which expands the contextual state into multiple partitions.
arXiv Detail & Related papers (2025-07-22T13:27:31Z)
InTreeger: An End-to-End Framework for Integer-Only Decision Tree Inference [1.2495506469683937]
InTreeger is an end-to-end framework that takes a training dataset as input and outputs an architecture-agnostic, integer-only C implementation of a tree-based machine learning model. This framework enables anyone, even those without prior experience in machine learning, to generate a highly optimized integer-only classification model.
arXiv Detail & Related papers (2025-05-21T11:28:43Z)
ZeroLM: Data-Free Transformer Architecture Search for Language Models [54.83882149157548]
Current automated proxy discovery approaches suffer from extended search times, susceptibility to data overfitting, and structural complexity. This paper introduces a novel zero-cost proxy methodology that quantifies model capacity through efficient weight statistics. Our evaluation demonstrates the superiority of this approach, achieving a Spearman's rho of 0.76 and Kendall's tau of 0.53 on the FlexiBERT benchmark.
arXiv Detail & Related papers (2025-03-24T13:11:22Z)
Building Interactable Replicas of Complex Articulated Objects via Gaussian Splatting [66.29782808719301]
Building articulated objects is a key challenge in computer vision. Existing methods often fail to effectively integrate information across different object states. We introduce ArtGS, a novel approach that leverages 3D Gaussians as a flexible and efficient representation.
arXiv Detail & Related papers (2025-02-26T10:25:32Z)
Task-Oriented Real-time Visual Inference for IoVT Systems: A Co-design Framework of Neural Networks and Edge Deployment [61.20689382879937]
Task-oriented edge computing addresses this by shifting data analysis to the edge. Existing methods struggle to balance high model performance with low resource consumption. We propose a novel co-design framework to optimize neural network architecture.
arXiv Detail & Related papers (2024-10-29T19:02:54Z)
Scalable Geometric Fracture Assembly via Co-creation Space among Assemblers [24.89380678499307]
We develop a scalable framework for geometric fracture assembly without relying on semantic information. We introduce a novel loss function, i.e., the geometric-based collision loss, to address collision issues during the fracture assembly process. Our framework exhibits better performance on both PartNet and Breaking Bad datasets compared to existing state-of-the-art frameworks.
arXiv Detail & Related papers (2023-12-19T17:13:51Z)
A survey on efficient vision transformers: algorithms, techniques, and performance benchmarking [19.65897437342896]
Vision Transformer (ViT) architectures are becoming increasingly popular and widely employed to tackle computer vision applications. This paper mathematically defines the strategies used to make Vision Transformer efficient, describes and discusses state-of-the-art methodologies, and analyzes their performances over different application scenarios.
arXiv Detail & Related papers (2023-09-05T08:21:16Z)
Efficient Multi-View Graph Clustering with Local and Global Structure Preservation [59.49018175496533]
We propose a novel anchor-based multi-view graph clustering framework termed Efficient Multi-View Graph Clustering with Local and Global Structure Preservation (EMVGC-LG). Specifically, EMVGC-LG jointly optimizes anchor construction and graph learning to enhance the clustering quality. In addition, EMVGC-LG inherits the linear complexity of existing AMVGC methods with respect to the sample number.
arXiv Detail & Related papers (2023-08-31T12:12:30Z)
Advancing Reacting Flow Simulations with Data-Driven Models [50.9598607067535]
Key to effective use of machine learning tools in multi-physics problems is to couple them to physical and computer models. The present chapter reviews some of the open opportunities for the application of data-driven reduced-order modeling of combustion systems.
arXiv Detail & Related papers (2022-09-05T16:48:34Z)
Towards a comprehensive visualization of structure in data [0.0]
We show that a simplified parameter setup with a single control parameter, namely the perplexity, can effectively balance local and global data structure visualization. We also designed a chunk&mix protocol to efficiently parallelize t-SNE and explore data structure across a much wider range of scales.
arXiv Detail & Related papers (2021-11-30T15:43:45Z)
Visualizing High-Dimensional Trajectories on the Loss-Landscape of ANNs [15.689418447376587]
Training artificial neural networks requires the optimization of highly non-convex loss functions. Visualization tools have played a key role in uncovering key geometric characteristics of the loss-landscape of ANNs. We propose a modern dimensionality reduction method which represents the SOTA in terms of capturing both local and global structures.
arXiv Detail & Related papers (2021-01-31T16:30:50Z)
Improving the Performance of Fine-Grain Image Classifiers via Generative Data Augmentation [0.5161531917413706]
We develop Data Augmentation from Proficient Pre-Training of Robust Generative Adversarial Networks (DAPPER GAN). DAPPER GAN is an ML analytics support tool that automatically generates novel views of training images. We experimentally evaluate this technique on the Stanford Cars dataset, demonstrating improved vehicle make and model classification accuracy.
arXiv Detail & Related papers (2020-08-12T15:29:11Z)
Novel Human-Object Interaction Detection via Adversarial Domain Generalization [103.55143362926388]
We study the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios. The challenge mainly stems from the large compositional space of objects and predicates, which leads to the lack of sufficient training data for all the object-predicate combinations. We propose a unified framework of adversarial domain generalization to learn object-invariant features for predicate prediction.
arXiv Detail & Related papers (2020-05-22T22:02:56Z)
GridMask Data Augmentation [76.79300104795966]
We propose a novel data augmentation method, GridMask, in this paper. It utilizes information removal to achieve state-of-the-art results in a variety of computer vision tasks.
arXiv Detail & Related papers (2020-01-13T07:27:05Z)

This list is automatically generated from the titles and abstracts of the papers on this site.